Computational Intelligence Systems have become skilled at interpreting and responding to human inputs due to recent advances in natural language processing, sentiment analysis, and machine learning (Broekens et al., 2023). These advancements enable artificial agents to recognize users’ emotional cues and generate appropriate interactions (Hohenstein, 2020). However, most existing systems remain narrowly focused, particularly on dyadic exchanges and reactive emotion modeling, which limits their ability to support dynamic, group-level emotional reasoning (Zhang et al., 2023; Bosse et al., 2015). Emotional contagion, the spread of emotion from person to person, has become a growing concern in online settings due to its disruptive potential (Pröllochs et al., 2021; Šuvakov et al., 2012). This highlights the need for approaches that can regulate shared emotional dynamics in group contexts. This research is driven by the question:
How can intelligent agents orchestrate adaptive and generative emotional responses to promote positive affective convergence and mitigate negative emotion contagion in multi-human to multi-agent communication environments?
We introduce a generative AI (genAI)-based orchestration system designed to influence group emotional dynamics in multi-party interactions. The system continuously senses distributed emotional signals, detects prevalent moods across the group, and orchestrates adaptive agent responses that promote emotional alignment. The core innovation lies in leveraging decentralized mood sensing and genAI pipelines to enable collective emotional reasoning among agents. However, the framework remains a research prototype, and its effectiveness under real-world conditions—such as diverse linguistic styles, noisy data, and adversarial behaviors—has not yet been established.
This model aims to enhance the emotional depth of digital interactions, allowing agents to function as collaborative social actors in emotionally rich environments. Agents can help maintain healthier group affect (Delice et al., 2019; Naveenan and Kumar, 2018; Barsade and Knight, 2015) by actively fostering positive emotional alignment (affective convergence) and dampening the spread of negative sentiment.
This work advances the state of the art by introducing several key contributions. First, it presents a reference architecture for the generative orchestration of emotional contagion in multi-human, multi-agent systems, enabling distributed mood sensing, emotional pattern grouping, and coordinated agent response generation. Second, it offers a method for real-time emotional alignment, allowing conversational agents to collectively reason about group mood and adapt their responses through generative AI pipelines conditioned on both local and global affective contexts. Finally, it introduces an approach to configuration-aware emotional modulation, demonstrating the system’s ability to operate effectively across diverse interaction topologies—from dyadic support settings to coordinated agent collectives within social networks. Together, these contributions establish a scalable framework for emotionally intelligent, adaptive multi-agent communication.
2. Background
A seminal study on emotional contagion by Barsade (2002) demonstrated that group members tend to "catch each other’s moods," with positive contagion enhancing collaboration and task performance, while negative moods intensify conflict. In particular, unpleasant or negative emotions often propagate more quickly and robustly than positive ones, creating a "ripple effect" that can undermine group cohesion. This phenomenon persists at scale in online environments. A large Facebook experiment showed that reducing positive content in users’ news feeds led them to post more negative updates (and vice versa), providing experimental evidence of mass emotional contagion through digital communication (Kramer et al., 2014). Such contagion can erode trust and participation in multi-user interactions; for example, a sarcastic or angry remark by one user can trigger an "epidemic of grumpiness" that dampens the experience for the entire group.
Current conversational AI systems, however, are ill-equipped to manage these group-level affective dynamics (Hatfield et al., 2009; Picard, 2000). Most virtual agents and chatbots operate in dyadic (one-on-one) settings and respond to the emotions of a single user in a strictly individualistic manner (Amiot et al., 2025; Jiang et al., 2023). As a result, they lack awareness of the broader mood in multi-user conversations and cannot prevent negative emotions from cascading among participants. Researchers have noted that deploying chatbots as active participants in multiparty dialogues remains largely unexplored, necessitating new capabilities such as managing turn-taking, roles, and social context among multiple individuals (Dohsaka et al., 2014).
Until recently, emotionally intelligent agents for health and education were primarily designed for one-on-one interactions (Nordberg et al., 2019). This limited focus poses a significant challenge in environments such as group chats, online forums, multiplayer games, and collaborative teams, where emotional contagion can quickly escalate and disrupt the collective experience (Kramer et al., 2014; Barsade, 2002). The challenge is clear: negative emotions can amplify and spread in group interactions, yet today’s AI lacks the group-level emotional reasoning needed to detect and mitigate these contagious downturns before they undermine user trust and participation.
Bosse et al. (2015) introduced an agent-based model of emotion contagion within teams, integrated into an ambient intelligence system that monitors group sentiment and suggests supportive interventions to pre-empt "downward emotion spirals." This work formalized how an automated team assistant could track the emotional levels of multiple individuals and issue group-level support actions (e.g., encouraging messages) when collective morale declines. Subsequent simulations expanded on these concepts, modelling the dynamics of emotion spread in agent networks and demonstrating how mood contagion could be predicted and controlled in principle.
In practice, several domain-specific systems have shown the advantages of coordinated, emotionally aware agents in multi-user settings. In educational technology, socially intelligent tutors have been enhanced with positive socio-emotional strategies to cultivate a constructive group climate. For instance, Dohsaka et al. (2014) developed a multiparty quiz game system involving two humans and two agent facilitators. They found that the presence of empathic agents significantly increased user satisfaction and the number of user contributions. These studies illustrate that a carefully orchestrated agent response can foster engagement and promote positive emotional convergence within a group.
More recently, researchers have begun exploring multi-agent setups specifically for group emotional support. Nordberg et al. (2019) introduced Terabot, a chatbot system designed to facilitate peer support conversations among users with ADHD in an online program. Acting as a facilitator, the bot provided structure and encouragement in the group chat, which participants found helpful for keeping discussions on track. This multi-chatbot counselling setup produced higher user engagement and notable linguistic convergence compared to a one-on-one chatbot, indicating stronger rapport and a sense of support. However, the solution used agents with predefined facilitator and peer roles, following a scripted division of labour with minimal real-time coordination or “collective reasoning.” Despite this limitation, participants reported greater social support and motivation to cope, demonstrating how multiple agents working together can foster a more positive and resilient group mood.
While these efforts are promising, there remains a significant gap in achieving truly adaptive, generative emotional orchestration for multi-human/multi-agent environments. Most existing systems depend on scripted rules or fixed role strategies to manage group emotions. For instance, some mental health chatbots detect when a user expresses negative feelings and trigger a prewritten empathetic prompt or suggest contacting a crisis hotline. Although such rule-based interventions can be helpful, they lack the flexibility and creativity that modern generative AI can offer.
Currently, there is no publicly documented platform that supports a decentralized network of agents capable of sharing emotional signals and coordinating group mood responses in real time. AI-driven emotional support has focused either on a single agent expressing empathy toward one user or on centralized settings where one intelligent agent attempts to facilitate or monitor an entire group (Tavanapour et al., 2020). This leaves a technological gap: enabling multiple agents to act as a cohesive team of socially savvy participants, dynamically harmonizing their responses to manage group emotions in an open-ended manner.
Prior multi-agent emotion support systems have demonstrated the value of coordinated agent participation but have largely relied on predefined roles and scripted response strategies instead of dynamic emotional reasoning. For example, Dohsaka et al. (2014) implemented two facilitator agents in a quiz game that followed fixed turn-taking scripts and injected empathic prompts at predetermined points. While this improved user satisfaction, it did not allow for real-time adaptation to the emerging group mood. Similarly, the Terabot system of Nordberg et al. (2019) coordinated multiple chatbots to guide ADHD peer support groups, but the agents used a static division of labor (facilitator vs. peer roles) and offered pre-programmed supportive statements rather than contextually generated ones. Even earlier work, such as that of Bosse et al. (2015), modelled emotion contagion at the group level but relied on rule-based triggers for supportive interventions, lacking generative variation and decentralized mood sharing.
In contrast, our approach introduces a generative and coordinated orchestration framework that fundamentally diverges from prior systems in two key aspects. First, it employs generative response pipelines, where agents produce novel, context-conditioned responses using generative AI models informed by both local and group-level affective states. This enables more adaptive and emotionally nuanced interactions tailored to evolving user contexts. Second, it incorporates collective mood reasoning through decentralized sharing, allowing agents to continuously exchange mood observations via the Mood Pattern Observation and Mood Pattern Grouping Components. This decentralized communication forms a real-time consensus of the group’s emotional state, which, in turn, drives the Orchestration Component’s adaptive strategies. Together, these innovations enable coordinated, emotionally intelligent responses that dynamically align with both individual and collective affective dynamics.
This allows agents to function not as isolated scripted responders but as a dynamically coordinated affective collective. While prior systems could only reactively respond to predefined emotional cues, our model supports open-ended, real-time emotional modulation that adapts to evolving group dynamics.
Our system also aligns with architectural strategies from safety-critical domains. For instance, previous work on augmented reality recommendation systems in emergency scenarios (Beloglazov et al., 2017) demonstrated the value of context-aware, distributed sensor input to dynamically guide user actions in real time. This approach parallels our use of affective signals for dynamic, emotion-driven agent coordination, highlighting a shared need for responsiveness, decentralization, and adaptive orchestration across critical and emotionally intensive settings.
Despite these opportunities, deploying such systems faces several practical barriers, including privacy concerns related to sharing emotional data across agents, the need for cross-platform interoperability, and the challenge of preventing bias or misclassification in emotion recognition models. These constraints suggest that transitioning from concept to operational deployment will require extensive technical and ethical safeguards.
3. Proposal
We propose a generative orchestration system that enables a network of conversational agents to collaboratively regulate group emotion in real time. The innovation lies in combining decentralized mood sensing with generative emotional response coordination, allowing agents to function not as isolated responders but as an affectively aware collective capable of influencing group mood dynamics.
Our framework enables each agent to perceive emotional signals from the humans with whom it engages and to share a summary of its local mood readings with other agents (Asghar et al., 2018). This distributed emotional awareness fosters collective emotional reasoning, where agents form a shared representation of the group's affective state through real-time consensus.
Based on this shared model, agents generate adaptive responses using generative AI pipelines conditioned on both the local interaction context and global emotional goals (Wang et al., 2025). For instance, agents could introduce encouraging dialogue, express empathy, or inject subtle humor in a coordinated manner when negativity is detected, guiding participants toward emotional convergence without explicit instruction. However, errors in emotion detection—such as misunderstandings of sarcasm, mixed emotional expressions, or cultural differences in expression—could lead to inappropriate responses, necessitating strategies to mitigate such risks.
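As a rough illustration of this dual conditioning, the sketch below shows how an agent might fold its local mood reading and a naive group valence consensus into a generative model's prompt. All names (`MoodSummary`, `build_prompt`, the valence threshold) are hypothetical illustrations, not part of a specified API.

```python
from dataclasses import dataclass

@dataclass
class MoodSummary:
    """Hypothetical per-agent mood reading shared across the network."""
    agent_id: str
    valence: float   # -1.0 (negative) .. +1.0 (positive)
    label: str       # e.g. "frustrated", "calm", "enthusiastic"

def group_valence(summaries):
    """Naive consensus: mean valence across all reporting agents."""
    return sum(s.valence for s in summaries) / len(summaries)

def build_prompt(user_message, local_mood, summaries):
    """Condition a generative model's prompt on local AND group affect."""
    gv = group_valence(summaries)
    goal = "gently uplift the tone" if gv < -0.2 else "maintain the current tone"
    return (
        f"The user appears {local_mood.label} (valence {local_mood.valence:+.2f}). "
        f"The group's average valence is {gv:+.2f}; your goal is to {goal}. "
        f"Reply empathetically to: {user_message!r}"
    )

local = MoodSummary("agent-1", -0.6, "frustrated")
peers = [local, MoodSummary("agent-2", -0.3, "tense"), MoodSummary("agent-3", 0.1, "neutral")]
print(build_prompt("Nothing works today.", local, peers))
```

In a full system the resulting prompt would be passed to the generative pipeline; the point here is only that the same response request carries both the local and the global affective context.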

The architecture is illustrated in Figure 1 and operates as follows. Each conversational agent independently observes emotional cues from human interactions through the Mood Pattern Observation Component (MPOC), which classifies user mood using NLP pipelines and supervised learning algorithms. These local mood patterns are passed to the Mood Pattern Grouping Component (MPGC), which aggregates and clusters emotional signals from agents using unsupervised processing methods. This enables the system to identify dominant affective trends and emotional divergence within the group.

Figure 2 illustrates the MPOC, which operates through a series of sequential processes designed to capture and interpret emotional dynamics in human-agent interactions. The process begins with data collection and preprocessing, where human-agent conversations are gathered, cleaned, normalized, and prepared for downstream analysis. This is followed by sentiment analysis, which examines each utterance to extract valence and polarity scores, identifying the basic emotional tone of the text. The emotion detection stage applies emotion lexicons and classifiers to pinpoint specific emotions, such as joy, anger, or sadness, expressed throughout the conversation. Next, pattern classification groups these detected emotional states into coherent mood patterns based on their intensity and temporal continuity, effectively capturing the emotional flow over time. Finally, classification refinement employs context-aware rules combined with GenAI-powered analysis to enhance prediction accuracy and robustness. The performance of this component highly depends on the quality and representativeness of the training data, as sparse or highly informal text can lead to reduced accuracy.
This component enables each agent to maintain a fine-grained, real-time understanding of the emotional state of its human counterpart while sharing local mood observations with other agents in the system. The inter-agent sharing process provides the foundation for collective emotional reasoning, allowing agents to combine their individual insights into a broader, group-level emotional model.
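The MPOC stages described above can be sketched as follows, with toy lexicons standing in for trained sentiment and emotion models. The lexicons, thresholds, and function names are illustrative assumptions, not the system's actual implementation.

```python
import re
from statistics import mean

# Toy lexicons standing in for trained sentiment/emotion models.
VALENCE = {"great": 0.8, "thanks": 0.5, "useless": -0.8, "angry": -0.7, "sad": -0.6}
EMOTION = {"angry": "anger", "furious": "anger", "sad": "sadness", "great": "joy"}

def preprocess(utterance):
    """Data collection/preprocessing: lowercase and tokenize."""
    return re.findall(r"[a-z']+", utterance.lower())

def sentiment(tokens):
    """Sentiment analysis: average lexicon valence of matched tokens."""
    scores = [VALENCE[t] for t in tokens if t in VALENCE]
    return mean(scores) if scores else 0.0

def detect_emotions(tokens):
    """Emotion detection: map tokens to discrete emotion labels."""
    return {EMOTION[t] for t in tokens if t in EMOTION}

def classify_pattern(history, window=3):
    """Pattern classification: smooth recent valence into a mood label."""
    avg = mean(history[-window:])
    if avg < -0.3:
        return "negative"
    if avg > 0.3:
        return "positive"
    return "neutral"

history = []
for msg in ["This site is useless", "I am so angry", "Okay, thanks"]:
    history.append(sentiment(preprocess(msg)))
print(classify_pattern(history))  # → negative
```

The temporal smoothing in `classify_pattern` corresponds to the "intensity and temporal continuity" grouping described above; the refinement stage (context-aware rules plus genAI analysis) would sit after it.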

The mood pattern-grouping component operates through a structured sequence of analytical steps (Figure 3). It begins by clustering observed mood patterns based on similarity using unsupervised learning algorithms like k-means or hierarchical clustering. The resulting clusters are then processed by the pattern identification module, which detects recurring emotional trajectories across conversations. Next, the prevalent mood detection module identifies dominant emotional trends by analyzing the frequency, intensity, and duration of the clustered mood data. The grouping module organizes conversations according to their shared emotional tone, creating coherent affective clusters. These initial groupings are further refined by the pattern refining module, which applies machine learning models such as decision trees or artificial neural networks to enhance the granularity and cohesion of the emotional groupings. Finally, the classification module assigns each conversation group to predefined affective categories—such as positive, negative, or neutral—based on the refined feature sets, thereby enabling precise and adaptive mood-based orchestration across the agent network.
We emphasize that group mood clustering is sensitive to outlier data and rapid sentiment shifts, which can lead to unstable or misleading group profiles in fast-moving conversations.
This component allows the system to transition from local emotional sensing to shared emotional awareness. It facilitates collective reasoning by providing each agent with a contextual map of emotional convergence and divergence within the group. This shared understanding enables agents to coordinate their actions, targeting emotionally divergent users with responses that promote affective alignment.
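The clustering and prevalent-mood-detection steps of the MPGC can be sketched with a minimal one-dimensional k-means over shared valence readings. This is a toy stand-in for the production pipeline; the polarity thresholds are illustrative.

```python
from statistics import mean

def kmeans_1d(values, k=2, iters=20):
    """Minimal 1-D k-means; a real deployment would use a library implementation."""
    centers = [min(values), max(values)] if k == 2 else list(values[:k])
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

def prevalent_mood(clusters):
    """Prevalent mood detection: the largest cluster wins; its mean gives polarity."""
    biggest = max(clusters, key=len)
    avg = mean(biggest)
    return "positive" if avg > 0.1 else "negative" if avg < -0.1 else "neutral"

# Per-agent mean valence readings shared with the grouping component.
readings = [-0.7, -0.6, -0.5, 0.4, 0.3]
centers, clusters = kmeans_1d(readings)
print(prevalent_mood(clusters))  # → negative (three of five agents report negativity)
```

The same cluster structure also exposes emotional divergence: here, the smaller positive cluster identifies the users whose affect departs from the dominant trend.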

Figure 4 illustrates the Agent Interaction Orchestration component, which coordinates how agents modulate emotional responses in harmony with group mood dynamics. This component operates through a sequence of interdependent steps. It begins with a set of predefined mood patterns representing common affective states such as happiness, sadness, anger, and fear, derived from historical mood observations that guide agent strategies. The Agent Interaction Orchestration Module then integrates these mood patterns with the current group affective state to determine appropriate responses, using decision trees to select coordinated agent behaviors during ongoing conversations. The chosen strategy is executed through the emotion contagion component, where agents dynamically adjust their conversational tone, content, and style using generative models to gently steer user mood toward emotionally aligned states. A continuous feedback loop evaluates the emotional impact of these interactions, while the performance optimization module refines orchestration strategies through reinforcement learning, adaptively tuning parameters such as message timing, language style, and agent allocation to enhance overall system effectiveness.
This component shifts emotional reasoning from static, pre-scripted planning to dynamic, responsive coordination. It empowers agents to operate as an emotionally synchronized collective, maintaining coherence across interactions while adapting to evolving user sentiment.
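The decision and feedback-loop steps can be approximated with a simple epsilon-greedy value update, a deliberately simplified stand-in for the reinforcement learning described above. The strategy names, reward signal, and parameters are hypothetical.

```python
import random

# Hypothetical strategy catalogue; names are illustrative, not from the system.
STRATEGIES = ["empathic_mirroring", "gentle_humor", "positive_reframing"]

def select_strategy(group_mood, q_values, epsilon=0.1, rng=random):
    """Decision step: pick the highest-value strategy for the detected mood,
    exploring occasionally (epsilon-greedy)."""
    if rng.random() < epsilon:
        return rng.choice(STRATEGIES)
    return max(STRATEGIES, key=lambda s: q_values[(group_mood, s)])

def update_q(q_values, group_mood, strategy, reward, alpha=0.2):
    """Feedback loop: nudge the strategy's value toward the observed emotional
    impact (e.g. the change in group valence after the intervention)."""
    key = (group_mood, strategy)
    q_values[key] += alpha * (reward - q_values[key])

q = {(m, s): 0.0 for m in ["negative", "neutral", "positive"] for s in STRATEGIES}
# Simulated episode: the group was negative and reframing raised valence by +0.4.
update_q(q, "negative", "positive_reframing", reward=0.4)
print(select_strategy("negative", q, epsilon=0.0))  # → positive_reframing
```

In the full architecture, analogous updates would also tune message timing, language style, and agent allocation, as noted above.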
The entire system functions as a coordinated emotional reasoning pipeline, enabling conversational agents to understand, share, and influence collective affect in real time. What begins as individual emotional perception becomes, through successive stages of aggregation and orchestration, a structured, system-wide capacity for emotional modulation.
This architecture aims to transition from isolated, reactive emotion recognition to coordinated, group-level modulation. Traditional approaches have viewed emotional responses as static or individual phenomena. In contrast, this model allows agents to reason about mood at the group level, share their affective perceptions, and act together to guide the collective emotional trajectory (Mao et al., 2024).
The orchestration framework applies collective intelligence principles to affective interaction, enabling agents to coordinate in supporting emotional convergence and mitigating negative contagion. It serves as a foundation for emotionally aware agent systems in domains where group cohesion is essential.
3.1 Practical challenges and limitations
While conceptually promising, deploying this orchestration framework in real-world environments presents several technical and operational challenges. A primary concern is computational constraints: real-time orchestration across many agents necessitates low-latency mood classification, clustering, and response generation pipelines. Maintaining responsiveness under heavy concurrency will likely require parallel processing and model compression techniques, which are beyond the current prototype's capabilities.
Additionally, the system is vulnerable to emotion misclassification, especially with ambiguous language, sarcasm, mixed affect, or cultural variations in emotional expression. Incorrect mood labeling could result in inappropriate responses that exacerbate rather than stabilize group sentiment. This risk highlights the need for confidence-based gating and human-in-the-loop fallback mechanisms.
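One possible shape for such confidence-based gating is sketched below; the two thresholds are illustrative and would need empirical tuning per deployment.

```python
def gated_response(label, confidence, generate, neutral_ack, escalate,
                   act_threshold=0.75, review_threshold=0.4):
    """Confidence-based gating: act on high-confidence mood labels, fall back
    to a neutral acknowledgement when unsure, and escalate to a human
    reviewer when confidence is very low. Thresholds are illustrative."""
    if confidence >= act_threshold:
        return generate(label)            # normal affective response
    if confidence >= review_threshold:
        return neutral_ack()              # safe, non-committal reply
    return escalate(label, confidence)    # human-in-the-loop fallback

reply = gated_response(
    "anger", 0.55,
    generate=lambda lbl: f"[tailored response for {lbl}]",
    neutral_ack=lambda: "I hear you. Could you tell me more?",
    escalate=lambda lbl, c: f"[flagged for human review: {lbl} @ {c:.2f}]",
)
print(reply)  # → I hear you. Could you tell me more?
```

The middle band is the important design choice: a neutral, information-seeking reply buys the classifier another utterance of evidence without committing to a possibly wrong emotional reading.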
Coordination complexity is another significant challenge. As the number of agents increases, synchronizing emotional strategies while avoiding feedback loops or contradictory behaviors becomes increasingly difficult. Distributed orchestration must also accommodate network delays, asynchronous updates, and partial agent failures.
Finally, the architecture introduces new failure modes. If grouping or orchestration components malfunction (e.g., due to clustering errors or stale models), agents may converge on the wrong emotional target or overcorrect, thereby amplifying rather than damping emotional contagion. Detecting and mitigating such emergent failures will require continuous monitoring, sandbox testing, and dynamic rollback mechanisms.
4. Configurations
We explore various deployment configurations in which the proposed generative orchestration system may operate. These configurations are presented as conceptual scenarios rather than tested production deployments; real-world implementation would require addressing latency, bandwidth, data privacy compliance, and robust failover mechanisms, which fall outside the scope of this initial study.
4.1 Dyadic interactions (H–M–Ag)
At the basic level, there is the dyadic interaction loop Human(H)–Medium(M)–Agent(Ag) (Figure 5). In this configuration, the human user interacts with a conversational agent through a communication medium. The agent consists of an interface connected to an internal AI pipeline that includes orchestration, intention, memory, and planning modules, as well as a pipeline for prompt augmentation and model invocation. These agents generate responses by orchestrating calls to the underlying generative models to enhance and contextualize information. However, the specific composition of the agent architecture is not central to our discussion.
In dyadic configurations, emotional contagion is typically limited. The agent's affective behavior is influenced solely by the mood of the individual user, with no direct exposure to emotional signals from other users. Consequently, emotional alignment occurs in isolation, lacking the benefits of collective affective awareness or group-level feedback. In this context, our system introduces capabilities that enhance the emotional intelligence of the agent beyond the local dyad. Specifically, the MPOC continuously detects and classifies emotional cues from the user in real time. These local emotional signals are then shared across the agent network and processed by the MPGC, which identifies emerging mood trends across different users and contexts.
This cross-agent exchange enables even isolated agents to respond with an awareness of prevalent emotional climates. When an agent detects negative emotions from its user, it can adjust its generative responses not only based on local history but also by employing affective strategies that have been successful in similar contexts. The Agent Interaction Orchestration Component adapts responses using techniques such as emotional mirroring, gentle humor, or reframing to promote emotional alignment with broader user trends. As a result, the system transforms conventional one-to-one agent configurations into emotionally coherent nodes within a distributed, affect-aware network, allowing localized experiences to benefit from collective intelligence.

4.2 Social network with personal assistance (Ag-H–M)
In Agent(Ag)–Human(H)–Medium(M) configurations, conversational agents act as personal perceptual assistants for individual users but do not engage directly with the shared medium (Figure 6). These scenarios occur on digital social platforms and collaborative tools, where interaction occurs within a shared environment, yet each user can also utilize a background agent for cognitive or emotional support.
In variation Figure 6(a), each human user interacts with a shared environment (e.g., social network or messaging platform), while their respective agent offers personalized support. However, these agents lack access to the content circulating in the medium. Emotional contagion in this setup occurs primarily through human-to-human interactions, with agents inferring emotional shifts indirectly based on their users' responses. This creates a blind spot in affective awareness, limiting the agent's ability to facilitate proactive emotion modulation.
Our system addresses this limitation by enabling agents to pool mood observations through shared emotional models, even without access to the full content of the medium. Through the MPOC, agents monitor the emotional expressions of their assigned users. These local signals are aggregated via the MPGC across agents to infer prevalent affective trends among the broader user base. In this manner, even agents with partial information can collaboratively reason about the group's mood.
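A minimal sketch of this pooling step, assuming each agent shares only a (label, confidence) pair rather than any raw content from the medium; the aggregation rule and all names are illustrative.

```python
from collections import Counter

def pool_observations(agent_reports):
    """Aggregate per-agent mood labels into a group trend, weighting each
    report by the reporting agent's confidence.
    agent_reports: list of (label, confidence) tuples."""
    weights = Counter()
    for label, conf in agent_reports:
        weights[label] += conf
    label, weight = weights.most_common(1)[0]
    coverage = weight / sum(weights.values())   # share of confidence mass
    return label, coverage

# Four assistants report on their own users; none sees the medium itself.
reports = [("anxious", 0.9), ("anxious", 0.7), ("neutral", 0.5), ("anxious", 0.6)]
trend, share = pool_observations(reports)
print(trend, round(share, 2))  # → anxious 0.81
```

Confidence weighting matters here precisely because each agent's view is partial: a hesitant reading from one assistant should not outweigh two confident ones.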
The variation depicted in Figure 6(b) expands this concept by allowing agents to communicate with one another in the background. Here, emotional contagion becomes a multilayered phenomenon: while humans influence each other through a shared medium, agents also exchange emotional cues and response strategies via the orchestration pipeline. This configuration models emergent social behavior in mediated environments, such as support groups or comment threads.
Our system enhances this interaction by activating the Agent Interaction Orchestration Component, which synchronizes emotional responses among agents based on the consensus group mood. This enables agents to deliver coordinated support by recommending uplifting interactions, suggesting de-escalation when tensions rise, or reinforcing affirming sentiments, thereby steering collective emotional trajectories without requiring direct access to the medium.
In both configurations, the proposal shifts the role of agents from isolated affective mirrors to cooperative emotional mediators. It facilitates emotionally intelligent behavior in contexts where agents act as silent observers and advisers, extending the reach of affective computing into domains where emotional nuance and social harmony are essential but challenging to instrument directly.

4.3 Humans and agents participating in social networks ([H|Ag]–M)
Figure 7 illustrates the configuration where Humans (H) and Agents (Ag) interact within a shared Medium (M), depicting a scenario in which both humans and agents exchange messages on a common platform. In these contexts, agents exhibit varying degrees of visibility and engagement, positioning them to effectively influence emotion contagion at scale.
The variation in Figure 7(a) features agents that do not engage in direct interactions with users but instead observe ongoing conversations through the shared medium. These agents can inject messages visible to all participants. This indirect presence allows agents to sense prevailing emotional trends, identify negative affect spirals, and strategically introduce mood-regulating prompts into the conversation stream, such as humor, encouragement, or clarification. This design is well-suited for forums or comment threads, where agents can function as moderators or tone-steering entities.
Figure 7(b) presents a variation in which agents actively participate alongside humans in affective exchanges. Each agent contributes to the ongoing dialogue by aligning its tone and behavior with the detected mood state of its conversation partner while remaining contextually aware of the group's overall sentiment. In this setup, the orchestration system facilitates generative affective responses tailored to local and group mood dynamics, thereby enhancing emotional alignment during shared experiences. This approach is particularly suitable for live discussion groups or multiplayer environments.
Figure 7(c) illustrates a configuration in which multiple agents interact with different users in a shared medium while maintaining a backchannel for inter-agent communication. This backchannel enables agents to synchronize emotional strategies, share observed mood patterns, and collaboratively agree on responses in real time. As a result, the system fosters emergent group-level mood regulation, allowing agents to collectively diffuse negative affect and promote emotional convergence without direct instruction or user intervention. This configuration is well-suited for decentralized emotional reasoning in large-scale social platforms.
These scenarios demonstrate how the proposed system adapts to varying agent roles. In each case, the core innovation of the system facilitates proactive mood shaping in complex ecosystems by supporting distributed mood detection, aggregation of emotional patterns, and orchestration of generative responses.

5. Use cases
The following use cases illustrate how the proposed orchestration framework could operate in various interaction settings. Each scenario describes the social context, key participants (both humans and agents), and the emotional dynamics they experience, demonstrating how the system might sense and influence these dynamics in practice. These vignettes serve as exploratory examples to showcase feasibility, rather than as evidence of deployment-ready systems.
5.1 Emotionally aware virtual assistant for customer service
Consider a large online electronics store where hundreds of shoppers browse and purchase products simultaneously through a shared commerce platform. Each shopper is supported by a personal customer-service assistant agent embedded within the interface, providing real-time product guidance and private emotional support. These agents connect through an orchestration layer that enables them to share anonymized mood signals and coordinate their interaction strategies.
During a busy promotional period, one user repeatedly encounters stock errors while attempting to place an order. Their language becomes abrupt and negative (“This site is useless… why does nothing work?”). The user’s assistant agent detects rising frustration through the MPOC, which flags high negative valence based on sentiment analysis and pacing irregularities. This emotional signal is then shared with the broader agent network.
Simultaneously, other assistants detect similar frustration among a small cluster of shoppers facing the same stock issue. The Mood Pattern Grouping Component aggregates these observations, classifies the group as exhibiting emerging negative affect, and alerts the Orchestration Component to intervene before the sentiment spreads more widely.
The orchestration system issues targeted guidance to the affected assistants. These agents adjust their dialogue tone to acknowledge the issue empathetically (“We’re sorry this item is temporarily unavailable”), offer immediate alternatives (such as nearby models or pre-order options), and inject subtle positive framing (“This model is very popular and will restock soon. Many customers are excited about it”). Meanwhile, assistants serving unaffected users maintain neutral or upbeat engagement to avoid unintentionally amplifying the frustration signal.
Over the next several minutes, the messages from affected users shift from angry to calm (“Okay, thanks. I’ll wait for the restock”), and the frustration levels in the cluster subside. The overall emotional tone of the store session remains stable, and negative contagion does not spread to unrelated shoppers.
This scenario reflects the Human–Agent–Medium (H–Ag–M) configuration: humans interact directly with their agents through the shared shopping platform (Medium), while the agents coordinate emotional strategies behind the scenes. Although conceptual, this example illustrates how distributed emotional sensing and orchestrated response generation can help contain localized negativity, thereby improving customer satisfaction and preserving group morale—factors that are critical for sustaining user trust and brand loyalty in digital commerce environments (Figure 8).

5.2 Emotion-aware social platform for mental health support
Consider a mental health peer support forum with approximately fifty active participants interacting through a shared digital platform. Each user is paired with a personal conversational agent that quietly monitors linguistic and behavioral cues, providing private emotional guidance while coordinating indirectly with other agents through an orchestration system.
During an evening discussion thread, one user discloses a recent experience of cyberbullying, expressing feelings of sadness and anger. Their personal agent detects high negative valence from abrupt phrasing and emotionally charged terms (“I feel worthless… I can’t deal with this”) and shares this signal across the agent network. Meanwhile, several other participants in the thread respond with supportive comments (“I’ve been through this too; it gets better”), which their agents classify as positive-affective behavior.
The Mood Pattern Grouping Component aggregates these observations and identifies an emotional divergence: one participant exhibits concentrated distress, while several others display empathic intent. This divergence poses a risk of emotional contagion if the distressed tone triggers wider negativity or disengagement among the group.
The Orchestration Component provides coordinated guidance to the agents of the other participants in the thread. These agents gently encourage their users to post supportive replies, such as sharing similar experiences or offering encouraging words, while minimizing the visibility or impact of potentially dismissive or confrontational comments. At the same time, the distressed user’s agent offers private support by suggesting calming activities, reframing negative thoughts, and recommending safe community subgroups where positive emotional patterns thrive.
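The orchestration step in this scenario amounts to mapping each agent, according to its user's role in the detected divergence, to a response strategy. A minimal sketch follows; the role labels and action names are invented for illustration and do not correspond to the framework's actual vocabulary:

```python
def orchestrate(agent_roles):
    """Assign each agent a guidance strategy based on its user's role in a
    detected emotional divergence (roles/actions are placeholders)."""
    guidance = {}
    for agent_id, role in agent_roles.items():
        if role == "distressed":
            guidance[agent_id] = {"mode": "private_support",
                                  "actions": ["suggest_calming_activity",
                                              "reframe_negative_thoughts",
                                              "recommend_safe_subgroup"]}
        elif role == "empathic":
            guidance[agent_id] = {"mode": "encourage_support",
                                  "actions": ["prompt_supportive_reply"]}
        else:  # neutral bystanders: avoid amplifying the distress signal
            guidance[agent_id] = {"mode": "maintain_tone",
                                  "actions": ["keep_neutral_engagement"]}
    return guidance
```

The design point is that guidance is differentiated per role: the same group-level observation produces private de-escalation for the distressed user and gentle prompting for empathic peers, rather than one broadcast instruction.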
Over the next hour, the emotional tone of the thread gradually shifts. The distressed user begins to use calmer, more measured language (“Thank you… this makes me feel less alone”), while other participants increase their frequency of positive and empathetic responses. The overall conversation maintains a constructive tone, and the initial negativity does not spread to other threads or participants.
This scenario illustrates the Agent–Human–Medium (Ag–H–M) configuration: humans communicate through a shared platform (Medium) while their companion agents provide private emotional support and coordinate indirectly in the background. Although conceptual, this example demonstrates how distributed emotional sensing, mood pattern grouping, and orchestrated response strategies can help sustain an emotionally safe and supportive atmosphere in online wellness communities—where quickly containing negative affect is vital for maintaining engagement and collective well-being (Figure 9).

5.3 Emotionally supportive virtual assistant in healthcare services
Consider a large telemedicine platform where dozens of patients meet with physicians in parallel virtual consultation rooms. Each patient and clinician is accompanied by a personal conversational agent that monitors linguistic and behavioral cues to assess emotional states. These agents communicate through a shared orchestration layer while interacting privately with their assigned users.
During one session, a patient recently diagnosed with a chronic condition begins to show signs of acute anxiety—characterized by short, fragmented speech, rapid message timing, and expressions of fear and hopelessness (“I don’t think I can handle this”). The patient’s assistant agent flags the high intensity of fear and shares this information with the clinic’s internal agent network. Simultaneously, other assistants monitoring separate sessions detect similar spikes in anxiety among patients who have experienced long waiting times and delayed lab results.
The Mood Pattern Grouping Component aggregates these observations and identifies a cross-session trend of rising anxiety. It clusters several patients into a “high-stress” affective group while noting that clinicians are maintaining a neutral emotional tone. This mismatch poses a risk of emotional contagion spreading across the patient community, potentially leading to higher no-show rates or disengagement from care.
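The cross-session trend detection described here could be approximated as follows. The slope heuristic, the thresholds, and the patient/clinician mismatch rule are illustrative assumptions rather than the framework's actual logic:

```python
def rising_trend(series, window=3, slope_threshold=0.1):
    """Crude trend check: average increase per step over the last readings."""
    recent = series[-window:]
    if len(recent) < window:
        return False
    return (recent[-1] - recent[0]) / (window - 1) >= slope_threshold

def assess_contagion_risk(patient_anxiety, clinician_valence, anxiety_cut=0.7):
    """Cluster patients whose anxiety is high and still rising, and flag a
    mismatch when clinicians remain emotionally neutral alongside them."""
    high_stress = [pid for pid, series in patient_anxiety.items()
                   if series[-1] >= anxiety_cut and rising_trend(series)]
    clinicians_neutral = all(abs(v) < 0.2 for v in clinician_valence.values())
    return {"high_stress": high_stress,
            "contagion_risk": bool(high_stress) and clinicians_neutral}
```

Here the "mismatch" in the scenario is operationalized as the co-occurrence of a non-empty high-stress patient cluster with uniformly neutral clinician valence, the condition under which orchestration would intervene.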
The Orchestration Component responds by issuing coordinated guidance to all active assistant agents, directing them to subtly slow the conversational pacing, acknowledge patient fears with validating statements such as “It’s normal to feel this way at first,” and offer quick coping strategies like deep-breathing prompts or structured waiting-time estimates. Additionally, the agents are encouraged to suggest participation in supportive community discussion boards where patients exhibiting positive coping behaviors can provide reassurance and shared understanding. Through these coordinated actions, the system fosters a calmer, more empathetic emotional environment that promotes collective resilience and psychological comfort.
Meanwhile, clinicians’ dashboard interfaces are enhanced with lightweight affective cues from the agents (e.g., “Patient currently anxious — consider reassurance”), helping to align human responses with the emotional needs detected by the system.
Within minutes, the patient’s tone becomes more measured (“Okay, that helps — thank you”), and their interaction stabilizes. Across the platform, anxiety levels plateau instead of escalating, demonstrating that coordinated emotional adjustment can dampen affective contagion while supporting individual well-being.
This scenario reflects the [H|Ag]–M configuration, where humans (patients and clinicians) interact through a shared medium, while their companion agents both participate in the medium and coordinate among themselves via a backchannel. Although conceptual, this example illustrates how distributed emotional sensing and real-time orchestration could reinforce trust, improve emotional safety, and sustain engagement in telemedicine contexts, where stress often spreads quickly and silently between participants (Figure 10).

5.4 Impact across use scenarios
The shared capability for collective emotional reasoning is the key factor that enables the proposed system to operate effectively across diverse interaction scenarios. Whether assisting individual customers, moderating group support forums, or facilitating high-stakes healthcare consultations, the system’s architecture allows agents to transition from isolated responders to coordinated emotional collaborators.
This adaptability arises from the integration of three interlocking components within the architecture. The MPOC enables each agent to sense fine-grained emotional cues from its assigned user in real time, ensuring that even in isolated or dyadic contexts, agents can detect early emotional shifts such as frustration, anxiety, or disengagement—signals that often precede negative emotional contagion. The mood pattern grouping component aggregates these distributed signals across the agent network, clustering them to identify group-level affective trends. Through this mechanism, the system can recognize when individual emotional events begin to manifest as collective phenomena, such as detecting a growing cluster of frustrated users on an e-commerce platform or an emerging wave of anxiety among patients facing extended wait times. Finally, the orchestration of agent interactions component translates this collective emotional awareness into coordinated action, adjusting the tone, pacing, and content of agent responses using generative models. This ensures that interventions are contextually appropriate and temporally synchronized, effectively guiding the group toward affective alignment and stability.
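Taken together, the three components form a sense, group, and orchestrate cycle. The sketch below shows the control flow only: every interface is treated as a pluggable callable, and none of the names correspond to a published API.

```python
class EmotionOrchestrationLoop:
    """Conceptual sense/group/orchestrate cycle from the architecture;
    all component interfaces here are illustrative assumptions."""

    def __init__(self, sensor, grouper, orchestrator):
        self.sensor = sensor              # MPOC-like per-agent sensing
        self.grouper = grouper            # mood pattern grouping
        self.orchestrator = orchestrator  # coordinated response guidance

    def step(self, conversations):
        """conversations: {agent_id: recent messages from that agent's user}."""
        signals = {aid: self.sensor(msgs)
                   for aid, msgs in conversations.items()}
        trends = self.grouper(signals)             # group-level affective trends
        return self.orchestrator(signals, trends)  # per-agent guidance
```

Because the loop only fixes the order of the three stages, each component can be swapped independently, which is what allows the same core logic to serve low-volume support chats and high-density platforms alike.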
This architecture-aware orchestration allows the system to be reconfigured for different scales and roles without redesigning its core logic. Agents maintain autonomy in local sensing while collaborating on emotional responses, which supports flexible deployment in environments ranging from low-volume support chats to high-density social platforms. The approach builds on principles demonstrated in other distributed sensing domains. For example, prior work on agent-based sensing and statistical modeling for real-time parking spot detection using smartphone data (Koster et al., 2014) illustrated how decentralized local observations can be combined into global inferences that guide user behavior. Similarly, grid-based intrusion detection systems (Schulter et al., 2006) emphasized the importance of coupling localized anomaly detection with collective reasoning to mitigate emergent threats. Our system applies the same pattern of distributed sensing and coordinated response to the domain of emotional signal management, enabling conversational agents to collectively detect affective “anomalies” and respond in a harmonized manner to stabilize group mood.
This architecture demonstrates how distributed emotional sensing, collective mood inference, and adaptive response orchestration can be integrated into a flexible framework for emotion-aware digital systems. The design could support the emergence of emotionally intelligent services that enhance user satisfaction, build trust, and reinforce the overall quality of experience across domains while remaining robust to dynamic emotional shifts that often destabilize multi-user interactions.
5.5 Limitations and future work
Several technical and ethical challenges must be addressed before real-world deployment of such systems becomes viable. Misclassification risks can arise from sarcasm, cultural idioms, mixed affect, or noisy text, potentially triggering inappropriate responses. There are also mis-orchestration risks, where coordinated actions may unintentionally amplify negativity or over-reassure users. Additionally, real-time constraints such as network latency, bandwidth limitations, and computational scalability pose significant technical hurdles. Privacy and consent concerns surrounding the sharing of emotional data further demand robust anonymization and secure data handling practices. Future work will focus on enhancing robustness to ambiguous inputs through multimodal emotion detection and adversarial testing, developing fail-safe orchestration protocols to prevent cascading errors, and integrating human-in-the-loop oversight for ambiguous or high-stakes emotional states. These safeguards are essential to ensure that coordinated affective systems promote psychological safety and trust rather than compromise them.
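One of the fail-safe protocols mentioned above, gating orchestration on classification confidence and escalating ambiguous cases to a human, could take a shape like the following. The thresholds and action names are hypothetical:

```python
def gated_guidance(confidence, proposed_action,
                   act_threshold=0.8, review_threshold=0.5):
    """Confidence-gated fail-safe: act only on clear classifications,
    escalate ambiguous ones to a human moderator, otherwise do nothing."""
    if confidence >= act_threshold:
        return ("act", proposed_action)
    if confidence >= review_threshold:
        return ("escalate_to_human", proposed_action)
    return ("no_op", None)
```

Such a gate directly addresses the misclassification risks above: sarcasm or mixed affect that lowers classifier confidence results in human review or inaction rather than a potentially mis-orchestrated intervention.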
6. Conclusions
This work introduced a conceptual framework for enabling conversational agents to collaboratively sense, interpret, and influence emotional dynamics in multi-human, multi-agent environments. The approach shifts affective computing from isolated, reactive sentiment recognition to the coordinated and adaptive orchestration of group affect, positioning agents as emotionally aware collaborators that contribute to a shared emotional landscape. The architecture integrates three core components—decentralized mood sensing, mood pattern grouping, and orchestration of agent interactions—which together enable agents to detect early emotional shifts, infer emerging group mood trends, and generate contextually aligned responses that promote affective convergence.
Our analysis of interaction configurations suggests that the framework offers the greatest potential in scenarios where agents have both access to user communication and the ability to coordinate their responses. These scenarios include one-to-one support (H–M–Ag), private personal assistance during group interactions (Ag–H–M), and multi-agent participation in shared communication platforms ([H|Ag]–M). In such settings, agents can exchange emotional context, adapt strategies collectively, and dynamically steer group mood as affective conditions evolve.
While these results illustrate conceptual feasibility, the framework remains in the prototype stage. Achieving operational readiness will require large-scale validation, long-term user trials, and formal ethical review. As digital platforms increasingly mediate social interaction, maintaining emotional stability will become critical to sustaining engagement and well-being. This framework offers an initial foundation for developing emotionally intelligent agent networks capable of fostering constructive and resilient group dynamics in complex multi-user environments.
Acknowledgements
This paper describes a conceptual framework informed by prior design work documented in Patent US20250094842-A1 (Koch et al., 2025), which was developed while the authors were employed at IBM, the current assignee of the patent application. The reference is included to illustrate the industrial relevance of the approach and does not constitute empirical validation or evidence of system performance.
Funding information
The authors received no funding in relation to this publication.
Ethical approval statement
No ethical approval was required to conduct the study.
Data availability
All data supporting this study are contained within the article.
Informed consent statement
Not applicable.
Conflict of interest
The authors declare no conflict of interest.
Authors’ contribution
Conceptualization: Fernando Koch, Jessica Nahulan, Jeremy Fox, Martin Keen; Figure preparation: Fernando Koch, Jessica Nahulan. All authors critically reviewed the manuscript and agreed to submit the final version of the manuscript.