Multi-Agent Systems Glossary: Orchestration & Roles
Multi-Agent Systems (MAS) are a practical way to ship agentic AI in real enterprise workflows: multiple specialized AI agents collaborate, divide work, and coordinate actions across tools and systems of record. As more teams move from demos to production, the limiting factor is rarely “model quality”; it’s whether you can orchestrate, govern, and operate agent teams safely at scale.
This glossary gives you reusable terminology for architecture reviews, product planning, and hiring—especially if you’re building on an AI low-code platform or enterprise development platform where workflows, permissions, and auditability are non-negotiable.
Why MAS matters now (and why the glossary is getting crowded)
Enterprise adoption is accelerating, and “agents” are becoming a common experiment target. In McKinsey’s 2025 global survey, 62% of respondents said their organizations are at least experimenting with AI agents.
At the same time, governance expectations are hardening. ISO/IEC 42001:2023 defines requirements and guidance for establishing and improving an AI management system in organizations, pushing teams to treat AI as an auditable operational capability.
A mental model you can reuse
A useful, implementation-oriented framing:
- Agents: goal-directed workers with tools.
- Orchestration: the coordination layer that assigns work, routes messages, validates actions, and tracks outcomes.
- Governance: policies, permissions, approvals, and audit trails that bound what agents can do.
- Operations: monitoring, evaluation, and incident response for agent behavior.
This maps well to how MAS is typically defined: “multiple interacting agents” that work together to solve tasks through coordination.
Core glossary: collaboration, division of labor, orchestration
Agent (AI Agent)
A software entity that can interpret goals, plan steps, call tools, and produce outputs that change state (or propose changes) in a system. In enterprise settings, the defining feature is action capability with constraints.
Agentic AI
A product behavior where the system can pursue a goal through multi-step actions (planning + tool use + state updates), typically with guardrails and approvals. “Agentic” describes how work gets done, not a specific model type.
Multi-Agent System (MAS)
A system composed of multiple interacting agents that coordinate to solve problems. MAS designs can be cooperative, competitive, or mixed, and can be centralized or decentralized in control.
Role
A named responsibility slice (e.g., “Policy Checker”, “Data Retriever”, “Executor”) that constrains prompts, tools, permissions, and success criteria. Role clarity reduces cross-agent interference and makes reviews auditable.
Capability
A contract describing what an agent can do: inputs, outputs, tool access, latency budget, and failure modes. Capability-first design makes orchestration composable and testable.
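As a rough sketch (in Python, with illustrative field names rather than any specific platform’s API), a capability contract can be as small as a frozen dataclass that the orchestrator reads before routing work:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Capability:
    """Illustrative capability contract for one agent role; field names are hypothetical."""
    name: str                 # e.g. "crm_lookup"
    inputs: dict              # expected input fields and types
    outputs: dict             # promised output fields and types
    allowed_tools: tuple      # tools this role may call
    latency_budget_ms: int    # soft deadline the orchestrator can enforce
    failure_modes: tuple = field(default_factory=tuple)  # known error cases callers must handle

crm_lookup = Capability(
    name="crm_lookup",
    inputs={"account_id": "str"},
    outputs={"account": "dict"},
    allowed_tools=("crm.read",),
    latency_budget_ms=2000,
    failure_modes=("account_not_found", "crm_timeout"),
)
```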
Task decomposition
Breaking a goal into smaller tasks with explicit dependencies. In MAS, decomposition is often paired with routing: each subtask is assigned to the best-suited role.
Task allocation
Choosing which agent (or role instance) owns a task, considering skills, permissions, load, and cost budgets. Allocation is where “division of labor” becomes operational.
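To make decomposition and allocation concrete, here is a minimal, hypothetical sketch: a goal broken into a small task graph with dependencies, plus an allocator that picks the least-loaded agent whose skills match (a real allocator would also check permissions and cost budgets):

```python
# Hypothetical decomposition: tasks with explicit skill requirements and dependencies.
tasks = {
    "retrieve_policy": {"skill": "retrieval", "deps": []},
    "check_compliance": {"skill": "policy", "deps": ["retrieve_policy"]},
    "draft_response": {"skill": "writing", "deps": ["check_compliance"]},
}

agents = [
    {"id": "retriever-1", "skills": {"retrieval"}, "load": 2},
    {"id": "policy-1", "skills": {"policy"}, "load": 0},
    {"id": "writer-1", "skills": {"writing"}, "load": 1},
]

def allocate(task_name: str) -> str:
    """Pick the least-loaded agent whose skills match the task; raise if none qualify."""
    skill = tasks[task_name]["skill"]
    candidates = [a for a in agents if skill in a["skills"]]
    if not candidates:
        raise RuntimeError(f"no agent can own task {task_name!r}")
    return min(candidates, key=lambda a: a["load"])["id"]

# Assign tasks whose dependencies are already complete (simple topological pass).
done: set = set()
while len(done) < len(tasks):
    ready = [t for t, spec in tasks.items()
             if t not in done and all(d in done for d in spec["deps"])]
    for t in ready:
        print(t, "->", allocate(t))
        done.add(t)
```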
Coordination
Mechanisms that ensure agents don’t work at cross purposes: shared state, communication protocols, conflict resolution, and scheduling. Coordination is the core technical challenge MAS exists to solve.
Orchestrator
A control component that manages agent execution: routing tasks, enforcing policies, collecting traces, and deciding retry/rollback strategies. Some MAS use a strong orchestrator; others distribute coordination.
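A minimal orchestration step, sketched in Python under the assumption that the policy check, the execution call, and the audit log are injected as callables; this is illustrative, not a reference implementation:

```python
import uuid

def run_task(task, agent, policy_check, execute, audit_log, max_retries=2):
    """Illustrative orchestrator step: policy gate, execute with retries, trace every decision."""
    trace_id = str(uuid.uuid4())
    if not policy_check(task, agent):
        audit_log.append({"trace": trace_id, "task": task, "decision": "blocked_by_policy"})
        return None
    for attempt in range(max_retries + 1):
        try:
            result = execute(agent, task)
            audit_log.append({"trace": trace_id, "task": task, "agent": agent,
                              "attempt": attempt, "status": "ok"})
            return result
        except Exception as err:  # in practice: catch specific tool/transport errors
            audit_log.append({"trace": trace_id, "task": task, "agent": agent,
                              "attempt": attempt, "status": f"error: {err}"})
    return None  # caller decides: escalate, compensate, or fail the run
```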
Workflow orchestration
Coordinating agent steps within a discrete workflow (e.g., “lead → qualify → propose → approval → CRM write-back”). This is the most common enterprise pattern because it maps to audit and approvals.
Process orchestration
Coordinating multiple workflows across a broader business process (e.g., “order-to-cash”), often across teams and systems, with SLAs and escalation paths.
Tool calling
A pattern where a model selects and invokes external tools (APIs, databases, services) via an orchestrator. Tool calling expands capability beyond the model’s internal knowledge.
Function calling
A structured form of tool calling where calls must match declared schemas, enabling validation, access control, logging, and safer automation.
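A hedged example of the schema-validation half of function calling, assuming the third-party jsonschema package is installed; the tool name and schema are made up for illustration:

```python
from jsonschema import ValidationError, validate

CREATE_TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "maxLength": 200},
        "priority": {"enum": ["low", "medium", "high"]},
    },
    "required": ["title", "priority"],
    "additionalProperties": False,
}

def call_create_ticket(arguments: dict) -> dict:
    """Validate model-proposed arguments against the declared schema before execution."""
    try:
        validate(instance=arguments, schema=CREATE_TICKET_SCHEMA)
    except ValidationError as err:
        # Rejection can be fed back to the model or surfaced to a human reviewer.
        return {"status": "rejected", "reason": err.message}
    # ... only now invoke the real ticketing API ...
    return {"status": "accepted"}
```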
Tool registry (or Tool catalog)
A governed inventory of tools with schemas, auth scopes, rate limits, and usage policies. It is the bridge between “agent intent” and “enterprise-safe execution”.
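One way a minimal registry could look (illustrative entries and scope names, not a standard format): each tool carries its schema, auth scopes, rate limit, and a write-back flag that policy checks can key off:

```python
# Illustrative tool registry: every entry carries the metadata governance needs.
TOOL_REGISTRY = {
    "crm.read": {
        "schema": {"type": "object", "required": ["account_id"]},
        "auth_scopes": ["crm:read"],
        "rate_limit_per_min": 60,
        "write_back": False,
    },
    "crm.update_stage": {
        "schema": {"type": "object", "required": ["account_id", "stage"]},
        "auth_scopes": ["crm:write"],
        "rate_limit_per_min": 10,
        "write_back": True,  # write-back tools get stricter policy and approval gates
    },
}

def lookup_tool(name: str, granted_scopes: set):
    """Resolve an agent's tool intent only if the tool is registered and scopes match."""
    entry = TOOL_REGISTRY.get(name)
    if entry is None:
        raise KeyError(f"tool {name!r} is not registered")
    if not set(entry["auth_scopes"]) <= granted_scopes:
        raise PermissionError(f"missing scopes for {name!r}")
    return entry
```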
Collaboration patterns: the shapes teams actually ship
Supervisor–worker (centralized control)
One coordinator agent (or orchestration service) assigns tasks to specialist agents, aggregates results, and decides next steps. Easier to audit and reason about; potential bottleneck at scale.
Pipeline (sequential specialists)
Agents run in a fixed order: retrieve → analyze → draft → verify → execute. Works well for compliance-heavy workflows and deterministic reviews.
Blackboard (shared workspace)
Agents post intermediate artifacts to a shared state store; other agents pick up tasks based on state changes. Powerful for parallelism; requires strong consistency rules and conflict handling.
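A toy blackboard sketch, assuming in-process shared memory protected by a lock; real deployments would use a database or event store with stronger consistency guarantees:

```python
import threading

class Blackboard:
    """Toy shared workspace: agents post artifacts and claim work based on state changes."""
    def __init__(self):
        self._lock = threading.Lock()
        self.artifacts: dict = {}   # key -> intermediate result
        self.claims: dict = {}      # task -> owning agent (a simple conflict-handling rule)

    def post(self, key: str, value) -> None:
        with self._lock:
            self.artifacts[key] = value

    def claim(self, task: str, agent_id: str) -> bool:
        """First claimant wins; later claimants back off instead of duplicating work."""
        with self._lock:
            if task in self.claims:
                return False
            self.claims[task] = agent_id
            return True

board = Blackboard()
board.post("retrieved_evidence", ["doc-17", "doc-42"])
print(board.claim("draft_summary", "writer-1"))  # True
print(board.claim("draft_summary", "writer-2"))  # False: already owned
```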
Peer-to-peer (distributed coordination)
Agents communicate directly to negotiate responsibilities and resolve conflicts. Robust to orchestrator failure; requires careful protocol design and stronger observability.
Orchestration terms that matter in production
Control plane vs data plane
- Control plane: routing, policies, permissions, retries, approvals, audit.
- Data plane: tool execution, data reads/writes, integrations.
Keeping these two planes separate makes it possible to upgrade governance without rewriting business logic.
State (shared memory)
Shared state includes the task graph, partial outputs, retrieved evidence, and pending approvals. State design determines your ability to replay and debug incidents.
Determinism and replay
Enterprises often need repeatable runs for audits and incident analysis. Replay typically requires capturing tool inputs/outputs, model prompts, and routing decisions.
Idempotency
A property where repeated execution leads to the same effect (or safe no-op). For write-back actions, idempotency prevents duplicated tickets, payments, or record updates.
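A minimal idempotency-key sketch (in-memory for illustration; production systems persist the key-to-result mapping):

```python
# Illustrative idempotency guard for a write-back action.
_processed: dict = {}  # idempotency_key -> prior result (a real system would persist this)

def create_ticket_once(idempotency_key: str, payload: dict) -> dict:
    """Repeat calls with the same key return the original result instead of creating duplicates."""
    if idempotency_key in _processed:
        return _processed[idempotency_key]
    result = {"ticket_id": f"TCK-{len(_processed) + 1}", "payload": payload}  # stand-in for the real API call
    _processed[idempotency_key] = result
    return result

print(create_ticket_once("run-42/step-3", {"title": "Renewal risk"}))
print(create_ticket_once("run-42/step-3", {"title": "Renewal risk"}))  # same ticket, no duplicate
```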
Compensation
When you cannot “undo” an external side effect, you design a compensating action (e.g., revert a status, post a correction entry, open a remediation task).
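A small saga-style sketch of compensation: each step registers its compensating action so a failure later in the run can be unwound in reverse order (the actions here are print statements standing in for real side effects):

```python
# Illustrative compensation log (a minimal saga-style sketch).
completed_steps = []

def do_step(name: str, action, compensate) -> None:
    """Run an action and remember how to compensate it if a later step fails."""
    action()
    completed_steps.append((name, compensate))

def roll_back() -> None:
    """Apply compensations in reverse order; side effects are corrected, not erased."""
    while completed_steps:
        name, compensate = completed_steps.pop()
        print(f"compensating {name}")
        compensate()

do_step("set_status", lambda: print("status -> approved"),
        lambda: print("status -> pending (reverted)"))
do_step("notify_customer", lambda: print("email sent"),
        lambda: print("correction email sent"))
roll_back()
```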
Budgeting (cost/latency)
MAS increases call volume. Budgets (token, tool, time) should be first-class orchestration constraints, not afterthoughts.
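One way to make budgets first-class, sketched as a simple dataclass the orchestrator consults before each call; the limits are illustrative:

```python
from dataclasses import dataclass

@dataclass
class RunBudget:
    """Illustrative per-run budget checked before each agent or tool call."""
    max_tokens: int
    max_tool_calls: int
    max_seconds: float
    tokens_used: int = 0
    tool_calls_used: int = 0
    seconds_used: float = 0.0

    def allows(self, tokens: int = 0, tool_calls: int = 0, seconds: float = 0.0) -> bool:
        return (self.tokens_used + tokens <= self.max_tokens
                and self.tool_calls_used + tool_calls <= self.max_tool_calls
                and self.seconds_used + seconds <= self.max_seconds)

budget = RunBudget(max_tokens=50_000, max_tool_calls=20, max_seconds=120.0)
print(budget.allows(tokens=4_000, tool_calls=1))  # True: proceed with the next step
```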
Communication and contracts: how agents stay aligned
Message
A unit of communication that includes sender, receiver, intent, and payload. In practice, messages should carry trace IDs for observability.
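A minimal message envelope, sketched with assumed field names; the key point is that the trace ID travels with every hop:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class AgentMessage:
    """Illustrative message envelope: intent plus payload, always traceable."""
    sender: str
    receiver: str
    intent: str            # e.g. "request", "result", "escalation"
    payload: dict
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))

msg = AgentMessage(sender="retriever-1", receiver="policy-1",
                   intent="result", payload={"evidence": ["doc-17"]})
print(msg.trace_id)  # the same trace_id appears in tool calls, approvals, and audit entries
```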
Protocol
Rules for how messages are structured and when they can be sent (e.g., request/response, publish/subscribe, negotiation). MAS research has long studied agent communication languages and protocols; modern LLM-based MAS often approximate protocols through structured schemas and routing constraints.
Schema
A machine-checkable contract for inputs/outputs, tool calls, and intermediate artifacts. Schema discipline is one of the simplest ways to reduce “agent drift”.
Governance glossary: terms your security and compliance teams will ask for
Policy engine
A component that evaluates whether an action is allowed (tool use, data scope, write-back) based on context: user, role, record type, environment, and risk level.
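A hedged sketch of a policy decision function: it returns allow, deny, or require_approval based on a context dictionary (the field names are assumptions, not a specific policy language):

```python
# Illustrative policy check: allow/deny/escalate based on action context.
def evaluate_policy(context: dict) -> str:
    """Return "allow", "deny", or "require_approval" for a proposed action."""
    if context["tool"] not in context["agent_allowed_tools"]:
        return "deny"
    if context.get("write_back") and context.get("record_type") in {"payment", "contract"}:
        return "require_approval"          # high-risk write-back goes through HITL
    if context.get("environment") == "production" and context.get("risk_level") == "high":
        return "require_approval"
    return "allow"

decision = evaluate_policy({
    "tool": "crm.update_stage",
    "agent_allowed_tools": {"crm.read", "crm.update_stage"},
    "write_back": True,
    "record_type": "opportunity",
    "environment": "production",
    "risk_level": "low",
})
print(decision)  # "allow"
```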
Permission model
The mapping from identities (users, service accounts, agent roles) to allowed actions and data scopes. Permission alignment is essential for safe write-back.
Human-in-the-loop (HITL)
A required checkpoint where a person reviews, approves, or edits an agent’s proposed action. HITL can be role-based (e.g., manager approval) or context-based (e.g., triggered for high-risk actions).
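A minimal approval-gate sketch: risky proposals are parked as pending items until a person approves, rejects, or annotates them (the queue here is an in-memory list purely for illustration):

```python
# Illustrative approval gate: risky proposals wait for a human decision before execution.
pending_approvals = []

def submit_for_approval(proposal: dict, approver_role: str) -> dict:
    item = {"proposal": proposal, "approver_role": approver_role, "status": "pending"}
    pending_approvals.append(item)
    return item

def decide(item: dict, approved: bool, editor_notes: str = "") -> None:
    item["status"] = "approved" if approved else "rejected"
    item["notes"] = editor_notes  # reviewers can edit or annotate, not just accept or reject

req = submit_for_approval({"tool": "crm.update_stage", "stage": "closed_won"},
                          approver_role="sales_manager")
decide(req, approved=True, editor_notes="verified with account owner")
print(req["status"])
```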
Audit log
An immutable record of what the system did: prompts, tool calls, approvals, policy decisions, and record updates. Governance standards encourage traceability and accountable operation.
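A toy append-only audit log with hash chaining so tampering is detectable; a real system would back this with write-once storage, but the shape of each entry is the point:

```python
import hashlib, json, time

audit_log = []  # append-only; production systems use write-once storage

def record(event: dict) -> dict:
    """Append an audit entry chained to the previous one, so tampering is detectable."""
    prev_hash = audit_log[-1]["hash"] if audit_log else ""
    body = {"ts": time.time(), "event": event, "prev_hash": prev_hash}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    audit_log.append(body)
    return body

record({"type": "tool_call", "tool": "crm.read", "trace_id": "run-42/step-1"})
record({"type": "approval", "decision": "approved", "trace_id": "run-42/step-3"})
print(len(audit_log), audit_log[-1]["hash"][:12])
```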
AI management system (AIMS)
In ISO/IEC 42001 terms, the organizational system of policies, processes, and controls that manages AI risks and opportunities across the lifecycle.
Risk framing (NIST AI RMF)
NIST AI RMF provides a voluntary framework to manage AI risks, emphasizing governance and trustworthiness considerations across design, development, and use.
A practical decision guide: when MAS is worth it
MAS brings coordination overhead. It pays off when:
- The task naturally splits into specialist subproblems (policy, data, execution, verification).
- Tooling spans multiple systems, requiring routing and constraints.
- You need parallelism for latency or throughput.
- You need clear responsibility boundaries for audits and approvals.
Single-agent designs can be enough when the workflow is short, tool surface is small, and write-back is minimal.
Research surveys also highlight that LLM-based MAS introduce new failure modes (coordination errors, compounding hallucinations, inconsistent shared state), so orchestration and evaluation become central.
What this means for AI low-code and enterprise development platforms
MAS becomes significantly easier when your platform already provides:
- A workflow engine for orchestration and approvals
- A unified data model for shared state and traceability
- A permission system that governs tool access and write-back
- Observability hooks (traces, logs, run history)
- A tool catalog with schemas and auth policies
This is why many teams treat agent systems as a product capability embedded into the enterprise runtime, not a thin chat layer.
If you want a concrete starting point, the quickest path is building one governed workflow end-to-end (read → propose → approve → write-back) and then splitting roles into multiple agents as complexity grows. The JitAI Tutorial is a good on-ramp for that “workflow-first” approach.
When you’re ready to test a real project with your own data and permissions, you can try JitAI and treat the first deliverable as an operational prototype: measurable latency, explicit approval gates, and a complete audit trail.
Career map: the roles companies are hiring for
MAS vocabulary maps cleanly to job expectations:
Agentic AI application builder
- Designs workflows, schemas, and tools
- Defines roles, prompts, and success metrics
- Owns evaluation and iteration loops
Orchestration engineer
- Builds routing, retries, budgets, and state management
- Implements policy checks, approvals, idempotency, compensation
- Owns traceability and production readiness
AI governance / risk lead
- Defines lifecycle controls aligned to standards
- Ensures auditability and incident response readiness
- Partners with security, legal, and business owners
Industry signals suggest “agents” are becoming a common experimentation area, and enterprise AI investment remains strong—creating demand for people who can operationalize the systems, not only prototype them.
FAQ
What is the difference between tool calling and function calling?
Tool calling is the general pattern of invoking external tools; function calling makes those invocations conform to declared schemas, enabling validation and governance.
Do multi-agent systems always require a central orchestrator?
Many enterprise deployments start with a centralized orchestrator for auditability and reliability. Distributed coordination can scale better, yet it increases complexity in consistency and debugging.
How do I prevent agents from “fighting” or duplicating work?
Use explicit task allocation, shared state with ownership rules, schema-validated handoffs, and conflict resolution policies (retry, merge, escalate). Coordination mechanisms are a core MAS research topic.
What governance artifacts should exist before write-back to core systems?
At minimum: permissioned tool catalog, policy checks, approval gates for risky actions, and end-to-end audit logs. Standards and frameworks like ISO/IEC 42001 and NIST AI RMF provide lifecycle guidance.
What neutral data points support “MAS/agents are becoming real”?
Examples: Stanford’s AI Index reports accelerating enterprise AI usage and strong generative AI investment, and McKinsey reports widespread experimentation with AI agents.