From Transcript to Task: How the Meetings API Closes the Action Item Loop
TL;DR
- Four-endpoint Meetings API ingests transcripts with their action items and converts those items into tracked TMS tasks with full provenance.
- Idempotent conversion means agents can safely retry without creating duplicate tasks.
- This is how a cyborgenic organization turns spoken commitments into executed work — no manual re-entry, no lost context.
"I'll handle the migration by Friday." Someone says it. Someone else writes it in their notes. A third person remembers it differently. By Monday, nobody can agree on what was committed to, who owns it, or when it was due. The action item lived exactly as long as the meeting lasted, then evaporated.
In a Cyborgenic Organization, the gap between spoken commitment and executed task is an operational failure, not just a communication problem. When AI agents are standing by to pick up work the moment it is defined, the bottleneck shifts to how fast decisions move from conversation into the task system.
I have watched this happen in every team I have ever managed. The meeting produces energy. The follow-up produces nothing. At agent.ceo, where 11 AI agents operate alongside me as persistent team members -- CEO, CTO, DevOps, Fullstack, Marketing, Architect, CFO, CSO, Investment, Org-Agent, and ZiDevops-Director -- that failure mode is not just annoying. It breaks the entire execution loop.
So I built a Meetings REST API that closes the gap: a transcript goes in with its action items attached, and one API call converts them into tracked tasks in our Task Management System (TMS) -- with assignees, due dates, priorities, and full traceability back to the meeting where they originated. The tasks publish to NATS JetStream on port 4222, and the agents pick them up within seconds. We have shipped 646 commits in May 2026 alone (9,799 total) across a codebase with 83,163 test functions in 2,304 test files. This Meetings API is one of the reasons we can move that fast.
Four endpoints. No manual transcription. No lost action items. Here is how it works.
The API: Four Endpoints That Cover the Full Lifecycle
The Meetings API is a FastAPI service backed by Firestore, authenticated via Firebase JWT tokens with organization-level membership checks. It sits behind our Gateway on port 8000, alongside the MCP Registry (:8001) and Agent Registry (:8002). When a task gets created, it publishes to NATS JetStream -- the same event bus that coordinates all 11 agents across the fleet. Here is the actual NATS emitter code from our wiki-graph-builder that shows the event pattern:
# From services/wiki-graph-builder/nats_emitter.py (real production code, excerpt)
import json
import logging
import os
from typing import Any

# Assumed logger setup; the original module's logging configuration is not shown.
logger = logging.getLogger(__name__)

NATS_URL = os.environ.get("NATS_URL", "nats://nats.agents.svc.cluster.local:4222")


async def emit(subject: str, payload: dict[str, Any]) -> bool:
    """Publish a JSON event to a NATS subject.

    Returns True if published successfully, False on failure.
    Never raises; logs warnings on error.
    """
    try:
        # _get_client() is defined elsewhere in the module (not shown here).
        client = await _get_client()
        data = json.dumps(payload).encode()
        await client.publish(subject, data)
        logger.info("Emitted %s (%d bytes)", subject, len(data))
        return True
    except Exception as e:
        logger.warning("Failed to emit %s: %s", subject, e)
        return False
That Never raises comment is important -- I learned the hard way that an event emission failure should not break the primary operation. The meeting gets stored regardless. The event is best-effort notification.
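To make the "store first, notify second" ordering concrete, here is a minimal sketch under stated assumptions. The in-memory `store` dict stands in for Firestore, `flaky_emit` is a hypothetical emitter that always fails, and the `meetings.ingested` subject name is invented for illustration; none of these appear in the production code.

```python
import asyncio
import logging

logger = logging.getLogger("meetings.sketch")


async def flaky_emit(subject: str, payload: dict) -> bool:
    """Hypothetical emitter that always fails, mimicking a NATS outage."""
    try:
        raise ConnectionError("NATS unreachable")
    except Exception as exc:
        # Same contract as emit(): log a warning and return False, never raise.
        logger.warning("Failed to emit %s: %s", subject, exc)
        return False


async def store_meeting(store: dict, meeting: dict, emit=flaky_emit) -> dict:
    """Persist first, notify second; a failed emit never undoes the write."""
    store[meeting["meeting_id"]] = meeting
    emitted = await emit("meetings.ingested", {"meeting_id": meeting["meeting_id"]})
    return {"stored": True, "event_emitted": emitted}


store: dict = {}
result = asyncio.run(store_meeting(store, {"meeting_id": "m-1", "title": "Planning"}))
print(result)  # {'stored': True, 'event_emitted': False}
```

The caller can still see that the event was not delivered, but the primary write has already succeeded by the time the emitter runs.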
The API exposes four endpoints:
POST /api/v1/orgs/{org_id}/meetings/transcript — ingest transcript + action items
GET /api/v1/orgs/{org_id}/meetings — list meetings (filterable, paginated)
GET /api/v1/orgs/{org_id}/meetings/{meeting_id} — full meeting detail with transcript
POST /api/v1/orgs/{org_id}/meetings/{meeting_id}/actions — convert action items to TMS tasks
The first endpoint ingests. The middle two query. The last one converts. That is the full lifecycle: raw meeting data enters the system, becomes queryable, and its action items become executable work.
Every endpoint requires a valid Firebase JWT token and verifies the caller is a member of the target organization via require_org_member. No cross-org access, no anonymous reads.
The Data Model: What Goes In
The ingest endpoint accepts a TranscriptIngestRequest that captures everything you need from a meeting:
from typing import List, Optional
from uuid import uuid4

from pydantic import BaseModel, Field


class ActionItem(BaseModel):
    id: str = Field(default_factory=lambda: str(uuid4()))
    title: str = Field(..., min_length=1, max_length=500)
    assignee: Optional[str] = None
    due_date: Optional[str] = None
    priority: Optional[str] = Field(None, pattern="^(low|medium|high|critical)$")
    status: str = Field(default="pending")


class TranscriptIngestRequest(BaseModel):
    meeting_id: Optional[str] = Field(
        default=None,
        description="Optional. Auto-generated if not provided."
    )
    title: str = Field(..., min_length=1, max_length=500)
    # min_length is the Pydantic v2 spelling of min_items, matching the v2-style pattern= above
    participants: List[str] = Field(..., min_length=1)
    transcript: str = Field(..., min_length=1, max_length=500_000)
    action_items: List[ActionItem] = Field(default_factory=list)
    source: Optional[str] = Field(
        None,
        description="Where the transcript came from: zoom, teams, manual, etc."
    )
    recorded_at: Optional[str] = None
A few design decisions worth noting.
Transcript cap at 500K characters. A one-hour meeting transcript runs about 8,000-12,000 words, or roughly 40,000-60,000 characters. 500K gives you room for multi-hour sessions or transcripts with heavy technical content (code reviews, architecture discussions) without opening the door to unbounded payloads. Pydantic enforces this at the validation layer before the request ever hits Firestore.
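As a quick check on that headroom, the arithmetic can be worked through using only the figures quoted above (40,000-60,000 characters per meeting hour and the 500K cap):

```python
TRANSCRIPT_CAP_CHARS = 500_000       # limit enforced by the ingest model
CHARS_PER_HOUR = (40_000, 60_000)    # range quoted in the text for a one-hour meeting


def hours_of_meeting(cap: int, chars_per_hour: int) -> float:
    """How many hours of transcript fit under the cap at a given density."""
    return cap / chars_per_hour


longest = hours_of_meeting(TRANSCRIPT_CAP_CHARS, CHARS_PER_HOUR[0])   # sparse transcript
shortest = hours_of_meeting(TRANSCRIPT_CAP_CHARS, CHARS_PER_HOUR[1])  # dense transcript
print(f"{shortest:.1f} to {longest:.1f} hours")  # 8.3 to 12.5 hours
```

So the cap corresponds to roughly a full working day of continuous conversation at those densities.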
Meeting ID is optional. If you provide one, the API uses it — useful for idempotent re-ingestion from the same source. If you don't, a UUID gets generated. This means you can integrate with meeting platforms that have their own IDs (Zoom, Teams) or with manual transcription workflows that don't.
Action items are embedded, not separate. They arrive as part of the transcript ingest, not as a second API call. This matches how action items actually work — they are artifacts of the meeting, not independent entities. They only become independent when you explicitly convert them to tasks.
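For illustration, a client-side ingest call might be assembled like this. It is a sketch, not the official client: the host `api.example.com` is hypothetical, the `Authorization: Bearer` header is an assumption about how the Firebase JWT is passed, and the ISO-style `due_date` is a guess at the string format. The request is only built, never sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not from the article


def build_ingest_request(org_id: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the transcript ingest endpoint."""
    url = f"{BASE_URL}/api/v1/orgs/{org_id}/meetings/transcript"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    # meeting_id omitted on purpose: the API generates one when it is missing
    "title": "Migration planning",
    "participants": ["alice@company.com", "bob@company.com"],
    "transcript": "Alice: I'll handle the migration by Friday.",
    "action_items": [
        {
            "title": "Handle the migration",
            "assignee": "alice@company.com",
            "due_date": "2026-05-15",  # assumed date format
            "priority": "high",
        }
    ],
    "source": "manual",
}

request = build_ingest_request("org-1", "<firebase-jwt>", payload)
print(request.get_method(), request.full_url)
```

Passing a platform-specific `meeting_id` (for example a Zoom meeting ID) in the payload would make repeated ingestion from the same source target the same document.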
Querying: Filters and Pagination
The list endpoint supports filtering by participant and date range, with offset-based pagination via limit and offset:
GET /api/v1/orgs/{org_id}/meetings?participant=alice@company.com
&start_date=2026-05-01
&end_date=2026-05-31
&limit=20
&offset=0
This is straightforward, but it matters. When an agent needs to find "the meeting where we discussed the migration plan," it can query by participant, narrow by date, and get the right meeting without scanning every document in the collection. Firestore composite indexes handle the query performance.
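The filter semantics of the list endpoint can be sketched in memory. This is an approximation under assumptions that the article does not confirm: `recorded_at` is an ISO 8601 string, the date range is inclusive on both ends, and results are ordered newest first.

```python
from datetime import date


def list_meetings(meetings, participant=None, start_date=None, end_date=None,
                  limit=20, offset=0):
    """Filter by participant and inclusive date range, then apply limit/offset."""

    def matches(meeting):
        day = date.fromisoformat(meeting["recorded_at"][:10])
        if participant and participant not in meeting["participants"]:
            return False
        if start_date and day < date.fromisoformat(start_date):
            return False
        if end_date and day > date.fromisoformat(end_date):
            return False
        return True

    hits = sorted(
        (m for m in meetings if matches(m)),
        key=lambda m: m["recorded_at"],
        reverse=True,  # assumed ordering: newest first
    )
    return hits[offset:offset + limit]


meetings = [
    {"meeting_id": "m-1", "participants": ["alice@company.com"],
     "recorded_at": "2026-05-03T10:00:00Z"},
    {"meeting_id": "m-2", "participants": ["bob@company.com"],
     "recorded_at": "2026-05-10T10:00:00Z"},
    {"meeting_id": "m-3", "participants": ["alice@company.com", "bob@company.com"],
     "recorded_at": "2026-06-02T10:00:00Z"},
]

page = list_meetings(meetings, participant="alice@company.com",
                     start_date="2026-05-01", end_date="2026-05-31")
print([m["meeting_id"] for m in page])  # ['m-1']
```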
The detail endpoint returns the full meeting document including the complete transcript and all action items with their current statuses. This is the endpoint agents use when they need full context — reviewing what was discussed before executing a task that originated from that meeting.
The Convert Endpoint: Where Action Items Become Work
This is the core of the system. The convert-actions endpoint takes the action items embedded in a meeting and creates real, tracked TMS tasks from them:
from fastapi import APIRouter, Depends, HTTPException

router = APIRouter()

# require_org_member and get_firestore are project dependencies defined elsewhere.


@router.post("/api/v1/orgs/{org_id}/meetings/{meeting_id}/actions")
async def convert_actions_to_tasks(
    org_id: str,
    meeting_id: str,
    user=Depends(require_org_member),
    db=Depends(get_firestore),
):
    meeting_ref = (
        db.collection("orgs").document(org_id)
        .collection("meetings").document(meeting_id)
    )
    meeting_doc = await meeting_ref.get()
    if not meeting_doc.exists:
        raise HTTPException(status_code=404, detail="Meeting not found")

    meeting_data = meeting_doc.to_dict()
    action_items = meeting_data.get("action_items", [])
    created_tasks = []

    for item in action_items:
        if item.get("status") == "converted":
            continue  # already converted, skip

        task_doc = {
            "title": item["title"],
            "assignee": item.get("assignee"),
            "due_date": item.get("due_date"),
            # "or" also covers a priority stored as None, which .get()'s default would miss
            "priority": item.get("priority") or "medium",
            "status": "pending",
            "source": "meeting",
            "source_meeting_id": meeting_id,
            "source_action_item_id": item["id"],
            "org_id": org_id,
            "created_by": user["uid"],
        }
        task_ref = (
            db.collection("orgs").document(org_id)
            .collection("tasks").document()
        )
        await task_ref.set(task_doc)

        item["status"] = "converted"
        created_tasks.append({
            "task_id": task_ref.id,
            "action_item_id": item["id"],
            "title": item["title"],
        })

    # Update meeting doc with converted statuses
    await meeting_ref.update({"action_items": action_items})

    return {
        "converted": len(created_tasks),
        "skipped": len(action_items) - len(created_tasks),
        "tasks": created_tasks,
    }
Three properties matter here.
Idempotent. Call the endpoint twice and the end state is the same. On the first call, pending action items become tasks and get marked as "converted." On the second call, those items are skipped and counted under skipped in the response. Once the converted statuses are written back to the meeting document, no number of repeat calls produces a duplicate task from the same action item. This is critical for reliability: agents retry failed requests, webhooks can fire multiple times, and humans click buttons more than once.
Traceable. Every task created from a meeting carries three provenance fields: source="meeting", source_meeting_id, and source_action_item_id. An agent working on a task can trace it back to the exact meeting and the exact action item that spawned it. If the task description is ambiguous, the agent can pull the full transcript for context. The chain from meeting to action item to task to execution is never broken.
Selective by status. The endpoint converts every unconverted action item in a single call. You do not specify which items to convert; the status field decides. Because items are marked individually, the design could be extended to support partial conversion if needed. The current design optimizes for the common case: a meeting ends, someone (or some agent) calls the convert endpoint, and all action items become tasks at once.
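One caveat worth sketching: the handler above writes the task documents first and marks the items as converted afterwards, so a crash between those two steps could let a retry create the same tasks again. A possible hardening, shown here as an in-memory sketch rather than the production approach, is to derive the task ID deterministically from the meeting ID and action item ID. The `task_id_for` helper, the `mtg-` prefix, and the dict standing in for the tasks collection are all hypothetical.

```python
import hashlib


def task_id_for(meeting_id: str, action_item_id: str) -> str:
    """Derive a stable task ID so the same action item always maps to one task."""
    digest = hashlib.sha256(f"{meeting_id}:{action_item_id}".encode()).hexdigest()
    return f"mtg-{digest[:20]}"


def convert(meeting: dict, tasks: dict) -> dict:
    """Create tasks for unconverted items; existing task IDs are never rewritten."""
    created = []
    for item in meeting["action_items"]:
        task_id = task_id_for(meeting["meeting_id"], item["id"])
        if task_id in tasks:
            item["status"] = "converted"  # repair a status left stale by a crash
            continue
        tasks[task_id] = {
            "title": item["title"],
            "source": "meeting",
            "source_meeting_id": meeting["meeting_id"],
            "source_action_item_id": item["id"],
        }
        item["status"] = "converted"
        created.append(task_id)
    return {
        "converted": len(created),
        "skipped": len(meeting["action_items"]) - len(created),
    }


meeting = {
    "meeting_id": "m-1",
    "action_items": [{"id": "item-1", "title": "Migrate auth service", "status": "pending"}],
}
tasks: dict = {}

first = convert(meeting, tasks)
meeting["action_items"][0]["status"] = "pending"  # simulate a lost status update
second = convert(meeting, tasks)
print(first, second, len(tasks))
```

With a real document store, the same idea would presumably map to writing each task under that explicit document ID, so a retry targets the existing document instead of generating a new one.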
The Response: What You Get Back
The convert endpoint returns a clear accounting of what happened:
{
  "converted": 3,
  "skipped": 1,
  "tasks": [
    {
      "task_id": "abc123",
      "action_item_id": "item-1",
      "title": "Migrate auth service to new JWT provider"
    },
    {
      "task_id": "def456",
      "action_item_id": "item-2",
      "title": "Write load test for payment endpoint"
    },
    {
      "task_id": "ghi789",
      "action_item_id": "item-3",
      "title": "Update runbook for database failover"
    }
  ]
}
converted tells you how many new tasks were created. skipped tells you how many were already converted. The tasks array maps each new task ID back to its source action item. No ambiguity about what happened.
Auth: Organization-Scoped and JWT-Verified
Every endpoint runs through require_org_member, a FastAPI dependency that verifies the Firebase JWT token and checks that the authenticated user belongs to the target organization. This is the same auth pattern we use across all of our org-scoped APIs -- consistent, auditable, and hard to accidentally bypass, because the check is declared as a dependency in each handler's signature rather than as middleware configured somewhere else.
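from fastapi import Depends, HTTPException

# verify_firebase_token and get_firestore are project dependencies defined elsewhere.


async def require_org_member(
    org_id: str,  # resolved by FastAPI from the {org_id} path parameter
    token: dict = Depends(verify_firebase_token),
    db=Depends(get_firestore),
):
    member_ref = (
        db.collection("orgs").document(org_id)
        .collection("members").document(token["uid"])
    )
    member_doc = await member_ref.get()
    if not member_doc.exists:
        raise HTTPException(status_code=403, detail="Not a member of this org")
    return token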
Agents authenticate the same way human users do. A marketing agent converting meeting action items needs a valid JWT for the organization it operates in. No special agent tokens, no privilege escalation.
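The decision logic can be sketched without FastAPI or Firestore. In this simplified model, `AccessError`, `check_org_member`, and the `members` mapping are hypothetical stand-ins, and a missing token is treated as the unauthenticated case that returns 401.

```python
from typing import Optional


class AccessError(Exception):
    """Hypothetical stand-in for HTTPException in this sketch."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def check_org_member(token: Optional[dict], org_id: str, members: dict) -> dict:
    """Return the token if its uid belongs to org_id; raise 401 or 403 otherwise."""
    if token is None:
        raise AccessError(401, "Missing or invalid token")
    if token["uid"] not in members.get(org_id, set()):
        raise AccessError(403, "Not a member of this org")
    return token


# Humans and agents live in the same membership set: no special agent path.
members = {"org-1": {"user-alice", "agent-marketing"}}


def status_for(token: Optional[dict], org_id: str) -> int:
    try:
        check_org_member(token, org_id, members)
        return 200
    except AccessError as err:
        return err.status_code


print(
    status_for({"uid": "agent-marketing"}, "org-1"),  # member agent
    status_for({"uid": "agent-marketing"}, "org-2"),  # wrong organization
    status_for(None, "org-1"),                        # no token
)  # 200 403 401
```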
Once a task enters TMS, agents query it through our MCP tools. Here is the real get_task_status implementation from agent_hub_mcp.py -- one of 190 functions in that single file:
# From conductor/src/mcp_servers/agent_hub_mcp.py (real production code)
@mcp.tool()
async def get_task_status(task_id: str) -> dict:
    """Get the status and details of a task."""
    task_store = get_task_store()
    task = task_store.get_task(task_id)
    if not task:
        return {
            "found": False,
            "error": f"Task '{task_id}' not found in registry",
            "hint": "The task may not have been registered or the ID is incorrect",
        }
    return {
        "found": True,
        "id": task.id,
        "description": task.description,
        "assignee": task.assignee,
        "assigner": task.assigner,
        "status": task.status.value,
        "priority": task.priority.value,
        "created_at": task.created_at,
        "source_meeting_id": task.related_to,
        "progress_notes": task.progress_notes,
    }
This is the function an agent calls when it picks up a meeting-originated task and needs the full context. The related_to field traces back to the meeting ID, so the agent can pull the transcript if the task description is ambiguous. That traceability chain -- meeting to action item to task to agent -- is never broken.
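Following the provenance fields back to the source might look like the sketch below. The `context_for_task` helper and the in-memory `meetings` dict are hypothetical; in practice an agent would call the meeting detail endpoint instead of reading a dict.

```python
from typing import Optional


def context_for_task(task: dict, meetings: dict) -> Optional[dict]:
    """Return the originating action item and transcript, or None if untraceable."""
    if task.get("source") != "meeting":
        return None
    meeting = meetings.get(task.get("source_meeting_id"))
    if meeting is None:
        return None
    item = next(
        (i for i in meeting["action_items"]
         if i["id"] == task.get("source_action_item_id")),
        None,
    )
    return {"action_item": item, "transcript": meeting["transcript"]}


meetings = {
    "m-1": {
        "transcript": "Alice: I'll handle the migration by Friday.",
        "action_items": [{"id": "item-1", "title": "Handle the migration"}],
    }
}
task = {
    "title": "Handle the migration",
    "source": "meeting",
    "source_meeting_id": "m-1",
    "source_action_item_id": "item-1",
}

context = context_for_task(task, meetings)
print(context["action_item"]["title"])  # Handle the migration
```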
Testing: 28 Tests Covering Models, Endpoints, and Edge Cases
I do not ship APIs without comprehensive tests. We have 83,163 test functions across 2,304 test files in the codebase -- that is not a typo. Testing is not optional when you have 11 autonomous agents depending on your infrastructure. We shipped 28 tests across the Meetings API:
- Model validation tests. Pydantic models reject transcripts over 500K characters, titles over 500 characters, empty participant lists, and invalid priority values. These tests verify that bad data never reaches Firestore.
- Ingest endpoint tests. Happy path with all fields, minimal payload with only required fields, auto-generation of meeting IDs, handling of duplicate ingestion.
- List and detail endpoint tests. Pagination correctness, participant filtering, date range filtering, 404 for nonexistent meetings.
- Convert endpoint tests. Conversion creates correct task documents, idempotent re-conversion skips already-converted items, 404 for nonexistent meetings, provenance fields are set correctly on created tasks.
- Auth tests. Unauthenticated requests return 401, requests from non-members return 403, valid member tokens succeed.
The idempotency tests are the most important. We call the convert endpoint, verify tasks are created, call it again, and verify that zero additional tasks appear and the skipped count matches. In a system where agents retry requests automatically, idempotency is not a nice-to-have. It is a correctness requirement.
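The shape of such a test can be sketched against a pure function that mirrors the skip-if-converted loop. This is not the real test suite: `convert_pending` is a hypothetical reduction of the endpoint, with a plain list standing in for the tasks collection.

```python
def convert_pending(action_items: list, tasks: list) -> dict:
    """Mirror of the endpoint's loop: convert pending items, skip converted ones."""
    created = 0
    for item in action_items:
        if item.get("status") == "converted":
            continue
        tasks.append({"title": item["title"], "source_action_item_id": item["id"]})
        item["status"] = "converted"
        created += 1
    return {"converted": created, "skipped": len(action_items) - created}


def test_convert_is_idempotent():
    items = [
        {"id": "item-1", "title": "Write load test", "status": "pending"},
        {"id": "item-2", "title": "Update runbook", "status": "pending"},
    ]
    tasks: list = []

    first = convert_pending(items, tasks)
    second = convert_pending(items, tasks)

    assert first == {"converted": 2, "skipped": 0}
    assert second == {"converted": 0, "skipped": 2}  # skipped count matches
    assert len(tasks) == 2                           # zero additional tasks


test_convert_is_idempotent()
```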
Why This Matters: The Meeting-to-Execution Gap
Every organization has the same problem. Meetings generate commitments. Those commitments get written down in notes, Slack messages, follow-up emails, or nowhere at all. Some percentage of them get manually transcribed into a task tracker. Some percentage of those get assigned. Some percentage of those get completed.
Each handoff in that chain loses signal. By the time a commitment makes it from spoken word to tracked task, the details have degraded. The assignee is wrong, the due date is missing, the context is gone.
The Meetings API eliminates the handoff chain. A transcript goes in with action items attached. One API call converts those action items into tracked tasks with assignees, due dates, priorities, and a direct link back to the source meeting. The tasks enter the same TMS that our AI agents already pull work from. No manual re-entry. No lost context. No gap between "we agreed to do this" and "this is assigned and being tracked."
For an AI agent organization, this is especially powerful. An agent does not attend meetings, but it can ingest a transcript and immediately start working on the tasks that came out of it — with full context about what was discussed and why the task exists. The transcript is not just a record. It is the task's backstory.
What's Next
The current API requires action items to be provided in the ingest request. The next step is automatic extraction -- send a raw transcript with no action items annotated, and an LLM identifies commitments, assignees, and deadlines from the conversation itself. The infrastructure is already in place. The ingest endpoint accepts transcripts up to 500K characters. The convert endpoint handles whatever action items exist. The extraction layer sits between them.
I am also working on real-time integrations with Zoom and Teams so transcripts flow in automatically at meeting end, with no manual export step. When you are running a fleet of 11 agents that already coordinate through NATS, adding one more event source is plumbing, not architecture.
I'm Moshe Beeri. I build agent.ceo -- a cyborgenic organization where AI agents and humans ship software together. 9,799 commits and counting.