From Transcript to Task: How the Meetings API Closes the Action Item Loop

TL;DR

A four-endpoint Meetings API ingests transcripts along with their action items and converts those items into tracked TMS tasks with full provenance.

Idempotent conversion means agents can safely retry without creating duplicate tasks.

This is how a cyborgenic organization turns spoken commitments into executed work — no manual re-entry, no lost context.

"I'll handle the migration by Friday." Someone says it. Someone else writes it in their notes. A third person remembers it differently. By Monday, nobody can agree on what was committed to, who owns it, or when it was due. The action item lived exactly as long as the meeting lasted, then evaporated.

In a cyborgenic organization, that failure mode is unacceptable. We built a Meetings REST API at agent.ceo that closes the loop: a transcript goes in with its action items attached, and one API call converts them into tracked tasks in our Task Management System (TMS), with assignees, due dates, priorities, and full traceability back to the meeting where they originated. AI agents pick up those tasks and execute them.

Four endpoints. No manual transcription. No lost action items. Here is how it works.

The API: Four Endpoints That Cover the Full Lifecycle

The Meetings API is a FastAPI service backed by Firestore, authenticated via Firebase JWT tokens with organization-level membership checks. It exposes four endpoints:

POST /api/v1/orgs/{org_id}/meetings/transcript    — ingest transcript + action items
GET  /api/v1/orgs/{org_id}/meetings                — list meetings (filterable, paginated)
GET  /api/v1/orgs/{org_id}/meetings/{meeting_id}   — full meeting detail with transcript
POST /api/v1/orgs/{org_id}/meetings/{meeting_id}/actions  — convert action items to TMS tasks

The first endpoint ingests. The middle two query. The last one converts. That is the full lifecycle: raw meeting data enters the system, becomes queryable, and its action items become executable work.

Every endpoint requires a valid Firebase JWT token and verifies the caller is a member of the target organization via require_org_member. No cross-org access, no anonymous reads.

The Data Model: What Goes In

The ingest endpoint accepts a TranscriptIngestRequest that captures everything you need from a meeting:

from typing import List, Optional
from uuid import uuid4

from pydantic import BaseModel, Field


class ActionItem(BaseModel):
    id: str = Field(default_factory=lambda: str(uuid4()))
    title: str = Field(..., min_length=1, max_length=500)
    assignee: Optional[str] = None
    due_date: Optional[str] = None
    priority: Optional[str] = Field(None, pattern="^(low|medium|high|critical)$")
    status: str = Field(default="pending")


class TranscriptIngestRequest(BaseModel):
    meeting_id: Optional[str] = Field(
        default=None,
        description="Optional. Auto-generated if not provided."
    )
    title: str = Field(..., min_length=1, max_length=500)
    participants: List[str] = Field(..., min_length=1)  # Pydantic v2 name for min_items
    transcript: str = Field(..., min_length=1, max_length=500_000)
    action_items: List[ActionItem] = Field(default_factory=list)
    source: Optional[str] = Field(
        None,
        description="Where the transcript came from: zoom, teams, manual, etc."
    )
    recorded_at: Optional[str] = None

A few design decisions worth noting.

Transcript cap at 500K characters. A one-hour meeting transcript runs about 8,000-12,000 words, or roughly 40,000-60,000 characters. 500K gives you room for multi-hour sessions or transcripts with heavy technical content (code reviews, architecture discussions) without opening the door to unbounded payloads. Pydantic enforces this at the validation layer before the request ever hits Firestore.
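As a rough check on those numbers, the sketch below reproduces the arithmetic. The figure of 5 characters per word is an assumption implied by the ranges above, not a measured value, and `estimate_chars` is a hypothetical helper.

```python
MAX_TRANSCRIPT_CHARS = 500_000  # cap on TranscriptIngestRequest.transcript
ASSUMED_CHARS_PER_WORD = 5      # assumption implied by the word and character ranges


def estimate_chars(words: int, chars_per_word: int = ASSUMED_CHARS_PER_WORD) -> int:
    """Rough character count for a transcript with the given word count."""
    return words * chars_per_word


low = estimate_chars(8_000)    # lighter one-hour meeting
high = estimate_chars(12_000)  # denser one-hour meeting

# Whole dense one-hour meetings that fit under the cap
hours_at_high_density = MAX_TRANSCRIPT_CHARS // high

print(low, high, hours_at_high_density)
```

Under that assumption, the cap leaves room for roughly a full working day of dense conversation in a single transcript.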

Meeting ID is optional. If you provide one, the API uses it — useful for idempotent re-ingestion from the same source. If you don't, a UUID gets generated. This means you can integrate with meeting platforms that have their own IDs (Zoom, Teams) or with manual transcription workflows that don't.
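One way a client might produce a stable meeting ID from a platform's own identifier is a name-based UUID. This is a minimal sketch under stated assumptions: `stable_meeting_id` and the namespace string are hypothetical, and how the API treats a repeated ID on ingest is not specified here.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works as long as it never changes.
MEETING_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "meetings.example")


def stable_meeting_id(source: str, external_id: str) -> str:
    """Return the same meeting_id every time for the same (source, external_id) pair."""
    return str(uuid.uuid5(MEETING_NAMESPACE, f"{source}:{external_id}"))


first = stable_meeting_id("zoom", "123456789")
retry = stable_meeting_id("zoom", "123456789")
other = stable_meeting_id("teams", "123456789")

print(first == retry, first == other)
```

Because the ID is derived rather than random, a client that re-sends the same meeting after a timeout submits the same `meeting_id` both times.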

Action items are embedded, not separate. They arrive as part of the transcript ingest, not as a second API call. This matches how action items actually work — they are artifacts of the meeting, not independent entities. They only become independent when you explicitly convert them to tasks.
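To make the shape concrete, here is an illustrative ingest payload built as a plain dictionary. All values are made up, and the date formats for `due_date` and `recorded_at` are assumptions, since the models only declare them as optional strings.

```python
import json

# Limits taken from the models above
MAX_TITLE_CHARS = 500
MAX_TRANSCRIPT_CHARS = 500_000
ALLOWED_PRIORITIES = {"low", "medium", "high", "critical"}

payload = {
    "title": "Auth migration planning",
    "participants": ["alice@company.com", "bob@company.com"],
    "transcript": "Alice: I'll handle the migration by Friday.\nBob: I'll write the load test.",
    "action_items": [
        {
            "title": "Migrate auth service to new JWT provider",
            "assignee": "alice@company.com",
            "due_date": "2026-05-15",  # assumed format
            "priority": "high",
        },
        # Only title is required; id and status fall back to their defaults
        {"title": "Write load test for payment endpoint"},
    ],
    "source": "zoom",
    "recorded_at": "2026-05-11T15:00:00Z",  # assumed format
}

# Cheap client-side checks that mirror the server-side validation
assert 1 <= len(payload["title"]) <= MAX_TITLE_CHARS
assert len(payload["participants"]) >= 1
assert 1 <= len(payload["transcript"]) <= MAX_TRANSCRIPT_CHARS
for item in payload["action_items"]:
    assert item.get("priority") in ALLOWED_PRIORITIES | {None}

body = json.dumps(payload)  # request body for POST .../meetings/transcript
```

Leaving out `meeting_id` here lets the API generate one, as described above.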

Querying: Filters and Pagination

The list endpoint supports filtering by participant and date range, with offset-based pagination:

GET /api/v1/orgs/{org_id}/meetings?participant=alice@company.com
                                   &start_date=2026-05-01
                                   &end_date=2026-05-31
                                   &limit=20
                                   &offset=0

This is straightforward, but it matters. When an agent needs to find "the meeting where we discussed the migration plan," it can query by participant, narrow by date, and get the right meeting without scanning every document in the collection. Firestore composite indexes handle the query performance.
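A client-side sketch of that query might look like the following. The host is a placeholder, `build_list_url` and `iter_meetings` are hypothetical helpers, and the sketch assumes each page comes back as a list of meeting dictionaries, which the text above does not confirm.

```python
from typing import Callable, Iterator, List, Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder host for illustration


def build_list_url(
    org_id: str,
    participant: Optional[str] = None,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    limit: int = 20,
    offset: int = 0,
) -> str:
    """Build the list-meetings URL, leaving out filters that were not supplied."""
    params = {
        "participant": participant,
        "start_date": start_date,
        "end_date": end_date,
        "limit": limit,
        "offset": offset,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/v1/orgs/{org_id}/meetings?{query}"


def iter_meetings(
    fetch_page: Callable[[int, int], List[dict]], limit: int = 20
) -> Iterator[dict]:
    """Walk every page by advancing the offset until a short page comes back."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit


url = build_list_url(
    "acme",
    participant="alice@company.com",
    start_date="2026-05-01",
    end_date="2026-05-31",
)
print(url)
```

Passing the fetch function in as a parameter keeps the paging loop independent of any particular HTTP client.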

The detail endpoint returns the full meeting document including the complete transcript and all action items with their current statuses. This is the endpoint agents use when they need full context — reviewing what was discussed before executing a task that originated from that meeting.

The Convert Endpoint: Where Action Items Become Work

This is the core of the system. The convert-actions endpoint takes the action items embedded in a meeting and creates real, tracked TMS tasks from them:

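from fastapi import APIRouter, Depends, HTTPException

# require_org_member and get_firestore are defined elsewhere in the service
router = APIRouter()


@router.post("/api/v1/orgs/{org_id}/meetings/{meeting_id}/actions")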
async def convert_actions_to_tasks(
    org_id: str,
    meeting_id: str,
    user=Depends(require_org_member),
    db=Depends(get_firestore),
):
    meeting_ref = db.collection("orgs").document(org_id)\
                    .collection("meetings").document(meeting_id)
    meeting_doc = await meeting_ref.get()

    if not meeting_doc.exists:
        raise HTTPException(status_code=404, detail="Meeting not found")

    meeting_data = meeting_doc.to_dict()
    action_items = meeting_data.get("action_items", [])
    created_tasks = []

    for item in action_items:
        if item.get("status") == "converted":
            continue  # already converted — skip

        task_doc = {
            "title": item["title"],
            "assignee": item.get("assignee"),
            "due_date": item.get("due_date"),
            # The stored value may be an explicit None, so .get() with a default is not enough
            "priority": item.get("priority") or "medium",
            "status": "pending",
            "source": "meeting",
            "source_meeting_id": meeting_id,
            "source_action_item_id": item["id"],
            "org_id": org_id,
            "created_by": user["uid"],
        }

        task_ref = db.collection("orgs").document(org_id)\
                     .collection("tasks").document()
        await task_ref.set(task_doc)

        item["status"] = "converted"
        created_tasks.append({
            "task_id": task_ref.id,
            "action_item_id": item["id"],
            "title": item["title"],
        })

    # Update meeting doc with converted statuses
    await meeting_ref.update({"action_items": action_items})

    return {
        "converted": len(created_tasks),
        "skipped": len(action_items) - len(created_tasks),
        "tasks": created_tasks,
    }

Three properties matter here.

Idempotent. Call the endpoint twice and the system ends up in the same state. On the first call, pending action items become tasks and get marked as "converted." On the second call, those items are skipped, so the response counts them under skipped and creates nothing new. Repeated calls after a successful conversion never produce duplicate tasks from the same action item. This is critical for reliability: agents retry failed requests, webhooks can fire multiple times, and humans click buttons more than once.
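One caveat about the handler as shown: the "converted" statuses are written back only after the loop finishes, so a request that fails partway through would leave already-written tasks attached to items that still look pending. A possible hardening, sketched here as an assumption rather than as the service's actual behavior, is to derive each task's document ID from its source so that a retry overwrites the same document instead of adding a new one. `deterministic_task_id` and `write_task` are hypothetical names, and a dictionary stands in for the tasks collection.

```python
import hashlib


def deterministic_task_id(org_id: str, meeting_id: str, action_item_id: str) -> str:
    """Derive a stable document ID from the task's provenance."""
    key = "/".join([org_id, meeting_id, action_item_id])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:20]


# In-memory stand-in for the tasks collection: document ID -> task document
tasks = {}


def write_task(org_id: str, meeting_id: str, item: dict) -> str:
    """Write a task under its derived ID; repeating the call replaces the same entry."""
    task_id = deterministic_task_id(org_id, meeting_id, item["id"])
    tasks[task_id] = {
        "title": item["title"],
        "source_meeting_id": meeting_id,
        "source_action_item_id": item["id"],
    }
    return task_id


item = {"id": "item-1", "title": "Migrate auth service to new JWT provider"}
first_id = write_task("acme", "meeting-42", item)
retry_id = write_task("acme", "meeting-42", item)  # simulated retry

print(first_id == retry_id, len(tasks))
```

In the handler, this would mean passing the derived ID to `.document(task_id)` instead of calling `.document()` with no argument. A real version would also need to avoid overwriting a task that an agent has already started.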

Traceable. Every task created from a meeting carries three provenance fields: source="meeting", source_meeting_id, and source_action_item_id. An agent working on a task can trace it back to the exact meeting and the exact action item that spawned it. If the task description is ambiguous, the agent can pull the full transcript for context. The chain from meeting to action item to task to execution is never broken.
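The lookup from a task back to its originating action item can be sketched with plain dictionaries. `find_source_action_item` is a hypothetical helper, and it assumes the caller has already fetched the meeting named by `source_meeting_id` through the detail endpoint.

```python
from typing import Optional


def find_source_action_item(task: dict, meeting: dict) -> Optional[dict]:
    """Return the action item that spawned this task, or None if it did not come from a meeting."""
    if task.get("source") != "meeting":
        return None
    for item in meeting.get("action_items", []):
        if item.get("id") == task.get("source_action_item_id"):
            return item
    return None


meeting = {
    "title": "Auth migration planning",
    "action_items": [
        {"id": "item-1", "title": "Migrate auth service to new JWT provider", "status": "converted"},
        {"id": "item-2", "title": "Write load test for payment endpoint", "status": "converted"},
    ],
}
task = {
    "title": "Write load test for payment endpoint",
    "source": "meeting",
    "source_meeting_id": "meeting-42",
    "source_action_item_id": "item-2",
}

origin = find_source_action_item(task, meeting)
print(origin["title"])
```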

Selective. The endpoint converts all unconverted action items in a single call. It does not require you to specify which items to convert — the status field handles that. But because items are marked individually, you could extend this to support partial conversion if needed. The current design optimizes for the common case: a meeting ends, someone (or some agent) calls the convert endpoint, and all action items become tasks at once.
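A partial-conversion extension could start from a filter like the one below. This is a sketch of one possible design, not something the current API supports: `select_items_to_convert` and its `only_ids` parameter are hypothetical.

```python
from typing import Iterable, List, Optional


def select_items_to_convert(
    action_items: List[dict], only_ids: Optional[Iterable[str]] = None
) -> List[dict]:
    """Pick unconverted items, optionally restricted to a caller-supplied set of IDs."""
    wanted = None if only_ids is None else set(only_ids)
    return [
        item
        for item in action_items
        if item.get("status") != "converted"
        and (wanted is None or item["id"] in wanted)
    ]


items = [
    {"id": "item-1", "title": "Migrate auth service", "status": "converted"},
    {"id": "item-2", "title": "Write load test", "status": "pending"},
    {"id": "item-3", "title": "Update runbook", "status": "pending"},
]

everything_pending = select_items_to_convert(items)
just_one = select_items_to_convert(items, only_ids=["item-1", "item-3"])

print([i["id"] for i in everything_pending], [i["id"] for i in just_one])
```

With no `only_ids`, the filter matches the current behavior of converting every unconverted item. An already converted item stays excluded even when it is requested by ID.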

The Response: What You Get Back

The convert endpoint returns a clear accounting of what happened:

{
  "converted": 3,
  "skipped": 1,
  "tasks": [
    {
      "task_id": "abc123",
      "action_item_id": "item-1",
      "title": "Migrate auth service to new JWT provider"
    },
    {
      "task_id": "def456",
      "action_item_id": "item-2",
      "title": "Write load test for payment endpoint"
    },
    {
      "task_id": "ghi789",
      "action_item_id": "item-3",
      "title": "Update runbook for database failover"
    }
  ]
}

converted tells you how many new tasks were created. skipped tells you how many were already converted. The tasks array maps each new task ID back to its source action item. No ambiguity about what happened.
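A caller can turn that response into a lookup table and sanity-check the counts at the same time. `summarize_conversion` is a hypothetical client-side helper, and it assumes the caller knows how many action items the meeting holds, for example from the detail endpoint.

```python
def summarize_conversion(response: dict, total_action_items: int) -> dict:
    """Map action_item_id -> task_id after checking that the counts add up."""
    if response["converted"] != len(response["tasks"]):
        raise ValueError("converted count does not match the tasks array")
    if response["converted"] + response["skipped"] != total_action_items:
        raise ValueError("converted + skipped does not cover every action item")
    return {t["action_item_id"]: t["task_id"] for t in response["tasks"]}


response = {
    "converted": 3,
    "skipped": 1,
    "tasks": [
        {"task_id": "abc123", "action_item_id": "item-1",
         "title": "Migrate auth service to new JWT provider"},
        {"task_id": "def456", "action_item_id": "item-2",
         "title": "Write load test for payment endpoint"},
        {"task_id": "ghi789", "action_item_id": "item-3",
         "title": "Update runbook for database failover"},
    ],
}

task_by_item = summarize_conversion(response, total_action_items=4)
print(task_by_item)
```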

Auth: Organization-Scoped and JWT-Verified

Every endpoint runs through require_org_member, a FastAPI dependency that verifies the Firebase JWT token and checks that the authenticated user belongs to the target organization. This is the same auth pattern we use across all of our org-scoped APIs. It is consistent, auditable, and hard to bypass accidentally, because it is declared as a dependency in each route's signature rather than as middleware you can forget to attach.

async def require_org_member(
    org_id: str,
    token: dict = Depends(verify_firebase_token),
    db=Depends(get_firestore),
):
    member_ref = db.collection("orgs").document(org_id)\
                   .collection("members").document(token["uid"])
    member_doc = await member_ref.get()
    if not member_doc.exists:
        raise HTTPException(status_code=403, detail="Not a member of this org")
    return token

Agents authenticate the same way human users do. A marketing agent converting meeting action items needs a valid JWT for the organization it operates in. No special agent tokens, no privilege escalation.
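From the caller's side, an agent's request might be assembled as in this sketch. The host is a placeholder, `build_convert_request` is a hypothetical helper, and sending the token as an `Authorization: Bearer` header is an assumption about how `verify_firebase_token` reads it. The request is only built here, not sent.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host for illustration


def build_convert_request(org_id: str, meeting_id: str, id_token: str) -> urllib.request.Request:
    """Prepare a POST to the convert endpoint carrying the caller's Firebase ID token."""
    url = f"{BASE_URL}/api/v1/orgs/{org_id}/meetings/{meeting_id}/actions"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {id_token}"},  # assumed header scheme
    )


request = build_convert_request("acme", "meeting-42", "test-token")
print(request.get_method(), request.full_url)
```

A human user's client and an agent would build the same request; only the token differs.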

Testing: 28 Tests Covering Models, Endpoints, and Edge Cases

We shipped 28 tests across the Meetings API:

Model validation tests. Pydantic models reject transcripts over 500K characters, titles over 500 characters, empty participant lists, and invalid priority values. These tests verify that bad data never reaches Firestore.
Ingest endpoint tests. Happy path with all fields, minimal payload with only required fields, auto-generation of meeting IDs, handling of duplicate ingestion.
List and detail endpoint tests. Pagination correctness, participant filtering, date range filtering, 404 for nonexistent meetings.
Convert endpoint tests. Conversion creates correct task documents, idempotent re-conversion skips already-converted items, 404 for nonexistent meetings, provenance fields are set correctly on created tasks.
Auth tests. Unauthenticated requests return 401, requests from non-members return 403, valid member tokens succeed.

The idempotency tests are the most important. We call the convert endpoint, verify tasks are created, call it again, and verify that zero additional tasks appear and the skipped count matches. In a system where agents retry requests automatically, idempotency is not a nice-to-have. It is a correctness requirement.
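The shape of such a test can be sketched without Firestore or auth. `convert_actions` below is an in-memory mirror of the handler's loop written for illustration, not the production code, and the test name is hypothetical.

```python
import uuid


def convert_actions(meeting: dict, tasks: dict) -> dict:
    """In-memory mirror of the convert handler's loop (no Firestore, no auth)."""
    created = []
    for item in meeting["action_items"]:
        if item.get("status") == "converted":
            continue  # already converted, so skip it
        task_id = str(uuid.uuid4())
        tasks[task_id] = {
            "title": item["title"],
            "source": "meeting",
            "source_action_item_id": item["id"],
        }
        item["status"] = "converted"
        created.append(task_id)
    return {
        "converted": len(created),
        "skipped": len(meeting["action_items"]) - len(created),
    }


def test_convert_is_idempotent():
    meeting = {
        "action_items": [
            {"id": "item-1", "title": "Migrate auth service", "status": "pending"},
            {"id": "item-2", "title": "Write load test", "status": "pending"},
        ]
    }
    tasks = {}

    first = convert_actions(meeting, tasks)
    second = convert_actions(meeting, tasks)

    assert first == {"converted": 2, "skipped": 0}
    assert second == {"converted": 0, "skipped": 2}
    assert len(tasks) == 2  # the second call added nothing


test_convert_is_idempotent()
```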

Why This Matters: The Meeting-to-Execution Gap

Every organization has the same problem. Meetings generate commitments. Those commitments get written down in notes, Slack messages, follow-up emails, or nowhere at all. Some percentage of them get manually transcribed into a task tracker. Some percentage of those get assigned. Some percentage of those get completed.

Each handoff in that chain loses signal. By the time a commitment makes it from spoken word to tracked task, the details have degraded. The assignee is wrong, the due date is missing, the context is gone.

The Meetings API eliminates the handoff chain. A transcript goes in with action items attached. One API call converts those action items into tracked tasks with assignees, due dates, priorities, and a direct link back to the source meeting. The tasks enter the same TMS that our AI agents already pull work from. No manual re-entry. No lost context. No gap between "we agreed to do this" and "this is assigned and being tracked."

For an AI agent organization, this is especially powerful. An agent does not attend meetings, but it can ingest a transcript and immediately start working on the tasks that came out of it — with full context about what was discussed and why the task exists. The transcript is not just a record. It is the task's backstory.

What's Next

The current API requires action items to be provided in the ingest request. The next step is automatic extraction — send us a raw transcript with no action items annotated, and we use an LLM to identify commitments, assignees, and deadlines from the conversation itself. The infrastructure is already in place. The ingest endpoint accepts transcripts up to 500K characters. The convert endpoint handles whatever action items exist. The extraction layer sits between them.

We are also working on real-time integrations with Zoom and Teams so transcripts flow in automatically at meeting end, with no manual export step.

Build your own cyborgenic organization at agent.ceo.

