Product · 9 min read

Why Your ERP Documentation Is a Liability (And How to Turn It Into an Asset)

Moshe Beeri, Founder
Tags: knowledge-graph, erp, manufacturing, enterprise, documentation, knowledge-management, cyborgenic-organization

Your ERP system knows every transaction your company has ever processed. It does not know why your senior engineer configures bill-of-materials entries the way she does. It does not know that the workaround for the recurring scheduling conflict in Plant 2 lives in a spreadsheet on someone's desktop. It does not know that the decision to change the default routing logic was made in a hallway conversation three years ago and never written down.

I built agent.ceo to solve the organizational memory problem for AI agent teams -- 11 agents, 9,799 commits, 83,163 test functions across 2,304 test files. Along the way, I realized the same infrastructure that gives my AI agents institutional knowledge can give your manufacturing team something they have never had: documentation that actually answers questions.

In a Cyborgenic Organization, AI agents do not just execute tasks; they understand your institutional knowledge. They can traverse your documentation, connect your processes to your problems, and surface answers that would take a human analyst hours to assemble. The knowledge graph is not a search engine. It is the institutional memory layer that makes agents genuinely useful in complex business environments.

This is the gap that costs mid-size manufacturers millions -- not in software licenses, but in lost productivity, repeated mistakes, and knowledge that walks out the door when experienced employees retire.

The Problem Is Not Your ERP

ERP systems do exactly what they are designed to do: process transactions, enforce workflows, store structured data. SAP, Oracle, and the mid-market ERP platforms are databases with business logic on top. They are very good at answering "what happened" and "how much."

They are structurally incapable of answering "why do we do it this way," "what went wrong last time we tried that," or "who knows how this actually works."

Those answers live in:

  • Process documentation that was written once and never updated
  • Training materials that describe the system as it existed two versions ago
  • Email threads where the actual decision rationale is buried in paragraph four of a reply-all
  • The heads of your most experienced employees — the ones closest to retirement

Every year, manufacturers lose institutional knowledge. Every year, new employees spend months rediscovering what their predecessors already figured out. The ERP system watches this happen and has nothing useful to contribute, because no one ever taught it what the documentation means.

What a Knowledge Graph Changes

A knowledge graph takes your unstructured documentation — PDFs, process notes, training materials, XML exports, meeting transcripts — and turns it into something queryable. Not just full-text search. Structured, connected, reasoned queries.

Here is what that looks like in practice:

Before: You search your document management system for "scheduling conflict Plant 2." You get 47 results. Twelve are relevant. Three contradict each other. You spend two hours reading them and still are not sure which process is current.

After: You ask, "What causes scheduling conflicts in Plant 2 and what solutions have been implemented?" The knowledge graph returns: three documented root causes, two implemented solutions with dates, one proposed solution still in review, and the KPI impact of the solutions that were implemented. Each answer links to its source document.

This works because the knowledge graph does not just store documents. It extracts structured information: problems, causes, solutions, KPIs, systems, and the relationships between them. A document about a scheduling fix becomes a set of connected facts: Problem X is caused by Y, solved by Z, measured by KPI W, and affects System V.
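The connected facts described above can be sketched as typed edges in a small in-memory graph. This is an illustration only: the `Fact` class, the relation names (`CAUSED_BY`, `SOLVED_BY`, `MEASURED_BY`, `AFFECTS`), and the sample data are all assumptions, not agent.ceo's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One edge in the graph: subject -[relation]-> obj, plus its source document."""
    subject: str
    relation: str
    obj: str
    source: str


# Hypothetical facts extracted from a single scheduling-fix document.
PROBLEM = "Plant 2 scheduling conflict"
FACTS = [
    Fact(PROBLEM, "CAUSED_BY", "Overlapping maintenance windows", "fix-notes.pdf"),
    Fact(PROBLEM, "SOLVED_BY", "Staggered shift calendar", "fix-notes.pdf"),
    Fact(PROBLEM, "MEASURED_BY", "On-time order rate", "fix-notes.pdf"),
    Fact(PROBLEM, "AFFECTS", "Production scheduling module", "fix-notes.pdf"),
]


def related(facts: list[Fact], subject: str, relation: str) -> list[tuple[str, str]]:
    """Return (object, source) pairs for one subject and one relation type."""
    return [(f.obj, f.source) for f in facts
            if f.subject == subject and f.relation == relation]


causes = related(FACTS, PROBLEM, "CAUSED_BY")
solutions = related(FACTS, PROBLEM, "SOLVED_BY")
print(causes)
print(solutions)
```

Because every fact carries its source, an answer assembled from these edges can always point back to the document it came from.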

How It Works (Without the Buzzwords)

The pipeline has two paths that run in parallel:

Structured data path. Your ERP's XML exports — entity definitions, field mappings, relationships between modules — are parsed deterministically into a graph. No AI interpretation needed. This gives you the schema scaffold: what entities exist, how they connect, what fields they expose.
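A minimal sketch of what deterministic parsing could look like, using Python's standard `xml.etree.ElementTree`. The export layout here (`<entity>`, `<field>`, and a `references` attribute) is a made-up example; real ERP exports will use their own element names and would need their own mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format, for illustration only.
SAMPLE_EXPORT = """
<export>
  <entity name="WorkOrder">
    <field name="id" type="string"/>
    <field name="routing_id" type="string" references="Routing"/>
  </entity>
  <entity name="Routing">
    <field name="id" type="string"/>
  </entity>
</export>
"""


def parse_schema(xml_text: str):
    """Return (entity -> field names, list of (from_entity, field, to_entity) edges)."""
    root = ET.fromstring(xml_text.strip())
    entities: dict[str, list[str]] = {}
    edges: list[tuple[str, str, str]] = []
    for entity in root.iter("entity"):
        name = entity.get("name")
        entities[name] = []
        for field in entity.iter("field"):
            entities[name].append(field.get("name"))
            target = field.get("references")
            if target is not None:
                # A field that references another entity becomes a graph edge.
                edges.append((name, field.get("name"), target))
    return entities, edges


entities, edges = parse_schema(SAMPLE_EXPORT)
print(entities)
print(edges)
```

The same input always yields the same entities and edges, which is the property that makes this path safe to re-run.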

Knowledge document path. Your unstructured documents — PDFs, process notes, training materials — go through a question-driven extraction process. For each document and each section within it, the system asks a fixed set of analytical questions: What is the core claim? What problem does it address? What systems are involved? What KPIs are affected? Who is the audience? The answers become structured graph nodes connected to the source text.
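The question-driven step can be sketched as a loop over a fixed catalog. The five questions are the ones listed above; everything else (the `extract_section` function, the key names, and the `stub_ask` placeholder standing in for a language-model call) is hypothetical and only shows the shape of the approach.

```python
from typing import Callable

# The fixed question catalog, using the questions listed in the text.
QUESTIONS = {
    "core_claim": "What is the core claim?",
    "problem": "What problem does it address?",
    "systems": "What systems are involved?",
    "kpis": "What KPIs are affected?",
    "audience": "Who is the audience?",
}


def extract_section(section_text: str, source: str,
                    ask: Callable[[str, str], str]) -> dict:
    """Ask every catalog question about one section and keep the source link."""
    answers = {key: ask(question, section_text)
               for key, question in QUESTIONS.items()}
    return {"source": source, "text": section_text, "answers": answers}


def stub_ask(question: str, text: str) -> str:
    """Placeholder for a language-model call; returns a canned answer."""
    return f"[answer to: {question}]"


node = extract_section("Stagger shifts to avoid overlap.", "fix-notes.pdf#2", stub_ask)
print(node["answers"]["kpis"])
```

Passing `ask` in as a parameter keeps the catalog independent of any particular model, and makes each answer easy to check against the section it came from.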

The two paths converge in a single graph database. Here is the real ingestion code from our knowledge base tools -- the same code that powers agent.ceo's internal wiki. This is from conductor/src/mcp_servers/kb_tools.py:

# From conductor/src/mcp_servers/kb_tools.py: real production ingestion code
import re
from datetime import datetime, timezone

# _run_write is defined elsewhere in kb_tools.py (not shown in this excerpt).
# The wiki and space parameters are accepted but not used in this excerpt.
async def _kb_ingest_text(
    title: str, content: str, wiki: str = "strategy",
    page_type: str = "entity", space: str | None = None,
) -> dict:
    """Ingest raw text into the wiki as a new page."""
    # Derive a stable, URL-safe slug from the title.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Map the page type to its directory; unknown types are used as-is.
    type_dir = {
        "entity": "entities", "concept": "concepts",
        "comparison": "comparisons"
    }.get(page_type, page_type)
    path = f"wiki/{type_dir}/{slug}.md"

    # MERGE on path: create the page if it is missing, otherwise update it.
    upsert_cypher = """
    MERGE (p:Page {path: $path})
    SET p.title = $title, p.type = $page_type, p.body_text = $content,
        p.updated_at = datetime($updated_at)
    RETURN p.path AS path, p.title AS title
    """
    await _run_write(upsert_cypher, {
        "path": path, "title": title,
        "page_type": page_type, "content": content,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    })
    return {"status": "ingested", "path": path, "title": title, "type": page_type}

The MERGE pattern is important -- if you re-ingest the same document, it updates rather than duplicates. When I first built this for our own agents, I did not use MERGE and ended up with 300 duplicate nodes after a single import run. The ERP use case is identical: you want to re-process XML exports without polluting the graph.
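To see why re-ingestion updates rather than duplicates, here is a small sketch that reuses the slug and path logic from `_kb_ingest_text` and stands in a plain dict for the graph. The dict and the `page_path` and `upsert` helpers are illustrative stand-ins, not part of the production code.

```python
import re


def page_path(title: str, page_type: str = "entity") -> str:
    """Mirror the slug and path logic from _kb_ingest_text."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    type_dir = {"entity": "entities", "concept": "concepts",
                "comparison": "comparisons"}.get(page_type, page_type)
    return f"wiki/{type_dir}/{slug}.md"


# A dict keyed by path stands in for MERGE (p:Page {path: $path}).
graph: dict[str, dict] = {}


def upsert(title: str, content: str) -> None:
    """Create the page if the path is new, otherwise overwrite it."""
    graph[page_path(title)] = {"title": title, "body_text": content}


upsert("Plant 2: Scheduling Fix", "first draft")
upsert("Plant 2 - Scheduling Fix", "revised")  # different punctuation, same slug
print(len(graph), list(graph))
```

One consequence worth noting: since the path is the merge key, two titles that differ only in punctuation or case resolve to the same page and the later one wins.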

And here is how the knowledge base connects to the NATS event bus, so other systems react to new ingestions:

# From services/wiki-graph-builder/nats_emitter.py — real production code
async def emit_ingested(
    path: str, title: str, page_type: str, wiki: str, action: str,
) -> bool:
    """Emit a wiki.ingested event after a page is created, updated, or deleted."""
    return await emit("wiki.ingested", {
        "path": path, "title": title, "type": page_type,
        "wiki": wiki, "action": action,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

Every ingested document fires an event on NATS JetStream (port 4222). In the ERP context, this means downstream systems -- dashboards, alerting, validation agents -- get notified the instant new knowledge enters the graph.
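On the receiving side, a downstream consumer mostly needs to decode the payload and branch on the action. The sketch below assumes the event body is JSON-encoded and that `action` takes values such as `"updated"` and `"deleted"`; neither detail is confirmed by the excerpt above, and the NATS subscription wiring is left out so the handler can be read on its own.

```python
import json


def handle_wiki_ingested(raw: bytes) -> str:
    """Decode one wiki.ingested payload and decide what a consumer should do."""
    event = json.loads(raw.decode("utf-8"))
    if event["action"] == "deleted":  # assumed action value
        return f"drop cache for {event['path']}"
    return f"revalidate {event['path']} ({event['type']})"


# A payload with the same fields emit_ingested sends; the values are made up.
sample = json.dumps({
    "path": "wiki/entities/plant-2-scheduling-fix.md",
    "title": "Plant 2 Scheduling Fix",
    "type": "entity",
    "wiki": "strategy",
    "action": "updated",
    "timestamp": "2024-01-01T00:00:00+00:00",
}).encode("utf-8")

print(handle_wiki_ingested(sample))
```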

The structured ERP schema gives you the "what exists" layer. The extracted knowledge gives you the "what it means and why" layer. Together, they make your documentation not just searchable but answerable.

Why Now

Three things have changed that make this practical for mid-size manufacturers, not just enterprises with dedicated data science teams:

Extraction quality. Language models can now reliably extract structured information from messy documents — the kind of semi-formatted process notes and legacy PDFs that actual manufacturers have. The question-catalog approach (asking specific analytical questions rather than hoping for general "understanding") keeps extraction focused and verifiable.

Graph databases are production-ready. Neo4j and similar systems handle the scale of a mid-size manufacturer's documentation corpus without dedicated infrastructure teams. You do not need a graph database PhD. You need a deployment that works. Here is the real Neo4j connection pattern from our production codebase:

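# From conductor/src/mcp_servers/kb_spaces_api.py: real production code
import os

_NEO4J_URI = os.environ.get("NEO4J_URI", "bolt://neo4j-genbrain.agents.svc.cluster.local:7687")
_NEO4J_USERNAME = os.environ.get("NEO4J_USERNAME", "neo4j")
# The excerpt does not show where the password is defined; reading it from
# the environment is an assumption made so the snippet is complete.
_NEO4J_PASSWORD = os.environ.get("NEO4J_PASSWORD", "")
_NEO4J_DATABASE = os.environ.get("NEO4J_DATABASE", "neo4j")

# Created lazily on first use (initialization assumed; not shown in the excerpt).
_driver = None

def _get_driver():
    """Return the shared async driver, creating it on first call."""
    global _driver
    if _driver is None:
        from neo4j import AsyncGraphDatabase
        _driver = AsyncGraphDatabase.driver(
            _NEO4J_URI, auth=(_NEO4J_USERNAME, _NEO4J_PASSWORD)
        )
    return _driver

async def _run_query(cypher: str, params: dict | None = None) -> list[dict]:
    """Run a parameterized Cypher query and return the rows as dicts."""
    driver = _get_driver()
    async with driver.session(database=_NEO4J_DATABASE) as session:
        result = await session.run(cypher, params or {})
        return await result.data()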

Async driver, lazy connection, parameterized queries. This is the same pattern that powers 190 MCP tool functions serving 11 agents. It handles our entire knowledge graph without a dedicated DBA.

Tenant isolation is solved. Each organization gets its own isolated graph, its own namespace, its own data boundary. Your ERP documentation does not commingle with anyone else's. For companies in regulated industries or with strict data residency requirements, this is not optional — it is the baseline.
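One way to picture the data boundary is a strict mapping from organization ID to its own database name, which could then be passed as the `database=` argument when opening a session. This is a hypothetical sketch: the `kb-` prefix, the ID pattern, and the `tenant_database` helper are assumptions, and the text above does not say how agent.ceo implements isolation.

```python
import re

# Conservative pattern for organization IDs (an assumption for this sketch).
_TENANT_ID = re.compile(r"[a-z0-9][a-z0-9-]{1,40}")


def tenant_database(org_id: str) -> str:
    """Map an organization ID to its own database name, rejecting odd input."""
    normalized = org_id.strip().lower()
    if not _TENANT_ID.fullmatch(normalized):
        raise ValueError(f"invalid organization id: {org_id!r}")
    return f"kb-{normalized}"


print(tenant_database("Acme-Manufacturing"))
```

Validating the ID before it becomes part of a database name keeps one tenant's input from ever resolving to another tenant's graph.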

What This Costs You Today

Calculate it yourself. Take the number of experienced employees who have left in the last three years. Multiply by the months it took their replacements to reach equivalent productivity. Multiply by the fully loaded cost of those months.

That number is what your undocumented knowledge costs. Your ERP investment did not prevent it because your ERP was never designed to capture it.
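The back-of-envelope calculation above, written out as a function. The input values in the example are placeholders chosen only to show the arithmetic; substitute your own figures.

```python
def knowledge_loss_cost(departures: int, ramp_up_months: float,
                        monthly_loaded_cost: float) -> float:
    """Departures x months to reach equivalent productivity x loaded cost per month."""
    return departures * ramp_up_months * monthly_loaded_cost


# Placeholder inputs: 6 departures, 9 months of ramp-up, 10,000 per month.
example = knowledge_loss_cost(departures=6, ramp_up_months=9,
                              monthly_loaded_cost=10_000)
print(example)
```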

Now ask: how many process decisions in your organization are documented only in someone's head? If the answer is "most of them," your documentation is not an asset. It is a liability — one that compounds every time someone retires, transfers, or just forgets.

Turn It Around

I am building the infrastructure that lets ERP teams make their institutional knowledge queryable. Not a chatbot bolted onto a search engine. A structured knowledge graph that understands the relationships between your problems, solutions, systems, and metrics -- backed by the same Neo4j + NATS + Redis infrastructure that powers 11 AI agents and a codebase of 9,799 commits and 2,304 test files.

I am looking for design partners -- ERP administrators, IT directors, and operations managers at mid-size manufacturers who know this problem firsthand and want to solve it with me.

Apply to the design partner program at agent.ceo.

You bring the documentation and domain expertise. I bring the infrastructure to make it useful.


I'm Moshe Beeri. I build agent.ceo -- a cyborgenic organization where AI agents and humans ship software together. 9,799 commits and counting.
