Operating Automation
01 — Case study

Greenhouse manufacturing — internal build

The knowledge stops living in two heads

A queryable knowledge layer for the whole team. Anyone can ask 'how much of this component did we order last quarter?' and get a real answer. The AI gets better every week, because managers approve the answers, not the model.


02

“The cost of tribal knowledge isn't slow lookups. It's that nobody else has the information at all.”

— What I learned building this
03

The situation

Critical knowledge at our company lived in two places: my head, and the other owner’s head. Pricing history. Vendor specs. Past customer issues. Who said what to whom. Inventory context. Product specifics.

The rest was scattered across a Google Drive nobody could navigate.

The cost wasn’t slow lookups. The cost was that nobody else had the information at all. New hires couldn’t ramp. Customer service hit dead ends. Decisions got re-made because no one remembered the first one. If either of us got hit by a bus tomorrow, half the company’s institutional memory would walk out with us.

That’s not a productivity problem. That’s a single point of failure.

What we built

A queryable knowledge layer that anyone on the team can ask questions against. Plain-English query in, real answer out, citation back to the source document.

The architecture:

  • Supabase + pgvector for storage and semantic search. Documents are chunked, embedded, and stored alongside structured deal and company data.
  • Edge Functions as the API layer. All core logic — ingestion, retrieval, chat — runs there.
  • Lovable for the frontend. Plain interface; the team learned it without training.
  • HubSpot sync pulls deal and company context into the knowledge base on a schedule.
  • Document ingestion via a processing queue. Documents are uploaded manually or fetched from Google Drive links attached to deals. They’re chunked, embedded, and indexed automatically.
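For the technically curious, here’s roughly what the ingestion step looks like inside an Edge Function. A minimal sketch, not the production code: the table name, columns, chunk sizes, and the embedding provider are all illustrative assumptions.

```ts
// Minimal ingestion sketch (Deno / Supabase Edge Function style).
// Table name, column names, and the embedding provider are assumptions.
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);

// Call an embedding provider (OpenAI shown as a stand-in; the case study
// doesn't name the one the production system uses).
async function embed(text: string): Promise<number[]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
  });
  return (await res.json()).data[0].embedding;
}

// Fixed-size chunking with overlap so context survives chunk boundaries.
function chunk(text: string, size = 1000, overlap = 200): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += size - overlap) {
    out.push(text.slice(i, i + size));
  }
  return out;
}

export async function ingest(documentId: string, text: string) {
  for (const [i, content] of chunk(text).entries()) {
    await supabase.from("chunks").insert({
      document_id: documentId,
      chunk_index: i,
      content,
      embedding: await embed(content), // stored in a pgvector column
    });
  }
}
```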

A team member opens the chat, types a question, and gets an answer with sources. Sales can pull deal context mid-call. Operations can query inventory history. Customer service can look up what we shipped a customer two years ago.
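The chat path is the same machinery in reverse: embed the question, pull the nearest chunks, generate with citations. Another sketch, reusing the supabase client and embed() helper above. Here match_chunks stands in for a Postgres similarity function you’d define over the pgvector column, the model name is illustrative, and the optional examples parameter comes back in the feedback loop below.

```ts
// Chat sketch: retrieve nearest chunks, then ask Claude to answer with
// citations. "match_chunks" is an assumed Postgres function.
export async function answerWithSources(
  question: string,
  examples = "", // approved reference examples, used in Stage 3 below
): Promise<string> {
  const { data: matches } = await supabase.rpc("match_chunks", {
    query_embedding: await embed(question),
    match_count: 5,
  });

  const context = (matches ?? [])
    .map((m: { content: string; source: string }) => `[${m.source}]\n${m.content}`)
    .join("\n\n");

  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": Deno.env.get("ANTHROPIC_API_KEY")!,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-20250514", // illustrative model name
      max_tokens: 1024,
      messages: [{
        role: "user",
        content:
          "Answer from the context below and cite the bracketed source names.\n\n" +
          (examples ? `Reference examples of good answers:\n${examples}\n\n` : "") +
          `Context:\n${context}\n\nQuestion: ${question}`,
      }],
    }),
  });
  return (await res.json()).content[0].text;
}
```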

The part that actually matters

A vanilla RAG system gets you 60% of the way there. The piece most consultants skip — and the reason most company knowledge bots quietly stop being used after a month — is the feedback loop.

Here’s how ours works:

Stage 1 — A user rates an answer. Every AI response gets a thumbs up or thumbs down. A thumbs up doesn’t go live yet — it gets saved as a candidate example, marked “pending review.” A thumbs down logs the failure pattern.
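In sketch form, Stage 1 is one gated write. The tables and statuses here are assumptions that mirror the flow above.

```ts
// Stage 1 sketch: a thumbs-up becomes a pending candidate example; a
// thumbs-down logs the failure pattern. Schema is an assumption.
export async function recordFeedback(opts: {
  messageId: string;
  question: string;
  answer: string;
  rating: "up" | "down";
  userId: string;
}) {
  if (opts.rating === "up") {
    // Not live yet: it waits in the review queue for a manager.
    await supabase.from("candidate_examples").insert({
      question: opts.question,
      ideal_answer: opts.answer,
      status: "pending_review",
      submitted_by: opts.userId,
    });
  } else {
    await supabase.from("feedback_log").insert({
      message_id: opts.messageId,
      rating: "down",
      created_by: opts.userId,
    });
  }
}
```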

Stage 2 — A manager reviews. Anyone with manager, admin, or owner role can open a feedback dashboard and see the pending candidates. They can:

  • Approve as-is — it becomes active and starts shaping future answers
  • Edit first — fix the wording of the question or the ideal answer, then approve
  • Mark as golden — a special flag for the most canonical, definitive answers
  • Reject — it stays inactive

Nothing goes live without a human sign-off. Managers control what the AI learns from.
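Sketched out, the review action reduces to a role-gated status update. The role names come from above; everything else is an assumed schema.

```ts
// Stage 2 sketch: one gated update covers approve, edit-then-approve,
// golden-flag, and reject. Schema is an assumption.
export async function reviewCandidate(opts: {
  candidateId: string;
  reviewerRole: string;
  action: "approve" | "reject";
  golden?: boolean;      // flag as a canonical, definitive answer
  editedAnswer?: string; // optional rewrite before approval
}) {
  if (!["manager", "admin", "owner"].includes(opts.reviewerRole)) {
    throw new Error("only manager, admin, or owner can review candidates");
  }
  await supabase.from("candidate_examples")
    .update({
      status: opts.action === "approve" ? "active" : "rejected",
      is_golden: opts.action === "approve" && (opts.golden ?? false),
      ...(opts.editedAnswer ? { ideal_answer: opts.editedAnswer } : {}),
    })
    .eq("id", opts.candidateId);
}
```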

Stage 3 — Approved examples shape future responses. When a new question comes in, the system looks for approved examples that are semantically similar. If it finds them, it shows the AI those approved answers as reference examples — here’s how a good answer looks — before generating.

Golden answers are the special case: if a question is nearly identical to a golden-flagged one, the system skips generation entirely and returns the pre-approved answer verbatim. No generation cost. No variation. The manager’s curated answer goes straight to the user.
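Both behaviors fit in one routing step: check for a near-identical golden match first, fall back to example-shaped generation otherwise. A sketch; match_examples and the 0.95 cutoff are assumptions.

```ts
// Stage 3 sketch: golden short-circuit first, then few-shot injection.
// "match_examples" is an assumed Postgres function that searches only
// approved (active) examples; the similarity field is assumed too.
export async function routeQuestion(question: string): Promise<string> {
  const { data: examples } = await supabase.rpc("match_examples", {
    query_embedding: await embed(question),
    match_count: 3,
  });

  const top = examples?.[0];
  if (top?.is_golden && top.similarity >= 0.95) {
    // Near-identical golden match: no generation, no variation.
    return top.ideal_answer;
  }

  // Otherwise, show the model what a good answer looks like.
  const fewShot = (examples ?? [])
    .map((e: { question: string; ideal_answer: string }) =>
      `Q: ${e.question}\nA: ${e.ideal_answer}`)
    .join("\n\n");
  return answerWithSources(question, fewShot);
}
```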

Knowledge gaps are the side channel: if a user thumbs down and the platform detects there genuinely isn’t good content to answer the question, it flags a knowledge gap — a tracked record of “users are asking about X and we don’t have a good answer.” That becomes a content to-do list for the team.
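The gap check itself can be as small as a threshold on retrieval quality at thumbs-down time. The 0.5 cutoff and table name here are assumptions.

```ts
// Side-channel sketch: a thumbs-down plus weak retrieval flags a gap.
export async function maybeFlagGap(question: string, topSimilarity: number) {
  if (topSimilarity < 0.5) {
    await supabase.from("knowledge_gaps").insert({
      question,
      status: "open", // surfaces on the team's content to-do list
    });
  }
}
```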

What changed

The system is deployed across the whole company — product info, inventory, customer service, deal context. The biggest shift is the recursive learning: the AI gets better every week, because managers are approving real-world Q&A pairs as gold-standard answers.

The system has a memory now. The company has a memory now. We don’t.

What didn’t work the first time

We over-specced it. Hard.

The initial build had too many layers. Too many edge cases anticipated before any actual users had hit them. Too much infrastructure for what we needed in week one.

We peeled it all the way back to an MVP and rebuilt layer by layer. Started with the bare minimum: chunking, embedding, retrieval, chat. Got it working. Got the team using it. Then added the feedback loop. Then golden answers. Then knowledge-gap tracking. Each layer earned its place by addressing a real pain that surfaced in actual usage.

This is the pattern I bring into client work now. Ship the smallest version that works. Add layers when reality demands them, not when your architecture diagram does.

Why this matters for what comes next

This is the foundation work. Everything else — agents, automation, sales-context surfacing, customer-facing chat — depends on it. You can’t deploy an autonomous agent on a swamp of tribal knowledge and untrusted data. You can deploy one on this.

That’s the whole thesis. Build the foundation. Then build the rest.


04 — Tools used
  • Supabase
  • pgvector
  • Edge Functions
  • Lovable
  • HubSpot API
  • Claude API

05

Have a similar problem?

Most engagements start with a 30-minute call. No deck. No homework.

Book a call →