Data Sovereignty for Agencies: Why Self-Hosted AI Matters

TL;DR: Regulated organizations often cannot send data to third-party cloud AI APIs. Kelva's self-hosted stack runs inside your own infrastructure boundary: self-hosted open models (or Gemini / Claude where a managed API is permitted), LangChain for orchestration, self-hosted Supabase for storage, and pgvector for RAG pipelines. In the fully self-hosted configuration, data never leaves your environment, and the design is GDPR-aligned and SOC 2 ready.

The Data Sovereignty Imperative

For many regulated organizations — government agencies, financial institutions, and healthcare providers — sending data to a third-party cloud AI service isn't an option. Compliance regimes like GDPR, SOC 2, and HIPAA put hard constraints on where data lives and who can touch it.

This doesn't mean these organizations can't benefit from AI. It means they need AI infrastructure built entirely within their own jurisdiction and infrastructure boundary.

Our Self-Hosted Stack

We've developed a complete AI stack that runs on Google Cloud or inside your own VPC:

Self-hosted open models — Foundation models that run inside your boundary when no data may leave it
Gemini / Claude — Managed APIs where the compliance regime permits an external call
LangChain — Orchestration framework for multi-step AI pipelines
Supabase (self-hosted) — PostgreSQL + auth + storage, deployed via Docker
pgvector — Vector similarity search for RAG pipelines
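
The split between self-hosted models and managed APIs can be expressed as a small routing rule. The sketch below is illustrative only: the Sensitivity levels, the Policy flag, and the choose_backend function are hypothetical names, not part of our stack or of any library. It assumes each request carries a data classification and that the managed API is used only when policy allows it and the data is not restricted.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical data classification attached to each request."""
    PUBLIC = "public"
    RESTRICTED = "restricted"  # must stay inside the infrastructure boundary


@dataclass(frozen=True)
class Policy:
    # Assumption: a single flag set per deployment by the compliance owner.
    managed_api_permitted: bool


def choose_backend(sensitivity: Sensitivity, policy: Policy) -> str:
    """Return "managed" or "self_hosted" for one request.

    Defaults to the self-hosted model; the managed API is chosen only when
    the policy permits external calls AND the data is not restricted.
    """
    if policy.managed_api_permitted and sensitivity is Sensitivity.PUBLIC:
        return "managed"
    return "self_hosted"
```

Defaulting to the self-hosted path means a missing or misread classification fails closed rather than sending data outside the boundary.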

Architecture

The architecture follows a clear separation of concerns:

API Layer — Next.js API routes handle authentication and request routing
Orchestration — LangChain manages prompt chains and model interactions
Storage — Supabase provides structured data, auth, and file storage
Vector Store — pgvector enables semantic search over document embeddings
Models — Self-hosted open models for generation, local embeddings for search
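
To make the vector store layer concrete, here is a minimal in-memory sketch of the ranking that a pgvector cosine-distance query performs (pgvector exposes cosine distance through its <=> operator). The document names and three-dimensional embeddings are made up for illustration; real embeddings come from the local embedding model and have hundreds of dimensions, and the table and column names in the comment are assumptions.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - cos(angle between a and b). Smaller is closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Return the ids of the k documents closest to the query embedding.

    Roughly equivalent to this SQL against a hypothetical table:
      SELECT id FROM documents ORDER BY embedding <=> :query LIMIT :k
    """
    ranked = sorted(docs, key=lambda d: cosine_distance(query, d[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy corpus: (document id, embedding). Values are illustrative only.
DOCS = [
    ("policy.pdf", [0.9, 0.1, 0.0]),
    ("invoice.pdf", [0.0, 1.0, 0.0]),
    ("memo.pdf", [0.7, 0.3, 0.1]),
]

print(top_k([1.0, 0.0, 0.0], DOCS))  # ['policy.pdf', 'memo.pdf']
```

In the real stack the ranking runs inside PostgreSQL, so document embeddings never leave the database host.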

Key Challenges

The self-hosted tradeoff. Self-hosted open models give you full data control, but they give up some of the raw output quality and response speed you get from a frontier managed API. We treat that as a design constraint, not a dealbreaker. We compensate with aggressive caching, pre-computation where possible, async job queues, and pipelines structured so the model is asked to do narrow, well-scoped tasks rather than open-ended ones.
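
As one illustration of the caching idea, the sketch below wraps a model call in a content-addressed cache so that a repeated (task, text) pair never reaches the model a second time. CachedModel and fake_generate are hypothetical names; a production version would more likely persist the cache in PostgreSQL or another shared store rather than an in-process dict.

```python
import hashlib


class CachedModel:
    """Wrap a generate(task, text) callable with an in-memory cache."""

    def __init__(self, generate):
        self._generate = generate
        self._cache = {}
        self.calls = 0  # number of times the underlying model was invoked

    def __call__(self, task, text):
        # Key on both the task name and the input so that narrow,
        # well-scoped tasks do not collide with each other.
        key = hashlib.sha256(f"{task}\x00{text}".encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._generate(task, text)
        return self._cache[key]


def fake_generate(task, text):
    """Stand-in for a slow self-hosted model call (assumption for the demo)."""
    return f"[{task}] {text.upper()}"


model = CachedModel(fake_generate)
model("summarize", "quarterly report")
model("summarize", "quarterly report")  # served from cache
print(model.calls)  # 1
```

Narrow tasks make caching more effective: the same small input recurs far more often than the same open-ended prompt.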

Playing to model strengths. A self-hosted model on a constrained footprint won't match Gemini or Claude on every task. So we don't ask it to. We decompose the work — retrieval, structuring, drafting, validation — so each step plays to what the model does reliably, and the pipeline as a whole produces output that holds up.
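
The decomposition into retrieval, structuring, drafting, and validation can be sketched as four small functions. Everything here is a simplified assumption for illustration: retrieval is a keyword overlap rather than a vector search, the model is any callable that takes a prompt and returns text, and validation only checks that the draft quotes a retrieved passage.

```python
def retrieve(question, corpus):
    """Keep passages that share at least one word with the question."""
    words = set(question.lower().split())
    return [p for p in corpus if words & set(p.lower().split())]


def structure(passages):
    """Put retrieved passages into a fixed shape for the drafting step."""
    return {"facts": passages}


def draft(question, structured, model):
    """Ask the model one narrow question, grounded in the retrieved facts."""
    prompt = f"Answer '{question}' using only: {structured['facts']}"
    return model(prompt)


def validate(answer, structured):
    """Accept the draft only if it quotes at least one retrieved fact."""
    return any(fact in answer for fact in structured["facts"])


def run_pipeline(question, corpus, model):
    """Run all four steps; return None when nothing grounded was produced."""
    structured = structure(retrieve(question, corpus))
    if not structured["facts"]:
        return None
    answer = draft(question, structured, model)
    return answer if validate(answer, structured) else None
```

Because each step has a small, checkable contract, a weak draft is rejected by the validation step instead of reaching the user.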

Infrastructure complexity. Self-hosting Supabase and managing Docker containers requires more operational effort than using managed services. We've built deployment scripts and monitoring to reduce this burden.

When to Choose Self-Hosted

Not every project needs self-hosted infrastructure. We recommend it when:

Regulatory or compliance requirements mandate data residency or self-hosting (GDPR, SOC 2, HIPAA, etc.)
The organization handles sensitive personal data
There's a strategic need for AI independence
Long-term cost optimization takes priority over initial setup speed

For projects without these constraints, we use managed cloud services — they're faster to ship and easier to maintain.
