TL;DR: Regulated organizations often cannot send data to third-party cloud AI APIs. Kelva's self-hosted stack runs inside your own infrastructure boundary: self-hosted open models (or Gemini / Claude where a managed API is permitted), LangChain for orchestration, self-hosted Supabase for storage, and pgvector for RAG pipelines. Unless you explicitly permit a managed API call, data never leaves your environment, and the stack is GDPR-aligned and SOC 2 ready.
The Data Sovereignty Imperative
For many regulated organizations, including government agencies, financial institutions, and healthcare providers, sending data to a third-party cloud AI service isn't an option. Regulations and audit frameworks such as GDPR, HIPAA, and SOC 2 constrain where data may be stored and who may access it.
This doesn't mean these organizations can't benefit from AI. It means they need AI infrastructure built entirely within their own jurisdiction and infrastructure boundary.
Our Self-Hosted Stack
We've developed a complete AI stack that runs on Google Cloud or inside your own VPC:
- Self-hosted open models — Foundation models that run inside your boundary when no data may leave it
- Gemini / Claude — Managed APIs where the compliance regime permits an external call
- LangChain — Orchestration framework for multi-step AI pipelines
- Supabase (self-hosted) — PostgreSQL + auth + storage, deployed via Docker
- pgvector — Vector similarity search for RAG pipelines
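The split between self-hosted models and managed APIs can be made explicit in code. The following is a minimal sketch, not our production router: the `DataClass` labels, the `EXTERNAL_ALLOWED` policy set, and the `choose_backend` function are all hypothetical names introduced for illustration, and a real policy would come from your compliance team.

```python
from enum import Enum


class DataClass(Enum):
    """Hypothetical sensitivity labels attached to each request."""
    PUBLIC = "public"
    INTERNAL = "internal"
    REGULATED = "regulated"


# Assumption: only public data may ever be sent to an external managed API.
EXTERNAL_ALLOWED = {DataClass.PUBLIC}


def choose_backend(data_class: DataClass, managed_api_permitted: bool) -> str:
    """Pick a model backend; anything not explicitly allowed stays in-boundary."""
    if managed_api_permitted and data_class in EXTERNAL_ALLOWED:
        return "managed-api"
    return "self-hosted"


# Regulated data stays inside the boundary even when a managed API is available.
regulated_choice = choose_backend(DataClass.REGULATED, managed_api_permitted=True)
public_choice = choose_backend(DataClass.PUBLIC, managed_api_permitted=True)
```

The default branch returns the self-hosted backend, so a missing or unknown classification fails closed rather than sending data out.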
Architecture
The architecture follows a clear separation of concerns:
- API Layer — Next.js API routes handle authentication and request routing
- Orchestration — LangChain manages prompt chains and model interactions
- Storage — Supabase provides structured data, auth, and file storage
- Vector Store — pgvector enables semantic search over document embeddings
- Models — Self-hosted open models for generation, local embeddings for search
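To make the vector store's role concrete, here is a small sketch of the retrieval step in a RAG pipeline. It is written in plain Python with an in-memory list standing in for a pgvector table, so it runs without a database. The `Chunk` schema, the toy three-dimensional embeddings, and the prompt template are assumptions for illustration. The distance function computes one minus cosine similarity, which is the quantity pgvector's `<=>` operator returns for non-zero vectors.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """One stored document chunk with its embedding (hypothetical schema)."""
    text: str
    embedding: list[float]


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list[float], chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return the k chunks with the smallest cosine distance to the query."""
    return sorted(chunks, key=lambda c: cosine_distance(query, c.embedding))[:k]


def build_prompt(question: str, context: list[Chunk]) -> str:
    """Assemble a narrow, context-grounded prompt for the generation model."""
    joined = "\n".join(f"- {c.text}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Toy embeddings; in practice these come from a local embedding model.
store = [
    Chunk("Retention policy: logs are kept 90 days.", [0.9, 0.1, 0.0]),
    Chunk("Office opening hours.", [0.0, 0.2, 0.9]),
    Chunk("Backups are encrypted at rest.", [0.7, 0.6, 0.1]),
]
hits = top_k([1.0, 0.2, 0.0], store, k=2)
prompt = build_prompt("How long are logs kept?", hits)
```

In a real deployment the sort would be replaced by a SQL query against the pgvector column, and the prompt would be passed to the self-hosted model through the orchestration layer.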
Key Challenges
The self-hosted tradeoff. Self-hosted open models give you full data control, but they give up some of the raw quality and speed you get from a frontier managed API. We treat that as a design constraint, not a dealbreaker, and compensate with aggressive caching, pre-computation where possible, async job queues, and pipelines structured so the model is asked to do narrow, well-scoped tasks rather than open-ended ones.
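As one illustration of the caching idea, the sketch below wraps a model call in an in-memory cache keyed by a hash of the model name and the whitespace-normalized prompt. The `ResponseCache` class and the `fake_model` stand-in are hypothetical, and a production setup would more likely persist entries in a shared store and add expiry.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """In-memory cache keyed by a SHA-256 hash of (model name, prompt)."""

    def __init__(self, generate: Callable[[str], str], model: str) -> None:
        self._generate = generate
        self._model = model
        self._entries: dict[str, str] = {}
        self.misses = 0  # counts how often the model was actually called

    def _key(self, prompt: str) -> str:
        # Collapse runs of whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(f"{self._model}\n{normalized}".encode()).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._generate(prompt)
        return self._entries[key]


def fake_model(prompt: str) -> str:
    """Stand-in for a slow self-hosted model call."""
    return f"summary of: {prompt.strip()}"


cache = ResponseCache(fake_model, model="local-model")
first = cache.complete("Summarize clause 4")
second = cache.complete("  Summarize   clause 4 ")  # served from the cache
```

Including the model name in the key means that swapping models invalidates old answers instead of silently reusing them.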
Playing to model strengths. A self-hosted model on a constrained footprint won't match Gemini or Claude on every task. So we don't ask it to. We decompose the work — retrieval, structuring, drafting, validation — so each step plays to what the model does reliably, and the pipeline as a whole produces output that holds up.
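The decomposition into retrieval, structuring, drafting, and validation can be sketched as a chain of small functions that each take and return a shared state dictionary. Everything here is illustrative: the stage bodies are stand-ins (a keyword lookup instead of vector search, string formatting instead of a model call), and the names are assumptions rather than our actual pipeline code.

```python
from typing import Any, Callable

State = dict[str, Any]
Stage = Callable[[State], State]


def retrieve(state: State) -> State:
    """Stand-in for vector search: keyword match against a tiny corpus."""
    corpus = {"retention": "Logs are kept for 90 days."}
    question = state["question"].lower()
    hits = [text for key, text in corpus.items() if key in question]
    return {**state, "context": hits}


def structure(state: State) -> State:
    """Reduce retrieved passages to short facts the drafting step can use."""
    facts = [passage.rstrip(".") for passage in state["context"]]
    return {**state, "facts": facts}


def draft(state: State) -> State:
    """Stand-in for a narrow, well-scoped model call."""
    text = "Answer: " + "; ".join(state["facts"]) + "."
    return {**state, "draft": text}


def validate(state: State) -> State:
    """Check that the draft is grounded: every fact must appear in it."""
    facts = state["facts"]
    grounded = bool(facts) and all(fact in state["draft"] for fact in facts)
    return {**state, "valid": grounded}


def run_pipeline(state: State, stages: list[Stage]) -> State:
    """Apply each stage in order, threading the state through."""
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline(
    {"question": "What is the retention period?"},
    [retrieve, structure, draft, validate],
)
```

Because each stage is a plain function, a weak step can be swapped out or retried on its own without touching the rest of the chain.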
Infrastructure complexity. Self-hosting Supabase and managing Docker containers requires more operational effort than using managed services. We've built deployment scripts and monitoring to reduce this burden.
When to Choose Self-Hosted
Not every project needs self-hosted infrastructure. We recommend it when:
- Regulatory requirements mandate data residency or self-hosting (GDPR, SOC 2, HIPAA, etc.)
- The organization handles sensitive personal data
- There's a strategic need for AI independence
- Long-term cost optimization takes priority over initial setup speed
For projects without these constraints, we use managed cloud services — they're faster to ship and easier to maintain.
