Core Service · Data Governance

Know exactly what data your models see.

Stack Vault's Stack Insight scans every prompt, embedding, and training corpus — surfacing PII, secrets, and IP before they leave your VPC.

0x · Data Leaves Your VPC
147M · Records Classified Daily
23 types · Built-in Sensitivity Classes
3s · Median Scan Latency

Coverage

Every place sensitive data ends up in AI

Prompts, vector stores, fine-tuning sets, agent memory, eval datasets. We scan all of it.

Prompt Inspection

Inline scanning of user prompts and system messages before they hit the model — redact, block, or alert.
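
To make the redact-or-block decision concrete, here is a minimal sketch of an inline prompt check. It is an illustration only, not Stack Insight's API: the `Action` enum, the `inspect_prompt` function, and both patterns are hypothetical, and a production scanner would need far more robust detectors. Alerting is left out for brevity.

```python
import re
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REDACT = "redact"
    BLOCK = "block"


# Hypothetical detectors: illustrative patterns only, not a real rule set.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def inspect_prompt(prompt: str) -> tuple[Action, str]:
    """Return the action taken and the text that may be sent to the model."""
    # Secrets are treated as the most severe class: stop the request outright.
    if AWS_KEY_ID.search(prompt):
        return Action.BLOCK, ""
    # PII is masked in place so the rest of the prompt can still go through.
    redacted, count = EMAIL.subn("[EMAIL]", prompt)
    if count:
        return Action.REDACT, redacted
    return Action.ALLOW, prompt
```

In this sketch a caller would forward the returned text to the model only when the action is `ALLOW` or `REDACT`.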

Vector Stores

Continuous classification of embeddings in Pinecone, Weaviate, pgvector, and Chroma.

Training Corpora

Full-corpus scans for fine-tuning datasets. Detects PII, PHI, copyrighted material, and leaked secrets.
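
As a rough picture of what a full-corpus pass involves, the sketch below walks a JSONL fine-tuning set and counts the records that match each sensitivity class. The `scan_corpus` function, the `text` field name, and the two patterns are assumptions made for illustration; they do not describe the product's detectors.

```python
import json
import re
from collections import Counter

# Hypothetical sensitivity classes with illustrative patterns.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_corpus(lines, text_field="text"):
    """Count records in a JSONL corpus that match each sensitivity class."""
    hits = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        text = str(record.get(text_field, ""))
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits[name] += 1
    return hits
```

Because `scan_corpus` accepts any iterable of lines, an open file handle can be passed directly and the corpus is streamed rather than loaded into memory.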

Agent Memory

Long-term memory stores for agentic systems, audited the same way you'd audit a database.

Egress Controls

Block exfiltration to external model providers when sensitivity policy says no.
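
One way to picture a sensitivity policy that "says no" is a rule that compares the classes detected in a request against the classes barred from external providers. The sketch below assumes that model; `EgressPolicy`, its fields, and the host names are hypothetical and do not reflect the product's policy format.

```python
from dataclasses import dataclass, field


@dataclass
class EgressPolicy:
    # Hosts considered inside the boundary (hypothetical examples).
    internal_hosts: set[str] = field(default_factory=set)
    # Sensitivity classes that must never reach an external provider.
    restricted_classes: set[str] = field(default_factory=set)

    def allows(self, host: str, detected_classes: set[str]) -> bool:
        """Return True if a request with these classes may go to this host."""
        if host in self.internal_hosts:
            return True
        # External destination: deny if any detected class is restricted.
        return not (detected_classes & self.restricted_classes)


policy = EgressPolicy(
    internal_hosts={"llm.internal.example"},
    restricted_classes={"phi", "secret"},
)
```

With this policy, a request carrying `phi` would be allowed to `llm.internal.example` and denied to any host outside `internal_hosts`.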

Custom Classes

Detect your proprietary IP: source code, customer lists, board materials, M&A artifacts.

Frequently Asked

Questions teams ask before deploying

Straightforward answers about scope, integration, data handling, and rollout.

How is this different from a DLP?

Traditional DLP doesn't understand embeddings or model APIs. We classify vector representations and audit RAG retrieval — not just file movement.

Do you train models on our data?

No. Classification runs on tenant-isolated infrastructure inside your VPC. No prompts or content leave your boundary.

What about HIPAA and GLBA?

We're HIPAA BAA-ready and aligned to the GLBA Safeguards Rule. Healthcare and financial deployments use a hardened compute pool.

Can we extend the classifiers?

Yes. Bring your own regex or your own model, or use our SDK to write custom Python classifiers for proprietary categories.
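
The SDK's actual interface is not shown here, so the following is only a sketch of the general idea: a regex-based class and a custom Python classifier registered side by side. Every name in it (`Classifier`, `regex_classifier`, `REGISTRY`, `classify`, and the "Project Falcon" codename) is hypothetical.

```python
import re
from typing import Callable

# A classifier here is any callable that returns True when text belongs to a class.
Classifier = Callable[[str], bool]


def regex_classifier(pattern: str) -> Classifier:
    """Wrap a regular expression as a classifier."""
    compiled = re.compile(pattern)
    return lambda text: compiled.search(text) is not None


def looks_like_board_material(text: str) -> bool:
    """Custom logic: require at least two of a few marker phrases."""
    markers = ("board of directors", "resolved that", "quorum")
    lowered = text.lower()
    return sum(marker in lowered for marker in markers) >= 2


# Hypothetical registry mapping proprietary class names to classifiers.
REGISTRY: dict[str, Classifier] = {
    "project_codename": regex_classifier(r"\bProject\s+Falcon\b"),
    "board_material": looks_like_board_material,
}


def classify(text: str) -> list[str]:
    """Return the sorted names of every class whose classifier matches."""
    return sorted(name for name, clf in REGISTRY.items() if clf(text))
```

Treating both kinds of classifier as plain callables keeps the registry uniform, so a model-backed classifier could be added the same way.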

Ready to See It Live

Scan your RAG corpus this week

We'll find what your models are seeing — usually a surprising amount.