Security · 6 min read · April 10, 2026

The case for private document AI in enterprise

Why the question 'but where does my data go?' is becoming the deciding factor for enterprise AI tools — and how Docutrix answers it.

Every serious enterprise AI conversation eventually hits the same wall: "But where does our data go?"

It's not a cynical question. Legal, compliance, and security teams have legitimate reasons to ask it. And increasingly, the answer determines whether a tool gets deployed or quietly killed in procurement.

Why enterprise buyers care

The documents that enterprise teams most need to query are also the most sensitive ones. Contracts with customers and suppliers. Board meeting minutes. Financial projections. HR performance files. M&A due diligence materials.

These documents contain trade secrets, personal data, and information that's subject to legal privilege, regulatory requirements, or both. The answer to "where does this go?" isn't just a procurement checkbox — it has real legal and business implications.

The spectrum of "private"

Not all "private" AI solutions are equivalent. Here's how the options break down:

**Shared SaaS with data isolation.** Your data is hosted on the vendor's infrastructure, isolated from other customers' data at the application and storage level. The vendor's team has no routine access to your content, although access remains technically possible. This is Docutrix's standard offering for Starter, Team, and Business plans.

**Private VPC deployment.** The AI stack runs inside your own cloud environment (AWS, Azure, GCP). The vendor provides the software and configuration; you control the infrastructure. Your document content never leaves your cloud. This is Docutrix's Enterprise tier option.

**On-premise deployment.** Everything runs inside your own data centre. No external network calls for document processing. Maximum control, highest operational overhead. Available for air-gapped environments.

**Private LLM.** Rather than calling a third-party LLM API (which would mean sending document content to that provider), the LLM runs entirely within your infrastructure. Docutrix's Enterprise tier supports this for customers whose document content cannot leave their environment.
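The distinction between these options comes down to two questions: who hosts the documents, and who hosts the LLM that reads them. A minimal Python sketch of that reasoning follows. The `Party` enum and the `content_leaves_customer` function are hypothetical names introduced for illustration; they are not part of any Docutrix API, and the example configurations are not statements about any specific plan.

```python
from enum import Enum


class Party(Enum):
    """Who operates a given piece of infrastructure (illustrative categories)."""
    CUSTOMER = "customer"        # your own cloud account or data centre
    VENDOR = "vendor"            # the AI tool vendor's infrastructure
    THIRD_PARTY = "third_party"  # e.g. an external LLM API provider


def content_leaves_customer(document_host: Party, llm_host: Party) -> bool:
    """Return True if document content is stored or processed
    anywhere outside the customer's own environment."""
    return document_host is not Party.CUSTOMER or llm_host is not Party.CUSTOMER


# Hypothetical configurations, for illustration only.
print(content_leaves_customer(Party.VENDOR, Party.THIRD_PARTY))    # True
print(content_leaves_customer(Party.CUSTOMER, Party.THIRD_PARTY))  # True
print(content_leaves_customer(Party.CUSTOMER, Party.CUSTOMER))     # False
```

The second case is the one that tends to get missed: hosting documents in your own environment does not keep content inside it if queries are still sent to an external LLM API.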

What "no training on your data" actually means

This phrase appears in many AI product descriptions. What it means in practice varies.

For Docutrix: your documents and queries are never used to train, fine-tune, or improve AI models — on any plan. This is contractual, not just a policy statement. Enterprise plans include a Data Processing Agreement (DPA) that makes this explicit.

The practical test

When evaluating any AI document tool for enterprise use, ask these questions:

1. Where are documents stored at rest, and who has access?

2. What third-party APIs are called when processing a query, and what data is sent to them?

3. Is there a contractual commitment not to use our data for model training?

4. What happens to our data if we cancel?

5. What certifications or audit reports are available?

The answers reveal a lot about how seriously a vendor has thought about enterprise data requirements.
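For teams that want to track these questions across several vendors, the checklist can be kept as structured data. The following is a small Python sketch under that assumption; the keys, the `unanswered` helper, and the sample answers are all hypothetical and only illustrate the idea.

```python
# The five questions from the checklist above, keyed by short
# illustrative labels (the labels are an assumption of this sketch).
QUESTIONS = {
    "storage": "Where are documents stored at rest, and who has access?",
    "third_party_apis": "What third-party APIs are called when processing a query, "
                        "and what data is sent to them?",
    "training": "Is there a contractual commitment not to use our data for model training?",
    "cancellation": "What happens to our data if we cancel?",
    "certifications": "What certifications or audit reports are available?",
}


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the questions a vendor has not yet answered.

    A missing key or a blank answer both count as unanswered.
    """
    return [
        question
        for key, question in QUESTIONS.items()
        if not answers.get(key, "").strip()
    ]


# Hypothetical vendor responses gathered during an evaluation.
vendor_answers = {
    "storage": "Customer-controlled cloud account; access limited to customer admins.",
    "training": "Yes, stated in the DPA.",
}

for question in unanswered(vendor_answers):
    print("Still open:", question)
```

A question that stays open after several rounds with a vendor is often as informative as the answers that do come back.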