The honest answer most consultants give is unsatisfying. Either "yes, but the AI will be much less capable", or "yes, but you'll need to trust our cloud provider's terms of service". Both responses concede something a client shouldn't have to concede.
I think the framing is wrong. The choice between local AI and cloud AI is a false dichotomy. The real question is which layer of the work each tool should be allowed to see.
The Three Layers of Any AI-Assisted Analysis or Project
Strip back any piece of analytical or project work and you’ll find three distinct layers stacked inside it.
The Raw Data
Actual client-confidential material: market-sensitive monetary figures, specific location data, commercially sensitive financial records. This is what your client gave you and trusted you with. It's covered by your engagement letter, your NDA, your professional indemnity insurance, and a quiet assumption that you'll treat it with the same care as your own financial records.
The Pattern
Observations about the data – abstracted, anonymised, structural. “There’s an unexplained variance between two sites operating under identical contract terms.” “Two suppliers are charging for the same line item under different framework codes.” These describe the shape of the problem rather than its specifics.
The Solution
The analytical framework. The SQL pattern that joins disparate price schedules cleanly. The Python approach for flagging anomalies. The structure of the findings report. The recommended remediation. These are tools and methods – largely portable, mostly generic, often improved by collaboration.
Most users treat AI as a single thing that touches all three layers with little control or separation, whether it sits behind an enterprise firewall or not. This is where the privacy problem lives.
What If We Treated Them Differently?
The principle I’ve ended up working to is this: the data stays local, but the abstractions can travel.
In practice, that means running two AI tiers in parallel.
Local Models
Qwen 2.5, Qwen 3, DeepSeek, and others in the open-source ecosystem. They live on hardware you control, behind your firewall, processing files that never touch a third-party API. They handle anything involving real client data – the raw data layer.
Frontier Models
Claude, the OpenAI models, and others. They work on the abstractions – the patterns, architectures, code structures, and analytical frameworks – the pattern and solution layers. They see the shape of problems, never the content.
This is how you'd want a senior advisor to work, even if AI weren't involved. You don't email your client's pricing schedule to a colleague to ask for help. You describe the situation in general terms, get their input on approach, then apply that approach yourself. The discipline of abstraction is good practice whether you're using large language models or not.
Why This Is Harder Than It Sounds
The reason most people don’t work this way is that it requires more upfront thinking, not less. When you’re staring at a real contract anomaly and you want help, the path of least resistance is to paste the whole thing into the smartest model you have access to.
The discipline of saying "wait – what's actually the abstract pattern here? Can I describe this problem without revealing what I shouldn't?" slows you down. It's a useful friction, but a friction nonetheless.
It also requires building habits and tooling that enforce the boundary. In my own setup, client work lives in folders where the AI integration is configured to talk only to local models. The cloud-capable tools live in separate working areas that physically cannot see client data.
The boundary must be enforced by configuration, not just by personal habit – because we are human, and habits can fail at five in the afternoon when you’re tired with a deadline looming. Structure beats discipline every time.
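As a sketch of what configuration-enforced routing might look like – the directory layout, endpoint URL, and function names here are hypothetical illustrations, not a description of my actual tooling:

```python
from pathlib import Path

# Assumed convention: client engagements live under CLIENT_ROOT and may
# only be processed by local model endpoints; everything else may use cloud.
CLIENT_ROOT = Path("/work/clients")           # hypothetical directory layout
LOCAL_ENDPOINTS = {"http://localhost:11434"}  # e.g. a local inference server

def allowed_endpoint(file_path: str, endpoint: str) -> bool:
    """Return True only if this endpoint is permitted to see this file."""
    p = Path(file_path).resolve()
    inside_client_area = CLIENT_ROOT in p.parents
    if inside_client_area:
        # Client data: local models only, regardless of how tired you are.
        return endpoint in LOCAL_ENDPOINTS
    # Abstractions and generic working areas may travel anywhere.
    return True
```

A guard like this sits between the editor integration and the model API, so the "wrong model for this folder" mistake fails loudly instead of silently.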
The configuration requires understanding the strengths of each tier well enough to route work intelligently. Local models are now genuinely capable for many tasks – reading contracts, extracting tables, answering questions about specific documents, generating boilerplate code. But they remain a step behind frontier models for complex multi-step reasoning, novel architectural problems, or anything requiring deep codebase awareness. Knowing what to send where is a skill in its own right.
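That routing judgment can itself be made explicit. A minimal sketch, with task categories of my own invention rather than any standard taxonomy, and a deliberately conservative default:

```python
LOCAL = "local"        # Qwen / DeepSeek class models on owned hardware
FRONTIER = "frontier"  # Claude / OpenAI class models via API

# Illustrative routing table based on the strengths described above.
ROUTING = {
    "contract_reading": LOCAL,         # touches raw client data
    "table_extraction": LOCAL,
    "document_qa": LOCAL,
    "boilerplate_code": LOCAL,
    "multi_step_reasoning": FRONTIER,  # abstracted problems only
    "novel_architecture": FRONTIER,
    "codebase_design": FRONTIER,
}

def route(task: str, touches_client_data: bool) -> str:
    """Pick a tier; client data always forces local, whatever the task."""
    if touches_client_data:
        return LOCAL
    # Unknown tasks default to local: the safe failure mode.
    return ROUTING.get(task, LOCAL)
```

The important design choice is the order of the checks: the data-sensitivity test overrides the capability test, never the other way round.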
The Strategic Case
This is more than just ticking the compliance box: it fundamentally changes what you can credibly offer, and to whom. Many of the most interesting consulting problems sit in organisations whose data is too sensitive to send to a cloud API: public sector, defence-adjacent work, pre-award procurement, healthcare – anything covered by a meaningful NDA and carrying real GDPR risk.
The consultant who can credibly say "the analysis happens on infrastructure I control, no client data leaves your environment or mine, and here's the architecture diagram" is operating in a different competitive space from the one who can't.
Trust isn’t a soft skill in this work. It’s the precondition for getting the work. A clean architectural story about how AI tools are used in your delivery is increasingly part of the trust conversation, alongside the older questions about indemnity, sub-contracting, and conflict of interest.
There's also a side benefit to this approach. When you're not paying a cloud provider per token for every read of a 200-page tender document, the economics of detailed analysis change for the better. Local inference is, in marginal-cost terms, almost free once you've bought the hardware. That makes thorough review affordable in places where API costs would have made it prohibitive.
What This Looks Like in Practice
For a current bid analysis engagement, the working pattern looks something like this:
- Tender documents – sometimes hundreds of pages, often with embedded pricing schedules, occasionally containing pre-award commercial intelligence – are ingested by a local vector database and queried with a local language model. The client’s data never leaves the engagement environment.
- When building the analysis pipeline itself – the Python that orchestrates retrieval, the prompt templates, the output formatting – the architecture is described to a frontier model. The frontier model never sees a tender. It sees a generic description of “a multi-document Q&A system over a specialised subject field.”
- When findings emerge, they’re written up locally, with local models helping phrase technical observations clearly. When the final report needs polish, work happens on the anonymised structure with a frontier model, then that polish is applied locally to the real text.
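The local retrieval step in the first bullet can be sketched in miniature. This is a toy bag-of-words ranker using only the standard library – a real deployment would use an embedding model and a vector database, but the point is that nothing in the loop needs a network call:

```python
import math
import re
from collections import Counter

def _vectorise(text: str) -> Counter:
    """Crude term-frequency vector; stands in for a real embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank document chunks against a query; everything runs locally."""
    q = _vectorise(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vectorise(c)), reverse=True)
    return ranked[:k]
```

The retrieved chunks are then passed to the local language model as context; the tender text itself never appears in any outbound request.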
Two tools doing two jobs, with a clear boundary between them, enforced by both habit and infrastructure.
The Shift This Represents
A year ago, the realistic options for a UK consultant doing data-sensitive work were either to build complex analysis engines with the assistance of a cloud API and hope the engagement letter covered it, or to forgo AI assistance entirely and accept slower, more laborious delivery. Neither approach was good.
The hardware to run capable local models has become genuinely affordable. A used GPU and a mini PC will do most of what a small-to-medium consulting practice needs. The open-source models have closed enough of the capability gap that, for many specific tasks, they’re not noticeably worse than the frontier models. The tooling around local inference has matured to the point that integrating it into a working practice is no longer the realm of techies and hobbyists.
This means the architectural choice is now a tangible one. Not "can we afford local AI?" but "how should we route different kinds of work between local and cloud tools, based on what each does well?"
Those who get this right will, I suspect, find themselves working on more interesting problems for clients with higher confidentiality requirements. Not because they're using more AI, but because they're using it in a way that respects the boundaries clients have always cared about.
The abstractions can travel.
That’s the architectural principle. Everything else is implementation detail.
Nick is a UK-based technology consultant specialising in data engineering, business intelligence, and applied AI for SMB and enterprise clients. He works on bid and tender analysis, commercial contract review, and reporting automation – work where client confidentiality is non-negotiable.