The Trump administration has made it clear: federal agencies must begin deploying AI systems as soon as possible. In April, the Office of Management and Budget (OMB) issued new guidance requiring non-national security agencies to identify AI use cases and begin implementation. National security agencies are likely to follow.
Yet a familiar fog surrounds decisionmakers: what is legal? Can agency general counsels confidently greenlight AI systems that interact with sensitive data — especially when it concerns U.S. persons?
The answer is yes. A technique called Retrieval-Augmented Generation (RAG) offers federal agencies a simple but powerful solution. RAG enables AI systems to draw on sensitive data without storing that data in the underlying AI model or modifying the model itself, avoiding the legal gray zones that often cause hesitation. This means agencies can start using AI today, with tools they already have and data they already lawfully retain. And in a world of accelerating global competition for AI leadership, they must. Adversaries are not waiting for perfect clarity — they are moving fast. U.S. agencies cannot afford to fall behind.
Many federal agencies remain stalled, unsure whether they can safely use internal or sensitive data to improve AI systems. The confusion centers on two common techniques: training, which builds a new model from scratch by feeding it large amounts of data; and fine-tuning, which adapts an existing model using agency-specific data so it performs specialized tasks better.
Training and fine-tuning raise unresolved legal questions. Would using sensitive internal data to improve an AI model trigger new obligations under the Privacy Act of 1974? Would it count as a new “collection” under E.O. 12333 or implicate recent National Security Council (NSC) guidance on “modified” models? These ambiguities have understandably created risk aversion — but that should not paralyze agencies across the board.
While training or fine-tuning a model may require more legal clarity, RAG offers a clean and compliant alternative. It allows agencies to harness AI’s power without entangling themselves in uncertain legal territory. And with adversaries moving fast, the United States cannot afford to delay. RAG provides a way to build AI systems today, using data agencies already lawfully possess, within existing legal frameworks.
This is not a theoretical workaround. It is a practical and technically mature interim solution. General counsels and chief AI officers should act on this now. If agencies can store and query a database, they can use RAG. If a human analyst can read a document, an AI model can, too.
How Does RAG Work?
In short, a RAG system connects an AI model to a database of agency documents that the model can query and search. When a user asks the model a question, the RAG system searches the attached database and returns the materials most relevant to the query. Those documents are provided to the model temporarily as additional context; the model uses them to generate its response and then discards them. In other words, agencies retain strict control over the data and can revoke access to it at any time.
The model’s internal parameters — known as its weights — remain unchanged. It is not modified, trained, or fine-tuned.
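To make those mechanics concrete, here is a minimal sketch of the retrieve-then-generate loop in Python. The embed() and generate() functions are placeholders for whatever locally hosted embedding and language models an agency actually deploys, and the bag-of-words scoring is purely illustrative; nothing in this loop touches the model’s weights.

```python
# Minimal retrieve-then-generate loop. embed() and generate() stand in for
# locally hosted embedding and language models; the bag-of-words scoring is
# illustrative only. No model weights are changed anywhere in this flow.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words vector over lowercased tokens.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank lawfully retained agency documents by similarity to the query.
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]

def generate(prompt: str) -> str:
    # Placeholder for a call to a locally hosted foundation model.
    return f"[model response grounded only in the prompt below]\n{prompt}"

def answer(query: str, documents: list[str]) -> str:
    # Retrieved text exists only inside this one prompt; nothing is retained
    # by the model once the response is returned.
    context = "\n---\n".join(retrieve(query, documents))
    return generate(f"Context:\n{context}\n\nQuestion: {query}")

if __name__ == "__main__":
    corpus = [
        "Eligibility standards for program X are set out in policy manual 4.2.",
        "Archived guidance on records retention schedules.",
    ]
    print(answer("What are the eligibility standards for program X?", corpus))
```

Everything above runs offline; the only production change would be swapping the placeholders for the agency’s approved embedding and generation models.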
An agency can run a RAG system for its own use, on its own infrastructure, with no need to connect it to the internet or share it with anyone outside the agency. This allows agency personnel to maintain strict control over system access, network configuration, and physical hardware security. The underlying vector database storing these documents can be secured using encryption, access controls, and audit logging — just like any sensitive information system operating under existing agency policy.
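As one sketch of what that auditability could look like, the wrapper below logs who queried the document store, when, and how many documents came back. The logger name and record fields are illustrative assumptions, not any agency’s actual schema.

```python
# Illustrative audit logging around retrieval: every query against the
# document store is recorded with who asked, when, and how many documents
# were returned. The logger name and fields are assumptions, not a schema
# drawn from any agency policy.
import json
import logging
from datetime import datetime, timezone
from typing import Callable

audit_log = logging.getLogger("rag.audit")
logging.basicConfig(level=logging.INFO)

def audited_retrieve(retrieve_fn: Callable[[str], list[str]], user_id: str, query: str) -> list[str]:
    # Wrap any retrieval function so each query leaves an audit trail.
    results = retrieve_fn(query)
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "query": query,
        "documents_returned": len(results),
    }))
    return results
```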
Consider two examples. A federal records officer at the Department of Veterans Affairs uses a RAG system to answer a question about eligibility standards. The system retrieves the most relevant excerpts from internal policy manuals and archived guidance, then feeds them into the model to generate a complete, accurate summary.
Or take an intelligence analyst with clearance to access a classified document database so massive and complex that no human could parse it quickly. When investigating a potential threat, the analyst uses a RAG system to query the classified database. The system retrieves only relevant, authorized documents and the AI model synthesizes them — uncovering insights from sources that would be too voluminous to review manually. The model uses only those specific documents for the response and discards them after the session — no memory, no storage, no learning. The analyst sees the citations, can verify the source, and can trust the output.
Importantly, this does not require connecting classified data to the open internet or to untrusted tools: the intelligence community can choose a trusted foundation model and run it entirely locally or in a secure environment. These kinds of tools are already gaining traction. For example, Meta has made Llama — its open-source foundation model — available to U.S. national security agencies and defense contractors, and companies like Scale AI and Lockheed Martin have built national security tools on top of this base. National security agencies can similarly deploy trusted foundation models in air-gapped, classified environments and use RAG to extract insights from sensitive data — without fine-tuning or modifying the models themselves.
When an AI model accesses U.S. person information, it functions no differently than a cleared human analyst reading a memo. Under E.O. 12333 and related guidance for national security agencies, reading U.S. person information may count as “collection” and be subject to certain procedural requirements. But RAG does not sidestep these rules — it fits within them. If the documents are already lawfully retained, and the AI simply accesses them temporarily to answer a question, the legal framework remains intact.
Because the model does not absorb the information into its internal structure (i.e., its weights remain unchanged), RAG can function within existing legal frameworks. RAG keeps the model unaltered and uses agency documents only at runtime, preserving legal accountability and preventing untraceable data retention. That design makes RAG fundamentally different — and legally cleaner — than traditional AI systems that are trained or fine-tuned on sensitive data.
By contrast, fine-tuning a model on sensitive data alters its internal parameters. Once data is incorporated into the model’s weights, it becomes difficult to audit, delete, or control, raising serious compliance challenges under statutes like the Privacy Act and agency-specific retention rules.
The federal government must clarify how existing legal obligations apply to training and fine-tuning, particularly as agencies adopt increasingly sophisticated AI tools. In the meantime, RAG offers a legally prudent path forward: it enables powerful, document-aware AI performance while preserving auditability, control, and compliance over sensitive content.
What RAG offers is structure. It cleanly separates the model from the data, which remains subject to the agency’s existing legal obligations, such as retention schedules, access controls, and audit logs. Rules requiring deletion of irrelevant U.S. person information (such as minimization) still apply, and so do access logs and purpose limitations. That is not new. It is just familiar legal work applied to a new interface.
Still, some risks remain. To manage access, agencies can implement role-based access control (RBAC), relationship-based access control (ReBAC), or fine-grained authorization (FGA). These approaches restrict access based on user roles, relationship to the data, and contextual factors — such as time of access or location — thereby limiting unnecessary exposure and aligning with longstanding data governance practices.
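A minimal sketch of that kind of role-based filtering, applied before retrieval ever runs, might look like the following. The roles and classification labels are hypothetical; a real deployment would pull them from the agency’s existing identity and data-governance systems.

```python
# Sketch of role-based access control applied before retrieval: a user's
# query only ever runs over documents their role is cleared to see. The
# roles and classification labels below are hypothetical.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    classification: str  # e.g., "UNCLASSIFIED" or "SECRET"

# Which classification labels each role may retrieve (illustrative only).
ROLE_PERMISSIONS = {
    "records_officer": {"UNCLASSIFIED"},
    "cleared_analyst": {"UNCLASSIFIED", "SECRET"},
}

def authorized_documents(role: str, documents: list[Document]) -> list[Document]:
    allowed = ROLE_PERMISSIONS.get(role, set())
    return [doc for doc in documents if doc.classification in allowed]

# Usage: filter the corpus first, then run retrieval only over what the
# user's role permits.
corpus = [
    Document("Eligibility standards excerpt...", "UNCLASSIFIED"),
    Document("Threat reporting excerpt...", "SECRET"),
]
visible = authorized_documents("records_officer", corpus)
assert all(doc.classification == "UNCLASSIFIED" for doc in visible)
```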
Encryption is also key to securely deploying RAG. Advanced techniques such as homomorphic encryption can enable computations to occur directly on encrypted data without requiring decryption. Some agencies may also worry about newer technical concerns like embedding leakage, where information encoded into vectors might be partially reconstructed. While this risk is specific to modern AI pipelines, it shares characteristics with familiar data security challenges. Access controls, encryption, and adversarial testing strategies can help mitigate these risks.
How Agencies Can Deploy AI Systems — Now
So what should agencies do to take advantage of RAG? First, agencies should choose a trusted foundation model and run it locally on secure agency infrastructure. Without changing any of the model’s internal structure, the agency can connect the AI to internal data through a RAG pipeline and start answering real operational questions. Open-source frameworks can get an agency to a working prototype in a matter of days.
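As an illustration of how little plumbing this requires, the snippet below shows the final step of such a pipeline: posting the retrieval-augmented prompt to a model served entirely on agency infrastructure. The endpoint URL and request format are assumptions about a generic local inference server, not any particular framework’s API.

```python
# Sketch of the final step: post the retrieval-augmented prompt to a model
# served entirely on agency infrastructure. The endpoint URL and JSON fields
# are assumptions about a generic local inference server, not any specific
# product's API.
import requests

LOCAL_MODEL_ENDPOINT = "http://localhost:8000/generate"  # hypothetical internal address

def ask_local_model(question: str, context_chunks: list[str]) -> str:
    prompt = "Context:\n" + "\n---\n".join(context_chunks) + f"\n\nQuestion: {question}"
    response = requests.post(LOCAL_MODEL_ENDPOINT, json={"prompt": prompt}, timeout=60)
    response.raise_for_status()
    return response.json().get("text", "")
```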
The only real barrier left is institutional inertia. Of course, agencies must still do the blocking and tackling: secure their data, control access, disable unnecessary logging, and ensure that only authorized users can query sensitive material. But these are familiar obligations under the Privacy Act, E.O. 12333, and internal data-handling protocols. RAG does not remove legal responsibility, but rather restores legal visibility.
What RAG removes is the excuse to wait. Failing to deploy AI tools that can support mission-critical decisions now means ceding ground to adversaries who will not hesitate to deploy theirs first. There is no need for a new National Security Council memo, congressional directive, or executive order. The technology exists. So do the use cases. What is missing is deployment.
RAG is not a loophole or a workaround — it is a design choice that fits squarely within established legal frameworks. Agencies do not need new doctrine to deploy it, just the recognition that this architecture works within the old one. While more complex questions about fine-tuning and model training await resolution, RAG offers a compliant, high-impact interim solution.
As the Trump administration’s recent AI guidance made clear, U.S. federal agencies are poised to more widely adopt AI across the board. AI is a strategic asset, and deploying it now is not just an opportunity — it is a necessity.