Blind Spot in BAAs: PHI in LLM Context Windows
This article discusses a compliance gap in healthcare AI: Business Associate Agreements (BAAs) typically do not cover data present in the context windows of large language models (LLMs) during inference, which can expose protected health information (PHI).
Why it matters
This issue is an active risk area for healthcare organizations using AI, and compliance teams increasingly treat it as a critical gap to close.
Key Points
1. BAAs cover data storage and transmission, but not PHI in LLM context windows
2. The 18 HIPAA Safe Harbor identifiers are broader than most teams realize
3. Multi-step LLM pipelines can compound the exposure of PHI in intermediate context windows
4. The solution is to scrub all 18 HIPAA identifiers before text enters any LLM context window
Details
The article explains that while BAAs govern the storage and transmission of data sent to cloud LLMs, they typically do not address the identifiers present in the model's context window during inference. This can lead to exposure of PHI, including elements like device IDs, IP addresses, and image URLs that may contain patient identifiers. The problem is exacerbated in multi-step LLM pipelines, where PHI can accumulate in intermediate context windows. The solution is to proactively scrub all 18 HIPAA Safe Harbor identifiers from the text before it enters any LLM context, not just the final one. This ensures PHI is not inadvertently exposed during the inference process.
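The scrub-before-every-context approach described above can be sketched as follows. This is a minimal illustration, not a complete de-identification tool: the patterns below cover only a handful of the 18 Safe Harbor categories (SSNs, phone numbers, email addresses, IP addresses, URLs), and the `scrub` and `run_pipeline` names are hypothetical. A production system would need all 18 categories, including dates and free-text names, typically via a dedicated clinical NLP or PII-detection tool.

```python
import re

# Illustrative regex patterns for a few of the 18 HIPAA Safe Harbor
# identifier categories. Real de-identification needs far broader
# coverage (names, dates, device IDs, etc.) than simple regexes.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "URL": re.compile(r"https?://\S+"),
}

def scrub(text: str) -> str:
    """Replace matched identifiers with category placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def run_pipeline(text: str, steps) -> str:
    """Scrub before *every* step, so intermediate LLM context
    windows never accumulate PHI -- not just the final prompt."""
    for step in steps:
        text = step(scrub(text))
    return text
```

For example, `scrub("Call 555-123-4567 from 10.0.0.1")` yields `"Call [PHONE] from [IP]"`. The key design point, per the article, is that `run_pipeline` scrubs before each step rather than only once at the end, so PHI introduced or echoed by an intermediate step cannot flow into the next context window.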