Lead Institution: Carnegie Mellon University

Project Leader: Anupam Datta

Research Progress

  • Abstract

    Healthcare organizations collect sensitive personal health information from patients in order to provide treatment. To protect patient privacy, these organizations must disclose personal health information to third parties in compliance with complex privacy regulations, such as the Health Insurance Portability and Accountability Act (HIPAA) and state privacy laws. In this project, we develop efficient compliance checking algorithms that can automatically check incomplete audit logs for compliance with large fragments of privacy regulations. Our experiments with all disclosure clauses of the HIPAA Privacy Rule demonstrate that the algorithms work efficiently on realistic healthcare privacy policies.

  • Focus of the research/Market need for this project

    The focus of this project is to develop algorithms for enforcing disclosure policies, in particular those found in state and federal healthcare privacy laws. Commercial audit tools focus on access logs and do not help enforce the kinds of disclosure policies found in healthcare regulations. Enforcement tools will become increasingly important as personal health information is digitized and shared across organizational boundaries, especially with the emergence of health information exchanges (HIEs).

  • Project Aims/Goals

    The overarching aim of this project is to advance the state of the art in automatically checking whether an audit log, which records the relevant disclosure events of an organization, is compliant with a given privacy policy. A significant challenge in automated compliance checking is that audit logs maintained by organizations may be incomplete, i.e., they may not contain enough information to decide whether or not the policy has been violated. Moreover, the audit logs that must be checked for compliance can be very large.

  • Key Conclusions/Significant Findings/Milestones Reached

    We have developed an iterative auditing algorithm, which we call “reduce”, that automatically checks an incomplete and spatially distributed audit log for compliance with a given privacy policy. In each iteration, it checks as much of the policy as possible over the current log and outputs a residual policy that can only be checked once the log is extended with additional information. We have empirically demonstrated that reduce scales gracefully when checking large synthetic logs against practical privacy policies such as HIPAA.
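    The iterative behavior of reduce can be sketched as follows. This is a minimal illustration, not the project's implementation: the policy representation (AND/OR trees over atomic checks) and all names are assumptions made for the example, and atomic checks return None when the current log cannot yet decide them.

```python
# Minimal sketch of the iterative "reduce" idea. The policy language,
# class names, and toy log representation are illustrative assumptions.
# An atomic check evaluated on the current (incomplete) log yields
# True, False, or None ("log has no answer yet"); reduce evaluates as
# much of the policy as it can and returns a residual policy.

from dataclasses import dataclass
from typing import Callable, Optional, Union

Log = dict  # toy log: maps fact names to recorded values

@dataclass
class Atom:
    name: str
    eval_fn: Callable[[Log], Optional[bool]]  # None = unknown so far

@dataclass
class And:
    left: "Policy"
    right: "Policy"

@dataclass
class Or:
    left: "Policy"
    right: "Policy"

Policy = Union[Atom, And, Or, bool]

def reduce_policy(p: Policy, log: Log) -> Policy:
    """Check as much of p as possible on log; return a residual policy.

    True  -> compliant on current information
    False -> violation found
    other -> residual policy to re-check when the log is extended
    """
    if isinstance(p, bool):
        return p
    if isinstance(p, Atom):
        v = p.eval_fn(log)
        return p if v is None else v
    l, r = reduce_policy(p.left, log), reduce_policy(p.right, log)
    if isinstance(p, And):
        if l is False or r is False:
            return False
        if l is True:
            return r
        if r is True:
            return l
        return And(l, r)
    # Or node
    if l is True or r is True:
        return True
    if l is False:
        return r
    if r is False:
        return l
    return Or(l, r)
```

    For example, a toy clause "the disclosure was for treatment AND the patient consented" reduces, over a log that only records the purpose, to the residual atom "patient consented"; a later iteration over the extended log resolves it to True or False.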

    We also explored the challenges of deploying our audit algorithm reduce in a practical system. This exploration led to a collaboration with the HIE AUDIT team from UIUC on an audit system for the Illinois HIE (IL-HIE). We enhanced reduce to report not just violations but also explanations of which sub-policy (e.g., exactly which HIPAA clause) was violated by hospital employees’ actions. The new audit algorithm also has an additional privacy-preserving property: the auditor is not given access to the entire audit log, but only to the parts of the log that are essential to checking whether the policy was violated or respected. This is achieved by storing the audit log in encrypted form and releasing decryption keys for only those portions of the log that the reduce algorithm indicates must be checked by the auditor.
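    The selective-decryption idea can be illustrated with a toy sketch. Everything here is an assumption made for exposition: the entry format, the function names, and the XOR keystream "cipher", which merely stands in for real authenticated encryption (e.g., AES-GCM) in an actual deployment.

```python
# Toy illustration of selective log decryption: each entry is encrypted
# under its own key, and the log owner releases keys only for the
# entries that the reduce step marks as needed by the auditor.
# The XOR keystream below is a stand-in for real authenticated
# encryption, used only to keep the sketch dependency-free.

import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # XOR data with a SHA-256-derived keystream (symmetric: applying
    # it twice with the same key recovers the plaintext).
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def encrypt_log(entries):
    """Encrypt each entry under a fresh per-entry key."""
    keys, blobs = {}, {}
    for i, entry in enumerate(entries):
        k = secrets.token_bytes(32)
        keys[i] = k
        blobs[i] = keystream_xor(k, entry.encode())
    return keys, blobs  # keys stay with the log owner

def auditor_view(blobs, keys, needed):
    """Decrypt only the entries flagged as needed; the rest stay opaque."""
    return {i: keystream_xor(keys[i], blobs[i]).decode() for i in needed}
```

    For instance, if reduce determines that only the disclosure event and the matching authorization are relevant, `auditor_view(blobs, keys, needed={0, 1})` reveals those two entries while every other entry remains encrypted.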

    Privacy laws like HIPAA can restrict a disclosure action based on prior events. For instance, a covered entity can share a patient’s psychotherapy notes if it has previously received an authorization. For checking such conditions (e.g., the existence of a prior authorization), reduce requires all prior entries of the audit log to be available. We relax this restriction by developing an algorithm, which we call précis, that checks an audit log with complete information for compliance while caching relevant past events, thus enabling us to discard prior entries of the audit log. The retention restriction imposed by the policy (e.g., HIPAA has a retention restriction of 6 years) ensures that précis’s cache of relevant prior events does not grow forever. Our empirical evaluation indicates that précis is 3x–8x faster than reduce on synthetic audit logs checked against HIPAA.
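    The caching idea behind précis can be sketched as below. This is an illustrative simplification, not the actual algorithm: the single cached predicate (authorizations), the event names, and the day-based retention window are all assumptions made for the example.

```python
# Minimal sketch of the précis caching idea: process log entries in
# time order, cache only the past events relevant to future checks
# (here, authorizations), and evict cache entries older than the
# policy's retention window so the cache stays bounded and earlier
# raw log entries can be discarded. Event names are illustrative.

RETENTION = 6 * 365  # HIPAA-style 6-year retention window, in days

class PrecisMonitor:
    def __init__(self):
        self.auth_cache = {}  # patient -> day of latest authorization

    def process(self, day, event, patient):
        """Process one log entry; return None or a violation message."""
        # Evict cached events that fell outside the retention window.
        self.auth_cache = {p: d for p, d in self.auth_cache.items()
                           if day - d <= RETENTION}
        if event == "authorize":
            self.auth_cache[patient] = day
            return None
        if event == "disclose_notes":
            # Compliant only if a cached authorization is still in force.
            if patient not in self.auth_cache:
                return (f"violation: notes of {patient} disclosed on "
                        f"day {day} without prior authorization")
        return None
```

    Because each entry is consulted once and only the cached summary is kept, the monitor never needs to revisit the full log, which is the source of the reported speedup over reduce for this class of conditions.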

  • Available Materials for Other Investigators/Interested parties

    The source code and our formal encodings of relevant federal privacy regulations (e.g., HIPAA, GLBA) are available at the following URLs:

  • Market entry strategies

    We are currently collaborating with UIUC and IL-HIE to explore the possibility of deploying these algorithms in HIEs.

Temporal Mode-Checking for Runtime Monitoring of Privacy Policies
Omar Chowdhury, Limin Jia, Deepak Garg, and Anupam Datta
Under Review, 2014

Policy Auditing over Incomplete Logs: Theory, Implementation and Applications
Deepak Garg, Limin Jia, and Anupam Datta
Proceedings of the 18th ACM CCS, 2011

Privacy-Preserving Audit for Broker-Based Health Information Exchange
Se Eun Oh, Ji Young Chun, Limin Jia, Deepak Garg, Carl A. Gunter, and Anupam Datta
Proceedings of the ACM CODASPY, 2014