A Policy-Driven Approach to Secure Extraction of COVID-19 Data From Research Papers

基于政策驱动的方法,安全地从研究论文中提取 COVID-19 数据

阅读:2

Abstract

The entire scientific and academic community has been mobilized to gain a better understanding of the COVID-19 disease and its impact on humanity. Most research related to COVID-19 needs to analyze large amounts of data in very little time. This urgency has made Big Data Analysis, and related questions around the privacy and security of the data, an extremely important part of research in the COVID-19 era. The White House OSTP has, for example, released a large dataset of papers related to COVID research from which the research community can extract knowledge and information. We show an example system with a machine learning-based knowledge extractor which draws out key medical information from COVID-19 related academic research papers. We represent this knowledge in a Knowledge Graph that uses the Unified Medical Language System (UMLS). However, publicly available studies rely on dataset that might have sensitive data. Extracting information from academic papers can potentially leak sensitive data, and protecting the security and privacy of this data is equally important. In this paper, we address the key challenges around the privacy and security of such information extraction and analysis systems. Policy regulations like HIPAA have updated the guidelines to access data, specifically, data related to COVID-19, securely. In the US, healthcare providers must also comply with the Office of Civil Rights (OCR) rules to protect data integrity in matters like plasma donation, media access to health care data, telehealth communications, etc. Privacy policies are typically short and unstructured HTML or PDF documents. We have created a framework to extract relevant knowledge from the health centers' policy documents and also represent these as a knowledge graph. Our framework helps to understand the extent to which individual provider policies comply with regulations and define access control policies that enforce the regulation rules on data in the knowledge graph extracted from COVID-related papers. Along with being compliant, privacy policies must also be transparent and easily understood by the clients. We analyze the relative readability of healthcare privacy policies and discuss the impact. In this paper, we develop a framework for access control decisions that uses policy compliance information to securely retrieve COVID data. We show how policy compliance information can be used to restrict access to COVID-19 data and information extracted from research papers.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。