Call for Data Standardization: Lessons Learned and Recommendations in an Imaging Study

Abstract

PURPOSE: Data sharing creates potential cost savings, supports data aggregation, and facilitates reproducibility to ensure quality research; however, data from heterogeneous systems require retrospective harmonization. This is a major hurdle for researchers who seek to leverage existing data. Efforts to improve data interoperability largely center on the use of standards but ignore the problems of competing standards and the value of existing data. Interoperability therefore remains reliant on retrospective harmonization, and approaches to reduce this burden are needed.

METHODS: The Cancer Imaging Archive (TCIA) is an example of an imaging repository that accepts data from diverse sources. It contains medical images from investigators worldwide and substantial nonimage data. Digital Imaging and Communications in Medicine (DICOM) standards enable querying across images, but TCIA does not enforce other standards for describing nonimage supporting data, such as treatment details and patient outcomes. In this study, we used 9 TCIA lung and brain nonimage files containing 659 fields to explore retrospective harmonization for cross-study query and aggregation. Identifying 41 overlapping fields that appeared in 3 or more files, and transforming 31 of them, took 329.5 hours, or 2.3 months of work, spread over 6 months. We used the Genomic Data Commons (GDC) data elements as the target standards for harmonization.

RESULTS: We characterized the issues encountered and developed recommendations for reducing the burden of retrospective harmonization. Once the data were harmonized, we also developed a Web tool for exploring the harmonized collections.

CONCLUSION: While prospective use of standards can support interoperability, there are issues that complicate this goal. Our work identifies the retrospective harmonization issues that arise when reusing existing data and recommends national infrastructure to address them.
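The two harmonization steps named in the abstract (finding fields that overlap in 3 or more files, then transforming them to a target standard) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the file names, source field names, and the `FIELD_MAP` and `VALUE_MAP` tables are hypothetical, and the target names are only assumed to resemble GDC-style data elements.

```python
from collections import Counter

# Hypothetical nonimage files and their field names (not actual TCIA fields).
SOURCE_FIELDS = {
    "lung_a.csv": ["Patient ID", "Sex", "Age at Dx", "Vital Status"],
    "lung_b.csv": ["patient_id", "gender", "smoking_history"],
    "brain_a.csv": ["PatientID", "Gender", "age_at_diagnosis", "vital status"],
    "brain_b.csv": ["case_id", "KPS", "Age at Dx"],
}

# Hypothetical hand-curated map: normalized source name -> target element name.
FIELD_MAP = {
    "patientid": "patient_id",
    "sex": "gender",
    "gender": "gender",
    "ageatdx": "age_at_diagnosis",
    "ageatdiagnosis": "age_at_diagnosis",
    "vitalstatus": "vital_status",
}

# Hypothetical permissible-value map for one target element.
VALUE_MAP = {
    "gender": {"m": "male", "male": "male", "f": "female", "female": "female"},
}


def normalize(name):
    """Lowercase a field name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def overlapping_fields(source_fields, min_files=3):
    """Return target elements that occur in at least `min_files` files."""
    counts = Counter()
    for fields in source_fields.values():
        # Use a set so a target is counted at most once per file.
        targets = {
            FIELD_MAP[normalize(f)] for f in fields if normalize(f) in FIELD_MAP
        }
        counts.update(targets)
    return sorted(t for t, n in counts.items() if n >= min_files)


def harmonize_record(record):
    """Rename mapped fields to target names and recode known values."""
    out = {}
    for field, value in record.items():
        target = FIELD_MAP.get(normalize(field))
        if target is None:
            continue  # unmapped fields are dropped in this sketch
        allowed = VALUE_MAP.get(target)
        if allowed is not None:
            value = allowed.get(str(value).strip().lower(), value)
        out[target] = value
    return out


if __name__ == "__main__":
    print(overlapping_fields(SOURCE_FIELDS))
    print(harmonize_record({"Sex": "M", "Age at Dx": 61, "KPS": 80}))
```

In practice the curated mapping tables are where most of the manual effort goes, which is consistent with the hours reported above; the code only applies decisions a curator has already made.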
