Abstract
OBJECTIVE: We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals. DESIGN: The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval. MEASUREMENTS: Precision was estimated based on a sample of 250 concepts. Recall was estimated based on a sample of 40 concepts. The authors measured the impact of concept-based retrieval to improve upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine. RESULTS: Estimated precision was 0.897 (95% confidence interval, 0.857-0.937). Estimated recall was 0.930 (95% confidence interval, 0.838-1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone. CONCLUSION: Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval.