Abstract
Food plays a large role in health and disease. Foods consist of an enormous number of different compounds, many of which are responsible for (as yet unknown) biological effects. To study the health effects of food compounds, food composition databases are essential for deriving compound concentrations in foods. Many such databases are limited to around 100 compounds, but FooDB, the largest food composition database, aggregates data from different sources, expanding the number of compounds to more than 10,000. These data require cleaning before they can be used in compound intake analyses. This paper describes a method to clean and standardize FooDB for compound-centric dietary intake analyses.