Abstract
Big Data (BD) has the potential to transform drug discovery. Integrating chemical, biological, pharmacological, and clinical information speeds the conception of high-value projects, improving hit identification and the generation of superior leads or repositioned candidates while reducing time and costs. In this review, we show that BD extends beyond ligand discovery: the integration of genomic, proteomic, and metabolomic data sets supports the identification of novel pharmacological targets and guides the development of individualized therapies. When combined with combinatorial chemistry, high-throughput screening, and artificial intelligence (AI), BD expedites the identification of compounds that exhibit optimal pharmacokinetic and pharmacodynamic profiles. The impact of BD also extends to later stages of drug development, including regulatory evaluation and clinical translation. BD is therefore no longer a supplementary tool but a cornerstone of rational molecular design, predictive modeling, and data-driven drug discovery. Although the benefits of BD and AI in medicinal chemistry (MedChem) are evident, the widespread use of these data and tools raises philosophical questions that need to be discussed, because the popularization of large language models (LLMs) has led to the generation of promiscuous data that, from a scientific point of view, do not meet the criteria necessary to be considered meaningful.
Together, these factors call for a deliberate dialogue on how these tools should be applied within the hermeneutics of the biomedical sciences, so as to ensure a lucid discussion of the nature of the method and to reconcile the apparent tension between humans and AI, which has been a source of controversy since the rapid rise of ChatGPT and other LLMs.