A data-driven methodology to discover similarities between cocaine samples

Abstract

Machine learning has been applied for many purposes across the sciences, but it has not previously been applied to illegal drugs. This study proposes a new web-based system for cocaine classification, profiling of relations, and comparison that can produce meaningful output from a large amount of chemical profiling data. In particular, the Profiling Relations In Drug trafficking in Europe (PRIDE) system offers several advantages for intelligence operations across Europe: it provides a standardized, broad methodology that uses machine learning algorithms to classify and compare drug profiles, highlighting how similar drug samples are and how probable it is that they share a common origin, batch, or preparation process. We evaluated the proposed algorithms using precision and recall and analyzed the quality of their predictions against our gold standard. In our experiments, we reached 88% for the F0.5-measure, 91% for precision, and 78% for recall, supporting our main hypothesis: machine learning can be applied to classify cocaine profiles automatically.
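The three reported scores can be checked against one another. The sketch below assumes the study uses the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), which the abstract does not spell out; the function name `f_beta` is hypothetical and is not part of the PRIDE system. With beta = 0.5, precision is weighted more heavily than recall.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall (standard F-beta).

    Assumption: this is the textbook definition, not code from the study.
    """
    if precision < 0 or recall < 0 or beta <= 0:
        raise ValueError("precision and recall must be >= 0 and beta > 0")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Precision and recall as reported in the abstract.
score = f_beta(precision=0.91, recall=0.78, beta=0.5)
print(f"F0.5 = {score:.2f}")  # prints "F0.5 = 0.88"
```

Under this assumption, the reported precision of 91% and recall of 78% reproduce the reported F0.5-measure of 88%.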

Journal:	Scientific Reports	Impact factor:	3.800
Year:	2020	Citation:	2020 Sep 29;10(1):15976.
doi:	10.1038/s41598-020-72652-w
