Abstract
Recent years have seen rapid progress in the development of deep-learning (DL) tools for the analysis of biological data, the most prominent example being AlphaFold2 for accurate protein structure prediction. DL-based tools are especially useful for identifying patterns and connections within sparsely labeled datasets. This makes them essential for the analysis of metagenomic data, which is mostly unannotated and bears little sequence similarity to known genes and proteins. In this review, we present 12 tools that we consider to offer novel capabilities for metagenomic analysis through their use of DL techniques. The review is intended as a starting point for data scientists looking to apply advanced methods to explore metagenomic datasets. For each DL-based tool, we present its computational principles, followed by relevant examples of its application where possible and a note on its limitations.