Abstract
The successful use of deep learning in computational biology depends on the ability to extract meaningful biological information from trained models. Recent work has demonstrated that the attention maps generated by self-attention layers can be interpreted to predict cooperative binding of transcription factors, a key feature of gene regulatory networks. We extend this earlier work and demonstrate that the addition of an entropy term yields high-precision, sparse attention maps that are easy to interpret. Furthermore, we perform a comprehensive evaluation of the relative performance of different variants of attention-based transcription factor cooperativity discovery. Our findings demonstrate the benefit of entropy-enhanced attention models and provide additional insights that will help practitioners make effective use of this valuable tool for biological discovery.