Abstract
Medical report generation, which aims to reduce the workload of doctors and improve diagnostic efficiency, has long been an important research concern. Many methods have been proposed to generate diagnostic reports automatically. However, it remains challenging to generate semantically coherent paragraphs with accurate localization and a clinical language style. To address these challenges, we propose a novel medical report generation method based on generative adversarial networks with joint attention. The adversarial framework is designed for high-quality report generation and comprises three components: a generator that produces the diagnostic report, a discriminator that distinguishes authentic reports from generated ones, and an evaluator that preserves the clinical style. During generation, joint attention enhances and fuses visual and textual features to capture high-order interactions among cross-modal features, which is needed for accurate localization and for bridging the visual-to-pathological gap. A language evaluator is then introduced to preserve the clinical language style and to yield reports that clinicians find acceptable and readable. In addition, we propose a gradient search-based Pareto optimal (GSPO) strategy for multi-objective GANs, which uses hard parameter sharing to determine the multi-objective weight parameters and performs a linear search along the gradient direction to find a Pareto optimal solution. Extensive experiments show that our method achieves competitive performance compared with other state-of-the-art medical report generation methods. Specifically, it achieves average improvements of 6.2% in BLEU-2, 8.1% in BLEU-3, 7.4% in ROUGE, and 13.1% in MIRQI. Additionally, it outperforms the compared methods in the evaluation of clinical language style by radiologists.