Abstract
Recent cybersecurity standards have raised the bar for security assessments in organizations, but existing techniques require substantial manual effort. Threat analysis and risk assessment are used to identify security threats for new or refactored systems. However, threat analysis lacks a clear definition of done, so identified threats must be validated, which slows down the analysis. Existing literature has focused on the overall effectiveness of threat analysis, but no previous work has investigated what material analysts must use to effectively validate the identified security threats. We conduct a controlled experiment with practitioners to investigate whether having some analysis material (either the system's graphical model or LLM-generated advice) is better than none, and whether having both the system's graphical model and LLM-generated advice is better than having only one of them. We run a pilot of the experiment with 41 MSc students, a think-aloud study with three practitioners, and the experiment survey with 68 recruited practitioners. Our main findings suggest that, in terms of additional material needed for threat validation, less is more. We also find that participants perceived the graphical model as equally useful compared to the LLM-generated advice and that, despite the LLMs not always providing conclusive advice, practitioners still perceived the advice as somewhat useful. The experimental material and data analysis scripts are publicly available in a replication package.