Abstract
Amyloidogenic proteins play a central role in a range of pathological conditions, yet their presence in thrombi has only recently been recognized. Whether computational prediction tools can identify amyloid-forming potential in thrombus proteomes remains unclear. AmyloGram is a computational tool that estimates amyloid-forming potential using n-gram sequence encoding and random forest classification. Using AmyloGram, we analyzed 204 UniProt proteins that had been manually annotated as amyloidogenic. We then applied the same approach to proteins identified in thrombi retrieved by mechanical thrombectomy from patients with cardioembolic and atherothrombotic stroke. In addition, we used AmyloGram to assess the amyloidogenicity of 83,567 canonical human protein sequences. Among the UniProt-annotated 'amyloid' set, nearly all proteins received AmyloGram scores above 0.7, including 23 of the 24 human proteins. Even the lowest-scoring human protein, lysozyme (score 0.675), is known to form amyloid under certain conditions. In thrombi from both stroke subtypes, across four different studies, all detected proteins (with a single exception) had AmyloGram scores above 0.7, suggesting a high likelihood of amyloid content. A majority of unannotated proteins also achieved AmyloGram scores exceeding 0.7. AmyloGram reliably identifies known amyloid-forming proteins and reveals that stroke thrombi are enriched for proteins with high amyloidogenic potential. These findings support the hypothesis that thrombus formation in stroke involves amyloid-related mechanisms and warrant further investigation using histological and functional validation.