Abstract
Traditional machine learning (ML) relies on a centralized architecture, which makes it unsuitable for applications where data privacy is critical. Federated Learning (FL) addresses this issue by allowing multiple parties to collaboratively train models without sharing their raw data. However, FL is susceptible to data and model poisoning attacks that can severely disrupt the learning process. Existing defense mechanisms predominantly analyze client updates on the server side, without requiring or involving client cooperation. This paper proposes a novel defense mechanism, SpyShield, that leverages client cooperation to identify malicious clients in data and model poisoning attacks. SpyShield is inspired by tactics used in the social deduction game Spyfall, where the majority of players must detect the deception of a minority, a dynamic that mirrors the challenge posed by poisoning attacks in FL. We evaluate the robustness and performance of four SpyShield configurations on the FashionMNIST dataset against five benchmark aggregation algorithms (FedAvg, Krum, Multi-Krum, Median, and Trimmed Mean) under three attack types: (A) Cyclic Label Flipping, (B) Random Label Flipping, and (C) Random Weight Attacks. Each attack is tested in three scenarios: (I) 3 malicious clients out of 30, (II) 10 out of 50, and (III) 40 out of 100, for a total of nine experimental settings. These scenarios simulate increasing attack intensity, allowing SpyShield's effectiveness to be assessed as the share of malicious clients grows. In every setting, at least one SpyShield configuration outperformed all benchmark algorithms, achieving the highest accuracy. The evaluation shows that SpyShield achieves strong performance and resilience across diverse settings and attack types.
These findings highlight its potential as a robust and generalizable defense mechanism for securing federated learning models, while also opening new possibilities for collaborative strategies that move beyond centralized server-side analysis.
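To make the evaluation setup concrete, the following is a minimal Python sketch of the two label-flipping attacks and three of the benchmark aggregators named above. It is illustrative only and is not the SpyShield implementation. The function names, the "shift to the next class" form of the cyclic flip, the "any other class" form of the random flip, and the unweighted FedAvg are all assumptions made for this sketch; the exact definitions used in the experiments may differ.

```python
import random
from statistics import median

NUM_CLASSES = 10  # FashionMNIST has 10 classes


def cyclic_label_flip(labels, num_classes=NUM_CLASSES):
    """Assumed cyclic flip: move every label to the next class, wrapping around."""
    return [(y + 1) % num_classes for y in labels]


def random_label_flip(labels, num_classes=NUM_CLASSES, rng=None):
    """Assumed random flip: replace every label with a different, random class."""
    rng = rng or random.Random(0)
    return [rng.choice([c for c in range(num_classes) if c != y]) for y in labels]


def fedavg(updates):
    """Coordinate-wise mean of client updates (unweighted simplification)."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def coordinate_median(updates):
    """Coordinate-wise median of client updates."""
    return [median(col) for col in zip(*updates)]


def trimmed_mean(updates, trim):
    """Per coordinate, drop the `trim` smallest and largest values, then average."""
    result = []
    for col in zip(*updates):
        kept = sorted(col)[trim:len(col) - trim]
        result.append(sum(kept) / len(kept))
    return result


# Share of malicious clients in scenarios (I), (II), and (III).
scenarios = {"I": (3, 30), "II": (10, 50), "III": (40, 100)}
fractions = {name: bad / total for name, (bad, total) in scenarios.items()}

# Toy round: four honest clients agree, one malicious client sends an outlier.
honest = [[1.0, 1.0] for _ in range(4)]
malicious = [[100.0, -100.0]]
updates = honest + malicious

mean_result = fedavg(updates)               # pulled far from the honest value
median_result = coordinate_median(updates)  # stays at the honest value
trimmed_result = trimmed_mean(updates, trim=1)
```

In this toy round the plain mean is dragged toward the outlier, while the median and trimmed mean stay at the honest value. The three scenarios correspond to 10%, 20%, and 40% malicious clients, so the attackers remain a minority in every setting, consistent with the Spyfall analogy.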