Abstract
Projects such as the European 1+ Million Genomes initiative and the European Genomic Data Infrastructure project are paving the way towards the age of genomic medicine. To balance genomic data privacy with the needs of biomedical research, the proposed solution is to make private datasets discoverable through public metadata. Yet enabling data discovery based on the genomic variants present in a dataset, which is the goal of Beacon, raises the risk of re-identifying data subjects. We have implemented a Portuguese Beacon endpoint within the scope of the European Genomic Data Infrastructure project. It features a re-identification prevention algorithm to protect the privacy of data subjects and is the first Beacon endpoint to do so. We assessed the impact of the algorithm on data discovery; this impact varies with the size of the Beacon dataset.