Abstract
The Domain Name System (DNS) is critical to the functioning of the Internet, providing name resolution for nearly all user activities. In this work, we examine the hypothesis that individual users exhibit recurrent and distinctive patterns in their DNS query behavior, which can be leveraged to create unique and robust user fingerprints. Using a publicly available dataset of real DNS traffic collected from a large-scale network, we evaluate the feasibility of identifying users based solely on these behavioral DNS traces, independent of IP address stability. We conduct a comparative study of several machine learning models (Naive Bayes, Random Forests, XGBoost, Multilayer Perceptrons, and Convolutional Neural Networks), assessing their ability to classify users based on domain category frequencies and derived statistical features. After extensive data preprocessing, dimensionality reduction, and feature selection, our best-performing model, a CNN, achieves a classification accuracy of 86.7% across 1727 classes (unique IP addresses). The results confirm the viability of DNS-based user fingerprinting, even in the presence of dynamic IP addresses. Our approach opens new avenues for applications in network forensics and anomaly detection, while also raising important questions about privacy and the ethical use of passive traffic analysis.