Applying Behavioral Biometrics to Mobile Device Use Measurement in Children: Evaluating the Impact of Training Data Size, Proximity, and Type on Model Performance

Abstract

OBJECTIVE: Passive sensing applications are limited by their inability to determine who is using a device, a critical concern in child mobile device use research, where devices are often shared between siblings or between a child and a parent. Our previous work leveraged behavioral biometrics to identify a target child user; however, it is unknown what type of training data is necessary for optimal model performance. This study evaluated model performance across different characteristics of training data.

METHODS: Thirty-six children (11.3 ± 0.9 years, 56% female) self-selected a video or a game on iPads and used it for 10 min while lying down and for another 5 min while sitting. The SensorLog application captured iPad accelerometer and gyroscope data while the child interacted with the device. Machine learning algorithms including Neural Network (NN), Random Forest (RF), k-Nearest Neighbors (k-NN), and SwipeFormer were applied to determine which aspects of training data matter most for model performance. The aspects evaluated were (1) the length of training data (i.e., seconds of training data), (2) the user position (i.e., sitting, lying), and (3) the time proximity between training and testing data. F1 score was used to evaluate model performance.

RESULTS: The SwipeFormer F1 scores were lowest when the training data were farther in time from the test data (0 when training data were 11 min away from test data) and highest when training data were close to test data (0.91 when training data were the minute preceding test data). The SwipeFormer F1 scores were highest when predicting the user lying from lying (0.97) and sitting from sitting (0.94), and lowest when predicting the user sitting from lying (0) and lying from sitting (0). The length of training data had little impact on performance, with a SwipeFormer F1 score of 0.91 when training on one minute of data and 0.94 when training on twelve minutes of data.

DISCUSSION: Because researchers will likely need to identify users at timepoints different from those of their training data, research should focus on improving model performance for identifying users independent of the time proximity between training and test data.
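
The time-proximity comparison and the F1 metric described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the sample layout `(timestamp_seconds, features, label)`, the helper names `f1_score` and `split_by_proximity`, the one-minute windows, and the definition of `gap_minutes` (minutes between the end of the training window and the start of the test minute, with 0 meaning the immediately preceding minute) are all assumptions made for illustration.

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical sample layout: (timestamp in seconds, feature vector, label),
# where label 1 = target child and 0 = any other user.
Sample = Tuple[float, Any, int]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined, report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def split_by_proximity(
    samples: List[Sample],
    test_minute: int,
    gap_minutes: int,
    train_minutes: int = 1,
) -> Tuple[List[Sample], List[Sample]]:
    """Pick one test minute and a training window that ends gap_minutes before it.

    Assumption: gap_minutes=0 means training on the minute(s) immediately
    preceding the test minute; larger gaps move the training window earlier.
    """
    test_start = test_minute * 60
    test_end = test_start + 60
    train_end = test_start - gap_minutes * 60
    train_start = train_end - train_minutes * 60
    train = [s for s in samples if train_start <= s[0] < train_end]
    test = [s for s in samples if test_start <= s[0] < test_end]
    return train, test


# Usage with synthetic data: one sample per second over a 15-minute session.
session = [(float(t), None, 1) for t in range(15 * 60)]
near_train, near_test = split_by_proximity(session, test_minute=12, gap_minutes=0)
far_train, far_test = split_by_proximity(session, test_minute=12, gap_minutes=11)
print(near_train[0][0], far_train[0][0])  # 660.0 0.0
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

Any classifier could be fitted on the `train` window and scored on the `test` window with `f1_score`; repeating this over a range of `gap_minutes` values gives a performance-versus-proximity curve of the kind the abstract reports.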
