Abstract
One-hot encoding is a prevalent method for converting numerical variables into categorical variables, but it discards crucial quantitative information, which compromises the performance of convolutional neural networks (CNNs). This study introduces ensemble probabilistic quantization encoding, in which each class is treated as a quantum with a distinct value and the classes collaborate in an ensemble manner to preserve numerical information. The method uses the cross-entropy loss function, enhancing its robustness to outliers, and the ensemble collaboration among classes yields more diverse and enriched outcomes. We compared ensemble probabilistic quantization against three alternatives (one-hot encoding, label smoothing, and mean squared error) using the same dataset and model structure. Our investigation of the impact of quantitative information loss on CNN performance revealed that omitting this information significantly undermines CNN functionality. Ensemble probabilistic quantization proved less dependent on the number of classes than the other methods, maintaining its effectiveness even with few classes. In conclusion, the efficient transmission of quantitative information from numerical to categorical variables is essential for optimal CNN performance; ensemble probabilistic quantization conveys diverse quantitative information with fewer classes, outperforming one-hot encoding and label smoothing when the number of classes is limited.
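The abstract does not spell out the encoding itself; the sketch below illustrates one plausible reading of the idea it describes (classes as quanta with distinct values whose probabilities jointly preserve the numeric target, trained with cross-entropy): a soft assignment over quantum centers whose expectation reconstructs the original value. This is an assumption-based illustration, not the paper's exact method; the function names `quantize_soft` and `decode`, the uniform centers, and the two-center interpolation are ours.

```python
import numpy as np

def quantize_soft(y, centers):
    """Encode a scalar target as a probability vector over class 'quanta'.

    Probability mass is split between the two nearest quantum centers so
    that the expectation of the distribution equals y, preserving the
    numerical information that plain one-hot encoding discards.
    """
    p = np.zeros(len(centers))
    y = np.clip(y, centers[0], centers[-1])
    i = np.searchsorted(centers, y)        # index of first center >= y
    if i == 0 or centers[i] == y:
        p[i] = 1.0                         # y falls exactly on a quantum
    else:
        lo, hi = centers[i - 1], centers[i]
        w = (y - lo) / (hi - lo)           # linear interpolation weight
        p[i - 1], p[i] = 1.0 - w, w
    return p

def decode(p, centers):
    """Recover a numeric prediction as the expectation over the quanta."""
    return float(np.dot(p, centers))

centers = np.linspace(0.0, 10.0, 5)        # 5 quanta: 0, 2.5, 5, 7.5, 10
target = quantize_soft(3.1, centers)       # mass shared by 2.5 and 5.0
print(target, decode(target, centers))     # expectation recovers 3.1
```

Under this reading, the soft vector serves as the training target for a cross-entropy loss, and the classes act as an ensemble: no single class carries the value, but their weighted combination does, even when only a handful of quanta are available.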