Abstract
This study proposes a dual-branch framework for classifying breast tumor cellularity from histopathological images. The framework integrates two branches: the embedding-driven Embedding Extraction Branch and the vision-based Vision Classification Branch. The Embedding Extraction Branch uses Virchow2 to generate dense, structured embeddings, whereas the Vision Classification Branch employs Nomic AI Embedded Vision v1.5 to process image patches and produce classification logits. The outputs of both branches are combined to form the final classification. The framework also introduces a Knowledge Block, composed of fully connected layers, batch normalization, and dropout, to improve feature extraction and reduce overfitting. The proposed approach achieves an accuracy of [Formula: see text], a specificity of [Formula: see text], and a sensitivity, precision, and F1 score of [Formula: see text]. Ablation studies show that the Embedding Extraction Branch is essential: removing it drastically reduces accuracy to [Formula: see text]. The Vision Classification Branch also contributes, although removing it causes a smaller decrease in accuracy ([Formula: see text]). Excluding data augmentation likewise results in a notable decline in accuracy ([Formula: see text]). Statistical analysis supports the robustness of the approach, reporting low variance and high consistency across multiple performance metrics.
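
For readers who want to relate the reported metrics to one another, the following is a minimal sketch of how accuracy, sensitivity, specificity, precision, and F1 score can be computed for a multi-class classifier. It assumes one-vs-rest counting with macro averaging over classes; the abstract does not state which averaging scheme the study uses, and the function name `macro_metrics` and the integer class labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Macro-averaged one-vs-rest metrics from two equal-length label lists.

    Assumption: macro averaging over classes; the study may use a
    different averaging scheme.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    # Count each (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))

    def safe_div(num, den):
        # Return 0.0 when a class has no relevant samples.
        return num / den if den else 0.0

    sens, spec, prec, f1 = [], [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(v for (t, p), v in pairs.items() if t == c and p != c)
        fp = sum(v for (t, p), v in pairs.items() if t != c and p == c)
        tn = n - tp - fn - fp

        recall_c = safe_div(tp, tp + fn)
        precision_c = safe_div(tp, tp + fp)

        sens.append(recall_c)
        spec.append(safe_div(tn, tn + fp))
        prec.append(precision_c)
        f1.append(safe_div(2 * precision_c * recall_c, precision_c + recall_c))

    k = len(labels)
    return {
        "accuracy": sum(pairs[(c, c)] for c in labels) / n,
        "sensitivity": sum(sens) / k,
        "specificity": sum(spec) / k,
        "precision": sum(prec) / k,
        "f1": sum(f1) / k,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    print(macro_metrics([0, 0, 1, 1], [0, 1, 1, 1]))
```

Under macro averaging, sensitivity, precision, and F1 score need not coincide, so identical values for the three usually indicate either a particular averaging choice or a near-balanced error pattern.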