Minimalist Machine Learning with Metaheuristic Optimization for Explainable Spam Filtering

Authors

  • Alejandra Ramos Porras

Keywords

Automatic counting, video stabilization, object detection, object tracking, deep learning

Abstract

Spam detection remains a challenging text classification task because unsolicited messages change continually and conventional machine learning models lack transparency. This paper proposes a family of lightweight, interpretable classifiers based on the Minimalist Machine Learning (MML) paradigm combined with metaheuristic optimization. Three variants (MML + Random Search, MML + Hill Climb, and MML + Simulated Annealing) were implemented and evaluated on the SMS Spam Corpus v.0.1 using a hybrid lexical–semantic representation that combines BM25 weights with Word2Vec embeddings. Each model selects the most discriminative lexical–semantic features from the feature matrix, optimizing class separability through an objective function based on the Intra-Class Correlation Coefficient (ICC). Under Leave-One-Out Cross-Validation (LOOCV), the MML + Simulated Annealing variant achieved the best overall performance (Balanced Accuracy = 0.9327, F1-score = 0.9014, MCC = 0.8700), with results statistically comparable to a linear SVM baseline according to the paired Wilcoxon signed-rank test. These findings indicate that metaheuristic-enhanced MML models can achieve competitive performance while remaining fully interpretable. Given the increasing demand for transparent and responsible AI in communication systems, this study contributes to the development of interpretable and lightweight spam filtering mechanisms. Future work will extend these models to sentiment analysis, AI-generated text detection, and hybrid transformer–MML architectures that combine transparency with deep semantic understanding.
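
To make the feature-selection idea concrete, the following is a minimal sketch of simulated annealing over binary feature masks, scored with a one-way ICC. It is not the authors' implementation. The objective (ICC of the per-message mean of the selected features, with the two classes as groups), the cooling schedule, the function names, and the synthetic data are all assumptions made for illustration; the abstract does not specify how the ICC-based objective is computed.

```python
import math
import random


def icc_one_way(values, labels):
    """One-way ICC(1) of a 1-D score, treating each class label as a group."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    means = {lab: sum(g) / len(g) for lab, g in groups.items()}
    ss_between = sum(len(g) * (means[lab] - grand_mean) ** 2 for lab, g in groups.items())
    ss_within = sum((v - means[lab]) ** 2 for lab, g in groups.items() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective group size for unbalanced groups.
    n0 = (n_total - sum(len(g) ** 2 for g in groups.values()) / n_total) / (k - 1)
    denom = ms_between + (n0 - 1) * ms_within
    return (ms_between - ms_within) / denom if denom > 0 else 0.0


def objective(mask, X, y):
    """Assumed objective: ICC of the mean of the selected features per message."""
    selected = [j for j, on in enumerate(mask) if on]
    if not selected:
        return -1.0  # penalize the empty feature set
    scores = [sum(row[j] for j in selected) / len(selected) for row in X]
    return icc_one_way(scores, y)


def simulated_annealing(X, y, n_iter=500, t0=1.0, cooling=0.99, seed=0):
    """Search binary feature masks by flipping one feature per step."""
    rng = random.Random(seed)
    n_features = len(X[0])
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = objective(current, X, y)
    best, best_score = current[:], current_score
    temp = t0
    for _ in range(n_iter):
        candidate = current[:]
        j = rng.randrange(n_features)
        candidate[j] = not candidate[j]
        cand_score = objective(candidate, X, y)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        temp *= cooling  # geometric cooling (an assumed schedule)
    return best, best_score


def make_toy_data(n=40, seed=1):
    """Synthetic stand-in for a BM25 + Word2Vec matrix: 2 informative, 4 noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = spam, 0 = ham
        informative = [2.0 * label + rng.gauss(0, 0.3) for _ in range(2)]
        noise = [rng.gauss(0, 1.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, score = simulated_annealing(X, y)
    print("selected features:", [j for j, on in enumerate(mask) if on])
    print("ICC of selected subset:", round(score, 4))
```

Because the result is an explicit list of selected feature indices, the selected lexical or semantic dimensions can be inspected directly, which is the kind of transparency the abstract emphasizes. Under these assumptions, Random Search and Hill Climb variants would reuse the same objective and differ only in how candidate masks are proposed and accepted.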

Published

2026-04-20