{"id":386,"date":"2026-04-15T00:00:00","date_gmt":"2026-04-14T23:00:00","guid":{"rendered":"https:\/\/kosokoking.com\/?p=386"},"modified":"2026-04-12T15:25:35","modified_gmt":"2026-04-12T14:25:35","slug":"supervised-learning","status":"publish","type":"post","link":"https:\/\/kosokoking.com\/index.php\/multifarious\/supervised-learning\/","title":{"rendered":"Supervised learning"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Every supervised learning model carries the same structural assumption that the training data is an honest representation of the world it will operate in. Change that assumption, and the model does exactly what it was trained to do. It just does it wrong.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Before you can poison a dataset, craft an adversarial example, or exploit a classification boundary, you need to understand the learning process itself. If you do not know how a model arrives at its predictions, you cannot reason about where those predictions break.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How supervised learning actually works<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A supervised learning algorithm takes a dataset of labelled examples, each consisting of input features and a known output, and learns a function that maps one to the other. The goal is generalisation, producing correct outputs for data the model has never seen.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The process is iterative. The algorithm makes a prediction, measures how far off it is from the correct label, and adjusts its internal parameters to reduce that error. 
Repeat this across thousands or millions of examples, and the model converges on a function that (ideally) captures the real relationship between inputs and outputs rather than memorising the training set.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Two categories of problem dominate supervised learning:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Classification<\/strong>&nbsp;produces categorical outputs. Spam or not spam. Malware or benign. Phishing or legitimate. The model learns a decision boundary that separates classes in feature space.<\/li>\n\n\n\n<li><strong>Regression<\/strong>&nbsp;produces continuous outputs. Predicted revenue. Estimated time to compromise. Risk score. The model learns a curve or surface that best fits the relationship between features and the target value.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">From a red teaming perspective, the distinction matters. Classification models have discrete decision boundaries you can probe and push inputs across. Regression models have continuous output surfaces where small input perturbations produce proportional (and sometimes disproportionate) output shifts.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The components that matter<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Training data<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Training data is the single largest attack surface in any supervised learning pipeline. The model can only learn what the data teaches it. If the data contains bias, the model learns bias. If the data contains poisoned examples, the model learns the poison.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Quality and quantity both matter, but quality is harder to verify at scale. A training set of ten million examples is large enough to hide a few thousand corrupted labels without anyone noticing during a spot check. 
This is the basis of data poisoning attacks: the volume of legitimate data provides cover for injected malicious samples.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Features<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Features are the measurable properties the model uses as inputs. In a phishing detection model, features might include URL length, domain age, presence of suspicious substrings, TLS certificate issuer, and page content similarity scores.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Feature selection is where domain knowledge meets statistical modelling, and it is also where assumptions get baked in. A model trained on features that do not capture the adversary&#8217;s current techniques will miss what it was built to detect. Attackers who understand which features a model relies on can craft inputs that look normal across every measured dimension while being malicious in dimensions the model cannot see.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Labels<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Labels are the ground truth the model trains against. In security contexts, labels often come from analyst decisions: this sample is malware, that network flow is benign, this email is a phish.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Label quality is a known weak point. Analyst disagreement, evolving threat definitions, and the sheer volume of data that needs labelling all introduce noise. Noisy labels degrade model performance in predictable ways, but they also create opportunities. If you know the labelling methodology (and in many commercial products, it is documented or inferable), you can design inputs that exploit its blind spots.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The model itself<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A model is a mathematical function with learnable parameters. A logistic regression model learns a set of weights for each feature. A decision tree learns a sequence of threshold-based splits. 
A neural network learns millions of parameters across layers of non-linear transformations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The choice of model architecture constrains what the system can learn and, by extension, what it cannot learn. Linear models cannot capture non-linear relationships without engineered features. Shallow decision trees miss complex interactions. Deep neural networks can approximate almost any function but are opaque, making it harder to reason about their failure modes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For the red teamer, model architecture is an indicator of attack surface. Linear models are transparent and their decision boundaries are easy to reverse-engineer. Neural networks are harder to interpret but more susceptible to adversarial perturbation attacks that exploit gradient information.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Where models fail<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Overfitting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A model that has memorised its training data rather than learning general patterns will perform brilliantly on anything resembling the training set and fail on everything else. Overfitting is the model equivalent of an analyst who can only recognise threats they have seen before.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This matters for red teamers because overfitted models are brittle. Small deviations from training data patterns can cause misclassification. If you know a detection model was trained primarily on one malware family&#8217;s tooling, using a different family&#8217;s techniques (or writing your own) may be enough to evade it entirely.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Underfitting<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Underfitting is the opposite problem: a model too simple to capture the patterns in its data. Underfitted models produce high error rates on both training and unseen data. 
They are blunt instruments.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In practice, underfitting in security models often results from insufficient training data for rare attack classes. If a model has seen five thousand benign samples and fifty malicious ones, it will learn &#8220;predict benign&#8221; as a reliable strategy. The maths works out. The security does not.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The generalisation gap<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The gap between training performance and real-world performance is where most security ML systems quietly fail. A model evaluated on a held-out test set drawn from the same distribution as its training data will look good on paper. Deploy it against adversaries who are actively trying to avoid detection, and the distribution shifts.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is the core tension in adversarial ML: supervised learning assumes the test distribution matches the training distribution. Adversaries violate that assumption by design.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Evaluation metrics and what they hide<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Standard evaluation metrics tell you how a model performs. They do not tell you how it fails.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Accuracy<\/strong>&nbsp;measures the proportion of correct predictions overall. In imbalanced datasets (which security datasets almost always are), accuracy is misleading. A model that predicts &#8220;benign&#8221; for every input achieves 99.9% accuracy if only 0.1% of traffic is malicious.<\/li>\n\n\n\n<li><strong>Precision<\/strong>&nbsp;measures how many positive predictions were correct. High precision means few false positives, which matters for alert fatigue.<\/li>\n\n\n\n<li><strong>Recall<\/strong>&nbsp;measures how many actual positives were caught. 
High recall means few false negatives, which matters for detection coverage.<\/li>\n\n\n\n<li><strong>F1 score<\/strong>&nbsp;is the harmonic mean of precision and recall. It is a compromise metric and, like all compromises, satisfies no one completely.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">The red teamer&#8217;s question is never &#8220;what is the model&#8217;s F1 score?&#8221; It is &#8220;what does the model miss, and can I reliably land in that gap?&#8221; Evaluation metrics describe average-case performance. Adversaries operate in the worst case.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Cross-validation and regularisation<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Cross-validation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Cross-validation splits the dataset into multiple folds, trains the model on different combinations, and evaluates on the held-out fold each time. It is the standard technique for estimating how well a model will generalise.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The limitation is that cross-validation only measures generalisation within the existing data distribution. It tells you nothing about performance against distribution shift, which is precisely what an adversary causes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Regularisation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Regularisation adds a penalty term to the model&#8217;s loss function that discourages overly complex parameter values. L1 regularisation (penalising the absolute value of weights) tends to produce sparse models where some features are zeroed out entirely. L2 regularisation (penalising the square of weights) keeps all features but shrinks their influence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Both techniques help prevent overfitting, which indirectly makes models more robust to minor input variations. They do not protect against targeted adversarial inputs. Regularisation constrains the model&#8217;s complexity. 
It does not constrain the adversary&#8217;s.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Inference versus prediction<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Most discussions of supervised learning focus on prediction: feeding new inputs and getting outputs. Inference is the broader concept of understanding what the model has learned, which features matter, and how inputs relate to outputs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For red teamers, inference is the more useful capability. Understanding which features drive a model&#8217;s decisions tells you which features to manipulate. If a spam classifier relies heavily on the presence of certain keywords, you know to avoid those keywords. If a malware detector weights PE header entropy, you know to control your payload&#8217;s entropy profile.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Model interpretability tools (SHAP values, LIME, attention maps) are designed to help developers understand their models. They are equally useful to adversaries.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What this means for the series ahead<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Supervised learning is the foundation most detection, classification, and scoring systems are built on. The assumptions it makes (stable data distributions, honest labels, representative training sets, well-chosen features) are the assumptions adversaries target.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The articles that follow will build on this foundation: how adversarial examples exploit decision boundaries, how data poisoning corrupts the learning process from the inside, and how model extraction attacks turn a black-box system into a white-box one. Each of those attacks is grounded in the mechanics described here. The training loop, the loss function, the feature space, and the generalisation gap are not abstract concepts. 
They are the terrain.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How supervised learning works, where its assumptions break, and why red teamers need to understand the training pipeline before they can attack it.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[634,630,109,51,635,136,633,632,631,636],"class_list":["post-386","post","type-post","status-publish","format-standard","hentry","category-multifarious","tag-adversarial-ml","tag-ai-red-teaming","tag-ai-security","tag-cybersecurity","tag-data-poisoning","tag-machine-learning","tag-ml-fundamentals","tag-model-evasion","tag-supervised-learning","tag-threat-modelling"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/386","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/comments?post=386"}],"version-history":[{"count":2,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/386\/revisions"}],"predecessor-version":[{"id":389,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/386\/revisions\/389"}],"wp:attachment":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/media?parent=386"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/categories?post=386"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/tags?post=386"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true
}]}}