Logistic regression

A spam filter looks at an email, runs the content through a function, and outputs a number between 0 and 1. If that number crosses a threshold, the email disappears into a junk folder. The user never sees it. The sender is never told. One floating-point number, one threshold comparison, and the communication is silently killed. That is logistic regression doing its job. And if you want to red team AI systems, this is where you start, because the mechanics of that decision are far more exposed than most people realise.

Why this model matters for red teaming

Logistic regression is a classification algorithm. Despite the name, it does not predict continuous values the way linear regression does. It predicts categories such as spam or not spam, fraudulent or legitimate, malicious or benign. The model outputs a probability, and a threshold converts that probability into a hard yes or no.

This makes logistic regression the default first line of defence in dozens of security-adjacent systems. Email filtering, fraud detection, intrusion detection, malware triage. It is fast, interpretable, and cheap to run at scale. Many production ML pipelines still use logistic regression as either the primary classifier or as a gating layer before heavier models are invoked.

For a red teamer, that means logistic regression is often the first model standing between you and your objective. Understanding how it makes decisions tells you exactly how to make it decide wrong.

The sigmoid function

The core mechanic is the sigmoid function. It takes a linear combination of the input features (each feature value multiplied by a weight learned during training, plus a bias term) and maps it to a value between 0 and 1. The formula is straightforward:

P(x) = 1 / (1 + e^(-z))

where z is the weighted sum of inputs: z = w1*x1 + w2*x2 + ... + wn*xn + b.

The sigmoid produces an S-shaped curve. For very negative values of z, the output approaches 0. For very positive values, it approaches 1. Around zero, the function transitions rapidly, and this is where the model is most uncertain. That transition zone is also where the model is most vulnerable to manipulation.

If you can shift the input features just enough to push z across that midpoint, you flip the classification. The model does not need to be catastrophically wrong. It needs to be wrong by a fraction, right at the boundary.
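A minimal Python sketch of the two formulas above, using only the standard library. The weights, bias, and feature values are made up for illustration; a real model learns its weights and bias from training data.

```python
import math

def weighted_sum(weights, bias, features):
    """z = w1*x1 + w2*x2 + ... + wn*xn + b"""
    return sum(w * x for w, x in zip(weights, features)) + bias

def sigmoid(z):
    """Map any real z to a probability in (0, 1), avoiding overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical parameters, chosen only for illustration.
weights = [1.5, -2.0, 0.5]
bias = -0.25
features = [1.0, 0.5, 0.5]

z = weighted_sum(weights, bias, features)  # z = 0.5
print("z =", z, "P(x) =", round(sigmoid(z), 4))

# The S-curve: flat at the extremes, steep around z = 0.
for z_value in (-6, -1, 0, 1, 6):
    print(z_value, round(sigmoid(z_value), 4))
```

The steep middle of the curve is the part that matters: near z = 0, a small change in z moves the probability far more than the same change does at the extremes.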

The decision boundary is a flat surface

When logistic regression classifies data, it draws a line. In two dimensions, it is a literal line on a scatter plot, separating one class from another. In three dimensions, it is a plane. In higher dimensions (and real models have dozens or hundreds of features), it is a hyperplane: a flat surface cutting through the feature space.

This flatness is the architectural constraint that matters. Logistic regression can only learn linear decision boundaries. It cannot curve around clusters of data or capture complex, nonlinear relationships between features. If the true separation between classes is not roughly linear, the model will misclassify points near that boundary, and those misclassifications follow a predictable geometric pattern.

For red teaming purposes, this is useful information. The decision boundary is a fixed, flat surface in feature space. If you know (or can estimate) where it sits, you know exactly which direction to perturb inputs to cross it.
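The geometry can be sketched in a few lines. This assumes a threshold of 0.5 (so the boundary is the hyperplane where z = 0), real-valued features that can be changed freely, and weights the attacker already knows or has estimated. The numbers are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def score(w, b, x):
    """z = w.x + b; the sign of z decides the class at a 0.5 threshold."""
    return dot(w, x) + b

def distance_to_boundary(w, b, x):
    """Euclidean distance from x to the hyperplane w.x + b = 0."""
    return abs(score(w, b, x)) / math.sqrt(dot(w, w))

def cross_boundary(w, b, x, overshoot=0.05):
    """Move x along the weight vector, slightly past the hyperplane."""
    step = (1.0 + overshoot) * score(w, b, x) / dot(w, w)
    return [xi - step * wi for xi, wi in zip(x, w)]

# Hypothetical two-feature model and input.
w = [3.0, 4.0]
b = -5.0
x = [3.0, 4.0]

x_adv = cross_boundary(w, b, x)
print("distance to boundary:", distance_to_boundary(w, b, x))
print("score before:", score(w, b, x), "score after:", score(w, b, x_adv))
```

The weight vector is perpendicular to the boundary, so moving along it is the shortest route across. Real features are often constrained (counts cannot go negative, some fields cannot be edited), which is why this is a starting direction rather than a finished attack.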

Threshold manipulation

The default classification threshold is 0.5. If the sigmoid outputs a probability above 0.5, the input is classified as the positive class. Below 0.5, it is the negative class. Some systems adjust this threshold (a fraud detection system might set it at 0.3 to catch more suspicious transactions at the cost of more false positives), but the principle is the same.

The threshold is a hard boundary. A probability of 0.51 and a probability of 0.99 produce the same classification. This means an attacker does not need to fool the model completely. They need to move the probability output from one side of the threshold to the other. In many cases, that requires altering only a handful of input features by small amounts.
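A short sketch of the thresholding step, with the 0.5 and 0.3 thresholds taken from the text above. It also shows that a probability threshold is equivalent to a threshold on z, via the log-odds (logit) function. The function names are illustrative.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(p):
    """Inverse of the sigmoid: the z value that produces probability p."""
    return math.log(p / (1.0 - p))

def classify(z, threshold=0.5):
    """Positive class if the probability is above the threshold."""
    return sigmoid(z) > threshold

# A threshold on the probability is a threshold on z.
z_at_05 = logit(0.5)  # 0.0
z_at_03 = logit(0.3)  # negative: the boundary moves, it does not bend
print("z cut-off at 0.5:", z_at_05)
print("z cut-off at 0.3:", round(z_at_03, 4))

# 0.51 and 0.99 land in the same class.
print(classify(logit(0.51)), classify(logit(0.99)))

# The same input can flip class when the operator changes the threshold.
print(classify(-0.5, threshold=0.5), classify(-0.5, threshold=0.3))
```

Changing the threshold shifts the hyperplane parallel to itself. The direction an attacker needs to move in stays the same; only the distance changes.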

Consider a spam classifier trained on features like keyword frequency, sender reputation score, and link count. An attacker does not need to make a phishing email look like a birthday card. They need to suppress a few high-weight features (remove trigger words, reduce link density, spoof a sender domain with decent reputation) just enough to push the sigmoid output from 0.6 to 0.4. The email lands in the inbox.
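The spam example can be worked through with toy numbers. Everything here is an assumption for illustration: the three feature names, the weights, the bias, and the scale of the reputation score (0 to 1, higher is better). The values were picked so the probabilities land near the 0.6 and 0.4 mentioned above.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical spam model: positive weights push towards "spam".
WEIGHTS = {"keyword_count": 0.4, "sender_reputation": -2.0, "link_count": 0.3}
BIAS = -0.7

def spam_probability(email_features):
    z = sum(WEIGHTS[name] * value for name, value in email_features.items()) + BIAS
    return sigmoid(z)

original = {"keyword_count": 3, "sender_reputation": 0.2, "link_count": 1}

# Two small edits: drop one trigger word, send from a slightly better-rated domain.
modified = {"keyword_count": 2, "sender_reputation": 0.4, "link_count": 1}

p_before = spam_probability(original)
p_after = spam_probability(modified)
print("before:", round(p_before, 3), "spam" if p_before > 0.5 else "inbox")
print("after: ", round(p_after, 3), "spam" if p_after > 0.5 else "inbox")
```

Neither edit changes what the email is trying to do. The link count is untouched, and the message still carries two trigger words.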

The assumptions are the attack surface

Logistic regression makes four assumptions about the data it operates on, and each one creates a potential weakness.

Binary outcome. The model assumes the target variable has exactly two classes. In multi-class scenarios, logistic regression is extended (one-vs-rest or softmax), and these extensions introduce additional decision boundaries, each of which is independently attackable.

Linearity of log-odds. The model assumes a linear relationship between the input features and the log-odds of the outcome. If the true relationship is nonlinear, the model’s boundary will be systematically misaligned with reality. An attacker who understands where the model’s linear assumption breaks down can craft inputs that exploit exactly those gaps.

Low multicollinearity. If predictor variables are highly correlated, the model struggles to isolate their individual effects. The learned weights become unstable, meaning small changes in correlated features can produce disproportionate swings in the output probability. Correlated features are leverage points.

Large sample dependence. Logistic regression needs substantial training data for reliable parameter estimation. Models trained on thin datasets have poorly calibrated decision boundaries, and those boundaries are easier to probe and cross.

None of these assumptions are secrets. They are in every textbook. But most defenders do not think about them as attack surface. They think about them as statistical prerequisites.
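The multicollinearity point is the easiest of the four to show with numbers. In this sketch, two features are assumed to be perfectly correlated in the training data (they always take the same value). Without regularisation, the training data then cannot distinguish between weight pairs with the same sum, so both hypothetical models below fit equally well. They only disagree when an attacker moves one feature without the other.

```python
def score(w, x, b=0.0):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Two hypothetical weight vectors with the same sum (w1 + w2 = 2).
w_balanced = [1.0, 1.0]
w_unstable = [5.0, -3.0]

# On data where x1 == x2, the two models are indistinguishable.
for t in (-2.0, 0.5, 1.0, 3.0):
    assert abs(score(w_balanced, [t, t]) - score(w_unstable, [t, t])) < 1e-12

# The attacker breaks the correlation: lower x1 only, leave x2 alone.
honest = [1.0, 1.0]
nudged = [0.5, 1.0]

for name, w in (("balanced", w_balanced), ("unstable", w_unstable)):
    print(name, "honest z =", score(w, honest), "nudged z =", score(w, nudged))
```

The same half-unit nudge barely moves the balanced model and flips the sign of z for the unstable one. Which of the two a training run actually produces is down to the optimiser and any regularisation, which is exactly the instability described above.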

Evasion in practice

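Adversarial evasion against logistic regression is computationally cheap. Because the model is linear, the gradient of the score z with respect to the input is simply the weight vector, and the gradient of the loss with respect to each input feature is that same vector scaled by the prediction error. The gradient therefore points along the shortest path to the decision boundary, and the distance to the boundary in that direction is the minimum perturbation needed to flip the classification.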

This is not theoretical. Biggio et al. (2013) demonstrated a gradient-based evasion attack against several widely used classification algorithms, with malware detection in PDF files as the security case study, and showed that such systems can be easily evaded. The same approach applies to logistic regression classifiers in spam detection and malware classification: perturb a small number of features along the gradient direction, keeping each change small but sufficient to cross the decision boundary.
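For logistic regression with the standard log loss, the gradient with respect to the input has a closed form: (p - y) times the weight vector, where p is the predicted probability and y is the true label. The sketch below computes it and checks it against a finite-difference estimate. The weights and input are hypothetical, and the choice of log loss is an assumption, since the text does not name a loss function.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def log_loss(w, b, x, y):
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def input_gradient(w, b, x, y):
    """Closed form: dLoss/dx_i = (p - y) * w_i."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def numeric_gradient(w, b, x, y, h=1e-6):
    """Central finite differences, used only to sanity-check the closed form."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((log_loss(w, b, up, y) - log_loss(w, b, down, y)) / (2 * h))
    return grad

# Hypothetical model and a sample whose true label is 1 (malicious).
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 1.0, -1.0], 1

analytic = input_gradient(w, b, x, y)
numeric = numeric_gradient(w, b, x, y)
print("analytic:", [round(g, 6) for g in analytic])
print("numeric: ", [round(g, 6) for g in numeric])
```

The gradient is always a scalar multiple of the weight vector, wherever the input sits. Stepping along it raises the loss on the true label, which for a malicious sample means lowering z towards the benign side.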

The attack works because the model’s entire decision logic is encoded in a set of learned weights and a single threshold. Extract or estimate those weights (through model stealing, API probing, or training a surrogate model on the same data distribution), and you have a complete map of the decision boundary. From there, crafting adversarial inputs is arithmetic, not art.
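API probing is simplest in the idealised case sketched below, which assumes the target returns unrounded probabilities and accepts arbitrary numeric inputs. Under those assumptions, the logit of each returned probability is exactly z, so one query at the all-zeros input reveals the bias and one query per feature reveals each weight. `make_black_box` and `query_probability` are hypothetical stand-ins for a remote scoring endpoint.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(p):
    return math.log(p / (1.0 - p))

def make_black_box(secret_w, secret_b):
    """Hypothetical stand-in for a scoring API that returns raw probabilities."""
    def query_probability(x):
        z = sum(w * xi for w, xi in zip(secret_w, x)) + secret_b
        return sigmoid(z)
    return query_probability

def extract_parameters(query_probability, n_features):
    """Recover weights and bias with n_features + 1 queries."""
    zero = [0.0] * n_features
    bias = logit(query_probability(zero))  # z at the origin is just b
    weights = []
    for i in range(n_features):
        probe = list(zero)
        probe[i] = 1.0  # unit input isolates one weight
        weights.append(logit(query_probability(probe)) - bias)
    return weights, bias

api = make_black_box([1.2, -0.7, 2.5], -0.3)
stolen_w, stolen_b = extract_parameters(api, 3)
print("weights:", [round(w, 6) for w in stolen_w], "bias:", round(stolen_b, 6))
```

Production systems rarely make it this easy. Scores may be rounded, bucketed, or replaced by a bare label, and inputs may be validated, so in practice the same idea turns into fitting a surrogate model to many noisy queries rather than solving for the weights directly.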

Why defenders keep using it anyway

Given these vulnerabilities, you might wonder why logistic regression persists in production systems. The answer is operational: it is fast, it is interpretable, and it is auditable. A logistic regression model’s decision can be explained by listing the feature weights and showing which inputs pushed the probability above or below the threshold. Regulators like that. Compliance teams like that. Incident responders can trace a classification back to specific inputs in minutes.
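That kind of explanation is a few lines of code. The sketch below breaks a single score into per-feature contributions (weight times value) and ranks them by size. The feature names, weights, and bias are hypothetical.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical model parameters.
WEIGHTS = {"keyword_count": 0.4, "sender_reputation": -2.0, "link_count": 0.3}
BIAS = -0.7

def explain(features):
    """Return z, the probability, and contributions sorted by absolute size."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    z = sum(contributions.values()) + BIAS
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return z, sigmoid(z), ranked

z, probability, ranked = explain(
    {"keyword_count": 3, "sender_reputation": 0.2, "link_count": 1}
)

print("probability:", round(probability, 3))
for name, contribution in ranked:
    print(f"{name:>18}: {contribution:+.2f}")
print(f"{'bias':>18}: {BIAS:+.2f}")
```

The same breakdown that satisfies an auditor also tells an attacker which feature to change first.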

More complex models (deep neural networks, ensemble methods) are harder to evade but also harder to explain, monitor, and debug. Many organisations accept the trade-off of a simpler model with known weaknesses and clear observability, backed by layered defences, over a black-box model that is theoretically more robust but operationally opaque.

The red teaming implication is that logistic regression is not going away. It is the model you will encounter first, most often, and in the most security-sensitive contexts.

What to take from this

The decision boundary in logistic regression is a flat surface defined entirely by a set of weights and a threshold. If you can estimate those weights, you can calculate the shortest path from any input to the other side of the boundary. The model’s linearity, which makes it fast and interpretable, is the same property that makes it geometrically predictable. Every assumption the model makes about its training data is a constraint an attacker can test and exploit.

The next article in this series will cover decision trees, where the decision boundary stops being flat and starts branching. The attack surface changes shape. The principles do not.
