{"id":431,"date":"2026-05-01T00:00:00","date_gmt":"2026-04-30T23:00:00","guid":{"rendered":"https:\/\/kosokoking.com\/?p=431"},"modified":"2026-04-26T17:51:35","modified_gmt":"2026-04-26T16:51:35","slug":"neural-networks","status":"publish","type":"post","link":"https:\/\/kosokoking.com\/index.php\/technology\/neural-networks\/","title":{"rendered":"Neural networks"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">A single perceptron draws one line through your data and calls it a decision. Stack a few hundred of them into layers, wire the outputs of one layer into the inputs of the next, and suddenly the system can learn to recognise faces, classify malware, or flag fraudulent transactions with accuracy that no hand-crafted ruleset can match. The cost is that the same mechanism the network uses to learn (following gradients) is the mechanism an attacker uses to reverse-engineer its behaviour, steal its training data, and craft inputs that make it fail on command. The previous article in this series covered the perceptron: one neuron, one weighted vote, one linear boundary. This article covers what happens when you connect thousands of them. The arithmetic stays the same as the architecture gets deeper, but the attack surface expands with every new layer.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">From perceptron to network<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A multi-layer perceptron connects layers of individual units in sequence, and those layers fall into three kinds: input, hidden, and output.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The&nbsp;<strong>input layer<\/strong>&nbsp;receives the raw data. Each node in the input layer corresponds to a single feature. If you are classifying network packets with 40 features, the input layer has 40 nodes. It performs no computation. 
It passes data forward.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The&nbsp;<strong>hidden layers<\/strong>&nbsp;sit between input and output. Each neuron in a hidden layer receives every output from the previous layer, multiplies each by a learned weight, sums the results, adds a bias, and passes the total through an activation function. The output of that activation function becomes the input to the next layer. This is the same weighted-sum-plus-activation operation the perceptron performs, repeated across every neuron in every layer.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The&nbsp;<strong>output layer<\/strong>&nbsp;produces the final prediction. For binary classification (malicious or benign), one neuron with a sigmoid activation outputs a probability between 0 and 1. For multi-class classification (categorising malware into families), you get one output neuron per class, with a softmax function converting the raw scores into a probability distribution that sums to 1.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The single most consequential difference between a perceptron and a neural network is the hidden layers. A perceptron can only learn linear boundaries: straight lines, flat planes. Add even one hidden layer of neurons with non-linear activation functions, and the network can learn curved, folded, arbitrarily complex decision boundaries. Given enough neurons, a single hidden layer is already sufficient to approximate any continuous function on a bounded input region to any desired accuracy. This is the universal approximation theorem, and it is the reason neural networks displaced simpler models in tasks where the data is complex enough to warrant them. In practice, additional layers are added because depth usually reaches the same accuracy with far fewer neurons, not because one layer is theoretically insufficient.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why activation functions are not optional<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Remove the activation functions from a neural network and every layer becomes a linear transformation of the previous one. 
A chain of linear transformations collapses into a single linear transformation. Your deep network becomes a perceptron with extra steps and no additional representational power.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Activation functions are the mechanism that makes depth useful. Each one introduces a non-linearity that allows the network to bend its decision boundary in a new way. The choice of activation function has direct consequences for how the network trains, where it fails, and how an adversary can exploit those failure modes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Sigmoid<\/strong>&nbsp;squashes any input into the range (0, 1). It was the default activation for decades. The problem is that for large positive or negative inputs, the sigmoid function&#8217;s gradient approaches zero. During training, the gradient is the signal that tells each weight how to update. When the gradient vanishes, learning stops. In deep networks, this &#8220;vanishing gradient&#8221; problem compounds across layers: early layers receive almost no learning signal, and the network trains slowly or stalls entirely. From an adversarial perspective, sigmoid-saturated neurons are frozen neurons. An attacker who can push inputs into the saturation region can effectively disable parts of the network.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>ReLU<\/strong>\u00a0(Rectified Linear Unit) returns the input if it is positive and zero otherwise. It is computationally cheap and largely solved the vanishing gradient problem for positive inputs, which is why it became the default activation in modern networks. Its own failure mode is the dying ReLU problem: if a neuron&#8217;s weights shift so that its weighted sum is negative for every input it sees, the neuron outputs zero and receives a zero gradient, so its weights stop updating. That neuron is dead. It will never recover. 
In large networks, a meaningful fraction of neurons can die during training, reducing the network&#8217;s effective capacity without any visible error.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Tanh<\/strong>&nbsp;squashes inputs into the range (-1, 1). It is centred at zero, which makes optimisation somewhat easier than sigmoid, but it suffers from the same vanishing gradient problem at the extremes. It appears most frequently in recurrent architectures and sequence models.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Softmax<\/strong>&nbsp;is not used in hidden layers. It is an output-layer function for multi-class classification. It takes a vector of raw scores (one per class) and converts them into probabilities that sum to 1. The highest probability becomes the prediction. For a red teamer, the critical detail is that softmax outputs are probabilities over all classes, which means that even when a model is confident about its prediction, the probability distribution over alternative classes leaks information about how close the input was to other decision boundaries.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How the network learns<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Training a neural network is an optimisation problem. The network makes a prediction, a loss function measures how wrong that prediction is, and the training algorithm adjusts every weight in every layer to make the next prediction slightly less wrong. The two mechanisms that accomplish this are backpropagation and gradient descent.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Forward pass<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The input data flows through the network, layer by layer. Each neuron performs its weighted sum, adds its bias, applies its activation function, and passes the result forward. 
The forward pass concludes at the output layer when the network generates its final prediction.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Loss calculation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A loss function compares the network&#8217;s prediction against the true label. For binary classification, binary cross-entropy is standard. For multi-class, categorical cross-entropy. For regression, mean squared error. The loss function produces a single number that quantifies how wrong the prediction is. The entire goal of training is to minimise this number.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Backward pass (backpropagation)<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Backpropagation computes the gradient of the loss function with respect to every weight and bias in the network, working backwards from the output layer to the input layer. It uses the chain rule from calculus: the gradient at each layer is the product of the local gradient (how much that layer&#8217;s output changes when its weights change) and the gradient flowing back from the layer above (how much the loss changes when that layer&#8217;s output changes).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is where the architecture&#8217;s vulnerability crystallises. The gradient at each weight tells you exactly how much that weight contributed to the error. It is a precise map of the network&#8217;s sensitivity. If you are the model&#8217;s trainer, you use this map to improve the model. If you are an attacker with access to the same gradient information, you use it to craft adversarial inputs, infer training data, or reverse-engineer the model&#8217;s behaviour.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Gradient descent<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Gradient descent uses the gradients computed by backpropagation to update the weights. The update rule is simple: subtract a fraction of the gradient from each weight. 
That fraction is the learning rate, a hyperparameter that controls how aggressively the weights change on each step.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>weight_new = weight_old - learning_rate * gradient\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">A learning rate that is too high causes the weights to overshoot the minimum and oscillate or diverge. A learning rate that is too low makes training crawl and can leave the weights stuck in a poor local minimum. In practice, adaptive optimisers like Adam adjust the effective learning rate per-parameter during training, but the core loop is the same: compute gradient, step in the opposite direction, repeat.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The network trains by repeating this cycle (forward pass, loss, backward pass, weight update) across thousands or millions of examples. Over time, the weights converge toward values that minimise the loss on the training data. Whether those weights generalise to new data is a separate problem entirely, and one that defines most of the practical failure modes in production ML.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What this means for a red teamer<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Every attack class in adversarial machine learning traces back to a property of this training loop. Understanding the loop is understanding the attack surface.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Adversarial examples exploit the gradient directly.<\/strong>&nbsp;The same backpropagation pass that trains the network can compute the gradient of the loss with respect to the input instead of the weights. Instead of asking &#8220;how should I change the weights to reduce the loss?&#8221;, the attacker asks &#8220;how should I change the input to maximise the loss?&#8221; The answer is a small, often imperceptible perturbation to the input that causes the network to misclassify with high confidence. This is the Fast Gradient Sign Method (FGSM), published by Goodfellow, Shlens, and Szegedy in 2015. 
The attack requires one forward pass, one backward pass, and one line of arithmetic. The entire mechanism is a consequence of gradient-based learning.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Model inversion exploits the output layer.<\/strong>&nbsp;If the output layer uses softmax, it produces a full probability distribution over all classes. Those probabilities leak information. Fredrikson, Jha, and Ristenpart demonstrated in 2015 that an attacker with API access to a model&#8217;s predictions can iteratively reconstruct approximations of the training data by optimising inputs to maximise the model&#8217;s confidence for a target class. The network does not return its weights. It returns enough gradient-adjacent information in its outputs to reverse-engineer what it learned.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Membership inference exploits overfitting.<\/strong>&nbsp;A network that has memorised its training data behaves differently on examples it has seen versus examples it has not. Shokri, Stronati, Song, and Shmatikov showed in 2017 that an attacker can train a separate &#8220;shadow model&#8221; on similar data, observe how its confidence scores differ between training and non-training examples, and use that signal to determine whether a specific data point was in the target model&#8217;s training set. The attack works because gradient descent, left unchecked, memorises individual training examples rather than learning only the underlying pattern.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gradient inversion reconstructs training data from gradients.<\/strong>&nbsp;In federated learning, participants share gradient updates instead of raw data, on the assumption that gradients are privacy-preserving. They are not. Zhu, Liu, and Han demonstrated in 2019 that an attacker who observes the gradients can reconstruct the original training inputs by solving an optimisation problem: find the input that would have produced this gradient. 
The quality of the reconstruction is startlingly high.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every one of these attacks is a direct consequence of the training loop described above. The network learns by gradient. It is attacked by gradient. The symmetry is structural.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The opacity problem<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A decision tree (covered in article five of this series) tells you its reasoning in full. A neural network does not. Once trained, the learned weights encode the model&#8217;s behaviour, but interpreting what a specific weight &#8220;means&#8221; in a network with millions of parameters is effectively impossible for any human. The model is a black box.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For defenders, this opacity is a liability. You cannot audit what you cannot read. A poisoned training example that shifts the model&#8217;s boundary in a targeted direction will not announce itself in the weights. A backdoor trigger (a specific input pattern that causes the model to output a chosen class) is embedded in the weight distribution and is invisible to inspection unless you know exactly what to search for.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For attackers, the opacity is useful. Backdoors planted during training persist through standard validation and testing. The defender has no equivalent of &#8220;reading the source code&#8221; for a neural network. Interpretability tools like SHAP and LIME offer approximate local explanations, but they describe what the model does near a specific input, not what the model does in general. They are flashlights, not floodlights.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Defence considerations<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Defending neural networks is harder than defending the simpler models covered earlier in this series, precisely because the attack surface grows with model complexity. 
There are no silver bullets, but there are specific practices that reduce exposure.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Adversarial training<\/strong>&nbsp;incorporates adversarial examples into the training set, forcing the model to learn decision boundaries that are robust to small perturbations. It works, but it is expensive (each training step requires generating adversarial examples on the fly) and it trades clean accuracy for robustness. The defender pays a performance cost for every perturbation budget they want to resist.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gradient masking<\/strong>&nbsp;attempts to hide or obfuscate the gradients the model produces, making gradient-based attacks harder to execute. It is a brittle defence. Carlini and Wagner demonstrated in 2017 that gradient masking can be bypassed by using a substitute model to compute approximate gradients, or by using gradient-free optimisation methods. Masking moves the attack surface rather than eliminating it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Differential privacy<\/strong>&nbsp;adds calibrated noise to gradients during training, providing mathematical guarantees about how much any single training example can influence the final model. It is the strongest formal defence against membership inference and gradient inversion. The trade-off is that the noise reduces model accuracy, and the privacy budget must be carefully managed across the entire training process.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Output perturbation<\/strong>\u00a0adds noise to the model&#8217;s output predictions, reducing the information available to an attacker performing model inversion through the API. 
This is straightforward to implement but must be calibrated: too little noise and the attack still works; too much and the model&#8217;s outputs become unreliable for legitimate users.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How neural networks learn through backpropagation, and why that same gradient mechanism powers adversarial examples, model inversion, and training data theft.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[668,630,109,687,51,656,693,136,638,137],"class_list":["post-431","post","type-post","status-publish","format-standard","hentry","category-technology","tag-adversarial-machine-learning","tag-ai-red-teaming","tag-ai-security","tag-backpropagation","tag-cybersecurity","tag-deep-learning","tag-gradient-descent","tag-machine-learning","tag-model-security","tag-neural-networks"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/431","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/comments?post=431"}],"version-history":[{"count":2,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/431\/revisions"}],"predecessor-version":[{"id":433,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/431\/revisions\/433"}],"wp:attachment":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/media?parent=431"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/categories?post=431"},{"taxonomy":"
post_tag","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/tags?post=431"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}