{"id":453,"date":"2026-05-09T00:00:00","date_gmt":"2026-05-08T23:00:00","guid":{"rendered":"https:\/\/kosokoking.com\/?p=453"},"modified":"2026-05-02T20:46:55","modified_gmt":"2026-05-02T19:46:55","slug":"python-libraries-for-ai-red-teaming","status":"publish","type":"post","link":"https:\/\/kosokoking.com\/index.php\/technology\/python-libraries-for-ai-red-teaming\/","title":{"rendered":"Python libraries for AI red teaming"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Every algorithm we have covered in this series, from linear regression through to SARSA, eventually needs to run somewhere. Theory gets you the intuition, but the moment you want to train a classifier, poison a dataset, or craft an adversarial input, you need code. Two Python libraries dominate the space, and understanding their APIs is the difference between reading about attacks and executing them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The two-library split<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The AI tooling ecosystem has settled into a clean division of labour. Scikit-learn handles classical machine learning, the algorithms that operate on structured, tabular data and produce interpretable models. PyTorch handles deep learning, where models are large neural networks trained on raw signals like images, audio, and text. If you have worked through this series in order, every algorithm from entries 3 through 9 (supervised learning, linear regression, logistic regression, decision trees, anomaly detection, SVMs, ensemble methods) maps directly to scikit-learn. The reinforcement learning entries (Q-learning, SARSA) lean more towards PyTorch territory, though simple tabular implementations can live in pure NumPy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For a red teamer, both libraries are operational necessities. 
Scikit-learn is where you build a malicious classifier to sort exfiltrated data, train a model to evade a spam filter, or replicate a target&#8217;s decision boundary from stolen predictions. PyTorch is where you craft adversarial examples against image classifiers, run gradient-based evasion attacks, or extract a neural network&#8217;s weights through carefully chosen queries. You will use both, often in the same engagement.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Scikit-learn: the classical ML workhorse<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Scikit-learn is built on NumPy, SciPy, and Matplotlib, and it provides a consistent API across every algorithm it implements. That consistency is the library&#8217;s most important property from an operational standpoint, because once you learn the pattern for one model, you can swap in any other without changing the surrounding code.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The API follows a three-step cycle. You instantiate a model, call&nbsp;<code>fit()<\/code>&nbsp;with training data, and call&nbsp;<code>predict()<\/code>&nbsp;on new inputs.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from sklearn.linear_model import LogisticRegression\n\nmodel = LogisticRegression(C=1.0)\nmodel.fit(X_train, y_train)\ny_pred = model.predict(X_test)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Swap&nbsp;<code>LogisticRegression<\/code>&nbsp;for&nbsp;<code>RandomForestClassifier<\/code>,&nbsp;<code>SVC<\/code>, or&nbsp;<code>GradientBoostingClassifier<\/code>, and the rest of the code stays identical. This uniformity matters when you are iterating through multiple model types during an attack, testing which architecture best replicates a target model&#8217;s behaviour or which classifier most reliably evades a detection system.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Preprocessing as attack surface<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Before data reaches a model, it passes through preprocessing. 
Scikit-learn provides scalers, encoders, and imputers for this step, and each one is a potential point of manipulation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>StandardScaler<\/code>&nbsp;removes the mean and scales features to unit variance.&nbsp;<code>MinMaxScaler<\/code>&nbsp;compresses values into a fixed range, 0 to 1 by default.&nbsp;<code>RobustScaler<\/code>&nbsp;uses the median and interquartile range, making it resistant to outliers. For a red teamer running a data poisoning attack, knowing which scaler the target pipeline uses tells you how much you can shift a feature&#8217;s distribution before the perturbation gets normalised away.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from sklearn.preprocessing import StandardScaler\n\nscaler = StandardScaler()\nX_scaled = scaler.fit_transform(X)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Categorical encoding is equally relevant.&nbsp;<code>OneHotEncoder<\/code>&nbsp;expands categories into binary columns,&nbsp;<code>OrdinalEncoder<\/code>&nbsp;maps feature categories to integers, and&nbsp;<code>LabelEncoder<\/code>&nbsp;does the same for target labels. If you are injecting poisoned samples into a training set, knowing the encoding scheme tells you exactly which feature columns to target and what values are valid.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Missing value handling introduces another angle.&nbsp;<code>SimpleImputer<\/code>&nbsp;fills gaps with a strategy like mean, median, or most frequent value.&nbsp;<code>KNNImputer<\/code>&nbsp;uses k-nearest neighbours to estimate missing entries. 
An attacker who understands the imputation strategy can craft incomplete inputs that, once imputed, land in a specific region of feature space.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Model selection and evaluation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Scikit-learn&#8217;s&nbsp;<code>train_test_split<\/code>&nbsp;divides data into training and testing subsets.&nbsp;<code>cross_val_score<\/code>&nbsp;runs k-fold cross-validation, training and evaluating the model across multiple data splits for a more reliable performance estimate.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from sklearn.model_selection import train_test_split, cross_val_score\n\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\nscores = cross_val_score(model, X, y, cv=5)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Evaluation metrics like&nbsp;<code>accuracy_score<\/code>,&nbsp;<code>precision_score<\/code>,&nbsp;<code>recall_score<\/code>, and&nbsp;<code>f1_score<\/code>&nbsp;quantify how well a model performs. In an adversarial context, these same metrics measure the effectiveness of your attack. If you are running a model extraction attack, the F1 score between your stolen model&#8217;s predictions and the target&#8217;s predictions tells you how faithful the copy is. If you are running an evasion attack, the drop in the target&#8217;s recall on your crafted inputs measures your success rate.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">PyTorch: deep learning and gradient access<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">PyTorch, developed originally by Meta&#8217;s AI research team, is the framework of choice for deep learning. Where scikit-learn abstracts away the internals behind a clean&nbsp;<code>fit<\/code>\/<code>predict<\/code>&nbsp;interface, PyTorch gives you direct access to the computational graph, the gradients, and every weight in the network. 
That low-level access is precisely what makes it the primary tool for adversarial machine learning.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Tensors and GPU acceleration<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The fundamental data structure in PyTorch is the tensor, a multi-dimensional array similar to a NumPy array but with two additional capabilities that matter for offensive work. First, tensors can run on GPUs, which accelerates the kind of iterative optimisation loops that adversarial attacks require. Second, tensors track their computational history, enabling automatic gradient calculation.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import torch\n\n# Create the tensor directly on the target device so it stays a leaf tensor\n# and x.grad is populated after backward()\ndevice = 'cuda' if torch.cuda.is_available() else 'cpu'\nx = torch.tensor(&#91;1.0, 2.0, 3.0], device=device, requires_grad=True)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">That&nbsp;<code>requires_grad=True<\/code>&nbsp;flag is the key. It tells PyTorch to record every operation performed on this tensor so that gradients can be computed later through backpropagation. When you craft an adversarial example against an image classifier, you set&nbsp;<code>requires_grad=True<\/code>&nbsp;on the input image, run it through the model, compute the loss against the true label or a chosen target class, and then use the gradient of that loss with respect to the input to perturb the image in the direction that pushes the model towards misclassification. The entire attack flows from this single flag.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Building models<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">PyTorch provides two ways to define neural networks. 
The&nbsp;<code>Sequential<\/code>&nbsp;API stacks layers linearly, which works for straightforward architectures.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import torch.nn as nn\n\nmodel = nn.Sequential(\n    nn.Linear(784, 128),\n    nn.ReLU(),\n    nn.Linear(128, 10)  # raw logits: nn.CrossEntropyLoss applies log-softmax internally\n)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">For anything more complex, you subclass&nbsp;<code>nn.Module<\/code>&nbsp;and define the forward pass explicitly. This is the pattern you will see in most real-world models and most adversarial research code.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>class CustomModel(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.layer1 = nn.Linear(784, 128)\n        self.relu = nn.ReLU()\n        self.layer2 = nn.Linear(128, 10)\n\n    def forward(self, x):\n        x = self.relu(self.layer1(x))\n        return self.layer2(x)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Understanding the&nbsp;<code>forward()<\/code>&nbsp;method matters because adversarial attacks often hook into intermediate layers. If you want to run a feature-space attack or extract internal representations from a model, you need to know where to tap in.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The training loop<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Unlike scikit-learn&#8217;s single&nbsp;<code>fit()<\/code>&nbsp;call, PyTorch requires you to write the training loop explicitly. 
This verbosity is a feature, not a limitation, because every step of the loop is a point where an adversary can intervene.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>optimizer = torch.optim.Adam(model.parameters(), lr=0.001)\nloss_fn = nn.CrossEntropyLoss()\n\nfor epoch in range(10):\n    for x_batch, y_batch in dataloader:\n        y_pred = model(x_batch)\n        loss = loss_fn(y_pred, y_batch)\n\n        optimizer.zero_grad()\n        loss.backward()\n        optimizer.step()\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The forward pass computes predictions. The loss function measures the error.&nbsp;<code>loss.backward()<\/code>&nbsp;computes gradients through the entire network.&nbsp;<code>optimizer.step()<\/code>&nbsp;updates the weights. A poisoning attack can manipulate the data that enters the forward pass. A backdoor attack can modify the loss function to optimise for a hidden trigger alongside the legitimate objective. A model inversion attack can repurpose the gradient computation to reconstruct training data from the model&#8217;s weights.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Data loading and model persistence<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">PyTorch&#8217;s&nbsp;<code>Dataset<\/code>&nbsp;and&nbsp;<code>DataLoader<\/code>&nbsp;classes manage batching and shuffling. 
The&nbsp;<code>Dataset<\/code>&nbsp;subclass defines how individual samples are accessed, while&nbsp;<code>DataLoader<\/code>&nbsp;wraps it with batch sizing and parallel loading.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from torch.utils.data import Dataset, DataLoader\n\nclass CustomDataset(Dataset):\n    def __init__(self, data, labels):\n        self.data = data\n        self.labels = labels\n\n    def __len__(self):\n        return len(self.data)\n\n    def __getitem__(self, idx):\n        return self.data&#91;idx], self.labels&#91;idx]\n\ndataloader = DataLoader(CustomDataset(data, labels), batch_size=32, shuffle=True)\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Model saving and loading use&nbsp;<code>torch.save()<\/code>&nbsp;and&nbsp;<code>torch.load()<\/code>. A saved model file (<code>.pth<\/code>) contains the learned weights, and loading it into a compatible architecture restores the model completely. From a red team perspective, if you can access a saved model file, you have the model. You can run it locally, inspect every weight, compute gradients, and craft attacks at your leisure without making a single query to the target system.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>torch.save(model.state_dict(), 'model.pth')\n\nmodel = CustomModel()\nmodel.load_state_dict(torch.load('model.pth'))\nmodel.eval()\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Where the two libraries meet the adversary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/github.com\/Trusted-AI\/adversarial-robustness-toolbox\" title=\"\">IBM&#8217;s Adversarial Robustness Toolbox (ART)<\/a> is worth knowing about because it wraps both scikit-learn and PyTorch models with a unified interface for running evasion, poisoning, extraction, and inference attacks. 
ART does not replace the need to understand the underlying libraries, but it does give you pre-built implementations of attacks like <a href=\"https:\/\/arxiv.org\/pdf\/1412.6572\" title=\"\">FGSM<\/a>, <a href=\"https:\/\/arxiv.org\/abs\/1706.06083\" title=\"\">PGD<\/a>, and <a href=\"https:\/\/arxiv.org\/abs\/1608.04644\" title=\"\">Carlini-Wagner<\/a> that you can point at a target model and fire.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The pattern in practice tends to look like this. You use scikit-learn when the target is a classical ML system, a fraud detection model running logistic regression, a spam filter using random forests, or an anomaly detector built on isolation forests. You use PyTorch when the target is a neural network, an image classifier, a natural language model, or any system where gradient access unlocks the attack.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Knowing both libraries also matters for model extraction attacks, where you query a target system&#8217;s API, collect input-output pairs, and train a local surrogate model that approximates the target. If the target is a simple classifier, your surrogate lives in scikit-learn. If the target is a deep network, your surrogate lives in PyTorch. 
Either way, the quality of your extraction depends on understanding the training API well enough to iterate quickly.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python Libraries: How scikit-learn and PyTorch work, and why their APIs are the operational foundation for adversarial machine learning.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[668,630,51,635,656,662,648,726,723,738],"class_list":["post-453","post","type-post","status-publish","format-standard","hentry","category-technology","tag-adversarial-machine-learning","tag-ai-red-teaming","tag-cybersecurity","tag-data-poisoning","tag-deep-learning","tag-machine-learning-security","tag-model-extraction","tag-python","tag-pytorch","tag-scikit-learn"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/453","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/comments?post=453"}],"version-history":[{"count":1,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/453\/revisions"}],"predecessor-version":[{"id":454,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/posts\/453\/revisions\/454"}],"wp:attachment":[{"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/media?parent=453"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/categories?post=453"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kosokoking.com\/index.php\/wp-json\/wp\/v2\/tags?po
st=453"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}