Most antibot systems are a pile of if-statements. Check the user agent. Count the headers. Score the fingerprint. Cross a threshold, get blocked. These checks live server-side - an attacker can’t read them directly. But they don’t need to.
Static rules produce deterministic outputs. Same input, same verdict, every time. An attacker sends a request, gets blocked. Changes one thing, tries again. Still blocked? That variable matters. Not blocked? It doesn’t. Repeat until every check is mapped. The rules are invisible, but the binary nature of the output - block or allow - lets an attacker reverse-engineer them through trial and error.
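That probing loop is mechanical enough to script. A hypothetical sketch - `send()` is an assumed helper that fires a request and reports whether it was blocked, and the "drop one variable at a time" strategy stands in for the attacker's trial and error:

```python
def map_rules(baseline_headers, send):
    """Toggle one header at a time and record which ones flip the verdict.
    `send` is a hypothetical helper: it sends the headers and returns True
    when the server blocks the request."""
    matters = {}
    base_blocked = send(baseline_headers)
    for name in list(baseline_headers):
        probe = {k: v for k, v in baseline_headers.items() if k != name}
        # If removing this header changes the verdict, it is part of a rule.
        matters[name] = send(probe) != base_blocked
    return matters
```

Run against a deterministic rule set, a handful of requests is enough to label every header as "checked" or "ignored" - exactly the feedback loop the rest of this post is about taking away.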
ML doesn’t work like that.
The Spectrum Problem
A rule says “fewer than 5 headers = bot.” It’s a cliff. Four headers: blocked. Five headers: allowed. An attacker finds the edge in two requests.
A neural network outputs a probability. Four headers might produce 0.87. Five headers might produce 0.72. Six headers: 0.41. There’s no cliff. The model evaluates every feature simultaneously, and the output shifts continuously based on combinations the attacker can’t isolate.
Worse for the attacker: changing one variable doesn’t predictably change the output. Adding a header might drop the score from 0.87 to 0.72, or it might do nothing, depending on what other headers are present. The model learned feature interactions, not individual thresholds. There’s no single variable to binary search on.
This is what makes ML harder to debug from the outside. It’s not that the logic is hidden (server-side rules are hidden too). It’s that the logic isn’t decomposable into independent checks.
Adaptability
The other advantage is that ML scales with complexity automatically.
Adding a new signal to a rule-based system means writing new if-statements for every combination. “If TLS fingerprint is wrong AND header order is wrong AND timing is suspicious…” - the number of rules grows combinatorially. You either write a thousand rules or you accept that some combinations slip through.
Adding a new signal to an ML model means adding one feature to the extraction function and retraining. The model figures out how the new signal interacts with everything else on its own. You don’t write the rules. You provide examples and the model learns the decision boundary.
My training takes 2 seconds on a laptop CPU. The model file is 5KB. When attackers change their tooling, I collect new traffic, retrain, and deploy. The model adapts to patterns I never anticipated.
The Setup
I’m working with HTTP headers as features. Not client-side fingerprints - those are collected by JavaScript and can be spoofed before they ever reach the server. Headers are server-side signals: the browser sends them before any client code runs.
The feature extractor pulls 8 signals from each request:
def extract_features(headers_dict, header_order):
    features = {
        'has_user_agent': float('User-Agent' in headers_dict),
        'has_accept': float('Accept' in headers_dict),
        'has_accept_language': float('Accept-Language' in headers_dict),
        'has_accept_encoding': float('Accept-Encoding' in headers_dict),
        'has_connection': float('Connection' in headers_dict),
        'has_sec_ch_ua': float('sec-ch-ua' in headers_dict),
        'header_count': float(len(headers_dict)),
        'header_order_score': float(check_header_order(header_order)),
    }
    return features
Six presence checks, a header count, and an order score. That last one is interesting.
Header Order as a Signal
Browsers send headers in a predictable order. Chrome sends Host, then Connection, then sec-ch-ua, then the rest. This order is baked into the browser’s networking stack - it’s not something a page can control.
Bot libraries don’t follow this order. Python requests sends User-Agent first. Go’s net/http alphabetizes headers. Curl has its own order. Even if a bot spoofs every header value perfectly, the order is wrong.
EXPECTED_CHROME_ORDER = [
    'Host', 'Connection', 'sec-ch-ua', 'sec-ch-ua-mobile',
    'sec-ch-ua-platform', 'Upgrade-Insecure-Requests',
    'User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language'
]
The check_header_order function compares every pair of headers against this expected order and returns a score between 0.0 and 1.0. Chrome gets 1.0. A bot with randomized headers gets something like 0.3.
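A minimal sketch of that pairwise comparison (repeating the expected order so the snippet stands alone; the repo's exact scoring may differ, e.g. by also penalizing missing headers):

```python
from itertools import combinations

EXPECTED_CHROME_ORDER = [
    'Host', 'Connection', 'sec-ch-ua', 'sec-ch-ua-mobile',
    'sec-ch-ua-platform', 'Upgrade-Insecure-Requests',
    'User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language'
]
EXPECTED_RANK = {h: i for i, h in enumerate(EXPECTED_CHROME_ORDER)}

def check_header_order(header_order):
    """Fraction of header pairs that appear in the expected Chrome order."""
    known = [h for h in header_order if h in EXPECTED_RANK]
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0  # nothing recognizable to compare
    correct = sum(1 for a, b in pairs if EXPECTED_RANK[a] < EXPECTED_RANK[b])
    return correct / len(pairs)
```

Chrome's native order scores 1.0, a fully reversed order scores 0.0, and anything in between lands on the continuum the model trains on.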
A rule-based system would say “order score below 0.5 = bot.” The ML model learns its own threshold - and more importantly, it learns how order interacts with the other 7 features in ways that produce a gradient, not a wall.
Training Data
You need labeled data. I built a traffic generator that creates two types of entries:
Bots come in four profiles. Empty headers (raw socket scripts). Minimal headers (python requests defaults). Correct headers but wrong order. And partial headers with random subsets missing. Each one mimics a real class of bot tooling I’ve seen in the wild.
Humans get full Chrome headers with realistic values - proper sec-ch-ua, correct Accept-Language, the right number of headers in the right order.
500 of each, shuffled, dumped to a JSONL file. Each entry is 8 numbers and a label. That’s all the model needs.
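A sketch of such a generator. The specific header values and subset probabilities are my assumptions, not the repo's; and for brevity this version stores the raw header order plus a label in each JSONL line, with feature extraction applied later (the post dumps the 8 extracted features directly):

```python
import json
import random

BOT_PROFILES = ['empty', 'minimal', 'wrong_order', 'partial']
CHROME_ORDER = ['Host', 'Connection', 'sec-ch-ua', 'sec-ch-ua-mobile',
                'sec-ch-ua-platform', 'Upgrade-Insecure-Requests',
                'User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language']

def bot_headers(profile, rng):
    if profile == 'empty':        # raw socket script
        return []
    if profile == 'minimal':      # python-requests defaults
        return ['Host', 'User-Agent', 'Accept-Encoding', 'Accept', 'Connection']
    if profile == 'wrong_order':  # right headers, shuffled order
        order = CHROME_ORDER[:]
        rng.shuffle(order)
        return order
    # 'partial': random subset of headers missing
    return [h for h in CHROME_ORDER if rng.random() > 0.3]

def generate(path, n_per_class=500, seed=0):
    rng = random.Random(seed)
    entries = []
    for _ in range(n_per_class):
        entries.append({'headers': CHROME_ORDER, 'label': 0})  # human
        profile = rng.choice(BOT_PROFILES)
        entries.append({'headers': bot_headers(profile, rng), 'label': 1})  # bot
    rng.shuffle(entries)
    with open(path, 'w') as f:
        for e in entries:
            f.write(json.dumps(e) + '\n')
```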
The Network
Three layers. Small on purpose - 8 features and 1000 samples doesn’t justify anything bigger.
import torch.nn as nn

class BotDetectorNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(8, 16),  # 8 features → 16 neurons
            nn.ReLU(),
            nn.Linear(16, 8),  # 16 → 8
            nn.ReLU(),
            nn.Linear(8, 1),   # 8 → 1 output
            nn.Sigmoid()       # squash to 0-1 probability
        )

    def forward(self, x):
        return self.network(x)
Linear layers are the neurons. ReLU is the activation that lets it learn non-linear patterns. Sigmoid at the end gives a probability between 0 (human) and 1 (bot).
The training loop is textbook PyTorch:
criterion = nn.BCELoss()  # binary cross-entropy fits the sigmoid output
optimizer = torch.optim.Adam(model.parameters())  # optimizer choice assumed

for epoch in range(100):
    model.train()
    predictions = model(X_train).squeeze()
    loss = criterion(predictions, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
Five lines that do all the learning. The model starts random, sees the data 100 times, and gradually adjusts its weights to minimize prediction error.
Results
Epoch | Loss | Accuracy
------------------------------
10 | 0.5584 | 52.5%
20 | 0.3587 | 99.0%
30 | 0.1083 | 100.0%
...
100 | 0.0004 | 100.0%
100% accuracy on the test set by epoch 30. The model is extremely confident - 0.9998 for a bot request, 0.0002 for a human one.
This is partially because the training data is clean and the boundary is sharp. Real-world traffic would be messier. But it demonstrates the pipeline.
Wiring It Into Flask
The model saves to a .pth file - a serialized dictionary of all learned weights. Loading it for inference is three lines:
model = BotDetectorNet()
model.load_state_dict(torch.load('model.pth', weights_only=True))
model.eval()
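The predict() helper isn't shown above. A minimal sketch, written as a factory so the model and feature extractor are passed in explicitly (the repo's predict() presumably closes over module-level versions, and the 0.5 cutoff here is an assumption):

```python
import torch

def make_predict(model, feature_fn, threshold=0.5):
    """Build a predict(headers_dict, header_order) callable around a
    trained model and a feature extraction function."""
    def predict(headers_dict, header_order):
        feats = feature_fn(headers_dict, header_order)
        # Value order must match training time; dicts preserve insertion order.
        x = torch.tensor([list(feats.values())], dtype=torch.float32)
        model.eval()
        with torch.no_grad():
            prob = model(x).item()
        return {'bot_probability': prob, 'is_bot': prob >= threshold}
    return predict
```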
The Flask route calls predict() on every request before any other checks run:
@main.route('/login', methods=['POST'])
def login():
    headers_dict = dict(request.headers)
    header_order = list(request.headers.keys())
    ml_result = predict(headers_dict, header_order)
    if ml_result['is_bot']:
        bot_request_counts[request.remote_addr] += 1
        if bot_request_counts[request.remote_addr] > BOT_THRESHOLD:
            return "Access Denied", 403
Note the threshold. The model flags every bot request, but we don’t block immediately. The first 10 requests from a flagged IP go through. Request 11 gets blocked.
The Threshold Trick
If you block the first bot request, the attacker knows immediately that something triggered. They change one thing, try again, see if it passes. Fast feedback loop. Even with ML’s spectrum advantage, immediate blocking gives them a binary signal to iterate on.
If you let the first 10 through, the attacker thinks they’re fine. Their script works. They scale up. Then request 11 gets blocked - and they didn’t change anything between request 10 and 11. From their perspective, the system is non-deterministic. They can’t even begin to isolate what triggered the block because nothing in their request changed.
bot_request_counts = defaultdict(int)
BOT_THRESHOLD = 10

if ml_result['is_bot']:
    bot_request_counts[request.remote_addr] += 1
    if bot_request_counts[request.remote_addr] > BOT_THRESHOLD:
        return "Access Denied", 403
This compounds the spectrum advantage. The model’s continuous probability makes it hard to isolate what triggered detection. The delayed threshold makes it hard to isolate when. Together, the attacker has almost nothing to work with.
Testing
Curl sends 4 headers with no sec-ch-ua and wrong order. Chrome sends 10+ with correct order and full client hints. The model distinguishes them trivially:
curl: bot_probability: 0.9345, is_bot: True
Chrome: bot_probability: 0.0, is_bot: False
Running a loop from the terminal shows the threshold:
Request 1-10: 400 (passed ML, failed downstream validation)
Request 11-15: 403 (ML threshold kicked in)
Meanwhile, submitting the login form from Chrome returns 200 - Welcome test!. The model correctly identifies the browser, threshold never triggered.
ML + Rules
The ML model runs alongside the existing rule-based BotDetector. They complement each other.
Rules catch known patterns with certainty. If navigator.webdriver is true, that’s a bot. No ambiguity, no training data needed, no probability. It’s a fact.
ML catches unknown patterns through learned correlations. A stealth bot that gets each individual signal close enough to pass rules, but gets the combination slightly wrong - the model picks up on these because it evaluates the full feature vector, not each dimension independently.
The combination means an attacker has to beat both systems. Rules give you a hard floor - known bot signatures are always caught. ML gives you coverage above that floor, catching things you never wrote rules for, producing verdicts that can’t be decomposed into “which check did I fail.”
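The layering can be sketched as a small dispatch function. Everything here is hypothetical glue - `rules` and `ml_predict` stand in for the rule-based BotDetector and the model wrapper, and the delayed threshold from earlier is reused:

```python
from collections import defaultdict

def check_request(headers_dict, header_order, ip, rules, ml_predict,
                  counts, threshold=10):
    """Hard rules give certain verdicts; the ML score feeds the
    delayed per-IP threshold instead of blocking immediately."""
    if rules(headers_dict):  # known bot signature: block with certainty
        return 'block'
    if ml_predict(headers_dict, header_order)['is_bot']:
        counts[ip] += 1
        if counts[ip] > threshold:  # delayed block, as above
            return 'block'
    return 'allow'
```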
Limitations
The 100% accuracy is misleading. The training data has a clean boundary - Chrome humans vs obvious bots. Real traffic includes Firefox (no sec-ch-ua), Safari (different header set), mobile browsers, legitimate API clients, search engine crawlers. A production model needs much more diverse training data and would land somewhere in the 95-98% range.
The feature set is minimal. 8 header signals demonstrate the concept, but production systems would add TLS fingerprinting (JA3/JA4 hashes), request timing patterns, endpoint access patterns, and behavioral signals across sessions. Each new feature makes the model's decision boundary higher-dimensional. More dimensions mean more combinations the attacker needs to get right simultaneously - and each one they can't isolate independently.
And the threshold needs persistence. Right now bot_request_counts is an in-memory dict that resets when the server restarts. Production would use Redis, and would track per-fingerprint rather than just per-IP.
What I Learned
The ML model itself is the easy part. Three layers, five-line training loop, 2 seconds to train. The hard part is the data pipeline - generating representative training data, labeling correctly, knowing when to retrain.
But the insight that stuck: the value of ML in antibot isn’t accuracy. Rules can be just as accurate for known patterns. The value is that ML produces gradients where rules produce cliffs. An attacker probing a rule-based system gets clean binary feedback that maps directly to individual checks. An attacker probing an ML system gets a continuous signal that reflects the entire feature space at once. The debugging surface is fundamentally different - not because the logic is hidden, but because the logic isn’t separable.
That’s not a theoretical advantage. That’s the difference between an attacker who maps your defenses in an afternoon and one who’s still guessing after a week.
Source: github.com/B9ph0met/antibot-sim