Reversing TikTok's Captcha Encryption

After reversing TikTok’s web protection and building a JS VM for antibot, I wanted to tackle something more self-contained: their captcha system. TikTok uses BDTuring (ByteDance Turing) which serves several captcha types - slide puzzles, 3D rotation, icon matching. The one I went after is “whirl” - a rotation captcha where you drag a slider to align an inner circle with an outer ring.

The plan: reverse the protocol, figure out the encryption, understand how the images work, and map out what it would take to solve these programmatically. The encryption turned out to be the interesting part. What the server actually validates turned out to be more than just the angle.

The Protocol

Open TikTok, trigger a login flow, and the captcha loads via the BDTuring SDK (captcha-ttp.0a4bb10f.js). Two requests matter:

GET https://verification-sg.tiktok.com/captcha/get fetches the challenge. POST https://verification-sg.tiktok.com/captcha/verify submits the answer. Both carry query parameters - device ID, fingerprint token, SDK version, screen dimensions. Standard telemetry.

The interesting part: both the response and submission are wrapped in a field called edata. Not JSON. Not plaintext. A base64 blob.

{
  "edata": "AfOFlN/RrFePd1K/GvGrAslmaw2dXgZCFZPTZXCUbEhitHJaZ1HO/QuIdbqk...",
  "data": {
    "verify_id": "Verify_d55d6b4f-396c-4115-aa87-aa86a8b0cfac"
  }
}

No plaintext challenge data visible in the response. Everything meaningful goes through edata. So step one was figuring out the encryption.

Catching the Plaintext

Standard approach - hook JSON.stringify and TextEncoder.encode to catch data before it enters the crypto layer. The JSON.stringify hook looked like this:

const origStringify = JSON.stringify;
JSON.stringify = function() {
    const result = origStringify.apply(this, arguments);
    if (typeof result === 'string' && result.includes('"mode":"whirl"')) {
        console.log('PLAINTEXT:', result.substring(0, 500));
        debugger;
    }
    return result;
};

This caught the solve payload:

{
  "modified_img_width": 348,
  "id": "9f4770d8416d65...",
  "mode": "whirl",
  "reply": [
    {"x": 6, "y": 0, "relative_time": 17},
    {"x": 23, "y": 0, "relative_time": 19}
  ],
  "drag_width": 348,
  "verify_id": "Verify_0fca0b88-...",
  "version": 2
}

The reply array is the mouse trajectory. x is the slider position, relative_time is milliseconds from drag start. The final x value determines the rotation angle. Stepping through the call stack from there, I landed in TikTok’s VM bytecode interpreter. The encryption function was Et, dispatching to opcode 49 through the VM’s ht.v() function. In the scope I found a ct object:

ct: {key: Uint8Array(32), nonce: Uint8Array(12), reset: f}

32-byte key, 12-byte nonce. I assumed AES-256-GCM and built my first implementation around that. It wasn’t AES.

Per-Session Keys

The key in ct worked for the solve-side encryption, but when I tried to decrypt challenge responses with the same key, every attempt failed with “MAC check failed”, regardless of which payload I tried or how large it was.

I set up a Uint32Array proxy to catch key material being loaded into the crypto engine:

const OrigU32 = Uint32Array;
Uint32Array = new Proxy(OrigU32, {
    construct(target, args) {
        const result = Reflect.construct(target, args);
        if (result.length === 8) {  // 32 bytes = 8 x Uint32
            console.log('KEY CANDIDATE:', Array.from(new Uint8Array(result.buffer)));
            console.trace();
        }
        return result;
    }
});

Loaded two captcha sessions. Different keys each time. Keys are generated per session, not hardcoded.

Not AES

The Uint32Array proxy triggered inside function i(e, t, n) in the VM. Paused the debugger and looked at the source:

function i(e, t, n) {
    r(32 === t.byteLength),
    r(8 === e.byteLength || 12 === e.byteLength),
    // ...
    this.state = new Uint32Array(16);
    for (let r = 0; r < 4; r++)
        this.state[r] = o[r];       // constants
    for (let r = 0; r < 8; r++)
        this.state[4 + r] = a[r];   // key

That o array: [1634760805, 857760878, 2036...]. In hex: 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574. Read as little-endian bytes, that spells “expand 32-byte k” in ASCII.

The ChaCha20 constants. Not AES at all.
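A quick way to confirm the identification is to pack the four words as little-endian 32-bit integers and read the bytes back as ASCII. This is a minimal sketch; the helper name is mine, and the words are taken from the hex values quoted above because the debugger output truncates the decimal form.

```python
import struct

# The four constant words, as hex (from the text above).
SIGMA_WORDS = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]

def words_to_ascii(words):
    """Pack 32-bit words little-endian and decode the resulting bytes as ASCII."""
    return struct.pack(f"<{len(words)}I", *words).decode("ascii")

print(words_to_ascii(SIGMA_WORDS))       # expand 32-byte k
print(SIGMA_WORDS[0], SIGMA_WORDS[1])    # 1634760805 857760878
```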

The Key Was in the Blob

With the debugger paused in the g frame one level up from the crypto function, the full data flow was visible:

g: Array(7)
  2: Uint8Array(8099) [1, 157, 44, 93, 94, 247, ...]     // raw edata bytes
  3: Uint8Array(32)   [157, 44, 93, 94, 247, ...]         // KEY
  4: Uint8Array(12)   [81, 76, 129, 47, 199, ...]         // NONCE
  5: Uint8Array(8054) [10, 4, 197, 222, ...]              // ciphertext
  6: Uint8Array(8054) [0, 0, 0, 0, ...]                   // output buffer

The key starts at byte 1 of the raw edata. The nonce starts at byte 33. The ciphertext starts at byte 45. The math: 8099 - 1 (version) - 32 (key) - 12 (nonce) = 8054 (ciphertext).

No key derivation. No key exchange. The encryption key is embedded in every message:

[0x01] [32-byte key] [12-byte nonce] [ciphertext]
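The layout can be checked with a few lines of slicing. The sketch below uses a synthetic blob of the same total size as the captured one (the real bytes are not reproduced here), and split_edata is a hypothetical helper name. As a sanity check, the first eight base64 characters of the sample edata shown earlier decode to bytes that start with the 0x01 version marker.

```python
import base64

KEY_LEN, NONCE_LEN = 32, 12
HEADER_LEN = 1 + KEY_LEN + NONCE_LEN  # version byte + key + nonce = 45

def split_edata(raw: bytes):
    """Split a raw edata blob into (version, key, nonce, ciphertext)."""
    if len(raw) < HEADER_LEN:
        raise ValueError("blob is shorter than the 45-byte header")
    version = raw[0]
    key = raw[1:1 + KEY_LEN]
    nonce = raw[1 + KEY_LEN:HEADER_LEN]
    return version, key, nonce, raw[HEADER_LEN:]

# Synthetic stand-in with the same size as the captured blob (8099 bytes).
blob = bytes([1]) + bytes(range(32)) + bytes(12) + bytes(8054)
version, key, nonce, ct = split_edata(blob)
print(version, len(key), len(nonce), len(ct))   # 1 32 12 8054

# Prefix of the sample edata from the response shown earlier.
print(base64.b64decode("AfOFlN/R")[0])          # 1
```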

Not Poly1305 Either

Knowing the algorithm and the layout, I wrote the decryption with ChaCha20_Poly1305 from pycryptodome. MAC check failed. Tried the tag at the beginning instead of the end. Failed. Tried different offsets. Failed.

Then I tried raw ChaCha20 without any authentication:

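import base64
import json

from Crypto.Cipher import ChaCha20

def decrypt(edata_b64):
    raw = base64.b64decode(edata_b64)
    key   = raw[1:33]    # 32-byte key follows the version byte
    nonce = raw[33:45]   # 12-byte nonce
    ct    = raw[45:]     # the rest is ciphertext, with no tag
    cipher = ChaCha20.new(key=key, nonce=nonce)
    pt = cipher.decrypt(ct)
    return json.loads(pt)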

b'{"code":200,"data":{"challenges":[{"challenge_code":99996...

No authentication. Raw ChaCha20 stream cipher. The JS code had Poly1305 state initialization, but either it wasn’t being used or the tag was being discarded. ChaCha20 worked, ChaCha20_Poly1305 didn’t.
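To make “raw ChaCha20” concrete, here is a dependency-free sketch of the cipher following the RFC 8439 state layout, which matches what the JS snippet above does: constants in words 0-3, key in words 4-11, then a block counter and the 12-byte nonce. This is an illustration, not TikTok's code, and the starting counter of 0 is an assumption. The output is exactly as long as the input, so there is nowhere for a Poly1305 tag to live.

```python
import struct

def _rotl(v, n):
    return ((v << n) & 0xFFFFFFFF) | (v >> (32 - n))

def _quarter(s, a, b, c, d):
    """ChaCha quarter round, applied in place to state list s."""
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF
    s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF
    s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & 0xFFFFFFFF
    s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & 0xFFFFFFFF
    s[b] = _rotl(s[b] ^ s[c], 7)

def chacha20_block(key: bytes, counter: int, nonce: bytes) -> bytes:
    """Produce one 64-byte keystream block (32-byte key, 12-byte nonce)."""
    state = list(struct.unpack("<4I", b"expand 32-byte k"))  # words 0-3
    state += struct.unpack("<8I", key)                        # words 4-11
    state += [counter & 0xFFFFFFFF]                           # word 12
    state += struct.unpack("<3I", nonce)                      # words 13-15
    w = state[:]
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter(w, 0, 4, 8, 12)
        _quarter(w, 1, 5, 9, 13)
        _quarter(w, 2, 6, 10, 14)
        _quarter(w, 3, 7, 11, 15)
        _quarter(w, 0, 5, 10, 15)
        _quarter(w, 1, 6, 11, 12)
        _quarter(w, 2, 7, 8, 13)
        _quarter(w, 3, 4, 9, 14)
    out = [(x + y) & 0xFFFFFFFF for x, y in zip(w, state)]
    return struct.pack("<16I", *out)

def chacha20_xor(key, nonce, data, counter=0):
    """XOR data with the keystream; encryption and decryption are the same operation."""
    out = bytearray()
    for i in range(0, len(data), 64):
        ks = chacha20_block(key, counter + i // 64, nonce)
        out += bytes(a ^ b for a, b in zip(data[i:i + 64], ks))
    return bytes(out)

key, nonce = bytes(range(32)), bytes(12)
msg = b'{"mode":"whirl"}'
ct = chacha20_xor(key, nonce, msg)
print(len(ct) == len(msg), chacha20_xor(key, nonce, ct) == msg)  # True True
```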

The Decrypted Challenge

With decryption working, here’s the full structure that comes back from captcha/get:

{
  "code": 200,
  "data": {
    "challenges": [{
      "challenge_code": 99996,
      "challenge_version": "2.0",
      "id": "4ac9c237576b84025b34945e6d19040118934991",
      "mode": "whirl",
      "question": {
        "url1": "https://p16-rc-captcha-sg.ibyteimg.com/.../outer.png",
        "url2": "https://p19-rc-captcha-sg.ibyteimg.com/.../inner.png"
      }
    }],
    "cyfreso": 41,
    "expire_time": 1770513177,
    "start_time": 1770512877,
    "verify_id": "Verify_d55d6b4f-396c-4115-aa87-aa86a8b0cfac",
    "version": 2
  }
}

url1 is the outer ring image. url2 is the inner circle. The cyfreso field appears to be a retry/confidence counter - it starts high, shifts after failed attempts (I saw sequences like 41 → 88 → 11 → 5), and seems to influence how aggressively the server rejects borderline answers. Challenge ID and verify ID get sent back with the solve.

The outer response wrapper has edata at the top level and a separate data field containing only the verify_id in plaintext - presumably so the SDK can reference it without decrypting.
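Pulling the useful fields out of the decrypted response takes only a few lines. The sketch below runs on the structure shown above (the image URLs stay elided, as in the text); summarize_challenge is a hypothetical helper, and treating start_time and expire_time as Unix timestamps in seconds is an assumption based on their magnitude.

```python
import json

RESPONSE = json.loads("""
{
  "code": 200,
  "data": {
    "challenges": [{
      "challenge_code": 99996,
      "challenge_version": "2.0",
      "id": "4ac9c237576b84025b34945e6d19040118934991",
      "mode": "whirl",
      "question": {
        "url1": "https://p16-rc-captcha-sg.ibyteimg.com/.../outer.png",
        "url2": "https://p19-rc-captcha-sg.ibyteimg.com/.../inner.png"
      }
    }],
    "cyfreso": 41,
    "expire_time": 1770513177,
    "start_time": 1770512877,
    "verify_id": "Verify_d55d6b4f-396c-4115-aa87-aa86a8b0cfac",
    "version": 2
  }
}
""")

def summarize_challenge(resp):
    """Collect the fields that the solve request needs to echo back."""
    data = resp["data"]
    challenge = data["challenges"][0]
    return {
        "challenge_id": challenge["id"],
        "verify_id": data["verify_id"],
        "mode": challenge["mode"],
        "outer_url": challenge["question"]["url1"],
        "inner_url": challenge["question"]["url2"],
        # Assumed to be seconds: the window between issue and expiry.
        "ttl_seconds": data["expire_time"] - data["start_time"],
    }

info = summarize_challenge(RESPONSE)
print(info["mode"], info["ttl_seconds"])   # whirl 300
```

If the timestamps are indeed seconds, each challenge is valid for five minutes.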

Encrypting the Solve

Same format in reverse. Generate random key and nonce, encrypt the payload, prepend the header:

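import base64
import json
import os

from Crypto.Cipher import ChaCha20

def encrypt(data):
    key = os.urandom(32)
    nonce = os.urandom(12)
    cipher = ChaCha20.new(key=key, nonce=nonce)
    # Compact separators, matching JSON.stringify output
    pt = json.dumps(data, separators=(',',':')).encode()
    ct = cipher.encrypt(pt)
    raw = bytes([1]) + key + nonce + ct   # version + key + nonce + ciphertext
    return base64.b64encode(raw).decode()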

The solve payload matches what I captured from the browser:

payload = {
    'modified_img_width': 348,
    'id': challenge_id,
    'mode': 'whirl',
    'reply': trajectory,
    'reply2': [],
    'events': '{"userMode":0}',
    'verify_id': verify_id,
    'verify_requests': [],
    'log_params': {},
    'models': {},
    'models2': {},
    'drag_width': 348,
    'version': 2,
}

First submission came back code: 501, msg_sub_code: "5007" - malformed request. The verify endpoint needed the full set of query params matching the GET request, not just the minimal set I was sending. Added those, tried again: code: 500, msg: "VerifyFailedErr". Wrong answer, but the server accepted the format. Crypto pipeline complete.

Solving the Image

Downloaded the captcha images. The outer image (347×347) is a donut - a ring of content with a black hole in the center. The inner image (211×211) is a filled circle that fits inside that hole. Rotate the inner to the correct angle and the content continues seamlessly across the boundary.

The geometry is consistent across images. Scanning outward from the center of both images: the ring content starts at radius ~105, the inner circle content extends to radius ~105. That’s the boundary where pixels need to match.
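The radial scan can be sketched as casting rays from the center and recording where each one first hits a non-black pixel. The version below is an illustration on a synthetic grayscale ring rather than a real captcha image. The function name, the brightness threshold, and the ray count are all assumptions.

```python
import math

def hole_radius(img, cx, cy, threshold=16, n_rays=36):
    """Median radius at which rays from (cx, cy) first reach a pixel brighter than threshold.

    img is a 2D list of grayscale values indexed as img[y][x].
    Returns None if no ray finds content.
    """
    h, w = len(img), len(img[0])
    hits = []
    for k in range(n_rays):
        angle = 2 * math.pi * k / n_rays
        r = 0
        while True:
            x = int(round(cx + r * math.cos(angle)))
            y = int(round(cy + r * math.sin(angle)))
            if not (0 <= x < w and 0 <= y < h):
                break  # ray left the image without finding content
            if img[y][x] > threshold:
                hits.append(r)
                break
            r += 1
    hits.sort()
    return hits[len(hits) // 2] if hits else None

# Synthetic 347x347 "donut": black hole of radius 105, bright ring outside it.
SIZE, C = 347, 173
ring = [[200 if 105 <= math.hypot(x - C, y - C) <= 173 else 0 for x in range(SIZE)]
        for y in range(SIZE)]
print(hole_radius(ring, C, C))
```

Taking the median over many rays keeps a few noisy or dark pixels near the boundary from skewing the estimate.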

I tried ORB feature matching first. Got 1 good match out of the minimum 8 needed. The images are too circular and uniform for keypoint detectors to find enough distinctive features.

What worked: sample pixels along a circle at the boundary radius in both images, unwrap them into 1D signals, then use FFT cross-correlation to find the rotation offset.

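import math
import numpy as np

def sample_ring(img, cx, cy, radius, n=720):
    # Sample n pixels evenly around a circle (0.5 degree steps at n=720)
    pixels = []
    for i in range(n):
        angle = 2 * math.pi * i / n
        x = int(round(cx + radius * math.cos(angle)))  # nearest pixel, not truncation
        y = int(round(cy + radius * math.sin(angle)))
        pixels.append(img[y, x].astype(float))
    return np.array(pixels)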

Sampling at several radii near the boundary (3, 5, 8, 12 and 16 pixels deep) and averaging the cross-correlation across all color channels and depths makes the detection robust. On captured images this consistently produces strong peaks, with normalized cross-correlation (NCC) scores of 0.5-0.7 and the top candidates clustering within ±3 degrees.
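The core of the matching step is a circular normalized cross-correlation between two unwrapped 1D signals. The sketch below computes it directly in O(n²) with the standard library instead of via FFT; it evaluates the same quantity, just more slowly. It handles a single channel at a single radius, and the function name and the synthetic test signal are mine. Which direction a positive shift corresponds to on the real captcha is something to verify against actual images.

```python
import math

def circular_ncc(a, b):
    """Return (shift, score) maximizing the normalized correlation of a[i] with b[i + shift]."""
    n = len(a)

    def normalize(sig):
        mean = sum(sig) / n
        centered = [v - mean for v in sig]
        norm = math.sqrt(sum(v * v for v in centered)) or 1.0
        return [v / norm for v in centered]

    a, b = normalize(a), normalize(b)
    best_shift, best_score = 0, -2.0
    for s in range(n):
        score = sum(a[i] * b[(i + s) % n] for i in range(n))
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift, best_score

# Synthetic ring signal and a copy rotated forward by 47 samples.
n = 360
signal = [math.sin(t) + 0.5 * math.sin(3 * t + 1) + 0.25 * math.cos(7 * t)
          for t in (2 * math.pi * i / n for i in range(n))]
rotated = [signal[(i - 47) % n] for i in range(n)]

shift, score = circular_ncc(signal, rotated)
print(shift, shift * 360 / n)   # 47 47.0
```

With n samples per revolution, a shift of s samples corresponds to s * 360 / n degrees, so n=720 gives half-degree resolution.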

The final x value maps linearly: target_x = (angle / 360) * 348, where 348 is the drag_width from the protocol.
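As a worked example of that mapping, here is a small sketch with its inverse. The helper names are mine, and rounding to whole pixels is an assumption based on the integer x values in the captured payload.

```python
DRAG_WIDTH = 348  # drag_width from the protocol

def angle_to_x(angle_deg, drag_width=DRAG_WIDTH):
    """Linear map from rotation angle in degrees to slider x position in pixels."""
    return round(angle_deg / 360 * drag_width)

def x_to_angle(x, drag_width=DRAG_WIDTH):
    """Inverse map: slider x position back to degrees."""
    return x / drag_width * 360

print(angle_to_x(90), angle_to_x(180))   # 87 174
print(round(x_to_angle(1), 2))           # 1.03
```

One pixel of slider travel is therefore worth roughly one degree of rotation.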

What the Server Actually Checks

This is where it gets interesting. With the crypto fully reversed and the image solver producing visually correct angles, I built an end-to-end pipeline and submitted answers. The server kept returning code: 500, msg: "VerifyFailedErr".

To isolate the problem, I set up a browser intercept using Playwright. The idea: let the real SDK handle all the anti-bot token generation (X-Bogus, X-Gnarly, detail, msToken), but hook XMLHttpRequest.send inside the page to decrypt the edata payload, swap in the solver’s trajectory, and re-encrypt it - all before the SDK computes its request signatures.

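// this.__url is not a native property; it is presumably set by a separate open() hook not shown here
const originalSend = XMLHttpRequest.prototype.send;
XMLHttpRequest.prototype.send = function(body) {
    if (this.__url?.includes('/captcha/verify')) {
        const obj = JSON.parse(body);
        // Reuse the SDK's own key and nonce so only the trajectory changes
        const { payload, key, nonce } = decryptEdata(obj.edata);
        payload.reply = generateTrajectory(solverTargetX);
        body = JSON.stringify({
            edata: encryptEdata(payload, key, nonce)
        });
    }
    return originalSend.call(this, body);
};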

This worked mechanically. The hook intercepted the verify request, replaced the trajectory, and the browser sent the modified request with all valid anti-bot tokens. The server still rejected it.

A few things became clear:

The angle accuracy isn’t the bottleneck. The hook showed the SDK’s own computed x value was within 1-2px of the solver’s answer. The image analysis is correct.

The trajectory matters, not just the endpoint. The reply array isn’t just checked for the final x value. The server analyzes the full drag path - timing patterns, acceleration curves, y-axis jitter. My generated trajectories used a smooth ease-out cubic with uniform random jitter. Real human drags have micro-pauses, overshoot-and-correct patterns, and acceleration that correlates with distance.

There are signals outside the payload. Even with a real browser generating valid tokens, the server-side validation goes beyond what’s in the edata. The SDK collects telemetry throughout the page session. The detail query parameter on the verify request is a large encoded blob that likely contains behavioral data collected before the captcha even appeared.

The server tracks state across attempts. The cyfreso field in challenge responses shifted across my attempts - 41, 88, 11, 5 - suggesting the server adjusts its acceptance threshold based on session history. Getting flagged early makes subsequent attempts harder.
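To show how regular the rejected trajectories were, here is a minimal sketch of a generator of the kind described above: a smooth ease-out cubic with uniform random y jitter. It is an illustration, not necessarily the exact generator used. The function names and the duration_ms, steps, and jitter defaults are hypothetical.

```python
import random

def ease_out_cubic(t):
    """Fast start, smooth deceleration: 0 at t=0, 1 at t=1."""
    return 1 - (1 - t) ** 3

def naive_trajectory(target_x, duration_ms=800, steps=40, jitter=1, seed=None):
    """Build a reply-style list of {x, y, relative_time} points ending at target_x."""
    rng = random.Random(seed)
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        points.append({
            "x": round(target_x * ease_out_cubic(t)),
            "y": rng.randint(-jitter, jitter),       # uniform jitter, no structure
            "relative_time": round(duration_ms * t),  # perfectly even sampling
        })
    return points

traj = naive_trajectory(174, seed=1)
gaps = {b["relative_time"] - a["relative_time"] for a, b in zip(traj, traj[1:])}
print(traj[-1]["x"], gaps)   # 174 {20}
```

Every sample is exactly 20 ms apart and x never pauses or overshoots, which is the kind of regularity that the micro-pauses and corrections of a real drag do not have.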

Defense in Depth

TikTok’s captcha has three distinct layers:

Layer 1: Encryption. ChaCha20 with embedded keys prevents casual inspection. It forces you to either reverse the crypto or use the SDK. But the key is right there in the blob - this is obfuscation, not security.

Layer 2: The visual puzzle. The rotation challenge requires computer vision to solve programmatically. Boundary-based normalized cross-correlation handles it reliably. Traditional keypoint matching fails due to the circular symmetry.

Layer 3: Behavioral analysis. This is the real defense. The server validates not just whether the answer is correct, but whether the interaction looks human. Mouse trajectory dynamics, timing between challenge load and solve, telemetry from the broader page session, and anti-bot signatures computed from behavioral data. This is why correct angles still get rejected when submitted programmatically.

Most captcha reversing write-ups focus on breaking layers 1 and 2. Layer 3 is where the actual bot detection lives, and it’s a fundamentally harder problem because the signals aren’t contained in the captcha interaction itself - they’re collected from the entire browsing session.

Lessons Learned

The encryption doesn’t protect anything. ChaCha20 with the key embedded in every message is obfuscation, not security. The purpose is to force SDK usage and prevent casual inspection.
Hook the boundaries, not the internals. I spent time trying to access ct and gt objects inside VM closures. What worked was hooking TextEncoder/TextDecoder and Uint32Array at the boundary between the VM and browser APIs.
Check your constants. [1634760805, 857760878, ...] immediately identifies ChaCha20 if you recognize the “expand 32-byte k” magic. Would have saved an hour of assuming AES.
Try the dumb thing. The key sitting in the 32 bytes right after the version byte of the blob seemed too simple to be real. It was real.
The captcha is the easy part. Breaking the encryption and solving the image are tractable engineering problems. The behavioral fingerprinting that wraps the whole system is where the real anti-bot investment is. If you’re evaluating a captcha system’s security, look at what happens around the puzzle, not just at the puzzle itself.

Tools used: Chrome DevTools, Python, Playwright, pycryptodome, OpenCV, numpy

SDK version: captcha-ttp.0a4bb10f.js (h5_sdk_version 2.34.12)

Content on this site is licensed CC BY-NC-SA 4.0