MuleHunter.AI — Detection Strategy Spec

§ 01 / Architecture

Five layers, one verdict.

MuleHunter.AI is not a single model. It is a stack — onboarding intelligence, deterministic rules, behavioural ML, graph reasoning and federated network signals — collapsing to one continuous score per account, updated every transaction.

Identity & onboarding

Pre-transaction intelligence at the account-opening stage. Cross-checks against I4C Suspect Registry, NPCI mule list, device-fingerprint, IP geo, KYC document age, biometric replay detection and branch-batch openings. Mules are easier to catch before they receive their first credit than after they have processed 95,000.

I4C Registry | NPCI Mule Hash | Device-FP | Doc-vintage | Branch-batch

Deterministic rules engine

Hard-coded, fast, explainable signals. Rules R-01…R-24 fire on velocity, structuring, ticket-clustering, sender homogeneity, descriptor entropy, sub-threshold sweeping, dormancy-then-burst and 16 more. Rules are cheap, deterministic and forensically defensible — every alert ships with the firing rule IDs attached.

FlinkCEP windows | Sub-90s SLA | Explainable

Behavioural ML

Per-account behavioural embedding from 142 features (volumetric, temporal, counterparty, descriptor, balance-trajectory, channel-mix). Trained on supervised labels from I4C lien-marked accounts plus weak labels from rule firings. Ensemble: gradient-boosted trees for ranking + temporal CNN for burst detection + isolation forest for novelty. Updated daily; scored every transaction.

XGBoost | Temporal CNN | Isolation Forest | 142 features

Graph & network

Sender↔account↔beneficiary bipartite graph reconstructed daily across participating banks. A Graph Neural Network propagates risk: a known mule pulls up its counterparties, sequential-numbered account clusters pull up their siblings, and shared-sender clusters surface co-mule rings. The upstream source (the 4897735162098-equivalent) is identified automatically.

GNN · GraphSAGE | Community detect | Risk propagation | Cross-bank

CMAI score & action router

All four layers above collapse into one continuous output: the Composite Mule-Account Index (0–100). The score routes to a four-tier intervention ladder (T1 monitor → T4 lien) and pushes back via webhook to the host bank's core. The same score is contributed to the federated network for cross-bank reuse — one bank's catch becomes every bank's defence.

0–100 CMAI | T1→T4 ladder | Federated | Bank webhook

§ 02 / Features

From raw rail to 142 features.

The model never sees a transaction directly. It sees per-account, per-window aggregations across six families. Each family was instrumented to capture a specific class of mule behaviour observed in the anchor cases.

Ingest

Streams & stores

UPI, IMPS, NEFT, RTGS, AePS, card & cash channels
Core-banking debit/credit feed
KYC, device, IP, geo metadata
I4C Suspect Registry (refreshed hourly)
NPCI mule-hash list (refreshed daily)
Cross-bank federated signals

Feature families · 142 total

Six behavioural axes

VOL/ volume, count, peak-velocity, burst-duration
TMP/ hour-of-day skew, dormancy-then-burst, inter-arrival
CNT/ sender-count, sender-bank-spread, repeat-ratio
DSC/ descriptor entropy, random-token %, phone-VPA %
BAL/ retention ratio, max-balance, velocity multiple
OUT/ destination concentration, ticket-band, sub-threshold sweep
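
As an illustration of what a per-account, per-window aggregation might look like, the sketch below computes three CNT/ features from a list of credits. The `Credit` record, the `cnt_features` function and the exact definition of repeat-ratio (share of senders seen more than once) are assumptions for illustration; the spec does not define them. Bank spread is taken from the first four characters of the IFSC, which carry the bank code.

```python
from collections import Counter
from typing import NamedTuple


class Credit(NamedTuple):
    """Hypothetical minimal credit record for one account."""
    sender_vpa: str
    sender_ifsc: str
    amount: float


def cnt_features(credits: list[Credit]) -> dict[str, float]:
    """Counterparty (CNT/) aggregates for one account over one window."""
    per_sender = Counter(c.sender_vpa for c in credits)
    n_senders = len(per_sender)
    repeat_senders = sum(1 for k in per_sender.values() if k > 1)
    return {
        "sender_count": n_senders,
        # The first four IFSC characters identify the bank.
        "sender_bank_spread": len({c.sender_ifsc[:4] for c in credits}),
        # Assumption: repeat-ratio = share of senders seen more than once.
        "repeat_ratio": repeat_senders / n_senders if n_senders else 0.0,
    }


sample = [
    Credit("a@upi", "SBIN0001234", 100.0),
    Credit("a@upi", "SBIN0001234", 200.0),
    Credit("b@upi", "HDFC0000001", 300.0),
]
features = cnt_features(sample)
```

The same pattern (group, count, ratio) would apply to the other five families, each with its own windowing.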

Output

Continuous verdict

Per-account CMAI score 0–100 (refreshed every txn)
Top-5 contributing features (SHAP)
Firing rule IDs (forensic chain)
Graph-neighbourhood risk lift
Recommended action tier T1…T4
Confidence band + model-version stamp

§ 03 / Rules engine

Twenty-four rules, case-derived.

L1 deterministic rules. Each rule traces back to a specific signal in the anchor SBI cases. Severity, logic, window and recommended tier are versioned in the rules registry; thresholds are A/B tested monthly against the I4C-confirmed mule ledger.

R-01 · VEL-BURST

CRITICAL

Sustained credit burst on dormant or new account.

IF credits_count(60m) ≥ 500 AND account_age < 180d AND prior_30d_velocity < 5% of current → FIRE

Window 60 min | Anchor case A: 95k credits / 5h
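
A rule of this shape reduces to a boolean predicate over pre-aggregated window values. The sketch below is one possible reading of R-01; the `AccountWindow` fields and the choice of credits per hour as the velocity unit are assumptions, not part of the rules registry.

```python
from dataclasses import dataclass


@dataclass
class AccountWindow:
    """Hypothetical pre-aggregated inputs for R-01."""
    credits_count_60m: int      # credits received in the last 60 minutes
    account_age_days: int       # days since account opening
    prior_30d_velocity: float   # credits per hour over the prior 30 days
    current_velocity: float     # credits per hour in the current window


def r01_vel_burst(w: AccountWindow) -> bool:
    """Return True when all three R-01 conditions hold."""
    return (
        w.credits_count_60m >= 500
        and w.account_age_days < 180
        and w.prior_30d_velocity < 0.05 * w.current_velocity
    )


# Illustrative values only: a 58-day-old account with almost no history.
fires = r01_vel_burst(AccountWindow(520, 58, 0.01, 520.0))
```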

R-02 · VEL-TPS

CRITICAL

Peak credit throughput exceeds human ceiling.

IF peak_credits_per_second(rolling 60s) ≥ 3 sustained ≥ 10 min → FIRE

Window rolling 60s | Observed peak ~5 tps

R-03 · VEL-MULT

HIGH

Turnover multiple beyond legitimate threshold.

IF turnover_24h ÷ avg_balance_30d ≥ 30 → FIRE

Window 24h | Observed 42×–51×

R-04 · STR-SMURF

HIGH

Micro-credit smurfing at scale.

IF count(credits ≤ ₹2000 within 1h) ≥ 300 AND median(credit_amount) ≤ ₹500 → FIRE

Window 1h | Anchor median ₹300–₹500

R-05 · STR-ROUND

MEDIUM

Round-amount dominance.

IF share(credit_amount ∈ {₹100, ₹200, ₹300, ₹500, ₹1000, ₹2000}) ≥ 80% over 24h AND count ≥ 500 → FIRE

Window 24h | Anchor top-10 = 93.4%

R-06 · PTH-RETAIN

CRITICAL

Pass-through funnel — retention near zero.

IF (credits_sum − debits_sum) ÷ credits_sum < 2% over 24h AND credits_sum ≥ ₹10 L → FIRE

Window 24h | Observed 0.03% / 0.42%
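
The retention test is plain arithmetic. A minimal sketch, assuming amounts are in rupees and that ₹10 L means 1,000,000; the function name and the sample figures are hypothetical.

```python
TEN_LAKH = 10 * 100_000  # Rs 10 L expressed in rupees


def r06_pth_retain(credits_sum: float, debits_sum: float) -> bool:
    """Fire when less than 2% of a large 24h inflow stays in the account."""
    if credits_sum < TEN_LAKH:
        return False  # volume floor not met; also avoids dividing by zero
    retention = (credits_sum - debits_sum) / credits_sum
    return retention < 0.02


# Illustrative: Rs 1 Cr in, Rs 99.97 L out, so retention is 0.03%.
fires = r06_pth_retain(10_000_000, 9_997_000)
```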

R-07 · PTH-LAG

HIGH

Funds out within minutes of in.

IF median(time_credit_to_offsetting_debit) < 30 min over 1h AND count_debits ≥ 5 → FIRE

Window 1h | FIFO matching engine
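
The spec names a FIFO matching engine but does not describe it. One plausible sketch: each debit consumes the oldest unspent credits first, and its lag is measured from the oldest credit it draws on. The tuple layout (minute, amount), integer amounts and both function names are assumptions.

```python
from collections import deque
from statistics import median


def fifo_lags(credits, debits):
    """Return one lag (in minutes) per matched debit.

    credits, debits: lists of (minute, amount) tuples sorted by time,
    with integer amounts. Lag runs from the oldest credit a debit consumes.
    """
    queue = deque()
    lags = []
    ci = 0
    for d_time, d_amt in debits:
        # Enqueue every credit that arrived at or before this debit.
        while ci < len(credits) and credits[ci][0] <= d_time:
            queue.append(list(credits[ci]))
            ci += 1
        remaining = d_amt
        first_credit_time = None
        while remaining > 0 and queue:
            head = queue[0]
            if first_credit_time is None:
                first_credit_time = head[0]
            used = min(head[1], remaining)
            head[1] -= used
            remaining -= used
            if head[1] == 0:
                queue.popleft()
        if first_credit_time is not None:
            lags.append(d_time - first_credit_time)
    return lags


def r07_pth_lag(credits, debits) -> bool:
    """Fire when at least 5 debits exist and the median lag is under 30 min."""
    lags = fifo_lags(credits, debits)
    return len(debits) >= 5 and bool(lags) and median(lags) < 30
```

For example, a single credit of 100 at minute 0 followed by debits of 60 at minute 10 and 40 at minute 20 yields lags of 10 and 20 minutes, because both debits draw on the same credit.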

R-08 · OUT-BAND

HIGH

Sub-threshold sweep band concentration.

IF stdev(debit_amount) ÷ mean < 5% AND mean within ±5% of {₹2L, ₹5L, ₹10L} → FIRE

Window 6h | Anchor ₹4.00L–₹4.18L

R-09 · OUT-CONC

CRITICAL

All debits to ≤ 5 beneficiaries.

IF unique_beneficiary_accounts(7d) ≤ 5 AND debit_count ≥ 50 → FIRE

Window 7d | Anchor A → 1 acct

R-10 · CNT-FANIN

HIGH

Extreme fan-in.

IF unique_sender_VPAs(24h) ≥ 5000 AND avg_credit ≤ ₹1000 → FIRE

Window 24h | Anchor 68k / 172k senders

R-11 · CNT-PHONE

MEDIUM

Phone-VPA sender dominance.

IF share_phone_pattern_VPAs(24h) ≥ 75% AND credit_count ≥ 500 → FIRE

Window 24h | Anchor 88.7% phone-VPA

R-12 · CNT-SPREAD

MEDIUM

Cross-bank sender spread.

IF unique_sender_IFSCs(24h) ≥ 80 AND avg_credit ≤ ₹1000 → FIRE

Window 24h | Anchor 120 / 153 banks

R-13 · DSC-ENTROPY

HIGH

Bot-generated descriptor tokens.

IF share(payment_note matches /^[a-z0-9]{4}$/i) ≥ 70% over 1h, count ≥ 300 → FIRE

Window 1h | Anchor 93.9%
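
The descriptor check can be sketched with Python's `re` module. `fullmatch` is used in place of the `^…$` anchors because Python's `$` also matches just before a trailing newline; the function name and sample notes are hypothetical.

```python
import re

# Same character class and length as the rule's pattern, case-insensitive.
TOKEN_RE = re.compile(r"[a-z0-9]{4}", re.IGNORECASE)


def r13_dsc_entropy(notes: list[str]) -> bool:
    """Fire when at least 70% of 300+ payment notes are 4-char alphanumeric tokens."""
    if len(notes) < 300:
        return False
    hits = sum(1 for note in notes if TOKEN_RE.fullmatch(note))
    return hits / len(notes) >= 0.70


# Illustrative hour of traffic: 280 token-like notes, 20 ordinary remarks.
fires = r13_dsc_entropy(["a1B2"] * 280 + ["rent for June"] * 20)
```

Note that ordinary four-letter remarks such as "rent" also match the pattern, which is presumably why the rule pairs it with a high share threshold and a volume floor.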

R-14 · DSC-NULL

LOW

Absent or generic remarks.

IF share(payment_note ∈ {"", "test", "payment", "UPI"}) ≥ 60% AND count ≥ 200 → FIRE

Window 1h | Aggregator footprint

R-15 · TMP-DORMANT

HIGH

Dormancy then explosive burst.

IF txn_count(prev_30d) < 10 AND txn_count(current_24h) ≥ 500 → FIRE

Window 30d / 24h | Classic mule activation

R-16 · TMP-OFFHRS

MEDIUM

Off-hours / cross-TZ activity.

IF share(txn between 22:00–05:00 IST) ≥ 40% over 24h, count ≥ 200 → FIRE

Window 24h | Foreign-controller signal
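
The 22:00–05:00 band wraps past midnight, so a naive range comparison would never match. A short sketch of the check, assuming timestamps are already converted to IST and that the 05:00 boundary is exclusive (the spec does not say); both function names are hypothetical.

```python
from datetime import time


def is_off_hours(t: time) -> bool:
    """True for 22:00 <= t or t < 05:00; the band wraps past midnight."""
    return t >= time(22, 0) or t < time(5, 0)


def r16_tmp_offhrs(txn_times: list[time]) -> bool:
    """Fire when at least 40% of 200+ transactions fall in the off-hours band."""
    if len(txn_times) < 200:
        return False
    share = sum(is_off_hours(t) for t in txn_times) / len(txn_times)
    return share >= 0.40


# Illustrative day: half the activity at 23:00, half at noon.
fires = r16_tmp_offhrs([time(23, 0)] * 100 + [time(12, 0)] * 100)
```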

R-17 · GRA-SEQACCT

CRITICAL

Sequential destination account cluster.

IF debits routed to ≥ 6 destination accounts with contiguous-number gaps ≤ 3 AND same branch IFSC → FIRE

Window 30d | Anchor 42 sequential accts
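
One way to find such clusters is to sort the destination account numbers and split the sequence wherever neighbours differ by more than 3. This sketch reads "contiguous-number gaps ≤ 3" as a neighbour difference of at most 3, which is an assumption, and it expects the caller to have grouped accounts by branch IFSC already. Function names are hypothetical.

```python
def sequential_clusters(account_numbers: list[int], max_gap: int = 3) -> list[list[int]]:
    """Group account numbers into runs whose sorted neighbours differ by <= max_gap."""
    clusters: list[list[int]] = []
    run: list[int] = []
    for acct in sorted(set(account_numbers)):
        if run and acct - run[-1] > max_gap:
            clusters.append(run)
            run = []
        run.append(acct)
    if run:
        clusters.append(run)
    return clusters


def r17_gra_seqacct(account_numbers: list[int]) -> bool:
    """Fire when any run holds 6 or more accounts (same-branch input assumed)."""
    return any(len(c) >= 6 for c in sequential_clusters(account_numbers))
```

For instance, `sequential_clusters([10, 11, 13, 20, 21])` splits into `[[10, 11, 13], [20, 21]]` because the jump from 13 to 20 exceeds the gap limit.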

R-18 · GRA-SHAREDSRC

CRITICAL

Shared upstream source account.

IF ≥ 2 accounts share an upstream-funder counterparty whose CMAI ≥ 70 → propagate +25 to each

Window graph daily | Anchor both → 4897735162098

R-19 · GRA-SHAREDSND

HIGH

Shared sender-pool overlap.

IF Jaccard(sender_set_A, sender_set_B) ≥ 0.05 over 7d AND |overlap| ≥ 1000 → FIRE on both

Window 7d | Anchor 5,005 shared
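
Jaccard similarity is the size of the intersection divided by the size of the union. A minimal sketch of the R-19 test over two sender sets; the function names and the synthetic sender IDs are illustrative only.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def r19_gra_sharedsnd(senders_a: set[str], senders_b: set[str]) -> bool:
    """Fire when the 7d sender pools overlap by Jaccard >= 0.05 and >= 1000 senders."""
    overlap = len(senders_a & senders_b)
    return jaccard(senders_a, senders_b) >= 0.05 and overlap >= 1000


# Synthetic pools: 5,000 senders each, 1,000 in common.
pool_a = {f"s{i}" for i in range(0, 5000)}
pool_b = {f"s{i}" for i in range(4000, 9000)}
fires = r19_gra_sharedsnd(pool_a, pool_b)
```

Requiring both a ratio and an absolute overlap guards against two failure modes: tiny accounts with coincidentally similar senders, and very large accounts where a handful of shared senders is unremarkable.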

R-20 · KYC-TRADENAME

HIGH

Beneficiary name spoofing.

IF same destination_account appears under ≥ 3 normalised beneficiary-name variants in 30d → FIRE

Window 30d | Anchor SHOP BASKET / RAMSO

R-21 · KYC-BATCHOPEN

CRITICAL

Branch-batch account openings.

IF ≥ 5 accounts opened within 14d at same branch IFSC share device-FP, IP /24 OR introducer → FIRE on all

Window 14d | L0 onboarding signal

R-22 · INT-DATESKEW

MEDIUM

Statement-date / UTR-epoch mismatch.

IF reconciled_txn_date differs from display_date by ≥ 30 days on ≥ 5% of records → FIRE forensic flag

Window per statement | Anchor 2025-07 vs 2026-03

R-23 · CHN-RAILSWAP

HIGH

Asymmetric rail usage — UPI in, RTGS/NEFT out.

IF inflow_rail_share(UPI) ≥ 90% AND outflow_rail_share(RTGS+NEFT) ≥ 90% over 24h → FIRE

Window 24h | Anchor both accounts

R-24 · I4C-REGMATCH

CRITICAL

Direct I4C Suspect Registry hit.

IF account_id OR PAN OR mobile OR device-FP ∈ I4C Suspect Registry → immediate T4 + lien

Window real-time | MoU data feed

§ 04 / ML ensemble

An ensemble, not a model.

Mule behaviour is not a single shape. The pass-through funnel, the dormant-then-burst, the sequential mule fleet: each has a different temporal signature, sparsity and counterparty topology. MuleHunter.AI uses four specialised learners (M1–M4), each optimised for one failure mode, fused with the deterministic rule vector (M5) at the score layer.

Model	Specialty	Input	Why this learner	Weight in CMAI
M1 · XGB-RANK	Gradient-boosted ranker	142 tabular features over 24h / 7d / 30d windows	Robust to mixed scales, handles missingness, ships with SHAP explanations — required for explainable banking AI.	0.30
M2 · T-CNN	Temporal CNN for burst signatures	Per-minute credit/debit count & amount tensors, 7-day window	Convolution detects the dormancy-spike-collapse pattern that tabular features blur. Picks up the 5-hour burst even if smeared across calendar dates.	0.20
M3 · ISO-NOV	Isolation forest novelty	Account-embedding vector vs historical population	Catches new mule typologies the rules & supervised models have never seen. Important: typologies evolve faster than label data arrives.	0.15
M4 · GNN-PROP	Graph neural net · GraphSAGE	Sender ↔ account ↔ beneficiary bipartite graph across participating banks	Surfaces co-mule rings, sequential clusters and shared-source funnels. Single-account models cannot see this — by definition, the signal lives between accounts.	0.25
M5 · RULES	Deterministic L1 firings	R-01 … R-24 boolean vector	Forensic floor. Even if every ML model fails or drifts, a single critical rule (e.g., R-24 I4C hit) can still drive the account to T4.	0.10 (with veto)
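
The weights in the table sum to 1.00, which suggests a weighted average. The sketch below is one possible fusion and not the documented method: it assumes each component emits a score in [0, 1], that fusion is linear, and that the rules veto acts as a floor at 85 (the bottom of the T4 band). `fuse_cmai`, `WEIGHTS` and `VETO_FLOOR` are hypothetical names.

```python
# Weights copied from the "Weight in CMAI" column.
WEIGHTS = {"M1": 0.30, "M2": 0.20, "M3": 0.15, "M4": 0.25, "M5": 0.10}
VETO_FLOOR = 85.0  # assumption: lowest CMAI that still maps to T4


def fuse_cmai(scores: dict[str, float], critical_veto: bool = False) -> float:
    """Weighted average of per-model scores in [0, 1], scaled to 0-100.

    critical_veto: True when a critical rule (for example R-24) has fired;
    under the floor assumption it lifts the result to at least VETO_FLOOR.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing model scores: {sorted(missing)}")
    cmai = 100.0 * sum(WEIGHTS[m] * scores[m] for m in WEIGHTS)
    if critical_veto:
        cmai = max(cmai, VETO_FLOOR)
    return round(cmai, 2)


quiet = dict.fromkeys(WEIGHTS, 0.0)
vetoed = fuse_cmai(quiet, critical_veto=True)
```

With the floor in place, an account on which every learner is silent still lands in T4 when a critical rule fires, matching the "forensic floor" role described for M5.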

§ 05 / Worked example

Replaying SBI 1ID through the engine.

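A trace of how Account A's CMAI evolves transaction by transaction through the burst window. By minute 12 the account would have crossed 85 and moved to a T4 lien; by minute 14 the bank would have acknowledged the freeze and no further outflow would have been possible. Of the ₹6.98 Cr involved, ₹6.97 Cr would have been recoverable.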

Account A · 01-July-2025 · burst replay

Account opened 2025-05-04

Prior 30d txns 7

Burst start 15:10:42 IST

Engine intercept 15:23:11 IST

T+min	State	Rules firing	CMAI	Action
T+00	First 42 credits (avg ₹287, 21 sender banks)	R-04, R-10	34	T1 · watch
T+03	~520 credits, dormancy break detected	R-04, R-10, R-15, R-13	52	T2 · soft friction on outflow
T+06	1,800+ credits, peak 4.7 tps, ₹500 dominance	R-01, R-02, R-05, R-11	68	T2 → step-up auth armed
T+09	First RTGS outflow attempt — ₹4,12,500 to a low-history beneficiary	+ R-08, R-23, R-07	79	T3 · outflow quarantined, callback queued
T+12	GNN propagation picks up shared upstream with Account B (already at CMAI 91)	+ R-18, R-19, R-09	96	T4 · debit freeze, 1601 PD lien
T+14	Bank webhook acks freeze; further credits accepted (legally required) but no outflow possible	frozen	96	STR filed · I4C feed updated

₹6.97 Cr recoverable | 14 min detect-to-freeze | 99.86% funds hold ratio
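
The headline figures can be checked from the timestamps and amounts quoted in this section. A small worked calculation, using only values stated above; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps from the replay header (IST, 01-July-2025).
burst_start = datetime(2025, 7, 1, 15, 10, 42)
intercept = datetime(2025, 7, 1, 15, 23, 11)
seconds_to_intercept = (intercept - burst_start).total_seconds()

# Amounts in crore rupees, as quoted in the section intro.
held_cr, total_cr = 6.97, 6.98
hold_ratio_pct = round(100 * held_cr / total_cr, 2)
```

The intercept lands 749 seconds (about 12.5 minutes) after burst start, consistent with the T+12 row, and the hold ratio rounds to 99.86%.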

§ 06 / Intervention

Four tiers, calibrated for friction.

A T4 freeze on a false positive harms a real customer. A T1 watch on a real mule loses ₹6 Cr. Tier calibration is the most reviewed part of the system, re-tuned monthly against I4C-confirmed and customer-disputed labels.

T1 · WATCH

Silent monitoring.

CMAI 25–49 · invisible to customer

Increased scoring cadence
Sampled into review queue
Feature snapshot retained 90d
No friction applied

T2 · SOFT

Friction on outflow.

CMAI 50–69 · low customer perceptibility

OTP on every NEFT / RTGS / IMPS
Velocity cap on outbound
Beneficiary cooling period
Analyst secondary review

T3 · RESTRICT

Outflow quarantine.

CMAI 70–84 · customer contact required

Outflows held for callback
Customer verification call
STR draft auto-generated
Branch & relationship team alerted

T4 · LIEN

Debit freeze.

CMAI 85–100 or R-24 hit · regulatory escalation

1601 PD lien marker applied
FIU-IND STR filed within 7 days
I4C Suspect Registry contribution
Account holder due-process notice
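
The four bands can be read as a simple lookup from score to tier. A minimal sketch, assuming scores below 25 receive no tier (the spec does not say what happens there); `route_tier` and `TIER_FLOORS` are hypothetical names.

```python
from typing import Optional

# Lower bound of each band, taken from the CMAI ranges listed per tier.
TIER_FLOORS = [(85, "T4"), (70, "T3"), (50, "T2"), (25, "T1")]


def route_tier(cmai: float, r24_hit: bool = False) -> Optional[str]:
    """Map a CMAI score (0-100) to an intervention tier.

    Assumption: scores under 25 get no tier; the spec leaves this open.
    """
    if not 0 <= cmai <= 100:
        raise ValueError("CMAI must lie in [0, 100]")
    if r24_hit:  # a direct I4C registry hit goes straight to T4
        return "T4"
    for floor, tier in TIER_FLOORS:
        if cmai >= floor:
            return tier
    return None


# Scores from the worked example: 34, 52, 79 and 96.
tiers = [route_tier(s) for s in (34, 52, 79, 96)]
```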

Hunting mules at machine speed.

MuleHunter.AI · detection strategy, rules engine, model architecture and case-driven feature catalogue.
