Snake SAT + Claude LLM — How and why it works
Any indicator function over a finite discrete domain can be encoded as a SAT instance in polynomial time. Decision tree bucketing reduces this to linear time.
In plain terms: we don't solve SAT. We build SAT formulas directly from the data.
The NP-hardness of SAT applies to finding satisfying assignments for arbitrary formulas. Snake never does that. It constructs structured formulas where the data is the assignment. Construction is polynomial, evaluation is polynomial. The hard part (NP) is avoided entirely.
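To make "the data is the assignment" concrete, here is a minimal sketch (all names and shapes are illustrative, not Snake's actual API): a literal is a (feature, value) pair, a clause is an OR of literals, and a separator is an AND of clauses. Because construction copies literals straight out of a positive example, that example satisfies the formula by construction; no SAT search ever happens.

```python
def clause_satisfied(sample, clause):
    # OR over literals: at least one (feature, value) pair must match
    return any(sample.get(feat) == val for feat, val in clause)

def formula_satisfied(sample, clauses):
    # AND over clauses (CNF evaluation) -- polynomial, no search
    return all(clause_satisfied(sample, c) for c in clauses)

# Construction takes literals directly from a positive example, so the
# example is a satisfying assignment by construction.
commande = {"has_quantity": True, "domain_known": True, "nb_product_lines": 3}
spam = {"has_quantity": False, "domain_known": False, "nb_product_lines": 0}

clauses = [[("has_quantity", True), ("domain_known", True)]]
print(formula_satisfied(commande, clauses))  # True
print(formula_satisfied(spam, clauses))      # False
```

Evaluation is a linear scan over literals, which is why both construction and prediction stay polynomial.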
Separator construction relies on two routines: oppose() (O(m) per literal) and minimize_clause (greedy literal removal). Total cost: O(|Fs|² × m) per class.

Our email classifier uses 13 boolean/categorical features extracted from each email by Claude:
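A hedged sketch of what greedy minimization could look like (the function body is an assumption about minimize_clause's behavior, not Snake's source): drop a literal whenever the shrunken clause still excludes every negative sample.

```python
def minimize_clause(clause, negatives):
    """Greedy literal removal (assumed behavior): a literal is dropped
    if the shrunken clause still excludes all negative samples."""
    def excludes_all(c):
        # a negative is excluded when none of the clause's literals match it
        return all(not any(s.get(f) == v for f, v in c) for s in negatives)

    minimized = list(clause)
    for lit in list(minimized):
        trial = [l for l in minimized if l != lit]
        if trial and excludes_all(trial):
            minimized = trial
    return minimized

# Illustrative data: one negative that never matches domain_known=True
negatives = [{"has_quantity": False, "domain_known": False, "has_price": True}]
clause = [("has_quantity", True), ("domain_known", True)]
print(minimize_clause(clause, negatives))  # [('domain_known', True)]
```

Each removal attempt scans all literals against all negatives, which is where the O(|Fs|² × m) bound comes from.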
| Feature | Type | Signal |
|---|---|---|
has_quantity | bool | Commande (0.92), Devis (0.55) |
has_product_ref | bool | Commande (0.75), Relance (0.60) |
has_dimensions | bool | Commande (0.80), Devis (0.50) |
has_price | bool | Devis (0.85), Spam (0.40) |
has_attachment_pdf | bool | Commande (0.55), Autre (0.45) |
has_delivery_date | bool | Commande (0.70), Relance (0.55) |
ton | cat | formel → Commande/Devis, informel → Spam |
has_greeting | bool | Professional emails (0.85-0.90) |
nb_product_lines | int | Commande (1-8), Spam (0) |
has_urgency_words | bool | Relance (0.75), Spam (0.50) |
has_complaint_words | bool | Autre (0.35), Relance (0.30) |
has_question_mark | bool | Devis (0.80), Relance (0.70) |
domain_known | bool | Pro domains (0.85), Spam (0.05) |
Snake's oppose() loop builds SAT clauses that separate each class pair. For example, to separate Commande from Spam:
Clause 1: has_quantity=True OR nb_product_lines>0 OR domain_known=True → excludes Spam (no quantity, no product lines, unknown domain) → keeps Commande (has all three)
Clause 2: has_dimensions=True OR has_delivery_date=True → excludes remaining Spam that slipped through → keeps Commande (has both)
Each clause is a human-readable rule. The conjunction of clauses forms the full separator. This is why the audit trail is readable: it's literally the SAT formula in plain language.
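To illustrate how the formula doubles as the audit trail, here is a hedged sketch of rendering a clause into its plain-language line (the function name is hypothetical):

```python
def clause_to_text(clause):
    """Render one SAT clause as a human-readable audit line (sketch)."""
    return " OR ".join(f"{feat}={val}" for feat, val in clause)

# Clause 2 from the Commande-vs-Spam example above
clause = [("has_dimensions", True), ("has_delivery_date", True)]
print(clause_to_text(clause))  # has_dimensions=True OR has_delivery_date=True
```

Nothing is lost in translation: the audit line and the clause are the same object.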
Training complexity: O(L × n × m × b), where L=30 layers, n=880 training samples, m=13 features, and b=200 bucket size. Plugging in: 30 × 880 × 13 × 200 = 68,640,000 ≈ 68M operations → trained in 3.3 seconds.
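The operation count is plain arithmetic and can be checked directly:

```python
L, n, m, b = 30, 880, 13, 200   # layers, samples, features, bucket size
ops = L * n * m * b
print(f"{ops:,}")  # 68,640,000
```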
Compare to gradient boosting on the same data: similar accuracy, but Snake produces explainable SAT rules, not a black-box ensemble. Every prediction comes with a human-readable audit trail.
Inference cost: O(L × clauses_in_bucket) ≈ 30 × ~5 = ~150 clause evaluations per prediction → sub-millisecond.
The Snake model evaluates a decision tree (bucket chain) and then checks clauses within the matched bucket. No matrix multiplication, no forward pass, no GPU. Pure boolean logic.
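As a rough sketch of that inference path (the bucket-chain structure and fall-through behavior are assumptions, not Snake's actual representation), prediction is a walk down boolean tests followed by clause checks:

```python
def predict(sample, bucket_chain):
    """Inference sketch: walk the decision-tree bucket chain, then
    require every clause of the matched bucket. Pure boolean logic."""
    for bucket_test, clauses, label in bucket_chain:
        if bucket_test(sample):
            if all(any(sample.get(f) == v for f, v in c) for c in clauses):
                return label
    return "Autre"  # fall-through to the catch-all class (assumption)

# Illustrative two-bucket chain
bucket_chain = [
    (lambda s: s.get("nb_product_lines", 0) > 0,
     [[("has_quantity", True), ("has_dimensions", True)]], "Commande"),
    (lambda s: not s.get("domain_known", False),
     [[("nb_product_lines", 0)]], "Spam"),
]
email = {"nb_product_lines": 3, "has_quantity": True, "domain_known": True}
print(predict(email, bucket_chain))  # Commande
```

No floating-point arithmetic appears anywhere in the hot path, which is why predictions stay sub-millisecond on a CPU.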
| Class | AUROC | Key separating features |
|---|---|---|
| Commande | 0.999 | has_quantity + has_dimensions + has_delivery_date + nb_product_lines |
| Spam | 0.969 | domain_known=False + nb_product_lines=0 + ton=informel |
| Devis | 0.963 | has_price + has_question_mark + !has_delivery_date |
| Relance | 0.962 | has_urgency_words + has_question_mark + has_product_ref + !has_quantity |
| Autre | 0.899 | Catch-all: has_complaint_words + has_attachment_pdf + !has_quantity + !has_price |
Macro AUROC: 0.958 (Wilcoxon-Mann-Whitney, 80/20 split, 1100 samples)
Autre is the weakest class at 0.899 because it is a catch-all: admin emails, invoices, complaints, miscellaneous. It has no strong positive signal, only negative ones (not a Commande, not a Devis, not a Relance, not Spam). With real labeled data this should improve as the Autre category becomes better defined.
Raw email (infinite text space)
↓
Extraction: keyword/regex (default) OR Claude Haiku (opt-in)
↓
13 structured features (finite Boolean/categorical space)
↓
Snake SAT classification (<1ms, pure logic)
↓
Prediction + Probability + Audit trail
The key insight: Snake is the classifier, not the LLM.
The LLM (Claude Haiku) is an optional encoder that improves feature extraction quality. But Snake does all the classification work regardless. With anthropic=false (the default), keyword/regex extraction builds the 13 features directly — no API key, no network, sub-10ms. Snake classifies the same way either path.
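To give a rough idea of the default path, a keyword/regex extractor might look like the following. The patterns here are illustrative placeholders, not the shipped rules:

```python
import re

def extract_features_keyword(text: str) -> dict:
    """Sketch of the default (anthropic=false) extractor: pure regex and
    keyword rules, no network. Patterns are illustrative only."""
    return {
        "has_price": bool(re.search(r"\d+[.,]?\d*\s*(€|eur|euros?)\b", text, re.I)),
        "has_question_mark": "?" in text,
        "has_urgency_words": bool(re.search(r"\b(urgent|asap|rapidement)\b", text, re.I)),
        "has_quantity": bool(re.search(r"\b\d+\s*(pi[eè]ces?|unit[eé]s?)\b", text, re.I)),
    }

feats = extract_features_keyword("Bonjour, pouvez-vous livrer 50 pièces rapidement ?")
print(feats["has_quantity"], feats["has_urgency_words"])  # True True
```

Deterministic rules like these cover the common phrasings; the Claude path exists precisely for the cases they miss.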
Two extraction modes, same classifier:
Keyword/regex (anthropic=false, default): deterministic pattern matching over the raw text. No API key, no network, sub-10ms.
Claude Haiku (anthropic=true): Claude handles ambiguity, synonyms, and implicit meaning. "Merci de nous livrer" ("please deliver to us") and "veuillez expédier" ("please ship") both map to has_quantity=True. Costs ~$0.001 per call, adds ~2s latency.

Why Snake, not the LLM, classifies:
The theoretical foundation (Finite Sum Theorem, universal MAXSAT representation, polynomial construction) is detailed in Charles Dana's MSc thesis at École Polytechnique, supervised by Erwan Le Pennec: sudoku.aws.monce.ai/V2/papers/