ThreatForge Documentation
A hybrid threat intelligence platform combining deterministic YARA analysis with probabilistic ML detection. F1-Score: 0.94 • AUC-ROC: 0.97
Readiness Score: 92/100
Database Tables: 14
API Endpoints: 35+
RLS Policies: 25+
Production Readiness
Google SRE Maturity Model • DORA Framework Assessment
Security
Zero Trust Architecture, Argon2id, HMAC-SHA256
Concurrency
Actor Model, GIL Bypass, Event Loop I/O
Data Integrity
ACID Compliance, MVCC, 3NF, RLS
Observability
OpenTelemetry, Prometheus, Structured Logging
Scalability
Horizontal Partitioning, Stateless API, CAP Theorem
Reliability
Circuit Breaker, Exponential Backoff, Health Checks
Maintainability
SOLID Principles, Clean Architecture, Type Safety
Total Readiness
System Architecture
Service-Oriented Micro-Monolith • CAP Theorem Trade-offs
Auth & Data (CP)
Strong consistency via PostgreSQL SERIALIZABLE isolation. We sacrifice availability during partitions rather than serve stale auth data.
Scanning Pipeline (AP)
Eventual consistency for task processing. Scans return HTTP 202 Accepted and are queued in Redis with at-least-once delivery.
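The accept-then-process flow can be sketched as follows. This is a minimal illustration, not the platform's code: an in-memory `deque` stands in for Redis, and the names `submit_scan`, `process`, and `worker_step` are hypothetical. It shows why at-least-once delivery calls for an idempotent handler: a task that is redelivered after a missed acknowledgement must not be processed twice.

```python
import uuid
from collections import deque

# In-memory stand-ins (assumption): the real pipeline queues tasks in Redis.
queue = deque()   # pending (scan_id, filename) tasks
results = {}      # scan_id -> result; doubles as the idempotency record


def submit_scan(filename: str):
    """Enqueue a scan and return immediately with HTTP 202 Accepted."""
    scan_id = str(uuid.uuid4())
    queue.append((scan_id, filename))
    return 202, {"scan_id": scan_id, "status": "queued"}


def process(scan_id: str, filename: str) -> None:
    """Idempotent handler: a redelivered task does not redo the work."""
    if scan_id in results:
        return
    results[scan_id] = {"file": filename, "status": "complete"}


def worker_step(fail_before_ack: bool = False) -> None:
    """Take one task, process it, and only then acknowledge it.

    If the worker dies before the acknowledgement, the task is put back
    on the queue. That redelivery is what makes the guarantee
    at-least-once rather than at-most-once.
    """
    scan_id, filename = queue.popleft()
    process(scan_id, filename)
    if fail_before_ack:
        queue.append((scan_id, filename))  # simulate redelivery
```

The client receives the `scan_id` at once and polls for the result later, so a slow or retried scan never blocks the API.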
Container Architecture
Client
Web Browser
API Scripts
Application
Next.js 16 :3000
Flask API :5000
FastAPI ML :7860
Celery Workers
Data
PostgreSQL 16
Redis 7
S3 Storage
Observability
Prometheus
Loki
Tempo
Grafana
Request Lifecycle
Technology Stack
Every choice evaluated on developer experience (DX), community, performance, security, and longevity
Next.js 16
App Router, RSC, Streaming SSR
React 19
Server Components, React Compiler
Tailwind CSS 4
Utility-first styling engine
Framer Motion 12
Spring physics animations
Security Architecture
Zero Trust (NIST SP 800-207) • Defense in Depth • Score: 95/100
Verify Explicitly
Every request authenticated via JWT. No implicit trust based on network location.
Least Privilege
Three roles (admin, analyst, viewer). 25+ RLS policies enforce at the DB level.
Assume Breach
All internal comms encrypted. Secrets via env vars, never hardcoded.
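To make "Verify Explicitly" concrete, here is a minimal sketch of per-request JWT verification using only the Python standard library. It assumes HS256 (HMAC-SHA256) signing and an `exp` claim; the source does not state which JWT algorithm or library ThreatForge uses, and `sign_token` and `verify_token` are hypothetical names. A production service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token (assumed algorithm) for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _b64url_encode(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
        expected = _sign(header_b64 + "." + payload_b64, secret)
    except (ValueError, TypeError):
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    # Pin the algorithm instead of trusting whatever the header requests.
    if header.get("alg") != "HS256":
        return None
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature, expected):
        return None
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        return None
    return claims
```

Every handler would call `verify_token` before doing any work, and then check the role claim (admin, analyst, or viewer) against what the operation requires; network location plays no part in the decision.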
Detection Engine
8-Step Analysis Pipeline • Random Forest (n=100) • F1=0.94 • AUC=0.97
Layer 1
Deterministic
YARA signature matching, with O(1) average-case hash lookups
Layer 2
Probabilistic
Random Forest classifier: 79 PE features → confidence score
Layer 3
External Intel
Live VirusTotal API feeds for cross-referencing
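The hash lookup in Layer 1 can be sketched as a set membership test. This is an illustration under assumptions: the source does not say which hash function or storage the platform uses, so SHA-256 and an in-memory `set` are stand-ins, and `KNOWN_BAD` and `is_known_bad` are hypothetical names.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the file contents; this step is linear in file size."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical blocklist of known-malicious digests (placeholder content).
KNOWN_BAD = {sha256_hex(b"malicious sample bytes")}


def is_known_bad(data: bytes, blocklist=KNOWN_BAD) -> bool:
    """Set membership is O(1) on average, regardless of blocklist size."""
    return sha256_hex(data) in blocklist
```

Note that only the lookup is constant time. Computing the digest still reads every byte of the file, so the deterministic layer as a whole stays linear in input size.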
File Metadata
MIME type, size, creation date extraction
Shannon Entropy
H(X) = -Σ P(xᵢ)·log₂ P(xᵢ): randomness detection (high entropy suggests packed or encrypted content)
PE Header Analysis
Entry point, sections, imports via pefile
YARA Rule Scan
Aho-Corasick automaton: O(n+m+z) matching
ML Prediction
Random Forest (100 trees, 79 features) → confidence
Stego Detection
LSB analysis, chi-square test on images
Network Analysis
PCAP anomaly detection, flow statistics
Threat Scoring
Weighted aggregation → score 0-100
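The Shannon Entropy step in the pipeline above can be sketched directly from its formula. This is a minimal version computed over byte frequencies, so the result is in bits per byte and ranges from 0 to 8; the function name `shannon_entropy` is illustrative, and any threshold the platform applies to the result is not stated in the source.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """H(X) = -sum(p_i * log2(p_i)) over the byte values present in data."""
    if not data:
        return 0.0
    n = len(data)
    # Counter yields only byte values that occur, so p is never zero.
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
```

A buffer containing a single repeated byte scores 0, while one in which all 256 byte values occur equally often scores 8. Compressed or encrypted sections tend toward the upper end, which is why entropy is a useful packing signal.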
ML Model Performance
| Model | Accuracy | Precision | Recall | F1 | AUC |
|---|---|---|---|---|---|
| Malware Detection | 0.96 | 0.95 | 0.93 | 0.94 | 0.97 |
| Network Anomaly | 0.93 | 0.92 | 0.91 | 0.91 | 0.95 |
| Steganography | 0.91 | 0.89 | 0.90 | 0.89 | 0.93 |
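As a consistency check on the table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall columns; the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
reported = {
    "Malware Detection": (0.95, 0.93),
    "Network Anomaly": (0.92, 0.91),
    "Steganography": (0.89, 0.90),
}

recomputed = {
    model: round(f1_score(p, r), 2) for model, (p, r) in reported.items()
}
```

Rounded to two decimals, the recomputed values agree with the F1 column for all three models.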
Database Design
PostgreSQL 16 • 14 Tables • BCNF Normalized • 25+ RLS Policies
Atomicity
Write-Ahead Log
Consistency
CHECK + FK refs
Isolation
MVCC + SSI
Durability
WAL + fsync
Identity
profiles, user_sessions, security_preferences, ip_whitelist
4 tables
Scanning
scans, scan_files, findings, rule_matches
4 tables
Rules
yara_rules
1 table
Access
api_keys, audit_logs, activity_logs
3 tables
Comms
notifications, notification_preferences
2 tables
API Reference
RESTful (Fielding, 2000) • 35+ Endpoints • JWT Auth • Rate Limited
Future Roadmap
Federated Learning • Graph Neural Networks • Autonomous Response
Federated Learning
Privacy-preserving ML: train locally, share only gradients. FedAvg with differential privacy.
Graph Neural Networks
Model threat relationships as graphs. Message-passing framework for attack campaign identification.
Autonomous Response
SOAR integration: auto-quarantine, firewall rules, SIEM integration (Splunk, Elastic).
Advanced Detection
Dynamic sandbox, YARA-X (Rust), STIX/TAXII feeds, browser extension.
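For the Federated Learning item, the aggregation step might look like the sketch below. It is a simplified illustration of FedAvg with norm clipping and optional Gaussian noise, written in pure Python; `clip_update` and `fed_avg` are hypothetical names, and the noise scale here is not calibrated to any privacy budget, which a real differentially private deployment would require.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def fed_avg(updates, sample_counts, max_norm=1.0, noise_std=0.0, rng=None):
    """Average client updates, weighted by each client's sample count.

    Clipping bounds any single client's influence; Gaussian noise added
    to the average is the usual differential privacy mechanism. The
    server only sees updates, never the clients' raw data.
    """
    total = sum(sample_counts)
    avg = [0.0] * len(updates[0])
    for update, n in zip(updates, sample_counts):
        for i, x in enumerate(clip_update(update, max_norm)):
            avg[i] += x * n / total
    if noise_std > 0:
        rng = rng or random.Random()
        avg = [x + rng.gauss(0.0, noise_std) for x in avg]
    return avg
```

With two clients holding 1 and 3 samples, the second client's update carries three times the weight of the first in the aggregate.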
Full Documentation
This page covers the key architectural decisions. For the complete 23-chapter compendium with mathematical proofs and code-level details, visit our Notion workspace.