Project · Technical Deep Dive

Autek AI

AI-powered regulatory audit platform for Mexican financial institutions, covering SPEI/Banxico (Apéndice M, 214 controls), PCI DSS v4.0 (SAQ D, 243 controls; SAQ A, 21 controls), and custom frameworks. It combines retrieval-augmented generation (RAG) over the regulatory corpus, multi-agent orchestration with LangGraph, and an autonomous evidence collector that connects via SSH to servers, network devices, and Kubernetes clusters.

Why this exists

Compliance audits at Mexican financial institutions are slow, repetitive, and evidence-hungry. Each audit drags a team of auditors through hundreds of controls (214 for SPEI Apéndice M alone, 243 for PCI DSS SAQ D), each requiring written justification, supporting documents, and technical evidence from production systems.

Most of that work is mechanical: read the control, find the matching evidence, decide if it complies, write the justification. Exactly the kind of work that LLMs and agents can take on — as long as the system can cite the source, reach the systems where the evidence lives, and let the auditor stay in control of the verdict.

How it's wired

┌──────────────────────────────────────────────────────────────────┐
│                       Frontend (Next.js 15)                      │
│             App Router · SSE streaming · RBAC UI                 │
└──────────────────────────────┬───────────────────────────────────┘
                               │ HTTP / SSE
┌──────────────────────────────▼───────────────────────────────────┐
│                      Backend (FastAPI · Python 3.12)             │
│                                                                  │
│   ┌────────────────────────────────────────────────────────┐     │
│   │              LangGraph Orchestrator                    │     │
│   │   route_query  →  retrieve_context  →  generate        │     │
│   │                                                        │     │
│   │   Sub-agents:                                          │     │
│   │   • Regulatory Agent       (RAG over corpus)           │     │
│   │   • Checklist Assistant    (per-control guidance, SSE) │     │
│   │   • Evidence Analyzer      (PDF/image → structured)    │     │
│   │   • Auto-Eval Agent        (proposes verdicts)         │     │
│   │   • Evidence Collector     (SSH → servers / K8s)       │     │
│   └────────────────────────────────────────────────────────┘     │
└──────┬──────────────────────┬────────────────────────┬───────────┘
       │                      │                        │
┌──────▼──────┐       ┌───────▼──────┐         ┌───────▼──────────┐
│ PostgreSQL  │       │   Qdrant     │         │   AWS S3         │
│  16 (async) │       │  Vector DB   │         │  corpus +        │
│             │       │              │         │  evidence +      │
│ Audits /    │       │ text-embed-  │         │  reports         │
│ Checklists/ │       │ ding-3-large │         └──────────────────┘
│ Evidence /  │       │ cosine, k=12 │
│ Teams / RBAC│       └──────────────┘
└─────────────┘
                                    ┌─────────────────────────────┐
                                    │   Ingestion Worker          │
                                    │   PDF → text → chunk (1000) │
                                    │   → embed → Qdrant          │
                                    │   structure-preserving      │
                                    │   (Banxico articles,        │
                                    │    PCI DSS Requirements)    │
                                    └─────────────────────────────┘
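
The orchestrator flow in the diagram (route_query → retrieve_context → generate) can be pictured as three functions passing a shared state. The following is a library-free sketch of that flow, not LangGraph's API and not the project's actual code: the state fields, the keyword routing rule, and the stubbed retrieval and generation steps are all assumptions made for illustration.

```python
from typing import Callable, TypedDict


class AuditState(TypedDict, total=False):
    """Shared state passed between steps (field names are hypothetical)."""
    query: str
    agent: str
    context: list[str]
    answer: str


def route_query(state: AuditState) -> AuditState:
    # Hypothetical rule: queries mentioning evidence go to the analyzer,
    # everything else goes to the regulatory RAG agent.
    agent = "evidence_analyzer" if "evidence" in state["query"].lower() else "regulatory"
    return {**state, "agent": agent}


def retrieve_context(state: AuditState) -> AuditState:
    # Stand-in for a vector search; a real step would query the vector store.
    return {**state, "context": [f"[stub chunk for: {state['query']}]"]}


def generate(state: AuditState) -> AuditState:
    # Stand-in for the LLM call: report how many chunks ground the answer.
    answer = f"{state['agent']} answer grounded in {len(state['context'])} chunk(s)"
    return {**state, "answer": answer}


PIPELINE: list[Callable[[AuditState], AuditState]] = [route_query, retrieve_context, generate]


def run(query: str) -> AuditState:
    """Run the three steps in order, threading the state through each one."""
    state: AuditState = {"query": query}
    for step in PIPELINE:
        state = step(state)
    return state
```

Keeping each step a pure function of the state makes the steps easy to test in isolation; a graph framework adds branching, streaming, and persistence on top of the same idea.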

The interesting bits

Regulatory RAG with citations

Semantic search over Banxico circulars and PCI DSS docs. Chunking preserves regulatory structure (article numbers, Requirement IDs) so generated answers carry verifiable citations. Retrieval returns the top 12 chunks by cosine similarity over text-embedding-3-large embeddings (3072 dimensions).
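
To make the idea of structure-preserving chunking concrete, here is a minimal sketch using only the standard library. It assumes headings of the form "Artículo N" or "Requirement N.N.N" at the start of a line and treats the 1000 in the ingestion diagram as a character limit; both are assumptions, and the real corpus formatting and pipeline may differ.

```python
import re
from dataclasses import dataclass

# Assumed heading patterns; real Banxico / PCI DSS formatting may differ.
HEADING = re.compile(
    r"^(?P<ref>(?:Artículo|Article)\s+\d+|Requirement\s+\d+(?:\.\d+)*)\b",
    re.MULTILINE,
)


@dataclass
class Chunk:
    citation: str  # e.g. "Requirement 8.3.1"
    text: str


def chunk_by_structure(doc: str, max_chars: int = 1000) -> list[Chunk]:
    """Split at regulatory headings, then cap each piece at max_chars.

    Text before the first heading is skipped in this sketch.
    """
    matches = list(HEADING.finditer(doc))
    chunks: list[Chunk] = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(doc)
        section = doc[m.start():end].strip()
        # Long sections are split further, but every piece keeps its citation.
        for start in range(0, len(section), max_chars):
            chunks.append(Chunk(m.group("ref"), section[start:start + max_chars]))
    return chunks
```

Because the citation travels with every chunk as metadata, a retrieved passage can be quoted with its article or Requirement ID even when it comes from the middle of a long section.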

Auto-Evaluation Agent

Reads the evidence attached to a control and proposes a COMPLIANT / NON_COMPLIANT verdict with justification and confidence level. The auditor accepts or rejects — the human stays in the loop on every decision.
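
One way to model "the agent proposes, the auditor decides" is to keep the proposal and the decision as separate states. The sketch below is an assumption about how that could look, not the project's schema: the `Status` enum, the field names, and the string confidence levels are hypothetical.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Verdict(str, Enum):
    COMPLIANT = "COMPLIANT"
    NON_COMPLIANT = "NON_COMPLIANT"


class Status(str, Enum):
    PROPOSED = "PROPOSED"
    ACCEPTED = "ACCEPTED"
    REJECTED = "REJECTED"


@dataclass(frozen=True)
class VerdictProposal:
    control_id: str
    verdict: Verdict
    justification: str
    confidence: str                      # assumed levels: "low" | "medium" | "high"
    status: Status = Status.PROPOSED
    decided_by: Optional[str] = None


def decide(proposal: VerdictProposal, auditor: str, accept: bool) -> VerdictProposal:
    """Record the auditor's decision; a proposal can only be decided once."""
    if proposal.status is not Status.PROPOSED:
        raise ValueError("proposal already decided")
    status = Status.ACCEPTED if accept else Status.REJECTED
    return replace(proposal, status=status, decided_by=auditor)


def persisted_verdict(proposal: VerdictProposal) -> Optional[Verdict]:
    """Only an accepted proposal yields a verdict to store."""
    return proposal.verdict if proposal.status is Status.ACCEPTED else None
```

With this shape, nothing the model produces can reach the audit record unless `decide` has been called with an auditor's identity.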

SSH Evidence Collector

Agent that connects to Linux/Windows servers, Cisco gear, and Kubernetes clusters, runs predefined command profiles (PCI DSS, Apéndice M), stores outputs to S3, and auto-links them to controls. Closes the gap between "we have a policy" and "the policy is actually enforced in production".
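
A command profile can be thought of as a mapping from control IDs to read-only commands. The sketch below shows that shape under stated assumptions: the profile contents, the control-to-command mapping, the storage key layout, and the `Runner` callable (which stands in for the SSH transport) are illustrative and not taken from the project.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

# Hypothetical profile: control ID -> read-only commands that evidence it.
PCI_LINUX_PROFILE: dict[str, list[str]] = {
    "8.3.9": ["grep -E '^PASS_(MAX|MIN)_DAYS' /etc/login.defs"],
    "10.2.1": ["systemctl is-active auditd"],
}

# (host, command) -> stdout. In production this would wrap an SSH client.
Runner = Callable[[str, str], str]


@dataclass
class EvidenceRecord:
    control_id: str
    host: str
    command: str
    output: str
    sha256: str
    storage_key: str
    collected_at: str


def collect(host: str, profile: dict[str, list[str]], run: Runner,
            audit_id: str) -> list[EvidenceRecord]:
    """Run every command in the profile and link each output to its control."""
    records: list[EvidenceRecord] = []
    for control_id, commands in profile.items():
        for command in commands:
            output = run(host, command)
            digest = hashlib.sha256(output.encode()).hexdigest()
            # Assumed key layout; the hash lets a reviewer verify the stored output.
            key = f"audits/{audit_id}/evidence/{control_id}/{host}/{digest[:12]}.txt"
            records.append(EvidenceRecord(
                control_id, host, command, output, digest, key,
                datetime.now(timezone.utc).isoformat(),
            ))
    return records
```

Injecting the runner keeps the collection logic testable without a live host: a fake runner returns canned output, while a real one would execute the command over SSH and upload the output under `storage_key`.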

Evidence Analyzer

On upload, classifies the document (policy, log, diagram, screenshot), extracts key facts, detects validity period, and suggests which controls it satisfies — surfaced in a sliding panel for the auditor to confirm.

Per-control AI assistance (streaming)

For every checklist item, an SSE-streamed assistant explains the control in plain Spanish, lists required evidence, flags common gaps, and offers remediation steps — grounded in the regulatory corpus.
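
For readers unfamiliar with SSE, each message is a small text frame ending in a blank line. The sketch below shows how model tokens could be framed and streamed; the event names (`token`, `done`), the payload shape, and the fake token source are assumptions, not the project's wire format.

```python
import asyncio
import json
from typing import AsyncIterator


def sse_event(data: dict, event: str = "message") -> str:
    """Serialize one Server-Sent Event frame (terminated by a blank line)."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


async def stream_guidance(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    # Forward each model token as its own frame, then signal completion.
    async for token in tokens:
        yield sse_event({"token": token}, event="token")
    yield sse_event({"done": True}, event="done")


async def fake_tokens() -> AsyncIterator[str]:
    # Stand-in for the LLM token stream.
    for token in ["El", " control", " exige"]:
        yield token


async def collect_frames() -> list[str]:
    """Drain the stream into a list, which is convenient for testing."""
    return [frame async for frame in stream_guidance(fake_tokens())]
```

In a FastAPI handler, a generator like `stream_guidance` could be returned through `StreamingResponse` with `media_type="text/event-stream"`. JSON-encoding the payload keeps newlines inside a token from breaking the frame.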

Built-in regulatory templates

Ships with SPEI Apéndice M (214 controls), PCI DSS v4.0 SAQ D (243), and SAQ A (21). Custom templates can be loaded from JSON or CSV.
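
A custom template loader might look like the sketch below. The schema is hypothetical (`control_id`, `title`, and an optional `description`), since the actual column and key names are not given here; the point is that both formats normalize to the same control list and are validated before use.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Control:
    control_id: str
    title: str
    description: str = ""


REQUIRED = ("control_id", "title")  # hypothetical required fields


def _to_control(row: dict) -> Control:
    missing = [f for f in REQUIRED if not (row.get(f) or "").strip()]
    if missing:
        raise ValueError(f"control is missing fields: {missing}")
    return Control(
        row["control_id"].strip(),
        row["title"].strip(),
        (row.get("description") or "").strip(),
    )


def load_template(text: str, fmt: str) -> list[Control]:
    """Parse a JSON array of objects or a CSV with a header row."""
    if fmt == "json":
        rows = json.loads(text)
    elif fmt == "csv":
        rows = list(csv.DictReader(io.StringIO(text)))
    else:
        raise ValueError(f"unsupported format: {fmt}")
    controls = [_to_control(r) for r in rows]
    ids = [c.control_id for c in controls]
    # Duplicate IDs would make evidence linking ambiguous, so reject them early.
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate control_id in template")
    return controls
```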

Technology

Layer            Technologies
Frontend         Next.js 15 (App Router), TypeScript, Tailwind, SSE
Backend          Python 3.12, FastAPI, SQLAlchemy 2.0 (async), Alembic, Pydantic
AI / RAG         LangGraph, LangChain, Claude / GPT-4o, OpenAI text-embedding-3-large
Vector Store     Qdrant (semantic search, cosine distance)
Database         PostgreSQL 16 (async drivers, 10+ Alembic migrations)
Storage          AWS S3 (LocalStack in dev) for corpus, evidence, reports
Auth             JWT (access + refresh), bcrypt, multi-team RBAC
Infra            Docker Compose (local & on-prem), AWS ECS Fargate + RDS + ALB + EFS
IaC              Terraform ≥ 1.5 (AWS provider ~> 5.0)
Testing & Lint   pytest, pytest-asyncio, httpx, ruff

Decisions worth calling out

  • Structure-aware chunking. Regulatory text loses meaning when chunked naively. The ingestion pipeline detects Banxico article boundaries and PCI DSS Requirement IDs, so retrieved chunks always carry the citation needed to back the answer.
  • Human in the loop, by default. Every AI verdict — auto-eval, evidence classification, suggested control linking — is a proposal. The auditor's click is what writes it to the database. AI as accelerator, not authority.
  • Reaching the real system. Compliance evidence usually lives in production: firewall configs, IAM policies, kernel params, K8s RBAC. The SSH collector turns "ask the sysadmin for a screenshot" into a one-click action, tied to the control it satisfies.
  • Multi-LLM provider. An LLM_PROVIDER setting switches between OpenAI and Anthropic without code changes, which is useful for cost/latency tuning and for clients with specific data-residency or vendor constraints.
  • Two deployment modes. The same Docker stack runs on-prem (for regulated customers that can't send compliance data to a vendor cloud) and on AWS ECS Fargate (turnkey SaaS). Terraform covers the cloud path end-to-end.
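
The LLM_PROVIDER toggle described in the list above can be sketched as a small lookup resolved from the environment. Only the variable name and the two providers come from the text; the default provider, the API key variable names, and the Claude model string (a placeholder) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str
    api_key_env: str


# The Claude model name is a placeholder; the text does not specify one.
PROVIDERS = {
    "openai": LLMConfig("openai", "gpt-4o", "OPENAI_API_KEY"),
    "anthropic": LLMConfig("anthropic", "claude-model-placeholder", "ANTHROPIC_API_KEY"),
}


def resolve_llm(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Pick the provider config from LLM_PROVIDER (assumed default: openai)."""
    name = env.get("LLM_PROVIDER", "openai").strip().lower()
    try:
        return PROVIDERS[name]
    except KeyError:
        raise ValueError(
            f"LLM_PROVIDER must be one of {sorted(PROVIDERS)}, got {name!r}"
        ) from None
```

Failing fast on an unknown value at startup is usually preferable to discovering a misconfigured provider on the first audit query.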