# Finance Foundation Models
Summary: A new class of proprietary foundation models trained on financial transaction data, delivering step-change uplifts over bespoke ML models across credit, fraud, and personalization.
Sources: raw/articles/simon-taylor-2026-04-26.md, raw/call-notes/carlos-2026-05-10.md, raw/call-notes/shrikant-2026-05-11.md
Last updated: 2026-05-17
## Overview
Finance foundation models are transformer-based models trained on large corpora of financial behavioral data (transactions, events, clickstreams) rather than text. They are not LLMs — they don’t generate language, and they don’t compete with GPT or Claude. They learn dense representations of customer behavior that can be fine-tuned across multiple downstream tasks.
The analogy: what BERT did for text in 2018, these models are now doing for financial event sequences.
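The interface this enables can be sketched in miniature. The event vocabulary, embedding table, and mean pooling below are toy stand-ins, not PRAGMA's or nuFormer's actual architecture; the point is the shape of the idea: raw behavioral events in, one dense customer representation out, shared by many task heads.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy event vocabulary: each banking event type gets an integer id.
EVENT_VOCAB = {"card_purchase": 0, "atm_withdrawal": 1, "login": 2, "p2p_transfer": 3}
EMBED_DIM = 16

# Stand-in for a pretrained embedding table (in practice, learned by the
# foundation model over billions of events).
embedding_table = rng.normal(size=(len(EVENT_VOCAB), EMBED_DIM))

def embed_customer(event_sequence):
    """Map a sequence of event names to one dense customer vector.

    Mean pooling stands in for the transformer encoder here; the real model
    would attend over the ordered sequence.
    """
    ids = [EVENT_VOCAB[e] for e in event_sequence]
    return embedding_table[ids].mean(axis=0)

customer_vec = embed_customer(["login", "card_purchase", "card_purchase", "p2p_transfer"])

# The same representation feeds many task heads (credit, fraud, churn, ...).
fraud_head = rng.normal(size=EMBED_DIM)          # hypothetical fraud classifier head
fraud_score = 1 / (1 + np.exp(-customer_vec @ fraud_head))  # sigmoid -> (0, 1)
```

The design choice worth noticing: the expensive part (the encoder and its training data) is amortized across every downstream task, which is exactly why one foundation model can replace a shelf of bespoke models.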
## Known Models (as of 2026)
| Model | Company | Training Data | Key Results |
|---|---|---|---|
| PRAGMA | Revolut | 24B banking events, 26M customers, 111 countries | +130% credit scoring, +65% fraud recall |
| nuFormer | Nubank | 100B+ transactions, 100M+ customers | GPT-like architecture; narrower use cases |
| LTM | Mastercard | Billions of card transactions | Cyber risk identification |
| UPI Help | NPCI (India) | UPI transactions | Fine-tuned Mistral 24B; conversational agent for 400M+ UPI users |
| Nemo-4-PayPal | PayPal | PayPal shopping data | Fine-tuned llama3.1-nemotron-nano-8B-v1; 45% cheaper to run; 2 weeks of fine-tuning |
## Why Behavioral Data Is Powerful
Pavel Nesterov (PRAGMA author, formerly in ad-tech): in ad-tech, roughly 50 clicks plus the links a user visited were enough to estimate their age, income, and number of children. The same principle applies to banking event sequences: behavioral features contain almost everything.
This is the same insight that makes Google and Meta so powerful. Revolut is applying it to banking data.
## What These Models Replace
A single foundation model replaces multiple bespoke ML models, each hand-crafted by a data science team:
- Credit scoring models
- Fraud detection models
- Marketing/propensity models
- Product recommendation models
- Churn models
The key shift: instead of building a new model per task, fine-tune one foundation model per task using LoRA. This is dramatically faster and often outperforms the specialized models.
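The economics of that shift come from LoRA's parameter count. A minimal numpy sketch of the mechanism (illustrative dimensions, not any bank's actual training code): the pretrained weight stays frozen, and each task trains only two small low-rank matrices.

```python
import numpy as np

rng = np.random.default_rng(42)
D_IN, D_OUT, RANK = 64, 32, 4  # illustrative dimensions

# Frozen pretrained weight from the foundation model (never updated per task).
W_frozen = rng.normal(size=(D_IN, D_OUT))

# Per-task LoRA adapter: two small trainable matrices. B starts at zero, so
# at initialization the adapted model matches the frozen one exactly.
A = rng.normal(size=(RANK, D_OUT)) * 0.01
B = np.zeros((D_IN, RANK))
alpha = 8.0  # LoRA scaling factor

def lora_forward(x):
    """y = x W + (alpha / rank) * x B A; only A and B are trained per task."""
    return x @ W_frozen + (alpha / RANK) * (x @ B @ A)

x = rng.normal(size=(1, D_IN))
y = lora_forward(x)

# Trainable parameters per task: rank * (d_in + d_out), a fraction of the
# full weight's d_in * d_out.
trainable = A.size + B.size
```

With rank 4 against a 64x32 layer, the adapter trains 384 parameters instead of 2,048; at foundation-model scale that ratio is what makes "one fine-tune per task" cheap enough to replace bespoke models.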
## Competitive Implications
- Custom finance foundation models = proprietary IP moat
- Banks with better behavioral embeddings will price credit more aggressively, catch more fraud, and cross-sell more effectively
- This compounds: the more customers, the better the model, the better the outcomes, the more customers
- Traditional banks have the data but not the execution speed; neobanks have the speed
## Why Banks Haven’t Done This Yet
- Execution culture — traditional banks move slowly; committee-based decisions, formal change management
- Data infrastructure — modern ML pipelines require clean, accessible data; legacy banks spend months just finding and scrubbing training data
- Risk aversion — 99% accuracy required vs. 95% acceptable at tech-first firms (source: carlos-call)
- Explainability concerns — regulators require interpretability for credit decisions; Shrikant noted this specifically for decisioning teams (source: shrikant-call, jie-capital-one-call)
## What’s Coming
The generative version: a model that doesn’t just predict a customer’s next action but generates the full sequence of future events. This enables:
- Simulating when a customer will take a product
- Identifying what conditions lead to that decision
- Engineering those conditions
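The workflow above can be sketched as autoregressive sampling. The transition table below is a hand-made toy (invented event types and probabilities, standing in for a learned generative model): sample many plausible futures per customer, then read product-adoption timing off the simulations.

```python
import random

random.seed(7)

# Toy learned model: P(next event | current event). Illustrative numbers only;
# a real generative model would condition on the full history.
TRANSITIONS = {
    "salary_deposit":   {"card_purchase": 0.6, "savings_transfer": 0.3, "loan_inquiry": 0.1},
    "card_purchase":    {"card_purchase": 0.7, "salary_deposit": 0.2, "loan_inquiry": 0.1},
    "savings_transfer": {"card_purchase": 0.5, "salary_deposit": 0.4, "loan_inquiry": 0.1},
    "loan_inquiry":     {"card_purchase": 0.5, "salary_deposit": 0.4, "savings_transfer": 0.1},
}

def generate_future(start_event, horizon):
    """Autoregressively sample one plausible future event sequence."""
    seq, current = [], start_event
    for _ in range(horizon):
        events, probs = zip(*TRANSITIONS[current].items())
        current = random.choices(events, weights=probs, k=1)[0]
        seq.append(current)
    return seq

# Monte Carlo over many sampled futures: estimate the chance a product event
# ("loan_inquiry") appears within the horizon.
futures = [generate_future("salary_deposit", horizon=12) for _ in range(1000)]
p_loan_within_horizon = sum("loan_inquiry" in f for f in futures) / len(futures)
```

This is the step from prediction to simulation: a next-action classifier gives one label, while sampled full sequences let you ask *when* the product event tends to occur and what precedes it, which is the lever for engineering those conditions.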
## Infrastructure
See nvidia-finance for the tooling stack (H100s, NeMo AutoModel, cuDF, TensorRT-LLM).