The Fed Just Rewrote the Rulebook for Bank Supervision
The Federal Reserve's November 2025 Statement of Supervisory Operating Principles signals a seismic shift — from checkbox compliance to material risk. Here's what changed and why it matters.
Cybersecurity · AI Governance · Data Science
8 years building AI systems that regulators can audit and executives can trust, focused on financial compliance, model risk, and responsible AI.
Data Science: Fast.ai uncovered something strange in LLM fine-tuning: training loss dropped suddenly after just one pass through the data, suggesting models can memorize inputs almost immediately. Here's what it means.
AI Governance: A comprehensive framework for analyzing open-source GenAI across near, mid, and long-term development stages, and why the benefits generally outweigh the risks when governance keeps pace.
AI Governance: LLMs used as evaluators show signs of bias in about 40% of comparisons on average, and their rankings reach an average Rank-Biased Overlap (RBO) of only 49.6% with human preferences. The CoBBLEr benchmark quantifies exactly how and where these biases emerge.
Data Science: LLM agents hit 94% success on basic web tasks but drop to 25% on compositional tasks that combine multiple steps. The CompWoB benchmark exposes exactly where and why this happens.
Case Studies: An EDA of 2007–2011 lending data to identify the driving factors behind loan defaults: amount-to-income ratios, revolving utilisation, derogatory records, and loan purpose all tell a story.
Re-training LLMs from scratch when new data arrives is prohibitively expensive. Three simple strategies — LR re-warming, LR re-decaying, and minimal data replay — match the performance of full re-training at a fraction of the cost.
Traditional ensemble methods fail when correct answers are in the minority. AoR introduces hierarchical reasoning chain evaluation and dynamic sampling to fix this — and consistently outperforms standard approaches.