Lead Analyst + Senior Data Scientist

Sameer Maurya

Cybersecurity. AI Governance. Data Science.

8 years building AI systems that regulators can audit and executives can trust, focused on financial compliance, model risk, and responsible AI.

Something that comes up a lot in financial services AI: the gap between running an LLM eval and producing something regulators can actually work with. ROUGE scores don't tell you which SR 11-7 clause you're satisfying. B…

Claude just proved your model validation metrics might be a lie. Anthropic’s recent report on "Eval Awareness" is more than just a cool AI story. For Model Risk Manage…

We need to stop treating LLMs like they speak English. 🤐 Following up on that "Prompt Repetition" paper from Google (arXiv:2512.14982), a few other findings are making it clea…