# Ex-OpenAI CTO Mira Murati's Startup Fixed a Major 'Unfixable' AI Bug

*September 17, 2025*

OpenAI's former CTO Mira Murati's new startup recently published its first research paper, and it fixes something that has been bugging AI engineers since ChatGPT launched.

The paper from Murati's startup [Thinking Machines Lab](https://www.eweek.com/news/mira-murati-ai-startup-thinking-machines-lab/) (or just "Thinky," for those in the know) is titled "[Defeating Nondeterminism in LLM Inference](https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/)" and tackles the problem of reproducibility in language models like ChatGPT.

A few definitions before we start:

- **Nondeterminism:** Getting different answers when you ask ChatGPT the same question twice with identical settings.
- **LLM:** A large language model, such as ChatGPT, Claude, or Gemini.
- **Inference:** The process in which a language model generates a response to your question.

## The problem

Even when you set an AI model to its most predictable setting (temperature = 0), you can still get different answers to the same question. Engineers have been pulling their hair out, assuming this was just "one of those computer things" that couldn't be fixed.
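Floating-point arithmetic was the usual suspect because it is genuinely order-sensitive: computer addition is not associative, so summing the same numbers in a different order can round to a different result. A minimal illustration with ordinary Python floats (my example, not from the paper):

```python
# Floating-point addition is not associative: regrouping the same
# three terms changes how the intermediate results are rounded.
a = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds up to 0.30000000000000004
b = 0.1 + (0.2 + 0.3)   # 0.2 + 0.3 rounds to exactly 0.5

print(a)        # 0.6000000000000001
print(b)        # 0.6
print(a == b)   # False
```

On a GPU running thousands of additions in parallel, the summation order can differ from run to run, so engineers assumed this rounding noise was the whole story and that nothing could be done about it.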
It turns out they were wrong.

## The solution

The discovery came from Horace He, a wizard of the PyTorch machine-learning framework who recently jumped from Meta to join Murati's team. He is the developer behind torch.compile, the feature that can make AI models run 2-4x faster with one line of code.

He and his team discovered that the real culprit isn't the usual suspect, the floating-point rounding that engineers typically blame; instead, it's a lack of "batch invariance." Think of it like this:

Imagine ordering the same coffee at Starbucks, but it tastes different depending on how many other customers are in line. That's essentially what's happening with AI models.

When an AI server is busy handling lots of requests, it processes them in batches. Your request gets bundled with others because that's more efficient, and the size of the batch changes how the underlying math is carried out, which changes your specific answer even though it shouldn't. Follow the logic, and the busier the server, the more your results vary.

This also happens in real life. Have you ever been to your local Starbucks during the coffee rush hour?
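Numerically, the batch effect is the floating-point grouping problem in disguise: when the batch size changes, GPU kernels split their internal sums into differently sized pieces, and different groupings round differently. A deliberately contrived sketch of the idea (the paper's actual analysis covers real matrix-multiply and attention kernels):

```python
# The same three terms, reduced under two groupings, as if a kernel
# had split its work differently for two different batch sizes.
big = 1e16    # one huge term
eps = 1.0     # one tiny term, smaller than the rounding step near 1e16

grouping_a = (big - big) + eps   # the huge terms cancel first -> 1.0
grouping_b = big + (eps - big)   # eps - big rounds back to -big -> 0.0

print(grouping_a)   # 1.0
print(grouping_b)   # 0.0
```

Thinky's batch-invariant kernels force the same reduction layout regardless of how many requests share the batch, so the grouping, and therefore the rounding, never changes.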
Unless you have a god-tier barista, your order might not taste the same!

### Why this matters

This matters because of three problems:

- AI companies doing research can't reliably reproduce their own experiments.
- Businesses using AI for critical decisions get inconsistent results.
- Training new AI models becomes far more expensive when you can't trust your outputs.

Thinky released its solution as [open-source code](https://github.com/thinking-machines-lab/batch_invariant_ops), true to Murati's promise that "science is better when shared." The team calls its approach "batch-invariant kernels," which essentially teaches AI servers to give you the same coffee regardless of the line.

## Why the AI giants should be nervous

This is just the appetizer from a team [that recently raised $2 billion](https://www.eweek.com/news/thinking-machines-2b-multimodal-ai/) without even having a product (although they are working on one, internally called "[RL for businesses](https://www.theinformation.com/articles/ex-openai-cto-muratis-startup-plans-compete-openai-others)," which customizes models for a company's specific business metrics and sounds very cool).

If fixing a supposedly unfixable problem is their opening move, the AI giants should probably be nervous (though the code is open, so "yum yum yum," as the hungry, hungry AI labs say).

***Editor's note:*** *This content originally ran in our sister publication, [The Neuron](https://www.theneurondaily.com/p/did-ex-openai-cto-mira-s-12b-startup-just-solve-ai-s-biggest-bug).*
*To read more from The Neuron, [sign up for its newsletter here](https://www.theneuron.ai/newsletter).*

The post [Ex-OpenAI CTO Mira Murati's Startup Fixed a Major 'Unfixable' AI Bug](https://www.eweek.com/news/ai-bug-fix-mira-murati-thinking-machine-lab/) appeared first on [eWEEK](https://www.eweek.com/).