Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
Dave, I love your content, but this video doesn't really reflect the actual state of the art in research. The companies need their grandiose claims that "AGI/ASI is around the corner" to keep the hype and the investor money coming, but with the fundamental limitations of transformer-based LLMs (and we DO know how they work*), these systems can never come close to even acting like real intelligence.

These models are no more than probabilistic text generators, and the closest thing to reasoning they can do is "writing text that sounds like someone reasoning". That, in turn, gives surprising results on tasks that were extensively represented in the training data, but in practice there's no actual thought process going on. If there were, adversarial attacks wouldn't be able to completely destroy the logic of the machine's output just by playing with the mathematical weight of certain words. These things fail on puzzles whose logic chain was unseen in training, they hallucinate data instead of admitting that they don't know (because they CAN'T know; it's just coherent text generation), and so on through a long list of situations where the "reasoning" falls apart.

Throwing billions at more GPUs, more extensive training datasets, and other generative systems (like video) won't ever change the fact that these machines are, by design, stupid. Because it's all probabilistic text generation, their outputs may also appear sentient and autonomous, but that's a plain illusion of language. Take, for example, OpenAI's o3 (their flagship reasoning model for a long time). After reviewing my code, that model used to claim that it had "tested it and got these numerical results": this was code I wrote to train models, and it takes a full day to run. It simply hallucinated that "I tested the code and got this" is a sensible thing to say in the context of a code review.

We already have very real problems with the regulation of these AI systems (the generation of pornographic deepfakes, or the use of computer vision for military purposes, for example), but we don't need a sci-fi extinction narrative to denounce them. Right now, the bigger problem is people trusting models to be useful (e.g. software companies firing employees in favour of AI code generation, when most of the time it fails to reliably pinpoint a bug), not the models being too intelligent.

* The hard part of explainability is not understanding the algorithm. We know the architecture and how the parameters get adjusted during training. The hard part, put simply, is that when you learn billions of numbers to optimize text output, it's hard to know which numbers correspond to which learned pattern. That's the field of Explainable AI, and it's all about designing architectures whose predictions are easy to explain by construction, which is hugely important in fields like medicine.

EDIT: Taking a look at Control AI's page, I saw this video by Altman (https://x.com/ai_ctrl/status/1973056226381414820). I think it's really telling with respect to my first point about grandiose claims. He's already pushing the narrative that GPT5 is outperforming humans, which should be a good indicator that they're plateauing in what they can get out of language models. The claim is based, of course, on some of the most standardized and benchmarkable contests in each area (in this case, the ICPC): these are not a good indicator of real-world performance, as this type of contest problem and algorithm is insanely overrepresented in training data. As soon as you ask for more specialized work, where knowledge alone can't substitute for reasoning, it still fails at plenty of things, giving contradictory results in many areas. Of course, this gets subjective, but people in other fields have agreed with me on it: the model sounds super knowledgeable in every area other than your own, but past a certain point you see the nonsense in yours. If this is what they claim outperforms humans in intelligence, I can't take their AGI claims as anything other than hype fuel.
YouTube · AI Governance · 2025-10-02T14:2… · ♥ 5
Coding Result
Dimension       Value
Responsibility  company
Reasoning       mixed
Policy          industry_self
Emotion         indifference
Coded at        2026-04-26T19:39:26.816318
Raw LLM Response
[ {"id":"ytc_UgywuSsk8O_Ql2PPJNh4AaABAg","responsibility":"company","reasoning":"consequentialist","policy":"regulate","emotion":"approval"}, {"id":"ytc_UgwNq4BTrDHGbgvprJF4AaABAg","responsibility":"company","reasoning":"mixed","policy":"industry_self","emotion":"indifference"}, {"id":"ytc_Ugx1Ii5Y3iobBtvXFMx4AaABAg","responsibility":"distributed","reasoning":"consequentialist","policy":"regulate","emotion":"fear"}, {"id":"ytc_UgzF-W9IRiQ-hLHmJ_B4AaABAg","responsibility":"government","reasoning":"deontological","policy":"regulate","emotion":"outrage"}, {"id":"ytc_Ugxc7VYSqHHZ_NdBBo54AaABAg","responsibility":"developer","reasoning":"mixed","policy":"unclear","emotion":"mixed"} ]