Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
As an ML engineer who has been working with LLMs since BERT, there are a couple of things I really want to clarify:

- LLMs don't avoid saying "I don't know" when they don't know because "people on the internet don't say I don't know"; it's that LLMs (and pretty much all large neural nets, for that matter) don't have high-quality confidence estimates of their outputs. Why this is goes far beyond the length of a YouTube comment, but you can go back to the technical reports that OpenAI releases and still see that their LLMs are poorly calibrated (calibration is a measure of how well the model's probability of an output reflects its accuracy). There are methods for doing this on specific tasks, but no high-quality general solution. Live organisms have mechanisms for estimating the uncertainty of their inputs and outputs; LLMs do not. A good introductory paper here is "On Calibration of Modern Neural Networks".

- On the part about "we could control the model but it would be useless": I admittedly haven't read the paper, but I presume what they're referring to is the fact that hallucination is partly a feature of the sampling process of LLMs. Since these LLMs are trained on a bunch of text data, you can imagine how the single most likely next token/answer might not actually be all that useful or interesting in many scenarios. So we inject some randomness into the output process as a means of getting a more interesting answer. But by injecting randomness, you also risk going off the rails. Hallucination as a failure mode arises in several different ways; random token sampling is just one of them. The other most common one is that the input is simply too far out of distribution for the model to produce a coherent answer. One could imagine preventing all hallucination by only sampling the most probable token and setting up some system of thresholds on the token probabilities that would gate outputs whenever the model isn't confident enough. But this would mean the model could realistically only answer or do things it has explicitly seen before, or things very close to what it has seen before. The reality is that the data distribution the LLM has learned is close enough to real-world stuff that you can do a lot more interesting/useful things if you allow it to "try", at the risk of it going off the rails in weird and unexpected ways. Therefore, I don't think the statement "we could control the model but it would be useless" is all that controversial.

It should be noted that novelty search is a fundamental fact about all intelligent systems, not just LLMs. Every intelligent system has a limit to how much information it can store, so if it's going to interact with the world in any meaningful way, it needs a means of exploring or doing things it hasn't seen before; after all, how do you get novel observations if you don't take novel actions? LLMs right now don't have a means of exploring or learning on the fly the way humans do. Taking us back to earlier in this comment, this is why confidence in inputs/outputs is so important: when taking a novel action or responding to a novel input, the system needs to be able to estimate how far out of distribution it is, and in what way.
Think back to the first time you broke and cleaned up a piece of glass. I'm sure you've had the experience of finding pieces that were sharper or duller than your brain expected, but past experience might tell you to approach carefully because all of it could be sharp. As you clean it up and interact with each piece, your brain likely starts to recalibrate based on new features it's learning about the glass pieces, letting you change your approach to be more efficient. But your observation and experience are always limited, so you still risk cutting yourself on a miscalculation. "Miscalculation" here is even a misleading word, as it implies you had all the information necessary to get the right answer; the truth is you could still cut yourself even if you calculated exactly correctly given the data you have.

Finally, a consideration: it's possible that humans are already far overparameterized relative to the information available. Neural networks have a weird characteristic, which is that even as you scale the number of parameters, they are still capable of learning generalized features, despite what one would expect given the bias-variance tradeoff. This is weird; we don't fully understand why it happens, but it's something we know, and presumably the human brain has the same characteristic. Since we know it's possible to train on a dataset with far more parameters than necessary without catastrophically overfitting, it raises the question: is this already the case with human beings? If it is, it would suggest that a superintelligence far beyond what humans are (without tools of observation that aren't available to humans) isn't really possible. There's only so much data to consume and so many tools for looking at stuff in so many ways, and if we are already overparameterized, then adding more parameters wouldn't do anything. If that were the case, which I'd argue there's a reasonable chance it is, then you could only really scale "intelligence" in two ways: 1. new tools of observation or troves of data to parse through, and 2. faster means of probing and experimenting with the environment. #2 is how machines beat humans at Go and chess and protein folding, for example, but doing this in non-simulated environments is a lot more energy intensive. I'm not saying it's impossible, or that we couldn't create good simulations, particularly in narrower environments, but the one thing all of those things would still require, at least today, is human bodies and minds doing it. Robotics is getting a lot better, but from an energy, cost, and precision perspective, it's still nowhere near the standard that nature has given us. Particle accelerators are literally assembled by hand because we have no machine precise enough. Getting to the same level as humans, or a bit above, seems reasonably possible if this is the case, but the gap between that and "paperclipping" the earth by manipulating app rank position on the app store (this is the kind of shit that some of these AGI folks believe) is pretty far imho.
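The comment above defines calibration as how well the model's probability of an output reflects its accuracy. As a rough, non-authoritative illustration of how that gap is commonly measured, here is an expected calibration error (ECE) computation in the spirit of "On Calibration of Modern Neural Networks"; NumPy, the bin count, and the toy confidence/correctness arrays are assumptions for the sketch, not anything taken from the comment.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by stated confidence and compare the average confidence
    in each bin to the empirical accuracy in that bin.
    A perfectly calibrated model gives an ECE of 0."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        avg_conf = confidences[in_bin].mean()   # what the model claims
        avg_acc = correct[in_bin].mean()        # how often it is actually right
        ece += in_bin.mean() * abs(avg_conf - avg_acc)
    return ece

# Toy example: a model that claims 90% confidence but is right only 60% of the
# time is miscalibrated, and the gap shows up directly in the score.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0, 0]))  # 0.3
```

This is the kind of metric the comment has in mind when it says the technical reports still show LLMs scoring poorly on calibration.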
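The comment also sketches a way to suppress hallucination: only ever take the most probable token and gate the output behind a probability threshold. Below is a minimal sketch of that idea; the next_token_probs callable, the refusal token, and the 0.5 threshold are hypothetical stand-ins, not anything specified in the comment or the paper it discusses.

```python
import numpy as np

def gated_greedy_decode(next_token_probs, prompt_ids, eos_id, refuse_id,
                        max_new_tokens=50, threshold=0.5):
    """Greedy decoding with a confidence gate: always take the most probable
    next token, but if even that token falls below `threshold`, emit a
    refusal token and stop instead of guessing."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        probs = next_token_probs(ids)     # distribution over the vocabulary
        token = int(np.argmax(probs))     # greedy choice: no sampling randomness
        if probs[token] < threshold:      # not confident enough: gate the output
            ids.append(refuse_id)
            break
        ids.append(token)
        if token == eos_id:
            break
    return ids

# Toy 4-token "model": confident on the first step, uncertain on the second.
steps = iter([[0.05, 0.90, 0.03, 0.02],
              [0.30, 0.30, 0.20, 0.20]])

def toy_model(ids):
    return np.array(next(steps))

print(gated_greedy_decode(toy_model, prompt_ids=[0], eos_id=3, refuse_id=-1))  # [0, 1, -1]
```

The trade-off the comment describes falls out immediately: anything even slightly out of distribution pushes the top-token probability under the gate, so the model refuses rather than attempting things it hasn't effectively seen before.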
youtube · AI Moral Status · 2025-10-30T22:5…
Coding Result
Dimension        Value
Responsibility   none
Reasoning        consequentialist
Policy           unclear
Emotion          indifference
Coded at         2026-04-26T23:09:12.988011
Raw LLM Response
[{"id":"ytc_UgzUhVnD579w9AryyVJ4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"},{"id":"ytc_UgzW5g9esTRdu17Kp914AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"unclear","emotion":"indifference"},{"id":"ytc_UgzQGQlqGjoGTNHal6d4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"unclear","emotion":"indifference"},{"id":"ytc_Ugysf6A-oXWKHw4m1Lh4AaABAg","responsibility":"ai_itself","reasoning":"virtue","policy":"unclear","emotion":"fear"},{"id":"ytc_UgwGR9i5MpZHSHASEPd4AaABAg","responsibility":"none","reasoning":"deontological","policy":"unclear","emotion":"indifference"},{"id":"ytc_Ugx-N0B7JS01wGfwz3t4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"resignation"},{"id":"ytc_UgxkQo9f55QhgUMT7hV4AaABAg","responsibility":"ai_itself","reasoning":"unclear","policy":"unclear","emotion":"fear"},{"id":"ytc_UgwxZUr602dA9DkHwwh4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"unclear","emotion":"indifference"},{"id":"ytc_UgyvwcJta1oj-z6TUQx4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"approval"},{"id":"ytc_Ugweqfc1jkagDq1w7Cx4AaABAg","responsibility":"developer","reasoning":"consequentialist","policy":"regulate","emotion":"approval"}]