Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
Please do the one easy test I can think of that anyone can do to see for yourself how capable or not these chatbots are of reasoning: play a game of chess and take your position near the end of the game to a few supposed reasoning-champ models and have them analyze it. In chess it doesn't matter how you arrive at a position, all that matters is where the pieces are and whose turn it is, but IF these things just "predict the next token" not having a move list and just a position - which, by the nature of chess, is unlikely to have ever been on a board before, provided your game was at least 20 moves long - it might matter to their performance. I've tried this with chatgpt, gemini, and grok, and they not just fail, but fail so spectacularly as to appear utterly insane. They can't correctly identify which pieces are on the board and where they are, but they don't know that, so they produce confident proclamations that they "solved the puzzle" and explain in chess commentary style language which illegal/impossible moves are the best. When I did this I thought I had to choose a position where there is a very obvious and objectively true solution to a problem, so I gave it a simple mate in two scenario (forced, with very few legal moves available on each turn), but based on my experience, you can really give it any position and don't even need to analyze it beforehand with "traditional" chess programs, because the non-sense you will be getting back renders such preparation irrelevant. Obviously this doesn't necessarily say anything about whether AI is useful. We don't think google's search engine can reason but it used to provide value (before search results became mostly ads). But when I've tried to think of how I can test for myself if the AI models I have access to are able to reason, chess is the perfect test. It is easy to come up with novel "problems" and IF some day LLMs start producing answers that _seem_ sensible I can easily check those (e.g. on lichess' web site) by analyzing the game with chess software. But for now, they will tell me to jump my rook over two pieces and magically land it in the face of the enemy king, sometimes even if I don't have a rook. It is fascinating and scary in equal measure, when you ponder what powers might be given to this entity in the coming years (from running companies to governments and all the decision making that goes on all the way down to the clerks). I suspect that with no ability to reason, there will be a "long tail" thing of corner cases where AI utterly fails to do the right thing.
youtube AI Moral Status 2025-10-31T07:5… ♥ 41
Coding Result
Dimension: Value
Responsibility: none
Reasoning: unclear
Policy: unclear
Emotion: indifference
Coded at: 2026-04-26T23:09:12.988011
Raw LLM Response
[ {"id":"ytc_Ugx6vZjGSGg4CrL-nnN4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"}, {"id":"ytc_UgzbtNzVpAcjVuqkKRJ4AaABAg","responsibility":"none","reasoning":"mixed","policy":"unclear","emotion":"outrage"}, {"id":"ytc_UgzkAZDOJhmoC8Hinhh4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"approval"}, {"id":"ytc_UgyfiR1311E7PqIM26J4AaABAg","responsibility":"developer","reasoning":"consequentialist","policy":"regulate","emotion":"approval"}, {"id":"ytc_Ugyt13y3qcMLhP5Gm6Z4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"}, {"id":"ytc_Ugz48ZOMgXd_uPzTEFh4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"ban","emotion":"fear"}, {"id":"ytc_UgyYz43cuN5TRi6_PMN4AaABAg","responsibility":"user","reasoning":"deontological","policy":"liability","emotion":"fear"}, {"id":"ytc_UgwaNpbwGEXfFOnqAXZ4AaABAg","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"}, {"id":"ytc_Ugw5Ge7eWsLI7MIRADV4AaABAg","responsibility":"user","reasoning":"deontological","policy":"regulate","emotion":"fear"}, {"id":"ytc_UgwZW5NeKjUA4OAeTLR4AaABAg","responsibility":"developer","reasoning":"virtue","policy":"regulate","emotion":"approval"} ]
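The raw response above is a JSON array with one object per comment, each carrying an id and the four coded dimensions. A minimal Python sketch of how one might parse it and check it against the Coding Result follows. The helper names `index_by_id` and `matching_ids` are hypothetical, and the embedded string is an abbreviated copy of the first three entries only. Because this page does not show the comment's own id, the sketch matches on the coded values instead of on the id.

```python
import json

# Abbreviated copy of the raw response above (first three entries only).
raw_response = """[
  {"id":"ytc_Ugx6vZjGSGg4CrL-nnN4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"},
  {"id":"ytc_UgzbtNzVpAcjVuqkKRJ4AaABAg","responsibility":"none","reasoning":"mixed","policy":"unclear","emotion":"outrage"},
  {"id":"ytc_UgzkAZDOJhmoC8Hinhh4AaABAg","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"approval"}
]"""

# The four dimensions that appear in every record of the raw response.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def index_by_id(raw):
    """Parse the raw JSON array into {comment_id: {dimension: value}}.

    Hypothetical helper: raises ValueError if a record lacks an id
    or any of the four dimensions.
    """
    indexed = {}
    for record in json.loads(raw):
        missing = [key for key in ("id", *DIMENSIONS) if key not in record]
        if missing:
            raise ValueError(f"record is missing fields: {missing}")
        indexed[record["id"]] = {dim: record[dim] for dim in DIMENSIONS}
    return indexed


def matching_ids(indexed, coding):
    """Return the ids whose coded dimensions all equal the given coding."""
    return [cid for cid, dims in indexed.items() if dims == coding]


codings = index_by_id(raw_response)

# Values taken from the Coding Result shown on this page.
coding_result = {
    "responsibility": "none",
    "reasoning": "unclear",
    "policy": "unclear",
    "emotion": "indifference",
}
print(matching_ids(codings, coding_result))
```

In the full ten-entry response, more than one record carries these same four values, so matching on values alone cannot single out the comment. A lookup by the comment's id would be the more reliable check when the id is available.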