Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
It seems highly bizarre to me to describe the Claude test regarding the extramarital affair as acting on “an instinct for self-preservation.” This seems like both begging the question, and projecting or anthropomorphizing…if all of these emails are fictional content for a non-existent company written specifically for the purpose of this test…then the researchers aren’t so much testing if an agent would discover and leverage personal information to nefarious ends, as they are just testing if the agent can detect a pattern they laid down for the explicit purpose of testing its ability to detect it. The response may appear to be similar to blackmail, but again that’s just a form of pattern recognition as something in the training data that would be a natural language response to such a situation. Even researchers who work on these things seem really prone to assigning them a sort of human agency that there just isn’t any really compelling reason to assume is there. I’m not convinced a similar outcome could be expected in the wild.
youtube AI Governance 2025-08-26T16:4… ♥ 33
Coding Result
Dimension       Value
Responsibility  none
Reasoning       deontological
Policy          none
Emotion         indifference
Coded at        2026-04-26T19:39:26.816318
Raw LLM Response
[
  {"id":"ytc_UgzaeytlM0uEsLfW7VJ4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
  {"id":"ytc_Ugzu02jCOt1G3Ax824p4AaABAg","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
  {"id":"ytc_UgysbT0gJvfKrpCcL9l4AaABAg","responsibility":"user","reasoning":"virtue","policy":"none","emotion":"approval"},
  {"id":"ytc_Ugz_SWubhZLrOlG3KJB4AaABAg","responsibility":"user","reasoning":"consequentialist","policy":"none","emotion":"mixed"},
  {"id":"ytc_UgyBotIyfY6pyui3fTB4AaABAg","responsibility":"none","reasoning":"deontological","policy":"none","emotion":"indifference"}
]
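
The raw response is a JSON array covering a batch of five comments, while the Coding Result shows values for a single comment. A minimal Python sketch of how one might parse the batch and look up one record is given here. The function name `index_codings` and the `DIMENSIONS` tuple are hypothetical and not part of the tool. The page does not state which ID belongs to the comment shown; the sketch assumes it is the last record, because that is the only one whose four values match the Coding Result.

```python
import json

# Raw model output copied verbatim from the "Raw LLM Response" field.
raw_response = """
[ {"id":"ytc_UgzaeytlM0uEsLfW7VJ4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
  {"id":"ytc_Ugzu02jCOt1G3Ax824p4AaABAg","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
  {"id":"ytc_UgysbT0gJvfKrpCcL9l4AaABAg","responsibility":"user","reasoning":"virtue","policy":"none","emotion":"approval"},
  {"id":"ytc_Ugz_SWubhZLrOlG3KJB4AaABAg","responsibility":"user","reasoning":"consequentialist","policy":"none","emotion":"mixed"},
  {"id":"ytc_UgyBotIyfY6pyui3fTB4AaABAg","responsibility":"none","reasoning":"deontological","policy":"none","emotion":"indifference"} ]
"""

# The four coded dimensions that appear in every record of the response.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def index_codings(raw):
    """Parse the JSON array and index each coding record by its comment id."""
    indexed = {}
    for record in json.loads(raw):
        missing = [d for d in DIMENSIONS if d not in record]
        if missing:
            raise ValueError(f"record {record.get('id')} lacks {missing}")
        indexed[record["id"]] = {d: record[d] for d in DIMENSIONS}
    return indexed


codings = index_codings(raw_response)

# Assumption: this id is inferred by matching values, not stated on the page.
shown = codings["ytc_UgyBotIyfY6pyui3fTB4AaABAg"]
for dimension in DIMENSIONS:
    print(f"{dimension.capitalize():<16}{shown[dimension]}")
```

Indexing by `id` makes it easy to compare any stored coding against the exact model output for the same comment, which is the check this page is meant to support.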