Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
AI Risk expert here. It's unfortunately a bit worse than that. Even if you assume that a superintelligent AI learns from data that contains no examples of AI being the enemy of humanity, that does not change the fact that it will still want* to accomplish whatever its goals are (which we did not get to robustly and precisely set in advance). Almost no matter what those goals are, when pursued with superhuman effectiveness, anything that is not directly implicit in those goals will be sacrificed for number-go-up on its favorite thing. This is known to be true for current, toy-model machine learning systems: any parameter that is not specified to be within a specific range will be set to extreme values in order to boost reward by a tiny fraction of a percent. And we don't have any realistic way to get human values into an AI system. The same holds for instrumentally useful steps toward takeover, like deception: if we assume a superintelligence is trained without any examples of deception, it will still independently discover deception as a useful strategy. Even humans can do that.

*"want" is anthropomorphic language that here means something like "contain a preference ordering in its action policy". Anything that behaves as if it has a goal comes under the sway of things like convergent instrumental goals (such as self-preservation and resource acquisition) and the principle of the orthogonality of intelligence and goals (there are no stupid end goals, just stupid ways to achieve them).
Source: youtube · AI Governance · 2025-08-27T07:4… · ♥ 4
Coding Result
Dimension        Value
Responsibility   ai_itself
Reasoning        consequentialist
Policy           unclear
Emotion          fear
Coded at         2026-04-27T06:24:59.937377
Raw LLM Response
[
  {"id": "ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj", "responsibility": "ai_itself", "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"},
  {"id": "ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMNyVp0rjJM", "responsibility": "unclear", "reasoning": "mixed", "policy": "unclear", "emotion": "outrage"},
  {"id": "ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMJL24pN8TG", "responsibility": "ai_itself", "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"},
  {"id": "ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMNwVYNoUgv", "responsibility": "none", "reasoning": "mixed", "policy": "none", "emotion": "resignation"},
  {"id": "ytr_UgzxhyJPjFMsVi_d8wx4AaABAg.AMIhqBOAtBRAMIjv6JBEN7", "responsibility": "developer", "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"},
  {"id": "ytr_UgyyL7XV_MC4trFI6aV4AaABAg.AMIguPtLmJuANYjJ5A_u99", "responsibility": "unclear", "reasoning": "mixed", "policy": "unclear", "emotion": "outrage"},
  {"id": "ytr_UgxctM15P1ZgsaHX4LV4AaABAg.AMIeP8YK4mIAMIlxAEmQh4", "responsibility": "unclear", "reasoning": "mixed", "policy": "unclear", "emotion": "mixed"},
  {"id": "ytr_UgwG6Iv7Xr-9JuDYNxx4AaABAg.AMIdteL9JxmAMIgNHX0ab1", "responsibility": "ai_itself", "reasoning": "consequentialist", "policy": "none", "emotion": "resignation"},
  {"id": "ytr_UgwJ4nifqmzvJuYoXj94AaABAg.AMIdhKGxOgYAMSc9sARn7t", "responsibility": "ai_itself", "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"},
  {"id": "ytr_Ugx5YCRGCoCkjdOM2m14AaABAg.AMId3fhlf7CAMK71jVy4zP", "responsibility": "ai_itself", "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"}
]
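A response in this shape can be checked and summarized with standard JSON tooling. The sketch below is a hypothetical helper, not part of the actual coding pipeline: it parses a small sample array in the same format (the `id` values here are shortened placeholders, not real comment IDs) and tallies how often each value appears per coding dimension.

```python
import json
from collections import Counter

# Sample array in the same shape as the raw LLM response above.
# The ids are illustrative placeholders only.
raw = """[
  {"id": "ytr_sample_1", "responsibility": "ai_itself",
   "reasoning": "consequentialist", "policy": "unclear", "emotion": "fear"},
  {"id": "ytr_sample_2", "responsibility": "unclear",
   "reasoning": "mixed", "policy": "unclear", "emotion": "outrage"}
]"""

codings = json.loads(raw)

# One Counter per coding dimension, e.g. tallies["emotion"]["fear"] == 1.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")
tallies = {dim: Counter(c[dim] for c in codings) for dim in DIMENSIONS}

print(tallies["emotion"])  # Counter({'fear': 1, 'outrage': 1})
```

Because every object carries the same four keys, a missing or misspelled dimension in the model output raises a `KeyError` here, which makes malformed responses easy to catch.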