Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
This is secondary to being able to effectively give it goals at all, which we are not yet able to effectively do. Mostly we're giving it examples and saying "this output good", "this output bad" with no real idea what it's learning from that. Yes, we can give it a system prompt and include something like Claude's constitution, but that just becomes part of the context, we have no ability to make it consistently follow any goal we try to give it. Nate gave an example of unit tests where he gave specific instruction "don't do this", and the response was to still do that but try more to hide it. If we did get to that point, the concept of "coherent extrapolated volition" is the best attempt I've seen at handling the issue you describe, but we'd have to get much better than we are now at alignment before it really becomes relevant to debate what goals it should have.
youtube AI Governance 2026-03-27T22:0…
Coding Result
Dimension       Value
Responsibility  developer
Reasoning       consequentialist
Policy          industry_self
Emotion         resignation
Coded at        2026-04-27T06:24:59.937377
Raw LLM Response
[
  {"id":"ytr_Ugw2Hm1dbiDAfheGYwF4AaABAg.AVpOUC5oZIyAVqmw9fN1WA","responsibility":"company","reasoning":"consequentialist","policy":"liability","emotion":"indifference"},
  {"id":"ytr_UgwZPomTG_RHLiPkXlx4AaABAg.AVKZhl3GCoDAVKaMx6lBa5","responsibility":"none","reasoning":"mixed","policy":"none","emotion":"resignation"},
  {"id":"ytr_Ugwi_oUaa1CyKC2SdIV4AaABAg.AVHaoc2FtT0AVHmettAP-I","responsibility":"company","reasoning":"deontological","policy":"regulate","emotion":"outrage"},
  {"id":"ytr_Ugwi_oUaa1CyKC2SdIV4AaABAg.AVHaoc2FtT0AVHn1Dfkizq","responsibility":"company","reasoning":"consequentialist","policy":"regulate","emotion":"outrage"},
  {"id":"ytr_Ugwi_oUaa1CyKC2SdIV4AaABAg.AVHaoc2FtT0AVHoQEXxJv-","responsibility":"company","reasoning":"deontological","policy":"regulate","emotion":"outrage"},
  {"id":"ytr_UgwAeorGgJ67NXPviat4AaABAg.AUrAbO_0gD_AUrFJkS1OfP","responsibility":"ai_itself","reasoning":"unclear","policy":"unclear","emotion":"mixed"},
  {"id":"ytr_UgwAeorGgJ67NXPviat4AaABAg.AUrAbO_0gD_AUrG7TXbYiX","responsibility":"ai_itself","reasoning":"unclear","policy":"unclear","emotion":"approval"},
  {"id":"ytr_Ugz4jQBOIBsXS44-kC54AaABAg.AUmNWp7c7izAUsMqUGSdwq","responsibility":"developer","reasoning":"consequentialist","policy":"industry_self","emotion":"resignation"},
  {"id":"ytr_UgwPYFl00Se7DLgI8yZ4AaABAg.AUmItF01DJzAUvLOYVUkSN","responsibility":"ai_itself","reasoning":"consequentialist","policy":"regulate","emotion":"fear"},
  {"id":"ytr_UgwPYFl00Se7DLgI8yZ4AaABAg.AUmItF01DJzAUwcpATurWZ","responsibility":"none","reasoning":"unclear","policy":"unclear","emotion":"indifference"}
]
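The raw response is a JSON array of per-comment records, one per coded comment. A minimal sketch of how such a batch response could be parsed and a single comment's coding looked up by its ID (the parsing code here is illustrative, not the tool's actual implementation; only the record shown is taken from the response above):

```python
import json

# One record copied verbatim from the raw LLM response above; a real
# response would contain the full batch.
raw = '''[
  {"id": "ytr_Ugz4jQBOIBsXS44-kC54AaABAg.AUmNWp7c7izAUsMqUGSdwq",
   "responsibility": "developer", "reasoning": "consequentialist",
   "policy": "industry_self", "emotion": "resignation"}
]'''

# Index the batch by comment ID so any coded comment can be inspected.
records = {r["id"]: r for r in json.loads(raw)}

rec = records["ytr_Ugz4jQBOIBsXS44-kC54AaABAg.AUmNWp7c7izAUsMqUGSdwq"]
print(rec["responsibility"], rec["emotion"])  # developer resignation
```

This matches the coding result shown above: the record for that ID carries the same four dimensions (responsibility, reasoning, policy, emotion).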