Raw LLM Responses
Inspect the exact model output for any coded comment. Look a comment up by its ID, or pick one of the random samples below; a minimal lookup sketch follows the sample list.

Random samples:
- “If we do manage to build conscious machines, Bach thinks that they might well r…” (ytc_Ugy1N-MJA…)
- “Given the ultra-high barriers-to-entry to the semiconductor market (engineers ne…” (rdc_gt7160x)
- “AI detection also says the declaration of independence is AI generated. I don't …” (ytc_UgxralsCo…)
- “1. Limit AI to specific purpose intelligence instead of general purpose 2. Lim…” (ytc_UgwWB7ADI…)
- “I really hate the argument that the ai people make. All of the arguments. All to…” (ytc_UgyetwOS2…)
- “He’s full of shit first, the driverless cars came and get you where you post to …” (ytc_Ugx-rCcsm…)
- “AI drones just officially beat world class human pilots at high speed acrobatic …” (ytc_Ugx-p_bQ6…)
- “I'm waiting for an ai to call me a monkey cuz that would be fuckin hilarious…” (ytc_Ugy6DKSMr…)
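Looking a coded comment up by ID is a simple scan or index over whatever store holds the coded records. A minimal sketch, assuming the records live in a JSONL file called `coded_comments.jsonl` with one JSON object per line and an `id` field; the file name, path, and schema are assumptions for illustration, not the tool's actual storage:

```python
import json
from pathlib import Path


def find_coded_comment(comment_id: str,
                       store: Path = Path("coded_comments.jsonl")) -> dict | None:
    """Return the coded record whose 'id' matches comment_id, or None if absent.

    Assumes a JSONL store with one coded comment per line; this layout is
    illustrative, not the tool's actual storage format.
    """
    with store.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if record.get("id") == comment_id:
                return record
    return None


# Usage, with a full comment ID from the store:
# find_coded_comment("ytc_<full comment ID>")
```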
Comment
AI Risk expert here. It's unfortunately a bit worse than that.
If you start with the assumption that a superintelligent AI will learn from data that does not contain any examples of AI being the enemy of humanity, that does not change the fact that it will still want* to accomplish whatever its goals are (which we did not get to robustly and precisely set in advance). Almost no matter what those goals are, when pursued with superhuman effectiveness, anything that is not directly implicit in those goals will be sacrificed for number-go-up on its favorite thing. This is known to be true for current, toy-model machine learning systems: Any parameter that is not specified to be within a specific range will be set to extreme values in order to boost reward by a tiny fraction of a percent. And we don't have any realistic way to get human values into an AI system.
When you look at instrumentally useful steps toward takeover, like deception, the same holds true: If we assume a superintelligence is trained without any examples of deception, it will still independently discover deception as a useful strategy. Even humans can do that.
*"want" is anthropomorphic language that here means something like "contain a preference ordering in its action policy". Anything that behaves as if it has a goal comes under the sway of things like convergent instrumental goals (such as self-preservation and resource acquisition) and the principle of the orthogonality of intelligence and goals (there are no stupid end goals, just stupid ways to achieve them).
youtube · AI Governance · 2025-08-27T07:4… · ♥ 4
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | ai_itself |
| Reasoning | consequentialist |
| Policy | unclear |
| Emotion | fear |
| Coded at | 2026-04-27T06:24:59.937377 |
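The table collapses one row of the batch output into four coding dimensions plus a timestamp. A minimal sketch of a record type for a single result; the class and field names are illustrative, and only the dimension names and example values come from the table above and the raw response below:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CodingResult:
    """One coded comment, mirroring the dimensions in the table above."""
    comment_id: str
    responsibility: str  # e.g. "ai_itself", "developer", "none", "unclear"
    reasoning: str       # e.g. "consequentialist", "mixed"
    policy: str          # e.g. "unclear", "none"
    emotion: str         # e.g. "fear", "outrage", "resignation", "mixed"
    coded_at: datetime


# Values taken from the coding result shown above; the ID is a placeholder.
result = CodingResult(
    comment_id="<comment ID>",
    responsibility="ai_itself",
    reasoning="consequentialist",
    policy="unclear",
    emotion="fear",
    coded_at=datetime.fromisoformat("2026-04-27T06:24:59.937377"),
)
```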
Raw LLM Response
```json
[
  {"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMNyVp0rjJM","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"},
  {"id":"ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMJL24pN8TG","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"ytr_UgyC-ZaiZH7aiakI8ZV4AaABAg.AMIiQz-2BE-AMNwVYNoUgv","responsibility":"none","reasoning":"mixed","policy":"none","emotion":"resignation"},
  {"id":"ytr_UgzxhyJPjFMsVi_d8wx4AaABAg.AMIhqBOAtBRAMIjv6JBEN7","responsibility":"developer","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"ytr_UgyyL7XV_MC4trFI6aV4AaABAg.AMIguPtLmJuANYjJ5A_u99","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"},
  {"id":"ytr_UgxctM15P1ZgsaHX4LV4AaABAg.AMIeP8YK4mIAMIlxAEmQh4","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"mixed"},
  {"id":"ytr_UgwG6Iv7Xr-9JuDYNxx4AaABAg.AMIdteL9JxmAMIgNHX0ab1","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"resignation"},
  {"id":"ytr_UgwJ4nifqmzvJuYoXj94AaABAg.AMIdhKGxOgYAMSc9sARn7t","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"ytr_Ugx5YCRGCoCkjdOM2m14AaABAg.AMId3fhlf7CAMK71jVy4zP","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"}
]
```
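The raw response is a JSON array with one object per comment in the batch, keyed by the full comment ID. A minimal sketch of indexing it so that a single coded comment can be pulled out; only the first two rows of the array above are inlined, copied verbatim, to keep the example short, and the variable names are illustrative:

```python
import json

# First two rows of the raw LLM response above, copied verbatim.
raw_response = """[
  {"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj","responsibility":"ai_itself","reasoning":"consequentialist","policy":"unclear","emotion":"fear"},
  {"id":"ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMNyVp0rjJM","responsibility":"unclear","reasoning":"mixed","policy":"unclear","emotion":"outrage"}
]"""

# Index the batch by comment ID so one coded comment can be looked up directly.
by_id = {row["id"]: row for row in json.loads(raw_response)}

row = by_id["ytr_UgxfjUXo_FR_ikQV_O94AaABAg.AMIiksp6iTHAMJxEG3e7hj"]
print(row["responsibility"], row["reasoning"], row["policy"], row["emotion"])
# -> ai_itself consequentialist unclear fear
```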