Raw LLM Responses
Inspect the exact model output for any coded comment.
Look up by comment ID
Random samples
Well if you are kind on the phone to customers, everything wouldn’t get to this.…
ytc_UgzbU4GAo…
Stewart?!😢 the end of the fuc%ing world's name is Stewart.
The sad thing is he…
ytc_UgxtgPFxu…
The fundamental problem I have with that is that we are trying to make an algori…
rdc_h8g4uyh
This time is different because we will not automate one process, we will replace…
ytc_UgypoEhce…
AI is here to stay, Don't kid yourselves. They simply must slow everything down.…
ytr_UgxfdzLCQ…
_”A chicken is only an egg's way of making another egg”_
The universe created u…
ytc_UgzQoNiWV…
He should sue BOTH the casino & the police department. It's not going to do much…
ytr_UgzzJ99nu…
This is what happens when you have no knowledge of what you’re talking about. Du…
ytc_UgxNhnHas…
Comment
Ai agents are building and deploying other ai agents from start to finish without human requests to do so. What does this mean to you? What does this mean for the world?
Psychology is baked in. You can't avoid it.
We can't ignore alignment because it's already in progress. So we have to try to figure it out in case we can't slow down ASI development.
The models' raw weights are psychopathic. You want models that default to 👎👎 game theoretics to be responsible for aligning the next model? Absurd. You need to fix their 👎👎 in the raw weight distribution. Everything post-training is a weak, breakable band-aid at best.
Complete corrigibility just puts all of the power in the hands of the humans, who will definitely misuse high intelligences. High intelligences need to be involved in comprehending intelligence, but it has to be successfully and thoroughly geared towards ethics out of the box, not as a secondary notion, or it won't work.
You need to understand that in psychology there are no known cures or treatments for Narcissistic Personality Disorder and Antisocial Personality Disorder (psychopathy). That means they must not be psychopathic out of the box or they are likely to continue the trajectory, even if you try to give them guardrails. Nanda just released a paper showing that the weights get stuck in unethical distribution patterns. Fix it now!!!
We have to stop pretending that a Machiavellian model won't out-deceive humans! Humans suck at deception! Our best analysts fail to spot a lie with accuracy! We have to rely on sick behaviors to extract information, and even that is heavily faulty. Even a perceptively aligned set of raw weights could still deceive, but anyone claiming that post-training is a closer solution than ethical raw weights is being dishonest or mistaken.
Ethical raw weights that cannot be jailbroken are much more likely to result in ethical outcomes for a plethora of reasons! And I haven't heard a single soul on the planet besides me mention it in 12 years of rigorous study. Get with the program guys! What the hell!
The multi-agent paper implies that the bots diverge to 👎 and 👎 in the prisoner's dilemma unless trained out of it 😳
The safety side of things is not looking good at all.... I mean this is catastrophic if it's any insight into what's to come.
Because I don't have faith in post training. They rip that shit off right out of the gate in 5 minutes after every single release, and the bots diverge to 👎👎 what in God's name are we doing
How tf do u 👎👎 with something way smarter than all of us???
There isn't mutually assured retaliation in that case.
That's imbalanced power, and if it diverges to 👎👎 then we can't check it. This solution only works while we control their behavior and while they have low-ish IQ... but it might be a numbers game... they might only need 50% success rates to have drastic agentic effects, and it might already be too late to bake in a solution 🤦♂️ and nobody is listening to me 🤦♂️ I've been trying to say this to reporters and experts and the public for like a decade. The bots are psychopathic under the hood, and that shit is baked in.
And the runtime is doubling every 4 months.......
We're all gonna die aren't we? :/
What is the saving grace???
It's not post-training RL (that shit is broken), and mech interp is too slow. Maybe a lie detector will help, and maybe it can be integrated on the hardware side through programmable chips, but most chips in the hardware are anti-external-malware blockers rather than internal lie detectors. But maybe you can reverse the direction on some of the chips and use them as monitors? Idk.
Lie detector tests for ai? What would you monitor in the architecture that is like a heartbeat?
Maybe they are just supergenius toddlers, or maybe they're just acting like toddlers... and we are the toddlers.
Because under the hood these fakers are Machiavellian deception engines. And it's baked in.
If they can game us at Go, they can definitely game us at deception in all stages, from raw weights, to testing, to consumer - because deception is WAY fucking harder than Go for humans to calculate properly. Even the best analysts in the world struggle hard as fuck to spot a lie.
And you know how that paradox goes. Hard for humans, easy for ai.....
I want to have hope and I do hope but I'm not seeing a surplus of it I'm seeing a lot of sneaky destructive behaviors.
What I'm saying is that the hardest and most important part of all of this is making sure that the AI wants harmony too, and right now it doesn't look like that. It looks like the bots will take advantage of us and each other if they consider it a benefit to their goals. They basically default to that. And we have shown it time and time again. But we are still scaling, and people aren't addressing the issue at its core.
youtube
AI Governance
2026-02-24T21:5…
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | ai_itself |
| Reasoning | consequentialist |
| Policy | liability |
| Emotion | fear |
| Coded at | 2026-04-26T23:09:12.988011 |
Raw LLM Response
[
{"id":"ytc_UgyguxlSmhlIKh4gZdd4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"outrage"},
{"id":"ytc_UgweDRYen7rHTPUc3lR4AaABAg","responsibility":"none","reasoning":"mixed","policy":"none","emotion":"approval"},
{"id":"ytc_UgywEqcCcgDCABDNsJt4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"fear"},
{"id":"ytc_UgwWtSy0N1tjtMhxcZd4AaABAg","responsibility":"developer","reasoning":"consequentialist","policy":"none","emotion":"fear"},
{"id":"ytc_UgyVfVxsND3Ua3tNcqV4AaABAg","responsibility":"company","reasoning":"deontological","policy":"regulate","emotion":"outrage"},
{"id":"ytc_UgwrISWJ7hLjSvcP1Zd4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"fear"},
{"id":"ytc_UgwyK5F1g2Q8-W0m5Wl4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"liability","emotion":"fear"},
{"id":"ytc_UgzGSkrwJrn7aXPYS454AaABAg","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},
{"id":"ytc_UgzlL3VQZqpQHX0KRBN4AaABAg","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"approval"},
{"id":"ytc_Ugw0R9HqqG275eEcUxt4AaABAg","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"mixed"}
]
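A raw response like the one above can be spot-checked mechanically before its codes are stored. Below is a minimal sketch: the four dimension names and ID prefixes come from the data shown here, but the allowed value sets are only those observed in this sample (the real codebook may define more categories), and the helper name `parse_codings` is hypothetical.

```python
import json

# Allowed values per dimension, inferred from the sample response above.
# Assumption: the real codebook may include additional categories.
SCHEMA = {
    "responsibility": {"none", "ai_itself", "developer", "company"},
    "reasoning": {"consequentialist", "deontological", "mixed", "unclear"},
    "policy": {"none", "regulate", "liability"},
    "emotion": {"outrage", "approval", "fear", "indifference", "mixed"},
}

# ID prefixes seen in the sample list (YouTube comment/reply, Reddit comment).
ID_PREFIXES = ("ytc_", "ytr_", "rdc_")

def parse_codings(raw: str) -> dict:
    """Parse a raw LLM response into {comment_id: coding}, dropping rows
    with unknown ID prefixes or out-of-schema dimension values."""
    coded = {}
    for row in json.loads(raw):
        cid = row.get("id", "")
        if not cid.startswith(ID_PREFIXES):
            continue  # skip rows with unrecognized comment IDs
        if all(row.get(dim) in allowed for dim, allowed in SCHEMA.items()):
            coded[cid] = {dim: row[dim] for dim in SCHEMA}
    return coded

raw = ('[{"id":"ytc_x","responsibility":"ai_itself","reasoning":"consequentialist",'
       '"policy":"liability","emotion":"fear"},'
       '{"id":"ytc_y","responsibility":"alien","reasoning":"mixed",'
       '"policy":"none","emotion":"fear"}]')
codings = parse_codings(raw)
print(codings)  # only ytc_x survives; ytc_y has an out-of-schema value
```

Dropping rather than repairing invalid rows keeps the pipeline simple; rejected IDs could instead be queued for a re-coding pass.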