Raw LLM Responses
Inspect the exact model output for any coded comment.
Comment
My impression has been that Opus is actually much worse at coding than Sonnet. It overcomplicates everything, overgeneralizes simple discrete tasks, creates more bugs, and really does nothing better except cost more money, so idiots and Anthropic bots promote it the hardest. Like bear boxes in national parks, any heuristic you develop to defeat bots will also defeat the dumbest quintile of humanity…
My impression was also that 4.5 was moderately better than 3.7 at remaining “on track” with more complex tasks and managing context rot. Similar to the incremental change from 3.5 to 3.7. I do think Claude 3.5 was a real step change forward. LLMs did not impress me much before that. Perhaps due to my own ignorance of NLP, I didn’t foresee the evolution from the initial release of ChatGPT to the near-future applications of semantic search and RAG.
Like you, I am skeptical of the integrity of anyone who acts as if we’re not *deep* into the curve of diminishing returns with blindly scaling the current system architecture of LLMs. I have not met anyone IRL who thinks 4.5 was anything more than incremental progress, or that 4.6 was noticeably different in any way. Perhaps now that Claude Code’s embedded system prompts have been optimized for Claude 4.6, then 4.5 will seem worse if you downgraded, but judging model performance by the quality of prompt engineering is a category error in my book.
That being said: to play devil’s advocate, perhaps there are people who know less than I did, who are genuinely impressed by the latest models, that they are now able to use to do new things. Perhaps it’s not the model capabilities that upgraded, but the user’s capabilities.
This is all moving very fast. It’s genuinely exciting. Anyone who starts building applications with AI today is still an early adopter. Skepticism is warranted IMO. Crypto burned a lot of people, the neutered chat interfaces are utter garbage, and OpenAI and Anthropic are SoftBank-level financial dumpster fire
reddit
AI Jobs
1774268079.0
♥ 5
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | company |
| Reasoning | deontological |
| Policy | none |
| Emotion | outrage |
| Coded at | 2026-04-25T08:33:43.502452 |
Raw LLM Response
[
{"id":"rdc_obw7q2f","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
{"id":"rdc_obvmgt7","responsibility":"ai_itself","reasoning":"mixed","policy":"none","emotion":"mixed"},
{"id":"rdc_oc0674s","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
{"id":"rdc_obv69wi","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"},
{"id":"rdc_obv8w5z","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"}
]
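
The raw response is a batch: one JSON array with a record per comment, while the Coding Result table shows a single comment. A minimal sketch of how the two could be tied together is below. It assumes the batch is plain JSON with the field names shown above. Because the page does not display the ID of the comment being inspected, the sketch matches on the four coded values instead. The helper names `index_by_id` and `find_matching` are hypothetical and not part of any tool shown here.

```python
import json

# Raw batch response, copied verbatim from the page.
raw_response = """[
{"id":"rdc_obw7q2f","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"indifference"},
{"id":"rdc_obvmgt7","responsibility":"ai_itself","reasoning":"mixed","policy":"none","emotion":"mixed"},
{"id":"rdc_oc0674s","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},
{"id":"rdc_obv69wi","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"},
{"id":"rdc_obv8w5z","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"}
]"""

# Values from the Coding Result table (the "Coded at" row is not in the raw response).
coded_result = {
    "responsibility": "company",
    "reasoning": "deontological",
    "policy": "none",
    "emotion": "outrage",
}


def index_by_id(raw: str) -> dict:
    """Parse a batch response and key each record by its comment ID (hypothetical helper)."""
    return {record["id"]: record for record in json.loads(raw)}


def find_matching(records: dict, coded: dict) -> list:
    """Return the IDs whose coded dimensions all equal the displayed result (hypothetical helper)."""
    return [
        comment_id
        for comment_id, record in records.items()
        if all(record.get(key) == value for key, value in coded.items())
    ]


records = index_by_id(raw_response)
matches = find_matching(records, coded_result)
print(matches)  # ['rdc_oc0674s']
```

In this batch only one record carries all four displayed values, so the match is unambiguous. In a batch where several comments receive identical codes, matching on values would not be enough and the comment ID itself would be needed.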