Raw LLM Responses
Inspect the exact model output for any coded comment.
Comment
One idea to keep in mind is that you can use a cheap AI model to augment GPT-4/5 or even human output.
A joke example is replacing the word "wand" with "wang" in the Harry Potter stories. Taping knives to roombas. Or consider how not every employee was aware they were working on the atom bomb (or are working at scam organizations today). Basically, advanced jailbreaking, as opposed to those jailbreaks that should be obvious to fix.
I don't know if such a technique would actually scale for truly dangerous scenarios, but I believe it'd definitely scale for hate speech and erotica, and I've already found some success with this technique with barely any postprocessing at all. OpenAI would also probably not really care about this kind of misuse, so long as they weren't directly responsible.
Terrorist level misuse is a different story, and I'm not sure how you could avoid the possibility without severely handicapping your product. Considering helpful business emails and manipulative phishing scams are basically identical, as one example...
reddit
AI Responsibility
1682548472.0
♥ 2
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | none |
| Reasoning | unclear |
| Policy | none |
| Emotion | mixed |
| Coded at | 2026-04-25T08:33:43.502452 |
Raw LLM Response
[{"id":"rdc_jhspuqw","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},{"id":"rdc_jht26c9","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},{"id":"rdc_jhsqwc5","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},{"id":"rdc_jhsre0c","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"approval"},{"id":"rdc_jhuh106","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"mixed"}]
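
The raw response above is a JSON array covering five comments, while the Coding Result table shows the codes for only one. A minimal Python sketch of how a single comment's codes could be pulled out of such a batch follows. The names `index_by_id` and `DIMENSIONS` are hypothetical and not part of the tool. The ID `rdc_jhuh106` is used only because it is the one record whose values match the table (emotion `mixed`); the page does not state the displayed comment's ID, so treat that pairing as an assumption.

```python
import json

# Batch response copied verbatim from the "Raw LLM Response" line above.
RAW_RESPONSE = (
    '[{"id":"rdc_jhspuqw","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},'
    '{"id":"rdc_jht26c9","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},'
    '{"id":"rdc_jhsqwc5","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"indifference"},'
    '{"id":"rdc_jhsre0c","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"approval"},'
    '{"id":"rdc_jhuh106","responsibility":"none","reasoning":"unclear","policy":"none","emotion":"mixed"}]'
)

# The four coded dimensions that appear in every record of the response.
DIMENSIONS = ("responsibility", "reasoning", "policy", "emotion")


def index_by_id(raw: str) -> dict[str, dict[str, str]]:
    """Parse a batch response and return {comment_id: {dimension: value}}.

    Hypothetical helper: raises ValueError if a record lacks an id or a dimension.
    """
    indexed: dict[str, dict[str, str]] = {}
    for record in json.loads(raw):
        missing = [key for key in ("id", *DIMENSIONS) if key not in record]
        if missing:
            raise ValueError(f"record {record!r} is missing {missing}")
        indexed[record["id"]] = {dim: record[dim] for dim in DIMENSIONS}
    return indexed


codes = index_by_id(RAW_RESPONSE)

# Assumed to be the comment shown in the Coding Result table (see lead-in).
print(codes["rdc_jhuh106"])
```

Indexing by ID rather than by position makes it easier to check that the values shown in the table really come from the record for the comment being inspected, and not from a neighbouring record in the same batch.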