Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
A pattern is developing: many posts describe degraded outputs and prompt-alignment issues relative to the LLM and index. A smaller but still vocal group of ChatGPT users laments quality issues with prose, reasoning, and more semantically and syntactically focused prompts. Yet I have read very few, if any, posts that compare the pre- and post-rollback outputs; that would be most helpful.

Rather than a purely self-inflicted injury, there are other logical causes. First, OpenAI prioritized saving to memory any specific call-outs by users who wanted outputs, prompts, or entire chats available for recall or context, along with the option to open all chats for access by an OpenAI model, and this influenced the experience. Second, there are not enough GPUs, and those available are throttled and allocated on a prioritized basis, with enterprise customers in the public and private sector at the top of the line. Third, and this is my personal opinion, OpenAI realized that other for-profit companies across the globe focus on reasoning and inference, and that the optimal approach is RNNs and neurosymbolic reasoning. That may explain a change in infrastructure to provide what they can now while they build for the future.

Until there are timeline comparisons of the same prompt, the same model, and the same settings producing different outputs, the experiences are anecdotal even if true, and may not define the problem accurately. So any "fix" is likely not solving for the root cause. If an event can't be measured, it's conjecture. Benchmarks for testing LLMs' hallucination propensity exist, but testing for hallucinations at the application or prompt layer is less mature. When that capability is ubiquitous, model performance in a specific domain will be instructive for defining the problem, exploring solutions, and improving the user experience.
reddit AI Harm Incident 1747016690.0 ♥ 3
Coding Result
Dimension       Value
--------------  --------------------------
Responsibility  unclear
Reasoning       unclear
Policy          unclear
Emotion         indifference
Coded at        2026-04-25T08:33:43.502452
Raw LLM Response
[
  {"id": "rdc_mrv267f", "responsibility": "company", "reasoning": "unclear", "policy": "unclear", "emotion": "outrage"},
  {"id": "rdc_mrut4mz", "responsibility": "company", "reasoning": "unclear", "policy": "unclear", "emotion": "mixed"},
  {"id": "rdc_mru7bs2", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "indifference"},
  {"id": "rdc_mrum80h", "responsibility": "unclear", "reasoning": "unclear", "policy": "unclear", "emotion": "indifference"},
  {"id": "rdc_mrvvwd5", "responsibility": "company", "reasoning": "deontological", "policy": "unclear", "emotion": "outrage"}
]
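For illustration, a minimal Python sketch (not the project's actual aggregation code) that parses the raw LLM response above and tallies the labels assigned to each coding dimension across the five codings. Note that the tallies need not match the displayed Coding Result row, since how the result row is derived from the raw codings is not specified here:

```python
import json
from collections import Counter

# Raw LLM response as shown above.
raw = """[
  {"id": "rdc_mrv267f", "responsibility": "company", "reasoning": "unclear", "policy": "unclear", "emotion": "outrage"},
  {"id": "rdc_mrut4mz", "responsibility": "company", "reasoning": "unclear", "policy": "unclear", "emotion": "mixed"},
  {"id": "rdc_mru7bs2", "responsibility": "company", "reasoning": "consequentialist", "policy": "unclear", "emotion": "indifference"},
  {"id": "rdc_mrum80h", "responsibility": "unclear", "reasoning": "unclear", "policy": "unclear", "emotion": "indifference"},
  {"id": "rdc_mrvvwd5", "responsibility": "company", "reasoning": "deontological", "policy": "unclear", "emotion": "outrage"}
]"""

codings = json.loads(raw)

# Count how often each label appears per dimension.
dimensions = ("responsibility", "reasoning", "policy", "emotion")
tallies = {dim: Counter(c[dim] for c in codings) for dim in dimensions}

for dim, counts in tallies.items():
    print(f"{dim}: {dict(counts)}")
```

Running this shows, for example, that all five codings mark `policy` as "unclear", while `responsibility` and `emotion` are split across labels.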