Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
WRONG. I'm not quiet at all; this "research" is trash. I'm guessing GPT is basically the same as generating code, but I'd like to truly know which from some good research. However, this paper is seriously flawed in a number of ways. They didn't actually run a test in March. They didn't consider if less load on older models is a reason they might perform better, and verify it by running tests at off-peak hours. They disqualified generated code that was contained in a markdown codeblock, which is fine but they should have seen if the code worked. They didn't compare API to ChatGPT. There's more they did poorly, but that's a good start.
Source: reddit · AI Harm Incident · timestamp 1689776781.0 · score 4
Coding Result
Dimension        Value
Responsibility   company
Reasoning        consequentialist
Policy           none
Emotion          outrage
Coded at         2026-04-25T08:33:43.502452
Raw LLM Response
[{"id":"rdc_jskfpeo","responsibility":"ai_itself","reasoning":"mixed","policy":"none","emotion":"outrage"},{"id":"rdc_jslk9i2","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"indifference"},{"id":"rdc_jslntdr","responsibility":"company","reasoning":"deontological","policy":"none","emotion":"outrage"},{"id":"rdc_jskmscy","responsibility":"none","reasoning":"consequentialist","policy":"none","emotion":"mixed"},{"id":"rdc_jsldrtq","responsibility":"company","reasoning":"consequentialist","policy":"none","emotion":"outrage"}]