Raw LLM Responses

Inspect the exact model output for any coded comment.

Comment
(1) It would be naive to assume that because a computer generated an estimate, it's automatically right, but we should have learned that many years ago. They can be very useful, if used appropriately. (2) As you point out, ANY decision making process will have errors, so comparing some process against perfection is not useful; it should be compared against real world the alternatives. So as you note, an algorithm despite being imperfect, may nevertheless be better, on average, at some predictions than an expert - and less susceptible to things like visual impressions or the sounds of names. Some algorithms do a good job, but we should evaluate that and not choose a bad algorithm, by all means. (3) Bias is a tricky thing to evaluate. For example, an algorithm trained with data from the real world might predict that people with certain characteristics (such as you mention - employment, criminal history, attitudes, etc) tend to re-offend more or less often than people with different characteristics. If a certain population group both (1) more often has the traits associated with re-offending, and (2) actually tends to re-offend more often, then it's likely that the algorithm will have more people from that population group in its higher risk of reoffending ratings. From one viewpoint, that's a well functioning algorithm, correctly rating people with a higher risk of reoffending as more risky, without any knowledge of which population group they are part of and no bias on purely the basis of population group membership (which was NOT provided to the algorithm during training or production). But other people look only at whether "people in the given population group get higher risk ratings", and consider any disparity in risk evaluations to be a priori proof of "bias". Since the algorithm doesn't know the population group, they postulate that perhaps the algorithm somehow inferred the person's population group, or that the population group of the data scientists creating the algorithm may have somehow seeped into the algorithm. (A stronger point is the possibility that a biased dataset was used in training, which indeed does need to be checked - but critics sometimes assume that even when there is no evidence of it). That is, defining what counts as bias turns out to be political, not objective, in many cases. There was an article in Pro Publica a few years back which suffered from this problem. If you look at their argument in detail, it boils down to "the algorithm rates Black arrestees as higher risk of re-offending - on average - than white arrestees, so it's biased". But using the county statistics, Black arrestees in that county HAVE had a higher recidivism rate, so the more accurate the algorithm is, the more that real world data would be reflected in the statistical outcome (any individual of any race may rate anywhere from the highest to lowest risk estimate). By their own data, the higher the risk assessment of past cases, the more often the person actually did reoffend (within two years), for both Black and white arrestees, combined or separately. And if anything, it was a bit harsher on whites on average (comparing risk rating to actual subsequent behavior). But the pro publica article obscures this with a table of "false positives" and "false negatives", which superficially sounds convincing, but if you do the math, actually any group with a higher recidivism rate would have exactly those same elevated and reduced false hits. 
(The article is also peppered with cherry picked comparisons of individuals intentionally selected to reinforce the narrative, a technique for manipulating the audience, not for finding truth). I found the article convincing until I did a deeper analysis and then discovered that it was in essence treating "unbiased" as meaning mispredicting equal recidivism rates for races, whether or not that matched the real world. They completely omitted that evidence that supported an unbiased algorithm (or rather only slightly biased against whites but mostly remarkably good). If one is trying to perceive bias for political reasons, one can often find it, by defining what counts a bias accordingly. We need more politically neutral evaluations.
Source: youtube · AI Harm Incident · 2022-07-13T05:5…
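The "do the math" point in the comment about false positives and false negatives can be illustrated with a small numerical sketch. The figures below are hypothetical (they are not the ProPublica/COMPAS numbers): a risk score that is calibrated identically for two groups still produces a higher false-positive rate and a lower false-negative rate for whichever group has the higher underlying recidivism rate, which is the pattern the comment describes.

# Minimal Python sketch with made-up numbers. Both groups see the same score
# bins with the same meaning (identical calibration); only the share of each
# group falling in the high-risk bin differs.

def error_rates(share_high, p_reoffend_high=0.6, p_reoffend_low=0.2):
    """Return (base_rate, false_positive_rate, false_negative_rate) for a group.

    share_high      -- fraction of the group placed in the high-risk bin
    p_reoffend_high -- actual reoffense probability inside the high-risk bin
    p_reoffend_low  -- actual reoffense probability inside the low-risk bin
    """
    share_low = 1.0 - share_high
    base_rate = share_high * p_reoffend_high + share_low * p_reoffend_low

    # "Positive" = labeled high risk; a false positive is a high-risk label
    # given to someone who does not go on to reoffend.
    fp = share_high * (1.0 - p_reoffend_high)
    tn = share_low * (1.0 - p_reoffend_low)
    fn = share_low * p_reoffend_low
    tp = share_high * p_reoffend_high

    fpr = fp / (fp + tn)  # among those who did NOT reoffend
    fnr = fn / (fn + tp)  # among those who DID reoffend
    return base_rate, fpr, fnr

# Group A has more of its members in the high-risk bin than Group B,
# hence a higher base rate of reoffending.
for name, share_high in [("Group A", 0.6), ("Group B", 0.3)]:
    base, fpr, fnr = error_rates(share_high)
    print(f"{name}: base rate {base:.2f}, FPR {fpr:.2f}, FNR {fnr:.2f}")

# Prints:
#   Group A: base rate 0.44, FPR 0.43, FNR 0.18
#   Group B: base rate 0.32, FPR 0.18, FNR 0.44
# Same calibration, different base rates: the error rates necessarily diverge.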
Coding Result
Dimension        Value
Responsibility   unclear
Reasoning        unclear
Policy           unclear
Emotion          unclear
Coded at: 2026-04-27T06:24:59.937377
Raw LLM Response
[{"id":"ytc_UgxEoXJmsU2o6hW7-6l4AaABAg","responsibility":"ai_itself","reasoning":"consequentialist","policy":"none","emotion":"resignation"}, {"id":"ytc_UgzQ5-X22e_jw6AD6qh4AaABAg","responsibility":"none","reasoning":"consequentialist","policy":"industry_self","emotion":"approval"}, {"id":"ytc_UgzJdKeEiLauijES-xh4AaABAg","responsibility":"developer","reasoning":"deontological","policy":"liability","emotion":"outrage"}, {"id":"ytc_UgxqiNKoyqTcAP4sAER4AaABAg","responsibility":"distributed","reasoning":"mixed","policy":"regulate","emotion":"fear"}, {"id":"ytc_Ugxo75JzWjtKUEJa2vx4AaABAg","responsibility":"government","reasoning":"deontological","policy":"regulate","emotion":"outrage"}, {"id":"ytc_UgzYjmZgqwkL0p4GOal4AaABAg","responsibility":"developer","reasoning":"deontological","policy":"liability","emotion":"outrage"}, {"id":"ytc_UghETbDDLjdorXgCoAEC","responsibility":"ai_itself","reasoning":"consequentialist","policy":"ban","emotion":"fear"}, {"id":"ytc_UgjjbdybEHWzyXgCoAEC","responsibility":"developer","reasoning":"consequentialist","policy":"regulate","emotion":"mixed"}, {"id":"ytc_Ugjn9WM4zF6TIHgCoAEC","responsibility":"distributed","reasoning":"consequentialist","policy":"regulate","emotion":"resignation"}, {"id":"ytc_Ugi75uTYKgdOH3gCoAEC","responsibility":"government","reasoning":"contractualist","policy":"regulate","emotion":"approval"})