Raw LLM Responses
Inspect the exact model output for any coded comment.
Look up by comment ID
Comment
So what I would recommend is actually rerunning this test, but not with a greenfield project. Rerun it with something well established. Maybe you have some hobby project you gave up on because you never implemented feature XYZ. Try to get the LLM to implement that feature.
Personally, what I have found is that agents are okay to good for greenfield projects and for getting the basic scaffolding out of the way, which is essentially a time sink rather than a hard problem. Once I've got that established, the agents very rapidly become entirely useless. It starts taking more time and effort to get them to do the right thing than to just do it myself. It's essentially like having the worst junior in the world, except that a junior learns and improves over time, while the agent stays stagnant and makes the same mistakes on every single project in the future.
I have some code bases that are half a million lines plus, and getting the agent to add even the smallest of functions won't work. It will often break the formatting of the file itself.
Later on in the video you mention code reuse. LLMs are not capable of it. I have had prompts where, within the same prompt and the same file, there would be two identical functions right next to each other, differing only in name because they were used in different contexts. Not to say that humans are incapable of the same thing; I see it all the time, and across our company we often repeat code in different projects. But this repeated pattern I see from LLMs is exceptionally hard, even impossible, to maintain properly.
youtube
AI Jobs
2026-01-19T15:1…
♥ 33
Coding Result
| Dimension | Value |
|---|---|
| Responsibility | none |
| Reasoning | consequentialist |
| Policy | none |
| Emotion | approval |
| Coded at | 2026-04-27T06:24:59.937377 |
Raw LLM Response
```json
[
  {"id": "ytc_UgwTHACpuNH3alMZ-At4AaABAg", "responsibility": "developer", "reasoning": "deontological", "policy": "none", "emotion": "indifference"},
  {"id": "ytc_UgyBCSaYB73YLKszb_d4AaABAg", "responsibility": "none", "reasoning": "consequentialist", "policy": "none", "emotion": "approval"},
  {"id": "ytc_UgyvruCjW_584YguWzR4AaABAg", "responsibility": "developer", "reasoning": "deontological", "policy": "none", "emotion": "outrage"},
  {"id": "ytc_UgxPiP0ljQ6nDspT13h4AaABAg", "responsibility": "none", "reasoning": "consequentialist", "policy": "none", "emotion": "approval"},
  {"id": "ytc_UgwwfLE4DTHG6aKqYM14AaABAg", "responsibility": "none", "reasoning": "consequentialist", "policy": "none", "emotion": "approval"},
  {"id": "ytc_UgyO2So829NZqsYinYB4AaABAg", "responsibility": "none", "reasoning": "consequentialist", "policy": "none", "emotion": "indifference"},
  {"id": "ytc_UgxbxoDntkTRGrw-lkp4AaABAg", "responsibility": "user", "reasoning": "deontological", "policy": "none", "emotion": "indifference"},
  {"id": "ytc_UgxvMpPopnv6lGrp3Ah4AaABAg", "responsibility": "none", "reasoning": "consequentialist", "policy": "none", "emotion": "approval"},
  {"id": "ytc_UgwdSCD56Lxs9Dtfzwt4AaABAg", "responsibility": "government", "reasoning": "consequentialist", "policy": "regulate", "emotion": "fear"},
  {"id": "ytc_UgzpHMTXZEUlnjrN1Jt4AaABAg", "responsibility": "none", "reasoning": "unclear", "policy": "unclear", "emotion": "mixed"}
]
```
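A raw response like the one above is a JSON array of codings keyed by comment ID, with the four dimensions from the table (responsibility, reasoning, policy, emotion). A minimal sketch of parsing and indexing such a response, assuming this schema (the field names come from the JSON shown here; the `index_codings` helper is hypothetical):

```python
import json

# Two entries copied from the raw response above, as sample input.
RAW = '''[
  {"id": "ytc_UgwTHACpuNH3alMZ-At4AaABAg", "responsibility": "developer",
   "reasoning": "deontological", "policy": "none", "emotion": "indifference"},
  {"id": "ytc_UgyBCSaYB73YLKszb_d4AaABAg", "responsibility": "none",
   "reasoning": "consequentialist", "policy": "none", "emotion": "approval"}
]'''

# Fields every coding must carry, per the schema assumed above.
REQUIRED = {"id", "responsibility", "reasoning", "policy", "emotion"}

def index_codings(raw: str) -> dict:
    """Parse a raw model response and index the codings by comment ID,
    rejecting entries that are missing any required field."""
    by_id = {}
    for row in json.loads(raw):
        missing = REQUIRED - row.keys()
        if missing:
            raise ValueError(f"{row.get('id', '?')}: missing {sorted(missing)}")
        by_id[row["id"]] = {k: row[k] for k in REQUIRED - {"id"}}
    return by_id

codings = index_codings(RAW)
print(codings["ytc_UgyBCSaYB73YLKszb_d4AaABAg"]["emotion"])  # approval
```

Indexing by ID is what makes the "look up by comment ID" view above cheap: one parse, then O(1) retrieval per comment.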