Two weeks ago, a user asked on the official Lutris GitHub “is lutris slop now”, noting an increasing number of “LLM generated commits”. The Lutris creator replied:
It’s only slop if you don’t know what you’re doing and/or are using low quality tools. But I have over 30 years of programming experience and use the best tool currently available. It was tremendously helpful in helping me catch up with everything I wasn’t able to do last year because of health issues / depression.
There are massive issues with AI tech, but those are caused by our current capitalist culture, not the tools themselves. In many ways, it couldn’t have been implemented in a worse way, but it wasn’t AI that bought all the RAM, it was OpenAI. It was not AI that stole copyrighted content, it was Facebook. It wasn’t AI that laid off thousands of employees, it’s deluded executives who don’t understand that this tool is an augmentation, not a replacement for humans.
I’m not a big fan of having to pay a monthly sub to Anthropic, I don’t like depending on cloud services. But a few months ago (and I was pretty much at my lowest back then, barely able to do anything), I realized that this stuff was starting to do a competent job and was very valuable. And at least I’m not paying Google, Facebook, OpenAI or some company that cooperates with the US army.
Anyway, I was suspecting that this “issue” might come up so I’ve removed the Claude co-authorship from the commits a few days ago. So good luck figuring out what’s generated and what is not. Whether or not I use Claude is not going to change society, this requires changes at a deeper level, and we all know that nothing is going to improve with the current US administration.


Let’s set aside the ethical and moral concerns about LLM usage and just discuss the technical side.
Nobody who knows anything about coding claims human code is error-free; that’s why code reviews, testing, and all the other aspects of the software development lifecycle exist.
Nobody should trust any code unless it can be verified that it does what is required consistently and predictably.
This is a known thing; paranoia doesn’t really apply here, only levels of caution appropriate to the context.
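To make “verified” concrete, here’s a minimal sketch of what I mean. The function and its requirements are invented for illustration; the point is that the requirement gets pinned down as a repeatable check, regardless of who, or what, wrote the code.

```python
# Minimal sketch: "verify" means encoding the requirement as repeatable,
# predictable checks. The function and its test cases are invented.
import pytest


def parse_version(text: str) -> tuple[int, int, int]:
    """Code under review; whether a human or an LLM wrote it is irrelevant."""
    major, minor, patch = (int(part) for part in text.strip().split("."))
    return major, minor, patch


@pytest.mark.parametrize(
    "raw, expected",
    [
        ("1.2.3", (1, 2, 3)),
        (" 0.10.0 ", (0, 10, 0)),
    ],
)
def test_parse_version(raw, expected):
    # Consistently and predictably: the same input must always
    # produce the same specified output.
    assert parse_version(raw) == expected
```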
Also, it’s not that the errors can’t be spotted; it’s just that the effort required to spot them is greater and the likelihood of missing something is higher.
Whether these problems can be overcome (or mitigated) remains to be seen, but at the moment extra effort is still required around the LLM-generated parts, which is why hiding them is counterproductive.
The creator’s claim that the tool does a competent job is important because it’s true, but it’s only true if you can verify it.
This whole issue should theoretically be negated by comprehensive acceptance criteria and testing, but if that worked in practice we’d never have any bugs in human code either.
Personally I think the “uncanny valley code” issue is an inherent part of the way LLMs work and there is no “solution” to it; the only option is to mitigate it as best we can.
I also really, really dislike the non-declarative nature of generated code, which fundamentally rules it out as a reliable end-to-end system tool unless we can get those fully comprehensive tests up to scratch, for me at least.
Thanks for taking the time to reply.
Greater compared to human code? Not sure about that, but I’m not disagreeing either. Greater compared to verifiably able programmers, sure, but in general?
I don’t think I’m getting your point here. Do you mean that the code basically lacks focus on an end goal? Or are you talking about the fuzziness and randomization of the output?
Both.
The reasons are quite hard to describe, which is why it’s such a trap, but if you spend some time reviewing LLM code you’ll see what I mean.
One reason is that it isn’t coding for logical correctness; it’s coding for linguistic passability.
Internally there are mechanisms for mitigating this somewhat, but it’s not an actual fix, so problems slip through.
The latter: if you give it the exact same input under the exact same conditions, it’s not guaranteed to give you the same output.
The fact that it’s sometimes close to the same actually makes it worse, because then you can’t tell at a glance what has changed.
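Here’s a minimal sketch of that, using the Anthropic SDK since that’s what’s under discussion (the model name is an assumption; substitute whichever you use). Even with the temperature pinned to 0, byte-identical output across runs isn’t guaranteed, and with default sampling settings it’s actively unlikely.

```python
# Sketch: send the exact same prompt twice and compare the raw output.
# Requires ANTHROPIC_API_KEY in the environment; the model name is an
# assumption, not a recommendation.
import anthropic

client = anthropic.Anthropic()


def generate(prompt: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumption: any model will do
        max_tokens=512,
        temperature=0,  # as deterministic as the API lets you ask for
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text


prompt = "Write a Python function that removes duplicates from a list, preserving order."
first = generate(prompt)
second = generate(prompt)
print("identical" if first == second else "different output for identical input")
```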
It also isn’t as simple as using a diff tool, at least for anything non-trivial, because its variations can be in logical progression as well as language.
Meaning you need to track these differences across the whole contextual area which, if you’re doing end-to-end generation, is the whole codebase.
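A toy illustration of why diffing alone doesn’t settle it, with both “generations” invented: the textual change below is a single character, but it changes how many attempts the code makes, and nothing in the diff itself tells you whether that was intended.

```python
# Two plausible generations for the same prompt, invented for
# illustration. Textually they differ by one character; logically,
# one performs an extra retry attempt.
import difflib

run_1 = """\
def should_retry(attempt, max_retries):
    return attempt < max_retries
"""

run_2 = """\
def should_retry(attempt, max_retries):
    return attempt <= max_retries
"""

# The diff flags the change but can't tell you whether it matters;
# judging that means tracing every caller, i.e. the surrounding context.
print("".join(difflib.unified_diff(
    run_1.splitlines(keepends=True),
    run_2.splitlines(keepends=True),
    fromfile="run_1",
    tofile="run_2",
)))
```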
As I said, there are mitigations, but they aren’t fixes.