Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

AbuTahir@lemm.ee · edit-2 4 months ago

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well.

sp3ctr4l@lemmy.dbzer0.com · edit-2 4 months ago

This has been known for years; it is the default assumption of how these models work.

You would have to prove that some kind of actual reasoning capacity has arisen as… some kind of emergent complexity phenomenon… not the other way around.

Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.

Riskable@programming.dev · 4 months ago

Define “reasoning”. For decades software developers have been writing code with conditionals. That’s “reasoning.”

LLMs are “reasoning”… They’re just not doing human-like reasoning.

sp3ctr4l@lemmy.dbzer0.com · edit-2 4 months ago

How about, uh…

The ability to take a previously given set of knowledge, experiences and concepts, and combine or synthesize them in a consistent, non-contradictory manner, to generate hitherto unrealized knowledge or concepts, and then also be able to verify that the new knowledge and concepts are actually new, and actually valid, or at least be able to propose how one could test whether or not they are valid.

Arguably this is or involves meta-cognition, but that is what I would say… is the difference between what we typically think of as ‘machine reasoning’, and ‘human reasoning’.

Now I will grant you that a large number of humans essentially cannot do this: they suck at introspecting and maintaining logical consistency, they are just told ‘this is how things work’, and they never question that until decades later, when their lives force them to address, or dismiss, their own internally inconsistent beliefs.

But I would also say that this means they are bad at ‘human reasoning’.

Basically, my definition of ‘human reasoning’ is perhaps more accurately described as ‘critical thinking’.

technocrit@lemmy.dbzer0.com · edit-2 4 months ago

Peak pseudo-science. The burden of evidence is on the grifters who claim “reason”. But neither side has any objective definition of what “reason” means. It’s pseudo-science against pseudo-science in a fierce battle.

x0x7@lemmy.world · edit-2 4 months ago

Even defining reason is hard and becomes a matter of philosophy more than science. For example, apply the same claims to people. Now I’ve given you something to think about. Or should I say the Markov chain in your head has a new topic to generate thought states for.

I_Has_A_Hat@lemmy.world · edit-2 4 months ago

By many definitions, reasoning IS just a form of pattern recognition so the lines are definitely blurred.

Echo Dot@feddit.uk · 4 months ago

And does it even matter anyway?

For the sake of argument, let’s say that somebody manages to create an AGI. Does it matter whether it has reasoning abilities if it works anyway? No one has proven that sapience is required for intelligence. After all, we only have a sample size of one, so hardly any conclusions can really be drawn from that.

minoscopede@lemmy.world · edit-2 4 months ago

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it’s not obvious by any means. This finding is not showing a problem with LLMs’ abilities in general. The issue they discovered is specifically for so-called “reasoning models” that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that’s a flaw that needs to be corrected before models can actually reason.

AbuTahir@lemm.ee · 4 months ago

Cognitive scientist Douglas Hofstadter (1979) argued that reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn’t if AI can reason, but how its reasoning differs from ours.

theherk@lemmy.world · 4 months ago

Yeah these comments have the three hallmarks of Lemmy:

AI is just autocomplete mantras.
Apple is always synonymous with bad and dumb.
Rare pockets of really thoughtful comments.

Thanks for being at least the last of these.

Tobberone@lemm.ee · 4 months ago

What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to “reasoning models” that allows them to break free of the inherent boundaries of the statistical methods they are based on?

minoscopede@lemmy.world · edit-2 4 months ago

I’d encourage you to research more about this space and learn more.

As it is, the statement “Markov chains are still the basis of inference” doesn’t make sense, because Markov chains are a separate thing. You might be thinking of Markov decision processes, which are used in training RL agents, but that’s also unrelated because these models are not RL agents, they’re supervised learning agents. And even if they were RL agents, the MDP describes the training environment, not the model itself, so it’s not really used for inference.
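
For anyone unsure what “Markov chain” means here, a minimal sketch in Python: a first-order chain picks the next word by looking only at the current word. The toy corpus and the names `build_chain` and `generate` are made up for illustration. This only shows what the term means; it is not a description of how any LLM does inference.

```python
import random
from collections import defaultdict

def build_chain(words):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)

def generate(chain, start, length, seed=0):
    """Walk the chain: each step depends only on the current word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:  # dead end: this word never had a successor
            break
        out.append(rng.choice(followers))
    return out

# Hypothetical toy corpus, chosen only to keep the example small.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
chain = build_chain(corpus)
print(" ".join(generate(chain, "the", 8)))
```

Note that nothing before the current word influences the next choice, which is the defining property of a first-order chain.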

I mean this just as an invitation to learn more, and not pushback for raising concerns. Many in the research community would be more than happy to welcome you into it. The world needs more people who are skeptical of AI doing research in this field.

Tobberone@lemm.ee · 4 months ago

Which method, then, is the inference built upon, if not the embeddings? And the question still stands, how does “AI” escape the inherent limits of statistical inference?

technocrit@lemmy.dbzer0.com · edit-2 4 months ago

There’s probably a lot of misunderstanding because these grifters intentionally use misleading language: AI, reasoning, etc.

If they stuck to scientifically descriptive terms, it would be much more clear and much less sensational.

Knock_Knock_Lemmy_In@lemmy.world · 4 months ago

When given explicit instructions to follow, models failed because they had not seen similar instructions before.

This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.

MangoCats@feddit.it · 4 months ago

I’m not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.

If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.

Knock_Knock_Lemmy_In@lemmy.world · 4 months ago

Sure. We weren’t discussing if AI creates value or not. If you ask a different question then you get a different answer.

MangoCats@feddit.it · 4 months ago

Well - if you want to devolve into argument, you can argue all day long about “what is reasoning?”

technocrit@lemmy.dbzer0.com · edit-2 4 months ago

This would be a much better paper if it addressed that question in an honest way.

Instead they just parrot the misleading terminology that they’re supposedly debunking.

How dat collegial boys club undermines science…

Knock_Knock_Lemmy_In@lemmy.world · edit-2 4 months ago

You were starting a new argument. Let’s stay on topic.

The paper implies “Reasoning” is application of logic. It shows that LRMs are great at copying logic but can’t follow simple instructions that haven’t been seen before.

REDACTED@infosec.pub · edit-2 4 months ago

What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart algorithms or a bunch of instructions for software (if/then) were officially, by definition, software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it’s no longer reasoning? I feel like at this point a more relevant question is “What exactly is reasoning?”. Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.

https://en.wikipedia.org/wiki/Reasoning_system
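
The kind of if/then system meant here can be sketched in a few lines. This is a toy forward-chaining rule engine in Python; the rules, the facts and the name `forward_chain` are invented for the example and not taken from any real product.

```python
def forward_chain(facts, rules):
    """Apply if/then rules repeatedly until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (conditions that must all hold, fact to conclude).
rules = [
    (("has_fur", "gives_milk"), "is_mammal"),
    (("is_mammal", "eats_meat"), "is_carnivore"),
    (("is_carnivore", "has_stripes"), "is_tiger"),
]

derived = forward_chain({"has_fur", "gives_milk", "eats_meat", "has_stripes"}, rules)
print(sorted(derived))
```

Every conclusion it reaches is fully determined by the hand-written rules, which is roughly the sense in which plain conditionals used to be called machine reasoning.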

stickly@lemmy.world · 4 months ago

If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It’s like comparing PhD reasoning to a dog’s reasoning.

While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g., why they fail at the shell game).

Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it’s designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don’t have the tech to make a synthetic human.

MangoCats@feddit.it · 4 months ago

I think as we approach the uncanny valley of machine intelligence, it’s no longer a cute cartoon but a menacing creepy not-quite imitation of ourselves.

technocrit@lemmy.dbzer0.com · 4 months ago

It’s just the internet plus some weighted dice. Nothing to be afraid of.

technocrit@lemmy.dbzer0.com · 4 months ago

Sure, these grifters are shady AF about their wacky definition of “reason”… But that’s just a continuation of the entire “AI” grift.

FreakinSteve@lemmy.world · 4 months ago

NOOOOOOOOO

SHIIIIIIIIIITT

SHEEERRRLOOOOOOCK

jj4211@lemmy.world · 4 months ago

Without explicit, well-researched material, the marketing presentation gets to stand largely unopposed.

So this is good even if most experts in the field consider it an obvious result.

800XL@lemmy.world · 4 months ago

Except for Siri, right? Lol

Threeme2189@lemmy.world · 4 months ago

Apple Intelligence

technocrit@lemmy.dbzer0.com · 4 months ago

The funny thing about this “AI” griftosphere is how grifters will make some outlandish claim and then different grifters will “disprove” it. Plenty of grant/VC money for everybody.

flandish@lemmy.world · 4 months ago

stochastic parrots. all of them. just upgraded “soundex” models.

this should be no surprise, of course!

MangoCats@feddit.it · 4 months ago

It’s not just the memorization of patterns that matters, it’s the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that’s value - that’s the new Google.

cactopuses@lemm.ee · 4 months ago

While that’s a fair idea, there are still two issues with it: hallucinations and the cost of running the models.

Unfortunately, it takes significant compute resources to produce even simple responses, and these responses can be totally made up, but still made to look completely real. It’s gotten much better, sure, but blindly trusting these things (which many people do) can have serious consequences.

MangoCats@feddit.it · edit-2 4 months ago

Hallucinations and the cost of running the models.

So, inaccurate information in books is nothing new. Agreed that the rate of hallucinations needs to decline, a lot, but there has always been a need for a veracity filter - just because it comes from “a book” or “the TV” has never been an indication of absolute truth, even though many people stop there and assume it is. In other words: blind trust is not a new problem.

The cost of running the models is an interesting one - how does it compare with publication on paper to ship globally to store in environmentally controlled libraries which require individuals to physically travel to/from the libraries to access the information? What’s the price of the resulting increased ignorance of the general population due to the high cost of information access?

What good is a bunch of knowledge stuck behind a search engine when people don’t know how to access it, or access it efficiently?

Granted, search engines already take us 95% (IMO) of the way from paper libraries to what AI is almost succeeding in being today, but ease of access of information has tremendous value - and developing ways to easily access the information available on the internet is a very valuable endeavor.

Personally, I feel more emphasis should be put on establishing the veracity of the information before we go making all the garbage easier to find.

I also worry that “easy access” to automated interpretation services is going to lead to a bunch of information encoded in languages that most people don’t know because they’re dependent on machines to do the translation for them. As an example: shiny new computer language comes out but software developer is too lazy to learn it, developer uses AI to write code in the new language instead…

surph_ninja@lemmy.world · 4 months ago

You assume humans do the opposite? We literally institutionalize humans who do not follow set patterns.

LemmyIsReddit2Point0@lemmy.world · 4 months ago

deleted by creator

silasmariner@programming.dev · 4 months ago

Some of them, sometimes. But some are adulated and free and contribute vast swathes to our culture and understanding.

petrol_sniff_king@lemmy.blahaj.zone · 4 months ago

Maybe you failed all your high school classes, but that ain’t got none to do with me.

surph_ninja@lemmy.world · 4 months ago

Funny how triggering it is for some people when anyone acknowledges humans are just evolved primates doing the same pattern matching.

NotASharkInAManSuit@lemmy.world · 4 months ago

We actually have sentience, though, and are capable of creating new things and having realizations. AI isn’t real, and LLMs and diffusion models are simply reiterating algorithmic patterns; no LLM or diffusion model can create anything original or expressive.

Also, we aren’t “evolved primates.” We are just primates. The thing is, primates are the most socially and cognitively evolved species on the planet, so that’s not a denigrating sentiment unless you’re a pompous condescending little shit.

surph_ninja@lemmy.world · edit-2 4 months ago

The denigration of simulated thought processes, paired with aggrandizing of wetware processing, is exactly my point. The same self-serving narcissism that’s colored so many biased & flawed arguments in biological philosophy putting humans on a pedestal above all other animals.

It’s also hysterical and ironic that you insist on your own level of higher thinking, as you regurgitate an argument so unoriginal that a bot could’ve easily written it. Just absolutely no self-awareness.

NotASharkInAManSuit@lemmy.world · 4 months ago

It’s not higher thinking, it’s just actual thinking. Computers are not capable of that and never will be. It’s not a level of fighting progress, or whatever you are trying to get at, it’s just a realistic understanding of computers and technology. You’re jerking off a pipe dream, you don’t even understand how the technology you’re talking about works, and calling a brain “wetware” perfectly outlines that. You’re working on a scriptwriter’s level of understanding of how computers, hardware, and software work. You lack the grasp to even know what you’re talking about; this isn’t Johnny Mnemonic.

surph_ninja@lemmy.world · 4 months ago

I call the brain “wetware” because there are companies already working with living neurons to be integrated into AI processing, and it’s an actual industry term.

That you so confidently declare machines will never be capable of processes we haven’t even been able to clearly define ourselves, paired with your almost religious fervor in opposition to its existence, really speaks to where you’re coming from on this. This isn’t coming from an academic perspective. This is clearly personal for you.

NotASharkInAManSuit@lemmy.world · 4 months ago

Here’s the thing: I’m not against LLMs and diffusion for things they can actually be used for. They have potential for real things, just not at all the things you pretend exist. Neural implants aren’t AI. An intelligence is self-aware; if we achieved AI it wouldn’t be a program. You’re misconstruing Virtual Intelligence for artificial intelligence and you don’t even understand what a virtual intelligence is. You’re simply delusional in what you believe computer science and technology is, how it works, and what it’s capable of.

El Barto@lemmy.world · 4 months ago

It’s not that institutionalized people don’t follow “set” pattern matches. That’s why you’re getting downvotes.

Some of those humans can operate with the same brain rules alright. They may even be more efficient at it than you and I are. The higher-level functions are a different thing.

surph_ninja@lemmy.world · 4 months ago

That’s absolutely what it is. It’s a pattern on here. Any acknowledgment of humans being animals or less than superior gets hit with pushback.

Auli@lemmy.ca · 4 months ago

Humans are animals. But an LLM is not an animal and has no reasoning abilities.

surph_ninja@lemmy.world · edit-2 4 months ago

It’s built by animals, and it reflects them. That’s impressive on its own. Doesn’t need to be exaggerated.

NotASharkInAManSuit@lemmy.world · 4 months ago

Impressive ≠ substantial or beneficial.

El Barto@lemmy.world · 4 months ago

I didn’t say we aren’t animals or that we don’t follow physics rules.

But what you’re saying is the equivalent of “everything that goes up will eventually go down - that’s how physics works and you don’t see that, you’re in denial!!!11!!!1”

intensely_human@lemm.ee · 4 months ago

I appreciate your telling the truth. No downvotes from me. See you at the loony bin, amigo.

Grizzlyboy@lemmy.zip · 4 months ago

What a dumb title. I proved it by asking a series of questions. It’s not AI, stop calling it AI, it’s a dumb af language model. Can you get a ton of help from it, as a tool? Yes! Can it reason? NO! It never could and for the foreseeable future, it will not.

It’s phenomenal at patterns, much much better than us meat peeps. That’s why they’re accurate as hell when it comes to analyzing medical scans.

skisnow@lemmy.ca · 4 months ago

What’s hilarious/sad is the response to this article over on reddit’s “singularity” sub, in which all the top comments are people who’ve obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don’t understand AI or “reasoning”. It’s a weird cult.

technocrit@lemmy.dbzer0.com · 4 months ago

ICYMI: A.I. is a Religious Cult with Karen Hao

intensely_human@lemm.ee · 4 months ago

Fair, but the same is true of me. I don’t actually “reason”; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a “nasty logic error” pattern match at some point in the process, I “know” I’ve found a “flaw in the argument” or “bug in the design”.

But there’s no from-first-principles method by which I developed all these patterns; it’s just things that have survived the test of time when other patterns have failed me.

I don’t think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.

Nalivai@lemmy.world · 4 months ago

You’re either an LLM, or you don’t know how your brain works.

conicalscientist@lemmy.world · 4 months ago

This whole era of AI has certainly pushed the brink to existential crisis territory. I think some are even frightened to entertain the prospect that we may not be all that much better than meat machines who on a basic level do pattern matching drawing from the sum total of individual life experience (aka the dataset).

Higher reasoning is taught to humans. We have the capability. That’s why we spend the first quarter of our lives in education. Sometimes not all of us are able.

I’m sure it would certainly make waves if researchers did studies based on whether dumber humans are any different than AI.

melsaskca@lemmy.ca · 4 months ago

It’s all “one instruction at a time” regardless of high processor speeds and words like “intelligent” being bandied about. “Reason” discussions should fall into the same query bucket as “sentience”.

MangoCats@feddit.it · 4 months ago

My impression of LLM training and deployment is that it’s actually massively parallel in nature - which can be implemented one instruction at a time - but isn’t in practice.

MuskyMelon@lemmy.world · edit-2 4 months ago

I use LLMs as advanced search engines. No ads or sponsored results.

Kyrgizion@lemmy.world · 4 months ago

There are ads but they’re subtle enough that you don’t recognize them as such.

Leon@pawb.social · edit-2 4 months ago

There are search engines that do this better. There’s a world out there beyond Google.

auraithx@lemmy.dbzer0.com · 4 months ago

Like what?

I don’t think there’s any search engine better than Perplexity. And for scientific research Consensus is miles ahead.

ccunning@lemmy.world · edit-2 4 months ago

On first read this sounded like you were challenging the basis of the previous comment. But then you went on to provide a couple of your own examples.

So on that basis after rereading your comment, it sounds like maybe you’re actually looking for recommendations.

I’ve seen a lot of praise for Kagi over the past year. I’ve finally started playing around with the free tier and I think it’s definitely worth checking out.

Leon@pawb.social · 4 months ago

Through the years I’ve bounced between different engines. I gave Bing a decent go some years back, mostly because I was interested in gauging the performance and wanted to just pit something against Google. After that I’ve swapped between Qwant and Startpage a bunch. I’m a big fan of Startpage’s “Anonymous view” function.

Since then I’ve landed on Kagi, which I’ve used for almost a year now. It’s the first search engine I’ve used that you can make work for you. I use the lens feature to focus on specific tasks, and de-prioritise pages that annoy me, sometimes outright omitting results from sites I find useless or unserious. For example when I’m doing web stuff and need to reference the MDN, I don’t really care for w3schools polluting my results.

I’m a big fan of using my own agency and making my own decisions, and the recent trend in making LLMs think for us is something I find rather worrying, it allows for a much subtler manipulation than what Google does with its rankings and sponsor inserts.

Perplexity openly talking about wanting to buy Chrome and harvesting basically all the private data is also terrifying, thus I wouldn’t touch that service with a stick. That said, I appreciate their candour, somehow being open about being evil is a lot more palatable to me than all these companies pretending to be good.

vala@lemmy.world · 4 months ago

No shit

WorldsDumbestMan@lemmy.today · 4 months ago

It has so much data, it might as well be reasoning, as it helped me with my problem.

LonstedBrowryBased@lemm.ee · 4 months ago

Yah of course they do they’re computers

finitebanjo@lemmy.world · 4 months ago

That’s not really a valid argument for why, but yes the models which use training data to assemble statistical models are all bullshitting. TBH idk how people can convince themselves otherwise.

intensely_human@lemm.ee · 4 months ago

They aren’t bullshitting because the training data is based on reality. Reality bleeds through the training data into the model. The model is a reflection of reality.

finitebanjo@lemmy.world · edit-2 4 months ago

An approximation of a very small, limited subset of reality, with more than a 1 in 20 error rate, that produces massive amounts of tokens in quick succession is a shit representation of reality. It is in every way inferior to human accounts, to the point of being unusable for the industries in which it is promoted.

And that Error Rate can only spike when the training data contains errors itself, which will only grow as it samples its own content.

Encrypt-Keeper@lemmy.world · 4 months ago

TBH idk how people can convince themselves otherwise.

They don’t convince themselves. They’re convinced by the multibillion-dollar corporations pouring unholy amounts of money into not only the development of AI, but its marketing. Marketing designed to not only convince them that AI is something it’s not, but also that anyone who says otherwise (like you) is just a luddite who is going to be “left behind”.

Blackmist@feddit.uk · 4 months ago

It’s no surprise to me that the person at work who is most excited by AI, is the same person who is most likely to be replaced by it.

Encrypt-Keeper@lemmy.world · 4 months ago

Yeah, the excitement comes from the fact that they’re thinking of replacing themselves and keeping the money. They don’t get to “Step 2” in their heads lmao.

turmacar@lemmy.world · edit-2 4 months ago

I think because it’s language.

There’s a famous quote from Charles Babbage, from when he presented his difference engine (a gear-based calculator): someone asked “if you put in the wrong figures, will the correct ones be output?”, and Babbage could not understand how someone could so thoroughly misunderstand that the machine is just a machine.

People are people; the main thing that’s changed since the Cuneiform copper customer complaint is our materials science and networking ability. Most people just assume that the things they interact with every day work the way they appear to on the surface.

And nothing other than a person can do math problems or talk back to you. So people assume that means intelligence.

finitebanjo@lemmy.world · 4 months ago

I often feel like I’m surrounded by idiots, but even I can’t begin to imagine what it must have felt like to be Charles Babbage explaining computers to people in 1840.

intensely_human@lemm.ee · 4 months ago

Computers are better at logic than brains are. We emulate logic; they do it natively.

It just so happens there’s no logical algorithm for “reasoning” a problem through.
