• Omega_Jimes@lemmy.ca
    English
    arrow-up
    15
    arrow-down
    3
    ·
    7 days ago

    o3 got the high score on ARC through brute force, not by being good. Raising the score from 75% to 87% required 175 times more computing power, which is not exactly a stunning return.

    • Communist@lemmy.frozeninferno.xyz
      English
      arrow-up
      5
      arrow-down
      3
      ·
      edit-2
      7 days ago

      Why does it matter?

      If it can do it through brute force, it can do it. That’s the first step towards true AGI. Nobody said the first AGI would be economical. This feels like a major goalpost shift: if you’re acknowledging it can do it at all, isn’t that insane?

      A little while ago, everyone would’ve been saying this will never happen, that there was a natural wall simply because all it does is predict the next token. It’s been, like, a few years of LLMs and they’re already getting this insane. We’re going to have AGI soon. It might not be a transformer, but billions upon billions of dollars are being thrown at this problem, there are people smart enough in the world to make this work, and this is the earliest sign that it’s coming.

      • Omega_Jimes@lemmy.ca
        English
        arrow-up
        4
        arrow-down
        1
        ·
        7 days ago

        I’m not convinced that it’s anywhere near an AGI. After combing through papers and code, I’m convinced that it’s an amazing parlor trick.

        I’d love to be proven wrong, but everything I’ve seen and everything I’ve used in my studies (using DNNs to simulate neurodivergence and spinal dysgenesis, which is kinda AI-adjacent) leads me to believe that the current path won’t lead to anything but convincing parlor tricks.

        The argument could be made: if a trick is convincing enough, does it matter if it’s intelligent or not?

          • Omega_Jimes@lemmy.ca
            English
            arrow-up
            2
            ·
            7 days ago

            I’m not entirely sure.
            A non-probabilistic algorithm, probably. Something that didn’t rely on the likelihood of association, and instead was capable of context and rationality.
            Something that wouldn’t have a system capable of saying “Put glue on your pizza” because it would know that’s a silly thing to say to a human. A system that, when asked “What’s a good caustic detergent?”, wouldn’t be able to respond “Any good caustic detergent is a good caustic detergent” because, duh. Something that doesn’t require thousands of hours of training to update and instead is capable of ingesting and rationalizing new information on the fly.

            • Communist@lemmy.frozeninferno.xyz
              English
              arrow-up
              2
              arrow-down
              1
              ·
              edit-2
              6 days ago

              You’re probabilistic, you just have an internal verifier. You think things that are silly, and then decide not to say them, all the time. A human being often thinks things that they realize are silly before they say them… That’s an entirely unfair goal in the first place from my perspective. Why does it have to be non-probabilistic?

              Are you not a general intelligence because sometimes your brain thinks silly things?

              o3 currently works precisely that way, by the way: it generates hundreds of possible things, and then uses something that checks if the steps actually work before it outputs. In fact, they then reinforce it on these correct logical steps, so it becomes better at not outputting illogical answers like the ones you mentioned.
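              To make the "generate many candidates, then check them" idea concrete, here is a minimal toy sketch. It is not o3's actual implementation, whose details are not described in this thread. The names `propose`, `verify`, and `sample_and_verify`, and the toy puzzle (find two numbers with a given sum and product), are all hypothetical and chosen only for illustration.

```python
import random


def propose(rng):
    """Hypothetical generator: blindly guesses a pair of integers."""
    return rng.randint(0, 20), rng.randint(0, 20)


def verify(candidate, target_sum=17, target_product=72):
    """Hypothetical verifier: accepts only guesses that satisfy both constraints."""
    a, b = candidate
    return a + b == target_sum and a * b == target_product


def sample_and_verify(n_samples, seed=0):
    """Draw up to n_samples guesses and return the first one the verifier accepts."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        candidate = propose(rng)
        if verify(candidate):
            return candidate
    return None  # no guess passed the check within the sampling budget


if __name__ == "__main__":
    print(sample_and_verify(5000))
```

              The generator here is deliberately dumb. The point of the sketch is only that a probabilistic proposer plus a checker never emits an answer that fails the check, at the cost of spending more samples.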

              It’s interesting that you said “not on the probability of the next word, but on context and rationality”.

              Context IS precisely that. You know what’s likely to come next because of the context; that’s you understanding context. YOU as a human being don’t even always get this right. You must realize we are not perfect beings: we think of possibilities and choose the right one. I think we’re much better at this right now, but I don’t think that’s a fundamental difference between us and o3.

              Rationality is the internal verifier.

              “Something that doesn’t require thousands of hours of training to update and instead is capable of ingesting and rationalizing new information on the fly.”

              Being able to do this is… exactly what ARC-AGI was testing. That’s literally the entire point of the benchmark, and it can do that.

              I’ve done the test, by the way. I solved it by brute-forcing possible solutions in my head, then checking whether they were true… Did you just divine the answers instantly?
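              As a rough illustration of "brute-force candidate solutions, then check them" on an ARC-style grid puzzle, here is a small sketch. The candidate set (`CANDIDATES`), the `solve` helper, and the example grids are all made up for this example. Real ARC tasks need a far richer space of transformations than the five tried here.

```python
def identity(grid):
    return [row[:] for row in grid]


def flip_h(grid):
    """Mirror each row left to right."""
    return [row[::-1] for row in grid]


def flip_v(grid):
    """Mirror the grid top to bottom."""
    return [row[:] for row in grid[::-1]]


def transpose(grid):
    return [list(col) for col in zip(*grid)]


def rotate_cw(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


# Hypothetical hypothesis space: every transformation we are willing to try.
CANDIDATES = {
    "identity": identity,
    "flip_h": flip_h,
    "flip_v": flip_v,
    "transpose": transpose,
    "rotate_cw": rotate_cw,
}


def solve(examples, test_input):
    """Try each candidate and keep the first one consistent with every example pair."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in examples):
            return name, fn(test_input)
    return None  # nothing in the hypothesis space explains the examples


if __name__ == "__main__":
    examples = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
    print(solve(examples, [[5, 6], [7, 8]]))
    # prints ('rotate_cw', [[7, 5], [8, 6]])
```

              The search itself is trivial. Everything interesting lives in how large the candidate space is and how cheaply each candidate can be checked against the worked examples.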