edit to clarify a misconception in the comments: this is an instagram post, so “caption” refers to the description under the image or video

as an example, this text i am typing now is also a “caption”

just saying because someone started a debate misunderstanding this to be about subtitles (aka “closed captions”) and that’s just not the case 👍

  • MeaanBeaan@lemmy.world
    28 upvotes · 3 months ago

    If you’re capable enough to bitch about being too disabled to use your brain, you’re capable enough to write your own caption.

  • Resistai@lemmy.world
    128 upvotes, 6 downvotes · 3 months ago

    Disabled people are using their disability as a reason to defend AI without acknowledging that disabled people will be the first to suffer when it comes to the climate crisis, water crisis, displacement, lack of privacy, and all kinds of inequity. AI is not here to help disabled people; it’s here to further capitalist billionaire goals.

  • Broadfern@lemmy.world
    205 upvotes, 4 downvotes · 3 months ago

    This feels like the weaponization of disability rights language but I’m not sure.

    • Turret3857@infosec.pub
      146 upvotes, 3 downvotes · 3 months ago

      It definitely is. As someone who actually struggles with severe ADHD this comment makes my piss boil.

      • Natanox@discuss.tchncs.de
        26 upvotes, 1 downvote · 3 months ago

        I second that, this person is actually just lazy. I got ADHD and I always add fucking alt text, it’s part of the normal post routine no matter if I took my meds or not. And it’s not like you can’t edit it into posts if you clicked send too quickly.

        I’d even argue it makes your social media experience better. It forces awareness of what you do and gives you time to reflect on your post.

        • Una@europe.pub
          15 upvotes · 3 months ago

          There was someone on TikTok defending AI “art” who said that he has ADHD, that it’s hard for him to concentrate on art, and that AI makes his life “easier” by allowing him to feel like he did something. I don’t remember exactly, but it was something like that. But he also forgot how many disabled people there are, with different disabilities, who are still able to make basically perfect art. He also mentioned how he wasn’t born with talent, not that talent really exists anyway.

          • HopeOfTheGunblade@lemmy.blahaj.zone
            4 upvotes · 3 months ago

            Have ADHD, picking up a pencil intermittently when we have the executive function. Shit’s harder for us but come on.

            We’d mind a lot less if people treated it like getting a commission. Sure, it’s cool that there’s art of your character, but you didn’t do the drawing, you just gave some specifics.

            • Norah (pup/it/she)@lemmy.blahaj.zone
              7 upvotes, 1 downvote · 3 months ago

              How is this related to what you’re replying to? Just seems ableist. Like the other person said, the TikTok-er says they have ADHD. I totally get why the instant gratification of AI could be nice with ADHD; that’s why I avoid it at all costs.

    • spujb@lemmy.cafe (OP)
      20 upvotes · edited · 3 months ago

      to clarify we are talking about a post caption, not closed captions.

      that is, the text you put in the description of an image or video post.

      • thejoker954@lemmy.world
        1 upvote · 3 months ago

        Thanks for the clarification. Lol kinda feeds into my whole let’s call things more accurately debate.

    • vzqq@lemmy.blahaj.zone
      9 upvotes · edited · 3 months ago

      Yes and no. There are specialized models that perform better than general-purpose LLMs with vastly lower resource use. But… the output part is essentially a language model too, so it’s prone to a lot of the same issues.

      They perform A LOT better than traditional models though. So much better it’s not even funny.

    • RushLana@lemmy.blahaj.zone
      53 upvotes, 4 downvotes · 3 months ago

      As someone who uses a screen reader daily, absolutely the fuck not.

      LLMs will invent things out of thin air and ruin any comprehension. It wastes my time rather than helping me.

      • thejoker954@lemmy.world
        7 upvotes, 17 downvotes · 3 months ago

        If you use any generic LLM then yes, but there are LLMs (like I said in another reply, it’s probably not an LLM, but as there is no ‘real’ AI that’s what I’m calling all this AI bullshit) that are trained specifically for captioning/transcripts, just not necessarily done in real time.

        Doing it “live” is what increases the error rate.

          • thejoker954@lemmy.world
            3 upvotes, 7 downvotes · 3 months ago

            I have to disagree with you. AI is never a more accurate way to describe what we have now. Not until they call true AI something different.

            I know it’s a weird hill to die on, but die on it I will. Calling one artificial intelligence and one virtual intelligence could work.

            Also it’s my understanding that LLMs are considered a type of neural net, so I don’t see it being more accurate to call it a neural net vs an LLM.

            And they are all subsets of machine learning, so calling it an ML model leads me back to the same issue I have with “AI” (and the same reason those loser USB fucks can suck a bag of dildos): lack of clarity about what it actually can do.

            • Norah (pup/it/she)@lemmy.blahaj.zone
              5 upvotes, 1 downvote · 3 months ago

              Then call it ML or a neural net. Using the term LLM like you are for other forms of machine learning is just going to cause needless confusion, like it has in this thread.

              • thejoker954@lemmy.world
                1 upvote · 3 months ago

                No. “Machine learning” is the root of the tree.

                Or to steal another commenter’s attempt to have me call it that: that would be like calling a chihuahua a wolf.

                Machine learning -> neural net -> LLM. That’s the basic “path”. I don’t CARE if LLM is technically wrong when using machine learning or neural net is also inaccurate.

                If anything y’all should be arguing for me to call it ASR 2.0

        • RushLana@lemmy.blahaj.zone
          6 upvotes, 1 downvote · 3 months ago

          I will frame it another way. You cannot automate subtitles or captions. And I always find reviewing automated output harder than doing it yourself.

    • Jack@slrpnk.net
      109 upvotes, 1 downvote · 3 months ago

      No, what you are thinking of is speech-to-text software. It is much older than LLMs and works in a very different way.

      • thejoker954@lemmy.world
        13 upvotes, 2 downvotes · 3 months ago

        While speech-to-text software indeed predates LLMs, LLMs do it as well. I’ve only tried a few basic (aka free) options so no idea how well they do en masse, but the generated results were at least on par with, if not better than, YouTube’s auto-captions.

        It might not technically be LLMs though. It could be a different type of “AI”. I just can’t stand the “AI” marketing when nothing they are making is actually AI, so until they pull their heads out of their asses all “AI” models are LLMs to me.

        • Jack@slrpnk.net
          9 upvotes · 3 months ago

          Understandable, AI marketing now is a shitshow, but they are not even AI, I think. People just forget that tech used to do magic before AI existed.

          • LwL@lemmy.world
            6 upvotes · 3 months ago

            It’s kind of the other way around, we’ve always had AI, it used to just basically mean a computer making some decision based on data. Like a thermostat changing the heating in response to a temperature change.

            Then we got LLMs and because they are good at pretending to have complex reasoning ability, AI as a term started to always mean “computer with near human level intelligence” which of course they are absolutely not.

            • Jack@slrpnk.net
              3 upvotes · 3 months ago

              There was a book I can’t remember, the whole thesis was exactly that. “AI is whatever automates the decision making process” not any group of algos

          • ButteryMonkey@piefed.social
            6 upvotes · 3 months ago

            This is a big part of it. Back when AI was first becoming big, my manager said they needed to run all my KB articles through an AI to generate link clouds or some such.

            I was like umm… that’s a service this platform has always offered…? Like just because you don’t know what the KB tools do, or what our rock bottom subscription gets us, doesn’t mean I haven’t looked into it… but that also isn’t worth doing because now we only have a handful of articles in any given category because I’m good at my job…

        • oplkill@lemmy.world
          3 upvotes, 35 downvotes · 3 months ago

          Nope, they still not good. I using YouTube auto gen subs and they 100% need LLM to fix mistakes.

          • AnarchoEngineer@lemmy.dbzer0.com
            41 upvotes · 3 months ago

            Large language models are designed to generate text based on previous text. Translation from audio to text can be done via a neural net but it isn’t a Large Language Model.

            Now, you could combine the two to, say, reduce errors on mumbled words by having a generative model predict the words that would fit better in that unclear sentence. However, you could likely get away with a much smaller and faster net than an LLM; in fact, you might be able to get away with plain-Jane Markov chains, no machine learning necessary.

            Point is that there is a difference between LLMs and other neural nets that produce text.

            In the case of audio to text translation, using an LLM would be very inefficient and slow (possibly to the point it isn’t able to keep up with the audio at all), and using a very basic text generation net or even just a probabilistic algorithm would likely do the job just fine.
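
As a rough illustration of the Markov-chain idea in the comment above (not AnarchoEngineer’s own code, and not how any real captioning product is known to work), here is a minimal Python sketch. It assumes the speech recognizer has already produced a short list of candidate words for one unclear slot, and it picks the candidate that best fits its neighbours using bigram counts. The function names, the toy corpus, and the add-one scoring are all made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count how often each word follows another in the training sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def pick_candidate(counts, prev_word, next_word, candidates):
    """Choose the candidate that best fits between its two neighbours.

    Score = (count(prev -> candidate) + 1) * (count(candidate -> next) + 1).
    The +1 keeps unseen pairs unlikely rather than impossible.
    """
    def score(word):
        return (counts[prev_word][word] + 1) * (counts[word][next_word] + 1)

    return max(candidates, key=score)


# Toy training text (hypothetical); a real system would use far more data.
corpus = [
    "i want to recognize speech today",
    "we recognize speech with software",
    "they wreck a nice beach house",
]
counts = train_bigrams(corpus)

# The recognizer was unsure whether it heard "speech" or "beach"
# between "recognize" and "today".
best = pick_candidate(counts, "recognize", "today", ["speech", "beach"])
print(best)
```

This is just word counting, with no neural net involved, which is the point the comment makes: cleaning up a misheard word does not by itself require anything as large as an LLM.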

          • Ziglin (it/they)@lemmy.world
            17 upvotes, 1 downvote · 3 months ago

            How would an llm fix a mistake equivalent to something being misheard? I feel like you’re misunderstanding something and could probably also use some help with your English.