• 18107@aussie.zone
    53 up, 6 down · 5 months ago

    In this case the limit was entirely arbitrary.

    The programmers were told to pick a limit and they liked 256. There are issues with having a large number of people in a group, but it wasn’t a hardware limit for this particular case.

  • tired_n_bored@lemmy.world
    10 up · 5 months ago

    As a software engineer: there is actually no need for the member limit to be a power of 2 unless you need exactly 1 byte to store that information, which sounds ridiculous at the scale of WhatsApp.

    • NigelFrobisher@aussie.zone
      3 up · 5 months ago

      It’d make sense at protocol level. Otherwise, yeah, even bit-size database columns end up being stored as a word unless the engine compacts them.

  • ObsidianZed@lemmy.world
    9 up · 5 months ago

    I remember thinking something similar when I was a kid modding Starcraft. Max levels/ranks in researching was 256 and I always wondered why such a weirdly specific number.

  • Joh4PM@lemmy.world
    3 up, 12 down · 5 months ago

    Since you start counting from zero, the byte limit should be 255 = 2^8 - 1.
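To make the counting point concrete, here is a minimal Python sketch (the helper name `fits_in_one_byte` is made up for illustration). One unsigned byte holds the values 0 through 255, so the largest storable value is 255 while the number of distinct values is 256.

```python
def fits_in_one_byte(value: int) -> bool:
    """Return True if value can be encoded as a single unsigned byte."""
    try:
        value.to_bytes(1, "big")  # raises OverflowError outside 0..255
        return True
    except OverflowError:
        return False

# Collect every value in a slightly wider range that fits in one byte.
one_byte_values = [v for v in range(-1, 258) if fits_in_one_byte(v)]

# Smallest value, largest value, and how many distinct values there are.
print(one_byte_values[0], one_byte_values[-1], len(one_byte_values))
```

So a limit of 256 members and a maximum stored value of 255 can both be true at once, as long as ids start at 0.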

  • ch00f@lemmy.world
    8 up, 12 down · 5 months ago

    I’m typing this on a 64-bit device. It’s really odd that anyone would limit something to an 8-bit number in 2025.

    • Cousin Mose@lemmy.hogru.ch
      10 up · 5 months ago

      I get what you’re saying but I don’t like this line of thinking. In the tech industry there is far too much bloat that we just accept due to cheap memory and storage.

      • Justin@lemmy.jlh.name
        2 up · 5 months ago

        There are much better algorithmic and datatype optimizations to be made than designing your app around saving 3 bytes that most runtimes probably represent as a long long anyway.

        • Cousin Mose@lemmy.hogru.ch
          3 up · edited · 5 months ago

          True but more generally things like Electron apps, not precompiling classes in interpreted languages’ Docker images, looping through millions of records without plucking only the data you need, etc seem to be widespread and shrugged off.

          While writing code you can get in the habit of doing things efficiently and long-term the cost savings pile up. Obviously caring about only this one specific case will hardly accomplish much on its own.

    • magic_lobster_party@fedia.io
      14 up · 5 months ago

      It’s for their servers. I guess it might have to do with cache optimization. For performance, they want to fit as much as possible in the cache. One extra byte can throw the memory alignment off, which causes wasted space in cache.

      Just my guess. There might be other reasons.
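A rough way to see the alignment effect from Python is `ctypes`, which lays out structures the way the platform's C compiler would. The two structures below are invented for illustration and say nothing about WhatsApp's real data layout; the sizes assume a typical platform where a 32-bit integer is 4-byte aligned.

```python
import ctypes

class FourFlags(ctypes.Structure):
    # One 32-bit field plus four 1-byte fields: 4 + 4 = 8 bytes, no padding.
    _fields_ = [("user_id", ctypes.c_uint32),
                ("flags", ctypes.c_uint8 * 4)]

class FiveFlags(ctypes.Structure):
    # One more 1-byte field gives 9 bytes of data, which is padded up
    # to the next multiple of the 4-byte alignment.
    _fields_ = [("user_id", ctypes.c_uint32),
                ("flags", ctypes.c_uint8 * 5)]

print(ctypes.sizeof(FourFlags))
print(ctypes.sizeof(FiveFlags))
```

On such a platform the single extra byte costs four bytes per record, which is the kind of waste the guess above is pointing at.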

      • ch00f@lemmy.world
        2 up · 5 months ago

        A single username will use up more memory than an 8-bit limitation to the number of users will save.

    • Honytawk@feddit.nl
      7 up · 5 months ago

      Whatsapp has 2 billion users.

      With an 8-bit number that is 16 billion bits (2 GB); with a 64-bit number it is 128 billion bits (16 GB). That is a difference of about 14 GB, and that is just for the number.

      When working with big sizes, memory optimization is key.
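As a rough check of that arithmetic, here is a short Python sketch. The helper name `total_gigabytes` is hypothetical, gigabytes are decimal, and it assumes exactly one such number is stored per user.

```python
USERS = 2_000_000_000  # user count quoted in the comment above

def total_gigabytes(bits_per_user: int, users: int = USERS) -> float:
    """Total storage in decimal gigabytes for one number per user."""
    total_bits = bits_per_user * users
    return total_bits / 8 / 1e9

small = total_gigabytes(8)    # one 8-bit number per user
large = total_gigabytes(64)   # one 64-bit number per user
print(small, large, large - small)
```

That works out to 2 GB versus 16 GB, so the gap is about 14 GB per stored field.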

      • ch00f@lemmy.world
        1 up · 5 months ago

        How about the processor optimization as your 8 bit number needs to be packed and unpacked every time you want to use it?

        If you read the responses here, there’s enough ambiguity about the choice of 256 users to maybe put a damper on the reflex to gatekeep computer science from a journalist.

    • MudMan@fedia.io
      11 up · 5 months ago

      On a device with many gigabytes of RAM and probably terabytes of storage.

      I guess when you have billions of users, and presumably tens or hundreds of billions of instances of a thing living in your server, every bit adds up? I don’t even know where to start doing the napkin math for something like that.

      • boonhet@sopuli.xyz
        1 up · 5 months ago

        100 billion messages per day and over half of them in groups apparently. It’s a lot, but 3 bytes per message is still not a lot of data. I’d guess they pack the metadata as tight as possible.
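For a napkin version of those numbers, here is a small Python sketch. It takes the figures quoted above at face value, rounds "over half" down to exactly half, and assumes the 3 bytes come from sending a 1-byte field instead of a 4-byte one; all constant names are made up.

```python
MESSAGES_PER_DAY = 100_000_000_000  # figure quoted above
GROUP_SHARE = 0.5                   # "over half", rounded down to half
BYTES_SAVED_PER_MESSAGE = 3         # assumed: 4-byte field replaced by 1 byte

group_messages = int(MESSAGES_PER_DAY * GROUP_SHARE)
saved_per_day_gb = group_messages * BYTES_SAVED_PER_MESSAGE / 1e9
saved_per_year_tb = saved_per_day_gb * 365 / 1e3

print(saved_per_day_gb, saved_per_year_tb)
```

Under those assumptions the saving is about 150 GB per day, or roughly 55 TB per year per stored copy.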

        • MudMan@fedia.io
          1 up · 5 months ago

          That’s the thing, right? What’s “a lot of data” at these scales? Since they keep all these messages indefinitely (and users keep up to two copies of each, too) that’s 100 gigs of data per byte that they save per day. 40 Terabytes per year. Plus 40 more among their collective users and another 40 presumably stashed in some Google Drive somewhere.

          It’s a lot for me, and it’ll cost you what? A couple grand to store at home? That’s a drop in the ocean of a company like Meta with petabytes upon petabytes of garbage stored all over the place… but then again, if I was making a thing and I could shave 40TB a year of storage I… probably would?

          I don’t know, the scope of modern, monopolistic online services is mind-boggling. I’m in the space where I’m savvy enough to understand how massive this nonsense is but also not working on it directly enough to be desensitized about the numbers. It’s like trying to figure out how many people live on the planet, your brain can parse that it can’t parse what you’re trying to do and the dissonance makes you all wobbly.

  • spongebue@lemmy.world
    34 up, 1 down · 5 months ago

    So, I get that 256 is a power of 2. But we’re not running 8-bit servers or whatever here (and yes, I understand that’s not what 8-bit generally refers to). Is there some kind of technical limitation I’m not thinking of where 257 would be any more difficult to implement, or is it really just that 256 has a special place in someone’s heart because it’s a power of 2?

    • jaaake@lemmy.world
      4 up · 5 months ago

      The issue isn’t storing each individual ID, it’s all of the networking operations that are done and total things that are stored/cached per user in each chat. All of those things are handled and stored as efficiently as possible. Sure they could set it to any number, but 256 is a nice round one when considering everything that is happening and the use cases involved. They have user research data and probably see that 128 is too close to a group size that happens with some regularity, but group sizes very rarely get close to 256, and 512 is right out.

    • mEEGal@lemmy.world
      13 up · 5 months ago

      when writing somewhat low-level code, you always make assumptions about things. in this case, they chose to manage 256 entries in some array; the bound used to be lower.

      but implicitly there’s a tradeoff, probably memory / CPU utilisation in the server.

      it’s always about the tradeoff between what the users want, what is easier for you to maintain, what your infrastructure can provide, etc.

    • AbsolutelyNotAVelociraptor@sh.itjust.works
      54 up, 1 down · 5 months ago

      Because 256 is exactly the number of values one byte can hold. If you want to add a 257th member, you need a whole second byte just for that one person. That’s a waste of memory, unless you want to go to the 64k barrier of users per chat.

      • spongebue@lemmy.world
        4 up, 2 down · 5 months ago

        If each user is assigned a number as to where they’re placed in the group, I guess. But what happens when people are added and removed? If #145 leaves a full group, do #146 and beyond get decremented to make room for the new #256 (or #255 if zero-indexed)? It just doesn’t seem like something you’d actually see in code not designed by a first semester CS student.

        Also, more importantly, memory is cheap AF now 🤷‍♂️

        • ViatorOmnium@piefed.social
          3 up · 5 months ago

          Memory and network stop being cheap AF when you multiply it by a billion users. And Whatsapp is a mobile app that’s expected to work on the crappiest of networks and connections.

          • spongebue@lemmy.world
            1 up · 5 months ago

            It is also used to transmit data including video. I don’t think an additional byte is noticeable on that kind of scale

        • SandmanXC@lemmy.world
          40 up · 5 months ago

          While I completely agree with the sentiment, snorting too much “memory is cheap AF” could lead to terminal cases of Electron.

      • Zagorath@aussie.zone
        23 up, 11 down · 5 months ago

        Except that they’re almost certainly just using int, which is almost certainly at least 32 bits.

        256 is chosen because the people writing the code are programmers. And just like regular people like multiples of 10, programmers like powers of 2. They feel like nice round numbers.

        • verstra@programming.dev
          47 up, 2 down · 5 months ago

          Well, no. They are not certainly using int, they might be using a more efficient data type.

          This might be for legacy reasons or it might be intentional because it might actually matter a lot. If I make up an example, chat_participant_id is definitely stored with each message and probably also in some index, so you can search the messages. Multiply this over all chats on WhatsApp, even the ones with only two people in, and the difference between u8 and u16 might matter a lot.

          But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

          • Zagorath@aussie.zone
            6 up, 12 down · 5 months ago

            They are not certainly using int

            Probably why I said “almost certainly”. And I stand by that. We’re not talking about chat_participant_id, we’re talking about GROUP_CHAT_LIMIT, probably a constant somewhere. And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.
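The 9-bit claim is easy to check in Python with `int.bit_length()`. The constant name comes from the comment above; its value of 256 is the limit under discussion, and nothing here reflects actual WhatsApp code.

```python
GROUP_CHAT_LIMIT = 256  # name from the comment above, value from the thread

print((255).bit_length())             # bits needed for the largest 8-bit value
print(GROUP_CHAT_LIMIT.bit_length())  # bits needed to store 256 itself
print(GROUP_CHAT_LIMIT & 0xFF)        # what survives if 256 is cut to 8 bits
```

The value 255 needs 8 bits, 256 needs 9, and truncating 256 to 8 bits leaves 0, which is the "interprets a 0 as 256" case described above.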

            • boonhet@sopuli.xyz
              9 up, 1 down · edited · 5 months ago

              Orrrr they have a u8 chat_participant_id of some kind and a binary data format for message passing. The GROUP_CHAT_LIMIT const may have a bigger data type, but they may very well be trying to conserve 3 bytes per message. Ids can easily start at 0.

              150 gigs of bandwidth saved per day doesn’t seem like a whole lot at their scale, but if they archive all the metadata, that’s over 50 terabytes a year saved on storage - multiplied by how many copies they have of their data. Still not a lot tbh, but if they also conserve data in every other place they can, they could be saving petabytes per year in storage.

              Still weird because then they’d have to reuse ids when people leave, otherwise you could join and leave 255 times to disable a group lol
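One way the id reuse could work is a free list of slots, sketched below in Python. The class `SlotAllocator` and its methods are hypothetical and only illustrate the idea; nothing in the thread confirms that WhatsApp does this.

```python
import heapq

class SlotAllocator:
    """Hands out participant ids 0..255 and recycles ids that are released."""

    def __init__(self, capacity: int = 256):
        self._free = list(range(capacity))  # a sorted list is a valid min-heap
        self._used = set()

    def join(self) -> int:
        if not self._free:
            raise RuntimeError("group is full")
        slot = heapq.heappop(self._free)  # lowest free id first
        self._used.add(slot)
        return slot

    def leave(self, slot: int) -> None:
        self._used.remove(slot)           # KeyError if the slot was not in use
        heapq.heappush(self._free, slot)  # make the id available again

alloc = SlotAllocator()
for _ in range(1000):          # join and leave far more than 255 times
    alloc.leave(alloc.join())
print(alloc.join())            # ids never run out, the lowest one is reused
```

With recycling, nobody gets renumbered when a member leaves, and repeatedly joining and leaving cannot exhaust the id space.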

            • Passerby6497@lemmy.world
              4 up, 2 down · edited · 5 months ago

              And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.

              I think you’re just very confused friend, or misunderstanding how binary counting works, because why in the 9 hells would they be using 9 bits (512 possible values) to store 8 bits (256 possible members) of data?

              I think you’re confusing indexing (0-255) with counting (0-256), and mistakenly including a negation state (counting 0, which would be a null state for the variable) in your conception of the process. Because yes, index 255 is in fact count 256 and 0 would actually be 1. Index = count -1

              • Zagorath@aussie.zone
                1 up, 2 down · 5 months ago

                I’m imagining something like this:

                def add_member(group, user):
                    # Add only while the group is still below the limit.
                    if len(group.members) < GROUP_CHAT_LIMIT:
                        ...
                

                If GROUP_CHAT_LIMIT is 8 bits, this does not work.

                • Passerby6497@lemmy.world
                  4 up, 1 down · 5 months ago

                  So add a +1 like you would for any index to count comparison?

                  I guess I’m failing to see how this doesn’t work as long as you properly handle the comparison logic. Maybe you can explain how this doesn’t work…
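A minimal sketch of that index-versus-count approach, in Python: keep the highest member index (255, which fits in 8 bits) instead of the count 256, and compare against it. The names `MAX_MEMBER_INDEX` and `can_add_member` are invented for illustration.

```python
MAX_MEMBER_INDEX = 255  # fits in 8 bits; the implied limit is MAX_MEMBER_INDEX + 1

def can_add_member(current_count: int) -> bool:
    """True while one more member still keeps the group at or below 256."""
    # The new member would get index current_count, which must fit in a byte.
    return current_count <= MAX_MEMBER_INDEX

print(can_add_member(255), can_add_member(256))
```

The check admits the 256th member and rejects the 257th without the number 256 ever being stored.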

          • MyBrainHurts@lemmy.ca
            39 up, 1 down · 5 months ago

            But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

            Shots fired.

            • ByteJunk@lemmy.world
              2 up, 2 down · 5 months ago

              Fair point, but still better than wasting a nuclear power plant worth of electricity to solve math homework with an LLM

            • jaybone@lemmy.zip
              4 up · 5 months ago

              All these tough guys think you can’t bit shift in Java, never worked on a project with more than two people. Many such cases.

        • ViatorOmnium@piefed.social
          11 up, 2 down · 5 months ago

          For high volume wire formats using uint8 instead of uint32 can make a huge difference when considering the big picture. Not everyone is working on bootcamp level software.
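To show what that means on the wire, here is a small Python sketch using the standard `struct` module. The header layout (a participant id followed by a 1-byte message type) and the function `encode_header` are purely hypothetical, not WhatsApp's actual protocol.

```python
import struct

def encode_header(participant_id: int, wide: bool) -> bytes:
    """Pack a made-up per-message header: participant id + 1-byte message type."""
    # "<" means little-endian with no padding; B is uint8, I is uint32.
    fmt = "<IB" if wide else "<BB"
    return struct.pack(fmt, participant_id, 1)

narrow = encode_header(200, wide=False)
wide = encode_header(200, wide=True)
print(len(narrow), len(wide))
```

The uint8 version of this header is 2 bytes and the uint32 version is 5, so the 3-byte difference is paid on every single message.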

        • jaybone@lemmy.zip
          5 up · 5 months ago

          It’s not that they “like it”. It’s ultimately a hardware limitation. Of course we can have 64 bit integers, or however many bits. It’s an appealing optimization.

        • Lodespawn@aussie.zone
          24 up · 5 months ago

          It’ll have to do with packet headers; 8 bits is a lot for an instant message packet header.

    • SparroHawc@lemmy.zip
      9 up · 5 months ago

      There’s often a lot of fun cheats you can use - bitwise operators, etc - if your numbers are small powers of two.

      Also it’s easier to organize memory, if you’re doing funky memory management tricks, if the memory you’re allocating fits nicely into the blocks available to you which are always in powers of two.

      They’re not necessarily great reasons if you’re using a language with sufficient abstraction, but it’s still easier in most instances to use powers of two anyways if you’re getting into the guts of things.
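Two of the classic power-of-two tricks, sketched in Python with made-up helper names: testing whether a number is a power of two, and replacing a modulo with a bit mask when wrapping an index.

```python
def is_power_of_two(n: int) -> bool:
    """A positive power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0

def wrap_index(i: int, size: int) -> int:
    """Same result as i % size, but only valid when size is a power of two."""
    assert is_power_of_two(size)
    return i & (size - 1)

print(is_power_of_two(256), is_power_of_two(257))
print(wrap_index(300, 256), 300 % 256)
```

The mask version only works because 256 - 1 is all ones in binary; with a size like 257 you are back to a real division.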

  • Echo Dot@feddit.uk
    13 up · 5 months ago

    That’s a super old article as well.

    They got rightfully roasted in the comments for not knowing even the most basic things about computing.

  • BilboBargains@lemmy.world
    24 up, 1 down · 5 months ago

    I remember being puzzled by this and many other numbers that kept cropping up. 32, 64, 128, 256, 1024, 2048… Why do programmers and electronic engineers hate round numbers? The other set of numbers that was mysterious was timber and sheet materials. They cut them to 1220 x 2440mm and thicknesses of 18 and 25mm. Are programmers and the timber merchants part of some diabolical conspiracy?

      • jj4211@lemmy.world
        2 up · 5 months ago

        Pretty much this…

        Once upon a time, sure, you might have used an 8 bit char to store an array index and incur a 256 limit for actual reasons…

        But nowadays, you do it because 256 is a “cool techy limit”. Developers are almost all dealing with at least 32 bit values, and the actual constraints driving smaller values generally have nothing to do with some power of two limitation.