Example: I have a book which I want to archive. Would sending a zip with the pages take less storage than sending the, let’s say, 10 individual pages separately?

  • molave@reddthat.com
    11 hours ago

    I don’t know the details, but in principle, the zip compression process tries to identify the textual commonalities between the pages. The more commonalities the 10 pages have, the smaller the zip file will be.

    If each page is textually very different (for example, Page 1 is “AB”, Page 2 is “CD”, etc.), it’s possible that the zip file will be larger, because it will contain the full contents of each page plus the metadata of the zip file.
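    The “AB”/“CD” case can be checked with a minimal sketch using Python’s standard `zipfile` module. The page names and contents are made up for illustration, and the archive is built in memory rather than on disk.

```python
import io
import zipfile

# Ten tiny, mutually different "pages": AB, CD, EF, ... (hypothetical data).
pages = {f"page{i:02d}.txt": bytes([65 + 2 * i, 66 + 2 * i]) for i in range(10)}

# Build the archive in memory so nothing is written to disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, data in pages.items():
        archive.writestr(name, data)

raw_size = sum(len(data) for data in pages.values())
zip_size = len(buffer.getvalue())
print(f"raw pages: {raw_size} bytes, zip archive: {zip_size} bytes")
```

    With files this small, the per-file headers and the archive index outweigh the contents, so the archive comes out larger than the pages themselves.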

    Anyone more knowledgeable can correct me on this.

  • Björn@swg-empire.de
    12 hours ago

    This depends on the file format used for the pages. If they are plain txt files, zipping them will greatly decrease the file size. If you just scanned the pages and have them as jpg, png or pdf, zipping them will not greatly decrease the file size; it might decrease a little bit or even increase a little bit.
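    A rough way to see the difference, as a sketch: compress some repetitive text and some random bytes with `zlib` (which implements DEFLATE, the usual compression method inside ZIP). The random bytes are only an assumed stand-in for already-compressed scans such as jpg, and the repeated sentence exaggerates how well real prose compresses.

```python
import os
import zlib

# Repetitive plain text (hypothetical page content).
text_page = ("It was a dark and stormy night. " * 2000).encode("utf-8")

# Random bytes: a rough stand-in for already-compressed data such as a jpg.
scanned_page = os.urandom(len(text_page))

def ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

print(f"text page:    {ratio(text_page):.3f}")
print(f"scanned page: {ratio(scanned_page):.3f}")
```

    The text ratio should come out far below 1, while the random data stays at roughly 1 or slightly above it.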

  • blackbrook@mander.xyz
    9 hours ago

    Wouldn’t trying it out and seeing how much it saved be about the same amount of work as typing in this question?

  • ℕ𝕖𝕞𝕠@slrpnk.net
    12 hours ago

    The larger the file, the more patterns that can be compressed. Plus some of the dictionary overhead will be duplicated in multiple files.

    The single file should be no larger than the sum of the smaller files, and potentially much smaller.

  • grue@lemmy.world
    11 hours ago

    Taking less storage is almost the entire point of a zip file. It only takes more space than the original files in pathological cases (e.g. maybe if you’re trying to compress already-compressed data, like a video file).

  • over_clox@lemmy.world
    12 hours ago

    If the compressed files are of the same/similar format, more compression is possible as the algorithm can detect more related patterns to compress.

    But if you toss in a variety of file formats, compression will tend to suffer more.

    Sometimes the easiest way is just to try and see, since different formats lend themselves to better or worse compression.

    The files that tend to be worst at compression are the ones that are already compressed themselves.
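    For the “try and see” route, here is a small sketch that zips a list of files and reports both totals. The helper name `compare_sizes` and the throwaway demo files are made up for illustration; swap in the paths of your own pages.

```python
import os
import tempfile
import zipfile

def compare_sizes(paths, archive_path):
    """Zip the given files and return (total original bytes, archive bytes)."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            archive.write(path, arcname=os.path.basename(path))
    original = sum(os.path.getsize(path) for path in paths)
    return original, os.path.getsize(archive_path)

# Demo with throwaway text files; replace these with your own page files.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for i in range(3):
        path = os.path.join(folder, f"page{i}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("the quick brown fox jumps over the lazy dog\n" * 500)
        paths.append(path)
    original, zipped = compare_sizes(paths, os.path.join(folder, "book.zip"))
    print(f"original: {original} bytes, zipped: {zipped} bytes")
```

    Running the same helper on jpg scans would be expected to show little or no saving, in line with the last point above.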

  • Scott@lem.free.as
    12 hours ago

    I’m not quite sure what you’re asking.

    ZIP, by default, is a compression tool. It takes multiple files, creates an index of the files within and then compresses them. Standard ZIP compresses each file on its own; “solid” formats such as 7z or tar.gz compress all the files combined, which allows for a better dictionary. The index and the per-file headers are “overhead” that exists for each ZIP file.

    Sending multiple files, uncompressed, or sending multiple ZIP files (one for each file) will almost certainly be less efficient.
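    One caveat worth checking: the ZIP format compresses each file in the archive separately, so it cannot exploit similarities between pages, whereas a single compressed stream over all the pages can. A minimal sketch of the difference, assuming ten identical 4 KiB “pages” of random bytes (made-up data, chosen so that the only possible saving comes from repetition across pages):

```python
import io
import os
import zipfile
import zlib

# One 4 KiB block of random bytes, reused as ten identical "pages".
# Random data does not compress on its own, so any saving has to come
# from noticing that the pages repeat each other.
page = os.urandom(4096)
pages = [page] * 10
raw_size = sum(len(data) for data in pages)

# ZIP: every entry is compressed independently of the others.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for i, data in enumerate(pages):
        archive.writestr(f"page{i}.bin", data)
zip_size = len(buffer.getvalue())

# One compressed stream over all pages joined together, roughly what
# "solid" approaches such as tar.gz do.
solid_size = len(zlib.compress(b"".join(pages), 9))

print(f"raw: {raw_size}, zip: {zip_size}, single stream: {solid_size}")
```

    The zip should end up about as large as the raw pages, while the single stream shrinks to roughly the size of one page. The pages are kept small on purpose: DEFLATE only looks back 32 KiB, so with large scans even a single stream would not find the earlier copy.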

    • Meow-Misfit@lemmy.worldOP
      11 hours ago

      Example: for Book A, I have all 10 pages, each as a jpg.

      Let’s say the size of all these 10 pages together is 300MB (not tech savvy, don’t know if this is realistic).

      If I put them in a zip, will the size be smaller? Like, reduced to 250MB or something?

      • slazer2au@lemmy.world
        10 hours ago

        For images it may help a little, but images are already compressed, so there may not be a large saving in zipping them.

        An alternative would be to use a more storage-efficient format, like webp.