• 4 Posts
  • 46 Comments
Joined 10 months ago
Cake day: January 25th, 2024











  • Not just semantics. PDFs don’t even have segmentation such as spaces, lines, or paragraphs. A PDF is just text drawn at whatever locations the word processor (or any other software) placed it. Many PDF editors simply detect how close characters are to each other and group them together.

    One step further, you can convert text to paths, which drops the glyph (character) and font info entirely; every character becomes just a geometric shape. In that case you can’t even copy the text, and OCR is your only choice.

    PDF is for finalizing something and printing or sharing it, not for editing it.
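
    To make the "group characters by closeness" idea concrete, here is a rough sketch of what such a heuristic might look like. The `Glyph` class, the `group_into_words` function, and both tolerance values are made up for illustration; real PDF tools use their own (more elaborate) rules.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One drawn character: hypothetical stand-in for what a PDF parser yields."""
    char: str
    x: float      # left edge
    y: float      # baseline (PDF coordinates: y grows upward)
    width: float


def group_into_words(glyphs, gap_tolerance=1.5, line_tolerance=2.0):
    """Guess lines and words purely from glyph positions.

    Glyphs whose baselines are within `line_tolerance` share a line;
    within a line, a horizontal gap larger than `gap_tolerance` starts
    a new word. Both thresholds are arbitrary illustrative values.
    """
    # Top of the page first (largest y), then left to right.
    ordered = sorted(glyphs, key=lambda g: (-g.y, g.x))

    lines = []
    for g in ordered:
        if lines and abs(lines[-1][-1].y - g.y) <= line_tolerance:
            lines[-1].append(g)
        else:
            lines.append([g])

    result = []
    for line in lines:
        line.sort(key=lambda g: g.x)
        words = [line[0].char]
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            if gap > gap_tolerance:
                words.append(cur.char)   # big gap: treat as a word break
            else:
                words[-1] += cur.char    # close enough: same word
        result.append(words)
    return result
```

    Note that nothing here reads a space character; the word breaks are inferred from geometry alone, which is why copy-paste from some PDFs comes out with missing or extra spaces.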









  • There are basically two types of files: text files and binary files.

    Most information is stored in text files so humans can easily understand it, and so that it’s easier to find errors, review, and parse. But text storage takes more space than binary storage. Complicated software often needs multiple text or data files, and many programs just bundle them together as a zip file so they’re easier to handle. Examples are the .docx, .pptx, etc. files in MS Office; try unzipping them and see what they contain. Zipping also has the advantage of reducing file size.
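
    If you want to poke at this without Office installed, here is a small sketch using Python’s standard `zipfile` module. `build_sample_archive` and `describe_archive` are names I made up, and the archive it builds is a toy that only mimics the layout of a .docx (it is not a valid Word document). To inspect a real file, pass its bytes to `describe_archive` instead.

```python
import io
import zipfile


def build_sample_archive() -> bytes:
    """Build a toy zip in memory that mimics the layout of a .docx.

    Illustration only: the XML here is placeholder text, not a valid
    Word document.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        body = "<document>" + "<p>hello</p>" * 200 + "</document>"
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


def describe_archive(data: bytes):
    """Return (name, original size, compressed size) for each member."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in zf.infolist()
        ]


if __name__ == "__main__":
    for name, size, packed in describe_archive(build_sample_archive()):
        print(f"{name}: {size} bytes, {packed} bytes compressed")
```

    The repetitive XML shrinks a lot once compressed, which is the file-size advantage mentioned above.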


  • That depends on what video player you use. If we have control of that, then sure, it works. I use mpv to play things, so for radio streams or live videos I can go back/forward as long as it’s cached.

    But with a web service, even though the browser video player has something cached, the player is still controlled by the website. And considering most people use Chrome/Chromium derivatives or the YouTube app, it wouldn’t be hard for them to make the player itself cooperate with whatever they want to do.

    If YouTube were a separate organization, it wouldn’t be the problem it is. It’s a problem because of how Google has been taking over all the different parts it needs for advertising.