Epstein Files Jan 30, 2026 Release - Archived from Justice.gov

xodoh74984@lemmy.world · edit-2 5 months ago

Nomad64@lemmy.world · 4 months ago

A consolidated (and structured) torrent file has been released: https://github.com/yung-megafone/Epstein-Files/issues/1#issuecomment-3860836655

Currently clearing data from my seedbox to get this added.

ZaInT@lemmy.world · 4 months ago

In it with 4 nodes still :)

Wild_Cow_5769@lemmy.world · 4 months ago

https://archive.org/details/ds-9-efta-gap-repair

Repaired gaps in the partial Dataset 9:

EFTA00593870.pdf
EFTA00595160.pdf
EFTA00595410.pdf
EFTA00595694.pdf
EFTA00595820.pdf
EFTA00597207.pdf
EFTA00605675.pdf
EFTA00645624.pdf
EFTA00774768.pdf
EFTA01175426.pdf
EFTA01220934.pdf

camb143@lemmy.ca · edit-2 4 months ago

So it turns out there’s a pile of videos if you rename files from pdf to mp4. There are some video torrents for anyone who wants to back them up (datasets 8-12).

Reddit post: https://www.reddit.com/r/Epstein/comments/1qx81dj/type_this_in_and_change_pdf_to_mp4

Fediverse post: https://lemmy.world/post/42756746

DigitalForensick@lemmy.world · 4 months ago

I can work on a script that crawls/detects which of these files are like this. I looked at the hexadecimal data of one of the examples from the Reddit thread and there might be some indicators. I wasn’t able to get a video to play, but just from this file I can see that they created/edited these PDFs using something called reportlab.com, edited on 12/23/2025. There’s also some text referring to Kids? “Count 1/ Kids [ 3 0 R ]” This is odd.

File EFTS0024813.pdf/mp4? (Dataset 8, folder 0005)
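A rough sketch of the kind of detection script mentioned above, assuming the goal is simply to flag files whose extension is .pdf but whose leading bytes say otherwise. The function names and the `root` directory argument are made up for illustration. The signatures themselves are standard: PDFs begin with `%PDF-`, MP4-family files have `ftyp` at byte offset 4, AVI is a RIFF container, and .xlsx/.docx are zip archives. Some valid files break these rules (older .mov files without an `ftyp` box, PDFs with junk before the header), so treat hits as leads to check by hand.

```python
from pathlib import Path


def sniff_container(header: bytes) -> str:
    """Guess a file's real type from its first bytes, ignoring the extension."""
    if header.startswith(b"%PDF-"):
        return "pdf"
    # ISO base media files (mp4, m4v, m4a, most mov) carry 'ftyp' at offset 4.
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4-family"
    # AVI is a RIFF container whose form type is 'AVI '.
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "avi"
    # Zip-based formats such as .xlsx and .docx start with 'PK\x03\x04'.
    if header.startswith(b"PK\x03\x04"):
        return "zip-based"
    return "unknown"


def find_mislabeled(root: str) -> list[tuple[str, str]]:
    """Return (path, detected type) for every *.pdf under root that is not a PDF."""
    hits = []
    for path in sorted(Path(root).rglob("*.pdf")):
        with path.open("rb") as fh:
            kind = sniff_container(fh.read(16))
        if kind != "pdf":
            hits.append((str(path), kind))
    return hits
```

Calling `find_mislabeled("DataSet 8")` on a local copy would list every .pdf that does not start with a PDF header, along with the guessed container type.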

camb143@lemmy.ca · 4 months ago

Fediverse post I linked has a similar script already made

StinkyFred@lemmy.world · edit-2 4 months ago

I assume you mean EFTA00024813?

If so, while it doesn’t show on the website, it is included in dataset 8 if you just download the complete set.

It is a .xlsx file.

It’s in DataSet 8\VOL00008\NATIVES\0001

This is the case for most of these files, dataset 9 might be missing a few though.

internauta@lemmy.world · 4 months ago

Someone on Reddit (u/FuckThisSite3) posted a more complete Dataset 9:

I assembled a tar file with all I got from dataset-9.

Magnet link: magnet:?xt=urn:btih:5b50564ee995a54009fec387c97f9465eb18ba00&dn=dataset-9_by_fuckthissite3.tar&xl=148072017920

SHA256: 5adc043bcf94304024d718e57267c1aa009d782835f6adbe6ad7fdbb763f15c5

The tar contains 254,477 files which is 148072017920 bytes (148.07GB/137.9GiB)
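For anyone who wants to check a finished download against the numbers quoted above, here is a minimal sketch using only the Python standard library. The constants are copied from the quoted post; the helper names and the local file path in the usage note are assumptions.

```python
import hashlib

# Values quoted in the post above.
EXPECTED_SHA256 = "5adc043bcf94304024d718e57267c1aa009d782835f6adbe6ad7fdbb763f15c5"
EXPECTED_BYTES = 148072017920


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a 148 GB tar never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_size(num_bytes: int) -> str:
    """Format a byte count in decimal gigabytes and binary gibibytes."""
    return f"{num_bytes / 10**9:.2f} GB / {num_bytes / 2**30:.2f} GiB"


print(describe_size(EXPECTED_BYTES))  # 148.07 GB / 137.90 GiB
```

With the tar saved locally (the path is whatever your client used), `sha256_of_file(path) == EXPECTED_SHA256` should be `True` for an intact copy.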

locke1@lemmy.world · 4 months ago

While it’s bigger in size, this one seems to be missing a ton of files? I grabbed the earlier collections and my file count is at 531,256. I’ll have to compare when I finish downloading.

ZaInT@lemmy.world · edit-2 4 months ago

Seeding 1 node, 3 on the way EDIT: 3 running, 4th one planned to be temporary but should soon be up

Nomad64@lemmy.world · 5 months ago

I don’t see this posted here yet. Below is the Github repo link for an index of torrent files, DOJ links, and mirrors, for every dataset. https://github.com/yung-megafone/Epstein-Files

ZaInT@lemmy.world · 5 months ago

I am seeding 8, 10, 11, 12 (the ones only available as .zip on justice.gov) for the foreseeable future (as well as the partials of set 9). I’m looking “everywhere” hoping for some success on part 9 and will be pushing that one until bandwidth dies or until a dozen or so seeders are on, whenever the complete bundle is assembled. Hoping for some good news soon; things seem to be nuked very rapidly now.

I also read that the court documents and one other page were taken down. I have those files, but they are not sorted by page, just thrown in a bulk download directory, as I had a feeling this would happen and I wanted to pull them quickly. In case there’s any use for them, I put them on Mega and Gofile a few days ago and they’ve not been taken down so far:

https://gofile.io/d/dff931d5-a646-46f1-b34e-079798f508a2 https://mega.nz/folder/XVMCgLLR#EKVS8Sfiry-VtVAxZ7q_Ig

It’s most likely files that “everyone” already has, but better one mirror too many than one too few.

ZaInT@lemmy.world · edit-2 5 months ago

Also seeding (but probably not for very long unless seeders start dropping; they are all at 300-1200 ATM):

59975667f8bdd5baf9945b0e2db8a57d52d32957
0a3d4b84a77bd982c9c2761f40944402b94f9c64
7ac8f771678d19c75a26ea6c14e7d4c003fbf9b6
c3a522d6810ee717a2c7e2ef705163e297d34b72
d509cc4ca1a415a9ba3b6cb920f67c44aed7fe1f
e618654607f2c34a41c88458bf2fcdfa86a52174
acb9cb1741502c7dc09460e4fb7b44eac8022906

Trying to pull c100b1b7c4b1e662dd8adc79ae3e42eef6080aee (redundant limited dataset for that GitHub relations chart)

Pulling f5cbe5026b1f86617c520d0a9cd610d6254cbe85 (just listed on the GitHub repo that lists the same magnets as here - will probably become 2nd seeder in an hour or two and will stay seeding on that one for at least a week or until the swarm looks healthy by the dozens or so.)

Will continue to monitor whatever progress is being made here. I should also have a small subset of DS9 but it will likely only be the first 200 files or so at most. Needless to say I will compare against the existing torrents just in case.

Thanks everyone for your hard work, this is exactly why I started hoarding :)

EDIT: The last magnet ID I listed is the summarized torrent from the repo linked by Nomad64.

Nomad64@lemmy.world · 5 months ago

Thanks for the links. I downloaded the docs and will add them to the pile.

Dhoard@lemmy.world · edit-2 5 months ago

Theoretically speaking, if a website has the archives, what is stopping people from downloading each file on a page-by-page basis from the archive?

Edit: Never mind, I saw a full list of URLs that the archive managed to save and it is missing a lot.

DigitalForensick@lemmy.world · 5 months ago

Nothing, but even the archived pages aren’t 100% because some of the files were “faked” in the paginated file lists on the DOJ site. It does work well enough though. I did this to recover all the court records and FOIA files.

DigitalForensick@lemmy.world · 5 months ago

For anyone looking into doing some OSINT work, this is an epic file EFTA00809187

It contains lists of ALL known JE emails, usernames, websites, social media accounts, etc. from that time.

DigitalForensick@lemmy.world · 4 months ago

nah, i didn’t hear anything back

Dhoard@lemmy.world · 5 months ago

EFTA00809187 Did that guy from pastebin with the complete file DS9file ever answer you?

ArzymKoteyko@lemmy.world · 5 months ago

Hi everyone, maybe I’m a bit late to this, but I wanted to share my findings. I parsed every page up to 40k in DS9 three times, and the results matched PeoplesElbow’s findings by distribution (no content after page 14k and a lot of duplications), BUT I parsed 4 times more unique URLs: 246,079 (still 2x short of the official size). A strange thing is that on the second pass (one day after the first one) I started receiving new URLs on old pages.

Here is stat by file type:

  count | file type
--------+-----------
      1 | ts
      8 | mov
    236 | mp4
 244326 | pdf
     73 | m4a
      1 | vob
      1 | docx
      1 | doc
      9 | m4v
   1422 | avi
      1 | wmv
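A breakdown like the one above can be reproduced from any plain list of URLs. This is only a sketch, not the script actually used: it assumes one URL per line and the example.org addresses in the usage note are placeholders. The last lines re-add the table's counts, which do sum to the 246,079 unique URLs reported.

```python
from collections import Counter
from posixpath import splitext
from urllib.parse import urlsplit


def count_extensions(urls) -> Counter:
    """Count unique URLs by lowercase file extension taken from the URL path."""
    unique = {u.strip() for u in urls if u.strip()}
    counts = Counter()
    for url in unique:
        ext = splitext(urlsplit(url).path)[1].lower().lstrip(".")
        counts[ext or "(none)"] += 1
    return counts


# Counts copied from the table above, used only to check the total.
TABLE = {"ts": 1, "mov": 8, "mp4": 236, "pdf": 244326, "m4a": 73, "vob": 1,
         "docx": 1, "doc": 1, "m4v": 9, "avi": 1422, "wmv": 1}
print(sum(TABLE.values()))  # 246079
```

For example, `count_extensions(["https://example.org/a/X1.pdf", "https://example.org/a/X1.pdf", "https://example.org/a/X2.MP4"])` counts one pdf and one mp4, since duplicates are dropped before counting.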

DigitalForensick@lemmy.world · 5 months ago

Nice work man! I also discovered something yesterday that I think is worth pointing out.

DUPLICATE FILES: Within the datasets, there are often emails, doc scans, etc. that are duplicate entries. (I’m not talking about multi-torrent stitching, but actual duplicate documents within the raw dataset.) **These duplicates must be preserved.** When looking at two copies of the same duplicate file, I found that sometimes the redactions are in different places! This can be used to extract more info later down the road.

Wild_Cow_5769@lemmy.world · 5 months ago

Can you make a torrent of the new files if you find any?

ArzymKoteyko@lemmy.world · 5 months ago

Finally got my hands on the original DS9 OPT file and I have started downloading files from it. Don’t know how long it will take. I also made a git repo with stats and index files from the DOJ website and the OPT from the archive: https://github.com/ArzymKoteyko/JEDatasets In short, the only difference is that I got an additional 1,753 links to video files and a strange .docx file with a size of 0 bytes [EFTA00335487.docx].

Xenom0rph@lemmy.world · 5 months ago

I’m still seeding the partial Dataset 9 (45.63GB and 89.54GB) and all the other datasets. Is there a newer dataset 9 available?

o_derr889@lemmy.world · 5 months ago

Here is the download link for a text file that has all the original URLs: https://wormhole.app/PpjJ3P#SFfAOKm1bnCyi-h2YroRyA The link will only last for 24 hours.

acelee1012@lemmy.world · 5 months ago

I have never made a torrent file before, so feel free to correct me if it doesn’t work. Here is the magnet link for this as a torrent file so it’s up for more than an hour:

magnet:?xt=urn:btih:694535d1e3879e899a53647769f1975276723db7&xt=urn:btmh:12207cf818f0f0110ca5e44614f2c65e016eca2fe7bc569810f9fb25e80ff608fc9b&dn=DOJ%20Epstein%20file%20urls.txt&xl=81991719&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
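If you want to sanity-check a magnet link like this before adding it, the fields can be pulled apart with the standard library. A small sketch: `parse_magnet` is a made-up helper name, and it only reads the common parameters (`xt`, `dn`, `xl`, `tr`).

```python
from urllib.parse import parse_qs


def parse_magnet(uri: str) -> dict:
    """Split a magnet URI into info hashes, display name, size, and trackers."""
    scheme, _, query = uri.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet URI")
    params = parse_qs(query)  # also percent-decodes values such as dn and tr
    info = {"btih": None, "btmh": None}
    for xt in params.get("xt", []):
        if xt.startswith("urn:btih:"):    # v1 info hash
            info["btih"] = xt[len("urn:btih:"):].lower()
        elif xt.startswith("urn:btmh:"):  # v2 multihash
            info["btmh"] = xt[len("urn:btmh:"):].lower()
    size = params.get("xl", [None])[0]
    info["name"] = params.get("dn", [None])[0]
    info["size_bytes"] = int(size) if size is not None else None
    info["trackers"] = params.get("tr", [])
    return info
```

Run on the link above, it reports the name `DOJ Epstein file urls.txt`, a size of 81991719 bytes, and one tracker, which fits a text file of URLs and not a dataset.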

DigitalForensick@lemmy.world · 5 months ago

What does this contain? anything new?

o_derr889@lemmy.world · 5 months ago

It’s the URLs of the original dataset 9. It was posted on the original Reddit post.

Dhoard@lemmy.world · 5 months ago

please post again. thank you.

Wild_Cow_5769@lemmy.world · 5 months ago

It’s a file list, but not the actual files though.

acelee1012@lemmy.world · 5 months ago

Has anyone made a Dataset 9 and 10 torrent file without the files in it that the NYT reported as potentially CSAM?

locke1@lemmy.world · edit-2 5 months ago

I don’t think anyone knows for sure what files those are. It would’ve been helpful if NYT published the file names. But maybe NYT isn’t sure themselves as they wrote some of the images are “possibly” of teenagers.

To be on the safe side, I guess you could just remove all nude images from the dataset. It is a ton of images to go through though, hundreds of thousands.

BWint@lemmy.world · 5 months ago

The BBC is now reporting that “thousands” of documents have been removed because the DOJ improperly redacted information that can be used to identify the victims: https://www.bbc.com/news/articles/cn0k65pnxjxo

activeinvestigator@lemmy.world · 5 months ago

Do people here have the partial dataset 9? or are you all missing the entire set? There is a magnet link floating around for ~100GB of it, the one removed in the OP

I am trying to figure out exactly how many files dataset 9 is supposed to have in it. Before the zip file went dark, I was able to download about 2GB of it. This was today, so maybe not the original zip file from Jan 30th. At the head of the zip file is an index file, VOL00009.OPT; you don’t need the full download in order to read it. The index file says there are 531,307 PDFs. The 100GB torrent has 531,256, so it’s missing 51 PDFs. I checked the 51 file names and they no longer exist as individual files on the DOJ website either. I’m assuming these are the CSAM.

Note that the 3M number of released documents != 3M PDFs. Each PDF page is counted as a “document”. Dataset 9 contains 1,223,757 documents, and according to the index, we are missing only 51 documents; they are not multipage. In total, I have 2,731,789 documents from datasets 1-12, short of the 3M number. The index I got also was not missing document ranges.

it’s curious that the zip file had an extra 80GB when only 51 documents are missing. I’m currently scraping links from the DOJ webpage to double check the filenames
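For anyone who wants to reproduce the page and PDF totals from an index file, here is a hedged sketch. It assumes VOL00009.OPT follows the conventional Opticon load-file layout (comma-separated: page ID, volume, image path, document-break flag `Y`, folder break, box break, page count), which has not been confirmed against the DOJ file. The sample rows are invented for illustration.

```python
import csv


def summarize_opt(lines) -> dict:
    """Tally an Opticon-style index: rows, document breaks, and unique paths.

    Assumed column order (conventional Opticon, unverified for this dataset):
    page_id, volume, image_path, doc_break, folder_break, box_break, page_count
    """
    rows = 0
    doc_breaks = 0
    paths = set()
    for row in csv.reader(lines):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        rows += 1
        paths.add(row[2])
        if row[3].strip().upper() == "Y":
            doc_breaks += 1
    return {"rows": rows, "document_breaks": doc_breaks, "unique_paths": len(paths)}


# Hypothetical rows, not taken from the real index.
SAMPLE = [
    r"EFTA00000001,VOL00009,IMAGES\0001\EFTA00000001.pdf,Y,,,2",
    r"EFTA00000002,VOL00009,IMAGES\0001\EFTA00000001.pdf,,,,",
    r"EFTA00000003,VOL00009,IMAGES\0001\EFTA00000003.pdf,Y,,,1",
]
print(summarize_opt(SAMPLE))
```

On a real file you would pass an open file handle instead of `SAMPLE`. If the layout assumption holds, `rows` would correspond to the page count and `unique_paths` to the number of PDFs.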

Arthas@lemmy.world · edit-2 5 months ago

I used AI to analyze the ~36 GB that I was able to download before they erased the zip file from the server.

Complete Volume Analysis

  Based on the OPT metadata file, here's what VOL00009 was supposed to contain:

  Full Volume Specifications

  - Total Bates-numbered pages: 1,223,757 pages
  - Total unique PDF files: 531,307 individual PDFs
  - Bates number range: EFTA00039025 to EFTA01262781
  - Subdirectory structure: IMAGES\0001\ through IMAGES\0532\ (532 folders)
  - Expected size: ~180 GB (based on your download info)

  What You Actually Got

  - PDF files received: 90,982 files
  - Subdirectories: 91 folders (0001 through ~0091)
  - Current size: 37 GB
  - Percentage received: ~17% of the files (91 out of 532 folders)

  The Math

  Expected:  531,307 PDF files / 180 GB / 532 folders
  Received:   90,982 PDF files /  37 GB /  91 folders
  Missing:   440,325 PDF files / 143 GB / 441 folders

  ★ Insight ─────────────────────────────────────
  You got approximately the first 17% of the volume before the server deleted it. The good news is that the DAT/OPT index files are complete, so you have a full manifest of what should be there. This means:
  - You know exactly which documents are missing (folders 0092-0532)

I haven’t looked into downloading the partials from archive.org yet to see if I have any useful files that archive.org doesn’t have yet from dataset 9.
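The figures in the AI summary above can be re-checked with a couple of lines. This is just the arithmetic; the `coverage` helper is a made-up name and the inputs are the numbers quoted in that summary.

```python
def coverage(expected: int, received: int) -> tuple[int, float]:
    """Return (missing count, percent received) for an expected/received pair."""
    missing = expected - received
    return missing, 100 * received / expected


# Numbers quoted in the summary above.
pdf_missing, pdf_pct = coverage(531_307, 90_982)
folder_missing, folder_pct = coverage(532, 91)
print(pdf_missing, round(pdf_pct, 1))        # 440325 17.1
print(folder_missing, round(folder_pct, 1))  # 441 17.1
```

Both ratios land at about 17%, so the file count and the folder count agree with each other.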

GorillaCall@lemmy.world · 5 months ago

I have heard it’s 186 GB.

Wild_Cow_5769@lemmy.world · 5 months ago

thats pretty cool…

Can you send me a DM of the 51? If I come across one and it isn’t some sketchy porn I’ll let u know.

TavernerAqua@lemmy.world · edit-2 5 months ago

In regard to Dataset 9, it’s currently being shared on Dread (forum).

I have no idea if it’s legit or not, and Idc to find out after reading about what’s in it from NYT.

Wild_Cow_5769@lemmy.world · 5 months ago

Where… I don’t see it here https://dreadytognbh7m5nlmqsogzzlxjy75iuxkulewbhxcorupbqahact2yd.onion/

DigitalForensick@lemmy.world · edit-2 5 months ago

this dude on pastebin posted his filetree in his epstein ubuntu env - i have a high confidence in whatever lives in his DataSet9Complete.zip file haha

Wild_Cow_5769@lemmy.world · 5 months ago

No doubt. High confidence…. :)

Wild_Cow_5769@lemmy.world · 5 months ago

@wild_cow_5769:matrix.org If someone has a group working on finding the dataset.

There are billions of people on earth. Someone downloaded dataset 9 before the link was taken down. We just have to find them :)