cross-posted from: https://sopuli.xyz/post/30596655

Personally identifiable information has been found in DataComp CommonPool, one of the largest open-source datasets used to train image-generation models.