This seems like an invalid test.
One of them collected posts from Hacker News and LinkedIn profiles and then linked them by using cross-platform references that appeared in user profiles. They then stripped all identifying references from the posts and ran a large language model on them.
If I post something on LinkedIn, and then post the same thing on Hacker News, of course an LLM could match my accounts up.
Am I missing something?
This seems like complete bullshit.
They didn’t show their work, citing potential misuse as the reason, which leads me to believe it’s fearmongering and that it isn’t going to work so well against the people who are actually trying (AKA have good OPSEC).
This is one of those examples of “It’s not possible to be private online so don’t bother trying” type propaganda posts.
It’s just a shit article, lazily posted because the headline further radicalizes people. Low-quality articles like this should be banned because they lower the IQ of everyone who reads them and misinform anyone who doesn’t.
I call BS. We can’t even get AI models to determine whether an AI wrote a text, but this is supposed to work via some magic statistics?
Do y’all not write differently when you’re trying to be discreet on Blind?
The results, especially the high numbers stated in the news article (68% recall, 90% accuracy), are overestimated, because their verification method (i.e., checking whether the LLM really found the right account) came from matching verified accounts against a test set of anonymous accounts whose real names they already knew. They knew the real names because those people had a public link to their LinkedIn in their “anonymous” profile (which was removed for the sake of testing whether the LLM could match the two accounts). That being said: a user who uses a pseudonym but publicly links their account to, say, a LinkedIn account doesn’t really care about anonymity and probably hands out many more ‘breadcrumbs’ to follow than a truly anonymous account would.
But I still think that even a fully anonymous account can be fingerprinted and matched with non-anonymous identities by an LLM, based on language, style, etc.
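As a toy illustration of the kind of signal that’s available even without an LLM: character n-gram frequencies are a classic stylometric fingerprint. This is a minimal sketch with made-up sample texts, not the paper’s actual method:

```python
from collections import Counter
import math

def trigram_profile(text):
    """Character-trigram frequency vector -- a crude stylometric fingerprint."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented samples: one (hypothetical) author on two platforms vs. a different author.
linkedin = "Thrilled to share that our team shipped the new billing pipeline ahead of schedule."
hn_same  = "Our team shipped a new billing pipeline recently; happy to answer questions about it."
hn_other = "i dunno, seems like vaporware tbh. wake me up when there is a demo"

p = trigram_profile(linkedin)
print(cosine(p, trigram_profile(hn_same)))   # higher: shared vocabulary and phrasing
print(cosine(p, trigram_profile(hn_other)))  # lower: different register entirely
```

Real stylometry (and whatever the LLM is doing internally) is far more sophisticated, but even this crude similarity score ranks the same-author pair above the different-author pair on these samples.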
Reminds me of an AI tool that could identify authorship of articles with surprisingly high accuracy, and then they peeked under the hood and realized it was just looking for the author byline at the top of the article that says “By John Doe,” where it completely failed if the article didn’t explicitly say who the author was.
I can’t believe this product, modeled after humans, would lie and cheat like humans

And it will falsely identify people at even greater scale, because it is an imprecise and buggy tool.
The ones doing the identifying will argue that they are right until they are blue in the face because that’s how we do things now. By lying harder until the ones you’re trying to convince believe you or give up because it’s not worth it to keep arguing. I’ve been accused of being other people or using AI to write, and those people argued or harassed me until I blocked them. I can probably expect more of this going forward.
Yeah, but if it falsely identifies the right people, is it really buggy?
How dare you claim that the hallucination engine hallucinates. The Billionaires have declared this heresy.
For those who don’t know, we’ve been living in a dystopia since the 2000s.
Is this the first step towards using local LLMs for anonymity? 🫠 Always rephrasing each sentence somewhat. Truly dystopian stuff
Have they tried doing this for Satoshi Nakamoto yet?
Hmmm interesting. I’ve never used AI to try and find out stuff about myself. Maybe I’ll try. Just curious.
That’s how they get you
Brazil has 200 million people; how would they find someone in Rio like me?
- Filter for Brazilians who are Pink Floyd fans
- Filter for Brazilians who can speak English
- Filter for Brazilians who are left/socialist and who are on alternative social media sites.
And so on
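The arithmetic of stacking even a few coarse filters is brutal. A back-of-envelope sketch (all the fractions below are invented for illustration, and the filters are naively treated as independent):

```python
# Back-of-envelope: how a few coarse filters shrink a 200M population.
# All fractions are invented for illustration, not real statistics.
population = 200_000_000

filters = {
    "lives in Rio":               0.03,  # assume ~3% of Brazilians
    "fluent in English":          0.05,
    "Pink Floyd fan":             0.05,
    "active on alt social media": 0.01,
}

remaining = float(population)
for name, fraction in filters.items():
    remaining *= fraction  # naive independence assumption
    print(f"{name:28s} -> ~{remaining:,.0f} candidates")
```

Four coarse attributes already cut 200 million down to a candidate pool in the hundreds; at that point a writing-style match over the survivors does the rest.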
Fuuuu
The bright side - they can also be used to mask pseudonymous users. Guess how.
Yes, but don’t use a public service for this. Use a local LLM and maintain distinct profiles, one for each online account.
Yeah. I got a hunch of that a while ago, while trying some “old” scenarios of de-anonymization we used to do by hand. Just asking questions and posting pictures got surprisingly accurate results. A single picture with (to me) no significant landmark could lead to localizing a specific part of a city, and that was using a local LLM with a relatively small model, running on a 16GB VRAM 4060Ti.
It is now time to remember fondly the time where the younger people were warned by older people to not post all their stuff online, not over-share, be cautious about strangers, etc. I’m not sure when we lost that, but oh boy, it’s a festival.
I remember when it was considered outrageous that Flash would phone home and report its version, because that would leak the fact that a given machine was running a given version of Flash.
We sure don’t live in that world today.
I theorized about this a long time ago. Pretty sure I’m basically fucked.