This seems like an invalid test.
One of them collected posts from Hacker News and LinkedIn profiles and then linked them by using cross-platform references that appeared in user profiles. They then stripped all identifying references from the posts and ran a large language model on them.
If I post something on LinkedIn, and then post the same thing on Hacker News, of course an LLM could match my accounts up.
Am I missing something?
This seems like complete bullshit.
They didn’t show their work, citing potential misuse as the reason, which leads me to believe it’s fearmongering and that it isn’t going to work so well against the people who are actually trying (AKA have good OPSEC).
This is one of those examples of “It’s not possible to be private online so don’t bother trying” type propaganda posts.
It’s just a shit article, lazily posted because the headline further radicalizes people. Low-quality articles like this should be banned because they lower the IQ of everyone who reads them and misinform anyone who doesn’t.
I call BS. We can’t even get AI models to determine whether an AI wrote a text, but this is supposed to work via some magic statistics?
Do y’all not write differently when you’re trying to be discreet on Blind?
The results, especially the high numbers stated in the news article (68% recall, 90% accuracy), are overestimated, because their verification method (i.e., checking whether the LLM really found the right account) came from matching verified accounts against a test set of anonymous accounts whose real names they already knew. They knew the real names because those people had a public link to their LinkedIn in their “anonymous” profile (which was removed for the sake of testing whether the LLM could match the two accounts). That being said: a user who uses a pseudonym but publicly links their account to, say, a LinkedIn account doesn’t really care about anonymity and probably hands out many more ‘breadcrumbs’ to follow than a truly anonymous account would.
But I still think that even a fully anonymous account can be fingerprinted and matched with non-anonymous identities by an LLM, based on language, style, etc.
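As a toy illustration of the kind of signal that’s available even without an LLM: character n-gram frequencies are a classic stylometric fingerprint. This is a minimal sketch with made-up sample texts, not the paper’s actual method:

```python
from collections import Counter
import math

def trigram_profile(text):
    """Character-trigram frequency vector -- a crude stylometric fingerprint."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented samples: one (hypothetical) author on two platforms vs. a different author.
linkedin = "Thrilled to share that our team shipped the new billing pipeline ahead of schedule."
hn_same  = "Our team shipped a new billing pipeline recently; happy to answer questions about it."
hn_other = "i dunno, seems like vaporware tbh. wake me up when there is a demo"

p = trigram_profile(linkedin)
print(cosine(p, trigram_profile(hn_same)))   # higher: shared vocabulary and phrasing
print(cosine(p, trigram_profile(hn_other)))  # lower: different register entirely
```

Real stylometry (and whatever the LLM is doing internally) is far more sophisticated, but even this crude similarity score ranks the same-author pair above the different-author pair on these samples.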
Reminds me of an AI tool that could identify authorship of articles with surprisingly high accuracy, and then they peeked under the hood and realized it was just looking for the author byline at the top of the article that says “By John Doe,” where it completely failed if the article didn’t explicitly say who the author was.
I can’t believe this product, modeled after humans, would lie and cheat like humans

And it will falsely identify people at even greater scale, because it is an imprecise and buggy tool.
The ones doing the identifying will argue that they are right until they are blue in the face because that’s how we do things now. By lying harder until the ones you’re trying to convince believe you or give up because it’s not worth it to keep arguing. I’ve been accused of being other people or using AI to write, and those people argued or harassed me until I blocked them. I can probably expect more of this going forward.
Yeah, but if it falsely identifies the right people, is it really buggy?
How dare you claim that the hallucination engine hallucinates. The Billionaires have declared this heresy.
For those who don’t know, we’ve been living in a dystopia since the 2000s.
Is this the first step towards using local LLMs for anonymity? 🫠 Always rephrasing each sentence somewhat. Truly dystopian stuff
Have they tried doing this for Satoshi Nakamoto yet?
Hmmm interesting. I’ve never used AI to try and find out stuff about myself. Maybe I’ll try. Just curious.
That’s how they get you
Brazil has 200 million people; how would they find someone in Rio like me?
- Filter for Brazilians who are Pink Floyd fans
- Filter for Brazilians who can speak English
- Filter for Brazilians who are left/socialist and who are on alternative social media sites.
And so on
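The arithmetic of stacking even a few coarse filters is brutal. A back-of-envelope sketch (all the fractions below are invented for illustration, and the filters are naively treated as independent):

```python
# Back-of-envelope: how a few coarse filters shrink a 200M population.
# All fractions are invented for illustration, not real statistics.
population = 200_000_000

filters = {
    "lives in Rio":               0.03,  # assume ~3% of Brazilians
    "fluent in English":          0.05,
    "Pink Floyd fan":             0.05,
    "active on alt social media": 0.01,
}

remaining = float(population)
for name, fraction in filters.items():
    remaining *= fraction  # naive independence assumption
    print(f"{name:28s} -> ~{remaining:,.0f} candidates")
```

Four coarse attributes already cut 200 million down to a candidate pool in the hundreds; at that point a writing-style match over the survivors does the rest.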
Fuuuu
The bright side - they can also be used to mask pseudonymous users. Guess how.
Yes, but don’t use a public service for this. Use a local LLM and maintain distinct profiles, one for each online account.
Yeah. I got a hunch of that a while ago, while trying some “old” scenarios of de-anonymization we used to do by hand. Just asking questions and posting pictures got surprisingly accurate results. A single picture with (to me) no significant landmark could lead to localizing a specific part of a city, and that was using a local LLM with a relatively small model, running on a 16GB VRAM 4060Ti.
It is now time to remember fondly the time where the younger people were warned by older people to not post all their stuff online, not over-share, be cautious about strangers, etc. I’m not sure when we lost that, but oh boy, it’s a festival.
I remember when it was considered outrageous that Flash would phone home and report its version, because that would leak the fact that a given machine was running a given version of Flash.
We sure don’t live in that world today.
I theorized about this a long time ago. Pretty sure I’m basically fucked.