• 3 Posts
  • 96 Comments
Joined 6 months ago
Cake day: March 22nd, 2024

  • “Well, one lesson I’ve learned is that just because I say something to a group and they laugh doesn’t mean it’s going to be all that hilarious as a post on X,” he said in a follow-up post early Monday. “Turns out that jokes are WAY less funny if people don’t know the context and the delivery is plain text.”

    I’ve known people like this in real life, who’d say something horrible and follow it up with “It’s just a joke,” but only after they ‘lose’ and get called out on it.

    They’re slimy jerks, and it’s utterly miserable to even be around them. I don’t understand why so many people worship/follow Elon and stay on Twitter for it.

  • The problem is that splitting models up over a network, even over a LAN, is not very efficient. The entire set of weights has to be run through for every token (roughly half a word), and the activations have to hop between hosts each time, so autoregressive decoding pays the network cost over and over (rough numbers sketched at the end of this comment).

    The other problem is that Petals just can’t keep up with the crazy dev pace of the LLM community. Honestly, they should dump it and fork or contribute to llama.cpp or exllama, because nobody wants to split up Llama 2 (or even Llama 3) 70B and end up a generation or two behind, running a base instruct model instead of a finetune.

    Even the Horde has very few hosts relative to users, even though hosting a small model on a 6 GB GPU would earn you lots of karma.

    The diffusion community is very different: the output is a single image, and even the largest open models are much smaller. LoRA usage is also standardized there, while it isn’t in LLM land.
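
    To put rough numbers on that first point: autoregressive decoding is sequential, so every token has to traverse every shard in order, paying one network round trip per hop on top of the compute. Here is a minimal back-of-envelope sketch in Python; the host counts, the 50 ms of compute, and the RTT figures are all illustrative assumptions, not measurements from Petals or any real deployment:

```python
# Back-of-envelope model of pipeline-parallel LLM decoding over a network.
# Every number below is an illustrative assumption, not a measurement.

def tokens_per_second(n_hosts: int, compute_ms: float, rtt_ms: float) -> float:
    """Tokens/s when each token must traverse all shards in order.

    compute_ms: total GPU time per token, summed across all shards.
    rtt_ms:     round-trip latency between neighbouring hosts.
    """
    # Decoding is sequential, so each token pays (n_hosts - 1) network hops.
    per_token_ms = compute_ms + (n_hosts - 1) * rtt_ms
    return 1000.0 / per_token_ms

# One box holding the whole model: no hops at all.
print(tokens_per_second(1, compute_ms=50, rtt_ms=0))    # 20.0 tok/s

# Same total compute split across 8 LAN hosts (~1 ms RTT).
print(tokens_per_second(8, compute_ms=50, rtt_ms=1))    # ~17.5 tok/s

# Split across 8 internet hosts (~80 ms RTT), the Petals-style setting.
print(tokens_per_second(8, compute_ms=50, rtt_ms=80))   # ~1.6 tok/s
```

    Under these assumptions the hops are nearly free on a LAN, but at internet latencies the network term dwarfs the compute, which is the overhead a single machine running llama.cpp or exllama never pays.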