Oh, and you HAVE to try the new Qwen 2.5 14B.
The whole lineup is freaking sick; the 32B outscores Llama 3.1 70B in a lot of benchmarks, and in personal use it feels super smart.
You can try a smaller IQ3 imatrix quantization to speed it up, but 22B is indeed tight for 8GB.
If someone puts out an AQLM quant for it, it might fit completely in VRAM, but I’m not sure AQLM would even work on a Pascal card TBH.
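If it helps, here’s roughly how I’d load an IQ3 imatrix GGUF fully offloaded using llama-cpp-python; the filename below is just a placeholder, point it at whichever quant you actually download:

```python
# Minimal sketch with llama-cpp-python; the model filename is a placeholder,
# swap in whatever IQ3 imatrix GGUF you actually grab.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-14B-Instruct-IQ3_M.gguf",  # hypothetical filename
    n_gpu_layers=-1,   # offload every layer to the GPU
    n_ctx=8192,        # longish context; shrink this if VRAM runs out
    flash_attn=True,   # helps squeeze the KV cache onto an 8GB card
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of weight quantization?"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```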
I’m sorry, but Democratic leadership, the Washington Post, and the New York Times are not “right wing.” The last Republican presidential candidate the NYT endorsed was Dwight D. Eisenhower in 1956.
They’re right wing to you and Lemmy, but Lemmy is not the center of America’s political compass. And I’m speaking as a rabid DJT hater who votes straight-ticket Democrat, bar one primary where I registered Republican just so I could vote against DJT.
Especially if you’re mega rich.
“Well, one lesson I’ve learned is that just because I say something to a group and they laugh doesn’t mean it’s going to be all that hilarious as a post on X,” he said in a follow-up post early Monday. “Turns out that jokes are WAY less funny if people don’t know the context and the delivery is plain text.”
I knew people like this in real life, who’d say something horrible and follow it up with “It’s just a joke,” but only if they ‘lose’ and are called out on it.
They’re slimy jerks, and it’s utterly miserable to even be around them. And I don’t understand why so many would worship/follow Elon and dwell on Twitter for it.
It says center and center-right outlets have a left bias.
Which outlets, specifically?
She endorsed Biden before. It wasn’t really a surprise.
Is this why everyone is downvoting the fact checker? Because they don’t like it saying their preferred outlets have a bias?
The Guardian does have a left bias, it’s pretty obvious. That’s not a bad thing.
It’s still everywhere in my news/internet diet.
It’s bleeding, for sure, but it’s big. It’s gone bad. But I think it’s premature to say its collapse is a good thing, because it just won’t go away.
It’s not dead though, it’s still linked to everywhere, from big news to niche communities because it still has that critical mass and inertia.
And I have to be cynical about the Fediverse, but realistically, what replaces it, at least here in the US? Discord? No, thanks, I’d at least rather have information be public.
I’m speaking as someone who has never used Twitter, but I can’t ignore it, as much as I’d like to.
This is 2024.
No attention is bad attention. Literally all that matters is staying in people’s phone feeds, in front of their eyeballs. So yes… this is helping Trump, since the opportunity cost of that attention is literally anything but Trump.
That actually is weird.
The problem is that splitting models up over a network, even over LAN, is not super efficient. The entire set of weights has to be run through for every token, roughly every half word.
And the other problem is that Petals just can’t keep up with the crazy dev pace of the LLM community. Honestly they should drop it and fork or contribute to llama.cpp or exllama, as TBH no one wants to split up Llama 2 (or even Llama 3) 70B and sit a generation or two behind, on a base instruct model instead of a finetune.
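Rough napkin math on why the split hurts (all numbers below are assumptions, not measurements): the activations that cross each split are tiny, so bandwidth isn’t the problem; it’s the round trip you pay on every hop, for every single token.

```python
# Rough per-token overhead estimate for pipeline-splitting a 70B-class model.
# Every number here is an assumption for illustration, not a measurement.
hidden_size = 8192        # Llama-70B-class hidden dimension
bytes_per_act = 2         # fp16 activations
hops = 4                  # network crossings per token (roughly, machines in the split)
rtt_lan_s = 0.0005        # ~0.5 ms round trip on a decent LAN
rtt_wan_s = 0.05          # ~50 ms round trip over the internet (Petals-style)
bandwidth_Bps = 125e6     # ~1 Gbit/s link

payload = hidden_size * bytes_per_act        # ~16 KiB crossing each split per token
transfer_s = payload / bandwidth_Bps         # transfer time is negligible

print(f"payload per hop: {payload / 1024:.0f} KiB")
print(f"added latency per token, LAN: {hops * (rtt_lan_s + transfer_s) * 1000:.1f} ms")
print(f"added latency per token, WAN: {hops * (rtt_wan_s + transfer_s) * 1000:.1f} ms")
```

Over a LAN that’s a few milliseconds per token, annoying but workable; over the internet it’s hundreds of milliseconds per token before you’ve done any actual compute.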
Even the horde has very few hosts relative to users, even though hosting a small model on a 6GB GPU would get you lots of karma.
The diffusion community is very different, as the output is one image and even the largest open models are much smaller. LoRA usage is also standardized there, while it is not in LLM land.
TBH this is a great space for modding and local LLMs / LLM “hordes.”
^
Futurama had it right, spammers are the ultimate destroyers.
Please ask him, tape it, and don’t let the campaign managers talk him out of it.
+1
Never attribute to malice that which is adequately explained by wanting to make money.
Hmm, what if the shadowbanning were ‘soft’? Like if bot comments were locked at a low negative score and hidden by default, that would take away most of the exposure but let them keep rambling away.
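Something like this toy logic, completely made up just to illustrate the idea:

```python
# Purely hypothetical "soft shadowban": pin the visible score low and collapse the
# comment by default, without telling the bot it has been actioned.
from dataclasses import dataclass

SOFT_BAN_SCORE = -5    # score the comment gets pinned at (made-up number)
HIDE_THRESHOLD = -3    # comments at or below this are collapsed by default

@dataclass
class Comment:
    author: str
    text: str
    score: int
    author_flagged_as_bot: bool = False

def visible_score(c: Comment) -> int:
    # Flagged accounts always show the pinned low score; real votes are ignored.
    return SOFT_BAN_SCORE if c.author_flagged_as_bot else c.score

def hidden_by_default(c: Comment) -> bool:
    return visible_score(c) <= HIDE_THRESHOLD

spam = Comment("botfarm123", "Buy my coin", score=40, author_flagged_as_bot=True)
print(visible_score(spam), hidden_by_default(spam))  # -5 True: still posted, barely seen
```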
Top 50% of the population still.
After all, they wrote a review.
A Qwen 2.5 14B IQ3_M should completely fit in your VRAM, with longish context, with acceptable quality.
An IQ4_XS will just barely overflow but should still be fast at short context.
And while I have not tried it yet, the 14B is allegedly smart.
Also, what I do on my PC is hook my monitor up to the iGPU so the discrete GPU’s VRAM is completely empty, lol.
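If you want to sanity-check the fit yourself, the napkin math looks roughly like this; the bits-per-weight figures are approximate llama.cpp values, and the KV cache and overhead numbers are guesses that depend on your context length (assuming the 8GB card from earlier):

```python
# Rough VRAM estimate for Qwen 2.5 14B GGUF quants on an 8 GB card.
# Bits-per-weight values are approximate; KV cache and overhead are assumptions.
params_b = 14.8                  # Qwen 2.5 14B parameter count, in billions (approx.)

def weights_gib(bits_per_weight: float) -> float:
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

kv_cache_gib = 0.8               # depends heavily on context length (assumed)
overhead_gib = 0.5               # CUDA context, compute buffers, etc. (assumed)

for name, bpw in [("IQ3_M", 3.66), ("IQ4_XS", 4.25)]:
    total = weights_gib(bpw) + kv_cache_gib + overhead_gib
    print(f"{name}: ~{weights_gib(bpw):.1f} GiB weights, ~{total:.1f} GiB total")
```

Which lines up with what I said above: IQ3_M lands around 7.5 GiB all-in, while IQ4_XS pushes past 8.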