• 3 Posts
  • 96 Comments
Joined 6 months ago
Cake day: March 22nd, 2024

  • “Well, one lesson I’ve learned is that just because I say something to a group and they laugh doesn’t mean it’s going to be all that hilarious as a post on X,” he said in a follow-up post early Monday. “Turns out that jokes are WAY less funny if people don’t know the context and the delivery is plain text.”

    I’ve known people like this in real life, who’d say something horrible and follow it up with “It’s just a joke,” but only after they ‘lose’ and get called out on it.

    They’re slimy jerks, and it’s utterly miserable to even be around them. I don’t understand why so many people worship/follow Elon and stay on Twitter for it.

  • The problem is that splitting models up over a network, even over a LAN, is not very efficient. The entire set of weights has to be run through for every token (roughly half a word), and the activations have to hop between hosts each time, so autoregressive decoding pays the network cost over and over (rough numbers sketched at the end of this comment).

    The other problem is that Petals just can’t keep up with the crazy dev pace of the LLM community. Honestly, they should dump it and fork or contribute to llama.cpp or exllama, because nobody wants to split up Llama 2 (or even Llama 3) 70B and end up a generation or two behind, running a base instruct model instead of a finetune.

    Even the Horde has very few hosts relative to users, even though hosting a small model on a 6 GB GPU would earn you lots of karma.

    The diffusion community is very different: the output is a single image, and even the largest open models are much smaller. LoRA usage is also standardized there, while it isn’t in LLM land.
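
    To put rough numbers on that first point: autoregressive decoding is sequential, so every token has to traverse every shard in order, paying one network round trip per hop on top of the compute. Here is a minimal back-of-envelope sketch in Python; the host counts, the 50 ms of compute, and the RTT figures are all illustrative assumptions, not measurements from Petals or any real deployment:

```python
# Back-of-envelope model of pipeline-parallel LLM decoding over a network.
# Every number below is an illustrative assumption, not a measurement.

def tokens_per_second(n_hosts: int, compute_ms: float, rtt_ms: float) -> float:
    """Tokens/s when each token must traverse all shards in order.

    compute_ms: total GPU time per token, summed across all shards.
    rtt_ms:     round-trip latency between neighbouring hosts.
    """
    # Decoding is sequential, so each token pays (n_hosts - 1) network hops.
    per_token_ms = compute_ms + (n_hosts - 1) * rtt_ms
    return 1000.0 / per_token_ms

# One box holding the whole model: no hops at all.
print(tokens_per_second(1, compute_ms=50, rtt_ms=0))    # 20.0 tok/s

# Same total compute split across 8 LAN hosts (~1 ms RTT).
print(tokens_per_second(8, compute_ms=50, rtt_ms=1))    # ~17.5 tok/s

# Split across 8 internet hosts (~80 ms RTT), the Petals-style setting.
print(tokens_per_second(8, compute_ms=50, rtt_ms=80))   # ~1.6 tok/s
```

    Under these assumptions the hops are nearly free on a LAN, but at internet latencies the network term dwarfs the compute, which is the overhead a single machine running llama.cpp or exllama never pays.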