Are you training those models? Because that requires hardware that is probably $10-12k minimum. Every single LLM self-hosting story I’ve heard is someone using a tiny purpose-built model that can do only one or two things. If you want Claude or ChatGPT levels of capability, you’re running hardware that costs $10k+ minimum. I’ve been in tech 20 years and I don’t own a single piece of hardware that will run any large model well. Even my M3 MacBook Air absolutely chokes on Qwen, for example.
If you want pre-trained models like Llama, Mistral, or Gemma, you’re circling back to corporate lock-in from Meta, former Meta employees, or Google. Suddenly it’s not open source anymore.
I have a £3k laptop that can run most models that fit in <= 16 GB of VRAM.
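For context on whether a model fits in 16 GB, a rough back-of-envelope sketch (the overhead figure is an assumption; real usage also depends on context length and KV cache):

```python
# Rough VRAM estimate for loading an LLM's weights locally.
# Rule of thumb: weights take (parameters x bits-per-weight / 8) GB,
# plus some overhead for KV cache and activations (15% is an assumption).

def estimate_vram_gb(params_billions: float, bits_per_weight: int,
                     overhead: float = 0.15) -> float:
    """Approximate VRAM (GB) needed to hold a model's weights."""
    weight_gb = params_billions * bits_per_weight / 8
    return round(weight_gb * (1 + overhead), 1)

# A 7B model quantized to 4 bits fits easily in 16 GB;
# a 70B model at 16-bit precision is far out of reach.
print(estimate_vram_gb(7, 4))    # roughly 4 GB
print(estimate_vram_gb(70, 16))  # well over 100 GB
```

This is why the "tiny purpose-built model" stories dominate: the models that fit consumer hardware are the small, heavily quantized ones.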