Tag: optimization
All the articles with the tag "optimization".
-
Google Just Made Every AI Model 6x Cheaper to Run. Memory Chip Stocks Crashed.
TurboQuant compresses LLM memory from 16 bits to 3 bits per value with zero accuracy loss. 6x less memory, 8x faster inference. And the stock market panicked, because apparently nobody learned from the DeepSeek episode.
-
One Year After DeepSeek R1: What Actually Changed?
A year ago, a Chinese lab nobody was watching released a model that crashed NVIDIA's stock and challenged every assumption about how much compute you actually need. I was watching them before the explosion. Here is what I think really happened.