RE: Snaps Container // 2/24/2026, 6:36:00 PM

in Snaps, 3 months ago (edited)

I know all of that! :) For a reason...

But the purpose of these kinds of demonstrations (behind what is exposed) is not a final product that sells millions and makes you rich, but instead to demonstrate a viable A, B or C process... that either values patents and it's just an asset for monopolising value (aka planned to be acquired on purpose due to competitors fighting each other, and you wanting to xuch these kinds of things) or to progressively steal market share because you want to start making a point that evolution needs to stop because big giants are not able to evolve with so many advantageous features on the next model (or the market does not perceive that value anymore, aka not everyone wants to be on the latest model).

Also, this is 6nm, so pretty established and probably quick to produce (just not in very large quantities). Which is part of the point of products like these. You never want a product like this for years... this is a fast alternative for inference, vs expensive GPUs that are not very accessible unless you are willing to pay an extra premium and buy in very large quantities.

It's also the point (starting to emerge), where you need 1 billion dollars to go from grok4 to 5, vs you will stay with grok4 for X time for 1 million and sell another product for X millions.

Same already happens with GPUs... GPUs from 2 years ago (even last year's ones if you are a large buyer) are already operationally more expensive because the performance they deliver for the power they consume is no longer there, plus datacenter costs add up for long-term solutions like those.

Bitcoin is the same, you can have an old Bitcoin miner, and yes, it still works and mines peanuts, but at what cost? Yeah, some people have free electricity kind of thing... still, for a large part of the market it will have costs, like cooling etc... there's always a break-even point for large industrialisation, which is what these things are trying to tease or take advantage of. Fighting against small customer B is not profitable... fighting against big giants is...
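To make the break-even point concrete, here is a minimal sketch of the arithmetic. Every number and name in it is made up for illustration (the 3 kW draw, the 6.00 per day of revenue, the `overhead_factor` knob for cooling); none of it comes from real miner or GPU data.

```python
def daily_energy_cost(power_kw, price_per_kwh, overhead_factor=1.0):
    """Cost of running the device for 24 hours.

    overhead_factor > 1.0 models extra costs such as cooling
    (a hypothetical knob, not something from the thread).
    """
    return power_kw * 24 * price_per_kwh * overhead_factor


def daily_profit(revenue_per_day, power_kw, price_per_kwh, overhead_factor=1.0):
    """Revenue minus energy cost for one day of operation."""
    return revenue_per_day - daily_energy_cost(power_kw, price_per_kwh, overhead_factor)


def break_even_price(revenue_per_day, power_kw, overhead_factor=1.0):
    """Electricity price per kWh at which daily profit is exactly zero."""
    return revenue_per_day / (power_kw * 24 * overhead_factor)


if __name__ == "__main__":
    # Made-up numbers for an old miner: 3 kW draw, 6.00 per day in revenue.
    for price in (0.0, 0.05, 0.10):
        print(price, round(daily_profit(6.0, 3.0, price, overhead_factor=1.2), 2))
    print("break-even price:", round(break_even_price(6.0, 3.0, 1.2), 4))
```

With free electricity the old device always "works"; above the break-even price it loses money every day it runs, which is the same logic applied to older GPUs.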

Will they manage? I don't know. But it's either not trying... or getting bought because they are making too much noise.

And for some, the second part is worth it.

It's also the point (starting to emerge), where you need 1 billion dollars to go from grok4 to 5, vs you will stay with grok4 for X time for 1 million and sell another product for X millions.

This is not true.

Same already happens with GPUs... GPUs from 2 years ago (even last year's ones if you are a large buyer) are already operationally more expensive because the performance they deliver for the power they consume is no longer there, plus datacenter costs add up for long-term solutions like those.

You can use the same GPU for multiple model generations. GPT5 and GPT5.1 are not the same model, nor are GPT 5.2 or the codex models.

You can't fine-tune these cards.

To produce a codex model you would have to produce these cards again; to produce a new iterative fine-tune you would need to produce these cards again.

And then you already have alternatives like Groq and Cerebras for fast inference without the drawbacks of these cards.

They are Dead on Arrival.

Models are not like Bitcoin ASICs, where you optimize the process for one single thing. Each model requires a different process, due to changes ranging from parameter counts to architecture.

This is like comparing apples to oranges.

Something being able to be done does not mean it is a viable process.

Man... you are "almost" speaking like there are no costs involved in anything. My perspective has nothing to do with technology; it's just business!

Anyhow, not with brains for much discussion today. Catch you next time when the brains cool down.

Right, now that it is announced, coming to my point of why I shared this...

Check the GTC talk by NVIDIA, and check another "version" of what these are called (LPU). And this was sort of where I was trying to have my conversation here. ASIC specialization!

No, LPUs are not another version of this. Their architecture is vastly different from this one, which I mention in the above comment.

I know they are different (hence why I said "version"), but it's a signal for microprocessor specialization, which is what I am trying to frame in this conversation, and how the generalized compute market usually evolves.

This only signals that GPUs are not solving some parts of inference as efficiently as desired, and so new specialized microprocessors, and eventually ASICs, are produced instead of continuing to adapt the main equipment already being produced (in this case the GPU) to all problems.

Anyhow, eager to see how this evolves... even at CPU levels for small inference workloads. Everything is going mega fast...

This only signals that GPUs are not solving some parts of inference as efficiently as desired, and so new specialized microprocessors, and eventually ASICs, are produced instead of continuing to adapt the main equipment already being produced (in this case the GPU) to all problems.

You are extrapolating this too much. LPUs work because they are not as specialized as Taalas. You can switch the model that is running on them.

Taalas is too specialized: it can only run the model it is designed to run. Sure, this is very efficient at running that model, but it isn't efficient in a general sense when you are getting new models every 3 months, not to mention fine-tuned models.

An ASIC like Taalas is not very adaptable.
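A rough sketch of why adaptability matters for cost, under stated assumptions: the prices and lifetimes below are hypothetical (only the "new models every 3 months" cadence comes from this thread), and `monthly_capex` is an illustrative helper, not anyone's real pricing model. Hardware tied to one model is only useful while that model is still wanted.

```python
def monthly_capex(hardware_cost, hardware_life_months, model_life_months=None):
    """Hardware cost spread over the months the hardware is actually useful.

    model_life_months=None means the hardware can be repurposed for new
    models, so it stays useful for its full hardware life. Otherwise the
    useful life is capped by how long the baked-in model stays relevant.
    """
    if hardware_cost < 0 or hardware_life_months <= 0:
        raise ValueError("cost must be >= 0 and hardware life must be > 0")
    if model_life_months is None:
        useful_months = hardware_life_months
    else:
        if model_life_months <= 0:
            raise ValueError("model life must be > 0")
        useful_months = min(hardware_life_months, model_life_months)
    return hardware_cost / useful_months


if __name__ == "__main__":
    # Hypothetical: a reusable accelerator at 36000 that lasts 36 months,
    # vs a fixed-model chip at a tenth of the price whose model is
    # replaced after 3 months.
    print("reusable   :", monthly_capex(36000, 36))
    print("fixed-model:", monthly_capex(3600, 36, model_life_months=3))
```

The comparison flips if the baked-in model stays in demand for longer, which is really the disagreement in this thread: how long a single model remains worth serving.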

You are extrapolating this too much.

Maybe. Let's see in a few good months.

History has taught us that if something is too complex, it eventually dies or splits into more pieces. And that's how I feel this might play out.