It isn't a good product.
Any use case I see is for embodied AI.
Does not have to be a good product (for now). This is a demonstrator.
These can scale up to multiple 30 kW machines, I believe, with several of these cards inside each.
For now it's just to try to prove how fast you can go from code to ASIC, and to work out the economics of production (more or less like when Bitcoin miners started to appear).
Then, these might evolve into some better form of programmable models. Maybe more advanced ASICs that can carry the logic for more than one LLM, and do inference depending on firmware switches.
Similar to what Mellanox/Nvidia did with VPI cards, which work with both InfiniBand and Ethernet protocols (one per port at a time), for example.
Either way, these things usually involve patents, and that's why they buy these companies... to either monopolise the market or prevent it from being too easy to explore. That was the point of my snap :D
The problem with these cards is that they are outdated once a new model comes out. It doesn't matter if one can carry more than one LLM. (Doubtful that they could. And both models would be outdated anyway.)
Given that a new model comes out every ~3 months nowadays, these cards will have a lifetime of 3 months for big players and become e-waste thereafter.
These sorts of cards are basically the opposite of acceleration and the singularity.
Would someone buy Taalas the company? Probably.
Would someone be interested in buying these cards? Probably small inference providers, and some consumers who are fine with not upgrading.
But big AI labs? No.
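To put the 3-month-lifetime argument in numbers, here is a rough sketch that spreads the purchase price of a card over the months it stays useful. The prices and the 36-month GPU lifetime are made-up placeholders, not real figures for Taalas cards or any GPU, and the sketch ignores throughput, power and resale value.

```python
def monthly_hardware_cost(card_price: float, useful_months: float) -> float:
    """Spread the purchase price evenly over the months the card stays useful."""
    if useful_months <= 0:
        raise ValueError("useful_months must be positive")
    return card_price / useful_months


# Hypothetical numbers, for illustration only.
# A cheap fixed-model card that is obsolete after one model cycle (~3 months)...
fixed_model_card = monthly_hardware_cost(card_price=3_000, useful_months=3)
# ...versus a 10x more expensive GPU reused across model generations (assumed 36 months).
general_gpu = monthly_hardware_cost(card_price=30_000, useful_months=36)

print(f"fixed-model card: {fixed_model_card:.2f} per month")
print(f"general GPU:      {general_gpu:.2f} per month")
```

With these placeholder inputs the cheaper card ends up costing more per month of useful life. Whether that holds in practice depends entirely on the real prices and on how long a buyer is happy to stay on one model.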
I know all of that! :) For a reason...
But the purpose of these kinds of demonstrations (behind what is exposed) is not a final product that sells millions and makes you rich. It is to demonstrate a viable process A, B or C... that either gives value to the patents, so it's just an asset for monopolising value (aka planning to be acquired on purpose, because competitors are fighting each other and you want to take advantage of that), or lets you progressively steal market share, because you want to make the point that evolution needs to slow down: the big giants can't keep adding that many advantageous features with each new model (or the market does not perceive that value anymore, aka not everyone wants to be on the latest model).
Also, this is 6nm, so a pretty established node and probably quick to produce (just not in very large quantities). Which is part of the point of products like these. You never want a product like this for years... this is a fast alternative for inference, vs expensive GPUs that are not very accessible unless you are willing to pay a premium and buy in very large quantities.
It's also the point (starting to emerge) where you need 1 billion dollars to go from Grok 4 to Grok 5, vs staying with Grok 4 for X time for 1 million and selling another product for X millions.
The same already happens with GPUs... GPUs from 2 years ago (even last year's, if you are a large buyer) are already operationally more expensive, because the performance for the power they consume is no longer there, plus datacenter costs add up for long-term solutions like those.
Bitcoin is the same: you can have an old Bitcoin miner, and yes, it still works and mines peanuts, but at what cost? Yeah, some people have free electricity and that kind of thing... still, for a large part of the market there will be costs, like cooling etc... there's always a break-even point for large-scale industrialisation, which is what these things are trying to tease or take advantage of. Fighting against small customer B is not profitable... fighting against big giants is...
Will they manage? Don't know. But it's either not trying... or getting bought because they are making too much noise.
And for some, the second part is worth it.
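The break-even point mentioned above for old miners (or old GPUs) can be sketched as simple arithmetic: daily revenue minus energy cost minus other overhead such as cooling. All names and numbers here are hypothetical and only meant to show the shape of the calculation.

```python
def daily_profit(revenue_per_day: float, power_kw: float,
                 price_per_kwh: float, overhead_per_day: float = 0.0) -> float:
    """Revenue minus 24 h of electricity and any fixed daily overhead (cooling, hosting)."""
    energy_cost = power_kw * 24 * price_per_kwh
    return revenue_per_day - energy_cost - overhead_per_day


def break_even_price_per_kwh(revenue_per_day: float, power_kw: float,
                             overhead_per_day: float = 0.0) -> float:
    """Electricity price at which daily profit is exactly zero."""
    return (revenue_per_day - overhead_per_day) / (power_kw * 24)


# Hypothetical old machine: earns 6.0 per day, draws 3 kW, 1.2 per day of cooling.
paid_power = daily_profit(6.0, power_kw=3.0, price_per_kwh=0.10, overhead_per_day=1.2)
free_power = daily_profit(6.0, power_kw=3.0, price_per_kwh=0.0, overhead_per_day=1.2)
threshold = break_even_price_per_kwh(6.0, power_kw=3.0, overhead_per_day=1.2)

print(f"profit at 0.10/kWh: {paid_power:.2f}")
print(f"profit with free electricity: {free_power:.2f}")
print(f"break-even price per kWh: {threshold:.4f}")
```

In this made-up case the machine loses money at 0.10 per kWh but stays profitable with free electricity, which is the "some people have free electricity" situation. Newer, more efficient hardware pushes the break-even price up, and that is where the old gear stops making sense at scale.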
This is not true.
You can use the same GPU for multiple model generations. GPT5 and GPT5.1 are not the same model, nor are GPT 5.2 or the codex models.
You can't fine-tune these cards.
To ship a codex model you would have to produce these cards again; to ship a new iterative fine-tune you would need to produce these cards again.
And you already have alternatives like Groq and Cerebras for fast inference without the drawbacks of these cards.
They are Dead on Arrival.
Models are not like Bitcoin ASICs, where you optimize the process for one single thing. Each model requires a different process, because everything from parameter counts to architecture changes.
This is like comparing apples to oranges.
Something being able to be done does not mean it is a viable process.
Man... you are "almost" speaking like there are no costs involved in anything. My perspective has nothing to do with technology; it's just business!
Anyhow, no brains left for much discussion today. Catch you next time when the brains cool down.
Right, now that it is announced, coming to my point of why I shared this...
Check NVIDIA's GTC talk, and look at another "version" of these, called an LPU. And this was sort of where I was trying to take my conversation here. ASIC specialization!
No, LPUs are not another version of this. Their architecture is vastly different from this, as I mentioned in the comment above.
Which seems to be the main target.
Tokens per second per user implies that it isn't.
There would be only one user on an embodied AI.
Probably the data centers, once they realize that their current orders aren't going to be delivered on time.
It makes sense.
In a sense, I agree with you, but I would rather think of a big, already dominant player on the protocol, such as Meta, for example. Or NVIDIA to prevent Meta from achieving higher dominance.
Just watching it. 👀