Impulse buy: Just ordered the new NVIDIA Jetson Orin Nano Super (instead of a GPU or AI accelerator M.2)

in #homelab, 5 days ago

It may end up being a mistake, this being a brand new and likely poorly documented ARM-based device. But after weeks of not being able to figure out what to add to my first XCP-NG home-lab node so that I could run basic hobby-level trading models on my systems, I went for an impulse buy.

A few weeks ago I ordered myself a new MS-01 from Minisforum, the cheapest one with an i5, and I got 96GB of RAM and a x4 4TB gen4 SSD for the system. I also got a 2TB gen3 SSD that I was going to add to the system, but that one now has a new designation.

The MS-01 comes with a PCIe slot, but the only cards it can fit are cards that are not too long, are half height, take up just one slot, and don't draw too much power, because the case is small and cooling capacity is limited. There are some people on the internet who managed to get really decent GPUs in there running at 70 watts, but they had to overcome cooling issues by adding external cooling solutions, and that type of tinkering really isn't for me. I was considering putting an RX 6400 into the system.

But as the main task for the GPU was to run machine learning models for hobby trading, there was an alternative option that I was considering. The Hailo-8 is an M.2 AI accelerator card that runs at just a few watts. I would be giving up on the possibility to run transcoder workloads, but the MS-01 comes with an onboard Intel Iris GPU that, while not great on performance, should be able to do at least some basic transcoding.

I couldn't make up my mind what to get. At one point I thought I should get both. Or should I wait for the Hailo-10 to come out? I felt the Hailo cards were very interesting, but there seems to be close to no community for these in the home-server space. Most of the Hailo-8 community seems to hover around Raspberry Pi based camera video processing systems.

Now today I suddenly saw the new NVIDIA Jetson Orin Nano Super show up in my feed not once, but three times in a row. I took one look at the specs, went to the NVIDIA site, went to the Silicon Highway site and ordered myself a Jetson Orin Nano Super.

The pricing showed at just €237.40 on the site, and after VAT and shipping that became €323.98. That is more than either the RX 6400 or the Hailo-8, and I have no clue how it compares with what the pricing for the upcoming Hailo-10 is going to be. It also presses on my budget for wiring up my new house in March, but I guess I can manage with just wiring up the ground floor for now.
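To put the gap between the site price and what I actually paid into one number, here is a small sketch. It only uses the two amounts quoted above; the split between VAT and shipping is not something I have broken out, so the function (its name is just for illustration) treats both as one combined surcharge.

```python
LIST_PRICE_EUR = 237.40    # price shown on the site
LANDED_PRICE_EUR = 323.98  # what it became after VAT and shipping


def surcharge(list_price, landed_price):
    """Return (absolute surcharge, surcharge as a fraction of the list price)."""
    extra = landed_price - list_price
    return extra, extra / list_price


extra, fraction = surcharge(LIST_PRICE_EUR, LANDED_PRICE_EUR)
print(f"VAT + shipping: EUR {extra:.2f} ({fraction:.1%} on top of the list price)")
```

So VAT and shipping together add a bit over a third to the advertised price.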

Let's see if what I chose is the right one in terms of specs.

Let's look at what I'm not getting now that I bought this instead.

RX 6400

The card I was considering was this one or something quite similar. It is a 768-core card without tensor cores and with 4GB of RAM, which would be cool because, combined with the 96GB of RAM for the main system, it would put my total RAM at a round 100GB. This is not a beast by any measure, but it is decent enough to combine basic machine learning models with some transcoding workloads at a power consumption of up to 50 watts.

Hailo-8

My alternative to the GPU was getting this little M.2 AI accelerator card from Hailo, the Hailo-8. It is a bit like the Google Coral cards, but at about six times the TOPS (Tera Operations Per Second). The Hailo-8 lists a speed of 26 TOPS.

Comparing the specs.

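| | RX 6400 | Hailo-8 | Jetson Orin Nano Super |
|---|---|---|---|
| Price | €140 | €220 | €324 |
| TOPS | - | 26 | 67 |
| CUDA cores | 768 | - | 1024 |
| Tensor cores | 0 | - | 32 |
| Memory | 4GB | - | 8GB |
| Memory speed | 16 GBps | - | 102 GBps |
| Power | 50W | 8W | 25W |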

From the specs it seems like my impulse buy was a good choice. It's not the lowest power solution, but performance is better than both alternatives at a moderately higher price. I can put in the 2TB SSD that I had planned earlier as a second disk for my MS-01, but I have no case for it at the moment.
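To make "better performance at a moderately higher price, but not the lowest power" a bit more concrete, here is a minimal sketch that turns the price, TOPS and power rows of the comparison table into TOPS per euro and TOPS per watt. The dictionary layout and function names are my own, and the RX 6400 has no TOPS figure in the table, so it is simply skipped rather than guessed at.

```python
# Figures copied from the comparison table; None means "not listed there".
SPECS = {
    "RX 6400": {"price_eur": 140, "tops": None, "power_w": 50},
    "Hailo-8": {"price_eur": 220, "tops": 26, "power_w": 8},
    "Jetson Orin Nano Super": {"price_eur": 324, "tops": 67, "power_w": 25},
}


def efficiency(spec):
    """Return (TOPS per euro, TOPS per watt), or None if no TOPS figure is listed."""
    if spec["tops"] is None:
        return None
    return spec["tops"] / spec["price_eur"], spec["tops"] / spec["power_w"]


def report(specs=SPECS):
    """Build one human-readable line per device."""
    lines = []
    for name, spec in specs.items():
        result = efficiency(spec)
        if result is None:
            lines.append(f"{name}: no TOPS figure listed")
        else:
            per_eur, per_watt = result
            lines.append(f"{name}: {per_eur:.3f} TOPS/EUR, {per_watt:.2f} TOPS/W")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these numbers the Jetson gives more TOPS per euro than the Hailo-8 (roughly 0.21 versus 0.12), while the Hailo-8 gives more TOPS per watt (3.25 versus 2.68). Keep in mind that vendor TOPS figures are not always measured the same way, so this is a rough comparison at best.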

There is, however, a price that I'm going to be paying for this choice when I start adding other nodes to my home lab. The MS-01 will for sure play nice as part of an XCP-NG pool, but will these ARM-based systems? If I understand correctly, the NVIDIA Jetson Orin Nano comes with its own NVIDIA-specific distro preinstalled, and I don't know what kind of hell I'd be setting myself up for if I tried to go bleeding edge by running XCP-NG on this thing while keeping the whole NVIDIA stack intact. So the price I'm probably going to be paying when adding new nodes is that I'll likely have to implement my own bare metal fail-over for the NVIDIA devices, or, and I hope not, I may need to switch both the MS-01s and the Orins to Proxmox rather than XCP-NG.

Did I make the right choice?

Considering the much better specs than the alternatives, but also considering the likely troubles I'm going to run into when adding a second MS-01 and a second Orin node to my lab (hopefully next summer) and a third of both (hopefully in about a year from now) and maintaining it all as a pool, did I make the right choice? And if so, am I making the right choice if I install XCP-NG on my MS-01 and leave the Orin as delivered, with everything running on bare metal? I know XCP-NG is far from having the whole stack on arm64. Should I be considering Proxmox? I know people are running Proxmox on Raspberry Pi and Jetson systems. Should I put in the energy to make both my first MS-01 and my first Orin Super run Proxmox? When I consider these questions, I see that while this impulse buy helped me out of the pure hardware dilemma I was facing, I'm now faced with a new dilemma: a virtualization and resource pools dilemma.

If you have any insights on this dilemma, please weigh in. Do I stick with XCP-NG for my MS-01 node and stay on bare metal for trading and transcoding workloads? Do I trust that XCP-NG will soon run full stack on arm64? Or do I do the radical thing and learn and invest in Proxmox knowledge? I personally feel that XCP-NG is far superior to Proxmox in a homogeneous x86 setting, but that is not what I'll have. Should I revise my stance and start off my new homelab journey learning and using Proxmox?

Or should I just consider I made the wrong choice, use this device for a while and then buy one of the other options once I get my second MS01 node?