Hive Pressure #3: Catching up with the head block.

in Blockchain Wizardry, 3 years ago

A basic Hive node has a very simple configuration, and with minor changes it can serve as a seed node, a witness node, a broadcaster node, a private node for your wallet (that’s what exchanges use) or even a simple API node for your Hive microservices.

Regardless of its role, as long as a node has unrestricted network access, it will be part of the Hive p2p network, thus supporting Hive reliability and resilience.

Before your node becomes fully functional, it has to reach the head block of the blockchain.

Get the Hive daemon

Get the blocks

The easy way or the fast way.

  • Sync from the p2p network
    By default, when a fresh Hive node starts, it connects to the Hive p2p network and retrieves blocks from it.
    See: --resync-blockchain

  • Get blocks yourself
    A Hive node can use an existing block_log, either from another instance or from a public source such as https://gtg.openhive.network/get/blockchain
    Our goal is to reach the head block as soon as possible, so we choose this way.
    The block_log currently takes over 350GB, so depending on your connection and the source, downloading it might take less than an hour (at 1Gbps) or up to half a day (at 100Mbps).
    By default it’s expected to be located at ~/.hived/blockchain/block_log.
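
The download estimates above can be checked with a bit of arithmetic. This is a minimal sketch, assuming the block_log is exactly 350 decimal gigabytes and the link runs at full speed with no protocol overhead. The function name download_seconds is made up for illustration. Real downloads will be slower.

```python
def download_seconds(size_bytes: float, link_bits_per_s: float) -> float:
    """Ideal transfer time: payload bits divided by link speed, no overhead."""
    return size_bytes * 8 / link_bits_per_s


# "over 350GB" from the post, taken as decimal gigabytes (assumption).
BLOCK_LOG_BYTES = 350e9

for label, speed in (("1 Gbps", 1e9), ("100 Mbps", 100e6)):
    hours = download_seconds(BLOCK_LOG_BYTES, speed) / 3600
    print(f"{label}: about {hours:.1f} h")
```

In the ideal case that gives roughly 0.8 h and 7.8 h, which fits "less than an hour" and, once real-world overhead is added, "half a day".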

Configure your node

Configuration settings are by default in ~/.hived/config.ini
This should be enough:

plugin = witness
plugin = rc

shared-file-dir = "/run/hive"
shared-file-size = 24G

flush-state-interval = 0

Please note that I’m using a custom location for the shared_memory.bin file, keeping it on a tmpfs volume for maximum performance. If you are going to do the same, make sure you have enough space there.
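
Before starting the node, it may be worth checking that the target directory really has room for the shared memory file. This is a rough sketch, not part of hived: parse_size and has_room are hypothetical helpers, and treating the G suffix as 1024^3 bytes is an assumption about how the value is interpreted.

```python
import shutil

# Assumption: K/M/G/T suffixes are binary multiples (1024-based).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Turn a value such as '24G' into a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def has_room(directory: str, wanted: str) -> bool:
    """True if the filesystem holding `directory` has at least `wanted` free."""
    return shutil.disk_usage(directory).free >= parse_size(wanted)


# Demo on the current directory so the sketch runs anywhere.
print(has_room(".", "1K"))
```

For the configuration above, the call would be has_room("/run/hive", "24G"), run before the first start, while the tmpfs volume is still empty.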

Process the blocks

Having all the blocks is not enough: your node needs to be aware of the current state of Hive.
Live nodes get blocks from the p2p network and process them, updating the state one block at a time (every three seconds), but when you start from scratch, you have to catch up.
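
To get a feel for how quickly a node falls behind, here is a small sketch based on the three-second block interval. It assumes every slot produces a block, so the result is an upper bound. The name blocks_behind is made up for this example.

```python
BLOCK_INTERVAL_S = 3  # one block every three seconds, as stated in the post


def blocks_behind(offline_seconds: int) -> int:
    """Upper bound on blocks produced while a node was not processing them."""
    return offline_seconds // BLOCK_INTERVAL_S


print(blocks_behind(3600))       # one hour offline
print(blocks_behind(24 * 3600))  # one day offline
```

That is 1200 blocks per hour and 28800 blocks per day that have to be processed before the node is live again.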

  • Snapshot
    A snapshot is the fastest way because most of the work is already done.
    That, however, works only for compatible configurations.
    We will play with snapshots another time.

  • Replay
    Once you have the block_log and config.ini files in place, you need to start hived with --replay-blockchain.
    Replay uses the existing block_log to build up the shared memory file up to the highest block stored there, and then it continues with sync, up to the head block.
    There's very little use of multi-threading here because every block depends on the previous one.
    A lot of data is being processed, so your hardware specs really do matter here.
    Not long ago Hive crossed the 55 million block mark.
    Let’s see how long it takes to replay that many blocks using different hardware specs.

hived --force-replay --set-benchmark-interval 100000

Test Setups

Alpha

A popular workstation setup. Good enough but will run out of storage soon.

Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
64GB RAM (DDR4 4x16GB 2133MHz)
2x256GB SSD in RAID0 (SAMSUNG MZ7LN256HMJP)

Bravo

Old but not obsolete. CPU released in 2014. New disks after the old ones failed.

Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
32GB RAM (DDR3 4x8GB 1600MHz)
2x480GB SSD in RAID0 (KINGSTON SEDC500M480G)

Charlie

The newest and most expensive CPU on my list. Also the only AMD.

AMD Ryzen 5 3600
64GB RAM (DDR4 4x16GB 2666MHz)
2x512GB NVMe in RAID0 (SAMSUNG MZVLB512HBJQ)

Delta

My favorite: high-quality components for serious tasks.

Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz
64GB RAM (DDR4 4x16GB 2666MHz ECC)
2x512GB NVMe in RAID0 (WD CL SN720)

Warning: spoilers ahead

What do you think? Which one will win the race?

Results

Server     [s]      H:M:S
Alpha      28120    07h48m40s
Bravo      26280    07h18m00s
Charlie    25032    06h57m12s
Delta      23314    06h28m34s
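
The results can be turned into an approximate replay throughput. This is a sketch under one loud assumption: the block count is rounded to 55 million, while the post only says the chain had recently crossed that mark. The helper names hms and blocks_per_second are made up for illustration.

```python
# Assumption: exactly 55 million blocks were replayed on every machine.
BLOCKS = 55_000_000

# Replay times in seconds, copied from the results above.
RESULTS_S = {"Alpha": 28120, "Bravo": 26280, "Charlie": 25032, "Delta": 23314}


def hms(seconds: int) -> str:
    """Format seconds the same way as the H:M:S column."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}h{minutes:02d}m{secs:02d}s"


def blocks_per_second(seconds: int, blocks: int = BLOCKS) -> float:
    """Average replay speed over the whole run."""
    return blocks / seconds


for name, secs in RESULTS_S.items():
    print(f"{name:8} {hms(secs)}  ~{blocks_per_second(secs):.0f} blocks/s")
```

Under that assumption all four machines average around two thousand blocks per second, compared with one block every three seconds on a live node.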

What are your --replay times?


Epsilon

I hate that one. Destroyer of fun.

AMD Ryzen 9 5950X
128GB RAM (DDR4 4x32GB 3600MHz)
4x2TB NVMe in RAID0 (Samsung SSD 980 PRO)

Result

Server     [s]      H:M:S
Epsilon    18339    05h05m39s

Courtesy of @blocktrades

Dammit. My money was on Charlie!

Dammit. I do not even have money. 😂

Epsilon (AMD 5xxx series) is suspiciously missing from this list. My money is still on properly clocked Epsilon!

I just got the results (as a reply to top post).
I was expecting something close to 6h because I had seen before what this monster can do with a full account history node. But this... I'm impressed!

I was on this one with you, my friend!

I was all in on Charlie, lol. Nice article. If you ever try this on different machines, plz post it!

Run your Hive witness inside of @telosnetwork Dstor when it does computations like a3. It's just storage ATM, but it will soon let you run something like hived, bro, so do it, do it. Then we will have Hive inside EOSIO for 100s of Hive chains, with mega witness elections, and witnesses and BPs representing whole chains going off to the United Nations of blockchains, lol.

No, sorry to disappoint you, it's not fast enough for Hive.

it will be
