I decided to try this because I thought it would be an interesting and challenging way to learn more about the Hive blockchain. My initial plan was to run the node on a Raspberry Pi board, but it turned out that 4G of memory is just not enough. It ran without complaints in a VirtualBox VM with 4G RAM + 2G swap, so I suppose the Raspberry Pi is not out of the question.
Running Out Of The Box
There is pretty good documentation at https://github.com/harpagon210/steemsmartcontracts-wiki/blob/master/How-to-setup-a-Steem-Smart-Contracts-node.md. I used the official Docker images for NodeJS and MongoDB instead of installing them, which saved some time. One thing that is not mentioned is that the HowTo seems to be for running a steem-engine node, or at least I guess so from the nodes specified in the config file. After poking around the source I found a branch named hive-engine and checked it out before proceeding with the npm install step. Everything installed fast and smoothly, in 10 minutes or so. The only thing that needed to be tweaked was the MongoDB address in the config file. After pointing it to the address of the MongoDB container, I had a working hive-engine node which was happily streaming blocks from the Hive blockchain.
Too Good To Be True
The first disappointment came a few minutes later. Upon checking the logs I noticed a message saying that the Hive blockchain was more than 12 million blocks ahead, while the streamer was getting about 20 blocks per 10 seconds. At that speed of 2 blocks/sec it would take 6 million seconds (roughly 69 days) to catch up! There is an option to sync the blocks from the blocks_log file, which is around 330GB (wow!), but I was sure downloading it wouldn't be much fun, so I decided to take the more challenging path of performance tuning.
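To put a number on that estimate, here is a small back-of-the-envelope sketch. The function name and the `chain_rate` parameter are my own for illustration; the second call assumes Hive keeps producing one block every 3 seconds while the node is syncing, which makes the real catch-up time a bit longer than the simple division suggests.

```python
def catch_up_seconds(blocks_behind: float, fetch_rate: float, chain_rate: float = 0.0) -> float:
    """Seconds needed to close a gap of `blocks_behind` blocks.

    fetch_rate: blocks per second the streamer processes.
    chain_rate: blocks per second the chain keeps producing meanwhile
                (assumption: 1/3 for Hive's 3-second block interval).
    """
    net_rate = fetch_rate - chain_rate
    if net_rate <= 0:
        raise ValueError("the streamer is not faster than the chain and never catches up")
    return blocks_behind / net_rate


SECONDS_PER_DAY = 86_400

# The figure from the post: 12 million blocks at 2 blocks/sec, chain growth ignored.
naive = catch_up_seconds(12_000_000, 2.0)
# Same gap, but the chain keeps adding one block every 3 seconds.
with_growth = catch_up_seconds(12_000_000, 2.0, chain_rate=1 / 3)

print(f"ignoring new blocks:  {naive / SECONDS_PER_DAY:.1f} days")
print(f"counting new blocks: {with_growth / SECONDS_PER_DAY:.1f} days")
```

Either way the answer is measured in months, which is why the streaming speed is worth a closer look.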
The node process on the Linux host was not taking more than 20% of the CPU, while MongoDB was using less than 5%. This led me to think that the bottleneck was at the public API node that my Hive-Engine node streamed from.
The Challenge - How To Speed Up Blockchain Streaming
The Naive Approach
I knew there were several public API nodes - 14 to be precise, as per the beempy config. I thought it would be a good idea to load-balance over all of them, so I started a simple HAProxy (just another Docker image) with one backend and 14 servers in it. This worked perfectly, spreading incoming requests over all public nodes in a round-robin fashion, so I set it up as the stream node for my hive-engine. Believe it or not, this had no effect - still only 2 blocks per second, even with 14 nodes behind the load balancer!
Start Thinking
I guess the streamer needs the blocks in the exact order in which they are generated (that's why it's called a blockchain, right?), so it requests them one at a time and gets no benefit from load balancing. In fact, the load balancer decreases performance compared to using only the fastest node. For example, the response time for api.openhive.network is 0.365s:
time curl -s --data '{"id": 1, "jsonrpc": "2.0", "method": "call", "params": ["condenser_api", "get_block", [8675309]]}' https://api.openhive.network > /dev/null
real 0m0.365s
user 0m0.015s
sys 0m0.046s
while anyx.io is more than 2 times slower:
time curl -s --data '{"id": 1, "jsonrpc": "2.0", "method": "call", "params": ["condenser_api", "get_block", [8675309]]}' https://anyx.io > /dev/null
real 0m0.822s
user 0m0.025s
sys 0m0.064s
If I load balance these 2 I'll get an average of (0.82 + 0.37)/2 ≈ 0.59s, which is much slower than using only the fastest one.
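The same arithmetic can be turned into a rough throughput estimate. This is only a sketch under a simplifying assumption: the client sends one request at a time, waits for the answer, and the load balancer rotates evenly over the nodes. The two latencies are the unrounded `real` times from the curl runs above; the function name is made up for illustration.

```python
from statistics import mean


def sequential_rate(latencies_s) -> float:
    """Blocks per second for a client that waits for each response before
    sending the next request, with requests spread evenly over the nodes."""
    return 1.0 / mean(list(latencies_s))


# "real" times measured with curl above, in seconds.
measured = {"api.openhive.network": 0.365, "anyx.io": 0.822}

fastest_only = sequential_rate([min(measured.values())])
round_robin = sequential_rate(measured.values())

print(f"fastest node only: {fastest_only:.2f} blocks/sec")
print(f"round-robin over both: {round_robin:.2f} blocks/sec")
```

With one request in flight at a time, adding slower nodes to the rotation can only pull the average down, which matches what the streamer showed behind HAProxy.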
I was seriously discouraged by these timings, because if a single block is retrieved in 0.36 seconds at best, then the 2 blocks per second processed by the streamer seems close to the upper limit. However, as I already had a load balancer, I decided to waste some more time and check the maximum speed at which blocks could be retrieved from the public nodes.
And the results gave me
A New Hope!
First I wrote a simple Python script that sends synchronous requests to the HAProxy using the widely used requests module. As expected, it took 7-8 minutes to fetch 1000 blocks this way. To get the most out of the load balancer I would need asynchronous requests. So I reworked my script to use aiohttp and hammer the HAProxy with 10 requests at once. The results were amazing - less than 2 minutes for 1000 blocks, which is close to 10 blocks per second and much faster than what the streamer does. Considering that HAProxy was running on an OrangePi board and was occasionally choking during the test due to a low maxfiles limit, I would say there is room for improvement after all.
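For readers who want to try something similar, here is a minimal sketch of an async fetcher with 10 requests in flight. It is not the actual script from the test: the function names are hypothetical, the JSON-RPC body simply mirrors the curl calls earlier in the post, and the URL you pass in (for example a local HAProxy address) is up to you. aiohttp is a third-party package and is imported only inside the HTTP helper.

```python
import asyncio


def get_block_payload(block_num: int) -> dict:
    """JSON-RPC body with the same shape as the curl examples in this post."""
    return {
        "id": block_num,
        "jsonrpc": "2.0",
        "method": "call",
        "params": ["condenser_api", "get_block", [block_num]],
    }


async def fetch_range(fetch_one, start: int, count: int, concurrency: int = 10) -> list:
    """Fetch blocks start..start+count-1 with at most `concurrency` requests
    in flight. Results come back in block order, whatever the arrival order."""
    limiter = asyncio.Semaphore(concurrency)

    async def guarded(num: int):
        async with limiter:
            return await fetch_one(num)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(guarded(n) for n in range(start, start + count)))


async def fetch_range_http(url: str, start: int, count: int, concurrency: int = 10) -> list:
    """Same as fetch_range, but each block is fetched with an HTTP POST to `url`."""
    import aiohttp  # third-party; kept local so the helpers above work without it

    async with aiohttp.ClientSession() as session:

        async def fetch_one(num: int):
            async with session.post(url, json=get_block_payload(num)) as resp:
                resp.raise_for_status()
                # content_type=None: do not insist on a specific response header
                return (await resp.json(content_type=None))["result"]

        return await fetch_range(fetch_one, start, count, concurrency)
```

A run would look like `asyncio.run(fetch_range_http("http://<your-haproxy>:<port>", 8675309, 1000))`, with the placeholder replaced by your own load balancer address. Keeping the concurrency limit in a separate function makes it easy to try values other than 10 without touching the HTTP part.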
Conclusion And Further Steps
So far I have made the important discovery that fetching blocks asynchronously from multiple nodes is faster than fetching them synchronously from a single node. It would be easy to check how this compares to async fetching from a single node, but I'm afraid I might get banned for sending too many requests. The more interesting question is why existing libraries like dsteem or beem do not implement pre-fetching. There are certain cases where it could be beneficial - like when scanning a range of existing blocks for specific transactions.
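To make the pre-fetching idea concrete, here is a rough sketch of a read-ahead iterator. It is my own illustration and not how dsteem or beem work: `read_ahead` and its `window` parameter are hypothetical names, and `fetch_one` stands for any coroutine that returns one block by number. The consumer still receives blocks strictly in order, but while it processes one block the next few are already being fetched.

```python
import asyncio
from collections import deque


async def read_ahead(fetch_one, start: int, stop: int, window: int = 10):
    """Yield blocks start..stop-1 in order while keeping up to `window`
    fetches in flight ahead of the consumer."""
    pending = deque()
    next_num = start
    try:
        while next_num < stop or pending:
            # Top up the window before waiting for the oldest request.
            while next_num < stop and len(pending) < window:
                pending.append(asyncio.ensure_future(fetch_one(next_num)))
                next_num += 1
            # Always hand out the oldest block first, so order is preserved
            # even when later requests finish earlier.
            yield await pending.popleft()
    finally:
        # If the consumer stops early, drop the requests still in flight.
        for task in pending:
            task.cancel()
```

Used as `async for block in read_ahead(fetch_one, first, last + 1): ...`, this would fit the "scan a range of existing blocks" case. It does nothing for blocks that do not exist yet, so it only helps while catching up.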
I realize that implementing a cache is easier said than done. The official caching (write-behind) proxy - Jussi - does not seem simple at all. Yet I'm tempted to try and write a read-ahead proxy. Without any experience as a developer my chances are not good, but it will be fun, at least for me.