
RE: Exploring Steem Scalability

in #steem · 6 years ago (edited)

I have been running a witness node since September 2017. I have never put the shared_memory file in RAM; it has all lived on a regular SATA SSD, and my RAM usage hovers under the 1 GB mark. I have never had any problems at all, with disk utilization around the 5% mark. Replays would probably take a while, but I have actually never replayed - not even once - since September 2017. It has just been rock solid. I've been playing around with an NVMe RAID0 server and experimenting with /dev/shm, but I can't think of a single reason to go the RAM route. A replay from scratch finished in three and a half hours, and as demonstrated above, replays are pretty rare. So I'll stick with NVMe SSDs for now, and Optane next. Hopefully, with all of the scalability improvements, I won't be using RAM for years to come.
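For anyone wanting to try both layouts: where the state file lives is just a runtime option, so switching between SSD and RAM-backed tmpfs is a one-flag change. A minimal sketch, assuming the shared-file-dir / shared-file-size chain options in recent steemd builds (the paths and sizes here are my own illustrative choices):

    # State file on an NVMe mount:
    ./steemd --data-dir=/data/steem \
             --shared-file-dir=/nvme/steem-state \
             --shared-file-size=64G

    # Same node, state file in RAM-backed tmpfs instead:
    ./steemd --data-dir=/data/steem \
             --shared-file-dir=/dev/shm \
             --shared-file-size=64G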

PS: I have noticed the shared_memory file is very compressible. I'd look forward to some compression tech built into steemd. Of course, we can use existing workarounds for now.


Thanks @liberosist - I really appreciate you sharing your real-world experience of running a witness node on this post.

You are also correct about the shared_memory file being fairly compressible. We actually run a service that compresses state files regularly; they are pulled in and uncompressed on startup. This makes it possible for us to autoscale steemd instances on-demand.
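The pattern is roughly the following - a sketch only, with zstd, the bucket name, and the paths as placeholders rather than our production setup:

    # Periodic snapshot: stop the node, compress the state, upload it.
    systemctl stop steemd
    zstd -T0 -3 /data/steem/blockchain/shared_memory.bin \
         -o /tmp/shared_memory.bin.zst
    aws s3 cp /tmp/shared_memory.bin.zst s3://example-steem-state/
    systemctl start steemd

    # On a freshly provisioned instance: pull, decompress, then boot.
    aws s3 cp s3://example-steem-state/shared_memory.bin.zst /tmp/
    zstd -d /tmp/shared_memory.bin.zst \
         -o /data/steem/blockchain/shared_memory.bin
    steemd --data-dir=/data/steem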

Ah, so I would like to see that tech built into steemd itself, so the state stays compressed even while it's running. I know this will add CPU overhead, but with AMD's EPYC and Intel's Skylake-X response, we are headed into a world with more CPU cores than we know what to do with. I know some have experimented with zram on Linux, and it works fine. An anecdote: back when steemd ran on Windows, the built-in RAM compression in Windows 10 kicked in. Back then, everything was in RAM (no shared_memory file) and a full node typically used 15 GB; under Windows 10 that dropped to only 3 GB, the CPU overhead was minimal, and it ran flawlessly. Very different times now, of course, but I'd look forward to seeing compression tech built into steemd.
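For the zram route, the setup is small enough to show - a sketch using util-linux's zramctl, where the size and compression algorithm are illustrative choices:

    # Create a compressed RAM-backed block device and mount it:
    modprobe zram
    zramctl --find --size 32G --algorithm lz4   # prints e.g. /dev/zram0
    mkfs.ext4 /dev/zram0
    mkdir -p /mnt/zram
    mount /dev/zram0 /mnt/zram
    # Then point steemd's shared-file-dir at /mnt/zram.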

RocksDB has built-in compression. It will store all data on disk, compressed and in a machine-independent format.

Great to hear! Looking forward to the transition from Chainbase to RocksDB.

@Vandeberg

On the STEEM master branch, I did:

    $ git log | grep Vand | wc -l
    1801

Appreciate your hard work. I keep seeing your commits!

I wanted to check a few details:

  1. So RocksDB will keep the historical data and provide fast on-demand access?

  2. Only the new transactions need to be in memory / on the blockchain, so we could get something similar to the Nano blockchain?!

  3. If we achieve point 2, then we may not need multi-threading for the near future?

  4. Now, if points 2 and 3 are true, STEEM can be a general-purpose blockchain on steroids and can easily compete with special-purpose chains like Hyperledger?!

  5. On Hivemind, I observed that the PostgreSQL + communities logic is confined to a single box. So, right now, horizontal scaling is not planned? I.e., what happens when PostgreSQL hits its limits and the logic needs more CPU/RAM than a single node can provide? (In a nutshell: the current implementation can scale vertically up to some point, and later, at facebook-scale, we would need to add support for horizontal scaling.)

@Vandeberg

Sorry for re-bumping this old comment thread, but I didn't know how else to get in touch with you! I was referred to you by Julián González.

I'm José Macedo, senior analyst at AmaZix (https://www.amazix.com/ // https://www.linkedin.com/in/ze-macedo-15b1b175/). In case you haven't heard of us, we're one of the largest community management, consulting and advisory companies in the crypto space, partnered with Bancor, HDAC, Bankex and many others. We've worked with over 100 projects and now have over 100 employees.

We're now working on a forum project with a token to incentivise quality content/curation. We really like the STEEM inflation algorithm for this and are very interested in building on top of STEEM. Our ideal scenario would be to use the SMT protocol, but we realise launch is scheduled for January and we simply cannot wait that long to launch.

We’re curious if we’d be able to chat to someone from the SMT team to discuss our options in the meanwhile. Currently, we’re leaning towards issuing an ERC-20 and then switching to SMT’s once they launch, but we’d love to find a way to build on STEEM from the get-go.

Let us know if you have some time to talk and discuss our use case.

Thanks,

José

@justinw - I read your other comments. Keep up the good work.

I have a few questions / doubts. You wrote:

"This makes it possible for us to autoscale steemd instances on-demand."

i.e., you are compressing the block_log files and decompressing them for auto-scaling?

  1. Are you just using xz/zip/bzip/lzma, or something else? (See the comparison sketch below.)

  2. Curious about the auto-scaling as well - do you use something like ELB from AWS, or hardware (an F5)?

Thanks in advance.
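In the meantime, here's how I'd compare candidate compressors on a state snapshot myself - a sketch where the path and the flag choices are just examples, and ratios will obviously vary with the data:

    # Compare a few compressors on a copy of the state file (bash).
    f=/data/steem/blockchain/shared_memory.bin   # example path
    for cmd in "gzip -6" "xz -T0 -6" "zstd -T0 -3"; do
        echo "== $cmd =="
        time $cmd --stdout "$f" | wc -c   # compressed size in bytes
    done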

How did you update steemd?
There was a patch at least at the end of December / in January.
For that - recompiling and restarting the steemd process - you should have needed a replay.

The update did not require a replay. Just a rebuild and restart.
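For reference, the no-replay upgrade flow is just that - a sketch below, where the tag name is hypothetical and the paths and service name are assumptions:

    cd ~/src/steem
    git fetch && git checkout v0.20.x       # hypothetical release tag
    git submodule update --init --recursive
    mkdir -p build && cd build
    cmake -DCMAKE_BUILD_TYPE=Release ..
    make -j"$(nproc)" steemd
    systemctl restart steemd                # no --replay-blockchain needed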