Proposal: Funding for anyx.io API Infrastructure Recurrent Costs

in #sps · 5 years ago (edited)

TL;DR: This proposal seeks to help reimburse the recurrent costs of the public, free-to-use anyx.io Steem API infrastructure. If you use sites or services such as Busy.org, Splinterlands (SteemMonsters), Partiko, or many more — these services rely on this infrastructure for both uptime and performance.

Motivation

API services play a crucial role in the Steem ecosystem. As seen recently with the previous hard forks, it doesn’t matter much that the chain is live if there is no API node through which you can use it.

All software and services that interact with the Steem chain do require an API node. However, with heavy reliance on Steemit Inc’s nodes, we can see excessive downtime when things go wrong. It does not make for a very decentralized chain to have only one single point of failure and only one good option for public API access -- this is the antithesis of Blockchain ideology.

While others do provide public API nodes, many are configured without all available plugins, do not support high throughput, do not offer good uptime, or are otherwise held back. However, the anyx.io node has already proven itself to be robust, and has recently expanded throughput capabilities.

Timeline

This proposal is set for one year, though the intent will be to continue operation as long as possible. By having a renewal proposal in the future, I can re-evaluate expansion or contraction if needed.

Funding Rationale

The recurrent costs for the infrastructure (namely datacenter hosting) total approximately $450 per month, or about $15 per day. Notably, this proposal does not aim to cover the already-spent hardware costs, which have exceeded $30,000.
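
As a quick sanity check on these figures, the arithmetic can be sketched as follows. The 30-day month and the 365-day year are assumptions made for illustration; the post itself only gives the monthly and daily amounts.

```python
# Rough check of the recurrent-cost figures quoted in the proposal.
MONTHLY_COST_USD = 450   # from the proposal
DAYS_PER_MONTH = 30      # assumption: a 30-day month
DAYS_PER_YEAR = 365      # assumption: a non-leap year

daily_cost = MONTHLY_COST_USD / DAYS_PER_MONTH
yearly_cost = MONTHLY_COST_USD * 12
daily_cost_over_year = yearly_cost / DAYS_PER_YEAR

print(f"daily (monthly / 30):  ${daily_cost:.2f}")
print(f"yearly (monthly * 12): ${yearly_cost:,}")
print(f"daily (yearly / 365):  ${daily_cost_over_year:.2f}")
```

Either way of spreading the monthly figure over days lands close to the "about $15 per day" stated above.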

The Software Configuration

The current stack in the infrastructure is:
  • 2x Full steemd instances
  • 3x Light steemd instances
  • 3x Hivemind instances
  • Supplementary custom API instances (e.g. Vessel wallet support)

The Hardware Infrastructure

The current stack of hardware is as follows:

2x “Heavy”:

  • 512GB DDR4 RAM
  • 8-16 core Xeon
  • 1-2 NVME drives
  • 1 OPTANE drive
  • 1-4 SSD drives
  • 1 Gigabit public ethernet

3x “Light”:

  • 64GB DDR4 RAM
  • 4-8 core Xeon or i7
  • 1-2 NVME drives
  • 1 Gigabit public ethernet

Configuration Philosophy

Hosted hardware is owned, not rented. Sure, additional software services (reverse proxy & ddos protection) are "cloud", but these are agile around the back-end owned hardware.

The nodes were custom built with Steem APIs in mind, ensuring a balance of high-frequency cores for single-threaded tasks (e.g., validating transactions), with sufficient core count to handle large throughput. Drives are high-end NVME or Optane to ensure low latency for each request, with sufficient storage available for elements like account history and communities data (hivemind). Due to the high volume of fast storage, this is one of the few nodes that actually offers account history (including get_transaction) support in full.
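
For readers who have not queried account history before, here is a minimal sketch of what such requests could look like over JSON-RPC. The method names (`condenser_api.get_account_history`, `condenser_api.get_transaction`) and their positional parameters are my assumptions about the Steem API and are not taken from this post, and the helper function names are hypothetical. Only `send` touches the network; the rest just builds request bodies.

```python
import json
import urllib.request

ENDPOINT = "https://anyx.io"  # the public endpoint named in this proposal


def build_request(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body as a plain dict."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def account_history_request(account, start=-1, limit=10):
    # Assumed method name and params: [account, start, limit],
    # where start=-1 asks for the most recent operations.
    return build_request("condenser_api.get_account_history", [account, start, limit])


def transaction_request(trx_id):
    # Assumed method name; the transaction id is the only parameter.
    return build_request("condenser_api.get_transaction", [trx_id])


def send(payload, endpoint=ENDPOINT, timeout=10):
    """POST the payload and return the decoded JSON response (needs network)."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Print the request body without sending it.
print(json.dumps(account_history_request("anyx"), indent=2))
```

Calling `send(account_history_request("anyx"))` would submit the request; whether a given node answers it depends on which plugins that node has enabled.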

While originally starting with one "Heavy" node, a second was purchased and installed to ensure backup and redundancy is available. This enables stronger uptime guarantees, as even during crash or failure of one, requests can still be served while recovery takes place. In addition, this enables asynchronous backups that do not interrupt service.
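
To illustrate the idea of serving requests from one node while the other recovers, here is a small failover sketch. It is not how the anyx.io reverse proxy is actually configured; the backend labels `heavy-1` and `heavy-2`, the `transport` callable, and the function names are all hypothetical.

```python
BACKENDS = ["heavy-1", "heavy-2"]  # hypothetical labels for the two "Heavy" nodes


def call_with_failover(payload, transport, backends=BACKENDS):
    """Try each backend in order; return (backend, response) from the first success.

    `transport(backend, payload)` is a hypothetical callable that performs the
    request and raises an exception on failure.
    """
    errors = []
    for backend in backends:
        try:
            return backend, transport(backend, payload)
        except Exception as exc:  # broad on purpose: any failure moves us on
            errors.append((backend, exc))
    raise RuntimeError(f"all backends failed: {errors}")


# Offline demonstration: a fake transport in which the first node is down.
def fake_transport(backend, payload):
    if backend == "heavy-1":
        raise ConnectionError("simulated crash")
    return {"id": payload["id"], "result": "ok"}


used, response = call_with_failover({"id": 1}, fake_transport)
print(used, response)
```

The same pattern works on the client side with a list of public endpoints instead of internal backends.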

Why not Witness pay?

I have previously been funding these infrastructure costs out of my witness pay. However, with the growth and desired scale-out to ensure satisfying the public demand, operational costs now exceed 25% of a current top-21 witness's pay. No other witness offers infrastructure at this level -- the only other entity that can handle heavy public requests is Steemit Inc.

This funding does not seek to reward me; it is targeted to fund what I believe to be a "public good". The SBD from this proposal will be sold to cover costs.

There have been many debates around whether or not API access should be privatized or offered by organizations to be paid for. Philosophically, I believe API access should be public and freely available -- enabling developers to quickly build and test applications and provide value to the ecosystem, without having to deal with infrastructure woes. The SPS system offers a good avenue to fund this "public good".

This proposal does not seek to reimburse my development time or offer myself a form of "salary" for keeping services running and up to date. I consider Witness pay responsible for this. This proposal is strictly for recurrent hosting costs of the infrastructure.

Qualifications

The anyx.io endpoint has proven itself to be robust to downtime and responsive to elastic public demand. Many services rely on this infrastructure as their primary API endpoint (such as Busy and Partiko), with many others using it heavily or giving users the option to use it (such as Splinterlands, Steemconnect, Keychain, Steempeak, Steemworld, Beem, and many more).

Since the previous hardfork, I have started light logging of success metrics. Over the past 8 days, here are some interesting statistics:

  • 280,307,849 Total Requests (Approx. 400 Requests per Second Average)
  • 176,856 Unique IP Addresses
  • 0 Server Errors
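
The per-second average can be reproduced from the totals above. This sketch assumes the window is exactly 8 full days, which the post does not state precisely; the per-IP average is an extra figure derived from the same numbers.

```python
TOTAL_REQUESTS = 280_307_849  # from the post
UNIQUE_IPS = 176_856          # from the post
DAYS = 8                      # from the post; assumed to be 8 full days

seconds = DAYS * 24 * 60 * 60
requests_per_second = TOTAL_REQUESTS / seconds
requests_per_ip = TOTAL_REQUESTS / UNIQUE_IPS

print(f"{requests_per_second:.1f} requests per second on average")
print(f"{requests_per_ip:.0f} requests per unique IP on average")
```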

Independent testing has shown that the anyx.io infrastructure meets or exceeds Steemit Inc's own provided throughput and latency for API requests. The above metrics, while good, come nowhere near saturating the available performance.

Supplementary Reading

Original announcement:
https://steemit.com/steem/@anyx/announcing-https-anyx-io-a-public-high-performance-full-api

Previous upgrades:
https://steemit.com/steem/@anyx/updates-to-anyx-io-infrastructure-including-hivemind-support
https://steemit.com/steem/@anyx/notice-of-upcoming-changes-to-anyx-io-api

Relevant API Development:
https://steemit.com/steem/@anyx/designing-a-restful-steem-api

Learn more about me from my Witness Application:
https://steemit.com/witness/@anyx/updated-witness-application

Consider Voting for this Proposal here:

https://steemitwallet.com/proposals
https://steempeak.com/me/proposals

I will support your proposal. Your nodes are the most reliable and fastest out there. I've never experienced any issues with missing APIs or long lasting downtimes. As long as I cannot run my own node(s) with all required APIs enabled, your service is a must have for me.

Philosophically, I believe API access should be public and freely available -- enabling developers to quickly build and test applications and provide value to the ecosystem, without having to deal with infrastructure woes

I agree with this with respect to developers for experimentation and initial development (what you call "building and testing"), but once something goes into deployment and significant usage, I believe projects and businesses should run their own nodes or at least pay their own way.

I don't support the public good model of running big expensive nodes which IMO actually hurt the ecosystem. The blockchain itself is a public good, but when it comes to accessing it, many businesses and projects could easily run their own nodes with the result being a more robust decentralized network. They don't and won't do this if they get free service.

I feel the same way about Steemit's nodes. They should be closed down or have access limited to development purposes.

I would 100% support a smaller proposal that is narrowly focused on providing limited-capacity API support to individuals and developers for their own use.

The hyper anarcho-capitalist mentality that often comes along to the crypto space seems to greatly take for granted the availability of public nodes and doesn't realize they are public goods. Then they download free wallets and expect them to work -- and since it works, no one questions how it works. Yet, these wallets are often reliant on some foundation or altruism. Cryptos often die when this support ends.

Should successful businesses run their own public nodes as well? Yes, absolutely, there should be more public nodes. Having decentralization means that services do not have a single point of failure. Though if you believe that private nodes for private applications means decentralization, I think you have the wrong model in mind -- this is not decentralization, this is a single point of failure (and a position where the application can censor its users however they like). For an application to actually be decentralized, you need to be able to use it in a permissionless way.

The unfortunate truth about steemd nodes is that the setup is dumb and the API is shit. Asking a business to begin by doing this and detract their focus away from what they are good at is a waste of resources (until they have resources to spare). So I think one would be misguided to think that startup businesses and projects "won't run their own nodes" if they get public service. It's more simple than that -- many businesses and projects "won't run, period" if there aren't public services.

The blockchain itself is indeed a public good -- but this means both read and write access should be. Consider a wallet as an application. How should the wallet read and write to the chain? Well, we could envision a model where the wallet is for-profit (say, purchased on the app store) and the owner is a business that also runs nodes for their wallet users. But, this means that the wallet is reliant on a single point of failure -- this business and their nodes -- for it to continue to work. This is not a decentralized application. This is not a public good. If all access to a public good requires going through privatized routes, is it still a public good?

A decentralized application is open-source and open-permission. Sure, it's something you can run your own node for should you choose -- and for purist decentralization everyone would indeed run their own node. But having a public API means you don't have to, and having multiple public APIs means you can even have trustless decentralization available to you -- you are not reliant on any one entity to use your wallet.

In my mind, a successful version of Steem has all witnesses offering excellent API nodes to the public for their use. Witnesses are not only the decentralized authority for putting information into the chain, they can also be the decentralized authority for providing information to the users, removing the need for all users to run their own nodes.

The hyper anarcho-capitalist mentality

I just disagree this is ideological. I see it as mostly practical:

  1. Centralized node services introduce a central point of failure due to natural downtime, DoS (of various kinds, not only the dumb flooding DDoS that is most common), and other reasons. If significant businesses, of which there are several in the Steem ecosystem, ran their own nodes, which they can easily do, far less of the ecosystem would go offline just because one or a few node operators have "an issue".
  2. Free-for-the-user services encourage inefficiency and raise overall cost. There is no telling how many inefficient spam bots and other low-value services are using the Steemit public APIs simply because they are free. If they had to pay for a service (public but paid-for), or pay to run their own nodes, they would either improve their code to be more efficient or stop running it. There is no incentive to do either when the costs can be shifted to someone else, who then faces an escalating cost structure to continue to provide service at an acceptable quality level.

If Steemit wants to offer free node service, arguably on their own dime (though in reality subsidized by stakeholders accepting the ninja-mine), I can speak out about it being a harmful and broken model, which I have. If someone else wants to do it, and solicit SPS funds, I can still speak out about it being a broken model but also oppose the funding that would go to subsidize it.

As far as applications like wallets, the place where the investment should go is not more big nodes to become a point of failure and inefficiency, it is lighter consensus models which allow wallets to sync to the chain without downloading all of it, with a modest, but generally acceptable, increase in the degree of trust required (rough equivalent of SPV on Bitcoin). In the case of DPoS this means delivering a chain of witness changes. This doesn't exist afaik. It has been discussed for years going back to Bitshares, but nobody takes on developing it, and instead people just keep using API nodes with no decentralized validation at all.

and doesn't realize they are public goods

That's because they aren't.

API node service is not at all a public good in terms of actual economics. It is both excludable and rivalrous. (The blockchain itself, by contrast, is actually a public good.) Wishing something to be a public good does not make it so.

You disagree that it's ideological, then you give a point that ignores my last comment entirely, then another point that is subjective and not quantifiable. It sounds to me like you have ideology at heart here.

I'll repeat my point in response to #1 -- I believe many more public nodes are good. I think others, especially businesses (who have the skills), should follow along and build more infrastructure. I don't know how you've missed the fact that we align here.

For #2, sure, many developers aren't perfectly efficient because they are not immediately forced to optimize.
Yet you don't run a public node so I don't know how you can really comment on this being a problem; it's a subjective view and I have not seen any quantitative data to back up this being a problem. From my perspective, I haven't seen this as a problem on my nodes at all.

Lighter consensus models would be great. If you actually read the post I linked to, I talk about this -- but it requires 2/3rds of witnesses to provide APIs and signatures in order to be byzantine fault tolerant.

Finally, public good... in terms of actual economics, you are straight up wrong.

What is Rivalry? Can someone "use up" an API? No, it doesn't disappear after you use it. It's not like you can chop down all the trees so you can burn firewood and then I can't. I can use the API, you can use the API, anyone can use the API. All at the same time. It's just data, and data is replicable.

What is Excludability? Can you use it if I am using it -- absolutely, yes. It's parallelizable and effectively functions with renewable capacity. Can someone stop you from using it? Most probably not. Many public goods are like this, like roads or clean air. I can pollute the air or congest the roads, and make it shit for others, but that doesn't make it no longer a public good; it just degrades the quality.
Could it be excludable? Yes -- if a node were to perhaps ban certain users from using it. But this would be visible (and I don't do this). Notably though, someone else can't stop you from using it -- only I (the provider) could do so. Roads are like this too -- a government may prevent passage for various reasons.

Wishing something wasn't a public good does not make it so. It absolutely does qualify. It qualifies in the exact same way that the blockchain itself also qualifies -- service can be degraded by bad actors, but it does not prevent the fundamental properties of being non-rivalrous and non-excludable. If I were to tell you "A blockchain is not a public good", consider how you would argue against that. Do you see how the same logic applies to the data access as it does to the data entry?

p.s., you're fine to criticize and oppose. I just wish you understood the larger picture here.

We will mostly just have to agree to disagree about what is best.

However, on this point I must clarify:

Finally, public good... in terms of actual economics, you are straight up wrong

No I'm not.

Go look at what I wrote (I just went back and double checked). I did not say that "the API" is not a public good, I said that "API service" is not a public good.

The API (specification) itself is a public good of course. Anyone is free to use "the API", it can not be "used up", etc.

But a server providing API service (which incorporates bandwidth, processing, and storage) is absolutely not a public good. It is rivalrous in that people must compete to use its finite resources. It can absolutely be "used up", as you put it, in the sense that too much usage will degrade the quality of service to unacceptable levels. It is excludable in that access can be easily restricted (using, for example, API keys).

I'd suggest reading https://en.wikipedia.org/wiki/Public_good#Examples for a good discussion as to what is or is not a public good in economic terms. The best examples are intangibles which share more with the abstract concept of "an API" and less with physical API servers:

defence, public fireworks, lighthouses, clean air and other environmental goods, and information goods, such as official statistics, open-source software, authorship, and invention

If I were to tell you "A blockchain is not a public good",

I was mistaken. I was thinking of read access to a blockchain, which is a public good in the same sense as many of the above information goods. Write access to a blockchain is not a public good. Indeed any blockchain which tried to be such would be spam attacked (as Steem was being when it didn't implement sufficient measures of exclusion pre-RC) and fail.

The wiki page you link to suggests "... quasi-public goods because excludability is possible", which in my opinion fits the bill for both read and write access to blockchains. They can, in the same way as roads, be blocked to poor overall usage.

But you are absolutely wrong about rivalrousness. A server providing readable data is infinitely renewable. It cannot be used up because the information is not deleted after being provided. You are confusing this property with excludability. It is a question of quality of service (which falls under the excludability property).

In this case, I would correct my definition to suggest that public APIs and public Blockchains, like roads and public libraries, are both quasi-public goods. Could we agree on that?

But you are absolutely wrong about rivalrousness. A server providing readable data is infinitely renewable

Not infinitely, no. Only up to the point where usage of the server reaches capacity to provide an acceptable quality of service, and even before that point, performance will degrade with more users, though perhaps imperceptibly at first. Being able to sequentially renew (i.e. not "use up") is not enough, as you can see from the Wikipedia page on rival goods:

A hammer is a durable rival good. One person's use of the hammer presents a significant barrier to others who desire to use that hammer at the same time. However, the first user does not "use up" the hammer, meaning that some rival goods can still be shared through time.

Again, I'm not talking about the information. The information itself is unquestionably a public good. A particular server (such as those proposed to be paid for by this proposal) is not a public good. It has rivalrous physical resources such as processing power and bandwidth, and people compete to use those resources.

The reason that roads have historically been considered to be a quasi-public good is that it has been (and is still largely) impractical to infeasible to restrict access to most roads, making them effectively unexcludable. You aren't going to put up a toll booth on every corner, and if you try doing it on some and not others (apart from specific chokepoints such as bridges or certain limited-access highways), the traffic will just move around it, making the scheme again impractical. This is somewhat becoming less the case technologically (due to automated toll transponders, automatic license plate recognition, GPS, etc.) and we are indeed starting to see more methods of exclusion such as express lanes, urban congestion zones, usage-metered registration fees, etc., making them even less of a public (or more precisely nonexcludable) good.

Anyway, back to servers. It is very practical to restrict access to servers with API keys. As I said earlier, I would support an SPS proposal to pay for servers which are restricted to limited-capacity use for developers and perhaps other categories of cost-sensitive users where there is a clear strategic reason to do so. I think that is a better use of limited resources than subsidizing profitable businesses (which experience has shown are highly prone to use free servers if they are available), opening the general floodgates, and discouraging (via a carrot, at least) people who can easily do so from running their own nodes (which leads to at least unnecessary centralization and fragility).

Agree that at some point, the big projects should run their own nodes. Once they're successful and making profits, they should be able to afford nodes. Steemit's RPC has been good to the community, but at high cost to Steemit Inc. Those servers are more expensive than people realize.

Technology has evolved exponentially in the last two years! Plus, with MIRA, it's more affordable to run a node, albeit with a slower replay time. However, downtimes are part of the internet experience, it's something we already live with. Hopefully in the future Steemit will figure out a way to reduce long replay times, some ideas are in the air, so it's a matter of time before they get implemented.

I fully support this proposal. I know how precious @anyx's service is; Steemconnect relies heavily on @anyx's node. If you have to run a service that sends a lot of requests from your server to a node, you will quickly discover that most of the available nodes cannot handle it or have IP limitations, like api.steemit.com.

Hello dear friend @anyx.

I consider this proposal very valuable. All efforts to provide a better user experience in steem should be supported as it would allow for greater incorporation of new accounts.

280,307,849 Total Requests (Approx. 400 Requests per Second Average)

Amazing metrics!

Philosophically, I believe API access should be public and freely available.

This is a very noble position.
I hope your proposal achieves approval and you have the resources to continue offering these high quality services.

All best, Piotr.

Some weeks ago I witnessed how the anyx.io node was the only piece of Steem that worked during that halt of the blockchain. I wish you the best for this proposal and every project!

I've just voted. My first SPS vote :)

(for other proposers, not because i don't agree with your proposal, but just lazy to vote)

Sounds quite reasonable for what is being offered. It is shocking how much is reliant on Steemit Inc.

Thank you for everything you do. Have a great weekend.

How about a proposal to finish developing that yummy jussi replacement of yours?

I'll consider it. It's still a ways away from being finished, and I do not have much free time at the moment, so it's hard to quantify a proposal for it.

Perhaps if you team up with a couple devs it can speed it up.

This proposal has my support. A heavy node is too expensive for me, but I entertain the idea of running a lite node myself and may get around to doing it one day with some older hardware and a few tweaks.

Without your and Marky's full node, @steemflagrewards would be dead in the water. I don't think you should have to run a node at a loss so I fully support this proposal.

Furthermore, if there is a business out there with significant income, I think it is reasonable to pay a small premium to cover costs. Perhaps this can be based on amount of API calls relative to their income.

As for SFR, I wish there were more we could do, but we ourselves seem to be running on the fumes of altruism.

Without our moderators willing to put in the time to approve downvotes, we would have no service to offer users willing to flag token manipulators. I digress. I wish you the best with this proposal. ♥️

You're doing great and I love it. I try to fully support it no matter what... that’s actually amazing.

Posted using Partiko iOS

Amazing!
Thanks for your service! :bow:

I support this fully!

👍
~Smartsteem Curation Team

I use the anyx.io node for everything. It has a significantly lower ping time than all of the others. I can support this proposal.

How can I support your proposal? New to this proposal thing

Sorry we're a little late to the voting... but seeing as this is an important feature for https://steempeak.com/ and all of STEEM, we have voted for it. Hope it goes well and gets funded, and we look forward to seeing the potential changes.

Congrats on the funding threshold being reached (I hope it continues to be above the threshold) ... would love to hear an update on your plans. Have they changed since this September post? If so, it'd be a good time to let people know and hear about your changes or your renewed commitment, since it has been 4 months.

Thank you! I'm glad it's been hanging in there.

I set the end of the proposal for one year to somewhat do that -- give people a chance to renew their support.
Right now I don't have plans to change much; the hardware is good, and nowhere near saturation. There are some steemd software bugs that are a pain -- causing a few cases of degraded service or downtime -- but I've been trying to help track them down at least. Very few people have service at this scale so the bugs are not common!

I'll probably do the renewal proposal a few months out before the start, so people can voice feedback there too.

Will your API support communities, which are likely to be released by Steempeak and Steemit later this week?

Haven't heard from you in a really long time... is everything good?
Are you gonna be updating your API to support communities?

I'm around, and everything is good.. just busy. :)

I will definitely be updating my API nodes to support communities. We're still in beta and I haven't transitioned over my hivemind instances yet, but it's still on my radar. I have been waiting for a more stable release to avoid too much technical hassle in the transition.

Awesome. Can you ping me when you have an update on this? Thank you very much for your work 😉

Very fine)))

We support this proposal.

Fully support the proposal!