Struggles with HIVE API

in #hive, 4 years ago (edited)

A simple script that first gets all the accounts a given account (example: @felixxx) is following, then looks through those accounts' histories for vote operations from the last 7 days and prints them.

from beem.account import Account
import datetime

account = 'felixxx'

# Maps each author to the list of accounts that author follows.
author_following = {account: []}

a = Account(account)

for following in a.get_following():
    author_following[account].append(following)

time2 = datetime.datetime.now().timestamp()  # start of the whole run

def get_followers_votes(author_following, author):
    # Despite the name, this walks the accounts that `author` follows.
    stop = datetime.datetime.now() - datetime.timedelta(days=7)
    for follower in author_following[author]:
        time1 = datetime.datetime.now().timestamp()
        f = Account(follower)
        # only_ops takes a list of operation names
        for vote in f.history_reverse(stop=stop, only_ops=['vote']):
            print(vote)
        print(follower)
        # seconds spent on this one account
        print(datetime.datetime.now().timestamp() - time1)

get_followers_votes(author_following, account)

# total seconds for all followed accounts
print(datetime.datetime.now().timestamp() - time2)

Guess how long that takes?

... I am still waiting.
... and I actually wanted to compile a whole dictionary, for more than one account ...

I hope Hive 0.23.0 speeds this up, but I am afraid I will have to build my own DB and feed it block by block, or find a completely different method :/

However, I will test these numbers again after the hardfork and post about the changes.


Results are in:

1749.3964359760284 s

~ 30 minutes.

I ran your same code on Colab and got 865.2024440765381 s without a GPU
and then 1046.40714097023 s with one. Weird that it is worse off with the GPU.

The speed is mostly determined by the network latency (internet speed) - computing power on your end is almost irrelevant ...
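If most of the time really is spent waiting on the network, one thing that might help is overlapping the requests instead of running them one after another. The sketch below only simulates that idea: `fetch_votes` is a hypothetical stand-in that sleeps instead of calling an API node, and the worker count is an arbitrary assumption. Whether beem's `Account` objects can safely be used from several threads, and how many parallel requests a public node tolerates, is not something this sketch verifies.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_votes(account_name, latency=0.05):
    """Hypothetical stand-in for one slow request to an API node."""
    time.sleep(latency)       # simulates waiting on the network
    return account_name, []   # a real version would return the vote operations


def fetch_all(account_names, max_workers=8):
    """Run the per-account requests in a thread pool so the waits overlap."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch_votes, account_names))


if __name__ == "__main__":
    names = [f"account{i}" for i in range(16)]
    start = time.perf_counter()
    results = fetch_all(names)
    elapsed = time.perf_counter() - start
    print(len(results), "accounts fetched in", round(elapsed, 2), "s")
```

With 16 simulated accounts at 0.05 s each, the sequential version would need about 0.8 s, while the pooled version needs roughly two rounds of waiting. The total number of requests stays the same, so this only hides latency; it does not reduce the load on the node.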

(sorry for late reply - working away from home)

You're right. We are looking at the flow rather than the processing of a set of data.

I tried beempy a couple of months back too, and I found it slow as well.
I was using HiveSQL, but that costs 40 HBD and I think I will have to give it up soon.

I will probably have to feed my own DB and cache that data myself, or my script will be too slow :/
Lots of work, though ...

Did you know there is a get_following method in the API? You don’t have to parse the whole account history to get a list of followers or followees.

https://developers.hive.io/apidefinitions/#condenser_api.get_following
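For reference, here is a rough sketch of calling that method directly over JSON-RPC, without beem. The node URL and the helper names are assumptions made for illustration, and only the first page of results is requested; longer follow lists would need paging through the `start` parameter.

```python
import json
import urllib.request

API_NODE = "https://api.hive.blog"  # assumption: any public Hive API node


def build_get_following_request(account, start=None, follow_type="blog", limit=100):
    """Build the JSON-RPC payload for condenser_api.get_following."""
    return {
        "jsonrpc": "2.0",
        "method": "condenser_api.get_following",
        "params": [account, start, follow_type, limit],
        "id": 1,
    }


def extract_following(response):
    """Pull the followed account names out of a decoded JSON-RPC response."""
    return [entry["following"] for entry in response.get("result", [])]


def get_following(account, node=API_NODE):
    """Send one request to the node and return the followed account names."""
    payload = json.dumps(build_get_following_request(account)).encode("utf-8")
    request = urllib.request.Request(
        node, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_following(json.load(reply))
```

Calling `get_following("felixxx")` would make a single HTTP request. This covers only the first step of the script in the post; the account histories still have to be fetched separately.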

I am using that function.

Read again.

My bad, not used to Python. Strange, is it still slow nowadays?


Read the OP again, please.

Try to read more than the first 5 words; read the whole sentence:

A simple script that first gets all the accounts a given account (example: @felixxx) is following, then looks through those accounts' histories for vote operations from the last 7 days and prints them.

Right, gotcha now.

I wonder if HiveSQL wouldn't work better for this type of request. It's now free, so you might like to try it out.

At the end of the day, it is still a lot of HTTP calls ... They take time either way.

It would be much nicer to have a local DB (like SQLite) and only stream the blocks as they come. This is A LOT of work though :(
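To give an idea of what the local-DB approach could look like, here is a minimal sketch using Python's built-in `sqlite3`. The table layout and the function names are made up for illustration, and the field names (`type`, `voter`, `author`, `permlink`, `weight`, `timestamp`) are an assumption about how the operations arrive from a block stream; the part that actually streams blocks is left out.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the local vote cache."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS votes (
               voter     TEXT NOT NULL,
               author    TEXT NOT NULL,
               permlink  TEXT NOT NULL,
               weight    INTEGER NOT NULL,
               timestamp TEXT NOT NULL,
               PRIMARY KEY (voter, author, permlink, timestamp)
           )"""
    )
    return db


def store_votes(db, ops):
    """Insert the vote operations from `ops`; other operation types are skipped."""
    rows = [
        (op["voter"], op["author"], op["permlink"], op["weight"], op["timestamp"])
        for op in ops
        if op.get("type") == "vote"
    ]
    with db:  # commits on success, rolls back on error
        db.executemany("INSERT OR IGNORE INTO votes VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def votes_by(db, voters, since):
    """Return votes cast by any of `voters` at or after the ISO timestamp `since`."""
    if not voters:
        return []
    marks = ",".join("?" for _ in voters)
    query = (
        "SELECT voter, author, permlink, weight, timestamp FROM votes "
        f"WHERE voter IN ({marks}) AND timestamp >= ? ORDER BY timestamp"
    )
    return db.execute(query, [*voters, since]).fetchall()
```

Once the cache is filled, the "votes of everyone I follow, 7 days back" question becomes a single local query instead of one history request per followed account. The timestamps are stored as ISO 8601 strings, so a plain string comparison keeps them in chronological order.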