Btw, I slightly augmented the feature encoding in the meantime. Now the skew and kurtosis of the number of sentences as well as number of syllables are used as features, too. By looking at the feature importances, I realized that the variances of these values are often used by the forest regressor. Hence, I worked in the third and fourth moments as well.
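
To illustrate what the third and fourth moments add, here is a minimal sketch in plain Python. The function name `moment_features` and the sample data are hypothetical, and I am assuming the counts are collected per sentence or per paragraph within one post; the actual feature encoding may differ.

```python
def moment_features(values):
    """Return mean, variance, skewness and excess kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (population version, i.e. divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        # All values identical: skew and kurtosis are undefined, fall back to 0.
        return {"mean": mean, "var": 0.0, "skew": 0.0, "kurtosis": 0.0}
    return {
        "mean": mean,
        "var": m2,
        "skew": m3 / m2 ** 1.5,         # third standardized moment
        "kurtosis": m4 / m2 ** 2 - 3.0,  # fourth standardized moment minus 3
    }


# Hypothetical example: number of syllables in each sentence of one post.
syllables_per_sentence = [12, 15, 9, 30, 14, 11]
features = moment_features(syllables_per_sentence)
```

A single long outlier sentence (the 30 above) pushes the skew positive, which is information the variance alone does not carry.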
There is one more thing. I went through past top lists and a pattern emerges: most of the top 10 posts are about the Steemit platform or cryptocurrencies. This is, of course, not surprising. These are popular tags and, hence, just by posting in these categories you can already expect a higher payout than in other categories.
Maybe @trufflepig should balance the rewards out. It should look for good content, not so much for popular tags. So my idea is to slightly penalize posts with popular tags and promote posts with less popular tags by rescaling each post's payout by the average payout of its tag.
For example, let's say we have post A with tag `dog` being paid 10 SBD and post B with tag `cat` being paid 8 SBD. Moreover, on average `dog` yields 6 SBD per post, but `cat` only 4 SBD. Regardless of tag, a post is awarded 5 SBD on average. Next, we compensate for the popular and unpopular tag by dividing each payout by the ratio between the average payout per tag and the total average payout. For instance, post A's reward is rescaled to 10 * 5/6 = 8.3 SBD and post B's to 8 * 5/4 = 10 SBD.
I have two options to include this in the algorithm:
1. Directly rescale the rewards in the training set. Hence, `TrufflePig` would directly predict already rescaled rewards for new posts.
2. Keep predicting the original expected reward, but adjust the top list according to the rescaled rewards to promote posts with less popular tags in the daily truffle picks.
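
A rough sketch of the rescaling and of the second option, using the numbers from the dog/cat example. The names `rescale_payout` and `rank_by_rescaled` are made up for illustration, and I am assuming one tag per post (e.g. the main tag); this is not the code in the repository.

```python
def rescale_payout(payout, tag_avg, overall_avg):
    """Scale a payout by overall_avg / tag_avg to offset tag popularity."""
    if tag_avg <= 0:
        # No usable tag statistics: leave the payout unchanged.
        return payout
    return payout * overall_avg / tag_avg


def rank_by_rescaled(posts, tag_avgs, overall_avg):
    """Sort (name, tag, payout) tuples by rescaled payout, highest first."""
    scored = [
        (name, rescale_payout(payout, tag_avgs[tag], overall_avg))
        for name, tag, payout in posts
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Numbers from the example: average payout per tag and overall average.
tag_avgs = {"dog": 6.0, "cat": 4.0}
overall_avg = 5.0

# Post A (dog, 10 SBD) and post B (cat, 8 SBD).
posts = [("A", "dog", 10.0), ("B", "cat", 8.0)]
ranking = rank_by_rescaled(posts, tag_avgs, overall_avg)
```

Here post B ends up ahead of post A although its raw payout is lower. For the first option, the same `rescale_payout` would instead be applied to the training targets before fitting.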
What do you think about this?
Currently experimenting with version 2...
... and merged into master. Still, lots of posts about Steemit appear in the top list. But if the community loves posts about itself so much, I cannot help it ;-)