Hot content algorithm / score with time decay

Question

I have been reading and researching algorithms and formulas to work out a score for my user-submitted content, so that currently hot / trending items are displayed higher up the list. However, I'll admit I'm a little over my head here.

I'll give some background on what I'm after. Users upload audio to my site, and each audio can receive several actions:

  • Played
  • Downloaded
  • Liked
  • Favourited

Ideally I want an algorithm where I can update an audio's score each time a new activity is logged (play, download, etc.). The actions should also be weighted: a download is worth more than a play, a like more than a download, and a favourite more than a like.

If possible, I would like audios older than 1 week to drop off quite sharply from the list, to give newer content more of a chance of trending.

I have read about reddit's algorithm, which looked good, but I'm in over my head on how to tweak it to make use of my multiple variables and to drop off older items after around 7 days.

Some articles that were interesting:

  • http://amix.dk/blog/post/19588 (reddit's algorithm)
  • http://www.evanmiller.org/rank-hotness-with-newtons-law-of-cooling.html

Any help is appreciated!

Recommended answer

Basically you can use reddit's formula. Since your system only supports upvotes, you could weight them, resulting in something like this:

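from datetime import datetime, timedelta
from math import exp, log

def hotness(track):
    # Weighted sum of actions: play < download < like < favourite
    s = track.playedCount
    s = s + 2 * track.downloadCount
    s = s + 3 * track.likeCount
    s = s + 4 * track.favCount
    baseScore = log(max(s, 1))

    # Age in fractional weeks; track.uploaded is assumed to be a datetime
    now = datetime.now()
    timeDiff = (now - track.uploaded) / timedelta(weeks=1)

    # Past the first week, damp the score with a Gaussian-shaped factor
    if timeDiff > 1:
        x = timeDiff - 1
        baseScore = baseScore * exp(-8 * x * x)

    return baseScore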
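
To update the score each time an activity is logged, one option is to keep a running weighted total on the track and apply the decay when the list is read. The following is only a sketch under assumptions: the Track class, the WEIGHTS table, and the names record and hotness_at are hypothetical and simply mirror the 1/2/3/4 weighting used above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import exp, log

# Assumed weights, mirroring the 1/2/3/4 weighting in the answer.
WEIGHTS = {"play": 1, "download": 2, "like": 3, "favourite": 4}

@dataclass
class Track:
    uploaded: datetime
    weighted_sum: int = 0  # running total of weighted actions

    def record(self, action: str) -> None:
        """Add one logged action to the running weighted total."""
        self.weighted_sum += WEIGHTS[action]

def hotness_at(track: Track, now: datetime) -> float:
    """Log-scaled score, damped by exp(-8*x*x) once older than one week."""
    base = log(max(track.weighted_sum, 1))
    weeks = (now - track.uploaded) / timedelta(weeks=1)
    if weeks > 1:
        x = weeks - 1
        base *= exp(-8 * x * x)
    return base

uploaded = datetime(2024, 1, 1)
track = Track(uploaded)
for action in ["play", "play", "download", "like", "favourite"]:
    track.record(action)

fresh = hotness_at(track, uploaded + timedelta(days=3))
stale = hotness_at(track, uploaded + timedelta(days=14))
print(track.weighted_sum, fresh, stale)
```

Because the decay depends on the current time, the final score is probably best computed at read time (or refreshed periodically) rather than stored once when an action is logged.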

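The factor exp(-8*x*x) gives you the drop-off you want: it leaves the score untouched at exactly one week (x = 0) and shrinks it rapidly as x, the age beyond one week measured in weeks, grows.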
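
To get a feel for how sharp this drop-off is, the factor can be evaluated at a few ages. This is only an illustration of the formula above; the helper name decay_factor is made up for this sketch.

```python
from math import exp

def decay_factor(age_weeks: float) -> float:
    """Multiplier applied to the base score for a track of the given age."""
    if age_weeks <= 1:
        return 1.0  # no decay during the first week
    x = age_weeks - 1
    return exp(-8 * x * x)

for age in (0.5, 1.0, 1.25, 1.5, 2.0):
    print(f"{age:4.2f} weeks -> factor {decay_factor(age):.4f}")
```

Roughly 61% of the score remains at 1.25 weeks, about 14% at 1.5 weeks, and practically nothing at two weeks.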

Let's say that the probability of a given track being played in a given hour is 50%, downloaded 10%, liked 1% and favourited 0.1%. Then the following C++ program will give you an estimate of how your score behaves:

#include <algorithm>  // std::max
#include <iostream>
#include <fstream>
#include <random>
#include <ctime>
#include <cmath>

struct track{
    track() : uploadTime(0),playCount(0),downCount(0),likeCount(0),faveCount(0){}
    std::time_t uploadTime;    
    unsigned int playCount;
    unsigned int downCount;
    unsigned int likeCount;
    unsigned int faveCount;    
    void addPlay(unsigned int n = 1){ playCount += n;}
    void addDown(unsigned int n = 1){ downCount += n;}
    void addLike(unsigned int n = 1){ likeCount += n;}
    void addFave(unsigned int n = 1){ faveCount += n;}
    unsigned int baseScore(){
        return  playCount +
            2 * downCount +
            3 * likeCount +
            4 * faveCount;
    }
};

int main(){
    track test;
    const unsigned int dayLength = 24 * 3600;
    const unsigned int weekLength = dayLength * 7;    

    std::mt19937 gen(std::time(0));
    std::bernoulli_distribution playProb(0.5);
    std::bernoulli_distribution downProb(0.1);
    std::bernoulli_distribution likeProb(0.01);
    std::bernoulli_distribution faveProb(0.001);

    std::ofstream fakeRecord("fakeRecord.dat");
    std::ofstream fakeRecordDecay("fakeRecordDecay.dat");
    for(unsigned int i = 0; i < weekLength * 3; i += 3600){
        test.addPlay(playProb(gen));
        test.addDown(downProb(gen));
        test.addLike(likeProb(gen));
        test.addFave(faveProb(gen));    

        double baseScore = std::log(std::max<unsigned int>(1,test.baseScore()));
        double timePoint = static_cast<double>(i)/weekLength;        

        fakeRecord << timePoint << " " << baseScore << std::endl;
        if(timePoint > 1){
            double x = timePoint - 1;
            fakeRecordDecay << timePoint << " " << (baseScore * std::exp(-8*x*x)) << std::endl;
        }
        else
            fakeRecordDecay << timePoint << " " << baseScore << std::endl;
    }
    return 0;
}
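
As a rough cross-check on the simulation, the expected weighted score can also be worked out directly from the hourly probabilities, without random numbers. The sketch below assumes the same probabilities and weights as above; expected_score is a hypothetical helper, and taking the log of the expected sum only approximates the average of the simulated log scores.

```python
from math import exp, log

# Hourly probabilities and weights taken from the text above.
PROBS = {"play": 0.5, "download": 0.1, "like": 0.01, "favourite": 0.001}
WEIGHTS = {"play": 1, "download": 2, "like": 3, "favourite": 4}
HOURS_PER_WEEK = 7 * 24

def expected_score(hours: int) -> float:
    """Approximate decayed score after `hours`, using expected counts."""
    # Expected weighted gain per hour: 0.5 + 0.2 + 0.03 + 0.004
    per_hour = sum(PROBS[a] * WEIGHTS[a] for a in PROBS)
    base = log(max(per_hour * hours, 1))
    weeks = hours / HOURS_PER_WEEK
    if weeks > 1:
        x = weeks - 1
        base *= exp(-8 * x * x)
    return base

for hours in (24, 168, 252, 336):
    print(hours, round(expected_score(hours), 3))
```

The score should climb slowly (logarithmically) during the first week and then collapse toward zero during the second.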

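Result: [plot of fakeRecord.dat and fakeRecordDecay.dat, score against time in weeks]

This should be sufficient for you.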
