随着时间的推移,可能会获得 Alexa 信息或谷歌页面排名吗? [英] Possible to get alexa information or google page rankings over time?

查看:23
本文介绍了随着时间的推移,可能会获得 Alexa 信息或谷歌页面排名吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试访问历史谷歌页面排名或随着时间的推移在 alexa 排名中添加一些权重我正在寻找乐趣的搜索引擎.这将是一个单独的函数,我将在 Python 中调用(理想情况下)并传入 URL 的参数以及我想要获得平均值的时间(以天为单位),然后我可以使用该信息来衡量我的结果!

I am trying to access historical google page rankings or alexa rankings over time to add some weightings on a search engine I am making for fun. This would be a separate function that I would call in Python (ideally) and pass in the paramaters of the URL and how long I wanted to get the average over, measured in days and then I could just use that information to weight my results!

我认为工作可能会很有趣,但我也觉得这可能很容易通过某些大师可能会向我展示的 API 技巧来实现,并为我节省几个不眠不休的星期!有人可以帮忙吗?

I think it could be fun to work on, but I also feel that this may be easy to do with some trick of the APIs some guru might be able to show me and save me a few sleepless weeks! Can anyone help?

非常感谢!

推荐答案

如果您查看 堆栈溢出 您可以看到,在全球排名旁边,它提供了过去三个月网站排名的变化.这可能不是您想要的粒度级别,但是您可以相对轻松地抓取这些信息,我怀疑您是否会从查看不同时间长度的变化中获得很多额外的信息.长期的答案是自己收集和存储排名,以便您有历史记录.

If you look at the Alexa page for stack overflow you can see that next to the global rank it offers change of the site's rank over the past three months. This may not be down to the level of granularity that you would like, but you could scrape out this information relatively easily and I doubt that you would gain much additional information from looking at changes of different lengths of time. The long term answer is to collect and store the rankings yourself so that you have a historical record going forward.

更新:这是示例代码.

import mechanize
import cookielib
from BeautifulSoup import BeautifulSoup


def changerankscrapper(site):
    """
    Takes a site url, scrapes that site's Alexa page,
    and returns the site's global Alexa rank and the
    change in that rank over the past three months.
    """

    #Create Alexa URL
    url = "http://www.alexa.com/siteinfo/" + site

    #Get HTML
    cj = cookielib.CookieJar()
    mech = mechanize.OpenerFactory().build_opener(mechanize.HTTPCookieProcessor(cj))
    request = mechanize.Request(url)
    response = mech.open(request)
    html = response.read()

    #Parse HTML with BeautifulSoup
    soup = BeautifulSoup(html)

    globalrank = int(soup.find("strong", { "class" : "metricsUrl font-big2 valign" }).text)
    changerank = int(soup.find("span", { "class" : "change-wrapper change-up" }).text)


    return globalrank, changerank

#Example
site = "http://stackoverflow.com/"
globalrank, changerank = changerankscrapper(site)
print(globalrank)
print(changerank)

这篇关于随着时间的推移,可能会获得 Alexa 信息或谷歌页面排名吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆