Google Trends Quota Limit
I'm trying to pull data from Google Trends and got a "You have reached your daily limit" error after only 2 tries.
Is there any way to get around this? I know Google API projects have special quota limits, but Google Trends doesn't have an API. I also read that we may need to pass it a cookie file so that it looks like I'm logged in. Has anyone faced this issue before?
I'm struggling with the same issue! From your question I can't tell how far you have got, but here is the solution that I've found:
- You should emulate a browser with cookies. I think the best way to do this is to use the Mechanize library.
- First, your program should "log in": request "https://accounts.google.com/Login?hl=en" with a GET request and submit the login form that page returns.
- Immediately after that you can access some other personal resources, but not Google Trends!
- After a significant amount of time you can successfully get Google Trends data as CSV.
- I still have not discovered the exact time period, but it is more than 10 minutes and less than several hours :). That is why saving your cookies for later use is a good idea!
A few more tips:
- If you are developing in Python or Ruby under Windows, do not forget to set up a CA root certificate bundle for the OpenSSL library. Otherwise HTTPS connections will fail and you won't be able to log in! See "Getting the `certificate verify failed (OpenSSL::SSL::SSLError)` error with Mechanize object".
- I recommend saving cookies to an external file at program shutdown and restoring them at startup.
- Do not forget to allow redirects, because Google uses redirects all the time.
Ruby code example:
require 'mechanize'
require 'logger'

begin
  agent = Mechanize.new { |a|
    a.user_agent = 'Opera/9.80 (Windows NT 5.1) Presto/2.12.388 Version/12.16'
    # Point OpenSSL at a CA root bundle so HTTPS verification succeeds (needed on Windows)
    cert_store = OpenSSL::X509::Store.new
    cert_store.add_file 'cacert.pem'
    a.cert_store = cert_store
    a.log = Logger.new('mech.log')
    # Restore cookies saved by a previous run, if any
    if File.file?('mech.cookies')
      cookies = Mechanize::CookieJar.new
      cookies.load('mech.cookies')
      a.cookie_jar = cookies
    end
    a.open_timeout = 5
    a.read_timeout = 6
    a.keep_alive = true
    a.redirect_ok = true # Google redirects all the time
  }
  LOGIN_URL = "https://accounts.google.com/Login?hl=en&continue=http://www.google.com/trends/"
  login_page = agent.get(LOGIN_URL)
  login_form = login_page.forms.first
  login_form.Email = '*'  # placeholder: your Google account e-mail
  login_form.Passwd = '*' # placeholder: your password
  login_response_page = agent.submit(login_form)
  # url is not defined in this snippet: set it to the Trends page you want to fetch
  page = agent.get(url)
  # DO SOME TRENDS REQUESTS AFTER A SIGNIFICANT PERIOD OF TIME
ensure
  # Save cookies for later runs, even if a request above failed
  if agent
    agent.cookie_jar.save('mech.cookies')
  end
end
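The "do some Trends requests after a significant period of time" step could be sketched as a small retry loop. This is only an illustration, not part of Mechanize or of any Google API: the helper name `fetch_when_quota_clears`, the wait intervals, and the idea that the quota error shows up as text in the response body are all assumptions. The intervals are guesses based on the "more than 10 minutes, less than several hours" observation above.

```ruby
# Hypothetical helper: retries a fetch until the quota message disappears.
# Assumption: the quota error appears as plain text in the response body.
QUOTA_MESSAGE = 'You have reached your daily limit'

# Calls the given block, and while its result still contains the quota
# message, waits and calls it again.
#   waits:   seconds to pause before each retry (guessed values)
#   sleeper: injectable, so the loop can be exercised without real waiting
# Returns the first body without the quota message, or nil if every
# attempt was still blocked.
def fetch_when_quota_clears(waits: [600, 1800, 3600],
                            sleeper: ->(seconds) { sleep(seconds) })
  body = yield
  waits.each do |seconds|
    return body unless body.include?(QUOTA_MESSAGE)
    sleeper.call(seconds)
    body = yield
  end
  body.include?(QUOTA_MESSAGE) ? nil : body
end

# Possible use with the agent from the example above (url is still up to you):
#   csv = fetch_when_quota_clears { agent.get(url).body }
```

Because the cookies are saved in the `ensure` block, a later run can pick up the same session instead of logging in again, which should matter if the waiting period is tied to the session.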