How to skip known entries when syncing with Google Reader?


Problem description

For writing an offline client to the Google Reader service, I would like to know how best to sync with the service.

There doesn't seem to be official documentation yet, and the best source I have found so far is this: http://code.google.com/p/pyrfeed/wiki/GoogleReaderAPI

Now consider this: with the information from above I can download all unread items, I can specify how many items to download, and using the atom-id I can detect duplicate entries that I have already downloaded.

What's missing for me is a way to specify that I just want the updates since my last sync. I can say: give me the 10 latest entries (parameter n=10, parameter r=d). If I specify the parameter r=o (date ascending), then I can also specify the parameter ot=[last time of sync], but only then, and the ascending order doesn't make any sense when I just want to read some items rather than all of them.

Any idea how to solve this without downloading all items again and simply rejecting the duplicates? That is not a very economical way of polling.

Someone proposed that I could specify that I only want the unread entries. But for that solution to work in such a way that Google Reader will not offer these entries again, I would need to mark them as read. In turn, that would mean that I have to keep my own read/unread state on the client, and that the entries are already marked as read when the user logs on to the online version of Google Reader. That doesn't work for me.

Cheers, Mariano

Recommended answer

To get the latest entries, use the standard from-newest-date-descending download, which will start from the latest entries. You will receive a "continuation" token in the XML result, looking something like this:

<gr:continuation>CArhxxjRmNsC</gr:continuation>

Scan through the results, pulling out anything new to you. You should find that either all results are new, or everything up to a point is new, and all after that are already known to you.

In the latter case you're done, but in the former you need to find the new stuff that's older than what you've already retrieved. Do this by using the continuation to get the results starting from just after the last result in the set you just retrieved, passing it in the GET request as the c parameter, e.g.:

http://www.google.com/reader/atom/user/-/state/com.google/reading-list?c=CArhxxjRmNsC

Continue this way until you have everything.
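As a rough sketch (assuming Python, with a hypothetical fetch() helper doing the authenticated GET and a known_ids set standing in for your local atom-id database), the whole loop might look like this:

```python
# Sketch of the continuation loop described above. fetch() stands in for
# an authenticated GET (auth headers not shown) and known_ids for your
# local database of atom IDs -- both are assumptions, not API facts.
import urllib.request
import xml.etree.ElementTree as ET

BASE = "http://www.google.com/reader/atom/user/-/state/com.google/reading-list"
ATOM = "{http://www.w3.org/2005/Atom}"

def fetch(url):
    # Real code must attach the Google Reader auth headers here.
    with urllib.request.urlopen(url) as resp:
        return ET.fromstring(resp.read())

def sync_new_entries(known_ids, page_size=20):
    new_entries, continuation = [], None
    while True:
        url = f"{BASE}?n={page_size}"
        if continuation:
            url += f"&c={continuation}"
        root = fetch(url)
        continuation = None
        for elem in root:
            local = elem.tag.rsplit("}", 1)[-1]   # strip the XML namespace
            if local == "continuation":           # the gr:continuation token
                continuation = elem.text
            elif local == "entry":
                entry_id = elem.find(ATOM + "id").text
                if entry_id in known_ids:
                    return new_entries            # everything older is known
                new_entries.append(elem)
        if continuation is None:
            return new_entries                    # no more pages at all
```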

The n parameter, which is a count of the number of items to retrieve, works well with this, and you can change it as you go. If the frequency of checking is user-set, and thus could be very frequent or very rare, you can use an adaptive algorithm to reduce network traffic and your processing load. Initially request a small number of the latest entries, say five (add n=5 to the URL of your GET request). If all are new, in the next request, where you use the continuation, ask for a larger number, say, 20. If those are still all new, either the feed has a lot of updates or it's been a while, so continue on in groups of 100 or whatever.


However, and correct me if I'm wrong here, you also want to know, after you've downloaded an item, whether its state changes from "unread" to "read" due to the person reading it using the Google Reader interface.

One way to do this would be (a condensed sketch in code follows the list):

  1. Update the status on Google of any items that have been read locally.
  2. Check and save the unread count for the feed. (You want to do this before the next step, so that you guarantee that new items have not arrived between your download of the newest items and the time you check the read count.)
  3. Download the latest items.
  4. Calculate your read count, and compare that to Google's. If the feed has a higher read count than you calculated, you know that something's been read on Google.
  5. If something has been read on Google, start downloading read items and comparing them with your database of unread items. You'll find some items that Google says are read but your database claims are unread; update these. Continue doing so until you've found a number of such items equal to the difference between your read count and Google's, or until the downloads get unreasonable.
  6. If you didn't find all of the read items, c'est la vie; record the number remaining as an "unfound unread" total, which you also need to include in your next calculation of the local number you think are unread.
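Condensed into code, the six steps might look like the following sketch. Every helper name here (api.mark_read, db.expected_unread, and so on) is a hypothetical stand-in for the corresponding API call or local-database query:

```python
# A condensed sketch of the six-step reconciliation above. All helper
# names are hypothetical stand-ins, not real API calls.
SCAN_LIMIT = 1000  # give up after this many read items ("unreasonable")

def reconcile(db, api):
    # 1. Push locally-read items to Google first.
    for item_id in db.locally_read_ids():
        api.mark_read(item_id)

    # 2. Snapshot Google's unread count *before* fetching new items, so a
    #    fresh arrival can't be mistaken for a read event.
    google_unread = api.unread_count()

    # 3. Download the latest items (the continuation loop from earlier).
    db.store(api.download_latest())

    # 4. If Google reports fewer unread items than we expect, something
    #    was read through the Google Reader web interface.
    missing = db.expected_unread() - google_unread
    if missing <= 0:
        return

    # 5. Scan Google's read items, flipping matching local unread entries,
    #    until the difference is accounted for or the scan gets too long.
    scanned = 0
    for item_id in api.iter_read_item_ids():
        scanned += 1
        if db.is_unread(item_id):
            db.mark_read(item_id)
            missing -= 1
        if missing == 0 or scanned >= SCAN_LIMIT:
            break

    # 6. Whatever remains becomes the "unfound unread" total, which feeds
    #    into the next expected_unread() calculation.
    db.add_unfound_unread(missing)
```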

If the user subscribes to a lot of different blogs, it's also likely that he labels them extensively, so you can do this whole thing on a per-label basis rather than for the entire feed. That should help keep the amount of data down, since you won't need to do any transfers for labels where the user didn't read anything new on Google Reader.

This whole scheme can be applied to other statuses, such as starred or unstarred, as well.

Now, as you said, this

...would mean that I need to keep my own read/unread state on the client and that the entries are already marked as read when the user logs on to the online version of Google Reader. That doesn't work for me.

True enough. Neither keeping a local read/unread state (since you're keeping a database of all of the items anyway) nor marking items read on Google (which the API supports) seems very difficult, so why doesn't this work for you?


There is one further hitch, however: the user may mark something already read as unread on Google. This throws a bit of a wrench into the system. My suggestion there, if you really want to try to take care of this, is to assume that the user will generally be touching only more recent stuff, and to download the latest couple of hundred or so items every time, checking the status on all of them. (This isn't all that bad; downloading 100 items took me anywhere from 0.3s for 300KB to 2.5s for 2.5MB, albeit on a very fast broadband connection.)

Again, if the user has a large number of subscriptions, he's probably also got a reasonably large number of labels, so doing this on a per-label basis will speed things up. I'd suggest, actually, that you not only check on a per-label basis, but also spread out the checks, checking a single label each minute rather than everything once every twenty minutes. You can also do this "big check" for status changes on older items less often than the "new stuff" check, perhaps once every few hours, if you want to keep bandwidth down.

This is a bit of a bandwidth hog, mainly because you need to download the full article from Google merely to check its status. Unfortunately, I can't see any way around that in the API docs that we have available to us. My only real advice is to minimize the checking of status on non-new items.
