How To Get All Tweets on Hashtag using LinqToTwitter


Problem description


I'm trying to get all tweets (and count the total number of tweets) that belong to a hashtag. My function is below. How do I use maxID and sinceID to get all tweets? What should I use instead of "count"? I don't know.

    if (maxid != null)
    {
        var searchResponse =
            await
            (from search in ctx.Search
             where search.Type == SearchType.Search &&
                   search.Query == "#karne" &&
                   search.Count == Convert.ToInt32(count)
             select search)
            .SingleOrDefaultAsync();

        maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);

        foreach (var tweet in searchResponse.Statuses)
        {
            try
            {
                ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                tweetcount++;
            }
            catch { }
        }

        while (maxid != null && tweetcount < Convert.ToInt32(count))
        {
            maxid = Convert.ToString(searchResponse.SearchMetaData.MaxID);
            searchResponse =
                await
                (from search in ctx.Search
                 where search.Type == SearchType.Search &&
                       search.Query == "#karne" &&
                       search.Count == Convert.ToInt32(count) &&
                       search.MaxID == Convert.ToUInt64(maxid)
                 select search)
                .SingleOrDefaultAsync();

            foreach (var tweet in searchResponse.Statuses)
            {
                try
                {
                    ResultSearch.Add(new KeyValuePair<String, String>(tweet.ID.ToString(), tweet.Text));
                    tweetcount++;
                }
                catch { }
            }
        }
    }

Solution

Here's an example. Remember that MaxID applies to the current session: it prevents re-reading tweets you've already processed in this session. SinceID applies across sessions: it is the ID of the newest tweet you received for this search term in previous sessions, and it keeps you from re-reading tweets you've already processed. Essentially, you're creating a window where MaxID is the newest tweet to get next and SinceID is the tweet that you don't want to read past (the search only returns tweets with IDs greater than SinceID). On the first session for a given search term, set SinceID to 1 because you haven't processed any tweets yet. After the session, save the highest tweet ID you received as the new SinceID so that you don't accidentally re-read tweets.

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;
    using LinqToTwitter;

    static async Task DoPagedSearchAsync(TwitterContext twitterCtx)
    {
        const int MaxSearchEntriesToReturn = 100;

        string searchTerm = "twitter";

        // lower bound of the window: the id from a previous session that you
        // don't want to read past (1 on the first session)
        ulong sinceID = 1;

        // upper bound for the next page, set after each query in this session
        ulong maxID;

        var combinedSearchResults = new List<Status>();

        List<Status> searchResponse =
            await
            (from search in twitterCtx.Search
             where search.Type == SearchType.Search &&
                   search.Query == searchTerm &&
                   search.Count == MaxSearchEntriesToReturn &&
                   search.SinceID == sinceID
             select search.Statuses)
            .SingleOrDefaultAsync();

        ulong previousMaxID = ulong.MaxValue;

        // check for a null or empty page first, because Min() throws on an
        // empty sequence
        while (searchResponse != null && searchResponse.Any())
        {
            combinedSearchResults.AddRange(searchResponse);

            // one less than the lowest (oldest) id you've just queried
            maxID = searchResponse.Min(status => status.StatusID) - 1;

            Debug.Assert(maxID < previousMaxID);
            previousMaxID = maxID;

            searchResponse =
                await
                (from search in twitterCtx.Search
                 where search.Type == SearchType.Search &&
                       search.Query == searchTerm &&
                       search.Count == MaxSearchEntriesToReturn &&
                       search.MaxID == maxID &&
                       search.SinceID == sinceID
                 select search.Statuses)
                .SingleOrDefaultAsync();
        }

        combinedSearchResults.ForEach(tweet =>
            Console.WriteLine(
                "\n  User: {0} ({1})\n  Tweet: {2}",
                tweet.User.ScreenNameResponse,
                tweet.User.UserIDResponse,
                tweet.Text));
    }
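
The window arithmetic used by the loop can be looked at in isolation. The following is only a sketch: `PagingWindow`, `NextMaxId`, and `IsInWindow` are hypothetical names that are not part of LINQ to Twitter, and it assumes Twitter's documented semantics that a search returns tweets with IDs greater than SinceID and less than or equal to MaxID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of LINQ to Twitter.
public static class PagingWindow
{
    // Upper bound for the next page: one below the lowest (oldest) ID just
    // received, so the next query cannot return any tweet from this page again.
    // Throws InvalidOperationException if pageIds is empty.
    public static ulong NextMaxId(IEnumerable<ulong> pageIds)
    {
        return pageIds.Min() - 1;
    }

    // True when a tweet ID falls inside the window: newer than sinceId
    // (exclusive) and no newer than maxId (inclusive).
    public static bool IsInWindow(ulong id, ulong sinceId, ulong maxId)
    {
        return id > sinceId && id <= maxId;
    }
}
```

For a page containing the IDs 1050, 1042, and 1031, `NextMaxId` gives 1030, so the next page starts strictly below everything already read.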

This approach seems like a lot of code, but it gives you more control over the search. For example, you can examine tweets and decide how many times to query based on their contents (like CreatedAt). You can wrap the query in a try/catch block to catch HTTP 429 (rate limit exceeded) or failures on Twitter's side, so that you can remember where you were and resume. You could also monitor the TwitterContext RateLimit properties to see whether you're getting close to the limit and avoid the HTTP 429 exception ahead of time. Any other technique that blindly reads N tweets could force you to waste rate limit and make your application less scalable.
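
The "remember where you were and resume" idea might look roughly like the sketch below. Everything in it is an assumption made for illustration: `Tweet`, `RateLimitedException`, and `ResumablePager` are hypothetical types, and `fetchPage` stands in for the paged search query. In real code you would catch whatever exception your version of LINQ to Twitter raises for HTTP 429.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical types for illustration; none of them come from LINQ to Twitter.
public sealed class RateLimitedException : Exception { }

public sealed class Tweet
{
    public Tweet(ulong id, DateTime createdAt, string text)
    {
        Id = id; CreatedAt = createdAt; Text = text;
    }
    public ulong Id { get; }
    public DateTime CreatedAt { get; }
    public string Text { get; }
}

public static class ResumablePager
{
    // fetchPage(maxId) is assumed to return tweets with Id <= maxId, or to
    // throw RateLimitedException when the service answers with HTTP 429.
    public static async Task<(List<Tweet> Tweets, ulong ResumeMaxId, bool Finished)> ReadAsync(
        Func<ulong, Task<List<Tweet>>> fetchPage, ulong startMaxId, DateTime cutoff)
    {
        var results = new List<Tweet>();
        ulong maxId = startMaxId;

        while (true)
        {
            List<Tweet> page;
            try
            {
                page = await fetchPage(maxId);
            }
            catch (RateLimitedException)
            {
                // Hand back the position so the caller can resume from it later.
                return (results, maxId, false);
            }

            // An empty page means there is nothing older left to read.
            if (page.Count == 0)
                return (results, maxId, true);

            results.AddRange(page.Where(t => t.CreatedAt >= cutoff));

            // Stop early once a page reaches tweets older than the cutoff.
            if (page.Any(t => t.CreatedAt < cutoff))
                return (results, maxId, true);

            // Move the upper bound just below the oldest tweet in this page.
            maxId = page.Min(t => t.Id) - 1;
        }
    }
}
```

When `Finished` is false, the caller can wait until the rate limit resets and call `ReadAsync` again with `ResumeMaxId` as `startMaxId`, keeping the tweets collected so far.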

  • Tip: If you're saving tweets, remember to save SinceID for the given search term so that you don't re-read the same tweets the next time you search with that term.
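
One way to follow this tip is to keep one saved value per search term in a small file. This is a minimal sketch under stated assumptions: `SinceIdStore` and its file format are hypothetical, the value saved is the highest tweet ID processed so far (which matches Twitter's since_id semantics of returning only newer tweets), and search terms are assumed not to contain line breaks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: stores one SinceID per search term in a text file,
// one "id<TAB>term" entry per line.
public sealed class SinceIdStore
{
    private readonly string _path;
    private readonly Dictionary<string, ulong> _ids = new Dictionary<string, ulong>();

    public SinceIdStore(string path)
    {
        _path = path;
        if (!File.Exists(path)) return;

        foreach (var line in File.ReadAllLines(path))
        {
            // The ID comes first, so a tab inside the term is harmless.
            var parts = line.Split(new[] { '\t' }, 2);
            if (parts.Length == 2 && ulong.TryParse(parts[0], out var id))
                _ids[parts[1]] = id;
        }
    }

    // Returns 1 when the term has never been searched, matching the
    // first-session value used in the answer.
    public ulong Get(string searchTerm)
    {
        return _ids.TryGetValue(searchTerm, out var id) ? id : 1UL;
    }

    // Only ever moves the saved value forward, then rewrites the file.
    public void Save(string searchTerm, ulong newestIdSeen)
    {
        if (newestIdSeen <= Get(searchTerm)) return;
        _ids[searchTerm] = newestIdSeen;
        File.WriteAllLines(_path, _ids.Select(kv => kv.Value + "\t" + kv.Key));
    }
}
```

At the start of a session you would read `store.Get(searchTerm)` into `sinceID`, and after the session call `store.Save(searchTerm, highestIdReceived)`.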

For more info on the mechanics of this, read Working with Timelines in the Twitter docs.
