Link extractor without unrelated links

Views: 62 Published: 2019/6/21 2:28:54 C#


Problem description

I have the following code for a link extractor, which extracts all internal links for a given URL:

SearchEngines Search = SearchEngines.Google;
LinksExtractor extractor = new LinksExtractor("http://yahoo.com/",Search,10);
          
for (int i = 0; i < extractor.Links.Count; i++)
{
    Console.Write(extractor.Links[i].Href.ToString());
    //Console.ReadKey();
    Console.ReadLine();
}

This code gives me all the links inside yahoo.com, like
yahoo.com/sports
yahoo.com/business
but it also gives unwanted links. For example, if there is an advertisement on Yahoo for shadi.com, it returns shadi.com's link as well, which I don't want.
Please help.

Recommended answer

Is it that hard to ignore the links you don't want? For instance, anything that doesn't start with "http://yahoo.com/"?
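A minimal sketch of that prefix check, assuming the extracted links are available as plain strings. The `PrefixFilter.KeepWithPrefix` helper and the sample URLs are hypothetical and are not part of the LinksExtractor library used in the question.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixFilter
{
    // Hypothetical helper: keeps only the hrefs that start with the given prefix.
    public static List<string> KeepWithPrefix(IEnumerable<string> hrefs, string prefix)
    {
        var kept = new List<string>();
        foreach (string href in hrefs)
        {
            // Ordinal, case-insensitive comparison so "HTTP://Yahoo.com/" also matches.
            if (href != null && href.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                kept.Add(href);
            }
        }
        return kept;
    }
}

public static class PrefixFilterDemo
{
    public static void Main()
    {
        // Sample data standing in for extractor.Links (an assumption).
        var hrefs = new[]
        {
            "http://yahoo.com/sports",
            "http://yahoo.com/business",
            "http://shadi.com/"
        };

        foreach (string href in PrefixFilter.KeepWithPrefix(hrefs, "http://yahoo.com/"))
        {
            Console.WriteLine(href);
        }
    }
}
```

A plain prefix check is strict: it would also drop "https://" and "www." variants of the same site, so the prefix may need adjusting for real pages.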

I wonder if you can make use of Google's advanced filtering capabilities when creating your WebRequest?

For example, a Google search can be filtered to show only sites within Yahoo.com, and only sites in English.
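As a rough illustration of such a filtered request, the sketch below builds a search URL that uses Google's `site:` operator. The `GoogleQuery` class is hypothetical, and the `lr=lang_en` language parameter is an assumption that should be checked against Google's current documentation; how LinksExtractor itself builds its request is not shown in the question.

```csharp
using System;

public static class GoogleQuery
{
    // Builds a Google search URL restricted to one site via the "site:" operator.
    // The "lr=lang_en" language restriction is an assumption, not a confirmed API contract.
    public static string BuildSiteSearchUrl(string site, string terms)
    {
        string query = ("site:" + site + " " + terms).Trim();
        return "https://www.google.com/search?q=" + Uri.EscapeDataString(query) + "&lr=lang_en";
    }
}

public static class GoogleQueryDemo
{
    public static void Main()
    {
        // Prints the URL that could be passed to a WebRequest.
        Console.WriteLine(GoogleQuery.BuildSiteSearchUrl("yahoo.com", "sports"));
    }
}
```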

But perhaps you've already eliminated that as a strategy, so:

If extractor.Links is a collection of type IEnumerable<Link>, then you should be able to use a relatively simple LINQ filter operation like:

string matchStr = "yahoo.com";

var filteredMatches = extractor.Links.Where(link => link.Href.ToString().Contains(matchStr)).ToList<Link>();

Disclaimer: this code fragment is off the top of my head and may not work for you as is. It is not tested and may be flawed; it is intended only to suggest a strategy to you.
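One caveat with a substring match is that an external URL such as "http://shadi.com/?ref=yahoo.com" still contains "yahoo.com". A possible variant, sketched here under assumptions, compares the URL's host instead. The `Link` class below is a hypothetical stand-in for the extractor's link type (its real shape is not shown in the question), and `HostFilter` is an illustrative helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the extractor's link type.
public sealed class Link
{
    public Link(string href) { Href = href; }
    public string Href { get; }
}

public static class HostFilter
{
    // True when href is an absolute URL whose host is rootHost or a subdomain of it.
    public static bool IsOnHost(string href, string rootHost)
    {
        if (!Uri.TryCreate(href, UriKind.Absolute, out Uri uri))
        {
            return false; // URLs that cannot be parsed as absolute are excluded in this sketch
        }
        string host = uri.Host;
        return host.Equals(rootHost, StringComparison.OrdinalIgnoreCase)
            || host.EndsWith("." + rootHost, StringComparison.OrdinalIgnoreCase);
    }

    // LINQ filter that keeps only links hosted on rootHost or its subdomains.
    public static List<Link> KeepInternal(IEnumerable<Link> links, string rootHost)
    {
        return links.Where(link => IsOnHost(link.Href, rootHost)).ToList();
    }
}

public static class HostFilterDemo
{
    public static void Main()
    {
        var links = new List<Link>
        {
            new Link("http://yahoo.com/sports"),
            new Link("http://sports.yahoo.com/nba"),
            new Link("http://shadi.com/?ref=yahoo.com") // contains "yahoo.com" but is external
        };

        foreach (Link link in HostFilter.KeepInternal(links, "yahoo.com"))
        {
            Console.WriteLine(link.Href);
        }
    }
}
```

Matching on the host treats subdomains as internal; whether that is desirable depends on what counts as an "internal" link for the site being crawled.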
