How to take information from an HTML file in C#?


Question

Hello!

I would like to build one string array a[4] from the spans with class="c" and class="c2": a[0] = class="c"; a[1] = 0; a[2] = class="c2".
At the moment my array contains only a[1] and a[2]. Can you help, please?

Here is an example of the HTML code:


<div class="a">

<div class="b">
<span class="c">...
</span>
</div>


<div class="b1">...
</div>

<div class="b2">...
<span class="c">
</span>
</div>


</div>



What I have tried:

var inner1 = doc.DocumentNode.SelectSingleNode("//div[@class='a']");
var nodes3 = inner1.SelectNodes("//span[@class='c']");

foreach (HtmlAgilityPack.HtmlNode item3 in nodes3)
{
    shipping_flag[i4] = item3.InnerText;
    shipping_array = (string[])shipping_flag.ToArray();
    i4++;
}

Solution

I'll try to help you, but I'm a little puzzled by your explanation and your foreach block.
What would you like to achieve?
Extract all the SPAN elements into an array?
Count them?

EDIT:
OK, I think I've got it. I hope so, at least :)
Just to be sure, would you like to get an output array like this:

string[] shipping_array = new string[]
{
    "class=\"c\"",
    "0",
    "class=\"c2\"",
    null // why didn't you define this item?
};


It seems a little inconsistent, but I don't know the background.
So, my next questions are:
* Where does the '2' character in the third array element come from? Is it a counter of the 'c' classes found?
* You did not define the fourth array element. Why? Is there any difference between the spans?

EDIT:
Try something like this:

var results = new List<string>();
var divs = doc.DocumentNode.SelectNodes("//div");
foreach (var div in divs)
    results.Add(div.SelectNodes("span[@class='c']") == null ? "0" : "class=\"c\"");
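
For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch of the same approach. It is not the asker's real code: the class name `Program`, the constant `Html`, and the variable `shippingArray` are made up for illustration, the markup is the sample from the question with the "..." placeholders left in, and it assumes the HtmlAgilityPack NuGet package is referenced. It also assumes that only the direct child divs of the outer `div class="a"` should produce an array entry, which is a guess about the intended output.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

class Program
{
    // Sample markup from the question; the "..." placeholders are kept as-is.
    const string Html = @"
<div class=""a"">
  <div class=""b""><span class=""c"">...</span></div>
  <div class=""b1"">...</div>
  <div class=""b2"">...<span class=""c""></span></div>
</div>";

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);

        var results = new List<string>();

        // Select only the direct child divs of the outer block (assumed intent).
        var divs = doc.DocumentNode.SelectNodes("//div[@class='a']/div");

        // By default SelectNodes returns null, not an empty collection, when nothing matches.
        if (divs != null)
        {
            foreach (var div in divs)
            {
                // ".//span" searches below the current div only;
                // a leading "//" would search the whole document again.
                var span = div.SelectSingleNode(".//span[@class='c']");
                results.Add(span == null ? "0" : "class=\"c\"");
            }
        }

        string[] shippingArray = results.ToArray();
        Console.WriteLine(string.Join(", ", shippingArray));
    }
}
```

With the sample markup this yields one entry per inner div: class="c" for the first, 0 for the second (it has no span), and class="c" for the third. The relative ".//" prefix matters here, because an XPath that starts with "//" is evaluated from the document root even when it is called on a child node, so every div would report the same match.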

