How to take information from an HTML file in C#?
Hello!
I would like to create a string array a[4] from span class="c" and class="c2": a[0] = class="c"; a[1] = 0; a[2] = class="c2".
At the moment my array contains only a[1] and a[2]. Can you help, please?
Here is an example of the HTML code:
<div class="a">
<div class="b">
<span class="c">...
</span>
</div>
<div class="b1">...
</div>
<div class="b2">...
<span class="c">
</span>
</div>
</div>
What I have tried:
var inner1 = doc.DocumentNode.SelectSingleNode("//div[@class='a']");
var nodes3 = inner1.SelectNodes("//span[@class='c']");
foreach (HtmlAgilityPack.HtmlNode item3 in nodes3)
{
    shipping_flag[i4] = item3.InnerText;
    shipping_array = (string[])shipping_flag.ToArray();
    i4++;
}
I'll try to help you, but I'm a little puzzled by your explanation and your foreach block.
What would you like to achieve?
Extract all the SPAN elements into an array?
Count them?
EDIT:
OK, I think I've got it. I hope so, at least :)
Just to be sure, would you like to get an output array like this:
string[] shipping_array = new string[]
{
    "class=\"c\"",
    "0",
    "class=\"c2\"",
    null // why didn't you define this item?
};
It seems a little inconsistent, but I don't know the background.
So, my next questions are:
* Where does the '2' character in the third array element come from? Is it a counter of the 'c' classes found?
* You did not define the fourth array element. Why? Is there any difference between the short spans?
EDIT:
Try something like this:
var results = new List<string>();
var divs = doc.DocumentNode.SelectNodes("//div");
foreach (var div in divs)
    results.Add(div.SelectNodes("span[@class='c']") == null ? "0" : "class=\"c\"");
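
A slightly fuller sketch of the same idea, in case it helps. It rests on some assumptions that are not in the question: that you want one entry per direct child div of div class="a", and that a span may carry either class "c" or class "c2". The class name SpanFlags and the method Collect are made up for illustration. Note that in HtmlAgilityPack an XPath starting with "//" searches from the document root even when it is called on a child node, so the sketch uses "./" to stay inside the current node, and it checks for null because SelectNodes returns null when nothing matches.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package, assumed to be installed

public static class SpanFlags
{
    // Hypothetical helper: returns one entry per direct child <div> of <div class="a">.
    public static List<string> Collect(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();

        // Find the outer container; give up with an empty list if it is missing.
        var container = doc.DocumentNode.SelectSingleNode("//div[@class='a']");
        if (container == null)
            return results;

        // "./div" selects only the direct child divs of the container.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var divs = container.SelectNodes("./div");
        if (divs == null)
            return results;

        foreach (var div in divs)
        {
            // Assumption: the span of interest has class "c" or "c2".
            var span = div.SelectSingleNode("./span[@class='c' or @class='c2']");

            // "0" marks a div without such a span; otherwise record its class.
            results.Add(span == null
                ? "0"
                : "class=\"" + span.GetAttributeValue("class", "") + "\"");
        }

        return results;
    }
}
```

If you need an array rather than a list, call SpanFlags.Collect(html).ToArray() once, after the loop has finished, instead of converting inside the loop.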