How to get the CSS class name or style properties from an HTML document using C#
Can anyone help me work out how to get the CSS class name or style properties of a text/keyword from an HTML document using C#?
Say I am passing the text "Sample Text" from code.
I have an HTML document which contains the following code:
<div class="sampleclass">
Sample Text
</div>
I need to get sampleclass as the result when "Sample Text" is passed.
What I have tried:
I have used the HtmlAgilityPack library. I can get the text based on a class name, but I need the other way around: getting the class name when the text is passed.
If you have an HtmlDocument (using HtmlAgilityPack), you can use .DocumentNode.Descendants() to get all descendants, and using the LINQ extension methods, you can search for the element containing 'Sample Text' and get its class:

using System.Linq;
using HtmlAgilityPack;

string html = @"<!DOCTYPE html>
<html>
<head><title>Sample document</title></head>
<body>
<div class=""sampleclass"">
Sample Text
</div>
</body>
</html>";

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// Find the first descendant whose trimmed inner HTML is exactly "Sample Text".
HtmlNode foundNode = doc.DocumentNode.Descendants().Where(x => x.InnerHtml.Trim() == "Sample Text").FirstOrDefault();
// Read the class attribute; the result is null if the node or the attribute is missing.
string classAttribute = foundNode?.Attributes["class"]?.Value;

.Where filters the descendants using the predicate x => x.InnerHtml.Trim() == "Sample Text", which means that for an element 'x' in the list of descendants, the trimmed InnerHtml of 'x' must be "Sample Text". .FirstOrDefault returns the first element found, or null if no element is found.
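
The same search can be wrapped in a small helper that also covers the "style properties" half of the question. This is only a sketch: the names TextLookup and FindAttributeByText are made up for illustration, and the extra NodeType filter (which skips text and comment nodes) is an assumption added here, not part of the answer above. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

static class TextLookup
{
    // Hypothetical helper: returns the value of attributeName on the first
    // element whose trimmed inner HTML equals text, or null if there is no
    // such element or the element lacks that attribute.
    public static string FindAttributeByText(string html, string text, string attributeName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode foundNode = doc.DocumentNode
            .Descendants()
            // Assumption: only element nodes are of interest, so text and
            // comment nodes are skipped before comparing.
            .Where(x => x.NodeType == HtmlNodeType.Element)
            .FirstOrDefault(x => x.InnerHtml.Trim() == text);

        return foundNode?.Attributes[attributeName]?.Value;
    }

    static void Main()
    {
        string html = @"<div class=""sampleclass"" style=""color: red"">
Sample Text
</div>";

        // Look up the class and the inline style of the element holding the text.
        Console.WriteLine(FindAttributeByText(html, "Sample Text", "class"));
        Console.WriteLine(FindAttributeByText(html, "Sample Text", "style"));
        // A text that does not occur in the document yields null.
        Console.WriteLine(FindAttributeByText(html, "Missing", "class") == null);
    }
}
```

With the sample markup, the first call is expected to yield sampleclass and the second the raw style string. Because the comparison uses InnerHtml, an element whose text is wrapped in further markup (for example a nested span) would not match; comparing InnerText instead is one possible variation.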
When the node is found, the attribute is fetched from the node. Note that I used ?. instead of just . because ?. is the null-conditional operator. foundNode?.Attributes["class"] means "if foundNode is null, then this expression evaluates to null; if foundNode is not null, then this expression executes .Attributes["class"]". ?.Value works in the same way. Using this operator avoids a few null checks. If foundNode is null or if it doesn't have a class attribute, then classAttribute is null too.
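
To make the null-conditional behavior concrete, here is a minimal, self-contained sketch that compares the ?. chain with the explicit null checks it replaces. Node and Attr are hypothetical stand-ins invented for this example so that it runs without HtmlAgilityPack; they only mimic the "missing attribute gives null" behavior the answer relies on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an attribute object.
class Attr
{
    public string Value;
}

// Hypothetical stand-in for a node with an attribute collection.
class Node
{
    private readonly Dictionary<string, Attr> attrs = new Dictionary<string, Attr>();

    public Node With(string name, string value)
    {
        attrs[name] = new Attr { Value = value };
        return this;
    }

    // Returns null when the attribute does not exist.
    public Attr this[string name] => attrs.TryGetValue(name, out var a) ? a : null;
}

static class NullConditionalDemo
{
    // Short form: each ?. stops and yields null if its left side is null.
    public static string ClassWithOperator(Node node) => node?["class"]?.Value;

    // Long form: the explicit checks that the operator saves.
    public static string ClassWithChecks(Node node)
    {
        if (node == null) return null;
        Attr attr = node["class"];
        if (attr == null) return null;
        return attr.Value;
    }

    static void Main()
    {
        Node withClass = new Node().With("class", "sampleclass");
        Node withoutClass = new Node();

        Console.WriteLine(ClassWithOperator(withClass));            // sampleclass
        Console.WriteLine(ClassWithOperator(withoutClass) == null); // True
        Console.WriteLine(ClassWithOperator(null) == null);         // True
        Console.WriteLine(ClassWithChecks(withClass));              // sampleclass
    }
}
```

Both methods return the same result for every input; the operator form simply folds the two null checks into the expression.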