Convert an HTML table with rowspans to a DataTable in C#

Problem description

I need to convert an HTML table to a DataTable in C#. I used HtmlAgilityPack, but it does not convert the table correctly because of the rowspans. The code I am currently using is:

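    // Requires: using System.Data; using System.Linq; using System.Net; using HtmlAgilityPack;
    private static DataTable convertHtmlTableToDataTable()
    {
        WebClient webClient = new WebClient();
        string urlContent = webClient.DownloadString("http://example.com");

        // getTableCode is the asker's own helper; its body is not shown.
        string tableCode = getTableCode(urlContent);

        // As shown, this call is a no-op; the first argument was presumably "&nbsp;"
        // before the page decoded the entity.
        string htmlCode = tableCode.Replace(" ", " ");

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(htmlCode);
        var headers = doc.DocumentNode.SelectNodes("//tr/th");
        DataTable table = new DataTable();

        // One column per header cell.
        foreach (HtmlNode header in headers)
        {
            table.Columns.Add(header.InnerText);
        }
        // One DataRow per <tr> that has <td> children. Rows that sit under a
        // rowspan cell contain fewer <td> elements, so their values shift left.
        // "//tr[td]" also matches the rows of the nested tables.
        foreach (var row in doc.DocumentNode.SelectNodes("//tr[td]"))
        {
            table.Rows.Add(row.SelectNodes("td").Select(td => td.InnerText).ToArray());
        }
        return table;
    }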

And this is part of the HTML table:

<table class="tabel" cellspacing="0" border="0">
    <caption style="font-family:Verdana; font-size:20px;">SEMGRP</caption>
    <tr>
        <th class="celula" >Ora</th>
        <th  class="latime_celula celula">Luni</th>
        <th  class="latime_celula celula">Marti</th>
        <th  class="latime_celula celula">Miercuri</th>
        <th  class="latime_celula celula">Joi</th>
        <th  class="latime_celula celula">Vineri</th>
    </tr>
    <tr>
        <td class="celula" nowrap="nowrap">8-9</td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">   
                                            Curs    
                                            <br />
                                            <a class="link_celula" href="afis_n0.php?id_tip=287&tip=p">Prof</a> 
                                            <br />
                                            <a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
                                            <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">
                                            Curs    
                                            <br />
                                            <a class="link_celula" href="afis_n0.php?id_tip=287&tip=p">Prof</a> 
                                            <br />
                                            <a class="link_celula" href="afis_n0.php?id_tip=12&tip=s">Sala</a>  
                                            <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
        <td class="celula"> </td>
        <td class="celula"> </td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">
                                        Curs
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=293&tip=p">Prof</a>
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
                                        <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
    </tr>
    <tr>
        <td class="celula" nowrap="nowrap">9-10</td>
        <td class="celula"> </td>
        <td class="celula"> </td>
    </tr>
    <tr>
        <td class="celula" nowrap="nowrap">10-11</td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">   Curs
                                        <br /><a class="link_celula" href="afis_n0.php?id_tip=303&tip=p">Prof</a>
                                        <br /><a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a>
                                        <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">   Curs
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=331&tip=p">Prof</a>
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=14&tip=s">Sala</a>  
                                        <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">   Curs
                                        <br /><a class="link_celula" href="afis_n0.php?id_tip=330&tip=p">Prof</a>   
                                        <br /><a class="link_celula" href="afis_n0.php?id_tip=9&tip=s">Sala</a> 
                                        <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
        <td class="celula"> </td>
        <td class="celula" rowspan="2">
                                <table border="0" align="center">
                                    <tr>
                                        <td nowrap="nowrap" align="center">   Curs
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=293&tip=p">Prof</a>
                                        <br />
                                        <a class="link_celula" href="afis_n0.php?id_tip=10&tip=s">Sala</a>  <br />
                                        </td>
                                    </tr>
                                </table>
        </td>
    </tr>
    <tr>
        <td class="celula" nowrap="nowrap">11-12</td>
        <td class="celula"> </td>
    </tr>
    <tr>





I tried a few solutions but did not find a good one.

Thanks in advance for any help.

Recommended answer

This seems to be a good library.

HTML Renderer: a cross-framework (WinForms/WPF/PDF/Metro/Mono/etc.), multipurpose (UI controls, image generation, PDF generation, etc.), 100% managed (C#), high-performance HTML rendering library.
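
If you would rather stay with HtmlAgilityPack, one possible approach is to expand the rowspans yourself: remember, per column, the text of any cell that still reaches into later rows, and repeat it there. The sketch below is only an illustration of that idea, not a tested drop-in fix. The `Cell` record, `SpanGrid.ToDataTable`, and `Demo` are hypothetical names introduced here, the parsing step is left out so the block depends only on `System.Data`, and colspan is not handled.

```csharp
using System;
using System.Data;

// Hypothetical cell model: the text of a <td>/<th> plus its rowspan (1 if absent).
public sealed record Cell(string Text, int RowSpan = 1);

public static class SpanGrid
{
    // Builds a rectangular DataTable from rows whose cells may span several rows.
    // rows[0] supplies the column names; colspan is not handled in this sketch.
    public static DataTable ToDataTable(Cell[][] rows)
    {
        var table = new DataTable();
        if (rows.Length == 0) return table;

        foreach (Cell header in rows[0])
            table.Columns.Add(header.Text.Trim());

        int width = table.Columns.Count;
        var carryText = new string[width]; // text of a cell still spanning into this column
        var carryLeft = new int[width];    // how many more rows that cell covers

        for (int r = 1; r < rows.Length; r++)
        {
            var values = new object[width];
            int next = 0; // index of the next unread cell in the current row

            for (int col = 0; col < width; col++)
            {
                if (carryLeft[col] > 0)
                {
                    // Column is covered by a rowspan cell from an earlier row: repeat its text.
                    values[col] = carryText[col];
                    carryLeft[col]--;
                }
                else if (next < rows[r].Length)
                {
                    // Column is free: consume the next cell and remember its span.
                    Cell cell = rows[r][next++];
                    values[col] = cell.Text.Trim();
                    carryText[col] = cell.Text.Trim();
                    carryLeft[col] = Math.Max(cell.RowSpan, 1) - 1;
                }
                else
                {
                    values[col] = string.Empty; // row is short: pad with empty text
                }
            }
            table.Rows.Add(values);
        }
        return table;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Reduced version of the timetable: "Curs" under Luni spans two rows.
        var rows = new[]
        {
            new[] { new Cell("Ora"), new Cell("Luni"), new Cell("Marti") },
            new[] { new Cell("8-9"), new Cell("Curs", 2), new Cell("") },
            new[] { new Cell("9-10"), new Cell("") },
        };

        DataTable table = SpanGrid.ToDataTable(rows);
        foreach (DataRow row in table.Rows)
            Console.WriteLine(string.Join(" | ", row.ItemArray));
    }
}
```

In the demo, the 9-10 row has only two cells, yet its Luni column is filled with "Curs" carried down from the row above. To feed this from HtmlAgilityPack, each cell's span can be read with `td.GetAttributeValue("rowspan", 1)`. It is probably also worth selecting only the outer table's rows, for example with an XPath such as `//table[@class='tabel']/tr` (an assumption about the page's markup), because `//tr[td]` also matches the rows of the nested tables.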

