How to convert HTML to plain text in C#?


Question

I am trying to get plain text from an HTML website, but I am getting HTML code instead of plain text. For example, <b>hello</b> <p>its me</p>. How can I convert it to "hello its me"? Any help is very much appreciated! Here is my code:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace WindowsFormsApplication2
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // Download the page as raw HTML.
            HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create("https://www.dailyfx.com/real-time-news");
            myRequest.Method = "GET";
            WebResponse myResponse = myRequest.GetResponse();
            StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
            string result = sr.ReadToEnd();

            // The text box receives the markup as-is; no tags are stripped here.
            textBox1.Text = result;
            sr.Close();
            myResponse.Close();
        }
    }
}

Recommended answer

You can use a regular expression for this:

    Regex.Replace(htmltext, "<.*?>", string.Empty);

For example:

    string htmltext = "<p>Test1 <b>.NET</b> Test2 Test3 <i>HTML</i> Test4.</p>";

Output: Test1 .NET Test2 Test3 HTML Test4.

Note that the pattern removes only the tags themselves; the text between them (.NET and HTML here) is kept.

This may also help: http://www.codeproject.com/Tips/136704/Remove-all-the-HTML-tags-and-display-a-plain-text
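
A bare tag-stripping regex leaves behind script and style contents and HTML entities such as &amp;, which a page fetched from a real site is likely to contain. The following is a minimal sketch of a slightly more thorough version, not a definitive implementation. The class name HtmlText and the method name ToPlainText are hypothetical and do not come from the answer above; the calls used (Regex.Replace and WebUtility.HtmlDecode) are standard .NET APIs. Unlike the answer, this sketch replaces each tag with a space rather than an empty string, an assumption made so that words separated only by a tag do not run together.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper (not part of the original answer):
    // converts an HTML fragment to plain text using regular expressions.
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        // Drop <script> and <style> blocks together with their contents.
        string text = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space so adjacent words stay separated.
        text = Regex.Replace(text, "<[^>]*>", " ");

        // Decode entities such as &amp; and &nbsp; into the characters they stand for.
        text = WebUtility.HtmlDecode(text);

        // Collapse runs of whitespace into single spaces and trim the ends.
        text = Regex.Replace(text, @"\s+", " ");

        return text.Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints: hello its me
        Console.WriteLine(HtmlText.ToPlainText("<b>hello</b> <p>its me</p>"));
    }
}
```

In the question's button1_Click handler, the assignment would then become textBox1.Text = HtmlText.ToPlainText(result);. Regular expressions cannot parse arbitrary HTML reliably (comments, malformed markup, and a ">" inside an attribute value can all trip them up), so for anything beyond simple pages a dedicated HTML parser is usually the safer choice.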
