How to convert HTML using the replace or remove command in C# or Java


Problem description

    HTML 1: I am getting this in a string ->

    S=        
           "<html>  <head> <link rel='stylesheet' type='text/css'
             href='http://www.taxmann.com/css/taxmannstyle.css' /> 
                 </head>  <body ><html>
                <body style='background-color:Black;font-size:30px;color:#fff;'>
        <div id=\"digest\">\r\n   
                   <p class=\"threedigest\">ST : Extended period of limitation 
                cannot be invoked for not paying tax if there was divergence 
        of opinion during relevant 
                period and 
                some judgments were in favour of assessee, 
                as there could be no suppression/wilful mis-statement
         by assessee</p>\r\n   
                 </div></body></html></body></html>"

Note: this HTML is correct.

But the string HTML 2 ->

            "<html> 
                     <head> <link rel='stylesheet
                    ' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' /> 
                     </head>  <body ><html><body style='background-color:Black;font-size:30px;color:#fff;'>
                    <html>\r\n<head>
                    <link href='http://www.taxmann.com/TaxmannWhatsnewService/Styles/style.css' rel='stylesheet' type='text/css' />
                    \r\n<title>Rs.560-crore tax evasion detected</title>\r\n<style type=\"text/css\">
            \r\nbody
                    {font-family:Arial, Helvetica, sans-serif; font-size:12px; 
                line-height:18px;text-align:justify;}
                    \r\n.w100{width:100%;}\r\n.fl-l{float:left;}\r\n.ffla{font-family:Arial, 
                Helvetica, sans-serif;}
                    \r\n.fs18{font-size:18px;}\r\n.mart10{margin-top:10px;}\r\n.fcred{color:#c81616;}
                \r\n.tc{text-align:center;}\r\n.tu{text-transform:uppercase;}\r\n.lh18{line-height:18px;}\r\n</style>\r\n</head>\r\n<body>\r\n
                <div class=\"w100 fl-l\">\r\n<div class=\"w100 fl-l ffla fs18 mart10 fcred ttunderline tc tu\">
                    Rs.560-crore tax 
vasion detected</div>\r\n\r\n<div class=\"w100 fl-l lh18 mart10\">
                The Central Excise Intelligence, 
Chennai Zone, has detected 164 cases involving excise
                 and service tax evasion of Rs.560 crore in 2012- 13.
     A total of 166 show cause notices
                 have been issued involving Rs.500 crore for 
    various central excise and service 
                tax cases during the year.
 – www.business-standard.com</div>\r\n\r\n
            </div>\r\n</body>\r\n
                    </html>\r\n</body>
    </html></body>
    </html>"

I want to convert HTML 2 into the same format as HTML 1. I have tried a lot without success. I tried removing some of the HTML content, but that did not work, and I also tried removing it with Java. I don't know how to make HTML 2 match HTML 1. Please help me do this with a replace or remove command in any programming language.

Solution

Try this; it works by removing the two unwanted HTML tags in the first line of your string. The response from the server contains two html tags, which is why you are not getting a proper result. Try to remove all the unwanted tags and align the HTML code.

public class TestScriptClass {
public static void main(String[] args) {

    String inputValue=" ";
      inputValue =inputValue+"<html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' />"+ 
                    "</head>  <body ><html><body style='background-color:Black;font-size:30px;color:#fff;'>"+ 
                    "<html>\r\n<head> <link href='http://www.taxmann.com/TaxmannWhatsnewService/Styles/style.css' rel='stylesheet' type='text/css' />"+ 
                    "\r\n<title>Rs.560-crore tax evasion detected</title>\r\n<style type=\"text/css\">"+ 
                    "  \r\nbody{font-family:Arial, Helvetica, sans-serif; font-size:12px; "+ 
                    " line-height:18px;text-align:justify;} \r\n.w100{width:100%;}\r\n.fl-l{float:left;}\r\n.ffla{font-family:Arial, "+ 
                    "Helvetica, sans-serif;} \r\n.fs18{font-size:18px;}\r\n.mart10{margin-top:10px;}\r\n.fcred{color:#c81616;}"+ 
                    " \r\n.tc{text-align:center;}\r\n.tu{text-transform:uppercase;}\r\n.lh18{line-height:18px;}\r\n</style>\r\n</head>\r\n<body>\r\n"+ 
                    "  <div class=\"w100 fl-l\">\r\n<div class=\"w100 fl-l ffla fs18 mart10 fcred ttunderline tc tu\">"+ 
                    "   Rs.560-crore tax "+ 
                    "vasion detected</div>\r\n\r\n<div class=\"w100 fl-l lh18 mart10\">"+ 
                    " The Central Excise Intelligence, "+ 
                    "Chennai Zone, has detected 164 cases involving excise"+ 
                    "  and service tax evasion of Rs.560 crore in 2012- 13."+ 
                    "  A total of 166 show cause notices"+ 
                    "   have been issued involving Rs.500 crore for "+ 
                    "  various central excise and service "+ 
                    "  tax cases during the year."+ 
                    "– www.business-standard.com</div>\r\n\r\n"+ 
                    " </div>\r\n</body>\r\n"+ 
                    "   </html>\r\n</body>"+ 
                    "  </html></body>"+ 
                    "   </html>";

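      // Replace the opening fragment with a copy that ends right after the <link> tag,
      // which drops the first "</head>  <body ><html>" run from the string.
      String resultValue= inputValue.replace("<html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' /></head>  <body ><html>", " <html><head> <link rel='stylesheet' type='text/css' href='http://www.taxmann.com/css/taxmannstyle.css' />");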

      System.out.println(resultValue);       
}
}
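
The replace call above only works when the string starts with exactly that fragment. As a rough alternative, the sketch below removes repeated html and body tags with regular expressions, keeping the first opening tag and the last closing tag of each kind. It is not part of the original answer: the class name NestedTagCleaner and the helper methods removeAllButOne and keepOutermost are made up for illustration, and the input is a shortened stand-in for the real response. It assumes the extra tags are plain duplicates; any attributes on a removed opening tag (such as the style on the inner body) are lost, and nested head sections are not touched. For anything beyond simple cases an HTML parser is likely to be more robust than string manipulation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedTagCleaner {

    // Removes every match of the pattern except one: the first match when
    // keepFirst is true, otherwise the last match.
    static String removeAllButOne(String text, Pattern pattern, boolean keepFirst) {
        // First pass: count the matches so the last one can be identified.
        int total = 0;
        Matcher counter = pattern.matcher(text);
        while (counter.find()) {
            total++;
        }
        int keepIndex = keepFirst ? 0 : total - 1;

        // Second pass: copy the text, dropping every match except the kept one.
        StringBuilder out = new StringBuilder();
        Matcher m = pattern.matcher(text);
        int index = 0;
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            if (index == keepIndex) {
                out.append(m.group());
            }
            last = m.end();
            index++;
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    // Keeps the first opening tag and the last closing tag with the given name
    // (hypothetical helper, matching tags case-insensitively).
    static String keepOutermost(String html, String tag) {
        Pattern open = Pattern.compile("<" + tag + "\\b[^>]*>", Pattern.CASE_INSENSITIVE);
        Pattern close = Pattern.compile("</" + tag + "\\s*>", Pattern.CASE_INSENSITIVE);
        String withoutExtraOpeners = removeAllButOne(html, open, true);
        return removeAllButOne(withoutExtraOpeners, close, false);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the server response, with one duplicated html/body pair.
        String input = "<html><head></head><body><html><body><p>Hi</p></body></html></body></html>";
        String result = keepOutermost(keepOutermost(input, "html"), "body");
        System.out.println(result);
    }
}
```

Running keepOutermost first for html and then for body leaves a single html element and a single body element around the content.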
