jSoup - remove particular style attributes

Problem description

Am I missing something? Is there a better way to do this?

Input:

<span style="FONT-FAMILY: 'Lucida Sans','sans-serif'; COLOR: #003572; FONT-SIZE: 9pt; 
mso-fareast-font-family: Calibri; mso-ansi-language: EN-US; mso-fareast-language: EN-US; 
mso-bidi-language: AR-SA; mso-fareast-theme-font: minor-latin">Dr. Who is 
<u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a-
12p at 555-1212. </span>

Desired output:

<span style="COLOR: #003572; FONT-SIZE: 9pt;">Dr. Who is <u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a-12p at 555-1212. </span>

My poor code:

//cleans the HTML within the Week Long note before writing to the DB

  Whitelist wl = new Whitelist();         
  wl = Whitelist.simpleText();
  wl.addTags("br");
  wl.addTags("p");
  wl.addTags("span");
  wl.addAttributes(":all","style");
  Document doc = 
              Jsoup.parse(
               "<html><head></head><body>"+ds.getWeeklongNote()+"</body></html>");
  Elements e = doc.select("*");
  for (Element el : e){
      for (Attribute attr : el.attributes()){
          if (attr.getKey().equals("span")){
              String newValue = "";
              String s = attr.getValue();
              String[] values = s.split(";");
              for (String value : values){
                  if (value.startsWith("COLOR")||value.startsWith("FONT-SIZE")){
                      newValue += attr.getKey()+"="+attr.getValue()+";";
                  }
              }
              attr.setValue(newValue);
          }
      }
  }

  doc.html(e.outerHtml());
  ds.setWeekLongNote(Jsoup.clean(doc.body().outerHtml(), wl));

Recommended answer

Try this:

  // Needs org.jsoup.Jsoup, org.jsoup.nodes.Attribute, org.jsoup.nodes.Attributes,
  // org.jsoup.nodes.Document, org.jsoup.select.Elements and android.util.Log.
  Document doc = Jsoup.parse(html);
  Elements e = doc.getElementsByTag("body");
  Log.i("Span element: "+e.get(0).nodeName(), ""+e.get(0).nodeName());
  e = e.get(0).getElementsByTag("span");

  // Assumes style is the first attribute on the first span, as in the input above.
  Attributes styleAtt = e.get(0).attributes();
  Attribute a = styleAtt.asList().get(0);
  if(a.getKey().equals("style")){
     String[] items = a.getValue().trim().split(";");
     String newValue = "";
     for(String item: items){
         // Keep only the COLOR and FONT-SIZE declarations.
         if(item.contains("COLOR:")||item.contains("FONT-SIZE:")){
             Log.i("Style Item: ", ""+item);
             newValue = newValue.concat(item).concat(";");
         }
     }
     a.setValue(newValue);
     Log.i("New Attribute: ",""+newValue);
  }

  Log.i("FINAL HTML: ",""+e.outerHtml());

  doc.html(e.outerHtml());

Output:

08-17 18:28:07.692: I/FINAL HTML:(8148): <span style=" COLOR: #003572; FONT-SIZE: 9pt;">Dr. Who is <u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a- 12p at 555-1212. </span>

Cheers
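
The filtering step in the answer can also be pulled out into a small helper that does not depend on jsoup. The following is only a sketch under some assumptions: `StyleFilter` and `keepOnly` are hypothetical names, not part of jsoup, and the naive split on `;` would break on a semicolon inside a quoted value or `url(...)`. Unlike a `contains("COLOR:")` check, it compares the whole property name case-insensitively, so a declaration such as `BACKGROUND-COLOR` is not kept by accident.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of jsoup): filters the declarations of an inline style string. */
final class StyleFilter {

    private StyleFilter() {}

    /**
     * Returns only the declarations whose property name is listed in keep,
     * compared case-insensitively. Splits naively on ';'.
     */
    static String keepOnly(String style, String... keep) {
        Set<String> allowed = new LinkedHashSet<>();
        for (String name : keep) {
            allowed.add(name.trim().toLowerCase(Locale.ROOT));
        }
        StringBuilder out = new StringBuilder();
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed chunk, e.g. after a trailing ';'
            }
            String property = declaration.substring(0, colon).trim();
            String value = declaration.substring(colon + 1).trim();
            if (allowed.contains(property.toLowerCase(Locale.ROOT))) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(property).append(": ").append(value).append(';');
            }
        }
        return out.toString();
    }
}
```

With jsoup, this could presumably be applied to every styled element rather than only the first span, for example by looping over `doc.select("[style]")` and calling `el.attr("style", StyleFilter.keepOnly(el.attr("style"), "color", "font-size"))` on each element.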
