How to remove &nbsp; with Jsoup?


Question

I can't remove it with .trim(), .replace(" ", ""), etc. I don't get it.

I even found a suggestion on Stack Overflow to try \\u00a0, but that didn't work either.

I tried this:

System.out.println( "'"+fields.get(6).text().replace("\\u00a0", "")+"'" ); //'94,00 '
System.out.println( "'"+fields.get(6).text().replace(" ", "")+"'" ); //'94,00 '
System.out.println( "'"+fields.get(6).text().trim()+"'"); //'94,00 '
System.out.println( "'"+fields.get(6).html().replace("&nbsp;", "")+"'"); //'94,00' works

But I can't figure out why I can't remove the whitespace from the result of .text().

Recommended answer

Your first attempt was very nearly it; you're quite right that Jsoup maps &nbsp; to U+00A0. You just don't want the double backslash in your string:

System.out.println( "'"+fields.get(6).text().replace("\u00a0", "")+"'" ); //'94,00'
// Just one ------------------------------------------^

replace doesn't use regular expressions, so you aren't trying to pass a literal backslash through to the regex level. You just want to specify the character U+00A0 in the string itself, which the single-backslash escape does.
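
To see the difference side by side, here is a minimal sketch that does not need Jsoup at all. It assumes the text returned by fields.get(6).text() was "94,00" followed by a single U+00A0. The class name NbspDemo and the helper stripNbsp are hypothetical names made up for this illustration, not part of Jsoup.

```java
public class NbspDemo {
    // Hypothetical helper (not a Jsoup method): removes every U+00A0 character.
    static String stripNbsp(String s) {
        return s.replace("\u00a0", "");
    }

    public static void main(String[] args) {
        // Assumed stand-in for what fields.get(6).text() returned in the question.
        String text = "94,00\u00a0";

        // Double backslash: searches for the six literal characters
        // backslash, u, 0, 0, a, 0, which never occur in the text.
        System.out.println("'" + text.replace("\\u00a0", "") + "'");

        // trim() only strips characters up to U+0020, so U+00A0 survives.
        System.out.println("'" + text.trim() + "'");

        // Single backslash: the string literal contains the actual character.
        System.out.println("'" + stripNbsp(text) + "'"); // '94,00'

        // replaceAll does use regex, and the regex engine resolves the
        // escape itself, so the double backslash is correct here.
        System.out.println("'" + text.replaceAll("\\u00a0", "") + "'"); // '94,00'
    }
}
```

The last line is probably where the double-backslash advice came from: it is right for regex-based methods such as replaceAll, but not for the plain replace.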
