JSoup "wrap" is not working as expected every time
I have an HTML string that contains text, images, or map images. The HTML is generated dynamically. Wherever there is an <img> tag, it should be wrapped in a <center> tag. To achieve this I use JSoup, and it works for static images and text.
But whenever I try to post a map, the HTML loses its structure, and I do not understand why. Both the normal images and the map images are <img> tags, so how can the same method give different outputs?
This is the method that does the job:
public String wrapImgWithCenter(String html){
Document doc = Jsoup.parse(html);
doc.select("img").wrap("<center></center>");
return doc.html();
}
Original HTML with image and map images before wrapping:
<p dir="ltr"><img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-4de5af73-68f1-401d-9feb-1dfde1373cff-file" /></p>
<p dir="ltr"><a href="15.2993265,74.123996"><img src="http://maps.google.com/maps/api/staticmap?center=15.2993265,74.123996&zoom=15&size=960x540&sensor=false&markers=color:blue%7Clabel:!%7C15.2993265,74.123996" /></a><br /><br /></p>
<p dir="ltr"><a href="22.572646,88.363895,-25.274398,133.775136"><img src="http://maps.google.com/maps/api/staticmap?center=22.572646,88.363895&zoom=2&size=960x540&markers=22.572646,88.363895%7C-25.274398,133.775136&path=color:0xff0000ff%7Cweight:5%7C22.572646,88.363895%7C-25.274398,133.775136&sensor=false" /></a><br /> </p>
Result after wrapping:
<p dir="ltr"> </p>
<center>
<img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-01516245-c773-4765-b542-ebecb964b255-file" />
</center>
<br />
<br />
<p></p>
<p dir="ltr"> </p>
<center>
<a href="15.2993265,74.123996"><img src="http://maps.google.com/maps/api/staticmap?center=15.2993265,74.123996&zoom=15&size=960x540&sensor=false&markers=color:blue%7Clabel:!%7C15.2993265,74.123996" /></a>
</center>
<p></p>
<p dir="ltr"> </p>
<center>
<a href="22.572646,88.363895,-25.274398,133.775136"><img src="http://maps.google.com/maps/api/staticmap?center=22.572646,88.363895&zoom=2&size=960x540&markers=22.572646,88.363895%7C-25.274398,133.775136&path=color:0xff0000ff%7Cweight:5%7C22.572646,88.363895%7C-25.274398,133.775136&sensor=false" /></a>
</center>
<br />
<p></p>
For comparison, valid output with only images:
<html>
<head></head>
<body>
<p dir="ltr">
<center>
<img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-959467a6-f83f-44c6-b6fc-88ba4f49d900-file" />
</center><br /></p>
<p dir="ltr">
<center>
<img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-46c38c96-c3b5-402e-a0b4-03209adf5203-file" />
</center><br /></p>
<p dir="ltr">
<center>
<img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-626ec909-c65e-452c-a341-61a361584eba-file" />
</center><br /> </p>
</body>
</html>
Valid output with text and image:
<html>
<head></head>
<body>
<p dir="ltr">text </p>
<p dir="ltr">
<center>
<img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-11343a01-7cd2-4f9e-9f9a-025ec3feb828-file" />
</center><br /> </p>
</body>
</html>
++++++++++++++++++++++++++++++++++++
The methods in the class that are responsible for the above functionality:
private void createHtmlWeb(){
String listOfElements = "null"; // normally found if
// webTextcontains.maps.google.com
Toast.makeText(getApplicationContext(), "" + mainEditText.getHeight(), Toast.LENGTH_SHORT).show();
ParseObject postObject = new ParseObject("Post");
Spannable s = mainEditText.getText();
String webText = Html.toHtml(s);
webText = webText.replaceAll("(</?(?:b|i|u)>)\\1+", "$1").replaceAll("</(b|i|u)><\\1>", "");
// refactoring html
webText = wrapImgWithCenter(webText);
// Find the links and wrap the favourite-type ones in an <a> element
// carrying the favourite class.
if (webText.contains("a href")) {
String favourite = "favourite";
// Parse it into jsoup
Document doc = Jsoup.parse(webText);
// Create an array to handle every link individually, as wrap can
// otherwise affect the whole body.
Element[] array = new Element[doc.select("a").size()];
for (int i = 0; i < doc.select("a").size(); i++) {
if (doc.select("a").get(i) != null) {
array[i] = doc.select("a").get(i);
}
}
for (int i = 0; i < array.length; i++) {
// We don't want to wrap real links. What real links have in common is
// "http". Should be updated to something more robust.
if (array[i].toString().contains("http") == false) {
array[i] = array[i].wrap("<a class=" + favourite + "></a>");
}
}
// Log.e("From doc.body html *************** ", " " + doc.body());
Element element = doc.body();
Log.e("From element html *************** ", " " + element.html());
listOfElements = element.html();
}
// First check whether the content contains a Google Maps image
if (webText.contains("maps.google.com")) {
Document doc = Jsoup.parse(webText); // Parse it into jsoup
for (int i = 0; i < doc.select("img").size(); i++) {
if (doc.select("img").get(i).toString().contains("maps.google.com")) {
// Get all numbers + full stops + get all numbers
Pattern noImage = Pattern.compile("(\\-?\\d+(\\.\\d+)?),(\\-?\\d+(\\.\\d+))+%7C(\\-?\\d+(\\.\\d+)?),(\\-?\\d+(\\.\\d+))");
// Gets the URL SRC basically.. almost.. lets try it
Matcher matcherer = noImage.matcher(doc.select("img").get(i).toString());
// Have two options - multi route or single route
if (matcherer.find() == true) {
for (int j = 0; j < matcherer.groupCount(); j++) {
latitude_to = Double.parseDouble(matcherer.group(1));
longitude_to = Double.parseDouble(matcherer.group(3));
latitude_from = Double.parseDouble(matcherer.group(5));
longitude_from = Double.parseDouble(matcherer.group(7));
}
String coOrds = "" + latitude_to + "," + longitude_to + "," + latitude_from + "," + longitude_from;
Element ele = doc.body();
ele.select("img").get(i).wrap("<a href=" + coOrds + "></a>");
listOfElements = ele.html();
listOfElements = listOfElements.replace("&amp;", "&");
} else if (matcherer.find() == false) {
noImage = Pattern.compile("(\\-?\\d+(\\.\\d+)?),\\s*(\\-?\\d+(\\.\\d+)?)");
matcherer = noImage.matcher(doc.select("img").get(i).toString());
Toast.makeText(getApplicationContext(), "Regex Count:" + matcherer.groupCount(), Toast.LENGTH_LONG).show();
if (matcherer.find()) {
for (int j = 0; j < matcherer.groupCount(); j++) {
latitude = Double.parseDouble(matcherer.group(1));
parseGeoPoint.setLatitude(latitude);
longitude = Double.parseDouble(matcherer.group(3));
parseGeoPoint.setLongitude(longitude);
}
}
String coOrds = "" + latitude + "," + longitude;
Element ele = doc.body();
ele.select("img").get(i).wrap("<a href=" + coOrds + "></a>");
listOfElements = ele.html();
listOfElements = listOfElements.replace("&amp;", "&");
}
} else {
// standard photo
Element ele = doc.body();
ele.select("img").get(i);
listOfElements = ele.html();
}
}
// Put new value in htmlContent
postObject.put("htmlContent", listOfElements);
} else {
postObject.put("htmlContent", webText);
}
mainEditText.getViewTreeObserver().addOnGlobalLayoutListener(new ViewTreeObserver.OnGlobalLayoutListener() {
@Override
public void onGlobalLayout(){
// TODO Auto-generated method stub
Rect r = new Rect();
mainEditText.getWindowVisibleDisplayFrame(r);
// int screenHeight = mainEditText.getRootView().getHeight();
// int heightDifference = screenHeight - (r.bottom - r.top);
}
});
// See if a trip exists
if (finalTrip != null) {
}
// Want to put the location in the location section
// if parsegeoPoint != null -- old information
if (latitude != -10000 && longitude != -10000) {
// Toast.makeText(getApplicationContext(),
// "Adding in location co-ods: " + latitude + " : " + longitude ,
// Toast.LENGTH_SHORT).show();
postObject.put("location", parseGeoPoint);
}
postObject.put("type", Post.PostType.HTML.getPostVal());
postObject.put("user", ParseObject.createWithoutData("_User", user.getObjectId()));
// Transfer these details
Intent i = new Intent(getApplicationContext(), WriteStoryAnimation.class);
i.putExtra("listOfElements", listOfElements);
i.putExtra("webText", webText);
i.putExtra("finalTrip", finalTrip);
i.putExtra("latitude", latitude);
i.putExtra("longitude", longitude);
if (mainEditText.length() > 0) {
startActivity(i);
} else {
Toast.makeText(getApplicationContext(), "Your story is empty", Toast.LENGTH_SHORT).show();
}
// finish();
// Toast.makeText(getApplicationContext(), "EditText Sie: " + height +
// " : " + desiredHeight, Toast.LENGTH_LONG).show();
}
// method to refactor html
public String wrapImgWithCenter(String html){
Document doc = Jsoup.parse(html);
//adding center tag before images
doc.select("img").wrap("<center></center>");
//adding gap after last p tag
for (int i =0; i<= 1; i++) {
doc.select("p").last().after("<br>");
}
return doc.html();
}
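The coordinate extraction inside createHtmlWeb() can be tried on its own, without Android or jsoup. The following is only a minimal sketch of that step, not the asker's code: the class name MapCoords and the helpers extract() and toDoubles() are hypothetical, and the patterns are simplified versions of the ones in the question (non-capturing groups for the fractional part, so the groups are numbered 1 to 4). It tries the two-point "route" pattern first and falls back to the single-point pattern, calling find() once per matcher.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class MapCoords {
    // One signed decimal number, e.g. "-25.274398" (one capturing group).
    private static final String NUM = "(-?\\d+(?:\\.\\d+)?)";

    // Two "lat,lon" pairs joined by an encoded pipe (%7C): a route.
    private static final Pattern ROUTE =
            Pattern.compile(NUM + "," + NUM + "%7C" + NUM + "," + NUM);

    // A single "lat,lon" pair: one marker.
    private static final Pattern POINT = Pattern.compile(NUM + ",\\s*" + NUM);

    // Returns 4 values for a route, 2 for a single point,
    // and an empty array if neither pattern matches.
    static double[] extract(String src) {
        Matcher route = ROUTE.matcher(src);
        if (route.find()) {
            return toDoubles(route);
        }
        Matcher point = POINT.matcher(src);
        if (point.find()) {
            return toDoubles(point);
        }
        return new double[0];
    }

    // Converts every capturing group of the current match to a double.
    private static double[] toDoubles(Matcher m) {
        double[] out = new double[m.groupCount()];
        for (int g = 1; g <= m.groupCount(); g++) {
            out[g - 1] = Double.parseDouble(m.group(g));
        }
        return out;
    }

    public static void main(String[] args) {
        // The two map URLs from the question.
        String single = "http://maps.google.com/maps/api/staticmap?center=15.2993265,74.123996"
                + "&zoom=15&size=960x540&sensor=false"
                + "&markers=color:blue%7Clabel:!%7C15.2993265,74.123996";
        String route = "http://maps.google.com/maps/api/staticmap?center=22.572646,88.363895"
                + "&zoom=2&size=960x540"
                + "&markers=22.572646,88.363895%7C-25.274398,133.775136"
                + "&path=color:0xff0000ff%7Cweight:5%7C22.572646,88.363895%7C-25.274398,133.775136"
                + "&sensor=false";
        System.out.println(Arrays.toString(extract(single)));
        System.out.println(Arrays.toString(extract(route)));
    }
}
```

Keeping this step in a small pure function makes it easy to check the regular expressions against sample URLs before any HTML is parsed or wrapped.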
I have solved the issue. Fonkap was right in his comments that something was altering my output. I just changed the place where wrapImgWithCenter() is called.
I changed the end of the createHtmlWeb() method to this:
Log.e("listOfElements", listOfElements);
//refactoring html
listOfElements = wrapImgWithCenter(listOfElements);
// Put new value in htmlContent
postObject.put("htmlContent", listOfElements);
} else {
//refactoring html
webText = wrapImgWithCenter(webText);
postObject.put("htmlContent", webText);
}
Now the output conforms to the requirements.