Reading and modifying an HTML file in Java


Question

I have an HTML file that I need to read into an array or one large string and then modify. I need to add some JavaScript includes and a few other elements with attributes, and also remove some existing ones. Can anyone please help me do this? Thank you!

<?xml version="1.0" encoding="UTF-8"?>  
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">  
<html xmlns="http://www.w3.org/1999/xhtml">  
<head>  
<meta http-equiv="Content-type" content="text/html;  charset=utf-8" />  
<title>eXe</title>  
<style type="text/css">  
@import url(base.css);  
@import url(content.css);  
</style>  
<script type="text/javascript" src="common.js"></script>
<!--HERE I NEED TO INCLUDE 3 MORE JAVASCRIPTS-->  
</head>  
<body>  
<div id="outer">  
<div id="main">  
<div id="nodeDecoration">  
<p id="nodeTitle">  
Part 1</p>  
</div>  
<div class="TrueFalseIdevice" id="id12">  
<script type="text/javascript" src="common.js"></script>  
<!--THIS JAVASCRIPT HAS TO BE ELIMINATED-->  
<script type="text/javascript" src="libot_drag.js"></script>  
<div class="iDevice emphasis1">  
<img alt="" class="iDevice_icon" src="icon_question.gif" />  
<span class="iDeviceTitle">True-False Question</span><br/>  
<div class="iDevice_inner">  
<div id="ta12_16" class="block" style="display:block">  

</div><div class="question">  
<br/><br/><div id="taquestion0b12" class="block" style="display:block">1><span style="color: #000000; font-family: Verdana,Arial,Helvetica,sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: #ffffff; display: inline ! important; float: none"> SQL Stands for Structure Query Language?</span>   

<!--THIS ONCLICK EVENT HAS TO BE REMOVED-->  
</div><br/>True <input type="radio" name="option0b12" id="true0b12" onclick="getFeedback(0,2,'0b12','truefalse')"/>   
False <input type="radio" name="option0b12" id="false0b12" onclick="getFeedback(1,2,'0b12','truefalse')"/>  
<div id="s0b0b12" style="color: rgb(0, 51, 204);display: none;" even_steven="18">Correct! </div>  
<div id="s1b0b12" style="color: rgb(0, 51, 204);display: none;" even_steven="19">Incorrect! </div>  
<div id="sfbk0b12" style="color: rgb(0, 51, 204);display: none;"><div id="tafeedback0b12" class="block" style="display:block">  
<!--HERE I NEED TO INCLUDE A SUBMIT BUTTON-->
</div></div>  
</div>  
</div>  
</div>  
</body></html> 

Solution

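Java already ships with a DOM parser (the JAXP classes in javax.xml.parsers) that could help you. Since your file is XHTML with an XML declaration, an XML parser can read it. You could use something like this: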

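import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
// Parse the file into a DOM tree; the enclosing method must declare or catch the checked exceptions.
File theXML = new File("C:\\path\\to\\file.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(theXML);
doc.getDocumentElement().normalize();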

If you've ever used the DOM in JavaScript, you should know what to do next: use doc.getElementsByTagName and similar methods. If you haven't, check out the Oracle tutorial: http://docs.oracle.com/javase/tutorial/jaxp/dom/index.html
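
To make that concrete, here is a rough sketch of the four edits the question's comments ask for, done with the same DOM API and written back out with an identity Transformer. Several things in it are assumptions rather than anything stated in the question: the class name XhtmlEditor, the script names extra1.js to extra3.js, the choice of libot_drag.js as the script to remove, and the button's value are all placeholders. The load-external-dtd feature is specific to the Xerces-derived parser bundled with the JDK; it is set so the parser does not try to download the DTD named in the DOCTYPE.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XhtmlEditor {

    /** Parses well-formed XHTML, applies the edits, and returns the modified markup. */
    static String rewrite(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: the JDK's built-in (Xerces-derived) parser. Skips fetching the external DTD.
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xhtml)));

        // 1. Append extra <script> elements to <head> (placeholder file names; assumes <head> exists).
        Element head = (Element) doc.getElementsByTagName("head").item(0);
        for (String src : new String[] {"extra1.js", "extra2.js", "extra3.js"}) {
            Element script = doc.createElement("script");
            script.setAttribute("type", "text/javascript");
            script.setAttribute("src", src);
            head.appendChild(script);
        }

        // 2. Remove a script by its src. The NodeList is live, so walk it backwards.
        NodeList scripts = doc.getElementsByTagName("script");
        for (int i = scripts.getLength() - 1; i >= 0; i--) {
            Element script = (Element) scripts.item(i);
            if ("libot_drag.js".equals(script.getAttribute("src"))) {
                script.getParentNode().removeChild(script);
            }
        }

        // 3. Strip the onclick handlers from every <input>.
        NodeList inputs = doc.getElementsByTagName("input");
        for (int i = 0; i < inputs.getLength(); i++) {
            ((Element) inputs.item(i)).removeAttribute("onclick");
        }

        // 4. Add a submit button inside the feedback <div>.
        Element target = findById(doc, "div", "tafeedback0b12");
        if (target != null) {
            Element button = doc.createElement("input");
            button.setAttribute("type", "submit");
            button.setAttribute("value", "Submit");
            target.appendChild(button);
        }

        // Serialize the tree. The DOCTYPE is re-emitted only if both identifiers are known.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        DocumentType doctype = doc.getDoctype();
        if (doctype != null && doctype.getPublicId() != null && doctype.getSystemId() != null) {
            transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, doctype.getPublicId());
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
        }
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Finds an element by its id attribute; getElementById needs DTD type info that is not loaded here. */
    private static Element findById(Document doc, String tag, String id) {
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            if (id.equals(element.getAttribute("id"))) {
                return element;
            }
        }
        return null;
    }
}
```

You would call XhtmlEditor.rewrite with the file's contents as a string and write the returned string back to disk. Note that with the xml output method, empty elements such as the new script tags are written in self-closing form.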
