Editor to clean up MS Word-generated HTML table


Problem description

I have a very large html table created by MS Word, saved as its "Web
Page, Filtered" file type. Every html table cell has lots of
formatting tags. Most of the file size is that formatting.

Is there a free or inexpensive editor that can quickly remove all
formatting to minimize the file size?

I tried a few freeware editors, but wasn't able to find a way to clean
it up.

Thanks,

Greg

Solution

On 2007-10-24, Greg Lovern wrote:

> I have a very large html table created by MS Word, saved as its "Web
> Page, Filtered" file type. Every html table cell has lots of
> formatting tags. Most of the file size is that formatting.
>
> Is there a free or inexpensive editor that can quickly remove all
> formatting to minimize the file size?
>
> I tried a few freeware editors, but wasn't able to find a way to clean
> it up.

Use "lynx -dump" to extract the text, then mark it up in any text
editor.

--
Chris F.A. Johnson <http://cfaj.freeshell.org>
================================================== =================
Author:
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
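
Where lynx is not at hand, a similar text-only extraction can be sketched with Python's standard-library html.parser. This is an assumed stand-in, not what lynx does internally: the names TableTextExtractor and table_to_rows are made up for illustration, the sample markup only imitates Word's output, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableTextExtractor(HTMLParser):
    """Collect the text of each table cell, row by row, ignoring other markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments, then collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_to_rows(html_text):
    """Return the table as a list of rows of plain-text cells."""
    parser = TableTextExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample imitating Word's "Web Page, Filtered" markup.
sample = (
    '<table class=MsoNormalTable border=1><tr>'
    '<td width=100 style="padding:0in"><p class=MsoNormal>'
    '<span style="font-size:10.0pt">Name</span></p></td>'
    '<td><p class=MsoNormal><b>Qty</b></p></td></tr>'
    '<tr><td><p class=MsoNormal>Widget &amp; Co</p></td>'
    '<td><p class=MsoNormal>3</p></td></tr></table>'
)

for row in table_to_rows(sample):
    print("\t".join(row))
```

The tab-separated output can then be marked up again in any text editor, as suggested above.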


Greg Lovern wrote:

> I have a very large html table created by MS Word, saved as its "Web
> Page, Filtered" file type. Every html table cell has lots of
> formatting tags. Most of the file size is that formatting.
>
> Is there a free or inexpensive editor that can quickly remove all
> formatting to minimize the file size?


First--don't!

1) In Word, select the table:
2) Convert table to text and use tabs for the table cells
3) Use Word's Search and Replace feature:
3a) Find what: ^t
    Replace with: </td></td>
    Replace all
3b) Find what: ^p
    Replace with: </td></tr>^p<tr><td>
    Replace all
4) Add to the beginning of your former table:
<table>
<tr><td>
5) Add to the end:
</table>
6) Select all and paste into your template HTML with any text editor.
Style to taste...

--
Take care,

Jonathan
-------------------
LITTLE WORKS STUDIO
http://www.LittleWorksStudio.com
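
The search-and-replace steps above amount to wrapping each tab-separated field in a cell and each line in a row. A minimal Python sketch of the same conversion follows. It assumes the table has already been converted to tab-separated plain text, and the function name tsv_to_html_table is hypothetical. Between cells it closes one td and opens the next.

```python
from html import escape


def tsv_to_html_table(text):
    """Turn tab-separated lines into a bare <table> with no formatting attributes."""
    lines = ["<table>"]
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines rather than emit empty rows
        cells = "".join(
            f"<td>{escape(cell.strip())}</td>" for cell in line.split("\t")
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(tsv_to_html_table("Name\tQty\nWidget & Co\t3\n"))
```

Escaping the cell text keeps characters such as & and < from breaking the markup, which a plain search and replace in Word would not do.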


In article <4e***************************@NAXS.COM>,
"Jonathan N. Little" <lw*****@centralva.net> wrote:

> Greg Lovern wrote:
>
>> I have a very large html table created by MS Word, saved as its "Web
>> Page, Filtered" file type. Every html table cell has lots of
>> formatting tags. Most of the file size is that formatting.
>>
>> Is there a free or inexpensive editor that can quickly remove all
>> formatting to minimize the file size?
>
> First--don't!

Agreed - if at all possible, avoid using Word to generate any html.

> 1) In Word, select the table:
> 2) Convert table to text and use tabs for the table cells
> 3) Use Word's Search and Replace feature:
> 3a) Find what: ^t
>     Replace with: </td></td>

I think you mean </td><td>?

As an alternative, the OP could look at something like
Beautiful Soup:

http://www.crummy.com/software/BeautifulSoup/

Depending on the flavour of OS and tastes/talents of the
user, there's always grep of course...
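
For readers who would rather script the clean-up than edit by hand, here is a rough sketch of the kind of filtering an HTML parser makes possible. It deliberately uses only Python's standard-library html.parser rather than Beautiful Soup, so it runs without installing anything. The names KEEP, TableCleaner and clean_word_table are hypothetical, and the sample markup only imitates Word's output. It keeps the table structure and text, and drops every attribute and every other tag.

```python
import re
from html import escape
from html.parser import HTMLParser

# Tags kept in the output; all other tags are dropped but their text is kept.
KEEP = {"table", "thead", "tbody", "tr", "th", "td"}


class TableCleaner(HTMLParser):
    """Re-emit table structure and text only, without attributes or styling."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # depth inside <style>/<script>, whose text is discarded

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1
        elif tag in KEEP:
            self.out.append(f"<{tag}>")  # attributes are intentionally dropped

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self._skip = max(0, self._skip - 1)
        elif tag in KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip:
            # Collapse whitespace runs to one space, then re-escape the text.
            self.out.append(escape(re.sub(r"\s+", " ", data), quote=False))


def clean_word_table(html_text):
    """Return the markup reduced to bare table tags and text."""
    cleaner = TableCleaner()
    cleaner.feed(html_text)
    cleaner.close()
    return "".join(cleaner.out)


# Hypothetical sample imitating Word's "Web Page, Filtered" markup.
sample = (
    '<table class=MsoNormalTable border=1 cellspacing=0><tr>'
    '<td width=100 style="padding:0in 5.4pt"><p class=MsoNormal>'
    '<span style="font-size:10.0pt">Widget <b>&amp;</b> Co</span></p></td>'
    '<td><p class=MsoNormal>3</p></td></tr></table>'
)

print(clean_word_table(sample))
```

Because only the tags in KEEP survive, paragraph breaks inside a cell are lost. Adding "p" or "br" to the set would be one way to keep them, at the cost of a slightly larger file.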

