Looking for a tool to make a plain text document out of a simple HTML document


Question





Hi,

Hopefully this is not too off-topic.

I'm working on a FAQ. I want to make two versions of it, plain text and
HTML. I'm looking for a tool that will make a plain text doc out of the
HTML doc. The HTML version doesn't have anything fancy, just internal
links. So the tool must be able to delete internal links and anchors from
the HTML version, but leave external links in simplified form. That is, the
HTML version would say <a href="http://foo/bar.html">Bar</a> and the plain
text version would just say http://foo/bar.html. The tool should also be
able to make mailto links into plain text, that is, change <a
href="mailto:fo*@bar.com">Foo </a> into fo*@bar.com. In fact, I think both
of these changes might be possible with a regex search-and-replace engine. I
have one in my editor, but I don't know how to use it. So I'm looking for
either regex strings or a ready-made tool; either will be fine.
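
The regex search-and-replace idea in the question can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: "deleting" an internal link or anchor is taken to mean dropping the tag while keeping its text, the function name simplify_links is made up for this example, and the pattern expects simple, well-formed <a> tags like those in the question (it does not strip any other HTML tags).

```python
import re

# Matches <a ...>text</a>. The attribute part may span several lines,
# as in the mailto example from the question.
ANCHOR = re.compile(r'<a\b([^>]*)>(.*?)</a>', re.IGNORECASE | re.DOTALL)
# Pulls the quoted href value out of the attribute part.
HREF = re.compile(r'href\s*=\s*["\']([^"\']*)["\']', re.IGNORECASE)


def simplify_links(html: str) -> str:
    """Hypothetical helper: replace each <a> element with plain text."""

    def replace(match: re.Match) -> str:
        attrs, text = match.group(1), match.group(2)
        href = HREF.search(attrs)
        if href is None:
            # Named anchor such as <a name="q2"></a>: keep only its text.
            return text
        target = href.group(1).strip()
        if target.startswith('#'):
            # Internal link: drop the link, keep the link text (assumption).
            return text
        if target.lower().startswith('mailto:'):
            # mailto link: keep only the address.
            return target[len('mailto:'):]
        # External link: keep only the URL.
        return target

    return ANCHOR.sub(replace, html)


if __name__ == '__main__':
    sample = (
        '<a name="q1"></a>Q1. See <a href="#q2">question 2</a>, '
        '<a href="http://foo/bar.html">Bar</a> or mail '
        '<a\nhref="mailto:foo@example.com">Foo </a>.'
    )
    print(simplify_links(sample))
```

The same two patterns could probably be adapted to an editor's regex search and replace, but the syntax for groups and multi-line matching varies between editors, so that part is left open here.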

Recommended answer

Akseli Mäki wrote:

I forgot to say that the tool should be a DOS or Windows one.


Open the file in Word, select all, copy, open Notepad, paste, save :)

Jon

"Akseli Mäki" <ne********@akseli-yok.utu.fi> wrote in message
news:0p********************************@4ax.com...
Akseli Mäki wrote:

I forgot to say that the tool should be a DOS or Windows one.


On Sun, 21 Dec 2003 14:12:45 +0200, Akseli Mäki wrote:

Hi,

Hopefully this is not too off-topic.

I'm working on a FAQ. I want to make two versions of it, plain text and
HTML. I'm looking for a tool that will make a plain text doc out of the
HTML doc. The HTML version doesn't have anything fancy, just internal
links. So the tool must be able to delete internal links and anchors from
the HTML version, but leave external links in simplified form. That is, the
HTML version would say <a href="http://foo/bar.html">Bar</a> and the plain
text version would just say http://foo/bar.html. The tool should also be
able to make mailto links into plain text, that is, change <a
href="mailto:fo*@bar.com">Foo </a> into fo*@bar.com. In fact, I think both
of these changes might be possible with a regex search-and-replace engine. I
have one in my editor, but I don't know how to use it. So I'm looking for
either regex strings or a ready-made tool; either will be fine.







While it doesn't *exactly* match your requirements, lynx is a very good
tool for doing this. "lynx --dump http://host/dir/page.ext" will produce
a plain-text output with links replaced with '[1]link text'; at the bottom
of the output is a list of all the links' destination URLs.

It is available for Windows at <http://jim.spath.com/lynx_win32/>.

--
Some say the Wired doesn't have political borders like the real world,
but there are far too many nonsense-spouting anarchists or idiots who
think that pranks are a revolution.
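
For readers without lynx at hand, the numbered-link layout described in the reply above ('[1]link text' in the body, destination URLs listed at the bottom) can be approximated with Python's standard html.parser module. This is a rough sketch and not a reimplementation of lynx: the class name LinkDumper, the function dump and the exact layout of the reference list are assumptions made for this example, and internal '#...' links are left unnumbered only because the question asks for them to be dropped.

```python
from html.parser import HTMLParser


class LinkDumper(HTMLParser):
    """Hypothetical helper: collects page text and numbers external links."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []   # text fragments in document order
        self.links = []   # destination URLs, in order of appearance

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            # Skip named anchors and internal '#...' links (assumption
            # taken from the question, not from lynx's behaviour).
            if href and not href.startswith('#'):
                self.links.append(href)
                self.parts.append(f'[{len(self.links)}]')

    def handle_data(self, data):
        self.parts.append(data)


def dump(html: str) -> str:
    """Return the text with '[n]' markers plus a numbered URL list."""
    parser = LinkDumper()
    parser.feed(html)
    parser.close()
    body = ''.join(parser.parts).strip()
    if not parser.links:
        return body
    refs = '\n'.join(f'{i}. {url}' for i, url in enumerate(parser.links, 1))
    return f'{body}\n\nReferences\n\n{refs}'


if __name__ == '__main__':
    page = ('<p>See <a href="http://foo/bar.html">Bar</a> '
            'or go back to the <a href="#top">top</a>.</p>')
    print(dump(page))
```

The sketch keeps all text it encounters, including anything inside script or style elements, and does no line wrapping, so it only suits simple pages like the FAQ described in the question.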
