How can I write an HTML parser in C?


Question




Hi

I am fairly familiar with C, but not very experienced.

I want to know how I can write an HTML parser in C that only looks for
the image files referenced in an HTML file and displays or prints
all the images found in that file.

How do I go about it?

Should I open the file with a file pointer, store the HTML file in an array
first, and then look for the img src,
e.g. by doing some string comparisons?

Is there a sample on the net (nothing fancy, just a simple one) that I can
look at to give me an idea of what I need to do?

Thanks again

Recommended answer

"WUV999U" <us**************@gmail.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...

I want to know how I can write a html parser in C that only parses for
the image file in the html file and display or print
all the images found in the html file.




You need to detect and print all img references that are found in an HTML
file? I'd use Perl, not C.


Well,
that was a great suggestion, Jarmo. As you said, I can use Perl.
But I'm afraid I'm not used to it.
I need to get this done in a day or so.
If I use C, how do I go about it?

Thanks


WUV999U wrote:

Hi

I am fairly familiar in C but not much.

I want to know how I can write a html parser in C that only parses for
the image file in the html file and display or print
all the images found in the html file.

How to go about it?

Should I have a file pointer and store the html file into an array
first and then look for the img src..
like do some string compare...

That's certainly a valid approach, if you are sure you have the
RAM to get the whole HTML file into memory (for this to be a problem,
it would have to be a tiny computer and one mother of a Web page!).
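
As a rough illustration of the "whole file into memory" approach, here is a minimal sketch in standard C. The helper name read_whole_file and the 4096-byte starting size are assumptions made for this example, not something from the thread. It grows the buffer with realloc as it reads, so it does not need to know the file size up front.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a malloc'd, NUL-terminated buffer.
 * Returns NULL on failure; the caller must free() the result.
 * `read_whole_file` is a hypothetical helper name used for illustration. */
char *read_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, len = 0;      /* assumed starting capacity */
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        /* Grow when full, always keeping one byte spare for the NUL. */
        if (len + 1 >= cap) {
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                fclose(fp);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        size_t n = fread(buf + len, 1, cap - len - 1, fp);
        len += n;
        if (n == 0)                  /* end of file or read error */
            break;
    }

    if (ferror(fp)) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[len] = '\0';                 /* lets strchr() and friends work */
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

One caveat with this sketch: if the file contains embedded NUL bytes, string functions such as strchr will stop at the first one, even though len_out reports the full length.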

Use strchr to find a '<' character. Now you know you have a tag.
I don't recall whether whitespace is allowed before the 'i' or 'I'
of img; to be certain, skip past whitespace. isspace() will help
you there. When you get past the whitespace, compare the next
three characters, case-insensitively, to "img". If you have a
match, press on and look for "src", which isn't necessarily just
one whitespace away from "img", so be careful. Don't forget it
might be "SRC" or even "sRc". The rest of this bit should be
obvious.
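
Standard C has no built-in case-insensitive comparison (strncasecmp is POSIX, not ISO C), so the two small steps described above could be sketched as follows. The names starts_with_nocase and skip_space are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if the first n characters of s equal the first n characters
 * of word, ignoring case; return 0 otherwise. Stops safely if either
 * string ends before n characters. Hypothetical helper name. */
int starts_with_nocase(const char *s, const char *word, size_t n)
{
    assert(s != NULL && word != NULL);
    for (size_t i = 0; i < n; i++) {
        if (s[i] == '\0' ||
            tolower((unsigned char)s[i]) != tolower((unsigned char)word[i]))
            return 0;
    }
    return 1;
}

/* Return a pointer to the first non-whitespace character at or after p.
 * Hypothetical helper name. */
const char *skip_space(const char *p)
{
    assert(p != NULL);
    while (isspace((unsigned char)*p))
        p++;
    return p;
}
```

The casts to unsigned char matter: passing a negative char value to isspace() or tolower() is undefined behaviour, and HTML files often contain bytes above 127.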

If the first non-whitespace char after '<' is /not/ 'i' or 'I',
simply look for another '<'.

Keep going until you run out of file.
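
Putting the steps above together, one possible sketch of the scanning loop looks like this. It is not a real HTML parser: the function names (match_nocase, next_img_src, print_img_sources) are invented for the example, and it assumes the input is a NUL-terminated string. It will be fooled by comments, scripts, or attribute values that themselves contain '>' or the text " src=".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* 1 if s begins with word, ignoring case (hypothetical helper). */
static int match_nocase(const char *s, const char *word)
{
    for (; *word != '\0'; s++, word++)
        if (tolower((unsigned char)*s) != tolower((unsigned char)*word))
            return 0;
    return 1;
}

/* Find the next <img ... src=...> at or after p. On success, store the
 * start and length of the src value and return where to resume scanning.
 * Return NULL when there are no more. Hypothetical helper. */
const char *next_img_src(const char *p, const char **val, size_t *len)
{
    assert(p != NULL && val != NULL && len != NULL);

    while ((p = strchr(p, '<')) != NULL) {
        p++;                                    /* step past '<' */
        while (isspace((unsigned char)*p))
            p++;
        if (!match_nocase(p, "img") || !isspace((unsigned char)p[3]))
            continue;                           /* not img: next '<' */
        p += 3;

        /* Walk the tag's attributes until '>' or end of input. */
        for (; *p != '\0' && *p != '>'; p++) {
            /* Require whitespace before "src" so "data-src" is skipped. */
            if (!isspace((unsigned char)p[-1]) || !match_nocase(p, "src"))
                continue;
            const char *q = p + 3;
            while (isspace((unsigned char)*q))
                q++;
            if (*q != '=')
                continue;
            q++;
            while (isspace((unsigned char)*q))
                q++;
            if (*q == '"' || *q == '\'') {      /* quoted value */
                char quote = *q++;
                const char *end = strchr(q, quote);
                if (end == NULL)
                    return NULL;                /* unterminated quote */
                *val = q;
                *len = (size_t)(end - q);
                return end + 1;
            }
            const char *end = q;                /* unquoted value */
            while (*end != '\0' && *end != '>' &&
                   !isspace((unsigned char)*end))
                end++;
            *val = q;
            *len = (size_t)(end - q);
            return end;
        }
    }
    return NULL;
}

/* Print every img src found in html, one per line; return the count. */
int print_img_sources(const char *html, FILE *out)
{
    const char *val;
    size_t len;
    int count = 0;

    while ((html = next_img_src(html, &val, &len)) != NULL) {
        fprintf(out, "%.*s\n", (int)len, val);
        count++;
    }
    return count;
}
```

To use it, pass the buffer holding the file contents to print_img_sources(buffer, stdout). Returning a pointer and length rather than copying the value keeps the sketch free of extra allocations.
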
Is there a sample on the net(not a hifi code,, a simple one) that I can
look at to give me an idea on what I need to do.




Have a go at it yourself. If you get stuck, post your best-effort
code here, and I expect someone will help you get unstuck again.

