C programming task, HTML source file


Question

So I have this task: I have the source file of, for example, a news website, which contains meta tags like <meta name="author" content="Go Outside">. As you can imagine, that source file contains a lot of information. My task is to find the meta author tag and print its content to the screen; here that would be "Go Outside". I have no idea how to even start. One idea I had was to scan 18 characters at a time and check whether they are the required meta tag, but that doesn't work as I expected:

   while(feof(src_file) == 0){
      char key[18];
      int i = 0;
      while (i < 18 && (feof(src_file) == 0)){
         key[i] = fgetc(src_file);
         printf("%c", key[i]);
         i++;
      }
      printf("\n%s", key);
   }

The problem is that it prints out rubbish at the printf("\n%s", key) line.

Your help would be appreciated. I have been working and studying for 10 hours straight, so you might be able to save me from going mad. Thanks.

Recommended answer

You are not zero-terminating the char array. It must be zero-terminated before it can be handled as a string and printed.

Modify your code either like so:

...
{
  char key[18 + 1]; /* add one for the zero-termination */
  memset(key, 0, sizeof(key)); /* zero out the whole array, so there is no need to add any zero-terminator in any case */ 
  ...

or like so:

...
{
  char key[18 + 1]; /* add one for the zero-termination */

  ... /* read here */

  key[18] = '\0'; /* set zero terminator */
  printf("\n%s", key);
  ...
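
One caveat with the second variant: if fewer than 18 characters were read (for example in the last chunk of the file), the bytes between the last character read and key[18] are still uninitialized. A minimal sketch that terminates right after the last byte actually read is shown below. The helper name read_chunk and the use of fread() instead of the fgetc() loop are assumptions made for illustration, not part of the original answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to cap - 1 bytes from fp into buf and
   zero-terminate right after the last byte actually read.
   Returns the number of bytes read (0 at end-of-file or on error). */
size_t read_chunk(FILE *fp, char *buf, size_t cap)
{
  assert(fp != NULL && buf != NULL && cap > 0);

  size_t n = fread(buf, 1, cap - 1, fp);
  buf[n] = '\0'; /* n <= cap - 1, so this index is always in bounds */
  return n;
}
```

With char key[18 + 1], calling read_chunk(src_file, key, sizeof key) in a loop until it returns 0 would print each chunk safely with printf("%s", key), short final chunk included.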


Update:

As mentioned in my comment on your question, there is "another story" related to the way feof() is used, which is wrong.

Note that the read loop ends only after EOF has already been read, whether because of an error or a real end-of-file. This EOF pseudo character is then added to the character array holding the result of the reads.
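
To make the off-by-one visible, here is a small sketch that counts loop iterations for both styles. The function names count_feof_style and count_checked are hypothetical and exist only for this illustration. For a file holding 3 bytes, the feof()-first loop runs 4 times, because the end-of-file indicator is set only after a read has already failed.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the loop shape from the question: feof() is tested before each
   read, so the iteration in which fgetc() returns EOF is counted too. */
long count_feof_style(FILE *fp)
{
  assert(fp != NULL);

  long n = 0;
  while (feof(fp) == 0)
  {
    (void)fgetc(fp); /* may return EOF, which the loop does not notice yet */
    n++;
    if (ferror(fp))
      break; /* on a read error feof() stays 0; avoid looping forever */
  }
  return n;
}

/* Tests the return value of fgetc() itself, so EOF is never counted. */
long count_checked(FILE *fp)
{
  assert(fp != NULL);

  long n = 0;
  while (fgetc(fp) != EOF)
    n++;
  return n;
}
```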

You might like to use the following construct to read:

{
  int c = 0;
  do
  {
    char key[18 + 1];
    memset(key, 0, sizeof(key));

    size_t i = 0;
    while ((i < 18) && (EOF != (c = fgetc(src_file))))
    {
       key[i] = c;
       printf("%c", key[i]);
       i++;
    }

    printf("\n%s\n", key);
  } while (EOF != c);
}
/* Arriving here means fgetc() returned EOF. As this can either mean end-of-file was
   reached **or** an error occurred, ferror() is called to find out what happened: */
if (ferror(src_file))
{
  fprintf(stderr, "fgetc() failed.\n");
}

For a detailed discussion of this, you might like to read this question and its answers.
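
As for the original task of extracting the author, fixed 18-character chunks are awkward because the tag can straddle a chunk boundary. One possible approach, sketched here under stated assumptions, is to load the file into a zero-terminated buffer first and then search that buffer. The helper names find_ci and find_meta_author are hypothetical. The sketch assumes double-quoted attribute values, no '>' character inside the tag's attributes, and a buffer that already holds the whole file; it is not a general HTML parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive search for needle inside the range [hay, end). */
static const char *find_ci(const char *hay, const char *end, const char *needle)
{
  size_t n = strlen(needle);
  for (; (size_t)(end - hay) >= n; hay++)
  {
    size_t i = 0;
    while (i < n &&
           tolower((unsigned char)hay[i]) == tolower((unsigned char)needle[i]))
      i++;
    if (i == n)
      return hay;
  }
  return NULL;
}

/* Hypothetical helper: copy the content="..." value of the first
   <meta name="author" ...> tag in html into out (truncated to fit).
   Returns 1 on success, 0 if no such tag was found. */
int find_meta_author(const char *html, char *out, size_t cap)
{
  assert(html != NULL && out != NULL && cap > 0);

  const char *end = html + strlen(html);
  const char *tag = html;

  while ((tag = find_ci(tag, end, "<meta")) != NULL)
  {
    const char *close = strchr(tag, '>');
    if (close == NULL)
      break; /* unterminated tag: give up */

    if (find_ci(tag, close, "name=\"author\"") != NULL)
    {
      const char *val = find_ci(tag, close, "content=\"");
      if (val != NULL)
      {
        val += strlen("content=\"");
        const char *quote = memchr(val, '"', (size_t)(close - val));
        if (quote != NULL)
        {
          size_t len = (size_t)(quote - val);
          if (len >= cap)
            len = cap - 1; /* keep room for the zero terminator */
          memcpy(out, val, len);
          out[len] = '\0';
          return 1;
        }
      }
    }
    tag = close + 1; /* continue after this tag */
  }
  return 0;
}
```

Given a buffer containing <meta name="author" content="Go Outside">, the function fills out with Go Outside and returns 1. The matching is case-insensitive, so <META NAME="author" ...> is found as well.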
