Parse HTML in PHP and return JSON


Question

I am using PHP Simple HTML DOM Parser in my PHP script to parse information from a website into a JSON object. In the end, my JSON object should be formatted like this:

An array with at most 5 objects (Monday to Friday), or fewer (Tuesday to Friday, etc.).

Each of these objects should have two arrays, one called food1 and one called food2. Both arrays should contain multiple food names and their prices. I think in JSON it would look like this:

{
  "day" : [
    {
      "food1" : [
        {
          "price" : "1.00",
          "foodname" : "test"
        },
        {
          "price" : "1.00",
          "foodname" : "test"
        }
      ],
      "food2" : [
        {
          "price" : "2.00",
          "foodname" : "test2"
        },
        {
          "price" : "2.00",
          "foodname" : "test2"
        }
      ]
    },
    {
      "food1" : [
        {
          "price" : "1.00",
          "foodname" : "test"
        },
        {
          "price" : "1.00",
          "foodname" : "test"
        }
      ],
      "food2" : [
        {
          "price" : "2.00",
          "foodname" : "test2"
        },
        {
          "price" : "2.00",
          "foodname" : "test2"
        }
      ]
    },
    {
      "food1" : [
        {
          "price" : "1.00",
          "foodname" : "test"
        },
        {
          "price" : "1.00",
          "foodname" : "test"
        }
      ],
      "food2" : [
        {
          "price" : "2.00",
          "foodname" : "test2"
        },
        {
          "price" : "2.00",
          "foodname" : "test2"
        }
      ]
    },
    {
      "food1" : [
        {
          "price" : "1.00",
          "foodname" : "test"
        },
        {
          "price" : "1.00",
          "foodname" : "test"
        }
      ],
      "food2" : [
        {
          "price" : "2.00",
          "foodname" : "test2"
        },
        {
          "price" : "2.00",
          "foodname" : "test2"
        }
      ]
    },
    {
      "food1" : [
        {
          "price" : "1.00",
          "foodname" : "test"
        },
        {
          "price" : "1.00",
          "foodname" : "test"
        }
      ],
      "food2" : [
        {
          "price" : "2.00",
          "foodname" : "test2"
        },
        {
          "price" : "2.00",
          "foodname" : "test2"
        }
      ]
    }
  ]
}
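
As a minimal sketch of how the structure above could be produced in PHP, nested arrays can be passed to json_encode. The helper name menuItem and the placeholder values are assumptions made for illustration; they are not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): builds one menu entry.
function menuItem(string $name, string $price): array
{
    return ['foodname' => $name, 'price' => $price];
}

// Placeholder data mirroring the example above (only two days shown).
$days = [
    [
        'food1' => [menuItem('test', '1.00'), menuItem('test', '1.00')],
        'food2' => [menuItem('test2', '2.00'), menuItem('test2', '2.00')],
    ],
    [
        'food1' => [menuItem('test', '1.00')],
        'food2' => [menuItem('test2', '2.00')],
    ],
];

// List-style arrays are encoded as JSON arrays,
// string-keyed arrays are encoded as JSON objects.
$json = json_encode(['day' => $days], JSON_PRETTY_PRINT);
echo $json, PHP_EOL;
```

Because the top-level "day" value is a plain list, it can hold five entries or fewer without any change to the encoding step.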

Anyway, I have previously only worked with Objective-C and am having trouble solving this problem in PHP. I have also implemented a working parser in Objective-C, but if the site changes its structure I would have to re-submit the whole app. That's why I want to build a web service where I can change the parser dynamically, outside of the app. All I have so far is this:

<?php
include('simple_html_dom.php');

$opts = array('http'=>array('header' => "User-Agent:MyAgent/1.0\r\n"));
$context = stream_context_create($opts);
$html = file_get_html('http://www.studentenwerk-karlsruhe.de/de/essen/?view=ok&STYLE=popup_plain&c=erzberger&p=1&kw=3',false,$context);

foreach($html->find('b') as $e) 
    echo $e;

?>

This gives me all the food names, but they are not grouped by day, nor by the different food menus (there are two different menus each day, called food1 and food2 in my example JSON object).

In my Objective-C parser I just created a new day object whenever the food name was "SchniPoSa" and added all the following food names to food1 until the food name "Salatbuffet" appeared; that one and all the following food names went into the food2 array until the next "SchniPoSa". But this isn't very good, because the structure could change every day.
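
For illustration, the marker-based grouping described above might look roughly like this in PHP. The function name groupByMarkers and the sample names are hypothetical, and the sketch assumes the "SchniPoSa" entry itself is stored in food1; the same fragility applies if the menu changes.

```php
<?php
/**
 * Groups a flat list of food names into days (hypothetical helper).
 * Assumptions: $dayMarker starts a new day and is stored in food1;
 * $menuMarker switches to food2 and is stored there.
 */
function groupByMarkers(
    array $names,
    string $dayMarker = 'SchniPoSa',
    string $menuMarker = 'Salatbuffet'
): array {
    $days = [];
    $current = null;   // index of the day currently being filled
    $target = 'food1'; // list that receives the next name

    foreach ($names as $name) {
        if ($name === $dayMarker) {
            // A day marker opens a new day and resets the target list.
            $days[] = ['food1' => [], 'food2' => []];
            $current = count($days) - 1;
            $target = 'food1';
        } elseif ($name === $menuMarker) {
            $target = 'food2';
        }
        if ($current === null) {
            continue; // ignore names that appear before the first day marker
        }
        $days[$current][$target][] = $name;
    }

    return $days;
}

// Made-up sample input.
$names = ['SchniPoSa', 'Pasta', 'Salatbuffet', 'Soup', 'SchniPoSa', 'Pizza', 'Salatbuffet'];
echo json_encode(['day' => groupByMarkers($names)], JSON_PRETTY_PRINT), PHP_EOL;
```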

Also, I do not even know how to implement that in PHP. My little PHP script also doesn't parse the prices, which are in the tag <span class="bgp price_1">.
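
One possible way to pair each food name with its price is to walk the table rows with PHP's built-in DOMDocument and DOMXPath. The sample markup below is a hypothetical simplification (a name in a b element and a price in a span with class price_1 in the same row); the real page's structure may differ.

```php
<?php
// Hypothetical sample markup; the real page may be structured differently.
$html = <<<HTML
<table class="easy-tab-dot">
  <tr><td><b>Pasta</b></td><td><span class="bgp price_1">2.60</span></td></tr>
  <tr><td><b>Soup</b></td><td><span class="bgp price_1">1.20</span></td></tr>
  <tr><td>Row without a name or price</td></tr>
</table>
HTML;

libxml_use_internal_errors(true); // tolerate imperfect real-world HTML
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Matches elements whose class list contains the token "price_1".
$priceQuery = './/span[contains(concat(" ", normalize-space(@class), " "), " price_1 ")]';

$items = [];
foreach ($xpath->query('//table[@class="easy-tab-dot"]//tr') as $row) {
    // Queries starting with "." are evaluated relative to the current row.
    $name = $xpath->query('.//b', $row)->item(0);
    $price = $xpath->query($priceQuery, $row)->item(0);
    if ($name === null || $price === null) {
        continue; // skip rows that lack a name or a price
    }
    $items[] = [
        'foodname' => trim($name->textContent),
        'price' => trim($price->textContent),
    ];
}

echo json_encode($items, JSON_PRETTY_PRINT), PHP_EOL;
```

Querying relative to each row keeps a name and its price together, which a flat search for all b elements cannot do.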

Here is the website from which I want to parse the information:

http://www.studentenwerk-karlsruhe.de/de/essen/?view=ok&STYLE=popup_plain&c=erzberger&p=1&kw=3

Can anyone help me parse the information into a valid JSON object like the one I described above?

Recommended answer

Just saw your message and realised I hadn't gotten back to you about this. Maybe this will lead you in the right direction:

<?php

// Send a User-Agent header along with the request.
$opts = array('http'=>array('header' => "User-Agent:MyAgent/1.0\r\n"));
$context = stream_context_create($opts);
$html = file_get_contents('http://www.studentenwerk-karlsruhe.de/de/essen/?view=ok&STYLE=popup_plain&c=erzberger&p=1&kw=3',false,$context);
if ($html === false) {
    exit("Could not fetch the page\n");
}

// Collect HTML parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every table with the class "easy-tab-dot".
$nodes = $xpath->query("//table[@class='easy-tab-dot']");
//header("Content-type: text/plain");

foreach ($nodes as $i => $node) {
    $arr = array();

    $children = $node->childNodes;
    foreach ($children as $child) {
        // Copy the child into its own document so its HTML can be dumped separately.
        $tmp_doc = new DOMDocument();
        $tmp_doc->appendChild($tmp_doc->importNode($child,true));
        #echo $tmp_doc->saveHTML();
        print_r( $child );
    }
    echo "#######################################################################################";
}
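
As a side note, the temporary document in the loop above is only needed to dump a child's markup. A shorter variant, sketched here on a made-up one-row table, passes the node directly to DOMDocument::saveHTML, which accepts an optional node argument.

```php
<?php
libxml_use_internal_errors(true); // keep parse warnings out of the output
$dom = new DOMDocument();
// Made-up sample markup standing in for the fetched page.
$dom->loadHTML('<table class="easy-tab-dot"><tr><td><b>Pasta</b></td></tr></table>');
$xpath = new DOMXPath($dom);
$table = $xpath->query("//table[@class='easy-tab-dot']")->item(0);

$markup = '';
foreach ($table->childNodes as $child) {
    // saveHTML($node) serializes just that node and its descendants.
    $markup .= $dom->saveHTML($child);
}
echo $markup, PHP_EOL;
```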
