Parsing specific data items from a website


Question

I am trying to parse the following data items from a website:

  • Address
  • City
  • State
  • Zip Code
  • Store Phone
  • Pharmacy Phone
  • Open Hours
  • Pharmacy Hours
  • Pickup Options
  • At this store/location
  • Site to Store Hours

I tried it this way, but I can't separate out the individual pieces of data to store in the variables listed above, so I need some help and suggestions from a PHP expert.

    // file_get_html() is provided by the PHP Simple HTML DOM Parser library,
    // which must be loaded before this code runs.
    $html = file_get_html('http://www.walmart.com/storeLocator/ca_storefinder_results.do?serviceName=&rx_title=com.wm.www.apps.storelocator.page.serviceLink.title.default&rx_dest=%2Findex.gsp&sfrecords=50&sfsearch_single_line_address=K6T');
    foreach ($html->find('div[class=StoreAddress] div[1]') as $name) {
        echo $name->innertext . '<br>';
    }
    

The HTML of this website makes it hard to identify each data item by its tag, because the tags have no proper ids assigned. Can anyone suggest an easy and scalable way to parse the data items above from this website?

Thanks

Recommended answer

The HTML isn't really that complex. PHP's iterators and DOM/regex functions are clumsy for tasks like this, but it can be done:

    $dom = new DOMDocument();
    // @ suppresses the warnings libxml raises on malformed real-world HTML.
    @$dom->loadHTMLFile('http://www.walmart.com/storeLocator/ca_storefinder_details_short.do?rx_dest=/index.gsp&rx_title=com.wm.www.apps.storelocator.page.serviceLink.title.default&edit_object_id=2092&sfsearch_single_line_address=K6T');
    $xpath = new DOMXPath($dom);

    foreach ($xpath->query('//div[@class="StoreAddress"]') as $div) {
        // title
        echo $xpath->query(".//div[1]", $div)->item(0)->nodeValue . "\n";
        // street
        echo $xpath->query(".//div[2]", $div)->item(0)->nodeValue . "\n";
        // city, state and zip; print them only if the line matches the pattern
        if (preg_match('/(.*), ([A-Z]{2}) (\d{5})/', $xpath->query(".//div[3]", $div)->item(0)->nodeValue, $m)) {
            echo $m[1] . "\n"; // city
            echo $m[2] . "\n"; // state
            echo $m[3] . "\n"; // zip
        }
    }
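
To store the values in variables rather than echo them, the same XPath approach can be wrapped in a function that returns one array per store. The following is only a minimal sketch under stated assumptions: the sample markup, the function name `parseStoreAddresses`, and the array keys are hypothetical and not taken from the real page, and the pattern is widened to accept either a five-digit US ZIP or a Canadian postal code, since the search term in the question (`K6T`) looks like a Canadian prefix.

```php
<?php
// Hypothetical markup modeled on the div layout the answer assumes;
// the real page may differ.
$sampleHtml = <<<HTML
<html><body>
<div class="StoreAddress">
  <div>Example Supercentre</div>
  <div>123 Example Rd</div>
  <div>Exampleville, ON K6T 1A1</div>
</div>
</body></html>
HTML;

/**
 * Return one associative array per div with class "StoreAddress".
 * Function name and array keys are illustrative choices.
 */
function parseStoreAddresses(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $stores = [];

    foreach ($xpath->query('//div[@class="StoreAddress"]') as $div) {
        // Text of the n-th child div, or '' when that div is missing.
        $text = function (int $n) use ($xpath, $div): string {
            $node = $xpath->query('./div[' . $n . ']', $div)->item(0);
            return $node === null ? '' : trim($node->nodeValue);
        };

        $store = [
            'name'        => $text(1),
            'street'      => $text(2),
            'city'        => null,
            'state'       => null,
            'postal_code' => null,
        ];

        // "City, ST 12345", "City, ST 12345-6789" or "City, ON K6T 1A1".
        $pattern = '/^(.*),\s*([A-Z]{2})\s+(\d{5}(?:-\d{4})?|[A-Z]\d[A-Z] ?\d[A-Z]\d)$/';
        if (preg_match($pattern, $text(3), $m)) {
            $store['city']        = $m[1];
            $store['state']       = $m[2];
            $store['postal_code'] = $m[3];
        }

        $stores[] = $store;
    }

    return $stores;
}

$stores = parseStoreAddresses($sampleHtml);
print_r($stores);
```

Fields whose line does not match the pattern stay `null`, so a caller can tell a missing value from an empty one. The remaining items from the question (phone numbers, hours, pickup options) would need their own XPath expressions, which depend on markup not shown here.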
    
