How to scrape data from a page generated by an asmx web service


Question

I have been searching the net but found nothing useful. I need to update my product prices from the supplier's website automatically, and I want to scrape the information for all products on a category page at once.

I used the Simple HTML DOM method to get the data. When I used the tags I found with the Firefox Firebug extension to retrieve the prices, it printed nothing. I tried printing all the links on that category page, and none of them were product links. When I viewed the page source by right-clicking on the page, I saw no code related to products. The div is empty, like this:

<div class="coll-2 fleft"> </div>

In Firebug, however, the same div was full of markup. Then I saw that a JS file contains this code:

function GetProductListHeader() {
var startPage = GetStartPage();
if (pageName == 'kategori' || pageName == 'reyon') {
    var BrandList = GetQueryStringByName("Brand");
    var ColorList = GetQueryStringByName("Color");
    var PropList = GetQueryStringByName("propid");
    var ItemDim1CodeList = GetQueryStringByName("vcode");
    var QPrice = GetQueryStringByName("price");
    var cFilter = GetQueryStringByName("cfilter");

    var parametre = { PageName: pageName, pUrl: PageUrl, BrandList: BrandList, ColorList: ColorList, ItemDim1CodeList: ItemDim1CodeList, PropList: PropList, QPrice: QPrice, cFilter: cFilter, startPage: startPage };
    $.ajax(
        {
            url: '/WS/wsProduct.asmx/GetProductListHeader',
            type: 'POST',
            processData: false,
            contentType: 'application/json; charset=utf-8',
            data: JSON.stringify(parametre),
            dataType: 'json',
            async: true
        })
        .done(function (e) {
            if (e.d != "") {  
                $('.coll-2').html(e.d);
                GetProductList(startPage);
            }
        })
}
}

Is there any way to get this data with PHP?

Thanks.

After copying the request as a cURL command from Chrome's network tab, I tried to set it up with the script below:

$html = 'curl "http://bebekbayi.com/WS/wsProduct.asmx/GetProductList" \ 
    -H "Cookie: ASP.NET_SessionId=wy5hyt1bujcrdka2hpbp2wnm; _gat=1; _ga=GA1.2.1204447549.1447830812" \ 
    -H "Origin: http://bebekbayi.com" \ 
    -H "Accept-Encoding: gzip, deflate" \ 
    -H "Accept-Language: tr-TR,tr;q=0.8,en-US;q=0.6,en;q=0.4" \ 
    -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36" \ 
    -H "Content-Type: application/json; charset=UTF-8" \ 
    -H "Accept: application/json, text/javascript, */*; q=0.01" \ 
    -H "Cache-Control: max-age=0" \ 
    -H "X-Requested-With: XMLHttpRequest" \ 
    -H "Connection: keep-alive" \ 
    -H "Referer: http://bebekbayi.com/kategori/bakim-cantalari" 
    --data-binary "{""PageName"":""kategori"",""pUrl"":""bakim-cantalari"",""pIndex"":1,""BrandList"":"""",""ColorList"":"""",""ItemDim1CodeList"":"""",""PropList"":"""",""QPrice"":"""",""cFilter"":""""}" --compressed';
exec($html,$result);
   foreach($result as $res){

       echo $res . '<br>'; 
   }

It returned: [InvalidOperationException: Request format is unrecognized for URL unexpectedly ending in '/GetProductList'.]

Recommended answer

I think your task gets easier now that you can query the data source directly.

What you can do is take the full URL of the web service and make a cURL call from PHP.

You will then get the response. Generally it will be XML, but that depends on how the web service is written; here the page's own JavaScript requests JSON (dataType: 'json').
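The JavaScript in the question reads `e.d` and injects it into the page, which suggests that the JSON response wraps an HTML fragment in a `d` property. A minimal sketch of unwrapping that property and reading the fragment with PHP's DOM extension might look like the following. The helper names `extractHtml()` and `listLinks()` are hypothetical, the sample JSON is made up for illustration, and the real markup returned by the service is not known here.

```php
<?php
// Sketch only: assumes the response has the shape {"d": "<html fragment>"}
// and that PHP's DOM extension is enabled.

// Hypothetical helper: return the HTML string stored under "d", or null.
function extractHtml(string $json): ?string
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded) || !isset($decoded['d']) || !is_string($decoded['d'])) {
        return null;
    }
    return $decoded['d'];
}

// Hypothetical helper: collect the href of every <a> element in a fragment.
function listLinks(string $html): array
{
    libxml_use_internal_errors(true); // tolerate imperfect markup
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Made-up sample standing in for a real response body.
$sample = '{"d":"<div class=\"product\"><a href=\"/urun/1\">Item</a></div>"}';
$html = extractHtml($sample);
if ($html !== null) {
    print_r(listLinks($html));
}
```

Prices could be read the same way once the class names in the real fragment are known, for example by inspecting the decoded `d` string first.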

Here is the code, which runs the curl command from PHP:

// Command copied from the browser's network tab; the single quotes assume a Unix-like shell.
$html = "curl 'http://bebekbayi.com/WS/wsProduct.asmx/GetProductList' -H 'Origin: http://bebekbayi.com' -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: en-US,en;q=0.8' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36' -H 'Content-Type: application/json; charset=UTF-8' -H 'Accept: application/json, text/javascript, */*; q=0.01' -H 'Referer: http://bebekbayi.com/reyon/Anne' -H 'X-Requested-With: XMLHttpRequest' -H 'Connection: keep-alive' --data-binary '{\"PageName\":\"reyon\",\"pUrl\":\"Anne\",\"pIndex\":1,\"BrandList\":\"\",\"ColorList\":\"\",\"ItemDim1CodeList\":\"\",\"PropList\":\"\",\"QPrice\":\"\",\"cFilter\":\"\"}' --compressed";
exec($html, $result);
// exec() fills $result with one array element per output line, so join them before decoding.
$obj = json_decode(implode("", $result), true);
print_r($obj);
exit;
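As an alternative to shelling out with exec(), the same POST could be sent with PHP's cURL extension, which avoids shell quoting problems entirely. The sketch below assumes the extension is installed; `buildPayload()` and `fetchProductList()` are hypothetical helper names, and the field names and headers are copied from the request above. Whether the service needs the remaining headers (cookies, Referer, User-Agent) is not known here.

```php
<?php
// Sketch only: assumes the cURL extension is available.

// Hypothetical helper: build the JSON body with the fields seen in the request above.
function buildPayload(string $pageName, string $pUrl, int $pIndex = 1): string
{
    return json_encode([
        'PageName' => $pageName,
        'pUrl' => $pUrl,
        'pIndex' => $pIndex,
        'BrandList' => '',
        'ColorList' => '',
        'ItemDim1CodeList' => '',
        'PropList' => '',
        'QPrice' => '',
        'cFilter' => '',
    ]);
}

// Hypothetical helper: POST the payload and return the decoded JSON, or null on failure.
function fetchProductList(string $url, string $payload): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => $payload,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_ENCODING => '',         // accept compressed responses, like --compressed
        CURLOPT_HTTPHEADER => [
            'Content-Type: application/json; charset=UTF-8',
            'Accept: application/json, text/javascript, */*; q=0.01',
            'X-Requested-With: XMLHttpRequest',
        ],
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? $decoded : null;
}

// Usage (performs a network request, so it is left commented out):
// $data = fetchProductList(
//     'http://bebekbayi.com/WS/wsProduct.asmx/GetProductList',
//     buildPayload('reyon', 'Anne')
// );
// print_r($data);
```

Building the body with json_encode() rather than a hand-escaped string keeps the quoting correct and makes it easy to loop over pIndex values or category slugs.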
