How to process the XML using XmlLite returned by the Casablanca (PPL) http_client?

Question

I want to make a request to a web service, get the XML content, and parse it to extract specific values returned by the service.

The code is to be written in native C++11 (MS Visual Studio 2013). The Casablanca PPL library was chosen for the HTTP request, and XmlLite was chosen for XML parsing.

I am used to C++ programming; however, the async-task style of programming from the PPL library is new to me. I know what asynchronous programming is, and I know the principles of parallel programming. However, I am not used to using continuations (.then(...)), and I am only slowly wrapping my head around the concept.

So far, I have modified the examples to get the XML result and write it into the text file:

// Open a stream to the file to write the HTTP response body into.
auto fileBuffer = std::make_shared<concurrency::streams::streambuf<uint8_t>>();
file_buffer<uint8_t>::open(L"test.xml", std::ios::out)
    .then([=](concurrency::streams::streambuf<uint8_t> outFile) -> pplx::task < http_response >
{
    *fileBuffer = outFile;

    // Create an HTTP request.
    // Encode the URI query since it could contain special characters like spaces.
    // Create http_client to send the request.
    http_client client(L"http://api4.mapy.cz/");

    // Build request URI and start the request.
    uri_builder builder(L"/geocode");
    builder.append_query(L"query", address);

    return client.request(methods::GET, builder.to_string());
})

    // Write the response body into the file buffer.
    .then([=](http_response response) -> pplx::task<size_t>
{
    printf("Response status code %u returned.\n", response.status_code());

    return response.body().read_to_end(*fileBuffer);
})

    // Close the file buffer.
    .then([=](size_t)
{
    return fileBuffer->close();
})

    // Wait for the entire response body to be written into the file.
    .wait();

Now, I need to understand how to modify the code to get a result that could be consumed by XmlLite (the Microsoft implementation that comes as xmllite.h, xmllite.lib, and xmllite.dll). I know what pull parsers are. But again, I am very new to the library. I am still a bit lost among the PPL-related streams and other classes, and I do not know how to use them correctly. Any explanation is highly welcome.

The Casablanca people say they use XmlLite with Casablanca to process the results, but I did not find any example. Can you point me to some? Thanks.

Update (4th June 2014): The above code is actually wrapped in a function like this (wxString comes from wxWidgets, but one can easily replace it with std::string or std::wstring):

std::pair<double, double> getGeoCoordinatesFor(const wxString & address)
{
    ...the above code...
    ...here should be the XML parsing code...
    return {longitude, latitude};
}

The actual goal is to feed the XmlLite parser directly instead of writing the stream to the test.xml file. The XML is rather small. It contains one or more item elements (more than one if the address is ambiguous) with the x and y attributes that I want to extract, like this:

<?xml version="1.0" encoding="utf-8"?>
<result>
    <point query="Vítězství 27, Olomouc">
        <item
                x="17.334045"
                y="49.619723"
                id="9025034"
                source="addr"
                title="Vítězství 293/27, Olomouc, okres Olomouc, Česká republika"
        />
        <item
                x="17.333067"
                y="49.61618"
                id="9024797"
                source="addr"
                title="Vítězství 27/1, Olomouc, okres Olomouc, Česká republika"
        />
    </point>
</result>

I do not need that test.xml file. How do I get the stream, and how do I redirect it to the XmlLite parser?

Recommended answer

I haven't used Casablanca yet, so this may be a little off. (I'd love to work with Casablanca, but I'll have to scrape together more time first.) That said, it looks like the code you show will download an xml file and save it to a local file test.xml. From that point it's straightforward to load the file into XmlLite if the xml file is encoded in UTF-8. If it's not UTF-8, you will have to jump through some additional hoops to decode it, either in memory or via CreateXmlReaderInputWithEncodingName or CreateXmlReaderInputWithCodePage, and I won't cover that here.

Once you've got your UTF-8 file, or you're handling encoding, the easiest approach to starting your XML parse using XmlLite is shown on the documentation for CreateXmlReader:

//Open read-only input stream
if (FAILED(hr = SHCreateStreamOnFile(argv[1], STGM_READ, &pFileStream)))
{
    wprintf(L"Error creating file reader, error is %08.8lx", hr);
    return -1;
}

if (FAILED(hr = CreateXmlReader(__uuidof(IXmlReader), (void**) &pReader, NULL)))
{
    wprintf(L"Error creating xml reader, error is %08.8lx", hr);
    return -1;
}

In your case, you want to skip the file, so you'll need to create an IStream in memory. You have three main options:


  1. Treat your string as a memory buffer and use pMemStream = SHCreateMemStream(szData, cbData)
  2. Stream from Casablanca into an IStream created with CreateStreamOnHGlobal(NULL, true, &pMemStream) and then use that as your source after you finish retrieving it
  3. Create an IStream wrapper for Casablanca's concurrency::streams::istream that hides its asynchronicity behind the IStream interface

Once you have your stream, you have to tell your reader about it with IXmlReader::SetInput.

hr = pReader->SetInput(pStream);

Regardless of the above options, I suggest using RAII classes such as ATL's CComPtr<IStream> and CComPtr<IXmlReader> for the variables they show as pFileStream and pReader, or my suggested pMemStream. This is also the point at which to override any reader properties, say if you have to handle deeper nesting than XmlLite allows by default. Then it's all about pull-reading the file. The simplest loop for that is documented on the IXmlReader::Read method; here are some of the most important pieces, but note that I've omitted error detection for readability:

#include <iostream>
#include <xmllite.h>

// Print the node the reader is currently positioned on (element or attribute):
// optional namespace and prefix, then name='value'.
void Summarize(IXmlReader *pReader, LPCWSTR wszType)
{
    LPCWSTR wszNamespaceURI, wszPrefix, wszLocalName, wszValue;
    UINT cchNamespaceURI, cchPrefix, cchLocalName, cchValue;

    pReader->GetNamespaceURI(&wszNamespaceURI, &cchNamespaceURI);
    pReader->GetPrefix(&wszPrefix, &cchPrefix);
    pReader->GetLocalName(&wszLocalName, &cchLocalName);
    pReader->GetValue(&wszValue, &cchValue);
    std::wcout << wszType << L": ";
    if (cchNamespaceURI) std::wcout << L"{" << wszNamespaceURI << L"} ";
    if (cchPrefix)       std::wcout << wszPrefix << L":";
    std::wcout << wszLocalName << L"='" << wszValue << L"'\n";
}

void Parse(IXmlReader *pReader)
{
    HRESULT hr = S_OK;
    XmlNodeType nodeType = XmlNodeType_None;

    // Read through each node until the end
    while (!pReader->IsEOF())
    {
        hr = pReader->Read(&nodeType);
        if (hr != S_OK)
            break;

        switch (nodeType)
        {
            //  : : :

            case XmlNodeType_Element:
                Summarize(pReader, L"BeginElement");
                // Visit every attribute, then return to the owning element
                while (S_OK == pReader->MoveToNextAttribute())
                    Summarize(pReader, L"Attribute");
                pReader->MoveToElement();
                // <item ... /> produces no separate EndElement node
                if (pReader->IsEmptyElement())
                    std::wcout << L"EndElement\n";
                break;

            case XmlNodeType_EndElement:
                std::wcout << L"EndElement\n";
                break;

            //  : : :
         }
    }
}
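
For the XML in the question, the attribute loop would collect the x and y values of each item element rather than print them. Here is a minimal sketch of that bookkeeping, kept independent of XmlLite so it can be tried in isolation. ItemCollector and its member names are hypothetical. The strings passed in are assumed to be what GetLocalName and GetValue return at each step. std::wcstod depends on the C locale for the decimal separator, so this assumes the default "C" locale is in effect:

```cpp
#include <cwchar>
#include <utility>
#include <vector>

// Hypothetical helper (not part of XmlLite): accumulates the x/y attributes
// of each <item> element as a pull loop reports them.
class ItemCollector
{
public:
    // Call when the loop sees XmlNodeType_Element, with the element's local name.
    void beginElement(const wchar_t* localName)
    {
        inItem_ = (std::wcscmp(localName, L"item") == 0);
        hasX_ = hasY_ = false;
    }

    // Call for each attribute inside the MoveToNextAttribute loop.
    void attribute(const wchar_t* name, const wchar_t* value)
    {
        if (!inItem_) return;
        if (std::wcscmp(name, L"x") == 0)      hasX_ = parse(value, x_);
        else if (std::wcscmp(name, L"y") == 0) hasY_ = parse(value, y_);
    }

    // Call after the attribute loop, next to MoveToElement().
    void endAttributes()
    {
        if (inItem_ && hasX_ && hasY_) points_.emplace_back(x_, y_);
        inItem_ = false;
    }

    // Collected (x, y) pairs in document order.
    const std::vector<std::pair<double, double>>& points() const { return points_; }

private:
    // Converts the whole string to a double; rejects empty or trailing garbage.
    static bool parse(const wchar_t* text, double& out)
    {
        wchar_t* end = nullptr;
        const double v = std::wcstod(text, &end);
        if (end == text || *end != L'\0') return false;
        out = v;
        return true;
    }

    bool inItem_ = false, hasX_ = false, hasY_ = false;
    double x_ = 0.0, y_ = 0.0;
    std::vector<std::pair<double, double>> points_;
};
```

In the Parse loop, beginElement would replace the Summarize call for the element, attribute would replace the one for attributes, and endAttributes would follow MoveToElement. Since the values are converted immediately, the pointers returned by GetValue do not need to outlive the next reader call. The first entry of points() could then serve as the function's return value, with the others available when the address is ambiguous.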

Some of the other pieces in that sample code include a check for E_PENDING, which can be relevant if the entire file is not yet available. It would likely be "better" to have the Casablanca http_response::body feed a custom IStream implementation that XmlLite can begin processing in parallel with the download; this discussion thread covers the idea, but doesn't appear to have a canonical solution. In my experience XmlLite is so fast that the delay it causes is not relevant, so processing the complete file may be sufficient, especially if you need the full file anyway before you can finish your processing.

If you need to better integrate this into an asynchronous system, there will be more hoops. Obviously the while loop above is not asynchronous itself. My guess is that the proper way to make it asynchronous will depend heavily on the content of your file and the processing you have to do while reading it, as well as whether you tie it to a custom IStream that may not have all its data available. Since I don't have any experience with Casablanca's asynchronicity, I can't comment usefully on this.

Does this address what you're looking for, or was this the part you already knew and you were looking for the IStream wrapper of Casablanca's http_response::body or tips on making XmlLite's processing asynchronous?
