XPath query filtering by date

Question

I have some sample XML where I am querying for nodes based on a date.

Sample XML document:

<?xml version="1.0" encoding="UTF-16" standalone="yes"?>
<NewDataSet>
    <Table>
        <EmployeeBankGUID>dc396ebe-c8a4-4a7f-85b5-b43c1890d6bc</EmployeeBankGUID>
        <ValidFromDate>2012-02-01T00:00:00-05:00</ValidFromDate>
    </Table>
    <Table>
        <EmployeeBankGUID>2406a5aa-0246-4cd7-bba5-bb17a993042b</EmployeeBankGUID>
        <ValidFromDate>2013-02-01T00:00:00-05:00</ValidFromDate>
    </Table>
    <Table>
        <EmployeeBankGUID>2af49699-579e-4beb-9ab0-a58b4bee3158</EmployeeBankGUID>
        <ValidFromDate>2014-02-01T00:00:00-05:00</ValidFromDate>
    </Table>
</NewDataSet>

So there are basically three dates:

  • 2/1/2012
  • 2/1/2013
  • 2/1/2014

Using MSXML I can query and filter by these dates using an XPath query:

/NewDataSet/Table[ValidFromDate>"2013-02-12"]

And this works, and returns an IXMLDOMNodeList containing one item:

<Table>
    <EmployeeBankGUID>2af49699-579e-4beb-9ab0-a58b4bee3158</EmployeeBankGUID>
    <ValidFromDate>2014-02-01T00:00:00-05:00</ValidFromDate>
</Table>

Except it doesn't work anymore

That XPath query was using MSXML, the variant of XML that Microsoft created in the late 1990s, before the W3C standardized on a completely different form of XPath.

DOMDocument doc = new DOMDocument();
//...load the xml...
IXMLDOMNodeList nodes = doc.selectNodes('/NewDataSet/Table[ValidFromDate>"2013-02-12"]');

But that version of MSXML is not "standards compliant" (since it was created before there were standards). Since 2005 the recommended version, the one that follows the standards and the only one that has the features I require, is MSXML 6.

It's a simple change, just instantiate a DOMDocument60 class rather than a DOMDocument class:

DOMDocument doc = new DOMDocument60();
//...load the xml...
IXMLDOMNodeList nodes = doc.selectNodes('/NewDataSet/Table[ValidFromDate>"2013-02-12"]');

Except the same XPath query returns nothing.

What is the "standards compliant" way to filter a value by date?

Pretend it's a string, you say

You might be thinking that I might be thinking that XML is treating the 2013-02-01T00:00:00-05:00 as some sort of special date, when in reality it's a string. So maybe I should just think of it like string comparisons.

Which would work, except that it doesn't work. No string comparison works:

  • /NewDataSet/Table[ValidFromDate<"a"] returns no nodes
  • /NewDataSet/Table[ValidFromDate>"a"] returns no nodes
  • /NewDataSet/Table[ValidFromDate!="a"] returns all nodes
  • /NewDataSet/Table[ValidFromDate>"2014-02-12T00:00:00-05:00"] returns no nodes
  • /NewDataSet/Table[ValidFromDate<"2014-02-12T00:00:00-05:00"] returns no nodes
  • /NewDataSet/Table[ValidFromDate!="2014-02-12T00:00:00-05:00"] returns no nodes

So, there we have it

What is the "standards compliant" way to achieve what used to work?

What is the "correct" way to XPath query for date strings?

Or, better yet, why are my XPath queries not working?

Or, better better yet, why does the query that used to work no longer work? What decision was made that the syntax was bad? What edge cases were they solving by "breaking" the query syntax?

MSXML6 compatible version

Here's the final functional code, nearly in the language I use:

DOMDocument60 GetXml(String url)
{
   XmlHttpRequest xml = CoServerXMLHTTP60.Create();
   xml.Open('GET', url, False, '', '');
   xml.Send(EmptyParam);

   DOMDocument60 doc = xml.responseXML AS DOMDocument60;

   //MSXML6 removed all kinds of features originally present (thanks W3C)
   //Need to use Microsoft's proprietary extensions to get some of it back (thanks W3C)
   doc.setProperty('SelectionNamespaces', 'xmlns:ms="urn:schemas-microsoft-com:xslt"');

   return doc;
}


DOMDocument doc = GetXml('http://example.com/GetBanks.ashx?employeeID=12345');

//Finds future banks. 

//Only works in MSXML3; intentionally broken in MSXML6 (thanks W3C):
//String qry = '/NewDataSet/Table[ValidFromDate > "2014-02-12"]';

//MSXML6 compatible version of doing the above (send complaints to W3C);
String qry = '/NewDataSet/Table[ms:string-compare(ValidFromDate, "2014-02-12") >= 0]';

IXMLDOMNodeList nodes = doc.selectNodes(qry);

Solution

XPath is not date-aware

What is the "correct" way to XPath query for date strings?

XPath 1.0 has no date type, so there is no correct way to handle date strings; just think of time zone support. Comparing the values as strings will fail if the time zones differ.
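
The time zone problem can be seen outside XPath as well. The following Python sketch is only an illustration under assumptions: the second timestamp is made up (the sample document uses -05:00 throughout), and `datetime.fromisoformat` stands in for whatever date parsing the host language offers.

```python
from datetime import datetime

# The first value comes from the sample document; the second is a hypothetical
# timestamp with a different UTC offset, invented for this illustration.
a = "2014-02-01T00:00:00-05:00"   # 05:00 UTC on 1 February
b = "2014-02-01T01:00:00+02:00"   # 23:00 UTC on 31 January

# Plain string comparison only looks at the characters, so it ranks a before b.
string_says_a_is_earlier = a < b

# Parsing keeps the offset, so this comparison is made on the actual instants.
instant_says_a_is_earlier = datetime.fromisoformat(a) < datetime.fromisoformat(b)

print(string_says_a_is_earlier, instant_says_a_is_earlier)
```

The two results disagree, which is the failure described above: once offsets differ, character order and chronological order are no longer the same thing.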

Comparing strings

Or, better yet, why are my XPath queries not working?

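XPath 1.0 only defines equality operators on strings; for greater/less than, both values are converted to numbers. A date string such as "2014-02-12" is not a valid number, so it converts to NaN, and every greater/less than comparison involving NaN is false. That is why those queries return no nodes.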
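
The conversion rule can be sketched in plain Python. This is an emulation of the XPath 1.0 rules for illustration, not a call into any XPath engine; the helper names `xpath_number` and `xpath_greater_than` are hypothetical.

```python
import math
import re

# XPath 1.0 number() accepts optional whitespace, an optional minus sign and a
# plain decimal number. Any other string converts to NaN.
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def xpath_number(text: str) -> float:
    """Emulate XPath 1.0 number() applied to a string (hypothetical helper)."""
    if _XPATH_NUMBER.fullmatch(text):
        return float(text.strip(" \t\r\n"))
    return math.nan


def xpath_greater_than(left: str, right: str) -> bool:
    """Emulate XPath 1.0 `left > right` for two strings: compare as numbers."""
    return xpath_number(left) > xpath_number(right)


# A date string is not a number, so the comparison is NaN > NaN, which is false.
print(xpath_number("2014-02-01T00:00:00-05:00"))
print(xpath_greater_than("2014-02-01T00:00:00-05:00", "2013-02-12"))

# Digit-only strings do convert, so they compare as expected.
print(xpath_greater_than("20140201", "20130212"))
```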

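Use ms:string-compare, which was introduced in MSXML 4.0. The ms prefix has to be bound to urn:schemas-microsoft-com:xslt through the SelectionNamespaces property, as the question's final code does.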

/NewDataSet/Table[
  ms:string-compare(ValidFromDate, "2014-02-12T00:00:00-05:00") > 0
]
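
As a rough illustration of what this filter selects, here is a Python sketch that emulates the comparison on the sample dates. It assumes ms:string-compare returns -1, 0 or 1 and uses plain code point ordering, which is a simplification: the real function is an MSXML extension whose collation may differ. The `string_compare` and `filter_after` names are hypothetical.

```python
def string_compare(x: str, y: str) -> int:
    """Return -1, 0 or 1 by code point order (assumed stand-in for ms:string-compare)."""
    return (x > y) - (x < y)


# ValidFromDate values from the sample document.
valid_from_dates = [
    "2012-02-01T00:00:00-05:00",
    "2013-02-01T00:00:00-05:00",
    "2014-02-01T00:00:00-05:00",
]


def filter_after(threshold: str) -> list:
    """Keep the dates for which string_compare(date, threshold) > 0."""
    return [d for d in valid_from_dates if string_compare(d, threshold) > 0]


print(filter_after("2013-02-12"))
print(filter_after("2014-02-12T00:00:00-05:00"))
```

With the threshold 2013-02-12 only the 2014 row is kept, matching the result of the original MSXML query. With the threshold 2014-02-12 used in the expression above, none of the three sample rows qualify, because all of them are earlier.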

For the rest of the (XML) world

What is the "standards compliant" way to achieve what used to work?

An alternative that also works in other XPath implementations (I tested it using xmllint, which uses libxml) might be to translate away all non-digit characters, so the string will be parseable as a number:

/NewDataSet/Table[
  translate(ValidFromDate, "-:T", "") < translate("2014-02-12T00:00:00-05:00", "-:T", "")
]
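
To make the effect of translate() concrete, here is a Python sketch that emulates it on the sample dates. It is an illustration under assumptions, not a call into an XPath engine: `xpath_translate` and `as_number` are hypothetical helpers, and `float()` stands in for the numeric conversion because the translated strings contain only digits.

```python
def xpath_translate(text: str, src: str, dst: str) -> str:
    """Emulate XPath 1.0 translate(): map src characters to dst by position,
    and drop src characters that have no counterpart in dst."""
    table = {}
    for index, char in enumerate(src):
        if ord(char) in table:
            continue  # the first occurrence of a character in src wins
        table[ord(char)] = dst[index] if index < len(dst) else None
    return text.translate(table)


def as_number(stamp: str) -> float:
    """Strip '-', ':' and 'T', then read the remaining digits as a number."""
    return float(xpath_translate(stamp, "-:T", ""))


# ValidFromDate values from the sample document.
valid_from_dates = [
    "2012-02-01T00:00:00-05:00",
    "2013-02-01T00:00:00-05:00",
    "2014-02-01T00:00:00-05:00",
]
threshold = "2014-02-12T00:00:00-05:00"

# Same test as the XPath predicate above: keep rows dated before the threshold.
earlier = [d for d in valid_from_dates if as_number(d) < as_number(threshold)]

print(xpath_translate(threshold, "-:T", ""))
print(earlier)
```

Two caveats apply to this approach. The offset digits are simply appended to the number, so the ordering is only meaningful when all values share the same UTC offset. The resulting 18-digit values are also larger than what a double can store exactly, so only differences in the leading digits are reliably preserved.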
