IMPORTXML XPath_Query for Google Sheets

Question

I'm using the Google Sheets IMPORTXML function to retrieve data for each calendar date one year earlier, or for the closest year-ago date for which data are available.

Here is a sample of the data (an excerpt from the full data source):

 <entry>
    <id>http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(6794)</id>
    <title type="text"></title>
    <updated>2018-02-06T22:05:38Z</updated>
    <author>
      <name />
    </author>
    <link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(6794)" />
    <category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
    <content type="application/xml">
      <m:properties>
        <d:Id m:type="Edm.Int32">6794</d:Id>
        <d:NEW_DATE m:type="Edm.DateTime">2017-02-24T00:00:00</d:NEW_DATE>
        <d:BC_1MONTH m:type="Edm.Double">0.4</d:BC_1MONTH>
        <d:BC_3MONTH m:type="Edm.Double">0.52</d:BC_3MONTH>
        <d:BC_6MONTH m:type="Edm.Double">0.65</d:BC_6MONTH>
        <d:BC_1YEAR m:type="Edm.Double">0.8</d:BC_1YEAR>
        <d:BC_2YEAR m:type="Edm.Double">1.12</d:BC_2YEAR>
        <d:BC_3YEAR m:type="Edm.Double">1.38</d:BC_3YEAR>
        <d:BC_5YEAR m:type="Edm.Double">1.8</d:BC_5YEAR>
        <d:BC_7YEAR m:type="Edm.Double">2.12</d:BC_7YEAR>
        <d:BC_10YEAR m:type="Edm.Double">2.31</d:BC_10YEAR>
        <d:BC_20YEAR m:type="Edm.Double">2.69</d:BC_20YEAR>
        <d:BC_30YEAR m:type="Edm.Double">2.95</d:BC_30YEAR>
        <d:BC_30YEARDISPLAY m:type="Edm.Double">2.95</d:BC_30YEARDISPLAY>
      </m:properties>
    </content>
  </entry>
  <entry>
    <id>http://data.treasury.gov/Feed.svc/DailyTreasuryYieldCurveRateData(6795)</id>
    <title type="text"></title>
    <updated>2018-02-06T22:05:38Z</updated>
    <author>
      <name />
    </author>
    <link rel="edit" title="DailyTreasuryYieldCurveRateDatum" href="DailyTreasuryYieldCurveRateData(6795)" />
    <category term="TreasuryDataWarehouseModel.DailyTreasuryYieldCurveRateDatum" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
    <content type="application/xml">
      <m:properties>
        <d:Id m:type="Edm.Int32">6795</d:Id>
        <d:NEW_DATE m:type="Edm.DateTime">2017-02-27T00:00:00</d:NEW_DATE>
        <d:BC_1MONTH m:type="Edm.Double">0.44</d:BC_1MONTH>
        <d:BC_3MONTH m:type="Edm.Double">0.5</d:BC_3MONTH>
        <d:BC_6MONTH m:type="Edm.Double">0.68</d:BC_6MONTH>
        <d:BC_1YEAR m:type="Edm.Double">0.81</d:BC_1YEAR>
        <d:BC_2YEAR m:type="Edm.Double">1.2</d:BC_2YEAR>
        <d:BC_3YEAR m:type="Edm.Double">1.46</d:BC_3YEAR>
        <d:BC_5YEAR m:type="Edm.Double">1.87</d:BC_5YEAR>
        <d:BC_7YEAR m:type="Edm.Double">2.18</d:BC_7YEAR>
        <d:BC_10YEAR m:type="Edm.Double">2.36</d:BC_10YEAR>
        <d:BC_20YEAR m:type="Edm.Double">2.72</d:BC_20YEAR>
        <d:BC_30YEAR m:type="Edm.Double">2.98</d:BC_30YEAR>
        <d:BC_30YEARDISPLAY m:type="Edm.Double">2.98</d:BC_30YEARDISPLAY>
      </m:properties>
    </content>
  </entry>
  <entry>

This is the XPath query I'm currently using to retrieve data for 2017 Feb 27:

//*[local-name() = 'NEW_DATE'][text() = '2017-02-27T00:00:00']/..

This is the result displayed:

6795    2017-02-27T00:00:00 0.44    0.5 0.68    0.81    1.2 1.46    1.87    2.18    2.36    2.72    2.98    2.98

Is there a way to:

  1. Retrieve all the data displayed except the "d:Id" element ("6795" above), and
  2. Have the query default to the next available forward date (e.g. 2017 Feb 27) when the date searched for has no data (e.g. searching for 2017 Feb 25 results in a "#N/A" error, since the query returns nothing for the missing date)?

I'm avoiding using the IF function in order to make fewer IMPORTXML calls.

Recommended answer

Google Sheets (GS) appears to support only XPath 1.0 (its documentation and product forum pages do not confirm or clarify which versions are supported as of this date), so alternatives such as XPath 2.0 if-then-else expressions cannot be used. Instead, the source XML data can be filtered using native GS functions.

Question 1

Per @Tanaike's proposed solution, retrieving all child nodes of an element except one can be accomplished with the XPath inequality operator "!=". Below it is applied to the children of the parent element of "NEW_DATE", leaving out the child element "Id". TRANSPOSE is used to display the result in columnar form. (A1 is the cell containing the URL of the source XML from the question.)

=TRANSPOSE(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE'][text() = '2017-02-27T00:00:00']/../*[local-name()!='Id']"))
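
To check the selection logic outside the sheet, here is a minimal Python sketch of the same idea: find the properties block whose NEW_DATE matches, then keep every child except Id. It is only an illustration, not how IMPORTXML works internally. The namespace URIs, the trimmed SAMPLE feed and the helper names (local_name, row_without_id) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (typical for OData Atom feeds); not taken from the question.
NS = {
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}

# Trimmed, hypothetical version of the feed with only a few yield columns.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><content type="application/xml"><m:properties>
    <d:Id>6794</d:Id>
    <d:NEW_DATE>2017-02-24T00:00:00</d:NEW_DATE>
    <d:BC_1MONTH>0.4</d:BC_1MONTH>
    <d:BC_3MONTH>0.52</d:BC_3MONTH>
    <d:BC_6MONTH>0.65</d:BC_6MONTH>
  </m:properties></content></entry>
  <entry><content type="application/xml"><m:properties>
    <d:Id>6795</d:Id>
    <d:NEW_DATE>2017-02-27T00:00:00</d:NEW_DATE>
    <d:BC_1MONTH>0.44</d:BC_1MONTH>
    <d:BC_3MONTH>0.5</d:BC_3MONTH>
    <d:BC_6MONTH>0.68</d:BC_6MONTH>
  </m:properties></content></entry>
</feed>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree prepends to tag names."""
    return tag.rsplit("}", 1)[-1]


def row_without_id(xml_text, date):
    """Return the text of every sibling of the matching NEW_DATE, minus Id."""
    root = ET.fromstring(xml_text)
    for props in root.iter("{%s}properties" % NS["m"]):
        new_date = props.find("d:NEW_DATE", NS)
        if new_date is not None and new_date.text == date:
            # Same filter as [local-name()!='Id'] in the XPath query.
            return [child.text for child in props if local_name(child.tag) != "Id"]
    return None  # no entry for that date (the sheet shows #N/A in this case)


print(row_without_id(SAMPLE, "2017-02-27T00:00:00"))
# ['2017-02-27T00:00:00', '0.44', '0.5', '0.68']
```

A date with no entry, such as 2017-02-25, returns None here, which corresponds to the "#N/A" case described in the question.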

Question 2

Searching for a specific date one year ago and, if data for that date are missing, retrieving the closest forward date requires nested GS functions. First, IMPORTXML retrieves all "NEW_DATE" values. SORT then puts them in descending order, and MATCH with search type -1 finds the position of the smallest date that is on or after the target date. INDEX picks that date from the reverse-sorted list, and it is inserted into the Question 1 query. The CONCATENATE, TEXT and TODAY functions are used only to build the year-ago date in a format compatible with the XML data; note that TODAY()-365 is an approximation and lands one day off when the interval spans a leap day. The formula is below.

=TRANSPOSE(IMPORTXML(A1,CONCATENATE("//*[local-name() = 'NEW_DATE'][text() = '",INDEX(SORT(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE']"),1,FALSE),MATCH(CONCATENATE(TEXT(TODAY()-365,"YYYY-MM-DD"),"T00:00:00"),SORT(IMPORTXML(A1, "//*[local-name() = 'NEW_DATE']"),1,FALSE),-1)),"']/../*[local-name()!='Id']")))
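
The SORT, MATCH and INDEX step can be hard to follow inside one nested formula, so here is a small Python sketch of the same lookup. It is an illustration under assumptions: DATES is a made-up list of NEW_DATE strings, and match_descending and next_available_date are hypothetical helpers written to imitate what MATCH does with search type -1 on a descending range, namely returning the position of the smallest value greater than or equal to the search key.

```python
# Hypothetical NEW_DATE values; 2017-02-25 and 2017-02-26 are missing on purpose.
DATES = [
    "2017-02-23T00:00:00",
    "2017-02-24T00:00:00",
    "2017-02-27T00:00:00",
    "2017-02-28T00:00:00",
]


def match_descending(key, descending):
    """Imitate MATCH(key, range, -1): 1-based position of the smallest
    value >= key in a range sorted in descending order, else None."""
    position = None
    for i, value in enumerate(descending, start=1):
        if value >= key:
            position = i  # still on or after the key, keep walking down
        else:
            break  # everything after this point is earlier than the key
    return position


def next_available_date(dates, target):
    """Return the target date if present, else the closest later date."""
    # ISO-8601 timestamps in one fixed format sort correctly as plain text,
    # which is why the sheet formula can SORT them without converting to dates.
    descending = sorted(dates, reverse=True)  # SORT(..., 1, FALSE)
    position = match_descending(target, descending)
    if position is None:
        return None  # target is later than every available date
    return descending[position - 1]  # INDEX(range, position)


print(next_available_date(DATES, "2017-02-25T00:00:00"))
# 2017-02-27T00:00:00
```

When the target date exists in the list it is returned unchanged, so the same lookup covers both the exact-match and the missing-date cases.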
