Find string from table in cell in BigQuery --> Query exceeded resource limits

Views: 125  Published: 2018/5/7 17:40:44  Tags: google-bigquery


Problem description

I have two tables in BigQuery:

City list: invertible-fin-XXX238.Reports.City

Station names: invertible-fin-XXX238.Reports.Station

Most of the station names contain a city name. Now I want to extract the city from the Station table. Here is some example data:

City: Berlin

Station name: inStore_Berlin_Alexanderplatz

Station name: Berlin Schönefeld Airport

Station name: Train Station Franchise Berlin

I tried the INSTR function without success (INSTR works only in Legacy SQL, and there I couldn't use subselects):

    SELECT City,
           INSTR((SELECT AdGroupName FROM [invertible-fin-XXX238.Reports.City]), City) AS Match
    FROM [invertible-fin-XXX238.Reports.Station]

So I tried WHERE ... LIKE instead. The SQL is below:

    SELECT a.City
    FROM [invertible-fin-XXX238.Reports.City] a
    CROSS JOIN [invertible-fin-XXX238.Reports.Station] b
    WHERE b.Name LIKE '%' + a.City + '%'
    GROUP BY a.City

But now the query is too computationally intensive, and I get the error "Query exceeded resource limits for tier 1. Tier 18 or higher required."

Could someone please help me write a more resource-friendly query?

Thanks in advance, Philipp
Solution
Below are a few of the many possible versions for BigQuery Standard SQL:

    #standardSQL
    SELECT city, station
    FROM `invertible-fin-XXX238.Reports.Station` AS s
    JOIN `invertible-fin-XXX238.Reports.City` AS c
    ON REPLACE(LOWER(station), LOWER(city), '') <> LOWER(station)

or

    #standardSQL
    SELECT city, station
    FROM `invertible-fin-XXX238.Reports.Station` AS s
    JOIN `invertible-fin-XXX238.Reports.City` AS c
    ON LOWER(station) LIKE CONCAT('%', LOWER(city), '%')

You can remove the LOWER() calls if city names are spelled in the same case in both tables.
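
The question started from INSTR. In Standard SQL the STRPOS function plays a similar role: it returns the 1-based position of the first occurrence of a substring, or 0 when there is none. A minimal sketch of the same substring join written with STRPOS follows. The inline stations and cities data (including 'Hamburg Hauptbahnhof') is made up for illustration and stands in for the real tables. Like the two versions above, this is still a substring match, so it is not necessarily cheaper to run.

```sql
#standardSQL
-- Hypothetical inline sample data standing in for the Station and City tables.
WITH stations AS (
  SELECT 'inStore_Berlin_Alexanderplatz' AS station UNION ALL
  SELECT 'Hamburg Hauptbahnhof'
),
cities AS (
  SELECT 'Berlin' AS city UNION ALL
  SELECT 'Hamburg'
)
SELECT c.city, s.station
FROM stations AS s
JOIN cities AS c
  -- STRPOS returns 0 when the city does not occur in the station name.
  ON STRPOS(LOWER(s.station), LOWER(c.city)) > 0
```

Each sample station should pair with the one city whose name it contains.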

While the above versions look more straightforward, I would prefer the one below because it lets you control how the city is extracted from the station name. The character class in r'([^ _]+)' should list all characters that you observe acting as delimiters in the station column. This way a city is matched only when it is a whole token, not part of a longer name.

Of course, you should check whether you even need to worry about this.

    #standardSQL
    WITH tokens AS (
      SELECT token, station
      FROM `invertible-fin-XXX238.Reports.Station` AS s,
        UNNEST(REGEXP_EXTRACT_ALL(LOWER(station), r'([^ _]+)')) token
    )
    SELECT city, station
    FROM tokens AS s
    JOIN `invertible-fin-XXX238.Reports.City` AS c
    ON LOWER(city) = token
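
To see the effect of token matching, the query can be tried on inline sample data. The sketch below uses the three station names from the question plus one made-up name, 'Kiosk_Berlinchen_Markt', in which "berlin" appears only inside a longer word. The CTE names stations and cities are placeholders for the real tables. Because the join condition here is a plain equality rather than a pattern match, it may also be lighter on resources, though that is worth checking on the real data.

```sql
#standardSQL
-- Hypothetical inline sample data standing in for the Station and City tables.
WITH stations AS (
  SELECT 'inStore_Berlin_Alexanderplatz' AS station UNION ALL
  SELECT 'Berlin Schönefeld Airport' UNION ALL
  SELECT 'Train Station Franchise Berlin' UNION ALL
  SELECT 'Kiosk_Berlinchen_Markt'  -- made-up name: "berlin" is only part of a longer word
),
cities AS (
  SELECT 'Berlin' AS city
),
tokens AS (
  -- Split each lowercased station name on spaces and underscores, one row per token.
  SELECT token, s.station
  FROM stations AS s,
    UNNEST(REGEXP_EXTRACT_ALL(LOWER(s.station), r'([^ _]+)')) AS token
)
SELECT c.city, t.station
FROM tokens AS t
JOIN cities AS c
  ON LOWER(c.city) = t.token
```

The three station names from the question should be returned with Berlin, while the made-up name is skipped because its token is berlinchen, not berlin. A LIKE '%berlin%' match would have included it.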
