蜂巢式查询:根据其他列值与某些特定值匹配的条件选择一个列,然后将匹配结果创建为新列 [英] Hive query: select a column based on the condition another columns values match some specific values, then create the match result as a new column

查看:90
本文介绍了蜂巢式查询:根据其他列值与某些特定值匹配的条件选择一个列,然后将匹配结果创建为新列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我必须在HiveQL中进行一些查询和创建列操作.

I have to some query and creat columns operations in HiveQL.

例如,

app      col1

app1     anybody love me?
app2     I hate u
app3     this hat is good
app4     I don't like this one
app5     oh my god
app6     damn you.
app7     such nice girl
app8     xxxxx
app9     pretty prefect
app10    don't love me.
app11    xxx anybody?

我想匹配像['anybody', 'love', 'you', 'xxx', 'don't']这样的关键字列表,并选择匹配的关键字结果作为新列,命名为keyword,如下所示:

I want to match a keyword list like ['anybody', 'love', 'you', 'xxx', 'don't'] and select the matched keyword result as a new column, named keyword as follows:

app      keyword

app1     anybody, love
app4     I don't like this one
app6     damn you.
app8     xxx
app10    don't, love
app11    xxx

似乎我必须使用嵌套查询.
逻辑有点像选择匹配的结果行​​并设置匹配的结果,这些结果应保存在列表或类似的内容中作为新列.

It seems that I have to use nested query.
The logic is kind of like selecting the matched result rows and setting a matched results which should be saved in a list or something like this as a new column.

但是我对HiveQL不够熟悉.
有人可以帮我吗?
预先感谢.

But I am not familiar enough with the HiveQL.
Could anyone help me?
Thanks in advances.

推荐答案

您可以将单词列表转换为表格,然后使用模式匹配将其与表格连接:

You could turn the list of words to a table and join it with your table using pattern matching:

select t.app, k.keyword
from  mytable t
inner join (values ('anybody'), ('you'), ('xxx'), ('don''t')) as k(keyword)
    on t.col1 like conca('%', k.keyword, '%')

请注意,如果一个词组中有多个关键字匹配,这将重复app.您没有指定要如何使用此用例.

Note that this will duplicate app if more than one keyword matches on a phrase. You did not specify how you want to handle this use case.

在配置单元中,您还可以将其表达为:

In hive, you can also phrase this as:

select t.app, k.keyword
from  mytable t
inner join table(values 'anybody', 'you', 'xxx', 'don''t') as k(keyword)
    on t.col1 like conca('%', k.keyword, '%')

这篇关于蜂巢式查询:根据其他列值与某些特定值匹配的条件选择一个列,然后将匹配结果创建为新列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆