SQL calculate item frequency using multiple / dependent columns?

Problem description

I'm completely new to SQL. I have read StackOverflow posts on SQL and other sources to try to figure this out, but I have been unable to do it in SQL. Here goes...

I have a table of 3 columns and thousands of rows, with data in the first 2 columns. The third column is currently empty, and I need to populate it based on the data already in the first and second columns.

Say I have states in the first column and fruit entries in the second column. I need to write one or more SQL statements that calculate the number of different states each fruit comes from, and then insert this popularity number into the third column of every row. A popularity number of 1 in a row means that fruit comes from only one state; a popularity number of 4 means the fruit comes from 4 states. My table currently looks like this:

state     fruit     popularity

hawaii    apple     
hawaii    apple     
hawaii    banana       
hawaii    kiwi      
hawaii    kiwi      
hawaii    mango        
florida   apple      
florida   apple        
florida   apple        
florida   orange      
michigan  apple     
michigan  apple     
michigan  apricot   
michigan  orange    
michigan  pear      
michigan  pear      
michigan  pear      
texas     apple     
texas     banana    
texas     banana    
texas     banana    
texas     grape     

I need to figure out how to calculate and then update the third column, named popularity, which is the number of states that export that fruit. The goal is to produce (sorry, bad pun) the table below. Based on the table above, "apple" appears in all 4 states, orange and banana appear in 2 states, and kiwi, mango, apricot, pear, and grape appear in only 1 state, hence their corresponding popularity numbers.

state     fruit     popularity

hawaii    apple     4
hawaii    apple     4
hawaii    banana    2   
hawaii    kiwi      1
hawaii    kiwi      1
hawaii    mango     1   
florida   apple     4 
florida   apple     4   
florida   apple     4   
florida   orange    2  
michigan  apple     4
michigan  apple     4
michigan  apricot   1
michigan  orange    2
michigan  pear      1
michigan  pear      1
michigan  pear      1
texas     apple     4
texas     banana    2
texas     banana    2
texas     banana    2
texas     grape     1

My small programmer brain says to try to figure out a way to loop through the data in some kind of script. But after reading up a little on SQL and databases, it seems like you don't write long and slow looping scripts in SQL (I'm not even sure if you can), and that there are better/faster ways to do this in SQL instead.

Anyone know how to, in SQL statement(s), calculate and update the third column for each row, which is here called popularity and corresponds to the number of states that each fruit comes from? Thanks for reading, very grateful for any help.

So far I have tried the SQL statements below, which produce output but don't quite give me what I need:

--outputs those fruits appearing multiple times in the table
SELECT fruit, COUNT(*)
  FROM my_table
 GROUP BY fruit
HAVING COUNT(*) > 1
 ORDER BY COUNT(*) DESC;

--outputs those fruits appearing only once in the table
SELECT fruit, COUNT(*)
  FROM my_table
 GROUP BY fruit
HAVING COUNT(*) = 1;

--outputs the number of unique fruits in the table
SELECT COUNT(DISTINCT fruit)
  FROM my_table;

Recommended answer

If you simply want to update your table with the popularity, it would look like:

update my_table x
   set popularity = ( select count(distinct state) 
                        from my_table
                       where fruit = x.fruit )
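
As a sanity check, the update above can be tried against the sample rows from the question. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a throwaway in-memory database. The helper name `popularity_after_update` and the column types are illustrative assumptions, not part of the original answer. The inner table is aliased (as `t`) instead of the outer one; this is an equivalent spelling of the same correlated subquery, chosen here because SQLite accepts it.

```python
import sqlite3

# Sample rows copied from the question: (state, fruit).
ROWS = [
    ("hawaii", "apple"), ("hawaii", "apple"), ("hawaii", "banana"),
    ("hawaii", "kiwi"), ("hawaii", "kiwi"), ("hawaii", "mango"),
    ("florida", "apple"), ("florida", "apple"), ("florida", "apple"),
    ("florida", "orange"),
    ("michigan", "apple"), ("michigan", "apple"), ("michigan", "apricot"),
    ("michigan", "orange"), ("michigan", "pear"), ("michigan", "pear"),
    ("michigan", "pear"),
    ("texas", "apple"), ("texas", "banana"), ("texas", "banana"),
    ("texas", "banana"), ("texas", "grape"),
]


def popularity_after_update(rows):
    """Load rows, run the correlated UPDATE, return {fruit: popularity}."""
    conn = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not state them.
    conn.execute(
        "CREATE TABLE my_table (state TEXT, fruit TEXT, popularity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO my_table (state, fruit) VALUES (?, ?)", rows
    )
    # For every row, count the distinct states that share that row's fruit.
    conn.execute(
        """
        UPDATE my_table
           SET popularity = (SELECT COUNT(DISTINCT t.state)
                               FROM my_table AS t
                              WHERE t.fruit = my_table.fruit)
        """
    )
    result = dict(
        conn.execute("SELECT DISTINCT fruit, popularity FROM my_table")
    )
    conn.close()
    return result


if __name__ == "__main__":
    for fruit, popularity in sorted(popularity_after_update(ROWS).items()):
        print(fruit, popularity)
```

Calling `popularity_after_update(ROWS)` returns a mapping in which apple has popularity 4, banana and orange have 2, and every other fruit has 1, which matches the target table in the question.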

If you want to select the data then you can use an analytic query:

select state, fruit
     , count(distinct state) over ( partition by fruit ) as popularity
  from my_table

This provides the number of distinct states, per fruit.
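
One caveat worth noting: not every database accepts DISTINCT inside a window function (PostgreSQL and SQLite, for example, reject it), so the analytic query may need rewriting depending on the engine. A commonly used alternative, sketched below and not taken from the original answer, is to join each row to a per-fruit GROUP BY subquery. The sketch again assumes Python's built-in sqlite3 module; the helper name `select_with_popularity` and the shortened sample data are illustrative assumptions.

```python
import sqlite3

# A shortened, made-up subset in the same shape as the question's data.
SAMPLE = [
    ("hawaii", "apple"), ("hawaii", "apple"), ("hawaii", "kiwi"),
    ("florida", "apple"), ("florida", "orange"),
    ("michigan", "orange"), ("michigan", "pear"),
]


def select_with_popularity(rows):
    """Return sorted (state, fruit, popularity) tuples, one per input row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (state TEXT, fruit TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    # The subquery yields one row per fruit with its distinct-state count;
    # the join attaches that count to every original row.
    cursor = conn.execute(
        """
        SELECT m.state, m.fruit, c.popularity
          FROM my_table AS m
          JOIN (SELECT fruit, COUNT(DISTINCT state) AS popularity
                  FROM my_table
                 GROUP BY fruit) AS c
            ON c.fruit = m.fruit
        """
    )
    result = sorted(cursor.fetchall())
    conn.close()
    return result


if __name__ == "__main__":
    for row in select_with_popularity(SAMPLE):
        print(row)
```

Because the join keeps every original row, duplicates such as the two hawaii/apple rows each receive the same popularity value, just as the window-function version would give them.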
