SQL query producing unwanted duplicate rows in result


Question

I'm wrestling with a rather simple query. However, given my limited Access and SQL skills, I've hit a roadblock with my current project. Details are below; thank you all in advance for your patience.

Basically, I am trying to use historical financial data to test a well-known bankruptcy predictor model. The financial data is stored in an annual format (table listing below). The database is structured such that each company has one general information record in the company record table (IDX_FS) and multiple records, one for each year of existence, in the financial statement data tables (DATA_BS etc.).

In each DATA table, one field, [4DTYR], holds the year of each company's data record, alongside that year's financial data. The values in this field repeat: they exist for every company and for every year the company existed.

For example:

[CONAME]      [4DTYR]  [A_TOTAL]
Apple Inc.    2009     200
Apple Inc.    2010     220
Apple Inc.    2011     240
Google Inc.   2009     180
Google Inc.   2010     170
Google Inc.   2011     160

The problem I am running into is this: because [4DTYR] exists and repeats in each of the tables whose data feeds the arithmetic in a handful of expressions, I end up with a huge amount of repeated data in my query output (it looks like permutations of the years).

I've detailed the tables, fields and expressions below in addition to the SQL script. Note that I've tried adding a condition under WHERE that attempts to set all the [4DTYR] dates in the different tables as the same. That portion is highlighted in magenta. This still doesn't seem to work, as I get output for only 1 year when there are 20 years of data. Furthermore, when I run the query without the expressions, the existing parameters give me output with ~500 records.

Thanks for your responses. I've taken Gord's advice and made the modification below. However, I receive a JOIN syntax error. Note that IDX_FS contains the CUSIP field but not the 4DTYR field, so I used AND to add to the original statement. Any suggestions? Many thanks.

FROM (((IDX_FS LEFT JOIN DATA_BS ON IDX_FS.CUSIP = DATA_BS.CUSIP) LEFT JOIN DATA_Footnotes ON IDX_FS.CUSIP = DATA_Footnotes.CUSIP) LEFT JOIN DATA_IS ON IDX_FS.CUSIP = DATA_IS.CUSIP) LEFT JOIN DATA_SP ON IDX_FS.CUSIP = DATA_SP.CUSIP AND (((DATA_BS LEFT JOIN DATA_IS ON DATA_BS.CUSIP = DATA_IS.CUSIP AND DATA_BS.4DTYR = DATA_IS.4DTYR) LEFT JOIN DATA_SP ON DATA_BS.CUSIP = DATA_SP.CUSIP AND DATA_BS.4DTYR = DATA_SP.4DTYR) LEFT JOIN DATA_Footnotes.4DTYR ON DATA_BS.CUSIP = DATA_Footnotes.CUSIP AND DATA_BS.4DTYR = DATA_Footnotes.4DTYR

Recommended answer

When you join the tables that have both the company ID [CUSIP] and the year [4DTYR], you are joining only on [CUSIP], so you get duplicate rows for the various permutations of [4DTYR] in the related tables that also have that field. You need to join on both [CUSIP] and [4DTYR] to avoid those duplicates.

In the Access query designer, such joins appear as two lines running between each pair of tables: one connecting [CUSIP] to [CUSIP] and the other connecting [4DTYR] to [4DTYR]. In SQL, the joins will look something like

... TableX LEFT JOIN TableY ON TableX.[CUSIP] = TableY.[CUSIP] AND TableX.[4DTYR] = TableY.[4DTYR]
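
To see the effect of joining on both fields, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Access. The A_TOTAL figures come from the example in the question; the CUSIP values ("C1", "C2"), the NET_INC column and its numbers are made up for illustration. SQLite's syntax differs from Access in places (Access generally wants each join wrapped in parentheses), so treat this as a demonstration of the idea, not as a drop-in FROM clause.

```python
import sqlite3

# In-memory database standing in for the Access file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE IDX_FS  (CUSIP TEXT PRIMARY KEY, CONAME TEXT);
    CREATE TABLE DATA_BS (CUSIP TEXT, [4DTYR] INTEGER, A_TOTAL REAL);
    CREATE TABLE DATA_IS (CUSIP TEXT, [4DTYR] INTEGER, NET_INC REAL);
""")

# "C1"/"C2" are placeholder IDs; NET_INC is a hypothetical column.
conn.executemany("INSERT INTO IDX_FS VALUES (?, ?)",
                 [("C1", "Apple Inc."), ("C2", "Google Inc.")])
conn.executemany("INSERT INTO DATA_BS VALUES (?, ?, ?)", [
    ("C1", 2009, 200), ("C1", 2010, 220), ("C1", 2011, 240),
    ("C2", 2009, 180), ("C2", 2010, 170), ("C2", 2011, 160),
])
conn.executemany("INSERT INTO DATA_IS VALUES (?, ?, ?)", [
    ("C1", 2009, 10), ("C1", 2010, 12), ("C1", 2011, 14),
    ("C2", 2009, 9), ("C2", 2010, 8), ("C2", 2011, 7),
])

# Joining on CUSIP alone pairs every balance-sheet year with every
# income-statement year of the same company: 3 x 3 rows per company.
fanned_out = conn.execute("""
    SELECT i.CONAME, bs.[4DTYR], s.[4DTYR]
    FROM IDX_FS AS i
    LEFT JOIN DATA_BS AS bs ON i.CUSIP = bs.CUSIP
    LEFT JOIN DATA_IS AS s  ON i.CUSIP = s.CUSIP
""").fetchall()

# IDX_FS has no year field, so it is linked on CUSIP only; the yearly
# tables are linked to each other on both CUSIP and [4DTYR].
one_per_year = conn.execute("""
    SELECT i.CONAME, bs.[4DTYR], bs.A_TOTAL, s.NET_INC
    FROM IDX_FS AS i
    LEFT JOIN DATA_BS AS bs ON i.CUSIP = bs.CUSIP
    LEFT JOIN DATA_IS AS s  ON bs.CUSIP = s.CUSIP
                           AND bs.[4DTYR] = s.[4DTYR]
    ORDER BY i.CONAME, bs.[4DTYR]
""").fetchall()

print(len(fanned_out))    # 18 rows: 2 companies x 3 years x 3 years
print(len(one_per_year))  # 6 rows: one per company per year
```

Note that the company table is still joined on [CUSIP] alone, since it has one row per company; only the joins between the yearly DATA tables need the second condition on [4DTYR].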
