How do I get the first reported instance of a data item


Problem description




Hi,
I have a largeish table (110 million rows) that represents a number of samples, each with a number of tests.

In real life this is data from a laboratory, where each specimen received gets a unique sample ID and one or more tests are then undertaken on that sample.

My problem is that the laboratory reports cumulatively: each time another test on a sample is reported, the earlier results for that sample are sent again*. For instance, assume sample ID abc123 has a request for tests Na, k, Urea and Crea. If Na and k are reported first, then Urea, and finally Crea, I would receive the following:

UID    | TimeStamp    | Sample ID     | Test
1      | 17:00        | abc123        | Na
2      | 17:00        | abc123        | k
3      | 17:15        | abc123        | Na
4      | 17:15        | abc123        | k
5      | 17:15        | abc123        | Urea
6      | 17:30        | abc123        | k
7      | 17:30        | abc123        | Na
8      | 17:30        | abc123        | Urea
9      | 17:30        | abc123        | Crea



* Note: there is only ever one instance of the same test requested on a sample (i.e. abc123 will never have two Na tests undertaken). The repeated rows are simply the previous result sent again the next time a report is issued.

Out of the 110 million rows there are around 15-20 million unique sample IDs, each with one or more tests.

I have created a view of that data which concatenates the Sample ID and Test (see table below). What I need is a way to get just the row relating to the first reported instance of each SampleID_Test (i.e. the rows with UID 1, 2, 5 and 9 below).

UID    | TimeStamp    | SampleID_Test
1      | 17:00        | abc123_Na
2      | 17:00        | abc123_k
3      | 17:15        | abc123_Na
4      | 17:15        | abc123_k
5      | 17:15        | abc123_Urea
6      | 17:30        | abc123_k
7      | 17:30        | abc123_Na
8      | 17:30        | abc123_Urea
9      | 17:30        | abc123_Crea



I hope that makes some sense...

I am assuming that I need to use partitioning or something similar, but I have not managed to get my head around how to implement it.

The column names are as per the tables, so if you were able to offer a worked example you would be helping a stressed scientist!

Many thanks in advance

Recommended answer

You didn't mention which DBMS you're using, so I'll assume Microsoft SQL Server 2005 or higher.

The ROW_NUMBER ranking function is probably what you're looking for:
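-- Number the rows within each (SampleID, Test) group, earliest report first.
WITH RankedData AS
(
    SELECT
        UID,
        TimeStamp,
        SampleID,
        Test,
        -- UID breaks ties if two rows in a group share a TimeStamp.
        ROW_NUMBER() OVER (PARTITION BY SampleID, Test ORDER BY TimeStamp, UID) AS RN
    FROM
        YourTable
)
-- Keep only the first reported row of each group.
SELECT
    UID,
    TimeStamp,
    SampleID,
    Test
FROM
    RankedData
WHERE
    RN = 1
;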
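
To check the idea against the nine sample rows from the question, here is a minimal sketch that runs the same kind of ROW_NUMBER query through Python's built-in sqlite3 module. Several things here are assumptions for illustration only: SQLite stands in for SQL Server (window functions need SQLite 3.25 or later), the table is named YourTable as in the answer, timestamps are stored as text, and UID is used as a secondary sort key so that any ties on TimeStamp are resolved deterministically.

```python
import sqlite3

# Sample rows from the question: (UID, TimeStamp, SampleID, Test).
ROWS = [
    (1, "17:00", "abc123", "Na"),
    (2, "17:00", "abc123", "k"),
    (3, "17:15", "abc123", "Na"),
    (4, "17:15", "abc123", "k"),
    (5, "17:15", "abc123", "Urea"),
    (6, "17:30", "abc123", "k"),
    (7, "17:30", "abc123", "Na"),
    (8, "17:30", "abc123", "Urea"),
    (9, "17:30", "abc123", "Crea"),
]

# Rank rows within each (SampleID, Test) group by report time,
# then keep only the earliest row of each group.
QUERY = """
WITH RankedData AS (
    SELECT UID, TimeStamp, SampleID, Test,
           ROW_NUMBER() OVER (
               PARTITION BY SampleID, Test
               ORDER BY TimeStamp, UID
           ) AS RN
    FROM YourTable
)
SELECT UID, TimeStamp, SampleID, Test
FROM RankedData
WHERE RN = 1
ORDER BY UID
"""


def first_reported(rows):
    """Return the first reported row for each (SampleID, Test) pair."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE YourTable ("
            "UID INTEGER PRIMARY KEY, TimeStamp TEXT, SampleID TEXT, Test TEXT)"
        )
        conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_reported(ROWS):
        print(row)
```

On the sample data this returns the rows with UID 1, 2, 5 and 9, one per test. On a table of 110 million rows, an index covering SampleID, Test and TimeStamp would probably help the partitioned sort, though that depends on the actual schema and has not been measured here.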

