Get a number of unique values without separating values that belong to the same block of values


Question

I'm OK with a PL/SQL solution or an Access VBA/Excel VBA one. PL/SQL is the first choice, Access VBA is second, and Excel VBA is third.

This is a very tough problem to explain. Please ask any questions and I will do my best to answer them clearly.

I have the following dataset in a table called NR_PVO_120. How do I pick out a given number (it can change, but let's say 6) of UNIQUE OtherIDs without excluding any OtherIDs under any fax number?

So, if you pick the OtherID from row 7, you must also pick the OtherIDs from rows 8 and 9 because they have the same fax number. Basically, once you pick an OtherID you're obligated to pick all OtherIDs that have the same fax number as the one you picked.

If the number requested (6 for this example) isn't possible, then "the closest number possible but not exceeding" is the rule.

For example, if you take OtherIDs from rows 1-10 you will get 6 unique OtherIDs, but row 10 shares a fax with rows 11 and 12. You either need to take all 3 (but that will raise the unique count to 8, which isn't acceptable) or skip this OtherID and find a fax that will add 1 unique OtherID (for example, it can have 4 OtherIDs, but 3 of them already exist in the result set and therefore don't add to the unique count). My result of 6 UNIQUE OtherIDs needs to contain ALL OtherIDs under any fax the selected OtherIDs are connected to.
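The counting rule above can be made concrete with a small sketch. It is written in Python purely for illustration (the question asks for PL/SQL or VBA), and the helper names `ids_by_fax`, `covered_ids`, and `gain` are hypothetical. It assumes a selection is simply a set of faxes, as in the "rows 1-6 and 26" example, and that the unique count is the number of distinct OtherIDs under those faxes.

```python
from collections import defaultdict

# (OtherID, Fax) pairs copied from rows 1-26 of the sample table.
ROWS = [
    (11098554, "2063504752"), (56200936, "2080906666"), (11098554, "7182160901"),
    (25138850, "7182160901"), (56148974, "7182232046"), (56530104, "7182234134"),
    (25138850, "7182234166"), (56148974, "7182234166"), (11098554, "7182234166"),
    (56597717, "7182248132"), (56166294, "7182248132"), (25138850, "7182248132"),
    (56148974, "7182390090"), (56226456, "7182390090"), (56148974, "7182395285"),
    (25138850, "7182395285"), (56166614, "7180930966"), (11098554, "7180930966"),
    (56159509, "7180930966"), (25138850, "7185462234"), (56148974, "7185462234"),
    (25138850, "7185465013"), (56024315, "7185465013"), (56115247, "7185465281"),
    (25138850, "7185465281"), (56148975, "7185466029"),
]


def ids_by_fax(rows):
    """Group the OtherIDs under each fax number."""
    groups = defaultdict(set)
    for other_id, fax in rows:
        groups[fax].add(other_id)
    return groups


def covered_ids(groups, faxes):
    """All distinct OtherIDs reached by sending to the given faxes."""
    covered = set()
    for fax in faxes:
        covered |= groups[fax]
    return covered


def gain(groups, chosen_faxes, candidate_fax):
    """How many NEW unique OtherIDs the candidate fax would add."""
    return len(groups[candidate_fax] - covered_ids(groups, chosen_faxes))


groups = ids_by_fax(ROWS)
# Faxes of rows 1-6 and 26 (the first sample solution).
chosen = ["2063504752", "2080906666", "7182160901",
          "7182232046", "7182234134", "7185466029"]
print(len(covered_ids(groups, chosen)))     # 6
print(gain(groups, chosen, "7182234166"))   # 0: all three OtherIDs are already covered
print(gain(groups, chosen, "7182248132"))   # 2: two OtherIDs would be new
```

A fax with a gain of 0 can be added without changing the unique count, which is the "3 of them already exist in the result set" situation described above.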

So one solution is to take rows 1-6 and 26. Another is to take rows 1-4 and 10-14. There are more, but you get the idea.

There will be many possibilities (the real dataset has tens of thousands of rows, and the number of people requested will be around 10K). As long as all OtherIDs connected to all faxes in the result set are part of the requested number (6 in this case), any combination will do.

A few notes:

  1. Getting as close as possible to the requested number is a requirement.

  2. Some OtherIDs will have a blank fax; they should only be included as a last resort (when there aren't enough OtherIDs for the requested number).

How can this be done?

Row      OtherID        Fax
1       11098554    2063504752
2       56200936    2080906666
3       11098554    7182160901
4       25138850    7182160901
5       56148974    7182232046
6       56530104    7182234134
7       25138850    7182234166
8       56148974    7182234166
9       11098554    7182234166
10      56597717    7182248132
11      56166294    7182248132
12      25138850    7182248132
13      56148974    7182390090
14      56226456    7182390090
15      56148974    7182395285
16      25138850    7182395285
17      56166614    7180930966
18      11098554    7180930966
19      56159509    7180930966
20      25138850    7185462234
21      56148974    7185462234
22      25138850    7185465013
23      56024315    7185465013
24      56115247    7185465281
25      25138850    7185465281
26      56148975    7185466029

Here are some sample outputs.

One solution is taking rows 1-6 and 26.

Row      OtherID        Fax
1       11098554    2063504752
2       56200936    2080906666
3       11098554    7182160901
4       25138850    7182160901
5       56148974    7182232046
6       56530104    7182234134
26      56148975    7185466029

Another solution is taking rows 1-4 and 10-14.

Row      OtherID        Fax
1       11098554    2063504752
2       56200936    2080906666
3       11098554    7182160901
4       25138850    7182160901
10      56597717    7182248132
11      56166294    7182248132
12      25138850    7182248132
13      56148974    7182390090
14      56226456    7182390090

There are many more.

I only need FAX as my output.

This is for a fax campaign. We need to make sure no fax number is faxed twice, and that all people connected to a fax number are contacted with a single fax.

So the idea is to take all OtherIDs under ANY fax you end up using.

EDIT: here's how it's currently done; maybe this helps paint a picture.

The list is sorted by fax. They go down the list to a random point, MAKING SURE THEY STOP ON THE LAST RECORD OF A FAX BLOCK. So in my example they'd stop at any of rows 1, 2, 4, 5, 6, 9, 12, 14, 16, 19, 21, 23, 25, or 26. They then see how many unique OtherIDs they have up to that point. If it's too many, they go up some and count again. If it's too few, they go down some and count again. They keep doing this until they get their unique number. The only requirement is to always include all OtherIDs under a fax.
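The manual procedure can be approximated in a few lines. This is a simplified Python sketch, not the team's actual process: instead of starting at a random point and adjusting, it scans the fax-sorted list from the top and records the running unique count at the end of every fax block. The function names `block_end_counts` and `best_prefix` are hypothetical.

```python
# (OtherID, Fax) pairs copied from rows 1-26 of the sample table.
ROWS = [
    (11098554, "2063504752"), (56200936, "2080906666"), (11098554, "7182160901"),
    (25138850, "7182160901"), (56148974, "7182232046"), (56530104, "7182234134"),
    (25138850, "7182234166"), (56148974, "7182234166"), (11098554, "7182234166"),
    (56597717, "7182248132"), (56166294, "7182248132"), (25138850, "7182248132"),
    (56148974, "7182390090"), (56226456, "7182390090"), (56148974, "7182395285"),
    (25138850, "7182395285"), (56166614, "7180930966"), (11098554, "7180930966"),
    (56159509, "7180930966"), (25138850, "7185462234"), (56148974, "7185462234"),
    (25138850, "7185465013"), (56024315, "7185465013"), (56115247, "7185465281"),
    (25138850, "7185465281"), (56148975, "7185466029"),
]


def block_end_counts(rows):
    """Sort by fax and return (fax, unique OtherIDs so far) at each block end."""
    ordered = sorted(rows, key=lambda row: row[1])
    seen = set()
    counts = []
    for i, (other_id, fax) in enumerate(ordered):
        seen.add(other_id)
        # A block ends when the next row has a different fax (or there is none).
        if i + 1 == len(ordered) or ordered[i + 1][1] != fax:
            counts.append((fax, len(seen)))
    return counts


def best_prefix(rows, target):
    """Longest run of whole fax blocks whose unique count does not exceed target."""
    faxes, best = [], ([], 0)
    for fax, count in block_end_counts(rows):
        if count > target:
            break  # counts never decrease, so no later prefix can fit
        faxes.append(fax)
        best = (list(faxes), count)
    return best


faxes, unique_count = best_prefix(ROWS, 6)
print(unique_count)  # 6
print(faxes)
```

Because only prefixes of one fixed ordering are considered, this can miss combinations that skip a block in the middle, which is why the hand process involves going up and down the list.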

Recommended answer

EDIT 2/13/2015: After using the accepted answer for a few months, I came across a scenario that hadn't happened before and realized that his solution only works if I need a number that's not too close to the total. For example, if my total number of records is 15000 and I'm asking for 12000, his code will give 10 or 11k. If I ask for 8k, I will probably get the 8.

I don't understand what his code does, and he never replied, so I can't explain why this is happening. My guess is that he's taking the counts in a certain order, and since the results depend on the order the faxes are sorted in, he won't necessarily get the best result every time. When there's enough room (asking for 8k out of 15k), any combination can yield an acceptable result, but once you ask for a tighter number (12k out of 15k) he's locked into his order and runs out of acceptable counts quickly.
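The kind of order dependence guessed at here is easy to reproduce on a toy case. The Python sketch below is not the accepted answer's code (which is not shown here); it compares a generic largest-first pass against an exhaustive search on hypothetical data, where `TOY`, `largest_first`, and `exhaustive` are made-up names for illustration.

```python
from itertools import combinations

# Hypothetical toy data: fax -> set of OtherIDs (not taken from the question's table).
TOY = {"A": {1, 2, 3}, "B": {4, 5}, "C": {6, 7}}


def largest_first(groups, target):
    """Take faxes from biggest to smallest, keeping each one that still fits."""
    covered = set()
    for fax in sorted(groups, key=lambda f: len(groups[f]), reverse=True):
        if len(covered | groups[fax]) <= target:
            covered |= groups[fax]
    return len(covered)


def exhaustive(groups, target):
    """Try every combination of faxes; only feasible for tiny inputs."""
    faxes = list(groups)
    best = 0
    for size in range(1, len(faxes) + 1):
        for combo in combinations(faxes, size):
            count = len(set().union(*(groups[f] for f in combo)))
            if best < count <= target:
                best = count
    return best


print(largest_first(TOY, 4))  # 3: after taking A, neither B nor C fits
print(exhaustive(TOY, 4))     # 4: B and C together hit the target exactly
```

With a loose target almost any order works, but with a tight one the first large pick can block the combination that would have hit the number exactly.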

So this is the code that will give the correct result no matter what. It's not nearly as elegant and is extremely slow, but it works.

12/13/14: I think I got it, in PL/SQL. It's not the best solution by far, but it gives better results than what they currently get by hand. I'd be really interested to hear about possible problems.

12/13/14 EDIT: The accepted answer is the way to do it. I'm only leaving this here for contrast, so people can see how not to code, lol.

DECLARE
     CountsNeededTotal NUMBER;
     CountsNeededRemaining NUMBER;
     CurCountsTotal NUMBER;
     CurFaxCount NUMBER;
     CurFaxCountPicked NUMBER;
BEGIN
     CountsNeededTotal := 420;
     CurCountsTotal := 0;
     CurFaxCount := 0;

     CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;

     EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_121';


     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --START BLOCK
     --this block just gets the first fax, the fax with the largest number of people
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################

     --get the first fax with the most people as long as that number isn't larger than the number needed
     SELECT MAX(CountOfPeople) CountOfPeople
    INTO CurFaxCount
    FROM (SELECT     fax
            ,COUNT(1) CountOfPeople
           FROM NR_PVO_120
          GROUP BY Fax
         HAVING COUNT(1) <= CountsNeededRemaining);

     COMMIT;

     --if there is a number that's not larger then add to the table and keep looping
     --if there isn't then there's no providers from this campaign that can be used
     IF CurFaxCount >= 0 THEN
       --insert into the 121 table (final list of faxes)
       INSERT INTO NR_PVO_121
         SELECT   fax
              ,COUNT(1) CountOfPeople
             FROM NR_PVO_120
           HAVING COUNT(1) = (SELECT MAX(CountOfPeople) CountOfPeople
                       FROM (SELECT   fax
                               ,COUNT(1) CountOfPeople
                              FROM NR_PVO_120
                          GROUP BY Fax
                            HAVING COUNT(1) <= CountsNeededTotal))
         GROUP BY Fax;



       COMMIT;

       --############################################################################################
       --############################################################################################
       --############################################################################################
       --############################################################################################
       --############################################################################################
       --START BLOCK
       --this block loops through remaining faxes
       --############################################################################################
       --############################################################################################
       --############################################################################################
       --############################################################################################
       --############################################################################################



       SELECT SUM(CountOfPeople) INTO CurCountsTotal FROM NR_PVO_121;


       IF CurCountsTotal < CountsNeededTotal THEN
         CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;


         --loop until counts needed remaining is 0 or as close as 0 as possible without going in the negative
         WHILE CountsNeededRemaining >= 0 LOOP
              --clear 122 table
              EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_122';


              --loop through all faxes in 120 table  MINUS the ones in the 121 table
              DECLARE
                CURSOR CurRec  IS
                  SELECT DISTINCT Fax
                    FROM NR_PVO_120
                   WHERE Fax NOT IN (SELECT Fax FROM NR_PVO_121);
                PVO CurRec%ROWTYPE;
              BEGIN
                OPEN CurRec;
                LOOP
                  FETCH CurRec INTO PVO;
                  --stop as soon as the cursor is exhausted so the last fax is not processed twice
                  EXIT WHEN CurRec%NOTFOUND;

                  --count the unique OtherIDs under this fax that are not already covered by faxes in the 121 table
                  SELECT COUNT(DISTINCT OtherID) CountOfPeople
                    INTO CurFaxCount
                    FROM NR_PVO_120
                   WHERE     Fax = PVO.fax
                      AND OtherID NOT IN (SELECT DISTINCT OtherID
                                   FROM NR_PVO_120
                                  WHERE fax IN (SELECT Fax FROM NR_PVO_121));
                  --                                                          DBMS_OUTPUT.put_line('CurFaxCount ' || CurFaxCount);
                  --                                                          DBMS_OUTPUT.put_line('CountsNeededRemaining ' || CountsNeededRemaining);

                  IF CurFaxCount <= CountsNeededRemaining THEN
                    --record their unique counts in 122 table IF THEY'RE NOT LARGER THAN CountsNeededRemaining
                    INSERT INTO NR_PVO_122
                         SELECT PVO.fax
                            ,CurFaxCount
                        FROM DUAL;

                    COMMIT;
                  END IF;
                --end fax loop
                END LOOP;
                CLOSE CurRec;
              END;


              --pick the highest count from 122 table
              SELECT MAX(CountOfPeople) CountOfPeople INTO CurFaxCountPicked FROM NR_PVO_122;

              --add this fax to the 121 table
              INSERT INTO NR_PVO_121
                SELECT MIN(Fax) Fax
                   ,CurFaxCountPicked
                  FROM NR_PVO_122
                 WHERE CountOfPeople = CurFaxCountPicked;


              COMMIT;
              --add the counts to the CurCountsTotal
              CurCountsTotal := CurCountsTotal + CurFaxCountPicked;
              --recalc   CountsNeededRemaining
              CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
              --
              --                                                          DBMS_OUTPUT.put_line('CurCountsTotal ' || CurCountsTotal);
              --                                                          DBMS_OUTPUT.put_line('CurFaxCountPicked ' || CurFaxCountPicked);
              --                                                          DBMS_OUTPUT.put_line('CurFaxCount ' || CurFaxCount);
              --                                                          DBMS_OUTPUT.put_line('CountsNeededRemaining ' || CountsNeededRemaining);
              --                                                          DBMS_OUTPUT.put_line('CountsNeededTotal ' || CountsNeededTotal);

              --clear 122 table
              EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_122';
         --end while loop
         END LOOP;
       END IF;
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --END BLOCK
     --this block loops through remaining faxes
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################
     --############################################################################################



     END IF;
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--END BLOCK
--this block just gets the first fax, the fax with the largest number of people
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################



END;

Here's a better version, MUCH faster than the above, but it probably won't return perfect results in some cases. I wasn't able to get wrong results while testing, but it's possible because I'm not trying every possible combination (as in the first version), which takes days to finish for a dataset of 20K records.

DECLARE
    CountsNeededTotal NUMBER;
    CountsNeededRemaining NUMBER;
    CurCountsTotal NUMBER;
BEGIN
    CurCountsTotal := 0;

    SELECT NoOfProvToKeep INTO CountsNeededTotal FROM NR_PVO_121;

    CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;

    EXECUTE IMMEDIATE 'TRUNCATE TABLE nr_pvo_122';


    COMMIT;

    IF CurCountsTotal <= CountsNeededTotal THEN
        --loop until counts needed remaining is 0 or as close as 0 as possible without going in the negative
        WHILE CountsNeededRemaining > 0 LOOP
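            --add the fax that contributes the most new OtherIDs without exceeding CountsNeededRemaining
            INSERT INTO NR_PVO_122
                SELECT Fax
                      ,CountOfPeople
                  FROM (SELECT   DISTINCT COUNT(OtherID) CountOfPeople
                               ,Fax
                       FROM NR_PVO_120
                      WHERE OtherID NOT IN (SELECT DISTINCT OtherID
                                    FROM NR_PVO_120
                                   WHERE fax IN (SELECT Fax FROM NR_PVO_122))
                     HAVING COUNT(1) <= CountsNeededRemaining
                        GROUP BY fax
                        ORDER BY 1 DESC)
                 WHERE ROWNUM = 1;

            --nothing fit this time, so the total can no longer change; stop instead of looping forever
            EXIT WHEN SQL%ROWCOUNT = 0;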



            SELECT SUM(CountOfPeople) INTO CurCountsTotal FROM NR_PVO_122;

            COMMIT;
            --recalc   CountsNeededRemaining
            CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
        --
        --DBMS_OUTPUT.put_line('CurCountsTotal ' || CurCountsTotal || ', CountsNeededRemaining ' || CountsNeededRemaining);
        --end while loop
        END LOOP;
    END IF;



    DELETE FROM NR_PVO_112
          WHERE NVL(Fax, '999999999999') NOT IN (SELECT Fax FROM NR_PVO_122);
END;
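For readers who don't use PL/SQL, the idea behind the faster version can be sketched in Python. This is an illustration of the same greedy strategy under stated assumptions, not a port of the block above: `greedy_pick` is a hypothetical name, ties are broken by taking the lowest fax number, blank faxes are not handled, and faxes that would add no new OtherIDs are skipped.

```python
from collections import defaultdict

# (OtherID, Fax) pairs copied from rows 1-26 of the sample table.
ROWS = [
    (11098554, "2063504752"), (56200936, "2080906666"), (11098554, "7182160901"),
    (25138850, "7182160901"), (56148974, "7182232046"), (56530104, "7182234134"),
    (25138850, "7182234166"), (56148974, "7182234166"), (11098554, "7182234166"),
    (56597717, "7182248132"), (56166294, "7182248132"), (25138850, "7182248132"),
    (56148974, "7182390090"), (56226456, "7182390090"), (56148974, "7182395285"),
    (25138850, "7182395285"), (56166614, "7180930966"), (11098554, "7180930966"),
    (56159509, "7180930966"), (25138850, "7185462234"), (56148974, "7185462234"),
    (25138850, "7185465013"), (56024315, "7185465013"), (56115247, "7185465281"),
    (25138850, "7185465281"), (56148975, "7185466029"),
]


def greedy_pick(rows, target):
    """Repeatedly take the fax adding the most new OtherIDs that still fits."""
    by_fax = defaultdict(set)
    for other_id, fax in rows:
        by_fax[fax].add(other_id)

    chosen, covered = [], set()
    while len(covered) < target:
        remaining = target - len(covered)
        best_fax, best_gain = None, 0
        for fax in sorted(by_fax):
            if fax in chosen:
                continue
            new_ids = len(by_fax[fax] - covered)
            # Strict '<' keeps the first (lowest) fax among equal gains.
            if best_gain < new_ids <= remaining:
                best_fax, best_gain = fax, new_ids
        if best_fax is None:
            break  # nothing fits any more; settle for the closest count below target
        chosen.append(best_fax)
        covered |= by_fax[best_fax]
    return chosen, covered


faxes, ids = greedy_pick(ROWS, 6)
print(faxes)     # ['7180930966', '7182248132']
print(len(ids))  # 6
```

Like the PL/SQL version, this does not try every combination, so it can stop short of the target when a different early pick would have allowed an exact fit.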
