根据计数总和将记录分成多个存储桶 [英] split records into buckets based on a sum of counts
问题描述
我有一个看起来像下面的表格.我需要找到一种基于计数总和来选择电话号码的方法(该数字将始终是不同的,但在此示例中,我们使用130).
i have a table that looks like below. i need to find a way to pick out phone numbers based on a sum of counts (the number will always be different but let's use 130 for this example).
因此,解决方案之一将是第1至5和11行(如果您从这些行中累加CountOfPeople值,则将获得130).或1-4,6,7,9,11,12.只要选择总数为130的电话号码,就无关紧要.
So one of the solutions would be rows 1 through 5 and 11 (if you add up CountOfPeople values from those rows you will get 130). or 1-4,6,7,9,11,12. it doesn't matter which phone numbers are picked, as long as the total is 130.
有时候您可能无法准确地获得130,因此通常应尽可能接近但不超过".
sometimes you might not be able to get exactly 130, so "as close as possible but not exceeding" would be the rule.
有没有办法做到这一点?
is there a way to do this?
AutoID Phone Number Count Of People
1 5565787 57
2 2342343 30
3 2654456 17
4 3868556 12
5 9856756 12
6 9756456 4
7 4346365 4
8 2376743 3
9 9756343 3
10 2524349 3
11 2029393 2
12 9285656 1
推荐答案
我不确定使用纯SQL是否可以解决问题.但是您可以使用表函数.这是您的问题的一个小例子. 首先,我们需要创建表类型:
I'm not sure that problem could be solved with pure SQL. But you can use table functions. Here is a little example for your problem. First of all, we need to create table type:
create type t_bucket_row as object(
phone_number varchar2(10),
count_of_people number,
bucket_no number);
/
create type t_bucket_table as table of t_bucket_row;
/
带有测试数据的表:
create table test_data as
with t as (
select 1 AutoID, '5565787' Phone_Number, 57 Count_Of_People from dual union all
select 2, '2342343', 30 from dual union all
select 3, '2654456', 17 from dual union all
select 4, '3868556', 12 from dual union all
select 5, '9856756', 12 from dual union all
select 6, '9756456', 4 from dual union all
select 7, '4346365', 4 from dual union all
select 8, '2376743', 3 from dual union all
select 9, '9756343', 3 from dual union all
select 10, '2524349', 3 from dual union all
select 11, '2029393', 2 from dual union all
select 12, '9285656', 1 from dual)
select * from t;
然后,我们创建一个实现客户端分配算法的函数(对不起,代码中没有注释它如何工作,但是它起作用;如果需要,我可以稍后编写).在这里,我们创建一个表类型的变量,用电话和存储区号填充它,然后从一个函数返回它.之后,在SQL查询中,我们将函数的结果用作FROM
子句中的表.参数p_sum
是您希望的客户总数:
Then we create a function that implements an algorithm of distribution of clients (sorry, there is no comments in the code how it works, but it works; I can write it later if you need). Here we create a variable of table type, fill it with phones and bucket numbers, then return it from a function. After that, in SQL query, we use function's result as table in FROM
clause. Parameter p_sum
is your desired sum of counts of clients:
create or replace function get_buckets(p_sum number) return t_bucket_table is
buckets t_bucket_table := t_bucket_table();
type bucket_sums is table of number index by binary_integer;
sums bucket_sums;
counter number := 0;
found boolean;
begin
sums(1) := 0;
-- next line was edited to fix bug in resuult of distribution:
for i in (select t.*, rownum from test_data t order by t.count_of_people desc) loop
buckets.extend;
counter := counter + 1;
buckets(counter) := t_bucket_row(i.phone_number, i.count_of_people, 0);
if i.count_of_people > p_sum then
continue;
end if;
found := false;
for j in 1..sums.count loop
if sums(j) + i.count_of_people <= p_sum then
sums(j) := sums(j) + i.count_of_people;
buckets(counter).bucket_no := j;
found := true;
exit;
end if;
end loop;
if not found then
sums(sums.count + 1) := i.count_of_people;
buckets(counter).bucket_no := sums.count;
end if;
end loop;
return buckets;
end;
/
现在我们可以执行此功能.结果是:
Now we can execute this function. Result is:
SQL> select * from table(get_buckets(130));
PHONE_NUMB COUNT_OF_PEOPLE BUCKET_NO
---------- --------------- ----------
5565787 57 1
2342343 30 1
2654456 17 1
3868556 12 1
9856756 12 1
9756456 4 2
4346365 4 2
2376743 3 2
9756343 3 2
2524349 3 2
2029393 2 1
9285656 1 2
12 rows selected.
桶分配:
select bucket_no, sum(count_of_people) from table(get_buckets(130)) group by bucket_no;
BUCKET_NO SUM(COUNT_OF_PEOPLE)
---------- --------------------
1 130
2 18
如果count_of_people
大于p_sum
,它将进入存储区"0":
If count_of_people
is more than p_sum
, it goes to bucket "0":
SQL> select * from table(get_buckets(35));
PHONE_NUMB COUNT_OF_PEOPLE BUCKET_NO
---------- --------------- ----------
5565787 57 0
2342343 30 1
2654456 17 2
3868556 12 2
9856756 12 3
9756456 4 1
4346365 4 2
2376743 3 3
9756343 3 3
2524349 3 3
2029393 2 2
9285656 1 1
12 rows selected.
SQL> select bucket_no, sum(count_of_people) from table(get_buckets(35)) group by bucket_no;
BUCKET_NO SUM(COUNT_OF_PEOPLE)
---------- --------------------
1 35
2 35
3 21
0 57
这篇关于根据计数总和将记录分成多个存储桶的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!