MySQL过程将数据从登台表加载到其他表.需要在过程中拆分多值字段 [英] MySQL procedure to load data from staging table to other tables. Need to split up multivalue field in the process
问题描述
我正在尝试将数据从多值数据库(Unidata)导出到MySQL.可以说,我的源数据是一个人的ID号码,他们的名字以及他们所居住的所有州.states字段是一个多值字段,我正在导出它们,以便该字段中的不同值用a分隔. 〜.样本摘录如下:
I'm trying to export data from a multivalue database (Unidata) into MySQL. Lets say my source data was a person's ID number, their first name and all the states they've lived in. The states field is a multi value field and I'm exporting them so that the different values within that field are seperated by a ~. A sample extract looks like:
"1234","Sally","NY~NJ~CT"
"1235","Dave","ME~MA~FL"
"3245","Fred","UT~CA"
"2344","Sue","OR"
我已将此数据加载到登台表中
I've loaded this data into a staging table
Table:staging
Column 1: personId
Column 2: name
Column 3: states
我要做的是使用一个过程将此数据分为两个表:人员表和状态表.一个人在状态表中可以有很多条目:
What I want to do is split this data out into two tables using a procedure: a persons table and a states table. A person can have many entries in the states table:
Table 1: persons
Column 1: id
Column 2: name
Table 2: states
Column 1: personId
Column 2: state
我的过程从临时表中获取数据并将其转储到表1中就很好了.但是,我有点迷失了如何拆分数据并将其发送到表2的方法.Sally需要在状态表中拥有三个条目(NY,NJ,CT),Dave需要3个,Fred需要2个并且Sue将具有1(OR).关于如何实现此目标的任何想法?
My procedure takes the data from the staging table and dumps it over to table 1 just fine. However, i'm a little lost how how to split the data up and send it to table 2. Sally would need to have three entries in the states table (NY, NJ, CT), Dave would have 3, Fred would have 2 and Sue would have1 (OR). Any ideas on how to accomplish this?
推荐答案
尝试如下操作: http://pastie.org /1213943
-- TABLES
drop table if exists staging;
create table staging
(
person_id int unsigned not null primary key,
name varchar(255) not null,
states_csv varchar(1024)
)
engine=innodb;
drop table if exists persons;
create table persons
(
person_id int unsigned not null primary key,
name varchar(255) not null
)
engine=innodb;
drop table if exists states;
create table states
(
state_id tinyint unsigned not null auto_increment primary key, -- i want a nice new integer based PK
state_code varchar(3) not null unique, -- original state code from staging
name varchar(255) null
)
engine=innodb;
/*
you might want to make the person_states primary key (person_id, state_id) depending on
your queries as this is currently optimised for queries like - select all the people from NY
*/
drop table if exists person_states;
create table person_states
(
state_id tinyint unsigned not null,
person_id int unsigned not null,
primary key(state_id, person_id),
key (person_id)
)
engine=innodb;
-- STORED PROCEDURES
drop procedure if exists load_staging_data;
delimiter #
create procedure load_staging_data()
proc_main:begin
truncate table staging;
-- assume this is done by load data infile...
set autocommit = 0;
insert into staging values
(1234,'Sally','NY~NJ~CT'),
(1235,'Dave','ME~MA~FL'),
(3245,'Fred','UT~CA'),
(2344,'Sue','OR'),
(5555,'f00','OR~NY');
commit;
end proc_main #
delimiter ;
drop procedure if exists cleanse_map_staging_data;
delimiter #
create procedure cleanse_map_staging_data()
proc_main:begin
declare v_cursor_done tinyint unsigned default 0;
-- watch out for variable names that have the same names as fields !!
declare v_person_id int unsigned;
declare v_states_csv varchar(1024);
declare v_state_code varchar(3);
declare v_state_id tinyint unsigned;
declare v_states_done tinyint unsigned;
declare v_states_idx int unsigned;
declare v_staging_cur cursor for select person_id, states_csv from staging order by person_id;
declare continue handler for not found set v_cursor_done = 1;
-- do the person data
set autocommit = 0;
insert ignore into persons (person_id, name)
select person_id, name from staging order by person_id;
commit;
-- ok now we have to use the cursor !!
set autocommit = 0;
open v_staging_cur;
repeat
fetch v_staging_cur into v_person_id, v_states_csv;
-- clean up the data (for example)
set v_states_csv = upper(trim(v_states_csv));
-- split the out the v_states_csv and insert
set v_states_done = 0;
set v_states_idx = 1;
while not v_states_done do
set v_state_code = substring(v_states_csv, v_states_idx,
if(locate('~', v_states_csv, v_states_idx) > 0,
locate('~', v_states_csv, v_states_idx) - v_states_idx,
length(v_states_csv)));
set v_state_code = trim(v_state_code);
if length(v_state_code) > 0 then
set v_states_idx = v_states_idx + length(v_state_code) + 1;
-- add the state if it doesnt already exist
insert ignore into states (state_code) values (v_state_code);
select state_id into v_state_id from states where state_code = v_state_code;
-- add the person state
insert ignore into person_states (state_id, person_id) values (v_state_id, v_person_id);
else
set v_states_done = 1;
end if;
end while;
until v_cursor_done end repeat;
close v_staging_cur;
commit;
end proc_main #
delimiter ;
-- TESTING
call load_staging_data();
select * from staging;
call cleanse_map_staging_data();
select * from states order by state_id;
select * from persons order by person_id;
select * from person_states order by state_id, person_id;
这篇关于MySQL过程将数据从登台表加载到其他表.需要在过程中拆分多值字段的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!