Script task in SSIS to import an Excel spreadsheet

Question

I have reviewed the questions that might have had my answer, but unfortunately they don't seem to apply. Here is my situation. I have to import worksheets from my client. Columns A, C, D, and AA hold the information I need. The rest of the columns hold information that is worthless to me. The column headers are consistent in the four columns I need but very inconsistent in the columns that don't matter. For example, cell A1 contains Division in every spreadsheet, while cell B1 can contain anything from sleeve length to overall length to fit. What I need to do is import only the columns I need and map them to a SQL Server 2008 R2 table. I have defined the table in a stored procedure, which currently calls an SSIS function.

The problem is that when I try to import a spreadsheet that has different column names, SSIS fails, and I have to go back in and run it manually to get the fields set up right.

I cannot imagine that what I am trying to do has not been done before. Just so the scale is not lost: I have 170 users who have over 120 different spreadsheet templates.

I am desperate for a workable solution. I can do everything after getting the file into my table in SQL. I have even written the code to move the files back to the FTP server.

Recommended answer

I put together a post describing how I've used a Script task to parse Excel. It has allowed me to import decidedly non-tabular data into a data flow.

The core concept is that you use the JET or ACE provider and simply query the data out of an Excel worksheet or named range. Once you have that, you have a dataset you can walk through row by row and perform whatever logic you need. In your case, you can skip row 1 for the header and then import only columns A, C, D, and AA.
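As a rough illustration of the querying step, here is a minimal sketch that loads a worksheet into a DataTable through the ACE OLE DB provider. The class name ExcelReader, its method names, and the file path and sheet name passed in are all hypothetical and are not taken from the linked post. The sketch assumes an .xlsx file and an installed ACE 12.0 provider. HDR=NO is used so that the inconsistent header row arrives as ordinary data and columns can be addressed by position.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

// Hypothetical helper; names are illustrative, not from the linked post.
public static class ExcelReader
{
    public static string BuildConnectionString(string filePath)
    {
        // HDR=NO: row 1 is returned as data, so headers are never used as column names.
        // IMEX=1: asks the provider to read mixed-type columns as text.
        // "Excel 12.0 Xml" assumes an .xlsx workbook.
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
            "Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1\";",
            filePath);
    }

    public static DataTable ReadSheet(string filePath, string sheetName)
    {
        var table = new DataTable();
        // Worksheets are addressed as [SheetName$] in the query.
        string query = string.Format("SELECT * FROM [{0}$]", sheetName);

        using (var connection = new OleDbConnection(BuildConnectionString(filePath)))
        using (var adapter = new OleDbDataAdapter(query, connection))
        {
            // Fill opens the connection if needed and closes it afterwards.
            adapter.Fill(table);
        }
        return table;
    }
}
```

A call such as ExcelReader.ReadSheet(path, "Sheet1") would then return every row of that sheet, header row included, ready to be walked row by row.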

That logic would go in the ExcelParser class. So the foreach loop on line 71 of the post's code would probably be distilled down to something like this (code approximate):

// This gets the value of column A
current = dr[0].ToString();
// this assigns the value of current into our output row at column 0
newRow[0] = current;

// This gets the value of column C
current = dr[2].ToString();
// this assigns the value of current into our output row at column 1
newRow[1] = current;

// This gets the value of column D
current = dr[3].ToString();
// this assigns the value of current into our output row at column 2
newRow[2] = current;

// This gets the value of column AA
current = dr[26].ToString();
// this assigns the value of current into our output row at column 3
newRow[3] = current;

You obviously might need to do type conversions and such here, but that's the core of the parsing logic.
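To make the column-picking step concrete, here is a self-contained sketch that works on an in-memory DataTable. It assumes the worksheet was loaded with the header row as the first data row. The class name ColumnProjector and the output column names other than Division are hypothetical. Values are kept as trimmed strings, and any real type conversion would replace the ToString call.

```csharp
using System;
using System.Data;

// Hypothetical helper; names are illustrative, not from the linked post.
public static class ColumnProjector
{
    // Zero-based positions of Excel columns A, C, D and AA.
    private static readonly int[] SourceColumns = { 0, 2, 3, 26 };

    public static DataTable Project(DataTable source)
    {
        var result = new DataTable();
        // Only "Division" (cell A1) is named in the question; the rest are placeholders.
        result.Columns.Add("Division", typeof(string));
        result.Columns.Add("ColumnC", typeof(string));
        result.Columns.Add("ColumnD", typeof(string));
        result.Columns.Add("ColumnAA", typeof(string));

        // Start at index 1 to skip the header row.
        for (int i = 1; i < source.Rows.Count; i++)
        {
            DataRow dr = source.Rows[i];
            DataRow newRow = result.NewRow();

            for (int j = 0; j < SourceColumns.Length; j++)
            {
                int index = SourceColumns[j];
                // A narrow sheet may not reach column AA; leave such cells null.
                object value = index < source.Columns.Count ? dr[index] : DBNull.Value;
                newRow[j] = value is DBNull
                    ? (object)DBNull.Value
                    : value.ToString().Trim();
            }
            result.Rows.Add(newRow);
        }
        return result;
    }
}
```

Driving the copy from an array of positions keeps the mapping in one place, so a template that puts the fourth value somewhere other than AA only needs a different array.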
