Get an SSIS C# Script component to read from Excel via OleDb


Question

I'm trying to build an SSIS package to read from Excel. I've found an excellent example here that is very close to what I need.

My question here is: Starting with a blank solution, what are the bare minimum steps required to get a C# Script component to read from an .xlsx file?

Here is my code:

using System;
using System.Data;
using System.Data.OleDb;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{    
    public override void CreateNewOutputRows()
    {
        string fileName = @"E:\SFTP\RSS\Results.xlsx";    
        string cstr = "Provider.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";

        using (OleDbConnection xlConn = new OleDbConnection(cstr))
        {
            xlConn.Open();
            OleDbCommand xlCmd = xlConn.CreateCommand();
            xlCmd.CommandText = "Select top 10 * from Responses";
            xlCmd.CommandType = CommandType.Text;
            using (OleDbDataReader rdr = xlCmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    for (int i = 2; i < rdr.FieldCount; i++) //loop from 3 column to last
                    {
                        Output0Buffer.AddRow();
                        Output0Buffer.Question = rdr.GetName(i);
                        Output0Buffer.Response = rdr.ToString();
                    }

                }
            }
            xlConn.Close();
        }
    }

}

These are the kinds of issues I'm getting and the questions I have: Do I need to configure the connection manager at the solution level or on the script component, or does the code contain everything I need?

What datatype/function/method do I use to read a cell as a number, a date or a string?

Here are some of the errors I'm getting:

(2,14): error CS0234: The type or namespace name 'Data' does not exist in the namespace 'System' (are you missing an assembly reference?)
(3,14): error CS0234: The type or namespace name 'Data' does not exist in the namespace 'System' (are you missing an assembly reference?)
(4,17): error CS0234: The type or namespace name 'SqlServer' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
(5,17): error CS0234: The type or namespace name 'SqlServer' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)
(8,27): error CS0246: The type or namespace name 'UserComponent' could not be found (are you missing a using directive or an assembly reference?)
 + additional 10 errors

I'm using: VS2019, SQL Server 2016 SP2, Office 2016.

Here's a screenshot of my Excel sheet:

It contains responses to surveys provided by a third-party call centre. There are a few columns to identify the call, and then each column header contains the question and the column data is the response. It currently has 189 columns and will grow and change over time. My idea is to select the first 4 columns, which reliably identify each row, plus 1 additional column for each Q&A, and iterate through the sheet picking up each column header and its contents so that they can be inserted into a table. As more columns get added, they'd just get appended as additional rows. The example code I pasted just shows 2 columns, as I was trying to keep the example simple.

This is what I've tried:

  1. Add a DataFlow task to the Control Flow.
  2. Add a Script Component to the DataFlow. When prompted, select Source.
  3. Double-click it to open the editor.
  4. Select Connection Managers and click Add.
  5. On the new row, select New connection. The Add SSIS Connection Manager dialog appears.

This is about where I get stuck. I've been picking OLEDB Connection.

  6. In the OLEDB connection manager, what should I choose?

I've tried Access and MS OLE DB Simple Provider, and things start to unravel about here. Given that the file name, path and driver details are provided in the script, what details do I need for the connection manager?

I started playing with the code, but I have no idea whether what I have done upstream is working or correct, so I'm trying to start from the beginning. I'm asking for help walking through the setup steps to get to the C# point.

Thanks, everyone, for your responses.

Cheers,

Pete

Recommended answer

Nothing needs to be added in the Script Component editor except the outputs (remember to get the data types correct).

Do not set up a connection manager. You are making the connection in code because you do not want SSIS trying to figure out your column names and, frankly, we don't know the column names after column 4.
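As a rough sketch of what making the connection in code involves, the connection string can be assembled by a small helper. The class and method names (`ExcelConnection`, `Build`) are hypothetical and not part of SSIS. The provider and extended properties are the ones used in this answer, and the sketch assumes the ACE OLE DB 12.0 provider is installed on the machine that runs the package.

```csharp
using System;

public static class ExcelConnection
{
    // Hypothetical helper: builds an ACE OLE DB connection string for an .xlsx workbook.
    public static string Build(string fileName, bool hasHeaderRow)
    {
        if (string.IsNullOrWhiteSpace(fileName))
            throw new ArgumentException("A workbook path is required.", "fileName");

        // HDR=YES treats the first row as column names, which is where the questions live.
        string hdr = hasHeaderRow ? "YES" : "NO";

        // IMEX=1 puts the driver in import mode, which generally makes
        // mixed-type columns come back as text.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName +
               ";Extended Properties=\"Excel 12.0;HDR=" + hdr + ";IMEX=1\"";
    }
}
```

The result can be passed straight to `new OleDbConnection(...)`, so no connection manager entry is involved.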

Here is your code. I only changed a few things.

public override void CreateNewOutputRows()
{
    // Change this to your file name; you do not need a connection manager.
    string fileName = @"E:\SFTP\RSS\Results.xlsx";
    string sheetName = "Sheet1";
    string cstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;HDR=YES;IMEX=1\"";

    using (System.Data.OleDb.OleDbConnection xlConn = new System.Data.OleDb.OleDbConnection(cstr))
    {
        xlConn.Open();
        System.Data.OleDb.OleDbCommand xlCmd = xlConn.CreateCommand();
        // A worksheet is addressed as [SheetName$].
        xlCmd.CommandText = "Select * from [" + sheetName + "$]"; // I assume this is the data you want
        xlCmd.CommandType = System.Data.CommandType.Text;
        using (System.Data.OleDb.OleDbDataReader rdr = xlCmd.ExecuteReader())
        {
            while (rdr.Read())
            {
                for (int i = 4; i < rdr.FieldCount; i++) // loop from the 5th column to the last
                {
                    // The first 4 columns are static and repeated on every output row.
                    Output0Buffer.AddRow();
                    Output0Buffer.UniqueID = Int32.Parse(rdr[0].ToString());
                    Output0Buffer.Year = Int32.Parse(rdr[1].ToString());
                    Output0Buffer.ReportingWave = rdr.GetString(2);
                    Output0Buffer.SubmissionDate = rdr.GetString(3);
                    // The column header is the question; the cell is the answer.
                    Output0Buffer.Question = rdr.GetName(i);
                    // Convert.ToString accepts any cell type and returns "" for an empty cell,
                    // whereas GetString throws unless the cell is already text.
                    Output0Buffer.Answer = Convert.ToString(rdr[i]);
                }
            }
        }
        // The using block closes the connection, so no explicit Close() is needed.
    }
}
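On the question of reading a cell as a number, a date or a string: the reader hands back each cell as an `object`, and `GetString` performs no conversion, so it only succeeds when the value is already text. One possible approach is a set of tolerant converters like the sketch below. The `CellConvert` class and its method names are hypothetical, and parsing text with `CultureInfo.InvariantCulture` is an assumption; a workbook that stores dates as locally formatted text may need a different culture.

```csharp
using System;
using System.Globalization;

public static class CellConvert
{
    // Empty cells arrive as DBNull.Value; treat them (and null) as "no value".
    private static bool IsEmpty(object cell)
    {
        return cell == null || cell == DBNull.Value;
    }

    // Any cell as text, or null when the cell is empty.
    public static string ToText(object cell)
    {
        return IsEmpty(cell) ? null : Convert.ToString(cell, CultureInfo.InvariantCulture);
    }

    // Whole numbers only; returns null for empty cells or values that are not integers.
    public static int? ToInt(object cell)
    {
        if (IsEmpty(cell)) return null;
        string text = Convert.ToString(cell, CultureInfo.InvariantCulture);
        int n;
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out n)) return n;
        return null;
    }

    // Accepts a real DateTime cell or a text cell that parses as a date.
    public static DateTime? ToDate(object cell)
    {
        if (IsEmpty(cell)) return null;
        if (cell is DateTime) return (DateTime)cell;
        string text = Convert.ToString(cell, CultureInfo.InvariantCulture);
        DateTime d;
        if (DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out d)) return d;
        return null;
    }
}
```

Inside the read loop, a value such as `rdr[1]` could be passed to `CellConvert.ToInt` and checked for null before it is assigned to an output column, instead of letting `Int32.Parse` throw on a blank cell.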

This code successfully imported a file that looks like this:
