SSIS 的 C# 中的 System.Text.RegularExpressions.Regex.Replace 错误 [英] System.Text.RegularExpressions.Regex.Replace error in C# for SSIS

查看:31
本文介绍了SSIS 的 C# 中的 System.Text.RegularExpressions.Regex.Replace 错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用以下代码在 C# 中编写 ssis 包,当我编写此代码时出现错误

 使用系统;使用 System.Data;使用 Microsoft.SqlServer.Dts.Pipeline.Wrapper;使用 Microsoft.SqlServer.Dts.Runtime.Wrapper;使用 System.Text.RegularExpressions;[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]公共类 ScriptMain : UserComponent{公共覆盖无效 PreExecute(){base.PreExecute();}公共覆盖无效 PostExecute(){base.PostExecute();}string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";字符串替换 = "";公共覆盖无效 Input0_ProcessInputRow(Input0Buffer Row){正则表达式 reg = new Regex(toreplace);Row.NaN = reg.Replace(Row.Na, replacewith);}}

错误是

最佳重载方法匹配'System.Text.RegularExpressions.Regex.Replace(string,System.Text.RegularExpressions.MatchEvaluator)' 有一些无效的参数

这里 Na 是输入列,NaN 是输出列,两者都是在 Inpout 列中带有特殊字符的 varchar.

例外:

System.ArgumentNullExceptionSystem.ArgumentOutofRangeException

这是SSIS包中BufferWrapper中的代码

/* 这是自动生成的代码,将被覆盖!不要编辑!* Microsoft SQL Server 集成服务缓冲区包装器* 该模块定义了访问数据流缓冲区的类* 这是自动生成的代码,将被覆盖!不要编辑!*/使用系统;使用 System.Data;使用 Microsoft.SqlServer.Dts.Pipeline;使用 Microsoft.SqlServer.Dts.Pipeline.Wrapper;公共类 Input0Buffer: ScriptBuffer{公共 Input0Buffer(PipelineBuffer Buffer, int[] BufferColumnIndexes, OutputNameMap OutputMap): base(Buffer, BufferColumnIndexes, OutputMap){}公共 BlobColumn Na{得到{返回 (BlobColumn)Buffer[BufferColumnIndexes[0]];}}public bool Na_IsNull{得到{返回 IsNull(0);}}公共 Int32 NaN{放{this[1] = 值;}}public bool NaN_IsNull{放{如果(值){SetNull(1);}别的{throw new InvalidOperationException("IsNull 属性不能设置为 False.改为为列赋值.");}}}新公共布尔 NextRow(){返回 base.NextRow();}new public bool EndOfRowset(){返回 base.EndOfRowset();}}

数据流

脚本组件,输入列

脚本组件,实际脚本

解决方案

您的代码基本没问题.您不是在测试 Na 列为 NULL 的可能性.也许您的源数据不允许空值,因此无需测试.

您可以通过在成员级别确定 Regex 的范围并在 PreExecute 方法中实例化它来提高性能,但这只是性能问题.与您收到的错误消息无关.

你可以看到我的包和预期的结果.我向下发送了 4 行,其中一个带有 NULL 值,一个不应更改,另外两个需要更改.

我的数据流

我已更新我的数据流以匹配您在变色龙问题中使用的步骤.

我的来源查询

我生成了 2 列数据和 4 行数据.与您的原始问题匹配的 Na 列是 varchar 类型.Agency_Names 列被强制转换为已弃用的 Text 数据类型以匹配您的后续更新.

SELECT脱氧核糖核酸, CAST(D.Na AS text) AS Agency_Names从(SELECT 'Hello world' AS NaUNION ALL SELECT 'man~ana'UNION ALL SELECT 'p@$$word!'UNION ALL SELECT NULL) D (Na);

数据转换

我在 OLE DB 源之后添加了数据转换转换.反映您所做的,我将我的 Agency_Name 转换为长度为 50 的 string [DT_STR] 数据类型,并将其别名为Agency_Name 的副本".

元数据

此时,我确认数据流的元数据属于 DT_STR 或 DT_WSTR 类型,它们是即将到来的正则表达式调用唯一允许的输入.我确认 Copy of Agency_Names 是预期的数据类型.

脚本任务

我将只读用法分配给列 NaCopy of Agency_Name,并将后者别名为AgencyNames".

我添加了 2 个输出列:NaN 匹配您的原始问题并创建了 AgencyNamesCleaned.它们都配置为 DT_STR,代码页 1252,长度为 50.

这是我使用的脚本.

公共类 ScriptMain : UserComponent{string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";字符串替换 = "";公共覆盖无效 Input0_ProcessInputRow(Input0Buffer Row){正则表达式 reg = new Regex(toreplace);//测试是否为空值,否则 Replace 会爆炸如果 (!Row.Na_IsNull){Row.NaN = reg.Replace(Row.Na, replacewith);}别的{Row.NaN_IsNull = true;}如果 (!Row.AgencyNames_IsNull){Row.AgencyNamesCleaned = reg.Replace(Row.AgencyNames, replacewith);}别的{Row.AgencyNamesCleaned_IsNull = true;}}}

根本原因分析

我认为您的核心问题可能是您拥有的 Na 列不是字符串兼容类型.Sriram's 的评论是正确的.如果我查看 Na 列的自动生成代码,在我的示例中我看到

 public String Na{得到{返回 Buffer.GetString(BufferColumnIndexes[0]);}}public bool Na_IsNull{得到{返回 IsNull(0);}}

您的源系统提供了元数据,因此 SSIS 认为此列是二进制数据.也许它是主机中的 NTEXT/TEXT 或 n/varchar(max).您需要做某事 以使其成为正则表达式的兼容操作数.我会清理源中的列类型,但如果这不是一个选项,请使用 Data Conversion 转换将其转换为 DT_STR/DT_WSTR 类型.

结局

您可以在附加到我的第一张图片的数据查看器中观察到,NaN 和 AgencyNamesCleaned 已正确去除了有问题的字符.此外,您可以观察到我的脚本任务没有像您一样附加红色 X.这表明脚本处于无效状态.

因为您从数据转换组件创建了代理名称副本​​"列作为 DT_TEXT,将其连接到脚本组件,然后然后更改了数据转换组件中的数据类型,脚本上的红色 X 可能会通过让转换刷新其元数据来解决.打开脚本并单击重新编译(ctrl-shift-b)以获得良好的测量.

在您的 reg.Replace(... 代码中不应该有下划线.如果有,那么您的问题还有另一个方面尚未传达.我当时最好的建议是重新创建一个概念验证包,正如我所描述的那样,如果可行,它将成为一种练习,以找出您有工作的工作和没有工作的工作之间的区别.

I am using the below code to write a ssis package in C# and when I write this code i get an error

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
    using System.Text.RegularExpressions;

    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {

        public override void PreExecute()
        {
            base.PreExecute();
        }
        public override void PostExecute()
        {
            base.PostExecute();
        }
        string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";
        string replacewith = "";
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            Regex reg = new Regex(toreplace);
            Row.NaN = reg.Replace(Row.Na, replacewith);


        }

    }

The error is

The best overloaded method match for 
'System.Text.RegularExpressions.Regex.Replace(string,System.Text.RegularExpressions.MatchEvaluator)' has some invalid arguments

Here Na is the input column and NaN is the output column both are varchar with special characters in Inpout column.

Exceptions:

System.ArgumentNullException
System.ArgumentOutofRangeException

This is the code in the BufferWrapper in the SSIS package

/* THIS IS AUTO-GENERATED CODE THAT WILL BE OVERWRITTEN! DO NOT EDIT!
*  Microsoft SQL Server Integration Services buffer wrappers
*  This module defines classes for accessing data flow buffers
*  THIS IS AUTO-GENERATED CODE THAT WILL BE OVERWRITTEN! DO NOT EDIT! */



    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    public class Input0Buffer: ScriptBuffer

    {
        public Input0Buffer(PipelineBuffer Buffer, int[] BufferColumnIndexes, OutputNameMap OutputMap)
            : base(Buffer, BufferColumnIndexes, OutputMap)
        {
        }

        public BlobColumn Na
        {
            get
            {
                return (BlobColumn)Buffer[BufferColumnIndexes[0]];
            }
        }
        public bool Na_IsNull
        {
            get
            {
                return IsNull(0);
            }
        }

        public Int32 NaN
        {
            set
            {
                this[1] = value;
            }
        }
        public bool NaN_IsNull
        {
            set
            {
                if (value)
                {
                    SetNull(1);
                }
                else
                {
                    throw new InvalidOperationException("IsNull property cannot be set to False. Assign a value to the column instead.");
                }
            }
        }

        new public bool NextRow()
        {
            return base.NextRow();
        }

        new public bool EndOfRowset()
        {
            return base.EndOfRowset();
        }

    }

Data flow

Script component, input columns

Script component, actual script

解决方案

Your code is mostly fine. You are not testing for the possibility that the Na column is NULL. Perhaps your source data doesn't allow for nulls and thus, no need to test.

You can improve your performance by scoping the Regex at the member level and instantiate it in your PreExecute method but that's just a performance thing. Has no bearing on the error message you are receiving.

You can see my package and the expected results. I sent 4 rows down, one with a NULL value, one that shouldn't change and two that have changes required.

My data Flow

I have updated my data flow to match the steps you are using in your chameleon question.

My Source Query

I generate 2 columns of data and 4 rows worth. The Na column, which matches your original question is of type varchar. The column Agency_Names is cast as the deprecated Text data type to match your subsequent updates.

SELECT 
    D.Na
,   CAST(D.Na AS text) AS Agency_Names
FROM
(
SELECT 'Hello world' AS Na
UNION ALL SELECT 'man~ana'
UNION ALL SELECT 'p@$$word!'
UNION ALL SELECT NULL
) D (Na);

Data Conversion

I have added a Data Conversion Transformation after my OLE DB Source. Reflecting what you have done, I converted my Agency_Name to a data type of string [DT_STR] with a length of 50 and aliased it as "Copy of Agency_Name".

Metadata

At this point, I verify that the metadata for my data flow is of type DT_STR or DT_WSTR which are the only allowable inputs for the upcoming call to the regular expression. I confirm that Copy of Agency_Names is the expected data type.

Script Task

I assigned ReadOnly usage to the columns Na and Copy of Agency_Name and aliased the later as "AgencyNames".

I added 2 output columns: NaN which matches your original question and created AgencyNamesCleaned. These are both configured to be DT_STR, codepage 1252, length of 50.

This is the script I used.

public class ScriptMain : UserComponent
{

    string toreplace = "[~!@#$%^&*()_+`{};':,./<>?]";
    string replacewith = "";


    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        Regex reg = new Regex(toreplace);

        // Test for nulls otherwise Replace will blow up
        if (!Row.Na_IsNull)
        {
            Row.NaN = reg.Replace(Row.Na, replacewith);
        }
        else
        {
            Row.NaN_IsNull = true;
        }

        if (!Row.AgencyNames_IsNull)
        {
            Row.AgencyNamesCleaned = reg.Replace(Row.AgencyNames, replacewith);
        }
        else
        {
            Row.AgencyNamesCleaned_IsNull = true;
        }
    }

}

Root cause analysis

I think your core issue may be is that the Na column you have isn't a string compatible type. Sriram's comment is spot on. If I look at the autogenerated code for the column Na, in my example I see

    public String Na
    {
        get
        {
            return Buffer.GetString(BufferColumnIndexes[0]);
        }
    }
    public bool Na_IsNull
    {
        get
        {
            return IsNull(0);
        }
    }

Your source system has provided metadata such that SSIS thinks this column is binary data. Perhaps it's NTEXT/TEXT or n/varchar(max) in the host. You need to do something to make it a compatible operand for the regular expression. I would clean up the column type in the source but if that's not an option, use a Data Conversion transformation to make it into a DT_STR/DT_WSTR type.

Denouement

You can observe in the Data Viewer, attached to my first image, that NaN and AgencyNamesCleaned have correctly stripped the offending characters. Furthermore, you can observe that my Script Task does not have a red X attached to it as your does. This indicates the script is in an invalid state.

As you had created the "Copy of Agency_Names" column from the Data Conversion Component as DT_TEXT, wired it up to the Script Component, and then changed the data type in the Data Conversion Component, the Red X on your script might be resolved by having the transformation refresh its metadata. Open the script and click recompile (ctrl-shift-b) for good measure.

There should be no underlines in your reg.Replace(... code. If there is, there is another facet to your problem that has not been communicated. My best advice at that point would be to recreate a proof of concept package, exactly as I have described and if that works, it becomes an exercise in finding the difference between what you have working and what you do not have working.

这篇关于SSIS 的 C# 中的 System.Text.RegularExpressions.Regex.Replace 错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆