在 SSIS 中仅记录文本文件的某些行 [英] Record only certain rows of a text file in SSIS

查看:30
本文介绍了在 SSIS 中仅记录文本文件的某些行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在尝试将数据从平面文件简单加载到数据库时遇到了困难.问题在于该文本文件中存在错误行或至少未格式化为数据的行.

I'm having a hard time trying to do a simple loading of data from a flat file to a database. The problem is that there are bad rows or at least rows that are not formatted as data in that text file.

示例.txt:

                Stackoverflow School at Philippines

Record: 100101
Date: 6/20/2014

Name:                Age: About:
-------------------- ---  --------------------------
Coolai               19   Bad Row Question
Qwerty               17   Java
Qwerty               19   C#
*User1               21   Dynamic Data
User4                27   Assembly

                    Stackoverflow School at Nippon

Record: 100102
Date: 6/23/2014

Name:                Age: About:
-------------------- ---  --------------------------
Sayuri               19   MSSQL
Niwatori             21   Dynamic Data
Jagaimo              27   Assembly
*User7               21   Dynamic Data
User9                27   Assembly

我在平面文件连接上使用固定宽度格式并尝试使用条件拆分来解决问题,但一旦遇到空白,它就会停止加载数据.

I am using Fixed Width format on the Flat File Connection and tried to approach the problem using conditional split but once it hits the white space, it stops loading the data.

是否可以根据特定的行数加载数据?因为最后,这是我唯一想要的文本文件:

Is it possible to load the data depending on a certain row count? Because at the end, this is the only thing I want from the text file:

Coolai               19   Bad Row Question
Qwerty               17   Java
Qwerty               19   C#
User1                21   Dynamic Data
User4                27   Assembly
Sayuri               19   MSSQL
Niwatori             21   Dynamic Data
Jagaimo              27   Assembly
User7                21   Dynamic Data
User9                27   Assembly

示例.txt 文件LINK.

推荐答案

由于源文本文件不遵循分隔格式标准,您需要使用脚本任务,并编写自定义脚本来处理所有这些空格和坏数据.您需要将文本文件转换为某种分隔格式,然后才能根据需要进行处理.

Since the source text file doesn't follow delimited format standard, you need to use script task, and write custom script to handle all those white spaces and bad data. You need to convert text file into some delimited format then it can be processed as per requirement.

脚本任务的逻辑是 - 逐行读取文件并放置逻辑,以便在执行遇到错误行/数据时跳过.通过这样做,结果将采用标准分隔格式.

The logic for script task is - read file line by line and put logic for when to skip when execution hits bad line/data. By doing this result will be in standard delimited format.

在脚本中试试下面的代码,你可以使用从List中提取的数据.好数据

Try below code in script, you can use extracted data from List<string> goodData

        string line;
        bool isNextLineGoodData = false;
        List<string> goodData = new List<string>();
        // Read the file and display it line by line.
        System.IO.StreamReader file = new System.IO.StreamReader("c:\\csv.txt");
        while ((line = file.ReadLine()) != null)
        {

            if (isNextLineGoodData)
            {
                if (line=="")
                {
                    isNextLineGoodData = false;
                }
                else
                {
                    goodData.Add(line);
                }

            }
            if (line=="")
            {
                isNextLineGoodData = false;
            }
            if (line.StartsWith("---"))
            {
                isNextLineGoodData = true;

            }
        }

        file.Close();

这篇关于在 SSIS 中仅记录文本文件的某些行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆