在vb.net中获取列名Jet OLE DB [英] get column names Jet OLE DB in vb.net

查看:81
本文介绍了在vb.net中获取列名Jet OLE DB的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我编写了一个读取csv文件并对其进行参数化的函数,因此,我有一个gettypessql函数,该函数首先查询sql表以获取数据类型,从而调整后来插入sql的列.所以我的问题是,当我在Jet OLE DB中设置HDR = Yes时,我仅获得列名称,例如F1,F2,F3.为了解决这个问题,我设置了HDR = No并编写了一些for循环,但是现在我只得到了空字符串,实际上是什么问题?这是我的代码:

I've written a function which reads csv files and parametrizes them accordingly, therefore i have a function gettypessql which queries sql table at first to get data types and therefore to adjust the columns which are later inserted in sql. So my problem is when I set HDR=Yes in Jet OLE DB I get only column names like F1, F2, F3. To circumvent this issue I've set HDR=No and written some for loops but now I get only empty strings, what is actually the problem? here is my code:

 Private Function GetCSVFile(ByVal file As String, ByVal min As Integer, ByVal max As Integer) As DataTable
        Dim ConStr As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & TextBox1.Text & ";Extended Properties=""TEXT;HDR=NO;IMEX=1;FMT=Delimited;CharacterSet=65001"""
        Dim conn As New OleDb.OleDbConnection(ConStr)
        Dim dt As New DataTable
        Dim da As OleDb.OleDbDataAdapter = Nothing
        getData = Nothing

        Try
            Dim CMD As String = "Select * from " & _table & ".csv"
            da = New OleDb.OleDbDataAdapter(CMD, conn)
            da.Fill(min, max, dt)
            getData = New DataTable(_table)
            Dim firstRow As DataRow = dt.Rows(0)  

            For i As Integer = 0 To dt.Columns.Count - 1
                Dim columnName As String = firstRow(i).ToString()
                Dim newColumn As New DataColumn(columnName, mListOfTypes(i))
                getData.Columns.Add(newColumn)
            Next

            For i As Integer = 1 To dt.Rows.Count - 1
                Dim row As DataRow = dt.Rows(i)
                Dim newRow As DataRow = getData.NewRow()

                For j As Integer = 0 To getData.Columns.Count - 1
                    If row(j).GetType Is GetType(String) Then
                        Dim colValue As String = row(j).ToString()
                        colValue = ChangeEncoding(colValue)
                        colValue = ParseString(colValue)
                        colValue = ReplaceChars(colValue)
                        newRow(j) = colValue
                    Else
                        newRow(j) = row(j)
                    End If
                Next

                getData.Rows.Add(newRow)
                Application.DoEvents()
            Next
        Catch ex As OleDbException
            MessageBox.Show(ex.Message)
        Catch ex As Exception
            MessageBox.Show(ex.Message)
        Finally
            dt.Dispose()
            da.Dispose()
        End Try

        Return getData
    End Function

并获取sql类型,此代码无法正确转换,尤其是双精度

and get types sql, this one doesn't convert properly, especially doubles

Private Sub GetTypesSQL()
        If (mListOfTypes Is Nothing) Then
            mListOfTypes = New List(Of Type)()
        End If

        mListOfTypes.Clear()

        Dim dtTabelShema As DataTable = db.GetDataTable("SELECT TOP 0 * FROM " & _table)

        Using dtTabelShema
            For Each col As DataColumn In dtTabelShema.Columns
                mListOfTypes.Add(col.DataType)
            Next
        End Using
    End Sub

推荐答案

我认为您已经使它变得比所需的更加复杂.例如,通过创建一个空的DataTable并从中获取数据类型来获得dbSchema.为什么不只使用第一个表而不是从Types创建一个新表呢?对于导入的每批行,也不需要一遍又一遍地重建该表.

I think you have made it more complicated than it needs to be. For instance, you get the dbSchema by creating an empty DataTable and harvesting the Datatypes from it. Why not just use that first table rather than creating a new table from the Types? The table also need not be reconstructed over and over for each batch of rows imported.

通常,由于OleDb会尝试从数据中推断类型,因此似乎没有必要,甚至在某些情况下可能会妨碍您的工作.另外,您正在重做OleDB所做的一切并将数据复制到其他DT.鉴于此,我将跳过OleDB施加的开销并处理原始数据.

Generally since OleDb will try to infer types from the data, it seems unnecessary and may even get in the way in some cases. Also, you are redoing everything that OleDB does and copying data to a different DT. Given that, I'd skip the overhead OleDB imposes and work with the raw data.

这将使用CSV列名称和数据库中的类型"创建目标表.如果CSV的列顺序与SELECT *查询中的列顺序不同,则它将失败.

This creates the destination table using the CSV column name and the Type from the Database. If the CSV is not in the same column order as those delivered in a SELECT * query, it will fail.

以下代码使用一个类将csv列映射到db表列,因此代码不取决于CSV的顺序相同(因为它们可能是在外部生成的).我的样本数据CSV是,但顺序相同:

The following uses a class to map csv columns to db table columns so the code is not depending on the CSVs being in the same order (since they may be generated externally). My sample data CSV is not in the same order:

Public Class CSVMapItem

    Public Property CSVIndex As Int32
    Public Property ColName As String = ""
   'optional
    Public Property DataType As Type

    Public Sub New(ndx As Int32, csvName As String,
                   dtCols As DataColumnCollection)

        CSVIndex = ndx

        For Each dc As DataColumn In dtCols
            If String.Compare(dc.ColumnName, csvName, True) = 0 Then
                ColName = dc.ColumnName
                DataType = dc.DataType
                Exit For
            End If
        Next

        If String.IsNullOrEmpty(ColName) Then
            Throw New ArgumentException("Cannot find column: " & csvName)
        End If
    End Sub
End Class

用于解析csv的代码使用CSVHelper,但是在这种情况下可以使用TextFieldParser,因为该代码只是将CSV行读取到字符串数组中.

The code to parse the csv uses CSVHelper but in this case the TextFieldParser could be used since the code just reads the CSV rows into a string array.

Dim SQL = String.Format("SELECT * FROM {0} WHERE ID<0", DBTblName)
Dim rowCount As Int32 = 0
Dim totalRows As Int32 = 0
Dim sw As New Stopwatch
sw.Start()

Using dbcon As New MySqlConnection(MySQLConnStr)
    Using cmd As New MySqlCommand(SQL, dbcon)

        dtSample = New DataTable
        dbcon.Open()

        ' load empty DT, create the insert command
        daSample = New MySqlDataAdapter(cmd)
        Dim cb = New MySqlCommandBuilder(daSample)
        daSample.InsertCommand = cb.GetInsertCommand
        dtSample.Load(cmd.ExecuteReader())

        ' dtSample is not only empty, but has the columns
        ' we need

        Dim csvMap As New List(Of CSVMapItem)

        Using sr As New StreamReader(csvfile, False),
                        parser = New CsvParser(sr)

            ' col names from CSV
            Dim csvNames = parser.Read()
            ' create a map of CSV index to DT Columnname  SEE NOTE
            For n As Int32 = 0 To csvNames.Length - 1
                csvMap.Add(New CSVMapItem(n, csvNames(n), dtSample.Columns))
            Next

            ' line data read as string
            Dim data As String()
            data = parser.Read()
            Dim dr As DataRow

            Do Until data Is Nothing OrElse data.Length = 0

                dr = dtSample.NewRow()

                For Each item In csvMap
                    ' optional/as needed type conversion
                    If item.DataType = GetType(Boolean) Then
                        ' "1" wont convert to bool, but (int)1 will
                        dr(item.ColName) = Convert.ToInt32(data(item.CSVIndex).Trim)
                    Else
                        dr(item.ColName) = data(item.CSVIndex).Trim
                    End If
                Next
                dtSample.Rows.Add(dr)
                rowCount += 1

                data = parser.Read()

                If rowCount = 50000 OrElse (data Is Nothing OrElse data.Length = 0) Then
                    totalRows += daSample.Update(dtSample)
                    ' empty the table if there will be more than 100k rows
                    dtSample.Rows.Clear()
                    rowCount = 0
                End If
            Loop
        End Using

    End Using
End Using
sw.Stop()
Console.WriteLine("Parsed and imported {0} rows in {1}", totalRows,
                    sw.Elapsed.TotalMinutes)

如果有许多行,则处理循环每50K行更新一次DB.它也可以一次完成,而不是一次通过OleDB读取N行. CsvParser一次只能读取一行,因此每次手头上的数据永远不应超过50,001行.

The processing loop updates the DB every 50K rows in case there are many many rows. It also does it in one pass rather than reading N rows thru OleDB at a time. CsvParser will read one row at a time, so there should never be more than 50,001 rows worth of data on hand at a time.

If item.DataType = GetType(Boolean) Then所示,可能需要处理特殊的类型转换.读为"1"的布尔列不能直接传递给布尔列,因此将其转换为可以的整数.可能还有其他转换,例如时髦的约会.

There may be special cases to handle for type conversions as shown with If item.DataType = GetType(Boolean) Then. A Boolean column read in as "1" cant be directly passed to a Boolean column, so it is converted to integer which can. There could be other conversions such as for funky dates.

处理250,001行的时间:3.7分钟.需要将这些字符串转换应用于每个单个字符串列的应用程序将花费更长的时间.我很确定,在CSVHelper中使用CsvReader可以将其应用为解析到Type的一部分.

Time to process 250,001 rows: 3.7 mins. An app which needs to apply those string transforms to every single string column will take much longer. I'm pretty sure that using the CsvReader in CSVHelper you could have those applied as part of parsing to a Type.

由于这将是一个通用的进口商/洗涤器,因此有潜在的灾难等待发生.

There is a potential disaster waiting to happen since this is meant to be an all-purpose importer/scrubber.

For i As Integer = 0 To dt.Columns.Count - 1
    Dim columnName As String = firstRow(i).ToString()
    Dim newColumn As New DataColumn(columnName, mListOfTypes(i))
    getData.Columns.Add(newColumn)
Next

问题和自我回答均使用目标表中CSV的列名和SELECT *查询中的DataTypes来构建新表.因此,假定CSV列的顺序与SELECT *会返回它们的顺序相同,并且所有CSV都将始终使用与表相同的名称.

Both the question and the self-answer build the new table using the column names from the CSV and the DataTypes from a SELECT * query on the destination table. So, it assumes the CSV Columns are in the same order that SELECT * will return them, and that all CSVs will always use the same names as the tables.

上面的答案稍好一点,因为它可以根据名称进行查找和匹配.

The answer above is marginally better in that it finds and matches based on name.

一个更可靠的解决方案是编写一个小实用程序,其中用户将数据库列名称映射到CSV索引.将结果保存到List(Of CSVMapItem)并进行序列化.这些可能有一个完整的集合保存到磁盘.然后,与其在航海推算的基础上创建地图,不如在上述代码中将用户的期望反序列化为csvMap.

A more robust solution is to write a little utility app where a user maps a DB column name to a CSV index. Save the results to a List(Of CSVMapItem) and serialize it. There could be a whole collection of these saved to disk. Then, rather than creating a map based on dead reckoning, just deserialize the desired for user as the csvMap in the above code.

这篇关于在vb.net中获取列名Jet OLE DB的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆