Sequence or Batch items DataGridView


Problem Description


I have a large DataGridView with 940000 rows... ouch, filled by parsing a csv file. The DataGridView has a column named Sequence, numbered 1 to 940000. What I am attempting to do is re-number the sequence to split it up into sequences of 1 to 7000 for the number of rows in the DataGridView. What's the most efficient way to reorder the sequence column?

Using reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(fileName)
        reader.TextFieldType = FileIO.FieldType.Delimited
        reader.SetDelimiters(",")
        Dim currentRow As String()
        Dim serial As String
        Dim sequence As Integer = 0
        Dim RollId As String

        'pbUploadFile.Maximum = serialAmmount / quantityBreak
        pbUploadFile.Maximum = serialAmmount
        pbUploadFile.Step = 1
        pbUploadFile.Value = 0

        For i = 1 To serialAmmount / quantityBreak
            For j = 1 To quantityBreak
                Try
                    currentRow = reader.ReadFields()
                    serial = currentRow(0).ToString
                    sequence += 1
                    EnterDataIntoDatabase(serial, sequence, nextRollNumber, ddSelectPartNumber.Text)
                    pbUploadFile.Increment(1)
                Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                    MsgBox("Code " & ex.Message & "is not valid and will be skipped check csv file")
                End Try
            Next j

            sqlCmd = New SqlClient.SqlCommand("SELECT * FROM serials WHERE Sequence=@sequence AND RollNo=@rollNo ", sqlCon)
            sqlCmd.CommandType = CommandType.Text
            sqlCmd.Parameters.AddWithValue("@sequence", 1)
            sqlCmd.Parameters.AddWithValue("@rollNo", nextRollNumber)
            sqlCon.Open()
            Dim readRollId As SqlClient.SqlDataReader = sqlCmd.ExecuteReader()
            If readRollId.Read() Then
                RollId = readRollId.Item("Code")
            End If
            sqlCon.Close()


            UpdateAvailableRolls(ddSelectPartNumber.Text, nextRollNumber, RollId)
            nextRollNumber += 1
            UpdateRollNo(nextRollNumber)
            sequence = 0
            'pbUploadFile.Increment(1)
        Next i
        SaveFile()
    End Using

Solution

It is usually best to take into consideration how the data will be used when deciding exactly how to do something and what tools to use to do it. There is no one right | fast | efficient way to do most things.

That said, there are some bad ways of doing things. Using a DataGridView as a data container seems ill-advised (I can't actually see anything related to a DGV in the code, though). First, there is no automatic way for the data to get into it - you have to write code to do that; second, there is no automatic way for the data to go somewhere else - you have to write code to loop through it and fish the data back out. Then there is the matter of all the data likely being stored as strings.

There also looks to be more going on than just batching up items. The following will import rows from a CSV, process them and write them back to a DB (I am using MySql, but the concepts are the same).


First, the TextFieldParser is a pretty handy tool, but it has a major drawback: it only returns strings. If the CSV has prices, dates, booleans, etc. in it, that type information is lost. In many cases, CsvHelper would be a better choice.
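
For example, reading typed records with CsvHelper might look roughly like the sketch below. This is only an illustration, not part of the original code: the SerialRecord class, its Serial property, and the file layout are assumptions.

Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Linq
Imports CsvHelper

' Hypothetical shape of one CSV row; adjust the properties to match the real file.
Public Class SerialRecord
    Public Property Serial As String
End Class

Public Module CsvImport
    Public Function ReadSerials(fileName As String) As List(Of SerialRecord)
        Using sr As New StreamReader(fileName),
              csv As New CsvReader(sr, CultureInfo.InvariantCulture)
            ' GetRecords maps header names to properties and preserves the CLR types
            Return csv.GetRecords(Of SerialRecord)().ToList()
        End Using
    End Function
End Module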

In this case, since the data is destined for a database, I would use OleDB to read the CSV into a DataTable, batch it, then send it to the DB.

Import Data using OleDB

Schema.INI

OleDb includes a text file driver which can be used to parse CSVs. It can "guess" at the data types based on the context of the first few rows, but you can also define them. In the folder/directory where the CSV resides, create a new text file named Schema.INI. Define the CSV and columns like this:

[Capitals.Csv]
ColNameHeader=True
Format=CSVDelimited
TextDelimiter=
DecimalSymbol=.
CurrencySymbol=$
Col1="Country" Text Width 254
Col2="Capital City" Text Width 254
Col3="Population" Single
Col4="Rank" Integer
Col5="National Day" Date

  • You can have multiple csv definitions in a single file, each starting with [...]
  • The [...] would be the name of the CSV
  • If the CSV has a header row, it can use those for the column names
  • If the columns are also enclosed in quotes ("Like this","in","the csv"), use TextDelimiter="
  • Each Col#= entry defines the datatype and can override the name. This allows you to "map" a column named "Foo" in the CSV to one named "Bar" in the DB.
  • Other options like the decimal and currency symbol and the code page used in the file can be specified.

Connection String

The connection string to use would be:

ACEImportConnStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='{0}';Extended Properties='TEXT'"

The Data Source would be the folder where both the CSV and Schema.INI exist and the 'TEXT' element tells it to use the Text driver. Fill in the blank using the folder name:

ACEImportConnStr = String.Format(ACEImportConnStr, "C:\Temp")

OLEDB.12 can sometimes be finicky; if you have problems, use Microsoft.Jet.OLEDB.4.0 for the Provider instead.
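
A rough idea of what that fallback might look like (the folder is just a placeholder; the HDR and FMT extended properties are common with the Jet text driver but are assumptions here, since Schema.INI can also supply them):

' Assumed Jet fallback; same folder-as-Data-Source pattern as the ACE string above.
ACEImportConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';Extended Properties='text;HDR=Yes;FMT=Delimited'"
ACEImportConnStr = String.Format(ACEImportConnStr, "C:\Temp")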

Now, to load the data, just select from the CSV file name (no folder):

Dim sSQL = "SELECT * FROM RandomOle.CSV"
...
Dim daSrc = New OleDbDataAdapter(sSQL, OleCSVConnstr)
rowsLoaded = daSrc.Fill(dtSample)

The DataAdapter will read the Schema for the definitions and load the CSV into a datatable in just a few seconds. There is more to be done to handle other tasks, but that is the concept.

Dim sSQL = "SELECT * FROM YOUR_CSVFILE_NAME.CSV"
Dim sw As New Stopwatch

Dim rowsLoaded As Int32
Dim rowsUpdated As Int32

sw.Start()
ACEImportConnStr = String.Format(ACEImportConnStr, "C:\Temp")

' create Destination MySQL conn, Src and Dest dataadapters,
' and a command builder (because I am lazy...and fallible)
Using mysqlCon As New MySqlConnection(MySQLConnStr),
    daSrc As New OleDbDataAdapter(sSQL, ACEImportConnStr),
    daDest As New MySqlDataAdapter("SELECT * FROM Sample", mysqlCon),
    cb As New MySqlCommandBuilder(daDest)

    ' important!
    daSrc.AcceptChangesDuringFill = False

    dtSample = New DataTable
    rowsLoaded = daSrc.Fill(dtSample)

    ' csv lacks an ID column - add it
    Dim dc As New DataColumn("Id", GetType(Int32))
    dc.DefaultValue = 1
    dtSample.Columns.Add(dc)
    dc.SetOrdinal(0)

    ' MY csv also lacks a BATCH column
    dc = New DataColumn("Batch", GetType(Int32))
    dc.DefaultValue = 1
    dtSample.Columns.Add(dc)
    dc.SetOrdinal(1)

    ' set the batch number
    ' each 5k rows == a batch
    Dim batch As Int32 = 1
    Dim counter As Int32 = 1
    For Each dr As DataRow In dtSample.Rows
        dr("Batch") = batch
        counter += 1
        If counter > 5000 Then
            counter = 0
            batch += 1
        End If
    Next

    ' now save the data to MySQL
    mysqlCon.Open()
    ' inserting 250k rows takes a while,
    ' use a transaction
    Using t As MySqlTransaction = mysqlCon.BeginTransaction
        rowsUpdated = daDest.Update(dtSample)
        t.Commit()
    End Using

End Using

' show the IMPORT in a dgv
dgv1.DataSource = dtSample
dgv1.Columns("Id").Visible = False

' report
sw.Stop()
Console.WriteLine(sw.ElapsedMilliseconds)

The principle is simple: since the data is bound for a DB, get the data into a DataTable ASAP. The trick here is that there are 2 DB Providers involved: OleDB to read the csv and MySql for the saving.

  • Normally when a DataAdapter fills a DataTable all the rows are set to Unchanged. AcceptChangesDuringFill = False leaves the states set to Added so that the MySql adapter can insert those rows.
  • The CommandBuilder builds the INSERT SQL to be used from the SELECT command.
  • I don't know what that serials-rollno query is doing, but I would not run queries inside the import process loop. If some of the values you need to set depend on values in the DB, load them into another DT and query them from there. There are some DataTable extension methods that make it easy to find rows; see the sketch after this list.
  • Likewise, I don't know what EnterDataIntoDatabase does, but you should strive to process and prepare all the imported data in the DataTable, then update it all at once.
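
For instance, if the roll information were preloaded once into its own DataTable outside the loop, the in-loop lookup could be done in memory. This is only a sketch: dtRolls, its columns, and the preload query are assumptions based on the question's code.

' Assumed one-time preload, outside the import loop:
'   Dim daRolls As New SqlClient.SqlDataAdapter("SELECT RollNo, Code FROM serials WHERE Sequence = 1", sqlCon)
'   Dim dtRolls As New DataTable()
'   daRolls.Fill(dtRolls)

' Inside the loop, find the row for the current roll number in memory:
Dim hits As DataRow() = dtRolls.Select("RollNo = " & nextRollNumber)
If hits.Length > 0 Then
    ' Field(Of T) is one of the DataRow extension methods (System.Data.DataSetExtensions)
    RollId = hits(0).Field(Of String)("Code")
End If

If RollNo were set as the table's PrimaryKey, dtRolls.Rows.Find(nextRollNumber) would be another option.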

You appear to have more going on than just batching or sequencing a bunch of rows. The code above can import 250k rows, assign batch numbers, and insert 250k new rows into MySql in 1.2 minutes (almost 3500 rows per second).


If the batch/sequencer is anything like each X number of rows in order from the CSV, you might be able to just load 7000 rows at a time, set the value, save that batch and then load the next 7k rows. This would limit the number of rows loaded at any one time and reduce the memory the app uses. I am not sure if it applies though.
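
If it does apply, a chunked load might look roughly like the sketch below, reusing daSrc and daDest from the sample above. The startRecord/maxRecords overload of Fill exists on the data adapter, but it is an assumption here that the text driver handles the skipping acceptably; the adapter may still read and discard the skipped rows on each pass, so this bounds memory use, not necessarily total read time.

' Sketch: page through the CSV 7000 rows at a time and save each chunk as a batch.
Const chunkSize As Int32 = 7000
Dim startRecord As Int32 = 0
Dim batch As Int32 = 1
Dim totalInserted As Int32 = 0

daSrc.AcceptChangesDuringFill = False          ' keep the rows flagged as Added

Do
    Dim dtChunk As New DataTable()
    Dim rowsRead As Int32 = daSrc.Fill(startRecord, chunkSize, dtChunk)
    If rowsRead = 0 Then Exit Do

    ' the CSV has no Batch column; existing rows pick up the DefaultValue
    Dim dcBatch As New DataColumn("Batch", GetType(Int32)) With {.DefaultValue = batch}
    dtChunk.Columns.Add(dcBatch)

    totalInserted += daDest.Update(dtChunk)     ' insert this batch into MySQL
    startRecord += rowsRead
    batch += 1
Loop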

