Problems reading doubles from CSV - VBA


Question

I want to read a CSV file from Excel VBA, but I have a problem with Double values. For example, the value 125.5 in the CSV is read without the decimal point, so I get 1255. My code:

Dim rs As New ADODB.Recordset
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & myDir & ";" & "Extended Properties=""text;HDR=Yes;FMT=Delimited()"";"
strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, strCon, 3, 3
IBH = rs("IBH")

How can I fix this?

Update: I tried @Siddharth Rout's solution, but I still have the same problem. My code now:

Dim conn As New ADODB.Connection
Dim rs As New ADODB.Recordset
Dim myDate, myTime, IBH, IBL
Dim myDir As String, myFileName As String
Dim strSQL As String

myDir = Trim(shParams.Range("fp_path"))
myFileName = Trim(shParams.Range("fp_filename"))

With conn
 .Provider = "Microsoft.ACE.OLEDB.12.0"
 .ConnectionString = "Data Source=" & myDir & ";Extended Properties='text'"
 .Open
End With

strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, conn, 3, 3
rs.MoveLast

myDate = rs("Date")
myTime = rs("Time")
IBH = rs("IBH")
IBL = rs("IBL")

Debug.Print myDate, myTime, IBH, IBL

rs.Close
Set rs = Nothing

This is the result:

[image not preserved]

This is my CSV:

[image not preserved]

Recommended answer

This issue stems from how the ACE engine determines the type of an ADODB field. The driver scans a set number of rows and uses them to decide the type of the entire column.
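To check which type the driver actually picked for each column, you can inspect the recordset's fields after opening it. This is a minimal sketch; the procedure name PrintFieldTypes is made up for illustration, and it assumes the same reference to Microsoft ActiveX Data Objects that the code in the question already uses.

```vba
'Prints the name and ADO type code of every field in an open recordset.
'PrintFieldTypes is a hypothetical helper name, not part of ADODB.
Public Sub PrintFieldTypes(ByVal rs As ADODB.Recordset)
    Dim fld As ADODB.Field

    For Each fld In rs.Fields
        'fld.Type is a DataTypeEnum value, e.g. adInteger = 3, adDouble = 5
        Debug.Print fld.Name, fld.Type
    Next fld
End Sub
```

Calling PrintFieldTypes rs right after rs.Open shows whether a column such as IBH was typed as an integer rather than a double.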

Changing the connection string

One quick thing to try is setting MaxScanRows to 0 in the connection string. With a value of 0 the driver scans all rows to determine the type. Keep in mind that this may affect performance, depending on how large your data set is.

;Extended Properties='text;MaxScanRows=0;IMEX=0'
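For context, here is a sketch of how that fragment could fit into the connection code from the question. The folder and file name are placeholders, HDR=Yes is carried over from the asker's first connection string, and whether MaxScanRows changes the result on a given machine is something to verify rather than assume.

```vba
'Sketch only: opens a CSV through ACE with MaxScanRows=0 and prints the inferred types.
Public Sub OpenCsvWithFullScan()
    Dim conn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim fld As ADODB.Field
    Dim myDir As String, myFileName As String

    myDir = "C:\Data"           'Assumption: folder that contains the CSV
    myFileName = "MyData.csv"   'Assumption: name of the CSV file

    Set conn = New ADODB.Connection
    With conn
        .Provider = "Microsoft.ACE.OLEDB.12.0"
        .ConnectionString = "Data Source=" & myDir & _
            ";Extended Properties='text;HDR=Yes;MaxScanRows=0;IMEX=0'"
        .Open
    End With

    Set rs = New ADODB.Recordset
    rs.Open "SELECT * FROM [" & myFileName & "]", conn, adOpenStatic, adLockReadOnly

    'Show which type the driver chose for each column (adDouble = 5)
    For Each fld In rs.Fields
        Debug.Print fld.Name, fld.Type
    Next fld

    rs.Close
    conn.Close
End Sub
```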

This won't always give you the desired result. Say we have a data set like this:

+--------------------------+
|       DoubleField        |
+--------------------------+
| 1                        |
| 2                        |
| 3                        |
| ...(996 more records...) |
| 1000.01                  |
+--------------------------+

The driver will see 999 records that look like an Integer and 1 record that looks like a Double. Based on MajorityType, it will decide that this field is an Integer, not a Double. To be honest, I'm not entirely sure how this type determination is done exactly, but it is something along these lines. I've also seen instances where simply changing the top record to the type you want will work. E.g.

+--------------------------+
|       DoubleField        |
+--------------------------+
| 1.00                     |
| 2                        |
| 3                        |
| ...(996 more records...) |
| 1000.01                  |
+--------------------------+

So another approach could be to format the source file to include decimal places up front. This should be easy enough if you control the source file, but that isn't always the case.

Using a Schema.ini file

If upping MaxScanRows doesn't work, a surefire way to get the type you expect for each column is to use a Schema.ini file, as Comintern pointed out.

Here is the link.

The gist: make a file that explicitly defines the type of each column. For our contrived table above, this becomes:

[MyFileNameGoesHere.csv]
ColNameHeader = True
Format = CSVDelimited
Col1=DoubleField Double

You would then save this file as Schema.ini and place it in the same directory as the file you want to import. The nice thing about this approach is that it only involves creating a text file, so you could even do it from VBA without too much trouble. A downside is that if you have lots of files to import, it can be hard to manage all the Schema.ini files.
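As a rough illustration of creating the file from VBA, the sketch below writes the Schema.ini shown above with plain file I/O. The procedure name WriteSchemaIni and its parameters are made up for this example. The DecimalSymbol line is an addition that is not in the file above: it is a documented Schema.ini setting that tells the driver which character separates the fractional part, which may matter if your regional settings use a comma.

```vba
'Sketch: writes a Schema.ini for a single-column CSV into folderPath.
'WriteSchemaIni, folderPath and csvFileName are hypothetical names.
Public Sub WriteSchemaIni(ByVal folderPath As String, ByVal csvFileName As String)
    Dim fileNumber As Long

    fileNumber = FreeFile()
    'Output mode replaces any existing Schema.ini in that folder
    Open folderPath & "\Schema.ini" For Output As #fileNumber
    Print #fileNumber, "[" & csvFileName & "]"
    Print #fileNumber, "ColNameHeader=True"
    Print #fileNumber, "Format=CSVDelimited"
    Print #fileNumber, "DecimalSymbol=."       'Addition, see the note above
    Print #fileNumber, "Col1=DoubleField Double"
    Close #fileNumber
End Sub
```

You would call it before opening the connection, for example WriteSchemaIni myDir, myFileName. Because the file is overwritten on each call, a folder with several CSV files would need one section per file written in the same pass.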

A pure VBA approach

You can create an in-memory table in ADODB and populate it with the data from the CSV file. Here is a small code sample to get you started.

Option Explicit

Private Function getTypedRS() As ADODB.Recordset
    Set getTypedRS = New ADODB.Recordset

    With getTypedRS
        'Add your other fields here
        .Fields.Append "DoubleField", adDouble
    End With
End Function

Public Sub CSVToADODB()
    Dim myTimer         As Double
    Dim FileNumber      As Long
    Dim FilePath        As String
    Dim FileData        As String
    Dim CSVArray        As Variant
    Dim i               As Long
    Dim rs              As ADODB.Recordset

    myTimer = Timer
    Set rs = getTypedRS()
    FilePath = "C:\Users\Ryan\Desktop\Example.csv"

    'Get the CSV
    FileNumber = FreeFile()
    Open FilePath For Binary Access Read As FileNumber
    FileData = Space$(LOF(FileNumber)) 'Create a buffer first, then assign
    Get FileNumber, , FileData
    Close FileNumber

    'My CSV is just a list of Doubles, should be relatively easy to swap out to process with ','
    CSVArray = Split(FileData, vbCrLf)

    'Add data
    rs.Open
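    For i = LBound(CSVArray) + 1 To UBound(CSVArray) '+1 to skip header
        If Len(Trim$(CSVArray(i))) > 0 Then 'Skip blank lines, e.g. after a trailing line break
            rs.AddNew
            'Val always treats "." as the decimal separator, regardless of regional settings
            rs.Fields("DoubleField").Value = Val(CSVArray(i))
        End If
    Next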
    rs.UpdateBatch
    rs.MoveLast

    Debug.Print rs.Fields("DoubleField").Value, "Processed 1000 records in: " & Timer - myTimer
End Sub

The good part about this approach is that it is still quite fast. I was able to load 1000 doubles in ~0.03 seconds, as most of the work is done in memory. It also avoids the need for a Schema.ini file, but there is more code to maintain, so it is a trade-off.
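For a file with several columns, the same idea can be extended by splitting each line on ','. The sketch below assumes the four columns from the question (Date, Time, IBH, IBL) in that order, keeps Date and Time as text because their format isn't shown, and does not handle quoted fields that contain commas. The names getQuoteRS and AddCsvLine are made up for illustration.

```vba
'Sketch: typed in-memory recordset for the columns named in the question.
Private Function getQuoteRS() As ADODB.Recordset
    Set getQuoteRS = New ADODB.Recordset

    With getQuoteRS
        .Fields.Append "Date", adVarChar, 20   'Assumption: kept as text
        .Fields.Append "Time", adVarChar, 20   'Assumption: kept as text
        .Fields.Append "IBH", adDouble
        .Fields.Append "IBL", adDouble
        .Open
    End With
End Function

'Adds one CSV line to the recordset; blank or short lines are ignored.
Private Sub AddCsvLine(ByVal rs As ADODB.Recordset, ByVal csvLine As String)
    Dim parts() As String

    parts = Split(csvLine, ",")
    If UBound(parts) < 3 Then Exit Sub

    rs.AddNew
    rs.Fields("Date").Value = Trim$(parts(0))
    rs.Fields("Time").Value = Trim$(parts(1))
    'Val reads "." as the decimal separator regardless of regional settings
    rs.Fields("IBH").Value = Val(parts(2))
    rs.Fields("IBL").Value = Val(parts(3))
    rs.Update
End Sub
```

In CSVToADODB you would then replace getTypedRS with getQuoteRS and call AddCsvLine rs, CSVArray(i) inside the loop.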

Recommendation

I would try changing MaxScanRows first. If that doesn't work, create a Schema.ini file.
