Problems reading doubles from CSV - VBA
Question
I want to read a CSV file from Excel VBA, but I have a problem with double values. For example, the value 125.5 in the CSV is read without the decimal point, so I get 1255. My code:
Dim rs As New ADODB.Recordset
strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & myDir & ";" & "Extended Properties=""text;HDR=Yes;FMT=Delimited()"";"
strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, strCon, 3, 3
IBH = rs("IBH")
How can I solve this?
Update: I tried @Siddharth Rout's solution, but I still have the same problem. My code now:
Dim conn As New ADODB.Connection
Dim rs As New ADODB.Recordset
Dim myDate, myTime, IBH, IBL
Dim myDir As String, myFileName As String
Dim strSQL As String
myDir = Trim(shParams.Range("fp_path"))
myFileName = Trim(shParams.Range("fp_filename"))
With conn
.Provider = "Microsoft.ACE.OLEDB.12.0"
.ConnectionString = "Data Source=" & myDir & ";Extended Properties='text'"
.Open
End With
strSQL = "SELECT * FROM " & myFileName
rs.Open strSQL, conn, 3, 3
rs.MoveLast
myDate = rs("Date")
myTime = rs("Time")
IBH = rs("IBH")
IBL = rs("IBL")
Debug.Print myDate, myTime, IBH, IBL
rs.Close
Set rs = Nothing
This is the result:
This is my CSV:
Recommended answer
This issue stems from how the ACE engine determines the type of an ADODB field. The driver scans a set number of rows to decide what the type should be for the entire column.
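To see which type the driver actually settled on, it can help to print the Type of each field after opening the file. The following is a minimal sketch, not part of the original answer: PrintInferredTypes, folderPath and fileName are hypothetical names, and the connection string assumes the ACE provider with a header row.

```vba
Option Explicit

' Hypothetical helper: prints the ADO type the driver inferred for each column.
' folderPath and fileName are placeholders for your own CSV location.
Public Sub PrintInferredTypes(ByVal folderPath As String, ByVal fileName As String)
    Dim conn As Object
    Dim rs As Object
    Dim fld As Object

    ' Late binding, so no reference to the ADO library is required
    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & folderPath & _
              ";Extended Properties='text;HDR=Yes'"

    Set rs = conn.Execute("SELECT * FROM [" & fileName & "]")
    For Each fld In rs.Fields
        ' Type is a DataTypeEnum value, e.g. 3 = adInteger, 5 = adDouble
        Debug.Print fld.Name, fld.Type
    Next fld

    rs.Close
    conn.Close
End Sub
```

If a column you expect to be a Double shows up as an integer or text type, the type inference described above is the likely culprit.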
Changing the connection string
One quick thing you can try is setting MaxScanRows to 0 in the connection string. A value of 0 makes the driver scan all rows to determine the type; keep in mind this may affect performance, depending on how large your data set is.
;Extended Properties='text;MaxScanRows=0;IMEX=0'
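For context, here is a rough sketch of how that fragment might fit into a complete connection string. BuildTextConnString is a hypothetical helper name, and HDR=Yes is an assumption carried over from the question's first snippet; whether the driver honors MaxScanRows from the connection string is exactly what this suggestion is testing.

```vba
Option Explicit

' Hypothetical helper: builds an ACE connection string for a folder of text files.
' HDR=Yes assumes the CSV has a header row, as in the question.
Public Function BuildTextConnString(ByVal folderPath As String) As String
    BuildTextConnString = "Provider=Microsoft.ACE.OLEDB.12.0;" & _
                          "Data Source=" & folderPath & ";" & _
                          "Extended Properties='text;HDR=Yes;MaxScanRows=0;IMEX=0'"
End Function
```

It could then be used as conn.Open BuildTextConnString(myDir) in place of setting Provider and ConnectionString separately.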
This won't always give you the desired result. Say we have a data set like this:
+--------------------------+
| DoubleField |
+--------------------------+
| 1 |
| 2 |
| 3 |
| ...(996 more records...) |
| 1000.01 |
+--------------------------+
The driver will see 999 records that look like an Integer and 1 record that looks like a Double. Based on the majority type, it will decide that this field is an Integer, not a Double. To be honest, I'm not entirely sure exactly how this type determination is done, but it is something along these lines. I've also seen instances where simply changing the top record to the type you want works. E.g.
+--------------------------+
| DoubleField |
+--------------------------+
| 1.00 |
| 2 |
| 3 |
| ...(996 more records...) |
| 1000.01 |
+--------------------------+
So another approach could be to format the source file to include decimal places upfront. This should be easy enough to do if you control the source file, but this isn’t always the case.
Using a Schema.ini file
If changing MaxScanRows doesn't work, a sure-fire way to get the type you expect for each column is to use a Schema.ini file, as Comintern pointed out.
Here is the link.
The gist: create a file that explicitly defines the type of each column. For our contrived table above, this becomes:
[MyFileNameGoesHere.csv]
ColNameHeader = True
Format = CSVDelimited
Col1=DoubleField Double
You would then save this file as Schema.ini and place it in the same directory as the file you want to import. The nice thing about this approach is that it only involves creating a text file; you could even do this in VBA without too much trouble. A downside is that if you have lots of files to import, it can be hard to manage all the Schema.ini files.
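As a rough illustration of generating the file from VBA, the sketch below writes the Schema.ini shown above into the CSV's folder. WriteSchemaIni and its parameters are hypothetical names, and the single DoubleField column matches the contrived table only; add one ColN line per real column.

```vba
Option Explicit

' Hypothetical helper: writes a Schema.ini describing one Double column
' into the same folder as the CSV. Overwrites any existing Schema.ini.
Public Sub WriteSchemaIni(ByVal folderPath As String, ByVal csvFileName As String)
    Dim fileNumber As Long
    Dim schemaPath As String

    ' Make sure the folder path ends with a backslash before appending the name
    schemaPath = folderPath
    If Right$(schemaPath, 1) <> "\" Then schemaPath = schemaPath & "\"
    schemaPath = schemaPath & "Schema.ini"

    fileNumber = FreeFile()
    Open schemaPath For Output As #fileNumber
    Print #fileNumber, "[" & csvFileName & "]"
    Print #fileNumber, "ColNameHeader=True"
    Print #fileNumber, "Format=CSVDelimited"
    Print #fileNumber, "Col1=DoubleField Double"
    Close #fileNumber
End Sub
```

Because this sketch overwrites the file, it drops the sections for any other CSVs in the same folder; if several files share a directory, each needs its own [FileName.csv] section. If the machine's regional settings use a comma as the decimal separator, Schema.ini also accepts a DecimalSymbol entry (for example DecimalSymbol=.), which may be worth trying, although that goes beyond what the answer above tested.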
A pure VBA approach
You can create an in-memory table in ADODB and populate it with the data from the CSV file. Here is a small code sample to get you started.
Option Explicit

Private Function getTypedRS() As ADODB.Recordset
    Set getTypedRS = New ADODB.Recordset
    With getTypedRS
        'Add your other fields here
        .Fields.Append "DoubleField", adDouble
    End With
End Function

Public Sub CSVToADODB()
    Dim myTimer As Double
    Dim FileNumber As Long
    Dim FilePath As String
    Dim FileData As String
    Dim CSVArray As Variant
    Dim i As Long
    Dim rs As ADODB.Recordset

    myTimer = Timer
    Set rs = getTypedRS()
    FilePath = "C:\Users\Ryan\Desktop\Example.csv"

    'Get the CSV
    FileNumber = FreeFile()
    Open FilePath For Binary Access Read As FileNumber
    FileData = Space$(LOF(FileNumber)) 'Create a buffer first, then assign
    Get FileNumber, , FileData
    Close FileNumber

    'My CSV is just a list of Doubles, should be relatively easy to swap out to process with ','
    CSVArray = Split(FileData, vbCrLf)

    'Add data
    rs.Open
    For i = LBound(CSVArray) + 1 To UBound(CSVArray) '+1 to skip header
        If Len(CSVArray(i)) > 0 Then 'Skip blank lines, e.g. after a trailing line break
            rs.AddNew
            rs.Fields("DoubleField").Value = CSVArray(i)
        End If
    Next i
    rs.UpdateBatch
    rs.MoveLast

    Debug.Print rs.Fields("DoubleField").Value, "Processed 1000 records in: " & Timer - myTimer
End Sub
The good part of this approach is that it is still quite fast. I was able to load 1000 doubles in ~0.03 seconds, as most of the work is done in memory. It also avoids the need for a Schema.ini file; however, there is more code to maintain, so it is a trade-off.
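The sample above handles a single column. For a file with several comma-separated numeric columns, each line could be split again before the values are assigned to fields. The sketch below is one way that might look: ParseDoubles is a hypothetical helper, and it assumes a non-empty line of plain numeric fields with no quotes or embedded commas. It uses Val rather than an implicit conversion because Val always treats "." as the decimal separator, regardless of regional settings.

```vba
Option Explicit

' Hypothetical helper: splits one CSV line on commas and converts each
' field to a Double. Assumes a non-empty line of plain numeric fields
' with no quotes or embedded commas.
Public Function ParseDoubles(ByVal csvLine As String) As Double()
    Dim parts() As String
    Dim result() As Double
    Dim i As Long

    parts = Split(csvLine, ",")
    ReDim result(LBound(parts) To UBound(parts))
    For i = LBound(parts) To UBound(parts)
        ' Val always treats "." as the decimal separator, whatever the locale
        result(i) = Val(Trim$(parts(i)))
    Next i
    ParseDoubles = result
End Function
```

Inside the loop of CSVToADODB, the returned array could be assigned field by field after rs.AddNew, for example rs.Fields(0).Value = values(0). Note that Val returns 0 for text it cannot parse, so non-numeric fields would need separate handling.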
Recommendation
I would try changing MaxScanRows first; if that doesn't work, create a Schema.ini file.