Fields in the wrong order when importing from CSV to Access

Problem description

I'm providing an import facility in my application to allow users to populate it with basic starting data. I have no control over how they create the data they import or whether they actually follow the import spec I provide (as amply demonstrated by the experience in beta testing to date). I've also learned that they're importing datasets way larger than I expected (largest so far is 750,000 records), so the import needs to perform well.

There'll be five or six different sets of data to import, ranging from 4 columns to 24 columns, and I'm trying to come up with a method I can use for all of them, starting with the simplest dataset.

This simplest import file will be a csv file with 4 columns without headers, and should end up in a temporary Access table with fields as follows.

GenID: text (up to 255 chars) 

Surname: text (up to 255 chars)

GivenNames: text (up to 255 chars)

OtherInfo: multi-line memo (as long as they want)

The data should be imported into a temporary table, where validation will be carried out before being appended to existing data in its final home.

I first tried using DoCmd.TransferText with a saved import spec -- which works, but is painfully slow for anything other than the smallest import files. (I can't swear that it doesn't exhibit some of the same flaws that I discuss with the second method below, as I haven't tested it exhaustively.)

The next method I've tried is based on the approach at https://stackoverflow.com/a/11147920/1943174, although I've struggled to find any documentation on the exact behaviour I should expect when performing an import using an Xml Import Spec, or on the XML syntax I should use.

I create and execute an import spec in VBA. strFilePath identifies the csv file to be imported and strTableName identifies the temporary table to be used for the import.

Dim strXML As String

strXML = ""
strXML = strXML & "<?xml version=""1.0"" encoding=""utf-8"" ?>" & vbCrLf
strXML = strXML & "<ImportExportSpecification Path=" & Chr(34) & strFilePath & Chr(34) & " xmlns=""urn:www.microsoft.com/office/access/imexspec"">" & vbCrLf
strXML = strXML & "   <ImportText TextFormat=""Delimited"" FirstRowHasNames=""false"" FieldDelimiter="","" CodePage=""437"" Destination=" & Chr(34) & strTableName & Chr(34) & " >" & vbCrLf
strXML = strXML & "      <DateFormat DateOrder=""DMY"" DateDelimiter=""/"" TimeDelimiter="":"" FourYearDates=""true"" DatesLeadingZeros=""false"" />" & vbCrLf
strXML = strXML & "      <NumberFormat DecimalSymbol=""."" />" & vbCrLf
strXML = strXML & "           <Columns PrimaryKey=""{none}"">" & vbCrLf
strXML = strXML & "                    <Column Name=""Col1"" FieldName=""GenID"" Indexed=""NO"" SkipColumn=""false"" DataType=""Text"" />" & vbCrLf
strXML = strXML & "                    <Column Name=""Col2"" FieldName=""Surname"" Indexed=""NO"" SkipColumn=""false"" DataType=""Text"" />" & vbCrLf
strXML = strXML & "                    <Column Name=""Col3"" FieldName=""GivenNames"" Indexed=""NO"" SkipColumn=""false"" DataType=""Text"" />" & vbCrLf
strXML = strXML & "                    <Column Name=""Col4"" FieldName=""OtherInfo"" Indexed=""NO"" SkipColumn=""false"" DataType=""Memo"" />" & vbCrLf
strXML = strXML & "         </Columns>" & vbCrLf
strXML = strXML & "     </ImportText>" & vbCrLf
strXML = strXML & "</ImportExportSpecification>"

CurrentProject.ImportExportSpecifications.Add "MyImportSpec", strXML ' Add it to the project
DoCmd.RunSavedImportExport "MyImportSpec" ' Run it

The import is very quick, and the test input data I'm using is correctly formed. The result is an Access table with the right structure, but with the data imported into the wrong fields:

GenID -- in the right place

GivenNames  -- in the Other Info (memo) field

OtherInfo -- in the Surname (text) field

Surname -- in the Given Names (text) field

If I change the order of the columns in the import file so that they're in alphabetical order (GenID, GivenNames, OtherInfo, Surname) then the import works correctly, but that order won't make sense to the users or be simple to produce, and so they're likely to end up populating their import files incorrectly. (Plus it will also fail because of the flaw I'm about to describe).

If I change the names of the fields created in the temporary table to XA, XB, XC, XD (planning to change the field names after the import, once the data is validated and is being appended to the table which will be its final home), the right data goes into the 'right' field, but only as long as there are exactly 4 columns in the import file. If a fifth is present, I get the following fields:

Field5 -- contains GenID data

XA -- contains Surname

XB -- contains GivenNames

XC -- contains OtherInfo

XD -- contains the data that was in the 5th column in the import file

The same problems occur if I append the imported data to an existing (empty) table with the correct structure rather than allow the import to create a new table.

Any suggestions on how I can make this work, or for other approaches which will handle rapid import of text and multi-line memo fields in large import csv datasets where the users can't be guaranteed not to have included extraneous columns at the end? I don't have any ability to edit the import csv -- all data changes need to be within Access and the csv file must come through the process unchanged.

This update is brought to you by the Department of the Blindingly Obvious:

If I name the imported fields "Field1", "Field2" etc., any extra fields should slot in, in order, after those, so extra fields aren't a problem. I'd still prefer to import into the right field names.

Recommended answer

Consider using a query to import your CSV data. This query selects the first 4 columns from my CSV file:

SELECT csv.F1, csv.F2, csv.F3, csv.F4
FROM [Text;FMT=Delimited;HDR=NO;IMEX=2;CharacterSet=437;DATABASE=C:\Users\hans\Documents].[no_field_names.csv] As csv;

That CSV file does not include field names, so Access assigns them as F1 thru F4. And actually, the CSV includes more than 4 columns, but I only wanted the first 4 (I think that corresponds to your situation).

If you build and test a similar query in the query designer, beware it may alter the FROM clause to something like this ...

FROM (Text;FMT=Delimited;HDR=NO;IMEX=2;CharacterSet=437;DATABASE=C:\Users\hans\Documents) no_field_names.csv As csv;

... and that won't work. So you will have to check and correct it again each time Access makes its unhelpful change.
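
One possible way to avoid that rewrite is to create the saved query from VBA instead of saving it from the designer. The following is only a sketch: the procedure name SaveCsvSelectQuery and the query name qryCsvFirstFour are made up for illustration, and it assumes the FROM clause is only altered when the query is saved from the designer, which I have not verified.

```vba
' Sketch only: SaveCsvSelectQuery and qryCsvFirstFour are hypothetical names.
' Assumes the folder and file from the SELECT example above.
Public Sub SaveCsvSelectQuery()
    Const QUERY_NAME As String = "qryCsvFirstFour"
    Dim db As DAO.Database
    Dim strSQL As String

    Set db = CurrentDb

    ' Same SELECT as shown above, with the bracketed connect string intact.
    strSQL = "SELECT csv.F1, csv.F2, csv.F3, csv.F4 " & _
             "FROM [Text;FMT=Delimited;HDR=NO;IMEX=2;CharacterSet=437;" & _
             "DATABASE=C:\Users\hans\Documents].[no_field_names.csv] AS csv;"

    ' Drop any earlier copy of the query; ignore the error if it does not exist.
    On Error Resume Next
    db.QueryDefs.Delete QUERY_NAME
    On Error GoTo 0

    ' Store the SQL text as a saved query without going through the designer.
    db.CreateQueryDef QUERY_NAME, strSQL
End Sub
```

If you later open that query in Design View and save it, check the FROM clause again.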

Once you get a SELECT query which returns what you need, you can transform it to an INSERT query.

INSERT INTO YourTable (GenID, Surname, GivenNames, OtherInfo)
SELECT csv.F1, csv.F2, csv.F3, csv.F4
FROM [Text;FMT=Delimited;HDR=NO;IMEX=2;CharacterSet=437;DATABASE=C:\Users\hans\Documents].[no_field_names.csv] As csv;

However, I'm unsure what will happen with the memo field in your data source. If it doesn't prevent the query from working correctly, at least this approach should allow you to extract the correct fields and store them in the correct destination fields.
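
To run the INSERT from code, something like the following might work. It is a minimal sketch, not tested against your data: the procedure name ImportCsvByQuery is hypothetical, strFilePath and strTableName are the variables from your question, and it assumes the target table already exists with the four fields named as in your spec.

```vba
' Sketch only: ImportCsvByQuery is a hypothetical name.
' Assumes strFilePath is a full path (folder plus file name) and that
' strTableName already exists with GenID, Surname, GivenNames, OtherInfo.
Public Sub ImportCsvByQuery(ByVal strFilePath As String, ByVal strTableName As String)
    Dim strFolder As String
    Dim strFile As String
    Dim strSQL As String
    Dim lngPos As Long

    ' Split the full path into the folder (for DATABASE=) and the file name.
    lngPos = InStrRev(strFilePath, "\")
    If lngPos < 2 Then
        Err.Raise vbObjectError + 1, "ImportCsvByQuery", "Expected a full file path."
    End If
    strFolder = Left$(strFilePath, lngPos - 1)
    strFile = Mid$(strFilePath, lngPos + 1)

    ' Only F1 to F4 are selected, so any extra trailing columns are ignored.
    strSQL = "INSERT INTO [" & strTableName & "] (GenID, Surname, GivenNames, OtherInfo) " & _
             "SELECT csv.F1, csv.F2, csv.F3, csv.F4 " & _
             "FROM [Text;FMT=Delimited;HDR=NO;IMEX=2;CharacterSet=437;" & _
             "DATABASE=" & strFolder & "].[" & strFile & "] AS csv;"

    ' dbFailOnError makes a failed insert raise a trappable error.
    CurrentDb.Execute strSQL, dbFailOnError
End Sub
```

Because the SELECT list names the source columns by position and the INSERT list names the destination fields explicitly, the alphabetical ordering of the field names should not affect where the data lands.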
