导入制表符分隔的CSV,无法识别制表符分隔符 [英] Import tab separated CSV, tab delimiter not recognized

查看:159
本文介绍了导入制表符分隔的CSV,无法识别制表符分隔符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些制表符分隔的csv文件,我想将其转换为xlsx.因此,每个csv应该转换为xlsx.文件名应该相同.但是,文件是制表符分隔的.例如,请参见以下测试文件屏幕截图:

运行代码时(我之前创建了子文件夹xlsx):

  Sub all()昏暗的sourcepath作为字符串Dim sDir As String昏暗的newpath作为字符串sourcepath ="C:\ Users \ PC \ Desktop \ Test \"newpath =源路径&"xlsx \"'确保在创建子文件夹xlsx之前sDir = Dir $(源路径&"* .csv",vbNormal)直到Len(sDir)= 0为止打开工作簿(sourcepath& sDir)使用ActiveWorkbook.SaveAs文件名:= Replace(Left(.FullName,InStrRev(.FullName,."))),sourcepath,newpath)&"xlsx",FileFormat:= xlOpenXMLWorkbook,CreateBackup:= False.关闭结束于sDir = Dir $环形结束子 

它确实有效,但是当我查看excel文件时:

我可以看到未检测到制表符分隔符.我认为我的本地设置是分隔符是分号,这就是为什么它不起作用的原因.现在,我想将dataType设置为xlDelimited并将制表符设置为True,将一行更改为:

  Workbooks.Open(Spath& sDir),数据类型:= xlDelimited,制表符:= True 

我也尝试过

  Workbooks.Open(Spath& sDir,数据类型:= xlDelimited,制表符:= True) 

  Workbooks.Open FileName:= Spath&sDir,数据类型:= xlDelimited,制表符:= True 

但这会导致错误消息.然后,我尝试了另一种方法,将定界符设置为Chr(9)(制表符),将local设置为false:

  Sub all()昏暗的wb作为工作簿昏暗的strFile作为字符串Dim strDir作为字符串strDir ="C:\ Users \ PC \ Desktop \ Test \"strFile = Dir(strDir&"* .csv")做strFile<>"设置wb = Workbooks.Open(文件名:= strDir& strFile,分隔符:= Chr(9),本地:= False)带wb.SaveAs Replace(wb.FullName,".csv",".xlsx"),51.关闭真实结束于设置wb =否strFile =目录环形结束子 

它不会导致错误.但是当我打开文件时,它看起来像:

再次出现同样的问题,无法识别制表符分隔符.我该如何解决?

(我也使用local:True和Delimiter:= Chr(9)进行了尝试,但同样的问题,我也尝试通过添加Format:= 6来进行尝试)

我用csv尝试过这种方式,因为我不想使用txt文件扩展名.原因是使用csv可以轻松地允许特殊语言字符,例如ö",和ü".因此,这就是为什么我想要将csv转换为xlsx而不使用使用txt的解决方法的原因,因为当我尝试将txt转换为xlsx时遇到一个无法正确识别某些特殊字符的问题,我希望避免这种情况使用csv的问题.

csv(或实际上是tsv,因为它们使用制表符作为分隔符而不是分号)具有不同的列.因此,一个csv文件可能有5列,其他6列和数据类型也有所不同.

对EEM的回答:

检查此Test.csv文件,它看起来像这样:

按标签分隔.不是分号.

当我运行代码(向代码中添加.TextFileDecimalSeparator =.")并检查生成的xlsx文件时,它看起来像这样:

第二列(列Ä)中的值(如9987.5)已正确转换为9987,5.但是最后一列(ColumnI)中的值被错误地转换.这是我的问题.(我不知道为什么特殊字符无法正确转换,因为在我的原始文件中确实可以正常工作.)

解决方案

如@RonRosenfeld所述,具有 .csv 扩展名的文件将由excel作为带有制表符分隔符的文本文件打开.

以下假设也不正确:

Option1 txt对我来说不是一种方法,因为我面临一个新问题,即UTF-8特殊字符(例如äö等)无法正确导入.

字符或 code页面的处理与文件扩展名无关,而是由

生成了这个"xlsx"文件:

希望将这些过程添加到您的项目中应该很简单,让我知道您可能会对使用的资源有任何疑问.

  Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()昏暗的sFile作为字符串昏暗sPathSrc作为字符串,sPathTrg作为字符串Dim sFilenameSrc作为字符串,sFilenameTrg作为字符串Dim bShts作为字节,exCalc作为XlCalculationRem sPathSrc ="C:\ Users \ PC \ Desktop \ Test \"sPathTrg = sPathSrc&"xlsx \"Rem Excel属性关闭随着申请.EnableEvents =假.DisplayAlerts =假.ScreenUpdating = FalseexCalc =.计算.Calculation = xlCalculationManual.CalculateBeforeSave = FalsebShts = .SheetsInNewWorkbook.SheetsInNewWorkbook = 1结束于Rem验证目标文件夹如果Len(Dir $(sPathTrg,vbDirectory))= 0,则MkDir sPathTrgRem处理Csv文件sFile = Dir $(sPathSrc&"* .csv")直到Len(sFile)= 0为止sFilenameSrc = sPathSrc&文件sFile = Left(sFile,-1 + InStrRev(sFile,".csv")))'@sFilenameTrg = sPathTrg&文件和".xlsx"'@调用Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFile,sFilenameSrc,sFilenameTrg)'@sFile = Dir $环形Rem Excel属性关闭随着申请.SheetsInNewWorkbook = bShts.计算= exCalc.CalculateBeforeSave = True.ScreenUpdating =真.DisplayAlerts =真.EnableEvents = True结束于结束子 

...

  Sub Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sWsh作为字符串,sFilenameSrc作为字符串,sFilenameTrg作为字符串)'@Dim Wbk作为工作簿雷姆工作簿-添加设置Wbk = Workbooks.Add与Wbk使用.Worksheets(1)雷姆QueryTable-添加使用.QueryTables.Add(Connection:=" TEXT;"& sFilenameSrc,Destination:=.Cells(1))雷姆QueryTable-属性.SaveData = True.TextFileParseType = xlDelimited.TextFileDecimalSeparator =.".TextFileThousandsSeparator =,".TextFileConsecutiveDelimiter =假.TextFileTabDelimiter = True.TextFileTrailingMinusNumbers =真.TextFilePlatform = 65001'Unicode(UTF-8).刷新BackgroundQuery:= False雷姆QueryTable-删除.删除结束于雷姆(Rem)重命名工作表'@在错误恢复下一个'@时忽略错误,如果文件名作为Sheetname无效.名称= sWsh'@出错时转到0'@结束于雷姆工作簿-保存并保存关闭.SaveAs文件名:= sFilenameTrg,文件格式:= xlOpenXMLWorkbook.关闭结束于结束子 

I have tab separated csv files which I want to transform to xlsx. So each csv should be transformed to a xlsx. Filename should be the same. However, the files are tab separated. For example, see this test file screenshot:

When I run my code (I created a subfolder xlsx before):

Sub all()

   Dim sourcepath As String
   Dim sDir As String
   Dim newpath As String
    
    sourcepath = "C:\Users\PC\Desktop\Test\"
    newpath = sourcepath & "xlsx\"
    
    'make sure subfolder xlsx was created before

    sDir = Dir$(sourcepath & "*.csv", vbNormal)
    Do Until Len(sDir) = 0
        Workbooks.Open (sourcepath & sDir)
        With ActiveWorkbook
            .SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
            .Close
        End With
        
        sDir = Dir$
    Loop
End Sub

it does work, but when I look into the excel file:

I can see that the tab separator was not detected. I think my local settings are that the separator is a semi-colon and that's why it is not working. Now I wanted to set dataType to xlDelimited and tab to True, with changing the one line to:

Workbooks.Open (Spath & sDir), DataType:=xlDelimited, Tab:=True

I also tried

Workbooks.Open (Spath & sDir, DataType:=xlDelimited, Tab:=True)

or

Workbooks.Open FileName:=Spath & sDir, DataType:=xlDelimited, Tab:=True

But this leads to an error message. I then tried another approach, where I set the delimiter to Chr(9) (tab) and local to false:

Sub all()

    Dim wb As Workbook
    Dim strFile As String
    Dim strDir As String
    
    strDir = "C:\Users\PC\Desktop\Test\"
    strFile = Dir(strDir & "*.csv")
    
    Do While strFile <> ""
    
    Set wb = Workbooks.Open(Filename:=strDir & strFile, Delimiter:=Chr(9), Local:=False)
    With wb
        .SaveAs Replace(wb.FullName, ".csv", ".xlsx"), 51
        .Close True
    End With
    Set wb = Nothing
    strFile = Dir
    Loop
    
End Sub

It does not lead to an error. But when I open the file, it looks like:

So again same problem, the tab separator is not recognized. How can I fix this?

(I also tried it with local:True together with Delimiter:=Chr(9), but same problem and I also tried it with adding Format:=6)

I tried it this way with csv as I did not want to go the same way with txt file extension. Reason is that using csv easily allows special language characters like "ö" and "ü". So that is why I wanted to convert csv to xlsx and not use the workaround of using txt instead, as I then run into the problem that when I try to convert txt to xlsx certain special characters are not properly recognised and I hope to avoid this problem with using csv.

The csv (or actually these are tsv, because they have the tab as separator and not semi-colon) files have different columns. So could be one csv file has 5 columns, the other 6 and the datatypes vary too.

EDIT:

In repsonse to EEMs answer:

Check this Test.csv file, it looks like this:

Separated by tab. Not semi-colon.

When I run the code (plus adding .TextFileDecimalSeparator = "." to the code) and check the resulting xlsx file it looks like this:

Values in the second column (ColumnÄ), like 9987.5 are correctly transformed to 9987,5. But values in the last column (ColumnI) are wrongly transformed. This is my problem now. (I dont know why special character does not get transformed correctly, as in my original files this does work.)

解决方案

As mentioned by @RonRosenfeld, files with .csv extension will be opened by excel as a text file with tab delimiters.

Also the following assumption is not accurate:

Option1 txt is not a way for me, as I face a new problem that UTF-8 special characters, like äö and so are not properly imported.

The handling of the characters or code page has nothing to do with the extension of the files, instead it's managed by the Origin parameter of the Workbooks.OpenText method or by the TextFilePlatform property of the QueryTable object.

Therefore unless the files are renamed with a extension different than csv the [Workbooks.OpenText method] will not be effective.

The solution proposed below, uses the QueryTable object and consist of two procedures:

  1. Tab_Delimited_UTF8_Files_Save_As_Xlsx

  • Sets the source and target folder
  • Creates the xlsx folder if not present
  • Gets all csv files in the source folder

  1. Open_Csv_As_Tab_Delimited_Then_Save_As_Xls

  • Process each csv files
  • Adds a workbook to hold the Query Table
  • Imports the csv file
  • Deletes the Query
  • Saves the File as `xlsx'

EDIT I These lines were added to ensure the conversion of the numeric data:

                .TextFileDecimalSeparator = "."
                .TextFileThousandsSeparator = ","

EDIT II A Few changes to rename the worksheet (marked as '@ )

Tested with this csv file:

Generated this `xlsx' file:

Hopefully, it should be straightforward to add these procedure to you project, let me know of any problem or question you might have with the resources used.

Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()
Dim sFile As String
Dim sPathSrc As String, sPathTrg As String
Dim sFilenameSrc As String, sFilenameTrg As String
Dim bShts As Byte, exCalc As XlCalculation
    
Rem sPathSrc = "C:\Users\PC\Desktop\Test\"
    sPathTrg = sPathSrc & "xlsx\"
    
    Rem Excel Properties OFF
    With Application
        .EnableEvents = False
        .DisplayAlerts = False
        .ScreenUpdating = False
        exCalc = .Calculation
        .Calculation = xlCalculationManual
        .CalculateBeforeSave = False
        bShts = .SheetsInNewWorkbook
        .SheetsInNewWorkbook = 1
    End With

    Rem Validate Target Folder
    If Len(Dir$(sPathTrg, vbDirectory)) = 0 Then MkDir sPathTrg

    Rem Process Csv Files
    sFile = Dir$(sPathSrc & "*.csv")
    Do Until Len(sFile) = 0
        
        sFilenameSrc = sPathSrc & sFile
        sFile = Left(sFile, -1 + InStrRev(sFile, ".csv"))    '@
        sFilenameTrg = sPathTrg & sFile & ".xlsx"            '@
        
        Call Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFile, sFilenameSrc, sFilenameTrg)    '@
        
        sFile = Dir$
    
    Loop

    Rem Excel Properties OFF
    With Application
        .SheetsInNewWorkbook = bShts
        .Calculation = exCalc
        .CalculateBeforeSave = True
        .ScreenUpdating = True
        .DisplayAlerts = True
        .EnableEvents = True
    End With
    
    End Sub

Sub Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sWsh As String, sFilenameSrc As String, sFilenameTrg As String)    '@
Dim Wbk As Workbook
    
    Rem Workbook - Add
    Set Wbk = Workbooks.Add
    With Wbk
        
        With .Worksheets(1)

            Rem QueryTable - Add
            With .QueryTables.Add(Connection:="TEXT;" & sFilenameSrc, Destination:=.Cells(1))
                
                Rem QueryTable - Properties
                .SaveData = True
                .TextFileParseType = xlDelimited
                .TextFileDecimalSeparator = "."
                .TextFileThousandsSeparator = ","
                .TextFileConsecutiveDelimiter = False
                .TextFileTabDelimiter = True
                .TextFileTrailingMinusNumbers = True
                .TextFilePlatform = 65001               'Unicode (UTF-8)
                .Refresh BackgroundQuery:=False
                
                Rem QueryTable - Delete
                .Delete
            
            End With
        
            Rem Rename Worksheet    '@
            On Error Resume Next    '@ Ignore error in case the Filename is not valid as Sheetname
            .Name = sWsh            '@
            On Error GoTo 0         '@
        
        End With

        Rem Workbook - Save & Close
        .SaveAs Filename:=sFilenameTrg, FileFormat:=xlOpenXMLWorkbook
        .Close
    
    End With

    End Sub

这篇关于导入制表符分隔的CSV,无法识别制表符分隔符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆