Can I import a CSV file and automatically infer the delimiter?

Question

I want to import two kinds of CSV files: some use ";" as the delimiter and others use ",". So far I have been switching between the following two lines:

reader=csv.reader(f,delimiter=';')

reader=csv.reader(f,delimiter=',')

Is it possible not to specify the delimiter and to let the program check for the right delimiter?

The solutions below (Blender and sharth) seem to work well for comma-separated files (generated with LibreOffice) but not for semicolon-separated files (generated with MS Office). Here are the first lines of one semicolon-separated file:

ReleveAnnee;ReleveMois;NoOrdre;TitreRMC;AdopCSRegleVote;AdopCSAbs;AdoptCSContre;NoCELEX;ProposAnnee;ProposChrono;ProposOrigine;NoUniqueAnnee;NoUniqueType;NoUniqueChrono;PropoSplittee;Suite2LecturePE;Council PATH;Notes
1999;1;1;1999/83/EC: Council Decision of 18 January 1999 authorising the Kingdom of Denmark to apply or to continue to apply reductions in, or exemptions from, excise duties on certain mineral oils used for specific purposes, in accordance with the procedure provided for in Article 8(4) of Directive 92/81/EEC;U;;;31999D0083;1998;577;COM;NULL;CS;NULL;;;;Propos* are missing on Celex document
1999;1;2;1999/81/EC: Council Decision of 18 January 1999 authorising the Kingdom of Spain to apply a measure derogating from Articles 2 and 28a(1) of the Sixth Directive (77/388/EEC) on the harmonisation of the laws of the Member States relating to turnover taxes;U;;;31999D0081;1998;184;COM;NULL;CS;NULL;;;;Propos* are missing on Celex document

Recommended answer

To solve the problem, I have created a function which reads the first line of a file (header) and detects the delimiter.

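def detectDelimiter(csvFile):
    """Return the delimiter found in the header (first line) of csvFile."""
    with open(csvFile, 'r') as myCsvfile:
        header = myCsvfile.readline()
    # Test ";" first, so a comma inside a field of a
    # semicolon-separated header is not mistaken for the delimiter.
    if ";" in header:
        return ";"
    if "," in header:
        return ","
    # Default delimiter (MS Office export)
    return ";"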
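
The header check above returns ";" as soon as a single semicolon appears anywhere in the first line. As a hedged variation, not part of the original answer, the sketch below counts each candidate delimiter in the header and picks the most frequent one, then passes the result to csv.reader. The names detect_delimiter_by_count, read_rows and CANDIDATES are hypothetical, and the candidate list (";" and ",") is an assumption taken from the question.

```python
import csv

# Assumed candidate delimiters, taken from the two cases in the question.
CANDIDATES = (";", ",")


def detect_delimiter_by_count(csv_path, candidates=CANDIDATES, default=";"):
    """Return the candidate that occurs most often in the header line.

    Ties go to the candidate listed first; if no candidate appears at all,
    the default is returned (";" here, mirroring the original answer).
    """
    with open(csv_path, "r", newline="") as csv_file:
        header = csv_file.readline()
    counts = {d: header.count(d) for d in candidates}
    best = max(candidates, key=lambda d: counts[d])
    return best if counts[best] > 0 else default


def read_rows(csv_path):
    """Read all rows of the file using the delimiter detected from its header."""
    delimiter = detect_delimiter_by_count(csv_path)
    with open(csv_path, "r", newline="") as csv_file:
        return list(csv.reader(csv_file, delimiter=delimiter))
```

Calling read_rows with the path of either kind of file then replaces the manual switch between the two csv.reader lines in the question. Like the original function, this sketch looks only at the header, so it can still guess wrong when the header itself contains more stray candidate characters than real delimiters.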
