numpy genfromtxt or numpy loadtxt: "ValueError: could not convert string to float" or "too many values to unpack", tried almost everything
Problem description
I have a very frustrating problem that I have been trying to solve for many hours now. I have gone through all the relevant questions and answers I could find on Google.
What I would like to do: I have big datasets of photometry data (CSV files with 50-70k rows) that I would like to load, work with as floats, and eventually plot along with some fits and calculations.
Example of the data:
Time(s),AnalogIn-1,AnalogIn-2
0.00E+00,3.96E-02,3.33E-02
0.00E+00,4.10E-02,3.33E-02
So each column holds many numbers in scientific notation.
In my code I first used the following to load the text:
time, dat1, dat2= np.loadtxt(path, skiprows=1, unpack=True, delimiter=",")
It keeps raising "ValueError: could not convert string to float:".
It works fine if I first open the file in, for example, Excel and convert the whole CSV sheet from 'General' to 'Number'.
I tried literally everything discussed here, starting with skipping the header and the first rows, with np.loadtxt, np.genfromtxt, and the pandas loader. I also tried changing the dtypes, setting converters, and re-mapping whatever was loaded to floats. This helped, but only for certain rows: the error soon reappeared at random rows, or turned back into "too many values to unpack". I also tried skipping blanks and NaNs. I suspect the problem is still somewhere in the conversion, namely that the scientific notation is treated as a string because it contains "E", "+" and "-" characters in "random" order. I still believe I'm missing some very simple solution, as my CSV is really standard data.
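One way to narrow down errors that appear at "random" rows is to check each row by hand before giving it to NumPy. The following is only a diagnostic sketch: the helper name `find_bad_rows` and the inline sample rows are hypothetical, and it assumes a single header row and three comma-separated fields. It also shows that Python's built-in `float` accepts strings such as "3.96E-02", so scientific notation by itself should not trigger the conversion error.

```python
def find_bad_rows(lines, n_fields=3, delimiter=","):
    """Return (line_number, reason) for every data row that would not parse."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        if line_no == 1:  # assumption: the first line is the header
            continue
        fields = line.strip().split(delimiter)
        if len(fields) != n_fields:
            problems.append((line_no, f"expected {n_fields} fields, got {len(fields)}"))
            continue
        for field in fields:
            try:
                float(field)  # scientific notation such as 3.96E-02 is accepted
            except ValueError:
                problems.append((line_no, f"not a float: {field!r}"))
                break
    return problems


# Hypothetical sample: two deliberately broken rows after one good row.
sample = [
    "Time(s),AnalogIn-1,AnalogIn-2",
    "0.00E+00,3.96E-02,3.33E-02",
    "0.00E+00,4.10E-02",      # a field is missing
    "0.00E+00,abc,3.33E-02",  # a non-numeric field
]
print(find_bad_rows(sample))
```

For a real file, the lines of an opened file object could be passed in instead of `sample`. If every data row is reported, the problem is more likely in how the file is split into lines than in individual values.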
Recommended answer
This is really just a long comment, but if it identifies the problem, it might count as an answer.
With the CSV file that you linked to in a comment, I ran
time, dat1, dat2 = np.loadtxt("data1.csv", skiprows=1, unpack=True, delimiter=",")
and got no error.
When I inspected the file, I noticed that each line ends with a single carriage return character (often abbreviated CR, hex code 0d). You mentioned using Excel, so I assume you are on Windows. The usual line ending on Windows is CR+LF (two characters: a carriage return followed by a line feed, hex 0d0a).
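To check which line endings a file actually contains, the raw bytes can be counted directly. This is a minimal sketch: the helper name `count_line_endings` is hypothetical, and the inline `sample` bytes stand in for the real file, mimicking the CR-only endings described above.

```python
from collections import Counter


def count_line_endings(data):
    """Count CRLF, lone CR, and lone LF line terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    counts = Counter()
    counts["CRLF"] = crlf
    counts["CR"] = data.count(b"\r") - crlf  # carriage returns not followed by LF
    counts["LF"] = data.count(b"\n") - crlf  # line feeds not preceded by CR
    return counts


# Hypothetical sample with CR-only line endings.
sample = (
    b"Time(s),AnalogIn-1,AnalogIn-2\r"
    b"0.00E+00,3.96E-02,3.33E-02\r"
    b"0.00E+00,4.10E-02,3.33E-02\r"
)
print(count_line_endings(sample))
```

For a file on disk, the bytes would come from opening it in binary mode, for example `open("data1.csv", "rb").read()`, so that Python performs no newline translation before the count.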
That might be the problem (although I would have expected Python's file I/O to take care of it). I don't have a Windows system to test this on, so for now all I can say is "try this":
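import numpy as np

# newline='\r' tells open() to treat a lone carriage return as the line terminator
with open('data1.csv', 'r', newline='\r') as f:
    time, dat1, dat2 = np.loadtxt(f, skiprows=1, unpack=True, delimiter=",")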
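If setting `newline` does not help, another option is to split the text into lines yourself. `str.splitlines()` treats CR, LF, and CR+LF all as line boundaries, so the result does not depend on how the platform handles newlines. This is a sketch under assumptions: the helper name `split_any_newline` is hypothetical, the file is assumed to be UTF-8 compatible, and the inline bytes stand in for the real file.

```python
def split_any_newline(raw, encoding="utf-8"):
    """Decode raw bytes and split them into lines on CR, LF, or CRLF."""
    return raw.decode(encoding).splitlines()


# Hypothetical sample with CR-only line endings, standing in for the real file.
raw = (
    b"Time(s),AnalogIn-1,AnalogIn-2\r"
    b"0.00E+00,3.96E-02,3.33E-02\r"
    b"0.00E+00,4.10E-02,3.33E-02\r"
)
lines = split_any_newline(raw)

# Skip the header and convert every field with the built-in float().
rows = [[float(x) for x in line.split(",")] for line in lines[1:]]
print(rows)
```

`np.loadtxt` also accepts a list of strings in place of a file name, so `lines` could be passed to it directly with the same `skiprows=1, unpack=True, delimiter=","` arguments.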