read.table with comma separated values and also commas inside each element
Question
I'm trying to create a table from a comma-separated CSV file. I'm aware that not all rows have the same number of elements, so I will write some code to eliminate those rows. The problem is that some rows include numbers (in thousands) that contain a comma themselves. I can't split those rows properly. Here's my code:
pURL <- "http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t=EI&region=FRA&order=asc"
res <- read.table(pURL, header=T, sep='\t', dec = '.', stringsAsFactors=F)
x <- unlist( lapply(keyRatios, function(u) strsplit(u,split='\n')) [[1]] )
Recommended answer
You need to use the quote argument of read.table or read.delim.
res <- read.delim( pURL, header=F, sep=',', dec = '.', stringsAsFactors=F , quote = "\"" , fill = TRUE , skip = 2 )
The separator is "," not "\t". Numbers written as thousands of millions are always quoted in this file, so you can use the quote argument to make R ignore the commas inside the quotes with quote = "\"". You also want to skip the first two lines, and use fill = TRUE to fill in blanks on uneven lines.
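The effect of these arguments can be checked offline on a small inline sample. The following is a minimal sketch, assuming made-up data that only mimics the layout of the file (two leading lines to skip, quoted thousands, one short row). The names sample_csv and demo and the two leading text lines are hypothetical.

```r
# Hypothetical sample mimicking the layout of the exported file:
# two leading lines to skip, a header row, then data rows.
sample_csv <- paste(
  "Hypothetical title line",
  "Hypothetical subtitle line",
  ",2011-12,2012-12,TTM",
  "Revenue EUR Mil,\"4,190\",\"4,989\",\"5,034\"",
  "Gross Margin %,55.4,55.8",   # deliberately one field short
  sep = "\n"
)

demo <- read.delim(
  text = sample_csv,        # read from the string instead of a URL
  header = FALSE,
  sep = ",",
  dec = ".",
  stringsAsFactors = FALSE,
  quote = "\"",             # commas inside double quotes are not separators
  fill = TRUE,              # pad short rows instead of failing
  skip = 2                  # drop the two leading lines
)
print(demo)
```

With quote = "\"" the quoted value "4,190" should arrive as a single field, and fill = TRUE lets the short last row be read without an error.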
head( res )
#                           2003-12 2004-12 2005-12 2006-12 2007-12 2008-12 2009-12 2010-12 2011-12 2012-12     TTM
#2          Revenue EUR Mil   2,116   2,260   2,424   2,690   2,908   3,074   3,268   3,892   4,190   4,989   5,034
#3           Gross Margin %    60.6    60.3    57.3    58.2    57.6    56.9    56.1    55.5    55.4    55.8    56.1
#4 Operating Income EUR Mil     365     404     394     460     505     515     555     618     683     832     841
#5       Operating Margin %    17.2    17.9    16.2    17.1    17.4    16.7    17.0    15.9    16.3    16.7    16.7
#6       Net Income EUR Mil     200     227     289     331     371     389     402     472     518     584     594
#7   Earnings Per Share EUR    3.90    4.30    5.44    6.22    3.48    3.62    3.78    4.36    4.82    2.77    2.80
I set the column names of res afterwards like this:
names( res ) <- res[1,]; res <- res[-1,]
This gives nicer formatting.
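After the first row is promoted to column names, the values are still character strings such as "4,190". If numeric columns are needed, one possible follow-up is to strip the thousands separators before converting. This is a sketch rather than part of the original answer: the data frame tbl, the helper to_number, and the label "Metric" for the unnamed first column are hypothetical, and the few values are copied from the output shown earlier.

```r
# Hypothetical stand-in for `res` right after reading:
# the first row still holds the headers, all columns are character.
tbl <- data.frame(
  V1 = c("", "Revenue EUR Mil", "Gross Margin %"),
  V2 = c("2011-12", "4,190", "55.4"),
  V3 = c("2012-12", "4,989", "55.8"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names, then drop it.
hdr <- unlist(tbl[1, ], use.names = FALSE)
hdr[1] <- "Metric"          # assumed label for the unnamed first column
names(tbl) <- hdr
tbl <- tbl[-1, ]

# Hypothetical helper: remove thousands separators, then convert to numeric.
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))

# Convert every column except the first one (the row labels).
tbl[-1] <- lapply(tbl[-1], to_number)
print(tbl)
```

Using unlist() on the header row makes the conversion to a plain character vector explicit before it is assigned to names().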