What are R levels?
Problem description
I am trying to read a CSV file with R. I can read the file, but I get levels when I print a variable. What are these levels, and how can I remove them?
The file can be downloaded here.
> data=read.csv("Documents/bet/I1.csv",sep=",")
> data$HomeTeam
[1] Sampdoria Verona Cagliari Inter Lazio Livorno Napoli Parma
[9] Torino Fiorentina Chievo Juventus Atalanta Bologna Catania Genoa
[17] Milan Roma Sassuolo Udinese Inter Napoli Torino Fiorentina
[25] Lazio Livorno Sampdoria Udinese Verona Parma Cagliari Chievo
[33] Genoa Atalanta Bologna Catania Juventus Milan Roma Sassuolo
[41] Udinese Bologna Chievo Lazio Livorno Napoli Parma Sampdoria
[49] Torino Inter Genoa Milan Atalanta Cagliari Catania Roma
[57] Sassuolo Torino Verona Fiorentina Bologna Catania Napoli Parma
[65] Sampdoria Udinese Juventus Lazio Chievo Inter Roma Cagliari
[73] Milan Atalanta Fiorentina Genoa Livorno Sassuolo Verona Torino
[81] Inter Sampdoria Bologna Catania Chievo Juventus Lazio Napoli
[89] Parma Udinese Atalanta Cagliari Fiorentina Genoa Juventus Livorno
[97] Milan Sassuolo Verona Roma Milan Napoli Parma Lazio
[105] Livorno Sampdoria Torino Udinese Verona Bologna Catania Inter
[113] Atalanta Cagliari Chievo Genoa Parma Roma Fiorentina Juventus
[121] Milan Napoli Verona Bologna Livorno Sampdoria Sassuolo Torino
[129] Udinese Roma
20 Levels: Atalanta Bologna Cagliari Catania Chievo Fiorentina Genoa Inter Juventus ... Verona
Solution
When you read a file with read.csv (see ?read.csv), the argument stringsAsFactors is set to TRUE by default in R versions before 4.0.0, so string columns are converted to factors. You just need to set it to FALSE to avoid this result. This should work:
data = read.csv("Documents/bet/I1.csv", sep=",", stringsAsFactors=FALSE)
Under that default, columns (variables) in the file that contain strings are assumed to be factors. A factor is a categorical variable that can take only one of a fixed, finite set of values. Those possible categories are the levels. The R Intro manual has a section on factors.
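To make the relationship between a factor and its levels concrete, here is a minimal sketch. The vector `teams` and the variable names are made up for illustration (only the team names themselves appear in the question's output); `factor`, `levels`, `nlevels`, and `as.character` are base R functions.

```r
# Hypothetical sample: a handful of team names taken from the question's output.
teams <- c("Sampdoria", "Verona", "Cagliari", "Inter", "Verona", "Inter")

# factor() stores each distinct value once as a level.
home_team <- factor(teams)
levels(home_team)   # the distinct categories
nlevels(home_team)  # how many distinct categories there are

# as.character() drops the factor structure and returns plain strings,
# which is one way to "remove" the levels from an existing column.
home_team_chr <- as.character(home_team)
is.factor(home_team_chr)
```

If the data frame has already been read with factors, the same conversion can be applied to a single column, for example `data$HomeTeam <- as.character(data$HomeTeam)`.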
In addition, since you are using read.csv, adding sep="," is redundant. It doesn't harm anything, but the comma is taken as the separator by default.
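Both points can be checked without the original file. The sketch below uses a small inline CSV string as a stand-in for I1.csv; its contents and the `AwayTeam` column are assumptions for illustration, and only the `HomeTeam` column name comes from the question. It passes stringsAsFactors explicitly so the behavior does not depend on the installed R version.

```r
# Hypothetical inline CSV standing in for I1.csv (contents are made up).
csv_text <- "HomeTeam,AwayTeam\nSampdoria,Juventus\nVerona,Milan\nInter,Genoa"

# read.csv already uses "," as the separator, so sep = "," changes nothing.
with_sep    <- read.csv(text = csv_text, sep = ",", stringsAsFactors = FALSE)
without_sep <- read.csv(text = csv_text, stringsAsFactors = FALSE)
identical(with_sep, without_sep)

# With stringsAsFactors = FALSE the column is a plain character vector.
class(without_sep$HomeTeam)

# With stringsAsFactors = TRUE the same column becomes a factor with levels.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(as_factor$HomeTeam)
levels(as_factor$HomeTeam)
```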