Converting data.frame from character to numeric in R to use in Time Series function


Problem description

I am currently using R (3.2.1) and am having trouble converting my dataset to numeric so that I can plot a time series graph.

I read in a data table extracted from an HTML page source and stored it in my global environment. I can't convert my data.frame from character to numeric. Here is the head of my data:

> head(World)
    World  
V3 "5,689"
V4 "4,672"
V5 "4,344"
V6 "3,745"
V7 "4,246"
V8 "4,823"

This is the structure of my data:

> str(World)
 'data.frame':  108 obs. of  1 variable:
 $ World: chr  "1,234" "1,234" "1,234" "4,321" ...

I would like to convert this data to a time series. However,

ts(as.data.frame(sapply(World, function(x) gsub("\"", "", x))))

gives me integer values in place of the character values, such as

Time Series:
Start = 1 
End = 6 
Frequency = 1 
     World
[1,]    49
[2,]    41
[3,]    37
[4,]    32
[5,]    36
[6,]    43

I tried

 as.numeric(as.character(World[,1]))

but it gave me NA values with the warning message: NAs introduced by coercion.

I can see the values of World without quotes, etc. However, when I use it as a time series, the values change.

I would like my final result to be:

Time Series:
Start = 1 
End = 6 
Frequency = 1 
     World
[1,]    5,689
[2,]    4,672
[3,]    4,333
[4,]    3,745
[5,]    4,246
[6,]    4,823

I would appreciate any help given.

Thanks

Recommended answer

The warning message appears because your "numbers" have commas in them. Remove the commas (or convert them to periods, if they're supposed to be decimal separators) and the conversion to numeric will work.
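To make the cause concrete, here is a minimal sketch using three of the values from the question. The vector name `x` and the variables `bad` and `good` are only illustrative and do not appear in the original post; the warning is suppressed so the snippet runs quietly.

```r
# Three of the comma-formatted strings shown in the question.
x <- c("5,689", "4,672", "4,344")

# Direct coercion cannot parse the commas, so every element becomes NA.
# (Without suppressWarnings() R reports "NAs introduced by coercion".)
bad <- suppressWarnings(as.numeric(x))

# Strip the thousands separators first, then coerce.
good <- as.numeric(gsub(",", "", x, fixed = TRUE))

print(bad)
print(good)
```

Using `fixed = TRUE` treats the comma as a literal character rather than a regular expression; for a plain comma either form should behave the same.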

Also, your World object doesn't appear to be a data.frame, because data.frames don't print character vectors with quotes. More likely, it's a matrix.

R> # if the comma is a thousands separator
R> ts(as.matrix(as.numeric(gsub(",", "", World[,1]))))
Time Series:
Start = 1 
End = 6 
Frequency = 1 
     Series 1
[1,]     5689
[2,]     4672
[3,]     4344
[4,]     3745
[5,]     4246
[6,]     4823
R> # if the comma is a decimal separator
R> ts(as.matrix(as.numeric(gsub(",", ".", World[,1]))))
Time Series:
Start = 1 
End = 6 
Frequency = 1 
     Series 1
[1,]    5.689
[2,]    4.672
[3,]    4.344
[4,]    3.745
[5,]    4.246
[6,]    4.823
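As a self-contained sketch of the same approach, the following rebuilds the six values from `head(World)` as a one-column data.frame and converts them. The construction of `World` here is an assumption for illustration, since the original object came from an HTML page. The second half shows a likely reason the attempt in the question printed small integers: in R versions before 4.0.0, `as.data.frame()` turned character columns into factors by default, and `ts()` then stored the factor's integer level codes. The sketch forces that behaviour with `stringsAsFactors = TRUE` so it runs the same way on current R.

```r
# Hypothetical stand-in for the scraped data: the six values from head(World).
World <- data.frame(
  World = c("5,689", "4,672", "4,344", "3,745", "4,246", "4,823"),
  stringsAsFactors = FALSE
)

# Remove the thousands separators, coerce to numeric, then build the series.
values <- as.numeric(gsub(",", "", World[, 1], fixed = TRUE))
world_ts <- ts(values)
print(world_ts)

# Reproduce the question's attempt: the column becomes a factor, and ts()
# keeps only the integer level codes, not the original numbers.
as_factor <- as.data.frame(sapply(World, function(x) gsub("\"", "", x)),
                           stringsAsFactors = TRUE)
codes <- as.integer(ts(as_factor))
print(codes)
```

`World[, 1]` returns a plain character vector whether `World` is a one-column data.frame or a one-column matrix, so the conversion should work in either case. Note that a numeric time series cannot keep the comma formatting shown in the desired output; if commas are wanted for display only, `format(values, big.mark = ",")` produces formatted strings from the numeric values.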
