Plot a histogram of a subset of the data

Problem description

The image shows a screenshot of the data's .txt file. The data consists of 2,075,259 rows and 9 columns.

The file contains measurements of electric power consumption in one household, sampled once per minute over a period of almost 4 years. Different electrical quantities and some sub-metering values are available.

Only data from the dates 2007-02-01 and 2007-02-02 is needed. I was trying to plot a histogram of "Global_active_power" for those dates.

Note that in this dataset missing values are coded as "?".

This is the code I used to try to plot the histogram:

{

data <- read.table("household_power_consumption.txt", header=TRUE)
my_data <- data[data$Date %in% as.Date(c('01/02/2007', '02/02/2007'))]

my_data <- gsub(";", " ", my_data)    # replace ";" with " "  
my_data <- gsub("?", "NA", my_data)   # convert "?" to "NA"
my_data <- as.numeric(my_data)      # turn into numbers

hist(my_data["Global_active_power"])

}

After running the code, it shows this error:

Error in hist.default(my_data["Global_active_power"]) : invalid number of 'breaks'

Can you please help me spot the mistake in the code?

Link to the data file: https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip

Recommended answer

You need to provide the separator (";") explicitly, and your column types aren't what you think they are. Observe:

# Read with the correct separator and treat "?" as NA
data <- read.table("household_power_consumption.txt", header=TRUE, sep=';', na.strings='?')
data$Date <- as.Date(data$Date, format='%d/%m/%Y')
bottom.date <- as.Date('01/02/2007', format='%d/%m/%Y')
top.date <- as.Date('02/02/2007', format='%d/%m/%Y')
# Inclusive bounds keep both days; column 3 is Global_active_power
my_data <- data[data$Date >= bottom.date & data$Date <= top.date, 3]
hist(my_data)

This gives the histogram as the plot. Hope that helps.
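
For anyone who wants to try the approach without downloading the full file, here is a minimal sketch on a tiny inline sample. The five rows and the `sample_text`, `wanted`, and `power` names are made up for illustration; only the semicolon-separated layout, the day/month/year dates, and the "?" missing-value code are taken from the question. It also selects the column by name and passes a plain numeric vector to `hist()`, which avoids handing it a one-column data frame as in the original attempt.

```r
# Hypothetical five-row sample in the same layout as the real file.
sample_text <- "Date;Time;Global_active_power;Voltage
31/1/2007;23:59:00;1.250;241.10
1/2/2007;00:00:00;0.326;243.15
1/2/2007;00:01:00;?;242.90
2/2/2007;23:59:00;3.680;240.01
3/2/2007;00:00:00;3.614;240.29"

# sep = ";" splits the columns; na.strings = "?" turns "?" into NA,
# so Global_active_power is read as numeric rather than as text.
data <- read.table(text = sample_text, header = TRUE, sep = ";",
                   na.strings = "?", stringsAsFactors = FALSE)

# Convert the day/month/year strings to Date before comparing them.
data$Date <- as.Date(data$Date, format = "%d/%m/%Y")

# %in% keeps the rows for both days; the trailing comma selects rows.
wanted <- as.Date(c("2007-02-01", "2007-02-02"))
my_data <- data[data$Date %in% wanted, ]

# Pass a numeric vector (not a data frame) to hist(); NA values are dropped.
power <- my_data$Global_active_power
hist(power, main = "Global_active_power, 2007-02-01 to 2007-02-02",
     xlab = "Global_active_power")
```

Swapping `text = sample_text` for the real file name should give the same workflow on the full dataset, assuming the file keeps this layout.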
