Reshaping EPA wind speed & direction data with dcast in R


Question

I am trying to convert long-format wind data into wide format. Both wind speed and wind direction are listed in the Parameter.Name column. These values need to be cast by both the Local.Site.Name and Date.Local variables.

If there are multiple observations per unique Local.Site.Name + Date.Local row, then I want the mean value of those observations. The built-in argument 'fun.aggregate = mean' works just fine for wind speed, but mean wind direction cannot be computed this way because the values are in degrees. For example, the arithmetic mean of two wind directions near North (350 and 10) comes out as South: (350 + 10)/2 = 180, even though the circular mean is 0 (equivalently 360).
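For reference, the 350/10 example can be worked through with the usual vector-average definition of a circular mean. This is a sketch of the standard textbook formula, written here for illustration; the symbols are not taken from the original post:

```latex
\bar{\theta}
  = \operatorname{atan2}\!\left(
      \frac{1}{n}\sum_{i=1}^{n}\sin\theta_i,\;
      \frac{1}{n}\sum_{i=1}^{n}\cos\theta_i
    \right)

% Worked example with \theta_1 = 350^\circ and \theta_2 = 10^\circ:
\frac{\sin 350^\circ + \sin 10^\circ}{2} = 0,
\qquad
\frac{\cos 350^\circ + \cos 10^\circ}{2} = \cos 10^\circ \approx 0.985,
\qquad
\bar{\theta} = \operatorname{atan2}(0,\ 0.985) = 0^\circ
```

The sine components cancel and the cosine components add, so the mean points North, unlike the arithmetic mean of 180.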

The 'circular' package will allow us to compute the mean wind direction without having to do the trigonometry by hand, but I am having trouble nesting this additional function within the 'fun.aggregate' argument. I thought a simple if / else if statement would do the trick, but I am running into the following error:

Error in vaggregate(.value = value, .group = overall, .fun = fun.aggregate, : could not find function ".fun"
In addition: Warning messages:
1: In if (wind$Parameter.Name == "Wind Direction - Resultant") { :
    the condition has length > 1 and only the first element will be used
2: In if (wind$Parameter.Name == "Wind Speed - Resultant") { :
    the condition has length > 1 and only the first element will be used     
3: In mean.default(wind$"Wind Speed - Resultant") :
    argument is not numeric or logical: returning NA

The goal is to use fun.aggregate = mean for Wind Speed, but mean(circular(Wind Direction, units = 'degrees')) for Wind Direction.

Here's the original data (>100MB): https://drive.google.com/open?id=0By6o_bZ8CGwuUUhGdk9ONTgtT0E

Here's a subset of the data (first 100 rows): https://drive.google.com/open?id=0By6o_bZ8CGwucVZGT0pBQlFzT2M

Here is my script:

library(reshape2)
library(dplyr)
library(circular)

#read in the long format data:
wind <- read.csv("<INSERT_FILE_PATH_HERE>", header = TRUE)

#cast into wide format:
wind.w <- dcast(wind, 
            Local.Site.Name + Date.Local ~ Parameter.Name,
            value.var = "Arithmetic.Mean", 
            fun.aggregate = (
              if (wind$Parameter.Name == "Wind Direction - Resultant") {
                mean(circular(wind$"Wind Direction - Resultant", units = 'degrees'))
              }
              else if (wind$Parameter.Name == "Wind Speed - Resultant") {
                mean(wind$"Wind Speed - Resultant")
              }),
            na.rm = TRUE)

Any help would be greatly appreciated!

-spacedSparking

EDIT: Here is the solution:

library(plyr)      # provides the .() helper used in subset= below; load before dplyr
library(reshape2)
library(SDMTools)
library(dplyr)
#read in the EPA wind data:
#This data is publicly accessible, and can be found here: https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/download_files.html    
wind <- read.csv("daily_WIND_2016.csv", sep = ',', header = TRUE, stringsAsFactors = FALSE)

#convert long format wind speed data by date and site id:
wind_speed <- dcast(wind, 
                    Local.Site.Name + Date.Local ~ Parameter.Name,
                    value.var = "Arithmetic.Mean",
                    fun.aggregate = function(x) {
                      mean(x, na.rm=TRUE)
                    },
                    subset = .(Parameter.Name == "Wind Speed - Resultant")
)

#convert long format wind direction data into wide format by date and local site id:
wind_direction <- dcast(wind, 
                        Local.Site.Name + Date.Local ~ Parameter.Name,
                        value.var = "Arithmetic.Mean",
                        fun.aggregate = function(x) {
                          if(length(x) > 0) 
                            circular.averaging(x, deg = TRUE)
                          else
                            -1
                        },
                        subset= .(Parameter.Name == "Wind Direction - Resultant")
)

#join the wide format split wind_speed and wind_direction dataframes
wind.w <- merge(wind_speed, wind_direction)


Recommended answer

You can use subset in dcast to apply the two functions separately, get two data frames, and then merge them:

library(plyr)      # provides the .() helper used in subset= below; load before dplyr
library(reshape2)
library(dplyr)
library(circular)

#cast into wide format:
wind_speed <- dcast(wind, 
                Local.Site.Name + Date.Local ~ Parameter.Name,
                value.var = "Arithmetic.Mean",
                fun.aggregate = function(x) {
                  mean(x, na.rm=TRUE)
                },
                subset=.(Parameter.Name == "Wind Speed - Resultant")
)

wind_direction <- dcast(wind, 
                    Local.Site.Name + Date.Local ~ Parameter.Name,
                    value.var = "Arithmetic.Mean",
                    fun.aggregate = function(x) {
                      if(length(x) > 0) 
                        mean(circular(c(x), units="degrees"), na.rm=TRUE)
                      else
                        -1
                    },
                    subset=.(Parameter.Name == "Wind Direction - Resultant")
)


wind.w <- merge(wind_speed, wind_direction)
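If neither circular nor SDMTools is at hand, the same per-group circular mean can be sketched in base R. This is only an illustration under stated assumptions: circ_mean_deg and the toy data frame are hypothetical names with made-up values, not part of the original post or the EPA file; only the column names are borrowed from the question.

```r
# Hypothetical helper (not from the original post): circular mean of
# directions given in degrees, returned in the range [0, 360).
circ_mean_deg <- function(deg) {
  rad <- deg * pi / 180
  # Average the unit vectors, then convert the resulting angle back to degrees.
  ang <- atan2(mean(sin(rad)), mean(cos(rad))) * 180 / pi
  (ang + 360) %% 360
}

# Made-up toy data that only borrows the column names from the question.
toy <- data.frame(
  Local.Site.Name = c("A", "A", "B", "B"),
  Date.Local      = c("2016-01-01", "2016-01-01", "2016-01-01", "2016-01-01"),
  Arithmetic.Mean = c(350, 10, 90, 180)
)

# One circular mean per site and date, using base R's aggregate().
toy_dir <- aggregate(Arithmetic.Mean ~ Local.Site.Name + Date.Local,
                     data = toy, FUN = circ_mean_deg)
print(toy_dir)
```

A helper like this could presumably be passed as fun.aggregate in the wind_direction call in place of the circular-based function, keeping the length(x) > 0 guard, though that combination has not been run against the EPA data here.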
