Dynamic R dataframes - change yes/no responses to 1/0

Problem description

I use an API call to LimeSurvey to get data into a Shiny R app I'm working on. I then manipulate the dataframe so that I have only the responses given by a certain individual over time. The dataframe can look like this:

Appetite <- c("No","Yes","No","No","No","No","No","No","No")
Dental.Health <- c("No","Yes","No","No","No","No","Yes","Yes","No")
Dry.mouth <- c("No","Yes","Yes","Yes","Yes","No","Yes","Yes","No")
Mouth.opening <- c("No","No","Yes","Yes","Yes","No","Yes","Yes","No")
Pain.elsewhere <- c("No","Yes","No","No","No","No","No","No","No")
Sleeping <- c("No","No","No","No","No","Yes","No","No","No")
Sore.mouth <- c("No","No","Yes","Yes","No","No","No","No","No")
Swallowing <- c("No","No","No","No","Yes","No","No","No","No")
Cancer.treatment <- c("No","No","Yes","Yes","No","Yes","No","No","No")
Support.for.my.family <- c("No","No","Yes","Yes","No","No","No","No","No")
Fear.of.cancer.coming.back <- c("No","No","Yes","Yes","No","No","Yes","No","No")
Intimacy  <- c("Yes","No","No","No","No","No","No","No","No")
Dentist   <- c("No","Yes","No","No","No","No","No","No","No")
Dietician <- c("No","No","Yes","Yes","No","No","No","No","No")
Date.submitted <- c("2002-07-25 00:00:00",
                 "2002-09-05 00:00:00",
                 "2003-01-09 00:00:00",
                 "2003-01-09 00:00:00",
                 "2003-07-17 00:00:00",
                 "2003-11-06 00:00:00",
                 "2004-12-17 00:00:00",
                 "2005-06-03 00:00:00",
                 "2005-12-17 00:00:00")

theDataFrame <- data.frame( Date.submitted,
                            Appetite,
                            Dental.Health,
                            Dry.mouth,
                            Mouth.opening,
                            Pain.elsewhere,
                            Sleeping,
                            Sore.mouth,
                            Swallowing,
                            Cancer.treatment,
                            Support.for.my.family,
                            Fear.of.cancer.coming.back,
                            Intimacy,
                            Dentist,
                            Dietician)

To be clear, this dataframe could contain more (or fewer) observations of more (or fewer) variables than the example above.

My goal is to make a dynamic histogram that looks like the following:

library(dplyr)
library(ggplot2)
library(tidyr)

df <- data.frame(timeline = Sys.Date() - 1:10,
                 q3 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q4 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q5 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q6 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q7 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q8 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 stringsAsFactors = FALSE) %>%
    mutate(q3 = ifelse(q3 == "Yes", 1, 0),
           q4 = ifelse(q4 == "Yes", 1, 0),
           q5 = ifelse(q5 == "Yes", 1, 0),
           q6 = ifelse(q6 == "Yes", 1, 0),
           q7 = ifelse(q7 == "Yes", 1, 0),
           q8 = ifelse(q8 == "Yes", 1, 0)) %>%
    gather(key = question, value = value, q3, q4, q5, q6, q7, q8)

g <- ggplot(df, aes(x = timeline, y = value, fill = question)) +
    geom_bar(stat = "identity")

g 

I think I will need to use library(lubridate) for the timeline, as the entire dataframe is plain text. I deal with the '.' in the column names like this:

myColNames <- colnames(theDataFrame)

myNames <- myColNames

myNames <- gsub("^X\\.\\.", "", myNames)
myNames <- gsub("\\.", " ", myNames)
names(theDataFrame) <- myNames # items in myChoices get "labels" from myNames

But the most challenging aspect is getting this to work dynamically. The datasets will only contain Date.submitted and some number (x) of additional columns whose values are only "Yes" or "No".

I hope I've given enough information (this is my first question on Stack Exchange!)

Solution

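We can update it using base R. Comparing every column except the first with "Yes" gives a logical matrix, and the unary + coerces it to 1/0: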

theDataFrame[-1] <- +(theDataFrame[-1]=="Yes")
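To see what that one-liner does step by step, here is a minimal sketch on a two-row toy data frame. The `toy` data and the intermediate `is_yes` variable are made up for illustration and are not part of the original answer.

```r
# Toy data frame (hypothetical, mirrors the layout used in the question).
toy <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00"),
  Appetite       = c("No", "Yes"),
  Sleeping       = c("Yes", "No"),
  stringsAsFactors = FALSE
)

# Comparing the answer columns with "Yes" yields a logical matrix.
is_yes <- toy[-1] == "Yes"

# Unary plus turns TRUE/FALSE into 1/0 and keeps the matrix shape;
# assigning it back replaces every column except the first.
toy[-1] <- +is_yes

str(toy)
```

Because the columns are selected with `[-1]`, this relies on the date column being first, as it is in the question's data frame.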

Or with lapply, which works column by column and may be preferable when the dataset is big:

theDataFrame[-1] <- lapply(theDataFrame[-1], function(x) as.integer(x=="Yes"))
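For the dynamic part of the question, one possible sketch wraps the lapply conversion in a function that works for any number of yes/no columns and returns the data in long form, ready for plotting. The helper name `yes_no_to_long`, its `date_col` argument, the UTC time zone, and the toy data are all assumptions for illustration. It uses only base R (`as.POSIXct` and `stack`) rather than lubridate or tidyr.

```r
# Hypothetical helper (not from the original answer): converts every column
# except the date column, however many question columns there are.
yes_no_to_long <- function(df, date_col = "Date.submitted") {
  answers <- setdiff(names(df), date_col)

  # Column-wise conversion of "Yes"/"No" to 1/0.
  df[answers] <- lapply(df[answers], function(x) as.integer(x == "Yes"))

  # Parse the plain-text timestamps; UTC is an assumption here.
  timeline <- as.POSIXct(df[[date_col]],
                         format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

  # stack() puts the answer columns one under another, column by column.
  long <- stack(df[answers])
  data.frame(
    timeline = rep(timeline, times = length(answers)),
    question = as.character(long$ind),
    value    = long$values,
    stringsAsFactors = FALSE
  )
}

# Toy input (made up) with two question columns.
toy <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00"),
  Appetite       = c("No", "Yes"),
  Sleeping       = c("Yes", "No"),
  stringsAsFactors = FALSE
)
long <- yes_no_to_long(toy)
print(long)
```

The resulting `long` data frame has the same timeline/question/value layout as the `df` built in the question, so it should be usable with the same call: ggplot(long, aes(x = timeline, y = value, fill = question)) + geom_bar(stat = "identity").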
