R - Cumulative storage of looped cbind() results and possible lapply solution to double for-loop


Problem description

    I've found a workaround for a question I posted earlier, based on @Ryan's recommendation. This is the code:

    for (i in seq_along(url)){
    
      webpage <- read_html(url[i]) #loop through URL list to access html data
    
      fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
      fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
      fac_data <- c(fac_data, fac_data1) #Store table data on each URL in a variable 
    
      x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data
    
      for (j in seq_along(headers[[i]])){
        y <- cbind(x[,j]) #extract column data and store in temporary variable
        colnames(y) <- as.character(headers[[i]][j]) #add column name
        print(cbind(y)) #loop through headers list to print column data in sequence. ** cbind(y) will be overwritten when I try to store the result on a list with 'z <- cbind(y)'.
      }
    }
    

    I am now able to print out all values, complete with headers of the data in question.


    I have two follow-up questions:

    1. How do I save the output of cbind(y) cumulatively in a data.frame or a list? Looping through cbind(y) will overwrite values, which leaves me with only the last column from the last table. Like this:

      退休年月

      [1,] "82年8月"

    Neither do these variations work:

    z[[x]][j] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    Error in `*tmp*`[[x]] : 最多只能選擇一個元素
    
    z[j] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    There were 13 warnings (use warnings() to see them)
    
    z[[j]] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    Error in z[[j]] <- cbind(y) : 用來替換的元素比所要替換的值多
    

    2. Can the double for-loop be replaced by a simple lapply() function to resolve the above issue?
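
For the first question, a minimal sketch of cumulative storage inside the double loop. The error messages above translate literally to "at most one element can be selected" and "more replacement elements than values to replace". A likely cause, assuming `z` was never created as a list, is that a whole matrix was being pushed into a single slot of an atomic vector, and that `z[[x]]` used the data matrix `x` as an index. The sketch avoids scraping: `pages` and `headers` are made-up stand-ins for the text returned by `html_text()` and for the real header list.

```r
# Hypothetical stand-ins for the scraped text of two pages and their headers.
pages <- list(
  c("A", "1990", "B", "1995"),
  c("C", "Math", "2001", "D", "Physics", "2003")
)
headers <- list(c("name", "year"), c("name", "dept", "year"))

z <- vector("list", length(pages))        # one slot per page
for (i in seq_along(pages)) {
  x <- matrix(pages[[i]], ncol = length(headers[[i]]), byrow = TRUE)
  z[[i]] <- vector("list", ncol(x))       # one slot per column of this page
  for (j in seq_along(headers[[i]])) {
    y <- x[, j, drop = FALSE]             # keep y as a one-column matrix
    colnames(y) <- as.character(headers[[i]][j])
    z[[i]][[j]] <- y                      # indexed by page and column, so nothing is overwritten
  }
}
str(z)
```

Because every result is written to its own `z[[i]][[j]]` slot, the last column of the last table no longer replaces everything before it.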

    EDIT:

    Here's the final code I used to solve this:

    for (i in seq_along(url)){
    
      webpage <- read_html(url[i])
    
      fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
      fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
      fac_data <- c(fac_data, fac_data1)
    
      x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data
      y <- cbind(x[,1:length(headers[[i]])]) #extract column data
      colnames(y) <- as.character(headers[[i]]) #add column names
      ntu.hist[[i]] <- y #Cumulate results on a list.
    
    }
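
On the lapply() question, the final loop above could plausibly be written as a single `lapply()` call, since each iteration only depends on `i`. This is a sketch under assumptions, not the code that was actually run: `pages` and `headers` are invented stand-ins for the scraped text and the real header list, and `build_table` is a hypothetical helper name.

```r
# Hypothetical stand-ins for the text returned by html_text() on each page.
pages <- list(
  c("A", "1990", "B", "1995"),
  c("C", "Math", "2001", "D", "Physics", "2003")
)
headers <- list(c("name", "year"), c("name", "dept", "year"))

# Build one named matrix per page; no inner loop over columns is needed
# because the matrix already holds every column.
build_table <- function(i) {
  x <- matrix(pages[[i]], ncol = length(headers[[i]]), byrow = TRUE)
  colnames(x) <- as.character(headers[[i]])
  x
}

ntu.hist <- lapply(seq_along(pages), build_table)  # list with one table per page
ntu.hist[[2]]
```

`lapply()` returns the list directly, so there is no need to create `ntu.hist` beforehand or to assign into it inside a loop.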
    

    Solution

    Would it be an option to cbind multiple columns at once instead of looping over them one at a time? Would any of these syntax options help?

    y <- data.frame(col1=c(1:3),col2=c(4:6),col3=c(7:9))
    
    cbind(y[,c(1:3)])
    
      col1 col2 col3
    1    1    4    7
    2    2    5    8
    3    3    6    9
    
    #In R, you can use ":" to specify a range. So 1,2,3,4 is equal to 1:4.
    #If you don't want number 3 in that range, you can use c(1,2,4).
    
    #For example:
    
    cbind(y[,c(1,3)])
    
      col1 col3
    1    1    7
    2    2    8
    3    3    9
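
The column selections above can also be written by name or by exclusion, and a list of separate columns can be bound in one call. The following is a small sketch on the same toy data frame; `cols` is a made-up list used only for illustration.

```r
y <- data.frame(col1 = 1:3, col2 = 4:6, col3 = 7:9)

# Three ways to pick the same two columns without a loop.
by_position  <- y[, c(1, 3)]
by_name      <- y[, c("col1", "col3")]
by_exclusion <- y[, -2]

# drop = FALSE keeps a single selected column as a data frame.
one_col <- y[, "col2", drop = FALSE]

# Bind every element of a list of columns in one call.
cols <- list(a = 1:3, b = 4:6)   # hypothetical list of equal-length columns
m <- do.call(cbind, cols)        # matrix whose column names come from the list names
m
```

`do.call(cbind, cols)` is useful when the number of columns is not known in advance, which is the situation the inner loop in the question was handling.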
    
