Return value of for() loop as if it were a function in R


Question

I have this for loop in an R script:

library(rvest)  # html_session(), html_nodes(), html_attr(), and the %>% pipe
library(httr)   # config()

url <- "https://example.com"
page <- html_session(url, config(ssl_verifypeer = FALSE))

# Collect the href of every link found in the table
links <- page %>%
  html_nodes("td") %>%
  html_nodes("tr") %>%
  html_nodes("a") %>%
  html_attr("href")

# File names to save under: the last path component of each link
base_names <- basename(links)

# Download each linked file into the working directory
for (i in seq_along(links)) {

  site <- html_session(URLencode(
    paste0("https://example.com", links[i])),
    config(ssl_verifypeer = FALSE))

  writeBin(site$response$content, base_names[i])
}

This loops through the links and downloads each text file to my working directory. I'm wondering if I can put a return somewhere, so that the loop returns the documents.

The reason is that I'm executing my script in NiFi (using ExecuteProcess), and it's not sending my scraped documents down the line. Instead, it just shows the head of my R script. I assume I would wrap the for loop in a fun <- function(x) {}, but I'm not sure how to integrate the x into an already working scraper.

I need it to return documents down the flow, and not just this:

Processor configuration:

Even if you're not familiar with NiFi, help with the R side would be greatly appreciated. Thanks!

Recommended answer

If your intent is to both (1) save the output (with writeBin) and (2) return the values (in a list), then try this:

out <- Map(function(ln, bn) {
  site <- html_session(URLencode(
    paste0("https://example.com", ln)),
    config(ssl_verifypeer = FALSE))
  writeBin(site$response$content, bn)
  site$response$content
}, links, base_names)
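
For comparison, a for loop in R evaluates to NULL, so there is nowhere inside it to put a return; the results have to be collected explicitly. The sketch below shows one way to do that with a pre-allocated list. To keep it runnable offline, fetch_content is a hypothetical stand-in for the html_session() download step, and the links are made-up values.

```r
# Hypothetical stand-in for the download step, so the sketch runs offline.
fetch_content <- function(link) charToRaw(paste0("contents of ", link))

# Made-up example links (assumptions, not taken from the real site).
links <- c("/files/a.txt", "/files/b.txt")
base_names <- basename(links)

# Pre-allocate the result list, then fill one slot per iteration.
out <- vector("list", length(links))
names(out) <- base_names
for (i in seq_along(links)) {
  out[[i]] <- fetch_content(links[i])
}

# The loop expression itself yields NULL, which is why the results
# must be stored in `out` rather than "returned" by the loop.
loop_value <- for (i in seq_along(links)) i
is.null(loop_value)
```

The Map version does the same collection without the index bookkeeping, which is the main reason to prefer it here.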

The use of Map "zips" together the individual elements. For the basic single-list case, the following are equivalent:

Map(myfunc, list1)
lapply(list1, myfunc)

But if you want to use same-index elements from multiple lists, you can do one of

lapply(seq_along(list1), function(i) myfunc(list1[[i]], list2[[i]], list3[[i]]))
Map(myfunc, list1, list2, list3)

Unrolled, the Map call is effectively:

myfunc(list1[[1]], list2[[1]], list3[[1]])
myfunc(list1[[2]], list2[[2]], list3[[2]])
# ...
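
As a small check of the zipping behaviour described above, the sketch below runs both forms on toy data. The lists and myfunc are made up for illustration; the only assumption is that myfunc takes one element from each list.

```r
# Toy inputs (illustrative values only).
list1 <- list(1, 2, 3)
list2 <- list(10, 20, 30)
list3 <- list(100, 200, 300)

# Illustrative function taking one element from each list.
myfunc <- function(a, b, c) a + b + c

# Map pairs up same-index elements across the three lists.
via_map <- Map(myfunc, list1, list2, list3)

# The index-based lapply does the same thing by hand.
via_lapply <- lapply(seq_along(list1), function(i) {
  myfunc(list1[[i]], list2[[i]], list3[[i]])
})

# Both give a list of the per-index sums: 111, 222, 333.
unlist(via_map)
unlist(via_lapply)
```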

The biggest difference between lapply and Map here is that lapply accepts only one vector, whereas Map accepts one or more (practically unlimited) and zips them together. All of the lists should be the same length or length 1 (which is recycled), so it's legitimate to do something like

Map(myfunc, list1, list2, "constant string")
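
A minimal sketch of that recycling, using made-up inputs: the length-1 string is reused for every pair of elements. The names files, sizes, and describe are illustrative only.

```r
# Two same-length lists plus one constant (all values are made up).
files <- list("a.txt", "b.txt")
sizes <- list(10, 20)

# Illustrative function: one element from each list, plus the constant.
describe <- function(name, size, unit) paste(name, size, unit)

# "bytes" has length 1, so it is recycled for each call.
res <- Map(describe, files, sizes, "bytes")
unlist(res)
```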

Note: Map versus mapply is similar to lapply versus sapply. In both pairs, the first always returns a list, while the second simplifies the result to a vector (or matrix) if and only if every return value has the same length/dimension; otherwise it too returns a list.
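
The difference in return types can be seen on toy vectors. This is only a sketch with made-up inputs a and b:

```r
a <- 1:3
b <- 4:6

# Map always returns a list, one element per pair.
as_list <- Map(function(x, y) x + y, a, b)

# mapply simplifies to a vector when every result has length 1.
as_vec <- mapply(function(x, y) x + y, a, b)

# When the results have different lengths, mapply falls back to a list.
ragged <- mapply(function(x, y) seq_len(x), a, b)

is.list(as_list)   # TRUE
is.list(as_vec)    # FALSE
is.list(ragged)    # TRUE
```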
