R - Extracting elements from a list according to a condition in `purrr`


Problem description

I have the following list of HTML inputs. The list has a nested structure -

  1. Level 1 contains the names of the inputs (e.g. input1).
  2. Level 2 contains some information about each input - name, attribs, children
  3. Level 3 branches off children, which is a list of length 2 - the first element contains information about the input's label and the second contains information about the type of input. Since I need the input labels, I need to extract the first element of this list for each input.

The list:

library(purrr)

inputs = list(
  input1 = list(
    name = 'div', 
    attribs = list(class = 'form-group'), 
    children = list(list(name = 'label', 
                         attribs = list(`for` = 'email'), 
                         children = list('Email')), 
                    list(
                      list(name = 'input', 
                           attribs = list(id = 'email', type = 'text'), 
                           children = list()))
                    )))

str(inputs)
List of 1
 $ input1:List of 3
  ..$ name    : chr "div"
  ..$ attribs :List of 1
  .. ..$ class: chr "form-group"
  ..$ children:List of 2
  .. ..$ :List of 3
  .. .. ..$ name    : chr "label"
  .. .. ..$ attribs :List of 1
  .. .. .. ..$ for: chr "email"
  .. .. ..$ children:List of 1
  .. .. .. ..$ : chr "Email"
  .. ..$ :List of 1
  .. .. ..$ :List of 3
  .. .. .. ..$ name    : chr "input"
  .. .. .. ..$ attribs :List of 2
  .. .. .. .. ..$ id  : chr "email"
  .. .. .. .. ..$ type: chr "text"
  .. .. .. ..$ children: list()

I am able to do this using keep() and has_element():

label = inputs %>% 
  map_depth(2, ~keep(., ~has_element(., 'label'))) %>%
  map('children') %>%
  flatten %>% 
  map('children') %>%
  flatten

Output:

str(label)
List of 1
 $ input1: chr "Email"

When I was looking through the purrr help pages, keep seemed to be the function I was after but I still had to use map and flatten twice to get to the label, which seems clumsy. So I was wondering if there is a more direct way to achieve the same output? I am not so much interested in the solution as I am in the thought process behind working with nested lists like these.

Recommended answer

If every input has the same structure, then you don't need keep, which is used to remove list elements that don't meet some condition. Instead, you can just map through with pluck like this. Of course, this method removes all the other data relevant to each input. You may want to do something different if the end goal is "rectangling", i.e. getting all the information for each input in a flat structure.

library(purrr)

inputs = list(
  input1 = list(
    name = 'div', 
    attribs = list(class = 'form-group'), 
    children = list(
      list(
        name = 'label', 
        attribs = list(`for` = 'email'), 
        children = list('Email')
      ), 
      list(
        list(
          name = 'input', 
          attribs = list(id = 'email', type = 'text'), 
          children = list()
        )
      )
    )
  )
)

inputs %>%
  map(~ pluck(., "children", 1, "name"))
#> $input1
#> [1] "label"
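
Note that the path above ends at the tag name ("label"), whereas the question asks for the label text ("Email"). A minimal sketch of one way to reach the text, assuming every input has the same shape as input1: extend the path by one more step into the label's own children. The variable names label_text and label_chr are only illustrative.

```r
library(purrr)

# Same structure as in the question, written compactly.
inputs <- list(
  input1 = list(
    name = "div",
    attribs = list(class = "form-group"),
    children = list(
      list(name = "label",
           attribs = list(`for` = "email"),
           children = list("Email")),
      list(list(name = "input",
                attribs = list(id = "email", type = "text"),
                children = list()))
    )
  )
)

# Path: children -> 1st child (the label node) -> its children -> 1st element.
label_text <- map(inputs, ~ pluck(.x, "children", 1, "children", 1))
label_text
#> $input1
#> [1] "Email"

# map_chr() accepts the same path as a list and returns a named character vector.
label_chr <- map_chr(inputs, list("children", 1, "children", 1))
label_chr
#>  input1 
#> "Email"
```

The thought process is to read the str() output top-down and write one path component per level of nesting, instead of filtering and flattening level by level.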

Created on 2019-06-14 by the reprex package (v0.3.0)
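
If the inputs do not all share the same structure (for example, the label is not always the first child), a fixed positional path is no longer enough and a condition is needed after all. The following is a sketch only, not part of the original answer: it assumes the label node is a direct child whose name element is "label", and get_label is a hypothetical helper introduced for illustration. It uses purrr's detect(), which returns the first element matching a predicate.

```r
library(purrr)

inputs <- list(
  input1 = list(
    name = "div",
    attribs = list(class = "form-group"),
    children = list(
      list(name = "label",
           attribs = list(`for` = "email"),
           children = list("Email")),
      list(list(name = "input",
                attribs = list(id = "email", type = "text"),
                children = list()))
    )
  )
)

# Hypothetical helper: find the first child whose `name` is "label",
# then take the first element of that node's children (the label text).
get_label <- function(input) {
  label_node <- detect(input$children, ~ identical(.x[["name"]], "label"))
  # pluck() returns NULL if no label node was found.
  pluck(label_node, "children", 1)
}

labels <- map(inputs, get_label)
str(labels)
```

Compared with keep() followed by repeated map() and flatten() calls, this selects one node by condition and then walks a short path into it, so the result keeps one entry per input.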
