How to subset a list based on the length of its elements in R


Problem description

In R I have a function (ip2coordinates from the package RDSTK) which looks up 11 fields of data for each IP address you supply.

I have a list of IPs called ip.addresses:

> head(ip.addresses)
[1] "128.177.90.11"  "71.179.12.143"  "66.31.55.111"   "98.204.243.187" "67.231.207.9"   "67.61.248.12"  

Note: These or any other IPs can be used to reproduce this problem.

So I apply the function to that object with sapply:

ips.info     <- sapply(ip.addresses, ip2coordinates)

and get a list called ips.info as my result. This works, but I can't do much more with a list, so I need to convert it to a data frame. The problem is that not all IP addresses are in the database, so some list elements have only 1 field, and I get this error:

> ips.df       <- as.data.frame(ips.info)
Error in data.frame(`128.177.90.10` = list(ip.address = "128.177.90.10",  : 
  arguments imply differing number of rows: 1, 0

My question is: "How do I remove the elements with missing/incomplete data, or otherwise convert this list into a data frame with 11 columns and 1 row per IP address?"

I tried several things.

  • First, I tried to write a loop that removes elements with a length of less than 11:

for (i in 1:length(ips.info)){
if (length(ips.info[i]) < 11){
ips.info[i] <- NULL}}

This leaves some records with no data and makes others say "NULL", but even those with "NULL" are not detected by is.null.

  • Next, I tried the same thing with double square brackets and got:

Error in ips.info[[i]] : subscript out of bounds

  • I also tried complete.cases() to see if it could potentially be useful:

    Error in complete.cases(ips.info) : not all arguments have the same length
    

  • Finally, I tried a variation of my for loop that was conditioned on length(ips.info[[i]] == 11 and wrote complete records to another object, but somehow it produced an exact copy of ips.info.

Recommended answer

Here is a solution using the built-in Filter function:

    #input data
    library(RDSTK)
    ip.addresses<-c("128.177.90.10","71.179.13.143","66.31.55.111","98.204.243.188",
        "67.231.207.8","67.61.248.15")
    ips.info  <- sapply(ip.addresses, ip2coordinates)
    
    #data.frame creation: keep only the elements that have all 11 fields,
    #then stack them row by row
    lengthIs <- function(n) function(x) length(x)==n
    do.call(rbind, Filter(lengthIs(11), ips.info))
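Since ip2coordinates needs a remote lookup, the same pattern can be tried offline on a small made-up list. The sketch below is only an illustration: make_record, toy.info, the field names and the IPs are all hypothetical, and it assumes that each complete record is a one-row data frame with 11 columns while a failed lookup is a list holding only the IP address.

```r
# Hypothetical helper: builds a one-row data frame with 11 columns
# (ip.address plus 10 dummy fields). Not part of RDSTK.
make_record <- function(ip) {
  fields <- c(list(ip.address = ip),
              as.list(setNames(1:10, paste0("field", 1:10))))
  as.data.frame(fields, stringsAsFactors = FALSE)
}

# Made-up stand-in for ips.info: two complete records, one incomplete
toy.info <- list(
  "10.0.0.1" = make_record("10.0.0.1"),
  "10.0.0.2" = list(ip.address = "10.0.0.2"),
  "10.0.0.3" = make_record("10.0.0.3")
)

# length() of a data frame is its number of columns, so complete
# records have length 11 and the incomplete one has length 1
lengthIs <- function(n) function(x) length(x) == n
toy.df <- do.call(rbind, Filter(lengthIs(11), toy.info))

dim(toy.df)        # 2 rows (one per complete record), 11 columns
toy.df$ip.address  # the incomplete "10.0.0.2" entry has been dropped
```

Filter keeps the elements for which the predicate returns TRUE, and do.call(rbind, ...) then passes the surviving data frames to rbind as separate arguments.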
    

Or, if you prefer not to use a helper function, pass an anonymous function directly:

    do.call(rbind, Filter(function(x) length(x)==11, ips.info))
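As for the loop attempts in the question, a likely cause is the difference between single and double brackets: ips.info[i] returns a sub-list of length 1, so length(ips.info[i]) < 11 is always true, and removing elements inside the loop shrinks the list while the index keeps advancing, which can end in "subscript out of bounds" once [[i]] is used. A minimal sketch on a made-up list (x is hypothetical) that shows the difference, plus a loop-free alternative based on lengths(), which is available in R 3.2.0 and later:

```r
# Hypothetical list with elements of different lengths
x <- list(a = 1:11, b = 1, c = 1:11, d = 1)

len.single <- length(x[1])    # single brackets: a sub-list, so always 1
len.double <- length(x[[1]])  # double brackets: the element itself, here 11

# Vectorised subsetting: nothing is removed during iteration,
# so the indices never shift
keep <- lengths(x) == 11      # equivalent to sapply(x, length) == 11
complete <- x[keep]
names(complete)
```

With the real data, the equivalent call would be ips.info[lengths(ips.info) == 11], followed by do.call(rbind, ...) as in the answer above.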
    
