Divide or split dataframe into multiple dfs based on empty row and header title


Problem description

I have a data frame that holds several tables read from a single file, and I want to split it into multiple data frames (about 25 per file). A new data frame starts wherever there is one blank row followed by a header title row. I have tried the approach from "Splitting dataframes in R based on empty rows", but it does not handle a blank row within a new df (V1 column, 9th row). I want the data split on the empty row plus the header title; my data and the code I have tried are given below. Also, how can I use the header row as the data frame name for each newly created df?

 df = structure(list(V1 = c("Machine", "", "Machine", "V1", "03-09-2020", 
"", "Machine", "No", "Name", "a", "1", "2", "", "Machine", "No", 
""), V2 = c("Data", "", "run", "V2", "600119", "", "error", "SpNo", 
"", "a", "b", "c", "", "logs", "sp", ""), V3 = c("Editor", "", 
"information", "V3", "6", "", "messages", "OP", "", "", "b", 
"c", "", "", "op", ""), V4 = c("", "", "", "V4", "", "", "", 
"OP", "", "", "", "", "", "", "name", "")), class = "data.frame", row.names = c(NA, 
-16L))

dt <- df



## add column to indicate groups
dt$tbl_id <- cumsum(!nzchar(dt$V1))

unique(dt$tbl_id)

## remove blank lines
dt <- dt[nzchar(dt$V1), ]

## split the data frame
dt_s <- split(dt[, -ncol(dt)], dt$tbl_id)

## use first line as header and reset row numbers
dt_s <- lapply(dt_s, function(x) {
  colnames(x) <- x[1, ]
  x <- x[-1, ]
  rownames(x) <- NULL
  x
})

Any help would be greatly appreciated. The header titles are the same in all of the files, and I am using lapply to process multiple files.

Expected output:

Machine_run_information <- read.table(text="
V1  V2  V3  V4
03-09-2020  600119  -   6

",header = T)

Machine_error_messages <- read.table(text="
No  SpNo    OP  OP_Name
-   -   a   a
1   -   b   b
2   -   c   c

",header = T)

Similar to these; there will be 25 outputs.

Recommended answer

Perhaps you can try:

u <- rowSums(df == "")==ncol(df)
out <- split(subset(df,!u),cumsum(u)[!u])

which gives

> out
$`0`
       V1   V2     V3 V4
1 Machine Data Editor

$`1`
          V1     V2          V3 V4
3    Machine    run information
4         V1     V2          V3 V4
5 03-09-2020 600119           6

$`2`
        V1    V2       V3 V4
7  Machine error messages   
8       No  SpNo       OP OP
9     Name
10       a     a
11       1     b        b
12       2     c        c

$`3`
        V1   V2 V3   V4
14 Machine logs        
15      No   sp op name
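
The split above leaves each title row and column-header row inside its piece, and the list is named 0, 1, 2, 3. The question also asks for the title to become the data frame name, so here is one possible follow-up sketch, not part of the original answer. It assumes that the first row of every piece is the title and the second row is the column header; `split_tables` is a hypothetical helper name, and joining the title cells with "_" is a guess at the naming scheme shown in the expected output.

```r
# Sample data from the question
df <- structure(list(V1 = c("Machine", "", "Machine", "V1", "03-09-2020",
"", "Machine", "No", "Name", "a", "1", "2", "", "Machine", "No",
""), V2 = c("Data", "", "run", "V2", "600119", "", "error", "SpNo",
"", "a", "b", "c", "", "logs", "sp", ""), V3 = c("Editor", "",
"information", "V3", "6", "", "messages", "OP", "", "", "b",
"c", "", "", "op", ""), V4 = c("", "", "", "V4", "", "", "",
"OP", "", "", "", "", "", "", "name", "")), class = "data.frame",
row.names = c(NA, -16L))

# Hypothetical helper: split on fully empty rows, name each piece after its
# title row, and promote the following row to column names.
split_tables <- function(df) {
  blank  <- rowSums(df == "") == ncol(df)   # TRUE where every cell is ""
  pieces <- split(df[!blank, , drop = FALSE], cumsum(blank)[!blank])

  # Assumption: the title is the non-empty cells of the first row, joined by "_"
  titles <- vapply(pieces, function(x) {
    cells <- unlist(x[1, ], use.names = FALSE)
    paste(cells[nzchar(cells)], collapse = "_")
  }, character(1))

  tables <- lapply(pieces, function(x) {
    if (nrow(x) < 2) return(NULL)           # title only, no header row: drop
    hdr  <- unlist(x[2, ], use.names = FALSE)
    body <- x[-(1:2), , drop = FALSE]       # rows after title and header
    colnames(body) <- make.unique(hdr)      # duplicate "OP" becomes "OP.1"
    rownames(body) <- NULL
    body
  })

  names(tables) <- make.unique(titles)
  Filter(Negate(is.null), tables)           # keep only pieces with a header
}

tables <- split_tables(df)
names(tables)
# "Machine_run_information" "Machine_error_messages" "Machine_logs"
```

Keeping the results in a named list is usually easier to work with than 25 separate objects, but `list2env(tables, envir = .GlobalEnv)` would create one variable per table if that is really wanted. For several files, the same helper could be applied with `lapply()` over a list of data frames.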
