Divide or split dataframe into multiple dfs based on empty row and header title
Problem description
I have a data frame that holds several tables read from a single file. I want to split it into multiple data frames, about 25 per file. The pattern is that wherever there is one blank row followed by a header title, a new df starts. I have tried "Splitting dataframes in R based on empty rows", but that does not take care of a blank row within the new df (V1 column, 9th row). I want the data to be divided on an empty row plus a header title; my data and the code I have tried are given below. Also, how can I use the header title row as the data frame name for each newly created df?
df = structure(list(V1 = c("Machine", "", "Machine", "V1", "03-09-2020",
"", "Machine", "No", "Name", "a", "1", "2", "", "Machine", "No",
""), V2 = c("Data", "", "run", "V2", "600119", "", "error", "SpNo",
"", "a", "b", "c", "", "logs", "sp", ""), V3 = c("Editor", "",
"information", "V3", "6", "", "messages", "OP", "", "", "b",
"c", "", "", "op", ""), V4 = c("", "", "", "V4", "", "", "",
"OP", "", "", "", "", "", "", "name", "")), class = "data.frame", row.names = c(NA,
-16L))
dt <- df
## add column to indicate groups
dt$tbl_id <- cumsum(!nzchar(dt$V1))
unique(dt$tbl_id)
## remove blank lines
dt <- dt[nzchar(dt$V1), ]
## split the data frame
dt_s <- split(dt[, -ncol(dt)], dt$tbl_id)
## use first line as header and reset row numbers
dt_s <- lapply(dt_s, function(x) {
  colnames(x) <- x[1, ]
  x <- x[-1, ]
  rownames(x) <- NULL
  x
})
Any help would be greatly appreciated. The header titles are the same in all the files, and I am using lapply to process multiple files.
Expected output:
Machine_run_information <- read.table(text="
V1 V2 V3 V4
03-09-2020 600119 - 6
",header = TRUE)
Machine_error_messages <- read.table(text="
No SpNo OP OP_Name
- - a a
1 - b b
2 - c c
",header = TRUE)
Similar to these; there will be 25 outputs.
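Recommended answer
Perhaps you can try the following, which flags the rows where every column is empty and then splits the remaining rows by the running count of those blank rows: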
u <- rowSums(df == "")==ncol(df)
out <- split(subset(df,!u),cumsum(u)[!u])
which gives
> out
$`0`
       V1   V2     V3 V4
1 Machine Data Editor

$`1`
          V1     V2          V3 V4
3    Machine    run information
4         V1     V2          V3 V4
5 03-09-2020 600119           6

$`2`
        V1    V2       V3 V4
7  Machine error messages
8       No  SpNo       OP OP
9     Name
10       a     a
11       1     b        b
12       2     c        c

$`3`
        V1   V2 V3   V4
14 Machine logs
15      No   sp op name
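The split above keeps the title row and the header row inside each piece, and the list elements are only numbered. The question also asks for each piece to be named after its title row, so here is a minimal sketch of one way to finish the job. It is not part of the original answer. The helper name split_tables is hypothetical, and it rests on some assumptions: blank cells are empty strings rather than NA, the first row of every piece is the title and the second row is the column header, and a piece with fewer than two rows (such as the lone "Machine Data Editor" line) is dropped.

```r
# Sample data copied from the question.
df <- structure(list(
  V1 = c("Machine", "", "Machine", "V1", "03-09-2020", "", "Machine", "No",
         "Name", "a", "1", "2", "", "Machine", "No", ""),
  V2 = c("Data", "", "run", "V2", "600119", "", "error", "SpNo",
         "", "a", "b", "c", "", "logs", "sp", ""),
  V3 = c("Editor", "", "information", "V3", "6", "", "messages", "OP",
         "", "", "b", "c", "", "", "op", ""),
  V4 = c("", "", "", "V4", "", "", "", "OP", "", "", "", "", "", "",
         "name", "")
), class = "data.frame", row.names = c(NA, -16L))

# Hypothetical helper (not from the original answer): split `x` at all-blank
# rows, name each piece after its title row, and use the next row as header.
split_tables <- function(x) {
  # TRUE for rows in which every column is an empty string
  blank <- rowSums(x == "") == ncol(x)
  pieces <- split(x[!blank, , drop = FALSE], cumsum(blank)[!blank])

  # Assumption: a usable piece has at least a title row and a header row
  pieces <- Filter(function(p) nrow(p) >= 2, pieces)

  # Title = non-empty cells of the first row, joined with "_"
  titles <- vapply(pieces, function(p) {
    cells <- unlist(p[1, ], use.names = FALSE)
    paste(cells[nzchar(cells)], collapse = "_")
  }, character(1), USE.NAMES = FALSE)

  tables <- lapply(pieces, function(p) {
    header <- unlist(p[2, ], use.names = FALSE)
    body <- p[-(1:2), , drop = FALSE]   # drop the title and header rows
    names(body) <- make.unique(header)  # repeated "OP" becomes "OP", "OP.1"
    rownames(body) <- NULL
    body
  })

  # make.names() turns the titles into syntactically valid, unique R names
  names(tables) <- make.names(titles, unique = TRUE)
  tables
}

tables <- split_tables(df)
names(tables)
```

On the sample data the names come out as Machine_run_information, Machine_error_messages and Machine_logs. Keeping the result as a named list fits the lapply workflow mentioned in the question, for example lapply(list_of_dfs, split_tables), where list_of_dfs is a hypothetical list with one data frame per file. If separate objects are really wanted, list2env(tables, envir = .GlobalEnv) would create them. Note that the stray "Name" row in the error-messages table is left in the body here; merging it into the header to get a name like OP_Name would need an extra rule that the question does not spell out.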