Merge .csv files based on a common column but of inconsistent length
Problem description
Good afternoon (or morning, or evening),
I am trying to merge several .csv files that have a similar layout: each has a class in one column (character) and an abundance (num) in another.
When imported as a data.frame, an example would be:
print(one[1:5,])
X Class Abundance_inds
1 1 Chaetognath 2
2 2 Copepod_Calanoid_Acartia_spp 9
3 3 Copepod_Calanoid_Centropages_spp 4
4 4 Copepod_Calanoid_Temora_spp 1
5 5 Copepod_Calanoid_Unknown 55
The Class column (its number of rows and their order) changes in every csv depending on what was found, and I want to bind several (30+) csvs based on the Class column. I had the following (which I am sure was working a while ago...):
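library(data.table)

# pattern is a regular expression, so anchor it to match only names ending in .csv
DensityFiles <- list.files(CSVdirectory,
                           pattern = "\\.csv$",
                           full.names = TRUE)
# Read every file and stack them row-wise, matching columns by name
Combined <- rbindlist(
  lapply(DensityFiles, fread),
  fill = TRUE,
  use.names = TRUE)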
This yields the following:
str(Combined)
Classes ‘data.table’ and 'data.frame': 461 obs. of 3 variables:
Not quite what I was after! I am looking for the following:
> print(example)
X Class CSV.NAME CSV.NAME.1
1 1 Bivalve_Larvae 1 3
2 2 Bryozoa_Larvae 4 6
3 3 Chaetognath NA 7
4 4 Cnidaria 1 8
5 5 Copepod_Calanoid_Acartia_spp 22 NA
6 6 Copepod_Calanoid_Calanus_spp 24 4
7 7 Copepod_Calanoid_Candacia_sp 5 3
8 8 Copepod_Calanoid_Centropages_spp 41 2
9 9 Copepod_Calanoid_Temora_spp 39 8
10 10 Copepod_Calanoid_Unknown 458 NA
11 11 Copepod_Cyclopoid_Corycaeus_spp 46 NA
12 12 Copepod_Cyclopoid_Oithona_spp NA 4
13 13 Copepod_Cyclopoid_Oncaea_spp NA 7
14 14 Copepod_Harpacticoid 36 NA
15 15 Copepod_Nauplii 12 9
I can get the CSV name into the output using idcol = "origin" with rbindlist from the data.table library, but I am not sure if this works for all solutions.
I have had a good hunt around, but most examples seem to deal with a consistent number of rows.
Any help would be greatly appreciated!
Jim
Recommended answer
You can use readr and bind_rows:
library(dplyr)
library(readr)
df <- do.call(bind_rows, lapply(DensityFiles,read_csv))
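Note that bind_rows, like rbindlist, stacks the files on top of each other, so the result is still in long form. To reach the wide layout shown in the question (one row per Class, one column per file), the stacked table can be reshaped. The following is a minimal sketch, not a definitive implementation: it reuses rbindlist and idcol from the question and adds dcast from data.table. The file names sample_a.csv and sample_b.csv and their contents are made up for illustration, and it assumes each Class appears at most once per file.

```r
library(data.table)

# Hypothetical sample files standing in for the real CSVs
csv_dir <- file.path(tempdir(), "density_csvs")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(Class = c("Chaetognath", "Copepod_Nauplii"),
                     Abundance_inds = c(2, 12)),
          file.path(csv_dir, "sample_a.csv"))
write.csv(data.frame(Class = c("Copepod_Nauplii", "Cnidaria"),
                     Abundance_inds = c(9, 8)),
          file.path(csv_dir, "sample_b.csv"))

density_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
# Name each element after its file so that idcol records where a row came from
names(density_files) <- tools::file_path_sans_ext(basename(density_files))

# Long form: one row per (file, Class) pair; the row-number column is dropped
long <- rbindlist(
  lapply(density_files, fread, select = c("Class", "Abundance_inds")),
  idcol = "origin"
)

# Wide form: one row per Class, one column per file.
# A Class missing from a file is filled with NA.
wide <- dcast(long, Class ~ origin, value.var = "Abundance_inds")
print(wide)
```

If a Class can occur more than once within a file, dcast needs an explicit fun.aggregate (for example sum) to combine the duplicates. In the tidyverse, bind_rows with its .id argument followed by tidyr::pivot_wider would play the same role.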