How to read and display columns and sub-columns from an Excel file into a data frame using RStudio


Problem description

I am trying to read an Excel file in which some columns have sub-columns that also need to be read, but I cannot find a way to do this. Can anybody give me some hints?

The Excel file contains the following data:

Row1 : | Sl No.   |      X         |     Y    |    Z    |
Row2 : |          |  a    |   b    |  c  | d  |         |
Row3 : |   1      |  34   |   67   | 175 | 45 |  125    |

After importing, the data looks like this:

Sl. No.    X    ...3        Y       ...5       Z
  NA       a       b        c          d       NA
  1        12     34       56         78       55

But I need it to look like this:

Sl. No.     X              Y            Z
          a     b       c          d    
   1     12     34     56         78    55

Please help.

Recommended answer

Although I can hardly think of a use for such a structure (in the form you specified), here is one way to get what you want. Since you mentioned that you are reading an xls or xlsx file, I am going to use the xlsx package.

library("xlsx")
# read the sheet without treating the first row as a header
df <- read.xlsx("my_file.xlsx", sheetName="Sheet1", header = FALSE)
# for an xls file:
#df <- read.xlsx("my_file.xls", sheetName="Sheet1", header = FALSE)
# for a csv file (this is the first action):
#df <- read.csv("~/my_file.csv",  header = FALSE)
# -------------------------------------------------------------------------
#     X1 X2   X3  X4   X5   X6
# 1 S1.No  X <NA>   Y <NA>    Z
# 2  <NA>  a    b   c    d <NA>
# 3     1 12   34 175   45  125

# -------------------------------------------------------------------------

# convert every column to character so the NAs can be replaced afterwards
# (sapply returns a character matrix; this is not required for the csv file)
df <- sapply(df, as.character)
# -------------------------------------------------------------------------
#       X1      X2   X3   X4    X5   X6   
# [1,] "S1.No" "X"  NA   "Y"   NA   "Z"  
# [2,] NA      "a"  "b"  "c"   "d"  NA   
# [3,] "1"     "12" "34" "175" "45" "125"
# -------------------------------------------------------------------------

# replace NA with a single space
df[is.na(df)] <- " "
# -------------------------------------------------------------------------
#       X1      X2   X3   X4    X5   X6   
# [1,] "S1.No" "X"  " "  "Y"   " "  "Z"  
# [2,] " "     "a"  "b"  "c"   "d"  " "  
# [3,] "1"     "12" "34" "175" "45" "125"

# -------------------------------------------------------------------------
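# Coerce the character matrix back to a data frame
# (stringsAsFactors = FALSE keeps the columns as character in R < 4.0 too)
df <- as.data.frame(df, stringsAsFactors = FALSE)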
# -------------------------------------------------------------------------
#      X1 X2 X3  X4 X5  X6
# 1 S1.No  X      Y      Z
# 2        a  b   c  d    
# 3     1 12 34 175 45 125

# -------------------------------------------------------------------------
#Use the first row instead of the auto generated header
# This is the second action for the csv file
colnames(df) <- as.character(unlist(df[1,]))
# -------------------------------------------------------------------------
#   S1.No  X      Y      Z
# 1 S1.No  X      Y      Z
# 2        a  b   c  d    
# 3     1 12 34 175 45 125

# -------------------------------------------------------------------------
# Finally remove the first row 
# This is the third action for the csv file 
df <- df[-1, ]

# -------------------------------------------------------------------------
#   S1.No  X      Y      Z
# 2        a  b   c  d    
# 3     1 12 34 175 45 125
# -------------------------------------------------------------------------

This gives you a data frame which you can tidy further.
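
One possible way to tidy further, offered only as a sketch and not as part of the original answer, is to merge the two header rows into a single set of column names and then drop both header rows. The data frame `raw`, the helper `fill_forward`, and the `X_a` naming scheme are hypothetical choices made for illustration; `raw` is built by hand so the example runs without an Excel file, and it assumes the first header cell is never empty.

```r
# Hypothetical stand-in for a sheet read with header = FALSE.
raw <- data.frame(
  X1 = c("Sl.No", NA, "1"),
  X2 = c("X", "a", "12"),
  X3 = c(NA, "b", "34"),
  X4 = c("Y", "c", "175"),
  X5 = c(NA, "d", "45"),
  X6 = c("Z", NA, "125"),
  stringsAsFactors = FALSE
)

# Carry each non-missing value forward over the NAs that follow it.
# Assumes the first element of x is not NA.
fill_forward <- function(x) {
  idx <- cumsum(!is.na(x))
  x[!is.na(x)][idx]
}

# Row 1 holds the merged top-level headers, row 2 the sub-headers.
top <- fill_forward(unlist(raw[1, ], use.names = FALSE))
sub <- unlist(raw[2, ], use.names = FALSE)

# Use "top_sub" where a sub-header exists, otherwise the top header alone.
flat_names <- ifelse(is.na(sub), top, paste(top, sub, sep = "_"))

# Drop both header rows, apply the new names, and restore column types.
tidy <- raw[-(1:2), , drop = FALSE]
names(tidy) <- flat_names
rownames(tidy) <- NULL
tidy[] <- lapply(tidy, type.convert, as.is = TRUE)

print(tidy)
```

With flat names such as `X_a` and `Y_c`, each value column can be addressed directly (for example `tidy$X_a`), which a data frame with blank column names does not allow.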
