Need to vectorize a solution that uses nested loops (transform a data frame from long to wide format)


Problem description

I have the following data frame and need to transform it from long to wide format:

  symbol side price
1      A    B     1
2      A    S     2
3      B    B     3
4      C    B     4
5      B    S     5

Explanation: every symbol should have two lines, one with side B and one with side S. I need to find these lines and transform them into wide format:

[symbol, side that appears first (B or S), price of side B, price of side S]

If one line exists but the other is missing, the corresponding price should be NA. For example, if the line with side B exists but the line with side S is missing, the price of side S should be NA.

The output must look like this:

  symbol side price_B price_S
1      A    B      1      2
2      B    B      3      5
3      C    B      4     NA

For symbols A and B we have lines for both sides, B and S, so they are transformed without NAs. Side B appears first in row order, so only B goes into the "side" column. For symbol C we have only side B and no side S, so the "price_S" column gets NA.

How can this be vectorized?

Solution
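
The snippets in this answer refer to a data frame named x that is never defined on the page. A minimal sketch of how it could be built from the table in the question, assuming the columns are plain character and numeric vectors (the original column types are not stated):

```r
# Hypothetical reconstruction of the question's data frame.
# Assumption: symbol and side are character columns, price is numeric.
x <- data.frame(
  symbol = c("A", "A", "B", "C", "B"),
  side   = c("B", "S", "B", "B", "S"),
  price  = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

str(x)
```

Row order matters here, because the "side" column of the result is taken from whichever row comes first for each symbol.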

reshape gets the prices:

prices <- reshape(x, direction='wide', idvar='symbol', timevar='side', v.names='price', sep='')
prices
##   symbol priceB priceS
## 1      A      1      2
## 3      B      3      5
## 4      C      4     NA

aggregate gets the first side for each symbol:

first <- aggregate(side ~ symbol, data=x, FUN=head, n=1)
first
##   symbol side
## 1      A    B
## 2      B    B
## 3      C    B

merge puts them together:

merge(first, prices)
##   symbol side priceB priceS
## 1      A    B      1      2
## 2      B    B      3      5
## 3      C    B      4     NA
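
The merged result above has columns priceB and priceS, whereas the question asked for price_B and price_S. The following is a sketch of one possible variation, not the answerer's code: it passes sep = "_" to reshape to get the requested names, and it swaps aggregate for a duplicated()-based row filter to pick the first side per symbol. The data frame is the hypothetical reconstruction of the question's table.

```r
# Assumed reconstruction of the question's data (types are a guess).
x <- data.frame(
  symbol = c("A", "A", "B", "C", "B"),
  side   = c("B", "S", "B", "B", "S"),
  price  = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# One row per symbol, one price column per side.
# sep = "_" makes the new columns price_B and price_S.
prices <- reshape(x, direction = "wide", idvar = "symbol",
                  timevar = "side", v.names = "price", sep = "_")

# Keep the first row seen for each symbol; its side is the
# side that appears first in row order.
first <- x[!duplicated(x$symbol), c("symbol", "side")]

# Join the two pieces on symbol.
result <- merge(first, prices, by = "symbol")
print(result)
```

Both steps operate on whole columns, so no explicit loop over symbols or sides is needed. A symbol with only one side keeps NA in the other price column, because reshape fills missing combinations with NA.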
