Function for DataFrame operation using variables in the list with Python

Views: 901 | Published: 2018/4/17 18:04:48 | Tags: python function pandas numpy dataframe


Problem description

I have a list, list = ['OUT', 'IN'], where every element is the prefix of a set of column names in the data frame, formed by appending the suffixes _3M, _6M, _9M, _15M.

List:

list = ['OUT', 'IN']


Input_df:

ID OUT_3M  OUT_6M  OUT_9M  OUT_15M  IN_3M  IN_6M   IN_9M   IN_15M
A   2        3        4        6        2     3       4       6
B   3        3        5        7        3     3       5       7
C   2        3        6        6        2     3       6       6
D   3        3        7        7        3     3       7       7


For each element, I want to compute the difference between consecutive columns:

1. OUT_6M minus OUT_3M, stored in a separate column as Out_3M-6M

2. OUT_9M minus OUT_6M, stored in a separate column as Out_6M-9M

3. OUT_15M minus OUT_9M, stored in a separate column as Out_9M-15M

The same is repeated for every element in the list, while OUT_3M and IN_3M are kept as they are, as shown in the sample Output_df below.

Output_df:


ID  Out_3M  Out_3M-6M Out_6M-9M Out_9M-15M IN_3M IN_3M-6M IN_6M-9M  IN_9M-15M
A   2         1         1           2        2      1        1         2
B   3         0         2           2        3      0        2         2
C   2         1         3           0        2      1        3         0
D   3         0         4           0        3      0        4         0


The list contains many elements that I need to perform this operation on. Is there a way to solve this by writing a function? Thanks!
Solution
I'm not sure what you mean by writing a function; aren't a couple of for loops enough for what you want to do? Something like:
# input_df is the DataFrame shown in the question
postfixes = ['3M', '6M', '9M', '15M']
prefixes = ['IN', 'OUT']

# Allocate the space, while also copying _3M
output_df = input_df.copy()

# Rename e.g. OUT_6M -> OUT_3M-6M, so every result column already exists under its final name
output_df.rename(columns={'_'.join((prefix, postfixes[i])): '_'.join((prefix, postfixes[i-1] + '-' + postfixes[i]))
                          for prefix in prefixes for i in range(1, len(postfixes))}, inplace=True)


# Compute the differences (later period minus earlier period, as in the sample Output_df)
for prefix in prefixes:
    for i in range(1, len(postfixes)):
        # Same naming as in the rename above, so the renamed column is overwritten
        postfix = postfixes[i-1] + '-' + postfixes[i]
        output_df['_'.join((prefix, postfix))] = input_df['_'.join((prefix, postfixes[i]))].values - input_df['_'.join((prefix, postfixes[i-1]))].values
output_df starts out as a copy of input_df, both to avoid handling the _3M columns separately and to allocate the whole DataFrame up front instead of adding columns one at a time (it doesn't matter for your data, but with thousands of columns, growing the frame column by column would waste time moving data around in memory).
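
Since the question asks for a function, the same idea can be wrapped into one. This is only a minimal sketch: the function name consecutive_differences and its parameters are made up for illustration, and it assumes that every prefix/suffix combination exists as a column. The result columns are named from the prefix exactly as given, so they come out as OUT_3M-6M rather than the mixed-case Out_3M-6M of the sample Output_df.

```python
import pandas as pd


def consecutive_differences(df, prefixes, postfixes):
    """Return a copy of df in which, for each prefix, every column after the
    first is replaced by (that column - the preceding column)."""
    out = df.copy()
    renames = {}
    for prefix in prefixes:
        # Walk over neighbouring suffix pairs: (3M, 6M), (6M, 9M), ...
        for prev, cur in zip(postfixes, postfixes[1:]):
            # Read from the untouched df, write into the copy, so an
            # already-overwritten column is never used as an input.
            out[f"{prefix}_{cur}"] = df[f"{prefix}_{cur}"] - df[f"{prefix}_{prev}"]
            renames[f"{prefix}_{cur}"] = f"{prefix}_{prev}-{cur}"
    return out.rename(columns=renames)


# Sample data from the question
input_df = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "OUT_3M": [2, 3, 2, 3], "OUT_6M": [3, 3, 3, 3],
    "OUT_9M": [4, 5, 6, 7], "OUT_15M": [6, 7, 6, 7],
    "IN_3M": [2, 3, 2, 3], "IN_6M": [3, 3, 3, 3],
    "IN_9M": [4, 5, 6, 7], "IN_15M": [6, 7, 6, 7],
})

output_df = consecutive_differences(input_df, ["OUT", "IN"], ["3M", "6M", "9M", "15M"])
print(output_df)
```

Because the results overwrite existing columns before the rename, the column order of the input is preserved and input_df itself is left unchanged.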

Also, avoid naming a list "list": that shadows the built-in list type, and you're going to get some nasty-to-find bugs along the way, for example when you try to convert a tuple to a list!
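
To make the shadowing problem concrete, here is a small sketch. The function shadowing_demo is hypothetical and exists only to keep the bad name confined to a local scope.

```python
def shadowing_demo():
    # Inside this function the name "list" refers to a list object,
    # not to the built-in type (shown only to reproduce the problem).
    list = ['OUT', 'IN']
    try:
        list(('3M', '6M'))  # attempts to call a list object
    except TypeError as exc:
        return str(exc)
    return None


message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched, so the conversion works.
prefixes = list(('OUT', 'IN'))
print(prefixes)
```

Using a descriptive name such as prefixes avoids the problem entirely.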
