How can I make a dataset of matrix elements in a DataFrame?

Posted: 2020/5/7 18:39:20 Tags: python arrays dataframe matrix dataset

Question

I have a dataset of three parameters, 'A', 'B', and 'C', in a .TXT file. After printing each of them as a 24x20 matrix, I need to collect the 1st elements of 'A', 'B', and 'C' and put them into a long array in a pandas DataFrame, followed by the 2nd elements of each, then the 3rd, and so on up to the 480th element.

My data is stored in a text file that looks like this:

id_set: 000
     A: -2.46882615679
     B: -2.26408246559
     C: -325.004619528
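
For reference, a file in this layout could be read into a DataFrame roughly as follows. This is only a sketch: the `parse_records` helper is hypothetical, the second record in `sample` is made up for illustration, and it assumes every record starts with an `id_set:` line followed by `A:`, `B:`, and `C:` lines.

```python
import io
import pandas as pd

# Sample text in the layout shown above.
# The second record uses made-up values purely for illustration.
sample = """id_set: 000
     A: -2.46882615679
     B: -2.26408246559
     C: -325.004619528
id_set: 001
     A: 1.5
     B: 2.5
     C: 3.5
"""

def parse_records(lines):
    """Hypothetical helper: turn 'key: value' lines into one row per id_set."""
    records = []
    current = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "id_set":
            # A new record starts; store the previous one if there is one
            if current:
                records.append(current)
            current = {"id_set": int(value)}
        else:
            current[key] = float(value)
    if current:
        records.append(current)
    return pd.DataFrame(records).set_index("id_set")

# For a real file, pass an open file object instead of io.StringIO
df = parse_records(io.StringIO(sample))
print(df)
```

Each `id_set` becomes one row with columns 'A', 'B', and 'C', which matches the DataFrame described in the question.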

I have already built a pandas DataFrame with three columns, 'A', 'B', and 'C', plus an index, and defined functions that print each parameter as a 24x20 matrix in the right order. A simple example using 2x2 matrices:

1st cycle:  A = [1,2,    B = [4,5,     C = [8,9,
                 3,4]         6,7]          10,11]
2nd cycle:  A = [0,8,    B = [1,9,     C = [10,1,
                 2,5]         4,8]          2,7]

I want to reshape them into this form:

          A(1,1),B(1,1),C(1,1),A(1,2),B(1,2),C(1,2),.....
Result=  [1,4,8,2,5,9,3,6,10,4,7,11] #1st cycle
         [0,1,10,8,9,1,2,4,2,5,8,7]  #2nd cycle

My script is as follows:

import numpy as np
import pandas as pd
import os

def normalize(value, min_value, max_value, min_norm, max_norm):
    new_value = ((max_norm - min_norm)*((value - min_value)/(max_value - min_value))) + min_norm
    return new_value

dft = pd.read_csv(r'D:\mc25.TXT', header=None)  # raw string so the backslash is kept literally
id_set = dft[dft.index % 4 == 0].astype('int').values
A = dft[dft.index % 4 == 1].values
B = dft[dft.index % 4 == 2].values
C = dft[dft.index % 4 == 3].values
data = {'A': A[:,0], 'B': B[:,0], 'C': C[:,0]}

df = pd.DataFrame(data, columns=['A','B','C'], index = id_set[:,0])  

# Each cycle holds 480 rows (one 24x20 matrix per parameter)
cycles = len(df) // 480
print(cycles)
for cycle in range(cycles):
    count = '{:04}'.format(cycle)
    j = cycle * 480
    for i in df:
        # Create the output folder for this column if it does not exist yet
        os.makedirs(i, exist_ok=True)

        min_val = df[i].min()
        min_nor = -1
        max_val = df[i].max()
        max_nor = 1

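        # mkdf() and print_df() are helpers defined elsewhere (not shown here)
        # that arrange the 480 values of a cycle into a 24x20 matrix
        ordered_data = mkdf(df.iloc[j:j+480][i])
        csv = print_df(ordered_data)
        # Write one .csv file per parameter and cycle, containing its matrix
        csv.to_csv(f'{i}/{i}{count}.csv', header=None, index=None)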
        if 'C' in i:
            min_nor = -40
            max_nor = 150
            #Applying normalization for C between [-40,+150]
            new_value3 = normalize(df['C'].iloc[j:j+480], min_val, max_val, -40, 150)
            df3 = print_df(mkdf(new_value3))
            df3.to_csv(f'{i}/norm{i}{count}.csv', header=None, index=None)
        else:
            # Normalize the current column (A or B) to [-1, +1] using its own min/max
            new_value = normalize(df[i].iloc[j:j+480], min_val, max_val, -1, 1)
            df_norm = print_df(mkdf(new_value))
            df_norm.to_csv(f'{i}/norm{i}{count}.csv', header=None, index=None)
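
The `normalize` helper in the script is a min-max rescaling: it maps the interval [min_value, max_value] linearly onto [min_norm, max_norm]. A small worked example, using made-up input values chosen only to make the arithmetic easy to check:

```python
import numpy as np

def normalize(value, min_value, max_value, min_norm, max_norm):
    # Linear map from [min_value, max_value] to [min_norm, max_norm]
    return (max_norm - min_norm) * ((value - min_value) / (max_value - min_value)) + min_norm

# Illustrative values only, not taken from the dataset
values = np.array([0.0, 5.0, 10.0])

# Range used for A and B in the script: [-1, +1]
scaled_ab = normalize(values, values.min(), values.max(), -1, 1)

# Range used for C in the script: [-40, +150]
scaled_c = normalize(values, values.min(), values.max(), -40, 150)

print(scaled_ab)  # minimum maps to -1, midpoint to 0, maximum to 1
print(scaled_c)   # minimum maps to -40, midpoint to 55, maximum to 150
```

Note that the script takes the minimum and maximum over the whole column rather than per cycle, so all cycles of a parameter share one scale.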

Note 2: I have provided a dataset in a text file for 3 cycles: Text dataset

Recommended answer

I am not sure I fully understood your question, but here is a solution:

Convert your DataFrame to a 2-D NumPy array with to_numpy() (as_matrix() in pandas versions before 1.0, where it was removed), then use ravel() to get a vector of size 480 * 3. Loop over your cycles and use vstack to stack each cycle's vector as a new row of the result. Here is the code with your example data:

import numpy as np
import pandas as pd

A = [[1,2,3,4], [10,20,30,40]]
B = [[4,5,6,7], [40,50,60,70]]
C = [[8,9,10,11], [80,90,100,110]]

cycles = 2

for cycle in range(cycles):
    data = {'A': A[cycle], 'B': B[cycle], 'C': C[cycle]}
    df = pd.DataFrame(data)
    # Flatten row by row: A(1), B(1), C(1), A(2), B(2), C(2), ...
    D = df.to_numpy().ravel()  # as_matrix() was removed in pandas 1.0
    if cycle == 0:
        Results = np.array(D)
    else:
        # Stack this cycle's vector under the previous rows
        Results = np.vstack((Results, D))
# Output: Results = array([[  1,   4,   8,   2,   5,   9,   3,   6,  10,   4,   7,  11],
#                          [ 10,  40,  80,  20,  50,  90,  30,  60, 100,  40,  70, 110]])
np.savetxt("Results.csv", Results, delimiter=",")
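
If all cycles already sit one after another in a single long DataFrame, as in the question, the same layout can probably be obtained without a loop by reshaping. The following is a minimal sketch under that assumption; `n_per_cycle` is a name introduced here for illustration (it would be 480 for the 24x20 matrices), and the rows are assumed to already be in the desired element order.

```python
import numpy as np
import pandas as pd

# The answer's example data, stacked into one long frame (2 cycles of 4 rows each)
n_per_cycle = 4  # assumption for this toy data; 480 for 24x20 matrices
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 10, 20, 30, 40],
    "B": [4, 5, 6, 7, 40, 50, 60, 70],
    "C": [8, 9, 10, 11, 80, 90, 100, 110],
})

cycles = len(df) // n_per_cycle

# Keep only complete cycles, then reshape in row-major order so that each
# output row reads A(1), B(1), C(1), A(2), B(2), C(2), ... for one cycle
values = df[["A", "B", "C"]].to_numpy()[: cycles * n_per_cycle]
results = values.reshape(cycles, n_per_cycle * 3)

print(results)
```

This relies on NumPy's default row-major order: each DataFrame row contributes its A, B, C values in sequence, and every `n_per_cycle` rows form one output row.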

Is this what you wanted?
