How to Min-Max Normalize Data from MySQL in Python
Problem description
This is an example of my data in MySQL. I use the flaskext.mysql library and Python 3.
RT     NK    NB    SU    SK    P    TNI  IK   IB    TARGET
84876  902   1192  2098  3623  169  39   133  1063  94095
79194  902   1050  2109  3606  153  39   133  806   87992
75836  902   1060  1905  3166  161  39   133  785   83987
75571  902   112   1878  3190  158  39   133  635   82618
83797  1156  134   1900  3518  218  39   133  709   91604
91648  1291  127   2225  3596  249  39   133  659   99967
The min-max formula is:
(data-min)/(max-min)*0.8+0.1
I have code that normalizes data read from a CSV file:
import pandas as pd
df = pd.read_csv("dataset.csv")
norm = (df - df.min()) / (df.max() - df.min() )*0.8 + 0.1
I know the calculation works like this:
(first data of RT - min column RT data) / (max column RT- min column RT) * 0.8 + 0.1
and the same for the next column:
(first data of NK - min column NK data) / (max column NK- min column NK) * 0.8 + 0.1
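As a quick check of the arithmetic, here is a minimal pure-Python sketch that applies the formula to the RT and NK columns of the sample data. The helper name `min_max` and its parameters are illustrative assumptions, not part of the original code. The first RT value should come out at roughly 0.563, and the first NK value, being the column minimum, at 0.1.

```python
# Sample RT and NK columns copied from the data shown earlier.
rt = [84876, 79194, 75836, 75571, 83797, 91648]
nk = [902, 902, 902, 902, 1156, 1291]


def min_max(values, low=0.1, span=0.8):
    """Scale values into [low, low + span]: (x - min) / (max - min) * span + low.

    Hypothetical helper for illustration; it does not guard against
    max == min, which would raise ZeroDivisionError.
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * span + low for v in values]


rt_norm = min_max(rt)
nk_norm = min_max(nk)

print(round(rt_norm[0], 4))  # first RT value after scaling
print(round(nk_norm[0], 4))  # first NK value after scaling
```

Note that the TNI and IK columns are constant in the sample, so the same helper would divide by zero on them.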
Please help me: how do I normalize the data from the database table called "dataset" and insert the result into another table called "normalize"?
Recommended answer
Here is a SQL query that should get you started (assuming you want to calculate it per column):
create table normalize as
select
    (RT - min(RT) over ()) / (max(RT) over () - min(RT) over ()) * 0.8 + 0.1 as RT_norm
from test;
I tested this query in sqlite3, not MySQL. It isn't necessarily optimal, but it follows the formula intuitively. Note that the over clause turns the min / max aggregate functions into window functions, which means they look at the whole column, but the result is repeated on each row.
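To make the difference between a plain aggregate and a window function concrete, here is a small sketch using Python's built-in sqlite3 module (the same engine the query was tested on). It assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the three RT values are taken from the sample data and the table name follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (RT integer)")
conn.executemany("insert into test values (?)", [(84876,), (79194,), (75836,)])

# A plain aggregate collapses the whole table into a single row.
agg_rows = conn.execute("select min(RT), max(RT) from test").fetchall()

# With over (), the same aggregates are computed over the whole column
# but returned next to every individual row.
win_rows = conn.execute(
    "select RT, min(RT) over (), max(RT) over () from test order by RT"
).fetchall()

print(agg_rows)
for row in win_rows:
    print(row)

conn.close()
```

The second query keeps one row per input row, which is what lets the normalization formula refer to both the row's own value and the column-wide min and max in the same expression.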
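You still need to:
- send the MySQL query via Python
- repeat the same code for each column
- give the columns names
- assign the resulting table to a schema (most likely)
- handle division by 0 in case the max and min of a column are equal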
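The remaining steps could be sketched roughly as follows. This is only one possible approach under stated assumptions: the function names `build_normalize_query` and `normalize_table` are hypothetical, the connection is assumed to be any DB-API connection (for example one obtained from flaskext.mysql), and the demo runs against an in-memory SQLite database as a stand-in because no MySQL server is assumed here. On MySQL, window functions require version 8.0 or later. Constant columns are left as NULL via `nullif`; whether NULL is the right value for them is a choice you would need to make.

```python
import sqlite3

COLUMNS = ["RT", "NK", "NB", "SU", "SK", "P", "TNI", "IK", "IB", "TARGET"]


def build_normalize_query(columns, source="dataset", target="normalize"):
    """Build one CREATE TABLE ... AS SELECT that covers every column.

    Identifiers are interpolated directly into the SQL text, so they must
    come from a trusted, hard-coded list and never from user input.
    """
    parts = []
    for col in columns:
        col_range = f"max({col}) over () - min({col}) over ()"
        # Multiplying by 0.8 before dividing gives the same value as the
        # original formula and avoids integer division in SQLite.
        # nullif(..., 0) yields NULL instead of dividing by zero when
        # a column's max and min are equal.
        parts.append(
            f"({col} - min({col}) over ()) * 0.8 / nullif({col_range}, 0) + 0.1"
            f" as {col}_norm"
        )
    select_list = ",\n  ".join(parts)
    return f"create table {target} as\nselect\n  {select_list}\nfrom {source}"


def normalize_table(conn, columns, source="dataset", target="normalize"):
    """Run the generated query on a DB-API connection and commit."""
    cur = conn.cursor()
    cur.execute(build_normalize_query(columns, source, target))
    conn.commit()
    cur.close()


# Demo with SQLite standing in for MySQL, using the sample rows.
rows = [
    (84876, 902, 1192, 2098, 3623, 169, 39, 133, 1063, 94095),
    (79194, 902, 1050, 2109, 3606, 153, 39, 133, 806, 87992),
    (75836, 902, 1060, 1905, 3166, 161, 39, 133, 785, 83987),
    (75571, 902, 112, 1878, 3190, 158, 39, 133, 635, 82618),
    (83797, 1156, 134, 1900, 3518, 218, 39, 133, 709, 91604),
    (91648, 1291, 127, 2225, 3596, 249, 39, 133, 659, 99967),
]
conn = sqlite3.connect(":memory:")
column_defs = ", ".join(f"{c} integer" for c in COLUMNS)
placeholders = ", ".join("?" for _ in COLUMNS)
conn.execute(f"create table dataset ({column_defs})")
conn.executemany(f"insert into dataset values ({placeholders})", rows)

normalize_table(conn, COLUMNS)
result = conn.execute("select RT_norm, NK_norm, TNI_norm from normalize").fetchall()
for row in result:
    print(row)
conn.close()
```

In the sample data the TNI and IK columns are constant, so their normalized columns come out as NULL rather than raising a division error.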