Large Dataset Polynomial Fitting Using Numpy

Question

I'm trying to fit a second-order polynomial to raw data and plot the result using Matplotlib. The data set I'm trying to fit has about a million points. This is supposed to be simple, and there are many examples around the web, but for some reason I cannot get it right.

I get the following warning message:

RankWarning: Polyfit may be poorly conditioned

This is my output:

This is the output using Excel:

See below for my code. What am I missing?

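import numpy as np
import matplotlib.pyplot as plt

# df is assumed to be a pandas DataFrame with columns 'X' and 'Y', loaded elsewhere
xData = df['X']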
yData = df['Y']
xTitle = 'X'
yTitle = 'Y'
title = ''
minX = 100
maxX = 300
minY = 500
maxY = 2200

title_font = {'fontname':'Arial', 'size':'30', 'color':'black', 'weight':'normal',
              'verticalalignment':'bottom'} # Bottom vertical alignment for more space
axis_font = {'fontname':'Arial', 'size':'18'}

#Poly fit

# calculate polynomial
z = np.polyfit(xData, yData, 2)
f = np.poly1d(z)
print(f)

# calculate new x's and y's
x_new = xData
y_new = f(x_new)   

#Plot
plt.scatter(xData, yData,c='#002776',edgecolors='none')
plt.plot(x_new,y_new,c='#C60C30')

plt.ylim([minY,maxY])
plt.xlim([minX,maxX])

plt.xlabel(xTitle,**axis_font)
plt.ylabel(yTitle,**axis_font)
plt.title(title,**title_font)

plt.show()      

Recommended Answer

The array to plot must be sorted. plt.plot connects the points in the order they are given, so unsorted x values make the line jump back and forth across the axes. Here is a comparison between plotting a sorted and an unsorted array. The plot in the unsorted case looks completely distorted, but the fitted function is of course the same in both cases.

The code below prints the same fitted polynomial for both cases:

        2
-3.496 x + 2.18 x + 17.26

import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)

x = (np.random.normal(size=300)+1)
fo = lambda x: -3*x**2+ 1.*x +20. 
f = lambda x: fo(x) + (np.random.normal(size=len(x))-0.5)*4
y = f(x)

fig, (ax, ax2) = plt.subplots(1,2, figsize=(6,3))
ax.scatter(x,y)
ax2.scatter(x,y)

def fit(ax, x,y, sort=True):
    z = np.polyfit(x, y, 2)
    fit = np.poly1d(z)
    print(fit)
    ax.set_title("unsorted")
    if sort:
        x = np.sort(x)
        ax.set_title("sorted")
    ax.plot(x, fo(x), label="original func", color="k", alpha=0.6)
    ax.plot(x, fit(x), label="fit func", color="C3", alpha=1, lw=2.5  )  
    ax.legend()


fit(ax, x,y, sort=False)

fit(ax2, x,y, sort=True) 


plt.show()
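
Applied to the setup in the question, the same idea can be sketched as follows. This is only an illustration under stated assumptions: the synthetic `x_data` and `y_data` stand in for `df['X']` and `df['Y']`, and the names `x_line`, `y_line`, and `plot_fit` are hypothetical. Instead of sorting a million x values, the fit is evaluated on a small, evenly spaced grid, which is sorted by construction. The last step shows `np.polynomial.Polynomial.fit`, which fits in a scaled domain and may help with the conditioning warning; whether it removes the warning for the original data is not confirmed here.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for df['X'] and df['Y'] (assumption: unsorted 1-D data).
rng = np.random.default_rng(0)
x_data = rng.uniform(100, 300, size=1_000_000)
noise = rng.normal(scale=50, size=x_data.size)
y_data = 0.02 * x_data**2 - 3.0 * x_data + 900 + noise

# Fit as in the question; the order of the points does not affect the fit.
coeffs = np.polyfit(x_data, y_data, 2)
f = np.poly1d(coeffs)

# Evaluate the fit on a small sorted grid instead of the raw x values.
x_line = np.linspace(x_data.min(), x_data.max(), 200)
y_line = f(x_line)

# Optional: fit in a scaled domain, then convert back to ordinary
# coefficients (stored in ascending order, unlike np.polyfit).
p = np.polynomial.Polynomial.fit(x_data, y_data, 2).convert()


def plot_fit(ax):
    """Draw a thinned scatter of the data and the fitted curve on ax."""
    ax.scatter(x_data[::1000], y_data[::1000], c='#002776',
               edgecolors='none', s=4)
    ax.plot(x_line, y_line, c='#C60C30')
```

Calling `plot_fit(plt.gca())` followed by `plt.show()` draws the curve as a single smooth line. A 200-point grid is enough for a parabola, and thinning the scatter to every 1000th point keeps rendering fast; both values are arbitrary choices for this sketch.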
