How to weigh the points in a scatter plot for a fit?

Question

So, I looked up information about the weights parameter in the polyfit (numpy.polynomial.polynomial.polyfit) function in Python and it seems like it has something to do with the error associated with the individual points. (How to include measurement errors in numpy.polyfit)

However, what I am trying to do has nothing to do with the error, but weights. I have an image in the form of a numpy array which indicates the amount of charge deposited in the detector. I convert that image to a scatter plot and then do a fit. But I want that fit to give more weight to the points which have more charge deposited and less to the ones that have less charge. Is that what the weights parameter is for?

Here's an example image: (image not shown)

Here's my code:

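import math
import numpy as np
from numpy.polynomial.polynomial import polyfit

def get_best_fit(image_array, fixedX, fixedY):
    # fixedX and fixedY (the vertex) are not used yet
    weights = np.array(image_array)
    x = np.where(weights>0)[1]
    y = np.where(weights>0)[0]
    size = len(image_array) * len(image_array[0])
    # y holds the row index of each charged pixel; empty pixels stay 0 but get weight 0
    y = np.zeros((len(image_array), len(image_array[0])))
    for i in range(len(np.where(weights>0)[0])):
        y[np.where(weights>0)[0][i]][np.where(weights>0)[1][i]] = np.where(weights>0)[0][i]
    y = y.reshape(size)
    # column index of every pixel, in the same row-major order as y
    # (range(...) * n only works in Python 2 and only lines up for square images)
    x = np.tile(np.arange(len(image_array[0])), len(image_array))
    weights = weights.reshape((size))
    # polyfit returns coefficients lowest order first: intercept, then slope
    b, m = polyfit(x, y, 1, w=weights)
    angle = math.atan(m) * 180/math.pi
    return b, m, angle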

Let me explain the code:

The first line assigns the deposited charge to a variable called weights. The next two lines find the points where the deposited charge is > 0, which gives the coordinates for the scatter plot. Then I get the size of the entire image so that I can later flatten everything to one-dimensional arrays. I then go through the image and record the coordinates of the points where some charge was deposited (remember that the amount of charge is stored in the variable weights). Finally, I reshape the y coordinates into a one-dimensional array, build the x coordinate that corresponds to each y coordinate in the image, and reshape the weights to be one-dimensional as well.

If there's a way of doing this using the np.linalg.lstsq function, that would be ideal, since I'm also trying to get the fit to go through the vertex of the plot. I could just reposition the plot so the vertex is at zero and then use np.linalg.lstsq, but that wouldn't allow me to use the weights.
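For reference, np.linalg.lstsq has no weights argument, but weights can be folded in by scaling each row of the system by the square root of its weight, since minimizing sum(w * r**2) is ordinary least squares on the scaled rows. The sketch below combines that with the shift-to-origin idea described above. The helper name weighted_slope_through_point and the sample data are made up for illustration; this is one possible approach, not something taken from the question or the answer.

```python
import numpy as np

def weighted_slope_through_point(x, y, w, x0, y0):
    """Slope of the weighted least-squares line forced through (x0, y0)."""
    # Shift so the fixed point becomes the origin; the model is then y = m * x.
    xs = np.asarray(x, dtype=float) - x0
    ys = np.asarray(y, dtype=float) - y0
    # Scaling both sides by sqrt(w) turns the weighted problem into a plain one.
    sw = np.sqrt(np.asarray(w, dtype=float))
    A = (xs * sw)[:, np.newaxis]          # single column: no intercept term
    m = np.linalg.lstsq(A, ys * sw, rcond=None)[0][0]
    return m

# Made-up data: four points, heavier weights on the later ones.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.2])
w = np.array([1.0, 2.0, 4.0, 8.0])

m = weighted_slope_through_point(x, y, w, x0=0.0, y0=1.0)
b = 1.0 - m * 0.0                         # intercept implied by the fixed point
```

Because the line is pinned to (x0, y0), the intercept is not fitted; it follows from the slope as y0 - m * x0.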

Recommended answer

You can use sklearn.linear_model.LinearRegression. It allows you to not fit the intercept (i.e. line goes through the origin, or, with some finagling, the point of your choice). It also deals with weighted data.

e.g. (mostly stolen shamelessly from @Hiho's answer)

import numpy as np
import matplotlib.pyplot as plt
import sklearn.linear_model

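# Sample data; sklearn expects x as a 2-D column of shape (n_samples, 1)
y = np.array([1.0, 3.3, 2.2, 4.25, 4.8, 5.1, 6.3, 7.5])
x = np.arange(y.shape[0]).reshape((-1,1))
# Later points count more than earlier ones
w = np.linspace(1,5,y.shape[0])

# fit_intercept=False forces the line through the origin
model = sklearn.linear_model.LinearRegression(fit_intercept=False)
model.fit(x, y, sample_weight=w)

# Evaluate the fitted line on a dense grid for plotting
line_x = np.linspace(x.min(), x.max(), 100).reshape((-1,1))
pred = model.predict(line_x)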

plt.scatter(x, y)
plt.plot(line_x, pred)

plt.show()
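On the original question of what polyfit's w parameter does: in numpy.polynomial.polynomial.polyfit the weight is applied to the unsquared residual, so each point effectively counts with w squared in the sum being minimized, whereas sample_weight in LinearRegression multiplies the squared residual directly. A rough sketch of the consequence, assuming the goal is to weight each point in proportion to its charge: pass the square root of the charge to polyfit. The charge array below is a stand-in, not real detector data.

```python
import numpy as np
from numpy.polynomial.polynomial import polyfit

x = np.arange(8, dtype=float)
y = np.array([1.0, 3.3, 2.2, 4.25, 4.8, 5.1, 6.3, 7.5])
charge = np.linspace(1, 5, y.shape[0])    # stand-in for per-point charge

# polyfit multiplies the unsquared residual by w, so pass sqrt(charge)
# to minimize sum(charge * residual**2).
b, m = polyfit(x, y, 1, w=np.sqrt(charge))

# Reference: solve the weighted normal equations (A^T W A) c = A^T W y
A = np.column_stack([np.ones_like(x), x])
b_ref, m_ref = np.linalg.solve(A.T @ (charge[:, None] * A), A.T @ (charge * y))
```

Both routes should give the same intercept and slope up to floating-point error. Passing the raw charge as w would instead weight each point by the charge squared.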
