How to get the cumulative distribution function correctly for my data in Python?

Question

Hello everyone. I have a list of values for which I need to get the cumulative distribution function. I have saved this list in a variable named yvalues:

[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 17.0, 17.0, 1.0, 1.0, 1.0.0,22.0,23.0,24.0,25.0,26.0,27.0,28.0,29.0,30.0,31.0,32.0,33.0,34.0,35.0,36.0,37.0,38.0,39.0,40.0,41.0,42.0,43.0,44.0,45.0,46.0,47.0,48.0,49.0,50.0,51.0,52.0,53.0,54.0,55.0,56.0,57.0,58.0,59.0,60.0,61.0,62.0,63.0,64.0,65.0,66.0,67.0,68.0,69.0,70.0,71.0,72.0,73.0,74.0,75.0,76.0,77.0,78.0,79.0,80.0,81.0,82.0,83.0,84.0,85.0,86.0,87.0,88.0,89.0,90.0,91.0,92.0,93.0,94.0,95.0,96.0,97.0,98.0,99.0,100.0,101.0,102.0,103.0,104.0,105.0,106.0,107.0,108.0,109.0,110.0,111.0,112.0,113.0,114.0,115.0,116.0,117.0,118.0,119.0,120.0,121.0,122.0,123.0,124.0,125.0,126.0,127.0,128.0,129.0,130.0,131.0,132.0,133.0,134.0,135.0,136.0,137.0,138.0,139.0,140.0,141.0,142.0,143.0,144.0,145.0,146.0, 147.0, 148.0, 149.0, 150.0, 151.0, 152.0, 153.0, 154.0, 155.0, 156.0, 157.0,158.0,159.0,160.0,161.0,162.0,163.0,164.0,165.0,166.0,167.0,168.0,169.0,170.0,171.0,172.0,173.0,174.0,175.0,176.0,177.0,178.0,179.0,180.0,181.0,182.0,183.0,184.0,185.0,186.0,187.0,188.0,189.0,190.0,191.0,192.0,193.0,194.0,195.0,196.0,197.0,198.0,199.0,200.0,201.0,202.0,203.0,204.0,205.0,206.0,207.0,208.0,209.0,210.0,211.0,212.0,213.0,214.0,215.0,216.0,217.0,218.0,219.0,220.0,221.0,222.0,223.0,224.0,225.0,226.0,227.0,228.0,229.0,230.0,231.0,232.0,233.0,234.0,235.0,236.0,237.0,238.0,239.0,240.0,241.0,242.0,243.0,244.0,245.0,246.0,247.0,248.0,249.0,250.0,251.0,252.0,253.0,254.0,255.0,256.0,257.0,259.0,260.0,261.0,262.0,263.0,264.0,265.0,266.0,267.0,268.0,269.0,270.0,271.0,272.0,273.0,274.0,275.0,276.0,277.0,278.0,279.0,280.0,281.0,282.0,283.0, 284.0, 285.0, 286.0, 287.0, 288.0, 289.0, 290.0, 291.0, 292.0, 293.0, 294.0, 295.0, ., 293.0, ., 093.0, 293.0, 
293.0302.0,303.0,304.0,305.0,306.0,307.0,308.0,309.0,310.0,311.0,313.0,315.0,316.0,317.0,318.0,319.0,320.0,321.0,322.0,323.0,324.0,325.0,326.0,327.0,328.0,329.0,331.0,332.0,333.0,334.0,335.0,336.0,337.0,338.0,339.0,340.0,341.0,342.0,343.0,344.0,345.0,346.0,347.0,349.0,350.0,352.0,353.0,354.0,355.0,356.0,357.0,358.0,359.0,360.0,362.0,363.0,364.0,365.0,367.0,368.0,370.0,371.0,372.0,375.0,376.0,377.0,378.0,379.0,380.0,381.0,383.0,384.0,386.0,389.0,390.0,391.0,392.0,393.0,395.0,396.0,397.0,398.0,399.0,400.0,402.0,403.0,404.0,405.0,411.0,412.0,413.0,414.0,415.0,416.0,417.0,419.0,420.0,424.0,425.0,426.0,427.0,428.0,429.0,430.0,431.0,432.0,433.0,434.0,435.0,436.0,438.0,439.0,440.0,442.0,443.0,445.0,446.0,447.0,448.0,452.0,454.0,456.0,458.0,460.0,461.0,462.0, 463.0, 464.0, 467.0, 468.0, 469.0, 470.0, 475.0, 477.0, 479.0, 480.0, 481.0, 482.0, ., 04, 84.0, 484.0, ., 04, 87.0492.0,493.0,495.0,500.0,502.0,505.0,508.0,509.0,511.0,514.0,515.0,516.0,517.0,518.0,519.0,520.0,524.0,526.0,527.0,528.0,530.0,531.0,532.0,533.0,534.0,535.0,536.0,537.0,540.0,541.0,545.0,546.0,547.0,548.0,551.0,552.0,553.0,555.0,558.0,562.0,563.0,565.0,566.0,567.0,569.0,570.0,572.0,573.0,574.0,575.0,577.0,579.0,583.0,585.0,587.0,588.0,591.0,593.0,594.0,597.0,599.0,601.0,602.0,607.0,610.0,613.0,614.0,622.0,624.0,627.0,629.0,630.0,631.0,632.0,633.0,636.0,637.0,638.0,640.0,645.0,649.0,654.0,655.0,656.0,658.0,662.0,668.0,676.0,677.0,679.0,682.0,685.0,689.0,691.0,696.0,697.0,699.0,700.0,702.0,703.0,706.0,707.0,721.0,722.0,725.0,727.0,731.0,733.0,735.0,740.0,744.0,747.0,751.0,754.0,760.0,770.0,778.0,779.0,781.0,782.0,791.0,798.0,805.0,807.0,825.0,835.0, 840.0, 846.0, 851.0, 877.0, 882.0, 887.0, 893.0, 900.0, 919.0, 926.0, 929.0, 944.0, ., 077.0, 959.0, 
959.19.040,1017.0,1042.0,1043.0,1048.0,1055.0,1062.0,1077.0,1089.0,1111.0,1128.0,1162.0,1203.0,1204.0,1243.0,1300.0,1318.0,1325.0,1339.0,1362.0,1425.0,1483.0,1512.0,1657.0,1671.0,1709.0,1751.0,1812.0,1889.0,1955.0,2138.0,2147.0,2171.0,2205.0,2278.0,2558.0,2574.0,2781.0,2783.0,2790.0,2815.0,3019.0,3034.0,3278.0,3292.0,3415.0,3452.0,3579.0,3760.0,3857.0,3944.0,4111.0,4698.0,4994.0,5191.0,5586.0,5647.0,5874.0,6072.0,6440.0,6491.0,6772.0,7973.0,8341.0,13170.0,74473.0,76745.0,78061.0,78955.0,79225.0,79500.0,80509.0,80968.0,81203.0,81462.0,81506.0,81761.0,81989.0,82215.0,82426.0,83003.0,83011.0,83108.0,83129.0,83425.0,83457.0,83553.0,83609.0,83705.0,83844.0,83973.0,83996.0,84075.0,84283.0,84336.0,84524.0,84676.0,84787.0,84830.0,84943.0,84944.0,84960.0,85071.0,85088.0,85170.0,85194.0,85235.0,85353.0,85400.0,85557.0,85589.0,85599.0,85600.0,85716.0,85820.0,85824.0,85830.0,85846.0,85934.0,86022.0,86067.0,86177.0,86186.0,86195.0,86228.0,86279.0,86282.0,86289.0,86327.0,86336.0,86340.0,86359.0,86366.0,86370.0,86371.0,86376.0,86377.0,86385.0,86390.0,86391.0,86396.0,86397.0,86398.0,86399.0, 471967.0, 545161.0, 583973.0]

I have tried this:

import numpy as np

a = yvalues
num_bins = len(a)
# density=True is the current spelling of the older normed=True
counts, bin_edges = np.histogram(a, bins=num_bins, density=True)
cdf = np.cumsum(counts)

My cdf output is the following list, which is not correct because the last value of a cdf should be 1. Please help; I don't know what I am doing wrong. Thanks in advance.

>>> cdf
array([0.0009658, 0.00104114, 0.00106169, ..., 0.00125177, 0.00125177, 0.00125348])

Recommended answer

When normed=True (density=True in newer NumPy releases, where the normed argument has been removed), the counts can be interpreted as pdf values:

counts, bin_edges = np.histogram(a, bins=num_bins, density=True)

To get the cdf, multiply the counts by the bin width dx before taking the cumulative sum:

dx = bin_edges[1]-bin_edges[0]
cdf = np.cumsum(counts*dx)

The bin edges are uniformly spaced, so dx is constant. counts*dx gives the probability mass in each bin, and np.cumsum of those probability masses gives the cumulative distribution function.
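The same argument can be written out in symbols. This is only a sketch: N (the number of samples), n_i (the raw count in bin i), p_i (the normalized count returned by the histogram) and K (the number of bins) are notation introduced here, not taken from the answer.

```latex
p_i = \frac{n_i}{N\,\Delta x}, \qquad
p_i\,\Delta x = \frac{n_i}{N}, \qquad
\mathrm{cdf}_k = \sum_{i=1}^{k} p_i\,\Delta x = \frac{1}{N}\sum_{i=1}^{k} n_i, \qquad
\mathrm{cdf}_K = \frac{1}{N}\sum_{i=1}^{K} n_i = 1 .
```

Summing the p_i alone, without the factor \Delta x, ends at 1/\Delta x instead of 1, which is why the cumulative sum of the raw normalized counts does not reach 1.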

assert np.allclose(cdf[-1], 1)
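Putting the pieces together, a minimal self-contained sketch might look like the following. The function name histogram_cdf, the choice of 50 bins, and the exponential sample are all assumptions made for illustration; the sample is stand-in data, not the yvalues list from the question. The sketch uses np.diff(bin_edges) for the bin widths, which reduces to the constant dx for uniform bins, and density=True in place of the older normed=True.

```python
import numpy as np


def histogram_cdf(values, num_bins):
    """Return (bin_edges, cdf); cdf[i] estimates P(X <= bin_edges[i + 1])."""
    # density=True makes counts a pdf estimate, so counts * widths sums to 1.
    counts, bin_edges = np.histogram(values, bins=num_bins, density=True)
    widths = np.diff(bin_edges)        # constant dx when the bins are uniform
    cdf = np.cumsum(counts * widths)   # accumulate the probability mass per bin
    return bin_edges, cdf


# Hypothetical stand-in data, not the yvalues list from the question.
rng = np.random.default_rng(0)
sample = rng.exponential(scale=100.0, size=1000)

bin_edges, cdf = histogram_cdf(sample, num_bins=50)
print(cdf[-1])  # final cdf value; 1 up to floating-point rounding
```

Because each entry of cdf corresponds to the right edge of its bin, plotting bin_edges[1:] against cdf is one reasonable way to draw the curve.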
