How to dropna where all rows are NaN except the date column?

Problem description

I am trying to remove NaN values from my csv file but I only want to remove the row where all columns are empty. A picture of the rows I want to remove is attached below.

Link to the file: https://filebin.net/ou93iqiinss02l0g

Essentially, if columns B, C, D, E, F, G and H are all NaN, I want to remove the whole row.

I tried the code below, but it removes everything:

import pandas as pd

df = pd.read_csv("testing.csv")
df = df.dropna(thresh = 7)

The end result should look like this

Data

,Open,High,Low,Close,Adj Close,Volume,Singapore
2015-10-01,2795.399902,3104.719971,2765.439941,2998.350098,2998.350098,0.0,
2015-11-01,2976.719971,3043.850098,2843.949951,2855.939941,2855.939941,0.0,
2015-12-01,2862.790039,2911.439941,2793.389893,2882.72998,2882.72998,0.0,
2016-01-01,2889.22998,2890.209961,2529.01001,2629.110107,2629.110107,0.0,
2016-02-01,2637.050049,2684.790039,2528.439941,2666.51001,2666.51001,0.0,
2016-03-01,2666.709961,2906.800049,2654.97998,2840.899902,2840.899902,0.0,
2016-04-01,2820.659912,2964.100098,2783.419922,2838.52002,2838.52002,158708700.0,
2016-05-01,2842.860107,2848.899902,2713.469971,2791.060059,2791.060059,0.0,
2016-06-01,2787.98999,2881.919922,2703.47998,2840.929932,2840.929932,0.0,
2016-07-01,2848.449951,2958.899902,2830.0,2868.689941,2868.689941,0.0,
2016-08-01,2875.590088,2898.27002,2810.8798829999996,2820.590088,2820.590088,0.0,
2016-09-01,2821.929932,2911.840088,2791.3798829999996,2869.469971,2869.469971,0.0,
2016-10-01,2879.850098,2901.72998,2783.330078,2813.8701170000004,2813.8701170000004,0.0,
2016-11-01,2814.080078,2915.419922,2760.969971,2905.169922,2905.169922,0.0,
2016-12-01,2913.649902,2980.77002,2857.909912,2880.76001,2880.76001,0.0,
2017-01-01,2887.0,3065.1298829999996,2869.659912,3046.800049,3046.800049,0.0,
2017-02-01,3045.939941,3138.969971,3030.649902,3096.610107,3096.610107,4018227800.0,
2017-03-01,3106.300049,3188.02002,3104.330078,3175.110107,3175.110107,5462555700.0,
2017-04-01,3180.27002,3189.810059,3113.899902,3175.439941,3175.439941,4292226700.0,
2017-05-01,3183.429932,3275.389893,3183.409912,3210.820068,3210.820068,5080433500.0,
2017-06-01,3214.1201170000004,3270.919922,3196.48999,3226.47998,3226.47998,4414015100.0,
2017-07-01,3228.909912,3354.709961,3196.139893,3329.52002,3329.52002,5085548600.0,
2017-08-01,3321.5,3349.090088,3244.22998,3277.26001,3277.26001,4856835500.0,
2017-09-01,3274.389893,3275.139893,3193.409912,3219.909912,3219.909912,3840282400.0,
2017-10-01,3233.949951,3392.149902,3230.810059,3374.080078,3374.080078,4261116400.0,
2017-11-01,3377.1899409999996,3449.320068,3341.300049,3433.540039,3433.540039,4789747800.0,
2017-12-01,3441.850098,3469.360107,3370.219971,3402.919922,3402.919922,3386126700.0,
2018-01-01,3406.4799799999996,3611.6899409999996,3403.8701170000004,3533.98999,3533.98999,4727173600.0,
2018-02-01,3536.929932,3574.5900880000004,3340.550049,3517.9399409999996,3517.9399409999996,6143735500.0,
2018-03-01,3493.4399409999996,3555.9799799999996,3382.780029,3427.969971,3427.969971,4963081900.0,
2018-04-01,3439.040039,3628.429932,3338.959961,3613.929932,3613.929932,4599803900.0,
2018-05-01,3624.1999509999996,3641.649902,3428.179932,3428.179932,3428.179932,5918362800.0,
2018-06-01,3423.5,3492.3400880000004,3237.77002,3268.699951,3268.699951,5500961400.0,
2018-07-01,3277.429932,3341.419922,3176.26001,3319.850098,3319.850098,5029346600.0,
2018-08-01,3331.050049,3347.97998,3187.830078,3213.47998,3213.47998,5005791600.0,
2018-09-01,3209.969971,3265.01001,3102.72998,3257.050049,3257.050049,4158150600.0,
2018-10-01,3262.429932,3272.8798829999996,2955.679932,3018.800049,3018.800049,5516696000.0,
2018-11-01,3045.679932,3132.419922,3007.310059,3117.610107,3117.610107,4457632700.0,
2018-12-01,3154.219971,3192.8798829999996,3000.449951,3068.76001,3068.76001,3627597800.0,
2019-01-01,3072.98999,3250.27002,2993.419922,3190.169922,3190.169922,4467841200.0,
2019-02-01,3194.219971,3286.080078,3174.0,3212.689941,3212.689941,3786000800.0,
2019-03-01,3210.840088,3251.719971,3156.790039,3212.8798829999996,3212.8798829999996,4128594600.0,
2019-04-01,3229.110107,3415.179932,3227.6201170000004,3400.1999509999996,3400.1999509999996,4447727600.0,
2019-05-01,3389.5200200000004,3397.179932,3110.51001,3117.76001,3117.76001,4319537800.0,
2019-06-01,3111.51001,3336.080078,3104.030029,3321.610107,3321.610107,4160448600.0,
2019-07-01,3339.580078,3386.649902,3299.889893,3300.75,3300.75,4489792100.0,
2019-08-01,3282.790039,3311.26001,3040.159912,3106.52002,3106.52002,5146051500.0,
2019-09-01,3092.25,3216.8701170000004,3074.040039,3119.98999,3119.98999,4116898900.0,
2019-10-01,3130.110107,3235.23999,3068.830078,3229.8798829999996,3229.8798829999996,4402690200.0,
2019-11-01,3227.600098,3285.719971,3182.050049,3193.919922,3193.919922,7055882400.0,
2019-12-01,3198.27002,3239.23999,3144.070068,3222.830078,3222.830078,4536740600.0,
2020-01-01,3230.47998,3283.889893,3144.100098,3153.72998,3153.72998,4951167700.0,
2020-02-01,3131.02002,3233.860107,3008.459961,3011.080078,3011.080078,5320489700.0,
2020-02-21,,,,,,,24.0
2020-02-25,,,,,,,
2020-02-28,,,,,,,22.0
2020-03-01,2988.350098,3047.790039,2208.419922,2481.22998,2481.22998,7767702900.0,
2020-03-02,,,,,,,
2020-03-03,,,,,,,
2020-03-06,,,,,,,23.0
2020-03-10,,,,,,,
2020-03-13,,,,,,,21.0
2020-03-17,,,,,,,
2020-03-20,,,,,,,24.0
2020-03-23,,,,,,,
2020-03-24,,,,,,,
2020-03-27,,,,,,,27.0
2020-03-30,,,,,,,
2020-03-31,,,,,,,
2020-04-01,2468.169922,2671.580078,2380.840088,2624.22998,2624.22998,7238328000.0,
2020-04-03,,,,,,,37.0
2020-04-06,,,,,,,
2020-04-07,,,,,,,
2020-04-10,,,,,,,73.0
2020-04-13,,,,,,,
2020-04-14,,,,,,,
2020-04-17,,,,,,,85.0
2020-04-20,,,,,,,
2020-04-21,,,,,,,
2020-04-24,,,,,,,90.0
2020-04-27,,,,,,,
2020-04-28,,,,,,,
2020-05-01,2555.669922,2611.73999,2489.939941,2510.75,2510.75,7367276100.0,90.0
2020-05-05,,,,,,,
2020-05-15,,,,,,,
2020-05-21,,,,,,,
2020-05-22,,,,,,,92.0
2020-05-25,,,,,,,
2020-05-26,,,,,,,
2020-05-30,,,,,,,
2020-06-01,2519.419922,2839.389893,2516.459961,2589.909912,2589.909912,8396435700.0,
2020-06-05,,,,,,,89.0
2020-06-08,,,,,,,
2020-06-15,,,,,,,
2020-06-16,,,,,,,
2020-06-19,,,,,,,92.0
2020-06-22,,,,,,,
2020-06-25,,,,,,,
2020-07-01,2604.080078,2707.669922,2511.02002,2529.820068,2529.820068,4876221500.0,
2020-07-03,,,,,,,
2020-07-06,,,,,,,
2020-07-07,,,,,,,90.0
2020-07-12,,,,,,,
2020-07-14,,,,,,,
2020-07-20,,,,,,,92.0
2020-07-26,,,,,,,
2020-07-27,,,,,,,
2020-07-31,,,,,,,
2020-08-01,2522.530029,2602.330078,2478.389893,2532.51001,2532.51001,6347053700.0,
2020-08-03,,,,,,,88.0
2020-08-07,,,,,,,
2020-08-10,,,,,,,
2020-08-12,,,,,,,
2020-08-14,,,,,,,90.0
2020-08-17,,,,,,,
2020-08-25,,,,,,,
2020-08-28,,,,,,,90.0
2020-08-31,,,,,,,
2020-09-01,2521.810059,2546.8701170000004,2476.820068,2490.090088,2490.090088,2000718800.0,
2020-09-11,2481.080078,2492.419922,2476.820068,2490.090088,2490.090088,0.0,

Solution

  • Use pandas.read_csv, with parse_dates and index_col set to the unnamed date column at index 0.
  • Call .dropna with how='all', which drops any row that is entirely NaN. The index isn't considered, which is why the date column is set as the index.
  • The dates don't technically have to be parsed to datetime, but this is financial data, so a proper datetime format is preferable for time series analysis and makes the data plot correctly. The date column has to be the index for .dropna to work this easily.

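import pandas as pd

# read the file, parsing the unnamed first column as dates and using it as the index
df = pd.read_csv('testing.csv', parse_dates=[0], index_col=0)

# drop rows where every column is NaN (the index is not considered)
df = df.dropna(how='all')

# save the cleaned data, keeping the date index
df.to_csv('test_updated.csv', index=True)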

# display(df)
                  Open        High         Low       Close   Adj Close       Volume  Singapore
2015-10-01  2795.39990  3104.71997  2765.43994  2998.35010  2998.35010  0.00000e+00        NaN
2015-11-01  2976.71997  3043.85010  2843.94995  2855.93994  2855.93994  0.00000e+00        NaN
2015-12-01  2862.79004  2911.43994  2793.38989  2882.72998  2882.72998  0.00000e+00        NaN
2016-01-01  2889.22998  2890.20996  2529.01001  2629.11011  2629.11011  0.00000e+00        NaN
2016-02-01  2637.05005  2684.79004  2528.43994  2666.51001  2666.51001  0.00000e+00        NaN
2016-03-01  2666.70996  2906.80005  2654.97998  2840.89990  2840.89990  0.00000e+00        NaN
2016-04-01  2820.65991  2964.10010  2783.41992  2838.52002  2838.52002  1.58709e+08        NaN
2016-05-01  2842.86011  2848.89990  2713.46997  2791.06006  2791.06006  0.00000e+00        NaN
2016-06-01  2787.98999  2881.91992  2703.47998  2840.92993  2840.92993  0.00000e+00        NaN
2016-07-01  2848.44995  2958.89990  2830.00000  2868.68994  2868.68994  0.00000e+00        NaN
2016-08-01  2875.59009  2898.27002  2810.87988  2820.59009  2820.59009  0.00000e+00        NaN
2016-09-01  2821.92993  2911.84009  2791.37988  2869.46997  2869.46997  0.00000e+00        NaN
2016-10-01  2879.85010  2901.72998  2783.33008  2813.87012  2813.87012  0.00000e+00        NaN
2016-11-01  2814.08008  2915.41992  2760.96997  2905.16992  2905.16992  0.00000e+00        NaN
2016-12-01  2913.64990  2980.77002  2857.90991  2880.76001  2880.76001  0.00000e+00        NaN
2017-01-01  2887.00000  3065.12988  2869.65991  3046.80005  3046.80005  0.00000e+00        NaN
2017-02-01  3045.93994  3138.96997  3030.64990  3096.61011  3096.61011  4.01823e+09        NaN
2017-03-01  3106.30005  3188.02002  3104.33008  3175.11011  3175.11011  5.46256e+09        NaN
2017-04-01  3180.27002  3189.81006  3113.89990  3175.43994  3175.43994  4.29223e+09        NaN
2017-05-01  3183.42993  3275.38989  3183.40991  3210.82007  3210.82007  5.08043e+09        NaN
2017-06-01  3214.12012  3270.91992  3196.48999  3226.47998  3226.47998  4.41402e+09        NaN
2017-07-01  3228.90991  3354.70996  3196.13989  3329.52002  3329.52002  5.08555e+09        NaN
2017-08-01  3321.50000  3349.09009  3244.22998  3277.26001  3277.26001  4.85684e+09        NaN
2017-09-01  3274.38989  3275.13989  3193.40991  3219.90991  3219.90991  3.84028e+09        NaN
2017-10-01  3233.94995  3392.14990  3230.81006  3374.08008  3374.08008  4.26112e+09        NaN
2017-11-01  3377.18994  3449.32007  3341.30005  3433.54004  3433.54004  4.78975e+09        NaN
2017-12-01  3441.85010  3469.36011  3370.21997  3402.91992  3402.91992  3.38613e+09        NaN
2018-01-01  3406.47998  3611.68994  3403.87012  3533.98999  3533.98999  4.72717e+09        NaN
2018-02-01  3536.92993  3574.59009  3340.55005  3517.93994  3517.93994  6.14374e+09        NaN
2018-03-01  3493.43994  3555.97998  3382.78003  3427.96997  3427.96997  4.96308e+09        NaN
2018-04-01  3439.04004  3628.42993  3338.95996  3613.92993  3613.92993  4.59980e+09        NaN
2018-05-01  3624.19995  3641.64990  3428.17993  3428.17993  3428.17993  5.91836e+09        NaN
2018-06-01  3423.50000  3492.34009  3237.77002  3268.69995  3268.69995  5.50096e+09        NaN
2018-07-01  3277.42993  3341.41992  3176.26001  3319.85010  3319.85010  5.02935e+09        NaN
2018-08-01  3331.05005  3347.97998  3187.83008  3213.47998  3213.47998  5.00579e+09        NaN
2018-09-01  3209.96997  3265.01001  3102.72998  3257.05005  3257.05005  4.15815e+09        NaN
2018-10-01  3262.42993  3272.87988  2955.67993  3018.80005  3018.80005  5.51670e+09        NaN
2018-11-01  3045.67993  3132.41992  3007.31006  3117.61011  3117.61011  4.45763e+09        NaN
2018-12-01  3154.21997  3192.87988  3000.44995  3068.76001  3068.76001  3.62760e+09        NaN
2019-01-01  3072.98999  3250.27002  2993.41992  3190.16992  3190.16992  4.46784e+09        NaN
2019-02-01  3194.21997  3286.08008  3174.00000  3212.68994  3212.68994  3.78600e+09        NaN
2019-03-01  3210.84009  3251.71997  3156.79004  3212.87988  3212.87988  4.12859e+09        NaN
2019-04-01  3229.11011  3415.17993  3227.62012  3400.19995  3400.19995  4.44773e+09        NaN
2019-05-01  3389.52002  3397.17993  3110.51001  3117.76001  3117.76001  4.31954e+09        NaN
2019-06-01  3111.51001  3336.08008  3104.03003  3321.61011  3321.61011  4.16045e+09        NaN
2019-07-01  3339.58008  3386.64990  3299.88989  3300.75000  3300.75000  4.48979e+09        NaN
2019-08-01  3282.79004  3311.26001  3040.15991  3106.52002  3106.52002  5.14605e+09        NaN
2019-09-01  3092.25000  3216.87012  3074.04004  3119.98999  3119.98999  4.11690e+09        NaN
2019-10-01  3130.11011  3235.23999  3068.83008  3229.87988  3229.87988  4.40269e+09        NaN
2019-11-01  3227.60010  3285.71997  3182.05005  3193.91992  3193.91992  7.05588e+09        NaN
2019-12-01  3198.27002  3239.23999  3144.07007  3222.83008  3222.83008  4.53674e+09        NaN
2020-01-01  3230.47998  3283.88989  3144.10010  3153.72998  3153.72998  4.95117e+09        NaN
2020-02-01  3131.02002  3233.86011  3008.45996  3011.08008  3011.08008  5.32049e+09        NaN
2020-02-21         NaN         NaN         NaN         NaN         NaN          NaN       24.0
2020-02-28         NaN         NaN         NaN         NaN         NaN          NaN       22.0
2020-03-01  2988.35010  3047.79004  2208.41992  2481.22998  2481.22998  7.76770e+09        NaN
2020-03-06         NaN         NaN         NaN         NaN         NaN          NaN       23.0
2020-03-13         NaN         NaN         NaN         NaN         NaN          NaN       21.0
2020-03-20         NaN         NaN         NaN         NaN         NaN          NaN       24.0
2020-03-27         NaN         NaN         NaN         NaN         NaN          NaN       27.0
2020-04-01  2468.16992  2671.58008  2380.84009  2624.22998  2624.22998  7.23833e+09        NaN
2020-04-03         NaN         NaN         NaN         NaN         NaN          NaN       37.0
2020-04-10         NaN         NaN         NaN         NaN         NaN          NaN       73.0
2020-04-17         NaN         NaN         NaN         NaN         NaN          NaN       85.0
2020-04-24         NaN         NaN         NaN         NaN         NaN          NaN       90.0
2020-05-01  2555.66992  2611.73999  2489.93994  2510.75000  2510.75000  7.36728e+09       90.0
2020-05-22         NaN         NaN         NaN         NaN         NaN          NaN       92.0
2020-06-01  2519.41992  2839.38989  2516.45996  2589.90991  2589.90991  8.39644e+09        NaN
2020-06-05         NaN         NaN         NaN         NaN         NaN          NaN       89.0
2020-06-19         NaN         NaN         NaN         NaN         NaN          NaN       92.0
2020-07-01  2604.08008  2707.66992  2511.02002  2529.82007  2529.82007  4.87622e+09        NaN
2020-07-07         NaN         NaN         NaN         NaN         NaN          NaN       90.0
2020-07-20         NaN         NaN         NaN         NaN         NaN          NaN       92.0
2020-08-01  2522.53003  2602.33008  2478.38989  2532.51001  2532.51001  6.34705e+09        NaN
2020-08-03         NaN         NaN         NaN         NaN         NaN          NaN       88.0
2020-08-14         NaN         NaN         NaN         NaN         NaN          NaN       90.0
2020-08-28         NaN         NaN         NaN         NaN         NaN          NaN       90.0
2020-09-01  2521.81006  2546.87012  2476.82007  2490.09009  2490.09009  2.00072e+09        NaN
2020-09-11  2481.08008  2492.41992  2476.82007  2490.09009  2490.09009  0.00000e+00        NaN
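
If you would rather keep the date as an ordinary column, dropna also accepts a subset argument, so the all-NaN check can be limited to the value columns. The following is only a minimal sketch on a small inline sample rather than the full file; the 'Date' column name and the shortened column list are assumptions made for illustration. It also shows what thresh counts: thresh=N keeps only rows with at least N non-NaN cells, which is a different test from "every value column is empty".

```python
import io

import pandas as pd

# Small inline sample (assumption: a named 'Date' column and only a few
# of the columns from the question, to keep the example short).
csv_text = """Date,Open,Close,Volume,Singapore
2020-02-01,3131.02,3011.08,5320489700.0,
2020-02-21,,,,24.0
2020-02-25,,,,
2020-03-01,2988.35,2481.23,7767702900.0,
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Columns to test for NaN: everything except the date column.
value_cols = [col for col in df.columns if col != "Date"]

# Drop a row only when all of the value columns are NaN.
cleaned = df.dropna(subset=value_cols, how="all")

# thresh counts non-NaN cells across the whole row, date included,
# so thresh=4 keeps only rows with at least 4 populated cells.
by_thresh = df.dropna(thresh=4)

print(cleaned)
print(by_thresh)
```

In this sample the subset version removes only the 2020-02-25 row, while the thresh version also removes the row that carries nothing but a 'Singapore' value.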

Plotting

  • This plot uses pandas.DataFrame.plot, which uses matplotlib as the default plot engine.
    • Note that lines aren't drawn across NaN values, so dropna is applied to the price columns before plotting.
  • Don't plot Volume with the price values, because its scale (y-value) is so much larger.
  • 'Singapore' is plotted separately as a scatter plot: because of its lower values and few data points, it would look odd as a line plot.

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(nrows=2, figsize=(9, 10))

df[['Open', 'High', 'Low', 'Close', 'Adj Close']].dropna().plot(ax=ax1)
ax2.scatter(df.index, 'Singapore', data=df, label='Singapore')
ax2.legend()
plt.show()
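
To see why the extra .dropna() is applied before plotting, it can help to look at the price columns on their own. This is a rough sketch using a toy frame with made-up values (an assumption, not the data from the question): rows that only carry a 'Singapore' value leave NaN gaps in the price columns, and dropping them leaves consecutive points that a line plot can connect.

```python
import pandas as pd

nan = float("nan")

# Toy frame (assumption: made-up values mimicking the question's layout).
idx = pd.to_datetime(["2020-02-01", "2020-02-21", "2020-03-01", "2020-03-06"])
df = pd.DataFrame(
    {
        "Open": [3131.0, nan, 2988.0, nan],
        "Close": [3011.0, nan, 2481.0, nan],
        "Singapore": [nan, 24.0, nan, 23.0],
    },
    index=idx,
)

# Rows holding only a 'Singapore' value show up as NaN in the price columns.
prices = df[["Open", "Close"]]
print(prices.isna().sum())

# Dropping those rows leaves only dates that have price data.
prices_for_plot = prices.dropna()
print(prices_for_plot)
```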
