Adding multiple columns to a dataframe using df.apply and a lambda function


Question

I am trying to add multiple columns to an existing dataframe with df.apply and a lambda function. I can add the columns one at a time, but I cannot add all of them in a single assignment. My code:



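import numpy as np
import requests

# player_id_api and player_data_api are API URL prefixes defined elsewhere (not shown in the question).
def get_player_stats(player_name):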
    print(player_name)
    resp = requests.get(player_id_api + player_name)
    if resp.status_code != 200:
        # This means something went wrong.
        print('Error {}'.format(resp.status_code))

    result = resp.json()
    player_id = result['data'][0]['pid']

    resp_data = requests.get(player_data_api + str(player_id))
    if resp_data.status_code != 200:
        # This means something went wrong.
        print('Error {}'.format(resp_data.status_code))

    result_data = resp_data.json()

    check1 = len(result_data.get('data',None).get('batting',None))
#    print(check1)
    check2 = len(result_data.get('data',{}).get('batting',{}).get('ODIs',{}))
#    check2 = result_data.get(['data']['batting']['ODIs'],None)
#    print(check2)
    if check1 > 0 and check2 > 0:
        total_6s = result_data['data']['batting']['ODIs']['6s']
        total_4s = result_data['data']['batting']['ODIs']['4s']
        average = result_data['data']['batting']['ODIs']['Ave']
        total_innings = result_data['data']['batting']['ODIs']['Inns']
        total_catches = result_data['data']['batting']['ODIs']['Ct']
        total_stumps = result_data['data']['batting']['ODIs']['St']
        total_wickets = result_data['data']['bowling']['ODIs']['Wkts']
        print(average,total_innings,total_4s,total_6s,total_catches,total_stumps,total_wickets)    
        return np.array([average,total_innings,total_4s,total_6s,total_catches,total_stumps,total_wickets])
    else:
        print('No data for player')
        return '','','','','','',''


cols = ['Avg','tot_inns','tot_4s','tot_6s','tot_cts','tot_sts','tot_wkts']
for col in cols:
    players_available[col] = ''

players_available[cols] = players_available.apply(lambda x: get_player_stats(x['playerName']) , axis =1) 

I have also tried adding the columns to the dataframe explicitly, but I still get an error:

ValueError: Must have equal len keys and value when setting with an iterable

Can someone help me?

Recommended answer

It's tricky, because the behavior of the apply method in pandas has changed across versions.

In my version (0.25.3), and in other recent versions, it works if the function returns a pd.Series object.

In your code, you could try changing both return values in the function:

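# In the branch where stats are available:
return pd.Series([average,total_innings,total_4s,total_6s,
                  total_catches,total_stumps,total_wickets])

# In the else branch, when there is no data for the player:
return pd.Series(['','','','','','',''])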
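
As a rough illustration of the idea, here is a minimal, self-contained sketch. It does not call the real API: `FAKE_STATS` and `get_player_stats_stub` are hypothetical stand-ins for the lookup in `get_player_stats`, and only three of the seven columns are used to keep it short. Giving each returned Series an explicit `index=cols` is an extra assumption on my part, not something the answer requires; it just makes the column names of the intermediate result match the target columns.

```python
import pandas as pd

cols = ['Avg', 'tot_inns', 'tot_4s']

# Hypothetical stand-in data replacing the API responses in the question.
FAKE_STATS = {'Alice': [41.5, 120, 300]}

def get_player_stats_stub(player_name):
    """Return one pd.Series per row so that apply() builds a DataFrame."""
    stats = FAKE_STATS.get(player_name)
    if stats is None:
        # Same fallback as the question: empty strings when there is no data.
        return pd.Series([''] * len(cols), index=cols)
    return pd.Series(stats, index=cols)

players_available = pd.DataFrame({'playerName': ['Alice', 'Bob']})

# Each row yields a Series, so apply(axis=1) returns a DataFrame
# with one column per Series entry.
stats = players_available.apply(
    lambda x: get_player_stats_stub(x['playerName']), axis=1
)

# The DataFrame has as many columns as there are names in cols,
# so the multi-column assignment lines up.
players_available[cols] = stats
print(players_available)
```

Because both branches return a Series of the same length, every row contributes the same number of values, which is what the multi-column assignment needs.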
