Pandas: parse a user agent column into multiple columns

Question

I have a dataframe of http request logs. The only relevant column is the userAgent column which I'm trying to parse. I'm using ua_parser. This turns each userAgent into a nested dictionary like so:

>>> from ua_parser import user_agent_parser
>>> user_agent_parser.Parse('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36')
{
     'device': {'brand': None, 
                'model': None, 
                'family': 'Other'}, 
     'os': {'major': '10', 
            'patch_minor': None, 
            'minor': '10', 
            'family': 'Mac OS X', 
            'patch': '5'}, 
     'user_agent': {'major': '55', 
                    'minor': '0', 
                    'family': 'Chrome', 
                    'patch': '2883'}, 
     'string': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
}

I'm trying to create 4 additional columns on my log dataframe using the results of user_agent_parser. I'd like device_brand, device_model, os_family, and user_agent_family columns.

Unfortunately, when I store this as a numpy array, I can't access the dictionary indices:

>>> parsed_ua = logs['userAgent'].apply(user_agent_parser.Parse)
>>> logs['device_brand'] = parsed_ua['device']['brand']
KeyError: 'device'
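
A note on why this fails: the result of `.apply` is a pandas Series whose elements are dicts, so `parsed_ua['device']` looks for a row labelled `'device'` rather than a key inside each dict. The sketch below shows per-element access instead. The two dicts are hand-written stand-ins shaped like the `user_agent_parser.Parse` output above (the second one is invented for illustration), so the snippet runs without ua_parser installed.

```python
import pandas as pd

# Stand-in for logs['userAgent'].apply(user_agent_parser.Parse):
# a Series whose elements are nested dicts (trimmed to the keys used here).
parsed_ua = pd.Series([
    {"device": {"brand": None, "model": None, "family": "Other"},
     "os": {"family": "Mac OS X"},
     "user_agent": {"family": "Chrome"}},
    {"device": {"brand": "Apple", "model": "iPhone", "family": "iPhone"},
     "os": {"family": "iOS"},
     "user_agent": {"family": "Mobile Safari"}},
])

# parsed_ua["device"] is a label lookup on the Series index (0, 1),
# which is why it raises KeyError: 'device'.
try:
    parsed_ua["device"]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Index into each dict individually instead.
device_brand = parsed_ua.apply(lambda d: d["device"]["brand"])
os_family = parsed_ua.apply(lambda d: d["os"]["family"])
```

Each resulting Series shares the index of `parsed_ua`, so it can be assigned directly to a new column of the original dataframe.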

I tried converting this to a dataframe so I could merge parsed_ua with logs. Unfortunately, this writes each dictionary to a single column:

>>> pd.DataFrame(parsed_ua)
userAgent
0   {u'device': {u'brand': None, u'model': None, u...
1   {u'device': {u'brand': None, u'model': None, u...
2   {u'device': {u'brand': None, u'model': None, u...
3   {u'device': {u'brand': None, u'model': None, u...
4   {u'device': {u'brand': None, u'model': None, u...
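
For comparison, `pd.DataFrame(parsed_ua)` only wraps the existing Series as one column, which matches the output above. One way to get the nested keys as separate columns is `pd.json_normalize`, sketched here on hand-written stand-in dicts (not real parser output). It assumes pandas 1.0 or later, where `json_normalize` is available at the top level.

```python
import pandas as pd

# Stand-in for the Series of parsed user-agent dicts.
parsed_ua = pd.Series([
    {"device": {"brand": None, "model": None, "family": "Other"},
     "os": {"family": "Mac OS X", "major": "10"},
     "user_agent": {"family": "Chrome", "major": "55"}},
    {"device": {"brand": "Apple", "model": "iPhone", "family": "iPhone"},
     "os": {"family": "iOS", "major": "10"},
     "user_agent": {"family": "Mobile Safari", "major": "10"}},
])

# Flatten nested dicts; sep="_" yields names like "device_brand"
# instead of the default "device.brand".
flat = pd.json_normalize(parsed_ua.tolist(), sep="_")

# json_normalize returns a fresh 0..n-1 index, so restore the original
# one to keep rows aligned with the source Series.
flat.index = parsed_ua.index

wanted = flat[["device_brand", "device_model", "os_family", "user_agent_family"]]
```

This flattens every key the parser returns, so it is convenient when more than a handful of fields are needed.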

How can I parse the userAgent column and write the results to multiple columns?

Recommended answer

You can use ...
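
The answer text is cut off in the source, so the following is only one possible approach and not necessarily the one the answer describes. It is a minimal sketch that picks out the four requested fields per row and joins them back onto the log dataframe. The helper names `extract_ua_fields`, `add_ua_columns`, and `fake_parse` are hypothetical; `fake_parse` is a stand-in that mimics the shape of `user_agent_parser.Parse` output so the example runs without ua_parser installed, and the second user agent string is an invented sample.

```python
import pandas as pd


def extract_ua_fields(parsed):
    """Pick the four requested fields out of one parsed user-agent dict."""
    return {
        "device_brand": parsed["device"]["brand"],
        "device_model": parsed["device"]["model"],
        "os_family": parsed["os"]["family"],
        "user_agent_family": parsed["user_agent"]["family"],
    }


def add_ua_columns(logs, parse, column="userAgent"):
    """Return a new dataframe: logs plus the four parsed columns."""
    rows = [extract_ua_fields(parse(ua)) for ua in logs[column]]
    # Reuse the index of logs so the join lines up row for row.
    fields = pd.DataFrame(rows, index=logs.index)
    return logs.join(fields)


def fake_parse(ua_string):
    """Hypothetical stand-in for user_agent_parser.Parse (same dict shape)."""
    is_iphone = "iPhone" in ua_string
    return {
        "device": {"brand": "Apple" if is_iphone else None,
                   "model": "iPhone" if is_iphone else None,
                   "family": "iPhone" if is_iphone else "Other"},
        "os": {"family": "iOS" if is_iphone else "Mac OS X"},
        "user_agent": {"family": "Mobile Safari" if is_iphone else "Chrome"},
        "string": ua_string,
    }


logs = pd.DataFrame({"userAgent": [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 10_2 like Mac OS X)",  # invented sample
]})

# With ua_parser installed, pass user_agent_parser.Parse instead of fake_parse.
logs = add_ua_columns(logs, fake_parse)
```

Taking the parser as an argument keeps the sketch testable; on large logs it may also be worth parsing each distinct user agent string only once, since the same strings tend to repeat.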