Pandas parse user agent column into multiple columns
Problem description
I have a dataframe of HTTP request logs. The only relevant column is the userAgent column, which I'm trying to parse with ua_parser. It turns each userAgent string into a nested dictionary like so:
>>> from ua_parser import user_agent_parser
>>> user_agent_parser.Parse('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36')
{
    'device': {'brand': None,
               'model': None,
               'family': 'Other'},
    'os': {'major': '10',
           'patch_minor': None,
           'minor': '10',
           'family': 'Mac OS X',
           'patch': '5'},
    'user_agent': {'major': '55',
                   'minor': '0',
                   'family': 'Chrome',
                   'patch': '2883'},
    'string': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
}
I'm trying to create four additional columns on my logs dataframe from the results of user_agent_parser: device_brand, device_model, os_family, and user_agent_family.
Unfortunately, the result is a pandas Series of dictionaries, so I can't access the dictionary keys by indexing it directly:
>>> parsed_ua = logs['userAgent'].apply(user_agent_parser.Parse)
>>> logs['device_brand'] = parsed_ua['device']['brand']
KeyError: 'device'
I tried converting this to a dataframe so I could merge parsed_ua with logs. Unfortunately, this writes each dictionary into a single column:
>>> pd.DataFrame(parsed_ua)
userAgent
0 {u'device': {u'brand': None, u'model': None, u...
1 {u'device': {u'brand': None, u'model': None, u...
2 {u'device': {u'brand': None, u'model': None, u...
3 {u'device': {u'brand': None, u'model': None, u...
4 {u'device': {u'brand': None, u'model': None, u...
How can I parse the userAgent column and write the results to multiple columns?
Recommended answer
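One possible approach, sketched here under stated assumptions rather than taken from the original answer: the KeyError happens because parsed_ua is a Series indexed by row labels, so parsed_ua['device'] looks for a row labelled 'device' instead of reaching into each dictionary. Flattening each dictionary into the four wanted fields and building a new dataframe on the same index lets the result be joined back onto logs. The helper name extract_fields is hypothetical, and the sample dictionary is copied from the output in the question so the snippet runs without ua_parser installed.

```python
import pandas as pd

UA = ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 '
      '(KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36')

# Parsed result copied from the question's output. With ua_parser installed,
# this would instead come from:
#   parsed_ua = logs['userAgent'].apply(user_agent_parser.Parse)
sample = {
    'device': {'brand': None, 'model': None, 'family': 'Other'},
    'os': {'major': '10', 'patch_minor': None, 'minor': '10',
           'family': 'Mac OS X', 'patch': '5'},
    'user_agent': {'major': '55', 'minor': '0', 'family': 'Chrome',
                   'patch': '2883'},
    'string': UA,
}

# Stand-in for the real logs dataframe (two identical rows for illustration).
logs = pd.DataFrame({'userAgent': [UA, UA]})
parsed_ua = pd.Series([sample, sample], index=logs.index)


def extract_fields(parsed):
    """Hypothetical helper: pick the four wanted values out of one parsed dict."""
    return {
        'device_brand': parsed['device']['brand'],
        'device_model': parsed['device']['model'],
        'os_family': parsed['os']['family'],
        'user_agent_family': parsed['user_agent']['family'],
    }


# One flat dict per row becomes one dataframe row; reusing logs.index keeps
# the new columns aligned with the original rows when joining.
ua_columns = pd.DataFrame([extract_fields(d) for d in parsed_ua],
                          index=logs.index)
logs = logs.join(ua_columns)
```

After this, logs holds the original userAgent column plus device_brand, device_model, os_family, and user_agent_family. Building the flat dictionaries in a list comprehension avoids creating one Series per row, which tends to matter on large logs.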