Parsing a Google API object into a pandas DataFrame
Question
I am trying to parse an API response from Google Analytics (GA) into a pandas DataFrame.
The request (sample from the Google page):
from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials

# SCOPES, KEY_FILE_LOCATION and VIEW_ID are assumed to be defined at module
# level (they are not shown in the question).


def initialize_analyticsreporting():
    """Initializes an Analytics Reporting API V4 service object.

    Returns:
      An authorized Analytics Reporting API V4 service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)

    # Build the service object.
    analytics = build('analyticsreporting', 'v4', credentials=credentials)

    return analytics


def get_report(analytics):
    """Queries the Analytics Reporting API V4.

    Args:
      analytics: An authorized Analytics Reporting API V4 service object.
    Returns:
      The Analytics Reporting API V4 response.
    """
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    'dateRanges': [{'startDate': 'today', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:sessions'}],
                    'dimensions': [{'name': 'ga:country'}, {'name': 'ga:hostname'}]
                }]
        }
    ).execute()
The response:
def print_response(response):
    """Parses and prints the Analytics Reporting API V4 response.

    Args:
      response: An Analytics Reporting API V4 response.
    """
    for report in response.get('reports', []):
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get(
            'metricHeader', {}).get('metricHeaderEntries', [])

        for row in report.get('data', {}).get('rows', []):
            dimensions = row.get('dimensions', [])
            dateRangeValues = row.get('metrics', [])

            for header, dimension in zip(dimensionHeaders, dimensions):
                print(header + ': ' + dimension)

            for i, values in enumerate(dateRangeValues):
                print('Date range: ' + str(i))
                for metricHeader, value in zip(metricHeaders, values.get('values')):
                    print(metricHeader.get('name') + ': ' + value)


def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    print_response(response)
This outputs the following:
>> ga:country: United States
>> ga:hostname: nl.sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: United States
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 2078
>> ga:country: Venezuela
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: Vietnam
>> ga:hostname: de.sitename.com
>> Date range: 0
>> ga:sessions: 1
>> ga:country: Vietnam
>> ga:hostname: sitename.com
>> Date range: 0
>> ga:sessions: 32
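For reference, the nested structure that print_response walks can be pictured as a plain dictionary. The sketch below is hand-written from the keys the function reads and the first two rows of the output above; it is not a captured API response, and a real response carries additional fields.

```python
# Hand-written illustration of the response shape, limited to the keys that
# print_response reads. Not a captured API response.
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:country', 'ga:hostname'],
            'metricHeader': {
                'metricHeaderEntries': [{'name': 'ga:sessions'}],
            },
        },
        'data': {
            'rows': [
                {
                    'dimensions': ['United States', 'nl.sitename.com'],
                    # One entry per requested date range; values are strings.
                    'metrics': [{'values': ['1']}],
                },
                {
                    'dimensions': ['United States', 'sitename.com'],
                    'metrics': [{'values': ['2078']}],
                },
            ],
        },
    }],
}

# Walk down to the first row the same way print_response does.
first_row = sample_response['reports'][0]['data']['rows'][0]
print(first_row['dimensions'])
print(first_row['metrics'][0]['values'])
```

Each row pairs a list of dimension values with a list of metric blocks, one block per date range, which is why print_response needs two nested loops.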
First, I would like to place it in a DataFrame rather than print it as in the Google example.
What I tried:
def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    df = pd.DataFrame(print_response(response))
    return df
But this did not work, because print_response only prints the values and returns nothing, so there is no data to build the DataFrame from.
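The failure can be reproduced without the API. A function that only calls print has no return statement, so Python hands back None, and pd.DataFrame(None) builds an empty frame. A minimal sketch, where show_rows is a hypothetical stand-in for print_response:

```python
def show_rows(rows):
    """Hypothetical stand-in for print_response: prints, but returns nothing."""
    for row in rows:
        print(row)


result = show_rows([{'ga:country': 'Vietnam', 'ga:sessions': '32'}])
print(result is None)  # True: the function has no return statement
```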
I understand that I probably need to create a pandas DataFrame and append rows to it inside print_response, but I have no clue where to do that to get something like this:
ga:country     ga:hostname      Date range  ga:sessions
United States  nl.sitename.com  0           1
Venezuela      nl.sitename.com  0           1
Thanks for any suggestions.
Recommended answer
I think this function will do the trick:
import pandas as pd


def print_response(response):
    records = []  # one dict per report row (avoids shadowing the built-in list)
    # get report data
    for report in response.get('reports', []):
        # set column headers
        columnHeader = report.get('columnHeader', {})
        dimensionHeaders = columnHeader.get('dimensions', [])
        metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
        rows = report.get('data', {}).get('rows', [])

        for row in rows:
            # create dict for each row (avoids shadowing the built-in dict)
            record = {}
            dimensions = row.get('dimensions', [])
            dateRangeValues = row.get('metrics', [])

            # fill dict with dimension header (key) and dimension value (value)
            for header, dimension in zip(dimensionHeaders, dimensions):
                record[header] = dimension

            # fill dict with metric header (key) and metric value (value);
            # with several date ranges, later ranges overwrite earlier ones
            for values in dateRangeValues:
                for metric, value in zip(metricHeaders, values.get('values')):
                    # metric values arrive as strings: keep whole numbers as int
                    # and decimals as float (float() cannot parse a comma, so
                    # only '.' is checked)
                    if '.' in value:
                        record[metric.get('name')] = float(value)
                    else:
                        record[metric.get('name')] = int(value)

            records.append(record)

    df = pd.DataFrame(records)
    return df
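The function above keeps one set of metric values per row and does not produce the "Date range" column shown in the question's desired table. A possible variant is sketched here, assuming one output row per (report row, date range) pair is wanted. The names response_to_records and response_to_dataframe, the 'Date range' column label, and the demo data are illustrative choices, not part of the API.

```python
def response_to_records(response):
    """Flatten a Reporting API v4 response into a list of dicts.

    Emits one record per (row, date range) pair and keeps the date range
    index in a 'Date range' column. Names are illustrative choices.
    """
    records = []
    for report in response.get('reports', []):
        header = report.get('columnHeader', {})
        dimension_names = header.get('dimensions', [])
        metric_names = [
            entry.get('name')
            for entry in header.get('metricHeader', {}).get('metricHeaderEntries', [])
        ]
        for row in report.get('data', {}).get('rows', []):
            dimensions = dict(zip(dimension_names, row.get('dimensions', [])))
            for index, date_range in enumerate(row.get('metrics', [])):
                record = dict(dimensions)  # copy so each date range gets its own row
                record['Date range'] = index
                for name, value in zip(metric_names, date_range.get('values', [])):
                    # Metric values arrive as strings.
                    record[name] = float(value) if '.' in value else int(value)
                records.append(record)
    return records


def response_to_dataframe(response):
    """Build a DataFrame from the flattened records."""
    import pandas as pd  # imported lazily so the parser itself needs no pandas
    return pd.DataFrame(response_to_records(response))


# Tiny hand-written response with two date ranges (not real API output).
demo = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:country', 'ga:hostname'],
            'metricHeader': {'metricHeaderEntries': [{'name': 'ga:sessions'}]},
        },
        'data': {'rows': [{
            'dimensions': ['Vietnam', 'sitename.com'],
            'metrics': [{'values': ['32']}, {'values': ['30']}],
        }]},
    }],
}

records = response_to_records(demo)
for record in records:
    print(record)
```

In main(), calling response_to_dataframe(get_report(analytics)) would then take the place of the print_response call. Keeping the parsing step separate from the DataFrame construction also makes it possible to check the records without a live API connection.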