Convert Nested Dictionary into Table/Parent Child Structure, Python 3.6
Question
I want to convert the nested dictionary produced by the code below.
import requests
from bs4 import BeautifulSoup

url = 'https://www.bundesbank.de/en/statistics/time-series-databases/time-series-databases/743796/openAll?treeAnchor=BANKEN&statisticType=BBK_ITS'
result = requests.get(url)
soup = BeautifulSoup(result.text, 'html.parser')

def get_child_nodes(parent_node):
    # Build {"name": ..., "children": [...]} for this node and its descendants.
    node_name = parent_node.a.get_text(strip=True)
    result = {"name": node_name, "children": []}

    # Only look at the <ul> directly under this node, not deeper ones.
    children_list = parent_node.find('ul', recursive=False)
    if not children_list:
        return result

    # Calling a tag is shorthand for find_all(); recurse into each direct <li>.
    for child_node in children_list('li', recursive=False):
        result["children"].append(get_child_nodes(child_node))

    return result

Data_Dict = get_child_nodes(soup.find("div", class_="statisticTree"))
Is it possible to export this as a Parent-Child table, as shown in the image?
The code above is from @alecxe's answer to "Fetch complete List of Items using BeautifulSoup, Python 3.6".
I tried, but it is too complex for me to understand. Please help.
Dictionary: http://s000.tinyupload.com/index.php?file_id=97731876598977568058
Sample dictionary data:
{"name": "Banks", "children": [{"name": "Banks", "children": [{"name": "Balance sheet items", "children":
[{"name": "Minimum reserves", "children": [{"name": "Reserve maintenance in the euro area", "children": []}, {"name": "Reserve maintenance in Germany", "children": []}]},
{"name": "Bank Lending Survey (BLS) - Results for Germany", "children": [{"name": "Lending", "children": [{"name": "Enterprises", "children": [{"name": "Changes over the past three months", "children": [{"name": "Credit standards and explanatory factors", "children": [{"name": "Overall", "children": []}, {"name": "Loans to small and medium-sized enterprises", "children": []}, {"name": "Loans to large enterprises", "children": []}, {"name": "Short-term loans", "children": []}, {"name": "Long-term loans", "children": []}]}, {"name": "Terms and conditions and explanatory factors", "children": [{"name": "Overall", "children": [{"name": "Overall terms and conditions and explanatory factors", "children": []}, {"name": "Margins on average loans and explanatory factors", "children": []}, {"name": "Margins on riskier loans and explanatory factors", "children": []}, {"name": "Non-interest rate charges", "children": []}, {"name": "Size of the loan or credit line", "children": []}, {"name": "Collateral requirements", "children": []}, {"name": "Loan covenants", "children": []}, {"name": "Maturity", "children": []}]}, {"name": "Loans to small and medium-sized enterprises", "children": []}, {"name": "Loans to large enterprises", "children": []}]}, {"name": "Share of enterprise rejected loan applications", "children": []}]}, {"name": "Expected changes over the next three months", "children": [{"name": "Credit standards", "children": []}]}]}, {"name": "Households", "children": [{"name": "Changes over the past three months", "children": [{"name": "Credit standards and explanatory factors", "children": [{"name": "Loans for house purchase", "children": []}, {"name": "Consumer credit and other lending", "children": []}]},
Recommended answer
You can handle this using a recursive function.
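def get_pairs(data, parent=''):
    # Start with this node's (name, parent) pair; the root gets an empty parent.
    rv = [(data['name'], parent)]
    # Append the pairs of every descendant, passing this node's name down as their parent.
    for d in data['children']:
        rv.extend(get_pairs(d, parent=data['name']))
    return rv

Data_Dict = get_child_nodes(soup.find("div", class_="statisticTree"))
pairs = get_pairs(Data_Dict)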
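To see what the recursion produces without fetching the page, here is a minimal sketch that runs the same get_pairs on a small hand-written tree. The `sample` dictionary and the `sample_pairs` name are made up for illustration; only the {"name", "children"} shape is taken from the question.

```python
def get_pairs(data, parent=''):
    """Return (name, parent) tuples for a node and all of its descendants."""
    rv = [(data['name'], parent)]
    for d in data['children']:
        rv.extend(get_pairs(d, parent=data['name']))
    return rv

# Hypothetical miniature tree in the same shape as Data_Dict.
sample = {
    "name": "Banks",
    "children": [
        {"name": "Minimum reserves", "children": [
            {"name": "Reserve maintenance in the euro area", "children": []},
            {"name": "Reserve maintenance in Germany", "children": []},
        ]},
        {"name": "Lending", "children": []},
    ],
}

sample_pairs = get_pairs(sample)
for name, parent in sample_pairs:
    print(f"{name!r} has parent {parent!r}")
```

The rows come out in depth-first order: each node is listed before its children, and the root is paired with an empty string.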
You then have the option of creating a DataFrame or exporting to a CSV file straight away, as in your example output. To create a DataFrame, we can simply do:
import pandas as pd

df = pd.DataFrame(get_pairs(Data_Dict), columns=['Name', 'Parent'])
This gives:
      Name                                         Parent
0     Banks
1     Banks                                        Banks
2     Balance sheet items                          Banks
3     Minimum reserves                             Balance sheet items
4     Reserve maintenance in the euro area         Minimum reserves
      ...                                          ...
3890  Number of transactions per type of terminal  Payments statistics
3891  Value of transactions per type of terminal   Payments statistics
3892  Number of OTC transactions                   Payments statistics
3893  Value of OTC transactions                    Payments statistics
3894  Issuance of banknotes                        Payments statistics
[3895 rows x 2 columns]
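One caveat with a plain Name/Parent table: names repeat in this tree. In the sample data, "Changes over the past three months" appears under both "Enterprises" and "Households", so a parent name alone does not always say which branch a row belongs to. If that matters, one possible variation is to record the full ancestor path instead of only the immediate parent. This is a sketch rather than part of the original answer; `get_paths`, the " > " separator and the sample tree are all made up for illustration.

```python
def get_paths(data, ancestors=()):
    """Return (name, ancestor_path) tuples; the path joins all ancestor names."""
    rows = [(data['name'], ' > '.join(ancestors))]
    path = ancestors + (data['name'],)
    for child in data['children']:
        rows.extend(get_paths(child, path))
    return rows

# Hypothetical tree with repeated names, in the same {"name", "children"} shape.
sample = {
    "name": "Banks",
    "children": [
        {"name": "Banks", "children": [
            {"name": "Overall", "children": []},
        ]},
        {"name": "Lending", "children": [
            {"name": "Overall", "children": []},
        ]},
    ],
}

path_rows = get_paths(sample)
for name, path in path_rows:
    print(f"{name!r} under {path!r}")
```

The two "Overall" rows now carry different paths ("Banks > Banks" and "Banks > Lending"), so they can be told apart after export.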
Or, to write to a CSV file, we can use the built-in csv module:
import csv

# newline='' lets the csv module control line endings itself.
with open('out.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=',')
    writer.writerow(('Name', 'Parent'))  # header row
    for pair in pairs:
        writer.writerow(pair)
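To check what the writer emits without touching the disk, the same calls can target an in-memory buffer. This is a small sketch with made-up rows; "Lending, total" is a hypothetical name chosen only to show that the csv module quotes fields containing the delimiter.

```python
import csv
import io

# Hypothetical (name, parent) pairs; the last name contains a comma on purpose.
sample_pairs = [
    ("Banks", ""),
    ("Minimum reserves", "Banks"),
    ("Lending, total", "Banks"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=',')
writer.writerow(('Name', 'Parent'))
writer.writerows(sample_pairs)  # writes every pair in one call

csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original fields, including the quoted one.
rows_back = list(csv.reader(io.StringIO(csv_text)))
```

Because the quoting is handled by the csv module, names with commas survive a round trip unchanged, which would not hold for lines built by hand with ','.join().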