Insert data in a JSON file

Problem description

The code below writes JSON with the wrong structure to the file:

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import json

urls = {}
urls['Av'] = {'Áa', 'Bb'}

data = {}
for key, value in urls.items(): 
    for x in value: 

        url = 'https://www.google.pt/search?q=' + key + '%20' + x
        driver = webdriver.Chrome()
        driver.get(url)
        html = driver.page_source

        soup = BeautifulSoup(html, 'html.parser')
        a = soup.find("body")

        for child in a.find_all("div", {'class': 'g'}):
            h2 = child.find("span", {'class': 'Q8LRLc'})
            div = child.find("a", {'class': 'Fx4vi'})

        data[key] = []
        data[key].append({'h2': h2, 'div': div})
        print(data)

        with open("data_file.json", "a") as write_file: 
            json.dump(data, write_file, indent=4)

        driver.quit()

Recommended answer

I see a number of issues. Most are statements that sit inside a loop when they should be outside it, or outside a loop when they should be inside it.

  • You set the variables h2 and div inside the loop for child in a.find_all("div", {'class': 'g'}):, but you add them to data outside that loop, so only the last values found are added.
  • You initialize data[key] inside the loop over the values. It should be done outside that loop, once per key; otherwise it is reset on every iteration and earlier results are lost.
  • You open the file in append mode on every iteration. I'd open it and write to it just once.
  • You start a new driver on every iteration. One driver can be created before the loops and reused.
  • requests and selenium.webdriver.chrome.options.Options are both unused imports.
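The first point is easier to see without a browser in the way. The following is a minimal sketch, assuming a made-up list named results in place of the scraped elements; it only illustrates where the append call sits relative to the loop.

```python
# Hypothetical stand-in for the scraped elements; no browser is needed.
results = ["first", "second", "third"]

# Appending after the loop: only the last value survives.
outside = []
for item in results:
    last = item
outside.append(last)

# Appending inside the loop: every value is kept.
inside = []
for item in results:
    inside.append(item)

print(outside)  # ['third']
print(inside)   # ['first', 'second', 'third']
```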

So, I'd change it like this:

import json

from bs4 import BeautifulSoup
from selenium import webdriver

urls = {}
urls['Av'] = {'Áa', 'Bb'}

data = {}
driver = webdriver.Chrome()  # start the browser once and reuse it
with open("data_file.json", "a") as write_file:
    for key, value in urls.items():
        data[key] = []  # initialize only once per key

        for x in value:
            url = 'https://www.google.pt/search?q=' + key + '%20' + x
            driver.get(url)
            html = driver.page_source
            soup = BeautifulSoup(html, 'html.parser')
            a = soup.find("body")

            for child in a.find_all("div", {'class': 'g'}):
                h2 = child.find("span", {'class': 'Q8LRLc'})
                div = child.find("a", {'class': 'Fx4vi'})
                # Tag objects are not JSON serializable, so store their text
                # (an assumption about what should be saved); None if missing.
                data[key].append({
                    'h2': h2.get_text(strip=True) if h2 is not None else None,
                    'div': div.get_text(strip=True) if div is not None else None,
                })  # update data for every h2/div found

    json.dump(data, write_file, indent=4)  # This write can be done once, outside all loops!

driver.quit()
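One more thing worth checking is the "a" (append) mode. If the script runs more than once, each run adds another JSON document to the end of the same file, and the result no longer parses as a single document. The sketch below illustrates this with made-up sample data and a temporary file; switching to "w" is a suggestion on my part, not something the code above does.

```python
import json
import os
import tempfile

# Made-up sample data with the same shape as the scraper's output.
data = {"Av": [{"h2": "example title", "div": "example link text"}]}

path = os.path.join(tempfile.mkdtemp(), "data_file.json")

# Two runs in append mode leave two JSON documents back to back.
for _ in range(2):
    with open(path, "a", encoding="utf-8") as f:
        json.dump(data, f, indent=4)

try:
    with open(path, encoding="utf-8") as f:
        json.load(f)
    appended_is_valid = True
except json.JSONDecodeError:
    appended_is_valid = False

# Write mode replaces the file, so it always holds exactly one document.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4)

with open(path, encoding="utf-8") as f:
    reloaded = json.load(f)

print(appended_is_valid)  # False
print(reloaded == data)   # True
```

If results from earlier runs need to be kept, one option is to load the existing file, merge the new entries into it, and write the whole structure back with "w".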

This is a little hard for me to test, but I hope it helps. Happy coding!
