Insert scraped table data directly into PostgreSQL db

Problem description

I want to insert my scraped data directly into a PostgreSQL db. I'm struggling to write the query for this; any help would be appreciated.

The code I've come up with so far:

import csv
import urllib.request
import psycopg2
from bs4 import BeautifulSoup
conn = psycopg2.connect(database='--',user='--', password='--', port=--)
cursor = conn.cursor()
soup = BeautifulSoup(urllib.request.urlopen("http://tis.nhai.gov.in/TollInformation?TollPlazaID=236").read(),'lxml')
tbody = soup('table' ,{"class":"tollinfotbl"})[0].find_all('tr')
for row in tbody:
    cols = row.findChildren(recursive=False)
    cols = [ele.text.strip() for ele in cols]
    writer.writerow(cols)
    print(cols)

My table's details are as follows:

    Column     |  Type   | Modifiers
---------------+---------+-----------
 vehicle_type  | text    | not null
 one_time      | integer | not null
 return_trip   | integer |
 monthly_pass  | integer | not null
 local_vehicle | integer | not null


Recommended answer

I assume that cols contains 5 elements, in the same order as the columns of your table; otherwise, adjust the indexes.

import urllib.request

import psycopg2
from bs4 import BeautifulSoup

conn = psycopg2.connect(database='--', user='--', password='--', port='--')
cursor = conn.cursor()
soup = BeautifulSoup(urllib.request.urlopen(
    "http://tis.nhai.gov.in/TollInformation?TollPlazaID=236").read(), 'lxml')
tbody = soup('table', {"class": "tollinfotbl"})[0].find_all('tr')

# The column list must name all five columns, in the same order as `data` below.
query = ("INSERT INTO table_name "
         "(vehicle_type, one_time, return_trip, monthly_pass, local_vehicle) "
         "VALUES (%s, %s, %s, %s, %s);")

for row in tbody:
    cols = row.findChildren(recursive=False)
    cols = [ele.text.strip() for ele in cols]
    if cols:
        # Assumes a name followed by four integer cells; int() raises
        # ValueError on anything else (for example, a header row).
        vehicle_type = cols[0]
        one_time = int(cols[1])
        return_trip = int(cols[2])
        monthly_pass = int(cols[3])
        local_vehicle = int(cols[4])

        data = (vehicle_type, one_time, return_trip, monthly_pass, local_vehicle)
        # Parameterized query: psycopg2 quotes the values itself.
        cursor.execute(query, data)

# Commit the transaction
conn.commit()
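
The loop above relies on every non-empty row holding exactly five cells with integers in the last four. As a minimal sketch of a more defensive variant, the conversion could be moved into a helper that returns None for rows it cannot parse. The name parse_toll_row and the sample rows are hypothetical, not taken from the real page, and mapping an empty return_trip cell to NULL is an assumption based on that column being nullable in the table.

```python
from typing import List, Optional, Tuple

# (vehicle_type, one_time, return_trip, monthly_pass, local_vehicle)
Row = Tuple[str, int, Optional[int], int, int]


def parse_toll_row(cols: List[str]) -> Optional[Row]:
    """Turn one scraped row into a tuple for the INSERT, or None to skip it."""
    if len(cols) != 5:
        return None  # not a data row (wrong number of cells)
    try:
        one_time = int(cols[1])
        # return_trip is nullable, so an empty cell is stored as NULL (None)
        return_trip = int(cols[2]) if cols[2] else None
        monthly_pass = int(cols[3])
        local_vehicle = int(cols[4])
    except ValueError:
        return None  # header row or other non-numeric cell
    return (cols[0], one_time, return_trip, monthly_pass, local_vehicle)


# Hypothetical sample rows; the numbers are placeholders, not real toll rates.
sample_rows = [
    ["Type of vehicle", "Single", "Return", "Monthly", "Local"],
    ["Car/Jeep/Van", "10", "20", "300", "5"],
    ["Bus/Truck", "30", "", "900", "15"],
]

records = [r for r in map(parse_toll_row, sample_rows) if r is not None]
print(records)
```

The resulting list of tuples could then be passed to cursor.executemany() together with an INSERT statement that lists all five columns and five %s placeholders, followed by conn.commit() as before.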
