How to quickly search through a .csv file in Python
Problem description
I'm reading a 6 million entry .csv file with Python, and I want to be able to search through this file for a particular entry.
Are there any tricks to search the entire file? Should you read the whole thing into a dictionary, or should you perform a search every time? I tried loading it into a dictionary, but that took ages, so I'm currently searching through the whole file every time, which seems wasteful.
Could I take advantage of the fact that the list is alphabetically ordered? (E.g. if the search word starts with "b", I only search from the line that includes the first word beginning with "b" to the line that includes the last word beginning with "b".)
I'm using import csv.
(A side question: is it possible to make csv go to a specific line in the file? I want to make the program start at a random line.)
Edit: I already have a copy of the list as an .sql file as well. How could I use that from Python?
If the csv file isn't changing, load it into a database, where searching is fast and easy. If you're not familiar with SQL, you'll need to brush up on that though.
Here is a rough example of inserting from a csv into a sqlite table. Example csv is ';' delimited, and has 2 columns.
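import csv
import sqlite3

# Create (or open) the database file and the target table.
con = sqlite3.connect('newdb.sqlite')
cur = con.cursor()
cur.execute('CREATE TABLE "stuff" ("one" varchar(12), "two" varchar(12));')

# newline='' is what the csv module recommends when opening csv files.
with open('stuff.csv', newline='') as f:
    csv_reader = csv.reader(f, delimiter=';')
    # Every row must have exactly 2 fields to match the 2 placeholders.
    cur.executemany('INSERT INTO stuff VALUES (?, ?)', csv_reader)

cur.close()
con.commit()
con.close()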
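Once the rows are in the table, the search itself might look something like the sketch below. It is only an illustration under some assumptions: the index name idx_stuff_one, the helper find_rows and the sample rows are made up, and an in-memory database stands in for newdb.sqlite so the snippet runs on its own. The index on the searched column is what should let SQLite avoid scanning all 6 million rows on each lookup.

```python
import sqlite3


def find_rows(con, key):
    """Return all rows of the 'stuff' table whose 'one' column equals key.

    find_rows is a hypothetical helper, not part of sqlite3.
    """
    cur = con.execute('SELECT one, two FROM stuff WHERE one = ?', (key,))
    return cur.fetchall()


# Assumption: an in-memory database with invented sample rows, standing in
# for the table built from the real csv file.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE stuff (one varchar(12), two varchar(12))')
con.executemany(
    'INSERT INTO stuff VALUES (?, ?)',
    [('apple', '1'), ('banana', '2'), ('cherry', '3')],
)

# Index the column that is searched, so lookups need not scan every row.
# The index name is arbitrary.
con.execute('CREATE INDEX IF NOT EXISTS idx_stuff_one ON stuff (one)')
con.commit()

matches = find_rows(con, 'banana')   # rows whose first column is 'banana'
missing = find_rows(con, 'durian')   # no such entry, so an empty list
print(matches)
print(missing)

con.close()
```

With the real data, you would connect to 'newdb.sqlite' instead of ':memory:' and create the index once after the import; later runs can then just open the file and query it.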