SQLite Optimization for Millions of Entries?

Views: 84 Posted: 2020/5/21 20:58:44 Tags: perl optimization sqlite berkeley-db

Problem Description

I'm trying to tackle a problem by using a SQLite database and Perl modules. In the end, there will be tens of millions of entries I need to log. The only unique identifier for each item is a text string for the URL. I'm thinking of doing this in two ways:

Way #1: Have a good table, bad table, unsorted table. (I need to check the html and decide whether I want it.) Say we have 1 billion pages total, 333 million URLs in each table. I have a new URL to add, and I need to check and see if it's in any of the tables, and add it to the Unsorted if it is unique. Also, I would be moving a lot of rows around with this option.

Way #2: I have 2 tables, Master and Good. Master has all 1 billion page URLs, and Good has the 333 million that I want. New URL, need to do the same thing, except this time I am only querying one table, and I would never delete a row from Master, only add the data to Good.

So basically, I need to know the best setup for quickly querying a huge SQLite database to check whether a text string of ~20 characters is already present, and adding it if it isn't.
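The check-then-add step described above can be sketched as follows. This is a minimal illustration, not a tuned setup: it assumes the DBI and DBD::SQLite modules are installed, and the table name `master`, the column name `url`, and the helper `add_if_new` are hypothetical names chosen for the example. Declaring the URL as the primary key gives SQLite a unique index to search, and `INSERT OR IGNORE` folds the lookup and the insert into one statement.

```perl
use strict;
use warnings;
use DBI;

# Assumption: an in-memory database for illustration; a real run would
# pass a file path instead of :memory:.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical schema: the PRIMARY KEY gives SQLite a unique index on url.
$dbh->do("CREATE TABLE IF NOT EXISTS master (url TEXT PRIMARY KEY)");

# INSERT OR IGNORE skips a row whose url already exists instead of failing.
my $insert = $dbh->prepare("INSERT OR IGNORE INTO master (url) VALUES (?)");

# Returns 1 if the URL was new and has been added, 0 if it was already there.
sub add_if_new {
    my ($url) = @_;
    my $rows = $insert->execute($url);   # "0E0" (zero but true) when ignored
    return $rows > 0 ? 1 : 0;
}

# Grouping many inserts into one transaction avoids a commit per row.
$dbh->begin_work;
my @results = map { add_if_new($_) } qw(
    http://example.com/a
    http://example.com/b
    http://example.com/a
);
$dbh->commit;

my ($count) = $dbh->selectrow_array("SELECT COUNT(*) FROM master");
print "rows stored: $count\n";
```

With this layout, Way #2 needs only the single `master` table for the uniqueness check; whether a separate Good table or a status column suits the data better is left open here.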

Edit: I'm now trying to get Berkeley DB to work using the Perl module, but no dice. Here's what I have:

use BerkeleyDB;

$dbFolder = 'C:\somedirectory';
my $env = BerkeleyDB::Env->new ( -Home => $dbFolder );

my $db  = BerkeleyDB::Hash->new (
-Filename => "fred.db", 
-Env => $env );
my $status = $db->db_put("apple", "red");

And when I run this, I get the following:

Can't call method "db_put" on an undefined value at C:\Directory\perlfile.pl line 42, <STDIN> line 1.

Solution

If $db is undefined, opening the database is failing, and you should inspect $! and $BerkeleyDB::Error to see why.

Have you created the database already? If not, you need -Flags => DB_CREATE.

Working example:

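use strict;
use warnings;
use BerkeleyDB;

# Directory that will hold the database file; it must already exist.
my $dbFolder = '/home/ysth/bdbtmp';

# DB_CREATE creates fred.db if it does not exist yet.
my $db  = BerkeleyDB::Hash->new (
    -Filename => "$dbFolder/fred.db",
    -Flags => DB_CREATE,
) or die "couldn't create: $!, $BerkeleyDB::Error.\n";

# db_put returns 0 on success.
my $status = $db->db_put("apple", "red");
die "db_put failed: $status\n" if $status != 0;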

I couldn't get BerkeleyDB::Env to do anything useful, though; whatever I tried, the constructor returned undef.
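One possible cause, offered as an assumption rather than a confirmed diagnosis of the case above: the environment constructor usually needs its own `-Flags`, typically `DB_CREATE | DB_INIT_MPOOL`, and the home directory has to exist before the call. The sketch below uses a temporary directory from File::Temp so it does not depend on any particular path; the directory and file names are illustrative only.

```perl
use strict;
use warnings;
use BerkeleyDB;
use File::Temp qw(tempdir);

# Assumption: a throwaway home directory; a real setup would use a fixed,
# pre-existing directory instead.
my $dbFolder = tempdir(CLEANUP => 1);

# The environment needs DB_CREATE to create its own files, and
# DB_INIT_MPOOL to set up the shared memory pool the databases use.
my $env = BerkeleyDB::Env->new(
    -Home  => $dbFolder,
    -Flags => DB_CREATE | DB_INIT_MPOOL,
) or die "couldn't open environment: $!, $BerkeleyDB::Error\n";

# With -Env set, a relative -Filename is resolved inside the home directory.
my $db = BerkeleyDB::Hash->new(
    -Filename => "fred.db",
    -Env      => $env,
    -Flags    => DB_CREATE,
) or die "couldn't create database: $!, $BerkeleyDB::Error\n";

# db_put and db_get return 0 on success.
my $status = $db->db_put("apple", "red");
die "db_put failed: $status\n" if $status != 0;

my $value;
$status = $db->db_get("apple", $value);
die "db_get failed: $status\n" if $status != 0;
print "apple => $value\n";
```

If the constructor still returns undef, `$BerkeleyDB::Error` printed by the `die` lines is the first place to look.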
