How do I lock read/write to MySQL tables so that I can select and then insert without other programs reading/writing to the database?

Question

I am running many instances of a webcrawler in parallel.

Each crawler selects a domain from a table, inserts that url and a start time into a log table, and then starts crawling the domain.

Other parallel crawlers check the log table to see what domains are already being crawled before selecting their own domain to crawl.

I need to prevent other crawlers from selecting a domain that has just been selected by another crawler but doesn't have a log entry yet. My best guess at how to do this is to lock the database from all other read/writes while one crawler selects a domain and inserts a row in the log table (two queries).

How the heck does one do this? I'm afraid this is terribly complex and relies on many other things. Please help get me started.

This code seems like a good solution (see the error below, however):

INSERT INTO crawlLog (companyId, timeStartCrawling)
VALUES
(
    (
        SELECT companies.id FROM companies
        LEFT OUTER JOIN crawlLog
        ON companies.id = crawlLog.companyId
        WHERE crawlLog.companyId IS NULL
        LIMIT 1
    ),
    now()
)

But I keep getting the following MySQL error:

You can't specify target table 'crawlLog' for update in FROM clause

Is there a way to accomplish the same thing without this problem? I've tried a couple of different ways, including this:

INSERT INTO crawlLog (companyId, timeStartCrawling)
VALUES
(
    (
        SELECT id
        FROM companies
        WHERE id NOT IN (SELECT companyId FROM crawlLog) LIMIT 1
    ),
    now()
)

Recommended answer

I got some inspiration from @Eljakim's answer and started this new thread where I figured out a great trick. It doesn't involve locking anything and is very simple.

INSERT INTO crawlLog (companyId, timeStartCrawling)
SELECT id, now()
FROM companies
WHERE id NOT IN
(
    SELECT companyId
    FROM crawlLog AS crawlLogAlias
)
LIMIT 1
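
After running a statement like the one above, each crawler still needs to find out which company it actually claimed. The following is only a rough sketch of that step, not part of the original answer. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, which means SQLite's CURRENT_TIMESTAMP replaces now(). The table layout, the auto-increment id column, the UNIQUE constraint on companyId, and the helper names create_schema and claim_next_company are all assumptions made for illustration. With MySQL, the analogous checks would presumably be the affected-row count and LAST_INSERT_ID(), assuming crawlLog has an AUTO_INCREMENT key.

```python
import sqlite3

# The answer's single-statement claim. Only the timestamp function differs:
# SQLite's CURRENT_TIMESTAMP stands in for MySQL's now().
CLAIM_SQL = """
INSERT INTO crawlLog (companyId, timeStartCrawling)
SELECT id, CURRENT_TIMESTAMP
FROM companies
WHERE id NOT IN (SELECT companyId FROM crawlLog AS crawlLogAlias)
LIMIT 1
"""


def create_schema(conn):
    """Create a hypothetical schema; only the column names used in the
    thread (id, companyId, timeStartCrawling) come from the source."""
    conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, url TEXT)")
    conn.execute(
        "CREATE TABLE crawlLog ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        # Assumed safeguard: a second claim of the same company would fail.
        " companyId INTEGER NOT NULL UNIQUE,"
        " timeStartCrawling TEXT NOT NULL)"
    )


def claim_next_company(conn):
    """Claim one unclaimed company and return its id, or None if none remain."""
    cur = conn.execute(CLAIM_SQL)
    if cur.rowcount != 1:
        # Nothing was inserted, so every company already has a log entry.
        conn.commit()
        return None
    # Look up the row this connection just inserted to learn the claimed id.
    row = conn.execute(
        "SELECT companyId FROM crawlLog WHERE id = ?", (cur.lastrowid,)
    ).fetchone()
    conn.commit()
    return row[0]
```

Calling claim_next_company(conn) repeatedly on one connection hands out each company id once and then returns None. The UNIQUE constraint is meant as a safety net in case two crawlers ever race for the same row; whether it is needed depends on the storage engine and isolation level in use.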
