Flask-WhooshAlchemy with existing database
Problem description
How can I get Flask-WhooshAlchemy to create the .seg files for an already existing database filled with records? By calling:
with app.app_context():
whooshalchemy.whoosh_index(app, MappedClass)
I can get the .toc file, but the .seg files will only be created once I insert a record directly via the Flask-WhooshAlchemy interface. Thus, all already-existing records will never be included in a Whoosh search.
Here is a script that indexes an existing database. FWIW, Whoosh refers to that as "batch indexing".
This is a little rough, but it works:
#!/usr/bin/env python2
import os
import sys

import app
from models import YourModel as Model
from flask.ext.whooshalchemy import whoosh_index

# Unbuffered stdout so the progress lines appear immediately.
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

atatime = 512  # rows to pull from the database per batch

with app.app_context():
    index = whoosh_index(app, Model)
    searchable = Model.__searchable__
    print 'counting rows...'
    total = int(Model.query.order_by(None).count())
    done = 0
    print 'total rows: {}'.format(total)
    writer = index.writer(limitmb=10000, procs=16, multisegment=True)
    for p in Model.query.yield_per(atatime):
        record = dict([(s, p.__dict__[s]) for s in searchable])
        record.update({'id': unicode(p.id)})  # id is mandatory, or whoosh won't work
        writer.add_document(**record)
        done += 1
        if done % atatime == 0:
            print '{}/{} ({}%)'.format(done, total, round((float(done) / total) * 100, 2))
    writer.commit()
You may want to play with the parameters:
atatime - the number of records to pull from the database at once
limitmb - "max" megabytes to use
procs - cores to use in parallel
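To illustrate what atatime buys you: pulling rows in fixed-size batches keeps memory bounded, which is what yield_per(atatime) does on the SQLAlchemy side. A minimal stdlib sketch of the same idea (the chunked helper is hypothetical, not part of Flask-WhooshAlchemy or SQLAlchemy):

```python
import itertools

def chunked(rows, size):
    """Yield successive lists of at most `size` rows, mirroring yield_per()."""
    it = iter(rows)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

# e.g. chunked(range(5), 2) yields [0, 1], then [2, 3], then [4]
```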
I used this to index around 360,000 records on an 8-core AWS instance. It took about 4 minutes, most of which was spent waiting for the (single-threaded) commit().
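If you want the progress output without the inline arithmetic, the script's print statement can be factored into a small helper; a sketch (the progress function name is hypothetical) using the same done/total arithmetic as above:

```python
def progress(done, total):
    """Format a 'done/total (pct%)' line like the indexing script prints."""
    return '{}/{} ({}%)'.format(done, total, round(float(done) / total * 100, 2))

# e.g. progress(512, 1024) -> '512/1024 (50.0%)'
```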