How to organize a module that requires a data file


Problem description


Ok, so I have a module that is basically a Python wrapper around a big
lookup table stored in a text file[1]. The module needs to provide a
few functions::

    get_stem(word, pos, default=None)
    stem_exists(word, pos)
    ...

Because there should only ever be one lookup table, I feel like these
functions ought to be module globals. That way, you could just do
something like::

    import morph
    assist = morph.get_stem('assistance', 'N')
    ...

My problem is with the text file. Where should I keep it? If I want to
keep the module simple, I need to be able to identify the location of
the file at module import time. That way, I can read all the data into
the appropriate Python structure, and all my module-level functions will
work immediately after import.

I can only think of a few obvious places where I could find the text
file at import time -- in the same directory as the module (e.g.
lib/site-packages), in the user's home directory, or in a directory
indicated by an environment variable. The first seems weird because the
text file is large (about 10MB) and I don't really see any other
packages putting data files into lib/site-packages. The second seems
weird because it's not a per-user configuration - it's a data file
shared by all users. And the third seems weird because my
experience with a configuration depending heavily on environment
variables is that this is difficult to maintain.

If I don't mind complicating the module functions a bit (e.g. by
starting each function with "if _lookup_table is not None"), I could
allow users to specify a location for the file after the module is
imported, e.g.::

    import morph
    morph.setfile(r'C:\resources\morph_english.flat')
    ...

Then all the module-level functions would have to raise Exceptions until
setfile() was called. I don't like that the user would have to
configure the module each time they wanted to use it, but perhaps that's
unavoidable.
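
A minimal sketch of the setfile() variant described above, for illustration only. The file layout is an assumption (one tab-separated "word, pos, stem" record per line); the real format of the lookup table is not given in this thread. The helper name _require_table is hypothetical.

```python
# Sketch of a morph.py that must be configured after import.
# ASSUMPTION: each line of the data file is "word<TAB>pos<TAB>stem".

_lookup_table = None  # stays None until setfile() is called


def setfile(path):
    """Read the lookup table from `path` into a dict keyed by (word, pos)."""
    global _lookup_table
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            word, pos, stem = line.split("\t")
            table[word, pos] = stem
    _lookup_table = table


def _require_table():
    # Hypothetical helper: each public function calls this first, so the
    # "is the table loaded?" check lives in one place.
    if _lookup_table is None:
        raise RuntimeError("call setfile() before using this module")
    return _lookup_table


def get_stem(word, pos, default=None):
    """Return the stem for (word, pos), or `default` if it is not listed."""
    return _require_table().get((word, pos), default)


def stem_exists(word, pos):
    """Return True if the table has an entry for (word, pos)."""
    return (word, pos) in _require_table()
```

Putting the check in a single helper keeps the per-function cost to one line, which is roughly the "complicating the module functions a bit" trade-off mentioned above.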

Any suggestions? Is there an obvious place to put the text file that
I'm missing?

Thanks in advance,

STeVe

[1] In case you're curious, the file is a list of words and their
morphological stems provided by the University of Pennsylvania.

Recommended answer

Steven Bethard wrote:

[Text file for a module's internal use.]
My problem is with the text file. Where should I keep it? If I want to
keep the module simple, I need to be able to identify the location of
the file at module import time. That way, I can read all the data into
the appropriate Python structure, and all my module-level functions will
work immediately after import.




I tend to make use of the __file__ attribute available in every module.
For example:

    resource_dir = os.path.join(os.path.split(__file__)[0], "Resources")

This assigns to resource_dir the path to the Resources directory
alongside the module itself in the filesystem. Of course, if you just
wanted the text file to reside alongside the module, rather than a
whole directory of stuff, you'd replace "Resources" with the name of
your file (and change the variable name, of course). For example:

    filename = os.path.join(os.path.split(__file__)[0],
                            "morph_english.flat")

Having posted this solution, and in the tradition of Usenet, I'd be
interested to hear whether this is a particularly bad idea.

Paul
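
As a hedged illustration of the __file__ approach in the post above, the same idea can be wrapped in a small helper. The function name resource_path is hypothetical; the only change from the original snippet is that the module path is made absolute first, since __file__ can be a relative path depending on how the module was loaded.

```python
import os


def resource_path(module_file, name):
    """Return the path of `name` in the same directory as `module_file`."""
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, name)


# Inside morph.py one would call, for example:
#     filename = resource_path(__file__, "morph_english.flat")
```

This only works when the module lives as an ordinary file on disk, which is the situation the thread assumes.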


On Thu, 17 Nov 2005 12:18:51 -0700
Steven Bethard <st************@gmail.com> wrote:
My problem is with the text file. Where should I keep it?

I can only think of a few obvious places where I could
find the text file at import time -- in the same
directory as the module (e.g. lib/site-packages), in the
user's home directory, or in a directory indicated by an
environment variable.




Why don't you search those places for it, in order?

Check ~/.mymod/myfile, then /etc/mymod/myfile, then
/lib/site-packages/mymod/myfile or whatever. It won't take
long, just do the existence checks on import of the module.
If you don't find it after checking those places, *then*
raise an exception.
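
The search order described above could be sketched roughly as follows. The function name find_data_file is hypothetical, and the "mymod"/"myfile" paths are the placeholders used in this post, not real locations.

```python
import os


def find_data_file(candidates):
    """Return the first path in `candidates` that exists as a file.

    Raises FileNotFoundError only after every candidate has been checked.
    """
    candidates = list(candidates)
    for path in candidates:
        path = os.path.expanduser(path)  # turn "~" into the home directory
        if os.path.isfile(path):
            return path
    raise FileNotFoundError(
        "data file not found; looked in: " + ", ".join(candidates)
    )


# Per-user copy first, then system-wide, then next to the installed package,
# so a user's own file shadows the system install.
SEARCH_PATH = [
    "~/.mymod/myfile",
    "/etc/mymod/myfile",
    "/lib/site-packages/mymod/myfile",
]
```

At import time the module would call find_data_file(SEARCH_PATH) once and load whatever it returns.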

You don't say what this data file is or whether it is
subject to change or customization. If it is, then there is
a real justification for this approach, because an
individual user might want to shadow the system install with
his own version of the data.

That's pretty typical behavior for configuration files on
any Posix system.

Cheers,
Terry
--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

