Return a list of imported Python modules used in a script?

Question

I am writing a program that categorizes a list of Python files by which modules they import. As such, I need to scan a collection of .py files and return a list of the modules they import. As an example, if one of the files I scan has the following lines:

import os
import sys, gtk

I would like it to return:

["os", "sys", "gtk"]

I played with modulefinder and wrote:

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('testscript.py')

print 'Loaded modules:'
for name, mod in finder.modules.iteritems():
    print '%s ' % name,

But this returns more than just the modules used in the script. As an example, for a script which merely has:

import os
print os.getenv('USERNAME')

the ModuleFinder script returns these modules:

tokenize  heapq  __future__  copy_reg  sre_compile  _collections  cStringIO  _sre  functools  random  cPickle  __builtin__  subprocess  cmd  gc  __main__  operator  array  select  _heapq  _threading_local  abc  _bisect  posixpath  _random  os2emxpath  tempfile  errno  pprint  binascii  token  sre_constants  re  _abcoll  collections  ntpath  threading  opcode  _struct  _warnings  math  shlex  fcntl  genericpath  stat  string  warnings  UserDict  inspect  repr  struct  sys  pwd  imp  getopt  readline  copy  bdb  types  strop  _functools  keyword  thread  StringIO  bisect  pickle  signal  traceback  difflib  marshal  linecache  itertools  dummy_thread  posix  doctest  unittest  time  sre_parse  os  pdb  dis

...whereas I just want it to return 'os', as that was the module used in the script.

Can anyone help me achieve this?

UPDATE: I just want to clarify that I would like to do this without running the Python file being analyzed, just by scanning the code.

Recommended answer

IMO the best way to do this is to use the snakefood package (http://furius.ca/snakefood/). The author has done all of the required work: it finds not only the directly imported modules, but also uses the AST to parse the code for runtime dependencies that a more static analysis would miss.

Here is a command example to demonstrate:

sfood ./example.py | sfood-cluster > example.deps

That will generate a basic dependency file of each unique module. For even more detail, use:

sfood -r -i ./example.py | sfood-cluster > example.deps

Please NOTE: the AST chunks of the routine below were lifted from the snakefood source, which has this copyright: Copyright (C) 2001-2007 Martin Blais. All Rights Reserved.

To walk a directory tree and find all imports, you can also do this in code:

 import os
 import compiler
 from compiler.ast import Discard, Const
 from compiler.visitor import ASTVisitor

 def pyfiles(startPath):
     r = []
     d = os.path.abspath(startPath)
     if os.path.exists(d) and os.path.isdir(d):
         for root, dirs, files in os.walk(d):
             for f in files:
                 n, ext = os.path.splitext(f)
                 if ext == '.py':
                     # Use root, not d, so files in subdirectories get the right path.
                     r.append([root, f])
     return r

 class ImportVisitor(object):
     def __init__(self):
         self.modules = []
         self.recent = []
     def visitImport(self, node):
         self.accept_imports()
         self.recent.extend((x[0], None, x[1] or x[0], node.lineno, 0)
                            for x in node.names)
     def visitFrom(self, node):
         self.accept_imports()
         modname = node.modname
         if modname == '__future__':
             return # Ignore these.
         for name, as_ in node.names:
             if name == '*':
                 # We really don't know...
                 mod = (modname, None, None, node.lineno, node.level)
             else:
                 mod = (modname, name, as_ or name, node.lineno, node.level)
             self.recent.append(mod)
     def default(self, node):
         pragma = None
         if self.recent:
             if isinstance(node, Discard):
                 children = node.getChildren()
                 if len(children) == 1 and isinstance(children[0], Const):
                     const_node = children[0]
                     pragma = const_node.value
         self.accept_imports(pragma)
     def accept_imports(self, pragma=None):
         self.modules.extend((m, r, l, n, lvl, pragma)
                             for (m, r, l, n, lvl) in self.recent)
         self.recent = []
     def finalize(self):
         self.accept_imports()
         return self.modules

 class ImportWalker(ASTVisitor):
     def __init__(self, visitor):
         ASTVisitor.__init__(self)
         self._visitor = visitor
     def default(self, node, *args):
         self._visitor.default(node)
         ASTVisitor.default(self, node, *args) 

 def parse_python_source(fn):
     contents = open(fn, 'rU').read()
     ast = compiler.parse(contents)
     vis = ImportVisitor() 

     compiler.walk(ast, vis, ImportWalker(vis))
     return vis.finalize()

 for d, f in pyfiles('/Users/bear/temp/foobar'):
     print d, f
     print parse_python_source(os.path.join(d, f)) 
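
The routine above depends on the compiler package, which was removed in Python 3. As a rough Python 3 counterpart, here is a minimal sketch that uses the standard-library ast module to list the modules a piece of source imports, without running it. The function name imported_modules and its top_level_only parameter are my own (hypothetical) names, not part of snakefood or of the code above, and the choice to skip relative and __future__ imports is an assumption about what the asker wants.

```python
import ast


def imported_modules(source, top_level_only=True):
    """Return the module names imported by `source`, in source order.

    `source` is Python source text; it is parsed, never executed.
    Relative imports and `from __future__ import ...` are skipped
    (an assumption for this sketch).
    """
    tree = ast.parse(source)

    # Collect every import statement, including ones nested in
    # functions or conditionals, and sort them by position.
    nodes = sorted(
        (n for n in ast.walk(tree) if isinstance(n, (ast.Import, ast.ImportFrom))),
        key=lambda n: (n.lineno, n.col_offset),
    )

    found = []
    for node in nodes:
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif node.level or node.module == "__future__":
            continue  # relative import or __future__: skip
        else:
            names = [node.module]

        for name in names:
            if top_level_only:
                name = name.split(".")[0]  # "os.path" -> "os"
            if name not in found:
                found.append(name)
    return found


if __name__ == "__main__":
    print(imported_modules("import os\nimport sys, gtk\n"))
    # prints ['os', 'sys', 'gtk']
```

To use it on files, read each one and pass its text in, for example imported_modules(open(path).read()). Because ast.parse uses the grammar of the interpreter running it, files written with Python 2-only syntax (such as the print statement) will raise SyntaxError under Python 3. Like any scan of import statements, this will also miss modules loaded dynamically through __import__ or importlib.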
