Interact with other programs using Python


Question

I have an idea for a Python program that finds the lyrics of a song whose name I provide. I think the whole process boils down to the steps below. This is what I want the program to do when I run it:

  • prompt me to enter a name of a song
  • copy that name
  • open a web browser (google chrome for example)
  • paste that name in the address bar and find information about the song
  • open a page that contains the lyrics
  • copy the lyrics
  • run a text editor (like Microsoft Word for instance)
  • paste the lyrics
  • save the new text file with the name of the song

I am not asking for code, of course. I just want to know the concepts or ideas behind using Python to interact with other programs.

To be more specific, I want to know, for example, how we point out where the address bar is in Google Chrome and tell Python to paste the name there. Or how we tell Python to copy the lyrics, paste them into a Microsoft Word document, and then save it.

I've been reading (and am still reading) several books on Python: A Byte of Python, Learn Python the Hard Way, Python for Dummies, and Beginning Game Development with Python and Pygame. However, it seems I have only (or almost only) learned to create programs that work on their own. I can't tell my program to do what I want with other programs that are already installed on my computer.

I know my question may sound rather silly, but I really want to know how it works: how we tell Python to recognize that this part of the Google Chrome browser is the address bar and that it should paste the name of the song into it. The whole idea of making Python interact with another program is really vague to me, and I very much want to grasp it.

Thank you to everyone who takes the time to read such a long question.

ttriet204

Answer

If what you're really looking into is a good excuse to teach yourself how to interact with other apps, this may not be the best one. Web browsers are messy, the timing is going to be unpredictable, etc. So, you've taken on a very hard task—and one that would be very easy if you did it the usual way (talk to the server directly, create the text file directly, etc., all without touching any other programs).
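For comparison, here is a minimal sketch of that "usual way": build a request URL from the song name, download the response directly, and write a text file, with no browser or word processor involved. The endpoint `lyrics.example.com` and all function names are hypothetical placeholders, not a real service or API. A real site would return HTML or JSON that still needs site-specific parsing to pull out the lyrics.

```python
import re
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical endpoint, used only for illustration; substitute a real lyrics service.
LYRICS_API = "https://lyrics.example.com/search"


def build_query_url(song_name, base_url=LYRICS_API):
    """Encode the song name as a query parameter, much as a browser would."""
    return base_url + "?" + urllib.parse.urlencode({"q": song_name})


def fetch_text(url, timeout=10):
    """Download a page and decode it as UTF-8 text, without opening a browser."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def safe_filename(song_name):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", song_name).strip()
    return (cleaned or "untitled") + ".txt"


def save_lyrics(song_name, lyrics, folder="."):
    """Write the lyrics to '<song name>.txt' in folder and return the path."""
    path = Path(folder) / safe_filename(song_name)
    path.write_text(lyrics, encoding="utf-8")
    return path
```

A full script would call `input()` for the song name, pass `build_query_url(name)` to `fetch_text`, extract the lyrics from the response, and hand them to `save_lyrics`.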

But if you do want to interact with other apps, there are a variety of different approaches, and which is appropriate depends on the kinds of apps you need to deal with.

  • Some apps are designed to be automated from the outside. On Windows, this nearly always means they have a COM interface, usually with an IDispatch interface, for which you can use pywin32's COM wrappers; on Mac, it means an AppleEvent interface, for which you use ScriptingBridge or appscript; on other platforms there is no universal standard. IE (but probably not Chrome) and Word both have such interfaces.

  • Some apps have a non-GUI interface—whether that's a command line you can drive with popen, or a DLL/SO/DYLIB you can load up through ctypes. Or, ideally, someone else has already written Python bindings for you.

  • Some apps have nothing but the GUI, and there's no way around doing GUI automation. You can do this at a low level, by crafting WM_ messages to send via pywin32 on Windows, using the accessibility APIs on Mac, etc., or at a somewhat higher level with libraries like pywinauto, or possibly at the very high level of selenium or similar tools built to automate specific apps.

So, you could do this with anything from selenium for Chrome and COM automation for Word, to crafting all the WM_ messages yourself. If this is meant to be a learning exercise, the question is which of those things you want to learn today.
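As a rough illustration of the second option, driving a program through its command line, the sketch below uses the standard library's `subprocess` module, the modern counterpart to `popen`. The "other program" here is just a child Python interpreter that upper-cases its input, chosen as a stand-in so the example runs anywhere; `run_tool` and `UPPERCASE_TOOL` are names made up for this sketch.

```python
import subprocess
import sys


def run_tool(args, input_text=None):
    """Run a command-line program, feed it input_text on stdin, return its stdout."""
    result = subprocess.run(
        args,
        input=input_text,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # exchange str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Stand-in for "another program": a child interpreter that upper-cases stdin.
UPPERCASE_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

shouted = run_tool(UPPERCASE_TOOL, input_text="hey jude\n")
```

The same pattern applies to any tool that reads arguments or standard input and writes its result to standard output; only the `args` list changes.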


Let's start with COM automation. Using pywin32, you directly access the application's own scripting interfaces, without having to take control of the GUI from the user, figure out how to navigate menus and dialog boxes, etc. This is the modern version of writing "Word macros"—the macros can be external scripts instead of inside Word, and they don't have to be written in VB, but they look pretty similar. The last part of your script would look something like this:

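import win32com.client  # part of the pywin32 package

# my_string holds the lyrics obtained earlier in the script.
word = win32com.client.Dispatch('Word.Application')
word.Visible = True
doc = word.Documents.Add()
word.Selection.TypeText(my_string)  # Selection belongs to the application, not the document
doc.SaveAs(r'C:\TestFiles\TestDoc.doc')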

If you look at Microsoft Word Scripts, you can see a bunch of examples. However, you may notice they're written in VBScript. And if you look around for tutorials, they're all written for VBScript (or older VB). And the documentation for most apps is written for VBScript (or VB, .NET, or even low-level COM). And all of the tutorials I know of for using COM automation from Python, like Quick Start to Client Side COM and Python, are written for people who already know about COM automation and just want to know how to do it from Python. The fact that Microsoft keeps changing the name of everything makes it even harder to search for: how would you guess that googling for OLE automation, ActiveX scripting, Windows Script Host, etc. would have anything to do with learning about COM automation? So, I'm not sure what to recommend for getting started. I can promise that it's all as simple as it looks from the example above, once you do learn all the nonsense, but I don't know how to get past that initial hurdle.

Anyway, not every application is automatable. And sometimes, even if it is, describing the GUI actions (what a user would click on the screen) is simpler than thinking in terms of the app's object model. "Select the third paragraph" is hard to describe in GUI terms, but "select the whole document" is easy—just hit control-A, or go to the Edit menu and Select All. GUI automation is much harder than COM automation, because you either have to send the app the same messages that Windows itself sends to represent your user actions (e.g., see "Menu Notifications") or, worse, craft mouse messages like "go (32, 4) pixels from the top-left corner, click, mouse down 16 pixels, click again" to say "open the File menu, then click New".

Fortunately, there are tools like pywinauto that wrap up both kinds of GUI automation to make it a lot simpler. And there are tools like swapy that can help you figure out what commands you want to send. If you're not wedded to Python, there are also tools like AutoIt and Actions that are even easier than using swapy and pywinauto, at least when you're getting started. Going this way, the last part of your script might look like:

# Illustrative only: these method names sketch the idea and are not the actual pywinauto API.
word.Activate()
word.MenuSelect('File->New')
word.KeyStrokes(my_string)
word.MenuSelect('File->Save As')
word.Dialogs[-1].FindTextField('Filename').Select()
word.KeyStrokes(r'C:\TestFiles\TestDoc.doc')
word.Dialogs[-1].FindButton('OK').Click()

Finally, even with all of these tools, web browsers are very hard to automate, because each web page has its own menus, buttons, etc. that aren't Windows controls, but HTML. Unless you want to go all the way down to the level of "move the mouse 12 pixels", it's very hard to deal with these. That's where selenium comes in—it scripts web GUIs the same way that pywinauto scripts Windows GUIs.
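To give a feel for what that looks like, here is a minimal sketch assuming selenium 4 with Chrome installed. The URL and the `div.lyrics` selector are hypothetical; a real page needs its own selector, found by inspecting its HTML. The helper takes the driver as an argument so that it can be exercised without launching a browser.

```python
def lyrics_from_page(driver, url, css_selector):
    """Load url in the browser behind driver and return the text of the
    first element matching css_selector."""
    driver.get(url)
    # "css selector" is the value of selenium's By.CSS_SELECTOR constant.
    element = driver.find_element("css selector", css_selector)
    return element.text


def main():
    # Imported here so the helper above can be used without selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome and selenium 4 are available
    try:
        # Hypothetical URL and selector, for illustration only.
        print(lyrics_from_page(driver, "https://lyrics.example.com/hey-jude", "div.lyrics"))
    finally:
        driver.quit()  # always close the browser, even if the lookup fails
```

Calling `main()` opens a Chrome window, loads the page, and prints the text of the matched element. Notice that nothing here mentions pixels or the address bar: selenium addresses page elements by their place in the HTML.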
