Using Mac's Dictation Inside Python


Question

Does anyone have any ideas on how to use the Mac's built-in Dictation tool to produce strings that a Python program can use?

To launch Dictation, you double-press the Fn key inside any text editor. Given that, is there a way to combine the keystroke command with the input command? Something like:

Step 1: Simulate a keystroke that double-presses the Fn key, launching the Dictation tool.
Step 2: Create a variable from the speech-to-text content via the input function, i.e. text_string = input("Start dictation: ")

In this thread (Can I use OS X 10.8's speech recognition/dictation without a GUI?) a user suggests he figured it out with CGEventCreateKeyboardEvent(src, 0x3F, true), but gives no code.

Any ideas? Code samples would be appreciated.

UPDATE: Thanks to the suggestions below, I've imported appscript. I'm trying to get code along these lines to work, with no success:

from appscript import app, its
se = app('System Events')
proc = app.processes[its.frontmost == True]
mi = proc.menu_bars[1].menu_bar_items['Edit'].menus[1].menu_items['Start Dictation']
user_voice_text = input(mi.click())
print(user_voice_text)

Any ideas on how I can turn on the Dictation tool so that its output becomes the input for a string?

UPDATE 2:

Here is a simple example of the program I'm trying to create:

Ideally, I want to launch the program and have it ask me: "What is 1 + 1?"
Then I want the program to turn on the Dictation tool and record my voice as I answer "two".
The dictation-to-text function then passes the string value "two" to my program, and an if statement says back "correct" or "incorrect".

I'm trying to pass commands to the program without ever typing on the keyboard.

Recommended answer

First, FnFn dictation is a feature of the NSText (or maybe NSTextView?) Cocoa control. If you've got one of those, the dictated text gets inserted into that control. (It also uses that control's existing text for context.) From the point of view of an app using an NSTextView, if you just create a standard Edit menu, the Start Dictation item gets added to the end with FnFn as its shortcut, and anything dictated arrives as input, just like text typed on a keyboard, pasted or dragged with the mouse, or entered via any other input method.

So, if you don't have a GUI app, enabling dictation is going to be pointless, because you have no way to receive the input.

If you do have a GUI app, the simplest thing to do is get the menu item via NSMenu and click it.

You're almost certainly using some kind of GUI library, like PyQt or Tkinter, which has its own way of accessing your app's menu. But if not, you can do it directly through Cocoa using PyObjC (which comes with Apple's pre-installed Python, but which you'll have to pip install if you're using a third-party Python):

import AppKit

# Look up the running app's Edit menu and trigger its Start Dictation item.
mb = AppKit.NSApp.mainMenu()
edit = mb.itemWithTitle_('Edit').submenu()
sd = edit.indexOfItemWithTitle_('Start Dictation')
edit.performActionForItemAtIndex_(sd)
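
The snippet above assumes that both the Edit menu and the Start Dictation item exist. A slightly more defensive sketch follows. The helper name click_menu_item is hypothetical (it is not part of PyObjC), and it relies on NSMenu returning None from itemWithTitle_ and -1 from indexOfItemWithTitle_ when nothing matches. The menu is passed in as an argument, so the logic can be exercised without a running Cocoa app.

```python
def click_menu_item(main_menu, menu_title, item_title):
    """Trigger item_title inside menu_title; return True if it was found.

    main_menu is expected to behave like an NSMenu, for example the
    object returned by AppKit.NSApp.mainMenu(). This helper is a
    hypothetical convenience wrapper, not a PyObjC API.
    """
    if main_menu is None:
        return False  # the process has no menu bar (e.g. not a GUI app)

    top_item = main_menu.itemWithTitle_(menu_title)
    if top_item is None:
        return False  # no top-level menu with that title

    submenu = top_item.submenu()
    if submenu is None:
        return False

    index = submenu.indexOfItemWithTitle_(item_title)
    if index < 0:
        return False  # NSMenu reports -1 when the title is not found

    submenu.performActionForItemAtIndex_(index)
    return True
```

Inside a PyObjC GUI app this would presumably be called as click_menu_item(AppKit.NSApp.mainMenu(), 'Edit', 'Start Dictation').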


But if you're writing a console program that runs in a terminal (whether Terminal.app or an alternative like iTerm), the app you're running under has its own text widget and Edit menu, and you can parasitically use its menu instead.
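
Because dictated text reaches the terminal like typed text, a console program can read it with plain input(). Here is a minimal sketch of the quiz from the question under that assumption. The names normalize, check_answer and ask are hypothetical, and start_dictation stands in for whichever trigger ends up being used. input() only returns once the line is ended, so the user would presumably still need to finish with Return.

```python
def normalize(answer):
    """Lower-case the text and drop surrounding spaces and a trailing period."""
    return answer.strip().lower().rstrip(".")


def check_answer(spoken, accepted):
    """Return True if the dictated text matches any accepted answer."""
    return normalize(spoken) in {normalize(a) for a in accepted}


def ask(question, accepted, start_dictation=lambda: None, read=input):
    """Print a question, trigger dictation, and grade the reply.

    start_dictation is a placeholder callback (it does nothing by default);
    read defaults to the built-in input(), which receives dictated text the
    same way it receives typed text.
    """
    print(question)
    start_dictation()
    reply = read("> ")
    return "correct" if check_answer(reply, accepted) else "incorrect"
```

A call such as ask("What is 1 + 1?", ["two", "2"]) would then wait for the dictated (or typed) line and return the verdict. Accepting both "two" and "2" is a design choice, since it is not specified here how Dictation transcribes numbers.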

The problem is that you don't have permission to control other apps unless the user allows it. In older versions of OS X, this was done by turning on "assistive scripting for accessibility" globally. As of 10.10, there's an Accessibility anchor in the Privacy tab of the Security & Privacy pane of System Preferences, holding a list of apps that have been granted permission. Fortunately, if you're not on the list, the first time you try to use accessibility features a dialog pops up; if the user clicks through it, it launches System Preferences, reveals that anchor, adds your app to the list with the checkbox unchecked, and scrolls it into view, so all the user has to do is click the checkbox.

The AppleScript to do this is:

tell application "System Events"
    click (menu item "Start Dictation" of menu 1 of menu bar item "Edit" ¬
        of menu bar 1 of (first process whose frontmost is true))
end tell

The "right" way to do the equivalent in Python is via ScriptingBridge, which you can access via PyObjC… but it's a lot easier to use the third-party library appscript:

from appscript import app, its

se = app('System Events')
# Query System Events (se), not the app class, for the frontmost process.
proc = se.processes[its.frontmost == True]
mi = proc.menu_bars[1].menu_bar_items['Edit'].menus[1].menu_items['Start Dictation']
mi.click()
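
If appscript is not installed, another option (not part of the original answer) is to hand the AppleScript to the macOS osascript command-line tool through subprocess. The sketch below assumes a one-line form of the script with explicit menu indexes. The names osascript_command and start_dictation are hypothetical, and the same Accessibility permission is presumably still required.

```python
import subprocess

# One-line form of the System Events script; the indexes (menu 1,
# menu bar 1) are an assumption mirroring the appscript version.
START_DICTATION_SCRIPT = '''
tell application "System Events"
    click (menu item "Start Dictation" of menu 1 of menu bar item "Edit" of menu bar 1 of (first process whose frontmost is true))
end tell
'''


def osascript_command(script):
    """Build the argument list that runs an AppleScript source string."""
    return ["osascript", "-e", script]


def start_dictation(run=subprocess.run):
    """Ask System Events to click Start Dictation; True if osascript exits 0.

    run is injectable so the function can be tried without macOS.
    """
    result = run(
        osascript_command(START_DICTATION_SCRIPT),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Passing the command as a list avoids shell quoting problems with the quotes inside the script.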


If you really want to send the Fn key twice, the APIs for generating and sending keyboard events are part of Quartz Event Services, which (even though it's a CoreFoundation C API, not a Cocoa ObjC API) is also wrapped by PyObjC. The documentation can be a bit tricky to understand, but the basic idea is that you create an event of the appropriate type, then post it to a specific application, an event tap, or a tap location. So, you can create and send a system-wide key-down Fn-key event like this:

import Quartz
# 63 (0x3F) is the virtual key code for Fn; True means key-down.
evt = Quartz.CGEventCreateKeyboardEvent(None, 63, True)
Quartz.CGEventPost(Quartz.kCGSessionEventTap, evt)

To send a key-up event, just change that True to False.
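
Putting the key-down and key-up calls together, a double press could be sketched as below. The function names and the pause length are assumptions, and nothing here confirms that synthetic Fn events actually start Dictation, so treat this as an experiment rather than a known-good recipe. Quartz is imported inside the posting function so the sequence builder can be used on any platform.

```python
import time

FN_KEY_CODE = 63  # 0x3F, the virtual key code used in the answer above


def fn_double_press_events(key_code=FN_KEY_CODE):
    """Return (key_code, is_key_down) pairs for two press/release cycles."""
    return [(key_code, True), (key_code, False)] * 2


def post_fn_double_press(pause=0.05):
    """Post the double-press sequence to the session event tap (macOS only).

    pause is a guessed delay between events, not a documented requirement.
    """
    import Quartz  # provided by PyObjC; only available on macOS

    for key_code, is_down in fn_double_press_events():
        evt = Quartz.CGEventCreateKeyboardEvent(None, key_code, is_down)
        Quartz.CGEventPost(Quartz.kCGSessionEventTap, evt)
        time.sleep(pause)
```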
