Rendering text question (context is MSWin UI Automation)


Problem description


Hello,

I am trying to use UI Automation to drive an MS Windows app (with pywinauto).

I need to scrape the app's window contents and use some form of OCR to get at
the texts (pywinauto can't get at them).

As an alternative to integrating an OCR engine, and since I know the fonts and
sizes used to write on the app's windows, I reasoned that I could base a simple
text recognition module on the capability to drive MS Windows text rendering, e.g.
to generate pixmaps of texts I expect to find in the driven app's windows, exact
to the pixel.

The advantage of that approach would be exactitude and self-containment.
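
The matching half of this idea can be sketched in a few lines. This is only an illustration, assuming both the screen capture and the rendered text have already been converted to plain lists of pixel rows (each row a list of pixel values); the name `find_pixmap` and the tiny sample data are made up for the example and do not come from any library.

```python
def find_pixmap(screen, template):
    """Return every (x, y) where `template` occurs in `screen`, pixel for pixel.

    Both arguments are lists of rows; each row is a list of pixel values.
    """
    t_height, t_width = len(template), len(template[0])
    s_height, s_width = len(screen), len(screen[0])
    hits = []
    for y in range(s_height - t_height + 1):
        for x in range(s_width - t_width + 1):
            # Compare the template row by row against the same-sized window.
            if all(screen[y + dy][x:x + t_width] == template[dy]
                   for dy in range(t_height)):
                hits.append((x, y))
    return hits


# Made-up 5x4 "screen" containing one copy of a 2x2 "rendered text" pixmap.
screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [1, 1],
    [1, 0],
]
print(find_pixmap(screen, template))
```

Because the comparison is exact, a single differing pixel (antialiasing, a different font hinting mode) makes the match fail, which is why the rendering has to go through the same engine as the target app.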

I've verified manually inside an IDLE window that I could indeed produce
pixmaps of expected app texts, exact to the pixel (with Tkinter + screen capture
at least).

I could use help turning this into a programmable capability, i.e. a simple
way (with Tkinter or otherwise) to wrap access to the MS Windows UI text
rendering engine, as a function that would return a picture of rendered text,
given a string, a font, a size and colors?

And ideally, without interfering with screen contents?

Thanks in advance for any guidance,

Boris Borcic

Solution

On 1/23/07, Boris Borcic <bb*****@gmail.com> wrote:
> I could use help to turn this into a programmable capability, ie : A simple -
> with Tkinter or otherwise - way to wrap access to the MS Windows UI text
> rendering engine, as a function that would return a picture of rendered text,
> given a string, a font, a size and colors ?
>
> And ideally, without interfering with screen contents ?

There are actually several different text rendering methods (and 2 or
more totally different engines) and they will give different results,
so a fully generic solution could be quite difficult.
However, it sounds like this is for a specific purpose.

Using the pywin32 modules to directly access the appropriate Windows
API calls will be the most accurate. It will be fairly complicated and
you'll need to know the Win32 API to do it. You could also use
wxPython, which uses what will probably be the right API and will take
less code than win32 will. I'd suggest this if you aren't familiar
with the Win32 API.

PyQt uses its own text rendering engine, as far as I know, so it is
less likely to generate correct bitmaps. I'm not sure at what level
Tkinter's text drawing is done.

Using either win32 or wxPython you will be able to produce bitmaps
directly, without needing to create a visible window.
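Some quick & dirty wxPython code:

    import wx

    def getTextBitmap(text, font, fgcolor, bgcolor):
        # A wx.App must already exist before fonts or DCs are created.
        dc = wx.MemoryDC()
        dc.SetFont(font)
        # Measure the text so the bitmap is exactly as large as needed.
        width, height = dc.GetTextExtent(text)
        # wx.EmptyBitmap is the classic wxPython name;
        # wxPython 4 spells this wx.Bitmap(width, height).
        bmp = wx.EmptyBitmap(width, height)
        dc.SelectObject(bmp)
        # Fill the whole bitmap with the background color.
        dc.SetBackground(wx.Brush(bgcolor))
        dc.Clear()
        dc.SetTextBackground(bgcolor)
        dc.SetTextForeground(fgcolor)
        dc.DrawText(text, 0, 0)
        # Deselect the bitmap so it can be used outside the DC.
        dc.SelectObject(wx.NullBitmap)
        return bmp

Raw win32 code will look similar but will be much more verbose.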



I was looking for (and am still searching for) similar functionality.
Specifically I would like to be able to capture a small area of the
screen (a number or a code) and convert this to text that can be used
in my application.
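
For a fixed screen area like this, one possible shape for the lookup is sketched below. It assumes the capture is a list of pixel rows and that a pixmap has been rendered beforehand for every string that might appear; `crop`, `recognize` and the two-pixel "glyphs" are hypothetical names and data invented for the illustration.

```python
def crop(screen, left, top, width, height):
    """Cut a width x height region out of a list-of-rows pixmap."""
    return [row[left:left + width] for row in screen[top:top + height]]


def recognize(screen, box, rendered):
    """Return the candidate text whose pixmap equals the boxed region, or None.

    `box` is (left, top, width, height); `rendered` maps each candidate
    string to its pre-rendered pixmap (a list of pixel rows).
    """
    region = crop(screen, *box)
    for text, pixmap in rendered.items():
        # Exact comparison: candidates of a different size never match.
        if pixmap == region:
            return text
    return None


# Made-up 2x2 "renderings" of two candidate strings.
rendered = {
    "1": [[0, 1], [0, 1]],
    "7": [[1, 1], [0, 1]],
}
# Made-up capture with a border of 9s around the area of interest.
capture = [
    [9, 9, 9, 9],
    [9, 1, 1, 9],
    [9, 0, 1, 9],
    [9, 9, 9, 9],
]
print(recognize(capture, (1, 1, 2, 2), rendered))
```

This only works when the set of possible texts is small enough to render in advance and the box position is known exactly.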

When I asked my question, I was directed to the Microsoft Accessibility
tool kit.
Search this list for the post titled
"Reading text labels from a Win32 window"

I work with wxPython and Win32 applications exclusively.

So if I can be of any help or assistance, please let me know.

Geoff.


On 23 Jan 2007 12:06:35 -0800, imageguy <im**********@gmail.com> wrote:
> When I asked my question, I was directed to the Microsoft Accessibility
> tool kit.

The OP stated that pywinauto couldn't get at the text, so it's
probably drawn directly with GDI methods rather than being a static
text control. The accessibility toolkit only works if it's a static
text control or the application goes to some lengths to expose the
text to screen readers.

