How to capture and process live activity from another application in Python?


Problem description


I'm a computer science student, and as a personal project, I'm interested in building software that can watch and produce useful information about Super Nintendo games being run on a local emulator. This might be things like current health, current score, etc. (anything legible on the screen). The emulator runs in windowed form (I'm using SNES9x), so I wouldn't need to capture every pixel on the screen, and I'd only have to capture at about 30 fps.

I've looked into some libraries like FFMPEG and OpenCV, but so far what I've seen leads me to believe I would need pre-recorded renderings of the game.

At some point, I'd like to explore the capacity for developing a somewhat heuristic AI that might be able to play Super Metroid, but to do so, it would need to be interpreting live gameplay. The algorithms and data structures needed for something like this are within my realms of study; video processing is not, and I'm something of a noob. Any pointers would be awesome (pardon the lame computer science pun).

For those who might point out that it would be simpler to scrape the game memory rather than screen grab data -- yes, it would be. My interest is in developing something that is only given the information a human player would have, i.e., the visuals on the screen, so this is the approach I'm interested in for the time being. Thanks!

Solution

Yes, Python can grab & process any scene coming from a USB input device.

The design issues in real-time image ( not stream ... ) processing concern the overall RT-loop performance, mainly the image transformations and processing, rather than just the static image size or the acquisition method per se.

Anyway, your code has to be carefully designed and pre-measured in [usec, nsec] ( yes, there are Python tools available that let you benchmark your code's timing down to about 25-nsec resolution ) so as to keep the whole RT-loop feasible within your general image-processing architecture. You will also struggle with resource management and error handling, both of which cause a lot of problems in RT scheduling.
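As a minimal sketch of such pre-measurement ( assuming Python 3.7+, whose time.perf_counter_ns() delivers nanosecond-resolution timestamps; the actual granularity is platform dependent ), one can weigh a candidate per-frame transformation before committing to a target FPS:

import time
import cv2
import numpy as np

def time_transform_ns( aFrame, n_runs = 100 ):
    # average per-frame cost of a candidate transformation in [ns],
    # using a plain BGR -> GRAY conversion as a stand-in workload
    t0 = time.perf_counter_ns()
    for _ in range( n_runs ):
        _ = cv2.cvtColor( aFrame, cv2.COLOR_BGR2GRAY )
    return ( time.perf_counter_ns() - t0 ) // n_runs

aTestFRAME = np.zeros( ( 224, 256, 3 ), dtype = np.uint8 )      # SNES-sized 256 x 224 test frame
print( "avg per-frame cost:", time_transform_ns( aTestFRAME ), "[ns]" )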

How? Take this as an inspiration to start from

A sample, taken from a medical-imaging PoC Python prototype, just to give an initial image-capture idea:

import cv2

clicked = False                                     # set by the mouse callback to stop the loop

def onMouse( event, x, y, flags, param ):           # minimal callback: left-click ends the demo
    global clicked
    if event == cv2.EVENT_LBUTTONUP:
        clicked = True

# reprocessIntoFalseCOLORs(), aFalseCLUT and aFalseCLUT_T are false-colour CLUT
# helpers from the original PoC prototype, assumed to be defined elsewhere

def demoCLUT():
    cameraCapture = cv2.VideoCapture( 0 )

    cv2.namedWindow(        'msLIB:ComputerVision.IN' )
    cv2.setMouseCallback(   'msLIB:ComputerVision.IN', onMouse )

    cv2.namedWindow(        'msLIB:ComputerVision.OUT-0' )
    cv2.namedWindow(        'msLIB:ComputerVision.OUT-1' )
    cv2.namedWindow(        'msLIB:ComputerVision.OUT-2' )

    success, frame = cameraCapture.read()

    if success:
        while success and cv2.waitKey( 10 ) == -1 and not clicked:          # [msec]

            aGrayFRAME = cv2.cvtColor( frame, cv2.COLOR_BGR2GRAY )

            cv2.imshow( 'msLIB:ComputerVision.IN',                                     frame )
            cv2.imshow( 'msLIB:ComputerVision.OUT-0',                             aGrayFRAME )
            cv2.imshow( 'msLIB:ComputerVision.OUT-1',   reprocessIntoFalseCOLORs( aGrayFRAME, frame, aFalseCLUT   ) )    # <frame>-destructive
            cv2.imshow( 'msLIB:ComputerVision.OUT-2',   reprocessIntoFalseCOLORs( aGrayFRAME, frame, aFalseCLUT_T ) )    # <frame>-destructive

            success, frame = cameraCapture.read()
    else:
        print( "OpenCV.CLUT.DEMO: cameraCapture.read() failed to serve a success/frame ... " )

    # ------------------------------------------------------------------<RELEASE-a-Resource>
    cameraCapture.release()                                             # release the capture device
    cv2.destroyAllWindows()                                             # tidy up the display windows
    # ------------------------------------------------------------------<RELEASE-a-Resource>
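To feed the same kind of loop with the SNES9x window instead of a camera, one possibility ( an assumption of mine, not part of the original answer ) is the third-party mss screen-capture library, which grabs an arbitrary screen region fast enough for ~30 FPS; the region coordinates below are hypothetical placeholders for wherever the emulator window actually sits:

import cv2
import numpy as np
import mss                                          # third-party: pip install mss

# hypothetical region of the SNES9x window; adjust to the real window position
aSNES9xREGION = { 'top': 100, 'left': 100, 'width': 512, 'height': 448 }

def grabEmulatorFrame( aScreenGrabber, aRegion ):
    # return a BGR frame compatible with the cv2-based loop above
    aRawGRAB = aScreenGrabber.grab( aRegion )       # BGRA screenshot of the region
    return cv2.cvtColor( np.array( aRawGRAB ), cv2.COLOR_BGRA2BGR )

with mss.mss() as aScreenGrabber:
    while cv2.waitKey( 10 ) == -1:                  # press any key to stop
        frame = grabEmulatorFrame( aScreenGrabber, aSNES9xREGION )
        cv2.imshow( 'SNES9x.capture', frame )
cv2.destroyAllWindows()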

Are pre-recorded sequences a must or a nice-to-have?

As far as your motivation goes, your prototype will take a lot of development time. Pre-recorded sequences may help you focus on the dev/test side, so that your attention is not split in half between playing the game and writing the Python code; however, they are not a must-have.
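Swapping a recorded session in for the live source is a one-line change in OpenCV ( the file name below is a hypothetical placeholder ):

import cv2

cameraCapture = cv2.VideoCapture( 'recorded_gameplay.avi' )     # hypothetical pre-recorded session
success, frame = cameraCapture.read()
while success:
    # ... the same per-frame processing as in the live loop above ...
    success, frame = cameraCapture.read()
cameraCapture.release()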

A remark on FPS: you are building the AI against a human player

Having said that, your initial AI engine may start at something as low as 10-15 FPS; there is no need to get yourself into an unsolvable RT-loop puzzle just because of an artificially high FPS rate.
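A minimal sketch of pacing the RT-loop at such a budget ( assuming a 15 FPS target; processFrame() is a hypothetical placeholder for the per-frame work ):

import time

TARGET_FPS   = 15
FRAME_BUDGET = 1.0 / TARGET_FPS                     # [sec] available per RT-loop turn

for _ in range( 150 ):                              # ~10 seconds worth of frames
    t0 = time.perf_counter()
    # processFrame( frame )                         # hypothetical per-frame work goes here
    aSpareTIME = FRAME_BUDGET - ( time.perf_counter() - t0 )
    if aSpareTIME > 0:
        time.sleep( aSpareTIME )                    # idle out the rest of the frame budget
    # else: over budget -> the RT-loop is not feasible at this FPS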

Our human eye/brain tandem gets an illusion of motion somewhere near the TV refresh rate ( meaning the original analog TV, where about 21 half-screens per second were enough for people for many decades; not so for dogs, which is why marketing companies focused on influencing humans, measuring their advertising campaigns' impact with people-meters rather than dog-meters, as our best friends did not like at all to watch that strange flickering static on TV screens ).

So do not over-design the AI engine to be developed; it shall aim at beating human players, not canine ones, shan't it?
