Augmented reality like Zookazam

This article looks at how to approach augmented reality like Zookazam; it should serve as a useful reference for anyone working on a similar problem.

Problem description


  1. What algorithms are used for augmented reality like Zookazam?



    (example image: https://i.stack.imgur.com/rMfpi.jpg)



    I think it analyzes the image and finds planes by contrast, but I don't know how.


  2. What topics should I read before starting on an app like this?



Solution

[Prologue]



This is an extremely broad topic and is mostly off-topic in its current state. I re-edited your question to make it answerable within the rules/possibilities of this site.



You should specify more closely what your augmented reality:


  1. should do




    • add 2D/3D objects with known meshes ...

    • change lighting conditions

    • add/remove body parts/clothes/hair ...



    It is a good idea to provide some example images (sketches) of the input/output you want to achieve.


  2. what input it has




    • video, static image, 2D, stereo, 3D. For pure 2D input, specify what conditions/markers/illumination/laser patterns you have to help the reconstruction.

    • What will be in the input image? An empty room, people, specific objects, etc.


  3. specify the target platform



    Many algorithms are limited by memory size/bandwidth, CPU power, special HW capabilities, etc., so it is a good idea to add a tag for your platform. The OS and language are also worth mentioning.


[How augmented reality works]


  1. acquire the input image



    If you are connecting to some device like a camera, you need to use its driver/framework, or some common API it supports, to obtain the image. This task is OS dependent. My favorite way on Windows is to use the VFW (Video for Windows) API.



    I would start with some static file(s) instead, to ease the debugging and incremental build process (you do not need to wait for the camera and related setup on each build). Then, when your app is ready for live video, switch back to the camera...
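    A minimal sketch of that workflow, assuming OpenCV as the capture/IO library (my own choice for illustration, not something the answer prescribes); the file path and camera index are placeholders:

      #include <opencv2/opencv.hpp>   // assumed library for image IO / camera capture
      #include <iostream>

      int main(int argc, char** argv)
      {
          cv::Mat frame;
          if (argc > 1)                                  // debug mode: load a static test image
          {
              frame = cv::imread(argv[1]);
              if (frame.empty()) { std::cerr << "cannot load " << argv[1] << "\n"; return 1; }
          }
          else                                           // live mode: grab one frame from the first camera
          {
              cv::VideoCapture cap(0);
              if (!cap.isOpened()) { std::cerr << "no camera found\n"; return 1; }
              cap >> frame;
          }
          cv::imshow("input", frame);                    // this is what the rest of the pipeline would receive
          cv::waitKey(0);
          return 0;
      }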


  2. reconstruct the scene into a 3D mesh

    If you use a 3D camera like Kinect, this step is not necessary. Otherwise you usually need to distinguish the objects by some segmentation process, typically based on edge detection or color homogeneity.
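    As one illustration of such a segmentation step, here is a minimal edge-detection sketch (again assuming OpenCV; the input file name and Canny thresholds are arbitrary placeholders):

      #include <opencv2/opencv.hpp>

      // Edge-based segmentation sketch: edges roughly outline object boundaries,
      // which a later step could group into regions / mesh candidates.
      int main()
      {
          cv::Mat img = cv::imread("scene.jpg");             // placeholder input image
          if (img.empty()) return 1;
          cv::Mat gray, edges;
          cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);
          cv::GaussianBlur(gray, gray, cv::Size(5, 5), 1.5); // suppress noise before edge detection
          cv::Canny(gray, edges, 50, 150);                   // arbitrary low/high thresholds
          cv::imwrite("edges.png", edges);
          return 0;
      }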



    The quality of the 3D mesh depends on what you want to achieve and what your input is. For example, if you want realistic shadows and lighting, you need a very good mesh. If the camera is fixed in some room, you can predefine the mesh manually (hard-code it) and compute only the objects in view. Also, object detection/segmentation can be done very simply by subtracting the empty-room image from the current view image, so the pixels with a big difference are the objects.
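    A minimal sketch of that empty-room subtraction, assuming OpenCV and two hypothetical input files; the threshold value is an arbitrary placeholder:

      #include <opencv2/opencv.hpp>

      int main()
      {
          cv::Mat empty = cv::imread("empty_room.jpg");   // reference image of the room with no objects
          cv::Mat view  = cv::imread("current_view.jpg"); // current camera view
          if (empty.empty() || view.empty()) return 1;

          cv::Mat diff, gray, mask;
          cv::absdiff(view, empty, diff);                 // per-pixel difference
          cv::cvtColor(diff, gray, cv::COLOR_BGR2GRAY);
          cv::threshold(gray, mask, 40, 255, cv::THRESH_BINARY); // big difference -> object pixel
          cv::imwrite("object_mask.png", mask);
          return 0;
      }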



    You can also use planes instead of a real 3D mesh, as you suggested in the OP, but then you can forget about more realistic effects like lighting, shadows, intersections... If you assume the objects are standing upright, you can use room metrics to obtain their distance from the camera. See:
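    Separately, as my own illustration of such a metric estimate (not spelled out in the answer): under a simple pinhole-camera assumption, a known real-world height and its measured height in pixels give the distance. All numbers below are made-up placeholders:

      #include <iostream>

      // distance = focal_length_px * real_height_m / height_in_pixels
      int main()
      {
          double focal_length_px = 800.0;   // camera focal length expressed in pixels
          double real_height_m   = 1.75;    // assumed real height of the object (e.g. a person)
          double height_px       = 350.0;   // measured height of the object in the image
          double distance_m      = focal_length_px * real_height_m / height_px;
          std::cout << "estimated distance: " << distance_m << " m\n";   // -> 4 m
          return 0;
      }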





    For pure 2D input you can also use the illumination to estimate the 3D mesh, see:




  3. render



    Just render the scene back to some image/video/screen... with the added/removed features. If you are not changing the lighting conditions too much, you can also take the original image and render directly onto it. Shadows can be achieved by darkening the pixels... For better results, the illumination/shadows/spots/etc. are usually filtered out of the original image first and then added back directly by rendering. See
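    A minimal sketch of the "darken the pixels" idea, assuming OpenCV; the shadow mask file and the darkening factor are placeholders:

      #include <opencv2/opencv.hpp>

      int main()
      {
          cv::Mat img  = cv::imread("current_view.jpg");                        // original camera image
          cv::Mat mask = cv::imread("shadow_mask.png", cv::IMREAD_GRAYSCALE);   // where the virtual object casts shadow
          if (img.empty() || mask.empty()) return 1;

          cv::Mat shadowed = img.clone();
          cv::Mat darker;
          img.convertTo(darker, -1, 0.5, 0);   // scale intensities to ~50% brightness to fake a shadow
          darker.copyTo(shadowed, mask);       // apply the darkened pixels only where the mask is non-zero
          cv::imwrite("with_shadow.png", shadowed);
          return 0;
      }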





    The rendering process itself is also platform dependent (unless you do it yourself via low-level graphics in memory). You can use things like GDI, DX, OpenGL, ... See:





    You also need camera parameters for the rendering, like:
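    As an illustration of what such parameters feed into (my own sketch, not taken from the answer), an OpenGL-style perspective projection matrix can be built from the vertical field of view, aspect ratio, and near/far planes; the concrete values below are placeholders:

      #include <cmath>
      #include <cstdio>

      // Builds a column-major, OpenGL-style perspective projection matrix.
      void perspective(float m[16], float fov_y_deg, float aspect, float znear, float zfar)
      {
          float f = 1.0f / std::tan(fov_y_deg * 3.14159265f / 360.0f);   // cot(fov/2)
          for (int i = 0; i < 16; ++i) m[i] = 0.0f;
          m[0]  = f / aspect;
          m[5]  = f;
          m[10] = (zfar + znear) / (znear - zfar);
          m[11] = -1.0f;
          m[14] = (2.0f * zfar * znear) / (znear - zfar);
      }

      int main()
      {
          float m[16];
          perspective(m, 60.0f, 16.0f / 9.0f, 0.1f, 100.0f);   // placeholder camera parameters
          for (int r = 0; r < 4; ++r)                          // print row by row
              std::printf("%8.3f %8.3f %8.3f %8.3f\n", m[r], m[r + 4], m[r + 8], m[r + 12]);
          return 0;
      }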




[Basic topics to google/read]


  1. 2D




    • DIP digital image processing

    • image segmentation


  2. 3D




  3. platform dependent




    • image acquisition

    • rendering




This concludes this article on augmented reality like Zookazam. We hope the answer recommended here is helpful, and we hope you will continue to support IT屋!
