Is it possible for either Microsoft Computer Vision API or Google's Cloud Vision API to get a location for objects?
Question
I am trying to develop an application that needs to know the location of tagged objects in an image. Knowing that there is a "piano" in an image is not enough; I need to know where that piano is in the image.
Both Microsoft's Computer Vision API and Google's Cloud Vision API provide some form of cropping suggestion or smart thumbnail generation, which leads me to think that the locations of certain objects are being detected. Is there a way to get that information (such as a bounding box around each detected object) from either Microsoft's Computer Vision API or Google's Cloud Vision API?
Edit: I understand that both APIs can return the locations of faces detected in an image. However, I am looking for the location and size of every object in an image: cars, pianos, trees, people... anything.
Recommended answer
2020 update:
This question is a few years old, but the Microsoft Azure Computer Vision API can now return bounding boxes around the objects it detects in an image. Samples are available in Python and other languages.
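As a rough illustration of what retrieving those bounding boxes might look like, here is a minimal Python sketch that uses only the standard library. It assumes the REST "Detect Objects" operation at `/vision/v3.2/detect` and a response that contains an `objects` list whose entries carry `rectangle` (`x`, `y`, `w`, `h`), `object`, and `confidence` fields. The API version, the helper names `detect_objects` and `bounding_boxes`, and the endpoint and key values are assumptions for illustration, not something given in the original answer; check them against the current API reference.

```python
import json
import urllib.request


def detect_objects(endpoint, key, image_url, api_version="v3.2"):
    """POST an image URL to the Detect Objects operation and return the parsed JSON.

    The path and api_version are assumptions; adjust them to match your resource.
    """
    request = urllib.request.Request(
        f"{endpoint.rstrip('/')}/vision/{api_version}/detect",
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def bounding_boxes(result):
    """Flatten a detection response into (label, confidence, x, y, w, h) tuples."""
    boxes = []
    for item in result.get("objects", []):
        rect = item["rectangle"]  # pixel coordinates: top-left corner plus size
        boxes.append(
            (item["object"], item["confidence"],
             rect["x"], rect["y"], rect["w"], rect["h"])
        )
    return boxes
```

With a real resource, something like `bounding_boxes(detect_objects("https://<your-resource>.cognitiveservices.azure.com", "<your-key>", "https://example.com/piano.jpg"))` would list each detected object with its position and size; the endpoint, key, and image URL here are placeholders.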
Computer Vision documentation: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/
Computer Vision API: https://westus.dev.cognitive.microsoft.com/docs/services/5cd27ec07268f6c679a3e641/operations/56f91f2e778daf14a499f21b