SwiftUI: Drawing rectangles around elements recognized with Firebase ML Kit
Question
I am currently trying to draw boxes around the text that was recognized with Firebase ML Kit, on top of the image. So far I have had no success: I can't see any box at all, as they are all rendered offscreen. I was looking at this article for reference: https://medium.com/swlh/how-to-draw-bounding-boxes-with-swiftui-d93d1414eb00 and also at this project: https://github.com/firebase/quickstart-ios/blob/master/mlvision/MLVisionExample/ViewController.swift
This is the view where the boxes should be shown:
struct ImageScanned: View {
    var image: UIImage
    @Binding var rectangles: [CGRect]
    @State var viewSize: CGSize = .zero

    var body: some View {
        // TODO: fix scaling
        ZStack {
            Image(uiImage: image)
                .resizable()
                .scaledToFit()
                .overlay(
                    GeometryReader { geometry in
                        ZStack {
                            ForEach(self.transformRectangles(geometry: geometry)) { rect in
                                Rectangle()
                                    .path(in: CGRect(
                                        x: rect.x,
                                        y: rect.y,
                                        width: rect.width,
                                        height: rect.height))
                                    .stroke(Color.red, lineWidth: 2.0)
                            }
                        }
                    }
                )
        }
    }

    private func transformRectangles(geometry: GeometryProxy) -> [DetectedRectangle] {
        var rectangles: [DetectedRectangle] = []

        let imageViewWidth = geometry.frame(in: .global).size.width
        let imageViewHeight = geometry.frame(in: .global).size.height
        let imageWidth = image.size.width
        let imageHeight = image.size.height

        let imageViewAspectRatio = imageViewWidth / imageViewHeight
        let imageAspectRatio = imageWidth / imageHeight
        let scale = (imageViewAspectRatio > imageAspectRatio)
            ? imageViewHeight / imageHeight : imageViewWidth / imageWidth

        let scaledImageWidth = imageWidth * scale
        let scaledImageHeight = imageHeight * scale
        let xValue = (imageViewWidth - scaledImageWidth) / CGFloat(2.0)
        let yValue = (imageViewHeight - scaledImageHeight) / CGFloat(2.0)

        var transform = CGAffineTransform.identity.translatedBy(x: xValue, y: yValue)
        transform = transform.scaledBy(x: scale, y: scale)

        for rect in self.rectangles {
            let rectangle = rect.applying(transform)
            rectangles.append(DetectedRectangle(width: rectangle.width, height: rectangle.height, x: rectangle.minX, y: rectangle.minY))
        }
        return rectangles
    }
}

struct DetectedRectangle: Identifiable {
    var id = UUID()
    var width: CGFloat = 0
    var height: CGFloat = 0
    var x: CGFloat = 0
    var y: CGFloat = 0
}
This is the view that the view above is nested in:
struct StartScanView: View {
    @State var showCaptureImageView: Bool = false
    @State var image: UIImage? = nil
    @State var rectangles: [CGRect] = []

    var body: some View {
        ZStack {
            if showCaptureImageView {
                CaptureImageView(isShown: $showCaptureImageView, image: $image)
            } else {
                VStack {
                    Button(action: {
                        self.showCaptureImageView.toggle()
                    }) {
                        Text("Start Scanning")
                    }
                    // show here View with rectangles on top of image
                    if self.image != nil {
                        ImageScanned(image: self.image ?? UIImage(), rectangles: $rectangles)
                    }
                    Button(action: {
                        self.processImage()
                    }) {
                        Text("Process Image")
                    }
                }
            }
        }
    }

    func processImage() {
        let scaledImageProcessor = ScaledElementProcessor()
        if image != nil {
            scaledImageProcessor.process(in: image!) { text in
                for block in text.blocks {
                    for line in block.lines {
                        for element in line.elements {
                            self.rectangles.append(element.frame)
                        }
                    }
                }
            }
        }
    }
}
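For reference, ScaledElementProcessor comes from the Firebase ML Kit text recognition tutorial and is not shown in the question. Here is a minimal sketch of what it presumably looks like, assuming the legacy FirebaseMLVision API that the quickstart used:

import UIKit
import FirebaseMLVision

// Sketch of the tutorial's helper: runs on-device text recognition and hands the
// resulting VisionText (blocks -> lines -> elements) to the callback.
class ScaledElementProcessor {
    private let textRecognizer = Vision.vision().onDeviceTextRecognizer()

    func process(in image: UIImage, callback: @escaping (VisionText) -> Void) {
        let visionImage = VisionImage(image: image)
        textRecognizer.process(visionImage) { text, error in
            guard error == nil, let text = text else { return }
            callback(text)
        }
    }
}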
The calculation from the tutorial made the rectangles too big, and the one from the sample project made them too small (similarly for the height). Unfortunately, I can't find out which size Firebase uses to determine the element frames. This is how it looks: [screenshot]. Without calculating the width & height at all, the rectangles seem to have roughly the size they are supposed to have (though not exactly), which leads me to the assumption that ML Kit's size calculation is not done in proportion to image.size.height/width.
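A quick sanity check (a hypothetical debugging sketch, not code from the project above) is to print the raw element frames next to the image's pixel size: if the frames never exceed image.size, ML Kit is reporting them in the image's own pixel coordinate space, and they only need to be mapped into view coordinates.

// Hypothetical debug helper: compare ML Kit's raw frames with the image size.
func dumpFrames(of image: UIImage) {
    let processor = ScaledElementProcessor()
    processor.process(in: image) { text in
        print("image size:", image.size) // e.g. (3024.0, 4032.0)
        for block in text.blocks {
            // Frames that stay within image.size are expressed in image pixels,
            // not in the view's points.
            print("block frame:", block.frame)
        }
    }
}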
Answer

This is how I changed the ForEach loop:
Image(uiImage: uiimage!).resizable().scaledToFit().overlay(
    GeometryReader { (geometry: GeometryProxy) in
        ForEach(self.blocks, id: \.self) { (block: VisionTextBlock) in
            Rectangle()
                .path(in: block.frame.applying(self.transformMatrix(geometry: geometry, image: self.uiimage!)))
                .stroke(Color.purple, lineWidth: 2.0)
        }
    }
)
Instead of passing the x, y, width, and height, I am passing the return value of the transformMatrix function to the path function. My transformMatrix function is:
private func transformMatrix(geometry: GeometryProxy, image: UIImage) -> CGAffineTransform {
    let imageViewWidth = geometry.size.width
    let imageViewHeight = geometry.size.height
    let imageWidth = image.size.width
    let imageHeight = image.size.height

    let imageViewAspectRatio = imageViewWidth / imageViewHeight
    let imageAspectRatio = imageWidth / imageHeight
    let scale = (imageViewAspectRatio > imageAspectRatio) ?
        imageViewHeight / imageHeight :
        imageViewWidth / imageWidth

    // The image view's `contentMode` is `scaleAspectFit`, which scales the image to fit
    // the image view while maintaining the aspect ratio. Multiply by `scale` to get the
    // image's displayed size in the view.
    let scaledImageWidth = imageWidth * scale
    let scaledImageHeight = imageHeight * scale
    let xValue = (imageViewWidth - scaledImageWidth) / CGFloat(2.0)
    let yValue = (imageViewHeight - scaledImageHeight) / CGFloat(2.0)

    var transform = CGAffineTransform.identity.translatedBy(x: xValue, y: yValue)
    transform = transform.scaledBy(x: scale, y: scale)
    return transform
}
and the output is: [screenshot]
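To see the transform at work, here is a worked example with made-up numbers (a 3024x4032 photo shown aspect-fit in a 390x500 overlay; all values below are hypothetical):

import CoreGraphics

let imageSize = CGSize(width: 3024, height: 4032)
let viewSize = CGSize(width: 390, height: 500)

let viewAspect = viewSize.width / viewSize.height    // 0.78
let imageAspect = imageSize.width / imageSize.height // 0.75
// viewAspect > imageAspect, so the fitted image is limited by the view's height:
let scale = viewSize.height / imageSize.height       // ~0.124
let scaledWidth = imageSize.width * scale            // 375.0
let xOffset = (viewSize.width - scaledWidth) / 2     // 7.5
let yOffset = CGFloat(0)                             // the image fills the full height

let transform = CGAffineTransform.identity
    .translatedBy(x: xOffset, y: yOffset)
    .scaledBy(x: scale, y: scale)

// An element frame reported by ML Kit in image pixels...
let elementFrame = CGRect(x: 1000, y: 2000, width: 600, height: 150)
// ...lands at the matching spot in view coordinates:
let viewFrame = elementFrame.applying(transform)
// viewFrame is approximately (131.5, 248.0, 74.4, 18.6),
// which is inside the 390x500 overlay, as expected.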