gaze estimation from an image of an eye

Problem Description

I've been able to detect the pupil and the eye corners accurately so far. You can see a few snaps I uploaded in my answer to my own question here:

Performing stable eye corner detection

Here's what I've done so far. I calibrated the user's gaze by having them look at TLCP, TRCP and BLCP, where:

CP = calibration point; a screen point used for calibration
B = bottom
T = top
L = left
R = right
gaze_width = TRCP.x - TLCP.x
gaze_height = BLCP.y - TLCP.y

And the corresponding gaze points I get by looking at those CPs are called GPs.

Calculation of a gaze point GP:

I subtract TLGP's coordinate values from the current pupil center's location, because the gaze point has to fall inside the hypothetical rectangle spanned by the calibration gaze points - I hope you understand it, it's really very simple.

I've linearly mapped the gaze points, calculated from the pupil center's location, to screen points using a basic scaling system, where the scales are calculated as follows:

scaleX = screen_width/gaze_width
scaleY = screen_height/gaze_height

And for any gaze point P(x,y), I calculate the corresponding screen point Q(m,n) as:

m = scaleX*x
n = scaleY*y
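
In code, the whole calibration-and-mapping pipeline looks roughly like this (a minimal Python sketch, not my exact code: the function names and the sample pupil center are illustrative, and the abs() calls are an assumption to keep the widths positive, since the camera image may be mirrored - the log below shows a positive gaze width even though TRGP.x < TLGP.x):

def calibrate(TLGP, TRGP, BLGP, screen_width, screen_height):
    """Derive the gaze rectangle and the per-axis scale factors."""
    gaze_width = abs(TRGP[0] - TLGP[0])    # abs(): camera image may be mirrored
    gaze_height = abs(BLGP[1] - TLGP[1])
    scale_x = screen_width / float(gaze_width)
    scale_y = screen_height / float(gaze_height)
    return scale_x, scale_y

def to_screen(pupil_center, TLGP, scale_x, scale_y):
    """Gaze point = pupil center relative to TLGP, then scaled to the screen."""
    x = abs(pupil_center[0] - TLGP[0])
    y = abs(pupil_center[1] - TLGP[1])
    return int(scale_x * x), int(scale_y * y)

# With the calibration values from the first test run below and a
# hypothetical pupil center of (43, 29):
scale_x, scale_y = calibrate((38, 26), (20, 22), (39, 33), 1366, 768)
print(to_screen((43, 29), (38, 26), scale_x, scale_y))  # gaze point (5, 3) -> (379, 329)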

But the problem is, even after almost perfect pupil detection (almost, because in poor lighting it gives false positives, which I intend to list under limitations since I don't have enough time to work on it), I'm still getting a poor gaze width and gaze height.

Here's a test run log:

DO_CAL= True

Gaze Parameters:

TLGP = (38, 26) | TRGP = (20, 22) | BLGP = (39, 33)
screen height = 768 screen width = 1366

gaze height = 7 gaze width = 18

scales: X = 75.8888888889 | Y = 109.714285714
Thing on = True

Gaze point = (5, 3)
Screen point: (987, 329)

Gaze point = (5, 3)
Screen point: (987, 329)

Gaze point = (7, 5)
Screen point: (835, 549)

Thing on = False

TLGP = (37, 24) | TRGP = (22, 22) | BLGP = (35, 29)
screen height = 768 screen width = 1366

gaze height = 5 gaze width = 15
scales: X = 91.0666666667 | Y = 153.6
Thing on = True

Gaze point = (12, 3)
Screen point: (1093, 461)

Gaze point = (12, 3)
Screen point: (1093, 461)

ESC pressed

Just look at the gaze points and their corresponding gaze-detected screen points (under them). The vast differences in the x, y coordinate values are driving me nuts. Monday is the final presentation.

After this approach, I theorized another one, wherein:

Calibration is done as in the first method. I would detect the motion of the gaze and its direction. Say, given any two pupil-center locations P and Q, where P is the first gaze point and Q is the second, we calculate the direction and length of the line segment PQ.

Let's assume that the length of this line segment is L. We then scale L to screen proportions, say L becomes D in screen scale, and, given the direction of the gaze movement, we move the cursor on the screen from its last point of rest R by distance D, to a new point S calculated as the end point of the line segment of length D that starts at R. The figurative representation is given in the figure. So basically, I don't map any gaze data to a screen point; I track the gaze and convert it into a "push" to be applied to the cursor on the screen. But I haven't implemented it yet, because it doesn't actually map the gaze to screen coordinates and thus might be erroneous. The motivation for this theory came from the eViacam project on SourceForge - they basically track your face and move the mouse accordingly. In calibration they just calculate how much your face moves along the axes.
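
A rough Python sketch of this "push" idea (illustrative only, since I haven't implemented it; it assumes the same scale factors as in the first method, and the sample numbers are made up):

def push_cursor(P, Q, R, scale_x, scale_y, screen_width, screen_height):
    """Move the cursor from its rest point R by the scaled pupil displacement PQ."""
    dx = (Q[0] - P[0]) * scale_x    # eye-space displacement scaled to screen distance D
    dy = (Q[1] - P[1]) * scale_y
    # New point S = R + displacement, clamped to the screen bounds.
    s_x = min(max(R[0] + dx, 0), screen_width - 1)
    s_y = min(max(R[1] + dy, 0), screen_height - 1)
    return int(s_x), int(s_y)

# e.g. the pupil moves 2 px right and 1 px down between two frames:
print(push_cursor((40, 27), (42, 28), (683, 384), 75.9, 109.7, 1366, 768))  # (834, 493)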

Bottom line: if any of you have any ideas on how to detect a user's gaze from a perfectly processed eye image - one with a detected pupil center and eye corners - please do tell! I've got just about a day, and I know it's late, but I just need any magical idea that can help me.

Answer

This is not an answer, but it is impossible to post as a comment. I'll delete it after your answer.

Are you sure you have all the necessary parameters?

Consider the following image:

If your camera detects the corners and pupil at {K, J, Q}, how can you distinguish them from another triple {F, E, O}? Note that the measurements are the same, but the gaze directions, represented by the black arrows, are completely different.

Note: the two black lines and the red lines were drawn from a single camera point placed outside the visible region.
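
To make this concrete, here is a small numeric illustration of one simple instance of the ambiguity (a sketch under a pinhole-camera assumption; all numbers are made up): scaling an entire corner/pupil configuration away from the camera leaves every projected 2D measurement unchanged, even though the 3D geometry behind it is different.

def project(point, f=500.0):
    """Pinhole projection: (X, Y, Z) -> (f*X/Z, f*Y/Z)."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# Eye corners and pupil at depth Z = 60 (think {K, J, Q}):
near = [(-1.5, 0.0, 60.0), (1.5, 0.0, 60.0), (0.3, 0.1, 60.0)]
# The same configuration pushed twice as far and scaled by 2 (think {F, E, O}):
far = [(2 * x, 2 * y, 2 * z) for (x, y, z) in near]

print([project(p) for p in near])   # [(-12.5, 0.0), (12.5, 0.0), (2.5, 0.83...)]
print([project(p) for p in far])    # identical 2D points, different 3D geometry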
