Quantify how much a slide has been filled with handwriting
Problem description
I have a video of a slideshow in which the presenter handwrites notes onto the slides:
I would like to write a program that detects whether a slide is being filled (for example, with handwritten notes) or whether a new slide has appeared.
One method I considered is OCR, but it is not suitable here because the only text that changes is handwritten or mathematical.
What I have done so far: I go through the video and always compare the previous frame with the current frame. I extract the bounding-box coordinates of all elements that were added relative to the previous frame and store the highest y-coordinate. The highest y-coordinate belongs to the element furthest down the image (y increases from the top of the image). In theory, this should indicate how far the slide has been filled.
In practice, I cannot really make use of this data:
The video in question can be downloaded here: http://www.filedropper.com/00_6
Here is my code:
# structural_similarity replaces compare_ssim, which was removed from skimage.measure
from skimage.metrics import structural_similarity
import cv2
import numpy as np

# Packages for live plot visualisation
import pyqtgraph as pg
from pyqtgraph.Qt import QtGui, QtCore
from tqdm import tqdm


def get_y_corrd_of_lowest_added_element(prev_frame, frame):
    """
    Given two images, detect the bounding boxes of all elements that
    differ between them and return the y coordinate of the lowest
    added element (as seen from the top of the image).

    Parameters
    ----------
    prev_frame : numpy array
        original image.
    frame : numpy array
        new image, based on original image.

    Returns
    -------
    int
        largest y coordinate among the bounding boxes of the elements
        that were added (0 if nothing was added).
    """
    # Compute SSIM between the two images
    (score, diff) = structural_similarity(prev_frame, frame, full=True)

    # The diff image contains the actual image differences between the two images
    # and is represented as a floating point data type in the range [0,1]
    # so we must convert the array to 8-bit unsigned integers in the range
    # [0,255] before we can use it with OpenCV
    diff = (diff * 255).astype("uint8")

    # Threshold the difference image, followed by finding contours to
    # obtain the regions of the two input images that differ
    thresh = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
    contours = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # findContours returns 2 values in OpenCV 2/4 and 3 values in OpenCV 3
    contours = contours[0] if len(contours) == 2 else contours[1]

    # Initialize a list that will hold the y coordinates of the bounding boxes
    # of all elements that were added to the frame when compared to the
    # previous frame
    y_list = [0]
    for c in contours:
        area = cv2.contourArea(c)
        # Ignore tiny regions, which are most likely noise
        if area > 40:
            x, y, w, h = cv2.boundingRect(c)
            # Append to y coordinate list
            y_list.append(y)

    return max(y_list)


def transform(frame):
    # convert to greyscale
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # make smaller
    small = cv2.resize(frame, (0, 0), fx=0.5, fy=0.5)
    return small


vidcap = cv2.VideoCapture("ADD PATH TO VIDEO HERE")
success, prev_frame = vidcap.read()
prev_frame = transform(prev_frame)

# For real-time plotting
# Source: http://www.pyqtgraph.org/downloads/0.10.0/pyqtgraph-0.10.0-deb/pyqtgraph-0.10.0/examples/PlotSpeedTest.py
app = pg.mkQApp()
# GraphicsLayoutWidget is used here because pg.GraphicsWindow is deprecated
win = pg.GraphicsLayoutWidget()
win.show()
win.resize(800, 800)
p = win.addPlot()
p.setTitle('Lowest Y')
plot = p.plot([])

# Store lowest y coordinates of added elements
y_lowest_list = []

while True:
    success, frame = vidcap.read()
    # Stop at the end of the video; frame is None when the read fails
    if not success:
        break

    # convert
    frame = transform(frame)

    # show frame
    cv2.imshow("frame", frame)
    cv2.waitKey(1)

    # extract lowest y coord
    y = get_y_corrd_of_lowest_added_element(prev_frame, frame)
    y_lowest_list.append(y)

    # The current frame becomes the previous frame for the next iteration,
    # as described in the text above
    prev_frame = frame

    # Real-time plot
    plot.setData(y_lowest_list)
    app.processEvents()

# close real-time plot
win.close()
vidcap.release()
cv2.destroyAllWindows()
Does anyone have an idea?
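You can try the code below (see the comments). It measures how much blue ink is visible in each frame and saves the frames right before the ink level drops sharply, which is where a filled slide is replaced by a new one: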
import cv2
import numpy as np


def get_bg_and_ink_level(frame):
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Bright pixels (V channel above 245) are treated as slide background
    background = cv2.threshold(frame[:, :, 2], 245, 255, cv2.THRESH_BINARY)[1]
    background_level = cv2.mean(background)  # for future use if you need to select frames without hands.
    # HSV range of the blue ink in this video
    ink_color_low = (117, 60, 150)
    ink_color_high = (130, 207, 225)
    only_ink = cv2.inRange(frame, ink_color_low, ink_color_high)
    # Mean of the 0/255 ink mask: grows as more ink appears on the slide
    ink_level = cv2.mean(only_ink)
    return background_level[0], ink_level[0]


vidcap = cv2.VideoCapture('0_0.mp4')
success, frame = vidcap.read()
bg = []
ink = []
while success:
    lv = get_bg_and_ink_level(frame)
    bg.append(lv[0])
    ink.append(lv[1])
    success, frame = vidcap.read()

# search for frames where the blue ink is removed from the picture.
d_ink = np.diff(ink)
d_ink[-1] = -2.0  # add last frame
idx = np.where(d_ink < -1.0)

# save frames
for i in idx[0]:
    vidcap.set(cv2.CAP_PROP_POS_FRAMES, int(i))
    flag, frame = vidcap.read()
    out_name = 'frame' + str(i) + '.jpg'
    cv2.imwrite(out_name, frame)
Result for frame 15708:
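To see how the drop-detection step in the answer behaves without needing the video, here is a minimal sketch that runs the same `np.diff` plus threshold idea on a made-up ink-level series. The function name `find_slide_ends`, the parameter `drop_threshold`, and the sample values are assumptions for illustration only; unlike the answer's code, this variant appends the final frame explicitly instead of overwriting the last difference.

```python
import numpy as np


def find_slide_ends(ink_levels, drop_threshold=-1.0):
    """Return indices of frames that are followed by a sharp drop in ink level.

    Hypothetical helper: the last frame is always included, since the final
    slide never gets "cleared" by a following one.
    """
    ink_levels = np.asarray(ink_levels, dtype=float)
    if ink_levels.size == 0:
        return []
    # d_ink[i] = ink_levels[i + 1] - ink_levels[i]
    d_ink = np.diff(ink_levels)
    # A strongly negative difference means ink disappeared between frame i and i + 1
    ends = np.where(d_ink < drop_threshold)[0].tolist()
    last = ink_levels.size - 1
    if last not in ends:
        ends.append(last)
    return ends


# Synthetic ink levels: three "slides" that fill up and are then replaced
ink = [0.0, 1.0, 2.0, 3.0, 4.0, 0.5, 1.5, 2.5, 0.0, 1.0, 2.0]
print(find_slide_ends(ink))
```

With real footage, the threshold would likely need tuning, because a hand or pen covering part of the slide can also cause the ink level to dip briefly.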