Pre-processing before digit recognition with KNN classifier
I'm trying to build a digit recognition system using OpenCV. There are many articles and examples on the web (and even on StackOverflow). I decided to use a KNN classifier because it is the most popular solution I found. I also found a database of handwritten digits with a training set of 60k examples and an error rate of less than 5%.
I used this tutorial as an example of how to work with this database in OpenCV. I'm using exactly the same technique, and on the test data (t10k-images.idx3-ubyte) I get a 4% error rate. But when I try to classify my own digits, the error is much higher. For example:
- [image] is recognized as 7
- [image] and [image] are recognized as 5
- [image] and [image] are recognized as 1
- [image] is recognized as 8
And so on (I can upload all the images if needed).
As you can see, all the digits are of good quality and are easily recognizable to a human.
So I decided to do some pre-processing before classifying. From the table on the MNIST database site I learned that people use deskewing, noise removal, blurring and pixel-shift techniques. Unfortunately, almost all of the links to the articles are broken, so I decided to do the pre-processing myself, since I already know how to do that.
Right now, my algorithm is the following:
- Erode image (I think that my original digits are too rough).
- Remove small contours.
- Threshold and blur image.
- Center digit (instead of shifting).
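The centering step in the list above could look roughly like the following. This is only a sketch in plain C++ without OpenCV, and it is not the code used in this post: the `Image` alias and the `centerDigit` function are hypothetical names, and it assumes a rectangular grayscale image where 0 is background and anything above 0 is ink. It centers the bounding box of the ink, which is one of several possible definitions of "centered" (the center of mass is another).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical image type for illustration: rows of grayscale pixels,
// 0 = background, anything above 0 = ink.
using Image = std::vector<std::vector<unsigned char>>;

// Returns a copy of `src` with the bounding box of the ink moved to the
// middle of the canvas. An all-background image is returned unchanged.
Image centerDigit(const Image& src)
{
    const int rows = static_cast<int>(src.size());
    const int cols = rows > 0 ? static_cast<int>(src[0].size()) : 0;

    // Find the bounding box of all ink pixels.
    int top = rows, bottom = -1, left = cols, right = -1;
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (src[y][x] > 0)
            {
                top = std::min(top, y);
                bottom = std::max(bottom, y);
                left = std::min(left, x);
                right = std::max(right, x);
            }
    if (bottom < 0)
        return src; // no ink found

    // Offset that puts the bounding box in the middle of the canvas.
    const int dy = (rows - (bottom - top + 1)) / 2 - top;
    const int dx = (cols - (right - left + 1)) / 2 - left;

    // Copy only the bounding box; everything else stays background.
    Image dst(rows, std::vector<unsigned char>(cols, 0));
    for (int y = top; y <= bottom; ++y)
        for (int x = left; x <= right; ++x)
            dst[y + dy][x + dx] = src[y][x];
    return dst;
}
```

Because only the bounding box is copied, small specks outside it cannot exist by construction, but specks inside it are kept, so noise removal still has to happen before this step.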
I don't think deskewing is needed in my case, because all the digits are upright. I also have no idea how to find the right rotation angle.
After this pre-processing I get these results:
- [image] is also 1
- [image] is 3 (not 5 as it used to be)
- [image] is 5 (not 8)
- [image] is 7 (profit!)
So this pre-processing helped a bit, but I need better results, because in my opinion digits like these should be recognized without problems.
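On the question of how to find a correction angle for deskewing: one approach often used for digit images estimates the slant from second-order image moments, as the ratio mu11 / mu02, and then shears each row back by that amount. The sketch below is plain C++ without OpenCV and is not part of the code in this post; `Image`, `Skew`, `estimateSkew` and `deskew` are hypothetical names. Note that it corrects slant (a horizontal shear), not a true rotation, and it uses nearest-neighbour sampling for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical image type for illustration: rows of grayscale pixels.
using Image = std::vector<std::vector<unsigned char>>;

struct Skew
{
    double slope;   // mu11 / mu02: horizontal drift of the ink per row
    double centerY; // intensity-weighted centroid row
};

// Estimates the slant from intensity-weighted central moments.
Skew estimateSkew(const Image& img)
{
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
        {
            const double w = img[y][x];
            m00 += w;
            m10 += w * x;
            m01 += w * y;
        }
    if (m00 == 0.0)
        return {0.0, 0.0}; // blank image

    const double cx = m10 / m00, cy = m01 / m00;
    double mu11 = 0.0, mu02 = 0.0;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
        {
            const double w = img[y][x];
            mu11 += w * (x - cx) * (y - cy);
            mu02 += w * (y - cy) * (y - cy);
        }
    if (mu02 < 1e-9)
        return {0.0, cy}; // all ink on one row: slope is undefined
    return {mu11 / mu02, cy};
}

// Undoes the shear: output pixel (x, y) samples the source at
// x + slope * (y - centerY); samples outside the image become background.
Image deskew(const Image& img)
{
    const Skew s = estimateSkew(img);
    Image out = img;
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
        {
            const long sx = std::lround(x + s.slope * (y - s.centerY));
            const bool inside = sx >= 0 && sx < static_cast<long>(img[y].size());
            out[y][x] = inside ? img[y][sx] : 0;
        }
    return out;
}
```

For digits that are already upright the estimated slope is close to zero and the image is left essentially unchanged, so applying the step unconditionally should be harmless under these assumptions.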
Can anyone give me advice on pre-processing? Thanks for any help.
P.S. I can upload my C++ source code.
I realized my mistake: it had nothing to do with pre-processing at all (thanks to @DavidBrown and @John). I was training on a dataset of handwritten digits, while the digits I want to recognize are printed (capitalized). I couldn't find a database of printed digits on the web, so I decided to create one myself. I have uploaded my database to Google Drive.
And here's how you can use it (train and classify):
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>
#include <dirent.h>
#include <opencv2/opencv.hpp>

using namespace cv;
using namespace std;

int digitSize = 16;

// Returns the names of all entries in a directory, skipping "." and "..".
static vector<string> getListFiles(const string& dirPath)
{
    vector<string> result;
    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir(dirPath.c_str())) != NULL)
    {
        while ((ent = readdir(dir)) != NULL)
        {
            if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0)
            {
                result.push_back(ent->d_name);
            }
        }
        closedir(dir);
    }
    return result;
}
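void DigitClassifier::train(const string& imagesPath)
{
    int num = 510;                     // capacity: number of training images (hard-coded for this database)
    int size = digitSize * digitSize;  // features per sample (flattened pixels)
    Mat trainData = Mat(Size(size, num), CV_32FC1);  // one row per sample
    Mat responses = Mat(Size(1, num), CV_32FC1);     // one label per sample
    int counter = 0;
    for (int i = 1; i <= 9; i++)
    {
        // Images for digit i are expected in the sub-directory "<imagesPath>i/".
        char digit[3];  // one digit, '/', and the terminating NUL
        snprintf(digit, sizeof(digit), "%d/", i);
        string digitPath = imagesPath + digit;
        vector<string> images = getListFiles(digitPath);
        for (size_t j = 0; j < images.size() && counter < num; j++)
        {
            Mat mat = imread(digitPath + images[j], 0);  // 0 = load as grayscale
            if (mat.empty())
                continue;  // skip entries that are not readable images
            resize(mat, mat, Size(digitSize, digitSize));
            mat.convertTo(mat, CV_32FC1);
            mat = mat.reshape(1, 1);  // flatten to a single row
            for (int k = 0; k < size; k++)
            {
                trainData.at<float>(counter, k) = mat.at<float>(0, k);
            }
            responses.at<float>(counter, 0) = i;
            counter++;
        }
    }
    // Train only on the rows that were actually filled.
    knn.train(trainData.rowRange(0, counter), responses.rowRange(0, counter));
}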
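int DigitClassifier::classify(const Mat& img) const
{
    // Expects a single-channel (grayscale) image, like the ones used in train().
    Mat tmp = img.clone();
    resize(tmp, tmp, Size(digitSize, digitSize));
    tmp.convertTo(tmp, CV_32FC1);
    // Flatten to one row and let the 5 nearest neighbours vote on the label.
    return static_cast<int>(knn.find_nearest(tmp.reshape(1, 1), 5));
}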
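To make it clearer what the classifier does with those flattened pixel rows, here is a minimal sketch of k-nearest-neighbour voting in plain C++. It is an illustration only and makes no claim about how OpenCV implements it: `Sample` and `classifyKnn` are hypothetical names, the distance is squared Euclidean, and ties go to the smallest label, which is an arbitrary choice made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical training sample: flattened pixels plus the digit label.
struct Sample
{
    std::vector<float> features;
    int label;
};

// Returns the majority label among the k training samples closest to
// `query`. Assumes every feature vector has the same length as `query`.
// Returns -1 if the training set is empty.
int classifyKnn(const std::vector<Sample>& training,
                const std::vector<float>& query, std::size_t k)
{
    // (squared distance, label) for every training sample.
    std::vector<std::pair<float, int>> dist;
    for (const Sample& s : training)
    {
        float d = 0.0f;
        for (std::size_t i = 0; i < query.size(); ++i)
        {
            const float diff = s.features[i] - query[i];
            d += diff * diff;
        }
        dist.emplace_back(d, s.label);
    }

    // Move the k smallest distances to the front.
    k = std::min(k, dist.size());
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

    // Count votes per label among those k neighbours.
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i)
        ++votes[dist[i].second];

    int best = -1, bestVotes = 0;
    for (const auto& v : votes)
        if (v.second > bestVotes)
        {
            best = v.first;
            bestVotes = v.second;
        }
    return best;
}
```

Since the vote depends only on raw pixel distances, a printed digit is compared against whatever the training set contains, which is why training on handwritten samples gave poor results for printed input.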