如何使用Image列从SQL表中查找重复的图像 [英] How to find duplicate image from SQL table with Image column
问题描述
我自己的项目有一项新的R& D任务。
i有列的表格
auto_id(int identity),Catelog_Id(int),CatelogImage(图像数据类型)
这个表有表中有20,000行。
现在我想从这张表中找到重复的图像。怎么可能呢?我该怎么做?想要获得其中包含相同图像的auto_Id。
---到目前为止我一直在尝试---
尝试区别:
错误:图像数据类型无法选择为DISTINCT,因为它无法比较。
文章: http://support.microsoft.com/kb/162032/en-us [ ^ ]
尝试转换为varbinary(max)和distinct:
没有得到正确的解决方案,因为我的varbinary数据类型的限制类似于varchar(8000)。
如果有人做过这种类型的R& D.
Hi,
I have one new R&D task for my own project.
i have on table with column
auto_id (int identity), Catelog_Id (int) , CatelogImage (Image datatype)
and this table have around 20,000 rows in table.
now i want to find duplicate image from this table. than how is it possible? how can i do it? want to get auto_Id whose have same image in it.
--- What i have tried until now ---
Tried Distinct :
Error : The image data type cannot be selected as DISTINCT because it is not comparable.
Article : http://support.microsoft.com/kb/162032/en-us[^]
Tried by converting in to varbinary(max) and distinct:
not getting right solution because i thing varbinary datatype has limit like varchar(8000).
Please help if anybody have done this type of R&D.
推荐答案
主要思想是将哈希字节与'md5进行比较'algorithm。
我创建了一个像你的列一样的表
Hi,
The main idea is to compare hash bytes with 'md5' algorithm.
I have create table like your columns
create table emp (auto_id int identity(1,1), Catelog_Id int, CatelogImage Image)
接下来我用脚本填充(运行多次):
Next i have populate with script (running many times) :
Insert emp (Catelog_Id, CatelogImage)
Select 1001, BulkColumn from Openrowset( Bulk 'D:\CodeProject1.jpg', Single_Blob)
Insert emp (Catelog_Id, CatelogImage)
Select 1001, BulkColumn from Openrowset( Bulk 'D:\CodeProject2.jpg', Single_Blob)
Insert emp (Catelog_Id, CatelogImage)
Select 1001, BulkColumn from Openrowset( Bulk 'D:\CodeProject3.jpg', Single_Blob)
finally列出同一图像的所有ID,运行此脚本:
finally to list all id's for the same image, run this script :
;
with hashimage (
hashimage_hash
,hashimage_count
)
as (
select hashbytes('md5', cast([catelogimage] as varbinary)),count(*)
from emp
group by hashbytes('md5', cast([catelogimage] as varbinary))
having count(*) > 1
)
select hashimage_hash
,stuff((
select ',' + rtrim(auto_id)
from emp sub
where hashbytes('md5', cast([catelogimage] as varbinary)) = hashimage_hash
for xml path('')
), 1, 1, '')
from hashimage
来源:
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/d55317cc-8bff-4e7d-a34c-f75da7e4778b/ how-to-compare-imagevarbinary-column [ ^ ]
这篇关于如何使用Image列从SQL表中查找重复的图像的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!