Problem in finding duplicate files in C++


Problem description

Sir, I want to find duplicate files using C++. To achieve this,
I iterate over all the files on the drive, get the size of each file, and then look for duplicate keys using a map.
So I've created a map in which the key is the size of the file and the value is the path of the file. Here is my member function:

bool duplicateFinder::processDrive(const wchar_t* sDir)
{
	// referred http://www.stackoverflow.com/questions/2314542/listing-directory-contents-using-c-and-windows
	//Map creation and usage
	map<int, wchar_t*> duplicate;
	map<int, wchar_t*>::iterator iterate;
	WIN32_FIND_DATA fdFile;
	HANDLE hFind = NULL;

	wchar_t sPath[2048];
	wsprintf(sPath, L"%s\\*.*", sDir);

	if ((hFind = FindFirstFile(sPath, &fdFile)) == INVALID_HANDLE_VALUE)
	{
		wprintf(L"Path not found: [%s]\n", sDir);
		return false;
	}

	do
	{
		
		if (wcscmp(fdFile.cFileName, L".") != 0
			&& wcscmp(fdFile.cFileName, L"..") != 0)
		{
			
			wsprintf(sPath, L"%s\\%s", sDir, fdFile.cFileName);
			if (fdFile.dwFileAttributes &FILE_ATTRIBUTE_DIRECTORY)
			{
				wprintf(L"Directory: %s\n", sPath);
				processDrive(sPath); 
			}
			else
			{
				//wprintf(L"File: %s\n", sPath);
				char** arr;
				char* hash = new char[MAX_PATH];
				memset(hash, 0, MAX_PATH);
				int correction;
				correction = wcstombs(hash, sPath, MAX_PATH);
				//arr = CALL_MD5_Function(hash);
				iterate = duplicate.find(getFileSize(hash));
				if (iterate != duplicate.end())
				{
					cout << "\n\n FOUND THE VALUE " << iterate->second;
				}
				else
				{
					duplicate.insert(pair<int, wchar_t*>(getFileSize(hash), sPath));
				}
			}
		}
	} while (FindNextFile(hFind, &fdFile));
	FindClose(hFind);
	return isDuplcateFound;
}



In the above code, whenever a file is found, its size is calculated using the getFileSize function. (This is not a Win32 API function. It is a user-defined native C++ function that returns the size of the file in bytes, e.g. 4278.) Before the size is inserted into the map, the presence of the key is checked using the map's "find" function. If the key is not present, it is inserted into the map. If the key is found, the path stored under it should be displayed through the "iterate" iterator.

But whenever a duplicate file is found, the output is an address like this: FOUND THE VALUE A012556. I don't know why this happens. I tried iterate->first and the size of the file was displayed correctly, but those two files are not duplicates.

Kindly help me sir with this.
Thank you for your time sir.

What I have tried:

1. Before using maps, I first made sure that the program iterates over all the files on the drive by printing the names of the files.

2. Then I made sure that the "getFileSize" function works correctly by printing the sizes of all the files on the drive.

3. After that, I tested whether the MD5 hash is computed correctly by printing the hashes of all the files in the directory. [Since I'm getting errors when adding file sizes to the map, I didn't develop this further, because hashing is the next step, used once a file of the same size has been found.]

After several tries and modifications, the three steps above worked fine. Then I moved on to the next step, which is adding the values to the map.
I referred to my school notes [this is not my homework] and then to the internet about adding values to a map, and found that I did it correctly, so I'm unsure what the error is.

Kindly help me sir with this.
Thank you for your time sir.

Recommended answer

std::cout treats a variable of type 'wchar_t*' as a pointer, not a string, so it prints the address stored in the pointer instead of the characters it points to. The easiest way to fix your code would be to send the output to std::wcout, which is declared in the same header (iostream).
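The difference between the two streams can be seen in a small sketch. The helper names `format_wide` and `format_narrow` are hypothetical and not part of the original program, and string streams stand in for std::wcout and std::cout so that the results can be inspected. The narrow version spells out the pointer conversion with an explicit cast, which is an assumption made so the sketch also compiles under newer language standards that reject passing a wchar_t* to a narrow stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Wide stream: operator<< has an overload for const wchar_t*,
// so the characters of the string are written.
std::wstring format_wide(const wchar_t* text) {
    assert(text != nullptr);
    std::wostringstream out;
    out << text;
    return out.str();
}

// Narrow stream: there is no string overload for const wchar_t*.
// The pointer ends up treated as const void* (made explicit here),
// so the address is written instead of the text.
std::string format_narrow(const wchar_t* text) {
    assert(text != nullptr);
    std::ostringstream out;
    out << static_cast<const void*>(text);
    return out.str();
}
```

With this in mind, replacing `cout` with `wcout` in the "FOUND THE VALUE" line should print the stored path rather than an address.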

Another possible solution would be to change your std::map<int, wchar_t*> to std::map<int, std::wstring>, but this would require modifying other parts of your program as well.
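A minimal sketch of the second suggestion follows. The names `SizeIndex` and `record_file` are hypothetical, and the key type `unsigned long long` is an assumption chosen so that large file sizes fit; the original code uses `int`. Storing a std::wstring copies the characters into the map, whereas a stored wchar_t* only keeps pointing at the caller's buffer, and in the posted function that buffer (`sPath`) is overwritten on every iteration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Maps a file size to the first path seen with that size.
// The map owns a copy of each path, so later changes to the
// caller's buffer do not affect stored entries.
using SizeIndex = std::map<unsigned long long, std::wstring>;

// Records `path` under `size`. Returns the previously stored path
// if a file of the same size was already recorded, otherwise an
// empty string.
std::wstring record_file(SizeIndex& index,
                         unsigned long long size,
                         const wchar_t* path) {
    assert(path != nullptr);
    auto result = index.emplace(size, path);  // copies the characters
    if (!result.second) {
        return result.first->second;          // same size seen before
    }
    return std::wstring();
}
```

A non-empty return value only means that two files share a size. Whether they are really duplicates would still have to be confirmed by comparing their contents or hashes, such as the MD5 step mentioned in the question.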

