See which words exist in my BST from a file

Views: 106 Posted: 2019/6/7 23:08:24 C++

Problem description

I am trying to output the "misspelled" words (words that are not in the BST) from a file that I read in. I have a fully functional binary search tree; the only functions needed are insert and exists. Loading my dictionary works fine. But when I read a file that has multiple words on each line, it fails: it displays the words that were converted from upper case to lower case and the ones that had punctuation. When I read a file in which every word is on its own line, the program gives me the misspelled words.

#include <iostream>
#include <fstream>
#include <cstdlib>
#include <string>
#include <algorithm>
#include "bst.h"

using namespace std;

int main()
{
    char dictionaryFile[50]; //input dictionary file
	char file[50]; // input file
	
	string misspelt; // misspelt 
	string wordsDictionary;
	string words;
	
	ifstream inputDictionaryFile; // file input
	ifstream inputFile;
	BinarySearchTree *tree = new BinarySearchTree();
	
	/* GETTING DICTIONARY*/
	
	cout << "Enter dictionary file name: ";
	
	cin.getline(dictionaryFile, sizeof(dictionaryFile)); // read one line, limited to the 50-byte buffer
	
	inputDictionaryFile.open(dictionaryFile); //open
	
	//if it fails to open - Error
	if(!inputDictionaryFile.is_open())
	{
		cout << "Fail to open file" << endl;
		exit(EXIT_FAILURE);
	}
	
	while(inputDictionaryFile >> wordsDictionary)
	{
		int i =0;
		for( i = 0;wordsDictionary[i]!='\0'; i++) 
		{
			//find upperCase letters
			if(wordsDictionary[i] >= 'A' && wordsDictionary[i] <= 'Z')
			{
				//overwrite to lowerCase
				wordsDictionary[i] = tolower(wordsDictionary[i]);			
					
			}//end of if statement 
			//ignore tab
			if(wordsDictionary[i] == '\t')
			{ 
				wordsDictionary[i] = ' ';
			}//end of if statement
	
			//ignoring punctuation 	
			if(wordsDictionary[i] == ',' || wordsDictionary[i] == '.' || wordsDictionary[i] == '!' || wordsDictionary[i] == '?' || wordsDictionary[i] == '"' || wordsDictionary[i] == ':' || wordsDictionary[i] ==';' || wordsDictionary[i] == '-' || wordsDictionary[i] == '/' || wordsDictionary[i] == '`' || wordsDictionary[i] == '&' || wordsDictionary[i] == '@' || wordsDictionary[i] == '^' || wordsDictionary[i] == '(' || wordsDictionary[i] == ')' || wordsDictionary[i] == '<' || wordsDictionary[i] == '>' || wordsDictionary[i] == '#' || wordsDictionary[i] == '%' || wordsDictionary[i] == '{' || wordsDictionary[i] == '}' || wordsDictionary[i] == '[' || wordsDictionary[i] == ']' || wordsDictionary[i] == '|' || wordsDictionary[i] == '+' || wordsDictionary[i] == '*')
			{
				wordsDictionary[i] = ' ';
			}//end of if statement
				
			//ignore if there is double space
			if(wordsDictionary[i] == '  ')
			{
				wordsDictionary[i] = ' ';
			}//end of if statement
		}
		tree->insert(wordsDictionary); // insert the word into the tree
	}
	
	if(tree == nullptr)
	{
		cout << "Empty tree" << endl;
	}
	
	/* GETTING FILE*/
	
	cout << "Enter file name: ";
	
	cin.getline(file, sizeof(file)); // read one line, limited to the 50-byte buffer
	
	inputFile.open(file); //open
	//if it fails to open - Error
	if(!inputFile.is_open())
	{
		cout << "Fail to open file" << endl;
		exit(EXIT_FAILURE);
	}
	
	while(inputFile >> words)
	{	
		int i =0;
		for( i = 0;words[i]!='\0'; i++) 
		{
			//find upperCase letters
			if(words[i] >= 'A' && words[i] <= 'Z')
			{
				//overwrite to lowerCase
				words[i] = tolower(words[i]);			
				
			}//end of if statement 
				
			//ignore tab
			if(words[i] == '\t')
			{ 
				words[i] = ' ';
			}//end of if statement
	
			//ignoring punctuation 	
			if(words[i] == ',' || words[i] == '.' || words[i] == '!' || words[i] == '?' || words[i] == '"' || words[i] == ':' || words[i] ==';' || words[i] == '-' || words[i] == '/' || words[i] == '`' || words[i] == '&' || words[i] == '@' || words[i] == '^' || words[i] == '(' || words[i] == ')' || words[i] == '<' || words[i] == '>' || words[i] == '#' || words[i] == '%' || words[i] == '{' || words[i] == '}' || words[i] == '[' || words[i] == ']' || words[i] == '|' || words[i] == '+' || words[i] == '*')
			{
				words[i] = ' ';
			}//end of if statement
				
			//ignore if there is double space
			if(words[i] == '  ')
			{
				words[i] = ' ';
			}//end of if statement
			
		} //end of for loop	
		//tree->exists(words);
		if(!tree->exists(words))
		{
			cout <<"Misspelled: " << words << endl;
		}
	}
	
	delete tree;
	
	return 0;
}

The code above is the spellChecker.cpp file.

// Checks if a word is in the tree
bool BinarySearchTree::exists(std::string word) const
{
    Node* node = root;
	while(node != nullptr)
	{
		if(node->data == word) 
		{
			return true;
		}
		else
		{
			if (word > node->data)
			{
				node = node->right;
			}
			else
			{
				node = node->left;
			}
		}
	}
	return false;
}

//Helper function to insert a word into the tree
void insertHelper(Node **node, std::string word)
{
	//Check if nullptr. If so set new node
	if(*node == nullptr)
	{
		//Create new node
		*node = new Node;
		//Set new word
		(*node)-> data = word;
		//Set branches to nullptr
		(*node)-> left = nullptr;
		(*node)->right = nullptr;
	}
	else  // if not empty
	{
		if(word < (*node)->data)
			insertHelper(&(*node)->left,word);
		else if(word > (*node)->data)
			insertHelper(&(*node)->right, word);
		else
			return;
	}
}

// Adds a word to the tree
void BinarySearchTree::insert(std::string word)
{
	insertHelper(&root, word);
}
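insertHelper recurses once per tree level. If the dictionary file happens to be sorted alphabetically (an assumption, the question does not show it), every word goes to the right of the previous one, the tree degenerates into a chain, and the recursion depth equals the number of words. A sketch of a loop-based insert that avoids deep recursion is below. Node and WordTree here are hypothetical stand-ins, not the definitions from bst.h.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Node (the real one lives in bst.h).
struct Node
{
    std::string data;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Hypothetical minimal tree, not the question's BinarySearchTree class.
class WordTree
{
public:
    WordTree() = default;
    WordTree(const WordTree&) = delete;
    WordTree& operator=(const WordTree&) = delete;

    // Frees nodes with an explicit stack so destruction does not recurse either.
    ~WordTree()
    {
        std::vector<Node*> pending;
        if (root != nullptr) pending.push_back(root);
        while (!pending.empty())
        {
            Node* node = pending.back();
            pending.pop_back();
            if (node->left != nullptr) pending.push_back(node->left);
            if (node->right != nullptr) pending.push_back(node->right);
            delete node;
        }
    }

    // Follows a pointer-to-pointer down the tree until it reaches an empty link.
    void insert(const std::string& word)
    {
        Node** link = &root;
        while (*link != nullptr)
        {
            if (word < (*link)->data) link = &(*link)->left;
            else if (word > (*link)->data) link = &(*link)->right;
            else return; // word already present, nothing to do
        }
        Node* fresh = new Node;
        fresh->data = word;
        *link = fresh;
    }

    // Same loop as the question's exists().
    bool exists(const std::string& word) const
    {
        const Node* node = root;
        while (node != nullptr)
        {
            if (node->data == word) return true;
            node = (word > node->data) ? node->right : node->left;
        }
        return false;
    }

private:
    Node* root = nullptr;
};
```

This keeps the same ordering rule as insertHelper (smaller words left, larger words right, duplicates ignored); it does not balance the tree, so lookups in a degenerate tree are still linear.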

Single-word input file (one word per line):

C 
is
the
most
commonly
used
programming
language
for
writing
operating
systems
The 
first
operatingg
system
written 
in 
C 
is 
Unix
Later 
operating
systems 
like 
Linux 
were 
all 
written 
in 
C 
Not 
only 
is 
C 
the 
language 
of 
operating 
systems 
it 
is 
the 
precursor 
and 
insspiration 
for 
almost 
all 
of 
the 
most 
popular 
high 
level 
languages 
available 
today 
In 
fact
Perl 
PHP 
Python 
and 
Ruby 
are 
all 
writtten 
in
c

The other input file is similar, but the words are on the same line. There are tabs as well.

What I have tried:

I tried changing my punctuation function

//ignoring punctuation 	
			if(wordsDictionary[i] == ',' || wordsDictionary[i] == '.' || wordsDictionary[i] == '!' || wordsDictionary[i] == '?' || wordsDictionary[i] == '"' || wordsDictionary[i] == ':' || wordsDictionary[i] ==';' || wordsDictionary[i] == '-' || wordsDictionary[i] == '/' || wordsDictionary[i] == '`' || wordsDictionary[i] == '&' || wordsDictionary[i] == '@' || wordsDictionary[i] == '^' || wordsDictionary[i] == '(' || wordsDictionary[i] == ')' || wordsDictionary[i] == '<' || wordsDictionary[i] == '>' || wordsDictionary[i] == '#' || wordsDictionary[i] == '%' || wordsDictionary[i] == '{' || wordsDictionary[i] == '}' || wordsDictionary[i] == '[' || wordsDictionary[i] == ']' || wordsDictionary[i] == '|' || wordsDictionary[i] == '+' || wordsDictionary[i] == '*')
			{
				wordsDictionary[i] = ' ';
			}//end of if statement
				
			//ignore if there is double sp

to ispunct, but it didn't work. I have no idea how to ignore punctuation without replacing it with something else. I tried calling the insert and exists functions in a different loop. I also tried a method something like this:

#include <iostream>
#include <string>
#include <algorithm>
#include <cctype> // for ispunct
using namespace std;

int main() {
    string str = "this. is my string. it's here.";

    transform(str.begin(), str.end(), str.begin(), [](char ch)
    {
        if( ispunct(ch) )
            return '\0';
        return ch;
    });
}
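The transform call above writes '\0' over each punctuation character, so the string keeps its length and the characters are replaced rather than removed. One common way to drop them entirely is the erase-remove idiom. The sketch below assumes a hypothetical helper normalizeWord (not in the question's code) and assumes that lower-casing plus stripping every character for which std::ispunct is true matches what the dictionary should contain.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lower-cased copy of `word` with every
// punctuation character removed instead of replaced.
std::string normalizeWord(std::string word)
{
    // std::remove_if shifts the kept characters to the front and returns the
    // new logical end; erase() then cuts off the leftover tail.
    word.erase(std::remove_if(word.begin(), word.end(),
                              [](unsigned char ch) { return std::ispunct(ch) != 0; }),
               word.end());

    // Lower-case what is left. The unsigned char parameter avoids passing
    // negative values to std::tolower.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return word;
}
```

Applying the same helper to both the dictionary words and the input words keeps the two sides comparable. A token made only of punctuation becomes an empty string, which the caller would probably want to skip before calling exists.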
