Convert a Unicode-escaped string to the corresponding text in C
Question
I need to convert an escaped Unicode string to text in its original language. I read a text file line by line, and a line may contain escaped UTF-8 bytes such as this:
\xE6\xAC\xA2\xE8\xBF\x8E
This is Chinese text, equal to:
欢迎
Now I need to remove this line (\xE6\xAC\xA2\xE8\xBF\x8E) from the text file, convert the escape sequence to Chinese text, and append that Chinese text to the file.
Below is the content of my data.txt file:
testing
programming
\xE6\xAC\xA2\xE8\xBF\x8E
development
I want the file content to become:
testing
programming
development
欢迎
Below is what I have done so far:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 256

int main()
{
    int ctr = 0;
    char ch;
    FILE *fptr1, *fptr2;
    char fname[MAX] = "data.txt";
    char str[MAX], temp[] = "temp.txt";
    char str2[256];

    fptr1 = fopen(fname, "r");
    if (!fptr1)
    {
        printf(" File not found or unable to open the input file!!\n");
        return 0;
    }
    fptr2 = fopen(temp, "w"); // open the temporary file in write mode
    if (!fptr2)
    {
        printf("Unable to open a temporary file to write!!\n");
        fclose(fptr1);
        return 0;
    }
    // copy all contents to the temporary file except the specific line with unicode characters
    while (!feof(fptr1))
    {
        strcpy(str, "\0");
        fgets(str, MAX, fptr1);
        if (!feof(fptr1))
        {
            ctr++;
            if(strstr(str,"\\")!=NULL)
            {
                memset(str2,'\0',sizeof(str2));
                printf("Input String Contains Unicode Character\n");
                str[strlen(str)-1]='\0';
                sprintf(str2,"echo %s >> data.txt",str);
                printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
                system(str2);
            }
            else
            {
                fprintf(fptr2, "%s", str);
            }
        }
    }
    fclose(fptr1);
    fclose(fptr2);
    remove(fname);       // remove the original file
    rename(temp, fname); // rename the temporary file to original name
    /*------ Read the file ----------------*/
    fptr1=fopen(fname,"r");
    ch=fgetc(fptr1);
    printf(" Now the content of the file %s is : \n",fname);
    while(ch!=EOF)
    {
        printf("%c",ch);
        ch=fgetc(fptr1);
    }
    fclose(fptr1);
    /*------- End of reading ---------------*/
    return 0;
}
When I compile and run this code, I see the following output:
Input String Contains Unicode Character
Final String: echo \xE6\xAC\xA2\xE8\xBF\x8E >> data.txt
Unicode String Size: 24
Now the content of the file data.txt is :
testing
programming
development
xE6xACxA2xE8xBFx8E
The same code works as expected when I change the following line (shown first) to use a hardcoded string literal (shown second):
sprintf(str2,"echo %s >> data.txt",str);
sprintf(str2,"echo %s >> data.txt","\xE6\xAC\xA2\xE8\xBF\x8E");
But it does not work when the value is read from the file.
Also, with this line the string is reported as a Unicode string with the correct size:
printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
The String Size: 6
Can someone please let me know how to convert the value to Chinese when it is read from a text file?
Recommended answer
I was able to get the conversion done. Below is my final code:
if(strstr(str,"\\")!=NULL)
{
    memset(str2,'\0',sizeof(str2));
    printf("Input String Contains Unicode Character\n");
    str[strlen(str)-1]='\0';
    sprintf(str2,"echo %s | sed \'s/[\\\\x]//g\' | xxd -r -p >> data.txt",str);
    printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
    system(str2);
}
Thanks for all your responses, and thanks to @chux for the pointer.