Convert a Unicode-escaped string to its corresponding string in C


Problem description

I need to convert a Unicode-escaped string to text in its original language. I am reading a text file line by line, and a line may contain escaped UTF-8 bytes like this:


\xE6\xAC\xA2\xE8\xBF\x8E

This is Chinese text, equal to:


欢迎


Now I need to remove this line (\xE6\xAC\xA2\xE8\xBF\x8E) from the text file, convert the escaped bytes to Chinese text, and append that Chinese text to the file.

Below is the content of my data.txt file:

testing
programming
\xE6\xAC\xA2\xE8\xBF\x8E
development

I want the file content to become:

testing
programming
development
欢迎

Below is what I have done so far:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>


#define MAX 256

  int main() 
  {
        int ctr = 0;
        char ch;
        FILE *fptr1, *fptr2;
        char fname[MAX] = "data.txt";
        char str[MAX], temp[] = "temp.txt";
        char str2[256];

        fptr1 = fopen(fname, "r");
        if (!fptr1) 
        {
                printf(" File not found or unable to open the input file!!\n");
                return 0;
        }
        fptr2 = fopen(temp, "w"); // open the temporary file in write mode 
        if (!fptr2) 
        {
                printf("Unable to open a temporary file to write!!\n");
                fclose(fptr1);
                return 0;
        }

        // copy all contents to the temporary file except the specific line with unicode characters
        while (!feof(fptr1)) 
        {
            strcpy(str, "\0");
            fgets(str, MAX, fptr1);
            if (!feof(fptr1)) 
            {
                ctr++;
                if(strstr(str,"\\")!=NULL)
                {
                    memset(str2,'\0',sizeof(str2));
                    printf("Input String Contains Unicode Character\n");                    
                    str[strlen(str)-1]='\0';

                    sprintf(str2,"echo %s >> data.txt",str);
                    printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
                    system(str2);
                }
                else
                {

                    fprintf(fptr2, "%s", str);                  
                }
            }
        }
        fclose(fptr1);
        fclose(fptr2);
        remove(fname);          // remove the original file 
        rename(temp, fname);    // rename the temporary file to original name
        /*------ Read the file ----------------*/
        fptr1 = fopen(fname, "r");
        ch = fgetc(fptr1);
        printf(" Now the content of the file %s is : \n", fname);
        while (ch != EOF)
        {
                printf("%c", ch);
                ch = fgetc(fptr1);
        }
        fclose(fptr1);
        /*------- End of reading ---------------*/
        return 0;

  }

When I compile and run this code, I see the following output:

Input String Contains Unicode Character
Final String: echo \xE6\xAC\xA2\xE8\xBF\x8E >> data.txt
Unicode String Size: 24
 Now the content of the file data.txt is : 
testing
programming
development
xE6xACxA2xE8xBFx8E

When I replace the first line below with the second, which hard-codes the string as a literal, the same code works as expected:

 sprintf(str2,"echo %s >> data.txt",str); 
 sprintf(str2,"echo %s >> data.txt","\xE6\xAC\xA2\xE8\xBF\x8E");

But when the value is read from the file, it does not work.

Also, in this line the string is identified as a Unicode string with the correct size:

printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
The String Size: 6
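The two sizes reported in this question (24 and 6) can be reproduced with a small sketch. In C source, the compiler replaces each \xHH inside a string literal with one byte, so the hard-coded literal holds the 6 UTF-8 bytes of 欢迎. The line read from data.txt still contains a backslash, an x, and two hex digits for every byte, which gives 24 characters. The names compiled_literal, text_from_file, literal_length and file_text_length are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* Written as a C string literal: the compiler turns each \xHH into one byte. */
static const char compiled_literal[] = "\xE6\xAC\xA2\xE8\xBF\x8E";

/* The same text as fgets() delivers it from data.txt (newline stripped).
 * Each doubled backslash below stands for one literal backslash character. */
static const char text_from_file[] = "\\xE6\\xAC\\xA2\\xE8\\xBF\\x8E";

/* Length of the compiled literal: six UTF-8 bytes. */
size_t literal_length(void)
{
    return strlen(compiled_literal);
}

/* Length of the escaped text: four characters for each of the six bytes. */
size_t file_text_length(void)
{
    return strlen(text_from_file);
}
```

Under these assumptions, literal_length() returns 6 and file_text_length() returns 24, matching the two sizes printed above.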

Can someone please tell me how to convert the value to Chinese when it is read from a text file?

Recommended answer

I was able to get the conversion done. Below is my final code:

                if(strstr(str,"\\")!=NULL)
                {
                    memset(str2,'\0',sizeof(str2));
                    printf("Input String Contains Unicode Character\n");                    
                    str[strlen(str)-1]='\0';


                    sprintf(str2,"echo %s | sed \'s/[\\\\x]//g\' | xxd -r -p >> data.txt",str);
                    printf("Final String: %s\nUnicode String Size: %ld\n",str2,strlen(str));
                    system(str2);
                }

Thanks for all your responses, and thanks @chux for the pointer.
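As an alternative to piping through echo, sed and xxd, the escapes could also be decoded directly in C. The following is a minimal sketch, not the poster's method: decode_hex_escapes and hex_value are hypothetical helpers, and the sketch assumes every escape has exactly the form backslash, x, two hex digits, on a platform with an ASCII-compatible character set. Anything that does not match that form is copied unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit; the caller guarantees isxdigit(c). */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical helper: copy `in` to `out`, replacing each literal "\xHH"
 * sequence (backslash, 'x', two hex digits) with the single byte 0xHH.
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if `out` is too small.
 */
size_t decode_hex_escapes(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    while (*in != '\0') {
        unsigned char byte;

        if (in[0] == '\\' && in[1] == 'x' &&
            isxdigit((unsigned char)in[2]) &&
            isxdigit((unsigned char)in[3])) {
            /* Combine the two hex digits into one byte. */
            byte = (unsigned char)(hex_value((unsigned char)in[2]) * 16 +
                                   hex_value((unsigned char)in[3]));
            in += 4;
        } else {
            /* Not an escape: copy the character as it is. */
            byte = (unsigned char)*in++;
        }

        /* Keep room for this byte and the terminating NUL. */
        if (n + 1 >= out_size)
            return (size_t)-1;
        out[n++] = (char)byte;
    }

    if (out_size == 0)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

Called on the 24-character line from data.txt, this writes the 6 UTF-8 bytes of 欢迎 into out, which could then be written to the output file with fputs instead of building a shell command for system().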
