Remove unnecessary RTF data from string
Problem description
I have a piece of RTF data that is being sent to my application, shown below:
{\rtf1\sstecf22000\ansi\deflang2057\ftnbj\uc1\deff0
{\fonttbl{\f0 \fnil \fcharset0 Microsoft Sans Serif;}{\f1 \fswiss Tahoma;}}
{\colortbl ;\red0\green0\blue0 ;\red255\green255\blue255 ;}
{\stylesheet{\f1\fs18 Normal;}{\cs1 Default Paragraph Font;}}
{\*\revtbl{Unknown;}{JOE BLOGS;}}
{\info{\doccomm TEST1 TEST1}}\paperw12240\paperh15840\margl1800\margr1800\margt1440\margb1440\headery720\footery720\nogrowautofit\deftab720\formshade\fet4\aendnotes\aftnnrlc\pgbrdrhead\pgbrdrfoot\revisions
\sectd\pgwsxn12240\pghsxn15840\guttersxn0\marglsxn1800\margrsxn1800\margtsxn1440\margbsxn1440\headery720\footery720\sbkpage\pgncont\pgndec
\plain\plain\f1\fs18\ql\plain\f1\fs18\plain\f0\fs17\lang2057\hich\f0\dbch\f0\loch\f0\fs17
\deleted\revauthdel1\revdttmdel1196190643 \{\\Rtf1\\Ansi\\Deff0\{\\Fonttbl\{\\F0\\Fnil\\Fcharset0 Microsoft Sans Serif;\}\}\par \\Viewkind4\\Uc1\\Pard\\Lang2057\\F0\\Fs17 Cup....\\Par\par \}\par\plain\f0\fs17\lang2057\hich\f0\dbch\f0\loch\f0\fs17
\revised\revauth1\revdttm1196190643 hello world \plain\f1\fs18\par
}
When I convert it to plain text, RTF data is still displayed:
\{\\Rtf1\\Ansi\\Deff0\{\\Fonttbl\{\\F0\\Fnil\\Fcharset0 Microsoft Sans Serif;\}\}\par \\Viewkind4\\Uc1\\Pard\\Lang2057\\F0\\Fs17 Cup....\\Par\par \}
So how would I detect and remove this unnecessary RTF data?
\{\\Rtf1\\Ansi\\Deff0\{\\Fonttbl\{\\F0\\Fnil\\Fcharset0 Microsoft Sans Serif;\}\}\par \\Viewkind4\\Uc1\\Pard\\Lang2057\\F0\\Fs17 Cup....\\Par\par \}
I tried to use a regex, but it matches everything in the RTF block that I have,
e.g.
({\\)(.+?)(})|(\\)(.+?)(\b)|}$
I want to remove only the unnecessary RTF data, not the whole RTF block shown above.
What I have tried:
I tried the following code to see whether I could remove the unnecessary RTF data, but I think specifying the string like this is wrong.
string result = rtfString;
// Plain-text form of the embedded block that should be stripped out.
const string toLookFor = "{\\Rtf1\\Ansi\\Deff0{\\Fonttbl{\\F0\\Fnil\\Fcharset0 Microsoft Sans Serif;}}\n\\Viewkind3\\Uc1\\Pard\\Lang2057\\F0\\Fs17 Cup....\\Par\n}\ntext 3";
try
{
    if (IsRichText(rtfString))
    {
        if (rtfString.Contains(toLookFor))
        {
            // Store the stripped text in result so that it is actually returned.
            result = rtfString.Replace(toLookFor, "");
        }
    }
    else
    {
        result = rtfString;
    }
}
catch
{
    throw;
}
return result;
On Windows you can use a RichText edit control to convert RTF to plain text: create the control in memory without displaying it, assign the RTF to it, and read back the text.
C# example: How to: Convert RTF to Plain Text (C# Programming Guide).
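As a rough illustration of the control-based approach, here is a minimal sketch. It assumes a project that references System.Windows.Forms; the class and method names (`RtfToText`, `ToPlainText`) are made up for this example and do not come from the linked guide.

```csharp
using System.Windows.Forms;

static class RtfToText
{
    // Hypothetical helper: converts an RTF string to plain text by loading it
    // into a RichTextBox that is never added to a form or shown on screen.
    public static string ToPlainText(string rtf)
    {
        using (var box = new RichTextBox())
        {
            // Assigning invalid RTF to the Rtf property throws an ArgumentException.
            box.Rtf = rtf;
            // Text returns the content with the RTF control words stripped.
            return box.Text;
        }
    }
}
```

A call such as `string text = RtfToText.ToPlainText(rtfString);` would then replace the manual string matching. The control only strips real RTF markup, so text that is escaped inside the document (like the `\{\\Rtf1 ... \}` run in the question) may well still come through as literal text.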
On Linux you can use the unrtf(1) tool (see its Linux man page) or check its source code.
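If the converted plain text still contains the embedded block from the question, one possible follow-up is a post-processing pass that removes any balanced `{...}` group starting with `{\rtf`. The sketch below is only an assumption about what "unnecessary" means here: `EmbeddedRtfStripper` and `StripEmbeddedRtf` are hypothetical names, and the brace counting is naive, so a literal `{` or `}` inside the embedded text would confuse it.

```csharp
using System;

static class EmbeddedRtfStripper
{
    // Hypothetical helper: removes every balanced {...} group that begins with
    // "{\rtf" (case-insensitive) from text that has already been converted
    // to plain text.
    public static string StripEmbeddedRtf(string plainText)
    {
        const string marker = @"{\rtf";
        int start;
        while ((start = plainText.IndexOf(marker, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            int depth = 0;
            int end = -1;
            // Walk forward from the marker until its opening brace is closed.
            for (int i = start; i < plainText.Length; i++)
            {
                if (plainText[i] == '{')
                {
                    depth++;
                }
                else if (plainText[i] == '}' && --depth == 0)
                {
                    end = i;
                    break;
                }
            }
            if (end < 0)
            {
                break; // Unbalanced braces: leave the remaining text untouched.
            }
            plainText = plainText.Remove(start, end - start + 1);
        }
        return plainText;
    }
}
```

Matching on structure rather than on one exact literal avoids the problem of a hard-coded search string that differs from the data in a single control word or line break.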