Convert HTML to a string


Question

I have a string writer function that captures HTML and returns it as a string. For example:

"\r\n\r\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\" >\r\n<head>\r\n <link rel=\"Stylesheet\" href=\"../../Content/style.css\" type=\"text/css\" />\r\n <title>Cover Page</title>\r\n <style type=\"text/css\">\r\n html, body\r\n {\r\n\t font-family: Arial, Helvetica, sans-serif;\r\n\t font-size: 13pt;\r\n\t padding: 0px;\r\n\t margin: 0px;\r\n\t background-color: #FFFFFF;\r\n\t color: black;\r\n\t width: 680px;\r\n }\r\n </style>\r\n</head>\r\n<body>\r\n <div>\r\n Ssotest Ssotest, \r\n </div> \r\n</body>\r\n</html>\r\n"

When I pass this to a PDF generating tool, it throws an error. But when I copy the output of the string writer (the same HTML string as above) from the Locals window in VS2010 and hardcode it like this:

 string test ="\r\n\r\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\" >\r\n<head>\r\n    <link rel=\"Stylesheet\" href=\"../../Content/style.css\" type=\"text/css\" />\r\n    <title>Cover Page</title>\r\n    <style type=\"text/css\">\r\n        html, body\r\n        {\r\n\t        font-family: Arial, Helvetica, sans-serif;\r\n\t        font-size: 13pt;\r\n\t        padding: 0px;\r\n\t        margin: 0px;\r\n\t        background-color: #FFFFFF;\r\n\t        color: black;\r\n\t        width: 680px;\r\n        }\r\n    </style>\r\n</head>\r\n<body>\r\n    <div>\r\n        Ssotest Ssotest, \r\n    </div> \r\n</body>\r\n</html>\r\n"

and pass it to the tool, it works fine. In both cases the string is the same. I wonder what makes the difference. Does something get converted when I copy the text and hardcode it? Any suggestions?

Just an update: I used this code to format the string.

        using System;
        using System.Collections.Generic;
        using System.Text.RegularExpressions;

        public class ReplaceString
        {
            static readonly IDictionary<string, string> m_replaceDict
                = new Dictionary<string, string>();

            const string ms_regexEscapes = @"[\a\b\f\n\r\t\v\\""]";

            public static string StringLiteral(string i_string)
            {
                return Regex.Replace(i_string, ms_regexEscapes, match);
            }

            public static string CharLiteral(char c)
            {
                return c == '\'' ? @"'\''" : string.Format("'{0}'", c);
            }

            private static string match(Match m)
            {
                string match = m.ToString();
                if (m_replaceDict.ContainsKey(match))
                {
                    return m_replaceDict[match];
                }

                throw new NotSupportedException();
            }

            static ReplaceString()
            {
                m_replaceDict.Add("\a", @"\a");
                m_replaceDict.Add("\b", @"\b");
                m_replaceDict.Add("\f", @"\f");
                m_replaceDict.Add("\n", @"\n");
                m_replaceDict.Add("\r", @"\r");
                m_replaceDict.Add("\t", @"\t");
                m_replaceDict.Add("\v", @"\v");

                m_replaceDict.Add("\\", @"\\");
                m_replaceDict.Add("\0", @"\0");

                //The SO parser gets fooled by the verbatim version 
                //of the string to replace - @"\"""
                //so use the 'regular' version
                m_replaceDict.Add("\"", "\\\"");
            }

            static void Main(string[] args)
            {

                string s = "here's a \"\n\tstring\" to test";
                Console.WriteLine(ReplaceString.StringLiteral(s));
                Console.WriteLine(ReplaceString.CharLiteral('c'));
                Console.WriteLine(ReplaceString.CharLiteral('\''));

            }
        }

But the string is returned like this:

\\r\\n\\r\\n<!DOCTYPE html PUBLIC \\\"-//W3C//DTD XHTML 1.0 Transitional//EN\\\" \\\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\\\">\\r\\n\\r\\n<html xmlns=\\\"http://www.w3.org/1999/xhtml\\\" >\\r\\n<head>\\r\\n    <link rel=\\\"Stylesheet\\\...."

which doesn't make sense. Here is the PDF generator code I am using:

string test="\r\n\r\n<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\" >\r\n<head>\r\n <link rel=\"Stylesheet\" href=\"../../Content/style.css\" type=\"text/css\" />\r\n <title>Cover Page</title>\r\n <style type=\"text/css\">\r\n html, body\r\n {\r\n\t font-family: Arial, Helvetica, sans-serif;\r\n\t font-size: 13pt;\r\n\t padding: 0px;\r\n\t margin: 0px;\r\n\t background-color: #FFFFFF;\r\n\t color: black;\r\n\t width: 680px;\r\n }\r\n </style>\r\n</head>\r\n<body>\r\n <div>\r\n Ssotest Ssotest, \r\n </div> \r\n</body>\r\n</html>\r\n"

FileStreamResponseContext response = new FileStreamResponseContext();
Document doc = new Document();
doc.DocumentInformation.CreationDate = DateTime.Now;
doc.DocumentInformation.Title = "Test Plan";
doc.DocumentInformation.Subject = "Test Plan";
doc.CompressionLevel = CompressionLevel.NormalCompression;
doc.Margins = new Margins(0, 0, 0, 0);
doc.Security.CanPrint = true;
doc.ViewerPreferences.HideToolbar = false;
doc.ViewerPreferences.FitWindow = false;

string baseUrl = String.Format("http://localhost{0}/", Request.Url.Port == 80 ? "" : ":" + Request.Url.Port.ToString());

PdfPage docTestPlan = doc.AddPage(PageSize.Letter, new Margins(0, 0, 0, 0), PageOrientation.Portrait);

// passing the string test returned from the string writer
HtmlToPdfElement htmlToPdf = new HtmlToPdfElement(test, baseUrl);
htmlToPdf.FitWidth = false;
docTestPlan.AddElement(htmlToPdf);

// put doc in a memory stream for return
response.FileDataStream = new MemoryStream();
doc.Save(response.FileDataStream);
doc.Close();
response.FileDataStream.Position = 0;

return new FileStreamResult(response.FileDataStream, "application/pdf");

Recommended answer

After a long struggle I found the reason. The error occurs because, when the string comes from the string writer at run time, output keeps being added to the System.Web.HttpResponseBase Response object. When I pass the string directly by hardcoding it, nothing extra is written to the Response object. So the final solution is to add a call to Response.Clear(), which clears any content already buffered in the response. Now it works fine. Thanks all for your suggestions. Cheers!
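
The answer does not show where the call was placed. A minimal sketch, assuming an ASP.NET MVC controller with response buffering enabled (the default), might clear the response immediately before returning the PDF stream. The names PdfControllerBase and SendPdf are hypothetical and do not appear in the original post.

```csharp
using System.IO;
using System.Web.Mvc;

// Hypothetical base controller; the class and method names are illustrative only.
public abstract class PdfControllerBase : Controller
{
    // Returns the given stream as a PDF after discarding any buffered output.
    protected FileStreamResult SendPdf(Stream pdf)
    {
        // Drop whatever was already written to the buffered response
        // (for example, markup emitted while the HTML was being captured),
        // so that only the PDF bytes reach the client.
        Response.Clear();

        // Rewind so the result streams the document from the beginning.
        pdf.Position = 0;

        return new FileStreamResult(pdf, "application/pdf");
    }
}
```

With a helper like this, the last lines of the action in the question could become `return SendPdf(response.FileDataStream);`. Note that Response.Clear() only discards content still held in the buffer; anything already flushed to the client cannot be removed.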
