How do you deal with the "special" characters that MS Word adds?

Question

I'm wondering how you clean up the special characters that MS Word adds, such as em and en dashes and curly quotes?

I often find myself copying clients' content from Word and pasting it into a static HTML page, but the content ends up with weird characters because the special characters are not converted to their correct ASCII codes and therefore show up as garbled text. (For these basic websites, I'm using Dreamweaver.)

I have seen a lot of similar problems when clients copy content from Word into text-only fields (mostly textareas). When I put that content into a PDF (through PHP), or when it is displayed on the page, it is garbled as well.

How do you deal with this? Is there a cleaning service or program you use?

Recommended answer

Regarding clients posting text copied and pasted from Word into textareas:

The most reliable way to ensure that the client sends you text in a particular encoding (and thus, hopefully, converts it from CP-1252 [or whatever Word uses] for you) is to add the accept-charset="..." attribute to all your <form>s. For example:

<form ... accept-charset="UTF-8">
   ...
</form>

Most browsers will obey that and make sure any "Word-specific" characters are converted to the appropriate character set before they get to your website.
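
To make the encoding mismatch concrete, here is a small Python sketch (Python is chosen here only for illustration; the original discussion mentions PHP). It shows how a single curly opening quote is represented in CP-1252 versus UTF-8, and what appears when UTF-8 bytes are misread as CP-1252. The variable names are illustrative assumptions, not part of the original answer.

```python
# Illustration only: one curly quote, two encodings.
left_quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK (the curly opening quote)

# CP-1252 stores it as a single byte; UTF-8 needs three bytes.
cp1252_bytes = left_quote.encode("cp1252")
utf8_bytes = left_quote.encode("utf-8")

# Reading UTF-8 bytes as if they were CP-1252 produces three unrelated
# characters, which is the typical "garbled text" seen on a page.
garbled = utf8_bytes.decode("cp1252")

print(cp1252_bytes.hex(), utf8_bytes.hex(), garbled)
```

The same character therefore has different byte sequences depending on the encoding, which is why the page, the form, and the server all need to agree on one.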

Once invalid text gets to your website, there's very little you can do to fix it reliably, so it's best to simply check that all input is valid in whatever character set you use, and discard any requests that contain invalid text. This is necessary even with accept-charset, because there are undoubtedly some clients out there that will ignore it.
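
The validation step described above might look roughly like the following sketch. It assumes the site has settled on UTF-8 and that the raw, undecoded bytes of a submitted field are available; many frameworks decode input before your code sees it, so where this check belongs will vary. The function name `decode_utf8_or_reject` is hypothetical.

```python
from typing import Optional


def decode_utf8_or_reject(raw: bytes) -> Optional[str]:
    """Return the decoded text if `raw` is valid UTF-8, otherwise None.

    Hypothetical helper: a None result means the caller should discard
    the request rather than try to repair the text.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# Curly quotes sent as UTF-8 are accepted.
accepted = decode_utf8_or_reject("\u201chi\u201d".encode("utf-8"))

# The same quotes sent as raw CP-1252 bytes (0x93, 0x94) are rejected.
rejected = decode_utf8_or_reject(b"\x93hi\x94")
```

Rejecting rather than guessing follows the point made above: once the bytes are invalid, there is no reliable way to know which encoding the client actually used.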
