How to force breaking of non-breakable strings?


Question

I have an HTML page that I generate from data contained in a database. The database sometimes contains long strings that the browser can't break because the strings don't contain any breakable characters (spaces, periods, commas, etc.).

Is there any way to fix this using HTML, CSS, or even JavaScript?

See this link for an example of the problem.

Recommended answer

Based on this article and this one as well: the "Shy Hyphen" or "Soft Hyphen" can be written in HTML as &shy; / &#173; / &#xAD; (173 decimal = AD hex). They all convert to the U+00AD character.

The textContent and nodeValue properties of DOM text nodes are not entity encoded; they contain the actual characters. To write a soft hyphen from JavaScript you must therefore put the character itself into the string rather than an HTML entity: "\xAD" is a simple way to write that character in a JavaScript string. String.fromCharCode(173) would also work.
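As a quick check of the escapes mentioned above, the following sketch compares the different spellings of U+00AD in a JavaScript string. The variable names are only illustrative and are not part of the original answer.

```javascript
// Three equivalent ways to write U+00AD (soft hyphen) in a JavaScript string.
const fromHexEscape = "\xAD";                  // two-digit hex escape
const fromUnicodeEscape = "\u00AD";            // four-digit Unicode escape
const fromCharCode = String.fromCharCode(173); // 173 decimal = AD hex

console.log(fromHexEscape === fromUnicodeEscape); // true
console.log(fromHexEscape === fromCharCode);      // true
console.log(fromHexEscape.length);                // 1
```

Any of these strings can be assigned to textContent or nodeValue directly; writing the text "&shy;" there would show up as those literal characters instead.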

Based on your own very good answer, here is a jQuery plugin version:

$.fn.replaceInText = function(oldText, newText) {
  // contents() gets all child DOM nodes, text nodes included; each() lets us operate on them
  this.contents().each(function() {
    if (this.nodeType === 3) { // text node found, do the replacement
      if (this.textContent) {
        this.textContent = this.textContent.replace(oldText, newText);
      } else { // fallback for old IE, which lacks textContent
        this.nodeValue = this.nodeValue.replace(oldText, newText);
      }
    } else if (this.nodeType === 1) {
      // element node: scan its children for the same replacement
      $(this).replaceInText(oldText, newText);
    }
  });
  return this;
};

$(function() {
  // Insert a soft hyphen (U+00AD) after every run of 10 word characters.
  // Note: $('div') also matches nested divs, so their text is processed once per enclosing div.
  $('div').replaceInText(/\w{10}/g, "$&\xAD");
});

Side note:

I think the place this should happen is not in JavaScript; it should be in the server-side code. If this is only a page used to display data, you could easily do a similar regexp replace on the text before it is sent to the browser. However, the JavaScript solution offers one advantage (or disadvantage, depending on how you want to look at it): it doesn't add any extraneous characters to the data until the script executes, which means any robots crawling your HTML output for data won't see the soft hyphens. Although the HTML spec interprets U+00AD as a "hyphenation hint" and an invisible character, that behavior is not guaranteed across the rest of the Unicode world (quote from the Unicode standard via the second article I linked):


U+00AD soft hyphen indicates a hyphenation point, where a line-break is preferred when a word is to be hyphenated. Depending on the script, the visible rendering of this character when a line break occurs may differ (for example, in some scripts it is rendered as a hyphen -, while in others it may be invisible).
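The server-side replacement suggested in the side note could look roughly like the sketch below, assuming the server runs JavaScript (for example Node.js) and that the text is processed as plain text before it is placed into the HTML. The function name insertSoftHyphens and its every parameter are hypothetical, not something from the original answer.

```javascript
// Hypothetical helper: insert a soft hyphen (U+00AD) after every `every`
// consecutive word characters, but only when more word characters follow,
// so a word of exactly `every` characters is left untouched.
function insertSoftHyphens(text, every = 10) {
  const SOFT_HYPHEN = "\u00AD";
  const run = new RegExp(`\\w{${every}}(?=\\w)`, "g");
  return text.replace(run, (match) => match + SOFT_HYPHEN);
}

const hyphenated = insertSoftHyphens("a".repeat(25));
console.log(hyphenated.length);                      // 27: 25 letters plus 2 soft hyphens
console.log(insertSoftHyphens("short words only"));  // unchanged, no run reaches 10 characters
```

The lookahead is a small design choice that differs from the /\w{10}/g pattern used in the plugin: it avoids leaving a soft hyphen at the very end of a word, where a break hint serves no purpose.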



Another note: found in this other SO question, it seems that the "Zero Width Space" character &#8203; / &#x200b; / U+200B is another option you might want to explore. It would be "\u200B" as a JavaScript string. (A \x escape takes exactly two hex digits, so "\x20\x0b" would be a space followed by a vertical tab, not U+200B.)
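A minimal sketch of using the zero width space from JavaScript, written with the four-digit escape "\u200B". The helper name addBreakOpportunities is hypothetical; the regular expression is the same one used with the jQuery plugin above.

```javascript
// U+200B needs a four-digit escape; String.fromCharCode(0x200B) is equivalent.
const ZERO_WIDTH_SPACE = "\u200B";

// Hypothetical helper: add an invisible break opportunity after every
// run of 10 word characters.
function addBreakOpportunities(text) {
  return text.replace(/\w{10}/g, "$&" + ZERO_WIDTH_SPACE);
}

const broken = addBreakOpportunities("abcdefghijklmnopqrstuvwxyz");
console.log(broken.length);           // 28: 26 letters plus 2 zero width spaces
console.log(ZERO_WIDTH_SPACE.length); // 1
```

Unlike the soft hyphen, the zero width space does not draw a hyphen where the line breaks, which may or may not be what you want for data such as identifiers or URLs.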
