Null character in strings


Question

Consider this string:

var s = "A\0Z";

Its length is 3, as given by s.length. Using console.log you can see that the string isn't cut, that s[1] is "\0" (a NUL character, which has no visible glyph), and that s.charCodeAt(1) is 0.
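The observations above can be reproduced without relying on how a console happens to render the NUL character. The following is a minimal sketch; the `info` object and the use of `JSON.stringify` to make the character visible are illustrative additions, not part of the original question.

```javascript
// Inspect a string that contains an embedded NUL character.
var s = "A\0Z";

var info = {
  length: s.length,            // number of UTF-16 code units in the string
  codeAt1: s.charCodeAt(1),    // numeric value of the code unit at index 1
  isNul: s[1] === "\0",        // true if index 1 holds the NUL character
  last: s[2],                  // the character after the NUL is still there
  visible: JSON.stringify(s)   // JSON escapes control characters, so NUL shows as \u0000
};

console.log(info);
```

Because JSON.stringify escapes control characters, the `visible` field is a convenient way to confirm the NUL is present even when the raw string looks like "AZ".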

When you alert it in Firefox, you see AZ. When you alert it in Chrome/Linux using alert(s), the \0 terminates the string and you see A.

My question is: what should browsers and JavaScript engines do? Is Chrome buggy here? Is there a document defining what should happen?

As this is a question about standards, references are needed.

Recommended answer

What the browser should do is keep track of the string and its length separately, since the standard has no null terminators. (A string is just an object with a length.)

What Chrome seems to do (I am taking your word for this) is use the standard C string functions, which stop at a \0. To answer one of your questions: yes, to me this constitutes a bug in Chrome's handling of the alert() function.
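To make the difference concrete, here is a small sketch that imitates what a NUL-terminated C API would see when handed a JavaScript string. The helper `asCString` is hypothetical and written only for illustration; it is not Chrome's actual code, just a model of the behaviour described above.

```javascript
// Hypothetical helper: returns the part of a string that a C-style,
// NUL-terminated API would treat as "the whole string".
function asCString(str) {
  var end = str.indexOf("\0");                  // position of the first NUL, or -1
  return end === -1 ? str : str.slice(0, end);  // cut at the NUL if there is one
}

var s = "A\0Z";
var seenByCApi = asCString(s);

console.log(JSON.stringify(seenByCApi)); // what a NUL-terminated API would display
console.log(seenByCApi.length, s.length); // truncated length vs. real length
```

A length-tracked representation never needs such a scan: the string ends where its stored length says it does, whatever characters it contains.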

The specification officially says:

A string literal is zero or more characters enclosed in single or double quotes. Each character may be represented by an escape sequence. All characters may appear literally in a string literal except for the closing quote character, backslash, carriage return, line separator, paragraph separator, and line feed. Any character may appear in the form of an escape sequence.

Also:

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of character values (CV) contributed by the various parts of the string literal.

Regarding the NUL byte:

The CV [Character Value] of EscapeSequence :: 0 [lookahead ∉ DecimalDigit] is a <NUL> character (Unicode value 0000).

Therefore, a NUL byte should simply be "yet another character value" with no special meaning, as opposed to other languages where it might end an SV (String value).
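A short sketch of what "yet another character value" means in practice: the different ways of writing NUL all yield the same one-character string, and ordinary string operations treat it like any other character. The variable names here are illustrative only.

```javascript
// Several spellings of the NUL character; each should be the same one-character string.
var forms = ["\0", "\x00", "\u0000", String.fromCharCode(0)];
var allSame = forms.every(function (f) {
  return f === forms[0] && f.length === 1;
});

// Ordinary string operations work across the NUL like any other character.
var s = "A\0Z";
var parts = s.split("\0");   // NUL used as an ordinary separator
var idx = s.indexOf("Z");    // searching continues past the NUL
var joined = s + "!";        // concatenation keeps everything after the NUL

console.log(allSame, parts, idx, joined.length);
```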

For reference on the valid "String Single Character Escape Sequences", have a look at section 7.8.4 of the ECMAScript Language Specification. There is a table at the end of that section listing these escape sequences.

What someone aiming to write a JavaScript engine could probably learn from this: don't use C/C++ string functions. :)
