Column names with line breaks
Question
I know that for text strings in PostgreSQL, line breaks can be written uniformly as \n by prefixing the string with E or e:
SELECT E'first\nsecond'
Result:
first
second
But PostgreSQL also supports line breaks within column names. I'm not sure why, or how evil this practice is, but one can do the following:
CREATE TABLE One("first\nsecond" text);
CREATE TABLE Two("first
second" text);
If you are unfortunate enough to run into one of these, you will find that while these queries work:
SELECT "first\nsecond" from One;
SELECT "first
second" from Two;
these do not:
SELECT "first
second" from One;
SELECT "first\nsecond" from Two;
My question is: is there a way in PostgreSQL to unify such differences, similar to what E does for column values?
I have tried putting E in front of the "first\nsecond" column name, but that is not supported. Using \r\n instead (I'm on Windows) gave me a third kind of column name, one that can only be queried as:
SELECT "first\r\nsecond" FROM Third
Recommended answer
Column names are identifiers, and the gory details of the syntax for identifiers are described at:
http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
TL;DR: use the U&"..." syntax to inject non-printable characters into identifiers through their Unicode code points. There is no way to make CR,LF match LF alone.
We're allowed to use Unicode escape sequences in identifiers, so, per the documentation, the following does work:
select U&"first\000asecond" from Two;
if it's just a newline character between the two words.
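As a minimal sketch of that equivalence, the following creates a scratch table whose column name is spelled with the Unicode escape and then selects it with both spellings. The table name demo_nl is a hypothetical name chosen for illustration, and the sketch assumes the client sends a bare LF for the line break inside the quoted identifier.

```sql
-- demo_nl is a hypothetical scratch table, not part of the original question.
CREATE TABLE demo_nl (U&"first\000asecond" text);
INSERT INTO demo_nl VALUES ('x');

-- Spelling 1: the code point U+000A written as a Unicode escape.
SELECT U&"first\000asecond" FROM demo_nl;

-- Spelling 2: a literal line break inside the quoted identifier.
-- This names the same column only if the line break is a single LF.
SELECT "first
second" FROM demo_nl;

DROP TABLE demo_nl;
```

The escape form has the advantage that it does not depend on how the editor or client encodes line endings.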
When the table is created with:
CREATE TABLE One("first\nsecond" text);
the backslash character has no special meaning, so this column name does not contain any newline. It contains first, followed by \, followed by n, followed by second. So:
SELECT "first\nsecond" from One;
does work because it's the same as what's in the CREATE TABLE, whereas
SELECT "first
second" from One;
fails because there's a newline in that SELECT, whereas the actual column name in the table has a backslash followed by an n.
For table Two, the situation is reversed:
CREATE TABLE Two("first
second" text);
The newline is taken verbatim and is part of the column name. So
SELECT "first
second" from Two;
works because the newline is there exactly as in the CREATE TABLE, with an embedded newline, whereas
SELECT "first\nsecond" from Two;
fails because, as before, \n in this context does not mean a newline.
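One way to see the difference between the two tables is to count the characters in each column name. This is a sketch that assumes tables One and Two were created in the public schema exactly as in the question, and that the line break in Two was sent as a bare LF.

```sql
-- Assumes tables One and Two exist in schema public, created as in the question.
-- Unquoted table names are folded to lower case, hence 'one' and 'two'.
SELECT table_name,
       length(column_name::text) AS name_length
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name IN ('one', 'two')
ORDER BY table_name;
-- one: 13 characters (first, a backslash, the letter n, second)
-- two: 12 characters (first, one newline, second), assuming a bare LF

-- In U& syntax the backslash is the escape character, so the literal
-- backslash in One's column name has to be doubled:
SELECT U&"first\\nsecond" FROM One;
```

The doubled backslash matters only inside U&"..."; in a plain quoted identifier a single backslash is already literal.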
As mentioned in the comments and in your edit, this could be a carriage return followed by a newline instead, in which case the following should do:
select U&"first\000d\000asecond" from Two;
although in my test, hitting Enter in the middle of a column name in psql has the same effect on Unix and Windows: a single newline in the column's name.
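Since CR,LF and LF cannot be made to match each other, one practical option, which goes beyond what the answer itself proposes, is to rename such a column once to a plain name. The sketch below uses a hypothetical scratch table demo_crlf and a hypothetical target name first_second.

```sql
-- demo_crlf and first_second are hypothetical names chosen for illustration.
-- The column name contains CR (U+000D) followed by LF (U+000A).
CREATE TABLE demo_crlf (U&"first\000d\000asecond" text);

-- The U& spelling is accepted wherever an identifier is,
-- so it can be used to address the column in a rename.
ALTER TABLE demo_crlf
    RENAME COLUMN U&"first\000d\000asecond" TO first_second;

-- After the rename, no quoting or escaping is needed.
SELECT first_second FROM demo_crlf;

DROP TABLE demo_crlf;
```

Whether a rename is acceptable depends on what other code refers to the column by its original name.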
To check exactly which characters ended up in a column name, we can inspect them in hexadecimal. Applied to your create table example, from inside psql under Unix:
CREATE TABLE Two("first
second" text);
select convert_to(column_name::text,'UTF-8')
from information_schema.columns
where table_schema='public'
and table_name='two';
Result:
convert_to
----------------------------
\x66697273740a7365636f6e64
For more complex cases (e.g. non-ASCII characters that take several bytes in UTF-8), a more advanced query that prints one code point per row is easier to read:
select c,lpad(to_hex(ascii(c)),4,'0') from (
select regexp_split_to_table(column_name::text,'') as c
from information_schema.columns
where table_schema='public'
and table_name='two'
) as g;
c | lpad
---+------
f | 0066
i | 0069
r | 0072
s | 0073
t | 0074
+| 000a
|
s | 0073
e | 0065
c | 0063
o | 006f
n | 006e
d | 0064
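The same inspection can be widened to find every affected column in a schema. The following is a sketch, assuming the tables live in the public schema; it lists columns whose names contain a line feed, a carriage return, or a backslash, together with the hex dump of each name.

```sql
-- Sketch: assumes the tables of interest are in schema public.
SELECT table_name,
       encode(convert_to(column_name::text, 'UTF8'), 'hex') AS name_hex
FROM information_schema.columns
WHERE table_schema = 'public'
  AND (strpos(column_name::text, chr(10)) > 0    -- LF
    OR strpos(column_name::text, chr(13)) > 0    -- CR
    OR strpos(column_name::text, chr(92)) > 0)   -- backslash
ORDER BY table_name;
```

Using chr() for the three characters avoids any dependence on how string literals or line endings are interpreted by the client.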