Column names with line breaks

Published: 2020/5/29 23:47:11 · Tag: postgresql


Question

I know that for text strings in PostgreSQL, escape sequences such as line breaks are interpreted when the symbol E or e is placed in front of the text:

SELECT E'first\nsecond'

Result:

first
second

But PostgreSQL also supports line breaks within column names. I'm not sure why, or how evil this practice is, but one can do the following:

CREATE TABLE One("first\nsecond" text);
CREATE TABLE Two("first
second" text);

When you are unfortunate enough to run into one of these, you find that while these queries work:

SELECT "first\nsecond" from One;
SELECT "first
second" from Two;

these do not:

SELECT "first
second" from One;
SELECT "first\nsecond" from Two;

My question is: is there a way in PostgreSQL to unify such differences, similar to what is possible for column values?

I have tried putting E in front of the "first\nsecond" column name, but that is not supported. Using \r\n instead (I'm on Windows) gave me a third kind of column name, one that can only be queried as:

SELECT "first\r\nsecond" FROM Third

Recommended answer

Column names are identifiers, and the gory details of the syntax for identifiers are described at:

http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

TL;DR: use the U&"..." syntax to inject non-printable characters into identifiers through their Unicode code points. There is no way to unify CR,LF with LF alone.

We're allowed to use Unicode escape sequences in identifiers, so per the documentation, the following does work:

select U&"first\000asecond" from Two;

if it's just a newline character between the two words.
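
As a minimal sketch of what that equivalence means in practice, assuming table Two from the question exists: the escaped form and the form with a literal line feed name the same column, so either can be used wherever an identifier is accepted. The new column name first_second in the optional rename is a hypothetical choice, not something from the question.

```sql
-- \000a inside U&"..." is the code point U+000A (line feed), so these two
-- statements refer to the same column of table Two.
SELECT U&"first\000asecond" FROM Two;
SELECT "first
second" FROM Two;

-- Optional clean-up (first_second is a hypothetical replacement name):
-- rename the column so that later queries need no escapes at all.
ALTER TABLE Two RENAME COLUMN U&"first\000asecond" TO first_second;
```

The escaped spelling has the advantage of surviving copy and paste through tools that normalize line endings.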

When the table is created with:

CREATE TABLE One("first\nsecond" text);

As the backslash character has no special meaning here, this column name does not contain any newline. It contains first, followed by \, followed by n, followed by second. So:

 SELECT "first\nsecond" from One;

does work because it's the same as what's in the CREATE TABLE, whereas

SELECT "first
second" from One;

fails because there's a newline in that SELECT, whereas the actual column name in the table has a backslash followed by an n.
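
One way to confirm this for table One is to look at the stored name itself. The sketch below assumes the table lives in the public schema (as the later queries in this answer do); chr(92) is the backslash and chr(10) the line feed.

```sql
-- For One's column, name_length should be 13 (5 + 2 + 6 characters),
-- backslash_pos should be 6 and newline_pos should be 0 (no line feed).
SELECT column_name::text                   AS name,
       length(column_name::text)           AS name_length,
       strpos(column_name::text, chr(92))  AS backslash_pos,
       strpos(column_name::text, chr(10))  AS newline_pos
  FROM information_schema.columns
 WHERE table_schema = 'public'
   AND table_name = 'one';

-- In a U&"..." identifier the backslash is the escape character,
-- so a literal backslash has to be written twice.
SELECT U&"first\\nsecond" FROM One;
```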

The situation is the opposite with:

CREATE TABLE Two("first
second" text);

The newline is taken verbatim and is part of the column name. So

SELECT "first
second" from Two;

works because the newline is there exactly as in the CREATE TABLE, with an embedded newline, whereas

SELECT "first\nsecond" from Two;

fails because, as before, \n in this context does not mean a newline.

As mentioned in the comments and your edit, this could be a carriage return and a newline instead, in which case the following should do:

select U&"first\000d\000asecond" from Two;

although in my test, hitting Enter in the middle of a column name with psql on Unix and Windows has the same effect: a single newline in the column's name.
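
To see the CR,LF case without depending on how a client sends the Enter key, one could create the name explicitly through escapes. This is only a sketch: crlf_demo is a hypothetical table name, and the last query assumes the table ends up in the public schema.

```sql
-- crlf_demo is a hypothetical table used only for this illustration.
CREATE TABLE crlf_demo (U&"first\000d\000asecond" text);

-- Matches: the same two code points (U+000D, U+000A) sit between the words.
SELECT U&"first\000d\000asecond" FROM crlf_demo;

-- Would not match: a lone line feed makes a different identifier.
-- SELECT U&"first\000asecond" FROM crlf_demo;

-- Byte-level check: 0d0a should appear between the bytes of
-- 'first' (6669727374) and 'second' (7365636f6e64).
SELECT encode(convert_to(column_name::text, 'UTF8'), 'hex') AS name_hex
  FROM information_schema.columns
 WHERE table_schema = 'public'
   AND table_name = 'crlf_demo';
```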

To check what exact characters ended up in a column name, we can inspect them in hexadecimal.

When applied to your create table example, from inside psql under Unix:

CREATE TABLE Two("first
second" text);

select convert_to(column_name::text,'UTF-8')
 from information_schema.columns 
 where table_schema='public'
   and table_name='two';

Result:
        convert_to         
----------------------------
 \x66697273740a7365636f6e64
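
The same idea can be turned into a search for suspicious names. The following is a sketch rather than a complete audit: it assumes the public schema and only looks for line feeds, carriage returns, and backslashes, using chr() so that it does not depend on how string literals treat backslashes.

```sql
-- List columns whose names contain a line feed (chr(10)),
-- a carriage return (chr(13)) or a backslash (chr(92)),
-- together with the hex dump of the name.
SELECT table_name,
       column_name::text AS raw_name,
       encode(convert_to(column_name::text, 'UTF8'), 'hex') AS name_hex
  FROM information_schema.columns
 WHERE table_schema = 'public'
   AND (strpos(column_name::text, chr(10)) > 0
     OR strpos(column_name::text, chr(13)) > 0
     OR strpos(column_name::text, chr(92)) > 0);
```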

For more complex cases (e.g. non-ASCII characters that take several bytes in UTF-8), a more advanced query might help by showing easy-to-read code points:

select c,lpad(to_hex(ascii(c)),4,'0') from (
  select regexp_split_to_table(column_name::text,'')  as c
    from  information_schema.columns
    where table_schema='public'
    and table_name='two'
  ) as g;

 c | lpad 
---+------
 f | 0066
 i | 0069
 r | 0072
 s | 0073
 t | 0074
  +| 000a
   | 
 s | 0073
 e | 0065
 c | 0063
 o | 006f
 n | 006e
 d | 0064

