Escaping double quotation marks in sed


Problem description

I am creating a search-and-replace function for my application and running a test scenario with three files: array, tscript, and test.

I am trying to escape double quotation marks, but it won't work.

The script file contains:

variableName=$1
sed "s#data\-field\=\"${variableName}\.name\"#data\-field\=${variableName}\.name data\-type\=dropdown data\-dropdown\-type\=${variableName}#g" test

The test file contains:

data-field=“fee_category.name”
data-field=“tax_type.name”

The array file contains:

fee_category
tax_type

There is no error code. The output is identical to the input because the sed command cannot find what it is looking for. If I don't use double quotes around ${variableName} in the pattern and also remove them from the test file, the function works fine.

Recommended answer

Following the comment by mklement0, I am writing this answer only to share some of my findings, in case a literal match of your special double quotes is needed. It might be useful to other users.

Your quoted text fee_category.name has a Unicode Left Double Quotation Mark (U+201C) on the left side and a Unicode Right Double Quotation Mark (U+201D) on the right side.

These non-standard quotation marks are encoded in UTF-8 and UTF-16 as follows:

Unicode Left Double Quotation Mark U+201C
UTF-8 (hex) 0xE2 0x80 0x9C (e2809c)
UTF-16 (hex) 0x201C (201c)

Unicode Right Double Quotation Mark U+201D
UTF-8 (hex) 0xE2 0x80 0x9D (e2809d)
UTF-16 (hex) 0x201D (201d)

Analyzing your data with the od utility confirms that the UTF-8 byte sequences above are present:

$ echo data-field=“fee_category.name” |od -w40 -t x1c
0000000  64  61  74  61  2d  66  69  65  6c  64  3d  e2  80  9c  66  65  65  5f  63  61  74  65  67  6f  72  79  2e  6e  61  6d  65  e2  80  9d  0a
          d   a   t   a   -   f   i   e   l   d   = 342 200 234   f   e   e   _   c   a   t   e   g   o   r   y   .   n   a   m   e 342 200 235  \n
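
As a complement to reading the od dump by eye, the same check can be scripted. The following is a minimal sketch, not part of the original answer: the variable names lq, rq, sample and curly_count are hypothetical, and the sample line is rebuilt from the octal byte values shown in the dump (342 200 234 and 342 200 235) so that it does not depend on the terminal locale.

```shell
#!/bin/sh
# Build the two curly quotes from their raw UTF-8 bytes (octal escapes are POSIX printf).
lq=$(printf '\342\200\234')   # U+201C, bytes e2 80 9c
rq=$(printf '\342\200\235')   # U+201D, bytes e2 80 9d

# Hypothetical sample line, mirroring the first line of the test file.
sample="data-field=${lq}fee_category.name${rq}"

# Count the input lines that contain either curly quote.
curly_count=$(printf '%s\n' "$sample" | grep -c -e "$lq" -e "$rq")
echo "lines with curly quotes: $curly_count"
```

Pointing the grep at the real test file instead of the sample variable would report how many lines need attention before any sed substitution is attempted.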

Interestingly, bash can print these Unicode characters either from their Unicode code points or from their UTF-8 byte sequences:

$ echo -e "\u201c test \u201d"
“ test ”
$ echo -e "\xe2\x80\x9c test \xe2\x80\x9d"
“ test ”

Accordingly, we can force sed to match these special characters like this:

$ string=$(echo -e "\u201c test \u201d");echo "$string"
“ test ”
$ lq=$(echo -ne "\u201c");rq=$(echo -ne "\u201d")
$ sed -E "s/($lq)(.+)($rq)/**\2**/" <<<"$string"
** test **

This also seems to work fine, without the need for "helper" variables:

$ sed -E "s/(\xe2\x80\x9c)(.+)(\xe2\x80\x9d)/**\2**/" <<<"$string"
** test **

This means that the hex sequence \xe2\x80\x9c (or \xe2\x80\x9d for the right quote) can be used directly in sed to match these special quotes literally.

You might as well preprocess your files and convert all of these non-standard quotes to standard quotes, using something like:

$ sed -E "s/(\xe2\x80\x9c|\xe2\x80\x9d)/\x22/g" <<<"$string"
" test "   # Special quotes replaced with classic ASCII quotes.

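To show how the preprocessing idea might fit the scenario from the question, here is a hedged end-to-end sketch. It is an assumption-laden illustration rather than the asker's actual setup: the temporary directory, the file names test.ascii and result, and the variables lq, rq, workdir and name are all hypothetical, and the test and array files are recreated from the contents quoted in the question. The curly quotes are passed to sed as literal bytes instead of \x escapes.

```shell
#!/bin/sh
# Curly quotes as raw UTF-8 bytes (octal escapes are POSIX printf).
lq=$(printf '\342\200\234')   # U+201C
rq=$(printf '\342\200\235')   # U+201D

# Recreate the question's files in a scratch directory (hypothetical layout).
workdir=$(mktemp -d)
printf 'data-field=%sfee_category.name%s\n' "$lq" "$rq" >  "$workdir/test"
printf 'data-field=%stax_type.name%s\n'     "$lq" "$rq" >> "$workdir/test"
printf 'fee_category\ntax_type\n' > "$workdir/array"

# Step 1: normalize both curly quotes to the ASCII double quote.
sed -e "s/$lq/\"/g" -e "s/$rq/\"/g" "$workdir/test" > "$workdir/test.ascii"

# Step 2: run the question's substitution once per name in the array file.
cp "$workdir/test.ascii" "$workdir/result"
while IFS= read -r name; do
  sed "s#data-field=\"${name}\.name\"#data-field=${name}.name data-type=dropdown data-dropdown-type=${name}#g" \
    "$workdir/result" > "$workdir/result.tmp" && mv "$workdir/result.tmp" "$workdir/result"
done < "$workdir/array"

result=$(cat "$workdir/result")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Once the quotes are plain ASCII, the pattern from the original script matches. The backslashes before `-` and `=` in that script are not needed, so the sketch omits them.
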
The tests above were run on Debian Testing with Bash 4.4 and GNU sed 4.4; these techniques may not work with other sed implementations.
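
For sed implementations that do not understand \x escapes in the pattern, one possible workaround is to let the shell produce the bytes and hand sed the literal characters. This is a sketch under that assumption, using only POSIX printf octal escapes and a basic regular expression; the variable names lq, rq, string and result are illustrative.

```shell
#!/bin/sh
# Let printf emit the raw UTF-8 bytes so sed never has to interpret \x escapes.
lq=$(printf '\342\200\234')   # U+201C
rq=$(printf '\342\200\235')   # U+201D

string="${lq} test ${rq}"

# Basic regular expression: capture everything between the two curly quotes.
result=$(printf '%s\n' "$string" | sed "s/${lq}\(.*\)${rq}/**\1**/")
echo "$result"
```

Because the pattern contains the quote characters themselves, the match does not depend on GNU-specific escape handling.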
