Escaping double quotation marks in sed
Question
I am creating a search-and-replace function for my application and running a test scenario with three files: array, tscript, and test.
I am trying to escape double quotation marks, but it won't work.
The tscript file contains:
variableName=$1
sed "s#data\-field\=\"${variableName}\.name\"#data\-field\=${variableName}\.name data\-type\=dropdown data\-dropdown\-type\=${variableName}#g" test
The test file contains:
data-field="fee_category.name"
data-field="tax_type.name"
The array file contains:
fee_category
tax_type
There is no error code. The output is exactly what I put in, because the sed command cannot find the text it is looking for. If I leave out the double quotes next to ${variableName} and remove them from the test file, the function works fine.
Recommended answer
Following mklement0's comment, I am writing this answer only to share some findings in case a literal match of your special double quotes is needed. It might be useful to other users.
Your quoted text “fee_category.name” has a Unicode Left Double Quotation Mark (U+201C) on the left side and a Unicode Right Double Quotation Mark (U+201D) on the right side, not the ASCII double quote (U+0022) that the sed pattern looks for.
Those non-standard quotation marks have the following encodings:
Unicode Left Double Quotation Mark U+201C
UTF-8 (hex) 0xE2 0x80 0x9C (e2809c)
UTF-16 (hex) 0x201C (201c)
Unicode Right Double Quotation Mark U+201D
UTF-8 (hex) 0xE2 0x80 0x9D (e2809d)
UTF-16 (hex) 0x201D (201d)
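The UTF-8 bytes listed above follow from the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx). As a minimal sketch, the arithmetic can be reproduced in the shell; the function name utf8_hex_3byte is hypothetical and the function only covers code points in the range U+0800 to U+FFFF.

```shell
# utf8_hex_3byte is a hypothetical helper, not part of the original answer.
# It encodes one code point in U+0800..U+FFFF as three UTF-8 bytes (hex).
utf8_hex_3byte() {
  cp=$(( $1 ))
  printf '%02x %02x %02x\n' \
    $(( 0xE0 | (cp >> 12) )) \
    $(( 0x80 | ((cp >> 6) & 0x3F) )) \
    $(( 0x80 | (cp & 0x3F) ))
}

utf8_hex_3byte 0x201C   # left double quotation mark  -> e2 80 9c
utf8_hex_3byte 0x201D   # right double quotation mark -> e2 80 9d
```

The top four bits of the code point go into the first byte, the next six into the second, and the last six into the third, which is why both quotes share the prefix e2 80 and differ only in the final byte.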
Analyzing your data with the od utility confirms that the UTF-8 byte sequences above are present:
$ echo data-field=“fee_category.name” | od -w40 -t x1c
0000000 64 61 74 61 2d 66 69 65 6c 64 3d e2 80 9c 66 65 65 5f 63 61 74 65 67 6f 72 79 2e 6e 61 6d 65 e2 80 9d 0a
d a t a - f i e l d = 342 200 234 f e e _ c a t e g o r y . n a m e 342 200 235 \n
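To find out which lines of a file contain these quotes without reading a hex dump, a fixed-string grep for the two byte sequences is one option. This is only a sketch: the temporary file stands in for the real test file, and the variable names lq, rq, and curly_lines are assumptions made for illustration.

```shell
# Bytes of U+201C and U+201D, produced with POSIX octal escapes
# (the same octal values that od printed: 342 200 234 and 342 200 235).
lq=$(printf '\342\200\234')
rq=$(printf '\342\200\235')

# Hypothetical sample data: line 1 uses curly quotes, line 2 uses ASCII quotes.
tmp=$(mktemp)
printf 'data-field=%sfee_category.name%s\n' "$lq" "$rq" > "$tmp"
printf 'data-field="tax_type.name"\n' >> "$tmp"

# -n prints line numbers, -F treats the patterns as fixed strings,
# LC_ALL=C makes grep compare raw bytes regardless of the current locale.
curly_lines=$(LC_ALL=C grep -n -F -e "$lq" -e "$rq" "$tmp")
printf '%s\n' "$curly_lines"

rm -f "$tmp"
```

Only the first line is reported, because the second line contains plain ASCII quotes.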
Interestingly, bash can print these Unicode characters either from their Unicode code points or from their UTF-8 hex byte sequences:
$ echo -e "\u201c test \u201d"
“ test ”
$ echo -e "\xe2\x80\x9c test \xe2\x80\x9d"
“ test ”
Accordingly, we can force sed to match those special characters like this:
$ string=$(echo -e "\u201c test \u201d");echo "$string"
“ test ”
$ lq=$(echo -ne "\u201c");rq=$(echo -ne "\u201d")
$ sed -E "s/($lq)(.+)($rq)/**\2**/" <<<"$string"
** test **
This also seems to work fine without the "helper" variables:
$ sed -E "s/(\xe2\x80\x9c)(.+)(\xe2\x80\x9d)/**\2**/" <<<"$string"
** test **
This means that the hex sequence \xe2\x80\x9c (or \xe2\x80\x9d for the right quote) can be used directly in sed to match these special quotes literally.
You could also pre-process your files and convert all of those non-standard quotes to standard ASCII quotes with something like:
$ # Alternation matches each curly quote as a whole byte sequence; \x22 is the ASCII double quote.
$ sed -E "s/\xe2\x80\x9c|\xe2\x80\x9d/\x22/g" <<<"$string"
" test " #Special quotes replaced with classic ASCII quotes.
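Putting the pre-processing step together with the substitution from the question might look like the sketch below. It assumes GNU sed (for the \xHH escapes) and recreates the array and test files in a temporary directory. The file names current and next, and the variables workdir and result, are illustrative and not from the original post.

```shell
# Recreate the question's files in a scratch directory (hypothetical setup).
workdir=$(mktemp -d)
printf 'fee_category\ntax_type\n' > "$workdir/array"
printf 'data-field=\342\200\234fee_category.name\342\200\235\n' > "$workdir/test"
printf 'data-field=\342\200\234tax_type.name\342\200\235\n' >> "$workdir/test"

# Step 1 (GNU sed assumed): turn both curly quotes into ASCII double quotes.
sed -E 's/\xe2\x80\x9c|\xe2\x80\x9d/"/g' "$workdir/test" > "$workdir/current"

# Step 2: run the question's substitution once per name in the array file.
while IFS= read -r variableName; do
  sed "s#data-field=\"${variableName}\\.name\"#data-field=${variableName}.name data-type=dropdown data-dropdown-type=${variableName}#g" \
    "$workdir/current" > "$workdir/next"
  mv "$workdir/next" "$workdir/current"
done < "$workdir/array"

result=$(cat "$workdir/current")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Once the quotes are plain ASCII, the pattern from the question matches, and each line gains the data-type and data-dropdown-type attributes. The hyphens and equals signs are left unescaped here because they have no special meaning outside bracket expressions.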
The tests above were run on Debian Testing with Bash 4.4 and GNU sed 4.4. These techniques may not work with other sed flavors.
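For sed implementations that may not understand \xHH escapes, one possible workaround is to let the shell's printf produce the raw bytes and pass them to sed as ordinary characters. This is a sketch under that assumption. The function name normalize_quotes is hypothetical, and it has not been verified against every sed flavor.

```shell
# normalize_quotes is a hypothetical helper: it reads stdin and writes stdout,
# replacing U+201C and U+201D with ASCII double quotes.
normalize_quotes() {
  lq=$(printf '\342\200\234')   # bytes e2 80 9c via POSIX octal escapes
  rq=$(printf '\342\200\235')   # bytes e2 80 9d
  # sed only sees literal bytes here, so no \x escape support is required.
  sed "s/${lq}/\"/g; s/${rq}/\"/g"
}

normalized=$(printf '\342\200\234 test \342\200\235\n' | normalize_quotes)
printf '%s\n' "$normalized"
```

Because the byte sequences are expanded by the shell before sed runs, the sed script contains only plain substitution commands.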