Are shell scripts sensitive to encoding and line endings?


Question

I am making a NW.js app on Mac, and want to run the app in dev mode by double-clicking on an icon. First step, I'm trying to make my shell script work.

Using VSCode on Windows (I wanted to gain time), I have created a run-nw file at the root of my project, containing this:

#!/bin/bash

cd "src"
npm install

cd ..
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &

But I get this output:

$ sh ./run-nw

: command not found  
: No such file or directory  
: command not found  
: No such file or directory  

Usage: npm <command>

where <command> is one of:  (snip commands list)

(snip npm help)

npm@3.10.3 /usr/local/lib/node_modules/npm  
: command not found  
: No such file or directory  
: command not found

I really don't understand:

  • it seems that it takes empty lines as commands. In my editor (VSCode) I have tried to replace \r\n with \n (in case the \r creates problems) but it changes nothing.
  • it seems that it doesn't find the folders (with or without the dirname instruction), or maybe it doesn't know about the cd command ?
  • it seems that it doesn't understand the install argument to npm
  • the part that really weirds me out, is that it still runs the app (if I did a npm install manually)...

Not able to make it work properly, and suspecting something weird with the file itself, I created a new one directly on the Mac, using vim this time. I entered the exact same instructions, and... now it works without any issue.
A diff on the two files reveals exactly zero difference.

What can be the difference? What can make the first script not work? How can I find out?

Following the accepted answer's recommendations, after the wrong line endings came back, I checked multiple things. It turns out that since I copied my ~/.gitconfig from my Windows machine, I had autocrlf=true, so every time I modified the bash file under Windows, it re-set the line endings to \r\n.
So, in addition to running dos2unix (which you will have to install using Homebrew on a Mac), if you're using Git, check your config.
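
If Git is involved, one way to keep the line endings from coming back is a .gitattributes rule that forces LF for script files regardless of core.autocrlf. The following is only a sketch: the temporary directory stands in for the project root, and the patterns (*.sh and run-nw) are assumptions based on the file name used in this question.

```shell
# Stand-in for the project root; in a real checkout this would be the repo directory.
project=$(mktemp -d)

# Rules asking Git to check these files out with LF endings on every platform.
cat > "$project/.gitattributes" <<'EOF'
# Shell scripts must keep Unix line endings, even on Windows.
*.sh   text eol=lf
run-nw text eol=lf
EOF

# Count the rules that were written, as a quick sanity check.
rules=$(grep -c 'eol=lf' "$project/.gitattributes")
echo "LF rules written: $rules"

# To inspect the setting mentioned above (requires git, so left commented out):
#   git config --get core.autocrlf
```

The attribute is per repository and travels with the project, so it does not depend on each contributor's ~/.gitconfig.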

Accepted answer

Yes. Bash scripts are sensitive to line-endings, both in the script itself and in data it processes. They should have Unix-style line-endings, i.e., each line is terminated with a Line Feed character (decimal 10, hex 0A in ASCII).

With Windows or DOS-style line endings, each line is terminated with a Carriage Return followed by a Line Feed character. You can see this otherwise invisible character in the output of cat -v yourfile:

$ cat -v yourfile
#!/bin/bash^M
^M
cd "src"^M
npm install^M
^M
cd ..^M
./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M

In this case, the carriage return (^M in caret notation or \r in C escape notation) is not treated as whitespace. Bash interprets the first line after the shebang (consisting of a single carriage return character) as the name of a command/program to run.

  • Since there is no command named ^M, it prints : command not found
  • Since there is no directory named "src"^M (or src^M), it prints : No such file or directory
  • It passes install^M instead of install as an argument to npm which causes npm to complain.
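
The behaviour described above can be reproduced without a Windows editor. This is a minimal sketch that writes a small script with CRLF endings using printf and runs it with bash; the file name crlf.sh and its contents are made up for illustration, and the exact wording of the error message differs between Bash versions.

```shell
# Scratch directory for the hypothetical test script.
tmpdir=$(mktemp -d)

# Write a three-line script whose lines all end in CR LF.
printf '#!/bin/bash\r\n\r\necho hi\r\n' > "$tmpdir/crlf.sh"

# Run it and capture stdout and stderr together.
output=$(bash "$tmpdir/crlf.sh" 2>&1) || true

# cat -v makes the carriage returns visible as ^M.
printf '%s\n' "$output" | cat -v

rm -rf "$tmpdir"
```

The line that contains only a carriage return produces a "command not found" error, and the echo output itself carries a trailing ^M. Executing such a file directly (rather than through bash) typically fails earlier, because the carriage return becomes part of the interpreter path on the shebang line.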

Like above, if you have an input file with carriage returns:

hello^M
world^M

then it will look completely normal in editors and when writing it to screen, but tools may produce strange results. For example, grep will fail to find lines that are obviously there:

$ grep 'hello$' file.txt || grep -x "hello" file.txt
(no match because the line actually ends in ^M)

Appended text will instead overwrite the line because the carriage return moves the cursor to the start of the line:

$ sed -e 's/$/!/' file.txt
!ello
!orld

String comparison will seem to fail, even though strings appear to be the same when writing to screen:

$ a="hello"; read b < file.txt
$ if [[ "$a" = "$b" ]]
  then echo "Variables are equal."
  else echo "Sorry, $a is not equal to $b"
  fi

Sorry, hello is not equal to hello
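
When the input file cannot be converted beforehand, a script can defend itself by stripping a trailing carriage return from each value it reads. The sketch below assumes a sample file created on the spot; the variable names cr, before and after are illustrative only.

```shell
# Hypothetical sample data with DOS line endings.
sample=$(mktemp)
printf 'hello\r\nworld\r\n' > "$sample"

cr=$(printf '\r')        # a literal carriage return character
a="hello"
read -r b < "$sample"    # b now holds "hello" followed by a CR

# Compare before cleaning up.
if [ "$a" = "$b" ]; then before="equal"; else before="different"; fi

# Remove one trailing CR, if there is one, then compare again.
b=${b%"$cr"}
if [ "$a" = "$b" ]; then after="equal"; else after="different"; fi

echo "before: $before, after: $after"
rm -f "$sample"
```

The suffix removal is a no-op for values that have no trailing carriage return, so it is safe to apply to files with Unix line endings as well.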

Solutions

The solution is to convert the file to use Unix-style line endings. There are a number of ways this can be accomplished:

  • This can be done using the dos2unix program:

    dos2unix filename

  • Open the file in a capable text editor (Sublime, Notepad++, not Notepad) and configure it to save files with Unix line endings. With Vim, for example, run the following command before (re)saving:

    :set fileformat=unix
    

  • If you have a version of the sed utility that supports the -i or --in-place option, e.g., GNU sed, you could run the following command to strip trailing carriage returns:

    sed -i 's/\r$//' filename
    

    With other versions of sed, you could use output redirection to write to a new file. Be sure to use a different filename for the redirection target (it can be renamed later).

    sed 's/\r$//' filename > filename.unix
    

  • Similarly, the tr translation filter can be used to delete unwanted characters from its input:

    tr -d '\r' <filename >filename.unix
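
The redirection-based variants above can be wrapped so that the original file is updated without relying on sed -i. This is a rough sketch: strip_cr is a hypothetical helper name, and the demo file contents are made up. Note that tr -d removes every carriage return in the file, not only those at the end of a line.

```shell
# Hypothetical helper: rewrite the file named by $1 without carriage returns.
strip_cr() {
    tmp=$(mktemp) || return 1
    # Write the cleaned copy to a temp file, then copy the bytes back so the
    # original file keeps its permissions (for example the executable bit).
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demo on a throwaway file with DOS line endings.
demo=$(mktemp)
printf 'cd "src"\r\nnpm install\r\n' > "$demo"
strip_cr "$demo"

# Count the carriage returns that are left in the file.
remaining=$(tr -cd '\r' < "$demo" | wc -c | tr -d ' ')
echo "carriage returns left: $remaining"
```

Writing to a temporary file first matters because redirecting a command's output onto its own input file would truncate the file before it is read.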
    

    Cygwin Bash

    With the Bash port for Cygwin, there’s a custom igncr option that can be set to ignore the Carriage Return in line endings (presumably because many of its users use native Windows programs to edit their text files). This can be enabled for the current shell by running set -o igncr.

    Setting this option applies only to the current shell process so it can be useful when sourcing files with extraneous carriage returns. If you regularly encounter shell scripts with DOS line endings and want this option to be set permanently, you could set an environment variable called SHELLOPTS (all capital letters) to include igncr. This environment variable is used by Bash to set shell options when it starts (before reading any startup files).

    The file utility is useful for quickly seeing which line endings are used in a text file. Here’s what it prints for each file type:

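    • Unix line endings: Bourne-Again shell script, ASCII text executable
    • Mac line endings: Bourne-Again shell script, ASCII text executable, with CR line terminators
    • DOS line endings: Bourne-Again shell script, ASCII text executable, with CRLF line terminators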
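
On systems where the file utility is not installed, a similar check can be approximated with grep. The sketch below counts the lines that end in a carriage return; count_cr_lines is a hypothetical helper name and the two sample files are created only for the demonstration.

```shell
cr=$(printf '\r')   # a literal carriage return character

# Hypothetical helper: print how many lines of the file named by $1 end in CR.
count_cr_lines() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c "${cr}\$" "$1" || true
}

# Two throwaway files: one with Unix endings, one with DOS endings.
unix_file=$(mktemp)
dos_file=$(mktemp)
printf 'a\nb\n' > "$unix_file"
printf 'a\r\nb\r\n' > "$dos_file"

unix_count=$(count_cr_lines "$unix_file")
dos_count=$(count_cr_lines "$dos_file")
echo "unix: $unix_count, dos: $dos_count"

rm -f "$unix_file" "$dos_file"
```

A non-zero count suggests DOS line endings. A file with classic Mac (CR-only) endings contains no line feeds at all, so this check treats it as a single line and is not a substitute for file in that case.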

    The GNU version of the cat utility has a -v, --show-nonprinting option that displays non-printing characters.

    The dos2unix utility is specifically written for converting text files between Unix, Mac and DOS line endings.

    Wikipedia has an excellent article covering the many different ways of marking the end of a line of text, the history of such encodings and how newlines are treated in different operating systems, programming languages and Internet protocols (e.g., FTP).

    With Classic Mac OS (pre-OS X), each line was terminated with a Carriage Return (decimal 13, hex 0D in ASCII). If a script file was saved with such line endings, Bash would only see one long line like so:

    #!/bin/bash^M^Mcd "src"^Mnpm install^M^Mcd ..^M./tools/nwjs-sdk-v0.17.3-osx-x64/nwjs.app/Contents/MacOS/nwjs "src" &^M
    

    Since this single long line begins with an octothorpe (#), Bash treats the line (and the whole file) as a single comment.

    Note: In 2001, Apple launched Mac OS X which was based on the BSD-derived NeXTSTEP operating system. As a result, OS X also uses Unix-style LF-only line endings and since then, text files terminated with a CR have become extremely rare. Nevertheless, I think it’s worthwhile to show how Bash would attempt to interpret such files.
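
For the rare CR-only file, deleting the carriage returns would join everything into one line, so they need to be translated into line feeds instead. A small sketch using tr, with a made-up script body and a hypothetical output name ending in .unix:

```shell
# Throwaway file with classic Mac OS (CR-only) line endings.
old=$(mktemp)
printf '#!/bin/bash\recho one\recho two\r' > "$old"

# wc -l counts line feeds, so a CR-only file reports zero lines.
lines_before=$(wc -l < "$old" | tr -d ' ')

# Translate every carriage return into a line feed.
tr '\r' '\n' < "$old" > "$old.unix"
lines_after=$(wc -l < "$old.unix" | tr -d ' ')

echo "lines before: $lines_before, after: $lines_after"
rm -f "$old"
```

This translation is only appropriate for CR-only input. Applied to a file with CRLF endings it would turn each line ending into two line feeds and introduce blank lines.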
