unable to solve "syntax error near unexpected token `fi'" - hidden control characters (CR) / Unicode whitespace
I am new to bash scripting and I'm just trying out new things and getting to grips with it.
Basically, I am writing a small script to store the content of a file in a variable and then use that variable in an if statement.
Step by step, I have figured out how to store variables and then how to store the content of files as variables. I am now working on if statements.
My test if statement is very, VERY basic. I just wanted to grasp the syntax before moving on to more complicated if statements for my program.
My if statement is:
if [ "test" = "test" ]
then
echo "This is the same"
fi
Simple, right? However, when I run the script I get the error:
syntax error near unexpected token `fi'
I have tried a number of things from this site as well as others, but I am still getting this error and I am unsure what is wrong. Could it be an issue on my computer stopping the script from running?
Edit for full code. Note that I also deleted all the commented-out code and just used the if statement, and I still get the same error.
#!/bin/bash
#this stores a simple variable with the content of the file testy1.txt
#DATA=$(<testy1.txt)
#This echos out the stored variable
#echo $DATA
#simple if statement
if [ "test" = "test" ]
then
echo "has value"
fi
To complement Jens's helpful answer, which explains the symptoms well and offers a utility-based solution (dos2unix): sometimes installing a third-party utility is undesired, so here's a solution based on the standard utility tr:
tr -d '\r' < script > script.tmp && mv script.tmp script
This removes all \r (CR) characters from the input, saves the output to a temporary file, and then replaces the original file.
- While this blindly removes \r instances even if they're not part of \r\n (CRLF) pairs, it's usually safe to assume that \r instances indeed only occur as part of such pairs.
- Solutions with other standard utilities (awk, sed) are possible too - see this answer of mine. If your sed implementation offers the -i option for in-place updating, it may be the simpler choice.
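If removing every CR feels too blunt, a sed-based variant can strip a CR only where it ends a line. This is a minimal sketch, not the linked answer's exact command; the function name strip_trailing_cr is made up for illustration, and the CR is produced with printf so the pattern does not depend on sed understanding the \r escape:

```shell
#!/bin/bash
# strip_trailing_cr: hypothetical helper name, not part of any standard tool.
# Reads stdin, writes stdout, and removes a CR only if it is the last
# character of a line (i.e. it directly precedes the LF).
strip_trailing_cr() {
  local cr
  cr=$(printf '\r')   # a literal CR; $(...) only strips trailing newlines
  sed "s/${cr}\$//"   # the pattern is <CR>$ : a CR at the end of the line
}

# Usage, mirroring the tr command (file name 'script' as in the answer):
#   strip_trailing_cr < script > script.tmp && mv script.tmp script
```

A CR in the middle of a line is left untouched by this variant, whereas tr -d '\r' would remove it as well.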
To diagnose the problem I suggest using cat -v script, as its output is easy to parse visually: if you see ^M (which represents \r) at the end of the output lines, you know you're dealing with a file with Windows line endings.
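The cat -v check can be tried on a throwaway file first. The sketch below builds a two-line sample with CRLF endings and displays it; has_crlf is a hypothetical helper name (an assumption, not a standard command) for the case where a yes/no answer is more convenient than reading the output:

```shell
#!/bin/bash
# has_crlf: hypothetical helper; succeeds if any line of the given file ends in CR.
has_crlf() {
  local cr
  cr=$(printf '\r')
  grep -q "${cr}\$" "$1"
}

sample=$(mktemp)
printf 'then\r\nfi\r\n' > "$sample"   # two CRLF-terminated lines
cat -v "$sample"                      # prints then^M and fi^M, one per line
has_crlf "$sample" && echo "CRLF line endings found"
rm -f "$sample"
```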
Why Your Script Failed So Obscurely
Normally, a shell script that mistakenly has Windows-style CRLF line endings, \r\n (rather than the required Unix-style LF-only endings, \n), and starts with the shebang line #!/bin/bash fails in a manner that does indicate the cause of the problem:
/bin/bash^M: bad interpreter
as a quick SO search can attest. The ^M indicates that the CR was considered part of the interpreter path, which obviously fails.
(If, by contrast, the script's shebang line is env-based, such as #!/usr/bin/env bash, the error message differs, but still points to the cause: env: bash\r: No such file or directory)
The reason you did not see this problem is that you're running in the Windows Unix-emulation environment Cygwin, which - unlike on Unix - allows a shebang line to end in CRLF (presumably to also support invoking other interpreters on Windows that do expect CRLF endings).
The CRLF problem therefore didn't surface until later in your script, and the fact that you had no empty lines after the shebang line further obfuscated the problem:
- An empty CRLF-terminated line would cause Bash (4.x) to complain as follows: bash: line <n>: $'\r': command not found, because Bash tries to execute the CR as a command (since it doesn't recognize it as part of the line ending).
- The comment lines directly following the shebang line are unproblematic, because a comment line ending in CR is still syntactically valid.
- Finally, the if statement broke the command, in an obscure manner:
  - If your file were to end with a line break, as is usually the case, you would have gotten syntax error: unexpected end of file. The line-ending then and fi tokens are seen as then\r and fi\r (i.e., with the CR appended) by Bash, and are therefore not recognized as keywords. Bash therefore never sees the end of the if compound command, and complains about encountering the end of the file before seeing the if statement completed.
  - Since your file did not, you got syntax error near unexpected token `fi'. The final fi, due to not being followed by a CR, is recognized as a keyword by Bash, whereas the preceding then wasn't (as explained). In this case, Bash therefore saw keyword fi before ever seeing then, and complained about this out-of-place occurrence of fi.
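Both failure modes can be reproduced with a temporary file. This is a sketch under a few assumptions: it runs the file through bash directly (so the shebang handling discussed above plays no part), it forces LC_ALL=C so the messages are in English, and the exact message wording may differ between Bash versions. The variable names are made up for illustration:

```shell
#!/bin/bash
demo=$(mktemp)

# Case 1: every line ends in CRLF, including the line with the final fi.
printf 'if [ "test" = "test" ]\r\nthen\r\necho "has value"\r\nfi\r\n' > "$demo"
err_eof=$(LC_ALL=C bash "$demo" 2>&1 || true)
echo "with trailing CRLF:    $err_eof"

# Case 2: same content, but nothing follows the final fi.
printf 'if [ "test" = "test" ]\r\nthen\r\necho "has value"\r\nfi' > "$demo"
err_fi=$(LC_ALL=C bash "$demo" 2>&1 || true)
echo "without trailing CRLF: $err_fi"

# Fix: strip the CRs as shown earlier in the answer, then run again.
tr -d '\r' < "$demo" > "$demo.tmp" && mv "$demo.tmp" "$demo"
fixed_out=$(bash "$demo")
echo "after tr -d:           $fixed_out"

rm -f "$demo"
```

The first run should report an unexpected end of file, the second an unexpected token `fi', and the third should print has value.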
Optional Background Information
Shell scripts that look OK but break due to characters that are either invisible or only look the same as the required characters are a common problem that usually has one of the following causes:
- Problem A: The file has Windows-style CRLF (\r\n) line endings rather than Unix-style LF-only (\n) line endings - which is the case here.
  - Copying a file from a Windows machine or using an editor that saves files with CRLF sequences are among the possible causes.
- Problem B: The file has non-ASCII Unicode whitespace and punctuation that looks like regular whitespace, but is technically distinct.
  - A common cause is source code copied from websites that use non-ASCII whitespace and punctuation for formatting code for display purposes; an example is use of the no-break space Unicode character (U+00A0; UTF-8 encoding 0xc2 0xa0), which is visually indistinguishable from a normal (ASCII) space (U+0020).
Diagnosing the Problem
The following cat command visualizes:
- all normally invisible ASCII control characters, such as \r as ^M.
- all non-ASCII characters (assuming the now prevalent UTF-8 encoding), such as the no-break space Unicode char. as M-BM-.
^M is an example of caret notation, which is not obvious, especially with multi-byte characters, but, beyond ^M, it's usually not necessary to know exactly what the notation stands for - you just need to note if the ^<letter> sequences are present at all (problem A), or are present in unexpected places (problem B).
The last point is important: non-ASCII characters can be a legitimate part of source code, such as in string literals and comments. They're only a problem if they're used in place of ASCII punctuation.
LC_ALL=C cat -v script
Note: If you're using GNU utilities, the LC_ALL=C prefix is optional.
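To see what problem B looks like in this notation, the sketch below feeds a line containing a no-break space through the same cat invocation. The wrapper name show_hidden is an assumption made for illustration, and the two UTF-8 bytes 0xc2 0xa0 are written as the octal escapes \302\240, which printf understands:

```shell
#!/bin/bash
# show_hidden: hypothetical wrapper around the cat command from the answer;
# reads stdin and makes control and non-ASCII bytes visible.
show_hidden() {
  LC_ALL=C cat -v
}

# The character between "if" and "[" is U+00A0, not an ASCII space.
printf 'if\302\240[ 1 = 1 ]\n' | show_hidden
# prints: ifM-BM- [ 1 = 1 ]
```

The byte 0xc2 shows up as M-B and 0xa0 as M- followed by a space, which is why the no-break space appears as M-BM- with a trailing blank.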
Solutions to Problem A: translating line endings from CRLF to LF-only
- For solutions based on standard or usually-available-by-default utilities (tr, awk, sed, perl), see this answer of mine.
- A more robust and convenient option is the widely used dos2unix utility, if it is already installed (typically, it is not), or installing it is an option.
  How you install it depends on your platform; e.g.:
  - on Ubuntu: sudo apt-get install dos2unix
  - on macOS, with Homebrew installed: brew install dos2unix
dos2unix script would convert the line endings to LF and update file script in place.
Note that dos2unix also offers additional features, such as changing the character encoding of a file.
Solutions to Problem B: translating Unicode punctuation to ASCII punctuation
Note: By punctuation I mean both whitespace and characters such as -
The challenge in this case is that only Unicode punctuation should be targeted, whereas other non-ASCII characters should be left alone; thus, use of character-transcoding utilities such as iconv is not an option.
nws is a utility (that I wrote) that offers a Unicode-punctuation-to-ASCII-punctuation translation mode while leaving non-punctuation Unicode chars. alone; e.g.:
nws -i --ascii script # translate Unicode punct. to ASCII, update file 'script' in place
Installation:
- If you have Node.js installed, install it by simply running [sudo] npm install -g nws-cli, which will place nws in your path.
- Otherwise: See the manual installation instructions.
nws has several other functions focused on whitespace handling, including CRLF-to-LF and vice-versa translations (--lf, --crlf).