Identify and remove specific hidden characters from text file

Views: 359 Published: 2016/8/3 11:46:17 Tags: bash unix sed

Problem description

I have a text file that contains several hidden characters. Using cat -v I am able to see that they include the following:

^M

^[[A

There are also \n characters at the end of the line. I would like to be able to display these as well somehow.

Then I would like to be able to selectively remove these hidden characters with cut and sed. How would I go about accomplishing this?

I've tried dos2unix but that didn't help remove any of the ^M characters. I've also tried sed s/^M//g wherein I pressed ctrl+v m.

Raw data

Output from cat -v on the raw data, also available at: http://pastebin.com/Vk2i81JC
^MCopying non-tried blocks... Pass 1 (forwards)^M^[[A^[[A^[[Arescued:         0 B,  errsize:       0 B,  current rate:        0 B/s
   ipos:         0 B,   errors:       0,    average rate:        0 B/s
   opos:         0 B, run time:       1 s,  successful read:       1 s ago
^MFinished

Output wanted

Also available at: http://pastebin.com/wfDnrELm
rescued:         0 B,  errsize:       0 B,  current rate:        0 B/s
   ipos:         0 B,   errors:       0,    average rate:        0 B/s
   opos:         0 B, run time:       1 s,  successful read:       1 s ago
Finished

Solution

Try the tr command below, which translates or deletes characters. It removes every character other than those specified in octal within the quotes:

Octal \12 is newline (\n), octal \11 is TAB (^I), and octal \40-\176 are the printable characters to keep.
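
To see which of these octal codes a file actually contains, and to make the newline at the end of each line visible, the bytes can be dumped with od. This is a minimal sketch and not part of the original answer; the function name show_bytes and the sample string are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: dump stdin byte by byte. od -c prints printable
# characters as-is, common control characters as escapes (\r, \n, \t),
# and other bytes, such as the escape byte, as three-digit octal (033).
show_bytes() {
    od -An -c
}

# Sample modeled on the raw data: CR, text, ESC [ A, newline.
printf '\rFinished\033[A\n' | show_bytes
```

For a real file, the same helper would be run as show_bytes < org.txt. The carriage return appears as \r, the escape byte as 033, and each line end as \n.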

For a complete reference of octal values refer to this page: https://courses.engr.illinois.edu/ece390/books/labmanual/ascii-code-table.html
tr -cd '\11\12\40-\176' < org.txt > new.txt
The file new.txt will contain the text with all other characters removed.
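
As a minimal sketch of what this filter keeps and drops, the snippet below wraps the same tr invocation in a helper and feeds it a line modeled on the raw data. The function name strip_nonprintable and the sample string are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: keep only TAB (\11), newline (\12) and printable
# ASCII (\40-\176); every other byte read from stdin is deleted.
strip_nonprintable() {
    tr -cd '\11\12\40-\176'
}

# Sample modeled on the raw data: CR, text, ESC [ A, text, TAB, newline.
printf '\rFinished\033[Arescued:\t0 B\n' | strip_nonprintable
```

The carriage return and the escape byte are deleted, while the tab and newline are kept. The `[A` that followed the escape byte is kept too, because `[` and `A` are ordinary printable characters, so tr alone does not fully remove the ^[[A sequences.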

To remove the text between the ^M characters on a line, along with the unnecessary control characters, use the command below:
sed "s/\r.*\r//g" org.txt | tr -cd '\11\12\40-\176' > new.txt
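
One caveat with this pipeline is that tr deletes the escape byte but leaves the `[A` that follows it, since both are printable. A possible extension, sketched below, removes each whole escape sequence before the tr filter runs. The function name clean_log, the variable names, and the pattern used for the escape sequences are assumptions for illustration. The CR and ESC bytes are generated with printf so the sed expressions do not depend on the `\r` escape, which not every sed supports.

```shell
#!/bin/sh
# Hypothetical helper: clean a log read from stdin.
clean_log() {
    cr=$(printf '\r')      # literal carriage return (shown as ^M by cat -v)
    esc=$(printf '\033')   # literal escape byte (shown as ^[ by cat -v)

    # 1. Drop everything from the first CR to the last CR on each line.
    # 2. Drop cursor-control sequences such as ESC [ A.
    # 3. Delete any remaining byte that is not TAB, newline or printable ASCII.
    sed -e "s/${cr}.*${cr}//g" \
        -e "s/${esc}\\[[0-9;]*[A-Za-z]//g" |
        tr -cd '\11\12\40-\176'
}

# Sample modeled on the raw data from the question.
printf '\rCopying... Pass 1 (forwards)\r\033[A\033[A\033[Arescued: 0 B\n\rFinished\n' |
    clean_log
```

On the sample input this prints the two lines "rescued: 0 B" and "Finished". For a file it would be run as clean_log < org.txt > new.txt.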