Replace tabs ("\t") in flat file with "Unit Separator" (0x1f) in C#

Problem description

I have been having trouble finding the metacharacter for the 'Unit Separator' to replace the tabs in a flat file.

So far I have this:

File.WriteAllLines(outputFile,
    File.ReadLines(inputFile)
    .Select(t => t.Replace("\t", "\0x1f")));  //this does not work

I have also tried:

File.WriteAllLines(outputFile,
    File.ReadLines(inputFile)
    .Select(t => t.Replace("\t", "\u"))); //also doesn't work

AND

File.WriteAllLines(outputFile,
    File.ReadLines(inputFile)
    .Select(t => t.Replace("\t", 0x1f)));  //also doesn't work

How do I correctly use hex as a parameter? Also, what is the metacharacter for the 'Unit Separator'?

Solution

The metacharacter for the unit separator is

U+001f

You should be able to use it like this:

File.WriteAllLines(outputFile,
    File.ReadLines(inputFile)
    .Select(t => t.Replace("\t", "\u001f")));
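As a rough sketch of why the attempts in the question fail, and of the equivalent ways to write the same character: `"\0x1f"` is the escape `\0` (a null character) followed by the literal text `x1f`, `"\u"` does not compile because `\u` needs exactly four hex digits, and `string.Replace` has no overload that takes a string and an `int`. The class name `UnitSeparatorDemo` and the helper `ReplaceTabs` are illustrative names, not part of the original answer.

```csharp
using System;

public static class UnitSeparatorDemo
{
    // U+001F (decimal 31) is the ASCII "Unit Separator" control character.
    public const char UnitSeparator = '\u001f';

    // Hypothetical helper: replaces every tab in a line with U+001F.
    public static string ReplaceTabs(string line) =>
        line.Replace('\t', UnitSeparator);

    public static void Main()
    {
        // Three spellings of the same character.
        char fromUnicodeEscape = '\u001f';  // exactly four hex digits
        char fromHexEscape = '\x1f';        // \x takes 1 to 4 hex digits, so it is risky inside longer strings
        char fromCast = (char)0x1f;         // cast an integer literal to char
        Console.WriteLine(fromUnicodeEscape == fromHexEscape && fromHexEscape == fromCast); // prints True

        // "\0x1f" is NOT the unit separator: it is '\0' followed by the text "x1f".
        Console.WriteLine("\0x1f".Length); // prints 4

        // After the replacement, the line splits into three fields on U+001F.
        string converted = ReplaceTabs("id\tname\tcity");
        Console.WriteLine(converted.Split(UnitSeparator).Length); // prints 3
    }
}
```

The `char` overload of `Replace` is used here only to keep the sketch short; the string overload shown in the answer behaves the same way for a single character.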

EDIT: Since a discussion about control characters started to happen, I'll add this definition for posterity's sake.

A special, non-printing character that begins, modifies, or ends a function, event, operation or control operation. The ASCII character set defines 32 control characters. Originally, these codes were designed to control teletype machines. Now, however, they are often used to control display monitors, printers, and other modern devices.

from here.
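To make the definition concrete, here is a small sketch that asks .NET which characters it classifies as control characters. It assumes the "32 control characters" in the quote are the code points 0x00 through 0x1F; the class and method names are illustrative.

```csharp
using System;
using System.Linq;

public static class ControlCharDemo
{
    // Counts how many code points in 0x00..0x1F .NET classifies as control characters.
    public static int CountC0Controls() =>
        Enumerable.Range(0x00, 0x20).Count(code => char.IsControl((char)code));

    public static void Main()
    {
        Console.WriteLine(CountC0Controls());        // prints 32
        Console.WriteLine(char.IsControl('\u001f')); // prints True (unit separator)
        Console.WriteLine(char.IsControl('\t'));     // prints True (tab is a control character too)
        Console.WriteLine(char.IsControl('A'));      // prints False
    }
}
```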

Also, here is a description of the unit separator:

The smallest data items to be stored in a database are called units in the ASCII definition. We would call them fields now. The unit separator separates these fields in a serial data storage environment. Most current database implementations require that fields of most types have a fixed length. Enough space in the record is allocated to store the largest possible member of each field, even if this is not necessary in most cases. This costs a large amount of space in many situations. The US control code allows all fields to have a variable length. If data storage space is limited, as in the sixties, this is a good way to preserve valuable space. On the other hand, serial storage is far less efficient than the table-driven RAM and disk implementations of modern times. I can't imagine a situation where modern SQL databases are run with the data stored on paper tape or magnetic reels...

from here.
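The variable-length field idea from the description above can be sketched as follows. This is only an illustration under the assumption that field values never contain U+001F themselves; `UnitSeparatedRecord`, `Encode`, and `Decode` are hypothetical names.

```csharp
using System;

public static class UnitSeparatedRecord
{
    private const char UnitSeparator = '\u001f';

    // Joins variable-length fields into one record, separated by U+001F.
    public static string Encode(params string[] fields) =>
        string.Join(UnitSeparator.ToString(), fields);

    // Splits a record back into its fields; empty fields are preserved.
    public static string[] Decode(string record) =>
        record.Split(UnitSeparator);

    public static void Main()
    {
        string record = Encode("42", "Ada Lovelace", "London");
        string[] fields = Decode(record);
        Console.WriteLine(fields.Length); // prints 3
        Console.WriteLine(fields[1]);     // prints Ada Lovelace
    }
}
```

Because the separator is a non-printing character that rarely occurs in ordinary text, fields can hold tabs or commas without any quoting, which is the usual motivation for converting a tab-delimited file this way.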
