How to split a string using a fixed-length, variable-text delimiter


Question

I'm working on a data file and can't find any common delimiters in the
file to indicate the end of one row of data and the start of the next.
Rows are not on individual lines but run across multiple lines.

It would appear, though, that every distinct set of data starts with a
'code' that is always 25 characters long. The text of the code is
variable, however.

Assuming I've read the contents of the file into the string myfile, how
do I split my file into an array, using this variable-text, fixed
25-character-long delimiter?

Thank you!

Gary-

Recommended answer

Hello,

>Assuming I've read the contents of the file into the string myfile, how
>do I split my file into an array, using this variable-text, fixed
>25-character-long delimiter?



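You should be able to use Regex.Split(...), together with a good regular
expression, of course. I could help you write the regular expression,
but I'd need to know more about the delimiter strings.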

Oliver Sturm

-
http://www.sturmnet.org/blog


How do *you* know it's a delimiter and not data?

In other words, if *I* were to look at the file, knowing nothing about it,
how could I tell what was a delimiter and what was data? How would you
explain to me what to look for?

When you can answer that, you can start thinking about how to pass that
information to a machine.

HTH
Peter

Thank you for your replies. OK, I have had another look and I think the
task has just got harder. The length isn't always 25 characters. But I
have found a pattern; hopefully this will help.

I am using this 'code' as a delimiter because it always precedes the
name of an item, and this file is essentially a database of items.
Following the name of an item, a number of characteristics specific
to that item are listed. Eventually the item's characteristics are
completely listed and the next 'code' is encountered, which precedes the
next item in the database.

There do seem to be some identifiable traits of this code.

- It appears to always be at least 20 characters long.
- The code is continuous; there are no spaces present.
- It is always composed of letters A-Z or digits 0-9.
- The first two characters of this code are always letters from A-Z.
- These two letters are repeated at least two other times within the
code.

e.g.

DODE86DODE86SZDO010144

So I guess what I am trying to do now is split the string every time a
string is encountered that is at least 20 characters long, is
alphanumeric, and has its first two letters repeated within itself at
least two other times.

I think this is going to be tough?

Any ideas?

Thank you-
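
The traits listed above can be turned into a regular expression fairly directly. What follows is only a rough sketch, written in Python for illustration rather than with the Regex.Split call suggested earlier in the thread; the lookahead and backreference it relies on are standard regex features, so the same pattern idea should carry over. The names CODE_RE and split_records, the second code and the item text in the sample are all made up for this example. It also assumes the codes are upper case and that no ordinary item data happens to look like a code, which would need checking against the real file.

```python
import re

# Pattern built from the traits described in the post (assumptions noted above):
CODE_RE = re.compile(
    r"(?<![A-Z0-9])"         # not in the middle of a longer alphanumeric run
    r"(?=([A-Z]{2})"         # the first two characters are letters (group 1)
    r"(?:[A-Z0-9]*?\1){2})"  # and that pair occurs at least two more times
    r"[A-Z0-9]{20,}"         # 20 or more letters/digits with no spaces
    r"(?![A-Z0-9])"          # the run ends here
)


def split_records(text):
    """Return a list of (code, body) pairs, one per code found in text.

    Anything that appears before the first code is ignored.
    """
    matches = list(CODE_RE.finditer(text))
    records = []
    for i, m in enumerate(matches):
        # A record's body runs from the end of its code to the next code.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        records.append((m.group(0), text[m.end():end].strip()))
    return records


# Hypothetical sample: only the first code comes from the post.
sample = (
    "DODE86DODE86SZDO010144 Widget\ncolour red\nweight 2kg\n"
    "ABCD12ABCD12XYAB990001 Gadget\ncolour blue\n"
)

for code, body in split_records(sample):
    print(code, "->", repr(body))
```

Finding the codes with finditer and slicing between them keeps each code together with its item, which a plain split would not do on its own. If real data produces false matches, tightening the pattern (for example, a fixed position for the repeated pair) would be the next thing to try.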