Split ByteString on a ByteString (instead of a Word8 or Char)


Problem description


I know that Haskell's Data.ByteString.Lazy already provides a function to split CSV data on a single character:

split :: Word8 -> ByteString -> [ByteString]

But I want to split on a multi-character ByteString (like splitting on a String instead of a Char):

split :: ByteString -> ByteString -> [ByteString]

I have multi-character separators in a csv-like text file that I need to parse, and the individual characters themselves appear in some of the fields, so choosing just one separator character and discarding the others would contaminate the data import.

I've had some ideas on how to do this, but they seem kind of hacky (e.g. take three Word8s, test if they're the separator combination, start a new field if they are, recurse further), and I imagine I would be reinventing a wheel anyway. Is there a way to do this without rebuilding the function from scratch?

Solution

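The documentation for breakSubstring in the strict Data.ByteString module (hackage.haskell.org/packages/archive/bytestring/0.9.1.4/doc/html/Data-ByteString.html#v%3AbreakSubstring) contains a function that does what you are asking for: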

tokenise x y = h : if null t then [] else tokenise x (drop (length x) t)
    where (h,t) = breakSubstring x y
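
The snippet above relies on null, drop, length and breakSubstring all coming from Data.ByteString, so it will not compile against the Prelude versions. A self-contained sketch of the same idea, with qualified imports and a type signature, might look like this. The name splitOn and the guard for an empty separator are my additions and are not part of the documentation's example; the guard is there because an empty separator would otherwise never consume any input.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B

-- | Split a strict ByteString on every occurrence of a multi-byte separator.
-- Same recursion as the documentation's tokenise; only the name, the
-- qualified imports and the empty-separator guard are additions.
splitOn :: B.ByteString -> B.ByteString -> [B.ByteString]
splitOn sep str
  | B.null sep = [str]          -- assumption: treat an empty separator as "no split"
  | otherwise  = go str
  where
    go s =
      let (h, t) = B.breakSubstring sep s   -- h: text before sep, t: sep onwards
      in if B.null t
           then [h]                          -- no further separator found
           else h : go (B.drop (B.length sep) t)
```

Loaded in GHCi, splitOn "||" "a||b|c||d" evaluates to ["a","b|c","d"], so a lone | inside a field is left untouched. Note that this works on strict ByteStrings; for Data.ByteString.Lazy input, one option (assuming a bytestring version that exports it) is to convert with Data.ByteString.Lazy.toStrict first, at the cost of loading the whole input into memory.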
