hSeek and SeekFromEnd in Haskell


Problem description



I'm looking to retrieve just the last line of a file quickly in Haskell---starting from the end, not the beginning---and having some difficulties using hSeek correctly.

It seems the SeekFromEnd N behaves differently than finding the length of the file sz, and using AbsoluteSeek to go (sz - N) bytes.

outh <- openFile "test.csv" ReadMode

λ> hIsSeekable outh
True

λ> hFileSize outh
81619956
λ> hSeek outh AbsoluteSeek 1000
λ> hTell outh
1000

λ> hSeek outh SeekFromEnd 1000
λ> hTell outh
81620956

λ> hSeek outh AbsoluteSeek 0
λ> hGetLine outh
"here's my data"

λ> hSeek outh SeekFromEnd 10000
λ> hGetLine outh
*** Exception: test.csv: hGetLine: end of file

Hm, that's weird.

So, I made a function that does this with absolute instead:

λ> hSeek outh SeekFromEnd 100000
λ> hTell outh
81719956

fromEnd outh = do
  sz <- hFileSize outh
  hSeek outh AbsoluteSeek (sz - 100000)

λ> fromEnd outh

λ> hTell outh
81519956

So output-wise, they have different answers which is weird. Additionally, I can now also use hGetLine, which SeekFromEnd failed on:

λ> hGetLine outh
"partial output"
λ> hGetLine outh
"full output, lots of fields, partial output"

Not clear to me what's going on here. Why does my fromEnd behave differently than SeekFromEnd in permitting hGetLine?

Part II of the question: what /would/ be the right strategy for starting at the end of the file and seeking backwards to the first newline (the first \n after the EOF newline)?

In this question, I'm looking specifically for an answer using SeekFromEnd.

Solution

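The offset to SeekFromEnd is expected to be negative. A positive offset N places the handle N bytes past the end of the file, which is why hTell reported 81620956 (the file size plus 1000) in the question and why hGetLine then failed with end of file. SeekFromEnd (-N) is the counterpart of AbsoluteSeek (sz - N).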
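To make the sign convention concrete, here is a minimal sketch comparing the two ways of reaching the same position. The helper names (seekBackFromEnd, seekBackAbsolute, positionsAgree) are made up for illustration, and the sketch assumes n is no larger than the file size, since seeking before the start of the file raises an error.

```haskell
import System.IO

-- | Seek to n bytes before the end of the file using SeekFromEnd.
-- Note the negated offset: a positive one would land past EOF.
seekBackFromEnd :: Handle -> Integer -> IO ()
seekBackFromEnd hdl n = hSeek hdl SeekFromEnd (negate n)

-- | The same position, computed by hand from the file size.
seekBackAbsolute :: Handle -> Integer -> IO ()
seekBackAbsolute hdl n = do
  sz <- hFileSize hdl
  hSeek hdl AbsoluteSeek (sz - n)

-- | Check that both approaches leave the handle at the same position.
-- Assumes 0 <= n <= file size.
positionsAgree :: FilePath -> Integer -> IO Bool
positionsAgree path n = withFile path ReadMode $ \hdl -> do
  seekBackFromEnd hdl n
  p1 <- hTell hdl
  seekBackAbsolute hdl n
  p2 <- hTell hdl
  pure (p1 == p2)
```

With the negated offset, hGetLine works after either seek, because the handle sits inside the file rather than beyond its end.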

As for getting the last line of a file, we come across the annoyance that we have to scan each character from the end, one by one, every time resetting the position. That said, we can do it - we just keep moving back until we encounter the first \n character.

import System.IO

-- | Given a file handle, find the last line. There are no guarantees as to the 
-- position of the handle after this call, and it is expected that the given
-- handle is seekable.
hGetLastLine :: Handle -> IO String
hGetLastLine hdl = go "" (negate 1)
  where
  go s i = do
    hSeek hdl SeekFromEnd i
    c <- hGetChar hdl
    if c == '\n'
      then pure s
      else go (c:s) (i-1)

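You may want to shift the starting offset by one here (begin at -2 instead of -1), as most files end in a \n; otherwise the function returns the empty line after that final \n, which is probably not what you want. Note also that a file containing no \n at all makes the loop seek before the start of the file, which raises an error.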
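A possible variant that handles both edge cases might look like the sketch below. It is not part of the original answer: the name hGetLastLineSafe is hypothetical, and it assumes single-byte (for example ASCII) content with \n line endings, because seeking into the middle of a multi-byte character can make hGetChar fail to decode.

```haskell
import System.IO

-- | Last line of a seekable handle. Ignores one trailing '\n' and stops
-- at the start of the file instead of seeking before it.
-- Assumes single-byte characters and '\n' line endings.
hGetLastLineSafe :: Handle -> IO String
hGetLastLineSafe hdl = do
  sz <- hFileSize hdl
  go sz "" 1
  where
    -- i counts bytes back from the end of the file (1 = last byte).
    go sz acc i
      | i > sz = pure acc  -- reached the start of the file
      | otherwise = do
          hSeek hdl SeekFromEnd (negate i)
          c <- hGetChar hdl
          case c of
            '\n' | i == 1    -> go sz acc (i + 1)  -- skip the final newline
                 | otherwise -> pure acc           -- start of the last line
            _ -> go sz (c : acc) (i + 1)
```

It can be called as `withFile "test.csv" ReadMode hGetLastLineSafe`. It still performs one seek per character, so for very long last lines it may be worth reading a larger chunk from the end and searching it in memory.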
