How to fix this Conduit code involving the appearance of a list type where I do not expect one?


Problem description

I've been struggling with this Conduit code for a while, and any help would be extremely appreciated. It feels as if the code has been evolving by random mutation while the type checker enforces natural selection. Here is one of the fittest candidates I have so far:

import           Conduit
import qualified Data.Conduit.Combinators       as DCC
import           Data.CSV.Conduit
import           Data.Function                  ((&))
import           Data.List.Split                (splitOn)
import           Data.Map                       as DM
import           Data.Text                      (Text)
import qualified Data.Text                      as Txt
import qualified Data.Text.IO                   as DTIO
import           Data.Vector                    (Vector)
import qualified Data.Vector                    as DV
import           Path
import           System.FilePath.Posix

retrieveSmaXtec :: Path Abs Dir -> IO (Vector (MapRow Text))
retrieveSmaXtec sxDir = do
  files <- sourceDirectoryDeep False (fromAbsDir sxDir) & return
  fileVector <- return $ runConduit $ files .| sinkVector
  csvRowsByFile <- runConduit ((yieldM fileVector) .| DCC.mapM processCSV .| sinkVector)
  fNameRows <- readFnameData $ yieldM fileVector
  (pairFill fNameRows csvRowsByFile)
    & fmap (uncurry DM.union)
    & return
  where
    fileList :: Path Abs Dir -> IO (Vector FilePath)
    fileList dir = sourceDirectoryDeep False (fromAbsDir sxDir) .| sinkVector & runConduit

    expandZip :: MapRow Text -> Vector (MapRow Text) -> Vector (MapRow Text, MapRow Text)
    expandZip one many = zip (replicate mlen one) many
      where
        mlen = length many

    pairFill :: Vector (MapRow Text) -> Vector (Vector (MapRow Text)) -> Vector (MapRow Text, MapRow Text)
    pairFill ones manies = join $ fmap (uncurry expandZip) (zip ones manies)

    processCSV :: FilePath -> IO (Vector (MapRow Text))
    processCSV fp = sourceFile fp
      .| intoCSV defCSVSettings
      .| sinkVector
      & runConduitRes
    readFnameData :: (MonadThrow m, MonadResource m, PrimMonad m) => ConduitT () FilePath m () -> m (Vector (MapRow Text))
    readFnameData files = runConduit $ files .| processFileName .| sinkVector

    processFileName :: (MonadResource m, MonadThrow m, PrimMonad m) =>
      ConduitT FilePath (MapRow Text) m ()
    processFileName = mapC go
      where
        go :: FilePath -> MapRow Text
        go fp = takeFileName fp
          & takeWhile (/= '.')
          & splitOn "_"
          & fmap Txt.pack
          & zip colNames
          & DM.fromList
        colNames = [markKey, idKey]

The current point of confusion that occurs in both errors below is that [FilePath] is popping up, when I expect everything to just be FilePath. Now, even if this is fixed, I wouldn't doubt other errors could pop up, so if there's a solution for getting this going that involves a bit of a rework, I'd be happy to try it.

    * Couldn't match type `Char' with `[Char]'
      Expected type: ConduitM
                       [FilePath] Void IO (Vector (Vector (MapRow Text)))
        Actual type: ConduitM
                       FilePath Void IO (Vector (Vector (MapRow Text)))
    * In the second argument of `(.|)', namely
        `DCC.mapM processCSV .| sinkVector'
      In the first argument of `runConduit', namely
        `((yieldM fileVector) .| DCC.mapM processCSV .| sinkVector)'
      In a stmt of a 'do' block:
        csvRowsByFile <- runConduit
                           ((yieldM fileVector) .| DCC.mapM processCSV .| sinkVector)
   |
40 |   csvRowsByFile <- runConduit ((yieldM fileVector) .| DCC.mapM processCSV .| sinkVector)
   |                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    * Couldn't match type `[Char]' with `Char'
      Expected type: ConduitT () FilePath IO ()
        Actual type: ConduitT () [FilePath] IO ()
    * In the second argument of `($)', namely `yieldM fileVector'
      In a stmt of a 'do' block:
        fNameRows <- readFnameData $ yieldM fileVector
      In the expression:
        do files <- sourceDirectoryDeep False (fromAbsDir sxDir) & return
           fileVector <- return $ runConduit $ files .| sinkVector
           csvRowsByFile <- runConduit
                              ((yieldM fileVector) .| DCC.mapM processCSV .| sinkVector)
           fNameRows <- readFnameData $ yieldM fileVector
           ....
   |
41 |   fNameRows <- readFnameData $ yieldM fileVector
   |                                ^^^^^^^^^^^^^^^^^

This question started in an alternative form at "How to merge one-to-one and one-to-many input:output relationships in conduit?", but now I'm just trying to get it to work, somehow, anyhow.

Recommended answer

I came up with a solution after getting some sleep and spending some more time on it. I still don't quite understand why some of the things I tried didn't work, but I'm reasonably happy with the end result (if not with the path I took to get there, but learning is pain, at least sometimes). The major difference is that I decided to re-use the sourceDirectoryDeep conduit (now named files) instead of trying to turn it into a vector directly. I also had to be a little more clever about how I wrote processCSV, which involved one false turn that still confuses me (Why can one sometimes get "No instance for CSV Text Text arising from a use of `intoCSV`" when using csv-conduit?).
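As an aside on the [FilePath] errors from the question, one plausible reading is this: yieldM runs a monadic action and yields its whole result as a single element, so `yieldM fileVector` emits one container of paths rather than one path at a time. Since FilePath is [Char], GHC would then try to unify that container type with [Char], pick the list type for the container, and be left comparing Char with [Char], which matches the messages shown. The sketch below illustrates the difference between yieldM and yieldMany on a plain list; `paths`, `asOneChunk` and `asElements` are hypothetical names used only for illustration.

```haskell
import Conduit

-- Hypothetical stand-in for the directory listing; any list of paths would do.
paths :: [FilePath]
paths = ["a_1.csv", "b_2.csv"]

-- yieldM runs the action and yields its result as ONE element,
-- so downstream receives a single [FilePath].
asOneChunk :: IO [[FilePath]]
asOneChunk = runConduit $ yieldM (pure paths) .| sinkList

-- yieldMany yields each element of the container separately,
-- so downstream receives individual FilePath values.
asElements :: IO [FilePath]
asElements = runConduit $ yieldMany paths .| sinkList

main :: IO ()
main = do
  asOneChunk >>= print
  asElements >>= print
```

If a materialised vector of paths were really needed, running the conduit with `<-` and then feeding the result to yieldMany should stream the elements one by one. Re-using the source conduit, as the code that follows does, avoids the intermediate container altogether.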

retrieveSmaXtec :: Path Abs Dir -> IO (Vector SxRecord)
retrieveSmaXtec sxDir = do
  csvRows <- getCsvRows
  fnameRows <- getFileNameRows
  rows <- return $ pairFill fnameRows csvRows & fmap (uncurry DM.union)
  print rows
  rows & fmap fromRow & catMaybes & return
  where
    getCsvRows :: IO (Vector (Vector (MapRow Text)))
    getCsvRows = files .| processCSV & runConduitRes

    getFileNameRows :: IO (Vector (MapRow Text))
    getFileNameRows = files .| processFileName & runConduitRes

    files :: MonadResource m => ConduitT () FilePath m ()
    files = sourceDirectoryDeep False (fromAbsDir sxDir)

    expandZip :: MapRow Text -> Vector (MapRow Text) -> Vector (MapRow Text, MapRow Text)
    expandZip one many_ = zip (replicate mlen one) many_
      where
        mlen = length many_

    pairFill :: Vector (MapRow Text) -> Vector (Vector (MapRow Text)) -> Vector (MapRow Text, MapRow Text)
    pairFill ones manies = join $ fmap (uncurry expandZip) (zip ones manies)

    processCSV :: (MonadResource m, MonadThrow m, PrimMonad m) =>
      ConduitT FilePath Void m (Vector (Vector (MapRow Text)))
    processCSV = mapMC (readCSVFile defCSVSettings) .| sinkVector

    processFileName :: (MonadResource m, MonadThrow m, PrimMonad m) =>
      ConduitT FilePath Void m (Vector (MapRow Text))
    processFileName = mapC go
      .| sinkVector
      where
        go :: FilePath -> MapRow Text
        go fp = takeFileName fp
          & takeWhile (/= '.')
          & splitOn "_"
          & fmap Txt.pack
          & zip colNames
          & DM.fromList
        colNames = [markKey, idKey]
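To make the pairing step easier to follow, here is a minimal list-based sketch of what expandZip and pairFill do: each per-file row is repeated once for every CSV row of the same file, and the per-file results are then flattened. It uses plain lists and only base instead of Vector and MapRow, so it is an analogue of the code above rather than a drop-in replacement, and the sample values are made up.

```haskell
import Control.Monad (join)

-- Pair one per-file value with every row that came from that file.
expandZip :: a -> [b] -> [(a, b)]
expandZip one many = zip (repeat one) many

-- Walk the per-file values and the per-file row groups in lockstep,
-- then flatten the groups into a single list of pairs.
pairFill :: [a] -> [[b]] -> [(a, b)]
pairFill ones manies = join (zipWith expandZip ones manies)

main :: IO ()
main = print (pairFill ["fileA", "fileB"] [[1, 2], [3 :: Int]])
-- prints [("fileA",1),("fileA",2),("fileB",3)]
```

Because the pairing is purely positional, it presumably relies on both runs of the files conduit listing the directory in the same order.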
