F# CsvTypeProvider: extracting the same columns from slightly different CSV files

Question

I am creating a program that reads football matches from different CSV files. The columns I am interested in are present in all the files, but the files have a varying number of columns.

This left me creating a separate mapping function for each variation of file, with a different sample for each type:

type GamesFile14 = CsvProvider<"./data/sample_14.csv">
type GamesFile15 = CsvProvider<"./data/sample_15.csv">
type GamesFile1617 = CsvProvider<"./data/sample_1617.csv">

let mapRows14 (rows:seq<GamesFile14.Row>) = rows |> Seq.map ( fun c -> { Division = c.Div; Date = DateTime.Parse c.Date; 
        HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }; 
        AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF };
        Odds = { H = float c.B365H; U = float c.B365D;  B = float c.B365A } } ) 


let mapRows15 (rows:seq<GamesFile15.Row>) = rows |> Seq.map ( fun c -> { Division = c.Div; Date = DateTime.Parse c.Date; 
        HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }; 
        AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF };
        Odds = { H = float c.B365H; U = float c.B365D;  B = float c.B365A } } ) 


let mapRows1617 (rows:seq<GamesFile1617.Row>) = rows |> Seq.map ( fun c -> { Division = c.Div; Date = DateTime.Parse c.Date; 
        HomeTeam = { Name = c.HomeTeam; Score = c.FTHG; Shots = c.HS; ShotsOnTarget = c.HST; Corners = c.HC; Fouls = c.HF }; 
        AwayTeam = { Name = c.AwayTeam; Score = c.FTAG; Shots = c.AS; ShotsOnTarget = c.AST; Corners = c.AC; Fouls = c.AF };
        Odds = { H = float c.B365H; U = float c.B365D;  B = float c.B365A } } ) 

These are in turn consumed by the loadGames function:

let loadGames season resource = 
    if season.Year = 14 then GamesFile14.Load(resource).Rows |> mapRows14
    else if season.Year = 15 then GamesFile15.Load(resource).Rows |> mapRows15
    else GamesFile1617.Load(resource).Rows |> mapRows1617

It seems to me that there must be better ways to get around this problem.

Is there any way I could make my mapping function more generic so that I don't have to repeat the same function over and over?

Is it possible to create the CsvProvider on the fly based on the resource, or do I need to explicitly declare a sample for each variation of my csv-files like in the code above?

Any other suggestions?

Recommended answer

In your scenario, you might get better results from FSharp.Data's CsvFile type. It takes a more dynamic approach to CSV parsing and uses the dynamic ? operator for data access. You lose some of the type-safety guarantees that the type provider gives you, since every CSV file is loaded into the same CsvRow type: the compiler can no longer guarantee that a given column is present in a file, so you have to be prepared for runtime errors. In your case, though, that is just what you want, because it would allow your three functions to be rewritten like this:

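open System
open FSharp.Data
open FSharp.Data.CsvExtensions // provides the dynamic ? operator on CsvRow
// c?Column returns a string, so numeric fields are converted explicitly (int is assumed for the record fields).
let mapRows (rows:seq<CsvRow>) = rows |> Seq.map ( fun c -> { Division = c?Div; Date = DateTime.Parse c?Date; 
        HomeTeam = { Name = c?HomeTeam; Score = int c?FTHG; Shots = int c?HS; ShotsOnTarget = int c?HST; Corners = int c?HC; Fouls = int c?HF }; 
        AwayTeam = { Name = c?AwayTeam; Score = int c?FTAG; Shots = int c?AS; ShotsOnTarget = int c?AST; Corners = int c?AC; Fouls = int c?AF };
        Odds = { H = float c?B365H; U = float c?B365D;  B = float c?B365A } } )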
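To illustrate how one function can then serve every file variant, here is a minimal, self-contained script sketch. The Game record, the reduced column set, and the inline CSV text are assumptions made for illustration, since the question does not show its record definitions. In the real program the rows would presumably come from CsvFile.Load(resource) rather than CsvFile.Parse.

```fsharp
#r "nuget: FSharp.Data"

open FSharp.Data
open FSharp.Data.CsvExtensions // brings the dynamic ? operator into scope

// Hypothetical, trimmed-down record; the question's real types are not shown.
type Game =
    { Division: string
      HomeTeam: string
      AwayTeam: string
      HomeGoals: int
      AwayGoals: int }

// One mapping function for every file variant: columns are looked up by
// name at run time, so extra columns in a file are simply ignored.
let mapRows (rows: seq<CsvRow>) =
    rows
    |> Seq.map (fun c ->
        { Division = c?Div
          HomeTeam = c?HomeTeam
          AwayTeam = c?AwayTeam
          HomeGoals = int (c?FTHG) // c?Name yields a string, so convert explicitly
          AwayGoals = int (c?FTAG) })
    |> Seq.toList

// Two made-up samples with a different number of columns.
let csvA = "Div,HomeTeam,AwayTeam,FTHG,FTAG\nE0,Arsenal,Chelsea,2,1"
let csvB = "Div,Date,HomeTeam,AwayTeam,FTHG,FTAG,Referee\nE0,01/01/17,Leeds,Derby,0,3,Smith"

let gamesA = CsvFile.Parse(csvA).Rows |> mapRows
let gamesB = CsvFile.Parse(csvB).Rows |> mapRows
```

With this shape, loadGames no longer needs to branch on the season: every resource can go through the same load-and-map pipeline.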

Give CsvFile a try and see if it solves your problem.
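Because column access is only checked at run time with CsvFile, it can help to verify the header row once, right after loading, instead of failing on the first row that touches a missing column. The sketch below is one possible way to do that. The helper names missingColumns and ensureColumns are hypothetical, and the column list is simply the set of columns used in the mapping code above.

```fsharp
#r "nuget: FSharp.Data"

open FSharp.Data

// Columns the mapping function relies on (taken from the code above).
let requiredColumns =
    [ "Div"; "Date"; "HomeTeam"; "AwayTeam"; "FTHG"; "FTAG"
      "HS"; "AS"; "HST"; "AST"; "HC"; "AC"; "HF"; "AF"
      "B365H"; "B365D"; "B365A" ]

// Returns the required columns that are absent from the file's header row.
let missingColumns (csv: CsvFile) =
    let headers = csv.Headers |> Option.defaultValue [||] |> Set.ofArray
    requiredColumns |> List.filter (fun name -> not (headers.Contains name))

// Hypothetical helper: fail early with a clear message, or pass the file through.
let ensureColumns (csv: CsvFile) =
    match missingColumns csv with
    | [] -> csv
    | missing -> failwithf "CSV file is missing columns: %s" (String.concat ", " missing)

// Made-up sample that only has three of the required columns.
let sample = CsvFile.Parse("Div,Date,HomeTeam\nE0,01/01/17,Leeds")
let missing = missingColumns sample
```

A call such as CsvFile.Load(resource) |> ensureColumns would then raise a single descriptive error for an incompatible file before any rows are mapped.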