How to convert formatted String to Tuple in Scala?

Question

I have a text file with the following content.

//((number,(number,date)),number)
((210,(18,2015/06/28)),57.0)
((92,(60,2015/06/16)),102.89777479000209)
((46,(18,2015/06/17)),52.8940162267246)
((204,(27,2015/06/06)),75.2807019793683)

I want to convert each line to a tuple and need a fast way to do it, since the list of such strings is very large.

EDIT: I would also like to preserve the type and structure information.

Any help would be appreciated.

Recommended answer

I find scala-parser-combinators a nice way to do this kind of thing; it's a lot more self-documenting than splits or regexes:

import scala.util.parsing.combinator.JavaTokenParsers
import org.joda.time.LocalDate

object MyParser extends JavaTokenParsers {
  // Whitespace is significant here: newlines separate the records
  override val skipWhitespace = false
  // Dates in the input are written year/month/day, e.g. 2015/06/28
  def date = (wholeNumber ~ "/" ~ wholeNumber ~ "/" ~ wholeNumber) ^^ {
    case year ~ _ ~ month ~ _ ~ day =>
      new LocalDate(year.toInt, month.toInt, day.toInt)
  }
  def myNumber = decimalNumber ^^ { _.toDouble }
  // date must be tried before myNumber, since both start with digits
  def tupleElement: Parser[Any] = date | myNumber | tuple
  def tuple: Parser[List[Any]] = "(" ~> repsep(tupleElement, ",") <~ ")"
  // One tuple per line; "\n" matches an actual newline character
  def data = repsep(tuple, "\n")
}

Hopefully the way to extend this is obvious. Usage looks something like:

scala> MyParser.parseAll(MyParser.data, """((210,(18,2015/06/28)),57.0)
 | ((92,(60,2015/06/16)),102.89777479000209)
 | ((46,(18,2015/06/17)),52.8940162267246)
 | ((204,(27,2015/06/06)),75.2807019793683)""")
res1: MyParser.ParseResult[List[List[Any]]] = [4.41] parsed: List(List(List(210, List(18, LocalDate(28,6,2015))), 57.0), List(List(92, List(60, LocalDate(16,6,2015))), 102.89777479000209), List(List(46, List(18, LocalDate(17,6,2015))), 52.8940162267246), List(List(204, List(27, LocalDate(6,6,2015))), 75.2807019793683))

The types can't be fully known at compile time (short of doing the parsing at compile time with a macro or some such); the parse result is a List[List[Any]] whose elements are each a LocalDate, a Double, or another List. You could handle it using pattern matching at runtime. A nicer approach could be to use a sealed trait:

sealed trait TupleElement
case class NestedTuple(inner: List[TupleElement]) extends TupleElement
case class NumberElement(value: Double) extends TupleElement
case class DateElement(value: LocalDate) extends TupleElement

// Inside MyParser, each rule then returns a TupleElement:
def myNumber = decimalNumber ^^ { d => NumberElement(d.toDouble) }
def tupleElement: Parser[TupleElement] = ... //etc.

Then, when you have a TupleElement in code and you pattern-match on it, the compiler will warn if you don't cover all the cases.
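
To make the exhaustiveness point concrete, here is a minimal sketch of matching on such a hierarchy. It is not part of the original answer: java.time.LocalDate stands in for Joda's LocalDate so the snippet needs no extra dependency, and the TupleElements object with its render and toRow helpers and the Row alias are hypothetical names introduced for illustration. The elements are built by hand here rather than produced by the parser.

```scala
import java.time.LocalDate

// Same hierarchy as in the answer; java.time.LocalDate replaces Joda's LocalDate (assumption).
sealed trait TupleElement
case class NestedTuple(inner: List[TupleElement]) extends TupleElement
case class NumberElement(value: Double) extends TupleElement
case class DateElement(value: LocalDate) extends TupleElement

object TupleElements {
  // Hypothetical helper: renders an element back to text.
  // All three subtypes are covered and there is no wildcard case, so removing
  // any one of them makes the compiler warn that the match may not be exhaustive.
  def render(e: TupleElement): String = e match {
    case NumberElement(v) => v.toString
    case DateElement(d) =>
      f"${d.getYear}%04d/${d.getMonthValue}%02d/${d.getDayOfMonth}%02d"
    case NestedTuple(xs) => xs.map(render).mkString("(", ",", ")")
  }

  // The typed shape the question asks for: ((number,(number,date)),number).
  type Row = ((Int, (Int, LocalDate)), Double)

  // Hypothetical helper: recovers a real tuple from the generic tree.
  // Returns None when the element does not have the expected structure.
  def toRow(e: TupleElement): Option[Row] = e match {
    case NestedTuple(List(
          NestedTuple(List(
            NumberElement(a),
            NestedTuple(List(NumberElement(b), DateElement(d))))),
          NumberElement(x))) =>
      Some(((a.toInt, (b.toInt, d)), x))
    case _ => None
  }
}
```

The render match is the kind the compiler can check for completeness, while toRow deliberately ends in a wildcard because only one structure is acceptable there. Returning an Option keeps malformed rows from throwing in the middle of a large file.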
