How to use regexp_replace in Spark

Views: 2083 Published: 2020/9/4 1:21:44 Tags: scala apache-spark apache-spark-sql regexp-replace

Question

I am pretty new to Spark and would like to perform an operation on a column of a dataframe so as to replace every comma (,) in the column with a period (.).

Assume there is a dataframe x with a column x4.

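I want the output to be the same column with each comma replaced by a period, for example 1,6566 becoming 1.6566.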

The code I am using is:

import org.apache.spark.sql.Column
def replace = regexp_replace((x.x4,1,6566:String,1.6566:String)x.x4)

But I am getting the following error:

import org.apache.spark.sql.Column
<console>:1: error: ')' expected but '.' found.
       def replace = regexp_replace((train_df.x37,0,160430299:String,0.160430299:String)train_df.x37)

Any help on the syntax, logic, or any other suitable approach would be much appreciated.

Recommended answer

Here's a reproducible example, assuming x4 is a string column.

import org.apache.spark.sql.functions.regexp_replace

val df = spark.createDataFrame(Seq(
  (1, "1,3435"),
  (2, "1,6566"),
  (3, "-0,34435"))).toDF("Id", "x4")

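The syntax is regexp_replace(str, pattern, replacement), where str is a Column and pattern is a Java regular expression. For this case that translates to: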

df.withColumn("x4New", regexp_replace(df("x4"), "\\,", ".")).show
+---+--------+--------+
| Id|      x4|   x4New|
+---+--------+--------+
|  1|  1,3435|  1.3435|
|  2|  1,6566|  1.6566|
|  3|-0,34435|-0.34435|
+---+--------+--------+
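
As a follow-up sketch, assuming the eventual goal is a numeric column (the question does not say so), the replaced string can then be cast to double. A comma is not a regex metacharacter, so the unescaped pattern "," should behave the same as "\\," above. The local session setup and the column name x4Double are illustrative choices, not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

// In spark-shell a session already exists and getOrCreate() reuses it;
// the local master setting here is only an assumption for standalone runs.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("comma-to-dot")
  .getOrCreate()

// Same sample data as in the answer above.
val df = spark.createDataFrame(Seq(
  (1, "1,3435"),
  (2, "1,6566"),
  (3, "-0,34435"))).toDF("Id", "x4")

val converted = df
  // Replace every comma with a period; "," needs no escaping in a Java regex.
  .withColumn("x4New", regexp_replace(col("x4"), ",", "."))
  // Hypothetical extra step: parse the cleaned string as a double.
  .withColumn("x4Double", col("x4New").cast("double"))

converted.printSchema()
converted.show()
```

If the strings might contain anything other than a decimal comma (for example thousands separators), the cast yields null for values it cannot parse, so it is worth checking for nulls in x4Double afterwards.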
