What is the meaning of the partitionColumn, lowerBound, upperBound, and numPartitions parameters?

Published: 2019/9/2 12:07:45 Tags: apache-spark, jdbc, apache-spark-sql

Problem description

While fetching data from SQL Server over a JDBC connection in Spark, I found that I can set some parallelization parameters such as partitionColumn, lowerBound, upperBound, and numPartitions. I have read the Spark documentation but could not understand them.

Can anyone explain the meaning of these parameters to me?

Recommended answer

It is simple:

partitionColumn is the column used to determine the partitions.
lowerBound and upperBound define the range of partitionColumn values that is divided into partitions. The range being split corresponds to the following query:

SELECT * FROM table WHERE partitionColumn BETWEEN lowerBound AND upperBound

Note that Spark uses these bounds only to compute the partition stride, not to filter rows: rows with values outside the range are still read, as part of the first and last partitions.

numPartitions determines the number of partitions to create. The range between lowerBound and upperBound is divided into numPartitions partitions, each with a stride equal to:

upperBound / numPartitions - lowerBound / numPartitions
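
The stride formula above can be checked with a small sketch. It assumes the divisions are integer divisions that truncate toward zero, as Long arithmetic on the JVM would do; the function names are hypothetical, and newer Spark versions may compute the stride differently.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero (JVM-style), unlike Python's //."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def stride(lower_bound: int, upper_bound: int, num_partitions: int) -> int:
    """Stride per the formula: upperBound / numPartitions - lowerBound / numPartitions."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return trunc_div(upper_bound, num_partitions) - trunc_div(lower_bound, num_partitions)


print(stride(0, 1000, 10))  # 100
```

Because each bound is divided separately, the result can differ slightly from (upperBound - lowerBound) / numPartitions when the bounds are not multiples of numPartitions.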

For example, if:

lowerBound: 0
upperBound: 1000
numPartitions: 10

The stride is equal to 100, and the partitions correspond to the following queries. The ranges are half-open so that no row is read twice, and the first and last partitions are open-ended so that NULLs and rows outside the bounds are not lost:

SELECT * FROM table WHERE partitionColumn < 100 OR partitionColumn IS NULL
SELECT * FROM table WHERE partitionColumn >= 100 AND partitionColumn < 200
...
SELECT * FROM table WHERE partitionColumn >= 900
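
The per-partition WHERE clauses can be generated with a short sketch. It is modeled on how Spark's JDBC source is understood to split the range (half-open intervals, with the first and last partitions left open-ended), but it is not Spark's actual code. The function names, the `id` column, and the `orders` table are hypothetical, and non-negative integer bounds are assumed.

```python
from typing import Dict, List


def partition_predicates(column: str, lower_bound: int, upper_bound: int,
                         num_partitions: int) -> List[str]:
    """Build one WHERE predicate per partition (assumes non-negative bounds)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    # Assumption: never create more partitions than there are values in the range.
    num_partitions = min(num_partitions, upper_bound - lower_bound)
    if num_partitions == 1:
        return ["1 = 1"]  # single partition: no range restriction

    step = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        lower = f"{column} >= {current}" if i != 0 else None
        current += step
        upper = f"{column} < {current}" if i != num_partitions - 1 else None
        if lower is None:
            # First partition: also picks up NULLs and values below lower_bound.
            predicates.append(f"{upper} OR {column} IS NULL")
        elif upper is None:
            # Last partition: also picks up values above upper_bound.
            predicates.append(lower)
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates


def jdbc_read_options(url: str, table: str, column: str, lower_bound: int,
                      upper_bound: int, num_partitions: int) -> Dict[str, str]:
    """Collect the four parameters as options for Spark's JDBC data source."""
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
    }


for predicate in partition_predicates("id", 0, 1000, 10):
    print(f"SELECT * FROM orders WHERE {predicate}")

# With pyspark and a JDBC driver available, the options could be passed as:
#   opts = jdbc_read_options("jdbc:sqlserver://...", "orders", "id", 0, 1000, 10)
#   df = spark.read.format("jdbc").options(**opts).load()
```

Since each partition issues its own query, the partition column should ideally be indexed and its values spread fairly evenly across the range; otherwise a few partitions end up doing most of the work.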
