Spark: best approach for a look-up DataFrame to improve performance

Posted: 2016/11/13 14:15:44 | Tags: scala, apache-spark, cassandra, datastax-enterprise

DataFrame A (millions of records) has, among its columns, create_date and modified_date.

DataFrame B (500 records) has the columns start_date and end_date.

Current approach:

select a.*, b.* from a join b on a.create_date between start_date and end_date

The above job takes half an hour or more to run.

How can I improve the performance?
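One direction often tried for a look-up of this shape is to hint that the 500-row DataFrame B should be broadcast, so that the range condition is evaluated against a local copy of B on each executor. The sketch below is only an illustration under stated assumptions, not a measured fix: it assumes Spark 2.x or later (for SparkSession), a hypothetical id column on A, and dates stored as ISO yyyy-MM-dd strings. The names LookupJoinSketch and rangeJoin are made up for the example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.broadcast

object LookupJoinSketch {
  // Range join of the large frame `a` against the small look-up frame `b`.
  // broadcast(b) is only a hint asking Spark to ship `b` to every executor;
  // the optimizer still decides the final join strategy.
  def rangeJoin(a: DataFrame, b: DataFrame): DataFrame =
    a.join(
      broadcast(b),
      a("create_date").between(b("start_date"), b("end_date"))
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lookup-join-sketch")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Tiny stand-ins for A and B. The `id` column is hypothetical.
    // ISO yyyy-MM-dd strings compare in the same order as the dates they encode.
    val a = Seq(
      (1, "2016-01-15", "2016-02-01"),
      (2, "2016-03-10", "2016-03-11"),
      (3, "2016-07-04", "2016-07-05")
    ).toDF("id", "create_date", "modified_date")

    val b = Seq(
      ("2016-01-01", "2016-01-31"),
      ("2016-03-01", "2016-03-31")
    ).toDF("start_date", "end_date")

    val joined = rangeJoin(a, b)
    joined.explain() // inspect which join strategy the planner picked
    joined.show()

    spark.stop()
  }
}
```

Comparing the explain() output with and without the broadcast hint shows whether the physical plan actually changes. If it does not, the bottleneck may lie elsewhere, for example in reading A from the underlying store.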