How to delete a Spark DataFrame using sparklyr?
Question
I have created a Spark dataframe called "iris" using the code below:
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
iris_tbl <- copy_to(sc, iris)
Now I want to delete the Spark dataframe "iris" (not the dataframe in R). How do I do that?
Answer
This strictly depends on what you mean when you say "delete the dataframe". You have to remember that, in general, Spark data frames are not the same type of objects as your plain local data structures. A Spark DataFrame is rather a description of a computation than a data container.
sparklyr itself depends primarily on the Spark SQL interface. When you call copy_to (or any other data import method), it:
- Registers the data as a temporary view.
- With the default arguments (which should be avoided), it eagerly caches the table.
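For context, the eager caching mentioned above can be turned off at import time; a minimal sketch using copy_to's memory argument (this assumes a working local Spark installation):

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Register "iris" as a temporary view without eagerly caching it in memory
iris_tbl <- copy_to(sc, iris, name = "iris", memory = FALSE)
```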
This means that the natural way to "delete" the dataframe is to drop the temporary view, referencing it by its name, either with dplyr / dbplyr:
db_drop_table(sc, "iris")
or Spark's own methods:
sc %>% spark_session() %>% invoke("catalog") %>% invoke("dropTempView", "iris")
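Either way, you can confirm the view is gone by listing the tables registered with the connection; a sketch (output depends on what else has been registered):

```r
# List the temporary views / tables known to this connection;
# "iris" should no longer appear after dropping it
src_tbls(sc)
```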
Please note that this will invalidate the local binding, so any attempt to access iris_tbl after calling either of the methods shown above will fail:
iris_tbl
Error: org.apache.spark.sql.AnalysisException: Table or view not found: iris; line 2 pos 5
...
Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'iris' not found in database 'default'
...