Google Spread Sheet Spark library
Question
I am using the https://github.com/potix2/spark-google-spreadsheets library to read a spreadsheet file in Spark. It works perfectly in my local environment.
val df = sqlContext.read.
format("com.github.potix2.spark.google.spreadsheets").
option("serviceAccountId", "xxxxxx@developer.gserviceaccount.com").
option("credentialPath", "/path/to/credential.p12").
load("<spreadsheetId>/worksheet1")
I created a new assembly jar that includes all the credentials and used that jar to read the file, but I am having trouble reading the credentialPath file. I tried using
getClass.getResourceAsStream("/resources/Aircraft/allAircraft.txt")
But the library only supports absolute paths. Please help me resolve this issue.
Recommended answer
You can use the --files argument of spark-submit or SparkContext.addFile() to distribute a credential file. To get the local path of the credential file on a worker node, call SparkFiles.get("credential filename").
import org.apache.spark.SparkFiles
// you can also use `spark-submit --files=credential.p12`
sqlContext.sparkContext.addFile("credential.p12")
val credentialPath = SparkFiles.get("credential.p12")
val df = sqlContext.read.
format("com.github.potix2.spark.google.spreadsheets").
option("serviceAccountId", "xxxxxx@developer.gserviceaccount.com").
option("credentialPath", credentialPath).
load("<spreadsheetId>/worksheet1")
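If the credential has to stay bundled inside the assembly jar, as in the question, one possible alternative is to copy the classpath resource to a temporary file and pass that file's absolute path to the library. The following is only a minimal sketch under stated assumptions: the object name CredentialResource, the method toTempFile, and the resource name "/credential.p12" are hypothetical and not part of the library. The temporary file exists only on the machine where the method runs, so this may not be sufficient if the credential is also needed on other nodes.

```scala
import java.io.InputStream
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper, not part of spark-google-spreadsheets.
object CredentialResource {

  /** Copies a classpath resource to a temporary file and returns its absolute path. */
  def toTempFile(resourceName: String): String = {
    // A leading "/" makes the lookup relative to the classpath root (the jar root).
    val in: InputStream = getClass.getResourceAsStream(resourceName)
    require(in != null, s"Resource not found on classpath: $resourceName")
    try {
      val tmp: Path = Files.createTempFile("credential-", ".p12")
      tmp.toFile.deleteOnExit() // remove the copy when the JVM exits
      Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
      tmp.toAbsolutePath.toString
    } finally {
      in.close()
    }
  }
}
```

The returned string could then be used in place of a hard-coded path, for example option("credentialPath", CredentialResource.toTempFile("/credential.p12")), assuming the file was packaged at the root of the jar.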