Custom Data Types for DataFrame columns when using Spark JDBC
Question
I know I can use a custom dialect to get a correct mapping between my database and Spark, but how can I create a custom table schema with specific field data types and lengths when I use Spark's jdbc.write options? I would like to have granular control over my table schemas when I write a table from Spark.
Answer
Write flexibility is limited at best; the relevant work is tracked by:
- SPARK-10101 - Spark JDBC writer mapping String to TEXT or VARCHAR
- SPARK-10849 - Allow user to specify database column type for data frame fields when writing data to jdbc data sources
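As a note on the second issue: SPARK-10849 was eventually addressed by the `createTableColumnTypes` write option (available since Spark 2.2), which lets you override the column type definitions used when Spark creates the target table. A minimal sketch, assuming an existing `SparkSession`, a DataFrame `df` with `name` and `comments` columns, and a placeholder JDBC URL:

```scala
import java.util.Properties

// Hypothetical write: override the DDL types Spark uses for CREATE TABLE.
// The types must be valid for the target database; column names must
// exist in the DataFrame schema.
df.write
  .option("createTableColumnTypes", "name VARCHAR(64), comments VARCHAR(1024)")
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", new Properties())
```

This only affects table creation on write; it does not change how Spark reads types back.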
but if you'd like

> to have granular control over my table schemas when I write a table from Spark.

you might have to implement your own JdbcDialect. It is an internal developer API and, as far as I can tell, it is not pluggable, so you may need customized Spark binaries (it might be possible to use registerDialect, but I haven't tried this).
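To illustrate the dialect approach: a minimal sketch of a custom `JdbcDialect` that forces Spark's `StringType` to `VARCHAR(255)` on write, registered via `JdbcDialects.registerDialect`. The dialect name and the `VARCHAR(255)` choice are illustrative assumptions, and this requires the Spark libraries on the classpath:

```scala
import java.sql.Types

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Hypothetical dialect: claims PostgreSQL URLs and overrides the
// Spark-to-database type mapping used when writing.
object VarcharDialect extends JdbcDialect {
  // Decide which JDBC URLs this dialect applies to.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:postgresql")

  // Map Spark SQL types to database column types on write.
  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("VARCHAR(255)", Types.VARCHAR))
    case _          => None // fall back to the default mapping
  }
}

// Registration through the developer API (whether this takes effect
// without patched binaries depends on your Spark version):
JdbcDialects.registerDialect(VarcharDialect)
```

After registration, any `df.write.jdbc(...)` against a matching URL picks up the overridden type mapping; unregister with `JdbcDialects.unregisterDialect` if needed.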