How to see all the databases and tables in Databricks
Question
I want to list all the tables in every database in Azure Databricks.
I want the output to look something like this:
Database | Table_name
Database1 | Table_1
Database1 | Table_2
Database1 | Table_3
Database2 | Table_1
etc..
This is what I have so far:
from pyspark.sql.types import *
DatabaseDF = spark.sql(f"show databases")
df = spark.sql(f"show Tables FROM {DatabaseDF}")
#df = df.select("databaseName")
#list = [x["databaseName"] for x in df.collect()]
print(DatabaseDF)
display(DatabaseDF)
df = spark.sql(f"show Tables FROM {schemaName}")
df = df.select("TableName")
list = [x["TableName"] for x in df.collect()]
## Iterate through list of schema
for x in list:
    ### INPUT Required: Change for target table
    tempTable = x
    df2 = spark.sql(f"SELECT COUNT(*) FROM {schemaName}.{tempTable}").collect()
    for x in df2:
        rowCount = x[0]
        if rowCount == 0:
            print(schemaName + "." + tempTable + " has 0 rows")
But I am not quite getting the result I want.
Recommended answer
The Spark session has a catalog property, which is probably what you are looking for:
spark.catalog.listDatabases()
spark.catalog.listTables("database_name")
listDatabases returns the list of databases you have.
listTables returns the list of tables for a given database name.
You can do something like this, for example:
[
    (table.database, table.name)
    for database in spark.catalog.listDatabases()
    for table in spark.catalog.listTables(database.name)
]
to get the list of databases and tables.
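To get the two-column shape asked for in the question, the pairs can be turned into a DataFrame. The following is only a minimal sketch, assuming `spark` is an active SparkSession (as Databricks notebooks provide). The helper name `tables_dataframe` and the column names are illustrative and not part of the Spark API.

```python
def tables_dataframe(spark):
    """Return a DataFrame with one (Database, Table_name) row per table.

    `spark` is assumed to be an active SparkSession; the function name
    is hypothetical and chosen only for this example.
    """
    pairs = [
        (database.name, table.name)
        for database in spark.catalog.listDatabases()
        for table in spark.catalog.listTables(database.name)
    ]
    # An explicit DDL schema avoids type inference, which fails on an empty list.
    return spark.createDataFrame(pairs, "Database string, Table_name string")
```

In a notebook, `display(tables_dataframe(spark))` would then render the result as a table.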
(Thanks @Alex Ott) Although this solution works fine, it is quite slow.
Using SQL commands directly, such as show databases or show tables in ..., should do the job faster.
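The SQL route could look roughly like the sketch below. It rests on a few assumptions: `spark` is an active SparkSession, the helper name `list_all_tables` is made up for illustration, and the database name is read by position because the column name returned by `SHOW DATABASES` differs between Spark versions (`databaseName` or `namespace`).

```python
def list_all_tables(spark):
    """Collect (database, table) pairs using plain SQL commands.

    `spark` is assumed to be an active SparkSession; the function name
    is hypothetical.
    """
    results = []
    for db_row in spark.sql("SHOW DATABASES").collect():
        # Read by position: the column name varies across Spark versions.
        db_name = db_row[0]
        # Backticks guard against database names with unusual characters.
        for tbl_row in spark.sql(f"SHOW TABLES IN `{db_name}`").collect():
            # Temporary views appear in every listing, so skip them.
            if tbl_row.isTemporary:
                continue
            results.append((db_name, tbl_row.tableName))
    return results
```

Calling `list_all_tables(spark)` returns a plain Python list, so each pair can be printed as `Database | Table_name` or passed to `spark.createDataFrame` if a DataFrame is preferred.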