Catalog Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Catalog interface for Spark. To access this, use SparkSession.Catalog.
public sealed class Catalog
type Catalog = class
Public NotInheritable Class Catalog
Inheritance
Object → Catalog
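A minimal sketch of reaching the catalog through SparkSession.Catalog, as the definition above describes. It assumes the Microsoft.Spark package is referenced and the program is launched through a working Spark installation; the application name "catalog-access-example" and the class name are hypothetical.

```csharp
using System;
using Microsoft.Spark.Sql;

class CatalogAccessExample
{
    static void Main()
    {
        // Hypothetical application name; any string works here.
        SparkSession spark = SparkSession
            .Builder()
            .AppName("catalog-access-example")
            .GetOrCreate();

        // The Catalog instance is exposed as a property of the session.
        var catalog = spark.Catalog;

        // CurrentDatabase() returns the name of the session's current database.
        Console.WriteLine(catalog.CurrentDatabase());

        // ListDatabases() returns a DataFrame, so it can be displayed with Show().
        catalog.ListDatabases().Show();

        spark.Stop();
    }
}
```

Using var for the catalog avoids having to qualify the Catalog type name in the declaration.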
Methods
CacheTable(String) |
Caches the specified table in-memory. Spark SQL can cache tables using an in-memory columnar format by calling
|
ClearCache() |
Removes all cached tables from the in-memory cache. You can either clear all cached
tables at once using this or clear each table individually using
|
CreateTable(String, String, String) |
Creates a table in the Hive warehouse from the given path, based on a data source, and returns the corresponding DataFrame. The file type (csv, parquet, etc.) is specified using the |
CreateTable(String, String) |
Creates a table in the Hive warehouse from the given path and returns the
corresponding DataFrame. The table will contain the contents of the Parquet
file that is in the |
CurrentDatabase() |
Returns the current database in this session. By default, your session is
connected to the database named "default"; to change database,
either use |
DatabaseExists(String) |
Check if the database with the specified name exists. This checks the list of Hive databases in the current session. |
DropGlobalTempView(String) |
Drops the global temporary view with the given view name in the catalog. You can create global temporary views by taking a DataFrame and calling
|
DropTempView(String) |
Drops the local temporary view with the given view name in the catalog. A local temporary view is session-scoped: its lifetime is the lifetime of the session that created it, so it is automatically dropped when the session terminates. It is not tied to any database, so a qualified name such as db1.view1 cannot be used to reference a local temporary view. You can create temporary views by taking a DataFrame and calling
|
FunctionExists(String, String) |
Check if the function with the specified name exists in the specified database. To
check whether a built-in function exists, specify the dbName as null or use
|
FunctionExists(String) |
Check if the function with the specified name exists. |
GetDatabase(String) |
Get the database with the specified name. Calling |
GetFunction(String, String) |
Get the function with the specified name. To get a built-in function, pass null as the dbName. |
GetFunction(String) |
Get the function with the specified name. To get a built-in function, use the unqualified name. |
GetTable(String, String) |
Get the table or view with the specified name in the specified database. You can use this to find the table's description, database, and type, and whether it is a temporary table. |
GetTable(String) |
Get the table or view with the specified name. You can use this to find the table's description, database, and type, and whether it is a temporary table. |
IsCached(String) |
Returns true if the table is currently cached in-memory. If the table is cached then it
will consume memory. To remove the table from cache use |
ListColumns(String, String) |
Returns a list of columns for the given table/view in the specified database.
The |
ListColumns(String) |
Returns a list of columns for the given table/view or temporary view. The DataFrame includes each column's name, description, and dataType, and whether it is nullable, a partition column, or a bucket column. |
ListDatabases() |
Returns a list of databases available across all sessions. The |
ListFunctions() |
Returns a list of functions registered in the current database. This includes all
temporary functions. The |
ListFunctions(String) |
Returns a list of functions registered in the specified database. This includes all
temporary functions. The |
ListTables() |
Returns a list of tables/views in the current database. The |
ListTables(String) |
Returns a list of tables/views in the specified database. The |
RecoverPartitions(String) |
Recovers all the partitions in the directory of a table and updates the catalog. This works only for partitioned tables, not for unpartitioned tables or views. |
RefreshByPath(String) |
Invalidates and refreshes all the cached data (and the associated metadata) for any Dataset that contains the given data source path. Path matching is by prefix, i.e. "/" would invalidate everything that is cached. |
RefreshTable(String) |
Invalidates and refreshes all the cached data and metadata of the given table. For performance reasons, Spark SQL or the external data source library it uses might cache certain metadata about a table, such as the location of blocks. When those change outside of Spark SQL, users should call this function to invalidate the cache. If this table is cached as an InMemoryRelation, the original cached version is dropped and the new version is cached lazily. |
SetCurrentDatabase(String) |
Sets the current default database in this session. |
TableExists(String, String) |
Check if the table or view with the specified name exists in the specified database. |
TableExists(String) |
Check if the table or view with the specified name exists. This can either be a temporary view or a table/view. |
UncacheTable(String) |
Removes the specified table from the in-memory cache. |
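The following sketch ties several of the methods above together: it registers a local temporary view, checks that it exists, caches and uncaches it, lists its columns, and drops it. It assumes the Microsoft.Spark package and a working Spark installation; the view name "numbers", the application name, and the class name are hypothetical and chosen only for illustration.

```csharp
using System;
using Microsoft.Spark.Sql;

class CatalogMethodsExample
{
    static void Main()
    {
        SparkSession spark = SparkSession
            .Builder()
            .AppName("catalog-methods-example") // hypothetical name
            .GetOrCreate();

        var catalog = spark.Catalog;

        // Register a small DataFrame as a session-scoped temporary view.
        DataFrame df = spark.Range(10);
        df.CreateOrReplaceTempView("numbers");

        // TableExists(String) also finds temporary views.
        Console.WriteLine($"exists: {catalog.TableExists("numbers")}");

        // Cache the view, then ask the catalog whether it is cached.
        catalog.CacheTable("numbers");
        Console.WriteLine($"cached: {catalog.IsCached("numbers")}");

        // ListColumns(String) and ListTables() return DataFrames.
        catalog.ListColumns("numbers").Show();
        catalog.ListTables().Show();

        // Remove this one view from the cache; ClearCache() would remove all.
        catalog.UncacheTable("numbers");
        Console.WriteLine($"cached after uncache: {catalog.IsCached("numbers")}");

        // Drop the temporary view explicitly instead of waiting for session end.
        catalog.DropTempView("numbers");

        spark.Stop();
    }
}
```

Dropping the view explicitly is optional here, since a local temporary view disappears when the session that created it terminates.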