DataLakeServiceClient Class
A client to interact with the DataLake Service at the account level.
This client provides operations to retrieve and configure the account properties as well as list, create and delete file systems within the account. For operations relating to a specific file system, directory or file, clients for those entities can also be retrieved using the get_client functions.
- Inheritance
- azure.storage.filedatalake._shared.base_client.StorageAccountHostsMixin
- DataLakeServiceClient
Constructor
DataLakeServiceClient(account_url: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any)
Parameters
Name | Description |
---|---|
account_url
Required
|
The URL to the DataLake storage account. Any other entities included in the URL path (e.g. file system or file) will be discarded. This URL can be optionally authenticated with a SAS token. |
credential
|
The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or a TokenCredential instance from azure.identity. If the resource URI already contains a SAS token, that token is ignored in favor of an explicit credential.
Default value: None
|
Keyword-Only Parameters
Name | Description |
---|---|
api_version
|
The Storage API version to use for requests. Default value is the most recent service version that is compatible with the current SDK. Setting to an older version may result in reduced feature compatibility. |
audience
|
The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://<account>.blob.core.windows.net. |
Examples
Creating the DataLakeServiceClient from connection string.
from azure.storage.filedatalake import DataLakeServiceClient
# connection_string holds the connection string of the storage account
datalake_service_client = DataLakeServiceClient.from_connection_string(connection_string)
Creating the DataLakeServiceClient with Azure Identity credentials.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient
# account_name holds the name of the storage account
token_credential = DefaultAzureCredential()
datalake_service_client = DataLakeServiceClient(
    "https://{}.dfs.core.windows.net".format(account_name),
    credential=token_credential)
Variables
Name | Description |
---|---|
url
|
The full endpoint URL to the datalake service endpoint. |
primary_endpoint
|
The full primary endpoint URL. |
primary_hostname
|
The hostname of the primary endpoint. |
Methods
close |
Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager. |
create_file_system |
Creates a new file system under the specified account. If a file system with the same name already exists, a ResourceExistsError will be raised. This method returns a client with which to interact with the newly created file system. |
delete_file_system |
Marks the specified file system for deletion. The file system and any files contained within it are later deleted during garbage collection. If the file system is not found, a ResourceNotFoundError will be raised. |
from_connection_string |
Create DataLakeServiceClient from a Connection String. |
get_directory_client |
Get a client to interact with the specified directory. The directory need not already exist. |
get_file_client |
Get a client to interact with the specified file. The file need not already exist. |
get_file_system_client |
Get a client to interact with the specified file system. The file system need not already exist. |
get_service_properties |
Gets the properties of a storage account's datalake service, including Azure Storage Analytics. New in version 12.4.0: This operation was introduced in API version '2020-06-12'. |
get_user_delegation_key |
Obtain a user delegation key for the purpose of signing SAS tokens. A token credential must be present on the service object for this request to succeed. |
list_file_systems |
Returns a generator to list the file systems under the specified account. The generator will lazily follow the continuation tokens returned by the service and stop when all file systems have been returned. |
set_service_properties |
Sets the properties of a storage account's Datalake service, including Azure Storage Analytics. New in version 12.4.0: This operation was introduced in API version '2020-06-12'. If an element (e.g. analytics_logging) is left as None, the existing settings on the service for that functionality are preserved. |
undelete_file_system |
Restores a soft-deleted file system. The operation succeeds only if it is performed within the number of days set in the delete retention policy. New in version 12.3.0: This operation was introduced in API version '2019-12-12'. |
close
Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager.
close() -> None
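As a minimal sketch of the context-manager pattern mentioned above: the helper below (count_file_systems is a hypothetical name, not part of the SDK) enters the client with a `with` statement, so no explicit close() call is needed. It assumes `service_client` is a DataLakeServiceClient, or any object exposing the same context-manager protocol and list_file_systems method.

```python
def count_file_systems(service_client) -> int:
    """Count the file systems in the account, releasing the client afterwards.

    `service_client` is assumed to be a DataLakeServiceClient (or a stand-in
    exposing the same context-manager protocol and list_file_systems()).
    """
    # Using the client as a context manager releases its connections on
    # exit, even if listing raises, so close() is not called explicitly.
    with service_client:
        return sum(1 for _ in service_client.list_file_systems())
```

A long-lived client that is not used in a `with` block would instead call close() once it is no longer needed.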
create_file_system
Creates a new file system under the specified account.
If a file system with the same name already exists, a ResourceExistsError will be raised. This method returns a client with which to interact with the newly created file system.
create_file_system(file_system: FileSystemProperties | str, metadata: Dict[str, str] | None = None, public_access: PublicAccess | None = None, **kwargs) -> FileSystemClient
Parameters
Name | Description |
---|---|
file_system
Required
|
The name of the file system to create. |
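metadata
|
A dict with name-value pairs to associate with the file system as metadata. Example: {'Category':'test'}. Default value: None |
public_access
|
Specifies whether data in the file system may be accessed publicly, and the level of access. Possible values include: file system, file. Default value: None |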
Keyword-Only Parameters
Name | Description |
---|---|
encryption_scope_options
|
Specifies the default encryption scope to set on the file system and use for all future writes. New in version 12.9.0. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
FileSystemClient | A FileSystemClient for the newly created file system. |
Examples
Creating a file system in the datalake service.
datalake_service_client.create_file_system("filesystem")
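Because create_file_system raises ResourceExistsError when the name is already taken, setup code that may run more than once often wants "create if missing" behaviour. The following is a sketch of one way to do that, not an SDK feature: get_or_create_file_system is a hypothetical helper, and `service_client` is assumed to be a DataLakeServiceClient.

```python
def get_or_create_file_system(service_client, name, metadata=None):
    """Return a client for `name`, creating the file system if it is missing.

    Hypothetical helper; `service_client` is assumed to be a
    DataLakeServiceClient.
    """
    # Imported inside the function so this module can be loaded even where
    # azure-core is not installed; the exception class ships with azure-core.
    from azure.core.exceptions import ResourceExistsError

    try:
        # create_file_system returns a client for the new file system.
        return service_client.create_file_system(name, metadata=metadata)
    except ResourceExistsError:
        # The name is already in use: return a client for the existing one.
        return service_client.get_file_system_client(name)
```

Note that when the file system already exists, the metadata argument is not applied to it; the existing file system is returned unchanged.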
delete_file_system
Marks the specified file system for deletion.
The file system and any files contained within it are later deleted during garbage collection. If the file system is not found, a ResourceNotFoundError will be raised.
delete_file_system(file_system: FileSystemProperties | str, **kwargs) -> FileSystemClient
Parameters
Name | Description |
---|---|
file_system
Required
|
The file system to delete. This can either be the name of the file system, or an instance of FileSystemProperties. |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
If specified, delete_file_system only succeeds if the file system's lease is active and matches this ID. Required if the file system has an active lease. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
FileSystemClient | A FileSystemClient for the deleted file system. |
Examples
Deleting a file system in the datalake service.
datalake_service_client.delete_file_system("filesystem")
from_connection_string
Create DataLakeServiceClient from a Connection String.
from_connection_string(conn_str: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any) -> Self
Parameters
Name | Description |
---|---|
conn_str
Required
|
A connection string to an Azure Storage account. |
credential
|
The credentials with which to authenticate. This is optional if the account URL already has a SAS token, or the connection string already has shared access key values. The value can be a SAS token string, an instance of AzureSasCredential from azure.core.credentials, an account shared access key, or a TokenCredential instance from azure.identity. Credentials provided here will take precedence over those in the connection string.
Default value: None
|
Keyword-Only Parameters
Name | Description |
---|---|
audience
|
The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://<account>.blob.core.windows.net. |
Returns
Type | Description |
---|---|
DataLakeServiceClient | A DataLakeServiceClient. |
Examples
Creating the DataLakeServiceClient from a connection string.
from azure.storage.filedatalake import DataLakeServiceClient
# connection_string holds the connection string of the storage account
datalake_service_client = DataLakeServiceClient.from_connection_string(connection_string)
get_directory_client
Get a client to interact with the specified directory.
The directory need not already exist.
get_directory_client(file_system: FileSystemProperties | str, directory: DirectoryProperties | str) -> DataLakeDirectoryClient
Parameters
Name | Description |
---|---|
file_system
Required
|
The file system that the directory is in. This can either be the name of the file system, or an instance of FileSystemProperties. |
directory
Required
|
The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties. |
Returns
Type | Description |
---|---|
DataLakeDirectoryClient | A DataLakeDirectoryClient. |
Examples
Getting the directory client to interact with a specific directory.
directory_client = datalake_service_client.get_directory_client(file_system_client.file_system_name,
"mydirectory")
get_file_client
Get a client to interact with the specified file.
The file need not already exist.
get_file_client(file_system: FileSystemProperties | str, file_path: FileProperties | str) -> DataLakeFileClient
Parameters
Name | Description |
---|---|
file_system
Required
|
The file system that the file is in. This can either be the name of the file system, or an instance of FileSystemProperties. |
file_path
Required
|
The file with which to interact. This can either be the full path of the file from the root directory (e.g. directory/subdirectory/file), or an instance of FileProperties. |
Returns
Type | Description |
---|---|
DataLakeFileClient | A DataLakeFileClient. |
Examples
Getting the file client to interact with a specific file.
file_client = datalake_service_client.get_file_client(file_system_client.file_system_name, "myfile")
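Since file_path is the full path from the root directory, a file inside nested directories is addressed with a single slash-separated string. The sketch below shows one way to build that string; nested_file_client is a hypothetical helper name, and `service_client` is assumed to be a DataLakeServiceClient.

```python
import posixpath


def nested_file_client(service_client, file_system, *path_parts):
    """Return a client for a file addressed by its path from the root directory.

    Hypothetical helper; `service_client` is assumed to be a
    DataLakeServiceClient and `path_parts` the directory names followed by
    the file name.
    """
    # Paths in a file system are slash-separated on every platform, so join
    # with posixpath rather than os.path (which uses backslashes on Windows).
    file_path = posixpath.join(*path_parts)
    # The file need not exist yet; this only constructs the client.
    return service_client.get_file_client(file_system, file_path)
```

For example, calling nested_file_client(datalake_service_client, "filesystem", "directory", "subdirectory", "file") requests a client for directory/subdirectory/file.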
get_file_system_client
Get a client to interact with the specified file system.
The file system need not already exist.
get_file_system_client(file_system: FileSystemProperties | str) -> FileSystemClient
Parameters
Name | Description |
---|---|
file_system
Required
|
The file system. This can either be the name of the file system, or an instance of FileSystemProperties. |
Returns
Type | Description |
---|---|
FileSystemClient | A FileSystemClient. |
Examples
Getting the file system client to interact with a specific file system.
# Instantiate a DataLakeServiceClient using a connection string
# (connection_string holds the connection string of the storage account)
from azure.storage.filedatalake import DataLakeServiceClient
datalake_service_client = DataLakeServiceClient.from_connection_string(connection_string)
# Instantiate a FileSystemClient
file_system_client = datalake_service_client.get_file_system_client("mynewfilesystem")
get_service_properties
Gets the properties of a storage account's datalake service, including Azure Storage Analytics.
New in version 12.4.0: This operation was introduced in API version '2020-06-12'.
get_service_properties(**kwargs: Any) -> Dict[str, Any]
Keyword-Only Parameters
Name | Description |
---|---|
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
Dict[str, Any] | An object containing datalake service properties such as analytics logging, hour/minute metrics, CORS rules, etc. |
get_user_delegation_key
Obtain a user delegation key for the purpose of signing SAS tokens. A token credential must be present on the service object for this request to succeed.
get_user_delegation_key(key_start_time: datetime, key_expiry_time: datetime, **kwargs: Any) -> UserDelegationKey
Parameters
Name | Description |
---|---|
key_start_time
Required
|
A DateTime value. Indicates when the key becomes valid. |
key_expiry_time
Required
|
A DateTime value. Indicates when the key stops being valid. |
Keyword-Only Parameters
Name | Description |
---|---|
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
UserDelegationKey | The user delegation key. |
Examples
Get user delegation key from datalake service client.
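from datetime import datetime, timedelta, timezone
# Use timezone-aware UTC datetimes; datetime.utcnow() is deprecated since Python 3.12
key_start_time = datetime.now(timezone.utc)
user_delegation_key = datalake_service_client.get_user_delegation_key(key_start_time,
    key_start_time + timedelta(hours=1))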
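The validity window is easier to reason about when the start and expiry are derived from a single timestamp. The sketch below separates that calculation from the service call; delegation_key_window and request_delegation_key are hypothetical helper names, and `service_client` is assumed to be a DataLakeServiceClient created with a token credential.

```python
from datetime import datetime, timedelta, timezone


def delegation_key_window(valid_for=timedelta(hours=1), now=None):
    """Return (start, expiry) for a user delegation key.

    `now` defaults to the current time as a timezone-aware UTC datetime;
    passing it explicitly makes the calculation easy to test.
    """
    start = now if now is not None else datetime.now(timezone.utc)
    return start, start + valid_for


def request_delegation_key(service_client, valid_for=timedelta(hours=1)):
    """Request a key valid from now for `valid_for` (hypothetical helper)."""
    start, expiry = delegation_key_window(valid_for)
    # Both values are derived from one timestamp, so the window length is
    # exactly `valid_for`.
    return service_client.get_user_delegation_key(start, expiry)
```

Keeping the window calculation in its own function also makes it simple to reuse the same expiry when generating SAS tokens from the key.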
list_file_systems
Returns a generator to list the file systems under the specified account.
The generator will lazily follow the continuation tokens returned by the service and stop when all file systems have been returned.
list_file_systems(name_starts_with: str | None = None, include_metadata: bool | None = None, **kwargs) -> ItemPaged[FileSystemProperties]
Parameters
Name | Description |
---|---|
name_starts_with
|
Filters the results to return only file systems whose names begin with the specified prefix. Default value: None |
include_metadata
|
Specifies that file system metadata be returned in the response. The default value is False. |
Keyword-Only Parameters
Name | Description |
---|---|
results_per_page
|
The maximum number of file system names to retrieve per API call. If the request does not specify a value, the server returns up to 5,000 items per page. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
include_deleted
|
Specifies that deleted file systems be returned in the response. This applies to accounts with file system restore enabled. The default value is False. New in version 12.3.0. |
include_system
|
Flag specifying that system file systems should be included. New in version 12.6.0. |
Returns
Type | Description |
---|---|
ItemPaged[FileSystemProperties] | An iterable (auto-paging) of FileSystemProperties. |
Examples
Listing the file systems in the datalake service.
file_systems = datalake_service_client.list_file_systems()
for file_system in file_systems:
print(file_system.name)
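The prefix filter and page size described above can be combined in one call. The following sketch collects the matching names into a list; file_system_names is a hypothetical helper, and `service_client` is assumed to be a DataLakeServiceClient.

```python
def file_system_names(service_client, prefix=None, page_size=None):
    """Return the names of file systems whose names start with `prefix`.

    Hypothetical helper; `service_client` is assumed to be a
    DataLakeServiceClient.
    """
    kwargs = {}
    if page_size is not None:
        # Only pass results_per_page when requested, so the service default
        # applies otherwise.
        kwargs["results_per_page"] = page_size
    file_systems = service_client.list_file_systems(name_starts_with=prefix, **kwargs)
    # Iterating the result follows continuation tokens lazily, one page at
    # a time, until every matching file system has been returned.
    return [file_system.name for file_system in file_systems]
```

Building a full list defeats the lazy paging, so for accounts with many file systems it may be preferable to iterate over the result directly.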
set_service_properties
Sets the properties of a storage account's Datalake service, including Azure Storage Analytics.
New in version 12.4.0: This operation was introduced in API version '2020-06-12'.
If an element (e.g. analytics_logging) is left as None, the existing settings on the service for that functionality are preserved.
set_service_properties(**kwargs: Any) -> None
Keyword-Only Parameters
Name | Description |
---|---|
analytics_logging
|
Groups the Azure Analytics Logging settings. |
hour_metrics
|
The hour metrics settings provide a summary of request statistics grouped by API in hourly aggregates. |
minute_metrics
|
The minute metrics settings provide request statistics for each minute. |
cors
|
You can include up to five CorsRule elements in the list. If an empty list is specified, all CORS rules will be deleted, and CORS will be disabled for the service. |
target_version
|
Indicates the default version to use for requests if an incoming request's version is not specified. |
delete_retention_policy
|
The delete retention policy specifies whether to retain deleted files/directories. It also specifies the number of days and versions of file/directory to keep. |
static_website
|
Specifies whether the static website feature is enabled, and if yes, indicates the index document and 404 error document to use. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
None | This method does not return a value. |
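Because elements left as None keep their existing settings, a single element such as cors can be updated on its own. The sketch below does that and checks the five-rule limit stated in the cors description before calling the service. replace_cors_rules and MAX_CORS_RULES are hypothetical names, `service_client` is assumed to be a DataLakeServiceClient, and `rules` is assumed to hold CorsRule instances.

```python
MAX_CORS_RULES = 5  # limit stated in the cors parameter description


def replace_cors_rules(service_client, rules):
    """Replace the CORS rules of the service, leaving other settings untouched.

    Hypothetical helper; `rules` is assumed to be an iterable of CorsRule
    objects. Returns the number of rules that were set.
    """
    rules = list(rules)
    if len(rules) > MAX_CORS_RULES:
        # Fail early on the client instead of sending an invalid request.
        raise ValueError(
            "at most {} CORS rules are allowed, got {}".format(MAX_CORS_RULES, len(rules))
        )
    # Only `cors` is passed; elements that are not supplied stay as None,
    # so their existing service settings are preserved.
    service_client.set_service_properties(cors=rules)
    return len(rules)
```

Passing an empty list deletes all CORS rules and disables CORS for the service, so callers should treat an empty `rules` argument as a deliberate choice.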
undelete_file_system
Restores a soft-deleted file system.
The operation succeeds only if it is performed within the number of days set in the delete retention policy.
New in version 12.3.0: This operation was introduced in API version '2019-12-12'.
undelete_file_system(name: str, deleted_version: str, **kwargs: Any) -> FileSystemClient
Parameters
Name | Description |
---|---|
name
Required
|
Specifies the name of the deleted filesystem to restore. |
deleted_version
Required
|
Specifies the version of the deleted filesystem to restore. |
Keyword-Only Parameters
Name | Description |
---|---|
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. Client-side network timeouts are configured separately. |
Returns
Type | Description |
---|---|
FileSystemClient | A FileSystemClient for the restored soft-deleted file system. |
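The name and deleted_version arguments can be taken from a listing that includes deleted file systems. The sketch below combines list_file_systems(include_deleted=True) with undelete_file_system. restore_deleted_file_systems is a hypothetical helper, `service_client` is assumed to be a DataLakeServiceClient, and the listed FileSystemProperties objects are assumed to expose `deleted` and `deleted_version` attributes.

```python
def restore_deleted_file_systems(service_client, prefix=None):
    """Restore every soft-deleted file system whose name starts with `prefix`.

    Hypothetical helper; returns the names of the restored file systems.
    Assumes each listed item has `name`, `deleted` and `deleted_version`.
    """
    restored = []
    listing = service_client.list_file_systems(
        name_starts_with=prefix, include_deleted=True
    )
    for properties in listing:
        # Live file systems are also listed; skip anything not soft-deleted.
        if not properties.deleted:
            continue
        service_client.undelete_file_system(
            properties.name, properties.deleted_version
        )
        restored.append(properties.name)
    return restored
```

A restore only succeeds within the retention period, so a production version would likely also handle the error raised for file systems that can no longer be recovered.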
Attributes
api_version
location_mode
The location mode that the client is currently using.
By default this will be "primary". Options include "primary" and "secondary".
Returns
Type | Description |
---|---|
str | The current location mode, either "primary" or "secondary". |
primary_endpoint
primary_hostname
secondary_endpoint
The full secondary endpoint URL if configured.
If not available a ValueError will be raised. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.
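Returns
Type | Description |
---|---|
str | The full secondary endpoint URL. |
Exceptions
Type | Description |
---|---|
ValueError | Raised if no secondary endpoint is available. |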
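Since secondary_endpoint raises ValueError rather than returning None when no secondary endpoint is available, code that treats the secondary as optional needs to catch that error. A minimal sketch, where secondary_endpoint_or_none is a hypothetical helper and `client` is assumed to be a DataLakeServiceClient:

```python
def secondary_endpoint_or_none(client):
    """Return the secondary endpoint URL, or None if it is not available.

    Hypothetical helper; `client` is assumed to be a DataLakeServiceClient.
    """
    try:
        return client.secondary_endpoint
    except ValueError:
        # Raised by the property when no secondary endpoint is configured.
        return None
```

This mirrors the behaviour of secondary_hostname, which already returns None when no secondary is available.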
secondary_hostname
The hostname of the secondary endpoint.
If not available this will be None. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.
Returns
Type | Description |
---|---|
str or None | The hostname of the secondary endpoint, or None if not available. |
url
The full endpoint URL to this entity, including SAS token if used.
This could be either the primary endpoint or the secondary endpoint, depending on the current location_mode.
Returns
Type | Description |
---|---|
str | The full endpoint URL to this entity, including SAS token if used. |