DataLakeServiceClient Class

A client to interact with the DataLake Service at the account level.

This client provides operations to retrieve and configure the account properties as well as list, create and delete file systems within the account. For operations relating to a specific file system, directory or file, clients for those entities can also be retrieved using the get_client functions.

Inheritance
azure.storage.filedatalake._shared.base_client.StorageAccountHostsMixin
DataLakeServiceClient

Constructor

DataLakeServiceClient(account_url: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any)

Parameters

Name Description
account_url
Required
str

The URL to the DataLake storage account. Any other entities included in the URL path (e.g. file system or file) will be discarded. This URL can be optionally authenticated with a SAS token.

credential

The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredential class from azure.identity. If the resource URI already contains a SAS token, it is ignored in favor of an explicit credential, except in the case of AzureSasCredential, where the conflicting SAS tokens raise a ValueError. If using an instance of AzureNamedKeyCredential, "name" should be the storage account name, and "key" should be the storage account key.
Default value: None

Keyword-Only Parameters

Name Description
api_version
str

The Storage API version to use for requests. Default value is the most recent service version that is compatible with the current SDK. Setting to an older version may result in reduced feature compatibility.

audience
str

The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net.

Examples

Creating the DataLakeServiceClient from connection string.


   from azure.storage.filedatalake import DataLakeServiceClient
   datalake_service_client = DataLakeServiceClient.from_connection_string(self.connection_string)

Creating the DataLakeServiceClient with Azure Identity credentials.


   from azure.identity import DefaultAzureCredential
   from azure.storage.filedatalake import DataLakeServiceClient
   token_credential = DefaultAzureCredential()
   datalake_service_client = DataLakeServiceClient("https://{}.dfs.core.windows.net".format(self.account_name),
                                                   credential=token_credential)
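
The account URL in the example above follows the pattern used for the public Azure cloud. As a minimal sketch, the URL construction can be separated from client creation. The helper names build_account_url and make_service_client are hypothetical, and the SDK imports are deferred so that the URL helper can be used on its own.

```python
def build_account_url(account_name: str) -> str:
    """Return the DFS endpoint URL for a storage account name.

    Assumes the public-cloud suffix shown in the example above.
    """
    return "https://{}.dfs.core.windows.net".format(account_name)


def make_service_client(account_name: str):
    """Create a DataLakeServiceClient using Azure Identity credentials."""
    # Imported lazily so build_account_url works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    return DataLakeServiceClient(
        build_account_url(account_name),
        credential=DefaultAzureCredential(),
    )
```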

Variables

Name Description
url
str

The full endpoint URL to the datalake service endpoint.

primary_endpoint
str

The full primary endpoint URL.

primary_hostname
str

The hostname of the primary endpoint.

Methods

close

Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager.

create_file_system

Creates a new file system under the specified account.

If a file system with the same name already exists, a ResourceExistsError will be raised. This method returns a client with which to interact with the newly created file system.

delete_file_system

Marks the specified file system for deletion.

The file system and any files contained within it are later deleted during garbage collection. If the file system is not found, a ResourceNotFoundError will be raised.

from_connection_string

Create DataLakeServiceClient from a Connection String.

get_directory_client

Get a client to interact with the specified directory.

The directory need not already exist.

get_file_client

Get a client to interact with the specified file.

The file need not already exist.

get_file_system_client

Get a client to interact with the specified file system.

The file system need not already exist.

get_service_properties

Gets the properties of a storage account's datalake service, including Azure Storage Analytics.

New in version 12.4.0: This operation was introduced in API version '2020-06-12'.

get_user_delegation_key

Obtain a user delegation key for the purpose of signing SAS tokens. A token credential must be present on the service object for this request to succeed.

list_file_systems

Returns a generator to list the file systems under the specified account.

The generator will lazily follow the continuation tokens returned by the service and stop when all file systems have been returned.

set_service_properties

Sets the properties of a storage account's Datalake service, including Azure Storage Analytics.

New in version 12.4.0: This operation was introduced in API version '2020-06-12'.

If an element (e.g. analytics_logging) is left as None, the existing settings on the service for that functionality are preserved.

undelete_file_system

Restores a soft-deleted file system.

The operation succeeds only if it is performed within the number of days set in the delete retention policy.

New in version 12.3.0: This operation was introduced in API version '2019-12-12'.

close

Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager.

close() -> None
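
A minimal sketch of the context-manager form mentioned above. The helper name count_file_systems is hypothetical; it accepts any already-constructed service client.

```python
def count_file_systems(service_client) -> int:
    """Count the file systems in the account, then release the sockets."""
    # Entering the client as a context manager means close() does not
    # need to be called explicitly once the block exits.
    with service_client:
        return sum(1 for _ in service_client.list_file_systems())
```

Because the sockets are released when the block exits, this form suits short-lived clients.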

create_file_system

Creates a new file system under the specified account.

If a file system with the same name already exists, a ResourceExistsError will be raised. This method returns a client with which to interact with the newly created file system.

create_file_system(file_system: FileSystemProperties | str, metadata: Dict[str, str] | None = None, public_access: PublicAccess | None = None, **kwargs) -> FileSystemClient

Parameters

Name Description
file_system
Required
str

The name of the file system to create.

metadata
Dict[str, str]

A dict with name-value pairs to associate with the file system as metadata. Example: {'Category':'test'}
Default value: None

public_access
PublicAccess

Possible values include: file system, file.
Default value: None

Keyword-Only Parameters

Name Description
encryption_scope_options

Specifies the default encryption scope to set on the file system and use for all future writes.

New in version 12.9.0.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description

A FileSystemClient with newly created file system.

Examples

Creating a file system in the datalake service.


   datalake_service_client.create_file_system("filesystem")
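
Since create_file_system raises ResourceExistsError when the name is taken, a "create if missing" helper can fall back to get_file_system_client. The following is a sketch only: ensure_file_system is a hypothetical name, and the ImportError fallback exists solely so the snippet can be exercised without the SDK installed.

```python
try:
    from azure.core.exceptions import ResourceExistsError
except ImportError:
    # Stand-in so the sketch can be exercised without the SDK installed.
    class ResourceExistsError(Exception):
        pass


def ensure_file_system(service_client, name: str):
    """Return a client for `name`, creating the file system if needed."""
    try:
        return service_client.create_file_system(name)
    except ResourceExistsError:
        # Already present: return a client for the existing file system.
        return service_client.get_file_system_client(name)
```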

delete_file_system

Marks the specified file system for deletion.

The file system and any files contained within it are later deleted during garbage collection. If the file system is not found, a ResourceNotFoundError will be raised.

delete_file_system(file_system: FileSystemProperties | str, **kwargs) -> FileSystemClient

Parameters

Name Description
file_system
Required

The file system to delete. This can either be the name of the file system, or an instance of FileSystemProperties.

Keyword-Only Parameters

Name Description
lease

If specified, delete_file_system only succeeds if the file system's lease is active and matches this ID. Required if the file system has an active lease.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description

A FileSystemClient with the specified file system deleted.

Examples

Deleting a file system in the datalake service.


   datalake_service_client.delete_file_system("filesystem")
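
When a missing file system should not be treated as an error, the ResourceNotFoundError described above can be caught. This is a sketch under that assumption: delete_file_system_if_exists is a hypothetical helper, and the ImportError fallback is only there so the snippet can be exercised without the SDK installed.

```python
try:
    from azure.core.exceptions import ResourceNotFoundError
except ImportError:
    # Stand-in so the sketch can be exercised without the SDK installed.
    class ResourceNotFoundError(Exception):
        pass


def delete_file_system_if_exists(service_client, name: str) -> bool:
    """Delete `name`; return False if it was not found, True otherwise."""
    try:
        service_client.delete_file_system(name)
    except ResourceNotFoundError:
        return False
    return True
```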

from_connection_string

Create DataLakeServiceClient from a Connection String.

from_connection_string(conn_str: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any) -> Self

Parameters

Name Description
conn_str
Required
str

A connection string to an Azure Storage account.

credential

The credentials with which to authenticate. This is optional if the account URL already has a SAS token, or the connection string already has shared access key values. The value can be a SAS token string, an instance of AzureSasCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredential class from azure.identity. Credentials provided here will take precedence over those in the connection string.

Default value: None

Keyword-Only Parameters

Name Description
audience
str

The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net.

Returns

Type Description

A DataLakeServiceClient.

Examples

Creating the DataLakeServiceClient from a connection string.


   from azure.storage.filedatalake import DataLakeServiceClient
   datalake_service_client = DataLakeServiceClient.from_connection_string(self.connection_string)

get_directory_client

Get a client to interact with the specified directory.

The directory need not already exist.

get_directory_client(file_system: FileSystemProperties | str, directory: DirectoryProperties | str) -> DataLakeDirectoryClient

Parameters

Name Description
file_system
Required

The file system that the directory is in. This can either be the name of the file system, or an instance of FileSystemProperties.

directory
Required

The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties.

Returns

Type Description

A DataLakeDirectoryClient.

Examples

Getting the directory client to interact with a specific directory.


   directory_client = datalake_service_client.get_directory_client(file_system_client.file_system_name,
                                                                   "mydirectory")

get_file_client

Get a client to interact with the specified file.

The file need not already exist.

get_file_client(file_system: FileSystemProperties | str, file_path: FileProperties | str) -> DataLakeFileClient

Parameters

Name Description
file_system
Required

The file system that the file is in. This can either be the name of the file system, or an instance of FileSystemProperties.

file_path
Required

The file with which to interact. This can either be the full path of the file (from the root directory), e.g. directory/subdirectory/file, or an instance of FileProperties.

Returns

Type Description

A DataLakeFileClient.

Examples

Getting the file client to interact with a specific file.


   file_client = datalake_service_client.get_file_client(file_system_client.file_system_name, "myfile")
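
For files in nested directories, file_path is the full path from the root directory. The sketch below joins path segments with "/" before requesting the client. The helper name get_file_client_for is hypothetical.

```python
import posixpath


def get_file_client_for(service_client, file_system: str, *parts: str):
    """Join path segments and return a client for the resulting file."""
    # Paths are given from the root directory with "/" separators,
    # e.g. "directory/subdirectory/file".
    file_path = posixpath.join(*parts)
    return service_client.get_file_client(file_system, file_path)
```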

get_file_system_client

Get a client to interact with the specified file system.

The file system need not already exist.

get_file_system_client(file_system: FileSystemProperties | str) -> FileSystemClient

Parameters

Name Description
file_system
Required

The file system. This can either be the name of the file system, or an instance of FileSystemProperties.

Returns

Type Description

A FileSystemClient.

Examples

Getting the file system client to interact with a specific file system.


   # Instantiate a DataLakeServiceClient using a connection string
   from azure.storage.filedatalake import DataLakeServiceClient
   datalake_service_client = DataLakeServiceClient.from_connection_string(self.connection_string)

   # Instantiate a FileSystemClient
   file_system_client = datalake_service_client.get_file_system_client("mynewfilesystem")

get_service_properties

Gets the properties of a storage account's datalake service, including Azure Storage Analytics.

New in version 12.4.0: This operation was introduced in API version '2020-06-12'.

get_service_properties(**kwargs: Any) -> Dict[str, Any]

Keyword-Only Parameters

Name Description
timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description

An object containing datalake service properties such as analytics logging, hour/minute metrics, cors rules, etc.

get_user_delegation_key

Obtain a user delegation key for the purpose of signing SAS tokens. A token credential must be present on the service object for this request to succeed.

get_user_delegation_key(key_start_time: datetime, key_expiry_time: datetime, **kwargs: Any) -> UserDelegationKey

Parameters

Name Description
key_start_time
Required

A DateTime value. Indicates when the key becomes valid.

key_expiry_time
Required

A DateTime value. Indicates when the key stops being valid.

Keyword-Only Parameters

Name Description
timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description

The user delegation key.

Examples

Get user delegation key from datalake service client.


   from datetime import datetime, timedelta
   user_delegation_key = datalake_service_client.get_user_delegation_key(datetime.utcnow(),
                                                                         datetime.utcnow() + timedelta(hours=1))
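
datetime.utcnow() is deprecated as of Python 3.12. A possible alternative, sketched here, derives both ends of the validity window from one timezone-aware UTC timestamp. The helper names delegation_key_window and request_delegation_key are hypothetical, and it is assumed that the client accepts timezone-aware datetimes in UTC.

```python
from datetime import datetime, timedelta, timezone


def delegation_key_window(hours: int = 1):
    """Return (start, expiry) as timezone-aware UTC datetimes."""
    # Both values derive from a single timestamp, so the window length
    # is exactly `hours`.
    start = datetime.now(timezone.utc)
    return start, start + timedelta(hours=hours)


def request_delegation_key(service_client, hours: int = 1):
    """Request a user delegation key valid for the next `hours` hours."""
    start, expiry = delegation_key_window(hours)
    return service_client.get_user_delegation_key(start, expiry)
```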

list_file_systems

Returns a generator to list the file systems under the specified account.

The generator will lazily follow the continuation tokens returned by the service and stop when all file systems have been returned.

list_file_systems(name_starts_with: str | None = None, include_metadata: bool | None = None, **kwargs) -> ItemPaged[FileSystemProperties]

Parameters

Name Description
name_starts_with
str

Filters the results to return only file systems whose names begin with the specified prefix.
Default value: None

include_metadata
bool

Specifies that file system metadata be returned in the response. The default value is False.

Keyword-Only Parameters

Name Description
results_per_page
int

The maximum number of file system names to retrieve per API call. If the request does not specify a value, the server returns up to 5,000 items per page.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

include_deleted

Specifies that deleted file systems be returned in the response. This applies to accounts with file system restore enabled. The default value is False.

New in version 12.3.0.

include_system

Flag specifying that system file systems should be included.

New in version 12.6.0.

Returns

Type Description

An iterable (auto-paging) of FileSystemProperties.

Examples

Listing the file systems in the datalake service.


   file_systems = datalake_service_client.list_file_systems()
   for file_system in file_systems:
       print(file_system.name)
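
The name_starts_with parameter can narrow the listing on the server side. A minimal sketch follows; the helper name file_system_names is hypothetical.

```python
def file_system_names(service_client, prefix=None):
    """Return the names of file systems, optionally filtered by prefix."""
    # The returned iterable follows continuation tokens lazily; the list
    # comprehension drains every page.
    listing = service_client.list_file_systems(name_starts_with=prefix)
    return [file_system.name for file_system in listing]
```

Collecting the names into a list forces all pages to be fetched, so for very large accounts iterating the result directly may be preferable.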

set_service_properties

Sets the properties of a storage account's Datalake service, including Azure Storage Analytics.

New in version 12.4.0: This operation was introduced in API version '2020-06-12'.

If an element (e.g. analytics_logging) is left as None, the existing settings on the service for that functionality are preserved.

set_service_properties(**kwargs: Any) -> None

Keyword-Only Parameters

Name Description
analytics_logging

Groups the Azure Analytics Logging settings.

hour_metrics

The hour metrics settings provide a summary of request statistics grouped by API in hourly aggregates.

minute_metrics

The minute metrics settings provide request statistics for each minute.

cors

You can include up to five CorsRule elements in the list. If an empty list is specified, all CORS rules will be deleted, and CORS will be disabled for the service.

target_version
str

Indicates the default version to use for requests if an incoming request's version is not specified.

delete_retention_policy

The delete retention policy specifies whether to retain deleted files/directories. It also specifies the number of days and versions of file/directory to keep.

static_website

Specifies whether the static website feature is enabled, and if yes, indicates the index document and 404 error document to use.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description
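
Because elements left as None keep their existing service settings, a single property can be updated in isolation. The sketch below updates only the CORS rules and checks the five-rule limit described for the cors parameter before sending the request. The helper name set_cors_rules and the client-side check are illustrative assumptions, not part of the SDK.

```python
MAX_CORS_RULES = 5  # limit stated in the cors parameter description


def set_cors_rules(service_client, cors_rules) -> None:
    """Replace the CORS rules, leaving all other service settings as-is."""
    rules = list(cors_rules)
    if len(rules) > MAX_CORS_RULES:
        raise ValueError(
            "At most {} CORS rules are allowed, got {}.".format(MAX_CORS_RULES, len(rules))
        )
    # Only cors is passed; the other elements default to None, so their
    # existing settings on the service are preserved.
    service_client.set_service_properties(cors=rules)
```

Note that passing an empty list deletes all CORS rules and disables CORS for the service.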

undelete_file_system

Restores a soft-deleted file system.

The operation succeeds only if it is performed within the number of days set in the delete retention policy.

New in version 12.3.0: This operation was introduced in API version '2019-12-12'.

undelete_file_system(name: str, deleted_version: str, **kwargs: Any) -> FileSystemClient

Parameters

Name Description
name
Required
str

Specifies the name of the deleted filesystem to restore.

deleted_version
Required
str

Specifies the version of the deleted filesystem to restore.

Keyword-Only Parameters

Name Description
timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts, see here.

Returns

Type Description

A FileSystemClient for the restored soft-deleted file system.
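
Examples

The deleted_version value can be discovered by listing with include_deleted=True. The sketch below assumes that each listed FileSystemProperties exposes deleted and deleted_version attributes. The helper name restore_file_system is hypothetical.

```python
def restore_file_system(service_client, name: str):
    """Restore the soft-deleted file system `name`, if one is listed.

    Returns the client from undelete_file_system, or None when no
    soft-deleted file system with that exact name is found.
    """
    candidates = service_client.list_file_systems(
        name_starts_with=name, include_deleted=True
    )
    for file_system in candidates:
        # The prefix filter can match longer names, so compare exactly.
        if file_system.name == name and file_system.deleted:
            return service_client.undelete_file_system(
                file_system.name, file_system.deleted_version
            )
    return None
```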

Attributes

api_version

The version of the Storage API used for requests.

Returns

Type Description
str

location_mode

The location mode that the client is currently using.

By default this will be "primary". Options include "primary" and "secondary".

Returns

Type Description
str

primary_endpoint

The full primary endpoint URL.

Returns

Type Description
str

primary_hostname

The hostname of the primary endpoint.

Returns

Type Description
str

secondary_endpoint

The full secondary endpoint URL if configured.

If not available a ValueError will be raised. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.

Returns

Type Description
str

Exceptions

Type Description
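
Since reading secondary_endpoint raises ValueError when no secondary endpoint is available, callers that treat the secondary as optional can wrap the access. A minimal sketch, with the hypothetical helper name secondary_endpoint_or_none:

```python
def secondary_endpoint_or_none(client):
    """Return the client's secondary endpoint URL, or None if unavailable."""
    try:
        return client.secondary_endpoint
    except ValueError:
        # Raised by the property when no secondary endpoint is available.
        return None
```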

secondary_hostname

The hostname of the secondary endpoint.

If not available this will be None. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.

Returns

Type Description

url

The full endpoint URL to this entity, including SAS token if used.

This could be either the primary endpoint, or the secondary endpoint depending on the current location_mode.

Returns

Type Description
str