DataLakeDirectoryClient Class

A client to interact with the DataLake directory, even if the directory may not yet exist.

For operations relating to a specific subdirectory or file under the directory, a directory client or file client can be retrieved using the get_sub_directory_client or get_file_client functions.

Inheritance
azure.storage.filedatalake._path_client.PathClient
DataLakeDirectoryClient

Constructor

DataLakeDirectoryClient(account_url: str, file_system_name: str, directory_name: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any)

Parameters

Name Description
account_url
Required
str

The URI to the storage account.

file_system_name
Required
str

The file system for the directory or files.

directory_name
Required
str

The whole path of the directory, e.g. {directory under file system}/{directory to interact with}

credential

The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredential class from azure.identity. If the resource URI already contains a SAS token, it is ignored in favor of an explicit credential, except in the case of AzureSasCredential, where the conflicting SAS tokens raise a ValueError. If using an instance of AzureNamedKeyCredential, "name" should be the storage account name, and "key" should be the storage account key.
Default value: None

Keyword-Only Parameters

Name Description
api_version
str

The Storage API version to use for requests. Default value is the most recent service version that is compatible with the current SDK. Setting to an older version may result in reduced feature compatibility.

audience
str

The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net.

Examples

Creating the DataLakeDirectoryClient from a connection string.


   from azure.storage.filedatalake import DataLakeDirectoryClient
   DataLakeDirectoryClient.from_connection_string(connection_string, "myfilesystem", "mydirectory")
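The client can also be built directly with the constructor documented above. The following is a minimal sketch, assuming a public-cloud account reachable at the `dfs.core.windows.net` endpoint. `build_account_url` and `make_directory_client` are hypothetical helper names, and the account, file system, and directory values are placeholders.

```python
def build_account_url(account_name: str) -> str:
    """Return the Data Lake (dfs) endpoint of a public-cloud storage account."""
    return f"https://{account_name}.dfs.core.windows.net"


def make_directory_client(account_name, file_system_name, directory_name, credential=None):
    """Construct a DataLakeDirectoryClient from its three required parameters."""
    # Imported lazily so that build_account_url stays usable without the SDK installed.
    from azure.storage.filedatalake import DataLakeDirectoryClient

    return DataLakeDirectoryClient(
        account_url=build_account_url(account_name),
        file_system_name=file_system_name,
        directory_name=directory_name,
        credential=credential,  # e.g. a SAS token string or a TokenCredential
    )
```

Because the directory may not yet exist, constructing the client says nothing about whether the path is present; exists() or create_directory() answers that.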

Variables

Name Description
url
str

The full endpoint URL to the file system, including SAS token if used.

primary_endpoint
str

The full primary endpoint URL.

primary_hostname
str

The hostname of the primary endpoint.

Methods

acquire_lease

Requests a new lease. If the file or directory does not have an active lease, the DataLake service creates a lease on the file/directory and returns a new lease ID.

close

Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager.

create_directory

Create a new directory.

create_file

Create a new file and return the file client to be interacted with.

create_sub_directory

Create a subdirectory and return the subdirectory client to be interacted with.

delete_directory

Marks the specified directory for deletion.

delete_sub_directory

Marks the specified subdirectory for deletion.

exists

Returns True if a directory exists and returns False otherwise.

from_connection_string

Create DataLakeDirectoryClient from a Connection String.

get_access_control
get_directory_properties

Returns all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory.

get_file_client

Get a client to interact with the specified file.

The file need not already exist.

get_paths

Returns a generator to list the paths under specified file system and directory. The generator will lazily follow the continuation tokens returned by the service.

get_sub_directory_client

Get a client to interact with the specified subdirectory of the current directory.

The subdirectory need not already exist.

remove_access_control_recursive

Removes the Access Control on a path and sub-paths.

rename_directory

Rename the source directory.

set_access_control

Set the owner, group, permissions, or access control list for a path.

set_access_control_recursive

Sets the Access Control on a path and sub-paths.

set_http_headers

Sets system properties on the file or directory.

If one property is set for the content_settings, all properties will be overridden.

set_metadata

Sets one or more user-defined name-value pairs for the specified file system. Each call to this operation replaces all existing metadata attached to the file system. To remove all metadata from the file system, call this operation with no metadata dict.

update_access_control_recursive

Modifies the Access Control on a path and sub-paths.

acquire_lease

Requests a new lease. If the file or directory does not have an active lease, the DataLake service creates a lease on the file/directory and returns a new lease ID.

acquire_lease(lease_duration: int | None = -1, lease_id: str | None = None, **kwargs) -> DataLakeLeaseClient

Parameters

Name Description
lease_duration
Required
int

Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change. Default is -1 (infinite lease).

lease_id
Required
str

Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format.

Keyword-Only Parameters

Name Description
if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A DataLakeLeaseClient object, that can be run in a context manager.
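The lease_duration rule described above (either -1 for an infinite lease, or 15 to 60 seconds) can be checked before the request is sent. This is a sketch only: `validate_lease_duration` and `lease_directory` are hypothetical helpers, and `directory_client` is assumed to be an existing DataLakeDirectoryClient.

```python
def validate_lease_duration(lease_duration: int = -1) -> int:
    """Accept -1 (infinite lease) or a finite duration of 15 to 60 seconds."""
    if lease_duration == -1 or 15 <= lease_duration <= 60:
        return lease_duration
    raise ValueError("lease_duration must be -1 or between 15 and 60 seconds")


def lease_directory(directory_client, lease_duration: int = -1, lease_id=None):
    """Acquire a lease on the directory after validating the duration locally."""
    return directory_client.acquire_lease(
        lease_duration=validate_lease_duration(lease_duration),
        lease_id=lease_id,  # optional proposed lease ID in GUID string format
    )
```

The local check only avoids a round trip for an obviously invalid value; the service remains the authority on what it accepts.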

close

Closes the sockets opened by the client. Calling it is unnecessary when the client is used as a context manager.

close() -> None

Keyword-Only Parameters

Name Description
if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

create_directory

Create a new directory.

create_directory(metadata: Dict[str, str] | None = None, **kwargs) -> Dict[str, str | datetime]

Parameters

Name Description
metadata
Required

Name-value pairs associated with the file as metadata.

Keyword-Only Parameters

Name Description
content_settings

ContentSettings object used to set path properties.

lease

Required if the file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string.

umask
str

Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766).

owner
str

The owner of the file or directory.

group
str

The owning group of the file or directory.

acl
str

Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]".

lease_id
str

Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format.

lease_duration
int

Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change.

permissions
str

Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

cpk

Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A dictionary of response headers.

Examples

Create directory.


   directory_client.create_directory()
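The umask arithmetic described for create_directory (the resulting permission is p & ^u, where ^ denotes bitwise NOT) can be reproduced locally. A small sketch, with `apply_umask` as a hypothetical helper name; Python writes bitwise NOT as `~`.

```python
def apply_umask(permission: int, umask: int) -> str:
    """Return permission & ~umask as a 4-digit octal string."""
    # Mask to 12 bits so the result always fits 4 octal digits.
    return format(permission & ~umask & 0o7777, "04o")


# Worked example from the umask description: p = 0777, u = 0057.
print(apply_umask(0o777, 0o057))  # 0720

# Defaults: permission 0777 for a directory, 0666 for a file, umask 0027.
print(apply_umask(0o777, 0o027))  # 0750
print(apply_umask(0o666, 0o027))  # 0640
```

The returned string is already in the 4-digit octal notation that the umask and permissions parameters expect.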

create_file

Create a new file and return the file client to be interacted with.

create_file(file: FileProperties | str, **kwargs) -> DataLakeFileClient

Parameters

Name Description
file
Required

The file with which to interact. This can either be the name of the file, or an instance of FileProperties.

Keyword-Only Parameters

Name Description
content_settings

ContentSettings object used to set path properties.

metadata

Name-value pairs associated with the file as metadata.

lease

Required if the file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string.

umask
str

Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766).

owner
str

The owner of the file or directory.

group
str

The owning group of the file or directory.

acl
str

Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]".

lease_id
str

Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format.

lease_duration
int

Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change.

expires_on

The time to set the file to expiry. If the type of expires_on is an int, expiration time will be set as the number of milliseconds elapsed from creation time. If the type of expires_on is datetime, expiration time will be set absolute to the time provided. If no time zone info is provided, this will be interpreted as UTC.

permissions
str

Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

cpk

Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A DataLakeFileClient with newly created file.
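The acl parameter above takes a comma-separated list of entries in the form "[scope:][type]:[id]:[permissions]". The sketch below assembles such a string; `format_ace` and `format_acl` are hypothetical helpers, and it assumes the optional scope is the literal `default` and that an empty id refers to the owning user or group.

```python
def format_ace(ace_type: str, permissions: str, identifier: str = "", default: bool = False) -> str:
    """Format one access control entry as [scope:][type]:[id]:[permissions]."""
    scope = "default:" if default else ""
    return f"{scope}{ace_type}:{identifier}:{permissions}"


def format_acl(entries) -> str:
    """Join already formatted entries into the comma-separated acl value."""
    return ",".join(entries)


acl = format_acl([
    format_ace("user", "rwx"),
    format_ace("group", "r-x"),
    format_ace("other", "---"),
])
print(acl)  # user::rwx,group::r-x,other::---
```

The resulting string could then be passed as the acl keyword to create_file, create_directory, or create_sub_directory.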

create_sub_directory

Create a subdirectory and return the subdirectory client to be interacted with.

create_sub_directory(sub_directory: DirectoryProperties | str, metadata: Dict[str, str] | None = None, **kwargs) -> DataLakeDirectoryClient

Parameters

Name Description
sub_directory
Required

The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties.

metadata
Required

Name-value pairs associated with the file as metadata.

Keyword-Only Parameters

Name Description
content_settings

ContentSettings object used to set path properties.

lease

Required if the file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string.

umask
str

Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766).

owner
str

The owner of the file or directory.

group
str

The owning group of the file or directory.

acl
str

Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]".

lease_id
str

Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format.

lease_duration
int

Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change.

permissions
str

Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

cpk

Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

DataLakeDirectoryClient for the subdirectory.
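The permissions parameter accepts both symbolic (rwxrw-rw-) and 4-digit octal (0766) notation. The sketch below shows how the two forms correspond. `symbolic_to_octal` is a hypothetical helper, and it assumes the usual POSIX convention that a trailing `t` means sticky plus execute for others and `T` means sticky without execute.

```python
def symbolic_to_octal(symbolic: str) -> str:
    """Convert a 9-character symbolic permission string to 4-digit octal."""
    if len(symbolic) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxrw-rw-'")
    sticky = symbolic[8] in ("t", "T")
    digits = []
    for start in (0, 3, 6):
        read, write, execute = symbolic[start:start + 3]
        value = (4 if read == "r" else 0) + (2 if write == "w" else 0)
        # A lowercase 't' in the last position also implies execute for others.
        if execute == "x" or (start == 6 and execute == "t"):
            value += 1
        digits.append(str(value))
    return ("1" if sticky else "0") + "".join(digits)


print(symbolic_to_octal("rwxrw-rw-"))  # 0766
```

Either form can be passed to the service as documented, so the conversion is mainly useful for comparing or logging permissions consistently.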

delete_directory

Marks the specified directory for deletion.

delete_directory(**kwargs) -> None

Keyword-Only Parameters

Name Description
lease

Required if the file has an active lease. Value can be a LeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

None.

Examples

Delete directory.


   new_directory.delete_directory()
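The if_modified_since and if_unmodified_since descriptions state that naive datetimes are assumed to be UTC and aware ones are converted to UTC. The sketch below makes that rule explicit before a conditional delete. `to_utc` and `delete_if_unmodified_since` are hypothetical helpers, and `directory_client` is assumed to be an existing DataLakeDirectoryClient.

```python
from datetime import datetime, timezone


def to_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC and convert aware datetimes to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def delete_if_unmodified_since(directory_client, cutoff: datetime) -> None:
    """Delete the directory only if it has not been modified since cutoff."""
    directory_client.delete_directory(if_unmodified_since=to_utc(cutoff))
```

Normalizing up front is optional, but it keeps logged or compared timestamps in the same zone the service evaluates.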

delete_sub_directory

Marks the specified subdirectory for deletion.

delete_sub_directory(sub_directory: DirectoryProperties | str, **kwargs) -> DataLakeDirectoryClient

Parameters

Name Description
sub_directory
Required

The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties.

Keyword-Only Parameters

Name Description
lease

Required if the file has an active lease. Value can be a LeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

DataLakeDirectoryClient for the subdirectory.

exists

Returns True if a directory exists and returns False otherwise.

exists(**kwargs: Any) -> bool

Keyword-Only Parameters

Name Description
timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

True if a directory exists, False otherwise.
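A common use of exists() is to create the directory only when it is missing. The following is a sketch; `ensure_directory` is a hypothetical helper and `directory_client` is assumed to be an existing DataLakeDirectoryClient.

```python
def ensure_directory(directory_client) -> bool:
    """Create the directory if it is missing; return True if it was created."""
    if directory_client.exists():
        return False
    directory_client.create_directory()
    return True
```

The check and the creation are two separate requests, so another writer could create the directory in between; callers that need stronger guarantees would have to handle that case themselves.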

from_connection_string

Create DataLakeDirectoryClient from a Connection String.

from_connection_string(conn_str: str, file_system_name: str, directory_name: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any) -> Self

Parameters

Name Description
conn_str
Required
str

A connection string to an Azure Storage account.

file_system_name
Required
str

The name of the file system to interact with.

credential

The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredential class from azure.identity. If the resource URI already contains a SAS token, it is ignored in favor of an explicit credential, except in the case of AzureSasCredential, where the conflicting SAS tokens raise a ValueError. If using an instance of AzureNamedKeyCredential, "name" should be the storage account name, and "key" should be the storage account key.
Default value: None
directory_name
Required
str

The name of the directory to interact with. The directory is under the file system.

Keyword-Only Parameters

Name Description
audience
str

The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net.

Returns

Type Description

A DataLakeDirectoryClient.
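Since the client can be used as a context manager, a short-lived client built from a connection string can be closed automatically. A sketch under stated assumptions: `directory_exists` is a hypothetical helper, and its `client_factory` parameter exists only so the function can be exercised without the SDK or a real account.

```python
def directory_exists(conn_str, file_system_name, directory_name, client_factory=None) -> bool:
    """Open a client from a connection string, check existence, then close it."""
    if client_factory is None:
        # Imported lazily; by default the real SDK factory is used.
        from azure.storage.filedatalake import DataLakeDirectoryClient
        client_factory = DataLakeDirectoryClient.from_connection_string

    # Leaving the with block closes the client's sockets, so close() is not needed.
    with client_factory(conn_str, file_system_name, directory_name) as client:
        return client.exists()
```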

get_access_control

get_access_control(upn: bool | None = None, **kwargs) -> Dict[str, Any]

Parameters

Name Description
upn
Required

Optional. Valid only when Hierarchical Namespace is enabled for the account. If "true", the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names. If "false", the values will be returned as Azure Active Directory Object IDs. The default value is false. Note that group and application Object IDs are not translated because they do not have unique friendly names.

Keyword-Only Parameters

Name Description
lease

Required if the file/directory has an active lease. Value can be a LeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A response dict containing the access control options, unmodified.
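An ACL string in the "[scope:][type]:[id]:[permissions]" format can be split back into its parts. The sketch below assumes the returned dict exposes the ACL string under an `acl` key, which this page does not confirm. `parse_acl` is a hypothetical helper, and labelling entries without a scope as `access` is a naming choice made here.

```python
from typing import Dict, List


def parse_acl(acl: str) -> List[Dict[str, str]]:
    """Split a comma-separated ACL string into dicts, one per entry."""
    entries = []
    for ace in acl.split(","):
        if not ace:
            continue  # tolerate an empty ACL string
        parts = ace.split(":")
        if len(parts) == 4:
            scope, parts = parts[0], parts[1:]
        elif len(parts) == 3:
            scope = "access"  # no explicit scope prefix
        else:
            raise ValueError(f"unexpected ACL entry: {ace!r}")
        ace_type, identifier, permissions = parts
        entries.append(
            {"scope": scope, "type": ace_type, "id": identifier, "permissions": permissions}
        )
    return entries


# Hypothetical usage, assuming an 'acl' key in the response:
# entries = parse_acl(directory_client.get_access_control()["acl"])
```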

get_directory_properties

Returns all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory.

get_directory_properties(**kwargs: Any) -> DirectoryProperties

Keyword-Only Parameters

Name Description
lease

Required if the directory or file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

cpk

Decrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. Required if the directory was created with a customer-provided key.

upn

If True, the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names in the owner, group, and acl fields of DirectoryProperties. If False, the values will be returned as Azure Active Directory Object IDs. The default value is False. Note that group and application Object IDs are not translated because they do not have unique friendly names.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

DirectoryProperties with all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory.

Examples

Getting the properties for a file/directory.


   props = new_directory.get_directory_properties()

get_file_client

Get a client to interact with the specified file.

The file need not already exist.

get_file_client(file: FileProperties | str) -> DataLakeFileClient

Parameters

Name Description
file
Required

The file with which to interact. This can either be the name of the file, or an instance of FileProperties, e.g. directory/subdirectory/file

Keyword-Only Parameters

Name Description
if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A DataLakeFileClient.

get_paths

Returns a generator to list the paths under specified file system and directory. The generator will lazily follow the continuation tokens returned by the service.

get_paths(*, recursive: bool = True, max_results: int | None = None, upn: bool | None = None, timeout: int | None = None, **kwargs: Any) -> ItemPaged[PathProperties]

Keyword-Only Parameters

Name Description
recursive

Set to True to list paths recursively, or False to list only the immediate children of the directory. The default value is True.

max_results

An optional value that specifies the maximum number of items to return per page. If omitted or greater than 5,000, the response will include up to 5,000 items per page.

upn

If True, the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names in the owner, group, and acl fields of PathProperties. If False, the values will be returned as Azure Active Directory Object IDs. The default value is None. Note that group and application Object IDs are not translated because they do not have unique friendly names.

timeout

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. The default value is None.

Returns

Type Description

An iterable (auto-paging) response of PathProperties.
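
As a minimal sketch of consuming the pager, the helper below separates directories from files. It assumes `directory_client` is an already constructed DataLakeDirectoryClient; the name `split_paths` is illustrative and not part of the SDK. It relies only on the `name` and `is_directory` attributes of each PathProperties item.

```python
from typing import Any, List, Tuple


def split_paths(directory_client: Any, recursive: bool = True) -> Tuple[List[str], List[str]]:
    """Return (directory names, file names) found under the client's directory."""
    directories: List[str] = []
    files: List[str] = []
    # get_paths returns a lazy pager; iterating it follows continuation tokens.
    for path in directory_client.get_paths(recursive=recursive):
        if path.is_directory:
            directories.append(path.name)
        else:
            files.append(path.name)
    return directories, files
```

Because the pager is lazy, no request is made until the loop starts, and large listings are fetched page by page rather than all at once.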

get_sub_directory_client

Get a client to interact with the specified subdirectory of the current directory.

The subdirectory need not already exist.

get_sub_directory_client(sub_directory: DirectoryProperties | str) -> DataLakeDirectoryClient

Parameters

Name Description
sub_directory
Required

The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties.

Keyword-Only Parameters

Name Description
if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A DataLakeDirectoryClient.
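
Since each call returns another DataLakeDirectoryClient, calls can be chained to reach a nested directory. The sketch below assumes `directory_client` is an existing DataLakeDirectoryClient; the helper name `descend` is hypothetical and not part of the SDK.

```python
from typing import Any


def descend(directory_client: Any, *names: str) -> Any:
    """Walk down one subdirectory per name and return the final client."""
    client = directory_client
    for name in names:
        # No request is sent here; the subdirectory need not exist yet.
        client = client.get_sub_directory_client(name)
    return client
```

For example, `descend(directory_client, "2024", "logs")` would yield a client for the `logs` directory inside `2024`, under the stated assumptions.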

remove_access_control_recursive

Removes the Access Control on a path and sub-paths.

remove_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult

Parameters

Name Description
acl
Required
str

Removes POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, and a user or group identifier in the format "[scope:][type]:[id]".

Keyword-Only Parameters

Name Description
progress_hook
<xref:func>(AccessControlChanges)

Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control.

continuation_token
str

Optional continuation token that can be used to resume a previously stopped operation.

batch_size
int

Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000.

max_batches
int

Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the number of batches is unbounded and the operation continues until the end.

continue_on_failure

If set to False, the operation terminates quickly on encountering user errors (4XX). If True, the operation ignores user errors and proceeds with the other sub-entities of the directory. In case of user errors, a continuation token is returned only when continue_on_failure is True. The default value is False.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely.

Exceptions

Type Description

The user can restart the operation using the continuation_token field of the AzureError, if the token is available.
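
The removal format "[scope:][type]:[id]" carries no permissions part, which is easy to get wrong when reusing strings written for set_access_control_recursive. The sketch below builds such a value from (type, id) pairs. The helper name `build_removal_acl` and the identifiers in the usage note are hypothetical placeholders.

```python
from typing import Iterable, Tuple


def build_removal_acl(entries: Iterable[Tuple[str, str]], default_scope: bool = False) -> str:
    """Join (type, id) pairs into a comma-separated "[scope:][type]:[id]" list."""
    # The optional scope prefix targets default ACL entries instead of access ACLs.
    prefix = "default:" if default_scope else ""
    return ",".join(f"{prefix}{entry_type}:{identifier}" for entry_type, identifier in entries)
```

The result can then be passed as `directory_client.remove_access_control_recursive(acl=build_removal_acl([("user", "<object-id>")]))`, where `<object-id>` stands for a real identifier.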

rename_directory

Rename the source directory.

rename_directory(new_name: str, **kwargs: Any) -> DataLakeDirectoryClient

Parameters

Name Description
new_name
Required
str

The new name for the directory. The value must have the following format: "{filesystem}/{directory}/{subdirectory}".

Keyword-Only Parameters

Name Description
source_lease

A lease ID for the source path. If specified, the source path must have an active lease and the lease ID must match.

lease

Required if the file/directory has an active lease. Value can be a LeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

source_if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

source_if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

source_etag
str

The source ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

source_match_condition

The source match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A DataLakeDirectoryClient with the renamed directory.

Examples

Rename the source directory.


   new_dir_name = "testdir2"
   print("Renaming the directory named '{}' to '{}'.".format(dir_name, new_dir_name))
   new_directory = directory_client\
       .rename_directory(new_name=directory_client.file_system_name + '/' + new_dir_name)
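
When the target is nested, the new_name value has to be assembled in the "{filesystem}/{directory}/{subdirectory}" format. The following is a small sketch of one way to do that; `rename_target` is a hypothetical helper, not an SDK function.

```python
def rename_target(file_system_name: str, *segments: str) -> str:
    """Build a "{filesystem}/{directory}/{subdirectory}" style value for new_name."""
    # Drop empty segments and stray slashes so the parts join cleanly.
    cleaned = [s.strip("/") for s in segments if s and s.strip("/")]
    if not cleaned:
        raise ValueError("at least one directory segment is required")
    return "/".join([file_system_name.strip("/")] + cleaned)
```

It could be used as `directory_client.rename_directory(new_name=rename_target(directory_client.file_system_name, "parent", "testdir2"))`.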

set_access_control

Set the owner, group, permissions, or access control list for a path.

set_access_control(owner: str | None = None, group: str | None = None, permissions: str | None = None, acl: str | None = None, **kwargs) -> Dict[str, str | datetime]

Parameters

Name Description
owner
str

Optional. The owner of the file or directory.
Default value: None

group
str

Optional. The owning group of the file or directory.
Default value: None

permissions
str

Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported. permissions and acl are mutually exclusive.
Default value: None

acl
str

Optional. Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". permissions and acl are mutually exclusive.
Default value: None

Keyword-Only Parameters

Name Description
lease

Required if the file/directory has an active lease. Value can be a LeaseClient object or the lease ID as a string.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description
dict[str, str | datetime]

Response dict containing access control options (ETag and last modified).
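
Because permissions and acl are mutually exclusive, it can help to reject the combination before any request is sent. The wrapper below is a sketch under that assumption; `apply_access_control` is a hypothetical name, and `directory_client` is assumed to be an existing DataLakeDirectoryClient.

```python
from typing import Any, Dict, Optional


def apply_access_control(
    directory_client: Any,
    *,
    owner: Optional[str] = None,
    group: Optional[str] = None,
    permissions: Optional[str] = None,
    acl: Optional[str] = None,
) -> Dict[str, Any]:
    """Call set_access_control, rejecting permissions and acl together."""
    if permissions is not None and acl is not None:
        raise ValueError("permissions and acl are mutually exclusive")
    # Forward only the arguments the caller actually supplied.
    supplied = {"owner": owner, "group": group, "permissions": permissions, "acl": acl}
    kwargs = {key: value for key, value in supplied.items() if value is not None}
    return directory_client.set_access_control(**kwargs)
```

A call such as `apply_access_control(directory_client, permissions="0766")` then forwards a single keyword argument and returns the response dict unchanged.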

set_access_control_recursive

Sets the Access Control on a path and sub-paths.

set_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult

Parameters

Name Description
acl
Required
str

Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]".

Keyword-Only Parameters

Name Description
progress_hook
<xref:func>(AccessControlChanges)

Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control.

continuation_token
str

Optional continuation token that can be used to resume a previously stopped operation.

batch_size
int

Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000.

max_batches
int

Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the number of batches is unbounded and the operation continues until the end.

continue_on_failure

If set to False, the operation terminates quickly on encountering user errors (4XX). If True, the operation ignores user errors and proceeds with the other sub-entities of the directory. In case of user errors, a continuation token is returned only when continue_on_failure is True. The default value is False.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely.

Exceptions

Type Description

The user can restart the operation using the continuation_token field of the AzureError, if the token is available.
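
The progress_hook callback is the place to collect paths that failed to change. The sketch below assumes that the AccessControlChanges object passed to the hook exposes a `batch_failures` list whose items have a `name` attribute; the helper name `set_acl_collecting_failures` is hypothetical.

```python
from typing import Any, List, Tuple


def set_acl_collecting_failures(
    directory_client: Any, acl: str, batch_size: int = 2000
) -> Tuple[Any, List[str]]:
    """Apply acl recursively; return (result, names of paths that failed)."""
    failed: List[str] = []

    def on_progress(changes: Any) -> None:
        # Invoked by the SDK as batches complete; record the failures of each batch.
        failed.extend(failure.name for failure in changes.batch_failures)

    result = directory_client.set_access_control_recursive(
        acl=acl,
        progress_hook=on_progress,
        batch_size=batch_size,
        continue_on_failure=True,  # keep going past 4XX errors on individual paths
    )
    return result, failed
```

With continue_on_failure set to True, the failed names can be reviewed or retried afterwards instead of aborting the whole run on the first user error.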

set_http_headers

Sets system properties on the file or directory.

If one property is set for the content_settings, all properties will be overridden.

set_http_headers(content_settings: ContentSettings | None = None, **kwargs) -> Dict[str, Any]

Parameters

Name Description
content_settings

ContentSettings object used to set file/directory properties.
Default value: None

Keyword-Only Parameters

Name Description
lease

If specified, the operation only succeeds if the lease on the file or directory is active and matches this ID.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

file/directory-updated property dict (Etag and last modified)

set_metadata

Sets one or more user-defined name-value pairs for the specified file or directory. Each call to this operation replaces all existing metadata attached to it. To remove all metadata, call this operation with an empty metadata dict.

set_metadata(metadata: Dict[str, str], **kwargs) -> Dict[str, str | datetime]

Parameters

Name Description
metadata
Required

A dict containing name-value pairs to associate with the file or directory as metadata. Example: {'category':'test'}

Keyword-Only Parameters

Name Description
lease

If specified, the operation only succeeds if the lease on the file or directory is active and matches this ID.

if_modified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time.

if_unmodified_since

A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time.

etag
str

An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter.

match_condition

The match condition to use upon the etag.

cpk

Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description
dict[str, str | datetime]

Updated property dict for the file or directory (ETag and last modified).
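
Since each call replaces all existing metadata, adding a single key requires reading the current values first. The sketch below shows one read-merge-write approach; `merge_metadata` is a hypothetical helper, and it assumes the `metadata` attribute of the object returned by get_directory_properties is a dict or None.

```python
from typing import Any, Dict


def merge_metadata(directory_client: Any, updates: Dict[str, str]) -> Dict[str, str]:
    """Add or overwrite keys while keeping metadata that is already set."""
    current = directory_client.get_directory_properties().metadata or {}
    merged = {**current, **updates}
    # set_metadata replaces everything, so send the full merged dict.
    directory_client.set_metadata(merged)
    return merged
```

This read-then-write sequence is not atomic. If concurrent writers are possible, the etag and match_condition keyword arguments described above can be used to guard the write.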

update_access_control_recursive

Modifies the Access Control on a path and sub-paths.

update_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult

Parameters

Name Description
acl
Required
str

Modifies POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]".

Keyword-Only Parameters

Name Description
progress_hook
<xref:func>(AccessControlChanges)

Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control.

continuation_token
str

Optional continuation token that can be used to resume a previously stopped operation.

batch_size
int

Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000.

max_batches
int

Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the number of batches is unbounded and the operation continues until the end.

continue_on_failure

If set to False, the operation terminates quickly on encountering user errors (4XX). If True, the operation ignores user errors and proceeds with the other sub-entities of the directory. In case of user errors, a continuation token is returned only when continue_on_failure is True. The default value is False.

timeout
int

Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here.

Returns

Type Description

A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely.

Exceptions

Type Description

The user can restart the operation using the continuation_token field of the AzureError, if the token is available.
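
Combining max_batches with continuation_token allows a large update to run in bounded rounds. The loop below is a sketch of that pattern. It assumes the returned AccessControlChangeResult exposes `continuation` and `counters.failure_count`, and that passing None as continuation_token starts from the beginning; `update_acl_in_rounds` is a hypothetical name.

```python
from typing import Any, Optional


def update_acl_in_rounds(directory_client: Any, acl: str, max_batches: int = 10) -> int:
    """Call update_access_control_recursive until no continuation token remains.

    Returns the total number of failed paths reported across all rounds.
    """
    token: Optional[str] = None
    failures = 0
    while True:
        result = directory_client.update_access_control_recursive(
            acl=acl,
            max_batches=max_batches,   # cap the work done per call
            continuation_token=token,  # None on the first round
            continue_on_failure=True,
        )
        failures += result.counters.failure_count
        token = result.continuation
        if not token:
            return failures
```

Persisting the token between rounds would let a long-running job resume after a restart instead of beginning again from the top of the directory tree.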

Attributes

api_version

The version of the Storage API used for requests.

Returns

Type Description
str

location_mode

The location mode that the client is currently using.

By default this will be "primary". Options include "primary" and "secondary".

Returns

Type Description
str

primary_endpoint

The full primary endpoint URL.

Returns

Type Description
str

primary_hostname

The hostname of the primary endpoint.

Returns

Type Description
str

secondary_endpoint

The full secondary endpoint URL if configured.

If not available a ValueError will be raised. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.

Returns

Type Description
str

Exceptions

Type Description
ValueError

Raised if no secondary endpoint is available.

secondary_hostname

The hostname of the secondary endpoint.

If not available this will be None. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.

Returns

Type Description
str | None

url

The full endpoint URL to this entity, including SAS token if used.

This could be either the primary endpoint or the secondary endpoint, depending on the current location_mode.

Returns

Type Description
str