AzureDLFileSystem Class
Access Azure DataLake Store as if it were a file-system
- Inheritance
- builtins.object -> AzureDLFileSystem
Constructor
AzureDLFileSystem(token_credential=None, **kwargs)
Parameters
Name | Description |
---|---|
store_name
Required
|
str (default: "")
Store name to connect to. If not supplied, the environment variable azure_data_lake_store_name is used. |
token_credential
|
credentials object
When setting up a new connection, this contains the authorization credentials. Use Azure Identity to obtain one, or define an implementation of azure.core.credentials.TokenCredential. Default value: None
|
scopes
Required
|
A list of scopes to use for the token. |
url_suffix
Required
|
Domain to send REST requests to. The end-point URL is constructed using this and the store_name. If None, use default. |
api_version
Required
|
str (default: 2018-09-01)
The API version to target with requests. Changing this value changes the behavior of the requests and can cause unexpected behavior or breaking changes, so change it with caution. |
per_call_timeout_seconds
Required
|
float (default: 60)
This is the timeout for each requests library call. |
kwargs
Required
|
optional key/values
Other arguments forwarded to the DatalakeRESTInterface constructor. |
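The store_name fallback and the endpoint construction described in the parameter table can be sketched in plain Python. This is an illustration only, not the library's code: the helper names resolve_store_name and endpoint_url, the https://<store_name>.<url_suffix>/ layout, and the default suffix azuredatalakestore.net are assumptions that this reference does not state.

```python
import os

# Assumption: this default suffix is not given in the reference above.
DEFAULT_URL_SUFFIX = "azuredatalakestore.net"


def resolve_store_name(store_name=None, environ=None):
    """Return store_name, falling back to the documented environment variable."""
    environ = os.environ if environ is None else environ
    name = store_name or environ.get("azure_data_lake_store_name")
    if not name:
        raise ValueError(
            "store_name was not supplied and azure_data_lake_store_name is unset"
        )
    return name


def endpoint_url(store_name, url_suffix=None):
    """Combine the store name and domain suffix into a base URL (assumed layout)."""
    suffix = url_suffix or DEFAULT_URL_SUFFIX
    return "https://{}.{}/".format(store_name, suffix)


# Hypothetical store name, read from a stand-in environment mapping.
example_url = endpoint_url(
    resolve_store_name(environ={"azure_data_lake_store_name": "mystore"})
)
```

Passing the environment as a mapping keeps the sketch testable without touching the real process environment.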
Methods
access |
Does such a file/directory exist? |
cat |
Return contents of file |
chmod |
Change access mode of path. Note: this is not recursive. |
chown |
Change owner and/or owning group. Note: this is not recursive. |
concat |
Concatenate a list of files into one new file |
connect |
Establish connection object. |
cp |
Not implemented. Copy file between locations on ADL |
current |
Return the most recently created AzureDLFileSystem |
df |
Resource summary of path |
du |
Bytes in keys at path |
exists |
Does such a file/directory exist? |
get |
Stream data from file at path to local filename |
get_acl_status |
Gets Access Control List (ACL) entries for the specified file or directory. |
glob |
Find files (not directories) by glob-matching. |
head |
Return first bytes of file |
info |
File information for path |
invalidate_cache |
Remove entry from object file-cache |
listdir |
List all elements under directory specified with path |
ls |
List all elements under directory specified with path |
merge |
Concatenate a list of files into one new file |
mkdir |
Make new directory |
modify_acl_entries |
Modify existing Access Control List (ACL) entries on a file or folder. If the entry does not exist it is added, otherwise it is updated based on the spec passed in. No entries are removed by this process (unlike set_acl). Note: this is by default not recursive, and applies only to the file or folder specified. |
mv |
Move file between locations on ADL |
open |
Open a file for reading or writing |
put |
Stream data from local filename to file at path |
read_block |
Read a block of bytes from an ADL file. If offset+length is beyond the EOF, reads to EOF. |
remove |
Remove a file or directory |
remove_acl |
Remove the entire, non default, ACL from the file or folder, including unnamed entries. Default entries cannot be removed this way, please use remove_default_acl for that. Note: this is not recursive, and applies only to the file or folder specified. |
remove_acl_entries |
Remove existing, named, Access Control List (ACL) entries on a file or folder. If the entry does not exist already it is ignored. Default entries cannot be removed this way, please use remove_default_acl for that. Unnamed entries cannot be removed in this way, please use remove_acl for that. Note: this is by default not recursive, and applies only to the file or folder specified. |
remove_default_acl |
Remove the entire default ACL from the folder. Default entries do not exist on files; if a file is specified, this operation does nothing. Note: this is not recursive, and applies only to the folder specified. |
rename |
Move file between locations on ADL |
rm |
Remove a file or directory |
rmdir |
Remove empty directory |
set_acl |
Set the Access Control List (ACL) for a file or folder. Note: this is by default not recursive, and applies only to the file or folder specified. |
set_expiry |
Set or remove the expiration time on the specified file. This operation can only be executed against files. Note: Folders are not supported. |
stat |
File information for path |
tail |
Return last bytes of file |
touch |
Create empty file |
unlink |
Remove a file or directory |
walk |
Get all files below given path |
access
Does such a file/directory exist?
access(path, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
invalidate_cache
|
Whether to invalidate cache Default value: True
|
Returns
Type | Description |
---|---|
True or False, depending on whether the path exists.
|
cat
Return contents of file
cat(path)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
Returns
Type | Description |
---|---|
Contents of file
|
chmod
Change access mode of path
Note this is not recursive.
chmod(path, mod)
Parameters
Name | Description |
---|---|
path
Required
|
Location to change |
mod
Required
|
Octal representation of access, e.g., "0777" for public read/write. See docs |
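To make the octal mod string concrete, the following sketch decodes it into the familiar rwx notation using only the Python standard library. The helper name describe_mode is hypothetical and is not part of AzureDLFileSystem; it only shows how a value such as "0777" maps to owner, group, and other permissions.

```python
import stat


def describe_mode(mod):
    """Convert an octal string such as "0777" into a 9-character rwx string."""
    bits = int(mod, 8)  # parse the string as base-8
    # stat.filemode() prefixes a file-type character; drop it.
    return stat.filemode(bits)[1:]


public_mode = describe_mode("0777")
restricted_mode = describe_mode("750")
```

Here public_mode is "rwxrwxrwx" and restricted_mode is "rwxr-x---".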
chown
Change owner and/or owning group
Note this is not recursive.
chown(path, owner=None, group=None)
Parameters
Name | Description |
---|---|
path
Required
|
Location to change |
owner
|
UUID of owning entity Default value: None
|
group
|
UUID of group Default value: None
|
concat
Concatenate a list of files into one new file
concat(outfile, filelist, delete_source=False)
Parameters
Name | Description |
---|---|
outfile
Required
|
path
The file which will be concatenated to. If it already exists, the extra pieces will be appended. |
filelist
Required
|
list of paths
Existing adl files to concatenate, in order |
delete_source
|
If True, assume that the paths to concatenate exist alone in a directory, and delete that whole directory when done. Default value: False
|
connect
Establish connection object.
connect()
cp
Not implemented. Copy file between locations on ADL
cp(path1, path2)
Parameters
Name | Description |
---|---|
path1
Required
|
|
path2
Required
|
|
current
Return the most recently created AzureDLFileSystem
current()
df
Resource summary of path
du
Bytes in keys at path
du(path, total=False, deep=False, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
total
|
Return the summed size instead of the per-entry list. Default value: False
|
deep
|
Recursively enumerate or just use files under current dir Default value: False
|
invalidate_cache
|
Whether to invalidate cache Default value: True
|
Returns
Type | Description |
---|---|
List of dict of name/size pairs, or total size. |
exists
Does such a file/directory exist?
exists(path, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
invalidate_cache
|
Whether to invalidate cache Default value: True
|
Returns
Type | Description |
---|---|
True or False, depending on whether the path exists.
|
get
Stream data from file at path to local filename
get(path, filename)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
ADL Path to read |
filename
Required
|
str or
Path
Local file path to write to |
get_acl_status
Gets Access Control List (ACL) entries for the specified file or directory.
get_acl_status(path)
Parameters
Name | Description |
---|---|
path
Required
|
Location to get the ACL. |
glob
Find files (not directories) by glob-matching.
glob(path, details=False, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
details
|
Whether to include file details Default value: False
|
invalidate_cache
|
Whether to invalidate cache Default value: True
|
head
Return first bytes of file
head(path, size=1024)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
size
|
How many bytes to return Default value: 1024
|
info
File information for path
info(path, invalidate_cache=True, expected_error_code=None)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
invalidate_cache
|
Whether to invalidate cache or not Default value: True
|
expected_error_code
|
Optionally indicates a specific, expected error code, if any. Default value: None
|
Returns
Type | Description |
---|---|
File information
|
invalidate_cache
Remove entry from object file-cache
invalidate_cache(path=None)
Parameters
Name | Description |
---|---|
path
|
str or
AzureDLPath
Remove the path from object file-cache Default value: None
|
listdir
List all elements under directory specified with path
listdir(path='', detail=False, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
detail
|
Detailed info or not. Default value: False
|
invalidate_cache
|
Whether to invalidate cache or not Default value: True
|
Returns
Type | Description |
---|---|
List of elements under directory specified with path
|
ls
List all elements under directory specified with path
ls(path='', detail=False, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
detail
|
Detailed info or not. Default value: False
|
invalidate_cache
|
Whether to invalidate cache or not Default value: True
|
Returns
Type | Description |
---|---|
List of elements under directory specified with path
|
merge
Concatenate a list of files into one new file
merge(outfile, filelist, delete_source=False)
Parameters
Name | Description |
---|---|
outfile
Required
|
path
The file which will be concatenated to. If it already exists, the extra pieces will be appended. |
filelist
Required
|
list of paths
Existing adl files to concatenate, in order |
delete_source
|
If True, assume that the paths to concatenate exist alone in a directory, and delete that whole directory when done. Default value: False
|
mkdir
Make new directory
mkdir(path)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to create directory |
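A common pattern is to combine exists and mkdir so that a directory is created only when it is missing. The sketch below is written against any object that exposes those two methods with the signatures documented here. FakeFS is a hypothetical in-memory stand-in used only so the example runs without a real store, and ensure_dir is an illustrative helper, not a library function.

```python
class FakeFS:
    """Hypothetical in-memory stand-in with the documented exists/mkdir signatures."""

    def __init__(self):
        self.dirs = set()

    def exists(self, path, invalidate_cache=True):
        return path in self.dirs

    def mkdir(self, path):
        self.dirs.add(path)


def ensure_dir(fs, path):
    """Create path if it does not exist; return True when a directory was created."""
    if fs.exists(path):
        return False
    fs.mkdir(path)
    return True


fs = FakeFS()
created_first = ensure_dir(fs, "data/raw")   # directory is missing, so it is created
created_second = ensure_dir(fs, "data/raw")  # already present, nothing to do
```

With a real AzureDLFileSystem instance, the same helper would be called as ensure_dir(adl, "data/raw"), assuming adl is an already connected file system object.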
modify_acl_entries
Modify existing Access Control List (ACL) entries on a file or folder. If the entry does not exist it is added, otherwise it is updated based on the spec passed in. No entries are removed by this process (unlike set_acl).
Note: this is by default not recursive, and applies only to the file or folder specified.
modify_acl_entries(path, acl_spec, recursive=False, number_of_sub_process=None)
Parameters
Name | Description |
---|---|
path
Required
|
Location to set the ACL entries on. |
acl_spec
Required
|
The ACL specification to use in modifying the ACL at the path in the format '[default:]user|group|other:[entity id or UPN]:r|-w|-x|-,[default:]user|group|other:[entity id or UPN]:r|-w|-x|-,...' |
recursive
|
Specifies whether to modify ACLs recursively or not Default value: False
|
number_of_sub_process
|
Default value: None
|
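The acl_spec format shown for this method is easy to get wrong by hand. As a minimal sketch, the helpers below assemble a spec string from individual entries and validate the permission triplet. The names acl_entry and acl_spec are hypothetical and not part of the library, and the entity value is a made-up UPN.

```python
def acl_entry(scope, entity="", perms="r-x", default=False):
    """Build one '[default:]scope:entity:perms' entry in the documented format."""
    if scope not in ("user", "group", "other"):
        raise ValueError("scope must be 'user', 'group' or 'other'")
    # Each position must be its letter (r, w, x) or a dash.
    if len(perms) != 3 or any(p not in (c, "-") for p, c in zip(perms, "rwx")):
        raise ValueError("perms must look like 'rwx', 'r-x', '---', ...")
    prefix = "default:" if default else ""
    return "{}{}:{}:{}".format(prefix, scope, entity, perms)


def acl_spec(*entries):
    """Join entries with commas, as the acl_spec parameter expects."""
    return ",".join(entries)


spec = acl_spec(
    acl_entry("user", "alice@example.com", "rwx"),
    acl_entry("group", "", "r-x", default=True),
)
```

The resulting string could then be passed as the acl_spec argument, for example modify_acl_entries(path, spec).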
mv
Move file between locations on ADL
mv(path1, path2)
Parameters
Name | Description |
---|---|
path1
Required
|
Source Path |
path2
Required
|
Destination path |
open
Open a file for reading or writing
open(path, mode='rb', blocksize=33554432, delimiter=None)
Parameters
Name | Description |
---|---|
path
Required
|
string
Path of file on ADL |
mode
|
string
One of 'rb', 'ab' or 'wb' Default value: rb
|
blocksize
|
Size of data-node blocks if reading Default value: 33554432
|
delimiter
|
For writing delimiter-ended blocks Default value: None
|
put
Stream data from local filename to file at path
put(filename, path, delimiter=None)
Parameters
Name | Description |
---|---|
filename
Required
|
str or
Path
Local file path to read from |
path
Required
|
str or
AzureDLPath
ADL Path to write to |
delimiter
|
Optional delimiter for delimiter-ended blocks Default value: None
|
read_block
Read a block of bytes from an ADL file
Starting at offset of the file, read length bytes. If delimiter is set then we ensure that the read starts and stops at delimiter boundaries that follow the locations offset and offset + length. If offset is zero then we start at zero. The bytestring returned WILL include the end delimiter string.
If offset+length is beyond the eof, reads to eof.
read_block(fn, offset, length, delimiter=None)
Parameters
Name | Description |
---|---|
fn
Required
|
string
Path to filename on ADL |
offset
Required
|
Byte offset to start read |
length
Required
|
Number of bytes to read |
delimiter
|
bytes (optional)
Ensure reading starts and stops at delimiter bytestring Default value: None
|
Examples
>>> adl.read_block('data/file.csv', 0, 13)
b'Alice, 100\nBo'
>>> adl.read_block('data/file.csv', 0, 13, delimiter=b'\n')
b'Alice, 100\nBob, 200\n'
Use length=None to read to the end of the file.
>>> adl.read_block('data/file.csv', 0, None, delimiter=b'\n')
b'Alice, 100\nBob, 200\nCharlie, 300'
See also
distributed.utils.read_block
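The delimiter rules described for read_block can be reproduced on an in-memory bytestring. The sketch below is not the library's implementation: it operates on a bytes object instead of an ADL path, and the helper names read_block_bytes and _after_next_delimiter are hypothetical. It follows the documented behavior: with a delimiter, the start and end positions each move forward to just past the next delimiter, except that an offset of zero stays at zero.

```python
def _after_next_delimiter(data, pos, delimiter):
    """Return the position just past the next delimiter at or after pos."""
    if pos == 0:
        return 0  # documented rule: an offset of zero starts at zero
    idx = data.find(delimiter, pos)
    return len(data) if idx == -1 else idx + len(delimiter)


def read_block_bytes(data, offset, length, delimiter=None):
    """In-memory analogue of read_block for a bytes object."""
    # length=None means "read to the end"; reads past EOF are clamped.
    end = len(data) if length is None else min(offset + length, len(data))
    start = offset
    if delimiter:
        start = _after_next_delimiter(data, offset, delimiter)
        end = _after_next_delimiter(data, end, delimiter)
    return data[start:end]


content = b"Alice, 100\nBob, 200\nCharlie, 300"
plain = read_block_bytes(content, 0, 13)
aligned = read_block_bytes(content, 0, 13, delimiter=b"\n")
whole = read_block_bytes(content, 0, None, delimiter=b"\n")
```

For the sample content, plain, aligned, and whole match the three outputs shown in the examples for this method.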
remove
Remove a file or directory
remove(path, recursive=False)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
The location to remove. |
recursive
|
Whether to also remove all entries below, i.e., those returned by walk(). Default value: False
|
remove_acl
Remove the entire, non default, ACL from the file or folder, including unnamed entries. Default entries cannot be removed this way, please use remove_default_acl for that.
Note: this is not recursive, and applies only to the file or folder specified.
remove_acl(path)
Parameters
Name | Description |
---|---|
path
Required
|
Location to remove the ACL. |
remove_acl_entries
Remove existing, named, Access Control List (ACL) entries on a file or folder. If the entry does not exist already it is ignored. Default entries cannot be removed this way, please use remove_default_acl for that. Unnamed entries cannot be removed in this way, please use remove_acl for that.
Note: this is by default not recursive, and applies only to the file or folder specified.
remove_acl_entries(path, acl_spec, recursive=False, number_of_sub_process=None)
Parameters
Name | Description |
---|---|
path
Required
|
Location to remove the ACL entries. |
acl_spec
Required
|
The ACL specification to remove from the ACL at the path in the format (note that the permission portion is missing) '[default:]user|group|other:[entity id or UPN],[default:]user|group|other:[entity id or UPN],...' |
recursive
|
Specifies whether to remove ACLs recursively or not Default value: False
|
number_of_sub_process
|
Default value: None
|
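Because the removal format is the modification format without the permission portion, a spec built for modify_acl_entries or set_acl can be converted mechanically. The helper below is a hypothetical sketch, not a library function, and it assumes entity identifiers contain no colons or commas.

```python
def strip_permissions(acl_spec):
    """Drop the trailing permission field from each entry of an ACL spec."""
    stripped = []
    for entry in acl_spec.split(","):
        parts = entry.split(":")
        # Entries with a 'default:' prefix have four fields, others have three.
        full_length = 4 if parts[0] == "default" else 3
        if len(parts) == full_length:
            parts = parts[:-1]  # remove the 'rwx'-style field
        stripped.append(":".join(parts))
    return ",".join(stripped)


removal_spec = strip_permissions("user:alice:rwx,default:group:bob:r-x")
```

Entries that already lack the permission field are passed through unchanged, so the helper is safe to apply twice.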
remove_default_acl
Remove the entire default ACL from the folder. Default entries do not exist on files; if a file is specified, this operation does nothing.
Note: this is not recursive, and applies only to the folder specified.
remove_default_acl(path)
Parameters
Name | Description |
---|---|
path
Required
|
Location to remove the default ACL from. |
rename
Move file between locations on ADL
rename(path1, path2)
Parameters
Name | Description |
---|---|
path1
Required
|
Source Path |
path2
Required
|
Destination path |
rm
Remove a file or directory
rm(path, recursive=False)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
The location to remove. |
recursive
|
Whether to also remove all entries below, i.e., those returned by walk(). Default value: False
|
rmdir
Remove empty directory
rmdir(path)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Directory path to remove |
set_acl
Set the Access Control List (ACL) for a file or folder.
Note: this is by default not recursive, and applies only to the file or folder specified.
set_acl(path, acl_spec, recursive=False, number_of_sub_process=None)
Parameters
Name | Description |
---|---|
path
Required
|
Location to set the ACL on. |
acl_spec
Required
|
The ACL specification to set on the path in the format '[default:]user|group|other:[entity id or UPN]:r|-w|-x|-,[default:]user|group|other:[entity id or UPN]:r|-w|-x|-,...' |
recursive
|
Specifies whether to set ACLs recursively or not Default value: False
|
number_of_sub_process
|
Default value: None
|
set_expiry
Set or remove the expiration time on the specified file. This operation can only be executed against files.
Note: Folders are not supported.
set_expiry(path, expiry_option, expire_time=None)
Parameters
Name | Description |
---|---|
path
Required
|
File path to set or remove expiration time |
expire_time
|
The time that the file will expire, corresponding to the expiry_option that was set Default value: None
|
expiry_option
Required
|
Indicates the type of expiration to use for the file:
|
stat
File information for path
stat(path, invalidate_cache=True, expected_error_code=None)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
invalidate_cache
|
Whether to invalidate cache or not Default value: True
|
expected_error_code
|
Optionally indicates a specific, expected error code, if any. Default value: None
|
Returns
Type | Description |
---|---|
File information
|
tail
Return last bytes of file
tail(path, size=1024)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
size
|
How many bytes to return Default value: 1024
|
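The size parameter of head and tail counts bytes from the start or the end of the file. A minimal in-memory analogue, assuming the file contents are already available as a bytes object, looks like this. The functions below are illustrative stand-ins that take bytes rather than a path; they are not the library methods.

```python
def head(data, size=1024):
    """Return the first `size` bytes of data."""
    return data[:size]


def tail(data, size=1024):
    """Return the last `size` bytes of data, or all of it if it is shorter."""
    # data[-0:] would return everything, so handle size == 0 explicitly.
    return data[-size:] if size > 0 else b""


sample = b"Alice, 100\nBob, 200\nCharlie, 300"
first_five = head(sample, 5)
last_three = tail(sample, 3)
```

When size exceeds the length of the data, both helpers simply return the whole bytestring.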
touch
Create empty file
touch(path)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path of file to create |
unlink
Remove a file or directory
unlink(path, recursive=False)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
The location to remove. |
recursive
|
Whether to also remove all entries below, i.e., those returned by walk(). Default value: False
|
walk
Get all files below given path
walk(path='', details=False, invalidate_cache=True)
Parameters
Name | Description |
---|---|
path
Required
|
str or
AzureDLPath
Path to query |
details
|
Whether to include file details Default value: False
|
invalidate_cache
|
Whether to invalidate cache Default value: True
|
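Since walk returns every file below a path, filtering its result is one way to select files by name. The sketch below assumes that walk with details=False returns a list of path strings, which this reference does not spell out. FakeFS is a hypothetical in-memory stand-in so the example runs without a store, and find_files is an illustrative helper.

```python
import fnmatch
import posixpath


class FakeFS:
    """Hypothetical stand-in exposing walk() with the documented signature."""

    def __init__(self, files):
        self._files = sorted(files)

    def walk(self, path="", details=False, invalidate_cache=True):
        prefix = path.rstrip("/") + "/" if path else ""
        return [f for f in self._files if f.startswith(prefix)]


def find_files(fs, root, pattern):
    """Return files below root whose base name matches a shell-style pattern."""
    return [
        p for p in fs.walk(root)
        if fnmatch.fnmatchcase(posixpath.basename(p), pattern)
    ]


fs = FakeFS(["data/a.csv", "data/sub/b.csv", "data/sub/c.txt", "other/d.csv"])
csv_files = find_files(fs, "data", "*.csv")
```

fnmatchcase is used so that matching is case-sensitive on every platform.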