Experience Updates to the Azure Data Lake Store and Analytics Portal

In this month's refresh of the Azure Data Lake Store and Azure Data Lake Analytics portal, we've added a set of features focused on giving our users more control over their accounts.

Simple data cleanup using File Expiry

Data often becomes stale or redundant after a certain time; for example, you may generate a lot of temporary data that is only useful for testing. In a big data system holding large volumes of data, it can be hard to track when that data should be cleaned up. The new File Expiry feature helps you clean up stale data you no longer want to keep. When you set a File Expiration Time on a file, the Data Lake Store service automatically deletes the file once that time is reached.

To do this from the Azure Portal:

  1. Navigate to the file using Data Explorer
  2. Right click on the file and choose Set Expiry
  3. Set File Expiry to ON
  4. Under Expires On, pick a date
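
The deletion itself is done by the Data Lake Store service, so there is nothing to run on your side. Purely to make the semantics concrete, here is a small local model in Python of which files would be removed at a given moment. The function name, the sample paths, and the convention that None means File Expiry is OFF are all hypothetical and are not part of any Azure SDK.

```python
from datetime import datetime, timezone


def expired_paths(expiry_by_path, now):
    """Return, sorted, the paths whose expiry time is at or before `now`.

    `expiry_by_path` maps a file path to a timezone-aware datetime,
    or to None when File Expiry is OFF for that file (assumed convention).
    """
    return sorted(
        path
        for path, expires_on in expiry_by_path.items()
        if expires_on is not None and expires_on <= now
    )


# Hypothetical files, for illustration only.
files = {
    "/tmp/test-run-1.csv": datetime(2016, 10, 1, tzinfo=timezone.utc),
    "/tmp/test-run-2.csv": datetime(2016, 12, 1, tzinfo=timezone.utc),
    "/data/sales.csv": None,  # File Expiry OFF: never deleted automatically
}
now = datetime(2016, 11, 1, tzinfo=timezone.utc)

print(expired_paths(files, now))  # ['/tmp/test-run-1.csv']
```

Only the first test file has passed its expiry time, so it is the only one this model reports; the file with File Expiry OFF is never reported.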

Control how much compute capacity your Data Lake Analytics accounts have access to

When a new Data Lake Analytics account is created, it has preset values for the maximum degree of parallelism that can be assigned to a job and for the number of jobs that can run concurrently. You can now freely update these values yourself. We will publish a follow-up post with more details about this update. Keep an eye out for it!

To update the Parallelism and Concurrency:

  1. Open your Data Lake Analytics account
  2. Click on Properties in the Table of Contents
  3. Slide the Parallelism and Concurrency sliders to the values you want
  4. Click Save

Use Custom Delimiters when Previewing Files

Previously, we supported comma, colon, space, tab, ampersand, and bar delimiters. Given the many different kinds of files stored in Azure Data Lake Store and Azure Storage, we've added a "Custom" delimiter option that lets you define your own delimiter.

To change the delimiter on the Azure Portal:

  1. Open the file you want to preview using Data Explorer
  2. Click on Format
  3. Under Delimiter, click the dropdown and change it to Custom
  4. A new Custom Delimiter field will appear; type your delimiter there
  5. Click OK
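
To see what a custom delimiter does to a file's rows before you open the preview, you can try it locally. The following is a minimal Python sketch, not the portal's implementation: the function name and sample data are hypothetical, and it splits on the literal delimiter without handling quoted fields. Whether the portal accepts multi-character delimiters is not stated here; this sketch simply allows them.

```python
def preview_rows(text, delimiter, max_rows=10):
    """Split the first `max_rows` lines of `text` into columns on `delimiter`.

    Plain string splitting: quoted fields are NOT treated specially.
    """
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    return [line.split(delimiter) for line in text.splitlines()[:max_rows]]


# Hypothetical sample using a semicolon, which is not one of the built-in options.
sample = "id;name;city\n1;Ada;London\n2;Linus;Helsinki"

for row in preview_rows(sample, ";"):
    print(row)
```

Each printed row is a list of column values, which is roughly what the preview grid shows once the right delimiter is selected.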

Extending Data Lake Store super-users to manage the U-SQL Catalog ACLs

The U-SQL Catalog stores its data in Azure Data Lake Store. Previously, super-users of Azure Data Lake Store could not update the U-SQL Catalog's ACLs (the button was disabled), even though they had full access to the location where the data is stored. With this refresh, we've extended the super-user relationship to the U-SQL Catalog as well. Now, when an Azure Data Lake Store super-user modifies permissions on the U-SQL Catalog, the operation succeeds.

Comments

  • Anonymous
    October 11, 2016
    Is setting the File Expiry available via API? If so, is there information available that you can share?
    • Anonymous
      October 11, 2016
      Yes. The API is called SetFileExpiry and is available via the .NET SDK for Azure Data Lake Store. The team will release a sample project in a week or two that illustrates how to use this and other APIs without having to be very familiar with how REST APIs work or with the details of WebHDFS.
      • Anonymous
        October 19, 2016
        When will this be available for the .NET SDK and the Java SDK?
        • Anonymous
          October 19, 2016
          It's in the .NET SDK now. The Java SDK is pending till next month.
          • Anonymous
            October 19, 2016
            Thank you for the updates. Where will the sample projects be posted once they are available?
          • Anonymous
            November 17, 2016
            Is this available for the Java SDK yet?