

Create U-SQL EXTRACT Script Automatically

In this post, you will learn how to generate a U-SQL EXTRACT script automatically using the latest version of Azure Data Lake Tools for Visual Studio.

Watch this 3-minute video to learn more.

 

One of U-SQL's core capabilities is schematizing unstructured data on the fly, without having to create a metadata object for it. This capability is provided by the EXTRACT expression, which invokes either a user-defined extractor or a built-in extractor to process the input file or set of files specified in the FROM clause, and produces a rowset whose schema is specified in the EXTRACT clause.
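
As an illustration, a hand-written EXTRACT statement might look like the sketch below. The file paths, rowset name, and column names are hypothetical and not taken from this post; `Extractors.Csv()` and `Outputters.Csv()` are built-in U-SQL extractor and outputter functions.

```usql
// Hypothetical input path and columns, for illustration only.
@searchlog =
    EXTRACT UserId   int,
            Start    DateTime,
            Region   string,
            Query    string,
            Duration int?,      // nullable: the value may be missing in some rows
            Urls     string
    FROM "/Samples/Data/SearchLog.csv"
    USING Extractors.Csv();

// Write the rowset back out so the script forms a complete job.
OUTPUT @searchlog
TO "/output/SearchLog-copy.csv"
USING Outputters.Csv();
```

Every column name and type in the EXTRACT clause has to be written out and kept in the same order as the columns in the file, which is the part that becomes tedious for wide files.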

When you use a built-in extractor to apply a schema to semi-structured data, such as the data in a .csv file, writing the schema definition by hand in U-SQL is slow and error-prone, especially when the file contains hundreds of columns.

Recently, we released a new feature in the latest version of Azure Data Lake Tools for Visual Studio to help you generate this U-SQL EXTRACT statement automatically.

How to use this feature?

Step 1:

Double-click your ADLS account in Server Explorer to open Azure Data Lake Explorer in Visual Studio.

 

Step 2:

Find the file, right-click it, and choose Create EXTRACT Script.

If you have an ADLS URI for the file you want to query, open the file preview through Tools > Data Lake > Open ADLS Path, and then click Create EXTRACT Script in the file preview window.

 

Step 3:

Adjust the automatically generated script as needed. Click a column header to change the column name, and click the icon beside the column name to change the associated data type. If the first row of your file is a header row, check File Has Header Row to use the first row as the column names.
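
For comparison, when an EXTRACT statement is written by hand, a header row is usually skipped with the `skipFirstNRows` parameter of the built-in extractor. The sketch below assumes a hypothetical file whose first line holds the column names; the paths and columns are made up, and whether the generated script uses this exact form is an assumption that this post does not confirm.

```usql
// Hypothetical file with a header line: Id,Name,Amount
@sales =
    EXTRACT Id     int,
            Name   string,
            Amount decimal?     // nullable, in case the field is empty
    FROM "/data/sales.csv"
    USING Extractors.Csv(skipFirstNRows: 1);   // skip the header row

// Emit the column names as a header row in the output file.
OUTPUT @sales
TO "/output/sales-out.csv"
USING Outputters.Csv(outputHeader: true);
```

Without `skipFirstNRows: 1`, the extractor would try to parse the header text as data, and the job would typically fail on the first non-string column.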

 

Step 4:

Click Copy Script to copy the script to the clipboard, or click New Query from Script to open a new temporary query containing the script.

 

 

Get the latest Azure Data Lake Tools for Visual Studio from https://aka.ms/adltoolsvs.

Contact us at adldevtool@microsoft.com if you have questions or feedback.

Comments

  • Anonymous
    August 16, 2017
    Looks to be a promising tool. Is it free or commercial?
    • Anonymous
      August 23, 2017
      It is free and built into Azure Data Lake Tools for Visual Studio. You can go to http://aka.ms/adltoolsvs to get the latest version.
  • Anonymous
    August 20, 2017
    This is awesome!
    • Anonymous
      October 26, 2017
      Thanks, Phil. Please feel free to let us know if you have other feedback on Azure Data Lake Tools for Visual Studio; we would be happy to hear it.
  • Anonymous
    September 07, 2017
    Can we do this using data factory? Looks like it's only for visual studio. How do we automate this?
    • Anonymous
      October 26, 2017
      Dear Maddy, this is currently not supported via ADF. May I ask about your scenario? Do you want to generate the script submitted by ADF on the fly? You can contact me at yanacai@microsoft.com if you want to share more. It would be great to learn about your scenario.
  • Anonymous
    October 19, 2017
    The comment has been removed
    • Anonymous
      October 26, 2017
      Thanks for your feedback, Bill. We will address this.
  • Anonymous
    December 27, 2017
    Is it just me, or is this functionality not working with the latest ADL tools? I've tried fresh installs of both VS2015 and VS2017, with different files. The screen just comes back blank in VS2017, and VS crashes with an unhandled XamlParseException.
  • Anonymous
    December 27, 2017
    Can you make this work from Windows Explorer, via the right-click context menu? Currently you have to upload a file to ADLS and then use it from there.