Azure Data Lake Analytics and U-SQL Summer 2017 Updates: Introducing GZip on OUTPUT, Catalog Views, Major updates to Cognitive Libraries, Tool support to create your EXTRACT statement and much more!
Hello Azure Data Lake and U-SQL fans and followers.
It has been a while since we last published release notes, and we have shipped a lot of cool features over the summer, along with a list of pending deprecations and breaking changes. The summer break is finally over, so without further ado, here are the Summer 2017 Updates for Azure Data Lake U-SQL and Developer Tooling!
The main items are the announcement of some pending deprecations and breaking changes, and the release of catalog views, which let you use information about metadata in your U-SQL scripts (e.g., to generate maintenance scripts). We also announce the preview of GZip on OUTPUT support and a preview of the DiagnosticStream support, which gives you the ability to write diagnostic information from within your .NET user code. In addition, we add more file set performance improvements and document the major updates to the cognitive libraries and to the tools, including the very useful ability to create EXTRACT statements on CSV-like files from the Visual Studio tooling (on which we have previously published a blog post)!
Thanks to all of you who continue to volunteer to test the new version of the more scalable file set. Now everyone can try it.
Make sure that you update any scripts that will be affected by the upcoming deprecations and breaking changes!
Please contact us or leave a comment below if you have feedback on this or other features.
Here is the list of topics with links to the detailed release notes:
- Pending and Upcoming Deprecations and Breaking Changes
  - U-SQL jobs will introduce an upper limit for the number of table-backing files being read
  - Table-valued functions will disallow result variable names that conflict with parameter names
  - Built-in extractors will change mapping of empty fields from zero-length string to null with quoting enabled
  - Disallowing user variables that start with @@
  - Disallow U-SQL identifiers in C# delegate bodies in scripts
- Breaking Changes
- Major U-SQL Bug Fixes, Performance and Scale Improvements
- U-SQL Preview Features
  - Input File Set scales orders of magnitude better (now with additional improvements!)
  - Automatic GZip compression on OUTPUT statement is now in preview
  - DiagnosticStream support in .NET user code
  - A limited flexible-schema feature for U-SQL table-valued function parameters is now available for preview (requires opt-in)
- New U-SQL capabilities
- Azure Data Lake Tools for Visual Studio New Capabilities
  - ADL Tools for Visual Studio now helps you generate the U-SQL EXTRACT statement
  - Python and R code-behind are supported for U-SQL projects
  - Simplifying debugging shared user code in ADL Tools for Visual Studio
  - ADL Tools for Visual Studio supports F1 help on U-SQL keywords
  - ADL Tools for Visual Studio allows temporary U-SQL scripts outside of a project
  - ADL Tools for Visual Studio highlights all uses of a highlighted U-SQL variable
- Azure Portal Updates
To get access to the new syntactic features and new tool capabilities in your local environment, you will need to refresh your ADL Tools. If you use Visual Studio 2013 or 2015, you can download and install them directly from MSDN or use the new Check for Updates menu item. If you are using Visual Studio 2017, you currently have to wait for the next Visual Studio 2017 refresh; these should occur about every 6 to 8 weeks. Until you have updated, you will not be able to use the new features in local runs, and submitting to the cluster will report syntax errors for the new language features (although you can still submit the script anyway).
You can find more details with examples in the Summer 2017 release notes (or by clicking on the items above) on our GitHub site, where you can also find our previous release notes.