Using LucidWorks on Windows Azure (Part 3 of a multi-part MS Open Tech series)
LucidWorks Search on Windows Azure delivers a high-performance search service based on Apache Lucene/Solr open source indexing and search technology. This service enables quick and easy provisioning of Lucene/Solr search functionality on Windows Azure without any need to manage and operate Lucene/Solr servers, and it supports pre-built connectors for various types of enterprise data, structured data, unstructured data and web sites.
In June, we shared an overview of the LucidWorks Search service for Windows Azure, and in our first post in this series we provided more detail on features and benefits. In December we covered the main features of LucidWorks Search, but today Microsoft Open Technologies, Inc, is happy to share with you a few new data sources that are Available in LucidWorks Search on Windows Azure, and a new easier way to sign up for LucidWorks Search Service on Windows Azure.
A new option for signing up
LucidWorks Search is still listed under applications in the Windows Azure Marketplace, and from there you can create an account via the LucidWorks Account Signup Page. But getting started is now even easier as we’ve integrated LucidWorks’ service with the Windows Azure Store, so you can now set up an instance on Windows Azure By clicking 0n the Store option in the Windows Azure Dashboard:
Next, you’ll be prompted to choose an Add-on from a list. Select LucidWorks Search. The next screen invites you Personalize your new Add-On:
At this point, all you have to do is enter a new Name for your LucidWorks Search Add-on and the region you want your instance to be located in.
Right now the only option for signup via the Windows Azure Store is the Micro level, which is great for getting started. Should you exceed the limits of the Micro level, you can also sign up for other enterprise-level accounts from the LucidWorks Dashboard using the LucidWorks account that is automatically created when you sign up via the Window Azure Store.
LucidWorks support for Windows Azure SQL Databases, Windows Azure Tables and Windows Azure Blobs
Along with the Windows Azure Store integration, we also released LucidWorks Search support for Windows Azure SQL Databases, Windows Azure Blobs, and windows Azure Table Storage. All are available via the LucidWorks Search Dashboard under Indexing > Data Sources:
Windows Azure Blobs provide a way to store large amounts of unstructured, binary data, such as video, audio, and images, including streaming content such as video or audio. There are two types of blob storage available, block blobs and page blobs. Block blobs are optimized for streaming and referenced by a unique block ID. Page blobs are optimized for random access and composed of pages that are referenced by offsets from the beginning of the blob. More information on Windows Azure blobs can be found here.
Windows Azure Table storage is a collection of non-relational structured data. Unlike tables in a database, there is no schema that enforces a certain set of values on all the rows within a table. Windows Azure Storage tables are more like rows within a spreadsheet application such as Excel than rows within a database such as SQL Server. Each row can contain a different number of columns, and of different data types, than the other rows in the same table. You can find more information on Windows Azure Table storage here.
Windows Azure SQL Databases are similar to an on-premise instance of SQL Server, but not the same. Windows Azure SQL Databases expose a tabular data stream (TDS) interface for Transact-SQL-based database access, so they can be used the same way you use on-premise SQL Server.
However, there are some very important differences for administration. Windows Azure SQL Database abstracts the logical administration from the physical administration. That means that you continue to administer databases, logins, users, and roles, but Windows Azure manages the physical hardware and networking to ensure enterprise-class availability, scalability, security, and self-healing. More information on Windows Azure SQL Databases is available here.
To set up a new Azure SQL Database as a Data source, select Database as your data source option under Indexing > Data Sources.
There are a few tips for setting up a Windows Azure SQL Database as a data source for LucidWorks that you need to know. First of all, copy the URL for your Database from the JDBC connection strings in your Windows Azure Dashboard, using this format:
jdbc:sqlserver://<WindowsAzureSQLDBURL>:1433/<databaseName>
Next, select the SQL Server JDBC driver as the Driver for your Windows Azure SQL Database. You also have to include at least one SQL SELECT statement that includes an id column in the result. The id column is used at the Document identifier in LucidWorks search, and relates each row returned by the SELECT statement as fields in that Document. Have a look at my first post in this series for more information on how LucidWorks works with Documents, Fields, and Collections to return search results.
When done your Data Source configuration should look something like the sample here:
Next there are two additional options for setting up SELECT statements to work with your database. The Delta SQL Query uses the primary key to compare new records in the database with existing Documents in the LucidWorks Search Index, and only indexes the new or updated rows. Nested Queries allow you to set up one-to many relationships in the source Windows Azure SQL Database to include multiple rows of data in a single LucidWorks index Document, based on the primary key. Full instructions on setting up these queries as well as other options can be found in the LucidWorks help documentation here.
Summary
These are just the latest new features to help you easily and quickly set up LucidWorks Search service on Windows Azure, and there are more on the way. Get started with your own LucidWorks Search solution by signing up via the Windows Azure Store, and let us know what you think!