Query Processing -- introduction to the blog!

Since we’re starting a new blog for the whole query processing team, we think it’s appropriate to start with an introductory post about query processing as a whole and what we’ll be talking about in this blog.

 

If you’re unfamiliar with the way SQL Server processes its queries, a good refresher can be found here. As mentioned in the linked article, the first operation that SQL Server performs when it encounters a query it has never seen before is to optimize it to produce an execution plan for it. Most of our older blog was focused on this stage of query processing, and we’ll still be talking about it a great deal. Some of the topics we’ll tackle in the query optimization space include:

 

  1. Statistics used by the optimizer to deduce the structure of the underlying data, and their usage in cardinality estimation (examples from the old blog include: How to Read Statistics Profile, Ascending Keys and Auto Quick Corrected Statistics, Auto-create and Auto-update Statistics)

  2. Indexes, their usage in the optimizer and best practices regarding them (examples from the old blog include: Why should I create an index?, Fun for the day - Automated Auto-Indexing!, Indexed Views in SQL Server 2005)

  3. Various tricks done by the optimizer (examples: Local-Global Aggregation, Row Goals in Action)

  4. Hints that can be passed on to the optimizer to manipulate it (examples: Hints for DML queries)

  5. Recompilation, what they are, when & how to avoid them (examples: I Smell a Parameter!)

  6. Pitfalls people fall into related to the query optimizer (examples: TOP 100 Percent ORDER BY Considered Harmful.)

  7. etc.

Once armed with query plans, the SQL Server’s Query Execution (QE) component is primarily responsible for consuming these optimized query plans and transforming them into executable trees of physical operators. These physical operators are designed to implement a wide variety of algorithms for joins, sorts, aggregation, groupings, etc. By chaining these operators to conform with the query plan’s tree structure, QE is able to physically implement and execute the optimized query plan and perform the necessary computation to produce results for the end user.

 

In addition to the core operators, QE is also responsible for handling a number of server engine features, such as parallel query execution, distributed queries, and query memory management.

As a team, we are dedicated to providing a customer experience that is second to none. We strive to always maintain an exceptionally high level of robustness, performance, scalability, both within our component and for the rest of the query processor. By delivering powerful functionality through these execution building blocks, QE is critical in providing much of the flexibility needed for SQL Server to be an enterprise-quality database.

 

From a QE perspective, some of the interesting topics we plan to cover include:

  1. Memory management policies in QE and how to interpret memory-related issues based on system loads
  2. How to use information that can be collected from Showplan XML and Statistics Profile for execution analysis
  3. Efficient use of Distributed Queries
  4. Indexes and their role in Query Execution
  5. Data Manipulation Language (DML) including wide plans, narrow plans, etc.
  6. Parallel query execution - hints, tips, tricks, etc.
  7. Execution plan caching - what is it and how does it help?

Comments

  • Anonymous
    May 02, 2007
    Nice blog, I'll be visting frequently. I'm coming from the Oracle world and ma trying to draw comparisons that will help me understand how SQL Server 2000 processes queries. Specifically what does the SQL Server Optimizer conisider in it's evaluation of a query? Cardinality, selectivity? What types of conditions might preclude the optimizer from using an index (ie a function, OR expansions etc). Is there are chapter in sql server reference quide or a link that gives a thorough explanation of the optimizer?