Memory Pressure and how to see it

Since this is such a broad topic, I want to just start with the basics.   I'll review some of the settings in SQL and Windows, and then talk about what some of the pressures I see are, and how we can help reduce them.

First let's cover the Settings...

SP_Configure holds just a few of these settings.   Max Server Memory and Min Server Memory.  

Max Server Memory:
Max is kind of what you would expect.   It is the Maximum number that the Buffer Pool will grow to.   Please note I did not say it's the total Memory that SQL will consume, but just the Buffer Pool.   This is important as SQL can take more than this for other reasons.  Not EVERYTHING is in the Buffer Pool.   Things like Query Plan Cache, Memory for the Optimizer, Connections, CLR Objects, etc... all use memory that canbe outside of the pool.   Max Memory also does not limit other areas of SQL.  SSIS, SSRS, SSAS all are their own separate processes, and they are not controlled by this.   I know this may not sound ground breaking, but I still go on-site to customers who think this setting will control everything that SQL does.

Min Server Memory:
This is a setting I feel is very much not understood.   First and foremost, this DOES NOT pre-allocate memory.   SQL will NOT start up and grab this much ram.   This setting instead is a threshold.   Once SQL does grow beyond this number, it will not give it back to the OS.   Please re-read that again.   It does not give it up!    This is 0 by default, and although I have had several conversations on it, I personally don't see why this should be changed.    If it did pre-allocate it, then I could see a change, but if my server is under a Memory Attack, I would want SQL to release memory to save the OS.

AWE:
AWE is left over from the 32bit days, and why it is not required in 64bit, it does not hurt anything, so I always turn it on.

Lock Pages in Memory:
This is not a SQL setting, but it is one that has some old skeletons.  I will say that in SQL 2000, once this was set, SQL did not like to trim this memory, but that was then, this is now.   We DO want this to be set.   Why?   Here is the quick overview.   If I fill my SQL Servers memory with data, why would I want to page that data out to a page file?   I would not.   Most of us are going to have much faster disks for data files than for our OS Page file.   So if I wanted to remove the data from RAM, I would just rather throw it away and then reload it from the faster disks when needed again, not from the slower Page file.   If you are running into a Memory issue, you may have Max Server Memory set too high, and that needs to be changed, not this locking.

Okay, now we have just a few settings covered, let's talk about how to see this pressure, and what direction it might be coming from.   The tool I use here is Perfmon.   It is absolutely my #1 tool.  Not because it's a MS tool, but because it's easy to use, light (little if any impact to the server), on every windows system by default, just to name a few.   In Perfmon I want to look at memory on the box, as well as memory in SQL.  

  • Memory: Available MB
  • SQL Buffer: Free Pages
  • SQL Buffer: Page Life
  • SQL Buffer: Lazy Writes

These four are my starting point.   First, if there is less than a few hundred Megabytes free on the server?  If so, something is wrong.   Either I have too many processes running or Max Server Memory is set too high.   If OS Memory is ok, now we need to see if there is pressure inside of SQL.   The other three counters will help with this.   I'll dig into what each one does first.

Free Pages:
This is the number of free pages in the Buffer.  Remember there WILL always be free pages, but how many and how does that number change?  The number we are looking for is 640 free.  Below that, the engine will either try to grow the buffer in memory, or start tossing the old data out.   This is done through the lazywritter, which happens to be the other Perfmon counter we look at.

Page Life Expectancy:
This counter tells us how long a page has lived in memory when it's no longer referenced.   The ideal number is over 300 seconds, obviously, the longer the better.

Lazy Writes:
This is the process that is used to throw away old data in the buffer, if that buffer can't be grown to a larger memory footprint.  Our target here is less than 20/sec.  The lower, the better.

The combination of these three counters will help us see pressure inside of SQL Server.   Most DBA's will monitor Buffer Cache Hit Ratio.  This counter does not really help us.  If the counter is low, then there is a good chance there is an issue, but if it's high, it tells us nothing.  Yep, nothing.  This is why I don't look at it.   On the other hand, the three I laid out will, and they all work hand and hand.

If I have low free pages, and the buffer can't be grown, Lazy Writes will start removing data. Once the oldest data is removed, this will reduce our Page Life.  See the combination of the three help prove this.    Let's look at it another way.  What if Free Pages is low, but Page Life continues to grow and Lazy Writes is at 0?  Well, this just tells me that we are running queries, and growing the Buffer Pool.  In that case we could look at the Available Megabytes, and see that it is decreasing by exactly the same trend as SQL is taking memory.   Again the same counters, but this proved that we did NOT have internal memory pressure.

One other little tip I like to add here.  You need to rule out Memory pressure first.   If you have Memory issues, I would bet that you have or will have disk issues.  Once you have thrown out all of you nice old data in memory, we are at some point going to have to load it back in.   When that happens, we have to read from the very slow disks.  I know, I know, you are on a SAN, and you have the fastest disks in town, well that does not matter.  Disks are SLOW, even the fabulous Solid State drives (Like I have in my laptop) are slower than just re-using data that is already in memory.

The other thing I need to point out is WHY.  Why is SQL doing this?  SQL's not doing it; it's doing what you told it to do.  How so?  With your code!  T-SQL queries, Stored Procedures, Function Calls, etc...   Up to this point, our lives have been pretty easy.  We used Perfmon to track down where the problem is.  IF, and I stress here, IF the problem is in SQL, then we are going to have to look at code some place in order to treat the symptom.   Yep, I guess that means cracking the chest and looking at that evil stuff called code.

I'll get detailed about how to find this code and what to and what not to write in a future blog, but high level, you are going to need to look at your highest I/O queries, and start tuning.   Adding Indexes, maybe roll-up tables, etc...   Some absolute basics are...

  1. Don't query data that you don't plan to use.  NO SELECT *!!
  2. Always use a WHERE clause.
  3. Create Non-Clustered indexes that SEEK the data, not scan.
  4. Having a Clustered index is "Usually" a requirement.  (Not required, but should be for 99% of the tables out there.
  5. Be careful with SELECT DISTINCT, its better not to duplicate the data in the first place.

I could go on forever there, but I have to stop.

To recap what we talked about here.  We focused on just Memory Pressure.  My primary tool for finding this is to use Performance Monitor (Perfmon).  First look at the OS level, and then dig into SQL.  You may need to reduce Max Memory to help the OS, but tuning Queries is going to be the best way to fight internal pressure.   Just adding Memory will NOT fix the problem.  You can't buy enough Memory to fix bad code, but you can give your best try.  Most of us would turn to Profiler to find these queries, but I will direct you to the SYS.DM_EXEC_QUERY_STATS DMV.  With just a few queries to that DMV, you can find the most expensive queries any way you want to slice and dice it.

 

Eric, @SQLPilot
Tweet