How does DPM really protect data?
Just because you know how DPM 2006 works ... you don't know DPM 2007.
That statement is a little bold, since many folks know both products -- but lately, I have heard a surprisingly large number of assumptions from people about how DPM 2007 does what it does, what its capabilities will actually be, or even what its limitations will be.
Marketing note - DPM 2007 has no limitations, only opportunities for future enhancement.
What is interesting to me is that these assumptions often come from folks who claim in-depth knowledge of DPM 2006. Ok ... the good news is that DPM 2006 did a good job at what it did, and a lot of people invested time in learning how to work with it. The bad news is that they are assuming they know what is inside DPM 2007. My point is: don't assume that even the basics of your understanding of DPM 2006 necessarily apply to DPM 2007.
As one example that I hear most often - and probably the most common technical question for DPM...
"How does DPM protect data ... How does the DPM filter work?"
DPM 2006, our existing product, used a file system filter - meaning that there was a kernel-mode driver that passively monitored what was going on between the IO Manager of the Windows OS and the stack that eventually culminated with NTFS.SYS. Lots of storage technologies sit in this stack - including many anti-virus technologies and other data protection mechanisms.
Anti-Virus filters tend to be what I call "Blocking" drivers - meaning that they watch what is coming down from the OS towards the disk. They block the flow when a new write comes down, to check it against their virus signatures. If everything is clean, then they allow the file operation to continue moving downstream to the disk.
Disk-Protection filters, like DPM 2006, are usually thought of as "non-Blocking" drivers - meaning that their job is to allow the original disk write to proceed through the disk stack at normal pace, but grab a copy of the file operation to be dealt with separately. This is a somewhat common method of asynchronous host-based replication or synchronization, particularly if you are not application aware.
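To make the "Blocking" versus "non-Blocking" distinction concrete, here is a minimal Python sketch. It is a toy model of the idea only, not how any real Windows filter driver or DPM component is written; the class names (`Disk`, `BlockingFilter`, `NonBlockingFilter`) and the byte-string "signatures" are made up for illustration.

```python
from collections import deque


class Disk:
    """Hypothetical stand-in for the bottom of the storage stack."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data


class BlockingFilter:
    """Anti-virus style: inspects each write and may refuse to pass it on."""

    def __init__(self, lower, signatures):
        self.lower = lower            # next layer down the stack
        self.signatures = signatures  # illustrative byte patterns

    def write(self, path, data):
        # The write is held here until the check completes.
        if any(sig in data for sig in self.signatures):
            raise PermissionError(f"write to {path} blocked")
        self.lower.write(path, data)


class NonBlockingFilter:
    """Replication style: passes the write through and keeps a copy."""

    def __init__(self, lower):
        self.lower = lower
        self.captured = deque()  # copies to be dealt with separately

    def write(self, path, data):
        # The original write proceeds first; the copy is handled later.
        self.lower.write(path, data)
        self.captured.append((path, data))


# Stack the filters: anti-virus on top, replication below, disk at the bottom.
disk = Disk()
replicator = NonBlockingFilter(disk)
antivirus = BlockingFilter(replicator, signatures=[b"EVIL"])
antivirus.write("report.txt", b"quarterly numbers")
```

In this sketch a clean write reaches the disk and is also captured by the replication layer, while a write containing a signature never gets past the blocking layer at all.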
In the case of DPM 2006, the filter would hand off a copy of the file system operation to a DPM journal on the production server disk, called a Synchronization Log. It was literally a log file of the NTFS operations done on the production disk. On a regular schedule, the DPM 2006 server fetched what was in the log and essentially repeated those file operations on the DPM Server's copy of the files.
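The log-and-replay idea described above can be sketched in a few lines of Python. This is only an illustration of the principle, under the assumption that a log entry is a simple (operation, path, data) tuple; the names `ProductionServer`, `sync_log`, and `replay_log` are hypothetical and do not reflect DPM 2006's actual log format or code.

```python
class ProductionServer:
    """Toy production server whose writes are journaled as they happen."""

    def __init__(self):
        self.files = {}
        self.sync_log = []  # hypothetical stand-in for the Synchronization Log

    def write(self, path, data):
        self.files[path] = data
        self.sync_log.append(("write", path, data))

    def delete(self, path):
        self.files.pop(path, None)
        self.sync_log.append(("delete", path, None))

    def drain_log(self):
        """Hand over everything logged so far and start a fresh log."""
        entries, self.sync_log = self.sync_log, []
        return entries


def replay_log(production, replica):
    """Repeat each logged file operation, in order, on the replica."""
    entries = production.drain_log()
    for op, path, data in entries:
        if op == "write":
            replica[path] = data
        elif op == "delete":
            replica.pop(path, None)
    return len(entries)


production = ProductionServer()
replica = {}
production.write("a.doc", b"v1")
production.write("a.doc", b"v2")  # both writes are logged and replayed
production.delete("a.doc")
replay_log(production, replica)
```

Note that every individual operation is logged and replayed, even when a later operation overwrites or deletes the same file, so the log grows with activity rather than with the amount of data that actually ends up different.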
It was a good idea -- and DPM wasn't the only data protection technology to use these principles. But don't assume that even something as foundational as the plumbing is consistent for DPM 2007.
Microsoft has said for years that the advocated way to protect data that resides on a Windows OS is through VSS, the Volume Shadow Copy Service, which provides a method for operating systems and applications to put their data in a clean state, in part to prepare for backup activity.
DPM 2007 uses the VSS writers provided by the application workloads that we protect -- we do NOT use a file system filter to capture file changes. Because of that, while we can protect Windows Server 2003 and 2008, we had to drop support for Windows 2000 Server ... no VSS writer. On the other side, we now get the ability to protect desktops running Windows XP (and Windows Vista), something we could not do with DPM 2006, because XP didn't have a file-system filter manager. But it does use VSS. Get the idea?
And because the DPM 2007 filter monitors changed blocks, instead of changed files, we don't require any additional disk footprint on the production servers. There isn't any "Synchronization Log", which was usually sized at 10% of the production data footprint. Congrats ... you get that space back when you upgrade to DPM 2007.
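As a rough illustration of block-level change tracking, here is a toy Python sketch. It assumes an in-memory "changed" flag per block, as described in the author's clarification in the comments below, and it leaves out the VSS step entirely: in the real product a VSS snapshot supplies the quiesced blocks, while here `collect_changes` simply reads the live blocks. The names `Volume`, `dirty`, and `sync_changed_blocks` and the tiny block size are made up for the example.

```python
BLOCK_SIZE = 4  # deliberately tiny, for illustration only


class Volume:
    """Toy volume that remembers which blocks changed since the last sync."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        self.dirty = [False] * num_blocks  # in-memory flags, no on-disk log

    def write_block(self, index, data):
        self.blocks[index] = data
        self.dirty[index] = True  # repeated writes just set the same flag

    def collect_changes(self):
        """Return the current contents of changed blocks and reset the flags."""
        changed = {i: self.blocks[i] for i, d in enumerate(self.dirty) if d}
        self.dirty = [False] * len(self.blocks)
        return changed


def sync_changed_blocks(volume, replica_blocks):
    """Copy only the changed blocks to the replica; return how many moved."""
    changes = volume.collect_changes()
    for index, data in changes.items():
        replica_blocks[index] = data
    return len(changes)


volume = Volume(num_blocks=8)
replica = list(volume.blocks)
volume.write_block(2, b"aaaa")
volume.write_block(2, b"bbbb")  # same block again: still one block to send
volume.write_block(5, b"cccc")
moved = sync_changed_blocks(volume, replica)
```

The contrast with a file-operation log is that the tracking state has a fixed size (one flag per block) no matter how many writes occur, and a block written many times between synchronizations is transferred only once.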
Please check out my 10-minute deep dive on exactly how DPM 2007 captures data, including not only how we use VSS, but also how transactions within SQL Server and Exchange are protected.
STREAMING VIDEO (10 minutes) -- How DPM 2007 protects data (deep-dive)
There are lots of things that are similar between DPM 2006 and DPM 2007, such as the empowering of End-Users to restore their own documents through Windows Explorer or Microsoft Office -- via the Previous Versions Client (the topic for a future blog, I'm sure). But hopefully this gives an example of why you should not assume that what you know about DPM 2006 applies to DPM 2007.
The DPM Engineering team has done some truly amazing work in delivering a product that many say will set a new bar for data protection in the Windows world and change the landscape of data protection overall. To do that, we had to re-invent a few things, all of which are for the better.
So, please come learn about what is coming in DPM 2007.
Comments
Anonymous
January 01, 2003
One of my buddies in Texas, Jason Buffington is a product manager in the Storage Server group and is a
Anonymous
January 01, 2003
That's a great idea, Joe N ... Look for that in a future deep dive segment! Others that I have been asked for are:
- how does disk allocation work on the DPM server?
- ideas on how to determine fan-in, or the ratio of production servers to DPM server
Keep the good ideas coming!!
Anonymous
January 01, 2003
The comment has been removed
Anonymous
January 01, 2003
This entry probably should have been a cross-posted ... but please check out my individual blog's latest
Anonymous
January 01, 2003
Like many of you, I have spent a fair amount of time testing, designing and deploying all types of systems.
Anonymous
January 01, 2003
Thanks Felix - let me clarify. DPM 2006 uses a File System Filter (sitting around IO MGR) that monitors file system changes on their way to NTFS - literally like "Change File X contents from A to B". Every file write is captured, copied, and held in a local Synchronization Log - to be sent hourly to the DPM 2006 server. Then each file operation is essentially replayed to update the DPM copy of the data. DPM 2007 uses a filter/driver which monitors blocks on the disk. As disk blocks get updated, we flip a memory-based bit denoting that the block is different. At the scheduled time, we use the VSS writer to momentarily provide us the quiesced blocks. In between, DPM 2007 can protect data every fifteen minutes from transaction-based applications like SQL Server or Exchange, by copying the transaction log changes - again not as a filter, but by simply reading the logs. DPM 2007 doesn't use a sync log and cause repetitive IO on the production server, it doesn't mimic individual NTFS file operations, etc. Hope that helps. jason
Anonymous
August 20, 2007
Thanks for the video, it's really enlightening about how DPM 2007 works. I will show it to my coworkers at our next meeting.
Anonymous
August 28, 2007
How about a deep dive on a DPM restore of Exchange and SQL? That would help explain things more clearly and answer a lot of questions about scenarios possible with DPM.
Anonymous
August 29, 2007
You seem to have contradicted yourself in the article: you state that DPM 2007 does not use a filter, and is different from DPM 2006 because of this, but you then go on to say that DPM 2007 uses a filter driver. >> And because the DPM 2007 filter monitors changed blocks, instead of changed files... >> Still confused. -Felix