Memory-mapped files and how they work

A key Windows facility that’s been available since NT shipped is support for memory-mapped files.  A memory-mapped file is a file that has been mapped (i.e., not copied) into virtual memory such that it looks as though it has been loaded into memory.  Rather than actually being copied into virtual memory, a range of virtual memory addresses is simply marked off for use by the file.  You (or Windows) can then access the file as though it were memory, freely changing parts of it without having to make file I/O calls.  Windows transparently loads parts of the file into physical memory as you access them.  You don’t have to concern yourself with which parts are and are not in memory at any given time.
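
Here’s a minimal sketch of the idea in C: open a file, map it, and read it through an ordinary pointer.  The file name sample.txt is just a placeholder, and the error handling is deliberately minimal:

```c
/* A minimal sketch (not production code): map a file read-only and
   access its contents through a pointer instead of ReadFile() calls.
   "sample.txt" is a placeholder name for any existing, non-empty file. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE hFile = CreateFileA("sample.txt", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    /* Create a file-mapping object backed by the file itself. */
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!hMap) { CloseHandle(hFile); return 1; }

    /* Map the whole file into this process's virtual address space. */
    const char *view = (const char *)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    if (view)
    {
        /* Plain pointer access; Windows pages the file in on demand. */
        printf("first byte: %c\n", view[0]);
        UnmapViewOfFile(view);
    }
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}
```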

Typically, the backing storage for virtual memory is the system paging file.  With memory-mapped files, things change.  As each page of the file is accessed and copied into physical memory so that the CPU can access it, the paging file is bypassed.  The file itself acts as a virtual extension of the paging file and serves as the backing storage for the range of virtual memory addresses into which it has been mapped, at least for unmodified pages.
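
One way to see the distinction is to ask the memory manager directly.  The sketch below uses VirtualQuery() to contrast an ordinary private allocation, which is backed by the paging file, with a view of a mapped file, which is backed by the file itself.  Again, sample.txt is a placeholder:

```c
/* A sketch: VirtualQuery() reports MEM_PRIVATE for ordinary allocations
   (paging-file backed) and MEM_MAPPED for views of file mappings. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORY_BASIC_INFORMATION mbi;

    /* An ordinary committed allocation: the paging file is its backing store. */
    void *priv = VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    /* A view of a mapped file: the file itself is the backing store.
       "sample.txt" is a placeholder; any existing, non-empty file will do. */
    HANDLE hFile = CreateFileA("sample.txt", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    HANDLE hMap  = (hFile != INVALID_HANDLE_VALUE)
                 ? CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL) : NULL;
    void  *view  = hMap ? MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0) : NULL;

    if (priv && VirtualQuery(priv, &mbi, sizeof(mbi)))
        printf("VirtualAlloc region:  %s\n",
               mbi.Type == MEM_PRIVATE ? "MEM_PRIVATE (paging-file backed)" : "other");

    if (view && VirtualQuery(view, &mbi, sizeof(mbi)))
        printf("MapViewOfFile region: %s\n",
               mbi.Type == MEM_MAPPED ? "MEM_MAPPED (backed by the file itself)" : "other");

    if (view) UnmapViewOfFile(view);
    if (hMap) CloseHandle(hMap);
    if (hFile != INVALID_HANDLE_VALUE) CloseHandle(hFile);
    if (priv) VirtualFree(priv, 0, MEM_RELEASE);
    return 0;
}
```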

How are memory-mapped files used?  Most commonly by Windows itself.  Every time a binary image (an .EXE or a .DLL, for example) is loaded into memory, it is actually loaded as a memory-mapped file.  The DLLs you see in a process’s virtual address space were mapped there by Windows as memory-mapped files.
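
You can see these image mappings for yourself by enumerating the modules in a process.  Here’s a rough sketch using the PSAPI EnumProcessModules() call; the 256-module cap is an arbitrary simplification, and on older toolchains you’ll need to link with Psapi.lib:

```c
/* A sketch: enumerate the modules mapped into the current process.
   Each entry is an image mapped into the address space as a
   memory-mapped file, not a copy read in with file I/O; the first
   entry is the .EXE itself. */
#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void)
{
    HMODULE mods[256];          /* arbitrary cap for this sketch */
    DWORD   needed = 0;

    if (EnumProcessModules(GetCurrentProcess(), mods, sizeof(mods), &needed))
    {
        DWORD count = needed / sizeof(HMODULE);
        for (DWORD i = 0; i < count && i < 256; i++)
        {
            char name[MAX_PATH];
            if (GetModuleFileNameExA(GetCurrentProcess(), mods[i], name, MAX_PATH))
                printf("%p  %s\n", (void *)mods[i], name);
        }
    }
    return 0;
}
```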

As I’ve mentioned, memory-mapped files can also be used to process files as though they were memory.  If you’d rather search a file or make minor in-place modifications to it using simple pointer dereferencing and value assignment rather than invoking file I/O operations, you can certainly do so.  Have a look at the MapViewOfFile() Win32 API for some basic info and some pointers on how to get going with this.  Something to be careful of, though, is depleting your virtual address space.  When mapping large files into virtual memory to perform I/O on them, be cognizant of the fact that every address you burn in virtual memory is another that can’t be used by your app.  It’s usually more efficient to use regular file I/O routines to perform read/write operations on large files.
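
To make the pointer-based approach concrete, here’s a sketch that maps a (small) file read/write, searches it with memchr(), and patches a byte in place.  The file name data.bin and the byte values are placeholders:

```c
/* A sketch of processing a file as memory: map it read/write, search it
   with an ordinary memory routine, and modify it through the pointer.
   "data.bin" and the byte values are placeholders for this example. */
#include <windows.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    HANDLE hFile = CreateFileA("data.bin", GENERIC_READ | GENERIC_WRITE,
                               0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    DWORD  size = GetFileSize(hFile, NULL);
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READWRITE, 0, 0, NULL);
    unsigned char *p = hMap
        ? (unsigned char *)MapViewOfFile(hMap, FILE_MAP_WRITE, 0, 0, 0) : NULL;

    if (p)
    {
        /* Search the file with a plain memory routine... */
        unsigned char *hit = memchr(p, 0x00, size);
        if (hit)
            *hit = 0xFF;           /* ...and modify it in place. */

        FlushViewOfFile(p, 0);     /* optional: push dirty pages to disk now */
        UnmapViewOfFile(p);
    }
    if (hMap) CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}
```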

I’ll continue tomorrow with a discussion of how this relates to SQL Server.

Comments

  • Anonymous
    January 30, 2006
    Hi,

    It would be great to know more about this topic. I'm also interested in how SQL Server behaves in a clustered environment. There isn't much written on this, so please tell us (me) more. And what's the latest on the TPC tests?

    Best regards,
    Calin

  • Anonymous
    July 16, 2006
    Ris, I did a quick test of MMF performance by mapping a 261 MB file and accessing bytes at every 100 KB increment, forward and back.  The first run took 4.5 seconds to complete.  Subsequent runs took 0 seconds, probably due to OS caching.

    I did the test on Windows 2000, not Windows Mobile, and I have 512 MB of RAM rather than 32 MB, so I upped the file size to 1 GB and reran the test.  This took 17.6 seconds to complete.  I think this is reasonable: the time it took to test a 1 GB file is about 4 times what it took for the 261 MB file.

    Where is the slow performance coming from -- when you access an address that generates a page fault, or somewhere else?  How random is your data access?  On W2K, each page fault causes a read of at least 4 KB from file to RAM.  If your data access is causing a lot of page faults and you are running low on RAM so pages are being swapped out and in, it will slow performance down.

    As for your problem with opening 2000 files and running out of memory, you're probably running out of handles.  The memory used to store handles is quite limited, and it is very possible that it will run out on a small device.  I've tried opening as many handles as I can on a machine with 4 GB of RAM, and the program ran out of memory well before 4 GB was reached (or even 2 GB).  I'm no expert on this, but I suspect that the bulk of what it takes to store a handle may reside in system memory rather than user process memory, and this system memory is quite limited.

    Vincent
    2006/7/16
