Saving VM by using OEMDRIVERSHIGH package

The release of Windows Mobile 6 (WM6) gave us a few more tools for dealing with Virtual Memory (VM) issues. Compaction of slot 1 is new for WM6. In previous releases all slot 1 modules were aligned on 64KB boundaries, which wasted VM address space whenever a module whose size was not an exact multiple of 64KB was mapped into slot 1. The new alignment size of 4KB allows modules to be packed closer together, giving better compaction.

To illustrate this point, Figure 1 below shows 4 files totaling 142k in size taking up 320k using 64k alignment. Figure 2 shows the same 4 files only taking up 144k using 4k alignment. The white space is unused memory.

Along with compaction, the WM6 release also provides a way to map OEM driver modules into slot 1 with the OEMDRIVERSHIGH package. This is important because all the modules that do not fit into slot 1 roll over into slot 0, reducing available VM for all slots. Further exacerbating the situation is the fact that slot 0 aligns modules on 64KB boundaries. The OEMDRIVERSHIGH package enables OEMs, SVs, and system integrators to change the packaging priority of OEM drivers, which are typically near the end of the list of packages processed by makeimg.

When it comes to reconfiguring modules, OEMs don’t have the ability to rearrange packages or to change the modules that go into a particular package. All that is available at this stage of building an image is reconfiguring the modules that go into the OEMDRIVERS package. Now, with the addition of the OEMDRIVERSHIGH package, an OEM can target selected DLLs to be processed earlier, which will push larger modules out of slot 1 and into slot 0.

 

Optimizing OEMDRIVERSHIGH usage

To save VM using OEMDRIVERSHIGH we are going to identify the modules at the bottom of slot 1 that are on the verge of rolling over into slot 0. Through careful comparison of these modules and those currently in the OEMDRIVERS package we will reassign OEMDRIVERS modules to OEMDRIVERSHIGH. The modules in OEMDRIVERSHIGH will then be processed before the modules at the bottom of slot 1, so the OEMDRIVERSHIGH modules targeted for slot 1 will push the files sitting on the slot 1 / 0 boundary into slot 0. The VM savings come from rolling over larger modules, hundreds of KB in size, into slot 0 (which is 64KB aligned) and filling the freed slot 1 space with smaller modules. For example, relocating three 16KB modules to slot 1 frees 192KB in slot 0 and consumes only 48KB in slot 1.
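
The arithmetic behind the three-module example can be sketched in a few lines of Python. This is only an illustration of the alignment math; the helper names aligned_size and footprint are mine and are not part of any build tool.

```python
KB = 1024

def aligned_size(size_bytes, alignment_bytes):
    """Round size_bytes up to the next multiple of alignment_bytes."""
    return -(-size_bytes // alignment_bytes) * alignment_bytes

def footprint(sizes_bytes, alignment_bytes):
    """Total address space used when every module is aligned separately."""
    return sum(aligned_size(s, alignment_bytes) for s in sizes_bytes)

# Three 16KB modules, as in the example above.
modules = [16 * KB] * 3

slot0_kb = footprint(modules, 64 * KB) // KB  # 64KB alignment in slot 0
slot1_kb = footprint(modules, 4 * KB) // KB   # 4KB alignment in slot 1

print(f"slot 0 footprint: {slot0_kb}KB")  # 192KB
print(f"slot 1 footprint: {slot1_kb}KB")  # 48KB
```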

To achieve this, the first place to start is analyzing the output of makeimg from a baseline build. The output isn’t always captured, so you may need to run makeimg once while capturing the output (makeimg > makeimg.out 2>&1). Search through makeimg.out and look for “fixing up”. These are modules rolling over from slot 1 into slot 0. Below is an example of modules that just made it into slot 1 and modules that started rolling over into slot 0. This list of modules will be different for every project based on SYSGEN, BSP flags, IMG flags and the type of image.

Using the data from makeimg.out we will build a table to calculate the size of each module sitting at the bottom of slot 1. Later we will use this data to determine how many modules to move into OEMDRIVERSHIGH, which will cause these modules to roll over.

 

                                                        FIGURE 3

Module cespell.dll at offset 01d8a000 data, 0210b000 code

Module MsgStore.dll at offset 01d89000 data, 02106000 code

Module msim.dll at offset 01d87000 data, 020e4000 code

Module mscoree.dll at offset 01d86000 data, 020d4000 code

Module netcfagl2_0.dll at offset 01d85000 data, 02099000 code

Module netcfd3dm2_0.dll at offset 01d84000 data, 02072000 code

Module ccoredrv.dll at offset 01d83000 data, 02051000 code

Module ccoreutl.dll at offset 01d82000 data, 02042000 code

Module cspvoice.dll at offset 01d81000 data, 02032000 code

Module simsec.dll at offset 01d80000 data, 02026000 code

Module sms.dll at offset 01d7f000 data, 02020000 code

Code space full, fixing up supsvcs.dll to ram space

Code space full, fixing up btagsvc.dll to ram space

Code space full, fixing up BthAGPhonebook.dll to ram space

Code space full, fixing up celltsp.dll to ram space

Code space full, fixing up cplphone.dll to ram space

Code space full, fixing up phone.dll to ram space

Code space full, fixing up ril.dll to ram space

 

Taking a module’s code offset from Figure 3 and subtracting it from the code offset of the previously listed module gives the slot 1 space that can be recouped. (The data segment is needed regardless of what we do and will remain the same.) So for msim.dll the code size is 0x02106000 - 0x020E4000, which is 136KB. The rollover order will be the same for every run of makeimg. The only exception is when slot 1 is low on space: makeimg will then pack in modules small enough to fit into the remaining space. This is good news, because one less module in slot 0 saves an additional 64KB. The consistent rollover order enables us to predict which files will roll over and in what order; this is illustrated in Table 1. The last two columns of this table show the total space saved in slot 1 and the total space used by these modules as they roll over. Notice how sms.dll consumes 24KB of space in slot 1 and will consume 64KB in slot 0.
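
The subtraction described above is easy to script. The following Python sketch assumes makeimg.out contains lines in exactly the two formats shown in Figure 3; the function names are my own and the regular expressions may need adjusting for your build output.

```python
import re

MODULE_RE = re.compile(
    r"Module (\S+) at offset ([0-9a-fA-F]+) data, ([0-9a-fA-F]+) code")
FIXUP_RE = re.compile(r"Code space full, fixing up (\S+) to ram space")

def parse_makeimg(text):
    """Return (slot1, rolled): slot 1 (name, code offset) pairs in log
    order, and the names of modules fixed up to RAM space (slot 0)."""
    slot1, rolled = [], []
    for line in text.splitlines():
        m = MODULE_RE.search(line)
        if m:
            slot1.append((m.group(1), int(m.group(3), 16)))
            continue
        m = FIXUP_RE.search(line)
        if m:
            rolled.append(m.group(1))
    return slot1, rolled

def code_sizes_kb(slot1):
    """Code size = previous module's code offset minus this module's."""
    sizes = {}
    for (_, prev_off), (name, off) in zip(slot1, slot1[1:]):
        sizes[name] = (prev_off - off) // 1024
    return sizes

# The slot 1 lines of Figure 3 plus the first two rollover lines.
SAMPLE = """\
Module cespell.dll at offset 01d8a000 data, 0210b000 code
Module MsgStore.dll at offset 01d89000 data, 02106000 code
Module msim.dll at offset 01d87000 data, 020e4000 code
Module mscoree.dll at offset 01d86000 data, 020d4000 code
Module netcfagl2_0.dll at offset 01d85000 data, 02099000 code
Module netcfd3dm2_0.dll at offset 01d84000 data, 02072000 code
Module ccoredrv.dll at offset 01d83000 data, 02051000 code
Module ccoreutl.dll at offset 01d82000 data, 02042000 code
Module cspvoice.dll at offset 01d81000 data, 02032000 code
Module simsec.dll at offset 01d80000 data, 02026000 code
Module sms.dll at offset 01d7f000 data, 02020000 code
Code space full, fixing up supsvcs.dll to ram space
Code space full, fixing up btagsvc.dll to ram space
"""

slot1, rolled = parse_makeimg(SAMPLE)
sizes = code_sizes_kb(slot1)
print("msim.dll code size:", sizes["msim.dll"], "KB")  # 136 KB
print("rolled over:", rolled)
```

The first module in the excerpt (cespell.dll) gets no size because the line before it is not part of the sample.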

 

 

*Calculated with the equation roundup(size / slot alignment) * slot alignment; the sms.dll size in slot 0 is roundup((0x02026000 - 0x02020000)/1024/64)*64 = 64KB.
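
As a rough cross-check of the roundup equation, the sketch below applies it to the six modules closest to the slot 1 / 0 boundary. The sizes are the ones I derived from the Figure 3 offsets, listed in the order I assume they roll over (closest to the boundary first); it is not a copy of Table 1, and the function names are only illustrative.

```python
def round_up_kb(size_kb, alignment_kb):
    """roundup(size / alignment) * alignment, using integer math."""
    return -(-size_kb // alignment_kb) * alignment_kb

# Code sizes in KB derived from the Figure 3 offsets, in assumed
# rollover order (module nearest the slot 1 / 0 boundary first).
rollover_order = [
    ("sms.dll", 24),
    ("simsec.dll", 48),
    ("cspvoice.dll", 64),
    ("ccoreutl.dll", 60),
    ("ccoredrv.dll", 132),
    ("netcfd3dm2_0.dll", 156),
]

def cumulative_table(modules):
    """Rows of (name, size, total freed in slot 1, total used in slot 0)."""
    rows = []
    freed_slot1 = used_slot0 = 0
    for name, size_kb in modules:
        freed_slot1 += round_up_kb(size_kb, 4)   # 4KB alignment in slot 1
        used_slot0 += round_up_kb(size_kb, 64)   # 64KB alignment in slot 0
        rows.append((name, size_kb, freed_slot1, used_slot0))
    return rows

for row in cumulative_table(rollover_order):
    print("%-18s %4dKB  slot 1 freed: %4dKB  slot 0 used: %4dKB" % row)
```

With these inputs the six modules free 484KB of slot 1 and occupy 640KB of slot 0 once rolled over.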

 

Now we need to generate a list of OEMDRIVERS modules that can be relocated into the OEMDRIVERSHIGH package. A dir of %_FLATRELEASEDIR%/OEMDrivers_PACKAGE_FILES/ will result in a rough list. Further refine the list by making sure the modules are not being filtered out of the image, so search makeimg.out to verify their presence.

The next step is to determine the module size that will be used in slot 1. For DLLs and EXEs this can be done by running the command dumpbin /headers <DLL name> and subtracting the size of .reloc from the total size of the image. For the example below we get 0x5000 bytes (20KB).

 

D:\_FLATRELEASEDIR>dumpbin /headers gpio.dll

  ...

             6000 size of image

  ...

        1000 .reloc

  ...
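
If there are many modules to measure, the subtraction can be automated. This Python sketch only parses text that looks like the two lines shown above ("size of image" and the ".reloc" summary entry); the function name is hypothetical and the sketch does not invoke dumpbin itself.

```python
import re

def slot1_size_from_dumpbin(output):
    """Return (size of image - .reloc size) in bytes from the text
    printed by dumpbin /headers. Both values are hexadecimal."""
    image = re.search(r"^\s*([0-9A-Fa-f]+) size of image\s*$", output, re.M)
    reloc = re.search(r"^\s*([0-9A-Fa-f]+) \.reloc\s*$", output, re.M)
    if image is None:
        raise ValueError("no 'size of image' line found")
    # A module without a .reloc section is treated as having size 0 there.
    reloc_size = int(reloc.group(1), 16) if reloc else 0
    return int(image.group(1), 16) - reloc_size

# The relevant lines from the gpio.dll example above.
SAMPLE = """\
             6000 size of image
        1000 .reloc
"""

size = slot1_size_from_dumpbin(SAMPLE)
print(hex(size), "=", size // 1024, "KB")  # 0x5000 = 20 KB
```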

 

Table 2 is a list of modules going into the OEMDRIVERS package that can be moved into the OEMDRIVERSHIGH package. The table is sorted by the ratio of wasted VM to module size, where size is taken from the dumpbin output. The columns “Total space freed in slot 0” and “Total space used in slot 1” list the sum total of space used by that module and the others that precede it in the table, based on the compaction of each slot. The column “Wasted VM in slot 0” shows the VM left unused because of 64KB alignment. To populate “Modules from Table 1 rolled over into slot 0” we compare the “Incremental space freed in slot 1” column of Table 1 with the “Total space used in slot 1” column of Table 2. This comparison determines how many modules we can put into OEMDRIVERSHIGH, thus causing modules from Table 1 to roll over into slot 0. The intersections of the two columns are marked with an asterisk to indicate the optimal amount of savings for getting an additional module listed in Table 1 to roll over. As illustrated in this table, it would only take one module (gxdma.dll) to get one module from Table 1 to roll over. However, if we wanted to roll over two modules we could relocate the first 2-4 modules, four being the optimal number.

 

* To achieve optimal savings when rolling over this number of files into slot 0, push this file and all previous files into OEMDRIVERSHIGH

 

We can now use the optimal number of modules to relocate to estimate VM savings. We can assume VM usage in slot 1 stays the same, though it isn’t exactly the same, because the relocated modules aren’t exactly the same size as the ones being rolled over. Finally we compare the differences in slot 0 usage. Table 3 brings together data from Table 1 and the optimal data from Table 2 to calculate VM savings. Looking at the data under the column “Compaction savings in slot 0 (KB)” we can see a clear point where causing 6 modules to roll over will save 384KB in VM. We will relocate the first 15 modules of Table 2 so that six modules of Table 1 roll over, and we should see about 384KB of VM savings. Because VM is conserved in 64KB chunks, it is conceivable that there are multiple points of optimal savings. If this happens in your optimization, you should implement each scenario to see which one results in the better optimization.

 

* Based on the sort order of Table 2 there is a large jump between the 15th and 16th module because of the size of ddi.dll (16th). Relocating ddi.dll will cause the 7th, 8th, 9th, 10th, and 11th modules of Table 1 to roll over.
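
The comparison between the two tables can be modeled roughly as follows. This is a simplified sketch, not what makeimg actually does: it assumes slot 1 has no slack in the baseline build and ignores makeimg packing small modules into leftover space. The rollover sizes are the ones I derived from Figure 3; the driver sizes are hypothetical placeholders, not the real Table 2 data.

```python
def round_up(size_kb, alignment_kb):
    return -(-size_kb // alignment_kb) * alignment_kb

def estimate_savings(rollover_kb, drivers_kb, moved_count):
    """Estimate the effect of moving the first moved_count drivers.

    rollover_kb: code sizes of slot 1 modules in rollover order
    drivers_kb:  slot 1 sizes of OEMDRIVERS modules in move order
    Returns (modules rolled over, net slot 0 saving in KB).
    """
    moved = drivers_kb[:moved_count]
    needed_slot1 = sum(round_up(s, 4) for s in moved)   # space consumed
    freed_slot0 = sum(round_up(s, 64) for s in moved)   # space released

    rolled = freed_slot1 = used_slot0 = 0
    for size in rollover_kb:
        if freed_slot1 >= needed_slot1:
            break
        freed_slot1 += round_up(size, 4)
        used_slot0 += round_up(size, 64)
        rolled += 1
    if freed_slot1 < needed_slot1:
        raise ValueError("not enough rollover candidates listed")
    return rolled, freed_slot0 - used_slot0

# Derived from the Figure 3 offsets (sms.dll first).
rollover_kb = [24, 48, 64, 60, 132, 156]
# Hypothetical driver sizes, for illustration only.
drivers_kb = [20, 12, 16, 8, 28, 36]

for count in range(1, len(drivers_kb) + 1):
    rolled, saving = estimate_savings(rollover_kb, drivers_kb, count)
    print(f"move {count} drivers: {rolled} rolled over, net {saving}KB")
```

With these made-up driver sizes, moving two, three, or four drivers all roll over the same two modules, so four is the best choice at that step. That is the kind of plateau the asterisks in Table 2 mark.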

 

Enable OEMDRIVERSHIGH

· Set environment variable IMGOEMDRIVERSHIGH=1

· In platform\<platform>\files\wpc\oem.cpm.csv or platform\<platform>\files\smartfon\oem.cpm.csv add the line ‘PACKAGE_OEMDRIVERSHIGH,OEMDriversHigh’

· In platform\<platform>\files\platform.bib change @XIPREGION [END]IF PACKAGE_OEMDRIVERS to @XIPREGION [END]IF PACKAGE_OEMDRIVERSHIGH for targeted modules.

· Do the same for platform.reg and the corresponding settings for modules moved in previous step.

· In platform\<platform>\files\wpc or platform\<platform>\files\smartfon make a copy of oemdrivers.pkd.xml and name it oemdrivershigh.pkd.xml

· Use guidgen tool to create a new packageguid for oemdrivershigh.pkd.xml.

· In oemdrivershigh.pkd.xml change the name entry in the XML file from OEMDrivers to OEMDriversHigh.

 

Example:

This example is for the GPIO driver which is typically pretty small.

; @CESYSGEN IF COREDLL_CORESIOA

; @XIPREGION IF PACKAGE_OEMDRIVERS

IF BSP_NOGPIO !

IF IMG_NOGPIO !

    gpio.dll $(_FLATRELEASEDIR)\gpio.dll NK SH

ENDIF IMG_NOGPIO !

ENDIF BSP_NOGPIO !

; @XIPREGION ENDIF PACKAGE_OEMDRIVERS

; @CESYSGEN ENDIF COREDLL_CORESIOA

 

This is what the same section would look like after moving the module to the OEMDRIVERSHIGH package.

; @CESYSGEN IF COREDLL_CORESIOA

; @XIPREGION IF PACKAGE_OEMDRIVERSHIGH

IF BSP_NOGPIO !

IF IMG_NOGPIO !

    gpio.dll $(_FLATRELEASEDIR)\gpio.dll NK SH

ENDIF IMG_NOGPIO !

ENDIF BSP_NOGPIO !

; @XIPREGION ENDIF PACKAGE_OEMDRIVERSHIGH

; @CESYSGEN ENDIF COREDLL_CORESIOA
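
When many modules need to move, the platform.bib edit can be scripted. The Python sketch below is only one possible approach and makes two assumptions: the @XIPREGION lines look exactly like the ones in the example above, and every module inside a matching block should move together. The function name retarget is hypothetical, and the same idea would need adapting for platform.reg.

```python
import re

OPEN_RE = re.compile(r"^\s*;\s*@XIPREGION\s+IF\s+PACKAGE_OEMDRIVERS\s*$")
CLOSE_RE = re.compile(r"^\s*;\s*@XIPREGION\s+ENDIF\s+PACKAGE_OEMDRIVERS\s*$")

def retarget(bib_text, targets):
    """Switch each OEMDRIVERS XIPREGION block that lists a targeted
    module to OEMDRIVERSHIGH. Other blocks are left untouched."""
    targets = {t.lower() for t in targets}
    out, block = [], None
    for line in bib_text.splitlines():
        if block is None:
            if OPEN_RE.match(line):
                block = [line]          # start collecting a block
            else:
                out.append(line)
            continue
        block.append(line)
        if CLOSE_RE.match(line):
            # First token of each inner line is the module name (or IF/ENDIF).
            names = {l.split()[0].lower() for l in block[1:-1] if l.split()}
            if names & targets:
                for i in (0, -1):
                    block[i] = block[i].replace(
                        "PACKAGE_OEMDRIVERS", "PACKAGE_OEMDRIVERSHIGH")
            out.extend(block)
            block = None
    if block:                           # unterminated block: keep as-is
        out.extend(block)
    return "\n".join(out) + ("\n" if bib_text.endswith("\n") else "")

SAMPLE = r"""; @CESYSGEN IF COREDLL_CORESIOA
; @XIPREGION IF PACKAGE_OEMDRIVERS
IF BSP_NOGPIO !
IF IMG_NOGPIO !
    gpio.dll $(_FLATRELEASEDIR)\gpio.dll NK SH
ENDIF IMG_NOGPIO !
ENDIF BSP_NOGPIO !
; @XIPREGION ENDIF PACKAGE_OEMDRIVERS
; @CESYSGEN ENDIF COREDLL_CORESIOA
"""

print(retarget(SAMPLE, ["gpio.dll"]))
```

Review the result by hand before building, since a block that contains several modules will be moved as a whole.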

 

Results

For this example I modified platform.bib to move the first 15 modules of Table 2 into OEMDRIVERSHIGH, so that the last 6 modules of Table 1 would roll over into slot 0, with a goal of saving 384KB. Here is a snippet from the optimized makeimg.out showing that the target modules of slot 1 are now rolled over into slot 0. Occasionally, though not depicted here, the VM space used in slot 1 by the relocated modules will be smaller than the VM space being rolled over. When that happens you will see additional files in this area also being assigned data and code offsets; each extra module that makes it in saves at least an additional 64KB. We don’t see this here because we rolled over 484KB and replaced it with exactly the same size, which I suspect won’t happen all that often.

 

...

Module MsgStore.dll at offset 01d76000 data, 0208e000 code

Module msim.dll at offset 01d74000 data, 0206c000 code

Module mscoree.dll at offset 01d73000 data, 0205c000 code

Module netcfagl2_0.dll at offset 01d72000 data, 02021000 code

Code space full, fixing up netcfd3dm2_0.dll to ram space

Code space full, fixing up ccoredrv.dll to ram space

Code space full, fixing up ccoreutl.dll to ram space

Code space full, fixing up cspvoice.dll to ram space

Code space full, fixing up simsec.dll to ram space

Code space full, fixing up sms.dll to ram space

Code space full, fixing up supsvcs.dll to ram space

Code space full, fixing up btagsvc.dll to ram space

...

 

Using makeimg.out from a baseline image and the optimized image I got these results.

 

Results from baseline makeimg:

...

Module wendyser.dll at offset 01220000 data

Module drvr007.dll at offset 010e0000 data

Module drvr002.dll at offset 010d0000 data

Module drvr005.dll at offset 010b0000 data

Module drvr001.dll at offset 010a0000 data

Module shellcelog.dll at offset 01090000 data

***** BuildPkg Warnings *****

...

 

Results from optimized OEMDRIVERSHIGH package:

...

Module drvr006.dll at offset 01370000 data

Module wavedev.dll at offset 01260000 data

Module wcestreambt.dll at offset 01250000 data

Module wendyser.dll at offset 01240000 data

Module drvr007.dll at offset 01100000 data

Module shellcelog.dll at offset 010f0000 data

***** BuildPkg Warnings *****

...

The last address assigned by makeimg is for shellcelog.dll: 0x1090000 for the baseline and 0x10F0000 when optimized with OEMDRIVERSHIGH. This is a savings of 384KB.
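
The same comparison can be scripted for any pair of makeimg.out files. This Python sketch assumes the data-only lines have the format shown in the two excerpts above and simply takes the last such line in each log; the function name is my own.

```python
import re

# Matches "Module x.dll at offset 01090000 data" but not the
# "... data, ... code" lines used for slot 1 modules.
DATA_RE = re.compile(
    r"Module (\S+) at offset ([0-9a-fA-F]+) data\s*$", re.M)

def last_data_offset(makeimg_text):
    """Return (module name, offset) of the last data-only module line."""
    entries = DATA_RE.findall(makeimg_text)
    if not entries:
        raise ValueError("no data-only module lines found")
    name, offset = entries[-1]
    return name, int(offset, 16)

BASELINE = """\
Module drvr001.dll at offset 010a0000 data
Module shellcelog.dll at offset 01090000 data
***** BuildPkg Warnings *****
"""

OPTIMIZED = """\
Module drvr007.dll at offset 01100000 data
Module shellcelog.dll at offset 010f0000 data
***** BuildPkg Warnings *****
"""

_, base_off = last_data_offset(BASELINE)
_, opt_off = last_data_offset(OPTIMIZED)
saved_kb = (opt_off - base_off) // 1024
print("VM saved:", saved_kb, "KB")  # 384 KB
```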

 

Summary

The potential amount of VM savings will change from platform to platform. Potential savings can be affected by the sizes of the modules sitting on the slot 1 / 0 threshold and by the libraries configured to be in the device’s modules section. The potential saving also depends on how many small to medium size OEM drivers are available to be relocated. The more you have, the greater the potential for savings.

 

This is not the only way to save VM, nor is this the first step that should be taken to preserve VM. To troubleshoot VM issues a clear understanding of the CE memory model is also needed.

Comments

  • Anonymous
    July 09, 2007
    The comment has been removed

  • Anonymous
    July 10, 2007
    From the looks of your makeimg.out sample these are modules targeted to OEMDRIVERS and are at the threshold for rolling over into slot 0. If most of your modules were already in slot 1 then using OEMDRIVERSHIGH probably won’t save you any VM because they were already in slot 1. The modules listed in the sample makeimg.out will definitely be part of your image whether they end up in slot 1 or slot 0.

  • Anonymous
    July 10, 2007
    Hi Wesbarc: It's an interesting and useful article, thanks! In your reply to Motanis' question, you said 'If most of your modules were already in Slot 1 then using OEMDRIVERSHIGH probably won’t save you any VM because they were already in slot 1.' What does 'most' mean here? Does it mean that just a few files are being fixed up, or is there some other flag that determines it? Thanks in advance.

  • Anonymous
    July 10, 2007
    The comment has been removed

  • Anonymous
    July 11, 2007
    This is a very difficult subject to explain so thank you for asking these questions to help clarify the subject. Your understanding of the process is correct in that we change the order of modules entering slot 1. It sounds to me like the difficulty you are encountering is that before enabling OEMDRIVERSHIGH it is the modules targeted to OEMDRIVERS that are rolling over from slot 1 to slot 0. In this situation you have to modify the documented technique because the modules rolling over and the module being reassigned to OEMDRIVERSHIGH are the same. Some VM savings could come from your effort, but I suspect that if it is OEMDRIVERS modules that are rolling over that you aren’t losing a lot of VM due to slot 0 rollover. In this example slot 0 rollover is consuming 15.4 MB of slot 0 (0x02000000-0x01090000). I suspect yours is much less.

  • Anonymous
    July 11, 2007
    luckytigerwood: You are correct. In this case “most” meant a majority of the OEMDRIVERS modules. This stems from the fact that if, for example, 50% of the OEMDRIVERS modules rollover into slot 0, then relocating modules to OEMDRIVERSHIGH would only serve to push the remaining OEMDRIVERS modules that were in slot 1 into slot 0. By taking into account compaction some VM may be saved.

  • Anonymous
    July 23, 2007
    Hi Wesbarc: Sorry for disturb you again with a question which isn't related with this article, because I really don't know who   can help me to understand this problem, thanks in advance if you can help me on this or recommend me to consult other expert. I do the hopper test with my devices,and I get such result which I never seen before. Build = 17740 (OS 1235)(Hop 2.0.16,6225). Random Seed = 132143. Previous runtime = 1007 mins (16 hrs 47 mins). Ended by: UI Unresponsiveness caused by CPU starvation! Boot count (prev): 1 (0) ACTIONS/min = 107 Total States = 218 ...... ...... ...... STATS: System: RUNTIME = 16 hrs 35 mins; ACTIONS = 106966; ACTIONS/min = 107; Total States = 218 FREE MEMORY: 2400 KB FREE DISK:   54219 KB EndType = UI Unresponsiveness caused by CPU starvation WatchDog: Highest starved priority 251 should ping 618045 ticks ago. Would you please help me on it, thanks!

  • Anonymous
    July 24, 2007
    Have you seen this Wiki page? I like using CEDebugX for system hangs. http://channel9.msdn.com/wiki/default.aspx/CeDeveloper.BSPHopperInvestigation

  • Anonymous
    July 24, 2007
    It's really a good web site, thanks!

  • Anonymous
    July 26, 2007
    Hello: Can I ask you if we can use IMGVMCOMPACT=1 safely? I noticed this definition in public\wpc\oak\files\wpc.sku.xml. It seems it can move many DLLs from MODULES to FILES and thus give us more virtual memory, but I found nothing about it in the help. So I am wondering if we can set IMGVMCOMPACT=1 safely to reserve more VM? Thanks Jun

  • Anonymous
    July 30, 2007
    I was able to dig up a little information about this flag and I posted it here. http://channel9.msdn.com/wiki/default.aspx/CeDeveloper.BSPVMOptimization. I have also filed a doc bug to get MSDN/OEM docs updated appropriately.

  • Anonymous
    August 26, 2007
    Thanks Wesbarc, I used this information to gain 0x50000 bytes by moving in/out the dlls between OEMDrivers/OEMDriversHigh packages. -Jack

  • Anonymous
    August 26, 2007
    Wesbarc, We are eliminating the XIP DLL VM here, is it possible to reduce the heap usage by reducing the Thread Stack by either linker option or CreateThread? It's now set to 64K. 10 threads takes up to 640KB. although it's only 40~100K used. Thanks, -Jack

  • Anonymous
    August 27, 2007
    Have you seen this link? http://channel9.msdn.com/wiki/default.aspx/CeDeveloper.BSPVMOptimization This is an excerpt: “You can also create a heap to allocate from shared memory. Instead of using LocalAlloc for heap allocations from the default process heap, create a heap to allocate from shared memory to preserve process VM. Use CeHeapCreate to create a new heap and define custom allocator/deallocator functions to allocate/deallocate blocks from shared memory. The custom allocator/deallocator functions will be called when the heap size needs to change. Use HeapAlloc to allocate from your heap.” Although this is not a complete answer, hopefully this is enough information to get you going.

  • Anonymous
    September 01, 2008
    Hi wesbarc: I encounter this problem,when compile WM611. [makecif Error:] (-1): Unable to find file D:WM611releaseWPC_CITRINE_ShipWPCPackagesOEMDRIVERSHIGH.cab.pkg. Exiting. please help me! Thank you Bob