"User friendly" explanation of how we cache embedded files
Irina Yatsenko, a OneNote tester, wrote up the following information on how OneNote stores embedded files and how they interact with the OneNote cache. She wrote it for another user but thought it would be useful for everyone, so I am posting it here. If you have any questions, please let us know. And thanks to Irina!
What you see in OneNote:
What you see in the file system:
"classical.one" and "jazz.one" both contain embedded files, but you don't see them in the file system because the files are truly embedded inside the sections. The section sizes reflect this: each section has a single page, but the embedded files make it rather large:
What you see in the cache:
Embedded files are stored separately in OneNoteOfflineCache_Files:
There is also the OneNoteOfflineCache.onecache file, which holds the content of all your sections except the embedded files.
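If you want to see how your own cache splits between these two pieces, here is a minimal Python sketch. The two names, OneNoteOfflineCache.onecache and OneNoteOfflineCache_Files, come from the description above; the `cache_dir` argument and the helper names are assumptions for illustration, and you need to supply the cache location yourself.

```python
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def cache_breakdown(cache_dir: Path) -> dict:
    """Report how a cache folder splits between page content and embedded files.

    cache_dir is a hypothetical argument: the folder that holds both
    OneNoteOfflineCache.onecache and OneNoteOfflineCache_Files.
    """
    onecache = cache_dir / "OneNoteOfflineCache.onecache"
    files_dir = cache_dir / "OneNoteOfflineCache_Files"
    return {
        # Section content without the embedded files
        "onecache_bytes": onecache.stat().st_size if onecache.is_file() else 0,
        # Embedded files, stored separately
        "embedded_bytes": dir_size(files_dir) if files_dir.is_dir() else 0,
    }
```

Calling `cache_breakdown(Path(...))` with your cache folder returns the two byte counts. A large `embedded_bytes` value next to a small `onecache_bytes` value is what you would expect for notebooks with a lot of audio or attachments.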
So, conceptually it looks like this:
What happens when you open or close a notebook?
If you have a folder with some sections (in this example, "studies") and you open it in OneNote as a notebook, OneNote will go through all the sections and cache them in the manner described above, but it will *not* remove the embedded files from the sections stored in the file system.
If you now try to play the audio in OneNote, it relies on the cached external copy of the file being present (because it is much faster and easier to play from this file than from the embedded data). So if you delete the cache folder that holds that copy, OneNote won't be able to play the file. However, if you close the notebook and open it again, OneNote will re-cache everything and all your data should be available again.
What might go wrong and cause size bloat?
Some operations inside OneNote create duplicate copies of embedded files, the most common example being copy-paste of an embedded file. So if you spent a session reorganizing your notes and copy-pasted a lot of embedded files, you would see a spike in cache size. Those copies should be removed by OneNote's garbage collection, which leads us to the next point.
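To check whether duplicates account for a spike, one option is to group the files in OneNoteOfflineCache_Files by content hash. This is only a sketch: it assumes the duplicated copies are byte-for-byte identical, which the description above does not confirm, and `files_dir` and the function names are hypothetical. It only reads files and never deletes anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large audio files are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(files_dir: Path) -> list[list[Path]]:
    """Return groups of files under files_dir that have identical content.

    Assumption: a copy-pasted embedded file shows up as a second file
    with the same bytes. Files with unique content are not reported.
    """
    by_digest = defaultdict(list)
    for p in sorted(files_dir.rglob("*")):
        if p.is_file():
            by_digest[file_digest(p)].append(p)
    return [group for group in by_digest.values() if len(group) > 1]
```

Each returned group lists files with the same content. If the groups are large and stay that way for a long time, that hints that cleanup is not keeping up.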
OneNote has asynchronous garbage collection (GC) logic that, from time to time, removes cached files that are no longer needed. In some cases GC either fails to run at all (e.g. access-denied conditions sometimes happen on Vista for no apparent reason) or fails to collect specific files because it incorrectly concludes they are still needed.
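The two failure modes are easier to see in a toy model. The sketch below is purely conceptual and is not how OneNote implements GC: `sweep`, the `locked` set (standing in for an access-denied condition), and the file names are all made up for illustration.

```python
def sweep(cached, referenced, locked=frozenset()):
    """One toy GC pass over a set of cached file names.

    cached:     names currently in the cache
    referenced: names the collector believes are still needed
    locked:     names that cannot be deleted (models "access denied")
    Returns (remaining, removed, skipped).
    """
    removed, skipped = set(), set()
    for name in cached - referenced:
        if name in locked:
            skipped.add(name)   # collector wanted to delete it but could not
        else:
            removed.add(name)
    return cached - removed, removed, skipped


cached = {"song.wma", "song_copy.wma"}

# Healthy pass: the unreferenced copy is collected.
ok = sweep(cached, referenced={"song.wma"})

# Failure 1: the file is locked, so it is skipped and stays in the cache.
denied = sweep(cached, referenced={"song.wma"}, locked={"song_copy.wma"})

# Failure 2: the collector wrongly thinks the copy is still needed.
stale = sweep(cached, referenced={"song.wma", "song_copy.wma"})
```

In both failure cases the copy stays in the cache, and that is the size bloat described above.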
The surgical solution to this is to close all notebooks (making sure that nothing ended up in the Misplaced virtual notebook) and delete all cached files manually. Then re-open the notebooks and let OneNote cache everything afresh, hoping that GC will jump over whatever hole it previously fell into...
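If you would rather script this step, here is a cautious sketch. Note that it moves the cache aside instead of deleting it, so you can put it back if something goes wrong; that is a deliberate change from the manual delete described above. `cache_dir`, `backup_dir`, and `set_aside_cache` are hypothetical names, the script does not know where your cache lives, and it assumes OneNote is fully closed before you run it.

```python
import shutil
from pathlib import Path

# The two cache pieces named in this post
CACHE_ITEMS = ("OneNoteOfflineCache.onecache", "OneNoteOfflineCache_Files")


def set_aside_cache(cache_dir: Path, backup_dir: Path, dry_run: bool = True) -> list[Path]:
    """Move the cache file and the embedded-files folder into backup_dir.

    Returns the paths that were moved (or, in a dry run, would be moved).
    With dry_run=True (the default) nothing on disk is changed.
    """
    found = [cache_dir / name for name in CACHE_ITEMS if (cache_dir / name).exists()]
    if not dry_run:
        backup_dir.mkdir(parents=True, exist_ok=True)
        for item in found:
            shutil.move(str(item), str(backup_dir / item.name))
    return found
```

Run it once with the default `dry_run=True` to see what would be moved, then again with `dry_run=False`. Once the notebooks have re-opened and re-cached cleanly, the backup folder can be deleted by hand.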
Comments
Anonymous
September 19, 2007
If I may offer a suggestion as to why access denied seems to be happening on Vista for no apparent reason... I've noticed that the Vista indexer tends to lock files while indexing them. You can verify this using Sysinternals Process Explorer: search for handles that have the OneNote cache in their path and you'll see all the processes that have an open handle to this folder.
Anonymous
September 20, 2007
Dan, thanks for sharing. Any chance one of your people has a detailed description of the concepts behind syncing? I've done some deep digging, if you will, into the binary files and think I have a handle on the basics, but it'd be cool to see it described by the ON team. Evan
Anonymous
November 05, 2007
Where do you find the cache files, exactly?
Anonymous
August 26, 2010
Hello, I migrated from OneNote 2007 to 2010, but one of my remote notebooks wasn't available when I did the migration. I can still see the OneNoteOfflineCache.onecache from 2007. How do I open it to get some data out of it? Thanks, Wil