SharePoint 2013 Search enable embedded documents indexing – or a good reason to install Nov 2014 CU #1

Are your users asking, “Where is my embedded content when I search for it in SP2013?”

By default, SP2013 skips embedded objects when it processes Office documents; this seems to be switched off by design to save index space. Before the Nov 2014 CU, there was no easy way to index and search the embedded content of your Office documents. That changed when this CU updated the iFilter file offfiltx.dll to build 15.

Below is a simple process I put together to enable indexing of embedded content in PowerPoint files. The same steps will work for any Office document format (docx, xlsx, etc.).

  **Please note: enabling this will grow your index, because more content is indexed for every document that contains embedded objects. There will also be a slight overhead at processing time on your content processing components; see below for more info on that.

  **Update: PDF objects embedded in Office documents are not parsed and indexed in this version.

 

 

Process to enable embedded docs for .pptx files:

 

  • In the SharePoint Management Shell, set $ssa to your Search Service Application, then run this Set cmdlet:
    • Set-SPEnterpriseSearchFileFormatState -SearchApplication $ssa -Identity pptx -Enable $true -UseIFilter $true

  • Confirm with PowerShell command:
    • Get-SPEnterpriseSearchFileFormat -SearchApplication $ssa -Identity pptx

  • Stop and start the “SharePoint Search Host Controller” service:
    • net stop SPSearchHostController
    • net start SPSearchHostController

  • Crawl docs
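
For convenience, the steps above can be strung together into one script. This is only a sketch under a few assumptions of mine: the farm has a single Search Service Application, the documents sit in a content source with the default name "Local SharePoint sites", and a full crawl is acceptable (I use one here so that already-indexed files get reprocessed; on a large farm you may prefer something lighter). Adjust names to your environment before running it.

```powershell
# Sketch only: run in an elevated SharePoint 2013 Management Shell on a search server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: there is exactly one Search Service Application in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Enable the pptx format and tell the parser to use the IFilter for it.
Set-SPEnterpriseSearchFileFormatState -SearchApplication $ssa -Identity pptx -Enable $true -UseIFilter $true

# Confirm the resulting state of the pptx format.
Get-SPEnterpriseSearchFileFormat -SearchApplication $ssa -Identity pptx | Format-List *

# Same effect as net stop / net start on the host controller service.
Restart-Service -Name SPSearchHostController

# Assumption: the content source still has its default name.
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity 'Local SharePoint sites'
$cs.StartFullCrawl()
```

To cover other formats, repeat the Set cmdlet with a different -Identity value such as docx or xlsx.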

Ex: .pptx slide with embedded Word .docx

  • Search

*Bacon is indeed good

 

 

Some more technical info:

The SP2013 reg key and value for .pptx:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\15.0\Search\Setup\ContentIndexCommon\Filters\Extension
DDFE337F-4987-4ec8-BDE3-133FA63D5D85

Searching the registry for this value you will find:

HKEY_CLASSES_ROOT\CLSID\{DDFE337F-4987-4EC8-BDE3-133FA63D5D85}\InprocServer32

This key points to the filter DLL at path:
C:\Program Files\Common Files\Microsoft Shared\Filters\offfiltx.dll
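
If you would rather not click through regedit, the same lookup can be sketched in PowerShell. Note my assumptions: I am guessing that each extension is stored as a subkey (here `.pptx`) under the Extension key with the CLSID as its default value, and that InprocServer32 holds the DLL path as its default value, as is usual for COM registrations. The variable names are just for illustration.

```powershell
# Sketch only: resolve the filter DLL registered for .pptx and show its version.
$base = 'HKLM:\SOFTWARE\Microsoft\Office Server\15.0\Search\Setup\ContentIndexCommon\Filters\Extension'

# Assumption: the extension is a subkey whose default value is the filter CLSID.
$extKey = Join-Path $base '.pptx'
$clsid  = (Get-Item -LiteralPath $extKey).GetValue('')

# Normalize the CLSID so it works with or without surrounding braces.
$clsid  = $clsid.Trim('{}')

# Look up the COM registration and read the DLL path from its default value.
$inproc  = 'Registry::HKEY_CLASSES_ROOT\CLSID\{' + $clsid + '}\InprocServer32'
$dllPath = (Get-Item -LiteralPath $inproc).GetValue('')

# Show the file version, to compare before and after installing the CU.
(Get-Item -LiteralPath $dllPath).VersionInfo |
    Select-Object FileName, FileVersion, ProductVersion
```

Running this before and after the CU install should make it easy to see whether the parser build changed on a given server.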

New parser build:

With Procmon, we can now see the ParserServer.exe process load that DLL image.

This could potentially lead to some resource overhead on your content processing components (CPCs).

Procmon with filter Path contains filtx.dll:
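
If Procmon is not at hand, a rough equivalent is to ask each running process which modules it has loaded. This is only a sketch: it needs an elevated shell, it shows a point-in-time snapshot rather than the load event itself, and the parser process may only have the DLL loaded while it is actually working on documents.

```powershell
# Sketch only: list processes that currently have a *filtx.dll module loaded.
foreach ($proc in Get-Process) {
    try {
        # Module lists of protected processes may not be readable.
        $mods = @($proc.Modules)
    } catch {
        $mods = @()
    }

    foreach ($m in $mods) {
        if ($m -and $m.ModuleName -like '*filtx.dll') {
            '{0} (PID {1}) loaded {2}' -f $proc.ProcessName, $proc.Id, $m.FileName
        }
    }
}
```

Run it while a crawl is processing Office documents to have the best chance of catching the DLL in memory.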

Comments

  • Anonymous
    May 26, 2015
    Hello, I'm interested in this subject. We're using SP2010 and we would like to know if it is possible to control indexing of embedded objects. More precisely, we would like to prevent indexing of embedded objects. Do you have knowledge of a parameter for this ? Thanks, Sebastien