Crawl takes indefinitely long to complete

Came across this a couple of times and, unfortunately, there is no easy way out of it.

Here is the behavior description:

Crawling your site, whether full or incremental, will not finish, even after a long time.

Attempts to query the SQL server’s tempdb will most likely fail because a lock cannot be obtained on the database.

The last item appearing in the crawl log was registered more than 2 or 3 hours ago; still, the crawl status is “Crawling Incremental” or “Crawling Full”.

There are no mssdmn.exe processes running on the indexer server, and the memory occupied by mssearch.exe ranges in the area of 150–200 MB.

You try to restart the Office SharePoint Server Search service using services.msc, and the service times out while stopping.

You can try to stop or pause the crawl using the UI, but the page will time out; upon refresh, the content source still shows as crawling.

A fix for this behavior was already published in the August Cumulative Update:

KB 956056   

Description of the SharePoint Server 2007 hotfix package (Coreserver.msp): August 26, 2008  https://support.microsoft.com/default.aspx?scid=kb;EN-US;956056

(A full crawl may take several weeks to be completed. Additionally, the crawl may stop responding, and you cannot stop it or cancel it.)

If you have already installed the said fix and still experience this behavior, you are most likely in our scenario.

What could possibly cause this, you ask?

Well, you can run into this behavior if one or more lists (document libraries) on your site lack a default view.

This means that the respective list’s default URL, instead of pointing to https://site/subsite/list/forms/allitems.aspx, points to https://site/subsite/. The effect is that all the anchors for the items in the respective list(s) are built starting from the root of the site. Depending on the number of lists and the number of items in each list, this leads to an immense number of rows in tempdb while the server attempts to index the list items, referencing them recursively from the root of the site. The direct effect is a dramatic decrease in tempdb performance, which stalls the crawl process.
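To get a feel for the scale involved, here is a small, self-contained back-of-the-envelope sketch. All numbers (list count, item count, path depths) are hypothetical, chosen only to illustrate how the anchor-row count multiplies when items are referenced from the site root instead of from the list itself:

```csharp
using System;

class AnchorRowEstimate
{
    static void Main()
    {
        // Hypothetical farm numbers, for illustration only:
        int lists = 50;            // lists missing a default view
        int itemsPerList = 10000;  // items per affected list
        int depthFromList = 1;     // normal case: anchor resolved relative to the list
        int depthFromRoot = 6;     // broken case: resolved recursively from the site root

        // Each extra level in the reference chain adds another round of
        // anchor rows per item, so the row count grows multiplicatively.
        long normalRows = (long)lists * itemsPerList * depthFromList;
        long brokenRows = (long)lists * itemsPerList * depthFromRoot;

        Console.WriteLine("Anchor rows, default view present: " + normalRows); // 500000
        Console.WriteLine("Anchor rows, default view missing: " + brokenRows); // 3000000
    }
}
```

Even with these modest assumed numbers, the broken case produces several times the tempdb traffic of the normal case, which matches the observed symptom of tempdb lock contention.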

This might happen if you deleted the default view by mistake, or created the list through migration tools that did not create all aspects of the list (that is, they just created the SPList object and stuffed the SPListItems into it…).

How to check if you have such lists?

Here is a snippet of code that can help you achieve that:

/*
This source code is freeware and is provided on an "as is" basis without warranties of any kind,
whether express or implied, including without limitation warranties that the code is free of defect,
fit for a particular purpose or non-infringing. The entire risk as to the quality and performance of the code is with the end user.
*/

…………..

    SPWebApplication spwa = SPWebApplication.Lookup(new Uri(args[0]));
    foreach (SPSite osite in spwa.Sites)
    {
        foreach (SPWeb oweb in osite.AllWebs)
        {
            Console.WriteLine(oweb.Url + "==============");
        a:  // this label will serve in case we implement the fix
            foreach (SPList olist in oweb.Lists)
            {
                if (olist.Hidden == false)
                {
                    try
                    {
                        Console.WriteLine("Title: " + olist.Title);
                        if (olist.DefaultView == null)
                        {
                            Console.WriteLine("ERROR: No default list view found");
                        }
                        else
                        {
                            Console.WriteLine("DefaultViewURL: " + olist.DefaultViewUrl);
                            Console.WriteLine("DefaultViewTitle: " + olist.DefaultView.Title);
                        }
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine("ERROR------\n" + e.Message + "\n");
                    }
                }
            }
            Console.WriteLine("=========");
            oweb.Dispose();
        }
        osite.Dispose();
    }

…………...

If the code reports some of the lists as having a null default view, the only way to fix it (apart from deleting the document library and recreating it) is through the object model:

add the following lines to the code, inside the branch that prints the error (note that StringCollection also requires a using System.Collections.Specialized; directive):

    {
        Console.WriteLine("ERROR: No default list view found");

        SPView spView = olist.GetUncustomizedViewByBaseViewId(1); // 1 = All Documents base template
        StringCollection viewColl = spView.ViewFields.ToStringCollection();
        olist.Views.Add("All Documents", viewColl, spView.Query, spView.RowLimit, spView.Paged, true);

        Console.WriteLine("ERROR: fixed");

        goto a; // the collection was modified, so restart the enumeration
    }

For the currently running crawl, there is no easy way to stop it: you either have to wait until it eventually finishes (because, as said, it is not stalled, just extremely slow) or forfeit the existing index and stop the Office Server Search service on the indexer.

NOTE: Stopping the Office Server Search service on the indexer will result in losing the index and having to re-crawl all the content.
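If you do decide to forfeit the index, one way to stop the service from the command line is via stsadm. This is only a sketch: it assumes the MOSS 2007 osearch operation with an -action parameter, so verify the exact syntax on your build first before running anything.

```shell
REM Check the exact syntax supported by your build first:
stsadm -help osearch

REM Stop the Office SharePoint Server Search service on the indexer.
REM WARNING: this discards the existing content index; a full crawl
REM of all content sources is required afterwards.
stsadm -o osearch -action stop
```

After restarting the service and reprovisioning, fix the lists with missing default views before starting the new full crawl, or the same stall will reoccur.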

Comments


  • Anonymous
    January 01, 2003
    For posterity-sake - PowerShell version (comes in handy with MOSS 2007)...

        [void] [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
        $SPSite = New-Object Microsoft.SharePoint.SPSite("http://site/")
        foreach ($oweb in $SPSite.AllWebs){
            write-host "Checking" : "$($oweb.Url)"
            foreach ($olist in $oweb.Lists){
                if(!($olist.Hidden)){
                    try{
                        if($olist.DefaultView -eq $null){
                            $olist | select Title, ItemCount, Author, LastItemModifiedDate, ParentWebUrl
                        }
                    } catch { Write-Verbose "Unable to load $olist.DefaultViewUrl" }
                }
            }
            $oweb.Dispose()
        }

  • Anonymous
    January 01, 2003
    The code above is a snippet from the program, not the whole code; the program should start with the using clauses: using System; using System.Collections.Generic; using System.Text; using Microsoft.SharePoint;


  • Anonymous
    January 01, 2003
    Hi Chris, actually, except for the header of the program, this is the whole source code. Just create a new C# console application in Visual Studio (Express) www.microsoft.com/.../express and copy-paste the code into the Main method. Do not forget to add the references to Microsoft.SharePoint and Microsoft.SharePoint.Administration. Also keep in mind that this will not resolve the problem; it can only help you identify the cause of the issue so that, after resetting the index, it does not reoccur.


  • Anonymous
    June 03, 2010
    Thanks!  That really helped me out.

  • Anonymous
    June 04, 2010
    Hi Victor - thanks for the blog post. I'm not a developer, but my understanding is that I would need to use a SharePoint 2007 developer environment (either a physical server or VM with SharePoint and dev tools installed) to build and compile your code. Or, can I build this on my Windows 7 referencing the Microsoft.Sharepoint.dll through a UNC path?


  • Anonymous
    October 20, 2011
    Any possibility of getting the full code file?  As a new SharePoint SysAdmin I do not have a firm grasp on some of the more low level operations.