SharePoint 2013 Search Ranking and Relevancy Part 1: Let’s compare to FS14

12/05/2013

I’m very happy to do some “guest” blogging for my good friend Leo and continue diving into various search-related topics. In this and upcoming posts, I’d like to jump right into something that interests me very much, and that is taking a look at what makes some documents more relevant than others as well as what factors influence rank score calculations.

Since SharePoint 2013 is already out, I’d like to touch upon a question that comes up often when someone is considering moving from FAST ESP or FAST for SharePoint 2010 (FS14) to SharePoint 2013: “So how are rank scores calculated in SharePoint 2013 Search as opposed to previous FAST versions?”

In upcoming posts, I will go deeper into the internals of the current SharePoint 2013 ranking model and introduce the basics of relevancy calculation that apply across many search engines and are not specific to FAST or SharePoint Search.

There are some excellent blog posts out there that go in depth on how SharePoint 2013 Search rank models work, including the ones below from Alexey Kozhemiakin and Mikael Svenson.

https://powersearching.wordpress.com/2013/03/29/how-sharepoint-2013-ranking-models-work/

https://techmikael.blogspot.com/2013/04/rank-models-in-2013main-differences.html

To avoid being repetitive, I’ve tried to create an easy-to-read comparison chart of the factors that influence rank calculations in FS14 and in SharePoint 2013 Search. I may update this chart in the future to include FAST ESP, although the main factors in ESP and FS14 are fairly similar to each other, whereas SharePoint 2013 Search is more closely related to the SharePoint 2010 Search model.

One of the main differences is that SharePoint 2013 Search uses a two-stage process for rank calculations: a linear ranking model as the first stage and a neural network as the second stage. The first stage is “light”, so we can afford to apply it to all documents in a result set; the rank features that belong to this stage are applied to every document. The top 1000 documents (candidates) by Stage 1 rank are the input to Stage 2. This stage is more performance-intensive and recomputes the rank score for its input documents, which is why it is only applied to a limited set. It consists of all the same rank features as Stage 1 plus 4 additional proximity features.
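
To make the two-stage flow concrete, here is a minimal Python sketch. It is not how SharePoint implements ranking: the `Doc` class, the feature names, and the weights are all made up for illustration, and the second stage is a plain scoring function rather than a real neural network. It also assumes that re-ranked candidates stay ahead of the documents that never reached Stage 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Doc:
    doc_id: str
    features: Dict[str, float]  # hypothetical feature name -> value


def linear_score(doc: Doc, weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; features missing from the doc count as 0."""
    return sum(w * doc.features.get(name, 0.0) for name, w in weights.items())


def two_stage_rank(
    docs: List[Doc],
    stage1_weights: Dict[str, float],
    stage2_scorer: Callable[[Doc], float],
    top_n: int = 1000,
) -> List[Doc]:
    # Stage 1: cheap linear score, applied to every document in the result set.
    by_stage1 = sorted(
        docs, key=lambda d: linear_score(d, stage1_weights), reverse=True
    )
    # Only the best top_n candidates go on to the more expensive stage.
    candidates, rest = by_stage1[:top_n], by_stage1[top_n:]
    # Stage 2: recompute the score, but only for the candidates.
    reranked = sorted(candidates, key=stage2_scorer, reverse=True)
    # Assumption: re-ranked candidates stay ahead of the rest (kept in Stage 1 order).
    return reranked + rest


if __name__ == "__main__":
    stage1 = {"title_match": 2.0, "body_match": 1.0}  # made-up weights
    stage2 = {**stage1, "proximity": 3.0}             # Stage 1 features plus proximity
    docs = [
        Doc("a", {"title_match": 1.0}),
        Doc("b", {"body_match": 1.0, "proximity": 1.0}),
        Doc("c", {"body_match": 0.5, "proximity": 1.0}),
    ]
    ranked = two_stage_rank(docs, stage1, lambda d: linear_score(d, stage2), top_n=2)
    print([d.doc_id for d in ranked])  # ['b', 'a', 'c']
```

In this toy run, document "c" has a strong proximity value but never gets re-scored, because it did not make the Stage 1 cut. That is the trade-off of only running the expensive stage on a limited candidate set.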

For my comparison below, I mainly used a model called “Search Ranking Model with Two Linear Stages”, which has been available since the August 2013 CU. This model is the recommended template for creating custom rank models, as it provides you with proximity without a neural network.

Rank Factor	FS14	SP2013 Search
Rank Models	1 OOTB rank model	16 Rank Models
Freshness	Available OOTB and customizable	Not available OOTB, but can be configured
Dynamic Ranking (field weighting/managed properties)	Context Boost: Title, DocSubject, Keywords, DocKeywords, urlkeywords, Description, Author, CreatedBy, ModifiedBy, MetadataAuthor, WorkEmail, Body, crawledpropertiescontent	Document managed properties + usage/social data: Title, QLogClickedText, SocialTag, Filename, Author, AnchorText, Body
FileType	Field-Boost weight/Managed Property Boost (OOTB -4000 points). Format: Unknown Format, XML, XLS. FileExtension: CSV, TXT, MSG, OFT, ZIP, VSD, RTF. IsEmptyList, IsListItem	FileType rank feature: PPT, SharePoint site, DOC, HTML, ListItems, Image, Message, XLS, TXT
Language	N/A	Dynamic Rank (query-based). The LCID, i.e. locale ID, is used.
Social Distance	N/A	Static Rank (colleague relationship to the person issuing the query). Bucket 0: no colleague relationship. Bucket 1: first-level (direct) relationship. Bucket 2: second-level (indirect) relationship.
Static Rank Boost (Query-Independent)	Quality Weight components: hwboost, docrank, siterank, urldepthrank. Authority Weight: Partial and Complete	Now part of the Analytics Processing Component. Static rank features calculated with Search and Usage Analytics: QLogClicks, QLogSkips, QLogLastClicks, EventRate
Proximity	Enabled by default	MinSpan (neural network 2nd stage; parameters for proximity minimal span)
Anchortext (Query-Dependent)	Extnumocc: part of Dynamic Rank calculations; query-time hits in anchor text	AnchortextComplete
URLDepth (Query-Dependent)	N/A. In FS14, this was a static rank feature.	UrlDepth: depth of the document URL (number of slashes)
Click-Through Weight (Query-Dependent)	Query-Authority weight: click-through weight, dynamic rank	N/A. Now part of the static rank features used in the Analytics Processing Component (QLogClicks, etc.)
Rank Tuning	FS14	SP2013 Search
GUI-based applications. Ease of tuning rank calculations and user-friendliness	N/A. Rank calculations and scores can be seen either via ranklog output or via Codeplex tools such as FS4SP Query Logger. However, there isn’t a user-friendly tool to help you make the changes and push them live, or preferably see them in “Preview” mode offline. A separate ‘spreladmin’ tool is needed for click analysis.	Rank Tuning App (coming soon). A GUI-based, user-friendly way to tune/customize ranking and impact relevancy. Includes a “preview”, i.e. offline, mode.
Rank logging availability	Server-side: Ranklog is available via QRServer output. However, it is only available to admins with local access to QRServer port 13280. Client-side: N/A	Server-side: Rank tuning app/ULS logs. Client-side: ExplainRank template available to clients. https://powersearching.wordpress.com/2013/01/25/explain-rank-in-sharepoint-2013-search/
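
Two of the SP2013 features in the chart, UrlDepth and Social Distance, are simple enough to sketch in a few lines of Python. This is only an illustration of the idea as described in the table, not SharePoint’s actual computation: the function names and the `colleagues` dictionary are hypothetical, and counting only the slashes in the URL path (ignoring the scheme, host, and any trailing slash) is my assumption.

```python
from typing import Dict, Set
from urllib.parse import urlparse


def url_depth(url: str) -> int:
    """Number of slashes in the URL path.

    Assumption: the scheme/host part and a trailing slash are not counted.
    """
    path = urlparse(url).path.rstrip("/")
    return path.count("/")


def social_distance_bucket(
    user: str, author: str, colleagues: Dict[str, Set[str]]
) -> int:
    """Map the relationship between the querying user and the author to a bucket.

    0 = no colleague relationship, 1 = direct colleague,
    2 = colleague of a colleague (indirect).
    `colleagues` is a hypothetical user -> direct colleagues lookup.
    """
    direct = colleagues.get(user, set())
    if author in direct:
        return 1
    if any(author in colleagues.get(c, set()) for c in direct):
        return 2
    return 0


if __name__ == "__main__":
    print(url_depth("http://server/sites/hr/doc.docx"))  # 3
    print(url_depth("http://server/"))                   # 0

    graph = {"ann": {"bob"}, "bob": {"cat"}}
    print(social_distance_bucket("ann", "bob", graph))   # 1
    print(social_distance_bucket("ann", "cat", graph))   # 2
    print(social_distance_bucket("ann", "dan", graph))   # 0
```

Both values are known before any query text is matched against the document, so they are cheap to fold into a rank score as bucketed features.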
