Cloud Hybrid Search Service Application
Cloud Hybrid Search-Overview
Hybrid search makes finding content easy, wherever the content lives. A company has a hybrid environment if their content and applications are spread across on-premises and Office 365. To complement the existing hybrid search solution, which is based on federating search results, a new capability has been introduced in the product. This new hybrid search solution is called cloud hybrid search and is based on indexing on-premises content in Office 365.
While the existing federated search results model of inbound and outbound hybrid search continues to be supported, this new hybrid solution brings numerous new capabilities into the hybrid sphere. The new hybrid solution takes away the complexities related to queries coming in to the SharePoint Server on-premises environment via a reverse proxy. Therefore, a public SSL certificate is no longer required for inbound authentication requests from the Secure Store application in SharePoint Online. The infrastructure requirements for configuring cloud hybrid search are discussed later in this blog. Below is a pictorial representation describing the query and crawl overview in next generation hybrid search.
Cloud hybrid search will be available to SharePoint Online customers in combination with either of the following:
§ As Public Preview in the August 2015 Public Update (PU) for SharePoint Server 2013: https://support.microsoft.com/kb/3055009
§ In the public preview release of SharePoint Server 2016 IT Preview: https://www.microsoft.com/en-us/download/details.aspx?id=48712
You may already be familiar with the federated search results model of inbound and outbound hybrid search, as well as its implementation guidelines. This blog contains a series of articles on how to configure these scenarios:
Office-365-configure-hybrid-search-with-directory-synchronization.aspx
Office-365-configure-hybrid-search-with-directory-synchronization-password-sync-part2.aspx
The-all-new-cloud-search-service-application-coming-to-sharepoint-2013-and-sharepoint-2016.aspx
The focus of this blog is to deploy and develop an understanding of the Cloud Search service application.
The cloud hybrid search solution provides the ability to crawl and parse on-premises content and then process and index it in Office 365. When users query the search index in Office 365, they thus get search results from both on-premises and Office 365 content. The parsed content from on-premises is encrypted while in transit from the on-premises crawler through to the content processing stages in Office 365. We will talk about the specifics of encryption and data transfer later in this content.
The crawling configuration, including that of the Search service application, content sources, crawl rules and so on, is carried out in the on-premises environment. Modifications to search experiences, for example search schema changes, are performed at the Office 365 level. By deploying the cloud hybrid search solution, customers can finally achieve the benefits of a unified index and search experience spanning on-premises and online content in a single result set.
You can join the TechNet support forum for cloud hybrid search here https://social.technet.microsoft.com/Forums/en-us/home?forum=CloudSSA
Note: The cloud Search service application is currently not available for customers outside the regular Office 365 multitenant service, including China data centre customers and Government cloud customers.
The deployment and configuration steps documented here are based on the preview release of the cloud Search service application. As the product moves towards General Availability and the SharePoint 2016 on-premises release to market, the processes may change.
How to configure the cloud Search service application
To configure the cloud Search service application, the steps below need to be performed.
Mandatory Configuration
1. Synchronize users and groups from on-premises to Office 365 Azure Active Directory
2. Create cloud Search service application
3. Install onboarding pre-requisites
4. Execute onboarding script
5. Create on-premises content sources
6. Configure outbound query federation
7. Configure SharePoint Online search vertical
Optional configuration
1. Publish cloud Search service and consume from SharePoint 2010.
2. Customization.
Synchronize users and groups from on-premises to Office 365 Azure Active Directory
A prerequisite for cloud Search service application deployment is to have a common user identity across on-premises and online. This enables search results to be correctly security trimmed against the identity of the user performing the query. The synchronization of user identity is performed using a directory synchronization tool. Below is the list of supported synchronization tools.
To configure directory synchronization for your environment, follow the steps in https://blogs.msdn.com/b/spses/archive/2013/10/22/office-365-configure-hybrid-search-with-directory-synchronization.aspx
Directory synchronization also ensures the correct user properties are available in Office 365 Azure AD to be able to carry out a process known as user rehydration.
Create a cloud Search service application
A cloud Search service application (SSA) cannot be created using the Central Administration SSA creation user interface, because the cloud SSA requires a property setting that the UI-based creation process does not apply. This property is called CloudIndex and must be set to true for a cloud SSA. CloudIndex is a read-only property of any deployed SSA and as such cannot be set after creation. This also means that an existing regular SSA cannot be converted to a cloud SSA.
The property value for an SSA can be checked by executing
(get-spenterprisesearchserviceapplication).cloudindex
The cloud SSA should be created by executing an SSA creation PowerShell script and setting the CloudIndex property to true. Later, when we execute the on-boarding script, another property called IsHybrid is set to 1 for the SSA.
New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true
The CloudIndex property disables the normal ContentPlugin and the IsHybrid property initializes the AzurePlugin so the content gets ready to be pushed to Azure.
The SharePoint Admin is free to use their own script to generate the cloud SSA and scale out as long as they set this property. For convenience we have provided a sample creation script named CreateCloudSSA which generates a single server search instance and sets this property correctly. By running the CreateCloudSSA script, you create a cloud SSA that crawls your on-premises content.
Note: All scripts are provided AS IS without warranty of any kind. Please see the official TechNet guidance for links to supported scripts.
On the server that is running SharePoint Server 2013 or SharePoint Server 2016 Preview: Copy the sample script below and save it as CreateCloudSSA.ps1 and run it.
When prompted, type:
The name of the SharePoint Server search server.
The Search service account in this format: domain\username.
A name of your choice for the cloud SSA.
The name of the SharePoint Farm database server.
Verify that you see a message that the cloud SSA was created successfully.
Sample CreateCloudSSA.ps1
-----------------------------------------------------------------------------------------------------------------------------------------------------
## Gather mandatory parameters ##
## Note: SearchServiceAccount needs to already exist in Windows Active Directory as per TechNet guidelines https://technet.microsoft.com/library/gg502597.aspx ##
Param(
    [Parameter(Mandatory=$true)][string] $SearchServerName,
    [Parameter(Mandatory=$true)][string] $SearchServiceAccount,
    [Parameter(Mandatory=$true)][string] $SearchServiceAppName,
    [Parameter(Mandatory=$true)][string] $DatabaseServerName
)
Add-PSSnapin Microsoft.SharePoint.Powershell -ea 0

## Validate if the supplied account exists in Active Directory and whether supplied as domain\username
if ($SearchServiceAccount.Contains("\")) # if True then domain\username was used
{
    $Account = $SearchServiceAccount.Split("\")
    $Account = $Account[1]
}
else # no domain was specified at account entry
{
    $Account = $SearchServiceAccount
}
$domainRoot = [ADSI]''
$dirSearcher = New-Object System.DirectoryServices.DirectorySearcher($domainRoot)
$dirSearcher.filter = "(&(objectClass=user)(sAMAccountName=$Account))"
$results = $dirSearcher.findall()

if ($results.Count -gt 0) # Test for user not found
{
    Write-Output "Active Directory account $Account exists. Proceeding with configuration"

    ## Validate whether the supplied SearchServiceAccount is a managed account. If not make it one.
    if(Get-SPManagedAccount | ?{$_.username -eq $SearchServiceAccount})
    {
        Write-Output "Managed account $SearchServiceAccount already exists!"
    }
    else
    {
        Write-Output "Managed account does not exist - creating it"
        $ManagedCred = Get-Credential -Message "Please provide the password for $SearchServiceAccount" -UserName $SearchServiceAccount
        try
        {
            New-SPManagedAccount -Credential $ManagedCred
        }
        catch
        {
            Write-Output "Unable to create managed account for $SearchServiceAccount. Please validate user and domain details"
            # return (rather than break) so that only this script stops
            return
        }
    }

    Write-Output "Creating Application Pool"
    $appPoolName=$SearchServiceAppName+"_AppPool"
    $appPool = New-SPServiceApplicationPool -name $appPoolName -account $SearchServiceAccount

    Write-Output "Starting Search Service Instance"
    Start-SPEnterpriseSearchServiceInstance $SearchServerName

    Write-Output "Creating Cloud Search Service Application"
    $searchApp = New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true

    Write-Output "Configuring Admin Component"
    $searchInstance = Get-SPEnterpriseSearchServiceInstance $SearchServerName
    $searchApp | get-SPEnterpriseSearchAdministrationComponent | set-SPEnterpriseSearchAdministrationComponent -SearchServiceInstance $searchInstance
    $admin = ($searchApp | get-SPEnterpriseSearchAdministrationComponent)

    Write-Output "Waiting for the admin component to be initialized"
    $timeoutTime=(Get-Date).AddMinutes(20)
    do {Write-Output .;Start-Sleep 10;} while ((-not $admin.Initialized) -and ($timeoutTime -ge (Get-Date)))
    if (-not $admin.Initialized) { throw 'Admin Component could not be initialized'}

    Write-Output "Inspecting Cloud Search Service Application"
    $searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName

    Write-Output "Setting IsHybrid Property to 1"
    $searchApp.SetProperty("IsHybrid",1)

    # Output some key properties of the Search Service Application
    Write-Host "Search Service Properties"
    Write-Host "Hybrid Cloud SSA Name : " $searchApp.Name
    Write-Host "Hybrid Cloud SSA Status : " $searchApp.Status
    Write-Host "Cloud Index Enabled : " $searchApp.CloudIndex

    Write-Output "Configuring Search Topology"
    $searchApp = Get-SPEnterpriseSearchServiceApplication $SearchServiceAppName
    $topology = $searchApp.ActiveTopology.Clone()
    $oldComponents = @($topology.GetComponents())
    if (@($oldComponents | ? { $_.GetType().Name -eq "AdminComponent" }).Length -eq 0)
    {
        $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AdminComponent $SearchServerName))
    }
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.CrawlComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.ContentProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.AnalyticsProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.QueryProcessingComponent $SearchServerName))
    $topology.AddComponent((New-Object Microsoft.Office.Server.Search.Administration.Topology.IndexComponent $SearchServerName,0))
    # Remove the components of the original topology, keeping only the admin component
    $oldComponents | ? { $_.GetType().Name -ne "AdminComponent" } | foreach { $topology.RemoveComponent($_) }

    Write-Output "Activating topology"
    $topology.Activate()
    $timeoutTime=(Get-Date).AddMinutes(20)
    do {Write-Output .;Start-Sleep 10;} while (($searchApp.GetTopology($topology.TopologyId).State -ne "Active") -and ($timeoutTime -ge (Get-Date)))
    if ($searchApp.GetTopology($topology.TopologyId).State -ne "Active") { throw 'Could not activate the search topology'}

    Write-Output "Creating Proxy"
    $searchAppProxy = new-spenterprisesearchserviceapplicationproxy -name ($SearchServiceAppName+"_proxy") -SearchApplication $searchApp

    Write-Output " Cloud hybrid search service application provisioning completed successfully."
}
else # The account must exist so we can proceed with the script
{
    Write-Output "Account supplied for Search Service does not exist in Active Directory."
    Write-Output "Script is quitting. Please create the account and run again."
    return
} # End Else
Successful execution of the CreateCloudSSA.ps1 script should show something similar to the printscreen below
Validate the search topology and cloud index parameter by executing the following script
Add-PSSnapin Microsoft.SharePoint.Powershell
$ssa = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchTopology -Active -SearchApplication $ssa
Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text |ft Name, state,Partition,Host -AutoSize
$ssa.CloudIndex
The expected output is a list of Search Service Application components showing status online and the CloudIndex property of the Search Service Application showing true
Note: Only one cloud SSA is allowed per farm; however, you can have multiple non-cloud SSAs in the same SharePoint 2013 or 2016 farm. To scale out the topology, follow the steps in "Change from the default search topology to a small enterprise topology".
Install hybrid onboarding prerequisites
In order for you to successfully execute the on-boarding script, the following prerequisites must be installed on the SharePoint Server where the script is executed.
The following script can be used to validate and, if required, install the prerequisites. Note, however, that the prerequisites must be downloaded prior to executing the script.
Create a new folder on a server in the SharePoint Farm e.g. c:\scripts and save the script file to that location. Additionally, download the two prerequisites and save in the same folder. The two prerequisites can be downloaded from:
Microsoft online sign in assistant - https://www.microsoft.com/en-us/download/details.aspx?id=39267
Microsoft Azure AD PowerShell - https://go.microsoft.com/fwlink/p/?linkid=236297
# This script installs the two required prerequisites:
# AdministrationConfig-EN.msi and msoidcli_64.msi.
# It is assumed that these are available in the same folder as the script itself.
# See the following links for downloading manually:
# - https://www.microsoft.com/en-us/download/details.aspx?id=39267
# - https://go.microsoft.com/fwlink/p/?linkid=236297
#
function Install-MSI {
    param(
        [Parameter(Mandatory=$true)]
        [ValidateNotNullOrEmpty()]
        [String] $path
    )
    # Quote the path so that folders containing spaces are handled correctly
    $parameters = "/qn /i `"$path`""
    $installStatement = [System.Diagnostics.Process]::Start( "msiexec", $parameters )
    $installStatement.WaitForExit()
}

$scriptFolder = Split-Path $script:MyInvocation.MyCommand.Path

# The sign-in assistant registers this key when it is installed
$MSOIdCRLRegKey = Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\MSOIdentityCRL" -ErrorAction SilentlyContinue
if ($MSOIdCRLRegKey -eq $null)
{
    Write-Host "Installing Office Single Sign On Assistant" -Foreground Yellow
    Install-MSI ($scriptFolder + "\msoidcli_64.msi")
    Write-Host "Successfully installed!" -Foreground Green
}
else
{
    Write-Host "Office Single Sign On Assistant is already installed." -Foreground Green
}

# The AAD PowerShell module registers this key when it is installed
$MSOLPSRegKey = Get-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\MSOnlinePowershell" -ErrorAction SilentlyContinue
if ($MSOLPSRegKey -eq $null)
{
    Write-Host "Installing AAD PowerShell" -Foreground Yellow
    Install-MSI ($scriptFolder + "\AdministrationConfig-EN.msi")
    Write-Host "Successfully installed!" -Foreground Green
}
else
{
    Write-Host "AAD PowerShell is already installed." -Foreground Green
}
------------------------------------------------------------------------------
Successful execution of the prerequisite installer shows the following output
The script installs both components in silent mode. To validate that the components were successfully installed, you can execute the following PowerShell:
To confirm that the Microsoft Online Single Sign On Assistant is successfully installed
(get-service msoidsvc).Status
A status of Running indicates success.
To confirm that Azure AD PowerShell is successfully installed, import the msonline modules
import-module msonline
import-module msonlineextended
A successful import results in no error message
If you wish to double check the import, use the -Verbose switch to show the cmdlets being successfully loaded
import-module msonline -verbose
import-module msonlineextended -verbose
Both Office Single sign-on assistant and AAD PowerShell MUST succeed before continuing with the onboarding process.
Note: If you try to proceed to onboarding without these components installed the onboarding script will recognize this state and exit gracefully.
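The two manual checks above can be combined into one helper. The following is a minimal sketch only: Test-OnboardingPrerequisite is a hypothetical function name (it is not part of the onboarding script), and it assumes that the sign-in assistant runs as the msoidsvc service and that the Azure AD module is discoverable under the name MSOnline, as in the commands shown earlier.

```powershell
# Sketch: report whether both onboarding prerequisites appear to be present.
# Test-OnboardingPrerequisite is a hypothetical helper name used for illustration.
function Test-OnboardingPrerequisite {
    # The sign-in assistant is checked through its Windows service, msoidsvc.
    $service = $null
    try { $service = Get-Service -Name msoidsvc -ErrorAction Stop } catch { }

    # The Azure AD PowerShell module is looked up by name without importing it.
    $module = $null
    try { $module = Get-Module -ListAvailable -Name MSOnline -ErrorAction Stop } catch { }

    [pscustomobject]@{
        SignInAssistantRunning  = [bool]($service -and $service.Status -eq 'Running')
        MSOnlineModuleAvailable = [bool]$module
    }
}
```

Calling Test-OnboardingPrerequisite returns one object with two Boolean properties; both should be True before you continue to onboarding.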
On-boarding process
The next stage of configuring the on-premises farm to be able to index into the Office 365 search index is called onboarding. Onboarding consists of several stages and is implemented by executing the OnBoardHybrid-Search.ps1 PowerShell script and supplying parameters for the Tenant Url and optionally the cloud SSA name.
Note: There is a process underway to add the onboarding stage to the Office 365 Hybrid Picker as an HRC based application. This will negate the need for the PowerShell script.
On-boarding stages
There are four key stages in the onboarding process
Get-HybridSSA
This stage validates that the cloud SSA name supplied as a parameter to the script execution is valid. If multiple SSAs are found and no parameter is supplied, then it attempts to validate an SSA that has the IsHybrid property set to 1.
Prepare-Environment
This stage checks that the prerequisites for deployment are installed. It checks for the Microsoft Online Single Sign-on Assistant and for Windows Azure PowerShell. If either of these tools is missing, the script exits with a prompt to install them.
Connect-SPFarmToAAD
This stage completes the OAuth trust configuration with Azure Access Control services (ACS) and deploys the ACS Proxy. Additionally, it deploys a new SPO connection proxy to enable the farm to communicate with the external endpoint of the cloud search service.
Add-ServicePrincipal
The penultimate stage adds the O365 Service Principal ID to the local farm and sets the correct Service Principal Name in Azure AD for the on-premises url. This ensures outbound query federation can succeed between the O365 tenant and the on-premises farm.
The process completes by setting some additional parameters on the cloud SSA.
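The selection logic described for the Get-HybridSSA stage can be sketched as follows. This is an illustration and not the code of the onboarding script: Select-HybridSsa is a hypothetical function name, and the sketch assumes that the IsHybrid value written with SetProperty can be read back with a GetProperty method on the SSA object.

```powershell
# Sketch: pick the SSA to onboard, by name if one is given, otherwise by IsHybrid = 1.
# Select-HybridSsa is a hypothetical name; GetProperty('IsHybrid') is an assumption.
function Select-HybridSsa {
    param(
        [Parameter(Mandatory=$true)][object[]] $SearchApplications,
        [string] $Name
    )
    if ($Name) {
        # An explicit name wins: return the SSA with that exact name, if any.
        return $SearchApplications | Where-Object { $_.Name -eq $Name } | Select-Object -First 1
    }
    # No name supplied: fall back to the first SSA whose IsHybrid property is 1.
    return $SearchApplications |
        Where-Object { try { $_.GetProperty('IsHybrid') -eq 1 } catch { $false } } |
        Select-Object -First 1
}

# Usage on a SharePoint server (not executed here):
#   Select-HybridSsa -SearchApplications @(Get-SPEnterpriseSearchServiceApplication)
```

The function returns nothing when no SSA matches, which a caller can treat as a validation failure.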
Important: Before executing the Onboard-Hybridsearch.ps1 script it is advised that you close all open PowerShell Sessions including PowerShell ISE and PowerShell cmd windows.
The OnboardHybrid-Search.ps1 script can be downloaded along with documentation from the Microsoft Connect Site – https://connect.microsoft.com/office/Downloads/DownloadDetails.aspx?DownloadID=58777
If you do not have access to the cloud Hybrid Search preview program, you can request access via this link: https://connect.microsoft.com/office/SelfNomination.aspx?ProgramID=8647&pageType=1. We highly recommend checking the link for the latest version of the script prior to execution.
Output from executing the OnboardHybrid-Search.ps1 looks similar to the following
When the message Connecting to O365 appears, you will be prompted to sign in using a tenant global admin account:
The next stage of execution shows the connection to the ACS endpoint. If, as in the printscreen below, a message states that the Signing Credential already exists, the script may have been run before and the Service Principal Names (SPNs) are already in place. This is a safe operation.
You may on occasion see the output matching the screenprint below. This is indicative of a connection issue with ACS and the script will retry for four minutes before exiting.
It is safe to rerun the script if you received the exception, however at this point we strongly recommend you begin a new PowerShell session and close all existing PowerShell sessions.
If connection is successful, the script will proceed and you should be presented with an output matching the following printscreen
In this printscreen you can see the TenantID, Authentication Realm and the connected endpoint address.
At this point on-boarding is completed.
Configure content sources on-premises
Content sources in the cloud SSA are no different from content sources in a regular enterprise SSA. Follow the steps in https://technet.microsoft.com/en-us/library/jj219808.aspx to create and manage content sources in the cloud SSA.
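If you prefer PowerShell to Central Administration, a content source can also be created with the standard search cmdlets. The following is a minimal sketch under stated assumptions: New-CloudSsaContentSource is a hypothetical wrapper name, and the values in the usage comment (SSA name, content source name and start address) are examples that you must replace with your own.

```powershell
# Sketch: create a SharePoint content source in the cloud SSA and start a full crawl.
# New-CloudSsaContentSource is a hypothetical wrapper name used for illustration.
function New-CloudSsaContentSource {
    param(
        [Parameter(Mandatory=$true)][string] $SearchServiceAppName,
        [Parameter(Mandatory=$true)][string] $ContentSourceName,
        [Parameter(Mandatory=$true)][string] $StartAddress
    )
    Add-PSSnapin Microsoft.SharePoint.Powershell -ErrorAction SilentlyContinue

    # Look up the cloud SSA by the name chosen when it was created.
    $ssa = Get-SPEnterpriseSearchServiceApplication -Identity $SearchServiceAppName

    # Create the content source exactly as you would for a regular SSA.
    $contentSource = New-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
        -Type SharePoint -Name $ContentSourceName -StartAddresses $StartAddress

    # Kick off the first full crawl.
    $contentSource.StartFullCrawl()
    return $contentSource
}

# Usage on a SharePoint server (example values, not executed here):
#   New-CloudSsaContentSource -SearchServiceAppName "CloudSSA" -ContentSourceName "SharePoint2013" -StartAddress "https://sp13"
```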
Crawl SharePoint 2013
The next step after on-boarding to the service and creating the on-premises content source(s) is to execute the first full crawl. The information below is from an incremental crawl, to reduce the amount of information and show the pertinent data for crawling a new item and completing the indexing to the cloud.
Let us take a look at the ULS logs when we crawl a SharePoint 2013 farm from the cloud SSA.
The Web Application we will crawl is https://sp13
Uploaded a text file - sampledoc.txt -> https://sp13/Shared%20Documents/sampledoc.txt .
Turned on verbose logging and below is what we see.
-----------------------------------------------------------------------------------------------------------------------------------------------------
Incremental Crawl started here and the CrawlID is 129
08/18/2015 20:12:57.93 w3wp.exe (0x1FE4) 0x050C SharePoint Server Search Admin Audit 97 Information An incremental crawl was started on 'SharePoint2013' by OnPremDomain\ServiceAcct e449259d-e4f8-30ff-cbe7-e91005bdfcf5
08/18/2015 20:12:58.13 mssearch.exe (0x1194) 0x1270 SharePoint Server Search Crawler:Gatherer Plugin e5ey Verbose CGatherer::LoadCrawls New Crawl Requested Component 91a6119d-38c1-4d49-be29-4cb689a77288-crawl-0, CrawlID 129, Project 1 [gatherobj.cxx:6758] search\native\gather\server\gatherobj.cxx
Using web service calls to sitedata.asmx, we gather the changes on the site from previous crawl: We can see the Item we added.
08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint aiwts Verbose GetChanges CorrelationID e749259d-b72a-e0d1-f02c-0ee137b3e39c Url https://sp13/_vti_bin/sitedata.asmx Search RequestDuration 382 SPIISLatency 0 SP RequestDuration 375
08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint allng VerboseEx GetChanges received WS response
<GetChangesResult>
<SPContentDatabase Change="Unchanged" ItemCount="1">
<SPSite Change="Unchanged" ItemCount="1" Id="{1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}">
<Messages />
<SPWeb Change="Unchanged" ItemCount="1">
<SPList Change="Unchanged" ItemCount="1">
<SPListItem Change="Add" ItemCount="0" UpdateSecurity="False" UpdateSecurityScope="False" Id="{4c464428-78c5-4373-847f-64240a2eb83a}" ParentId="{e360f42c-b82c-49c7-b8e8-9338639c92b4}_" InternalUrl="/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2" DisplayUrl="/Shared Documents/sampledoc.txt" ServerUrl="https://sp13" CRC="0" SequenceNumber="246" Url="sampledoc.txt">
</SPListItem>
</SPList>
</SPWeb>
</SPSite>
</SPContentDatabase>
<MoreChanges>False</MoreChanges>
<StartChangeId>1;0;e2453c4e-184a-4214-ab08-c4ebde2b8513;635755248354630000;8484</StartChangeId>
<EndChangeId>1;0;e2453c4e-184a-4214-ab08-c4ebde2b8513;635755248354630000;8484</EndChangeId>
</GetChangesResult>
This item is picked up and crawled next. Per crawl logs, this DocID is 718 :
Corresponding ULS data :
08/18/2015 20:13:05.21 mssdmn.exe (0x2730) 0x2F18 SharePoint Server Search Connectors:SharePoint dv9u VerboseEx Emit change log crawl link sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2, DisplayURL = https://sp13/Shared Documents/sampledoc.txt , url=sts4://sp13/contentdbid={e2453c4e-184a-4214-ab08-c4ebde2b8513} [sts3filt.cxx:4644]
Once MSSDmn picks up the crawled properties for the document, we can see these being sent to AzurePlugin like below :
Filter on:
Category - Crawler:Azure Plugin
Correlation - fbda6832-4933-4b2b-8b02-42ec747ed373
08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : ChangeLogLink [azurepifiltersink.cxx:433] search\native\gather\plugins\azurepi\azurepifiltersink.cxx fbda6832-4933-4b2b-8b02-42ec747ed373
08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 [azurepifiltersink.cxx:433] fbda6832-4933-4b2b-8b02-42ec747ed373
08/18/2015 20:13:05.21 mssearch.exe (0x1194) 0x0C88 SharePoint Server Search Crawler:Azure Plugin amn1g Verbose CAzurePIFilterSink::AddAttribute : attrName : GUID:#12, value : https://sp13/Shared Documents/sampledoc.txt [ azurepifiltersink.cxx:433] fbda6832-4933-4b2b-8b02-42ec747ed373
What we see above is repeated for all the new documents/content identified and crawled under different correlation IDs. Once above details are gathered for a document, we see a SUBMIT to Azure like below. The first time this happens, we authenticate and grab a token with Azure. Example :
08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Server Search Crawler:Azure Plugin amnzz VerboseEx AzureServiceProxy::SubmitDocuments: submitting document : 15, operation : 0, fullUri : https://searchserviceendpoint/v1.0/77133120-bdb0-42e1-815a-77e881f87db4/index/sp/sp/35c98eb9d863faeca5a7c5892fa522bc7a8bd8d4f9b1cd767f4c613035e31c71 , { "type" : "long", "value" : 2732042392} , "012357BD-1113-171D-1F25-292BB0B0B0B0:#324" : { "type" : "long", "value" : 4359478693111617600} , "noindex" : true , "oldnoindex" : true
08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Foundation Claims Authentication airze Verbose Current identity context: '{"elevated":"true","nameid":"."}'
8/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x324C SharePoint Foundation Application Authentication ahi1z Verbose Created OAuth2 bearer credentials: {"realm":"GUID","claims":{}}
08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x1FE0 SharePoint Foundation Azure Access Control adnhy Verbose Discovery home realm: GUID'
08/18/2015 20:13:04.77 mssearch.exe (0x1194) 0x1FE0 SharePoint Foundation Application Authentication age6i Verbose Issuing OAuth2 S2S token for identity GUID/searchserviceendpoint@GUID'. tokenType: 2
Once all the documents are submitted, we perform Index callbacks for all the documents :
08/18/2015 20:13:38.15 mssearch.exe (0x1194) 0x3298 SharePoint Server Search Crawler:Azure Plugin amn35 VerboseEx CAzurePlugin::StatusTask Starting, docs 1 [azurepiobj.cxx:931] search\native\gather\plugins\azurepi\azurepiobj.cxx
08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Azure Plugin amnz7 VerboseEx AzureServiceProxy::GetDocumentsStatus: received Index callback for GUID/operation/ruFukw/XcbWQA/17592187534485/10
There is a completion confirmation after the index callback for every DocID crawled:
Detailed completion information for the DOCID we are testing (718) is in the below example:
08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Azure Plugin amn4m Verbose CAzurePlugin::CompleteDocument docID 718 status 2 [azurepiobj.cxx:1189]
08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Gatherer Service drb9 VerboseEx InProgress chk rel Rmv hr 0x0 Portal_Content, sts4://sp13/siteurl=/siteid={1d26851d-2e5c-4b5f-a4f9-bc0f8b4107b6}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 docid 718 type 2 flgs 0x132a sts 2 [4ba320ac-7a96-4027-9f4a-0901589cd6f4
08/18/2015 20:13:54.69 mssearch.exe (0x1194) 0x2460 SharePoint Server Search Crawler:Gatherer Service kf7a VerboseEx Destroying CInProgressEntry sts4://sp13/siteurl=/siteid={GUID}/weburl=/webid={c5001554-4143-4b1d-8443-3eb98844dc89}/listid={e360f42c-b82c-49c7-b8e8-9338639c92b4}/folderurl=/itemid=2 docid 718 type 2 flgs 0x132a sts 2 [inprogressentry.hxx:51] 4ba320ac-7a96-4027-9f4a-0901589cd6f4
08/18/2015 20:13:55.20 mssearch.exe (0x1194) 0x24B8 SharePoint Server Search Crawler:Gatherer Plugin e5g1 VerboseEx CGatherer::CommitTransaction succeeded CrawlID 129, DocID 718 [gatherobj.cxx:8826] search\native\gather\server\gatherobj.cxx 4ba320ac-7a96-4027-9f4a-0901589cd6f4
8/18/2015 20:13:55.20 mssearch.exe (0x1194) 0x24B8 SharePoint Server Search Crawler:Gatherer Service e574 Verbose Destroying CTransaction docid 718 type 2 [gthrtrx.cxx:571]
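When many documents are crawled, it can help to pull the completion entries out of the ULS log rather than reading them by eye. The sketch below assumes only the message format shown in the CAzurePlugin::CompleteDocument line above. Get-AzurePluginCompletion is a hypothetical function name, and the sketch does not interpret the status code, because its meaning is not documented here.

```powershell
# Sketch: extract DocID and status from CAzurePlugin::CompleteDocument ULS entries.
# Get-AzurePluginCompletion is a hypothetical helper name used for illustration.
function Get-AzurePluginCompletion {
    param(
        [Parameter(Mandatory=$true)][AllowEmptyString()][string[]] $Line
    )
    foreach ($entry in $Line) {
        # Matches, for example: "CAzurePlugin::CompleteDocument docID 718 status 2"
        if ($entry -match 'CAzurePlugin::CompleteDocument docID (\d+) status (\d+)') {
            [pscustomobject]@{
                DocId  = [int]$Matches[1]
                Status = [int]$Matches[2]
            }
        }
    }
}

# Usage (the log file path is an example, not executed here):
#   Get-AzurePluginCompletion -Line (Get-Content "C:\logs\uls.log")
```

Lines that do not contain a completion entry are skipped, so the whole log file can be passed in unfiltered.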
Managed property for hybrid search results
A new managed property, IsExternalContent, has been defined for all content that is crawled from on-premises. It is set to 1 for content that is crawled on-premises.
The property can be used to restrict a query to online or on-premises results, as a refiner or in a result source. Below are some examples:
Example search verticals
A custom result source that uses Local SharePoint results plus a filter excluding on-premises results can be set up. Tip: this can be used when validating hybrid search in the production tenant.
Result source query for SharePoint Online only
{searchTerms} NOT(IsExternalContent:1)
Result source query for, as an example, a filtered support forum
{searchTerms} (
Path:https://sp2010 OR
Path:https://demohybrid.../../supportforum)
For example, you can use the default result source, Local SharePoint results, and rename its vertical to "Everything" in the Search Navigation configuration. The filtered support forum source above uses Local SharePoint results plus a filter on which sites to include in the search results.
People crawl scenario
This managed property becomes very important when considering the people crawl scenario. By default, all people in the SharePoint Online User Profile application are indexed by the Office 365 search service. If you additionally crawl people from the cloud SSA, you generate a duplicate set of people content in the Office 365 index. This is confusing to end users, because searching for a person returns multiple results.
There are two ways to approach this problem today.
§ Make the Office 365 User Profile service the primary source of user information and let Office 365 search take care of the indexing and presentation. With this approach you do not need to crawl people on-premises.
§ Crawl the on-premises people profile store in addition to Office 365 crawling the tenant profile store. This results in the described scenario of duplicate search results for each person; however, you can use query transformation to decide which results to display. You can even let end users choose between the different result sources at query time.
Businesses who have a richly populated on-premises profile store, perhaps with additional augmentation from line of business applications, may want to maintain their primary source of people information as this store. In order to avoid duplicate search results and to focus the results on the primary store, query transformation must be implemented.
To utilize the on-premises profile store as the primary people search source, follow these steps:
1. Create a new result source or copy the existing people results source
2. Edit the new result source and modify the Query Transformation box to include the managed property IsExternalContent, as below.
{?{searchTerms} ContentClass=urn:content-class:SPSPeople IsExternalContent:1}
3. Create a new search results page and configure the Core Search Results web part to consume this new result source.
4. Complete the implementation by adding the new page to the search navigation settings. This will add the new page as a search vertical within the search center, as per the screenshot below.
To utilize the Office 365 profile store as the primary people search source, follow the same steps but use a slightly different query transformation at step two, as follows.
{?{searchTerms} ContentClass=urn:content-class:SPSPeople NOT IsExternalContent:1}
Note: The difference between the two transforms is the insertion of NOT before the managed property, which forces the exclusion of external content, i.e. people results that do not come from Office 365.
For people results transformation, you can make a copy of the built-in people result source and modify it to include the IsExternalContent managed property restriction, filtering for either online or on-premises people sources.
For on-premises people
{?{searchTerms} ContentClass=urn:content-class:SPSPeople IsExternalContent:1}
For online people
{?{searchTerms} ContentClass=urn:content-class:SPSPeople NOT IsExternalContent:1}
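The two people transforms differ only in the restriction on the IsExternalContent managed property, which a short Python sketch can make explicit. The function name and the `"on-premises"` / `"online"` labels are hypothetical; the generated strings are meant to match the two transforms shown above, with the property spelled IsExternalContent.

```python
# Restriction shared by both transforms: people results only.
PEOPLE_CLASS = "ContentClass=urn:content-class:SPSPeople"


def people_transform(source):
    """Return the people query transform for the chosen primary profile store.

    source: "on-premises" keeps people crawled by the cloud SSA,
            "online" keeps people indexed from the Office 365 profile store.
    """
    if source == "on-premises":
        restriction = "IsExternalContent:1"
    elif source == "online":
        restriction = "NOT IsExternalContent:1"
    else:
        raise ValueError("source must be 'on-premises' or 'online'")
    return "{?{searchTerms} " + PEOPLE_CLASS + " " + restriction + "}"


if __name__ == "__main__":
    for store in ("on-premises", "online"):
        print(store, "->", people_transform(store))
```

Keeping the shared part in one place makes it harder to end up with two result sources that drift apart when one of them is edited later.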
Publishing service applications between SharePoint Farms
As with most service applications, the cloud SSA supports cross-farm publishing. This capability allows us to extend cloud hybrid search to SharePoint 2010 farms as well. By following the guidelines for publishing service applications, you should be able to publish the cloud SSA and consume it from a SharePoint 2010 farm. This can be achieved following TechNet guidelines here.
If SharePoint 2010 is configured to consume a published cloud SSA, end users can use the default scopes such as "This Site" and "All Sites" in SharePoint 2010 to query against the cloud index. This gives you the same cloud hybrid search experience in SharePoint 2010 that we see in SharePoint 2013.
Delve Experience
The cloud hybrid search solution also influences what users can search for and see in Delve. In the preview version of cloud hybrid search, Delve will allow users to see search results from on-premises as well as Office 365. Gestures such as adding an item to a board will work just the same for on-premises content as for Office 365 content.
When the GA version of cloud hybrid search is released, on-premises content from SharePoint will also show up as activity on the Me page or a person page in Delve.
Content from on-premises will not show up on the Delve Home Page because the views are not sent from the Cloud Search service application to Office 365.
POST BY Manas Biswas [MSFT] & Neil Hodgkinson [MSFT]
Comments
Anonymous
September 15, 2015
Hello Manas & Neil, I have followed the above steps and still get the same error. Please see the link below for the screenshot: http://1drv.ms/1NB8Rmu I appreciate all your help. Thanks, SV
Anonymous
September 15, 2015
We have already set up hybrid for SharePoint 2013 SP1 with O365 and we need to try out this new cloud hybrid search. We have another test O365 tenant and need to set up a test environment first with our test O365 tenant. Can we sync the same on-premises AD with the second O365 tenant?
Anonymous
September 15, 2015
Sudeep, is the environment you mentioned already configured for S2S trust and hybrid using query federation? From the on-premises farm, can you query and get results from SharePoint Online? Thanks,
Anonymous
September 15, 2015
Sudesh, There is no easy way to achieve this, and moreover the other environment is production, so workarounds may not be the way you would want to go. Since this is for testing, you may want to create a new domain and dirsync the users from that AD to test this. I have the steps in our post below if you want to take a look: blogs.msdn.com/.../office-365-configure-hybrid-search-with-directory-synchronization.aspx. Also check out the forum post here: social.technet.microsoft.com/.../test-setup Thanks
Anonymous
September 16, 2015
Yes, S2S is already configured and one-way outbound search is set up and working. I can run a query from SharePoint on-premises and get results from SharePoint Online.
Anonymous
September 16, 2015
Sudeep, That is an interesting observation, thanks for sharing. I will try to repro this and update my findings. Meanwhile, do you mind posting this as a question to the forum link below, so that others with the same question benefit too? Thanks!!
Anonymous
September 16, 2015
Sudeep, I missed posting the TechNet Forum link above for the Cloud Search Service Application Preview. It is social.technet.microsoft.com/.../home Thanks!
Anonymous
September 18, 2015
Hello, Any idea as to when we can expect this to be supported in the GCC? -Steve
- Anonymous
May 30, 2017
Hello Steve, GCC is still a work in progress. I will post a response as soon as I hear back on any announcement. Thanks, Manas
Anonymous
September 18, 2015
Great post guys! Do we still need a public SSL certificate and the SPO Secure Store Service if we want "inbound" Search Previews of on-premises documents indexed by the Cloud SSA? Cheers, Ben
- Anonymous
May 25, 2017
Hi Manas, has the support for GCC started?
Anonymous
September 20, 2015
Steve, We do not have any further update as of now on the availability. I am following up and will post a response at social.technet.microsoft.com/.../support-for-gcc.
Anonymous
September 20, 2015
Ben, The new hybrid solution takes away the complexities related to queries coming in to the SharePoint Server on-premises environment via a reverse proxy. The cloud hybrid search solution provides the ability to crawl and parse on-premises content and then process and index it in Office 365. Thus, when users query the search index in Office 365, they get search results from both on-premises and Office 365 content. Since the query runs against the Office 365 index, a public SSL certificate is no longer required for inbound authentication requests from the Secure Store application in SharePoint Online. Of course, if a person needs to access a document that is in SharePoint on-premises, or wants a preview of the document from within the search center in SharePoint Online, you need to take care of the accessibility of the document and the WAC server if the user is outside of CorpNet. SharePoint extranet best practices cover securely publishing SharePoint on-premises to users outside of CorpNet, which you would then need to implement.
Anonymous
October 29, 2015
The comment has been removed
Anonymous
November 02, 2015
Hello Max, I am investigating this. Can you confirm whether you have Visual Studio installed on the same server? Also, do you have just the SP2013 version of the DLL in the GAC (Windows\Microsoft.NET\assembly\GAC_MSIL\Microsoft.SharePoint.Client.Search), or do you see multiple versions?
Anonymous
November 20, 2015
Hi, the link to the download page is broken. Thanks! Max
Anonymous
November 25, 2015
The comment has been removed
- Anonymous
March 25, 2017
@Durgesh By aggressive crawling, do you mean setting up crawl rules? Can you confirm? Regarding the quota question: for each 1 TB of storage space your tenant has in Office 365, you can index 1 million items of on-premises content in your tenant's search index in Office 365 with the cloud hybrid search solution. By default, this quota is capped at 20 million items. https://support.office.com/en-us/article/Search-limits-for-SharePoint-Online-7C06E9ED-98B6-4304-A900-14773A8FA32F
Anonymous
November 29, 2015
I get the following error using the latest version (downloaded 11/29) of the onboarding script: "No indexing service endpoint found!" The error itself is not helpful and returns nothing online. Do you know what this means? Thanks, JB
- Anonymous
March 25, 2017
@Jonathan Browne Can you tell us when you get the error? Is it while running the onboarding script or while running a crawl?
Anonymous
December 17, 2015
The comment has been removed
- Anonymous
March 25, 2017
@Don LeBlanc thanks. Posting an update here. If you see the error that Don mentioned, we may need to take a closer look at some of the search settings. It's recommended to open a troubleshooting ticket with Microsoft support for a quick fix.
- Anonymous
March 25, 2017
Looks like I posted the earlier comment too early. The comment I made above about opening a support incident applies if anyone else sees the same message that Don reported. It's also advisable to turn on verbose ULS logging, or to check the crawl logs, in the on-premises farm; you may see that the farm is not able to talk to the search endpoint.
Anonymous
February 09, 2016
I've successfully set up Cloud Hybrid search for my Office 365 Dev environment. It's really cool to have on-prem content in search web parts on SharePoint Online! But there is one issue I feel has not been addressed -- or I've misunderstood something (most likely...). Do you disable/remove the old on-prem SSA after adding the new cloud SSA?
- Anonymous
March 25, 2017
The comment has been removed
Anonymous
February 12, 2016
The comment has been removed
- Anonymous
March 25, 2017
@Ankit do you have an outbound proxy server on your network? If yes, please take a look at our post here and ensure that the port and protocol requirements are met.
Anonymous
March 13, 2016
When you configure cloud hybrid search for SharePoint, run the PowerShell script CreateCloudSSA.ps1 to create and configure a cloud Search service application (cloud SSA) on a SharePoint Server 2013 or SharePoint Server 2016 server. Run the PowerShell script OnBoard-CloudHybridSearch.ps1 to connect your cloud SSA to your Office 365 tenant and set up server-to-server authentication between the two environments. These scripts are now available for download at the Microsoft Download Center: www.microsoft.com/.../details.aspx
Anonymous
April 28, 2016
My understanding is that a reverse proxy is not required to configure hybrid search (two-way search). I am looking at SharePoint 2016. What about the following: domain registration & UPN domain suffix, and trust between the 2 environments?
- Anonymous
March 25, 2017
@Kris A reverse proxy is no longer a requirement for setting up the cloud SSA. This is because the cloud SSA crawls the items from the SharePoint on-premises farm and the index is unified in the Office 365 SPO farm, so you no longer need to make an inbound call to on-premises for any data retrieval.
- Anonymous
March 25, 2017
@Kris The user rehydration requirement for a UPN match has an additional improvement. Unlike earlier, we no longer depend only on matching the user UPN against the UPN/SIP/SMTP properties for user rehydration. There is a new property in SPO that stores the SID of a user if synced through AADSync, and a match against that attribute is done to rehydrate the user identity. S2S trust is automatically configured when you run the onboarding script.
Anonymous
June 17, 2016
Excellent article! At the moment I'm using two SharePoint 2016 Dev Systems with Hybrid Search with the following problem: Dev System 1: Firewall is open -> Search is working. Dev System 1: Firewall is open only for ports 80 and 443 -> Search is not working. Which ports are needed? Thank you in advance!
- Anonymous
March 25, 2017
@Lars Please look at our blog post for details: https://blogs.technet.microsoft.com/beyondsharepoint/2016/08/15/ports-and-protocols-requirement-for-the-hybrid-cloud-search-service-application/
Anonymous
August 02, 2016
Great post. There appears to be an issue with calls to the Search Content Service failing: "The submit operation to the Search Content Service failed. This item will be retried in the next incremental crawl. The request sent to the Search Content Service failed (access denied)." Any help addressing that would be very helpful.
- Anonymous
March 25, 2017
@Michael Buckingham Do you continue to see the same error? If yes, can you validate that the steps in https://blogs.technet.microsoft.com/beyondsharepoint/2016/08/15/ports-and-protocols-requirement-for-the-hybrid-cloud-search-service-application/ have been followed? If the issue persists, you may try to re-run the onboarding script and restart the search service across all boxes once; if that still does not resolve the issue, we may need to take a closer look. Opening a support ticket with Microsoft may help here.
Anonymous
August 18, 2016
Hi, I am able to crawl content, but search does not return any results. Can you guide me? Thanks
- Anonymous
March 25, 2017
@Imughal You may want to check a few things. 1. The account that you are querying with has permission on the source content and is also licensed in SPO. 2. If you are querying from an on-premises search center, please ensure you have a result source in the on-premises search center that has its remote server URL set to the SPO root site collection. 3. You may want to search for IsExternalContent:1 in the SPO search center to see if you get any search results in SPO for the on-premises content. If yes, then this one item may be failing due to some sort of ACL issue; try checking the permission of the account explicitly on the source document.
Anonymous
September 01, 2016
Hi, I followed the above steps for the OnBoard script. It asks for the cloud password at the start while initializing parameters. After "Connecting to O365..", it throws an error saying "The user name or password is incorrect. Verify your user name, and then type your password again", instead of asking for the credentials as specified in the above steps. I am entering the correct credentials; I am sure of that. Is there any other step which I am missing, or which we need to include in the script? Help will be greatly appreciated. Thanks, Shivani
- Anonymous
March 25, 2017
The comment has been removed
Anonymous
September 21, 2016
Hello Manas & Neil, I have configured the Cloud SSA in 2013, but it doesn't work with an ADFS web app. Is this a limitation?
- Anonymous
February 15, 2017
The comment has been removed
- Anonymous
March 25, 2017
@Emmanuel ISSALY After crawling, the metadata is pushed securely to the search endpoint in Office 365, so an index partition on-premises is not required.
Anonymous
January 13, 2017
Hi, the CreateCloudSSA.ps1 PowerShell script is taking too long and I am unable to run the script successfully. Can you please help with this? Thanks, Bhaskar.
- Anonymous
March 25, 2017
@M Bhaskar You may want to run a network trace to see whether a proxy or an intermediate device is playing a role here. The onboarding script is fast and should take a couple of minutes at most.
Anonymous
March 20, 2017
I have set up a hybrid search environment on SharePoint 2013 and SPO. Currently my search center is on-premises. I am getting both on-premises and SPO documents on the on-premises search results page. All the settings and search results look good. However, I am not able to preview PDF documents coming from SPO. Preview is working for all MS Office documents from SPO. I have tried changing the on-premises result type for PDF files to use Word Item as the display template, but it is still not working. My issue is only with PDF preview not working for SPO files in the search center, which is on-premises. Can anyone give valuable input to resolve this issue?
- Anonymous
March 25, 2017
@Sanjay What is the behavior when you look up the same document from an SPO search center?
- Anonymous
March 27, 2017
Hi Manas Sir. Thanks for looking into my issue. I am a regular reader of all your blogs and look up to you as a role model in the SharePoint industry. Regarding the issue: yes, I am able to preview the same PDF files as expected in the SPO search center. The issue is only with PDF files coming from SPO in the hybrid search setup where the search center is on-premises.
- Anonymous
April 24, 2017
Thanks Sanjay. We would need to take a look at the ULS logs from your search farm to comment on the next actions. It would be great if you could open a support ticket so the logs can be analyzed further.
Anonymous
March 23, 2017
The comment has been removed
Anonymous
April 20, 2017
I am getting an error when running the OnBoard-HybridSearch.ps1 script. It says the following: A parameter cannot be found that matches parameter name "AzureEnvironment"
- Anonymous
May 25, 2017
The comment has been removed
- Anonymous
May 25, 2017
You need to install the newer module from http://connect.microsoft.com/site1164/Downloads/DownloadDetails.aspx?DownloadID=59185; download "AdministrationConfig-V1.1.166.0-GA.msi". To verify that the installation is up to date, run "Get-Module -Name MSOnline" in PowerShell; the version should be equal to or greater than "1.1.166.0".
Anonymous
May 28, 2017
Great blog post. Worked for me, thank you a lot. But now I want to roll back everything since we will follow a different path. Can I just remove the CloudSSA and configure the default result source in SP Online? What about other configuration settings like the mentioned proxies? Is there an "Un"board-CloudHybridSearch script to really undo everything? Thanks, Oliver
- Anonymous
May 30, 2017
@Oliver If you just want to use the same farm for some other testing, then you can definitely clean up the CloudSSA. However, that would leave the items that were crawled from on-premises in the SharePoint Online (SPO) index, so users who have rights and query from the SPO search center can still look up those items. For a complete cleanup you can follow the steps below.
1. Follow the steps in our blog to clean up items in the index (https://blogs.msdn.microsoft.com/spses/2016/05/18/cloud-search-service-application-removing-items-from-the-office-365-search-index/).
2. Delete the CloudSSA.
Thanks, Manas
Anonymous
June 02, 2017
Hello Manas, As always, great post! One thought though: if multiple Cloud Search Service Applications are connected to the same Office 365 tenant, for geo-distributed enterprises, does the index get stored in partitioned mode? Thanks, Aneesh
- Anonymous
June 03, 2017
Hello Aneesh, I assume you mean you have multiple SharePoint on-premises farms in different locations, each with a CloudSSA, and you connect them to a single Office 365 tenant (SPO farm). Office 365 is already multitenant, hence the items crawled by these farms will be stored in the same index as your SPO farm. Based on user rights, users from any of the farms can query for these items, or the query can be optimized to return only certain results to some farms. Something we need to be careful about: crawling the same content across multiple CloudSSA farms and pushing it to the same endpoint/tenant is not supported. This means you should never have content sources in these farms that crawl the same content. Crawling different source content across farms and pushing it to the same tenant is always OK. Thanks, Manas.
Anonymous
August 21, 2017
Hello Manas, Just wanted to know: can I manage the search schema of on-premises in O365?
- Anonymous
August 30, 2017
@Amol If you are using CloudSSA, you can create crawled properties in your on-premises farm. As long as they are part of the default propset, after a crawl of the content via a content source in CloudSSA you should be able to use them within the search schema in SPO.
- Anonymous
August 31, 2017
@Amol Jadhav You may also need to ensure that you are looking up the crawled properties in SharePoint Online using the correct account. By correct account I mean an account in on-premises Active Directory that has rights in SharePoint and is also synced to Azure AD in your Office 365 tenant. For example, if you look them up using the content access account, you should see the crawled properties in the SharePoint Online search schema.
Anonymous
September 27, 2017
Great post Manas. We are able to pull the SP2013 search results in SharePoint Online. Unfortunately we are not able to get search results from SharePoint Online in the SP2013 site. We get the error below when a search query is issued from the SP2013 site. Note that we are not using any custom display template.
System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
at Microsoft.Office.Server.Search.Query.Rules.QueryTransformProperties.get_Item(String key)
at Microsoft.Office.Server.Search.RemoteSharepoint.RemoteSharepointEvaluator.RemoteSharepointProducer.ReadInputFields(IRecord record)
at Microsoft.Office.Server.Search.RemoteSharepoint.RemoteSharepointEvaluator.RemoteSharepointProducer.ProcessRecordCore(IRecord record)
Thanks, Prasad
- Anonymous
September 28, 2017
Hello Prasad, From a quick look at the stack, it seems this was a known issue that is fixed in a service pack. Can you check whether your SharePoint farm build is "15.0.4904.1000" or higher? Note: if you decide to proceed with the patching, please follow the best practices for SharePoint patching. In another farm facing an identical issue, it was fixed with the March 16 CU; you can try the same fix. Thanks!
Anonymous
December 16, 2017
The comment has been removed
- Anonymous
December 17, 2017
The comment has been removed