URL Formatting Requirements
As of Windows 7, inconsistencies remain in the handling and parsing of URLs. This topic provides a limited guide to navigating inconsistencies in file URL formats.
This topic is organized as follows:
- URL Formats in Use
- Slash Direction, Trailing Star, and Trailing Slash Sensitivity
- URL Formats by API and Query
- Related topics
URL Formats in Use
Third-party protocols are responsible for defining their URL format and defining queries in a manner that conforms to their standard. For example, Microsoft Outlook supports folder names with arbitrary characters, including those that are illegal in URLs such as the "?" character. The MAPI protocol handler does its own URL-encoding of its URLs. Hence, the index stores "%3F" instead of "?", and Outlook must take this into account when creating queries.
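The encoding described above can be illustrated with generic percent-encoding. This is a minimal sketch only: `encode_folder_name` and the sample folder name are hypothetical, and the exact set of characters that the MAPI protocol handler escapes is not specified in this topic, so the standard-library behavior shown here is an assumption beyond the "?" to "%3F" case.

```python
from urllib.parse import quote, unquote


def encode_folder_name(name: str) -> str:
    """Percent-encode every reserved character in a folder name.

    Hypothetical helper: it only illustrates why "?" appears as "%3F"
    in a stored URL. It does not reproduce the MAPI handler's own rules.
    """
    return quote(name, safe="")


folder = "Plans?"                      # hypothetical Outlook folder name
encoded = encode_folder_name(folder)   # "?" becomes "%3F"
decoded = unquote(encoded)             # round-trips to the original name
```

A query that targets such a folder would need to use the encoded form, because that is what the index stores.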
The file URL formats in use are listed in the following table. Each format is assigned a letter identifier that is used to refer to it later in this topic.
ID | Local or remote | Example |
---|---|---|
A | Local | file:///c:\test\example\ |
B | Local | file:c:/test/example/ |
C | Local | c:\test\example\ |
D | Remote | file:///\\server\share\ |
E | Remote | file://server/share/ |
F | Remote | \\server\share\ |
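As a rough aid for working with the table above, the following sketch maps a string to one of the six letter identifiers. The `classify_file_url` helper and its regular expressions are assumptions derived only from the six example rows; they are not part of any Windows Search API and may not cover every variant that occurs in practice.

```python
import re

# Patterns are checked in order; the longer "file:///" prefixes come first
# so that they are not mistaken for the shorter "file:" or "file://" forms.
_FORMATS = [
    ("A", re.compile(r"^file:///[A-Za-z]:[\\/]")),   # file:///c:\test\example\
    ("D", re.compile(r"^file:///\\\\[^\\/]+")),      # file:///\\server\share\
    ("E", re.compile(r"^file://[^\\/]+[\\/]")),      # file://server/share/
    ("B", re.compile(r"^file:[A-Za-z]:[\\/]")),      # file:c:/test/example/
    ("C", re.compile(r"^[A-Za-z]:[\\/]")),           # c:\test\example\
    ("F", re.compile(r"^\\\\[^\\/]+")),              # \\server\share\
]


def classify_file_url(url: str):
    """Return the letter (A through F) for a file URL, or None if unrecognized."""
    for ident, pattern in _FORMATS:
        if pattern.match(url):
            return ident
    return None
```

For example, `classify_file_url("file:c:/test/example/")` returns `"B"`, and a string that matches none of the rows returns `None`.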
Slash Direction, Trailing Star, and Trailing Slash Sensitivity
In Windows Search there is largely no sensitivity to slash direction. If the format c:\test\example is accepted, then c:/test/example is accepted as well. SCOPE is the exception: although it is generally insensitive to slash direction, it is sensitive to slash direction for remote URL format F. Hence, Scope = '//server/share' does not work.
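One way to avoid the format F problem is to rewrite forward slashes before building the predicate. The sketch below assumes that a path beginning with "//" is intended as a UNC path; `to_backslash_unc` and `scope_predicate` are hypothetical helpers, and quoting of special characters inside the predicate is not addressed.

```python
def to_backslash_unc(path: str) -> str:
    """Rewrite //server/share as a backslash UNC path; leave other paths alone."""
    if path.startswith("//"):
        return path.replace("/", "\\")
    return path


def scope_predicate(path: str) -> str:
    """Build a SCOPE predicate string using a backslash UNC path for format F."""
    return "SCOPE = '{}'".format(to_backslash_unc(path))


# Produces a predicate that uses \\server\share instead of //server/share.
predicate = scope_predicate("//server/share")
```

Local paths such as c:/test are returned unchanged, since slash direction does not matter for them.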
The only API that is sensitive to trailing stars and distinguishes between c:\test\ and c:\test\* is ISearchCrawlScopeManager. If there is an exclusion rule for c:\test\*, the URL directory c:\test itself will still be indexed. But if the exclusion URL is c:\test\, the URL directory c:\test itself will not be indexed.
There are two places where Windows Search is sensitive to trailing slashes: ItemUrl and Path queries. If there is a directory c:\test, Windows Search treats c:\test\ differently from c:\test for predicates like path = 'c:\test' and System.ItemUrl = 'c:\test'. For example, the predicate path='file:c:/test' would match the directory c:\test, but path='file:c:/test/' would not, due to the trailing slash.
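When the goal is to match the directory item itself, a caller could strip the trailing slash before building the predicate. This is a sketch under that assumption; `without_trailing_slash` and `path_predicate` are hypothetical helpers, and the drive-root exception is a design choice made here, not documented behavior.

```python
def without_trailing_slash(url: str) -> str:
    """Drop trailing slashes or backslashes, except on a drive root like c:\\."""
    stripped = url.rstrip("\\/")
    # Stripping "c:\" would leave "c:", which is a different kind of path,
    # so roots (and strings made only of slashes) are returned unchanged.
    if stripped == "" or stripped.endswith(":"):
        return url
    return stripped


def path_predicate(url: str) -> str:
    """Build a path predicate that targets the directory item itself."""
    return "path = '{}'".format(without_trailing_slash(url))


# "file:c:/test/" is reduced to "file:c:/test" before the predicate is built.
predicate = path_predicate("file:c:/test/")
```

The same normalization would apply to a System.ItemUrl predicate, since the text describes both as sensitive to the trailing slash.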
URL Formats by API and Query
Local file URL formats accepted by selected APIs and queries are listed in the following table. Each format is identified by a letter (A through F), as defined in the "URL Formats in Use" section earlier in this topic.
API or query | Format A | Format B | Format C |
---|---|---|---|
ISearchCrawlScopeManager | Y | N | Y |
IGatherNotifyInline::OnDataChange | Y | Y | Y |
ISearchCatalogManager::ReindexMatchingURLs | Y | Y | Y |
ISearchCatalogManager::ReindexSearchRoot | Y | N | N |
ISearchCatalogManager2::PrioritizeMatchingURLs | Y | Y | Y |
Scope= | N | Y | Y |
Directory= | N | Y | Y |
ItemUrl= | N | Y | Y |
Path= | N | Y | Y |
Remote file URL formats accepted by selected APIs and queries are listed in the following table.
API or query | Format D | Format E | Format F |
---|---|---|---|
ISearchCrawlScopeManager | N/A | N/A | N/A |
IGatherNotifyInline::OnDataChange | N/A | N/A | N/A |
ISearchCatalogManager::ReindexMatchingURLs | N/A | N/A | N/A |
ISearchCatalogManager::ReindexSearchRoot | N/A | N/A | N/A |
ISearchCatalogManager2::PrioritizeMatchingURLs | N/A | N/A | N/A |
Scope= | Y | Y | Y |
Directory= | Y | Y | Y |
ItemUrl= | Y | Y | Y |
Path= | Y | Y | Y |
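The two tables above can be captured as a small lookup so that code can check a format before calling an API or building a query. The data below is transcribed from the tables; the `accepts` helper and the choice to represent "N/A" as `None` are assumptions made for this sketch.

```python
# One string per row of the local table: acceptance of formats A, B, C.
LOCAL_ROWS = {
    "ISearchCrawlScopeManager": "YNY",
    "IGatherNotifyInline::OnDataChange": "YYY",
    "ISearchCatalogManager::ReindexMatchingURLs": "YYY",
    "ISearchCatalogManager::ReindexSearchRoot": "YNN",
    "ISearchCatalogManager2::PrioritizeMatchingURLs": "YYY",
    "Scope=": "NYY",
    "Directory=": "NYY",
    "ItemUrl=": "NYY",
    "Path=": "NYY",
}

# In the remote table, only the query rows carry Y; the API rows are N/A.
QUERIES = {"Scope=", "Directory=", "ItemUrl=", "Path="}


def accepts(api_or_query: str, fmt: str):
    """Return True or False per the tables, or None where the table says N/A."""
    row = LOCAL_ROWS[api_or_query]      # KeyError for names not in the tables
    if fmt in ("A", "B", "C"):
        return row["ABC".index(fmt)] == "Y"
    if fmt in ("D", "E", "F"):
        return True if api_or_query in QUERIES else None
    raise ValueError("format must be one of A through F")
```

For instance, `accepts("ISearchCrawlScopeManager", "B")` is `False`, while `accepts("Path=", "F")` is `True`.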
Related topics