Acronyms and Abbreviations

Note

Indexing Service is no longer supported as of Windows XP and is unavailable for use as of Windows 8. Instead, use Windows Search for client-side search and Microsoft Search Server Express for server-side search.

 

Acronyms and abbreviations merit consideration when you implement a word breaker. In many languages, the individual letters of an acronym are separated by periods. For example, "United States of America" may be abbreviated as either "USA" or "U.S.A." Occasionally, writers also abbreviate words in ways that are not recognized acronyms or abbreviations.

Word breakers included with Indexing Service usually identify single-letter words as noise words and treat those words as placeholders during query time. As a result, a word breaker that is not aware of common acronyms, or that does not recognize abbreviations, converts the abbreviation "U.S.A." into "U," "S," and "A" at query time. This decomposition does not provide enough information to match words in the full-text index because all the query terms are noise words.

When you create a word breaker, it is recommended that the word breaker remove the periods that separate the letters of acronyms. In the example, "U.S.A." is stored as "USA" and a query term that contains "U.S.A." actually queries for "USA."

If a word breaker processes an abbreviation, the period in that abbreviation is not treated as an end-of-sentence (EOS) break. Because of this, a word breaker might not correctly identify an EOS break if the abbreviation is at the end of the sentence.
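The recommended normalization can be illustrated with a minimal Python sketch. It is not the Indexing Service word breaker interface. The names `ACRONYM`, `normalize_acronyms`, `naive_break`, and `break_words` are hypothetical, and the sketch assumes that an acronym is a run of two or more single ASCII letters, each followed by a period.

```python
import re

# Assumption: a dotted acronym is two or more single ASCII letters,
# each immediately followed by a period, for example "U.S.A.".
ACRONYM = re.compile(r"\b(?:[A-Za-z]\.){2,}")

# Simplified notion of a word for this sketch: a run of ASCII letters or digits.
WORD = re.compile(r"[A-Za-z0-9]+")


def normalize_acronyms(text):
    """Remove the periods that separate the letters of dotted acronyms."""
    return ACRONYM.sub(lambda m: m.group(0).replace(".", ""), text)


def naive_break(text):
    """Split on punctuation without any acronym handling."""
    return WORD.findall(text)


def break_words(text):
    """Normalize dotted acronyms first, then split into words."""
    return WORD.findall(normalize_acronyms(text))


if __name__ == "__main__":
    print(naive_break("U.S.A."))   # single-letter terms only
    print(break_words("U.S.A."))   # one term, "USA"
```

If the same normalization is applied at index time and at query time, "U.S.A." and "USA" resolve to the same term. The sketch also consumes the final period of the acronym, so a sentence that ends with "U.S.A." loses its only EOS marker, which is the ambiguity described above.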