

Survey on Full Text Search!!


SQL Server Planning Questionnaire

Topic: Full Text Search

As part of planning and design for the next release of Full-Text Search (FTS), Microsoft needs to hear what you have to say! The SQL Server Product Group would like your feedback on two main areas it is currently designing. This is a great opportunity to directly influence the next release of Full-Text Search and help make it as good as possible.

Please send your responses to alexeik@microsoft.com

Name of respondent: ………………………

Company:……………………………………

Contact email-address: ……………………

Your answers will be a tremendous help to the SQL Server FTS team at Microsoft. Please feel free to tell us about any other concerns that are not covered by the questions in this email.

Thanks in advance for your help; we hope to hear from you soon! Your feedback is more important than ever in driving SQL Server improvements for you.

 

1) Upgrade story

Technologically speaking, there are several possible ways to implement the upgrade experience for FTS users. It is very important for Microsoft to know your requirements for availability, upgrade duration, data integrity, etc.

Because the next release of FTS will include some major architectural changes, we will need to perform several operations at upgrade time. Microsoft therefore wants to understand your business needs in order to choose the right path to meet them.

The upgrade story applies to the moment when you eventually migrate your application from SQL Server 2000 or 2005 to the next SQL Server release, X. You will have FT catalogs running on these earlier releases, and Microsoft needs to know how you would like the upgrade process to work.

Some important questions we would like answered are:

  • Do you actually need your full-text app to be online during the upgrade?
    • Yes. Do you need to be able to perform queries only, or do you need the FTCatalog to support updates as well during the upgrade?
      …………………………………………………………………………….
      …………………………………………………………………………….
      …………………………………………………………………………….

    • No. Then for how long can you live without having the FT app online? (Hours? 1-2 days?)
      …………………………………………………………………………….
      …………………………………………………………………………….
      …………………………………………………………………………….

  • Approximately how many rows do you expect your FTCatalogs to have 1, 2, or 3 years from now? Basically, how do your FTCatalogs grow?
    …………………………………………………………………………….
    …………………………………………………………………………….
    …………………………………………………………………………….
  • How much CPU and other resources are you willing to dedicate to the FTS upgrade? For example, if your app is 100% full-text oriented, you might want to give all available resources to the upgrade process; but if your app carries a significant load of standard relational queries, you might want to give the FTS upgrade second priority so that it does not take resources away from those queries. The more detail here, the better.

…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….

  • Briefly, what hardware do you expect to have in 1-2 years? Number of CPUs, RAM, etc.
    …………………………………………………………………………….
    …………………………………………………………………………….
    …………………………………………………………………………….

 

  • Due to some expected improvements in how Microsoft breaks words for a specific language (e.g. the English word breaker), it may happen that after you upgrade, your FTCatalog (populated with the previous word breakers) is no longer fully consistent with the behavior of the new word breakers.

For example, if your app used to retrieve a result set X when searching for a specific word or phrase, then after the upgrade, although it is not likely, searching for the same word or phrase might return a result set X’, which might differ slightly from X (the recall changed). However, X’ will be more accurate than X thanks to the improvements mentioned above. This could happen because the word or phrase searched contains terms that the previous word breakers and the new ones treat differently.

Would this be a real problem for your app? Or can you live with this change?

(Note: if this behavior is not acceptable to you, you will need to repopulate your FTCatalogs with the new word breakers to solve this issue. Of course, repopulation will make the upgrade significantly slower, depending on how large your FTCatalog is. However, having a duplicate server in production would solve this problem: while one server answers business needs, the other would repopulate the data with the new FT indexes and word breakers in the background.)
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….

2) Your queries. Let us know what your most common FTS usages are.

Microsoft is designing a new FTS architecture and expects to achieve better query performance for most types of queries. In order to focus on performance in specific cases, Microsoft needs to understand which FT queries you use most often.

For example:

  • CONTAINS or FREETEXT? How big is your result set normally?
  • Do you use the ranking functionality provided by CONTAINSTABLE and FREETEXTTABLE?
  • What is more important to you: recalling all the documents that satisfy a search, or rapidly retrieving only the most relevant ones (e.g. using the TOP_N_BY_RANK parameter)?
  • Do you have mixed relational and FTS queries? E.g. CONTAINS(*, 'foo') AND number = 75
  • How complex are your queries? Do you have several FT predicates connected by ORs or ANDs?
  • Which is more critical for you: cold box query performance (the first time a given query is executed) or warm box query performance (when a given query is already known from prior executions, which eventually allows for a cached query plan and cached data)?
  • Etc.
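
To make the query types in the questions above concrete, here is a rough T-SQL sketch of each one. The table dbo.Documents and its columns (DocId, Title, Body, number) are hypothetical and not part of the questionnaire; the sketch assumes a full-text index already exists on Title and Body, with DocId as the full-text key.

```sql
-- Hypothetical schema (an assumption for illustration only):
--   dbo.Documents(DocId INT PRIMARY KEY, Title, Body, number INT)
--   with a full-text index on Title and Body.

-- CONTAINS: match a specific word or phrase.
SELECT DocId, Title
FROM dbo.Documents
WHERE CONTAINS(Body, '"foo"');

-- FREETEXT: match on the meaning of the text rather than exact wording.
SELECT DocId, Title
FROM dbo.Documents
WHERE FREETEXT(Body, 'foo bar');

-- Ranked search: CONTAINSTABLE returns [KEY] and [RANK];
-- the last argument (top_n_by_rank) keeps only the 10 best matches.
SELECT d.DocId, d.Title, ft.[RANK]
FROM dbo.Documents AS d
INNER JOIN CONTAINSTABLE(dbo.Documents, Body, '"foo"', 10) AS ft
    ON d.DocId = ft.[KEY]
ORDER BY ft.[RANK] DESC;

-- Mixed relational and full-text query.
SELECT DocId, Title
FROM dbo.Documents
WHERE CONTAINS(*, '"foo"') AND number = 75;

-- Several full-text predicates connected by OR.
SELECT DocId, Title
FROM dbo.Documents
WHERE CONTAINS(Title, '"foo"')
   OR CONTAINS(Body, '"foo" AND "bar"');
```

When answering, it may help to say which of these shapes dominates your workload and roughly how many rows each one typically returns.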

…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….
…………………………………………………………………………….