You’ve probably seen the option to install Full-Text Search during a SQL Server installation without really understanding what it is and how it can be used. As with other database systems SQL Server has a full spectrum of features that typically reaches far beyond what is needed in any one software development project. I look at Full-Text Search Services as one of those step child type features I never felt was worth spending time on since it seemed so trivial at first glance. I couldn’t ever really recall a time when I had a problem searching text per say. If you are like me you probably thought about why would you need that feature for a second and then went on your merry way. Honestly I didn’t really know what the full-text search feature was in the scheme of things and never really had a situation where I felt like it was something I really needed.
Recently this feature became a topic of interest to me for a project I was working on for a client who wanted to create an intermediate interface for creating new CRM accounts. They didn’t want CRM to be just another system of independently created and maintained account information which would only add to their existing problem. The solution was to create a system that managed and linked duplicate customer account information together from various systems. The premise behind this system was to allow a user to look for an existing company record before creating a new CRM account. Instead of displaying the standard CRM new account form, the intermediate interface would be shown allowing the user to search for an existing company. If that company already existing in the CRM system they would simply be redirected to that record else the user could elect to create a new record either based on an existing record or from scratch. While it is true that the company information itself would effectively be duplicated in CRM for a new record, it would automatically be linked to existing company records in other systems while also reducing the amount of keying required to add the account information to CRM. In addition to the internal account information being this system they also added public databases of company information from other sources such as the Dun & Bradstreet database. By including public sources of company information the user could search for a company that may not be found in their existing systems but still allowing the user to create a new CRM account without having to rekey all that data in manually.
In theory this sounds great until I came to see the amount of data that needed to be searched. I ended up having upwards of 17 million rows to start off in my primary table. The user needed to be able to search on not only company name, but any information related to a company such as address, tax Id, duns number, etc. I couldn’t just simply throw an index on the company name field and call it a day. I also didn’t want to try creating a covering indexes across tables. I needed something much more comprehensive. As I found myself having to search through this significant amount of text data than normal I felt like I had to pursue another option. I wanted to research any opportunities to increase search performance of text fields in my database feeling that normal indexes were not going to cut it. With a full-text index I could create a new kind of index that would cover all the fields in my company table that indexed the things I wanted to search on in the first place – the words, not the character data in the fields themselves. Using full-text search I created an interface that allowed the user to perform their company searches over millions of rows for data that could be in one of a number of fields in my table.
If you think about the world we live in today we are increasingly storing more textual data in databases as we move towards a paperless society. You can expect that the amount of text database systems are going to be required to search through will only increase over time and standard indexes are not going to get the job done. This being the case it makes sense for you to have the ability create a special kind of index that can provide better performance and functionality on the level of words and phrases rather than simply on the characters that make up the words themselves.
If you take a cursory look at what you can find online about full-text search you’ll most likely find at the top of your search some MSDN documentation from Microsoft on how to setup full-text search. At this point I wasn’t so much interested in knowing how to set it up versus why I would want to setup it in the first place.
So what is this full-text search feature? The name itself seems rather descriptive but it doesn’t give you a glance into how it can actually benefit you. Let’s think of this special type of index in terms of a book. A book will typically have a table of contents containing an index of chapters and possibly an appendix containing supplementary information on words or phrases and where to find them within the book. The table of contents index is typically based on the context of subject matter found in each chapter. This makes sense from a reader’s perspective who may want to skip ahead to read a particular topic. This is not easy to do if the reader only had an appendix which would be used more for a specific lookup of a word or phrase of interest.
I like to think of a typical index on a text field like the appendix of a book. The appendix is simply based on the content of the book which is valid for certain purposes but not necessarily the context. Full-text search from my perspective provides a higher level of indexing as compared to an appendix. A typical index on a text field indexes character data whereas a full-text index tracks the occurrences of words and word-forms in the blocks of unstructured text which makes a full-text index structure very different from standard indexes.
When it comes to searching text you can think of full-text search as a super index. Most developers are probably most familiar with searching through text data using the LIKE operator. Depending on the amount of data being searched the LIKE operator will often times give good enough performance, but if you think about it how often are you actually searching text data by the character? Most likely you are probably looking for a word or phrase. The biggest performance gains with a full-text index over a standard index comes when as you find yourself having to search larger datasets. This is because the full-text index is based on words and not simply on the individual characters that make up the words. When the full-text search engine indexes text it breaks the words into tokens which become the basis of the index.
Here’s what Microsoft has to say about full-text index performance:
“The performance benefit of using full-text search can be best realized when querying against a large amount of unstructured text data. A LIKE query against millions of rows of text data can take minutes to return; whereas a full-text query can take only seconds or less against the same data, depending on the number of rows that are returned. “
Besides the performance gains the full-text search engine allows you to not only search on words, but phrases, and inflectional forms of words. You can search for a word such as “drive” and specify that other forms of the word should quality in the search like “drives”, “driving” or “driven”. You also can specify searches on synonymous forms of a specific word as well as a word or phrase that begin with specific text.
You can think of full-text search as SQL Server’s own little search text search engine. You have the ability to rank results and limit the results returned. As a matter of fact a blog post by Microsoft MVP Michael Coles gives an example of how to create your own Google type search using the full-text search engine at http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/.
In his blog Michael describes how to construct a query engine that uses similar syntax to Google for searching text inside SQL Server. One of the benefits to this is that most people are already familiar with using search engines so you can parlay this experience to reduce the learning curve of an application.
Some of the other nice things you can do with a full-text search are:
So you may still be asking yourself, but what are the actual characteristics of a full-text search index that make it unique? Here are a few features of full-text search that goes beyond a normal index:
SQL Server’s full-text search engine is a great feature that helps you when dealing with large amounts of text data. I admittedly took it for granted thinking that it was just some fluffy feature that I didn’t really see as a necessity in my development efforts. Now that I know what full-text search is and what it can do for me I will always consider if the solution I’m working on can benefit from a full-text search index. Now you can do the same.
MSDN Full-Text Search Documentation: http://msdn.microsoft.com/en-us/library/ms142571(v=SQL.90).aspx
The complementary paper includes over 12 years of research, recent survey results, and CRM turnaround success stories.
This 60-second assessment is designed to evaluate your organization's collaboration readiness.
Learn how you rank compared to organizations typically in years 1 to 5 of implementation - and which areas to focus on to improve.
This is a sandbox solution which can be activated per site collection to allow you to easily collect feedback from users into a custom Feedback list.
Whether you are upgrading to SharePoint Online, 2010, 2013 or the latest 2016, this checklist contains everything you need to know for a successful transition.