Search Experts Blog
Ideas and Thoughts on Enterprise Search, Microsoft SharePoint and FAST Search
 
November 19, 2010
Predictions on Search, Part 1 of 3: Cloud Computing will transform enterprise search and usher in a new era of Information Worker productivity.

Cloud Computing is going to completely transform how IT Professionals and Information Workers deploy, manage, and interact with technology. The Cloud is the third of three forces that have shaped the IT industry as we know it today, and it will without question have a larger impact than the other two combined. The other two forces I'm referring to are the CPU and the Internet.

The first wave, the CPU, brought massive processing power to the desktop. Software vendors quickly stepped in to develop applications and software development platforms that took advantage of Moore's Law. This shaped how IT Pros and Information Workers interacted with software. Client Server computing became the standard for two decades.

The second wave came on the scene around 1995. The Internet represented low-cost, reliable connectivity. Software vendors very quickly recognized that Client Server architectures and software development platforms were completely inadequate to exploit this new opportunity. In Microsoft's case, they set out to develop a new software development platform, and .Net (dotNet) was born. dotNet gave developers a platform for building applications that took advantage of the CPU on both the desktop and the server, and leveraged the Internet for connectivity.

So here we are in the present day, and along comes a new engine of economic growth: Cloud Computing, which fully leverages both the CPU and the Internet. This is going to create a new paradigm for IT, but what does it mean for Enterprise Search / Information Access?

1 – The Cloud will enable a new data structure that will render the Search Index obsolete

The search index is ideal for searching against unstructured content. In today's world, however, the information worker has to pull together data that is unstructured, semi-structured, and structured. Take a typical customer service application. In order to track a customer issue, the customer rep first has to verify that the caller is indeed a customer by doing a database lookup. Once the rep understands the issue, he might then have to search the company's knowledge base, support forums, and product documentation.



A search index can certainly store structured and semi-structured content. In fact, if you are using Microsoft's Business Data Catalog or Business Connectivity Services, this is exactly what's happening, but it is far from optimal. Suppose the customer wants information about his or her purchase history. The search index is designed to find matching documents, not to aggregate purchase dollars.
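To make the distinction concrete, here is a minimal sketch in Python. It is not how any particular search product works; the documents, purchase records, customer ids, and function names are all made up for illustration. It shows that finding documents by keyword and summing purchase dollars are two different kinds of operation.

```python
from collections import defaultdict

# Hypothetical data: support documents (unstructured) and purchases (structured).
documents = {
    1: "printer driver fails to install",
    2: "reset the printer after a paper jam",
}
purchases = [
    {"customer": "C-100", "amount": 120.00},
    {"customer": "C-100", "amount": 35.50},
    {"customer": "C-200", "amount": 80.00},
]

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

def total_spend(rows, customer):
    """Sum purchase amounts for one customer: a scan over numeric values."""
    return sum(row["amount"] for row in rows if row["customer"] == customer)

index = build_inverted_index(documents)
matching_docs = search(index, "printer jam")   # term lookup plus set intersection
spend = total_spend(purchases, "C-100")        # numeric aggregation over rows
```

The inverted index answers "which documents mention these words?" very quickly, but it holds nothing that helps answer "how much did this customer spend?". That second question needs the numeric values themselves, which is why it usually goes to a database.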

Search vendors are well aware of this problem. Many have already started to create "Hybrid Indexes", indexes that bear no resemblance at all to what most of us use today. As an example, SharePoint 2010 stores metadata and values in a flat file data structure. FAST, in contrast, stores this information in a cube. This is the reason why SharePoint can only provide Shallow Facets. SharePoint can easily serve up a search result in under a second, but aggregating the metadata values for each facet at query time would degrade performance significantly. FAST, in contrast, aggregates the values at index time. The index of the future will need to do far more than serve up deep facets. The combination of disk and processing power in the cloud will enable this.
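The difference between shallow and deep facets can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of SharePoint or FAST internals: the metadata, the ranked hit list, and the function names are hypothetical. Shallow facets count values over only the first few hits, while deep facets count over every hit by reusing groupings prepared at index time.

```python
from collections import Counter, defaultdict

# Hypothetical metadata: document id -> file type.
doc_types = {1: "pdf", 2: "docx", 3: "pdf", 4: "pptx", 5: "pdf", 6: "docx"}

def build_facet_postings(metadata):
    """Index-time step: group document ids by metadata value."""
    postings = defaultdict(set)
    for doc_id, value in metadata.items():
        postings[value].add(doc_id)
    return dict(postings)

def shallow_facets(ranked_hits, metadata, top_n):
    """Count values over only the first top_n hits."""
    return Counter(metadata[doc_id] for doc_id in ranked_hits[:top_n])

def deep_facets(ranked_hits, postings):
    """Count values over every hit using the index-time groupings."""
    hits = set(ranked_hits)
    counts = {value: len(ids & hits) for value, ids in postings.items()}
    return {value: n for value, n in counts.items() if n > 0}

facet_postings = build_facet_postings(doc_types)
ranked_hits = [3, 1, 2, 5, 6]  # hypothetical ranked result list for some query
shallow = shallow_facets(ranked_hits, doc_types, top_n=2)
deep = deep_facets(ranked_hits, facet_postings)
```

With only the top two hits inspected, the shallow count sees just two PDFs and misses the Word documents further down the list. The deep count covers the whole result set, at the cost of keeping extra structures built when the content was indexed.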

2 – The Cloud will enable unprecedented levels of Performance and Scale

Enterprise Search Architectures are increasingly being stressed by exploding growth of data and the requirement to process the data in ways that make it more useful to the Information worker.

The ongoing battle to stay ahead of the information glut puts continuous stress on IT departments, which are hard pressed to predict capacity requirements, obtain funding for additional storage and servers, and implement upgrades in a timely fashion. The Cloud will provide capacity on demand without requiring a traditional deployment. From a cost perspective, the benefit is paying only for what you need, when you need it.

Performance will be a key benefit that the cloud provides. As one example, Search Engines can index documents at astonishing rates, but the moment you try to enrich the data being indexed via advanced techniques such as Entity Extraction, performance drops precipitously. While traditional indexing uses the standard technique of Extract, Transform, and Load (ETL), the cloud will enable highly scalable approaches such as ELT: Extract first, Load second, and then Transform. The key point here is where you do the "T". Transforming the data at the same time you are trying to build an index has proven to be a fragile architecture.
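A minimal Python sketch of the two orderings follows. Everything in it is an assumption made for illustration: the sample documents, the dictionary standing in for an index, and the toy entity extractor, which simply picks out capitalized words and is nowhere near a real Entity Extraction engine.

```python
import re

def extract_entities(text):
    """Toy stand-in for entity extraction: capitalized words only."""
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", text)))

def etl_index(raw_docs):
    """ETL: enrich every document before it is loaded into the index."""
    index = {}
    for doc_id, text in raw_docs.items():
        index[doc_id] = {"text": text, "entities": extract_entities(text)}
    return index

def elt_load(raw_docs):
    """ELT, extract and load: store raw text so it is searchable right away."""
    return {doc_id: {"text": text, "entities": None}
            for doc_id, text in raw_docs.items()}

def elt_transform(index):
    """ELT, transform: enrich already-loaded documents in a separate pass."""
    for record in index.values():
        if record["entities"] is None:
            record["entities"] = extract_entities(record["text"])
    return index

raw_docs = {1: "Contoso opened an office in Oslo", 2: "quarterly report"}
etl_result = etl_index(raw_docs)

elt_loaded = elt_load(raw_docs)
loaded_before_transform = elt_loaded[1]["entities"]  # not enriched yet
elt_result = elt_transform(elt_loaded)               # enrichment happens later
```

Both paths end with the same enriched records. The difference is that in the ELT path the documents are already loaded before the expensive step runs, so the transform can be scheduled, scaled out, or retried without holding up indexing.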

3 – The cloud will enable a new productive workplace for Information Workers

It's difficult to predict what cloud-enabled search will look like. Just as the Internet has created new business models like eBay on the commercial side, and rich search experiences like Google on the consumer side, cloud-based search will power a new way of interacting with information. In the Enterprise, IDC has been calling this the Intelligent Workspace. This new workspace will sit at the intersection of composite applications, data integration, and document assembly. In short, search will become the primary way we interact with information.
