This blog is dedicated to the in-depth review, analysis and discussion of technologies related to the search and discovery of information. This blog represents my views only and does not reflect those of my employer, IBM.


Saturday, September 23, 2006

Drinking From the Fire Hose

So you issue a search and the search engine tells you that there are 24,832,271 results, of which it displays the top ten on the first page. You scan the first page, maybe click through the next couple of pages. But have you ever wondered what those other 24 million results are? It’s a bit overwhelming – kind of like drinking from a fire hose. I for one think the number is meaningless and shouldn’t be displayed at all, especially since there is no practical way to skip to, say, result one million and one. But there are techniques that the search engine can employ to help you understand what’s behind that number. One in particular is a technique called multifaceted search.


Rather than just showing a single list of results and the total number of results found, multifaceted search displays a list of facets alongside your results. Each facet represents a different dimension of your results and provides the number of results in that facet. For example, suppose I issue a search for “digital cameras”. The search engine would display my first page of results (as before) but also display the different facets for digital cameras, such as price, manufacturer, or features, as shown to the right. The facets act as a kind of guided navigation through the myriad of results that remain, complete with counts to let you know what you’re getting into. As you click on a facet, your search is reissued with the constraint defined by the facet. So clicking on Kodak would show me only the digital cameras manufactured by Kodak. The other facets (and their counts) are also updated based on this selection – the price counts would only reflect the prices offered by Kodak in this case.
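
To make the mechanics concrete, here is a minimal sketch in Python of counting facets and then refining by one of them. The catalog, the field names, and the helper functions `refine` and `facet_counts` are all made up for illustration; this is not how OmniFind or any particular engine implements it.

```python
from collections import Counter

# Hypothetical result set for the query "digital cameras" (illustrative data only).
CATALOG = [
    {"name": "Camera A", "manufacturer": "Kodak", "price": "under $200", "zoom": "3x"},
    {"name": "Camera B", "manufacturer": "Kodak", "price": "$200 to $500", "zoom": "10x"},
    {"name": "Camera C", "manufacturer": "Canon", "price": "$200 to $500", "zoom": "3x"},
    {"name": "Camera D", "manufacturer": "Canon", "price": "over $500", "zoom": "10x"},
    {"name": "Camera E", "manufacturer": "Nikon", "price": "under $200", "zoom": "3x"},
]

# The dimensions we choose to expose as facets (an assumption of this sketch).
FACETS = ("manufacturer", "price", "zoom")


def refine(results, **constraints):
    """Keep only the results that match every selected facet value."""
    return [r for r in results if all(r[f] == v for f, v in constraints.items())]


def facet_counts(results):
    """Count how many results fall under each value of each facet."""
    return {f: Counter(r[f] for r in results) for f in FACETS}


# Counts shown next to the initial result list.
all_counts = facet_counts(CATALOG)

# Clicking "Kodak" reissues the search with that constraint and recounts.
kodak_only = refine(CATALOG, manufacturer="Kodak")
kodak_counts = facet_counts(kodak_only)

print(all_counts["price"])
print(kodak_counts["price"])
```

Because the counts are recomputed from whatever results remain, a facet value with no matching results simply never appears in the list. In this toy data, the "over $500" price band drops out once Kodak is selected.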

Mike Moran, a colleague of mine from IBM, points out that multifaceted search is quite different from the typical "advanced search" where users are prompted to choose their criteria up front—they ask for a digital camera with 10x optical zoom for under $1000 and get the dreaded, "No results found." When they have specified more than two criteria, they don't even know which one to back off to get some results. With multifaceted search, they just pick their facets in the order of their importance to them. Invalid combinations are never shown.

So multifaceted search seeded by my original query gives me a window into what lies ahead in my result list. It has proven to be extremely effective in e-commerce applications to assist shoppers in the discovery of specific products. IBM’s OmniFind Discovery Edition is one such product that leverages multifaceted search for this purpose. The OmniFind Discovery Edition is designed to be deployed for online commerce, self-service portals, and call center scenarios via a suite of tailored configurations and prepackaged industry solutions. Discovery Edition can extract facets from existing metadata stored in a relational database (e.g., a product catalog) or from document content using advanced text analytics. Business rules expressed in natural language can be defined to determine when particular facets are revealed and in what context. The end result is a much richer search experience for your users, one that helps them find what they are looking for in less time.

P.S. I'm not sure what motivated my son to use the garden hose that way :-)



Thursday, September 14, 2006

WebSphere Portal Technical Conference Review

Have you ever noticed how conferences seem to bunch up in the spring and fall? This fall is no different. My last post two weeks ago was on the Search Engine Strategies conference in San Jose. This week I returned from speaking at two WebSphere Portal conferences, one held in Baltimore, Maryland, the other in Stuttgart, Germany. Both conferences had the same agenda, so I'll give my perceptions of them as if they were one. While my main reason for attending was to present the integration of IBM's advanced search technologies into portal, I was also able to learn about some of the other exciting new developments in WebSphere Portal 6.0, which is just coming out. I'll touch on some of those developments as well as the advances in portal search.

Search Within WebSphere Portal

A portal is a web site that offers a single point of entry to a broad array of resources and services. The IBM WebSphere Portal enables an enterprise to quickly build a web portal of their own and make it available to their customers externally through the internet or through an intranet to their employees. A portal is made up of one or more web pages. Each web page can have one or more portlets arranged on the page. A portlet is a specialized content area that occupies a small "window" in the portal page. For example, a portlet can contain travel itineraries, business news, local weather, or sports scores.

Consequently, a portal acts as a kind of aggregator of an enterprise's content on the glass. For most enterprises the amount of content can be immense and not navigable without search. The IBM WebSphere Portal comes with its own search engine, but it is limited in scale to about 800 thousand documents and to HTTP-accessible sources only. At the conference I talked about how IBM's enterprise search engine, OmniFind, could be used to go beyond these limitations and provide much more advanced search features.

In particular, OmniFind can scale to millions of documents and supports a broad range of enterprise content sources. Supported sources include the web, news groups, file systems, Notes databases, QuickPlaces, document and content management systems, relational databases, and much more. So nearly all of your enterprise content becomes searchable through your portal. Advanced search portlets are available that provide a robust set of search features to enhance the overall search experience. And by leveraging state-of-the-art ranking algorithms and text analytics, OmniFind can assure that the results are relevant. And that's what is important. If your users can't find what they are looking for, they become frustrated, abandoning search and using more expensive means such as phone or email. Or worse yet, they may leave your site altogether.

Composite Applications

While composing applications out of components is more efficient than creating them from scratch, building complex business logic using a set of portlets can be pretty tedious. You need to:
  1. Deploy the individual components one after another.
  2. Arrange the deployed pieces on the staging system, as desired.
  3. Define portlet interaction and access control according to the business logic to be implemented.

All these steps need the active involvement of application developers, portal administrators, and the people with the necessary business domain skills. To simplify this process, WebSphere Portal V6 introduces composite applications. Business analysts and application designers can assemble composite applications that implement complex business logic from individual components, such as portlets, processes, or other code artifacts. Composite applications rest on two fundamental concepts: templates and applications. A template describes a composite application in an abstract way, including information that defines how complex business logic is assembled out of a given set of components. The template is an XML file that references all components, such as portlets or Java code artifacts, and specifies applicable meta information, such as specific configuration settings for each individual component. The template describes the behavior of the composed application by defining the desired interaction between the components, such as wires between portlets, and the access control logic to be enforced, such as application-specific user roles. Because templates are stored in a template library, a user can pick a template and create a new instance of the composite application that is described by the selected template definition. Users can manage their application instances based on their own needs.
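
The template-versus-instance relationship can be sketched in a few lines of Python. To be clear, this is only a conceptual model: the class names (`Template`, `Application`, `TemplateLibrary`), the component names, and the roles are all hypothetical, and the real templates are XML files whose format is not shown here.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Template:
    """Abstract description of a composite application (hypothetical model)."""
    name: str
    components: dict  # component id -> configuration settings
    wires: list       # (source component, target component) pairs
    roles: list       # application-specific user roles


@dataclass
class Application:
    """A concrete instance created from a template."""
    title: str
    template_name: str
    components: dict
    wires: list
    roles: list
    members: dict = field(default_factory=dict)  # user -> role

    def assign(self, user, role):
        # Only roles defined by the template may be granted.
        if role not in self.roles:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role


class TemplateLibrary:
    """Stores templates and stamps out independent application instances."""

    def __init__(self):
        self._templates = {}

    def add(self, template):
        self._templates[template.name] = template

    def instantiate(self, template_name, title):
        t = self._templates[template_name]
        # Copy everything so each instance can be managed on its own.
        return Application(title, t.name, deepcopy(t.components),
                           list(t.wires), list(t.roles))


library = TemplateLibrary()
library.add(Template(
    name="team-project",
    components={"calendar": {"view": "week"}, "tasks": {"page_size": 20}},
    wires=[("calendar", "tasks")],
    roles=["owner", "contributor"],
))

app_a = library.instantiate("team-project", "Launch Planning")
app_b = library.instantiate("team-project", "Office Move")

# Each instance is customized independently of the template and of its siblings.
app_a.components["calendar"]["view"] = "month"
app_a.assign("alice", "owner")
```

The point of the copy in `instantiate` is that changing one instance leaves the template and the other instances untouched, which is what lets users manage their own application instances.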

WebSphere Portal and AJAX

Behind the scenes, WebSphere Portal V6 leverages Asynchronous JavaScript and XML (AJAX) in several places to move UI logic from the server into the browser. For example, contextual menus are implemented with AJAX: the appropriate choice of menu options is determined in the browser without request/response round trips to the portal server. WebSphere Portlet Factory uses AJAX to implement a “type-ahead” capability and to refresh UI fragments within a page. The use of AJAX has a dramatic effect on the overall usability of the portal. Because the UI processing is performed in the client and does not require a refresh of the entire page from the server, AJAX removes the annoying flash or flicker typically experienced as a user navigates within a page.

