Think of search as the app platform, not just a feature
In 2000, Yahoo was in pole position to win one of the greatest market opportunities of all time, as one of the most popular and fastest growing early services on the World Wide Web.
The Internet was still relatively new (17 million websites, against 1.6 billion today) and companies like Yahoo belonged to an awkwardly named category, sometimes called “start pages” or “portals,” that served as gateways to services like email, news, finance, and sports. Yahoo was running away with that traffic because it had the friendliest interface and the best content available at the time for that new “web” experience.
In June of that year, Yahoo chose Google as its “default search engine provider,” and Yahoo’s search box was suddenly advertised as “powered by Google.” Before long, users simply went straight to Google for their searches.
Today, Google (aka “Alphabet”) boasts a market capitalization of $1.7 trillion, while Yahoo is remembered as one of the early Internet companies that failed to capitalize on being in exactly the right place at the right time.
Search is the engine of value
The history lesson isn’t just that Google won the Internet with search.
It’s that search is what won for every technology player that dominates its market. It won apps (App Store) and music (iTunes) for Apple, social for Facebook, e-commerce for Amazon, and more. All of today’s most valuable technology brands are masters of search in their application domains. The market has shown us time and time again that search is unequivocally the engine of value, and that those who master search control their markets.
But many developers still struggle to see search as a fundamental part of their app platform. Some view search as something to “bolt on” to the application after the fact, while others fall back on SQL LIKE queries and other half measures.
If you’re trying to understand the importance of search in your app platform strategy, let’s talk about what’s at stake and why you need to get it right.
Search is a conversation with your users
If you walk into a pharmacy and say, “Hey, I’m looking for a home COVID-19 test kit,” and they walk away without answering your question, how does that make you feel? Ignored? Disrespected? You won’t be coming back, that’s for sure.
Search is a conversation with your users. It is how you make it easy for them to interact with your data. What’s more important than that?
Ten years ago, developers working with search mostly tried to analyze text. Natural language processing, the analysis chain, and index setup were all driven by decades of research into how languages are composed, which words are important, how to deal with diacritics, and more.
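To make the idea of a text analysis chain concrete, here is a minimal sketch in Python. It is not how Lucene or any particular engine implements analysis; the stop-word list and the function names are assumptions invented for illustration. It lowercases the input, folds diacritics, splits on non-alphanumeric characters, and drops common words.

```python
import re
import unicodedata

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "in"}


def strip_diacritics(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """A toy analysis chain: lowercase, fold diacritics, tokenize, remove stop words."""
    folded = strip_diacritics(text.lower())
    tokens = re.findall(r"[a-z0-9]+", folded)
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    # Both spellings reduce to the same index terms.
    print(analyze("The Café of Zürich"))
    print(analyze("cafe zurich"))
```

Running the same chain over both documents and queries is what lets a user who types “cafe” find a document that says “Café.”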
Then search evolved to embrace the concept of learning to rank, so that over time you could reorder search results based on what you had observed in past user conversations. This is great basic search functionality that every search engine still offers today.
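A real learning-to-rank system trains a model on many features. The sketch below swaps in a much simpler stand-in, a click-through boost, just to show the principle of reordering results from observed behavior. The function name, the weight, and the sample numbers are all hypothetical.

```python
def rerank(results, clicks, impressions, weight=0.5):
    """Reorder (doc_id, text_score) pairs using observed click-through rate.

    This is a simplified stand-in for learning to rank, not a trained model:
    each document's text score is boosted by weight * (clicks / impressions).
    """
    def boosted_score(item):
        doc_id, text_score = item
        shown = impressions.get(doc_id, 0)
        ctr = clicks.get(doc_id, 0) / shown if shown else 0.0
        return text_score + weight * ctr

    return sorted(results, key=boosted_score, reverse=True)


if __name__ == "__main__":
    # Hypothetical data: "a" scores higher on text, but users click "b" far more.
    results = [("a", 1.0), ("b", 0.9)]
    clicks = {"a": 5, "b": 80}
    impressions = {"a": 100, "b": 100}
    print(rerank(results, clicks, impressions))
```

With these numbers the document users actually click moves ahead of the one that merely matched the text slightly better.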
Show data before your users know they’re looking for it
Today, we’re seeing a major shift in the way search anticipates what data users want before they even know they’re looking for it. I land on Netflix and it already knows I want this movie or am interested in this show — that’s the canonical example of personalization, powered by search indexing and machine learning.
Beneath these use cases of predicting what users want is math that tries to mimic how our brains work. Vector space – words, phrases, or expressions represented in a graph by their location in a language model – is driving this movement.
Search is shifting from text representation to vector representation. The digitally native world of the ubiquitous Internet, ubiquitous e-commerce, and ubiquitous smartphones is pushing us into the next phase of multimodal information seeking. Whether the metaverse wins or a different future platform emerges, sometimes the interface will be text, sometimes it will be voice, and sometimes it will be images or video. Eventually, it may even be neural links directly to the brain.
Vector representation is what makes this type of multimodal information retrieval possible in search. It enables discovery that is not possible with text alone. If someone under 20 says a new song is sick, it probably means something different than if someone over 60 says the exact same thing. We all speak differently, and when we try to anticipate what someone wants, we have to analyze both who they are and what they are looking for at the same time.
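The “sick” example can be sketched with a few lines of Python. The three-dimensional vectors below are hand-picked, hypothetical stand-ins for what a real language model would produce (real embeddings have hundreds of dimensions), and the names are invented for illustration. The point is only that the same word, embedded with different user context, lands near different meanings when compared by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings of candidate meanings (not from a real model).
MEANINGS = {
    "this song is great": [0.9, 0.1, 0.0],
    "this song is awful": [-0.8, 0.2, 0.1],
    "I have the flu": [0.0, 0.1, 0.9],
}

# Hypothetical embeddings of "sick" combined with who is speaking.
SICK_SAID_BY_TEEN = [0.8, 0.2, 0.1]
SICK_SAID_BY_SENIOR = [0.1, 0.1, 0.8]


def nearest_meaning(query_vector, meanings):
    """Return the phrase whose vector is most similar to the query vector."""
    return max(meanings, key=lambda phrase: cosine_similarity(query_vector, meanings[phrase]))


if __name__ == "__main__":
    print(nearest_meaning(SICK_SAID_BY_TEEN, MEANINGS))
    print(nearest_meaning(SICK_SAID_BY_SENIOR, MEANINGS))
```

A text-only match on the word “sick” could not tell these two queries apart; the vectors can, because they encode the speaker’s context alongside the word.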
SQL LIKE queries are a dead end, and so are proprietary engines
As a developer, the decisions you make today about how you implement search will either enable or block your future use cases and your ability to take advantage of this rapidly changing world of vector representation and multimodal information retrieval.
One especially limiting mindset is relying on SQL LIKE queries. This old relational database approach is a dead end for search in your application platform. LIKE queries simply don’t match the capabilities or features built into Lucene or other modern search engines. They also hurt the performance of your operational workload, consuming excessive resources through greedy quantifiers. They are fossils, artifacts of SQL’s origins decades ago, which is the equivalent of a few dozen millennia in application development.
Another common architectural pitfall is proprietary search engines that force you to replicate all of your application data to the search engine when you really only need the searchable fields. Maintaining both a document store for search and a separate store for truth leads to significant complexity, increased storage costs, and latency for the modern full-stack developer, who must now be both a search expert and a part-time database administrator.
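A small sketch shows one capability gap. It uses Python’s built-in sqlite3 module for the LIKE query and a toy inverted index as a rough stand-in for what a search engine builds; the sample titles and function names are hypothetical. A leading-wildcard LIKE pattern is a substring match, so it also matches inside unrelated words, while the inverted index looks up whole terms.

```python
import re
import sqlite3
from collections import defaultdict

# Hypothetical product titles, invented for this illustration.
TITLES = [
    "COVID-19 home test kit",
    "Latest contest winners announced",
    "Pregnancy test strips",
]


def like_search(term):
    """Substring search with SQL LIKE against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany("INSERT INTO products (title) VALUES (?)", [(t,) for t in TITLES])
    rows = conn.execute(
        "SELECT title FROM products WHERE title LIKE ? ORDER BY id",
        (f"%{term}%",),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def build_inverted_index(titles):
    """Map each lowercase token to the set of document positions containing it."""
    index = defaultdict(set)
    for doc_id, title in enumerate(titles):
        for token in re.findall(r"[a-z0-9]+", title.lower()):
            index[token].add(doc_id)
    return index


def index_search(index, titles, term):
    """Look up a whole term in the inverted index."""
    return [titles[doc_id] for doc_id in sorted(index.get(term.lower(), set()))]


if __name__ == "__main__":
    print(like_search("test"))  # also matches "Latest" and "contest"
    print(index_search(build_inverted_index(TITLES), TITLES, "test"))
```

Here the LIKE query returns all three titles because “Latest” and “contest” contain the letters “test,” while the term lookup returns only the two test products. Relevance ranking, typo tolerance, and the analysis steps described earlier widen that gap further.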
Operational workloads such as search are adaptive and dynamic. They are “post-SQL”, obsoleting expensive and inefficient LIKE and CONTAINS operations in older databases.
Getting Started: User Journeys and Destinations
Developers who have understood the importance of search can easily find themselves trying to boil the ocean, building a specialized external system and trying to do it all on the first try. The savvy engineer, by contrast, will simplify and iterate.
Understanding your users is the first step in every successful search implementation I’ve seen. You need to audit their destinations and then trace different user paths, just like with UI design.
Typically, you’ll find that although users’ paths may be different, they often start from the same place and reach the same destination. Getting a very granular understanding of what your users are trying to do and how you’re getting them there will reveal the commonalities that bring focus and simplicity to your development efforts around search.
Marcus Eagan is a contributor to Solr and Lucene and is the product manager for Atlas Search at MongoDB. Prior to that, he was the Developer Tools Manager at Lucidworks. He was a global technology manager at Ford Motor Company and he led an IoT security startup through its acquisition by a router maker. Eagan works hard to help underrepresented groups break into technology, and he’s contributed to open source projects since 2011.
The New Tech Forum provides a venue to explore and discuss emerging enterprise technologies with unprecedented depth and breadth. The selection is subjective, based on our selection of the technologies that we think are important and most interesting for InfoWorld readers. InfoWorld does not accept marketing materials for publication and reserves the right to edit all contributed content. Send all inquiries to [email protected]
Copyright © 2022 IDG Communications, Inc.