I've talked in the past about the possible futures of search engines and the biggest threats to Google's dominance, and today I've been able to follow up on one of those possibilities: "vertical search fracturing."

iMedix is one of the more exciting vertical search engines on the market. Not only are they a recipient of the Best New Startup Award from the Crunchies, they're also very highly regarded in the marketplace by a lot of smart players. Not everyone's a fan, but they're in an area of high opportunity and have executed remarkably well from both a development and marketing perspective.

iMedix Screenshot of Results Page

The biggest "difference" that iMedix offers is the inclusion of a social network that integrates with the search results. Above, you can see I searched for allergies, and along the sidebar are people who share that interest and are willing to talk about it. This takes advantage of some big psychological strengths in the health field - namely the need to "not be alone" when dealing with health issues. I'd imagine that iMedix's community is held together by stronger bonds than most, and that, potentially, gives them a powerful edge.

Last week, I got the opportunity to interview Iri Arimav and Amir Leitersdorf from iMedix (CMO & CEO, respectively) about their progress to date and some of their search technology issues:


For those of us who aren't familiar, can you give us a background about iMedix - how it started? What are the goals of the company and what is the business model?

iMedix started because of a personal health need for both of us. Since we both love the Web, we saw it as an untapped resource for people looking to make better health related decisions. We felt that searching alone using the popular horizontal search engines was definitely not the right way to find and share health information. Not only were the results irrelevant in most cases, they also left us unaided and with a strong sense of anxiety. We knew that there was excellent information online, but it was organized poorly. At the same time, it was difficult for us to communicate easily and quickly with other people that had valuable personal health experiences and knowledge. We decided to build a health search engine that will be powered by the patients for the benefit of patients. We built a prototype that improved into an alpha version that upgraded to what you see today on iMedix.

From a vertical search perspective, do you see an ability to pull search traffic mindshare away from Google/Yahoo!/Live? Do you think people will ever come on the web and think - "I need to search for health/medical issues, so I'll use iMedix, not Google."? Or is your strategy more like WebMD - to build a portal and attract search traffic from the major engines?

We believe that the experience of finding and sharing health information on the Web is about to change dramatically. An exceptionally different and valuable experience can pull traffic mindshare away from traditional web properties. The experience iMedix offers to consumers is very different than Google/Yahoo or Live. People nowadays are attracted to communicating with each other and engaging in various ways. People want more than search and browse. The growth of social networks that took substantial traffic from traditional portals in the last 3 years is a good example. We are experimenting with an innovative marketing approach now and it looks very promising. We are continuously introducing new features that allow for the creation of valuable user generated health content and most importantly listen to the needs of our fast-growing community. We believe that if you create value and a worthy experience, the rest will follow.

It appears that iMedix crawls and indexes websites, just like the major search engines, but rather than indexing the entire web, you've limited your search to only domains pertaining to health. A few questions on that - do you hand select the domains to be included? If so, do you worry that the information is more narrow, or do you think searchers will feel confident knowing the results include more authoritative sources and fewer blogs, scrapers, etc.?

iMedix crawls a subset of the Web to index the most informative and relevant health articles available. Using our patented technology we crawl and index numerous Web directories and medical databases resulting in almost any known medical site in the English language. This way, we are able to cover all of the symptoms, conditions and treatments known to medicine.

In terms of building a search engine crawler, did you custom develop something yourself or use a technology like Nutch? Does iMedix maintain all those inverted-keyword databases or do you use third party technology? And, how many pages (approx.) do you have in your index?

Our crawler is a multi-threaded, distributed computing technology that was developed in-house and is capable of crawling hundreds of Web sites in parallel. Our crawler relies on a high-bandwidth network and, while using a single database, maintains an average crawl rate of half a million pages per hour without burdening the crawled sites' performance.

The crawler detects similarity between pages, thus avoiding "over crawling," and is also capable of detecting the frequency of change in a site's content so as to optimize the crawl scheduling and gain the maximum data update from each crawl session.
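(An aside from me, not from iMedix: they didn't share how their crawler does either of these things, so here's a minimal Python sketch of one common way to approach both. Word shingles compared with Jaccard similarity stand in for the "similarity between pages" check, and a simple halve-or-double rule stands in for the change-frequency scheduling. Every function name, threshold, and interval below is my own assumption for illustration.)

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8):
    """Treat two pages as near-duplicates when their shingle sets overlap heavily.

    The 0.8 threshold is an arbitrary illustrative value.
    """
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold


def next_crawl_interval(current_hours, changed, min_hours=1.0, max_hours=720.0):
    """Halve the revisit interval if the page changed since the last crawl,
    double it otherwise, and clamp the result to [min_hours, max_hours]."""
    factor = 0.5 if changed else 2.0
    return max(min_hours, min(max_hours, current_hours * factor))
```

A crawler built along these lines would skip storing a page when `is_near_duplicate` fires against something already fetched, and would feed each visit's "did it change?" result into `next_crawl_interval` to decide when to come back.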

The indexer was also developed in-house. While relying on open source projects for representing documents in a vector space model, we developed a grid-computed application based on divide-and-conquer algorithms that can index hundreds of documents per second into binary files that can be searched instantly while indexing continues in the background.

The number of documents in our index varies a lot since we are constantly increasing the number of sites crawled, while deleting pages that were flagged as irrelevant both automatically (by heuristics of the crawler or indexer) and also by manual processing that is done by our staff. In our recent versions we have used indexes of 10-20 million health pages depending on the factors stated above.
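(Another aside from me: for readers who haven't met the vector space model mentioned in the indexer answer, here's a toy Python sketch. It keeps an in-memory inverted index, weights terms with TF-IDF, and ranks by cosine similarity. It is not iMedix's indexer, which they describe as grid-computed and file-based; the `TinyIndex` class and its methods are hypothetical names of mine, and it assumes each document ID is added only once.)

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index with TF-IDF weights and cosine-similarity ranking."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.docs = {}                     # doc_id -> Counter of term frequencies

    def add(self, doc_id, text):
        """Index one document; it is searchable as soon as this returns."""
        counts = Counter(tokenize(text))
        self.docs[doc_id] = counts
        for term, tf in counts.items():
            self.postings[term][doc_id] = tf

    def _idf(self, term):
        """Smoothed inverse document frequency (always positive)."""
        df = len(self.postings.get(term, ()))
        return math.log((1 + len(self.docs)) / (1 + df)) + 1.0

    def search(self, query, top_k=5):
        """Return up to top_k (doc_id, cosine score) pairs, best first."""
        q_tf = Counter(tokenize(query))
        q_vec = {t: tf * self._idf(t) for t, tf in q_tf.items() if t in self.postings}
        if not q_vec:
            return []
        q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
        dots = defaultdict(float)
        for term, q_w in q_vec.items():
            idf = self._idf(term)
            for doc_id, tf in self.postings[term].items():
                dots[doc_id] += q_w * tf * idf
        results = []
        for doc_id, dot in dots.items():
            d_norm = math.sqrt(
                sum((tf * self._idf(t)) ** 2 for t, tf in self.docs[doc_id].items())
            )
            results.append((doc_id, dot / (q_norm * d_norm)))
        results.sort(key=lambda pair: pair[1], reverse=True)
        return results[:top_k]
```

Because weights are computed at query time, a page added with `add` shows up in the very next `search` call, which loosely mirrors the "searchable while indexing continues" property described above, though at nothing like the same scale.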

As a follow-up with regards to the ranking algorithm - is it something you've done in-house? Did you end up using a modified version of something like PageRank? TrustRank?

Our IP resides with our ranking algorithms that analyze the feedback received from our users in order to recognize patterns of useful pages. The ranking formula is constantly and automatically updated according to the users' feedback. The learning machine itself is built upon an ensemble of modern algorithms in the machine-learning field. The classifying algorithms are focused on bringing very high precision in predicting the probability of a page being a good match given a certain query. Our proprietary technology is also developed with the assistance of our chief scientist, Prof. Yuval Shahar, who is the head of the Medical Informatics Research Center at Ben Gurion University and has more than 15 years of experience in the most advanced health information retrieval and artificial intelligence technologies. Prof. Shahar holds a Ph.D. from Stanford University in Medical Information Sciences and is a certified medical doctor.
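(One more aside from me: iMedix's ranking is proprietary and described as an ensemble, so the Python sketch below is a deliberately simpler stand-in, not their method. It uses a single online logistic regression that nudges its weights after each piece of user feedback and then sorts candidate pages by predicted probability of being a good match. The `FeedbackRanker` class and the idea of representing each query/page pair as a short list of numeric features are assumptions of mine.)

```python
import math


class FeedbackRanker:
    """Online logistic regression over hypothetical query/page features."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated probability that a page is a good match for the query."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-60.0, min(60.0, z))  # clamp to keep math.exp from overflowing
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, helpful):
        """Take one gradient step from a single 'helpful' / 'not helpful' vote."""
        error = (1.0 if helpful else 0.0) - self.predict(features)
        self.bias += self.learning_rate * error
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * error * x

    def rank(self, candidates):
        """Sort (page_id, features) pairs by predicted usefulness, best first."""
        return sorted(candidates, key=lambda c: self.predict(c[1]), reverse=True)
```

The point of the sketch is the loop, not the model: every vote changes the weights a little, so the ordering produced by `rank` drifts toward whatever users keep marking as useful.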

iMedix, in addition to being a search engine, leverages users to help build a community - what made you choose that path and how have your users been responding?

We decided to choose this path because we felt that people want to be empowered and can contribute so much from their experience and knowledge. If we build the right tools we can organize and leverage this collaborative effort. Although we regularly read Forrester, Jupiter Research, eMarketer and all the other major research companies, we believe that listening to your users is the most important thing when building successful products. Our path is really a reflection of our conversations with our users. We are fortunate that our users like to use iMedix and that the Internet community decided that we are worthy of winning the Best New Startup of 2007 at the Crunchies worldwide competition.

iMedix has obviously been a hit in its first year, winning the Crunchies award for best new startup - with regard to publicizing the product, what have been your strategies to date and where have you seen the most success?

Thank you very much for your kind words. We feel very fortunate to be in the right place at the right time. One of the strategies that worked well for us was to truly spend time learning and understanding the products that existed out there and get to know our audience. Not just read reports, but engage in an honest and an open discussion with bloggers, opinion leaders and patients. We were in contact with hundreds of these people and we learned together with them and with the help of our users what we needed to develop. I believe these people enjoyed the process and decided to share it with their friends and readers. This had a tremendous effect on our visibility and traffic.

Finally, with high expectations and a startup environment, I imagine things can get fairly overwhelming - how have you and the rest of the iMedix team done with work/life balance? Any recommendations you have for a startup - things you'd change or do differently the next time around?

The startup environment is indeed very demanding. The past year has been pretty tough for all of us. We’ve worked around the clock and had a hard time balancing our personal lives. Eventually, we are all happy and satisfied but several lessons were learned along the way.

We would recommend the following to every start-up CEO/founders:

  • Have 1 day every week in which you do NOT check your emails.
  • Understand that people are the most important resource in a start-up company. Make sure that they spend time with their families and encourage them to take vacations.
  • Keep your employees informed of the company’s progress and activities. If you expect everyone to share the effort then you definitely need to share the fruits.
  • And finally.... Celebrate! Even if you don’t have the time. Don’t let those wonderful moments pass you by.

Thanks, Iri & Amir - I'm not entirely sure I can live up to all of those suggestions, but will certainly try. Much appreciated!


BTW - Some interesting thoughts to consider for search marketers:

  • How do you target content towards vertical search sites that could potentially be whitelisting domains?
  • Are you currently seeing any significant search traffic from vertical engines outside the major properties?
  • Is traffic from a site like iMedix potentially more or less valuable/focused to site owners?

My biggest concern is that marketers who don't pay attention to potential future landscapes may miss out on big opportunities to leapfrog the competition when those changes do occur.