- Spider your site's content to X levels deep
- Build a database of page titles, URLs, and content
- Provide a searchable front-end that is easily customizable to match an existing site
I don't want to use the Google API. It takes too long to update its index, and I'd like the site index to be updated hourly. I've already tried PhpDig, TSEP, and php-crawler. PhpDig was the best of them, but its templating system (and PHP code in general) is horrendous, and I'm about ready to give up on it. I've also heard Lucene mentioned as a great alternative to MySQL fulltext indexes, but I think it's overkill for what I need.
I'm looking for something that uses PHP and MySQL.
Any suggestions?
I'd recommend Lucene. We've been using it for a while now.
Have you tried Swish-e? It is way superior to htdig.
No, I'll check it out.
Subjex.com uses artificial intelligence that works well for site-wide search, but not so well web-wide.
What happened to good old Gigablast site search?
Write your own. Sometimes reinventing the wheel can be a real time saver. All the time you spend picking, customizing, and making templates for an existing product could be spent writing your own.
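For a sense of how small the core can be, here is a rough sketch of the three pieces asked for at the top of the thread: a depth-limited spider, a table of URLs, titles, and content, and a search query. It is only an illustration under some loud assumptions: it is written in Python with the standard-library sqlite3 module standing in for PHP and MySQL so that it runs on its own, `fetch` is a hypothetical callable that returns a page's HTML (or None), and the search is a plain LIKE match with no ranking. The `example.test` pages are made up for the demo.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class PageParser(HTMLParser):
    """Collects the <title>, the text, and the href links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text = []
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif data.strip():
            self.text.append(data.strip())


def crawl(start_url, fetch, max_depth):
    """Breadth-first crawl that follows links at most max_depth hops from start_url.

    `fetch` is a hypothetical callable: url -> HTML string, or None on failure.
    Yields (url, title, content) for every page reached.
    """
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        yield url, parser.title, " ".join(parser.text)
        if depth == max_depth:
            continue  # deep enough: index this page but do not follow its links
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url
            # Stay on the same site and never queue a page twice.
            if link.startswith(start_url) and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))


def build_index(db, pages):
    """Store crawled pages; re-running replaces rows for URLs already indexed."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, content TEXT)"
    )
    db.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", pages)
    db.commit()


def search(db, term):
    """Naive substring search over title and content (no ranking)."""
    pattern = f"%{term}%"
    rows = db.execute(
        "SELECT url, title FROM pages "
        "WHERE title LIKE ? OR content LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return rows.fetchall()


# Demo against a made-up in-memory "site" instead of real HTTP requests.
SITE = {
    "http://example.test/": "<title>Home</title><a href='a.html'>A</a>",
    "http://example.test/a.html": "<title>A</title><p>widgets</p><a href='b.html'>B</a>",
    "http://example.test/b.html": "<title>B</title><p>deep widgets</p>",
}
db = sqlite3.connect(":memory:")
build_index(db, crawl("http://example.test/", SITE.get, max_depth=1))
print(search(db, "widgets"))
```

With `max_depth=1` the spider indexes the home page and the pages it links to, but not `b.html`, which sits two hops down. A real version would swap `SITE.get` for an HTTP fetch, use a MySQL fulltext index instead of LIKE, and run the crawl from an hourly cron job.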
Right now I'm using Site Search Pro: www.site-search-pro.com. It's PHP and integrates seamlessly into any existing website template.
It doesn't have the ranking flexibility that I would like, or I just haven't found the right combo yet, but as far as integration goes, it works perfectly for me.
As far as spidering goes, it will go as deep or as shallow as you want, and it puts everything into a database.
Atomz used to be my favorite, though I guess they got scooped up by WebSideStory at some point.
I've never used it myself, but I've heard of it... htdig?
Thanks for all your input everyone.
I ended up hacking the hell out of PhpDig and got it running decently. You can all give it a whirl sometime next week when the new site goes up.