Showing entries 31 to 38
« 10 Newer Entries
Displaying posts with tag: search
Xapian Search Backend Revisited

I wrote previously about looking for a more powerful search solution, and I mentioned that Xapian wasn’t especially convenient for indexing my data. I then chose to experiment with Sphinx a little more, building a number of search engines and indexing a number of data sources in order to decide which direction to go. Unfortunately, while Sphinx was convenient and still provides an excellent backend for basic search indexes, I’m revisiting Xapian once again because it is more flexible than I anticipated. I was brief in my explanation of Xapian, however, and didn’t mention some of its more important and powerful aspects.

Xapian provides an API

Xapian is primarily an API for search indexing and data retrieval. The project does provide a handy utility called Omega for indexing static pages and a plethora of other MIME types. However, I’m in …

Sphinx Fulltext Search Engine Part III (continued)

I’m finally taking the time to continue this series =)

Lyrics Grep

For testing purposes, I went ahead and scraped about 60,000 song lyrics off of a number of sites and developed a simple search engine for them. The script that did the scraping is pretty nasty (to handle equally nasty HTML that I had to parse through), so I’m going to refrain from posting that script and save myself some embarrassment. Make a pot of coffee, sit down, and write one yourself (or something else that’s similar enough).

Database Schema

I created a new database called lyricsgrep with the following simple …

Sphinx Fulltext Search Engine Part II (continued)

Note: Part I is located at the page that describes part one

Just a minor clarification for anyone who was confused: I am currently experiencing Sphinx for the first time. Everything I’m writing about is new to me as well, for the most part. So far, I’m drooling over some of its capabilities; I may come back in a month and rip it a new asshole.

Back to Configuration… (not really, this is the bitching section)

In preparation for my previous post about Sphinx, I had originally played with a number of configuration options, and even encountered a couple of issues that caused my confused butt to have to debug a number of things, and even recompile --with-debug and gdb the thing …

Sphinx Fulltext Search Engine

I’ve been looking for a good solution to manage full text search for large data chunks (2GB - 100GB). I’ve written a couple of solutions using Xapian with limited success, but unfortunately I haven’t been satisfied with it overall. Performance was good, but there were a number of issues with flexibility that have me ultimately looking for another solution.

At my usual day job, the topic was brought up and I mentioned Xapian and Lucene as solutions. However, we’re looking to stay away from Java, as it’s not currently in our architecture, and as I stated before, Xapian doesn’t quite have the capabilities I’m looking for to handle even my own systems. Someone brought up Sphinx as something that was being looked into, and I jumped into the typical research process.

One of the key elements that Sphinx seems to offer …

Getting all SOAPY

So, we rolled out a new site this month. It's young and lacks features that many of the other price comparison sites have, but it has great potential. Our hope is to bring together the best features of all the other players in the market in one great application.
Part of this project required using web services with several different data suppliers. Most support simple REST and SOAP, but some only offer SOAP. So, given that, I bit the bullet and enabled the SOAP extension for PHP 5. Wow! I was happily surprised. The last SOAP code I had looked at was the old PEAR code. It was not that attractive to me. It required a lot of work, IMO, to talk SOAP.

Now, with just 3 lines of code, I can get back a nice object that has all the data I need.  Kudos to Brad …

Sphinx - Open Source SQL Full Text Search Engine

I came across Sphinx today via the MySQL Performance Blog (which has some good entries you might want to check out). It is an Open Source Full Text SQL Search Engine. It can be installed as a storage engine type on MySQL, and from what I hear can beat the pants off of MySQL's built-in full text search in some cases.

From the web site:

Generally, it's a standalone search engine, meant to provide fast, size-efficient and relevant fulltext search functions to other applications. Sphinx was specially designed to integrate well with SQL databases and scripting …

Phorum + Sphinx = really fast


MySQL FULLTEXT Indexing and Searching

MySQL has supported FULLTEXT indexes since version 3.23.23. VARCHAR and TEXT columns that have been indexed with FULLTEXT can be searched with special SQL syntax that performs the full text search in MySQL.

To get started you need to define the FULLTEXT index on some columns. Like other indexes, FULLTEXT indexes can contain multiple columns. Here's how you might add a FULLTEXT index to some table columns:

ALTER TABLE news ADD FULLTEXT(headline, story);

Once you have a FULLTEXT index, you can search it using the MATCH ... AGAINST syntax. For example:

SELECT headline, story FROM news
WHERE MATCH (headline,story) AGAINST ('Hurricane');

Because MATCH is used in the WHERE clause, the rows returned by this query are automatically sorted with the highest relevance first.


The MATCH function is used to …
