Today's project was to index my 100,000 email and Usenet messages into a MySQL full-text index so I can search things I've written. Not bad: 15 minutes to parse and load the messages, 5 minutes to build the index. Queries take a tenth or two of a second. MySQL supports a rich boolean query language.
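
For a flavor of that boolean query language, here is a minimal sketch of how a search could be assembled from Python. The table name `messages`, its columns, and the helper `boolean_search` are made up for illustration and are not the actual schema. The `%s` placeholders assume a DB-API driver that uses that parameter style, as the common MySQL drivers do.

```python
# Hypothetical schema: the table and column names are illustrative only.
CREATE_TABLE = """
CREATE TABLE messages (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    sent DATETIME,
    subject VARCHAR(255),
    body TEXT
) ENGINE=MyISAM
"""


def boolean_search(required=(), excluded=(), optional=(), limit=20):
    """Return (sql, params) for a MySQL boolean-mode full-text query.

    In boolean mode a leading '+' means the word must be present,
    a leading '-' means it must be absent, and a bare word is optional
    but raises the rank of rows that contain it.
    """
    terms = ["+" + word for word in required]
    terms += ["-" + word for word in excluded]
    terms += list(optional)
    sql = (
        "SELECT id, subject FROM messages "
        "WHERE MATCH (subject, body) AGAINST (%s IN BOOLEAN MODE) "
        "LIMIT %s"
    )
    return sql, (" ".join(terms), limit)


if __name__ == "__main__":
    sql, params = boolean_search(required=["mysql"], excluded=["oracle"])
    print(sql)
    print(params)
```

The pair it returns is meant to be handed to a cursor's `execute(sql, params)`, so the search string is passed as a parameter rather than pasted into the SQL.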

What I like best is how easy this was. I spent weeks building Funes, a Java mail search program that was never useful. With Python and MySQL it took me just a few hours and the result is better! Goodbye, grepmail.

I'm not the only MySQL full-text enthusiast: Jeremy Zawodny's blog has a great entry with comments and Mitchell Harper has a useful introductory article. There's also some performance discussion on a PHP forum.

One trick: for speed, run myisamchk -a on the table after building the full-text index. And do your big load before creating the index; once the index exists, inserts are slow.
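
The load-first, index-second ordering can be sketched roughly like this. It assumes a DB-API connection object `conn` and the same made-up `messages` table with `sent`, `subject`, and `body` columns; the function names and the batch size are illustrative, not what I actually ran.

```python
# Hypothetical table and columns; adjust to the real schema.
INSERT_SQL = "INSERT INTO messages (sent, subject, body) VALUES (%s, %s, %s)"
INDEX_SQL = "ALTER TABLE messages ADD FULLTEXT (subject, body)"


def load_then_index(conn, rows, batch_size=1000):
    """Bulk-load all rows first, then build the full-text index once."""
    cur = conn.cursor()
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany(INSERT_SQL, batch)
            batch = []
    if batch:
        # Flush whatever is left over after the last full batch.
        cur.executemany(INSERT_SQL, batch)
    conn.commit()
    # Only now create the index, so the inserts above never pay for it.
    cur.execute(INDEX_SQL)


def analyze_command(myi_path):
    """Argument list for the myisamchk -a run on the table's .MYI file."""
    return ["myisamchk", "-a", myi_path]
```

The list from `analyze_command` could be passed to `subprocess.run`. Where the `.MYI` file lives depends on the installation, so the path is left to the caller, and it is probably safest to run myisamchk while the server is not using the table.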

techgood
  2003-04-21 03:16 Z