28 Feb 08

sfLucene Quick Tip 2: Automatic Re-indexing

Here’s another quick tip for sfLucene. As you’ll notice when you start using the plugin, once you’ve set up all of your search.yml files you need to run a pake task to build the index:

symfony lucene-rebuild frontend dev

This will (re)build a search index for the “frontend” application using the “dev” environment. This is great, but it means the search index is only as current as the last time you ran the command. It would be a pain to do this manually every time we wanted the index updated, so we’ll use a cron job to automate the re-indexing.

Since I’m using a Media Temple Grid Server for this project, I’ll simply use their control panel to enable the cron job. Media Temple has a great KnowledgeBase article on how to set up and configure a cron job through the Grid Server control panel, so see that article for more info if you’re on a Grid Server.

For those of you using another service or your own flavor of Linux/Unix, there are instructions all over the web for the same approach, but basically all you have to do is run the same “lucene-rebuild” command on an interval of your choosing. You also have to use the full path to the symfony command, which lives in your symfony project’s root folder. For a Grid Server, this would be something like this:

php5 /home/XXXXX/domains/example.com/symfony lucene-rebuild frontend prod

…where XXXXX is your Grid Server site number and “example.com” is your domain name. Note the use of “php5” before the path of the actual command. This is required on a Grid Server because by default it will try to run the script using the php4 CLI. Also, your path may vary depending on how your files are set up in the domain folder itself. I set mine to index once a day, but if your site has a high data turnover rate, you may want to index more frequently so that results stay fresh. You’ll also notice that I’m using the “prod” environment, since that’s what the site uses in production.
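
For anyone editing a crontab by hand rather than going through a control panel, a once-a-day schedule could look roughly like the entry below. Treat it as a sketch: the 3:00 am run time is an arbitrary choice for illustration, the path is the same placeholder used above, and “php5” is the Grid Server name for the PHP 5 CLI, so substitute whatever your host calls it.

```
# minute hour day-of-month month day-of-week  command
# Rebuild the sfLucene index for the "frontend" app in the "prod" environment
# once a day at 03:00 (the time is only an example)
0 3 * * * php5 /home/XXXXX/domains/example.com/symfony lucene-rebuild frontend prod
```

For a site with higher data turnover, changing the first two fields to “0 *” would run the rebuild at the top of every hour instead.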

Update 2008-03-04: Looks like I had propel.builder.addBehaviors set to false in my propel.ini so the behaviors weren’t being built into the model classes when I was rebuilding them. So new and modified objects are now correctly and automatically re-indexed. I still use this cron job though, albeit less frequently, to update the search index for static files that are being indexed through the Action indexer.
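
For reference, the fix described in this update is a one-line change. The sketch below assumes the usual symfony 1.0 layout, where the file lives at config/propel.ini; your file will contain other settings that should be left alone.

```
; config/propel.ini
; Must be true for Propel behaviors (including the one sfLucene relies on)
; to be built into the generated model classes
propel.builder.addBehaviors = true
```

After changing the setting, the model classes have to be regenerated before it takes effect. I’m assuming the stock “symfony propel-build-model” task here.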


2 Responses to “sfLucene Quick Tip 2: Automatic Re-indexing”


  1. Gunnar Lium Mar 2nd, 2008 at 10:47 am

    Doesn’t sfLucenePlugin automatically update the index whenever the model changes?

  2. Mark Quezada Mar 2nd, 2008 at 1:48 pm

    It’s *supposed* to automatically reindex model objects whenever they are added or changed, but that hasn’t really worked for me for some reason. It’s definitely in the Propel Behavior to do so, but I haven’t been able to get it to work reliably. Also, the benefit of the cron job is that static files being indexed (through the Action indexer) also get updated if you’ve made changes to them. I use the “symfony sync” pake task to rsync changes to my production server, so this helps keep even the static pages fresh in the index. I haven’t used this with an extremely large number of records, so Your Mileage May Vary, but it works well for what I use it for. I’ve read that Carl (the creator of the plugin) intends the rebuild task for development rather than for use from a cron job, but I haven’t had any issues thus far.