
Thinking Sphinx… and feeling jinxed

After some time working with Thinking Sphinx (TS) for a client, I figured I should share some of the gotchas I worked through. At first glance the TS documentation seemed fairly comprehensive. But after working with TS in practice, here are a few things I would have liked to know:

1.  How to get around Delta indexing troubles on Passenger

The documentation actually explicitly addresses this issue but it didn’t solve my problem.

Here’s what was happening. I had correctly set up delta indexes for my model: I had run the migration to add the delta column, rebuilt the conf and re-indexed locally. In my local dev environment, everything worked fine.

When I deployed via capistrano to the staging box (running Passenger), things went wrong. In the live app or in script/console, I could see the delta indexer being invoked, and the response from the indexer even seemed to indicate that the deltas were processed. However, when I ran a Model.search, my updated or new model instances were not being returned.

The culprit? The documentation does mention that the user the app is running under (in the case of Passenger, the file owner of environment.rb) needs to be the same user that is running sphinx. That’s true. But here was my problem: before I started delta indexing, I had been running sphinx as a different user. This means that the very first index files (i.e. /db/sphinx/[model_name]_core.sp*) had been written by, and were owned by, that other user. Then, when I switched users, the index updates were silently failing.

The solution? I ended up blowing away the existing index files, re-configuring (re-generating the staging.sphinx.conf) and then rebuilding. Thereafter, the documentation led me to happy delta indexing.
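If you suspect the same ownership mismatch, one way to confirm it before deleting anything is to list the index files that are not owned by the user you are currently running as. Here is a minimal Ruby sketch using only the standard library. It assumes the index files live somewhere under db/sphinx and end in .sp*, as mine did; the helper name foreign_owned_index_files is my own and is not part of TS.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): returns the index
# files under index_dir whose owner differs from expected_uid.
def foreign_owned_index_files(index_dir, expected_uid = Process.euid)
  # Look for Sphinx index files (*.spa, *.spd, ...) at any depth.
  Dir.glob(File.join(index_dir, "**", "*.sp*")).select do |path|
    File.file?(path) && File.stat(path).uid != expected_uid
  end.sort
end

# Assumed index location; adjust to wherever your sphinx conf points.
foreign_owned_index_files("db/sphinx").each do |path|
  puts "#{path} is owned by uid #{File.stat(path).uid}, not #{Process.euid}"
end
```

Run it as the user the app runs under. Any file it prints is a candidate for the silent failure described above.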

2.  Faceted one-to-many associations

Let’s say you have a model Post that has_many Tags. You might want to index your posts and include the Tags as a search facet so you can see the breakdown of posts across all of the tags.

So you might do something like this in your post.rb:

define_index do
  indexes title
  has tags.text, :as => :tag, :facet => true
end

Then let’s say you want to see the Posts with a particular set of tag texts. You would do something like this:

Post.search :with => { :tag_facet => ["iphone".to_crc32, "ipad".to_crc32] }

Okay, so that one was pretty much in the documentation.
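In case the .to_crc32 calls in that search look like magic: the facet filter takes integers rather than the tag strings themselves, because the string facet is stored as a CRC32 value. As far as I can tell, to_crc32 is a small String extension that TS adds on top of Zlib.crc32, but treat that as my assumption rather than documented behaviour. A sketch of the same conversion outside of Rails, where tag_facet_values is a hypothetical helper name of mine:

```ruby
require "zlib"

# Hypothetical helper (not part of Thinking Sphinx): converts tag texts
# into CRC32 integers of the kind used as string-facet filter values.
# Assumes TS's String#to_crc32 behaves like Zlib.crc32.
def tag_facet_values(texts)
  texts.map { |text| Zlib.crc32(text) }
end

values = tag_facet_values(["iphone", "ipad"])
puts values.inspect
```

Under that assumption, the resulting array could be passed as the :tag_facet value in the :with hash in place of the inline to_crc32 calls.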

Here’s a problem I ran into though.  I was adding a :type to the “has” statement.  For instance, I had this:

has tags.text, :as => :tag, :facet => true, :type => :string   # BAD !

or

has tags.id, :as => :tag, :facet => true, :type => :int   # BAD !

Don’t make this mistake.  Those :types shouldn’t be there for these one-to-many attributes.

Okay. That’s it for now.
