Posted 2 years ago
Thinking Sphinx… and feeling jinxed
After some time working with Thinking Sphinx for a client, I figured I should share some of the gotchas I worked through. I didn’t find the TS documentation to be all that bad at first glance. It all seemed fairly comprehensive. But after working with TS in practice, here are a few things I would have liked to know:
1. How to get around delta indexing troubles on Passenger
The documentation actually explicitly addresses this issue but it didn’t solve my problem.
Here’s what was happening. I had correctly set up delta indexes for my model, run the migration to add the delta column, rebuilt the conf and re-indexed locally. In my local dev environment, everything worked fine. But when I deployed via Capistrano to the staging box (running Passenger), things went wrong. In the live app or in script/console, I could see the delta indexer being invoked, and its response even seemed to indicate that the deltas had been processed. However, when I ran a Model.search, my updated or new model instances were not being returned.
The culprit? The documentation does mention that the user the app runs as (in the case of Passenger, the file owner of environment.rb) needs to be the same user that runs Sphinx. That’s true, but my problem was slightly different. Before I started delta indexing, I had been running Sphinx as a different user. This meant that the very first index files (i.e. /db/sphinx/[model_name]_core.sp* ) had been written by, and were owned by, that other user. When I switched users, the index updates silently failed.
The solution? I ended up blowing away the existing index files, re-configuring (re-generating the staging.sphinx.conf) and then rebuilding. After that, the documentation led me to happy delta indexing.
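If you suspect the same ownership mismatch, a rough way to confirm it before deleting anything is to list the index files that belong to someone other than the current user. This is only a sketch: the helper name foreign_owned_index_files is made up for illustration, and "db/sphinx" is an assumed location, so point it at wherever your Sphinx conf actually writes its indexes.

```ruby
# Hypothetical helper: list Sphinx index files under `dir` that are NOT owned
# by the user with the given uid (defaults to the user running this script).
def foreign_owned_index_files(dir, uid = Process.euid)
  # "**" also matches zero directories, so files directly in `dir` are included.
  Dir.glob(File.join(dir, "**", "*.sp*")).select do |path|
    File.file?(path) && File.stat(path).uid != uid
  end
end

# "db/sphinx" is an assumption; adjust it to your own index directory.
foreign_owned_index_files("db/sphinx").each do |path|
  puts "#{path} is owned by uid #{File.stat(path).uid}, not #{Process.euid}"
end
```

Run it as the user the app runs under (for Passenger, the owner of environment.rb). Any file it prints is a candidate for the silent failure described above.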
2. Faceted one-to-many associations
Let’s say you have a model Post that has_many Tags. You might want to index your posts and include the Tags as a search facet so you can see the breakdown of posts across all of the tags. To do that, you might put something like this in your post.rb:
define_index do
  indexes title
  has tags.text, :as => :tag, :facet => true
end
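To make "the breakdown of posts across all of the tags" concrete, here is a small plain-Ruby illustration of the kind of count a tag facet gives you. It does not use Thinking Sphinx at all: the hashes are stand-in data and tag_breakdown is a hypothetical helper, so treat it as a picture of the idea, not of how Sphinx computes it.

```ruby
# Hypothetical helper: count how many posts carry each tag.
def tag_breakdown(posts)
  posts.each_with_object(Hash.new(0)) do |post, counts|
    # uniq guards against a post listing the same tag twice.
    post[:tags].uniq.each { |tag| counts[tag] += 1 }
  end
end

# Stand-in data, not real Post records.
posts = [
  { :title => "First look",  :tags => ["iphone", "ipad"] },
  { :title => "Phone tips",  :tags => ["iphone"] },
  { :title => "Tablet wars", :tags => ["ipad", "android"] }
]

p tag_breakdown(posts)
```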
Then, let’s say you want to see the Posts with a particular set of tag texts. You would do something like this:
Post.search :with => { :tag_facet => ["iphone".to_crc32, "ipad".to_crc32] }
Okay, so that one was pretty much in the documentation.
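In case the to_crc32 calls look mysterious: string facets are filtered by an integer checksum of the text, not by the text itself. The sketch below assumes that to_crc32 is a thin wrapper around Ruby's Zlib.crc32, which is my understanding of Thinking Sphinx and not something spelled out above. The helper tag_filter_values is hypothetical.

```ruby
require "zlib"

# Hypothetical helper: turn tag strings into the integer values
# that the :tag_facet filter is compared against.
def tag_filter_values(*tags)
  tags.map { |tag| Zlib.crc32(tag) }
end

values = tag_filter_values("iphone", "ipad")
p values

# With Thinking Sphinx loaded, the search would then read:
#   Post.search :with => { :tag_facet => values }
```

The same string always maps to the same integer, which is why the filter can match on it.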
Here’s a problem I ran into though. I was adding a :type to the “has” statement. For instance, I had this:
has tags.text, :as => :tag, :facet => true, :type => :string # BAD !
or
has tags.id, :as => :tag, :facet => true, :type => :int # BAD !
Don’t make this mistake. Those :types shouldn’t be there for these one-to-many attributes.
Okay. That’s it for now.