Sunspot, Tips and Tricks

Lately I've been working a lot with Solr and the awesome Sunspot gem (sunspot_rails), and every so often I've needed to do something unusual with it: deleting a document from the index by hand, ordering by date with the undated records first and the dated ones after, handling different statuses, and so on.

So here are a few tips and tricks for this kind of task.

Removing an object from Solr index

Did you ever need to delete an element from the index to reproduce a bug? Well, this is how you can do it:

Sunspot.remove_by_id(ObjectType, id)

Cool, isn't it? This is another way:

Sunspot.remove(ObjectType) do
  with(:field, 'value')
end
# Remove all posts older than one week
Sunspot.remove(Post) { with(:created_at).less_than(7.days.ago) }
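
One thing worth keeping in mind: a removal, like any other index change, only shows up in searches after a commit, and outside a web request (in the console, for example) that commit may have to be issued by hand with Sunspot.commit. A minimal sketch, assuming a hypothetical helper called remove_and_commit (it is not part of Sunspot) that receives the session as an argument so it can be tried without a running Solr:

```ruby
# Hypothetical helper, not part of Sunspot: remove one document and commit
# right away so the change is visible to the next search.
# `session` is any object that responds to remove_by_id and commit; in an
# application with Sunspot loaded, that would be the Sunspot module itself.
def remove_and_commit(klass, id, session)
  session.remove_by_id(klass, id)
  session.commit
end
```

In an app this would be called as remove_and_commit(Post, 42, Sunspot).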

Reading Solr Config

Inspecting the current Sunspot configuration:

puts Sunspot.config.inspect

Changing the default of 30 results per page (for example, in an initializer):

Sunspot.config.pagination.default_per_page = 100

Conditional indexing

You can make indexing conditional with the :if or :unless options:

class Product < ActiveRecord::Base
  searchable(if: proc { |model| model.should_reindex? }) do
    # field definitions go here
  end
end

Avoid indexing if specific columns are updated

class Product < ActiveRecord::Base
  searchable(ignore_attribute_changes_of: [:updated_at, :internal_state]) do
    # field definitions go here
  end
end

Eager loading

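You can hand ActiveRecord eager-loading options to searchable through the :include option, so the associations needed to build each document are loaded up front when reindexing:

class Product < ActiveRecord::Base
  searchable(include: [:variants]) do
    string :name, stored: true
    text :description, stored: true
    # One value per variant, read from the eager-loaded association
    string(:sku, stored: true, multiple: true) { variants.map(&:sku) }
  end
end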

Case Insensitive Ordering

Solr sorts string fields case-sensitively, so case-insensitive ordering is hard to get directly. The easiest way is to index an extra virtual field that holds the value in lower case and sort on that:

class Product < ActiveRecord::Base
  searchable(include: [:variants]) do
    string(:name, stored: true)
    # Lower-cased copy of the name, used only for sorting
    string(:name_order) { name.downcase }
  end
end

Product.search { order_by(:name_order, :asc) }
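
To see why the extra field helps, here is a small plain-Ruby sketch that needs neither Solr nor Rails. It assumes that a raw string sort puts every uppercase letter before every lowercase one, which is how Ruby compares strings; the method names are made up for the illustration.

```ruby
# Mimics ordering on the raw name field: plain string comparison, where
# "B" (byte 66) sorts before "a" (byte 97).
def order_by_name(names)
  names.sort
end

# Mimics ordering on the name_order field: compare lower-cased copies,
# but return the original names.
def order_by_name_order(names)
  names.sort_by(&:downcase)
end

names = ["apple", "Banana", "cherry", "Date"]
puts order_by_name(names).inspect        # => ["Banana", "Date", "apple", "cherry"]
puts order_by_name_order(names).inspect  # => ["apple", "Banana", "cherry", "Date"]
```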

Ordering by nil values first then ordered results

By default, Solr puts the documents that have a value first and the ones without a value last. To invert that, you need a workaround: index a second field that falls back to a very early date when there is no value, and sort on that field instead.

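class Product < ActiveRecord::Base
  searchable(include: [:variants]) do
    # nil when the product has never been purchased
    date(:last_purchased_at, stored: true) { orders.last.try(:created_at) }
    # Same value, but falls back to a date close to the epoch so that
    # never-purchased products come first in ascending order
    date(:last_purchased_at_order, stored: true) do
      orders.last.try(:created_at) || Time.zone.at(1)
    end
  end
end

Product.search { order_by(:last_purchased_at_order, :asc) }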
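
The sentinel trick can be tried in plain Ruby without Solr. This is only a sketch: ProductRow, NEVER_PURCHASED, and sorted_names are names invented for the illustration, and Time.at(1) stands in for Time.zone.at(1).

```ruby
# Stand-in for a product and the date of its last order (nil if none).
ProductRow = Struct.new(:name, :last_purchased_at)

# Sentinel: one second after the Unix epoch, earlier than any real purchase.
NEVER_PURCHASED = Time.at(1).utc

# Value that would be indexed in the last_purchased_at_order field.
def order_key(row)
  row.last_purchased_at || NEVER_PURCHASED
end

# Ascending sort on the sentinel-backed key: rows without a date come first.
def sorted_names(rows)
  rows.sort_by { |row| order_key(row) }.map(&:name)
end

rows = [
  ProductRow.new("lamp",  Time.utc(2015, 3, 1)),
  ProductRow.new("chair", nil),
  ProductRow.new("desk",  Time.utc(2014, 7, 9))
]
puts sorted_names(rows).inspect  # => ["chair", "desk", "lamp"]
```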

Reindexing records

Do you want to reindex all documents?

By Model

WARNING: This removes every document of that model from the index and then indexes all of its records again!

Model.reindex

All Documents

RAILS_ENV=environment bundle exec rake sunspot:solr:reindex

Avoiding index destruction in production

You can prevent the reindex task from running in production (I ran it there once, and the site showed no products for more than 40 minutes):

namespace :sunspot do
  desc "Prevents from deleting solr index in production environment"
  task :check_environment do
    if Rails.env.production?
      fail "Can't use this task in production mode, as it's destructive to Solr index."
    end
  end
  task :reindex => [:check_environment]
end

Incremental reindexing

Sometimes you need to reindex the full catalog but want to do it incrementally, batch by batch, without emptying the index first. In that case you can use something like this:

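namespace :sunspot do
  desc "Reindexes MODEL_NAME in batches without clearing the index first"
  task :incremental_reindex => :environment do
    model_name = ENV['MODEL_NAME']
    # safe_constantize returns nil instead of raising for unknown names
    model = model_name.safe_constantize if model_name.present?
    raise "Model not found: #{model_name.inspect}" unless model
    model.find_in_batches do |records|
      Sunspot.index records
    end
    Sunspot.commit # make the reindexed documents visible to searches
  end
end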
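
A note on validating MODEL_NAME: defined? applied to a local variable is always truthy, whatever string the variable holds, so it can't tell whether the model exists. The sketch below shows the difference in plain Ruby. resolve_model is a hypothetical helper written for this example; in a Rails app, ActiveSupport's String#safe_constantize does a similar job.

```ruby
# Hypothetical helper: turn a model name into its class, or nil when the
# name is blank, malformed, unknown, or refers to something that is not a class.
def resolve_model(name)
  return nil if name.nil? || name.strip.empty?
  constant = Object.const_get(name.strip)
  constant.is_a?(Class) ? constant : nil
rescue NameError
  nil
end

model_name = "NoSuchModel"
puts defined?(model_name).inspect       # "local-variable", even though no such class exists
puts resolve_model(model_name).inspect  # nil
puts resolve_model("String").inspect    # String
```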

Searching by hashtags?

Making Sunspot search by hashtags is surprisingly hard: a plain full-text search on the regular text field isn't enough, so you need a workaround like this:

class Product < ActiveRecord::Base
  searchable(include: [:variants]) do
    text(:description, stored: true)
    # Index only the hashtags found in the description
    text(:description_tags, stored: true) do
      description.to_s.scan(/#\w+/).join(' ')
    end
  end

  # Class method, so it can be called as Product.filter('#tag1')
  def self.filter(query)
    search do
      keywords(query,
               fields: query.start_with?('#') ? :description_tags : :description)
    end
  end
end

Product.filter('#tag1') # returns all documents that have '#tag1'
Product.filter('tag1')  # returns all documents that have 'tag1' or '#tag1'
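
The two pieces of logic in that workaround, extracting the hashtags and choosing the field to search, can be checked on their own in plain Ruby. A small sketch: the method names are made up for the illustration, and the pattern /#\w+/ is an assumption that requires at least one word character after the #, so a lone # is skipped.

```ruby
# What would be stored in the description_tags field: only the hashtags,
# separated by spaces.
def extract_tags(description)
  description.to_s.scan(/#\w+/).join(' ')
end

# Which field a query should be run against (field names as in the
# Product model above).
def search_field_for(query)
  query.start_with?('#') ? :description_tags : :description
end

puts extract_tags("New #summer colors, now with #free_shipping")  # => #summer #free_shipping
puts search_field_for('#summer').inspect                           # => :description_tags
puts search_field_for('summer').inspect                            # => :description
```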

So that's it, thank you for reading!
