Saturday, 27 June 2015

Sitecore Solr - One Core To Rule Them All?

Solr is fast becoming the preferred search technology within Sitecore. When it first became an option in Sitecore 7, I remember Google Hangouts and user group discussions about how Solr supported the ability to combine multiple Sitecore indexes into one Solr core. At the time I didn't understand the significance of this, but I now realize it's quite important.

What is a Solr core?

Apache's definition of a Solr core is "a running instance of an index along with all the Solr configuration required to use it". So does that mean it's just another name for an index? Well, not quite...

If you're used to using Lucene for your Sitecore search needs then you'll know that you define various indexes in Sitecore, configuring each differently depending on its purpose. For example, I might have one index for Products and another for Store Locations.

You still work like this with Solr, but unlike with Lucene, the underlying storage of the data doesn't necessarily correspond to the indexes defined in Sitecore. We can still search against Products and Store Locations independently, but Solr might store the data for both in a single core named "Commerce".

How does the mapping work?

The configuration of each Sitecore index specifies the Solr core in which its data should be stored. When items are crawled, Sitecore adds an additional "_indexname" field to each record. It then appends a clause on that field to each query, depending on the index being used. So some example queries to the "Commerce" core might be:
  • product_id_t: abc123 AND _indexname: products 
  • city_t: london AND _indexname: store_locations
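
To make the mapping concrete, here is a minimal Python sketch of the idea described above. It is not Sitecore's actual implementation: the names INDEX_TO_CORE, scope_query and build_request are hypothetical, and the only things taken from this post are the "_indexname" field, the "Commerce" core and the two example indexes.

```python
# Hypothetical mapping of Sitecore index names to the Solr core that stores them.
# Both example indexes share the single "Commerce" core.
INDEX_TO_CORE = {
    "products": "Commerce",
    "store_locations": "Commerce",
}


def scope_query(user_query: str, index_name: str) -> str:
    """Restrict a query to the records that were crawled for one Sitecore index."""
    return f"({user_query}) AND _indexname:{index_name}"


def build_request(index_name: str, user_query: str) -> tuple[str, str]:
    """Return the target core and the scoped query for a search against an index."""
    core = INDEX_TO_CORE[index_name]
    return core, scope_query(user_query, index_name)


if __name__ == "__main__":
    print(build_request("products", "product_id_t:abc123"))
    print(build_request("store_locations", "city_t:london"))
```

Both searches go to the same core, and only the "_indexname" clause keeps the product records and the store location records apart.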

When to group and when to separate

The OOTB Sitecore indexes all specify separate Solr cores, but is that always the most sensible option? In general I would say that it is. Every time updates to a core are committed, the caches associated with it are cleared, and this causes a temporary reduction in performance. If you group all Sitecore indexes into a single Solr core then you're likely to be clearing those caches very frequently. The slowdown will also affect more areas of your site, because every index in the shared core loses its caches at the same time. Keeping cores separate allows you to isolate the impact of cache clearances.
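
The cache argument above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: every update to an index clears the caches of its whole core exactly once, and the function name cache_clears, the core names and the update sequence are all made up for illustration. It says nothing about how long a real cache takes to warm up again.

```python
from collections import Counter


def cache_clears(index_to_core: dict[str, str], updates: list[str]) -> dict[str, int]:
    """Count how many times each index has its caches cleared.

    An update to one index clears the caches of its core, which affects
    every index stored in that same core.
    """
    clears: Counter[str] = Counter()
    for updated_index in updates:
        core = index_to_core[updated_index]
        for index, index_core in index_to_core.items():
            if index_core == core:
                clears[index] += 1
    return dict(clears)


# Hypothetical workload: products change often, store locations rarely.
updates = ["products", "products", "store_locations", "products"]

separate = {"products": "products_core", "store_locations": "stores_core"}
grouped = {"products": "Commerce", "store_locations": "Commerce"}

if __name__ == "__main__":
    print(cache_clears(separate, updates))  # {'products': 3, 'store_locations': 1}
    print(cache_clears(grouped, updates))   # {'products': 4, 'store_locations': 4}
```

With separate cores the rarely updated store locations lose their caches once; in the shared core they lose them on every product update as well.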

On the other hand, there might be good reasons to group multiple Sitecore indexes into a single Solr core. Perhaps you have a lot of indexes and just find it more manageable to group them. Maybe you find it more useful to group indexes into cores based on the database from which the data originates (master, web, core, etc.).

There probably isn't a perfect answer, but I definitely think these issues need careful consideration when designing your approach to using Solr.

What are your experiences of the options available for mapping Sitecore indexes to Solr cores?
Are there any pros/cons to the different approaches that I've missed?

For some further reading, take a look at Patrick Perrone's post: A Solr Core-nucopia?