Using Vagrant with Sitecore Part 3 – Sitecore using Solr running on Ubuntu Linux

This is part 3 of the series of posts that I am writing about using Vagrant to aid Sitecore development.

Part 1 – Setup, Chocolatey, Vagrant, and VirtualBox
Part 2 – MongoDB on Ubuntu Linux using Vagrant
Part 3 – This post

There are several blog posts that detail how to configure Solr for use with Sitecore.

These posts do a good job of detailing all the steps required to install Solr on Windows, and then how to configure Sitecore to use Solr. Therefore, in this post I will not be duplicating the work that has been done previously. As I am going to be using an Ubuntu VM, I am going to detail what I did to get Solr running on Ubuntu hosted within a Vagrant VM. The only nod to Sitecore and Solr will be detailing the problems I experienced and how I resolved them.

Getting Started

To save some time I just created a new folder and copied in the Vagrantfile and provision.sh from the MongoDB configuration that I built in Part 2.

For this I decided to use the ubuntu/trusty64 box instead of the hashicorp/precise64 box. This box uses Ubuntu 14.04 while the precise64 box uses Ubuntu 12.04, although what is detailed here should work on any version of Ubuntu.

I made only minimal changes to the Vagrantfile: config.vm.box was changed to ubuntu/trusty64, and config.vm.hostname was changed to solr-dev.

 
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.hostname = "solr-dev"
  config.vm.network "forwarded_port", guest: 8983, host: 8983, id: "solr"
  config.vm.provision "shell", path: "provision.sh"
end
 

Port 8983 is the default port that Solr uses, so it needs to be forwarded from the host to the guest.

In the provision file, I removed everything except the first apt-get -y update.

To start, let's run

 
vagrant up
 

Unlike in Part 2, I am not going to show all the output, only what I think is relevant.

Once the command has completed, SSH into the VM with vagrant ssh.

Installing Java

Solr needs Java to run, so before downloading Solr from Apache we first need to install Java. As I am using Ubuntu 14, I followed the instructions from Install Oracle Java 8 (JDK8 and JRE8) in Ubuntu or Linux Mint.

I ran the following commands

 
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer -y
 

This was a slightly different installation. At the start of the install, I had to accept two licence agreement prompts.

If you read the full document from webupd8.org, it gives you the command to auto accept the two licence agreements

 
echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections
 

We are going to need this when we update the provisioning file.

To verify that java is installed, run

 
java -version
 

For me, this was the result


java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)

Installing Solr

If you look at the knowledge base article Solr Compatibility Table, Sitecore only officially supports Solr 4.1 for v7.x and v8.0, and Solr 4.10 for Sitecore 8.1.

This table presents compatibility of different Sitecore CMS/XP versions with different Solr versions as at July 2016.

Solr version      Sitecore 7.0 – 8.0    Sitecore 8.1
1.1 – 4.0         -                     -
4.1               ✓                     *
4.2 – 4.7         *                     *
4.8 – 4.9         *                     *
4.10              *                     ✓
5.0               *                     *
5.1               *                     *
5.2 – 5.4         *                     *
5.5, 6.0 – 6.1    *                     *

Legend:

✓  officially tested, recommended
*  not officially tested, but expected to work
-  no compatibility information

Pretty much for every version of Solr past 4.2, the official guidance is “not officially tested, but expected to work”

<rant>
For me that pretty much translates into, please go ahead and use it, and have your customers be our unofficial beta testers on their production sites.
</rant>

If you review this presentation given by Ryan Donovan at the Sitecore User Group in Belgium in 2016, http://files.meetup.com/14353282/Sitecore%20XM-XP%20160531a.pdf, and go to slide 17, the Search and Indexing enhancements expected when Sitecore 8.2 is released include:

  • Improve support for the SOLR package (include in the default setup)
  • Solr v5 Support

From the slides, I cannot determine which v5 version of Solr Sitecore is planning to support. Knowing Sitecore, I am going to assume that it will probably be 5.0 instead of the latest 5.5.2 or even 6.x.

However because there is no official release date yet for Sitecore 8.2 that I can find, I will assume the best, that they will be using 5.5.2. Therefore this is the version that I will be using

There does not appear to be a package for installing Solr, and the current installation method is to download the relevant archive from a mirror and extract it.

I will be using this mirror site http://mirrors.ukfast.co.uk/sites/ftp.apache.org/lucene/solr/5.5.2/

Download the archive to your home directory

 
cd ~
wget http://mirrors.ukfast.co.uk/sites/ftp.apache.org/lucene/solr/5.5.2/solr-5.5.2.tgz
 

That took about five minutes to download the 130MB file
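Because the download is slow, it may be worth caching the archive so that re-provisioning does not fetch it again. This is a minimal sketch rather than what I actually ran: the fetch_once function name is made up for illustration, and it assumes the default /vagrant shared folder is a sensible place to keep the file.

```shell
# Sketch: download a file only if it is not already cached.
# fetch_once is a hypothetical helper; the /vagrant default is an assumption
# (it is the standard Vagrant shared folder, so the file survives "vagrant destroy").
fetch_once() {
    local url="$1"
    local cache_dir="${2:-/vagrant}"
    local file="${cache_dir}/${url##*/}"   # file name = last path segment of the URL

    if [ -s "$file" ]; then
        echo "using cached $file" >&2
    else
        echo "downloading $url" >&2
        # Remove a partial file if the download fails, so the next run retries
        wget -q -O "$file" "$url" || { rm -f "$file"; return 1; }
    fi

    # Print the local path so the caller can capture it
    echo "$file"
}
```

It could then be used as archive=$(fetch_once http://mirrors.ukfast.co.uk/sites/ftp.apache.org/lucene/solr/5.5.2/solr-5.5.2.tgz), with "$archive" passed to the later commands.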

Extract the service installation file from the archive

 
tar xzf solr-5.5.2.tgz solr-5.5.2/bin/install_solr_service.sh --strip-components=2
 

Install Solr as a service using the script

 
sudo bash ./install_solr_service.sh solr-5.5.2.tgz
 

Solr will be installed in /opt/solr

This is what I received after executing the command above:


id: solr: no such user
Creating new user: solr
Adding system user `solr' (UID 109) ...
Adding new group `solr' (GID 113) ...
Adding new user `solr' (UID 109) with group `solr' ...
Creating home directory `/var/solr' ...

Extracting solr-5.5.2.tgz to /opt


Installing symlink /opt/solr -> /opt/solr-5.5.2 ...


Installing /etc/init.d/solr script ...


Installing /etc/default/solr.in.sh ...

Adding system startup for /etc/init.d/solr ...
/etc/rc0.d/K20solr -> ../init.d/solr
/etc/rc1.d/K20solr -> ../init.d/solr
/etc/rc6.d/K20solr -> ../init.d/solr
/etc/rc2.d/S20solr -> ../init.d/solr
/etc/rc3.d/S20solr -> ../init.d/solr
/etc/rc4.d/S20solr -> ../init.d/solr
/etc/rc5.d/S20solr -> ../init.d/solr
Waiting up to 30 seconds to see Solr running on port 8983 [-] Still not seeing Solr listening on 8983 after 30 seconds!
tail: cannot open '/var/solr/logs/solr.log' for reading: No such file or directory

Found 1 Solr nodes:

Solr process 6568 from /var/solr/solr-8983.pid not found.
Service solr installed.

You can see that Solr did not start. The output above only says that nothing is listening on port 8983.

The actual reason is in the Solr console log:

 
sudo cat /var/solr/logs/solr-8983-console.log
 

Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e8000000, 402653184, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 402653184 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid7025.log

By default, this Vagrant VM is allocated only 512MB of RAM, and the JVM failed to commit the 402653184 bytes (384MB) it asked for.

To fix this, modify the Vagrantfile to give the VM 1GB of RAM. This is the Vagrantfile I ended up using.

 
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"
  config.vm.hostname = "solr-dev"
  config.vm.network "forwarded_port", guest: 8983, host: 8983, id: "solr"
  config.vm.provider "virtualbox" do |v|
    v.memory = 1024
    v.cpus = 2
  end
  config.vm.provision "shell", path: "provision.sh"
end
 

Exit from SSH and issue

 
vagrant reload
  

When the machine reboots, you should now have a working Solr server
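To confirm that the new memory setting actually reached the guest, the total RAM can be read from /proc/meminfo inside the VM. This is a rough sketch, not part of my original setup: the function names are made up, and the optional file argument exists only so that the functions can be tried against a sample file. Note that MemTotal is always a little lower than the configured value because the kernel reserves some memory, so compare against a threshold below 1024.

```shell
# Sketch: report total RAM in MB from a meminfo-style file.
# mem_total_mb and has_min_ram are hypothetical helper names.
mem_total_mb() {
    local meminfo="${1:-/proc/meminfo}"
    # MemTotal is reported in kB; integer-divide to get MB
    awk '/^MemTotal:/ { print int($2 / 1024) }' "$meminfo"
}

# Succeeds when the machine has at least the requested number of MB
has_min_ram() {
    local required_mb="$1"
    local meminfo="${2:-/proc/meminfo}"
    [ "$(mem_total_mb "$meminfo")" -ge "$required_mb" ]
}
```

Inside the VM, has_min_ram 900 || echo "VM has less RAM than expected" gives a quick check, and sudo service solr status should then report whether the service installed by the script is running.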

(Screenshot: a freshly installed Solr 5.5.2 instance running on Ubuntu)

Building the Solr Indexes

Before starting, I suggest opening the Indexing Manager from the Control Panel in Sitecore and verifying the index names. I discovered that some of the names no longer matched what was documented on the web.

As mentioned previously, I am not going to document creating all the schema files etc, as there are already plenty of resources out there and I don’t have much new to add. After following the process, I have ended up with the following file and folder structure

\sitecore
\sitecore\conf\
\sitecore\conf\lang
\sitecore\conf\lang\stopwords_en.txt
\sitecore\conf\_rest_managed.json
\sitecore\conf\currency.xml
\sitecore\conf\managed-schema
\sitecore\conf\protwords.txt
\sitecore\conf\schema.xml
\sitecore\conf\solrconfig.xml
\sitecore\conf\stopwords.txt
\sitecore\conf\synonyms.txt
\sitecore\data\

I have created a zip of this, and you can download it here

Just remember that I built this against a Sitecore 8 Initial Release build, but it should work for all versions of Sitecore. I have not checked whether these files will work with Solr 6.0. Be aware that the files will probably need more work for multiple languages.

Notice that I have not supplied a core.properties file. I will be creating this file dynamically.

I placed this folder within my Vagrant folder for the VM.

After ssh’ing into the VM, execute the following

 
sudo cp -R /vagrant/sitecore /opt/solr-5.5.2/server/solr/sitecore_core_index
echo "name=sitecore_core_index
loadOnStartup=true" | sudo tee /opt/solr-5.5.2/server/solr/sitecore_core_index/core.properties
 

This copies the index folder under the correct index name and creates the core.properties file. Restart Solr for it to pick up the newly created core.

 
sudo /opt/solr/bin/solr restart
 

I now have a new index named sitecore_core_index.

It was at this point I started working on my provision.sh file, to automate all of this.


# Copy the template core folder to a new core named $1 and write its core.properties
function copyIndexFolders() {
	cp -R /vagrant/sitecore "/opt/solr-5.5.2/server/solr/$1"

	echo "name=$1
loadOnStartup=true" | sudo tee "/opt/solr-5.5.2/server/solr/$1/core.properties"
}

echo "start provisioning script"

apt-get -y update

add-apt-repository ppa:webupd8team/java

apt-get update

echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections

apt-get install oracle-java8-installer -y

wget http://mirrors.ukfast.co.uk/sites/ftp.apache.org/lucene/solr/5.5.2/solr-5.5.2.tgz

tar xzf solr-5.5.2.tgz solr-5.5.2/bin/install_solr_service.sh --strip-components=2

bash ./install_solr_service.sh solr-5.5.2.tgz

echo "solr installed"

echo "copying sitecore indexes required"

copyIndexFolders sitecore_core_index
copyIndexFolders sitecore_master_index
copyIndexFolders sitecore_web_index
copyIndexFolders sitecore_analytics_index
copyIndexFolders sitecore_marketing_asset_index_web
copyIndexFolders sitecore_marketing_asset_index_master
copyIndexFolders sitecore_testing_index
copyIndexFolders sitecore_suggested_test_index
copyIndexFolders sitecore_list_index
copyIndexFolders social_messages_web
copyIndexFolders social_messages_master
copyIndexFolders sitecore_fxm_domains_master
copyIndexFolders sitecore_fxm_domains_web

echo "reload solr to see the sitecore indexes"

/opt/solr/bin/solr restart

echo "provisioning script completed"
 

The file will now install Java and Solr, and create, as at the time of writing, the thirteen indexes that Sitecore requires. Hmm, 13. Could you not have changed it to 14?
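One thing to watch with a copy step like this is re-running the provisioner: cp -R into a directory that already exists places the source folder inside it instead of replacing it. A minimal sketch of a re-runnable variant follows. It is an alternative I am suggesting, not what the script above does; create_core, TEMPLATE_DIR and SOLR_CORES_DIR are illustrative names, with defaults that mirror the paths used in this post.

```shell
# Sketch: create a Solr core directory from a template, safely re-runnable.
# TEMPLATE_DIR and SOLR_CORES_DIR are hypothetical variables; the defaults
# mirror the paths used in this post and can be overridden for testing.
TEMPLATE_DIR="${TEMPLATE_DIR:-/vagrant/sitecore}"
SOLR_CORES_DIR="${SOLR_CORES_DIR:-/opt/solr-5.5.2/server/solr}"

create_core() {
    local name="$1"
    local target="${SOLR_CORES_DIR}/${name}"

    # Skip cores that already exist so a second run changes nothing
    if [ -d "$target" ]; then
        echo "core ${name} already exists, skipping"
        return 0
    fi

    # The trailing /. copies the template's contents rather than the folder itself
    mkdir -p "$target"
    cp -R "${TEMPLATE_DIR}/." "$target"
    printf 'name=%s\nloadOnStartup=true\n' "$name" > "${target}/core.properties"
}

# Example (run as root during provisioning):
#   for core in sitecore_core_index sitecore_master_index sitecore_web_index; do
#       create_core "$core"
#   done
```

Skipping existing cores keeps any index data already built, at the cost of not picking up later changes to the template; delete the core folder first if you want a fresh copy.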

Migrating Sitecore to use Solr instead of Lucene

There are a lot of files to switch, and while testing I wanted to switch between Solr and Lucene, so I created the following PowerShell file to help.

 
<#
 
.SYNOPSIS
This is a simple Powershell script that will enable / disable SOLR/LUCENE searching for a sitecore instance
.DESCRIPTION
This has been tested against Sitecore 8.0 Initial Release
.PARAMETER path
Specifies the location of the Sitecore index.config files
.PARAMETER enable
Indicates if must enable the solr configuration files or lucene configuration files
Valid values are SOLR or LUCENE
if -enable is not specified, then will default to SOLR
.EXAMPLE
./enable_index.ps1 -path C:\inetpub\wwwroot\SC80\Website\App_Config\Include -enable SOLR
./enable_index.ps1 -path C:\inetpub\wwwroot\SC80\Website\App_Config\Include -enable LUCENE
.LINK
darrenguy.com
#>

param(
	[Parameter(Mandatory=$true)]
	[string]$path, 
	[string]$enable = "solr")

function enableFile($fileName) {
	$enabledFileName = $fileName + ".config"
	$disabledFileName = $enabledFileName + ".disabled"
	$exampleFileName = $enabledFileName + ".example"

	if ( Test-Path $disabledFileName ) {
		Rename-Item $disabledFileName $enabledFileName 
	}
	
	if ( Test-Path $exampleFileName ) {
		Rename-Item $exampleFileName $enabledFileName 
	}
}

function disableFile($fileName) {
	$enabledFileName = $fileName + ".config"
	$disabledFileName = $enabledFileName + ".disabled"

	if ( Test-Path $enabledFileName ) {
		Rename-Item $enabledFileName $disabledFileName
	}	
}


if ( ( $enable.ToUpper() -ne "SOLR" ) -and ( $enable.ToUpper() -ne "LUCENE") ) {
	# Unknown value: show the help and stop, rather than falling through to the Lucene branch
	Get-Help .\enable_index.ps1 -full
	exit 1
}

#write-output "Path: $path"

if ( $enable.ToUpper() -eq "SOLR" ) {
	Write-Host "Enable solr"
	disableFile "$path\Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration"
	disableFile "$path\Sitecore.ContentSearch.Lucene.Index.Analytics"
	disableFile "$path\Sitecore.ContentSearch.Lucene.Index.Core"
	disableFile "$path\Sitecore.ContentSearch.Lucene.Index.Master"
	disableFile "$path\Sitecore.ContentSearch.Lucene.Index.Web"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Master"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Web"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.IndexConfiguration"
	
	disableFile "$path\ContentTesting\Sitecore.ContentTesting.Lucene.IndexConfiguration"
	
	disableFile "$path\FXM\Sitecore.FXM.Lucene.Index.DomainsSearch"
	
	disableFile "$path\ListManagement\Sitecore.ListManagement.Lucene.Index.List"
	disableFile "$path\ListManagement\Sitecore.ListManagement.Lucene.IndexConfiguration"
		
	disableFile "$path\Social\Sitecore.Social.Lucene.Index.Master"
	disableFile "$path\Social\Sitecore.Social.Lucene.Index.Web"
	disableFile "$path\Social\Sitecore.Social.Lucene.IndexConfiguration"


	enableFile "$path\Sitecore.ContentSearch.Solr.DefaultIndexConfiguration"
	enableFile "$path\Sitecore.ContentSearch.Solr.Index.Analytics"
	enableFile "$path\Sitecore.ContentSearch.Solr.Index.Core"
	enableFile "$path\Sitecore.ContentSearch.Solr.Index.Master"
	enableFile "$path\Sitecore.ContentSearch.Solr.Index.Web"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Master"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Web"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.IndexConfiguration"
	
	enableFile "$path\ContentTesting\Sitecore.ContentTesting.Solr.IndexConfiguration"
	
	enableFile "$path\FXM\Sitecore.FXM.Solr.Index.DomainsSearch"
	
	enableFile "$path\ListManagement\Sitecore.ListManagement.Solr.Index.List"
	enableFile "$path\ListManagement\Sitecore.ListManagement.Solr.IndexConfiguration"
	
	enableFile "$path\Social\Sitecore.Social.Solr.Index.Master"
	enableFile "$path\Social\Sitecore.Social.Solr.Index.Web"
	enableFile "$path\Social\Sitecore.Social.Solr.IndexConfiguration"
}
else {
	Write-Host "Enable Lucene"
	
	enableFile "$path\Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration"
	enableFile "$path\Sitecore.ContentSearch.Lucene.Index.Analytics"
	enableFile "$path\Sitecore.ContentSearch.Lucene.Index.Core"
	enableFile "$path\Sitecore.ContentSearch.Lucene.Index.Master"
	enableFile "$path\Sitecore.ContentSearch.Lucene.Index.Web"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Master"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Web"
	enableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.IndexConfiguration"
	
	enableFile "$path\ContentTesting\Sitecore.ContentTesting.Lucene.IndexConfiguration"
	
	enableFile "$path\FXM\Sitecore.FXM.Lucene.Index.DomainsSearch"
	
	enableFile "$path\ListManagement\Sitecore.ListManagement.Lucene.Index.List"
	enableFile "$path\ListManagement\Sitecore.ListManagement.Lucene.IndexConfiguration"
	
	enableFile "$path\Social\Sitecore.Social.Lucene.Index.Master"
	enableFile "$path\Social\Sitecore.Social.Lucene.Index.Web"
	enableFile "$path\Social\Sitecore.Social.Lucene.IndexConfiguration"

	disableFile "$path\Sitecore.ContentSearch.Solr.DefaultIndexConfiguration"
	disableFile "$path\Sitecore.ContentSearch.Solr.Index.Analytics"
	disableFile "$path\Sitecore.ContentSearch.Solr.Index.Core"
	disableFile "$path\Sitecore.ContentSearch.Solr.Index.Master"
	disableFile "$path\Sitecore.ContentSearch.Solr.Index.Web"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Master"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Web"
	disableFile "$path\Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.IndexConfiguration"
	disableFile "$path\ContentTesting\Sitecore.ContentTesting.Solr.IndexConfiguration"
	disableFile "$path\FXM\Sitecore.FXM.Solr.Index.DomainsSearch"
	disableFile "$path\ListManagement\Sitecore.ListManagement.Solr.Index.List"
	disableFile "$path\ListManagement\Sitecore.ListManagement.Solr.IndexConfiguration"
	disableFile "$path\Social\Sitecore.Social.Solr.Index.Master"
	disableFile "$path\Social\Sitecore.Social.Solr.Index.Web"
	disableFile "$path\Social\Sitecore.Social.Solr.IndexConfiguration"
}
 

I have put this file up for download

Once you switch all the indexes over to Solr, your problems have not finished yet.

After I switched everything over to Solr, I started the website, and was presented with the dreaded yellow screen of death

Connection error to search provider [Solr] : Unable to connect to [http://localhost:8983/solr]

It has to be up there with the top 10 unhelpful messages that Sitecore has ever produced. Without using dotPeek to look into the code, I suspect that when Sitecore attempts to connect to Solr and gets an error back, it throws this exception instead of reporting which index is missing. The sort of good news is that it is mostly down to an incorrectly named index. If you get that message, good luck. The only way I found to solve the problem was to revert back to Lucene and then swap the indexes to Solr one by one, which is a very time-consuming way to identify the problem.
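A quicker first check might be to ask Solr directly whether each expected core exists, since Solr answers a request for an unknown core with HTTP 404. This is a sketch I did not use at the time: missing_cores is a made-up helper, and it assumes bash and curl are available (for example inside the VM). It will not catch every cause of the error, only misnamed or missing cores.

```shell
# Sketch: report which expected cores Solr does not know about.
# missing_cores is a hypothetical helper. A status of 000 means curl could
# not reach Solr at all.
missing_cores() {
    local base_url="$1"; shift
    local core status
    for core in "$@"; do
        # Discard the body and print only the HTTP status code
        status=$(curl -s -o /dev/null -w '%{http_code}' \
            "${base_url}/${core}/select?q=*:*&rows=0")
        [ "$status" = "200" ] || echo "${core} (HTTP ${status})"
    done
}
```

For example, missing_cores http://localhost:8983/solr sitecore_core_index sitecore_master_index sitecore_web_index prints nothing when all three cores respond, and one line per core that does not.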

With Sitecore 8 initial release there is a bug with the following files
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Master.config
Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Solr.Index.Web.config

The param desc is

 
<param desc="core">marketingdefinitions</param>
 

I had to change the value to the following.

 
<param desc="core">$(id)</param> 
 

The index defined within Sitecore.FXM.Solr.Index.DomainsSearch.config produced this error

Could not find property ‘typeMatches’ on object of type: Sitecore.ContentSearch.SolrProvider.SolrFieldMap

I have omitted the stack trace, you only need to know the error.

Sitecore have released a knowledge base article about this: Error after enabling Solr in Sitecore 8.
However, the new file has the same issue as the Marketing indexes, in that you need to update the core param from fxm to $(id). You need to do this for both the sitecore_fxm_domains_master and sitecore_fxm_domains_web indexes.

But the problems don’t stop there. You cannot build the index “sitecore_marketing_asset_index_master” either.

First you are going to get the following error returned

ERROR: unknown field ‘height_t_zh’

As above, I have omitted the stack trace, as you only need to know the error.

Googling the error leads to this post from Jason Bert: Configuring Solr search with Sitecore 8

His solution is to amend the Solr schema.xml.

 
<schema name="example" version="1.5">
  <fields>
    ....       
    <dynamicField name="*_t_zh" type="text_zh" indexed="true" stored="true" />
  </fields>
  <types>
    .....
    <!-- Chinese -->
    <fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.TurkishLowerCaseFilterFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="false" words="lang/stopwords_zh.txt" />
        <filter class="solr.SnowballPorterFilterFactory" language="Chinese" />
      </analyzer>
    </fieldType>
  </types>
</schema>
 

But when I tried that, Solr complained about a missing stopwords_zh.txt file. So in the end I only added the dynamic field, as such:

 
    <dynamicField name="*_t_zh" type="text_general" indexed="true" stored="true" />
 

Then I received another error while trying to rebuild the index.

Unknown field error when re-indexing Sitecore_marketing_asset_index_master using SOLR

Which led to this Sitecore community thread: https://community.sitecore.net/developers/f/8/t/1454

I had to add another dynamic field to the schema.xml file

 
    <dynamicField name="*_t_uk" type="int" indexed="true" stored="true"/>
 

Finally, after that I was able to build the index. You should now be able to rebuild all the indexes within Sitecore. All my indexes use the same schema.xml file, so a fix for one index was propagated to all indexes.

If you review the home page of the Solr UI after creating all the indexes, you will see that it is using nearly all the memory. If you experience performance problems during development, increase the amount of RAM assigned to the VM.

If you want to check the contents of each index, you can run the following queries.

Or execute queries using the Solr UI.
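As a minimal sketch of such a query from the command line, the standard select handler can be asked for a match-all query with zero rows, and the document count read from numFound in the JSON response. The core_doc_count helper name is my own invention, and the parsing is deliberately crude (grep rather than a JSON parser).

```shell
# Sketch: print the number of documents in a core.
# core_doc_count is a hypothetical helper; it relies on the "numFound"
# field that Solr includes in every select response.
core_doc_count() {
    local base_url="$1"
    local core="$2"
    curl -s "${base_url}/${core}/select?q=*:*&rows=0&wt=json" \
        | grep -o '"numFound": *[0-9][0-9]*' \
        | grep -o '[0-9][0-9]*$'
}
```

For example, core_doc_count http://localhost:8983/solr sitecore_master_index should print 0 for a freshly created core and a larger number once Sitecore has rebuilt the index.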

Solr on Windows

In the past I tried to use Solr 5.x running on Windows, but I just could not get it working, and I still have an open question on Stack Overflow.

The problem I was having was that after restarting my machine, Solr would not pick up the cores. If anybody has experienced something similar and got it working, then please update the Stack Overflow question and I will try it again.

Sitecore 8.2

Here’s hoping that with the Solr-first approach in Sitecore 8.2 they sort out all this nonsense. Nobody should be spending any amount of time working around the issues that Sitecore creates. Having Sitecore produce the schema file and then still having to alter it manually is beyond belief, and it is still not fixed in the latest version of 8.1.

I suspect that at the time of writing, and as there is still no hint of a release date, Sitecore are probably still struggling to get this working correctly first time without having to issue so many manual fixes.

Next Steps

The VMs created will work great if you only ever work with a single Sitecore project. If you work with multiple solutions, then you will want to use machine names instead of localhost for accessing Solr and MongoDB.

With some small changes and a few PowerShell commands, all of this can be automated. Don’t forget to update your hosts file.

I ran all of these scripts in an Administrator powershell console. If you get any errors then restart powershell as an Administrator
