Overview

Despite leaping two major versions, EMERSE v6 is very similar to EMERSE v4. The major changes are:

  • New user interface

  • Filters in search

  • Support for Exclude Phrases in all-patient search

  • Support to run Solr and EMERSE on separate servers

  • New Solr index settings

  • New Solr plugin for EMERSE

  • Removal of ActiveMQ

The changes to Solr are responsible for the ability to run Solr and EMERSE on separate servers, along with the ability to run Exclude Phrases in all-patient search (i.e., Find Patients). Previously, the EMERSE server itself read the Lucene index files to do most searches, but now the EMERSE plugin does that work inside the Solr server. Thus, the index files no longer need to reside on the EMERSE server; they can live on a separate server.

Similarly, Exclude Phrases were previously implemented in a way that would be too expensive for all-patient search. We reimplemented them, along with highlighting, and that required new index settings.

Since we changed the index settings from the last public release, you’ll need to re-index all your documents. (You cannot use the index upgrading tool, as you may have in the past, since that tool only upgrades the version of the index, not the index settings describing what is indexed, which is what we’ve changed here.) Since re-indexing is a big task, we recommend you set up a parallel production-like environment and "upgrade" that system. You could first practice on a more test-like system, which may use less disk space, so you can get the steps down.

Broadly, here are the steps:

  1. Provision a new application server, Solr server with sufficient storage, and database to make a parallel production-like system.

  2. Copy your production or test database over to the parallel system. Database administrators should know how to do this efficiently.

  3. Run the upgrade script on the database copy.

  4. Install a new Solr instance for Solr 8.x.

  5. Create a directory for your new index.

  6. Install the EMERSE Solr plugin into your index directory.

  7. Create the cores from our configsets.

  8. Install and set up Tomcat with the emerse.war file.

  9. Copy over your emerse.properties file, adjusting it to talk to the new database and Solr instance.

  10. Further adjust emerse.properties and other installation features for the new release.

  11. Confirm EMERSE is working with an empty documents index.

  12. Index documents into the new Solr instance. This can be done by pulling them from the existing Solr instance, from the original sources, or from some source you may have set up for re-indexing.

  13. Confirm search is working with the newly indexed documents.

Once the parallel production system is up and running, you can simply switch over to it. Alternatively, you could create the parallel system with the new Solr residing on the same server and data storage as the original, and then upgrade the production system in place once the parallel system has confirmed that the new Solr index works with the new version of EMERSE.

Files Needed for Upgrading

Since EMERSE moved to GitHub, we provide the files you need to install EMERSE on the release page in the repository. It’s a private repository, so if you don’t have access (in which case you’ll see a 404 page when you follow that link), you’ll have to create a GitHub account and contact the EMERSE team so we can give you permission.

Setting Up Solr

Download the latest version of Solr with major version 8. Create a new directory for the indexes or "cores." Solr calls this $SOLR_HOME. Add the following files:

  • $SOLR_HOME/solr.xml, which is the same as from before. We also provide a bare-bones version in our configsets.zip.

  • $SOLR_HOME/lib/emerse-solr-6.0.jar, which is our Solr plugin.
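The layout described above can be sketched as a small shell function. This is only an illustration: the function name setup_solr_home and its argument order are made up here, and it assumes you have already downloaded solr.xml and emerse-solr-6.0.jar to some local path.

```shell
# Sketch: lay out a new Solr home for EMERSE v6.
# Hypothetical helper, not part of EMERSE or Solr.
#   $1 = path of the new Solr home
#   $2 = path to the solr.xml you want to use
#   $3 = path to the downloaded emerse-solr-6.0.jar
setup_solr_home() {
  mkdir -p "$1/lib" || return 1          # creates the home and its lib directory
  cp "$2" "$1/solr.xml" || return 1      # solr.xml sits at the top of the home
  cp "$3" "$1/lib/" || return 1          # the plugin jar goes under lib/
}

# Example call (paths are placeholders):
#   setup_solr_home path/to/solr/home ./solr.xml ./emerse-solr-6.0.jar
```

The function stops at the first failing step, so a missing source file is reported by cp rather than leaving a half-built directory unnoticed.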

Next, you’ll need to create the cores. To do this, we need to start Solr, and tell it to create the cores according to a configset.

You’ll need to start up the server with SOLR_HOME pointing to the new home you just created. You can do this by either setting the SOLR_HOME variable in solr-8.*/bin/solr.in.sh, or by passing -s and the path on the command line.

solr-8.*/bin/solr start -s path/to/solr/home

# OR
echo 'SOLR_HOME=path/to/solr/home' >> solr-8.*/bin/solr.in.sh
solr-8.*/bin/solr start

Download the configsets.zip, which contains the config sets. Unzip this somewhere other than within the Solr home.

At this point, if you customized your schema before, you’ll want to customize the schemas inside the unzipped configsets. (Configsets are basically indexes without any of the data files; they’re only a subset of the config files.) The configsets/documents/conf/managed-schema file has our new index settings, so if you changed the names of these fields, make those changes there, but be sure to preserve the index settings. (That is, don’t just overwrite the fields with your old fields; we need each of the attributes termPositions="true" termVectors="true" indexed="true" termOffsets="true" stored="true" on RPT_TEXT and RPT_TEXT_NOIC.)
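After editing the schema, a rough check like the following may help confirm the required attributes survived. It is a sketch under two assumptions: each field definition sits on a single line of managed-schema, and the helper names check_field and check_schema are invented for this example. If you renamed the fields, pass your own names to check_field.

```shell
# Sketch: verify that a field keeps the five attributes EMERSE v6 needs.
# Assumes one <field .../> element per line; hypothetical helpers.
#   $1 = path to managed-schema, $2 = field name
check_field() {
  cf_line=$(grep "<field[^>]*name=\"$2\"" "$1") || {
    echo "$2: field not found" >&2
    return 1
  }
  for cf_attr in termPositions termVectors indexed termOffsets stored; do
    case "$cf_line" in
      *"$cf_attr=\"true\""*) ;;                      # attribute present and true
      *) echo "$2: missing $cf_attr=\"true\"" >&2
         return 1 ;;
    esac
  done
}

# Check both text fields under their default names.
check_schema() {
  check_field "$1" RPT_TEXT && check_field "$1" RPT_TEXT_NOIC
}

# Example call:
#   check_schema configsets/documents/conf/managed-schema
```

This is a plain text match, not an XML parse, so treat a passing result as a sanity check rather than proof that the schema is correct.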

Issue the three commands to make the cores:

unzip configsets.zip
solr create_core -c documents -d configsets/documents
solr create_core -c patient -d configsets/patient
solr create_core -c patient-slave -d configsets/patient-slave

(The -d parameters should refer to the directories of the same name inside the zip.)

In the Solr admin web console, you should see these three indexes up and running. Now, you can begin indexing into them.
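If you prefer the command line to the admin console, Solr's CoreAdmin STATUS request can report whether each core is loaded. The sketch below assumes curl is installed and that Solr is reachable at a base URL such as http://localhost:8983/solr (the default port; adjust for your installation). The function names are invented for this example.

```shell
# Sketch: ask Solr's CoreAdmin API whether a core is loaded.
# Hypothetical helpers; $1 = Solr base URL, $2 = core name.
core_is_loaded() {
  # A loaded core reports its own name inside the STATUS response;
  # an unknown core comes back as an empty object.
  curl -s "$1/admin/cores?action=STATUS&core=$2&wt=json" |
    grep -q "\"name\"[[:space:]]*:[[:space:]]*\"$2\""
}

# Report on the three cores created above.
check_cores() {
  for cc_core in documents patient patient-slave; do
    if core_is_loaded "$1" "$cc_core"; then
      echo "$cc_core: loaded"
    else
      echo "$cc_core: NOT loaded"
    fi
  done
}

# Example call (URL is an assumption):
#   check_cores http://localhost:8983/solr
```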

Re-indexing Solr

To re-index, you just need to push data into the new instance as you would normally. The only question is where all the data comes from. If you have a system set up to easily pull data from the original sources, then that’s probably the best way. If that’s not the case, it is possible to query the original instance of Solr for the text of the documents.
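As one possible way to pull documents out of the original instance, Solr's cursor-based paging (the cursorMark parameter) can walk a whole core one page at a time. The sketch below fetches a single page. It assumes curl is available, that the old core is reachable at a URL like http://old-host:8983/solr/documents (a placeholder), and that you know the name of the core's unique key field, since cursor paging requires sorting on it. Only stored fields come back, so this works only if the document text was stored in the old index.

```shell
# Sketch: fetch one page of documents from the original Solr core.
# Hypothetical helper.
#   $1 = URL of the old core, $2 = unique key field name,
#   $3 = page size, $4 = cursor mark ('*' for the first page)
fetch_page() {
  curl -s -G "$1/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "sort=$2 asc" \
    --data-urlencode "rows=$3" \
    --data-urlencode "cursorMark=$4" \
    --data-urlencode "wt=json"
}

# Example call (URL and field name are placeholders):
#   fetch_page http://old-host:8983/solr/documents ID 500 '*'
```

Each response includes a nextCursorMark value; pass it as the fourth argument to get the next page, and stop when the value returned equals the one you sent.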

Do not use the IndexUpgrader tool that comes with Solr. This tool will only work if the index settings are unchanged between versions, but in our case we have changed the index settings since the prior version of Solr.

Upgrade the Database

Upgrading the database should be as easy as it was for previous releases. Simply run the script upgrade.sql we provided on your database. This will upgrade your database from version 4.10.7. If you are on a previous version, you should upgrade to the 4.10.7 version of EMERSE first, then upgrade to version 6.

Update your emerse.properties file

See the configuration guide for the definitive list of EMERSE properties. Properties that are not in that list are likely no longer needed. In particular, properties pertaining to ActiveMQ are not needed.

EMERSE should complain about missing properties on start-up. One such new property is resources.dir. As described in the config guide, this is a directory somewhere on the application server which contains a cover photo for the login page. (We at UM use a picture of our hospital, so we figured you’d want to change that.) Drop an image called cover.png or cover.jpg in this directory and it should appear on the login screen.
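Placing the cover image can be sketched as below. The function name install_cover is invented for this example, and the directory path is whatever you choose to set resources.dir to in emerse.properties.

```shell
# Sketch: copy a cover image into the resources directory under the
# file name the login page looks for. Hypothetical helper.
#   $1 = resources directory, $2 = source image (.png or .jpg)
install_cover() {
  case "$2" in
    *.png) ic_name=cover.png ;;
    *.jpg) ic_name=cover.jpg ;;
    *) echo "expected a .png or .jpg image" >&2
       return 1 ;;
  esac
  mkdir -p "$1" && cp "$2" "$1/$ic_name"
}

# Example call (paths are placeholders):
#   install_cover /path/to/emerse/resources ./our-hospital.jpg
```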

Deploy the EMERSE war

To redeploy the emerse.war, just replace the old one with the new one in the webapps directory of Tomcat.
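A cautious version of that replacement keeps a copy of the old war in case you need to roll back. The sketch below is an assumption-laden illustration: the function name deploy_war and the backup directory are invented here, and it is probably safest to run it while Tomcat is stopped.

```shell
# Sketch: replace emerse.war in Tomcat's webapps directory, keeping a
# timestamped backup of the old one. Hypothetical helper.
#   $1 = new emerse.war, $2 = Tomcat webapps directory,
#   $3 = backup directory (kept outside webapps)
deploy_war() {
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  if [ -f "$2/emerse.war" ]; then
    mkdir -p "$3" &&
      cp "$2/emerse.war" "$3/emerse.war.$(date +%Y%m%d%H%M%S)" || return 1
  fi
  cp "$1" "$2/emerse.war"
}

# Example call (paths are placeholders):
#   deploy_war ./emerse.war /path/to/tomcat/webapps /path/to/backups
```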