This guide describes configuration changes that affect the behavior of EMERSE, tuning of the JVMs associated with the EMERSE application and Solr, and some security hardening procedures.


Most of the default configuration will not need modification, but some settings are specific to the deployment environment and would normally be changed. The most commonly changed settings are the database connection details, the URL of the Solr instance with indexed documents, and the location of the Solr data files.

Solr configuration

Solr has its own configuration files, which you generally would not need to modify. If you are looking for them, they are on the server in the SOLR_HOME/corename/conf/ directory, where each Solr core has its own file (for example, in SOLR_HOME/documents/conf/). Each of these files should be named solrconfig.xml.

The same directory (SOLR_HOME/corename/conf/) also contains the managed-schema file, which will likely need to be modified to match your local document metadata structures and the corresponding database tables. For more information see the Data Guide.

Loading Configuration

EMERSE loads configuration data mainly via property file configuration provided by the Spring Framework. Multiple paths can be defined for locating a file named project.properties, and Spring will use the first one it finds.

Normally this file is located inside the EMERSE war file. Once deployed, the file is located at TOMCAT_HOME/webapps/EMERSE/WEB-INF/classes. Using this location is usually not desirable in production environments, as deploying new versions of the code will cause this file to be replaced. Alternatively, the settings can be placed in a file called emerse.properties in the home directory of the user that Tomcat runs as.

Version number

The EMERSE version number is distributed as part of the WAR file that gets deployed to the server. This is not really a configurable option, but it is being mentioned here for the sake of completeness. The number is set in the Project Object Model (POM) files that can be found in the META-INF directory within the Tomcat webapps folder.


Profiles

EMERSE uses Spring profiles to enable optional services within the application. These profiles are "activated" by supplying a parameter to the EMERSE JVM at startup. Currently this is only useful for activating the LDAP-related Spring Security features. If the profile "ldap" is enabled, EMERSE will use LDAP for user authentication. A profile is activated by adding a JVM system property at startup. In the example below, the Tomcat startup file is modified to enable the LDAP service via the system property. You should not have to change or swap XML files to enable LDAP; the switch between the XML files happens at runtime when the Spring profile is activated.

Additional information can be found in the LDAP section below.

export CATALINA_OPTS="-Dspring.profiles.active=ldap"

Application Settings

The following settings can be modified in the project.properties file, or emerse.properties as discussed previously:



The username for the database account for the main EMERSE database.


The password for the database account. This can be encrypted with Jasypt if desired.


The JDBC URL to connect to the EMERSE database.


Maximum number of available connections to the database. Production systems with moderate usage should set this to 10-20 connections. Our implementation at Michigan Medicine would be considered moderate. We estimate that 20 connections is probably reasonable for about 20 concurrently logged-in users, with about 10 users searching at the same time. More is generally better, but a centrally managed DB may have a limit on the number of connections permitted.



The path on the local file system to the Solr patient indexes as well as the document index. The patient and patient-slave Solr directories/indexes would reside inside of this path, as would the actual index containing the documents. Note that this path name is to the parent directory holding all of the indexes, not to a specific index itself. The name of the index containing the documents should be defined in the SOLR_INDEX database table, within the ID column. In general it should just be documents and should not need to be changed. The patient and patient-slave indexes are used internally and should never be renamed. The application identifies the document index by looking up the lucene.indexPath defined below and the Solr index name defined in the database table.




The URL the application will use to access the Solr instance. Mainly used by the All patient search feature.


The path that is appended to the solr serviceURL to find the collection of patient documents.


The path appended to the solr URL where the application will find the patient index. By default this is patient-slave, a replicated copy of the patient index. This is so that the slave copy can be updated at any time. EMERSE has a background task that automatically updates these from the patient table, so no real configuration needs to be done. These two Solr cores (patient and patient-slave) were created to support the All Patient search feature in order to rapidly summarize the demographics of a search result. The EMERSE code will automatically re-index these two Solr indexes from the source (emerse.patient table) daily, but it is also possible to force re-indexing from the patient tables via http calls to the application server (see the Troubleshooting Guide for details on how to do this).

It is worth noting that the Solr patient index is replicated within EMERSE, with one serving as a backup for the other if the main one ever became corrupted. The larger, production Solr document index is not replicated in this way, mainly because it is so large. In other words, this type of 'slave' index is good practice, but may not be practical for the larger indexes.

The path appended to the solr URL to the source patient collection for updating.


The username used when connecting to Solr, if Solr is configured with basic auth.


The password used when connecting to Solr, if Solr is configured with basic auth.


Lucene is unable to recognize newly added documents until the indexes are closed and then re-opened ('refreshed'). This property lets you set a cron schedule for when this 'refresh' should occur. The example below re-opens the indexes every morning at 6:30 AM, which should be after new documents have been added to the index. In general we presume that documents will be added to the index once per day, usually overnight when system usage is low. If this property is not defined it defaults to 6:30 AM each day. Set it explicitly if new documents are added more frequently and the index therefore needs to be refreshed more often. (A refresh can also be forced by hitting a Solr endpoint. Details can be found in the Troubleshooting Guide.)

task.refreshIndexes.cron= 00 30 6 * * ?


In the default ActiveMQ configuration, the message broker does not require authentication and does not use SSL. In this case, a username and password are not required. If they are provided in the EMERSE configuration, they will have no effect unless authentication is enabled in ActiveMQ. Please see the ActiveMQ documentation for how to set up authentication. While securing the connections with SSL is good practice, EMERSE does not transmit any protected health information (PHI) via ActiveMQ. For details, see the Active MQ section under 'Security Hardening' later in this document.


The username used to access the ActiveMQ broker.


The password used to access the ActiveMQ broker.


The URL the application will use to access the ActiveMQ broker.


With SSL configured:


Name of the queue that will be used by EMERSE application to process search results. The queue names can really be anything, but they need to be unique. ActiveMQ will create them on the fly if they do not exist.


Name of the queue that receives search results


Patient Lists and MRN validation

Various options are available for validating user-entered patient medical record numbers (MRNs) that are stored in the Patient database table. Note that String comparisons are performed, so it is important to make sure that any formatting or cleaning of the user-entered MRNs results in exact matches with the format of the MRNs in the table.


The number of patients that can be added to a patient list. We set it to 100,000 by default.


The size limit in MB of incoming CSV uploads. We currently use 10 MB.


The maximum number of errors that are shown when a user uploads or inserts new patients into a patient list.


Controls whether duplicate medical record numbers are removed from a patient list. A true value will cause EMERSE to remove duplicates before saving to a patient list. In general, it is a good idea to remove duplicates, so keeping it true is ideal.


If set to true, invalid MRNs will be reported. If set to false, they will be silently removed. There is a small performance improvement when they do not need to be reported back to the user, but in general it is good to let users know when invalid MRNs have been removed.


A format string that can be optionally provided for the patient’s MRN. This uses a Java String format syntax. (Note that currently MRNs can’t be longer than 20 characters).

For MRNs 9 digits long, one can use:


(Requires use of MRN format) Optionally pad the MRN with leading zeroes. If this is set to true the numbers will be padded with leading zeroes using the patientList.MRNFormat described above.


Remove matches of the regular expression when MRNs are uploaded/entered by users. This is run after whitespace is removed from the MRN. (For instance, if the regular expression is ^0+ then 000 0045 67 would become 4567.) This setting is important when validating MRNs against the Patient table. Example values:

  • ^0+|[-] - remove leading zeros and dashes

  • [-#] - remove dashes and pound signs

  • An empty value keeps the exact value (spaces would still be removed)
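The cleaning order described above (whitespace removal first, then removal of every match of the regular expression) can be sketched in Python as follows. This is only an illustration: EMERSE applies the pattern with Java's regular expression engine, and the function name clean_mrn is hypothetical. The patterns used are the example values listed above.

```python
import re


def clean_mrn(raw_mrn: str, removal_pattern: str = "") -> str:
    """Mimic the described cleaning order for a user-entered MRN.

    Hypothetical helper for illustration; not part of EMERSE.
    """
    # Step 1: whitespace is always removed, even when no pattern is set.
    mrn = re.sub(r"\s+", "", raw_mrn)
    # Step 2: every match of the configured pattern is removed.
    if removal_pattern:
        mrn = re.sub(removal_pattern, "", mrn)
    return mrn


print(clean_mrn("000 0045 67", r"^0+"))    # 4567
print(clean_mrn("00-12-345", r"^0+|[-]"))  # 12345
print(clean_mrn("#12-345", r"[-#]"))       # 12345
print(clean_mrn(" 0012 345 "))             # 0012345
```

Whatever pattern is chosen, the cleaned value has to match the MRN format stored in the Patient table exactly, since the comparison is a plain string comparison.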



LDAP

The following settings that configure EMERSE to use LDAP for authentication only work when the runtime profile is set to include "ldap". See the Profiles section to add this profile to the running EMERSE instance.

Typically, to authenticate to LDAP, you need the DN (distinguished name) of the user you want to authenticate as, and the password of that user. Since the DN of a user may not contain their username (as entered at the login screen), EMERSE authenticates to LDAP as a fixed "service user" as specified by ldap.userDn and ldap.password. This should give EMERSE the permission to then run a search for the DN of the user trying to log in. The search run is the one specified in the ldap.search property, where every instance of the text {0} in that search is replaced with the username entered on the login screen. The user record found from that search should contain a dn: entry, which is then used to authenticate the user against LDAP with the password provided on the login screen.
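The placeholder substitution in that flow can be pictured with a short Python sketch. The filter (uid={0}) is an assumed example value for ldap.search, not a value taken from an EMERSE installation, and build_user_search is a hypothetical helper. The escaping step follows the RFC 4515 rules for LDAP filter values and is included as a precaution, not as a description of what EMERSE does internally.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    # Escaping one character at a time avoids double-escaping backslashes.
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_search(search_template: str, username: str) -> str:
    """Replace every occurrence of {0} with the escaped login username."""
    return search_template.replace("{0}", escape_filter_value(username))


# Assumed example filter; the real value comes from the ldap.search property.
template = "(uid={0})"
print(build_user_search(template, "jdoe"))   # (uid=jdoe)
print(build_user_search(template, "j*doe"))  # (uid=j\2adoe)
```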

LDAP is only used for authentication. Authorization (permissions) is determined by the user account in the EMERSE database that matches the username entered at the login screen. This means that users who are not in the EMERSE database will not have access. You must therefore add users via the administration application; EMERSE does not create accounts from information stored in LDAP or grant permissions based on LDAP groups.


The distinguished name of the service account that EMERSE will use to conduct the user search


Password of the service account


Path of the subtree to search for the user


The search to find the user based on the username typed at the login screen. Every occurrence of the text {0} will be replaced with the username typed at login.




If set to true, Quick Buttons with standard reasons are displayed to the user for selection


If set to true, users can enter a free text description describing their purpose of using EMERSE


If set to true, the Attestation screen will display in the table prior free text attestation reasons used by the user.

Batch Updating Begin/End Dates

Our experience at Michigan Medicine has shown that legacy documents coming from older systems may sometimes have invalid document dates. This led to unusual dates (e.g., “01/01/1900”) being displayed in the section of EMERSE that shows the overall date range of included documents when no date limitation was placed on the search criteria.

To circumvent this potential problem EMERSE provides two options for controlling the dates displayed to users. In general, background tasks that update the Lucene indexes would also update the date ranges for documents when all dates are selected (that is, when no date range is entered into the date range boxes in the user interface). This is so that as the index updates every night a new ‘end date’ can be shown for the date range of the documents.

This auto-update setting can be overridden for the start and stop dates, independently, using the properties described below. Changing this setting can, for example, allow one to have a more sensible document start date that more closely matches when the documents were being collected (without having to actually change the dates of all of the incorrect documents).

Note that changing these dates only affects the dates displayed in the date range section at the top of the screen. The actual documents will still show their original dates, and the searches will still take place based on the actual dates of the documents even if they are incorrect. Thus, if actual dates are entered by users into the start/stop date boxes, those dates will be used. If no dates are entered by users (thus, searching ‘All dates’) then the system will search across all of the documents regardless of the override date shown in the UI and regardless of the document dates in the system.


If set to true, the minimum document date is refreshed from Solr every night and stored in the solr_index table.


If set to true, the maximum document date is refreshed from Solr every night and stored in the solr_index table.

If one or both of these properties is set to false, then the date entered in the solr_index table is what will be used for display purposes. For more information on this table see the section on the solr_index in the Data Guide.


Number of fragments/text snippets to display for preview when using All Patient Search


The All Patient Search displays a chart based on patient’s age using intervals. This setting specifies the interval to use when displaying the chart. In general there should be no reason to change the default setting.




Size of the cache to use for efficient searches. Larger numbers are better, provided that enough memory is available.


Each query term can be a query, so search bundles with a lot of terms need a large count for this to work efficiently. We have found 1024 to be reasonable, so it likely does not need to be changed.



There are several components configurable within the EMERSE menu, which is available to all users in the upper-right portion of the window.

Contact Information

Users may want to contact a local administrator about issues or feedback about EMERSE. This can be accessed by users in the upper right menu through either the About option or the Feedback option. Both of these menu options have some hard-coded text followed by a customizable URL that can be defined using the two properties listed below. The About menu item contains text beginning with "Please direct feedback and issues to…" and the Feedback menu item contains text beginning with "Please send any comments or suggestions you may have about EMERSE to…". The remaining text is defined by the two properties:


This is the URL, or the mailto URL that will direct the user to the correct resource.


This is the text that will be displayed on the screen for the URL.

contact.text=EMERSE help desk

The resulting link would then be constructed from the two properties above and look something like:

<a href="https://link.to.help/server">EMERSE help desk</a>

This is the link that contains the user guide. By default (if nothing is defined) it will link to the main user guide on the project-emerse.org website. If you have your own user guide you can link to that instead by replacing the URL.



A user’s session is configured to time out after a period of inactivity. If the app is idle and does not detect a mouse click, mouse move, mouse scroll or keypress for the configured timeout period, the application logs the user out of their session and presents the login page. The following properties can be added to the project.properties file to override the defaults.

This timeout feature does not apply to the Attestation screen, because at this point no Protected Health Information (PHI) would be displayed. Nevertheless, EMERSE would still timeout based on the server timeout settings even though the countdown window for a forced logout would not be shown to the user.

Number of seconds the application may remain idle before the timeout is triggered. The default value is 3600 if this property has not been added to the properties file.


Number of seconds to show the timeout warning window. The default value is 30 if this property has not been added to the properties file.


Overall Patient Count

EMERSE displays the total number of patients in the system with respect to conducting an All Patient Search across all of the patients. This count is updated using the Spring Scheduler within the app itself, and should auto-update about every 30 minutes. The overall patient count is not configurable since it is derived from the data loaded into the system. Specifically, this count is based on the distinct number of MRNs that are associated with all of the documents in the Solr index. It is not based on the total number of MRNs in the database table, Patient. Thus, if a patient is in the Patient table but does not have an associated document, that patient will not be counted towards the total number of patients.
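The difference between the two ways of counting can be illustrated with a few lines of Python. The MRN values are invented sample data and the function name is hypothetical; the real count is computed from the Solr documents index, not in application code like this.

```python
# Hypothetical sample data: one MRN per indexed document, plus the Patient table.
document_mrns = ["1001", "1001", "1002", "1004"]
patient_table_mrns = {"1001", "1002", "1003", "1004"}


def overall_patient_count(mrns_of_documents) -> int:
    """Count distinct MRNs that have at least one indexed document."""
    return len(set(mrns_of_documents))


count = overall_patient_count(document_mrns)
patients_without_documents = patient_table_mrns - set(document_mrns)
print(count)                       # 3
print(patients_without_documents)  # {'1003'}
```

Patient 1003 exists in the Patient table but has no document, so it does not contribute to the displayed total.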

The total patient count displayed in the user interface is stored in the PATIENT_COUNT column of the SOLR_INDEX table in the database. This count is refreshed periodically by a background process that retrieves the unique number of MRNs from the Solr documents index. Additional details about configuring the schedule for this process can be found within this guide in the section called 'Solr Patient Index Replication Interval'. However, the overall patient count can also be forced to refresh immediately using the 'System Synchronization' feature found within the admin application.


Various components of the EMERSE system can be tweaked to enhance the user experience and yield optimum performance.

Tomcat (EMERSE application)

To reduce the frequency of garbage collection, use the -Xmx and -Xms switches to control how the JVM sizes its heap memory. We recommend setting up Tomcat to use between 1 and 2 GB of memory. One way to do this is to add the following snippet to the Tomcat startup file, startup.sh.

export JAVA_OPTS="-Xmx2048m -Xms1024m"

Solr Index Optimization

Over time we have found that many document changes occur as documents get updated or deleted (a deletion might be required if, for example, a document was found to be created under the wrong patient). It is possible to clear out these deleted/inactivated documents and potentially improve the performance of Solr by optimizing the index. This can be invoked manually using the Optimize button in the Solr Administration User Interface. Optimizing also reduces the number of index segments, which can also improve system performance. During the optimization process the original index is left in place while the new, optimized index is being created. This means that you will need free storage of about 2-3 times the original index’s size for optimization to proceed. Additionally, we have found that an optimization can take 10 or more hours and uses substantial computational resources, meaning that system performance might suffer for users. Thus, it might be best to run this on weekends during times of low use. At Michigan Medicine we optimize infrequently: we copy the indexes to a different server with more space, optimize there, and then copy the indexes back after optimization is complete. We also need to ensure that no new documents are added to the original index during this time.
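Because optimization needs roughly 2-3 times the index size in free space, it can help to check the disk before starting. The sketch below uses only the Python standard library. The index path is a placeholder, the helper names are hypothetical, and the factor of 3 is simply the upper end of the estimate above.

```python
import os
import shutil


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def has_room_to_optimize(index_size: int, free_bytes: int, factor: float = 3.0) -> bool:
    """Return True when free space is at least factor times the index size."""
    return free_bytes >= index_size * factor


if __name__ == "__main__":
    # Placeholder path; substitute the directory that holds your documents index.
    index_dir = "/PATH_TO_SOLR_HOME/documents/data/index"
    if os.path.isdir(index_dir):
        size = directory_size_bytes(index_dir)
        free = shutil.disk_usage(index_dir).free
        print("OK to optimize" if has_room_to_optimize(size, free) else "Not enough free space")
    else:
        print("Index directory not found:", index_dir)
```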

Solr Memory

Solr’s memory can be configured by a simple flag at startup. When working with millions of documents, we had some issues when Solr was using its default settings of 512 megabytes. Currently the EMERSE production instance at Michigan Medicine is configured to use 3 gigabytes. In the example below, 3 gigabytes are being allocated.

./solr start -m 3g

You may need to pass other flags such as -s when starting Solr, as described in the Installation Guide.

Solr Patient Index Replication Interval

EMERSE has two indexes used to keep track of patients: patient and patient-slave (the patient-slave index is the one actually used by EMERSE while it is running).

The patient index is created by copying the patients from the database table, Patient over to the corresponding Solr index, patient. This is done automatically by the system once per day as a scheduled event. The schedule of the jobs can be found in the properties file. The default time set for the EMERSE distribution is 7:30 AM. This was done with the assumption that the patients in the Database table would be updated once every night during non-peak hours. If you are fine with that time, no changes need to be made.

The properties file uses a cron-like syntax to specify the schedule, which consists of six fields separated by whitespace. The first field is the seconds, then it’s minutes, hours, day-of-the-month, the month number, and then day-of-the-week. A field can have a number in it, appropriate for the field, a star meaning every value of the field, or a question mark, meaning no restriction. A more formal description of the syntax can be found in the Spring Documentation, specifically the component regarding the Class CronSequenceGenerator.
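To make the field order concrete, the following Python sketch splits a six-field expression into named parts. It is an illustration only: the function name is hypothetical, it performs no validation of the individual field values, and Spring does the real parsing.

```python
CRON_FIELDS = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each whitespace-separated cron field to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# The default patient index schedule from this guide: 7:30 AM every day.
for name, value in describe_cron("00 30 7 * * ?").items():
    print(f"{name:>12}: {value}")
```

A five-field expression in the style of the Unix crontab (without the leading seconds field) is rejected by this sketch, which is a common source of confusion when writing these schedules.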

This runs a job that finds the minimum and maximum dates of the document index, along with the number of distinct MRNs in the document index, which are used for updating the date ranges displayed in EMERSE as well as the overall patient count shown when conducting an All Patient search. The default time is every hour, around minute 42. For instance, 1:42, 2:42, 3:42, etc.

task.updateIndexStatsViaSolr.cron=00 42 * * * ?

The schedule to update the Solr patient index from the patient table in the database. The default time is 7:30 AM.

task.updatePatientIndex.cron=00 30 7 * * ?

The schedule to optimize the Solr patient index. Optimizing an index puts all the data into a single segment for operational efficiency. The default time is 7:45 AM.

task.updatePatientIndex.cron=00 45 7 * * ?

The schedule to re-open the Solr documents index. Changes to the documents index (such as new documents indexed) can’t be seen until the index is re-opened. Note that all-patient search is done through Solr, and Solr re-opens the indexes much more often than EMERSE does, which means that if you index new documents during the day, all-patient search may match documents that won’t be found through patient-list-based searches. The default time is 6:30 AM.

task.refreshIndexes.cron=00 30 6 * * ?

If you change the scheduled time of this process, you will have to restart Tomcat for the changes to take effect.

The patient-slave Solr index gets replicated from the master patient Solr index, and it is possible to change the frequency of this replication. We currently have it set to do this every minute, and we do not see any reason that it should need to be changed. However, if you do wish to change it, this can be found in the solrconfig.xml file within the patient-slave core. The parameter to change is pollInterval. For example:

<requestHandler name="/replication" class="solr.ReplicationHandler" startup="lazy">
    <lst name="slave">
      <str name="masterUrl">http://localhost:8983/solr/patient/replication</str>
      <str name="pollInterval">00:01:00</str>
    </lst>
</requestHandler>

It is possible to force these copying and indexing events to occur on demand, which may be useful for troubleshooting or when testing with an initial setup. Details about how to do this are described in the Administrator Guide.

Server memory optimization

In a deployment of EMERSE where the main application and Solr are running on the same server, each process should be given adequate memory. At some point, however, allocating memory to these processes may actually reduce performance. Most modern operating systems will cache files in all available memory. As memory is allocated to these processes, less memory will be available to the operating system to cache files.

EMERSE Search Concurrency

EMERSE enqueues user searches as they use the application, and a pool of workers concurrently pulls from that queue and executes searches. This parallel execution of searches allows fast searches to complete while long ones continue, and helps prevent a single user’s long-running query from slowing down the entire system for all users. Workers from the pool take a batch of requests from the queue at once, and process that batch sequentially. However, if the requests they are running are slow, they may take a smaller batch, but never smaller than the minimum size. There are a few parameters in the project.properties file that control this behavior, although we would not expect them to need changing unless specific performance issues have to be addressed.


This controls the size of the worker pool which concurrently pulls from the queue. Default 10.


The smallest batch of requests a worker may take at once, no matter how slowly searches are running. Default 1.


The largest batch of requests a worker may take at once. This is the size of the first batch for a worker. After this, it may reduce its batch size if the searches are running slowly, and may go back up to this limit if they are running fast again. Default 7.

The dynamic adjustment of batch sizes is based on the following rules, which are not configurable at this time. Each thread processes a search for a given user. A search is done for each patient (not patient/source). When the last search for the user has taken:

  • less than 1.5 seconds, the maximum batch size (7) is used for the next search

  • at least 1.5 seconds but less than 5 seconds, half the maximum batch size (rounded down, so 3 in this case) is used for the next search

  • 5 seconds or more, the minimum batch size (1) is used for the next search
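The rules above amount to a small step function, sketched here in Python. The function name is hypothetical and the defaults of 7 and 1 are the documented default maximum and minimum batch sizes; this is a restatement of the rules, not the EMERSE source code.

```python
def next_batch_size(last_search_seconds: float, max_batch: int = 7, min_batch: int = 1) -> int:
    """Pick the next batch size from the duration of the user's last search."""
    if last_search_seconds < 1.5:
        return max_batch
    if last_search_seconds < 5:
        # Half of the maximum, rounded down, but never below the minimum.
        return max(min_batch, max_batch // 2)
    return min_batch


for seconds in (0.4, 1.5, 4.9, 5.0):
    print(seconds, "->", next_batch_size(seconds))
```

With the defaults this prints batch sizes of 7, 3, 3 and 1 for the four sample durations.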

Security Hardening


Solr also provides a REST API that can be accessed with tools such as curl. By default this is not locked down and should be secured with basic authentication if the Solr ports are not firewalled to external communication.

Solr can be set up to use SSL/TLS and to require authentication with basic auth. Both of these features are supported by Solr Cloud, but EMERSE does not yet support Solr Cloud. However, the Jetty servlet engine embedded in standalone Solr can be modified to require authentication and use SSL.

Much of the Solr documentation pertains to Solr Cloud, which is NOT currently supported by EMERSE. Look for references to a single node configuration when consulting Solr documentation.

Solr SSL Setup

Changes are required in solr.in.sh found in bin directory under the Solr_INSTALLATION directory. Essentially uncomment the lines below and configure them with values appropriate to a java keystore containing the certificate for the server.


See the "Basic SSL Setup" section at the following link for more information.

Basic Auth

If Basic Auth is desired, there are several ways in which Basic Auth can be configured. Solr provides its own approach, but another approach uses the Jetty servlet engine bundled with Solr.

The first step is to modify the jetty.xml file inside the SOLR_INSTALL_DIR/server/etc folder, adding the following snippet inside the <Configure></Configure> tags.

  <Call name="addBean">
    <Arg>
      <New class="org.eclipse.jetty.security.HashLoginService">
        <Set name="name">Test Realm</Set>
        <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      </New>
    </Arg>
  </Call>

After adding in the xml snippet, add a user/password combination to the file realm.properties located in SOLR_INSTALL_DIR/server/etc. If the file doesn’t exist just create a new file and add the following line to it.

solradmin:password, admin-role

In the above, the username is "solradmin" and the password is "password".

Also, the following needs to be added to the webdefault.xml file:

      <web-resource-name>Solr authenticated application</web-resource-name>

    <realm-name>Test Realm</realm-name>

For more information on configuring Jetty with Basic Authentication, see https://www.eclipse.org/jetty/documentation/9.3.0.v20150612/configuring-security-authentication.html
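Once Basic Auth is enabled, clients have to send credentials with every request. As a rough check, the Python sketch below builds such a request using only the standard library. The URL (a match-all query against the patient core on localhost:8983) and the helper name are assumptions for illustration, and the credentials are the sample values from realm.properties above.

```python
import base64
import urllib.request


def basic_auth_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Assumed URL: a match-all query returning zero rows from the patient core.
    url = "http://localhost:8983/solr/patient/select?q=*:*&rows=0"
    request = basic_auth_request(url, "solradmin", "password")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            print("HTTP status:", response.status)
    except OSError as error:
        # Covers connection failures as well as HTTP errors such as 401.
        print("Request failed:", error)
```

Sending the same request without the Authorization header should be rejected with a 401 response if the constraint is configured correctly. Note that Basic Auth only encodes the credentials, so it should be combined with SSL.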

Active MQ

The ActiveMQ broker can be set up to use SSL/TLS and also to require authentication. The ActiveMQ webapp can also be configured to use SSL. EMERSE allows setting the user/password via the configuration properties described above.

Configuration password

Passwords specified in the project.properties files can themselves be encrypted using Jasypt.

Exported Excel files

EMERSE provides a function for exporting password-protected Excel files containing patient lists and associated comments/tags. These files are generated on demand by the user and stored on the EMERSE server inside the exploded EMERSE war file, with a unique download link provided to the user. Because there is no straightforward way to know when a file has been successfully downloaded, the Excel file persists on the server. We currently have a shell script on the server that executes every 30 minutes and deletes files older than 60 minutes.

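# Move to the downloads directory; exit quietly if it does not exist.
cd /PATH_TO_TOMCAT_INSTALL/webapps/emerse/downloads \
        2> /dev/null || exit 0
# Delete exported Excel files last modified more than 60 minutes ago.
find . -name "*.xlsx*" -mmin +60 -exec rm {} \;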

Admin Application

Most details related to the Admin application and Admin features can be found in the Administrator Guide. Below is a high-level summary of the Admin features.

EMERSE users that have an ADMIN role have access to the admin application located at:

The application has two main features: user management related to authorization, and maintenance of synonyms.

Add/Remove users

The Add/Remove users tab can be used to manage users of the EMERSE application. When you add new users, note that there is an expanded set of roles that can be applied to a user. For general users, select the “User with full privs” option and leave the others unchecked. The password field is required but will be ignored if security is set up to use LDAP. Although there is now a role for a “limited access” type of user, we aren’t doing much with it yet locally.

Roles and Privileges

Roles and Privileges for EMERSE users can be customized. Details about how this is done can be found in the Administrator Guide.


The Synonyms tab allows the admin user to update synonyms in the EMERSE application by uploading them from a CSV file.

Synonyms upload currently deletes all the existing entries in the synonyms table and then loads entries from the CSV file. Thus, this option replaces synonyms as opposed to appending new ones to the existing list.


The admin application has an option to "synchronize" various data between the database and Solr. While this happens automatically overnight it can be useful to force this more frequently, especially during initial system setup and testing. Details can be found in the Administrator Guide.

Supporting Multiple Environments

It may be ideal to support multiple EMERSE environments such as test, dev, prod, etc. We have found that sometimes it can be difficult for users who are testing EMERSE to know what specific system they are using. To make it easier to distinguish between multiple instances of EMERSE, the system has the ability to display a small, but obvious, box in the upper right part of the screen to inform users. Having this information in a database table is useful because it can remain stable even as the application itself gets upgraded.

This information is defined in a table with a single row called ENVIRONMENT_INFO:

Column Name



This should be set to 0 and not changed.


This is the environment that is active (dev, test, prod, etc). This is a free text option so can be anything (e.g., "Development", "Testing", "Production", etc.)


This is a flag to determine if the text for environment should be displayed on the screen or not. 1=display, 0=do not display. In general you would not display this to users in the Production system.

The version number of the application (displayed when selecting the About menu) is distributed with the WAR file itself and is not contained in the database.