Overview

This guide covers topics relevant to EMERSE system administrators.

Admin page

Basic system administrator functions can be found on the admin page (which is actually a separate application). The administrator page provides mechanisms for adding users, making them active/inactive, managing system-wide synonyms, managing roles/privileges, and more. The Admin page page also contains a way to view some of the EMERSE log files, which can be useful for troubleshooting, especially during initial installation. More details about this feature can be found in the Troubleshooting Guide.

The admin page can be accessed after logging into EMERSE and completing the attestation. Then, in the main EMERSE application it can be accessed via the EMERSE menu in the upper right portion of the screen. The Admin option will only appear in the drop down menu if the logged in user has admin privileges.

The admin page can also be accessed directly at the URL below as long as the user has admin privileges.

http://host:port/emerse/admin2/

Adding Users

Users should generally be added to the system via the interface on the Admin page.

It is also possbile to add users by adding them to the database tables (via a script, for example), but we don’t encourage that since multiple tables need to be updated correctly. For those who want to add users via the tables, the relevant tables are PERSON, PARTY, LOGIN_ACCOUNT, and the associated roles/privilege tables. Users are rows in the PERSON table which must have matching rows in the PARTY and LOGIN_ACCOUNT tables (on PARTY.ID = `PERSON.ID and on PERSON.LOGIN_ACCOUNT_ID = LOGIN_ACCOUNT_ID respectively). The latter table is also how privileges are associated to the users, since it has foreign key in the USER_ROLES table.

If you’re going to programmatically insert rows into those tables, you should use the right sequences for those tables so the app doesn’t try to use the same id you inserted into the table. They are generally the table name followed by _SEQ, such as LOGIN_ACCOUNT_SEQ or PARTY_SEQ. PERSON doesn’t have its own sequence since it has the same ids as the PARTY table so it uses the PARTY_SEQ.

User Announcements

EMERSE provides for the capability to have an announcement displayed to users after logging in and completing the Attestation page. This may be helpful if there is a need to communicate something to all users, such as a planned system shutdown, or an announcement about an updgrade. More than one announcement can be shown, and each announcement will be shown to every user until either the user clicks on the Don’t Show Again button, the administrator clears the announcement, or the announcement "expires". If more than one announcement is entered for users, each will be shown in succession (not all together on the same announcement window).

This feature is now easily accessible in the Admin app under the Alerts section. You can add the text of the alert, as well as the start and stop dates for when the alert should be shown. Basic HTML (line breaks, links, etc) can be included and should be interpreted correctly when displayed to the user.

Changing a user ID

There may be rare cases in which a user’s User ID has to be changed (for example, someone’s last name has changed leading to a change in the User ID). Such changes can be made from the Administrator page.

As an admin you can change any User ID other than your own when you are logged in under that specific User ID.

Internal vs External Authorization

Users can be authenticated via two mechanisms:

  1. An internal encrypted password stored in the LOGIN_ACCOUNT table

  2. LDAP authentication external to EMERSE

This can be set, and later changed, on the Administrator page by choosing either Active or Inactive for the switch that allows for "External Authentication (Ldap, etc)".

This result is stored in the EXTERNAL_AUTH column (1, 0) in the LOGIN_ACCOUNT table, and it tells the application what authentication to prefer.

When EMERSE is configured to use LDAP as an authentication source (see the Configuration Guide for how to do this), it is still possible that a user can login if their password matches the encrypted credentials in the password field of the LOGIN_ACCOUNT table. If the EXTERNAL_AUTH flag is set to true (EXTERNAL_AUTH=1), however, the application won’t try to match the user’s credentials against the LOGIN_ACCOUNT table. In other words, when this flag is set to 1, the password field has no impact for the user, and only the external authentication mechanism is used.

Roles and Privileges

The visibility/accessibility of some user interface elements or features within EMERSE are controlled directly by a user’s set of privileges. Privileges are determined mainly through the users’ roles that are set up in the USER_ROLES table. Each role is mapped system-wide to a set of privileges. The ROLE_PRIVILEGES table contains a full list of possible actions/components/features that are assigned to a specific role in the application. These roles and privileges can now be managed through the administrator app, without the need to modify the underlying database tables (see the section on "Role page", below).

Tables Associated with Roles and Privileges
Figure 1. Tables Associated with Roles and Privileges
Table 1. Details about the internal EMERSE tables that handle roles and privileges
Table Name Purpose/Description

PRIVILEGE

System-defined privileges that give access to certain features or displays of information within EMERSE (e.g., ability to export patient lists). This table should never change.

USER_ROLE_DESC

Lists the roles that exist in the system along with a description of what that role is, used for display on the admin page when selecting the roles a new user will have

ROLE_PRIVILEGE

Lists the privileges granted to each role.

PRIVILEGE_EXCEPTIONS

Lists privileges granted to or revoked from individual users. These "exceptions" are consulted first before a user’s roles are consulted.

USER_ROLES

Lists the roles of each user in the system.

Currently EMERSE has only one role where this is implemented, the “DEFAULT_USER” role. The ROLE_PRIVILEGE table contains a mapping of roles to privileges. For example, the "DEFAULT_USER" role has privileges:

  • ACCESS_ADMIN

  • ACCESS_API

  • ACCESS_EMERSE

  • ATTEST_COMMON

  • ATTEST_FREE_TEXT

  • ATTEST_PRIOR_REASON

  • ATTEST_RESEARCH_STUDY

  • EXPORT_PT_LIST

  • NEW_PT_LIST

  • SAVE_ALL_PT_LIST

  • SEARCH_ALL_PT

  • SEARCH_NETWORK

  • VIEW_ALL_PT_CHARTS

  • VIEW_ALL_PT_SNIPPETS

  • VIEW_ALL_PT_TRENDS

An example of what can be done using this primary "DEFAULT_USER" role is that one could disable all users’ access (assuming no other roles have the same privilege) to exporting patient lists by removing EXPORT_PT_LIST privilege from the ROLE_PRIVILEGE table.

Privileges can also be granted to or revoked from a specific user by adding rows to the PRIVILEGE_EXCEPTIONS table. The ADD_PRIVILEGE column determines whether the privilege will be granted to the user (Y), or revoked from the user (N), essentially acting as a blacklist and whitelist. This table is consulted first, and only if there is no entry for the specific user and privilege will the roles be considered. So, if a privilege is revoked here, the user will not have it, even if they have a role that grants it to them.

Table 2. Example exception in the PRIVILEGE_EXCEPTIONS table, showing the revocation of the ability to export a patient list for a single user.

LOGIN_ACCOUNT_ID

PRIVILEGE_NAME

ADD_PRIVILEGE

CREATION_DATE

100

EXPORT_PT_LIST

N

sysdate

Table 3. Currently supported privileges in the PRIVILEGE table

NAME

DESCRIPTION

ACCESS_ADMIN

Allows the user to access the admininstration application of EMERSE.

ACCESS_API

Allows the user to access the server API, /springmvc/ext/**.

ACCESS_EMERSE

Allows the user to access the EMERSE web application itself. If access is not granted, they could still access the admin app or the API (based on other permissions).

ATTEST_COMMON

Allows the user to attest with the common attestation reasons defined in the table COMMON_REASON

ATTEST_FREE_TEXT

Allows the user to write their own reason for using EMERSE in the attestation page

ATTEST_PRIOR_REASON

Allows the user to see and select a prior free-text reason they have attested to.

ATTEST_RESEARCH_STUDY

Allows the user to attest to using EMERSE for a study they are a member of.

EXPORT_PT_LIST

Allows the user to export a patient list to a password protected Excel file

NEW_PT_LIST

Allows user to create a saved patient list. This includes from find-patients search on an existing patient-list. The privilege to add patients to the list is defined by the EDIT_PT_LIST privilege.

SAVE_ALL_PT_LIST

Allows the user to move a set of patients found from running a search across all patients to a temporary patient list. Access to the temporary patient list would give the user access to the patient names, clinical notes, etc, and allows them to search the notes in more details.

SEARCH_ALL_PT

Allows users to do an all-patient search. This does not apply to find-patients search on a patient list.

SEARCH_NETWORK

Will enable a user to search the network when this feature is complete and released.

VIEW_ALL_PT_CHARTS

Allows the user to view the demographic charts of an all-patient search. This does not affect the demographics page on a patient list.

VIEW_ALL_PT_SNIPPETS

Allows the user to view snippets from an all-patient search. These snippets contain the search terms as well as surrounding text from the clinical notes. These snippets are also called Summaries.

VIEW_ALL_PT_TRENDS

Allows the user to view the trend chart of a all-patient search.

The default configuration for the EMERSE system is that the DEFAULT_USER has all privileges except access to the admin app and the server API. If these standard privileges are not desirable for the typical user of EMERSE, then one can simply remove the undesired privileges from the ROLE_PRIVILEGES table.

Alternately, you can create a new role by adding a row to the USER_ROLE_DESC table, and associate the wanted privileges with that role by adding rows to the ROLE_PRIVILEGES table. You can then assign users that role in the admin app. To do so, navigate to the users tab, find the user by name in the table, and click edit. This will bring you to a page that has a few checkboxes whose text matches the description of the role in the USER_ROLE_DESC table. Select the checkboxes for the roles you want the user to have. You can also add or remove roles to users in the database directly by changing the USER_ROLES table.

You must grant one of the privileges starting with ATTEST_ in order for a user to be able to make an attestation. Without an attestation, the user will not be able to use EMERSE.
Table 4. A few examples of what can be done using the privileges supported within EMERSE.

Granted privileges

Implication

ACCESS_EMERSE
NEW_PT_LIST
SEARCH_ALL_PT
VIEW_ALL_PT_SNIPPETS
SAVE_ALL_PT_LIST

User can identify a set of patients with an All Patient Search and can see the text Summaries (VIEW_ALL_PT_SNIPPETS). User can move the list to a Temporary Patient list to search them further (SAVE_ALL_PT_LIST). User can also save the list as a Saved Patient List for searching later or sharing with another user (NEW_PT_LIST). User cannot export the list to an Excel file (EXPORT_PT_LIST).

ACCESS_EMERSE
SEARCH_ALL_PT
VIEW_ALL_PT_SNIPPETS
SAVE_ALL_PT_LIST

User can identify a set of patients with an All Patient Search and can see the text Summaries (VIEW_ALL_PT_SNIPPETS). User can move the list to a Temporary Patient list to search them further (SAVE_ALL_PT_LIST). But because the user cannot create a new list (NEW_PT_LIST), the user cannot save this list for later. Once the user logs out, the list will be gone.

ACCESS_EMERSE
SEARCH_ALL_PT

User can find a set of patients with an All Patient Search, but will not be able to see the text Summaries (VIEW_ALL_PT_SNIPPETS), and will not be able to move or save the list to search it in more detail (SAVE_ALL_PT_LIST). Essentially the user will be able to get a count from that All Patient search screen. The user will not be able to create their own lists (NEW_PT_LIST), but can still search a list of patients that was shared to them by another user. The user will not be able to add any new patients to the list that was shared to them (EDIT_PT_LIST). This is essentially a 'locked down' role that will allow them to get counts and search a list that was shared to them, but nothing else. For limited access, this could be a good configuration since an administrator could create and share a list with such a user.

ACCESS_EMERSE

The user can only search patient lists they are given permission to. This is the most locked-down role.

ACCESS_EMERSE
NEW_PT_LIST

The user can only search patient lists they are given permission to, plus create lists from MRNs they get from sources outside EMERSE.

ACCESS_ADMIN

This configures an administer role. Administrators can add and edit users and roles, and assign roles to users. Users with a role with such permissions cannot access EMERSE itself (so they cannot do any kind of search with EMERSE, but being an administrator, they can give themselves that permission).

ACCESS_API

This configures a service role. Users in this role can access the server endpoints under /emerse/springmvc/ext/**. This would be used by programs that need to access EMERSE to do searches or other automated tasks supported by the endpoints in /emerse/springmvc/ext/**.

Role Page

Inside the admin app there is now a role page, which allows administrators to create and edit roles. This makes it much easier to set up roles for a new installation of EMERSE, or when a new use-case comes along. Privilege exceptions are not able to be configured in this page however—​those must still be added directly to the database.

In the role page, you see a table whose rows are the roles. The columns are the privileges. Since there are a lot of privileges, this table will probably need to be scrolled horizontally on most screens. A check in a cell indicates the role of that row is granted the privilege of the column. You can freely check and uncheck the checkboxes in the table to alter the permissions of the roles. These are saved automatically.

Other than the privileges assigned to a role, a role consists of two parts: a code, and a description. The description is shown in the page for editing a user, and the code is shown in the table of this page (since it is shorter). Currently, the name of a role cannot be changed, and roles cannot be deleted. However, the description of a role can be changed. To edit the dsecription of a role, click on the dots in the first column of a row, then the "Edit" item that pops up. In the future, we may implement deletion of roles, or editing the name of a role, so more options may appear in that pop-up menu.

New roles can be created by clicking the "Add Role" button.

You can still edit roles by directly changing the database tables (including deleting roles are changing the code), but this interface was added to make it easier to understand permissions associated with roles, and to make it easy to craft new roles.

Synonyms

EMERSE supports a type of query expansion we call Synonyms. These are system-suggested terms/phrases that help users conduct a more thorough search. Synonyms can include alternative phrasings, acronyms, abbreviations, misspellings, stemming, related concepts, and more. These system-wide term suggestions can be leveraged by users but can only be updated by an administrator. There is no need to make your own Synonyms dataset since we provide several that have been formatted specifically for EMERSE, including Enesmbl gene names and associated diseases, the Human Phenotype Ontology, and others. Details can be found on the EMERSE website. You can download these datasets from the project-emerse.org website and and then upload them into the EMESRSE application using the EMERSE Admin app and by following the directions below. The sections below also describe how you can make your own Synonyms file, which may be useful if you want to add localized suggestions, or add additional data sources.

In addition to the sytem-wide Synonyms users can make their own collection of terms using the Saved Terms feature (previously called a Term Bundle), and can incorporate system-suggested Synonyms into their Bundles. User-created terms can be made available to other users through the Saved Terms sharing feature but they will not automatically become a part of the system-suggested Synonyms unless they are added by an administrator.

To be as quick as possible when matching terms entered by users, all Synonyms are loaded into memory at the time the system starts up. This might cause a slight delay between starting the system and the availability of the datasets.

Preparing a synonyms file

To prepare a synonyms file for importing into EMERSE, you must structure the file as a three-column tab-separated values (TSV) file.

Column 1: A key value that links related concepts together. This can be anything but should be unique for each grouping of concepts.

Column 2: The term/phrase

Column 3: The Concept Type, of which there are 3. These 3 Concept Types are used for categorizing how the suggestions are displayed to users and also how the terms will be matched. They are:

Concept Type Concept Type Name Concept Type Description

1

Regular Synonym

This is a standard type of "synonym", although that term is used loosely since it can mean any type of meaningful relationship between the terms. No formal ontological relationships are defined. Rather, the co-occurence of terms with the same key value implies that a relationship of some type exists.

2

Related Term

This is for terms that are related to the synonym it is grouped with but usually more distantly related. It provides a means for "one-way" matching. For example, in a group of connected terms (grouped by the key value), if one of the terms is "amoxicillin" (Concept Type = 1) and the related term is "antibiotic" (Concept Type = 2), then searching for "amoxicillin" would bring up "antibiotic" as a Related Term, but searching for "antibiotic" would not bring up "amoxicillin" at all.

3

Spelling Alternative

This is meant for misspellings of a term/phrase.

The maximum length of each synonym entry is 255 characters, not including the Key Value or the Concept Type. This constraint is set by the database, so to go beyond that limit the database would need to be modified. We recommend not exceeding about 200 characters simply because we’ve noted that some Unicode characters can take up additional space in the database and can exceed the database character limit. Also note that because the files is a TSV, there should be no tabs in the terms/phrases themselves. While it should not be necessary to change the maximum length of a synonym entry or the tab delimiter used in the file, these can be changed through properties if desired. See: synonymList.valueLimit and synonymList.delimiter.

Synonyms file metadata

Additional metadata can be added to the file which will be displayed to the users. These also must follow the three column TSV format and should appear at the beginning of the file, using the following formatting

Column 1: The phrase: emerse_synonyms_metadata

Column 2: An attribute, which is case-insensitive.

Column 3: The value for the attribute.

Note that you can provide any attribute, but only specific ones are supported by EMERSE. Other attributes not supported will be stored in the database but not used.

Currently supported Attributes are as follows:

Attribute Description Example Required or Optional

name

The name of the Synonyms dataset to be shown to users. This is case-sensitive and needs to be unique. If a dataset with the same case-sensitive name is uploaded, it will replace the prior dataset.

Local health center synonyms

Required

description

A brief description of the dataset to help a user understand what kind of terms it contains.

Local acronyms/abbreviations used at our health center

Optional

url

A URL that points back to the original resource from where the terms were obtained. For the users, the name of the dataset will become an active link if a URL is provided.

https://link-to-website.org

Optional

last_updated

The date that the dataset was last updated. This is treated as a String, so any format is acceptable.

08/25/2011

Optional

The name attribute must appear as the first line in the dataset. The other attributes can appear in any order as long as they are not on the first line. Furthermore, the name cannot exceed 256 characters.

The beginning of an example Synonyms import file would look like:

emerse_synonyms_metadata	name	Local health center synonyms
emerse_synonyms_metadata	description	Local acronyms/abbreviations used at our health center
emerse_synonyms_metadata	url	https://link-to-website.org
emerse_synonyms_metadata	last_updated	08/25/2011
1001	endocervical type adenocarcinomas	1
1001	endocervical type adenocarcinoma	1
1001	endocervical adenocarcinomas	1
1001	endocervical adenocarcinoma	1
1001	adenocarcinoma, endocervical type	1
1001	adenocarcinoma	1
1021	adenomatoid	1
1021	adenomatous polyps	1
1021	adenomatous polyp	1
1021	adenomatous	1
1021	adenomata	1
1021	adenomas	1
1021	adenoma	1
1021	polyps	2
1021	adeomas	3
1021	adeoma	3
1021	ademoa	3
1021	ademoma	3
1021	ademona	3

In the example above, there are two groups of concepts (group 1001, and group 1021). Group 1021 has mix of all three concept types: seven regular Synonyms (Concept Type = 1), one Related Term (Concept Type = 2), and five Spelling Alternatives (Concept Type = 3).

When a user enters a term, EMERSE does the following to determine which Synonyms to display:

  1. The user-entered term is matched to all Regular synonyms (Concept Type = 1) in a case-insensitive manner.

  2. For all matches, the key values are obtained (a concept may belong to more than one grouping and thus may have more than one key value).

  3. All other terms related to the key value(s) are obtained

  4. Duplicates are removed

  5. The final set is displayed to the user, organized by the three Concept Types, as shown in the figure below

Synonyms suggestions display
Figure 2. Synonyms shown for the term 'adenomatous'. Note that the terms are organized into three categories, based on the concept type: Synonyms, Related Terms, and Spelling Alternatives.

This organizational scheme of three Concept Types allows for some terms to be suggested based on a user-entered term, but supports the ability to have a one-way match to reduce extraneous matching when desired. For example, if the terms entered is 'adenomatous', the term 'polyps' will be displayed as a Related Term (see figure above). However, since in that grouping (group 1021 in the above example) the term 'polyps' is not a Regular concept type (i.e., it is not Concept Type 1 but rather Concept Type 2) then it means that a user-entered search term of 'polyps' will not match to anything in that 1021 grouping. This can be seen in the figure, below, where none of the terms in group 1021 are displayed to the user when 'polyps' is entered as a search term.

Synonyms suggestions display for the term Polyps
Figure 3. Synonyms shown for the term 'polyps'. This shows that the 'adenomatous' term is not displayed to the user since 'polyps' in the 'adenomatous' concept groups is not a Regular synonym Concept Type but rather is a Related Term Concept Type.

Uploading a Synonyms File

A Synonyms file should be prepared as a 3 column TSV file as described above. To upload the file, login to the administrator page and choose the Synonyms tab. Then select Upload, and click on the Choose button to select the TSV file. After the file is selected press Upload.

Minimal feedback is provided while the file is loaded and verified. Common problems include a file size too large for the server to handle, an upload time that exceeds the server’s timeout maximum, or a string length that exceeds what the database allows (we recommend no terms > 200 characters). A description of a few problem that might be encountered, and possible solutions, can be found in the Troubleshooting Guide.

When uploading a Synonyms file, note that if the same name is used as an existing dataset (based on the included metadata; and a case-sensitive match), the newly uploaded dataset will replace the existing dataset.

Managing the Synonyms

Multiple Synonym datasets can be loaded into EMERSE via the Admin app. After loading, a dataset will be in a Deactivated state, meaning that it will not be available or even visible to users. Clicking the Activate button will do two things: (1) it will make the dataset available to users and (2) it will invoke a background process that counts how many times each term appears in the overall set of documents. The counts resulting from this background process provide an intelligent approach to displaying the synonyms to users because it allows users to rank the terms based on how often they appear, including providing the option to hide terms that never appear in the corpus of all notes in the Solr index.

Counting the terms can take hours, depending in part on how many terms are in the dataset and also on how many documents are indexed. While the count is ongoing the synonyms will be available to users but users will not be able to change options that limit the results by count. Counts will not appear for a dataset until the entire counting process for all of the terms is complete. The counting process is refreshed only at the time a dataset is Activated. To refresh the counts, Deactivate and then Activate the dataset again. Refreshing the counts may be desirable if many new documents have been added since the counts were last conducted, which could change the frequency of the Synonyms in the dataset.

The Deactivate button retains the Synonyms dataset in the system but makes them unavailable to users. The Remove button deletes the dataset entirely from the database.

If the counting process was interrupted (for example, by the server being patched and restarted) the system should be able to recover from where it left off and complete the counts, but it must be restarted manually be pressing the the Resume button which will be displayed when an interrupted process is detected. The counting process can also be stopped manually by pressing the Stop button. When counts have been stopped before completion, the Synonyms will be available but without counts.

There are several things displayed in the table that may be useful. Some of these are described below:

Synonym Dataset Statistics

A few statistics are reported that can be helpful to understand how valuable the dataset might be for users. Some of these statistics will not be available until counting has completed for the entire dataset.

Rows/Terms: Rows are the total number of rows in the dataset. Terms are the number of distinct terms in the dataset, since a term can appear in more than one row if it is "mapped" to multiple different terms.

Rows/Terms w/ count > 0: This lists the number of Rows/Terms in the Synonyms dataset that appear in at least 1 document across the entire document index.

Synonym Dataset Statuses

A dataset can have several statuses, described below:

Deactivated: The dataset has been successfully loaded, but it is not available to users. Users will not even be aware that it is in the system.

Activated without Counts: XX% completed: The counting for the frequency for the dataset has begun, but it is not complete. After counting has begun (by pressing the Activate button) the dataset will be available to users, but counts will not be available until the counts are complete for the entire dataset.

Counting is stopped: Counting is no longer occurring, either because it was interrupted by a problem with the system (such as a server restart) or because it was manually stopped by the Administrator. Counting can be restarted by pressing the Resume button.

Activated with Counts: The synonym dataset is available to all users, along with the document counts.

Enqueued: Only one synonym dataset can be counted at a time, so if multiple datases are activated in quick succession, some will be in the queue (enqueued) while waiting for counting to begin.

Synonyms management page
Figure 4. The Synonym Datasets table on the Administrator app. In this screen several datasets have already been uploaded and are Activated with the term counting completed (Status: "Activated with Counts"). Others are in various stages of counting or availability to users, described by their Status.

System error logs

The Log tab in the EMERSE admin page is a good place to look if you are running into problems. It will show any recent errors that are appearing in the system logs (which are different from the audit logs stored in the database). More details about troubleshooting can be found in the Troubleshooting Guide.

System Synchronization

The System tab in the admin app provides a feature to help synchronize components of the system. In general this should not be needed since synchronization automatically occurs once per night. However, when installing the system or troubleshooting it may be useful to force these events to occur immediately so that changes can be verified. To invoke this, simply click on the Synchronize button. Some of these actions may take time, though.

The Synchronize action will:

  1. Copy the Patient table from the database to the master patient Solr index

  2. Replicate the master patient Solr index to the patient-slave Solr index

  3. Update document statistics as they are displayed in the UI, such as the number of patients and the date range of the documents. Note that the number of patients displayed in the UI is the number of distinct patients with at least one document in the Solr documents index, not the number of patients in the Patient table.

Additional details about how to check the progress of these synchronization steps can be found in the Troubleshooting Guide.

System Caches

The Caches page under the System tab allows you to force EMERSE to refresh its caches of certain database tables. This is normally done periodically on a "cron job" (see the Configuration Guide), however, during installation, it may be helpful to force a refresh if you made significant changes to the database, but don’t want to restart EMERSE.

Heatmap (Overview) Statistics Page

The Heatmap (now called the Overview within the application) Statistics page under the System tab contains some statistics about the performance of EMERSE to run queries from the "Overview Page" in the EMERSE application. These can be especially slow to run since the queries can be complex and numerous, in addition to the fact that it’s one of the most used parts of EMERSE.

The statistics are split up into a few sections. The only statistics collected are simple minimum, maximum, and average. Some statistics are point-in-time snapshots, but most are cumulative, and can be reset to zero with the "Reset Stats" button. The "Reload Stats" button allows you to update the statistics displayed; they are not live. Reloading the whole page would do the same, but the button is much faster.

The Fair Heatmap Query Scheduler Algorithm

To understand many of the statistics, it’s best to explain the internal scheduling mechanism used to run heatmap queries.

When a user views a page of patients in the Overview table, this triggers the backend to add a batch of overview-table rows to a data structure we call the fair batch heap. All of the rows of a page of the table go into the same batch.

There are then a configurable number of threads that each pull a row from the highest-priority batch in the fair batch heap, and run queries to fill out that row. If there are N columns in the table (ie, N sources), then there are 2N queries done to produce simple counts of matching documents, or N + NC queries done for the mosaic view, where C is the number of colors in the term bundle.

After filling out the row, the thread sends the row to the browser, goes back to the fair batch heap, and re-proirities the batch based on the amount of time it took to fill out the row. It then starts again, pulling a row from the highest-priority batch.

The prioritization of batches is determined by a priority number maintained on the batch. Batches start with zero priority. If a row from a batch takes M milliseconds to fill out, and there are N other batches in the fair batch heap, then the batch that contained the processed row is docked MN priority, and every other batch is given M more priority. If a batch completes but has non-zero priority, it’s remaining priority is distributed evenly between the remaining batches in the fair batch heap. This ensures there is no "drift" from zero of the avarage priority of a batch in the heap. (In the code, this is actually phrased with "penality" which is just negative priority.)

The Statistics

Batch Heat Statistics

These statistics are a point-in-time snapshot of the number of rows in the batches in the heap. In particular, this lets you know how many pages of the overview are being processed concurrently.

Row Query Time Statistics

These statistics are cumulative since the last reset. Each data point is the duration needed to fill out a row. The unit is milliseconds.

Intra-Batch Service Delay Statistics

These statistics are cumulative since the last reset. Since batches are processed in a changing priority order, if there are lot of batches, and if a batch is slow to process, it may be a rather long time before even one row is taken from the batch to be processed. The data points of these statistics are the durations from one row being taken from a batch to the next row being taken from the same batch. Generally, the minimum duration is very low, since at some point multiple threads will grab a row from the same batch roughly simultaneously, especially on the first batch in the heap, when all threads are inactive and grab from the one and only batch in the heap.

Cancelled Batch Jobs

These statistics are cumulative since the last reset. This tracks the number of rows remaining in a batch that was cancelled. It’s mainly here to ensure the cancel mechanism is working, and to show how much work is saved by employing a cancel mechanism. A batch is cancelled if a user changes the search, or after a minute of inactivity from the browser, in the case they closed their browser tab that initiated the search.

Batch Computation Stats

These statistics are a point-in-time snapshot. This section is actually a list of Row Query Time Statistics, but specific to each batch currently in the fair batch heap. The data points are the number of milliseconds it took to process a row from the given batch. If you suspect a single very long-running query is slowing down the system, you should be able to see that from the statistics here.

Software component versions

The EMERSE team tries to keep up with the latest versions of software components, as this is seen as a good approach to ensure security. While some components are installed by each site (e.g., Tomcat, Solr), other components come pre-installed with the EMERSE distribution. One way to check for the versions of these additional components is shown below for Spring and Hibernate:

unzip -l emerse2-4.3.war | grep '/spring.*jar'
unzip -l emerse2-4.3.war | grep '/hibernate.*jar'

Note that the specific version numbers of the war file may change based on the current version of the EMERSE distribution.

Index Optimization

EMERSE uses Apache Solr for indexing documents. Over time the index can become more fragmented and less efficient. This issue can be addressed by optimizing the index periodically. Index optimization is time- and space-intensive, so we recommend doing this about once every 3-4 months. Additional details can be found in the Configuration and Optimization Guide.

Audit logs

EMERSE maintains persistent audit logs of user activities, including logins, attestations (reasons for use with each session), patients/notes viewed, and search terms, among other actions. While reports are not currently built into the interface, you can access these logs via the database. We have several prebuilt SQL queries that can be found in the Reporting and Auditing Guide.

Each user’s set of search terms is captured in the audit logs, and these are the actual terms used. In other words, the terms are not referenced back to original sources (such as the Synonyms suggestions), which means that updating or changing the system-wide Synonyms will not affect the ability to retrieve or interpret existing search terms in the audit logs.

Some items are not captured in the audit logs, including any comments or tags (used for annotating the patients lists) affiliated with a Saved Patient List.