
This guide describes the various features of EMERSE and how they can be used. The guide is directed towards regular EMERSE users. For technical details about implementing and running the system from an administrative perspective, please see our other guides.

What is EMERSE?

EMERSE is a search engine, sometimes referred to as an information retrieval system, and text processing tool for identifying terms and concepts in free text (also known as unstructured) clinical notes from electronic health records. It has some basic natural language processing (NLP) capabilities built into it, but EMERSE was designed mostly to be a simple, quick system to help people find the information they need, and to require minimal training to use.

EMERSE was not designed to automatically "code" information. In other words, it was not designed to find a concept, assign it a code, and then build up a spreadsheet of the concepts. While it may be able to do this for simple concepts, other types of concepts that have to be derived from integrating multiple pieces of data are currently beyond the scope of what EMERSE can handle. For the most part, EMERSE will help you find the information but it will still be up to you to decide if the context is correct, and how the concept should be coded.

EMERSE is a mature, feature-rich application that has been in development since 2005. EMERSE should be considered a 'tool' in a toolbox of multiple software applications to support the data needs of users. It can work well in environments with other existing and commonly used tools for structured data (examples include i2b2/ENACT and Leaf). EMERSE also works well in environments that have tools such as REDCap, which is used for electronic data capture. Of course, EMERSE can be used standalone as well, without other software tools.

An example of a typical workflow might include identifying an initial cohort based on structured data within i2b2 (ICD-10 codes, lab values, etc.) and then searching that cohort within EMERSE. Data abstracted from EMERSE would then be stored within REDCap. However, many other software tools can be used in conjunction with EMERSE, depending on local needs and workflows.

Citing EMERSE

The EMERSE team is proud to be able to provide high-quality research software at no cost. But to maintain our team’s funding, it is vital for us to demonstrate the value EMERSE has provided, which is often measured in the form of peer-reviewed publications that have used EMERSE. If you use EMERSE in any way (cohort identification, data abstraction, or other tasks), please cite us in your paper:

Hanauer DA, Mei Q, Law J, Khanna R, Zheng K. Supporting information retrieval from electronic health records: A report of University of Michigan's nine-year experience in developing and using the Electronic Medical Record Search Engine (EMERSE). J Biomed Inform. 2015 Jun;55:290-300. doi: 10.1016/j.jbi.2015.05.003. Epub 2015 May 13. PMID: 25979153; PMCID: PMC4527540.

This paper is also available from PubMed.

A full list of papers that have used EMERSE can be found at https://project-emerse.org/publications.html.

EHR systems that work with EMERSE

EMERSE was designed to be vendor "agnostic" with respect to electronic health record (EHR) systems. That is, EMERSE is not directly coupled to any specific EHR. If there is a way to get data out of your EHR, it should be possible to get it into EMERSE. Our partners have successfully used EMERSE with both Epic and Cerner, the two most "popular" EHRs in use at large academic medical centers.

How can EMERSE be obtained?

EMERSE is available free of charge with open source licensing. However, it does require a technical team to install it, move data into it, and then manage the system including ensuring security, provisioning accounts, etc. Our other guides provide substantial detail on the processes required to install and operate EMERSE.

Who should use EMERSE?

Anyone who has a need to find data "buried" within clinical notes could benefit from using EMERSE. EMERSE can be used for a wide variety of clinical and translational research tasks, as well as non-research tasks. A few examples are described below. On the project-emerse website there are brief use case videos of real users describing how EMERSE has benefited their work. Further examples of research supported by EMERSE are provided in a list of peer-reviewed publications.

Research Use Case Examples

  • Reviews preparatory to research: EMERSE is great for a review preparatory to research, where you want to estimate the number of patients available for a study in order to determine whether a study is feasible or not.

  • Rare diseases: EMERSE can quickly find mentions of rare diseases in the notes, which is especially helpful when billing codes are too broad or used incorrectly. For example, there is only one ICD-10 code to represent all 13 types of Ehlers-Danlos syndrome. To find the specific type of interest, you would need to search the free text.

  • Eligibility determination: For studies in which eligibility determination is complex and may rely on data only captured within the free text portion of documents, EMERSE can be a rapid way to check for mentions of inclusion/exclusion criteria.

  • Data abstraction: For patients already on a study, EMERSE can be used to identify data for abstraction. These could include treatment courses, side effects, adverse events, or anything about the patient that requires a chart review.

Operational Use Case Examples

  • Quality Improvement (QI): EMERSE can be used for QI activities such as infection control monitoring. For example, see "Informatics and the American College of Surgeons National Surgical Quality Improvement Program: automated processes could replace manual record review."

  • Risk Management: Such tasks could include identifying falls or finding patients with device/implant recalls.

  • Health Information Management (HIM): HIM teams can use EMERSE for a wide variety of tasks including support for outside records, identifying unapproved abbreviations, contact moves, historical inpatient/outpatient volume searching, and even consents.

  • Billing/Coding: The billing/coding teams can use EMERSE to help identify supporting information, or verify that necessary details are actually mentioned in the notes. Using EMERSE can help save an institution hundreds of thousands of dollars each year!

  • Cancer Registries: Cancer Registrars can use EMERSE for a wide variety of data abstraction tasks including difficult-to-find information such as genetic and biomarker testing.

Clinical Use Case Examples

  • Clinical Chart Review: EMERSE can be used to help find details about a patient rapidly, even during a clinical visit. For example, if a patient mentions that a certain medication helped their migraine 3 years ago but can’t remember the name of the medication, just search the chart for 'migraine' and find that note within seconds.

  • Prior authorizations: Often an insurance company won’t approve a specific drug unless documentation can be provided that cheaper alternatives were used without success. EMERSE can be used to find mentions of those ineffective alternatives very quickly.

EMERSE was designed to work with patient data. Therefore, access will almost certainly be limited, and each site will have policies in place about who can or cannot have access to the system, and under what circumstances that access can be granted or revoked. In general, institutional review board (IRB) approval will always be required for EMERSE access when research is being conducted, whereas other types of requirements/approvals will need to be in place for other access types such as clinical care, operations, quality improvement, education, etc.

Guide Conventions

There are a few conventions used throughout the guide to ensure consistency and ease of understanding of the concepts discussed.

Italicized phrase
Most of the time an italicized phrase will refer to a specific component of EMERSE that a user can access or utilize.

Navigation and action elements
EMERSE provides multiple approaches for moving around within the system or selecting an action for the system to do. This can sometimes be a button, a clickable icon, or another option. In this guide these elements will be highlighted with a box around them.

Terms
Terms, sometimes called Search Terms or Keywords, will usually be shown in a monospace font to help them stand out from the rest of the text in this guide. They will also sometimes be shown with a colored background to show how they might be highlighted by the system. Sometimes the word Phrase will also be used. That is essentially a Term that has multiple words in it, like "asthma exacerbation". But the words Term and Phrase will often be used interchangeably.

The Navigation Panel is the area at the top of EMERSE that contains many of the navigation elements (buttons) in the system, as well as other information that can be useful to users such as the currently selected patients and search terms.

When "Documents" are mentioned, this means clinical documents generated by a clinician about a patient. These documents are also commonly referred to as "Notes" or even "Reports" and often these terms are used interchangeably.

Figure 1. The Navigation Panel is outlined in red in this screenshot.

How to Get Help

There are various ways to get help or learn more about EMERSE. This guide is a good place to start. Also consider asking a colleague. Another place to look is the online videos (although these are not always kept up to date). If you need more help, please ask your local EMERSE site administrator, especially if your question concerns account and access information. If all else fails, you can contact the EMERSE team at the University of Michigan, which is where EMERSE was originally developed. We are happy to help you if you have questions.

Moving Around Within EMERSE

EMERSE was designed as a "single-page application". This means that new screens within EMERSE are actually loaded within the same web page. You should generally avoid using the browser’s back button to go to previous pages. Rather, use only the buttons presented within the EMERSE application to navigate and move around within the application. In other words, if you want to go "back" to a certain page, navigate back to that page by using the buttons that EMERSE displays.

Many things are clickable in EMERSE, even if they are not highlighted as such. It is safe to click around and try things out. You can’t ruin anything or change the original documents.

Most of the tables can be sorted. To sort a specific column in a table, click on the text in the column header of interest. Columns that are sorted should have a small arrow pointing in the sort direction (up or down). To sort in the opposite direction, just click on the column header again.

EMERSE supports roles/privileges which are configured locally at each site. Depending on your specific role, some parts of EMERSE may not be accessible to you. If you think this is a problem, contact your local EMERSE administrator.

Getting Started and Logging In

To get started with EMERSE you will need an account. Because EMERSE contains protected health information (PHI), access will be limited. Access will be provided by your local institution, and requirements for access may vary. You will need institutional review board (IRB) approval for research. Additional training may also be required, especially as it relates to human subjects research. You will likely need other reviews and approvals for non-research use (quality improvement work, operational work, etc.), which will be determined based on local policies.

Attestation Page

The Attestation Page is the first page you will land on after logging in to EMERSE. It is the page on which you define your specific reason for using EMERSE for that particular login session. This information is stored in the logs (audit trails) to capture specific reasons for use when looking up patient information. There is currently no way to change your Attestation selection once you have logged into EMERSE for a specific user-session. If you need to change it, log out of EMERSE and then log back in again. You can view your currently selected Attestation within EMERSE by clicking on the menu in the upper right section of the Navigation Panel.

The entries/selections on this page get recorded in the official audit logs which may be reviewed periodically to ensure that reasons for access are valid.
Figure 2. The Attestation page is where you will state your reason for using EMERSE after logging in.

There are three high-level types of options on the Attestation Page, described below. Each may or may not be available to you, depending on how EMERSE has been configured at your institution.

  1. Common Use Case: Common Use Cases (e.g., "Quality Improvement", "Clinical Care", etc.) are set by your local site’s system administrators, who can add or remove options to best fit the most common use cases at your site. Clicking the option will record that reason in the log files.

  2. Research Study: Research Studies should display valid IRB-approved studies in which you are involved. How these studies get populated from their source systems (such as an electronic IRB tracking system) will differ for each EMERSE installation, but note that sometimes there can be delays between the time a study is approved and the time that it shows up in the Attestation table.

  3. Free Text (Other Reason): This provides a free text option to enter your reason for using EMERSE. You can type in anything, but it should ideally be brief and should sufficiently describe why you are using the system. This free text box should really only be used as a last resort, in cases where you are unable to find any other options that match your reason for using EMERSE. Once you submit your response, it will be listed in the Table as a Free Text entry during future logins. These entries can be removed from the table at any time by clicking on the Remove button, but doing so will not remove the information from the audit logs.

Patients

There are three overall approaches for searching with EMERSE, based on whether you are:

  1. Searching across All Local Patients without specifying a patient list (in which case you are likely identifying a cohort). The term Local is used here to emphasize that those are all the patients in the local EMERSE system you are using.

  2. Working with a Patient List of known patients, in which case you are likely conducting a chart review. The Patient Lists are themselves divided into two categories: Temporary and Saved. Each is described below.

  3. Searching across the Network of EMERSE sites, in which case you can get approximate counts from other sites but will not be able to view any notes or protected health information (PHI) from those other sites.

An important distinction to understand is that the way EMERSE searches differs depending on whether it is an All Local Patients search or a search across a pre-defined Patient List. These differences are described in detail in the section discussing the Terms. A Network search behaves similarly to an All Local Patients search.

When first logging into EMERSE, after completing the Attestation Page you will always land on a page where All Local Patients are selected. This is to ensure that you will be able to complete a search even if you do not know the patients you want to search. If you want to search a known set of patients instead, this can be changed with just a few button clicks.

All Local Patients

All Local Patients (sometimes referred to as All Patients, or even simply All) is the default setting after logging into EMERSE. As the name implies, this setting will search across all of the patients in the local EMERSE system installed for your site. To select All Local Patients any other time, simply click on Patients, then All Local Patients, and then click on the single row in the table to choose "All patients in the EMERSE system". When this is selected, the system will show the phrase "All Local Patients" next to the Patients button in the upper-left of the Navigation Panel, as well as the total patient count in parentheses. After choosing All Local Patients, navigate to the Terms to enter your search terms, then click on the Find Patients button to identify the patient set.

When searching across All Local Patients you will be able to see an overall patient count that matches the search criteria, a subset of up to 100 text snippets (Summaries) to quickly review the context of what was found, a summary of the Demographics of the patients that were found, and a graph of the frequency of the patients containing the term(s) of interest over time (Trends).

A Summary in EMERSE is a small snippet of text that shows the keyword or phrase of interest (the search term) highlighted with additional text (about 20-30 characters or 5-6 words) on either side to show the term in its original context. This is to enable rapid skimming of the results and developing a better understanding of the results in context. Depending on your specific privileges set by your site administrator, it is possible that you will not have access to see these Summaries.
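The idea behind a Summary can be illustrated with a short sketch. The Python below is not how EMERSE itself builds Summaries. The function name `make_summaries`, the 30-character window, the square brackets standing in for highlighting, and the simple literal matching are all assumptions made for illustration.

```python
import re

def make_summaries(text, term, window=30, limit=100):
    """Return up to `limit` snippets showing `term` with `window` characters of context."""
    snippets = []
    # Case-insensitive literal match; re.escape stops punctuation in the
    # term from being treated as regular-expression syntax.
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        before = text[start:match.start()]
        after = text[match.end():end]
        # Brackets stand in for the highlighting shown on screen.
        snippets.append(f"...{before}[{match.group(0)}]{after}...")
        if len(snippets) >= limit:
            break
    return snippets

# Hypothetical note text, for demonstration only.
note = "Patient reports that sumatriptan helped her migraine three years ago. No migraine since."
for snippet in make_summaries(note, "migraine", window=20):
    print(snippet)
```

Each printed line shows one occurrence of the term with a little text on either side, which is what makes rapid skimming possible without opening the full document.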

To review patients found through an All Local Patients search in more detail, the patients first have to be moved to a Patient List, described below.

Patient Lists

Patient lists are based on medical record numbers (MRNs). Each patient list can contain anywhere from a single patient up to 100,000 patients. Patient lists can be created through a search across All Local Patients (using Find Patients), or by entering a set of medical record numbers. You can also apply Find Patients to an existing Patient List to reduce the list further based on the presence of additional search terms. You can also create a patient list from a set of Encounter IDs (EIDs; commonly used in Epic), if you have a list of EIDs and your system administrators have loaded the Encounter ID data into EMERSE.

100,000 patients per list is a system default, but your local site administrator may have changed this to a different number.

Depending on your local configuration, it may also be possible to have a patient list automatically submitted to EMERSE from another system (such as i2b2), which would then appear as a Saved Patient List, described below.

There are two basic types of patient lists, Temporary and Saved, and there are differences between the two types regarding what you can do with them and when you should use them. The following table provides a high-level overview of these differences, with additional details following.

Table 1. Feature Comparison between the Temporary Patient List and the Saved Patient List

Feature                                               Temporary Patient List   Saved Patient List
Maximum number of patients per list                   100,000                  100,000
Maximum number of lists per user                      1                        Unlimited
Ideal for "Reviews Preparatory To Research"           Yes                      No
Ideal for teams working together from the same list   No                       Yes
Supports sharing with other users                     No                       Yes
Saved between sessions                                No                       Yes
Supports Tags                                         No                       Yes
Supports Comments                                     No                       Yes
Supports Exporting/Saving to Excel                    No                       Yes
Can be used with the Compare Patient Lists feature    No                       Yes

Temporary Patient List

A Temporary Patient List is a list that a user can make and use during a single user session but will not be saved between sessions unless it is converted to a Saved Patient List. That is, it can be used for a single session but is erased when the user logs out. Such a list is useful in cases where a patient, or set of patients, needs to be reviewed but there is no need to save the list of patients for another time. A "review preparatory to research" should utilize this Temporary Patient List since such a list should not be saved during this type of research review activity.

Figure 3. The Temporary Patient List screen has limited options for managing the list (basically adding and removing patients). Comments and Tags, as well as Sharing options, are not available with Temporary Patient Lists.

A Temporary Patient List is meant to be fast and simple, and has a limited set of features. By contrast, a Saved Patient List (see next section) provides more features. For example, with a Temporary Patient List it is not possible to share the list with other users, nor is it possible to annotate the list with Comments or Tags, both of which are only possible with a Saved Patient List.

Saving a Temporary Patient List

A Temporary Patient List can be converted to a Saved Patient List by clicking on the Save option, which can be found within the list of options for the Temporary Patient List.

Adding Patients to a Temporary Patient List

Patients can be added to a Temporary Patient List by selecting the Add/Upload option. You can then copy and paste in a column of medical record numbers (MRNs) (from an Excel file, for example). There is also an option to paste in a list of Encounter IDs. Encounter IDs (EIDs) can be useful because they are often tied to a specific component of a patient’s care, for example emergency room visits or inpatient stays. If you are given a list of Encounter IDs, you can enter them into the same box where the MRNs would be entered, select the "By Encounter ID" option, and then click on the Add Patients button. At that point you will also be given the option to add those EIDs as a Filter. If the EIDs are not added as a Filter, then when a search is conducted EMERSE will search across all of the notes that had an MRN corresponding to the entered EIDs, but the search will not be limited to notes linked with the EIDs themselves. If the EIDs are added as a Filter, then EMERSE will search only the notes that were linked to one of the EIDs. In other words, EIDs can be used to create a patient list (since there is always a corresponding MRN for each EID), but the EIDs can also be used to Filter what is searched. The EIDs can also be directly added to the Filters section of EMERSE. However, entering the EIDs in the Patients section will create a specific patient list based on those EIDs.
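The difference between using EIDs only to build the patient list and also adding them as a Filter can be sketched in a few lines of Python. This is an illustration of the behavior described above, not EMERSE code. The `Note` record, the `notes_in_scope` function, and the sample MRN and EID values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Note:
    note_id: int
    mrn: str   # patient the note belongs to
    eid: str   # encounter the note is linked to

def notes_in_scope(notes, eids, eids_as_filter):
    """Return the notes a search would cover for the entered Encounter IDs."""
    eids = set(eids)
    # Each EID has a corresponding MRN, so the EIDs always define the patient list.
    patient_mrns = {n.mrn for n in notes if n.eid in eids}
    if eids_as_filter:
        # Filter on: only notes linked to one of the entered EIDs.
        return [n for n in notes if n.eid in eids]
    # Filter off: every note belonging to a patient on the list.
    return [n for n in notes if n.mrn in patient_mrns]

notes = [
    Note(1, "MRN001", "E10"),  # emergency room visit
    Note(2, "MRN001", "E11"),  # later clinic visit, same patient
    Note(3, "MRN002", "E20"),  # different patient
]
print([n.note_id for n in notes_in_scope(notes, ["E10"], eids_as_filter=False)])  # [1, 2]
print([n.note_id for n in notes_in_scope(notes, ["E10"], eids_as_filter=True)])   # [1]
```

In both cases the patient list contains only MRN001. The Filter setting changes which of that patient's notes are searched.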

Figure 4. Patients can be added to the Temporary Patient List (or to Saved Patient Lists) by pasting in a column of medical record numbers or uploading a file with the numbers.

Removing Patients from a Temporary Patient List

To remove a single patient from a Temporary Patient List simply click on View Patients and then the Remove link associated with the patient. Or, to remove all of them from the list at once, choose the Clear option for the Temporary Patient List and then select the Clear All Patients option.

Saved Patient Lists

A Saved Patient List is a list that a user can make and save between sessions. There is no limit to the number of Saved Patient Lists that a user can have, and each list can contain between 1 and 100,000 patients. Because these lists are saved, each list should be given a name and a description, available under Name/Description. A Saved Patient List is very similar to a Temporary Patient List but there are several notable advantages when using the Saved Patient List.

Figure 5. The Saved Patient List screen has additional options for managing the list. Comments and Tags are supported. Users can also share the list with team members.

Sharing Saved Patient Lists

These Saved Patient Lists can be shared with other EMERSE users. To share them, first make sure the list has been selected, click on Sharing and enter the user IDs of those who should have access to the list. These other users will be able to view the list, but because they are not the "owner" of the list they will not be able to add new patients or remove existing patients. However, these users will be able to add/edit the Comments and Tags, both of which are described below. These Comments and Tags belong to the list and are shared and available to all users who have access to the list. Thus, among teams sharing a Saved Patient List, it is important to establish a protocol for changing these Comments and Tags since one user could modify (or delete) a Comment or Tag entered by another user. Further details about Comments and Tags can be found below.

Comments

A Comment is a short bit of text that can be entered by users who are reviewing patients in a Saved Patient List. These Comments can be used for any purpose but could be useful for recording notes about the patients as the reviews are occurring. For example, if a study coordinator/data abstractor has a question about a patient that needs additional review by the study team, the Comment section would be a good place to enter that. It could also be used for some simple/basic data abstraction. Comments (and Tags) can be exported to an Excel file. Exporting Saved Patient Lists is described in a separate section. Comments are saved immediately when a user clicks out of, or removes focus from, a Comment field.

EMERSE is not intended to be a full-fledged electronic data capture system. For storing data as you are reviewing records in EMERSE you may need an additional system running in tandem with EMERSE, such as REDCap.

Tags

Tags are a simple, yet powerful, way to help manage the patients in a Saved Patient List. While just a simple checkbox, Tags can be used in multiple ways. For example, Tags are useful when reviewing patients in a Saved Patient List to mark patients of interest. Similar to Comments, it is up to each user (or team sharing the list) to determine how the Tag should be used, but a common use would be to mark certain patients as being eligible for a study. The Tags belong to the patient list and all users who have access to the Saved Patient List will see the Tag and be able to change it. Tags are saved immediately upon clicking on the checkbox (either checked or unchecked). Another use for Tags is to mark patients based on a Find Patients search. For example, you can enter a search term, and select Find Patients to identify all the patients in the list with that term. Then you can choose the option to Tag Patients in List which will add a Tag to each of the patients in the list that has the term (or terms) of interest. From there, you could then navigate to the patient list itself and clear all of the untagged patients to reduce the list to only those with the term of interest (see the next section on Removing Patients).

Removing Patients from a Saved Patient List

It is possible to individually remove patients from a Saved Patient List by simply clicking on the View Patients tab and then the Remove link next to each patient’s name. Or, to remove all of them from the list at once, choose the Clear/Delete option for that list and then click on Clear All Patients. Clearing (but not deleting) the list could be advantageous in cases where you want to preserve the sharing settings of that list for a group of users but want to work with a new set of patients in the list.

You can also remove patients that are tagged, or those that are untagged, depending on your specific use case. Tags can be applied automatically after using the Find Patients feature applied to the list (see prior section on Tags).
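The tag-then-clear workflow can be sketched as follows. This Python is a simplified illustration and not EMERSE's implementation. The function names `tag_matching_patients` and `clear_untagged`, the sample data, and the plain substring matching (which is much cruder than a real EMERSE search) are assumptions.

```python
def tag_matching_patients(patient_notes, term):
    """Return the set of MRNs whose notes mention `term` (case-insensitive)."""
    term = term.lower()
    return {
        mrn
        for mrn, notes in patient_notes.items()
        if any(term in note.lower() for note in notes)
    }

def clear_untagged(patient_notes, tagged):
    """Return a new patient list that keeps only the tagged patients."""
    return {mrn: notes for mrn, notes in patient_notes.items() if mrn in tagged}

# Hypothetical Saved Patient List: MRN -> note texts.
saved_list = {
    "MRN001": ["Started paclitaxel in March."],
    "MRN002": ["No chemotherapy given."],
    "MRN003": ["Tolerated Paclitaxel well."],
}
tagged = tag_matching_patients(saved_list, "paclitaxel")
reduced = clear_untagged(saved_list, tagged)
print(sorted(reduced))  # ['MRN001', 'MRN003']
```

Clearing the tagged patients instead would be the mirror image: keep only the MRNs that are not in `tagged`.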

Deleting a Saved Patient List

It is also possible to delete the entire list by pressing the Delete Patient List button under the Clear/Delete option. Be careful because once it is deleted there is no undo option.

Exporting a Saved Patient List

Saved Patient Lists can be exported/saved as an Excel spreadsheet. The file requires a password which must be added at the time of saving it. The exported spreadsheet will contain each patient’s name, medical record number, date of birth, current age, other demographics, and any Comments or Tags that were added. This can be useful for a variety of reasons, including importing the list into another system such as REDCap.

Figure 6. Saved Patient Lists can be exported to a password-protected Excel file. Names, MRNs, Demographics, Comments and Tags are included in the export.

Moving Patients to a Patient List

Patients identified using the Find Patients feature can be moved to a Temporary Patient List for more detailed searching and viewing of their notes. Once the Find Patients search is completed, simply click on the button called Move to Temporary Patient List. This will transfer the set of patients identified from a Find Patients search to a Temporary Patient List, which is not saved between sessions. Once that move occurs, the list of patients will be shown. To then change that Temporary Patient List to a Saved Patient List (the latter of which is saved between sessions), simply click on the option called Save.

Adding Patients to a Patient List

In addition to moving patients to a new Temporary Patient List from a Find Patients search, patients can also be added to a list (either to an existing list or to a new list). This process is the same regardless of whether you are using a Temporary Patient List or a Saved Patient List. Entering patients is done via the Add Patients option for the patient lists. To enter new patients, you can type each MRN one-by-one into the MRN text entry box, or you can paste an entire set of MRNs at once. For example, you can copy and paste an entire column of MRNs from an Excel spreadsheet and paste them into the MRN text entry box to create a patient list. If there are already patients in a patient list, adding new MRNs will append them to the end of the existing list, up to the maximum number allowed.

Any time an MRN is entered, that MRN is checked against the complete set of patients within EMERSE. If an MRN is not found, a warning to the user will be provided. Similarly, if a duplicate MRN is entered for a list, the duplicates will be removed automatically and the user will be alerted.
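The checks described above (unknown MRNs, duplicates, and the maximum list size) can be sketched in Python. This is an illustration under stated assumptions, not EMERSE's actual logic. The function name `add_mrns`, its return values, and the way entries beyond the maximum are reported are hypothetical.

```python
def add_mrns(existing, entered, known_mrns, max_size=100_000):
    """Append entered MRNs to a list and report unknown, duplicate, and over-limit values."""
    result = list(existing)
    seen = set(existing)
    not_found, duplicates, over_limit = [], [], []
    for raw in entered:
        mrn = raw.strip()
        if not mrn:
            continue  # skip blank lines from a pasted column
        if mrn not in known_mrns:
            not_found.append(mrn)    # would trigger a warning to the user
        elif mrn in seen:
            duplicates.append(mrn)   # dropped, and the user is alerted
        elif len(result) >= max_size:
            over_limit.append(mrn)   # assumption: extra MRNs are reported, not added
        else:
            result.append(mrn)       # appended to the end of the existing list
            seen.add(mrn)
    return result, not_found, duplicates, over_limit

# A pasted column of MRNs, one per line (hypothetical values).
pasted = "MRN002\nMRN001\nMRN999\n\nMRN002".splitlines()
known = {"MRN001", "MRN002", "MRN003"}
print(add_mrns(["MRN001"], pasted, known))
```

Here MRN002 is appended once, MRN999 is reported as not found, and MRN001 and the second MRN002 are reported as duplicates.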

As described in the section on Temporary Patient Lists, Encounter IDs (EIDs) can also be entered into the Add Patients section. Each EID is associated with an MRN, so if you start with a list of EIDs, you can use that EID list to generate your patient list based on the corresponding MRNs in the system. This will only work if your local EMERSE instance has EIDs loaded into the system.

Compare Patient Lists

The Compare Patient Lists feature lets you compare two Saved Patient Lists to each other with respect to the patients contained within each list. This feature can be accessed via the Compare Patient Lists button, which can be found above the table displaying the Saved Patient Lists.

Figure 7. The Compare Patients screen offers powerful options for quickly comparing the overlap between two Saved Patient Lists and selecting only the portion that is desired to create a new Saved Patient List.

Compare Patient Lists provides functionality that can be useful for separating out only those patients of interest that may be found among two lists. This feature essentially builds a Venn diagram showing the overlap of patients between any two Saved Patient Lists. You can then select any part of the diagram you are interested in simply by clicking on the appropriate section of the diagram. Once your selection is finalized, click on the Create New Patient List From Selection button and you will then have a new Saved Patient List based on your selection (you will have to provide a Name and Description for the new list). Note that when lists are merged, duplicates will be removed automatically. The original lists used for the comparison will not be modified.

Table 2. Operations that can be performed with Patient Lists using the Compare Patient Lists feature
             (List 1 is represented by the left circle, List 2 is represented by the right circle)

All patients in List 1 that are not in List 2

user guide Venn1

Only those patients from List 1 and List 2 that are not already in both lists

user guide Venn3

All patients from Lists 1 and 2 (an actual merge)

user guide Venn4

All patients in List 2 that are not in List 1

user guide Venn6

Only those patients that were originally in both List 1 and List 2

user guide Venn7
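The five operations in Table 2 correspond to standard set operations. The sketch below expresses them with Python sets; the MRN values are invented, and this only illustrates the logic, not how EMERSE implements the feature.

```python
# Two hypothetical Saved Patient Lists, represented as sets of MRNs.
list_1 = {"100001", "100002", "100003"}
list_2 = {"100003", "100004"}

only_in_1 = list_1 - list_2        # all patients in List 1 that are not in List 2
in_one_not_both = list_1 ^ list_2  # patients from either list that are not in both
merged = list_1 | list_2           # all patients from Lists 1 and 2 (duplicates collapse)
only_in_2 = list_2 - list_1        # all patients in List 2 that are not in List 1
in_both = list_1 & list_2          # only patients that were in both lists

print(sorted(only_in_1))        # ['100001', '100002']
print(sorted(in_one_not_both))  # ['100001', '100002', '100004']
print(sorted(merged))           # ['100001', '100002', '100003', '100004']
print(sorted(only_in_2))        # ['100004']
print(sorted(in_both))          # ['100003']
```

Because each operation produces a new set, the two input sets are left unchanged, which mirrors the statement that the original lists are not modified.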

It may be useful to use the Compare Patient Lists feature when you want to compare patients found through different approaches for obtaining the patient lists. For example, suppose you find a set of patients who had "back pain" and a separate set of patients who had "acupuncture". Using the Compare Patient Lists feature you can find the set of all patients who had both concepts mentioned in their notes at least once, even if the mentions were in completely separate notes or sections of the EHR.

Or, suppose you have a set of patients with a diagnosis of "amyotrophic lateral sclerosis" (List 1) and now you want to exclude any patients with a mention of "packs per year" (List 2). You can make two separate lists, one based on each search. Then, using the Compare Patient Lists feature, you can select out the list of patients from List 1 that did not also appear in List 2.

Finally, some have used this feature to reduce a list obtained from an external source to make it more relevant. For example, suppose you have a list of 10,000 patients from a registry and now you want to know which ones mention a certain drug, paclitaxel. You can upload the list of patients from the registry (List 1). Separately you can run a Find Patients search for the word paclitaxel (List 2) and then use the Compare Patient Lists feature to identify only those patients that appeared in both lists. As of EMERSE version 6.1, this same kind of result can also be achieved by using the Find Patients feature applied to the original patient list, followed by Tagging the patients of interest and then clearing the Untagged patients.

Network Patients

As of EMERSE version 6.3, searching across a network of EMERSE sites is now possible. The decision to participate in this network depends on the leadership/administrators at your local site.

If available, you can select the Network Patients option, and then select the sites you want to include in your search. For now, this feature has limitations and remains somewhat experimental.

When searching across the EMERSE Network, only patient counts can be obtained. With the network functionality it is not possible to save a patient list, or see any details about the patients, since the counts can come not only from your local EMERSE system, but from other systems at research centers across the country.

Many more details are described in the overall section on the Network.

Obsolete medical record numbers (MRNs)

It is not uncommon for a single patient to be assigned more than one medical record number (MRN) over time. This can happen, for example, when a patient has an existing MRN but then comes to the emergency room and is assigned a new MRN when the clerical staff are not aware of the existing MRN. Over time, it may become evident that the patient has two or more MRNs and then a workflow is implemented in the electronic health record system to merge the MRNs, and move all of a patient’s documents to a single MRN.

For EMERSE, this can mean that a saved patient list (especially an older list) contains obsolete MRNs, because some of the MRNs in the list may have been re-assigned to newer MRNs. If this is the case, EMERSE will display a warning to the user when a Saved Patient List is opened. The list will automatically be sorted so that the Obsolete patient MRNs are at the top of the list, and a new column (called Obsolete) will appear to the right to help notify the user which MRNs are no longer valid. You can then easily remove the patient(s) from the list if you are the list owner. A similar message will appear in the Overview section of EMERSE, to warn about the obsolete MRNs. A few additional visual reminders will be invoked, such as a line through the name/MRN to help you identify which ones are obsolete. For example: Jane Doe 100346543.

EMERSE does not currently have the capability of telling you what the new, correct MRN is, and it is not able to automatically assign the obsolete patient MRN to the new MRN. However, it will be important for you to be aware of the changes so that you can also check for consistency of MRNs across other research systems where the older, obsolete MRN may be entered, such as a clinical data capture system (e.g., REDCap). Because EMERSE is not able to report on what the new MRN is, you will have to identify the updated MRN in the original source system which is likely to be your electronic health record (EHR) system.

This Obsolete MRN feature will only be enabled if your local system administrators are updating the patient database within EMERSE regarding MRNs that have been made obsolete. Additional technical descriptions about patient merges can be found here.
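The display behavior described above (flagging obsolete MRNs and sorting them to the top of the list) can be sketched as follows. The function name `flag_obsolete` and the row layout are hypothetical; the set of obsolete MRNs is assumed to come from whatever information the local administrators load.

```python
def flag_obsolete(patient_list, obsolete_mrns):
    """Return rows with an 'obsolete' flag, obsolete MRNs sorted to the top."""
    rows = [{"mrn": mrn, "obsolete": mrn in obsolete_mrns} for mrn in patient_list]
    # Python's sort is stable, so the original order is kept within each group.
    rows.sort(key=lambda row: not row["obsolete"])
    return rows


rows = flag_obsolete(["100001", "100002", "100003"], {"100003"})
for row in rows:
    print(row["mrn"], "Obsolete" if row["obsolete"] else "")
```

Note that the sketch only flags the obsolete MRN. As stated above, finding the replacement MRN still requires going back to the source EHR system.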

Filters

Filters are a way to reduce/limit the search results based on various criteria that are related to either the patients themselves or the documents. In some ways, search terms are also a kind of filter, but the Filters described in this section are related to things not related to the text within the actual clinical documents. Some filters work at the level of the documents (e.g., Encounter Dates and Document Source) and other Filters work at the level of the patient (e.g., Sex, Race, Current Age, etc.). Filters are described here at a somewhat high level, in part, because these filters might be very different for each EMERSE implementation. The options available will depend heavily on how the system is configured at each site. Further, the options available to a user might differ even at a single site if the administrators have connected access to certain kinds of documents to specific user roles within EMERSE (e.g., some users might not be allowed to view/search psychiatry reports). The examples shown below are being used for demonstration purposes, and may not match what you see in your instance of EMERSE.

Using Filters means that EMERSE has to check for additional criteria to make sure that the patients or documents meet the filter criteria. As a result, you will generally notice that a search using Filters takes longer than a search without Filters. However, this wait time should be offset by more relevant results that take less time to look through.
filters
Figure 8. Filters offer a way to reduce the number of results based on user-defined criteria. This screenshot shows filtering by document source. The sources that are checked will be included in the search and those not checked will be excluded from the search.

High level description of Filters

Most filters are configured using checkboxes, so that you can check off which elements you want to include. For filters that have many (sometimes hundreds) of options, checkboxes can be cumbersome so instead you can type in what you are looking for and then select from the list of options. To help you find what filter categories have options selected, a small yellow bar will appear next to the filter name. The filter currently selected will have a blue bar next to the filter name. To also help identify which filters are active, the filter name will be shown across the Navigation Panel at the top, but the individual selections will not be displayed there.

filter display
Figure 9. Example showing several filters chosen, which are listed in the Navigation Panel: Birth Date, Source, Department in Main EHR, Imaging Modality in Radiology. The specific filter elements selected are not shown there. To see the specific elements chosen, click on the filter option itself. Filters that have something selected have a yellow bar next to them, and the currently active filter on the screen has a blue bar next to it, and the filter name itself will be highlighted.

Filters are not applied to an overall category unless something in that category is chosen as a filter. For example, if nothing for Race is selected, then all Races will be included in a search. However, if only "Black or African American" is selected, then only patients with demographics matching that specific race will be included in the search.

Sometimes a document may be categorized into multiple options of a single filter. This should make sense given what the field represents. There’s no specific way to see if this is the case in the filter UI, but if a specific document is categorized into multiple options, it should list each option for that field when viewing the document.

Not all documents may be categorized into one of the Filter options. This can be because the association to a filter really does not exist, or because of a data quality issue. For instance, all documents should be associated with a department, but some documents may not be, for one reason or another. If you select an option on the department filter, documents that don’t have a department associated with them at all will not be included in your search. Administrators may create a default category for these kinds of documents, and if that happens you would be able to search documents that don’t have a department. The name of this category is chosen by administrators, but may be something like "Missing", "Unknown", "N/A", or "null".
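The selection rules from the last few paragraphs can be summarized in a short sketch: a filter with nothing selected is not applied, a document may belong to several options of one filter, and a document with no value for a filter is excluded as soon as any option of that filter is selected. The function name `passes_filter` and the department names are made up for illustration.

```python
def passes_filter(doc_values, selected):
    """doc_values: options a document is categorized into (may be empty).
    selected: options the user checked (empty means the filter is not applied).
    """
    if not selected:
        return True                    # nothing chosen: everything is included
    return bool(set(doc_values) & set(selected))


print(passes_filter(set(), set()))                                # True
print(passes_filter(set(), {"Cardiology"}))                       # False
print(passes_filter({"Cardiology", "Oncology"}, {"Oncology"}))    # True
print(passes_filter({"Unknown"}, {"Unknown"}))                    # True
```

The last call shows the effect of an administrator-defined default category: once uncategorized documents are given a value such as "Unknown", they become selectable like any other option.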

EMERSE generally has two major types of search modes, described in the section on Searching: Highlight Documents and Find Patients. Filters work a bit differently depending on these two modes, described in more detail below.

Filter by Birth Date

Filtering by Birth Date means that EMERSE will only include patients that were born within a specified birth date range. To make it easier to interpret, the Filter by Birth Date option shows the current age range based on the birth date(s) entered. Keep in mind that because the birth dates are fixed, these "current" ages will change over time. This filter essentially helps users limit results based on a patient’s current age (based itself on a date of birth). To filter based on the age of the patient at the time the document was written, see below.

The Birth Date filter does not currently take into consideration whether a patient is alive or deceased. Often, such information is incomplete within an EHR system. However, there is another filter for alive vs. deceased, so it can be used in conjunction with Birth Date (if the decedent data are populated by the system administrators).
filters
Figure 10. This screenshot of a Birth Date filter was taken on May 10, 2024. Based on the birth date range entered, the filter shows that it would be filtering for patients who were currently between 12 years, 4 months and 3 months, 11 days old on that specific date.
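The "current age" shown next to a birth date range can be approximated with a small calculation. The sketch below is one reasonable way to express an age as years, months, and days using only the standard library; the function names are hypothetical, the dates are invented, and the exact rounding rules EMERSE uses are not documented here, so treat the month and day handling as an assumption.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def current_age(birth, today):
    """Return (years, months, days); assumes today is not before birth."""
    months = (today.year - birth.year) * 12 + (today.month - birth.month)
    if today.day < birth.day:
        months -= 1                      # the current month is not yet complete
    anchor = add_months(birth, months)   # last whole-month "birthday"
    years, months = divmod(months, 12)
    return years, months, (today - anchor).days


def in_birth_date_range(birth, start, end):
    """Patient-level test: was the patient born within the chosen range?"""
    return start <= birth <= end


today = date(2024, 5, 10)
print(current_age(date(2012, 1, 10), today))   # (12, 4, 0)
print(current_age(date(2024, 1, 30), today))   # (0, 3, 10)
```

Because the birth dates in a saved range are fixed, running the same calculation on a later date gives older ages, which is why the displayed "current" age range changes over time.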

Filter by Age in Days, on Document Date

The Age in Days, on Document Date filter leverages a feature where documents in EMERSE are indexed with how old the patient was at the time the document itself was written. For this filter, you can enter an age in months, an age in days, or both. For example, if you want to find documents written only when a child was between 1 and 3 months of age, you can enter that as a filter for this section. Then, when EMERSE searches, it will only identify documents where the patient was within that 1 to 3 month age range based on the patient’s birth date and the document date.
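The underlying idea, age at the time of the document rather than current age, can be sketched as a simple date difference. The function names are hypothetical, and the bounds are given directly in days because this guide does not specify how EMERSE converts a months entry into days.

```python
from datetime import date


def age_in_days(birth_date, document_date):
    """Patient age, in days, on the date the document was written."""
    return (document_date - birth_date).days


def document_in_age_range(birth_date, document_date, min_days, max_days):
    """Document-level test: was the patient within the age range on that date?"""
    return min_days <= age_in_days(birth_date, document_date) <= max_days


birth = date(2024, 1, 1)
print(age_in_days(birth, date(2024, 3, 1)))                    # 60
print(document_in_age_range(birth, date(2024, 3, 1), 30, 90))  # True
print(document_in_age_range(birth, date(2024, 6, 1), 30, 90))  # False
```

Unlike the Birth Date filter, this test is evaluated once per document, so the same patient can have some documents that match and others that do not.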

Filter by Age in Years, on Document Date

See the above filter (Age in Days, on Document Date) for details on how this works. This filter works the same way, except that it allows users to filter based on a patient’s age in years and months at the time a document was written.

Filter Details

With so many options, it can become a bit complicated to understand what the combination of selected Filters is doing. To provide more insight into how the filter is being applied to a search, click on Filter Details. A description of the filter logic, including operators such as AND, OR, and NOT, will be shown.

Filter Details
Figure 11. Filter Details shows the logic applied to the selected filters when conducting a search.
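To give a feel for the kind of description Filter Details produces, the sketch below renders a set of selections as a Boolean expression. It assumes that options within one filter are combined with OR and that different filters are combined with AND. That is a common convention, but it is an assumption here, not something this guide states, and the output format is invented. The authoritative logic for your search is always what Filter Details itself displays.

```python
def describe_filters(selections):
    """Render {filter name: [selected options]} as a readable Boolean string.

    Assumption (not confirmed by this guide): OR within a filter,
    AND between filters. Filters with nothing selected are skipped.
    """
    clauses = []
    for name, options in selections.items():
        if not options:
            continue
        joined = " OR ".join(f'{name} = "{option}"' for option in options)
        clauses.append(f"({joined})")
    return " AND ".join(clauses)


logic = describe_filters({
    "Source": ["Radiology", "Main EHR"],
    "Sex": ["Female"],
    "Race": [],          # nothing selected, so no clause is produced
})
print(logic)
# (Source = "Radiology" OR Source = "Main EHR") AND (Sex = "Female")
```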

Filters using Find Patients

When using Find Patients, the Filters will reduce the entire patient list returned, since the system will Filter at the patient level. If a patient has at least one document that matches the criteria and the patient also matches the filter criteria, they will be counted in the results.

Filters using Highlight Documents

With Highlight Documents, each patient will still be shown in the Overview table regardless of matching at the patient level. However, the results displayed for that patient will depend on what matches the filter criteria. Only documents that match the criteria will be displayed. Further, if the patient overall does not match the criteria, then no documents will be displayed for that patient even if some of the search terms matched. That is, the row for that patient will have no results, and no documents will be shown for that patient.
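The difference between the two modes can be made concrete with a toy example. Everything here is invented for illustration: the data layout, the filter (female patients, Radiology documents), and the function names. Here `hit` simply marks a document that contains the search term.

```python
patients = {
    "100001": {"sex": "F", "docs": [{"source": "Radiology", "hit": True},
                                    {"source": "Pathology", "hit": True}]},
    "100002": {"sex": "M", "docs": [{"source": "Radiology", "hit": True}]},
}


def patient_ok(patient):          # patient-level filter
    return patient["sex"] == "F"


def doc_ok(doc):                  # document-level filter
    return doc["source"] == "Radiology"


def find_patients(patients):
    """Only patients who pass the filter AND have a matching document are counted."""
    return [mrn for mrn, p in patients.items()
            if patient_ok(p) and any(d["hit"] and doc_ok(d) for d in p["docs"])]


def highlight_documents(patients):
    """Every patient keeps a row; the row is empty if the patient fails the filter."""
    return {mrn: ([d for d in p["docs"] if d["hit"] and doc_ok(d)]
                  if patient_ok(p) else [])
            for mrn, p in patients.items()}


print(find_patients(patients))   # ['100001']
for mrn, docs in highlight_documents(patients).items():
    print(mrn, [d["source"] for d in docs])
# 100001 ['Radiology']
# 100002 []
```

Patient 100002 has a matching Radiology document but fails the patient-level filter, so the patient is absent from the Find Patients result and has an empty row under Highlight Documents.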

Filter Restrictions

EMERSE is a highly configurable system, and access to certain types of data may be restricted to certain users depending on your local EMERSE setup. For example, some users might be able to search through all document types, and other users might have restrictions placed on them which only permit them to see a limited set of document types. These restrictions can be placed by the system administrators, and are essentially a type of system-defined Filter that is added to a user’s search and adds additional restrictions that a user cannot modify. Because of this, the results that you get could be different from the results another user gets because of differing restrictions. This also has implications for when you share a Filter with another user who has different restrictions than you have, or if you use a Filter shared with you by someone else. Restrictions will become apparent when you go to a section in the Filters and find that some items are disabled, cannot be selected, and the label "(access restricted)" is displayed.

Filter Restriction Details
Figure 12. In this example, the administrators have restricted this user’s access within the "Source" filter to only Radiology and Main EHR; the other sources are Restricted.
Filter Restriction Details
Figure 13. Based on the Source filter in the prior screenshot, which restricted the user from accessing Pathology data as a source, this screenshot shows that all categories under the Pathology source are also restricted.

To get a high-level overview of what Filter Restrictions have been applied by the administrators, go to the Restrictions tab in the Filters section. All of the restrictions that have been applied by the system administrator are shown here. Restrictions can be grouped together, and each set of restrictions will be shown in the table on this screen. To view them in more detail, click on each row. In this screenshot, only one Restriction group is listed (named "Limited access filter"). Clicking on that row will reveal the details, which are shown in the following screenshot. Note: If you do not see the Restrictions tab, it means that there are no data restrictions placed on your user account.

Filter Restriction Table
Figure 14. The table showing Filter Restrictions for a user. Click on each row in the table to see the restrictions. If no restrictions exist, this table will not be displayed.
Filter Restriction Details
Figure 15. This screenshot shows the sections of Filters that have been restricted for the user. Here it can be seen that restrictions have been set for "Source" and "Department". Under "Source" it can be seen that "Radiology" and "Main EHR" are selected, meaning that the user will only be able to search through those sources.

Saving Filters

Filters can be saved and reused. Similar to Patient Lists, you just need to provide a name and a description to save the filter. To use the filter again, simply go to the Saved Filters tab and click on the row in the table with the filter you want to use.

Sharing Filters

Filters can also be shared with other users. Select the Share option and then choose the users to share the filter with. Users who have access to your shared filter will be able to use the filter but will not be able to edit/change it.

If you see a warning icon with the message "Restrictions Differ" for a Filter you have selected, it means that there is a discrepancy between the Filter Restrictions that you have compared to the Filter Restrictions on the user who made the filter. This means that you might get different results, depending on who had the restrictions, and what those restrictions were. For example, you may get more results, fewer results, or even the same results with different Restrictions between yourself and the owner/creator of the Shared Filter. Use these kinds of Filters with caution, or discuss the possible restriction with the user who created the Filter.

Filter Restriction Details
Figure 16. When restrictions differ between two users sharing a Filter, a warning message will be visible (see red arrow in screenshot).
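Why the same shared Filter can return different results for two users can be sketched by treating a restriction as a second filter that is intersected with the user's own selections. The function name `effective_options` and the source names are illustrative only, and the real restriction logic is configured by your administrators.

```python
def effective_options(all_options, selected, allowed=None):
    """Options actually searched for one filter category.

    selected: options chosen in the (possibly shared) Filter; empty means all.
    allowed:  options an administrator permits; None means no restriction.
    """
    permitted = set(all_options) if allowed is None else set(allowed) & set(all_options)
    chosen = set(selected) if selected else set(all_options)
    return permitted & chosen


sources = {"Radiology", "Main EHR", "Pathology"}
shared_filter = {"Radiology", "Pathology"}

owner_view = effective_options(sources, shared_filter)                # unrestricted owner
recipient_view = effective_options(sources, shared_filter,
                                   allowed={"Radiology", "Main EHR"})  # restricted recipient

print(sorted(owner_view))       # ['Pathology', 'Radiology']
print(sorted(recipient_view))   # ['Radiology']
```

In this example the recipient silently loses the Pathology documents, which is the kind of discrepancy the "Restrictions Differ" warning is meant to flag.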

Filters across the Network

At this time EMERSE does not support most Filters when searching across the Network. This is partly due to a lack of standardization among data across the sites. Currently, only a date range can be applied to Network searches.

Natural Language Processing (NLP)

The default configuration of EMERSE has some basic natural language processing (NLP) built into it. This guide will discuss the components of that default NLP integration; however, it is important to note that the specific NLP used and the level of integration are highly customizable and might be different for your local EMERSE instance.

NLP encompasses a wide range of tools and techniques to recognize parts of human language, such as determining when a word or phrase indicates a "no" (negation) or "maybe" (uncertainty). It also includes identifying specific categories of words like names of diseases, drugs, or symptoms (named entity recognition). These foundational NLP tasks enable computers to make sense of basic elements within text, helping to derive meaning and context in the same way a human reader might.

We consider the NLP integration within EMERSE "basic" in that it is not perfect, but will hopefully help you get to the right information faster. For example, by noting which words/phrases in notes are negated, it means that you can choose to ignore terms like "chest pain" in the context of "denies chest pain". EMERSE is not at the level of generative pre-trained transformer (GPT) models such as ChatGPT, although integration with such tools is being considered.

No system is perfect, and EMERSE is no different. The actual performance of the NLP within EMERSE will depend on many factors, including on how "natural" the language actually is within the clinical documents. The system will make mistakes, so it will always be important to verify results against the text of the document to check for accuracy.

There are often multiple steps taken to perform NLP, in what is commonly called a "pipeline". This can range from sentence detection (determining where one sentence ends and another begins), to tokenization (how to split up the text into word chunks for analysis), to identifying phrase fragments, to mapping to standard codes. Within the default EMERSE system, we use an OpenNLP model for sentence detection, and a series of simple rules and regular expressions for identifying other attributes such as negation, uncertainty, and others described below. We use a modified version of cTAKES to conduct the named entity recognition (NER), and we map any identified entities to Concept Unique Identifiers (CUIs) maintained by the National Library of Medicine (NLM) within the Unified Medical Language System (UMLS).

The following are what EMERSE currently includes in the NLP annotations, as well as other details about the NLP implementation within EMERSE:

Negation

Negation is when something is described as being absent, denied, not occurring, etc. Examples of negation include:

  • "She denies chest pain"

  • "He never had diabetes"

  • "There is no evidence for an obstruction"

Even though EMERSE does not include non-alphanumeric characters when supporting search, some characters are used when identifying negation. For example, the phrase "-ve" will usually be interpreted as a shorthand for "negative".

Uncertainty

Uncertainty expressions are tentative, ambiguous, or indicative of doubt. Examples of uncertainty include:

  • "She might have chest pain"

  • "He possibly has diabetes"

  • "There is a chance that there is an obstruction"

Non-patient subject

Even though most medical documents are about the patients, there can be plenty of times when other people are mentioned, and it might be desirable to exclude those mentions from some searches. Examples of this include:

  • "Her father has chest pain"

  • "His sister was diagnosed with diabetes"

  • "An obstruction was found in his uncle"

History of

Clinical documents often include a mix of events that are happening "now", at the time of care, and those that happened in the past. EMERSE can identify some of these phrases so that they can be excluded if desired. Examples of such phrases include:

  • "History of chest pain"

  • "Fam hx of diabetes"

  • "h/o obstruction 5 years ago"
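The four annotation types above lend themselves to simple rule-based detection. The sketch below is a deliberately small illustration using regular expressions. The trigger words are taken from the example sentences in this guide, the rule set and function name are hypothetical, and the actual EMERSE rules are more extensive and are not reproduced here. It also flags whole sentences, whereas a real system needs to decide which term each trigger applies to.

```python
import re

# Hypothetical trigger patterns, one per annotation type.
CONTEXT_RULES = {
    "negation": re.compile(r"\b(denies|never|no)\b|-ve\b", re.IGNORECASE),
    "uncertainty": re.compile(r"\b(might|possibly|chance)\b", re.IGNORECASE),
    "non_patient": re.compile(r"\b(father|mother|sister|brother|uncle|aunt)\b",
                              re.IGNORECASE),
    "history": re.compile(r"\b(history of|hx of|h/o)\b", re.IGNORECASE),
}


def annotate(sentence):
    """Return the sorted list of annotation labels whose rule fires."""
    return sorted(label for label, rule in CONTEXT_RULES.items()
                  if rule.search(sentence))


for sentence in [
    "She denies chest pain",
    "He possibly has diabetes",
    "An obstruction was found in his uncle",
    "h/o obstruction 5 years ago",
    "Chest pain on exertion",
]:
    print(annotate(sentence), sentence)
```

A sentence with no trigger, such as the last one, receives no labels and would be treated as an ordinary, affirmed mention. Rules this simple will make mistakes, which is one reason results should always be verified against the document text.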

Named Entity Recognition

EMERSE can identify many common words/phrases using Named Entity Recognition. These can include single words like "lungs" or multi-word phrases like "acute lymphoblastic leukemia". The words/phrases in the text are compared to a very large dictionary of terms found within the Unified Medical Language System (UMLS). When a match is made, EMERSE will store that match in the index in the form of a Concept Unique Identifier (CUI). These CUIs can also be used to search EMERSE, as described in the section on Terms. UMLS also has high-level categories called "Semantic Types", which include groupings such as "Anatomy", "Disorder", "Drug", "Procedure", and others. EMERSE also stores some of these semantic types to help group the data when reviewing/highlighting information. Additional details can be found in our NLP Guide.

CUIs are always in the form of the letter 'C' followed by 7 digits. For example, "chest pain" is defined as CUI C0008031, whereas acute lymphocytic leukemia is CUI C0023449. This pattern is detected by EMERSE so that it will know when a CUI is entered.
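The CUI format described above is easy to check with a regular expression. This is a minimal sketch of such a check, not the detection code used by EMERSE; in particular, requiring an uppercase 'C' is an assumption.

```python
import re

# The letter C followed by exactly seven digits (uppercase C is assumed).
CUI_PATTERN = re.compile(r"C[0-9]{7}")


def is_cui(text):
    """True if the whole entry has the shape of a UMLS CUI."""
    return CUI_PATTERN.fullmatch(text.strip()) is not None


print(is_cui("C0008031"))    # True
print(is_cui("C12345"))      # False (too few digits)
print(is_cui("chest pain"))  # False
```

Note that the pattern only checks the shape of the identifier. It cannot tell whether a given CUI actually exists in the UMLS.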

NLP Example

The following screenshot shows an example of how EMERSE can display some of the contextual information found through NLP approaches. In this example, which shows how information from a single document can be viewed, the yellow term hypoglycemia is what the user searched for. The Semantic Groups table shows various categories that can be highlighted by clicking on them. Clicking once dotted-underlines them, and clicking again will fully highlight the terms related to each group. Here, the "Anatomy" group was selected, showing various anatomical terms automatically being highlighted. Additionally, the phrases underlined in red have been identified as being Negated, and those underlined in purple are Uncertainty phrases. This highlighting/underlining can be toggled on and off by the user by clicking on the options within the Annotations table. The bottom center of this image shows what happens when a user clicks on a highlighted term identified through NLP using the Semantic Groups: the concept unique identifiers (CUIs) associated with that term are shown. More details about how to leverage the NLP features within EMERSE are described later in this guide.

NLP example
Figure 17. Screenshot of EMERSE showing how some of the NLP information is displayed.

Licensing

Use of UMLS requires a license, but the NLM only requires that one person from each institution signs the license. Therefore, as an EMERSE user you will not need to sign a license with the NLM. One of the EMERSE system administrators will be taking care of that.

Terms

Terms are the words and phrases you specify in your searches. EMERSE provides a lot of flexibility in what you can do with terms. There are three ways to enter terms (Temporary Terms, Saved Terms--also called Term Bundles, and Advanced Terms), each of which is described in detail below.

Before getting into the details, there are several aspects about EMERSE that are worth pointing out:

  • All searches, by default, are case-insensitive. This means that you will get the same result whether you search for HELLP syndrome or hellp syndrome (though it is possible to enable a case-sensitive search when necessary).

  • "Stemming" is not turned on with the default configuration of EMERSE. This means that if you search for echo you will only find that exact word. You will not automatically find variations such as echoing, echoed, etc. and thus if you want those variations you will need to add them yourself. This was intentional to allow users maximum control over the terms found. For example, if you are looking for echocardiogram, which is often abbreviated echo, you would not want to find terms such as echoing. EMERSE does provide various options, described elsewhere, for expanding your search to capture these kinds of variations, including the use of wildcards and Synonyms.

  • Terms can be highlighted in one of 18 colors. In fact, the reason that the EMERSE interface uses so few colors is to help the colorful Terms stand out.

  • The way EMERSE uses terms for searching depends on the searching context, and this context differs depending on whether you are searching across all of the patients in the system (All Local Patients) or among an existing patient list (either a Temporary Patient List or a Saved Patient List). This distinction is important, and it is described in more detail in the section about Searching.

  • All non-alphanumeric characters are ignored by EMERSE, so there is no advantage to including them in your search terms. Alphanumeric characters include the letters A-Z and the digits 0-9. Examples of non-alphanumeric characters normally ignored by EMERSE include, but are not limited to:

    ! ? . > < + - #

    This is important because there may be some things you will not be able to definitively find using EMERSE. For example, if you are looking for HIV+ patients, the + is ignored and thus from the perspective of a search you would be able to find HIV but EMERSE cannot distinguish between HIV+ and HIV-. (However, you could find phrases like HIV positive and HIV pos.) EMERSE will remind you when you enter Terms with those characters, and will also let you know that they can still be included but will not impact the search results.

  • EMERSE indexes words based on what are known as "tokens". A token is usually a term separated by white space (such as a space, tab, or linefeed), or by non-alphanumeric characters that are ignored (e.g., ! ? . - etc). Thus, if you are looking for echocardiogram you will not find echo-cardiogram (since the latter is considered two separate words when it is tokenized) unless you specifically search for that specific variation as well (e.g., echo cardiogram). Note that many of these variations can still be found by leveraging the Synonyms that EMERSE offers.

  • EMERSE will check your Temporary Terms and your Saved Terms for problems and, when it finds any, provide feedback about how to fix them along with the option of fixing them. For example, EMERSE will look for queries that are not formed properly or are likely to fail. It will also warn about conflicting colors for terms that might overlap but are set to different highlight colors.

  • The maximum length of any individual term/phrase is 255 characters, including spaces. However, it would be very unusual to have any single search term/phrase that long.
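Several of the points in the list above (case-insensitivity, no stemming, ignored non-alphanumeric characters, and token boundaries) follow from how text is split into tokens. The sketch below imitates that behavior with a simple regular expression so the consequences are easy to see. It is an approximation for illustration only, with hypothetical function names; the actual EMERSE tokenizer is part of its search index and may differ in detail.

```python
import re


def tokenize(text):
    """Lowercase the text and keep only runs of the letters a-z and digits 0-9."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_in_text(phrase, text):
    """True if the phrase's tokens appear consecutively in the text's tokens."""
    p, t = tokenize(phrase), tokenize(text)
    return bool(p) and any(t[i:i + len(p)] == p
                           for i in range(len(t) - len(p) + 1))


print(tokenize("HIV+"))              # ['hiv']
print(tokenize("echo-cardiogram"))   # ['echo', 'cardiogram']

# Case does not matter by default.
print(phrase_in_text("hellp syndrome", "Dx: HELLP syndrome"))      # True
# No stemming: "echo" does not match "echoing".
print(phrase_in_text("echo", "the sound was echoing"))             # False
# The hyphen splits the word, so the single-word form is not found...
print(phrase_in_text("echocardiogram", "echo-cardiogram ordered")) # False
# ...but the two-word form is.
print(phrase_in_text("echo cardiogram", "echo-cardiogram ordered"))# True
# "+" and "-" are ignored, so HIV+ and HIV- are indistinguishable.
print(phrase_in_text("HIV+", "patient is HIV- on testing"))        # True
```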

There are three basic types of terms you can work with: Temporary Terms, Saved Terms, and Advanced Terms. Each of these is described in detail below. Additionally, the following table provides a high-level overview of the differences:

Table 3. Feature Comparison between the Temporary Terms, the Saved Terms, and the Advanced Terms
Feature | Temporary Terms | Saved Terms | Advanced Terms
Maximum number of terms per list | no limit [1] | no limit [1] | no limit [1]
Supports 18 colors for terms | Yes | Yes | No [2]
Full control over the choice of colors for terms | Yes | Yes | No
Ideal for teams working together | No | Yes | No
Supports sharing with other users | No | Yes | No
Saved between sessions | No | Yes | No
Has a Term Upload feature | Yes | Yes | No
Can take advantage of Synonym suggestions | Yes | Yes | No
Spell Checking of Terms | Yes | Yes | No
Supports the ability to search for terms in a case-sensitive manner | Yes | Yes | Yes
Supports advanced search features including customizable proximity searches and fuzzy searches | Yes | Yes | Yes
Supports nested Boolean searches with AND, OR, NOT, and regular expressions | No | No | Yes
Supports the use of basic natural language processing (NLP) such as negation and uncertainty detection | Yes | Yes | No

[1] While there is no limit on the number of terms, a large number (more than 100) will cause the system to run quite slowly.
[2] Only one color (yellow) is supported for Advanced Terms.

Temporary Terms and Saved Terms

Temporary Terms and Saved Terms are collections of search terms that can provide a lot of power and flexibility with searching. Temporary Terms and Saved Terms are almost identical in their functionality. The main difference is that Temporary Terms disappear after logging out and cannot be shared with other users, whereas Saved Terms are saved between sessions and can be shared with (and discovered by) other users. This sharing/reusing of search terms is important because it can help ensure reproducibility. Some research teams have even included their Saved Terms as an Appendix with their peer-reviewed publication. Another useful feature of Bundles is that you have control over the specific color for each term. This means that you can color-code concepts. For example, you could show all pain medications in green, all side effects in orange, and all diagnoses of interest in blue. This can make it very efficient to visually identify concepts of interest when rapidly reviewing clinical notes.

The phrases Saved Terms and Term Bundles are used somewhat interchangeably throughout this guide and in the application itself. To be precise, a Term Bundle is a single collection of search terms that a user can create and share, whereas Saved Terms is the overall collection of Term Bundles.

Adding/Editing Terms

Terms should be added (and edited) in the section called Edit, specifically into the small text box where it says "Enter Terms/Phrases (one at a time)". These can be single words (chemotherapy) or multi-word phrases (acute lymphoblastic leukemia). One term or phrase should be entered at a time. This is so that the system can help provide suggestions about what has been entered and to help with selecting the desired colors for each term. Once a term has been added, it shows up in the right column of "Active Terms/Phrases".

add terms
Figure 18. Single words or multi-word phrases can be added, one at a time, in the Edit section. Each term can be assigned to one of 18 colors using the color palette picker.

UMLS Concepts as Terms

As mentioned in the section on NLP, some terms within EMERSE have been pre-identified and mapped to UMLS Concepts. These concepts are codes (Concept Unique Identifiers, or CUIs) developed by the National Library of Medicine (NLM) that can, in turn, be used to map back to other standard ontologies and vocabularies. The official site to look up concepts is the UMLS Metathesaurus Browser.

Depending on your local EMERSE configuration, you might also be able to look up some UMLS concepts through a data file (named "UMLS CUI-term relationships" or something similar) loaded through the Synonyms feature, described later in this guide.

Concepts are always in the form of the letter C followed by 7 digits. For example, the term chest pain has the CUI C0008031 and is mapped to R07.9 (ICD-10-CM), 786.50 (ICD-9-CM), HP:0100749 (HPO), and more. It is important to note that the concepts are being used 'behind the scenes' in EMERSE to search for words and phrases that have been labeled with a CUI. The actual CUIs themselves are not being searched. That also means that when the results are displayed, the terms and phrases will be shown and highlighted, and not the CUI itself.
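The CUI format described above (the letter C followed by 7 digits) is easy to check programmatically. The snippet below is only a minimal sketch and is not part of EMERSE; the function name `split_cuis` is hypothetical. It illustrates how an entry that mixes regular words with a concept could be separated into the two kinds of tokens.

```python
import re

# Assumption: a CUI is exactly the letter C followed by 7 digits, as described above.
CUI_PATTERN = re.compile(r"C\d{7}")


def split_cuis(entry: str) -> tuple[list[str], list[str]]:
    """Separate a search entry into regular words and CUI-shaped tokens."""
    words: list[str] = []
    cuis: list[str] = []
    for token in entry.split():
        if CUI_PATTERN.fullmatch(token):
            cuis.append(token)
        else:
            words.append(token)
    return words, cuis


print(split_cuis("severe C0149931"))  # (['severe'], ['C0149931'])
print(split_cuis("chest pain"))       # (['chest', 'pain'], [])
```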

UMLS concept for chest pain
Figure 19. Screen shot from the NLM UMLS Metathesaurus Browser, showing some of the mappings for the term chest pain, which has a UMLS concept of C0008031.

You can search by UMLS concept simply by entering a CUI into the search term box. EMERSE will recognize it as a concept and show it in a slightly different (monospace) font, with a small light bulb icon next to it to help identify it as a concept. It is even possible to mix regular terms with concepts. For example, you can combine the word severe with the concept for migraine (C0149931) to look for that specific phrase. The figure below shows a search for C1690547 (nickel allergy) as well as severe C0149931 (severe migraine).

UMLS term entry
Figure 20. Screen shot showing one term in blue entered as a UMLS CUI for C1690547 ("nickel allergy") as well as an entry combining a normal term with a CUI, in this case severe C0149931, or "severe migraine". Note the monospace font to differentiate the concept (CUI) from a normal search term, as well as the lightbulb icon to show that a CUI is included in the search.
Search result from concepts
Figure 21. Screen shot showing the result for C1690547 and severe C0149931.

Options for Terms

There are multiple options for customizing how terms should be searched. These range from whether the term should be searched in a case-sensitive manner to whether it should be searched in a negated, uncertain, or other manner. Each of these options is described below. All options, other than the color, have a "subway icon" next to them to help provide a high-level visual summary of the selected options on other parts of the screen (where it shows the "Active Terms/Phrases"), which will be described later. These icons are inspired by the "service bullets" used by the New York City subway system.

subway icons
Figure 22. Inspiration for the icons used for the term options came from the New York City subway.
subway icons example in EMERSE
Figure 23. Example of how these "subway icons" are used in EMERSE to summarize the settings for each term. A description of what the icons represent follows below.

Below is a screenshot of the term entry screen, followed by a description of each option.

term options
Figure 24. The entry screen for terms, showing the numerous options available.
  Color

You can choose between 18 colors for the terms to be highlighted. To pick a color, simply click on the color in the rectangular palette. A small black box will surround the selected color to make it easier to identify which one is chosen. When entering terms, EMERSE will default to the same color as the previously entered term. If you want a different color, you can either select the color choice yourself, or you can click on the Use The Next Available Color button, which will select the next color in the palette that has not yet been used for any of the terms.

The colors can make it easier to distinguish terms from one another, since EMERSE will highlight the terms within the notes themselves. Colors are also used in conjunction with the Mosaic option on the Overview page, which can provide a high-level overview of what terms appear for specific patients. Or, you can group terms together based on the same color(s) if that is your preference. Sometimes it might make sense to give all terms in the same high-level category the same color. For example, all narcotic medications in green, and all constipation terms in purple. Last, colors are used in the search logic, depending on the searching context.

Term colors are used to define the searching logic. When searching through all patients in EMERSE, a document matches the search if it contains at least one term of each color being used. That is, terms of the same color are separated by OR, and terms with different colors are separated by AND. However, in all other circumstances, including highlighting documents and Find Patients applied to a Patient List, a document matches the search if any term regardless of color appears in the document. That is, all terms are separated by OR, regardless of their colors. Additional details about the logic for the terms entered below can be seen in the Query Details section, where it will show the AND/OR relationships.
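The color rule above can be sketched in a few lines of Python. This is an illustration of the described logic only, not EMERSE's implementation; the function name `build_query` and the example terms are hypothetical.

```python
from collections import defaultdict


def build_query(terms: list[tuple[str, str]], all_patients: bool) -> str:
    """Build a readable Boolean expression from (term, color) pairs.

    all_patients=True models a search across all patients: terms of the same
    color are OR'd together and the color groups are AND'd.
    all_patients=False models highlighting, or Find Patients applied to a
    Patient List: every term is OR'd regardless of color.
    """
    if not all_patients:
        return " OR ".join(f'"{term}"' for term, _ in terms)

    groups: dict[str, list[str]] = defaultdict(list)
    for term, color in terms:
        groups[color].append(f'"{term}"')
    # Dictionaries keep insertion order, so groups appear in order of first use.
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups.values())


terms = [("morphine", "green"), ("oxycodone", "green"), ("constipation", "purple")]
print(build_query(terms, all_patients=True))
# ("morphine" OR "oxycodone") AND ("constipation")
print(build_query(terms, all_patients=False))
# "morphine" OR "oxycodone" OR "constipation"
```

The Query Details section in EMERSE remains the authoritative place to confirm the AND/OR relationships actually applied to a search.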

search term color conflict warning
Figure 25. The warning shown when a search term color conflict arises.

Warnings about Search term conflicts may appear based on how you have set up the colors for the terms. These conflicts can occur when you have set up terms to be different colors, but there may be overlap between the terms, leading to a potential conflict with the colored highlighting. For example, imagine searching for chest pain in yellow as well as pain when lying down in blue. A larger phrase that could exist is: chest pain when lying down. In such a case, should the word pain be highlighted in yellow, since it could belong to chest pain, or should it be highlighted in blue since it could also belong to pain when lying down? There is no clear answer, but a warning will be displayed about the conflict so that it can be corrected.

C Case-sensitive
  • C Icon color when case-sensitive option is not selected

  • C Icon color when case-sensitive option is selected

Some medical acronyms are identical to common English words, including NO (nitric oxide), ALL (acute lymphoblastic leukemia) and RICE (rest, ice, compression, elevation). Because EMERSE, by default, searches for all terms in a case-insensitive manner, you can end up with a lot of false positives for these otherwise common words. To help reduce the number of false positives, you can specify a word to be case-sensitive. For example, you can search for a case-sensitive capitalized ALL, which will help reduce false positives by not returning the common lower-case word all as a hit.

To use this feature, simply highlight the word(s) in the text entry box and click on the Toggle Case-Sensitive On Selected Text button. The text designated for a case-sensitive search will be displayed in bold. To remove the case-sensitive feature, highlight the bold terms and click on the button again.

A single word in a phrase can be case-sensitive. For example, "found to have an ALL relapse". In such a situation, all of the words in the phrase can be in any case (upper, lower or a mix), except for the bolded ALL which must be exactly as written (which in this example is all capitalized letters).

A case-sensitive search cannot be applied to a UMLS Concept Unique Identifier (CUI).
case-sensitive example
Figure 26. An example of a phrase with one word (ALL) set to be case-sensitive, which is noted by ALL being in bold font.
Table 4. A few examples to illustrate case-sensitive searching, based on the search phrase "found to have an ALL relapse", where ALL is being searched case-sensitive

Example text                    Match?
found to have an ALL relapse    Yes
Found To Have An ALL Relapse    Yes
FOUND TO HAVE AN ALL RELAPSE    Yes
Found To Have An All Relapse    No
found to have an all relapse    No
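The behavior shown in Table 4 can be reproduced with a short sketch in which each word of the phrase carries its own case-sensitivity flag. This is an illustration only, under the assumption that the text is compared word by word; the function name `phrase_matches` is hypothetical and does not reflect how EMERSE is implemented.

```python
def phrase_matches(text: str, phrase: list[tuple[str, bool]]) -> bool:
    """Return True if the text contains the phrase as consecutive words.

    Each phrase element is (word, case_sensitive). Case-sensitive words must
    match exactly; all other words match in any case.
    """
    tokens = text.split()
    n = len(phrase)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(
            (tok == word) if sensitive else (tok.lower() == word.lower())
            for tok, (word, sensitive) in zip(window, phrase)
        ):
            return True
    return False


# Only ALL is flagged as case-sensitive, as in Table 4.
PHRASE = [("found", False), ("to", False), ("have", False),
          ("an", False), ("ALL", True), ("relapse", False)]

print(phrase_matches("Found To Have An ALL Relapse", PHRASE))  # True
print(phrase_matches("found to have an all relapse", PHRASE))  # False
```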

N Negation
  • N Icon color when set to "Find positive mentions only"

  • N Icon color when set to "Find negated mentions only"

  • N Icon color when set to "Find any mentions regardless of negation"

Negation is the absence of something, often identified by words like "no", "not", "never", "without", "denies", etc. Identifying (and potentially excluding) negated terms can be a way to reduce many false-positive results within EMERSE.

The default search approach in EMERSE is to search all text, regardless of negation status. However, you can choose to search for only terms that are not negated, or even only terms that are negated. Simply choose which option you want for the term entered, and EMERSE will only consider those terms based on the negation status you selected.

Detecting negation can sometimes be complex, so it is safe to assume that EMERSE will not always be correct in its determination.

negation example
Figure 27. An example of a search for the term hypoglycemia where the default search approach is used and no specific negation preference is set. Phrases identified as being negated are underlined in red, but all hypoglycemia terms are highlighted regardless of negation status.

 

negation example
Figure 28. The same text as the previous figure, with only the non-negated examples highlighted.

 

negation example
Figure 29. The same text as the previous figure, with only the negated examples highlighted.

 

S Non-patient subject
  • S Icon color when set to "Find patient-related mentions only"

  • S Icon color when set to "Find non-patients mentions only"

  • S Icon color when set to "Find any mentions regardless of whether it was about the patient"

Subject identification tries to determine if a phrase is about the patient or someone else (someone else would be the "non-patient subject"). This is useful to help reduce false positives where some details are described in a document, but those details are not about the patient. Examples include phrases such as "mother with diabetes", "brother was diagnosed with hypothyroidism", or even "family history of eczema".

As with negation, it can be difficult to properly identify these types of phrases perfectly, and it should be assumed that EMERSE will not always be correct.

subject example
Figure 30. An example of a search for the term gastric cancer where the default search approach is used and no specific setting is made for the subject. Phrases identified as being about someone other than the patient are underlined in blue, but all gastric cancer terms are highlighted regardless of subject status.

 

subject example
Figure 31. The same text as the previous figure, with only the non-patient subject examples highlighted.

 

subject example
Figure 32. The same text as the previous figure, with only the patient examples highlighted.

 

U Uncertainty
  • U Icon color when set to "Find certainty mentions only"

  • U Icon color when set to "Find uncertainty mentions only"

  • U Icon color when set to "Find any mentions regardless of uncertainty"

Uncertainty terms are words or phrases that express some degree of uncertainty, doubt, or lack of certainty about a situation, event, or outcome. These would include phrases like "possible diagnosis of diabetes", "likely hyperthyroidism", "chance of recovery".

Similar to other types of elements identified by EMERSE, not all uncertainty expressions are identified correctly, but using this feature should help identify a reasonable amount to help reduce false positives. Please note that for the practical purposes of EMERSE, "certainty" is simply defined as the lack of uncertainty. What this means is that the system will consider a certainty expression to be not only "patient definitely has diabetes", but also "patient with diabetes" since the second phrase is not modified by any uncertainty expression.

subject example
Figure 33. An example of a search for the term diagnosis where the default search approach is used and no specific setting is made for uncertainty expressions. Phrases identified as being uncertain are underlined in purple, but all diagnosis terms are highlighted regardless of uncertainty status.

 

subject example
Figure 34. The same text as the previous figure, with only the uncertainty diagnosis term highlighted.

 

subject example
Figure 35. The same text as the previous figure, with only the certainty diagnosis term highlighted.
H History of
  • H Icon color when set to "Find non-history mentions only"

  • H Icon color when set to "Find history mentions only"

  • H Icon color when set to "Find any mentions regardless of whether it was history"

The History Of option attempts to identify phrases that suggest that the patient had something in the past. It uses phrases such as "history of", or "past medical history", or "PMHx" to identify these histories. EMERSE is not perfect in identifying these phrases, although using this option should still be helpful in many situations.

history example
Figure 36. An example of a search for the term breast cancer where the default search approach is used and no specific setting is made for identifying "history of". Phrases identified as being about a "history of" are underlined in green, but all breast cancer terms are highlighted regardless of history status.

 

history example
Figure 37. The same text as the previous figure, with only the history of breast cancer terms highlighted.

 

history example
Figure 38. The same text as the previous figure, with only the non-history of breast cancer term highlighted.
P Proximity
  • P Icon color when Proximity option is not selected

  • P Icon color when Proximity option is selected

A proximity search is where EMERSE will look for 2 or more words that are close together but not directly next to each other. These words can be in any order, regardless of the order you type them into the box. You can choose a spacing between words ranging from 0 (where they essentially have to be directly adjacent) to 10 words.

For example, pain leg using a proximity of 5 will match pain shooting down her right leg as well as leg with severe pain.
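The idea of a two-word proximity search can be sketched as follows. This is a simplified illustration, not EMERSE's implementation. It assumes that the proximity value is the maximum number of words allowed between the two terms, which may differ from how EMERSE counts spacing, and the function name `within_proximity` is hypothetical.

```python
def within_proximity(text: str, word_a: str, word_b: str, max_gap: int) -> bool:
    """Return True if word_a and word_b occur, in either order, with at most
    max_gap other words between them (0 means directly adjacent)."""
    tokens = [token.lower() for token in text.split()]
    positions_a = [i for i, token in enumerate(tokens) if token == word_a.lower()]
    positions_b = [i for i, token in enumerate(tokens) if token == word_b.lower()]
    return any(
        abs(i - j) - 1 <= max_gap
        for i in positions_a
        for j in positions_b
        if i != j
    )


print(within_proximity("pain shooting down her right leg", "pain", "leg", 5))  # True
print(within_proximity("leg with severe pain", "pain", "leg", 5))              # True
```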

proximity example
Figure 39. An example of text highlighted using a proximity search of the terms ALL (case-sensitive) and treatment, within 5 words of each other.
F Fuzzy Match
  • F Icon color when the Fuzzy Match option is not selected

  • F Icon color when the Fuzzy Match option is selected

A fuzzy match allows you to search for terms that are similar to the term of interest. This can be a way to look for possible misspellings or spelling variations. For example, a fuzzy search for the word dyslipidemia would identify the British spelling variant dyslipidaemia. It can also pick up some misspellings, for example, cardiac misspelled as cardiaac.

A fuzzy search can only be applied to a single word at a time. To use a fuzzy search, drag the slider to one of 3 positions, which represent how many potential character substitutions would have to occur to match the newly found term.
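One common way to quantify how similar two words are is the edit distance between them. The sketch below uses Levenshtein distance (single-character insertions, deletions, and substitutions) purely as an illustration. Which similarity measure EMERSE actually applies is not stated here, so treat this as an assumption; the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term: str, candidate: str, max_edits: int) -> bool:
    """Case-insensitive check that candidate is within max_edits of term."""
    return edit_distance(term.lower(), candidate.lower()) <= max_edits


print(edit_distance("dyslipidemia", "dyslipidaemia"))  # 1
print(edit_distance("cardiac", "cardiaac"))            # 1
```

Both examples from the text above differ from the original term by a single inserted character, so they would be found even at the most conservative setting under this assumption.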

A fuzzy search cannot be applied to a UMLS Concept Unique Identifier (CUI).
E Exclude Phrase
  • E Icon color when the Exclude Phrase option is not selected

  • E Icon color when the Exclude Phrase option is selected

EMERSE supports the concept of excluding phrases from a search. The Exclude Phrase option lets you still find mentions of a term in a note while ignoring specific mentions of that term that you don’t want. This is similar to the idea of Negation but works a bit differently. Essentially, this is a way of ignoring certain phrases so that you can find the terms of interest in the appropriate context, and remove potential false positives. Often these phrases are not actually negated, but they might appear in the wrong context.

For example, if you are searching for complication or complications in the context of a surgical complication, you may want to simply look through operative notes for the terms complication or complications. However, you don’t want to find the terms in other contexts that are not necessarily negated but are still the wrong context (e.g., complications from diabetes).

Another common use case would be to ignore words within phrases that might be included in templated/boilerplate text. For example, a note may say we discussed the risks for surgical complications which has nothing to do with the actual patient, so you could exclude that entire phrase and then still search for surgical complications in other contexts in the same note.

phrases to exclude
Figure 40. In this example, where Exclude Phrase is not applied, the term XRT is searched along with the term were discussed (see next Figure to see the result when an Exclude Phrase is applied).

 

phrases to exclude
Figure 41. In this example, the Exclude Phrase XRT were discussed was excluded, so that when XRT was searched, it was not found in the context of the Exclude Phrase.

 

Note that the color for the term entry will disappear with an Exclude Phrase since it is being excluded and thus will not be highlighted. Also note that the whole point of this option is to find a term that you are interested in, but not in the context of the excluded phrase. As a result, you cannot exclude a phrase if you are not also searching for a part of that phrase. For example, suppose the first term you enter is the word diabetes and you try to make that an Exclude Phrase. The system will not let you add that since it would have no impact on the search. However, if you entered diabetes as a normal search term you could then successfully add an Exclude Phrase such as diabetes was discussed at length. This combination will allow you to search for diabetes but not in the context of the excluded phrase diabetes was discussed at length.

As mentioned above, the Exclude Phrase feature will only be effective if the excluded phrase contains a term that you are interested in searching for. Also, this feature does not exclude entire documents that contain the term; it only excludes that phrase inside of a document when highlighting the term of interest in other contexts found within the document. (You can exclude entire documents with certain words and phrases using the Exclude Note option.)

X Exclude Note
  • X Icon color when the Exclude Note option is not selected

  • X Icon color when the Exclude Note option is selected

Exclude note is similar in concept to Exclude Phrase, but in this case the entire document will be excluded from the search if the search term is identified with this setting on. That means that if you are searching for other terms in a normal manner, those other terms will not be returned by EMERSE if they are in a clinical document that contains an Exclude Note term, because the entire note will be ignored. This is in contrast to an Exclude Phrase which will only ignore a specific phrase within a note but all terms in any other context will still be returned.
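The difference between Exclude Phrase and Exclude Note can be sketched as follows. This is a simplified illustration and not EMERSE's implementation: it assumes plain case-insensitive text matching, and the function name `find_hits` and its parameters are hypothetical.

```python
import re


def find_hits(note: str, terms, exclude_phrases=(), exclude_note_terms=()):
    """Return (term, start_offset) pairs for the terms found in a note.

    exclude_note_terms: if any of these appears, the whole note is ignored.
    exclude_phrases: these spans are blanked out first, so a term inside them
    is not reported, while the same term elsewhere in the note still is.
    """
    lowered = note.lower()

    # Exclude Note: one occurrence is enough to drop the entire document.
    if any(term.lower() in lowered for term in exclude_note_terms):
        return []

    # Exclude Phrase: mask each phrase with spaces of equal length so that
    # the offsets of the remaining text are unchanged.
    masked = lowered
    for phrase in exclude_phrases:
        masked = re.sub(re.escape(phrase.lower()),
                        lambda match: " " * len(match.group()), masked)

    hits = []
    for term in terms:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        for match in re.finditer(pattern, masked):
            hits.append((term, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


note = "XRT were discussed. Patient completed XRT in May."
print(len(find_hits(note, ["XRT"])))                                          # 2
print(len(find_hits(note, ["XRT"], exclude_phrases=["XRT were discussed"])))  # 1
print(find_hits(note, ["XRT"], exclude_note_terms=["were discussed"]))        # []
```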

W Wildcard Match
  • W Icon color when the Wildcard Match option is not selected

  • W Icon color when the Wildcard Match option is selected

Wildcards help with finding concepts where you may not be able to specify the exact details/phrasing/spelling of the terms.

For example, sometimes you may want to look for a term where the ending of the word may be variable. Suppose you want to look for terms related to hypertension. If you just search for hypertension you may miss other variations. Using the wildcard symbol, the asterisk (*), you can overcome this problem. Simply type in something like hyperten* and you will be able to match terms such as hypertension, hypertensive, hypertensives, etc. This would also match potential misspellings such as hypertenion.

In addition to adding a wildcard to the end of a word, you can also add it to the middle of a word. If you search for hyper*ia you will match terms such as hypercapnia, hypernatremia, and others. In this example, the search term with the embedded wildcard will look for words that start with hyper and end with ia.

There are a few caveats with Wildcards:

  • You cannot use a wildcard at the beginning of a term. In other words, *oma is not allowed.

  • You must have at least three characters in a word before using the wildcard. Thus, hyp* is allowed, but hy* is not allowed.

  • If your wildcard search ends up matching > 1024 distinct terms, the search will likely fail.

  • You cannot combine the Wildcard Match option with the Fuzzy Match option.

  • A wildcard search cannot be applied to a UMLS Concept Unique Identifier (CUI).
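The wildcard behavior and the first two caveats above can be sketched with a regular expression. This is an illustration only, not how EMERSE evaluates wildcards: it assumes the asterisk stands for zero or more letters or digits within a single word, it does not model the 1024-term limit, and the function name `wildcard_to_regex` is hypothetical.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Validate a wildcard term against the caveats above and compile it."""
    if "*" not in term:
        raise ValueError("term contains no wildcard")
    prefix = term.split("*", 1)[0]
    # Covers both caveats: no leading wildcard, and at least three characters
    # before the first wildcard.
    if len(prefix) < 3:
        raise ValueError("at least three characters must precede the wildcard")
    parts = [re.escape(part) for part in term.split("*")]
    # Assumption: '*' matches zero or more word characters.
    return re.compile(r"\w*".join(parts), re.IGNORECASE)


ending = wildcard_to_regex("hyperten*")
print(bool(ending.fullmatch("hypertensive")))  # True
print(bool(ending.fullmatch("hypertenion")))   # True
print(bool(ending.fullmatch("hypotension")))   # False

middle = wildcard_to_regex("hyper*ia")
print(bool(middle.fullmatch("hypernatremia")))  # True
print(bool(middle.fullmatch("hypertension")))   # False
```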

Apply The Same Settings

At the very bottom of the long panel where you can enter your term and set all of the options for searching that term, you will find a checkbox labeled "Apply the same settings to the next term if possible". Having this box checked means that if you set various options for the term you were about to add, EMERSE will apply the same settings to the next term you plan to add. This should make it easier to add numerous terms that have the same settings.

apply same settings
Figure 42. The bottom of the term entry panel is where the checkbox is to apply the same settings to the next term to be entered.
Copy/Paste The Same Settings

There may be situations in which you have already entered terms but then want to apply new or different settings (negation, uncertainty, etc) to those terms. There is a somewhat hidden feature in EMERSE where you can Copy Settings from one term and "paste" them onto another term.

To use this feature, Edit an existing term by clicking on the pencil icon next to it. Then, scroll to the bottom of that editable area and click on the Copy Settings button. Click on Done to close the editing section. Then, hold down the 'v' key on the keyboard. You should see the pencil icon change to an icon with overlapping squares. Simply click on one of those icons and the settings should be applied to the term.

If the pencil icon disappears but you do not see the overlapping square icon it means either that you did not copy any settings, or that the settings you copied are the same as the term without the icon (in which case pasting the settings would have no effect).

This feature is also demonstrated in the short video, shown below.

This brief video shows how to copy settings from one term and then apply those settings to other terms. To view the icon for pasting the settings, hold down the 'v' key on the keyboard.

Note that the color will not be transferred from one term to another when copying and pasting the settings.

Bulk Upload Terms

If you have a large list of terms that you want to enter, you can paste in an entire list of terms at once through the Bulk Upload feature. Simply click on the Bulk Upload option, and then type or paste in the list of terms into the text entry box, one term per line. You can also choose settings for the terms (such as excluding Negated terms, etc), but the settings (and the term color) will apply to all of the terms entered.

Note that some of the options for entering terms individually will not be available, including Proximity and Fuzzy Match.

Editing Terms

Terms already entered will appear in a column to the right of the panel where terms are originally entered, with a header labeled "Active Terms/Phrases".

Editing terms can be done within the Edit section, which is also the same place where you can enter new terms. Terms can be edited if you are the owner of the Saved Terms list (for Temporary Terms there is no concept of ownership since it is only a temporary list that you create yourself). If you are not the owner of the Saved Terms you will not be able to make any changes. To edit an existing term, just click on the pencil icon inside of its "bubble" and an editing box should appear. This is also how you would change the color of a term, or any of the other settings for that term.

Do not forget to click the Done button to save the changes/edits.

Removing Terms

There are several ways to remove terms from a Saved Terms list. If you want to remove a single term, you can click on the pencil icon for a term and then click on the Remove button.

You can also "grab" the term in the upper right corner of its colored "pill" and drag it out of the area. You may see it turn into a trash can icon and then it will be gone when you stop dragging.

Finally, you can delete multiple terms quickly by holding down the X key on your keyboard (make sure that the focus is not on any term entry box). At that point you should see small trash can icons on all of the term "pills". While holding down the X key, you can click on each trash can icon and those terms will disappear.

Below is a short video showing these various options.

This brief video shows three different ways of removing terms: (1) clicking the remove button, (2) dragging a term "pill" out of the area, (3) holding down the 'X' key to make the trash can icons appear and then clicking on the icons.

If you want to remove all terms at once from a Saved Terms list (aka Term Bundle), click on the Clear/Delete tab and then click on the Clear All Terms button. This will remove all of the terms but preserve any sharing preferences that you may have used with the Saved Terms. To delete a Saved Terms list entirely, choose the Delete Term Bundle button. This will completely remove the Term Bundle, but be careful because there is no undo option once it has been deleted.

Spell checking

EMERSE will check the spelling of terms entered. While you type a term, spelling suggestions will be displayed under the Terms text entry box in bold, preceded by "Did you mean". For example, if you enter your terms as word misspeling you will see:

Did you mean "word misspelling"?

To correct the spelling you can simply click on the spelling suggestion and the misspelled words will be replaced with the corrected version. Of course, sometimes you may want to search for misspellings, in which case just leave those spelling errors as-is.

If you do not see any spelling suggestions for misspelled words, it is possible that your local EMERSE installation was misconfigured; if so, please contact your system administrator.

Estimated Document Count

For some terms you type in, you might see an Estimated Document Count appear. This is available to provide a high-level estimate of how many documents (not patients) contain that term. These counts are based on any Synonyms loaded into the system (Synonyms are described later in this guide). The estimated counts are not real-time counts (real-time counts have to be done through searching), nor are they adjusted for any types of filters or other criteria. Synonyms data may be periodically updated by your site administrators and the estimated counts are updated at those times. It is worth noting that if you turn on the counts display with the Synonyms option, these are the same counts, but shown in a different location on the interface.

estimated document counts
Figure 43. Estimated document counts may appear for some terms, to give a sense of how frequently they appear in the overall document index.

Synonyms

Synonyms are a very powerful feature of EMERSE, and are described in detail in a separate section.

Saving the Temporary Terms

A set of Temporary Terms can be saved, which will allow you to keep and re-use the set of search terms permanently and share it with other users. Simply click on the Save option. You will be asked to provide a Name and Description and then that list of terms will be saved as a Favorite within your list of Saved Terms, which are described in detail below.

Additional Options with Saved Terms

Saved Terms are grouped into two categories: Favorites and All Available. Favorites is a smaller subset of all potential Saved Terms (also called Term Bundles) that are available to you, so that you can have a smaller list to work with. All Term Bundles that you make yourself will automatically be in your Favorites. Otherwise, to make a Term Bundle that is not yours into a Favorite, simply click on the Favorite checkbox.

Using a Term Bundle is simple: just click on the row in the table with the Term Bundle of interest, and EMERSE will display the contents of the Term Bundle. Then, just click on either Highlight Documents or Find Patients.

Making a new Term Bundle is also very easy: just click on the Create New Saved Terms List button and give the Term Bundle a Name and a Description. A few other features of a Term Bundle are described below.

All Term Bundles have a concept of "ownership". That means that only the "owner" (the person who created the Term Bundle) can manage it, such as adding/editing terms and choosing how the Term Bundle should be shared. Others can use it with the right Sharing privileges but they cannot modify it.

Filtering Term Bundles

If you have trouble finding the right Term Bundle, you can try the Search feature, which can be found in a small text box above the list of Saved Terms. Just start typing what you are looking for; the Term Bundles matching what you type will remain and everything else will be hidden. Search looks across the Name, Description and terms within the Term Bundle. You cannot search for the Owner of a bundle, but you can sort the list of all available Saved Terms/Term Bundles based on owner.

Copying a Term Bundle

If you are not the owner of a Term Bundle but wish to make edits, you can make your own copy and become the "owner" of this new Term Bundle. To access this feature, click on the Copy tab of the Bundle. This new Term Bundle will then be yours to modify and control, but any changes to the original Term Bundle will no longer be reflected in your new Term Bundle, as they have essentially been split off from one another. You can also make a copy of your own Bundle if you want to make some changes but leave the original one intact.

Copying a Bundle (even one of your own) only copies the terms but does not copy any sharing preferences that you have set with the original bundle.
When you copy another user’s Bundle, the copied Bundle is available only to you by default.

Sharing a Term Bundle

Similar to Patient Lists, Term Bundles (also called Saved Terms) can be shared with other users. As of version 6.4, this can include users at another EMERSE site, as long as that site participates in the network. To access this Sharing feature click on the Share tab. There are four options for sharing these Bundles, described below:

  • Private; available only to you. This is the default option.

  • Specific users. With this option you can choose specific users to share the Term Bundle with. These users will have access to the Term Bundle for their searches, but will not be able to modify the list of Terms. Users can be added by clicking on the Add Users button. You can choose users at your own site, or you can also select users from another site. In either case you will need to know the name or username of the user(s) with whom you want to share. This across-site sharing can be useful for multi-center research collaborations in which standardizing the searches is desirable.

  • All users at your own site/institution.

  • All users at all sites participating in the Network. (This option is only available if other sites are on the Network).

bundle sharing
Figure 44. Saved Terms/Term Bundles have several options for sharing, including sharing to specific users in the local system, or even users across the network.

Advanced Terms

The Advanced Terms feature sends queries directly to the underlying Solr system. In the past it allowed for complex searches that were not feasible through the main EMERSE user interface. However, as of EMERSE version 7.x, searches through the standard EMERSE interface are generally going to be more powerful than what can be done through Advanced Terms. For example, it is not possible to leverage the Negation or other NLP annotations through Advanced Terms. In fact, there are only a few things you can now do using Advanced Terms that are not possible otherwise. The main ones are nested Booleans and date math. Therefore we recommend using this feature with awareness of the tradeoffs. Below are a few types of searches you can do through this feature.

Another important point is that as of version 7.x, the terms in the search index are "silently" prefixed to distinguish standard terms from other types of annotations (such as negation, uncertainty, etc). These prefixes are tx_ for lower-cased terms (case-insensitive) and TX_ for case-sensitive terms. That is, words like "chest" will actually be indexed as tx_chest and words like "Ehlers" would be indexed as TX_Ehlers. In the Advanced Terms section, the lower-cased prefix will be automatically added in all cases, except for with regular expressions (described below). This means that in all settings within Advanced Terms (except for regular expressions) you will not be able to search for terms in a case-sensitive manner. Instead, use the main search interface in EMERSE to conduct case-sensitive searches.

It’s important to keep in mind that searches using Advanced Terms are conducted at the level of a single document. So if you are looking for two terms that occur together, those terms must exist in the same document. Additionally, since Advanced Terms leverages the standard Solr search apparatus, you will not have access to multiple color highlighting or the use of Synonyms from this page. Terms will only be highlighted in yellow.

Complete details about the Solr search syntax can be found on the Solr Query Parser Syntax page.

Examples of what can be done with Advanced Terms are summarized in the Feature Comparison Table above, and explained in additional detail below.

Boolean searches

Boolean searches use the operators AND, OR, and NOT. They must be capitalized or they will be considered as regular terms; that is, it will search for the words and, or, and not if they are lower-case. NOT in this context means that a document should not be identified/returned if it contains that term. For example:

chest AND xray NOT abdomen

The query above would identify documents that have the words chest and xray in the same document (anywhere in the document), but if the word abdomen was also in the document it would not be considered a hit (i.e., would not be returned) even if the first two words (chest and xray) were there.
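The document-level logic of this query can be sketched in Python. This is only an illustration of the Boolean semantics described above, not how EMERSE or Solr evaluates queries internally; the function name and the whitespace tokenization are assumptions made for the example.

```python
def matches(document_text: str) -> bool:
    """Document-level check equivalent to: chest AND xray NOT abdomen."""
    # Simplified tokenization (an assumption): lower-case and split on whitespace.
    words = set(document_text.lower().split())
    return "chest" in words and "xray" in words and "abdomen" not in words

print(matches("Chest xray shows clear lungs"))         # True
print(matches("Chest xray and abdomen film reviewed"))  # False: abdomen is present
print(matches("Chest pain reported"))                   # False: xray is missing
```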

You can also group booleans together using parentheses, for example:

(chest AND pneumonia AND headache) OR (kidneys AND reflux AND "back pain")

This query will return documents that have the terms chest, pneumonia and headache all together in the same document in any order, or documents that have kidneys, reflux, and back pain all together in the same document in any order.

Boolean operators can only be used as described above with the Advanced Terms feature. The other two search features (Quick Terms and Term Bundles) have a type of implicit Boolean operator assigned to terms based on whether one is searching among a Patient List ( Highlight Documents ) where the OR operator is assumed, or among All Local Patients ( Find Patients ) where the AND operator is assumed if the colors of the terms are different and the OR operator is assumed if the colors of the terms are the same. This is explained in more detail in the section about Search.

Wildcard searches

Two basic types of wildcards are supported (? and *). Wildcards can only be applied to single words, and cannot be applied to phrases.

? is used to match a single character. For example:

bea?

Would match words like beam, beat, bear, bean, bead, etc.

* is used to match zero or more characters:

hyperten*

Would match terms like hypertension and hypertensive, as well as just hyperten.
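For readers who want to experiment with these two wildcards outside EMERSE, Python's standard fnmatch module uses the same meaning for ? and *. The sketch below is only an analogy; the word list is made up for the example, and Solr itself does not use this module.

```python
from fnmatch import fnmatchcase

# Hypothetical word list used only to demonstrate the two wildcards.
words = ["beam", "bead", "beads", "hyperten", "hypertension", "hypertensive", "hypotension"]

# ? stands for exactly one character.
print([w for w in words if fnmatchcase(w, "bea?")])
# ['beam', 'bead']

# * stands for zero or more characters.
print([w for w in words if fnmatchcase(w, "hyperten*")])
# ['hyperten', 'hypertension', 'hypertensive']
```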

Fuzzy Searches

A fuzzy search allows you to search for terms that are similar to the term of interest. This can be a way to look for possible misspellings. A fuzzy search can only be applied to a single word at a time. To use a fuzzy search, append a tilde (~) to the term of interest, followed by a number between 0 and 2. This number is the maximum number of single-character edits (inserting, deleting, or substituting a character, or swapping two adjacent characters) allowed between the term you entered and a matching term.

For example:

beast~1

Would match beats as well as the word breast, since both of those terms can be created from the word beast with a single edit (swapping the adjacent letters s and t, or inserting an r). (Note that the original term beast would also be highlighted in this example).
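The notion of "one edit away" can be made concrete with a small edit-distance function. The sketch below uses the optimal string alignment form of the Damerau-Levenshtein distance, which counts insertions, deletions, substitutions, and swaps of adjacent characters; it is written for illustration only and is not the code Solr runs. The candidate word list is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (a Damerau-Levenshtein variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a
    for j in range(cols):
        d[0][j] = j  # insert every character of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

candidates = ["beats", "breast", "beast", "bread"]
print([w for w in candidates if edit_distance("beast", w) <= 1])
# ['beats', 'breast', 'beast']
```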

Proximity Searches

A proximity search looks for two words within a certain maximum distance from one another, regardless of their order. Separate the two words by a space and wrap them in double quotes, then add a tilde (~) followed by a number representing the maximum distance in words that they can be separated.

For example:

"ct mass"~10

Would match phrases like …​a CT scan showed a mass in his left…​

as well as …​a 12x12x8 cm mass in her L upper lung found on CT…​.

and even simply mass CT
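A simplified way to think about this check is to count the words that sit between the two terms. The Python sketch below does exactly that and nothing more; it is an approximation for illustration, the function name is hypothetical, and Solr's own distance accounting may differ in detail (for example, in how it treats terms that appear in reverse order).

```python
def within_distance(text: str, first: str, second: str, max_gap: int) -> bool:
    """True if the two words occur with at most max_gap words between them, in either order."""
    words = text.lower().split()
    positions_first = [i for i, w in enumerate(words) if w == first]
    positions_second = [i for i, w in enumerate(words) if w == second]
    return any(
        abs(i - j) - 1 <= max_gap
        for i in positions_first
        for j in positions_second
    )

print(within_distance("a CT scan showed a mass in his left", "ct", "mass", 10))                 # True
print(within_distance("a 12x12x8 cm mass in her L upper lung found on CT", "ct", "mass", 10))  # True
print(within_distance("mass CT", "ct", "mass", 10))                                             # True
```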

Regular Expressions

A Regular Expression is a powerful way to match text using complex patterns. Regular Expressions are commonly used by computer programmers, and it is expected that only those with a certain level of technical expertise would ever use this feature. The syntax for creating Regular Expressions can be complex and difficult to understand, and can vary between programming languages. To create a Regular Expression with EMERSE, follow the guidelines for Java regular expressions.

Regular Expressions within EMERSE can be created for single words/tokens only (not multi-word phrases). As a result, there may not be much advantage to using this feature compared to simple wildcard matching. Wrap the regular expression within a forward slash at the beginning and end to tell the query parser that it is a regular expression. Also, it is necessary to use one of the two "hidden" prefixes for the terms (tx_ for case-insensitive or TX_ for case-sensitive). For example,

/tx_[0-9]{3}mg/

Would match 100mg, 150mg, 400Mg, 650MG, etc.
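The reason the lower-case pattern also matches 400Mg and 650MG is that the case-insensitive index entries are lower-cased and prefixed with tx_ before matching. The Python sketch below imitates that behavior to show which tokens would match. It uses Python's re module rather than the engine EMERSE uses (this particular pattern means the same thing in both), and it assumes the pattern must match the whole token; the helper name is hypothetical.

```python
import re

# The pattern from the example above, without the surrounding slashes.
PATTERN = re.compile(r"tx_[0-9]{3}mg")

def indexed_form(word: str) -> str:
    """Imitate the case-insensitive index entry: lower-case the word and add the tx_ prefix."""
    return "tx_" + word.lower()

for word in ["100mg", "400Mg", "650MG", "50mg", "1000mg"]:
    print(word, bool(PATTERN.fullmatch(indexed_form(word))))
# 100mg True
# 400Mg True
# 650MG True
# 50mg False
# 1000mg False
```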

Case Sensitive Searches

With the exception of regular expressions, searching the index for case-sensitive terms is no longer possible with Advanced Terms as of version 7.x.

Filtering by Metadata

Metadata are additional pieces of data about a document that can potentially be used to help with searching. Examples of metadata might be the source of the document (e.g. the main EHR, radiology, pathology, etc), the type of test (e.g., MRI report, CT report, plain film report, etc), the clinical service (e.g., nephrology, cardiology, rheumatology, etc) and others. While it is possible to search based on metadata, the specific types of metadata and names of the possible elements will vary from one installation/site to another and will depend completely on how the system was configured locally, what data have been loaded, etc. Therefore, if you are interested in this option you will likely need to contact your local EMERSE administrators to get more information. Further, the Filters in EMERSE are applied to the Advanced Terms searches, so it is preferable to use the standard Filters rather than try to recreate the complexity of a filter in raw Solr queries.

Below is an example of the type of query that could potentially be run if specific metadata elements were captured and stored within EMERSE:

RPT_TEXT:"motor vehicle accident" AND SOURCE:"EHR" AND DEPT:"ADULT EMERGENCY"

The query above will search for the phrase motor vehicle accident and it will limit the search to a locally named source that is called EHR and a locally named department called ADULT EMERGENCY. As mentioned above, the specific names of sources, departments, and other metadata will be unique to each installation, so it is not possible to provide specific details on what options will be available at each site.

As of EMERSE version 6, the "SOURCE" should now be available as a Filter option in the main user interface and would not need to be added into a query like the one above.

Date Ranges and Date Math

Searches can be limited by date ranges using the Advanced Terms feature. Currently, due to technical constraints for how EMERSE was designed, this date feature can only be used when searching across All Local Patients using the Find Patients function. It will not work when searching across a Patient List with the Highlight Documents function. At some point the system may be re-designed to allow these Advanced Terms date ranges to be used everywhere.

It is worth pointing out that there may be no real need to use a date range within an Advanced Search query, since a date range can already be applied to any search (even searches using Advanced Terms) by using the Filters feature. However, if using dates in the query is desirable, they should be in the standard Solr date syntax, which can be somewhat complex to use. It is also worth pointing out that if you set a specific date range using the Filters AND you also set a custom date in your Advanced Terms query, an error will occur and 0 patients will be returned. In other words, if you want to use the date options within an Advanced Terms query, you need to leave the standard EMERSE date fields blank.

A basic example of how a date range can be included in a search is below. This query will search for the term trabeculotomy and it will limit the search to the date range January 20, 2015 through May 20, 2015.

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[2015-01-20T00:00:00Z TO 2015-05-20T23:59:59Z]
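If you build such queries often, the date range portion can be assembled with a few lines of Python and pasted into Advanced Terms. This is a convenience sketch only: the helper name is hypothetical, the field names are the ones used in the example above (they may differ at your site), and the dates are assumed to already be in UTC.

```python
from datetime import datetime

def solr_date_range(field: str, start: datetime, end: datetime) -> str:
    """Format a date range clause; the datetimes are assumed to be in UTC."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{field}:[{start.strftime(fmt)} TO {end.strftime(fmt)}]"

query = 'RPT_TEXT:"trabeculotomy" AND ' + solr_date_range(
    "ENCOUNTER_DATE",
    datetime(2015, 1, 20, 0, 0, 0),
    datetime(2015, 5, 20, 23, 59, 59),
)
print(query)
# RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[2015-01-20T00:00:00Z TO 2015-05-20T23:59:59Z]
```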

The following query is similar to the one above, but shows how some basic date math can be used. In this query the same date is used for both the initial and final dates (2015-01-20), but the second instance of the date is followed by +4MONTHS which essentially adds 4 months to the date, and thus is equivalent to the previously shown query that includes the date range of January 20, 2015 through May 20, 2015.

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[2015-01-20T00:00:00Z TO 2015-01-20T23:59:59Z+4MONTHS]

A few more examples, using the special NOW option are below.

Search for trabeculotomy between January 20, 2015 and the current date:

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[2015-01-20T00:00:00Z TO NOW]

Search for trabeculotomy between 12 months prior to the current date and the current date:

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[NOW-12MONTHS TO NOW]

Search for trabeculotomy between 30 days prior to the current date and the current date:

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[NOW-30DAYS TO NOW]

Search for trabeculotomy between 60 days prior to the current date and 30 days prior to the current date:

RPT_TEXT:"trabeculotomy" AND ENCOUNTER_DATE:[NOW-60DAYS TO NOW-30DAYS]
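To see which concrete dates a NOW-based range covers, the same arithmetic can be reproduced in Python. The sketch below handles only DAYS offsets and uses a made-up value for the current time; the helper name is hypothetical, and Solr performs this calculation itself when the query runs.

```python
from datetime import datetime, timedelta, timezone

def days_before(now: datetime, days: int) -> str:
    """Return the timestamp that NOW-<days>DAYS would resolve to for the given 'now'."""
    return (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")

# Hypothetical current time, chosen only for the example.
now = datetime(2015, 5, 20, 12, 0, 0, tzinfo=timezone.utc)

# Equivalent of [NOW-60DAYS TO NOW-30DAYS] at that moment.
print(days_before(now, 60), "TO", days_before(now, 30))
# 2015-03-21T12:00:00Z TO 2015-04-20T12:00:00Z
```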

Synonyms

The Synonyms feature in EMERSE can help you conduct more thorough searches by providing suggestions for additional terms or phrases that are related to the terms you entered. This is sometimes referred to as "query expansion". The name Synonym is used loosely here. While the feature is called Synonyms, it can provide much more than basic synonyms for a given word/phrase. In fact, these can include many types of related terms that might be useful for a search. An important aspect of the Synonyms is that you have full control over what is included in your search. Synonyms are essentially system-provided suggestions about additional words or phrases that you may want to include in your searches, based on the words/phrases you have already entered. You can incorporate the suggestions into your own search terms, either with the Quick Terms feature or the Save Terms feature. Because these are system-wide suggestions, these terms are available to all users in the system.

synonyms
Figure 45. Synonyms can be a powerful way to expand your search to related terms and phrases. In this screen, the frequency (count) of each Synonym suggestion is displayed in parentheses to the right of the term. These counts are based on the overall set of documents within EMERSE and are not related to a specific search or filter settings. Showing the frequency can be turned off in the Synonym Preferences (see below). However, the term frequency can be used to infer the "value" of a term in a search; that is, will it yield many results or very few.

Accessing the Synonyms feature is easy. In the Edit section, each term/phrase will be shown in a colored bubble. If a synonym exists for that term, a Synonyms button will appear in the bubble under the "Active Terms/Phrases" column. Once the Synonyms button is clicked, a small pop-up window appears that displays the terms in four categories.

  1. Synonyms: These are the "standard" synonym suggestions.

  2. Related Terms: These are terms related to the synonyms but considered different enough from the original term that they should not be included in the former category. For example, a Related Term of acalabrutinib is mantle cell lymphoma since acalabrutinib is meant to treat mantle cell lymphoma. The two terms are related, but not really the same. Sometimes, however, other related terms may seem more closely related but were placed in that category to prevent too many matches from being displayed for an initial term that was very generic.

  3. Spelling Alternatives: These are generally misspellings of the terms of interest. They are not always actually misspelled words, as these can include correctly spelled words that are not the right ones. For example, beast instead of breast.

  4. Concepts: These are Concept Unique Identifiers (CUIs) used with the National Library of Medicine (NLM) Unified Medical Language System (UMLS). These will only appear if the proper Synonym dataset for these CUIs has been loaded by the administrators and enabled by the user.

In the Synonyms pop-up window each term will be highlighted with the same color as the original term that prompted the Synonym suggestion. If the term is highlighted it means that it is already selected and ready to be added into the list of search terms to be used. It is possible to click on each individual term to highlight or de-highlight the term to be added (highlighted) or not added (not highlighted). Or, one can use the provided buttons to Highlight All or Highlight None. For example, if a large list of terms is provided but you only want to include a single suggestion, first click on the Highlight None button, and then click on the single term of interest to highlight it. Once the selection is complete, click on the Add Highlighted Terms button to add those selected (highlighted) terms.

The ability to control the colors of the added Synonyms has important implications when conducting a search across All Local Patients, since the colors of the terms are used to define how the search is conducted (Details can be found in the section on Searching).

Once additional terms are added via the Synonyms feature, it may be worth checking if those newly added terms also have Synonym suggestions. For example, if the word fruit is originally entered by a user, then the Synonyms feature might suggest apple as an additional term. Once apple is added, then apple cider might be suggested, and so on.

Users are not able to add new terms to the system-provided Synonym suggestions, but users are always free to add as many terms as desired to their own term lists (Saved Terms or Temporary Terms). Only system administrators can add additional terms to the system-wide Synonym suggestions, so please contact your local EMERSE system administrator for terms that you think are missing but should be included.

It is also worth noting that other datasets can be imported into EMERSE that can add to the list of available Synonym suggestions. Examples of such datasets can be found here, and directions for formatting a Synonyms dataset for importing into EMERSE can be found in our Administrator Guide.

Synonym Preferences

EMERSE supports the ability to incorporate multiple Synonym Datasets. As mentioned above, these datasets can only be added by a system administrator, but once they are added they can be available to all users. The Synonym Preferences page allows users to have granular control over these Synonyms including the ability to include or exclude them from the suggestions.

Synonym Preferences
Figure 46. The Synonym Preferences screen.

There are several options available to users from the Synonym Preferences screen:

Enabled: When this is on, the synonyms will be available to the user and will be included in any suggestions if matches are found.

Counts available: If counts are available, they will be shown in parentheses next to the suggested term itself. Counts represent how many documents these terms appear in across the entire corpus of documents that EMERSE has indexed. These counts can be useful to help determine which terms might be most valuable in the search (i.e., yield the most results). For example, terms with counts of 0 are not in the full index of all documents and won’t yield any results, so they do not need to be included. Note that counts are not updated daily, so if new documents are added the counts will not change unless the EMERSE administrators start the counting process again (which can take days to run). Additionally, if counts are available they can be sorted by frequency when the Synonyms suggestions are shown.

Show Counts: Turning this on will show the counts if they are available, and turning it off will suppress the display of counts.

Show Zero Counts: If counts are available, then keeping this option off will prevent the system from suggesting terms that have a count of 0. There is really no reason to show terms with counts of 0 other than to see the wide range of potential terms available. Turning this on will show all terms, even if they have a count of 0.
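The combined effect of the count options can be pictured with a short Python sketch. It is an illustration of the behavior described above, not EMERSE code; the function name, the terms, and the counts are all made up for the example.

```python
def visible_suggestions(counts: dict, show_zero_counts: bool = False) -> list:
    """Hide zero-count terms unless requested, then sort by frequency (highest first)."""
    items = [(term, n) for term, n in counts.items() if show_zero_counts or n > 0]
    return sorted(items, key=lambda item: item[1], reverse=True)

# Hypothetical suggestions and document counts.
counts = {"myocardial infarction": 5200, "heart attack": 8100, "cardiac infarct": 0}

print(visible_suggestions(counts))
# [('heart attack', 8100), ('myocardial infarction', 5200)]
print(visible_suggestions(counts, show_zero_counts=True))
# [('heart attack', 8100), ('myocardial infarction', 5200), ('cardiac infarct', 0)]
```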

Searching

Searching is really the core of what EMERSE does. But conducting an effective search can be complex and depends on more than just the search terms used. Searching in EMERSE is an action invoked by clicking Highlight Documents, Find Patients, or Search Network. A search cannot be carried out until required parameters have been set, including the patients to be searched (All Local Patients, a Patient List, or patients from other EMERSE systems in the Network) and the Terms. Filters are optional, and most are not supported with Network searches at this time.

Find Patients Search Mode

The Find Patients search mode is used to identify a set of patients that may warrant further exploration, or that you may want to save as a Patient List. When searching across All Local Patients this approach can identify a set of patients among all the patients in the system that meet the search criteria. When Find Patients is applied to an existing patient list, it will limit the search to only the patients in that list. It’s the equivalent of trying to identify a cohort based on search terms.

Data displayed with the Find Patients mode

When in the Find Patients mode, EMERSE will provide the following results in different sections:

Patient count

Total number of patients meeting the search criteria.

Text snippets

Brief Text Snippets from the 100 top-ranked documents. This is to provide some context for what was found to help ensure that the search retrieved what you were looking for. Note that ranking is not used much within EMERSE, since these rankings don’t have much significance for the kind of work most people use EMERSE for.

Demographics charts

Charts showing the demographic breakdown of the patients, such as sex, race, ethnicity, and current age.

Trends

A chart that shows how the number of patients matching the search has changed over time. Trends uses the date range set with the Filters to define the range and will automatically adjust the time intervals based on that range (years, months, etc). Note that for trends, a distinct count of patients per time interval is displayed. If a patient has notes with the term(s) of interest over multiple time periods, they will be counted for each time period in which a note with the term(s) appears. At this time the counts are absolute counts and are not normalized for the volume of patients seen within a given time period.
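The counting rule for Trends (each patient counted once per time interval, but again in every interval where they have a matching note) can be sketched as follows. This is an illustration only: the function name and the sample data are hypothetical, and yearly intervals are assumed for simplicity.

```python
from collections import defaultdict

def patients_per_year(notes):
    """notes: iterable of (patient_id, year) pairs, one per note containing a search term."""
    seen = defaultdict(set)
    for patient_id, year in notes:
        seen[year].add(patient_id)  # a set keeps each patient once per year
    return {year: len(ids) for year, ids in sorted(seen.items())}

# Hypothetical matching notes: p1 has two notes in 2019 and one in 2020.
notes = [("p1", 2019), ("p1", 2019), ("p1", 2020), ("p2", 2020), ("p3", 2021)]
print(patients_per_year(notes))
# {2019: 1, 2020: 2, 2021: 1}
```

Patient p1 contributes to both 2019 and 2020, but only once within each year, even though there are two matching notes in 2019.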

If you are satisfied with the search results, you can then explore the list of patients in more detail, or save the list for later use by clicking on the button Move to Temporary Patient List. Note that to actually save the list for use in future sessions you would then need to click on the Convert to Saved Patient List button after converting it to a Temporary Patient List. Once the patients have been moved to a Temporary Patient List, you can also search through their documents using the Highlight Documents button, because at this point you have switched from the Find Patients mode to the Highlight Documents mode by selecting a subset of patients in the system to look through.

If you used Find Patients against a Patient List you will also see an option to Tag Patients in List. This will add a Tag to each patient that was in the result from the current search. More details can be found in the section on Tags.

Summaries
Figure 47. The Summaries display after performing a search across all patients using Find Patients. In this example, the Negation option for Annotations has been selected, which will underline any negated words in the Snippets.
Demographics
Figure 48. The Demographics display after performing a search across all local patients using Find Patients
Trends
Figure 49. The Trends display after performing a search across all local patients using Find Patients

Searching with the Find Patients mode

With Find Patients, a patient will be included in the results if any of their documents match the search criteria. The way the search runs is different depending on whether it is run against All Local Patients in the system, or against an existing Patient List.

  • All Local Patients: A document matches the search if it contains at least one term of each distinct color in the search. That is, terms with different colors are considered to be separated by AND whereas terms with the same color are considered to be separated by OR.

  • Patient Lists: A document matches the search if any term appears in the document, regardless of the color. That is, terms with different colors are not treated distinctly and are all considered to be separated by OR.

For example, if you search for heart and "chest pain" across All Local Patients, then a document must contain both the word "heart" and the phrase "chest pain" to match, and a patient must have at least one document that matches those terms since the terms are different colors. If a patient merely has a document containing the word "heart" and another different document containing the phrase "chest pain", the patient would not match because the requirement is that the terms appear together in the same document.

If the same search for heart and "chest pain" is done within a Patient List, then a document matches if it contains either the word "heart" or the phrase "chest pain"; the document does not have to contain both. For example, a patient will match the query when they have a document that contains the word "heart" even if they have no document that contains the phrase "chest pain".
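The two matching rules can be summarized in a short Python sketch. It illustrates the logic described above and is not EMERSE's implementation; the function name and color labels are hypothetical, and simple substring matching stands in for the real term matching.

```python
def document_matches(text: str, terms_by_color: dict, all_local_patients: bool) -> bool:
    """Apply the color rules: AND across colors for All Local Patients, OR for a Patient List."""
    lowered = text.lower()
    # For each color, is at least one of its terms present? (OR within a color)
    color_found = [
        any(term in lowered for term in terms)
        for terms in terms_by_color.values()
    ]
    return all(color_found) if all_local_patients else any(color_found)

# Hypothetical color assignment: the two terms have different colors.
terms = {"green": ["heart"], "blue": ["chest pain"]}

both = "Heart exam normal; denies chest pain."
only_heart = "Heart exam normal."

print(document_matches(both, terms, all_local_patients=True))         # True
print(document_matches(only_heart, terms, all_local_patients=True))   # False
print(document_matches(only_heart, terms, all_local_patients=False))  # True
```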

Document matches for Advanced Terms operate a bit differently: the results are based only on the search syntax used, not the highlight color; all advanced searches are highlighted in yellow. Terms in advanced search not separated by a boolean operator are searched as if they were separated by the OR operator. Parentheses should be used to make grouping clear.

Advanced term search can do things that have no analog in Temporary Terms or Saved Terms, such as (heart AND "chest pain") OR "chest tightness", which matches documents that either contain both "heart" and "chest pain" or contain just the phrase "chest tightness" (this is generally known as 'nested Booleans'). In this example, a document containing the phrase "chest tightness" is enough to match, but a document containing the phrase "chest pain" without the word "heart" is not enough to make a match.

Search Strategies with the Find Patients Mode

Because of how searches are conducted, and how patients are identified when a search using Find Patients is applied to All Local Patients, it is best to ensure that true synonyms for a concept are highlighted in the same color. Additionally, be careful about adding too many terms especially if a term is a different color from the others, since it means that the additional term must also be in the document for the document to be retrieved. Adding an extraneous term that isn’t very important for identifying the right patient population has the potential to result in no patients being found. For example, if the following query is used with a search across All Local Patients, the query will only find patients that have at least one document with all three of those terms in the same document:

heart (first color)
chest pain (second color)
chest tightness (third color)

Thus, it is better to use only those terms that are absolutely necessary to find the right patients, or to use a Term Bundle and set the color of the terms more strategically, such as:

heart (first color)
chest pain (second color)
chest tightness (second color)

In this case, the above query will look for documents that contain heart AND either chest pain OR chest tightness, but chest pain and chest tightness do not have to both be present, only one of them.

Highlight Documents Search Mode

The Highlight Documents search mode is used when searching across a known set of patients, using either a Temporary Patient List or a Saved Patient List. This mode is equivalent to a chart review where you want to highlight all of the terms of interest, because you already know what patients you are interested in looking through.

Data displayed with the Highlight Documents mode

When in the Highlight Documents mode, EMERSE will provide results including:

Overview

The Overview is a table where every patient is shown in a row and the document sources are shown in the columns. This Overview provides a high level view of what was found for each patient, with each cell providing specifics about what was found on a per-patient, per-source basis. There are several viewing options for the Overview table.

Numbers

Numbers are displayed in cells that have search "hits". (In search jargon, a "hit" means that the search query matched a document in the system). The numbers represent the total number of documents with at least one hit and the total number of documents for that specific patient and document source. For example, if a cell says "21 of 92" it means that the patient has 21 documents containing a search hit out of a total of 92 documents for that source. Cells that are blank have no hits (but those cells might still contain documents without any hits). Clicking on the Numbers icon will toggle between showing the numbers for cells that have hits and numbers for all of the cells, even those that do not have a hit. This can be useful if you are interested in knowing if a patient has any documents for a specific source, even if there are no hits for that source. For example, a cell might say "0 of 12" (0 notes out of 12 notes had a hit) or even "0 of 0" (there were 0 notes for that particular patient and document source). Note that the numbers shown in these cells will vary depending on how the Filters are set. More restrictive Filters will result in fewer hits and potentially fewer documents. If a patient does not match the Filter criteria at all, they will have 0 documents displayed in the Overview table.
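The "hits of total" figure in each cell is a simple tally per patient and document source. The Python sketch below shows the arithmetic; it is illustrative only, and the function name, patient identifier, and source names are hypothetical.

```python
from collections import Counter

def overview_cells(documents):
    """documents: iterable of (patient, source, has_hit) tuples, one per document."""
    totals, hits = Counter(), Counter()
    for patient, source, has_hit in documents:
        totals[(patient, source)] += 1
        if has_hit:
            hits[(patient, source)] += 1
    # Counter returns 0 for cells that never had a hit.
    return {cell: f"{hits[cell]} of {totals[cell]}" for cell in totals}

# Hypothetical documents for one patient across two sources.
documents = [
    ("p1", "Radiology", True),
    ("p1", "Radiology", False),
    ("p1", "Pathology", False),
]
print(overview_cells(documents))
# {('p1', 'Radiology'): '1 of 2', ('p1', 'Pathology'): '0 of 1'}
```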

Grayscale

The Grayscale option will shade a cell in a gray color, with darker colors representing more documents with a hit. This provides a rapid visual overview similar to a "heat map" which is commonly used in some bioinformatics disciplines. Darker shades may help point out areas in which to focus, since they have more documents and thus more mentions of the terms of interest. The general rule of thumb is that cells with no hits are white, cells with 1-5 document hits are a light gray, cells with 6-10 document hits are a slightly darker gray, and so on. The Grayscale option can be toggled on or off by clicking on the icon, and it can be used in conjunction with the Numbers option and the Mosaic option, described below.
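The rule of thumb above amounts to grouping hit counts into bands of five. The sketch below expresses that as a shade level; it assumes the "and so on" continues in steps of five documents, which is an assumption rather than a documented rule, and the function name is hypothetical.

```python
import math

def shade_level(hit_documents: int) -> int:
    """0 = white (no hits), 1 = light gray (1-5 hits), 2 = slightly darker (6-10 hits), ..."""
    return math.ceil(hit_documents / 5)

for n in [0, 1, 5, 6, 10, 11]:
    print(n, shade_level(n))
# 0 0
# 1 1
# 5 1
# 6 2
# 10 2
# 11 3
```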

Grayscale
Figure 50. The Grayscale view in the Overview section shades cells in the table darker when there are documents with more "hits" (or search terms). This is similar to how a heatmap is shaded in bioinformatics.
Mosaic

The Mosaic option provides another way to view the data at a high level. It displays a color-coded grid representing the terms found for each patient based on document source, where the colors are shown in specific locations within a 6 x 3 grid (because there are 18 possible colors for terms). Each color corresponds to the color that was assigned to the term. Because the colors appear in the same location in the grid every time, even those with trouble distinguishing between colors should still be able to make use of this feature based on the location of the shading. Clicking on the Mosaic icon a second time will reveal the actual grid, making it even easier to identify the locations of the color within the grid. In that sense, the locations of colors within the grid can serve as a kind of "fingerprint" to help identify patterns.

Mosaic
Figure 51. The Mosaic view in the Overview section matches the colors of the search terms to each of the cells in the table, making it very easy to rapidly identify which terms appear for each patient and document source.
Mosaic
Figure 52. The screen shot is identical to the one above, except that the grid has been made visible by clicking again on the Mosaic icon.
Summaries

Clicking on a specific cell on the Overview page will take you to a drill-down view of a list of documents for that specific patient and document source. This is known as the Summaries page, since it displays the high level summaries, or text snippets, of what was found based on the search. Each row in the Summaries view represents a document for that patient. By default only the documents that contain at least one search term are displayed. To show all of the documents, even those with no terms, click on the Display All Notes checkbox. The documents listed on this screen are sorted in reverse chronological order, with the most recent documents on top, but the table is sortable in both directions by clicking on the header column of interest. A row that is blank means that there were no hits for that document, whereas a row that is not blank will show a snippet, or Summary, of what was found for that document with the search terms of interest. Summaries allows for rapidly skimming the notes at a high level to see if it is worth drilling down further to open the actual note for more details.

Above the list of Summaries is a small table called Annotations. Clicking on each possible label ("negation", "uncertainty", etc) will underline the corresponding sections within the snippets, showing which areas are labeled with those annotations.

A hidden shortcut in the Summaries section: pressing the left and right arrow keys will skip to the prior (left arrow) or next (right arrow) patient in the list.

Summaries
Figure 53. The Summaries view after clicking on a cell within the Overview shows the various documents for an individual patient and document source, with snippets of text for documents containing the terms of interest. Users can choose to show/hide various Annotations including negation, uncertainty, and more. In this example, parts of one snippet are underlined in red, showing that it is marked as negated.
Documents

Clicking on a specific row on the Summaries page will take you to a drill-down view of the actual document. The entire document will be shown, with the terms of interest highlighted in the document. Above the document, the following collapsible sections are shown.

  1. The top contains the name of the patient, the MRN, and the Comments and Tags option (Comments and Tags are available for Saved Patient Lists only).

  2. This is followed by a section with metadata about the document itself. This information might include a unique document ID, the clinical service, the authoring clinician, etc. but it will differ depending on the local configuration of EMERSE.

  3. The Semantic Groups section contains various groups derived from the natural language processing (NLP) within EMERSE. Many (but not necessarily all) of the concepts/terms found through the NLP are linked to the Semantic Groups listed here. Clicking on a group in a table provides a convenienent way to highlight all terms mapped to those groups in a document, even if they were not included in your search terms. For example, clicking on Drugs should highlight all of the drugs/medications reported in the document. Note, however, that the processes of identifying drugs can be imperfect, and it is possible that multiple terms that are not actually drugs will be misidentified, because the names are general (an example might be a vitamin called "Daily" so that all of the words Daily might be identified as a drug). Clicking on a Semantic Group name once will cause all of those terms to be underlined, and clicking again will cause them to be fully highlighted. Users do not have control over the colors for these pre-defined categories.

  4. Annotations are labels for various parts of text based on attributes such as negation, uncertainty, non-patient subject, and history of. Clicking on one of those categories will cause the corresponding sections of the document to be underlined in a matching color. Users do not have control over the colors of these underlined categories either.

  5. The Summary section is similar to the summaries shown in other areas of EMERSE, but in this case only the text snippets for that single note are shown. These are displayed to provide another way to quickly skim the results for anything of interest. It is also worth noting that clicking on any of the highlighted terms in this table will take you to that specific place in the document, which is a hidden shortcut (there are benefits to reading the user guide).

The Documents page has various navigation options to skip to the next document or the next patient. There is no way to save the document; however, it can always be accessed again at a later time by going back to EMERSE. Another hidden shortcut in the Documents section: the < and > keys on the keyboard switch between the prior (<) and next (>) documents. You can also switch between patients using the left and right arrow keys.

Document
Figure 54. The Document view in the Overview section is where you can drill down to the level of a single document and still see all of the terms highlighted within the document.

Searching with the Highlight Documents mode

The Highlight Documents mode is simpler than the Find Patients mode. This is because all terms are treated as being separated by the OR operator, regardless of the color in which the terms are highlighted (unless otherwise specified using actual Boolean operators with the Advanced Terms). The general idea is that since the patients of interest have already been identified, the goal with Highlight Documents is to simply highlight any terms of interest anywhere in the documents to support a rapid, efficient chart review process. As a result, there is no harm (other than speed/performance) in including additional terms even if they are not present anywhere in the documents, since their absence will not affect the highlighting of the other terms.

Search Strategies with the Highlight Documents Mode

In general, no specific search strategies are needed when searching in the Highlight Documents mode. Even so, it may still be advantageous to group terms with similar meaning by the same color to aid in more rapid chart reviews. In other words, if you are searching for a list of narcotic medications and the resulting constipation that the medications might cause, it may be useful to set all of the narcotic medications to be green, all of the constipation terms to be orange, and all of the stool softeners/laxatives to be purple. Then, someone reviewing the charts would only have to recognize one of three colors to know what kind of concept was being highlighted.

Some elements with the Advanced Terms will not work properly when in the Highlight Documents mode, and these limitations are detailed in the section on Advanced Terms.

Searching across the Network

Searching the network involves a very different process from searching across All Local Patients or within a Patient List. This is because a Network search involves contacting other EMERSE sites to ask those other systems for approximate counts of patients based on the search terms. The most important thing to note is that the search terms are handled in the same way that they are when searching across All Local Patients. That is, terms with the same color are treated as being separated by OR and those with a different color are considered to be separated by AND. More details can be found in the section on the Network.
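
The color rule can be sketched in a few lines of Python. This is only an illustration of the Boolean logic described above, not how EMERSE is implemented: the function names, the dictionary layout, and the naive substring matching are all assumptions made for the example.

```python
def matches_find_patients(document_text, terms_by_color):
    """Find Patients / Network logic: OR within a color, AND across colors."""
    text = document_text.lower()
    return all(
        any(term.lower() in text for term in terms)
        for terms in terms_by_color.values()
    )


def matches_highlight_documents(document_text, terms_by_color):
    """Highlight Documents logic: every term is ORed, regardless of color."""
    text = document_text.lower()
    return any(
        term.lower() in text
        for terms in terms_by_color.values()
        for term in terms
    )


# Hypothetical term bundle: two green terms and one orange term.
terms = {
    "green": ["morphine", "oxycodone"],
    "orange": ["constipation"],
}
note = "Patient started on oxycodone after surgery."

print(matches_find_patients(note, terms))        # False: no orange term is present
print(matches_highlight_documents(note, terms))  # True: a green term is present
```

The same note is a hit in one mode and a miss in the other, which is why adding extra terms is harmless when highlighting but can change the results when finding patients.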

Additional Search Strategies

In addition to the two general search strategies described above (Search Strategies with the Find Patients Mode and Search Strategies with the Highlight Documents Mode), there are other tips that are worth noting to ensure that you get the results you’re looking for.

Understand the differences between the two search modes

This is described elsewhere, but it is extremely important to understand the differences between the two search modes, Find Patients and Highlight Documents. The way EMERSE works differs depending on which search mode is being used, and therefore this difference must be understood to ensure effective search results.

Use the synonyms

EMERSE can provide a lot of suggestions for additional terms through the Synonyms feature. Use this to expand your search to cover more possibilities for how something was worded or phrased in the notes. Also be careful because EMERSE is a very literal search engine. If you want to look for antibiotics and that is what you type in, the system will look for the word antibiotics only. If what you really meant was to find any instance of any antibiotic mentioned in the notes, then you will have to include all of the possible terms for these, and there may be hundreds of them (e.g., cephalexin, Keflex, Ancef, Ciprodex, amoxicillin, etc). The Synonyms feature can certainly help expand the breadth of the query terms, but it may not be complete.

While the Synonyms can be very useful and powerful, be careful about which Synonym suggestions you add. EMERSE has the potential to provide a very large list of options, but adding too many terms could slow things down without providing much benefit. And, for ambiguous terms, you might inadvertently include concepts you don’t even want. For example, if you are searching for synonyms of MI because you want to look for terms related to myocardial infarction, EMERSE will also suggest terms like Michigan, since that is another meaning of MI.

Break up the query

Similar to trying multiple variations, be careful about how specific the query is. For example, if you enter "metastatic breast cancer", then the system will look for that specific phrase. It might be better to look for the term metastatic and the phrase "breast cancer" separately, so that you have a greater chance of finding other variations such as:

the breast cancer was found to be metastatic to the lungs

Note, however, that breaking up the query can sometimes lead to false positives if, for example, there was text such as:

history of breast cancer as well as melanoma metastatic to the brain

Another approach for 'separating' the terms is to use the Proximity Search feature. This will allow you to search for two terms within a specified number of words from each other. For example, suppose you were looking for "sublingual nitroglycerin". A logical alternative phrase would also be "nitroglycerin sublingually". But if the phrase in the clinical note is actually "nitroglycerin was given sublingually" then the simpler, shorter phrase would miss it. Including the terms in a Proximity Search would capture the longer phrase. A Proximity Search within Temporary Terms or Saved Terms will search up to 5 words apart. To search with even more space between words, the Proximity Search can be used with the Advanced Terms. Just remember that the more spacing there is between two words, the more likely it is that there is no true connection between them, which can result in a false positive.
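
To make the idea of a proximity match concrete, here is a minimal Python sketch. It is not how EMERSE evaluates a Proximity Search. The function name, the simple word splitting, the restriction to single-word terms, and the definition of distance (the difference between word positions) are all assumptions chosen for illustration.

```python
import re


def within_proximity(text, term_a, term_b, max_distance=5):
    """Return True if two single-word terms occur within max_distance word positions."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, word in enumerate(words) if word == term_a.lower()]
    positions_b = [i for i, word in enumerate(words) if word == term_b.lower()]
    return any(abs(a - b) <= max_distance for a in positions_a for b in positions_b)


note = "nitroglycerin was given sublingually"

# The two terms are three positions apart, so an adjacent-words match misses them...
print(within_proximity(note, "nitroglycerin", "sublingually", max_distance=1))  # False
# ...but a wider window finds them.
print(within_proximity(note, "nitroglycerin", "sublingually", max_distance=5))  # True
```

Widening `max_distance` finds more variations but, as noted above, also raises the chance that the two words are unrelated.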

Beware of implicit concepts

A concept that you’re looking for may be implicitly described but not explicitly stated. An example of this would be looking for the concept of metastatic as it relates to breast cancer. If the phrase is…​

the breast cancer was also found in her lungs

…​this implies that the cancer was metastatic even though the term metastatic or its variants (metastasized, mets, etc.) was not explicitly mentioned. EMERSE, being a simple tool, is not able to interpret that phrase as being an example of a metastatic breast cancer.

Another example:

He had a raised, red patch on his arm that he didn’t even notice.

This is an example of an asymptomatic rash, even though neither of those two words (neither asymptomatic nor rash) appears in the sentence above. This can happen often, so it is important to be careful of what might be missed when only searching for specific phrases or concepts.

Beware of ambiguous terms

There are hundreds of terms that are used in the clinical notes that are ambiguous—​that is, it is hard to know what a term represents without understanding the context in which it appears, and some terms/abbreviations may have been completely made up locally by a single clinician or a small group of clinicians.

For example, diffs might mean differentials, or differences, or difficulties or even be a part of an abbreviation for Clostridium difficiles (C diffs).

Other common ambiguous abbreviations include CA which can mean cancer, calcium, or California, and MG could mean milligrams, myasthenia gravis, or magnesium. There are many more examples like this.

Even terms that one might think are straightforward might not be. For example, imagine you are interested in exploring milk consumption to assess dairy intake, so you search for milk. Be careful because there can be terms like almond milk, carob milk, soy milk, rice milk, and others which are clearly not the right kind of milk. This is why reviewing the results carefully is very important. Also note that you could add those 'false positive' terms as an Exclude Phrase to remove them from the search results (see the section on Exclude Phrase).

Try multiple variations

You may be tempted to use the most standard, common phrasing for a particular concept, but it is best to consider all of the ways in which something might be phrased. Clinicians will sometimes even make up new terms. For example, scaphotrapeziotrapezoidal may be the official spelling, but clinicians have also used terms such as scaphotrapeziumtrapezoid, scapho-trapezoid-trapezium, and scaphoid-trapezium-trapezoidal. Clinical note creators also add hyphens or spaces (these are equivalent in EMERSE) into words where they might not be expected, as can be seen in the above examples, or even into simple words like ham burger or yester-day. The system-suggested Synonyms can help capture a lot of this variability, but they may not be enough to cover all of the possibilities for what is mentioned in the notes.

Be aware of word substitutions

Remember that the people who create clinical notes often have a difficult time spelling medical words, or they make errors when they type, or automated transcription systems can "hear" the wrong word, etc. For example, beast mass and prostrate cancer are phrases that contain correctly spelled words, but they are not the right words. In fact, many such word substitutions of correctly spelled (but incorrectly used) words appear. Often these are homonyms, but not always. Think carefully about how someone (or a computer) might hear a word and insert the wrong word. If you’re looking for people with a hoarse voice, be sure to search for horse voice because it’s almost guaranteed that someone will have used the wrong word at least once.

The following table contains examples of such word substitutions that have been found in the medical record. It is by no means complete.

Table 5. Examples of incorrect words used in clinical notes

Wrong Word (the word that was used in the note)    Correct Word (what the word was supposed to be)
beast                                              breast
prostrate                                          prostate
horse                                              hoarse
anther                                             another
mutation                                           nutation
reposited                                          repositioned
unkept                                             unkempt
physiatric                                         psychiatric
crabs                                              carbs
synergist                                          Synagis
callous                                            callus
sliver                                             silver
circumscribed                                      circumcised
calibers                                           calipers
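
One way to put Table 5 to work is to keep a small lookup of known substitutions and use it to generate extra search phrases. The Python sketch below is only an illustration: the dictionary holds a few rows from Table 5, and the function name and structure are hypothetical, not an EMERSE feature.

```python
# A few correct-word -> wrong-word pairs taken from Table 5.
KNOWN_SUBSTITUTIONS = {
    "breast": ["beast"],
    "prostate": ["prostrate"],
    "hoarse": ["horse"],
    "unkempt": ["unkept"],
    "callus": ["callous"],
}


def expand_with_substitutions(phrase, substitutions=KNOWN_SUBSTITUTIONS):
    """Return the phrase plus variants with one word swapped for a known wrong word."""
    words = phrase.lower().split()
    variants = [" ".join(words)]
    for index, word in enumerate(words):
        for wrong_word in substitutions.get(word, []):
            swapped = words[:index] + [wrong_word] + words[index + 1:]
            variants.append(" ".join(swapped))
    return variants


print(expand_with_substitutions("hoarse voice"))     # ['hoarse voice', 'horse voice']
print(expand_with_substitutions("prostate cancer"))  # ['prostate cancer', 'prostrate cancer']
```

Each returned phrase could then be entered as its own search term, ideally in the same color so that the variants are treated as alternatives.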

Understand the limitations of EMERSE

This has already been described in the section on Terms, but it is worth pointing out again. EMERSE ignores multiple characters such as ! ? . > < + - #. This means that some searches will not work. For example, BRCA1- will be interpreted by EMERSE as simply BRCA1, which means that although you can still search for BRCA1, you cannot distinguish between BRCA1- and BRCA1+. (If the phrase is "BRCA1 negative", or a similar variation, then EMERSE will be able to find the concept.) Similarly, "150 pounds" when written as "150 #" will simply be interpreted as 150, and a blood pressure written in the form of 120/90 will be interpreted the same as 120 90, meaning that you cannot use the slash symbol (/) as a way to look for a pattern that appears to be a blood pressure.
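
The effect of ignored characters can be approximated with a short Python sketch. This is not EMERSE's actual text analysis. The character set below covers only the symbols mentioned above (plus the slash), and the function name is hypothetical. It is meant only to show why some distinct-looking terms end up identical.

```python
import re

# Assumed set of ignored symbols, based only on the examples in this section.
IGNORED_CHARACTERS = r"[!?.><+\-#/]"


def normalize(term):
    """Replace ignored symbols with spaces, collapse whitespace, and lowercase."""
    without_symbols = re.sub(IGNORED_CHARACTERS, " ", term)
    return " ".join(without_symbols.lower().split())


print(normalize("BRCA1-"))  # brca1
print(normalize("BRCA1+"))  # brca1
print(normalize("150 #"))   # 150
print(normalize("120/90"))  # 120 90
```

Because `BRCA1-` and `BRCA1+` normalize to the same string, no search can tell them apart; the distinction has to be expressed in words (for example, "BRCA1 negative") for it to be searchable.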

Understand the limitations of search in general

A search engine like EMERSE is great for finding concepts quickly, but it does not have any understanding of what it found, or the context in which a term was found. That is why it is important to carefully review the results for accuracy. Even in cases where you might think no error is possible, it just might be possible. The following is a real example of what was found in a clinical note:

A search for mesenteric polyarteritis nodosa should also include the abbreviation for polyarteritis nodosa, which is PAN. Thus, an additional search term should be mesenteric PAN. By default EMERSE searches in a case-insensitive manner, so it will identify mesenteric pan as a 'hit'. However, the actual phrase highlighted was mesenteric pan niculitis, which is not the intended concept. In this case, an inadvertent space had been added into the word panniculitis.

Similarly, terms like ular could represent a misspelling of the word ulnar or part of a word that had an inadvertent space such as ventric ular or fib ular.

Another example:

Suppose you are searching for lupus, so you include the abbreviation SLE, which is a common abbreviation for systemic lupus erythematosus. The system will highlight something like:

50 YEAR OLD FE, SLE WITH HISTORY OF…​.

In this case, it looks like there are two common medical abbreviations: [1] FE (iron), and [2] SLE (systemic lupus erythematosus). However, upon careful inspection, neither of those two abbreviations is correct. In fact, the word should have been FEMALE, but the person who typed the word accidentally shifted two of the letters to the right on the keyboard (, is directly to the right of m and s is directly to the right of a on a keyboard). Thus, what looks like two clinical concepts is really just the term female.

As a general rule of thumb, if you have found what you were looking for, and have verified that it is truly the correct concept, then consider that to be a success. If you haven’t found what you were looking for, the concept may not be there at all, or it may be that you just haven’t figured out the unusual, or incorrect, way in which a term or concept was phrased.

Consult clinical experts

If possible, talk to someone who has clinical expertise in the area for which you are conducting your searches. They might be able to provide guidance about how certain concepts are likely to be phrased in the clinical notes.

Use the minimum necessary terms, sometimes

Queries that are too long or specific will generally not work well. For example, if you are looking for a positive depression screen based on a PHQ-9 test, do not search for "positive depression screen based on a PHQ-9 test". Instead it might be better to search for PHQ-9 since that alone will help narrow down the results because it is already very specific. This could perhaps be combined with terms such as positive to help narrow down the results. Or, consider a proximity search where PHQ-9 and positive are searched within 3 words of each other to help narrow things down further.

With more general terms you may need to add more terms to help narrow the results. For example, if you are looking for type 1 diabetes and you just search for the word diabetes, you will end up with far too many irrelevant hits, because most of the mentions will relate to type 2 diabetes, which is much more common.

In other words, you will need to come up with a strategy for how to construct your search queries with search terms, and this will usually be very customized to your specific task.

Indeed, there may be times when using longer and more specific phrases can be useful. An example of this might be to look for wording used in a specific note template or report template, which can be useful when trying to identify specific notes needed for data abstraction. In such a situation, the specific text surrounding a concept of interest may be needed to narrow down the results to the desired one. The example below identifies very specific language used in a test result reported by the Mayo Clinic Medical Laboratories. Searching for this very specific language, which does not appear to vary between reports, helps to identify the actual lab report and not the mention of HIV-1 elsewhere in the notes.

This test was performed using the Abbott RealTime HIV-1 Qualitative assay (Abbott Molecular, Inc., Des Plaines, IL). This test was developed and its performance characteristics determined by Mayo Clinic in a manner consistent with CLIA requirements. This test has not been cleared or approved by the U.S. Food and Drug Administration.

Search in stages

Sometimes it may not be an effective strategy to find your patients of interest with a single query. It may require several queries. For example, you may want to first identify your patient population using the Find Patients search mode just by searching for a rare disease. Once you have identified those potential patients, you can move them to a Patient List and then search them for more details (for medications, side effects, etc.) in a subsequent query using the Highlight Documents search mode.

Another example: Imagine you want to find a set of patients who have fibromyalgia and back pain but your query with the Find Patients mode returns almost none. This may be because with Find Patients the system is looking for documents that have the terms fibromyalgia and back pain in the same document. But what if the patients have these terms mentioned in different documents? One solution, among several, would be to first conduct a search for fibromyalgia using Find Patients and then save the result as a Saved Patient List. Then conduct another, separate search for back pain using Find Patients, and save this result as a second Saved Patient List. Then, use the Compare Patient Lists feature and select the patients that belong to both lists. The resulting set of patients would be ones that contain the terms fibromyalgia and back pain, but not necessarily within the same document. This smaller subset of patients could then be saved as a new list and searched further using the Highlight Documents feature.
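
Conceptually, selecting the patients that belong to both lists is a set intersection. The Python sketch below illustrates that logic only. The medical record numbers are made up, and the function is hypothetical rather than part of EMERSE, where this step is done through the Compare Patient Lists feature.

```python
def patients_in_both(list_a, list_b):
    """Return the MRNs found in both lists, keeping the order of the first list."""
    in_second_list = set(list_b)
    already_added = set()
    shared = []
    for mrn in list_a:
        if mrn in in_second_list and mrn not in already_added:
            already_added.add(mrn)
            shared.append(mrn)
    return shared


# Made-up MRNs standing in for two Saved Patient Lists.
fibromyalgia_list = ["100234", "100877", "101532"]
back_pain_list = ["100877", "102001", "101532"]

print(patients_in_both(fibromyalgia_list, back_pain_list))  # ['100877', '101532']
```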

Use multiple data resources

EMERSE was designed to work with unstructured, free text data. These clinical documents can be extremely valuable for finding the needed information, much of which might not be present anywhere else in the medical record. However, there are times when using structured data makes sense. For example, if you want to find all patients with a hemoglobin less than 6, you can’t type that into EMERSE and expect to get any sensible results. It would be better to use a separate system designed for structured data (e.g., DataDirect at the University of Michigan, or the widely used i2b2 Workbench). It may also make sense to use structured data such as ICD-10 billing codes as a first pass attempt to narrow down a patient cohort, and then search that subset of patients within EMERSE to do additional verifications within the clinical notes.

A patient list generated from a different system can be imported into EMERSE, or one can simply copy and paste a column of medical record numbers from an Excel sheet containing a list of patients generated in a different system. EMERSE also provides the technical capabilities for importing a list directly from another software system, but this functionality will depend on how it has been locally installed, configured, and connected to other systems.

It is best to think of EMERSE as being one tool among many in a toolbox of useful software to support your work. It is unlikely that any single tool will enable you to complete your task, but using multiple tools together will help you efficiently get your work done.

Plan your data entry with data extraction/search in mind

While this may not often be possible, if you have any control over how data are initially entered into the electronic health record, you can use strategies to help identify the note or concept later when searching. One example is to use standardized phrases that can be used in search later on that will positively identify only the notes of interest. Also, using more standardized language helps with ensuring that no unusual variations in wording are missed with the search.

Give thought to how you might want to identify and abstract data later on. It may be useful, for example, to include a unique "note identifier" in a note template, so that searching for just that identifier will identify all of the notes of interest. You can use something distinctive like stntv25; it can even be simple phrases like Strep throat note template version 2.5, as long as it is unique to that set of notes.

The Food and Drug Administration (FDA) has enabled just this type of searching with drug biosimilars. These drugs are required to have a generic name followed by a "distinguishing suffix that is devoid of meaning and composed of four lowercase letters" (Reference: Nonproprietary Naming of Biological Products). Examples include: emicizumab-kxwh, vestronidase alfa-vjbk, filgrastim-vkzt, filgrastim-sndz. Thus, searching for even the four-letter suffix (kxwh, vjbk, vkzt, sndz) should positively identify the drug in most cases.

Be creative

Sometimes it helps to think 'outside the box' when conducting searches. For example, you may be interested in identifying patients based on structured ICD-10 codes. If your clinical notes contain such codes within the text of the document, they can be searched just like searching for any other word or phrase. If you’re interested in finding Ehlers-Danlos syndrome, which has an ICD-10 code of Q79.6, just search for that term and it will find any mention of that code in the notes. (Of course, since the dot (.) is ignored, you could simply search for Q79 6 to get the same result.) The main thing that you cannot do, as already mentioned above, is search for numeric data using comparators such as greater than (>), less than (<) or equal to (=).

Network

Network Background

The EMERSE Network has been available since EMERSE version 6.3. If you do not have access to this feature, it may be because you do not have permission from your local EMERSE administrator to use the Network feature, or your site has chosen not to join the network.

The general idea behind the Network is to run searches for patient counts not only at your own site, but at other participating sites across the country as well. If you are looking to plan a clinical trial involving patients that have a rare disease that might only be mentioned in the notes (e.g., no specific ICD-10 code) this could be a very good way to estimate how many patients exist at other sites, and then plan to form a collaboration with investigators at those other sites.

There are many security mechanisms built into the Networking capabilities, but the most important one is that no actual patient data ever travels across the Network. By using the Network you cannot see clinical notes from other EMERSE sites, and other sites cannot see your local notes. Instead, the only information passed across the network is obfuscated counts. Obfuscation means that the real count is not provided. Instead, an approximate count within a reasonable range is provided. This is done to prevent users from trying to identify specific patients by manipulating the search query to change the counts. Another important security mechanism is that sites can set a threshold below which even counts are not reported. In such a case the site might return a number such as < 10.
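
The general idea of an obfuscated count with a reporting threshold can be sketched as follows. This guide does not describe the obfuscation method EMERSE actually uses, so everything here is an assumption made for illustration: the function name, the default threshold of 10, the size of the random offset, and the choice to return a display string.

```python
import random


def obfuscate_count(true_count, threshold=10, max_offset=3, rng=None):
    """Return an approximate count as text, hiding small counts entirely."""
    rng = rng or random.Random()
    if true_count < threshold:
        # Below the site's threshold, not even an approximate count is reported.
        return f"< {threshold}"
    noisy_count = true_count + rng.randint(-max_offset, max_offset)
    # Never let the noise push a reported count below the threshold.
    return str(max(noisy_count, threshold))


print(obfuscate_count(4))   # < 10
print(obfuscate_count(50))  # some value between 47 and 53
```

With this kind of scheme, sites holding identical data would usually report slightly different numbers for the same query.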

To reiterate, using the network can provide you with approximate counts of how many patients at each participating site have the terms of interest. You cannot see any more details about those patients, and this is intentional.

This Network is conceptually similar to other systems (such as i2b2/ACT) that perform related functions for structured data, that is, for data such as billing codes, medication codes, lab codes, etc. EMERSE differs in that it allows for network searches over free text data.

While we do not think it would be possible, no one should ever use the Network to try to deduce the identity of an individual.

Using the Network

Using the network should be straightforward. Simply click on the Network tab in the Patients section. You will be presented with a table showing all of the participating sites. Simply check the boxes next to the sites that you want to include in your search.

Selecting Network Sites
Figure 55. This screenshot shows how to select sites for a network search. This example shows 3 sites selected. Because this is a network of demonstration systems, each happens to have the same number of patients.

Note that currently the Filters will not work across the Network except for the date filter. In the future this may become more functional. Once the sites are selected, click on the Search Network button.

Each time a user logs in, the first time that a Network search is attempted, a dialog will appear with the Terms of Use for running a network search. You must agree to these Terms of Use to proceed.

Network Terms of Use
Figure 56. Screenshot showing the Network Terms of Use

Once you agree to the Terms of Use the query will run. Other sites will be contacted and the results will be displayed on the screen in real-time as they arrive.

Network Results
Figure 57. Screenshot showing the Network Results. Note that this image shows three sites from a demonstration network. Even though each site on the network has exactly the same dataset, the numbers returned are all different from one another because of the obfuscation that was applied in order to hide the true number.

A few other important points to note:

  • If you run too many queries in too short a time span, you will be blocked from using the network. This is one of the security features built in.

  • All queries sent across the Network are logged (including the specific user and originating site), and all participating sites have their own copy of the logs.

  • We currently do not have a mechanism for contacting other sites if an interesting patient cohort is identified. This is a known challenge with research networks.

Application Programming Interfaces (APIs)

This section is meant for highly technical groups who have a programmer/developer on the team.

Application programming interfaces (APIs) are available to teams needing to access the underlying EMERSE data store, which is based on Apache Solr. This might be useful for groups that want to take advantage of the very fast data retrieval mechanisms of Solr but need to retrieve the documents outside the constraints of the user interface so that additional processing/text mining can be done. The API access points should be locked down and secured by your IT Administrators, and you will need to be provisioned with a username/password for access.

There are two APIs potentially available:

  1. The standard Solr API, which comes built into Apache Solr, the search platform that underlies the EMERSE system. The standard Solr API does not maintain audit logs of access or of what notes were extracted, which makes it a poor choice for security/HIPAA-compliance. A simple way to leverage the API is to use curl, a command-line tool for making HTTP requests that is available on most operating systems (most programming languages also provide an equivalent HTTP client). Using curl you can form an HTTP query, with the results returned in JSON or XML format. Depending on how the query is formed, you can get back the medical record numbers (MRN), document metadata, or even the contents of the document itself. Remember that if the clinical document is in native HTML format, you may need to have a further cleanup process (e.g., strip HTML tags) to make the document usable for your projects. Note that there is no way to access the EMERSE resources such as Synonym suggestions when using the Solr API. Details about utilizing this API can be found in the Apache Solr documentation section on Client APIs.

  2. The EMERSE API, which is built on top of Solr, but has additional capabilities that generally make it a better choice. It also maintains audit logs of access. We have an entire guide dedicated to the EMERSE API. For additional details, see the EMERSE Application Programming Interface (API) Guide.
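
As a rough illustration of the first option, the Python sketch below builds a query against Solr's standard `select` request handler, fetches the matching documents as JSON, and strips HTML tags from a note. It is a minimal sketch under several assumptions: the host name, the core name (`documents`), the field names (`MRN`, `ID`, `RPT_TEXT`), and the use of HTTP Basic authentication are placeholders that will differ at your site, so check with your local EMERSE administrator for the real values.

```python
import base64
import json
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_select_url(base_url, core, query, fields, rows=10):
    """Build a URL for Solr's /select handler, asking for a JSON response."""
    params = urlencode({"q": query, "fl": ",".join(fields), "rows": rows, "wt": "json"})
    return f"{base_url}/solr/{core}/select?{params}"


def fetch_documents(url, username, password):
    """Run the query with Basic authentication (an assumption) and return the docs."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = Request(url, headers={"Authorization": f"Basic {token}"})
    with urlopen(request) as response:
        payload = json.load(response)
    return payload["response"]["docs"]


class _TextExtractor(HTMLParser):
    """Collect only the text content of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(html_text):
    """Remove HTML tags and collapse runs of whitespace."""
    extractor = _TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    return " ".join("".join(extractor.chunks).split())


# Hypothetical host, core, and field names; replace with your local values.
url = build_select_url(
    "https://solr.example.org:8983",
    "documents",
    'RPT_TEXT:"breast cancer"',
    ["MRN", "ID"],
    rows=5,
)
print(url)
print(strip_html("<p>History of <b>breast cancer</b>.</p>"))  # History of breast cancer .
```

Calling `fetch_documents(url, username, password)` would then return a list of dictionaries, one per matching document, containing only the requested fields. Keep in mind that requests made this way are not recorded in EMERSE audit logs, which is one reason the EMERSE API is generally the better choice.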

Frequently Asked Questions

Can I export the data?

No. EMERSE was not designed for data export. It was designed to keep the data centralized and secure, but allow for easy access. If you need to download the data, consider using the Application Programming Interfaces (APIs).

You can, however, Export patient lists with the associated Comments and Tags if they were added. Access to the Export feature will depend on your specific privileges within the EMERSE system.

I can’t get into EMERSE, what am I doing wrong?

There are multiple reasons why you may be having trouble getting into the EMERSE application. A few are outlined below.

You haven’t been granted access

Or your access has been revoked. EMERSE contains patient data, so it is not the kind of system you can just log onto without someone providing an account. Talk to your system administrators if you think that is the issue.

You are not on the right network

If you can’t even see the login page (because the page is not found), this may be the reason you’re having trouble. In general, EMERSE will be available on a secured network available only to individuals on that internal, trusted network. There may be ways to get to this network from the outside using virtual private network (VPN) software. This is an issue to bring up to your local IT support teams.

How can I filter an existing patient list to include/exclude only those that mention a certain finding?

There may be times when you have a Patient List from another source, and now you want to look through the list, but only retain patients who have a mention of a certain finding (for example, "fibromyalgia"). The easiest way to do that is to use the Find Patients feature and apply it to your existing Saved Patient List. After identifying the patients, simply Tag them and then clear the untagged patients using the Clear/Delete option for the Saved Patient List.

I’m getting an error message and can’t proceed, what can I do?

If you receive an error message, it might be because the system has been updated and your local browser cache needs to be cleared. It is often helpful to clear your browser cache when running into problems/errors, since that will often fix the issue. If not, feel free to report the issue to the EMERSE team.