Overview
This guide covers issues surrounding the initial setup of the system, specifically the customization of patient and document data sources.
Before Getting Started
Before getting started, it is important to note a few things.
You must provide EMERSE both patient data and document data. Patient data is data describing the actual patient, such as the patient name and MRN. Document data is data describing the medical records or other records connected to a patient.
EMERSE has two places to store this imported data: (1) in the Oracle database, and (2) in Solr’s indexes. Patient documents are stored exclusively in Solr’s indexes. Patient data is stored in the Oracle database, but EMERSE periodically copies this data into Solr indexes to aid in searches.
Finally, there is data that EMERSE generates and uses on its own, such as patient lists, term bundles, audit logs, and user account information. You will generally not need to change this data directly in the database, as it can be changed through the user interface. This guide focuses only on the imported data described above. However, a full data dictionary is available if more details are desired.
The data for the documents and document indices are not stored within the relational database. Instead, these are managed by Solr in its own data store, which can be on a separate server from the Oracle database.
It is also important to note that as you make changes to the database tables described here, some changes might be reflected immediately within the user interface, whereas others might not. If you run into trouble when making modifications, a good first step is to restart everything and then see if the changes have taken place. For example, if you make modifications to the tables and then add clinical documents to the document index, those changes may not be reflected until the index has been closed and re-opened (or Solr is restarted). Similarly, if patients are added to the database table, they are typically only copied to the corresponding Solr index once each day through a scheduled job (details in the Configuration Guide). It is possible to force this to happen more frequently when needed (potentially useful during initial setup and testing); the steps are detailed in the Troubleshooting Guide.
Patient Demographics
PATIENT table
Table: PATIENT
Population: From external source (such as EHR)
Population Frequency: Can be variable, but once per day is reasonable
The EMERSE schema includes a patient table with medical record number (MRN), name, date of birth, and other demographic information which is displayed in the search results. Data in this table are used to display the patient name, validate user-entered or uploaded MRNs and to calculate current ages of the patients. Other demographic data are used to summarize the populations found in a search.
Although the coded demographics information is not required, some features, such as the demographic breakdowns within the All Patients search feature, will not work if sex, race, and ethnicity are not populated.
For every document indexed, there must be a corresponding patient in the patient table with a medical record number (MRN) that matches the document. This should be taken into consideration when determining the frequency for updating this table.
Column name | Description | Required or Optional |
---|---|---|
id | Primary Key | Required |
external_id | Medical Record Number | Required |
first_name | First Name | Required |
middle_name | Middle Name | Optional |
last_name | Last Name | Required |
birth_date | Birth Date (used to calculate current age) | Required |
sex_cd | Sex | Optional |
language_cd | Language | Optional |
race_cd | Race | Optional |
marital_status_cd | Marital Status | Optional |
religion_cd | Religion | Optional |
zip_cd | ZIP code | Optional |
create_date | Date the row was created. Can be used to track changes to the table. | Optional |
update_date | Date the row was updated. Can be used to track changes to the table. | Optional |
deleted_flag | Logical delete flag. Useful for merged patients. Valid values are 1 = yes, deleted; 0 = no, not deleted | Required |
deceased_flag | Currently not used. Valid values are: 1 = yes, deceased; or 0 = no, not deceased | Required |
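The requirement that every indexed document has a matching patient can be checked before indexing. The following is a minimal sketch, not part of EMERSE: the function name and the sample MRNs are hypothetical, and it assumes you have already pulled the document MRNs and the PATIENT external_id values into Python lists.

```python
def find_orphan_mrns(document_mrns, patient_mrns):
    """Return the sorted MRNs that appear on documents but not in PATIENT.external_id."""
    return sorted(set(document_mrns) - set(patient_mrns))


# Hypothetical sample data: MRNs on documents about to be indexed,
# and external_id values currently present in the PATIENT table.
document_mrns = ["100001", "100002", "100002", "100009"]
patient_mrns = ["100001", "100002", "100003"]

orphans = find_orphan_mrns(document_mrns, patient_mrns)
print(orphans)  # ['100009']
```

Any MRN reported here would need a row in the PATIENT table before its documents are indexed.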
Many of the columns in the patient table use codes for their values: sex, race, ethnicity, etc. Although these values are not constrained by the database, the UI can display descriptions for them in the patient demographics areas, and they are used in the bar charts that break down sex, race, and gender in the "All Patient Search" feature. The lookup tables for these codes are:
- LKP_PATIENT_RACE
- LKP_PATIENT_GENDER
- LKP_PATIENT_SEX
- LKP_PATIENT_MARITAL_STATUS
- LKP_PATIENT_RELIGION
- LKP_PATIENT_ETHNICITY
These tables all have the same structure:
Column | Description |
---|---|
DESCRIPTION | The description that is shown in the User Interface |
CODE | The coded value in the patient table |
The EMERSE distribution has default codes already in the tables, but it is important to make sure that these codes match what your own local institution uses. If they do not, update these tables with your local codes and text descriptions.
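One way to confirm that local codes match the lookup tables is to list the codes used in the patient table that have no lookup row. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for Oracle, the table contents are made up, and the helper name `unmapped_codes` is hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle schema (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, external_id TEXT, sex_cd TEXT);
    CREATE TABLE lkp_patient_sex (code TEXT, description TEXT);
    INSERT INTO patient VALUES (1, '100001', 'F'), (2, '100002', 'M'), (3, '100003', 'X');
    INSERT INTO lkp_patient_sex VALUES ('F', 'Female'), ('M', 'Male');
""")


def unmapped_codes(conn, column, lookup_table):
    """Return distinct patient codes in `column` that have no row in `lookup_table`.

    `column` and `lookup_table` are interpolated into the SQL text, so they
    must be trusted identifiers, never user input.
    """
    sql = f"""
        SELECT DISTINCT p.{column}
        FROM patient p
        LEFT JOIN {lookup_table} l ON l.code = p.{column}
        WHERE p.{column} IS NOT NULL AND l.code IS NULL
        ORDER BY p.{column}
    """
    return [row[0] for row in conn.execute(sql)]


missing = unmapped_codes(conn, "sex_cd", "lkp_patient_sex")
print(missing)  # ['X']
```

The same check can be repeated for each coded column and its lookup table.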
An example of how these codes are used within EMERSE can be found in our Virtual Machine Guide.
The data in the Patient table are automatically copied to a Solr index, by default once per day. This is detailed in the Configuration Guide.
Research Studies and Attestation
Immediately after each login, every user is required to ‘attest’ to their use of EMERSE for that session by specifying their reason for using the system. This is done on the ‘Attestation’ page, and the results are stored in the SESSION_ATTESTATION table. EMERSE provides three options (configurable by a system administrator) for this attestation: (1) a free text box, (2) ‘Quick Buttons’ for choosing pre-selected options that are commonly used (for example, “Quality Improvement”, “Patient Care”, “Infection Control”, etc.), and (3) a table of research studies with which the user is associated. The tables RESEARCH_STUDY_ATTESTATION and ATTESTATION_OTHER may contain additional information, depending on whether the user specifies a research study or another reason. The free text option is intended for cases where no other attestation choice is reasonable. Additionally, previously used entries from the free text box will appear in the table, along with any IRB-approved studies, for the user’s convenience.
For our implementation at Michigan Medicine, we pull data on all studies in the IRB system, even if the study/person is not currently a part of EMERSE. This is because the dataset is generally small, and having the data already populated makes it easier for users to validate their studies once they are given an EMERSE account.
RESEARCH_STUDY table
Table: RESEARCH_STUDY
Population: Populated from external source such as an electronic IRB system
Population Frequency: Can be variable, but once per day is reasonable
If a user is required to select their study from the table, then delays in moving IRB data to EMERSE after IRB approval can result in delayed access for that user.
This table contains information about research studies. Using this table and RESEARCH_STUDY_MEMBER allows EMERSE to show a list of studies the end user is associated with.
Column name | Description | Required or Optional |
---|---|---|
id | Primary Key | Required |
external_id | IRB study number. Used to link specific studies to usage, and is very helpful for tracking research usage | Required |
study_name | Name of the study | Required |
principal_investigator_name | Name of the principal investigator | Required |
prin_invest_org_id | ID of the principal investigator. Not currently used by EMERSE. This could be a user ID or email, but it is a good idea to ensure it is unique. | Optional |
expiration_date | Expiration date of study. Used to determine if a user should be allowed to proceed. If the expiration date is older than the current date, the user will not be able to select it in the attestation GUI. | Required |
project_status | Current project status. This is used to track where a study is in the review and approval process. Only certain study statuses allow access to EMERSE for research. The statuses that allow a study to be selected during attestation are defined in the VALID_RES_STUDY_STATUS table. | Required |
last_updated | A last updated date is not used by EMERSE, but can be useful for troubleshooting and tracking changes to the table. | Optional |
begin_date | This originally referred to the date the study began or should be allowed to begin. This field can be used for tracking and troubleshooting. | Optional |
VALID_RES_STUDY_STATUS table
Table: VALID_RES_STUDY_STATUS
Population: By System Admin. Only needed if research studies need to be validated
Population Frequency: May only need to be done once, at the time of system setup. May need periodic updates if the source data (such as from IRB system) defining study status is changed.
EMERSE contains a simple table defining study statuses. The statuses that are initially populated in the system (loaded by the build script) are unique to Michigan Medicine; that is, they were developed locally and are implemented in our separate electronic IRB tracking system. Other implementations would need their own set of valid statuses if these are to be used to validate and approve usage for research. If the status of a research study is not in this table, EMERSE will not allow the study to be used for attestation; that is, the study would not even be displayed to the user to select.
Column name | Description | Required or Optional |
---|---|---|
status | A list of study statuses that EMERSE considers valid in terms of allowing a user to proceed. These statuses are generally defined by the IRB and are universal across studies. | Required |
VALID_RES_STUDY_STATUS Table Example:
Status |
---|
Exempt Approved - Initial |
Approved |
Not Regulated |
Exempt Approved - Transitional |
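The two conditions described above (the study has not expired, and its status appears in VALID_RES_STUDY_STATUS) can be expressed as a small predicate. This is a sketch of the rule as described in this guide, not EMERSE's actual code: the function name is hypothetical, the status set is a sample, and treating a study that expires today as still selectable is an assumption.

```python
from datetime import date

# Sample statuses standing in for the rows of VALID_RES_STUDY_STATUS.
VALID_STATUSES = {"Approved", "Not Regulated"}


def is_selectable(expiration_date, project_status, today, valid_statuses=VALID_STATUSES):
    """Return True if a study could be offered on the Attestation page.

    A study is excluded when its expiration date is older than `today`
    or when its status is not in the valid-status set.
    """
    not_expired = expiration_date >= today  # assumption: expiring today still counts
    return not_expired and project_status in valid_statuses


today = date(2024, 6, 1)
print(is_selectable(date(2025, 1, 1), "Approved", today))      # True
print(is_selectable(date(2024, 5, 31), "Approved", today))     # False: expired
print(is_selectable(date(2025, 1, 1), "Under Review", today))  # False: status not valid
```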
RESEARCH_STUDY_MEMBER table
Table: RESEARCH_STUDY_MEMBER
Population: Populated from external source such as an electronic IRB system
Population Frequency: Can be variable, but once per day is reasonable
This table contains information about study team members, and is related to the RESEARCH_STUDY table, described above. Each study can have one or many study team members.
This table at Michigan Medicine contains information on all study team members for all studies, whether they have an EMERSE account or not.
Column name | Description | Required or Optional |
---|---|---|
RESEARCH_STUDY_ID | Foreign key reference to row | Required |
USER_ID | Foreign key reference to row in | Required |
ROLE_NAME | A string describing a person’s role on the study team, e.g., “PI”, “Staff”, “Study Coordinator”. This can be useful when generating usage reports. | Optional |
FIRST_NAME | First name of the user who is on the study. It is currently populated from the source IRB system, but it is not used at all by EMERSE. Nevertheless, it may be useful when generating reports. | Optional |
LAST_NAME | Last name of the user who is on the study. It is currently populated from the source IRB system, but it is not used at all by EMERSE. Nevertheless, it may be useful when generating reports. | Optional |
BEGIN_DATE | This is not currently used by EMERSE. | Optional |
LAST_UPDATED | Date row was last updated | Optional |
DELETED | Flag to indicate if the record has been logically deleted. 0 = false, not deleted; 1 = true, deleted. | Required |
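The relationship between RESEARCH_STUDY and RESEARCH_STUDY_MEMBER can be illustrated with a join that lists the studies for one user. The sketch below uses an in-memory SQLite database as a stand-in for Oracle; the sample rows, the text-valued user IDs, and the helper name `studies_for_user` are assumptions for illustration, and the real query used by EMERSE may differ.

```python
import sqlite3

# In-memory SQLite stand-in with only the columns needed for the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE research_study (
        id INTEGER PRIMARY KEY, external_id TEXT, study_name TEXT);
    CREATE TABLE research_study_member (
        research_study_id INTEGER, user_id TEXT, role_name TEXT, deleted INTEGER);
    INSERT INTO research_study VALUES
        (1, 'HUM0001', 'Sepsis cohort'),
        (2, 'HUM0002', 'Asthma registry');
    INSERT INTO research_study_member VALUES
        (1, 'jdoe', 'PI', 0),
        (2, 'jdoe', 'Staff', 1),
        (2, 'asmith', 'Study Coordinator', 0);
""")


def studies_for_user(conn, user_id):
    """Return (IRB number, study name) for studies where the user is an active member."""
    sql = """
        SELECT s.external_id, s.study_name
        FROM research_study s
        JOIN research_study_member m ON m.research_study_id = s.id
        WHERE m.user_id = ? AND m.deleted = 0
        ORDER BY s.external_id
    """
    return conn.execute(sql, (user_id,)).fetchall()


print(studies_for_user(conn, "jdoe"))  # [('HUM0001', 'Sepsis cohort')]
```

Here the logically deleted membership (DELETED = 1) is excluded, so the user sees only the first study.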
SESSION_ATTESTATION table
Table: SESSION_ATTESTATION
Population: Used internally by EMERSE
Population Frequency: In real time by EMERSE
Each time a user attests to why they are using EMERSE, a row is inserted into this table, which is one of the audit tables. Attestations related to research can be joined to the RESEARCH_STUDY_ATTESTATION table. Non-research uses can be joined to the ATTESTATION_OTHER table.
Column name | Description | Required or Optional |
---|---|---|
id | Primary Key | N/A (populated internally by EMERSE) |
type | A string indicating the top level category of attestation. | N/A (populated internally by EMERSE) |
User_session_id | A foreign key reference to the | N/A (populated internally by EMERSE) |
OTHER_ATTESTATION_REASON table
Table: OTHER_ATTESTATION_REASON
Population: By System Admin. Only needed if commonly used text reasons are needed as quick buttons in the application
Population Frequency: May only need to be done once, at the time of system setup.
For non-research attestations, there is a lookup table called OTHER_ATTESTATION_REASON that lists available options. These can be configured by each institution, and may include commonly used access reasons that don’t involve research (such as quality improvement, patient care, etc.). These options (other than the free text reason) can be used to populate “quick buttons” that provide a simple way for a user to click on one of the common reasons for use.
Column name | Description | Required or Optional |
---|---|---|
USER_KEY | Text-based primary key of this table. The column name might better be thought of as 'reason key'. | Required |
DESCRIPTION | The text description that will be displayed in the Quick Buttons section of the Attestation page. | Required |
DELETED_FLAG | Has this reason been deleted? (0 = no; 1 = yes) | Required |
DISPLAY_ORDER | Order of display in the UI. Can be any integer, but should be unique per row. The buttons are ordered by this column via SQL sort. Generally start with 0, 1, 2, etc. | Optional |
OTHER_ATTESTATION_REASON Table Example:
USER_KEY | DESCRIPTION | DELETED_FLAG | DISPLAY_ORDER |
---|---|---|---|
QI | Quality Improvement | 0 | 0 |
RVPREPRES | Review Preparatory to Research | 0 | 1 |
STDYDESC | Study involving only decedents (deceased patients) | 0 | 2 |
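To make the roles of DELETED_FLAG and DISPLAY_ORDER concrete, the sketch below derives the button labels from rows like those in the example table. It is an illustration of the described behavior only: the function name is hypothetical, and the fourth row (a deleted reason) is made up to show the filter.

```python
def quick_buttons(reasons):
    """Return Quick Button labels: non-deleted reasons, sorted by display_order."""
    active = [r for r in reasons if r["deleted_flag"] == 0]
    return [r["description"] for r in sorted(active, key=lambda r: r["display_order"])]


reasons = [
    {"user_key": "STDYDESC", "description": "Study involving only decedents (deceased patients)",
     "deleted_flag": 0, "display_order": 2},
    {"user_key": "QI", "description": "Quality Improvement",
     "deleted_flag": 0, "display_order": 0},
    {"user_key": "RVPREPRES", "description": "Review Preparatory to Research",
     "deleted_flag": 0, "display_order": 1},
    # Hypothetical retired reason, not part of the example table above.
    {"user_key": "OLD", "description": "Retired reason",
     "deleted_flag": 1, "display_order": 3},
]

for label in quick_buttons(reasons):
    print(label)
```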
ATTESTATION_OTHER table
Table: ATTESTATION_OTHER
Population: Used internally by EMERSE
Population Frequency: Application dependent
The free text reasons that users enter are stored in a table called ATTESTATION_OTHER. This is populated by EMERSE and is not customizable by users.
Column name | Description | Required or Optional |
---|---|---|
SESSION_ATTESTATION_ID | A unique ID for the session attestation. Used for audit logging. | Required |
FREE_TEXT_REASON | The free text reason that a user entered. | Required |
OTHER_ATTEST_REASON_KEY | This will currently only be populated by the system with | Required |
ATTESTATION_OTHER Table Example:
SESSION_ATTESTATION_ID | FREE_TEXT_REASON | OTHER_ATTEST_REASON_KEY |
---|---|---|
50208 | Testing out the system | FRETXT |
52060 | Testing out the system | FRETXT |
46051 | Looking up a patient in clinic | FRETXT |
71052 | infection control monitoring | FRETXT |
74107 | cancer registry operational work | FRETXT |
Clinical Documents
EMERSE search is enabled by the indexing of clinical text documents by Apache Solr. Documents in a clinical environment can come from a myriad of sources such as transcription, radiology, and pathology, or from an electronic health record. Normally the structure, data, and metadata related to these documents vary considerably between sources.
To simplify things, we configure Solr with a single document schema containing all fields from all sources. This requires that documents from different sources use fields in a consistent way. For instance, if one source uses field X for purpose Y, another source must use field X only for purpose Y as well, or not at all. Certain essential elements, such as patient MRN, clinical date, document source key, and document text, are required to be mapped to certain fields and should not be configured differently.
The structure of clinical documents stored in Solr is described by three tables:
- DOCUMENT_SOURCE lists the data sources,
- DOC_FIELD_EMR_INTENT lists the purposes of fields (such as being the document text, being the document date, or being the unique document identifier),
- and finally the table DOCUMENT_FIELDS lists the fields of each data source, stating for each field what its Solr field name is and what the abstract purpose of the field is.
DOC_FIELD_EMR_INTENT also associates a Solr field with each abstract purpose (through the column DEFAULT_LUCENE_NAME), though these are not always used.
DOC_FIELD_EMR_INTENT table
Table: DOC_FIELD_EMR_INTENT
Population: Likely once at system setup
Population Frequency: May need updating as data sources change.
This table lists the abstract purposes of Solr fields across all document sources.
Each row is marked as required or optional. Required rows indicate that the Solr field (found in the DEFAULT_LUCENE_NAME column) must be used for that purpose across all data sources. Optional rows indicate that the Solr field name is found in the DOCUMENT_FIELDS table, not in the DEFAULT_LUCENE_NAME column in this table.
You can customize the value of the DEFAULT_LUCENE_NAME column only for two rows:
- CLINICAL_DATE
- LAST_UPDATED
This means all Solr documents for all data sources must use the following Solr field names:
- ID for the unique identifier for the document
- RPT_TEXT for the text of the document (what is searched)
- MRN for the medical record number of the document, linking the document to a patient in the PATIENT table
- RPT_TEXT_NOIC for the case-sensitive indexed version of RPT_TEXT.
Name | Description | DEFAULT_LUCENE_NAME (aka the Solr field) | Required or Optional |
---|---|---|---|
MRN | Patient medical record number, which is a unique patient identifier | MRN | Required |
RPT_ID | Unique document identifier. This must be unique across all documents and sources | ID | Required |
CLINICAL_DATE | Date when the clinical event occurred. Often this would be considered the "note date" or "document date". When displayed for users within EMERSE in the Summaries section, this is the default sort column with the most recent date shown at the top. | ENCOUNTER_DATE | Required, customizable |
LAST_UPDATED | Date when the document was last updated, since changes are sometimes made to documents | LAST_UPDATED | Required, customizable |
RPT_TEXT | The actual text of the clinical document. This field is used by Lucene for lower-case indexing (case-insensitive searching). | RPT_TEXT | Required |
RPT_TEXT_NOIC | A copy of the document text to be indexed using a case-sensitive Lucene filter (NOIC = NO Ignore Case) | RPT_TEXT_NOIC | Required |
TEXT | Any generic text field. Note that a document may have multiple of these types of generic text fields (e.g., clinical service, document type, clinician name, etc). This is useful when additional metadata are associated with the document and should be displayed. If this field is also defined in the Solr configuration it can become searchable in advanced search. Otherwise, it could still potentially be used to help filter queries based on additional metadata (e.g., 'study type'). | ignored | Optional |
DATE | Any generic date field, since a document may have more than one kind of date associated with it. Otherwise, it could still potentially be used to help filter queries based on additional metadata | ignored | Optional |
ENCOUNTER_ID | This is no longer used. It had been used for a time to search across all patients without limiting it to a set of medical record numbers. | | |
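Before sending documents to Solr, it can help to verify that each one carries the required field names. The following is a minimal sketch under stated assumptions: the function name and sample document are hypothetical, the date field names are parameters because those two rows are customizable, and RPT_TEXT_NOIC is left out of the check on the assumption that it is derived from RPT_TEXT inside Solr. Add it to the tuple if your documents supply it directly.

```python
# Field names that are fixed across all sources, per the list above.
FIXED_FIELDS = ("ID", "MRN", "RPT_TEXT")


def missing_required_fields(doc, clinical_date_field="ENCOUNTER_DATE",
                            last_updated_field="LAST_UPDATED"):
    """Return the required field names that are absent or empty in `doc`."""
    required = FIXED_FIELDS + (clinical_date_field, last_updated_field)
    return [name for name in required if not doc.get(name)]


# Hypothetical pathology document using CASE_DATE as its clinical date field.
doc = {
    "ID": "path-000123",
    "MRN": "100001",
    "RPT_TEXT": "Final diagnosis: benign nevus.",
    "CASE_DATE": "2019-03-14T00:00:00Z",
    "LAST_UPDATED": "2019-03-15T08:30:00Z",
}

print(missing_required_fields(doc, clinical_date_field="CASE_DATE"))  # []
print(missing_required_fields(doc))  # ['ENCOUNTER_DATE']
```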
DOCUMENT_SOURCE table
Table: DOCUMENT_SOURCE
Population: Likely once at system setup
Population Frequency: May need updating as data sources change.
Each source of documents (e.g., pathology, radiology, commercial EHR, legacy EHR, etc.) is listed as a row in the document_source table. The EMERSE application searches and displays the results based on document source. Additionally, advanced search queries can leverage these source data to limit queries to a specific source (e.g., searching only pathology reports). Document sources normally differ in their format and metadata depending on the source of origin. Each row in this table corresponds to a column in the Overview display within EMERSE, and to a subset of documents when a patient is selected.
Column name | Description | Required or Optional |
---|---|---|
SOURCE_KEY | A short name or abbreviation for the document source. This field needs to be unique as it is the primary key of the table, and search results are displayed on separate tabs for each source. However, this name is not displayed to the user but instead is used to match the name of the document source as defined in the Solr configuration, in the | Required |
USER_DESCRIPTION | A description for the source of documents. This field is used only internally and can be useful for system admins who set up EMERSE to provide a description of the | Required |
HTML_FLAG | If set to false ( | Required |
PRELIMINARY_DOC_FLAG | If it is possible that the source will have documents without text, this can be set to | Required |
DISPLAY_NAME | The name of the source as it is displayed in the UI (e.g., "Pathology", "Radiology", "Main EHR"). | Required |
CSS_DISPLAY_PREFIX | Prefix used internally by CSS components in the UI. This can be anything, but each source must have a unique | Required |
DISPLAY_ORDER | Order in which sources appear in the Overview and the tabs within the Summary results page. Each row should have a distinct display order. Start sequential numbering with | Required |
EXTERNAL_SOURCE | Currently not in use. May be used if documents need to be displayed externally, for example with a PDF viewer outside the browser, and will not be displayed using Solr’s copy of the document. ( | Required |
DOCUMENT_SOURCE Table Example:
source_key | user_description | html_flag | preliminary_doc_flag | display_name | css_display_prefix | display_order | external_source |
---|---|---|---|---|---|---|---|
epic | Primary EHR | 1 | 0 | Epic EHR | ehr | 0 | 0 |
rad | Radiology Documents | 0 | 0 | Radiology | rad | 1 | 0 |
path | Pathology Document | 0 | 0 | Pathology | path | 2 | 0 |
DOCUMENT_FIELDS table
Table: DOCUMENT_FIELDS
Population: Likely once at system setup
Population Frequency: May need updating as data sources change.
This table provides EMERSE with information about what fields are available in the underlying Solr index, their data type, and additional metadata. Each field indexed with Solr should exist in this table for each source system in the DOCUMENT_SOURCE table. The column EMR_INTENT is linked to the NAME column of the DOC_FIELD_EMR_INTENT mapping table. The column DOCUMENT_SOURCE_KEY is linked to the SOURCE_KEY column of the DOCUMENT_SOURCE table.
Each document source should have at least six rows in this table corresponding to the required purposes listed in DOC_FIELD_EMR_INTENT. The value of the SOLR_FIELD_NAME column of these rows should be exactly the value specified in the DEFAULT_LUCENE_NAME column of the matching row in DOC_FIELD_EMR_INTENT.
Additional fields can be specified using the generic EMR_INTENT options of TEXT or DATE. (These are the optional purposes in the DOC_FIELD_EMR_INTENT table.) These additional metadata fields are used by EMERSE for display in the UI but are not used for search. However, you must configure Solr’s index schema.xml to store these fields.
Some of the data defined here includes the document text itself, but also the metadata fields that will likely vary for each source system (e.g., authoring clinician, clinical service, document identifier, date of service, etc.). The metadata can be displayed (or hidden) in two basic places in the EMERSE UI:
- Within the Summaries table in the EMERSE UI that shows a listing of all documents for a single patient and a specific document source (referenced in the DOCUMENT_FIELDS table with the summary_display_flag column).
- Inside a small box in the EMERSE UI that shows document-specific metadata above a single document after a user drills down to view a document (referenced in the DOCUMENT_FIELDS table with the display_flag column).
The metadata displayed within the EMERSE UI can, for the most part, be ordered by using the display_order column defined in the DOCUMENT_FIELDS table. The display order applies to both places in the UI where the data can be displayed, but note that for each of these two locations the system can be set up to display or not display the metadata element. If the ordering is not listed correctly in the DOCUMENT_FIELDS table (e.g., two items are given the same ordering number), no error will occur for the user, but the actual order may be unpredictable.
Column name | Description | Required or Optional |
---|---|---|
SOLR_FIELD_NAME | Name that corresponds with the Solr document field. The names of the fields are specified in Solr | Required |
DATATYPE | Mainly used by the UI. Should be either | Required |
DISPLAY_ORDER | Order in which fields need to appear in the search results, either in the Summaries section of the UI or in a small box above a single displayed document. This should be unique among rows for each source but note that some elements (such as the text of the document itself) would not actually be displayed as a metadata element. | Required |
DISPLAY_NAME | Name that appears in the UI | Required |
EMR_INTENT | Specifies the purpose of the field. This refers to the values of the | Required |
DOCUMENT_SOURCE_KEY | Specifies the document type key from | Required |
DISPLAY_FLAG | Flag that controls if the field is displayed when the document is displayed. This display of metadata is in a small table above the document when an individual document is shown in the EMERSE UI, when a user drills down to view a complete document. ( | Required |
SUMMARY_DISPLAY_FLAG | Flag that controls if the field is displayed in the search results summary page, which would show up as a metadata column in the Summary results table. ( | Required |
DOCUMENT_FIELDS Table Example:
Shown below is an example document_fields table for three different document sources:
SOLR_FIELD_NAME | DATATYPE | DISPLAY_ORDER | DISPLAY_NAME | EMR_INTENT | DOCUMENT_SOURCE_KEY | DISPLAY_FLAG | SUMMARY_DISPLAY_FLAG |
---|---|---|---|---|---|---|---|
MRN | Text | 0 | MRN | MRN | epic | 0 | 0 |
RPT_TEXT | Text | 1 | Report Text | RPT_TEXT | epic | 0 | 0 |
RPT_TEXT_NOIC | Text | 2 | Report Text | RPT_TEXT_NOIC | epic | 0 | 0 |
ID | Text | 3 | Report ID | RPT_ID | epic | 1 | 1 |
LAST_UPDATED | Date | 4 | Last Updated | LAST_UPDATED | epic | 1 | 0 |
CASE_DATE | Date | 5 | Case Date | CLINICAL_DATE | epic | 1 | 1 |
MRN | Text | 0 | MRN | MRN | path | 0 | 0 |
RPT_TEXT | Text | 1 | Report Text | RPT_TEXT | path | 0 | 0 |
RPT_TEXT_NOIC | Text | 2 | Report Text | RPT_TEXT_NOIC | path | 0 | 0 |
ID | Text | 3 | Report Id | RPT_ID | path | 1 | 1 |
LAST_UPDATED | Date | 4 | Last Updated | LAST_UPDATED | path | 1 | 1 |
DR_NUM | Text | 5 | Doctor Num | TEXT | path | 1 | 1 |
CASE_DATE | Date | 6 | Collection Date | CLINICAL_DATE | path | 1 | 0 |
MRN | Text | 0 | MRN | MRN | rad | 0 | 0 |
RPT_TEXT | Text | 1 | Report Text | RPT_TEXT | rad | 0 | 0 |
RPT_TEXT_NOIC | Text | 2 | Report Text | RPT_TEXT_NOIC | rad | 0 | 0 |
ID | Text | 3 | Report ID | RPT_ID | rad | 1 | 1 |
LAST_UPDATED | Date | 4 | Last Updated | LAST_UPDATED | rad | 1 | 0 |
SVC_CD | Text | 5 | Service Code | TEXT | rad | 1 | 0 |
DR_NUM | Text | 6 | Doctor Num | TEXT | rad | 1 | 0 |
CASE_DATE | Date | 7 | Report Date | CLINICAL_DATE | rad | 1 | 1 |
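Two of the rules for this table lend themselves to an automated check: each source needs rows for the six required purposes, and display_order should be unique within a source. The sketch below is one possible check, not an EMERSE utility. The function name and the sample rows are hypothetical; the 'path' rows are deliberately broken to show both kinds of problem.

```python
from collections import Counter, defaultdict

# The six required purposes from DOC_FIELD_EMR_INTENT.
REQUIRED_INTENTS = {"MRN", "RPT_ID", "CLINICAL_DATE", "LAST_UPDATED",
                    "RPT_TEXT", "RPT_TEXT_NOIC"}


def check_document_fields(rows):
    """Check rows of (solr_field_name, emr_intent, document_source_key, display_order).

    Returns a list of human-readable problems, empty if none were found.
    """
    intents = defaultdict(set)
    orders = defaultdict(Counter)
    for _field, intent, source, order in rows:
        intents[source].add(intent)
        orders[source][order] += 1

    problems = []
    for source in sorted(intents):
        missing = REQUIRED_INTENTS - intents[source]
        if missing:
            problems.append(f"{source}: missing intents {sorted(missing)}")
        dupes = sorted(o for o, n in orders[source].items() if n > 1)
        if dupes:
            problems.append(f"{source}: duplicate display_order {dupes}")
    return problems


rows = [
    ("MRN", "MRN", "rad", 0), ("RPT_TEXT", "RPT_TEXT", "rad", 1),
    ("RPT_TEXT_NOIC", "RPT_TEXT_NOIC", "rad", 2), ("ID", "RPT_ID", "rad", 3),
    ("LAST_UPDATED", "LAST_UPDATED", "rad", 4), ("CASE_DATE", "CLINICAL_DATE", "rad", 5),
    # Deliberately incomplete: no RPT_TEXT_NOIC row, and display_order 1 is reused.
    ("MRN", "MRN", "path", 0), ("RPT_TEXT", "RPT_TEXT", "path", 1),
    ("ID", "RPT_ID", "path", 1), ("LAST_UPDATED", "LAST_UPDATED", "path", 2),
    ("CASE_DATE", "CLINICAL_DATE", "path", 3),
]

for problem in check_document_fields(rows):
    print(problem)
```

With the sample rows, the 'rad' source passes and the 'path' source is reported twice, once for the missing intent and once for the duplicated order.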
SOLR_INDEX table
Table: SOLR_INDEX
Population: Likely once at system setup, but the dates may get updated with every indexing.
Population Frequency: Variable, but usually automated.
EMERSE previously used this table to locate the available Solr/Lucene indexes, as several indexes (shards) were created to improve performance. However, we no longer use multiple indexes. For most users running EMERSE on a single server, having one row in this table pointing to a single Solr/Lucene index yields adequate performance for 1-2 TB indexes with hundreds of millions of documents. Thus, only one row would be set up in this table. After indexing, EMERSE will automatically update the START_DATETIME and END_DATETIME fields with the latest date range of the indexed documents. The start and end dates in this table are used within the EMERSE UI to display the date range of the documents. The automatic updating of the start and end dates can be overridden using two parameters (see batch.updateIndexMinDateFromSolrIndex and batch.updateIndexMaxDateFromSolrIndex in the 'Batch Updating Begin/End Dates' section of the Configuration Guide).
If the need arose to break the index into smaller pieces for performance gains, we would recommend using Solr Cloud.
Column name | Description | Required or Optional |
---|---|---|
ID | The Lucene name of the index | Required |
START_DATETIME | Start date of clinical documents in this shard | Required |
END_DATETIME | End date of clinical documents in this shard | Required |
PATIENT_COUNT | Total distinct MRNs found in the Solr index. Presented in the count in the "All Patient" patient list. Updated periodically by the application as a background task. | Optional |
SOLR_INDEX Table Example:
ID | START_DATETIME | END_DATETIME | PATIENT_COUNT |
---|---|---|---|
unified | 01.02.2008 00:00:00 | 31.12.2099 00:00:00 | 1223829 |