A great addition to EMERSE is our large collection of manually curated terms and phrases that we collectively call "Synonyms". This dataset was created for term expansion, to help users find related concepts that they might otherwise miss. The collection is a work in progress, and new terms are added all the time. The terms are derived from a wide variety of sources, including news articles, scientific abstracts, product release notes, and especially the search terms that users have typed into EMERSE over the years. The collection contains over 2.6 million unique rows, about 85% of which do not appear in the Unified Medical Language System (UMLS). This means that you will be able to find a lot more with the EMERSE Synonyms than by using UMLS alone.
The Synonyms collection is extremely broad and contains a wide variety of terms related to diseases, medications (prescription, over-the-counter, herbal, etc.), symptoms, infectious agents, test names, anatomical names, misspellings (>130,000 of them), and many other types of concepts. While formal relationships are not defined, the dataset has aspects similar to a "light-weight" ontology or knowledge graph (see network graphs, below). A subset of the categories, with examples, is shown in the Examples table, below.
Image showing the knowledge graph-like relationships among terms related to "hypertension". For more details, click the following links for a full size PDF or an interactive network diagram.
Image showing the knowledge graph-like relationships among terms related to "lymphoma". For more details, click the following links for a full size PDF or an interactive network diagram.
EMERSE "Synonyms" Examples
Category* | Examples |
Acronyms | • ALL, acute lymphoblastic leukemia • ESRD, end stage renal disease • CXR, chest x-ray |
Abbreviations | • chemotherapy, chemo • ventilator, vent • shoulder pain, shld pain |
Professional and consumer terms | • myocardial infarction, heart attack • alopecia, hair loss • emesis, puking |
Misspellings | • ventilator, ventilater, ventalator • pneumoina, pneuomnia, pneuomonia • diarriea, diahhrea, diarhea |
American English & British English spelling | • anesthesia, anaesthesia • anemic, anaemic • virilization, virilisation |
Trade and generic drug names | • Tylenol, acetaminophen • Prilosec, omeprazole • Zoloft, sertraline |
Species | • Staphylococcus aureus, staph aureus, S aureus • thrush, Candida albicans, C albicans • roundworm, Ascaris lumbricoides, A lumbricoides |
Chemotherapy regimens | • CHOP, cyclophosphamide, hydroxydaunorubicin, oncovin, prednisone • CFAR, cytoxan, fludarabine, alemtuzumab, rituximab • FOLFOX, folinic acid, fluorouracil, oxaliplatin |
Word stems | • pain, painful, painfully • sleep, sleeping, sleepy • test, tested, testing |
True synonyms | • frigid, cold, freezing • break, shatter, fracture • sandal, shoe, footwear |
Phrasing variations | • lidocaine with epinephrine • lido w/ epi • epi-lidocaine |
Malapropisms/usage errors | • beast mass • prostrate cancer • horse voice |
Root word variations | • adenotonsillectomy, tonsilloadenoidectomy • hepatopancreatobiliary, pancreatohepatobiliary • trichophytobezoar, phytotrichobezoar |
Hyphenations | • pseudosarcomatous, pseudo-sarcomatous • polyradiculoneuropathy, poly-radiculoneuropathy, poly-radiculo-neuropathy |
Idioms | • under the knife • under the weather • frog in her throat |
Neologisms | • trabecularity (instead of trabeculation) • misinnervation, dysinnervation, malinnervation |
Organizations | • AMA, American Medical Association • COG, Children's Oncology Group • SAEM, Society for Academic Emergency Medicine |
Companies | • Moderna Therapeutics, biotechnology company • Neuronetics, medical device company • Omthera, pharmaceutical company |
Some terms are included that clinicians may not even consider to be real terms. For example, menorrhagia is considered a synonym of hypermenorrhea. These two terms are also grouped with hypermenorrhagia, which is not a true medical term but nevertheless appears in the clinical notes of more than 500 of our patients.
Within EMERSE, Synonyms are used as system-provided suggestions, but users can always include their own terms in their searches. Additionally, it is possible for EMERSE system administrators to include additional terms that would then be available to all users of the system, perhaps needed for unique local terminologies.
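The suggestion behavior described above can be illustrated with a minimal sketch. This is not EMERSE's implementation, and the actual format of the Synonyms dataset is not described on this page. The group layout, the function names (`build_index`, `suggest`), and the locally added term "MI" are all assumptions made for illustration. The other terms come from the Examples table.

```python
from collections import defaultdict

# Hypothetical synonym groups built from examples on this page.
# The real dataset layout is an assumption here.
SYNONYM_GROUPS = [
    ["myocardial infarction", "heart attack"],
    ["chemotherapy", "chemo"],
    ["Tylenol", "acetaminophen"],
    ["menorrhagia", "hypermenorrhea", "hypermenorrhagia"],
]

# Hypothetical administrator-supplied group for local terminology.
LOCAL_GROUPS = [
    ["heart attack", "MI"],
]


def build_index(groups):
    """Map each lowercased term to every term that shares a group with it."""
    index = defaultdict(set)
    for group in groups:
        for term in group:
            index[term.lower()].update(group)
    return index


def suggest(term, index):
    """Return related terms for a search term, excluding the term itself.

    Lookup is case-insensitive and uses direct group membership only;
    it does not follow chains across groups.
    """
    key = term.lower()
    return sorted(t for t in index.get(key, ()) if t.lower() != key)


# System-provided and locally added groups are merged into one index.
index = build_index(SYNONYM_GROUPS + LOCAL_GROUPS)

if __name__ == "__main__":
    for query in ["Heart Attack", "hypermenorrhagia", "unknown term"]:
        print(query, "->", suggest(query, index))
```

In this sketch the suggestions are only offered, not applied automatically, which mirrors the idea that users decide which terms to include in a search. A term with no entry simply yields no suggestions.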
95% of the terms above appear in the medical record at least once
97% of the terms above appear in the medical record at least once
100% of the terms above appear in the medical record at least once
98% of the terms above appear in the medical record at least once
90% of the terms above appear in the medical record at least once
Licensing
The Synonyms were created specifically to help with term matching within EMERSE, but the dataset also has utility for other tasks, including text mining, information retrieval (IR), natural language processing (NLP), machine learning (ML), and artificial intelligence (AI)/large language models (LLMs). The Synonyms collection can be licensed from the University of Michigan Office of Innovation Partnerships. Both Academic and Commercial Licenses are available, including a no-cost license for research use within an EMERSE installation. For more information about the Synonyms, or to see a sample dataset, please contact the EMERSE team. For details about licensing opportunities, please visit the medsynonyms.com page.
Additional Data Sets
In addition to the EMERSE Synonyms Collection, other datasets have been formatted to work within EMERSE. These are described below and are available for download at no cost.
A list of journal titles and their MEDLINE abbreviations. This can be useful for searching documents for mentions of literature citations.
Source: National Center for Biotechnology Information (NCBI) (link)
MeSH (Medical Subject Headings) terms are a controlled vocabulary used for organizing biomedical literature in PubMed and other databases.
Source: National Library of Medicine (NLM), MeSH 2018 dataset (link)
A list of occupations, which may be useful if trying to find occupation data for patients. Searching for synonyms of 'occupation' will bring up the list within EMERSE.
Source: Careerplanner.com (link)
A list of occupations, which may be useful if trying to find occupation data for patients. Searching for synonyms of 'occupation' will bring up the list within EMERSE.
Source: Bureau of Labor Statistics, May 2017 dataset (link)
The Synonyms collection described on this page was created specifically for use within EMERSE, but it has many uses for other types of clinical/health-related applications. Contact Us for Academic or Commercial Licensing.
Source: EMERSE project (link)