H/ICH/MedDRA/Basics

The Medical Dictionary for Regulatory Activities (MedDRA) is a standardized medical terminology commonly used for recording, communicating, and reporting clinical safety information. MedDRA is developed and maintained by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) [1], [2]. The emt module provides basic functions for two tasks: reading a MedDRA distribution folder to look up information about terms, and processing data already coded in MedDRA so that the data can be used in further analyses.

Please note that this module does not include MedDRA distribution files. MedDRA data sharing follows the data sharing statement published by the MSSO [3]. Users can download the distribution files from the MedDRA website. Some extra columns included in files distributed before March 2012 (Version 15.0) are not processed by this module.

For demonstration purposes, an example distribution folder is shipped with this module. The example folder contains randomly generated strings and numbers that follow the MedDRA distribution format of Version 26.1. The number of terms is also reduced so that results are returned faster.

Assign MedDRA Distribution Folder

To use downloaded MedDRA distribution files, set folder_name to the relative or absolute path of the unzipped folder. If folder_name is left empty, the example folder is used.

To check if necessary files are found:

from mtbp3.health.emt import Emt

emt = Emt(folder_name='')
print(emt.find_files())
([], 'All files found. Version: 26.1; Year: 2023; Month: September; Language: English. N_SOC: 17.')

Please note that the following steps require find_files to return the "All files found" response.
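The check above can be wrapped in a small guard that runs before the later steps. The sketch below is only an illustration: the helper name require_all_files is hypothetical and not part of the module, and it assumes, based on the example output, that find_files returns a tuple whose first element is a list of missing files and whose second element is a status message.

```python
def require_all_files(result):
    """Return the status message, or raise if any files are reported missing.

    Assumption: `result` has the shape shown in the example output of
    emt.find_files(), i.e. (list_of_missing_files, status_message).
    """
    missing, message = result
    if missing:
        raise FileNotFoundError(f"Missing MedDRA files: {missing}. {message}")
    return message


# Example using the tuple shown above (no MedDRA files are needed to run this).
status = require_all_files(
    ([], 'All files found. Version: 26.1; Year: 2023; Month: September; '
         'Language: English. N_SOC: 17.')
)
print(status)
```

With a real Emt object, the call would presumably be require_all_files(emt.find_files()).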

Find MedDRA Terms Within a Given Hierarchical Level

MedDRA terms are organized in a five-level hierarchy, from broadest to most specific: SOC, HLGT, HLT, PT, LLT. The two levels most commonly used in analyses are SOC and PT. Note that MedDRA terms use both uppercase and lowercase letters, and the outputs keep the exact letter case used in the distribution folder. Functions in this section return a list for faster follow-up processing.
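As a quick reference, the hierarchy can be written down as an ordered list. The sketch below is independent of the emt module; the names MEDDRA_LEVELS and is_broader are hypothetical and used only for illustration.

```python
# MedDRA hierarchy levels, ordered from broadest to most specific.
MEDDRA_LEVELS = [
    ("SOC", "System Organ Class"),
    ("HLGT", "High Level Group Term"),
    ("HLT", "High Level Term"),
    ("PT", "Preferred Term"),
    ("LLT", "Lowest Level Term"),
]


def is_broader(level_a, level_b):
    """Return True if level_a sits above level_b in the hierarchy."""
    order = [abbr for abbr, _ in MEDDRA_LEVELS]
    return order.index(level_a.upper()) < order.index(level_b.upper())


print(is_broader("SOC", "PT"))  # True
print(is_broader("LLT", "PT"))  # False
```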

To find a list of SOC:

soc_names = emt.find_soc()

If a list of SOC terms is passed to find_soc, the function returns the corresponding list of ids, and vice versa.

To find the ids for a given list of SOC terms:

# Use soc_ids rather than id to avoid shadowing the Python built-in id().
soc_ids = emt.find_soc(soc_names[:3])
print(soc_ids)
[19926972, 19986360, 19908214]

If the letter case of the input SOC strings may differ from the distribution files, use the ignore_case option to find the ids:

soc_ids = emt.find_soc([name.upper() for name in soc_names[:3]], ignore_case=True)
print(soc_ids)
[19926972, 19986360, 19908214]

If an input term cannot be found in the current version, the corresponding entry in the output is nan. Note that the ids are then returned as floats:

soc_ids = emt.find_soc(soc_names[:3] + ['This is not a standard term!'])
print(soc_ids)
[19926972.0, 19986360.0, 19908214.0, nan]
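When some inputs may be missing from the dictionary, it can help to separate the matched terms from the unmatched ones. The following is a minimal sketch, not part of the module: the helper split_found is hypothetical, the term names are placeholders, and it assumes that an unmatched term comes back as a float nan, as in the output above.

```python
import math


def split_found(terms, ids):
    """Pair each term with its id and collect the terms that were not found.

    Assumption: a term that is not found is returned as a float NaN
    (None is also treated as "not found" as a precaution).
    """
    found, not_found = {}, []
    for term, term_id in zip(terms, ids):
        is_missing = term_id is None or (
            isinstance(term_id, float) and math.isnan(term_id)
        )
        if is_missing:
            not_found.append(term)
        else:
            found[term] = int(term_id)  # convert the float id back to int
    return found, not_found


# Placeholder term names paired with ids in the shape shown above.
terms = ['Term A', 'Term B', 'Term C', 'This is not a standard term!']
ids = [19926972.0, 19986360.0, 19908214.0, float('nan')]
found, not_found = split_found(terms, ids)
print(found)
print(not_found)
```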

Similar lookup functions for HLGT, HLT, PT, and LLT are included in this module. Simply replace find_soc with find_pt, etc.
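Because the lookup functions share one naming pattern, the level can also be chosen at run time. The sketch below is an illustration only: the wrapper find_terms is hypothetical, and the method names find_hlgt, find_hlt, and find_llt are inferred from the pattern described above rather than confirmed here.

```python
def find_terms(emt, level, *args, **kwargs):
    """Call emt.find_<level>(...) for a level such as 'soc' or 'pt'.

    Assumption: the method names follow the find_<level> pattern
    (find_soc, find_hlgt, find_hlt, find_pt, find_llt).
    """
    level = level.lower()
    if level not in ("soc", "hlgt", "hlt", "pt", "llt"):
        raise ValueError(f"Unknown MedDRA level: {level!r}")
    # Look up the method by name and forward all remaining arguments.
    return getattr(emt, f"find_{level}")(*args, **kwargs)
```

With the emt object created earlier, find_terms(emt, 'PT') would then be expected to behave like emt.find_pt().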

Find MedDRA Terms Across Different Hierarchical Levels

Functions in this section return DataFrames.

To find a list of PTs related to the first SOC on the list:

df = emt.find_pt_given_soc(soc_names[0])

# To save the df to a CSV file:
# df.to_csv('filename.csv', index=False)

Please note that, by default, the returned output includes PTs for which the specified SOC is either the primary or a non-primary SOC. Add the option primary_soc_only=True to return only the PTs whose primary SOC is the specified SOC. Other cross-level functions are also available, including find_llt_given_pt, find_llt_given_soc, find_soc_given_pt, etc.

Other Options

MedDRA also provides other access options, including an online viewer, a desktop browser [4], and API access [5]. Please visit the MedDRA website for more information. Another Python module that supports multiple medical terminology systems is also available [6].

Reference