Seq/ICTV/MSL and VMR
The International Committee on Taxonomy of Viruses (ICTV) [1] maintains a database [2] of virus taxonomic classifications [3]. The Master Species List (MSL) [4] provides “the official, current virus taxonomy approved by the ICTV”. The Virus Metadata Resource (VMR) [5] provides “exemplars and additional isolates for each virus species”.
An Example
ICTV provides the MSL as an .xlsx download [4], an interactive web search [6], and a visual taxonomy browser [7]. For example, to study the virus that causes feline acquired immunodeficiency syndrome [8], we can find the following taxonomy information in the ICTV MSL:
Column name | value | taxon-specific suffix * |
---|---|---|
Realm | Riboviria | …viria |
Subrealm | | …vira |
Kingdom | Pararnavirae | …virae |
Subkingdom | | …virites |
Phylum | Artverviricota | …viricota |
Subphylum | | …viricotina |
Class | Revtraviricetes | …viricetes |
Subclass | | …viricetidae |
Order | Ortervirales | …virales |
Suborder | | …virineae |
Family | Retroviridae | …viridae |
Subfamily | Orthoretrovirinae | …virinae |
Genus | Lentivirus | …virus |
Subgenus | | …virus |
Species | Lentivirus felimdef | |
Genome | ssRNA-RT | |
Last Change | Renamed, | |
MSL of Last Change | 39 | |
Proposal for Last Change | 2023.009D.Retroviridae_68rensp.zip | |
Taxon History URL | ictv.global=202305029 | |
* [15]
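The suffix column can also be read in reverse: the ending of a taxon name hints at its rank. The following is a minimal sketch, not part of mtbp3; the names SUFFIX_TO_RANKS and candidate_ranks are hypothetical, and the suffixes are copied from the table above.

```python
# Suffixes copied from the table above; "virus" is shared by Genus and Subgenus.
SUFFIX_TO_RANKS = {
    "viria": ["Realm"],
    "vira": ["Subrealm"],
    "virae": ["Kingdom"],
    "virites": ["Subkingdom"],
    "viricota": ["Phylum"],
    "viricotina": ["Subphylum"],
    "viricetes": ["Class"],
    "viricetidae": ["Subclass"],
    "virales": ["Order"],
    "virineae": ["Suborder"],
    "viridae": ["Family"],
    "virinae": ["Subfamily"],
    "virus": ["Genus", "Subgenus"],
}


def candidate_ranks(taxon_name):
    """Return the ranks whose suffix matches the end of a single-word taxon name."""
    name = taxon_name.strip().lower()
    # Try longer suffixes first so a short suffix never shadows a longer one.
    for suffix in sorted(SUFFIX_TO_RANKS, key=len, reverse=True):
        if name.endswith(suffix):
            return SUFFIX_TO_RANKS[suffix]
    return []  # e.g. binomial species names carry no rank suffix


for taxon in ["Riboviria", "Pararnavirae", "Retroviridae", "Orthoretrovirinae", "Lentivirus"]:
    print(taxon, "->", candidate_ranks(taxon))
```

Because Genus and Subgenus share the same suffix, the function returns a list of candidates rather than a single rank.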
From the taxon history, we can see that the species was named Feline immunodeficiency virus (FIV) in previous releases and was renamed Lentivirus felimdef in the 2023 release [9]. The reasons for the proposed change are given in the file linked under Proposal for Last Change. Additional information about FIV, for example about the genus Lentivirus, can be found in the ICTV report [10].
How to
The module mtbp3.seq.ictvmslview accepts a CSV file exported from the MSL tab of the MSL download file.
The CSV file path can be assigned using the option msl_file_path.
For demonstration, a small subset of the MSL is shipped with this module.
When using the option msl_file_path = "", the example file will be loaded.
To load a MSL csv file:
from mtbp3.seq import ictvmslview
msl = ictvmslview.ictvmsl(msl_file_path = "")
File supp_seq/ICTV_MSL39v4_example.csv has been loaded
Column names: ['Sort', 'Realm', 'Subrealm', 'Kingdom', 'Subkingdom', 'Phylum', 'Subphylum', 'Class', 'Subclass', 'Order', 'Suborder', 'Family', 'Subfamily', 'Genus', 'Subgenus', 'Species', 'Genome', 'Last Change', 'MSL of Last Change', 'Proposal for Last Change ', 'Taxon History URL']
Total number of rows: 65
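For readers who want to inspect such a CSV export without the module, the sketch below reads one with the Python standard library and reports the same two facts (column names and row count). It is an illustration only, not how mtbp3 loads the file; load_rows is a hypothetical helper, and the two rows are a shortened stand-in built from values shown in this document.

```python
import csv
import io

# A two-row stand-in for a CSV exported from the MSL tab (columns shortened).
csv_text = """Sort,Realm,Genus,Species,Genome
13664,Riboviria,Alpharetrovirus,Alpharetrovirus avicarmilhil2,ssRNA-RT
13704,Riboviria,Lentivirus,Lentivirus felimdef,ssRNA-RT
"""


def load_rows(handle):
    """Read a CSV file object and return (column names, list of row dicts)."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows


columns, rows = load_rows(io.StringIO(csv_text))
print("Column names:", columns)
print("Total number of rows:", len(rows))
```

With a real export, the io.StringIO object would be replaced by a file opened with open(path, newline="", encoding="utf-8").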
Please note that the column names are used as search_rank values in this module, and they are case sensitive.
The search_strings values are not case sensitive, as shown below.
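The asymmetry between the two arguments can be pictured with a small stand-alone sketch. It is not the module's implementation; find_rows is a hypothetical function that treats the column name as an exact dictionary key and compares the searched text without regard to case.

```python
def find_rows(rows, search_string, search_rank="Species"):
    """Return rows whose value in column `search_rank` contains `search_string`.

    The column name must match exactly (case sensitive);
    the searched text is compared without regard to case.
    """
    if rows and search_rank not in rows[0]:
        raise KeyError(f"Unknown column name: {search_rank!r}")
    needle = search_string.casefold()
    return [row for row in rows if needle in (row[search_rank] or "").casefold()]


# Three rows taken from the example outputs in this document.
rows = [
    {"Genus": "Lentivirus", "Species": "Lentivirus felimdef"},
    {"Genus": "Lentivirus", "Species": "Lentivirus humimdef1"},
    {"Genus": "Gammaretrovirus", "Species": "Gammaretrovirus felleu"},
]

# The search text may be in any case.
print([r["Species"] for r in find_rows(rows, "LENTIVIRUS FEL")])

# The column name may not: "species" is not the same key as "Species".
try:
    find_rows(rows, "lentivirus", search_rank="species")
except KeyError as err:
    print("KeyError:", err)
```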
To view the first row of loaded file:
print(msl.msl.iloc[0].transpose())
Sort 13664
Realm Riboviria
Subrealm NaN
Kingdom Pararnavirae
Subkingdom NaN
Phylum Artverviricota
Subphylum NaN
Class Revtraviricetes
Subclass NaN
Order Ortervirales
Suborder NaN
Family Retroviridae
Subfamily Orthoretrovirinae
Genus Alpharetrovirus
Subgenus NaN
Species Alpharetrovirus avicarmilhil2
Genome ssRNA-RT
Last Change Renamed,
MSL of Last Change 39
Proposal for Last Change 2023.009D.Retroviridae_68rensp.zip
Taxon History URL ictv.global=202304982
Name: 0, dtype: object
To view the unique values of ‘Realm’ in the file and the number of species under each realm:
msl.msl['Realm'].value_counts()
Realm
Riboviria 65
Name: count, dtype: int64
The counts above are from the example file.
The full MSL 39v4 includes the following realms and kingdoms:
Virus:
├── Adnaviria:
│   └── Zilligvirae (32 Species)
├── Duplodnaviria:
│   └── Heunggongvirae (4973 Species)
├── Monodnaviria:
│   ├── Loebvirae (60 Species)
│   ├── Sangervirae (22 Species)
│   ├── Shotokuvirae (1930 Species)
│   └── Trapavirae (16 Species)
├── NA:
│   └── NA (636 Species)
├── Riboviria:
│   ├── NA (17 Species)
│   ├── Orthornavirae (6423 Species)
│   └── Pararnavirae (272 Species)
├── Ribozyviria:
│   └── NA (21 Species)
└── Varidnaviria:
    ├── Bamfordvirae (279 Species)
    └── Helvetiavirae (9 Species)
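Counts like those in the realm and kingdom tree can be reproduced from any list of rows by grouping on the two columns and labelling empty cells as NA. The sketch below is an assumption-laden illustration using only the standard library; count_by_ranks is a hypothetical helper, and the two rows named "Hypothetical species A/B" are made up to show the NA case.

```python
from collections import Counter


def count_by_ranks(rows, ranks=("Realm", "Kingdom")):
    """Count rows (one row per species) for each combination of the given ranks."""
    counts = Counter()
    for row in rows:
        # An empty or missing cell is reported as "NA", as in the tree output.
        key = tuple(row.get(rank) or "NA" for rank in ranks)
        counts[key] += 1
    return counts


rows = [
    {"Realm": "Riboviria", "Kingdom": "Pararnavirae", "Species": "Lentivirus felimdef"},
    {"Realm": "Riboviria", "Kingdom": "Pararnavirae", "Species": "Gammaretrovirus felleu"},
    {"Realm": "Riboviria", "Kingdom": "", "Species": "Hypothetical species A"},  # made up
    {"Realm": "", "Kingdom": "", "Species": "Hypothetical species B"},  # made up
]

counts = count_by_ranks(rows)
for (realm, kingdom), n in sorted(counts.items()):
    print(f"{realm} / {kingdom}: {n} Species")
```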
To count species within a subset:
print('\n'.join(msl.count_species(count_rank='Phylum', outfmt="tree", search_within_subset={'Kingdom': 'Bamfordvirae'})))
Output:
Virus:
└── [Realm] Varidnaviria:
    └── [Kingdom] Bamfordvirae:
        ├── [Phylum] NA (1 Species)
        ├── [Phylum] Nucleocytoviricota (132 Species)
        └── [Phylum] Preplasmiviricota (146 Species)
Search within MSL
To search for rows with “lentivirus fel” included in a species name:
print(msl.find_rows_given_str(search_strings="lentivirus fel", color="red").iloc[0].transpose())
Sort 13704
Realm Riboviria
Subrealm NaN
Kingdom Pararnavirae
Subkingdom NaN
Phylum Artverviricota
Subphylum NaN
Class Revtraviricetes
Subclass NaN
Order Ortervirales
Suborder NaN
Family Retroviridae
Subfamily Orthoretrovirinae
Genus Lentivirus
Subgenus NaN
Species Lentivirus felimdef
Genome ssRNA-RT
Last Change Renamed,
MSL of Last Change 39
Proposal for Last Change 2023.009D.Retroviridae_68rensp.zip
Taxon History URL ictv.global=202305029
Name: 40, dtype: object
To search for rows with “lentivirus” included in species:
print(msl.find_rows_given_str(search_strings="lentivirus", search_rank="Species", color="red", narrow=True)[['Genus', 'Species','Genome']])
Genus Species Genome
36 Lentivirus Lentivirus bovimdef ssRNA-RT
37 Lentivirus Lentivirus bovjem ssRNA-RT
38 Lentivirus Lentivirus capartenc ssRNA-RT
39 Lentivirus Lentivirus equinfane ssRNA-RT
40 Lentivirus Lentivirus felimdef ssRNA-RT
41 Lentivirus Lentivirus humimdef1 ssRNA-RT
42 Lentivirus Lentivirus humimdef2 ssRNA-RT
43 Lentivirus Lentivirus ovivismae ssRNA-RT
44 Lentivirus Lentivirus pum ssRNA-RT
45 Lentivirus Lentivirus simimdef ssRNA-RT
To search for the genus Lentivirus using the exact=True option (only partly exact: the search is still not case sensitive):
print('\n'.join(msl.find_rows_given_str(search_strings="lentivirus", search_rank="Genus", color="red", outfmt='tree', exact=True)))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                └── [Family] Retroviridae; [Subfamily] Orthoretrovirinae:
                    └── [Genus] Lentivirus:
                        ├── [Species] Lentivirus bovimdef
                        ├── [Species] Lentivirus bovjem
                        ├── [Species] Lentivirus capartenc
                        ├── [Species] Lentivirus equinfane
                        ├── [Species] Lentivirus felimdef
                        ├── [Species] Lentivirus humimdef1
                        ├── [Species] Lentivirus humimdef2
                        ├── [Species] Lentivirus ovivismae
                        ├── [Species] Lentivirus pum
                        └── [Species] Lentivirus simimdef
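One way to picture the exact option is as a switch from substring matching to whole-value matching, with case ignored in both modes. This is an assumption about the behavior based on the description above, not the module's code; matches is a hypothetical helper.

```python
def matches(value, search_string, exact=False):
    """Case-insensitive comparison: whole-value match if exact, else substring match."""
    value, search_string = value.casefold(), search_string.casefold()
    return value == search_string if exact else search_string in value


# Genus names that appear in the example outputs of this document.
genera = ["Lentivirus", "Gammaretrovirus", "Alpharetrovirus", "Felispumavirus"]

print([g for g in genera if matches(g, "retrovirus")])              # ['Gammaretrovirus', 'Alpharetrovirus']
print([g for g in genera if matches(g, "retrovirus", exact=True)])  # []
print([g for g in genera if matches(g, "LENTIVIRUS", exact=True)])  # ['Lentivirus']
```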
More than one virus species in the family Retroviridae can infect felines. To search for such viruses within the family Retroviridae:
print('\n'.join(msl.find_rows_given_str(search_strings=" fel", search_rank="Species", color="red", outfmt='tree', search_within_subset={"Family":"Retroviridae"})))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                ├── [Family] Retroviridae; [Subfamily] Orthoretrovirinae:
                │   ├── [Genus] Gammaretrovirus:
                │   │   └── [Species] Gammaretrovirus felleu
                │   └── [Genus] Lentivirus:
                │       └── [Species] Lentivirus felimdef
                └── [Family] Retroviridae; [Subfamily] Spumaretrovirinae:
                    └── [Genus] Felispumavirus:
                        └── [Species] Felispumavirus felcat
To search for two known virus species:
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree')))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                └── [Family] Retroviridae; [Subfamily] Orthoretrovirinae:
                    ├── [Genus] Gammaretrovirus:
                    │   └── [Species] Gammaretrovirus felleu
                    └── [Genus] Lentivirus:
                        └── [Species] Lentivirus felimdef
The default tree above always shows 8 ranks.
Two more tree styles are available. The first, tree_style="drop", shows only the nonempty ranks:
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree', tree_style="drop")))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                └── [Family] Retroviridae:
                    └── [Subfamily] Orthoretrovirinae:
                        ├── [Genus] Gammaretrovirus:
                        │   └── [Species] Gammaretrovirus felleu
                        └── [Genus] Lentivirus:
                            └── [Species] Lentivirus felimdef
The second, tree_style="full", shows all 15 ranks:
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree', tree_style="full")))
[Realm] Riboviria:
└── [Subrealm] NA:
    └── [Kingdom] Pararnavirae:
        └── [Subkingdom] NA:
            └── [Phylum] Artverviricota:
                └── [Subphylum] NA:
                    └── [Class] Revtraviricetes:
                        └── [Subclass] NA:
                            └── [Order] Ortervirales:
                                └── [Suborder] NA:
                                    └── [Family] Retroviridae:
                                        └── [Subfamily] Orthoretrovirinae:
                                            ├── [Genus] Gammaretrovirus:
                                            │   └── [Subgenus] NA:
                                            │       └── [Species] Gammaretrovirus felleu
                                            └── [Genus] Lentivirus:
                                                └── [Subgenus] NA:
                                                    └── [Species] Lentivirus felimdef
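The difference between the "drop" and "full" styles can be sketched for a single row in plain Python. This is a simplified illustration, not the module's rendering code: lineage and render are hypothetical helpers, only one branch is drawn, and the trailing colons of the real output are omitted.

```python
RANKS = ["Realm", "Subrealm", "Kingdom", "Subkingdom", "Phylum", "Subphylum",
         "Class", "Subclass", "Order", "Suborder", "Family", "Subfamily",
         "Genus", "Subgenus", "Species"]


def lineage(row, style="full"):
    """Return (rank, name) pairs for one row.

    style="full" keeps all 15 ranks and labels empty ones "NA";
    style="drop" keeps only the ranks that have a value.
    """
    if style not in ("full", "drop"):
        raise ValueError(f"Unsupported style: {style!r}")
    if style == "drop":
        return [(rank, row[rank]) for rank in RANKS if row.get(rank)]
    return [(rank, row.get(rank) or "NA") for rank in RANKS]


def render(pairs):
    """Render a single lineage as an indented one-branch tree."""
    lines = []
    for depth, (rank, name) in enumerate(pairs):
        prefix = "" if depth == 0 else "    " * (depth - 1) + "└── "
        lines.append(f"{prefix}[{rank}] {name}")
    return lines


# Nonempty ranks of Lentivirus felimdef, as listed in the search output earlier.
row = {"Realm": "Riboviria", "Kingdom": "Pararnavirae", "Phylum": "Artverviricota",
       "Class": "Revtraviricetes", "Order": "Ortervirales", "Family": "Retroviridae",
       "Subfamily": "Orthoretrovirinae", "Genus": "Lentivirus",
       "Species": "Lentivirus felimdef"}

print("\n".join(render(lineage(row, style="drop"))))
print(len(lineage(row, style="full")), "ranks in the full style")
```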
Download Updated MSL
To download the current release, use msl.update_msl(version="current").
This prints the following output:
File of version current has been loaded
Column names: ['Sort', 'Realm', 'Subrealm', 'Kingdom', 'Subkingdom', 'Phylum', 'Subphylum', 'Class', 'Subclass', 'Order', 'Suborder', 'Family', 'Subfamily', 'Genus', 'Subgenus', 'Species', 'Genome', 'Last Change', 'MSL of Last Change', 'Proposal for Last Change ', 'Taxon History URL']
Total number of rows: 14690
To see versions available to download:
msl.update_msl(version="")
Version '' is not supported. Supported versions are: current, 39.v4, 39.v3, 39.v2, 39.v1, 38.v3, 38.v2, 38.v1, 2021.v3, 2021.v2, 2021.v1, 2020, 2019, 2018b.v2, 2018a, 2017, 2016.v1.3, 2015, 2014.v4, 2013.v2, 2012.v4, 2011.v2, 2009.v10, 2008, 2005.v1
Subtypes within Species
The ICTV MSL focuses on “taxa at and above the species rank” [11]. Classifications below the species rank are often related to species-specific characteristics [12].
Species Exemplar
The ICTV VMR provides one exemplar (and possibly additional isolates) for each species, with GenBank accession numbers and direct links to the NCBI database.
To load a VMR table extracted from the VMR download file:
from mtbp3.seq import ictvvmrview
vmr = ictvvmrview.ictvvmr(vmr_file_path = "")
File supp_seq/ICTV_VMR_MSL39v4_example.csv has been loaded
Column names: ['Isolate ID', 'Species Sort', 'Isolate Sort', 'Realm', 'Subrealm', 'Kingdom', 'Subkingdom', 'Phylum', 'Subphylum', 'Class', 'Subclass', 'Order', 'Suborder', 'Family', 'Subfamily', 'Genus', 'Subgenus', 'Species', 'Exemplar or additional isolate', 'Virus name(s)', 'Virus name abbreviation(s)', 'Virus isolate designation', 'Virus GENBANK accession', 'Genome coverage', 'Genome', 'Host source', 'Accessions Link']
Total number of rows: 65
When the path is empty, an example file will be used for illustration.
As with the MSL, use vmr.update_vmr(version="current") to download the current version from ICTV.
To search for exemplars including “feline” using search_rank_or_exemplar="Virus name(s)":
print('\n'.join(vmr.find_rows_given_str(search_strings="feline", search_rank_or_exemplar="Virus name(s)", color="red", outfmt='tree', tree_style="drop")))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                └── [Family] Retroviridae:
                    ├── [Subfamily] Orthoretrovirinae:
                    │   ├── [Genus] Gammaretrovirus:
                    │   │   ├── [Species] Gammaretrovirus felleu (ssRNA-RT):
                    │   │   │   └── [E] feline leukemia virus (Rickard subgroup A) (Genebank: AF052723)
                    │   │   ├── [Species] Gammaretrovirus hazufelsar (ssRNA-RT):
                    │   │   │   └── [E] Hardy-Zuckerman feline sarcoma virus (Genebank: X03711)
                    │   │   └── [Species] Gammaretrovirus snythefelsar (ssRNA-RT):
                    │   │       └── [E] Snyder-Theilen feline sarcoma virus (Genebank: M22820)
                    │   └── [Genus] Lentivirus:
                    │       └── [Species] Lentivirus felimdef (ssRNA-RT):
                    │           └── [E] feline immunodeficiency virus (Petaluma) (Genebank: M25381)
                    └── [Subfamily] Spumaretrovirinae:
                        └── [Genus] Felispumavirus:
                            └── [Species] Felispumavirus felcat (ssRNA-RT):
                                └── [E] feline foamy virus Felis catus (FUV7) (Genebank: Y08851)
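The leaf lines of this tree pair each species with its exemplar and accession number. The sketch below shows one way such labels could be assembled from VMR rows with the standard library. It is not the module's code: exemplars_by_species is a hypothetical helper, the column names come from the loaded file above, and the assumption that the 'Exemplar or additional isolate' column holds "E" for exemplars is labeled in the code.

```python
from collections import defaultdict


def exemplars_by_species(rows):
    """Group isolate rows by species and format one label per isolate."""
    grouped = defaultdict(list)
    for row in rows:
        # Assumption: "E" marks an exemplar; any other value is shown as "A"
        # (additional isolate).
        tag = "E" if row["Exemplar or additional isolate"] == "E" else "A"
        label = f"[{tag}] {row['Virus name(s)']} (GenBank: {row['Virus GENBANK accession']})"
        grouped[row["Species"]].append(label)
    return dict(grouped)


# Two rows built from values shown in the tree output; other columns omitted.
rows = [
    {"Species": "Lentivirus felimdef", "Exemplar or additional isolate": "E",
     "Virus name(s)": "feline immunodeficiency virus",
     "Virus GENBANK accession": "M25381"},
    {"Species": "Gammaretrovirus felleu", "Exemplar or additional isolate": "E",
     "Virus name(s)": "feline leukemia virus",
     "Virus GENBANK accession": "AF052723"},
]

result = exemplars_by_species(rows)
for species, labels in result.items():
    print(species)
    for label in labels:
        print("   ", label)
```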
To search for genus Lentivirus:
print('\n'.join(vmr.find_rows_given_str(search_strings="lentivirus", search_rank_or_exemplar="Genus", color="red", outfmt='tree', exact=True, tree_style="drop")))
[Realm] Riboviria:
└── [Kingdom] Pararnavirae:
    └── [Phylum] Artverviricota:
        └── [Class] Revtraviricetes:
            └── [Order] Ortervirales:
                └── [Family] Retroviridae:
                    └── [Subfamily] Orthoretrovirinae:
                        └── [Genus] Lentivirus:
                            ├── [Species] Lentivirus bovimdef (ssRNA-RT):
                            │   └── [E] bovine immunodeficiency virus (HXB3) (Genebank: M32690)
                            ├── [Species] Lentivirus bovjem (ssRNA-RT):
                            │   └── [E] Jembrana disease virus (Tabanan_87) (Genebank: U21603)
                            ├── [Species] Lentivirus capartenc (ssRNA-RT):
                            │   └── [E] caprine arthritis encephalitis virus (Clements) (Genebank: M33677)
                            ├── [Species] Lentivirus equinfane (ssRNA-RT):
                            │   └── [E] equine infectious anemia virus (Genebank: U01866)
                            ├── [Species] Lentivirus felimdef (ssRNA-RT):
                            │   └── [E] feline immunodeficiency virus (Petaluma) (Genebank: M25381)
                            ├── [Species] Lentivirus humimdef1 (ssRNA-RT):
                            │   └── [E] human immunodeficiency virus 1 (Genebank: AF033819)
                            ├── [Species] Lentivirus humimdef2 (ssRNA-RT):
                            │   └── [E] human immunodeficiency virus 2 (BEN) (Genebank: M30502)
                            ├── [Species] Lentivirus ovivismae (ssRNA-RT):
                            │   └── [E] Visna_maedi virus (kv1772) (Genebank: L06906)
                            ├── [Species] Lentivirus pum (ssRNA-RT):
                            │   └── [E] puma lentivirus (14) (Genebank: U03982)
                            └── [Species] Lentivirus simimdef (ssRNA-RT):
                                └── [E] simian immunodeficiency virus (Genebank: M58410)
Other Resource
NCBI: