# Seq/ICTV/MSL and VMR

The International Committee on Taxonomy of Viruses (ICTV) [^1] maintains a database [^2] of virus taxonomic classification [^3].
The Master Species Lists (MSL) [^4] provide "the official, current virus taxonomy approved by the ICTV".
The Virus Metadata Resource (VMR) [^5] provides "exemplars and additional isolates for each virus species".


## An Example

ICTV provides the MSL as an .xlsx download [^4], an interactive web search [^6], and a visual taxonomy browser [^7].
For example, to study the virus that causes feline acquired immunodeficiency syndrome [^8], we can find the following taxonomy information in the ICTV MSL:

| Column name | Value | Taxon-specific suffix * |
| --- | --- | --- |
| Realm | Riboviria | ...viria |
| Subrealm | | ...vira |
| Kingdom | Pararnavirae | ...virae |
| Subkingdom | | ...virites |
| Phylum | Artverviricota | ...viricota |
| Subphylum | | ...viricotina |
| Class | Revtraviricetes | ...viricetes |
| Subclass | | ...viricetidae |
| Order | Ortervirales | ...virales |
| Suborder | | ...virineae |
| Family | Retroviridae | ...viridae |
| Subfamily | Orthoretrovirinae | ...virinae |
| Genus | Lentivirus | ...virus |
| Subgenus | | ...virus |
| Species | Lentivirus felimdef | |
| Genome | ssRNA-RT | |
| Last Change | Renamed, | |
| MSL of Last Change | 39 | |
| Proposal for Last Change | 2023.009D.Retroviridae_68rensp.zip | |
| Taxon History URL | ictv.global=202305029 | |

\* Suffix conventions: see [^15].
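
The suffix column can be read as a lookup from name ending to rank. The sketch below is illustrative only: `candidate_ranks` is a hypothetical helper, not part of `mtbp3`, and it knows only the suffixes listed in the table above.

```python
# Suffixes copied from the table above, keyed by rank.
RANK_SUFFIXES = {
    "Realm": "viria",
    "Subrealm": "vira",
    "Kingdom": "virae",
    "Subkingdom": "virites",
    "Phylum": "viricota",
    "Subphylum": "viricotina",
    "Class": "viricetes",
    "Subclass": "viricetidae",
    "Order": "virales",
    "Suborder": "virineae",
    "Family": "viridae",
    "Subfamily": "virinae",
    "Genus": "virus",
    "Subgenus": "virus",
}


def candidate_ranks(taxon_name):
    """Return every rank whose suffix ends the given taxon name (hypothetical helper)."""
    name = taxon_name.strip().lower()
    return [rank for rank, suffix in RANK_SUFFIXES.items() if name.endswith(suffix)]


for taxon in ["Riboviria", "Retroviridae", "Lentivirus", "Lentivirus felimdef"]:
    print(taxon, "->", candidate_ranks(taxon))
```

Genus and subgenus share the suffix `...virus`, so the helper returns both for *Lentivirus*. A species name such as *Lentivirus felimdef* has no rank suffix and yields an empty list.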

The taxon history shows that the species was named *Feline immunodeficiency virus* (FIV) in previous releases and was renamed *Lentivirus felimdef* in the 2023 release [^9].
The reasons for the change are given in the file linked under Proposal for Last Change.
Additional information about FIV, for example about the genus *Lentivirus*, can be found in the ICTV report [^10].
 

## How to

The module `mtbp3.seq.ictvmslview` accepts a CSV file exported from the MSL tab of the MSL download file.
The CSV file path is set with the option `msl_file_path`.
For demonstration, a small subset of the MSL is shipped with this module.
With `msl_file_path = ""`, the example file is loaded.

To load an MSL CSV file:

```python
from mtbp3.seq import ictvmslview
msl = ictvmslview.ictvmsl(msl_file_path="")
```

Output:

```
File supp_seq/ICTV_MSL39v4_example.csv has been loaded
Column names: ['Sort', 'Realm', 'Subrealm', 'Kingdom', 'Subkingdom', 'Phylum', 'Subphylum', 'Class', 'Subclass', 'Order', 'Suborder', 'Family', 'Subfamily', 'Genus', 'Subgenus', 'Species', 'Genome', 'Last Change', 'MSL of Last Change', 'Proposal for Last Change ', 'Taxon History URL']
Total number of rows: 65
```


Note that the column names are used as the values of `search_rank` in this module, and they are case-sensitive.
The `search_strings` values are not case-sensitive, as shown in the search examples below.
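
To make the distinction concrete, here is a minimal standard-library sketch of a search in which the column name must match exactly while the search string is compared case-insensitively. `find_rows` and `SAMPLE_CSV` are hypothetical and only illustrate the behaviour described above; they are not the module's implementation.

```python
import csv
import io

# Two rows in the MSL column layout, reduced to two columns for brevity.
SAMPLE_CSV = """Genus,Species
Lentivirus,Lentivirus felimdef
Gammaretrovirus,Gammaretrovirus felleu
"""


def find_rows(rows, search_string, search_rank):
    """Case-insensitive substring search in one column (hypothetical helper)."""
    if rows and search_rank not in rows[0]:
        # Column names are matched exactly, so "species" does not find "Species".
        raise KeyError(f"unknown column: {search_rank!r}")
    needle = search_string.casefold()
    return [row for row in rows if needle in row[search_rank].casefold()]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(find_rows(rows, "LENTIVIRUS FEL", "Species"))
```

Calling `find_rows(rows, "lentivirus", "species")` raises `KeyError` in this sketch, because the lower-case column name does not exist.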

To view the first row of loaded file:

```python
print(msl.msl.iloc[0].transpose())
```

To view the unique values of 'Realm' in the file and the number of species under each realm:

```python
msl.msl['Realm'].value_counts()
```

The counts above are from the example file.

The full MSL 39v4 includes the following realms and kingdoms:

```
Virus:
├── Adnaviria:
│   └── Zilligvirae (32 Species)
├── Duplodnaviria:
│   └── Heunggongvirae (4973 Species)
├── Monodnaviria:
│   ├── Loebvirae (60 Species)
│   ├── Sangervirae (22 Species)
│   ├── Shotokuvirae (1930 Species)
│   └── Trapavirae (16 Species)
├── NA:
│   └── NA (636 Species)
├── Riboviria:
│   ├── NA (17 Species)
│   ├── Orthornavirae (6423 Species)
│   └── Pararnavirae (272 Species)
├── Ribozyviria:
│   └── NA (21 Species)
└── Varidnaviria:
    ├── Bamfordvirae (279 Species)
    └── Helvetiavirae (9 Species)
```

To count species within a subset:

```python
# Count species per phylum, restricted to the kingdom Bamfordvirae
print('\n'.join(msl.count_species(count_rank='Phylum', outfmt="tree", search_within_subset={'Kingdom': 'Bamfordvirae'})))
```

Output:

```
Virus:
└── [Realm] Varidnaviria:
    └── [Kingdom] Bamfordvirae:
        ├── [Phylum] NA (1 Species)
        ├── [Phylum] Nucleocytoviricota (132 Species)
        └── [Phylum] Preplasmiviricota (146 Species)
```
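
The counting step can be sketched with the standard library alone. The example below filters rows to a subset, counts one rank, and renders a one-level tree. `count_by_rank`, `render`, and the placeholder rows are hypothetical; this is not how `count_species` is implemented.

```python
from collections import Counter

# Hypothetical rows for illustration; species names are placeholders.
ROWS = [
    {"Kingdom": "Bamfordvirae", "Phylum": "Nucleocytoviricota", "Species": "placeholder 1"},
    {"Kingdom": "Bamfordvirae", "Phylum": "Nucleocytoviricota", "Species": "placeholder 2"},
    {"Kingdom": "Bamfordvirae", "Phylum": "Preplasmiviricota", "Species": "placeholder 3"},
    {"Kingdom": "Bamfordvirae", "Phylum": "", "Species": "placeholder 4"},
    {"Kingdom": "Pararnavirae", "Phylum": "Artverviricota", "Species": "placeholder 5"},
]


def count_by_rank(rows, count_rank, within=None):
    """Count rows per value of count_rank, restricted to rows matching `within`."""
    within = within or {}
    subset = [r for r in rows if all(r.get(k) == v for k, v in within.items())]
    # Empty cells are reported as "NA", as in the output above.
    return Counter(r.get(count_rank) or "NA" for r in subset)


def render(counts, rank, title="Virus"):
    """Render the counts as a one-level tree with box-drawing branches."""
    lines = [f"{title}:"]
    items = sorted(counts.items())
    for i, (name, n) in enumerate(items):
        branch = "└──" if i == len(items) - 1 else "├──"
        lines.append(f"{branch} [{rank}] {name} ({n} Species)")
    return lines


counts = count_by_rank(ROWS, "Phylum", within={"Kingdom": "Bamfordvirae"})
print("\n".join(render(counts, "Phylum")))
```

Because each MSL row is one species, counting rows per rank value gives the species count for that taxon.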

### Search within MSL 

To search for rows with "lentivirus fel" included in a species name:

```python
print(msl.find_rows_given_str(search_strings="lentivirus fel", color="red").iloc[0].transpose())
```

To search for rows with "lentivirus" included in the species name:

```python
print(msl.find_rows_given_str(search_strings="lentivirus", search_rank="Species", color="red", narrow=True)[['Genus', 'Species', 'Genome']])
```

To search for the genus *Lentivirus* using the `exact=True` option (the match is only partly exact, because the search is still not case-sensitive):

```python
print('\n'.join(msl.find_rows_given_str(search_strings="lentivirus", search_rank="Genus", color="red", outfmt='tree', exact=True)))
```
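
One plausible reading of "partly exact" is that the whole cell value must equal the search string, with letter case ignored. The sketch below illustrates that reading only; `matches` is a hypothetical helper, and the module's actual rule may differ.

```python
def matches(value, search_string, exact=False):
    """Compare case-insensitively: whole value if exact, substring otherwise."""
    v = value.casefold()
    s = search_string.casefold()
    return v == s if exact else s in v


# Substring search finds the genus name inside the species name.
print(matches("Lentivirus felimdef", "lentivirus"))
# Exact search requires the whole value, but still ignores case.
print(matches("Lentivirus felimdef", "lentivirus", exact=True))
print(matches("Lentivirus", "LENTIVIRUS", exact=True))
```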

More than one virus species in the family *Retroviridae* can infect cats.
To search for such species within the family *Retroviridae*:

```python
print('\n'.join(msl.find_rows_given_str(search_strings=" fel", search_rank="Species", color="red", outfmt='tree', search_within_subset={"Family":"Retroviridae"})))
```

To search for two known virus species:

```python
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree')))
```

The tree above always shows 8 ranks.

Two more tree styles are available. The first shows only the non-empty ranks:

```python
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree', tree_style="drop")))
```

The second shows all 15 ranks:

```python
print('\n'.join(msl.find_rows_given_str(search_strings=["Gammaretrovirus felleu", "Lentivirus felimdef"], search_rank="Species", color="red", outfmt='tree', tree_style="full")))
```
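
The three tree styles differ only in which ranks they keep. The sketch below applies them to the FIV row from the table in the example section. It assumes that the default eight ranks are the principal ranks (those without a "Sub" prefix); `lineage` and `render` are hypothetical helpers, not the module's code.

```python
ALL_RANKS = [
    "Realm", "Subrealm", "Kingdom", "Subkingdom", "Phylum", "Subphylum",
    "Class", "Subclass", "Order", "Suborder", "Family", "Subfamily",
    "Genus", "Subgenus", "Species",
]
# Assumption: the default tree shows the eight principal ranks.
MAIN_RANKS = [rank for rank in ALL_RANKS if not rank.startswith("Sub")]

# Values taken from the FIV row shown earlier; absent ranks are left empty.
FIV = {
    "Realm": "Riboviria", "Kingdom": "Pararnavirae", "Phylum": "Artverviricota",
    "Class": "Revtraviricetes", "Order": "Ortervirales", "Family": "Retroviridae",
    "Subfamily": "Orthoretrovirinae", "Genus": "Lentivirus",
    "Species": "Lentivirus felimdef",
}


def lineage(row, tree_style="default"):
    """Return (rank, name) pairs for the chosen style (hypothetical helper)."""
    if tree_style == "full":
        ranks = ALL_RANKS
    elif tree_style == "drop":
        ranks = [rank for rank in ALL_RANKS if row.get(rank)]
    else:
        ranks = MAIN_RANKS
    return [(rank, row.get(rank) or "NA") for rank in ranks]


def render(pairs):
    """Indent each rank one level deeper than the previous one."""
    return [f"{'    ' * depth}└── [{rank}] {name}" for depth, (rank, name) in enumerate(pairs)]


for style in ("default", "drop", "full"):
    print(style, len(lineage(FIV, style)))
print("\n".join(render(lineage(FIV, "drop"))))
```

For this row the "drop" style keeps nine ranks, because Subfamily is filled in while the other sub-ranks are empty.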

### Download Updated MSL

To download the current release, use `msl.update_msl(version="current")`.
This prints:

```
File of version current has been loaded
Column names: ['Sort', 'Realm', 'Subrealm', 'Kingdom', 'Subkingdom', 'Phylum', 'Subphylum', 'Class', 'Subclass', 'Order', 'Suborder', 'Family', 'Subfamily', 'Genus', 'Subgenus', 'Species', 'Genome', 'Last Change', 'MSL of Last Change', 'Proposal for Last Change ', 'Taxon History URL']
Total number of rows: 14690
```

To see the versions available for download:

```python
msl.update_msl(version="")
```

## Subtypes within Species

The ICTV MSL focuses on "taxa at and above the species rank" [^11].
Classification below the species rank often depends on species-specific characteristics [^12].


## Species Exemplar

The ICTV VMR provides one exemplar (and possibly additional isolates) for each species, with the GenBank accession number and a direct link to the NCBI database.
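
As an illustration of working with accession numbers, the sketch below splits an accession field into entries and builds an NCBI Nucleotide link for each. It rests on assumptions not stated above: that a field may hold several semicolon-separated accessions with optional `label:` prefixes for genome segments, and that `https://www.ncbi.nlm.nih.gov/nuccore/<accession>` is a suitable link pattern. The accessions shown are placeholders, and both helpers are hypothetical.

```python
def parse_accessions(field):
    """Split 'label: ACC; label: ACC' into (label, accession) pairs."""
    pairs = []
    for part in field.split(";"):
        part = part.strip()
        if not part:
            continue
        label, sep, accession = part.partition(":")
        if sep:
            pairs.append((label.strip(), accession.strip()))
        else:
            # No label present: the whole entry is the accession.
            pairs.append(("", part))
    return pairs


def nuccore_url(accession):
    """Build an NCBI Nucleotide URL (assumed link pattern)."""
    return f"https://www.ncbi.nlm.nih.gov/nuccore/{accession}"


# Placeholder accessions, not real records.
for label, accession in parse_accessions("DNA-A: XX000001; DNA-B: XX000002"):
    print(label, nuccore_url(accession))
```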

To load a VMR table extracted from the VMR download file:

```python
from mtbp3.seq import ictvvmrview
vmr = ictvvmrview.ictvvmr(vmr_file_path="")
```


When the path is empty, an example file is used for illustration.
Similarly, use `vmr.update_vmr(version="current")` to download the current version from ICTV.

To search for exemplars including "feline" using `search_rank_or_exemplar="Virus name(s)"`:

```python
print('\n'.join(vmr.find_rows_given_str(search_strings="feline", search_rank_or_exemplar="Virus name(s)", color="red", outfmt='tree', tree_style="drop")))
```


To search for the genus *Lentivirus*:

```python
print('\n'.join(vmr.find_rows_given_str(search_strings="lentivirus", search_rank_or_exemplar="Genus", color="red", outfmt='tree', exact=True, tree_style="drop")))
```

## Other Resources

NCBI:

- Explore Virus Data [^13]
- NCBI Visual Data Dashboard [^14]


## References

[^1]: ICTV. (2024). About Virus Taxonomic Classification. ([web page](https://ictv.global/taxonomy/about))
[^2]: Lefkowitz, E. J., Dempsey, D. M., Hendrickson, R. C., Orton, R. J., Siddell, S. G., & Smith, D. B. (2018). Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV). Nucleic Acids Research, 46(D1), D708–D717. ([web page](https://doi.org/10.1093/nar/gkx932))
[^3]: Wikipedia. (year). Virus classification. ([web page](https://en.wikipedia.org/wiki/Virus_classification))
[^4]: ICTV. (2024). Master Species Lists (MSL). ([web page](https://ictv.global/msl))
[^5]: ICTV. (2024). Virus Metadata Resource (VMR). ([web page](https://ictv.global/vmr))
[^6]: ICTV. (2024). Current ICTV Taxonomy Release. ([web page](https://ictv.global/taxonomy))
[^7]: ICTV. (2024). Visual Taxonomy Browser. ([web page](https://ictv.global/taxonomy/visual-browser))
[^8]: Sykes J. E. (2014). Feline Immunodeficiency Virus Infection. Canine and Feline Infectious Diseases, 209–223. ([web page](https://doi.org/10.1016/B978-1-4377-0795-3.00021-1))
[^9]: ICTV. (2023). History of the taxon. ([web page](https://ictv.global/taxonomy/taxondetails?taxnode_id=202305029))
[^10]: ICTV. (year). The ICTV Report on Virus Classification and Taxon Nomenclature. ([web page](https://ictv.global/report/chapter/retroviridae/retroviridae/lentivirus))
[^11]: Siddell, S. G., Smith, D. B., Adriaenssens, E., Alfenas-Zerbini, P., Dutilh, B. E., Garcia, M. L., Junglen, S., Krupovic, M., Kuhn, J. H., Lambert, A. J., Lefkowitz, E. J., Łobocka, M., Mushegian, A. R., Oksanen, H. M., Robertson, D. L., Rubino, L., Sabanadzovic, S., Simmonds, P., Suzuki, N., Van Doorslaer, K., … Zerbini, F. M. (2023). Virus taxonomy and the role of the International Committee on Taxonomy of Viruses (ICTV). The Journal of general virology, 104(5), 001840. ([web page](https://doi.org/10.1099/jgv.0.001840))
[^12]: Kuhn, J. H., Bao, Y., Bavari, S., Becker, S., Bradfute, S., Brister, J. R., Bukreyev, A. A., Chandran, K., Davey, R. A., Dolnik, O., Dye, J. M., Enterlein, S., Hensley, L. E., Honko, A. N., Jahrling, P. B., Johnson, K. M., Kobinger, G., Leroy, E. M., Lever, M. S., Mühlberger, E., … Nichol, S. T. (2013). Virus nomenclature below the species level: a standardized nomenclature for natural variants of viruses assigned to the family Filoviridae. Archives of virology, 158(1), 301–311. ([web page](https://doi.org/10.1007/s00705-012-1454-0))
[^13]: NCBI. (2024). Explore Virus Data: Feline immunodeficiency virus. ([web page](https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/virus?SeqType_s=Nucleotide&VirusLineage_ss=Feline%20immunodeficiency%20virus,%20taxid:11673))
[^14]: NCBI. (2024). NCBI Visual Data Dashboard. ([web page](https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/))
[^15]: ICTV. (year). Taxon names are written differently from virus names. ([web page](https://ictv.global/faq/names))