HIV Sequence Compendium 2008

Editors
Carla Kuiken, Los Alamos National Laboratory
Thomas Leitner, Los Alamos National Laboratory
Brian Foley, Los Alamos National Laboratory
Beatrice Hahn, University of Alabama
Preston Marx, Tulane National Primate Research Center
Francine McCutchan, Henry M. Jackson Foundation
Steven Wolinsky, Northwestern University
Bette Korber, Los Alamos National Laboratory

Project Officer
Geetha Bansal, Division of AIDS, National Institute of Allergy and Infectious Diseases

Los Alamos HIV Sequence Database and Analysis Staff
Werner Abfalterer, Gayathri Athreya, Will Fischer, Bob Funkhouser, Brian Gaschen, Peter Hraber, Chien-Chi Lo, Jennifer Macke, James J. Szinger, James Thurmond, Hyejin Yoon, Ming Zhang

This publication is funded by the Division of AIDS, National Institute of Allergy and Infectious Diseases, through an interagency agreement with the U.S. Department of Energy.

Published by
Theoretical Biology and Biophysics Group T-10, Mail Stop K710
Los Alamos National Laboratory
Los Alamos, New Mexico 87545 U.S.A.

LA-UR 08-03719

http://www.hiv.lanl.gov/


Approved for public release; distribution is unlimited.

Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by Los Alamos National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy under contract DE-AC52-06NA25396.

This report was prepared as an account of work sponsored by an agency of the U.S. Government. Neither Los Alamos National Security, LLC, the U.S. Government nor any agency thereof, nor any of their employees make any warranty, express or implied, or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represent that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by Los Alamos National Security, LLC, the U.S. Government, or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of Los Alamos National Security, LLC, the U.S. Government, or any agency thereof.

Los Alamos National Laboratory strongly supports academic freedom and a researcher’s right to publish; as an institution, however, the Laboratory does not endorse the viewpoint of a publication or guarantee its technical correctness.

Contents

I Preface
    I-1 Introduction
    I-2 Acknowledgements
    I-3 Citing the database
    I-4 About the PDF
    I-5 Genome maps
    I-6 HIV/SIV proteins
    I-7 Landmarks of the Genome
    I-8 Amino Acid Codes
    I-9 Nucleic Acid Codes

II HIV-1/SIVcpz Complete Genomes
    II-1 Introduction
    II-2 Annotated Features
    II-3 Sequences
    II-4 Alignments

III HIV-2/SIV Complete Genomes
    III-1 Introduction
    III-2 Annotated Features
    III-3 Sequences
    III-4 Alignments

IV PLV Complete Genomes
    IV-1 Introduction
    IV-2 Sequences
    IV-3 Alignments

V HIV-1/SIVcpz Proteins
    V-1 Introduction
    V-2 Annotated Features
    V-3 Sequences
    V-4 Alignments

VI HIV-2/SIV Proteins
    VI-1 Introduction
    VI-2 Annotated Features
    VI-3 Sequences
    VI-4 Alignments

VII PLV Proteins
    VII-1 Introduction
    VII-2 Sequences
    VII-3 Alignments

I Preface

I-1 Introduction

This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in a way that is of maximum utility to HIV researchers. Each alignment attempts to display the genetic variability within the different species, groups, and subtypes of the virus.

This compendium contains sequences published through the end of 2007.

The number of sequences in the HIV database is still increasing exponentially. At the time of printing, there were 229,451 sequences in the HIV Sequence Database, an increase of 17% since last year.

The number of near-complete genomes (>7000 nucleotides) increased to 2084 by the end of 2007, a smaller increase than in previous years. As in previous years, however, size limitations have forced us to omit many sequences from the compendium alignments. In choosing which sequences to omit, we considered redundant sequencing of certain isolates and patients, and we constructed phylogenetic trees with all available sequences. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html

This year we provide phylogenetic trees of the alignments in this compendium. These trees are meant as general guides for interpreting the overall genetic variation. Trees are shown both before and after compendium selection. Known recombinants have been removed from all trees so that they show the phylogenetic history of the non-recombinant sequences as accurately as possible.

Reprints are available from our website as both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to [email protected]

I-2 Acknowledgements

The HIV Sequence Database and Analysis Project is funded by the Vaccine and Prevention Research Program of the AIDS Division of the National Institute of Allergy and Infectious Diseases (Dr. Geetha Bansal, Project Officer) through an interagency agreement with the U.S. Department of Energy.

I-3 Citing the database

This publication may be cited as

HIV Sequence Compendium 2008. Carla Kuiken, Thomas Leitner, Brian Foley, Beatrice Hahn, Preston Marx, Francine McCutchan, Steven Wolinsky, and Bette Korber, editors. 2008. Publisher: Los Alamos National Laboratory, Theoretical Biology and Biophysics, Los Alamos, New Mexico. LA-UR 08-03719.

I-4 About the PDF

The complete HIV Sequence Compendium 2008 is available in Adobe Portable Document Format (PDF) from our website, http://www.hiv.lanl.gov/. The PDF version is hypertext enabled and features ‘clickable’ table-of-contents, indexes, references and links to external web sites.

This volume is typeset using LaTeX.


Figure I.1: This year’s cover is a graphic image of the full 2007 HIV-1 web alignment, translated to amino acids. The amino acids are color-coded; each is represented by one pixel. Code written by Will Fischer, HIV database project.


I-5 Genome maps

[Figure: genome maps of HIV-1 HXB2 (scale 0 to 9719), HIV-2 BEN (scale 0 to 10359), and SIV Sykes (scale 0 to 9597). Each map places the 5′ and 3′ LTRs and the open reading frames gag (p17, p24, p2, p7, p1, p6), pol (prot, p51 RT, p15, p31 int), env (gp120, gp41), vif, vpr, vpx (HIV-2 BEN), vpu (HIV-1 HXB2), tat, rev, and nef in reading frames 1, 2, and 3, with the start and end coordinate of each gene.]

Landmarks of the HIV-1, HIV-2, and SIV genomes. Open reading frames are shown as rectangles. The gene start, indicated by the small number in the upper left corner of each rectangle, normally records the position of the a in the atg start codon for that gene, while the number in the lower right records the last position of the stop codon. For pol, the start is taken to be the first t in the sequence ttttttag, which forms part of the stem loop that potentiates ribosomal slippage on the RNA and a resulting −1 frameshift and the translation of the Gag-Pol polyprotein. The tat and rev spliced exons are shown as shaded rectangles. In HXB2, *5772 marks the position of a frameshift in the vpr gene caused by an “extra” t relative to most other subtype B viruses; !6062 indicates a defective acg start codon in vpu; †8424 and †9168 mark premature stop codons in tat and nef. See Korber et al., Numbering Positions in HIV Relative to HXB2CG, in Human Retroviruses and AIDS, 1998, p. 102. Available from http://www.hiv.lanl.gov/content/sequence/HIV/REVIEWS/HXB2.html
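The start and end numbering described in the caption can be illustrated with a short Python sketch. The coordinates are the HXB2 values printed on the genome map; the names `HXB2_ORFS`, `reading_frame`, and `orf_length` are hypothetical helpers introduced here, and the frame convention (position 1 opens frame 1) is an assumption rather than something the caption states.

```python
# HXB2 open reading frame coordinates (1-based, inclusive), taken from the
# HIV-1 HXB2 genome map. The spliced genes tat and rev are left out.
HXB2_ORFS = {
    "gag": (790, 2292),
    "pol": (2085, 5096),
    "vif": (5041, 5619),
    "vpr": (5559, 5850),
    "vpu": (6062, 6310),
    "env": (6225, 8795),
    "nef": (8797, 9417),
}


def reading_frame(start: int) -> int:
    """Return 1, 2, or 3, assuming genome position 1 opens frame 1."""
    return (start - 1) % 3 + 1


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides, counting both the start and the end position."""
    return end - start + 1


if __name__ == "__main__":
    for gene, (start, end) in HXB2_ORFS.items():
        length = orf_length(start, end)
        note = "" if length % 3 == 0 else " (not a multiple of 3)"
        print(f"{gene}: frame {reading_frame(start)}, {length} nt{note}")
```

Under these assumptions gag falls in frame 1 and pol in frame 3, which is what a −1 shift out of frame 1 produces. The vpr span is not a multiple of three, consistent with the frameshift marked at *5772.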


I-6 HIV/SIV proteins

Name | Size | Function | Localization
Gag: MA | p17 | membrane anchoring; env interaction; nuclear transport of viral core (myristylated protein) | virion
Gag: CA | p24 | core capsid | virion
Gag: NC | p7 | nucleocapsid, binds RNA | virion
Gag | p6 | binds Vpr | virion
Pol: Protease (PR) | p15 | Gag/Pol cleavage and maturation | virion
Pol: Reverse Transcriptase (RT) | p66, p51 | reverse transcription, RNase H activity | virion
Pol: RNase H | p15 | | virion
Pol: Integrase (IN) | p31 | DNA provirus integration | virion
Env | gp120/gp41 | external viral glycoproteins bind to CD4 and secondary receptors | plasma membrane, virion envelope
Tat | p16/p14 | viral transcriptional transactivator | primarily in nucleolus/nucleus
Rev | p19 | RNA transport, stability and utilization factor (phosphoprotein) | primarily in nucleolus/nucleus, shuttling between nucleolus and cytoplasm
Vif | p23 | promotes virion maturation and infectivity | cytoplasm (cytosol, membranes), virion
Vpr | p10-15 | promotes nuclear localization of preintegration complex, inhibits cell division, arrests infected cells at G2/M | virion, nucleus (nuclear membrane?)
Vpu | p16 | promotes extracellular release of viral particles; degrades CD4 in the ER; (phosphoprotein only in HIV-1 and SIVcpz) | integral membrane protein
Nef | p27-p25 | CD4 and class I downregulation (myristylated protein) | plasma membrane, cytoplasm, (virion?)
Vpx | p12-16 | Vpr homolog present in HIV-2 and some SIVs, absent in HIV-1 | virion (nucleus?)
Tev | p28 | tripartite tat-env-rev protein (also named Tnv) | primarily in nucleolus/nucleus


I-7 Landmarks of the Genome

HIV genomic structural elements

LTR Long terminal repeat, the DNA sequence flanking the genome of integrated proviruses. It contains important regulatory regions, especially those for transcription initiation and polyadenylation.

TAR Target sequence for viral transactivation, the binding site for Tat protein and for cellular proteins; consists of approximately the first 45 nucleotides of the viral mRNAs in HIV-1 (or the first 100 nucleotides in HIV-2 and SIV). TAR RNA forms a hairpin stem-loop structure with a side bulge; the bulge is necessary for Tat binding and function.

RRE Rev responsive element, an RNA element encoded within the env region of HIV-1. It consists of approximately 200 nucleotides (positions 7327 to 7530 from the start of transcription in HIV-1, spanning the border of gp120 and gp41). The RRE is necessary for Rev function; it contains a high affinity site for Rev; in all, approximately seven binding sites for Rev exist within the RRE RNA. Other lentiviruses (HIV-2, SIV, visna, CAEV) have similar RRE elements in similar locations within env, while HTLVs have an analogous RNA element (RXRE) serving the same purpose within their LTR; RRE is the binding site for Rev protein, while RXRE is the binding site for Rex protein. RRE (and RXRE) form complex secondary structures, necessary for specific protein binding.

PE Psi elements, a set of 4 stem-loop structures preceding and overlapping the Gag start codon which are the sites recognized by the cysteine histidine box, a conserved motif with the canonical sequence CysX2CysX4HisX4Cys, present in the Gag p7 NC protein. The Psi Elements are present in unspliced genomic transcripts but absent from spliced viral mRNAs.

SLIP A TTTTTT slippery site, followed by a stem-loop structure, is responsible for regulating the −1 ribosomal frameshift out of the Gag reading frame into the Pol reading frame.

CRS Cis-acting repressive sequences postulated to inhibit structural protein expression in the absence of Rev. One such site was mapped within the pol region of HIV-1. The exact function has not been defined; splice sites have been postulated to act as CRS sequences.

INS Inhibitory/Instability RNA sequences found within the structural genes of HIV-1 and of other complex retroviruses. Multiple INS elements exist within the genome and can act independently; one of the best characterized elements spans nucleotides 414 to 631 in the gag region of HIV-1. The INS elements have been defined by functional assays as elements that inhibit expression posttranscriptionally. Mutation of the RNA elements was shown to lead to INS inactivation and upregulation of gene expression.
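Two of the elements above are defined by short sequence patterns, the CysX2CysX4HisX4Cys box and the TTTTTT slippery site, and both can be located with regular expressions. The following Python sketch is illustrative only: `find_motif` is a hypothetical helper, and the example sequences in the usage note are invented strings, not database entries.

```python
import re

# Cys-X2-Cys-X4-His-X4-Cys, where X stands for any amino acid.
CYS_HIS_BOX = re.compile(r"C.{2}C.{4}H.{4}C")

# A run of six T's, the slippery site described under SLIP.
SLIPPERY_SITE = re.compile(r"TTTTTT")


def find_motif(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return the 1-based start position of each non-overlapping match."""
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]
```

For instance, `find_motif(CYS_HIS_BOX, "KCFNCGKEGHIAKNCRA")` reports a match starting at position 2, and `find_motif(SLIPPERY_SITE, "AATTTTTTAGGGAAG")` reports position 3. A pattern match alone does not establish function; the stem-loop that follows the slippery site is not checked here.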

Genes and gene products

GAG The genomic region encoding the capsid proteins (group specific antigens). The precursor is the p55 myristylated protein, which is processed to p17 (MAtrix), p24 (CApsid), p7 (NucleoCapsid), and p6 proteins, by the viral protease. Gag associates with the plasma membrane where the virus assembly takes place. The 55 kDa Gag precursor is called assemblin to indicate its role in viral assembly.

POL The genomic region encoding the viral enzymes protease, reverse transcriptase, RNase H, and integrase. These enzymes are produced as a Gag-Pol precursor polyprotein, which is processed by the viral protease; the Gag-Pol precursor is produced by ribosome frameshifting near the 3′ end of gag.

ENV Viral glycoproteins produced as a precursor (gp160) which is processed to give a noncovalent complex of the external glycoprotein gp120 and the transmembrane glycoprotein gp41. The mature gp120-gp41 proteins are bound by non-covalent interactions and are associated as a trimer on the cell surface. A substantial amount of gp120 can be found released in the medium. gp120 contains the binding sites for the CD4 receptor and for the seven-transmembrane-domain chemokine receptors that serve as co-receptors for HIV-1.

TAT Transactivator of HIV gene expression. One of two essential viral regulatory factors (Tat and Rev) for HIV gene expression. Two forms are known, Tat-1 exon (minor form) of 72 amino acids and Tat-2 exon (major form) of 86 amino acids. Low levels of both proteins are found in persistently infected cells. Tat has been localized primarily in the nucleolus/nucleus by immunofluorescence. It acts by binding to the TAR RNA element and activating transcription initiation and elongation from the LTR promoter, preventing the 5′ LTR AATAAA polyadenylation signal from causing premature termination of transcription and polyadenylation. It is the first eukaryotic transcription factor known to interact with RNA rather than DNA and may have similarities with prokaryotic anti-termination factors. Extracellular Tat can be found and can be taken up by cells in culture.

REV The second necessary regulatory factor for HIV expression. A 19 kDa phosphoprotein, localized primarily in the nucleolus/nucleus, Rev acts by binding to RRE and promoting the nuclear export, stabilization and utilization of the unspliced viral mRNAs containing RRE. Rev is considered the most functionally conserved regulatory protein of lentiviruses. Rev cycles rapidly between the nucleus and the cytoplasm.

VIF Viral infectivity factor, a basic protein of typically 23 kDa. Promotes the infectivity but not the production of viral particles. In the absence of Vif the produced viral particles are defective, while the cell-to-cell transmission of virus is not affected significantly. Found in almost all lentiviruses, Vif is a cytoplasmic protein, existing in both a soluble cytosolic form and a membrane-associated form. The latter form of Vif is a peripheral membrane protein that is tightly associated with the cytoplasmic side of cellular membranes. In 2003, it was discovered that Vif prevents the action of the cellular APOBEC-3G protein which deaminates DNA:RNA heteroduplexes in the cytoplasm.


VPR Vpr (viral protein R) is a 96-amino acid (14 kDa) protein, which is incorporated into the virion. It interacts with the p6 Gag part of the Pr55 Gag precursor. Vpr detected in the cell is localized to the nucleus. Proposed functions for Vpr include targeting the nuclear import of preintegration complexes, cell growth arrest, transactivation of cellular genes, and induction of cellular differentiation. In HIV-2, SIV-SMM, SIV-RCM, SIV-MND-2 and SIV-DRL the Vpx gene is apparently the result of a Vpr gene duplication event, possibly by recombination.

VPU Vpu (viral protein U) is unique to HIV-1, SIVcpz (the closest SIV relative of HIV-1), SIV-GSN, SIV-MUS, SIV-MON and SIV-DEN. There is no similar gene in HIV-2, SIV-SMM or other SIVs. Vpu is a 16 kDa (81-amino acid) type I integral membrane protein with at least two different biological functions: (a) degradation of CD4 in the endoplasmic reticulum, and (b) enhancement of virion release from the plasma membrane of HIV-1-infected cells. Env and Vpu are expressed from a bicistronic mRNA. Vpu probably possesses an N-terminal hydrophobic membrane anchor and a hydrophilic moiety. It is phosphorylated by casein kinase II at positions Ser52 and Ser56. Vpu is involved in Env maturation and is not found in the virion. Vpu has been found to increase susceptibility of HIV-1 infected cells to Fas killing.

NEF A multifunctional 27-kDa myristylated protein produced by an ORF located at the 3′ end of the primate lentiviruses. Other forms of Nef are known, including nonmyristylated variants. Nef is predominantly cytoplasmic and associated with the plasma membrane via the myristyl residue linked to the conserved second amino acid (Gly). Nef has also been identified in the nucleus and found associated with the cytoskeleton in some experiments. One of the first HIV proteins to be produced in infected cells, it is the most immunogenic of the accessory proteins. The nef genes of HIV and SIV are dispensable in vitro, but are essential for efficient viral spread and disease progression in vivo. Nef is necessary for the maintenance of high virus loads and for the development of AIDS in macaques, and viruses with defective Nef have been detected in some HIV-1 infected long term survivors. Nef downregulates CD4, the primary viral receptor, and MHC class I molecules, and these functions map to different parts of the protein. Nef interacts with components of host cell signal transduction and clathrin-dependent protein sorting pathways. It increases viral infectivity. Nef contains PxxP motifs that bind to SH3 domains of a subset of Src kinases and are required for the enhanced growth of HIV but not for the downregulation of CD4.

VPX A virion protein of 12 kDa found in HIV-2, SIV-SMM, SIV-RCM, SIV-MND-2 and SIV-DRL and not in HIV-1 or other SIVs. This accessory gene is a homolog of HIV-1 vpr, and viruses with Vpx carry both vpr and vpx. Vpx function in relation to Vpr is not fully elucidated; both are incorporated into virions at levels comparable to Gag proteins through interactions with Gag p6. Vpx is necessary for efficient replication of SIV-SMM in PBMCs. Progression to AIDS and death in SIV-infected animals can occur in the absence of Vpr or Vpx. Double mutant virus lacking both vpr and vpx was attenuated, whereas the single mutants were not, suggesting a redundancy in the function of Vpr and Vpx related to virus pathogenicity.

Structural proteins/viral enzymes The products of gag, pol, and env genes, which are essential components of the retroviral particle.

Regulatory proteins Tat and Rev proteins of HIV/SIV and Tax and Rex proteins of HTLVs. They modulate transcriptional and posttranscriptional steps of virus gene expression and are essential for virus propagation.

Accessory or auxiliary proteins Additional virion and non-virion-associated proteins produced by HIV/SIV retroviruses: Vif, Vpr, Vpu, Vpx, Nef. Although the accessory proteins are in general not necessary for viral propagation in tissue culture, they have been conserved in the different isolates; this conservation and experimental observations suggest that their role in vivo is very important. Their functional importance continues to be elucidated.

Complex retroviruses Retroviruses regulating their expression via viral factors and expressing additional proteins (regulatory and accessory) essential for their life cycle.


I-8 Amino Acid Codes

A  Alanine
B  Aspartic Acid or Asparagine
C  Cysteine
D  Aspartic Acid
E  Glutamic Acid
F  Phenylalanine
G  Glycine
H  Histidine
I  Isoleucine
K  Lysine
L  Leucine
M  Methionine
N  Asparagine
P  Proline
Q  Glutamine
R  Arginine
S  Serine
T  Threonine
V  Valine
W  Tryptophan
X  unknown or “other” amino acid
Y  Tyrosine
Z  Glutamic Acid or Glutamine
.  gap
-  identity
$  stop codon
#  incomplete codon
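As a rough illustration of how the `$`, `#`, and `X` symbols can arise, the Python sketch below translates a nucleotide string with the standard genetic code. The function `translate` and its rules are assumptions made for this example (a trailing partial codon becomes `#`, any codon containing a non-ACGT character becomes `X`); they are not the database's own translation procedure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate from the first base, using the symbols listed above."""
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if len(codon) < 3:
            protein.append("#")  # assumed rule: trailing partial codon
        else:
            aa = CODON_TABLE.get(codon, "X")  # ambiguous codon -> X
            protein.append("$" if aa == "*" else aa)
    return "".join(protein)
```

With these rules, `translate("ATGGGTGCGTAA")` gives `MGA$` and `translate("ATGGG")` gives `M#`.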

I-9 Nucleic Acid Codes

A  Adenine
C  Cytosine
G  Guanine
T  Thymine
U  Uracil
M  A or C
R  A or G
W  A or T
S  C or G
Y  C or T
K  G or T
V  A or C or G
H  A or C or T
D  A or G or T
B  C or G or T
N  unknown
.  gap
-  identity
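The ambiguity codes above map naturally onto regular-expression character classes. The Python sketch below shows one way to use them for matching; `IUPAC_BASES`, `iupac_to_regex`, and `matches` are hypothetical names, and treating N as "any of A, C, G, T" is an assumption, since the table lists it only as unknown. The gap and identity symbols are not handled.

```python
import re

# Each code maps to the bases it can stand for, following the table above.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",  # assumption: unknown is treated as any DNA base
}


def iupac_to_regex(pattern: str) -> str:
    """Convert a string of nucleic acid codes into a regular expression."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC_BASES[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """True if the whole sequence is compatible with the coded pattern."""
    return re.fullmatch(iupac_to_regex(pattern), sequence.upper()) is not None
```

For example, `iupac_to_regex("ATGR")` yields `ATG[AG]`, so `matches("ATGR", "ATGA")` is true while `matches("ATGR", "ATGC")` is false.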