RESEARCH ARTICLE

Genomewide analysis of sperm whale E2 ubiquitin conjugating enzyme genes

RAN TIAN1,2, CHEN YANG2, YUEPAN GENG2, INGE SEIM2,3* and GUANG YANG1*

1 Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing 210046, Jiangsu, People’s Republic of China
2 Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, Nanjing 210046, Jiangsu, People’s Republic of China
3 School of Biology and Environmental Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
*For correspondence. E-mail: Inge Seim, [email protected]; Guang Yang, [email protected].

Received 7 December 2020; revised 15 July 2021; accepted 28 July 2021

Abstract. Marine mammals are exposed to the oxidative stress induced by hypoxia/reoxygenation cycles yet resist cellular damage. The availability of high-quality genomes promises to provide insights on how this is achieved. In this study, we considered the ubiquitin-conjugating enzyme (E2) gene family, the UBE2 genes, which encodes enzymes with critical roles in cellular physiology, including the oxidative stress response. The sperm whale was the first marine mammal with a chromosome-level genome, allowing the study of gene family repertoires, phylogenetic relationships, chromosomal gene organization, and other evolutionary patterns on a genomewide basis. Here, 39 UBE2 genes (similar to the number in human; 32 intact genes, one partial gene, and six pseudogenes) were identified in the sperm whale genome. These genes were found on 17 chromosomes and were assigned to 23 subfamilies, 16 subgroups, and four classes based on structural characteristics and functions, phylogeny, and conserved domains, respectively. Although the gene structure and motif distribution of sperm whale UBE2 genes are conserved in each subfamily, motif variation and intron gain/loss may contribute to functional divergence. Segmental duplications were detected in six gene pairs, which could drive UBE2 gene innovation in the sperm whale. Contrasting seven cetaceans and five terrestrial taxa, we found that cetaceans have experienced shifts in selective constraint on UBE2 genes, which may contribute to oxidative stress tolerance during the adaptation to aquatic life. Our results provide the first comprehensive survey of cetacean UBE2 genes.

Keywords. ubiquitin-conjugating enzymes; sperm whale; oxidative stress; evolution.

Ran Tian and Chen Yang contributed equally to this work.

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s12041-021-01333-y.

Journal of Genetics (2021) 100:78 © Indian Academy of Sciences https://doi.org/10.1007/s12041-021-01333-y

Introduction

Ubiquitination refers to the covalent ligation of ubiquitin (Ub) to proteins, that is, the formation of ubiquitin-protein conjugates. The ubiquitin system is a major regulator of cell physiology. Ubiquitin-mediated degradation of regulatory proteins impacts diverse cellular processes, such as signal transduction, transcription, immune response, and DNA repair (Stewart et al. 2016). Generally, three types of enzymes are involved in the ubiquitination of proteins. A ubiquitin-activating enzyme (E1) activates ubiquitin by forming a thiol ester bond with ubiquitin in an ATP-dependent manner. The E1 then passes the activated ubiquitin to a ubiquitin-conjugating enzyme (E2) to form a high-energy conjugate with Ub (via a conserved catalytic cysteine residue in the E2). The E2 next selectively interacts with one or more ubiquitin ligase enzymes (E3s), resulting in binding of ubiquitin to a lysine residue on the target protein (i.e., substrate). Although genes in the E2 gene family have considerable sequence divergence, they all harbour a 150 to 200 amino acid ubiquitin-conjugating catalytic (UBC) domain, which forms a canonical topology, and a motif containing an HPN tripeptide (histidine–proline–asparagine) and a catalytic cysteine residue (Michelle et al. 2009). The E2 gene family repertoire varies among species. For instance, the human (Homo sapiens) genome has 40 E2 genes, while 16 genes are found in yeast (Saccharomyces cerevisiae) (Michelle et al. 2009). This variability in gene number between species results from gene duplication.

Oxidative stress occurs when cells are exposed to an imbalance between the production of antioxidants and reactive oxygen species (ROS), resulting in damage to biomolecules (e.g., DNA) and cells (Reuter et al. 2010). Serving as a stress-sensing mechanism, the ubiquitin pathway plays an important role in the response to oxidative stress (Stewart et al. 2016). For example, oxidative stress conditions are associated with increased stability of substrates of the E2 enzyme CDC34 (encoded by UBE2R1) and delayed cell cycle progression (Doris et al. 2012).

The sperm whale (Physeter macrocephalus) was the first marine mammal with a chromosome-level genome (Fan et al. 2019). As the deepest and longest diver (2035 m depth, >73 min) of all marine mammals (Watwood et al. 2006), the sperm whale is exposed to oxidative stress associated with hypoxia/reoxygenation and ischemia/reperfusion cycles during diving (Allen and Vazquez-Medina 2019). Limited oxygen or ischemia depletes cellular ATP, leading to accumulation of ATP degradation products (e.g., xanthine and hypoxanthine). Upon reoxygenation or reperfusion after diving, these ATP degradation products are oxidized, increasing oxidant generation (e.g., superoxide radical) from enzymatic systems. Consequently, marine mammals would be expected to suffer extensive oxidative injury mediated by reactive oxygen species (Allen and Vazquez-Medina 2019). However, marine mammals are able to resist oxidative damage. Elevated antioxidant capacities have been reported in physiological research on diving mammals (Lopez-Cruz et al. 2014; Del Castillo Velasco-Martínez et al. 2016; Del Aguila-Vargas et al. 2020). For instance, glutathione levels were higher in Weddell and harbour seals than in human plasma (Vazquez-Medina et al. 2012). Moreover, the enzyme activity of erythrocyte glutathione peroxidase was two to three times higher in the manatee than in terrestrial species (Wilhelm et al. 2002). These features likely help marine mammals counteract increases in apnea-derived oxidant production. However, it is appreciated that, to fully understand oxidative stress resistance in marine mammals, its genetic basis must be explored. Previous findings on cetaceans include an increase in the copy number of antioxidant genes (the PRDX and GST gene families) (Yim et al. 2014; Tian et al. 2019) and evidence of positive selection of hypoxia-related genes (e.g., genes encoding haemoglobins and myoglobin) (Tian et al. 2016).

Considering that E2 is the key constituent in the ubiquitin pathway, and has a role in oxidative stress, we here characterized the E2 enzyme gene family of the sperm whale to (i) provide a more comprehensive understanding of phylogenetic relationships, chromosomal gene organization, gene gain or loss, and evolutionary processes; and (ii) compare and contrast the evolution of the E2 gene family of cetaceans and terrestrial mammals.

Materials and methods

Gene identification

A hidden Markov model (HMM) file containing the ubiquitin-conjugating (UBC) domain (PF00179) was downloaded from the Pfam database (http://pfam.xfam.org) (El-Gebali et al. 2019). HMMER v3.3 (http://hmmer.org), with a cut-off of 0.01, was used to search for E2 conjugating enzyme (i.e., UBE2) genes in the sperm whale genome (NCBI assembly no. GCF_002837175.2). After manually removing redundant sequences, candidate genes containing a UBC domain were submitted to the SMART (Schultz et al. 1998), Pfam (El-Gebali et al. 2019), and CDD (Lu et al. 2020) databases to verify the presence of UBE2 core sequences. Next, all the candidate sequences were manually inspected. The corresponding CDS and gene sequences were extracted according to their protein identifiers. Isoelectric points and molecular weights of UBE2 family proteins were estimated using ExPASy ProtParam (Gasteiger et al. 2005) (https://web.expasy.org/protparam). The subcellular localization of UBE2 proteins was predicted using CELLO v2.5 (Yu et al. 2006) (http://cello.life.nctu.edu.tw).
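The threshold-filtering step described above can be illustrated with a short sketch. It assumes that the 0.01 cut-off is a full-sequence E-value threshold and that hits are read from an `hmmsearch --tblout` style table, in which the first whitespace-separated field is the target name and the fifth is the full-sequence E-value. The function name `filter_hits` and the example rows are hypothetical and are not the authors' pipeline or data.

```python
def filter_hits(tblout_text, max_evalue=0.01):
    """Return {target_name: best E-value} for hits at or below max_evalue.

    Assumes an hmmsearch --tblout style table: field 1 is the target name,
    field 5 is the full-sequence E-value; comment lines start with '#'.
    """
    best = {}
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            # keep the smallest E-value seen for each target
            best[target] = min(evalue, best.get(target, evalue))
    return best


# Hypothetical rows for illustration only (not real search results).
example = """\
# target name  accession  query name  accession  E-value  score  bias
protA          -          UQ_con      PF00179    1.2e-45  150.3  0.1
protB          -          UQ_con      PF00179    0.5      5.1    0.0
protA          -          UQ_con      PF00179    3.0e-10  40.0   0.0
"""

print(filter_hits(example))  # {'protA': 1.2e-45}
```

Collapsing repeated targets to their best E-value loosely mirrors the removal of redundant sequences, which the study performed manually.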

Sequence alignment and phylogeny

Multiple codon-based alignments were performed with predicted nucleotide sequences of UBE2 genes using MUSCLE implemented in MEGA X v10.1 (Kumar et al. 2018). Conserved alignment blocks were identified using Gblocks v0.91b (Talavera and Castresana 2007), with default settings and allowing gaps in all sequences. Phylogenetic amino acid trees were inferred by maximum likelihood, with 1000 ultrafast bootstrap replicates, implemented in IQ-TREE v1.6.2 (Nguyen et al. 2015) (http://iqtree.cibiv.univie.ac.at) and utilizing the built-in model test option. The best-fit model for the aligned sequences, VT + G4, was determined using ModelFinder (Kalyaanamoorthy et al. 2017), which is part of IQ-TREE. Phylogenetic trees were displayed and annotated using iTOL v5.5 (Interactive Tree Of Life, https://itol.embl.de) (Letunic and Bork 2019).

Gene structure prediction

To further classify the genes, we examined the conserved motifs of all candidate UBE2 genes using multiple EM for motif elicitation (MEME; http://meme-suite.org/tools/meme) (Bailey et al. 2009) with the following optimized parameters: any number of repetitions; minimum seven motifs; maximum 50 motifs; and an optimum width of 10–200 amino acids for each motif. The MAST tool in the MEME suite (Bailey et al. 2009) was used to compare discovered motifs between sperm whale and human. Exon/intron structures were predicted using the Gene Structure Display Server (GSDS; http://gsds.cbi.pku.edu.cn) (Hu et al. 2014).

Chromosomal distribution, gene duplication and promoter prediction

All identified UBE2 genes were mapped onto sperm whale chromosomes using MapChart v2.32 (Voorrips 2002). Duplication events and synteny of UBE2 family genes in the sperm whale genome were identified using MCScanX (Wang et al. 2012). The chromosomal distributions and syntenic relationships of UBE2 family genes in sperm whale and other species (e.g. human, mouse, dog and cow) were constructed using Circos v0.69-9 (Krzywinski et al. 2009). The nonsynonymous (Ka) and synonymous (Ks) substitution rates, and their ratio (Ka/Ks), of syntenic UBE2 gene pairs were calculated using KaKs Calculator v2.0 (Wang et al. 2010).
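To make the distinction between synonymous and nonsynonymous changes concrete, the sketch below classifies codon differences between two aligned, gap-free, in-frame coding sequences using the standard genetic code. It is a deliberately simplified illustration and not the method implemented in KaKs Calculator: true Ka and Ks values also normalise by the number of nonsynonymous and synonymous sites and correct for multiple substitutions. The function name and the two short sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_differences(cds_a, cds_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs must be aligned, gap-free, in-frame coding sequences
    of equal length.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3:
        raise ValueError("sequences must be equal-length and in frame")
    syn = nonsyn = 0
    for i in range(0, len(cds_a), 3):
        a, b = cds_a[i:i + 3].upper(), cds_b[i:i + 3].upper()
        if a == b:
            continue  # identical codons carry no substitution
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1      # same amino acid: synonymous difference
        else:
            nonsyn += 1   # different amino acid: nonsynonymous difference
    return syn, nonsyn


# Toy sequences (hypothetical): M-A-K-L versus M-A-R-L.
print(classify_codon_differences("ATGGCTAAACTG", "ATGGCCAGACTA"))  # (2, 1)
```

A pair with few nonsynonymous differences relative to synonymous ones is the pattern that a low Ka/Ks ratio summarises once site counts and multiple hits are accounted for.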

Molecular evolution analyses

To estimate the strength and form of selection acting on UBE2 genes, we performed genomewide searches for orthologous UBE2 genes of seven cetaceans (i.e. bottlenose dolphin, killer whale, finless porpoise, beluga whale, baiji, sperm whale and bowhead whale) and five terrestrial mammals (i.e. cow, pig, dog, mouse and human). The dataset (30 UBE2 genes detected in eight or more species) was analysed using the CodeML program in the PAML v4.7 package (Yang 2007). Site models (M8 (beta + ω ≥ 1) versus M8a (beta + ω = 1)) were used to calculate the nonsynonymous rate (dN), synonymous rate (dS) and their ratio (ω = dN/dS) at each site (Swanson et al. 2003). Branch models were employed to assess the rate of evolution along a specific lineage (Yang 1998). The two-ratio model allows one ratio for background branches and another for foreground branches, compared with the one-ratio model which ‘enforces’ the same ratio for all branches. All analyses were run with varying starting values to avoid potential local optima. Model pairs were compared using likelihood ratio tests (LRTs), with a χ² distribution to determine significance. The topology was based on McGowen et al. (2019) and Scornavacca et al. (2019).
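The likelihood ratio test used to compare nested models can be sketched as follows. The test statistic is twice the difference in log-likelihood between the alternative and null models, and for one degree of freedom (the two-ratio model has one more parameter than the one-ratio model, and M8 one more than M8a) the χ² tail probability has the closed form erfc(√(x/2)). The function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value), where statistic = 2 * (lnL_alt - lnL_null)
    and the p-value is the chi-square (df = 1) upper-tail probability.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))  # guard against tiny negatives
    p_value = math.erfc(math.sqrt(stat / 2.0))   # chi-square survival, df = 1
    return stat, p_value


# Hypothetical log-likelihoods: one-ratio (null) versus two-ratio (alternative).
stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-997.0)
print(f"2*deltaLnL = {stat:.2f}, P = {p:.4f}")
```

With these illustrative values the statistic is 6.0, which exceeds the 5% critical value of 3.84 for one degree of freedom, so the alternative model would be preferred at that level.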

Results and discussion

Identification of UBE2 genes in sperm whale

The ubiquitin-conjugating (E2) enzyme gene family is characterized by the presence of a highly conserved UBC domain (Stewart et al. 2016). UBC domain HMM profile searches, database domain validation, and manual curation revealed 39 UBE2 genes in the sperm whale genome (files in electronic supplementary material at http://www.ias.ac.in/jgenet/), including 32 intact genes, one partial gene, and six pseudogenes. The length of the protein-coding regions in the identified UBE2 genes ranged from 89 (XP_028346238.1; UBE2N_4) to 1296 amino acids (XP_023985900.1; UBE2O). The predicted molecular weights of these proteins varied from 10,088.74 Da (UBE2N_4) to 141,345.16 Da (UBE2O), and their isoelectric points ranged from 4.26 (XP_023988649.1; UBE2R2) to 10.00 (XP_028342443.1; UBE2QL1). Subcellular localization analysis indicated that 21 sperm whale UBE2 proteins are localized in the nucleus. Eight, six, and three proteins were predicted to localize to the cytoplasm, extracellular matrix, and mitochondrion, respectively. One protein (UBE2QL1) was predicted to localize to the plasma membrane (table 1 in electronic supplementary material). These results suggest that sperm whale UBE2 proteins are organelle-specific.

The phylogenetic relationship of sperm whale UBE2 genes

A classification of mammalian UBE2 genes into four classes based on the existence of additional extensions to the UBC domain has been proposed (Van Wijk and Timmers 2010). The 39 sperm whale UBE2 genes were classified into class I (16), class II (10), class III (11), and class IV (2) (table 1 in electronic supplementary material). While a well-supported maximum-likelihood (ML) phylogenetic tree was established based on the 39 UBE2 protein sequences of sperm whale (figure 1), no phylogenetic clustering by class was evident, suggesting that this E2 classification scheme has no phylogenetic significance. Sperm whale UBE2 genes were classified into 23 subfamilies by structural characteristics and functions according to the human nomenclature (Van Wijk and Timmers 2010): UBE2A, UBE2B, UBE2C, UBE2D, UBE2E, UBE2F, UBE2G, UBE2H, UBE2I, UBE2J, UBE2K, UBE2L, UBE2M, UBE2N, UBE2O, UBE2Q, UBE2R, UBE2S, UBE2T, UBE2U, UBE2V, UBE2W and UBE2Z. Phylogenetic analysis showed that sperm whale UBE2 subfamilies vary in size (figure 1). The UBE2D and UBE2N clades are the largest subfamilies, each containing four genes. UBE2N is required for DNA damage repair and regulation of p53 localization and transcriptional activity (Laine et al. 2006). We speculate that an expansion (sperm whale vs human: 4:1) of the UBE2N subfamily in sperm whale contributes to enhanced oxidative damage repair. Three members were observed in each of the UBE2E and UBE2Q subfamilies. The UBE2G, UBE2J, UBE2K, UBE2R, UBE2S and UBE2V subfamilies each have two genes. The remaining 13 subfamilies each consist of a single gene. Previous studies classified UBE2 genes into 17 (Michelle et al. 2009) or 18 (Jones et al. 2001) subgroups based on sequence similarity. Our phylogenetic analysis of 39 genes revealed 16 UBE2 subgroups in the sperm whale (figure 1). For example, subgroup 11 is shared by UBE2G and UBE2R, and subgroup 14 is composed of UBE2O and UBE2Z, consistent with previous findings (Michelle et al. 2009).

Motif composition and gene structure of the UBE2 gene family

Fifteen conserved motifs were identified from the 39 sperm whale UBE2 proteins (figure 2). These resembled the UBE2 gene family in human (figure 2 in electronic supplementary material). Motif 1, motif 2, motif 3, motif 4, and motif 6 were detected in most of the proteins (figure 2). This is indicative of a conserved structure and similar function of sperm whale UBE2 enzymes. Further, the motif composition and distribution patterns were closely related to the assigned subfamilies and subgroups. For example, the UBE2Q subfamily harbours motif 7, motif 9, motif 10, motif 13, and motif 14; the UBE2Z, UBE2O, and UBE2S subfamilies share motif 11. In subgroup 1 (UBE2D and UBE2E), all genes harbour motif 1, motif 2, motif 3, motif 4, motif 5 and motif 6, which indicates that they may interact with the same E3 ligases (Ma et al. 2016). In addition, we found specific motifs in some subfamilies. For instance, motif 12 was found only in subgroup 9 (UBE2A and UBE2B), and motif 15 is unique to the UBE2R subfamily. We speculate that shared and idiosyncratic motifs contribute to the UBE2 protein repertoire.

Figure 1. Phylogenetic tree of the 39 UBE2 genes of sperm whale constructed by maximum likelihood with 1000 ultrafast bootstrap replicates. Each branch represents a different subfamily. Different colours indicate the four classes of the UBE2 gene family. Nodes with bootstrap support above 95% are shown with blue dots. The sixteen subgroups of the UBE2 gene family are shown by coloured circles outside the phylogeny.

The histidine–proline–asparagine (HPN) tripeptide and ‘PxxPP’ (located in motifs 1 and 4, respectively) (figure 2) are general signature motifs of the UBE2 superfamily (Cottee et al. 2006; Michelle et al. 2009). They are highly conserved and play an important role in the function of UBE2 enzymes (Cottee et al. 2006) (figure 1 in electronic supplementary material). The HPN tripeptide motif is not conserved in sperm whale subgroup 1 (UBE2E1, UBE2E2 and UBE2E3), subgroup 12 (UBE2W), subgroup 13 (UBE2V1 and UBE2V2), subgroup 14 (UBE2O and UBE2Z), subgroup 15 (UBE2J1 and UBE2J2), and subgroup 16 (UBE2Q1, UBE2Q2, and UBE2QL1) (figure 1 in electronic supplementary material). The histidine residue within the HPN motif is important for proper folding of the active-site region by interacting with tyrosine residues (Haas and Siepmann 1997). In subgroup 14, the histidine residue in the motif is replaced by asparagine in UBE2O and UBE2Z, yielding an NPN tripeptide (figure 1 (green) in electronic supplementary material). Similar modifications are also present in UBE2 genes of species such as human, the nematode Caenorhabditis elegans (Jones et al. 2001), and the blood fluke Schistosoma mansoni (Costa et al. 2015). Mutagenesis of the yeast UBE2N HPN motif (H77A) led to a 50% decrease in reactivity (Wu et al. 2003). However, the impact of this noncanonical motif (NPN) on the structure and activity of sperm whale E2 enzymes awaits functional assessment. The terminal residue of the canonical HPN motif is critical for the catalysis of ubiquitin transfer from E2-ubiquitin conjugates to lysine residues of substrates (Wu et al. 2003). An N to H substitution is found in UBE2W of subgroup 12 (figure 1 (blue) in electronic supplementary material). This substitution, which is found in both marine and terrestrial mammals, enhances N-terminal ubiquitination activity (Scaglione et al. 2013) and is likely a functional adaptation (Vittal et al. 2015). Similarly, a noncatalytic cysteine residue (C136) in UBE2E3 (UbcM2 protein), which contributes to restoration of redox homeostasis following oxidative stress by interaction with the antioxidant transcription factor Nrf2 (Plafker et al. 2010), is also present in the sperm whale.
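A scan for the signature motifs discussed above can be sketched with regular expressions. The pattern `[HN]P[NH]` is an assumption chosen to capture the canonical HPN tripeptide together with the variants described in the text (H to N at the first position, N to H at the third); `P..PP` encodes PxxPP with x as any residue. The function name and the toy peptide strings are hypothetical and are not sperm whale sequences.

```python
import re

# Canonical HPN plus the substitutions described in the text (NPN, HPH).
HPN_LIKE = re.compile(r"[HN]P[NH]")
# PxxPP: proline, any two residues, then two prolines.
PXXPP = re.compile(r"P..PP")


def find_motifs(protein):
    """Return 1-based start positions and matched text for each motif class."""
    seq = protein.upper()
    return {
        "HPN_like": [(m.start() + 1, m.group()) for m in HPN_LIKE.finditer(seq)],
        "PxxPP": [(m.start() + 1, m.group()) for m in PXXPP.finditer(seq)],
    }


# Toy peptides for illustration only.
print(find_motifs("AAHPNGGPQYPPKK"))
# {'HPN_like': [(3, 'HPN')], 'PxxPP': [(8, 'PQYPP')]}
print(find_motifs("AANPNGG")["HPN_like"])
# [(3, 'NPN')]
```

Because `finditer` reports non-overlapping matches, and short patterns such as these can occur by chance, a hit would still need to be checked against the aligned position of the UBC domain.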

We next considered the gene structure. Sperm whale UBE2 genes are encoded by a minimum of one exon (e.g., UBE2N_2) and a maximum of 18 exons (e.g., UBE2O) (figure 2). Whereas sequences of different members display considerable variation in gene structure, the UBE2 genes in each subfamily or subgroup are similar in terms of their intron–exon organization (figure 2). For example, UBE2A and UBE2B in subgroup 9 are composed of four introns and six exons. In contrast, the number of exons in the UBE2G, UBE2E, and UBE2N subfamilies ranged from five to seven. UBE2N_3 and UBE2N_4 have two and three exons, respectively, while UBE2N_1 and UBE2N_2 are single-exon genes. It is possible that the single-exon genes stem from retrotransposition of processed mRNAs, a phenomenon observed in gene families such as the catalase family in flowering plants (Frugoli et al. 1998; Jeffares et al. 2006). Although all members of the UBE2E subfamily have five exons, their intron sizes vary considerably. The intron sizes of UBE2E1 and UBE2E3 range from 1026 to 68,352 bp, whereas UBE2E2 has introns ranging from 8086 to 281,116 bp (figure 2). This disparity may stem from insertion of various transposable elements into the intronic regions (Sela et al. 2010). UBE2 family members also present a strong structural conservation of the known core domains predicted by HMMER, such as UQ_con (figure 2).

Figure 2. Comparison of conserved motifs (left) and gene structure (right) of sperm whale UBE2 genes. A total of 15 motifs were discovered by MEME and their organization on the protein is marked by coloured boxes. The conserved motifs of PxxPP (red arrows) and HPN (blue triangles) are located in motifs 1 and 4, respectively. Gene models were obtained using GSDS, and known conserved domains of the UBE2 family were predicted by HMMER. Phylogenetic relationships were obtained using the maximum likelihood method with 1000 bootstrap replicates. Boxes denote exons; horizontal black lines, introns. The scale bar at the bottom estimates the lengths.


Chromosomal localization and gene duplication

Thirty UBE2 family genes were assigned to 17 sperm whale chromosomes (chr 21 is chr X), while the remaining nine genes were assigned to unplaced scaffolds (figure 3). Chromosomes 2, 7 and 8 contain the largest number of UBE2 genes (three genes each). Chromosomes 1, 4, 5, 11, 14, 15, and 20 each contain two UBE2 genes. All the other chromosomes and scaffolds harbour one UBE2 gene (figure 3). No UBE2 family members were found on chromosomes 12, 16, 17, and 18.

Considering that gene duplications can give rise to genes with novel or modified functions (Nei and Rooney 2005), we assessed the extent of this phenomenon in the sperm whale UBE2 gene family. Segmental duplications are large (1–200 kb), nearly identical duplicated blocks of genomic DNA present in at least two locations in a genome (Samonte and Eichler 2002). Segmental duplications have been implicated in disease, as well as gene innovations, of primates (Bailey et al. 2001; Samonte and Eichler 2002). Our results grouped 10 sperm whale UBE2 genes into six segmental duplication events (figure 4), all interchromosomal duplications. For example, UBE2B located on chromosome 8 resulted from segmental duplication of UBE2A on chromosome 21 (we also observed a high sequence identity, 95.2%, between the genes). These data suggest that segmental gene duplication might be a major driving force of UBE2 gene expansion. To examine the evolutionary selection pressure on sperm whale UBE2 family genes, we calculated Ka, Ks, and their ratio (Ka/Ks) for gene pairs (table 2 in electronic supplementary material). Three pairs of duplicated genes had Ka/Ks less than 0.05, suggesting that these genes are under purifying selection, possibly because there is an advantage to maintaining UBE2 gene function in response to oxidative stress (Shang and Taylor 2011).

To further deduce the evolutionary relationships of the sperm whale UBE2 family, we performed a comparative syntenic analysis of UBE2 genes in sperm whale and four terrestrial mammals (i.e., human, mouse, cow and dog). Eight syntenic gene pairs (tables 3–6 in electronic supplementary material) were observed between the sperm whale and the four other mammals examined, indicating that these orthologous UBE2 gene pairs duplicated before the divergence of ancestral species and played a vital role in mammals. The result also shows that the UBE2 family genes in the sperm whale have homology to those of the terrestrial species (figure 5; tables 3–6 in electronic supplementary material). The cow (Bos taurus; 20 orthologous gene pairs mapped on chr. 1, 2, 3, 4, 6, 7, 8, 9, 14, 16, 17, 19, 21, 26 and X) and dog (Canis lupus familiaris; 20 orthologous gene pairs distributed on chr. 2, 3, 5, 9, 11, 12, 14, 15, 23, 26, 29, 30, 31, 32, 36 and X) share more syntenic gene pairs with the sperm whale than human (Homo sapiens; 19 orthologous gene pairs scattered on chr. 1, 2, 3, 5, 6, 7, 8, 9, 10, 15, 17, 21, 22, and X) and mouse (Mus musculus; 12 orthologous gene pairs located on chr. 1, 3, 4, 5, 6, 11, 16, and 18) (figure 5).

Figure 3. Chromosomal map of UBE2 genes of sperm whale. Blue bars represent the chromosomes or scaffolds with the ID at the top. All 39 UBE2 genes are located at the right of the chromosomes. Scale bar on the left indicates the chromosome length.

Evolutionary analysis of UBE2 genes in different mammals

We tested for site-specific and branch-specific selection among mammalian UBE2 genes. We found that the M8 model fitted the data significantly better than the null model (M8a) for UBE2D3 and UBE2N1, with five and two sites under positive selection, respectively (table 7 in electronic supplementary material). The branch model was used to analyse variation among lineages. It should be noted that the ω values for all UBE2 genes were less than 1 (table 8 in electronic supplementary material), suggesting that strong purifying selection plays a central role in maintaining their function. The two-ratio model, which allows different ω values in the foreground and background branches, resulted in a significant improvement over the one-ratio model for nine genes (i.e., UBE2D2, UBE2N1, UBE2V1, UBE2D3, UBE2E1, UBE2K, UBE2S, UBE2C and UBE2O) (table 8 in electronic supplementary material). Interestingly, all of these genes showed a higher strength of selection pressure in cetaceans compared to terrestrial mammals (figure 6). Two genes (UBE2K and UBE2C) showed a similar pattern of selection along the lineages leading to sperm whale (table 9 in electronic supplementary material). These results suggest that these UBE2 genes are under different selective pressures in cetaceans and other mammals, with a stronger selection pressure shift in cetaceans. Interestingly, UBE2K protects the cerebrum against ischemia/reperfusion-induced oxidative damage through proteasome inhibition via its SUMOylation (Jeong et al. 2016). Taken together, we speculate that shifts in selective pressures on UBE2 genes in cetaceans may reflect adaptation to oxidative stress.

Figure 4. Chromosomal distribution and interchromosomal duplications of sperm whale UBE2 genes. Gray lines indicate all synteny blocks in the sperm whale genome, and the red lines indicate duplicated UBE2 gene pairs.

In conclusion, the present study explored the UBE2 gene family in the sperm whale, a deep-diving marine mammal. Our phylogenetic analysis of 39 UBE2 genes led us to define four classes and 16 subgroups of the UBE2 family. Fifteen conserved motifs were discovered in the 39 sperm whale UBE2 proteins. The canonical HPN motif plays an important role in the function of UBE2 enzymes. However, the histidine residue in this motif was replaced by asparagine in UBE2O and UBE2Z, yielding an NPN tripeptide, which may change the catalytic activity of these enzymes. The secondary structures were conserved between family members. Known domains (e.g., UQ_con) were present in almost all UBE2 genes, highlighting a strong structural conservation of core domains. In addition, we inspected UBE2 gene family evolution in marine and terrestrial mammals. We found that nine UBE2 genes show a higher strength of selection pressure in cetaceans compared to terrestrial mammals, suggesting that shifts in selective pressure on UBE2 genes may drive the adaptation to oxidative stress in cetaceans. Future studies comparing the UBE2 repertoire in additional marine mammals may provide further insights into UBE2 gene variation and the evolution of oxidative stress defenses.

Figure 5. Synteny analysis of UBE2 genes (a) between cow (Bos taurus), human (Homo sapiens), and sperm whale (Physeter catodon) and (b) between dog (Canis lupus familiaris), mouse (Mus musculus) and sperm whale (Physeter catodon). Gray lines in the background indicate the collinear blocks among genomes; the red lines highlight the syntenic UBE2 gene pairs.

Acknowledgements

This work was financially supported by the Key Project of the National Natural Science Foundation of China (NSFC) (nos. 31630071 and 32030011), the National Natural Science Foundation of China (NSFC) (nos. 31900310 and 31950410545), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

References

Allen K. N. and Vazquez-Medina J. P. 2019 Natural tolerance to ischemia and hypoxemia in diving mammals: a review. Front. Physiol. 10, 1199.

Bailey J. A., Yavor A. M., Massa H. F., Trask B. J. and Eichler E. E. 2001 Segmental duplications: organization and impact within the current human genome project assembly. Genome Res. 11, 1005–1017.

Bailey T. L., Boden M., Buske F. A., Frith M., Grant C. E., Clementi L. et al. 2009 MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208.

Costa M. P., Oliveira V. F., Pereira R. V., De Abreu F. C., Jannotti-Passos L. K., Borges W. C. et al. 2015 In silico analysis and developmental expression of ubiquitin-conjugating enzymes in Schistosoma mansoni. Parasitol. Res. 114, 1769–1777.

Cottee P. A., El-Osta Y. G. A., Nisbet A. J. and Gasser R. B. 2006 Ubiquitin-conjugating enzyme genes in Oesophagostomum dentatum. Parasitol. Res. 99, 119–125.

Del Aguila-Vargas A. C., Vazquez-Medina J. P., Crocker D. E., Mendez-Rodríguez L. C., Gaxiola-Robles R., de Anda-Montanez J. A. et al. 2020 Antioxidant response to cadmium exposure in primary skeletal muscle cells isolated from humans and elephant seals. Comp. Biochem. Physiol. Part C: Toxicol. Pharmacol. 227, 108641.

Del Castillo Velasco-Martínez I., Hernandez-Camacho C. J., Mendez-Rodríguez L. C. and Zenteno-Savín T. 2016 Purine metabolism in response to hypoxic conditions associated with breath-hold diving and exercise in erythrocytes and plasma from bottlenose dolphins (Tursiops truncatus). Comp. Biochem. Physiol. Part A: Mol. Integr. Physiol. 191, 196–201.

Figure 6. Tests for species-specific selective pressure on UBE2 genes between cetaceans and terrestrial mammals. The omega value (dN/dS) estimated by the branch model shows the difference between foreground (cetaceans) and background (terrestrial mammals) for each gene. When only a red circle is presented, the difference versus the null model (one ratio) was not significant and the equivalent value from the one-ratio model is shown instead. The red circle indicates the omega ratio of background species, and the green circle represents the omega ratio of cetaceans. A black line between the two circles indicates a significant P-value (LRT, P < 0.05). A summary of P-values is available in table 9 in electronic supplementary material.

Doris K. S., Rumsby E. L. and Morgan B. A. 2012 Oxidative stress responses involve oxidation of a conserved ubiquitin pathway enzyme. Mol. Cell. Biol. 32, 4472–4481.

El-Gebali S., Mistry J., Bateman A., Eddy S. R., Luciani A., Potter S. C. et al. 2019 The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432.

Fan G., Zhang Y., Liu X., Wang J., Sun Z., Sun S. et al. 2019 The first chromosome-level genome for a marine mammal as a resource to study ecology and evolution. Mol. Ecol. Resour. 19, 944–956.

Frugoli J. A., Mcpeek M. A., Thomas T. L. and Mcclung C. R. 1998 Intron loss and gain during evolution of the catalase gene family in angiosperms. Genetics 149, 355–365.

Gasteiger E., Hoogland C., Gattiker A., Wilkins M. R., Appel R. D., Hochstrasser D. F. et al. 2005 Protein identification and analysis tools on the ExPASy server. In The proteomics protocols handbook, pp. 571–607. Springer.

Haas A. L. and Siepmann T. J. 1997 Pathways of ubiquitin conjugation. FASEB J. 11, 1257–1268.

Hu B., Jin J., Guo A.-Y., Zhang H., Luo J. and Gao G. 2014 GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics 31, 1296–1297.

Jeffares D. C., Mourier T. and Penny D. 2006 The biology of intron gain and loss. Trends Genet. 22, 16–22.

Jeong E. I., Chung H. W., Lee W. J., Kim S.-H., Kim H., Choi S. G. et al. 2016 E2–25K SUMOylation inhibits proteasome for cell death during cerebral ischemia/reperfusion. Cell Death Dis. 7, e2573.

Jones D., Crowe E., Stevens T. A. and Candido E. P. M. 2001 Functional and phylogenetic analysis of the ubiquitylation system in Caenorhabditis elegans: ubiquitin-conjugating enzymes, ubiquitin-activating enzymes, and ubiquitin-like proteins. Genome Biol. 3(research0002), 0001.

Kalyaanamoorthy S., Minh B. Q., Wong T. K., Von Haeseler A. and Jermiin L. S. 2017 ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589.

Krzywinski M., Schein J., Birol I., Connors J., Gascoyne R., Horsman D. et al. 2009 Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645.

Kumar S., Stecher G., Li M., Knyaz C. and Tamura K. 2018 MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549.

Laine A., Topisirovic I., Zhai D., Reed J. C., Borden K. L. and Ronai Z. E. 2006 Regulation of p53 localization and activity by Ubc13. Mol. Cell. Biol. 26, 8901–8913.

Letunic I. and Bork P. 2019 Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. 47, W256–W259.

Lopez-Cruz R. I., Perez-Milicua M. B., Crocker D. E., Gaxiola-Robles R., Bernal-Vertiz J. A., Rosa A. et al. 2014 Purine nucleoside phosphorylase and xanthine oxidase activities in erythrocytes and plasma from marine, semiaquatic and terrestrial mammals. Comp. Biochem. Physiol. Part A: Mol. Integr. Physiol. 171, 31–35.

Lu S., Wang J., Chitsaz F., Derbyshire M. K., Geer R. C., Gonzales N. R. et al. 2020 CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268.

Ma K., Ryan P., Klevit R. and Lipkowitz S. 2016 Ube2d family members, Ube2e family members and Ube2w modulate the ubiquitination and degradation of EGFR by Cbl, pp. AACR.

Mcgowen M. R., Tsagkogeorga G., Alvarez-Carretero S., Dos Reis M., Struebig M., Deaville R. et al. 2019 Phylogenomic resolution of the cetacean tree of life using target sequence capture. Syst. Biol. 69, 479–501.

Michelle C., Vourc’h P., Mignon L. and Andres C. R. 2009 What was the set of ubiquitin and ubiquitin-like conjugating enzymes in the eukaryote common ancestor? J. Mol. Evol. 68, 616–628.

Nei M. and Rooney A. P. 2005 Concerted and birth-and-death evolution of multigene families. Annu. Rev. Genet. 39, 121–152.

Nguyen L.-T., Schmidt H. A., Von Haeseler A. and Minh B. Q. 2015 IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274.

Plafker K. S., Nguyen L., Barneche M., Mirza S., Crawford D. and Plafker S. M. 2010 The ubiquitin-conjugating enzyme UbcM2 can regulate the stability and activity of the antioxidant transcription factor Nrf2. J. Biol. Chem. 285, 23064–23074.

Reuter S., Gupta S. C., Chaturvedi M. M. and Aggarwal B. B. 2010 Oxidative stress, inflammation, and cancer: how are they linked? Free Radic. Biol. Med. 49, 1603–1616.

Samonte R. V. and Eichler E. E. 2002 Segmental duplications and the evolution of the primate genome. Nat. Rev. Genet. 3, 65–72.

Scaglione K. M., Basrur V., Ashraf N. S., Konen J. R., Elenitoba-Johnson K. S., Todi S. V. and Paulson H. L. 2013 The ubiquitin-conjugating enzyme (E2) Ube2w ubiquitinates the N terminus of substrates. J. Biol. Chem. 288, 18784–18788.

Schultz J., Milpetz F., Bork P. and Ponting C. P. 1998 SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl. Acad. Sci. USA 95, 5857–5864.

Scornavacca C., Belkhir K., Lopez J., Dernat R., Delsuc F., Douzery E. J. and Ranwez V. 2019 OrthoMaM v10: scaling-up orthologous coding sequence and exon alignments with more than one hundred mammalian genomes. Mol. Biol. Evol. 36, 861–862.

Sela N., Kim E. and Ast G. 2010 The role of transposable elements in the evolution of non-mammalian vertebrates and invertebrates. Genome Biol. 11, R59.

Shang F. and Taylor A. 2011 Ubiquitin–proteasome pathway and cellular responses to oxidative stress. Free Radic. Biol. Med. 51, 5–16.

Stewart M. D., Ritterhoff T., Klevit R. E. and Brzovic P. S. 2016 E2 enzymes: more than just middle men. Cell Res. 26, 423–440.

Swanson W. J., Nielsen R. and Yang Q. 2003 Pervasive adaptive evolution in mammalian fertilization proteins. Mol. Biol. Evol. 20, 18–20.

Talavera G. and Castresana J. 2007 Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 56, 564–577.

Tian R., Seim I., Ren W., Xu S. and Yang G. 2019 Contraction of the ROS scavenging enzyme glutathione S-transferase gene family in cetaceans. G3: Genes, Genomes, Genet. 9, 2303–2315.

Tian R., Wang Z., Niu X., Zhou K., Xu S. and Yang G. 2016 Evolutionary genetics of hypoxia tolerance in cetaceans during diving. Genome Biol. Evol. 8, 827–839.

Van Wijk S. J. and Timmers H. M. 2010 The family of ubiquitin-conjugating enzymes (E2s): deciding between life and death of proteins. FASEB J. 24, 981–993.

Vazquez-Medina J. P., Zenteno-Savín T., Elsner R. and Ortiz R. M. 2012 Coping with physiological oxidative stress: a review of antioxidant strategies in seals. J. Comp. Physiol. B 182, 741–750.

Vittal V., Shi L., Wenzel D. M., Scaglione K. M., Duncan E. D., Basrur V. et al. 2015 Intrinsic disorder drives N-terminal ubiquitination by Ube2w. Nat. Chem. Biol. 11, 83.

Voorrips R. 2002 MapChart: software for the graphical presentation of linkage maps and QTLs. J. Hered. 93, 77–78.

Wang D., Zhang Y., Zhang Z., Zhu J. and Yu J. 2010 KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genom. Proteom. Bioinform. 8, 77–80.


Wang Y., Tang H., Debarry J. D., Tan X., Li J., Wang X. et al. 2012 MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49–e49.

Watwood S. L., Miller P. J., Johnson M., Madsen P. T. and Tyack P. L. 2006 Deep-diving foraging behaviour of sperm whales (Physeter macrocephalus). J. Anim. Ecol. 75, 814–825.

Wilhelm Filho D., Sell F., Ribeiro L., Ghislandi M., Carrasquedo F., Fraga C. G. et al. 2002 Comparison between the antioxidant status of terrestrial and diving mammals. Comp. Biochem. Physiol. Part A: Mol. Integr. Physiol. 133, 885–892.

Wu P. Y., Hanlon M., Eddins M., Tsui C., Rogers R. S., Jensen J. P. et al. 2003 A conserved catalytic residue in the ubiquitin-conjugating enzyme family. EMBO J. 22, 5241–5250.

Yang Z. 1998 Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15, 568–573.

Yang Z. 2007 PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591.

Yim H. S., Cho Y. S., Guang X., Kang S. G., Jeong J. Y., Cha S. S. et al. 2014 Minke whale genome and aquatic adaptation in cetaceans. Nat. Genet. 46, 88–92.

Yu C. S., Chen Y. C., Lu C. H. and Hwang J. K. 2006 Prediction of protein subcellular localization. Proteins: Struct. Funct. Bioinform. 64, 643–651.

Corresponding editor: PUNYASLOKE BHADURY
