
Identification and Characterization of Lin12/Notch Repeats (LNRs):

Conclusion/Future Work: Our preliminary results indicate that although all LNRs, regardless of their origin, are small, acidic sequences, there are subtle but important differences in each LNR sequence that might shed light on its unique biological function within the larger multidomain protein scaffold. We have also found that some slots in the LNR alignment predominantly contain one type of amino acid: acidic, basic, hydrophobic, polar, or aromatic. In the future, this compiled information will be used to decide which LNRs merit further experimental characterization, and comparing the bioinformatics data with experimental results will give us a clearer understanding of the characteristics of LNRs from such a diverse set of protein families.

References:

1. Notch Subunit Heterodimerization and Prevention of Ligand-Independent Proteolytic Activation Depend, Respectively, on a Novel Domain and the LNR Repeats. Cheryl Sanchez-Irizarry, Andrea C. Carpenter, Andrew P. Weng, Warren S. Pear, Jon C. Aster, and Stephen C. Blacklow. Molecular and Cellular Biology, Nov. 2004, Vol. 24, No. 21, p. 9265–9273.

2. The Lin12-Notch Repeats of Pregnancy-associated Plasma Protein-A Bind Calcium and Determine its Proteolytic Specificity. Henning B. Boldt, Kasper Kjaer-Sorensen, Michael T. Overgaard, Kathrin Weyer, Christine B. Poulsen, Lars Sottrup-Jensen, Cheryl A. Conover, Linda C. Giudice, Claus Oxvig. Journal of Biological Chemistry, Sept. 2004, Vol. 279, No. 37, p. 38525-38531.

3. Stealth Proteins: In Silico Identification of a Novel Protein Family Rendering Bacterial Pathogens Invisible to Host Immune Defense. Peter Sperisen, Christoph D. Schmid, Philipp Bucher, Olav Zilian. PLoS Comput Biol. 1(6): e63. 2005.

4. “Number of Cysteines Histogram”. UCSC Genome Bioinformatics. Updated 12 Feb. 2004. <http://genome.ucsc.edu/google/goldenPath/help/pbTracksHelpFiles/pbcCnt.shtml>

Fig. 5: Physicochemical characteristics of LNRs. Each LNR sequence was characterized using ExPASy Proteomics Tools. The theoretical pI and total number of residues show that all listed LNRs are acidic and fewer than 45 amino acids long. This table shows some of the compiled characteristics for selected LNR sequences.

Fig. 1: Domain organization of different classes of proteins that contain LNRs. In the Notch protein, the LNRs (represented as yellow ovals) are found in a tandem block of three, while in PAPP the first two tandem LNRs are separated from the third LNR by ~1000 amino acids. The human Stealth protein contains two LNRs, while the fly Stealth protein contains only a single LNR.

[Fig. 1 diagram; labels recovered from the figure:
Notch: EGF-like repeats, LNRs, NEC, transmembrane region, RAM, ankyrin repeats, transactivation domain.
PAPP: proteolytic domain; numbered elements 1-5.
Human Stealth: CR1, CR2, CR3, CR4, LNRA, LNRB.
Fly Stealth: CR1, CR2, CR3, CR4, LNRA.]

Fig. 2: Websites and Tools

| Name | Accession | Sequence | Residues | # Cys | MW | pI (theor.) | # Neg. | # Pos. | Instability index | # Aliphatic | # Aromatic | # Basic | # Acidic | Total # | Ext. coeff. (all half) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hN1 LNRA | P46531 | EEACELPECQEDAGNKVCSLQCNNHACGWDGGDCS | 1447-1481 | 6 | 3715.9 | 3.89 | 8 | 1 | 92.14 | 10 | 1 | 2 | 13 | 35 | 5875 |
| hN1 LNRB | P46531 | LNFNDPWKNCTQSLQCWKYFSDGHCDSQCNSAGCLFDGFDCQ | 1482-1523 | 6 | 4827.2 | 4.28 | 5 | 2 | 53.47 | 7 | 7 | 3 | 13 | 42 | 12865 |
| hN1 LNRC | P46531 | RAEGQCNPLYDQYCKDHFSDGHCDQGCNSAECEWDGLDCA | 1524-1563 | 6 | 4484.7 | 4.12 | 9 | 2 | 20.63 | 9 | 4 | 4 | 14 | 40 | 8855 |
| hN2 LNRA | Q04721 | PATCLSQYCADKARDGVCDEACNSHACQWDGGDC | 1422-1455 | 6 | 3593.8 | 4.17 | 6 | 2 | 52.41 | 10 | 2 | 3 | 9 | 34 | 7365 |
| hN2 LNRB | Q04721 | LTMENPWANCSSPLPCWDYINNQCDELCNTVECLFDNFECQ | 1457-1497 | 6 | 4804.3 | 3.26 | 7 | 0 | 50.62 | 7 | 5 | 0 | 15 | 41 | 12865 |
| hN2 LNRC | Q04721 | GNSKTCKYDKYCADHFKDNHCDQGCNSEECGWDGLDCA | 1498-1535 | 6 | 4261.5 | 4.64 | 8 | 4 | 36.11 | 7 | 4 | 6 | 12 | 38 | 8855 |
| hN3 LNRA | Q9UM47 | EPRCPRAACQAKRGDQRCDRECNSPGCGWDGGDCS | 1384-1418 | 6 | 3785.1 | 6.31 | 6 | 6 | 69.21 | 8 | 1 | 6 | 9 | 35 | 5875 |
| hN3 LNRB | Q9UM47 | LSVGDPWRQCEALQCWRLFNNSRCDPACSSPACLYDNFDCH | 1419-1459 | 6 | 4722.2 | 4.75 | 5 | 3 | 95.14 | 9 | 5 | 4 | 10 | 41 | 12865 |
| hN3 LNRC | Q9UM47 | AGGRERTCNPVYEKYCADHFADGRCDQGCNTEECGWDGLDCA | 1460-1501 | 6 | 4616.9 | 4.35 | 9 | 4 | 34.81 | 12 | 4 | 5 | 12 | 42 | 8855 |
| hN4 LNRA | Q5STG5 | CEGRSGDGACDAGCSGPGGNWDGGDCS | 1180-1206 | 4 | 2490.5 | 3.71 | 5 | 1 | 48.32 | 11 | 1 | 1 | 6 | 27 | 5750 |
|  | Q5STG5 | PGAKGCEGRSGDGACDAGCSGPGGNWDGGDCS | 1175-1206 | 4 | 2900.9 | 4.04 | 5 | 2 | 37.03 | 14 | 1 | 2 | 6 | 32 | 5750 |
| hN4 LNRB | Q5STG5 | LGVPDPWKGCPSHSRCWLLFRDGQCHPQCDSEECLFDGYDCE | 1207-1248 | 6 | 4830.3 | 4.57 | 8 | 3 | 83.37 | 9 | 5 | 5 | 10 | 42 | 12865 |
| hN4 LNRC | Q5STG5 | TPPACTPAYDQYCHDHFHNGHCEKGCNTAECGWDGGDCR | 1249-1287 | 6 | 4297.6 | 5.2 | 6 | 2 | 69.48 | 8 | 4 | 6 | 9 | 39 | 8855 |
| mN1 LNRA | Q01705 | EEACELPECQVDAGNKVCNLQCNNHACGWDGGDCS | 1446-1480 | 6 | 3713 | 3.95 | 7 | 1 | 74.68 | 11 | 1 | 2 | 13 | 35 | 5875 |
| mN1 LNRB | Q01705 | LNFNDPWKNCTQSLQCWKYFSDGHCDSQCNSAGCLFDGFDCQ | 1481-1522 | 6 | 4827.2 | 4.28 | 5 | 2 | 53.47 | 7 | 7 | 3 | 13 | 42 | 12865 |
| mN1 LNRC | Q01705 | LTEGQCNPLYDQYCKDHFSDGHCDQGCNSAECEWDGLDCA | 1523-1562 | 6 | 4471.7 | 3.93 | 9 | 1 | 25.45 | 9 | 4 | 3 | 14 | 40 | 8855 |
| dN LNRA | P07207 | RAMCDKRGCTECQGNGICDSDCNTYACNFDGNDCS | 1479-1513 | 7 | 3771 | 4.17 | 6 | 3 | 60.46 | 7 | 2 | 3 | 11 | 35 | 1865 |
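The extinction coefficients in the last column of the table can be reproduced from the sequences alone. The sketch below assumes the values follow the widely used estimate at 280 nm (5500 per tryptophan, 1490 per tyrosine, 125 per cystine) with every cysteine paired into a disulfide, which is how "all half" is read here. The function name is illustrative and is not part of any ExPASy tool.

```python
def extinction_coefficient(seq: str) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumption: all cysteines are paired into cystines (disulfides);
    an unpaired leftover cysteine contributes nothing.
    """
    n_trp = seq.count("W")
    n_tyr = seq.count("Y")
    n_cystine = seq.count("C") // 2
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


# Sequences copied from the table above.
hn1_lnra = "EEACELPECQEDAGNKVCSLQCNNHACGWDGGDCS"
hn1_lnrb = "LNFNDPWKNCTQSLQCWKYFSDGHCDSQCNSAGCLFDGFDCQ"
dn_lnra = "RAMCDKRGCTECQGNGICDSDCNTYACNFDGNDCS"

for name, seq in [("hN1 LNRA", hn1_lnra), ("hN1 LNRB", hn1_lnrb), ("dN LNRA", dn_lnra)]:
    print(name, extinction_coefficient(seq))
```

Under these assumptions the three sequences give 5875, 12865, and 1865, matching the tabulated values; the large differences between LNRs come almost entirely from the number of tryptophans.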

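Residue classes: Hydr. (G, A, V, L, I, M, P); Arom. (F, Y, W); Basic (H, K, R); Acidic (D, E); Polar (S, C, T, N, Q). Class counts are listed left to right in that column order, with empty (zero) cells omitted.

| Slot # | Total | Highest percentage | Nonzero class counts | % | Dominant class |
|---|---|---|---|---|---|
| 1 | 9 | 78% L | 9 | 100% | Hydrophobic |
| 2 | 10 | 40% N | 3, 1, 6 | 60% | Polar |
| 3 | 12 | 33% F | 8, 4 | 67% | Hydrophobic |
| 4 | 12 | 25% N | 5, 1, 2, 4 | 42% | Hydrophobic |
| 5 | 18 | 33% D | 4, 1, 4, 6, 3 | 33% | Acidic |
| 6 | 19 | 47% P | 13, 2, 4 | 68% | Hydrophobic |
| 7 | 29 | 31% E | 7, 8, 2, 10, 2 | 34% | Acidic |
| 8 | 29 | 21% K | 15, 8, 3, 3 | 52% | Hydrophobic |
| 9 | 29 | 28% N | 7, 1, 4, 17 | 59% | Polar |
| 10 | 29 | 100% C | 29 | 100% | Polar |
| 11 | 2 | 50% V, E | 1, 1 | 50% | Acid/Hydr. |
| 12 | 3 | 33% D, V, T | 1, 1, 1 | 33% | Hydr/Acid/Polar |
| 13 | 3 | 33% Y, S, L | 1, 1, 1 | 33% | Hydr/Arom/Polar |
| 14 | 3 | 33% Q, N, R | 1, 2 | 67% | Polar |
| 15 | 10 | 50% N | 1, 1, 1, 7 | 70% | Polar |
| 16 | 10 | 60% P | 7, 1, 2 | 70% | Hydrophobic |
| 17 | 18 | 33% L | 11, 1, 6 | 61% | Hydrophobic |
| 18 | 28 | 25% Y | 5, 8, 1, 7, 7 | 29% | Aromatic |
| 19 | 32 | 22% D | 13, 4, 8, 7 | 41% | Hydrophobic |
| 20 | 32 | 22% Q | 13, 7, 3, 9 | 41% | Hydrophobic |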

Fig. 4: Alignment slot statistics (selected). Each slot (see Fig. 3) is analyzed for its most abundant amino acid (the "Highest Percentage" column) and then for the different types of amino acids it contains (the Hydr., Arom., Basic, Acidic, and Polar columns). Many slots consist predominantly of one type of amino acid. Information for slots 1-20 is shown.
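The per-slot statistics described in this caption can be sketched in a few lines. The example below uses the residue classes from the Fig. 4 column headings and four LNRA sequences from Fig. 5 that happen to share the same length, treated here as an ungapped alignment. That is an assumption made for illustration, so the slot numbers in this sketch do not correspond to the slot numbers of the poster's alignment, and the function names are hypothetical.

```python
from collections import Counter

# Residue classes as given in the Fig. 4 column headings.
CLASSES = {
    "Hydrophobic": set("GAVLIMP"),
    "Aromatic": set("FYW"),
    "Basic": set("HKR"),
    "Acidic": set("DE"),
    "Polar": set("SCTNQ"),
}


def residue_class(aa):
    """Return the class name for a one-letter residue, or None if unknown."""
    for name, members in CLASSES.items():
        if aa in members:
            return name
    return None


def slot_stats(aligned, slot):
    """Summarize one alignment column (1-based slot); '-' marks a gap."""
    column = [s[slot - 1] for s in aligned if s[slot - 1] != "-"]
    if not column:
        return None
    top_res, res_n = Counter(column).most_common(1)[0]
    top_cls, cls_n = Counter(residue_class(aa) for aa in column).most_common(1)[0]
    return {
        "total": len(column),
        "top_residue": top_res,
        "top_residue_fraction": res_n / len(column),
        "top_class": top_cls,
        "top_class_fraction": cls_n / len(column),
    }


# hN1, mN1, hN3 and dN LNRA from Fig. 5 (all 35 residues long).
lnra = [
    "EEACELPECQEDAGNKVCSLQCNNHACGWDGGDCS",
    "EEACELPECQVDAGNKVCNLQCNNHACGWDGGDCS",
    "EPRCPRAACQAKRGDQRCDRECNSPGCGWDGGDCS",
    "RAMCDKRGCTECQGNGICDSDCNTYACNFDGNDCS",
]

print(slot_stats(lnra, 1))
print(slot_stats(lnra, 4))
```

In this toy alignment, position 4 is a cysteine in all four sequences, which mirrors the fully conserved cysteine column (slot 10) in the table above.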


Fig. 3: LNR alignment. All LNRs are aligned based on the position of key structural amino acids such as the cysteines and the aspartic acids (highlighted in red and green, respectively). Each “slot” is numbered (on top). This alignment reveals similarities and trends in each “slot”, giving us further clues to LNR characteristics.

[Fig. 3 alignment image; labels recovered from the figure. Rows: Human Notch1, Mouse Notch1, Human Notch4, Human Notch2, Human Notch3, Fruit Fly Notch, Frog Notch, and Zebrafish Notch (LNRA, LNRB, and LNRC for each); Green Algae; Gluc TraB; Gluc TraA; Stealth; PAPP A; PAPP A2; PAPP E; Nematode Notch. Horizontal axis: Slot #, numbered 1 to 50.]

A Bioinformatics Approach

Fathima F. Jahufar, Framingham High School ’07.

Didem Vardar-Ulu, Chemistry Department

Acknowledgements:
- National Science Foundation Research Experiences for Undergraduates (NSF-REU) in Chemistry and Physics
- Professor Didem Vardar-Ulu, Christina Hao, Sharline Madera, and Ursela Siddiqui.

Abstract: Lin12 Notch Repeats (LNRs) are Ca2+-binding, cysteine-rich protein domains. They were first found in a block of three in a transmembrane receptor protein called Notch. Since then they have also been found in other types of multidomain proteins such as the Pregnancy-associated Plasma Protein (PAPP) and Stealth proteins. In these proteins, the LNRs are present in a variety of different numbers and arrangements.

For this project, we used several bioinformatics tools to identify, align, and compile information on LNRs from different protein sources. These tools include BLAST, ClustalW, ExPASy Proteomics Tools, and UniProt. Using these tools, we compiled a list of LNRs along with certain physicochemical properties of each, including the theoretical pI, the molecular weight, the number of acidic and basic residues, and the extinction coefficients. We also broke down the percentages of each amino acid and each type of amino acid at each residue position relative to the cysteines.

Our preliminary results indicate that although all LNRs, regardless of their origin, are small, acidic sequences, there are subtle but important differences in each LNR sequence that might shed light on its unique biological function within the larger multidomain protein scaffold. The compilations presented in this work are useful for comparing different LNRs and deciding which LNRs would be valuable for further studies.

Introduction: Lin12 Notch Repeats (LNRs) are relatively short protein domains (only about 35-40 amino acids long) found in a variety of protein families. LNRs were first found in a block of three in Notch, a transmembrane receptor protein. In Notch, LNRs help maintain the receptor in a resting, metalloprotease-resistant conformation prior to ligand binding (1). LNRs are also found in other multidomain proteins such as PAPP proteins and Stealth proteins. PAPP proteins, like Notch, have three LNRs; however, the third LNR is separated from the second by more than 1000 amino acids (2). LNRs in PAPP are thought to determine the proteolytic specificity of PAPP, which cleaves insulin-like growth factor-binding proteins (2). In Stealth proteins, LNRs occur singly or in pairs, but are not found in all Stealth proteins (3).

The average natural abundance of cysteine in proteins is about 2.3% (4), whereas most LNRs are ~15-17% cysteine. LNRs are therefore very cysteine rich, and they require Ca2+ to fold properly into their native forms. Most LNRs have six cysteines, while a few have only four. These cysteines form three (or two) specific disulfide bridges that help give LNRs their structure. LNRs also contain several aspartic acid and asparagine residues that coordinate the binding of Ca2+ ions.
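The cysteine percentages quoted above can be checked directly against the sequences listed in Fig. 5. The sketch below is a minimal illustration; the function name is hypothetical, and the disulfide count simply assumes that cysteines pair up, as described for the six- and four-cysteine LNRs.

```python
def cysteine_content(seq):
    """Return (number of Cys, percent Cys, number of possible disulfide pairs)."""
    n_cys = seq.count("C")
    percent = 100.0 * n_cys / len(seq)
    # Assumption: cysteines pair into disulfides; an odd one is left over.
    return n_cys, percent, n_cys // 2


# Human Notch1 LNRA (six cysteines) and human Notch4 LNRA (four cysteines).
hn1_lnra = "EEACELPECQEDAGNKVCSLQCNNHACGWDGGDCS"
hn4_lnra = "CEGRSGDGACDAGCSGPGGNWDGGDCS"

for name, seq in [("hN1 LNRA", hn1_lnra), ("hN4 LNRA", hn4_lnra)]:
    n_cys, percent, pairs = cysteine_content(seq)
    print(f"{name}: {n_cys} Cys, {percent:.1f}% Cys, {pairs} possible disulfides")
```

For these two sequences the cysteine content comes out at roughly 17% and 15%, several times the 2.3% average cited above.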

Using bioinformatics to study LNRs involves websites such as UniProt, BLAST, ClustalW2, and ExPASy Proteomics Tools. UniProt allows keyword and text searches to identify amino acid sequences from different databases. It also matches input sequences to sequences within proteins in a database and provides basic information about these proteins. Protein BLAST (Basic Local Alignment Search Tool) compares amino acid sequence inputs to those in the protein database and outputs significant matches. ClustalW2 is an online tool that aligns multiple amino acid sequences, facilitating one-to-one amino acid comparisons. Finally, ExPASy (Expert Protein Analysis System) Proteomics Tools allow information to be gathered and predictions to be made about amino acid sequences.

We first used UniProt and BLAST to identify different LNR sequences within the protein database and to determine their location within their corresponding source proteins. Then we used ClustalW2 to align these LNRs, after which we improved the automated alignments manually based on the positions of the cysteines and the Ca2+-coordinating residues that define an LNR. Finally, in order to better understand and predict the biochemical and biophysical characteristics of LNRs, we used ExPASy Proteomics Tools to compile a list of physicochemical properties for each of the identified LNR sequences. The alignments of the LNRs (each slot numbered) and small sections of the tables detailing the properties of the LNRs and of each slot in the alignments are presented here.
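Because the manual refinement of the alignments anchors on cysteine positions, it can help to list where the cysteines fall in each sequence and how many residues separate them. The following is a minimal sketch that assumes plain one-letter sequences such as those in Fig. 5. The helper names are illustrative, and this is not the procedure ClustalW2 itself uses.

```python
def cysteine_positions(seq):
    """Return the 1-based positions of every cysteine in the sequence."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "C"]


def cysteine_spacing(seq):
    """Return the number of residues between each pair of consecutive cysteines."""
    pos = cysteine_positions(seq)
    return [b - a - 1 for a, b in zip(pos, pos[1:])]


# Human Notch1 LNRA and fruit fly Notch LNRA, as listed in Fig. 5.
hn1_lnra = "EEACELPECQEDAGNKVCSLQCNNHACGWDGGDCS"
dn_lnra = "RAMCDKRGCTECQGNGICDSDCNTYACNFDGNDCS"

for name, seq in [("hN1 LNRA", hn1_lnra), ("dN LNRA", dn_lnra)]:
    print(name, cysteine_positions(seq), cysteine_spacing(seq))
```

Comparing the two outputs shows that the fly sequence shares six cysteine positions with the human sequence and carries one additional cysteine between the second and third, the kind of difference that guides where gaps and anchors are placed by hand.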