Structural determinants of bacterial lytic polysaccharide monooxygenase functionality*

Zarah Forsberg‡,1, Bastien Bissaro†,‡, Jonathan Gullesen‡, Bjørn Dalhus¶,‖, Gustav Vaaje-Kolstad‡ and Vincent G.H. Eijsink‡

From the ‡Faculty of Chemistry, Biotechnology, and Food Science, The Norwegian University of Life Sciences (NMBU), 1432 Ås, Norway, †INRA, UMR792, Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France, the ¶Department of Medical Biochemistry, Institute for Clinical Medicine, University of Oslo, PO Box 4950, Nydalen, N-0424, Oslo, Norway, and the ‖Department of Microbiology, Clinic for Laboratory Medicine, Oslo University Hospital, Rikshospitalet, PO Box 4950, Nydalen, N-0424, Oslo, Norway

*Running title: Functionality of cellulose-oxidizing LPMO10s

1To whom correspondence should be addressed: Zarah Forsberg, Faculty of Chemistry, Biotechnology, and Food Science, The Norwegian University of Life Sciences (NMBU), 1432 Ås, Norway, Tel.: +47 67232469; E-mail: [email protected]

SUPPORTING INFORMATION

TABLE OF CONTENTS:

Table S1 – Sequences of 54 LPMO10s used in the CMA analysis.

Tables S2-S6 – Correlations between CMA positions.

Table S7 – Gene specific primers for cloning MaLPMO10B and D.

Table S8 – Calculated protein properties for all LPMOs used in this study.

Figure S1 – Phylogenetic tree built from 130 AA10 sequences including all substrate specificities.

Figure S2 – Heat map derived from correlated mutation analysis.

Figure S3 – WebLogo of the cellulose C1-specific LPMO10 sub-group sequences.

Figure S4 – WebLogo of the C1/C4-oxidizing LPMO10 sub-group sequences.

Figure S5 – MALDI-TOF MS of the hexamer cluster obtained after degrading PASC.

Figure S6 – MALDI-TOF MS of the hexamer cluster obtained after degrading β-chitin.

Figure S7 – Structures and structure-based sequence alignment of MaLPMO10B, ScLPMO10B and ScLPMO10C.

Figure S8 – Chromatographic analysis of products generated by MaLPMO10B variants.

Figure S9 – HPAEC-PAD analysis of products generated by variants of ScLPMO10B and ScLPMO10C.

Figure S10 – Comparison of the ratio between C1-oxidized and native oligosaccharides released from PASC by MaLPMO10B variants.

Figure S11 – HILIC-UV detection of oxidized products from squid pen β-chitin generated by MaLPMO10B variants.

Figure S12 – Binding of full-length and catalytic domain variants of MaLPMO10B to Avicel.

Figure S13 – Probing early inactivation of two MaLPMO10B variants.

Figure S14 – Standard curves for C1-oxidized and C4-oxidized dimers.

References

Table S1. Sequences employed for the CMA analysis of C1- versus C1/C4-oxidizing cellulose-active LPMO10s. Twenty-six ScLPMO10C-like and 28 ScLPMO10B-like sequences were employed for the CMA analysis and these sequences (catalytic domains only, no signal peptides) are shown below. The reference sequences of ScLPMO10B and ScLPMO10C as well as the sequences of MaLPMO10B and MaLPMO10D are colored in red.

Protein name Organism UniProt/PDB Protein sequence (without signal peptide and additional domains) No.

Sequences of C1-specific LPMO10s

ScLPMO10C Streptomyces coelicolor

A3(2)

Q9RJY2/4OY7

HGVAMMPGSRTYLCQLDAKTGTGALDPTNPACQAALDQSGATALYNWFAVLDSNAGGRGAGYVPDGT

LCSAGDRSPYDFSAYNAARSDWPRTHLTSGATIPVEYSNWAAHPGDFRVYLTKPGWSPTSELGWDDL

ELIQTVTNPPQQGSPGTDGGHYYWDLALPSGRSGDALIFMQWVRSDSQENFFSCSDVVFDGG

1

TfLPMO10B Thermobifida fusca YX Q47PB9 HGAMTYPPTRSYICYVNGIEGGQGGNIAPTNPACQNLLAENGNYPFYNWFGNLISDAGGRHREIIPD

GQLCGPHPQFSGLNLVSEHWPTTTLVAGSTITFQYNAWAPHPGTWYLYVTKDGWDPNSPLGWDDLEP

VPFHTVTDPPIRPGGPEGPEYYWDATLPNKSGRHIIYSIWQRSDSPEAFYDCTDVVFV

2

Streptomyces sp. NRRL B-

3229

4BF1F25 HGTPMKPASRTFLCWQDALTDTGEIKPVNPACKNAQQVSGTTPFYNWFSVLRSDGAGRTRGFVPDGQ

LCSGGNTNFTGFNAPRDDWPLTHLTAGKTVDFSYNAWAAHPGWFYVYVTKDGFDPKKTLTWNDIEDR

PFLSVDHPPLHGSPGTVEANYSWTGQLPANKSGRHIIYMVWQRSDSAETFYSCSDVVFDGG

3

Catelliglobosispora

koreensis

36C6794 HGATMFPGSRTWLCYQDGLRPNGAIEAYNPACAAAIAQNGTTPLYNWFAVLRSDAAGRTVGFIPDGQ

LCSAGTGGPYNFSAYNAARTDWPLTHLTTGANIQFRHSNWAHHPGTFYLAITRQGWSPTSPLAWSDL

QEFASVTNPPANGGPGALNYYYWNAQLPTGRSGRAIIYIRWVRSDSNENFFSCSDVIFDGG

4

Streptomyces albus 4C56432 HGATMMPGSRTYLCYEDLIKNNHQMPANPACAAAVQQSSTTPLYNWFAVLDSNGGGKTTGYIPDGKL

CSGGDRGPYNFSSYNAARADWPKTHLTSGASIQVKHSNWAHHPGKFDVYITKNGFSPTKSLAWSDLE

LLQTVTNPPQSGGAGSNGGHYYWDLNLPQRSGQHIIYIHWTRSDSQENFYSCSDVVFDGG

5

Catenulispora acidiphila

DSM 44928

C7QJR2 HGALMIPGSRTFFCYEDGLTSTGQIVPQNPACQAAVAQSGTTPLYNWFAVGNRSGATSGGTAGFIPD

GKLCSGNSNYYDFSGFDQVSTAWPVTHLTSGADITINYNRWAAHPGTFSLYITKDGWDQTKPLTWAD

MEPAPFSTATDPTNTGTVATLQSYYSWNAKLPANKTGHHIIYSVWARSDSSETFYGCSDVVFDGG

6

Actinoplanes sp. N902-109 R4LUJ8 HGAMMVPGSRTYLCWKDGLSTNGAMQPKNPACAAAVAQSGVTPLYNWFAVLRSDGAGRTSGFIPDGQ

LCSGGTGGPYDFTGFNQARTDWPTTHLTSGSTIRFDYNDWAKHPGTFRLYVTKDSWSPTRPLAWSDL

ESQPFSTATNPTSVGGPGTEDGRYSWTGTLPSGKSGRHLIYSVWQRSDSNETFYGCSDVMFDGG

7

Kitasatospora sp. NRRL B-

11411

UPI0004C34CB0 HGVAIVPGSRTYLCYQDGLTGTGALTPTNPACQAAVAQSGTTPLYNWFAVLDSNAGGRGQGYVPDGT

LCSAGNKSPYDFSAYNAARDDWPKTHLTSGATIEVDYSNWAAHPGEFRVYLTKQGWSPTTPLAWADL

GLLTTVANPPQVGSPGVDGGHYYWNLALPSGRSGAAEMLIQWVRSDSQENFFSCSDLVFDGG

8

Actinoplanes globisporus

DSM 43857

UPI0003818C12 HGAMMMPGSRTWLCYKDGRTSTGEIVPQNPACAAAVAQSGVTPLYNWFAVLRSDGAGRVDGFIPSGQ

LCSGGTGGPYNFTGFNLARNDWPVTHLTSGKTIEIDYNDWAKHPGTFSLYITKDGYDPTKPLAWSDL

NPTPFSQVTNPPANGGPGTDDGHYWWNATLPSGKTGKHLIYSVWARSDSTETFYGCSDVVFDGG

9

Actinoplanes missouriensis

ATCC 14538

I0HBF4 HGAIQVPGSRTWFCYQDGRNPQTGAIEPKNAACAAAVAQSGVTSLYNWFAVLRSDGAGRVSGFLPDG

QLCSGGTGGPYNFTGFNLARTDWPLTHLTAGANVQFKYNNWAKHPGTFYLYITKDGYDPTKPLAWSD

LEPTPFDQVTNPPANGGPGTDDGHYYWNGNLPAGKTGRHLIYSVWSRSDSQETFYGCSDVVFDGG

10

Streptomyces sp. NRRL F-

5135

UPI0004C48435 HGTPMQPASRTFMCWRDGLTSTGEIKPINPACKAAVAESGTTPLYNWFSVLRSDGAGRTKGFIPDGQ

LCSGGNPTFSGFNKNRNDWPLTHLTSGASLDFSYNAWAAHPGWFYVYVTKDGFNPTATFKWDDLESQ

PFLTVDHPSVTGPVGSVEGAYKWTGKLPGNKSGRHIIYMVWQRSDSNETFYSCSDVVFDGG

11

Glycomyces sp. NRRL B-

16210

UPI0004BF2C49 HGATVFPGSRQYFCYFDALSGNGALDPSNAMCQQALNQGGPNAFYNWFGNLDSNGAGQTVGYIPDGE

ICNGGGRGPYDFSAFNAAGNWPRTHVTAGANVEWRYNNWAHHPGKFDLYITKDGWNPNTPITWDELE

LFTTITNPPQNGGPGSDDHYYYANLTLPQKSGYHVVFTHWVRSDSSENFYACSDVEFDGG

12

Micromonospora sp. ATCC

39149

C4RK97 HGAAMTPGSRTYLCWKDGLAPTGEIKPNNPACSAAVAQNGPNSLYNWFSVLRSDAGGRTVGFIPDGK

LCSGGNPGFSGYDAARTDWPLTHLTAGARFDFKYSNWAHHPGTFYFYVTKDSWSPTRALAWSDLETQ

PFLTVTNPPQNGPVGTNEGHYYFSGNLPSGKSGRHIIYSRWVRSDSQENFFGCSDVTFDGG

13

Streptomyces sp. CB01249 A0A1Q5E904 HGVTMSPGSRTYLCWLDAKTSTGSLDPTNPACKAALAESGASSLYNWFAVLDSNAGGRGAGYVPDGT

LCSAGDRSPYNFTGYNAARGDWPRTHLTSGAKIEVDHSNWAAHPGEFRVYMSKPGYSPTTELGWDDL

DLIQTVSNPPQVGSPGTDGGHYYWDLTLPSGRSGDAVMFIQWVRSDSQENFFSCSDIVFDGG

14

MaLPMO10E Micromonospora aurantiaca

ATCC 27029

D9T4V8 HGAAMTPGARTYLCWKDGLTGTGEIRPNNPACSSAVAANGANSLYNWFSVLRSDAGGRTVGFIPDGK

LCSGGNPGFSGYDAARNDWPITHLTAGRSMEFRYSNWAHHPGTFYFYVTKDSWSPNRPLAWSDLEEQ

PFLQVTNPPQRGAVGTNDGHYYFTGNLPSNKSGRHIIYSRWVRSDSQENFFGCSDVTFDGG

15

Streptomyces griseoflavus

Tu4000

D9XN87 HGVAMMPGSRTYLCQLDAITGTGALDPTNPACRNALNQSGASALYNWFAVLDSQAAGRGAGYVPDGT

LCSAGDRSPYDFSAYNAARADWPRTHLTSGATVKVQYSNWAAHPGDFRVYVTKPGWSPTSRLGWSDL

DLVQTVTNPPQQGSPGTNGGHYYWDLKLPSGRSGDALIFMQWVRSDSQENFFSCSDVVFDGG

16

Streptosporangium roseum

ATCC 12428

D2BFT3 HGAMMVPGSRTFFCWQDGLSSTGQIIPINPACGAAVAQSGPNSLYNWFSVLRSDGAGRTRGFIPDGQ

ACSGGNPGYSGFDLPRADWPVTHLTAGAGIQFKYNKWAAHPGWFYLYVTKDGWNPNQALTWDDLESQ

PFHTADHPQSVGSPGTNDAHYYWNATLPSGKSGRHIIYSVWQRSDSNETFYNCSDVVFDGG

17

Micromonospora lupini str.

Lupac 08

I0L866 HGAAMTPGSRTYLCWQDGLSQTGEVRPNNPACSAAVAQSGANSLYNWFSVLRSDAGGRTTGFIPDGQ

LCSGGNPGYKGYDLPRTDWPLTHLTSGSRLDFRYSNWAHHPGTFYFYVTKDSWSPTRALAWSDLEEP

FLTVTNPPQRGSVGTNDGHYYFSGNLPSGKSGRHIIYSRWVRSDSQENFFGCSDVTFDGG

18

Streptomyces scabiei A0A086GQ59 HGTPMKPGSRTFLCWQDGLTDTGEIKPVNPACRSAHQVSGTTPFYNWFSVLRSDGAGRTRGFVPDGQ

LCSGGNTNFTGFNTPSADWPLTHLTSGATVDFSYNAWAAHPGWFYVYITKDGFDPTKTLTWNDLEAQ

PFLSVDHPPLNGSPGTVEANYSWRGQLPANKSGRHLVYMVWQRSDSQETFYSCSDIVFDGG

19

Herbidospora cretacea UPI0004C31F82 HGAMMMPGSRTYLCWKDGLNSSGAIQPNNPACAAAVAQSGTNSLYNWFATLRSDGAGRMSGFIPDGS

LCSGGAVVYNFSGFDLARNDWPVTHLTSGANIQIRYNKWAAHPGTFRLYITKNTWSPTRPLAWSDLE

STPFDSSTNPPSVGNPGNENAYYYWNANLPSGKSGRHIIYSVWQRSDSNETFYGCSDVVFDGG

20

Microbispora rosea UPI0004C399F4 HGAPMMPGSRTYLCWKDGTTSSGAIQPKNPACAAAVAKGGTNALYNWFAVLRSDGAGRMSGFIPDGQ

LCSGGAVVYDFSGFDLARDDWPVTHLTSGATVEFKYNKWAAHPGTFRSYITKDSWSPTRALAWSDLE

STPFDSVTNPPSVGSVGTEDGHYYWNAHLPSGKSGRHIIYTVWQRSDSNETFYSCSDVVFDGG

21

Streptomyces sp. e14 D6KFY7 HGVAMLPGSRTYLCYLDAQTSTGALDPTNPACKDALAKSGATALYNWFAVLDSNAGGRGAGYVPDGT

LCSAGDRSPYDFSAYNAARADWPRTHLTSGATIQAQYSNWAAHPGDFRVYVTKAGWSPTAQLRWSDL

ELIQTVTNPPQVGSPGANGGHYYWDLKLPSGRSGDALVFIQWVRSDSQENFFSCSDVVFDGG

22

Streptomyces violaceusniger

Tu 4113

G2NYJ7 HGAPMAPGSRTYLCWKYSLSSTGEVKPTNPACKAAYDKSGATPFYNWFAVLRSDGAGRTKGFIPDGK

LCSGAATVYDFSGFDVGSKDWPVTHLTAGKSMQFSYNAWAAHPGSFRSYVTKAPIDPTKPLTWNDLE

DQPFNTVTDPPLSGQVGTVDGKYTWTANLPSGKSGRHIIYTVWTRSDSQETFYSCSDVVFDGG

23

CfLPMO10B Cellulomonas flavigena

ATCC 482

D5UGB1 HGGLTNPPTRTYACYQDGLAGGAAAGEAGNIRPRNAACVNAFDNEGNYSFYNWYGNLLGTIAGRHET

IADGKVCGPDARFASYNTPSSAWPTTKVTPGQTMTFQYAAVARHPGWFTTWITKDGWNQNEPIGWDD

LEPAPFDRVLDPPLREGGPAGPEYWWNVKLPSNKSGKHVLFNIWERTDSPESFYNCVDVD

24

Micromonospora

auratinigra

A0A1C6TMZ0 HGAAMTPGARTYLCWKDGLTGTGEIRPNNPACSSAVAANGANSLYNWFSVLRSDAGGRTVGFIPDGK

LCSGGNPGFSGYDAARNDWPITHLTAGRSMEFRYSNWAHHPGTFYFYVTKDSWSPNRPLAWSDLEEQ

PFLQVTNPPQRGAVGTNDGHYYFTGNLPSNKSGRHIIYSRWVRSDSQENFFGCSDVT

25

Xylanimonas cellulosilytica

DSM 15894

D1BWE0 HGAEVFPGSRQYLCWVDGQSASGALDPSNPACASALAVSGSGAYYNWFGNLDSNGAGRTEGYIPDGT

ICSGGDRGPYDFSAFNAPRTDWPTTHLTAGATYEFQHNNWAQHPGRFDVYVTRAGFDPTKPLGWSDL

ELIDSVTNPPDTGGPGSDNYYYWDVTLPADRTGRHIVFTHWVRSDSTENFYSCSDV

26

Sequences of C1/C4-oxidizing LPMO10s

ScLPMO10B Streptomyces coelicolor

A3(2)

Q9RJC1/4OY6 HGSVVDPASRNYGCWERWGDDFQNPAMADEDPMCWQAWQDDPNAMWNWNGLYRNGSAGDFEAVVPDG

QLCSGGRTESGRYNSLDAVGPWQTTDVTDDFTVKLHDQASHGADYFLVYVTKQGFDPATQALTWGEL

QQVARTGSYGPSQNYEIPVSTSGLTGRHVVYTIWQASHMDQTYFLCSDVDFG

1

Cellulomonas flavigena

ATCC 482

D5UH31 HGAVSDPPSRIYGCWERWASNFTDPAMATSDPQCWDAWQSEPQAMWNWNGMFKEGAAGQHEQSIPDG

KLCSADNPLYAAADDPGPWRTTPVDHDFRLTLHDPSNHGADYLKIYVTKQGYDARSEALTWADLELV

KTTGRYATSSPYVTDVSVPRDRTGHHVVFTIWQASHLDQPYYQCSDVTFG

2

Streptomyces scabiei 87.22 C9Z804 HGTVVGPATRAYQCWKTWGSNHTNPAMQTQDPMCWQAFQANADTMWNWMSALRDGLGGQFQARTPDG

TLCSNNLSRNASLDKPGAWKTTTIGSNFSVQLYDQASHGADYFKVYVSKQGFNPKTQTLGWSNLDFI

TQTGRFAPAQNITFPVRTSGYTGHHIVFVIWQASHLDQAYMWCSDVNFG

3

Verrucosispora maris AB-

18-032

F4F6G8 HGTIVNPASRAYQCWKTWGNNHMSPQMQQQDPMCWQAFQANPDTMWNWMSALRDGLGGQFQARTPDG

QLCSNALSRNNSLNRPGAWKTTNISRNFTVQLHDQASHGADYFRVYVSKQGFNPATQQLGWGNLDFI

TQTGRYAPAQNISFNVSTSGYTGHHILFVIWQASHLDQAYMWCSDVNFV

4


Actinoplanes missouriensis

ATCC 14538

I0H648 HGTIINPASRAYQCWKTWGSQHMNPAMQQQDPMCWQAFQANPDTMWNWMSALRDGLGGQFQARTPDG

QLCSNGLTRNDSLNQPGAWKQTTLSRNFTVQLYDQASHGADYFRVYVSKQGFNPATQKLGWGNLDFI

TQTGRYAPAQNISFDISTSGYTGHHVLFVIWQASHLDQAYMWCSDVNFA

5

Actinosynnema mirum

ATCC 29888

C6WN10 HGTIVGPATRAYQCWQDWGSQHTNPAMQQQDPTCYQAYQANADTMWNWMSALRDGLKGNFQAATPDG

QLCSNALARNNSLNTPGKWRTTSVGSNFTMRLHDQASHGADYFKVYVSKNGFDPATQRLGWGNLDLV

AQTGKYAPAKDITFNVQTNGSYRGHHVVFTIWQASHLDQAYMWCSDVNFG

6

Amycolatopsis mediterranei

S699

G0G8Q3 HGTIVDPATRAYQCWKDWGSQHTNPAMQQQDPMCWAAFQANADTMWNWMSALRDGLHGNFQGVTPDG

QLCSNGLARNNSLNTPGPWRTTSIGSTFSMHLYDQASHGADFIRVYVSKNGYDPTTQPLGWGNLDQV

TQTGRYAPAKDITFTVQTNGGYRGHHVVFTIWQASHQDQTYMWCSDVNFG

7

Nocardiopsis dassonvillei

ATCC 23218

D7B4C4 HGSIVDPASRNYGCWERWGSDHLNPDMAQEDPMCWQAWQDNPNAMWNWNGLYRDNVGGDHQGQIPDG

TLCSGGNTEGGRYDSMDAVGEWKTTDVGDDFTLHLYDQAQHGADYFRVYVTEQGFDPTTQALGWDDL

ELVEETGPYAPALDSYIDVSTSGRSGHHIVFTIWQASHMDQVYYLCSDVNFT

8

TfLPMO10A Thermobifida fusca YX Q47QG3/4GBO HGSVINPATRNYGCWLRWGHDHLNPNMQYEDPMCWQAWQDNPNAMWNWNGLYRDWVGGNHRAALPDG

QLCSGGLTEGGRYRSMDAVGPWKTTDVNNTFTIHLYDQASHGADYFLVYVTKQGFDPTTQPLTWDSL

ELVHQTGSYPPAQNIQFTVHAPNRSGRHVVFTIWKASHMDQTYYLCSDVNFV

9

Thermomonospora sp.

MTCC 5117

A4GND6 HGSVINPATRNYGCWLRWGNDHLNPNMQHEDPMCWQAWQDNPNAMWNWNGLYRDNVGGNHRAALPDG

QLCSGGLAEGGRYRSMDAVGPWKTTDINNTFTIHLYDQASHGADYFEVYVTKQGFDPTTQPLTWGSL

DLVHRTGSYAPSQNIQFTVNAPNRSGRHVVFTIWKASHMDQTYYLCSDVNFV

10

Streptomyces

bingchenggensis BCW-1

D7CG43 HGSVIDPASRNYGCWLRWADDFQNPEMAQKDPMCWQAWQDNPNAMWNWNGLYRNGSAGNFPAVVPDG

QLCSGGHTEGNRYNSLDTVGAWQTTNISNKFTVKLYDQASHGADYFLVYVSRQGYDPTTQPLKWSNL

QQVAKTGKYAPSQNYSIDVNTSGYTGRHVVYTIWQASHMDQTYFLCSDVNFS

11

Streptomyces violaceusniger

Tu 4113

G2PC07 HGSVVDPASRNYGCWLRWGDDFQNPEMAKLDPMCWQAWQDNPNAMWNWNGLYRNGSAGNFPAVIPDG

QLCSGGHTEGGRYNSLDTPGAWQTTNIGGKFTVKLYDQASHGADYFKVYVSREGYDPTTQPLKWSDL

QLVTTTGKYAPSQNYAIDVSTSGYTGRHVVYTIWQASHMDQTYFLCSDVNFG

12

Streptomyces sp. SirexAA-E

/ ActE

G2NK15 HGSVVDPASRNYGCWLRWGSDFQNPAMAQEDPMCWQAWQADPNAMWNWNGLYRNESAGNFPAVIPDG

QLCSGGRTEGGRYNALDTVGAWQATDITDDFTVRLEDQASHGADYFRVYVTEQGFDPTAQPLTWGAL

DLVAETGRYGPSTSYEIPVSTSGYTGRHVVYTIWQASHMDQTYFLCSDVNFG

13

Streptomyces ambofaciens

ATCC 23877

A0AD36 HGSVVDPASRNYGCWQRWGDDFQNPAMAQQDPMCWQAWQDDPNAMWNWNGLYRNGSAGDFEAVVPDG

QLCSGGRTEGGRYNSLDAVGAWKTTAVADDFTVKLHDQASHGADYLKVYVSRQGFDPTTQPLGWDDL

QLLTTTGRYAPGQHYEVPVSTSGLSGRHVVYTIWQASHMDQTYFLCSDVDFG

14

Streptomyces pratensis

ATCC 33331

E8W3L9 HGSVVDPASRNYGCWLRWGDDFQNPAMEQEDPMCWQAWQADPNAMWNWNGLYRNGSGGDFPAAVPDG

QLCSGGRTESGRYDSLDSVGAWKTTDITDDFTVKLYDQASHGADYFQVYVTRQGFDPAAEALTWSDL

QLVAETGRYAPSQNYEIPVTTSGLSGRHVVYTIWQASHMDQTYFLCSDVNFT

15

Verrucosispora maris AB-

18-032

F4FDY2 HGSVVNPGARAYSCWERWGGDHMNPRMATEDPMCWQAWQANPQAMWNWNGQFREGVGGNHQAAIPDG

QLCSGGRAEGGRYNALDTIGAWRTTQVSNSFRLKFFDQASHGADYIRVYATKQGFDALTKPLAWSDL

ELVGQIGNTPASQWQREVDGVSIEIPVNAPGRSGRHIIYTIWQASHFDQSYYHCSDVQFG

16

MaLPMO10B Micromonospora aurantiaca

ATCC 27029

D9SZQ3/5OPF HGSVVDPASRSYSCWQRWGGDFQNPAMATQDPMCWQAWQADPNAMWNWNGLFREGVAGNHQGAIPDG

QLCSGGRTQSGRYNALDTVGAWKTVPVTNNFRVKFFDQASHGADYIRVYVTKQGYNALTSPLRWSDL

ELVGQIGNTPASQWTREVDGVSIQIPANAPGRTGRHVVYTIWQASHLDQSYYLCSDVDFG

17

Micromonospora aurantiaca

sp. L5

D3C3Z5 HGSVVDPASRSYSCWQRWGGDFQNPAMATQDPMCWQAWQADPNAMWNWNGLFREGVAGNHQGAIPDG

QLCSGGRTQSGRYNALDTVGAWKTVPVTNNFRVKFFDQASHGADYIRVYVTKQGYNALTSPLRWSDL

ELVGQIGNTPASQWTREVDGVSIQIPANAPGRTGRHVVYTIWQASHLDQSYYLCSDVDFG

18


Actinoplanes missouriensis

ATCC 14538

I0H7A5 HGNVIGPASRNYGCYERWGSKFQDPAMATEDPMCWQAWQANPNAMWNWNGLFRENVAGQHETAIPDG

QLCSAGHTENGRYNAMDTVGDWKATSIGNSFTVQLFDGARHGADYIRVYVTKPGYNPVTTPLKWSDL

QLITTVPNTPAANWTHQQSNGVQIDIPVSVSGRSGRAMVYTIWQASHLDQSYYFCSDVNFG

19

Streptomyces

bingchenggensis BCW-1

D7C1L2 HGSAIGPASRNYGCWKRWGSDFQNPEMKTKDPMCWQAWQADTNAMWNWNGLYREGVAGNHQAALPDG

QLCSGGHTSSGRYNAMDVPGKWEATAVNNSFTFKNHDQAKHGADYYRIYVTKQGYDATTQPLRWSDL

ELVAQTGKIAPGVGEPSTDPSLNGVTVSIPVNASGRTGRHVVFMIWQASHLDQSFYSCSDVIFPGG

20

Actinosynnema mirum

ATCC 29888

C6WL62 HGSTTDPPSRNYGCWQRWGSDFQNPTMAQRDPMCWQAWQRDTNAMWNWNGLYREGVAGNHQAAIPDG

QLCSAGRTENGRYAAMDVPGAWTAATKPRQFTLTVTDQAKHGADYLRVYVTKQGFNPLTTSPRWSDL

EQVASTGRYAPAGTYQVAVNAGSRTGRHIVYVIWQASHSDQSYYFCNDVIFQ

21

Amycolatopsis mediterranei

S699

G0FLJ0 HGSATDPPSRNYGCWNRWGSDFQNPAMATKDPMCWQAWQADPNAMWNWNGLYREGVKGNHQGAIPDG

TLCSGGRTQAPRYNALDTVGAWQMANKDNRFTLTVTDQAHHGADYLLVYITKQGFDPATQPLGWGNL

ELVAQTGRYAPAGQYQVAVNAGTRTGRHVVYTIWQASHLDQSYYFCSDVNLSGR

22

MaLPMO10D Micromonospora aurantiaca

ATCC 27029

D9T1F0 HGSVTNPPTRNYGCWERWGSDHLNPTMAQTDPMCWQAWQANPNTMWNWNGLYRENVGGNHQAAVPDG

QLCSGGRTQGGLYASLDAVGAWTAKPMPNNFTLTLTDGAKHGADYMLIYITKQGFDPTTQPLTWNSL

ELVLRTGSYPTTGLYEAQVNAGNRTGRHVVYTIWQASHLDQPYYLCSDVIFG

23

Micromonospora aurantiaca

sp. L5

D3C5V5 HGSVTNPPTRNYGCWERWGSDHLNPAMAQTDPMCWQAWQANPNTMWNWNGLYRENVGGNHQAAVPDG

QLCSGGRTQGGLYASLDAVGAWTAKPMPNNFTLTLTDGAKHGADYMLIYITKQGFDPTTQPLTWNSL

ELVLRTGSYPTTGLYEAQVNAGNRTGRHVVYTIWQASHLDQPYYLCSDVIFG

24

Cellulomonas flavigena

ATCC 482

D5UGA8 HGSVTDPPSRNYGCWEREGGTHMDPAMAQRDPMCWQAFQANPNTMWNWNGNFREGVGGRHEQVIPDD

QLCSAGKTQNGLYASLDTPGPWIMKTVPHNFTLTLTDGAMHGADYMRIYVSKAGYDPTTDPLGWDDI

ELIKETGRYGTTGLYQADVSIPSNRTGRAVLFTIWQASHLDQPYYICSDINING

25

Xylanimonas cellulosilytica

DSM 15894

D1BV23 HGSVTNPPTRNYGCFERWGDDHLNPVMATEDPMCWGAFQAEPSAMWNWNGLYRENVGGNHEAAIPNG

QLCSGGRTENGRYNYLDTPGQWVAKGVPNNFRLTLTDGAQHGADYLRIYVSKPGFDPTTQALGWDDL

TLVKETGRYGTTGLYETDLSIPGRSGRAVLFTIWQASHLDQPYYLCSDINIG

26

Cellulomonas fimi ATCC

484

F4H6A3 HGSVTDPPTRNYSCWERWGSDHLNPQMATLDPMCWGAFQHDPNAMWNWNGLYRENVGGRHEAVIPDG

QLCSGGRTFSPRYDYLDTPGPWTAKAVPEKFTLTLTDGAKHGADYLRIYVSKPGFDPTKEALGWDDI

TLLKETGRYGTTGLYQTDVDLTGRSGRAVLFTIWQASHLDQPYYLCSDINVG

27

Cellulomonas gilvus ATCC

13127

F8A7J2 HGSVTDPASRNYGCWLRWGSDHLNPEMADEDPMCWAAFQADPNTMWNWNGLFREGVAGNHEAAIPDG

QLCSAGRTFDGRYGSLDTPGRWTTTSVPNSFTLTLTDSAKHGADYLKVYVSKPGYDPTTKALAWSDL

ELLKTTGRYPTTGLYQTDIDLPGRSGRAVLYTIWQASHLDQSYYLCSDIMI

28

Tables S2 – S6.

How to read the correlation tables (Tables S2 – S6):

Each table shows correlations for one pair of positions (x, y). For each (x, y) pair, the correlation score is given in the legend. The method applied by Kuipers et al. (1) that underlies the calculation of the correlation score is known as the statistical coupling analysis method (2,3). Let x and y denote positions in the multiple sequence alignment (MSA), and let a and b denote the amino acids (out of 20) tested at positions x and y, respectively. The algorithm runs x and y over the positions in the MSA and calculates the parameters used to compute the correlation score (see ref (1) for the equation): the frequency of residue type a at position x; the frequency of residue type b at position y; and the frequency of residue type b at position y when type a is observed at position x. The latter is provided in Tables S2-S6 for selected (x, y) couples and thus represents the occurrence, in percent, of a given pair of residues at positions x and y within the MSA of all 54 sequences (C1 and C1/C4 oxidizers). Position numbers refer to the MSA. A high correlation score is obtained when the residues at positions x and y tend to be mutated in tandem. If an (x, y) pair is fully conserved throughout all the sequences, a score of zero is attributed.

The actual residue number in each protein sequence may be found in the correlation network shown in Fig. 3A of the main text. For reference, the residue pair found in C1-oxidizing ScLPMO10C and in C1/C4-oxidizing MaLPMO10B is provided in the legend of each table. Colors in the tables indicate the occurrence of specific pairs in specific (model) LPMOs: green, pair found in C1-oxidizing ScLPMO10C; orange, pair found in C1/C4-oxidizing ScLPMO10B (which is also active on chitin); yellow, dominating alternative combination sharing one residue with either ScLPMO10B or ScLPMO10C; grey, frequent combination not containing residues found in ScLPMO10C or ScLPMO10B.
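As an illustration of how the percentages in Tables S2-S6 can be read, the following minimal sketch tabulates the occurrence of residue pairs at two MSA columns. It is not the tool used for the CMA; the function name `pair_frequencies` and the toy alignment are hypothetical and serve only to show that the entries for one (x, y) couple sum to 100%.

```python
from collections import Counter

def pair_frequencies(msa, x, y):
    """Percentage of sequences carrying each residue pair (a, b) at MSA columns x and y.

    x and y are 1-based column numbers; '-' denotes a gap.
    """
    counts = Counter((seq[x - 1], seq[y - 1]) for seq in msa)
    total = len(msa)
    return {pair: round(100.0 * n / total, 2) for pair, n in counts.items()}

# Hypothetical toy alignment (not the sequences of Table S1)
toy_msa = ["AYCF", "AYCF", "AWCN", "AWCN", "AWCM"]

freqs = pair_frequencies(toy_msa, 2, 4)
for (a, b), pct in sorted(freqs.items()):
    print(f"{a} at x, {b} at y: {pct}%")
```

Each key of the returned dictionary corresponds to one non-zero cell of a table such as Table S2, with the residue at position x selecting the row and the residue at position y selecting the column.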

Table S2. Correlations between CMA positions 51 and 54 (eq. Y79 and F82 in ScLPMO10C, W82 and N85 in MaLPMO10B). For these two positions, the correlation score was 0.83. The table shows the frequency of type a residue at position x when a type b residue is observed at position y. The sum of all frequencies for the couple (x, y) is equal to 100%.

Position 54 (columns), Position 51 (rows)

% A C D E F G H I K L M N P Q R S T V W Y -

A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

H 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

K 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

N 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

W 0 0 0 0 0 0 0 0 0 0 9.09 41.82 0 0 0 0 0 0 0 0 0

Y 0 0 0 0 45.45 0 0 0 0 0 1.82 0 0 0 0 0 0 0 0 1.82 0

- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table S3. Correlations between CMA positions 54 and 117 (eq. F82 and W141 in ScLPMO10C, N85 and Q141 in MaLPMO10B). For these two positions, the correlation score was 0.75. The table shows the frequency of type a residue at position x when a type b residue is observed at position y. The sum of all frequencies for the couple (x, y) is equal to 100%.

Position 117 (columns), Position 54 (rows)

% A C D E F G H I K L M N P Q R S T V W Y -

A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45.45 0 0

G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

H 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

K 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

M 0 0 0 0 0 0 0 0 0 0 0 0 0 9.09 0 0 0 0 1.82 0 0

N 0 0 0 0 0 10.91 0 0 0 0 0 0 1.82 27.27 0 1.82 0 0 0 0 0

P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

W 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.82 0 0 0

- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table S4. Correlations between CMA positions 51 and 117 (eq. Y79 and W141 in ScLPMO10C, W82 and Q141 in MaLPMO10B). For these two positions, the correlation score was 0.82. The table shows the frequency of type a residue at position x when a type b residue is observed at position y. The sum of all frequencies for the couple (x, y) is equal to 100%.

Position 117 (columns), Position 51 (rows)

% A C D E F G H I K L M N P Q R S T V W Y -

A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

H 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

K 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

N 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

W 0 0 0 0 0 10.91 0 0 0 0 0 0 1.82 36.36 0 1.82 0 0 0 0 0

Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.82 47.27 0 0

- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table S5. Correlations between CMA positions 54 and 88 (eq. F82 and F113 in ScLPMO10C, N85 and Y116 in MaLPMO10B). For these two positions, the correlation score was 1.00. The table shows the frequency of type a residue at position x when a type b residue is observed at position y. The sum of all frequencies for the couple (x, y) is equal to 100%.

Position 88 (columns), Position 54 (rows)

% A C D E F G H I K L M N P Q R S T V W Y -

A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

F 0 0 0 0 41.82 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.64 0

G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

H 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

K 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

M 1.82 0 0 0 0 0 0 0 0 0 0 9.09 0 0 0 0 0 0 0 0 0

N 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 41.82 0

P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

W 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Y 0 0 0 0 1.82 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table S6. Correlations between CMA positions 51 and 88 (eq. Y79 and F113 in ScLPMO10C, W82 and Y116 in MaLPMO10B). For these two positions, the correlation score was 0.80. The table shows the frequency of type a residue at position x when a type b residue is observed at position y. The sum of all frequencies for the couple (x, y) is equal to 100%.

Position 88 (columns), Position 51 (rows)

% A C D E F G H I K L M N P Q R S T V W Y -

A 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

D 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

F 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

G 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

H 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

I 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

K 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

N 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

W 0 0 0 0 0 0 0 0 0 0 0 9.09 0 0 0 0 0 0 0 41.82 0

Y 1.82 0 0 0 43.64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.64 0

- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Table S7. Primers used to amplify the genes encoding MaLPMO10B and MaLPMO10D. Genes encoding full-length MaLPMO10B and D and the catalytic domain of MaLPMO10B were cloned into the pRSET B expression vector using In-Fusion cloning. Vector overhang sequences are shown as bold letters; the underlined sequences show the NdeI or BsmI (forward) and HindIII (reverse) restriction sites used to linearize the pRSET B vector prior to In-Fusion cloning. The BsmI restriction site lies at the junction between the SmLPMO10A signal sequence and the MaLPMO10B gene.

In-Fusion cloning primers                      Signal peptide and restriction enzyme   Primer sequence 5’ to 3’

malpmo10B forward primer                       Native (residue 1-36), NdeI             GAAGGAGATATACATATGTCAACGCCGTATCGT

malpmo10B reverse primer (full-length)         Native (residue 1-36), HindIII          CAGCCGGATCAAGCTTTTAACTGGTCGTGGT

malpmo10B reverse primer (catalytic domain)    Native (residue 1-36), HindIII          CAGCCGGATCAAGCTTTTAGCCAAAGTCAACATCG

malpmo10D forward primer                       SmLPMO10A (residue 1-27), BsmI          CGCAACAGGCGAATGCCCATGGCAGTGTTACGAAT

malpmo10D reverse primer                       SmLPMO10A (residue 1-27), HindIII       CAGCCGGATCAAGCTTTTAGCGGGCGGTGCAGGTC

Table S8. Calculated protein properties for MaLPMO10B, MaLPMO10D, ScLPMO10B and ScLPMO10C wild type and mutants.

Enzyme      Mutation(s)               Mw (Da)   Extinction coefficient, 280 nm (M⁻¹ cm⁻¹)a

MaLPMO10B   WT                        34,906    93,765
MaLPMO10B   WT catalytic domain       21,560    67,170
MaLPMO10B   W82Y                      34,883    89,755
MaLPMO10B   N85F                      34,939    93,765
MaLPMO10B   Y116F                     34,890    92,275
MaLPMO10B   D140A                     34,862    93,765
MaLPMO10B   Q141W                     34,964    99,265
MaLPMO10B   W82Y/N85F                 34,916    89,755
MaLPMO10B   N85F/Y116F                34,923    92,275
MaLPMO10B   W82Y/N85F/Y116F           34,900    88,265
MaLPMO10B   W82Y/N85F/Q141W           34,974    95,255
MaLPMO10B   W82Y/N85F/Y116F/Q141W     34,958    93,765

MaLPMO10D   WT                        34,017    100,755

ScLPMO10B   WT                        20,737    63,160
ScLPMO10B   W88Y/N91F                 20,747    59,150
ScLPMO10B   A148G                     20,723    63,160
ScLPMO10B   A148S                     20,753    63,160

ScLPMO10C   WT                        34,568    75,775
ScLPMO10C   Y79W                      34,590    79,785
ScLPMO10C   F82N                      34,534    75,775
ScLPMO10C   Y79W/F82N                 34,557    79,785
ScLPMO10C   W141Q                     34,510    70,275
ScLPMO10C   A142G                     34,554    75,775

a Predicted value from the ExPASy ProtParam server (http://web.expasy.org/protparam/) assuming all pairs of Cys residues form cystines.
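The differences between the extinction coefficients of wild-type and mutant enzymes in Table S8 can be rationalized with a short sketch of the calculation that ProtParam documents for 280 nm, i.e. a sum over Trp (5500), Tyr (1490) and cystine (125 M⁻¹ cm⁻¹) contributions. The function name and the toy peptides below are hypothetical; the sketch assumes, as in footnote a, that all Cys residues pair into cystines.

```python
def extinction_coefficient_280(sequence):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Uses the per-residue values documented for ProtParam:
    Trp 5500, Tyr 1490, cystine 125. All Cys are assumed to pair into cystines.
    """
    seq = sequence.upper()
    n_trp = seq.count("W")
    n_tyr = seq.count("Y")
    n_cystine = seq.count("C") // 2
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine

# Hypothetical toy peptides differing by a single Trp -> Tyr substitution
toy_wt = "HGWACYWCK"
toy_w_to_y = "HGYACYWCK"

delta = extinction_coefficient_280(toy_w_to_y) - extinction_coefficient_280(toy_wt)
print(delta)  # change caused by one Trp -> Tyr substitution
```

A single Trp-to-Tyr substitution lowers the estimate by 4,010 M⁻¹ cm⁻¹, which matches the difference between MaLPMO10B WT (93,765) and the W82Y variant (89,755) in Table S8.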

Figure S1. Phylogenetic tree built from 130 LPMO10 sequences. Enzymes with green labels have been biochemically characterized and are known to be strict C1 cellulose-oxidizers; orange labels indicate enzymes with known C1/C4-oxidizing activity on cellulose and C1-oxidizing activity on chitin; grey labels indicate enzymes known to oxidize the C1 carbon in chitin. Black labels indicate enzymes from M. aurantiaca; MaLPMO10B, the model enzyme used in this study, is colored blue. The fourth, non-colored cluster contains uncharacterized enzymes with unknown activity, some of which contain a C-terminal sortase recognition motif and are predicted to be covalently anchored to the cell wall in Gram-positive bacteria.

Figure S2. Heat map derived from correlated mutation analysis. Panel A shows a full view, whereas panel B shows only highly correlated positions. The map is a matrix showing the correlation between residues found in the MSA (numbering given on the x- and y-axes). Two residues are considered correlated when mutation of one amino acid in the pair under study is associated with mutation of the second residue in this pair (1). For instance, assume that one finds a Trp at position x and an Ala at position y in 50% of the sequences, and that in the remaining 50% of the sequences one finds a Tyr at position x. Then, if the Ala is mutated into, for instance, a Glu each time that Trp is mutated into Tyr, positions x and y will be considered highly correlated. If the Ala is instead mutated into a diversity of residues in the remaining 50% of the sequences, positions x and y are not considered correlated. The color gradient, from green to red, represents the correlation score from the lowest to the highest value, respectively (the highest correlation is set to 100%, which here was found for the couple of CMA positions 54-88; see Table S5). For selected correlated positions (see main text), the diversity of amino acids occurring at both positions can be retrieved from Tables S2-S6. Red-labeled numbers in panel B represent the fourteen CMA positions with the highest correlation scores.
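The Trp/Ala example in the legend of Fig. S2 can be made concrete with a toy calculation. The sketch below does not implement the correlation score of ref (1), whose equation is not reproduced here; as a stand-in it uses mutual information divided by joint entropy, a measure chosen only because it shows the same qualitative behavior (high for tandem mutations, lower when the second position diversifies, zero for a fully conserved pair). The function names and the ten-sequence columns are hypothetical.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of the observed symbols."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def toy_correlation(col_x, col_y):
    """Stand-in score: mutual information divided by joint entropy.

    Returns 0.0 for a fully conserved pair of columns.
    """
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0
    mutual_information = entropy(col_x) + entropy(col_y) - h_xy
    return mutual_information / h_xy

col_x = "WWWWWYYYYY"    # Trp in half of the sequences, Tyr in the other half
coupled = "AAAAAEEEEE"  # Ala always becomes Glu when Trp becomes Tyr
diverse = "AAAAAEDKRS"  # Ala becomes a diversity of residues instead

coupled_score = toy_correlation(col_x, coupled)
diverse_score = toy_correlation(col_x, diverse)
conserved_score = toy_correlation("WWWW", "AAAA")
print(coupled_score, diverse_score, conserved_score)
```

With this stand-in, the tandem case receives the maximal score, the diversified case scores clearly lower, and the conserved pair scores zero; the absolute values are not comparable to the scores reported in Tables S2-S6.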

Figure S3. Amino acid frequency encountered at each position in an MSA of sequences in the cellulose-active C1-specific LPMO10 sub-group (26 sequences). The graph was generated using WebLogo (4). Numbers in black on the x-axis correspond to the numbering for this MSA, whereas numbers written in white on a red background indicate residues identified during the CMA analysis (see Fig. 3 and S2). Note that the latter numbers correspond to the global MSA (i.e. including both the C1 and C1/C4 subgroup sequences), which explains why they differ from the numbers written in black. The catalytic histidines are labeled with an orange star.

Figure S4. Amino acid frequency encountered at each position in an MSA of sequences corresponding to the cellulose-active C1/C4 LPMO10 sub-group (28 sequences). See legend of Fig. S3 for further details.

Figure S5. MALDI-TOF MS of the hexamer cluster obtained after degrading PASC with C1-oxidizing ScLPMO10C (black) or C1/C4-oxidizing MaLPMO10D (purple), MaLPMO10B (blue) or ScLPMO10B (orange). Prior to MS analysis, the product mixtures were saturated with 10 mM sodium ions. Possible assignments of the observed masses are shown below the spectra: red, double-oxidized; black, C1-oxidized; blue, C4-oxidized; green, native. Note that the 1009, 1027 and 1049 species can only result from C4 oxidation and do not occur in the product mixture generated by ScLPMO10C.

Figure S6. MALDI-TOF MS of the hexamer cluster obtained after degrading β-chitin with SmLPMO10A (black), which is only active on chitinous substrates, and three LPMOs with mixed chitin and cellulose activity: MaLPMO10D (purple), MaLPMO10B (blue) and ScLPMO10B (orange). Peak annotations are shown above the spectrum for SmLPMO10A for the more abundant sodium adducts. The peak annotated with 1291.70* is the potassium adduct of the hexamer aldonic acid and 1313.71** is the sodium salt of the potassium adduct of the hexamer aldonic acid. Note that the LPMOs with dual substrate-specificity produce more partly deacetylated products (-42 Da per deacetylation) compared to SmLPMO10A.

Figure S7. Structures and structure-based sequence alignment of C1/C4-active MaLPMO10B and

ScLPMO10B, and C1-active ScLPMO10C. The structure pictures highlight the location of the insertion in

MaLPMO10B [P180-G190, includes helix 9 (H9) and β-strand 7 (S7)] and ScLPMO10C (P177-G188),

relative to ScLPMO10B. The so-called L2 loop region between S1 and S3 is colored violet in the three

structures. Secondary structure assignments according to MaLPMO10B appear above the sequences, and

the individual sequences are also colored by secondary structure. Cysteine residues involved in disulfide

formation appear on a yellow background; the four residues targeted for mutagenesis after CMA appear on

a magenta background; the alanine residue affecting the accessibility of the axial copper coordination

position appears on an orange background. The two catalytic histidines are each marked with a blue star above the

sequences. The structure-based alignment was generated with PyMod (5).

S-23

Figure S8. Chromatographic analysis of products generated by MaLPMO10B variants. Chromatograms

are shown for wild-type MaLPMO10B and ScLPMO10C, nine variants of MaLPMO10B based on the

CMA, and truncated wild-type MaLPMO10B (WTcd). The reactions contained 0.2 % (w/v) PASC, 1 µM

Cu(II)-loaded LPMO and 1 mM ascorbic acid in 50 mM Bis-Tris buffer pH 6.0, and were incubated for 24

h at 40 °C. The peaks for cellotriose (Glc3) and C1-oxidized hexamer (Glc5Glc1A) are labeled with red

asterisks and were taken as representative major peaks for native and C1-oxidized

cello-oligosaccharides, respectively; they were used to calculate the ratios between C1-oxidized and native

products (see Fig. S10 and text).

S-24

Figure S9. HPAEC-PAD analysis of products generated by variants of ScLPMO10B (A, B) and

ScLPMO10C (C, D). All variants were incubated with 0.2 % (w/v) PASC for 24 h. Black chromatograms

show various standard samples and a negative control in which enzyme was excluded from the reaction

mixture. Blue chromatograms show product profiles with detectable amounts of C4-oxidized species,

whereas green chromatograms show no or very minor peaks (putatively) representing C4-oxidized species.

All reactions were carried out in 50 mM Bis-Tris buffer pH 6.0, at 40 °C with 1 µM Cu(II)-loaded LPMO

and 1 mM ascorbic acid. For quantification of oxidized products (B, D), the products were degraded using

TfCel5A and C1- (green) and C4-oxidized (blue) dimers were quantified using in-house produced standards

with known concentrations (see Fig. S14 for standard curves and quantification). For ScLPMO10B variants,

the C4-oxidized products were below detection limit even after TfCel5A hydrolysis and are therefore not

shown in panel B. The error bars show s.d. (n = 3). DHA, dehydroascorbic acid.

S-25

Figure S10. Comparison of the ratio between C1-oxidized and native oligosaccharides released from PASC

by MaLPMO10B variants or ScLPMO10C-WT. Peaks for the most abundant native and C1-oxidized

products (see Fig. S8), cellotriose (Glc3) and C1-oxidized cellohexaose (Glc5Glc1A), were integrated and

the values were used as quantitative representatives of native and C1-oxidized products, respectively.

C1:native ratios are shown as red markers (secondary y-axis). All reactions were carried out with 0.2 %

(w/v) PASC in 50 mM Bis-Tris buffer pH 6.0, at 40 °C with 1 µM Cu(II)-loaded LPMO and 1 mM ascorbic

acid.
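
The C1:native ratio described in this legend is a quotient of two integrated peak areas. A minimal sketch, assuming the areas have already been integrated from the chromatograms; the function name and the example areas are hypothetical and are not data from this study:

```python
def c1_to_native_ratio(area_glc5glc1a, area_glc3):
    """Ratio of the C1-oxidized representative peak (Glc5Glc1A)
    to the native representative peak (Glc3)."""
    if area_glc3 <= 0:
        raise ValueError("native peak area must be positive")
    return area_glc5glc1a / area_glc3


# Hypothetical integrated peak areas (arbitrary detector units).
example_areas = {"variant_1": (30.0, 12.0), "variant_2": (8.0, 16.0)}
for name, (oxidized, native) in example_areas.items():
    print(name, c1_to_native_ratio(oxidized, native))
```

Because the detector response differs between the two compounds, such a ratio is useful for comparing variants with each other rather than as an absolute molar ratio.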

S-26

Figure S11. HILIC-UV detection of oxidized products from squid pen β-chitin generated by MaLPMO10B

variants. The chromatograms are arranged in the order of activity level. All reaction mixtures were

incubated for 24 h at 40 °C and 1000 rpm and contained 1 % (w/v) chitin, 1 mM ascorbic acid and 1 µM of

LPMO in 50 mM Bis-Tris buffer pH 6.0.

S-27

Figure S12. Binding of full-length (solid line) and catalytic domain (dashed line) variants of MaLPMO10B

to Avicel. The percentage of free protein was determined by measuring the reduction in protein

concentration (A280) in the supernatant over time (5, 15, 30, 60, 120 and 240 minutes) as previously

described (6). The experiment was carried out at 22 °C using 1 % (w/v) substrate in 50 mM Bis-Tris buffer

pH 6.0, in the absence of ascorbic acid.
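
The percentage of free protein can be sketched as the supernatant A280 at each time point relative to a reference A280. This is an assumption about the calculation (the exact procedure is described in (6)); here the reference is taken to be the A280 of the protein solution before binding, and all names and values are hypothetical:

```python
def percent_free(a280_supernatant, a280_reference):
    """Percentage of protein remaining in the supernatant at each time point."""
    if a280_reference <= 0:
        raise ValueError("reference A280 must be positive")
    return [100.0 * a / a280_reference for a in a280_supernatant]


# Sampling times from the legend; the absorbance readings are made up.
times_min = [5, 15, 30, 60, 120, 240]
readings = [0.45, 0.40, 0.35, 0.30, 0.27, 0.25]
for t, free in zip(times_min, percent_free(readings, 0.50)):
    print(f"{t:>3} min: {free:.0f} % free")
```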

S-28

Figure S13. Probing early inactivation of two MaLPMO10B variants. (A) One µM of MaLPMO10B-WTcd

or the W82Y/N85F mutant of the full-length enzyme was incubated with 0.1 % (w/v) PASC and 1 mM

ascorbic acid for up to four hours. After four hours (t = 0, panel B; equivalent to t = 4 h in panel A), the

sample was split and, to one half of the reaction, more substrate and ascorbic acid (“+ PASC”; final

concentration 0.15 %) were added, whereas fresh enzyme and ascorbic acid were added to the other half (“+

LPMO”, final concentration 1.5 µM), followed by further incubation for 2 h. Samples were taken at various

time points and C1-oxidized products were quantified by HPAEC-PAD after LPMO-generated cello-

oligosaccharides had been hydrolyzed by TfCel5A. Dilution factors were taken into account when

determining the amount of oxidized sites. The error bars show s.d. (n = 3).
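
One way to take a dilution factor into account is to scale concentrations measured after an addition back to the volume before the addition. The sketch below assumes this simple volume-based correction; the legend does not state the volumes used, so the function name and the numbers are hypothetical:

```python
def correct_for_dilution(measured_conc, volume_before, volume_added):
    """Express a concentration measured after dilution relative to the
    reaction volume before the addition."""
    if volume_before <= 0 or volume_added < 0:
        raise ValueError("volumes must be positive")
    dilution_factor = (volume_before + volume_added) / volume_before
    return measured_conc * dilution_factor


# Hypothetical case: 10 µM oxidized sites measured after adding 25 µL
# to a 100 µL reaction corresponds to 12.5 µM in the undiluted reaction.
print(correct_for_dilution(10.0, 100.0, 25.0))
```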

S-29

Figure S14. Standard curves for C1-oxidized (A; GlcGlc1A) and C4-oxidized (B; Glc4GemGlc) dimers.

Cellobiose with a known concentration was oxidized by MtCDH to produce C1-oxidized dimer (cellobionic

acid; GlcGlc1A) and C4-oxidized dimer (Glc4GemGlc) was produced as described in (7). Panel C shows

HPAEC-PAD chromatograms of the two standards in panel A and B and soluble products generated from

PASC by ScLPMO10C-WT, MaLPMO10B-WT or MaLPMO10B-Y116F after hydrolysis with TfCel5A.

Endo-glucanase hydrolysis yielded mixtures of glucose (not visible), cellobiose (not visible) and oxidized

products with a degree of polymerization of 2 and 3 (i.e. GlcGlc1A, Glc2Glc1A, Glc4GemGlc and

Glc4GemGlc2). The ratio between DP2 and DP3 products was consistently 1:1; therefore, only the dimers were quantified

and their ratio was used as a measure of oxidative regioselectivity.
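
Quantification against a standard curve can be sketched as a linear least-squares fit of peak area against known concentration, followed by inversion of the fit for unknown samples. This assumes a linear detector response over the calibrated range; the function names and all numbers below are hypothetical and are not data from this study:

```python
def fit_line(conc, area):
    """Ordinary least-squares fit of area = slope * conc + intercept."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(area, slope, intercept):
    """Convert a peak area to a concentration using the fitted curve."""
    return (area - intercept) / slope


# Hypothetical standards: concentration (µM) versus peak area.
c1_fit = fit_line([0, 10, 20, 40], [1, 21, 41, 81])
c4_fit = fit_line([0, 10, 20, 40], [0, 5, 10, 20])

# Hypothetical dimer peak areas from one TfCel5A-treated sample.
c1_dimer = quantify(61.0, *c1_fit)
c4_dimer = quantify(7.5, *c4_fit)
print(c1_dimer, c4_dimer, c1_dimer / c4_dimer)
```

The final quotient corresponds to the C1:C4 dimer ratio used in the legend as the measure of oxidative regioselectivity.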

S-30

References

1. Kuipers, R. K., Joosten, H. J., Verwiel, E., Paans, S., Akerboom, J., van der Oost, J., Leferink, N.

G., van Berkel, W. J., Vriend, G., and Schaap, P. J. (2009) Correlated mutation analyses on super-

family alignments reveal functionally important residues. Proteins 76, 608-616

2. Lockless, S. W., and Ranganathan, R. (1999) Evolutionarily conserved pathways of energetic

connectivity in protein families. Science 286, 295-299

3. Fodor, A. A., and Aldrich, R. W. (2004) On evolutionary conservation of thermodynamic coupling

in proteins. J. Biol. Chem. 279, 19046-19050

4. Crooks, G. E., Hon, G., Chandonia, J. M., and Brenner, S. E. (2004) WebLogo: a sequence logo

generator. Genome Res. 14, 1188-1190

5. Bramucci, E., Paiardini, A., Bossa, F., and Pascarella, S. (2012) PyMod: sequence similarity

searches, multiple sequence-structure alignments, and homology modeling within PyMOL. BMC

Bioinformatics 13 (Suppl. 4), S2

6. Forsberg, Z., Nelson, C. E., Dalhus, B., Mekasha, S., Loose, J. S. M., Crouch, L. I., Røhr, A. K.,

Gardner, J. G., Eijsink, V. G. H., and Vaaje-Kolstad, G. (2016) Structural and functional analysis

of a lytic polysaccharide monooxygenase important for efficient utilization of chitin in Cellvibrio

japonicus. J. Biol. Chem. 291, 7300-7312

7. Müller, G., Varnai, A., Johansen, K. S., Eijsink, V. G. H., and Horn, S. J. (2015) Harnessing the

potential of LPMO-containing cellulase cocktails poses new demands on processing conditions.

Biotechnol. Biofuels 8, 187
