(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

23
© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 1 © 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 1 (Very) Loose Proposal to Revamp MPI_INIT and MPI_FINALIZE These are the kinds of crazy ideas that we discuss at the MPI Forum Jeffrey M. Squyres Cisco Systems 23 September 2015

Transcript of (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

Page 1: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 1© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 1

(Very) Loose Proposalto Revamp MPI_INIT and

MPI_FINALIZEThese are the kinds

of crazy ideasthat we discuss

at the MPI ForumJeffrey M. Squyres

Cisco Systems23 September 2015

Page 2: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 2

Before MPI-3.1, this could be erroneous

int my_thread1_main(void *context) { MPI_Initialized(&flag); // …}

int my_thread2_main(void *context) { MPI_Initialized(&flag); // …}

int main(int argc, char **argv) { MPI_Init_thread(…, MPI_THREAD_FUNNELED, …); pthread_create(…, my_thread1_main, NULL); pthread_create(…, my_thread2_main, NULL); // …}

These mightrun at the same time (!)

Page 3: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 3

The MPI-3.1 solution• MPI_INITIALIZED (and friends) are allowed to be called at any time

…even by multiple threads…regardless of MPI_THREAD_* level

• This is a simple, easy-to-explain solutionAnd probably what most applications do, anyway

• But many other paths were investigated

Page 4: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 4

MPI_INIT / FINALIZE limitations• Cannot call MPI_INIT more than once• Cannot set error behavior of MPI_INIT• Cannot re-initialize MPI after it has been finalized• Cannot init MPI from different entities within a process without a priori

knowledge / coordination

MPI Process// Library 1MPI_Initialized(&flag);if (!flag) MPI_Init(…);

// Library 2MPI_Initialized(&flag);if (!flag) MPI_Init(…);

Page 5: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 5

MPI_INIT / FINALIZE limitations• Cannot call MPI_INIT more than once• Cannot set error behavior of MPI_INIT• Cannot re-initialize MPI after it has been finalized• Cannot init MPI from different entities within a process without a priori

knowledge / coordination

MPI Process// Library 1MPI_Initialized(&flag);if (!flag) MPI_Init(…);

// Library 2MPI_Initialized(&flag);if (!flag) MPI_Init(…);

THIS IS INSUFFICIENT / POTENTIALLY ERRONEOUS

Page 6: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 6

1994 called.

They want their API design back.

Page 7: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 7

What we should have• Call MPI_INIT as many times as you like• By whomever wants to call it

MPI Process

// Library 3MPI_Init(…);

// Library 4MPI_Init(…);

// Library 5MPI_Init(…);

// Library 6MPI_Init(…);// Library 7

MPI_Init(…);

// Library 8MPI_Init(…);

// Library 9MPI_Init(…);

// Library 10MPI_Init(…);

// Library 11MPI_Init(…);

// Library 12MPI_Init(…);// Library 2

MPI_Init(…);// Library 1MPI_Init(…);

Page 8: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 8

…but that has its own complicationsDo you have to call MPI_FINALIZE exactly that many times?

Do you allow MPI_INIT after MPI_FINALIZE?

Or perhaps you only allow MPI_INIT before MPI has been finalized?

How can you tell if it’s safe to call MPI_INIT? Atomic “test-and-init”?

I IS CONFUSED

Page 9: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 9

We need something new

Page 10: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 10© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 10

The following are just (incomplete) crazy ideas

WARNING!

Page 11: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 11

New MPI concept: a session

int my_thread1_main(void *context) { MPI_Session session; MPI_Session_create(…, &session);

// Do MPI things

MPI_Session_free(&session);}

int my_thread2_main(void *context) { MPI_Session session; MPI_Session_create(…, &session);

// Do MPI things

MPI_Session_free(&session);}

int main(int argc, char **argv) { pthread_create(…, my_thread1_main, NULL); pthread_create(…, my_thread2_main, NULL); …}

Page 12: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 12

New MPI concept: a session

int my_thread1_main(void *context) { MPI_Session session; MPI_Session_create(…, &session);

// Do MPI things

MPI_Session_free(&session);}

int my_thread2_main(void *context) { MPI_Session session; MPI_Session_create(…, &session);

// Do MPI things

MPI_Session_free(&session);}

int main(int argc, char **argv) { pthread_create(…, my_thread1_main, NULL); pthread_create(…, my_thread2_main, NULL); …}

Now featuring

100% less MPI_INIT!

Page 13: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 13

Create communicators from sessionsint my_thread1_main(void *context) { MPI_Session session; MPI_Session_create(&session); MPI_Comm_create_from_session(session, &comm)

// Do MPI things with comm

MPI_Comm_free(&comm); MPI_Session_free(&session);}

int my_thread1_main(void *context) { MPI_Session session; MPI_Session_create(&session); MPI_Comm_create_from_session(session, &comm)

// Do MPI things with comm

MPI_Comm_free(&comm); MPI_Session_free(&session);}

Page 14: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 14

Problems that sessions solve

Each entity (library?) in an OS process can have its own session

Any session-local state can be encapsulated in the handle

Entities can create / destroy sessions at any time …in any thread

Page 15: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 15

…but what about MPI_COMM_WORLD?

Page 16: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 16

MPI_COMM_WORLD. Sigh.• When is MPI_COMM_WORLD created (and/or initialized)?• When is MPI_COMM_WORLD destroyed?• Can you use MPI_COMM_WORLD with any session?

There doesn’t seem to be an obvious relation between MCW and individual sessions (ditto for MPI_COMM_SELF)

Page 17: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 17

What if we get rid of MPI_COMM_WORLD?

Page 18: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 18

Problems that solves• Addresses logical inconsistency with session concept• Clean separation of communicators between sub-entities

…maybe slightly better than we have it today (sub-entities dup’ing COMM_WORLD)

• Side effects:Fault tolerance issues become easierOpens some possibilities for scalability improvements

Page 19: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 19

Problems that creates• Users will riot

…but what if they don’t?

Page 20: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 20

Open questions• What would be the forward / backward compatibility strategy?

E.g., deprecate INIT, FINALIZE, INITIALIZED, FINALIZED…?

• What are the other arguments to MPI_SESSION_CREATE?• Can you call both MPI_INIT and MPI_SESSION_CREATE in the same

process?• Can you do anything else with a session?

Page 21: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 21

Sooo… what happens next?

Page 22: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 22© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 22

Come to MPI Forum meetings

Discuss this and otherscintillating MPI topics

Page 23: (Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE

© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public 23

Thank you.