Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

Transcript of Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

Page 2: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

• Motivation
• Apex Processing Model
• Spark Processing Model
• Translation from Spark to Apex
• Parallelism in Apex
• I/O Performance Enhancement
• Roadmap

Page 7: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

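// Read the input, drop blank and comment lines, and parse each remaining line.
val parsed = sc.textFile(path, minPartitions)
  .map(_.trim)
  .filter(line => !(line.isEmpty || line.startsWith("#")))
  .map(training_record)

// Largest parsed value plus one.
val d = parsed.reduce(math.max) + 1

// Shift every record by d and bring the results back to the driver.
parsed.map(_ + d).collect()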
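The snippet above depends on a SparkContext and on `training_record`, which the slide does not define. As a rough illustration of what the pipeline computes, here is a minimal sketch on plain Scala collections. The object name `ParsedPipelineSketch` and the helper `trainingRecord` are hypothetical, and the sketch assumes each non-comment line holds a single integer; the parsing in the talk is probably more involved.

```scala
object ParsedPipelineSketch {
  // Hypothetical stand-in for the slide's `training_record`:
  // each non-comment line is assumed to hold a single integer.
  def trainingRecord(line: String): Int = line.toInt

  // Mirrors the slide's steps on an in-memory Seq instead of an RDD.
  // Throws if no records remain after filtering, because reduce needs
  // at least one element.
  def run(lines: Seq[String]): Seq[Int] = {
    val parsed = lines
      .map(_.trim)
      .filter(line => !(line.isEmpty || line.startsWith("#")))
      .map(trainingRecord)

    // Largest record plus one, as in `parsed.reduce(math.max) + 1`.
    val d = parsed.reduce((a, b) => math.max(a, b)) + 1

    // Shift every record by d, as in `parsed.map(_ + d)`.
    parsed.map(_ + d)
  }
}
```

For example, `ParsedPipelineSketch.run(Seq("  3 ", "# comment", "", "7", "1"))` keeps the records 3, 7 and 1, computes d = 8, and returns `Seq(11, 15, 9)`. On an RDD, the same chain is only evaluated when an action such as `reduce` or `collect()` is called.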

Page 8: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

val parsed = sc.textFile(path, minPartitions)
  .map(_.trim)
  .filter(line => !(line.isEmpty || line.startsWith("#")))
  .map(training_record)

[Diagram: Apex RDD labelled "parsed"]

Page 9: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

val d = parsed.reduce(math.max) + 1

val d = nParsed

Apex RDD

Page 10: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

parsed.map(_ + d).collect()

Parsed (ApexRDD)

Page 11: Lightning Talks & Integrations Track - Running Apache Spark Libraries on Apache Apex @ ABDW17, Pune

[Diagram: four Map boxes and two Reduce boxes]
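The slide's diagram survives only as labels, so its exact topology is unknown. One possible reading is a map stage split across four partitions whose outputs are combined by two reduce partitions. The sketch below simulates that reading sequentially on plain Scala collections. It is not Apex or Spark code, and the name `PartitionedMapReduceSketch` and its parameters are hypothetical.

```scala
object PartitionedMapReduceSketch {
  // Splits `data` into up to `mapPartitions` chunks, applies `f` to each
  // chunk independently (the "Map" boxes), combines the mapped chunks in up
  // to `reducePartitions` groups by taking each group's maximum (the
  // "Reduce" boxes), and finally merges the partial maxima into one value.
  // Runs sequentially; it only illustrates how the work is divided.
  def run(data: Seq[Int], mapPartitions: Int, reducePartitions: Int)(f: Int => Int): Int = {
    require(data.nonEmpty && mapPartitions > 0 && reducePartitions > 0)

    val chunkSize = math.ceil(data.size.toDouble / mapPartitions).toInt
    val mapped: Vector[Seq[Int]] = data.grouped(chunkSize).map(_.map(f)).toVector

    val groupSize = math.ceil(mapped.size.toDouble / reducePartitions).toInt
    val partialMaxima: Vector[Int] =
      mapped.grouped(groupSize).map(_.flatten.reduce((a, b) => math.max(a, b))).toVector

    partialMaxima.reduce((a, b) => math.max(a, b))
  }
}
```

With `run(1 to 8, 4, 2)(_ * 2)`, the four map chunks are (2, 4), (6, 8), (10, 12) and (14, 16), the two reduce groups produce 8 and 16, and the final result is 16. Maximum is used as the combining function because it is associative and commutative, so the result does not depend on how the data is partitioned.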
