Spark SQL and DataFrames, Spark GraphX, Spark MLlib, Spark Streaming

Lightning-fast cluster computing

Chaining transformations

2  

SQL context

3  

Creating a SQL context

4  

DataFrames

5  

Creating DataFrames

6  

Creating a DataFrame from Hive

7  

Place your hive-site.xml, core-site.xml (for security configuration), and hdfs-site.xml (for HDFS configuration) files in Spark's conf/ directory.

Creating a DataFrame from MySQL

8  

Creating a DataFrame from MySQL

9  

Transforming and querying DataFrames

10  https://spark.apache.org/docs/1.6.2/api/python/pyspark.sql.html#

Working with data in a DataFrame

11  

Working with data in a DataFrame

12  

DataFrame queries

13  

DataFrame queries

14  

DataFrame queries

15  

Query DataFrame using columns

16  

Query DataFrame using columns

17  

SQL queries

18  

Saving DataFrames

19  

DataFrames and RDDs

20  

DataFrames and RDDs

21  

Working with Row objects

22  

Extracting data from rows

23  

Convert RDD to DataFrame

24  

ML and GraphX in Spark

25  

Common Spark use case

26  

Common Spark use case

27  

Spark examples

28  

Iterative algorithms in Spark: PageRank

29  

PageRank algorithm

30  

PageRank algorithm

31  

PageRank algorithm

32  

PageRank algorithm

33  

Neighbor contribution function

34  
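
The body of this slide did not survive in the transcript. As a rough sketch only, assuming the usual PageRank convention that a page splits its current rank evenly across its outgoing links, a neighbor contribution function might look like the following in plain Python (no Spark required). The name `compute_contribs` and the sample page names are hypothetical.

```python
def compute_contribs(neighbors, rank):
    """Return (neighbor, share) pairs for one source page.

    Assumption: the page's rank is divided evenly among its outgoing links.
    """
    neighbors = list(neighbors)
    if not neighbors:
        # A page with no outgoing links contributes nothing.
        return []
    share = rank / len(neighbors)
    return [(neighbor, share) for neighbor in neighbors]


# Hypothetical example: a page with rank 1.0 that links to two other pages.
contribs = compute_contribs(["page2", "page3"], 1.0)
```

In a Spark job, a function of this shape would typically be applied to each (links, rank) pair inside a `flatMap`, so that every contribution becomes its own record.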

Input data

35  

Pairs of page links

36  

Page links grouped by source page

37  

Persisting the link pair RDD

38  

Set initial ranks

39  

First iteration

40  

First iteration

41  

First iteration

42  

First iteration

43  

Second iteration

44  
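
The iteration slides lost their content in this transcript, so the following is only a plain-Python sketch of the steps the titles name: link pairs, links grouped by source page, initial ranks, and two iterations. It assumes a damping factor of 0.85 and an initial rank of 1.0 per page, neither of which is stated in the transcript, and every function name and the sample graph are hypothetical.

```python
from collections import defaultdict


def group_links(pairs):
    """Group (source, destination) pairs by source page."""
    links = defaultdict(list)
    for src, dst in pairs:
        links[src].append(dst)
    return dict(links)


def pagerank_step(links, ranks, damping=0.85):
    """Run one PageRank iteration and return the new ranks.

    Assumption: damping=0.85 is a conventional value, not taken from the slides.
    """
    totals = defaultdict(float)
    for page, neighbors in links.items():
        # Each page splits its current rank evenly among its neighbors.
        share = ranks[page] / len(neighbors)
        for neighbor in neighbors:
            totals[neighbor] += share
    # Pages that received no contributions keep the base rank (1 - damping).
    return {page: (1 - damping) + damping * totals.get(page, 0.0) for page in ranks}


# Hypothetical input: pairs of page links.
pairs = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
links = group_links(pairs)

# Set initial ranks: every page that appears in any pair starts at 1.0.
pages = {page for pair in pairs for page in pair}
ranks0 = {page: 1.0 for page in pages}

ranks1 = pagerank_step(links, ranks0)  # first iteration
ranks2 = pagerank_step(links, ranks1)  # second iteration
```

Because `links` is reused unchanged in every iteration, it is the natural candidate for persisting in a Spark version of this loop, while the ranks are rebuilt on each pass.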

Checkpointing

45  

Checkpointing

46  

GraphX in Spark

47  

Examples in GraphX

48  

MLlib in Spark

49  

https://spark.apache.org/docs/2.0.2/ml-guide.html

What is MLlib?

50  

Why MLlib?

51  

https://docs.databricks.com/spark/latest/mllib/decision-trees.html

Spark streaming

52  http://spark.apache.org/docs/latest/streaming-programming-guide.html