Hsiang hung
TIP MAX Hsiang-Hsuan Hung
Transcript of Hsiang hung
Pipeline
Flask
Batch process
Problem: the raw data is not ordered by time, and it is large: 220 GB, with 13 billion events
Challenges
• Connector between Cassandra and Spark
• Design primary keys for data queries
• Cleaning data
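
The key-design and cleaning challenges above can be illustrated with a small pure-Python sketch. It is not the pipeline from the slides: it only mimics, in memory, how a Cassandra-style primary key made of a partition key plus a time-based clustering column would return events in time order even when the raw input is unordered. The field names (`user_id`, `ts`, `action`), the per-day bucketing, and the cleaning rule are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def clean(raw_events):
    """Yield only events that have a user id and a non-negative integer timestamp.

    The validity rule is an assumption; real cleaning depends on the data.
    """
    for ev in raw_events:
        user, ts = ev.get("user_id"), ev.get("ts")
        if not user or not isinstance(ts, int) or ts < 0:
            continue
        yield ev


def partition_key(ev):
    """Hypothetical partition key: (user id, UTC day of the event)."""
    day = datetime.fromtimestamp(ev["ts"], tz=timezone.utc).strftime("%Y-%m-%d")
    return (ev["user_id"], day)


def build_table(raw_events):
    """Group cleaned events by partition key and order each group by timestamp.

    The timestamp plays the role of a clustering column, so rows inside a
    partition come back sorted by time regardless of arrival order.
    """
    table = defaultdict(list)
    for ev in clean(raw_events):
        table[partition_key(ev)].append((ev["ts"], ev.get("action")))
    for rows in table.values():
        rows.sort(key=lambda row: row[0])
    return dict(table)


if __name__ == "__main__":
    raw = [
        {"user_id": "u1", "ts": 86450, "action": "b"},
        {"user_id": "u1", "ts": 86410, "action": "a"},
        {"user_id": None, "ts": 5, "action": "x"},  # dropped: no user id
    ]
    for key, rows in build_table(raw).items():
        print(key, rows)
```

Bucketing by day keeps any single partition from growing without bound, which matters at the scale of billions of events; the right bucket size would depend on the actual query pattern.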
About Me
• UCSD, Physics PhD 2011
• U Illinois, ECE 2011-2012
• U Texas Austin, Physics 2012-2015
• Computational materials science: HPC, e.g. quantum Monte Carlo…
• Programming, travel, fitness…