Hw09 Hadoop + Clojure
![Page 1: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/1.jpg)
Hadoop + Clojure
Hadoop World NYC
Friday, October 2, 2009
Stuart Sierra, AltLaw.org
![Page 2: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/2.jpg)
JVM Languages

| | Native to the JVM | Ported to the JVM |
|---|---|---|
| Object-oriented | Groovy | JRuby, Jython, Rhino |
| Object-oriented and functional | Scala | |
| Functional | Clojure | Armed Bear CL, Kawa |

Java is dead, long live the JVM
![Page 3: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/3.jpg)
Clojure
● A new Lisp, neither Common Lisp nor Scheme
● Dynamic, functional
● Immutability and concurrency
● Hosted on the JVM
● Open source (Eclipse Public License)
![Page 4: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/4.jpg)
Clojure Primitive Types

| Type | Example |
|---|---|
| String | "Hello, World!\n" |
| Integer | 42 |
| Double | 2.0e64 |
| BigInteger | 9223372036854775808 |
| BigDecimal | 1.0M |
| Ratio | 3/4 |
| Boolean | true, false |
| Symbol | foo |
| Keyword | :foo |
| null | nil |
![Page 5: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/5.jpg)
Clojure Collections

| Collection | Example |
|---|---|
| List | (print :hello "NYC") |
| Vector | [:eat "Pie" 3.14159] |
| Map | {:lisp 1 "The Rest" 0} |
| Set | #{2 1 3 5 "Eureka"} |

Homoiconicity
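
As a small illustration of homoiconicity (a sketch, not taken from the slides): Clojure code is written in the same list, vector, and map literals shown above, so a program can build a form as ordinary data and then evaluate it. The name `form` is made up for this example.

```clojure
;; Build a piece of code as a plain list: the symbol + followed by three numbers.
(def form (list '+ 1 2 3))

;; The form is ordinary data, so the usual sequence functions work on it.
(def head (first form))    ; the symbol +
(def size (count form))    ; number of elements in the list

;; Handing the same list to eval runs it as code.
(def result (eval form))
```

This is the property that macros such as `with-open` rely on: they receive code as lists and return new lists.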
![Page 6: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/6.jpg)
    (defn greet [name]
      (println "Hello," name))

    (greet "New York")
    ;; prints: Hello, New York

    public void greet(String name) {
        System.out.println("Hi, " + name);
    }

    greet("New York");
    // prints: Hi, New York
![Page 7: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/7.jpg)
    (defn average [& nums]
      (/ (reduce + nums) (count nums)))

    (average 1 2 3 4)
    ;; => 5/2

    public double average(double[] nums) {
        double total = 0;
        for (int i = 0; i < nums.length; i++) {
            total += nums[i];
        }
        return total / nums.length;
    }
![Page 8: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/8.jpg)
    (def m {:f "foo" :b "bar"})
    (m :f)   ;; => "foo"
    (:b m)   ;; => "bar"

    (def s #{1 5 3})
    (s 3)    ;; => 3   (the matching element, which is truthy)
    (s 7)    ;; => nil (no match, which is falsey)

Data Structures as Functions
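
A short sketch of why this matters in practice: because sets, keywords, and maps can be called like functions, they can be passed straight to `filter` and `map`. The sample data and the names `kept`, `names`, and `fallback` are invented for this example.

```clojure
(def m {:f "foo" :b "bar"})
(def s #{1 5 3})

;; A set used as a predicate: keep only the numbers that are members of s.
(def kept (filter s [1 2 3 4 5]))

;; A keyword used as a function: pull one field out of each map.
(def names (map :name [{:name "map"} {:name "reduce"}]))

;; A map called with a second argument returns that value when the key is missing.
(def fallback (m :z "none"))
```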
![Page 9: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/9.jpg)
    (import '(com.example.package MyClass YourClass))
    (. object method arguments)
    (new MyClass arguments)

Syntactic Sugar

    (.method object arguments)
    (MyClass. arguments)
    (MyClass/staticMethod)
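
The interop forms above use placeholder class names. As a concrete sketch, here are the same forms applied to real JDK classes; `java.util.StringTokenizer` is chosen because the word-count examples later in the deck use it. The var names are made up for this example.

```clojure
(import '(java.util StringTokenizer))

;; Constructor sugar: the same as (new StringTokenizer "to be or not").
(def tokenizer (StringTokenizer. "to be or not"))

;; Instance method sugar: the same as (. tokenizer countTokens).
(def token-count (.countTokens tokenizer))

;; Static method call on a Java class.
(def parsed (Integer/parseInt "42"))

;; A StringTokenizer is a java.util.Enumeration, so enumeration-seq
;; can turn it into a Clojure sequence of words.
(def words (doall (enumeration-seq (StringTokenizer. "to be or not"))))
```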
![Page 10: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/10.jpg)
    ...open a stream...
    try {
        ...do stuff with the stream...
    } finally {
        stream.close();
    }

    (defmacro with-open [args & body]
      `(let ~args
         (try ~@body
              (finally (.close ~(first args))))))

    (with-open [stream (...open a stream...)]
      ...do stuff with stream...)
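
The macro on the slide is a simplified version for teaching; `clojure.core` already ships a `with-open` that can be used directly. A minimal usage sketch follows, reading from an in-memory string so that it needs no file. The function name `first-line` is hypothetical.

```clojure
(import '(java.io BufferedReader StringReader))

;; Read the first line of some text. The reader is closed when the body
;; finishes, whether it returns normally or throws.
(defn first-line [text]
  (with-open [rdr (BufferedReader. (StringReader. text))]
    (.readLine rdr)))

(def line (first-line "hello\nworld"))
```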
![Page 11: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/11.jpg)
| | synchronous | asynchronous |
|---|---|---|
| coordinated | ref | |
| independent | atom | agent |
| unshared | var | |
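
One way to read the grid is to see each reference type in use. The sketch below is illustrative only: the names and values are invented, and it uses current Clojure syntax, where a var must be marked `^:dynamic` before `binding` can rebind it per thread (in 2009 this marker was not required).

```clojure
;; ref: coordinated and synchronous. Both changes commit together in one transaction.
(def checking (ref 100))
(def savings  (ref 0))
(dosync
  (alter checking - 25)
  (alter savings + 25))

;; atom: independent and synchronous. swap! applies the function and returns the new value.
(def counter (atom 0))
(swap! counter inc)

;; agent: independent and asynchronous. send returns at once; await blocks until the action has run.
(def events (agent []))
(send events conj "started")
(await events)

;; var: unshared. binding gives the current thread its own value.
(def ^:dynamic *level* :info)
(def inside (binding [*level* :debug] *level*))

;; In a standalone script, stop the agent thread pool so the JVM can exit.
(shutdown-agents)
```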
![Page 12: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/12.jpg)
| Call | Returns |
|---|---|
| mapper(key, value) | list of key-value pairs |
| reducer(key, values) | list of key-value pairs |
| (map function values) | list of values |
| (reduce function values) | single value |
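
For comparison, here is what the two functional operations look like at a REPL. This is only a sketch of the built-in `map` and `reduce`; the input numbers are arbitrary.

```clojure
;; map applies a function to every element and yields a sequence of values.
(def incremented (map inc [1 2 3]))

;; reduce folds a sequence down to a single value.
(def total (reduce + [1 2 3 4]))
```

Hadoop's mapper and reducer differ from these in that each call may emit any number of key-value pairs.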
![Page 13: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/13.jpg)
    public static class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }
![Page 14: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/14.jpg)
| Call | Returns |
|---|---|
| (mapper key value) | list of key-value pairs |
| (reducer key values) | list of key-value pairs |
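
To make the two signatures concrete, the sketch below runs a word count entirely in memory with plain Clojure sequences. It does not use Hadoop or the clojure-hadoop library; `mapper`, `reducer`, and `run-local` are hypothetical names, and `group-by` stands in for Hadoop's shuffle phase.

```clojure
(require '[clojure.string :as str])

;; (mapper key value): returns a list of [word 1] pairs for one line of text.
(defn mapper [_key line]
  (map (fn [word] [word 1]) (str/split line #"\s+")))

;; (reducer key values): returns a list holding one [word total] pair.
(defn reducer [word counts]
  [[word (reduce + counts)]])

;; Simulate a job: map every line, group the pairs by key, then reduce each group.
(defn run-local [lines]
  (->> lines
       (map-indexed mapper)          ; the key is the line index
       (apply concat)                ; flatten into one list of pairs
       (group-by first)              ; stand-in for the shuffle phase
       (mapcat (fn [[word pairs]]
                 (reducer word (map second pairs))))
       (into {})))

(def counts (run-local ["a b a" "b c"]))
```

Both functions return lists of pairs rather than a single value, which is what lets the framework chain their outputs.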
![Page 15: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/15.jpg)
Clojure-Hadoop 1

    (defn mapper-map [this key val out reporter]
      (doseq [word (enumeration-seq (StringTokenizer. (str val)))]
        (.collect out (Text. word) (IntWritable. 1))))

    (defn reducer-reduce [this key vals out reporter]
      (let [sum (reduce + (map (fn [w] (.get w)) (iterator-seq vals)))]
        (.collect out key (IntWritable. sum))))

    (gen-job-classes)
![Page 16: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/16.jpg)
Clojure-Hadoop 2

    (defn my-map [key value]
      (map (fn [token] [token 1])
           (enumeration-seq (StringTokenizer. value))))

    (def mapper-map (wrap-map my-map int-string-map-reader))

    (defn my-reduce [key values]
      [[key (reduce + values)]])

    (def reducer-reduce (wrap-reduce my-reduce))

    (gen-job-classes)
![Page 17: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/17.jpg)
Clojure print/read
DATA --print--> STRING --read--> DATA
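
The print/read round trip can be tried with two core functions, `pr-str` and `read-string`. This is a sketch with made-up sample data; for text from an untrusted source, `clojure.edn/read-string` is the safer reader.

```clojure
(def data {:word "hadoop" :count 3 :tags ["jvm" "lisp"]})

;; print: turn a data structure into a string of Clojure syntax.
(def text (pr-str data))

;; read: parse that string back into an equal data structure.
(def round-tripped (read-string text))
```

Because the printed form is plain text, it can be stored in text files or passed between map and reduce steps.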
![Page 18: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/18.jpg)
Clojure-Hadoop 3

    (defn my-map [key val]
      (map (fn [token] [token 1])
           (enumeration-seq (StringTokenizer. val))))

    (defn my-reduce [key values]
      [[key (reduce + values)]])

    (defjob job
      :map my-map
      :map-reader int-string-map-reader
      :reduce my-reduce
      :inputformat :text)
![Page 19: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/19.jpg)
    public static class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }
![Page 20: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/20.jpg)
Clojure-Hadoop 3

    (defn my-map [key val]
      (map (fn [token] [token 1])
           (enumeration-seq (StringTokenizer. val))))

    (defn my-reduce [key values]
      [[key (reduce + values)]])

    (defjob job
      :map my-map
      :map-reader int-string-map-reader
      :reduce my-reduce
      :inputformat :text)
![Page 21: Hw09 Hadoop + Clojure](https://reader034.fdocuments.net/reader034/viewer/2022042615/55d4fc12bb61eb95528b460d/html5/thumbnails/21.jpg)
More
● http://clojure.org/
● Google Groups: Clojure
● #clojure on irc.freenode.net
● http://richhickey.github.com/clojure-contrib
● http://stuartsierra.com/
● http://github.com/stuartsierra
● http://www.altlaw.org/