Scala user-group-19.03.2014

Collections in Clojure

Jan Herich

2014-03-19 Wed

Outline

1 Basic Clojure collection types

2 Persistent characteristics of Clojure collections

3 Sequence abstraction and laziness

4 Reducers - better performance and parallelism

Basic Clojure collection types

Lists

- A list is implemented as an ordinary singly linked list.
- Lists are special because they are used to compose Clojure programs.
- Unquoted lists are treated as function calls by the Clojure environment.

    ;; list literal representation
    '(1 2 :id (3 4) "name")
    ;; unquoted list interpreted
    ;; as function call
    (= (+ 1 2) 3)
    ;; get the first element
    (= (peek '(1 2 3)) 1)
    ;; new list from old one
    (= (pop '(1 2 3)) '(2 3))
    (= (conj '(3 2 1) 4)
       '(4 3 2 1))
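The point that lists are used to compose programs can be illustrated with a small sketch. It assumes a standard Clojure REPL; `form` is a hypothetical name and `eval` is a core function that the slide itself does not mention.

```clojure
;; A quoted list is plain data: a list holding a symbol and two numbers.
(def form '(+ 1 2))

(list? form)    ; => true
(first form)    ; => +   (a symbol, not yet a function call)
(count form)    ; => 3

;; Evaluating the very same list treats it as a function call.
(eval form)     ; => 3
```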

Sets

- Sets are collections of unique elements.
- Like every collection in Clojure, sets can be heterogeneous.
- Fast membership test.

    ;; set literal representation
    #{1 :id :type "name"}
    ;; testing membership
    (= true (contains? #{1 2} 2))
    ;; new set from old one
    (= (disj #{1 2 3} 2) #{1 3})
    (= (conj #{1 3} 2) #{1 2 3})
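A few more set operations, as a hedged sketch: `ids` is a hypothetical name, and `clojure.set` is the standard namespace for set algebra (not shown on the slide).

```clojure
(require '[clojure.set :as set])

(def ids #{1 2 3})

;; A set can be called like a function: it returns the element or nil.
(ids 2)                        ; => 2
(ids 42)                       ; => nil

;; Adding an element that is already present leaves the set unchanged.
(= (conj ids 2) ids)           ; => true

;; clojure.set offers the usual set algebra.
(set/union ids #{3 4})         ; => #{1 2 3 4}
(set/intersection ids #{3 4})  ; => #{3}
```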

Maps

- A map is the basic construct for holding structured information.
- The default implementation uses the well-known hash-map mechanism.
- Fast look-up.

    ;; map literal representation
    {:id 1 :name "John"}
    ;; Optional comma delimiters
    {:id 1, :name "John"}
    ;; lookup
    (= (get {:id 1 :name "John"} :id)
       1)
    ;; new map from old one
    (= (dissoc {:id 1 :name "John"} :name)
       {:id 1})
    (= (assoc {:id 1} :name "John")
       {:id 1 :name "John"})
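Some further look-up idioms, sketched with the hypothetical names `person` and `renamed`. Everything used here is part of clojure.core.

```clojure
(def person {:id 1 :name "John"})

;; get accepts a default value for missing keys.
(get person :email "n/a")   ; => "n/a"

;; Keywords and maps both act as look-up functions.
(:name person)              ; => "John"
(person :id)                ; => 1

;; Updates return new maps; the original is untouched.
(def renamed (assoc person :name "Jane"))
(:name person)              ; => "John"
(:name renamed)             ; => "Jane"
```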

Vectors

- A vector is the right structure for ordered data where random look-up is necessary.
- Fast look-up by index.
- Maintains ordering of elements.

    ;; vector literal representation
    [1 2 3 4 5]
    ;; lookup by zero based index
    (= (get [1 2 3] 2) 3)
    ;; new vector from old one
    (= (subvec [1 2 3 4 5] 2)
       [3 4 5])
    (= (conj [1 2 3] 4)
       [1 2 3 4])
    (= (assoc [1 3] 0 2) [2 3])
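One detail worth seeing side by side is that conj, peek and pop work at a different end for lists and vectors. A short sketch using only clojure.core (the names `v`, `l` and `scores` are hypothetical):

```clojure
(def v [1 2 3])
(def l '(1 2 3))

;; conj adds where it is cheapest for each collection type.
(conj v 4)          ; => [1 2 3 4]   end of a vector
(conj l 4)          ; => (4 1 2 3)   front of a list

;; peek and pop follow the same rule.
(peek v)            ; => 3
(pop v)             ; => [1 2]
(peek l)            ; => 1
(pop l)             ; => (2 3)

;; Index look-up: nth, or the vector itself called as a function.
(def scores [10 20 30])
(nth scores 1)      ; => 20
(scores 1)          ; => 20
```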

Persistent characteristics of Clojure collections

Non-destructive updates

- All Clojure persistent collections support functional, non-destructive updates instead of in-place mutation of data.
- To guarantee that updates with such semantics are fast and memory efficient, simple defensive copying obviously won't work.
- Luckily, there is a technique called structural sharing which can help us.

Example of structural sharing

[Figure: the collection before the update and after the update]
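Structural sharing is easiest to observe on a list. This is a minimal sketch, with hypothetical names `old-list` and `new-list`. Vectors and maps share internal tree nodes in a comparable way, but that is not directly visible through the public API, so it is not shown here.

```clojure
(def old-list '(2 3))
(def new-list (conj old-list 1))   ; puts a new head in front of old-list

old-list    ; => (2 3)    the original is unchanged
new-list    ; => (1 2 3)

;; The tail of the new list is the very same object as the old list:
;; nothing was copied.
(identical? (rest new-list) old-list)   ; => true
```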

Sequence abstraction and laziness

Sequence as a powerful abstraction for collections

- A sequence is a logical list: a persistent and immutable view of the collection.
- All core Clojure collections provide sequence implementations.
- Most core Clojure transformation functions for manipulating collections, like filter or map, are defined in terms of sequences.
- This is very handy when composing collection transformations.

Sequences explained

You can call seq on any Clojure collection, which yields a sequence implementation appropriate to the collection. This implementation provides the following basic guarantees (which are defined in terms of the ISeq interface under the hood):

    ;; Returns the first item in the collection. Calls seq
    ;; on its argument. If coll is nil, returns nil
    (first coll)
    ;; Returns a sequence of the items after the first.
    ;; Calls seq on its argument. If there are no more items,
    ;; returns a logical sequence for which seq returns nil
    (rest coll)
    ;; Returns a new seq where item is the first element
    ;; and seq is the rest
    (cons item seq)
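The guarantees above, tried on concrete collections. This is an illustrative sketch using only clojure.core.

```clojure
;; seq yields a view suited to each collection type.
(seq [1 2 3])      ; => (1 2 3)
(seq #{:a})        ; => (:a)
(seq {:id 1})      ; => ([:id 1])   a map yields key/value entries
(seq "abc")        ; => (\a \b \c)  strings are seqable too
(seq [])           ; => nil         an empty collection has no seq

;; first, rest and cons call seq on their collection argument.
(first {:id 1})    ; => [:id 1]
(first nil)        ; => nil
(rest [1 2 3])     ; => (2 3)
(rest [])          ; => ()          and (seq ()) is nil
(cons 0 [1 2 3])   ; => (0 1 2 3)
```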

How Clojure leverages sequences

As already mentioned, many Clojure functions are defined in terms of sequences. For example, have a look at this greatly simplified map implementation:

    (defn map [f coll]
      (when-let [s (seq coll)]
        (cons (f (first s)) (map f (rest s)))))

This enables the map function to operate on any collection which satisfies the sequence interface, because map calls seq on its second (coll) argument. Notice that map returns a sequence as well, with the consequence that functions operating on sequences can be easily composed together.
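To try the simplified definition without shadowing clojure.core/map, the sketch below uses the hypothetical name `simple-map`. Unlike the real map, it is eager and recursive, so it is only suitable for small inputs.

```clojure
;; Same body as the simplified map, under a hypothetical name.
(defn simple-map [f coll]
  (when-let [s (seq coll)]
    (cons (f (first s)) (simple-map f (rest s)))))

;; Works on anything seq accepts.
(simple-map inc [1 2 3])     ; => (2 3 4)
(simple-map inc '(1 2 3))    ; => (2 3 4)
(simple-map key {:id 1})     ; => (:id)
(simple-map inc [])          ; => nil  (clojure.core/map returns () here)

;; The result is a sequence, so calls compose.
(simple-map str (simple-map inc [1 2 3]))   ; => ("2" "3" "4")
```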

Composing collection transformations

    ;; filter countries, calculate densities and sort them
    (->> '({:code "SK" :area 49035 :population 5415949}
           {:code "CZ" :area 78866 :population 10513209}
           {:code "AT" :area 83855 :population 8414638}
           {:code "HU" :area 93030 :population 9908798})
         (filter (fn [country]
                   (> (get country :area) 80000)))
         (map (fn [country]
                (assoc country :density
                       (double (/ (get country :population)
                                  (get country :area))))))
         (sort-by (fn [country]
                    (get country :density))))
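The same pipeline can be written more compactly by using keywords as functions. This is only a stylistic sketch: `countries`, `density` and `codes-by-density` are hypothetical names, and the final `(map :code)` step is added here just to make the result easy to read.

```clojure
(def countries
  [{:code "SK" :area 49035 :population 5415949}
   {:code "CZ" :area 78866 :population 10513209}
   {:code "AT" :area 83855 :population 8414638}
   {:code "HU" :area 93030 :population 9908798}])

;; Hypothetical helper: inhabitants per unit of area.
(defn density [{:keys [population area]}]
  (double (/ population area)))

(def codes-by-density
  (->> countries
       (filter #(> (:area %) 80000))          ; keeps AT and HU
       (map #(assoc % :density (density %)))  ; adds :density to each map
       (sort-by :density)                     ; ascending
       (map :code)))

codes-by-density   ; => ("AT" "HU")
```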

Laziness

- As it turns out, it's very easy to express infinite sequences, just by defining some recursive relation between sequence elements.
- Clojure gives us many functions for infinite sequences, such as iterate:

    ;; infinite stream of ascending numbers from zero
    (iterate inc 0)
    ;; to avoid blocking the consuming thread, use take
    (take 10 (iterate inc 0))

- To be able to express such infinite sequences, we need a way to express laziness.
- In fact, most Clojure core functions (for example map) are defined as lazy, so they can consume and produce lazy sequences.
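Laziness can be made visible by counting how often the mapping function actually runs. A small sketch with the hypothetical names `calls`, `xs`, `realized-before` and `first-three`; it uses iterate as the source because sequences over vectors and ranges are realized in chunks, which would blur the count.

```clojure
(def calls (atom 0))

;; Mapping over an infinite sequence returns immediately.
(def xs (map (fn [x] (swap! calls inc) (* 3 x))
             (iterate inc 0)))

(def realized-before @calls)      ; => 0, nothing has been computed yet

;; Only the consumed elements are computed.
(def first-three (doall (take 3 xs)))

first-three                       ; => (0 3 6)
@calls                            ; => 3
```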

How to express laziness in Clojure

    ;; define the Fibonacci numbers as a lazy sequence
    ;; with the help of the lazy-seq macro
    (defn fib [a b]
      (cons a (lazy-seq (fib b (+ a b)))))
    ;; consume first ten numbers from sequence
    (take 10 (fib 0 1))
    ;; map is lazy as well
    (take 10 (map (fn [x] (* 3 x)) (fib 0 1)))
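For reference, the values the two expressions produce, plus a sketch of the same sequence built on iterate instead of lazy-seq. `fib-via-iterate` is a hypothetical name and this variant is not part of the original slides.

```clojure
(defn fib [a b]
  (cons a (lazy-seq (fib b (+ a b)))))

(take 10 (fib 0 1))
;; => (0 1 1 2 3 5 8 13 21 34)

(take 10 (map (fn [x] (* 3 x)) (fib 0 1)))
;; => (0 3 3 6 9 15 24 39 63 102)

;; Alternative sketch: iterate over [current next] pairs and keep the first.
(def fib-via-iterate
  (map first (iterate (fn [[a b]] [b (+ a b)]) [0 1])))

(take 10 fib-via-iterate)
;; => (0 1 1 2 3 5 8 13 21 34)
```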

Reducers - better performance and parallelism

Reducers, or another useful collection abstraction

Why another abstraction if we already have sequences?

1 Laziness is great when we need it, but we don't always need it
2 A sequence is fundamentally serial
3 Those two points are problems if we want a high-performing solution which can easily exploit parallelism

- Therefore, we need to find some new notion of collection, an even simpler one than the sequence abstraction.
- The new, minimalist notion of collection is something which is reducible.

How is reducible defined

It's important to understand the reduce function:

    ;; this is a simplified definition of reduce
    (defn reduce [f init coll]
      (if-let [s (seq coll)]
        (reduce f (f init (first s)) (rest s))
        init))

    ;; this is how we call reduce with reducing function
    (reduce (fn [accumulator item]
              (* accumulator item))
            1
            '(1 2 3 4 5 6 7))

A reducible is something which can reduce itself, and we are not interested in the actual mechanism.
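To make "something which can reduce itself" concrete, here is a hedged sketch of a collection that has no elements stored anywhere and still works with reduce. It implements the `CollReduce` protocol from `clojure.core.protocols`, which is what the real reduce dispatches to for such objects. `numbers-up-to` is a hypothetical name, and the sketch deliberately ignores early termination via `reduced`.

```clojure
(require '[clojure.core.protocols :as p])

;; Hypothetical example: the numbers 1..n as a reducible
;; with no backing sequence.
(defn numbers-up-to [n]
  (reify p/CollReduce
    ;; reduce without an initial value: start from (f)
    (coll-reduce [this f]
      (p/coll-reduce this f (f)))
    ;; reduce with an initial value: a plain loop, no seq involved
    (coll-reduce [_ f init]
      (loop [acc init, i 1]
        (if (<= i n)
          (recur (f acc i) (inc i))
          acc)))))

(reduce + 0 (numbers-up-to 5))   ; => 15
(reduce * 1 (numbers-up-to 5))   ; => 120
(reduce + (numbers-up-to 5))     ; => 15
```

The caller only sees reduce; how the collection walks its own elements stays hidden.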

Digging deeper into reducers

- Reducers are about transformation of reducing functions:

    ;; new simplified definition of map
    (defn mapping [f]
      (fn [f1]
        (fn [accumulator item]
          (f1 accumulator (f item)))))

- The reducers library offers alternatives to the sequence functions, defined similarly to mapping above: as higher-order functions which transform the reducing step to include the logic of mapping, filtering, etc.
- What's particularly nice is that those functions consist only of the core logic of their operations.
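A filtering counterpart can be written in the same style, and the two transformers compose with plain comp. This is a sketch in the spirit of the slide, not the library's actual source; `filtering` and `xform` are hypothetical names.

```clojure
(defn mapping [f]
  (fn [f1]
    (fn [accumulator item]
      (f1 accumulator (f item)))))

;; Same shape as mapping: skip items that fail the predicate.
(defn filtering [pred]
  (fn [f1]
    (fn [accumulator item]
      (if (pred item)
        (f1 accumulator item)
        accumulator))))

;; Keep even items first, then increment them, then sum.
(def xform (comp (filtering even?) (mapping inc)))

(reduce (xform +) 0 [1 2 3 4])                  ; => 8
;; Sequence-based equivalent, for comparison:
(reduce + 0 (map inc (filter even? [1 2 3 4]))) ; => 8
```

No intermediate collection is built: each item passes through the composed reducing function exactly once.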

Applying reducers

- If we keep the definition of mapping from the previous slide, our code would look a little strange:

    ;; our sequence based code
    (reduce + 0 (map (fn [x] (* x 3)) '(1 2 3)))
    ;; and equivalent reducers based code
    (reduce ((mapping (fn [x] (* x 3))) +) 0 '(1 2 3))

- Luckily, we are in LISP land, so the reducers library handles such details with the help of macros, and we work with functions which have the same shape as before:

    ;; require reducers library
    (require '[clojure.core.reducers :as r])
    ;; use it
    (reduce + 0 (r/map (fn [x] (* x 3)) '(1 2 3)))
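A short sketch of chaining several reducers and of collecting the result with into. It uses `r/map` and `r/filter` from `clojure.core.reducers`; `pipeline` is a hypothetical name.

```clojure
(require '[clojure.core.reducers :as r])

;; Composing reducers only builds a description of the work.
(def pipeline (r/filter even? (r/map inc [1 2 3 4])))

;; The work happens when the pipeline is reduced.
(reduce + 0 pipeline)              ; => 6   (2 + 4)

;; into is built on reduce, so it can collect a reducer's output.
(into [] pipeline)                 ; => [2 4]
(into [] (r/map inc [1 2 3]))      ; => [2 3 4]
```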

What we gain and what we lose

- Reducers are faster and more memory efficient than their sequence-based counterparts, especially when several transformations are chained (have a look at the slide "Composing collection transformations"), because no intermediate sequences are produced.
- This is because composing reducer functions merely creates a recipe for a future reduction; no work is done until reduce is called.
- We lose laziness in the process, so we can't write this expression with reducers anymore:

    (take 10 (r/map (fn [x] (* 3 x)) (fib 0 1)))

  (Clojure will throw an exception when the result is realized because, unlike the normal map, r/map doesn't return a sequence.)
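The failure, and one way around it, can be sketched as follows. The example uses a small finite vector rather than the infinite fib sequence to keep it simple; `tripled` and `outcome` are hypothetical names, and `r/take` is the reducers counterpart of take.

```clojure
(require '[clojure.core.reducers :as r])

(def tripled (r/map (fn [x] (* 3 x)) [1 2 3 4 5]))

;; A reducer is not a sequence, so sequence functions fail
;; once their result is realized.
(def outcome
  (try
    (doall (take 2 tripled))
    (catch Exception _ :not-a-sequence)))

outcome                          ; => :not-a-sequence

;; Staying inside the reducers world works.
(into [] (r/take 2 tripled))     ; => [3 6]
(reduce + 0 tripled)             ; => 45
```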

Enter parallelism

- With reducers, core collection operations are freed from laziness and representation, but we are stuck with the reduce function, which is serial as well.
- But we can parallelize reduction by using independent sub-reductions and combining their results.
- There is a function which does just that: fold.
- fold takes a combining function, a reducing function and a collection, and returns the result of combining the results of reducing sub-segments of the collection, potentially in parallel.

Fold example

    (require '[clojure.core.reducers :as r])
    ;; we use the same combine and reduce function
    (r/fold + + [1 2 3 4 5 6])
    ;; when this is the case, it's enough to supply
    ;; just the reducing function and fold will use it
    ;; to combine the sub-reductions
    (r/fold + [1 2 3 4 5 6])
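When the combining and reducing steps differ, the combining function must also supply the starting value of each sub-reduction when called with no arguments. The sketch below uses `r/monoid` from the reducers library for that; `numbers`, `total` and `evens` are hypothetical names.

```clojure
(require '[clojure.core.reducers :as r])

(def numbers (vec (range 2000)))

;; Same function for both roles, fed by a reducer.
(def total (r/fold + (r/map inc numbers)))
total                ; => 2001000   (the sum of 1..2000)

;; Different roles: conj reduces items into a vector,
;; into merges the partial vectors, (vector) seeds each part.
(def evens
  (r/fold (r/monoid into vector) conj (r/filter even? numbers)))

(take 5 evens)       ; => (0 2 4 6 8)
(count evens)        ; => 1000
```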

Conclusion

- fold will take advantage of collections which are amenable to parallel subdivision; ideal candidates are trees, such as Clojure vectors and maps.
- Parallel implementations of fold for those collections are based upon the Java Fork/Join framework.
- If the underlying collection is not suited for parallel subdivision (as is the case with sequences), fold just devolves into reduce.
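Whichever path fold takes, the result is the same. A small sketch with the hypothetical names `as-vector` and `as-sequence`; the optional first argument to `r/fold` is the partition size below which a segment is reduced serially (512 by default).

```clojure
(require '[clojure.core.reducers :as r])

(def as-vector   (vec (range 100000)))     ; a tree: can be split
(def as-sequence (map identity as-vector)) ; a lazy sequence: cannot

;; Vector: potentially parallel. Sequence: plain reduce.
(r/fold + as-vector)         ; => 4999950000
(r/fold + as-sequence)       ; => 4999950000

;; Explicit partition size, combining fn and reducing fn.
(r/fold 1024 + + as-vector)  ; => 4999950000
```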

The End

Thank you for your attention.
I hope this presentation sparked your interest in Clojure, in which case, visit www.clojure.org and learn more!
