Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Mathematics of Incidence, part 1: Getting started through Collaborative Filtering. Benjamin J. Keller, bjkeller.github.io. v.1, 26 September 2014. Creative Commons Attribution-ShareAlike 4.0 International License.

description

Incidence relations are a ubiquitous form of data, and this presentation begins an exploration of their analysis. Using a simple collaborative filtering scheme over data represented as a bipartite graph of users and the foods they like, the presentation is meant as an approachable introduction to the basic ideas of Formal Concept Analysis.

Transcript of Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Page 1: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Mathematics of Incidence, part 1: Getting started through Collaborative Filtering

Benjamin J. Keller, bjkeller.github.io

v.1, 26 September 2014

Creative Commons Attribution-ShareAlike 4.0 International License

[Figure: user a, users u1, u2, …, uk, foods s1, s2, …, sl, and food t]

Page 2: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Collaborative Filtering

[Figure: users Abby, Brian, Charles, and David; foods apples, bananas, cherries, doughnuts, and eggs]

Use likes of users to recommend foods to Abby

Recommend these three foods to Abby because each is liked by a user who has a liked food in common with her

Page 3: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Likes represented as bipartite graph

[Figure: bipartite graph joining users Abby, Brian, Charles, and David to foods apples, bananas, cherries, doughnuts, and eggs]

A bipartite graph (U, V, E) has its vertices in two disjoint sets U and V, and each edge (u, v) in E runs from a vertex u in U to a vertex v in V
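As a minimal sketch of this definition in Python, the likes can be stored as a mapping from each user to the set of foods they like. The transcript does not preserve the edges drawn on the slide, so the `likes` relation below is an assumption made up for illustration; only the user and food names come from the slides.

```python
# Hypothetical likes: the edges of the slide's graph are not recoverable
# from the transcript, so this relation is invented for illustration.
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

U = set(likes)                      # users
V = set().union(*likes.values())    # foods
E = {(u, v) for u, foods in likes.items() for v in foods}  # edges

# Bipartite: the vertex sets are disjoint and every edge runs from U to V.
assert U.isdisjoint(V)
assert all(u in U and v in V for u, v in E)
```

Storing the edges as a dictionary of sets keeps the "foods liked by a user" lookup direct, which is the operation the later slides rely on.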

Page 4: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a recommendation

Construct a recommendation by following edges in the graph:

[Figure: bipartite graph of users and foods]

Page 8: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a recommendation

Construct a recommendation by following edges in the graph:

[Figure: bipartite graph of users and foods]
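Following edges can be sketched as a short Python function: start at a user, step to the foods that user likes, step to the other users who like any of those foods, and collect the foods those users like that the starting user does not. The `likes` data and the `recommend` function name are assumptions for illustration, since the transcript does not preserve the slide's edges.

```python
# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def recommend(likes, user):
    """Follow edges: user -> liked foods -> other users -> their other foods."""
    mine = likes[user]
    recommended = set()
    for other, foods in likes.items():
        if other != user and mine & foods:   # the two users share a liked food
            recommended |= foods - mine      # keep only foods new to the user
    return recommended

print(sorted(recommend(likes, "Abby")))  # ['cherries', 'doughnuts', 'eggs']
```

With this made-up data Abby is also recommended three foods, but that agreement with the slide is by construction of the example and not a reconstruction of the original graph.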

Page 9: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Biclique – a special subgraph

[Figure: bipartite graph of users and foods]

A biclique (U, V, E) of a bipartite graph G is a subgraph of G in which each u in U has an edge (u, v) to every v in V
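The definition translates into a one-line check, sketched here in Python: a set of users and a set of foods form a biclique when every chosen user likes every chosen food. The `likes` data and the name `is_biclique` are assumptions for illustration.

```python
# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def is_biclique(likes, users, foods):
    """True when each user in `users` has an edge to every food in `foods`."""
    return all(foods <= likes[u] for u in users)

print(is_biclique(likes, {"Brian", "Charles"}, {"cherries", "doughnuts"}))  # True
print(is_biclique(likes, {"Abby", "Brian"}, {"apples", "bananas"}))         # False
```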

Page 10: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a biclique 1.1

[Figure: bipartite graph of users and foods]

Start with vertices for Brian and Charles

Page 11: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a biclique 1.2

[Figure: bipartite graph of users and foods]

Find all foods liked by both Brian and Charles

Page 12: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a biclique 2.1

[Figure: bipartite graph of users and foods]

Start with vertices for all foods liked by Brian and Charles

Page 13: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Constructing a biclique 2.2

[Figure: bipartite graph of users and foods]

Find all users who like all of the foods that Brian and Charles both like
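The two construction steps can be sketched in Python as a pair of lookups: first the foods common to the starting users, then every user who likes all of those foods. The `likes` data and the function names are assumptions for illustration; the slide's real edges are not in the transcript.

```python
# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def common_foods(likes, users):
    """Step 1: foods liked by every user in a nonempty set `users`."""
    return set.intersection(*(likes[u] for u in users))

def common_users(likes, foods):
    """Step 2: users who like every food in `foods`."""
    return {u for u, liked in likes.items() if foods <= liked}

def biclique_from_users(likes, seed):
    """Grow a biclique from a nonempty seed set of users."""
    foods = common_foods(likes, seed)
    users = common_users(likes, foods)
    return users, foods

users, foods = biclique_from_users(likes, {"Brian", "Charles"})
print(sorted(users), sorted(foods))
# ['Brian', 'Charles'] ['cherries', 'doughnuts']
```

Starting from David alone, the same function on this made-up data returns users Charles and David with foods doughnuts and eggs, because Charles likes everything David likes.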

Page 14: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Biclique

[Figure: bipartite graph of users and foods]

Page 15: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Try another…

[Figure: bipartite graph of users and foods]

Start with vertex for David

Page 16: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Which one?

[Figure: three candidate subgraphs of the users-foods graph]

Hint: it must be a biclique, and it should include all users who like everything David likes

Page 17: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

All Bicliques

[Figure: seven bicliques of the users-foods graph]
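One way to list bicliques of this kind is to repeat the construction above from every nonempty group of users and keep the distinct results. The Python sketch below does that by brute force; it assumes the slide means the bicliques produced by that construction, and it uses made-up `likes` data, so the number it finds need not match the slide's seven. It skips groups with no food in common.

```python
from itertools import combinations

# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def all_bicliques(likes):
    """Close every nonempty group of users; collect the distinct bicliques."""
    found = set()
    for size in range(1, len(likes) + 1):
        for seed in combinations(sorted(likes), size):
            foods = frozenset(set.intersection(*(likes[u] for u in seed)))
            if not foods:
                continue  # this group of users has no food in common
            users = frozenset(u for u in likes if foods <= likes[u])
            found.add((users, foods))
    return found

for users, foods in sorted(all_bicliques(likes), key=lambda b: sorted(b[0])):
    print(sorted(users), sorted(foods))
```

Trying every group of users takes exponential time in the number of users, which is fine for four users but is only meant as an illustration.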

Page 18: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Recommendations via bicliques

Construct same recommendation by composing bicliques

[Figure: two bicliques of the users-foods graph]

Page 19: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Recommendations via bicliques

Construct same recommendation by composing bicliques

[Figure: two bicliques of the users-foods graph, followed by a recommendation graph]

Page 20: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

All recommendations

[Figure: five recommendation graphs over the users and foods]

Page 21: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: two bicliques of the users-foods graph]

Notice something about these bicliques

Page 22: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: two bicliques of the users-foods graph]

Users of the first are a superset of the users of the second

Page 23: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: two bicliques of the users-foods graph]

Foods of the first are a subset of the foods of the second

Page 24: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Ordering bicliques by users

[Figure: four bicliques of the users-foods graph]

Page 25: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Ordering bicliques by foods

[Figure: four bicliques of the users-foods graph]

Both orderings are equivalent
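The equivalence can be checked mechanically on a small example. The Python sketch below compares every pair of bicliques both ways: by user sets (a ≤ b when the users of a are a subset of the users of b) and by food sets (a ≤ b when the foods of a are a superset of the foods of b). The `likes` data and the function names are assumptions for illustration, and the bicliques are the ones obtained by closing groups of users.

```python
from itertools import combinations

# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def closed_bicliques(likes):
    """Bicliques (users, foods) obtained by closing each nonempty user group."""
    found = set()
    for size in range(1, len(likes) + 1):
        for seed in combinations(sorted(likes), size):
            foods = frozenset(set.intersection(*(likes[u] for u in seed)))
            if foods:
                users = frozenset(u for u in likes if foods <= likes[u])
                found.add((users, foods))
    return found

def leq_by_users(a, b):
    return a[0] <= b[0]   # users of a are a subset of users of b

def leq_by_foods(a, b):
    return a[1] >= b[1]   # foods of a are a superset of foods of b

bicliques = closed_bicliques(likes)
equivalent = all(leq_by_users(a, b) == leq_by_foods(a, b)
                 for a in bicliques for b in bicliques)
print(equivalent)  # True
```

The two tests agree here because each of these bicliques already contains every user who likes all of its foods and every food liked by all of its users; for arbitrary, non-maximal bicliques the two orderings can disagree.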

Page 26: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Biclique order

Inherits properties from the subset order on sets:

Reflexive: a ≤ a, for all bicliques a

Antisymmetric: if a ≤ b and b ≤ a then a = b

Transitive: if a ≤ b and b ≤ c then a ≤ c

An order with these three properties is called a partial order

Page 27: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Covering relationship

[Figure: two bicliques of the users-foods graph]

The second biclique covers the first because no biclique is greater than the first and less than the second (i.e., writing a for the first and c for the second, there is no b such that a < b < c)
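The covering test can be sketched directly from this description: c covers a when a < c and no biclique b satisfies a < b < c. The Python below orders bicliques by their user sets and collects all covering pairs, which are the lines a Hasse diagram would draw. The `likes` data and the function names are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def closed_bicliques(likes):
    """Bicliques (users, foods) obtained by closing each nonempty user group."""
    found = set()
    for size in range(1, len(likes) + 1):
        for seed in combinations(sorted(likes), size):
            foods = frozenset(set.intersection(*(likes[u] for u in seed)))
            if foods:
                users = frozenset(u for u in likes if foods <= likes[u])
                found.add((users, foods))
    return found

def covering_pairs(bicliques):
    """Pairs (a, c) where c covers a in the order given by user sets."""
    pairs = set()
    for a in bicliques:
        for c in bicliques:
            strictly_below = a[0] < c[0]   # proper subset of users
            nothing_between = not any(a[0] < b[0] < c[0] for b in bicliques)
            if strictly_below and nothing_between:
                pairs.add((a, c))
    return pairs

pairs = covering_pairs(closed_bicliques(likes))
for a, c in sorted(pairs, key=lambda p: (sorted(p[0][0]), sorted(p[1][0]))):
    print(sorted(a[0]), "is covered by", sorted(c[0]))
```

On this made-up data the biclique with users Brian, Charles, and David is above the one with only Brian but does not cover it, because the biclique with users Brian and Charles lies between them.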

Page 28: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Drawing ordering

Hasse diagram

If we have

[Figure: two bicliques of the users-foods graph]

then draw

[Figure: the two bicliques as drawn in the Hasse diagram]

Page 29: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: six bicliques of the users-foods graph]

Page 30: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

We saw that the recommendation involves related bicliques, but what about the others?

[Figure: a recommendation graph alongside six bicliques of the users-foods graph]

Page 31: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: a recommendation graph alongside six bicliques of the users-foods graph]

Page 32: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: six bicliques of the users-foods graph alongside a recommendation graph]

The order is transitive

Page 33: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

[Figure: a recommendation graph alongside six bicliques of the users-foods graph]

Anecdotally, we can construct a recommendation from bicliques A and B where A > B

Page 34: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

It has to be – a "proof"

[Figure: user a, foods s1, s2, …, sl, users u1, u2, …, uk, and food t]

We can recommend food t to user a because there are foods s1, s2, …, sl liked by user a and by users u1, u2, …, uk, who also like food t

This means that there must be a biclique A involving at least users a and u1, u2, …, uk and exactly foods s1, s2, …, sl

And there is also a biclique B involving exactly users u1, u2, …, uk and at least foods s1, s2, …, sl and t

As the user set of A is a proper superset of the user set of B (a is in A but, not yet liking t, is not in B), it must be that A > B
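The argument can be traced on a small example. The Python sketch below takes a user a and a recommended food t, finds foods that a shares with some user who likes t, and builds A from the shared foods and B from the shared foods plus t. The `likes` data, the helper names, and the choice to use the first suitable intermediary user are assumptions for illustration, not part of the slides.

```python
# Hypothetical likes, invented for illustration (not the slide's actual edges).
likes = {
    "Abby": {"apples", "bananas"},
    "Brian": {"apples", "cherries", "doughnuts"},
    "Charles": {"bananas", "cherries", "doughnuts", "eggs"},
    "David": {"doughnuts", "eggs"},
}

def common_users(likes, foods):
    """Users who like every food in `foods`."""
    return {u for u, liked in likes.items() if foods <= liked}

def common_foods(likes, users):
    """Foods liked by every user in a nonempty set `users`."""
    return set.intersection(*(likes[u] for u in users))

def witness(likes, a, t):
    """Return bicliques (A, B) behind recommending t to a, or None."""
    for u in likes:
        shared = likes[a] & likes[u]             # foods a and u both like
        if u != a and shared and t in likes[u]:
            users_a = common_users(likes, shared)          # includes a and u
            users_b = common_users(likes, shared | {t})    # must also like t
            A = (users_a, common_foods(likes, users_a))
            B = (users_b, common_foods(likes, users_b))
            return A, B
    return None

A, B = witness(likes, "Abby", "eggs")
print(sorted(A[0]), sorted(A[1]))  # ['Abby', 'Charles'] ['bananas']
print(sorted(B[0]), sorted(B[1]))  # ['Charles'] ['bananas', 'cherries', 'doughnuts', 'eggs']
print(B[0] < A[0])                 # True: users of A properly contain users of B
```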

Page 35: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

Questions to ponder

• What is a "good" recommendation?

• Serendipity, who?

• And, what's the deal with Abby and doughnuts, anyway?

[Figure: six bicliques of the users-foods graph]

every biclique above has Abby

every biclique below has doughnuts

Page 36: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

About me and these slides

I am Ben(jamin) Keller. I learn and, sometimes, create through explaining. I had been involved in a big (US) federally funded project that had the goal of helping biomedical scientists tell stories about their experimental observations. The project is long gone, but I’m still trying to grok how such a thing would work. Much of biological data comes in the form of observations that are distilled to something that looks like an incidence relation, which brings us to this series of presentations. My goal for the slides is to deal with the mathematics of incidence in an approachable way, with intuitive beginnings that will eventually allow us to embrace the more complex material later.

Page 37: Mathematics of Incidence, part 1: Getting Started with Collaborative Filtering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.