Deep and Machine Learning in Sciences (transcript)
Bálint Pataki, István Csabai
ELTE Eötvös Loránd University
Dept. of Physics of Complex Systems
Data intensive sciences and machine learning group
2021.02.09, FIZ/3/089
History of (machine) intelligence / data science
Model World
OBSERVATION
Instruments: a crutch for the senses
Natural intelligence: 7±2 bits
Pollack, I. The information of elementary auditory displays. J. Acoust. Soc. Amer. 24, 745-749 (1952).
Miller, G. A. The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information. Psychological Review 63, 81-97 (1956).
Homo Sapiens: Technical Specifications
CPU: 100 GN (giga-neurons)
Clock frequency: 4-32 Hz
CPU cores: 1 (male version), 2+ (female v.)
CPU speed: 0.1 FLOPS (floating point op. / sec)
Memory (short term): 7±2 bits
Storage: 1 TB - 2.5 PB
Power: 20 W
Camera: 576 Mpix, 24 Hz
Touch: Yes
Display: No
Speakers: Mono
GPS: No
WIFI: No
Bluetooth: No
2G/3G/4G/5G: No/No/No/No
Latest version update: 100,000 BC
Main features:
• Find food
• Escape predators
• Kill enemies
• Find mate and reproduce
History of (machine) intelligence / data science
Model World
OBSERVATION
EXPERIMENT
Mathematical description
Instruments
Predictions
7±2 bit*
crutch for senses
crutch for mind
Virtual reality
F = G·m₁·m₂ / r²,  Λ = 0.7,  Ωm = 0.3
Initial values + "laws" (equations) → simulated reality
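As a toy illustration of how initial values plus the "laws" yield a simulated reality, here is a minimal sketch (not from the lecture; the masses, velocity and step size are arbitrary example values) that numerically integrates Newton's F = G·m₁·m₂ / r² for an Earth-Moon-like pair:

```python
# Minimal sketch: given the "law" (Newtonian gravity) and initial values,
# a computer can evolve a simulated reality step by step.
import numpy as np

G = 6.674e-11               # gravitational constant [m^3 kg^-1 s^-2]
m1, m2 = 5.97e24, 7.35e22   # example masses (Earth, Moon) [kg]
r = np.array([3.84e8, 0.0]) # relative position [m]
v = np.array([0.0, 1.02e3]) # relative velocity [m/s]
dt = 60.0                   # time step [s]

for _ in range(24 * 60):    # simulate one day with simple Euler steps
    a = -G * (m1 + m2) * r / np.linalg.norm(r)**3   # acceleration from F = G m1 m2 / r^2
    v += a * dt
    r += v * dt

print("relative position after one day [m]:", r)
```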
Science – technology – science – technology ...
Astronomy → Mechanics → Quantum mechanics → Solid state physics → Transistor / microelectronics → Electronics → Sensors → Data
Moore's law
New challenges in sciences
20th century → 21st century
manual observations → high-throughput instruments
simple equations → complex simulations
small data → big data (300 million galaxies, 2.5 terapixels; 3.2 gigabases, 37 trillion cells)
AI: Why inevitable? Why now?
Key challenges: the amount of data and the complexity of models
[Figure: data volumes (10k × 3.2 Gbp, 1M × 3k, 10k × 30k, 5 Gpx/image, 2.5 Tpx, 400M tx) vs. the 7±2 bits of human working memory]
Perceptron '57, Hopfield '82, Backprop '86
• more data (MNIST '98: 60k, CIFAR '10: 60k, IMAGENET '10: 14M)
• steadily improving models, deeper understanding of statistics/data/models
• more compute power: GPU!!
• V100 GPU: 100 TFlops
Image recognition progress
Idea: mimic the brain's
• flexibility
• self-organization
• self-learning
• generalization
• massive parallelism
• fault-tolerance
"Biologically inspired approach"
Using figures from: D. Kriesel – A Brief Introduction to Neural Networks (ZETA2-EN): http://www.dkriesel.com/en/science/neural_networks
Synapses: too complex and not well enough understood for direct implementation
The "black box" concept
Robot example
• Input: 8 sensors
• Output: stop or go
Goal: avoid obstacles
Learning => regression
There are several "machine learning" approaches; the neural net is just one of them
Will not cover:
• SVM, random forest, decision trees …
• Unsupervised nets
Perceptron: the artificial neuron
Learning is to find the optimal
set of “synaptic weights”
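A minimal sketch of a perceptron, assuming the simple 8-sensor robot setup above (the sensor readings, labels and learning rate are made-up illustrations): the output is a thresholded weighted sum, and learning adjusts the "synaptic weights".

```python
# Minimal perceptron sketch (illustrative, not the lecture's exact robot controller):
# output y = step(w . x + b); learning means adjusting the "synaptic weights" w, b.
import numpy as np

def perceptron(x, w, b):
    return 1 if np.dot(w, x) + b > 0 else 0   # 1 = "go", 0 = "stop"

# hypothetical training data: 8 sensor readings -> stop(0)/go(1)
X = np.array([[0, 0, 0, 0, 0, 0, 0, 0],       # no obstacle seen -> go
              [1, 1, 0, 0, 0, 0, 0, 0],       # obstacle on the left -> stop
              [0, 0, 0, 0, 0, 0, 1, 1]])      # obstacle on the right -> stop
y = np.array([1, 0, 0])

w, b, lr = np.zeros(8), 0.0, 0.1
for epoch in range(20):                       # classic perceptron learning rule
    for xi, yi in zip(X, y):
        err = yi - perceptron(xi, w, b)
        w += lr * err * xi
        b += lr * err

print([perceptron(xi, w, b) for xi in X])     # reproduces the labels [1, 0, 0]
```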
Brief and gappy history
The beginning
• 1943 McCulloch & Pitts: threshold based neurological switches
• 1949 Hebb: Hebb rule – connections strengthen between simultaneously active neurons
Golden age
• 1951 Minsky: adjustable weights
• 1957 Rosenblatt & Wightman: perceptron
• 1960 Widrow & Hoff: ADALINE, delta rule (adaptive) – echo filtering in telephones
• 1969 Minsky & Papert: problems with perceptron
Book: John von Neumann: The computer and the brain, Yale
University Press, 1959
Winter is coming …
The XOR problem
Not all classification problems are linearly separable, especially not the interesting ones
Minsky & Papert 1969 -> the “AI winter”
Other approaches: expert systems
Some "connectionist Eskimos" (researchers who kept working through the AI winter):
• 1976 Grossberg & Carpenter: Adaptive Resonance Theory (ART)
• 1982 Kohonen: Self-organizing feature maps (unsupervised)
The multilayer perceptron
Transforming the input representation, adding new dimensions may help
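A small sketch of the XOR problem under these assumptions (toy data and a hand-picked weight vector): no single linear threshold unit separates XOR, but adding a transformed dimension (here x1·x2, playing the role of a hidden unit) makes it linearly separable.

```python
# Sketch of the XOR problem (toy example): no single line separates the XOR
# labels in 2D, but adding the feature x1*x2 (a new dimension) makes the
# problem linearly separable -- the idea behind hidden layers.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                    # XOR labels

# augmented 3D input: (x1, x2, x1*x2)
X_aug = np.column_stack([X, X[:, 0] * X[:, 1]])
w, b = np.array([1.0, 1.0, -2.0]), -0.5       # hand-picked separating plane

pred = (X_aug @ w + b > 0).astype(int)
print(pred)                                   # [0 1 1 0] -> XOR solved in the new space
```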
Brief and gappy history
The renaissance
1982 Hopfield: Hopfield networks, associative memory ("spin glass"); 1985: solving the TSP
1986 Parallel Distributed Processing group (Rumelhart, Hinton, Williams): backpropagation of errors
Now: Mostly the same basic architectures, with various improvements : deep learning
AI: paradigm shift
Classical programming: Data + Model → Computer → Prediction
Machine learning: Data + Predictions (labeled examples) → Computer → Model
IF color=red AND profile=smooth THEN type:=tomato
IF color=red AND HAS(horns) THEN type:=cow
f( ) = “apple”
f( ) = “tomato”
f( ) = “cow”
Example: Image recognition
Method: hand-crafted features
Supervised learning
input → internal representation → prediction, compared with the label (e.g. apple, pear)
IF color=red AND profile=smooth THEN type:=tomato
IF color=red AND HAS(horns) THEN type:=cow
f( ) = “apple”
f( ) = “pear”
f( ) = “cow”
function regression
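A hedged sketch of supervised learning as function regression (the feature values for redness, roundness and size are invented for illustration, and scikit-learn's LogisticRegression merely stands in for whatever learner one prefers): instead of hand-written IF-THEN rules, a function f(features) → label is fitted from examples.

```python
# Sketch: fit f(features) -> label from examples instead of writing rules by hand.
# Feature values below are made-up illustrations (redness, roundness, size in m).
from sklearn.linear_model import LogisticRegression

X = [[0.9, 0.8, 0.1],    # apple
     [0.2, 0.6, 0.1],    # pear
     [0.9, 0.9, 0.05],   # tomato
     [0.4, 0.1, 2.0]]    # cow
labels = ["apple", "pear", "tomato", "cow"]

f = LogisticRegression(max_iter=1000).fit(X, labels)
print(f.predict([[0.85, 0.75, 0.1]]))   # hopefully "apple" for an apple-like input
```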
Supervised learning: neural net
Learning → loss function optimization
Loss = number of wrong categorizations
Images → points in an N-dimensional space
Learning = minimum search: follow the slope (the gradient) of the loss
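A minimal sketch of "learning = minimum search" (a smooth one-parameter toy loss stands in for the non-differentiable count of wrong categorizations): repeatedly stepping against the slope, i.e. gradient descent, finds the minimum.

```python
# Gradient descent on a toy loss: follow the slope downhill until the minimum.
def loss(w):
    return (w - 3.0) ** 2 + 1.0   # toy loss with its minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)        # its slope

w, lr = 0.0, 0.1                  # start far from the minimum
for step in range(100):
    w -= lr * grad(w)             # gradient descent update

print(w)                          # close to 3.0, where the loss is smallest
```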
Learning as energy minimization
Concepts from statistical physics
Spin-glasses ~ Hopfield model for associative memory (1982), quantum computers
Boltzmann machines, simulated annealing, …
Renormalization? (Lin, Tegmark, Rolnick, J. Stat. Phys. 2017: Why Does Deep and Cheap Learning Work So Well?)
Magnetic systems, Ising model, spin glass: Hopfield net
Ferromagnet vs. anti-ferromagnet
Energy, dynamics, training
Attractors: an energy surface with (local) minima
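A toy Hopfield-style sketch of these ideas (the stored pattern and update schedule are illustrative, not from the lecture): Hebbian weights define an energy E = -1/2 Σ wᵢⱼ sᵢ sⱼ, and the dynamics settle into an attractor that recalls the stored memory.

```python
# Hopfield-style associative memory (toy version): Hebbian weights store a
# pattern; asynchronous updates lower the energy until the state reaches an attractor.
import numpy as np

pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])   # stored memory (spins +-1)
W = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(W, 0.0)                           # Hebbian rule, no self-coupling

def energy(s):
    return -0.5 * s @ W @ s

s = pattern.copy()
s[:3] *= -1                                        # corrupt the memory
for _ in range(5):                                 # asynchronous dynamics
    for i in range(len(s)):
        s[i] = 1 if W[i] @ s >= 0 else -1

print(energy(s), np.array_equal(s, pattern))       # low energy, pattern recalled
```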
Challenges
• Proper, big enough training set
• Representation of data (images, words, … → vector space)
• Nonlinear optimization
• Model complexity: accuracy vs. generalization
• "Black box", trust
• …
Typical network: 2M adjustable parameters
Ribli et al. Nature Astro. 2018, MNRAS 2019
Mutations -> antibiotic resistance
Mobile sensors -> Parkinson's disease
Mosquito images -> vector-borne diseases
Medical imaging -> breast cancer
Weak lensing map -> cosmology parameters
Explainable AI
Control of aging related methylation networks
Pathology images
Quantum neural computing
MSc, PhD courses
AI Research, Education and Applications @ Eötvös University Dept. of Physics of Complex Systems
Solving analytically intractable, hard inverse problems
Matamoros et al., Pataki et al. 2020.
Pataki @DREAM, Laki et al. 2016
Pataki et al. in prep.
Ribli et al. @DREAM, Sci. Rep. 2018
Ribli et al. in prep., patent subm. 2019
SOTE TKP collab.
http://datascience.elte.hu
Space weather: whistler detection
Gene sequencing: nanopore
MosquitoAlert image deep learning
False(?) negatives and false(?) positives: example images labelled "Tiger" vs. "NOT Tiger" (tiger mosquito)
Pataki et al. Sci. Rep. 2021.
F. Bartumeus et al.
http://www.mosquitoalert.com/
"Zika, dengue, chikungunya, and yellow fever are all transmitted to humans by Ae. aegypti and Ae. albopictus."
Deep learning for colorectal cancer pathology
Collaboration SOTE-ELTE
Pencz Bianka
Pesti Adrián
Ribiánszky Annamária
Schaff Zsuzsa
Sághi Márton
Takács Dániel
Tuza Sebestyén
Kiss András
Bilecz Ágnes
Budai András
Gyöngyösi Benedek
Keszthelyi Diána
Kontsek Endre
Kovács Attila
Kovács Tekla
Kramer Zsófia
Semmelweis University, 2nd Department of Pathology
Szócska Miklós
WHO Globocan 2018: incidence and mortality
Hungary has the highest colorectal cancer rate
>10,000 new cases, >5,000 deaths/yr (Orv. Hetil., 2017, 158(3), 84-89)
Early detection!
https://fightcolorectalcancer.org/prevent/colon-polyps/
Dysplasia: precancerous stage → invasive cancer in ~2-10 yr
Deep learning for colorectal cancer pathology
Samples from colonoscopy, biopsy
Traditional: by-eye microscope
New: digital scanner
>2,000 whole slide images
• 80,000 x 60,000 pixels, 15GB
Detailed annotation by trained pathologists:
• Dysplasia, low grade
• Dysplasia, high grade
• Adenocarcinoma
• Suspicious for adenocarcinoma
• Lymphovascular invasion
• Tumor necrosis
• Inflammation
• Artifact
https://www.3dhistech.com/
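A hedged sketch of one common way to handle such huge whole-slide images (a generic tiling approach, not necessarily the project's actual pipeline): cut the slide into small tiles and keep only the tiles containing tissue for training.

```python
# Sketch: 80,000 x 60,000 pixel slides are far too large for a CNN, so cut them
# into small tiles and keep the (annotated) tissue tiles for training.
import numpy as np

def iter_tiles(slide, tile=512, stride=512):
    """Yield (x, y, patch) tiles from a huge image array."""
    h, w = slide.shape[:2]
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            yield x, y, slide[y:y + tile, x:x + tile]

# toy stand-in for a whole-slide image (real ones are read with a WSI library)
slide = np.random.randint(0, 255, size=(2048, 2048, 3), dtype=np.uint8)
tiles = [(x, y) for x, y, t in iter_tiles(slide) if t.mean() < 230]  # skip blank background
print(len(tiles), "tissue tiles kept")
```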
Example annotated regions: dysplasia (high grade), adenocarcinoma, necrosis
A large, well-annotated training set is the key bottleneck for most machine learning tasks!
Mammography with deep learning (Faster R-CNN)
The Digital Mammography DREAM Challenge
• 1200 participants
• Dezső Ribli, best final result
• the only solution with localization
• AUC = 0.95
Publication: Nature Scientific Reports (2018)
• 90 citations
• 30th most popular out of 17,000 articles
New collaborations with hospitals, clinics
• more training data
• steps towards licensing
D. Ribli, A. Horváth, Z. Unger, P. Pollner, and I. Csabai. "Detecting and
classifying lesions in mammograms with deep learning." Scientific reports
(2018)
Ren et al. 2016
NVIDIA Developer 2018
Open source software
• https://riblidezso.github.io/frcnn_cad/
• 50 visitors/week
• runs on a regular laptop with GPU
• plugin for OsiriX/Horos medical image viewer
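A hedged sketch of running a Faster R-CNN detector with torchvision (a generic pretrained model, not the trained mammography network of the paper): the detector returns bounding boxes with confidence scores, i.e. it localizes findings rather than giving only a per-image label.

```python
# Sketch: a stock torchvision Faster R-CNN (not the paper's trained model)
# returns bounding boxes, labels and scores, i.e. localized detections.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = torch.rand(1, 512, 512)                      # stand-in for a grayscale mammogram
with torch.no_grad():
    detections = model([image.repeat(3, 1, 1)])[0]   # the model expects 3-channel input

for box, score in zip(detections["boxes"], detections["scores"]):
    if score > 0.5:
        print(box.tolist(), float(score))            # candidate locations and confidences
```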
Cosmological parameters from gravitational lensing: learning new tricks from deep learning
Cosmological parameters (Ωm, σ8) → complex physical processes → observed weak lensing map
Hard inverse problem!
Ribli et al. MNRAS 2019
Ribli et al. Nature Astr. 2019
2M parameters
Learned kernels: dark matter halo profile expansion
Attention focus of the network with Layer-wise Relevance Propagation
Instead of the Fourier power spectrum: information from halo profiles
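A minimal sketch of the kind of network involved (an assumed toy architecture, far smaller than the 2M-parameter model of the papers): a CNN that regresses the two cosmological parameters directly from a weak lensing convergence map.

```python
# Toy CNN regressor: weak lensing convergence map -> (Omega_m, sigma_8).
import torch
import torch.nn as nn

class LensingRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)        # outputs: Omega_m, sigma_8

    def forward(self, kappa_map):
        return self.head(self.features(kappa_map).flatten(1))

maps = torch.randn(4, 1, 128, 128)          # fake convergence maps
print(LensingRegressor()(maps).shape)       # torch.Size([4, 2])
```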
Learning from deep learning:
New simple method
Few parameters, robust, fast, understandable.
Explainable AI: understanding internal representation
Explainable AI: automatic classification enhancement
Teaching classes: normal / cancer
Automatic labels "discovered" by the network, based only on the simple teaching classes: the features that invoke the highest activity
A. Large calcification
B. Oval mass
C. Spiculated mass
D. Calcified vessel
E. Calcification
F. Clustered micro-calcifications
Interpretable and trustworthy for radiologists. Ribli et al. in prep., patent subm. 2019
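As a hedged illustration of how "features that invoke the highest activity" can be located, here is a gradient-saliency sketch (a simpler relative of Layer-wise Relevance Propagation; the tiny untrained network is only a stand-in):

```python
# Gradient saliency: which input pixels most influence the network's output,
# i.e. where its "attention" is focused. The model below is an untrained stand-in.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)
x = torch.randn(1, 1, 64, 64, requires_grad=True)
model(x)[0, 1].backward()            # gradient of one class score (say, "cancer")
saliency = x.grad.abs().squeeze()    # per-pixel relevance map
print(saliency.shape)
```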
We do not only teach physics! BSc:
• Basics of computing (Linux, Python basics, professional scientific text editing, plots) [ http://oroszl.web.elte.hu/#teaching ]
• Basics of programming in physics (C programming, data handling, solving mathematical and physics problems with programs) [ http://www.vo.elte.hu/~dobos/teaching ]
• Numerical methods of physics I-II.
• Computer simulations, …
MSc: Scientific data analytics and modelling specialization [ http://datascience.elte.hu ]
• Data exploration and visualization
• Advanced statistics and modelling
• Data models and databases in science
• Data mining and machine learning
• Computer laboratory
• Data science computer laboratory
• Scientific modelling computer laboratory
• Deep learning in the sciences (spec.) [ https://csabaibio.github.io/physdl/ ]
Any sufficiently advanced technology is indistinguishable from magic.
(Arthur C. Clarke)
Indeed, understanding the laws of mechanics enabled us to build pyramids and cathedrals; building on the laws of thermodynamics, the invention of the steam engine empowered us to cross oceans and continents, and today we all have "seven-league boots" in our garages. Understanding electrodynamics and quantum mechanics brought us the transistor, which is at the heart of the Internet and of the modern "magic mirrors", the mobile phones.
What miracles will the advancements of machine learning bring? And what kind of challenges?
NEW PARADIGMS
EDUCATION: we need new scientists who have professional skills both in their disciplines and in modern information technologies.
PUBLIC DATA: a common good. Great opportunities, great responsibility.
István Csabai ELTE Dept. of Physics of Complex Systems
http://complex.elte.hu/~csabai/