RDA Node Lithuania is part of the "RDA Europe 4.0" project, funded by the EU Horizon 2020 framework programme for research and innovation (grant agreement no. 777388).
The open crystallographic database COD: presenting FAIR data in the natural sciences
Saulius Gražulis
Current issues in open science and research data, Vilnius, 2020
Vilnius University, Life Sciences Center
Institute of Biotechnology
This set of slides may be copied as specified in the Creative Commons Attribution-ShareAlike 4.0 International license
Contents
1. Motivation: why databases?
2. FAIR principles
3. Types of open science resources
4. The COD database
5. Presenting scientific data
  ◮ Identifiers
  ◮ Metadata
  ◮ Homogeneity
  ◮ Accessibility
  ◮ Formats
  ◮ Semantics
  ◮ Completeness
  ◮ Quality criteria
  ◮ Software
Simple queries
How many crystal structures were published in each year? A query against the COD database:
SELECT count(*) AS nr, year FROM data
WHERE year IS NOT NULL
GROUP BY year ORDER BY year DESC
[Figure: number of structures per publication year; x axis: year (ticks from 1915 to 1961), y axis: count (0 to 350).]
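The aggregation in this query can be tried without access to the COD server. The following is a minimal sketch, assuming a toy in-memory SQLite table named `data` with only two columns (`file`, `year`) and a handful of made-up rows; the real COD `data` table has many more columns and is served by MySQL.

```python
import sqlite3


def structures_per_year(rows):
    """Count entries per publication year, newest year first.

    `rows` is an iterable of (file, year) tuples; a year of None is skipped,
    mirroring the `year IS NOT NULL` condition of the query.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical two-column stand-in for the COD `data` table.
    conn.execute("CREATE TABLE data (file INTEGER, year INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT count(*) AS nr, year FROM data "
        "WHERE year IS NOT NULL "
        "GROUP BY year ORDER BY year DESC"
    ).fetchall()
    conn.close()
    return result


# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [(1, 1915), (2, 1915), (3, 1921), (4, None)]

if __name__ == "__main__":
    for nr, year in structures_per_year(SAMPLE_ROWS):
        print(year, nr)
```

Each result row is a `(nr, year)` pair, in the same column order as the `SELECT` clause.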
Discoveries in "raw" data
Zheng, from Roberts's team at NEB, used "raw" sequencing data to detect active gene-cutting proteins [Zheng et al. (2008)].
The importance of data
A group of Chinese researchers made a discovery in biochemistry without performing any experiments [Li et al. (2008)]:
http://slidegur.com/doc/3077570/introducing-bioinformatics-databases
FAIR data sharing principles
We aim for data to be [Wilkinson et al. (2016)]:
◮ Findable (by automated means!)
◮ Accessible (by automated means!)
◮ Interoperable (with a variety of programs!)
◮ Reusable (for varied, unanticipated purposes!)
Various data sharing solutions
1. General-purpose data archives (Zenodo, Data Dryad, MIDAS, ...);
  ◮ Zenodo https://doi.org/10.5281/zenodo.3560693 (a poster presentation, PDF) ...
  ◮ Zenodo https://zenodo.org/record/3841841 (linked COVID-19 data, TTL) ...
2. Subject-specific data archives (PDB, COD, NCBI, SwissProt, EuropePMC, PubMed (!));
  ◮ https://www.crystallography.net/cod/1557684.cif
  ◮ https://www.crystallography.net/cod/1544162.html
  ◮ https://www.pdb.org/pdb/files/1KNV.cif
  ◮ https://www.rcsb.org/structure/2IXS
The COD database
Crystallography Open Database (COD, https://www.crystallography.net) [Gražulis et al. (2009), Gražulis et al. (2012)]:
COD contents:
◮ As far as possible, all small-molecule structures (organic, mineral);
◮ An open database: it can be searched through the Web page and by automated means, and the whole database can be downloaded;
The COD project
But what if crystallographers work together to establish a public
domain database with all relevant crystallographic data? This would
not only overcome the current situation with ’fragmented’ databases,
it would also prevent for becoming dependent from monopolists.
What would be needed?
1. A small team of engaged scientists with some experience in database
and software design to coordinate the project.
2. The authors (i.e. the scientific community = YOU) who provides the
project with database entries (note, that if you have’nt sold your
experimental results exclusively, you are free to distribute the data
to such a database, even if they have already been part of a
publication - and a lot of good data have never been published).
3. Free software a) for maintaining the database, b) for data
evaluation and calculation of derived data (e.g. calculated powder
pattern from crystal structures for search-match purposes), c) for
browsing and retrieval.
gemstonede (Dr. Michael BERNDT) Fri Feb 14, 2003 1:26 pm
17 years later ... :)
The Crystallography Open Database
http://www.crystallography.net/cod
COD sustainability and growth
COD has been running for 17 years and has grown 8-fold over the last 10 years; it currently holds over 450 000 entries (2020):
[Figure: COD records over time; x axis: year (2006 to 2020), y axis: COD record number (0 to 450000).]
Types of identifiers
1. A shared (centralized or distributed) registry:
  ◮ COD identifiers (e.g. COD 2000000);
  ◮ PDB identifiers (e.g. PDB 1KNV);
  ◮ DOI (e.g. 10.1093/nar/gkn883);
  ◮ URI (e.g. https://www.w3.org/Provider/Style/URI.html)
  ◮ ISSN, ISBN, PMID, PMCID, ...
2. Randomized identifiers:
  ◮ UUID (e.g. 90376010-a315-11ea-adba-6bb1c61159af)
  ◮ Cryptographic checksums (e.g. git commit 42a03a255612b8d43ecd77bb0acc02def888f688, 42a03a2);
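Both kinds of randomized identifier can be produced with a few lines of standard-library code. The sketch below is illustrative only: it assumes a version 4 (random) UUID, and it hashes file content the way Git hashes a blob, which is a simpler object than the commit quoted on the slide. The function names are hypothetical.

```python
import hashlib
import uuid


def random_identifier() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 identifier that Git assigns to a blob with this content.

    Git hashes the header "blob <length>\\0" followed by the raw bytes.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    print(random_identifier())       # differs on every run
    full_id = git_blob_id(b"")
    print(full_id, full_id[:7])      # full and abbreviated forms
```

A checksum identifier depends only on the content, so anyone holding the same bytes derives the same identifier without consulting a registry; a UUID needs no registry either, but carries no information about the content.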
Metadata
◮ General metadata:
  ◮ Bibliography
  ◮ Database maintenance records (revision, entry date, ...)
◮ Domain-specific metadata:
  ◮ Experimental conditions
  ◮ Quality criteria
  ◮ COD: chemical formula
  ◮ COD: crystal growth conditions
  ◮ Links to other databases
  ◮ etc. (the list of metadata is potentially open-ended!)
Homogeneity
◮ Entries are mutually comparable
cod/2014638-head.cif:
data_2014638
loop_
_publ_author_name
'U\,car, \.Ibrahim'
# et al.
_publ_section_title
;
3-Acetoxy-2-(acetylamino)pyridinium ...
;
_journal_issue 3
_journal_name_full 'Acta Cryst. C'
_journal_paper_doi
10.1107/S0108270104031841
#...
_chemical_formula_sum 'C13 H10 N2 O6'
_chemical_formula_weight 290.23
#...
_cell_length_a 8.8959(16)
#...
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
#...
C1 0.1598(5) 0.4645(4) 0.7452(4) # ...
C2 0.1712(5) 0.4275(5) 0.9028(5) # ...

cod/2103669-head.cif:
data_2103669
loop_
_publ_author_name
'Rychlewska, Urszula'
'War\.zajtis, Beata'
_publ_section_title
;
... dibenzoyltartaric acid
;
_journal_issue 3
_journal_name_full 'Acta Cryst. B'
_journal_paper_doi
10.1107/S010876810100430X
# ...
_chemical_formula_sum 'C20 H20 N2 O6'
_chemical_formula_weight 384.38
# ...
_cell_length_a 8.9153(6)
# ...
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
#...
C1 .1554(3) -.9495(2) -1.00258(16) # ...
C2 -.0044(3) -.9472(2) -1.03195(16) # ...
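Because both entries use the same data names, one small routine can read either file and line the values up. The sketch below is a deliberately simplified reader, not a full CIF parser: it assumes data items written on a single line as `_name value` and ignores loops and multi-line text fields. The excerpts are cut down from the two files shown here.

```python
import re

# Single-line data items excerpted from the two entries shown above.
CIF_2014638 = """\
data_2014638
_journal_issue 3
_journal_name_full 'Acta Cryst. C'
_chemical_formula_sum 'C13 H10 N2 O6'
_chemical_formula_weight 290.23
_cell_length_a 8.8959(16)
"""

CIF_2103669 = """\
data_2103669
_journal_issue 3
_journal_name_full 'Acta Cryst. B'
_chemical_formula_sum 'C20 H20 N2 O6'
_chemical_formula_weight 384.38
_cell_length_a 8.9153(6)
"""


def parse_simple_items(text):
    """Collect single-line `_name value` pairs; everything else is skipped."""
    items = {}
    for line in text.splitlines():
        match = re.match(r"(_\S+)\s+(.+)", line)
        if match:
            items[match.group(1)] = match.group(2).strip().strip("'")
    return items


def numeric(value):
    """Drop a trailing standard uncertainty, e.g. '8.8959(16)' -> 8.8959."""
    return float(re.sub(r"\(\d+\)$", "", value))


if __name__ == "__main__":
    first = parse_simple_items(CIF_2014638)
    second = parse_simple_items(CIF_2103669)
    # The same data names appear in both entries, so they compare directly.
    for name in sorted(set(first) & set(second)):
        print(f"{name:28} {first[name]:18} {second[name]}")
```

For real work a complete CIF parser should be used; the point here is only that identical data names make entries directly comparable.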
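Recognition of complex features
Polyene chain angles from COD
[Figure: histogram of polyene chain angles; x axis: angle (120 to 140), y axis: frequency (0 to 8). Inset: structural diagram of the chain with the labels C2i−1, H, C2i, H, n, ?, M+.]
A. Merkys, doctoral dissertation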
Mechanical properties of molecules
[Figure, first panel: energy (arbitrary units, 0 to 100) against displacement Δl (length units, -10 to 10). Second panel: energy (arbitrary units, 0 to 1) against angle (radians, -3 to 3).]
$E = \frac{1}{2} k \Delta l^2$, $p(\Delta l) \sim e^{-E/(k_B T)} = e^{-\Delta l^2/(2\sigma^2)}$
$E = k\,(1 + \cos(n \Delta\varphi))$, if $n > 0$
◮ Crystal statistics make it possible to determine the forces acting on atoms in molecules.
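Equating the two exponents in the harmonic case gives $\sigma^2 = k_B T / k$, so the spread of an observed bond length yields an effective force constant $k = k_B T / \sigma^2$. The sketch below applies that relation; the sample lengths and the temperature are made up for illustration, and treating crystal-structure statistics as a Boltzmann ensemble at a single temperature is itself an assumption.

```python
import statistics

K_B = 1.380649e-23  # Boltzmann constant in J/K


def force_constant(lengths, temperature):
    """Estimate the harmonic force constant k = k_B * T / sigma**2.

    `lengths` are observed values of one bond length; the result is in
    joules per squared length unit of the input.
    """
    sigma = statistics.pstdev(lengths)  # population standard deviation
    return K_B * temperature / sigma**2


if __name__ == "__main__":
    # Hypothetical bond lengths (in ångströms) for one bond type.
    sample = [1.52, 1.54, 1.53, 1.55, 1.51]
    print(force_constant(sample, temperature=300.0))
```

A narrower distribution (smaller σ) corresponds to a stiffer bond, which is the sense in which structure statistics reveal the forces acting on atoms.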
New restraint libraries
[Long et al. (2017a), Long et al. (2017b)]
Accessibility
◮ Data should be reachable through a URI (URL);
◮ Data should be in machine-readable form;
◮ Data should be discoverable through automated queries.
◮ A Web search form is also nice to have ;).
◮ Reproducible queries should be the goal [Rauber et al. (2016)]
The COD search form
http://www.crystallography.net/cod/search.html
COD query examples: Web, REST, SQL
◮ Using stable URLs (REST):
  ◮ http://www.crystallography.net/cod/2100202.html
  ◮ http://www.crystallography.net/cod/2100202.cif
  ◮ http://www.crystallography.net/cod/result?text=caffeine
◮ Using the SQL query language on the database:
  ◮ mysql -u cod_reader cod -h www.crystallography.net \
      -e 'select file, a, b, c, vol, formula
          from data where
          year between 2013 and 2014 and
          formula regexp " C[0-9]* "
          order by vol desc limit 10'
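The stable URLs follow a regular pattern, so a program can construct them from a COD number or a search term. The sketch below only builds the addresses, following the three URLs listed on this slide; the function names are hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

COD_BASE = "http://www.crystallography.net/cod"


def entry_url(cod_id: int, extension: str = "cif") -> str:
    """Address of one COD entry, e.g. as a CIF file or an HTML page."""
    return f"{COD_BASE}/{cod_id}.{extension}"


def text_search_url(text: str) -> str:
    """Address of a free-text search; the term is percent-encoded."""
    return f"{COD_BASE}/result?{urlencode({'text': text})}"


if __name__ == "__main__":
    print(entry_url(2100202, "html"))
    print(entry_url(2100202))
    print(text_search_url("caffeine"))
```

The resulting strings can be passed to any HTTP client, such as `urllib.request.urlopen`, to fetch the entry.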
Formats
◮ Let us use open, well-documented formats;
◮ It is preferable to keep data in text form, if space allows;
◮ Super formats :) :
  ◮ CSV, CIF, JSON (with a schema!), XML (with a schema!), PDB, FASTA: text-based;
  ◮ HDF5 (with dictionaries!), EXI (with schemas!), BSON: binary;
  ◮ TXT (text in the UTF-8 Unicode encoding);
◮ So-so formats:
  ◮ XLS(X), DOC(X), ...;
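As a small illustration of "JSON (with a schema!)" stored as UTF-8 text, the sketch below writes and reads one record and checks it against a hand-written list of required fields. The field names and the `SCHEMA` dictionary are hypothetical stand-ins; a real project would more likely use a JSON Schema document and a dedicated validator.

```python
import json

# Hypothetical minimal "schema": required field names and their Python types.
SCHEMA = {"cod_id": int, "formula": str, "cell_length_a": float}


def validate(record: dict) -> None:
    """Raise ValueError if a required field is missing or has the wrong type."""
    for name, expected_type in SCHEMA.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        if not isinstance(record[name], expected_type):
            raise ValueError(f"wrong type for field: {name}")


def dump_record(record: dict) -> bytes:
    """Serialize a validated record as UTF-8 encoded JSON text."""
    validate(record)
    return json.dumps(record, ensure_ascii=False, sort_keys=True).encode("utf-8")


def load_record(data: bytes) -> dict:
    """Parse UTF-8 encoded JSON text and validate the result."""
    record = json.loads(data.decode("utf-8"))
    validate(record)
    return record


if __name__ == "__main__":
    # Illustrative record with made-up field names.
    record = {"cod_id": 2100858, "formula": "Ba O3 Ti", "cell_length_a": 3.9998}
    print(dump_record(record).decode("utf-8"))
```

Validating on both write and read means a malformed file is rejected at the boundary instead of failing somewhere deep inside an analysis.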
The CIF framework for crystallographic data exchange
Created and maintained by the International Union of Crystallography (IUCr) [Hall et al. (1991)].
examples/data/2100858-head.cif:
data_2100858
loop_
_publ_author_name
'Buttner, R. H.'
'Maslen, E. N.'
_publ_section_title
;
Structural parameters and electron difference density in BaTiO~3~
;
_journal_issue 6
_journal_name_full 'Acta Crystallographica Section B'
_journal_page_first 764
_journal_page_last 769
_journal_volume 48
_journal_year 1992
_chemical_compound_source 'synthetic, from a mixture of KF:KMoO4:BaTiO3'
_chemical_formula_sum 'Ba O3 Ti'
_chemical_formula_weight 233.24
_symmetry_cell_setting tetragonal
_symmetry_space_group_name_Hall 'P 4 -2'
_symmetry_space_group_name_H-M 'P 4 m m'
_cell_length_a 3.9998(8)
_cell_length_b 3.9998(8)
_cell_length_c 4.0180(8)
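Because CIF is plain text, simple items in the fragment above can be read with a few lines of code. The following is a minimal sketch, not a CIF parser: the function names `parse_cif_number` and `read_simple_items` are made up for illustration, only one-line `_tag value` pairs are handled, and the parenthesised digits are interpreted as a standard uncertainty on the last quoted decimal places.

```python
import re

# Matches a plain decimal number optionally followed by a standard
# uncertainty in parentheses, e.g. "3.9998(8)". Exponent notation is
# deliberately not handled in this sketch.
_NUMB = re.compile(r"^([+-]?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_cif_number(text):
    """Return (value, su) for a CIF numeric string; su is None if absent."""
    match = _NUMB.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple CIF number: {text!r}")
    mantissa, su_digits = match.groups()
    value = float(mantissa)
    if su_digits is None:
        return value, None
    # The uncertainty applies to the last quoted decimal places of the value.
    decimals = len(mantissa.split(".")[1]) if "." in mantissa else 0
    return value, int(su_digits) / 10 ** decimals


def read_simple_items(cif_text):
    """Collect one-line `_tag value` pairs; loops and text fields are skipped."""
    items = {}
    for line in cif_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            items[parts[0]] = parts[1].strip()
    return items


sample = "_cell_length_a 3.9998(8)\n_cell_length_c 4.0180(8)\n_journal_volume 48\n"
items = read_simple_items(sample)
value, su = parse_cif_number(items["_cell_length_a"])
print(value, su)
```

For real work a full CIF parser is the better choice, since loops, quoted strings and multi-line text fields need proper handling.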
Semantics
◮ CIF – dictionaries!
◮ XML, JSON – schemas!
◮ SQL – schemas!
◮ Semantic networks (RDF);
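As an illustration of the "SQL – schemas!" point, here is a minimal sketch of a schema that rejects physically meaningless values at insertion time. The table and column names are hypothetical and do not reflect the actual COD database layout; SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical table and column names; the constraints are the point.
SCHEMA = """
CREATE TABLE cell (
    cod_id   INTEGER PRIMARY KEY,
    a        REAL NOT NULL CHECK (a > 0),
    b        REAL NOT NULL CHECK (b > 0),
    c        REAL NOT NULL CHECK (c > 0)
)
"""


def try_insert(conn, row):
    """Insert a row; return True if the schema accepted it, False otherwise."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cell (cod_id, a, b, c) VALUES (?, ?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

accepted = try_insert(conn, (2100858, 3.9998, 3.9998, 4.0180))
rejected = try_insert(conn, (1, -1.0, 3.9998, 4.0180))  # negative cell length
print(accepted, rejected)
```

The constraints live with the data definition, so every program writing to the table is subject to the same rules.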
Controlled dictionaries
examples/dictionaries/cif-core-example.cif:
data_cell_length_
loop_ _name '_cell_length_a'
            '_cell_length_b'
            '_cell_length_c'
_category cell
_type numb
_type_conditions esd
_enumeration_range 0.0:
_units A
_units_detail 'angstroms'
_definition
; Unit-cell lengths in angstroms corresponding to the structure
reported. The values of _refln_index_h, *_k, *_l must
correspond to the cell defined by these values and _cell_angle_
values. The values of _diffrn_refln_index_h, *_k, *_l may not
correspond to these values if a cell transformation took place
following the measurement of the diffraction intensities. See
also _diffrn_reflns_transf_matrix_.
;
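A dictionary entry like the one above is machine-readable, so a program can check data values against it. The sketch below assumes the entry has already been loaded into a plain Python dict (the layout of `CELL_LENGTH_DEF` and the function name `check_value` are illustrative assumptions) and handles only the `numb` type with the `esd` condition and a `low:high` enumeration range.

```python
import re

# Definition transcribed from the dictionary fragment above, held as a plain
# dict; this in-memory layout is an assumption of the sketch.
CELL_LENGTH_DEF = {
    "names": {"_cell_length_a", "_cell_length_b", "_cell_length_c"},
    "type": "numb",
    "type_conditions": "esd",
    "enumeration_range": "0.0:",
}

_NUMB = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")
_NUMB_ESD = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?\(\d+\)$")


def check_value(name, raw, definition):
    """Return a list of human-readable problems; an empty list means valid."""
    if name not in definition["names"]:
        return [f"{name}: not defined by this dictionary entry"]
    problems = []
    text = raw
    if _NUMB_ESD.match(raw):
        # A standard uncertainty is present; it must be permitted.
        if definition.get("type_conditions") != "esd":
            problems.append(f"{name}: standard uncertainty not allowed")
        text = raw[: raw.index("(")]
    elif not _NUMB.match(raw):
        return [f"{name}: {raw!r} is not of type numb"]
    value = float(text)
    # The range is written as "low:high"; either side may be empty.
    low, _, high = definition["enumeration_range"].partition(":")
    if low and value < float(low):
        problems.append(f"{name}: {value} is below {low}")
    if high and value > float(high):
        problems.append(f"{name}: {value} is above {high}")
    return problems


print(check_value("_cell_length_a", "3.9998(8)", CELL_LENGTH_DEF))
print(check_value("_cell_length_a", "-4.0", CELL_LENGTH_DEF))
```

The same function works unchanged for any other numeric data name once its definition is supplied, which is the practical benefit of keeping the rules in a dictionary and not in program code.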
Self-describing data
Three levels are always enough to describe the constraints on data!
Completeness
◮ Number of COD entries: > 450 000;
◮ Number of known published structures (according to DataCite): > 818 000 (as of 2019-12-31);
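A rough coverage estimate follows from the two figures above. Both are lower bounds and the date of the COD count is not stated, so the ratio below is only indicative; the variable names are illustrative.

```python
# Both figures are the lower bounds quoted above, so the ratio is only a
# rough indication, not a measured coverage.
cod_entries = 450_000
known_structures = 818_000  # DataCite-based count as of 2019-12-31

coverage = cod_entries / known_structures
print(f"COD covers roughly {coverage:.0%} of known published structures")
```

Under these assumptions COD holds a little more than half of the known published structures.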
Quality criteria
◮ IUCr quality criteria
  ◮ List of IUCr criteria: ftp://ftp.iucr.org/pub/dvntests
  ◮ IUCr publication quality criteria
◮ COD quality criteria
  ◮ ✓✓ Correct syntax;
  ◮ ✓ Validation against dictionaries;
  ◮ ✓ Validation against data statistics;
  ◮ ✗ Validation against fundamental physical principles;
COD data checking
COD data checking policy:
1. Syntax check:
   $ cifparse 7234818.cif
2. Semantic check (against dictionaries):
   $ cif_validate -D cif_core.dic 7234818.cif
3. COD-specific database tests:
   $ cif_cod_check 7234818.cif
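The three steps above can be chained so that a file is rejected at the first failing stage. This is a sketch under stated assumptions: the command lines are copied from the steps above, but the slides do not say how the tools signal failure, so treating a non-zero exit status as failure is a guess, and the function names `build_checks` and `run_checks` are made up for illustration.

```python
import subprocess


def build_checks(cif_path, dictionary="cif_core.dic"):
    """Return (step name, command line) pairs in the order they should run."""
    return [
        ("syntax", ["cifparse", cif_path]),
        ("dictionary", ["cif_validate", "-D", dictionary, cif_path]),
        ("cod", ["cif_cod_check", cif_path]),
    ]


def run_checks(cif_path, runner=subprocess.run):
    """Run the checks in order; return the first failing step name, or None.

    Assumption: a non-zero exit status means the step failed. The real tools
    may also report problems on standard error while exiting with status 0.
    """
    for name, command in build_checks(cif_path):
        result = runner(command, capture_output=True, text=True)
        if result.returncode != 0:
            return name
    return None
```

With the tools installed, `run_checks("7234818.cif")` would return `None` when every stage passes. The `runner` parameter exists so the control flow can be exercised without the tools, by passing a stand-in function.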
COD validation and deposition website
Software for data processing
Reproducible computational research requires documented, accessible software.
◮ Full reproducibility can be achieved only with open-source software (F/LOSS);
◮ Many data-processing programs are available under open-source licences:
  ◮ Octave, R, Perl, Python, Make, MySQL, MariaDB, cod-tools, ...
Summary: "Take-home message"
◮ For the benefit of science and society, let us publish data openly and according to the FAIR principles;
◮ Let us ensure stable, unique identifiers (after all, we know how!);
◮ Let us ensure open, interoperable formats;
◮ Let us ensure quality appropriate to the scientific domain;
◮ Let us strive for complete data presented in machine-readable form;
Acknowledgements
Vilnius University, Life Sciences Center, Institute of Biotechnology
Virginijus Šikšnys (head of department)
Andrius Merkys, Antanas Vaitkus, Algirdas Grybauskas, Aleksandras Konovalovas
COD Advisory Board
Daniel Chateigner, Robert T. Downs, Werner Kaminsky, Armel Le Bail, Luca Lutterotti, Peter Moeck, Peter Murray-Rust, Miguel Quirós
RDA Node Lithuania is part of the project "RDA Europe 4.0", funded by the EU Framework Programme for Research and Innovation "Horizon 2020" (grant agreement No. 777388)
References I
Gražulis S, Chateigner D, Downs RT, Yokochi AFT, Quirós M, Lutterotti L, et al. (2009) Crystallography Open Database – an open-access collection of crystal structures. Journal of Applied Crystallography 42:726–729, DOI 10.1107/S0021889809016690, URL http://dx.doi.org/10.1107/S0021889809016690
Gražulis S, Daškevič A, Merkys A, Chateigner D, Lutterotti L, Quirós M, et al. (2012) Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration. Nucleic Acids Research 40:D420–D427, DOI 10.1093/nar/gkr900, URL http://nar.oxfordjournals.org/content/40/D1/D420.abstract
Hall SR, Allen FH, Brown ID (1991) The crystallographic information file (CIF): a new standard archive file for crystallography. Acta Crystallographica Section A 47:655–685, DOI 10.1107/S010876739101067X, URL http://dx.doi.org/10.1107/S010876739101067X
Li CY, Mao X, Wei L (2008) Genes and (common) pathways underlying drug addiction. PLoS Computational Biology 4(1):e2, DOI 10.1371/journal.pcbi.0040002, URL http://dx.doi.org/10.1371/journal.pcbi.0040002
References II
Long F, Nicholls RA, Emsley P, Gražulis S, Merkys A, Vaitkus A, et al. (2017a) ACEDRG: A stereo-chemical description generator for ligands. Acta Crystallographica Section D 73(2):112–122, DOI 10.1107/S2059798317000067, URL https://doi.org/10.1107/S2059798317000067
Long F, Nicholls RA, Emsley P, Gražulis S, Merkys A, Vaitkus A, et al. (2017b) Validation and extraction of stereochemical information from small molecular databases. Acta Crystallographica Section D 73(2):103–111, DOI 10.1107/S2059798317000079, URL https://doi.org/10.1107/S2059798317000079
Rauber A, Asmi A, van Uytvanck D, Pröll S (2016) Identification of reproducible subsets for data citation, sharing and re-use. URL https://tinyurl.com/y7o7o8m4
Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. (2016) The FAIR guiding principles for scientific data management and stewardship. Scientific Data 3(1), DOI 10.1038/sdata.2016.18, URL https://doi.org/10.1038/sdata.2016.18
References III
Zheng Y, Posfai J, Morgan RD, Vincze T, Roberts RJ (2008) Using shotgun sequence data to find active restriction enzyme genes. Nucleic Acids Research 37(1):e1–e1, DOI 10.1093/nar/gkn883, URL https://doi.org/10.1093/nar/gkn883
Thank you for your attention!
http://en.wikipedia.org/wiki/Emerald http://www.crystallography.net/5000095.html
A path to freedom: GNU → Linux → Ubuntu → MySQL → R → LaTeX → TikZ → Beamer