4th APCAP Dec 5th 2008, NIAS Bangalore
Pāņini's Ashţādhyāyī: A Computer Scientist's Perspective
Amba Kulkarni
Department of Sanskrit Studies
University of Hyderabad
Hyderabad
Pāņini's Ashţādhyāyī
Circa 500 B.C.E.
Extant Grammar of the then prevalent Sanskrit Language
Around 4000 sutras;
8 chapters 4 sections each
A Computer Scientist
Information Theory: Shannon
Computational Linguists: (Language: means of coding the information.)
Information Coding: How much, Where and How
Programming Languages: Concepts, Techniques and Models
Pāņini's Ashţādhyāyī
Pāņini's Ashţādhyāyī has something to offer to each of these fields
Information Coding
Claim: Panini was aware of the strength of language as an information coding device.
Evident from
a) His style of presenting the information in sUtra
b) His analysis of Sanskrit Language
Information Theory
Brevity
Kiparsky: Panini used brevity to achieve generalisation.
– Maximum use of anuvŗtti
Ram went home.
Ram ate an apple.
----------------------------------------------
Ram went home, ate an apple.
Anuvŗtti
upadeSe ac anunAsika it 1.3.2
hal antyam 1.3.3
na vibhaktau tusmA 1.3.4
Adi ~nitudavAH 1.3.5
.saH pratyayasya 1.3.6
CutU 1.3.7
laSaku ataddhite 1.3.8
Anuvŗtti … contd
upadeSe ac anunAsika (=it) 1.3.2
hal antyam 1.3.3
na vibhaktau tusmA (=it) 1.3.4
Adi ~nitudavAH (=it) 1.3.5
pratyayasya .saH (=it) 1.3.6
CutU (=it) 1.3.7
ataddhite laSaku (=it) 1.3.8
Anuvŗtti … contd
upadeSe(a) ac anunAsika(b) (=it)(c) 1.3.2
hal antyam(d) 1.3.3
na vibhaktau tusmA(e) (=it) 1.3.4
Adi(f) ~nitudavAH(g) (=it) 1.3.5
pratyaya(h)sya(i) .saH(j) (=it) 1.3.6
CutU(k) (=it) 1.3.7
ataddhite laSaku(l) (=it) 1.3.8
Anuvŗtti … contd
Flat sequence: a b c d e c f g c h_i j c k c l c
Factored form: a (b + de + f [g + h_i {j + k + l}]) c
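The factored form can be read as a distributive expansion: each path through the nesting yields one fully spelled-out string of the labelled parts. The following is a minimal Python sketch of that reading. The encoding (tuples for sequences, lists for alternatives) and the function name `expand` are assumptions made for illustration; they are not part of the talk.

```python
from itertools import product

def expand(term):
    """Expand a factored expression into its flat sequences.

    A term is a tuple of factors. A factor is either a label (str)
    or a list of alternative terms (the '+' of the slide).
    """
    choices = []
    for factor in term:
        if isinstance(factor, str):
            choices.append([[factor]])
        else:
            alternatives = []
            for alt in factor:
                alternatives.extend(expand(alt))
            choices.append(alternatives)
    # One flat sequence per combination of choices, in order.
    return [sum(combo, []) for combo in product(*choices)]

# a (b + de + f [g + h_i {j + k + l}]) c, hypothetical encoding
expr = (
    "a",
    [("b",), ("d", "e"), ("f", [("g",), ("h_i", [("j",), ("k",), ("l",)])])],
    "c",
)

for seq in expand(expr):
    print(" ".join(seq))
```

The shared parts `a` and `c` are written once in the factored form but appear in every expanded sequence, which is the saving that anuvŗtti provides.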
Anuvŗtti … contd
No Proper Nesting; MandUka pluti:
Ad(a) guNaH(b) 6.1.84
v.rddhiH(c) eci(d) {a} 6.1.85
etyedhatyUTsu(e) {a c d} 6.1.86
upasargAt(f) ŗti(g) dhAtO(h) {a c} 6.1.87
VA supyApiSale(i) {f g h a c} 6.1.88
OtaH amSasoH(j) 6.1.89
e”ni(k) pararUpaM(l) {f h a} 6.1.90
a{b + c[d(1+e) + fh(g(1 + i)] + j + kl)}
Anuvŗtti … contd
Maximum advantage of features of Natural Language:
How are the complete phrases reconstructed?
AkAnksha (Expectancy): Major role in deciding the anuvŗtti
Anuvŗtti … contd
Example of borrowing from as many as 11 sūtras
Original sūtra: 3-3-65 क्वणो वीणायां च
After anuvŗtti: 3-3-65: क्वणो वीणायां च प्रत्ययः परः च आद्युदात्तः च धातोः कृत् क्रियायां क्रियार्थायाम् भावे अकर्तरि च कारके सञ्ज्ञायाम् अप् उपसर्गे वा नौ (anuvŗtti from 11 different sūtras)
Anuvŗtti … contd
Some Statistics:
Total sūtras: 3984 (approx. 4000)
Total words (with sandhi): 7007 (approx. 7000)
Total sandhi-split words: 9843
Total words after repeating the words with anuvŗtti: 40,000
Compression because of anuvŗtti: 1/6
In terms of byte size, compression is 1/3.
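The word-level ratio can be checked with simple arithmetic. The slide does not say which word count underlies the 1/6 figure, so the sketch below computes the ratio for both counts; taking the with-sandhi count as the basis is an assumption. The byte-size figure cannot be recomputed from the numbers given.

```python
# Figures taken from the slide; 40,000 is itself an approximate count.
words_with_sandhi = 7007
words_sandhi_split = 9843
words_expanded = 40_000

ratio_with_sandhi = words_with_sandhi / words_expanded
ratio_sandhi_split = words_sandhi_split / words_expanded

print(f"with sandhi : {ratio_with_sandhi:.3f} (1/6 = {1/6:.3f})")
print(f"sandhi split: {ratio_sandhi_split:.3f} (1/4 = {1/4:.3f})")
```

The with-sandhi count gives a ratio close to 1/6, while the sandhi-split count gives a ratio close to 1/4.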
Information Theory
Normal Arrangement of Alphabet
a A i I u U ŗ ļ e E o O M H
k kh g gh “n
c ch j jh ~n
ţ ţh .d .dh .n
t th d dh n
p ph b bh m
y r l v
S ş s h
ShivasUtra
Panini required several (42) subsets of this alphabet to describe various operations.
ShivasUtra
Some of these subsets:
All vowels
All consonants
All vowels + semivowels
h y v r l + consonants
y v r l + consonants
v r l + consonants
r l + consonants
ShivasUtra
It is not advisable to give 42 names to these sets. It will be difficult to memorize the association.
These are Partially ordered sets.
Panini arranged them linearly in the form of 14 ShivasUtras.
SHIVASUTRAS
a i u N
ŗ ļ K
e o “N
E O C
h y v r T
l N
ň m “n .n n M
jh bh Ň
gh .dh dh Ş
j b g .d d S
kh ph ch .th th c .t t V
k p Y
S ş s R
h L
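A subset of the alphabet is named by a starting sound plus the marker that closes a later sūtra: the name stands for every sound from the start up to that marker, markers themselves excluded. The following is a minimal Python sketch of that lookup. It uses IAST spellings rather than the slide's transliteration, and the function name `pratyahara` and the data layout are assumptions made for illustration. Because the marker Ṇ occurs twice, the sketch simply stops at the first matching marker after the start.

```python
# Each entry: (sounds of the sūtra, closing marker). IAST spelling.
SHIVA_SUTRAS = [
    (["a", "i", "u"], "Ṇ"),
    (["ṛ", "ḷ"], "K"),
    (["e", "o"], "Ṅ"),
    (["ai", "au"], "C"),
    (["h", "y", "v", "r"], "Ṭ"),
    (["l"], "Ṇ"),
    (["ñ", "m", "ṅ", "ṇ", "n"], "M"),
    (["jh", "bh"], "Ñ"),
    (["gh", "ḍh", "dh"], "Ṣ"),
    (["j", "b", "g", "ḍ", "d"], "Ś"),
    (["kh", "ph", "ch", "ṭh", "th", "c", "ṭ", "t"], "V"),
    (["k", "p"], "Y"),
    (["ś", "ṣ", "s"], "R"),
    (["h"], "L"),
]

def pratyahara(start, marker):
    """Return the sounds from `start` up to the first `marker` after it."""
    result = []
    collecting = False
    for sounds, closing in SHIVA_SUTRAS:
        for sound in sounds:
            if not collecting and sound == start:
                collecting = True
            if collecting and sound not in result:  # 'h' is listed twice
                result.append(sound)
        if collecting and closing == marker:
            return result
    raise ValueError(f"no such subset: {start} + {marker}")

print(pratyahara("a", "C"))   # all vowels
print(pratyahara("i", "K"))   # i u ṛ ḷ
print(len(pratyahara("h", "L")))  # all consonants
```

Fourteen short lists and a two-symbol naming scheme thus replace a separate name for each of the subsets the grammar needs.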
ShivasUtras .. contd
Optimality of these sUtras has been proved independently by Kiparsky (linguistically) and Petersen (mathematically).
ShivasUtras ... contd
Given a set of partially ordered sets, it is now possible to tell whether the elements are Shivasutra-encodable or not.
Ref: Petersen(2008)
Information Dynamics
Information Theory: Deals with
Measure of information, data compression, etc.
Information Dynamics: Focuses on flow of information in a language
Information Dynamics
For a Computer Scientist working in NLP
Where a language codes the information
How it codes the information
How much information it codes
are important.
Information Dynamics … contd
Where a language codes information:
Useful to decide the parsing Strategy
Information Dynamics … contd
How a language codes information:
Useful to decide whether or not the information can be passed on to another language without any effort
Information Dynamics … contd
How much information a language codes:
Useful to decide whether or not the desired information can be extracted from the language string alone, without appealing to extra-linguistic knowledge.
Information Dynamics … Contd
Where: anabhihite 2.3.1
How much: svatantraH kartA 1.4.54
How (the manner): samAna kartŗkayoH pUrvakAle 3.4.21
Programming Languages
Data + Algorithm = Program
Algorithm: Around 4000 sUtras
Data: Shivasutra, gaNapAtha, DhAtupAtha, uNAdi sUtra, li”ngAnuSAsana
Programming Languages … contd
Data + Algorithm = Program
Object Oriented Programming:
Encapsulation of data with the (markers to the) functions
Bhaj + (gh)a(~n): Presence of gh => j -> g
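The Bhaj example can be pictured as a suffix object that carries its markers along with its surface form, so that a rule only has to inspect the markers. The following is a minimal Python sketch under that reading. The class `Suffix`, the function `attach` and the marker spellings are assumptions made for illustration. Only the `gh` effect named on the slide is modelled; any other effect of the markers is ignored, so the output is an intermediate form rather than a finished word.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Suffix:
    surface: str          # what remains after the markers are dropped
    markers: frozenset    # markers carried with the suffix

# (gh)a(~n): surface 'a' with markers gh and ~n (slide's notation)
GHAN = Suffix(surface="a", markers=frozenset({"gh", "~n"}))

def attach(stem, suffix):
    """Join stem and suffix; a 'gh' marker turns a stem-final j into g."""
    if "gh" in suffix.markers and stem.endswith("j"):
        stem = stem[:-1] + "g"
    return stem + suffix.surface

print(attach("bhaj", GHAN))                       # marker present
print(attach("bhaj", Suffix("a", frozenset())))   # no marker, no change
```

The rule never looks at the suffix by name; it reacts to the marker, which is the sense in which the data carries pointers to the functions that apply to it.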
Programming Languages … contd
Ordering of rules:
Two meta rules
a) viprati.sedhe param kAryam 1.4.2
(In case of conflict apply the later rule)
b) pUrvatra asiddham 8.2.1
(Apply the rules in last 3 sections at the end and in linear order)
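Read very literally, the two meta rules amount to a tie-break on rule position and a separate, strictly ordered final phase. The following is a rough Python sketch of that literal reading only. Rules are identified by (chapter, section, number) tuples, and the function names and the sample rule numbers are placeholders chosen for illustration. The traditional interpretation of both meta rules carries many qualifications that are not modelled here.

```python
def is_final_phase(rule):
    """True for rules in the last three sections, i.e. from 8.2.1 onward."""
    return rule >= (8, 2, 1)

def resolve_conflict(candidates):
    """Meta rule (a): among conflicting rules, pick the later one."""
    return max(candidates)

def schedule(rules):
    """Meta rule (b): final-phase rules run last, in linear order."""
    main = [r for r in rules if not is_final_phase(r)]
    final = sorted(r for r in rules if is_final_phase(r))
    return main, final

# Placeholder rule numbers, not claims about specific sūtras.
print(resolve_conflict([(2, 1, 5), (2, 1, 9)]))
print(schedule([(8, 3, 2), (6, 1, 4), (8, 2, 7), (7, 2, 1)]))
```

Tuple comparison in Python is lexicographic, which matches the chapter, section, number ordering of the text.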
Programming Languages … contd
Typical VaiyAkara.na's view:
Event driven Programming
Changes in the Data spaces: Event ==> Triggers the rules
Programming Languages … contd
Typical VaiyAkara.na's view:
Programming Languages … contd
Control:
Certain rules block certain other rules.
Blocking is of 3 different types:
a) Partial blocking (asiddhavat)
b) Certain rules are not applicable (asiddhaH)
c) Only certain rules are applicable (asiddham)
(direct implication on: passing of parameters)
Programming Languages … contd
Partial Blocking: Asiddhavat atra AbhAt
Programming Languages … contd
Ordering of the sUtras
Programming Languages … contd
Control:
a) If X then Y (mA.thara kau.ndi.nya nyAya)
b) If X then Y else Z (takra kau.ndi.nya nyAya)
c) If (not X) then Y (ni.sedha)
d) If X then (Y and Z) (vibhA.sA)
Programming Languages … contd
Conflict Resolution:
Utsarga / apavAda
General / special rule
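A general rule with a special-case exception can be sketched as "try the special rule first, fall back to the general one". The Python sketch below uses a common textbook pairing as its example: i before any vowel becomes y (general), but i before i merges into long I (special). That pairing is an assumption brought in for illustration and is not stated on the slide. The function names are hypothetical, the spelling follows the slide's transliteration (capital letters for long vowels, E and O for ai and au), and only the vowel i is handled.

```python
VOWELS = set("aAiIuUeEoO")

def special(left, right):
    """apavAda sketch: i/I followed by i/I merges into long I."""
    if left and right and left[-1] in "iI" and right[0] in "iI":
        return left[:-1] + "I" + right[1:]
    return None

def general(left, right):
    """utsarga sketch: i followed by any vowel becomes y."""
    if left and right and left[-1] == "i" and right[0] in VOWELS:
        return left[:-1] + "y" + right
    return None

def join(left, right):
    """Apply the special rule where it fits, else the general rule."""
    for rule in (special, general):
        result = rule(left, right)
        if result is not None:
            return result
    return left + right

print(join("dadhi", "atra"))   # general rule applies
print(join("dadhi", "indra"))  # special rule takes over
```

The general rule is stated without listing its exceptions; the exception wins simply because it is the more specific of the two applicable rules.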
Programming Languages … contd
Use of Special Features of Sanskrit
Use of Vibhaktis (case markers)
Order of parameters in a function
IkaH ya.n aci
Ik ac → ya.n ac
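In this rule the case endings act like named parameters: the genitive (IkaH) marks what is replaced, the nominative (ya.n) marks the replacement, and the locative (aci) marks the right-hand context. The following is a minimal Python sketch of that reading. The dictionary layout and the function `apply_rule` are assumptions made for illustration, and only the short vowels of Ik are handled.

```python
IK = ["i", "u", "ŗ", "ļ"]          # sounds to be replaced
YAN = ["y", "v", "r", "l"]         # their replacements, in the same order
AC = set("aAiIuUŗļeEoO")           # vowels, slide's transliteration

# Case endings as named parameters of one rule.
IKO_YAN_ACI = {"genitive": IK, "nominative": YAN, "locative": AC}

def apply_rule(rule, left, right):
    """Replace a final genitive-sound by its nominative counterpart
    when the next word starts with a locative-sound."""
    substitute = dict(zip(rule["genitive"], rule["nominative"]))
    if left and right and left[-1] in substitute and right[0] in rule["locative"]:
        return left[:-1] + substitute[left[-1]] + right
    return left + right

print(apply_rule(IKO_YAN_ACI, "dadhi", "atra"))
print(apply_rule(IKO_YAN_ACI, "madhu", "ari"))
```

Because the roles are carried by the case endings, the three words of the rule could appear in any order without changing what the rule does.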
Programming Languages … contd
Use of Special Features of Sanskrit
Use of Pronouns as variables
Tasmin iti nirdis.te pUrvasya
TasmAt iti uttarasya
Programming Languages … contd
Inheritance
Multiple inheritance → arranged as a linear inheritance
Taddhita pratyaya
Ashwini Deo 2007
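As a loose programming analogy for arranging multiple inheritance linearly, Python itself flattens a multiple-inheritance graph into one lookup order (the method resolution order). The sketch below shows only that language feature. The class names are placeholders and make no claim about how the taddhita suffixes are actually organised.

```python
class A:
    pass

class B(A):
    pass

class C(A):
    pass

class D(B, C):      # inherits from two parents
    pass

# Python linearizes the diamond into a single lookup chain.
order = [cls.__name__ for cls in D.__mro__]
print(order)        # ['D', 'B', 'C', 'A', 'object']
```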