Post on 17-Dec-2015
Indus Script: Search for Grammar¹
Nisha Yadav, Tata Institute of Fundamental Research
Collaborators: Mayank Vahia, Iravatham Mahadevan, Hrishikesh Joglekar
1 Lecture given at a two day seminar on “The Indus Script: Problems and Prospects”, Chennai
Indus Script: Search for Grammar, Yadav et al. (2007)
Contents
1) Indus Script - An Overview
2) Various Approaches
3) Our Approach
4) Dataset
5) Preliminary Analysis
6) Analysis - 1: Check against random order
7) Analysis - 2: Positional analysis of frequent sign combinations
8) Text Beginners and Text Enders
9) Segmentation of Indus Texts
10) Summary
Note: In the lecture, unless specified otherwise, all text examples are from Mahadevan 1977 and all images are from Parpola’s UNESCO volumes of Indus seals.
1) Indus Script: An Overview
Indus Valley Civilization
From Mahadevan, 1977
Indus Script: Pointers to understand
The Indus script is one of the few scripts that defy decipherment.
Inscriptions are found only on small objects like seals.
The inscriptions are very brief: average length 4-5 signs.
There are only 417 signs in the script as per Mahadevan’s Concordance (1977).
The script is pictographic, with signs showing humans, fish, etc.
Signs are modified by joining or by adding strokes, and many signs appear as combinations of other simple signs.
The direction of the script is variable (mostly right to left: 83% of the time).
In general, the seals are 1 to 2 square inches in size.
There are no bilingual texts to aid decipherment.
Direction indicators of the script
Cramping or overflow of signs at the left end
Orientation of asymmetric signs
Sequence of frequent combinations of signs
Split sequences
A split sequence indicating direction
Scale of a typical seal
For the most part, seals are between 1 inch and 2 inches square.
From Professor John C. Huntington’s ppt
SEAL
SEAL IMPRESSION
From Professor John C. Huntington’s ppt
Specimens of Indus Texts on different objects
From Mahadevan, 1977
Text No. Text
Indus Script
Scientists from a variety of disciplines have attempted to read the Indus script with no clear answer.
Various attempts so far include:
I. Mahadevan’s analytical work: creation of the first published Concordance (1977)
Gift Siromoney’s statistical work
A. Parpola’s comparison with Dravidian
Russian group’s comparison with Dravidian
Subbarayappa’s interpretation as pure numerals
S. R. Rao’s interpretation as Vedic literature
Others (Ref. Possehl, 1996)
We make no assumption about its content or meaning.
Our first emphasis is to attempt to WRITE IN THE SCRIPT RATHER THAN READ.
We search for rules of writing without assigning meanings or interpretations.
We ignore variation due to archaeological context of sites, stratigraphy and type of objects.
Dataset
An unambiguous data subset (EBUDS) was created for analysis of the grammar of Indus writing, from the original electronic dataset of Mahadevan (1977), partially modified as M80.
EBUDS (Extended Basic Unique Dataset) excludes:
All ambiguous lines
All texts from sides having multiple lines
All duplicates (keeping a single occurrence of each)
Thus, EBUDS consists of 1548 lines of texts, with 7000 sign occurrences.
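The three exclusion rules can be sketched in a few lines of Python. This is only an illustration: the record layout (`TextLine`, with an `ambiguous` flag and a `lines_on_side` count) is a hypothetical stand-in, not the actual format of the M77 or M80 electronic dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextLine:
    signs: tuple          # sign numbers of one line of text
    ambiguous: bool       # hypothetical flag: line contains a doubtful sign
    lines_on_side: int    # hypothetical field: number of lines on that side


def build_unique_subset(records):
    """Apply the three exclusions: ambiguous lines, multi-line sides, duplicates."""
    seen = set()
    kept = []
    for rec in records:
        if rec.ambiguous or rec.lines_on_side > 1:
            continue                     # rules 1 and 2
        if rec.signs in seen:
            continue                     # rule 3: keep a single occurrence
        seen.add(rec.signs)
        kept.append(rec.signs)
    return kept


sample = [
    TextLine((342, 8, 171), False, 1),
    TextLine((342, 8, 171), False, 1),   # duplicate, dropped
    TextLine((99, 267), True, 1),        # ambiguous, dropped
    TextLine((211, 89, 336), False, 2),  # side with two lines, dropped
    TextLine((123, 293), False, 1),
]
subset = build_unique_subset(sample)
print(subset)
```

On the real data the same filter would be expected to leave the 1548 lines quoted above, but that depends on how ambiguity and multi-line sides are actually encoded.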
5) Preliminary Analysis
Frequency distribution of Indus Signs
Frequency      In M77                                     Present Work (EBUDS)
range in M77   No. of   Total sign    Total sign          No. of   Total sign    Total sign
               signs    occurrences   occurrences (%)     signs    occurrences   occurrences (%)
>1000          1        1395          10.43               1        715           10.21
999-500        1        649           4.85                1        377           5.39
499-100        31       6344          47.44               31       3230          46.14
99-50          34       2381          17.81               34       1243          17.76
49-10          86       1833          13.71               86       975           13.93
9-2            152      658           4.92                152      388           5.54
1              112      112           0.84                72       72            1.03
0              0        -             -                   40       -             -
Total          417      13372         100.00              417      7000          100.00
Only 67 signs (16% of total no. of signs) account for over 80% of the writing.
Conclusions from Preliminary Analysis
The frequency distribution of the signs in EBUDS is consistent with M77.
The manner of choosing the data set has not changed the pattern of occurrence of various signs and the results are consistent with the analysis of M77.
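Only 67 signs (16% of the total number of signs) account for about 80% of the writing: 80.5% of sign occurrences in M77 and 79.5% in EBUDS, summing the four most frequent ranges of the frequency table.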
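The 67-sign figure can be checked directly from the frequency table. The sketch below only re-adds the four most frequent rows of that table (signs occurring 50 or more times in M77); the variable and function names are illustrative.

```python
# (frequency range in M77, no. of signs, occurrences in M77, occurrences in EBUDS)
TOP_ROWS = [
    (">1000", 1, 1395, 715),
    ("999-500", 1, 649, 377),
    ("499-100", 31, 6344, 3230),
    ("99-50", 34, 2381, 1243),
]
TOTAL_SIGNS = 417
TOTAL_M77 = 13372
TOTAL_EBUDS = 7000


def coverage(rows, column, total_occurrences):
    """Return (number of signs, share of signs, share of occurrences)."""
    n_signs = sum(row[1] for row in rows)
    occurrences = sum(row[column] for row in rows)
    return n_signs, n_signs / TOTAL_SIGNS, occurrences / total_occurrences


n_signs, sign_share, m77_share = coverage(TOP_ROWS, 2, TOTAL_M77)
_, _, ebuds_share = coverage(TOP_ROWS, 3, TOTAL_EBUDS)
print(n_signs, round(100 * sign_share, 1), round(100 * m77_share, 2),
      round(100 * ebuds_share, 2))
```

The sums give 67 signs (about 16% of 417) covering 10769 of 13372 occurrences in M77 (about 80.5%) and 5565 of 7000 in EBUDS (79.5%).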
6) Analysis 1: Check against Random Order
Methodology
We take 1548 unique texts (7000 signs) present in EBUDS.
We randomise their appearance keeping the frequency of each sign as in EBUDS.
We split this long random string (of 7000 signs) into texts of 1 to 14 signs as in EBUDS.
We create 10 such random databases.
We then compare the frequency of their sign pairs, triplets etc. with Genuine Indus database (EBUDS) to check if Indus texts have any significant sequencing.
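A minimal sketch of this randomisation check, on toy data rather than EBUDS. It assumes that "texts of 1 to 14 signs as in EBUDS" means the shuffled string is cut back into the same text lengths as the original corpus; the function names are illustrative.

```python
import random
from collections import Counter


def shuffle_corpus(texts, rng):
    """Shuffle all signs together, then cut them back into the original text lengths."""
    signs = [s for text in texts for s in text]
    rng.shuffle(signs)                       # sign frequencies are preserved
    shuffled, start = [], 0
    for text in texts:
        shuffled.append(tuple(signs[start:start + len(text)]))
        start += len(text)
    return shuffled


def most_frequent_ngram_count(texts, n):
    """Frequency of the most frequent run of n consecutive signs."""
    counts = Counter(
        tuple(text[i:i + n])
        for text in texts
        for i in range(len(text) - n + 1)
    )
    return max(counts.values(), default=0)


toy_texts = [(1, 2, 3), (1, 2), (4, 1, 2), (5,)]
rng = random.Random(0)
random_sets = [shuffle_corpus(toy_texts, rng) for _ in range(10)]

genuine_pairs = most_frequent_ngram_count(toy_texts, 2)
random_pairs = [most_frequent_ngram_count(rs, 2) for rs in random_sets]
print(genuine_pairs, sum(random_pairs) / len(random_pairs))
```

Comparing the genuine count with the mean over the ten shuffled sets, for n = 2, 3, 4 and so on, is the comparison the methodology describes.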
Comparison of EBUDS with Random Datasets
Frequency of most frequent sign combination
No. of signs in    Random data set                                            EBUDS
the combination    1    2    3    4    5    6    7    8    9    10   Mean
2                  60   54   62   51   57   56   63   66   58   56   58.3     168
3                  5    3    3    4    3    5    7    5    5    3    4.3      34
4                  1    1    1    2    1    1    2    2    1    1    1.3      16
5                  1    1    1    1    1    1    1    1    1    1    1        4
6                  1    1    1    1    1    1    1    1    1    1    1        2
Result of Analysis 1
[Figure: Most frequent sign combination frequency vs. no. of signs in the combination. x-axis: No. of signs in the combination; y-axis: Frequency of most frequent combination; series: Random Datasets (Mean), Genuine Indus Dataset]
Conclusions from Analysis 1
String lengths of 2, 3 and 4 signs appear with frequency far higher than expected by random chance.
The signs are ordered in a specific manner.
It is justifiable to state that Indus texts followed certain rules and thereby meant something significant and meaningful.
7) Analysis 2: Positional analysis of Frequent Sign Combinations
Positional Analysis of Frequent Two-sign Combinations
Two-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
99 267                 168         0.60       1.79       11.90        85.71
89 336                 75          0.00       0.00       89.33        10.67
176 342                59          0.00       96.61      3.39         0.00
342 8                  58          1.72       72.41      25.86        0.00
99 391                 56          0.00       0.00       8.93         91.07
342 347                56          0.00       89.29      10.71        0.00
1 342                  48          0.00       89.58      10.42        0.00
123 293                40          0.00       0.00       0.00         100.00
59 87                  39          0.00       0.00       79.49        20.51
342 48                 38          2.63       52.63      28.95        15.79
59 171                 36          0.00       0.00       80.56        19.44
162 249                34          0.00       0.00       85.29        14.71
211 89                 34          0.00       91.18      8.82         0.00
245 245                33          0.00       60.61      21.21        18.18
211 59                 31          0.00       90.32      9.68         0.00
67 65                  27          0.00       0.00       74.07        25.93
130 51                 27          0.00       7.41       70.37        22.22
67 99                  26          0.00       0.00       100.00       0.00
342 162                25          4.00       84.00      12.00        0.00
343 123                25          0.00       0.00       100.00       0.00
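The Solo, Left, Middle and Right percentages in a table like this could be computed along the following lines. This is a sketch on toy texts, and it assumes each text is stored as a sequence of sign numbers in left-to-right visual order; the function names are illustrative.

```python
from collections import Counter


def classify(text, combo):
    """Yield a position label for every occurrence of combo inside text."""
    n, m = len(text), len(combo)
    for i in range(n - m + 1):
        if tuple(text[i:i + m]) != tuple(combo):
            continue
        if m == n:
            yield "solo"      # the combination is the whole text
        elif i == 0:
            yield "left"      # starts at the left end
        elif i + m == n:
            yield "right"     # finishes at the right end
        else:
            yield "middle"


def positional_profile(texts, combo):
    """Percentage of occurrences of combo in each position."""
    counts = Counter(label for text in texts for label in classify(text, combo))
    total = sum(counts.values())
    return {
        label: (100.0 * counts[label] / total if total else 0.0)
        for label in ("solo", "left", "middle", "right")
    }


toy_texts = [(342, 8), (342, 8, 171), (5, 342, 8, 171), (99, 342, 8)]
profile = positional_profile(toy_texts, (342, 8))
print(profile)
```

The same function works unchanged for three-sign and four-sign combinations.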
Positional Analysis of Frequent Three-sign Combinations
Three-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
211 89 336               34          2.94       88.24      5.88         2.94
343 123 293              25          0.00       0.00       0.00         100.00
342 162 249              24          4.17       83.33      8.33         4.17
342 169 249              20          5.00       70.00      20.00        5.00
342 8 171                19          5.26       73.68      5.26         15.79
149 130 51               19          0.00       0.00       78.95        21.05
59 87 99                 16          0.00       0.00       100.00       0.00
342 87 403               16          6.25       81.25      6.25         6.25
342 149 130              16          0.00       75.00      25.00        0.00
67 99 267                14          0.00       0.00       7.14         92.86
87 99 267                14          0.00       0.00       21.43        78.57
89 336 72                14          0.00       0.00       85.71        14.29
65 99 267                12          0.00       0.00       8.33         91.67
342 244 67               12          8.33       66.67      8.33         16.67
15 389 178               11          9.09       72.73      0.00         18.18
59 171 53                10          0.00       0.00       60.00        40.00
245 245 25               10          10.00      90.00      0.00         0.00
Positional Analysis of Frequent Four-sign Combinations
Four-sign Combination   Frequency   Solo (%)   Left (%)   Middle (%)   Right (%)
342 149 130 51          16          6.25       68.75      6.25         18.75
59 87 99 267            9           0.00       0.00       33.33        66.67
89 336 59 171           6           0.00       0.00       83.33        16.67
15 389 178 98           5           0.00       100.00     0.00         0.00
342 53 230 175          5           20.00      80.00      0.00         0.00
342 169 249 65          5           20.00      20.00      20.00        40.00
211 89 336 72           5           0.00       80.00      0.00         20.00
Conclusions from Positional analysis
The most frequent two-sign, three-sign and four-sign combinations appear at fixed positions.
The exact location varies from combination to combination.
However, frequently occurring two-sign, three-sign and four-sign combinations may be incomplete, except of course when they occur as solo texts.
The two-sign, three-sign and four-sign combinations that are complete typically have one of the text-enders (mostly 342 or 211) at the end. This is confirmed by the solo occurrences of such texts.
8) Text Beginners and Text Enders
Indus Text Beginners and Enders
[Figure: Enders and Beginners (EBUDS). x-axis: Number of Signs (0 to 200); y-axis: Fractional Cumulative Frequency; series: Enders, Beginners]
Consider an Indus Text with Signs
G F E D C B A
(In order of their statistical significance)
Frequent Text Enders
Frequent Text Beginners
Specimens of Indus Texts illustrating syntactical patterns
From Mahadevan (1986)
Conclusions for Indus Script
There are well defined text-enders though text-beginners are not that well-defined.
Sign distribution within the strings seems to be ordered as per some specific rules. The distribution is far more significant than would arise by chance.
This indicates existence of patterns and rules that need to be dug out.
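One way to quantify how well defined the enders and beginners are is a fractional cumulative frequency curve for each end of the text. The sketch below uses toy texts and assumes each text is stored in left-to-right visual order, so that for a text read right to left the ender is the leftmost sign and the beginner the rightmost; that convention and the function names are assumptions made for illustration.

```python
from collections import Counter


def cumulative_fraction(signs):
    """Cumulative share of occurrences covered by the 1, 2, 3, ... most frequent signs."""
    counts = Counter(signs)
    total = sum(counts.values())
    curve, running = [], 0
    for _, count in counts.most_common():
        running += count
        curve.append(running / total)
    return curve


def ender_beginner_curves(texts):
    """Curves for text-enders (leftmost sign) and text-beginners (rightmost sign)."""
    enders = [text[0] for text in texts if text]
    beginners = [text[-1] for text in texts if text]
    return cumulative_fraction(enders), cumulative_fraction(beginners)


toy_texts = [(342, 8, 267), (342, 5, 99), (342, 9, 267), (211, 7, 391)]
ender_curve, beginner_curve = ender_beginner_curves(toy_texts)
print(ender_curve, beginner_curve)
```

A curve that rises faster means fewer distinct signs account for most of that position, which is the sense in which the enders are better defined than the beginners.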
9) Segmentation of Indus Texts
Segmentation Approach
Various methods can be used for segmenting an Indus text, namely:
Comparing texts
Using frequent combinations of signs
Using pair frequencies
Using single signs (Enders, Beginners, Auxiliary Enders)
These methods overlap, so we select an approach that takes into account the effect of each of them. A cumulative method based on statistically significant units is thus formulated.
Segmentation Process
INDUS TEXT
1. Look for pair, triplet and quad texts successively (55% split)
2. Look for frequent 4, 3 and 2 sign combinations successively (77% split)
3. Look for Enders, Beginners and Auxiliary Enders successively (88% split)
TEXT SEGMENTS
Percent of texts split (for texts of 5 or more signs)
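A staged, greedy segmentation in the spirit of this process can be sketched as follows. It is a simplification, not the procedure used in the study: every stage is given as a list of known units (tuples of sign numbers), longer units are tried first within a stage, and single signs such as enders can be passed as one-sign units. The example units are taken from the combination tables only for illustration.

```python
def segment(text, unit_stages):
    """Split text into segments by claiming known units stage by stage."""
    n = len(text)
    owner = [None] * n                # which matched unit covers each position
    next_id = 0
    for units in unit_stages:
        for unit in sorted(units, key=len, reverse=True):
            m = len(unit)
            for i in range(n - m + 1):
                free = all(o is None for o in owner[i:i + m])
                if free and tuple(text[i:i + m]) == tuple(unit):
                    for j in range(i, i + m):
                        owner[j] = next_id
                    next_id += 1
    # Group consecutive positions with the same owner; unclaimed runs stay together.
    segments, current, current_owner = [], [], None
    for sign, o in zip(text, owner):
        if current and o == current_owner:
            current.append(sign)
        else:
            if current:
                segments.append(tuple(current))
            current, current_owner = [sign], o
    if current:
        segments.append(tuple(current))
    return segments


short_texts = [(99, 267)]                 # stage 1: units seen as whole texts
frequent_combos = [(342, 149, 130, 51)]   # stage 2: frequent combinations
single_signs = []                         # stage 3: enders, beginners (none here)

text = (342, 149, 130, 51, 59, 87, 99, 267)
segments = segment(text, [short_texts, frequent_combos, single_signs])
print(segments)
```

Here an eight-sign text is split into three segments of length 4, 2 and 2; the unmatched signs 59 and 87 are left together as a residual segment.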
[Figure: Segment length vs. segment frequency in EBUDS before and after segmentation. x-axis: Text or Segment length (1 to 14); y-axis: Number of Texts or Segments; series: EBUDS before segmentation, EBUDS after segmentation]
EBUDS before and after segmentation
EBUDS before Segmentation EBUDS after Segmentation
Conclusions from segmentation
It is possible to segment 88% of Indus texts of length 5 and above into segments of length 4 and below by using statistically significant signs and their combinations in addition to all the texts of length 2, 3 and 4.
Many frequent sign combinations make their appearance as independent texts.
The Indus texts after segmentation can be viewed as permutations of the identifiable units (segments) of 2, 3 or 4 signs.
The identifiable units may or may not be standalone (or complete) pieces of information.
Summary
The writing is highly ordered.
The typical length of information-containing units is 2, 3 or at most 4 signs.
However, they are not always complete enough to exist as standalone pieces of text.
This suggests a more complex grammar in the writing where information units need proper beginners or enders.
The present study shows that Indus writing seems to have specific ordering as would be expected if sophisticated information is coded. This is consistent with the general level of sophistication associated with the Indus culture.