E N I D - Roma, September 7th-9th, 2011
Upcoming concepts in a specific scientific discipline:
an analysis based on a categorisation of the related terminology
Marianne Hörlesberger, Beatrix Wepner, Edgar Schiebel
Ivana Roche, Christine Louala, Claire François, Nathalie Antonot, Dominique Besagni
Georg Vorlaufer
Summary
• Objective
• Data collection
• Methodology
  – Statistical evaluation
  – Diffusion model
  – Field indicators
• Discussion & Perspectives
Objective
Produce a methodology for analysing the evolution of a specific scientific domain by studying its terminology, extracted from the related specialized international literature
Data collection
• Data source: PASCAL database
• Period: 2001 to 2011
• Corpus: 19,090 bibliographic references
• 136 identified fields, reduced to 19 final fields through
  – examination of classification categories
  – analysis by scientific experts
• The 19 final fields represent the whole Tribology domain (without loss of information)
[Diagram: the Tribology domain, made up of TRIBOLOGY itself, the fields applying Tribology issues, and the fields facilitating new issues in Tribology]
Statistical evaluation of the fields (1/2)
Selection, among the 19 defined fields, of the most dynamic ones, i.e. those showing a steady and consistent development over time, by studying the evolution of their annual productivity:
• by defining 2 growth indexes giving a straightforward comparison of the two endpoints of the period under observation, namely 2001 to 2011 and 2001 to 2005
• by calculating the annual average growth rate, which takes yearly changes into account
• by introducing the Sharpe Ratio, which, unlike the two growth indexes, takes into account the homogeneity or stability of the growth
Production of a composite indicator from the equally weighted rankings of the individual indicators: the meta-ranking
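The four indicators and the meta-ranking could be computed along the following lines. This is a minimal sketch, not the authors' implementation: the slides do not give the formulas, so the growth index is assumed to be the end-year productivity divided by the start-year productivity, the Sharpe ratio is assumed to be the mean yearly growth rate divided by its standard deviation, and the composite indicator is assumed to be the plain sum of the four ranks. The field names and counts are invented for illustration.

```python
from statistics import mean, stdev

def growth_index(counts, start, end):
    """Assumed definition: productivity in `end` divided by productivity in `start`."""
    return counts[end] / counts[start]

def yearly_growth_rates(counts):
    """Relative change in productivity from each year to the next."""
    years = sorted(counts)
    return [counts[b] / counts[a] - 1 for a, b in zip(years, years[1:])]

def indicators(counts, first, mid, last):
    rates = yearly_growth_rates(counts)
    return {
        "growth_full": growth_index(counts, first, last),
        "growth_early": growth_index(counts, first, mid),
        "avg_growth": mean(rates),
        # Assumed Sharpe-like ratio: mean growth per unit of growth volatility.
        "sharpe": mean(rates) / stdev(rates),
    }

def rank_desc(values):
    """Rank 1 goes to the highest value; tied values share the best rank."""
    ordered = sorted(values.values(), reverse=True)
    return {name: ordered.index(v) + 1 for name, v in values.items()}

def composite_indicator(fields, first, mid, last):
    """Assumed composite: equally weighted sum of the four indicator ranks."""
    per_field = {f: indicators(c, first, mid, last) for f, c in fields.items()}
    totals = {f: 0 for f in fields}
    for name in ("growth_full", "growth_early", "avg_growth", "sharpe"):
        ranks = rank_desc({f: per_field[f][name] for f in fields})
        for f, r in ranks.items():
            totals[f] += r
    return totals

# Invented yearly reference counts for two hypothetical fields.
toy_fields = {
    "field_a": {2001: 10, 2002: 12, 2003: 15, 2004: 20},
    "field_b": {2001: 10, 2002: 9, 2003: 12, 2004: 11},
}
scores = composite_indicator(toy_fields, first=2001, mid=2002, last=2004)
```

Under these assumptions a lower composite value corresponds to a better meta-ranking. A field with perfectly constant growth would have a zero standard deviation, so a real implementation would need to handle that case explicitly.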
Statistical evaluation of the fields (2/2)
The analysis of these results by a scientific expert did not lead to the removal of any field
Field ID  Field name                                          Composite indicator  Meta-ranking
13        General mechanical engineering and machine design   16                   1
9         Polymers                                            24                   2
10        Metals: production techniques and joining           27                   3
4         Mechanics of solids. Solid Earth physics            29                   4
11        Corrosion                                           36                   5
6         Chemistry                                           38                   6
5         Material science                                    38                   7
15        Drives                                              38                   8
7         Energy. Electric power engineering                  39                   9
14        Machine components. Friction, wear, lubrication     39                   10
2         Metrology                                           40                   11
17        Precision engineering                               40                   12
12        Metals: mechanical properties, tribology            43                   13
18        Buildings. Public works. Transportation             46                   14
3         Condensed matter                                    46                   15
19        Biological and medical sciences                     50                   16
16        Engines. Pumps. Steel design                        53                   17
1         General physics                                     58                   18
8         Electronics. Information theory                     60                   19
Complementary assessment by clustering
[Figure: clustering result with three labelled groups: Tribological properties of materials; Industrial applications; Lubricants]
Diffusion model – A heuristic approach (1/2)
In-depth analysis of the evolution of the fields, evaluating the status of a term in a given field by measuring its degree of diffusion:
• Stage 1: unusual terms index few publications. New terms, and imported terms well known in other fields, form a set of strongly exotic terms
• Stage 2: established terms can begin to diffuse to other fields; the number of occurrences begins to grow
• Stage 3: cross-section terms, highly established, show a broad diffusion in other fields. This is the stage with the highest maturity; the number of occurrences grows heavily
• Stage 4: a new paradigm occurs and the number of publications about the “old” invention declines
[Figure: number of bibliographic references (number of articles) plotted against time, with Stages 1 to 4 marked along the curve]
Diffusion model – A heuristic approach (2/2)
• Definition of Home Technology terms (HT terms)
  – keywords that are specific to a field occur with a higher probability in that field than in the others
  – the probability is defined as a relative term frequency (rtfField): the frequency of a term in a field divided by the number of bibliographic references in this field
  – for each term, after calculating its rtfField in each field, the field with the highest value is declared to be its Home Technology
In fine: each field gets the list of its HT terms, and each term gets a Home Technology
• Use of the Gini index as a measure of the dispersion of a term in a scientific domain:
  – Gini index = 0 means a completely uniform distribution and indicates that the term occurs in all 19 considered fields of the Tribology domain
  – Gini index = 1 tells us that the term is limited to the single field where it appears
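The two definitions above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the slides do not say which Gini formula is used, so the sketch takes the mean-absolute-difference form rescaled by n/(n-1), which is the variant that reaches exactly 1 when a term occurs in a single field, and it is assumed that the index is computed over the per-field rtfField values. Field names and counts are invented.

```python
def rtf_by_field(term_counts, field_sizes):
    """rtfField: occurrences of the term in a field / references in that field."""
    return {f: term_counts.get(f, 0) / size for f, size in field_sizes.items()}

def home_technology(rtf):
    """The field where the term has its highest relative term frequency."""
    return max(rtf, key=rtf.get)

def gini_index(values):
    """Assumed variant: Gini coefficient rescaled so that a term present in
    exactly one field scores 1 and a uniformly spread term scores 0."""
    n = len(values)
    total = sum(values)
    if n < 2 or total <= 0:
        raise ValueError("need at least two fields and a positive total")
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * (n - 1) * total)

# Invented data: three hypothetical fields and one term.
field_sizes = {"field_a": 100, "field_b": 50, "field_c": 200}
term_counts = {"field_a": 10, "field_b": 10}

rtf = rtf_by_field(term_counts, field_sizes)
home = home_technology(rtf)
dispersion = gini_index(list(rtf.values()))
```

In this toy case the term occurs as often in `field_a` as in `field_b`, but `field_b` is smaller, so its relative frequency is higher and it becomes the Home Technology.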
Categorizing the HT terms
For each field, its HT terms are classified into 4 categories:
• the terms occurring once
  – analyzed with a diachronic approach that distinguishes the “old” concepts, appearing at the beginning of the studied period, from the very “new” ones, appearing in its latest years
• the remaining terms whose Gini index is lower than or equal to a fixed threshold, considered as cross-section
• the remaining terms whose relative term frequency in the field (rtfField) is greater than a fixed threshold, considered as established
• the last terms, considered as unusual
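The four rules above are applied in order, which a short function makes explicit. This is a sketch only: the slides do not give the threshold values, so `gini_max` and `rtf_min` are hypothetical parameters and the numbers in the example are invented.

```python
def categorize(occurrences, gini_index, rtf_field, gini_max, rtf_min):
    """Assign an HT term to one of the four categories, testing the rules in
    the order given on the slide. Thresholds are supplied by the caller."""
    if occurrences == 1:
        return "mono-occurrence"
    if gini_index <= gini_max:
        return "cross-section"
    if rtf_field > rtf_min:
        return "established"
    return "unusual"

# Hypothetical thresholds, chosen only for the example.
GINI_MAX = 0.5
RTF_MIN = 0.01

examples = [
    categorize(1, 1.0, 0.002, GINI_MAX, RTF_MIN),   # occurs once
    categorize(40, 0.3, 0.050, GINI_MAX, RTF_MIN),  # widely dispersed
    categorize(25, 0.8, 0.030, GINI_MAX, RTF_MIN),  # concentrated and frequent
    categorize(3, 0.9, 0.004, GINI_MAX, RTF_MIN),   # concentrated and rare
]
```

Because the Gini test comes before the frequency test, a widely dispersed term is labelled cross-section even when its rtfField is also above the frequency threshold.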
Results: HT terms occurring once by field
HT terms occurring once, by field

ID number & Name of fields                             Total number  Appearing in 2010-2011 (number)  Appearing in 2010-2011 (%)
01-General physics                                     214           10                               5%
02-Metrology                                           129           22                               17%
03-Condensed matter                                    232           22                               9%
04-Mechanics of solids                                 89            20                               22%
05-Material sciences                                   176           3                                2%
06-Chemistry                                           253           40                               16%
07-Energy. Electrical power engineering                247           33                               13%
08-Electronics. Information theory                     136           7                                5%
09-Polymers                                            199           20                               10%
10-Metals: production techniques and joining           164           10                               6%
11-Corrosion                                           133           17                               13%
12-Metals: mechanical properties, tribology            98            4                                4%
13-General mechanical engineering and machine design   86            8                                9%
14-Machine components. Friction, wear, lubrication     162           11                               7%
15-Drives                                              56            8                                14%
16-Engines. Pumps. Steel design                        109           4                                4%
17-Precision engineering                               128           20                               16%
18-Buildings. Public works. Transportation             337           25                               7%
19-Biological and medical sciences                     253           22                               9%
Three fields have the highest rates of HT terms occurring once in the last two years of the considered period: “Mechanics of solids”, “Metrology” and “Precision engineering”
Results: HT terms occurring once in « Mechanics of solids » by year
[Bar chart: number of HT terms occurring once, per year]
Year    2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011
Terms   10    2     11    6     11    11    9     4     5     19    1
HT terms of « Precision engineering » by category
Mono-occurrence term     Year  Unusual                 Established                             Cross-section
Electronic component     2011  Spherical bearing       Silicon                                 Assembly
Thermodynamic analysis   2011  Knudsen number          Rough surface                           Quality
Atomization              2010  Involute tooth          Microelectromechanical device           Manufacturing
Castor oil               2010  Limit cycle             Gas bearing                             Hostile environment
Cyclic copolymer         2010  Differential pressure   Wafer                                   Array
Deformation mode         2010  Dissimilar materials    Externally pressurized gas lubrication  Aspect ratio
Elasticity theory        2010  Forming limit curve     Micromachine                            Manufacturing process
Linear machine           2010  Stochastic process      Single crystal                          Design
Nanodot                  2010  Ultrasonic machining    Precision engineering                   Flexibility
Nanoplatelet             2010  Ball screw              Polycrystal                             Thin sheet
Non destructive method   2010  Turbo pump              Electrical discharge machining          Suspension
Olefin copolymer         2010  Chemical etching        Electromechanical device                Surface cleaning
Plate electrode          2010  Potassium               Relative humidity                       Production process
SAN                      2010  Gaussian distribution   Mechanical polishing                    Starting
Selective etching        2010  Copper alloy            Stiction                                Positioning
Sheet electrode          2010  Axial speed             Diamond tool                            Reproducibility
Shock wave               2010  Surface fatigue         Polysilicon                             GaP
Stepping motor           2010  Adhesion work           Chemical polishing                      Process control
Synchronous motor        2010  Crystal face            Hydrostatic bearings                    Repeatability
Water corrosion          2010  Damper                  Theoretical model                       Plate
Fields’ terminology indicators (1/3)
To characterize the terminology used in each field, we define a set of indicators.
Indicators dealing with strictly local characteristics of the field terminology:
• diversity, defined as the ratio between the number of indexing keywords of its bibliographic references and its productivity. The higher this value, the more diverse the terms used
• specificity, corresponding to the proportion of its HT terms with respect to the number of indexing keywords of its bibliographic references. The lower this value, the larger the share of the field terminology imported from other fields
Fields’ terminology indicators (2/3)
Indicators taking into account the field’s term exchanges with the other fields:
• singularity, defined as the proportion of its HT terms without any occurrence in the other fields. The lower this value, the fewer of these so-called lonesome terms, i.e. HT terms neither exported nor imported, occurring exclusively in the field
• normalized trade balance, giving a measure of the balance between the HT terms exported by the field and the terms it imports from other fields. The values range from -1 (the field exports none of its HT terms) to 1 (the field imports no term), and a value of zero means a trade balance in perfect equilibrium
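Both exchange indicators can be derived from two mappings, as the sketch below illustrates. It rests on assumptions that the slides do not spell out: an HT term counts as exported when it occurs in at least one other field, a term counts as imported when it occurs in the field but has its Home Technology elsewhere, and the normalized trade balance is taken as (exported - imported) / (exported + imported), which is consistent with the stated range and endpoints. The vocabulary is invented.

```python
def exchange_indicators(field, vocab, home):
    """Return (singularity, normalized trade balance) for `field`.

    vocab: {field: set of terms occurring in that field}
    home:  {term: the field declared as its Home Technology}
    """
    ht_terms = {t for t in vocab[field] if home[t] == field}
    elsewhere = set().union(*(terms for f, terms in vocab.items() if f != field))
    exported = ht_terms & elsewhere   # HT terms also used by other fields
    lonesome = ht_terms - elsewhere   # HT terms used nowhere else
    imported = vocab[field] - ht_terms
    singularity = len(lonesome) / len(ht_terms)
    # Assumed formula; undefined when the field neither exports nor imports.
    balance = (len(exported) - len(imported)) / (len(exported) + len(imported))
    return singularity, balance

# Invented vocabulary for two hypothetical fields.
vocab = {
    "field_a": {"wear", "friction", "seal", "alloy", "bearing"},
    "field_b": {"friction", "alloy", "weld", "bearing"},
}
home = {
    "wear": "field_a", "friction": "field_a", "seal": "field_a",
    "alloy": "field_b", "weld": "field_b", "bearing": "field_b",
}

sing_a, bal_a = exchange_indicators("field_a", vocab, home)
sing_b, bal_b = exchange_indicators("field_b", vocab, home)
```

Here `field_a` exports one term and imports two, so its balance is negative, while `field_b` exports two and imports one.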
Fields’ terminology indicators (3/3)
• diffusion capacity, given by the average of the Gini index of all the HT terms. An average Gini index equal to 0 means a completely uniform distribution of all HT terms and indicates that each HT term occurs in all the other fields. An average Gini index of 1 tells us that all HT terms are very specific and occur only in this field
• impact, defined as a Hirsch index. We say that a field has an h-index of X if X of its HT terms are imported by at least X other fields. These X keywords form the h-core list of the field. We can then consider, by analogy, that this Hirsch index can help appraise the influence of the considered field on the others
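The two indicators on this slide reduce to a mean and a rank-threshold search. The sketch below assumes that, for each HT term of a field, its Gini index and the number of other fields importing it are already known; the term names and values are invented.

```python
from statistics import mean

def diffusion_capacity(gini_by_term):
    """Average Gini index over all HT terms of a field."""
    return mean(gini_by_term.values())

def h_index(importers_by_term):
    """Largest X such that X HT terms are each imported by at least X fields."""
    counts = sorted(importers_by_term.values(), reverse=True)
    h = 0
    for position, n_fields in enumerate(counts, start=1):
        if n_fields >= position:
            h = position
        else:
            break
    return h

# Invented values for the HT terms of one hypothetical field.
gini_by_term = {"term_1": 0.25, "term_2": 0.75}
importers_by_term = {"term_1": 5, "term_2": 3, "term_3": 3, "term_4": 1}

capacity = diffusion_capacity(gini_by_term)
impact = h_index(importers_by_term)
```

With 19 fields in the study, a term can be imported by at most 18 other fields, which is why 18 is the ceiling for this h-index.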
Results: Trade balance indicator
ID number & Field name                                 Balance  Rank
06-Chemistry                                           -0.34    1
18-Buildings. Public works. Transportation             -0.49    2
11-Corrosion                                           -0.52    3
16-Engines. Pumps. Steel design                        -0.55    4
17-Precision engineering                               -0.57    5
09-Polymers                                            -0.68    6
07-Energy. Electrical power engineering                -0.70    7
13-General mechanical engineering and machine design   -0.71    8
19-Biological and medical sciences                     -0.72    9
08-Electronics. Information theory                     -0.76    10
02-Metrology                                           -0.78    11
03-Condensed matter                                    -0.78    12
10-Metals: production techniques and joining           -0.79    13
15-Drives                                              -0.79    14
01-General physics                                     -0.80    15
12-Metals: mechanical properties, tribology            -0.85    16
04-Mechanics of solids                                 -0.86    17
05-Material sciences                                   -0.89    18
14-Machine components. Friction, wear, lubrication     -0.99    19
range = [-1, 1]
- All the values of the trade balance indicator are negative, which means that every field imports more terms than it exports
- The calculation of the number of exported terms does not consider how many times each term is exported. When the number of exportations of each term is taken into account, half of the fields get a positive trade balance
Results: h-index
ID number & Field name                                 h-index  Rank
04-Mechanics of solids                                 18       1
09-Polymers                                            18       1
10-Metals: production techniques and joining           18       1
11-Corrosion                                           18       1
13-General mechanical engineering and machine design   18       1
15-Drives                                              18       1
16-Engines. Pumps. Steel design                        18       1
17-Precision engineering                               18       1
02-Metrology                                           17       9
03-Condensed matter                                    17       9
06-Chemistry                                           17       9
08-Electronics. Information theory                     17       9
12-Metals: mechanical properties, tribology            17       9
18-Buildings. Public works. Transportation             17       9
19-Biological and medical sciences                     17       9
05-Material sciences                                   16       16
07-Energy. Electrical power engineering                16       16
01-General physics                                     15       18
14-Machine components. Friction, wear, lubrication     6        19
- Eight fields reach the maximum possible value of the h-index, namely 18, which means that each of them exports 18 of its HT terms to all the other 18 fields considered in the study
- The average Gini index helps to qualify this observation: for instance, “Precision engineering” has an average Gini index of 0.66, meaning that this field exports its terms more uniformly than “Polymers”, whose average Gini index is 0.84
Results: Singularity indicator
ID number & Field name                                 Singularity  Rank
18-Buildings. Public works. Transportation             0.175        1
19-Biological and medical sciences                     0.150        2
07-Energy. Electrical power engineering                0.050        3
09-Polymers                                            0.046        4
14-Machine components. Friction, wear, lubrication     0.039        5
05-Material sciences                                   0.037        6
01-General physics                                     0.034        7
03-Condensed matter                                    0.031        8
08-Electronics. Information theory                     0.024        9
10-Metals: production techniques and joining           0.022        10
12-Metals: mechanical properties, tribology            0.016        11
06-Chemistry                                           0.015        12
04-Mechanics of solids                                 0.010        13
15-Drives                                              0.007        14
02-Metrology                                           0.006        15
13-General mechanical engineering and machine design   0.004        16
16-Engines. Pumps. Steel design                        0.003        17
17-Precision engineering                               0.003        18
11-Corrosion                                           0.002        19
range = [0, 1]
- The singularity indicator gives the proportion of lonesome HT terms
- The high values obtained by “Buildings. Public works. Transportation” and “Biological and medical sciences” indicate their quite “independent” character in the context of this study. These fields are definitely located downstream in our defined Tribology domain and are thus considered applied fields
Discussion & Perspectives
• The results are available on a web server to facilitate their assessment by scientific experts from AC2T, providing
  – a direct link from a term to the related bibliographic reference(s), thus allowing the contextualization of the terminological information
• The in-depth analysis of the results by AC2T forms a “virtuous circle”; it
  – contributes to a better understanding of the evolution of the studied domain and its relationships in a multidisciplinary context
  – allows us to verify whether our approaches are complementary
  – helps assess and improve our methodology and generates new developments based on real needs
• Some improvements are under study:
  – developing an assisted terminology extraction step prior to indexing, in order to better represent the very recent concepts not yet introduced in our terminological reference tables
  – extending the diachronic analysis to the terms occurring more than once in a field
  – adding an a priori categorization of the terms known to belong, without doubt, to the “core” field
  – adding a stop list of contextual “general science” terminology
Thank you
[ivana.roche; christine.louala; nathalie.antonot; claire.francois; dominique.besagni]@inist.fr
[marianne.horlesberger; beatrix.wepner; edgar.schiebel]@ait.ac.at