
Volume 4 Number 14

EJTP Electronic Journal of Theoretical Physics

ISSN 1729-5254

http://www.ejtp.com March, 2007


Editor in Chief

A. J. Sakaji

EJTP Publisher, P. O. Box 48210, Abu Dhabi, UAE

Editorial Board

Co-Editor

Ignazio Licata, Foundations of Quantum Mechanics, Complex System & Computation in Physics and Biology, IxtuCyber for Complex Systems, Sicily, Italy

Wai-ning Mei, Condensed Matter Theory, Physics Department, University of Nebraska at Omaha, Omaha, Nebraska, USA

F.K. Diakonos, Statistical Physics, Physics Department, University of Athens, Panepistimiopolis GR 5784 Zographos, Athens, Greece

A. Abdelkader, Experimental Physics, Physics Department, Ajman University, Ajman, UAE

Tepper L. Gill, Mathematical Physics, Quantum Field Theory, Department of Electrical and Computer Engineering, Howard University, Washington, DC, USA

J. A. Maki, Applied Mathematics, School of Mathematics, University of East Anglia, Norwich NR4 7TJ, UK

Nicola Yordanov, Physical Chemistry, Bulgarian Academy of Sciences, BG-1113 Sofia, Bulgaria. Telephone: (+359 2) 724917, (+359 2) 9792546. E-mail: ndyepr[AT]bas.bg

S.I. Themelis, Atomic, Molecular & Optical Physics, Foundation for Research and Technology - Hellas, P.O. Box 1527, GR-711 10 Heraklion, Greece

T. A. Hawary, Mathematics, Department of Mathematics, Mu'tah University, P.O. Box 6, Karak, Jordan

Arbab Ibrahim, Theoretical Astrophysics and Cosmology, Department of Physics, Teachers' College, P.O. Box 4341, Riyadh 11491, Saudi Arabia

Sergey Danilkin, Instrument Scientist, The Bragg Institute, Australian Nuclear Science and Technology Organization, PMB 1, Menai NSW 2234, Australia. Tel: +61 2 9717 3338, Fax: +61 2 9717 3606

Robert V. Gentry, The Orion Foundation, P. O. Box 12067, Knoxville, TN 37912-0067, USA. E-mail: gentryrv[AT]orionfdn.org

Attilio Maccari, Nonlinear phenomena, chaos and solitons in classic and quantum physics, Technical Institute "G. Cardano", Via Alfredo Casella 3, 00013 Mentana RM, Italy

Beny Neta, Applied Mathematics, Department of Mathematics, Naval Postgraduate School, 1141 Cunningham Road, Monterey, CA 93943, USA

Haret Rosu, Theoretical Astrophysics, Instituto de Física, Universidad de Guanajuato, Apdo Postal E-143, León, Gto, Mexico

Jorge A. Franco Rodríguez, General Theory of Relativity, Av. Libertador Edificio Zulia P12 123, Caracas 1050, Venezuela

Leonardo Chiatti, Medical Physics Laboratory, ASL VT, Via S. Lorenzo 101, 01100 Viterbo (Italy). Tel: (0039) 0761 236903, Fax: (0039) 0761 237904

Zdenek Stuchlik, Relativistic Astrophysics, Department of Physics, Faculty of Philosophy and Science, Silesian University, Bezručovo nám. 13, 746 01 Opava, Czech Republic

Copyright © 2003-2007 Electronic Journal of Theoretical Physics (EJTP) All rights reserved

Table of Contents

1. On the Dynamics of a n-D Piecewise Linear Map. Zeraoulia Elhadj. p. 1

2. Flow of Unsteady Dusty Fluid under Varying Pulsatile Pressure Gradient in Anholonomic Co-ordinate System. B.J.Gireesha, C.S.Bagewadi and B.C.Prasanna Kumara. p. 9

3. Exact Solutions for Nonlinear Evolution Equations via Extended Projective Riccati Equation Expansion Methods. M A Abdou. p. 17

4. Evolutionary Neural Gas: A Model of Self Organizing Network from Input Categorization. Luigi Lella and Ignazio Licata. p. 31

5. Discrete Groups Approach to Non Symmetric Gravitation Theory. N.Mebarki, F.Khelili and J.Mimouni. p. 51

6. Quantization of the Scalar Field Coupled Minimally to the Vector Potential. W. I. Eshraim and N. I. Farahat. p. 61

7. A Generalized Option Pricing Model. J. P. Singh. p. 69

8. Derivation of the Radiative Transfer Equation inside a Moving Semi-Transparent Medium of Non Unit Refractive Index. Le Dez and H. Sadat. p. 87

9. Quantum Images and the Measurement Process. Fariel Shafee. p. 121

EJTP 4, No. 14 (2007) 1–8 Electronic Journal of Theoretical Physics

On the Dynamics of a n-D Piecewise Linear Map

Zeraoulia Elhadj∗

Department of Mathematics, University of Tebessa, (12000), Algeria.

Received 27 September 2006, Accepted 6 January 2007, Published 31 March 2007

Abstract: This paper derives sufficient conditions for the existence of chaotic attractors in a general n-D piecewise linear discrete map, along with the exact determination of its dynamics using the standard definition of the largest Lyapunov exponent. © Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Chaos, Discrete Mapping, Lyapunov Exponents

PACS (2006): 05.45.a, 95.10.Fh, 05.45.Ra

1. Introduction

Many works address the rigorous mathematical proof of chaos in discrete mappings (continuous or not). For example, chaos has been studied rigorously through control and anti-control schemes or through Lyapunov exponents, see for example [1-2-3-4-5-6], in order to prove the existence of chaos in n-dimensional discrete dynamical systems. The topic is relevant because a large number of physical and engineering systems are described by continuous or discontinuous piecewise linear maps [12-13], where the discrete-time state space is divided into two or more compartments with different functional forms of the map, separated by borderlines [14-15-16-17-18]. The theory for discontinuous maps is at a preliminary stage of development, with some progress reported for 1-D and n-D discontinuous maps in [19-20-21-22-23]; these results are restrictive and cannot be obtained in the general n-dimensional context [23].

This paper derives sufficient conditions for the existence of chaotic attractors in a general n-D piecewise linear discrete map, along with the exact determination of its dynamics using the standard definition of the Lyapunov exponents as the usual test for chaos.

In the following, we present the standard definition of the Lyapunov exponents for a discrete n-D mapping.


Theorem 1. (Lyapunov exponent): Consider the following n-D discrete dynamical system:

$$x_{k+1} = f(x_k), \quad x_k \in \mathbb{R}^n, \quad k = 0, 1, 2, \dots \tag{1}$$

where $f : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is the vector field associated with system (1). Let $J(x)$ be its Jacobian evaluated at $x$, and define the matrix:

$$T_r(x_0) = J(x_{r-1})\, J(x_{r-2}) \cdots J(x_1)\, J(x_0). \tag{2}$$

Moreover, let $J_i(x_0, r)$ be the modulus of the $i$th eigenvalue of the $r$th matrix $T_r(x_0)$, where $i = 1, 2, \dots, n$ and $r = 0, 1, 2, \dots$

The Lyapunov exponents of an n-D discrete time system are then defined by:

$$\omega_i(x_0) = \ln\left( \lim_{r \to +\infty} J_i(x_0, r)^{\frac{1}{r}} \right), \quad i = 1, 2, \dots, n. \tag{3}$$
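As an illustration of definition (3), the following minimal sketch estimates the exponents at a finite r for a 2-D map. It is not taken from the paper: the helper names `matmul2`, `eig_moduli2` and `lyapunov_exponents`, the restriction to 2x2 matrices, and the sample matrix `A` are all assumptions made for the example.

```python
import cmath
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eig_moduli2(m):
    """Moduli of the eigenvalues of a 2x2 matrix, largest first."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    # Take the larger-modulus root first, then use lam1 * lam2 = det
    # to avoid cancellation when the two roots differ by many orders.
    lam1 = (tr + root) / 2.0
    if abs(tr - root) > abs(tr + root):
        lam1 = (tr - root) / 2.0
    lam2 = det / lam1 if lam1 != 0 else 0.0
    return [abs(lam1), abs(lam2)]

def lyapunov_exponents(jacobians):
    """Finite-r estimate of (3): ln(J_i(x0, r)) / r, with T_r built as in (2).

    jacobians[k] is J(x_k); a zero eigenvalue modulus raises ValueError
    because its logarithm is undefined.
    """
    r = len(jacobians)
    t = [[1.0, 0.0], [0.0, 1.0]]
    for j in jacobians:
        t = matmul2(j, t)  # left-multiply: T = J(x_k) ... J(x_0)
    return [math.log(mod) / r for mod in eig_moduli2(t)]

# Hypothetical example: a linear map with constant Jacobian A.
A = [[2.0, 1.0], [0.0, 0.5]]
exps = lyapunov_exponents([A] * 20)
```

For a constant Jacobian the estimate reduces to the logarithms of the eigenvalue moduli of A, which is the situation exploited in the main result of the paper.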

2. The main result

Let us consider the following n-D map $f : D \to D$, $D \subset \mathbb{R}^n$, defined by:

$$x_{k+1} = f(x_k) = A_i x_k + b_i, \quad \text{if } x_k \in D_i, \quad i = 1, 2, \dots, m, \tag{4}$$

where $A_i = \left(a^i_{jl}\right)_{1 \le j,l \le n}$ and $b_i = \left(b^j_i\right)_{1 \le j \le n}$ are respectively $n \times n$ and $n \times 1$ real matrices for all $i = 1, 2, \dots, m$, $x_k = \left(x^j_k\right)_{1 \le j \le n} \in \mathbb{R}^n$ is the state variable, and $m$ is the number of disjoint domains into which $D$ is partitioned. Due to the shape of the vector field $f$ of the map (4), the domain $D$ can be divided into $m$ regions denoted by $(D_i)_{1 \le i \le m}$, and in each of these regions the map (4) is linear.

The Jacobian matrix of the map (4) is:

$$J(x_k) = \begin{cases} A_1, & \text{if } x_k \in D_1, \\ A_2, & \text{if } x_k \in D_2, \\ \;\vdots \\ A_m, & \text{if } x_k \in D_m. \end{cases} \tag{5}$$
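The map (4) and its Jacobian (5) can be sketched in a few lines. This is only an illustration under stated assumptions: the function `make_piecewise_linear_map`, the representation of each region by a membership test, and the two-region planar example (with $A_2$ obtained from $A_1$ by swapping the coordinates, so that the two matrices are similar) are hypothetical choices, not the author's construction.

```python
def make_piecewise_linear_map(pieces):
    """Build map (4) from a list of (contains, A, b) triples.

    contains(x) tests whether x lies in D_i; A is an n x n nested list
    and b a list of length n.
    """
    def region(x):
        for i, (contains, _, _) in enumerate(pieces):
            if contains(x):
                return i
        raise ValueError("x lies outside every region D_i")

    def step(x):
        # x_{k+1} = A_i x_k + b_i on the region containing x_k
        _, A, b = pieces[region(x)]
        n = len(x)
        return [sum(A[j][l] * x[l] for l in range(n)) + b[j] for j in range(n)]

    def jacobian(x):
        # Equation (5): the Jacobian is the matrix of the active region.
        return pieces[region(x)][1]

    return step, jacobian

# Hypothetical planar example with D_1 = {x1 < 0} and D_2 = {x1 >= 0}.
A1 = [[0.5, 1.0], [0.0, 0.25]]
A2 = [[0.25, 0.0], [1.0, 0.5]]   # A1 with the two coordinates swapped
pieces = [
    (lambda x: x[0] < 0.0, A1, [1.0, 0.0]),
    (lambda x: x[0] >= 0.0, A2, [-1.0, 0.0]),
]
step, jacobian = make_piecewise_linear_map(pieces)
x1 = step([-2.0, 4.0])
x2 = step(x1)
```

Keeping the region test separate from the affine data makes it easy to record which $D_i$ a trajectory visits at each iterate.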

In the following we compute analytically all the Lyapunov exponents of the map (4) and show that these exponents are the same in each of the linear regions $(D_i)_{1 \le i \le m}$ defined above. The essential idea of our proof is the assumption that the matrices $(A_i)_{1 \le i \le m}$ have the same eigenvalues, i.e. they are equivalent. Then, if one computes analytically a Lyapunov exponent of the map (4) in a region $D_i$ (which is the logarithm of the absolute value of an eigenvalue of the matrix $A_i$), one finds that these exponents are identical in each linear region $D_i$, for all $i \in \{1, 2, \dots, m\}$. Thus, one can consider the Jacobian matrix $J(x_k)$ of the map (4) as any matrix $A_i$, denoted by $A = (a_{jl})_{1 \le j,l \le n}$.


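Assume that the eigenvalues of $A$ are listed in order as follows:

$$\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| \ge \left|\lambda_2\left((a_{jl})_{1 \le j,l \le n}\right)\right| \ge \dots \ge \left|\lambda_n\left((a_{jl})_{1 \le j,l \le n}\right)\right|, \tag{6}$$

where the notation $\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)$ indicates that the eigenvalue $\lambda_i$ depends only on the coefficients $(a_{jl})_{1 \le j,l \le n}$. Then $T_r(x_0) = A^r$, and its eigenvalues are $\lambda_1^r\left((a_{jl})_{1 \le j,l \le n}\right), \dots, \lambda_n^r\left((a_{jl})_{1 \le j,l \le n}\right)$, so the Lyapunov exponents of the map (4) are:

$$\omega_i(x_0) = \ln\left( \lim_{r \to +\infty} \left( \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right|^{r} \right)^{\frac{1}{r}} \right) = \ln\left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right|, \quad i = 1, 2, \dots, n. \tag{7}$$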

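Hence, according to (6), all the Lyapunov exponents are ordered as follows:

$$\omega_1\left((a_{jl})_{1 \le j,l \le n}\right) \ge \omega_2\left((a_{jl})_{1 \le j,l \le n}\right) \ge \dots \ge \omega_n\left((a_{jl})_{1 \le j,l \le n}\right). \tag{8}$$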

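Define the following subsets of $\mathbb{R}^{n^2}$ in terms of the vector $(a_{jl})_{1 \le j,l \le n}$ as follows:

$$\Omega_1 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_n\left((a_{jl})_{1 \le j,l \le n}\right)\right| > 1 \right\}, \tag{9}$$

$$\Omega_2 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1 \right\}, \tag{10}$$

$$\Omega_3 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1 \right\}, \tag{11}$$

$$\Omega_4 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1,\ i = 2, \dots, n \right\}, \tag{12}$$

$$\Omega_5 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_2\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1 \right\}, \tag{13}$$

$$\Omega_6 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1,\ i = 3, \dots, n \right\}, \tag{14}$$

$$\Omega_7 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1,\ i = 1, 2, \dots, K, \text{ where } 1 \le K \le n \right\}, \tag{15}$$

$$\Omega_8 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1,\ i = K + 1, \dots, n \right\}, \tag{16}$$

$$\Omega_9 = \left\{ (a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2},\ \left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| > 1, \text{ and } \prod_{i=2}^{n} \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1 \right\}. \tag{17}$$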

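Finally, one obtains the following results:

(1) The map (4) is super chaotic when all its Lyapunov exponents are positive, i.e. $\omega_n\left((a_{jl})_{1 \le j,l \le n}\right) > 0$, according to inequalities (6) and (8). Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_1$.

(2) The map (4) converges to a stable fixed point when all the Lyapunov exponents are negative, i.e. $\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1$, according to inequalities (6) and (8). Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_2$.

(3) The map (4) converges to a circle attractor when $\omega_1 = 0$ and $0 > \omega_2 \ge \dots \ge \omega_n$, i.e. $\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1$ and $\left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1$ for $i = 2, \dots, n$. Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_3 \cap \Omega_4$.

(4) The map (4) converges to a torus attractor when $\omega_1 = \omega_2 = 0$ and $0 > \omega_3 \ge \dots \ge \omega_n$, i.e. $\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| = \left|\lambda_2\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1$ and $\left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1$ for $i = 3, \dots, n$. Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_3 \cap \Omega_5 \cap \Omega_6$.

(5) The map (4) converges to a K-torus attractor when $\omega_1 = \omega_2 = \dots = \omega_K = 0$ and $0 > \omega_{K+1} \ge \dots \ge \omega_n$, i.e. $\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| = \left|\lambda_2\left((a_{jl})_{1 \le j,l \le n}\right)\right| = \dots = \left|\lambda_K\left((a_{jl})_{1 \le j,l \le n}\right)\right| = 1$ and $\left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1$ for $i = K + 1, \dots, n$. Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_7 \cap \Omega_8$.

(6) The map (4) converges to a chaotic attractor when $\omega_1 > 0$ and $\sum_{i=2}^{n} \omega_i < 0$, i.e. $\left|\lambda_1\left((a_{jl})_{1 \le j,l \le n}\right)\right| > 1$ and $\prod_{i=2}^{n} \left|\lambda_i\left((a_{jl})_{1 \le j,l \le n}\right)\right| < 1$. Thus, one obtains $(a_{jl})_{1 \le j,l \le n} \in \Omega_9$.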
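The six cases above amount to a lookup on the ordered eigenvalue moduli. A minimal sketch follows, assuming the moduli have already been computed; the function name `classify`, the tolerance `tol` used to decide when a modulus equals 1, and the returned labels are assumptions made for this example and are not part of the paper.

```python
import math

def classify(moduli, tol=1e-9):
    """Label the regime of map (4) from the moduli |lambda_i| of A."""
    lam = sorted(moduli, reverse=True)          # ordering (6)
    if lam[-1] > 1.0 + tol:
        return "super chaotic (Omega_1)"
    if lam[0] < 1.0 - tol:
        return "stable fixed point (Omega_2)"
    # Count the leading moduli equal to 1 (this is K in Omega_7).
    k = 0
    while k < len(lam) and abs(lam[k] - 1.0) <= tol:
        k += 1
    if k >= 1 and all(v < 1.0 - tol for v in lam[k:]):
        if k == 1:
            return "circle attractor (Omega_3 and Omega_4)"
        if k == 2:
            return "torus attractor (Omega_3, Omega_5 and Omega_6)"
        return "K-torus attractor with K = %d (Omega_7 and Omega_8)" % k
    if lam[0] > 1.0 + tol and math.prod(lam[1:]) < 1.0 - tol:
        return "chaotic attractor (Omega_9)"
    return "not covered by cases (1)-(6)"

labels = [classify(m) for m in ([2.0, 1.5], [0.5, 0.2], [1.0, 0.5, 0.2],
                                [1.0, 1.0, 0.3], [2.0, 0.3], [3.0, 2.0, 1.0])]
```

The last sample shows that the sets $\Omega_1, \dots, \Omega_9$ do not exhaust $\mathbb{R}^{n^2}$, so the sketch returns an explicit fallback label in that case.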

Generally, for a continuous map a positive Lyapunov exponent indicates chaos, a negative exponent indicates fixed points, and a Lyapunov exponent equal to 0 indicates periodic dynamics. For a discontinuous map, however, a zero Lyapunov exponent does not necessarily indicate periodic behavior. In this case the map generates a symbolic sequence $s = \{s_0; s_1; \dots; s_j; \dots\}$ composed of symbols $s_j = i$ if $x_j = f^j(x_0) \in D_i$, $i = 1, \dots, m$. Each of these symbolic sequences is called "admissible", and its symbols describe the order in which trajectories, starting from any initial condition $x_0$, visit the various subregions $D_i$, $i = 1, \dots, m$. In [7] a general approach for finding periodic trajectories in piecewise-linear maps is given; this procedure is based on the decomposition of the initial state via the eigenvectors of their Jacobian, and it is applied to digital filters with two's complement overflow and ΣΔ modulators [8-9-10-11]. Finally, one concludes that there are some cases (depending mainly on the position of the initial conditions) where the behavior of a map is not periodic even though its Lyapunov exponent is zero.
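The symbolic sequence described above can be generated by recording the index of the region visited at each iterate. The sketch below is a hypothetical illustration only: the 1-D discontinuous map on [0, 1) with two regions, and the names `symbolic_sequence`, `step` and `region`, are assumptions chosen so that the arithmetic is exact in floating point; the example is not one of the maps studied in [7-11].

```python
def symbolic_sequence(step, region, x0, length):
    """Return [s_0, ..., s_{length-1}] with s_j = i when f^j(x0) lies in D_i."""
    symbols = []
    x = x0
    for _ in range(length):
        symbols.append(region(x))
        x = step(x)
    return symbols

# Hypothetical 1-D discontinuous piecewise linear map:
# D_1 = [0, 0.5) with f(x) = 0.5 x + 0.5, D_2 = [0.5, 1) with f(x) = 0.5 x - 0.25.
def step(x):
    return 0.5 * x + 0.5 if x < 0.5 else 0.5 * x - 0.25

def region(x):
    return 1 if x < 0.5 else 2

s = symbolic_sequence(step, region, 0.25, 6)
```

In this example every point of $D_1$ is sent into $D_2$ and vice versa, so the admissible sequences simply alternate between the two symbols.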

Hence, the following theorem is proved.

Theorem 2. Consider a general n-D piecewise linear map of the form:

$$f(x_k) = x_{k+1} = A_i x_k + b_i, \quad \text{if } x_k \in D_i \subset \mathbb{R}^n, \quad i = 1, 2, \dots, m, \tag{18}$$

and assume the following:

(a) The map (18) is piecewise linear, i.e. the integer $m$ satisfies $m \ge 2$, and there exist $i, j \in \{1, 2, \dots, m\}$ such that $b_i \ne 0$ and $b_i \ne b_j$.

(b) The map (18) has a set of fixed points, i.e. there is a set of integers $i$ in $\{1, 2, \dots, m\}$ such that the equation $A_i x + b_i = x$ has at least one solution $x$ in the subregion $D_i$.

(c) All the matrices $A_i$ and $A_j$ are equivalent, i.e. there exist invertible matrices $P_{ij}$ such that $A_i = P_{ij} A_j P_{ij}^{-1}$ for all $i, j \in \{1, 2, \dots, m\}$.

Then, the dynamics of the map (18) is known in terms of the vector $(a_{jl})_{1 \le j,l \le n} \in \mathbb{R}^{n^2}$ in the following cases:

(1) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_1$, then the map (18) is super chaotic.

(2) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_2$, then the map (18) converges to a stable fixed point.

(3) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_3 \cap \Omega_4$, then the map (18) converges to a circle attractor.

(4) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_3 \cap \Omega_5 \cap \Omega_6$, then the map (18) converges to a torus attractor.

(5) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_7 \cap \Omega_8$, then the map (18) converges to a K-torus attractor.

(6) if $(a_{jl})_{1 \le j,l \le n} \in \Omega_9$, then the map (18) is chaotic.

3. Conclusion

We have reported a rigorous proof of chaos in a general n-D piecewise linear map, along with the exact determination of its dynamics using the standard definition of the largest Lyapunov exponent.


References

[1] Zeraoulia Elhadj, (2007) On the rigorous determination of chaotic behavior in a piecewise linear planar map, to appear in Discrete Dynamics in Nature and Society.

[2] Li, Z., Park, J. B., Joo, Y. H., Choi, Y. H., Chen, G. (2002) Anticontrol of chaos for discrete TS fuzzy systems. IEEE Trans. Circ. Syst.-I, 49:249–253.

[3] Wang, X. F., Chen, G. (1999) On feedback anticontrol of discrete chaos. Int. J. Bifur. Chaos, 9:1435–1441.

[4] Wang, X. F., Chen, G. (2000) Chaotifying a stable map via smooth small amplitude high-frequency feedback control. Int. J. Circ. Theory Appl., 28:305–312.

[5] Chen, G., Lai, D. (1997) Making a discrete dynamical system chaotic: feedback control of Lyapunov exponents for discrete-time dynamical system, IEEE Trans. Circ. Syst.-I, 44, 250–253.

[6] Lai, D., Chen, G. (2003) Making a discrete dynamical system chaotic: Theoretical results and numerical simulations, Int. J. Bifur. Chaos, 13(11), 3437–3442.

[7] C. D. Mitrovski, Lj. M. Kocarev (2001), Periodic Trajectories in Piecewise-Linear Maps, IEEE Trans. Circuits & Systems-I, 48 (10), 1244–1246.

[8] L. O. Chua, T. Lin (1988), Chaos in digital filters, IEEE Trans. Circuits Syst., 35, 648–658.

[9] O. Feely, L. O. Chua (1991), The effect of integrator leak in modulation, IEEE Trans. Circuits Syst., 38, 1293–1305.

[10] M. J. Ogorzalek (1992), Complex behavior in digital filters, Int. J. Bifur. Chaos, 2(1), 11–29.

[11] C. W. Wu and L. O. Chua (1994), Symbolic dynamics of piecewise-linear maps, IEEE Trans. Circuits Syst. II, 41, 420–424.

[12] Banerjee S, Verghese G C (ed) (2001), Nonlinear Phenomena in Power Electronics: Attractors, Bifurcations, Chaos, and Nonlinear Control (IEEE Press, New York, USA).

[13] Tse T K, (2003), Complex Behavior of Switching Power Converters (CRC Press, Boca Raton, USA).

[14] Banerjee S, Ott E, Yorke J A, Yuan G H (1997), Anomalous bifurcations in dc-dc converters: borderline collisions in piecewise smooth maps, IEEE Power Electronics Specialists' Conference, pp 1337.

[15] Yuan G H, Banerjee S, Ott E, Yorke J A, (1998), Border collision bifurcations in the buck converter, IEEE Trans. Circuits & Systems–I 45, 707–716.

[16] Maggio G M, di Bernardo M and Kennedy M P, (2000), Nonsmooth Bifurcations in a Piecewise-Linear Model of the Colpitts Oscillator, IEEE Trans. Circuits & Systems–I 8 1160–77.

[17] Banerjee S, Parui S and Gupta A (2004), Dynamical Effects of Missed Switching in Current-Mode Controlled dc-dc Converters, IEEE Trans. Circuits & Systems–II 51 649–54.

[18] Rajaraman R, Dobson I and Jalali S, (1996), Nonlinear Dynamics and Switching Time Bifurcations of a Thyristor Controlled Reactor Circuit, IEEE Trans. Circuits & Systems–I 43 1001–6.

[19] Feely O, Chua L O, (1992), Nonlinear dynamics of a class of analog-to-digital converters, Inter. J. Bifur. Chaos 22 325–40.

[20] Sharkovsky A N, Chua L O, (1993), Chaos in some 1–D discontinuous maps that appear in the analysis of electrical circuits, IEEE Trans. Circuits & Systems–I 40 722–31.

[21] Jain P, Banerjee S, (2003), Border collision bifurcations in one-dimensional discontinuous maps, Inter. J. Bifur. Chaos 13, 3341–3352.

[22] Kollar L E, Stepan G and Turi J, (2004), Dynamics of Piecewise Linear Discontinuous Maps, Inter. J. Bifur. Chaos 14 2341–2351.

[23] Partha Sharathi Dutta, Bitihotra Routroy, Soumitro Banerjee, S. S. Alam, (2007), Border Collision Bifurcations in n-Dimensional Piecewise Linear Discontinuous Maps, to appear in Chaos.


Flow of Unsteady Dusty Fluid Under Varying Pulsatile Pressure Gradient in Anholonomic Co-ordinate System

B.J.Gireesha, C.S.Bagewadi∗ and B.C.Prasanna Kumara†

Department of Mathematics, Kuvempu University, Shankaraghatta-577451, Shimoga, Karnataka, India.

†Department of Mathematics, SBM Jain College of Engineering, Jakkasandra, Bangalore.


Abstract: An analytical study of unsteady viscous dusty fluid flow with a uniform distribution of dust particles between two infinite parallel plates is carried out, taking into account the influence of a pulsatile pressure gradient. The flow analysis uses differential geometry techniques; analytical solutions of the problem are obtained with the help of the Laplace transform technique and are discussed with the help of graphs. © Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Frenet frame field system; laminar flow, dusty gas; velocity of dust gas and fluid phase, unsteady dusty fluid, pulsatile pressure gradient, relaxation zone and density.

PACS (2006): 47.15.x, 47.15.Cb

AMS Subject Classification (2000): 76T10, 76T15

1. Introduction

A dusty fluid is a mixture of fluid and fine dust particles. Its study is important

in areas like environmental pollution, smoke emission from vehicles, emission of effluents

from industries, cooling effects of air conditioners, flying ash produced from thermal

reactors and formation of raindrops, etc. Also it is useful in the study of lunar ash flow

which explains many features of lunar soil.

P.G.Saffman [15] has discussed the stability of the laminar flow of a dusty gas in which the dust particles are uniformly distributed. Liu [11] has studied the flow induced by an oscillating infinite flat plate in a dusty gas. Michael and Miller [12] investigated the motion of dusty gas with uniform distribution of the dust particles occupying the semi-infinite space above a rigid plane boundary. Samba Siva Rao [16] has obtained analytical solutions for the dusty fluid flow through a circular tube under the influence of a constant pressure gradient, using appropriate boundary conditions. Later T.M.Nabil [13] studied the effect of couple stresses on pulsatile hydromagnetic Poiseuille flow, and N.Datta [5] obtained solutions for pulsatile flow of heat transfer of a dusty fluid through an infinitely long annular pipe. A.Eric [6] has studied the quantitative assessment of steady and pulsatile flow fields in a parallel plate flow chamber.

Some researchers, such as Kanwal [10], Truesdell [17], Indrasena [9], Purushotham [14], and Bagewadi, Shantharajappa and Gireesha [1, 2, 3], have applied differential geometry techniques to investigate the kinematical properties of fluid flows in the field of fluid mechanics. Further, the authors [2, 3] have studied two-dimensional dusty fluid flow in a Frenet frame field system. Recently the authors [7, 8] have studied the flow of unsteady dusty fluid under different pressure gradients: constant, periodic and exponential.

The present work concerns the laminar flow of a dusty fluid between two infinite stationary parallel plates with a pulsatile pressure gradient in an anholonomic co-ordinate system. Further, assuming that the fluid and the dust particles are initially at rest, analytical expressions are obtained for the velocities of the fluid and dust particles. The changes in the velocity profiles at different times are shown graphically.

2. Equations of Motion

The equations of motion of unsteady viscous incompressible fluid with uniform dis-

tribution of dust particles are given by [15]:

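For fluid phase

$$\nabla \cdot \vec{u} = 0 \quad \text{(Continuity)} \tag{1}$$

$$\frac{\partial \vec{u}}{\partial t} + (\vec{u} \cdot \nabla)\vec{u} = -\rho^{-1} \nabla p + \nu \nabla^2 \vec{u} + \frac{kN}{\rho}(\vec{v} - \vec{u}) \quad \text{(Linear Momentum)} \tag{2}$$

For dust phase

$$\nabla \cdot \vec{v} = 0 \quad \text{(Continuity)} \tag{3}$$

$$\frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\vec{v} = \frac{k}{m}(\vec{u} - \vec{v}) \quad \text{(Linear Momentum)} \tag{4}$$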

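We use the following nomenclature: $\vec{u}$ is the velocity of the fluid phase, $\vec{v}$ the velocity of the dust phase, $\rho$ the density of the gas, $p$ the pressure of the fluid, $N$ the number density of dust particles, $\nu$ the kinematic viscosity, $k = 6\pi a \mu$ the Stokes resistance (drag coefficient), $a$ the spherical radius of a dust particle, $m$ the mass of a dust particle, $\mu$ the coefficient of viscosity of the fluid particles, and $t$ the time.

Let $\vec{s}, \vec{n}, \vec{b}$ be triply orthogonal unit vectors (tangent, principal normal and binormal, respectively) to the spatial curves of congruences formed by the fluid phase velocity and dust phase velocity lines. The geometrical relations are given by the Frenet formulae [4]:

$$\begin{aligned}
&\text{i)} \quad \frac{\partial \vec{s}}{\partial s} = k_s \vec{n}, \quad \frac{\partial \vec{n}}{\partial s} = \tau_s \vec{b} - k_s \vec{s}, \quad \frac{\partial \vec{b}}{\partial s} = -\tau_s \vec{n} \\
&\text{ii)} \quad \frac{\partial \vec{n}}{\partial n} = k'_n \vec{s}, \quad \frac{\partial \vec{b}}{\partial n} = -\sigma'_n \vec{s}, \quad \frac{\partial \vec{s}}{\partial n} = \sigma'_n \vec{b} - k'_n \vec{n} \\
&\text{iii)} \quad \frac{\partial \vec{b}}{\partial b} = k''_b \vec{s}, \quad \frac{\partial \vec{n}}{\partial b} = -\sigma''_b \vec{s}, \quad \frac{\partial \vec{s}}{\partial b} = \sigma''_b \vec{n} - k''_b \vec{b} \\
&\text{iv)} \quad \nabla \cdot \vec{s} = \theta_{ns} + \theta_{bs}; \quad \nabla \cdot \vec{n} = \theta_{bn} - k_s; \quad \nabla \cdot \vec{b} = \theta_{nb}
\end{aligned} \tag{5}$$

where $\partial/\partial s$, $\partial/\partial n$ and $\partial/\partial b$ are the intrinsic differential operators along the fluid phase velocity (or dust phase velocity) lines, principal normal and binormal. The functions $(k_s, k'_n, k''_b)$ and $(\tau_s, \sigma'_n, \sigma''_b)$ are the curvatures and torsions of the above curves, and $\theta_{ns}$ and $\theta_{bs}$ are the normal deformations of these spatial curves along their principal normal and binormal respectively.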

3. Formulation and Solution of the Problem

In the present problem we consider unsteady laminar flow of an incompressible viscous fluid with a uniform distribution of dust particles between two infinite stationary parallel plates separated by a distance h, in the absence of body forces. The flow is driven by a pressure gradient that is pulsatile in time. Both the fluid and the dust particle clouds are supposed to be static at the beginning. The dust particles are assumed to be spherical in shape and uniform in size. The number density of the dust particles is taken as constant throughout the flow. Under these assumptions the flow is a parallel flow in which the streamlines are along the tangential direction and the velocities vary along the binormal direction and with time t, since we extend the fluid to infinity in the principal normal direction.

Fig. 1 Geometry of the flow


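By virtue of the system of equations (5), the intrinsic decomposition of equations (2) and (4) gives the following forms:

$$\frac{\partial u_s}{\partial t} = -\frac{1}{\rho} \frac{\partial p}{\partial s} + \nu \left[ \frac{\partial^2 u_s}{\partial b^2} - C_r u_s \right] + \frac{kN}{\rho}(v_s - u_s) \tag{6}$$

$$2 u_s^2 k_s = -\frac{1}{\rho} \frac{\partial p}{\partial n} + \nu \left[ 2 \sigma''_b \frac{\partial u_s}{\partial b} - u_s k_s^2 \right] \tag{7}$$

$$0 = -\frac{1}{\rho} \frac{\partial p}{\partial b} + \nu \left[ u_s k_s \tau_s - 2 k''_b \frac{\partial u_s}{\partial b} \right] \tag{8}$$

$$\frac{\partial v_s}{\partial t} = \frac{k}{m}(u_s - v_s) \tag{9}$$

$$2 v_s^2 k_s = 0 \tag{10}$$

where $C_r = (\sigma'^2_n + k'^2_n + k''^2_b + \sigma''^2_b)$ is called the curvature number [3].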

From equation (10) we see that $v_s^2 k_s = 0$, which implies either $v_s = 0$ or $k_s = 0$. The choice $v_s = 0$ is impossible, since it would give $u_s = 0$, which means that the flow does not exist. Hence $k_s = 0$, i.e. the curvature of the streamline along the tangential direction is zero. Thus no radial flow exists.

Equations (6) and (9) are to be solved subject to the initial and boundary conditions:

$$\left\{ \begin{array}{l} \text{Initial condition: at } t = 0: \ u_s = 0, \ v_s = 0 \\ \text{Boundary condition: for } t > 0: \ u_s = 0 \text{ at } b = 0 \text{ and } b = h \end{array} \right\} \tag{11}$$

Since we have assumed that a pulsatile pressure gradient is impressed on the system for t > 0, we can write

$$-\frac{1}{\rho} \frac{\partial p}{\partial s} = c + \alpha \cos(\beta t)$$

where c and α are constants and β is the frequency of oscillation.
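Before solving the problem analytically, equations (6) and (9) with $k_s = 0$, the conditions (11) and the pulsatile forcing above can be explored numerically. The following is a rough sketch only, not the method of the paper: it uses an explicit finite-difference scheme in b and forward Euler in time, and the function name `simulate`, the grid size and every parameter value are hypothetical choices for illustration.

```python
import math

def simulate(h=1.0, nu=1.0, Cr=0.0, kN_over_rho=0.5, k_over_m=2.0,
             c=1.0, alpha=0.5, beta=2.0, nb=21, t_end=0.2):
    """Explicit time stepping of (6) and (9) with conditions (11)."""
    db = h / (nb - 1)
    dt = 0.2 * db * db / nu          # well inside the explicit diffusion limit
    steps = int(round(t_end / dt))
    u = [0.0] * nb                   # fluid velocity u_s, at rest at t = 0
    v = [0.0] * nb                   # dust velocity v_s, at rest at t = 0
    t = 0.0
    for _ in range(steps):
        g = c + alpha * math.cos(beta * t)   # -(1/rho) dp/ds
        un = u[:]                    # copies keep u = 0 at b = 0 and b = h
        vn = v[:]
        for j in range(1, nb - 1):
            lap = (u[j + 1] - 2.0 * u[j] + u[j - 1]) / (db * db)
            un[j] = u[j] + dt * (g + nu * (lap - Cr * u[j])
                                 + kN_over_rho * (v[j] - u[j]))
        for j in range(nb):
            vn[j] = v[j] + dt * k_over_m * (u[j] - v[j])
        u, v = un, vn
        t += dt
    return u, v

u, v = simulate()
mid = len(u) // 2
```

With these hypothetical values the dust velocity at mid-channel stays below the fluid velocity during start-up, which is consistent with the dust being driven only through the drag term in (9).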

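We define the Laplace transforms of $u_s$ and $v_s$ as

$$U = \int_0^{\infty} e^{-st} u_s \, dt \quad \text{and} \quad V = \int_0^{\infty} e^{-st} v_s \, dt \tag{12}$$

Applying the Laplace transform to equations (6), (9) and to the boundary conditions, and then using the initial conditions, one obtains

$$sU = \frac{c}{s} + \frac{\alpha s}{(s^2 + \beta^2)} + \nu \left[ \frac{\partial^2 U}{\partial b^2} - C_r U \right] + \frac{L}{\tau}(V - U) \tag{13}$$

$$sV = \frac{1}{\tau}(U - V) \tag{14}$$

$$U = 0 \quad \text{at } b = 0 \text{ and } b = h \tag{15}$$

where $L = \frac{mN}{\rho}$ and $\tau = \frac{m}{k}$. Equation (14) implies

$$V = \frac{U}{1 + s\tau} \tag{16}$$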

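Eliminating V from (13) and (16) we obtain the following equation

$$\frac{d^2 U}{db^2} - Q^2 U = -\left[ \frac{c}{\nu s} + \frac{\alpha s}{\nu (s^2 + \beta^2)} \right] \tag{17}$$

where $Q^2 = \left( C_r + \frac{s}{\nu} + \frac{sL}{\nu(1 + s\tau)} \right)$.

The velocities of the fluid and dust particles are obtained by solving equation (17) subject to the boundary conditions (15), as follows:

$$U = \frac{1}{\nu Q^2} \left[ \frac{c}{s} + \frac{\alpha s}{s^2 + \beta^2} \right] \left\{ \frac{\sinh Q(b - h) - \sinh(Qb)}{\sinh(Qh)} + 1 \right\}$$

Using U in (16) we obtain V as

$$V = \frac{1}{(\nu Q^2)(1 + s\tau)} \left[ \frac{c}{s} + \frac{\alpha s}{s^2 + \beta^2} \right] \left\{ \frac{\sinh Q(b - h) - \sinh(Qb)}{\sinh(Qh)} + 1 \right\}$$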
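The transformed solution can be checked numerically against equation (17) and the boundary conditions (15). The sketch below is an assumption-laden sanity check rather than part of the derivation: it takes a real positive value of the transform variable s, approximates the second derivative by a central difference, and uses hypothetical parameter values and function names (`U_transform`, `ode_residual`).

```python
import math

def U_transform(b, s, h, nu, Cr, L, tau, c, alpha, beta):
    """Transformed fluid velocity U(b; s), from the closed form below (17)."""
    Q2 = Cr + s / nu + s * L / (nu * (1.0 + s * tau))
    Q = math.sqrt(Q2)
    forcing = c / s + alpha * s / (s * s + beta * beta)
    shape = (math.sinh(Q * (b - h)) - math.sinh(Q * b)) / math.sinh(Q * h) + 1.0
    return forcing * shape / (nu * Q2)

def ode_residual(b, s, h, nu, Cr, L, tau, c, alpha, beta, eps=1e-3):
    """Left side minus right side of (17), with U'' by central differences."""
    args = (s, h, nu, Cr, L, tau, c, alpha, beta)
    Q2 = Cr + s / nu + s * L / (nu * (1.0 + s * tau))
    forcing = c / s + alpha * s / (s * s + beta * beta)
    d2U = (U_transform(b + eps, *args) - 2.0 * U_transform(b, *args)
           + U_transform(b - eps, *args)) / (eps * eps)
    return d2U - Q2 * U_transform(b, *args) + forcing / nu

# Hypothetical parameter values: s, h, nu, Cr, L, tau, c, alpha, beta
params = (1.0, 1.0, 1.0, 0.0, 0.5, 0.5, 1.0, 0.5, 2.0)
res = ode_residual(0.3, *params)
U0 = U_transform(0.0, *params)
Uh = U_transform(1.0, *params)
```

The residual is small only up to the truncation error of the difference quotient, so the check confirms consistency of the reconstructed formula rather than proving it.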

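By taking the inverse Laplace transform of U and V, one obtains

$$\begin{aligned}
u_s = {} & \frac{c}{\nu \lambda^2} \left( \frac{\sinh(\lambda(b - h)) - \sinh(\lambda b)}{\sinh(\lambda h)} + 1 \right) \\
& + \frac{4c}{\pi} \sum_{n=0}^{\infty} \frac{1}{2n + 1} \sin\left( \frac{2n + 1}{h} \pi b \right) \left( \frac{e^{x_1 t}(1 + x_1 \tau)^2}{((1 + x_1 \tau)^2 + L)} + \frac{e^{x_2 t}(1 + x_2 \tau)^2}{((1 + x_2 \tau)^2 + L)} \right) \\
& + \frac{\alpha}{\nu} \left( \frac{(AE + BF) M_1 - (BE - AF) M_2}{\left[ (y_1 y_2 - \beta^2)^2 + (\beta y_1 + \beta y_2)^2 \right] (E^2 + F^2)} \right) \\
& + \frac{4\alpha}{\pi} \sum_{n=0}^{\infty} \frac{1}{2n + 1} \sin\left( \frac{2n + 1}{h} \pi b \right) \times \left[ \frac{x_1 e^{x_1 t}(1 + x_1 \tau)^2}{(x_1^2 + \beta^2)((1 + x_1 \tau)^2 + L)} + \frac{x_2 e^{x_2 t}(1 + x_2 \tau)^2}{(x_2^2 + \beta^2)((1 + x_2 \tau)^2 + L)} \right]
\end{aligned}$$

$$\begin{aligned}
v_s = {} & \frac{c}{\nu \lambda^2} \left( \frac{\sinh(\lambda(b - h)) - \sinh(\lambda b)}{\sinh(\lambda h)} + 1 \right) \\
& + \frac{4c}{\pi} \sum_{n=0}^{\infty} \frac{1}{2n + 1} \sin\left( \frac{2n + 1}{h} \pi b \right) \left[ \frac{e^{x_1 t}(1 + x_1 \tau)}{((1 + x_1 \tau)^2 + L)} + \frac{e^{x_2 t}(1 + x_2 \tau)}{((1 + x_2 \tau)^2 + L)} \right] \\
& + \frac{\alpha}{\nu} \left( \frac{(M_1 A - M_2 B)(E - F \beta \tau) + (M_2 A + M_1 B)(E \beta \tau + F)}{\left[ (y_1 y_2 - \beta^2)^2 + (\beta y_1 + \beta y_2)^2 \right] (E^2 + F^2)(1 + \beta^2 \tau^2)} \right) \\
& + \frac{4\alpha}{\pi} \sum_{n=0}^{\infty} \frac{1}{2n + 1} \sin\left( \frac{2n + 1}{h} \pi b \right) \times \left[ \frac{x_1 e^{x_1 t}(1 + x_1 \tau)}{(x_1^2 + \beta^2)((1 + x_1 \tau)^2 + L)} + \frac{x_2 e^{x_2 t}(1 + x_2 \tau)}{(x_2^2 + \beta^2)((1 + x_2 \tau)^2 + L)} \right]
\end{aligned}$$

where

$$x_1 = -\frac{1}{2\tau} \left( 1 + L + \nu C_r \tau + \nu \tau \frac{n^2 \pi^2}{h^2} \right) + \frac{1}{2\tau} \sqrt{ \left( 1 + L + \nu C_r \tau + \nu \tau \frac{n^2 \pi^2}{h^2} \right)^2 - 4 \tau \nu \left( C_r + \frac{n^2 \pi^2}{h^2} \right) }$$

$$x_2 = -\frac{1}{2\tau} \left( 1 + L + \nu C_r \tau + \nu \tau \frac{n^2 \pi^2}{h^2} \right) - \frac{1}{2\tau} \sqrt{ \left( 1 + L + \nu C_r \tau + \nu \tau \frac{n^2 \pi^2}{h^2} \right)^2 - 4 \nu \tau \left( C_r + \frac{n^2 \pi^2}{h^2} \right) }$$

$$y_1 = -\frac{1}{2\tau}(1 + L + \nu C_r \tau) + \frac{1}{2\tau} \sqrt{(1 + L + \nu C_r \tau)^2 - 4 C_r \nu \tau}$$

$$y_2 = -\frac{1}{2\tau}(1 + L + \nu C_r \tau) - \frac{1}{2\tau} \sqrt{(1 + L + \nu C_r \tau)^2 - 4 C_r \nu \tau}$$

$$A = \sinh(\alpha_1 (b - h)) \cos(\beta_1 (b - h)) - \sinh(\alpha_1 b) \cos(\beta_1 b) + \sinh(\alpha_1 h) \cos(\beta_1 h)$$

$$B = \cosh(\alpha_1 (b - h)) \sin(\beta_1 (b - h)) - \cosh(\alpha_1 b) \sin(\beta_1 b) + \cosh(\alpha_1 h) \sin(\beta_1 h)$$

$$M_1 = (\cos \beta t - \beta \tau \sin \beta t)(y_1 y_2 - \beta^2) + (\sin \beta t + \beta \tau \cos \beta t)(\beta y_1 + \beta y_2)$$

$$M_2 = (\sin \beta t - \beta \tau \cos \beta t)(y_1 y_2 - \beta^2) + (\cos \beta t + \beta \tau \sin \beta t)(\beta y_1 + \beta y_2)$$

$$E = \sinh(\alpha_1 h) \cos(\beta_1 h), \quad F = \cosh(\alpha_1 h) \sin(\beta_1 h)$$

$$\delta_1 = \frac{y_1 y_2 - \beta^2 - \beta^2 \tau (y_1 + y_2)}{\nu (1 + \beta^2 \tau^2)}, \quad \delta_2 = \frac{\beta^2 \tau - \beta (y_1 + y_2) - y_1 y_2 \beta \tau}{\nu (1 + \beta^2 \tau^2)}$$

$$\alpha_1 = \sqrt{ \frac{\delta_1 + \sqrt{\delta_1^2 + \delta_2^2}}{2} } \quad \text{and} \quad \beta_1 = \sqrt{ \frac{-\delta_1 + \sqrt{\delta_1^2 + \delta_2^2}}{2} }, \quad \lambda = \sqrt{\frac{x_1 x_2}{\nu}}$$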

4. Conclusion

Figures 2 and 3 show the velocity profiles for the fluid and dust particles respectively, which are parabolic in nature. It is observed that the velocity of the fluid particles is parallel to the velocity of the dust particles and that the velocity decreases as time t increases. Further, one can observe that if the dust is very fine, i.e. the mass of the dust particles is negligibly small, then the relaxation time of the dust particles decreases, and ultimately as τ → 0 the velocities of the fluid and dust particles become the same. We also see that the fluid particles reach the steady state earlier than the dust particles. This difference is due to the fact that the pulsatile pressure gradient is exerted directly on the fluid.

Fig. 2 Variation of fluid velocity with b

Fig. 3 Variation of dust phase velocity with b


References

[1] C.S.Bagewadi and A.N.Shantharajappa, A study of unsteady dusty gas flow in Frenet Frame Field, Indian Journal Pure Appl. Math., 31 (2000) 1405-1420.

[2] C.S.Bagewadi and B.J.Gireesha, A study of two dimensional steady dusty fluid flow under varying temperature, Int. Journal of Appl. Mech. & Eng., 09 (2004) 647-653.

[3] C.S.Bagewadi and B.J.Gireesha, A study of two dimensional unsteady dusty fluid flow under varying pressure gradient, Tensor, N.S., 64 (2003) 232-240.

[4] Barrett O'Neill, Elementary Differential Geometry, Academic Press, New York & London, 1966.

[5] N.Datta & D.C.Dalal, Pulsatile flow of heat transfer of a dusty fluid through an infinitely long annular pipe, Int. J. Multiphase Flow, 21(3) (1995) 515-528.

[6] A.Eric, Nauman, J.Kurtis, Risic, M.Tony, Keaveny, & L.Robert Satcher, Quantitative Assessment of Steady and Pulsatile Flow Fields in a Parallel Plate Flow Chamber, Annals of Biomedical Engineering, 27 (1999) 194-199.

[7] B.J.Gireesha, C.S.Bagewadi & B.C.Prasannakumara, Flow of unsteady dusty fluid under varying periodic pressure gradient, Journal of Analysis and Computation, 2(2) (2006) 183-189.

[8] B.J.Gireesha, C.S.Bagewadi & B.C.Prasannakumara, Flow of unsteady dusty fluid between two parallel plates under constant pressure gradient, Tensor, N.S., 68 (2007)

[9] Indrasena, Steady rotating hydrodynamic-flows, Tensor, N.S., (1978) 350-354.

[10] R.P.Kanwal, Variation of flow quantities along streamlines, principal normals and bi-normals in three-dimensional gas flow, J.Math., 6 (1957) 621-628.

[11] J.T.C.Liu, Flow induced by an oscillating infinite flat plate in a dusty gas, Phys. Fluids, 9 (1966) 1716-1720.

[12] D.H.Michael and D.A.Miller, Plane parallel flow of a dusty gas, Mathematika, 13 (1966) 97-109.

[13] T.M.Nabil, EL-Dabe, M.G.Salwa and EL-Mohandis, Effect of couple stresses on pulsatile hydromagnetic poiseuille flow, Fluid Dynamic Research, 15 (1995) 313-324.

[14] G.Purushotham and Indrasena, On intrinsic properties of steady gas flows, Appl. Sci. Res., A 15 (1965) 196-202.

[15] P.G.Saffman, On the stability of laminar flow of a dusty gas, Journal of Fluid Mechanics, 13 (1962) 120-128.

[16] P.Samba Siva Rao, Unsteady flow of a dusty viscous liquid through circular cylinder, Def. Sci. J., 19 (1969) 135-138.

[17] C.Truesdell, Intrinsic equations of spatial gas flows, Z.Angew.Math.Mech, 40 (1960) 9-14.


Exact Solutions for Nonlinear Evolution Equations Via Extended Projective Riccati Equation Expansion Method

M A Abdou∗

Theoretical Research Group, Physics Department, Faculty of Science, Mansoura University, 35516 Mansoura, Egypt

Received 21 June 2006, Accepted 6 January 2007, Published 31 March 2007

Abstract: By means of a simple transformation, we show that the generalized Zakharov equations, the coupled nonlinear Klein-Gordon-Zakharov equations, the GDS, DS and GZ equations and the generalized Hirota-Satsuma coupled KdV system can be reduced to elliptic-like equations. Then, the extended projective Riccati equation expansion method is used to obtain a series of solutions including new solitary wave solutions, periodic and rational solutions. The method is straightforward and concise, and its applications are promising. © Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Extended projective Riccati equation, Nonlinear evolution equations, New solitary wave solutions, Periodic and rational solutions.

PACS (2006): 02.30.Hq, 02.30.Jr, 47.35.Fg, 94.05.Fg, 02.90.+p

1. Introduction

The investigation of exact travelling wave solutions of nonlinear evolution equations plays an important role in the study of nonlinear physical phenomena, for example the wave phenomena observed in fluid dynamics, plasma, elastic media, optical fibers, etc. In the past several decades, both mathematicians and physicists have made significant progress in this direction.

Many effective methods [1 − 13] have been presented, such as the variational iteration method [6], the homotopy perturbation method [3], the Exp-function method [8, 12], and others. A complete review of the field is available in [4].

The rest of this paper is organized as follows. In Section 2, we briefly give the steps of the method. In Section 3, the method is applied to solve the elliptic-like equation. In Section 4, by using the results obtained in Section 3, the corresponding solutions of some classes of nonlinear evolution equations in mathematical physics are obtained. The last section is devoted to the conclusion.

2. Method and its Applications

To illustrate the basic idea of the extended projective Riccati equation expansion method, we consider a nonlinear evolution equation in, say, two independent variables x, t,

$$Q(u, u_x, u_{xx}, \ldots) = 0, \tag{1}$$

and consider its travelling wave solutions

$$u(x, t) = u(\xi), \qquad \xi = x - \lambda t + \xi_0. \tag{2}$$

Then Eq.(1) is reduced to an ordinary differential equation (ODE)

$$Q(u, u', u'', \ldots) = 0, \tag{3}$$

where a prime denotes $d/d\xi$.

Step (1). We assume that Eq.(1) has the following formal solution:

$$u(\xi) = a_0 + \sum_{i=1}^{M} f^{\,i-1}(\xi)\,[a_i f(\xi) + b_i g(\xi)], \tag{4}$$

where $a_0$, $a_i$ and $b_i$ are constants to be determined later. The parameter M can be determined by balancing the highest order derivative term with the nonlinear term in Eq.(3), and the functions f(ξ) and g(ξ) satisfy

$$f'(\xi) = p f(\xi) g(\xi), \tag{5}$$

$$g'(\xi) = q + p g^2(\xi) - r f(\xi), \tag{6}$$

$$g^2 = -\frac{1}{p}\left[q - 2 r f + \frac{r^2 + \delta}{q} f^2\right], \tag{7}$$

where $p \neq 0$ is a real constant, and q, r, δ are real constants.

Step (2). Substituting Eq.(4) into (3) and making use of Eqs.(5-7) yields a polynomial in $f^i(\xi) g^j(\xi)$ (i = 0, 1, ...; j = 0, 1). Setting the coefficients of all powers $f^i(\xi) g^j(\xi)$ to zero yields a system of algebraic equations, from which the parameters $a_i$, $b_i$ and λ are explicitly determined.

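Step (3). It is easy to see that Eqs.(5) and (6) admit the following solutions:

Case(1): $\delta = h^2 - s^2$, $q \neq 0$, and $pq < 0$,

$$f_1 = \frac{q}{r + s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)}, \tag{8}$$

$$g_1 = -\frac{\sqrt{-pq}}{p}\,\frac{s\sinh(\sqrt{-pq}\,\xi) + h\cosh(\sqrt{-pq}\,\xi)}{r + s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)}, \tag{9}$$

$$g_1^2 = -\frac{1}{p}\left[q - 2 r f_1 + \frac{r^2 + h^2 - s^2}{q} f_1^2\right], \tag{10}$$

where h, p, s, q and r are constants.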
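The Case(1) pair can be checked numerically against Eqs.(5-7). The following is a minimal sketch, not part of the original derivation: the function names `f1_g1` and `residuals`, the finite-difference step and the sample parameter values are all assumptions made for illustration, and $g_1$ is taken with the overall minus sign that Eq.(5) requires.

```python
import math

def f1_g1(xi, p, q, r, s, h):
    """Case(1) functions f1, g1 (hypothetical helper; requires p*q < 0)."""
    w = math.sqrt(-p * q)
    denom = r + s * math.cosh(w * xi) + h * math.sinh(w * xi)
    f = q / denom
    # Sign chosen so that f' = p*f*g holds.
    g = -(w / p) * (s * math.sinh(w * xi) + h * math.cosh(w * xi)) / denom
    return f, g

def residuals(xi, p, q, r, s, h, eps=1e-5):
    """Residuals of Eqs.(5), (6), (7), with derivatives by central differences."""
    f, g = f1_g1(xi, p, q, r, s, h)
    f_plus, g_plus = f1_g1(xi + eps, p, q, r, s, h)
    f_minus, g_minus = f1_g1(xi - eps, p, q, r, s, h)
    df = (f_plus - f_minus) / (2 * eps)
    dg = (g_plus - g_minus) / (2 * eps)
    delta = h * h - s * s
    res5 = df - p * f * g
    res6 = dg - (q + p * g * g - r * f)
    res7 = g * g + (q - 2 * r * f + (r * r + delta) / q * f * f) / p
    return res5, res6, res7

# Sample values (assumed): p = 1, q = -1, so p*q < 0 and sqrt(-p*q) = 1.
print(residuals(0.3, p=1.0, q=-1.0, r=2.0, s=1.0, h=0.5))
```

All three residuals should be zero up to finite-difference error; the same pattern can be reused for the other cases by swapping in the corresponding f and g.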

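Case(2): $\delta = -h^2 - s^2$, $q \neq 0$, and $pq > 0$,

$$f_2 = \frac{q}{r + s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)}, \tag{11}$$

$$g_2 = \frac{\sqrt{pq}}{p}\,\frac{s\sin(\sqrt{pq}\,\xi) - h\cos(\sqrt{pq}\,\xi)}{r + s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)}, \tag{12}$$

$$g_2^2 = -\frac{1}{p}\left[q - 2 r f_2 + \frac{r^2 - h^2 - s^2}{q} f_2^2\right], \tag{13}$$

where h, p, s, q and r are constants.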

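Case(3): $q = 0$,

$$f_3 = \frac{1}{(pr/2)\xi^2 + m\xi + n}, \tag{14}$$

$$g_3 = -\frac{1}{p}\,\frac{pr\xi + m}{(pr/2)\xi^2 + m\xi + n}, \tag{15}$$

$$g_3^2 = \frac{2r}{p} f_3 + \left[\frac{m^2}{p^2} - \frac{2rn}{p}\right] f_3^2, \tag{16}$$

where m, n, p, r are arbitrary constants.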

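Case(4): $p = \pm 1$, $\delta = -r^2$,

$$f_4(\xi) = \frac{q}{6r} + \frac{2}{pr}\psi(\xi), \tag{17}$$

$$g_4(\xi) = \frac{12\psi'(\xi)}{q + 12p\psi(\xi)}, \tag{18}$$

where ψ(ξ) satisfies

$$\psi'^2(\xi) = 4\psi^3(\xi) - \gamma_2\psi(\xi) - \gamma_3,$$

with $\gamma_2 = \frac{q^2}{12}$, $\gamma_3 = \frac{pq^3}{216}$, and

$$g_4^2 = \frac{2r}{p} f_4 - \frac{q}{p}. \tag{19}$$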

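Case(5): $p = \pm 1$, $\delta = -\frac{r^2}{25}$,

$$f_5(\xi) = \frac{5q}{6r} + \frac{5pq^2}{72 r\,\psi(\xi)}, \tag{20}$$

$$g_5(\xi) = -\frac{q\,\psi'(\xi)}{\psi(\xi)\,(pq + 12\psi(\xi))}, \tag{21}$$

$$g_5^2 = -\frac{1}{p}\left[q - 2 r f_5 + \frac{24 r^2}{25 q} f_5^2\right]. \tag{22}$$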

3. The Exact Solutions of Elliptic-like Equations

Let us consider the elliptic-like equation in [7]

$$A\varphi''(\xi) + B\varphi(\xi) + D\varphi^3(\xi) = 0, \tag{23}$$

where A, B, D are arbitrary constants. In this section, the exact solutions of Eq.(23) are derived using the coupled projective Riccati Eqs.(5) and (6). Considering the homogeneous balance between $\varphi''(\xi)$ and $\varphi^3(\xi)$ in Eq.(23) gives M = 1, so the solution of Eq.(23) is sought in the form

$$\varphi(\xi) = a_0 + a_1 f(\xi) + b_1 g(\xi), \tag{24}$$

where $a_0$, $a_1$ and $b_1$ are constants to be determined later, and f(ξ) and g(ξ) satisfy Eqs.(5-7). Substituting Eq.(24) into (23) and making use of Eqs.(5-7), the left-hand side becomes a polynomial in $f^i$ (i = 0, 1, 2, 3) and $f^j g$ (j = 0, 1, 2). Setting the coefficients of this polynomial to zero yields a set of algebraic equations. Solving the system of algebraic equations with the aid of Maple, we have

$$a_0 = 0, \qquad a_1^2 = \frac{Ap(r^2 + h^2 - s^2)}{2qD}, \qquad b_1^2 = -\frac{Ap^2}{2D}. \tag{25}$$

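Case(1): $pq < 0$, $q \neq 0$, $g_1^2 = -\frac{1}{p}\left[q - 2 r f_1 + \frac{r^2 + h^2 - s^2}{q} f_1^2\right]$. Substituting Eq.(25) into Eq.(24) and using Eqs.(5-7), the exact solution of Eq.(23) is derived as

$$\varphi_1(\xi) = \frac{a_1 q}{r + s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)} - b_1\frac{\sqrt{-pq}}{p}\,\frac{s\sinh(\sqrt{-pq}\,\xi) + h\cosh(\sqrt{-pq}\,\xi)}{r + s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)}, \tag{26}$$

where $a_1^2 = \frac{Ap(r^2 + h^2 - s^2)}{2qD}$, $AD < 0$ and $b_1^2 = -\frac{Ap^2}{2D}$.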

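Case(1.1): $a_0 = a_1 = 0$, $r = 0$. The exact solution of Eq.(23) is derived as

$$\varphi_2(\xi) = -b_1\frac{\sqrt{-pq}}{p}\,\frac{s\sinh(\sqrt{-pq}\,\xi) + h\cosh(\sqrt{-pq}\,\xi)}{s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)}, \tag{27}$$

where $b_1^2 = -\frac{Ap^2}{2D}$ and $\frac{A}{D} < 0$.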

Case(1.2): $a_0 = b_1 = 0$, $r = 0$. The exact solution of Eq.(23) is

$$\varphi_3(\xi) = \frac{a_1 q}{s\cosh(\sqrt{-pq}\,\xi) + h\sinh(\sqrt{-pq}\,\xi)}, \tag{28}$$

where $a_1^2 = \frac{2Ap(h^2 - s^2)}{qD}$.
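As a sanity check on Case(1.2), the sech-type solution $\varphi_3$ can be substituted numerically into Eq.(23). The sketch below is an illustration only. Substituting $\varphi = a_1 f$ with r = 0 into Eq.(23) and using Eqs.(5-7) also fixes $B = Apq$; this relation is not stated in the text and is treated here as an assumption of the sketch. The helper names and sample parameter values are likewise hypothetical.

```python
import math

def phi3(xi, A, D, p, q, s, h):
    """Eq.(28) with a1^2 = 2*A*p*(h^2 - s^2)/(q*D); requires p*q < 0."""
    a1_sq = 2 * A * p * (h * h - s * s) / (q * D)
    if a1_sq <= 0:
        raise ValueError("parameters give a1^2 <= 0, no real solution")
    w = math.sqrt(-p * q)
    return math.sqrt(a1_sq) * q / (s * math.cosh(w * xi) + h * math.sinh(w * xi))

def residual_eq23(xi, A, D, p, q, s, h, eps=1e-4):
    """Residual A*phi'' + B*phi + D*phi^3 with B = A*p*q (assumed relation)."""
    B = A * p * q
    phi = lambda x: phi3(x, A, D, p, q, s, h)
    second = (phi(xi + eps) - 2 * phi(xi) + phi(xi - eps)) / eps ** 2
    return A * second + B * phi(xi) + D * phi(xi) ** 3

# Sample values (assumed): A = 1, D = 2, p = 1, q = -1, s = 1, h = 0.5.
print(residual_eq23(0.0, A=1.0, D=2.0, p=1.0, q=-1.0, s=1.0, h=0.5))
```

For s = 1, h = 0 the profile reduces to a multiple of sech(ξ), the familiar bright-soliton shape.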

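Case(2): $pq > 0$, $q \neq 0$, $g_2^2 = -\frac{1}{p}\left[q - 2 r f_2 + \frac{r^2 - h^2 - s^2}{q} f_2^2\right]$.

Case(2.1): $a_0 = 0$, $pq > 0$,

$$\varphi_4(\xi) = \frac{a_1 q}{r + s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)} + b_1\frac{\sqrt{pq}}{p}\,\frac{s\sin(\sqrt{pq}\,\xi) - h\cos(\sqrt{pq}\,\xi)}{r + s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)}, \tag{29}$$

where $a_1^2 = \frac{Ap(r^2 - h^2 - s^2)}{2qD}$ and $b_1^2 = -\frac{Ap^2}{2D}$.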

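Case(2.2): $a_0 = a_1 = 0$, $r = 0$, $pq > 0$,

$$\varphi_5(\xi) = b_1\frac{\sqrt{pq}}{p}\,\frac{s\sin(\sqrt{pq}\,\xi) - h\cos(\sqrt{pq}\,\xi)}{s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)}, \tag{30}$$

where $b_1^2 = -\frac{2Ap^2}{D}$.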

Case(2.3): $a_0 = b_1 = 0$, $r = 0$, $pq > 0$,

$$\varphi_6(\xi) = \frac{a_1 q}{s\cos(\sqrt{pq}\,\xi) + h\sin(\sqrt{pq}\,\xi)}, \tag{31}$$

where $a_1^2 = \frac{2Ap(-h^2 - s^2)}{qD}$ and $pq > 0$.

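Case(3): $p = \pm 1$, $\delta = -r^2$, $g_4^2 = \frac{2r}{p} f_4 - \frac{q}{p}$. The exact solution of Eq.(23) is

$$\varphi_7(\xi) = \frac{12 b_1 \psi'(\xi)}{q + 12p\psi(\xi)}, \tag{32}$$

where $b_1^2 = -\frac{Ap^2}{2D}$ and $\frac{A}{D} < 0$.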

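Case(4): $p = \pm 1$, $g_5^2 = -\frac{1}{p}\left[q - 2 r f_5 + \frac{24 r^2}{25 q} f_5^2\right]$. The exact solution of Eq.(23) is

$$\varphi_8(\xi) = a_1\left[\frac{5q}{6r} + \frac{5pq^2}{72 r\,\psi(\xi)}\right] - \frac{b_1 q\,\psi'(\xi)}{\psi(\xi)\,(pq + 12\psi(\xi))}, \tag{33}$$

where $a_1^2 = \frac{12 r^2 A p}{25 D q}$, $b_1^2 = -\frac{Ap^2}{2D}$, $\frac{p}{q} < 0$, $p = \pm 1$ and $\frac{A}{D} < 0$.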

4. Exact Solutions of Some Class of Nonlinear Evolution Equations

In this section, by using the results obtained in Section 3, we construct the corresponding solutions of the generalized Zakharov equations, the coupled nonlinear Klein-Gordon-Zakharov equations, the GDS, DS and GZ equations and the generalized Hirota-Satsuma coupled KdV system.

4.1 The generalized-Zakharov equations

The generalized Zakharov equations for the complex envelope ψ(x, t) of the high-frequency wave and the real low-frequency field v(x, t) read [13]

$$i\psi_t + \psi_{xx} - 2\lambda|\psi|^2\psi + 2\psi v = 0, \tag{34}$$

$$v_{tt} - v_{xx} + (|\psi|^2)_{xx} = 0, \tag{35}$$

where the cubic term in Eq.(34) describes the nonlinear self-interaction in the high-frequency subsystem; such a term corresponds to a self-focusing effect in plasma physics. The coefficient λ is a real constant that can be positive or negative. Let us assume the travelling wave solution of Eqs.(34) and (35) in the form

$$\psi(x, t) = e^{i\eta}\varphi(\xi), \quad v = v(\xi), \quad \eta = \alpha x + \beta t, \quad \xi = k(x - 2\alpha t), \tag{36}$$

where φ(ξ) and v(ξ) are real functions, and the constants α, β and k are to be determined. Substituting (36) into Eqs.(34) and (35), we have

$$k^2\varphi''(\xi) + 2\varphi(\xi)v(\xi) - (\alpha^2 + \beta)\varphi(\xi) - 2\lambda\varphi^3(\xi) = 0, \tag{37}$$

$$k^2(4\alpha^2 - 1)v''(\xi) + k^2(\varphi^2)''(\xi) = 0. \tag{38}$$

In order to simplify ODEs (37) and (38), we integrate Eq.(38) once, take the integration constant to be zero, and integrate again, which yields

$$v(\xi) = \frac{\varphi^2(\xi)}{1 - 4\alpha^2} + C, \qquad \text{if } \alpha^2 \neq \tfrac{1}{4}, \tag{39}$$

where C is an integration constant. Inserting Eq.(39) into (37), we have

$$A\varphi''(\xi) + B\varphi(\xi) + D\varphi^3(\xi) = 0. \tag{40}$$

Eq.(40) coincides with Eq.(23), where A, B and D are defined by

$$A = k^2, \qquad B = 2C - \alpha^2 - \beta, \qquad D = 2\left[\frac{1}{1 - 4\alpha^2} - \lambda\right]. \tag{41}$$

Then the solutions of Eqs.(34) and (35) are

$$\psi(x, t) = e^{i\eta}\varphi(\xi), \qquad v(x, t) = \frac{\varphi^2(\xi)}{1 - 4\alpha^2} + C, \tag{42}$$

where φ(ξ) is given by Eqs.(26-33), η = αx + βt, ξ = k(x − 2αt), and A, B and D are defined by Eq.(41).
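The mapping from the travelling-wave parameters to the coefficients of the elliptic-like equation can be written as a small helper. This is a sketch under stated assumptions: the function name `zakharov_coefficients` and the tolerance used to reject $\alpha^2 = 1/4$ are hypothetical; the formulas themselves are those of Eq.(41).

```python
def zakharov_coefficients(k, alpha, beta, lam, C):
    """Return (A, B, D) of Eq.(41) for the generalized Zakharov equations.

    k, alpha, beta: travelling-wave parameters of Eq.(36)
    lam: coefficient lambda of the cubic term in Eq.(34)
    C: integration constant of Eq.(39)
    """
    if abs(1 - 4 * alpha ** 2) < 1e-12:
        # Eq.(39) is only valid for alpha^2 != 1/4.
        raise ValueError("alpha^2 = 1/4 is excluded by Eq.(39)")
    A = k ** 2
    B = 2 * C - alpha ** 2 - beta
    D = 2 * (1 / (1 - 4 * alpha ** 2) - lam)
    return A, B, D

# Example with assumed values.
print(zakharov_coefficients(k=2.0, alpha=0.0, beta=1.0, lam=0.5, C=1.0))
```

The returned triple can then be inserted into any of Eqs.(26-33) to build φ(ξ), and from it ψ and v through Eq.(42).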

4.2 The coupled nonlinear Klein-Gordon-Zakarov equations

The coupled nonlinear Klein-Gordon-Zakharov equations [14] read

$$u_{tt} - c_0^2\nabla^2 u + f_0^2 u + \delta u v = 0, \qquad v_{tt} - c_0^2\nabla^2 v - \beta\nabla^2|u|^2 = 0, \tag{43}$$

where $c_0$, $f_0$, β and δ are constants. We seek a wave packet solution of the form

$$u(x, y, z, t) = \varphi(\xi)e^{i(kx + ly + nz - \Omega t)}, \quad v = v(\xi), \quad \xi = px + qy + rz - wt, \tag{44}$$

where φ(ξ) and v(ξ) are real functions. Substituting Eq.(44) into Eqs.(43) yields

$$[w^2 - c_0^2 P^2]\varphi''(\xi) + 2i[w\Omega - c_0^2\,K\cdot P]\varphi'(\xi) - (w^2 - K^2 c_0^2 - f_0^2)\varphi(\xi) + \delta v(\xi)\varphi(\xi) = 0,$$

$$[w^2 - c_0^2 P^2]v''(\xi) - \beta P^2(\varphi^2(\xi))'' = 0, \tag{45}$$

where $K = (k, l, n)$, $P = (p, q, r)$ and $K\cdot P = kp + lq + nr$.

If we take $w\Omega = c_0^2\,K\cdot P$, then Eqs.(45) reduce to

$$[w^2 - c_0^2 P^2]\varphi''(\xi) - (w^2 - K^2 c_0^2 - f_0^2)\varphi(\xi) + \delta v(\xi)\varphi(\xi) = 0, \tag{46}$$

$$[w^2 - c_0^2 P^2]v''(\xi) - \beta P^2(\varphi^2(\xi))'' = 0. \tag{47}$$

Integrating (47) twice with respect to ξ, we get

$$v(\xi) = \frac{c}{w^2 - c_0^2 P^2} + \frac{\beta P^2}{w^2 - c_0^2 P^2}\varphi^2(\xi), \tag{48}$$

where c is an integration constant. Substituting (48) into (46), the resulting equation can be expressed as Eq.(23), with the parameters A, B and D defined by

$$A = [w^2 - c_0^2 P^2]^2, \qquad B = (w^2 - c_0^2 P^2)(-w^2 + K^2 c_0^2 + f_0^2) + \delta c, \qquad D = \delta\beta P^2. \tag{49}$$


Then the solutions of Eqs.(43) are

$$u(x, y, z, t) = \varphi(\xi)e^{i(kx + ly + nz - \Omega t)}, \qquad v(x, y, z, t) = \frac{c}{w^2 - c_0^2 P^2} + \frac{\beta P^2}{w^2 - c_0^2 P^2}\varphi^2(\xi), \qquad \Omega = \frac{c_0^2\,K\cdot P}{w}, \tag{50}$$

where φ(ξ) appearing in these solutions is given by Eqs.(26-33), A, B and D are defined by (49), and ξ = px + qy + rz − wt.

4.3 The GDS,DS and GZ equations

We consider a class of NLPDEs with constant coefficients [15]

$$iu_t + \nu(u_{xx} + D_1 u_{yy}) + E_1|u|^2 u + C_1 uv = 0, \qquad D_2 v_{tt} + (v_{xx} - E_2 v_{yy}) + C_2(|u|^2)_{xx} = 0, \tag{51}$$

where ν, $D_i$, $E_i$, $C_i$ are real constants and $\nu \neq 0$, $D_1 \neq 0$, $C_1 \neq 0$, $C_2 \neq 0$. Eqs.(51) are a class of physically important equations. In fact, if one takes

$$\nu = \tfrac{1}{2}k^2, \quad D_1 = 2\nu, \quad E_1 = \alpha, \quad C_1 = -1, \quad D_2 = 0, \quad E_2 = D_1, \quad C_2 = -2\alpha, \quad k^2 = \pm 1, \tag{52}$$

then Eqs.(51) represent the DS equations [16]

$$iu_t + \tfrac{1}{2}k^2(u_{xx} + k^2 u_{yy}) + \alpha|u|^2 u - uv = 0, \qquad v_{xx} - k^2 v_{yy} - 2\alpha(|u|^2)_{xx} = 0. \tag{53}$$

Since u is a complex function, we assume that

$$u(x, y, t) = \varphi(\xi)e^{i(kx + ly - \Omega t)}, \quad v(x, y, t) = v(\xi), \quad \xi = px + qy - wt, \tag{56}$$

where both φ(ξ) and v(ξ) are real functions, and k, l, p, q, Ω and w are constants to be determined later. Substituting Eq.(56) into (51), we have the following ODEs for φ(ξ) and v(ξ):

$$\nu(p^2 + D_1 q^2)\varphi''(\xi) + [\Omega - \nu(k^2 + D_1 l^2)]\varphi(\xi) + E_1\varphi^3(\xi) + i[-w + 2\nu(kp + D_1 lq)]\varphi'(\xi) + C_1\varphi(\xi)v(\xi) = 0, \tag{57}$$

$$[D_2 w^2 + p^2 - E_2 q^2]v''(\xi) + C_2 p^2(\varphi^2(\xi))'' = 0. \tag{58}$$

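If we set

$$w = 2\nu(kp + D_1 lq), \tag{59}$$

then Eq.(57) reduces to

$$\nu(p^2 + D_1 q^2)\varphi''(\xi) + [\Omega - \nu(k^2 + D_1 l^2)]\varphi(\xi) + E_1\varphi^3(\xi) + C_1\varphi(\xi)v(\xi) = 0. \tag{60}$$

Integrating Eq.(58) twice, we get

$$v(\xi) = \frac{c}{D_2 w^2 + p^2 - E_2 q^2} - \frac{C_2 p^2}{D_2 w^2 + p^2 - E_2 q^2}\varphi^2(\xi), \tag{61}$$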

where c is an integration constant. Substituting Eq.(61) into (60) yields

$$\nu(p^2 + D_1 q^2)(D_2 w^2 + p^2 - E_2 q^2)\varphi''(\xi) + \{C_1 c - (D_2 w^2 + p^2 - E_2 q^2)[\Omega - \nu(k^2 + D_1 l^2)]\}\varphi(\xi) + \{E_1(D_2 w^2 + p^2 - E_2 q^2) - C_1 C_2 p^2\}\varphi^3(\xi) = 0. \tag{62}$$

Eq.(62) can be rewritten as Eq.(23), with A, B and D given by

$$A = \nu(p^2 + D_1 q^2)(D_2 w^2 + p^2 - E_2 q^2),$$

$$B = C_1 c - (D_2 w^2 + p^2 - E_2 q^2)[\Omega - \nu(k^2 + D_1 l^2)],$$

$$D = E_1(D_2 w^2 + p^2 - E_2 q^2) - C_1 C_2 p^2. \tag{63}$$

Then the solutions of Eqs.(51) are

$$u(x, y, t) = \varphi(\xi)e^{i(kx + ly - \Omega t)}, \tag{64}$$

$$v(x, y, t) = \frac{c}{D_2 w^2 + p^2 - E_2 q^2} - \frac{C_2 p^2}{D_2 w^2 + p^2 - E_2 q^2}\varphi^2(\xi), \tag{65}$$

$$w = 2\nu(kp + D_1 lq). \tag{66}$$

The expression φ(ξ) appearing in these solutions is given by Eqs.(26-33) and ξ = px + qy − wt. For the DS equations (53) we obtain

$$u(x, y, t) = \varphi(\xi)e^{i(kx + ly - \Omega t)}, \tag{67}$$

$$v(x, y, t) = \frac{c}{p^2 - k^2 q^2} + \frac{2\alpha p^2}{p^2 - k^2 q^2}\varphi^2(\xi), \tag{68}$$

$$w = k^2(kp + k^2 lq), \tag{69}$$

where φ(ξ) satisfies the elliptic-like Eq.(23) with A, B and D defined as follows:

$$A = k^2(p^2 + k^2 q^2)(k^2 q^2 - p^2), \qquad B = 2c + (p^2 - k^2 q^2)[2\Omega - k^2(k^2 + k^2 l^2)], \qquad D = 2\alpha(p^2 + k^2 q^2). \tag{70}$$

The expression φ(ξ) is defined by Eqs.(26-33) and ξ = px + qy − wt. For the GZ equations (55) we have

$$u(x, y, t) = \varphi(\xi)e^{i(kx - \Omega t)}, \tag{71}$$

$$v(x, y, t) = \frac{c}{p^2 - w^2} + \frac{p^2}{p^2 - w^2}\varphi^2(\xi), \tag{72}$$

$$w = 2kp, \tag{73}$$

where φ(ξ) satisfies Eq.(23), with A, B and D given by

$$A = p^2(p^2 - w^2), \qquad B = 2c - (p^2 - w^2)[\Omega - k^2], \qquad D = 2[p^2 - \sigma(p^2 - w^2)]. \tag{74}$$

The expression φ(ξ) appearing in these solutions is given by Eqs.(26-33) and ξ =

px − wt.


4.4 Generalized Hirota-Satsuma coupled KdV equation

Consider the Hirota-Satsuma coupled KdV system in [18]

$$u_t = \tfrac{1}{4}u_{xxx} + 3uu_x + 3(w - v^2)_x, \qquad v_t = -\tfrac{1}{2}v_{xxx} - 3uv_x, \qquad w_t = -\tfrac{1}{2}w_{xxx} - 3uw_x. \tag{75}$$

When w = 0, Eqs.(75) reduce to the well-known Hirota-Satsuma coupled KdV system [19]. We seek travelling wave solutions of Eqs.(75) in the form

$$u(x, t) = u(\xi), \quad v(x, t) = v(\xi), \quad w(x, t) = w(\xi), \quad \xi = k(x - ct). \tag{76}$$

Substituting Eq.(76) into (75), we get

$$-cku' = \tfrac{1}{4}k^3 u''' + 3kuu' + 3k(w - v^2)', \tag{77}$$

$$-ckv' = -\tfrac{1}{2}k^3 v''' - 3kuv', \tag{78}$$

$$-ckw' = -\tfrac{1}{2}k^3 w''' - 3kuw'. \tag{79}$$

Let

$$u = \alpha v^2 + \beta v + \gamma, \qquad w = A_0 v + B_0, \tag{80}$$

where α, β, γ, $A_0$ and $B_0$ are constants. Inserting Eq.(80) into (78) and (79) and integrating once, we find that (78) and (79) give rise to the same equation

$$k^2 v'' = -2\alpha v^3 - 3\beta v^2 + 2(c - 3\gamma)v + c_1, \tag{81}$$

where $c_1$ is an integration constant. Multiplying (81) by $2v'$ and integrating, we have

$$k^2 v'^2 = -\alpha v^4 - 2\beta v^3 + 2(c - 3\gamma)v^2 + 2c_1 v + c_2, \tag{82}$$

where $c_2$ is an integration constant. By means of Eqs.(80-82) we get

$$k^2 u'' = 2\alpha k^2 v'^2 + k^2(2\alpha v + \beta)v'' = 2\alpha[-\alpha v^4 - 2\beta v^3 + 2(c - 3\gamma)v^2 + 2c_1 v + c_2] + (2\alpha v + \beta)[-2\alpha v^3 - 3\beta v^2 + 2(c - 3\gamma)v + c_1]. \tag{83}$$


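Integrating (77) once we have

$$\tfrac{1}{4}k^2 u'' + \tfrac{3}{2}u^2 + cu + 3(w - v^2) + c_3 = 0, \tag{84}$$

where $c_3$ is an integration constant. Inserting (80) and (83) into (84) gives

$$3\alpha c - 3\alpha\gamma + \tfrac{3}{4}\beta^2 - 3 = 0,$$

$$\tfrac{1}{2}(\alpha c_1 + \beta c + \gamma\beta) + A_0 = 0,$$

$$\tfrac{1}{4}(2\alpha c_2 + \beta c_1) + \tfrac{3}{2}\gamma^2 + c\gamma + 3B_0 + c_3 = 0. \tag{85}$$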

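Let

$$c_1 = \frac{1}{2\alpha^2}(\beta^3 + 2c\alpha\beta - 6\alpha\beta\gamma), \qquad v(\xi) = a\varphi(\xi) - \frac{\beta}{2\alpha}. \tag{86}$$

Therefore from Eq.(81), we have

$$k^2\varphi''(\xi) - a\left(\frac{3\beta^2}{2\alpha} + 2c - 6\gamma\right)\varphi(\xi) + 2\alpha a^3\varphi^3(\xi) = 0, \tag{87}$$

so that Eq.(87) can be written as

$$A\varphi''(\xi) + B\varphi(\xi) + D\varphi^3(\xi) = 0. \tag{88}$$

Eq.(88) is the same as Eq.(23), where A, B and D are defined by

$$A = k^2, \qquad B = -a\left(\frac{3\beta^2}{2\alpha} + 2c - 6\gamma\right), \qquad D = 2\alpha a^3. \tag{89}$$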

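Then the solutions of Eqs.(75) are given by

$$u(x, t) = \alpha\left[a\varphi(\xi) - \frac{\beta}{2\alpha}\right]^2 + \gamma, \tag{90}$$

$$v(x, t) = a\varphi(\xi) - \frac{\beta}{2\alpha}, \tag{91}$$

$$w(x, t) = A_0\left[a\varphi(\xi) - \frac{\beta}{2\alpha}\right] + B_0, \tag{92}$$

where the expression φ(ξ) appearing in these solutions is defined by Eqs.(26-33).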


5. Conclusion

In this paper, with the aid of a simple transformation technique, we have shown that the generalized Zakharov equations, the coupled nonlinear Klein-Gordon-Zakharov equations, the GDS, DS and GZ equations and the generalized Hirota-Satsuma coupled KdV system can be reduced to the elliptic-like equation.

The validity of the proposed method has been tested by applying it successfully to each of these systems. As a result, many exact wave solutions are obtained, including new solitary wave solutions, periodic solutions and rational solutions.

Finally, it is worthwhile to mention that the proposed method is straightforward and concise. Applications to other nonlinear physical systems deserve further investigation; this is our task in future work.

Acknowledgement

The author is thankful to Prof. Dr. S. A. El-Wakil for his suggestions, reviews and

continuous encouragement.


References

[1] El-Wakil S A, Abdou M A, Chaos, Solitons and Fractals 31(2007)840-852

[2] El-Wakil S A, Abdou M A. The Adomian decomposition method for solving nonlinear physical models, Chaos, Solitons and Fractals (2007) in Press

[3] Ji-Huan He, Chaos, Solitons and Fractals 26(2005)695

[4] Ji-Huan He,Int.J.Modern Phys.B 20(2006)1141

[5] El-Wakil S A, Abdou M A, Phys. Lett. A 358(2006)275-282

[6] Abdou M A, Soliman A A, Physica D 2005;211:1

[7] Abdou M A, Chaos, Solitons and Fractals 31(2007)95-104.

[8] Ji-Huan He, Wu X H, Chaos, Solitons and Fractals 29(2006)108

[9] El-Wakil S A, Abdou M A, Chaos, Solitons and Fractals 31(2007)1256

[10] El-Wakil S A, Abdou MA, Elhanbaly A, Phys. Lett. A 353(2006)40

[11] Abdou M A, Elhanbaly A. Construction of periodic and solitary wave solutions by the extended Jacobi elliptic function expansion method, Comm. Non. Sci. and Numer. Sim. (2007) in Press

[12] Ji-Huan He, Abdou M A. New periodic solutions for nonlinear evolution equations using Exp-function method, Chaos, Solitons and Fractals (2007) in Press

[13] Wang M, X Li, Phys. Lett. A 343(2005)48

[14] Ablowitz M, Clarkson P A 1991. Solitons, Nonlinear evolution equations and inverse scattering transform, New York, Cambridge University Press.

[15] Zhou Y, Wang M,Miao T. Phys Lett 323(2004)77

[16] Davey A, Stewartson K Proc R Soc Land 338(1974)101

[17] Malomed B et al.Phys Rev E 55(1997)962

[18] Z.Yan, Chaos, Solitons and Fractals 15(2003)575.

[19] Hirota R, Satsuma J, J. Phys Lett A 50(1981)407


Evolutionary Neural Gas (ENG): A Model of Self-Organizing Network from Input Categorization

Ignazio Licata1∗, Luigi Lella2

1Ixtucyber for Complex Systems, Marsala, TP and Institute for Scientific Methodology, Palermo, Italy

2A.R.C.H.I. - Advanced Research Center for Health Informatics, Ancona, Italy

Received 16 December 2006, Accepted 6 January 2007, Published 31 March 2007

Abstract: Despite their claimed biological plausibility, most self-organizing networks have strict topological constraints and consequently cannot take into account a wide range of external stimuli. Furthermore, their evolution is conditioned by deterministic laws which are often not correlated with the structural parameters and the global status of the network, as should happen in a real biological system. In nature the environmental inputs are noise-affected and "fuzzy". This raises the problem of investigating the possibility of emergent behaviour in a net that is not strictly constrained and is subjected to different inputs. We present here a new model of Evolutionary Neural Gas (ENG) without any topological constraints, trained by probabilistic laws depending on the local distortion errors and the network dimension. The network is considered as a population of nodes that coexist in an ecosystem, sharing local and global resources. These particular features allow the network to adapt quickly to the environment, according to its dimensions. The analysis of the ENG model shows that the net evolves as a scale-free graph, and justifies, in a deeply physical sense, the term "gas" used here.

© Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Self-Organizing Networks; Neural Gas; Scale-Free Graph; Information in Network Functional Specialization

PACS (2006): 89.75.k, 89.75.Fb, 82.39.Rt, 07.05.Mh, 84.35.+i, 87.23.n, 91.62.Np

1. Introduction

Self-organizing networks are systems widely used in categorization tasks. A network can be seen as a set A = {c1, c2, ..., cn} of units with associated reference vectors wc ∈ Rn, where Rn is the space in which the inputs are defined. Each unit (or node) can establish connections with the other ones; units belonging to the same cluster are subjected to similar modifications of their reference vectors.

∗ Corresponding author: [email protected]

Self-organizing networks can automatically adapt to input distributions without supervision by means of training algorithms that are simple sequences of deterministic rules. Competitive Hebbian learning and neural gas are the most important strategies used for their training.

The neural gas algorithm (Martinetz T.M. and Schulten K.J., 1991) sorts the network units according to the distance of their reference vectors from each input. The reference vectors are then adapted so that those of the first nodes in the rank order are moved closer to the considered input than the others.

Competitive Hebbian learning (Martinetz and Schulten, 1991; Martinetz, 1993) consists in increasing the weight of the link connecting the two units whose reference vectors are closest to the considered input (the two most activated units).

Both strategies are examples of deterministic rules. There are also other rules that constrain the topology of the network, which then has a fixed dimensionality. This is the case for Self Organizing Maps (Kohonen, 1982) and Growing Cell Structures (Fritzke, 1994). In other cases the network structures have no topological constraints, and they take a well-ordered distribution by adapting exactly to the input manifold. For example, TRN (Martinetz and Schulten, 1994) and GNG are networks whose final structure is similar to a Delaunay triangulation (Delaunay, 1934). We have tried to define a new self-organizing network that is trained by probabilistic rules, avoiding any topological constraints.

According to Jefferson (1995), life and evolution are structured into at least four fundamental levels: molecular, cellular, organism and population. We propose a population-level evolutionary algorithm in which the network is seen as a population of units whose interactions are conditioned by the availability of resources in their ecosystem. The evolution of the population is driven by a selective process that favours the fittest units. This approach has biological plausibility: as stated by recent theories (Edelman, 1987), human brain evolution is subjected to similar selective pressures.

Obviously we are not interested in recreating the same structure as the human brain. Our work aims at finding innovative and effective solutions to the categorization problem by adopting the strategies of natural systems. Our system therefore falls within the Artificial Life field (Langton, 1989).

Our model is a complex system that shows emergent features. In particular its structure evolves as a scale-free graph. In the training phase there arise clusters of units in which a limited number of nodes establish a great number of links with the others.

Scale-free graphs are a particular structure that is very common in natural systems. Human knowledge, for instance, seems to be structured as a scale-free graph (Steyvers, Tenenbaum 2001). If we represent words and concepts as nodes, we find that some of these are more connected than the others.

Scale-free graphs have three main features. The first is the small-world structure: there is a relatively short path between any pair of nodes (Watts, Strogatz, 1998). The second is the inherent tendency to cluster, which is quantified by a coefficient introduced by Watts and Strogatz. Given a node i of degree ki, i.e. having ki edges which connect it to ki other nodes, if those nodes form a cluster they can establish at most ki(ki−1)/2 edges among themselves. The ratio between the actual number of such edges and this maximum number gives the clustering coefficient of node i. The clustering coefficient of the whole network is the average of all the individual clustering coefficients.
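The definition of the clustering coefficient given above translates directly into code. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each node to the set of its neighbours; the function names are hypothetical, and assigning a coefficient of 0 to nodes with fewer than two neighbours is a convention chosen here, not something stated in the text.

```python
def clustering_coefficient(adj, i):
    """Clustering coefficient of node i in an undirected graph.

    adj: dict mapping each node to the set of its neighbours.
    """
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed for nodes with fewer than two neighbours
    # Count each edge among the neighbours of i exactly once.
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return links / (k * (k - 1) / 2)

def average_clustering(adj):
    """Clustering coefficient of the whole network (mean over all nodes)."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)

# A triangle 0-1-2 with an extra node 3 attached to node 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_coefficient(graph, 0), average_clustering(graph))
```

In this example node 0 has three neighbours, of which only the pair (1, 2) is linked, so its coefficient is 1/3.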

The third feature of scale-free graphs is a particular degree distribution with a power-law tail, P(k) ∼ k^(−n). This is why such networks are called "scale free" (Albert, Barabasi, 2000).

The three features above are quantified by three parameters: the average path length between any pair of nodes, the clustering coefficient and the exponent of the power-law tail. We will show that the values of these parameters in our model seem to confirm its scale-free nature.

2. An Outline on Self-Organization and Evolutionary Systems

The natural selection mechanism has been successfully used in many industrial applications, ranging from design to real-time control and neural network training.

It was in the 60s that Genetic Algorithms, based on the three main mechanisms of the theory of evolution (reproduction, mutation and fitness), were first used in dealing with optimization problems. Although the solution is reached by a population of individuals, systems based on this approach are not considered self-organizing, because their dynamics depend on the external constraint of the fitness function.

In the 80s a new approach to the study of living systems emerged, which combined self-organization and evolutionary systems (Rocha, 1997). Its success was due to studies of the way biological systems work (metabolism, adaptability, autonomy, self-repair, growth, evolution etc.). These hybrid systems make it possible to obtain a better simulation of both the evolutionary optimization processes and the internal structure modification, and so to reach a greater biological plausibility in the fitness.

Neuroevolutionary systems are an example of this approach. In classic neuroevolutionary models the network parameters are genetically set, whereas the connection weights are modified according to a training strategy. This solution follows the classic vision of cerebral development, where genes control the formation of synaptic connections while their reinforcement depends on neural activity.

More recent neuroevolutionary systems are characterized by different forms of self-organizing processes, namely cooperative coevolution (Paredis, 1995; Smith, Forrest and Perelson, 1993) and synaptic Darwinism (Edelman, 1987).

Cooperative coevolutionary systems offer a promising alternative to classic evolutionary algorithms (EAs) when we face complex dynamical problems. The main difference with respect to classic EAs is that each individual represents only a partial solution of the problem. Complete solutions are obtained by grouping several individuals. The goal of each individual is to optimize only a part of the solution, cooperating with other individuals that optimize other parts of the solution. Premature convergence towards a single group of individuals is thus avoided. An example of such an approach is given by the Symbiotic Adaptive Neuroevolution System (SANE) (Moriarty and Miikkulainen, 1998), which operates on populations of neural networks.

While in most neuroevolutionary systems each individual represents a complete neural network, in SANE each individual represents a hidden unit of a two-layered network. Units are continuously combined, and the resulting networks are evaluated on the basis of the performance shown in a given task. The global effect is equivalent to schema promotion in standard EAs. In fact, during the evolution of the population the neural schemas having the highest fitness values are favoured, and the possible mutations in the copies of these schemas do not affect the other copies in the population.

Other recent strategies focus on the evolution of connection schemas in the network. In the human brain the number of synapses established by a single neuron is always much lower than the overall number of neurons. This gives the network a sparsely connected aspect. In recent years several models have been proposed to emulate the mechanism involved in the selection of links without referring to the physical and chemical properties of neurons.

The Chialvo and Bak model (Chialvo and Bak, 1999) is based on two simple and biologically inspired principles. First, the neural activity is kept low by selecting the activated units with a winner-takes-all strategy. Second, the external environment gives a negative feedback that inhibits active synapses if the network behaviour is not satisfactory. With these simple rules the model operates in a highly adaptive state and in critical conditions (extremal dynamics). The fundamental difference between this strategy, based on synaptic inhibition, and the classic one, based on synaptic reinforcement, is that reinforcement-based learning is by definition a continuing process, while inhibition-based learning stops when the training goal is achieved. Synaptic inhibition is also biologically plausible. According to Young (Young, 1964; Young, 1966) learning is the result of the elimination of synaptic connections (closing of unneeded channels). Dawkins (Dawkins R., 1971) stressed that pattern learning is achieved by synaptic inhibition. As stated by the theory of neuronal group selection developed by Edelman (Edelman, 1978; Edelman, 1987), brain development is characterized by the generation of structural and dynamical variability within and between populations of neurons, by the interaction of the neural circuit with the environment, and by the differential attenuation or amplification of synaptic connections. Research in neurobiology seems to confirm the validity of the negative feedback model and the fact that neural development follows the process of Darwinian evolution.

The Chialvo and Bak model is a simple two-layered network. After training, each input pattern is associated with a single output unit, leading to the formation of an associative map. When an input pattern is presented, the most activated input unit i is selected. Then the neuron j of the hidden layer that has the strongest connection with i is selected. Finally the output neuron k that is most strongly connected with j is selected. If k is not the desired output, the two links connecting i with j and j with k are inhibited by a coefficient d, which is the only parameter of the model. The iterative application of these rules leads to a rapid convergence towards any input-output mapping. This selective process followed by an inhibitory one is the essence of natural selection in the evolutionary context. The fittest individual is selected on the basis of a strategy that does not reward the best but punishes the worst. This is the reason why this model has been considered a particular kind of synaptic Darwinism.
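The selection-then-inhibition rule described above can be sketched in a few lines. This is an illustrative reading of the description, not the implementation of Chialvo and Bak: the function name, the list-of-lists weight layout, and the choice to inhibit a link by subtracting d from its weight are all assumptions.

```python
def chialvo_bak_step(w_in_hid, w_hid_out, i, target, d):
    """One presentation of input unit i with desired output unit `target`.

    w_in_hid[i][j]:  weight of the link from input i to hidden unit j
    w_hid_out[j][k]: weight of the link from hidden unit j to output k
    Returns the selected output unit k; weights are modified in place.
    """
    # Winner takes all: follow the strongest link at each layer.
    j = max(range(len(w_in_hid[i])), key=lambda n: w_in_hid[i][n])
    k = max(range(len(w_hid_out[j])), key=lambda n: w_hid_out[j][n])
    if k != target:
        # Negative feedback: inhibit only the two links that were used
        # (subtracting d is an assumed form of inhibition).
        w_in_hid[i][j] -= d
        w_hid_out[j][k] -= d
    return k

# Tiny example: one input unit, two hidden units, two output units.
w_ih = [[0.2, 0.9]]
w_ho = [[0.5, 0.1], [0.3, 0.6]]
print(chialvo_bak_step(w_ih, w_ho, i=0, target=0, d=0.5))  # wrong output, links inhibited
print(chialvo_bak_step(w_ih, w_ho, i=0, target=0, d=0.5))  # desired output, no change
```

Nothing is changed when the output is correct, which is why learning in this scheme stops once the desired mapping is reached.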

Our neuroevolutionary model is also based on a selection strategy. The structural information of our network is not encoded by genes. We directly consider the entire network as a population of nodes that can establish connections, generate other units or die. The probability of these events depends on the presence of local and global resources. If resources are scarce the population falls; if resources are abundant the population grows. As in the Chialvo and Bak model, we do not select the fittest nodes by reinforcing their links, but simply remove the worst nodes when the ecosystem resources are low. This generates a selective process that indirectly rewards the units which can better model the input patterns. Our evolutionary strategy can be seen as a selective retention process (Heylighen, 1992) that removes those units which cannot reach a stable state and remain associated with several input patterns. Even if the stability of a unit is quantified by the minimum distortion error related to it, this information must not be considered environmental information. The minimum distortion error simply quantifies the difficulty encountered by the unit in modelling the input patterns.

3. The Evolutionary Algorithm

Research has confirmed (Roughgarden, 1979; Song and Yu, 1988) that in natural environments the population size, along with competition and reproduction rates, changes continuously according to the natural resources and the space available in the ecosystem.

These mechanisms have been reproduced in some evolutionary algorithms, for example to optimize the evolution of a population of chromosomes in a genetic algorithm (Annunziato and Pizzuti, 2000). We have tried to use a similar strategy for the evolution of a population of units in a self-organizing network, without using the string representation of genetic programming.

In our model each node is defined by a vector of neighbouring units connected to it, a reference vector and a variable D, which is the smallest distance between its reference vector and the closest modelled input. The value of this variable quantifies the debility degree of the unit: the lower D is, the higher are the chances for the unit to survive. At each presentation of the training input set, D is set to the maximum value. After the presentation of a given input x, if the reference vector w of the unit is modified, the resulting distance between the two vectors ||x − w|| is calculated. If this value is lower than D, it becomes the new value of D.

The training algorithm here used can be subdivided in three phases:

(1) Winners are selected. For each input the unit having the closest reference vector is

selected.

(2) The reference vectors of the winners and their neighbours are updated according to the following formula:

$$w(t+1) = w(t) + \alpha\,(x - w(t)) \tag{1}$$

The reference vectors w of the selected units are thus moved towards the corresponding inputs x by a fraction α of the distance that separates them. For winners this fraction is two or three orders of magnitude higher than the one used for their neighbours, so the winners' reference vectors move more quickly towards the inputs.

(3) The population of units evolves producing descendants, establishing new connections

and eliminating the less performing units. All these events can occur with a well

defined probability that depends on the availability of resources.

These rules are iterated until a given goal is achieved, for example the minimization of the expected quantization error, i.e. the mean of the distances between the winners and the K inputs they model:

$$D = \frac{1}{K}\sum_{i=1}^{K} \lVert x_i - w_j \rVert \tag{2}$$

If this value falls below a certain threshold Dmin, training is stopped.
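The first two phases and the stopping quantity of eq. (2) can be sketched as follows. This is only a minimal illustration in plain Python: the function names, the list-based representation of units, and the `neighbours` dictionary are assumptions made for the example and are not taken from the original implementation.

```python
import math

def distance(x, w):
    """Euclidean distance ||x - w|| between an input and a reference vector."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def find_winner(x, units):
    """Phase 1: index of the unit whose reference vector is closest to x."""
    return min(range(len(units)), key=lambda j: distance(x, units[j]))

def move_towards(w, x, alpha):
    """Eq. (1): w(t+1) = w(t) + alpha * (x - w(t))."""
    return [wi + alpha * (xi - wi) for xi, wi in zip(x, w)]

def training_step(inputs, units, neighbours, alpha_winner, alpha_neighbour):
    """One presentation of the input set (phases 1 and 2 only).

    `neighbours` maps a unit index to the indices linked to it (hypothetical
    layout). Returns the mean winner-input distance of eq. (2), measured
    before each update.
    """
    total = 0.0
    for x in inputs:
        j = find_winner(x, units)
        total += distance(x, units[j])
        units[j] = move_towards(units[j], x, alpha_winner)
        for k in neighbours.get(j, ()):
            units[k] = move_towards(units[k], x, alpha_neighbour)
    return total / len(inputs)
```

Training would call `training_step` repeatedly and stop once the returned value drops below the threshold Dmin; the evolutionary phase (3) is deliberately left out of this sketch.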

The first two phases can be considered a kind of winner-takes-all (WTA) strategy, where only the most activated units are selected and enabled to modify their reference vectors. The third phase is the evolutionary phase (fig. 3.1). Each unit i, with i = 1, ..., N(t), where N(t) is the current population size, can meet the closest winner j with probability Pm.

Fig. 3.1 – The evolutionary phase of the algorithm.

If meeting occurs, the two units establish a link and they can interact by reproducing with probability Pr. In this case two new units are created. One is closer to the first parent, the other to the second parent:

$$w_1 = \frac{w_{p1} + \dfrac{w_{p1}+w_{p2}}{2}}{2}, \qquad w_2 = \frac{w_{p2} + \dfrac{w_{p1}+w_{p2}}{2}}{2} \tag{3}$$

If reproduction does not take place due to the lack of resources, the weakest unit of the population, i.e. the one with the highest debility degree, is removed.

If unit i does not meet any winner, it can interact with the closest node k with probability Pr, establishing a connection and producing a new unit whose reference vector is set between the parents' reference vectors:

$$w_2 = \frac{w_{p1} + w_{p2}}{2} \tag{4}$$
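The two reproduction rules can be illustrated with a short sketch. The helper names below are hypothetical; the code only restates eqs. (3) and (4) on plain Python lists.

```python
def midpoint(a, b):
    """Component-wise midpoint of two reference vectors."""
    return [(ai + bi) / 2 for ai, bi in zip(a, b)]

def offspring_pair(wp1, wp2):
    """Eq. (3): two children, each halfway between one parent and the
    midpoint of both parents, so each child stays closer to 'its' parent."""
    m = midpoint(wp1, wp2)
    return midpoint(wp1, m), midpoint(wp2, m)

def offspring_single(wp1, wp2):
    """Eq. (4): one child placed at the midpoint of the two parents."""
    return midpoint(wp1, wp2)
```

For parents at (0, 0) and (4, 8), `offspring_pair` gives children at (1, 2) and (3, 6), while `offspring_single` gives (2, 4).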

When we fix a maximum population size, the ratio between the actual size and the

threshold N(t)/Nmax can be seen as a global resource of the ecosystem affecting the

probabilities of the events. For example if the population size is low the reproduction

rate should be high. So we can reasonably choose Pr = 1-N(t)/Nmax. If the population

size is high, the chance for the units to meet each other will be higher, so we can set Pm

= N(t)/Nmax.

We can also consider a local resource that is the ratio between the threshold Dmin and

the debility degree Di of the unit i. Each unit i should meet a winner with a probability

Pm=(N(t)/Nmax)(1-Dmin/Di) and Pr = 1 – Pm. In this way winners are not encouraged

to migrate to other groups of nodes and weaker units don’t participate in reproduction

activities.
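The two choices of event probabilities described above can be written compactly as follows. This is a sketch under stated assumptions: the function name and argument names are illustrative, and the clamping of Pm to zero when Di falls below Dmin is an added safeguard that the text does not discuss.

```python
def event_probabilities(n, n_max, d_i=None, d_min=None):
    """Return (Pm, Pr) for a unit.

    First model (global resource only): Pm = N/Nmax, Pr = 1 - N/Nmax.
    Second model (d_i and d_min given): Pm = (N/Nmax) * (1 - Dmin/Di),
    Pr = 1 - Pm.
    """
    x = n / n_max
    if d_i is None or d_min is None:
        return x, 1.0 - x
    # Assumption: a negative factor (Di < Dmin) is clamped to zero.
    p_m = max(0.0, x * (1.0 - d_min / d_i))
    return p_m, 1.0 - p_m
```

With N = 30 and Nmax = 120 the first model gives Pm = 0.25 and Pr = 0.75; in the second model a unit whose debility degree approaches Dmin has a meeting probability close to zero.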

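We can estimate the population growth rate in the following way:

$$\begin{aligned}
N(t+1) &= N(t) + 2P_mP_rN(t) - P_m(1-P_r)N(t) + (1-P_m)N(t) - (1-P_m)P_dN(t) \\
&= 2N(t) - 2P_m^2N(t) \\
&= 2N(t)\left(1 - \frac{N(t)^2}{M_P^2}\right) \;\Rightarrow\; X(t+1) = 2X(t)\left(1 - X(t)^2\right) \quad \text{(first model)} \\
&= 2N(t)\left(1 - \frac{N(t)^2}{M_P^2}\left(1 - \frac{D_{min}}{D}\right)^2\right) \;\Rightarrow\; X(t+1) = 2X(t)\left(1 - X(t)^2\left(1 - \frac{D_{min}}{D}\right)^2\right) \quad \text{(second model)}
\end{aligned} \tag{5}$$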

where X(t) is the normalized size N(t)/Nmax. This is the quadratic-logistic map of Annunziato and Pizzuti (Annunziato and Pizzuti, 2000):

$$X(t+1) = a\,X(t)\left(1 - X(t)^2\right) \tag{6}$$

They proved that different chaotic regimes arise as the parameter a varies. For a<1.7 the behaviour is not chaotic; for 1.7<a<2.1 there are chaotic regimes with simple attractors localized in a fixed part of the phase plane. Theoretically, for the first model we expect to obtain a chaotic regime described by a simple attractor. In the second model the factor (1 – Dmin/D) might reduce the influence of the negative feedback in the final part of network training.
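The map of eq. (6) is easy to explore numerically. The sketch below (function names are illustrative, not from the original work) iterates the map and computes its non-zero fixed point, obtained by solving x = a x (1 − x²), which gives x* = sqrt(1 − 1/a) for a > 1. For a = 2 this is about 0.707, close to the value 0.72 Nmax quoted in the text for the population size.

```python
import math

def quadratic_logistic_step(x, a=2.0):
    """One iteration of eq. (6): X(t+1) = a * X(t) * (1 - X(t)^2)."""
    return a * x * (1.0 - x * x)

def trajectory(x0, steps, a=2.0):
    """Normalized population sizes X(0), ..., X(steps)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(quadratic_logistic_step(xs[-1], a))
    return xs

def nonzero_fixed_point(a):
    """Positive solution of x = a*x*(1 - x^2); requires a > 1."""
    return math.sqrt(1.0 - 1.0 / a)
```

Plotting consecutive pairs (X(t), X(t+1)) from `trajectory` reproduces the kind of phase-plane picture discussed in the simulations section; the sketch makes no claim about the speed or type of convergence.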

It is possible to demonstrate that during the evolution the population size converges to N(t) = 0.72 Nmax. In this phase the probability that a unit establishes n connections with the other ones, for the first model and considering only clusters of n units, is given by:

$$P(n) = \left(\frac{0.72N_{max}}{N_{max}}\right)^{n} - \sum_{i=1}^{0.72N_{max}-1-n}\left(\frac{0.72N_{max}}{N_{max}}\right)^{i+n} = \alpha n^{-\beta} \tag{7}$$

Note that we have subtracted the probability that such n links develop within a cluster of more than n units.

The coefficients α and β of the power law are considered constant at the end of the training. To compute their values, we can take into consideration the cases n=1 and n=0.72Nmax-1, which correspond to the minimum and maximum number of connections at the end of the training.

$$P(1) = 0.72 - \sum_{i=1}^{0.72N_{max}-2} 0.72^{\,i+1} = \alpha\,1^{-\beta} = \alpha \tag{8}$$

$$P(0.72N_{max}-1) = 0.72^{\,0.72N_{max}-1} = \left(0.72 - \sum_{i=1}^{0.72N_{max}-2} 0.72^{\,i+1}\right)(0.72N_{max}-1)^{-\beta}$$

$$\Rightarrow\; \beta = \log_{0.72N_{max}-1}\left(\frac{0.72 - \sum_{i=1}^{0.72N_{max}-2} 0.72^{\,i+1}}{0.72^{\,0.72N_{max}-1}}\right)$$

The tail of the degree distribution tends to stretch when the maximum size of the population increases; this means that in wider networks there are more hubs with a higher degree.

For the second model we can consider that at the end of the training (1-Dmin/D) ∼ ε. So the probability that a unit establishes n links becomes:

$$P(n) = \left(\frac{0.72N_{max}}{N_{max}}\,\varepsilon\right)^{n} - \sum_{i=1}^{0.72N_{max}-1-n}\left(\frac{0.72N_{max}}{N_{max}}\,\varepsilon\right)^{i+n} = \alpha n^{-\beta} \tag{9}$$

$$\Rightarrow\; \beta = \log_{0.72N_{max}-1}\left(\frac{0.72\varepsilon - \sum_{i=1}^{0.72N_{max}-2} (0.72\varepsilon)^{\,i+1}}{(0.72\varepsilon)^{\,0.72N_{max}-1}}\right),$$

and the considerations made for the first model can therefore be extended to the second model.


4. Training the Net: Simulations

We have compared the performances of our networks with those of a Growing Neural

Gas in categorizing bidimensional inputs.

GNG is a self organizing network which, thanks to both the competitive Hebbian learning strategy and the neural gas algorithm, can categorize inputs without altering their exact dimensionality. For the GNG, the parameters of the model are α = 0.5, β = 0.0005, and a new unit is inserted every λ = 300 steps. The maximum age of the links is set to 88.

For the two different ENG models, the parameters are α = 0.05, β = 0.0006 and the maximum size is set to Nmax = 120.

As stopping criterion for both the algorithms we have chosen the minimization of the

expected quantization error that is the average distance between the winners and the

corresponding inputs.

We have considered two different input domains. In the first case inputs are localized

within four square regions, in the second one inputs are uniformly distributed in a ring

region.

As shown in fig.4.1 after the training, GNG reference vectors are all positioned in the

input domain. In the Evolutionary Self Organizing Networks (fig.4.2a and fig.4.2b) some

units fall outside the input domain, but in this way the network remains fully connected.

The statistical analysis of the node degree distribution confirms what appears intuitively evident: the emerging network structure is a typical scale-free one, i.e. a structure where a few hubs manage the links.

Fig. 4.1 – Growing Neural Gas simulations.

Fig. 4.2a – Evolutionary Self Organizing NETwork simulations (first model).

Fig. 4.2b – Evolutionary Self Organizing NETwork simulations (second model).

We trained 30 networks of each type, obtaining the average degree distributions reported in fig. 4.3-4.5. Tab. 1 and 2 report the average values of the structural parameters of the two networks.

While GNG has a high average path length and a low clustering coefficient, ENG has a short average path length and a high clustering coefficient which, along with the power law tail of the degree distribution, confirm its scale-free graph features.

Fig. 4.6 – 4.7 show the population dynamics of the two ENG models. The structure shared by the two different ENG models is due to the fact that the winner units tend to establish the greatest number of connections. These are the favoured units with which each node tries to establish a connection. If the probability also depends on the local distortion error, as happens in the second model, we obtain a final structure that is more similar to the GNG, which is to say more similar to a gas. In fact, the conditions to create a new link become more restrictive, reducing the interaction between each cluster and the whole network. The structure of connections seems to extend more uniformly in the regions where inputs are present, as can be seen in fig. 4.2b (more evident in the circular distribution).

Fig. 4.3 – Average degree distribution in GNG (two different input manifolds)

Fig. 4.4 – Average degree distribution in ENG (first model, two different input manifolds)

Fig. 4.5 – Average degree distribution in ENG (second model, two different input manifolds)

Fig. 4.7 shows the dynamics of the populations in the two different models of ENG. In the first model the population size seems to converge to the final value of 0.72Nmax, confirming the experimental results of Annunziato and Pizzuti. As can be seen in fig. 4.6, since the d value gradually diminishes during the training, the influence of the factor (1−Dmin/d) grows, reducing the effects of the negative feedback which characterizes the quadratic logistic map. This justifies the sudden growth of the population at the end of the training in the second model.

              Average path length   Clustering coefficient   Power law exponent
GNG           -                     0.49                     2.04
ESON (1st)    3.82                  0.64                     1.15
ESON (2nd)    3.92                  0.63                     1.14

Table 1 Comparison of structural parameters (average values, first input domain)

              Average path length   Clustering coefficient   Power law exponent
GNG           6.4                   0.42                     2.98
ESON (1st)    3.61                  0.58                     1.11
ESON (2nd)    3.67                  0.59                     1.14

Table 2 Comparison of structural parameters (average values, second input domain)

Fig. 4.6 – Network size evolution of the two ENG models (first input manifold)

At the end of training new units connect with the winner units which have a lower d, while

the subgroups of units become more isolated. Considering the function (X(t),X(t+1))

the attractor becomes more marked in the second model. This means that the system

tends to converge more toward a precise final state with a lower interaction among the

groups of units.


Fig. 4.7 – Population dynamics (X(t),X(t+1)) of the two ESONET models (first input manifold).

5. The Role of Information in Functional Specialization and Integration

We can classify a system as complex when it is made up of different parts interacting heterogeneously. In addition, its behaviour and its structure have to be neither completely random (as happens in a gas) nor too regular (as happens in a crystal). In Nature we generally observe the co-existence of functionally highly specialized integrated areas.

This is what happens in the brain, where different areas and groups of neurons interact to give rise to an integrated and unitary cognitive scenario (G. M. Edelman, G. Tononi, 2000).

Edelman has introduced the concepts of integration, reciprocal information and complexity in order to mathematically define the functional organization of the cerebral structures. Within a complex system, a subset of elements can be defined as an integrated process if, on a given temporal scale, the elements interact more strongly with each other than with the rest of the system. In a neural net or in a self-organizing one, this means that the units of an integrated group will tend to activate simultaneously.

When the units in a subset are independent, the system’s entropy reaches its maximum

value which is the sum of the entropies of the single elements (local entropies). On the

contrary, when any kind of interaction occurs, the global entropy decreases so becoming

lower than the sum of the local entropies. The integration measure is, therefore, a natural

indicator of the system informational “capacity”.

So the integration of a subset of network units can be calculated by subtracting the entropy of the system considered as a whole from the sum of the entropies of each single component (xi). If each unit can only take two states (activated/not-activated), the number of possible activation patterns of a subset with N units is 2^N. So the system maximum entropy is:

$$H_{max}(X) = \sum_{i=1}^{n} H(x_i) = \sum_{i=1}^{n} p_i \log_2\left(\frac{1}{p_i}\right) = \frac{2^N}{2^N}\log_2\left(1\Big/\frac{1}{2^N}\right) = \log_2 2^N = N \tag{10}$$


and the integration will be:

$$I(X) = \sum_{i=1}^{n} H(x_i) - H(X) \tag{11}$$
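As an illustration of eq. (11), the integration of a small set of binary units can be computed directly from a joint distribution over activation patterns. The dictionary representation (pattern tuple to probability) and the function names are assumptions made for this sketch only.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: sum of p * log2(1/p) over non-zero p."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def marginal(joint, keep):
    """Marginal distribution over the unit indices listed in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def integration(joint):
    """Eq. (11): sum of single-unit entropies minus the joint entropy."""
    n = len(next(iter(joint)))
    local = sum(entropy(marginal(joint, [i]).values()) for i in range(n))
    return local - entropy(joint.values())
```

For N independent units with uniformly distributed patterns the joint entropy equals N bits and the integration is zero; for two units that always share the same state the integration is one bit.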

For the self-organized net considered here, the integration of a sub-group of units takes the following expression:

$$I(X) = N - \sum_{i=1}^{N-1}\binom{N}{i+1} P_i \log_2\left(\frac{1}{P_i}\right) \tag{12}$$

where Pi is the probability for a node to establish i connections. The overall number of the system's states is equal to the total number of possible groups of i+1 units. Groups of units having the same size (groups of i+1 units) give the same contribution to the entropy of the system.

If we choose the WTA strategy as activation modality, for each presented input only a single unit (the winner) and its related i units (1 < i < N-1) will be activated. All the other ones remain not-activated.

The probability for a node to create connections is ruled by the power law $P_i = \alpha i^{-\beta}$, with α and β depending on 1) the network dimension, 2) the local distortion errors (for the second model) and 3) the particular evolution of the network structure, i.e. the dynamic behaviour of α(t) and β(t).

So the integration of the two self-organizing networks presented here is:

$$I(X) = N - \sum_{i=1}^{N-1}\binom{N}{i+1}\left(\alpha i^{-\beta}\right)^{i}\log_2\left(\frac{1}{\left(\alpha i^{-\beta}\right)^{i}}\right) \tag{13}$$

The integration can be seen as a measure of the statistical dependency within a subset of units. The stronger their interactions are, the higher their integration.

In order to measure the statistical dependency between a subset and the whole system, Edelman introduced the concept of mutual information. Given the n-th subset made up of k elements, $X^k_n$, and its complement in the system, $X - X^k_n$, the mutual information is:

$$I_R\left(X^k_n; X - X^k_n\right) = H\left(X^k_n\right) + H\left(X - X^k_n\right) - H(X) \tag{14}$$
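Eq. (14) can be checked on a toy system in the same spirit. The sketch below is self-contained and uses hypothetical names; the joint distribution is again a dictionary from activation patterns to probabilities.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a {state: probability} dictionary."""
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal distribution over the unit indices listed in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(joint, subset):
    """Eq. (14): H(subset) + H(complement) - H(whole system)."""
    n = len(next(iter(joint)))
    rest = [i for i in range(n) if i not in subset]
    return (entropy(marginal(joint, list(subset)))
            + entropy(marginal(joint, rest))
            - entropy(joint))
```

In a three-unit system where unit 1 copies unit 0 and unit 2 is independent, the subset {0} shares one bit with the rest of the system, while the subset {2} shares none.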

The mutual information is essential to evaluate the differentiation degree of a system, i.e. it is a significant index of the system's “resolution” degree, calculated on the subdividable and distinct states.

In order to measure the information of an integrated activation pattern, we calculate how the states of a given subset can be differentiated from those of the whole system. Following Edelman, this is equivalent to considering the whole system as the observer of itself. In fact, if entropy measures the variability of a system according to the evaluation of an external observer, the mutual information measures the system variability according to an observer ideally placed within the system itself.

The overall measure of the differentiation degree of a complex system is given by the mutual information average between each subset and the whole system:

$$C(X) = \sum_{k=1}^{N/2}\left\langle I_R\left(X^k_n; X - X^k_n\right)\right\rangle \tag{15}$$

Edelman defined such a measure as complexity. Its value is high if each subset can take, on average, many different states which are statistically dependent on those of the whole system, so it shows how differentiated the system is. High complexity values correspond to an optimal synthesis of functional specialization and functional integration. Systems whose elements are not integrated (such as a gas) or not specialized (such as a homogeneous crystal) have minimum complexity.

In the evolutionary neural gas case, the WTA strategy limits the integration among the activation patterns. So the mutual information between any activation pattern and the other possible patterns is equal to zero. This justifies the use of the term “gas”, since the patterns behave like isles of information weakly interacting with each other.

If more winner units were selected for the same input signal in the early training phase, a given system state characterized by i+1 activated units could be obtained not only by the activation of a single winner and its related i units, but also by the activation of several winners. Therefore we should also take into consideration all the possible sub-groups with j+1 elements.

The entropies entering the mutual information between a subgroup with k activated units and the system are given by:

$$H\left(X^k_n\right) = \sum_{i=1}^{k-1}\binom{k}{i+1}\left[\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}\right]\log_2\left(\frac{1}{\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}}\right) \tag{16}$$

$$\begin{aligned}
H\left(X - X^k_n\right) ={}& \sum_{i=1}^{k-2}\binom{N}{i+1}\left[\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}\right]\log_2\left(\frac{1}{\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}}\right) \\
&+ \left(\binom{N}{k} - 1\right)\left[\left(\alpha (k-1)^{-\beta}\right)^{k-1} + \sum_{j=1}^{k-1}\binom{k}{j+1}\left(\alpha j^{-\beta}\right)^{j}\right]\log_2\left(\frac{1}{\left(\alpha (k-1)^{-\beta}\right)^{k-1} + \sum_{j=1}^{k-1}\binom{k}{j+1}\left(\alpha j^{-\beta}\right)^{j}}\right) \\
&+ \sum_{i=k}^{N-1}\binom{N}{i+1}\left[\left(\alpha i^{-\beta}\right)^{i} + \sum_{\substack{j=1\\ j\neq k-1}}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j} + \left(\binom{i+1}{k} - 1\right)\left(\alpha j^{-\beta}\right)^{j}\right] \\
&\quad\cdot \log_2\left(\frac{1}{\left(\alpha i^{-\beta}\right)^{i} + \sum_{\substack{j=1\\ j\neq k-1}}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j} + \left(\binom{i+1}{k} - 1\right)\left(\alpha j^{-\beta}\right)^{j}}\right)
\end{aligned}$$

$$H(X) = \sum_{i=1}^{N-1}\binom{N}{i+1}\left[\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}\right]\log_2\left(\frac{1}{\left(\alpha i^{-\beta}\right)^{i} + \sum_{j=1}^{i}\binom{i+1}{j+1}\left(\alpha j^{-\beta}\right)^{j}}\right)$$

To provide the system with a greater level of complexity, favouring the integration among the subgroups of network units, it is therefore necessary to adopt a strategy different from the WTA in the early training phases so as to select more winner units.

6. Conclusions and Future Works

The self-organizing network presented here can be considered as an example of an autopoietic system, which evolves by means of a closed network of interactions based upon the production of components (the categorization units). In the course of the reproductive dynamics, these produce other components also belonging to the system (i.e. other categorization units), which maintain the system identity over time with respect to the experimental task.

In particular, it has to be noticed that it is not just the environmental information that leads the evolution of the network of connections, but rather the internal status of the network, which is identified globally by the size that the population has reached and locally by the values of the parameters of the units. The latter show the difficulties that the units encounter in modelling the presented input; such difficulty is directly proportional to the amount of variation their reference vectors are subjected to.

Learning and the capability to model the system's external inputs therefore emerge more by means of the internal dynamics of the population than by means of a learning algorithm.

The appearance of a scale-free structure emerging from the choice of the population dynamics is particularly significant for the model's biological plausibility. It describes a phase-transition-like state where clusters “float” as informational “isles” in a “gaseous” configuration. It is worth noticing that the WTA strategy and the environmental noise (probabilistic laws) suffice to create a kind of basic informational skeleton around which more interconnected functional structures can then aggregate. In the nervous system, this plausibly happens according to an essentially genetic design. This kind of neural dynamics guarantees flexibility and redundancy to the informational nuclei, which are ready to synchronize and connect through signals. What we have tried to describe here is a proto-neural scenario with low integration of clusters which are specialized in easy categorization tasks.

Developing the ENG model requires investigating different synchronization scenarios among clusters and their ensuing functional integration to execute more complex tasks. In particular, it is necessary to modify the evolutionary dynamics so as to make the connections among units active. In this way, it should be possible to create a dynamic neural topology capable of hierarchical organization.

Everything seems to confirm not only the deep reasons for the scale-free structures recurring in nature (Z. Toroczkai, K. E. Bassler, 2004), but also the fundamental lesson associating complexity with a thin border zone between integration and differentiation among the functional modules of a system.

Acknowledgements

The authors thank Eliano Pessa and Graziano Terenzi for their precious suggestions.


References

[1] Albert R. and Barabasi A. (2000), Topology of evolving networks: Local events and universality, in Physical Review Letters, vol. 85, p. 5234.

[2] Annunziato M. and Pizzuti S. (2000), Adaptive Parametrization of Evolutionary Algorithms Driven by Reproduction and Competition, in Proceedings of ESIT 2000, Aachen, Germany.

[3] Chialvo D.R. and Bak P. (1999), Learning from mistakes, in Neuroscience, Vol. 90, No. 4, pp. 1137-1148.

[4] Dawkins R. (1971), Selective neurone death as a possible memory mechanism, in Nature, n. 229, pp. 118-119.

[5] Delaunay B. (1934), Bulletin of the Academy of Sciences USSR, vol. 7, pp. 793-800.

[6] Edelman G.M. (1978), Group selection and phasic reentrant signaling: a theory of higher brain function, in The Mindful Brain (eds Edelman G.M. and Mountcastle V.), pp. 51-100, MIT, Cambridge.

[7] Edelman G.M. (1987), Neural Darwinism: The Theory of Neuronal Group Selection, Basic Books, New York.

[8] Edelman G.M., Tononi G. (2001), A Universe of Consciousness: How Matter Becomes Imagination, Basic Books.

[9] Fritzke B. (1994), Growing Cell Structures. A Self-Organizing Network for Unsupervised and Supervised Learning, in Neural Networks, 7(9), pp. 1441-1460.

[10] Heylighen F. (1992), Principles of Systems and Cybernetics: an evolutionary perspective, in Cybernetics and Systems '92, R. Trappl (ed.), World Science, Singapore, pp. 3-10.

[11] Langton C.G. (1989), Artificial Life: The Proceedings of an Interdisciplinary Workshop on the Synthesis and Simulation of Living Systems, Addison-Wesley.

[12] Jefferson D. and Taylor C. (1995), Artificial Life as a Tool for Biological Inquiry, in Artificial Life: an Overview, edited by C.G. Langton, MIT Press, pp. 1-10.

[13] Kohonen T. (1982), Self-Organized Formation of Topologically Correct Feature Maps, in Biological Cybernetics, n. 43, pp. 59-69.

[14] Martinetz T.M. (1993), Competitive Hebbian Learning Rule Forms Perfectly Topology Preserving Maps, in ICANN'93, International Conference on Artificial Neural Networks, Springer, Amsterdam, pp. 427-434.

[15] Martinetz T.M. and Schulten K.J. (1991), A Neural Gas Network Learns Topologies, in Artificial Neural Networks, T. Kohonen, K. Makisara, O. Simula, and J. Kangas, eds., North-Holland, Amsterdam, pp. 397-402.

[16] Martinetz T.M. and Schulten K.J. (1994), Topology Representing Networks, in Neural Networks, 7(3), pp. 507-522.

[17] Moriarty D.E. and Miikkulainen R. (1998), Forming Neural Networks Through Efficient and Adaptive Coevolution, in Evolutionary Computation, 5(4), pp. 373-399.

[18] Paredis J. (1995), Coevolutionary Computation, in Artificial Life, 2, pp. 355-375.

[19] Rocha L.M. (1997), Evolutionary Systems and Artificial Life, Lecture Notes, Los Alamos, NM 87545.

[20] Roughgarden J. (1979), Theory of Population Genetics and Evolutionary Ecology, Prentice-Hall.

[21] Smith R.E., Forrest S., Perelson A.S. (1993), Searching for Diverse Cooperative Populations with Genetic Algorithms, in Evolutionary Computation, 1(2), pp. 127-149.

[22] Song J. and Yu J. (1988), Population System Control, Springer-Verlag.

[23] Steyvers M. and Tenenbaum J. (2001), The Large-Scale Structure of Semantic Networks, working draft submitted to Cognitive Science.

[24] Toroczkai Z. and Bassler K.E. (2004), Jamming is Limited in Scale-Free Systems, in Nature, 428, p. 716.

[25] Watts D.J., Strogatz S.H. (1998), Collective Dynamics of 'Small-World' Networks, in Nature, vol. 393, pp. 440-442.

[26] Young J.Z. (1964), A Model of the Brain, Clarendon, Oxford.

[27] Young J.Z. (1966), The Memory System of the Brain, University of California Press, Berkeley.


Discrete Groups Approach to Non Symmetric Gravitation Theory

N. Mebarki, F. Khelili and J. Mimouni∗

Laboratoire de Physique Mathematique et Subatomique, Mentouri University, Constantine, Algeria

Received 22 August 2006, Accepted 6 January 2007, Published 31 March 2007

Abstract: A generalized discrete group formalism is obtained and used to describe the Non Symmetric Gravity theory (NGT) coupled to a scalar field. We are able to derive explicitly the various terms of the NGT action including the interaction term without any ad-hoc assumptions.

© Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: General Relativity, Non Commutative Geometry, Non Symmetric Gravity
PACS (2006): 04.20.Cv, 04.90.+e, 95.30.Sf, 02.40.Gh, 11.10.Nx

1. Introduction

During the past few years, a renewed interest in the non commutative geometry approach [1], [2], [3], [4] to the standard model and some of the grand unified theories has appeared among physicists and mathematicians. The motivation is to find probable answers to the remaining outstanding problems. One of the promising approaches is the one using discrete groups [5], [6], [7], which has been shown to have an intimate relation to non commutative geometry and in which the scalar particles are treated on an equal footing with the usual gauge bosons. Recently, this formalism has been applied to the case of General Relativity [8], where it was shown that the gravitational field is completely decoupled from the scalar one.

The purpose of this paper is to generalize this approach, based essentially on the work presented in references [9], [10], and to derive explicitly the various terms of the Non Symmetric Gravitation theory (NGT) action [11], [12], [13], [14]. In section 2 we present the mathematical formalism; in section 3 we derive the NGT action together with the scalar field interaction terms. Finally, in section 4 we draw our conclusions.



2. Formalism

An alternative to A. Connes's Non Commutative Geometry [1], [2], [3], [4] is the discrete groups approach [5], [6], [7], based on the algebra of 2 × 2 matrices having p-differential forms as entries. In this formulation, a generalized product denoted by $\odot$ is used to define the structure of a Z2 graded associative algebra. Thus, the product of two elements of this algebra is given by [8]:

$$\begin{pmatrix} A & C \\ D & B \end{pmatrix} \odot \begin{pmatrix} A' & C' \\ D' & B' \end{pmatrix} = \begin{pmatrix} A\wedge A' + (-)^{\partial C}\,C\wedge D' & C\wedge B' + (-)^{\partial A}\,A\wedge C' \\ D\wedge A' + (-)^{\partial B}\,B\wedge D' & B\wedge B' + (-)^{\partial D}\,D\wedge C' \end{pmatrix} \tag{1}$$

where A, B, C, D, A′, B′, C′, D′ are p-forms, ∂ stands for the degree of these p-forms, and ∧ denotes the exterior product.

One can also define a nilpotent differential operator d satisfying a generalized Leibniz rule as follows [8]:

$$dX = d\begin{pmatrix} A & C\\ D & B\end{pmatrix} = \begin{pmatrix} dA + C + D & -dC - (A-B) \\ -dD + (A-B) & dB + C + D\end{pmatrix}, \qquad d(X\odot X') = dX\odot X' + (-)^{\partial X}X\odot dX' \tag{2}$$

This formulation was applied to describe the Einstein-Hilbert action with a minimal coupling of gravitation to scalar fields [8].

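Concerning NGT, one can define the following generalized spin connection $\Omega^{ab}$:

$$\Omega^{ab} = \begin{pmatrix}\omega^{ab} & \phi^{ab}\\ \bar\phi^{ab} & \bar\omega^{ab}\end{pmatrix}, \qquad a = \begin{cases} i = 1, 2, \ldots, n \\ \dot a = n+1, \ldots, N\end{cases} \tag{3}$$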

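where $\omega^{ab}$ and $\bar\omega^{ab}$ (resp. $\phi^{ab}$ and $\bar\phi^{ab}$) are the generalized hyperbolic complex 1-forms (resp. 0-forms), whose components in the holonomic basis $\{e^i,\ i = 1, \ldots, n\}$ are given by:

$$\omega^{ab} = \omega^{ab}_\mu\,dX^\mu, \qquad \bar\omega^{ab} = \bar\omega^{ab}_\mu\,dX^\mu, \qquad dX^\mu = E^\mu_i\,e^i \tag{4}$$

and the generalized vierbein is defined as:

$$E^\mu_i = \begin{pmatrix} 0 & e^\mu_i\\ \bar e^\mu_i & 0\end{pmatrix}$$

Here $e^\mu_i$ is hyperbolic complex and $\bar e^\mu_i$ is its hyperbolic complex conjugate:

$$\begin{aligned}
e^\mu_i &= \alpha^\mu_i + \varepsilon\beta^\mu_i, \qquad \bar\varepsilon = -\varepsilon,\quad \varepsilon^2 = 1\\
\bar e^\mu_i &= \alpha^\mu_i - \varepsilon\beta^\mu_i, \qquad \alpha^\mu_i,\ \beta^\mu_i \in C^\infty_R(X)
\end{aligned}\tag{5}$$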

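A generalized orthonormal basis can be defined such that:

$$\xi^a = \begin{pmatrix}\rho^a & s^a\\ -\bar s^a & \rho^a\end{pmatrix}, \qquad a = \begin{cases} i = 1, 2, \ldots, n \\ \dot a = n+1, \ldots, N\end{cases} \tag{6}$$

where $\rho^a$ is a 1-form (resp. $s^a$ and $\bar s^a$ are 0-forms) given by:

$$\begin{aligned}
\rho^i &= e^i_\mu\,dX^\mu, & i &= 1, 2, \ldots, n\\
\rho^{\dot a} &= 0, & \dot a &= n+1, \ldots, N
\end{aligned}\tag{7}$$

$$\begin{aligned}
s^i &= 0, \quad \bar s^i = 0, & i &= 1, 2, \ldots, n\\
s^{\dot a} &= M\lambda^{\dot a}, \quad \bar s^{\dot a} = \bar M\bar\lambda^{\dot a}, & \dot a &= n+1, \ldots, N
\end{aligned}\tag{8}$$

where $e^\mu_j$ is the inverse of the vierbein, verifying:

$$e^i_\mu e^\mu_j = \delta^i_j, \qquad e^i_\mu e^\nu_i = \delta^\nu_\mu \tag{9}$$

and $M$, $\bar M$ are the following 2 × 2 matrices:

$$M = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}, \qquad \bar M = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix} \tag{10}$$

Here $\lambda^{\dot a}$ and its hyperbolic complex conjugate $\bar\lambda^{\dot a}$ are arbitrary functions.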
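The algebra of the two matrices in eq. (10) can be verified by hand or with a few lines of code. The sketch below uses plain nested lists and a hypothetical helper `matmul`; it shows that both matrices are nilpotent and that their two mixed products are the diagonal matrices diag(1, 0) and diag(0, 1), which reappear in the curvature and torsion components.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

M = [[0, 1], [0, 0]]
M_BAR = [[0, 0], [1, 0]]

M_M_BAR = matmul(M, M_BAR)   # diag(1, 0)
M_BAR_M = matmul(M_BAR, M)   # diag(0, 1)
M_SQUARED = matmul(M, M)     # zero matrix: M is nilpotent
```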

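The exterior product and differential operator for the generalized spin connection components are defined by:

$$\begin{aligned}
\omega^{ab}\wedge\omega^{cd} &= \omega^{ab}_\mu\omega^{cd}_\nu\,dX^\mu\wedge dX^\nu = E^{[\mu}_i E^{\nu]}_j\,\omega^{ab}_\mu\omega^{cd}_\nu\;e^i\cdot e^j\\
\omega^{ab}\wedge\varphi^{cd} &= E^\mu_i M\,\omega^{ab}_\mu\varphi^{cd}\,e^i\\
\varphi^{ab}\wedge\omega^{cd} &= M E^\mu_i\,\varphi^{ab}\omega^{cd}_\mu\\
\varphi^{ab}\wedge\bar\varphi^{cd} &= M\bar M\,\varphi^{ab}\bar\varphi^{cd}
\end{aligned}\tag{11}$$

and

$$\begin{aligned}
d\omega^{ab} &= d\left(\omega^{ab}_\mu\,dX^\mu\right) = \left(d\omega^{ab}_\mu\right)dX^\mu = \partial_\mu\omega^{ab}_\nu\,dX^\mu\wedge dX^\nu\\
d\phi^{ab} &= \partial_\mu\phi^{ab}\,dX^\mu = M E^\mu_i\,\partial_\mu\varphi^{ab}\,e^i\\
d\bar\phi^{ab} &= \partial_\mu\bar\phi^{ab}\,dX^\mu = E^\mu_i\bar M\,\partial_\mu\bar\varphi^{ab}\,e^i
\end{aligned}\tag{12}$$

with:

$$\phi^{ab} = M\varphi^{ab}, \qquad \bar\phi^{ab} = \bar M\bar\varphi^{ab} \tag{13}$$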

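Now imposing the unitarity condition:

$$\left(\Omega^{ab}\right)^* = \Omega^{ba} \tag{14}$$

where * is an involution such that:

$$\left(e^i\right)^* = -e^i, \qquad \left(dX^\mu\right)^* = -dX^\mu \tag{15}$$

we obtain the following constraints:

$$\begin{aligned}
\omega^{ab}_\mu &= -\omega^{ba}_\mu, & \bar\omega^{ab}_\mu &= -\bar\omega^{ba}_\mu\\
\varphi^{ab} &= \varphi^{ba}, & \bar\varphi^{ab} &= \bar\varphi^{ba}
\end{aligned}\tag{16}$$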

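As for the curvature 2-form $R^{ab}$, it is given by [8]:

$$R^{ab} = d\Omega^{ab} + \Omega^{ac}\odot\Omega^{cb}$$

Straightforward calculations lead to:

$$\begin{aligned}
R^{ab}_{11} &= d\omega^{ab} + \omega^{ac}\wedge\omega^{cb} + \phi^{ac}\bar\phi^{cb} + \phi^{ab} + \bar\phi^{ab} = R^{ab} + \tau\varphi^{ac}\bar\varphi^{cb} + M\varphi^{ab} + \bar M\bar\varphi^{ab}\\
R^{ab}_{12} &= -d\phi^{ab} + \phi^{ac}\bar\omega^{cb} - \omega^{ac}\phi^{cb} - \left(\omega^{ab} - \bar\omega^{ab}\right) = -\nabla\phi^{ab} - \left(\omega^{ab} - \bar\omega^{ab}\right)\\
R^{ab}_{21} &= -d\bar\phi^{ab} + \bar\phi^{ac}\omega^{cb} - \bar\omega^{ac}\bar\phi^{cb} + \left(\omega^{ab} - \bar\omega^{ab}\right) = -\nabla\bar\phi^{ab} - \left(\bar\omega^{ab} - \omega^{ab}\right)\\
R^{ab}_{22} &= d\bar\omega^{ab} + \bar\omega^{ac}\wedge\bar\omega^{cb} + \bar\phi^{ac}\phi^{cb} + \phi^{ab} + \bar\phi^{ab} = \bar R^{ab} + \bar\tau\bar\varphi^{ac}\varphi^{cb} + M\varphi^{ab} + \bar M\bar\varphi^{ab}
\end{aligned}$$

with

$$\tau = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}, \qquad \bar\tau = \begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}$$

and

$$\begin{aligned}
\nabla\phi^{ab} &= e^j e^\mu_j\,\nabla_\mu\varphi^{ab}\,\tau - e^j e^\mu_j\,\omega^{ac}_\mu\varphi^{cb}\,\tau_3\\
\nabla_\mu\varphi^{ab} &= \partial_\mu\varphi^{ab} - \varphi^{ac}\bar\omega^{cb}_\mu + \omega^{ac}_\mu\varphi^{cb}\\
\nabla\bar\phi^{ab} &= e^j \bar e^\mu_j\,\nabla_\mu\bar\varphi^{ab}\,\bar\tau + e^j \bar e^\mu_j\,\bar\omega^{ac}_\mu\bar\varphi^{cb}\,\tau_3\\
\nabla_\mu\bar\varphi^{ab} &= \partial_\mu\bar\varphi^{ab} + \bar\omega^{ac}_\mu\bar\varphi^{cb} - \bar\varphi^{ac}\omega^{cb}_\mu
\end{aligned}$$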

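It is worth mentioning that $R^{ab}$ and $\bar R^{ab}$ have the following expressions:

$$\begin{aligned}
R^{ab} &= \left(\partial_\mu\omega^{ab}_\nu + \omega^{ac}_\mu\omega^{cb}_\nu\right)dX^\mu\wedge dX^\nu = \tfrac{1}{2}R^{ab}_{\mu\nu}\,dX^\mu\wedge dX^\nu\\
\bar R^{ab} &= \left(\partial_\mu\bar\omega^{ab}_\nu + \bar\omega^{ac}_\mu\bar\omega^{cb}_\nu\right)dX^\mu\wedge dX^\nu = \tfrac{1}{2}\bar R^{ab}_{\mu\nu}\,dX^\mu\wedge dX^\nu\\
R^{ab}_{\mu\nu} &= \partial_\mu\omega^{ab}_\nu + \omega^{ac}_\mu\omega^{cb}_\nu - (\mu\leftrightarrow\nu) = -R^{ab}_{\nu\mu}\\
\bar R^{ab}_{\mu\nu} &= \partial_\mu\bar\omega^{ab}_\nu + \bar\omega^{ac}_\mu\bar\omega^{cb}_\nu - (\mu\leftrightarrow\nu) = -\bar R^{ab}_{\nu\mu}
\end{aligned}$$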

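The torsion is defined by [8]:

$$T^a = d\xi^a + \Omega^{ab}\odot\xi^b \tag{17}$$

Using the fact that:

$$dX^\mu\wedge dX^\nu = E^{[\mu}_i E^{\nu]}_j\;e^i\cdot e^j = \left[\eta^{\mu\nu}_{ij}\,e^{[i}\cdot e^{j]} + \varepsilon g^{\mu\nu}_{ij}\,e^{(i}\cdot e^{j)}\,\tau_3\right]$$

where $\eta^{\mu\nu}_{ij}$ and $g^{\mu\nu}_{ij}$ are the real and imaginary parts of the product $e^\mu_i\bar e^\nu_j$, that is:

$$G^{\mu\nu}_{ij} = e^\mu_i\bar e^\nu_j = \eta^{\mu\nu}_{ij} - \varepsilon g^{\mu\nu}_{ij} \tag{18}$$

where ε is a pure imaginary hyperbolic complex number ($\varepsilon^2 = 1$) and $\tau_3$ is the usual Pauli matrix:

$$\tau_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$$

The notations ( ) and [ ] denote the symmetric and antisymmetric parts respectively. Direct simplifications lead to: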

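$$\begin{aligned}
(T^a)_{11} &= d\rho^a + \omega^{ab}\wedge\rho^b - \phi^{ab}\bar s^b + s^a - \bar s^a\\
&= \left(\partial_\mu\rho^a_\nu + \omega^{ab}_\mu\rho^b_\nu\right)dX^\mu\wedge dX^\nu - \tau\varphi^{ab}\bar\lambda^b + s^a - \bar s^a\\
(T^a)_{22} &= d\rho^a + \bar\omega^{ab}\wedge\rho^b + \bar\phi^{ab}s^b + s^a - \bar s^a\\
&= \left(\partial_\mu\rho^a_\nu + \bar\omega^{ab}_\mu\rho^b_\nu\right)dX^\mu\wedge dX^\nu + \bar\tau\bar\varphi^{ab}\lambda^b + s^a - \bar s^a\\
(T^a)_{12} &= -ds^a + \phi^{ab}\rho^b - \omega^{ab}s^b = -e^j e^\mu_j\left(\partial_\mu\lambda^a - \varphi^{ab}\rho^b_\mu\right)\tau - e^j e^\mu_j\,\omega^{ab}_\mu\lambda^b\,\tau\\
(T^a)_{21} &= d\bar s^a + \bar\phi^{ab}\rho^b + \bar\omega^{ab}\bar s^b = e^j \bar e^\mu_j\left(\partial_\mu\bar\lambda^a - \bar\varphi^{ab}\rho^b_\mu\right)\bar\tau + e^j \bar e^\mu_j\,\bar\omega^{ab}_\mu\bar\lambda^b\,\bar\tau
\end{aligned}$$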

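The components of $T^i$ are given by:

$$\begin{aligned}
(T^i)_{11} &= \left(\eta^{\mu\nu}_{kl}\,e^{[k}\cdot e^{l]} + \varepsilon g^{\mu\nu}_{kl}\,e^{(k}\cdot e^{l)}\,\tau_3\right)\left(\partial_\mu e^i_\nu + \omega^{ij}_\mu e^j_\nu\right) - \tau\varphi^{i\dot a}\bar\lambda^{\dot a}\\
(T^i)_{22} &= \left(\eta^{\mu\nu}_{kl}\,e^{[k}\cdot e^{l]} + \varepsilon g^{\mu\nu}_{kl}\,e^{(k}\cdot e^{l)}\,\tau_3\right)\left(\partial_\mu e^i_\nu + \bar\omega^{ij}_\mu e^j_\nu\right) + \bar\tau\bar\varphi^{i\dot a}\lambda^{\dot a}\\
(T^i)_{12} &= e^j\varphi^{ij}\,\tau - e^j e^\mu_j\,\omega^{i\dot a}_\mu\lambda^{\dot a}\,\tau\\
(T^i)_{21} &= e^j\bar\varphi^{ij}\,\bar\tau + e^j \bar e^\mu_j\,\bar\omega^{i\dot a}_\mu\bar\lambda^{\dot a}\,\bar\tau
\end{aligned}$$

while those of $T^{\dot a}$ are:

$$\begin{aligned}
(T^{\dot a})_{11} &= \left(\eta^{\mu\nu}_{kl}\,e^{[k}\cdot e^{l]} + \varepsilon g^{\mu\nu}_{kl}\,e^{(k}\cdot e^{l)}\,\tau_3\right)\omega^{\dot a k}_\mu e^k_\nu - \tau\varphi^{\dot a\dot b}\bar\lambda^{\dot b} + M\lambda^{\dot a} - \bar M\bar\lambda^{\dot a}\\
(T^{\dot a})_{22} &= \left(\eta^{\mu\nu}_{kl}\,e^{[k}\cdot e^{l]} + \varepsilon g^{\mu\nu}_{kl}\,e^{(k}\cdot e^{l)}\,\tau_3\right)\bar\omega^{\dot a k}_\mu e^k_\nu + \bar\tau\bar\varphi^{\dot a\dot b}\lambda^{\dot b} + M\lambda^{\dot a} - \bar M\bar\lambda^{\dot a}\\
(T^{\dot a})_{12} &= -e^j e^\mu_j\left(\partial_\mu\lambda^{\dot a} - \varphi^{\dot a k}e^k_\mu\right)\tau - e^j e^\mu_j\,\omega^{\dot a\dot b}_\mu\lambda^{\dot b}\,\tau\\
(T^{\dot a})_{21} &= -e^j \bar e^\mu_j\left(\partial_\mu\bar\lambda^{\dot a} + \bar\varphi^{\dot a k}e^k_\mu\right)\bar\tau + e^j \bar e^\mu_j\,\bar\omega^{\dot a\dot b}_\mu\bar\lambda^{\dot b}\,\bar\tau
\end{aligned}$$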

3. The NGT Action

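If one defines the scalar product (·, ·) as:

$$(X, Y) = \int {*}\,\mathrm{tr}\,(X\odot Y) = \int\sqrt{e\bar e}\;d^4x\;\mathrm{tr}\,X_{i_1\ldots i_p}Y_{j_1\ldots j_q}\,{*}\left(e^{i_1}\ldots e^{i_p}\right)\left(e^{j_1}\ldots e^{j_q}\right) \tag{19}$$

where $X = X_{i_1\ldots i_p}e^{i_1}\ldots e^{i_p}$ and $Y = Y_{j_1\ldots j_q}e^{j_1}\ldots e^{j_q}$, and ∗ is the Hodge star operator verifying the following equations:

$$\begin{aligned}
{*}\left(e^i\cdot e^j\right) &= -\delta^{ij} = {*}\,e^{(i}\cdot e^{j)}\\
{*}\left(e^i\cdot e^j\cdot e^k\cdot e^l\right) &= \delta^{ij}\delta^{kl} - \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk}\\
{*}\left(e^i\cdot e^j\cdot e^{(k}\cdot e^{l)}\right) &= \delta^{ij}\delta^{kl} + \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk}\\
{*}\left(e^i\right) &= 0 = {*}\left(e^{j_1}\ldots e^{j_{2k+1}}\right)\\
{*}(1) &= 0, \qquad {*}\,e^{[i}\cdot e^{j]} = 0
\end{aligned}$$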

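then the NGT action takes the form:

$$I = \frac{1}{2}\int\sqrt{e\bar e}\;d^4x\;{*}\,\mathrm{Tr}\left[E^a\odot E^{b*} - E^{b*}\odot E^a\right]\odot R^{ba} \tag{20}$$

where $E^a$ are given by:

$$E^a = \begin{pmatrix}\tau\rho^a & M\lambda^a\\ -\bar M\bar\lambda^a & \bar\tau\rho^a\end{pmatrix}, \qquad a = \begin{cases} i = 1, 2, \ldots, n \\ \dot a = n+1, \ldots, N\end{cases} \tag{21}$$

that is:

$$E^i = \begin{pmatrix}Me^i & 0\\ 0 & \bar Me^i\end{pmatrix}, \qquad E^{\dot a} = \begin{pmatrix}0 & M\lambda^{\dot a}\\ -\bar M\bar\lambda^{\dot a} & 0\end{pmatrix}$$

After a direct calculation we obtain: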

I = I(1) + I(2) (22)

with

I(1) =

∫ √eed4x

(−Gμν

(Rμν + Rμν

)+

1

2

(ϕiaϕai − ϕiaϕai

))(23)

I(2) = −1

2

∫ √eed4x{λ

.a

eμi

(∂μϕ

.ai + ω

.abμ ϕbi − ϕ

.abωbi

μ

)(24)

+λ.aeμ

i

(∂μϕ

i.a + ωib

μ ϕb.a − ϕibωb

.a

μ

)}

Now, in order to get dynamical fields, we impose the following weak torsionless con-

ditions: ⎛⎜⎝ 0 M

−M 0

⎞⎟⎠� T i = 0 (25)

and

Tr (τ3 ⊗ 1) � T.a = 0

Here Tr denotes the trace over the 2 × 2 matrices algebra.

After some straightforward simplifications, the action becomes (see Appendix A):

I =

∫ √eed4xL

where

L = L(1)+L(2)+L

with

L(1) = Gμν(Rμν + Rμν

)= 2GμνRμν

L(2) = −12


)= 0

and

L(3) = 12λ

.aeμi

(∂μϕ

.ai + ω

.abμ ϕbi − ϕ

.abωbi

μ

)+ 1

2λ

.aeμ

i

(∂μϕ

i.a + ωib

μ ϕb.a − ϕibωb

.a

μ

)Note that Gμν = eμ

i eνi is the NGT metric.

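Setting λ = exp(εΦ), we get:

$$L = G^{\mu\nu}R_{\mu\nu} + \tfrac{1}{2}G^{(\mu\nu)}W_\mu W_\nu - \tfrac{1}{2}G^{[\mu\nu]}\partial_\nu W_\mu + \tfrac{1}{2}G^{(\mu\nu)}\partial_\mu\Phi\,\partial_\nu\Phi - G^{(\mu\nu)}\varepsilon W_\mu\partial_\nu\Phi$$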

Notice that one can also add the following cosmological term J (see Appendix B):

J =1

2

∫∗Tr[Ea � Eb∗ − Eb∗ � Ea

]� (ξb � ξa∗) (26)

which may be also written as:

J = −∫ √

eed4x(−2G[μν]G[μν] − 4λλ − 8 − Gνμ

ji Gjiμν

)(27)

4. Conclusions

We have shown that the discrete groups formalism can be consistently generalized to the case of Non Symmetric Gravitation Theory, and have obtained in the process a Lagrangian density containing the pure NGT action, an interaction term, and the kinetic term for the scalar field Φ. Thus, the various terms that Moffat introduced by hand for physical consistency are seen here to result from the generalized discrete group approach. Moreover, a dynamical scalar field was found to be necessary in this formalism but, contrary to General Relativity, it couples to the gravitational field (term proportional to $G^{(\mu\nu)}\varepsilon W_\mu\partial_\nu\Phi$).

Appendix A

In order to get dynamical fields, we impose a weak torsionless condition:⎛⎜⎝ 0 M

−M 0

⎞⎟⎠� T i = 0 , T rτ

((τ3 ⊗ 1) � T

.a)

= 0

where Trτ denotes the trace over M2 (K) ( M, M ,τ ,τ , τ3).

We thus get the following constraints:

ωi.aμ = ωi

.aμ = 0(

∂μeiν + ωij

μ ejν

)= 0(

∂μeiν + ωij

μ ejν

)= 0

∂μλ.a − ϕ

.akek

μ − ω.a

.b

μ λ.b = 0

∂μλ.a + ϕ

.akek

μ − ω.a

.b

μ λ.b = 0

Consequently we obtain:

Rijμν = R

ij

μν

Rμν = Rμν

Now by imposing also that Tr (T i) = 0, we get:

λ.aϕi

.a = λ

.aϕi

.a

which implies:

ϕi.aϕ

.ai − ϕi

.aϕ

.ai = 0

Using the fact that:


ϕijϕji − ϕijϕji = 0

we obtain:

ϕiaϕai − ϕiaϕai = 0

and thus

L(2) = −12


)= 0

By taking into account the above constraints, L(3) takes the form:

L(3) = 12λeμ

i

(∂μϕ

5i − ω55μ ϕ5i − ϕ5jωji

μ

)+ 1

2λeμ

i

(∂μϕ

i5 + ωijμ ϕj5 − ϕi5ω55

μ

)Putting Wμ = ω55

μ , Wμ = ω55μ = −Wμ, and using the compatibility condition:

∇μeσi = ∂μe

σi − ωji

μ eσj + W σ

αμeαi = 0

we end up with:

2L(3) = λGσμ∂μ (∂σ (λ − Wσ) λ) − Wσλ + GνμW σνμλ (∂μ − Wσ) λ

−λGσμWμ (∂σλ − Wσλ) + h.c.c

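where h.c.c. denotes the hyperbolic complex conjugate.

Using the parametrization λ = exp(εΦ), $L^{(3)}$ becomes:

$$L^{(3)} = G^{(\mu\nu)}W_\mu W_\nu - G^{[\mu\nu]}\partial_\nu W_\mu + G^{(\mu\nu)}\partial_\mu\Phi\,\partial_\nu\Phi - 2G^{(\mu\nu)}\varepsilon W_\mu\partial_\nu\Phi$$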
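The parametrization λ = exp(εΦ) relies on the algebra of hyperbolic complex numbers, where ε² = +1, so that exp(εΦ) = cosh Φ + ε sinh Φ and λ times its hyperbolic conjugate equals 1. The following minimal sketch illustrates this arithmetic; the class `Hyperbolic` and the helper `hexp` are hypothetical names introduced only for illustration and are not part of the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperbolic:
    """Hyperbolic complex number re + eps*im with eps**2 = +1 (illustrative helper)."""
    re: float
    im: float

    def __mul__(self, other):
        # (a + eps b)(c + eps d) = (ac + bd) + eps (ad + bc), using eps**2 = +1
        return Hyperbolic(self.re * other.re + self.im * other.im,
                          self.re * other.im + self.im * other.re)

    def conj(self):
        # Hyperbolic complex conjugate: eps -> -eps
        return Hyperbolic(self.re, -self.im)


def hexp(phi):
    """exp(eps*phi) = cosh(phi) + eps*sinh(phi)."""
    return Hyperbolic(math.cosh(phi), math.sinh(phi))


eps = Hyperbolic(0.0, 1.0)
lam = hexp(0.7)               # lambda = exp(eps * Phi) with an arbitrary Phi = 0.7
norm = lam * lam.conj()       # cosh^2 - sinh^2 = 1, so lambda has unit modulus
prod = hexp(0.3) * hexp(0.4)  # group property: exp(eps a) exp(eps b) = exp(eps (a + b))
```

Because λ has unit hyperbolic modulus, only the field Φ carries dynamics, which is why the kinetic term in $L^{(3)}$ is written in terms of ∂Φ.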

Finally we get for the action I :

I =

∫ √eed4xL

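with

$$L = 2G^{\mu\nu}R_{\mu\nu} + G^{(\mu\nu)}W_\mu W_\nu - G^{[\mu\nu]}\partial_\nu W_\mu + G^{(\mu\nu)}\partial_\mu\Phi\,\partial_\nu\Phi - 2G^{(\mu\nu)}\varepsilon W_\mu\partial_\nu\Phi$$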

Appendix B

The cosmological term can be obtained from the following expression:

J = 12

∫ ∗Tr[Ea � Eb∗ − Eb∗ � Ea

]� (ξb � ξa∗) = 12

(J(1)−J(2)

)where:

J(1)=∫ ∗Tr{(Ea � Eb∗ − Eb∗ � Ea

)11∧ (ξb � ξa∗)

11

+(Ea � Eb∗ − Eb∗ � Ea

)22∧ (ξb � ξa∗)

22}

and

J(2) =∫ ∗Tr{(Ea � Eb∗ − Eb∗ � Ea

)12∧ (ξb � ξa∗)

21

+(Ea � Eb∗ − Eb∗ � Ea

)21∧ (ξb � ξa∗)

12}

Straightforward calculations give:

J(1)=2∫ ∗Tr{(Ei � Ej∗ − Ej∗ � Ei)11 ∧ (ξj � ξi∗)11

= −2∫ √

eed4x(GνμGμν − Gνμ

ij Gjiμν − 12

)and:

J(2) =∫ ∗Tr{

((E

.a � Ei∗

)−(Ei∗ � E

.a))

12∧ (ξi � ξ

.a∗)21

+((

E.a � Ei∗

)−(Ei∗ � E

.a))

21∧ (ξb � ξa∗)12

+((Ei � E.b∗) − (E

.b∗ � Ei))21 ∧ (ξ

.b � ξi∗)12

+((Ei � E.b∗) − (E

.b∗ � Ei))12 ∧ (ξ

.b � ξi∗)21}


J(2) = −8∫ √

eed4xλ.aλ

.a

Finally we obtain:

J = − ∫ √eed4x(GνμGμν − Gνμ

ij Gjiμν − 12 − 4λ

.aλ

.a)

= − ∫ √eed4x(−2G[μν]G[μν] − 4λλ − 8 − Gνμ

ji Gjiμν

)


References

[1] T.Schucker, J.M.Zylinski, J.Geom.Phys, 16, (1995) 207.

[2] A. Connes, in "Essay on Physics and Non-commutative Geometry", The Interface of Mathematics and Particle Physics, Clarendon Press, Oxford (1990).

[3] D.Kastler, T.Schucker, J.Geom.Phys. 24 (1997)61.

[4] A.H.Chamseddine, G.Felder, J.Frohlich, Nucl. Phys. B395 (1993) 672

[5] A.Sitarz, J. Geom. Phys. 15 (1995) 123.

[6] R.Coquereaux, G.Esposito-Farese, G.Vaillant, Nucl.Phys. B 353, (1991) 689.

[7] M.Dubois-Violette, R.Kerner, J.Madore, J.Math.Phys. 31, (1990)316 .

[8] N.Mohammedi, Mod.Phys.Lett.A9 (1994) 875.

[9] F.Khelili, J.Mimouni, N.Mebarki, J.Math.Phys. 42 (8) (2001)3615

[10] N.Mebarki, F.Khelili, J.Mimouni, in "Extended Non Symmetric Gravitation Theory with a Scalar Field in Non Commutative Geometry", Mentouri Univ. Preprints, August 2006.

[11] J.W.Moffat, Phys.Rev. D19, (1979)3554.

[12] J.Legare, J.W.Moffat, in "Field Equations and Conservation Laws in the Nonsymmetric Gravitational Theory", arXiv:gr-qc/9412009

[13] J.W.Moffat, J.Math.Phys, 29 (7) (1988)1655.

[14] J.W.Moffat, J.Math.Phys, 21 (7) (1980)1798.


Quantization of the Scalar Field Coupled Minimally to the Vector Potential

W. I. Eshraim1∗ and N. I. Farahat2†

Department of Physics, Islamic University of Gaza

P.O. Box 108, Gaza, Palestine

Received 6 July 2006, Accepted 16 August 2006, Published 31 March 2007

Abstract: A system consisting of a scalar field coupled minimally to the vector potential is quantized using the canonical path integral formulation based on the Hamilton-Jacobi treatment. The equations of motion are obtained as total differential equations and the integrability conditions are examined.
© Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Hamilton-Jacobi Formalism, Path Integral Quantization, Constrained Systems
PACS (2006): 11.10.Ef, 03.65.-w, 11.10.z, 31.15.Kb

1. Introduction

The Dirac approach [1,2] is widely used for quantizing constrained Hamiltonian systems. The path integral is another approach to the quantization of constrained systems of classical singular theories; it was initiated by Faddeev [3], who applied it when only first-class constraints in the canonical gauge are present. Senjanovic [4] generalized Faddeev's method to second-class constraints. Fradkin and Vilkovisky [5,6] rederived both results in a broader context, where they extended the procedure to Grassmann variables. Gitman and Tyutin [7] discussed the canonical quantization of singular theories as well as the Hamiltonian formalism of gauge theories in an arbitrary gauge.

The Hamilton-Jacobi approach [8-10] is a very powerful approach for treating constrained systems. The equations of motion for a singular system are obtained as total differential equations in many variables. The integrability conditions for the system lead to the canonical reduced phase-space coordinates without using any gauge fixing conditions. Muslih and Guler have constructed the desired path integral in the context of the canonical formalism [11-14], which is based on the Hamilton-Jacobi approach.

In this paper, we treat the scalar field coupled minimally to the vector potential as a constrained system. The path integral quantization is obtained using both the Hamilton-Jacobi approach and the Faddeev approach, and the results are compared.

2. Path Integral Formulation

In this section, we briefly review the Faddeev method and the Hamilton-Jacobi method for studying the path integral for constrained systems.

2.1 Faddeev-Popov Method

Consider a mechanical system with n degrees of freedom and α first-class constraints $\phi_a$, but no second-class constraints. Faddeev formulated the transition amplitude as [3]

$$\langle \mathrm{Out}\,|\,S\,|\,\mathrm{In}\rangle = \int \exp\left[ i \int_{-\infty}^{\infty} \left( p_i \dot{q}_i - H_0 \right) dt \right] \prod_t d\mu\big(q_i(t), p_i(t)\big), \qquad (1)$$

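where $H_0$ is the Hamiltonian of the system. The measure of integration is defined by

$$d\mu(q, p) = \left( \prod_{a=1}^{\alpha} \delta(\chi_a)\,\delta(\phi_a) \right) \det\|\{\chi_a, \phi_a\}\| \prod_{i=1}^{n} dp_i\, dq_i, \qquad (2)$$

and $\chi_a(p_i, q_i)$ are the gauge-fixing conditions with

1. $\{\chi_a, \chi_{a'}\} = 0$,

2. $\det\|\{\chi_a, \phi_a\}\| \neq 0$.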

2.2 Hamilton-Jacobi Path Integral Quantization

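One starts from a singular Lagrangian $L \equiv L(q_i, \dot{q}_i, \tau)$, $i = 1, 2, \ldots, n$, with the Hessian matrix

$$A_{ij} = \frac{\partial^2 L(q_i, \dot{q}_i, \tau)}{\partial \dot{q}_i\, \partial \dot{q}_j}, \qquad i, j = 1, 2, \ldots, n, \qquad (3)$$

of rank $(n - r)$, $r < n$. Then r momenta are dependent. The generalized momenta $p_i$ corresponding to the generalized coordinates $q_i$ are defined as

$$p_a = \frac{\partial L}{\partial \dot{q}_a}, \qquad a = 1, 2, \ldots, n - r, \qquad (4)$$

$$p_\mu = \frac{\partial L}{\partial \dot{q}_\mu}, \qquad \mu = n - r + 1, \ldots, n. \qquad (5)$$

Since the Hessian matrix has rank $(n - r)$, Eq. (4) can be solved for the velocities $\dot{q}_a$ as

$$\dot{q}_a = \dot{q}_a(q_i, \dot{q}_\mu, p_b; \tau) \equiv w_a. \qquad (6)$$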


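Substituting Eq. (6) into Eq. (5), we get

$$p_\mu = \left. \frac{\partial L}{\partial \dot{q}_\mu} \right|_{\dot{q}_a \equiv w_a} \equiv -H_\mu(q_i, \dot{q}_\mu, p_a; \tau). \qquad (7)$$

Relations (7) indicate that the generalized momenta $p_\mu$ are not independent of $p_a$, which is a natural result of the singular nature of the Lagrangian.

The canonical Hamiltonian $H_0$ is defined as

$$H_0 = -L(q_i, \dot{q}_\mu, \dot{q}_a \equiv w_a; \tau) + p_a w_a + \left. p_\mu \dot{q}_\mu \right|_{p_\mu = -H_\mu}. \qquad (8)$$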
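The construction in Eqs. (3)-(7) can be illustrated on a small example. The sketch below uses SymPy and a hypothetical two-coordinate toy Lagrangian (chosen only for illustration, not taken from the paper) to compute the Hessian with respect to the velocities, read off its rank, and recover the dependent momentum of Eq. (7).

```python
import sympy as sp

q1, q2 = sp.symbols("q1 q2", real=True)
v1, v2 = sp.symbols("v1 v2", real=True)  # velocities \dot q_1, \dot q_2
p1 = sp.symbols("p1", real=True)         # independent momentum p_a

# Hypothetical toy Lagrangian: only the combination v1 + v2 appears quadratically
L = sp.Rational(1, 2) * (v1 + v2) ** 2 - sp.Rational(1, 2) * (q1 - q2) ** 2

velocities = [v1, v2]
hessian = sp.hessian(L, velocities)  # A_ij of Eq. (3)
rank = hessian.rank()                # equals n - r
n = len(velocities)
r = n - rank                         # number of dependent momenta

# Eq. (4) solved for the velocity v1 = w1, as in Eq. (6)
w1 = sp.solve(sp.Eq(sp.diff(L, v1), p1), v1)[0]

# Eq. (7): the remaining momentum evaluated at v1 = w1 defines -H_2
p2_on_shell = sp.simplify(sp.diff(L, v2).subs(v1, w1))
H2 = -p2_on_shell
```

For this toy model the Hessian has rank 1, so one momentum is dependent, and the relation p2 = p1 plays the role of the primary constraint $p_\mu = -H_\mu$.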

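The set of Hamilton-Jacobi partial differential equations (HJPDE) is expressed as

$$H'_\alpha\left( \tau, q_\mu, q_a, p_i = \frac{\partial S}{\partial q_i}, p_0 = \frac{\partial S}{\partial \tau} \right) = 0, \qquad \alpha = 0, n - p + 1, \ldots, n, \qquad (9)$$

where

$$H'_\alpha = p_\alpha + H_\alpha. \qquad (10)$$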

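The equations of motion are obtained as total differential equations in many variables as follows:

$$dq_r = \frac{\partial H'_\alpha}{\partial p_r}\, dt_\alpha, \qquad r = 0, 1, \ldots, n, \qquad (11)$$

$$dp_a = -\frac{\partial H'_\alpha}{\partial q_a}\, dt_\alpha, \qquad a = 0, \ldots, n - p, \qquad (12)$$

$$dp_\mu = -\frac{\partial H'_\alpha}{\partial q_\mu}\, dt_\alpha, \qquad \alpha = 0, n - p + 1, \ldots, n, \qquad (13)$$

$$dZ = \left( -H_\alpha + p_a \frac{\partial H'_\alpha}{\partial p_a} \right) dt_\alpha, \qquad (14)$$

where $Z = S(t_\alpha, q_a)$ is the action. The set of Eqs. (11-14) is integrable if

$$dH'_\alpha = 0, \qquad \alpha = 0, n - p + 1, \ldots, n. \qquad (15)$$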

If conditions (15) are not satisfied identically, one may consider them as new constraints and again test the integrability conditions. Repeating this procedure, a set of conditions may be obtained.

In this case the path integral representation may be written as [11-14]

$$\langle \mathrm{Out}\,|\,S\,|\,\mathrm{In}\rangle = \int \prod_{a=1}^{n-r} dq^a\, dp^a \exp\left[ i \int_{t_\alpha}^{t'_\alpha} \left( -H_\alpha + p_a \frac{\partial H'_\alpha}{\partial p_a} \right) dt_\alpha \right]. \qquad (16)$$

One should notice that the integral (16) is an integration over the canonical phase-space coordinates $q^a$, $p^a$.


3. The Scalar Field Coupled Minimally to the Vector Potential

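Consider the action integral for the scalar field coupled minimally to the vector potential,

$$S = \int d^4x\, \mathcal{L}, \qquad (17)$$

where the Lagrangian $\mathcal{L}$ is given by

$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu}(x) F^{\mu\nu}(x) + (D_\mu\varphi)^*(x)\, D^\mu\varphi(x) - m^2 \varphi^*(x)\varphi(x), \qquad (18)$$

where

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \qquad (19)$$

and

$$D_\mu\varphi(x) = \partial_\mu\varphi(x) - ieA_\mu(x)\varphi(x). \qquad (20)$$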

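Let us first discuss the system using the Hamilton-Jacobi approach. In this approach the canonical momenta (4) and (5) take the forms

$$\pi^i = \frac{\partial \mathcal{L}}{\partial \dot{A}_i} = -F^{0i}, \qquad (21)$$

$$\pi^0 = \frac{\partial \mathcal{L}}{\partial \dot{A}_0} = 0, \qquad (22)$$

$$p_\varphi = \frac{\partial \mathcal{L}}{\partial \dot{\varphi}} = (D_0\varphi)^* = \dot{\varphi}^* + ieA_0\varphi^*, \qquad (23)$$

$$p_{\varphi^*} = \frac{\partial \mathcal{L}}{\partial \dot{\varphi}^*} = (D_0\varphi) = \dot{\varphi} - ieA_0\varphi. \qquad (24)$$

From Eqs. (21), (23) and (24), the velocities $\dot{A}_i$, $\dot{\varphi}^*$ and $\dot{\varphi}$ can be expressed in terms of the momenta $\pi_i$, $p_\varphi$ and $p_{\varphi^*}$ respectively as

$$\dot{A}_i = -\pi_i - \partial_i A_0, \qquad (25)$$

$$\dot{\varphi}^* = p_\varphi - ieA_0\varphi^*, \qquad (26)$$

$$\dot{\varphi} = p_{\varphi^*} + ieA_0\varphi. \qquad (27)$$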

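The canonical Hamiltonian $H_0$ is obtained as

$$H_0 = \frac{1}{4} F^{ij}F_{ij} - \frac{1}{2}\pi_i\pi^i + \pi^i\partial_i A_0 + p_{\varphi^*}p_\varphi + ieA_0\varphi\, p_\varphi - ieA_0\varphi^* p_{\varphi^*} - (D_i\varphi)^*(D^i\varphi) + m^2\varphi^*\varphi. \qquad (28)$$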
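The scalar-field terms of Eq. (28) can be cross-checked symbolically. The sketch below is a simplified check under stated assumptions: it keeps only the time-derivative part of the scalar sector (spatial gradients and the gauge-field kinetic terms are dropped), treats φ and φ* as independent variables, and uses hypothetical symbol names. It performs the Legendre transform of Eqs. (23)-(27) with SymPy.

```python
import sympy as sp

phi, phis = sp.symbols("phi phi_star")     # field and its conjugate, treated as independent
v, vs = sp.symbols("v v_star")             # velocities \dot phi and \dot phi^*
P, Ps = sp.symbols("p_phi p_phi_star")     # conjugate momenta
A0, e, m = sp.symbols("A0 e m", real=True)

# Time component of the covariant derivative, Eq. (20), and its conjugate
D0phi = v - sp.I * e * A0 * phi
D0phi_conj = vs + sp.I * e * A0 * phis

# Scalar sector restricted to time derivatives (assumption of this sketch)
L = D0phi_conj * D0phi - m**2 * phis * phi

# Momenta, Eqs. (23)-(24)
p_phi = sp.diff(L, v)
p_phis = sp.diff(L, vs)

# Invert for the velocities, Eqs. (26)-(27)
sol = sp.solve([sp.Eq(P, p_phi), sp.Eq(Ps, p_phis)], [v, vs], dict=True)[0]

# Legendre transform restricted to the scalar sector
H = sp.expand((P * v + Ps * vs - L).subs(sol))
expected = P * Ps + sp.I * e * A0 * phi * P - sp.I * e * A0 * phis * Ps + m**2 * phis * phi
```

Under these assumptions `H` reproduces the terms $p_{\varphi^*}p_\varphi + ieA_0\varphi p_\varphi - ieA_0\varphi^* p_{\varphi^*} + m^2\varphi^*\varphi$ of Eq. (28).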

Making use of (9) and (10), we find for the set of HJPDE

$$H'_0 = p_0 + H_0, \qquad (29)$$

$$H' = \pi_0 + H = \pi_0 = 0. \qquad (30)$$

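Therefore, the total differential equations for the characteristics (11-13) are obtained as

$$dA^i = \frac{\partial H'_0}{\partial \pi_i}\, dt + \frac{\partial H'}{\partial \pi_i}\, dA_0 = -(\pi_i + \partial_i A_0)\, dt, \qquad (31)$$

$$dA_0 = \frac{\partial H'_0}{\partial \pi_0}\, dt + \frac{\partial H'}{\partial \pi_0}\, dA_0 = dA_0, \qquad (32)$$

$$d\varphi = \frac{\partial H'_0}{\partial p_\varphi}\, dt + \frac{\partial H'}{\partial p_\varphi}\, dA_0 = (p_{\varphi^*} + ieA_0\varphi)\, dt, \qquad (33)$$

$$d\varphi^* = \frac{\partial H'_0}{\partial p_{\varphi^*}}\, dt + \frac{\partial H'}{\partial p_{\varphi^*}}\, dA_0 = (p_\varphi - ieA_0\varphi^*)\, dt, \qquad (34)$$

$$d\pi^i = -\frac{\partial H'_0}{\partial A_i}\, dt - \frac{\partial H'}{\partial A_i}\, dA_0 = \left[ \partial_l F^{li} + ie(\varphi^*\partial^i\varphi + \varphi\,\partial^i\varphi^*) + 2e^2 A^i\varphi\varphi^* \right] dt, \qquad (35)$$

$$d\pi^0 = -\frac{\partial H'_0}{\partial A_0}\, dt - \frac{\partial H'}{\partial A_0}\, dA_0 = \left[ \partial_i\pi^i + ie\varphi^* p_{\varphi^*} - ie\varphi\, p_\varphi \right] dt, \qquad (36)$$

$$dp_\varphi = -\frac{\partial H'_0}{\partial \varphi}\, dt - \frac{\partial H'}{\partial \varphi}\, dA_0 = \left[ (\vec{D}\cdot\vec{D}\varphi)^* - m^2\varphi^* - ieA_0 p_\varphi \right] dt, \qquad (37)$$

and

$$dp_{\varphi^*} = -\frac{\partial H'_0}{\partial \varphi^*}\, dt - \frac{\partial H'}{\partial \varphi^*}\, dA_0 = \left[ (\vec{D}\cdot\vec{D}\varphi) - m^2\varphi + ieA_0 p_{\varphi^*} \right] dt. \qquad (38)$$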

The integrability condition ($dH'_\alpha = 0$) implies that the variation of the constraint $H'$ should be identically zero, that is

$$dH' = d\pi_0 = 0, \qquad (39)$$

which leads to a new constraint

$$H'' = \partial_i\pi^i + ie\varphi^* p_{\varphi^*} - ie\varphi\, p_\varphi = 0. \qquad (40)$$

Taking the total differential of $H''$, we have

$$dH'' = \partial_i\, d\pi^i + ie\, p_{\varphi^*}\, d\varphi^* + ie\,\varphi^*\, dp_{\varphi^*} - ie\,\varphi\, dp_\varphi - ie\, p_\varphi\, d\varphi = 0. \qquad (41)$$

Then the set of equations (31-38) is integrable. Therefore, the canonical phase space coordinates $(\varphi, p_\varphi)$ and $(\varphi^*, p_{\varphi^*})$ are obtained in terms of the parameters $(t, A_0)$.

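Making use of Eqs. (14) and (28-30), one gets the canonical action integral as

$$Z = \int d^4x \left( -\frac{1}{4} F^{ij}F_{ij} - \frac{1}{2}\pi_i\pi^i + p_\varphi p_{\varphi^*} + \vec{D}\varphi^* \cdot \vec{D}\varphi - m^2|\varphi|^2 \right), \qquad (42)$$

where

$$\vec{D} = \vec{\nabla} + ie\vec{A}. \qquad (43)$$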

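Now the path integral representation (16) is given by

$$\langle \mathrm{Out}\,|\,S\,|\,\mathrm{In}\rangle = \int \prod_i dA^i\, d\pi^i\, d\varphi\, dp_\varphi\, d\varphi^*\, dp_{\varphi^*} \exp\left[ i \int d^4x \left( -\frac{1}{2}\pi_i\pi^i - \frac{1}{4}F^{ij}F_{ij} + p_\varphi p_{\varphi^*} + (D_i\varphi)^*(D^i\varphi) - m^2\varphi^*\varphi \right) \right]. \qquad (44)$$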

To apply the Faddeev method to the previous system, we start with the total Hamiltonian

$$H_T = \frac{1}{4}F^{ij}F_{ij} - \frac{1}{2}\pi_i\pi^i + \pi^i\partial_i A_0 + p_{\varphi^*}p_\varphi + ieA_0\varphi\, p_\varphi - ieA_0\varphi^* p_{\varphi^*} - (D_i\varphi)^*(D^i\varphi) + m^2\varphi^*\varphi + \lambda\pi_0. \qquad (45)$$

According to Dirac's method, the time derivative of the primary constraint should be zero, that is

$$\dot{H}' = \{H', H_T\} = \partial_i\pi^i + ie\varphi^* p_{\varphi^*} - ie\varphi\, p_\varphi \approx 0, \qquad (46)$$

which leads to the secondary constraint

$$H'' = \partial_i\pi^i + ie\varphi^* p_{\varphi^*} - ie\varphi\, p_\varphi \approx 0. \qquad (47)$$

There are no tertiary constraints, since

$$\dot{H}'' = \{H'', H_T\} = 0. \qquad (48)$$

By taking suitable linear combinations of the constraints, one has to find the first-class one, that is

$$\Phi = H' = \pi_0. \qquad (49)$$

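The equations of motion read as

$$\dot{A}_i = \{A_i, H_T\} = -(\pi_i + \partial_i A_0), \qquad (50)$$

$$\dot{A}_0 = \{A_0, H_T\} = \lambda, \qquad (51)$$

$$\dot{\varphi} = \{\varphi, H_T\} = (p_{\varphi^*} + ieA_0\varphi), \qquad (52)$$

$$\dot{\varphi}^* = \{\varphi^*, H_T\} = (p_\varphi - ieA_0\varphi^*), \qquad (53)$$

$$\dot{\pi}^i = \{\pi^i, H_T\} = \partial_l F^{li} + ie(\varphi^*\partial^i\varphi + \varphi\,\partial^i\varphi^*) + 2e^2 A^i\varphi\varphi^*, \qquad (54)$$

$$\dot{\pi}^0 = \{\pi^0, H_T\} = \partial_i\pi^i + ie\varphi^* p_{\varphi^*} - ie\varphi\, p_\varphi, \qquad (55)$$

$$\dot{p}_\varphi = \{p_\varphi, H_T\} = (\vec{D}\cdot\vec{D}\varphi)^* - m^2\varphi^* - ieA_0 p_\varphi, \qquad (56)$$

$$\dot{p}_{\varphi^*} = \{p_{\varphi^*}, H_T\} = (\vec{D}\cdot\vec{D}\varphi) - m^2\varphi + ieA_0 p_{\varphi^*}. \qquad (57)$$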

We will content ourselves with a partial gauge fixing by introducing gauge constraints for the first-class primary constraints only, just to fix the multiplier λ in Eq. (45). Since these are weakly vanishing, a natural gauge choice would be:

$$\phi' = A_0 = 0. \qquad (58)$$

This choice, however, forbids any dynamics for $A_0$, since by Eq. (51) the requirement $\dot{A}_0 = 0$ implies λ = 0.

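Making use of Eq. (1), we obtain the path integral quantization

$$\langle \mathrm{Out}\,|\,S\,|\,\mathrm{In}\rangle = \int \exp\left[ i \int_{-\infty}^{+\infty} \left( -\frac{1}{2}\pi_i\pi^i - \frac{1}{4}F^{ij}F_{ij} + p_\varphi p_{\varphi^*} + \vec{D}\varphi^* \cdot \vec{D}\varphi - m^2|\varphi|^2 \right) d^4x \right] dA^i\, d\pi^i\, d\varphi\, dp_\varphi\, d\varphi^*\, dp_{\varphi^*}. \qquad (59)$$

Eqs. (44) and (59) are thus identical.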

4. Conclusion

Path integral quantization of the scalar field coupled minimally to the vector potential is obtained by using the canonical path integral formulation [11-14]. Since the integrability conditions $dH'_0 = 0$ and $dH' = 0$ are satisfied, the system is integrable; hence the path integral is obtained directly as an integration over the canonical phase space coordinates $A_i$, $\pi_i$, $\varphi$, $p_\varphi$, $\varphi^*$ and $p_{\varphi^*}$ without using any gauge fixing conditions.

The Hamilton-Jacobi quantization is simpler and more economical. Also there is no

need to distinguish between first and second-class constraints, and there is no need to

introduce Lagrange multipliers; all that is needed is the set of Hamilton-Jacobi partial

differential equations and the equations of motion. If the system is integrable then one

can construct the canonical phase space.


References

[1] P.A.M. Dirac, Lectures on Quantum Mechanics, Yeshiva University Press, New York (1964).

[2] P.A.M. Dirac, Can J. Math. 2, 129 (1950).

[3] L.D. Faddeev, Teoret. Mat. Fiz. 1, 3 (1969) [Theor. Math. Phys. 1, 1 (1970)].

[4] P. Senjanovic, Ann. Phys (NY) 100, 227 (1976).

[5] E. S. Fradkin and G. A Vilkovisky, Phys. Rev. D8,4241 (1973).

[6] E. S. Fradkin and G. A Vilkovisky, Phys. Lett. B55,241 (1975).

[7] D. M. Gitman and I.V. Tyutin, Quantization of Fields with Constraints, Springer-Verlag, Berlin (1990).

[8] Y. Guler, Nuovo Cimento B107, 1389 (1992).

[9] Y. Guler, Nuovo Cimento B107, 1143 (1992).

[10] N. Farahat and Y. Guler, Nuovo Cimento B111, 513 (1996).

[11] S. I. Muslih and Y. Guler, Nuovo Cimento B113, 277 (1998).

[12] S. I. Muslih and Y. Guler, Nuovo Cimento B112, 97 (1997).

[13] S. I. Muslih, Nuovo Cimento B115, 1 (2000).



[16] S. I. Muslih, Mod. Phys. Lett. A19, 151 (2004).


A Generalized Option Pricing Model

J. P. Singh∗

Department of Management Studies, Indian Institute of Technology Roorkee

Roorkee 247667, India

Received 6 December 2006, Accepted 6 January 2007, Published 31 March 2007

Abstract: The Black Scholes model of option pricing constitutes the cornerstone of contemporary valuation theory. However, the model rests on several unrealistic assumptions, including the lognormal distribution of stock market price processes. There is now abundant empirical evidence that this is not the case. Consequently, several generalisations of the basic model have been attempted with relaxation of some of the underlying assumptions. In this paper, we postulate a generalization that contemplates a statistical feedback process for the stochastic term in the Black Scholes partial differential equation. Several interesting implications of this modification emanate from the analysis and are explored.
© Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Econophysics, Stochastic Processes, Financial Markets, Black Scholes Model, Option Pricing Model
PACS (2006): 89.65.s, 89.65.Gh, 02.50.Ey, 05.40.a

1. Introduction

With the rapid advancements in the evolution and study of disordered systems and

the associated phenomena of nonlinearity, chaos, self organized criticality etc., the impor-

tance of generalizations of the extant mathematical apparatus to enhance its domain of

applicability to such disordered systems is cardinal to the further development of science.

A possible mechanism for achieving this objective is through deformation of standard

mathematics.

A considerable amount of work has already been done, and success achieved, in the broad areas of q-deformed harmonic oscillators [1] and representations of q-deformed rotation and Lorentz groups [2-3]. q-deformed quantum stochastic processes have also been studied, with a realization of q-white noise on bialgebras [4]. Deformations of the Fokker-Planck equation [5], the Langevin equation [6] and Levy processes [7-8] have also been analysed and results reported.

Though at a nascent stage, the winds of convergence of physics and finance are unmis-

takably perceptible with several concepts of fundamental physics like quantum mechanics,

field theory and related tools of non-commutative probability, gauge theory, path integral

etc. being applied for pricing of contemporary financial products and for explaining var-

ious phenomena of financial markets like stock price patterns, critical crashes etc [8-19].

The celebrated Black Scholes formula [20,21] constitutes the cornerstone of contem-

porary valuation theory. However, the model, although very robust and of immense

practical utility is based on several unrealistic and rigid assumptions. Several general-

izations have been attempted through relaxation of one or other assumption, thereby

enhancing its spectrum of applicability.

In this paper, we attempt one such generalization based on the deformation of the standard Brownian motion. Section 2, which forms the essence of this paper, attempts a deformation of the standard Black Scholes pricing formula. In Section 3 we illustrate the theory developed in the previous section with a concrete example. Section 4 looks at the interpretation of the deformation index. Section 5 addresses issues relating to the empirical relevance of the model. Section 6 presents the conclusions.

2. The Generalized Black Scholes Model

The standard analysis of the Black Scholes formula for option pricing presupposes

that the stock price follows the lognormal distribution. However, significant empirical

evidence now subsists of the stock returns deviating from the lognormal distribution with

“fat tails” and a “sharp peak” which better fit the truncated Levy flights or other power

law distributions [9, 22, 23]. To broadbase the Black Scholes model, generalizations by

way of “Levy noise” and “jump diffusions” [24] have already been studied. In this paper,

we propose a model that incorporates a “weighted Brownian motion” as the stochastic

(noise) term, where the weights themselves are a function of the “Brownian motion /

noise” i.e.

$$dW^P_t \to dU^P_t = f(U^P_t, t)\, dW^P_t \qquad (1)$$

$W^P_t$ is a regular Brownian motion representing Gaussian white noise with zero mean and δ correlation in time, i.e. $E^P(dW_t\, dW_{t'}) = dt\, dt'\, \delta(t - t')$, on some filtered probability space $(\Omega, (\mathcal{F}_t), P)$. We further mandate that the function $f(U^P_t, t)$ satisfies the Novikov condition and that the process $U^P_t = \int_0^t f(U^P_s, s)\, dW^P_s$ is a local P-martingale with a non-normal distribution. This requirement is not as restrictive as it may seem at first sight in the context of the applications envisaged. We shall address this issue again in the sequel.

This generalization contemplates a statistical feedback process. In this context, sev-

eral studies on stock market data have shown the existence of nonlinear characteristics

and chaotic behavior that lend credence to the existence of a statistical feedback mech-

anism of market players. Explanations for the existence of “fat tails” in stock market


data have been offered through this statistical feedback process e.g. “extremal events”

cause “disproportionate reactions” among market players. This deformed noise may also

capture the “herd behavior” of stock market investors. The model also encompasses time

dependent return processes since f is a function of UPt and t so that the drift term varies

with time.

We define the European call option as a financial contingent claim that entails a right

(but not an obligation) to the holder of the option to buy one unit of the underlying

asset at a future date (called the exercise date or maturity date) at a price (called the

exercise price). The option contract, therefore, has a terminal payoff of max (ST − E, 0) =

(ST − E)+ where ST is the stock price on the exercise date and E is the exercise price.

We consider a non-dividend paying stock, the price process of which follows the geometric Brownian motion with drift $S_t = e^{\mu t + \sigma U^P_t}$ under the probability measure P, with drift μ and volatility σ. The logarithm of the stock price $Y_t = \ln S_t$ follows the stochastic differential equation

$$dY_t = \mu\, dt + \sigma\, dU^P_t = \mu\, dt + \left[ \sigma f(U^P_t, t) \right] dW^P_t \qquad (2)$$

Application of Ito's formula yields the following SDE for the stock price process

$$dS_t = \left( \mu + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2 \right) S_t\, dt + \left[ \sigma f(U^P_t, t) \right] S_t\, dW^P_t \qquad (3)$$
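The step from Eq. (2) to Eq. (3) is a direct application of Ito's formula to $S = e^{Y}$. As a minimal symbolic sketch (assuming SymPy, and treating the instantaneous value of $f(U^P_t, t)$ as a plain symbol `f`, which is a simplification made here for illustration), the drift and diffusion coefficients of $dS_t$ can be verified as follows.

```python
import sympy as sp

Y, mu, sigma, f = sp.symbols("Y mu sigma f", real=True)

S = sp.exp(Y)              # S_t = exp(Y_t)
drift_Y = mu               # dY = mu dt + (sigma f) dW, Eq. (2)
diffusion_Y = sigma * f

# Ito's formula for g(Y): dg = (g' a + 1/2 g'' b^2) dt + g' b dW
drift_S = sp.diff(S, Y) * drift_Y + sp.Rational(1, 2) * sp.diff(S, Y, 2) * diffusion_Y**2
diffusion_S = sp.diff(S, Y) * diffusion_Y

# Coefficients claimed in Eq. (3)
expected_drift = (mu + sp.Rational(1, 2) * (sigma * f) ** 2) * S
expected_diffusion = sigma * f * S
```

Both differences `drift_S - expected_drift` and `diffusion_S - expected_diffusion` simplify to zero, which is the content of Eq. (3).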

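Let $C(S_t, t)$ denote the instantaneous price of a call option with exercise price E at any time t before maturity when the price per unit of the underlying is $S_t$. We assume that $C(S_t, t)$ does not depend on the past price history of the underlying. Applying the Ito formula to $C(S_t, t)$ yields

$$dC_t = \left[ \left( \mu + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2 \right) S_t \frac{\partial C}{\partial S} + \frac{\partial C}{\partial t} + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2 S_t^2 \frac{\partial^2 C}{\partial S^2} \right] dt + \frac{\partial C}{\partial S}\left[ \sigma f(U^P_t, t) \right] S_t\, dW^P_t \qquad (4)$$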

Applying Girsanov's theorem to the price process (3), we perform a change of measure and define a probability measure Q such that the discounted stock price process $Z_t = S_t e^{-rt}$, or equivalently

$$dZ_t = \left( \mu - r + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2 \right) Z_t\, dt + \left[ \sigma f(U^P_t, t) \right] Z_t\, dW^P_t \qquad (5)$$

behaves as a martingale with respect to Q. This is performed by eliminating the drift term through the transformation

$$\frac{\mu - r + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2}{\sigma f(U^P_t, t)} \to \gamma_t \qquad (6)$$

whence $W^Q_t = W^P_t + \gamma_t t$ is a Brownian motion without drift with respect to the measure Q and $dZ_t = \left[ \sigma f(U^Q_t, t) \right] Z_t\, dW^Q_t$, which is driftless under the measure Q; hence, $Z_t$ is a Q-martingale.


The equivalence of $\left[ \sigma f(U^P_t, t) \right] Z_t\, dW^P_t$ and $\left[ \sigma f(U^Q_t, t) \right] Z_t\, dW^Q_t$ follows from the fact that both $W^Q_t$ and $W^P_t$ are zero mean Wiener processes and that $f(U^Q_t, t)$ can be expressed in terms of $f(U^P_t, t)$ through $dZ_t = \left[ \sigma f(U^Q_t, t) \right] Z_t\, dW^Q_t$ along with eq. (5). The noise terms in $dZ_t = \left[ \sigma f(U^Q_t, t) \right] Z_t\, dW^Q_t$ and eq. (5) will, therefore, be stochastically equivalent.

The two measures P and Q are related through the Radon Nikodym derivative, which in the deformed case takes the form

$$\xi(t) = \frac{dQ}{dP} = \exp\left( -\int_0^t \gamma_s\, dW^P_s - \frac{1}{2}\int_0^t \gamma_s^2\, ds \right) \qquad (7)$$

and the expectation operators under the two measures are related as

$$E^Q(X_t \,|\, \mathcal{F}_s) = \xi^{-1}(s)\, E^P(\xi(t) X_t \,|\, \mathcal{F}_s) \qquad (8)$$

Our next step in martingale based pricing is to constitute a Q-martingale process that hits the discounted value of the contingent claim, i.e. the call option. This is formed by taking the conditional expectation of the discounted terminal payoff from the claim under the Q measure, i.e.

$$E_t = E^Q\left[ e^{-rT}(S_T - E)^+ \,\middle|\, \mathcal{F}_t \right]. \qquad (9)$$

We now constitute a self-financing strategy that exactly replicates the claim and whose value is known with certainty. For this purpose, we introduce a 'bond' in our model that evolves according to the following price process

$$\frac{dB_t}{B_t} = r\, dt, \qquad B_0 = 1, \qquad (10)$$

where r is the relevant risk free interest rate.

Making use of $\phi_t$ units of the underlying asset and $\psi_t$ units of the bond, where $\phi_t = \frac{\partial C(S_t, t)}{\partial S}$ and $B_t\psi_t = C(S_t, t) - \phi_t S_t$, we can now construct a trading strategy that has the following properties:

(1) it exactly replicates the price process of the call option, i.e.

$$\phi_t S_t + \psi_t B_t = C(S_t, t), \quad \forall t \in [0, T]. \qquad (11)$$

(2) it is self financing, i.e.

$$\phi_t\, dS_t + \psi_t\, dB_t = dV_t, \quad \forall t \in [0, T]. \qquad (12)$$

Using eqs. (1), (3), (11) & (12) we have

$$dC = \left( \phi_t \mu S_t + \frac{1}{2}\phi_t\left[ \sigma f(U^P_t, t) \right]^2 S_t + \psi_t r B_t \right) dt + \phi_t\left[ \sigma f(U^P_t, t) \right] S_t\, dW^P_t. \qquad (13)$$

Matching the diffusion terms of (4) & (13) and using (11), we get the aforesaid expressions for $\phi_t$ and $\psi_t$ respectively. The value of this portfolio at any time t can be shown to be equal to $V_t = e^{rt}E_t$, with $E_t$ being given by eq. (9). It follows that the value of the replicating portfolio, and hence of the call option, at time t is given by

$$V_t = e^{rt}E_t = e^{-r(T-t)} E^Q\left[ (S_T - E)^+ \,\middle|\, \mathcal{F}_t \right] = e^{-r(T-t)} E^Q\left[ (S_T - E)\,\mathbf{1}_{(S_T \ge E)} \,\middle|\, \mathcal{F}_t \right]$$

$$= e^{-r(T-t)} \int_{\{U^Q_T :\, S(U^Q_T, T) \ge E\}} \left( S(U^Q_T, T) - E \right) f(U^Q_T, T \,|\, U^Q_t, t)\, dU^Q_T \qquad (14)$$

The expectation value of the contingent claim $\max(S_T - E, 0) = (S_T - E)^+$ under the measure Q depends only on the marginal distribution of the stock price process $S_t$ under the measure Q, which is obtained by writing it in terms of the Q Brownian motion $W^Q_t$. We have, from eq. (2), for the deformed stock price process under the measure Q,

$$d(\ln S_t) = \mu\, dt + \left[ \sigma f(U^P_t, t) \right] dW^P_t = \left( r - \frac{1}{2}\left[ \sigma f(U^Q_t, t) \right]^2 \right) dt + \left[ \sigma f(U^Q_t, t) \right] dW^Q_t \qquad (15)$$

which on integration yields

$$S_t = S_0 \exp\left[ \int_0^t \left[ \sigma f(U^Q_s, s) \right] dW^Q_s + \int_0^t \left( r - \frac{1}{2}\left[ \sigma f(U^Q_s, s) \right]^2 \right) ds \right]. \qquad (16)$$

The value of the call option can now be computed by using eq. (14). The existence or otherwise of a closed form solution would depend on the explicit representation of the function $f(U, t)$.
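When no closed form is available, the expectation in eq. (14) can be estimated numerically from eqs. (15)-(16). The following is a minimal Monte Carlo sketch, not the author's procedure: it assumes an Euler discretization, assumes that $U^Q$ evolves as $dU^Q_t = f(U^Q_t, t)\, dW^Q_t$, and takes the deformation function `f` as a user-supplied hypothetical callable. With f ≡ 1 the scheme should reproduce the standard Black Scholes price, which is used here as a sanity check.

```python
import math
import numpy as np


def bs_call(S0, E, r, sigma, T):
    """Standard Black Scholes call price (the undeformed case f = 1)."""
    d1 = (math.log(S0 / E) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S0 * cdf(d1) - E * math.exp(-r * T) * cdf(d2)


def mc_call(f, S0, E, r, sigma, T, n_paths=100_000, n_steps=20, seed=0):
    """Euler Monte Carlo estimate of e^{-rT} E^Q[(S_T - E)^+] under eq. (15)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    U = np.zeros(n_paths)                   # deformed noise U^Q, started at 0
    log_S = np.full(n_paths, math.log(S0))
    for k in range(n_steps):
        t = k * dt
        weight = f(U, t)                    # f(U^Q_t, t), hypothetical callable
        vol = sigma * weight
        dW = rng.normal(0.0, math.sqrt(dt), n_paths)
        log_S += (r - 0.5 * vol**2) * dt + vol * dW
        U += weight * dW
    payoff = np.maximum(np.exp(log_S) - E, 0.0)
    return math.exp(-r * T) * payoff.mean()


# Sanity check with the undeformed weight f = 1
flat = lambda U, t: np.ones_like(U)
price_mc = mc_call(flat, S0=100.0, E=100.0, r=0.05, sigma=0.2, T=1.0)
price_bs = bs_call(S0=100.0, E=100.0, r=0.05, sigma=0.2, T=1.0)
```

Any other bounded, strictly positive `f` can be passed in the same way; the resulting price then deviates from the Black Scholes value according to the chosen deformation.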

The following observations are cardinal to the above analysis.

(a) We have, implicitly, made the standard assumption of the market satisfying the “No Arbitrage” condition. It is well known that long-term market equilibrium cannot subsist in the presence of arbitrage opportunities. This “No Arbitrage” condition guarantees the existence and measurability of $\gamma_t$ defined by eq. (6), as is proved below.

For this purpose, we assume that there exist values of $U^P_t$ for which $f(U^P_t, t) = 0$ and hence $\gamma_t$ does not exist. Let $X_t = \{ U^P_t : f(U^P_t, t) = 0 \}$. We construct a portfolio $(\phi, \psi)$ of the normalized stock process $(\bar{S}_t)$ and the bond process $(B_t)$, where

$$\phi_t = \begin{cases} \theta & \text{for } U^P_t \in X_t \\ 0 & \text{for } U^P_t \notin X_t \end{cases}$$

and

$$\psi_t = \psi_0 + \phi_0 S_0 + \int_0^t e^{-rs}\phi_s\, dS_s - \int_0^t r e^{-rs}\phi_s S_s\, ds - e^{-rt}\phi_t S_t,$$

with $B_0 = 1$ and the normalized stock process, i.e. the stock process adapted to a market with zero interest rates, being given by $\bar{S}_t = S_t e^{-rt}$ and $d\bar{S}_t = e^{-rt}\, dS_t - r e^{-rt} S_t\, dt$.

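The portfolio is self financing since $V_t = \psi_t + \phi_t \bar{S}_t$ and hence $dV_t = \phi_t\, d\bar{S}_t$. Further,

$$V_t - V_0 = \int_0^t \phi_s\, d\bar{S}_s = \int_0^t e^{-rs}\left( \mu + \frac{1}{2}\left[ \sigma f(U^P_s, s) \right]^2 - r \right) \phi_s S_s\, ds + \int_0^t e^{-rs}\left[ \sigma f(U^P_s, s) \right] \phi_s S_s\, dW^P_s$$

$$= \int_0^t \chi_{X_s} e^{-rs}\left( \mu + \frac{1}{2}\left[ \sigma f(U^P_s, s) \right]^2 - r \right) \theta_s S_s\, ds + \int_0^t \chi_{X_s} e^{-rs}\left[ \sigma f(U^P_s, s) \right] \theta_s S_s\, dW^P_s = \int_0^t \chi_{X_s} e^{-rs}(\mu - r)\, \theta_s S_s\, ds \ge 0$$

where $\chi_{X_t}$ is the characteristic function of the set $X_t$ for all U, t. But under the “No Arbitrage” condition $V_t - V_0 \le 0$. It therefore follows that $\chi_{X_t} = 0$ for all U, t and hence $X_t = \emptyset$.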

(b) In the standard Black Scholes theory, the Novikov condition is automatically satisfied due to the constancy of $\gamma_t \equiv \gamma$. However, in the deformed version, this condition needs to be explicitly imposed to ensure the applicability of Girsanov's theorem and hence the existence of the equivalent martingale measure Q. Hence, we require the function $f(U, t)$ to be such that

$$E^P\left\{ \exp\left[ \frac{1}{2}\int_0^T (\gamma_s)^2\, ds \right] \right\} < \infty.$$

As mentioned above, this condition is not very restrictive insofar as the applications of this model are concerned, since $f(U, t)$ would normally take the form of probability distributions and hence be non-zero bounded functions, thereby automatically satisfying the square integrability requirements.
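To make the boundedness argument concrete: if $f$ is bounded away from zero and from above, then $\gamma_t$ in eq. (6) is bounded, say $|\gamma_t| \le \gamma_{\max}$, and the Novikov expectation is at most $\exp(\gamma_{\max}^2 T / 2)$. The sketch below evaluates eq. (6) on a grid of hypothetical values of $f$; the parameter values and the bounds 0.5 and 2.0 are illustrative assumptions, and a grid maximum is only a numerical proxy for the true supremum.

```python
import numpy as np


def gamma(f_val, mu, r, sigma):
    """Market price of risk of eq. (6) for a given value of f(U, t)."""
    vol = sigma * f_val
    return (mu - r + 0.5 * vol**2) / vol


# Illustrative parameters (assumptions, not taken from the paper)
mu, r, sigma, T = 0.08, 0.05, 0.2, 1.0

# Hypothetical bounds 0.5 <= f <= 2.0 on the deformation function
f_grid = np.linspace(0.5, 2.0, 151)
gamma_grid = gamma(f_grid, mu, r, sigma)

gamma_max = np.max(np.abs(gamma_grid))
novikov_bound = np.exp(0.5 * gamma_max**2 * T)  # upper bound on E[exp(1/2 int gamma^2 ds)]

# Undeformed case f = 1: gamma is constant
gamma_bs = gamma(1.0, mu, r, sigma)
```

If `f` were allowed to approach zero, `gamma` would diverge, which is why observation (a) rules out zeros of $f$ before the Novikov condition is invoked.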

(c) Except for the Novikov condition, which needs to be explicitly imposed in the deformed model as mentioned in (b) above, our analysis is equivalent to the standard Black Scholes model, since $f(U, t)$ can be expressed as a function of Y, the logarithm of the stock price S, through eq. (2);

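(d) The “No Arbitrage” condition together with the Novikov condition guarantee the completeness of the market and hence the availability of replicating portfolios for the valuation of any contingent claim. This is established by showing that there exists a self financing portfolio $(\phi, \psi)$, defined as in (a) above, that exactly replicates the terminal payoff of any lower bounded contingent claim, say $C(S_t, t)$. Mathematically, this implies that there exists a real number ε such that $C(S_T, T) = V^\varepsilon_T = \varepsilon + \int_0^T (\phi_t\, dS_t + \psi_t\, dB_t)$, or equivalently

$$C(S_T, T) = V^\varepsilon_T = \varepsilon + \int_0^T (\phi_t\, dS_t + \psi_t\, dB_t) = e^{rT}\left( \varepsilon + \int_0^T \phi_t\, d\bar{S}_t \right)$$

$$= e^{rT}\left[ \varepsilon + \int_0^T e^{-rt}\left( \mu + \frac{1}{2}\left[ \sigma f(U^P_t, t) \right]^2 - r \right) \phi_t S_t\, dt + \int_0^T e^{-rt}\left[ \sigma f(U^P_t, t) \right] \phi_t S_t\, dW^P_t \right] = e^{rT}\left[ \varepsilon + \int_0^T e^{-rt}\left[ \sigma f(U^Q_t, t) \right] \phi_t S_t\, dW^Q_t \right]$$

By the Martingale Representation Theorem, there exists a function $\eta_t$ such that

$$C(S_T, T) = e^{rT}\left\{ E^Q\left[ e^{-rT} C(S_T, T) \right] + \int_0^T \eta_t S_t\, dW^Q_t \right\}.$$

Hence, we can identify $\varepsilon = E^Q\left[ e^{-rT} C(S_T, T) \right]$ and $\phi_t = e^{rt}\left[ \sigma f(U^Q_t, t) \right]^{-1}\eta_t$. By selecting the bond component of the portfolio (ψ) according to $\psi_t = \psi_0 + \int_0^t e^{-rs}\, d\lambda_s$, where $\lambda_s = \int_0^s \phi_v\, dS_v - \phi_s S_s$, we can make our portfolio $(\phi, \psi)$ self financing. This is shown below. We have

$$dV_t = d(\psi_t e^{rt} + \phi_t S_t) = r e^{rt}\psi_t\, dt + e^{rt}\, d(\psi_t) + d(\phi_t S_t) = r e^{rt}\psi_t\, dt + d(\lambda_t) + d(\phi_t S_t) = r e^{rt}\psi_t\, dt + \phi_t\, dS_t$$

as required. Furthermore,

$$V^\varepsilon_t = e^{rt}\left( \varepsilon + \int_0^t \phi_v\, d\bar{S}_v \right) = e^{rt}\left( \varepsilon + \int_0^t \eta_v S_v\, dW^Q_v \right) = e^{rt} E^Q\left( e^{-rT} V^\varepsilon_T \,\middle|\, \mathcal{F}_t \right) = e^{rt} E^Q\left( e^{-rT} C(S_T, T) \,\middle|\, \mathcal{F}_t \right)$$

showing that $V^\varepsilon_t$ is lower bounded and hence establishing the completeness of the market.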


3. An Illustration of the Deformed Model

We now present a concrete example as an application of the aforesaid analysis. For this purpose, we consider a Brownian motion of the form

$$dW^P_t \to dU^P_t = f(U^P_t, t)^q\, dW^P_t \qquad (17)$$

where $f(U^P_t, t)$ is a probability density function.

The incorporation of a probability dependent term in the stochastic force enables us to describe nonlinear return processes where the randomness is not uniform across the entire return spectrum. In the standard theory, we envisage a random process that is independent of the level of returns and hence, if a sufficient number of observations are accumulated, the entire spectrum of possible returns will be traversed. However, through this deformed noise function we can model return processes that change with the respective probability of such returns, i.e. the degree of randomness changes across the return spectrum: highly frequented regions of the spectrum may have higher or lower returns depending on the nature of the deformation function. Hence, a biased yet random return process can be accommodated. Although, in theory, the entire return spectrum may still be traversed if a sufficient number of observations are made, the dependence on probabilities enables the modeling of systems that require a cleavage of the return spectrum to create an effectively nonergodic space for the system. The model would also be versatile enough to encompass a return spectrum having the character of a multifractal, which goes well with contemporary research findings in this area. Furthermore, unlike the standard case where $W^P_t = \int_0^t dW^P_s$ is normally distributed, $U^P_t = \int_0^t f(U^P_s, s)^q\, dW^P_s$ is no longer normally distributed but follows a skewed distribution depending on the explicit representation of the function $f(U^P_t, t)$ and the parameter q.

Eq. (17) is equivalent to the Langevin equation [25]

$$\frac{dU_t^P}{dt} = f(U_t^P,t)^q\,\frac{dW_t^P}{dt} = f(U_t^P,t)^q\,\eta(t) \qquad (18)$$

where η(t) is a noise function that satisfies

$$\langle \eta(t) \rangle = 0 \qquad (19)$$

$$\langle \eta(t')\,dt'\,\eta(t'')\,dt'' \rangle = \delta(t' - t'')\,dt' \qquad (20)$$

The time evolution of the probability density $f(U_t^P,t)$ is given by the following equation [26] (the super(sub)scripts are suppressed for the sake of brevity)

$$f(U,t+\Delta t) = \int f(U,t+\Delta t\,|\,U',t)\,f(U',t)\,dU' \qquad (21)$$

Here $f(\cdot\,|\,\cdot)$ is the transition probability between states. We now set $U' = U - \Delta U$ and expand the integrand as a Taylor's series around $f(U+\Delta U,t+\Delta t\,|\,U,t)\,f(U,t)$ to obtain

$$f(U,t+\Delta t\,|\,U',t)\,f(U',t) = -\Delta U\,\frac{d}{dU}\left[f(U+\Delta U,t+\Delta t\,|\,U,t)\,f(U,t)\right] + \frac{(-\Delta U)^2}{2}\,\frac{d^2}{dU^2}\left[f(U+\Delta U,t+\Delta t\,|\,U,t)\,f(U,t)\right] + \ldots \qquad (22)$$


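Eq. (22) on integration gives

$$f(U,t+\Delta t) = -\frac{d}{dU}\left\{\left[\int \Delta U\,f(U+\Delta U,t+\Delta t\,|\,U,t)\,d\Delta U\right]f(U,t)\right\} + \frac{1}{2}\,\frac{d^2}{dU^2}\left\{\left[\int \Delta U^2\,f(U+\Delta U,t+\Delta t\,|\,U,t)\,d\Delta U\right]f(U,t)\right\} + \ldots \qquad (23)$$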

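We can further simplify the above expression, noting that U is a martingale, as follows:

$$\int \Delta U\,f(U+\Delta U,t+\Delta t\,|\,U,t)\,d\Delta U = E_t[\Delta U] = E_t\left[\int_t^{t+\Delta t} f(U_s,s)^q\,dW_s\right] = 0 \qquad (24)$$

and

$$\int \Delta U^2\,f(U+\Delta U,t+\Delta t\,|\,U,t)\,d\Delta U = E_t[\Delta U^2] = E_t\left[\int_t^{t+\Delta t} f(U_s,s)^{2q}\,ds\right] = f(U_t,t)^{2q}\,\Delta t + o(\Delta t) \qquad (25)$$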

where the last step follows from Ito isometry. We have ignored terms of second and higher orders in Δt. Using the results in eqs. (24) & (25) in eq. (23) and taking the limit as Δt → 0 we obtain the Fokker Planck equation [26] for the time evolution of the deformed probability density (17) as

$$\frac{df}{dt} = \frac{1}{2}\,\frac{d^2 f^{2q+1}}{dU^2} \qquad (26)$$

To obtain an explicit solution of eq. (26) for the probability density f (U, t), we postulate

a normalized scaled solution, which enables the separation of the U and t dependencies

through the ansatz

f (U, t) = g (t) H (Ug (t)) = g (t) H (z) (27)

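Substitution from eq. (27) into eq. (26) and simplification yields

$$\frac{\dot g(t)}{g(t)^{2q+3}}\,\frac{\partial}{\partial z}\bigl(zH(z)\bigr) = \frac{1}{2}\,\frac{\partial^2}{\partial z^2}H(z)^{2q+1} \qquad (28)$$

Writing $\frac{2\dot g(t)}{g(t)^{2q+3}} = -k$, we have

$$g(t) = \left[(q+1)\,k\,(t-t_0)\right]^{-\frac{1}{2(q+1)}} \qquad (29)$$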
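As a quick numerical sanity check of eq. (29), the sketch below evaluates the residual of the separation condition $2\dot g/g^{2q+3} = -k$ with a central finite difference. The function names, the step size h and the sample parameter values are illustrative assumptions and are not part of the original derivation.

```python
def g(t, q, k, t0=0.0):
    """Scaling function of eq. (29): [(q+1) k (t - t0)]^(-1/(2(q+1)))."""
    return ((q + 1.0) * k * (t - t0)) ** (-1.0 / (2.0 * (q + 1.0)))


def ode_residual(t, q, k, h=1e-6):
    """Residual of 2*g'(t)/g(t)^(2q+3) + k; it should vanish if eq. (29) holds.

    g'(t) is approximated by a central difference with step h (an assumption
    of this sketch, chosen small relative to t).
    """
    g_dot = (g(t + h, q, k) - g(t - h, q, k)) / (2.0 * h)
    return 2.0 * g_dot / g(t, q, k) ** (2.0 * q + 3.0) + k


if __name__ == "__main__":
    # Hypothetical sample values of (t, q, k); any t > t0 with (q+1)k > 0 works.
    for t, q, k in [(1.5, 0.3, 2.0), (2.0, -0.2, 1.0)]:
        print(t, q, k, ode_residual(t, q, k))
```

The residual is zero up to finite-difference and rounding error, for positive as well as negative q.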

which gives the solution of eq. (26) as

$$f(U,t) = A\,(t-t_0)^{-\frac{1}{2(q+1)}}\,\exp_{(1-2q)}\left\{B\left[(U-U_0)\,(t-t_0)^{-\frac{1}{2(q+1)}}\right]^2\right\} \qquad (30)$$

where $A = \left[(q+1)k\right]^{-\frac{1}{2(q+1)}}$, $B = -\frac{kA^2}{4(2q+1)}$ and $\exp_q(x) = \left[1 + (1-q)\,x\right]^{\frac{1}{1-q}}$ is the q-exponential function. k can be determined from the normalization condition $\int_{-\infty}^{\infty} f(U,t)\,dU = 1$, f(U, t) being a probability density function.
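The q-exponential defined above can be written down directly. The following is a minimal sketch, assuming that arguments which make the bracket non-positive are simply rejected (the text does not say how that case is handled); the function name is illustrative.

```python
import math


def exp_q(x, q):
    """q-exponential: [1 + (1 - q) x]^(1 / (1 - q)); reduces to exp(x) as q -> 1."""
    if abs(1.0 - q) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # Assumption of this sketch: refuse arguments outside the domain
        # where the real-valued power is defined.
        raise ValueError("1 + (1 - q) x must be positive")
    return base ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    # In eq. (30) the index is (1 - 2q), so the model parameter q -> 0
    # corresponds to the ordinary exponential.
    model_q = 1e-6
    print(exp_q(0.5, 1.0 - 2.0 * model_q), math.exp(0.5))
```

With index (1 − 2q) the function equals $[1 + 2qx]^{1/(2q)}$, which is the form used when eq. (30) is written out explicitly.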


The transition probability density $f(U,t\,|\,U_0,t_0)$, that is the key element in option pricing, is the probability density f(U, t) with a special initial condition $f(U,t_0) = \delta(U-U_0)$ i.e. $f(U,t\,|\,U_0,t_0)$ also obeys the Fokker Planck equation (26). Furthermore, it is seen that the solution for f(U, t) given by eq. (30) meets the δ function initial condition in the limit t → t0, and is, therefore, also a solution for the transition probability density $f(U,t\,|\,U_0,t_0)$.

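As an illustration, the conditional probability density of the logarithm of the stock prices would be

$$f(Y_{t+\Delta t}\,|\,Y_t) = A\,(\Delta t)^{-\frac{1}{2(q+1)}}\,\exp_{(1-2q)}\left\{B\left[\frac{\ln\frac{S_{t+\Delta t}}{S_t} - \mu\,\Delta t}{\sigma}\,(\Delta t)^{-\frac{1}{2(q+1)}}\right]^2\right\}$$

under the probability measure P and

$$f(Y_{t+\Delta t}\,|\,Y_t) = A\,(\Delta t)^{-\frac{1}{2(q+1)}}\,\exp_{(1-2q)}\left\{B\left[\frac{1}{\sigma}\left(\ln\frac{S_{t+\Delta t}}{S_t}\right)(\Delta t)^{-\frac{1}{2(q+1)}}\right]^2\right\}$$

under Q.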

Using the expression (30) for f(U, t) with U0 = 0, t0 = 0 (which does not result in any loss of generality) in eq. (16), we derive the expression for the stock price process under the martingale measure Q and, thereby, of the contingent claim using eq. (14). To approximate $\int_0^t f(U,s)^{2q}\,ds$ we note that for any arbitrary value of time s, the distribution of the random variable $U_s$ can be mapped onto the distribution of a random variable $U_T$ at a fixed time T through the transformation $U_s = \left(\frac{T}{s}\right)^{-\frac{1}{2(1+q)}} U_T$. Hence,

$$\int_0^t f(U,s)^{2q}\,ds = \int_0^t f\left(\left(\tfrac{T}{s}\right)^{-\frac{1}{2(1+q)}} U_T,\,s\right)^{2q} ds = A^{2q}\int_0^t s^{-\frac{q}{q+1}}\,\exp^{2q}_{(1-2q)}\left[B\left(U_T\,T^{-\frac{1}{2(q+1)}}\right)^2\right] ds$$

$$= C\,t^{\frac{1}{q+1}}\,\exp^{2q}_{(1-2q)}\left[B\left(U_T\,T^{-\frac{1}{2(q+1)}}\right)^2\right] \qquad (31)$$

where $C = (q+1)\,A^{2q}$.
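The only time integration needed in eq. (31) is $\int_0^t s^{-q/(q+1)}\,ds = (q+1)\,t^{1/(q+1)}$, since the q-exponential factor does not depend on s. The sketch below compares that closed form with a plain midpoint-rule estimate; the function names, the number of panels and the sample values of q are assumptions made for illustration.

```python
def time_integral_numeric(t, q, n=100000):
    """Midpoint-rule estimate of the integral of s^(-q/(q+1)) over (0, t)."""
    h = t / n
    p = -q / (q + 1.0)
    return sum(((i + 0.5) * h) ** p for i in range(n)) * h


def time_integral_closed(t, q):
    """Closed form (q + 1) t^(1/(q+1)) that produces the constant C of eq. (31)."""
    return (q + 1.0) * t ** (1.0 / (q + 1.0))


if __name__ == "__main__":
    for q in (-0.2, 0.5):  # hypothetical sample values, q > -1
        print(q, time_integral_numeric(2.0, q), time_integral_closed(2.0, q))
```

For q > 0 the integrand is singular (but integrable) at s = 0, so the midpoint rule converges more slowly there than for negative q.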

Furthermore, $\int_0^t f(U,s)^q\,dW_s = U(t)$, in view of eq. (17). Substituting this result and that of eq. (31) in eq. (16), we get the following expression for the stock price process in the martingale measure Q

$$S_t = S_0 \exp\left\{\sigma U_t + rt - \frac{1}{2}\sigma^2 C\,t^{\frac{1}{q+1}}\,\exp^{2q}_{(1-2q)}\left[B\left(U_T\,T^{-\frac{1}{2(q+1)}}\right)^2\right]\right\} \qquad (32)$$

from which the value of the call option can be recovered using (14). It may, however,

be noted that in the standard case the exponential is linear in W and the stock price,

therefore, is a monotonically increasing function of W . Hence, the condition St − E > 0

is satisfied for all values of W that exceed a threshold value. However, in this illustration,

consequent to the noise induced drift, the exponential in the stock price process is now

a quadratic function of the deformed Brownian motion U. We, therefore, have two roots of U that meet the condition $S_t - E = 0$. Accordingly, there will exist an interval $(U_1, U_2)$ within which the inequality $S_t - E > 0$ will hold. Furthermore, as q → 0, $U_2 \to \infty$, thereby recovering the standard case. Hence, we have

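$$V_t = e^{-r(T-t)}\int_{U_1}^{U_2}\left(S_0\exp\left\{\sigma U_T + rT - \frac{1}{2}\sigma^2 C\,T^{\frac{1}{q+1}}\,\exp^{2q}_{(1-2q)}\left[B\left(U_T\,T^{-\frac{1}{2(q+1)}}\right)^2\right]\right\} - E\right) f(U_T,T)\,dU \qquad (33)$$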

As in the standard case, in the martingale measure based risk neutral world, the stock

price distribution under Q is dependent on the risk free interest rate r and not on the

average return μ. We easily recover the standard results from the generalized model in

the limit q → 0.

4. Interpretation of the q Index

Towards examining the interpretation of the q index in the context of the application being envisaged, we study the impact of the deformation of the standard exponential distribution $g(U,\zeta) = C e^{BU^2\zeta}$. For this purpose, we note that f(U, t), with U0 = 0, t0 = 0, can be expanded in the form of a gamma distribution as

$$f(U,x) = A\,\zeta_0^{1/2}\,\frac{1}{\Gamma\left[(-2q)^{-1}\right]}\int_0^{\infty} x^{-\left(1+\frac{1}{2q}\right)}\,e^{-x\left(1+2q\zeta_0 BU^2\right)}\,dx$$

where $\zeta = t^{-(1+q)^{-1}}$. We assume that there exists a function h(ζ) that modifies the exponential distribution g(U, ζ) to f(U, ζ) i.e. that

$$f(U,\zeta) = A\int_0^{\infty} h(\zeta)\,e^{BU^2\zeta}\,d\zeta.$$

Identifying $-2q\zeta_0 x$ with ζ and comparing the two expressions for f we obtain

$$h(\zeta) = \zeta_0^{1/2}\,\frac{1}{\Gamma\left[(-2q)^{-1}\right]}\,e^{(2q\zeta_0)^{-1}\zeta}\,(-2q\zeta_0)^{1/2q}\,\zeta^{-\left(1+\frac{1}{2q}\right)}.$$

Using this expression for h(ζ) we obtain the expected values of ζ and ζ² as $\langle\zeta\rangle = \zeta_0^{3/2}$ and $\langle\zeta^2\rangle = (1-2q)\,\zeta_0^{5/2}$, which gives the coefficient of variation as $(1-2q)\,\zeta_0^{-1/2} - 1$. Hence, it follows that if f(U, t) is a probability distribution function that satisfies the nonlinear Fokker Planck eq. (26), then its explicit representation is given as in eq. (30) where the parameter q is linearly related to the relative variance of $\zeta = t^{-(1+q)^{-1}}$. Furthermore, since the relative variance depends on both q and $\zeta = t^{-(1+q)^{-1}}$, it follows that the function f(U, t) generates an ensemble of returns corresponding to various values of q over a particular time scale and also that, for a given q, the distribution of returns evolves anomalously across differing timescales.

5. Empirical Evidence

The Black Scholes model assumes lognormal distributions of stock prices. However,

deviations from such behaviour are, by now, well documented [28]. Empirical evidence

testifies that probability distributions of stock returns are negatively skewed, have fat

tails and show leptokurtosis [28]. Some of these features of empirical distributions are

modeled through Levy distributions [29-32], stochastic volatility [33] or cumulant ex-

pansions [31] around the lognormal case. Each of these models, however, attempts to

empirically attune the model parameters to fit observed data and hence, is equivalent


to interpolating or extrapolating observed data in one form or the other. In contrast,

the deformed noise model preserves the analytical framework of the Black Scholes world by retaining only one source of stochasticity and hence remaining within the domain of complete markets. It also provides a closed form solution which enables the prediction of option prices ab initio in lieu of parameter fitting to match observed data.

In this context, the probability distribution function of eq. (30) generates power law distributions with consequential fat tails that are characteristic of stock price distributions. This fact is brought out explicitly by writing eq. (30), with U0 = 0, t0 = 0, in the form:

$$f(U,t) = A\,t^{-\frac{1}{2(q+1)}}\,\exp_{(1-2q)}\left[B\left(U\,t^{-\frac{1}{2(q+1)}}\right)^2\right] = A\,t^{-\frac{1}{2(q+1)}}\left\{1 + 2q\left[B\left(U\,t^{-\frac{1}{2(q+1)}}\right)^2\right]\right\}^{\frac{1}{2q}} \sim \left(2qA^{2q}B\right)^{\frac{1}{2q}}\,U^{\frac{1}{q}}\,t^{-\frac{1}{2q}} \qquad (34)$$

for sufficiently large values of t.
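The power law form on the right of eq. (34) can be compared numerically with the full expression. The sketch below does so for negative q with B < 0, so that 2qB > 0 and both expressions are real; it evaluates the ratio at increasing U for fixed t, which is the regime in which the 1 inside the braces becomes negligible. The parameter values and function names are hypothetical and chosen only for illustration; A and B are not tied here to k through eq. (30).

```python
def density(U, t, q, A, B):
    """Middle expression of eq. (34) with U0 = 0, t0 = 0."""
    scale = t ** (-1.0 / (2.0 * (q + 1.0)))
    return A * scale * (1.0 + 2.0 * q * B * (U * scale) ** 2) ** (1.0 / (2.0 * q))


def power_law_tail(U, t, q, A, B):
    """Right-hand side of eq. (34): (2 q A^(2q) B)^(1/(2q)) U^(1/q) t^(-1/(2q)).

    Requires 2 q A^(2q) B > 0, otherwise the fractional power is not real.
    """
    prefactor = (2.0 * q * A ** (2.0 * q) * B) ** (1.0 / (2.0 * q))
    return prefactor * U ** (1.0 / q) * t ** (-1.0 / (2.0 * q))


if __name__ == "__main__":
    q, A, B, t = -0.25, 1.0, -1.0, 2.0  # hypothetical illustration values
    for U in (1.0, 10.0, 100.0):
        print(U, density(U, t, q, A, B) / power_law_tail(U, t, q, A, B))
```

With q = −0.25 the tail decays as $U^{-4}$, and the ratio of the two expressions approaches one as U grows.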

There is an intricate yet natural relationship between the power law tails observed

in stock market data and probability distributions of the form (30) that emanates as

the solution of the nonlinear Fokker Planck equation (26). The nonlinear Fokker Planck

equation (26) is known to describe anomalous diffusion under time evolution. Empirical results [34-37] establish that temporal changes of several financial market indices have variances that undergo anomalous super diffusion under time evolution.

One of the most exhaustive sets of studies on stock market data in varying dimensions has been reported in [38-42]. In [42], a phenomenological study was conducted of stock price fluctuations of individual companies using data from two different databases covering three major US stock markets. The probability distributions of returns over varying timescales ranging from 5 min. to 4 years were examined. It was observed that for timescales from 5 minutes up to 16 days the tails of the distributions were well described by a power law decay. For larger timescales, results consistent with a gradual convergence to Gaussian behaviour were observed. In another study [38] the probability distributions of the returns on the S&P 500 were computed over varying timescales. It was, again, seen that the distributions were consistent with an asymptotic power law behaviour with a slow convergence to Gaussian behaviour. Similar findings were obtained on the analysis of the NIKKEI and the Hang Seng indices [38].

A plausible explanation of the matching of empirical behaviour referred to in the

preceding paragraphs and the probability distribution function (30) is based on the ob-

servation that if the stock prices show large deviations from the averages, then f (U)

would be small in line with the probabilities of extremal events being small. Since the

exponent q is usually negative in the region of interest, the effective volatility would be

accentuated. In terms of market behaviour, one could say that the traders would react

extremally. On the other hand, mild deviations would cause moderate reactions from

market players and hence, the effective volatility gets diminished.


6. Conclusions

Contemporary empirical research into the behavior of stock market price /return

patterns has found significant evidence that financial markets exhibit the phenomenon

of anomalous diffusion, primarily superdiffusion, wherein the variance evolves with time

according to a power law tα with α > 1.0. The standard technique for the study of su-

perdiffusive processes is through a stochastic process that evolves according to a Langevin

equation and whose probability distribution function satisfies a nonlinear Fokker Planck

equation of the form (26). The very fact that our deformed noise function satisfies the

nonlinear Fokker Planck equation is motivation enough for an adoption of this deformed

Brownian motion with statistical feedback for the modeling of financial processes.

Until recently, stock market phenomena were assumed to result from complicated

interactions among many degrees of freedom, and thus they were analyzed as random

processes and one could go to the extent of saying that the Efficient Market Hypothesis

[43-44] was formulated with one primary objective – to create a scenario which would

justify the use of stochastic calculus [45] for the modeling of capital markets.

The Efficient Market Hypothesis contemplated a market where all assets were fairly

priced according to the information available and neither buyers nor sellers enjoy any ad-

vantage. Market prices were believed to reflect all public information, both fundamental

and price history, and prices moved only as a sequel to new information entering the market. Further, the presence of a large number of investors was believed to ensure that all prices are fair. Memory effects, if any at all, were assumed to be extremely short ranging and dissipated rapidly. Feedback effects on prices were, thus, assumed to be marginal. The

investor community was assumed rational as benchmarked by the traditional concepts of

risk and return.

An immediate corollary to the Efficient Market Hypothesis was the independence of

single period returns, so that they could be modeled as a random walk and the defining

probability distribution, in the limit of the number of observations being large, would be

Gaussian.

Ever since the studies of Fama in 1964-65, evidence has been accumulating against

the validity of the Efficient Market Hypothesis – the existence of negatively skewed observations, fat tails and distortion around the mean values are but a few examples [28, 31-35]. Most financial returns, including stock returns, have shown deviation from Gaussian behaviour at short time scales, with the standard deviation not scaling as the square root of the timescale,

an attribute that is symptomatic of the possible existence of power law distributions like

the one being envisaged in this study. A useful measure of quantifying deviations from

the Gaussian distribution is the Hurst’s exponent. If a population is Gaussian, a Hurst’s

exponent of 0.5 is mandated. Empirical evidence, however, shows that the Hurst’s ex-

ponent for typical stock market data is around 0.6 for small timescales of about a day

or less and tends to approach 0.5 asymptotically with the lengthening of the timescales.

Empirical evidence also demonstrates the existence of memory effects, particularly in

stock price volatilities that show long term memory effects with lag-s autocorrelations.


Further, these effects tend to fall off according to a power law rather than exponentially.
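To make the role of the Hurst exponent concrete, the following sketch estimates it from the scaling of the standard deviation of lagged increments, std[x(t+τ) − x(t)] ∝ τ^H, by a least-squares fit in log-log co-ordinates. This is one simple estimator swapped in for illustration; it is not the rescaled range procedure of [47], and the function names, lag set and synthetic test series are assumptions of this sketch rather than anything taken from the studies cited above.

```python
import math
import random


def hurst_exponent(series, lags=(1, 2, 4, 8, 16, 32, 64)):
    """Slope of log(std of lagged increments) against log(lag)."""
    xs, ys = [], []
    for lag in lags:
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        mean = sum(diffs) / len(diffs)
        var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
        xs.append(math.log(lag))
        ys.append(0.5 * math.log(var))  # log of the standard deviation
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def gaussian_walk(n, seed=0):
    """Synthetic random walk with independent Gaussian steps (test data only)."""
    rng = random.Random(seed)
    walk = [0.0]
    for _ in range(n):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    return walk


if __name__ == "__main__":
    print(hurst_exponent(gaussian_walk(20000, seed=1)))
```

For the synthetic Gaussian walk the estimate lies close to 0.5; applied to log prices, values above 0.5 at short lags would correspond to the superdiffusive behaviour described in the text.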

Furthermore, the access to enhanced computing power during the last decade has

enabled analysts to try refined methods like the phase space reconstruction methods

for determining the Lyapunov Exponents [46] of stock market price data, besides doing

Rescaled Analysis [47] etc. A set of several studies has indicated the existence of strong

evidence that the stock market shows chaotic behavior with fractal return structures and

positive Lyapunov exponents. Results of these studies have unambiguously established

the existence of significant nonlinearities and chaotic behavior in these time series [48-51].

As mentioned above, several studies [28,52-55] adopting largely diverse and indepen-

dent approaches have established the existence of the following characteristics in the

behavior of stock markets:-

• Long term correlation and memory effects

• Erratic markets under certain conditions and at certain times

• Fractal time series of returns

• Less reliable forecasts with increase in the horizon

thereby establishing strong evidence for the existence of chaotic behavior. In this

context, the following are conventionally accepted as the inherent characteristics of a

chaotic system [56-60]:-

• Exponential divergence of trajectories in phase space;

• Sensitive dependence on initial conditions;

• Fractal dimensions;

• Critical levels and bifurcations;

• Time dependent feedback systems;

• Far from equilibrium conditions.

This provides us with a second motivation for the adoption of this deformed Brownian

motion structure as a model for the random kicks since our model is based on a statistical

time dependent feedback into the system. This feedback may be modeled into the sys-

tem macroscopically through the explicit representation of the probability distribution

function f (U, t) and microscopically through the stochastic process U .

It needs to be emphasized here that the above is purely a phenomenological model for modeling stock behavior. One could, for instance, postulate that the statistical feedback at the microscopic level represents the actions of, and interactions among, the traders constituting the market. The statistical dependency in the noise could, further, represent the aggregate behavior of these traders. Thus, we could model a market with non homogeneous reactions with consequent biased return structures.

It is fair to say that the current stage of research in financial processes is dominated

by the postulation of phenomenological models that attempt to explain a limited set of

market behavior. There is a strong reason for this. A financial market consists of a huge

number of market players. Each of them is endowed with his own set of beliefs about ra-

tional behavior and it is this set of beliefs that govern his actions. The market, therefore,

invariably generates a heterogeneous response to any stimulus. Furthermore, “rational-


ity” mandates that every market player should have knowledge and understanding about

the “rationality” of all other players and should take full cognizance in modeling his re-

sponse to the market. This logic would extend to each and every market player so that

we have a situation where every market player should have knowledge about the beliefs

of every other player who should have knowledge of beliefs of every other player and so

on. We, thus, end up with an infinitely complicated problem that would defy a solution

even with the most sophisticated mathematical procedures. Additionally, unlike physics, financial economics does not possess a basic set of postulates like General Relativity and Quantum Mechanics that find homogeneous applicability to all systems in their domain of validity.


References

[1] A. J. Macfarlane, J.Phys., A 22, (1989), 4581;

[2] L.C. Biedenharn, J.Phys., A 22, (1989), L873;

[3] S. Zakrzewski, J.Phys., A 31, (1998), 2929 and references therein; Shahn Majid, J.Math. Phy., 34, (1993), 2045;

[4] Michael Schurmann, Comm. Math. Phy., 140, (1991), 589;

[5] C. Blecken and K.A. Muttalib, J.Phys., A 31, (1998), 2123;

[6] J. P. Singh, Ind. J. Phys., 76, (2002), 285;

[7] U. Franz and R. Schott, J. Phys., A 31, (1998), 1395;

[8] V.I. Man’ko et al, Phy. Lett., A 176, (1993), 173; V.I. Man’ko and R.Vileta Mendes,J.Phys., A 31, (1998), 6037;

[9] W. Paul & J. Nagel, Stochastic Processes, Springer, (1999);

[10] J. Voit, The Statistical Mechanics of Financial Markets, Springer, (2001);

[11] Jean-Philippe Bouchaud & Marc Potters, Theory of Financial Risks, Publication by the Press Syndicate of the University of Cambridge, (2000);

[12] J. Maskawa, Hamiltonian in Financial Markets, arXiv:cond-mat/0011149 v1, 9 Nov 2000;

[13] Z. Burda et al, Is Econophysics a Solid Science?, arXiv:cond-mat/0301069 v1, 8 Jan 2003;

[14] A. Dragulescu, Application of Physics to Economics and Finance: Money, Income, Wealth and the Stock Market, arXiv:cond-mat/0307341 v2, 16 July 2003;

[15] A. Dragulescu & M. Yakovenko, Statistical Mechanics of Money, arXiv:cond-mat/0001432 v4, 4 Mar 2000;

[16] B. Baaquie et al, Quantum Mechanics, Path Integration and Option Pricing: Reducing the Complexity of Finance, arXiv:cond-mat/0208191v2, 11 Aug 2002;

[17] G. Bonanno et al, Levels of Complexity in Financial markets, arXiv:cond-mat/0104369 v1, 19 Apr 2001;

[18] A. Dragulescu, & M. Yakovenko, Statistical Mechanics of money, income and wealth: A Short Survey, arXiv:cond-mat/0211175 v1, 9 Nov 2002;

[19] J. Doyne Farmer, Physics Attempt to Scale the Ivory Tower of Finance, adap-org/9912002 10 Dec 1999;

[20] F. Black & M. Scholes, Journal of Political Economy, 81, (1973), 637;

[21] M. Baxter & E. Rennie, Financial Calculus, Cambridge University Press, (1992).

[22] J. C. Hull & A. White, Journal of Finance, 42, (1987), 281;

[23] J. C. Hull, Options, Futures & Other Derivatives, Prentice Hall, (1997);

[24] R. C. Merton, Journal of Financial Economics, (1976), 125;

[25] C. W. Gardiner, Handbook of Stochastic Methods, Springer, (1996);


[26] Enrique Canessa, Langevin Equation of Financial Systems: A second-order analysis, arXiv:cond-mat/0104412 v1, 22 Apr 2001.

[27] H. Risken, The Fokker Planck Equation, Springer, (1996);

[28] E. Peters, Chaos & Order in the Capital Markets, Wiley, (1996) and references therein;

[29] L. Andersen L & J. Andreasen, Review of Derivatives Research, 4, 231, (2000);

[30] J.P. Bouchaud et al, Risk 93, 61, (1996);

[31] J.P. Bouchaud & M.Potters, Theory of Financial Risks, Cambridge, (2000);

[32] E. Eberlein et al, Journal of Business 71(3), 371, (1998);

[33] B. Dupire, RISK Magazine, 8, January 1994;

[34] R.N. Mantegna & H.E. Stanley, An Introduction to Econophysics, Cambridge, (2000);

[35] M.M. Dacorogna et al, J. Int’l Money & Finance, 12, 413, (1993);

[36] R.N. Mantegna & H.E. Stanley, Nature, 383, 587, (1996);

[37] R.N. Mantegna, Physica A, 179, 232, (1991);

[38] P. Gopikrishnan et al, Phys. Rev. E 60, 5305, (1999);

[39] P. Gopikrishnan et al, Phys. Rev. E 62, R4493, (2000);

[40] P. Gopikrishnan et al, Physica A, 299, 137, (2001);

[41] P. Gopikrishnan et al, Phys. Rev. E 60, 5305, (1999);

[42] V. Plerou et al, Phys. Rev. E 60, 6519, (1999);

[43] W.F. Sharpe, Portfolio Theory & Capital Markets, McGraw Hill, 1970.

[44] E.J. Elton & M.J. Gruber, Modern Portfolio Theory & Investment Analysis, Wiley, 1981.

[45] S.M. Ross, Stochastic Processes, John Wiley, 1999.

[46] A. Wolf, J.B. Swift, S.L. Swinney & J.A. Vastano, Determining Lyapunov Exponents From a Time Series, Physica 16D, 1985, 285.

[47] B.B. Mandelbrot, The Fractal Geometry of Nature, Freeman Press, 1977.

[48] G. DeBoek, Ed., Trading on the Edge, Wiley, 1994.

[49] J.G. DeGooijer, Testing Nonlinearities in World Stock Market Prices, Economics Letters, 31, 1989.

[50] E. Peters, A Chaotic Attractor for the S&P 500, Financial Analysts Journal, March/April, 1991.

[51] E. Peters, Fractal Structure in the Capital Markets, Financial Analysts Journal, July/August, 1989. P. Cootner, Ed., The Random Character of Stock Market Prices, Cambridge MIT Press, 1964.

[52] E.F. Fama, The Behaviour of Stock Market Prices, Journal of Business, 38, 1965.

[53] E.F. Fama, Efficient Capital Markets – A Review of Theory & Empirical Work, Journal of Finance, 25, 1970.


[54] E.F. Fama & K.R. French, The Cross Section of Expected Stock Returns, Journal of Finance, 47, 1992.

[55] E.F. Fama & M.H. Miller, The Theory of Finance, Holt Rinehart & Winston, 1972.

[56] A. Lichtenberg and M. Lieberman, Regular & Stochastic Motion, Springer, 1983.

[57] L.E. Reichl, The Transition to Chaos, Springer, 1992.

[58] V.I. Arnol’d, and A. Avez, Ergodic Problems of Classical Mechanics, Benjamin, 1968.

[59] I. Kornfeld, S. Fomin, and Ya Sinai, Ergodic Theory, Springer, 1982.

[60] J. Guckenheimer, and P. Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, Springer Verlag, 1983.


Derivation of the Radiative Transfer Equation Inside a Moving Semi-Transparent Medium of Non Unit Refractive Index

V. LE DEZ and H. SADAT∗

Laboratoire d’Etudes Thermiques UMR 6608 CNRS-ENSMA - 86960 Futuroscope Cedex, France


Abstract: The derivation of the radiative transfer equation inside a moving semi-transparent medium of non unit constant refractive index has been completely achieved, leading to an exactly similar equation as in the case of a unit index, unless it is expressed in a particular frame with particular time and space co-ordinates; defining first the “equivalent vacuum” and the “matter” space associated to its “matter” co-ordinates with the help of the Gordon’s metric, it is shown that an observer at rest in vacuum perceives the isotropic moving medium as an anisotropic uniaxial medium of given optical axis, for which it is possible to derive general transmission and reflection rules for electromagnetic fields; however the exhibited refractive index characterising the moving medium, relatively to the observer located in vacuum, is not an effective index but only an apparent one without any energetic significance, and the specific intensity must be obtained relatively to a given observer at rest located inside the moving medium; finally the general form of the radiative transfer equation is obtained in the moving medium.
© Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Radiation Hydrodynamics, Radiative Transfer Equation, Gordon’s Metric, Optical Properties
PACS (2006): 47.35.i, 47.65.d, 67.40.Hf, 02.40.k, 02.90.+p, 04.40.Nr

1. Introduction

Several years ago, Mihalas [1] proposed an elegant way to obtain the invariant radia-

tive transfer equation in a moving semi-transparent medium, noting that in some cases

it was judicious to perform energetic calculations either in a comobile frame bound to

a moving particle or in the frame bound to a given observer; then, when a radiation

participates to the energy transfer, it is necessary to be able to compute the radiative



fluxes in the appropriate frame; the energetic radiative fluxes being strongly related to

the radiative intensity whose evolution is governed by the radiative transfer equation,

one may naturally conceive to give the appropriate form of this equation either in the

comobile frame or in the observer frame. This fundamental work however is restricted

to media for which the refractive index is a unit index; this approximation is of great

interest for gases for which the refractive index is very close to 1, but dense isotropic me-

dia, liquids or solids, have generally a much higher refractive index, and we may imagine

situations where some liquids of high index are moving with a large speed: what is then

in this case the correct form of the radiative transfer equation in such media, and is it

possible to exhibit an invariant form of this equation, valid in both the comobile frame

or the observer frame?

If the optics of moving dielectric media has recently received a strong interest in the

literature [2, 3], to our knowledge no study focused on what happens from a radiative

energetic point of view in moving semi-transparent media; one may suspect, however, that

the effects of high speeds may be spectacular, since some spectacular effects may arise

from an optical point of view, as described in [2, 3]; the main tool used to exhibit such

optical effects is the Gordon’s metric tensor; indeed, many decades ago, Gordon [4] had

the intuition that light in a moving dielectric medium “could see matter as a metric” in the

sense where a moving dielectric medium acts on light as an effective gravitational field, and

this is this property which is enhanced to produce in some particular conditions special

optical effects; following this basic idea, it may be interesting to see if the Gordon’s metric

is the appropriate tool to exhibit an invariant form of the radiative transfer equation.

The purpose of this paper is then the derivation of the radiative transfer equation

inside a grey (i.e. its optical properties are non frequency depending) moving semi-

transparent medium characterised by its constant refractive index different from one; to

do so, we shall first develop in section II the optical problem, that is determine the angle

and frequency transformation for a propagating radiation, between the comobile location

reference system bound to a moving particle embedded in the medium of non unit re-

fractive index and the location reference system relatively to a given observer inside the

medium: from the Gordon’s metric, we shall construct an “equivalent vacuum” and its

related “mater-light space” perceived both by the moving particle and the observer, such

that the “vacuum” bound to the particle and the “vacuum” bound to the observer are

related by a Lorentz transformation thanks to a particular rapidity different from the

usual one; this latter result will provide us in section IV, analogously to what happens

in the real vacuum, the radiation angle and frequency transformation in the refractive

medium, from which, following the work of Mihalas, one deduces the invariant form of

the radiative transfer equation. In section III, a closely related problem will be exam-

ined, which allows to interpret an uniaxial crystal (here are only studied the negative

crystals) as an isotropic moving medium, for which it is possible to derive (here only the

parallel polarisation was examined) the reflection and transmission laws for an electro-

magnetic field through an interface separating the crystal from an isotropic medium of

unit refractive index.


2. The Optical Problem: Construction of an Equivalent Vac-

uum and Determination of the Fundamental Mater-Light

Space

It is well known that in a motionless medium embedded in a flat space, for which the refractive index equals unity, the radiative transfer equation (RTE) can be written in Cartesian co-ordinates as

$$P^{\alpha}\,\partial_{\alpha} I = \frac{h}{c}\left(\frac{\eta}{\nu^2} - \kappa\,\nu\,I\right) = \frac{\kappa h}{c\,\nu^2}\left[L^0(T) - L\right], \qquad (1)$$

where $I = L/\nu^3$ is the specific intensity, L being the classical intensity, and $\overset{\Rightarrow}{P} = \frac{h\nu}{c}\left(1, \vec{\Omega}\right)$ is the impulsion-energy 4-vector; κ is the absorption coefficient, $L^0(T)$ the black body intensity at local thermodynamic equilibrium for a given temperature T, h the Planck constant, c the light speed in the vacuum and ν the radiation frequency inside the medium; in absence of any relativistic event, the radiation frequency remains constant, and the formal RTE can be rewritten under the standard form

$$\frac{1}{c}\,\frac{\partial L}{\partial t} + \frac{\partial L}{\partial s} = \kappa\left[L^0(T) - L\right], \qquad (2)$$

where t is the time and s the curvilinear abscissa along a luminous trajectory, with $\frac{\partial L}{\partial s} = \vec{\Omega}\cdot\overrightarrow{\mathrm{grad}}\,L$.

The Gordon effective gravitational field can be expressed as [2]

$$g^{\mu\nu} = g^{\mu\nu}_{(0)} - (\varepsilon\mu - 1)\,u^{\mu}u^{\nu} \;\Leftrightarrow\; g_{\mu\nu} = g^{(0)}_{\mu\nu} + \left(1 - \frac{1}{\varepsilon\mu}\right)u_{\mu}u_{\nu}, \qquad (3)$$

where u is the mean 4-speed vector of the medium (relatively to a given observer), $g^{(0)}$ the vacuum Minkowski tensor, g the effective gravitational tensor, and ε and μ the relative dielectric permittivity and magnetic permeability of the medium assumed hereafter isotropic, related to its refractive index n by $n^2 = \varepsilon\mu$; we shall consider only non magnetic media, with μ = 1, so that $n^2 = \varepsilon$. Let us now recall a more mechanical demonstration of this latter result: in a transparent medium where the refractive index, hereafter assumed constant, i.e. non depending on space and/or time co-ordinates, is not 1, the proper time interval (PTI) for a photon can be written in Cartesian co-ordinates as

$$d\tau^2 = c^2\,dt^2 - n^2\left(dx^2 + dy^2 + dz^2\right) = -g_{\mu\nu}\,dx^{\mu}dx^{\nu}, \qquad (4)$$

where the contravariant co-ordinates $x^{\mu}$ are $x^{\mu} = (x^0, x^1, x^2, x^3) = (ct, x, y, z)$; this is the most general form of the photon PTI in a motionless medium, simple translation of the fact that light propagates at speed $\frac{c}{n}$ in a dielectric for which the refractive index is different from one, for the PTI is a light one for a photon and $d\tau = 0 \Rightarrow \frac{ds}{dt} = \frac{c}{n}$; hence one deduces the covariant components of the metric tensor in Cartesian co-ordinates

$$g_{00} = -1 \qquad g_{xx} = n^2 \qquad g_{yy} = n^2 \qquad g_{zz} = n^2, \qquad (5)$$


and the contravariant components, since the tensor is diagonal,

\[ g^{00} = -1, \quad g^{xx} = \frac{1}{n^2}, \quad g^{yy} = \frac{1}{n^2}, \quad g^{zz} = \frac{1}{n^2}. \qquad (6) \]

It should be noted that the PTI can be rewritten as

\[ d\tau^2 = (1 - n^2)\,(dx^0)^2 + n^2\left[(dx^0)^2 - dx^2 - dy^2 - dz^2\right]. \qquad (7) \]

At a given event $\{x^\mu\}$ in space-time, one can define a 4-speed vector $u^\mu = \frac{dx^\mu}{d\tau}$ representing the mean motion of the dielectric; in a given comobile location reference system (LRS) bound to a particle moving with a speed $\vec{\beta} = \vec{v}/c$ relative to a “fixed” (observer) LRS, the covariant components of the 4-speed are $u_{\mu'} = -\delta^0_{\mu'}$, where the primes indicate the co-ordinates relative to the considered comobile LRS; hence in this LRS one has

\[ d\tau^2 = (1 - n^2)\left(u_{\mu'}\,dx^{\mu'}\right)^2 - n^2\,g^{(0)}_{\mu'\nu'}\,dx^{\mu'}dx^{\nu'} = -g_{\mu'\nu'}\,dx^{\mu'}dx^{\nu'}, \qquad (8) \]

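where $g^{(0)}_{\mu'\nu'}$ represents the vacuum metric tensor in the comobile LRS; the kinematic transformations between the vacuum metric tensor relative to the comobile LRS and the vacuum metric tensor relative to the observer LRS, and between the corresponding metric tensors associated with the dielectric, are

\[ g^{\mu\nu\,(0)} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\,g^{\mu'\nu'\,(0)} \quad\text{and}\quad g^{\mu\nu} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\,g^{\mu'\nu'}. \qquad (9) \]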

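Considering a motion along the $\vec{x}$ axis, the former relations are developed for the diagonal components as

\[ g^{00\,(0)} = -\left(\frac{\partial x^0}{\partial x^{0'}}\right)^2 + \left(\frac{\partial x^0}{\partial x'}\right)^2, \quad g^{xx\,(0)} = -\left(\frac{\partial x}{\partial x^{0'}}\right)^2 + \left(\frac{\partial x}{\partial x'}\right)^2, \quad g^{yy\,(0)} = 1, \quad g^{zz\,(0)} = 1, \qquad (10) \]

and for the non diagonal one as

\[ g^{0x\,(0)} = -\left(\frac{\partial x^0}{\partial x^{0'}}\right)\left(\frac{\partial x}{\partial x^{0'}}\right) + \left(\frac{\partial x^0}{\partial x'}\right)\left(\frac{\partial x}{\partial x'}\right), \qquad (11) \]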

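since the variables $x^0$ and $x$ do not depend on $y$ and $z$, which remain unchanged; the Lorentz transform in Cartesian co-ordinates is simply

\[ \begin{aligned} x^{0'} &= \gamma\,(x^0 - \beta x), & x' &= \gamma\,(x - \beta x^0), & y' &= y, & z' &= z, \\ x^0 &= \gamma\,(x^{0'} + \beta x'), & x &= \gamma\,(x' + \beta x^{0'}), & y &= y', & z &= z', \end{aligned} \qquad (12) \]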

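from which one obtains

\[ dx^0\,dx = \left|\frac{\partial(x^0, x)}{\partial(x^{0'}, x')}\right| dx^{0'}dx' = \begin{vmatrix} \dfrac{\partial x^0}{\partial x^{0'}} & \dfrac{\partial x}{\partial x^{0'}} \\[2mm] \dfrac{\partial x^0}{\partial x'} & \dfrac{\partial x}{\partial x'} \end{vmatrix} dx^{0'}dx' = dx^{0'}dx', \qquad (13) \]

and, owing to the conservation of the scalar density, it follows that

\[ \sqrt{-\mathrm{Det}\,g}\;dx^0\,dx\,dy\,dz = \sqrt{-\mathrm{Det}\,g'}\;dx^{0'}dx'\,dy'\,dz', \qquad (14) \]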

hence one has $\mathrm{Det}\,g = -n^6$: in Cartesian co-ordinates, the metric tensor determinant remains unchanged; developing relations (10)-(11) with the help of (12), one obtains the contravariant components of the vacuum metric tensor relative to the observer LRS

\[ g^{00\,(0)} = -1, \quad g^{xx\,(0)} = 1, \quad g^{yy\,(0)} = 1, \quad g^{zz\,(0)} = 1, \qquad (15) \]

and for the non diagonal component $g^{0x\,(0)} = 0$; the tensor being diagonal, one deduces the covariant components as

\[ g^{(0)}_{00} = -1, \quad g^{(0)}_{xx} = 1, \quad g^{(0)}_{yy} = 1, \quad g^{(0)}_{zz} = 1, \qquad (16) \]

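while the metric tensor associated with the dielectric in the observer LRS is $g^{\mu\nu} = \frac{\partial x^\mu}{\partial x^{\mu'}}\frac{\partial x^\nu}{\partial x^{\nu'}}\,g^{\mu'\nu'}$, leading to

\[ g^{00} = -\frac{\gamma^2\,(n^2 - \beta^2)}{n^2}, \quad g^{xx} = \frac{\gamma^2\,(1 - \beta^2 n^2)}{n^2}, \quad g^{yy} = \frac{1}{n^2}, \quad g^{zz} = \frac{1}{n^2}, \qquad (17) \]

for the diagonal components and

\[ g^{0x} = -\frac{\beta\gamma^2\,(n^2 - 1)}{n^2}, \qquad (18) \]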

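for the non diagonal one. The 4-speed vector being defined in vacuum as $\overset{\Rightarrow}{u} = \gamma\,\bigl(1, \vec{\beta}\bigr)$ with $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$, one has for the contravariant components of the 4-speed $u^0 = \gamma$ and $u^x = \beta\gamma$; hence the contravariant components of the metric tensor given by (17) and (18) can be rewritten as

\[ g^{00} = \frac{1}{n^2}\left[g^{00\,(0)} - (n^2 - 1)\,u^0 u^0\right], \quad g^{xx} = \frac{1}{n^2}\left[g^{xx\,(0)} - (n^2 - 1)\,u^x u^x\right], \quad g^{0x} = \frac{1}{n^2}\left[g^{0x\,(0)} - (n^2 - 1)\,u^0 u^x\right], \qquad (19) \]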

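or in the more compact form $g^{\mu\nu} = \frac{1}{n^2}\left[g^{\mu\nu\,(0)} - (n^2 - 1)\,u^\mu u^\nu\right]$, which is the Gordon metric [4]; it is then possible to obtain the covariant components by a simple inversion of the contravariant matrix, or more simply by using Eq. (8), since $u_{\mu'}\,dx^{\mu'} = u_{\mu'}\frac{\partial x^{\mu'}}{\partial x^\mu}\,dx^\mu = -\delta^0_{\mu'}\frac{\partial x^{\mu'}}{\partial x^\mu}\,dx^\mu = -\frac{\partial x^{0'}}{\partial x^\mu}\,dx^\mu$, from which $u_{\mu'}\,dx^{\mu'} = -\gamma\,(dx^0 - \beta\,dx)$; hence one obtains for the covariant components, since $dx^0\,dx = dx^{0'}dx'$:

\[ g_{00} = -\gamma^2\,(1 - \beta^2 n^2), \quad g_{xx} = \gamma^2\,(n^2 - \beta^2), \quad g_{yy} = n^2, \quad g_{zz} = n^2, \quad g_{0x} = -\beta\gamma^2\,(n^2 - 1), \qquad (20) \]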


which may be rewritten in the compact Gordon form $g_{\mu\nu} = n^2\,g^{(0)}_{\mu\nu} + (n^2 - 1)\,u_\mu u_\nu$.
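As a numerical sanity check of Eq. (20), the following sketch Lorentz-transforms the rest-frame metric diag(−1, n²) of Eq. (5) and compares the result with the compact Gordon form. The function names and the sample values of n and β are illustrative assumptions, not part of the original derivation.

```python
import math

def boosted_rest_metric(n, beta):
    """Covariant (g00, gxx, g0x) obtained by transforming diag(-1, n^2) with Eq. (12)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Partial derivatives d x^{mu'} / d x^{mu} of x0' = gamma (x0 - beta x), x' = gamma (x - beta x0)
    d0p_d0, d0p_dx = gamma, -gamma * beta
    dxp_d0, dxp_dx = -gamma * beta, gamma
    g00 = -d0p_d0 * d0p_d0 + n**2 * dxp_d0 * dxp_d0
    gxx = -d0p_dx * d0p_dx + n**2 * dxp_dx * dxp_dx
    g0x = -d0p_d0 * d0p_dx + n**2 * dxp_d0 * dxp_dx
    return g00, gxx, g0x

def gordon_compact(n, beta):
    """Covariant (g00, gxx, g0x) from g = n^2 eta + (n^2 - 1) u u."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    u0, ux = -gamma, gamma * beta  # covariant 4-speed components in vacuum
    g00 = -n**2 + (n**2 - 1.0) * u0 * u0
    gxx = n**2 + (n**2 - 1.0) * ux * ux
    g0x = (n**2 - 1.0) * u0 * ux
    return g00, gxx, g0x

if __name__ == "__main__":
    print(boosted_rest_metric(1.33, 0.4))
    print(gordon_compact(1.33, 0.4))
```

Both functions should agree to rounding error for any n ≥ 1 and |β| < 1, which is the content of the compact form above.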

In such a metric the photon PTI is

\[ d\tau^2 = -g_{\mu\nu}\,dx^\mu dx^\nu = -g_{00}\,(dx^0)^2 - g_{xx}\,dx^2 - 2\,g_{0x}\,dx^0 dx - n^2\left(dy^2 + dz^2\right) = 0, \qquad (21) \]

for a constant refractive index the luminous trajectories are straight lines, and if light propagates along the $\vec{x}$ axis, then $dy = dz = 0$, from which one deduces

\[ g_{xx}\left(\frac{dx}{dx^0}\right)^2 + 2\,g_{0x}\,\frac{dx}{dx^0} + g_{00} = 0, \qquad (22) \]

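solving this equation easily leads to

\[ \frac{dx}{dx^0} = \frac{1}{n_{\mathrm{eff}}} = \frac{1 + \beta n}{n + \beta} = \frac{dx' + \beta\,dx^{0'}}{\beta\,dx' + dx^{0'}}, \qquad (23) \]

and if $\beta \ll 1$ one has for the effective refractive index $n_{\mathrm{eff}}$ the relation $\frac{1}{n_{\mathrm{eff}}} = \frac{1}{n} + \beta\left(1 - \frac{1}{n^2}\right)$, which is the well-known Fresnel drag formula [5].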
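The chain from Eq. (22) to the Fresnel drag formula can be checked numerically. The sketch below, with illustrative function names, solves the quadratic (22) using the components of Eq. (20), then compares the root with the velocity-addition form of Eq. (23) and with its first-order expansion in β.

```python
import math

def light_speed_from_metric(n, beta):
    """Forward root dx/dx0 of Eq. (22), built from the components of Eq. (20)."""
    gamma2 = 1.0 / (1.0 - beta**2)
    g00 = -gamma2 * (1.0 - beta**2 * n**2)
    gxx = gamma2 * (n**2 - beta**2)
    g0x = -beta * gamma2 * (n**2 - 1.0)
    disc = (2.0 * g0x) ** 2 - 4.0 * gxx * g00
    return (-2.0 * g0x + math.sqrt(disc)) / (2.0 * gxx)

def velocity_addition(n, beta):
    """Closed form of Eq. (23): relativistic composition of 1/n and beta."""
    return (1.0 + beta * n) / (n + beta)

def fresnel_first_order(n, beta):
    """First-order expansion in beta (Fresnel drag)."""
    return 1.0 / n + beta * (1.0 - 1.0 / n**2)

if __name__ == "__main__":
    for beta in (1e-3, 0.1, 0.5):
        print(beta, light_speed_from_metric(1.33, beta),
              velocity_addition(1.33, beta), fresnel_first_order(1.33, beta))
```

The first two columns coincide for every β, while the Fresnel expression only matches them for small β, as expected from a first-order expansion.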

However, the physical significance of the former metric is not obvious, since the $g_{00}$ component associated with time may be negative, positive or even zero, depending on whether $\beta$ is lower or greater than $\frac{1}{n}$; we therefore choose a new metric, equivalent to the preceding one, for which the unique eigen-value associated with time is always negative: the calculation of the covariant metric tensor eigen-values shows that $n^2$ is a double eigen-value associated with the eigen-vectors $\vec{e}_y$ and $\vec{e}_z$, the two other eigen-values being solutions of the characteristic equation

\[ g^2 - (g_{xx} + g_{00})\,g - \left(g_{0x}^2 - g_{xx}\,g_{00}\right) = g^2 - (g_{xx} + g_{00})\,g - n^2 = 0, \qquad (24) \]

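leading to

\[ g_1 = \frac{g_{xx} + g_{00} - \sqrt{\Delta}}{2} = \frac{1}{2}\left[\gamma^2(\beta^2 + 1)(n^2 - 1) - \sqrt{\Delta}\right], \quad g_2 = \frac{g_{xx} + g_{00} + \sqrt{\Delta}}{2} = \frac{1}{2}\left[\gamma^2(\beta^2 + 1)(n^2 - 1) + \sqrt{\Delta}\right], \qquad (25) \]

with $g_1\,g_2 = -n^2$ and $\Delta = (g_{xx} + g_{00})^2 + 4\left(g_{0x}^2 - g_{xx}\,g_{00}\right) = (n^2 + 1)^2 + 4\,\beta^2\gamma^4\,(n^2 - 1)^2 > 0$.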

One deduces from the former result that the $g_1$ eigen-value is strictly negative whatever the refractive index $n$ and the medium rapidity $\beta$ are, and that it may be associated with time, the $g_2$ eigen-value being always positive and associated with a space variable: in well-chosen axes, the metric tensor relative to the observer LRS is diagonal and can be represented as

\[ g = \begin{pmatrix} g_1 & 0 & 0 & 0 \\ 0 & g_2 & 0 & 0 \\ 0 & 0 & n^2 & 0 \\ 0 & 0 & 0 & n^2 \end{pmatrix}. \qquad (26) \]
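The properties claimed for $g_1$ and $g_2$ can be verified numerically. The following sketch (function names are illustrative) evaluates Eq. (25) and checks that both values annihilate the determinant of the $(x^0, x)$ block built from Eq. (20), that their product is $-n^2$, and that their signs are as stated.

```python
import math

def metric_block(n, beta):
    """Covariant (g00, gxx, g0x) of Eq. (20)."""
    gamma2 = 1.0 / (1.0 - beta**2)
    return (-gamma2 * (1.0 - beta**2 * n**2),
            gamma2 * (n**2 - beta**2),
            -beta * gamma2 * (n**2 - 1.0))

def eigenvalues(n, beta):
    """(g1, g2) of Eq. (25), using the closed form of the discriminant."""
    gamma2 = 1.0 / (1.0 - beta**2)
    trace = gamma2 * (1.0 + beta**2) * (n**2 - 1.0)
    delta = (n**2 + 1.0) ** 2 + 4.0 * beta**2 * gamma2**2 * (n**2 - 1.0) ** 2
    root = math.sqrt(delta)
    return 0.5 * (trace - root), 0.5 * (trace + root)

def characteristic(n, beta, g):
    """Determinant of the 2x2 block minus g times the identity."""
    g00, gxx, g0x = metric_block(n, beta)
    return (g00 - g) * (gxx - g) - g0x**2

if __name__ == "__main__":
    g1, g2 = eigenvalues(1.33, 0.9)
    print(g1, g2, g1 * g2, characteristic(1.33, 0.9, g1))
```

Note that $g_{00}$ itself changes sign at $\beta = 1/n$, whereas $g_1$ stays negative over the whole range, which is the motivation given above for working in the diagonal basis.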


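The determination of the associated eigen-vectors leads to

\[ \begin{aligned} \vec{E}_1 &= \vec{e}_0 + \frac{\beta\gamma^2\,(n^2 - 1)}{\gamma^2\,(n^2 - \beta^2) - g_1}\,\vec{e}_x = \vec{e}_0 - \frac{g_{0x}}{g_{xx} - g_1}\,\vec{e}_x, \\ \vec{E}_2 &= -\frac{\beta\gamma^2\,(n^2 - 1)}{\gamma^2\,(1 - \beta^2 n^2) + g_2}\,\vec{e}_0 + \vec{e}_x = \frac{g_{0x}}{g_2 - g_{00}}\,\vec{e}_0 + \vec{e}_x, \end{aligned} \qquad (27) \]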

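these two vectors are orthogonal relative to this metric and can be normed so that $\vec{e}_1^{\,2} = g_1$ and $\vec{e}_2^{\,2} = g_2$, since $(g_{xx} - g_1)(g_{00} - g_1) = g_{0x}^2$: hence

\[ \vec{E}_1^{\,2} = g_{00} - \frac{2\,g_{0x}^2}{g_{xx} - g_1} + \frac{g_{0x}^2\,g_{xx}}{(g_{xx} - g_1)^2} = 2\,g_1 - g_{00} + \frac{g_{xx}\,(g_{00} - g_1)}{g_{xx} - g_1} = \frac{g_1\sqrt{\Delta}}{g_{xx} - g_1}, \]

from which $\vec{E}_1^{\,2} = \vec{e}_1^{\,2}\,\frac{g_2 - g_1}{g_{xx} - g_1}$ and $\vec{e}_1 = \sqrt{\frac{g_{xx} - g_1}{g_2 - g_1}}\left(\vec{e}_0 - \frac{g_{0x}}{g_{xx} - g_1}\,\vec{e}_x\right)$; performing the same calculation with the second eigen-vector finally leads to the expression of the two normed eigen-vectors $\vec{e}_1$ and $\vec{e}_2$ associated with the two eigen-values $g_1$ and $g_2$ as

\[ \vec{e}_1 = \frac{1}{\sqrt{g_2 - g_1}}\left(\sqrt{g_{xx} - g_1}\;\vec{e}_0 + \sqrt{g_{00} - g_1}\;\vec{e}_x\right), \quad \vec{e}_2 = \frac{1}{\sqrt{g_2 - g_1}}\left(-\sqrt{g_2 - g_{xx}}\;\vec{e}_0 + \sqrt{g_2 - g_{00}}\;\vec{e}_x\right). \qquad (28) \]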

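The 4-event vector is defined by $\bigl(d\overset{\Rightarrow}{M}\bigr)^2 = -d\tau^2 = g_{\mu\nu}\,dx^\mu dx^\nu \Rightarrow d\overset{\Rightarrow}{M} = dx^\mu\,\vec{e}_\mu \Rightarrow \overset{\Rightarrow}{M} = x^\mu\,\vec{e}_\mu$, hence for a given event one has

\[ \begin{aligned} \overset{\Rightarrow}{M} &= x^0\,\vec{e}_0 + x\,\vec{e}_x + y\,\vec{e}_y + z\,\vec{e}_z = x^{0'}\vec{e}_{0'} + x'\,\vec{e}_{x'} + y'\,\vec{e}_{y'} + z'\,\vec{e}_{z'} \\ &= \gamma\,(x^{0'} + \beta x')\,\vec{e}_0 + \gamma\,(x' + \beta x^{0'})\,\vec{e}_x + y'\,\vec{e}_{y'} + z'\,\vec{e}_{z'} \\ &= \gamma\,(\vec{e}_0 + \beta\,\vec{e}_x)\,x^{0'} + \gamma\,(\beta\,\vec{e}_0 + \vec{e}_x)\,x' + y'\,\vec{e}_{y'} + z'\,\vec{e}_{z'} \\ &\Rightarrow\; \gamma\,(\vec{e}_0 + \beta\,\vec{e}_x)\,x^{0'} = x^{0'}\vec{e}_{0'}, \quad \gamma\,(\beta\,\vec{e}_0 + \vec{e}_x)\,x' = x'\,\vec{e}_{x'}, \end{aligned} \qquad (29) \]

the variables $x^{0'}$ and $x'$ being independent; it follows that

\[ \vec{e}_0 + \beta\,\vec{e}_x = \frac{\vec{e}_{0'}}{\gamma}, \quad \beta\,\vec{e}_0 + \vec{e}_x = \frac{\vec{e}_{x'}}{\gamma} \;\Rightarrow\; \vec{e}_0 = \gamma\left(\vec{e}_{0'} - \beta\,\vec{e}_{x'}\right), \quad \vec{e}_x = \gamma\left(\vec{e}_{x'} - \beta\,\vec{e}_{0'}\right), \qquad (30) \]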

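for this event one also has $\overset{\Rightarrow}{M} = x^1\,\vec{e}_1 + x^2\,\vec{e}_2 + y\,\vec{e}_y + z\,\vec{e}_z = x^0\,\vec{e}_0 + x\,\vec{e}_x + y\,\vec{e}_y + z\,\vec{e}_z$ and for the pseudo-norm $\overset{\Rightarrow}{M}{}^2 = g_1\,(x^1)^2 + g_2\,(x^2)^2 + n^2\,(y^2 + z^2)$; but

\[ \begin{aligned} \vec{e}_1 &= \frac{1}{\sqrt{g_2 - g_1}}\left(\sqrt{g_{xx} - g_1}\;\vec{e}_0 + \sqrt{g_{00} - g_1}\;\vec{e}_x\right), & \vec{e}_2 &= \frac{1}{\sqrt{g_2 - g_1}}\left(-\sqrt{g_2 - g_{xx}}\;\vec{e}_0 + \sqrt{g_2 - g_{00}}\;\vec{e}_x\right) \\ \Rightarrow\; \vec{e}_0 &= \frac{1}{\sqrt{g_2 - g_1}}\left(\sqrt{g_{xx} - g_1}\;\vec{e}_1 - \sqrt{g_{00} - g_1}\;\vec{e}_2\right), & \vec{e}_x &= \frac{1}{\sqrt{g_2 - g_1}}\left(\sqrt{g_2 - g_{xx}}\;\vec{e}_1 + \sqrt{g_2 - g_{00}}\;\vec{e}_2\right), \end{aligned} \qquad (31) \]

from which

\[ \begin{aligned} x^1 &= \frac{\gamma}{\sqrt{g_2 - g_1}}\left[\left(\sqrt{g_{xx} - g_1} + \beta\sqrt{g_2 - g_{xx}}\right)x^{0'} + \left(\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_{xx} - g_1}\right)x'\right], \\ x^2 &= \frac{\gamma}{\sqrt{g_2 - g_1}}\left[\left(\beta\sqrt{g_2 - g_{00}} - \sqrt{g_{00} - g_1}\right)x^{0'} + \left(\sqrt{g_2 - g_{00}} - \beta\sqrt{g_{00} - g_1}\right)x'\right], \end{aligned} \qquad (32) \]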


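performing the calculation of $g_1\,(x^1)^2 + g_2\,(x^2)^2$ one finally obtains

\[ g_1\,(x^1)^2 + g_2\,(x^2)^2 = \gamma^2\left\{\left(g_{00} + \beta^2 g_{xx} + 2\beta\,g_{0x}\right)(x^{0'})^2 + \left(g_{xx} + \beta^2 g_{00} + 2\beta\,g_{0x}\right)x'^2 + 2\left[(1 + \beta^2)\,g_{0x} + \beta\,(g_{00} + g_{xx})\right]x^{0'}x'\right\} = -(x^{0'})^2 + n^2\,x'^2, \qquad (33) \]

hence the 4-event pseudo-norm remains unchanged, and

\[ \begin{aligned} dx^1\,dx^2 &= \left|\frac{\partial(x^1, x^2)}{\partial(x^{0'}, x')}\right| dx^{0'}dx' = \frac{\gamma^2}{g_2 - g_1}\begin{vmatrix} \sqrt{g_{xx} - g_1} + \beta\sqrt{g_{00} - g_1} & \beta\sqrt{g_2 - g_{00}} - \sqrt{g_2 - g_{xx}} \\ \sqrt{g_{00} - g_1} + \beta\sqrt{g_{xx} - g_1} & \sqrt{g_2 - g_{00}} - \beta\sqrt{g_2 - g_{xx}} \end{vmatrix} dx^{0'}dx' \\ &= \frac{\gamma^2}{g_2 - g_1}\,(1 - \beta^2)\,(g_2 - g_1)\;dx^{0'}dx' = dx^{0'}dx', \end{aligned} \qquad (34) \]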

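from which one deduces that $\sqrt{-\mathrm{Det}\,g}\;dx^1\,dx^2\,dy\,dz = \sqrt{-\mathrm{Det}\,g'}\;dx^{0'}dx'\,dy'\,dz'$, that is the conservation of the scalar density; noticing furthermore that

\[ \begin{aligned} \left(\sqrt{g_{xx} - g_1} - \beta\sqrt{g_{00} - g_1}\right)\left(\beta\sqrt{g_{xx} - g_1} - \sqrt{g_{00} - g_1}\right) &= -2\,\beta\,g_1, \\ \left(\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}\right)\left(\sqrt{g_2 - g_{00}} + \beta\sqrt{g_2 - g_{xx}}\right) &= 2\,\beta\,g_2, \\ \left(\sqrt{g_{xx} - g_1} - \beta\sqrt{g_{00} - g_1}\right)\left(\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}\right) &= 2\,\beta\,n^2, \\ \left(\beta\sqrt{g_{xx} - g_1} - \sqrt{g_{00} - g_1}\right)\left(\sqrt{g_2 - g_{00}} + \beta\sqrt{g_2 - g_{xx}}\right) &= 2\,\beta, \end{aligned} \qquad (35) \]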

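the $x^1$ and $x^2$ co-ordinates can equivalently be rewritten as

\[ \begin{aligned} x^1 &= \frac{\gamma}{\sqrt{g_2 - g_1}}\left[\frac{2\,\beta\,g_2}{\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}}\;x^{0'} + \left(\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}\right)x'\right], \\ x^2 &= \frac{\gamma}{g_2\sqrt{g_2 - g_1}}\left[\left(\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}\right)x^{0'} + \frac{2\,\beta\,g_2\,n^2}{\sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}}\;x'\right], \end{aligned} \qquad (36) \]

with the help of the auxiliary value $\sqrt{X} = \sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}}$, performing the calculation of $g_1\,(x^1)^2 + g_2\,(x^2)^2$, one finds that $X$ verifies the equation

\[ X^2 + \frac{g_2\,(g_2 - g_1)}{\gamma^2}\,X - 4\,\beta^2 g_2^2\,n^2 = 0, \qquad (37) \]

the discriminant of this equation is $\Delta = g_2^2\left[16\,\beta^2 n^2 + \frac{(g_2 - g_1)^2}{\gamma^4}\right] = g_2^2\,(1 + \beta^2)^2\,(n^2 + 1)^2$, and since $X$ is positive, one has

\[ \begin{aligned} X &= \frac{g_2}{2}\left[(1 + \beta^2)(n^2 + 1) - (1 - \beta^2)(g_2 - g_1)\right] = g_2\left[g_1 - g_{00} + \beta^2\,(g_{xx} - g_1)\right] \\ &= g_2\left(\beta^2 g_{xx} - g_{00}\right) - n^2\,(1 - \beta^2) = g_2\,(1 + \beta^2) - n^2\,(1 - \beta^2) = g_2\left[1 + g_1 + (1 - g_1)\,\beta^2\right], \end{aligned} \qquad (38) \]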

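following the same steps, with the help of the auxiliary value $\sqrt{X} = \sqrt{g_2 - g_{00}} + \beta\sqrt{g_2 - g_{xx}}$, and performing once again the calculation of $g_1\,(x^1)^2 + g_2\,(x^2)^2$, one finds that $X$ verifies the equation

\[ X^2 - \frac{g_2\,(g_2 - g_1)}{\gamma^2 n^2}\,X - \frac{4\,\beta^2 g_2^2}{n^2} = 0, \qquad (39) \]

the discriminant of this equation is $\Delta = \frac{g_2^2}{n^4}\left[16\,\beta^2 n^2 + \frac{(g_2 - g_1)^2}{\gamma^4}\right] = \frac{g_2^2\,(1 + \beta^2)^2\,(n^2 + 1)^2}{n^4}$, and since $X$ is positive, one has

\[ \begin{aligned} X &= \frac{g_2}{2\,n^2}\left[(1 + \beta^2)(n^2 + 1) + (1 - \beta^2)(g_2 - g_1)\right] = \frac{g_2}{n^2}\left[g_{xx} - g_1 + \beta^2\,(g_1 - g_{00})\right] \\ &= \frac{g_2}{n^2}\left(g_{xx} - \beta^2 g_{00}\right) + 1 - \beta^2 = g_2\,(1 + \beta^2) + 1 - \beta^2 = g_2 + 1 + (g_2 - 1)\,\beta^2. \end{aligned} \qquad (40) \]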

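Finally the system (35) can be rewritten as

\[ \begin{aligned} \sqrt{g_2 - g_{xx}} + \beta\sqrt{g_2 - g_{00}} &= \sqrt{g_2\left[1 + g_1 + (1 - g_1)\,\beta^2\right]}, & \beta\sqrt{g_{xx} - g_1} - \sqrt{g_{00} - g_1} &= \sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2}}, \\ \sqrt{g_{xx} - g_1} - \beta\sqrt{g_{00} - g_1} &= -g_1\sqrt{g_2 + 1 + (g_2 - 1)\,\beta^2}, & \sqrt{g_2 - g_{00}} + \beta\sqrt{g_2 - g_{xx}} &= \sqrt{g_2 + 1 + (g_2 - 1)\,\beta^2}, \end{aligned} \qquad (41) \]

hence for the $x^1$ and $x^2$ co-ordinates one has

\[ \begin{aligned} x^1 &= \frac{\gamma}{\sqrt{-g_1}}\left[\sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;x^{0'} + n\,\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;x'\right], \\ x^2 &= \frac{\gamma}{\sqrt{g_2}}\left[\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;x^{0'} + n\,\sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;x'\right], \end{aligned} \qquad (42) \]

performing then the calculation of $g_1\,(x^1)^2 + g_2\,(x^2)^2$, one obtains the following useful relation

\[ n^2 - 1 - 2\,g_1 + \beta^2\left(n^2 - 1 + 2\,g_1\right) = \frac{g_2 - g_1}{\gamma^2} \;\Leftrightarrow\; g_1 + g_2 = \gamma^2\,(1 + \beta^2)\,(n^2 - 1), \qquad (43) \]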

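this leads for the 4-event to

\[ \begin{aligned} \overset{\Rightarrow}{M} &= \frac{\gamma}{\sqrt{-g_1}}\left[\sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;x^{0'} + n\,\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;x'\right]\vec{e}_1 \\ &\quad + \frac{\gamma}{\sqrt{g_2}}\left[\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;x^{0'} + n\,\sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;x'\right]\vec{e}_2 + y\,\vec{e}_y + z\,\vec{e}_z \\ &= x^{0'}\vec{e}_{0'} + x'\,\vec{e}_{x'} + y'\,\vec{e}_{y'} + z'\,\vec{e}_{z'}, \end{aligned} \qquad (44) \]

and since the variables $x^{0'}$ and $x'$ are independent,

\[ \begin{aligned} \vec{e}_{0'} &= \gamma\left[\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_1 + \frac{1}{\sqrt{g_2}}\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_2\right], \\ \vec{e}_{x'} &= \gamma\left[\sqrt{g_2}\,\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_1 - g_1\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_2\right]. \end{aligned} \qquad (45) \]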

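It is now convenient to introduce symbolic variables such as

\[ x^1 = \frac{\gamma}{\sqrt{-g_1}}\left(x^{0'} + \beta\,n\,x'\right), \quad x^2 = \frac{\gamma}{\sqrt{g_2}}\left(\beta\,x^{0'} + n\,x'\right), \qquad (46) \]

since they obviously verify the relation $g_1\,(x^1)^2 + g_2\,(x^2)^2 = -(x^{0'})^2 + n^2\,x'^2$; however those variables do not represent the $\overset{\Rightarrow}{M}$ 4-event, but it is convenient to notice the following substitution, very useful for later calculations

\[ \sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}} \leftrightarrow 1, \quad \sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}} \leftrightarrow \beta, \qquad (47) \]

note that this equivalence becomes a strict equality if and only if $n = 1$; hence for the symbolic 4-speed one has

\[ \overset{\Rightarrow}{u} = \gamma\begin{pmatrix} 1 \\ \vec{\beta} \end{pmatrix} = \gamma\left(\frac{\vec{e}_1}{\sqrt{-g_1}} + \beta\,\frac{\vec{e}_2}{\sqrt{g_2}}\right), \qquad (48) \]

this is obviously an admissible 4-speed since its pseudo-norm is $\overset{\Rightarrow}{u}{}^2 = \gamma^2\,(\beta^2 - 1) = -1$, but the symbolic 4-speed is not the real 4-speed, which is given, with the help of (47), in the $(\vec{e}_1, \vec{e}_2)$ basis as

\[ \overset{\Rightarrow}{u} = \gamma\left[\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_1 + \frac{1}{\sqrt{g_2}}\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\;\vec{e}_2\right] = \vec{e}_{0'}, \qquad (49) \]

which is indeed the case according to (45).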

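Let us now focus our attention on the 4-impulsion of a photon: in the former comobile LRS its contravariant components are symbolically written $\overset{\Rightarrow}{P} = \frac{h\nu'}{c}\begin{pmatrix} 1 \\ \vec{\Omega}' \end{pmatrix}$, where $\vec{\Omega}'$ is the photon propagation direction relative to this LRS, from which

\[ \begin{pmatrix} P^{0'} \\ P^{x'} \\ P^{y'} \\ P^{z'} \end{pmatrix} = \frac{n^2\,h\nu'}{c}\begin{pmatrix} 1 \\ \dfrac{\cos\Theta'}{n} \\ \dfrac{\sin\Theta'\cos\Phi'}{n} \\ \dfrac{\sin\Theta'\sin\Phi'}{n} \end{pmatrix}, \qquad (50) \]

the covariant components being $P_{\mu'} = g_{\mu'\nu'}\,P^{\nu'}$, leading naturally to the fact that its pseudo-norm is $\overset{\Rightarrow}{P}{}^2 = P_{\mu'}P^{\mu'} = 0$, since $\overset{\Rightarrow}{P}$ is light-like; then the contravariant components of the photon 4-impulsion vector relative to the observer LRS are $P^\mu = P^{\mu'}\frac{\partial x^\mu}{\partial x^{\mu'}}$, leading to

\[ \begin{aligned} P^0 &= \gamma\,\frac{h\nu'\,n}{c}\left(n + \beta\cos\Theta'\right), & P^x &= \gamma\,\frac{h\nu'\,n}{c}\left(n\beta + \cos\Theta'\right), \\ P^y &= \frac{h\nu'\,n}{c}\,\sin\Theta'\cos\Phi', & P^z &= \frac{h\nu'\,n}{c}\,\sin\Theta'\sin\Phi', \end{aligned} \qquad (51) \]

from which one obtains, with the metric (20), the covariant components:

\[ \begin{aligned} P_0 &= -\gamma\,\frac{h\nu'\,n^2}{c}\left(1 + n\beta\cos\Theta'\right), & P_x &= \gamma\,\frac{h\nu'\,n^2}{c}\left(\beta + n\cos\Theta'\right), \\ P_y &= \frac{h\nu'\,n^3}{c}\,\sin\Theta'\cos\Phi', & P_z &= \frac{h\nu'\,n^3}{c}\,\sin\Theta'\sin\Phi', \end{aligned} \]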

hence a simple calculation leads to $P_\mu P^\mu = 0$, as expected; it is important however to notice here that these components lack a clear physical significance, as do the components of the metric tensor $g$ expressed in the $(\vec{e}_0, \vec{e}_x, \vec{e}_y, \vec{e}_z)$ basis, since from the following tensorial relation

Pμ = P

μ′ ∂ xμ

∂ xμ′ =∂ xμ

∂ xμ′ gμ′ν′Pν′ = P

μ′ ∂ xμ

∂ xμ′∂ xμ′

∂ xσ

∂ xν′

∂ xτgστ gμ′ν′ , (52)

one has

P0′ ∂ xμ

∂ x0′

(1 − ∂ x0′

∂ xσ

∂ xν′

∂ xτgστ g0′ν′

)= P

x′ ∂ xμ

∂x′

(∂x′

∂ xσ

∂ xν′

∂ xτgστ gx′ν′ − 1

), (53)

developing the former equality leads to

1 − ∂ x0′

∂ xσ∂ xν′

∂ xτgστ g0′ν′ = 1 − γ4

n2

[n2 β4 − 2 β2 (2 n2 − 1) + n2

]∂x′∂ xσ

∂ xν′

∂ xτgστ gx′ν′ − 1 = γ4

[β4 − 2 β2 (2 − n2) + 1

] − 1, (54)

but the $P^{0'}$ and $P^{x'}$ variables being independent, with $\frac{\partial x^\mu}{\partial x^{0'}} \neq 0$ and $\frac{\partial x^\mu}{\partial x'} \neq 0$, Eq. (53) is verified if and only if

\[ \frac{\gamma^4}{n^2}\left[n^2\beta^4 - 2\,\beta^2\,(2\,n^2 - 1) + n^2\right] = 1, \quad \gamma^4\left[\beta^4 - 2\,\beta^2\,(2 - n^2) + 1\right] = 1, \]

that is if and only if $n = 1$; hence the $\overset{\Rightarrow}{P}$ representation in the $(\vec{e}_0, \vec{e}_x, \vec{e}_y, \vec{e}_z)$ basis is not a satisfactory one; this can be explained by the fact that there does not exist a canonical representation in this basis of the form $P^\mu = \frac{n^2\,h\nu}{c}\,\frac{\Omega^\mu}{\sqrt{|g_{\mu\mu}|}}$, since the metric tensor $g$ is not diagonal in the former basis: the angle and frequency transformation must then be performed with the help of the diagonal metric; noticing that the 4-impulsion is then

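\[ \overset{\Rightarrow}{P} = \begin{pmatrix} P^1 \\ P^2 \\ P^y \\ P^z \end{pmatrix} = \frac{n^2\,h\nu}{c}\begin{pmatrix} \dfrac{1}{\sqrt{-g_1}} \\ \dfrac{\cos\Theta}{\sqrt{g_2}} \\ \dfrac{\sin\Theta\cos\Phi}{n} \\ \dfrac{\sin\Theta\sin\Phi}{n} \end{pmatrix}, \qquad (55) \]

for which it is easy to verify that $\overset{\Rightarrow}{P}{}^2 = 0$; moreover, using relation (45), it follows from (50) that

\[ \begin{aligned} \overset{\Rightarrow}{P} = \frac{n^2\,h\nu'}{c}\Biggl\{ &\gamma\left[\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}} + \sqrt{g_2}\,\frac{\cos\Theta'}{n}\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\right]\vec{e}_1 \\ + &\gamma\left[\frac{1}{\sqrt{g_2}}\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}} - g_1\,\frac{\cos\Theta'}{n}\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\right]\vec{e}_2 \\ + &\frac{\sin\Theta'}{n}\left(\cos\Phi'\,\vec{e}_y + \sin\Phi'\,\vec{e}_z\right)\Biggr\}, \end{aligned} \qquad (56) \]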


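from which it is easy to obtain

\[ \begin{aligned} \nu &= \gamma\,\nu'\left[\sqrt{-g_1}\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}} + \cos\Theta'\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}}\right], \\ \nu\cos\Theta &= \gamma\,\nu'\left[\sqrt{\frac{1 + g_1 + (1 - g_1)\,\beta^2}{g_2 - g_1}} + \sqrt{-g_1}\,\cos\Theta'\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2 - g_1}}\right], \\ \nu\sin\Theta\cos\Phi &= \nu'\sin\Theta'\cos\Phi', \\ \nu\sin\Theta\sin\Phi &= \nu'\sin\Theta'\sin\Phi', \end{aligned} \qquad (57) \]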

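one deduces from this result that the angle $\Phi$ remains unchanged, while

\[ \nu\sin\Theta = \nu'\sin\Theta', \qquad (58) \]

noting $\mu = \cos\Theta$ and noticing that $\sqrt{\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]\left[1 + g_1 + (1 - g_1)\,\beta^2\right]} = 2\,\beta\sqrt{g_2}$, the first two equations of (57) can be rewritten as

\[ \begin{aligned} \nu &= \gamma\,\nu'\,n\,\sqrt{\frac{g_2 + 1 + (g_2 - 1)\,\beta^2}{g_2\,(g_2 - g_1)}}\left[1 + \frac{2\,\beta\sqrt{-g_2/g_1}\;\mu'}{g_2 + 1 + (g_2 - 1)\,\beta^2}\right], \\ \mu &= \frac{\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]\mu' + 2\,\beta\sqrt{-g_2/g_1}}{g_2 + 1 + (g_2 - 1)\,\beta^2 + 2\,\beta\sqrt{-g_2/g_1}\;\mu'}, \end{aligned} \qquad (59) \]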

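from which it is easy to verify that for a unit refractive index one has

\[ \nu = \gamma\,\nu'\,(1 + \beta\mu'), \quad \mu = \frac{\mu' + \beta}{1 + \beta\mu'}, \qquad (60) \]

which is the usual and well-known angular aberration and Doppler frequency shift transformation in vacuum [2]; for an isotropic and grey dielectric, that is for a refractive index independent of both the frequency and the propagation direction, one has from (59)

\[ \frac{\partial\mu}{\partial\mu'} = \gamma^2\left(\frac{\nu'}{\nu}\right)^2 n^2\,\frac{\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]^2 + 4\,\beta^2\,\dfrac{g_2}{g_1}}{g_2\,(g_2 - g_1)\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]}, \quad \frac{\partial\mu}{\partial\nu'} = 0, \quad \frac{\partial\nu}{\partial\nu'} = \frac{\nu}{\nu'}, \qquad (61) \]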

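but

\[ \begin{aligned} \frac{\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]^2 + 4\,\beta^2\,\dfrac{g_2}{g_1}}{\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]^2} &= \frac{g_1 g_2 + 2\,g_1 + 1 + (g_1 g_2 - 2\,g_1 + 1)\,\beta^2}{g_1\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]} \\ &= -\frac{n^2 - 1 - 2\,g_1 + (n^2 - 1 + 2\,g_1)\,\beta^2}{g_1\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]} = \frac{g_2\,(g_2 - g_1)}{\gamma^2 n^2\left[g_2 + 1 + (g_2 - 1)\,\beta^2\right]}, \end{aligned} \qquad (62) \]

so that

\[ \frac{\partial\mu}{\partial\mu'} = \left(\frac{\nu'}{\nu}\right)^2, \qquad (63) \]

hence one finally obtains

\[ d\nu\,d\mu = \left|\frac{\partial(\nu, \mu)}{\partial(\nu', \mu')}\right| d\nu'\,d\mu' = \begin{vmatrix} \dfrac{\nu}{\nu'} & 0 \\[2mm] \dfrac{\partial\nu}{\partial\mu'} & \left(\dfrac{\nu'}{\nu}\right)^2 \end{vmatrix} d\nu'\,d\mu' = \frac{\nu'}{\nu}\,d\nu'\,d\mu', \qquad (64) \]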


the angle $\Phi$ remaining unchanged, and noting $d\Omega = d\mu\,d\Phi$ the solid angle element, one obtains from (64) the final important relation, valid in the medium of index $n$ as in vacuum,

\[ \nu\,d\nu\,d\Omega = \nu'\,d\nu'\,d\Omega'. \qquad (65) \]
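The invariance (65) can be illustrated numerically in its simplest setting, the vacuum transformation (60). The sketch below is only an illustration under that assumption (n = 1, illustrative function names): it estimates the Jacobian of $(\nu', \mu') \to (\nu, \mu)$ by central finite differences and compares it with $\nu'/\nu$, as in Eq. (64).

```python
import math

def doppler_aberration(nu_p, mu_p, beta):
    """Vacuum transformation of Eq. (60): returns (nu, mu) from (nu', mu')."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    nu = gamma * nu_p * (1.0 + beta * mu_p)
    mu = (mu_p + beta) / (1.0 + beta * mu_p)
    return nu, mu

def jacobian(nu_p, mu_p, beta, h=1e-6):
    """Determinant d(nu, mu)/d(nu', mu') from central finite differences."""
    nu_a, mu_a = doppler_aberration(nu_p + h, mu_p, beta)
    nu_b, mu_b = doppler_aberration(nu_p - h, mu_p, beta)
    dnu_dnu, dmu_dnu = (nu_a - nu_b) / (2 * h), (mu_a - mu_b) / (2 * h)
    nu_c, mu_c = doppler_aberration(nu_p, mu_p + h, beta)
    nu_d, mu_d = doppler_aberration(nu_p, mu_p - h, beta)
    dnu_dmu, dmu_dmu = (nu_c - nu_d) / (2 * h), (mu_c - mu_d) / (2 * h)
    return dnu_dnu * dmu_dmu - dnu_dmu * dmu_dnu

if __name__ == "__main__":
    nu, mu = doppler_aberration(2.0, 0.3, 0.6)
    print(jacobian(2.0, 0.3, 0.6), 2.0 / nu)
```

Since the transformation (70) obtained later in the medium has the same functional form with a modified rapidity, the same check applies there unchanged.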

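Let us then introduce the index $\bar{n}$ defined by $\bar{n}^2 = \frac{g_2\,(g_2 - g_1)}{g_2 + 1 + (g_2 - 1)\,\beta^2}$, such that the co-ordinates $x^1$ and $x^2$ can be rewritten:

\[ x^1 = \frac{\gamma}{\sqrt{-g_1}}\left(\frac{n}{\bar{n}}\,x^{0'} + \frac{2\,\beta\,n\,\bar{n}}{g_2 - g_1}\,x'\right), \quad x^2 = \frac{\gamma}{\sqrt{g_2}}\left(\frac{2\,\beta\,\bar{n}}{g_2 - g_1}\,x^{0'} + \frac{n^2}{\bar{n}}\,x'\right), \qquad (66) \]

note also that from the expression of $\bar{n}^2$, it is possible to obtain, after a rather lengthy calculation, the following relation

\[ \frac{d\bar{n}}{d\beta} = \beta\,\gamma^2\,\bar{n}\left[1 - \frac{4\,\bar{n}^2\,(n^2 + 1)}{(g_2 - g_1)^3}\right] \;\Rightarrow\; \frac{d\bar{n}}{d\beta} = 0 \text{ if } n = 1, \quad \frac{d\bar{n}}{d\beta} = 0 \text{ if } \beta = 0. \]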

The evolution of this index and of its derivative with respect to $\beta$ is plotted in the following figure, for a refractive index $n = 1.33$; below a rapidity $\beta = 0.6$, $\bar{n}$ remains practically constant, as does its derivative; at $\beta = 0.75$, one has $\bar{n} = 1.50$ and the effects of the medium speed become appreciable; from $\beta = 0.75$ to 1, $\bar{n}$ and its derivative grow very quickly, and for extremely high speeds the refractive index effects can no longer be ignored.
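The behaviour described above can be reproduced with a few lines of code. The sketch below assumes the definition of the index (written `n_bar` in the code) as $g_2 (g_2 - g_1) / [g_2 + 1 + (g_2 - 1)\beta^2]$ under a square root, with $g_1$, $g_2$ taken from Eq. (25); the function names are illustrative. The analytic slope is compared with a finite-difference estimate.

```python
import math

def effective_index(n, beta):
    """Return (n_bar, g2 - g1) for refractive index n and rapidity beta."""
    gamma2 = 1.0 / (1.0 - beta**2)
    trace = gamma2 * (1.0 + beta**2) * (n**2 - 1.0)  # g1 + g2
    gap = math.sqrt(trace**2 + 4.0 * n**2)           # g2 - g1, using g1 g2 = -n^2
    g2 = 0.5 * (trace + gap)
    denom = g2 + 1.0 + (g2 - 1.0) * beta**2
    return math.sqrt(g2 * gap / denom), gap

def effective_index_slope(n, beta):
    """Analytic derivative of n_bar with respect to beta."""
    n_bar, gap = effective_index(n, beta)
    gamma2 = 1.0 / (1.0 - beta**2)
    return beta * gamma2 * n_bar * (1.0 - 4.0 * n_bar**2 * (n**2 + 1.0) / gap**3)

if __name__ == "__main__":
    for beta in (0.0, 0.3, 0.6, 0.75, 0.9):
        print(beta, effective_index(1.33, beta)[0], effective_index_slope(1.33, beta))
```

For n = 1.33 the printed index stays close to 1.33 at low rapidity and reaches about 1.50 at β = 0.75, in line with the values quoted in the text.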

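Then for the 4-speed one has

\[ \overset{\Rightarrow}{u} = \gamma\,\frac{n}{\bar{n}}\left[\frac{\vec{e}_1}{\sqrt{-g_1}} + \frac{2\,\beta\,\bar{n}^2}{n\,(g_2 - g_1)}\,\frac{\vec{e}_2}{\sqrt{g_2}}\right], \qquad (67) \]

since $\overset{\Rightarrow}{u}{}^2 = -1$ one obtains from (67) that $n^2\,(g_2 - g_1)^2 - 4\,\beta^2\,\bar{n}^4 = \frac{\bar{n}^2\,(g_2 - g_1)^2}{\gamma^2}$ and

\[ \overset{\Rightarrow}{u} = \gamma\,\frac{n}{\bar{n}}\left[\frac{\vec{e}_1}{\sqrt{-g_1}} + \sqrt{1 - \left(\frac{\bar{n}}{\gamma\,n}\right)^2}\;\frac{\vec{e}_2}{\sqrt{g_2}}\right], \qquad (68) \]

the previous expression of the 4-speed vector $\overset{\Rightarrow}{u}$ allows us to introduce a new rapidity $\bar{\beta} = \frac{2\,\beta\,\bar{n}^2}{n\,(g_2 - g_1)}$ such that $\bar{\beta} = \sqrt{1 - \left(\frac{\bar{n}}{\gamma\,n}\right)^2} = \sqrt{1 - \frac{1}{\bar{\gamma}^2}} \Leftrightarrow \bar{\gamma} = \frac{\gamma\,n}{\bar{n}} = \frac{1}{\sqrt{1 - \bar{\beta}^2}}$, with $\lim_{\beta \to 1}\bar{\beta} = 1$ whatever $n$ is, so that the 4-speed vector can be written in the compact standard form

\[ \overset{\Rightarrow}{u} = \bar{\gamma}\left(\frac{\vec{e}_1}{\sqrt{-g_1}} + \bar{\beta}\,\frac{\vec{e}_2}{\sqrt{g_2}}\right), \qquad (69) \]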

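analogous to the symbolic 4-speed vector, replacing the rapidity $\beta$ (expressed in vacuum) by the rapidity $\bar{\beta}$ expressed in the medium relative to the observer LRS; introducing then the values of $\bar{n}$ and $\bar{\beta}$ in the angle/frequency transformations leads to

\[ \nu = \bar{\gamma}\,\nu'\left(1 + \bar{\beta}\,\mu'\right), \quad \mu = \frac{\mu' + \bar{\beta}}{1 + \bar{\beta}\,\mu'}, \quad \nu\,d\nu\,d\Omega = \nu'\,d\nu'\,d\Omega', \quad \nu\sin\Theta = \nu'\sin\Theta', \qquad (70) \]

which has remarkably the same form as the one obtained in vacuum; hence, one may expect to find a judicious set of co-ordinates $(\bar{x}^0, \bar{x})$ such that they verify a Lorentz transform, analogous to the usual Lorentz transform with $\beta$ replaced by $\bar{\beta}$, that is

\[ \begin{aligned} \bar{x}^0 &= \bar{\gamma}\left(\bar{x}^{0'} + \bar{\beta}\,\bar{x}'\right), & \bar{x} &= \bar{\gamma}\left(\bar{x}' + \bar{\beta}\,\bar{x}^{0'}\right), & \bar{y} &= \bar{y}', & \bar{z} &= \bar{z}' \\ \Leftrightarrow\; \bar{x}^{0'} &= \bar{\gamma}\left(\bar{x}^0 - \bar{\beta}\,\bar{x}\right), & \bar{x}' &= \bar{\gamma}\left(\bar{x} - \bar{\beta}\,\bar{x}^0\right), & \bar{y}' &= \bar{y}, & \bar{z}' &= \bar{z}, \end{aligned} \qquad (71) \]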

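from Eq. (66) one has

\[ \begin{aligned} x^{0'} &= \frac{\gamma\,n}{\bar{n}}\left[x^1\sqrt{-g_1} - \frac{2\,\beta\,\bar{n}^2}{n\,(g_2 - g_1)}\,x^2\sqrt{g_2}\right] = \bar{\gamma}\left(x^1\sqrt{-g_1} - \bar{\beta}\,x^2\sqrt{g_2}\right), \\ x' &= \frac{1}{n}\,\frac{\gamma\,n}{\bar{n}}\left[x^2\sqrt{g_2} - \frac{2\,\beta\,\bar{n}^2}{n\,(g_2 - g_1)}\,x^1\sqrt{-g_1}\right] = \frac{\bar{\gamma}}{n}\left(x^2\sqrt{g_2} - \bar{\beta}\,x^1\sqrt{-g_1}\right), \end{aligned} \qquad (72) \]

hence with the substitution

\[ \bar{x}^{0'} = x^{0'}, \quad \bar{x}' = n\,x', \quad \bar{x}^0 = x^1\sqrt{-g_1}, \quad \bar{x} = x^2\sqrt{g_2}, \]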

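the Lorentz transform (72) is obtained: it has to be noticed that the fundamental variable $\bar{x}' = n\,x'$ induces a local dilatation along $x'$, so that we choose for the other spatial variables the same dilatation, that is $\bar{y}' = n\,y'$ and $\bar{z}' = n\,z'$; from the 4-event vector, one constructs then an “equivalent vacuum” completely defined from the comobile LRS by its co-ordinates $(\bar{x}^{0'} = x^{0'},\ \bar{x}' = n\,x',\ \bar{y}' = n\,y',\ \bar{z}' = n\,z')$ associated with the orthonormal basis $\left(\vec{\bar{e}}_{0'} = \vec{e}_{0'},\ \vec{\bar{e}}_{x'} = \frac{\vec{e}_{x'}}{n},\ \vec{\bar{e}}_{y'} = \frac{\vec{e}_{y'}}{n},\ \vec{\bar{e}}_{z'} = \frac{\vec{e}_{z'}}{n}\right)$, and from the observer LRS by its co-ordinates $(\bar{x}^0 = x^1\sqrt{-g_1},\ \bar{x} = x^2\sqrt{g_2},\ \bar{y} = n\,y,\ \bar{z} = n\,z)$ associated with the orthonormal basis $\left(\vec{\bar{e}}_0 = \frac{\vec{e}_1}{\sqrt{-g_1}},\ \vec{\bar{e}}_x = \frac{\vec{e}_2}{\sqrt{g_2}},\ \vec{\bar{e}}_y = \frac{\vec{e}_y}{n},\ \vec{\bar{e}}_z = \frac{\vec{e}_z}{n}\right)$: indeed, those light spaces can be easily explained, recalling that the photon PTIs are expressed as

\[ d\tau^2 = -g_{\mu'\nu'}\,dx^{\mu'}dx^{\nu'} = c^2\,(dt')^2 - \left[(n\,dx')^2 + (n\,dy')^2 + (n\,dz')^2\right] = (d\bar{x}^{0'})^2 - \left(d\bar{x}'^2 + d\bar{y}'^2 + d\bar{z}'^2\right) \]

and

\[ d\tau^2 = -g_{\mu\nu}\,dx^\mu dx^\nu = \left(\sqrt{-g_1}\,dx^1\right)^2 - \left[\left(\sqrt{g_2}\,dx^2\right)^2 + (n\,dy)^2 + (n\,dz)^2\right] = (d\bar{x}^0)^2 - \left(d\bar{x}^2 + d\bar{y}^2 + d\bar{z}^2\right) \]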

hence the co-ordinates sets(x0′ , x′, y′, z′

)and (x0, x, y, z) associated to the two

basis(→e0′ ,

→ex′ ,

→ey′ ,

→ez′)

and(→e0,

→ex,

→ey,

→ez

)represent vacuum light co-ordinates and basis,

which are different from the mater co-ordinates and basis; from a comobile LRS point of

view, the mater co-ordinates are(x0′ , x′, y′, z′

)associated to the basis

(→e0′ ,

→ex′ ,

→ey′ ,

→ez′),

so that it is convenient analogously to introduce, from the observer LRS point of view, the

mater co-ordinates(x0 = x1

√− g1, x = x2 √g2

n, y = y, z = z

)associated to the basis


(→e0 =

→e1√− g1

,→ex = n

→e2√g2

,→ey =

→ey,

→ez =

→ez

), so that mater-light metric tensors can be

expressed as

g′ =

⎛⎜⎜⎜⎜⎜⎜⎜⎝

−1 0 0 0

0 n2 0 0

0 0 n2 0

0 0 0 n2

⎞⎟⎟⎟⎟⎟⎟⎟⎠and g =

⎛⎜⎜⎜⎜⎜⎜⎜⎝

−1 0 0 0

0 n2 0 0

0 0 n2 0

0 0 0 n2

⎞⎟⎟⎟⎟⎟⎟⎟⎠, (73)

hence the two metrics, relatively to the moving particle and the observer, are strictly

equivalent, meaning that the natural mater curvilinear abscissa path is d s2 = d x2 +d y2 +d z2

relatively to the observer LRS, and that for a photon it is related to time by dt = ncds,

the latter quantities being defined relatively to the observer LRS; then one has the relation

between the two sets of co-ordinates

x0 = γ γ[(

1 − nβ β)

x0 +(n β − β

)x]

x = γ γn

[(β − nβ

)x0 +

(n − β β

)x] ⇔

x0 = γ γn

[(n − β β

)x0 −n

(n β − β

)x]

x = γ γn

[n(1 − nβ β

)x − (

β − nβ)

x0] ,

(73)

Furthermore, from the 4-event vector and the definition of the 4-speed vector, one has

⇒u = d

⇒Mdτ

= d⇒M

d x0d x0

dτ= γ

(→e0 + β

→ex

)= d x0

dτ

(→e0 + dx

d x0

→ex + dy

d x0

→ey + dz

d x0

→ez

)⇒ d x0

dτ= γ

→β = dx

d x0

→ex = dx

d x0

→ex

dyd x0 = dz

d x0 = 0, (74)

but one also has⇒u = d

⇒Mdτ

= d⇒M

dx0′d x0′

dτ=

→e0′ ⇒ d x0′

dτ= 1, so that it comes the

fundamental relation binding the proper times of a mass particle moving with speed

βg?o(defined as if it were in vacuum)

d x0 = γ d x0′ ⇔ d x0 = γ d x0′ (75)
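To make the construction of the “equivalent vacuum” co-ordinates concrete, the following sketch checks numerically that the rescaled diagonal co-ordinates $x^1\sqrt{-g_1}$ and $x^2\sqrt{g_2}$ obtained from Eq. (32) are related to $(x^{0'}, n\,x')$ by a Lorentz transform with the modified rapidity. The names `beta_bar`, `gamma_bar` and `n_bar` stand for the barred quantities of the text; the function names and sample values are illustrative assumptions.

```python
import math

def medium_quantities(n, beta):
    """Return (g1, g2, n_bar, beta_bar, gamma_bar) for index n and rapidity beta."""
    gamma2 = 1.0 / (1.0 - beta**2)
    trace = gamma2 * (1.0 + beta**2) * (n**2 - 1.0)
    gap = math.sqrt(trace**2 + 4.0 * n**2)
    g1, g2 = 0.5 * (trace - gap), 0.5 * (trace + gap)
    n_bar = math.sqrt(g2 * gap / (g2 + 1.0 + (g2 - 1.0) * beta**2))
    beta_bar = 2.0 * beta * n_bar**2 / (n * gap)
    gamma_bar = math.sqrt(gamma2) * n / n_bar
    return g1, g2, n_bar, beta_bar, gamma_bar

def diagonal_coordinates(n, beta, x0p, xp):
    """Co-ordinates (x1, x2) of Eq. (32) for a comobile event (x0', x')."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    g00 = -gamma**2 * (1.0 - beta**2 * n**2)
    gxx = gamma**2 * (n**2 - beta**2)
    g1, g2, _, _, _ = medium_quantities(n, beta)
    a = math.sqrt(max(gxx - g1, 0.0) / (g2 - g1))
    b = math.sqrt(max(g00 - g1, 0.0) / (g2 - g1))
    x0 = gamma * (x0p + beta * xp)   # observer co-ordinates from Eq. (12)
    x = gamma * (xp + beta * x0p)
    return a * x0 + b * x, -b * x0 + a * x

if __name__ == "__main__":
    n, beta, x0p, xp = 1.33, 0.6, 2.0, 0.7
    g1, g2, n_bar, beta_bar, gamma_bar = medium_quantities(n, beta)
    x1, x2 = diagonal_coordinates(n, beta, x0p, xp)
    print(x1 * math.sqrt(-g1), gamma_bar * (x0p + beta_bar * n * xp))
    print(x2 * math.sqrt(g2), gamma_bar * (n * xp + beta_bar * x0p))
```

Each printed pair should coincide, and `gamma_bar` should equal $1/\sqrt{1 - \bar\beta^2}$, which is the consistency condition behind Eq. (69).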

3. Interpretation of Fresnel's Refractive Index as a Negative Uniaxial Anisotropy

From what precedes, it is obvious that the 4-speed is⇒u = γ

(→e0 + β

→ex

)and is natu-

rally defined in the vacuum bound to the observer located in the moving medium; it is im-

portant to remark here that the 4-speed perceived by an observer at rest located in the medium

is the same 4-speed perceived by an observer at rest but located in vacuum; for the latter

one, the 4-speed is:⇒uv = γ

(→e0 + β

→ex

)where

(→e 0,

→e x

)is the equivalent “vacuum basis” of the moving medium perceived by

the observer located in vacuum of unit refractive index, which is not the(→e0,

→ex

)basis


of the equivalent vacuum bound to the observer located inside the medium of refractive

index n; it is thus defined as follows:→e 0 = γ

(→e 0′ − β

→e x′)

= γ(→

e 0′ − βn

→e x′)

= γ2

n

[(n − β2

) →e 0 + β (n − 1)

→e x

]→e x = γ

(→e x′ − β

→e 0′)

= γ(→

e x′n

− β→e 0′)

= γ2

n

[(1 − n β2

) →e x − β (n − 1)

→e 0

]with

→e 0

2= − 1

→e x

2= 1

→e 0

→e x = 0

and equivalently:

→e 0 = γ2

[(1 − n β2

) →e 0 − β (n − 1)

→e x

]→e x = γ2

[β (n − 1)

→e 0 +

(n − β2

) →e x

]where β has to be understood as the speed of the mass particles evolving in the

medium, but evaluated as if they were evolving in vacuum; hence if n = 1 one obviously

has→e 0 =

→e 0 and

→e x =

→e x; replacing then

→e0 and

→ex by their values in terms of

→e0 and

→ex leads to

⇒uv = γ

(→e0 + β

→ex

)which was expected ; furthermore, one has:

→e 0 = γ γ

n

[(n − β β

) →e 0 +

(nβ − β

) →e x

]→e x = γ γ

n

[(β − n β

) →e 0 +

(1 − nβ β

) →e x

] so that after a simple calculation :⇒u =

γ(→e0 + β

→ex

)=

⇒uv

which proves the result; one deduces then from what precedes that:

→e 0 = γ γ

[(1 − β β

) →e 0 +

(β − β

) →e x

]→e x = γ γ

[(β − β

) →e 0 +

(1 − β β

) →e x

] ⇔→e 0 = γ γ

[(1 − β β

) →e 0 − (

β − β) →

e x

]→e x = γ γ

[− (

β − β) →

e 0 +(1 − β β

) →e x

]

this result allows to define the vacuum-like co-ordinates inside the medium for the ob-

server located in real vacuum: indeed, the existence of an observer located in vacuum

induces the existence of an interface between the moving medium and the vacuum, that

we shall assume orthogonal to the direction of motion of the moving medium; we shall

this time define a fictive observer at rest in the vacuum such that at a given instant

(hereafter designed as the initial vacuum instant) its position coincides with the position

of the interface, and that at this time one has dV =∥∥∥ →OV OVf

∥∥∥ where dV is the distance

between the reference observer OV in vacuum and the fictive observer OVfin vacuum,

this distance being evaluated in vacuum; let us introduce(x0, x, y, z

)the vacuum-like

co-ordinates of an event perceived by the reference observer at rest located in the real

vacuum,(xf

0, xf , yf , zf

)the vacuum-like co-ordinates of the same event perceived by

the fictive observer and(x0′ , x′, y′, z′

)the vacuum-like co-ordinates of this event per-

ceived by a moving particle in vacuum and bound to the moving interface; obviously, the

interface moves with the β rapidity for the observers at rest in vacuum, while it moves

with the β rapidity for observers at rest located in the moving medium, so that for the

observers in vacuum one has:


xf0 = γ

(x0′ + β x′

)xf = γ

(β x0′ + x′

)yf = y′ zf = z′

from which:

x0 = γ(x0′ + β x′

)x = γ

(β x0′ + x′

)+ dV

y = y′ z = z′

since the two observers in vacuum remain at rest and have the same time perception;

similarly, one introduces a fictive observer in the medium such that at the initial vacuum

instant, its position inside the moving medium coincides with the one of the moving

interface and dM =∥∥∥ →OM OMf

∥∥∥ where dM is the distance between the reference observer

OM in the moving medium and the fictive observer OMfin the moving medium, this

distance being evaluated in the medium; note that the two fictive observers are created

only for calculation rules: indeed, it dv

(x0)

> dV where dv

(x0)

is the distance between

the reference observer in vacuum and moving interface, then the moving medium leaves

the reference observer in vacuum and goes towards the reference observer in the medium,

so that the fictive observer which was in the medium at x0 = x0 = 0 is in vacuum for

x0 > 0, while if dv

(x0)

< dV , the moving medium goes towards the reference observer

in vacuum and the fictive observer in vacuum at x0 = x0 = 0 is in the medium for

x0 > 0; then for the moving medium:

xf0 = γ

(x0′ + β x′

)xf = γ

(β x0′ + x′

)yf = y′ zf = z′

from which:

x0 = γ(x0′ + β x′

)x = γ

(β x0′ + x′

) − dM

y = y′ z = z′

with dM = n dM

note that no distinction has to be done for the basis vectors which are the same for the

fictive and reference observers; hence from a same 4-event perceived both by an observer

at rest located in the real vacuum and an observer at rest located in the moving medium,

one has:

⇒M = M

μ→eμ = xf0→e 0 + xf

→e x + yf

→e y + zf

→e z = xf

0 →e 0 +xf→e x +yf

→e y +zf

→e z

hence from what precedes, it comes:

x0 = γγ[(

1 − ββ)x0 +

(β − β

)(x + dM)

]x − dV = γγ

[(1 − ββ

)(x + dM) +

(β − β

)x0]

y = y z = z

⇔x0 = γ γ

[(1 − ββ

)x0 − (β − β

)(x − dV )

]x + dM = γγ

[(1 − ββ

)(x − dV ) − (

β − β)x0]

y = y z = z

replacing then x0′ and x′ by their values in terms of x0′ and x′, and relating x0′ and x′ tox0 and x thanks to the Lorentz transform with the vacuum-like β rapidity finally leads

to:

x0 = γ2[(

1 − nβ2)x0 +β (n − 1) (x − dV )

]x − dV = γ2

[(n − β2

)(x − dV ) − β (n − 1) x0

] ⇔x0 = γ2

n

[(n − β2

)x0 −β (n − 1) (x − dV )

]x − dV = γ2

n

[(1 − nβ2

)(x − dV ) + β (n − 1) x0

]


so that x0 = x0 and x = x if n = 1. In these conditions, the 4-impulsion vector of a

photon is, for the observer at rest located in the medium

⇒P = h ν

c

{n2

→e 0 + n

[cos Θ

→e x + sin Θ

(cos Φ

→e y + sin Φ

→e z

)]}= P μ →

eμ

= h ν n2

c

[→e 0 + cos Θ

→e x + sin Θ

(cos Φ

→e y + sin Φ

→e z

)]= E

c

⎛⎜⎝ 1→Ω

⎞⎟⎠ (76)

from the former definition of the 4-impulsion, one deduces that→Ω = cos Θ

→e x + sin Θ

(cos Φ

→e y + sin Φ

→e z

), and that E = h ν n2 = n2 E(0): it

is a well known result that the energy of a photon in a medium of index n is n2 times

its energy in vacuum, but the important fact is that this result remains valid, even for

a moving medium, in the point of view of the observer at rest located in the moving

medium.

Replacing then→e 0 and

→e x by their values in terms of

→e 0 and

→e x, leads finally to

⇒P =

hν n

c

⎧⎪⎨⎪⎩ γγ[n − ββ +

(β − nβ

)cos Θ

]→e 0 +γγ

[nβ − β +

(1 − nββ

)cos Θ

]→e x

+ sin Θ(cos Φ

→e y + sin Φ

→e z

)⎫⎪⎬⎪⎭ = P

μ →eμ

(77)

so that introducing the values of ν and cos Θ in terms of ν ′ and cos Θ′ gives the values

of P μ obtained by Eq. (51). Replacing now→e 0 and

→e x by their values in terms of

→e0 and

→ex gives the important result

⇒P =

h ν n2

c

⎧⎪⎨⎪⎩ γ γ[1 − β β +

(β − β

)cos Θ

] →e 0 + γ γ

[β − β +

(1 − β β

)cos Θ

] →e x

+ sin Θ(cos Φ

→e y + sin Φ

→e z

)⎫⎪⎬⎪⎭

(78)

this is the 4-impulsion energy of a photon evolving in the moving medium and expressed

in the vacuum basis of an observer at rest located in the real vacuum, while this photon

4-impulsion energy perceived by an observer at rest located in the moving medium is

simply⇒P = h ν n2

c

[→e 0 + cos Θ

→e x + sin Θ

(cos Φ

→e y + sin Φ

→e z

)]; under this form,

Eq. (78) allows to define an apparent energy and direction Θ of propagation of light in

the moving medium for the observer located in vacuum, such that:

Ec

= hν n2 γγc

[1 − β β +

(β − β

)cos Θ

]cos Θ =

β−β+(1−ββ) cosΘ

1−ββ+(β−β) cosΘ

sin Θ cos Φ = sinΘ cosΦ

γγ[1−ββ+(β−β) cosΘ]

sin Θ sin Φ = sinΘ sinΦ


from which Φ = Φ andcos Θ =

β−β+(1−ββ) cosΘ

1−ββ+(β−β) cosΘ

sin Θ = sinΘ


one can easily verify that cos2 Θ + sin2 Θ = 1, so thatcos Θ =

(1−β β) cos Θ− (β− β)1−β β− (β− β) cos Θ

sin Θ = sin Θ

γ γ [1−β β− (β− β) cos Θ]


hence⇒P = E

c

⎛⎜⎝ 1→Ω

⎞⎟⎠ with Ec

= h ν n2

c γ γ [1−β β + (β− β) cos Θ]

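For a constant refractive index $n$, the light trajectories inside the medium are straight lines; indeed, these trajectories are the light geodesics determined by the geodesic equations $\frac{dP^\alpha}{d\sigma} + \Gamma^\alpha_{\beta\gamma}\,P^\beta P^\gamma = 0$, where $\sigma$ is a step parameter on the trajectory defined by $P^\alpha = \frac{dx^\alpha}{d\sigma}$ and $\Gamma^\alpha_{\beta\gamma}$ are the Christoffel coefficients such that $\Gamma^\alpha_{\beta\gamma} = \frac{g^{\alpha m}}{2}\left(\frac{\partial g_{m\gamma}}{\partial x^\beta} + \frac{\partial g_{m\beta}}{\partial x^\gamma} - \frac{\partial g_{\beta\gamma}}{\partial x^m}\right)$; obviously these coefficients all vanish for a constant refractive index, so that the geodesic equations lead to $P^\alpha = \text{constant}$, or equivalently $\frac{dx^\alpha}{dx^\beta} = \frac{P^\alpha}{P^\beta} = \text{constant}$, and for light propagating in the $(x, y)$ plane, one has: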

dx = cosΘn

d x0 dy = sinΘn

d x0 hence ds2 =[1 +

(dydx

)2]dx2 = dx2

cos2 Θ= d x02

n2

and one retrieves the obvious relation dsd x0 = 1

n, or equivalently ds

d x0 = 1; furthermore,

from what precedes, it comes that:

dx

d x0 =(β− β) d x0 + (1−β β) dx

(1−β β) d x0 + (β− β) dx=

β− β + (1−β β) cos Θ

1−β β + (β− β) cos Θ= cos Θ

dy

d x0 = dy

γ γ [(β− β) d x0 + (1−β β) dx]= sin Θ

γ γ [1−β β + (β− β) cosΘ]= sin Θ

hence ds2 =

[1 +

(dy

dx

)2]

dx2 = dx2

cos2 Θ= d x02

Note that expressed in terms of (x0, x, y) co-ordinates set, the previous relations are

equivalent to:

dx

d x0=

(n − β2

)dxd x0 − β (n − 1)

β (n − 1) dxd x0 + 1 − n β2

dy

d x0=

n dyd x0

γ2[β (n − 1) dx

d x0 + 1 − n β2]

so that when Θ = 0, one has dx

d x0 = 1 anddy

d x0 = 0, from which it comes:dyd x0 = 0 and dx

d x0 = 1+β nn+β

= 1n

which is the Fresnel’s drag additional formula

for Θ = π2, dx

d x0 = 0 anddy

d x0 = 1, from which one obtains:dxd x0 = β (n− 1)

n−β2 and dyd x0 = 1

γ2 (n−β2), hence

[(dxd x0 = 0

)and

(dyd x0 = 1

)] ⇔ (n = 1)

we retrieve the fact that it is impossible to define a physical direction for light in

the moving medium when using the sets(→

e 0,→e x,

→e y,

→e z

)and (x0, x, y, z), since the

two vectors→e 0 and

→e x are not linearly independent; let us introduce now the habitual

matter co-ordinates(x0, x, y, z

)for the observer located in vacuum, such that for an

event in vacuum the co-ordinates(x0, x, y, z

)coincide with the vacuum co-ordinates(

x0, x, y, z), for an event in the medium perceived in the

→ex direction by the observer in

vacuum they are:

x0 = x0 x = n|| x then for a light propagating in the→ex direction, one must have

dxd x0 = 1

n||, and for an event in the medium perceived in the

→ey or

→ez directions they are:

x0 = x0 y = n⊥ y z = n⊥ z where obviously n⊥ = n and y = y as z = z if

the observers in vacuum and in the medium perceive the same y and z co-ordinates; the


observer at rest in vacuum perceives naturally the time in a given “direction” which is his

fundamental time vector reference system, in the vacuum as well as for events located in

the moving medium: this implies that the time direction→e0 perceived by the observer in

vacuum for events located in the moving medium must be the time direction→e0 perceived

by this observer for events located in the vacuum, which is the case by construction of→e0; similarly, the observer at rest located in vacuum is unable to distinguish a light ray

emerging from the moving medium in the perceived→ex direction for a perceived frequency

ν, governed by dxd x0 = 1

n||, from any light travelling the vacuum for the same perceived

direction and frequency and characterised by dxd x0 , so that for the observer located in

vacuum, the light emerging from the moving medium in the perceived direction→ex will

obey to:dxd x0 = 1

n||= dx

d x0 = 1n

from which it comes 1 < n|| = n+β1+β n

< n = n⊥Then, the Fresnel’s refractive index can be interpreted as the apparent refractive index

of the moving medium in the direction of motion of this medium effectively perceived by

the observer at rest in vacuum; hence the two relations dx

d x0 = cos Θ anddy

d x0 = sin Θ

can be rewritten as:

dxd x0 = cos Θ

n||

dyd x0 = sin Θ

n⊥

⇒ d s2 =

[1 +

(dy

dx

)2] (

dx

d x0

)2

d x02=

n2⊥ cos2 Θ + n2

|| sin2 Θ

n2⊥ n2

||d x02

from which one deduces the apparent refractive index:

$$ n_e^2 = \frac{n_\perp^2\, n_\parallel^2}{n_\perp^2 \cos^2\Theta + n_\parallel^2 \sin^2\Theta} $$

This apparent refractive index is the extraordinary wave refractive index for a uniaxial medium of optical axis $\vec{e}_\parallel = \vec{e}_y$ [6]: indeed,

$$ n_e^2 = \frac{n_\perp^2\, n_\parallel^2}{n_\perp^2 \sin^2\left(\frac{\pi}{2} - \Theta\right) + n_\parallel^2 \cos^2\left(\frac{\pi}{2} - \Theta\right)} $$

where $\frac{\pi}{2} - \Theta$ is the angle between the unit wave vector $\vec{\Omega}$ and the optical axis $\vec{e}_\parallel$ of the medium, so that it comes $\vec{e}_\parallel = \vec{e}_y$: hence, if the observer at rest located in vacuum perceives the same y and z co-ordinates as the observer at rest located in the moving medium, the observer located in vacuum will perceive the isotropic moving medium (from the point of view of the observer located in the medium) as a uniaxial medium whose optical axis $\vec{e}_\parallel$ is orthogonal to the perceived direction $\vec{e}_x$ of motion of the medium and in the plane $(\vec{e}_x, \vec{\Omega})$, where $\vec{\Omega}$ is the perceived direction of propagation of light in the moving medium; in the fundamental orthonormal basis $(\vec{e}_x, \vec{e}_y, \vec{e}_z)$, the dielectric tensor of the medium will be represented as

$$ \varepsilon = \varepsilon_0 \begin{pmatrix} n_\perp^2 & 0 & 0 \\ 0 & n_\parallel^2 & 0 \\ 0 & 0 & n_\perp^2 \end{pmatrix}, \qquad 1 < n_\parallel < n_\perp, $$

so that the observer in vacuum will perceive the moving medium as a negative uniaxial medium, with reference matter co-ordinates:

x0 = x0 x = ne

(Θ)

x(Θ)

y = ne

(Θ)

y(Θ)

z = ne

(Θ)

z(Θ)


hence in the uniaxial medium, the only possible reference co-ordinates are vacuum-like

co-ordinates; since the two observers perceive the same y and z co-ordinates, one writes

y = n y and z = n z with y = y and z = z.
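The apparent index relations above can be explored numerically. The following is only a minimal sketch: the function names are hypothetical, and it assumes the relations stated in the text, namely $n_\parallel = (n+\beta)/(1+\beta n)$, $n_\perp = n$, the formula for $n_e(\Theta)$, and the inverse relation $\beta = (n_\perp - n_\parallel)/(n_\perp n_\parallel - 1)$ quoted later for the equivalent uniaxial crystal.

```python
import math

def fresnel_parallel_index(n, beta):
    """Apparent index along the direction of motion, n_par = (n + beta) / (1 + beta * n)."""
    return (n + beta) / (1.0 + beta * n)

def apparent_index(theta, n_perp, n_par):
    """Apparent index n_e(theta), from
    n_e^2 = n_perp^2 n_par^2 / (n_perp^2 cos^2(theta) + n_par^2 sin^2(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return n_perp * n_par / math.sqrt((n_perp * c) ** 2 + (n_par * s) ** 2)

def beta_from_indices(n_perp, n_par):
    """Reduced speed of the equivalent moving medium for given principal indices."""
    return (n_perp - n_par) / (n_perp * n_par - 1.0)

if __name__ == "__main__":
    n, beta = 1.5, 0.2  # illustrative values only
    n_par = fresnel_parallel_index(n, beta)
    for deg in (0, 30, 60, 90):
        print(deg, apparent_index(math.radians(deg), n, n_par))
```

Under these assumptions the apparent index runs from $n_\parallel$ at Θ = 0 to $n_\perp$ at Θ = π/2, and the last function recovers the β used to build $n_\parallel$.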

Let us now examine an electromagnetic field associated to the photon propagating in

the moving medium: the co-ordinates (x0, x, y, z) and(x0′ , x′, y′, z′

)associated to the

basis(→e0,

→ex,

→ey,

→ez

)and

(→e0′ ,

→ex′ ,

→ey′ ,

→ez′)

[respectively(x0, x, y, z

)and

(x0′ , x′, y′, z′

)associated to

(→e 0,

→e x,

→e y,

→e z

)and

(→e0′ ,

→ex′ ,

→ey′ ,

→ez′)] being related thanks to a vacuum

Lorentz transform, the vacuum-like electromagnetic tensors are such that:

F μν = ∂ xμ

∂ xμ′∂ xν

∂ xν′ F μ′ν′and F μ′ν′

= ∂ xμ′

∂ xμ∂ xν′

∂ xν F μν

from which one obtains after calculation:

Ex = Ex′= Ex Ey = γ

(E y′ + c β B z′

)= γ γ

[(1 − β β

)E y + c

(β − β

)B z]

Ez = γ(E z′ − c β By′

)= γ γ

[(1 − β β

)E z − c

(β − β

)By]

Bx = Bx′= Bx By = γ

(By′ − β

c E z′)

= γ γ[(

1 − β β)

By − 1c

(β − β

)E z]

Bz = γ(B z′ + β

c E y′)

= γ γ[(

1 − β β)

B z + 1c

(β − β

)E y]

(79)

choosing a monochromatic plane wave magnetic field perceived by the reference observer

at rest located in the moving medium→B = B0 exp

{− 2 i π νc

[x0 − (x cos Θ + y sin Θ)]} →

e z,

where→Ω = cos Θ

→e x + sin Θ

→e y, leads to for the associated complex electric field (par-

allel polarisation):

→E = E0 exp

{− 2 i π ν

c

[x0 − (x cos Θ + y sin Θ)

]} (− sin Θ

→e x + cos Θ

→e y

)here

→E and

→B are vacuum-like fields so that B = E

c, from which one deduces the compo-

nents of the associated vacuum-like electromagnetic field relatively to the observer located

in vacuum:

Ex = −E sin Θ Ey = γ γ[(

1 − β β)

cos Θ + β − β]

E Ez = 0

Bx = 0 By = 0 Bz = γ γ[1 − β β +

(β − β

)cos Θ

]Ec

replacing Θ by its value in term of Θ finally leads to:

Ex = −E sin Θ

γ γ [1−β β− (β− β) cos Θ] Ey = E cos Θ

γ γ [1−β β− (β− β) cos Θ] Ez = 0

Bx = 0 By = 0 Bz = 1

γ γ [1−β β− (β− β) cos Θ]Ec

from which one obtains:→D = ε0 E

(− sin Θ

→e x + cos Θ

→e y

) →B = E

c

→e z

where E = E0 e− i Ψ, with E0 = E0

γ γ [1−β β− (β− β) cos Θ]and

Ψ = 2π νc

[x0 − (x cos Θ + y sin Θ)]


in the phase expression, the co-ordinates are the vacuum-like co-ordinates relatively

to the reference observer inside the medium,→D and

→B are the components of the vacuum-

like electromagnetic field perceived by the observer at rest in vacuum, and→Ω = cos Θ

→e x + sin Θ

→e y is the perceived unit wave vector associated to the electro-

magnetic field; hence, since the observer in vacuum perceives the moving medium as

an anisotropic uniaxial medium, the apparent electromagnetic induction field inside the

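medium and perceived by this observer will obey:

$$ \vec{k}\cdot\vec{D} = 0 \qquad \vec{k}\cdot\vec{B} = 0 \qquad \vec{k}\wedge\vec{E} = \omega\,\vec{B} \qquad \vec{k}\wedge\frac{\vec{B}}{\mu_0} = -\,\omega\,\vec{D} $$

where $\vec{k}$ is the wave vector inside the medium:

$$ \vec{k} = \frac{2\pi\nu}{c}\, n_e\, \vec{\Omega} $$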

hence→B = B0 exp

{− 2 i π ν

c

[x0 − ne

(x cos Θ + y sin Θ

)]} →ez = B0 e− i Ψ →

ez from

which one deduces the electric induction:

→D =

ne B0

c μ0

e− i Ψ(− sin Θ

→ex + cos Θ

→ey

)= ε

→E

then it easily comes for the electric field:→E = ne c B0 e− i Ψ

(− sin Θ

n2⊥

→ex + cos Θ

n2||

→ey

)=

E0 e− i Ψ →eE

one immediately verifies that→k ∧

→E = ω

→B and B

2

0 =n2⊥ n2

|| (n2⊥ cos2 Θ+n2

|| sin2 Θ)c2(n4⊥ cos2 Θ+n4

|| sin2 Θ) E

2

0 =

N2e

c2 E2

0, where Ne is the extraordinary ray refractive index for an uniaxial medium of op-

tical axis→e|| =

→ey [6]. Then, the apparent electromagnetic field inside the medium is for

the observer located in vacuum:→E = ne Ne E0 e− i Ψ

(− sin Θ

n2⊥

→ex + cos Θ

n2||

→ey

)→D = ε0 ne Ne E0 e− i Ψ

(− sin Θ

→ex + cos Θ

→ey

)→B = Ne

cE0 e− i Ψ →

ez

from which it comes E0 e− i ΨV =

E0 e− i ΨV ,

where ΨV and ΨV are the vacuum-like phases, that is:

E0 exp

{−2iπν

c

[x0 −


)]}= E0 exp

{−2iπν

c

[x0 − (x cos Θ + y sin Θ)

]}the invariance for all y = y implies ν sin Θ = ν sin Θ, and from the transformation

formulas for the co-ordinates one has:

ν[x0−(x cosΘ+y sinΘ)]= νγγ[1−β β+(β−β) cos Θ]{x0−[(x−dV ) cos Θ+y sin Θ]}+νdM cosΘ

hence ν = ν

γ γ [1−β β− (β− β) cos Θ]is the frequency perceived by the reference observer at

rest in vacuum, depending on the propagation direction, ν being the frequency of the


radiation for the reference observer located in the medium, and:

ν(x0 − x cos Θ

)= ν

(x0 −x cos Θ

)+ ν dV cos Θ + n ν dM cos Θ

on the moving interface inside the medium, x′ = 0, from which x = β x0 + dV and

x = β x0 −n dM , so that:

ν x0(1 − β cos Θ

)= ν x0

(1 − β cos Θ

)and since on the interface γ x0 = γ x0,

one has the fundamental relations:

ν γ(1 − β cos Θ

)= ν γ

(1 − β cos Θ

)ν sin Θ = ν sin Θ

with E0 exp(

2 i π νc dV cos Θ

)= E0 exp

(− 2 i π νc

n dM cos Θ)

hence the internal apparent fields perceived by the observer at rest in vacuum are:

→E = Ne ne E0 e− iΓ exp

{− 2 i π ν

c

[x0 − ne


)]} (− sin Θ

n2

→e x + cos Θ

n2||

→e y

)→B =

Ne E0 e− i Γ

cexp

{− 2 i π ν

c

[x0 − ne


)]} →e z

with e− iΓ = exp[− 2 i π

c

(ν dV cos Θ + n ν dM cos Θ

)], while for the observer located

in the medium, the true electromagnetic field is:

→E = E0 exp

{− 2 i π νc

[x0 −n (x cos Θ + y sin Θ)

]} (− sin Θ→e x + cos Θ

→e y

)→B = n E0

cexp

{− 2 i π νc

[x0 −n (x cos Θ + y sin Θ)

]} →e z

The observer at rest in vacuum perceives emerging electromagnetic fields (parallel polar-

isation studied here) such that:→Et = E0t e− i Ψt

(− sin Θt

→e x + cos Θt

→e y

)→Bt = E0t

ce− i Ψt

→e z

with Ψt = 2π νt

c

[x0 −

(x cos Θt + y sin Θt

)]where Θt and νt are the transmitted angle and frequency perceived by the observer

at rest in vacuum, related to the transmitted angle Θ′t and frequency ν ′ in the co-moving

LRS by ν ′ sin Θ′t = νt sin Θt and cos Θ′t = cos Θt −β

1−β cos Θt= γ νt

ν′

(cos Θt − β

); on the

moving interface, since x = x = β x0 + dV = β x0 + dV , one has for the transmitted

field on the interface perceived by the observer at rest in vacuum:

→Et = E0t exp

{− 2 i π νt

c

[x0(1 − β cos Θt

)− y sin Θt

]} (− sin Θt

→e x + cos Θt

→e y

)→Bt =

E0t

cexp

{− 2 i π νt

c

[x0(1 − β cos Θt

)− y sin Θt

]} →e z

with E0t = E0t exp(

2 i π νt

c dV cos Θt

)since Eq. (79) must be verified, one obtains the components of the transmitted field

in the co-moving LRS of the particle bound to the interface in vacuum:

Bx′

t = Bx

t = By

t = By′

t = 0 Ez′

t = Ez

t = 0


Ex′

t = Ex

t = − E0t sin Θt e− i Ψt = − E0tν ′

νt

sin Θ′t e− i Ψt

for the phase, it comes easily after calculation:

Ψt = 2π νt

c

[x0 −

(x cos Θt +y sin Θt

)]= 2πν′

c

[x0′ − (x′ cos Θ′t + y′ sin Θ′t)

]− 2π νt

c dV cos Θt

= Ψ′t −2π νt

c dV cos Θt

and: Ex′

t = − E0tν′νt

sin Θ′t e− i Ψ′t

doing so for the two other non zero components leads to:

Ey′

t = γ(E

y

t − c β Bz

t

)= γ E0t

(cos Θt − β

)e− i Ψt = E0t

ν′νt

cos Θ′t e− i Ψ′t

Bz′

t = γ(B

z

t − βc

Ey

t

)= γ E0t

c

(1 − β cos Θt

)e− i Ψt =

E0t

cν′νt

e− i Ψ′t

The electromagnetic field expressed in the co-moving LRS of a particle inside the moving

medium and on the moving plane interface has the following form:

→E ′

i = E ′0 exp

[− 2 i π ν′c

(x0′ −n y′ sin Θ′i

)]⎛⎜⎜⎜⎜⎝

− sin Θ′i

cos Θ′i

0

⎞⎟⎟⎟⎟⎠→B′

i = n E′0

cexp

[− 2 i π ν′c


)] →e z′

since x′ = 0 on the interface, where→E ′

i and→B′

i are the incident fields on the interface;

when the incident wave impinges the interface with incident angle Θ′i, a reflected wave

appears in the medium and a transmitted one appears in the vacuum, such that the total

fields are:

* in the medium:

→E ′

T =→E ′

i +→

E ′r = E ′

0 exp[− 2 i π ν′

c


)]⎡⎢⎢⎢⎢⎣− (

1 + r||)

sin Θ′i(1 − r||

)cos Θ′i

0

⎤⎥⎥⎥⎥⎦→

B′T =

→B′

i +→

B′r = n E′

0

c

(1 + r||

)exp

[− 2 i π ν′c


)] →e z′

* in vacuum:→E ′

t = E ′0 t|| exp

[− 2 i π ν′c

(x0′ − y′ sin Θ′t

)] (− sin Θ′t→e x′ + cos Θ′t

→e y′)

→B′

t = E′0

c t|| exp[− 2 i π ν′

c

(x0′ − y′ sin Θ′t

)] →e z′

since the reflected angle equals the incident one and the frequency remains unchanged through the interface in the co-moving LRS; here $\Theta'_t$ is the transmitted angle, and $r_\parallel$ and $t_\parallel$ are the parallel amplitude reflection and transmission factors. By construction, the matter co-ordinates $x^{0'}$, $y'$ and the associated vectors of the co-moving frame remain unchanged from the medium to vacuum; hence the continuity relations for the fields through an interface lead to $n \sin\Theta'_i = \sin\Theta'_t$, which is the classical Descartes' law, and to

$$ r_\parallel = \frac{\cos\Theta'_i - n\cos\Theta'_t}{\cos\Theta'_i + n\cos\Theta'_t} \qquad t_\parallel = \frac{2n\cos\Theta'_i}{\cos\Theta'_i + n\cos\Theta'_t}; $$

then one deduces from what precedes that $E'_0\, t_\parallel = \frac{\nu'}{\nu_t} E_{0t}$, and the incident electromagnetic field inside the medium relatively to the reference observer at rest in the medium is such that Eq. (79) is verified, that is

after a simple calculation:→Ei = E0 e− i Ψi

(− sin Θi

→e x + cos Θi

→e y

)→Bi = n E0

ce− i Ψi

→e z

where E0 = νi

ν′ E ′0 exp

(2 i π νi

cn dM cos Θi

)and Ψi = 2π νi

c

[x0 −n (x cos Θi + y sin Θi)

],

hence: E0 = νi

νt

E0t

t||exp

[2 i πc

(νt dV cos Θt + n νi dM cos Θi

)]while the phase continuity implies for all y that: νt sin Θt = n νi sin Θi; then from

what precedes, one has:

E0i =νi

νt

E0t

t||exp

[− 2 i π

cdV

(νi cos Θi − νt cos Θt

)]and the apparent incident field perceived by the reference observer at rest in vacuum is:

→Ei = ne

(Θi

)Ne

(Θi

)E0i e− i Ψi

(− sin Θi

n2⊥

→ex + cos Θi

n2||

→ey

)→Bi =

Ne(Θi)c

E0i e− i Ψi→ez

with Ψi = 2π νi

c

{x0 − ne

(Θi

) [x(Θi

)cos Θi + y

(Θi

)sin Θi

]}since on the moving interface x = ne

(Θi

)x(Θi

)= β x0 + dV = β x0 + dV and

ne

(Θi

)y(Θi

)= n y, one has for the apparent incident field on the interface perceived

by the observer at rest in vacuum:

→Ei = ne

(Θi

)Ne

(Θi

)E0i exp

(2 i π νi

c dV cos Θi

)× exp

{− 2 i π νi

c

[x0(1 − β cos Θi

)− n y sin Θi

]} (− sin Θi

n2⊥

→ex + cos Θi

n2||

→ey

)→Bi =

Ne(Θi)c

E0i exp(

2 i π νi

c dV cos Θi

)exp

{− 2 i π νi

c


)− n y sin Θi

]} →ez

the phase continuity for the fields implies νt

(1 − β cos Θt

)= νi

(1 − β cos Θi

)and

νt sin Θt = n νi sin Θi which is obviously verified, and the apparent transmission factors

are defined such that: →Et = T ||,E

→Ei

→Bt = T||,B

→Bi

with:

T||,B =

∥∥∥∥→Bt

∥∥∥∥∥∥∥∥→Bi

∥∥∥∥ = E0t

Ne(Θi) E0iexp

[− 2 i π

c dV

(νi cos Θi − νt cos Θt

)]= E0t

Ne(Θi) E0iexp

[− 2 i π dV

c β(νi − νt)

]= νt

νi

t||Ne(Θi)


and:

T ||,E =

⎛⎜⎜⎜⎜⎝T xx||,E T

xy||,E 0

Tyx||,E T

yy||,E 0

T zx||,E T

zy||,E T zz

||,E

⎞⎟⎟⎟⎟⎠ ⇒− n2

|| T xx||,E sin Θi + n2 T

xy||,E cos Θi = − n2 n2

||ne(Θi) T||,B sin Θt

− n2|| T

yx||,E sin Θi + n2 T

yy||,E cos Θi =

n2 n2||

ne(Θi) T||,B cos Θt

T zx||,E sin Θi = n2

n2||

Tzy||,E cos Θi

for the amplitudes, T||,E =

∥∥∥∥→Et

∥∥∥∥∥∥∥∥→Ei

∥∥∥∥ = Ne

(Θi

)T||,B and:

n2e

(Θi

)N2

e

(Θi

) [(− Txx

||,E sin Θi

n2 +T

xy||,E cos Θi

n2||

)2

+

(− T

yx||,E sin Θi

n2 +T

yy||,E cos Θi

n2||

)2]

= T 2||,E

⇒ n2e(Θi)n4 n4

||

⎡⎢⎣n4(T

xy||,E

2 + Tyy||,E

2)

cos2 Θi + n4||(T xx||,E

2 + Tyx||,E

2)

sin2 Θi

− 2 n2 n2||(T xx||,E T

xy||,E + T

yy||,E T

yx||,E)

sin Θi cos Θi

⎤⎥⎦ = T 2||,B

then it is efficient to choose T xx||,E T

xy||,E + T

yy||,E T

yx||,E = 0 and the transmission matrix can

be diagonal, with:

Txx||,E sin Θi =

n2

ne

(Θi

) T||,B sin Θt Tyy||,E cos Θi =

n2||

ne

(Θi

) T||,B cos Θt

For the reflected fields, relatively to the two reference observers, the situation is slightly different: although in the co-moving LRS the reflected angle and frequency equal the incident ones, this is not the case for the reference observer located inside the medium, nor for the one located in vacuum; in the co-moving LRS, the reflected unit wave vector is

$$ \vec{\Omega}'_r = \begin{pmatrix} -\cos\Theta'_i \\ \sin\Theta'_i \\ 0 \end{pmatrix} = \begin{pmatrix} \cos(\pi - \Theta'_i) \\ \sin(\pi - \Theta'_i) \\ 0 \end{pmatrix} = \begin{pmatrix} \cos\Theta'_r \\ \sin\Theta'_r \\ 0 \end{pmatrix} $$

applying the angle and frequency transformation relatively to the observer located in the medium leads to:

$$ \nu_r \sin\Theta_r = \nu' \sin\Theta'_r = \nu' \sin\Theta'_i = \nu_i \sin\Theta_i $$

$$ \cos\Theta_r = \frac{\cos\Theta'_r + \beta}{1 + \beta\cos\Theta'_r} = \frac{\cos(\pi - \Theta'_i) + \beta}{1 + \beta\cos(\pi - \Theta'_i)} = \frac{\beta - \cos\Theta'_i}{1 - \beta\cos\Theta'_i} $$

but $\cos\Theta'_i = \frac{\cos\Theta_i - \beta}{1 - \beta\cos\Theta_i}$, from which one obtains:

$$ \cos\Theta_r = \frac{2\beta - (1 + \beta^2)\cos\Theta_i}{1 + \beta^2 - 2\beta\cos\Theta_i} \neq -\cos\Theta_i \;\Rightarrow\; \Theta_r \neq \pi - \Theta_i $$


hence one has the fundamental result: relatively to the observer located in the moving medium, and similarly relatively to the one located in vacuum, the reflected angle is not the incident angle, the equality $\Theta_r = \pi - \Theta_i$ being true if and only if $\Theta_i = 0$ or $\beta = 0$; then the relations between the reflected and incident angles and frequencies are:

$$ \cos\Theta_r = \frac{2\beta - (1 + \beta^2)\cos\Theta_i}{1 + \beta^2 - 2\beta\cos\Theta_i} \qquad \sin\Theta_r = \frac{\sin\Theta_i}{\gamma^2\,(1 + \beta^2 - 2\beta\cos\Theta_i)} \qquad \nu_r = \gamma^2\,\nu_i\,(1 + \beta^2 - 2\beta\cos\Theta_i) $$

one easily verifies that $\cos^2\Theta_r + \sin^2\Theta_r = 1$; relatively to the reference observer located in vacuum, it comes:

cos Θr =β− β + (1−β β) cos Θr

1−β β + (β− β) cos Θr=

β + β− (1+β β) cos Θi

1+β β− (β + β) cos Θi=

2β− (1+β2) cos Θi

1+β2 − 2β cos Θi

sin Θr = sin Θr

γ γ [1−β β + (β− β) cos Θr]= sin Θi

γ γ [1+β β− (β + β) cos Θi]= sin Θi

γ2 (1+β2 − 2β cos Θi)
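The reflection relations can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, a single reduced speed β is assumed for the frame change, and the closed-form expressions are compared with the two-step route through the co-moving LRS (aberration into the co-moving frame, mirror reflection, aberration back).

```python
import math

def reflected_closed_form(theta_i, nu_i, beta):
    """Closed-form reflected direction and frequency for a mirror moving at reduced speed beta."""
    gamma2 = 1.0 / (1.0 - beta * beta)
    d = 1.0 + beta * beta - 2.0 * beta * math.cos(theta_i)
    cos_r = (2.0 * beta - (1.0 + beta * beta) * math.cos(theta_i)) / d
    sin_r = math.sin(theta_i) / (gamma2 * d)
    nu_r = gamma2 * nu_i * d
    return cos_r, sin_r, nu_r

def reflected_via_comoving(theta_i, beta):
    """cos(theta_r) obtained by passing through the co-moving frame."""
    c = math.cos(theta_i)
    cos_i_prime = (c - beta) / (1.0 - beta * c)   # aberration into the co-moving frame
    cos_r_prime = -cos_i_prime                    # theta'_r = pi - theta'_i
    return (cos_r_prime + beta) / (1.0 + beta * cos_r_prime)

if __name__ == "__main__":
    theta_i, nu_i, beta = 1.0, 1.0, 0.3  # illustrative values only
    cos_r, sin_r, nu_r = reflected_closed_form(theta_i, nu_i, beta)
    print(cos_r, reflected_via_comoving(theta_i, beta), -math.cos(theta_i))
    print(cos_r ** 2 + sin_r ** 2, nu_r * sin_r, nu_i * math.sin(theta_i))
```

With β = 0 the routine returns the ordinary mirror law, which gives a quick sanity check on the sign conventions assumed here.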

obviously the relations νi sin Θi = νr sin Θr = νi sin Θi are verified, and one ob-

tains after calculation the expected result νr = γ2 νi

(1 + β2 − 2 β cos Θi

)from which

νr sin Θr = νi sin Θi; furthermore, from the definition of the reflected angle, it is easy

to obtain :

cos Θ′r =cos Θr −β

1 − β cos Θr

= γνr

ν ′

(cos Θr − β

)=

cos Θr − β

1 − β cos Θr

= γνr

ν ′(cos Θr − β

)then the apparent reflection factors on the interface are defined such that:

→Er = R||,E

→Ei

→Br = R||,B

→Bi, with R||,B =

∥∥∥∥ →Br

∥∥∥∥∥∥∥∥→Bi

∥∥∥∥ and R||,E =

⎛⎜⎜⎜⎜⎝Rxx||,E 0 0

0 Ryy||,E 0

0 0 0

⎞⎟⎟⎟⎟⎠and the total apparent electromagnetic field on the interface is given by:

→ET = E0i exp

(2 i π νi

c dV cos Θi

)ne

(Θi

)Ne

(Θi

)e− i Ψi

(I + R||,E

) (− sin Θi

n2⊥

→ex + cos Θi

n2||

→ey

)→BT = E0i

c Ne

(Θi

)exp

(2 i π νi

c dV cos Θi

)(1 + RB

||)

e− i Ψi→ez

where Ψi = 2π νi

c


)− n y sin Θi

]; then the continuity of the fields

implies:

* for the magnetic field: 1 + R||,B = T||,B ⇒ 1 + R||,B =νt n (1+ r||)νi Ne(Θi)

* for the tangential component of the electric field:

Tyy||,E = 1 + R

yy||,E

(1 + R

yy||,E) cos Θi

n2||

=T||,B

ne

(Θi

) cos Θt


* for the normal component of the electric induction (when the interface is assumed free

of charges and currents):

1 + Rxx||,E =

T xx||,En2

(1 + R

xx||,E)

sin Θi =T||,B

ne

(Θi

) sin Θt

One may notice that all these expressions (for parallel polarisation) are valid for a uniaxial negative crystal, with $\beta = \frac{n_\perp - n_\parallel}{n_\perp n_\parallel - 1}$, where $n_\perp$ and $n_\parallel$ are the principal refractive indices of the crystal.

It is important to note here that the refractive index such that

$$ ds = \frac{\sqrt{n^2\cos^2\Theta + n_\parallel^2\sin^2\Theta}}{n\, n_\parallel}\, dx^0 $$

is not an effective index but only an apparent index, and the refractive phenomena which occur inside the moving medium have to be examined from the point of view of the reference observer located inside the medium; hence, when the reference observer located in vacuum perceives an incoming intensity emerging from the moving medium, he knows its transmitted direction and frequency, from which he can deduce, from what precedes, the apparent incident direction and frequency, directly related to the real incident direction and frequency perceived by the reference observer at rest in the medium: then the initial problem, that is to find an invariant form of the radiative transfer equation, has to be examined from the internal observer's point of view.

4. Derivation of the Radiative Transfer Equation

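Let us now pay attention to the derivative operator $P^{\alpha}\partial_{\alpha}$ along a photon path inside the medium, using the two sets of matter co-ordinates relatively to the observer located inside the medium; from (72), one easily obtains that

$$ \frac{\partial}{\partial x^{0'}} = \gamma\left(\frac{\partial}{\partial x^{0}} + \frac{\beta}{n}\frac{\partial}{\partial x}\right) \qquad \frac{\partial}{\partial x'} = \gamma\left(\beta\, n\,\frac{\partial}{\partial x^{0}} + \frac{\partial}{\partial x}\right) \qquad \frac{\partial}{\partial y'} = \frac{\partial}{\partial y} \qquad \frac{\partial}{\partial z'} = \frac{\partial}{\partial z}, \tag{80} $$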

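Performing the calculation of the different contravariant components of the 4-momentum (4-impulsion) vector leads to

$$ P^{0'} = \frac{n^2 h\nu'}{c} = \frac{n^2 h\,\gamma\,\nu}{c}\,(1 - \beta\mu) \qquad P^{x'} = \frac{h\nu' n\,\mu'}{c} = \frac{h\nu n}{c}\,\gamma\,(\mu - \beta) $$

$$ P^{y'} = \frac{h\nu' n \sin\Theta'\cos\Phi'}{c} = \frac{h\nu n \sin\Theta\cos\Phi}{c} \qquad P^{z'} = \frac{h\nu' n \sin\Theta'\sin\Phi'}{c} = \frac{h\nu n \sin\Theta\sin\Phi}{c}, \tag{81} $$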

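from which one obtains

$$ P^{0'}\frac{\partial}{\partial x^{0'}} + P^{x'}\frac{\partial}{\partial x'} = \frac{h\nu n^2}{c}\,\gamma^2\left\{\left[1 - \beta\mu + \beta(\mu - \beta)\right]\frac{\partial}{\partial x^{0}} + \frac{1}{n}\left[\beta(1 - \beta\mu) + \mu - \beta\right]\frac{\partial}{\partial x}\right\} = \frac{h\nu n}{c}\left(n\frac{\partial}{\partial x^{0}} + \mu\frac{\partial}{\partial x}\right) = P^{0}\frac{\partial}{\partial x^{0}} + P^{x}\frac{\partial}{\partial x}, \tag{82} $$

and easily deduces that

$$ P^{y'}\frac{\partial}{\partial y'} + P^{z'}\frac{\partial}{\partial z'} = \frac{h\nu n}{c}\left(\sin\Theta\cos\Phi\frac{\partial}{\partial y} + \sin\Theta\sin\Phi\frac{\partial}{\partial z}\right) = P^{y}\frac{\partial}{\partial y} + P^{z}\frac{\partial}{\partial z}, \tag{83} $$

Then one has the final and important result

$$ P^{0'}\frac{\partial}{\partial x^{0'}} + P^{x'}\frac{\partial}{\partial x'} + P^{y'}\frac{\partial}{\partial y'} + P^{z'}\frac{\partial}{\partial z'} = P^{0}\frac{\partial}{\partial x^{0}} + P^{x}\frac{\partial}{\partial x} + P^{y}\frac{\partial}{\partial y} + P^{z}\frac{\partial}{\partial z} $$

rewritten under the more compact form

$$ P^{\alpha'}\partial_{\alpha'} = P^{\alpha}\partial_{\alpha}, \tag{84} $$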

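which reveals that the derivative operator along a photon path is an invariant quantity, provided one uses the fundamental matter variables $(x^{0'}, x', y', z')$ and $(x^{0}, x, y, z)$ associated to the fundamental bases $(\vec{e}_{0'}, \vec{e}_{x'}, \vec{e}_{y'}, \vec{e}_{z'})$ and $(\vec{e}_{0}, \vec{e}_{x}, \vec{e}_{y}, \vec{e}_{z})$; note that

$$ P^{\alpha'}\partial_{\alpha'} = \frac{h\nu' n}{c}\left[n\frac{\partial}{\partial x^{0'}} + \cos\Theta'\frac{\partial}{\partial x'} + \sin\Theta'\left(\cos\Phi'\frac{\partial}{\partial y'} + \sin\Phi'\frac{\partial}{\partial z'}\right)\right] = \frac{h\nu' n}{c}\left(n\frac{\partial}{\partial x^{0'}} + \vec{\Omega}'\cdot\overrightarrow{\mathrm{grad}}\right) = \frac{h\nu' n}{c}\left(n\frac{\partial}{\partial x^{0'}} + \frac{\partial}{\partial s'}\right) $$

where $\frac{\partial}{\partial s'}$ is the habitual curvilinear spatial derivative, that is, the propagation unit vector as well as the gradient vector (expressed with the matter co-ordinates) are given in the vacuum basis; similarly one has

$$ P^{\alpha}\partial_{\alpha} = \frac{h\nu n}{c}\left[n\frac{\partial}{\partial x^{0}} + \cos\Theta\frac{\partial}{\partial x} + \sin\Theta\left(\cos\Phi\frac{\partial}{\partial y} + \sin\Phi\frac{\partial}{\partial z}\right)\right] = \frac{h\nu n}{c}\left(n\frac{\partial}{\partial x^{0}} + \vec{\Omega}\cdot\overrightarrow{\mathrm{grad}}\right) = \frac{h\nu n}{c}\left(n\frac{\partial}{\partial x^{0}} + \frac{\partial}{\partial s}\right) $$

hence from Eq. (84) one may rewrite the derivative operator transformation under the useful form

$$ \nu\left(\frac{n}{c}\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right) = \nu'\left(\frac{n}{c}\frac{\partial}{\partial t'} + \frac{\partial}{\partial s'}\right), \tag{85} $$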
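The algebra behind Eq. (82) can be verified numerically. The following is a minimal sketch with a hypothetical function name; it assumes the derivative transformation (80) and the components (81), and works in units of $h\nu n/c$, so that invariance requires the coefficient of $\partial/\partial x^0$ to be n and the coefficient of $\partial/\partial x$ to be μ.

```python
import math

def transformed_operator_coefficients(beta, mu, n):
    """Coefficients of d/dx0 and d/dx in P^{0'} d/dx0' + P^{x'} d/dx', in units of h*nu*n/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    p0_prime = n * gamma * (1.0 - beta * mu)   # P^{0'} / (h nu n / c), from Eq. (81)
    px_prime = gamma * (mu - beta)             # P^{x'} / (h nu n / c), from Eq. (81)
    # Eq. (80): d/dx0' = gamma (d/dx0 + (beta/n) d/dx), d/dx' = gamma (beta n d/dx0 + d/dx)
    coeff_x0 = p0_prime * gamma + px_prime * gamma * beta * n
    coeff_x = p0_prime * gamma * beta / n + px_prime * gamma
    return coeff_x0, coeff_x

if __name__ == "__main__":
    for beta, mu, n in [(0.1, 0.5, 1.5), (0.6, -0.3, 1.33), (0.9, 0.99, 2.0)]:
        print(transformed_operator_coefficients(beta, mu, n), (n, mu))
```

The printed pairs agree to rounding error, which is just the identity $\gamma^2(1-\beta^2) = 1$ applied to both brackets of Eq. (82).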

It is now time to focus on the energetic invariant quantity, namely the specific intensity; let us first recall how the specific intensity is obtained when the system considered is vacuum, following the steps developed by Mihalas [1]: if N is the number of photons passing through a surface element perpendicular to the particle speed vector at a given frequency and a given propagation direction, this number in the co-moving LRS is expressed as

$$ N = \frac{L'\,\vec{\Omega}'\cdot\vec{dS}\; d\nu'\, d\Omega'\, dt'}{h\nu'} = \frac{L'\cos\Theta'\, d\nu'\, d\Omega'\, dS\, dt'}{h\nu'}, \tag{86} $$

relatively to the observer LRS, this number is

$$ N = \frac{L\, d\nu\, d\Omega\, dt}{h\nu}\,\vec{dS}\cdot\left(\vec{\Omega} - \vec{\beta}\right) = \frac{L\, d\nu\, d\Omega\, dS\, dt}{h\nu}\,(\cos\Theta - \beta), \tag{87} $$

from which one immediately deduces that

$$ \frac{L'\cos\Theta'\, d\nu'\, d\Omega'}{\nu'} = \gamma\,\frac{L\, d\nu\, d\Omega}{\nu}\,(\cos\Theta - \beta), \tag{88} $$

since the proper times are related by $dt' = \frac{dt}{\gamma}$; moreover, using the optical aberration and Doppler frequency shift transformations, $\cos\Theta - \beta = \frac{\nu'\cos\Theta'}{\gamma\,\nu}$, hence, using the conservation relation $\nu\, d\nu\, d\Omega = \nu'\, d\nu'\, d\Omega'$, one obtains

$$ \frac{L'}{\nu'^3} = \frac{L}{\nu^3}, \tag{89} $$

which justifies that $I = \frac{L}{\nu^3}$ is the specific intensity for a unit refractive index medium; if the refractive index is not one, analogously to what precedes, the specific intensity is simply $\frac{L'}{n^2\nu'^3} = \frac{L}{n^2\nu^3}$, since relatively to the observer at rest inside the medium, the moving medium remains isotropic with refractive index n; then, since at rest the intensity is $\frac{L}{n^2}$, a well-known result [7], one deduces the previous result; then the left-hand side of Eq. (1) obeys the following relation

$$ \nu\left[\frac{n}{c}\frac{\partial}{\partial t}\left(\frac{L}{n^2\nu^3}\right) + \frac{\partial}{\partial s}\left(\frac{L}{n^2\nu^3}\right)\right] = \nu'\left[\frac{n}{c}\frac{\partial}{\partial t'}\left(\frac{L'}{n^2\nu'^3}\right) + \frac{\partial}{\partial s'}\left(\frac{L'}{n^2\nu'^3}\right)\right], \tag{90} $$
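The two ingredients used in the vacuum derivation of Eq. (89) can be checked numerically. This is only a sketch under stated assumptions: the function names are hypothetical, the standard vacuum Doppler shift $\nu' = \gamma\nu(1-\beta\mu)$ and aberration $\mu' = (\mu-\beta)/(1-\beta\mu)$ with $\mu = \cos\Theta$ are assumed, the azimuth is unchanged so that $d\Omega = d\mu\, d\Phi$, and the Jacobian is estimated by central finite differences.

```python
import math

def doppler_aberration(nu, mu, beta):
    """Co-moving frequency and direction cosine for a frame moving at reduced speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    nu_p = gamma * nu * (1.0 - beta * mu)
    mu_p = (mu - beta) / (1.0 - beta * mu)
    return nu_p, mu_p

def invariant_measure_ratio(nu, mu, beta, h=1e-6):
    """Finite-difference estimate of (nu' dnu' dOmega') / (nu dnu dOmega)."""
    nu_p, _ = doppler_aberration(nu, mu, beta)
    dnup_dnu = (doppler_aberration(nu + h, mu, beta)[0]
                - doppler_aberration(nu - h, mu, beta)[0]) / (2.0 * h)
    dmup_dmu = (doppler_aberration(nu, mu + h, beta)[1]
                - doppler_aberration(nu, mu - h, beta)[1]) / (2.0 * h)
    # mu' does not depend on nu, so the Jacobian is the product of the diagonal terms
    return nu_p * dnup_dnu * dmup_dmu / nu

def comoving_radiance(L, nu, mu, beta):
    """L' implied by the invariance L'/nu'^3 = L/nu^3 of Eq. (89)."""
    nu_p, _ = doppler_aberration(nu, mu, beta)
    return L * (nu_p / nu) ** 3

if __name__ == "__main__":
    nu, mu, beta = 1.0, 0.3, 0.4  # illustrative values only
    print(invariant_measure_ratio(nu, mu, beta))
    print(comoving_radiance(2.0, nu, mu, beta))
```

The first printed value should be close to 1, which is the conservation relation $\nu\, d\nu\, d\Omega = \nu'\, d\nu'\, d\Omega'$ used in the text.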

If one now considers the number of photons emitted by an elementary volume in an elementary solid angle around a given frequency interval, this number can be expressed as

$$ N = \frac{\eta'\, d\nu'\, d\Omega'\, dV'\, dx^{0'}}{c\, h\nu'} = \frac{\eta\, d\nu\, d\Omega\, dV\, dx^{0}}{c\, h\nu}, \tag{91} $$

where η is the emissive power; due to the scalar density conservation, it comes that

$$ \sqrt{-\mathrm{Det}\, g}\; dx^{0}\, dV = \sqrt{-\mathrm{Det}\, g'}\; dx^{0'}\, dV' \;\Rightarrow\; dx^{0}\, dV = dx^{0'}\, dV', \tag{92} $$

hence one has from the previous result and with the help of Eq. (70)

$$ \frac{\eta'}{\nu'^2} = \frac{\eta}{\nu^2}, \tag{93} $$

similarly the number of photons absorbed in the same conditions is

$$ N = \frac{\kappa' L'\, d\nu'\, d\Omega'\, dV'\, dx^{0'}}{c\, h\nu'} = \frac{\kappa L\, d\nu\, d\Omega\, dV\, dx^{0}}{c\, h\nu}, \tag{94} $$

where κ is the absorption coefficient; from what precedes, one deduces that $\frac{\kappa' L'}{\nu'^2} = \frac{\kappa L}{\nu^2}$, and by definition of the specific intensity, one finally obtains

$$ \kappa'\,\nu' = \kappa\,\nu, \tag{95} $$

so that the right-hand side of Eq. (1) obeys the following relation

$$ \frac{1}{n^2\nu^2}\,(\eta - \kappa L) = \frac{1}{n^2\nu'^2}\,(\eta' - \kappa' L'), \tag{96} $$

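In the vacuum, the emissive power is simply the Planck function multiplied by the absorption coefficient, $\kappa L^0$, while in a medium of refractive index n, it is $\eta = n^2\kappa L^0$, from which it comes the invariant forms of the RTE

$$ \frac{n}{c}\frac{\partial}{\partial t'}\left(\frac{L'}{n^2\nu'^3}\right) + \frac{\partial}{\partial s'}\left(\frac{L'}{n^2\nu'^3}\right) = \frac{\kappa'}{\nu'^3}\left(L^0 - \frac{L'}{n^2}\right) \qquad \frac{n}{c}\frac{\partial}{\partial t}\left(\frac{L}{n^2\nu^3}\right) + \frac{\partial}{\partial s}\left(\frac{L}{n^2\nu^3}\right) = \frac{\kappa}{\nu^3}\left(L^0 - \frac{L}{n^2}\right), \tag{97} $$

Hence, noting the total spatial derivative $\frac{d}{ds} = \frac{n}{c}\frac{\partial}{\partial t} + \frac{\partial}{\partial s}$, one finally obtains the usual form of the invariant RTE

$$ \frac{dI}{ds} + \kappa I = \frac{\kappa L^0}{\nu^3} \;\Rightarrow\; I(s_f) = I(s_i)\exp\left[-\int_{s=s_i}^{s_f}\kappa(s)\, ds\right] + \int_{s=s_i}^{s_f}\frac{\kappa(s)\, L^0(s)}{\nu^3(s)}\exp\left[-\int_{s'=s}^{s_f}\kappa(s')\, ds'\right] ds, \tag{98} $$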

Then the reference observer at rest in vacuum is able to determine the (real) radiative

field inside the moving medium from the perceived emerging directional and spectral

intensity field.
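The formal solution (98) lends itself to a simple quadrature. The sketch below is not the author's method but a plain trapezoidal evaluation under stated assumptions: `kappa` and `source` are hypothetical user-supplied callables, with `source(s)` standing for $L^0(s)/\nu^3(s)$, and the path is sampled on a uniform grid.

```python
import math

def formal_solution(I_i, kappa, source, s_i, s_f, steps=2000):
    """Trapezoidal evaluation of Eq. (98).

    I_i    : invariant intensity I(s_i) entering the path
    kappa  : callable, absorption coefficient kappa(s)
    source : callable, L0(s) / nu(s)**3
    """
    h = (s_f - s_i) / steps
    s = [s_i + k * h for k in range(steps + 1)]
    kap = [kappa(x) for x in s]
    # tau[k] = integral of kappa from s[k] to s_f (optical depth to the exit point)
    tau = [0.0] * (steps + 1)
    for k in range(steps - 1, -1, -1):
        tau[k] = tau[k + 1] + 0.5 * h * (kap[k] + kap[k + 1])
    integrand = [kap[k] * source(s[k]) * math.exp(-tau[k]) for k in range(steps + 1)]
    emission = h * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
    return I_i * math.exp(-tau[0]) + emission

if __name__ == "__main__":
    # Homogeneous slab: closed form I_i e^{-kappa s} + S (1 - e^{-kappa s})
    kappa0, S0, I0, length = 0.7, 2.0, 5.0, 3.0
    numeric = formal_solution(I0, lambda s: kappa0, lambda s: S0, 0.0, length)
    exact = I0 * math.exp(-kappa0 * length) + S0 * (1.0 - math.exp(-kappa0 * length))
    print(numeric, exact)
```

For a homogeneous path the result can be compared with the closed form, as in the usage lines; for varying κ(s) and source the grid resolution `steps` is the only accuracy control in this sketch.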

5. CONCLUSION

In this paper we described a way to construct a consistent “equivalent vacuum” and “matter” space bound to the observer after a diagonalisation of the metric tensor related to Gordon's metric, which arises from the particles of the semi-transparent medium of non-unit refractive index moving with a constant speed; the construction of this space relatively to the observer then allows the calculation of the optical aberration and frequency transformation in the new fundamental co-ordinates attached to the observer space, and leads to the determination of the invariant specific intensity and the general form of the radiative transfer equation in this space, following the method developed by Mihalas in vacuum. We expect that a more general formulation of this work can be obtained by generalisation to the case of particles moving with non-constant speed in a semi-transparent medium of non-constant refractive index.


References

[1] Mihalas D. and Mihalas B. W., Foundations of Radiation Hydrodynamics (Dover, New York, 1999)

[2] Leonhardt U. and Piwnicki P., Optics of non uniformly moving media, Phys. Rev. A 60, pp 4301-4312 (1999)

[3] Ben Abdallah P., When the space curvature dopes the radiant intensity, J. Opt. Soc. Am. B, Vol. 19, N° 8 (2002)

[4] Gordon W., Ann. Phys. 72, 421, 1923 (Leipzig)

[5] Fresnel A. J., Ann. Chim. Phys. 9, 57 (1818)

[6] L. Landau and E. Lifchitz, “Physique theorique (tome 8), Electrodynamique des milieux continus”, Ed. Librairie du globe, Editions MIR (2nde edition) (1990).

[7] R. W. Preisendorfer, “Radiative Transfer on Discrete Spaces”, International Series of Monographs on Pure and Applied Mathematics, Vol. 74, Pergamon Press (1965).


Quantum Images and the Measurement Process

Fariel Shafee∗

Department of Physics, Princeton University

Princeton, NJ 08540 USA

Received 12 March 2007, Accepted 25 March 2007, Published 31 March 2007

Abstract: We argue that symmetrization of an incoming microstate with similar states in a sea of microstates contained in a macroscopic detector can produce an effective image, which does not contradict the no-cloning theorem, and such a combinatorial set, with conjugate quantum numbers, can form virtual bound states with the incoming microstate. This can then be used with first passage random walk interactions to give the right quantum mechanical weight for different measured eigenvalues. © Electronic Journal of Theoretical Physics. All rights reserved.

Keywords: Quantum Measurement, Quantum Image, Quantum Bound State, No-cloning Theorem
PACS (2006): 03.65.Ta, 03.65.Ud, 03.67.Mn

1. Introduction

Random walks [1] have long been a favorite sport of many quantum physicists

in search of a rationale for quantum indeterminism [2]. Different stochastic models

for transitions to collapsed states on measurement have been presented by many authors

[3, 4, 5, 6, 7]. In a previous work [8] we have presented a picture of the transition of

a superposed quantum microstate to an eigenstate of a measured operator through in-

teractions with a measuring device, which are random in the sense of the stochasticity

introduced by the large number of degrees of freedom of the macrosystem, and not due to

any intrinsic quantum indeterminism. However, in our work we made the novel departure

of using first passage walks [9] which lead to a dimensional reduction of the path in sim-

plicial complexes to simplexes of lower dimensions by turn, a possible feature also noted

very recently by Omnes [10]. In the work cited we appealed to heuristic arguments in

analogy with electrodynamic images. In the present work we try to justify the emergence

∗ f [email protected]


of image-like subsystems in a macrosystem from quantum symmetry principles.

2. Symmetrization and Interactions

Interactions between systems may be due to Hamiltonians connecting operators that

explicitly connect components of different systems, or they may be due to symmetriza-

tion or anti-symmetrization of the states of the systems involved. For fermions, exchange

interaction yields the exclusion principle, which may have more dominant effects than

a weak potential in a many-particle system. For bosonic systems, condensation at low

temperatures indicates the creation of macro-sized quantum states. Unlike the unitary

time-dependent operators representing the explicit interactions between systems through

the Hamiltonian, (anti-)symmetrization has no explicit time involvement, and a sys-

tem includes the (anti-)symmetrization of the component subsystems ab initio, which

continues until the states change and lose their indistinguishability. Alternatively, (anti-

)symmetrization comes into action as soon as an intermediate or final state is produced

involving identical particles, even when the initial system might not have had any. The

process therefore is apparently a discrete phenomenon, going together with the abrupt

action of the creation or annihilation of particles in field theory.

In terms of first quantized quantum mechanics, we understand the permutative (anti-)symmetry properties of identical microstates (particles) in terms of the separation of the co-ordinates. Two identical microsystems labeled 1 and 2 in particular states a and b have the combined (anti)symmetric wave function

$$ \psi(1,2)_{ab} = \frac{1}{\sqrt{2}}\left[\psi(1)_a\,\psi(2)_b \pm \psi(2)_a\,\psi(1)_b\right] \tag{1} $$

In practice the labels 1 and 2 for the two particles usually refer to the concentration

of the two particles in two different regions of space, for example, near two attractive

potential centers. So, the labels 1 and 2 are actually also interpretable as parameters for

two different states, and [11] it is possible to combine the two sets of labels into a single

set, say α and β and demand that

ψ(α, β) = ±ψ(α′, β′) (2)

where the sign for fermionic systems depends on the number of interchanges needed

to obtain the parameter sets α′ and β′ from the unprimed sets and for the bosons it is of

course always positive.

Even with ab initio symmetry built-in, it is well-known that a state can dynamically

evolve from a nearly factorized separable product state to a fully symmetric entangled

state as the overlap becomes high from nearly zero when the two subsystems (particles)

were well-separated initially. If we know that the incoming particles labeled 1 and 2 were

in states a and b at large separations then,


$$\psi(x_1,x_2,a,b) = \frac{1}{\sqrt{2}}\left[\psi(x_1,a)\,\psi(x_2,b) \pm \psi(x_2,a)\,\psi(x_1,b)\right] \sim \frac{1}{\sqrt{2}}\,\psi(x_1,a)\,\psi(x_2,b) \tag{3}$$

for $|x_1 - x_2|$ large, as the second term is small.

If the states a and b are identical, then it is well known that this exchange interaction for bosons gives an effective attractive interaction for small $|x_1 - x_2|$, as we get simply $\sqrt{2}$ times a single wave function, whereas for fermions it becomes highly repulsive, as the antisymmetry produces the exclusion principle.
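As a rough numerical illustration of Eq. 3 and of the identical-state limit, the sketch below uses one-dimensional Gaussian packets as stand-ins for ψ(x, a) and ψ(x, b). The Gaussian form, the width `sigma`, and all function names are assumptions made only for this example; they are not taken from the text.

```python
import math

def packet(x, centre, sigma=1.0):
    """Normalized 1D Gaussian packet (assumed stand-in for psi(x, a))."""
    norm = (math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))

def direct_and_exchange(x1, x2, centre_a, centre_b, sigma=1.0):
    """Return the two products that appear inside the bracket of Eq. 3."""
    direct = packet(x1, centre_a, sigma) * packet(x2, centre_b, sigma)
    exchange = packet(x2, centre_a, sigma) * packet(x1, centre_b, sigma)
    return direct, exchange

def symmetrized(x1, x2, centre_a, centre_b, sign=+1, sigma=1.0):
    """Eq. 3 with sign=+1 for bosons and sign=-1 for fermions."""
    direct, exchange = direct_and_exchange(x1, x2, centre_a, centre_b, sigma)
    return (direct + sign * exchange) / math.sqrt(2)

def exchange_ratio(separation, sigma=1.0):
    """Exchange/direct ratio with each particle placed at its own packet centre."""
    a, b = -separation / 2, separation / 2
    direct, exchange = direct_and_exchange(a, b, a, b, sigma)
    return exchange / direct

if __name__ == "__main__":
    for d in (0.0, 1.0, 3.0, 6.0):
        print(f"separation {d}: exchange/direct = {exchange_ratio(d):.3e}")
```

With these assumed packets the ratio falls off as exp(-d²/σ²) with the separation d of the centres, which is the sense in which the second term of Eq. 3 is small. When both packets share one centre, the bosonic combination is √2 times the single product and the fermionic combination vanishes.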

3. State of the Detector

We shall consider a detector as a macrosystem which consists of a large number of

microsystems identical with the microsystem to be detected, but in all possible different

states, including the incoming state to be detected, so that initially it appears like a

neutral unbiased system with respect to the state of the incoming microsystem. This

picture is comparable to that of a sea of quarks of all flavors and colors in a quark bag,

or even the similar content of a neutral vacuum when considering vacuum polarization

contributions. To maintain the quantum number of the vacuum, i.e. to give a singlet

with respect to all possible symmetry/classification groups, all these states occur paired

with conjugate anti-states (group theoretically inverse elements):

$$\Psi_D = \sum_a \psi_{Da}\,\bar{\psi}_{Da} \tag{4}$$

where the label D indicates states with positional peaks inside the detector, and the bar marks the conjugate anti-state. The

expression above is the simplest spectral decomposition for our purpose. In general there

will also be simultaneous multiple state/anti-state pairs, which will introduce new numer-

ical factors from combinatorics, but will not change the relative strengths of interactions

between the incoming microstate (S) and the pairing anti-states of the detector (D),

which is the crucial part of our measurement picture.

4. System-Detector Symmetry

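For a bosonic microsystem being detected, if it is in the state $\psi_{Si}$, symmetrization with the detector states gives

$$\Psi_{SD} = \frac{1}{\sqrt{2N}} \sum_{j \neq i} \left(\psi_{Si}\,\psi_{Dj} + \psi_{Sj}\,\psi_{Di}\right)\bar{\psi}_{Dj} + \frac{1}{\sqrt{N}}\,\psi_{Si}\,\psi_{Di}\,\bar{\psi}_{Di} \tag{5}$$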

when there are N states uniformly distributed in the detector, including the state i.

Normalization is ensured by the orthogonality of the states, when the coefficients are as

chosen.


However, if the microsystem was well-separated from the detector and symmetrization

was not invoked, the product state in a product space would be, with the macroscopic

detector still containing a superposition of all possible states:

$$\Psi_{SD0} = \frac{1}{\sqrt{N}} \sum_{\text{all } j} \psi_{Si}\,\psi_{Dj}\,\bar{\psi}_{Dj} \tag{6}$$

Since the functions ψSi and ψDi for an identical microsystem in the same state may

both actually represent the observed incoming microsystem in Eq. 6, we can rewrite Eq.

5 as

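$$\Psi_{SD} = \frac{1}{\sqrt{2}}\left(\Psi_{SD0} + \Psi_{DS0}\right) + \frac{1}{\sqrt{N}}\left(1 - \sqrt{2}\right)\psi_{Si}\,\psi_{Di}\,\bar{\psi}_{Di} \tag{7}$$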

Here both $\Psi_{SD0}$ and $\Psi_{DS0}$ represent an incoming particle in the state $\psi_i$ and its noninteracting product with the detector. Hence, the extra term $\psi_{Si}\,\psi_{Di}\,\bar{\psi}_{Di}$ represents the 'exchange interaction' resulting from the entanglement of the microstate with the detector.
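The normalization of Eq. 5 and its rewriting as Eq. 7 can be checked by bookkeeping on the coefficients. The sketch below is an illustration under stated assumptions: each triple product of an S state p, a D state q and a conjugate D state r is treated as an orthonormal basis element keyed by (p, q, r), and Ψ_DS0 is taken to be Ψ_SD0 with the S and D labels of the first two factors exchanged, which the text does not spell out. The function names are hypothetical.

```python
import math
from collections import defaultdict

def symmetrized_state(n_states, i):
    """Coefficients of Eq. 5, keyed by (S label, D label, conjugate-D label)."""
    coeffs = defaultdict(float)
    for j in range(n_states):
        if j != i:
            coeffs[(i, j, j)] += 1 / math.sqrt(2 * n_states)
            coeffs[(j, i, j)] += 1 / math.sqrt(2 * n_states)
    coeffs[(i, i, i)] += 1 / math.sqrt(n_states)
    return dict(coeffs)

def product_state(n_states, i, swap_labels=False):
    """Eq. 6, or its S<->D exchanged partner (assumed form of Psi_DS0)."""
    coeffs = {}
    for j in range(n_states):
        key = (j, i, j) if swap_labels else (i, j, j)
        coeffs[key] = 1 / math.sqrt(n_states)
    return coeffs

def rhs_of_eq7(n_states, i):
    """Right-hand side of Eq. 7 assembled from the two product states."""
    rhs = defaultdict(float)
    for state in (product_state(n_states, i), product_state(n_states, i, True)):
        for key, value in state.items():
            rhs[key] += value / math.sqrt(2)
    rhs[(i, i, i)] += (1 - math.sqrt(2)) / math.sqrt(n_states)
    return dict(rhs)

def norm_squared(coeffs):
    """Squared norm, valid because the keyed basis elements are taken as orthonormal."""
    return sum(v * v for v in coeffs.values())
```

For example, with `n_states=5` and `i=2`, `norm_squared(symmetrized_state(5, 2))` evaluates to 1 within rounding, and the dictionaries returned by `symmetrized_state(5, 2)` and `rhs_of_eq7(5, 2)` agree entry by entry. The only place the two product states overlap is the key (i, i, i), which is why the correction term in Eq. 7 carries the factor (1 − √2).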

For incoming fermionic systems the arguments are similar, but somewhat more com-

plicated. In this case anti-symmetrization gives

$$\Psi_{[SD]} = \frac{1}{\sqrt{2(N-1)}} \sum_{j \neq i} \left(\psi_{Si}\,\psi_{Dj} - \psi_{Sj}\,\psi_{Di}\right)\bar{\psi}_{Dj} \tag{8}$$

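Since the detector includes all other states but must exclude the state $\psi_i$ due to anti-symmetrization (exclusion principle), we can actually consider the sums in Eq. 8 as involving hole-antihole pair states $\psi^h_{Di}\,\bar{\psi}^h_{Di}$ corresponding to $\psi_i$. So we have for the combined system of the incoming microsystem $\psi_i$ and the detector:

$$\Psi_{[SD]} \sim \left(\psi_{Si}\,\psi^h_{Di} - \psi_{Di}\,\psi^h_{Si}\right)\bar{\psi}^h_{Di} \tag{9}$$

with the definitions:

$$\begin{aligned}
\psi^h_{Di}\,\bar{\psi}^h_{Di} &= \sum_{j \neq i} \psi_{Dj}\,\bar{\psi}_{Dj} \\
\psi^h_{Si}\,\bar{\psi}^h_{Di} &= \sum_{j \neq i} \psi_{Sj}\,\bar{\psi}_{Dj}
\end{aligned} \tag{10}$$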

In the above analysis we have not considered the eigen-basis of the detector. As

we have considered the symmetrization aspects only, the state ψi occurs as a natural

preferred vector and for the other states j ≠ i we can consider any set orthogonal to ψi.

5. Quantum Images and the No-cloning Theorem

The exchange interaction term due to (anti-)symmetrization contains a product of the incoming microstate $\psi_{Si}$ and a corresponding state in the detector, which is the same microstate $\psi_{Di}$ for bosons or a hole $\psi^h_{Di}$ for fermions. Associated with such a pair is a conjugate state, $\bar{\psi}_{Di}$ for bosons or $\bar{\psi}^h_{Di}$ for fermions. In the case of the bosonic systems we shall call the latter conjugate state an image of the original incoming state created by the symmetrization process. We do not consider the symmetric identical state $\psi_{Di}$ as the image, because the identical state nominally in the detector is indistinguishable from the original incoming state, and when there is an overlap of functions they may represent the same physical entity. In the case of fermionic systems the incoming state $\psi_{Si}$ and the corresponding hole state $\psi^h_{Di}$ or its conjugate $\bar{\psi}^h_{Di}$ are in general all nonidentical systems. Since the incoming state is definitely not $\psi^h_{Si}$, we can neglect the second term in Eq. 9. Hence the antisymmetrization effectively gives a simple product, as for a bosonic system:

$$\Psi^{\mathrm{ferm}}_{SD} = \psi_{Si}\,\psi^h_{Di}\,\bar{\psi}^h_{Di} \tag{11}$$

However, since the hole is more like a conjugate and the conjugate of the hole is more like the original incoming microsystem, we can expect that both $\psi_{Si}$ and $\bar{\psi}^h_{Di}$ interact in a similar manner with $\psi^h_{Di}$.

There is no conflict with the no-cloning theorem [12] when (anti-)symmetrization

produces such quantum images, which, as we have seen, are either extensions of the

original functions, or are conjugate states. Though there is a one-to-one correspondence

with the incoming state, the states in the detector simply extend the original state by

(anti-)symmetry or produce a state which is conjugate to the original state, and is not

producible by a unitary operator assumed in the no-cloning theorem. In other words,

(anti-)symmetrization and the consequent exchange interactions are not producible by

the linear unitary operators and the simple and elegant proof of the no-cloning theorem

is inappropriate for quantum images of the kind described above.

6. Measurement and Eigenstates

Quantum images, formed by invoking symmetrization properties of the combined

system, do not depend on the operator involved in the measurement process associated

with the detector. The quantity measured is represented by a Hermitian operator in quantum mechanics. If the microsystem is in an eigenstate of that operator, it remains in the same state even after measurement. If it is a superposition of eigenstates of the operator, then it is taken as a postulate of quantum mechanics that the emerging state after measurement is one of the eigenstates, and the detector too carries off the information of the final state to which it collapses. We have shown recently [8] how a first passage random walk model reduces

an arbitrary linear combination of eigenstates to one of the component eigenstates with a

probability proportional to the square of the absolute magnitude of the coefficient of that

component. In that work we appealed to an electrostatic analogy for the formation of

the image in the detector which interacts with the incoming microsystem in steps, both

changing simultaneously till an eigenstate is reached.

If the state ψSi is expressed in terms of the eigenstates in a simple two-state system


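$$\psi_{Si} = a_i\,|\alpha\rangle_S + b_i\,|\beta\rangle_S \tag{12}$$

then we get

$$\psi_{Di} = a_i\,|\alpha\rangle_D + b_i\,|\beta\rangle_D \tag{13}$$

and

$$\bar{\psi}_{Di} = a_i^{*}\,|\alpha\rangle_D + b_i^{*}\,|\beta\rangle_D \tag{14}$$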

and similarly for the hole states in the case of the fermionic systems.

This shows how the complex conjugates of the coefficients occur in a natural way in the image, which is not possible by cloning with a unitary operator.
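The remark about unitary operators can be made concrete with a small check. A unitary operator is linear, so it must commute with multiplication by any complex scalar, whereas the map that conjugates the coefficients of an arbitrary input state in a fixed basis does not. The sketch below only illustrates that standard fact; the function names and the sample state are hypothetical.

```python
def conjugate_image(coeffs):
    """Map (a_i, b_i) -> (a_i*, b_i*) in the fixed basis |alpha>, |beta>, as in Eq. 14."""
    return [c.conjugate() for c in coeffs]

def scale(coeffs, factor):
    """Multiply every coefficient of the state by the same scalar."""
    return [factor * c for c in coeffs]

def is_linear_on(coeffs, factor, tol=1e-12):
    """True if forming the image commutes with scalar multiplication for this sample."""
    left = conjugate_image(scale(coeffs, factor))
    right = scale(conjugate_image(coeffs), factor)
    return all(abs(l - r) < tol for l, r in zip(left, right))

state = [0.6 + 0.0j, 0.8j]        # arbitrary normalized sample (a_i, b_i)
print(is_linear_on(state, 2.0))   # real factor: the two orders agree
print(is_linear_on(state, 1j))    # imaginary factor: they differ, so the map is not linear
```

Conjugation is antilinear: a scalar factor comes out conjugated. That is why the image with coefficients a*, b* falls outside the linear unitary maps assumed in the no-cloning argument.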

Here we also see that the conjugate can interact interchangeably with the incoming

state or its indistinguishable extension in the detector and form virtual bound states

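$$\begin{aligned}
|SD\rangle_i &\sim |a_i|^2\,|\alpha\rangle_S|\alpha\rangle_D + |b_i|^2\,|\beta\rangle_S|\beta\rangle_D \\
|DD\rangle_i &\sim |a_i|^2\,|\alpha\rangle_D|\alpha\rangle_D + |b_i|^2\,|\beta\rangle_D|\beta\rangle_D
\end{aligned} \tag{15}$$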

We can now think of the initial state of the virtual bound (SD) system as a point in a real space $\{x_p = |a_p(i)|^2\}$ of n dimensions, if the microsystem can have n different eigenvalues of the operator representing the quantity to be observed. Here the running index p replaces the α and β of the 2-dimensional case and labels the eigenvalue.

$$|\psi\rangle_i = \sum_{p=1}^{n} x_p(i)\,|S\rangle_p\,|D\rangle_p \tag{16}$$

The process of interaction between the detector and the microsystem proceeds as a

first passage random walk in this x space, with the constraint

$$\sum_p x_p(t) = 1 \tag{17}$$

which describes a hyperplane in the n-dimensional x space, restricted to the sector $0 \le x_p(t) \le 1$, i.e. a simplex. We are also now using the notation of time or step t, with $x_p$ beginning at the initial values of the co-ordinates.

The random walk can be described [9] by a diffusion equation for small steps. The

concentration of path points, i.e. the probability c of finding the system at x at time t,

can be found [8, 9] from an integrable Green’s function of the corresponding equation for

the Laplace transform $\tilde{c}(x, s)$ of c, with D a diffusion constant:

$$\nabla^2 \tilde{c}(x, s) - (s/D)\,\tilde{c}(x, s) = -c(x, t = 0)/D \tag{18}$$
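For reference, Eq. 18 follows in one step if the underlying equation is taken to be the standard diffusion equation. That is an assumption here, since the text only refers to a diffusion equation. Writing $\tilde{c}(x, s)$ for the Laplace transform of $c(x, t)$, and assuming $e^{-st}c(x,t)$ vanishes as $t \to \infty$:

```latex
\begin{aligned}
\frac{\partial c(x,t)}{\partial t} &= D\,\nabla^2 c(x,t), \\
\tilde{c}(x,s) = \int_0^\infty e^{-st}\,c(x,t)\,dt
  \;&\Longrightarrow\;
  \int_0^\infty e^{-st}\,\frac{\partial c}{\partial t}\,dt = s\,\tilde{c}(x,s) - c(x,t=0), \\
s\,\tilde{c}(x,s) - c(x,t=0) = D\,\nabla^2 \tilde{c}(x,s)
  \;&\Longrightarrow\;
  \nabla^2 \tilde{c}(x,s) - \frac{s}{D}\,\tilde{c}(x,s) = -\frac{c(x,t=0)}{D}.
\end{aligned}
```

The initial distribution therefore enters as a source term, which is what makes a Green's function treatment natural.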


The interesting thing about this random walk is that, whenever a path reaches a

co-ordinate at an edge of the plane of motion, with say xq = 0, the walk continues in the

lower dimensional sub-simplex confined to this fixed value of $x_q$. Eventually, the path ends at a vertex, say, $x_f = 1$, with all other $x_{p \neq f} = 0$, and the probability for reaching it is obtained [8] from the gradient of c at the vertex.

$$p_f = k\,|a_f(t = 0)|^2 \tag{19}$$

Usual quantum mechanics postulates this relation, and considers a derivation impossi-

ble. Here k represents the detector efficiency, which includes the strength of the coupling

between S and D.
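The proportionality in Eq. 19 can be illustrated with a toy simulation. The sketch below is not the model of [8]. It assumes a lattice version of the walk in which the co-ordinates are multiples of 1/M, each step moves one unit between two randomly chosen nonzero co-ordinates, and a co-ordinate that reaches zero stays there. Under these assumptions every co-ordinate is a martingale, so the chance of ending at vertex f equals its starting value, which corresponds to k = 1. All names and parameter values are illustrative.

```python
import random

def walk_to_vertex(counts, rng):
    """Run one lattice walk on the simplex and return the index of the final vertex."""
    counts = list(counts)
    total = sum(counts)
    while max(counts) < total:
        # Only co-ordinates that are still nonzero take part; a co-ordinate
        # that has hit zero confines the walk to the lower sub-simplex.
        active = [p for p, c in enumerate(counts) if c > 0]
        source, target = rng.sample(active, 2)
        counts[source] -= 1
        counts[target] += 1
    return counts.index(total)

def vertex_frequencies(counts, trials=4000, seed=0):
    """Fraction of walks that end at each vertex, starting from the given counts."""
    rng = random.Random(seed)
    hits = [0] * len(counts)
    for _ in range(trials):
        hits[walk_to_vertex(counts, rng)] += 1
    return [h / trials for h in hits]

if __name__ == "__main__":
    start = [5, 3, 2]  # M = 10, so x(0) = (0.5, 0.3, 0.2)
    print(vertex_frequencies(start))
```

Starting from counts (5, 3, 2), the printed frequencies should lie close to 0.5, 0.3 and 0.2, up to sampling noise. A detector efficiency k below 1 would need an extra loss channel that this sketch does not model.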

7. Conclusions

We have shown above that if the detector is a macroscopic system and is initially

neutral with respect to the measured quantity, which we have expressed as the sum

of microstates with all different states, then symmetrization with the measured system

for bosonic systems or anti-symmetrization for fermionic systems breaks the neutrality

in a unique way which may be regarded as the formation of a quantum image of the

measured microsystem in the detector. These images are conjugates of the incoming

microsystems, or hole-type states equivalent to conjugate states, and since the process is

not a linear unitary operation, the no-cloning theorem does not pose a problem. That the

interaction between the incoming state and these images can be modeled by first passage

random walks to give the probabilities for different eigenstates as final states of both the

incoming state and the detector’s microstate component has been shown in [8]. We shall

later examine the question of measurement of entangled systems in spatially separated

detectors.

Acknowledgements

The author would like to thank Professor R. Omnes of Universite de Paris, Orsay,

for useful feedback from the earlier work.


References

[1] Zurek W H, Rev. Mod. Phys. 75, 715 (2003)

[2] Bub J, Interpreting the Quantum World (Cambridge U.P., Cambridge, England, 1997)

[3] Pearle P, Phys. Rev. D 13, 857 (1976)

[4] Ghirardi G C, Rimini A and Weber T, Phys. Rev. D 34, 470 (1986)

[5] Diosi L, Phys. Lett. A 129, 419 (1988)

[6] Gisin N and Percival I C, J. Phys. A 25, 5677 (1992)

[7] Adler S L, Brody D C, Brun T A and Hughston L P, J. Phys. A 34, 8795 (2001)

[8] Shafee F, Preprint quant-ph/0502111 (2005)

[9] Redner S, A Guide to First-Passage Processes (Cambridge U.P., Cambridge, England, 2001)

[10] Omnes R, Phys. Rev. D 71, 065011 (2005)

[11] York M J, Preprint quant-ph/9908078 (1999)

[12] Wootters W K and Zurek W H, Nature 299, 802 (1982)
