Foundations of Probability and Physics

PQ-QP: Quantum Probability and White Noise Analysis, Volume XIII. Proceedings of the Conference Foundations of Probability and Physics. Edited by A. Khrennikov. World Scientific.


PQ-QP: Quantum Probability and White Noise Analysis

Managing Editor: W. Freudenberg
Advisory Board Members: L. Accardi, T. Hida, R. Hudson and K. R. Parthasarathy

PQ-QP: Quantum Probability and White Noise Analysis

Vol. 13: Foundations of Probability and Physics, ed. A. Khrennikov

QP-PQ

Vol. 10: Quantum Probability Communications, eds. R. L. Hudson and J. M. Lindsay

Vol. 9: Quantum Probability and Related Topics, ed. L. Accardi

Vol. 8: Quantum Probability and Related Topics, ed. L. Accardi

Vol. 7: Quantum Probability and Related Topics, ed. L. Accardi

Vol. 6: Quantum Probability and Related Topics, ed. L. Accardi

Proceedings of the Conference
Foundations of Probability and Physics
Växjö, Sweden, 25 November - 1 December 2000

Edited by A. Khrennikov, University of Växjö, Sweden

World Scientific
New Jersey, London, Singapore, Hong Kong

Published by

World Scientific Publishing Co. Pte. Ltd.

P O Box 128, Farrer Road, Singapore 912805

USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

FOUNDATIONS OF PROBABILITY AND PHYSICS
PQ-QP: Quantum Probability and White Noise Analysis, Vol. 13

Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4846-6

Printed in Singapore by World Scientific Printers (S) Pte Ltd


Foreword

With the present proceedings of a conference on Foundations of Probability and Physics we continue the QP series, the first volume of which appeared more than twenty years ago. The series had its origin in proceedings of conferences and workshops on quantum probability and related topics. Initially published by Springer-Verlag, the series has now been published by World Scientific for about ten years. Much has changed in the world of quantum probability in the last two decades. Quantum probabilistic methods have become a mature subject in mathematics and mathematical physics. The number of well-established scientists who have turned their scientific interest to the field of quantum probability is increasing impressively. Scientifically and numerically strong schools of quantum probability have evolved in the past years. Moreover, the highly interdisciplinary character of quantum probability has become more and more evident. In particular, the close connections to white noise analysis aroused the interest of classical and quantum probabilists and stimulated mutual exchange and cooperation fruitful for both parties.

Taking this development into account, during the previous QP conferences we discussed comprehensively and in detail the future profile and main goals of the series. Some changes in the alignment and the objectives of the series resulted from these discussions. First of all, the new title reflects the intention to unify white noise analysis and quantum probability. It is important and essential to bring together classical and quantum probabilists, and the success of the World Scientific journal Infinite Dimensional Analysis, Quantum Probability and Related Topics shows that such an alliance will benefit both parties. Furthermore, we should be open to a wide audience of scientists and to a broad spectrum of themes. The present volume represents such a field, being not very closely connected to quantum probability and white noise analysis but of general interest to the readership of the series.

Future volumes of the series will include proceedings of conferences or workshops and lecture notes of schools, but also monographs on topics in quantum probability and white noise analysis.

Finally, we would like to thank all former editors of the series for the excellent job they did. We especially appreciate the enthusiastic commitment of Luigi Accardi, who initiated the series and was the responsible editor for many years.

Wolfgang Freudenberg


Contents

Foreword

Preface

Locality and Bell's Inequality (L. Accardi and M. Regoli)

Refutation of Bell's Theorem (G. Adenier)

Probability Conservation and the State Determination Problem (S. Aerts)

Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws (H. Atmanspacher, R. C. Bishop and A. Amann)

Interpretations of Probability and Quantum Theory (L. E. Ballentine)

Forcing Discretization and Determination in Quantum History Theories (B. Coecke)

Interpretations of Quantum Mechanics, and Interpretations of Violation of Bell's Inequality (W. M. De Muynck)

Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble (M. M. Duras)

Some Remarks on Hardy Functions Associated with Dirichlet Series (W. Ehm)

Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit (D. H. E. Gross)

An Approach to Quantum Probability (S. Gudder)

Innovation Approach to Stochastic Processes and Quantum Dynamics (T. Hida)

Statistics and Ergodicity of Wave Functions in Chaotic Open Systems (H. Ishio)

Origin of Quantum Probabilities (A. Khrennikov)

Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies (A. Khrennikov)

Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta? (A. F. Kracklauer)

A Probabilistic Inequality for the Kochen-Specker Paradox (J.-A. Larsson)

Quantum Stochastics: The New Approach to the Description of Quantum Measurements (E. Loubenets)

Abstract Models of Probability (V. M. Maximov)

Quantum K-Systems and their Abelian Models (H. Narnhofer)

Scattering in Quantum Tubes (B. Nilsson)

Position Eigenstates and the Statistical Axiom of Quantum Mechanics (L. Polley)

Is Random Event the Core Question? Some Remarks and a Proposal (P. Rocchi)

Constructive Foundations of Randomness (V. I. Serdobolskii)

Structure of Probabilistic Information and Quantum Laws (J. Summhammer)

Quantum Cryptography in Space and Bell's Theorem (Volovich)

Interacting Stochastic Process and Renormalization Theory (Y. Volovich)


Preface

This volume constitutes the proceedings of the Conference Foundations of Probability and Physics, held in Växjö (Småland, Sweden) from 25 November to 1 December 2000.

The Organizing Committee of the Conference: L. Accardi (Rome, Italy), W. De Muynck (Eindhoven, the Netherlands), T. Hida (Meijo University, Japan), A. Khrennikov (Växjö University, Sweden), and U. V. Maximov (Belostok, Poland).

The purpose of the Conference (tentatively the first of a series) was to bring together scientists (physicists as well as mathematicians) who are interested in probabilistic foundations of physics. An emphasis was placed on both theory and experiment, the underlying objective being to offer to the physical and mathematical scientific communities a truly interdisciplinary Conference as a privileged place for scientific interaction among theoreticians and experimentalists. Owing to the increased role of probabilistic foundations in physical applications (Einstein-Podolsky-Rosen correlation experiments, Bell's inequality, quantum information, computing and teleportation), as well as the necessity to reconsider foundations at the beginning of the new millennium, the organizers of the Conference decided that it was just the right time for taking the scientific risk of trying this.

Since the creation of Statistical Mechanics, probabilistic description has played a more and more important role in physics. The new crucial step in the development of the statistical approach to physics was made in the process of the creation of quantum mechanics. The founders of quantum theory recognized that the quantum formalism could not provide the description of physical processes for individual elementary particles. The understanding of this surprising fact induced numerous debates on the possibilities of individual and probabilistic descriptions and the relations between them. These debates are characterized by a large diversity of opinions on the origin of quantum stochasticity.

One of the viewpoints is that quantum stochasticity differs from classical stochasticity, so that quantum (statistical) mechanics cannot be reduced to classical statistical mechanics. This viewpoint implies the conventional interpretation of quantum mechanics.

By this interpretation we cannot use objective realism in the quantum description of reality. The very fundamental physical quantities, such as, for example, the position and momentum of an elementary particle, cannot be considered as properties of the object, the elementary particle. The elementary particle can be in a state that is a superposition of alternatives. Only the act of a measurement gives the possibility to choose between these alternatives.


We recall the historical roots of the origin of such a viewpoint, namely the idea of superposition.

In fact the whole quantum building was built on two experimental cornerstones: 1) the experiment on photoelectric emission; 2) the two slit experiment.

The first experiment definitely demonstrated that light has a corpuscular structure (discrete structure of energy).

However, the second experiment demonstrated that photons (corpuscular objects) do not follow the standard CLASSICAL STATISTICS. The conventional rule for the addition of probabilistic alternatives,

$$P = P_1 + P_2$$

is violated in the interference experiments. Instead of this rule, the probabilities observed in interference experiments follow the quantum rule for the addition of probabilistic alternatives:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\cos\theta$$

Thus, in general, the classical rule is perturbed by the $\cos\theta$-factor. The appearance of NEW STATISTICS induced the revolution in theoretical physics: the reconsideration of the role of all basic elements of the physical theory. The common opinion was (and is) that the quantum probabilistic rule could not be explained by a purely corpuscular model. To explain this rule we must appeal to wave arguments (see, for example, Dirac's book for the detailed analysis of the roots of the quantum mechanical formalism).
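As an illustrative sketch only (it is not part of the original preface), the classical rule $P = P_1 + P_2$ and the interference rule $P = P_1 + P_2 + 2\sqrt{P_1 P_2}\cos\theta$ can be compared numerically. The function names `classical_sum` and `interference_sum` and the sample values of $P_1$, $P_2$ and $\theta$ are assumptions chosen for the example.

```python
import math


def classical_sum(p1: float, p2: float) -> float:
    """Classical rule for the addition of probabilistic alternatives."""
    return p1 + p2


def interference_sum(p1: float, p2: float, theta: float) -> float:
    """Quantum rule: the classical sum perturbed by the cos(theta) term."""
    return p1 + p2 + 2.0 * math.sqrt(p1 * p2) * math.cos(theta)


# Hypothetical sample values: two equally likely alternatives.
p1, p2 = 0.25, 0.25
for theta in (0.0, math.pi / 2, math.pi):
    # Print the angle, the classical value and the perturbed value.
    print(theta, classical_sum(p1, p2), interference_sum(p1, p2, theta))
```

In this sketch the two rules agree when $\cos\theta = 0$, and the perturbation is largest in magnitude at $\theta = 0$ and $\theta = \pi$.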

This implies the wave-particle dualism and Bohr's principle of complementarity. This was the crucial change of the whole picture of physical reality (at least at the micro-level).

We underline again that all these revolutionary changes had a purely probabilistic root, namely the appearance of the new probabilistic rule. We also underline that the founders of quantum mechanics in fact did not provide a deep probabilistic analysis of the problem. Instead, they analysed other elements of the physical model. Such an analysis induced the new description of physical reality that we have already discussed, namely quantum reality. We will never know the real reasons for such a development of the theoretical study of the results of experiments with elementary particles at the beginning of the last century.

a Of course, we must also mention that the necessity for a departure from classical mechanics was shown by experiments demonstrating the remarkable stability of atoms and molecules. The forces known in classical electrodynamics are inadequate for the explanation of this phenomenon. However, the quantum mechanical explanation of such stability is in fact based on the same arguments as the explanation of the photoelectric effect.

b P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).

It might be that one of the reasons was the absence of the mathematical theory of probability. A. N. Kolmogorov proposed the modern axiomatics of probability theory only in 1933.

During the round table at this conference, Prof. T. Hida and Prof. I. Volovich pointed out the fundamental role of direct contacts between physicists and mathematicians in the creation of new physical theories. It may be that the absence of direct collaboration between the quantum physical and probabilistic communities was the main root of the absence of a deep probabilistic analysis of quantum behaviour.

Debates on the foundations of quantum mechanics were continued with new excitement in connection with the Einstein-Podolsky-Rosen (EPR) paradox. Unfortunately, the probabilistic element played a minor role in the EPR considerations. The notion of probability one was used (in a rather formal way) in the formulation of the sufficient condition to be an element of physical reality. A new probabilistic impulse to debates on the foundations of quantum mechanics was given by Bell's inequality. However, we must recognize that Bell's probabilistic considerations were performed on a formal level that cannot be considered satisfactory (at least from the point of view of a mathematician). It may be that this absence of a deep probabilistic analysis of the EPR and Bell arguments was one of the main reasons for concentrating investigations in the direction of nonlocality and no-go theorems for hidden variables.

The main aim of the conference Foundations of Probability and Physics was to provide a probabilistic analysis of the foundations of physics, classical as well as quantum (in particular, the EPR and Bell arguments). The present volume contains the results of such analysis. It gives the general picture of the probabilistic foundations of modern physics. Foundations of probability were considered in close connection to foundations of physics. We demonstrated that probability plays a fundamental role in models of physical reality. It seems to be impossible to split probabilistic and physical problems. On one hand, many important problems that look purely physical are in fact just probabilistic problems. On the other hand, the right meaning of probability can be found only on the basis of physical investigations. Such a meaning depends strongly on a physical model.

The conference and the present volume give a good example of fruitful collaboration between physicists and mathematicians, and stimulate research on the foundations of probability and physics, especially quantum physics.

We would like to thank the Swedish Natural Science Foundation, the Swedish Technical Science Foundation, Växjö University and Växjö Commune for financial support that made the Conference possible. We would also like to thank Prof. Magnus Söderström, the Rector of Växjö University, for support of fundamental investigations and, in particular, this Conference.

Andrei Khrennikov
International Center for Mathematical Modelling in Physics and Cognitive Sciences, University of Växjö, Sweden
December 2000


LOCALITY AND BELL'S INEQUALITY

LUIGI ACCARDI, MASSIMO REGOLI
Centro Vito Volterra, Università di Roma "Tor Vergata", Roma, Italy
Email: accardi@volterra.mat.uniroma2.it

We prove that the locality condition is irrelevant to the Bell inequality. We check that the real origin of Bell's inequality is the assumption of applicability of classical (Kolmogorovian) probability theory to quantum mechanics. We describe the chameleon effect, which allows one to construct an experiment realizing a local, realistic, classical, deterministic and macroscopic violation of the Bell inequalities.

1 Inequalities among numbers

In this section we summarize some elementary inequalities among numbers which correspond to different forms of the Bell inequality one meets in the literature. Since some confusion has arisen about the mutual relationships among these inequalities, in particular their (in)equivalence and the cases of equality, such a summary might not be totally useless.

Lemma (1) For any two numbers $a, c \in [-1, 1]$ the following equivalent inequalities hold:

$$|a \pm c| \le 1 \pm ac \tag{1}$$

Moreover, equality in (1) holds if and only if either $a = \pm 1$ or $c = \pm 1$.

Proof. The equivalence of the two inequalities (1) follows from the fact that one is obtained from the other by changing the sign of $c$, and $c$ is arbitrary in $[-1, 1]$.

Since for any $a, c \in [-1, 1]$ one has $1 \pm ac \ge 0$, (1) is equivalent to

$$|a \pm c|^2 = a^2 + c^2 \pm 2ac \le (1 \pm ac)^2 = 1 + a^2c^2 \pm 2ac$$

and this is equivalent to

$$a^2(1 - c^2) + c^2 \le 1$$

which is identically satisfied because $1 - c^2 \ge 0$ and therefore

$$a^2(1 - c^2) + c^2 \le 1 - c^2 + c^2 = 1 \tag{2}$$

Notice that in (2), when $c^2 < 1$, equality holds if and only if $a^2 = 1$, i.e. $a = \pm 1$. Since exchanging $a$ and $c$ in (1) leaves the inequality unchanged, the thesis follows.
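The statement of Lemma (1) can be spot-checked numerically. The following is a minimal sketch, not a proof: it evaluates both signs of $|a \pm c| \le 1 \pm ac$ on a grid in $[-1, 1]$ and compares the set of equality points (for the plus sign) with the set of points where $a = \pm 1$ or $c = \pm 1$. The helper name `lemma1_gap`, the grid spacing and the tolerance `TOL` are assumptions made for the illustration.

```python
import itertools

TOL = 1e-12  # tolerance for floating-point comparisons (an assumption)


def lemma1_gap(a: float, c: float, sign: int) -> float:
    """Return (1 +/- ac) - |a +/- c|; Lemma (1) says this is never negative."""
    return (1 + sign * a * c) - abs(a + sign * c)


# Grid of sample points in [-1, 1], endpoints included.
grid = [i / 10 for i in range(-10, 11)]
pairs = list(itertools.product(grid, grid))

# The inequality should hold for both signs at every grid point.
holds_everywhere = all(
    lemma1_gap(a, c, s) >= -TOL for a, c in pairs for s in (+1, -1)
)

# Points where equality is attained (gap numerically zero) for the "+" sign,
# compared with the points where a or c lies on the boundary.
equality_points = {(a, c) for a, c in pairs if abs(lemma1_gap(a, c, +1)) <= TOL}
boundary_points = {(a, c) for a, c in pairs if abs(a) == 1 or abs(c) == 1}

print(holds_everywhere, equality_points == boundary_points)
```

On this grid the gap factorizes as $(1 \mp a)(1 \mp c)$ up to sign conventions, which is why equality appears only on the boundary.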

Corollary (2) For any three numbers $a, b, c \in [-1, 1]$ the following equivalent inequalities hold:

$$|ab \pm cb| \le 1 \pm ac \tag{3}$$

and equality holds if and only if $b = \pm 1$ and either $a = \pm 1$ or $c = \pm 1$.

Proof. For $b \in [-1, 1]$,

$$|ab \pm cb| = |b| \cdot |a \pm c| \le |a \pm c| \tag{4}$$

so the thesis follows from Lemma (1). In (4) equality holds if and only if $b = \pm 1$, so also the second statement follows from Lemma (1).

Lemma (3) For any numbers $a, a', b, b', c \in [-1, 1]$ one has

$$|ab - bc| + |ab' + b'c| \le 2 \tag{5}$$

$$ab + ab' + a'b' - a'b \le 2 \tag{6}$$

In (5) equality holds if and only if $b, b' = \pm 1$ and either $a$ or $c = \pm 1$.

Proof. Adding the two inequalities in (3) one finds (5). The left hand side of (6) is $\le$ than

$$|ab - ba'| + |ab' + b'a'| \tag{7}$$

and replacing $a'$ by $c$, (7) becomes the left hand side of (5). Therefore (6) holds. If $b, b' = \pm 1$ and either $a$ or $c = \pm 1$, equality holds in (3), hence in (5). Conversely, suppose that equality holds in (5), written with $a'$ in place of $c$ as in (7), and suppose that either $|b| < 1$ or $|b'| < 1$. Then we arrive at the contradiction

$$2 = |b| \cdot |a - a'| + |b'| \cdot |a + a'| < |a - a'| + |a + a'| \le (1 - aa') + (1 + aa') = 2 \tag{8}$$

So, if equality holds in (5), we must have $|b| = |b'| = 1$. In this case (5) becomes

$$|a - a'| + |a + a'| = 2 \tag{9}$$

and we know from Lemma (1) that the identity (9) can take place if and only if either $a$ or $a' = \pm 1$.
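As a hedged numerical illustration of Lemma (3), the sketch below samples random points of $[-1, 1]^5$ and records the largest values found for the left hand sides $|ab - bc| + |ab' + b'c|$ and $ab + ab' + a'b' - a'b$. The names `lhs5` and `lhs6`, the seed and the sample size are assumptions of the example; `a2` and `b2` stand for $a'$ and $b'$.

```python
import random

TOL = 1e-12  # floating-point tolerance (an assumption)


def lhs5(a, b, b2, c):
    """Left-hand side of (5): |ab - bc| + |ab' + b'c| (b2 stands for b')."""
    return abs(a * b - b * c) + abs(a * b2 + b2 * c)


def lhs6(a, a2, b, b2):
    """Left-hand side of (6): ab + ab' + a'b' - a'b (a2, b2 stand for a', b')."""
    return a * b + a * b2 + a2 * b2 - a2 * b


rng = random.Random(0)  # fixed seed so that the run is reproducible
worst5 = worst6 = float("-inf")
for _ in range(100_000):
    a, a2, b, b2, c = (rng.uniform(-1.0, 1.0) for _ in range(5))
    worst5 = max(worst5, lhs5(a, b, b2, c))
    worst6 = max(worst6, lhs6(a, a2, b, b2))

# Neither sampled maximum should exceed the bound 2.
print(worst5 <= 2 + TOL, worst6 <= 2 + TOL)

# Boundary cases in which the bound is attained.
print(lhs5(1.0, 1.0, 1.0, 0.3), lhs6(1.0, 1.0, 1.0, 1.0))
```

Random sampling cannot establish the inequalities; it only fails to find a counterexample, which is consistent with the proof above.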

Corollary (4) If $a, a', b, b', c \in \{-1, 1\}$ then the inequalities (3), (6) and (5) are equivalent and equality holds in all of them.

Proof. From Lemma (1) we know that the inequalities (1) and (2) are equivalent. From Lemma (3) we know that (3) implies (5). Choosing $b' = a$ in (5), since $a = \pm 1$, (5) becomes

$$|ab - cb| \le 1 - ac$$

which is (3). The left hand side of (6) is

$$a(b + b') + a'(b' - b) \tag{10}$$

In our assumptions either $(b + b')$ or $(b' - b)$ is zero, so (10) is either equal to $a(b + b')$, with

$$|a(b + b')| = |b + b'| = 2$$

or to $a'(b' - b)$, with

$$|a'(b' - b)| = |b' - b| = 2$$
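Since the variables in Corollary (4) take only the values $\pm 1$, the claims can be checked by exhaustive enumeration. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, `a2` and `b2` stand for $a'$ and $b'$, and for (6) it checks that the expression $a(b + b') + a'(b' - b)$ has absolute value 2, since its sign depends on the choice of values.

```python
import itertools

PM1 = (-1, 1)  # the two admissible values


def equality_in_3(a, b, c):
    """Both signs of (3) hold with equality: |ab +/- cb| == 1 +/- ac."""
    return (abs(a * b + c * b) == 1 + a * c
            and abs(a * b - c * b) == 1 - a * c)


def equality_in_5(a, b, b2, c):
    """(5) holds with equality: |ab - bc| + |ab' + b'c| == 2."""
    return abs(a * b - b * c) + abs(a * b2 + b2 * c) == 2


def abs_of_10(a, a2, b, b2):
    """Absolute value of expression (10): a(b + b') + a'(b' - b)."""
    return abs(a * (b + b2) + a2 * (b2 - b))


all_equal_3 = all(equality_in_3(a, b, c)
                  for a, b, c in itertools.product(PM1, repeat=3))
all_equal_5 = all(equality_in_5(a, b, b2, c)
                  for a, b, b2, c in itertools.product(PM1, repeat=4))
all_abs_10_is_2 = all(abs_of_10(a, a2, b, b2) == 2
                      for a, a2, b, b2 in itertools.product(PM1, repeat=4))

print(all_equal_3, all_equal_5, all_abs_10_is_2)
```

Integer arithmetic is exact here, so no tolerance is needed in the comparisons.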

Corollary (5) If $a, b, c \in (-1, 1)$ then the inequality (5), hence a fortiori (6), is strictly weaker than (3).

Proof. We have already proved that (3) implies (5), hence (6). On the other hand, (5) is equivalent to

$$|ab - bc| \le (1 - ac) + (1 + ac - |ab' + b'c|) \tag{11}$$

By Lemma (1), $1 + ac - |ab' + b'c| \ge 0$, and equality holds if and only if $|b'| = 1$ and either $a$ or $c$ is $\pm 1$. From this the thesis follows.

2 The Bell inequality

Corollary (1) (Bell inequality) Let $A, B, C, D$ be random variables defined on the same probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, 1]$. Then the following inequalities hold:

$$|E(AB - BC)| \le 1 - E(AC) \tag{1}$$

$$|E(AB + BC)| \le 1 + E(AC) \tag{2}$$

$$|E(AB - BC)| + |E(AD + DC)| \le 2 \tag{3}$$

where $E$ denotes the expectation value in the probability space of the four variables. Moreover, (1) is equivalent to (2), and if either $A$ or $C$ has values $\pm 1$, then the three inequalities are equivalent.

Proof. Lemma (1.1) implies the following inequalities (interpreted pointwise on $\Omega$):

$$|AB - BC| \le 1 - AC$$

$$|AB + BC| \le 1 + AC$$

$$|AB - BC| + |AD + DC| \le 2$$

from which (1), (2), (3) follow by taking expectation and using the fact that $|E(X)| \le E(|X|)$. The equivalence is established by the same arguments as in Lemma (1.1).
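The step from the pointwise inequalities to the inequalities for expectations can be illustrated on an empirical sample. In the sketch below the four functions `A`, `B`, `C`, `D` are arbitrary illustrative choices (assumptions, not taken from the paper) with values in $[-1, 1]$, all evaluated at the same sample points, so that they live on one probability space. Because the pointwise bounds hold at every sample point, the sample averages satisfy (1), (2) and (3) up to rounding.

```python
import math
import random

rng = random.Random(42)  # fixed seed for reproducibility
N = 20_000
omegas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N)]


# Four illustrative random variables on the same sample space, all in [-1, 1].
def A(w):
    return math.cos(w)


def B(w):
    return math.sin(2.0 * w)


def C(w):
    return math.cos(w + 1.0)


def D(w):
    return 1.0 if w < math.pi else -1.0


def mean(f):
    """Empirical expectation over the common sample."""
    return sum(f(w) for w in omegas) / N


e_ac = mean(lambda w: A(w) * C(w))
e_ab_minus_bc = mean(lambda w: A(w) * B(w) - B(w) * C(w))
e_ab_plus_bc = mean(lambda w: A(w) * B(w) + B(w) * C(w))
e_ad_plus_dc = mean(lambda w: A(w) * D(w) + D(w) * C(w))

TOL = 1e-9  # allowance for accumulated rounding error
check1 = abs(e_ab_minus_bc) <= 1 - e_ac + TOL
check2 = abs(e_ab_plus_bc) <= 1 + e_ac + TOL
check3 = abs(e_ab_minus_bc) + abs(e_ad_plus_dc) <= 2 + TOL
print(check1, check2, check3)
```

The essential ingredient is that all four variables are functions of the same $\omega$; the sketch says nothing about correlations estimated from different samples.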

Remark (2) Bell's original proof, as well as almost all of the available proofs of Bell's inequality, deals only with the case of random variables assuming only the values $+1$ and $-1$. The present generalization is not without interest because it dispenses with the assumption that the classical random variables used to describe quantum observables have the same set of values as the latter ones: a hidden variable theory is required to reproduce the results of quantum theory only when the hidden parameters are averaged over.

Theorem (3) Let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_d^{(2)}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$. Then the following inequalities hold:

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| \le 1 - E(S_a^{(1)} S_c^{(1)}) \tag{4}$$

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_c^{(1)} S_b^{(2)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{5}$$

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| + |E(S_a^{(1)} S_d^{(2)}) + E(S_c^{(1)} S_d^{(2)})| \le 2 \tag{6}$$

Proof. This is a rephrasing of Corollary (2).

3 Implications of Bell's inequalities for the singlet correlations

To apply Bell's inequalities to the singlet correlations considered in the EPR paradox, it is enough to observe that they imply the following.

Lemma (1) In the ordinary three-dimensional Euclidean space there exist sets of three unit length vectors $a, b, c$ such that it is not possible to find a probability space $(\Omega, \mathcal{F}, P)$ and six random variables $S_x^{(j)}$ ($x = a, b, c$; $j = 1, 2$) defined on $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$, whose correlations are given by

$$E(S_x^{(1)} \cdot S_y^{(2)}) = -x \cdot y; \qquad x, y = a, b, c \tag{1}$$

where, if $x = (x_1, x_2, x_3)$, $y = (y_1, y_2, y_3)$ are two three-dimensional vectors, $x \cdot y$ denotes their Euclidean scalar product, i.e. the sum $x_1y_1 + x_2y_2 + x_3y_3$.

Remark. In the usual EPR-type experiments the random variables $S_a^{(j)}, S_b^{(j)}, S_c^{(j)}$ represent the spin (or polarization) of particle $j$ of a singlet pair along the three directions $a, b, c$ in space. The expression on the right-hand side of (1) is the singlet correlation of two spin or polarization observables, theoretically predicted by quantum theory and experimentally confirmed by the Aspect-type experiments.

Proof. Suppose that for any choice of the unit vectors $x = a, b, c$ there exist random variables $S_x^{(j)}$ as in the statement of the Lemma. Then, using Bell's inequality in the form (2.5) with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, we obtain

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{2}$$

Now notice that, if $x = y$ is chosen in (1), we obtain

$$E(S_x^{(1)} \cdot S_x^{(2)}) = -x \cdot x = -|x|^2 = -1; \qquad x = a, b, c$$

and, since $|S_x^{(j)}| \le 1$, this is possible if and only if $S_x^{(1)} = -S_x^{(2)}$ ($x = a, b, c$) $P$-almost everywhere. Using this, (2) becomes equivalent to

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(2)})$$

or, again using (1), to

$$|a \cdot b + b \cdot c| \le 1 + a \cdot c \tag{3}$$

If the three vectors $a, b, c$ are chosen to be in the same plane and such that $a$ is perpendicular to $c$ and $b$ lies between $a$ and $c$, forming an angle $\theta$ with $a$, then the inequality (3) becomes

$$\cos\theta + \sin\theta \le 1; \qquad 0 < \theta < \pi/2 \tag{4}$$

But the maximum of the function $\theta \mapsto \sin\theta + \cos\theta$ in the interval $[0, \pi/2]$ is $\sqrt{2}$ (obtained for $\theta = \pi/4$). Therefore, for $\theta$ close to $\pi/4$ the left-hand side of (4) will be close to $\sqrt{2}$, which is more than 1. In conclusion, for such a choice of the unit vectors $a, b, c$, random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}, S_c^{(2)}$ as in the statement of the Lemma cannot exist.
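The geometric choice made at the end of the proof can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it places $a$ and $c$ along two perpendicular axes, puts $b$ at angle $\theta$ from $a$ in their plane, and evaluates both sides of $|a \cdot b + b \cdot c| \le 1 + a \cdot c$. The function names `dot` and `inequality3_sides` and the coordinate choice are assumptions of the example.

```python
import math


def dot(u, v):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def inequality3_sides(theta: float):
    """Return (lhs, rhs) of |a.b + b.c| <= 1 + a.c for the planar configuration."""
    a = (1.0, 0.0, 0.0)
    c = (0.0, 1.0, 0.0)                      # a is perpendicular to c
    b = (math.cos(theta), math.sin(theta), 0.0)  # b lies between a and c
    lhs = abs(dot(a, b) + dot(b, c))
    rhs = 1.0 + dot(a, c)
    return lhs, rhs


lhs, rhs = inequality3_sides(math.pi / 4)
print(lhs, rhs, lhs > rhs)  # the left side exceeds the right side at pi/4

lhs0, rhs0 = inequality3_sides(0.0)
print(lhs0, rhs0)  # at theta = 0 the two sides coincide
```

At $\theta = \pi/4$ the left side is $\cos\theta + \sin\theta = \sqrt{2}$ while the right side is 1, which is the violation used in the proof.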

Definition (2) A local realistic model for the EPR (singlet) correlations is defined by:

(1) a probability space $(\Omega, \mathcal{F}, P)$;

(2) for every unit vector $x$ in the three-dimensional Euclidean space, two random variables $S_x^{(1)}, S_x^{(2)}$ defined on $\Omega$ and with values in the interval $[-1, +1]$, whose correlations, for any $x, y$, are given by equation (1).

Corollary (3) If $a, b, c$ are chosen so as to violate (4), then a local realistic model for the EPR correlations, in the sense of Definition (2), does not exist.

Proof. Its existence would contradict Lemma (1).

Remark. In the literature one usually distinguishes two types of local realistic models: deterministic and stochastic ones. Both are included in Definition (2): the deterministic models are defined by random variables $S_x^{(j)}$ with values in the set $\{-1, +1\}$, while in the stochastic models the random variables take values in the interval $[-1, +1]$. The original paper [7] was devoted to the deterministic case. Starting from [9], several papers have been written to justify the stochastic models. We prefer to distinguish the definition of the models from their justification.

4 Bell on the meaning of Bell's inequality

In the last section of [8] (submitted before [7] but published after) Bell briefly describes Bohm's hidden variable interpretation of quantum theory, underlining its nonlocal character. He then raises the point that "there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character" and, in a footnote added during the proof corrections, he claims that "Since the completion of this paper such a proof has been found [7]".

In the short Introduction to [7] Bell reaffirms the same ideas, namely that the result proven by him in this paper shows that any such [hidden variable] theory which reproduces exactly the quantum mechanical predictions must have a grossly nonlocal structure.

The proof goes along the following scheme: Bell proves an inequality in which, according to what he says (cf. the statement after formula (1) in [7]):

"The vital assumption [2] is that the result B for particle 2 does not depend on the setting a of the magnet for particle 1, nor A on b."

The paper [2] mentioned in the above statement is nothing but the Einstein, Podolsky, Rosen paper [11], and the locality issue is further emphasized by the fact that he reports the famous statement of Einstein [12]: "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system S2 is independent of what is done with the system S1, which is spatially separated from the former."

Stated otherwise: according to Bell, Bell's inequality is a consequence of the locality assumption.

It follows that a theory which violates the above mentioned inequality also violates the vital assumption needed, according to Bell, for its deduction, i.e. locality.

Since the experiments prove the violation of this inequality, Bell concludes that quantum theory does not admit a local completion; in particular, quantum mechanics is a nonlocal theory. To use again Bell's words: "the statistical predictions of quantum mechanics are incompatible with separable predetermination" ([7], p. 199). Moreover, this incompatibility has to be understood in the sense that, in a theory in which parameters are added to quantum mechanics to determine the results of individual measurements without changing the statistical predictions, there must be a mechanism whereby the setting of one measuring device can influence the reading of another instrument, however remote. Moreover, the signal involved must propagate instantaneously.

5 Critique of Bell's vital assumption

An assumption should be considered vital for a theorem if, without it, the theorem cannot be proved.

To favor Bell, let us require much less. Namely, let us agree to consider his assumption vital if the theorem cannot be proved by taking as its hypothesis the negation of this assumption.

If even this minimal requirement is not satisfied, then we must conclude that the given assumption has nothing to do with the theorem.

Notice that Bell expresses his locality condition by the requirement that the result B for particle 2 should not depend on the setting a of the magnet for particle 1 (cf. the citation in the preceding section). Let us denote by $M_1$ ($M_2$) the space of all possible measurement settings on system 1 (2).

Theorem (1) For each unit vector $x$ in the three-dimensional Euclidean space ($x \in \mathbb{R}^3$, $|x| = 1$) let be given two random variables $S_x^{(1)}, S_x^{(2)}$ (spin of particle 1 (2) in direction $x$), defined on a space $\Omega$ with a probability $P$ and with values in the 2-point set $\{+1, -1\}$. Fix 3 of these unit vectors $a, b, c$ and suppose that the corresponding random variables satisfy the following non locality condition [violating Bell's vital assumption]: suppose that the probability space $\Omega$ has the following structure:

$$\Omega = \Lambda \times M_1 \times M_2 \tag{1}$$

so that, for some functions $F_a^{(1)}, F_a^{(2)} : \Lambda \times M_1 \times M_2 \to [-1, 1]$,

$$S_a^{(1)}(\omega) = F_a^{(1)}(\lambda, m_1, m_2) \qquad (S_a^{(1)} \text{ depends on } m_2) \tag{2}$$

$$S_a^{(2)}(\omega) = F_a^{(2)}(\lambda, m_1, m_2) \qquad (S_a^{(2)} \text{ depends on } m_1) \tag{3}$$

with $m_1 \in M_1$, $m_2 \in M_2$, and similarly for $b$ and $c$ [nothing changes in the proof if we add further dependences; for example, $F_a^{(2)}$ may depend on all the $S_x^{(1)}(\omega)$ and $F_a^{(1)}$ on all the $S_x^{(2)}(\omega)$].

Then the random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the inequality

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle \tag{4}$$

If, moreover, the singlet condition

$$\langle S_x^{(1)} \cdot S_x^{(2)} \rangle = -1; \qquad x = a, b, c \tag{5}$$

is also satisfied, then Bell's inequality holds in the form

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 + \langle S_a^{(1)} S_c^{(2)} \rangle \tag{6}$$

Proof. The random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the assumptions of Corollary (2.3), therefore (4) holds. If also condition (5) is satisfied then, since the variables take values in the set $\{-1, +1\}$, with probability 1 one must have

$$S_x^{(1)} = -S_x^{(2)} \qquad (x = a, b, c) \tag{7}$$

and therefore $\langle S_a^{(1)} S_c^{(1)} \rangle = -\langle S_a^{(1)} S_c^{(2)} \rangle$. Using this identity, (4) becomes (6).

Summing up: Theorem (1) proves that Bell's inequality is satisfied if one takes as hypothesis the negation of his vital assumption. From this we conclude that Bell's vital assumption not only is not vital but in fact has nothing to do with Bell's inequality.
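The content of Theorem (1) can be illustrated with a small finite model. The sketch below is an assumption-laden toy, not the construction of the paper: the sample space is a finite product $\Lambda \times M_1 \times M_2$ with hypothetical sizes, the probability is an arbitrary random weight on it, and each variable is an arbitrary $\pm 1$-valued function of the whole triple $(\lambda, m_1, m_2)$, so that it depends on both settings. The inequality $|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle$ is then tested over many random models.

```python
import itertools
import random

rng = random.Random(7)  # fixed seed for reproducibility

LAMBDA = range(4)  # hidden parameter values (hypothetical size)
M1 = range(3)      # settings of apparatus 1 (hypothetical size)
M2 = range(3)      # settings of apparatus 2 (hypothetical size)
OMEGA = list(itertools.product(LAMBDA, M1, M2))


def random_probability():
    """An arbitrary probability on OMEGA."""
    weights = [rng.random() for _ in OMEGA]
    total = sum(weights)
    return {pt: w / total for pt, w in zip(OMEGA, weights)}


def random_pm1_variable():
    """An arbitrary +/-1 valued function of (lambda, m1, m2): fully 'nonlocal'."""
    return {pt: rng.choice((-1, 1)) for pt in OMEGA}


def correlation(prob, f, g):
    """Expectation of the product f*g on the common probability space."""
    return sum(prob[pt] * f[pt] * g[pt] for pt in OMEGA)


def inequality4_holds(trials: int = 2000, tol: float = 1e-12) -> bool:
    """Test inequality (4) on many randomly generated nonlocal models."""
    for _ in range(trials):
        prob = random_probability()
        s_a1, s_b2, s_c1 = (random_pm1_variable() for _ in range(3))
        lhs = abs(correlation(prob, s_a1, s_b2) - correlation(prob, s_b2, s_c1))
        rhs = 1.0 - correlation(prob, s_a1, s_c1)
        if lhs > rhs + tol:
            return False
    return True


print(inequality4_holds())
```

The only structural feature used is that the three variables are defined on one probability space; the dependence on the settings plays no role in the check, which is the point made in the text.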

REMARK. Using Lemma (14.1) below, we can allow the observables to take values in $[-1, 1]$ also in Theorem (1).

REMARK. The above discussion is not a refutation of the Bell inequality: it is a refutation of Bell's claim that his formulation of locality is an essential assumption for its validity. Since the locality assumption is irrelevant for the proof of Bell's inequality, it follows that this inequality cannot discriminate between local and non local hidden variable theories, as claimed both in the introduction and in the conclusions of Bell's paper.

In particular, Theorem (1) gives an example of situations in which:

(i) Bell's locality condition is violated while his inequality is satisfied.

In a recent experiment with M. Regoli [4] we have produced examples of situations in which:

(ii) Bell's locality condition is satisfied while his inequality is violated.

6 The role of the counterfactual argument in Bell's proof

Bell uses the counterfactual argument in an essential way in his proof, because it is easy to check that formula (13) in [7] is the one which allows him to reduce, in the proof of his inequality, all considerations to the A-variables ($S_a^{(1)}$ in our notations, while Bell's B-variables are the $S_a^{(2)}$ in our notations). The pairs of chameleons (cf. section (10)), as well as the experiment of [4], provide a counterexample precisely to this formula.


7 Proofs of Bell's inequality based on counting arguments

There is a widespread illusion that the above mentioned critiques can be exorcized by restricting one's considerations to results of measurements. The following considerations show why this is an illusion.

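The counting arguments usually used to prove the Bell inequality are all based on the following scheme. In the same notations used up to now, one considers $N$ simultaneous measurements of the singlet pairs of observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, and one denotes by $S_{x,\nu}^{(j)}$ the result of the $\nu$-th measurement of $S_x^{(j)}$ ($j = 1, 2$; $x = a, b, c$; $\nu = 1, \dots, N$). With these notations one can calculate the empirical correlations on the samples, that is

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{\nu=1}^{N} S_{a,\nu}^{(1)} S_{b,\nu}^{(2)} \tag{1}$$

(and similarly for the other ones). In the Bell inequality 3 such correlations are involved:

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \qquad \langle S_c^{(1)} S_b^{(2)} \rangle, \qquad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{2}$$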

Thus, in the three experiments, observer 1 has to measure $S_a^{(1)}$ in the first and third experiment and $S_c^{(1)}$ in the second, while observer 2 has to measure $S_b^{(2)}$ in the first and second experiment and $S_c^{(2)}$ in the third. Therefore the directions $a$ and $b$ can be chosen arbitrarily by the two observers, and it is not necessary that observer 1 is informed of the choice of observer 2 or conversely. However, the direction $c$ has to be chosen by both observers, and therefore at least on this direction there should be a preliminary agreement among the two observers. This preliminary information can be replaced by a procedure in which each observer chooses at will the three directions, and only those choices are considered for which it happens (by chance) that the second choice of observer 1 coincides with the third of observer 2 (cf. section (15) for further discussion of this point). Whichever procedure has been chosen, after the results of the experiments one can compute the 3 empirical correlations

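$$\langle S^{(1)}_a S^{(2)}_b \rangle = \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_a(p^{(1)}_j)\, S^{(2)}_b(p^{(1)}_j) \qquad (3)$$

$$\langle S^{(1)}_c S^{(2)}_b \rangle = \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_c(p^{(2)}_j)\, S^{(2)}_b(p^{(2)}_j) \qquad (4)$$

$$\langle S^{(1)}_a S^{(2)}_c \rangle = \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_a(p^{(3)}_j)\, S^{(2)}_c(p^{(3)}_j) \qquad (5)$$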

where $p^{(3)}_j$ means the j-th point of the 3-d experiment, etc. If we try to apply the Bell argument directly to the empirical data given by the right hand sides of (3), (4), (5), we meet the expression

$$\left| \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_a(p^{(1)}_j)\, S^{(2)}_b(p^{(1)}_j) - \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_c(p^{(2)}_j)\, S^{(2)}_b(p^{(2)}_j) \right| \qquad (6)$$

from which we immediately see that, if we try to apply Bell's reasoning to the empirical data, we are stuck at the first step, because we find a sum of terms of the type

$$S^{(1)}_a(p^{(1)}_j)\, S^{(2)}_b(p^{(1)}_j) - S^{(1)}_c(p^{(2)}_j)\, S^{(2)}_b(p^{(2)}_j) \qquad (7)$$

to which the inequalities among numbers of section (1) cannot be applied because in general

More explicitly, since the expression (7) above is of the form

$$ab - b'c \qquad (8)$$

with $a, b, b', c \in \{\pm 1\}$, the only possible upper bound for it is 2 and not $1 - ac$. Even supposing that we, in order to uphold Bell's thesis, can introduce a cleaning operation [3] (cf. [4]) which eliminates all the points in which (8) is not satisfied, we would arrive at the inequality

$$\left| \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_a(p^{(1)}_j)\, S^{(2)}_b(p^{(1)}_j) - \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_c(p^{(2)}_j)\, S^{(2)}_b(p^{(2)}_j) \right| \le 1 - \frac{1}{N} \sum_{j=1}^{N} S^{(1)}_a(p^{(1)}_j)\, S^{(1)}_c(p^{(2)}_j) \qquad (9)$$

and in order to deduce from this something comparable with the experiments we need to use the counterfactual argument, asserting that

$$S^{(1)}_c(p^{(2)}_j) = -S^{(2)}_c(p^{(2)}_j) \qquad (10)$$

But in the second experiment $S^{(2)}_b$ and not $S^{(2)}_c$ has been measured. Thus to postulate the validity of (10) means to postulate that the value assumed by $S^{(1)}_c$ in the second experiment is the same that we would have found if $S^{(2)}_c$ and not $S^{(2)}_b$ had been measured. The chameleon effect provides a counterexample to this statement.

8 The quantum probabilistic analysis

Given the results of sections (5), (6), (7), it is then legitimate to ask: if Bell's vital assumption is irrelevant for the deduction of Bell's inequality, which is the really vital assumption which guarantees the validity of this inequality?

This natural question was first answered in [1], and this result motivated the birth of quantum probability as something more than a mere noncommutative generalization of probability theory: in fact a necessity, motivated by experimental data.

Theorem (23) has only two assumptions:

(i) that the random variables take values in the interval $[-1, +1]$;

(ii) that the random variables are defined on the same probability space.

Since we are dealing with spin variables, assumption (i) is reasonable. Let us consider assumption (ii). This is equivalent to the claim that the three probability measures $P_{ab}, P_{cb}, P_{ac}$, representing the distributions of the pairs $(S^{(1)}_a, S^{(2)}_b)$, $(S^{(1)}_c, S^{(2)}_b)$, $(S^{(1)}_a, S^{(2)}_c)$ respectively, can be obtained by restriction from a single probability measure P, representing the distribution of the quadruple $S^{(1)}_a, S^{(1)}_c, S^{(2)}_b, S^{(2)}_c$.

This is indeed a strong assumption because, due to the incompatibility of the spin variables along non parallel directions, the three correlations

$$\langle S^{(1)}_a S^{(2)}_b \rangle, \quad \langle S^{(1)}_c S^{(2)}_b \rangle, \quad \langle S^{(1)}_a S^{(2)}_c \rangle \qquad (1)$$

can only be estimated in different, in fact mutually incompatible, series of experiments. If we label each series of experiments by the corresponding pair (i.e. (a, b), (b, c), (c, a)), then we cannot exclude the possibility that also the probability measure in each series of experiments will depend on the corresponding pair. In other words, each of the measures $P_{ab}, P_{bc}, P_{ca}$ describes the joint statistics of a pair of commuting observables $(S^{(1)}_a, S^{(2)}_b)$, $(S^{(1)}_c, S^{(2)}_b)$, $(S^{(1)}_a, S^{(2)}_c)$, and there is no a priori reason to postulate that all these joint distributions for pairs can be deduced from a single distribution for the quadruple $\{S^{(1)}_a, S^{(1)}_c, S^{(2)}_b, S^{(2)}_c\}$.

We have already proved in Theorem (23) that this strong assumption implies the validity of the Bell inequality. Now let us prove that it is the truly vital assumption for the validity of this inequality, i.e. that if this assumption is dropped, i.e. if no single distribution for the quadruple exists, then it is an easy exercise to construct counterexamples violating Bell's inequality. To this goal one can use the following lemma.

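Lemma (1) Let be given three probability measures $P_{ab}, P_{ac}, P_{cb}$ on a given (measurable) space $(\Omega, \mathcal{F})$ and let $S^{(1)}_a, S^{(1)}_c, S^{(2)}_b, S^{(2)}_c$ be functions defined on $(\Omega, \mathcal{F})$ with values in the interval $[-1, +1]$ and such that the probability measure $P_{ab}$ (resp. $P_{cb}$, $P_{ac}$) is the distribution of the pair $(S^{(1)}_a, S^{(2)}_b)$ (resp. $(S^{(1)}_c, S^{(2)}_b)$, $(S^{(1)}_a, S^{(2)}_c)$). For each pair define the corresponding correlation

$$\kappa_{ab} := \langle S^{(1)}_a S^{(2)}_b \rangle = \int S^{(1)}_a S^{(2)}_b \, dP_{ab}$$

and suppose that, for $\varepsilon, \varepsilon' = \pm$, the joint probabilities for pairs

$$P^{\varepsilon \varepsilon'}_{xy} := P(S^{(1)}_x = \varepsilon ;\ S^{(2)}_y = \varepsilon')$$

satisfy

$$P^{++}_{xy} = P^{--}_{xy}, \qquad P^{+-}_{xy} = P^{-+}_{xy} \qquad (2)$$

$$P^{+}_{x} = P^{-}_{x} = 1/2 \qquad (3)$$

Then the Bell inequality

$$|\kappa_{ab} - \kappa_{cb}| \le 1 - \kappa_{ac} \qquad (4)$$

is equivalent to

$$|P^{++}_{ab} - P^{++}_{cb}| + P^{++}_{ac} \le \frac{1}{2} \qquad (5)$$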

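Proof. The inequality (4) is equivalent to

$$|2P^{++}_{ab} - 2P^{+-}_{ab} - 2P^{++}_{cb} + 2P^{+-}_{cb}| \le 1 - 2P^{++}_{ac} + 2P^{+-}_{ac} \qquad (6)$$

Using the identity (equivalent to (3))

$$P^{+-}_{xy} = \frac{1}{2} - P^{++}_{xy} \qquad (7)$$

the left hand side of (6) becomes the modulus of

$$2(P^{++}_{ab} - P^{+-}_{ab}) - 2(P^{++}_{cb} - P^{+-}_{cb}) = 2\left(P^{++}_{ab} - \frac{1}{2} + P^{++}_{ab}\right) - 2\left(P^{++}_{cb} - \frac{1}{2} + P^{++}_{cb}\right) = 4(P^{++}_{ab} - P^{++}_{cb}) \qquad (8)$$

and, again using (7), the right hand side of (6) is equal to

$$1 - 2\left(P^{++}_{ac} - \frac{1}{2} + P^{++}_{ac}\right) = 2 - 4P^{++}_{ac} \qquad (9)$$

Summing up, (4) is equivalent to

$$|P^{++}_{ab} - P^{++}_{cb}| \le \frac{1}{2} - P^{++}_{ac} \qquad (10)$$

which is (5).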

Corollary (2) There exist triples of $P_{ab}, P_{ac}, P_{cb}$ on the 4-point space $\{+1, -1\} \times \{+1, -1\}$ which satisfy conditions (2), (3) of Lemma (1) and are not compatible with any probability measure P on the 6-point space $\{+1, -1\} \times \{+1, -1\} \times \{+1, -1\}$.

Proof. Because of conditions (2), (3), the probability measures $P_{ab}, P_{ac}, P_{cb}$ are uniquely determined by the three numbers

$$P^{++}_{ab},\ P^{++}_{ac},\ P^{++}_{cb} \in [0, 1/2] \qquad (11)$$

Thus, if we choose these three numbers so that the inequality (5) is not satisfied, the Bell inequality (4) cannot be satisfied, because of Lemma (1).
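
The equivalence stated in Lemma (1) and the counterexample of Corollary (2) can be checked numerically. The following is only a sketch under the symmetry conditions of the lemma, which give $\kappa_{xy} = 4P^{++}_{xy} - 1$; the function names and the particular violating triple are illustrative assumptions, not part of the original argument.

```python
from fractions import Fraction
from itertools import product

def correlation(p_pp):
    """kappa_xy from P_xy^{++}, assuming P^{++} = P^{--}, P^{+-} = P^{-+}
    and P^{+-} = 1/2 - P^{++} (the symmetry conditions of the lemma)."""
    return 4 * p_pp - 1

def bell_holds(p_ab, p_cb, p_ac):
    """Inequality (4): |kappa_ab - kappa_cb| <= 1 - kappa_ac."""
    return abs(correlation(p_ab) - correlation(p_cb)) <= 1 - correlation(p_ac)

def condition_5_holds(p_ab, p_cb, p_ac):
    """Inequality (5): |P_ab^{++} - P_cb^{++}| + P_ac^{++} <= 1/2."""
    return abs(p_ab - p_cb) + p_ac <= Fraction(1, 2)

# Grid of admissible values in [0, 1/2], kept exact with Fraction.
grid = [Fraction(k, 20) for k in range(11)]
equivalent_everywhere = all(
    bell_holds(*t) == condition_5_holds(*t) for t in product(grid, repeat=3)
)

# A hypothetical triple violating (5), hence (4):
# P_ab^{++} = 1/2, P_cb^{++} = 0, P_ac^{++} = 1/2.
violating = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
print(equivalent_everywhere, bell_holds(*violating))
```

Exact rational arithmetic is used so that the boundary cases of the two inequalities are compared without rounding effects.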

9 The realism of ballot boxes and the corresponding statistics

The fact that there is no a priori reason to postulate that the joint distributions of the pairs $(S^{(1)}_a, S^{(2)}_b)$, $(S^{(1)}_c, S^{(2)}_b)$, $(S^{(1)}_a, S^{(2)}_c)$ can be deduced from a single distribution for the quadruple $S^{(1)}_a, S^{(1)}_c, S^{(2)}_b, S^{(2)}_c$ does not necessarily mean that such a common joint distribution does not exist.


On the contrary, in several physically meaningful situations we have good reasons to expect that such a joint distribution should exist, even if it might not be accessible to direct experimental verification.

This is a simple consequence of the so-called hypothesis of realism, which is justified whenever we are entitled to believe that the results of our measurements are pre-determined. In the words of Bell: Since we can predict in advance the result of measuring any chosen component of $\sigma_2$, by previously measuring the same component of $\sigma_1$, it follows that the result of any such measurement must actually be predetermined.

Consider for example a box containing pairs of balls. Suppose that the experiments allow one to measure either the color, or the weight, or the material of which each ball is made, but the rules of the game are that on each ball only one measurement at a time can be performed. Suppose moreover that the experiments show that, for each property, only two values are realized and that, whenever a simultaneous measurement of the same property on the two elements of a pair is performed, the resulting answers are always discordant. Up to a change of convention and in appropriate units, we can always suppose that these two values are $\pm 1$, and we shall do so in the following.

Then the joint distributions of pairs (of properties relative to different balls) are accessible to experiment but those of triples or quadruples are not

Nevertheless it is reasonable to postulate that in the box there is a well defined (although purely Platonic, in the sense of not being accessible to experiment) number of balls with each given color, weight and material. These numbers give the relative frequencies of triples of properties for each element of the pair, hence, using the perfect anticorrelation, a family of joint probabilities for all the possible sextuples. More precisely, due to the perfect anticorrelation, the relative frequencies of the triples of properties

$$[S^{(1)}_a = a_1], \quad [S^{(1)}_b = b_1], \quad [S^{(1)}_c = c_1]$$

where $a_1, b_1, c_1 = \pm 1$, are equal to the relative frequencies of the sextuples of properties

$$[S^{(1)}_a = a_1], \ [S^{(1)}_b = b_1], \ [S^{(1)}_c = c_1], \ [S^{(2)}_a = -a_1], \ [S^{(2)}_b = -b_1], \ [S^{(2)}_c = -c_1]$$

and since we are confining ourselves to the case of 3 properties and 2 particles, the above ones, when $a_1, b_1, c_1$ vary in all possible ways in the set $\pm 1$, are all the possible configurations. In this situation the counterfactual argument is applicable, and in fact we have used it to deduce the joint distribution of sextuples from the joint distributions of triples.
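
The ballot box situation can be illustrated with a small simulation. This is a minimal sketch, assuming randomly chosen relative frequencies for the eight triples of ball 1 and perfect anticorrelation for ball 2; the helper name and the random seed are illustrative choices. With the singlet-like anticorrelation the Bell inequality takes the form $|\langle S^{(1)}_a S^{(2)}_b\rangle - \langle S^{(1)}_c S^{(2)}_b\rangle| \le 1 + \langle S^{(1)}_a S^{(2)}_c\rangle$, and it is never violated by such a box.

```python
import random
from itertools import product

def ballot_box_correlations(weights):
    """weights: dict mapping triples (a1, b1, c1) of ball 1 to relative frequencies."""
    total = sum(weights.values())

    def corr(i, j):
        # S_x^(1) is read from ball 1; S_y^(2) is minus ball 1's value,
        # by the perfect anticorrelation of the pair.
        return sum(w * t[i] * (-t[j]) for t, w in weights.items()) / total

    return {"ab": corr(0, 1), "cb": corr(2, 1), "ac": corr(0, 2)}

random.seed(0)  # arbitrary seed, only for reproducibility
triples = list(product((+1, -1), repeat=3))
worst_slack = float("inf")
for _ in range(1000):
    weights = {t: random.random() for t in triples}
    k = ballot_box_correlations(weights)
    # Bell inequality in its singlet form: |<ab> - <cb>| <= 1 + <ac>
    slack = (1 + k["ac"]) - abs(k["ab"] - k["cb"])
    worst_slack = min(worst_slack, slack)
print(worst_slack >= -1e-12)
```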


10 The realism of chameleons and the corresponding statistics

According to the quantum probabilistic interpretation, what Einstein, Podolsky, Rosen, Bell and several others who have discussed this topic call the hypothesis of realism should be called, in a more precise way, the hypothesis of the ballot box realism, as opposed to the hypothesis of the chameleon realism.

The point is that, according to the quantum probabilistic interpretation, the term predetermined should not be confused with the term realized a priori, which has been discussed in section (9): it might be conditionally deduced according to the scheme: if such and such will happen, I will react so and so.

The chameleon provides a simple example of this distinction: a chameleon becomes deterministically green on a leaf and brown on a log. In this sense we can surely claim that its color on a leaf is predetermined. However this does not mean that the chameleon was green also before jumping on the leaf.

The chameleon metaphor describes a mechanism which is perfectly local, even deterministic, and surely classical and macroscopic; moreover there are no doubts that the situation it describes is absolutely realistic. Yet this realism, being different from the ballot box realism, allows one to render free from metaphysics statements of the orthodox interpretation such as: the act of measurement creates the value of the measured observable. To many this looks metaphysical or magical, but look how natural it sounds when you think of the color of a chameleon.

Finally, and most important for its implications relative to the EPR argument, the chameleon realism provides a simple and natural counterexample of a situation in which the results are predetermined, however the counterfactual argument is not applicable.

Imagine in fact a box in which there are many pairs of chameleons. In each pair there is exactly one healthy one, which becomes green on a leaf and brown on a log, and one mutant one, which becomes brown on a leaf and green on a log; moreover exactly one of the chameleons in each pair weighs 100 grams and exactly one 200 grams. A measurement consists in separating the members of each pair, each one in a smaller box, and in performing one and only one measurement on each member of each pair.

The color on the leaf, color on the log, and weight are 2-valued observables (because we do not know a priori if we are measuring the healthy or the mutant chameleon). Thus, with respect to the observables color on the leaf, color on the log and weight, the pairs of chameleons behave exactly as EPR pairs: whenever the same observable is measured on both elements of a pair, the results are opposite. However, suppose I measure the color on the leaf of one element of a pair and the weight of the other one, and suppose the answers I find are green and 100 grams. Can I conclude that the second element of the pair is brown and weighs 100 grams? Clearly not, because there is no reason to believe that the second member of the pair, of which the weight was measured while in a box, was also on a leaf.

From this point of view the measurement interaction enters the very definition of an observable. However, also in this interpretation, which is more similar to the quantum mechanical situation, the counterfactual argument cannot be applied, because it amounts to answering brown to the question: which is the color on the leaf if I have measured the weight and if I know that the chameleon is the mutant one (this because the measurement of the other one gave green on the leaf)? But this answer is not correct, because it could well be that inside the box there is a leaf and the chameleon is interacting with it while I am measuring its weight, but it could also be that it is interacting with a log, also contained inside the box, in which case, being a mutant, it would be green.
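
The pair of chameleons can be written down as a toy model. This is a minimal sketch; the class name, field names and the string labels for environments and colors are illustrative assumptions. It shows that each color is a deterministic response to the environment, that the same observable on the two members always gives opposite answers, and that the counterfactual answer brown is right only if the weighed chameleon happened to sit on a leaf.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chameleon:
    mutant: bool
    weight_grams: int

    def color(self, environment):
        """Color is a deterministic response to the environment ("leaf" or "log")."""
        healthy_color = {"leaf": "green", "log": "brown"}[environment]
        if not self.mutant:
            return healthy_color
        return "brown" if healthy_color == "green" else "green"

# One pair: a healthy and a mutant chameleon with opposite weights.
first = Chameleon(mutant=False, weight_grams=200)
second = Chameleon(mutant=True, weight_grams=100)

# Same observable on both members: always opposite answers, as for EPR pairs.
assert first.color("leaf") != second.color("leaf")
assert first.color("log") != second.color("log")

# The counterfactual inference "the second one is brown" holds only if the
# second chameleon sat on a leaf while being weighed; on a log it is green.
print(second.color("leaf"), second.color("log"))
```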

Therefore, if we can produce an example of a 2-particle system in which the Heisenberg evolution of each particle's observables satisfies Bell's locality condition, but the Schroedinger evolution of the state, i.e. the expectation value $\langle \cdot \rangle$, depends on the pair (a, b) of measured observables, we can claim that this counterexample abides by the same definition of locality as Bell's theorem.

11 Bell's inequalities and the chameleon effect

Definition (1) Let S be a physical system and $\mathcal{O}$ a family of observable quantities relative to this system. We say that the chameleon effect is realized on S if, for any measurement M of an observable $A \in \mathcal{O}$, the dynamical evolution of S depends on the observable A. If D denotes the state space of S, this means that the change of state from the beginning to the end of the experiment is described by a map (a one-parameter group or semigroup in the case of continuous time)

$$T_A : D \to D$$

Remark. The explicit form of the dependence of $T_A$ on A depends on both the system and the measurement, and many concrete examples can be constructed. An example in the quantum domain is discussed in [3], and the experiment of [4] realizes an example in the classical domain.

Remark. If the system S is composed of two sub-systems $S_1$ and $S_2$, we can also consider the case in which the evolutions of the two subsystems are different, in the sense that for system 1 we have one form of functional dependence $T^{(1)}_A$ of the evolution associated to the observable A, and for system 2 we have another form of functional dependence $T^{(2)}_A$. In the experiment of [4] the state space is the unit disk D in the plane, the observables are parametrized by angles in $[0, 2\pi)$ (or equivalently by unit vectors in the unit circle) and for each observable $S^{(1)}_a$ of system 1

and for each observable $S^{(2)}_a$ of system 2

where $R_a$ denotes (counterclockwise) rotation of an angle a.

Let us consider Bell's inequalities by assuming that a chameleon effect is present. Denoting E the common initial state of the composite system (1, 2) (e.g. singlet state), the state at the end of the measurement will be

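Now replace $S^{(j)}_x$ by

$$\tilde{S}^{(j)}_x := S^{(j)}_x \circ T^{(j)}_x$$

Since the $\tilde{S}^{(j)}_x$ take values $\pm 1$, we know from Theorem (23) that, if we postulate the existence of joint probabilities for the triple $\tilde{S}^{(1)}_a, \tilde{S}^{(2)}_b, \tilde{S}^{(1)}_c$, compatible with the two correlations $E(\tilde{S}^{(1)}_a \tilde{S}^{(2)}_b)$, $E(\tilde{S}^{(1)}_c \tilde{S}^{(2)}_b)$, then the inequality

$$|E(\tilde{S}^{(1)}_a \tilde{S}^{(2)}_b) - E(\tilde{S}^{(1)}_c \tilde{S}^{(2)}_b)| \le 1 - E(\tilde{S}^{(1)}_a \tilde{S}^{(1)}_c)$$

holds, and if we also have the singlet condition

$$E\big(S^{(1)}_c(T^{(1)}_c p)\, S^{(2)}_c(T^{(2)}_c p)\big) = -1 \qquad (1)$$

then a.e.

$$\tilde{S}^{(1)}_c = -\tilde{S}^{(2)}_c$$

and we have Bell's inequality. Thus, if we postulate the same probability space, even the chameleon effect alone is not sufficient to guarantee violation of Bell's inequality.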

Therefore the fact that the three experiments are done on different and incompatible samples must play a crucial role


As far as the chameleon effect is concerned, let us notice that, in the above statement of the problem, the fact that we use a single initial probability measure E is equivalent to postulating that at time t = 0 the three pairs of observables

$(S^{(1)}_a, S^{(2)}_b)$, $(S^{(1)}_c, S^{(2)}_b)$, $(S^{(1)}_a, S^{(1)}_c)$ admit a common joint distribution, in fact E.

12 Physical implausibility of Bell's argument

In this section we show that, combining the chameleon effect with the fact that the three experiments refer to different samples, even in very simple situations no cleaning conditions can lead to a proof of Bell's inequality.

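If we try to apply Bell's reasoning to the empirical data, we have to start from the expression

$$\left| \frac{1}{N} \sum_{j} S^{(1)}_a(T^{(1)}_a p^{(1)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(1)}_j) - \frac{1}{N} \sum_{j} S^{(1)}_c(T^{(1)}_c p^{(2)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(2)}_j) \right| \qquad (1)$$

which we majorize by

$$\frac{1}{N} \sum_{j} \left| S^{(1)}_a(T^{(1)}_a p^{(1)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(1)}_j) - S^{(1)}_c(T^{(1)}_c p^{(2)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(2)}_j) \right| \qquad (2)$$

But if we try to apply the inequality among numbers to the expression

$$S^{(1)}_a(T^{(1)}_a p^{(1)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(1)}_j) - S^{(1)}_c(T^{(1)}_c p^{(2)}_j)\, S^{(2)}_b(T^{(2)}_b p^{(2)}_j) \qquad (3)$$

we see that we are not dealing with the situation covered by Corollary (12), i.e.

$$|ab - cb| \le 1 - ac \qquad (4)$$

because, since

$$S^{(2)}_b(T^{(2)}_b p^{(1)}_j) \ne S^{(2)}_b(T^{(2)}_b p^{(2)}_j) \qquad (5)$$

the left hand side of (4) must be replaced by

$$|ab - cb'| \qquad (6)$$

whose maximum for $a, b, c, b' \in [-1, +1]$ is 2 and not $1 - ac$.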
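
The difference between (4) and (6) can be verified by exhaustive enumeration over the values $\pm 1$. This is only an illustrative check, with variable names chosen here; it confirms that $|ab - cb| \le 1 - ac$ always holds when the same b appears in both terms, while with two unrelated values $b, b'$ the bound $1 - ac$ can fail and the maximum is 2.

```python
from itertools import product

signs = (+1, -1)

# Same b in both terms: |ab - cb| <= 1 - ac for every choice of signs.
same_b_ok = all(
    abs(a * b - c * b) <= 1 - a * c for a, b, c in product(signs, repeat=3)
)

# Different values b, b' in the two terms: the bound 1 - ac can fail,
# and the only general bound is 2.
max_diff_b = max(abs(a * b - c * bp) for a, b, c, bp in product(signs, repeat=4))
violations = [
    (a, b, c, bp)
    for a, b, c, bp in product(signs, repeat=4)
    if abs(a * b - c * bp) > 1 - a * c
]
print(same_b_ok, max_diff_b, len(violations))
```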


Bell's implicit assumption of the single probability space is equivalent to the postulate that, for each $j = 1, \dots, N$,

$$p^{(1)}_j = p^{(2)}_j \qquad (7)$$

Physically this means that the hidden parameter in the first experiment is the same as the hidden parameter in the second experiment. This is surely a very implausible assumption. Notice however that without this assumption Bell's argument cannot be carried through and we cannot deduce the inequality, because we must stop at equation (2).

13 The role of the single probability space in CHSH's proof

Clauser, Horne, Shimony, Holt [9] introduced the variant (26) of the Bell inequality for quadruples $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$, which is based on the following inequality among numbers $a, b, b', a' \in [-1, 1]$:

$$|ab + ab' + a'b - a'b'| \le 2 \qquad (1)$$

Section (1) already contains a proof of (1). A direct proof follows from

$$|b + b'| + |b - b'| \le 2 \qquad (2)$$

because

$$|ab + ab' + a'b - a'b'| = |a(b + b') + a'(b - b')|$$

$$\le |a| \cdot |b + b'| + |a'| \cdot |b - b'| \le |b + b'| + |b - b'| \le 2$$

The proof of (2) is obvious.

Remark (1) Notice that an inequality of the form

$$|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| \le 2 \qquad (3)$$

would be obviously false. In fact, for example, the choice

$$a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = b_4 = 1, \quad a_4 = -1$$

would give $|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| = 4$.


That is, for the validity of (1) it is absolutely essential that the number a is the same in the first and the second term, and similarly for a' in the 3-d and the 4-th, b' in the 2-d and the 4-th, b in the first and the 3-d.
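
Both bounds can be confirmed by brute force over the extreme values $\pm 1$, which suffices because each expression is linear in each of its arguments. The sketch below is illustrative; the function name `chsh` and the variable names are assumptions made here.

```python
from itertools import product

signs = (+1, -1)

def chsh(a, b, a2, b2):
    """ab + ab' + a'b - a'b', with the SAME a, a', b, b' reused across terms."""
    return a * b + a * b2 + a2 * b - a2 * b2

max_shared = max(abs(chsh(*t)) for t in product(signs, repeat=4))

# Eight unrelated numbers, as in (3): a1 b1 + a2 b2 + a3 b3 - a4 b4.
max_unrelated = max(
    abs(a1 * b1 + a2 * b2 + a3 * b3 - a4 * b4)
    for a1, b1, a2, b2, a3, b3, a4, b4 in product(signs, repeat=8)
)
print(max_shared, max_unrelated)
```

The first maximum is 2 and the second is 4, which is the numerical content of Remark (1).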

This inequality among numbers can be extended to pairs of random variables by introducing the following postulates:

(P1) Instead of four numbers $a, b, b', a' \in [-1, 1]$, one considers four functions

$$S^{(1)}_a, \quad S^{(2)}_b, \quad S^{(1)}_{a'}, \quad S^{(2)}_{b'}$$

all defined on the same space $\Lambda$ (whose points are called hidden parameters) and with values in $[-1, 1]$.

(P2) One postulates that there exists a probability measure P on $\Lambda$ which defines the joint distribution of each of the following four pairs of functions:

$$(S^{(1)}_a, S^{(2)}_b), \quad (S^{(1)}_a, S^{(2)}_{b'}), \quad (S^{(1)}_{a'}, S^{(2)}_b), \quad (S^{(1)}_{a'}, S^{(2)}_{b'}) \qquad (4)$$

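Remark (2) Notice that (P2) automatically implies that the joint distributions of the four pairs of functions can be deduced from a joint distribution of the whole quadruple, i.e. the existence of a single Kolmogorov model for these four pairs.

With these premises, for each $\lambda \in \Lambda$ one can apply the inequality (1) to the four numbers

$$S^{(1)}_a(\lambda), \quad S^{(2)}_b(\lambda), \quad S^{(1)}_{a'}(\lambda), \quad S^{(2)}_{b'}(\lambda)$$

and deduce that

$$|S^{(1)}_a(\lambda) S^{(2)}_b(\lambda) + S^{(1)}_a(\lambda) S^{(2)}_{b'}(\lambda) + S^{(1)}_{a'}(\lambda) S^{(2)}_b(\lambda) - S^{(1)}_{a'}(\lambda) S^{(2)}_{b'}(\lambda)| \le 2 \qquad (5)$$

From this, taking P-averages, one obtains

$$|\langle S^{(1)}_a S^{(2)}_b \rangle + \langle S^{(1)}_a S^{(2)}_{b'} \rangle + \langle S^{(1)}_{a'} S^{(2)}_b \rangle - \langle S^{(1)}_{a'} S^{(2)}_{b'} \rangle| = \qquad (6)$$

$$= \left| \int \left( S^{(1)}_a(\lambda) S^{(2)}_b(\lambda) + S^{(1)}_a(\lambda) S^{(2)}_{b'}(\lambda) + S^{(1)}_{a'}(\lambda) S^{(2)}_b(\lambda) - S^{(1)}_{a'}(\lambda) S^{(2)}_{b'}(\lambda) \right) dP(\lambda) \right| \le \qquad (7)$$

$$\le \int \left| S^{(1)}_a(\lambda) S^{(2)}_b(\lambda) + S^{(1)}_a(\lambda) S^{(2)}_{b'}(\lambda) + S^{(1)}_{a'}(\lambda) S^{(2)}_b(\lambda) - S^{(1)}_{a'}(\lambda) S^{(2)}_{b'}(\lambda) \right| dP(\lambda) \le 2 \qquad (8)$$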

Remark (3) Notice that in the step from (6) to (7) we have used in an essential way the existence of a joint distribution for the whole quadruple, i.e. the fact that all these random variables can be realized in the same probability space.

In EPR type experiments we are interested in the case in which the four pairs $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$ come from four mutually incompatible experiments. Let us assume that there is a hidden parameter determining the result of each of these experiments. This means that we interpret the number $S^{(1)}_a(\lambda)$ as the value of the spin of particle 1 in direction a, determined by the hidden parameter $\lambda$.

There is obviously no reason to postulate that the hidden parameter determining the result of the first experiment is exactly the same one which determines the result of the second experiment. However, when CHSH consider the quantity (5), they are implicitly making the much stronger assumption that the same hidden parameter $\lambda$ determines the results of all the four experiments. This assumption is quite unreasonable from the physical point of view, and in any case it is a much stronger assumption than simply postulating the existence of hidden parameters. The latter assumption would allow CHSH only to consider the expression

$$S^{(1)}_a(\lambda_1) S^{(2)}_b(\lambda_1) + S^{(1)}_a(\lambda_2) S^{(2)}_{b'}(\lambda_2) + S^{(1)}_{a'}(\lambda_3) S^{(2)}_b(\lambda_3) - S^{(1)}_{a'}(\lambda_4) S^{(2)}_{b'}(\lambda_4) \qquad (9)$$

and, as shown in Remark (1) above, the maximum of this expression is not 2 but 4, and this does not allow one to deduce the Bell inequality.

14 The role of the counterfactual argument in CHSH's proof

Contrary to Bell's original argument, the CHSH proof of the Bell inequality does not use explicitly the counterfactual argument. The fact that one can perform experiments also on quadruples, rather than on triples as originally proposed by Bell, has led some authors to claim that the counterfactual argument is not essential in the deduction of the Bell inequality. However we have just seen in section (7) that the hidden assumption, as in Bell's proof, i.e. the realizability of all the random variables involved in the same probability space, is also present in the CHSH argument. The following lemma shows that, under the singlet assumption, the conclusion of the counterfactual argument follows from the hidden assumption of Bell and of CHSH.


Lemma (1) If f and g are random variables defined on a probability space $(\Lambda, P)$ and with values in $[-1, 1]$, then

$$\langle fg \rangle := \int_\Lambda fg \, dP = -1$$

if and only if

$$P(fg = -1) = 1$$

Proof. If $P(fg > -1) > 0$ then

$$\int_\Lambda fg \, dP = -P(fg = -1) + \int_{fg > -1} fg \, dP > -P(fg = -1) - P(fg > -1) = -1$$

Corollary (2) Suppose that all the random variables in (3) are realized in the same probability space. Then, if the singlet condition

$$\langle S^{(1)}_x S^{(2)}_x \rangle = -1 \qquad (1)$$

is satisfied, the condition

$$S^{(1)}_x = -S^{(2)}_x \qquad (2)$$

(i.e. formula (13) in Bell's '64 paper) is true almost everywhere.

Proof. Follows from Lemma (1) with the choice $f = S^{(1)}_x$, $g = S^{(2)}_x$.

Summing up: if you want to compare the predictions of a hidden variable theory with quantum theory in the EPR experiment (so that at least we admit the validity of the singlet law), then the hidden assumption of realizability of all the random variables in (3) in the same probability space (without which Bell's inequality cannot be proved) implies the same conclusion of the counterfactual argument. Stated otherwise: the counterfactual argument is implicit when you postulate the singlet condition and the realizability on a single probability space. It does not matter if you use triples or quadruples.

15 Physical difference between the CHSH and the original Bell inequalities

In the CHSH scheme

$$(a, b), \quad (a', b), \quad (a, b'), \quad (a', b')$$

the agreement required by the experimenters is the following:
- 1 will measure the same observable in experiments I and III and the same observable in experiments II and IV
- 2 will measure the same observable in experiments I and II and the same observable in experiments III and IV

Here there is no restriction a priori on the choice of the observables to be measured.

In the Bell scheme the experimentalists agree that:
- 1 measures the same observable in experiments I and III
- 2 measures the same observable in experiments I and II
- 1 and 2 choose a priori, i.e. before the experiment begins, a direction c and agree that 1 will measure spin in direction c in experiment II and 2 will measure spin in direction c in experiment III (strong agreement)

The strong agreement can be replaced by the following (weak agreement):
- 1 and 2 choose a priori, i.e. before the experiment begins, a finite set of directions $c_1, \dots, c_K$ and agree that 1 will measure spin in a direction chosen randomly among the directions $c_1, \dots, c_K$ in experiment II and 2 will do the same in experiment III

In this scheme there is an a priori restriction on the choice of some of the observables to be measured.

If the directions fixed a priori in the plane are K, then the probability of a coincidence corresponding to a totally random (equiprobable) choice is

$$P(c^{(1)} = c^{(2)}) = \sum_{\alpha=1}^{K} P(c^{(1)} = c_\alpha ;\ c^{(2)} = c_\alpha) = \sum_{\alpha=1}^{K} \frac{1}{K^2} = \frac{1}{K}$$

This shows that, contrary to the CHSH scheme, the choice has to be restricted to a finite number of possibilities, otherwise the probability of coincidence will be zero.
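
The coincidence probability under the weak agreement can be computed exactly by enumeration. This is a minimal sketch, assuming that the two experimenters choose independently and uniformly among K directions labelled $0, \dots, K-1$; the function name is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def coincidence_probability(K):
    """Exact probability that two independent uniform choices among K directions agree."""
    outcomes = list(product(range(K), repeat=2))  # K*K equally likely pairs
    hits = sum(1 for i, j in outcomes if i == j)
    return Fraction(hits, len(outcomes))

for K in (1, 2, 10, 100):
    print(K, coincidence_probability(K))
```

The value $1/K$ tends to zero as K grows, which is why the set of admissible directions has to be finite.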

From this point of view we can claim that the Clauser, Horne, Shimony, Holt formulation of Bell's inequalities realizes a small improvement with respect to Bell's original formulation.

Reproduction of the EPR correlations by the chameleon effect

Consider a classical dynamical system composed of two particles (1, 2). Let S denote the state space of each of the particles and suppose that at time $t = t_0$ (initial time) the state $\mu^0_1$ of particle 1 and the state $\mu^0_2$ of particle 2 coincide:

$$\mu^0_1 = \mu^0_2 \qquad (1)$$


Starting from time $t_0$ the two particles begin to move in opposite directions, and after a time interval of length T two independent and non communicating experimenters simultaneously perform a measurement on each particle.

Experimenter 1 (resp. 2) can choose among three different measurements, corresponding to the observables

$$S^{(1)}_a, S^{(1)}_b, S^{(1)}_c \quad (\text{resp. } S^{(2)}_a, S^{(2)}_b, S^{(2)}_c) \qquad (2)$$

of particle 1 (resp. particle 2). We suppose that both particles satisfy the chameleon effect described by the following

DEFINITION (1) Let S be the state space of a dynamical system $\sigma$, let I be a set and, for each $x \in I$, let be given a function

$$S_x : S \to \mathbb{R}, \qquad x \in I \qquad (3)$$

representing an observable of the system. The system $\sigma$ is said to realize the chameleon effect with respect to the observables (3) if, whenever the observable $S_x$ is measured, the dynamical evolution of the system

$$T^t_x : S \to S, \qquad t \in \mathbb{R} \qquad (4)$$

depends on the measured observable $S_x$.

In our case we consider only two instants of time, the initial one and the one when the measurement takes place, and we omit time from our notations. Moreover in our case we have two particles and each particle is far away from the other one, hence it can only feel the interaction with the measurement apparatus near to it. So, combining the locality principle with the chameleon effect, we conclude that, if experimenter 1 (resp. 2) chooses to measure the observable $S^{(1)}_x$ (resp. $S^{(2)}_y$), then particle 1 (resp. 2) will evolve according to the dynamics

$$T_{1,x} \quad (\text{resp. } T_{2,y}) \qquad (5)$$

In our case the variables x, y can be any element of the set $\{a, b, c\}$.

Suppose that experimenter 1 chooses to measure $S^{(1)}_a$ and experimenter 2 chooses to measure $S^{(2)}_b$.

Let $\mu_1$ (resp. $\mu_2$) denote the final state, i.e. the state at the time when the measurement occurs, of particle 1 (resp. 2). Condition (1) is then equivalent to

$$T^{-1}_{1,a} \mu_1 = T^{-1}_{2,b} \mu_2 \qquad (6)$$


The empirical correlations of the measurements will then be

$$\frac{1}{N} \sum S^{(1)}_a(\mu_1)\, S^{(2)}_b(\mu_2)\, J(T^{-1}_{1,a} \mu_1 - T^{-1}_{2,b} \mu_2) \qquad (7)$$

where $J(\cdot)$ is a $\delta$-like factor, taking into account the fact that only the configurations satisfying condition (6) give a non zero contribution to the correlations.

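Now suppose that the state space S is the real line $\mathbb{R}$. Thus the empirical correlations (7) are

$$\kappa_{ab} = Z \int\!\!\int S^{(1)}_a(\mu_1)\, S^{(2)}_b(\mu_2)\, \delta(T^{-1}_{1,a} \mu_1 - T^{-1}_{2,b} \mu_2)\, d\mu_1 d\mu_2 \qquad (8)$$

where Z is a normalization constant. With the change of variables

$$T^{-1}_{1,a} \mu_1 = \lambda_1, \qquad T^{-1}_{2,b} \mu_2 = \lambda_2 \qquad (9)$$

(8) becomes

$$Z \int\!\!\int S^{(1)}_a(T_{1,a} \lambda_1)\, S^{(2)}_b(T_{2,b} \lambda_2)\, \delta(\lambda_1 - \lambda_2)\, dT_{1,a}(\lambda_1)\, dT_{2,b}(\lambda_2) \qquad (10)$$

Now introduce the notations

$$S^{(j)}_x(T_{j,x} \lambda_j) =: \tilde{S}^{(j)}_x(\lambda_j), \qquad j = 1, 2; \quad x = a, b \qquad (11)$$

With these notations, supposing, as is always possible, that $T'_{1,a}(\lambda_1), T'_{2,b}(\lambda_2) > 0$, (10) becomes

$$Z \int\!\!\int \tilde{S}^{(1)}_a(\lambda_1)\, \tilde{S}^{(2)}_b(\lambda_2)\, \delta(\lambda_1 - \lambda_2)\, T'_{1,a}(\lambda_1)\, T'_{2,b}(\lambda_2)\, d\lambda_1 d\lambda_2 =$$

$$= Z \int \tilde{S}^{(1)}_a(\lambda)\, \tilde{S}^{(2)}_b(\lambda)\, T'_{1,a}(\lambda)\, T'_{2,b}(\lambda)\, d\lambda$$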

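Now let us make the following choices:

$$\lambda \in [0, 2\pi] \ \Leftrightarrow\ \mathrm{supp}\, \tilde{S}^{(j)}_x \subseteq [0, 2\pi] \qquad (12)$$

$$Z = \frac{1}{2\pi} \qquad (13)$$

$$T'_{2,b} = \sqrt{2\pi} \qquad (14)$$

$$T'_{1,a}(\lambda) = \frac{\sqrt{2\pi}}{4}\, |\cos(\lambda - a)| \qquad (15)$$

$$\tilde{S}^{(1)}_x(\lambda) = \mathrm{sgn}(\cos(\lambda - x)), \qquad \tilde{S}^{(2)}_x = -\tilde{S}^{(1)}_x \qquad (16)$$

With these choices the correlations (8) become

$$\langle S^{(1)}_a S^{(2)}_b \rangle = -\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, \frac{1}{4}\, |\cos(\lambda - a)|\, d\lambda \qquad (17)$$

$$= -\frac{1}{4} \int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - b)) \cos(\lambda - a)\, d\lambda = -\cos(b - a) = -a \cdot b$$

which are the EPR correlations.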
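
The integral (17) can be evaluated numerically as a check. The sketch below assumes the reading of (17) as $-\frac{1}{4}\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, |\cos(\lambda - a)|\, d\lambda$ and uses a simple midpoint rule; the function names, the number of subintervals and the sample angles are illustrative assumptions.

```python
import math

def sgn(x):
    return 1.0 if x >= 0 else -1.0

def chameleon_correlation(a, b, n=20000):
    """Midpoint-rule approximation of
    -(1/4) * integral_0^{2pi} sgn(cos(l-a)) sgn(cos(l-b)) |cos(l-a)| dl."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        total += (
            sgn(math.cos(lam - a))
            * sgn(math.cos(lam - b))
            * abs(math.cos(lam - a))
        )
    return -0.25 * total * h

# Compare with -cos(b - a) for a few pairs of angles.
for a, b in [(0.0, 0.0), (0.0, math.pi / 3), (0.4, 2.0)]:
    print(round(chameleon_correlation(a, b), 3), round(-math.cos(b - a), 3))
```

Because the integrand is discontinuous where $\cos(\lambda - b)$ changes sign, the midpoint rule converges only at first order in the step, so the agreement should be expected to a few decimal places rather than to machine precision.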

References

1. L. Accardi, Phys. Rep. 77, 169-192 (1981).

2. L. Accardi, Urne e camaleonti. Dialogo sulla realtà, le leggi del caso e la teoria quantistica (Il Saggiatore, 1997). Japanese translation, Maruzen (2000); Russian translation, ed. by Igor Volovich (PHASIS Publishing House, 2000); English translation by Daniele Tartaglia, to appear.

3. L. Accardi, On the EPR paradox and the Bell inequality, Volterra Preprint N. 350 (1998).

4. L. Accardi, M. Regoli, Quantum probability and the interpretation of quantum mechanics: a crucial experiment. Invited talk at the workshop: The applications of mathematics to the sciences of nature: critical moments and aspects, Arcidosso, June 28-July 1 (1999). To appear in the proceedings of the workshop. Preprint Volterra N. 399 (1999).

5. L. Accardi, M. Regoli, Local realistic violation of Bell's inequality: an experiment. Conference given by the first-named author at the Dipartimento di Fisica, Università di Pavia, on 24-02-2000. Preprint Volterra N. 402.

6. L. Accardi, M. Regoli, Non-locality and quantum theory: new experimental evidence. Invited talk given by the first-named author at the Conference: Quantum paradoxes, University of Nottingham, on 4-05-2000. Preprint Volterra N. 421.

7. J. S. Bell, Physics 1, 3, 195-200 (1964).

8. J. S. Bell, Rev. Mod. Phys. 38, 447-452 (1966).

9. J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Phys. Rev. Letters 49, 1804-1806 (1969); J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge Univ. Press, 1987).

10. J. F. Clauser, M. A. Horne, Phys. Rev. D 10, 2 (1974).

11. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).

12. A. Einstein, in: Albert Einstein: Philosopher-Scientist, edited by P. A. Schilpp, Library of Living Philosophers (Evanston, Illinois, 1949).


Refutation of Bell's Theorem

Guillaume ADENIER, Louis Pasteur University, Strasbourg, France

E-mail guillaumeadenierulpu-strasbgfr

Bell's Theorem was developed on the basis of considerations involving a linear combination of spin correlation functions, each of which has a distinct pair of arguments. The simultaneous presence of these different pairs of arguments in the same equation can be understood in two radically different ways: either as strongly objective, that is, all correlation functions pertain to the same set of particle pairs, or as weakly objective, that is, each correlation function pertains to a different set of particle pairs. It is demonstrated that once this meaning is determined, no discrepancy appears between local realistic theories and quantum mechanics: the discrepancy in Bell's Theorem is due only to a meaningless comparison between a local realistic inequality written within the strongly objective interpretation (thus relevant to a single set of particle pairs) and a quantum mechanical prediction derived from a weakly objective interpretation (thus relevant to several different sets of particle pairs).

1 Introduction

Bell's Theorem¹ exhibits a peculiar discrepancy between any local realistic theory and Quantum Mechanics, which leads to empirically distinguishable alternatives. The quandary is that neither local realistic conceptions nor Quantum Mechanics are easy to abandon. Indeed, classical physics and common sense are usually based upon the former, while the latter is rightly presented as the most successful theory of all times. Several experiments have been done; all but a few² show violations of Bell inequalities³. Yet the ideas brought forth by Bell's Theorem are so disconcerting that there is still incredulity, not to mention antipathy, evoked by the verdict. The purpose of this article is to provide a refutation of this theorem within a strictly quantum theoretical framework, without the use of outside assumptions.

2 The EPRB gedanken experiment

2.1 Spin observables and singlet state

Bells theorem is usually based on a didactic reformulation of the EPR (Einshystein Podolsky and Rosen4) gedanken experiment due to D Bohm5 In this EPRB gedanken experiment a pair of spin-| particles with total spin zero is produced such that each particle moves away from the source in opposite directions along the y-axis Two Stern-Gerlach devices are placed at opposite

30

points (left and right) on the y-axis and are oriented respectively along the directions u and v The Hilbert space associated with the entire EPRB system is H = 7ih lt8gtHR where T^L and HR are the Hilbert spaces associated with each Stern-Gerlach device respectively The spin observable has two counterparts in this new product space H as

$\sigma_L \cdot u = \sigma \cdot u \otimes I_R$,   (1)

$\sigma_R \cdot v = I_L \otimes \sigma \cdot v$,   (2)

where $I_L$ and $I_R$ are the identity operators of $\mathcal{H}_L$ and $\mathcal{H}_R$. Contrary to the observables $\sigma \cdot u$ and $\sigma \cdot v$, which are mutually non-commuting when $u \neq v$, these new observables $\sigma_L \cdot u$ and $\sigma_R \cdot v$ do commute, reflecting the fact that the Stern-Gerlach devices are arbitrarily far from each other and are thus measuring distinct subsystems. The product of these two observables is therefore also an observable, and can be understood as a spin correlation observable corresponding to the joint spin measurement of both Stern-Gerlach devices. Its eigenvectors are $|\epsilon_L, u\rangle \otimes |\epsilon_R, v\rangle$ with corresponding eigenvalues $\epsilon_L \cdot \epsilon_R$, where each $\epsilon$ is either +1 or -1.

In an EPRB gedanken experiment, the source produces particle pairs with zero total spin, represented by the singlet state

$|\psi\rangle = \frac{1}{\sqrt{2}} \left[\, |+,n\rangle \otimes |-,n\rangle - |-,n\rangle \otimes |+,n\rangle \,\right]$,   (3)

where $n$ is an arbitrary unit vector, which can usually be omitted since the singlet state is invariant under rotation [6].

2.2 Statistical properties and hidden-variables

The expectation value of a spin observable for the singlet state $|\psi\rangle$ is zero,

$\langle\psi|\, \sigma \cdot u \otimes I_R \,|\psi\rangle = 0$,   $\langle\psi|\, I_L \otimes \sigma \cdot v \,|\psi\rangle = 0$,   (4)

whatever $u$ and $v$, as follows from the rotational invariance of the singlet state. Likewise, the expectation value of the spin correlation observable [6, 7] is

$E^\psi(u,v) = \langle\psi|\, (\sigma_L \cdot u)(\sigma_R \cdot v) \,|\psi\rangle$   (5)

$\qquad\quad\;\; = -u \cdot v$,   (6)

which depends only on the relative angle between $u$ and $v$.


In a local realistic hidden-variables model, a single particle pair is supposed to be entirely characterised by means of a set of hidden variables, which are symbolically represented by a parameter $\lambda$, so that the measurement result on the left along $u$ can be written as $A(u,\lambda)$ and the result on the right along $v$ as $B(v,\lambda)$. Although the hidden-variables model is supposed to be fully deterministic, it must also be capable of reproducing the stochastic nature of the EPRB gedanken experiment expressed in Eqs. (4) and (6). For that purpose, the complete state specification $\lambda_i$ of any particle pair with label $i$ must be a random variable [1, 8]: its complete state $\lambda_i$ is supposed to be drawn randomly according to a probability distribution $\rho$.

Consider a set of $N$ particle pairs, $i = 1, \ldots, N$; the mean value of joint spin measurements for this set is

$M^\rho(u,v) = \frac{1}{N} \sum_{i=1}^{N} A(u,\lambda_i)\, B(v,\lambda_i)$.   (7)
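As an illustration of Eq. (7), the following minimal sketch evaluates the mean value for a toy deterministic model. Everything specific here is an assumption made for the example and is not taken from the article: the hidden variable is an angle drawn uniformly on $[0, 2\pi)$, directions are planar angles, and the response functions `A` and `B` are sign functions. This toy model gives perfect anticorrelation for equal directions but is not claimed to reproduce Eq. (6).

```python
import numpy as np

def mean_correlation(theta_u, theta_v, lambdas):
    """Mean value of Eq. (7) for a hypothetical toy model.

    A(u, lam) = sign(cos(theta_u - lam)), B(v, lam) = -sign(cos(theta_v - lam)).
    Both response functions are illustrative assumptions.
    """
    a = np.where(np.cos(theta_u - lambdas) >= 0.0, 1, -1)
    b = -np.where(np.cos(theta_v - lambdas) >= 0.0, 1, -1)
    return float(np.mean(a * b))

# One set of N particle pairs: lambda_i drawn from a uniform distribution rho.
rng = np.random.default_rng(0)
lambdas = rng.uniform(0.0, 2.0 * np.pi, size=200_000)

m_same = mean_correlation(0.3, 0.3, lambdas)       # same direction on both sides
m_45 = mean_correlation(0.0, np.pi / 4, lambdas)   # directions 45 degrees apart
print(m_same, m_45)
```

Note that both calls use the same array `lambdas`, i.e. the same set of particle pairs, which is the situation the strongly objective interpretation of Section 3 refers to.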

3 The CHSH function

In order to establish Bell's Theorem, a linear combination of correlation functions $c(a,b)$ with different arguments [9] is considered, once when these correlation functions are expectation values $E^\psi(u,v)$ given by Quantum Mechanics, i.e. Eq. (6), and once when they are mean values $M^\rho(u,v)$ given by local hidden-variables theories, Eq. (7); then the results are to be compared. A well known choice of such a linear combination is the CHSH (Clauser, Horne, Shimony and Holt [10]) function, written with four pairs of arguments:

$S = |\, c(a,b) - c(a,b') + c(a',b) + c(a',b') \,|$.   (8)

The exact meaning of the simultaneous presence of these different arguments in a CHSH function must be clarified. Basically, there are two possible interpretations: the strongly objective interpretation and the weakly objective interpretation [11, 12].

Strongly Objective Interpretation implies that all correlation functions are relevant to the same set of $N$ particle pairs. As such, they cannot be relevant to actual experiments, but rather to what results would have been obtained if measured on the same set of $N$ particle pairs along different directions.

Weakly Objective Interpretation implies that each correlation function is actually to be measured on a distinct set of $N$ particle pairs; that is, for each pair only one joint spin measurement is to be executed.


The CHSH function was actually developed specifically for experimental convenience [10], and many experiments have been done (the most famous being Aspect's [13]), obviously invoking the natural interpretation, namely the weakly objective one. Nevertheless, the strongly objective interpretation must also be considered, since it remains a possible interpretation a priori, and since the choice between strong and weak objectivity is not made at all explicit in many papers, including Bell's.

It must be stressed that these interpretations are radically different, not only epistemologically but also physically. Indeed, the strongly objective interpretation pertains to a single set of $N$ particle pairs, characterised by the corresponding set of parameters $\{\lambda_i;\ i = 1, \ldots, N\}$, whereas the weakly objective interpretation pertains to no less than 4 sets of $N$ particle pairs. The fact is that a finite set of $N$ particle pairs characterised by $\{\lambda_i\}$ cannot be identically reproduced, either theoretically (for each complete state $\lambda_i$ of any particle pair $i$ is a random variable, as defined in Section 2.2) or empirically (for the experimenter has no control over the complete state of a particle pair in a singlet state). Hence, in the weakly objective interpretation, these four sets are necessarily four different sets of particle pairs [7, 14], respectively characterised by four different sets of hidden-variables parameters $\{\lambda_{1,i}\}$, $\{\lambda_{2,i}\}$, $\{\lambda_{3,i}\}$ and $\{\lambda_{4,i}\}$.

The difference between the two interpretations can therefore be embodied in the number of degrees of freedom of the whole system. Let $f$ be the number of degrees of freedom of a single particle pair. In the strongly objective interpretation, the number of degrees of freedom of the whole CHSH system is then $Nf$, whereas in the weakly objective interpretation it is 4 times as large, that is, $4Nf$. Thus, before initiating Bell's analysis, one has to choose explicitly one interpretation and stick to it.

4 Strongly objective interpretation

4.1 Local realistic inequality within strongly objective interpretation

The local realistic formulation of the CHSH function within strong objectivity is written

$S^\rho_{strong} = |\, M^\rho(a,b) - M^\rho(a,b') + M^\rho(a',b) + M^\rho(a',b') \,|$,   (9)

which (using Eq. 7) becomes, after factorisation, a summation where each term can have two values [2, 7],

$A(a,\lambda_i)\,[\,B(b,\lambda_i) - B(b',\lambda_i)\,] + A(a',\lambda_i)\,[\,B(b,\lambda_i) + B(b',\lambda_i)\,] = \pm 2$,   (10)


so that the most restrictive local realistic inequality within the strongly objective interpretation is

$S^\rho_{strong} \le 2$.   (11)

This is the well known generalised formulation of Bell's inequality due to CHSH [10]. It must be stressed once more, however, that this inequality has been established only within the strongly objective interpretation, which means that each expectation value is relevant to the same set of $N$ particle pairs. Hence, this result cannot be compared directly with results from real experimental tests, where in fact mean values from four distinct sets of $N$ particle pairs are measured.
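The factorisation argument leading to the strongly objective inequality can be checked by brute force. The sketch below (function and variable names are illustrative, not from the article) enumerates every assignment of $\pm 1$ to the four outcomes $A(a,\lambda_i)$, $A(a',\lambda_i)$, $B(b,\lambda_i)$, $B(b',\lambda_i)$ of a single pair and collects the values taken by the factored term.

```python
from itertools import product

def chsh_term(a, a_p, b, b_p):
    """Single-pair term A(a)[B(b) - B(b')] + A(a')[B(b) + B(b')]."""
    return a * (b - b_p) + a_p * (b + b_p)

# All 16 assignments of +1/-1 to the four outcomes of one particle pair.
values = {chsh_term(*outcomes) for outcomes in product((-1, 1), repeat=4)}
print(sorted(values))  # prints [-2, 2]
```

Since every term of the sum is $\pm 2$, the average over $N$ pairs lies between -2 and +2, which is the bound on the absolute value stated above.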

4.2 Quantum mechanical prediction within strongly objective interpretation

The quantum prediction for the CHSH function within the strongly objective interpretation is written

$S^\psi_{strong} = |\, E^\psi(a,b) - E^\psi(a,b') + E^\psi(a',b) + E^\psi(a',b') \,|$.   (12)

This equation is usually directly evaluated by replacing each expectation value by the scalar product result of Eq. (6). This, unfortunately, is all too hasty.

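Indeed, in order to understand better the quantum mechanical meaning of equation (12), it is advantageous to take a step backward, using equation (5):

$S^\psi_{strong} = |\, \langle\psi|(\sigma_L \cdot a)(\sigma_R \cdot b)|\psi\rangle - \langle\psi|(\sigma_L \cdot a)(\sigma_R \cdot b')|\psi\rangle$

$\qquad\qquad + \langle\psi|(\sigma_L \cdot a')(\sigma_R \cdot b)|\psi\rangle + \langle\psi|(\sigma_L \cdot a')(\sigma_R \cdot b')|\psi\rangle \,|$.   (13)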

The four spin correlation observables in this equation are non-commuting observables (this can be shown by calculating the commutator of $(\sigma_L \cdot u)(\sigma_R \cdot v)$ and $(\sigma_L \cdot u)(\sigma_R \cdot v')$ with $v \neq v'$), so that the meaning of their combination must be questioned.
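The commutator mentioned here can be evaluated numerically. The following sketch assumes the standard Pauli-matrix representation of $\sigma$ and picks the directions $u$, $v$, $v'$ arbitrarily for illustration; it also confirms that $\sigma_L \cdot u$ and $\sigma_R \cdot v$ themselves commute, as stated in Section 2.1.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def spin(n):
    """Spin observable sigma . n for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return nx * SX + ny * SY + nz * SZ

def commutator(x, y):
    return x @ y - y @ x

# Illustrative directions (assumptions of this example).
u = (0.0, 0.0, 1.0)
v = (1.0, 0.0, 0.0)
v_p = (0.0, 0.0, 1.0)

left = np.kron(spin(u), I2)              # sigma_L . u, Eq. (1)
right = np.kron(I2, spin(v))             # sigma_R . v, Eq. (2)
corr_v = left @ right                    # (sigma_L . u)(sigma_R . v)
corr_vp = left @ np.kron(I2, spin(v_p))  # (sigma_L . u)(sigma_R . v')

norm_single = np.linalg.norm(commutator(left, right))
norm_corr = np.linalg.norm(commutator(corr_v, corr_vp))
print(norm_single, norm_corr)
```

The first norm vanishes while the second does not, because the commutator reduces to $I_L \otimes [\sigma \cdot v, \sigma \cdot v']$, which is non-zero whenever $v$ and $v'$ are not collinear.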

According to von Neumann [15], any linear combination of expectation values of different observables $R, S, \ldots$ is meaningful in quantum mechanics,

$\langle R + S + \cdots \rangle_\psi = \langle R \rangle_\psi + \langle S \rangle_\psi + \cdots$,   (14)

even if $R, S, \ldots$ are non-commuting observables. However, as was stressed by d'Espagnat [11, 16], quantum mechanics is only a weakly objective theory, and expectation values given by quantum mechanics are also weakly objective statements, that is to say, statements relevant to observations, so that when $R, S, \ldots$ are non-commuting observables, the expectation values cannot be simultaneously relevant to the same set of $N$ systems: each expectation value is necessarily relevant to a distinct set of $N$ systems. Therefore, the only possible meaning of equation (13) is weakly objective, not strongly objective as desired. Of course, this does not imply that Quantum Mechanics cannot provide any meaning at all for the CHSH function; it implies only that this meaning cannot be strongly objective.

Since the local realistic inequality $S^\rho_{strong} \le 2$ cannot be compared with any strongly objective prediction given by Quantum Mechanics, Bell's Theorem cannot be verified with a strongly objective interpretation given to the CHSH function. Hence, there is no choice but to rely on the weakly objective interpretation in order to compare hidden-variables theories and Quantum Mechanics.

5 Weakly objective interpretation

5.1 Quantum mechanical prediction within weakly objective interpretation

It was shown in Section 3 that strong objectivity and weak objectivity pertain to different physical systems. This difference should therefore appear in the relevant equations. Indeed, the correlation expressed in Eq. (6) is relevant to spin measurements performed on particles that once constituted a single parent particle. Yet two particles issued from two distinct parents have never interacted with each other, so that spin measurements performed on such particle pairs cannot be correlated. Hence, if left and right spin measurements are performed on two distinct sets of $N$ particle pairs instead of the same set, there should be no correlation, and this property should appear in a generalised spin correlation function (i.e. generalised to the case of spin measurements performed on different sets of particle pairs).

This can be easily done within a quantum theoretical framework by means of a distinct EPRB space for each set of $N$ particle pairs. Let $\mathcal{H}_j$ be the EPRB Hilbert space associated with the $j$th set of particle pairs. In this Hilbert space, the EPRB gedanken experiment is represented by the singlet state $|\psi_j\rangle$ (see Section 2):

$|\psi_j\rangle = \frac{1}{\sqrt{2}} \left[\, |+\rangle_j \otimes |-\rangle_j - |-\rangle_j \otimes |+\rangle_j \,\right]$.   (15)

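The whole CHSH experiment with the four sets of particle pairs can then be expressed in terms of a new tensor product space $\mathcal{H}_{1234} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3 \otimes \mathcal{H}_4$, in which the state vector is

$|\psi_{1234}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes |\psi_3\rangle \otimes |\psi_4\rangle$.   (16)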


The counterparts of observables in $\mathcal{H}_{1234}$ are obtained as in Section 2.1. For instance, the observable pertaining to the right Stern-Gerlach device for the 2nd set of particle pairs is

$\sigma_{2R} \cdot u = I_1 \otimes (\sigma_R \cdot u) \otimes I_3 \otimes I_4$,   (17)

where $I_j$ is the identity operator of the EPRB space $\mathcal{H}_j$. Hence, the expectation value of the product of two spin observables, the first belonging to the $k$th set and the second to the $l$th set, is

$E^\psi_{kl}(u,v) = \langle\psi_{1234}|\, (\sigma_{kL} \cdot u)(\sigma_{lR} \cdot v) \,|\psi_{1234}\rangle$,   (18)

and this is the generalised expectation value of spin correlation observables that was sought. The expectation value for measurements performed on the same set ($k = l$) of particle pairs is already known, Eq. (6), and $E^\psi_{kk}(u,v)$ should provide the same result. Indeed, using Eqs. (16) and (17) leads to

$E^\psi_{kk}(u,v) = \langle\psi_k|\, (\sigma_L \cdot u)(\sigma_R \cdot v) \,|\psi_k\rangle = -u \cdot v$,   (19)

but when $k \neq l$ the result is quite different:

$E^\psi_{kl}(u,v) = \langle\psi_k|\, \sigma_L \cdot u \,|\psi_k\rangle \, \langle\psi_l|\, \sigma_R \cdot v \,|\psi_l\rangle = 0$,   (20)

in accord with Eq. (4). There are indeed no correlations between two sets of particle pairs, as stipulated in the beginning of this section.
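The two cases $k = l$ and $k \neq l$ can be verified numerically. As a simplifying assumption, the sketch below uses only two sets of particle pairs (a 16-dimensional space) instead of the four sets of Eq. (16); the directions and the helper names are illustrative.

```python
from functools import reduce
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def spin(n):
    return n[0] * SX + n[1] * SY + n[2] * SZ

def kron_all(*ops):
    """Tensor product of several single-particle operators."""
    return reduce(np.kron, ops)

def expectation(op, state):
    return float(np.real(np.vdot(state, op @ state)))

# Singlet state of one pair; product state of two independent pairs.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
psi_12 = np.kron(singlet, singlet)

u = np.array([0.0, 0.0, 1.0])
v = np.array([np.sin(1.0), 0.0, np.cos(1.0)])  # 1 rad away from u

# Factor ordering: (1L, 1R, 2L, 2R).
sigma_1L_u = kron_all(spin(u), I2, I2, I2)
sigma_1R_v = kron_all(I2, spin(v), I2, I2)
sigma_2R_v = kron_all(I2, I2, I2, spin(v))

e_11 = expectation(sigma_1L_u @ sigma_1R_v, psi_12)  # same set, cf. Eq. (19)
e_12 = expectation(sigma_1L_u @ sigma_2R_v, psi_12)  # different sets, cf. Eq. (20)
print(e_11, -np.dot(u, v), e_12)
```

The same-set value reproduces $-u \cdot v$, while the cross-set value factorises into two single-particle expectation values that vanish by Eq. (4).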

Now, contrary to what was done in Section 4.2, it is possible to proceed here in full accord with the quantum mechanical postulates, because the spin correlation observables built from observables such as the one given in Eq. (17) are mutually commuting, so that a linear combination of these commuting observables is an observable as well. The CHSH experiment can therefore be described by a new observable

$\hat{S}_{weak} = (\sigma_{1L} \cdot a)(\sigma_{1R} \cdot b) - (\sigma_{2L} \cdot a)(\sigma_{2R} \cdot b')$

$\qquad\qquad + (\sigma_{3L} \cdot a')(\sigma_{3R} \cdot b) + (\sigma_{4L} \cdot a')(\sigma_{4R} \cdot b')$,   (21)

and the quantum prediction for the CHSH function within a weakly objective interpretation is therefore obtained by calculating the expectation value of the observable $\hat{S}_{weak}$ when the system is in the quantum state $|\psi_{1234}\rangle$,

$S^\psi_{weak} = |\, \langle\psi_{1234}|\, \hat{S}_{weak} \,|\psi_{1234}\rangle \,|$,   (22)

which, using Eqs. (17), (18) and (19), is

$S^\psi_{weak} = |\, E^\psi_{11}(a,b) - E^\psi_{22}(a,b') + E^\psi_{33}(a',b) + E^\psi_{44}(a',b') \,|$.   (23)


This equation is not ambiguous (as was Eq. 12): it is a linear combination of expectation values, each relevant to a distinct set of $N$ particle pairs. This equation is therefore weakly objective, as requested.

Finally, using Eq. (19) yields

$S^\psi_{weak} = |\, a \cdot b - a \cdot b' + a' \cdot b + a' \cdot b' \,|$,   (24)

with a well known maximum equal to

$\max(S^\psi_{weak}) = 2\sqrt{2}$.   (25)
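The maximum quoted in Eq. (25) can be checked numerically. The sketch below evaluates Eq. (24) at one standard coplanar choice of directions (an assumption of this example, not taken from the article) and then samples random unit vectors to confirm that the value $2\sqrt{2}$ is not exceeded.

```python
import numpy as np

def chsh_quantum(a, a_p, b, b_p):
    """Right-hand side of Eq. (24) for unit vectors a, a', b, b'."""
    return abs(a @ b - a @ b_p + a_p @ b + a_p @ b_p)

def planar(theta):
    return np.array([np.cos(theta), np.sin(theta), 0.0])

# Coplanar directions: a at 0, a' at 90, b at 45, b' at 135 degrees.
s_opt = chsh_quantum(planar(0.0), planar(np.pi / 2),
                     planar(np.pi / 4), planar(3 * np.pi / 4))

# Random quadruples of unit vectors in three dimensions.
rng = np.random.default_rng(1)
vecs = rng.normal(size=(10_000, 4, 3))
vecs /= np.linalg.norm(vecs, axis=2, keepdims=True)
s_random_max = max(chsh_quantum(*quad) for quad in vecs)

print(s_opt, s_random_max, 2 * np.sqrt(2))
```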

This numerical result is indeed the one given in the literature, the only difference here being the fact that the meaning of this result is unambiguously weakly objective. Quantum Mechanics, which is a weakly objective theory [11], provides a clear answer to the CHSH function understood as a weakly objective question.

5.2 Local realistic inequality within weakly objective interpretation

The last step consists in comparing the quantum prediction $S^\psi_{weak}$ with its local realistic counterpart $S^\rho_{weak}$. As was stressed in Section 3, the $j$th set of particle pairs must be characterised by a distinct set of hidden-variables parameters $\{\lambda_{j,i};\ i = 1, \ldots, N\}$. Hence, to the generalised expectation value of the spin correlation observable, Eq. (18), corresponds the generalised mean value of joint spin measurements

$M^\rho_{kl}(u,v) = \frac{1}{N} \sum_{i=1}^{N} A(u,\lambda_{k,i})\, B(v,\lambda_{l,i})$,   (26)

which is a priori capable of reproducing not only the $k = l$ prediction, Eq. (19), but also the $k \neq l$ prediction, Eq. (20). The local realistic CHSH function with a weakly objective interpretation is therefore

$S^\rho_{weak} = |\, M^\rho_{11}(a,b) - M^\rho_{22}(a,b') + M^\rho_{33}(a',b) + M^\rho_{44}(a',b') \,|$,   (27)

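and that is, explicitly,

$S^\rho_{weak} = \left|\, \frac{1}{N} \sum_{i=1}^{N} \left[\, A(a,\lambda_{1,i}) B(b,\lambda_{1,i}) - A(a,\lambda_{2,i}) B(b',\lambda_{2,i}) \right.\right.$

$\qquad\qquad \left.\left. +\, A(a',\lambda_{3,i}) B(b,\lambda_{3,i}) + A(a',\lambda_{4,i}) B(b',\lambda_{4,i}) \,\right] \,\right|$.   (28)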


This expression is to be compared with the one pertaining to the strongly objective interpretation (Section 4.1), which contained terms that could be factored. Here, since each term is different from the others, no factorisation is possible, i.e. there is no way to derive a Bell inequality [7]. This is not the first time this fact has been noticed; unfortunately, no conclusion was drawn then. Yet this fact cannot be ignored, for it has been shown in Section 4 that Bell's Theorem cannot be demonstrated within a strongly objective interpretation.

Here, the only local realistic inequality that can be derived is obtained by considering, as was done with Eq. (10), the possible numerical values of each term of the summation in Eq. (28), for which the extrema are +4 and -4, so that the narrowest local realistic inequality that can be derived from Eq. (28) is nothing but

$S^\rho_{weak} \le 4$.   (29)

This most restrictive local realistic inequality (which can also be found in Accardi [17]) is not incompatible with the quantum mechanical prediction, as the maximum of $S^\psi_{weak}$ is $2\sqrt{2}$. This shows that experiments intended to test Bell's Theorem were unfortunately not testing the strongly objective inequality, Eq. (11), which is a Bell inequality, but this weakly objective one, Eq. (29), since all experimental tests are necessarily executed in a weakly objective way, due to the irreducible incompatibility between spin measurements. As was stressed by Sica [18] and Accardi [17], a local realistic inequality is nothing but an arithmetic identity, and inequality (29) is definitely too lax to be violated by experimental tests.
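The extrema of the summand in Eq. (28) can be enumerated in the same way as for Eq. (10). In the sketch below (names are illustrative), the eight outcomes come from four different particle pairs and are therefore treated as eight independent $\pm 1$ values, which is the assumption that distinguishes the weakly objective case.

```python
from itertools import product

def weak_term(a1, b1, a2, b2, a3, b3, a4, b4):
    """Summand of Eq. (28): one outcome pair from each of the four sets."""
    return a1 * b1 - a2 * b2 + a3 * b3 + a4 * b4

# All 256 assignments of +1/-1 to the eight independent outcomes.
weak_values = {weak_term(*o) for o in product((-1, 1), repeat=8)}
print(sorted(weak_values))  # prints [-4, -2, 0, 2, 4]
```

Because no outcome is shared between terms, nothing forces a cancellation, and the summand ranges over the whole interval from -4 to +4.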

6 Conclusion

It was shown that Bell's Theorem cannot be derived, either within a strongly objective interpretation of the CHSH function, because Quantum Mechanics gives no strongly objective results for the CHSH function (see Section 4.2), or within a weakly objective interpretation, because the only derivable local realistic inequality is never violated, either by Quantum Mechanics or by experiments (see Section 5.2). It was demonstrated that the discrepancy in Bell's Theorem is due only to a meaningless comparison between $S^\rho_{strong} \le 2$ and $S^\psi_{weak} = 2\sqrt{2}$, where the former is relevant to a system with $Nf$ degrees of freedom, whereas the latter is relevant to one with $4Nf$ (see Section 3). The only meaningful comparison is between the weakly objective local realistic inequality $S^\rho_{weak} \le 4$ and the weakly objective quantum prediction $S^\psi_{weak} = 2\sqrt{2}$, but these results are not incompatible. Bell's Theorem, therefore, is refuted.


References

1. J. S. Bell, Physics 1, 195 (1964).
2. F. Selleri, Le grand débat de la mécanique quantique (Champs Flammarion, Paris, 1986).
3. A. Aspect, Nature 398, 189 (1999).
4. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
5. D. Bohm, Phys. Rev. 85, 166 (1952).
6. D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
7. A. Bohm, Quantum Mechanics: Foundations and Applications (Springer-Verlag, New York, 1979).
8. J. S. Bell, in Proceedings of the International School of Physics Enrico Fermi, Course IL: Foundations of Quantum Mechanics (Academic, New York, 1971), p. 171.
9. J. S. Bell, Epistemological Letters, p. 2 (July 1975).
10. J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
11. B. d'Espagnat, Veiled Reality: An Analysis of Present-Day Quantum Mechanical Concepts (Addison-Wesley, 1995).
12. B. d'Espagnat, http://arXiv.org/abs/quant-ph/9802046
13. A. Aspect, J. Dalibard and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
14. A. Khrennikov, http://arXiv.org/abs/quant-ph/0006017
15. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1955).
16. B. d'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Massachusetts, 1976).
17. L. Accardi, http://arXiv.org/abs/quant-ph/0007005
18. L. Sica, Opt. Commun. 170, 55 (1999).

PROBABILITY CONSERVATION AND THE STATE DETERMINATION PROBLEM

S. AERTS
Free University of Brussels, Triomflaan 2, Brussels, Belgium
E-mail: saerts@vub.ac.be

The problem of finding an operational definition for the wave vector is briefly examined from a historical point of view. Led by an old idea of Feenberg, we integrate the one-dimensional probability conservation equation to obtain a closed formula that determines the state vector in the spinless case. The formula that determines the state does not depend on the (real) potential, external fields having their influence on the state only through the time derivative of the probability density function in position space. We apply the method to the simple case of a free Gaussian wave packet. Some problems regarding the operational status of the quantities involved are discussed.

1 Introduction

It is well known that Heisenberg constructed the matrix formulation of quantum mechanics by keeping in close accordance with what might be labelled the principle of operationality. Roughly, one can describe this principle as a determination to introduce only measurable quantities. Schrödinger, more concerned with Anschaulichkeit than operationality, introduced rather unscrupulously the concept of a wave function. He initially interpreted the wave function as a charge density in space, but this interpretation is difficult to extend to several-particle problems [a]. The interpretation that would stand the test of time, as testified by its being awarded the Nobel prize in 1954, was due to Born. In analogy with the theory of electromagnetic radiation, in which the intensity is the square of the amplitude, Born took the step to interpret the intensity of an electromagnetic wave in a given region of space as proportional to the relative frequency of a photon detection in that region, and the probabilistic interpretation was born. However, this correspondence still does not make the wave function an operational quantity, as for every density $\rho(x,t)$ there are infinitely many $\varphi(x,t)$ such that, with $\psi(x,t) = \sqrt{\rho(x,t)}\, e^{i\varphi(x,t)}$, we get $\psi^*(x,t)\psi(x,t) = \rho(x,t)$. The problem is then to find suitable functions that we can approximate experimentally in a statistical way and that, in some well chosen combination, yield the same information as the complete wave function.

[a] For a rescue attempt of the original Schrödinger interpretation, see Dorling [1].

In order to make the question mathematically more precise, Prugovecki [2] introduced the notion of informational completeness. A family $\mathcal{F} = \{O_i;\ i \in I\}$ of bounded operators on a Hilbert space $\mathcal{H}$ is called informationally complete iff for every two density operators $\rho$ and $\rho'$ the equality $\mathrm{Tr}(\rho O_i) = \mathrm{Tr}(\rho' O_i)$ for all $i \in I$ implies $\rho = \rho'$. This definition implies that the set of expectation values of an informationally complete set of operators allows only one state operator from which the expectation values could have been derived. What characterizes such a set? In a classical statistical framework, we can calculate all macroscopic quantities from a single density function $\rho(p,q)$ in phase space. Hence, by analogy, one is naturally led to the following interesting question, originally due to Pauli [3]: is it sufficient to know the probability density functions of position and momentum to determine unambiguously the quantum mechanical state of the physical system? In the quantum mechanical case it is sufficient to know the wave function in coordinate space, $\psi(x,t)$, since the corresponding wave function for the same system in momentum space, $\tilde{\psi}(p,t)$, is given by its Fourier transform. Hence we can phrase the problem in a more mathematical way: is it possible to determine a square integrable function uniquely from both its modulus and the modulus of its Fourier transform?

Possibly the first non-trivial counterexamples came from Bargmann [b], who constructed explicit examples of wave functions $\psi_1$ and $\psi_2$ that give rise to the same probability distributions for position and momentum, but give a different probability distribution for a third operator that does not commute with the position or momentum operator. This leads to the remarkable conclusion that the wave function in its coordinate representation contains more information than the corresponding probability densities in position and momentum together. Due to Bargmann, we know the answer to be negative in a physically relevant way [c], and what is now commonly referred to as the Pauli problem is either the problem of determining the set of states that share the same modulus and the same modulus of their Fourier transform, or the problem of finding a set of observables that are informationally complete. The problems are related but not identical, and we prefer to refer to the first version of the problem as the Pauli problem and to the second as simply the state determination problem. It seems much more work has been done on the state determination problem, which is not surprising given the fact that the Pauli problem is a special case of it.

[b] Bargmann never seems to have published these results himself, and as a result little reference is given to his work in the literature. However, the examples can be found in Reichenbach [4].
[c] The problem re-appeared unaltered in the 1958 edition of Pauli's book, more than a decade after the first counterexamples.

With the exception of the production of counterexamples such as Bargmann's, the first instructive results regarding the Pauli problem were obtained only in 1978 by Corbett and Hurst [5]. In their paper they construct physically important classes of functions that are uniquely determined by their position and momentum distributions. However, they also show that there exist dense subsets of states that are not uniquely determined by their position and momentum distributions, and as a consequence any state can be approximated in norm by a non-unique state. Extensions, comments and counterexamples to their work can be found in Friedman [6] and Pavičić [7]. Nevertheless, the complete characterization of the set of states that share modulus and the modulus of their Fourier transform is still open.

As for the state determination problem, we can split the work into those who were primarily concerned with establishing a set of observables that is informationally complete (or disproving a certain set to have this property) and those that set out to characterize such sets. The first group includes Feenberg [8] (1933), Moyal [9] (1949), Gale, Guth and Trammell [10] (1968), Band and Park [11, 12, 13] (1970-1971) and many more [14, 15, 16]. We will not go into the reconstruction of the state by placing the entity in different potentials, a method pioneered by Lamb [17] and one that inspired many similar approaches such as Wiesbrock [18] and Weigert [19], nor will we mention the vast literature pertaining to the measurement of the Wigner distribution, known as phase-space tomography.

However, concerning the characterization of informationally complete sets, we cannot help but make the following elementary remarks. Suppose we have a non-trivial (i.e. not a multiple of the identity) self-adjoint operator $A$ that commutes with every member of a set of operators $S$ in a Hilbert space $\mathcal{H}$. It is well known that the one-parameter family of unitary operators $\exp(itA)$ also commutes with every element of $S$. Now take any $\psi$ that is not an eigenvector of $A$. For any observable in $S$, the state $\psi_t = \exp(itA)\psi$ gives the same expectation value for this operator, whatever numerical value $t$ has. But if $t \neq s$, it follows that in general $\psi_t \neq \psi_s$ (for the relation of this with superselection rules, see Wick, Wightman and Wigner (1952) [20], Emch and Piron (1963) [21] and Piron [22]). Hence $S$ is not an informationally complete set of observables. So a necessary condition for a set of observables to be informationally complete is maximality in the sense of Dirac, in other words, that there be no other non-trivial operator that commutes with every member of the set. However, this is far from sufficiency. As Busch and Lahti [23] have shown, it is easy to derive [d] from the considerations above that no commuting set of observables is informationally complete. Maximal commuting sets of observables serve as a means of state preparation, not state identification. This means that, at least for continuous variables, the Pauli set $\{P, Q\}$ is in a certain sense the minimal set that one could possibly hope to be informationally complete (although Bargmann has shown this in general not to be the case).

[d] One arrives at this result by allowing $A$ to be a member of $S$.
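The elementary remark on operators commuting with every member of a set $S$ can be illustrated in a small finite-dimensional example. The sketch below is an assumption-laden toy case, not taken from the article: $A$ is a diagonal matrix on a three-dimensional space, $S$ consists of the three diagonal projectors (which commute with $A$), and $\psi$ is an equal superposition, hence not an eigenvector of $A$.

```python
import numpy as np

# Toy example (illustrative assumptions): A is diagonal, S is the set of
# diagonal projectors, which all commute with A.
a_eigs = np.array([0.0, 1.0, 2.0])
observables = [np.diag(row) for row in np.eye(3)]
psi = np.ones(3, dtype=complex) / np.sqrt(3)  # not an eigenvector of A

def evolve(state, t):
    """Apply exp(i t A); A is diagonal, so this is a phase per component."""
    return np.exp(1j * t * a_eigs) * state

def expectations(state):
    return [float(np.real(np.vdot(state, op @ state))) for op in observables]

psi_t = evolve(psi, 1.0)
overlap = abs(np.vdot(psi, psi_t))  # equals 1 only if the states coincide as rays

print(expectations(psi), expectations(psi_t), overlap)
```

The expectation values agree for every member of $S$, while the overlap is strictly smaller than one, so the set cannot distinguish two physically different states.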

2 Conservation of Probability

What we will present in this article is an elaboration on the reasoning followed by Feenberg. Consider the time-dependent Schrödinger equation in $\psi$ with a real [e] potential $V$, using the shorthand $\psi$ for $\psi(r,t)$:

$\frac{\partial \psi}{\partial t} = -\frac{\hbar}{2im} \nabla^2 \psi + \frac{1}{i\hbar} V \psi$

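Multiply by $\psi^*$ and add this to the complex conjugate of the above equation multiplied by $\psi$. After some elementary vector operator manipulation, we find what is commonly known as the conservation law of probability:

$\frac{\partial \rho}{\partial t} + \nabla \cdot j = 0$,   with $\rho = \psi^* \psi$ and $j = \frac{\hbar}{2im} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right)$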

Substitution of the polar representation of the wave vector, $\psi(r,t) = \sqrt{\rho}\, e^{i\varphi}$ ($\varphi$ assumed real), into the former equation yields a second order partial differential equation,

$\frac{\partial \rho}{\partial t} + \frac{\hbar}{m} \nabla \cdot (\rho \nabla \varphi) = 0$,

which is in fact a Fokker-Planck equation with zero diffusion coefficient and the phase serving as a potential.

Feenberg's argument is a uniqueness result based on this last equation. It amounts to showing that any two phase functions that satisfy this equation and some gentle boundary conditions differ by at most a constant. His 1933 thesis is hard to get hold of, but the argument was (erroneously [10, 15]) extended by Kemble [24] to three spatial dimensions in his much easier to find handbook on quantum mechanics. What we will do here is go back to the original one-dimensional idea, but rather than trying to establish a uniqueness result, we will show that in this simple case a solution can be obtained by direct integration.

3 Determination of the phase function

So $\rho$ and $\varphi$ satisfy the conservation law as given by the last equation. Rewriting this equation in one dimension, evaluated at a specific time instant $t = t_0$, gives us

$\rho(x,t_0) \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial \rho(x,t_0)}{\partial x} \frac{\partial \varphi}{\partial x} + \frac{m}{\hbar} \left. \frac{\partial \rho(x,t)}{\partial t} \right|_{t=t_0} = 0$

[e] The imaginary part of a complex potential can be used to mimic creation and annihilation effects. Although this is sometimes a useful approximation, such results violate the continuity equation, and for a more reliable analysis one should really use a second quantized theory.

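Assume for the time being that $\rho(x,t_0) \neq 0$ and divide the equation by $\rho(x,t_0)$:

$\frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial \ln \rho(x,t_0)}{\partial x} \frac{\partial \varphi}{\partial x} + \frac{m}{\hbar} \left. \frac{\partial \ln \rho(x,t)}{\partial t} \right|_{t=t_0} = 0$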

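Assuming $\rho(x)$ and its time derivative to be known functions, we can solve for the unknown phase $\varphi(x,t_0)$. Set

$\varphi'(x) = \frac{\partial \varphi}{\partial x}$,   $f(x) = \frac{\partial \ln \rho(x,t_0)}{\partial x}$,   $g(x) = -\frac{m}{\hbar} \left. \frac{\partial \ln \rho(x,t)}{\partial t} \right|_{t=t_0}$

As all quantities are evaluated at the same time instant $t = t_0$, we will not bother to give further notational reference to this fact. In what follows we will also abbreviate (with abuse of language) $\left. \left( \partial \ln \rho(x,t) / \partial t \right) \right|_{t=t_0}$ as $\partial_t \ln \rho(x)$. Applying these transformations, the equation becomes

$\frac{d\varphi'}{dx} + f(x)\, \varphi' = g(x)$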

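So we have transformed the second order partial differential equation into an ordinary first order linear differential equation with a source $g(x)$ at a fixed time instant. The solution of the homogeneous equation is $\varphi_h = \exp[-\int f(x)\,dx] = \rho^{-1}(x)$. The general solution, with $c$ chosen to fit the boundary condition, is $\varphi'(x) = \varphi_h(x) \left( c + \int^x g(s) \rho(s)\, ds \right)$. We have to integrate this result once more to get $\varphi(x)$:

$\varphi(x) = \int^x \varphi_h(r) \left( c + \int^r g(s) \rho(s)\, ds \right) dr$

$\qquad = \int^x \rho^{-1}(r) \left[ c - \frac{m}{\hbar} \int^r \rho(s)\, \partial_t \ln \rho(s)\, ds \right] dr$

$\qquad = \int^x \left( c - \frac{m}{\hbar} \int^r \partial_t \rho(s)\, ds \right) \frac{dr}{\rho(r)}$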

4 Validity and range of applicability

The solution is seen to be a two-parameter family of curves, one for every value of the constant $c$ and one for every lower limit, say $x_0$, of the $r$ integration [f]. The result of changing the lower integration limit is only the addition of an overall constant to $\varphi(x,t)$. Because we know the quantum mechanical expectation values and probabilities to be invariant under such an addition, we set this constant equal to zero. The value of the constant $c$ can potentially affect the phase in a more profound way. Depending on the particular $\rho(r,t)$ used, $1/\rho(r,t)$ might diverge when $\rho(r,t)$ is zero for some value(s) of $r$, or even worse, for some interval $\Delta r$. First of all, we assumed in our derivation that $\rho(r,t) \neq 0$, but this restriction can easily be removed. Indeed, suppose we have $n$ places $x_i$ where the density does equal zero. A solution $\psi_i$ is then obtained for each interval $]x_i, x_{i+1}[$ by means of our equation. The total solution $\psi$ is obtained by pasting all the $\psi_i$ together by requiring continuity of $\psi$ and $\nabla\psi$ [g]. Now, continuity of $\psi$ and $\nabla\psi$ implies continuity of their respective complex conjugates and hence of $\rho$ and $\nabla\rho$. If we are to infer the phase from actual data, it seems reasonable to require $\varphi$ also to be continuous. In fact, the conservation equation requires it to be twice differentiable. If any cutting and pasting is necessary to obtain the solution, we can easily see that the constant $c$ should be the same for any two pasted pieces. Hence, if the cut is applied at a pole, $c$ has to be zero [h] for $\varphi$ to be continuous. We arrive at the same conclusion when we use the same reasoning on a point adjacent to the support of $\rho$. Hence we arrive at the main result of our paper:

[f] The lower limit of the $s$ integration is absorbed in the constant $c$.

$\psi(x,t_0) = \sqrt{\rho(x,t_0)}\, \exp\left( -i \frac{m}{\hbar} \int^x \frac{dr}{\rho(r,t_0)} \int^r \partial_t \rho(s,t_0)\, ds \right)$

Note that the state does not contain reference to the potential. External fields will show up in the state indirectly, as a consequence of the time dependence of $\rho$. The assumptions that underlie the derivation of the equation are a spinless one-dimensional particle that acts under a real potential $V$, being prepared in a pure state; in short, all that is required for a particle to obey the one-dimensional dynamical Schrödinger equation. However restricted this class is, it does include many examples that can be found in standard textbooks on quantum mechanics.
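The main formula lends itself to a direct numerical check. The following minimal sketch rests on several assumptions that are not part of the derivation above: units with $\hbar = m = 1$, a free Gaussian density whose centre moves with velocity $\hbar k_0 / m$ and whose width follows the standard textbook spreading law, a finite-difference estimate of $\partial_t \rho$ at $t_0 = 0$, trapezoidal integration, and $c = 0$. The helper names are hypothetical. For such a packet the phase at $t = 0$ is expected to be $k_0 x$ up to an additive constant.

```python
import numpy as np

HBAR = 1.0  # assumed units
MASS = 1.0

def cumtrapz(y, dx):
    """Cumulative trapezoidal integral of y, starting from zero."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * dx)
    return out

def rho_gauss(x, t, sigma0=1.0, k0=2.0):
    """Density of a free Gaussian packet (standard textbook form, assumed)."""
    v = HBAR * k0 / MASS
    sigma_t2 = sigma0**2 * (1.0 + (HBAR * t / (2.0 * MASS * sigma0**2))**2)
    return np.exp(-(x - v * t)**2 / (2.0 * sigma_t2)) / np.sqrt(2.0 * np.pi * sigma_t2)

def reconstruct_phase(x, rho, dt_rho):
    """phi(x) = -(m/hbar) * int^x dr/rho(r) * int^r d_t rho(s) ds, with c = 0."""
    dx = x[1] - x[0]
    inner = cumtrapz(dt_rho, dx)            # inner s integration
    dphi = -(MASS / HBAR) * inner / rho     # phi'(r)
    phi = cumtrapz(dphi, dx)                # outer r integration
    return phi - phi[np.argmin(np.abs(x))]  # fix the free constant: phi(0) = 0

x = np.linspace(-6.0, 6.0, 4001)
eps = 1e-4
dt_rho = (rho_gauss(x, eps) - rho_gauss(x, -eps)) / (2.0 * eps)
phi = reconstruct_phase(x, rho_gauss(x, 0.0), dt_rho)

core = np.abs(x) <= 3.0  # avoid the far tails, where 1/rho amplifies errors
max_err = np.max(np.abs(phi[core] - 2.0 * x[core]))
print(max_err)
```

The comparison is restricted to the central region because the division by $\rho$ amplifies numerical noise in the tails, which mirrors the remark above about zeros of the density.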

Comparing the result we have found to those in the literature, we find the closest match with a result obtained by Gale, Guth and Trammell [10]. They apply the definitions of $\rho(r)$ and $j(r)$ to show that knowing these is sufficient for the determination of the phase. They then discuss a gedanken experiment for establishing the probability current by measuring the expectation of the velocity, and argue by means of this experiment and an intuitive argument that the current $j(r)$ equals $\rho(r) \langle v(r) \rangle$ for some $r$ inside a small space region that is supposed to contain the particle. Our result was obtained by a direct integration and as a consequence is exact. It is, however, difficult to extend to higher dimensions, for two reasons. The first is the fact that the expression for the probability current in the presence of a vector potential becomes $J(x,t) = \mathrm{Re}\{ \psi^*(x,t) [\, p/m - (q/mc) A \,] \psi(x,t) \}$, and depending on the form of the vector potential it is not obvious to what function of the phase this corresponds. If the vector potential corresponds to a uniform magnetic field, or in absence of a vector potential (in which case one can transform the equation into a Poisson equation), one can solve the continuity equation by employing standard techniques. However, one then encounters a second problem. Providing an initial value for the phase (which is unproblematic, as the phase is only determined within an additive constant) is no longer sufficient; instead we need an initial boundary function. Hence we have to resort to other principles to determine the phase on such a boundary in order to solve the problem. Of course, the principle of conservation may still serve the purpose of reducing the family of admissible functions for the phase of the amplitude. We will now illustrate the principle by applying it to a Gaussian wave packet. Later we will expound a few operational issues regarding the quantities involved in the solution given above.

[g] This continuity demand is in fact a necessity, because the validity of the equation of probability conservation (and a fortiori of the Schrödinger equation) requires $\psi$ and $\nabla\psi$ to be continuous. A notable but unproblematic exception is that of an infinite potential step.
[h] The value of $c$ might be non-zero in applications where the continuity equation only expresses conservation of the probability flux in some intermediate region, the boundaries (possibly at infinity) containing sinks or sources of probability.

5 Evolution of a Gaussian Wave Packet

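The full time-dependent wave function for a free Gaussian wave packet is

$$\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/4}\left[1 + \frac{i\hbar t}{2m(\Delta x)_0^2}\right]^{-1/2}\exp\left[\frac{-x^2/4(\Delta x)_0^2 + ik_0x - ik_0^2\hbar t/2m}{1 + i\hbar t/2m(\Delta x)_0^2}\right]$$

From this we easily calculate $\rho(x,t)$:

$$\rho(x,t) = \psi^*(x,t)\,\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/2}\left[1 + \frac{\hbar^2t^2}{4m^2(\Delta x)_0^4}\right]^{-1/2}\exp\left[-\frac{(x - k_0\hbar t/m)^2}{2(\Delta x)_0^2\left[1 + \hbar^2t^2/4m^2(\Delta x)_0^4\right]}\right]$$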

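Now assume we did not know the wave function, only the probability density and its time derivative at some time instant $t = 0$. In an abbreviated form, with the coefficients identified as $a = [2\pi(\Delta x)_0^2]^{-1/2}$, $b = \hbar/2m(\Delta x)_0^2$, $c = 2(\Delta x)_0^2$ and $d = k_0\hbar/m$ (this coefficient $c$ is unrelated to the integration constant discussed above), we can write the probability density as

$$\rho(x,t) = a\,(1 + b^2t^2)^{-1/2}\exp\left[-\frac{(x - dt)^2}{c\,(1 + b^2t^2)}\right]$$

At time $t = 0$ this gives us $\rho(x,0) = a\exp(-x^2/c)$. The derivative of $\rho$ with respect to the time parameter at $t = 0$ is

$$\partial_t\rho(x,t)\Big|_{t=0} = \partial_t\left[a\,(1 + b^2t^2)^{-1/2}\exp\left(-\frac{(x - dt)^2}{c\,(1 + b^2t^2)}\right)\right]_{t=0} = 2a\,\frac{x}{c}\,d\,\exp\left(-\frac{x^2}{c}\right)$$

So the phase becomes, choosing the lower limits such that the flux vanishes at $-\infty$ and the additive phase constant is zero,

$$\begin{aligned}
\varphi(x,0) &= -\frac{m}{\hbar}\int_{0}^{x}\frac{1}{\rho(r,0)}\int_{-\infty}^{r}\partial_t\rho(s,0)\,ds\,dr \\
&= -\frac{2dm}{c\hbar}\int_{0}^{x}\exp\left(\frac{r^2}{c}\right)\int_{-\infty}^{r}s\,\exp\left(-\frac{s^2}{c}\right)ds\,dr \\
&= \frac{m}{\hbar}\int_{0}^{x}\frac{k_0\hbar}{m}\,dr \\
&= k_0x
\end{aligned}$$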

which is precisely the desired phase of the wave function at $t = 0$.

6 Operational Issues

Expounding Feenberg's uniqueness result, Reichenbach points out that we can recover the phase by numerical computation if we know $\rho(x,t_0)$ and $\partial_t\rho(x,t)|_{t=t_0}$. In order to establish these quantities, Reichenbach outlines the following procedure.^4 We take an ensemble $A$ of identically prepared systems, such that the ensemble can be properly described by a pure state $\psi$. Now select at random two sub-ensembles from $A$, say $B$ and $C$. For each system in $B$ we measure at the time $t_0$ the value of $x$. As the results will vary, we obtain in this way a distribution $\rho(x,t_0)$. Likewise, for each system in $C$ we measure at the time $t_1$ the value of $x$, obtaining a distribution $\rho(x,t_1)$. The quotient

$$\frac{\rho(x,t_1) - \rho(x,t_0)}{t_1 - t_0}$$

is then supposed to approximate $\partial_t\rho(x,t)$ for $t \in [t_0,t_1]$ if the interval $[t_0,t_1]$ is chosen sufficiently small. The wave function can then be obtained through numerical approximation and represents the state of the systems that are left untouched in the original ensemble $A$.

There is a problem with Reichenbach's procedure for determining these quantities that is of equal concern to our method. Despite the fact that it is entirely possible to position the detector wherever one wants it to be, hence effectively controlling $x$ in $\rho(x,t)$, it is an annoying peculiarity of quanta that one cannot determine when a detection will take place. One places a detector and simply waits for a detection count to happen. The problem seems related to what Mielnik has called the screen problem, in a provocative and enlightening paper by the same name.^25 As Mielnik points out, experimentalists perform a lot of experiments, but none resembling an instantaneous check of particle position. Indeed, a measurement setup typically consists of a source; what is emitted undergoes a series of transformations (e.g. an optical bench or a potential) and is subsequently detected by a fixed detector or a set of fixed detectors. If we are to describe operational means of measuring densities at some time instant, we will have to do so by such a typical setup.

To produce anything remotely satisfactory, we will need a few assumptions. A first assumption is that if a particle is detected at some time instant $t_0$ in position $x$, the intricate mechanism between the measurement apparatus and the particle that is responsible for its detection does not depend on $t_0$, and in this sense has no effect on the value of $\rho(x,t)$. However unnatural the assumption might be from a physical point of view, it seems to underlie the statistical interpretation of $\int_\Omega |\psi(x,t)|^2\,dV$ as an instantaneous localization probability of the system in a state $\psi$, in a space region $\Omega$ and at a time instant $t$. In so far as our analysis depends on this assumption, so does the standard interpretation of quantum mechanics.

The next assumption is that we are able to control the release of the particle in a certain state within a sufficiently small time interval $\Delta t$, such that within this small time interval the density can reasonably be approximated by a linear function. This can be achieved by placing a shutter mechanism behind the source. Naturally, the shutter opening time has to be substantially less than the coherence time of the particle. A sufficiently short opening time can only be established by experiment, and one can never be quite sure whether there would still be more oscillations on a much shorter time scale. A density function with a larger variation will be harder to approximate, as it requires a shorter shutter opening time and hence will result in a lower detection rate. The wave packet then participates in the transformations we may have set up (optical bench, Stern-Gerlach) and is detected. The time interval between the shutter release and the detection time is noted, together with the position of the detector.
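Reichenbach's difference quotient can be mimicked with simulated data. The sketch below is only an illustration under assumptions that are not in the text: position outcomes are drawn from the density of a free Gaussian packet (units $m = \hbar = 1$, hypothetical parameters `sigma` and `k0`), the two sub-ensembles $B$ and $C$ each contain `n` systems, and the densities are estimated with plain histograms. The function names `measure_positions` and `histogram_density` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is reproducible
m = hbar = 1.0
sigma, k0 = 1.0, 2.0                      # assumed initial width and wave number
d = hbar * k0 / m                         # drift velocity of the packet
b = hbar / (2 * m * sigma**2)             # spreading rate of the packet

def measure_positions(t, n):
    """Simulated position outcomes for n identically prepared systems at time t."""
    width = sigma * np.sqrt(1.0 + (b * t) ** 2)
    return rng.normal(loc=d * t, scale=width, size=n)

def histogram_density(samples, edges):
    """Histogram estimate of the probability density on the given bin edges."""
    counts, _ = np.histogram(samples, bins=edges)
    return counts / (samples.size * np.diff(edges))

t0, t1, n = 0.0, 0.05, 2_000_000
edges = np.linspace(-4.0, 4.0, 33)
centers = 0.5 * (edges[:-1] + edges[1:])

rho_B = histogram_density(measure_positions(t0, n), edges)   # sub-ensemble B at t0
rho_C = histogram_density(measure_positions(t1, n), edges)   # sub-ensemble C at t1
quotient = (rho_C - rho_B) / (t1 - t0)

# Analytic time derivative at t0 = 0 for comparison.
rho_exact = np.exp(-centers**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
dt_rho_exact = d * centers / sigma**2 * rho_exact

mean_abs_error = np.mean(np.abs(quotient - dt_rho_exact))
correlation = np.corrcoef(quotient, dt_rho_exact)[0, 1]
```

The example makes the trade-off visible: shrinking $t_1 - t_0$ reduces the bias of the quotient but amplifies the statistical noise of the two histograms, so a small interval demands a correspondingly large ensemble.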

After many such recordings, we gather all the data to reconstruct $\rho(x,t)$. How many samples do we need? If the samples were taken at equidistant $\Delta t$ and $\Delta x$, we could do a Fourier synthesis and apply the Shannon-Whittaker sampling theorem. However, due to the non-equidistant spreading of the $t_n$ (at best following some statistical pattern), we need frame theory (Duffin and Schaeffer^26) to reconstruct band-limited signals from irregularly spaced samples $f(t_n)$. The derivative with respect to time can then be derived from the reconstructed signal, and the phase derived by means of the proposed equation.
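The reconstruction step can be illustrated with a simple least-squares fit. This is a stand-in, not the Duffin-Schaeffer frame algorithm itself: it assumes a periodic signal with a known band limit `K` (a trigonometric polynomial on the unit interval), irregular but sufficiently dense sample times `t_n`, and noise-free samples. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 5                                      # assumed band limit (highest harmonic)
ks = np.arange(-K, K + 1)

def design_matrix(t):
    """Columns are the complex exponentials exp(2*pi*i*k*t) for |k| <= K."""
    return np.exp(2j * np.pi * np.outer(t, ks))

# Hypothetical band-limited real signal: coefficients obey c_{-k} = conj(c_k).
half = rng.normal(size=K) + 1j * rng.normal(size=K)
true_coeffs = np.concatenate((np.conj(half[::-1]), [1.0], half))

# Irregularly spaced sample times and the samples f(t_n).
t_n = np.sort(rng.uniform(0.0, 1.0, size=60))
f_n = (design_matrix(t_n) @ true_coeffs).real

# Least-squares estimate of the coefficients from the irregular samples.
coeffs, *_ = np.linalg.lstsq(design_matrix(t_n), f_n.astype(complex), rcond=None)

# Evaluate the reconstructed signal and its time derivative on a regular grid.
t_grid = np.linspace(0.0, 1.0, 200, endpoint=False)
basis = design_matrix(t_grid)
f_rec = (basis @ coeffs).real
df_rec = (basis @ (2j * np.pi * ks * coeffs)).real

f_true = (basis @ true_coeffs).real
df_true = (basis @ (2j * np.pi * ks * true_coeffs)).real
reconstruction_error = np.max(np.abs(f_rec - f_true))
derivative_error = np.max(np.abs(df_rec - df_true))
```

Because the model is linear in the coefficients, the derivative follows from the fitted coefficients without any further differencing of data. With noisy samples the fit would still run, but the derivative would inherit amplified noise at the higher harmonics.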

Acknowledgments

The author wishes to acknowledge a helpful discussion with John Corbett regarding the subject of this paper.

References

1. J. Dorling, in Schrödinger: Centenary Celebration of a Polymath, ed. C.W. Kilmister (Cambridge, 1987).
2. E. Prugovecki, Int. J. Theor. Phys. 16, pp. 321-331 (1977).
3. W. Pauli, Encyclopedia of Physics, Vol. V, p. 17 (Springer-Verlag, Berlin, 1958).
4. H. Reichenbach, Philosophic Foundations of Quantum Mechanics (University of California Press, 1948).
5. J.V. Corbett, C.A. Hurst, J. Austral. Math. Soc. B20, 182-201 (1978).
6. C.N. Friedman, J. Austral. Math. Soc. B30, 298 (1987).
7. M. Pavicic, Phys. Lett. A 122, 280 (1987).
8. E. Feenberg, The Scattering of Slow Electrons in Neutral Atoms, Thesis, Harvard University (1933).
9. J.E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949).
10. W. Gale, E. Guth and G.T. Trammell, Phys. Rev. A 165, 1434-1436 (1968).
11. W. Band, J. Park, Found. Phys. 1, No. 2, pp. 133-144 (1970).
12. J. Park, W. Band, Found. Phys. 1, No. 4, pp. 339-357 (1971).
13. W. Band, J. Park, Am. J. Phys. 47, pp. 188-191 (1979).
14. A. Royer, Phys. Rev. Lett. 55, p. 2745 (1985).
15. A. Royer, Found. Phys. 19, 3 (1989).
16. W. Stulpe, M. Singer, Found. Phys. Lett. 3, 153 (1990).
17. W.E. Lamb, Phys. Today 22(4), 23 (1969).
18. H.-W. Wiesbrock, Int. J. Theor. Phys. 26, p. 1175 (1987).
19. S. Weigert, Phys. Rev. A 45, pp. 7688-7696 (1992).
20. G.C. Wick, A.S. Wightman, E.P. Wigner, Phys. Rev. 88, pp. 101-105 (1952).
21. E.C. Emch, C. Piron, J. Math. Phys. 4, pp. 496-473 (1963).
22. C. Piron, Helv. Phys. Acta 42, pp. 330-338 (1969).
23. P. Bush, P.J. Lahti, Found. Phys. 19, p. 633 (1971).
24. E.C. Kemble (New York, McGraw-Hill, 1937).
25. B. Mielnik, Found. Phys. 24, 8, pp. 1113-1129 (1994).
26. R.J. Duffin, A.C. Schaeffer, Trans. Amer. Math. Soc. 72, 341-366 (1952).


EXTRINSIC AND INTRINSIC IRREVERSIBILITY IN PROBABILISTIC DYNAMICAL LAWS

H. ATMANSPACHER
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: haaigppde
and
Max-Planck-Institut für extraterrestrische Physik, D-85740 Garching, Germany

R. C. BISHOP
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: rcbigppde

A. AMANN
Universitätsklinik für Anästhesie, Leopold-Franzens-Universität, Anichstr. 35, A-6020 Innsbruck, Austria
E-mail: antonamannuibkacat
and
Institut für Allgemeine, Anorganische und Theoretische Chemie, Abteilung für theoretische Chemie, Leopold-Franzens-Universität, Innrain 52a, A-6020 Innsbruck, Austria

Two distinct conceptions for the relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of physical systems are outlined. The standard extrinsic concept of irreversibility is based on the notion of an open system interacting with its environment. An alternative intrinsic concept of irreversibility does not explicitly refer to any environment at all. Basic aspects of the two concepts are presented and compared with each other. The significance of the terms extrinsic and intrinsic is discussed.

1 Introduction

The relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of empirical systems has been a long-standing problem in physics. In most standard approaches, fundamental dynamical laws such as Newton's, Maxwell's, Einstein's or Schrödinger's equations describe the temporal evolution of isolated systems. Irreversible dynamical laws are typically regarded as emerging from the interaction between systems and their environment, i.e. from considering open systems.

In contrast to this extrinsic conception of irreversibility, there is a group of scientists who insist that some kinds of irreversibility are intrinsic, i.e. some kinds of irreversible laws are fundamental. On this view, mainly advocated by Prigogine and colleagues in Brussels and Austin, the switch from extrinsic to intrinsic irreversibility goes along with a switch from particular kinds of deterministic descriptions to particular kinds of probabilistic descriptions.

In general, the two viewpoints are considered to be distinct, sometimes even entirely incompatible. It is the main goal of this contribution to show that there are both differences and similarities between them. As a consequence, it does not make much sense to prefer one of them at the expense of the other. It is much more interesting to explore whether particular aspects of each of the two views can be constructively related to each other in order to increase our insight into the issue of irreversibility.

In the following, both conceptions will be presented in some detail and compared. It is suggested that the distinction of ontic and epistemic categorial frameworks for some problems associated with irreversibility is particularly useful when focusing on a conceptual discussion. Such a distinction serves to clarify both common and distinct aspects of extrinsic and intrinsic irreversibility, and it helps to frame a number of open questions concerning them.

In Section 2, ontic and epistemic descriptions are briefly introduced. We use an algebraic framework for this introduction, since this has proven fruitful in related problem areas. Section 3 outlines some basic issues with respect to the ontic states of closed quantum systems and their time-reversal invariant dynamical evolution. Subsequently, two ways to conceive of extrinsic irreversibility are described. In one of them, epistemic states are represented by (reduced) density operators; in the other, they are represented by probability distributions of pure states. Section 4 presents the intrinsic conception of irreversibility. One major line of research in this regard deals with transformations from invertible K-systems to non-invertible exact systems; the other uses the concept of rigged Hilbert spaces to extend the state of a system beyond Hilbert space. Section 5 summarizes the main points and indicates some open questions.

2 Ontic and epistemic descriptions

2.1 General issues

Can nature be observed and described as it is in itself, independent of those who observe and describe, that is to say, nature as it is when nobody looks? This question has been debated throughout the history of philosophy with no clear answer either way. Each perspective has strengths and weaknesses, and in each epoch has had its critics and proponents. In contemporary terminology, the two perspectives can be distinguished as the topics of ontology and epistemology. Ontological questions refer to the structure and behavior of a system as such, whereas epistemological questions refer to knowledge (or information) about systems.

In philosophical discourse it is considered a serious fallacy to confuse these two types of questions. For instance, Fetzer and Almeder emphasize that an ontic answer to an epistemic question (or vice versa) normally commits a category mistake.^1 Nevertheless, such mistakes are frequently committed in many fields of research when addressing subjects where the distinction between ontological and epistemological arguments is important.

The ontic/epistemic distinction refers to states and properties of a system as such or in its relation to observers; hence it is an ontological distinction.^a

^a On the other hand, the distinction between ontological and epistemological problems can be considered as epistemological insofar as both areas represent fields of (philosophical) knowledge.

In physics, the rise of quantum theory with its interpretational problems was one of the first major challenges to the ontic/epistemic distinction. The Bohr-Einstein discussions in the 1920s and 1930s serve as a famous historical example. Einstein's arguments were generally ontically motivated; that is to say, he emphasized a viewpoint independent of observers or measurements. By contrast, Bohr's emphasis was generally epistemically motivated, focusing on what we could know and infer from observed quantum phenomena. Since Bohr and Einstein never made their basic viewpoints explicit, it is not surprising that they talked past each other in a number of respects.^2

Examples of approaches trying to avoid the confusions of the Bohr-Einstein discussions are Heisenberg's distinction of actuality and potentiality,^3 Bohm's ideas on explicate and implicate orders,^5 or d'Espagnat's scheme of an empirical, weakly objective reality and an objective (veiled) reality independent of observers and their minds.^5 Further terms fitting into the ontic side of these distinctions are latency,^6 propensity,^7 or disposition.^8 See also Jammer's discussion of these notions, including their criticism and additional references.^9

A first attempt to draw an explicit distinction between ontic and epistemic descriptions for quantum systems was introduced by Scheibe,^10 who himself, however, strongly emphasized the epistemic realm. Later, Primas developed this distinction in the formal framework of algebraic quantum theory.^11 The basic structure of the ontic/epistemic distinction, which will be made more precise below, can be roughly characterized as follows (for more details the reader is referred to Refs. 11, 12):

Ontic states describe all properties of a physical system exhaustively. (Exhaustive in this context means that an ontic state is precisely the way it is, without any reference to epistemic knowledge or ignorance.) Ontic states are the referents of individual descriptions; the properties of the system are treated as intrinsic properties.^b As an important example, ontic states refer to closed systems; they are empirically inaccessible. Typically, their temporal evolution (dynamics) is reversible and follows fundamental deterministic laws.

Epistemic states describe our (usually non-exhaustive) knowledge of the properties of a physical system, i.e. based on a finite partition of the relevant phase space. The referents of statistical descriptions are epistemic states; the properties of the system are treated as contextual properties. Epistemic states refer to open systems; they are, at least in principle, empirically accessible. Typically, their temporal evolution (dynamics) follows irreversible laws.

The combination of the ontic/epistemic distinction with the formalism of algebraic quantum theory provides a framework that is both formally and conceptually satisfying. Although the formalism of algebraic quantum theory is often hard to handle for specific physical applications, it offers significant clarifications concerning the basic structure and the philosophical implications of quantum theory. For instance, the modern achievements of algebraic quantum theory make clear in what sense pioneer quantum mechanics (which von Neumann implicitly formulated epistemically^13) as well as classical and statistical mechanics can be considered as special cases of a more general theory. Compared to the framework of von Neumann's monograph,^13 important extensions are obtained by giving up the irreducibility of the algebra of observables (not admitting observables which commute with every observable in the same algebra) and the restriction to locally compact phase spaces (admitting only finitely many degrees of freedom). As a consequence, modern quantum physics is able to deal with open systems in addition to isolated ones; it can involve infinitely many degrees of freedom, such as the infinitely many modes of a radiation field; it can properly consider interactions with the environment of a system; superselection rules, classical observables, and phase transitions can be formulated, which would be impossible in an irreducible algebra of observables; there exist infinitely many representations inequivalent to the Fock representation; and non-automorphic, irreversible dynamical evolutions can be successfully incorporated and even derived.

^b In a more technical terminology, one speaks of observables (mathematically represented by operators) rather than properties of a system. Prima facie, the term observable has nothing to do with the actual observability of a corresponding property.

In addition to this remarkable progress, the mathematical rigor of algebraic quantum theory in combination with the ontic/epistemic distinction allows us to address a number of unresolved conceptual and interpretational problems of pioneer quantum mechanics from a new perspective. First, the distinction between different concepts of states as well as observables provides a much better understanding of many confusing issues in earlier conceptions, including alleged paradoxes such as those of Einstein, Podolsky, and Rosen (EPR).^14 Second, a clear-cut characterization of different concepts of states and observables is a necessary precondition to explore new approaches, beyond von Neumann's projection postulate, toward the central problem that pervades all quantum theory: the measurement problem. Third, a number of much-discussed interpretations of quantum theory and their variants can be appreciated more properly if they are considered from the perspective of an algebraic formulation.

One of the most striking differences between the concepts of ontic and epistemic states is their difference concerning operational access, i.e. observability and measurability. At first sight it might appear pointless to keep a level of description which is not related to what can be operationalized empirically. However, a most appealing feature at this ontic level is the existence of first principles and fundamental laws that cannot be obtained at the epistemic level. Furthermore, it is possible to rigorously deduce (e.g. to GNS-construct, cf. Refs. 12, 15) a proper epistemic description from an ontic description if enough details about the empirically given situation are known. These aspects show that the crucial point is not to decide whether ontic or epistemic levels of discussion are right or wrong in a mutually exclusive sense. There are always ontic and epistemic elements to be taken into account for a proper description of a system. This requires the definition of ontic and epistemic terms to be relativized with respect to some selected framework within a set of (hierarchical) descriptions (see Ref. 16 for details and examples). The problem is then to use the proper level of description for a given context, and to develop and explore well-defined relations between different levels.

These relations are not universally prescribed; they depend on contexts of various kinds. The concepts of reduction and emergence are of crucial significance here. In contrast to the majority of publications dealing with these topics, it is possible to precisely specify their meaning in mathematical terms. Contexts or contingent conditions can be formally incorporated as topologies in which particular asymptotic limits give rise to novel, emergent properties unavailable without those contexts (see Ref. 15 for more details). It should also be mentioned that the distinction between ontic and epistemic descriptions is neither identical with that of parts and wholes nor with that of micro- and macrostates as used in statistical mechanics or thermodynamics. The thermodynamic limit of an infinite number of degrees of freedom provides only one example of a contextual topology; others are the Born-Oppenheimer limit in molecular physics or the short-wavelength limit for geometrical optics.

These examples indicate that the usefulness, or even inevitability, of the ontic/epistemic distinction is not restricted to quantum systems. It plays a significant role in the description of classical systems as well. More specifically, it has been shown in detail that for systems exhibiting deterministic chaos the distinction of ontic and epistemic descriptions is necessary if category mistakes and corresponding interpretational fallacies are to be avoided.^17

3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility

3.1 Time-Reversal Symmetry in Closed Systems

Let us start with a closed quantum system, which can be considered without any reference to an environment. The pure state $\phi$ of such a system is an extremal positive linear functional on a C*-algebra $\mathcal{A}$. The state $\phi \in \mathcal{A}^*$, where $\mathcal{A}^*$ is the dual of $\mathcal{A}$, is then called an ontic state of the closed system. If a Hilbert space representation of $\mathcal{A}$ is possible, $\phi$ can be represented as a state vector $\psi \in \mathcal{H}$, characterized by the expectation values $\langle\psi|A|\psi\rangle$ of all observables $A \in \mathcal{A}$. Under particular conditions, the dynamics of $\phi$ is given by the time-reversal invariant Schrödinger equation.

In the traditional Hilbert space representation, the algebra $\mathcal{A}$ of observables is irreducible: there are no nontrivial observables that commute with every observable in the algebra. Due to the Stone-von Neumann theorem, every representation of the canonical commutation relations is then equivalent to the Schrödinger representation. In the more general setting of a Fock space (sum of tensor products of one-particle Hilbert spaces), the same holds for Fock representations.

A restriction of $\phi$ to a subsystem is not a pure state in general; hence it is in general illegitimate to consider a closed quantum system as consisting of closed subsystems. As a consequence, an ontic state $\phi$ characterizes an individual, undivided whole, not consisting of subsystems with their own ontic states. This is the level of description to which the notions of quantum nonlocality or quantum holism apply. Since the concept of an environment does not make sense for ontic states of closed systems, it is illegitimate to speak about their entanglement or interaction with another state.

If one introduces a distinction (Heisenberg cut) to create subsystems in a closed system, then these subsystems are in general open. For example, one can then consider an object entangled and/or interacting with its environment. The epistemic state $\eta$ of those subsystems can be represented in two conceptually different ways.

3.2 Density Operators as Non-Pure States

The first, more or less familiar, representation of an epistemic state $\eta$ is given by a (reduced) density operator $D \in \mathcal{M}_*$, where $\mathcal{M}_*$ is the predual of a W*-algebra $\mathcal{M}$ of contextual observables. The expectation value of an observable $M \in \mathcal{M}$ in the state represented by $D$ is given by $\mathrm{Tr}(DM)$. The epistemic state $\eta$ represented by $D$ is a non-pure state. EPR-correlations between subsystem and environment are generic if the contextual algebra of observables is non-commutative.
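A finite-dimensional toy example may help to see why the restriction of a pure state to a subsystem is in general non-pure. The sketch below assumes a two-qubit system (object and environment, each of dimension 2) instead of the general algebraic setting of the text; `reduced_density` and `purity` are names invented for the example.

```python
import numpy as np

def reduced_density(psi, dim_obj, dim_env):
    """Reduced density operator of the object: partial trace over the environment."""
    amp = psi.reshape(dim_obj, dim_env)      # amp[i, j] = <i, j | psi>
    return amp @ amp.conj().T

def purity(D):
    """Tr(D^2): equals 1 for a pure state and is smaller for a non-pure state."""
    return float(np.real(np.trace(D @ D)))

sigma_z = np.diag([1.0, -1.0])

# Entangled (EPR-correlated) pure state of object and environment.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
# Product state: object in |0>, environment in (|0> + |1>)/sqrt(2).
product = np.kron([1, 0], [1, 1]).astype(complex) / np.sqrt(2)

D_bell = reduced_density(bell, 2, 2)
D_prod = reduced_density(product, 2, 2)

purity_bell = purity(D_bell)                 # non-pure reduced state
purity_prod = purity(D_prod)                 # reduced state stays pure
expect_z_bell = float(np.real(np.trace(D_bell @ sigma_z)))   # Tr(D M) with M = sigma_z
```

In this toy model the entangled state reduces to the maximally mixed operator, while the product state reduces to a pure one. Expectation values of object observables are obtained as $\mathrm{Tr}(DM)$ in both cases.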

The term contextual observables derives from the fact that their construction requires the selection of a context, defined by a subset of relevant observables $B \in \mathcal{B} \subset \mathcal{A}$ and a reference state (e.g. vacuum state, KMS state) distinguished by some appropriate stability condition. This context induces the weak closure of $\mathcal{B}$ and gives rise to a contextual topology in $\mathcal{M}$. If the context is known well enough, then the GNS representation is a powerful constructive tool to implement a proper contextual topology (see e.g. Ref. 15).

The dynamics of $D$ is of Schrödinger type plus dissipative terms (e.g. a master equation), so that the time-reversal invariance of the Schrödinger equation can be broken.^18,19
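As a concrete instance of "Schrödinger type plus dissipative terms", the sketch below integrates a pure-dephasing master equation for a single two-level system. The model is an assumption chosen for simplicity and is not taken from the cited references: $\dot{D} = -i[H, D] + \gamma(\sigma_z D \sigma_z - D)$ with $H = (\omega/2)\sigma_z$ and $\hbar = 1$; the values of `omega`, `gamma`, and the step size are arbitrary.

```python
import numpy as np

sigma_z = np.diag([1.0, -1.0]).astype(complex)
omega, gamma = 3.0, 0.5                      # assumed level splitting and dephasing rate
H = 0.5 * omega * sigma_z

def master_rhs(D):
    """Right-hand side: unitary (Schroedinger-type) part plus dissipative part."""
    unitary = -1j * (H @ D - D @ H)
    dissipative = gamma * (sigma_z @ D @ sigma_z - D)
    return unitary + dissipative

def rk4_step(D, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = master_rhs(D)
    k2 = master_rhs(D + 0.5 * dt * k1)
    k3 = master_rhs(D + 0.5 * dt * k2)
    k4 = master_rhs(D + dt * k3)
    return D + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Start from the pure state (|0> + |1>)/sqrt(2).
D = 0.5 * np.ones((2, 2), dtype=complex)
dt, steps = 0.001, 2000                      # total time T = 2
purities = [float(np.real(np.trace(D @ D)))]
for _ in range(steps):
    D = rk4_step(D, dt)
    purities.append(float(np.real(np.trace(D @ D))))

final_trace = float(np.real(np.trace(D)))
final_coherence = abs(D[0, 1])
expected_coherence = 0.5 * np.exp(-2 * gamma * dt * steps)
```

With $\gamma = 0$ the evolution is unitary and the purity stays at 1. With $\gamma > 0$ the purity decreases monotonically, so running the same generator backwards in time does not describe a physical evolution of states; this is the sense in which the dissipative term breaks time-reversal invariance in this toy model.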

3.3 Probability Distributions of Pure States

If the epistemic state $\eta$ of an open system is approximately pure by a clever dressing of object and environment ($b$ indicates bare objects and environments and $d$ indicates dressed objects and environments),

$$\mathcal{H}^{b}_{obj} \otimes \mathcal{H}^{b}_{env} = \mathcal{H}^{d}_{obj} \otimes \mathcal{H}^{d}_{env},$$

$\eta$ can be represented (estimated) by a probability distribution $\mu$ of pure states. (A dressing procedure is clever if it minimizes EPR-correlations between object and environment, or if it maximizes the integrity of both object and environment.^20) $\mathcal{H}^{d}_{obj}$ is the proper Hilbert space for an approximately pure epistemic state $\eta$. Although $\eta$ can be uniquely extended to a normal state on $\mathcal{M}$ (represented by a density operator), the pure states and their distribution $\mu$ themselves do not make sense on $\mathcal{M}$. The relevant observables are elements of a C*-subalgebra $\mathcal{B} \subset \mathcal{A}$.


The dynamics of $\mu$ is of Schrödinger type plus stochastic terms (e.g. an Itô/Stratonovich equation), so that the time-reversal invariance of the Schrödinger equation can be broken. The stochastic aspect of the time evolution (of approximately pure states of the object) originates from the fact that the (initial) state of the environment cannot be determined and therefore must be treated as a stochastic variable. Starting from an initial pure state $\rho_0$, one gets time-evolved states $\rho_{t,\omega}$, where $\omega$ is the stochastic variable. First steps of such an approach toward single open quantum systems, not based exclusively on decompositions of density-operator dynamics, were proposed in Refs. 21, 22.
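The difference between a distribution of pure states and a density operator can be made concrete with a toy phase-diffusion model. The model is an assumption for illustration only and is not the dynamics proposed in the cited references: each realization of the stochastic variable gives a pure two-level state whose relative phase is $\theta = \omega_0 t + \sqrt{2\kappa}\,W_t$, with $W_t$ a Wiener process, and the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
omega0, kappa, t = 3.0, 0.8, 1.0            # assumed frequency, diffusion rate, time
n = 50_000                                   # number of realizations

# One Wiener increment per realization: W_t ~ Normal(0, t).
W_t = rng.normal(0.0, np.sqrt(t), size=n)
theta = omega0 * t + np.sqrt(2 * kappa) * W_t

# Each row is a pure state (exp(-i theta/2)|0> + exp(+i theta/2)|1>)/sqrt(2).
psi = np.stack((np.exp(-0.5j * theta), np.exp(0.5j * theta)), axis=1) / np.sqrt(2)
norms = np.sum(np.abs(psi) ** 2, axis=1)     # every realization stays normalized

# Averaging |psi><psi| over the distribution gives a density operator.
D_avg = np.einsum('ni,nj->ij', psi, psi.conj()) / n
purity_avg = float(np.real(np.trace(D_avg @ D_avg)))

# For Gaussian phase noise the averaged coherence decays exponentially.
expected_coherence = 0.5 * np.exp(-1j * omega0 * t - kappa * t)
coherence_error = abs(D_avg[0, 1] - expected_coherence)
```

Every individual realization remains pure, whereas the ensemble average is a non-pure density operator. Many different distributions of pure states average to the same density operator, which is one way to see that the distribution carries information the density operator does not.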

For a large class of stochastic dynamics of approximately pure states of objects, one ends up with one particular distribution $\mu_\infty$ of pure states in the limit $t \to \infty$, independently of the initial conditions (such dynamical objects are called ergodic). Splitting the underlying C*-algebra $\mathcal{B}$ into two subsystems with two C*-subalgebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$, is then admitted under particular conditions. In an ideal situation, all those pure states onto which the probability measures $\mu_t$ extend are product states with respect to the tensor product $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$. This situation never arises in practice, but most relevant pure states can be product states or almost product states if the dressing tensorization is chosen appropriately.^23

3.4 Dynamics of Measurement: a Simple Example

Any dynamical description of measurement has to start from a proper decomposition of a system into a dressed object and its dressed environment. It is crucial to keep in mind that such a decomposition is a logical precondition for the dynamics of measurement, insofar as the Hamiltonian of the composed system needs to be written as a sum

$$H = H_{obj} \otimes 1 + 1 \otimes H_{env} + H_{int} \qquad (1)$$

An illustrative heuristic example has been extensively discussed by Primas.^24 Consider the simple case of a two-level quantum object (spin 1/2 system) with the Hamiltonian

$$H_{obj} = \frac{\hbar}{2}\sum_{\nu=1}^{3}\Omega_\nu\sigma_\nu, \qquad (2)$$

a sufficiently nontrivial boson field environment

$$H_{env} = \sum_{\nu=1}^{3}\sum_{k}\omega_k\,a^*_{k\nu}a_{k\nu}, \qquad (3)$$

and an interaction

$$H_{int} = \sum_{\nu=1}^{3}\sigma_\nu \otimes A_\nu, \qquad (4)$$

where

$$A_\nu = \sum_{k}\lambda_{k\nu}a_{k\nu} + \mathrm{c.c.} \qquad (5)$$

If such a decomposition has been properly carried out (cf. Sec. 3.3), then it is possible to derive the expectation values

$$M(t) = \langle\psi_t|\sigma|\psi_t\rangle, \qquad (6)$$

$$a(t) = \langle\chi_t|A|\chi_t\rangle, \qquad (7)$$

with respect to the (approximate) product state

$$\Psi_t = \psi_t^{obj} \otimes \chi_t^{env}. \qquad (8)$$

Corresponding to the product state $\Psi_t$, the C*-algebra of intrinsic observables in the composed system of dressed object and dressed environment is

$$\mathcal{A} = \mathcal{A}_{obj} \otimes \mathcal{A}_{env}. \qquad (9)$$

$\mathcal{A}_{obj}$ is the C*-algebra of 2 x 2 matrices, and $\mathcal{A}_{env}$ is the C*-algebra of intrinsic observables of an environment with infinitely many degrees of freedom.

The equations of motion for the expectation values $M(t)$ and $a(t)$ are given by

$$\dot{M}(t) = M(t) \times \Omega + M(t) \times a(t), \qquad (10)$$

$$\dot{\alpha}_{k\nu}(t) = -i\omega_k\alpha_{k\nu}(t) - \frac{i}{2}\lambda_{k\nu}M_\nu(t). \qquad (11)$$

They describe the feedback between object and environment. More precisely, they describe the polarization $M$ of the object under the influence of the environment, and the motion of the environment observable $a$ (boson operator) under the polarizing influence of the object. The solution of the second equation, referring to the observables of the environment (or the measuring system, respectively), has a retarded and an advanced part:

$$\alpha^{ret}_{k\nu}(t) = \exp(-i\omega_kt)\,\alpha_{k\nu}(0) - \frac{i}{2}\lambda_{k\nu}\int_{0}^{t}\exp(-i\omega_k(t-s))\,M_\nu(s)\,ds \qquad (t > 0), \qquad (12)$$

$$\alpha^{adv}_{k\nu}(t) = \exp(-i\omega_kt)\,\alpha_{k\nu}(0) + \frac{i}{2}\lambda_{k\nu}\int_{t}^{0}\exp(-i\omega_k(t-s))\,M_\nu(s)\,ds \qquad (t < 0). \qquad (13)$$

A bidirectionally deterministic system can be described in terms of a superposition of a backward deterministic (forward non-deterministic) and a forward deterministic (backward non-deterministic) process, which are equally relevant a priori. Selecting one of these solutions and disregarding the other requires the time inversion symmetry of the compound system to be broken. For this purpose one can apply the principle of causality (past-determinacy, error-free retrodiction, no anticipation) as a heuristic argument for the selection of the retarded solution.

It has been argued that the retarded, i.e. the backward deterministic, forward non-deterministic, solution is a K-flow^c on a state space with infinitely many degrees of freedom.^24 In the simplest case, the relaxation time for this K-flow is the time constant $\tau_\nu$ of an exponentially decaying correlation function (for details see Ref. 24),

$$K_\nu(t) \propto \exp(-t/\tau_\nu). \qquad (14)$$

At this point we are still at the level of description of intrinsic observables needed for the specification of initial conditions of the K-flow. Conceptually, this K-flow represents a stochastic process which corresponds to chaos in the sense of Wiener^25 rather than chaos in the sense of Kolmogorov and Sinai (i.e. a dissipative dynamics). By introducing a context via a reference state with respect to which stability in a particular sense (hopefully more general than thermal equilibrium) can be checked, one can proceed to (GNS-constructed) contextual observables.

^c Note that K-flows or K-systems play an important role in one of the approaches of intrinsic irreversibility (see Sec. 4.1). It would be interesting, but exceeds the scope of this paper, to explore the question of whether the process of measurement as described here can be conceived as intrinsically irreversible. In this respect see e.g. Ref. 26.

3.5 General Features of Extrinsic Irreversibility

The breaking of time-reversal symmetry in the framework of extrinsic irreversibility corresponds to the conceptual transition from closed systems with ontic states to open systems with epistemic states. Such a transition can be understood by dividing a closed system into open, more or less EPR-correlated subsystems (e.g. object and environment) and by selecting a subset of relevant observables. The proper state concepts are epistemic. There are then two different statistical representations for different epistemic state concepts: one expresses a probability distribution of pure states, whereas the usual one focuses on reduced density operators.

The interaction of the open subsystems is described by dynamical laws different from the time-reversal invariant dynamics of a closed system. Breaking the time-reversal invariance of a unitary group evolution generates two semigroups, which can be endowed with two arrows of time opposite to each other. It should be pointed out that the forward arrow cannot be selected by physical reasons alone. Extra-physical arguments, such as consistency with experience, causality, etc., must be invoked.

4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility

In contrast to the extrinsic concept of irreversibility, there is an alternative concept of intrinsic irreversibility, mainly advocated by Prigogine and collaborators (more recently also by Bohm). They propose describing states of any system generically with distributions $\rho$ (i.e. probability distributions or density operators). The claim is that the state $\rho$ of systems beyond a particular degree of complexity evolves irreversibly by itself, i.e. without any relationship to an environment. There are essentially two lines of research pursuing this proposal.

4.1 $\Lambda$-Transformation from K-Systems to Exact Systems

The notion of the $\Lambda$-transformation was developed by Misra, Courbage and Prigogine in the 1970s. It is essentially based on the theory of ergodic systems. In particular, the concept of Kolmogorov systems, briefly K-systems, is of central significance in this context.

Definition 1 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be an invertible transformation such that $S$ and $S^{-1}$ are measurable and measure preserving. The transformation $S$ is called a K-automorphism if there exists a $\sigma$-algebra $\mathcal{A}_0 \subset \mathcal{A}$ such that the following three conditions are satisfied: (i) $S^{-1}(\mathcal{A}_0) \subset \mathcal{A}_0$; (ii) the $\sigma$-algebra $\bigcap_{n=0}^{\infty} S^{-n}(\mathcal{A}_0)$ is trivial (i.e. contains only sets of measure 1 or 0); (iii) the smallest $\sigma$-algebra containing $\bigcup_{n=0}^{\infty} S^{n}(\mathcal{A}_0)$ is identical to $\mathcal{A}$.

Another way to characterize (classical) K-systems is by way of the existence of positive Lyapunov exponents, equivalent to a strictly positive Kolmogorov-Sinai entropy. The properties of K-systems imply mixing and ergodicity. K-systems are invertible transformations, hence their deterministic dynamics, given by $\rho(t) = U_t \rho(0)$, is reversible ($U_t$ is a unitary evolution operator acting on $\rho$). A standard example is the (2-dimensional) baker transformation.
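The invertibility of the baker transformation can be made concrete with a small sketch. It assumes the usual textbook form of the map on the unit square, which the text does not write out; the function names `baker` and `baker_inverse` are illustrative only. Exact rational arithmetic is used so that the round trip is not blurred by floating-point error.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """One step of the baker transformation on [0,1) x [0,1) (assumed textbook form)."""
    if x < HALF:
        return 2 * x, y / 2           # left half: stretch in x, squeeze into lower half
    return 2 * x - 1, (y + 1) / 2     # right half: stretch in x, squeeze into upper half

def baker_inverse(x, y):
    """Inverse step: undo the squeeze in y and the stretch in x."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

# Iterate forward ten steps, then backward ten steps, from an arbitrary point.
point = (Fraction(3, 7), Fraction(2, 5))
forward = point
for _ in range(10):
    forward = baker(*forward)
back = forward
for _ in range(10):
    back = baker_inverse(*back)

roundtrip_ok = (back == point)
print(forward, back, roundtrip_ok)
```

Because every forward step can be undone exactly, the dynamics of points (and hence of distributions carried along with them) is reversible in the sense used above.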

Another important class of mixing systems refers to so-called exact systems.

Definition 2 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be a measure preserving transformation such that $S(A) \in \mathcal{A}$ for each $A \in \mathcal{A}$. If $\lim_{n\to\infty} \mu(S^n(A)) = 1$ for every $A \in \mathcal{A}$ with $\mu(A) > 0$, then $S$ is called exact.

Exact systems are represented by non-invertible transformations, hence their stochastic dynamics, given by $\tilde\rho(t) = W_t \tilde\rho(0)$, is irreversible. $W_t$ is a semigroup evolution operator acting on a distribution $\tilde\rho$ rather than $\rho$. For instance, an exact system obtained from the baker transformation is the dyadic transformation

S(x) = 2x (mod 1)

A theorem by Rokhlin [28] says that every exact system is the factor of a K-system. This means that K-systems can be transformed into exact systems by their projections (or factors, see [27]). More generally, a factor of a K-system can be obtained by restriction to dilating fibers or unstable manifolds. Hence it is intuitively clear that the invertibility of a K-system gets lost by its transformation into an exact system.
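The factor relation between the baker transformation and the dyadic transformation can be checked directly. The sketch below assumes the usual textbook form of the baker map and takes the projection onto the x-coordinate (the dilating direction) as the factor map; `project` and the sample grid are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def baker(x, y):
    """Baker transformation on the unit square (assumed textbook form)."""
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1)."""
    return (2 * x) % 1

def project(point):
    """Projection onto the dilating (x) coordinate."""
    return point[0]

# Factor property: projecting after a baker step equals a dyadic step on the projection.
grid = [(Fraction(i, 11), Fraction(j, 13)) for i in range(11) for j in range(13)]
factor_ok = all(project(baker(x, y)) == dyadic(x) for x, y in grid)

# Loss of invertibility: every x has two distinct preimages under the dyadic map.
x = Fraction(1, 3)
preimages = [x / 2, (x + 1) / 2]

print(factor_ok, preimages, [dyadic(p) for p in preimages])
```

The projection discards the y-coordinate, which is exactly the information the baker map needs in order to be undone; what remains is a two-to-one map.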

According to Misra et al. [29, 30], the relations between the two kinds of dynamics, $U_t$ and $W_t$, and the two state concepts, $\rho$ and $\tilde\rho$, are provided by a similarity transformation $\Lambda$ according to

$W_t = \Lambda U_t \Lambda^{-1}$

$\tilde\rho = \Lambda \rho$
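As a short consistency check, writing $\Lambda$ for the similarity transformation and $\tilde\rho$ for the transformed distribution, the two relations together with $\rho(t) = U_t \rho(0)$ give the semigroup evolution of the transformed state. This is only a formal sketch; it assumes that $\Lambda^{-1}\Lambda$ acts as the identity on the states considered.

```latex
\tilde\rho(t) = \Lambda \rho(t)
             = \Lambda U_t \rho(0)
             = \Lambda U_t \Lambda^{-1} \Lambda \rho(0)
             = W_t \tilde\rho(0)
```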

Wightman's question [31] as to the meaning of $\tilde\rho$ in his review of [30] gets an immediate answer if one applies Rokhlin's theorem to construct $\Lambda$ (cf. [32]). The transformed distribution $\tilde\rho$ is the projection of $\rho$ onto a dilating subspace. This can easily be seen for the examples of the baker transformation and the dyadic transformation. In the more complicated case of continuous-time nonlinear (hyperbolic) systems, the corresponding procedure would be a projection onto the unstable manifolds, i.e. those directions along which the Lyapunov exponents are positive and add up to the Kolmogorov-Sinai entropy (cf. [33, 34]). As an important conceptual feature, such projections select a time direction.

A crucial formal feature associated with the irreversibility due to $W_t$ is that a properly constructed $\Lambda$ (and hence $\Lambda U_t \Lambda^{-1}$) preserves the positivity of the state distributions only for positive times. A conceptual discussion of this point can be found in [35]. For a more detailed formal account of the role which positivity preservation plays in the transformation between irreversible semigroups and chaotic dynamics, see [36] and references given there.

4.2 Rigged Hilbert Space Representation

Intrinsic irreversibility has also been implemented in an approach based on an extension of the usual Hilbert space representation of the state of a system. This approach makes use of the so-called rigged Hilbert space (RHS) construction, first introduced by the Russian mathematician Gelfand and his collaborators [37]. Roberts [38] and Bohm [39] independently showed how Dirac's formalism could be justified with complete mathematical rigor in a RHS. By the end of the 1970s, it turned out that some basic physical problems of Hilbert space quantum mechanics, notably in the context of decaying states or resonances, could be clarified in terms of RHS ([40] and references therein).

Very briefly, a RHS (Gelfand triplet) can be understood as follows. Let $\Psi$ be an abstract linear scalar product space, and complete $\Psi$ with respect to two topologies. The first topology is the standard norm topology, yielding a separable Hilbert space $\mathcal{H}$. The second topology $\tau_\Phi$ is defined by a countable set of norms

$\|\phi\|_n = \sqrt{(\phi, \phi)_n}, \quad n = 0, 1, 2, \ldots$    (15)

where $\phi \in \Phi$ and the scalar product is given by

$(\phi, \phi')_n = (\phi, (\Delta + 1)^n \phi'), \quad n = 0, 1, 2, \ldots$    (16)

where $\Delta$ is the Nelson operator $\Delta = \sum_i X_i^2$ [41]. The $X_i$ are operators representing the observables for the system in question and are the generators for the Nelson operator. Furthermore, the operator $\Delta + 1$ is a nuclear operator and ensures that $\Phi$ is a nuclear space (cf. [42, 39]). An operator is nuclear if it is linear, essentially self-adjoint, and its inverse is Hilbert-Schmidt. An operator $A^{-1}$ is Hilbert-Schmidt if $A^{-1} = \sum_i \lambda_i P_i$, where the $P_i$ are mutually orthogonal projection operators on finite-dimensional vector spaces and $\sum_i \lambda_i^2 < \infty$, the $\lambda_i$ denoting the eigenvalues of $A^{-1}$ counted with multiplicity [39]. We then have the Gelfand triplet of spaces

$\Phi \subset \mathcal{H} \subset \Phi^\times$    (17)

where $\Phi^\times$ is the dual to the space $\Phi$.

The Nelson operator fully determines the choice of function space when it comes to choosing a realization of the space $\Phi$. However, there are many different inequivalent irreducible representations of an enveloping algebra of a Lie group used to generate a Nelson operator describing physical systems. Therefore, further restrictions on the choice of function space for a realization of $\Phi$ are required. The particular characteristics of the physical context of the system being modeled provide some of these restrictions, analogous to the situation for GNS constructions in the transition from C*- to W*-algebras in algebraic quantum mechanics [23]. Additional restrictions may be required due to the convergence properties desired for test functions in $\Phi$ and $\Phi^\times$.
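The role of the countable family of norms can be illustrated numerically with a toy model. The sketch below is built on assumptions that are not in the text: it supposes an orthonormal basis $e_k$ ($k = 1, 2, \ldots$) in which the operator $\Delta + 1$ is diagonal with eigenvalue $k$, so that $\|\phi\|_n^2 = \sum_k k^n |c_k|^2$ for $\phi = \sum_k c_k e_k$. The names `norm_n`, `slow` and `fast` are hypothetical. Truncated sums can only suggest convergence or divergence; they do not prove it.

```python
import math

def norm_n(coeffs, n):
    """n-th norm of a coefficient sequence, assuming (Delta + 1) e_k = k e_k."""
    return math.sqrt(sum((k ** n) * abs(c) ** 2
                         for k, c in enumerate(coeffs, start=1)))

def slow(size):
    """Coefficients c_k = 1/k: square-summable, but weighted sums grow without bound."""
    return [1.0 / k for k in range(1, size + 1)]

def fast(size):
    """Coefficients c_k = exp(-k): every weighted sum stays bounded."""
    return [math.exp(-k) for k in range(1, size + 1)]

for size in (10**3, 10**5):
    print(size,
          norm_n(slow(size), 0),   # settles down as size grows
          norm_n(slow(size), 1),   # keeps growing (logarithmically in the square)
          norm_n(fast(size), 4))   # settles down for any fixed n
```

In this toy model the slowly decaying sequence behaves like an element of the Hilbert space that is not in $\Phi$, while the rapidly decaying one has all norms finite, which is the sense in which the second topology is finer than the first.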

Bohm and colleagues applied the RHS approach to intrinsic irreversibility in the context of scattering and decay phenomena [40, 43]. Antoniou and Prigogine [44] extended the approach to broader contexts. The core idea in both versions is that a unitary group operator $U_t = \exp(-iHt)$, $-\infty < t < \infty$, generated by a Hamiltonian $H$, under very general circumstances may be extended from $\mathcal{H}$ to $\Phi^\times$ (restricted to $\Phi$). For scattering processes, $\Phi$ is the intersection of the Hardy class functions with the Schwartz class functions. Because of continuity and completeness requirements, $U_t: \Phi^\times \to \Phi^\times$ ($U_t: \Phi \to \Phi$) can be extended to the upper half plane $\Phi_+^\times$ (restricted to $\Phi_+$) for positive times and to the lower half plane $\Phi_-^\times$ ($\Phi_-$) for negative times [43]. The extension of $U_t$ to $\Phi^\times$ (restriction to $\Phi$) forms two semigroups, because the extension (restriction) cannot be defined for replacement of $t$ with $-t$. Thus semigroup evolution falls out of the analysis quite naturally in the RHS framework.

4.3 General Features of Intrinsic Irreversibility

In the intrinsic conception of irreversibility, states of a system are generically represented by distributions in a suitable state space, where pure states are $\delta$ functions. The trajectories of individual points are either (1) considered irrelevant because empirically inaccessible (as in the $\Lambda$-transformation approach) or (2) make minimal contributions to the collective behavior of the system when a sufficient number of Poincaré resonances are present (as in the RHS approach). For systems beyond a particular degree of complexity (K-systems, Poincaré resonances, etc.), the dynamics of the system is governed by irreversible evolution laws regardless of interactions with an environment.

While the $\Lambda$-transformation approach has only been applied to the baker map, the RHS approach has been applied to nonlinear maps, Friedrichs models, scattering experiments and other decay phenomena. In the latter approach, exact Golden Rules for decay and survival probabilities and their rates can be derived in agreement with experimental observations [43].

^d The dual space $\Phi^\times$ is the space of linear functionals acting on elements of $\Phi$; its topology is induced by the choice of $\tau_\Phi$, and it includes distributions among its elements.

In both approaches, the transition from reversible to irreversible dynamical evolution laws is achieved by breaking the time-reversal symmetry in specific ways, leading to two semigroups. The time direction of the semigroups, however, is not given by either the $\Lambda$-transformation or RHS approaches. Physical considerations alone are insufficient to select the forward arrow, and one must appeal to consistency with experience, causality, or other criteria.

5 Summary and Open Questions

There are two basic points at which extrinsic and intrinsic notions of irreversibility coincide. The first is that both notions explicitly break the time-reversal symmetry of reversible dynamical laws. This is clearly the case for the standard external view, in which the transition from fundamental reversible laws to contextual irreversible laws corresponds to the transition from ontic states of closed systems to epistemic states of open systems. But even for the alternative intrinsic view, irreversibility is an emergent feature [45]. In the framework of the $\Lambda$-transformation, the time-reversal symmetry of K-systems is broken, leading to irreversible exact systems. In the RHS representation, a similar symmetry breaking is achieved by the transition from Hilbert space to the rigging spaces $\Phi$ and $\Phi^\times$.

The breaking of time-reversal symmetry always produces two semigroups, which can be endowed with opposite temporal directions. Selection criteria must be used to select one of these two directions for a preferred mode of description. In both extrinsic and intrinsic approaches, there is no such criterion available based on physical reasoning alone. The selection is based on extra-physical arguments such as causality, experience, and others. This second point of agreement between extrinsic and intrinsic irreversibility raises the interesting question of what conditions the proper direction of time has to satisfy. It could be argued that, up to the condition that it is the same for all physical systems, the selection is arbitrary.

There are two basic points at which extrinsic and intrinsic notions of irreversibility apparently differ. One of them concerns the role of the environment; the other has to do with the state concepts used in the two approaches. Briefly speaking, the role of the environment and the distinction of different state concepts is crucial in the standard framework of extrinsic irreversibility. The conceptual framework of the formalisms referring to intrinsic irreversibility neither (1) explicitly contains the concept of an environment nor (2) distinguishes between different state concepts.

These observations do not necessarily imply that intrinsic irreversibility really can dispense with points (1) and (2). It is likely that the two points play crucial roles even though they do not explicitly appear in the formalism and its usual interpretation.

The projection (factorization), which is the crucial part of a $\Lambda$-transformation, can be considered as the selection of an exact subsystem of the original K-system. Obviously, the $\Lambda$-transformation is not universal but context-dependent. Conceptually, the irreversible evolution of $\tilde\rho = \Lambda\rho$ due to $W_t$ could then be attributed to the restriction of the K-system to an exact subsystem. This might lead to interesting analogies with aspects of extrinsic irreversibility if the subsystem cannot be described as a closed subsystem. Concrete empirical applications of the $\Lambda$-transformation are not yet available. They would be necessary to check the significance of a physical environment which is not explicit in the formalism.

Concerning the distinction between ontic and epistemic state concepts, it is clear that the approach of intrinsic irreversibility starts at the level of distributions rather than points. In the space of distributions, $\delta$ functions are special cases that could be related to points in a state space underlying the distribution space considered. In this way, a connection between distributions as epistemic states and points as ontic states is possible. The general claim in the $\Lambda$-transformation framework of intrinsic irreversibility, though, is that ontic states in the sense of phase points are meaningless or irrelevant since they are empirically inaccessible.

But is it justified to consider ontic states as generally irrelevant because they are empirically inaccessible? Reversible fundamental laws refer to ontic states, and it is not easy to formulate physics without them. The monographs by Ludwig [46], which consistently avoid any ontic elements, are an illustrative example. Moreover, special techniques to break symmetries often enable a unique derivation of irreversible contextual laws if the fundamental laws plus contexts are known. This also holds for the symmetry breaking used to derive intrinsic irreversibility from time-reversal invariant evolution in the $\Lambda$-transformation approach. The empirical inaccessibility of ontic states notwithstanding, one should therefore not dismiss their overall relevance too quickly.

In the RHS approach there is no contradiction with the formal arguments in the case of extrinsic irreversibility insofar as the extension of $U_t$ from $\mathcal{H}$ into $\Phi^\times$ leads from reversibility to irreversibility. In this case, irreversibility is a feature arising during the transition from states in $\mathcal{H}$ to states whose state space is defined with respect to contexts. In the algebraic framework of Sec. 3, such contexts are reflected by a contextual topology on $M$. As mentioned in Sec. 4.2, physical contexts may not be known sufficiently well to determine $\Phi^\times$ uniquely. The physical examples used to demonstrate the significance of the RHS formulation (e.g. decay) suggest that a physical environment is inevitable, although this is not explicit in the formalism.

The relationship between ontic and epistemic states in the RHS approach is more subtle than in the $\Lambda$-transformation approach. As Petrosky and Prigogine argue [47, 48], the presence of a sufficient number of Poincaré resonances in so-called large Poincaré systems (LPS) rapidly converts the smooth, infinitely differentiable trajectories of the phase space points into random walks. Though the trajectories are not considered to be empirically inaccessible, their effects are limited to the formation of higher and higher orders of correlations as the dynamics evolves. The phase space points can represent ontic states, but the correlations also have an ontic status. Correlations very rapidly come to dominate the dynamics of all collective modes of behavior of LPS (e.g. the approach to equilibrium) as the correlations diffuse throughout the system. In this way, the effects of individual points and trajectories become irrelevant to the dynamics of the whole, and thus one can argue that the distribution description is an ontic description of the system's behavior.

In this way, the distinction between ontic and epistemic states might be a powerful conceptual tool even at the level of distributions alone. There is a conceptual difference between a probability distribution conceived as a distribution over an ensemble of individual pure states (as in the ^-statistical representation) and a probability distribution conceived as an individual whole. The latter concept is sometimes indicated in the context of intrinsic irreversibility and can be considered as an ontic version of the former (cf. the notion of relative onticity [16]). For instance, continuum mechanics requires a formulation which needs ontically interpreted holistic distributions from the very beginning, since its description in terms of an ensemble of points would violate basic physical laws.

Among the adherents of intrinsic irreversibility, it is claimed that the holistic concept of a distribution as a whole entails predictions, e.g. related to the dynamics of correlations in large systems, which cannot be obtained with the concept of a probability distribution of individual pure states. This claim particularly refers to situations far from thermal equilibrium. Based on Gallavotti's approach, which describes systems far from equilibrium in terms of SRB-measures [49], i.e. in an ensemble description, this claim may become testable (see also [50] for a brief discussion).

After all, it is possible to view the intrinsic approach to irreversibility as emphasizing the relative importance of the advanced level of complexity of systems with nontrivial correlations over environmental effects. While extrinsic irreversibility addresses the importance of an environment, intrinsic irreversibility should not primarily be understood as focusing on the neglect of such an environment (e.g. the environment may be a necessary condition for the existence of the dynamics). Instead, it is perhaps more appropriate to understand intrinsic irreversibility as irreversibility intrinsic to the dynamics of a system, given a particular degree of its complexity.

Acknowledgments

Helpful comments by L. Accardi, L. Ballentine, H. Narnhofer, and I. Volovich during the discussion of this contribution at the conference are much appreciated. We are grateful to H. Primas for remarks on an earlier version of this paper.

References

1. J.H. Fetzer and R.F. Almeder, Glossary of Epistemology/Philosophy of Science (Paragon House, New York, 1993), p. 100f.

2. D. Howard, Space-time and separability: problems of identity and individuation in fundamental physics. In Potentiality, Entanglement, and Passion-at-a-Distance, ed. by R.S. Cohen, M. Horne, and J. Stachel (Kluwer, Dordrecht, 1997), pp. 113-141.

3. W. Heisenberg, Physics and Philosophy (Harper and Row, New York, 1958).

4. D. Bohm, Wholeness and the Implicate Order (Routledge and Kegan Paul, London, 1980).

5. B. d'Espagnat, Veiled Reality (Addison-Wesley, Reading, 1995).

6. H. Margenau, Reality in quantum mechanics. Phil. Science 16, 287-302 (1949), here p. 297.

7. K.R. Popper, The propensity interpretation of probability and quantum mechanics. In Observation and Interpretation in the Philosophy of Physics - With special reference to Quantum Mechanics, ed. by S. Körner in collaboration with M.H.L. Pryce (Constable, London, 1957), pp. 65-70. [Reprinted by Dover, New York, 1962.]

8. R. Harré, Is there a basic ontology for the physical sciences? Dialectica 51, 17-34 (1997).

9. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974), pp. 448-453, 504-507.

10. E. Scheibe, The Logical Analysis of Quantum Mechanics (Pergamon, Oxford, 1973), pp. 82-88.

11. H. Primas, Mathematical and philosophical questions in the theory of open and macroscopic quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 233-257.

12. H. Primas, Endo- and exotheories of matter. In Inside Versus Outside, ed. by H. Atmanspacher and G.J. Dalenoort (Springer, Berlin, 1994), pp. 163-193.

13. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932). English translation: Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).

14. A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777-780 (1935).

15. H. Primas, Emergence in exact natural sciences. Acta Polytechnica Scandinavica Ma 91, 83-98 (1998). See also Primas, Chemistry, Quantum Mechanics, and Reductionism (Springer, Berlin, 1983), Chap. 6.

16. H. Atmanspacher and F. Kronz, Relative onticity. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 273-294.

17. H. Atmanspacher, Ontic and epistemic descriptions of chaotic systems. In Computing Anticipatory Systems: CASYS 99, ed. by D. Dubois (Springer, Berlin, 2000), pp. 465-478.

18. E. Fick and G. Sauermann, Quantenstatistik dynamischer Prozesse, Vol. IIa: Antwort- und Relaxationstheorie (Harri Deutsch, Thun, 1986).

19. R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II (Springer, Berlin, 1985).

20. H. Primas, The Cartesian cut, the Heisenberg cut, and disentangled observers. In Symposia on the Foundations of Modern Physics. Wolfgang Pauli as a Philosopher, ed. by K.V. Laurikainen and C. Montonen (World Scientific, Singapore, 1993), pp. 245-269.

21. A. Amann, Structure, dynamics, and spectroscopy of single molecules: a challenge to quantum mechanics. J. Math. Chem. 18, 247-308 (1995).

22. A. Amann and H. Atmanspacher, Fluctuations in the dynamics of single quantum systems. Stud. Hist. Phil. Mod. Phys. 29, 151-182 (1998).

23. A. Amann and H. Atmanspacher, C*- and W*-algebras of observables, their interpretation, and the problem of measurement. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 57-79.

24. H. Primas, Induced nonlinear time evolution of open quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 259-280.

25. N. Wiener, The homogeneous chaos. Am. J. Math. 60, 897-936 (1938).

26. C.M. Lockhart and B. Misra, Irreversibility and measurement in quantum mechanics. Physica A 136, 47-76 (1986). Cf. H. Primas, Math. Rev. 87k, 81006 (1987).

27. A. Lasota and M.C. Mackey, Chaos, Fractals, and Noise (Springer, Berlin, 1995).

28. V.A. Rokhlin, Exact endomorphisms of Lebesgue spaces. Izv. Akad. Nauk SSSR Ser. Mat. 25, 499-530 (1964); transl. in Am. Math. Soc. Transl. 39, 1-36 (1964).

29. B. Misra, Nonequilibrium entropy, Lyapounov variables, and ergodic properties of classical systems. Proc. Ntl. Acad. Sci. USA 75, 1627-1631 (1978).

30. B. Misra, I. Prigogine, and M. Courbage, From deterministic dynamics to probabilistic descriptions. Physica A 98, 1-26 (1979).

31. A. Wightman, Review of Misra, Prigogine, and Courbage [30]. Math. Rev. 82e, 58066 (1982).

32. Z. Suchanecki, On lambda and internal time operators. Physica A 187, 249-266 (1992).

33. H. Atmanspacher and H. Scheingraber, A fundamental link between system theory and statistical mechanics. Found. Phys. 17, 939-963 (1987).

34. H. Atmanspacher, Dynamical entropy in dynamical systems. In Time, Temporality, Now, ed. by H. Atmanspacher and E. Ruhnau (Springer, Berlin, 1997), pp. 325-344.

35. R.W. Batterman, Randomness and probability in dynamical theories: on the proposals of the Prigogine school. Philosophy of Science 58, 241-263 (1991).

36. I. Antoniou, K. Gustafson, and Z. Suchanecki, On the inverse problem of statistical physics: from irreversible semigroups to chaotic dynamics. Physica A 252, 345-361 (1998).

37. I.M. Gelfand and N.Ya. Vilenkin, Generalized Functions, Vol. 4 (Academic, New York, 1964). Russian original published 1961 in Moscow.

38. J.E. Roberts, The Dirac bra and ket formalism. Journal of Mathematical Physics 7, 1097-1104 (1966).

39. A. Bohm, Rigged Hilbert space and mathematical descriptions of physical systems. In Lectures in Theoretical Physics IX A: Mathematical Methods of Theoretical Physics, ed. by W.E. Brittin, A.O. Barut, and M. Guenin (Gordon and Breach, New York, 1967), pp. 255-317.

40. A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors, and Gelfand Triplets, Lecture Notes in Physics, Vol. 348, ed. by A. Bohm and J.D. Dollard (Springer, Berlin, 1989).

41. E. Nelson, Analytic vectors. Annals of Mathematics 70, 572-615 (1959).

42. F. Treves, Topological Vector Spaces, Distributions, and Kernels (Academic Press, New York, 1967).

43. A. Bohm, S. Maxson, M. Loewe, and M. Gadella, Quantum mechanical irreversibility. Physica A 236, 485-549 (1997).

44. I. Antoniou and I. Prigogine, Intrinsic irreversibility and integrability of dynamics. Physica A 192, 443-464 (1993).

45. T. Petrosky and I. Prigogine, The Liouville space extension of quantum mechanics. Adv. Chem. Phys. XCIX, 1-120 (1997), here p. 71.

46. G. Ludwig, Foundations of Quantum Mechanics, Vols. 1, 2 (Springer, Berlin, 1983, 1985).

47. T. Petrosky and I. Prigogine, Poincaré resonances and the extension of classical dynamics. Chaos, Solitons & Fractals 7, 441-497 (1996).

48. T. Petrosky and I. Prigogine, The extension of classical dynamics for unstable Hamiltonian systems. Computers & Mathematics with Applications 34, 1-44 (1997).

49. G. Gallavotti, Chaotic dynamics, fluctuations, nonequilibrium ensembles. CHAOS 8, 384-392 (1998).

50. D. Ruelle, Gaps and new ideas in our understanding of nonequilibrium. Physica A 263, 540-544 (1999).

INTERPRETATIONS OF PROBABILITY AND QUANTUM THEORY

L. E. BALLENTINE

Department of Physics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada

e-mail: ballenti@sfu.ca

There is a peculiar similarity between Probability Theory and Quantum Mechanics: both subjects are mature and successful, yet both remain subject to controversy about their foundations and interpretation. I first present a classification of the various interpretations of probability, arguing that they should not be thought of as rivals, but rather as applications of a general theory to different kinds of subject matter. An axiom system that makes conditional probability the fundamental concept is put forward as being superior to Kolmogorov's axioms. I then discuss the relevance to quantum theory of the various interpretations of probability, the applicability of classical probability theory within quantum mechanics, and the relations between the interpretation of probability and the interpretation of quantum mechanics.

1 Introduction

There are many connections between Probability Theory and Quantum Mechanics, the most notable being that Quantum Mechanics uses Probability Theory in its fundamental interpretation, not merely as a technique. But I wish to concentrate on a more peculiar similarity. Although both subjects are mature and successful, both remain subject to controversy about their foundations and interpretation. There may be even more interpretations of probability than there are of quantum theory. Can one bring some degree of order to this subject?

Probability Theory, being a branch of mathematics, is defined by a set of axioms. So it can legitimately be applied to any entity that satisfies those axioms. Most of the interpretations of probability can be viewed as applications of the formal theory to different subject matters. It is, therefore, misguided to argue over which is the correct interpretation. Most of them are correct within their appropriate domain of application. But it is still reasonable to ask whether there is a general, overarching form of Probability Theory, of which all the various interpretations can be seen as special cases applied to special subject matters.

I shall propose such a classification of the various interpretations of probability. To do so, it is necessary to overlook small differences and to lump closely related interpretations into a few broad categories. I expect this classification to be controversial, but I believe that it is a step in the right direction. I shall consider only theories that are based on the same or equivalent sets of axioms. Hence generalizations such as negative probabilities are not included in this scheme, although I shall briefly refer to them later. After describing the major categories of interpretation of probability, I will discuss the relevance of each to quantum mechanics.

2 Interpretations of Probability

Many different interpretations of probability are examined in detail by T. L. Fine [1]. I propose to overlook many of the fine differences, and hence classify them into a few major groups, shown in Figure 1. References to most of the authors named in Fig. 1, and critical analyses of their ideas, are given by Fine [1].

2.1 The Theory of Inductive Inference

I propose that the Theory of Inductive Inference be taken as the master theory, and that all other interpretations be regarded as special cases, applicable in more restricted contexts. This point of view was expressed most completely by E. T. Jaynes in his book Probability Theory: The Logic of Science, which unfortunately was not completed during his lifetime.

Within this interpretation, probability is assigned to propositions. The notation P(A|C) is to be read as "the probability of A under the condition C". Probability is regarded as a logical relation among propositions that is weaker than entailment. Inductive logic reduces to deductive logic in the limit of probability values 0 and 1. Probability is an objective relation and should not be confused with degrees of belief.

The propositions to which probability is assigned may have any particular content. If we specialize to propositions about repeated experiments, we obtain the Ensemble-Frequency theory. If we specialize to propositions about personal belief, we obtain Subjective probability. If we specialize to propositions about indeterministic or unpredictable events, we obtain the Propensity theory.

Although P(A|C) is a logical relation between proposition A and the conditioning information C, it is not merely a formal syntactic relation. The content (meaning) of A and C must be invoked to evaluate P(A|C). There is no magic formula to translate arbitrary information into probabilities. Jaynes has given solutions to this problem in some important special cases (symmetry groups, marginalization), but there is as yet no general solution.


Figure 1. Classification of the interpretations of probability.

The Logic of Inductive Inference (E. T. Jaynes, R. T. Cox, H. Jeffreys): P(A|C) is the probability that proposition A is true, given the information C. It has three special cases:

Ensemble and Frequency (Kolmogorov, Bernoulli, von Mises): measure on a set; limit frequency in an ordered sequence.

Propensity (K. R. Popper): P(A|C) is the propensity for event A to occur under the condition C.

Subjective and Personal (de Finetti, L. J. Savage, I. J. Good): incomplete knowledge; degrees of reasonable belief.

2.2 Ensemble and Frequency Theories

One of the most common interpretations of probability is as a limit frequency in an ordered sequence. The ratio of the number n of occurrences of a particular type in a sequence of N events, n/N, is identified with the probability. This interpretation is useful in analyzing repeated experiments, but it has the difficulty that in a random sequence the ratio n/N need not have a limit. The ensemble interpretation is a generalization of the frequency interpretation, in which probability is identified with a measure on a set that need not be ordered. It is closely associated with Kolmogorov's axiom system, which will be discussed later.
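The behaviour of the ratio n/N in a finite sequence can be sketched with a short simulation. The function name `running_frequency`, the value p = 0.3 and the seed are illustrative assumptions, and the sequence comes from a pseudo-random generator. A finite run can show the ratio settling near p, but it cannot establish that a limit exists, which is exactly the difficulty noted above.

```python
import random

def running_frequency(p, n_trials, seed=0):
    """Return the list of ratios n/N for N = 1..n_trials in a simulated event sequence."""
    rng = random.Random(seed)      # seeded so that the run is reproducible
    hits = 0
    freqs = []
    for trial in range(1, n_trials + 1):
        hits += rng.random() < p   # the event occurs with probability p
        freqs.append(hits / trial)
    return freqs

freqs = running_frequency(p=0.3, n_trials=100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, freqs[n - 1])
```

The early ratios fluctuate widely, and only for large N do they stay close to the assigned probability.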

2.3 Subjective Probability

Subjectivism has its place, and subjective probability provides an excellent way to describe degrees of reasonable belief. But in science, subjectivism can be like a virus, and we must guard against its infection. In general, the probability $P(A \mid C)$ expresses an objective relation between A and C, determined by the totality of the information C and not by anyone's personal opinions. Jaynes tried to ensure objectivity through the pedagogical device of introducing a robot that is programmed to reason consistently, using only the information that is given to it. But even Jaynes sometimes slipped from objective to personal probabilities in his examples, without apparently being aware of doing so. Indeed, the contamination of Inductive Logic Probability by subjectivism may have been a major barrier to its acceptance.

2.4 Propensity

Propensity is a form of causality that is weaker than determinism [3,4]. Generally speaking, probability expresses logical relations rather than causal relations. (Recall the old saying: Correlation does not imply causality.) However, causality is a special kind of logical relation, and propensity theory deals with just that special case. The propensity interpretation of probability is natural in situations, such as those described by quantum mechanics, in which events cannot be predicted with certainty from their antecedents.

3 The Axioms of Probability

The axioms of probability theory can be given in several different forms; however, those given by R. T. Cox [5,6] are particularly convenient.

Axiom 1: $0 \le P(A \mid B) \le 1$
Axiom 2: $P(A \mid A) = 1$
Axiom 3: $P(\neg A \mid B) = 1 - P(A \mid B)$
Axiom 4: $P(A \,\&\, B \mid C) = P(A \mid C)\, P(B \mid A \,\&\, C)$

Here the notation is as follows: $\neg A$ means not A; $A \,\&\, B$ means A and B; $A \lor B$ means either A or B.

Axiom 2 states that the probability of a certainty (A given A) is one. Axiom 1 states that no probabilities are greater than the probability of a certainty. Axiom 3 expresses the notion that the probability of non-occurrence of an event increases as the probability of its occurrence decreases. It also implies $P(\neg A \mid A) = 0$: an impossibility (not A given A) has zero probability. Axiom 4 is the least intuitive. The probability of both A and B (under some condition C) is equal to the probability of A multiplied by the probability of B given A.

The probabilities of negation ($\neg A$) and conjunction ($A \,\&\, B$) each require an axiom. However, no further axioms are required to treat disjunction, because $A \lor B = \neg(\neg A \,\&\, \neg B)$; in words, A or B is equivalent to the negation of neither A nor B. This allows us to deduce a theorem:

$$P(A \lor B \mid C) = P(A \mid C) + P(B \mid C) - P(A \,\&\, B \mid C) \tag{1}$$
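The deduction itself is not written out in the text. One way to obtain Eq. (1), offered here only as a sketch, uses Axioms 3 and 4 together with the assumption that conjunction is symmetric ($A \,\&\, B$ and $B \,\&\, A$ are the same proposition):

$$
\begin{aligned}
P(A \lor B \mid C) &= 1 - P(\neg A \,\&\, \neg B \mid C) && \text{(Axiom 3)} \\
&= 1 - P(\neg A \mid C)\, P(\neg B \mid \neg A \,\&\, C) && \text{(Axiom 4)} \\
&= 1 - P(\neg A \mid C)\,\bigl[1 - P(B \mid \neg A \,\&\, C)\bigr] && \text{(Axiom 3)} \\
&= P(A \mid C) + P(\neg A \,\&\, B \mid C) && \text{(Axioms 3, 4)} \\
&= P(A \mid C) + P(B \mid C)\, P(\neg A \mid B \,\&\, C) && \text{(Axiom 4, symmetry)} \\
&= P(A \mid C) + P(B \mid C)\,\bigl[1 - P(A \mid B \,\&\, C)\bigr] && \text{(Axiom 3)} \\
&= P(A \mid C) + P(B \mid C) - P(A \,\&\, B \mid C) && \text{(Axiom 4)}
\end{aligned}
$$

Each line uses only the axiom named on the right, so no separate axiom for disjunction is needed.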

If A and B are mutually exclusive, then we obtain

$$P(A \lor B \mid C) = P(A \mid C) + P(B \mid C) \tag{2}$$

which is often taken to be an axiom, and may be used in place of Axiom 3.

Several remarks about these axioms are in order. First, the notion of randomness plays no fundamental role in the theory. Hence we need not enquire whether our variables and events are random as a prerequisite to applying probability theory.

Second, these axioms are not arbitrary. They are uniquely determined (apart from formal changes that do not affect the content) by conditions of plausibility and consistency (see Cox [5] and Jaynes [2]):

(i) The probability of A on some given evidence determines also the probability of not A on the same evidence.

(ii) The probability on given evidence that both A and B are true is determined by their separate probabilities, one on the given evidence and the other on that evidence plus the assumption that the first is true.

(iii) If a complex proposition can be composed in more than one way [e.g. $(A \,\&\, B) \,\&\, C$ or $A \,\&\, (B \,\&\, C)$], then all ways of computing its probability must lead to the same answer.

Notice that in (i) and (ii) only the existence of certain connections is assumed, but not their mathematical form. The consistency condition (iii) then leads to the mathematical forms of the axioms. Therefore anyone who proposes an inequivalent alternative to Cox's axioms (such as allowing negative probabilities) has an obligation to explain how and why he departs from these conditions of plausibility and consistency.

Finally, a very important remark: All probabilities are conditional.

The use of the single-variable notation $P(A)$ instead of $P(A \mid C)$ is permissible only if the conditional information C is obvious from the context and is unchanging throughout the problem. Many fallacies and paradoxes follow from ignoring this principle.

3.1 Kolmogorov's axioms

If the fundamental axioms that define Probability Theory are those given above, then what is the status of Kolmogorov's well-known axioms? According to Kolmogorov's axioms, probability is assigned to subsets of a universal set $\Omega$, with the following rules:

(1) $P(\Omega) = 1$.
(2) $P(f) \ge 0$ for any $f$ in $\Omega$.
(3) If $f_1, \ldots, f_n$ are disjoint, then $P(f) = \sum_j P(f_j)$, where $f$ is the union of $f_1, \ldots, f_n$.
(4) If $f \to \emptyset$ (the empty set), then $P(f) \to 0$.

The answer, I believe, is that Kolmogorov's axioms provide a mathematical model of probability theory (defined by Cox's axioms) on the theory of measurable sets. A mathematical model is useful because it reduces the consistency of one theory to that of another. (A familiar example is the algebra of complex numbers, which can be modeled by the algebra of ordered pairs of reals.) Thus any doubts about the consistency of Probability Theory may be laid to rest because of the existence of Kolmogorov's model.
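To make the idea of a model concrete, the following Python sketch (an illustration added here, not taken from the text) builds a small Kolmogorov-style measure and checks that conditional probabilities defined as ratios of measures satisfy Axioms 1 to 4 and Eq. (1). The four-element set `OMEGA`, the uniform weights, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical universal set: four equally weighted elementary outcomes.
OMEGA = frozenset({1, 2, 3, 4})


def measure(subset):
    """Kolmogorov-style measure of a subset of OMEGA (uniform weights)."""
    return Fraction(len(subset), len(OMEGA))


def prob(a, c):
    """P(A|C) modeled as measure(A and C) / measure(C); C must be non-empty."""
    if not c:
        raise ValueError("the conditioning set C must have non-zero measure")
    return measure(a & c) / measure(c)


def negate(a):
    """'not A' modeled as the complement of A in OMEGA."""
    return OMEGA - a


def all_subsets(universe):
    """Every subset of the universe, i.e. every proposition in this model."""
    items = sorted(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


def check_cox_axioms():
    """Exhaustively verify Axioms 1-4 and Eq. (1) in the finite model."""
    subsets = all_subsets(OMEGA)
    for c in (s for s in subsets if s):
        assert prob(c, c) == 1                            # Axiom 2
        for a in subsets:
            assert 0 <= prob(a, c) <= 1                   # Axiom 1
            assert prob(negate(a), c) == 1 - prob(a, c)   # Axiom 3
            for b in subsets:
                if a & c:  # P(B|A&C) is defined only if A&C is possible
                    assert prob(a & b, c) == prob(a, c) * prob(b, a & c)  # Axiom 4
                assert prob(a | b, c) == (
                    prob(a, c) + prob(b, c) - prob(a & b, c))  # Eq. (1)
    return True
```

Calling `check_cox_axioms()` runs through every pair of propositions and every non-empty condition. Exact `Fraction` arithmetic is used so that the equalities are tested without rounding error.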

There are several objections to taking Kolmogorov's axioms as a foundation for Probability Theory, rather than merely as a model:

• The universal set $\Omega$ is often fictitious. The propositions to which probabilities are assigned are not subsets of a set.

• Conditional probability is relegated to secondary status, while the mathematical fiction of absolute probability is made primary.

• Probability theory and Measure theory are distinct subjects. The interesting problems of one are not closely related to the interesting problems of the other. For example, measure theory deals mostly with infinite sets, culminating with the construction of non-measurable sets, which have no probabilistic interpretation. But in probability theory one seldom needs to consider an infinite number of conjunctions and disjunctions. On the other hand, the important problem of translating qualitative information into probabilities has no measure-theoretic analog.

4 Probability in Quantum Mechanics

4.1 Relevant and Irrelevant Interpretations of Probability

Which of the interpretations of probability are relevant to quantum mechanics? The ensemble-frequency interpretation is obviously relevant, and widely used in discussing the statistics of repeated experiments on similarly prepared states. Indeed, the standard description of an idealized experiment is: (1) prepare a state; (2) measure an observable of the system; (3) repeat the previous two steps until sufficient statistical data has been accumulated; (4) compare the relative frequencies of this data with the probabilities predicted by quantum theory.

The propensity interpretation is in accord with the ensemble-frequency interpretation whenever it is applied to repeated experiments, but it also allows one to make meaningful statements about individual events. The propensity interpretation is more natural when one considers time-dependent states, and hence time-dependent probabilities. Consider the following examples.

(i) A source produces $s = 1/2$ particles polarized at an angle $\phi$ relative to some coordinate axis. A Stern-Gerlach magnet has its field gradient axis oriented at an angle $\theta$. What is the probability that such a particle incident on the apparatus will emerge with spin up?

The formal answer is, of course, $p = \cos^2[(\theta - \phi)/2]$, but what does this mean?

According to the propensity interpretation, it means: The propensity (chance) of the particle emerging with spin up is p.

According to the ensemble-frequency interpretation, it means: In a long run of similar experiments, the fraction of particles emerging with spin up will be (approximately) p.
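The two readings of example (i) can be placed side by side in a short Python sketch. It is an illustration added here, with hypothetical function names: `spin_up_probability` evaluates the formula for a single event, and `simulate_fraction_up` imitates a long run with a seeded pseudo-random generator standing in for the experiment.

```python
import math
import random


def spin_up_probability(theta, phi):
    """p = cos^2[(theta - phi)/2] for a spin-1/2 particle (angles in radians)."""
    return math.cos((theta - phi) / 2.0) ** 2


def simulate_fraction_up(theta, phi, n_trials, seed=0):
    """Fraction of spin-up results in a simulated run of n_trials particles."""
    rng = random.Random(seed)
    p = spin_up_probability(theta, phi)
    n_up = sum(1 for _ in range(n_trials) if rng.random() < p)
    return n_up / n_trials
```

For instance, with `theta = math.pi / 3` and `phi = 0` the single-event propensity is 0.75, and the simulated fraction approaches that value as `n_trials` grows.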

(ii) Now let the magnet be re-oriented in some arbitrary manner before each particle is released, so that $\theta$ is different in each case.

According to the propensity interpretation, we say nearly the same thing: The propensity (chance) has a different value $p = p_\theta$ in each case.

But in the ensemble-frequency interpretation, one must conceptually embed each event in an imaginary long run of experiments having the same value of $\theta$ in order to make a frequency statement.

(iii) Suppose next that the polarization direction $\phi$ of the particles is unknown. Can it be inferred from the data of (ii)?

In the ensemble-frequency interpretation, the answer would appear to be No. A long run of events for each value of $\theta$ would be necessary to estimate $p_\theta$ as a frequency, and hence to determine its dependence on $\theta$.

In the propensity interpretation, the answer is Yes. Bayesian inference (equivalent to maximum likelihood if the prior probability distribution for $\phi$ is uniform) can determine the most probable value of $\phi$, even if there is only one event for each value of $\theta$.
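The inference described in (iii) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the prior for the polarization angle is taken to be uniform on a grid over $[0, 2\pi)$, the data are simulated with one event per randomly chosen magnet angle, and all names (`simulate_events`, `most_probable_phi`, the grid size) are hypothetical.

```python
import math
import random


def spin_up_probability(theta, phi):
    """p = cos^2[(theta - phi)/2] for a spin-1/2 particle (angles in radians)."""
    return math.cos((theta - phi) / 2.0) ** 2


def simulate_events(phi_true, n_events, seed=1):
    """One event per magnet setting: a list of (theta, emerged_spin_up) pairs."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        theta = rng.uniform(0.0, 2.0 * math.pi)  # a different theta each time
        up = rng.random() < spin_up_probability(theta, phi_true)
        events.append((theta, up))
    return events


def log_likelihood(phi, events):
    """Log-probability of the observed events if the polarization angle is phi."""
    total = 0.0
    for theta, up in events:
        p = spin_up_probability(theta, phi)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p if up else 1.0 - p)
    return total


def most_probable_phi(events, grid_size=360):
    """With a uniform prior, the posterior mode is the maximum-likelihood phi."""
    grid = [2.0 * math.pi * k / grid_size for k in range(grid_size)]
    return max(grid, key=lambda phi: log_likelihood(phi, events))
```

No frequency is ever estimated for a fixed magnet angle: each event contributes one factor to the likelihood, and the estimate comes from the product of all such factors.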

I have never seen a coherent exposition of QM based on a subjective interpretation of quantum probabilities as representing knowledge. This point (which has also been argued at length by Popper [8]) is worth emphasizing, because the interpretation of probabilities as knowledge seems to be a tenet of the Copenhagen interpretation.

Two persons (with limited knowledge of QM) might have different reasonable beliefs about the position of the electron in the hydrogen atom, and those beliefs could be represented by subjective probabilities. But such ignorance probabilities have nothing to do with $|\psi(x)|^2$ from the Schroedinger equation. $|\psi(x)|^2$ is an objective propensity, not a subjective degree of belief.

The so-called Uncertainty principle, $\Delta x\, \Delta p \ge \hbar/2$, has nothing to do with subjective knowledge or ignorance. Its meaning is that in any physical preparation of a state the values of x and p will not be reproducible, the widths of their distributions being related by the inequality. The widths $\Delta x$ and $\Delta p$ are objective, predictable, and measurable parameters, which should not be called uncertainties. Indeed, the name Indeterminacy principle is preferable to Uncertainty principle. [a]

[a] When I once heard Heisenberg speak (about 1964), he used the term Indeterminacy principle. In his early writings he used the words Ungenauigkeit (inexactness), Unbestimmtheit (indeterminacy), and Unsicherheit (uncertainty), with various shades of meaning.

Subjective probabilities can occur in the information games that are played in quantum communication theory. Consider a typical example.

Bob prepares some quantum state, but keeps it secret. He tells Alice only that it is one of four (usually nonorthogonal) possible states, and she must try to infer what the hidden state is from a measurement. Alice's incomplete knowledge of that hidden state can be expressed as a subjective probability. Suppose also that Bob tells Carol that the unknown state is one of three possibilities. Carol's knowledge is different from Alice's, and hence her subjective probability will be different. But both of these subjective knowledge probabilities are quite distinct from the objective quantum probabilities (propensities) that would be calculated by solving Schroedinger's equation for Bob's state preparation apparatus.

I suspect that the subjective knowledge interpretation of QM probabilities came about by accident: the founders of QM may have believed (erroneously) that probability can only be a measure of knowledge/ignorance. Max Born has written that Heisenberg did not know what a matrix was when he was inventing what later became known as matrix mechanics. It is therefore not very radical to suppose that the founders of quantum mechanics had an inadequate understanding of probability.

4.2 Fallacies in the use of Probability

Unsound arguments to the effect that classical probability theory does not apply to QM are woefully common. Before examining an actual argument to that effect, let us first consider a simple classical paradox.

The Bookie's Paradox: A bookie needs to fix the odds on a star track runner, who has a 60% chance of winning any race that he enters. There is a race in Paris and a race in Tokyo scheduled on the same day, so he cannot enter both, and we do not know which he will enter. What is the probability that he will win at least one of these races?

Let A = (winning in Paris) and let B = (winning in Tokyo). Clearly A and B are mutually exclusive events, so $P(A \lor B) = P(A) + P(B)$. The probability of his winning at least one race is 0.6 + 0.6 = 1.2. But this is absurd, since 1.2 > 1.

The paradox is resolved by taking account of a principle that was noted in Sec. 3:

All probabilities are conditional. The notation $P(A)$ instead of $P(A \mid C)$ is permissible only if the conditional information C is obvious from the context and unchanging throughout the problem.

Let us therefore be more precise about the conditions involved. Let $E_P$ = (entering in Paris) and let $E_T$ = (entering in Tokyo). Then clearly we have

$$P(A \mid E_P) = 0.6, \quad P(B \mid E_P) = 0, \quad P(A \mid E_T) = 0, \quad P(B \mid E_T) = 0.6$$

Additivity, $P(A \lor B \mid C) = P(A \mid C) + P(B \mid C)$, holds for the same condition C in all terms. But $P(A \mid E_P)$ and $P(B \mid E_T)$ are not additive by any valid rule, so the absurd conclusion reached above followed only from an erroneous application of probability theory.
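A consistent calculation can be sketched in Python by conditioning every term on the same information C. The sketch assumes, beyond what the text states, that C assigns some probability `p_enter_paris` to the runner entering in Paris and that he certainly enters exactly one of the two races; the function name and parameters are hypothetical.

```python
from fractions import Fraction


def win_at_least_one(p_enter_paris, p_win_given_entry=Fraction(3, 5)):
    """P(A or B | C), with every term conditioned on the same information C.

    Assumes P(E_P|C) + P(E_T|C) = 1: the runner enters exactly one race.
    """
    p_enter_tokyo = 1 - p_enter_paris
    # P(A|C) = P(E_P|C) * P(A|E_P & C); he cannot win a race he does not enter.
    p_a = p_enter_paris * p_win_given_entry
    p_b = p_enter_tokyo * p_win_given_entry
    # A and B are mutually exclusive under the common condition C, so Eq. (2) applies.
    return p_a + p_b
```

Whatever value is assumed for `p_enter_paris`, the result is 3/5, so the probability of winning at least one race never exceeds the 60% chance of winning the race actually entered.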

Double-slit Fallacy: A common fallacy about the two-slit experiment is of exactly the same form. The experiment consists of three parts:

(a) Open slit 1, close slit 2. The probability of a particle arriving at the point X on the screen is $P_1(X)$.

(b) Open slit 2, close slit 1. The probability of a particle arriving at X is now $P_2(X)$.

(c) Open both slits 1 and 2. The probability of a particle arriving at X is $P_{12}(X)$.

Now passage through slit 1 and through slit 2 are mutually exclusive, so we deduce

$$P_{12}(X) = P_1(X) + P_2(X)$$

which is empirically false. It is then concluded (fallaciously) that classical probability theory does not apply in quantum mechanics.

The above reasoning embodies essentially the same fallacy as does the Bookie's paradox, and it is resolved similarly, by paying proper attention to the conditional nature of the probabilities.

Let condition $C_1$ = (slit 1 open, slit 2 closed). Let $C_2$ = (slit 2 open, slit 1 closed). Let $C_3$ = (both slits open).

We observe empirically that

$$P(X \mid C_1) + P(X \mid C_2) \ne P(X \mid C_3)$$

(due, of course, to interference). But this fact is fully compatible with classical probability theory.
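A toy calculation illustrates the point. The Python sketch below is an assumption-laden illustration, not a model from the text: it uses a scalar wave amplitude with a path-length phase, a discrete set of screen points, and arbitrary values for the wave number, slit separation and screen distance. Each condition gets its own normalized distribution over the screen points, and the both-slits distribution is compared with the equal-weight average of the two single-slit distributions, which is what the fallacious additivity argument would predict after normalization.

```python
import cmath
import math


def amplitude(x, slit_position, wave_number=20.0, screen_distance=10.0):
    """Toy amplitude at screen point x for a wave leaving one slit."""
    path = math.hypot(screen_distance, x - slit_position)
    return cmath.exp(1j * wave_number * path) / path


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def arrival_probabilities(points, slit_separation=1.0):
    """Return P(X|C1), P(X|C2), P(X|C3) on a discrete set of screen points."""
    a1 = [amplitude(x, +slit_separation / 2) for x in points]
    a2 = [amplitude(x, -slit_separation / 2) for x in points]
    p1 = normalize([abs(a) ** 2 for a in a1])                  # C1: only slit 1 open
    p2 = normalize([abs(a) ** 2 for a in a2])                  # C2: only slit 2 open
    p3 = normalize([abs(u + v) ** 2 for u, v in zip(a1, a2)])  # C3: both slits open
    return p1, p2, p3


points = [-5.0 + 0.1 * k for k in range(101)]
p1, p2, p3 = arrival_probabilities(points)
# Largest pointwise difference between P(X|C3) and the average of P(X|C1), P(X|C2).
largest_gap = max(abs(c - 0.5 * (a + b)) for a, b, c in zip(p1, p2, p3))
```

Each of the three lists is a legitimate probability distribution (non-negative, summing to one), yet `largest_gap` is clearly non-zero: the three conditions are simply different, and nothing in the axioms relates probabilities under $C_3$ additively to those under $C_1$ and $C_2$.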

4.3 Quantum Probabilities

Quantum probabilities are not essentially different from classical probabilities, but, like quantum theory itself, they do require some care in their interpretation. H. Jeffreys [7] remarked that the probability statements of quantum mechanics are incomplete, because a probability is always relative to a set of data, and the data are not specified. In our terminology, Jeffreys is saying that all probabilities are conditional, and the conditions need to be specified to make the probability statement meaningful. This can be accomplished through a propensity interpretation of quantum probabilities, with proper attention being given to the basic concepts of measurement and state preparation. When that is done, it can be demonstrated [9,10] that quantum probabilities obey all of the axioms of classical probability theory. The demonstration is straightforward but too lengthy to review here, so I shall only remark on some conceptual points.

(a) The standard formula $P(A = a_n \mid \Psi) = |\langle a_n | \Psi \rangle|^2$, where $A|a_n\rangle = a_n|a_n\rangle$, should be read as:

The probability (propensity) for a measurement of the dynamical variable A to yield the value $a_n$, conditional on the preparation of the state $\Psi$, is $|\langle a_n | \Psi \rangle|^2$.

Note that the propensity is conditioned by the physical process of state preparation, and not by anyone's beliefs or opinions.

(b) One can also calculate the probability of a measurement result conditioned by state preparation and the results of other measurements:

$$P(B = b_m \mid (A = a_n) \,\&\, \Psi)$$

However, it is necessary that the measurement processes be described dynamically, as an interaction between the object and the apparatus. Simplistic application of the Projection Postulate is liable to give an incorrect answer [11].

(c) No difficulties of principle arise if the probabilities are conditioned on actual events of state preparation and measurement. But assigning probabilities to hypothetical unmeasured values is not always possible. This problem is encountered if we try to introduce joint probability distributions for (unmeasured values of) non-commuting observables, and require the marginal distributions to agree with the quantum probabilities of the individual observables.

In the case of position and momentum, we would like to have a joint distribution $P(x,p)$ that satisfies

$$P(x,p) \ge 0 \tag{3}$$

$$\int P(x,p)\, dp = |\langle x | \Psi \rangle|^2 \tag{4}$$

$$\int P(x,p)\, dx = |\langle p | \Psi \rangle|^2 \tag{5}$$

There are infinitely many solutions to this problem [12], but there is no apparent physical reason for any one of them to be preferred.
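As a simple illustration that solutions exist (an example added here, not one singled out by the text), consider the product of the two marginals, assuming $\Psi$ is normalized:

$$
\begin{aligned}
P_0(x,p) &= |\langle x | \Psi \rangle|^2\, |\langle p | \Psi \rangle|^2 \;\ge\; 0, \\
\int P_0(x,p)\, dp &= |\langle x | \Psi \rangle|^2 \int |\langle p | \Psi \rangle|^2\, dp = |\langle x | \Psi \rangle|^2, \\
\int P_0(x,p)\, dx &= |\langle p | \Psi \rangle|^2 \int |\langle x | \Psi \rangle|^2\, dx = |\langle p | \Psi \rangle|^2 .
\end{aligned}
$$

So $P_0$ satisfies (3), (4) and (5). It treats x and p as statistically independent, which is one more reason not to attach physical significance to any particular solution.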

However, in the case of angular momentum, where we might seek a joint distribution $P(J_x, J_y, J_z)$ for the three angular momentum components, it is not difficult to show that no such function can yield the quantum probabilities of the three components as marginals. However, this has more to do with Kochen-Specker [13] difficulties (the impossibility of assigning values to all quantum observables consistent with all the relevant constraints) than with probability theory. There is no case in which a quantum probability is well defined but violates an axiom of classical probability theory.

5 Conclusions

In this paper I have suggested a scheme whereby all the major interpretations of probability are unified, with the separate interpretations now seen as applications of the general theory to particular subject matters. That such different ideas as ensemble-frequency theories, propensity theory, and subjective degrees of reasonable belief can all be encompassed within a single framework is both useful and surprising. Because they can all be described by the same mathematical axioms, it is easy to switch from one kind of probability to another, as may be appropriate in a particular problem. But on the other hand, one can ask why such different things as frequencies, propensities, and degrees of belief should necessarily obey the same axiom system. This question should stimulate further foundational research.

For the case of degrees of reasonable belief, this work has already been completed by Cox [5,6], who showed that certain conditions of plausibility and consistency determine the axioms essentially uniquely. Essentially unique means subject only to formal transformations that do not alter the content of the theory. Therefore any alternative inequivalent system of plausible reasoning could be shown to suffer from some degree of inconsistency.

Khrennikov [14] has studied limit frequencies outside of any theory of probability, imposing only a condition of stabilization: that in a long sequence the frequencies should approach a limit. He has found many different cases to be possible, some of which lie outside of probability theory. It will be interesting to see whether these new logical possibilities are realized in nature. If not, then his stabilization condition will have to be supplemented by other conditions.

The greatest need for more foundational research is in the case of propensity. Although it clearly can be described by the axioms of probability theory, it is not yet clear why it must be so described.

Although I have dealt only with versions of probability theory that are derivable from the same axioms, I expect that the classification of interpretations (Fig. 1) may also be useful for generalized theories, such as those that admit negative probabilities [15]. For such generalizations we should ask: which of the interpretations do they support? Can such generalized probabilities be interpreted as frequencies? As propensities? As degrees of belief? Or must they be given some entirely new interpretation?

There are connections between the interpretations of probability and of quantum mechanics. This must be so because quantum mechanics does not predict events, but only the probabilities of events. If one adheres exclusively to a frequency interpretation of probability, then one is bound to assert that a quantum state describes only an ensemble of similarly prepared systems. If, on the other hand, one adopts a propensity interpretation of probability, then it becomes possible to make meaningful probability statements about an individual system. However, the empirically testable content of those statements can be realized only by measurements on an ensemble of similarly prepared systems. Thus the frequency interpretation is not made obsolete by the propensity interpretation, but merely broadened. The subjective interpretation of probability can be used in some situations, such as when the observer is not fully informed about the state preparation procedure. But it is never correct to interpret $|\psi|^2$ as representing knowledge (except perhaps in the trivial case in which the observer's knowledge is complete and in perfect accord with reality).

References

1. T. L. Fine, Theories of Probability: an Examination of Foundations (Academic Press, New York, 1973).

2. E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, forthcoming); an incomplete version of this work is available electronically at http://bayes.wustl.edu

3. K. R. Popper, in Observation and Interpretation, ed. S. Korner (Butterworths, London, 1957).

4. K. R. Popper, Realism and the Aim of Science (Hutchinson, London, 1983).

5. R. T. Cox, The Algebra of Probable Inference (Johns Hopkins University Press, Baltimore, MD, 1961).

6. R. T. Cox, Am. J. Phys. 14, 1 (1946).

7. H. Jeffreys, Scientific Inference (Cambridge University Press, Cambridge, 1973), sec 1031.

8. K. R. Popper, Quantum Theory and the Schism in Physics (Hutchinson, London, 1982).

9. L. E. Ballentine, Quantum Mechanics - A Modern Development (World Scientific, Singapore, 1998), Ch. 1.5, 2.4, 9.6.

10. L. E. Ballentine, Am. J. Phys. 54, 883 (1986).

11. L. E. Ballentine, Found. Phys. 20, 1329 (1990).

12. L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, ed. G. T. Moore and M. O. Scully (Plenum, New York, 1986), pp. 97-117.

13. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).

14. A. Khrennikov, Nonconventional approach to elements of physical reality based on nonreal asymptotics of relative frequencies, Proc. Conf. Foundations of Probability and Physics, Vaxjo-2000 (WSP, Singapore, 2001).

15. A. Khrennikov, Interpretations of Probability (VSP, Utrecht, 1999).

FORCING DISCRETIZATION AND DETERMINATION IN QUANTUM HISTORY THEORIES

BOB COECKE
Imperial College of Science, Technology & Medicine, Theoretical Physics Group,
The Blackett Laboratory, South Kensington, London SW7 2BZ
and
Free University of Brussels, Department of Mathematics, Pleinlaan 2, B-1050 Brussels

E-mail: bocoecke@vub.ac.be

We present a formally deterministic representation for quantum history theories, where we obtain the probabilistic structure via a discrete contextual variable; no continuous probabilities are as such involved at the primal level.

1 Introduction

In this paper we propose and study a model for history theories in which the probability structure emerges from a finite number of contextual happenings, any next happening having a fixed chance to occur under the condition that the previous one happened. Although this model cannot have a canonical mathematical status, since it has been proved that this type of representation in general admits no essentially unique smallest one [8,11], it provides insight in the emergence of logicality in the History Projection Operator setting [14], and it illustrates how deterministic behavior can be encoded beyond those interpretations of quantum history theories that are interpretationally restricted by so-called consistency or quasi-consistency (e.g. approximate decoherence). The particular motivation for this paradigm case study finds its origin in structural considerations towards a theory of quantum gravity [4,15,19]. As argued in [16], although the relative frequency interpretation of probability justifies the continuous interval as the codomain for value assignment, in the quantum gravity regime standard ideas of space and time might break down in such a way that the idea of spatial or temporal ensembles is inappropriate. For the other main interpretations of probability (subjective, logical or propensity) there seems to be no compelling a priori reason why probabilities should be real numbers. Our model should be envisioned as a deconstructive step, unraveling the probabilistic continuum as it appears in standard quantum theory and reducing it explicitly to a discrete temporal sequence of (contextual) events. The temporal sequence that emerges in this way is then easier to manipulate towards alternative encodings of contextual events, e.g. in propositional terms. It also enables a separate treatment of the internal (the system's) and external (the context's) time-encoding variables.

Although quantum history theories are currently most frequently envisioned in a context of so-called decoherence, we prefer to take the minimal perspective that a history theory is a theory that deals with sequential quantum measurements but remains essentially a dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator approach [14]. We also mention recently studied sequential structures in the context of quantum logic, references for which can be found in [10], resulting in a dynamic disjunctive quantum logic that provides an appropriate formal context to discuss the logicality of history theories.

A general theory on deterministic contextual models can be found in [8]. Note here that what we consider as contextuality is that in a measurement there is an interaction between the system and its context, and that precisely this interaction may to some extent influence the outcome of a measurement. A lack of knowledge on the precise interaction then yields quantum-type uncertainties. Besides this interpretational issue, classical representations are important since we think classically; so, even without giving any conceptual significance to the representation, it provides a mode to think deterministically in terms of determined trajectories of the system's state, without having to reconcile with concrete non-canonical constructs like pilot-wave mechanics.

2 Outcome determination via contextual models

We will present the required results in full abstraction, such that the reader clearly sees which structural ingredient of quantum theory determines existence of contextual models. For details and proofs we refer to [8]. Let $\mathcal{B}(\mathbb{R}^{\mu})$ denote the Borel subsets of $\mathbb{R}^{\mu}$.

Definition 1. A probabilistic measurement system is given by: (i) a set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) for each $e \in \mathcal{E}$ an outcome set $O_e \in \mathcal{B}(\mathbb{R}^{\mu})$, a $\sigma$-field $\mathcal{B}(O_e)$ of $O_e$-subsets, and (Kolmogorovian) probability measures $P_{p,e} : \mathcal{B}(O_e) \to [0,1]$ for each $p \in \Sigma$.

The canonical example is that of quantum theory, with every Hilbert space ray $\psi$ representing a state, every self-adjoint operator $H$ representing a measurement with its spectrum $O_H \subset \mathbb{R}$ as outcome set, where the $\sigma$-structure $\mathcal{B}(O_H)$ is inherited from that of $\mathcal{B}(\mathbb{R})$, and with probability measures $P_{\psi,H}(E) := \langle \psi | P_E \psi \rangle$, where $P_E$ denotes the spectral projector for $E \in \mathcal{B}(O_H)$. For the benefit of insight, and also for notational convenience, we will from now on assume that the measurements $e \in \mathcal{E}$ are represented in a one-to-one way by their outcome sets $O_e$. Note that whenever $\mathcal{E}$ can be represented by points of $\mathbb{R}^{\nu}$, it suffices to consider $\mathbb{R}^{\mu} \times \mathbb{R}^{\nu} = \mathbb{R}^{\mu+\nu}$ instead of $\mathbb{R}^{\mu}$ to fulfill this assumption, taking $O_e \times \{e\}$ as the corresponding outcome set. We stress, however, that the results listed below also hold in absence of this assumption [8,11].

Definition 2. A pre-probabilistic hidden measurement system is given by: (i) a set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) sets $\mathcal{O} \subseteq \mathcal{B}(\mathbb{R}^{\mu})$ and $\Lambda$ that parameterize $\mathcal{E}$, i.e. $\mathcal{E} = \{e_{\lambda,O} \mid \lambda \in \Lambda, O \in \mathcal{O}\}$, and each $e_{\lambda,O} \in \mathcal{E}$ goes equipped with a map $\varphi_{\lambda,O} : \Sigma \to O$.

We can represent $\{\varphi_{\lambda,O} \mid \lambda \in \Lambda\}$ as $\varphi_O : \Sigma \times \Lambda \to O : (p,\lambda) \mapsto \varphi_{\lambda,O}(p)$, giving $\Lambda$ a similar formal status as the set of states $\Sigma$, or as $\Delta\Lambda_O : \Sigma \times \mathcal{B}(O) \to \mathcal{P}(\Lambda) : (p,E) \mapsto \{\lambda \mid \varphi_O(p,\lambda) \in E\}$, where $\mathcal{P}(\Lambda)$ denotes the set of subsets of $\Lambda$. The core of this definition is that, given a state $p \in \Sigma$ and a value $\lambda \in \Lambda$, we have a completely determined outcome $\varphi_O(p,\lambda)$. These pre-probabilistic hidden measurement systems as such encode fully deterministic settings.

Definition 3. Whenever, for a given pre-probabilistic hidden measurement system $(\Sigma, \mathcal{E}(\mathcal{O},\Lambda), \{\varphi_O\}_{O \in \mathcal{O}})$, there exists a $\sigma$-field $\mathcal{B}(\Lambda)$ of $\Lambda$-subsets that satisfies $\bigcup_{O \in \mathcal{O}} \{\Delta\Lambda_O(p,E) \mid (p,E) \in \Sigma \times \mathcal{B}(O)\} \subseteq \mathcal{B}(\Lambda)$, it defines a probabilistic hidden measurement system if a probability measure $\mu : \mathcal{B}(\Lambda) \to [0,1]$ is also specified.

The condition on $\Delta\Lambda$ requires that all $\Delta\Lambda_O(p,E)$ are $\mathcal{B}(\Lambda)$-measurable, such that to all triples $(p,O,E)$ we can assign a value $P_{p,O}(E) := \mu(\Delta\Lambda_O(p,E)) \in [0,1]$. As such, any probabilistic hidden measurement system defines a measurement system. The question then arises whether every probabilistic measurement system (MS) can be encoded as a probabilistic hidden measurement system (HMS). The answer to this question is yes [8, §4.2, Theorem 1], [2,3]. There always exists a canonical HMS-representation for $\Lambda = [0,1]$, $\mathcal{B}(\Lambda) = \mathcal{B}([0,1])$ (i.e. the Borel sets in $[0,1]$) and $\mu([0,a]) = a$, i.e. uniformly distributed; the proof goes via a construction using the Loomis-Sikorski Theorem [17,20] and Marczewski's Lemma [13]. It therefore makes sense to investigate how the different possible HMS-representations for different non-isomorphic pairs $(\mathcal{B}(\Lambda), \mu)$ are structured; below it will become clear what we mean here by non-isomorphic.

First we will discuss an example that illustrates the above; it traces back to [1], and details and illustrations can be found in [2,8]. Consider the states of a spin-1/2 entity encoded as a point on the Poincare sphere $\Sigma_0 \subset \mathbb{R}^3$. Then any pair of antipodally located points of $\Sigma_0$ encodes mutually orthogonal states, as such encodes mutually orthogonal one-dimensional projectors, and thus a (dichotomic) measurement. Let $p \in \Sigma_0$, let $(\alpha, \neg\alpha)$ be a pair of mutually orthogonal points of $\Sigma_0$, and let $\Lambda$ be the diagonal connecting $\alpha$ and $\neg\alpha$. Let $x_p \in \Lambda$ be the orthogonal projection of $p$ on the diagonal $\Lambda$. Then for $\lambda \in [x_p, \neg\alpha]$, i.e. $x_p \in [\alpha, \lambda]$, we set $\varphi(p,\lambda) = \alpha$, and for $\lambda \in [\alpha, x_p[$, i.e. $x_p \in\, ]\lambda, \neg\alpha]$, we set $\varphi(p,\lambda) = \neg\alpha$. One then verifies that for $\mu : \mathcal{B}([\alpha,\neg\alpha]) \to [0,1] : [\alpha, (1-x)\alpha + x\neg\alpha] \mapsto x$, i.e. uniformly distributed, we obtain exactly the probability structure for spin-1/2 in quantum theory. [a] An interpretational proposal of this model could be the following [1,2,3]: rather than decomposing states as in so-called hidden variable theories, here we decompose the measurements in deterministic ones; the probability measure $\mu$ should then be envisioned as encoding the lack of knowledge on the interaction of the measured system with its environment, including the measurement device.
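The spin-1/2 example can be imitated numerically. The Python sketch below is an illustration under stated assumptions, not code from the cited references: the diagonal from $\alpha$ to $\neg\alpha$ is parameterized by a coordinate in $[0,1]$ (0 at $\alpha$, 1 at $\neg\alpha$), the state p is specified by its angle `theta` to $\alpha$ on a unit sphere, and the hidden variable $\lambda$ is drawn with a seeded uniform generator. All function names are hypothetical.

```python
import math
import random


def projection_coordinate(theta):
    """Coordinate in [0, 1] of x_p on the diagonal (0 at alpha, 1 at not-alpha).

    theta is the angle on the unit sphere between the state p and alpha.
    """
    return (1.0 - math.cos(theta)) / 2.0


def outcome(theta, lam):
    """Deterministic outcome for state angle theta and hidden variable lam in [0, 1].

    lam in [x_p, not-alpha] gives alpha; lam in [alpha, x_p[ gives not-alpha.
    """
    return "alpha" if lam >= projection_coordinate(theta) else "not_alpha"


def frequency_alpha(theta, n_trials, seed=0):
    """Relative frequency of outcome alpha when lam is uniformly distributed."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials)
               if outcome(theta, rng.random()) == "alpha")
    return hits / n_trials
```

Given `theta` and `lam` the outcome is fully determined; the probability arises only from the uniform measure on the hidden variable. The measure of the interval yielding `alpha` is $(1 + \cos\theta)/2 = \cos^2(\theta/2)$, which is the spin-1/2 quantum probability, and the simulated frequency approaches this value.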

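We now introduce a notion of relative size of HMS-representations, justifying the use of "smaller". Given a σ-algebra (see footnote b) $\mathcal{B}$ and a probability measure $\mu : \mathcal{B} \to [0,1]$, denote by $\mathcal{B}/\mu$ the σ-algebra of equivalence classes $[E]$ with respect to the relation

$$E \sim E' \ \text{ iff } \ \mu(E \cap E'^c) = \mu(E' \cap E^c) = 0,$$

i.e. iff $E$ and $E'$ coincide up to a symmetric difference of measure zero. The ordering of $\mathcal{B}/\mu$ is inherited from $\mathcal{B}$. For notational convenience denote the induced measure $\mathcal{B}/\mu \to [0,1] : [E] \mapsto \mu(E)$ again by $\mu$. Given two pairs $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ consisting of separable σ-algebras and probability measures on them, set

$$(\mathcal{B}, \mu) \le (\mathcal{B}', \mu') \ \Leftrightarrow \ \exists f : \mathcal{B}/\mu \to \mathcal{B}'/\mu' \ \text{an injective σ-morphism.}$$

We call $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ equivalent, denoted $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$, whenever $f$ in the above is a σ-isomorphism. Given two MS $(\Sigma, \mathcal{E})$ and $(\Sigma', \mathcal{E}')$ we set

$$(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}') \ \Leftrightarrow \ \begin{cases} \exists s : \Sigma \to \Sigma', \ \exists t : \mathcal{E} \to \mathcal{E}', \ \text{both bijections;} \\ \forall e \in \mathcal{E}, \ \exists f_e : \mathcal{B}(O_e) \to \mathcal{B}(O_{t(e)}), \ \text{a σ-isomorphism;} \\ \forall p \in \Sigma, \ \forall e \in \mathcal{E} : \ P_{s(p),t(e)} \circ f_e = P_{p,e}. \end{cases}$$

Via this equivalence relation we can define a relation $\le_{MS}$ between classes of measurement systems $M$ and $M'$ as $M \le_{MS} M'$ if for all $(\Sigma, \mathcal{E}) \in M$ there exists $(\Sigma', \mathcal{E}') \in M'$ such that $(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}')$, i.e. if $M$ is included in $M'$ up to MS-equivalence. We can then prove the following: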

(i) $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$ if and only if $(\mathcal{B}, \mu) \le (\mathcal{B}', \mu')$ and $(\mathcal{B}', \mu') \le (\mathcal{B}, \mu)$ [8, §3, Lemma 1]; thus the equivalence classes with respect to $\sim$ constitute a partially ordered set (poset) for the ordering induced by $\le$. We will denote the set of these equivalence classes by $\mathbf{M}$; a class in it will be denoted via a member of it as $[\mathcal{B}, \mu]$.

Footnote a. As shown in [6, 9], this deterministic model for spin-1/2 in $\mathbb{R}^3$ can be generalized to $\mathbb{R}^3$-models for arbitrary spin-N/2. The states are then represented in the so-called Majorana representation [18, 5], i.e. as N copies of $\Sigma_0$. Correct probabilistic behavior is then obtained by introducing entanglement between the N different spin-1/2 systems.

Footnote b. I.e. a pointless σ-field. In particular it follows from the Loomis-Sikorski theorem [17, 20] that all separable σ-algebras (i.e. which contain a countable dense subset) can be represented as a σ-field; it as such also follows that assuming that $\mathcal{B}(\Lambda)$ is a σ-field, and not an abstract σ-algebra, imposes no formal restriction.

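(ii) When setting $\mathbf{M}_{HMS} = \{ M[\mathcal{B}(\Lambda), \mu] \mid [\mathcal{B}(\Lambda), \mu] \in \mathbf{M} \}$, where $M[\mathcal{B}(\Lambda), \mu]$ stands for all HMS with $\mathcal{B}(\Lambda')$ and $\mu'$ such that $(\mathcal{B}(\Lambda'), \mu') \in [\mathcal{B}(\Lambda), \mu]$, we have that $(\mathcal{B}(\Lambda), \mu) \le (\mathcal{B}(\Lambda'), \mu')$ and $M[\mathcal{B}(\Lambda), \mu] \le_{MS} M[\mathcal{B}(\Lambda'), \mu']$ are equivalent [8, §3, Theorem 2]. This then results in:

Theorem 1. $(\mathbf{M}, \le)$ and $(\mathbf{M}_{HMS}, \le_{MS})$ are isomorphic posets.

One of the crucial ingredients in (ii) above, and also in the proof for general existence with $\Lambda = [0,1]$, is the following: when setting $\Delta M(\Sigma, \mathcal{E}) = \{ (\mathcal{B}(O_e), P_{p,e}) \mid p \in \Sigma, e \in \mathcal{E} \}$, we obtain that $(\Sigma, \mathcal{E})$ admits a HMS-representation with $\mathcal{B}(\Lambda)$ and $\mu$ if and only if $\Delta M(\Sigma, \mathcal{E}) \le (\mathcal{B}(\Lambda), \mu)$, where the order applies pointwise to the elements of $\Delta M(\Sigma, \mathcal{E})$ [8, §4.2, Theorem 1]. Using this and Theorem 1 above we can now translate properties of $\mathbf{M}$ to propositions on the existence of certain HMS-representations. We obtain the following: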

(i) $(\mathbf{M}, \le)$ is not a join-semilattice; thus, in general, there exists no smallest HMS-representation. As such we will have to refine our study to particular settings where we are able to make statements on whether there exists a smallest one and, if not, whether we can say at least something on the cardinality of $\Lambda$.

(ii) One can prove a number of criteria on $\Delta M(\Sigma, \mathcal{E})$ that force $(\mathcal{B}(\Lambda), \mu)$ to be equivalent to $\mathcal{B}([0,1])$ with the uniform measure, as such assuring existence of a smallest representation. Among these the following: let $M_{finite} = \{ (\mathcal{B}(X), \mu) \in \mathbf{M} \mid X \text{ is finite} \}$. If $M_{finite} \subseteq \Delta M(\Sigma, \mathcal{E})$ then $\Lambda$ cannot be discrete. It then follows, for example, that quantum theory restricted to measurements with a finite number of outcomes still requires $\Lambda = [0,1]$.

(iii) Let $M_N = \{ (\mathcal{B}(X), \mu) \in \mathbf{M} \mid X \text{ has at most } N \text{ elements} \}$. If $\Delta M(\Sigma, \mathcal{E}) \subseteq M_N$ then there exists a HMS-representation with $\Lambda = \mathbb{N}$. Thus quantum theory restricted to those measurements with at most a fixed number N of outcomes has a discrete HMS-representation.

(iv) If $\Delta M(\Sigma, \mathcal{E}) = M_N$ then there exists no smallest HMS-representation. Neither does it exist when fixing the number of outcomes. So there is no essentially unique smallest HMS-representation for N-outcome quantum theory.

Although there exists no smallest, and as such no canonical, discrete HMS-representation, we will give the construction of one solution for dichotomic (or propositional) quantum theory, i.e. N = 2, since this will constitute the core of the model presented in this paper. We will follow [8, 2], to which we also refer for a construction for arbitrary N. Let us denote the quantum mechanical probability to obtain a positive outcome in a measurement of a proposition or question $a$ on a system in state $p$ as $P_p(a)$; the outcome set consists here of "we obtain a positive answer for the question $a$", slightly abusively denoted as $a$ itself, and "we obtain a negative answer for the question $a$", denoted as $\neg a$. Set inductively for $\lambda \in \mathbb{N}$ (see footnote c):

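$$\varphi_a(p, \lambda) = \begin{cases} a & \text{iff } \ P_p(a) > \dfrac{1}{2^\lambda} + \displaystyle\sum_{i=1}^{\lambda-1} \dfrac{1}{2^i}\, \delta(\varphi_a(p, i), a) \\[2mm] \neg a & \text{otherwise} \end{cases}$$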

One verifies that for $\mu(\lambda) = 1/2^\lambda$ we obtain the correct probabilities in the resulting HMS-model. This provides a discrete alternative for the above discussed $\mathbb{R}^3$-model for spin-1/2. The model, including the projection $x_p$, remains the same, although we don't consider $[a, \neg a]$ as $\Lambda$ anymore. Let $\lambda \in \Lambda = \mathbb{N}$. Set $x_n = (1 - \frac{n}{2^\lambda})a + (\frac{n}{2^\lambda})\neg a$ for $n \in \{1, \ldots, 2^\lambda - 1\}$. For $x_p \in [a, x_1[ \,\cup\, [x_2, x_3[ \,\cup\, [x_4, x_5[ \,\cup \ldots \cup\, [x_{2^\lambda - 2}, x_{2^\lambda - 1}[$ we set $\varphi_a(p, \lambda) = a$, and $\varphi_a(p, \lambda) = \neg a$ otherwise. Then for $\mu : \mathcal{B}(\mathbb{N}) \to [0,1] : \lambda \mapsto 1/2^\lambda$ we obtain again quantum probability. Geometrically this means the following: in the first model the values of $\lambda \in \Lambda$ represent points on the diagonal, i.e. a continuous interval, or again equivalently decompositions of an interval in two intervals; now we consider decompositions of an interval in $2^\lambda$ equally long parts, of which there are only a discrete number of possibilities. We refer to [8] for details and illustrations.
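The inductive rule for the discrete model can be made concrete as follows. This is a sketch under an assumption: the inductive condition is read as a strict inequality, $P_p(a) > 1/2^\lambda + \sum_{i<\lambda} \delta(\varphi_a(p,i),a)/2^i$, which is one plausible reading of the damaged formula. The helper names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def discrete_outcomes(prob, depth):
    """Return [b_1, ..., b_depth], where b_lam = 1 encodes phi_a(p, lam) = a.

    Assumed inductive rule: b_lam = 1 iff
    prob > 1/2**lam + sum_{i < lam} b_i / 2**i.
    """
    bits = []
    partial = Fraction(0)          # running sum of b_i / 2**i for i < lam
    for lam in range(1, depth + 1):
        weight = Fraction(1, 2 ** lam)
        bit = 1 if prob > weight + partial else 0
        bits.append(bit)
        partial += bit * weight
    return bits

def model_probability(prob, depth):
    """Probability of outcome a under mu(lam) = 1/2**lam, truncated at depth."""
    bits = discrete_outcomes(prob, depth)
    return sum(Fraction(b, 2 ** lam) for lam, b in enumerate(bits, start=1))
```

With the strict inequality, a dyadic probability such as 5/8 is approached from below by a non-terminating expansion, which is the non-uniqueness of binary decomposition mentioned in the footnote to this construction; the truncated sum differs from the target by at most $2^{-\text{depth}}$.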

3 Unitary, ortho- and projective structure

In the above discussed $\mathbb{R}^3$ models, rotational symmetries were implicit in their spatial geometry. However, in general, the decompositions of measurements over $\mu : \mathcal{B}(\Lambda) \to [0,1]$ go measurement by measurement, so additional structure, if there is any, has to be put in by hand. It is probably fair to say that these contextual models only become non-trivial and useful when encoding physical symmetries within the maps $\varphi_e$ in an appropriate manner. For sake of the argument we will distinguish between three types of symmetries that can be encoded, namely unitary, ortho- and projective ones.

i. Unitary symmetries. When considering quantum measurements with discrete non-degenerate spectrum we can represent the outcomes $o_{e,i}$ by the corresponding eigenstates $p_{e,i}$ via spectral decomposition, i.e. there exists an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\Sigma)$ for each $e \in \mathcal{E}$. Then specification of $\varphi_{e_0} : \Sigma \times \Lambda \to \{p_{e_0,i}\}_i$ and $\mu$ for one measurement $e_0 \in \mathcal{E}$ fixes it for any other $e \in \mathcal{E}$ by symmetry: $\varphi_e = (U \circ \varphi_{e_0} \circ U^{-1}) : \Sigma \times \Lambda \to \{p_{e,i}\}_i$, where $U : \Sigma \to \Sigma$ is the unitary transformation that satisfies $U(p_{e_0,i}) = p_{e,i}$, and $\mu_e = \mu$. This is exactly the symmetry encoded in the above described $\mathbb{R}^3$-models. Note in particular that in this perspective the pairs $(a, \neg a)$ and $(\neg a, \neg(\neg a))$ should not be envisioned as merely a change of names of the outcomes, but truly as putting the measurement device (or at least its detecting part) upside down (see footnote d). In this setting, where we represent outcomes as states, the assignment of an outcome can now be envisioned as a true change of state $f_{e,\lambda} : \Sigma \to \Sigma\ (\supseteq O_e) : p \mapsto \varphi_e(p, \lambda)$, as such allowing to describe the behavior of the system under concatenated measurements.

Footnote c. We agree on $\mathbb{N} = \{1, 2, \ldots\}$. Note here that already by non-uniqueness of binary decomposition, e.g. $\frac{1}{2} = \frac{1}{2^1} = \sum_{i \in \mathbb{N}} \frac{1}{2^{i+1}}$, it follows that the construction below is not canonical. Obviously there are also less pathological differences between the different non-comparable discrete representations [8].

ii. Projective symmetries. For degenerate quantum measurements the outcomes require representation by higher dimensional subspaces, so identification in terms of states now requires an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\mathcal{P}(\Sigma))$. The behavior of states of the system under concatenated measurements then requires specification of a family of projectors $\{\pi_T : \Sigma \to T \mid T \in O_e\}$, e.g. the orthogonal projectors $\pi_A : \Sigma \to A : p \mapsto A \wedge (p \vee A^\perp)$ on the corresponding subspace $A$ in quantum theory. The above discussed non-degenerate case fits also in this picture by setting $O_e \subseteq \{\{p\} \mid p \in \Sigma\}$, where now each $\pi_{\{p\}} : \Sigma \to \{p\}$ is uniquely determined (having a singleton codomain).

iii. Orthosymmetries. The existence of an orthocomplementation on the lattice of closed subspaces of a Hilbert space provides a dichotomic representation for measurements, which can be envisioned as a pair consisting of a (to be verified) proposition $A$ and its negation $\neg A$, in quantum theory yielding $\pi_{\neg A} : \Sigma \to A^\perp : p \mapsto A^\perp \wedge (p \vee A)$. In terms of linear operator calculus we have $\pi_{\neg A} = 1 - \pi_A$, both of them being orthogonal projectors.

4 Representing quantum history theory

Although quantum history theory involves sequential measurements, one of its goals is to remain an essentially dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator approach [14]. The key idea here is that the form of logicality aimed at in [14] represents faithfully in the Hilbert space tensor product (see footnote e). Let $A = (\alpha_{t_i})_i$ be a (so-called homogeneous) quantum history proposition with temporal support $(t_1, t_2, \ldots, t_n)$. Then, rather than representing this as a sequence of subspaces $(A_i)_i$ or projectors $(\pi_i)_i$, we will either represent $A$ as a pure tensor $\otimes_i A_i$ in the lattice of closed subspaces of the tensor product of the corresponding Hilbert spaces, or as the orthogonal projector $\otimes_i \pi_i$ on this subspace. The crucial property of this representation is then that $\neg A$ again encodes as a projector, namely $id - \otimes_i \pi_i$ [14], clarifying the notations $\pi_A$ and $\pi_{\neg A}$. Moreover, if $\{A^i\}_i$ is a set of so-called disjoint history propositions, i.e. $\otimes_k A^i_k \perp \otimes_k A^j_k$ for $i \neq j$, then the history proposition that expresses the disjunction of $\{A^i\}_i$ sensu [14] is exactly encoded as the projector $\sum_i \otimes_k \pi^i_k$. We get as such a kind of logical setting that is still encoded in terms of projectors. Note that $\pi_{\neg A}$ is not of the form $\otimes_i \pi_i$ but of the form $\sum_i \otimes_k \pi^i_k$, breaking the structural symmetry between a proposition and its negation in ordinary quantum theory.

Footnote d. The attentive reader will note that it is at this point that we escape the so-called hidden variable no-go theorems. They arise when trying to impose contextual symmetries within the states of the system, by requiring that values of observables are independent of the chosen context, e.g. the proof of the Kochen-Specker theorem. Our newly introduced variable $\lambda \in \Lambda$ follows contextual manipulations in an obvious manner.

Footnote e. At this point we mention that in the study of sequential phenomena in the axiomatic quantum theory perspective on quantum logic, sequentiality and compoundness both turn out to be specifications of a universal causal duality [10], as such providing a metaphysical perspective on the use of tensor products both for the description of compound physical systems and sequential processes.

We will now transcribe the observations in the two previous sections to this setting in order to provide a contextual deterministic model for quantum history theory with discretely originating probabilities. One could say that we will apply a split picture in terms of Schrödinger-Heisenberg: namely, we assume that on the level of unitary evolution we apply the Heisenberg picture, such that we can fix notation without reference to this evolution, but changes of state due to measurement we will (obviously) express in the state space. When encoding outcomes in terms of states we need to consider n copies of $\Sigma$, encoding the trajectories due to the measurements. In view of the considerations made above it will be no surprise that we will consider these trajectories as of the form $\otimes_i p_i$ in the tensor product $\otimes_i \Sigma_i$. This will require the introduction of the following pseudo-projector:

$$\pi^\otimes : \Sigma \to \otimes_i \Sigma_i : p \mapsto p^\otimes = p \otimes \pi_1(p) \otimes \ldots \otimes (\pi_{n-1} \circ \ldots \circ \pi_1)(p).$$

Setting $\Sigma^\otimes = \pi^\otimes[\Sigma] = \{p^\otimes \mid p \in \Sigma\}$, then $\pi^\otimes : \Sigma \to \Sigma^\otimes$ encodes a bijective representation of $\Sigma$. Noting that $P_p(A) = \langle p^\otimes | \pi_A p^\otimes \rangle$ is the probability given by quantum theory to obtain $A$, we then set inductively for fixed $\lambda \in \mathbb{N}$ that $\varphi_A(p, \lambda) = A$ if and only if

$$\langle p^\otimes | \pi_A p^\otimes \rangle > \frac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \frac{1}{2^i}\, \delta(\varphi_A(p, i), A),$$

and $\varphi_A(p, \lambda) = \neg A$ otherwise. The outcome trajectories in case we obtain $A$ are then given in terms of initial states by $(\pi_A \circ \pi^\otimes) : \Sigma \to \otimes_i A_i$. The value $\lambda \in \mathbb{N}$ can be envisioned as follows. We assume it to be a number of contextual events, either real or virtual depending on one's taste, and we assume that, given that some events already happened, the chance of a next one happening is equal to the chance that it doesn't happen. So we actually consider a finite number of probabilistically balanced consecutive binary decisive processes, where the result of the previous one determines whether we actually will perform the next one. Unitary symmetries are induced in the obvious way as tensored unitary operators $\otimes_i U_i$. This model then produces the statistical behavior of quantum history theory.

The breaking of the structural symmetry between a proposition and its negation manifests itself in the most explicit way, in the sense that when we have a determined outcome $\neg A$ we don't have a determined trajectory in our model. Obviously one could build a fully deterministic model that also determines this by concatenation of individual deterministic models (one for each element in the temporal support), but we feel that this would not be in accordance with the propositional flavor a history theory aims at. The negation $\neg A$ is indeed cognitive and not ontological with respect to the actual executed physical procedure, or in other words the system's context, and one cannot expect an ontological model to encode this in terms of a formal duality. Explicitly, $\neg(A \otimes B)$ can be written both as $(\mathcal{H} \otimes \neg B) \oplus (\neg A \otimes B)$ and $(\neg A \otimes \mathcal{H}) \oplus (A \otimes \neg B)$, which clearly define different procedures with respect to imposed change of state due to the measurement. Even more explicitly, setting $HPO(\{\mathcal{H}_k\}_k) = \{ \oplus_i \otimes_k A^i_k \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j \}$, for $\mathcal{L}(\mathcal{H}_k)$ the lattice of closed subspaces of $\mathcal{H}_k$, the ontologically faithful hull of $HPO(\{\mathcal{H}_k\}_k)$ consists then of all ortho-ideals

$$OI(HPO(\{\mathcal{H}_k\}_k)) = \{ \downarrow[\{\otimes_k A^i_k\}_i] \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j \},$$

where $\downarrow[-]$ assigns to a set of pure tensors all pure tensors in $\otimes_k \mathcal{H}_k$ that are smaller than at least one in the given set, this with respect to the ordering in $\mathcal{L}(\otimes_k \mathcal{H}_k)$; the downset $\downarrow[-]$ construction makes $OI(HPO(\{\mathcal{H}_k\}_k))$ inherit the $\mathcal{L}(\otimes_k \mathcal{H}_k)$-order as intersection. If a particular decomposition is specified as an element of $OI(HPO(\{\mathcal{H}_k\}_k))$, which means full specification of the physical procedure, where summation over different sequences of pure tensors is now envisioned as choice of procedure, we can provide a deterministic contextual model, the choice of procedure itself becoming an additional variable. In conclusion, the HPO-setting loses part of the physical ontology that goes with an operational perspective on quantum theory, and as such, if we want to provide a deterministic representation for general inhomogeneous history propositions sensu the one we obtained for the homogeneous ones, we formally need to restore this part of the physical ontology, e.g. as $OI(HPO(\{\mathcal{H}_k\}_k))$.

5 Further discussion

In this paper we didn't provide an answer and we didn't even pose a question. We just provided a new way to think about things, slightly confronting the usual consistency or decoherence perspective for history theories. Even if one does not subscribe to the underlying deterministic nature of the model, it still exhibits what a minimal representation of the indeterministic ingredients can be, as such representing it in a more tangible way. With respect to the non-existence of a smallest representation: in view of other physical considerations it could be that one of the constructible discrete models presents itself as the truly canonical one, e.g. from equilibrium or other thermodynamical considerations, or from metastatistical ones emerging from additional modelization.

Footnote f. A choice that is motivated by the traditional consistent history setting and its interpretation, as well as by a particular semantical perspective on quantum logic as a whole.

Acknowledgments

We thank Chris Isham for useful discussions on the content of this paper

References

1. D. Aerts, J. Math. Phys. 27, 202 (1986).
2. D. Aerts, Int. J. Theor. Phys. 32, 2207 (1993).
3. D. Aerts, Found. Phys. 24, 1227 (1994).
4. G.K. Au, Interview with A. Ashtekar, C.J. Isham and E. Witten, The Quest for Quantum Gravity, arXiv: gr-qc/9506001 (1995).
5. H. Bacry, J. Math. Phys. 15, 1686 (1974).
6. B. Coecke, Helv. Phys. Acta 68, 396 (1995).
7. B. Coecke, Found. Phys. Lett. 8, 437 (1995).
8. B. Coecke, Helv. Phys. Acta 70, 442, 462 (1997); arXiv: quant-ph/0008061 & 0008062; Tatra Mt. Math. Publ. 10, 63.
9. B. Coecke, Found. Phys. 28, 1347 (1998).
10. B. Coecke et al., Found. Phys. Lett. 14 (2001); arXiv: quant-ph/0009100.
11. N. Gisin and C. Piron, Lett. Math. Phys. 5, 379 (1981).
12. S. Gudder, J. Math. Phys. 11, 431 (1970).
13. A. Horn and H. Tarski, Trans. AMS 64, 467 (1948).
14. C.J. Isham, J. Math. Phys. 23, 2157 (1994).
15. C.J. Isham, Structural Issues in Quantum Gravity, in General Relativity and Gravitation: GR14, pp. 167 (World Scientific, Singapore, 1997).
16. C.J. Isham and J. Butterfield, Found. Phys. 30, 1707 (2000).
17. L. Loomis, Bull. AMS 53, 757 (1947).
18. E. Majorana, Nuovo Cimento 9, 43 (1932).
19. C. Rovelli, Strings, Loops and Others: A Critical Survey of the Present Approaches to Quantum Gravity, Plenary Lecture at GR15, Poona, India (1998); arXiv: gr-qc/9803024.
20. R. Sikorski, Fund. Math. 35, 247 (1948).


INTERPRETATIONS OF QUANTUM MECHANICS AND INTERPRETATIONS OF VIOLATION OF BELL'S INEQUALITY

WILLEM M. DE MUYNCK
Theoretical Physics, Eindhoven University of Technology,
POB 513, 5600 MB Eindhoven, the Netherlands
E-mail: W-MdMuyncktuenl

The discussion of the foundations of quantum mechanics is complicated by the fact that a number of different issues are closely entangled. Three of these issues are i) the interpretation of probability, ii) the choice between realist and empiricist interpretations of the mathematical formalism of quantum mechanics, iii) the distinction between measurement and preparation. It will be demonstrated that an interpretation of violation of Bell's inequality by quantum mechanics as evidence of non-locality of the quantum world is a consequence of a particular choice between these alternatives. Also, a distinction must be drawn between two forms of realism, viz. a) realist interpretations of quantum mechanics, b) the possibility of hidden-variables (sub-quantum) theories.

1 Realist and empiricist interpretations of quantum mechanics

In realist interpretations of the mathematical formalism of quantum mechanics, state vector and observable are thought to refer to the microscopic object in the usual way presented in most textbooks. Although, of course, preparing and measuring instruments are often present, these are not taken into account in the mathematical description (unless, as in the theory of measurement, the subject is the interaction between object and measuring instrument).

In an empiricist interpretation quantum mechanics is thought to describe relations between input and output of a measurement process. A state vector is just a label of a preparation procedure; an observable is a label of a measuring instrument. In an empiricist interpretation quantum mechanics is not thought to describe the microscopic object. This, of course, does not imply that this object would not exist; it only means that it is not described by quantum mechanics. Explanation of relations between input and output of a measurement process should be provided by another theory, e.g. a hidden-variables (sub-quantum) theory. This is analogous to the way the theory of rigid bodies describes the empirical behavior of a billiard ball, or to the description by thermodynamics of the thermodynamic properties of a volume of gas, explanations being relegated to theories describing the microscopic (atomic) properties of the systems.

Although a term like "observable" (rather than "physical quantity") is evidence of the empiricist origin of quantum mechanics (compare Heisenberg [1]), there has always existed a strong tendency toward a realist interpretation in which observables are considered as properties of the microscopic object, more or less analogous to classical ones. Likewise, many physicists tend to think about electrons as wave packets flying around in space, without bothering too much about the "Unanschaulichkeit" that for Schrödinger [2] was such a problematic feature of quantum theory. Without entering into a detailed discussion of the relative merits of either of these interpretations (e.g. de Muynck [3]), it is noted here that an empiricist interpretation is in agreement with the operational way theory and experiment are compared in the laboratory. Moreover, it is free of paradoxes, which have their origin in a realist interpretation. As will be seen in the next section, the difference between realist and empiricist interpretations is highly relevant when dealing with the EPR problem.

2 EPR experiments and Bell experiments

In figure 1 the experiment is depicted that was proposed by Einstein, Podolsky and Rosen [4] to study (in)completeness of quantum mechanics. A pair of particles (1 and 2) is prepared in an entangled state and allowed to separate. A measurement is performed on particle 1. It is essential to the EPR reasoning that particle 2 does not interact with any measuring instrument, thus allowing to consider so-called "elements of physical reality" of this particle, that can be considered as objective properties, being attributable to particle 2 independently of what happens to particle 1. By EPR this arrangement was presented as a way to perform a measurement on particle 2 without in any way disturbing this particle.

Figure 1: EPR experiment (measuring instrument for Q or P).

The EPR experiment should be compared to correlation measurements of the type performed by Aspect et al. [5, 6] to test Bell's inequality (cf. figure 2). In these latter experiments also particle 2 is interacting with a measuring instrument. In the literature these experiments are often referred to as EPR experiments too, thus neglecting the fundamental difference between the two measurement arrangements of figures 1 and 2. This negligence has been responsible for quite a bit of confusion, and should preferably be avoided by referring to the latter experiments as Bell experiments rather than EPR ones. In EPR experiments particle 2 is not subject to a measurement, but to a (conditional) preparation (conditional on the measurement result obtained for particle 1). This is especially clear in an empiricist interpretation, because here measurement results cannot exist unless a measuring instrument is present, its pointer positions corresponding to the measurement results.

Figure 2: Bell experiment.

Unfortunately, the EPR experiment of figure 1 was presented by EPR as a measurement performed on particle 2, and accepted by Bohr as such. That this could happen is a consequence of the fact that both Einstein and Bohr entertained a realist interpretation of quantum mechanical observables (note that they differed with respect to the interpretation of the state vector), the only difference being that Einstein's realist interpretation was an objectivistic one (in which observables are considered as properties of the object, possessed independently of any measurement: the EPR "elements of physical reality"), whereas Bohr's was a contextualistic realism (in which observables are only well-defined within the context of the measurement). Note that in Bell experiments the EPR reasoning would break down because, due to the interaction of particle 2 with its measuring instrument, there cannot exist "elements of physical reality".

Much confusion could have been avoided if Bohr had maintained his interactional view of measurement. However, by accepting the EPR experiment as a measurement of particle 2 he had to weaken his interpretation to a relational one (e.g. Popper [7], Jammer [8]), allowing the observable of particle 2 to be co-determined by the measurement context for particle 1. This introduced for the first time non-locality in the interpretation of quantum mechanics. But this could easily have been avoided if Bohr had required that for a measurement of particle 2 a measuring instrument should be actually interacting with this very particle, with the result that an observable of particle n (n = 1, 2) can be co-determined in a local way by the measurement context of that particle only. This, incidentally, would have made the EPR "elements of physical reality" completely obsolete, and would have been quite a bit less confusing than the answer Bohr [9] actually gave (to the effect that the definition of the EPR "element of physical reality" would be ambiguous because of the fact that it did not take into account the measurement arrangement for the other particle), thus promoting the non-locality idea.

Summarizing, the idea of EPR non-locality is a consequence of i) a neglect of the difference between EPR and Bell experiments (equating "elements of physical reality" to measurement results), ii) a realist interpretation of quantum mechanics (considering measurement results as properties of the microscopic object, i.e. particle 2). In an empiricist interpretation there is no reason to assume any non-locality.

It is often asserted that non-locality is proven by the Aspect experiments because these are violating Bell's inequality. The reason for such an assertion is that it is thought that non-locality is a necessary condition for a derivation of Bell's inequality. However, as will be demonstrated in the following, this cannot be correct, since this inequality can be derived from quite different assumptions. Also, experiments like the Aspect ones, although violating Bell's inequality, do not exhibit any trace of non-locality, because their measurement results are completely consistent with the postulate of local commutativity, implying that relative frequencies of measurement results are independent of which measurements are performed in causally disconnected regions. Admittedly, this does not logically exclude a certain non-locality at the individual level, being unobservable at the statistical level of quantum mechanical probability distributions. However, from a physical point of view a "peaceful coexistence" between locality at the (physically relevant) statistical level and non-locality at the individual level is extremely implausible. Unobservability of the latter would require a kind of conspiracy not unlike the one making the 19th-century world aether unobservable. For this reason the non-locality explanation of the experimental violation of Bell's inequality does not seem to be very plausible, and it does seem wise to look for alternative explanations.
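The statement that relative frequencies on one side do not depend on the measurement chosen on the other side can be illustrated with a small calculation. The sketch below assumes, for illustration only, the textbook singlet-state joint probabilities $p(a, b) = (1 - ab\cos(\theta_1 - \theta_2))/4$ for outcomes $a, b = \pm 1$; this specific state and the function names are assumptions, not taken from the text.

```python
import math

def joint_probability(a, b, angle_1, angle_2):
    """Assumed textbook singlet-state probability of outcomes a, b in {+1, -1}
    for measurement directions angle_1 (particle 1) and angle_2 (particle 2)."""
    return (1 - a * b * math.cos(angle_1 - angle_2)) / 4

def marginal_1(a, angle_1, angle_2):
    """Probability of outcome a for particle 1, summed over particle 2's outcomes."""
    return sum(joint_probability(a, b, angle_1, angle_2) for b in (+1, -1))

# The marginal for particle 1 is the same whatever angle is chosen for particle 2.
marginals = [marginal_1(+1, 0.3, angle_2) for angle_2 in (0.0, 0.7, 1.9, 3.0)]
```

In this example the marginal equals 1/2 for every choice of `angle_2`, so no statistics collected on particle 1 alone can reveal which measurement was performed on particle 2.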

Since non-locality is never the only assumption in deriving Bell's inequality, such alternative explanations do exist. Thus, Einstein's assumption of the existence of "elements of physical reality" is such an additional assumption. More generally, in Bell's derivation [10] the existence of hidden variables is one. Is it still possible to derive Bell's inequality if these assumptions are abolished? Moreover, even assuming the possibility of hidden-variables theories, are there in Bell's derivation no hidden assumptions additional to the locality assumption?

Bell's inequality refers to a set of four quantum mechanical observables, $A_1$, $B_1$, $A_2$ and $B_2$, observables with different/identical indices being compatible/incompatible. In the Aspect experiments measurements of the four possible compatible pairs are performed (in these experiments $A_n$ and $B_n$ refer to polarization observables of photon n, n = 1, 2, respectively). Bell's inequality can typically be derived for the stochastic quantities of a classical Kolmogorovian probability theory. Hence, violation of Bell's inequality is an indication that observables $A_1$, $B_1$, $A_2$ and $B_2$ are not stochastic quantities in the sense of Kolmogorov's probability theory. In particular, there cannot exist a quadrivariate joint probability distribution of these four observables. Such a non-existence is a consequence of the incompatibility of certain of the observables. Since incompatibility is a local affair, this is another reason to doubt the non-locality explanation of the violation of Bell's inequality.

In the following, derivations of Bell's inequality will be scrutinized to see whether the non-locality assumption is as crucial as was assumed by Bell. In doing so it is necessary to distinguish derivations in quantum mechanics from derivations in hidden-variables theories.

3 Bell's inequality in quantum mechanics

For dichotomic observables, having values ±1, Bell's inequality is given according to

$$\langle A_1 A_2 \rangle - \langle A_1 B_2 \rangle - \langle B_1 B_2 \rangle - \langle B_1 A_2 \rangle \le 2. \quad (1)$$

A more general inequality, being valid for arbitrary values of the observables, is the BCHS inequality

$$-1 \le p(b_1, a_2) + p(b_1, b_2) + p(a_1, b_2) - p(a_1, a_2) - p(b_1) - p(b_2) \le 0, \quad (2)$$

from which (1) can be derived for the dichotomic case. Because of its independence of the values of the observables, inequality (2) is preferable by far over inequality (1). Bell's inequality may be violated if some of the observables are incompatible: $[A_1, B_1]_- \neq O$, $[A_2, B_2]_- \neq O$.
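To see how inequality (1) can be violated, one can insert the quantum mechanical expectation values for a concrete entangled state. The sketch below assumes the textbook spin-1/2 singlet correlation $\langle XY \rangle = -\cos(\theta_X - \theta_Y)$ for coplanar measurement directions; the state, the angles and the function names are illustrative assumptions (for the photon polarization observables of the Aspect experiments the angular dependence has a different form).

```python
import math

def singlet_correlation(angle_1, angle_2):
    """Assumed textbook singlet correlation of two +/-1-valued spin measurements
    along coplanar directions angle_1 and angle_2."""
    return -math.cos(angle_1 - angle_2)

def bell_expression(a1, b1, a2, b2):
    """<A1 A2> - <A1 B2> - <B1 B2> - <B1 A2> for the four measurement angles."""
    E = singlet_correlation
    return E(a1, a2) - E(a1, b2) - E(b1, b2) - E(b1, a2)

# Hypothetical angle choice for which the expression reaches 2 * sqrt(2).
value = bell_expression(a1=0.0, b1=math.pi / 2, a2=3 * math.pi / 4, b2=math.pi / 4)
```

With these angles each of the four terms contributes $\sqrt{2}/2$, so the left-hand side of (1) equals $2\sqrt{2} > 2$.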

I shall now discuss two derivations of Bell's inequality which can be formulated within the quantum mechanical formalism, and which do not rely on the existence of hidden variables. The first one relies on a "possessed values" principle, stating that

possessed values principle: values of quantum mechanical observables may be attributed to the object as objective properties, possessed by the object independent of observation.

The "possessed values" principle can be seen as an expression of the objectivistic-realist interpretation of the quantum mechanical formalism preferred by Einstein (compare the EPR "elements of physical reality"). The important point is that by this principle well-defined values are simultaneously attributed to incompatible observables. If $a_i^{(n)}, b_j^{(n)} = \pm 1$ are the values of $A_i$ and $B_j$ for the nth of a sequence of N particle pairs, then we have

$$-2 \le a_1^{(n)} a_2^{(n)} - a_1^{(n)} b_2^{(n)} - b_1^{(n)} b_2^{(n)} - b_1^{(n)} a_2^{(n)} \le 2,$$

from which it directly follows that the quantities

$$\langle A_1 A_2 \rangle = \frac{1}{N} \sum_{n=1}^{N} a_1^{(n)} a_2^{(n)}, \ \text{etc.}$$

must satisfy Bell's inequality (1) (a similar derivation has first been given by Stapp [11], although starting from quite a different interpretation). The essential point in the derivation is the assumption of the existence of a quadruple of values $(a_1, b_1, a_2, b_2)$ for each of the particle pairs.
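The bound used in this derivation can be verified by brute force, since there are only sixteen possible quadruples of values ±1. The following sketch (function names are hypothetical) enumerates them and also averages the combination over an arbitrary finite sequence of quadruples.

```python
from itertools import product

def bell_combination(a1, b1, a2, b2):
    """a1*a2 - a1*b2 - b1*b2 - b1*a2 for one quadruple of possessed values."""
    return a1 * a2 - a1 * b2 - b1 * b2 - b1 * a2

# All values the combination can take over the 16 quadruples of +/-1.
values = {bell_combination(*q) for q in product((-1, 1), repeat=4)}

def averaged_combination(quadruples):
    """Average over a sequence of N quadruples, i.e. the combination
    <A1 A2> - <A1 B2> - <B1 B2> - <B1 A2> built from relative frequencies."""
    return sum(bell_combination(*q) for q in quadruples) / len(quadruples)
```

Writing the combination as $a_1(a_2 - b_2) - b_1(a_2 + b_2)$ shows why: one of the two brackets always vanishes and the other equals ±2, so every quadruple gives exactly ±2 and any average lies between -2 and 2.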

From the experimental violation of Bell's inequality it follows that an objectivistic-realist interpretation of the quantum mechanical formalism, encompassing the "possessed values" principle, is impossible. Violation of Bell's inequality entails failure of the "possessed values" principle (no quadruples available). In view of the important role measurement is playing in the interpretation of quantum mechanics this is hardly surprising. As is well-known, due to the incompatibility of some of the observables the existence of a quadruple of values can only be attained on the basis of doubtful counterfactual reasoning. If a realist interpretation is feasible at all, it seems to have to be a contextualistic one, in which the values of observables are co-determined by the measurement arrangement. In the case of Bell experiments non-locality does not seem to be involved.

As a second possibility to derive Bell's inequality within quantum mechanics we should consider derivations of the BCHS inequality (2) from the existence of a quadrivariate probability distribution $p(a_1, b_1, a_2, b_2)$ by Fine [12] and Rastall [13] (also de Muynck [14]). Hence, from violation of Bell's inequality the non-existence of a quadrivariate joint probability distribution follows. In view of the fact that incompatible observables are involved, this, once again, is hardly surprising.
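That the existence of a quadrivariate distribution implies the BCHS inequality can be checked numerically. The sketch below is an illustration under simplifying assumptions: each observable is reduced to a yes/no event (1 = the chosen value is obtained), the joint distribution is drawn at random, and all names are hypothetical.

```python
import random
from itertools import product

A1, B1, A2, B2 = 0, 1, 2, 3  # positions of the observables within a quadruple

def random_joint_distribution(rng):
    """A random probability distribution p(a1, b1, a2, b2) on {0, 1}^4."""
    weights = [rng.random() for _ in range(16)]
    total = sum(weights)
    return {q: w / total for q, w in zip(product((0, 1), repeat=4), weights)}

def prob_yes(joint, *positions):
    """Marginal probability that all observables at the given positions give 'yes'."""
    return sum(p for q, p in joint.items() if all(q[i] == 1 for i in positions))

def bchs_expression(joint):
    """p(b1,a2) + p(b1,b2) + p(a1,b2) - p(a1,a2) - p(b1) - p(b2)."""
    return (prob_yes(joint, B1, A2) + prob_yes(joint, B1, B2)
            + prob_yes(joint, A1, B2) - prob_yes(joint, A1, A2)
            - prob_yes(joint, B1) - prob_yes(joint, B2))

rng = random.Random(1)
results = [bchs_expression(random_joint_distribution(rng)) for _ in range(1000)]
```

Every value in `results` lies between -1 and 0, because the expression takes only the values -1 or 0 on each individual quadruple and the marginals are averages over quadruples.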

A priori there are two possible reasons for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$. First, it is possible that the limit $\lim_{N \to \infty} N(a_1, b_1, a_2, b_2)/N$ of the relative frequencies of quadruples of measurement results does not exist. Since, however, Bell's inequality already follows from the existence of relative frequency $N(a_1, b_1, a_2, b_2)/N$ with finite N, and the limit $N \to \infty$ is never involved in any experimental implementation, this answer does not seem to be sufficient. Therefore the reason for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$ can only be the non-existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$. This seems to reduce the present case to the previous one: Bell's inequality can be violated because quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$ do not exist.

Could non-locality explain the non-existence of quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$? Indeed it could. If the value of $A_1$, say, is co-determined by the measurement arrangement of particle 2, then non-locality could entail

$$a_1(A_2) \neq a_1(B_2) \qquad (3)$$

thus preventing the existence of one single value of observable $A_1$ for the two Aspect experiments involving this observable. This precisely is the non-locality explanation referred to above. This explanation is close to Bohr's ambiguity answer to EPR referred to in section 2, stating that the definition of an element of physical reality of observable $A_1$ must depend on the measurement context of particle 2.

As will be demonstrated next, there is a more plausible local explanation, however, based on the inequality

$$a_1(A_1) \neq a_1(B_1) \qquad (4)$$

expressing that the value of $A_1$, say, will depend on whether either $A_1$ or $B_1$ is measured. Inequality (34) could be seen as an implementation of Heisenberg's disturbance theory of measurement, to the effect that observables incompatible with the actually measured one are disturbed by the measurement. That such an effect is really occurring in the Aspect experiments can be seen from the generalized Aspect experiment depicted in figure 3. This experiment should be compared with the Aspect switching experiment, in which the switches have been replaced by two semi-transparent mirrors (transmissivities $\gamma_1$ and $\gamma_2$, respectively). The four Aspect experiments are special cases of the generalized one, having $\gamma_n = 0$ or $1$, $n = 1, 2$.

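Restricting for a moment to one side of the interferometer, it is possible to calculate the joint detection probabilities of the two detectors according to

$$\left(p_{\gamma_1}(a_{1i}, b_{1j})\right) = \begin{pmatrix} 0 & \gamma_1 \langle E^{(1)}_+ \rangle \\ (1-\gamma_1) \langle F^{(1)}_+ \rangle & 1 - \gamma_1 \langle E^{(1)}_+ \rangle - (1-\gamma_1) \langle F^{(1)}_+ \rangle \end{pmatrix} \qquad (5)$$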

in which $\{E^{(1)}_+, E^{(1)}_-\}$ and $\{F^{(1)}_+, F^{(1)}_-\}$ are the spectral representations of the two polarization observables ($A_1$ and $B_1$) in directions $\theta_1$ and $\theta_1'$, respectively. The values $a_{1i} = +, -$ and $b_{1j} = +, -$ correspond to yes/no registration of a photon by the detector. $p_{\gamma_1}(+, +) = 0$ means that, as in the switching experiment, only one of the detectors can register photon 1. There is, however, a fundamental difference with the switching experiment, because in the latter experiment the photon wave packet is sent either toward one detector or toward the other, whereas in the present one it is split so as to interact coherently with both detectors. This makes it possible to interpret the right hand part of the generalized experiment of figure 3 as a joint non-ideal measurement of the incompatible polarization observables in directions $\theta_1$ and $\theta_1'$ (e.g. de Muynck et al. [15]), the joint probability distribution of the observables being given by (5).

Figure 3. Generalized Aspect experiment.

It is not possible to extensively discuss here the relevance of experiments of the generalized type for understanding Heisenberg's disturbance theory of measurement and its relation to the Heisenberg uncertainty relations (see e.g. de Muynck [16]). The important point is that such experiments do not fit into the standard (Dirac-von Neumann) formalism, in which a probability is an expectation value of a projection operator. Indeed, from (5) it follows that $p_{\gamma_1}(a_{1i}, b_{1j}) = \mathrm{Tr}\,\rho R^{(1)}_{ij}$, yielding operators $R^{(1)}_{ij}$ according to

$$\left(R^{(1)}_{ij}\right) = \begin{pmatrix} O & \gamma_1 E^{(1)}_+ \\ (1-\gamma_1) F^{(1)}_+ & \gamma_1 E^{(1)}_- + (1-\gamma_1) F^{(1)}_- \end{pmatrix} \qquad (6)$$

The set of operators $R^{(1)}_{ij}$ constitutes a so-called positive operator-valued measure (POVM). Only generalized measurements corresponding to POVMs are able to describe joint non-ideal measurements of incompatible observables. By calculating the marginals of probability distribution $p_{\gamma_1}(a_{1i}, b_{1j})$ it is possible to see that for each value of $\gamma_1$ information is obtained on both polarization observables, be it that information on polarization in direction $\theta_1$ gets more non-ideal as $\gamma_1$ decreases, while information on polarization in direction $\theta_1'$ is getting more ideal. This is in perfect agreement with the idea of mutual disturbance in a joint measurement of incompatible observables. The explanation of the non-existence of a single measurement result for observable $A_1$, say, as implied by inequality (34), is corroborated by this analysis.
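The POVM properties and the behaviour of the marginals can be verified with a small numerical sketch. It assumes that the four operators have the form $R_{++} = O$, $R_{+-} = \gamma_1 E_+$, $R_{-+} = (1-\gamma_1) F_+$, $R_{--} = \gamma_1 E_- + (1-\gamma_1) F_-$, and it models the polarization observables by real rank-one projectors in a two-dimensional space. The function names and the chosen angles are illustrative assumptions.

```python
import numpy as np


def polarization_projectors(theta):
    """Projectors onto linear polarization along theta and orthogonal to it."""
    v = np.array([np.cos(theta), np.sin(theta)])
    plus = np.outer(v, v)
    return plus, np.eye(2) - plus


def joint_povm(gamma, theta, theta_prime):
    """Operators R[i][j] with index 0 -> '+', 1 -> '-' (assumed form)."""
    e_plus, e_minus = polarization_projectors(theta)
    f_plus, f_minus = polarization_projectors(theta_prime)
    zero = np.zeros((2, 2))
    return [[zero, gamma * e_plus],
            [(1 - gamma) * f_plus, gamma * e_minus + (1 - gamma) * f_minus]]


def marginals(povm):
    """Marginal POVMs obtained by summing over the other detector's index."""
    first = [povm[i][0] + povm[i][1] for i in range(2)]   # sum over j
    second = [povm[0][j] + povm[1][j] for j in range(2)]  # sum over i
    return first, second


gamma = 0.3
R = joint_povm(gamma, theta=0.0, theta_prime=np.pi / 8)
total = sum(R[i][j] for i in range(2) for j in range(2))
first, second = marginals(R)
print(np.allclose(total, np.eye(2)))
```

In this sketch the first marginal is $\{\gamma_1 E_+,\ I - \gamma_1 E_+\}$ and the second is $\{(1-\gamma_1) F_+,\ I - (1-\gamma_1) F_+\}$, so lowering $\gamma_1$ degrades the information on the $\theta_1$ polarization while improving that on the $\theta_1'$ polarization.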


The analysis can easily be extended to the joint detection probabilities of the whole experiment of figure 3. The joint detection probability distribution of all four detectors is given by the expectation value of a quadrivariate POVM $R_{ijkl}$ according to

$$p_{\gamma_1 \gamma_2}(a_{1i}, b_{1j}, a_{2k}, b_{2l}) = \mathrm{Tr}\,\rho R_{ijkl}. \qquad (7)$$

This POVM can be expressed in terms of the POVMs of the left and right interferometer arms according to

$$R_{ijkl} = R^{(1)}_{ij} R^{(2)}_{kl}. \qquad (8)$$

It is important to note that the existence of the quadrivariate joint probability distribution (7), and the consequent satisfaction of Bell's inequality, is a consequence of the existence of quadruples of measurement results, available because it is possible to determine for each individual particle pair what is the result of each of the four detectors. Although, because of (35), also locality is assumed, this does not play an essential role. Under the condition that a quadruple of measurement results exists for each individual photon pair, Bell's inequality would be satisfied also if, due to non-local interaction, $R_{ijkl}$ were not a product of operators of the two arms of the interferometer. The reason why the standard Aspect experiments do not satisfy Bell's inequality is the non-existence of a quadrivariate joint probability distribution yielding the bivariate probabilities of these experiments as marginals. Such a non-existence is strongly suggested by Heisenberg's idea of mutual disturbance in a joint measurement of incompatible observables. This is corroborated by the easily verifiable fact that the quadrivariate joint probability distributions of the standard Aspect experiments, obtained from (7) and (35) by taking $\gamma_n$ to be either 1 or 0, are all distinct. Moreover, in general the quadrivariate joint probability distribution (7) for one standard Aspect experiment does not yield the bivariate ones of the other experiments as marginals. Although it is not strictly excluded that a quadrivariate joint probability distribution might exist having the bivariate probabilities of the standard Aspect experiments as marginals (hence different from the ones referred to above), the mathematical formalism of quantum mechanics does not give any reason to surmise its existence. As far as quantum mechanics is concerned, the standard Aspect experiments need not satisfy Bell's inequality.
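The claim that the quadrivariate distributions of the standard experiments are all distinct is easy to check in a toy calculation. The sketch below assumes a product POVM built from the arm operators of the generalized experiment, an illustrative maximally entangled polarization state $(|xx\rangle + |yy\rangle)/\sqrt{2}$, and illustrative polarizer angles; none of these specific choices is taken from the text.

```python
import numpy as np


def projectors(theta):
    """Projectors onto linear polarization along theta and orthogonal to it."""
    v = np.array([np.cos(theta), np.sin(theta)])
    plus = np.outer(v, v)
    return plus, np.eye(2) - plus


def arm_povm(gamma, theta, theta_prime):
    """Joint POVM of one interferometer arm (index 0 -> '+', 1 -> '-')."""
    e_plus, e_minus = projectors(theta)
    f_plus, f_minus = projectors(theta_prime)
    return [[np.zeros((2, 2)), gamma * e_plus],
            [(1 - gamma) * f_plus, gamma * e_minus + (1 - gamma) * f_minus]]


def quadrivariate(gamma1, gamma2, angles, psi):
    """p[i, j, k, l] = <psi| R1[i][j] (x) R2[k][l] |psi> for a real state psi."""
    th1, th1p, th2, th2p = angles
    r1 = arm_povm(gamma1, th1, th1p)
    r2 = arm_povm(gamma2, th2, th2p)
    p = np.zeros((2, 2, 2, 2))
    for i, j, k, l in np.ndindex(2, 2, 2, 2):
        p[i, j, k, l] = psi @ np.kron(r1[i][j], r2[k][l]) @ psi
    return p


psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)        # assumed state
angles = (0.0, np.pi / 4, np.pi / 8, 3 * np.pi / 8)      # assumed directions

p_11 = quadrivariate(1, 1, angles, psi)   # standard experiment (A1, A2)
p_10 = quadrivariate(1, 0, angles, psi)   # standard experiment (A1, B2)

# Bivariate marginals on the detectors of A1 and A2 (sum over j and l).
marg_11 = p_11.sum(axis=(1, 3))
marg_10 = p_10.sum(axis=(1, 3))
print(np.allclose(p_11, p_10), np.allclose(marg_11, marg_10))
```

With $\gamma_2 = 0$ the detector of $A_2$ never fires, so the distribution of the $(A_1, B_2)$ experiment carries no information on $A_2$ and cannot return the bivariate probabilities of the $(A_1, A_2)$ experiment as a marginal.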


4 Bell's inequality in stochastic and deterministic hidden-variables theories

In stochastic hidden-variables theories quantum mechanical probabilities are usually given as

$$p(a_1) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda) \qquad (1)$$

in which $\Lambda$ is the space of hidden variable $\lambda$ (to be compared with classical phase space), $p(a_1|\lambda)$ is the conditional probability of measurement result $A_1 = a_1$ if the value of the hidden variable was $\lambda$, and $p(\lambda)$ the probability of $\lambda$. It should be noticed that expression (41) fits perfectly into an empiricist interpretation of the quantum mechanical formalism, in which measurement result $a_1$ is referring to a pointer position of a measuring instrument, the object being described by the hidden variable. Since $p(a_1|\lambda)$ may depend on the specific way the measurement is carried out, the stochastic hidden-variables model corresponds to a contextualistic interpretation of quantum mechanical observables. Deterministic hidden-variables theories are just special cases in which $p(a_1|\lambda)$ is either 1 or 0. In the deterministic case it is possible to associate in a unique way (although possibly dependent on the measurement procedure) the value $a_1$ to the phase space point $\lambda$ the object is prepared in. A disadvantage of a deterministic theory is that the physical interaction of object and measuring instrument is left out of consideration, thus suggesting measurement result $a_1$ to be a (possibly contextually determined) property of the object. In order to have maximal generality it is preferable to deal with the stochastic case.

For Bell experiments we have

$$p(a_1, a_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, a_2|\lambda), \qquad (2)$$

with a condition of conditional statistical independence,

$$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda), \qquad (3)$$

expressing that the measurement procedures of $A_1$ and $A_2$ do not influence each other (so-called locality condition).

As is well known, the locality condition was thought by Bell to be the crucial condition allowing a derivation of his inequality. This does not seem to be correct, however. As a matter of fact, Bell's inequality can be derived if a quadrivariate joint probability distribution exists [12, 13]. In a stochastic hidden-variables theory such a distribution could be represented by

$$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda), \qquad (4)$$

without any necessity that the conditional probability be factorizable in order that Bell's inequality be satisfied (although for the generalized experiment discussed in section 3 it would be reasonable to require that $p(a_1, b_1, a_2, b_2|\lambda) = p(a_1, b_1|\lambda)\, p(a_2, b_2|\lambda)$). Analogous to the quantum mechanical case, it is sufficient that for each individual preparation (here parameterized by $\lambda$) a quadruple of measurement results exists. If Heisenberg measurement disturbance is a physically realistic effect in the experiments at issue, it should be described by the hidden-variables theory as well. Therefore the explanation of the non-existence of such quadruples is the same as in quantum mechanics.

However, with respect to the possibility of deriving Bell's inequality there is an important difference between quantum mechanics and the stochastic hidden-variables theories of the kind discussed here. Whereas quantum mechanics does not yield any indication as regards the existence of a quadrivariate joint probability distribution returning the bivariate probabilities of the Aspect experiments as marginals, local stochastic hidden-variables theory does. Indeed, using the single-observable conditional probabilities assumed to exist in the local theory (compare (3)), it is possible to construct a quadrivariate joint probability distribution according to

$$p(a_1, a_2, b_1, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda)\, p(a_2|\lambda)\, p(b_1|\lambda)\, p(b_2|\lambda), \qquad (5)$$

satisfying all requirements. It should be noted that (42) does not describe the results of any joint measurement of the four observables that are involved. Quadruples $(a_1, a_2, b_1, b_2)$ are obtained here by combining measurement results found in different experiments, assuming the same value of $\lambda$ in all experiments. For this reason the physical meaning of this probability distribution is not clear. However, this does not seem to be important. The existence of (42) as a purely mathematical constraint is sufficient to warrant that any stochastic hidden-variables theory in which (2) and (3) are satisfied must require that the standard Aspect experiments obey Bell's inequality. Admittedly, there is a possibility that (42) might not be a valid mathematical entity, because it is based on multiplication of the probability distributions $p(a_1|\lambda)$, which might be distributions in the sense of Schwartz distribution theory. However, the remark made with respect to the existence of probability distributions as infinite-$N$ limits of relative frequencies is valid also here: the reasoning does not depend on this limit, but is equally applicable to relative frequencies in finite sequences.
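The construction can be illustrated with a discrete toy model. The sketch below assumes a finite hidden-variable space, randomly chosen $p(\lambda)$ and single-observable response probabilities, outcomes $\pm 1$, and the CHSH form of Bell's inequality; all names and these modelling choices are illustrative assumptions. It builds the product-form quadrivariate distribution, checks that it returns the local bivariate probabilities as marginals, and evaluates the CHSH combination.

```python
import numpy as np

SIGNS = np.array([+1, -1])  # index 0 -> outcome +1, index 1 -> outcome -1


def random_local_model(n_lambda, rng):
    """Random p(lambda) and p(x|lambda) for observables (A1, B1, A2, B2)."""
    weights = rng.random(n_lambda)
    weights /= weights.sum()
    plus = rng.random((4, n_lambda))                  # p(+1 | lambda)
    response = np.stack([plus, 1.0 - plus], axis=1)   # (observable, outcome, lambda)
    return weights, response


def bivariate(weights, response, x, y):
    """Local bivariate probabilities: sum_lambda p(l) p(x|l) p(y|l)."""
    return np.einsum('n,in,jn->ij', weights, response[x], response[y])


def quadrivariate(weights, response):
    """Product-form joint distribution, axes ordered (A1, B1, A2, B2)."""
    return np.einsum('n,in,jn,kn,ln->ijkl', weights, *response)


def chsh(weights, response):
    """Assumed CHSH combination evaluated from the four bivariate tables."""
    def corr(x, y):
        return float(SIGNS @ bivariate(weights, response, x, y) @ SIGNS)
    return corr(0, 2) - corr(0, 3) + corr(1, 2) + corr(1, 3)


rng = np.random.default_rng(1)
weights, response = random_local_model(50, rng)
joint = quadrivariate(weights, response)
marginal_ok = np.allclose(joint.sum(axis=(1, 3)), bivariate(weights, response, 0, 2))
print(marginal_ok, abs(chsh(weights, response)) <= 2.0)
```

Note that the axes of `joint` are ordered $(a_1, b_1, a_2, b_2)$, which differs from the argument order written in the equation above only by a relabelling.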

The question is whether this reasoning is sufficient to conclude that no local hidden-variables theory can reproduce quantum mechanics. Such a conclusion would only be justified if locality were the only assumption in deriving Bell's inequality. If there were any additional assumption in this derivation, then violation of Bell's inequality could possibly be blamed on the invalidity of this additional assumption rather than on locality. Evidently, one such additional assumption is the existence of hidden variables. A belief in the completeness of the quantum mechanical formalism would indeed be a sufficient reason to reject this assumption, thus increasing pressure on the locality assumption. Since, however, an empiricist interpretation is hardly reconcilable with such a completeness belief, we have to take hidden-variables theories seriously, and look for the possibility of additional assumptions within such theories.

In expression (41) one such assumption is evident, viz. the existence of the conditional probability $p(a_1|\lambda)$. The assumption of the applicability of this quantity in a quantum mechanical measurement is far less innocuous than appears at first sight. If quantum mechanical measurements really can be modeled by equality (41), this implies that a quantum mechanical measurement result is determined, either in a stochastic or in a deterministic sense, by an instantaneous value $\lambda$ of the hidden variable, prepared independently of the measurement to be performed later. It is questionable whether this is a realistic assumption, in particular if hidden variables would have the character of rapidly fluctuating stochastic variables. As a matter of fact, every individual quantum mechanical measurement takes a certain amount of time, and it will in general be virtually impossible to determine the precise instant to be taken as the initial time of the measurement, as well as the precise value of the stochastic variable at that moment. Hence, hidden-variables theories of the kind considered here may be too specific.

Because of the assumption of a non-contextual preparation of the hidden variable, such theories were called quasi-objectivistic stochastic hidden-variables theories in de Muynck and van Stekelenborg [17] (dependence of the conditional probabilities $p(a_1|\lambda)$ on the measurement procedure preventing complete objectivity of the theory). In the past, attention has mainly been restricted to quasi-objectivistic hidden-variables theories. It is questionable, however, whether the assumption of quasi-objectivity is a possible one for hidden-variables theories purporting to reproduce quantum mechanical measurement results. The existence of quadrivariate probability distribution (42) only excludes quasi-objectivistic local hidden-variables theories (either stochastic or deterministic) from the possibility of reproducing quantum mechanics. As will be seen in the next section, it is far more reasonable to blame quasi-objectivity than locality for this, thus leaving the possibility of local hidden-variables theories that are not quasi-objectivistic.


5 Analogy between thermodynamics and quantum mechanics

The essential feature of expression (41) is the possibility to attribute, either in a stochastic or in a deterministic way, measurement result $a_1$ to an instantaneous value of hidden variable $\lambda$. The question is whether this is a reasonable assumption within the domain of quantum mechanical measurement. Are the conditional probabilities $p(a_1|\lambda)$ experimentally relevant within this domain? In order to give a tentative answer to this question we shall exploit the analogy between thermodynamics and quantum mechanics, considered already a long time ago by many authors (e.g. de Broglie [18], Bohm et al. [19, 20], Nelson [21, 22]).

Quantum mechanics  →  Hidden-variables theory
$(A_1, A_2, B_1, B_2)$        $\lambda$
        ↑                          ↑
Thermodynamics  →  Classical statistical mechanics
$(p, T, S)$        $\{q_n, p_n\}$

In this analogy thermodynamics and quantum mechanics are considered as phenomenological theories, to be reduced to more fundamental microscopic theories. The reduction of thermodynamics to classical statistical mechanics is thought to be analogous to a possible reduction of quantum mechanics to stochastic hidden-variables theory. Due to certain restrictions imposed on preparations and measurements within the domains of the phenomenological theories, their domains of application are thought to be contained in, but smaller than, the domains of the microscopic theories.

In order to assess the nature and the importance of such restrictions let us first look at thermodynamics. As is well known (e.g. Hollinger and Zenzen [23]), thermodynamics is valid only under a condition of molecular chaos, assuring the existence of local equilibrium [a] necessary for the ergodic hypothesis to be satisfied. Thermodynamics only describes measurements of quantities (like pressure, temperature, and entropy) being defined for such equilibrium states. From an operational point of view this implies that measurements within the domain of thermodynamics do not yield information on the object system valid for one particular instant of time, but it is time-averaged information, time averaging being replaced, under the ergodic hypothesis, by ensemble averaging. In the Gibbs theory this ensemble is represented by the canonical density function $Z^{-1} e^{-H(\{q_n, p_n\})/kT}$ on phase space. This state is called a macrostate, to be distinguished from the microstate $\{q_n, p_n\}$, representing the point in phase space the classical object is in at a certain instant of time.

[a] In equilibrium thermodynamics equilibrium is assumed to be even global.

The restricted validity of thermodynamics is manifest in a two-fold way: i) through the restriction of all possible density functions on phase space to the canonical ones; ii) through the restriction of thermodynamical quantities (observables) to functionals on the space of thermodynamic states. Physically, this can be interpreted as a restriction of the domain of application of thermodynamics to those measurement procedures probing only properties of the macrostates. This implies that such measurements only yield information that is averaged over times exceeding the relaxation time needed to reach a state of (local) equilibrium. Thus, it is important to note that thermodynamic quantities are quite different from the physical quantities of classical statistical mechanics, the latter ones being represented by functions of the microstate $\{q_n, p_n\}$, and hence referring to a particular instant of time [b]. Only if it were possible to perform measurements faster than the relaxation time would it be necessary to consider such non-thermodynamic quantities. Such measurements then are outside the domain of application of thermodynamics. Thus, if we have a cubic container containing a volume of gas in a microstate initially concentrated at its center, and if we could measure at a single instant of time either the total kinetic energy or the force exerted on the boundary of the container, then these results would not be equal to thermodynamic temperature and pressure [c], respectively, because this microstate is not an equilibrium state. Only after the gas has reached equilibrium within the volume defined by the container does (equilibrium) thermodynamics become applicable.

Within the domain of application of thermodynamics the microstate of the system may change appreciably without the macrostate being affected. Indeed, a macrostate is equivalent to an (ergodic) trajectory $\{q_n(t), p_n(t)\}_{\rm ergodic}$. We might exploit as follows the difference between micro- and macrostates for characterizing objectivity of a physical theory. Whereas the microstate is thought to yield an objective description of the (microscopic) object, the macrostate just describes certain phenomena to be attributed to the object system only while being observed under conditions valid within the domain of application of the theory. In this sense classical mechanics is an objective theory, all quantities being instantaneous properties of the microstate. Thermodynamic quantities, only being attributable to the macrostate (i.e. to an ergodic trajectory), can not be seen, however, as properties belonging to the object at a certain instant of time. Of course, we might attribute the thermodynamic quantity to the event in space-time represented by the trajectory, but it should be realized that this event is not determined solely by the preparation of the microstate, but is determined as well by the macroscopic arrangement serving to define the macrostate.

[b] Note that a definition of an instantaneous temperature by means of the equality $\frac{3}{2} n k T = \sum_i p_i^2 / 2m_i$ does not make sense, as can easily be seen by applying this definition to an ideal gas in a container freely falling in a gravitational field.

[c] Thermodynamic pressure is defined for the canonical ensemble by $p = kT\, \partial/\partial V \log Z$.

Figure 4. Incompatible thermodynamic arrangements.

In order to illustrate this, consider two identical cubic containers differing only in their orientations (cf. figure 4). In principle the same microstate may be prepared in the two containers. Because of the different orientations, however, the macrostates evolving from this microstate during the time the gas is reaching equilibrium with the container are different (for different orientations of the container we have $H_1 \neq H_2$, and hence $e^{-H_1/kT}/Z_1 \neq e^{-H_2/kT}/Z_2$, since $H = T + V$ and $V_1 \neq V_2$ because potential energy is infinite outside a container). This implies that thermodynamic macrostates may be different even though starting from the same microstate. Macrostates in thermodynamics have a contextual meaning. It is important to note that, since the container is part of the preparing apparatus, this contextuality is connected here to preparation rather than to measurement. Consequently, whereas classical quantities $f(\{q_n, p_n\})$ can be interpreted as objective properties, thermodynamic quantities are non-objective, the non-objectivity being of a contextual nature.

Let us now suppose that quantum mechanics is related to hidden-variables theory analogous to the way thermodynamics is related to classical mechanics, the analogy maybe being even closer for non-equilibrium thermodynamics (only local equilibrium being assumed) than for the thermodynamics of global equilibrium processes. Support for this idea was found in de Muynck and van Stekelenborg [17], where it was demonstrated that in the Husimi representation of quantum mechanics by means of non-negative probability distribution functions on phase space an analogous restriction to a canonical set of distributions obtains as in thermodynamics. In particular, it was demonstrated that the dispersionfree states $p(q, p) = \delta(q - q_0)\, \delta(p - p_0)$ are not canonical in this sense. This implies that within the domain of quantum mechanics it does not make sense to consider the preparation of the object in a microstate with a well-defined value of the hidden variables $(q, p)$.

In the analogy quantum mechanical observables like $A_1$, $A_2$, $B_1$, $B_2$ should be compared to thermodynamic quantities like pressure, temperature, and entropy. The central issue in the analogy is the fact that thermodynamic quantities like pressure and temperature cannot be conditioned on the instantaneous phase space variable $\{q_n, p_n\}$ (microstate). Expressions like $p(\{q_n, p_n\})$ and $T(\{q_n, p_n\})$ are meaningless within thermodynamics. Thermodynamic quantities are conditioned on macrostates, corresponding to ergodic paths in phase space. Analogously, a quantum mechanical observable might not correspond to an instantaneous property of the object, but might have to be associated with an (ergodic) path in hidden-variables space $\Lambda$ (macrostate) rather than with an instantaneous value $\lambda$ (microstate).

On the basis of the analogy between thermodynamics and quantum mechanics it is possible to state the following conjectures:

• Quantum mechanical measurements (analogous to thermodynamic measurements) do not probe microstates but macrostates.

• Quantum mechanical quantities (analogous to thermodynamic quantities) should be conditioned on macrostates.

A hidden-variables macrostate will be symbolically indicated by $\lambda_t$. For quantum mechanical measurements the conditional probabilities $p(a_1|\lambda)$ of (41) should then be replaced by $p(a_1|\lambda_t)$. Concomitantly, quantum mechanical probabilities should be represented in the hidden-variables theory by a functional integral,

$$p(a_1) = \int d\lambda_t\, p(\lambda_t)\, p(a_1|\lambda_t), \qquad (1)$$

in which the integration is over all possible macrostates consistent with the preparation procedure.

By itself, conditioning of quantum mechanical observables on macrostates rather than microstates is not sufficient to prevent derivation of Bell's inequality. As a matter of fact, on the basis of expression (43) a quadrivariate joint probability distribution can be defined, analogous to (42), according to

$$p(a_1, a_2, b_1, b_2) = \int d\lambda_t\, p(\lambda_t)\, p(a_1|\lambda_t)\, p(a_2|\lambda_t)\, p(b_1|\lambda_t)\, p(b_2|\lambda_t), \qquad (2)$$

from which Bell's inequality can be derived just as well. There is, however, one important aspect that up till now has not sufficiently been taken into account, viz. contextuality. In the construction of (44) it is assumed that the macrostate $\lambda_t$ is applicable in each of the measurement arrangements of observables $A_1$, $A_2$, $B_1$ and $B_2$. Because of the incompatibility of some of these observables this is an implausible assumption. On the basis of the thermodynamic analogy it is to be expected that macrostates $\lambda_t$ will depend on the measurement context of a specific observable. Since $[A_1, B_1]_- \neq O$ we will have

$$\lambda_t^{A_1} \neq \lambda_t^{B_1} \qquad (3)$$

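and analogously for $A_2$ and $B_2$. Then for the Bell experiments measuring the pairs $(A_1, A_2)$ and $(A_1, B_2)$, respectively, we have

$$p(a_1, a_2) = \int d\lambda_t^{A_1 A_2}\, p(\lambda_t^{A_1 A_2})\, p(a_1|\lambda_t^{A_1 A_2})\, p(a_2|\lambda_t^{A_1 A_2}), \qquad (4)$$

$$p(a_1, b_2) = \int d\lambda_t^{A_1 B_2}\, p(\lambda_t^{A_1 B_2})\, p(a_1|\lambda_t^{A_1 B_2})\, p(b_2|\lambda_t^{A_1 B_2}). \qquad (5)$$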

Now the contextuality expressed by inequality (45) prevents the construction of a quadrivariate joint probability distribution analogous to (44). Hence, as in the quantum mechanical approach, also in the local non-objectivistic hidden-variables theory a derivation of Bell's inequality is prevented by the local contextuality involved in the interaction of the particle and the measuring instrument it is directly interacting with.

6 Conclusions

Our conclusion is that if quantum mechanical measurements do probe macrostates $\lambda_t$ rather than microstates $\lambda$, then Bell's inequality cannot be derived for quantum mechanical measurements. Both in quantum mechanics and in hidden-variables theories Bell's inequality is a consequence of the assumption that the theory is yielding an objective description of reality, in the sense that the preparation of the microscopic object, as far as relevant to the realization of the measurement result, can be thought to be independent of the measurement arrangement. The important point to be noticed is that, although in Bell experiments the preparation of the particle pair at the source (i.e. the microstate) can be considered to be independent of the measurement procedures to be carried out later (and, hence, one and the same microstate can be assumed in different Bell experiments), the measurement result is only determined by the macrostate, which is co-determined by the interaction with the measuring instruments. It really seems that the Copenhagen maxim of the impossibility of attributing quantum mechanical measurement results to the object as objective properties, possessed independently of the measurement, should be taken very seriously, and implemented also in hidden-variables theories purporting to reproduce the quantum mechanical results. The quantum mechanical dice is only cast after the object has been interacting with the measuring instrument, even though its result can be deterministically determined by the (sub-quantum mechanical) microstate.

The thermodynamic analogy suggests which experiments could be done in order to transcend the boundaries of the domain of application of quantum mechanics. If it were possible to perform experiments that probe the microstate $\lambda$ rather than the macrostate $\lambda_t$, then we would be in the domain of (quasi-)objectivistic hidden-variables theories. Because of (42) it then is to be expected that Bell's inequality should be satisfied for such experiments. In such experiments preparation and measurement must be completed well within the relaxation time of the microstates. Such times have been estimated by Bohm [24], for the sake of illustration, as the time light needs to cover a distance of the order of the size of an atom ($10^{-18}$ s, say). If this is correct, then all present-day experimentation is well within the range of quantum mechanics, thus explaining the seemingly universal applicability of this latter theory. In hindsight, this would explain why Aspect's switching experiment is corroborating quantum mechanics: the applied switching frequency (50 MHz), although sufficient to warrant locality, has been far too low to beat the local relaxation processes in each of the measuring instruments separately.
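The orders of magnitude involved can be made explicit with a short calculation. The sketch below assumes an atomic size of $10^{-10}$ m, which is an illustrative value not given in the text, and compares the corresponding light-crossing time with the period of a 50 MHz switch.

```python
C = 299_792_458.0         # speed of light in m/s (exact SI value)
ATOM_SIZE = 1e-10         # assumed order of magnitude of an atomic size, in m
SWITCH_FREQUENCY = 50e6   # switching frequency quoted in the text, in Hz

crossing_time = ATOM_SIZE / C            # time light needs to cross an atom, in s
switch_period = 1.0 / SWITCH_FREQUENCY   # duration of one switching cycle, in s
ratio = switch_period / crossing_time    # how much slower the switch is

print(f"light-crossing time: {crossing_time:.1e} s")
print(f"switching period:    {switch_period:.1e} s")
print(f"ratio:               {ratio:.1e}")
```

Under this assumption the crossing time is a few times $10^{-19}$ s, of the same order as the estimate quoted above, while the switching period is $2 \times 10^{-8}$ s, some ten orders of magnitude longer.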

It has often been felt that the most surprising feature of Bell experiments is the possibility (in certain states) of a strict correlation between the measurement results of the two measured observables, without being able to attribute this to a previous preparation of the object (no elements of physical reality). For many physicists the existence of such strict correlations has been reason enough to doubt Bohr's Copenhagen solution to renounce causal explanation of measurement results, and to replace determinism by complementarity. It seems that the urge for causal reasoning has been so strong that even within the Copenhagen interpretation a certain causality has been accepted, even a non-local one, in an EPR experiment (cf. figure 1) determining a measurement result for particle 2 by the measurement of particle 1. This, however, should rather be seen as an internal inconsistency of this interpretation, caused by a tendency to make the Copenhagen interpretation as realist as possible. In a consistent application of the Copenhagen interpretation to Bell experiments such experiments could be interpreted as measurements of bivariate correlation observables. The certainty of obtaining a certain (bivariate) eigenvalue of such an observable would not be more surprising than the certainty of obtaining a certain eigenvalue of a univariate one if the state vector is the corresponding eigenvector.

It is important to note that this latter interpretation of Bell experiments takes seriously the Copenhagen idea that quantum mechanics need not explain the specific measurement result found in an individual measurement. Indeed, in order to compare theory and experiment it would be sufficient that quantum mechanics just describe the relative frequencies found in such measurements. In this view quantum mechanics is just a phenomenological theory, in an analogous way describing (not explaining) observations as does thermodynamics in its own domain of application. Explanations should be provided by more fundamental theories, describing the mechanisms behind the observable phenomena. Hence, the Copenhagen completeness thesis should be rejected (although this need not imply a return to determinism).

This approach has important consequences. One consequence is that the non-existence, within quantum mechanics, of elements of physical reality does not imply that elements of physical reality do not exist at all. They could be elements of the more fundamental theories. In section 5 it was discussed how an analogy between quantum mechanics and thermodynamics could be exploited to spell this out. Elements of physical reality could correspond to hidden-variables microstates $\lambda$. The determinism necessary to explain the strict correlations referred to above would be explained if, within a given measurement context, a microstate would define a unique macrostate $\lambda_t$. This demonstrates how it could be possible that quantum mechanical measurement results cannot be attributed to the object as properties possessed prior to measurement, and there yet is sufficient determinism to yield a local explanation of strict correlations of quantum mechanical measurement results in certain Bell experiments.

Another important aspect of a dissociation of phenomenological and fundamental aspects of measurement is the possibility of an empiricist interpretation of quantum mechanics. As demonstrated by the generalized Aspect experiment discussed in section 3, an empiricist approach needs a generalization of the mathematical formalism of quantum mechanics, in which an observable is represented by a POVM rather than by a projection-valued measure corresponding to a self-adjoint operator of the standard formalism. Such a generalization has been very important in assessing the meaning of Bell's inequality. In the major part of the literature of the past this subject has been dealt with on the basis of the (restricted) standard formalism. However, some conclusions drawn from the restricted formalism are not cogent when viewed in the generalized one (for instance, because von Neumann's projection postulate is not applicable in general). For this reason we must be very careful when accepting conclusions drawn from the standard formalism. This in particular holds true for the issue of non-locality.


References

1. W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925).
2. E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, 1983), p. 152).
3. W.M. de Muynck, Synthese 102, 293 (1995).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981).
6. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
7. K.R. Popper, Quantum theory and the schism in physics (Rowman and Littlefield, Totowa, 1982).
8. M. Jammer, The philosophy of quantum mechanics (Wiley, New York, 1974).
9. N. Bohr, Phys. Rev. 48, 696 (1935).
10. J.S. Bell, Physics 1, 195 (1964).
11. H.P. Stapp, Phys. Rev. D 3, 1303 (1971); Il Nuovo Cim. 29B, 270 (1975).
12. A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982).
13. P. Rastall, Found. of Phys. 13, 555 (1983).
14. W.M. de Muynck, Phys. Lett. A 114, 65 (1986).
15. W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994).
16. W.M. de Muynck, Found. of Phys. 30, 205 (2000).
17. W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988).
18. L. de Broglie, La thermodynamique de la particule isolée (Gauthier-Villars, 1964); L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes (Springer-Verlag, 1995).
19. D. Bohm, Phys. Rev. 89, 458 (1953).
20. D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954).
21. E. Nelson, Dynamical theories of Brownian motion (Princeton University Press, 1967).
22. E. Nelson, Quantum fluctuations (Princeton University Press, 1985).
23. H.B. Hollinger and M.J. Zenzen, The Nature of Irreversibility (D. Reidel Publishing Company, Dordrecht, 1985, sect. 44).
24. D. Bohm, Phys. Rev. 85, 166, 180 (1952).


DISCRETE HESSIANS IN STUDY OF QUANTUM STATISTICAL SYSTEMS: COMPLEX GINIBRE ENSEMBLE

M. M. DURAS

Institute of Physics, Cracow University of Technology, ulica Podchorazych 1, PL-30084 Cracow, Poland

E-mail: mduras@riad.usk.pk.edu.pl

The Ginibre ensemble of nonhermitean random Hamiltonian matrices $K$ is considered. Each quantum system described by $K$ is a dissipative system, and the eigenenergies $Z_i$ of the Hamiltonian are complex-valued random variables. The second difference of complex eigenenergies is viewed as discrete analog of Hessian with respect to labelling index. The results are considered in view of Wigner and Dyson's electrostatic analogy. An extension of space of dynamics of random magnitudes is performed by introduction of discrete space of labelling indices.

1 Introduction

Random Matrix Theory (RMT) studies quantum Hamiltonian operators $H$ which are random matrix variables. Their matrix elements $H_{ij}$ are independent random scalar variables [1-8]. Among others, the following Gaussian Random Matrix ensembles (GRME) have been studied: orthogonal (GOE), unitary (GUE), and symplectic (GSE), as well as the circular ensembles: orthogonal (COE), unitary (CUE), and symplectic (CSE). The choice of ensemble is based on quantum symmetries ascribed to the Hamiltonian $H$. The Hamiltonian $H$ acts on quantum space $V$ of eigenfunctions. It is assumed that $V$ is an $N$-dimensional Hilbert space $V = F^N$, where the real, complex, or quaternion field $F = \mathbb{R}, \mathbb{C}, \mathbb{H}$ corresponds to GOE, GUE, or GSE, respectively. If the Hamiltonian matrix


$H$ is hermitean, $H = H^{\dagger}$, then the probability density function of $H$ reads:

$$ f_H(H) = C_{H\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(H^2)\right], \tag{1} $$

$$ C_{H\beta} = \left(\frac{\beta}{2\pi}\right)^{N_{H\beta}/2}, $$

$$ N_{H\beta} = N + \tfrac{1}{2} N(N-1)\beta, $$

$$ \int f_H(H)\, dH = 1, $$

$$ dH = \prod_{i=1}^{N} \prod_{j \ge i}^{N} \prod_{\gamma=0}^{D-1} dH_{ij}^{(\gamma)}, $$

$$ H_{ij} = \left(H_{ij}^{(0)}, \ldots, H_{ij}^{(D-1)}\right) \in F, $$

where the parameter $\beta$ assumes the values $\beta = 1, 2, 4$ for GOE($N$), GUE($N$), GSE($N$), respectively, and $N_{H\beta}$ is the number of independent matrix elements of the hermitean Hamiltonian $H$. The Hamiltonian $H$ belongs to the Lie group of hermitean $N \times N$ matrices, and the matrix Haar measure $dH$ is invariant under transformations from the unitary group U($N$, $F$). The eigenenergies $E_i$, $i = 1, \ldots, N$, of $H$ are real-valued random variables, $E_i = E_i^{*}$. It was Eugene Wigner who first dealt with the eigenenergy level repulsion phenomenon, studying nuclear spectra [1-3]. RMT is now applicable in many branches of physics: nuclear physics (slow neutron resonances, highly excited complex nuclei), condensed phase physics (fine metallic particles, random Ising model [spin glasses]), quantum chaos (quantum billiards, quantum dots), disordered mesoscopic systems (transport phenomena), quantum chromodynamics, quantum gravity, field theory.
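As a small illustration of Eq. (1), the following sketch draws a 2 x 2 matrix from GUE(2) and computes its two real eigenenergies in closed form. It is not part of the original computations: the function names are hypothetical, and the variances are derived here from Eq. (1) with $\beta = 2$, where $\mathrm{Tr}(H^2) = a^2 + d^2 + 2(x^2 + y^2)$ for $H$ with diagonal entries $a, d$ and off-diagonal entry $x + iy$.

```python
import math
import random

def sample_gue_2x2(rng):
    """Draw a 2x2 hermitean matrix with density proportional to exp(-Tr H^2)."""
    # exp(-a^2) gives variance 1/2 for the diagonal entries;
    # exp(-2 x^2) gives variance 1/4 (standard deviation 1/2) for x and y.
    a = rng.gauss(0.0, math.sqrt(0.5))
    d = rng.gauss(0.0, math.sqrt(0.5))
    x = rng.gauss(0.0, 0.5)
    y = rng.gauss(0.0, 0.5)
    return a, d, complex(x, y)

def eigenenergies_2x2(a, d, b):
    """Real eigenvalues of [[a, b], [conj(b), d]], in increasing order."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), abs(b))
    return mean - radius, mean + radius

rng = random.Random(0)          # fixed seed, chosen only for reproducibility
a, d, b = sample_gue_2x2(rng)
e1, e2 = eigenenergies_2x2(a, d, b)
spacing = e2 - e1               # nearest-neighbour spacing of the two levels
```

The closed form makes the reality of the eigenenergies explicit: both are built from the real trace and a real square root, so no numerical eigenvalue solver is needed for this check.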

2 The Ginibre ensembles

Jean Ginibre considered another example of GRME, dropping the assumption of hermiticity of Hamiltonians, thus defining generic $F$-valued Hamiltonian $K$ [1,2,9,10]. Hence, $K$ belongs to the general linear Lie group GL($N$, $F$), and the matrix Haar measure $dK$ is invariant under transformations from that group. The


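distribution of $K$ is given by:

$$ f_K(K) = C_{K\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(K^{\dagger} K)\right], \tag{2} $$

$$ N_{K\beta} = N^2 \beta, $$

$$ \int f_K(K)\, dK = 1, $$

$$ dK = \prod_{i=1}^{N} \prod_{j=1}^{N} \prod_{\gamma=0}^{D-1} dK_{ij}^{(\gamma)}, $$

where $\beta = 1, 2, 4$ stands for real, complex, and quaternion Ginibre ensembles, respectively. Therefore, the eigenenergies $Z_i$ of quantum system ascribed to Ginibre ensemble are complex-valued random variables. The eigenenergies $Z_i$, $i = 1, \ldots, N$, of nonhermitean Hamiltonian $K$ are not real-valued random variables, $Z_i \neq Z_i^{*}$. Jean Ginibre postulated the following joint probability density function of random vector of complex eigenvalues $Z_1, \ldots, Z_N$ for $N \times N$ Hamiltonian matrices $K$ for $\beta = 2$ [1,2,9,10]:

$$ P(z_1, \ldots, z_N) = \prod_{j=1}^{N} \frac{1}{\pi \cdot j!} \cdot \prod_{i<j}^{N} |z_i - z_j|^2 \cdot \exp\left(-\sum_{j=1}^{N} |z_j|^2\right), \tag{3} $$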

where $z_i$ are complex-valued sample points ($z_i \in \mathbb{C}$).

We emphasize here Wigner and Dyson's electrostatic analogy. A Coulomb gas of $N$ unit charges moving on complex plane (Gauss's plane) $\mathbb{C}$ is considered. The vectors of positions of charges are $z_i$, and potential energy of the system is:

$$ U(z_1, \ldots, z_N) = -\sum_{i<j} \ln|z_i - z_j| + \frac{1}{2} \sum_{i} |z_i|^2. \tag{4} $$

If gas is in thermodynamical equilibrium at temperature $T = \frac{1}{2 k_B}$ ($\beta = \frac{1}{k_B T} = 2$, $k_B$ is Boltzmann's constant), then probability density function of vectors of positions is $P(z_1, \ldots, z_N)$, Eq. (3). Therefore, complex eigenenergies $Z_i$ of quantum system are analogous to vectors of positions of charges of Coulomb


gas. Moreover, complex-valued spacings $\Delta^1 Z_i$ of complex eigenenergies of quantum system:

$$ \Delta^1 Z_i = Z_{i+1} - Z_i, \quad i = 1, \ldots, (N-1), \tag{5} $$

are analogous to vectors of relative positions of electric charges. Finally, complex-valued second differences $\Delta^2 Z_i$ of complex eigenenergies:

$$ \Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i, \quad i = 1, \ldots, (N-2), \tag{6} $$

are analogous to vectors of relative positions of vectors of relative positions of electric charges.
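The electrostatic analogy can be checked numerically: the density of Eq. (3) should be proportional to the Boltzmann factor $\exp(-\beta U)$ with $\beta = 2$ and $U$ from Eq. (4). The sketch below compares ratios of both quantities for two arbitrary configurations, so the normalization constant drops out. The function names and the sample points are illustrative assumptions, not taken from the paper.

```python
import math

def coulomb_energy(z):
    """U = -sum_{i<j} ln|z_i - z_j| + (1/2) sum_i |z_i|^2, as in Eq. (4)."""
    n = len(z)
    pair = sum(math.log(abs(z[i] - z[j]))
               for i in range(n) for j in range(i + 1, n))
    return -pair + 0.5 * sum(abs(w) ** 2 for w in z)

def ginibre_density(z):
    """Joint eigenvalue density of Eq. (3) for beta = 2."""
    n = len(z)
    norm = 1.0
    for j in range(1, n + 1):
        norm /= math.pi * math.factorial(j)
    vandermonde = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            vandermonde *= abs(z[i] - z[j]) ** 2
    return norm * vandermonde * math.exp(-sum(abs(w) ** 2 for w in z))

# Two arbitrary configurations of N = 3 charges (hypothetical sample points).
points = [0.3 + 0.1j, -0.5 + 0.8j, 1.1 - 0.4j]
other = [0.0 + 0.0j, 1.0 + 0.0j, 0.0 + 1.0j]

ratio_density = ginibre_density(points) / ginibre_density(other)
ratio_boltzmann = math.exp(-2.0 * (coulomb_energy(points) - coulomb_energy(other)))
```

Both ratios agree up to rounding, which is the content of the statement that $P$ is the equilibrium density of the gas at $\beta = 2$.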

The eigenenergies $Z_i = Z(i)$ can be treated as values of function $Z$ of discrete parameter $i = 1, \ldots, N$. The Jacobian of $Z_i$ reads:

$$ \mathrm{Jac}\, Z_i = \frac{\partial Z_i}{\partial i} \simeq \frac{\Delta^1 Z_i}{\Delta^1 i} = \Delta^1 Z_i. \tag{7} $$

We readily have that the spacing is a discrete analog of Jacobian, since the indexing parameter $i$ belongs to discrete space of indices $i \in I = \{1, \ldots, N\}$. Therefore, the first derivative with respect to $i$ reduces to the first differential quotient. The Hessian is a Jacobian applied to Jacobian. We immediately have the formula for discrete Hessian for the eigenenergies $Z_i$:

$$ \mathrm{Hess}\, Z_i = \frac{\partial^2 Z_i}{\partial i^2} \simeq \frac{\Delta^2 Z_i}{(\Delta^1 i)^2} = \Delta^2 Z_i. \tag{8} $$

Thus, the second difference of $Z$ is discrete analog of Hessian of $Z$. One emphasizes that both Jacobian and Hessian work on discrete index space of indices $i$. The finite differences of order higher than two are discrete analogs of compositions of Jacobians with Hessians of $Z$.
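The statement that the Hessian is a Jacobian applied to a Jacobian can be made concrete with a few lines of code. The sketch below computes first and second differences of a short list of complex numbers and shows that the second difference equals the first difference of the first differences. The function names and the sample values are illustrative assumptions.

```python
def first_differences(z):
    """Delta^1 Z_i = Z_{i+1} - Z_i for i = 1, ..., N-1 (0-based list in code)."""
    return [z[i + 1] - z[i] for i in range(len(z) - 1)]

def second_differences(z):
    """Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i for i = 1, ..., N-2."""
    return [z[i + 2] - 2 * z[i + 1] + z[i] for i in range(len(z) - 2)]

# A hypothetical list of complex "eigenenergies", used only for illustration.
levels = [0j, 1 + 1j, 3 + 1j, 4 + 4j]
d1 = first_differences(levels)
d2 = second_differences(levels)
d2_composed = first_differences(d1)   # Jacobian applied to Jacobian
```

Since the index step is 1, no division is needed, in agreement with Eqs. (7) and (8).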

The eigenenergies $E_i$, $i \in I$, of the hermitean Hamiltonian $H$ are ordered increasingly real-valued random variables. They are values of discrete function $E_i = E(i)$. The first differences of adjacent eigenenergies:

$$ \Delta^1 E_i = E_{i+1} - E_i, \quad i = 1, \ldots, (N-1), \tag{9} $$

are analogous to vectors of relative positions of electric charges of one-dimensional Coulomb gas. Each is simply the spacing of two adjacent energies. Real-valued second differences $\Delta^2 E_i$ of eigenenergies:

$$ \Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{10} $$


are analogous to vectors of relative positions of vectors of relative positions of charges of one-dimensional Coulomb gas. The $\Delta^2 Z_i$ have their real parts $\mathrm{Re}\,\Delta^2 Z_i$ and imaginary parts $\mathrm{Im}\,\Delta^2 Z_i$, as well as radii (moduli) $|\Delta^2 Z_i|$ and main arguments (angles) $\mathrm{Arg}\,\Delta^2 Z_i$. The $\Delta^2 Z_i$ are extensions of real-valued second differences:

$$ \Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{11} $$

of adjacent ordered increasingly real-valued eigenenergies $E_i$ of Hamiltonian $H$ defined for GOE, GUE, GSE, and Poisson ensemble PE (where Poisson ensemble is composed of uncorrelated randomly distributed eigenenergies) [11-15]. The Jacobian and Hessian operators of energy function $E(i) = E_i$ for these ensembles read:

$$ \mathrm{Jac}\, E_i = \frac{\partial E_i}{\partial i} \simeq \frac{\Delta^1 E_i}{\Delta^1 i} = \Delta^1 E_i, \tag{12} $$

and

$$ \mathrm{Hess}\, E_i = \frac{\partial^2 E_i}{\partial i^2} \simeq \frac{\Delta^2 E_i}{(\Delta^1 i)^2} = \Delta^2 E_i. \tag{13} $$

The treatment of first and second differences of eigenenergies as discrete analogs of Jacobians and Hessians allows one to consider these eigenenergies as magnitudes with statistical properties studied in discrete space of indices. The labelling index $i$ of the eigenenergies is an additional variable of motion; hence the space of indices $I$ augments the space of dynamics of random magnitudes.

Acknowledgements

It is my pleasure to most deeply thank Professor Antoni Ostoja-Gajewski for continuous help. I also thank Professor Wlodzimierz Wojcik for giving me access to computer facilities.

References

1. F. Haake, Quantum Signatures of Chaos (Springer-Verlag, Berlin Heidelberg New York, 1990), Chapters 1, 3, 4, 8, pp. 1-11, 33-77, 202-213.
2. T. Guhr, A. Müller-Groeling, and H.A. Weidenmüller, Phys. Rept. 299, 189-425 (1998).
3. M.L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapters 1, 2, 9, pp. 1-54, 182-193.
4. L.E. Reichl, The Transition to Chaos In Conservative Classical Systems: Quantum Manifestations (Springer-Verlag, New York, 1992), Chapter 6, p. 248.
5. O. Bohigas, in Proceedings of the Les Houches Summer School on Chaos and Quantum Physics (North-Holland, Amsterdam, 1991), p. 89.
6. C.E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).
7. T.A. Brody, J. Flores, J.B. French, P.A. Mello, A. Pandey, and S.S.M. Wong, Rev. Mod. Phys. 53, 385 (1981).
8. C.W.J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).
9. J. Ginibre, J. Math. Phys. 6, 440 (1965).
10. M.L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapter 15, pp. 294-310.
11. M.M. Duras and K. Sokalski, Phys. Rev. E 54, 3142 (1996).
12. M.M. Duras, Finite difference and finite element distributions in statistical theory of energy levels in quantum systems (PhD thesis, Jagellonian University, Cracow, 1996).
13. M.M. Duras and K. Sokalski, Physica D125, 260 (1999).
14. M.M. Duras, Description of Quantum Systems by Random Matrix Ensembles of Large Dimensions, in Proceedings of the Sixth International Conference on Squeezed States and Uncertainty Relations, 24 May-29 May 1999, Naples, Italy (NASA, Greenbelt, Maryland, at press, 2000).
15. M.M. Duras, J. Opt. B: Quantum Semiclass. Opt. 2, 287 (2000).


SOME REMARKS ON HARDY FUNCTIONS ASSOCIATED WITH DIRICHLET SERIES

W. EHM

Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstrasse 3a, 79098 Freiburg, Germany
E-mail: ehm@igpp.de

A simple method of associating a Hardy function with a Dirichlet series is described and applied to some examples connected with the Riemann zeta function. The theory of Hardy functions then is used to derive integral tests of the Riemann hypothesis, generalizing a recent result of Balazard, Saias, and Yor [1].

1 Introduction

The most famous example of a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ converging absolutely in the half plane $\Re z > 1$ is the Riemann zeta function $\zeta(z)$, which has all coefficients $a_n = 1$. It has a simple pole at $z = 1$ and can be extended as a meromorphic function with no other singularities to the whole complex plane [6].

A simple method of associating a Hardy function with a Dirichlet series of that kind consists in multiplying $f(z)$ by $(z-1)/z^2$: the factor $(z-1)$ removes the pole at $z = 1$, and the division by $z^2$ achieves square integrability along vertical lines. Moreover, the zeros of $f(z)$ remain unchanged by this modification. The motivation for passing from $f(z)$ to $f(z)(z-1)/z^2$ is to utilize the theory of Hardy functions, especially factorization of Hardy functions, for the study of the zeta function.

In section 2 of this note we give conditions under which the function $f(z)(z-1)/z^2$ has an analytic continuation as a Hardy function beyond the abscissa of convergence of the Dirichlet series $f(z)$. The criterion is tested on three examples, all related to the Riemann zeta function. Factorization of the Hardy function $\zeta(z)(z-1)/z^2$, which is briefly discussed in section 3, is used in section 4 to derive some integral tests of the Riemann hypothesis. The content of the Riemann hypothesis, hereafter abbreviated RH, is Riemann's yet unproven conjecture that all non-real zeros of the $\zeta$ function lie on the line $\Re z = 1/2$ in the complex plane. It has received increasing interest among physicists since the discovery of striking similarities in the distribution of the zeros of the zeta function and the spectrum of large random matrices [2].

The idea to utilize Hardy functions in connection with the zeta function, including integral tests of the Riemann hypothesis, is not new. See the recent article of Balazard, Saias, and Yor [1], who initially work with Hardy functions in the disc, then pass to the half plane $\Re z > 1/2$ by conformal mapping. In our


approach, based on the function $\zeta(z)(z-1)/z^2$, which also appears in recent work of Burnol [4], we deal with half plane Hardy functions from the beginning. This leads to somewhat more general results in a natural fashion.

2 Hardyfication of Dirichlet series

The basic result of this section is the following

Theorem. Given a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ with a finite abscissa of convergence, let functions $A$ and $\phi$ be defined by

$$ A(x) = \sum_{1 \le n \le x} a_n, \qquad \phi(x) = \sum_{1 \le n \le e^x} a_n (1 - x + \log n) \quad (x \in \mathbb{R}). \tag{1} $$

Suppose that $A(x) = O(x)$ as $x \to \infty$, and let

$$ \lambda = \limsup_{N \to \infty} \frac{\log^{+}|D_N|}{\log N}, \quad \text{where } D_N = A(N) - \sum_{n=1}^{N-1} \frac{A(n)}{n}. \tag{2} $$

Then the function $f(z)(z-1)/z^2$ can be represented as the Laplace transform of $\phi(x)$ in the half plane $\Re z > \lambda$:

$$ f(z)(z-1)/z^2 = \int_0^{\infty} e^{-zx} \phi(x)\, dx \quad (\Re z > \lambda). \tag{3} $$

Proof. Fix an integer $N \ge 1$ and let $\log N \le x < \log(N+1)$. Then

$$ |\phi(x) - \phi(\log N)| = (x - \log N)\,|A(N)| \le |A(N)| \log\frac{N+1}{N} = O(1) $$

as $N \to \infty$, by the assumed growth behavior of $A(x)$. Combining this with

$$ \phi(\log(n+1)) - \phi(\log n) = a_{n+1} - A(n) \log\frac{n+1}{n} = a_{n+1} - A(n)/n + O(n^{-1}) $$

we get for $N = [e^x] \to \infty$

$$ \phi(x) = a_1 + \sum_{n=1}^{N-1} \left[\phi(\log(n+1)) - \phi(\log n)\right] + O(1) $$

$$ = a_1 + \sum_{n=1}^{N-1} \left[a_{n+1} - A(n)/n + O(n^{-1})\right] + O(1) = D_N + O(\log N), $$


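and thus for every $\epsilon > 0$, $\phi(x) = O(e^{x(\lambda+\epsilon)})$, $x \uparrow \infty$, by the definition of $\lambda$. Since $\phi$ vanishes on the left half line, it follows that the integral on the right-hand side of (3) converges absolutely in the half plane $\Re z > \lambda$. It remains to show that this Laplace transform coincides with $f(z)(z-1)/z^2$ in the half plane $\Re z > \sigma_a$, where $\sigma_a$ denotes the abscissa of absolute convergence of $f(z)$.

To that end let us write $\eta(z) = f(z)(z-1)/z^2$ and introduce truncated versions

$$ f_N(z) = \sum_{n=1}^{N} a_n n^{-z}, \qquad \eta_N(z) = f_N(z)(z-1)/z^2, \qquad \phi_N(x) = \sum_{1 \le n \le \min(N, e^x)} a_n (1 - x + \log n), $$

$N \ge 1$, and set $h_{\sigma,N}(x) = e^{-\sigma x} \phi_N(x)$. Using

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{itx}}{(\sigma + it)^q}\, dt = \begin{cases} x^{q-1} e^{-\sigma x}/(q-1)! & \text{if } x > 0 \\ 0 & \text{if } x < 0 \end{cases} $$

(for every integer $q \ge 1$, $\sigma > 0$) we get for fixed $\sigma > \sigma_a$

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \eta_N(\sigma + it)\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \sum_{n=1}^{N} a_n n^{-\sigma - it}\, \frac{\sigma + it - 1}{(\sigma + it)^2}\, dt \tag{4} $$

$$ = \frac{1}{2\pi} \int_{-\infty}^{\infty} \sum_{n=1}^{N} a_n n^{-\sigma} e^{it(x - \log n)} \left[\frac{1}{\sigma + it} - \frac{1}{(\sigma + it)^2}\right] dt $$

$$ = \sum_{1 \le n \le \min(N, e^x)} a_n n^{-\sigma} e^{-\sigma(x - \log n)} \left(1 - (x - \log n)\right) = h_{\sigma,N}(x) $$

almost everywhere in $x \in \mathbb{R}$, the Fourier integrals being understood in the $L^2$ sense. Note that $\eta(z)$ is square integrable along every line $\Re z = \sigma$ with $\sigma > \sigma_a$. Clearly $\eta_N(\sigma + it)$ converges to $\eta(\sigma + it)$ in $L^2(dt)$, so $h_{\sigma,N}$ is a Cauchy sequence in $L^2(dx)$ by Parseval's formula. The pointwise limit $h_\sigma(x)$ of $h_{\sigma,N}(x)$ then also is the $L^2(dx)$ limit, so that by (4), $h_\sigma(x)$ and $\eta(\sigma + it)$ represent a Fourier transform pair for every $\sigma > \sigma_a$. Therefore

$$ \eta(\sigma + it) = \hat{h}_\sigma(t) = \int_0^{\infty} h_\sigma(x) e^{-ixt}\, dx = \int_0^{\infty} e^{-(\sigma + it)x} \phi(x)\, dx \tag{5} $$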


holds almost everywhere in $t$ ($\sigma > \sigma_a$), hence everywhere in $\Re z > \sigma_a$ by continuity. This shows that the Laplace transform of $\phi$ represents the analytic continuation of $\eta$ to the region $\Re z > \lambda$, completing the proof.
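The Laplace representation can be checked numerically for a truncated series, where both sides are elementary. The sketch below compares $f_N(z)(z-1)/z^2$ with a midpoint-rule approximation of $\int_0^\infty e^{-zx} \phi_N(x)\,dx$ for $a_n = 1$, $N = 5$, and real $z = 2$. The function names, the cut-off of the integral, and the step count are assumptions chosen for this illustration only.

```python
import math

def phi_truncated(x, coeffs, logs):
    """phi_N(x) = sum over n <= min(N, e^x) of a_n (1 - x + log n)."""
    return sum(a_n * (1.0 - x + log_n)
               for a_n, log_n in zip(coeffs, logs) if log_n <= x)

def eta_truncated(z, coeffs):
    """eta_N(z) = f_N(z) (z - 1) / z^2 with f_N(z) = sum a_n n^(-z)."""
    f_n = sum(a_n * n ** (-z) for n, a_n in enumerate(coeffs, start=1))
    return f_n * (z - 1.0) / z ** 2

def laplace_midpoint(z, coeffs, upper=30.0, steps=200_000):
    """Midpoint rule for the Laplace transform of phi_N on [0, upper]."""
    logs = [math.log(n) for n in range(1, len(coeffs) + 1)]
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.exp(-z * x) * phi_truncated(x, coeffs, logs)
    return h * total

coeffs = [1.0] * 5                      # a_n = 1, truncated at N = 5
lhs = eta_truncated(2.0, coeffs)
rhs = laplace_midpoint(2.0, coeffs)
```

The agreement is limited by the jumps of $\phi_N$ at $x = \log n$, which the midpoint rule resolves only to first order in the step size.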

Let $H^2_\sigma$ denote the Hardy space consisting of all functions $g(z)$ which are analytic for $\Re z > \sigma$ and such that $\sup_{\sigma' > \sigma} \int_{-\infty}^{\infty} |g(\sigma' + it)|^2\, dt < \infty$. The growth behavior of $\phi(x)$ established in the proof implies $h_\sigma \in L^2$ for every $\sigma > \lambda$, so that by (5) and Parseval's formula we obtain the following

Corollary. Under the conditions of the theorem, the function $f(z)(z-1)/z^2$ belongs to every Hardy space $H^2_\sigma$, $\sigma > \lambda$.

Example 1. Let $a_n = 1$ for all $n$, that is, $f(z) = \zeta(z)$. Then $D_N = 1$, $N \ge 1$, so that $\lambda = 0$. A more careful analysis shows that $\phi(x)$ is nonnegative and grows linearly as $x$ tends to infinity. Consequently, $\zeta(z)(z-1)/z^2$ is a member of every Hardy space $H^2_\sigma$, $\sigma > 0$, but not of $H^2_0$. The nonnegativity allows one to associate with $\phi$ an exponential family $\mathcal{P} = \{p_\sigma, \sigma > 0\}$ of probability densities with support $[0, \infty)$ by setting

$$ p_\sigma(x) = h_\sigma(x)/\eta(\sigma) = \phi(x) e^{-\sigma x}/\eta(\sigma) \quad (x \in \mathbb{R},\ \sigma > 0). \tag{6} $$
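The quantities of Example 1 are easy to evaluate directly. The following sketch computes $D_N$ from definition (2) for an arbitrary coefficient function and evaluates $\phi(x)$ for $a_n = 1$ through $\phi(x) = N(1 - x) + \log N!$ with $N = [e^x]$, which follows from (1). The function names are hypothetical, and the use of `math.lgamma` for $\log N!$ is a choice made here.

```python
import math

def d_sequence(N, a):
    """D_N = A(N) - sum_{n=1}^{N-1} A(n)/n, with A(n) = a(1) + ... + a(n)."""
    running, correction = 0.0, 0.0
    for n in range(1, N + 1):
        running += a(n)                 # running now equals A(n)
        if n < N:
            correction += running / n
    return running - correction

def phi_zeta(x):
    """phi(x) for a_n = 1: N (1 - x) + log N!, where N = floor(e^x)."""
    N = math.floor(math.exp(x))
    return N * (1.0 - x) + math.lgamma(N + 1)

d_1000 = d_sequence(1000, lambda n: 1.0)
phi_at_0 = phi_zeta(0.0)
phi_at_10 = phi_zeta(10.0)
```

For $a_n = 1$ every term $A(n)/n$ equals 1, which is why $D_N = 1$ for all $N$; the value of $\phi$ at $x = 10$ remains of moderate size although $N$ exceeds $2 \cdot 10^4$ there.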

The function $\zeta(z)(z-1)/z^2$ was also considered by Burnol [4] in connection with a closure problem in function space known as the Nyman-Beurling real variable form of the Riemann hypothesis.

It may be interesting to note here that although $h_\sigma$ is square integrable for every $\sigma > 0$, it is not true that $h_{\sigma,N} \to h_\sigma$ in $L^2$ if $\sigma \le 1$. In fact, we have

$$ \liminf_{N \to \infty} \|h_{\sigma,N} - h_\sigma\|_2 > 0, \quad 0 < \sigma \le 1. \tag{7} $$

Proof. Note first that for $x \ge \log N \to \infty$

$$ \begin{aligned} \phi(x) - \phi_N(x) &= \sum_{N < n \le e^x} (1 - x + \log n) = (1 - x)([e^x] - N) + \log [e^x]! - \log N! \\ &= (1 - x)([e^x] - N) + ([e^x] + \tfrac{1}{2}) \log [e^x] - [e^x] - (N + \tfrac{1}{2}) \log N + N + O(1) \\ &= (N + \tfrac{1}{2})(\log [e^x] - \log N) + ([e^x] - N)(\log [e^x] - x) + O(1) \\ &= (N + \tfrac{1}{2})(x - \log N) + O(1), \end{aligned} \tag{8} $$

on using Stirling's formula and the inequalities $0 \le x - \log [e^x] < 2e^{-x}$ ($x \ge 0$). The estimate (8) shows that there exists a finite constant $B > 0$ such that


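$\phi(x) - \phi_N(x) \ge N(x - \log N)$ for all large $N$ and $x \ge B + \log N$. Therefore

$$ \|h_{\sigma,N} - h_\sigma\|_2^2 \ge \int_{B + \log N}^{\infty} (\phi(x) - \phi_N(x))^2 e^{-2\sigma x}\, dx \ge N^2 \int_{B + \log N}^{\infty} (x - \log N)^2 e^{-2\sigma x}\, dx = N^{2 - 2\sigma} \int_B^{\infty} y^2 e^{-2\sigma y}\, dy $$

for all large $N$, and assertion (7) follows.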

Example 2. Let $f(z) = \sum_p p^{-z} \log p$, where the sum extends over all prime numbers. This example is related to the logarithmic derivative of the zeta function, as may be seen from the product representation $\zeta(z) = \prod_p (1 - p^{-z})^{-1}$. For $\Re z > 1$,

$$ -\frac{\zeta'(z)}{\zeta(z)} = \sum_p \frac{\log p}{p^z - 1} = \sum_p \frac{\log p}{p^z} + \sum_p \frac{\log p}{p^z (p^z - 1)}, $$

and since the last series converges for $\Re z > 1/2$, it suffices to consider $f(z)$ as far as the analytic continuation of $\zeta'(z)/\zeta(z)$ is concerned.

The series $f(z)$ had convergence abscissa 1/2, implying the RH, if the associated sequence $D_N$ satisfied condition (2) with $\lambda = 1/2$. For a numerical check we computed $D_N$ for $N$ up to 5 million. A plot of $\log^{+}|D_N| / \log N$ versus $\log N$ (thinned out to every 200th data point; the general picture is not affected thereby) is shown in Figure 1 (a). Within the considered range, the observed behavior is well in accordance with a possible value of $\lambda = 1/2$. Notice the obvious connection with the classical criterion saying that the RH is equivalent to the error estimate $\sum_{p \le x} \log p - x = O(x^{1/2 + \epsilon})$ ($\forall\, \epsilon > 0$) in the prime number theorem (Edwards [6], Sect. 5.5). Incidentally, $\phi(x)$ seems to be nonnegative in this case, too, as a plot of $\phi(x)$ for small $x$-values indicates.
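A scaled-down version of such a numerical check can be sketched as follows. The code evaluates $\log^{+}|D_N| / \log N$ for the coefficients $a_n = \log p$ if $n = p$ is prime and $a_n = 0$ otherwise, using a simple sieve. This is not the program used for Figure 1: the function names and the small range $N = 10^4$ are assumptions made for the illustration.

```python
import math

def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] == 1 exactly when n is prime."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return flags

def growth_criterion(limit):
    """Return log+|D_N| / log N at N = limit for the prime series of Example 2."""
    is_prime = prime_flags(limit)
    running, correction = 0.0, 0.0
    for n in range(1, limit + 1):
        if is_prime[n]:
            running += math.log(n)      # running = A(n) = sum of log p, p <= n
        if n < limit:
            correction += running / n
    d_n = running - correction
    log_plus = math.log(abs(d_n)) if abs(d_n) > 1.0 else 0.0
    return log_plus / math.log(limit)

ratio = growth_criterion(10_000)
```

A single value at small $N$ says nothing about the limit superior in (2); the sketch only shows how the plotted criterion is formed from $A(n)$.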

Example 3. Let $f(z) = 1/\zeta(z) = \sum_{n=1}^{\infty} \mu(n) n^{-z}$, with $\mu$ the Möbius function. It is well-known that the RH is equivalent to the condition $A(N) = \sum_{n \le N} \mu(n) = O(N^{1/2 + \epsilon})$ (for every $\epsilon > 0$), that is, to $\lambda = 1/2$. The analogous plot for this case is shown in Figure 1 (b), with similar findings.
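The summatory function $A(N) = \sum_{n \le N} \mu(n)$ of Example 3 can be computed with a sieve for the Möbius function. The sketch below is an illustration under our own naming, not the code behind Figure 1 (b).

```python
def mobius_upto(limit):
    """Return a list mu with mu[n] the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)              # mu[0] is a placeholder and is not used
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more distinct prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # m is divisible by a square
    return mu

def summatory(N, mu):
    """A(N) = sum of mu(n) over 1 <= n <= N."""
    return sum(mu[1 : N + 1])

mu = mobius_upto(1000)
a_10 = summatory(10, mu)
a_100 = summatory(100, mu)
```

The same running sums can be fed into definition (2) to form $D_N$ and the criterion $\log^{+}|D_N| / \log N$.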

3 Factorization of $\eta$

From now on we shall restrict attention to the case $f = \zeta$. For brevity we write $\eta(z) = \zeta(z)(z-1)/z^2$ throughout the sequel. Recall from the previous section that $\eta$ belongs to every Hardy space $H^2_\sigma$, $\sigma > 0$. Being a Hardy function, $\eta$ admits a useful factorization, some applications of which will be discussed in the next section. The zeros of $\eta$ in the right half plane $\Re z > 0$, which coincide with the non-trivial zeros of the zeta function, are generically denoted by $\rho$. The $\rho$'s are known to lie symmetrically with respect to both the real axis and the critical line $\Re z = 1/2$. That is, whenever $\rho$ is a zero, then so are the mirror images $\bar{\rho}$, $1 - \bar{\rho}$, and $1 - \rho$.

Figure 1: Convergence abscissa of Laplace transform equal to 1/2. Plot of criterion $\log^{+}|D_N| / \log N$ versus $\log N$ for (a) Example 2, (b) Example 3.

Let $\sigma > 0$ be fixed. According to the factorization theorem for Hardy functions (see e.g. Dym and McKean [5] (ch. 2.7) or Hoffman [8] (p. 132, 133)), $\eta$ can be represented as the product of an outer and an inner function on the half plane $\Re z > \sigma$. More precisely,

$$ \eta(z) = H_\sigma(z) B_\sigma(z) \quad (\Re z > \sigma), \tag{9} $$

where the outer function is given by

$$ H_\sigma(z) = \exp\left\{ \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{t(z - \sigma) + i}{t + i(z - \sigma)}\, \frac{dt}{1 + t^2} \right\}, \tag{10} $$

and the inner function reduces in the present case to a Blaschke product $B_\sigma$ which is composed of the zeros $\rho$ of $\eta$ with $\Re \rho > \sigma$ and their mirror images after reflection at the line $\Re z = \sigma$, $2\sigma - \bar{\rho}$. Explicitly,

$$ B_\sigma(z) = \prod_{\Re \rho > \sigma} \frac{z - \rho}{z - (2\sigma - \bar{\rho})}. \tag{11} $$

These formulae are easily obtained from the familiar ones for the half plane $\Re z > 0$ [8] by shifting both the complex variable and the zeros by $\sigma$. The inner


factor simplifies to a Blaschke product for the following reasons: (i) $\eta$ has an analytic continuation across the line $\Re z = \sigma$ to the entire right half plane, so that there is no singular factor; (ii) the constant $c$ appearing in the general factorization formula reduces to unity because $B_\sigma(\sigma) = 1$ and $H_\sigma(\sigma) = \eta(\sigma)$, as is readily verified. For real arguments $z = s$, taking first logarithms, then real parts on both sides of (9), one obtains for $s > \sigma > 0$

$$ \log \eta(s) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{s - \sigma}{(s - \sigma)^2 + t^2}\, dt + \sum_{\Re \rho > \sigma} \log\left|\frac{s - \rho}{s - (2\sigma - \bar{\rho})}\right|. \tag{12} $$

Note that $\eta(s)$ is positive for $s > 0$, being the Laplace transform of a nonnegative function.

4 Applications

The factorization of $\eta$ gives rise to various tests of the RH. A first example is obtained by setting $\sigma = 1/2$ in (12). The sum on the right-hand side of (12) vanishes if and only if $\zeta(z)$ has no zero within the region $\Re z > 1/2$. Therefore, the RH is true if and only if for some (and then for all) $s > 1/2$,

$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\tfrac{1}{2} + it)|\, \frac{s - \tfrac{1}{2}}{(s - \tfrac{1}{2})^2 + t^2}\, dt = \log \eta(s). \tag{13} $$

This criterion is equivalent to the condition that $\eta(z)$ be an outer function for the half plane $\Re z > 1/2$, cf. Dym and McKean [5], Sect. 2.7. For $s = 1$ it assumes a particularly neat form. The right-hand side vanishes, and the left-hand side can be simplified, and one gets the following criterion for the truth of the RH due to Balazard, Saias, and Yor [1]:

$$ \int_{-\infty}^{\infty} \frac{\log|\zeta(\tfrac{1}{2} + it)|}{\tfrac{1}{4} + t^2}\, dt = 0. \tag{14} $$
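One step behind the simplification at $s = 1$ can be checked numerically. Since $|\eta(\tfrac{1}{2} + it)| = |\zeta(\tfrac{1}{2} + it)| / |\tfrac{1}{2} + it|$, the left-hand side reduces to an integral over $\log|\zeta|$ alone provided that $\int_{-\infty}^{\infty} \log|\tfrac{1}{2} + it| \, (\tfrac{1}{4} + t^2)^{-1} dt = 0$. The sketch below evaluates this integral after the substitution $t = \tfrac{1}{2}\tan\theta$, which is our own choice of method; the function name and step count are assumptions.

```python
import math

def log_modulus_integral(steps=200_000):
    """Midpoint rule for the integral of log|1/2 + it| / (1/4 + t^2) over the real line.

    With t = tan(theta)/2 one has dt / (1/4 + t^2) = 2 d(theta) and
    log|1/2 + it| = -log(2 cos(theta)), for theta in (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * h
        total += -2.0 * math.log(2.0 * math.cos(theta))
    return h * total

value = log_modulus_integral()
```

The integrand has integrable logarithmic singularities at the end points, so the midpoint rule converges, although only at first order in the step size.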

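Another example results from the formula

$$ (\log \eta)'(\sigma) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)| / \eta(\sigma)\right] \frac{dt}{t^2} - 2 \sum_{\Re \rho > \sigma} \Re\, (\rho - \sigma)^{-1} \tag{15} $$

($\sigma > 0$), which can be derived from (12) by subtracting $\log \eta(\sigma)$ on both sides, dividing by $s - \sigma$, and then taking the limit $s \downarrow \sigma$. The interchange of limits and integration (or summation) can be justified by dominated convergence.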


Putting $\sigma = 1/2$ in (15), one obtains the following differential version of the integral tests (13), (14): The RH is true if and only if

$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2})\right] \frac{dt}{t^2} = (\log \eta)'(\tfrac{1}{2}). \tag{16} $$

This statement can be amplified in various ways. First, it is possible to evaluate $(\log \eta)'(\tfrac{1}{2})$ explicitly: $(\log \eta)'(\tfrac{1}{2}) = \tfrac{\gamma}{2} + \tfrac{1}{2} \log(8\pi) + \tfrac{\pi}{4} - 6$, and for $\sigma = 1/2$ the sum in (15) can be written in a more symmetric form. One thus obtains the relation

$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left|\frac{\eta(\tfrac{1}{2} + it)}{\eta(\tfrac{1}{2})}\right| \frac{dt}{t^2} - \left(\frac{\gamma}{2} + \frac{1}{2} \log(8\pi) + \frac{\pi}{4} - 6\right) = \sum_{\rho} \frac{|\Re \rho - \tfrac{1}{2}|}{|\rho - \tfrac{1}{2}|^2}, \tag{17} $$

in which the sum extends over all zeros in the critical strip. Note that (17) quantifies the difference between the two sides of (16) as a weighted sum of the absolute deviations of the real parts of the zeros from 1/2.
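The constant in this relation is a short piece of arithmetic. From $\eta(z) = \zeta(z)(z-1)/z^2$ one has $(\log \eta)'(z) = \zeta'(z)/\zeta(z) + 1/(z-1) - 2/z$, and at $z = 1/2$ the last two terms contribute $-2 - 4 = -6$. The sketch below evaluates the sum, assuming the known closed form $\zeta'(\tfrac{1}{2})/\zeta(\tfrac{1}{2}) = \tfrac{\gamma}{2} + \tfrac{1}{2}\log(8\pi) + \tfrac{\pi}{4}$; the constant and function names are ours.

```python
import math

EULER_GAMMA = 0.5772156649015329    # Euler's constant, hard-coded here

def log_eta_derivative_at_half():
    """(log eta)'(1/2) = zeta'(1/2)/zeta(1/2) + 1/(1/2 - 1) - 2/(1/2)."""
    zeta_part = 0.5 * EULER_GAMMA + 0.5 * math.log(8.0 * math.pi) + 0.25 * math.pi
    pole_part = 1.0 / (0.5 - 1.0)   # from the factor (z - 1): equals -2
    square_part = -2.0 / 0.5        # from the factor 1/z^2: equals -4
    return zeta_part + pole_part + square_part

derivative = log_eta_derivative_at_half()
```

The result is negative, in line with the later remark that $-(\log \eta)'(\sigma)$ is the first moment of a probability density on $[0, \infty)$.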

Secondly, there is a connection with logarithmic Hilbert transforms, also called logarithmic dispersion relations [3]. Suppose we had $\eta(z) \neq 0$ for $\Re z > 1/2$. Then $\eta$ itself would be an outer function, $\eta(z) = H_{1/2}(z)$, with $H_{1/2}$ given by (10).

Taking imaginary parts in this equation, one can show with a little algebra that for $z - 1/2 = a + ib$, $a > 0$, one then has

$$ \Im \log \eta(z) = \frac{1}{\pi} \int_{-\infty}^{\infty} \left(\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|\right) \frac{dt}{t - b} - \frac{a}{\pi} \int_{-\infty}^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t - b}\, \frac{a\, dt}{a^2 + (t - b)^2}. \tag{18} $$

Fix any $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$. Then the last term in (18) converges to zero as $a \downarrow 0$. Therefore, using the fact that $|\eta(\tfrac{1}{2} + it)|$ is an even function of $t$, one obtains in the limit the logarithmic dispersion relation

$$ \Im \log \eta(\tfrac{1}{2} + ib) = \frac{2b}{\pi} \int_0^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t^2 - b^2}\, dt, \tag{19} $$

which expresses the phase of $\eta$ on the boundary $\Re z = 1/2$ as an integral of its log modulus along that line. Recall that this relation is a consequence of the


assumed outer function character of $\eta$, that is, of the RH. In fact, the validity of (19) for every $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$ is also sufficient for the RH. To see this, divide both sides of (19) by $b$ and let $b \downarrow 0$. Then the left side tends to $(\log \eta)'(\tfrac{1}{2})$, the right side to $\frac{2}{\pi} \int_0^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2})\right] \frac{dt}{t^2}$, so in the limit we get the condition (16) shown above to be equivalent to the RH.

Finally, we note that $-(\log \eta)'(\sigma)$ equals the first moment of the probability density $p_\sigma$, cf. (6). In view of (16) and (15) this raises the question whether the integral term in these relations admits of a probabilistic interpretation, too. Relevant to this question is the observation, going back to Khintchine, that for every $\sigma > 1$ the function $f_\sigma(t) = \zeta(\sigma + it)/\zeta(\sigma)$ is the characteristic function of an infinitely divisible distribution, cf. Example 6, p. 75 in Gnedenko and Kolmogorov [7]. This can be verified by rewriting the product representation of the zeta function (for $\sigma > 1$) in the form

$$ \frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \prod_p \frac{1 - p^{-\sigma}}{1 - p^{-\sigma - it}} = \exp\left\{ \sum_p \sum_{n=1}^{\infty} \frac{p^{-n\sigma}}{n} \left(e^{-itn \log p} - 1\right) \right\} \tag{20} $$

and noting that $f_\sigma(t)$ is thus represented as a product of terms of the form $\exp(a(e^{ibt} - 1))$, each of which is the characteristic function of a Poisson random variable with intensity $a$ and values in the lattice $kb$, $k = 0, 1, 2, \ldots$
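The identity behind (20) holds prime by prime, which makes it easy to verify numerically. The sketch below compares a single Euler factor of $\zeta(\sigma + it)/\zeta(\sigma)$ with the exponential of the corresponding truncated sum. The function names, the truncation at 200 terms, and the sample values of $p$, $\sigma$, $t$ are assumptions made for the illustration.

```python
import cmath
import math

def euler_factor_ratio(p, sigma, t):
    """(1 - p^(-sigma)) / (1 - p^(-sigma - it)) for a single prime p."""
    return (1.0 - p ** (-sigma)) / (1.0 - cmath.exp(-(sigma + 1j * t) * math.log(p)))

def compound_poisson_factor(p, sigma, t, terms=200):
    """exp( sum_{n=1}^{terms} p^(-n sigma)/n * (exp(-i t n log p) - 1) )."""
    log_p = math.log(p)
    exponent = sum(
        p ** (-n * sigma) / n * (cmath.exp(-1j * t * n * log_p) - 1.0)
        for n in range(1, terms + 1)
    )
    return cmath.exp(exponent)

lhs = euler_factor_ratio(2, 1.5, 0.7)
rhs = compound_poisson_factor(2, 1.5, 0.7)
```

Each term of the inner sum is of the form $a(e^{ibt} - 1)$ with $a = p^{-n\sigma}/n$ and $b = -n \log p$, which is the Poisson structure referred to in the text.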

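In order to connect this fact with the above question, it is convenient to introduce the Lévy measure $F_\sigma$ which puts mass $(n p^{n\sigma})^{-1}$ at each of the points $-n \log p$, $n \ge 1$, $p$ prime. Then (20) becomes $\log\left[\zeta(\sigma + it)/\zeta(\sigma)\right] = \int (e^{itx} - 1)\, F_\sigma(dx)$, so taking real parts in this equation and using $\int_{-\infty}^{\infty} (1 - \cos tx)/t^2\, dt = \pi |x|$ ($x \in \mathbb{R}$), one obtains

$$ \int_{-\infty}^{\infty} \log\left[|\zeta(\sigma + it)| / \zeta(\sigma)\right] \frac{dt}{\pi t^2} = \int_{-\infty}^{\infty} \int (\cos tx - 1)\, F_\sigma(dx)\, \frac{dt}{\pi t^2} $$

$$ = \int \int_{-\infty}^{\infty} (\cos tx - 1)\, \frac{dt}{\pi t^2}\, F_\sigma(dx) = -\int |x|\, F_\sigma(dx) = \int x\, F_\sigma(dx). $$

Thus we find that the essential part of the integral in question equals the first moment of the Lévy measure $F_\sigma$. The other part, stemming from the factor $(z-1)/z^2$, can be incorporated by introducing a signed absolutely continuous measure $G_\sigma$ with density $x^{-1}\left[2e^{\sigma x} - e^{(\sigma - 1)x}\right]$ on $(-\infty, 0)$ (zero on $[0, \infty)$). One then has

$$ \log\left[\eta(\sigma + it) / \eta(\sigma)\right] = \int (e^{itx} - 1)\, (F_\sigma - G_\sigma)(dx) $$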


and hence

$$ \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)| / \eta(\sigma)\right] \frac{dt}{\pi t^2} = \int x\, (F_\sigma - G_\sigma)(dx) \quad (\sigma > 1). $$

These calculations give a more detailed picture of the way the factor $(z-1)/z^2$ regularizes the zeta function: as $\sigma \downarrow 1$ it compensates the flow of mass of $F_\sigma$ towards $-\infty$ by the subtraction of measures $G_\sigma$ such that the first moment of $F_\sigma - G_\sigma$ remains bounded. Evidently, other ways of renormalizing the Lévy measure as $\sigma \downarrow 1$ are also conceivable and may be interesting to explore.

References

1. M. Balazard, E. Saias, and M. Yor, Adv. Math. 143, 284 (1999).
2. M.V. Berry and J.P. Keating, SIAM Review 41, 236 (1999).
3. R.E. Burge, M.A. Fiddy, A.H. Greenaway, and G. Ross, Proc. R. Soc. London A 350, 191 (1976).
4. J.-F. Burnol, <http://arXiv.org/abs/math/0001013> (2000).
5. H. Dym and H.P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem (Academic Press, New York, 1976).
6. H.M. Edwards, The Theory of the Riemann Zeta Function (Academic Press, New York, 1974).
7. B.V. Gnedenko and A.N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, 1954).
8. K. Hoffman, Banach Spaces of Analytic Functions (Dover, New York, 1988).


ENSEMBLE PROBABILISTIC EQUILIBRIUM AND NON-EQUILIBRIUM THERMODYNAMICS WITHOUT THE THERMODYNAMICAL LIMIT

D. H. E. GROSS

Hahn-Meitner-Institut Berlin, Bereich Theoretische Physik, Glienickerstr. 100, 14109 Berlin, Germany, and Freie Universität Berlin, Fachbereich Physik
Email: gross@hmi.de

Boltzmann's principle $S = k \ln W$ allows one to extend equilibrium thermo-statistics to "Small" systems without invoking the thermodynamic limit [2,3]. As the limit hides more than it clarifies the origin of phase transitions, a deeper and more transparent understanding is thus possible. The main clue is to base statistical probability on ensemble averaging and not on time averaging. It is argued that due to the incomplete information obtained by macroscopic measurements, thermodynamics handles ensembles or finite-sized sub-manifolds in phase space and not single time-dependent trajectories. Therefore, ensemble averages are the natural objects of statistical probabilities. This is the physical origin of coarse-graining, which is no longer a mathematical ad hoc assumption. The probabilities $P(M)$ of macroscopic measurements $M$ are given by the ratio $P(M) = W(M)/W$ of the volumes of the sub-manifold $\mathcal{M}$ of the microcanonical ensemble with the constraint $M$ to the one without. From this concept all equilibrium thermodynamics can be deduced quite naturally, including the most sophisticated phenomena of phase transitions for "Small" systems.

Boltzmann's principle is generalized to non-equilibrium Hamiltonian systems with possibly fractal distributions $\mathcal{M}$ in $6N$-dim. phase space by replacing the conventional Riemann integral for the volume in phase space by its corresponding box-counting volume. This is equal to the volume of the closure $\overline{\mathcal{M}}$. With this extension the Second Law is derived without invoking the thermodynamic limit. The irreversibility in this approach is due to the replacement of the phase-space volume of the fractal sub-manifold $\mathcal{M}$ by the volume of its closure $\overline{\mathcal{M}}$. The physical reason for this replacement is that macroscopic measurements cannot distinguish $\mathcal{M}$ from $\overline{\mathcal{M}}$. Whereas the former is not changing in time due to Liouville's theorem, the volume of the closure can be larger. In contrast to conventional coarse graining, the box-counting volume is defined in the limit of infinite resolution, i.e. there is no artificial loss of information.

1 Introduction

Recently the interest in the thermo-statistical behavior of non-extensive many-body systems like atomic nuclei, atomic clusters, soft-matter, biological systems, and also self-gravitating astro-physical systems has led to the consideration of thermo-statistics without using the thermodynamic limit. This is most safely done by going back to Boltzmann. Einstein considers Boltzmann's definition of entropy, as e.g. written on his famous epitaph

$$ S = k \cdot \ln W, \tag{1} $$

as Boltzmann's principle [4], from which Boltzmann was able to deduce thermodynamics. Here $W$ is the number of micro-states at given energy $E$ of the $N$-body system in the spatial volume $V$:

$$ W(E, N, V) = \mathrm{tr}\left[\epsilon_0\, \delta(E - H_N)\right], \tag{2} $$

$$ \mathrm{tr}\left[\delta(E - H_N)\right] = \int \frac{d^{3N}p\; d^{3N}q}{N!\, (2\pi\hbar)^{3N}}\, \delta(E - H_N). \tag{3} $$

$\epsilon_0$ is a suitable energy constant to make $W$ dimensionless, $H_N$ is the $N$-particle Hamilton-function, and the $N$ positions $q$ are restricted to the volume $V$, whereas the momenta $p$ are unrestricted. In what follows, we remain on the level of classical mechanics. The only reminders of the underlying quantum mechanics are the measure of the phase space in units of $2\pi\hbar$ and the factor $1/N!$ which respects the indistinguishability of the particles (Gibbs paradoxon). In contrast to Boltzmann [5,6], who used the principle only for dilute gases, and to Schrödinger [7], who thought equation (1) is useless otherwise, I take the principle as the fundamental, generic definition of entropy. In the following sections I will demonstrate that this definition of thermo-statistics works well, especially also at higher densities and at phase transitions, without invoking the thermodynamic limit.
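To make the counting behind Eq. (1) and the ratio $P(M) = W(M)/W$ concrete, the following toy sketch replaces the classical phase-space volume by a count of discrete configurations. This model is not taken from the text: it assumes a system of $2n$ two-state sites carrying a fixed number of excitations (playing the role of the sharp energy), and a macroscopic constraint $M$ fixing how many excitations sit in the first half. All names are hypothetical.

```python
import math

def entropy(W, k=1.0):
    """Boltzmann's principle S = k ln W (k set to 1 by default)."""
    return k * math.log(W)

def microstates(n_half, n_up):
    """W: number of ways to place n_up excitations on 2 * n_half sites."""
    return math.comb(2 * n_half, n_up)

def constrained_probability(n_half, n_up, m):
    """P(M) = W(M) / W for the constraint M: exactly m excitations in the first half."""
    w_constrained = math.comb(n_half, m) * math.comb(n_half, n_up - m)
    return w_constrained / microstates(n_half, n_up)

n_half, n_up = 10, 8
probabilities = [constrained_probability(n_half, n_up, m) for m in range(n_up + 1)]
total_entropy = entropy(microstates(n_half, n_up))
```

The probabilities are obtained purely from ratios of counts, with no time averaging and no limit of large system size, which is the point of the ensemble definition in the text; the discrete model itself is only a stand-in for the continuous phase space of Eqs. (2) and (3).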

2 There is a lot to add to classical equilibrium statistics from our experience with Small systems

Following Lieb [8], extensivity (a) and the existence of the thermodynamic limit $N \to \infty|_{N/V = \mathrm{const}}$ are essential conditions for conventional (canonical) thermodynamics to apply. Certainly, this implies also the homogeneity of the system. Phase transitions are somehow foreign to this. The essence of first order transitions is that the systems become inhomogeneous and split into different phases separated by interfaces. In the conventional Yang-Lee theory, phase transitions are represented by the positive zeros of the grand-canonical partition sum, where the grand-canonical formalism breaks down (Yang-Lee singularities). In the following we show that the micro-canonical ensemble gives much more detailed and more natural insight, which corresponds to the experimental identification of phase transitions.

(a) Dividing extensive systems into larger pieces, the total energy and entropy are equal to the sum of those of the pieces.

There is a whole group of physical many-body systems, called "Small" in the following, which cannot be addressed by conventional thermo-statistics:

• nuclei

• atomic clusters

• polymers

• soft matter (biological) systems

• astrophysical systems

• first order transitions, which are distinguished from continuous transitions by the appearance of phase separations and interfaces with surface tension. If the range of the force or the thickness of the surface layers is such that the number of surface particles is not negligible compared to the total number of particles, these systems are non-extensive.

For such systems the thermodynamic limit does not exist or makes no sense. Either the range of the forces (Coulomb, gravitation) is of the order of the linear dimensions of these systems, and/or they are strongly inhomogeneous, e.g. at phase separation.

Boltzmann's principle does not invoke the thermodynamic limit, nor additivity, nor extensivity, nor concavity of the entropy $S(E,N)$ (downwards bending). This has largely been forgotten for a hundred years. We have to go back to pre-Gibbsian times. It is a purely geometrical definition of the entropy and applies as well to Small systems. Moreover, the entropy $S(E,N)$ as defined above is everywhere single-valued and multiply differentiable. There are no singularities in it. This is the simplest access to equilibrium statistics [9]. We will explore its consequences in this contribution. Moreover, we will see that in this way we simultaneously get the complete information about the three crucial parameters characterizing a phase transition of first order: the transition temperature $T_{tr}$, the latent heat per atom $q_{lat}$, and the surface tension $\sigma_{surf}$. Boltzmann's famous epitaph above (eq. 1) contains everything that can be said about equilibrium thermodynamics in its most condensed form. $W$ is the volume of the sub-manifold at sharp energy in the $6N$-dim. phase space.

3 Relation of the topology of S(E,N) to the Yang-Lee zeros of Z(T,μ,V)

In conventional thermo-statistics, phase transitions are indicated by zeros of the grand-canonical partition function $Z(T,\mu,V)$, where $V$ is the volume. See refs. [1,2,3,10] for more details.

$$ Z(T,\mu,V) = \iint_0^\infty \frac{dE}{\epsilon_0}\, dN\, e^{-[E - \mu N - T S(E,N)]/T} = \frac{V^2}{\epsilon_0} \iint_0^\infty de\, dn\, e^{-V[e - \mu n - T s(e,n)]/T} \approx e^{\mathrm{const.} + \mathrm{lin.} + \mathrm{quadr.}} \tag{4} $$

in the thermodynamic limit $V \to \infty|_{N/V = \mathrm{const}}$. The double Laplace integral (4) can be evaluated asymptotically for large $V$ by expanding the exponent, as indicated in the last expression, to second order in $\Delta e$, $\Delta n$ around the stationary point $e_s, n_s$, where the linear term vanishes:

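$$ \frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_s, \qquad \frac{\mu}{T} = -\left.\frac{\partial S}{\partial N}\right|_s, \qquad \frac{P}{T} = \left.\frac{\partial S}{\partial V}\right|_s \tag{5} $$

The only term remaining to be integrated is the quadratic one. If the two eigen-curvatures satisfy $\lambda_1 < 0$, $\lambda_2 < 0$, this is then a Gaussian integral and yields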

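$$ Z(T,\mu,V) = \frac{V^2}{\epsilon_0}\, e^{-V[e_s - \mu n_s - T s(e_s,n_s)]/T} \iint_{-\infty}^{\infty} dv_1\, dv_2\, e^{V[\lambda_1 v_1^2 + \lambda_2 v_2^2]/2} \tag{6} $$

$$ Z(T,\mu,V) = e^{-F(T,\mu,V)/T} \tag{7} $$

$$ \frac{F(T,\mu,V)}{V} = -\frac{T \ln Z(T,\mu,V)}{V} \to e_s - \mu n_s - T s_s + \frac{T \ln\left(\sqrt{\det(e_s,n_s)}\right)}{V} + o\!\left(\frac{\ln V}{V}\right) \tag{8} $$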

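Here $\det(e_s,n_s)$ is the determinant of the curvatures of $s(e,n)$, and $v_1, v_2$ are the eigenvectors of the curvature matrix with determinant $d$:

$$ d(e,n) = \det(e,n) = \begin{vmatrix} \frac{\partial^2 s}{\partial e^2} & \frac{\partial^2 s}{\partial n\, \partial e} \\ \frac{\partial^2 s}{\partial e\, \partial n} & \frac{\partial^2 s}{\partial n^2} \end{vmatrix} = \begin{vmatrix} s_{ee} & s_{en} \\ s_{ne} & s_{nn} \end{vmatrix} = \lambda_1 \lambda_2, \qquad \lambda_1 \ge \lambda_2 \tag{9} $$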
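The curvature determinant of eq. (9) is easy to evaluate numerically for any smooth $s(e,n)$. The following is a minimal sketch under stated assumptions: the two entropy functions are hypothetical toys (not the sodium or Potts data of this paper), derivatives are taken by central differences, and the function names are illustrative only.

```python
import math

def hessian(s, e, n, h=1e-4):
    """Second derivatives s_ee, s_en, s_nn of s(e, n) by central differences."""
    s_ee = (s(e + h, n) - 2 * s(e, n) + s(e - h, n)) / h**2
    s_nn = (s(e, n + h) - 2 * s(e, n) + s(e, n - h)) / h**2
    s_en = (s(e + h, n + h) - s(e + h, n - h)
            - s(e - h, n + h) + s(e - h, n - h)) / (4 * h**2)
    return s_ee, s_en, s_nn

def eigen_curvatures(s_ee, s_en, s_nn):
    """Eigenvalues (lambda1 >= lambda2) of the symmetric 2x2 curvature matrix."""
    mean = 0.5 * (s_ee + s_nn)
    radius = math.hypot(0.5 * (s_ee - s_nn), s_en)
    return mean + radius, mean - radius

def concave_toy(e, n):
    # Hypothetical everywhere-concave entropy: a single stable phase
    return math.log(e) + math.log(n)

def intruder_toy(e, n):
    # Same toy plus a term that makes s convex in e for e > 1 (a "convex intruder")
    return math.log(e) + math.log(n) + 0.5 * e * e

for s in (concave_toy, intruder_toy):
    lam1, lam2 = eigen_curvatures(*hessian(s, 2.0, 0.5))
    print(s.__name__, "lambda1 =", round(lam1, 4), "det =", round(lam1 * lam2, 4))
```

In this toy a negative determinant (with $\lambda_2 < 0 < \lambda_1$) marks the region where the Gaussian integral (6) fails to converge.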

Figure 1: MMMC simulation of the entropy $s(e)$ per atom ($e$ in eV per atom) of a system of $N_0 = 1000$ sodium atoms with realistic interaction at an external pressure of 1 atm. At the energy per atom $e_1$ the system is in the pure liquid phase and at $e_3$ in the pure gas phase, of course with fluctuations. The latent heat per atom is $q_{lat} = e_3 - e_1$. Attention: the curve $s(e)$ is artificially sheared by subtracting a linear function $25 + e \cdot 11.5$ in order to make the convex intruder visible. $s(e)$ is always a steeply monotonic rising function. We clearly see the global concave (downwards bending) nature of $s(e)$ and its convex intruder. Its depth is the entropy loss due to the additional correlations by the interfaces. From this one can calculate the surface tension per surface atom, $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$. The double tangent is the concave hull of $s(e)$. Its derivative gives the Maxwell line in the caloric curve $T(e)$ at $T_{tr}$. In the thermodynamic limit the intruder would disappear and $s(e)$ would approach the double tangent (Maxwell line) from below.

In the cases studied here $\lambda_2 < 0$, but $\lambda_1$ can be positive or negative. If $\det(e_s,n_s)$ is positive ($\lambda_1 < 0$), the last two terms in eq. (8) go to 0 and we obtain the familiar result $f(T,\mu,V \to \infty) = e_s - \mu n_s - T s_s$. I.e., the curvature $\lambda_1$ of the entropy surface $s(e,n,V)$ decides whether the grand-canonical ensemble agrees with the fundamental micro ensemble in the thermodynamic limit. If this is the case, $\ln[Z(T,\mu)]$ or $f(T,\mu)$ is analytical in $e^{\beta\mu}$ and, due to Yang and Lee, we have a single stable phase. Otherwise, the Yang-Lee zeros reflect anomalous points/regions of $\lambda_1 \ge 0$ ($\det(e,n) \le 0$). This is crucial. As $\det(e_s,n_s)$ can be studied for finite or even small systems as well, this is the only proper extension of phase transitions to Small systems.
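The double-tangent (concave hull) construction described for Figure 1 can be sketched numerically. The curve below is a hypothetical stand-in for an already sheared entropy, chosen only because its hull is known exactly; it is not the MMMC sodium data, and the names are illustrative. The sketch finds the two tangent points $e_1$, $e_3$, the "latent heat" $e_3 - e_1$, and the depth of the convex intruder.

```python
def sheared_entropy(e):
    # Hypothetical toy curve: two maxima at e = -1 and e = +1, convex dip between them
    return -(e * e - 1.0) ** 2

def upper_hull(points):
    """Concave hull (upper convex hull) of points sorted by increasing x."""
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # hull[-1] lies on or below the chord hull[-2] -> p, so it is not on the hull
            if (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

points = [(i / 100, sheared_entropy(i / 100)) for i in range(-150, 151)]
hull = upper_hull(points)

# The double tangent is the longest hull segment; it bridges the convex intruder.
(e1, s1), (e3, s3) = max(zip(hull, hull[1:]), key=lambda seg: seg[1][0] - seg[0][0])

def tangent(e):
    """Value of the double tangent (Maxwell construction) at energy e."""
    return s1 + (s3 - s1) * (e - e1) / (e3 - e1)

# Depth of the intruder: largest gap between the double tangent and the curve.
depth = max(tangent(e) - s for e, s in points if e1 <= e <= e3)
latent = e3 - e1
print(e1, e3, latent, depth)
```

Shearing by a linear function, as done in Figure 1, shifts the hull and the curve by the same amount, so the tangent points and the depth are unaffected by it.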

4 The regions of positive curvature λ1 of s(e_s, n_s) correspond to phase transitions of first order

We will now discuss the physical origin of convex (upwards bending) intruders in the entropy surface in two examples.

In table 1 we compare the liquid-gas phase transition in sodium clusters of a few hundred atoms with that of the bulk at 1 atm, cf. also fig. 1.

Figure 2 shows how, for a small system (Potts $q = 3$ lattice gas with 50 × 50 points), all phenomena of phase transitions can be studied from the topology of the determinant of curvatures (9) in the micro-canonical ensemble.

Table 1: Parameters of the liquid-gas transition of small sodium clusters (MMMC calculation [1]) in comparison with the bulk for rising number $N_0$ of atoms. $N_{surf}$ is the average number of surface atoms of all clusters together.

Na               N0 = 200    N0 = 1000    N0 = 3000    bulk
T_tr [K]         940         990          1095         1156
q_lat [eV]       0.82        0.91         0.94         0.923
s_boil           10.1        10.7         9.9          9.267
Δs_surf          0.55        0.56         0.44
N_surf           39.94       98.53        186.6        ∞
σ/T_tr           2.75        5.68         7.07         7.41
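The entries of Table 1 can be cross-checked against each other. The sketch below uses the relation $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$ from the caption of Figure 1. It additionally assumes, as an interpretation not stated explicitly in the text, that $s_{boil}$ is the entropy of boiling per atom in units of $k$, i.e. $q_{lat}/(k\,T_{tr})$. Variable names are illustrative.

```python
K_B = 8.617333e-5  # Boltzmann constant in eV/K

# Rows of Table 1 for the three cluster sizes (values as printed in the table)
clusters = {
    200:  {"T_tr": 940.0,  "q_lat": 0.82, "s_boil": 10.1, "ds_surf": 0.55,
           "N_surf": 39.94, "sigma_over_T": 2.75},
    1000: {"T_tr": 990.0,  "q_lat": 0.91, "s_boil": 10.7, "ds_surf": 0.56,
           "N_surf": 98.53, "sigma_over_T": 5.68},
    3000: {"T_tr": 1095.0, "q_lat": 0.94, "s_boil": 9.9,  "ds_surf": 0.44,
           "N_surf": 186.6, "sigma_over_T": 7.07},
}

checks = {}
for n0, row in clusters.items():
    # Assumed reading of s_boil: latent heat over k*T_tr
    s_boil = row["q_lat"] / (K_B * row["T_tr"])
    # Surface tension per surface atom, from the caption of Figure 1
    sigma_over_T = row["ds_surf"] * n0 / row["N_surf"]
    checks[n0] = (s_boil, sigma_over_T)
    print(n0, round(s_boil, 2), row["s_boil"], round(sigma_over_T, 2), row["sigma_over_T"])

bulk_s_boil = 0.923 / (K_B * 1156.0)  # bulk column of the table
print("bulk", round(bulk_s_boil, 3))
```

With these assumptions the recomputed values agree with the tabulated ones to within the rounding of the printed digits.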

5 Boltzmann's principle and non-equilibrium thermodynamics

Before we proceed, we must comment on Einstein's attitude to the principle [11]. Originally, Boltzmann called $W$ the "Wahrscheinlichkeit" (probability), i.e. the relative time a system spends (along a time-dependent path) in a given region of the $6N$-dim. phase space. Our interpretation of $W$ as the number of "complexions" (Boltzmann's second interpretation) or quantum states (trace) with the same energy was criticized by Einstein [4] as artificial. It is exactly this criticized interpretation of $W$ which I use here and which works so excellently [1]. In section 7 I will come back to this fundamental point.

After succeeding in deducing equilibrium statistics, including all phenomena of phase transitions, from Boltzmann's principle even for Small systems, i.e. non-extensive many-body systems, it is challenging to explore how far this most conservative and restrictive way to thermodynamics [9] is also able to describe the approach of (possibly Small) systems to equilibrium and the Second Law of Thermodynamics.

Thermodynamics describes the development of macroscopic features of many-body systems without specifying them microscopically in all details. Before we address the Second Law, we have to clarify what we mean by the label "macroscopic observable".

6 Macroscopic observables imply the EPS-probability

Figure 2: Contour plot of the curvature determinant of the Potts-3 lattice gas. Dark grey line: $d = 0$, boundary of the region of phase coexistence, the triangle $A P_m B$. Light grey line: minimum of $d(e,n)$ in the direction of the largest curvature, second order transition. In the triangle $A P_m C$: ordered (solid) phase. Above and right of the line $C P_m B$: disordered (gas) phase. The crossing $P_m$ of the boundary lines is a multi-critical point. The light grey region around the multi-critical point $P_m$ corresponds to a flat region of $d(e,n) \approx 0$.

A single point $\{q_i(t), p_i(t)\}_{i=1 \cdots N}$ in the $N$-body phase space corresponds to a detailed specification of the system with all degrees of freedom (dof) completely fixed at time $t$ (microscopic determination). Fixing only the total energy $E$ of an $N$-body system leaves the other $(6N - 1)$ degrees of freedom unspecified. A second system with the same energy is most likely not in the same microscopic state as the first; it will be at another point in phase space, and the other dof will be different. I.e., the measurement of the total energy $H_N$, or of any other macroscopic observable $M$, determines a $(6N - 1)$-dimensional sub-manifold $\mathcal{E}$ or $\mathcal{M}$ in phase space. All points in the $N$-body phase space consistent with the given value of $E$ and volume $V$, i.e. all points in the $(6N - 1)$-dimensional sub-manifold $\mathcal{E}(N,V)$ of phase space, are equally consistent with this measurement. $\mathcal{E}(N,V)$ is the microcanonical ensemble. This example tells us that any macroscopic measurement is incomplete and defines a sub-manifold of points in phase space, not a single point. An additional measurement of another macroscopic quantity $B\{q,p\}$ reduces $\mathcal{E}$ further to the cross-section $\mathcal{E} \cap \mathcal{B}$, a $(6N - 2)$-dimensional subset of points in $\mathcal{E}$ with the volume

$$ W(B,E,N,V) = \frac{1}{N!} \int \left( \frac{d^3q\, d^3p}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N\{q,p\})\, \delta(B - B\{q,p\}) \tag{10} $$

If $H_N\{q,p\}$ as well as $B\{q,p\}$ are continuously differentiable functions of their arguments, which we assume in the following, $\mathcal{E} \cap \mathcal{B}$ is closed. In the following we use $W$ for the Riemann or Liouville volume of a manifold.

Microcanonical thermostatics gives the probability $P(B,E,N,V)$ to find the $N$-body system in the sub-manifold $\mathcal{E} \cap \mathcal{B}(E,N,V)$:

$$ P(B,E,N,V) = \frac{W(B,E,N,V)}{W(E,N,V)} = e^{\ln[W(B,E,N,V)] - S(E,N,V)} \tag{11} $$

This is what Krylov seems to have had in mind [12], and what I will call the "ensemble probabilistic formulation of statistical mechanics" (EPS).
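The ratio of volumes in eq. (11) can be made tangible with a discrete stand-in. The sketch below is an assumption-laden toy, not the classical phase-space integral of the paper: $N$ two-state sites with exactly $M$ excited play the role of the energy shell, and the macroscopic observable $B$ is the number of excited sites in the left half. Counting states then replaces the Liouville volume.

```python
from math import comb

def eps_probability(N, M):
    """P(B) = W(B,E) / W(E) for a toy 'energy shell' of N two-state sites, M excited.

    B is the number of excited sites in the left half (N is assumed even).
    """
    half = N // 2
    W_total = comb(N, M)                      # all micro-states with the given 'energy'
    probs = {}
    for B in range(max(0, M - half), min(half, M) + 1):
        W_B = comb(half, B) * comb(half, M - B)   # micro-states also compatible with B
        probs[B] = W_B / W_total
    return probs

small = eps_probability(4, 2)
large = eps_probability(200, 100)
print(small)
print(max(large, key=large.get), max(large.values()))
```

For the small system the whole distribution $P(B)$ matters; for larger $N$ it narrows around its most probable value, which is the regime where ordinary thermodynamics quotes only the mean.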

Similarly, thermodynamics describes the development in time of some macroscopic observable $B\{q_t,p_t\}$ of a system which was specified at an earlier time $t_0$ by another macroscopic measurement $A\{q_0,p_0\}$. It is related to the volume of the sub-manifold $\mathcal{M}(t) = \mathcal{A}(t_0) \cap \mathcal{B}(t) \cap \mathcal{E}$:

$$ W(A,B,E,t) = \frac{1}{N!} \int \left( \frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3} \right)^N \delta(B - B\{q_t,p_t\})\, \delta(A - A\{q_0,p_0\})\, \epsilon_0\, \delta(E - H\{q_t,p_t\}) \tag{12} $$

where $\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}$ is the set of trajectories solving Hamilton's equations of motion

$$ \dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1 \cdots N \tag{13} $$

with the initial conditions $q(t = t_0) = q_0$, $p(t = t_0) = p_0$. For a very large system with $N \sim 10^{23}$, the probability $P(B(t))$ to find a given value $B(t)$ is usually sharply peaked as a function of $B$. Ordinary thermodynamics treats systems in the thermodynamic limit $N \to \infty$ and gives only $\langle B(t) \rangle$. However, here we are interested in formulating the Second Law for Small systems, i.e. we are interested in the whole distribution $P(B(t))$, not only in its mean value $\langle B(t) \rangle$. Thermodynamics does not describe the temporal development of a single system (a single point in the $6N$-dim. phase space).

There is an important property of macroscopic measurements: Whereas the macroscopic constraint $A\{q_0,p_0\}$ determines (usually) a compact region $\mathcal{A}(t_0)$ in $\{q_0,p_0\}$, this does not need to be the case at later times $t \gg t_0$. $\mathcal{A}(t)$, defined by $A\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\}$, might become a fractal, i.e. spaghetti-like manifold, cf. fig. 3, as a function of $\{q_t,p_t\}$ in $\mathcal{E}$ at $t \to \infty$, and lose compactness.

This can be expressed in mathematical terms: There exist series of points $a_n \in \mathcal{A}(\infty)$ which converge to a point $a_{n \to \infty}$ which is not in $\mathcal{A}(\infty)$. E.g., such points may have intruded from the phase space complementary to $\mathcal{A}(t_0)$. Illustrative examples of this evolution of an initially compact sub-manifold into a fractal set are the baker transformation discussed in this context in refs. [13,14]. Then no macroscopic (incomplete) measurement at time $t = \infty$ can resolve $a_\infty$ from its immediate neighbors $a_n$ in phase space with distances $|a_n - a_\infty|$ less than any arbitrarily small $\delta$. In other words, at the time $t \gg t_0$ no macroscopic measurement with its incomplete information about $\{q_t,p_t\}$ can decide whether $\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\} \in \mathcal{A}(t_0)$ or not. I.e., any macroscopic theory like thermodynamics can only deal with the closure of $\mathcal{A}(t)$. If necessary, the sub-manifold $\mathcal{A}(t)$ must be artificially closed to $\bar{\mathcal{A}}(t)$, as developed further in section 8. Clearly, in this approach this is the physical origin of irreversibility. We come back to this in section 8.

7 On Einstein's objections against the EPS-probability

According to Abraham Pais, "Subtle is the Lord" [11], Einstein was critical with regard to the definition of relative probabilities by eq. (11), Boltzmann's counting of "complexions". He considered it artificial and not corresponding to the immediate picture of probability used in the actual problem: "The word probability is used in a sense that does not conform to its definition as given in the theory of probability. In particular, cases of equal probability are often hypothetically defined in instances where the theoretical pictures used are sufficiently definite to give a deduction rather than a hypothetical assertion" [4]. He preferred to define probability by the relative time a system (a trajectory of a single point moving with time in the $N$-body phase space) spends in a subset of the phase space. However, is this really the immediate picture of probability used in statistical mechanics? This definition demands the ergodicity of the trajectory in phase space. As we discussed above, thermodynamics, like any other macroscopic theory, handles incomplete, macroscopic information about the $N$-body system. Consequently, it handles the temporal evolution of finite-sized sub-manifolds (ensembles), not single points in phase space. The typical outcomes of macroscopic measurements are calculated. Nobody waits in a macroscopic measurement, e.g. of the temperature, long enough for an atom to cross the whole system.

In this respect, I think the EPS version of statistical mechanics is closer to the experimental situation than the duration-time of a single trajectory. Moreover, in an experiment on a small system like a nucleus, the excited nucleus, which may then fragment statistically later on, is produced by a multiple repetition of scattering events, and statistical averages are taken. No ergodic covering of the whole phase space by a single trajectory in time is demanded.

At the high excitations of the nuclei in the fragmentation region, their life-time would be too short for that. This is analogous to the statistics of a ball falling on Galton's nail-board, where a single trajectory also does not touch all nails but is random. Only after many repetitions is the smooth binomial distribution established. As I am discussing here the Second Law in finite systems, this is the correct scenario, not the time average over a single ergodic trajectory.
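The Galton-board analogy can be illustrated with a short simulation. This is only a sketch: the number of nail rows, the number of balls and the random seed are arbitrary choices, and each nail is assumed to deflect the ball left or right with probability 1/2. No single ball visits all nails; the binomial distribution emerges only from the ensemble of repetitions.

```python
import random
from math import comb

def galton(rows, balls, seed=0):
    """Drop `balls` balls through `rows` rows of nails; return counts per final slot."""
    rng = random.Random(seed)
    counts = [0] * (rows + 1)
    for _ in range(balls):
        # One trajectory: an independent left/right deflection at each row
        slot = sum(rng.random() < 0.5 for _ in range(rows))
        counts[slot] += 1
    return counts

def binomial_pmf(rows):
    """Exact binomial probabilities for the final slot."""
    return [comb(rows, k) / 2**rows for k in range(rows + 1)]

rows, balls = 10, 20000
counts = galton(rows, balls)
pmf = binomial_pmf(rows)
max_dev = max(abs(c / balls - p) for c, p in zip(counts, pmf))
print(counts)
print("largest deviation from the binomial distribution:", max_dev)
```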

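8 Fractal distributions in phase space, Second Law

Let us examine the following Gedanken experiment: Suppose the probability to find our system at points $\{q_t,p_t\}$ in phase space is uniformly distributed for times $t < t_0$ over the sub-manifold $\mathcal{E}(N,V_1)$ of the $N$-body phase space at energy $E$ and spatial volume $V_1$. At time $t > t_0$ we allow the system to spread over the larger volume $V_2 > V_1$ without changing its energy. If the system is dynamically mixing, the majority of trajectories $\{q_t,p_t\}$ in phase space starting from points $\{q_0,p_0\}$ with $q_0 \in V_1$ at $t_0$ will now spread over the larger volume $V_2$. Of course, the Liouvillean measure of the distribution $\mathcal{M}\{q_t,p_t\}$ in phase space at $t > t_0$ will remain the same ($= \mathrm{tr}[\mathcal{E}(N,V_1)]$) [15]:

$$ \mathrm{tr}[\mathcal{M}\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}]_{q_0 \in V_1} = \int_{q_0\{q_t,p_t\} \in V_1} \frac{1}{N!} \left( \frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}) \tag{14} $$

$$ = \int_{q_0 \in V_1} \frac{1}{N!} \left( \frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N\{q_0,p_0\}), \qquad \text{because of} \quad \frac{\partial\{q_t,p_t\}}{\partial\{q_0,p_0\}} = 1 \tag{15} $$

(The label $q_0 \in V_1$ of the integral means that the positions $q_0$ are restricted to the volume $V_1$; the momenta $p_0$ are unrestricted.)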

But, as already argued by Gibbs, the distribution $\mathcal{M}\{q_t,p_t\}$ will be filamented like ink in water and will approach any point of $\mathcal{E}(N,V_2)$ arbitrarily closely. $\mathcal{M}\{q_t,p_t\}$ becomes dense in the new, larger $\mathcal{E}(N,V_2)$ for times sufficiently larger than $t_0$ (strictly in the limit $t \to \infty$). The closure $\bar{\mathcal{M}}$ becomes equal to $\mathcal{E}(N,V_2)$. This is clearly expressed by Lebowitz [16,17].

In order to express this fact mathematically, we have to redefine Boltzmann's definition of entropy, eq. (1), and introduce the following fractal "measure" for integrals like (3) or (10):

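$$ W(E,N,t \gg t_0) = \frac{1}{N!} \int_{q_0\{q_t,p_t\} \in V_1} \left( \frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}) \tag{16} $$

With the transformation

$$ \int (d^3q_t\, d^3p_t)^N \cdots = \int d\sigma_1 \cdots d\sigma_{6N} \cdots \tag{17} $$

$$ d\sigma_{6N} = \frac{1}{\|\nabla H\|} \sum_i \left( \frac{\partial H}{\partial q_i}\, dq_i + \frac{\partial H}{\partial p_i}\, dp_i \right) = \frac{1}{\|\nabla H\|}\, dE \tag{18} $$

$$ \|\nabla H\| = \sqrt{ \sum_i \left( \frac{\partial H}{\partial q_i} \right)^2 + \sum_i \left( \frac{\partial H}{\partial p_i} \right)^2 } \tag{19} $$

$$ W(E,N,t \gg t_0) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int_{q_0\{q_t,p_t\} \in V_1} d\sigma_1 \cdots d\sigma_{6N-1}\, \frac{\epsilon_0}{\|\nabla H\|} \tag{20} $$

we replace $\mathcal{M}$ by its closure $\bar{\mathcal{M}}$ and define now

$$ W(E,N,t \gg t_0) \to M(E,N,t \gg t_0) = \langle G(\mathcal{E}(N,V_2)) \rangle \cdot \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] \tag{21} $$

where $\langle G(\mathcal{E}(N,V_2)) \rangle$ is the average of $\frac{\epsilon_0}{N!\,(2\pi\hbar)^{3N} \|\nabla H\|}$ over the (larger) manifold $\mathcal{E}(N,V_2)$, and $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ is the box-counting volume of $\mathcal{M}(E,N,t \gg t_0)$, which is the same as the volume of $\bar{\mathcal{M}}$, see below.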

Figure 3: The compact set $\mathcal{M}(t_0)$, left side (volumes $V_a$ and $V_b$ separate, $t < t_0$), develops into an increasingly folded, "spaghetti"-like distribution in phase space with rising time $t$ (right side, volume $V_a + V_b$, $t > t_0$). This figure shows only the early form of the distribution. At much larger times it will become more and more fractal. The grid illustrates the boxes of the box-counting method. All boxes which overlap with $\mathcal{M}(t)$ are counted in $N_\delta$ in eq. (22).

To obtain $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ we cover the $d$-dim. sub-manifold $\mathcal{M}(t)$, here with $d = (6N - 1)$, of the phase space by a grid with spacing $\delta$ and count the number $N_\delta \propto \delta^{-d}$ of boxes of size $\delta^{6N}$ which contain points of $\mathcal{M}$. Then we determine

$$ \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] = \underline{\lim}_{\delta \to 0}\, \delta^d N_\delta[\mathcal{M}(E,N,t \gg t_0)] \tag{22} $$

with $\underline{\lim} = \inf[\lim]$, or symbolically

$$ M(E,N,t \gg t_0) = \frac{1}{N!}\, B_d\!\!\int_{q_0\{q_t,p_t\} \in V_1} \left( \frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N) \tag{23} $$

$$ \to \frac{1}{N!} \int_{q_t \in V_2} \left( \frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}) = W(E,N,V_2) \ge W(E,N,V_1) \tag{24} $$

where $B_d\!\int$ means that this integral should be evaluated via the box-counting volume (22), here with $d = 6N - 1$. This is illustrated by figure 3. With this extension of eq. (3), Boltzmann's entropy (1) is at time $t \to \infty$ equal to the logarithm of the larger phase space $W(E,N,V_2)$. This is the Second Law of Thermodynamics. The box-counting is also used in the definition of the Kolmogorov entropy, the average rate of entropy gain [18,19]. Of course, still at $t_0$, $\bar{\mathcal{M}}(t_0) = \mathcal{M}(t_0) = \mathcal{E}(N,V_1)$:

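$$ M(E,N,t_0) = \frac{1}{N!}\, B_d\!\!\int_{q_0 \in V_1} \left( \frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N) \tag{25} $$

$$ = \frac{1}{N!} \int_{q_0 \in V_1} \left( \frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3} \right)^N \epsilon_0\, \delta(E - H_N) = W(E,N,V_1) \tag{26} $$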
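The difference between the conserved Liouville volume and the growing box-counting volume can be seen in the baker transformation mentioned in section 6. The sketch below is an illustration under stated assumptions, not the $6N$-dimensional construction of eqs. (21)-(24): the phase space is the unit square, the initial compact set is the strip $0 \le y < 1/4$ (area 1/4), it is represented by random sample points, and the grid spacing is fixed at $\delta = 1/16$ instead of taking $\delta \to 0$.

```python
import random

def baker(x, y):
    """One step of the area-preserving baker map on the unit square."""
    if x < 0.5:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def box_volume(points, boxes_per_side):
    """delta^d * N_delta: number of occupied grid boxes times the box area."""
    occupied = {(int(x * boxes_per_side), int(y * boxes_per_side)) for x, y in points}
    return len(occupied) / boxes_per_side**2

rng = random.Random(1)
# Initial compact set: the strip 0 <= y < 1/4, sampled by 20000 points (assumption)
points = [(rng.random(), 0.25 * rng.random()) for _ in range(20000)]

volumes = []
for step in range(5):
    volumes.append(box_volume(points, 16))
    points = [baker(x, y) for x, y in points]

print(volumes)
```

The map preserves area, so the fine-grained (Liouville) area of the set stays 1/4 at all times, while the set splits into ever thinner horizontal filaments. At fixed resolution the box-counting volume stays at 1/4 as long as the filaments are resolved and then grows towards the full accessible area 1 once they become thinner than $\delta$.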

The box-counting volume is analogous to the standard method to determine the fractal dimension of a set of points [18] by the box-counting dimension:

$$ \dim_{box}[\mathcal{M}(E,N,t \gg t_0)] = \lim_{\delta \to 0} \frac{\ln N_\delta[\mathcal{M}(E,N,t \gg t_0)]}{-\ln \delta} \tag{27} $$
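Eq. (27) can be tried out on a textbook fractal. The sketch below uses the middle-third Cantor set as an example (not a set from this paper) and represents it by the left endpoints of its level-8 intervals, which is a finite stand-in for the full set. Exact rational arithmetic avoids any ambiguity about which box a point falls into.

```python
from fractions import Fraction
from math import log

def cantor_left_endpoints(level):
    """Left endpoints of the 2**level intervals of the level-th Cantor construction step."""
    pts = [Fraction(0)]
    for k in range(1, level + 1):
        step = Fraction(2, 3**k)
        pts = pts + [p + step for p in pts]
    return pts

def box_count(points, delta):
    """Number N_delta of grid boxes of width delta that contain at least one point."""
    return len({p // delta for p in points})

points = cantor_left_endpoints(8)
estimates = []
for k in range(1, 7):
    delta = Fraction(1, 3**k)
    n_boxes = box_count(points, delta)
    estimates.append(log(n_boxes) / -log(float(delta)))
    print(k, n_boxes, estimates[-1])
```

With $\delta = 3^{-k}$ the count is $N_\delta = 2^k$, so every estimate equals $\ln 2/\ln 3 \approx 0.63$, the known box-counting dimension of the Cantor set.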

Like the box-counting dimension, $\mathrm{vol}_{box}$ has the peculiarity that it is equal to the volume of the smallest closed covering set. E.g., the box-counting volume of the set of rational numbers $\{Q\}$ between 0 and 1 is $\mathrm{vol}_{box}\{Q\} = 1$, and thus equal to the measure of the real numbers, cf. Falconer [18], section 3.1. This is the reason why $\mathrm{vol}_{box}$ is not a measure in its mathematical definition, because then we should have

$$ \mathrm{vol}_{box}\Big[ \sum_{i \in \{Q\}} (\mathcal{M}_i) \Big] = \sum_{i \in \{Q\}} \mathrm{vol}_{box}[\mathcal{M}_i] = 0 \tag{28} $$

therefore the quotation marks for the box-counting "measure".

Coming back to the end of section 6, the volume $W(A,B,\cdots,t)$ of the relevant ensemble, the closure $\bar{\mathcal{M}}(t)$, must be "measured" by something like the box-counting "measure" (22, 23) with the box-counting integral $B_d\!\int$, which must replace the integral in eq. (3). Due to the fact that the box-counting volume is equal to the volume of the smallest closed covering set, the new, extended definition of the phase-space integral, eq. (23), is for compact sets like the equilibrium distribution $\mathcal{E}$ identical to the old one, eq. (3). Therefore, one can simply replace the old Boltzmann definition of the number of complexions, and with it of the entropy, by the new one (23).
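The statement about the rational numbers can be checked directly at finite resolution. The sketch below is an illustration with arbitrary parameters: it takes all fractions $p/q$ in $[0,1)$ with denominator up to 50, covers the unit interval with 50 boxes, and counts the occupied boxes. Every single point has box volume 0 in the limit $\delta \to 0$, yet the set as a whole occupies every box, which is why the sum rule (28) fails.

```python
from fractions import Fraction

def box_volume_of_rationals(max_denominator, boxes):
    """delta * N_delta for the rationals p/q in [0, 1) with q <= max_denominator."""
    delta = Fraction(1, boxes)
    occupied = set()
    for q in range(1, max_denominator + 1):
        for p in range(q):
            occupied.add(Fraction(p, q) // delta)  # index of the box containing p/q
    return len(occupied) * delta

volume = box_volume_of_rationals(max_denominator=50, boxes=50)
print(volume)  # every box of width 1/50 contains at least the point j/50
```

Refining the grid together with the maximal denominator keeps the result at 1, the volume of the closure $[0,1]$ of the set.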

9 Conclusion

Macroscopic measurements $M$ determine only a very few of all $6N$ dof. Any macroscopic theory like thermodynamics deals with the volumes $M$ of the corresponding closed sub-manifolds $\bar{\mathcal{M}}$ in the $6N$-dim. phase space, not with single points. The averaging over ensembles or finite sub-manifolds in phase space becomes especially important for the micro-canonical ensemble of a finite system.

Because of this necessarily coarse information, macroscopic measurements, and with them also macroscopic theories, are unable to distinguish fractal sets $\mathcal{M}$ from their closures $\bar{\mathcal{M}}$. Therefore, I make the conjecture: the proper manifolds determined by a macroscopic theory like thermodynamics are the closed $\bar{\mathcal{M}}$. However, an initially closed subset of points at time $t_0$ does not necessarily evolve again into a closed subset at $t \gg t_0$. I.e., the closure operation and the $t \to \infty$ limit do not commute, and the macroscopic dynamics becomes irreversible. The $\lim_{t \to \infty}$ and $\lim_{\delta \to 0}$ may be linked, e.g. as $\delta > \mathrm{const.}/t$, with the $\delta \to 0$ limit taken after the $t \to \infty$ limit.

Here is the origin of the misunderstanding by the famous reversibility paradoxes which were invented by Loschmidt [20] and Zermelo [21,22] and which bothered Boltzmann so much [23,24]. These paradoxes address trajectories of single points in the $N$-body phase space, which must return after Poincaré's recurrence time or which must run backwards if all momenta are exactly reversed. Therefore, Loschmidt and Zermelo concluded that the entropy should decrease just as it was increasing before. The specification of a single point demands of course a microscopically exact specification of all $6N$ degrees of freedom, not a determination of a few macroscopic degrees of freedom only. No entropy is defined for a single point.

By our formulation of thermo-statistics, various non-trivial limiting processes can be avoided. Neither does one invoke the thermodynamic limit of a homogeneous system with infinitely many particles, nor does one rely on the ergodic hypothesis of the equivalence of (very long) time averages and ensemble averages. The use of ensemble averages is justified directly by the very nature of macroscopic (incomplete) measurements. Coarse-graining appears as a natural consequence of this. The box-counting method mirrors the averaging over the overwhelming number of non-determined degrees of freedom. Of course, a fully consistent theory must use this averaging explicitly. Then one would not depend on the order of the limits $\lim_{\delta \to 0} \lim_{t \to \infty}$ as was tacitly assumed here. Presumably, the rise of the entropy can then already be seen at finite times, when the fractality of the distribution in phase space is not yet fully developed. The coarse-graining is then no longer a mathematical ad hoc assumption. Moreover, the Second Law is, in the EPS formulation of statistical mechanics, not linked to the thermodynamic limit as was thought up to now [16,17].

Appendix

In the mathematical theory of fractals [18] one usually uses the Hausdorff measure or the Hausdorff dimension of the fractal [19]. This, however, would be wrong in Statistical Mechanics. Here I want to point out the difference between the box-counting "measure" and the proper Hausdorff measure of a manifold of points in phase space. Without going into too much mathematical detail, we can make this clear with the same example as above: The Hausdorff measure of the rational numbers in $[0,1]$ is 0, whereas the Hausdorff measure of the real numbers in $[0,1]$ is 1. Therefore, the Hausdorff measure of a set is a proper measure. The Hausdorff measure of the fractal distribution in phase space $\mathcal{M}(t \to \infty)$ is the same as that of $\mathcal{M}(t_0)$, $W(E,N,V_1)$. Measured by the Hausdorff measure, the phase space volume of the fractal distribution $\mathcal{M}(t \to \infty)$ is conserved and Liouville's theorem applies. This would demand that thermodynamics could distinguish any point inside the fractal from any point outside of it, independently of how close it is. This, however, is impossible for any macroscopic theory that can only address macroscopic information, where all unobserved degrees of freedom are averaged over. That is the deep reason why the box-counting "measure" must be taken and where irreversibility comes from.

Acknowledgement

I thank E.G.D. Cohen and Pierre Gaspard for detailed discussions.

References

1. D.H.E. Gross, Microcanonical thermodynamics: Phase transitions in "Small" systems, Lecture Notes in Physics (World Scientific, Singapore, 2000).

2. D.H.E. Gross and E. Votyakov, Phase transitions in "small" systems, Eur. Phys. J. B 15, 115-126 (2000), http://arXiv.org/abs/cond-mat/9911257.

3. D.H.E. Gross, Micro-canonical statistical mechanics of some non-extensive systems, http://arXiv.org/abs/astro-ph/cond-mat/0004268 (2000).

4. A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt, Annalen der Physik 17, 132 (1905).

5. L. Boltzmann, Über die Beziehung eines allgemeinen mechanischen Satzes zum Hauptsatz der Wärmelehre, Sitzungsbericht der Akademie der Wissenschaften, Wien 2, 67-73 (1877).

6. L. Boltzmann, Über die Begründung einer kinetischen Gastheorie auf anziehende Kräfte allein, Wiener Berichte 89, 714 (1884).

7. E. Schrödinger, Statistical Thermodynamics, a Course of Seminar Lectures delivered in January-March 1944 at the School of Theoretical Physics (Cambridge University Press, London, 1946).

8. Elliott H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Reports 310, 1-96 (1999), cond-mat/9708200.

9. J. Bricmont, Science of chaos or chaos in science?, Physicalia Magazine, Proceedings of the New York Academy of Science, to appear, 1-50 (2000).

10. D.H.E. Gross, Phase transitions in "small" systems - a challenge for thermodynamics, http://arXiv.org/abs/cond-mat/0006087, page 8 (2000).

11. A. Pais, Subtle is the Lord, chapter 4, pages 60-78 (Oxford University Press, Oxford, 1982).

12. N.S. Krylov, Works on the Foundation of Statistical Physics (Princeton University Press, Princeton, 1979).

13. R.F. Fox, Entropy evolution for the baker map, Chaos 8, 462-465 (1998).

14. T. Gilbert, J.R. Dorfman and P. Gaspard, Entropy production, fractals, and relaxation to equilibrium, Phys. Rev. Lett. 85, 1606, nlin.CD/000301 (2000).

15. H. Goldstein, Classical Mechanics (Addison-Wesley, Reading, Mass., 1959).

16. J.L. Lebowitz, Microscopic origins of irreversible macroscopic behavior, Physica A 263, 516-527 (1999).

17. J.L. Lebowitz, Statistical mechanics: A selective review of two central issues, Rev. Mod. Phys. 71, S346-S357 (1999).

18. K. Falconer, Fractal Geometry - Mathematical Foundations and Applications (John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1990).

19. E.W. Weisstein, Concise Encyclopedia of Mathematics (CRC Press, London, New York, Washington DC, 1999), CD-ROM edition 1 205 99.

20. J. Loschmidt, Wiener Berichte 73, 128 (1876).

21. E. Zermelo, Wied. Ann. 57, 778-784 (1896).

22. E. Zermelo, Über die mechanische Erklärung irreversibler Vorgänge, Wied. Ann. 60, 392-398 (1897).

23. E.G.D. Cohen, Boltzmann and statistical mechanics, in Boltzmann's Legacy, 150 Years after his Birth, http://xxx.lanl.gov/abs/cond-mat/9608054 (Atti dell'Accademia dei Lincei, Rome, 1997).

24. E.G.D. Cohen, Boltzmann and Statistical Mechanics, volume 371 of Dynamics: Models and Kinetic Methods for Nonequilibrium Many Body Systems, J. Karkheck, editor, 223-238 (Kluwer, Dordrecht, The Netherlands, 2000).

AN APPROACH TO QUANTUM PROBABILITY

STAN GUDDER
Department of Mathematics
University of Denver
Denver, Colorado 80208
sgudder@cs.du.edu

We present an approach to quantum probability that is motivated by the Feynman formalism. This approach shows that there is a realistic description of quantum mechanics and that nonrelativistic quantum theory can be derived from simple postulates of quantum probability. The basic concepts in this framework are measurements and actions. The measurements are similar to the dynamic variables of classical mechanics and the random variables of classical probability theory. The actions correspond to quantum mechanical states. An influence between configurations of a physical system is defined in terms of an action. The fundamental postulate of this approach is that the probability density at a measurement outcome x is the sum (or integral) of the influences between each pair of configurations that result in x upon executing the measurement.

1 Introduction

We shall discuss a new approach to quantum probability that combines a reformulation of the mathematical foundations of quantum mechanics and the basic tenets of probability theory. This approach is motivated by the Feynman formalism [1], and it answers various puzzling questions about traditional quantum mechanics. Some of these questions are the following.

1. Where does the quantum mechanical Hilbert space $H$ come from?

2. Why are states represented by unit vectors in $H$ and observables by self-adjoint operators on $H$?

3. Why does the probability have its postulated form?

4. Why do the position and momentum operators have their particular forms?

5. Why does a physical theory that must give real-valued results involve complex amplitudes or states?

6. Is there a realistic description of quantum mechanics?

Our philosophy is that quantum probability theory need not be the same as classical probability theory. That is, the probability need not be given by a measure. However, the predictions of quantum probability theory should agree with experimental long run relative frequencies. We shall show that there is a realistic description of quantum mechanics. In other words, a quantum system has properties independent of observation. We also show that nonrelativistic quantum mechanics can be derived from simple postulates of this approach. Our presentation is a modified version of the discussion in Gudder [2].

2 Formulation

We denote the set of possible configurations of a physical system $\mathcal{S}$ by $\Omega$ and call $\Omega$ a sample space. If $X$ is a measurement on $\mathcal{S}$, then executing $X$ results in a unique outcome depending on the configuration $\omega$ of $\mathcal{S}$. To be precise, we define a measurement to be a map $X$ from $\Omega$ onto its range $R(X) \subseteq \mathbb{R}$ satisfying

(M1) $R(X)$ is the base space of a measure space $(R(X), \Sigma_X, \mu_X)$.

(M2) $X^{-1}(x)$ is the base space of a measurable space $(X^{-1}(x), \Sigma_X^x)$ for every $x \in R(X)$.

We call the elements of $R(X)$ $X$-outcomes, and the sets in $\Sigma_X$ are $X$-events. Note that $X^{-1}(x)$ corresponds to the set of configurations resulting in outcome $x$ when $X$ is executed, and we call $X^{-1}(x)$ the $X$-fiber over $x$. The measure $\mu_X$ represents an a priori weight due to our knowledge of the system (for example, we may know the energy of $\mathcal{S}$, or we might assume the energy has a certain value). In the case of total ignorance, the weight is taken to be counting measure in the discrete case and uniform measure in the continuous case. This framework gives a realistic theory because a configuration $\omega$ determines the properties of $\mathcal{S}$ independent of any particular observation. That is, $\omega$ determines the outcomes of all measurements simultaneously. Notice that measurements are similar to the dynamical variables of classical mechanics and the random variables of classical probability theory. The sample space $\Omega$ gives an underlying level of reality upon which traditional quantum mechanics can be constructed.

If $X$ is a measurement, an $X$-action is a pair

$$ \left( S, \{ \mu_X^x : x \in R(X) \} \right) $$

where $S: \Omega \to \mathbb{R}$ and $\mu_X^x$ is a measure on $(X^{-1}(x), \Sigma_X^x)$. As we shall see, actions correspond to quantum states. For simplicity, we frequently denote an action by $S$, and we remark that $S$ depends on our model of $\mathcal{S}$ and also on our knowledge of $\mathcal{S}$. We define the influence between $\omega, \omega' \in \Omega$ relative to $S$ by

$$ F_S(\omega,\omega') = N_S^2 \cos[S(\omega) - S(\omega')] \tag{1} $$

where $N_S > 0$ is a normalization constant. The appearance of the cosine in (1) is not arbitrary; it can be derived from the regularity conditions of continuity and causality [2,5].

We now make a fundamental reformulation of the probability concept [2,5].

We postulate that the probability density Pxs) of an X-outcome x is the sum (or integral) of the influences between each pair of configurations that reshysult in x upon executing X Precisely we postulate that Fs(w u) is integrable and that

PXS(X)= f [ FS(ujUj)fMx(du)^x(dLj JX-l(x) JX~l(x)

(2)

Also to ensure that Pxsx) is indeed a probability density we assume that Pxsx) is measurable with respect to Ex and that

L RX) Pxs(x)nx(dx) - 1 (3)

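Equation (3) can be employed to find $N_S$. To show that $P_{X,S}(x) \ge 0$, we have

$$P_{X,S}(x) = N_S^2 \int_{X^{-1}(x)} \int_{X^{-1}(x)} [\cos S(\omega) \cos S(\omega') + \sin S(\omega) \sin S(\omega')]\, \mu_X^x(d\omega)\, \mu_X^x(d\omega')$$

$$= N_S^2 \left\{ \left[ \int_{X^{-1}(x)} \cos S(\omega)\, \mu_X^x(d\omega) \right]^2 + \left[ \int_{X^{-1}(x)} \sin S(\omega)\, \mu_X^x(d\omega) \right]^2 \right\} \ge 0 \qquad (4)$$

We conclude that $P_{X,S}(x)$ is a probability density on $(R(X), \Sigma_X, \mu_X)$.

If $B \in \Sigma_X$ is an X-event, we define the (X, S)-probability of $B$ by

$$P_{X,S}(B) = \int_B P_{X,S}(x)\, \mu_X(dx) \qquad (5)$$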
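The construction in (1) to (4) can be checked numerically in the discrete case. The following is a minimal sketch, assuming a hypothetical two-outcome measurement with counting measure on the outcomes and on each fiber; the outcome labels, phases and weights are illustrative choices, not taken from the paper.

```python
import math

def outcome_density(phases, weights, n_s=1.0):
    """Double sum of influences N_S^2 * cos(S(w) - S(w')) over one X-fiber, as in (2)."""
    total = 0.0
    for s1, m1 in zip(phases, weights):
        for s2, m2 in zip(phases, weights):
            total += n_s**2 * math.cos(s1 - s2) * m1 * m2
    return total

def outcome_density_squares(phases, weights, n_s=1.0):
    """The same quantity written as the sum of two squares, as in (4)."""
    c = sum(m * math.cos(s) for s, m in zip(phases, weights))
    s_ = sum(m * math.sin(s) for s, m in zip(phases, weights))
    return n_s**2 * (c**2 + s_**2)

# Hypothetical fibers: outcome -> (list of phases S(w), list of weights mu({w})).
fibers = {
    "a": ([0.0, 0.4, 2.0], [1.0, 1.0, 1.0]),
    "b": ([1.0, 3.5], [1.0, 1.0]),
}

# Fix N_S from the normalisation condition (3), with counting measure on outcomes.
unnormalised = {x: outcome_density(p, w) for x, (p, w) in fibers.items()}
n_s = 1.0 / math.sqrt(sum(unnormalised.values()))
density = {x: outcome_density(p, w, n_s) for x, (p, w) in fibers.items()}
```

In this sketch `density` plays the role of $P_{X,S}$: it is nonnegative on every outcome and sums to one, even though individual influences $\cos[S(\omega) - S(\omega')]$ may be negative.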

Then $P_{X,S}\colon \Sigma_X \to [0, 1]$ is a probability measure on $(R(X), \Sigma_X)$ that we call the S-distribution of $X$. If $h\colon R(X) \to \mathbb{R}$ is $\mu_X$-integrable, then the S-expectation of $h(X)$ is defined by

$$E_S(h(X)) = \int_{R(X)} h(x)\, P_{X,S}(dx) = \int_{R(X)} h(x) P_{X,S}(x)\, \mu_X(dx) \qquad (6)$$

In particular, if $h$ is the identity function, the S-expectation of $X$ becomes

$$E_S(X) = \int_{R(X)} x P_{X,S}(x)\, \mu_X(dx) \qquad (7)$$

Influence is a strictly quantum phenomenon that is not present in classical physics. In the classical limit, $F_S(\omega, \omega')$ approaches a delta function $\delta_\omega(\omega')$. In this limit $F_S(\omega, \omega') = 0$ for $\omega \ne \omega'$, and there is no influence between distinct configurations. We then have $P_{X,S}(x) = \mu_X^x(X^{-1}(x))$, which gives a classical probability framework.

We can extend this theory to include expectations of other functions on $\Omega$. Let $g\colon \Omega \to \mathbb{R}$ be a function that is integrable along X-fibers. We define the (X, S)-expectation of $g$ at $x$ by

$$E_{X,S}(g)(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} g(\omega) F_S(\omega, \omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega') \qquad (8)$$

This is the natural generalization of (2) from a probability density to an expectation density. If $E_{X,S}(g)(x)$ is integrable, then the (X, S)-expectation of $g$ is given by

$$E_{X,S}(g) = \int_{R(X)} E_{X,S}(g)(x)\, \mu_X(dx) \qquad (9)$$

In particular, if $g(\omega) = h(X(\omega))$ then

$$E_{X,S}(g)(x) = h(x) P_{X,S}(x)$$

and

$$E_{X,S}(g) = \int_{R(X)} h(x) P_{X,S}(x)\, \mu_X(dx) = E_S(h(X))$$

This shows that (9) is an extension of (6). We can also use this formalism to compute probabilities of events in $\Omega$. Let $A \subseteq \Omega$ and denote the characteristic function of $A$ by $\chi_A$. If $\chi_A$ is integrable along X-fibers, we define, analogously to classical probability theory, the (X, S)-pseudoprobability of $A$ by

$$P_{X,S}(A) = E_{X,S}(\chi_A)$$

It follows from (3) and (9) that $P_{X,S}(\Omega) = 1$ and that the pseudoprobability is countably additive. However, it may have negative values, which is why it is called a pseudoprobability. Nevertheless, there are $\sigma$-algebras of subsets of $\Omega$ on which it is a probability measure. For example, if $A = X^{-1}(B)$ for $B \in \Sigma_X$, then it can be shown that the pseudoprobability of $A$ equals the (X, S)-probability $P_{X,S}(B)$ of (5).$^{2}$ Therefore, in this case the pseudoprobability reduces to the S-distribution of $X$. We shall consider some less trivial examples later.
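A small discrete example makes the possibility of negative values concrete. The sketch below assumes a hypothetical measurement with a single outcome, three configurations carrying counting measure, and phases chosen so that $N_S = 1$; none of these choices come from the paper.

```python
import math

def pseudoprobability(event, phases, n_s=1.0):
    """Sum of influences N_S^2 * cos(S(w) - S(w')) over w in the event and w' in the fiber.

    This is (8)-(9) applied to the characteristic function of the event,
    for a measurement with one outcome and counting measure on its fiber.
    """
    return sum(
        n_s**2 * math.cos(phases[w] - s_other)
        for w in event
        for s_other in phases.values()
    )

# Hypothetical configurations and phases S(w); the double sum over all pairs is 1.
phases = {"w1": 0.0, "w2": 0.0, "w3": math.pi}

p_w1 = pseudoprobability({"w1"}, phases)
p_w3 = pseudoprobability({"w3"}, phases)
p_all = pseudoprobability(set(phases), phases)
```

Here the singleton events receive the values 1, 1 and -1, which add up to the total mass 1: additivity holds, but the third configuration interferes destructively with the other two and gets a negative pseudoprobability.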

3 Wave Functions and Hilbert Space

This section employs the formalism of Section 2 to derive the wave functions and Hilbert space of traditional quantum mechanics It is not necessary to do this because the needed probability formulas have been presented in Section 2 However as we shall see the Hilbert space formulation gives more convenient and concise notations

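Applying (4) we obtain

$$P_{X,S}(x) = \left| \int_{X^{-1}(x)} N_S e^{iS(\omega)}\, \mu_X^x(d\omega) \right|^2 \qquad (10)$$

We call the function

$$f_S(\omega) = N_S e^{iS(\omega)} \qquad (11)$$

the S-amplitude function and define the (X, S)-wave function by

$$f_{X,S}(x) = \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega) \qquad (12)$$

From (10) and (12) we obtain

$$P_{X,S}(x) = |f_{X,S}(x)|^2 \qquad (13)$$

We also have

$$F_S(\omega, \omega') = N_S^2 \operatorname{Re} e^{iS(\omega)} e^{-iS(\omega')} = \operatorname{Re} f_S(\omega) \overline{f_S(\omega')} \qquad (14)$$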

Equation (10) shows how the complex numbers arise in quantum mechanics. The complex numbers are not needed for the computation of $P_{X,S}$ because we can always write $F_S(\omega, \omega')$ in the form (1). They are merely a convenience that gives a simple and concise formula. Equation (11) gives the Feynman amplitude function, which we have now derived from deeper principles, and (12) is Feynman's prescription that the amplitude of an outcome $x$ is the sum (or integral) of the amplitudes of the configurations (or alternatives) that result in $x$ when $X$ is executed.

If $B \in \Sigma_X$, applying (5) and (13) gives

$$P_{X,S}(B) = \int_B |f_{X,S}(x)|^2\, \mu_X(dx) \qquad (15)$$

and this is the usual probabilistic formula of traditional quantum mechanics. It follows from (3) that $f_{X,S}$ is a unit vector in the Hilbert space $L^2(R(X), \Sigma_X, \mu_X)$, and this derives the quantum Hilbert space and the vector form for a state. If $\mathcal{A}_X$ is a set of X-actions, then the Hilbert space $H_X \subseteq L^2(R(X), \Sigma_X, \mu_X)$ generated by the set of wave functions $\{f_{X,S} : S \in \mathcal{A}_X\}$ is called an X-Hilbert space. Some X-actions may not be relevant for physical reasons, so we may want $\mathcal{A}_X$ to be a proper subset of the set of all X-actions.

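If $g\colon \Omega \to \mathbb{R}$ is integrable along X-fibers and $S \in \mathcal{A}_X$, we define the (X, S)-amplitude average of $g$ at $x$ by

$$f_{X,S}(g)(x) = \int_{X^{-1}(x)} g(\omega) f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x)} g(\omega) e^{iS(\omega)}\, \mu_X^x(d\omega) \qquad (16)$$

Applying (8) and (14) we obtain

$$E_{X,S}(g)(x) = \operatorname{Re} \int_{X^{-1}(x)} g(\omega) f_S(\omega)\, \mu_X^x(d\omega) \int_{X^{-1}(x)} \overline{f_S(\omega')}\, \mu_X^x(d\omega') = \operatorname{Re} f_{X,S}(g)(x)\, \overline{f_{X,S}(x)}$$

It follows from (9) that

$$E_{X,S}(g) = \operatorname{Re} \langle f_{X,S}(g), f_{X,S} \rangle \qquad (17)$$

Define the linear operator $\hat{g}$ on $H_X$ by $\hat{g} f_{X,S}(x) = f_{X,S}(g)(x)$ and extend by linearity. If the operator $\hat{g}$ is self-adjoint on $H_X$, we call $g$ an X-observable and we have

$$E_{X,S}(g) = \langle \hat{g} f_{X,S}, f_{X,S} \rangle \qquad (18)$$

for all $S \in \mathcal{A}_X$. We then say that $g$ is represented by the self-adjoint operator $\hat{g}$ on $H_X$. This derives the representation of observables by self-adjoint operators.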
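The equality between the influence form (8), (9) of an expectation and the amplitude form (17) can be illustrated on a finite model. This is a sketch under stated assumptions: two hypothetical outcomes, counting measure on the outcomes and on every fiber, and arbitrary illustrative values for the phases $S(\omega)$ and the function $g(\omega)$.

```python
import cmath
import math

# Hypothetical discrete model: outcome -> list of (S(w), g(w)) pairs on that fiber.
fibers = {
    0: [(0.0, 1.0), (0.7, -2.0)],
    1: [(1.5, 0.5), (2.9, 3.0), (4.0, 1.0)],
}

def normalisation(fibers):
    """N_S chosen so that the densities |f_{X,S}(x)|^2 sum to one, as required by (3)."""
    total = sum(abs(sum(cmath.exp(1j * s) for s, _ in fiber)) ** 2
                for fiber in fibers.values())
    return 1.0 / math.sqrt(total)

def expectation_by_influence(fibers, n_s):
    """Equations (8) and (9): double sum of g(w) * N_S^2 * cos(S(w) - S(w'))."""
    total = 0.0
    for fiber in fibers.values():
        for s1, g1 in fiber:
            for s2, _ in fiber:
                total += g1 * n_s**2 * math.cos(s1 - s2)
    return total

def expectation_by_amplitudes(fibers, n_s):
    """Equation (17): real part of the inner product of f_{X,S}(g) with f_{X,S}."""
    total = 0.0
    for fiber in fibers.values():
        f_x = sum(n_s * cmath.exp(1j * s) for s, _ in fiber)
        f_g_x = sum(g * n_s * cmath.exp(1j * s) for s, g in fiber)
        total += (f_g_x * f_x.conjugate()).real
    return total

n_s = normalisation(fibers)
via_influence = expectation_by_influence(fibers, n_s)
via_amplitudes = expectation_by_amplitudes(fibers, n_s)
```

Both routes give the same number up to rounding, which is the content of (17); the complex amplitudes only reorganise the double sum of cosines.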

For a simple example of a representation, let $g\colon \Omega \to \mathbb{R}$ be a constant function $g(\omega) = c$. Then (16) gives

$$f_{X,S}(g)(x) = c \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega) = c f_{X,S}(x)$$

Hence $g$ is an X-observable and is represented by the self-adjoint operator $cI$. As another example, letting $g = X$ we have by (16) that

$$f_{X,S}(X)(x) = x f_{X,S}(x)$$

It follows that $X$ is represented by the self-adjoint operator $\hat{X}$ on $H_X$ given by $\hat{X}u(x) = xu(x)$. We conclude that $H_X$ is a Hilbert space in which $X$ is diagonal. More generally, since

$$f_{X,S}(h(X))(x) = h(x) f_{X,S}(x) \qquad (19)$$

we see that $h(X)$ is represented by the self-adjoint operator $h(X)^{\wedge}u(x) = h(x)u(x)$. Moreover, the spectral measure $P^{\hat{X}}$ is given by $P^{\hat{X}}(B)u(x) = \chi_B(x)u(x)$, and applying (15) gives

$$P_{X,S}(B) = \| P^{\hat{X}}(B) f_{X,S} \|^2$$

which is again a standard probabilistic formula. Finally, for $A \subseteq \Omega$ the (X, S)-pseudoprobability becomes, by (17),

$$P_{X,S}(A) = \operatorname{Re} \langle f_{X,S}(\chi_A), f_{X,S} \rangle \qquad (20)$$

where by (16) we have

$$f_{X,S}(\chi_A)(x) = \int_{X^{-1}(x) \cap A} f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x) \cap A} e^{iS(\omega)}\, \mu_X^x(d\omega) \qquad (21)$$

4 Spin

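We now illustrate the framework presented in the last two sections by presenting a model for spin 1/2 measurements. Fix a direction corresponding to the z axis and assume that the spin $j_z$ in the z direction is known (either 1/2 or $-1/2$). Let $\omega \in [0, \pi]$ denote a direction whose angle to the z axis is $\omega$. By symmetry, the spin distribution should depend only on $\omega$. Let $\Omega = [0, \pi]$, $\theta \in \Omega$, and let $X\colon \Omega \to \{-1/2, 1/2\}$ be the function

$$X(\omega) = -1/2 \text{ for } \omega \in [0, \theta] \quad \text{and} \quad X(\omega) = 1/2 \text{ for } \omega \in (\theta, \pi]$$

We make $X$ into a measurement by defining

$$\mu_X(\{-1/2\}) = \mu_X(\{1/2\}) = 1$$

and endowing $X^{-1}(-1/2) = [0, \theta]$ and $X^{-1}(1/2) = (\theta, \pi]$ with the usual Borel structure. The function $X$ corresponds to a spin 1/2 measurement in the $\theta$ direction. Letting $\theta$ vary, we obtain an infinite number of spin measurements, each applied in a different direction. Observe that a sample point $\omega \in \Omega$ determines the spin in every direction simultaneously.

For $j_z = 1/2$ we define the X-action $(S, \mu_X^{-1/2}, \mu_X^{1/2})$ given by $S(\omega) = \omega$, where $\mu_X^{-1/2}$ and $\mu_X^{1/2}$ are $\mu/2$ with $\mu$ the Lebesgue measure restricted to $X^{-1}(-1/2)$ and $X^{-1}(1/2)$ respectively. We then have

$$F_S(\omega, \omega') = \cos(\omega - \omega')$$

(we shall see that $N_S = 1$). The probabilities become

$$P_{X,S}(-1/2) = \tfrac{1}{4} \int_0^\theta \int_0^\theta \cos(\omega - \omega')\, d\omega\, d\omega' = \tfrac{1}{4} \left[ \int_0^\theta \cos\omega\, d\omega \right]^2 + \tfrac{1}{4} \left[ \int_0^\theta \sin\omega\, d\omega \right]^2$$

$$= \tfrac{1}{4} \sin^2\theta + \tfrac{1}{4} (1 - \cos\theta)^2 = \sin^2 \tfrac{\theta}{2} \qquad (22)$$

$$P_{X,S}(1/2) = \tfrac{1}{4} \int_\theta^\pi \int_\theta^\pi \cos(\omega - \omega')\, d\omega\, d\omega' = \tfrac{1}{4} \left[ \int_\theta^\pi \cos\omega\, d\omega \right]^2 + \tfrac{1}{4} \left[ \int_\theta^\pi \sin\omega\, d\omega \right]^2$$

$$= \tfrac{1}{4} \sin^2\theta + \tfrac{1}{4} (1 + \cos\theta)^2 = \cos^2 \tfrac{\theta}{2} \qquad (23)$$

Since $P_{X,S}(-1/2) + P_{X,S}(1/2) = 1$, we see that $N_S = 1$. Notice that (22) and (23) are the usual probability distribution for spin in the $\theta$ direction when $j_z = 1/2$.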
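The two spin probabilities can be checked by evaluating the double integrals of the influence $\cos(\omega - \omega')$ directly. The sketch below assumes the model stated above (action $S(\omega) = \omega$, fiber measures equal to half of Lebesgue measure, $N_S = 1$) and uses a plain midpoint rule; the grid size and the function name are illustrative choices.

```python
import math

def spin_probabilities(theta, n=200):
    """Midpoint-rule value of (1/4) * double integral of cos(w - w') over each fiber.

    Returns (P(-1/2), P(+1/2)) for the fibers [0, theta] and (theta, pi].
    """
    def fiber_integral(a, b):
        if b <= a:
            return 0.0
        h = (b - a) / n
        pts = [a + (k + 0.5) * h for k in range(n)]
        total = 0.0
        for w1 in pts:
            for w2 in pts:
                total += math.cos(w1 - w2)
        return 0.25 * total * h * h

    return fiber_integral(0.0, theta), fiber_integral(theta, math.pi)

p_minus, p_plus = spin_probabilities(1.0)
```

For $\theta = 1$ the two values agree with $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$ to within the quadrature error, and they sum to one without any further normalisation.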

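For $j_z = -1/2$ we define the X-action $(S', \nu_X^{-1/2}, \nu_X^{1/2})$ given by $S'(\omega) = \omega$ for $\omega \in (0, \pi)$ and $S'(\omega) = -\pi/2$ for $\omega \in \{0, \pi\}$, with $\nu_X^{-1/2} = \delta_0 + \mu/2$ and $\nu_X^{1/2} = \delta_\pi + \mu/2$, where $\delta_0$, $\delta_\pi$ are the Dirac point measures at 0, $\pi$ respectively. A similar but more tedious calculation gives

$$P_{X,S'}(-1/2) = \cos^2 \tfrac{\theta}{2}$$

$$P_{X,S'}(1/2) = \sin^2 \tfrac{\theta}{2}$$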

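which is the usual distribution for spin in the $\theta$ direction when $j_z = -1/2$.

We now examine the wave functions and Hilbert space corresponding to this model. The S-amplitude function becomes $f_S(\omega) = e^{i\omega}$, and the (X, S)-wave function $f_{X,S}$ is given by

$$f_{X,S}(-1/2) = \tfrac{1}{2} \int_0^\theta e^{i\omega}\, d\omega = \tfrac{i}{2} (1 - e^{i\theta})$$

$$f_{X,S}(1/2) = \tfrac{1}{2} \int_\theta^\pi e^{i\omega}\, d\omega = \tfrac{i}{2} (1 + e^{i\theta})$$

The $S'$-amplitude function becomes $f_{S'}(\omega) = e^{i\omega}$ for $\omega \in (0, \pi)$ and $f_{S'}(\omega) = -i$ for $\omega \in \{0, \pi\}$, and the (X, S')-wave function $f_{X,S'}$ is given by

$$f_{X,S'}(-1/2) = \int_{[0,\theta]} f_{S'}(\omega)\, \nu_X^{-1/2}(d\omega) = -i + \tfrac{1}{2} \int_0^\theta e^{i\omega}\, d\omega = -\tfrac{i}{2} (1 + e^{i\theta})$$

$$f_{X,S'}(1/2) = \int_{(\theta,\pi]} f_{S'}(\omega)\, \nu_X^{1/2}(d\omega) = -i + \tfrac{1}{2} \int_\theta^\pi e^{i\omega}\, d\omega = -\tfrac{i}{2} (1 - e^{i\theta})$$

The X-Hilbert space is clearly $\mathbb{C}^2$, and we can represent $f_{X,S}$ and $f_{X,S'}$ in $\mathbb{C}^2$ (dropping the overall phase factors) by the unit vectors

$$v_S = \tfrac{1}{2} (1 - e^{i\theta},\ 1 + e^{i\theta})$$

$$v_{S'} = \tfrac{1}{2} (1 + e^{i\theta},\ 1 - e^{i\theta})$$

Notice that $v_S \perp v_{S'}$. Also, when $\theta = 0$, $v_S = (0, 1)$ and $v_{S'} = (1, 0)$, which are the usual eigenvectors for the spin 1/2 operator in the z direction. We can treat this as a measurement and the general $X$ as an observable. It can be shown that the matrix for $X$ in the standard basis $(1, 0)$, $(0, 1)$ becomes

$$\hat{X} = \frac{1}{2} \begin{pmatrix} \cos\theta & i\sin\theta \\ -i\sin\theta & -\cos\theta \end{pmatrix} = \frac{1}{2} \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \frac{1}{2} \sin\theta \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$

which is the usual form for a spin 1/2 matrix in the direction $\theta$.

We can extend this analysis to higher order spins.$^{3}$ Moreover, this framework gives a realistic model for the Bohm version of the EPR problem.$^{4}$ The reason that Bell's theorem is not contradicted is that Bell's inequalities are derived using classical probability theory, whereas we have employed quantum probability theory.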
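The algebraic claims about the two state vectors and the spin matrix are easy to confirm numerically. This sketch assumes the vectors are $\tfrac{1}{2}(1 - e^{i\theta}, 1 + e^{i\theta})$ for $j_z = 1/2$ and $\tfrac{1}{2}(1 + e^{i\theta}, 1 - e^{i\theta})$ for $j_z = -1/2$, with overall phase factors dropped, and that the matrix is $\tfrac{1}{2}(\cos\theta\,\sigma_z$-type part plus $\sin\theta$ off-diagonal part$)$ as written above; the function names are illustrative.

```python
import cmath
import math

def spin_vectors(theta):
    """Unit vectors (component for outcome -1/2, component for outcome +1/2)."""
    e = cmath.exp(1j * theta)
    v_s = (0.5 * (1 - e), 0.5 * (1 + e))        # state with j_z = +1/2
    v_s_prime = (0.5 * (1 + e), 0.5 * (1 - e))  # state with j_z = -1/2
    return v_s, v_s_prime

def inner(u, v):
    """Inner product on C^2, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def spin_matrix(theta):
    """(1/2) [[cos t, i sin t], [-i sin t, -cos t]] as a nested list."""
    c, s = math.cos(theta), math.sin(theta)
    return [[0.5 * c, 0.5j * s], [-0.5j * s, -0.5 * c]]

theta = 0.9
v_s, v_s_prime = spin_vectors(theta)
m = spin_matrix(theta)
trace = m[0][0] + m[1][1]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

The squared moduli of the components of `v_s` reproduce (22) and (23), the two vectors are orthonormal, and the matrix is Hermitian with trace 0 and determinant $-1/4$, so its eigenvalues are $\pm 1/2$ for every $\theta$.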


5 Traditional Quantum Mechanics

We now show that this formalism contains traditional nonrelativistic quantum mechanics. For simplicity we consider a single spinless particle in one dimension, although this work easily generalizes to three dimensions. We take our sample space to be the phase space

$$\Omega = \mathbb{R}^2 = \{(q, p) : q, p \in \mathbb{R}\}$$

The two most important measurements are the position and momentum, given by $Q(q, p) = q$ and $P(q, p) = p$ respectively. However, as is frequently done in quantum mechanics, we shall investigate the q-representation of the system. In this case $Q$ is considered a measurement and $P\colon \Omega \to \mathbb{R}$ is viewed as a function on $\Omega$ which, as we shall show, is a Q-observable.

Each Q-fiber $Q^{-1}(q) = \{(q, p) : p \in \mathbb{R}\}$ can be identified with $\mathbb{R}$. We make $Q$ a measurement by endowing its range $R(Q) = \mathbb{R}$ with Lebesgue measure and its fibers with the usual Borel structure of $\mathbb{R}$. Only certain Q-actions $(S, \mu_Q^q, q \in \mathbb{R})$ correspond to traditional quantum states, and these can be derived from natural postulates. We assume that $\mu_Q^q$ is absolutely continuous relative to Lebesgue measure on $\mathbb{R}$ and that $\mu_Q^q$ is independent of $q$. This is because sets of Lebesgue measure zero are too small to have any effect on the outcomes of position measurements, and there is no a priori reason to distinguish between Q-fibers. It follows from the Radon-Nikodym theorem that there exists a nonnegative Lebesgue measurable function $\xi\colon \mathbb{R} \to \mathbb{R}$ such that

$$\mu_Q^q(dp) = (2\pi\hbar)^{-1/2}\, \xi(p)\, dp \qquad (24)$$

We take $S\colon \Omega \to \mathbb{R}$ to have the form

$$S(q, p) = \frac{qp}{\hbar} + \eta(p) \qquad (25)$$

This form is natural because $qp$ is the classical action and adding a function of momentum gives a quantum fluctuation. We could also add a function of $q$, but it is easy to see that this would just multiply the wave function by a constant phase, which would not alter the probabilistic formulas. Denote by $\mathcal{A}_Q$ the set of Q-actions that have the form (24), (25).

Applying (12) for $S \in \mathcal{A}_Q$, we find that the (Q, S)-wave function becomes

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \xi(p) e^{i\eta(p)} e^{iqp/\hbar}\, dp$$

Defining

$$\phi(p) = \xi(p) e^{i\eta(p)} \qquad (26)$$

and denoting the inverse Fourier transform by $\vee$, we have

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \phi(p) e^{iqp/\hbar}\, dp = \phi^{\vee}(q) \qquad (27)$$

In order for (3) to be satisfied, $\phi^{\vee}$ must be a unit vector in $L^2(\mathbb{R}, dq)$ or, equivalently, $\phi$ must be a unit vector in $L^2(\mathbb{R}, dp)$. However, every vector in $L^2(\mathbb{R}, dp)$ has the form (26) for some functions $\xi\colon \mathbb{R} \to \mathbb{R}^{+}$, $\eta\colon \mathbb{R} \to \mathbb{R}$. It follows that the Q-Hilbert space becomes the traditional Hilbert space $H_Q = L^2(\mathbb{R}, dq)$, and $f_{Q,S}$ is the usual wave function (or state).

Let $(S, \mu_Q^q, q \in \mathbb{R})$ be a fixed Q-action in $\mathcal{A}_Q$ of the form (24), (25), and let $\psi(q) = f_{Q,S}(q)$, $\phi(p) = \xi(p) e^{i\eta(p)}$. Applying (16) and (27) we have

$$f_{Q,S}(P)(q) = (2\pi\hbar)^{-1/2} \int p\, \phi(p) e^{iqp/\hbar}\, dp = -i\hbar \frac{d}{dq} (2\pi\hbar)^{-1/2} \int \phi(p) e^{iqp/\hbar}\, dp = -i\hbar \frac{d\psi}{dq}(q)$$

More generally, if $n$ is a positive integer we obtain

$$f_{Q,S}(P^n)(q) = \left( -i\hbar \frac{d}{dq} \right)^n \psi(q) \qquad (28)$$

Moreover, applying (18) we have

$$E_{Q,S}(P^n) = \int \left[ \left( -i\hbar \frac{d}{dq} \right)^n \psi(q) \right] \overline{\psi(q)}\, dq$$

which is the usual quantum expectation formula. We conclude from (28) that $P^n$ is a Q-observable and is represented by the operator $(-i\hbar\, d/dq)^n$. Moreover, if $V\colon \mathbb{R} \to \mathbb{R}$ is measurable, we see from (19) that $V(Q)$ is a Q-observable and is represented by the operator $V(Q)^{\wedge}u(q) = V(q)u(q)$. This, together with our observation concerning $P$, gives a derivation of the Bohr correspondence principle.
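The statement that the amplitude average of the momentum function $P$ equals $-i\hbar\, d\psi/dq$ can be checked numerically for a concrete state. The sketch below assumes units with $\hbar = 1$ and a hypothetical Gaussian momentum amplitude $\phi(p) = \pi^{-1/4} e^{-p^2/2}$ (so $\eta = 0$); the cutoff, grid size and function names are illustrative choices.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def phi(p):
    """Hypothetical unit-norm momentum amplitude xi(p) * exp(i * eta(p)) with eta = 0."""
    return math.pi ** -0.25 * math.exp(-p * p / 2.0)

def amplitude_average(g, q, p_max=12.0, n=4000):
    """(2*pi*hbar)^(-1/2) * integral of g(p) * phi(p) * exp(i*q*p/hbar) dp, midpoint rule."""
    h = 2.0 * p_max / n
    total = 0j
    for k in range(n):
        p = -p_max + (k + 0.5) * h
        total += g(p) * phi(p) * cmath.exp(1j * q * p / HBAR)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)

def psi(q):
    """Position wave function: the amplitude average of the constant function 1."""
    return amplitude_average(lambda p: 1.0, q)

def momentum_on_psi(q, dq=1e-4):
    """-i * hbar * d(psi)/dq, approximated by a central difference."""
    return -1j * HBAR * (psi(q + dq) - psi(q - dq)) / (2.0 * dq)

q0 = 0.8
lhs = amplitude_average(lambda p: p, q0)  # amplitude average of P at q0
rhs = momentum_on_psi(q0)                 # differential operator applied to psi at q0
```

For this Gaussian both sides equal $i q_0 \psi(q_0)$, so the fiber integral of $p$ against the amplitude and the operator $-i\hbar\, d/dq$ give the same function.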

We now consider probability distributions. We have already seen in (15) that

$$P_{Q,S}(B) = \int_B |\psi(q)|^2\, dq$$

which is the usual distribution of $Q$. It is more interesting to compute the probability of $A = P^{-1}(B)$ for the momentum function $P$. We have from (21) that

$$f_{Q,S}(\chi_A)(q) = (2\pi\hbar)^{-1/2} \int_B \phi(p) e^{iqp/\hbar}\, dp = (\chi_B \phi)^{\vee}(q)$$

Hence, by (20) and the Plancherel formula, we obtain

$$P_{Q,S}[P^{-1}(B)] = \int (\chi_B \phi)^{\vee}(q)\, \overline{\psi(q)}\, dq = \int (\chi_B \phi)(p)\, \overline{\phi(p)}\, dp = \int_B |\phi(p)|^2\, dp$$

Again, this is the usual momentum distribution. This gives an example in which the pseudoprobability $P_{Q,S}$ is an actual probability measure on a $\sigma$-algebra of subsets of $\Omega$.

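Until now we have treated time as fixed. We now briefly consider dynamics. Let $\psi(q, t)$ be a smooth function. Our previous formulas hold with $\psi(q)$ replaced by $\psi(q, t)$ and $\mu_Q^q$ replaced by $\mu_Q^{q,t}$. We now derive Schrödinger's equation from Hamilton's equation of classical mechanics, $dp/dt = -\partial H/\partial q$. Suppose the energy function has the form

$$H(q, p) = \frac{p^2}{2m} + V(q)$$

We assume that Hamilton's equation holds in the amplitude average. Applying (16) we have

$$\frac{\partial}{\partial t} \int p\, f_S(q, p, t)\, \mu_Q^{q,t}(dp) = -\frac{\partial}{\partial q} \int H(q, p) f_S(q, p, t)\, \mu_Q^{q,t}(dp)$$

Hence

$$\frac{\partial}{\partial t} \int p\, \phi(p, t) e^{iqp/\hbar}\, dp = -\frac{\partial}{\partial q} \int H(q, p)\, \phi(p, t) e^{iqp/\hbar}\, dp$$

Applying (28) and (19) gives

$$-i\hbar \frac{\partial}{\partial t} \frac{\partial \psi}{\partial q} = -\frac{\partial}{\partial q} \left[ -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial q^2} + V(q)\psi \right]$$

Interchanging the order of differentiation on the left side of this equation and integrating with respect to $q$ gives Schrödinger's equation.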
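Written out, the last step reads as follows. This is a sketch of the omitted algebra; it assumes that the function of $t$ alone produced by the integration over $q$ vanishes (or is absorbed into the phase of $\psi$), which the text does not state explicitly.

```latex
-i\hbar\,\frac{\partial}{\partial q}\frac{\partial\psi}{\partial t}
  = -\frac{\partial}{\partial q}
    \left[-\frac{\hbar^{2}}{2m}\frac{\partial^{2}\psi}{\partial q^{2}} + V(q)\,\psi\right]
\quad\Longrightarrow\quad
i\hbar\,\frac{\partial\psi}{\partial t}
  = -\frac{\hbar^{2}}{2m}\frac{\partial^{2}\psi}{\partial q^{2}} + V(q)\,\psi .
```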

6 Concluding Remarks

In this paper we have presented a realistic, contextual, nonlocal approach to quantum probability theory. The formalism is realistic because each sample point $\omega \in \Omega$ uniquely determines a value $X(\omega)$ for any measurement $X$. In this way, a physical system $\mathcal{S}$ possesses all of its attributes independent of whether they are measured. Although the sample space $\Omega$ exists and we can discuss its properties, $\Omega$ is not physically accessible in general. This is because the sample points may not correspond to physical states that can be prepared in the laboratory or at least exist in nature. We may think of $\Omega$ as a hidden variable completion of quantum mechanics. This approach is contextual because it is necessary to specify a particular basic measurement $X$. Once $X$ is specified, a Hilbert space $H_X$ can be constructed, and $H_X$ provides an X-representation for $\mathcal{S}$. Of course, one may choose a different basic measurement $Y$, and then the Y-representation will give a different picture of $\mathcal{S}$. For example, in traditional quantum mechanics we usually choose the position representation or the momentum representation to describe $\mathcal{S}$. For a given basic measurement $X$ and an action $S$, we have given a method for constructing the probability distribution $P_{X,S}$ of $X$. We have shown that $P_{X,S}$ may be found in terms of a state vector $f_{X,S} \in H_X$, and these correspond to physically accessible states. In $H_X$, the measurement $X$ and functions of $X$ are diagonal and hence represented by random variables. Other measurements, which we call observables to distinguish them from $X$, are represented by self-adjoint operators on $H_X$, and their usual distributions follow in a natural way. The theory is nonlocal because the distribution $P_{X,S}$ is specified by an influence function $F_S(\omega, \omega')$. This function provides an influence between pairs of sample points which, in a spacetime model, may be spacelike separated.

There is considerable controversy concerning various interpretations and approaches to probability theory. I believe that three types of probabilities are necessary for a description of quantum mechanics. The probabilities and distributions of measurement results in the laboratory are usually computed using long run relative frequencies. Even though a measurement $X$ may involve a microscopic system $\mathcal{S}$ (for example, the position of an electron), $\mathcal{S}$ must interact with a macroscopic apparatus in order to obtain an observable outcome. The theoretician's task is to find the distribution $P_X$ of $X$. This theoretical distribution should agree with the long run relative frequencies found in the laboratory, or give predictions that can eventually be tested experimentally. Since there are serious, well-known difficulties in dealing with abstract theories of relative frequencies, it is convenient and perhaps even necessary to use the standard Kolmogorovian probability theory for describing $P_X$. Now $P_X$ is a probability measure that satisfies the axioms of standard probability theory. However, the method for computing $P_X$ is characteristic of quantum mechanics and is not found in any classical theory. Richard Feynman, whose work has motivated the present paper, once said that nobody really understands quantum mechanics. I think that what he meant is that nobody understands why nature has chosen to compute probabilities in this unusual way. As presented here, the probability density for $P_X$ is found by employing an influence function. The advantage of this method is that it is physically motivated and avoids complex numbers. An equivalent method, which is usually employed in quantum mechanics, is to take the absolute value squared of the wave function.

The quantum probability approach that we have presented contains standard probability theory as a special case. Thus we only need two types of probabilities to describe quantum mechanics. Standard probability theory, as developed by Kolmogorov, is a distillation of hundreds of years of experience with empirical and theoretical studies of chance phenomena. The founders of the subject were concerned with games of chance, statistics and the behavior of macroscopic objects. They were not aware of microscopic objects and quantum mechanics and had no reason to design a probability theory for describing such situations. It is therefore not surprising that a new theory, called quantum probability theory, had to be developed to serve these purposes.

References

1. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).

2. S. Gudder, Int. J. Theor. Phys. 32, 1747 (1993).

3. S. Gudder, Int. J. Theor. Phys. 32, 824 (1993).

4. S. Gudder, Quantum probability and the EPR argument, Ann. Found. Louis de Broglie 20, 167 (1994).

5. G. Hemion, Int. J. Theor. Phys. 29, 1335 (1990).

INNOVATION APPROACH TO STOCHASTIC PROCESSES AND QUANTUM DYNAMICS

TAKEYUKI HIDA
Department of Mathematics, Meijo University, Tenpaku, Nagoya 468-8502
and Nagoya University (Professor Emeritus)

The theory of stochastic processes developed extensively in the twentieth century, and a beautiful connection with quantum dynamics was established. It seems to be a good time now to revisit the foundations of stochastic processes and quantum mechanics, with the hope that the attempt will suggest some further directions for these two intimately related disciplines. For this purpose we review some topics in white noise analysis and observe motivations from physics and how they have actually been realized.

1 Introduction

We shall discuss the analysis of random complex systems and its connection with quantum dynamics. In particular, we analyse some stochastic processes X(t) and random fields X(C) by using the innovation, and revisit quantum dynamics in connection with stochastic analysis. Our aim is to study those random complex systems, including quantum fields, by using white noise analysis.

The basic idea of our analysis is that we first discuss stochastic processes by taking a basic and standard system of random variables and then expressing the given process as a function of that system. The system of variables from which we start is called a system of idealized elemental random variables (abbr. ierv). The idea of taking such a system is in line with

Reductionism

One might think that this idea is similar to reductionism in physics. Before we come to this point, it is interesting to refer to the lecture given by P. W. Anderson at the University of Tokyo in 1999. His title included "Emergence" together with "reductionism", and he gave a good interpretation.

Following the reductionism, the next step is to form a function of the iervs so that the function represents the given random complex system. It is nothing but

Synthesis

Then naturally follows the analysis of the functions that have been formed in our setup. Thus the goal has to be the analysis of the function (which may be called a functional) in order to identify the random complex system in question.

The first step, taking a suitable system of iervs, has been influenced by the way the notion of a stochastic process is understood. We therefore give a quick review of the definition of a stochastic process, starting from the ideas of J. Bernoulli (Ars Conjectandi, 1713), S. Bernstein (1933) and P. Lévy (1947), which suggest that we consider the innovation of a stochastic process. It is viewed as a system of iervs, which will be specified to be a white noise.

The analysis of white noise functionals has many significant characteristics that fit the investigation of quantum mechanical phenomena. Thus we shall be able to show examples to which white noise theory is efficiently applied.

With great contributions by many authors, the theory developed along our line has reached its present state.

AMS 2000 Mathematics Subject Classification 60H40 White Noise Theory

2 Review of defining a stochastic process and white noise analysis

There is a traditional, and in fact original, way of defining a stochastic process. Let us refer to Lévy's definition of a stochastic process given in his book [3], Chapt. II: "une fonction aléatoire X(t) du temps t dans lequel le hasard intervient à chaque instant" (a random function X(t) of the time t in which chance intervenes at each instant). The "hasard" is expressed as an infinitesimal random variable Y(t) which is independent of the observed values of X(s), $s \le t$, in the past. The random variable Y(t) is nothing but the innovation of the process X(t).

Formally speaking, the Y(t), which is usually an infinitesimal random variable, contains the information that was gained by the X(t) during the time interval [t, t + dt). To express this idea, P. Lévy proposed a formula, called an infinitesimal equation, for the variation $\delta X(t)$:

$$\delta X(t) = \Phi(X(s),\ s \le t,\ Y(t),\ t,\ dt)$$

where $\Phi$ is a non-random functional. Although this equation has only a formal significance, it is still very suggestive.

It would be fine if the given process could be expressed as a functional of Y(t) in the following manner:

$$X(t) = \Psi(Y(s),\ s \le t,\ t)$$

where $\Psi$ is a sure (non-random) function. Such a trick may be called the Reduction and Synthesis method. The above expression is causal in the sense that X(t) is expressed as a function of Y(s), $s \le t$, and never uses Y(s) with $s > t$.
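A discrete-time analogue may help fix ideas. The sketch below is an illustration only: it assumes a hypothetical first-order autoregressive process $X_n = a X_{n-1} + Y_n$ driven by independent variables $Y_n$, which is not an example from the text but shows both directions, the causal synthesis of X from the innovations and the recovery of the innovations from X.

```python
import random

def synthesize(innovations, a=0.6):
    """Causal synthesis: X_n = sum over k <= n of a**(n-k) * Y_k, built recursively."""
    x, path = 0.0, []
    for y in innovations:
        x = a * x + y
        path.append(x)
    return path

def innovations_of(path, a=0.6):
    """Reduction: Y_n = X_n - a * X_{n-1}, the part of X_n not determined by the past."""
    prev, ys = 0.0, []
    for x in path:
        ys.append(x - a * prev)
        prev = x
    return ys

random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(50)]  # hypothetical innovation sequence
xs = synthesize(ys)
recovered = innovations_of(xs)
```

Causality shows up in the fact that `xs[n]` depends only on `ys[0..n]`: changing later innovations leaves the earlier part of the path untouched.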

Note that this method of defining a stochastic process is more important than the function space type distribution.

The collection Y(s) is a system of iervs so that the above expression is a realization of the synthesis We are particularly interested in the case where the system of iervs is taken to be a white noise and thus ready to discuss white noise analysis

So far we have discussed the theory only for a stochastic process. It is in fact quite natural to extend the theory to a random field X(C) indexed by an ovaloid C, say a contour or a closed surface. A generalization of the infinitesimal equation is

$$\delta X(C) = \Phi(X(C'),\ C' < C,\ Y(s),\ s \in C,\ C,\ \delta C)$$

The system $\{Y(s),\ s \in C\}$ is the innovation. Such a generalization is possible because of the use of the innovation.

We note that white noise analysis has many advantages, which are briefly mentioned below.

1) It is an infinite dimensional analysis. Our stochastic analysis can be done systematically by taking a white noise as a system of iervs to express the given random complex systems. Indeed, the analysis is essentially infinite dimensional, as will be seen in what follows.

2) Infinite dimensional harmonic analysis. The white noise measure, supported by the space $E^*$ of generalized functions on the parameter space $\mathbb{R}^d$, is invariant under the rotations of $E$. Hence a harmonic analysis arising from the rotation group will naturally be discussed. The group contains significant subgroups which describe essentially infinite dimensional characters.

3) Generalizations to random fields X(C) are discussed in a manner similar to X(t), so far as the innovation is concerned. Needless to say, X(C) enjoys more profound characteristic properties.

4) Connection with classical functional analysis. The so-called S-transform applied to white noise functionals provides a bridge connecting white noise functionals and classical functionals of ordinary functions. We can therefore appeal to the classical theory of functionals established in the first half of the twentieth century.

5) Good connection with quantum dynamics, as will be seen in the next section.

For the differential and integral calculus of white noise functionals using the annihilation operators $\partial_t$ and the creation operators $\partial_t^*$, the class of generalized functionals, harmonic analysis including Fourier analysis, the Lévy Laplacian $\Delta_L$, complexification and other theories, we refer to the monograph [12] and other literature.

3 Relations to Quantum Dynamics

We now explain briefly some topics in quantum dynamics to which white noise theory can be applied. What we are going to present here may seem to be separate topics, but behind the description there is always a white noise.

1) Representation of the canonical commutation relations for a Boson field. This topic is well known.

Let $\dot{B}(t)$ be a white noise and let $\partial_t$ denote the $\dot{B}(t)$-derivative. Then it is an annihilation operator, and its dual operator $\partial_t^*$ stands for the creation operator. They satisfy the commutation relations

$$[\partial_t, \partial_s] = [\partial_t^*, \partial_s^*] = 0$$

$$[\partial_t, \partial_s^*] = \delta(t - s)$$

From these, a representation of the canonical commutation relations is given for Bosonic particles.

It is noted that the following assertion holds.

Proposition. There are continuously many irreducible representations of the canonical commutation relations.

White noises with different variances are inequivalent to each other, which proves the assertion.

2) Reflection positivity (T-positivity)

A stationary multiple Markov (say N-ple Markov) Gaussian process has a spectral density function $f(\lambda)$ of a particular type. Namely

On the other hand, it is proved that

Proposition. The covariance function $\gamma(h)$ of a stationary T-positive Gaussian process is expressed in the form

$$\gamma(h) = \int_0^\infty \exp[-|h|x]\, d\nu(x)$$

where $\nu$ is a positive finite measure.

By applying this assertion to the N-ple Markov Gaussian process, we claim that T-positivity requires $c_k > 0$ for every k.

Note that in the strictly N-ple Markov case this condition is not satisfied.

It is our hope that this result will be generalized to the cases of general stochastic processes with multiple Markov properties.

3) A path integral formulation

One of the realizations of the Dirac-Feynman idea of the path integral may be given by the following method using generalized white noise functionals. First we establish a class of possible trajectories when a Lagrangian $L(x, \dot{x})$ is given. Let $x$ be the classical trajectory determined by the Lagrangian. As soon as we come to quantum dynamics, we have to consider fluctuating paths $y$. We propose that they are given by

$$y(s) = x(s) + \sqrt{\frac{\hbar}{m}}\, B(s)$$

The average over the paths is replaced with the expectation with respect to the probability measure on which the Brownian motion B(t) is defined. Thus the propagator $G(y_1, y_2, t)$ is given by

$$E\left\{ N \exp\left[ \frac{i}{\hbar} \int_0^t L(y, \dot{y})\, ds + \frac{1}{2} \int_0^t \dot{B}(s)^2\, ds \right] \cdot \delta(y(t) - y_2) \right\}$$

With this setup, actual computations have been done to obtain exact formulae for the propagators (L. Streit et al.).

4) Dirichlet forms in infinite dimensions. With the help of positive generalized white noise functionals, we prove criteria for closability of energy forms. See [3].

5) Random fields X(C)

A random field X(C) depending on a parameter C, which is taken to be a certain smooth and closed manifold in a Euclidean space, naturally enjoys a more complex probabilistic structure than a stochastic process X(t) depending on the time t. It therefore has good connections with quantum fields in physics.

We are particularly interested in the case where X(C) has a causal representation in terms of white noise. Some typical examples are listed below.

5.1) Markov property and multiple Markov properties. We are led by Dirac's paper [1] to define a Markov property. For the Gaussian case a reasonable definition has been given (see [15]) by using the canonical representation in terms of white noise, where the canonical property of a representation can be introduced as a generalization of that for a Gaussian process. Some attempts have been made for some non-Gaussian fields (see [17]). For the Gaussian case, multiple Markov properties have been defined. It is now an interesting question to find conditions under which a Gaussian random field satisfies a multiple Markov property.

5.2) Stochastic variational equations of Langevin type. Let C run through a class $\mathbf{C}$ of concentric circles. The problem is to solve the following stochastic variational equation of Langevin type:

$$\delta X(C) = -\lambda X(C) \int_C \delta n(s)\, ds + X_0 \int_C v(s)\, \partial_s^*\, \delta n(s)\, ds$$

The explicit solution is given by using the S-transform and the classical theory of functionals.

5.3) We have made an attempt to define a random field X(C), $C \in \mathbf{C}$, which satisfies conformal invariance. Reversibility can also be discussed.

Example. Linear parameter case: a Brownian bridge. For $t \in [0, 1]$ it is defined by

$$X(t) = (1 - t) \int_0^t \frac{1}{1 - u}\, \dot{B}(u)\, du$$
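That this white noise integral really is a Brownian bridge can be checked through its covariance. The sketch assumes the standard isometry for white noise integrals, so that the covariance of X(s) and X(t) is $(1-s)(1-t)\int_0^{\min(s,t)} (1-u)^{-2}\, du$, and evaluates the integral by a midpoint rule; the function name and grid size are illustrative.

```python
def bridge_covariance(s, t, n=20000):
    """Cov(X(s), X(t)) = (1-s)(1-t) * integral_0^min(s,t) (1-u)^(-2) du, midpoint rule.

    Valid for 0 <= s, t < 1; the integrand is singular at u = 1.
    """
    m = min(s, t)
    h = m / n
    integral = sum(h / (1.0 - (k + 0.5) * h) ** 2 for k in range(n))
    return (1.0 - s) * (1.0 - t) * integral

cov = bridge_covariance(0.3, 0.7)
var = bridge_covariance(0.5, 0.5)
```

The values agree with the Brownian bridge covariance $\min(s,t)\,(1 - \max(s,t))$; in particular the variance $t(1-t)$ vanishes at both ends of the unit interval, which is the symmetry behind the time-reflection reversibility.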

Reversibility can be guaranteed not only by the time reflection but also by whiskers (one-parameter subgroups defined by deformation of the parameter) in the conformal group that leaves the unit time interval invariant.

We now come to the case of a random field. Let $\mathbf{C}$ be the class of concentric circles. Assume $0 < r_0 < r < r_1$. Denote by $C_r$ the circle with radius $r$. Then we define

(ft) - yfi^^bw w^w^

This is a canonical representation. To show reversibility, we apply the inversion with respect to the circle with radius $\sqrt{r_0 r_1}$.

We claim that it is possible to have a generalization to the case where C is taken to be a class of curves obtained by a conformal mapping of concentric circles

Remark 1. It is noted that the white noise x(t) is regarded as a representation of the parameter t, so that the propagation of randomness (fluctuation) is expressed in terms of x(t) instead of the time t itself. Namely, the way random complex phenomena develop, in particular reversibility, has an explicit description in terms of white noise, as is seen in the above example.

Remark 2. See the papers [1] by Dirac and [13] by Polyakov for suggestions on a generalization of the path integral.

4 Addenda to foundations of the theories Concluding remarks

Before the concluding remarks are given, we should like to add some facts as addenda to §1 regarding the foundations of probability theory.

From the brief history mentioned in §1, we understand the reason why a white noise, that is, a system of i.e.r.v.'s, is introduced. It is a generalized stochastic process, so we need some additional consideration when reasonable functionals, in general nonlinear functionals of white noise, are introduced. In physics we have met interesting cases where such nonlinear functionals of white noise are required: canonical commutation relations for quantum fields, where the degree of freedom is continuously infinite; Feynman's path integrals, as was discussed in 3) of the last section; and the variational equation for a random field. On the other hand, we were lucky when a class of generalized white noise functionals was introduced in 1975, since the theory of generalized functions had been established and some attempts had been made to apply it to the theory of generalized stochastic processes. To obtain further fruitful results we have been given a powerful method to study random fields indexed by a manifold. It is the so-called innovation approach, where our reductionism is not affected by the higher dimensionality of the parameter space. With these in mind we can come to the concluding remarks.

As concluding remarks, some proposed future directions are now in order.

1. One is concerned with good applications of the Lévy Laplacian. Its significance is that it is an operator that is essentially infinite dimensional.

2. A two-dimensional Brownian path is considered to have some optimality in occupying the territory. This property should be reflected in forming a model of physical phenomena.

3. A systematic approach to invariance of random fields under transformation groups will be discussed.

4. Stochastic variational calculus for random fields.

With the classical results on variational calculus, we can proceed further with white noise analysis.

Acknowledgements. The author is grateful to Professor A. Khrennikov, who invited him to give a talk at this conference. Thanks are due to the Academic Frontier Project at Meijo University for the support of this work.

References

1. P. A. M. Dirac, The Lagrangian in quantum mechanics, Phys. Z. Soviet Union 3, 64-72 (1933).
2. S. Tomonaga, On a relativistically invariant formulation of the quantum theory of wave fields, Prog. Theor. Phys. 1, 27-42 (1946).
3. P. Lévy, Processus stochastiques et mouvement brownien (Gauthier-Villars, 1948; 2nd ed., 1965).
4. P. Lévy, Nouvelle notice sur les travaux scientifiques de M. Paul Lévy, Janvier 1964, Part III: Processus stochastiques (unpublished manuscript).
5. T. Hida, Canonical representations of Gaussian processes and their applications, Mem. College of Science, Univ. of Kyoto A 33, 109-155 (1960).
6. T. Hida, Stationary stochastic processes (Princeton Univ. Press, 1970).
7. T. Hida, Brownian motion (Iwanami Pub. Co., 1975; English ed., Springer-Verlag, 1980).
8. T. Hida, Analysis of Brownian functionals, Carleton Math. Lecture Notes 13 (1975).
9. T. Hida, Innovation approach to random complex systems, Pub. Volterra Center 433 (2000).
10. T. Hida and L. Streit, On quantum theory in terms of white noise, Nagoya Math. J. 68, 21-34 (1977).
11. T. Hida, J. Potthoff and L. Streit, Dirichlet forms and white noise analysis, Commun. Math. Phys. 116, 235-245 (1988).
12. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise: an infinite dimensional calculus (Kluwer Academic Pub., 1993).
13. A. M. Polyakov, Quantum geometry of bosonic strings, Phys. Lett. 103B, 207-210 (1981).
14. J. Schwinger, Brownian motion of a quantum oscillator, J. Math. Phys. 2, 407-432 (1961).
15. Si Si, Gaussian processes and Gaussian random fields, Quantum Information II (World Scientific Pub. Co., 2000).
16. L. Streit and T. Hida, Generalized Brownian functionals and the Feynman integral, Stoch. Processes Appl. 16, 55-69 (1983).
17. L. Accardi and Si Si, Innovation approach to multiple Markov properties of some non Gaussian random fields, to appear.


STATISTICS AND ERGODICITY OF WAVE FUNCTIONS IN CHAOTIC OPEN SYSTEMS

H. ISHIO

Department of Physics and Measurement Technology, Linköping University, S-581 83 Linköping, Sweden
E-mail: hiris@ifm.liu.se

and

Division of Natural Science, Osaka Kyoiku University, Kashiwara, Osaka 582-8582, Japan
E-mail: ishio@cc.osaka-kyoiku.ac.jp

In general, quantum chaotic systems are considered to be described in the context of random matrix theory, i.e., by random Gaussian variables (real or complex) in an appropriate universality class. In reality, however, quantum states inside a chaotic open system are not given by a statistically homogeneous random state. We show some numerical evidence of such statistical inhomogeneity for ballistic transport through two-dimensional chaotic open billiards, and discuss its relation to the corresponding classical dynamics.

1 Introduction

The quantum-mechanical signature of classical chaos is called quantum chaos. A rigorous definition of chaotic systems in quantum theory has been given very recently for Kolmogorov (K-) and Anosov (C-) systems, by analogy with the corresponding classical properties [1]. In such systems quantum ergodicity is naturally expected: eigenfunctions are equidistributed in their representation space, and all expectation values of quantum observables coincide with the mean values of the corresponding classical observables. It was first noted that a sufficient condition for quantum ergodicity to hold is the ergodicity of the corresponding classical dynamics [2]. More recently the statement was proved in the case of quantum billiards [3,4]. Nowadays quantum ergodicity is one of the few results in the field of quantum chaos for which mathematical proofs exist.

Quantum ergodicity, however, can be reached only in the semiclassical limit ($\hbar \to 0$). In experiments or numerical simulations for chaotic systems we often see nonuniversal quantum features far from ergodicity, even in a high (but finite) energy region. In the present work we show some numerical evidence of such statistical inhomogeneity for chaotic open systems. In Sec. 2 we introduce a model of ballistic transport through a chaotic open billiard and show some evidence of nonergodicity in the classical dynamics. We briefly discuss in Sec. 3 the general wave-statistical description of chaotic open systems by the random matrix theory (RMT). In Sec. 4 we show numerical results of fully quantum calculations of the open billiard model and find that the idealistic description by RMT does not apply in some cases, even in a high energy region. There we focus on the relation between the statistical deviations and wave localization corresponding to classical short paths. Section 5 consists of conclusions.

Figure 1: Typical single trajectory in the open stadium billiard.

2 Classical Nonergodicity and Short-Path Dynamics

We consider a two-dimensional (2D) billiard where the motion of noninteracting particles confined by Dirichlet boundaries is ballistic. The shape of the boundaries directly determines the nonlinearity of particle dynamics inside the billiard. One of the prototypes of conservative chaotic systems is the Bunimovich stadium billiard. In the case of a closed stadium billiard, it is proved that the system has the K-property [5]. In the case of an open stadium billiard coupled to two narrow leads (see Fig. 1), nonintegrability is still expected; e.g., we can observe a fractal structure in the spectrum of dwell times inside the cavity region [6]. However, the Monte Carlo simulation of the classical path-length ($\propto$ dwell time) distribution shows that the distribution function is not a simple exponential decay, which would be a signature of ergodicity, but a highly structured function owing to short-path dynamics [7].

Another example showing nonergodicity of classical dynamics in the case of the open stadium billiard is a transmission-reflection diagram of particles, as shown in Fig. 2. There y is the initial transversal position of each particle incoming from lead 1 (see Fig. 1) at the entrance of the stadium cavity, and d denotes the common width of the attached leads. We apply the semiclassical quantization condition to the momentum of the incoming particles in the lead. The angle of incidence is quantized as $\theta_i = \pm \sin^{-1}[n\pi/(kd)]$ ($n = 1, 2, \ldots$), where we choose the positive and negative $\theta_i$ for the upper and lower direction of particle motion in Fig. 1, respectively, and k is the Fermi wave number of the semiclassical particles. Over the whole range of the diagram we fix the quantized mode number as n = 1. Because of the semiclassical quantization condition, $\theta_i$ monotonically decreases as a function of k. The distributed black and white points correspond to transmission and reflection events, respectively. The relative measure of the black (white) portion for each k is equal to the classical transmission (reflection) probability $T_{cl}(k)$ ($R_{cl}(k)$). In Fig. 2 we see a number of black and white windows in the chaotic sea. Each of them is associated with a family of short paths connecting lead 1 to lead 2 (for the black) or back to lead 1 (for the white). Such paths are stable in the event of transmission and reflection, and are expected to make an important contribution as a family to the corresponding quantum transport.
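The quantization condition for the angle of incidence can be made concrete with a short numerical sketch. The function name `incidence_angle` and the sample values of kd/π below are illustrative assumptions, not taken from the paper; the sketch only evaluates $\sin^{-1}[n\pi/(kd)]$ and shows that the angle shrinks as k grows.

```python
import math

def incidence_angle(k, d, n=1):
    """Magnitude of the quantized angle of incidence, asin(n*pi/(k*d)), in radians.

    Returns None when n*pi > k*d, i.e. when mode n cannot propagate
    in a lead of width d at wave number k.
    """
    s = n * math.pi / (k * d)
    if s > 1.0:
        return None
    return math.asin(s)

if __name__ == "__main__":
    d = 1.0  # lead width in arbitrary units (illustrative)
    for kd_over_pi in (1.5, 2.0, 4.0, 8.0):  # illustrative values only
        k = kd_over_pi * math.pi / d
        theta = incidence_angle(k, d, n=1)
        print(f"kd/pi = {kd_over_pi:4.1f}  theta = {math.degrees(theta):6.2f} deg")
```

The two signs of the angle in the text correspond to the upper and lower directions of motion; the sketch returns only the magnitude.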

3 Universal Description of Wave Function Statistics

We write the scaled local density as $\rho(r) = V|\psi(r)|^2$, where V is the volume of the system in which a single-particle wave function $\psi(r)$ is normalized in terms of the position r. It is well known that the probability distribution of the local densities of a chaotic eigenfunction of a closed system is the Porter-Thomas (P-T) distribution [8]

\[ P(\rho) = \frac{1}{\sqrt{2\pi\rho}} \exp(-\rho/2) \tag{1} \]

described by a Gaussian orthogonal ensemble (GOE) of random matrices when time-reversal symmetry (TRS) is present, i.e., $\psi \in \mathbb{R}$. On the other hand, the distribution is an exponential [8,9]

\[ P(\rho) = \exp(-\rho) \tag{2} \]

described by a Gaussian unitary ensemble (GUE) of random matrices when TRS is broken in the closed system, i.e., $\psi \in \mathbb{C}$. The space-averaged spatial correlation of the local densities of a 2D chaotic wave function with wave number k is also given by [9,10,11]

\[ P_2(kr) = \langle \rho(r_1)\rho(r_2) \rangle = 1 + c J_0^2(kr) \tag{3} \]

where $r = |r_1 - r_2|$ and $J_0(x)$ is the Bessel function of zeroth order. The parameter c is chosen as c = 2 for GOE (TRS) and c = 1 for GUE (broken TRS) eigenfunctions.
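The two limiting distributions can be illustrated by sampling. The sketch below assumes the usual random-wave picture, which is an assumption of this example rather than a statement from the paper: a real Gaussian amplitude gives the Porter-Thomas law, and a complex Gaussian amplitude with equal real and imaginary variances gives the exponential law, both with mean density 1. The helper names are hypothetical.

```python
import math
import random

def sample_densities(n, time_reversal, seed=0):
    """Draw n scaled local densities rho = |psi|^2 with <rho> = 1.

    time_reversal=True : psi real Gaussian    -> Porter-Thomas statistics
    time_reversal=False: psi complex Gaussian -> exponential statistics
    """
    rng = random.Random(seed)
    if time_reversal:
        return [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    s = math.sqrt(0.5)  # Re(psi) and Im(psi) each carry half of the intensity
    return [rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2 for _ in range(n)]

def fraction_below(samples, x):
    """Empirical probability that rho < x."""
    return sum(1 for r in samples if r < x) / len(samples)

if __name__ == "__main__":
    goe = sample_densities(200_000, time_reversal=True)
    gue = sample_densities(200_000, time_reversal=False)
    # Exact values: erf(1/sqrt(2)) for Porter-Thomas, 1 - exp(-1) for exponential.
    print("GOE  P(rho<1):", fraction_below(goe, 1.0), "exact", math.erf(1 / math.sqrt(2)))
    print("GUE  P(rho<1):", fraction_below(gue, 1.0), "exact", 1 - math.exp(-1))
```

Comparing cumulative probabilities avoids binning the integrable singularity of the Porter-Thomas density at small densities.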

Investigations of the continuous transition of the wave function statistics between GOE and GUE symmetries have also been worked out. Introducing a transition parameter $b \in (1, 2]$, we have the probability distribution [12,13,14,15,16]

\[ P_b(\rho) = \frac{b}{2\sqrt{b-1}} \exp\left( -\frac{b^2 \rho}{4(b-1)} \right) I_0\left( \frac{b(2-b)\rho}{4(b-1)} \right) \tag{4} \]

where $I_0(x)$ is the modified Bessel function of zeroth order, and the spatial correlation [17]

\[ P_2^b(kr) = 1 + \left[ 1 + \left( \frac{2-b}{b} \right)^2 \right] J_0^2(kr) . \tag{5} \]

For $b \to 1$ and $b \to 2$, both equations tend to the GOE and GUE cases, respectively.

On the other hand, systematic statistical investigations of scattering wave functions in open chaotic systems have been carried out only quite recently [16].

It is essential that the space reciprocity in conservative closed systems, which means that each plane wave pairs up in phase with a counterpart of the same amplitude running in the opposite direction, is lost in open systems. As a result, the wave function statistics in a chaotic open system is expected to be the GUE if the system is completely open [16].

4 Numerical Analyses and Discussions

We show in this section some numerical evidence of wave-statistical inhomogeneity for ballistic transport through the 2D open stadium billiard. Assuming steady current flow through the leads, we solve the time-independent Schrödinger equation for a single particle under Dirichlet boundary conditions based on the plane-wave-expansion method [6], giving reflection and transmission amplitudes as well as local wave functions for each energy. In the calculation of the statistics, a sample space A (= V) is taken in the cavity region corresponding to the closed stadium, and more than one million sample points are used to obtain reliable statistics. We show the numerical results for the wave probability density in Fig. 3, and for the probability distribution $P(\rho)$ and spatial correlation $P_2(kr)$ in Fig. 4.

In Fig. 3(a) we find the so-called bouncing-ball mode in the central region of the stadium cavity, where we see a number of vertical nodes associated with marginally stable classical orbits bouncing vertically between the straight edges. Bouncing-ball states are nonstatistical states, since the amplitude of $\psi$ is strongly localized in the middle region of the stadium (the space reciprocity holds locally) and is very small in the endcaps (the space reciprocity does not necessarily hold). As a result, both $P(\rho)$ and $P_2(kr)$ for such states do not follow their universal expressions (see Fig. 4(a)). In addition to the bouncing-ball mode, we also see another wave localization strongly coupled to both the initial and the (open) transmission channels, corresponding to the direct transmission path (see the white line depicted in Fig. 3(a)). Along such a localization a plane wave may propagate with nonzero probability current, partially contributing to the anomaly of the wave statistics [16].

In the higher energy region, where the ratio of the system size $\sqrt{A}$ to the wavelength $\lambda$ is $\sqrt{A}/\lambda \approx 25$ (i.e., in the case of Fig. 3(b)), we may expect the GUE statistics. However, we see in Fig. 4(b) that both $P(\rho)$ and $P_2(kr)$ follow closely the GOE.

The reason is a localization effect reminiscent of the phenomenon known as a scar [18], describing an anomalous localization of quantum probability density along unstable periodic orbits in classically chaotic systems. In order to characterize a localization, we usually introduce a moment of the eigenfunction local density $|\psi(r)|^2$, defined by $I_q = V^{-1} \int_V |\psi(r)|^{2q} dr$, with V being the system volume [19,20]. The second moment $I_2$ is known as the inverse participation ratio (IPR). Assuming the normalization condition $\langle |\psi|^2 \rangle (= I_1) = 1$, we have $I_2 = 1$ for completely ergodic (random and uniform) eigenfunctions, while $I_2 = \infty$ for completely localized eigenfunctions like $|\psi(r)|^2 \sim V\delta(r)$. The localization effect on wave-function density statistics has been examined analytically in relation to $I_q$ for closed systems [21,22,23], and also numerically using a time-dependent approach, i.e., in terms of recurrences of a test Gaussian wave packet, for closed and weakly (imperfectly) open systems [24,25,26]. In the latter work it was shown that the tail of the wave-function intensity distribution in phase space is dominated by scarring, departing from the RMT predictions.

In contrast, the most prominent effect of the localization of wave probability density in open billiards is the local space reciprocity holding along the classical orbits corresponding to the localization not strongly coupled to any (open) transmission channel (see, e.g., the white lines depicted in Fig. 3(b)). Along such orbits there is no net current, owing to the coherent overlap of time-reversed waves, so that both $P(\rho)$ and $P_2(kr)$ are close to the GOE predictions [16]. For quantitative discussion, the value of the GOE-GUE transition parameter b is calculated numerically from the wave function $\psi(r) = u(r) + iv(r)$ by the formula [16]

\[ b = \frac{2\langle |\psi|^2 \rangle}{\langle |\psi|^2 \rangle + \sqrt{\langle |\psi|^2 \rangle^2 - 4\left( \langle u^2 \rangle \langle v^2 \rangle - \langle uv \rangle^2 \right)}} \tag{6} \]

where $\langle \cdots \rangle$ denotes a space average on A. The obtained value for Fig. 3(b) is b = 1.03, which corresponds to a case very close to the GOE.
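As a rough illustration of how Eq. (6) could be evaluated from sampled data, the sketch below takes lists of the real and imaginary parts of the wave function at the sample points and replaces the space average by a plain arithmetic mean. The function name `transition_parameter` and the tiny test inputs are hypothetical and not part of the original calculation.

```python
import math

def transition_parameter(u, v):
    """Estimate the GOE-GUE transition parameter b of Eq. (6).

    u, v: equal-length sequences holding Re(psi) and Im(psi) at the sample points.
    Space averages are approximated by arithmetic means over the samples.
    """
    n = len(u)
    uu = sum(x * x for x in u) / n              # <u^2>
    vv = sum(y * y for y in v) / n              # <v^2>
    uv = sum(x * y for x, y in zip(u, v)) / n   # <uv>
    total = uu + vv                             # <|psi|^2>
    disc = total * total - 4.0 * (uu * vv - uv * uv)
    # Guard against tiny negative values caused by rounding.
    return 2.0 * total / (total + math.sqrt(max(disc, 0.0)))

if __name__ == "__main__":
    # A purely real wave function gives b = 1 (GOE limit).
    print(transition_parameter([1.0, -2.0, 0.5], [0.0, 0.0, 0.0]))
    # Uncorrelated u and v of equal strength give b = 2 (GUE limit).
    print(transition_parameter([1.0, 0.0], [0.0, 1.0]))
```

A wave function that is a real function times a constant phase (u proportional to v) also yields b = 1, as expected for a state that is time-reversal symmetric up to a global phase.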

In the case of open systems, the IPR may again play an important role as a measure of localization [27]. In the definition $I_2 = V^{-1} \int_V |\psi(r)|^4 dr$, $|\psi(r)|^2 (= \rho(r))$ is the scattering-wave local density and V is the area (A) of the stadium cavity in our case. For chaotic wave functions normalized as $\langle |\psi|^2 \rangle = 1$, we obtain from Eq. (4) the IPR $I_2^b$ for the transition between the GOE and GUE statistics as

\[ I_2^b = \int_0^\infty \rho^2 P_b(\rho)\, d\rho = \frac{64 (b-1)^{5/2}}{\pi b^5} \int_0^\pi \frac{d\theta}{\left[ 1 + (2/b - 1)\cos\theta \right]^3} = \frac{3b^2 - 4b + 4}{b^2} . \tag{7} \]

In the GOE and GUE limits, $I_2^{b=1} = 3$ and $I_2^{b=2} = 2$, respectively. For Fig. 3(b), the numerically obtained IPR is $I_2 = 2.89$, which is exactly equal to $I_2^{b=1.03}$. This means that the enhancement of the IPR by the amplitude of the localized wave is not strong in the case of Fig. 3(b), and that the effect of the localization appears mainly in the value of b, which also determines the IPR.
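The closed-form IPR quoted above is easy to check numerically. The sketch below evaluates the final expression of Eq. (7) and compares it with a Monte Carlo estimate. The sampling model is an assumption of this example and not a statement from the paper: the wave function is taken as u + iv with independent Gaussian u and v of variances 1/b and 1 - 1/b, a choice that reproduces b through Eq. (6). Function names are hypothetical.

```python
import math
import random

def ipr_closed_form(b):
    """I_2^b = (3 b^2 - 4 b + 4) / b^2, the last expression in Eq. (7)."""
    return (3.0 * b * b - 4.0 * b + 4.0) / (b * b)

def ipr_monte_carlo(b, n=200_000, seed=0):
    """Estimate <rho^2> for psi = u + i v with independent Gaussian u and v.

    Assumption of this sketch: Var(u) = 1/b and Var(v) = 1 - 1/b,
    so that <|psi|^2> = 1.
    """
    rng = random.Random(seed)
    su = math.sqrt(1.0 / b)
    sv = math.sqrt(1.0 - 1.0 / b)
    total = 0.0
    for _ in range(n):
        rho = rng.gauss(0.0, su) ** 2 + rng.gauss(0.0, sv) ** 2
        total += rho * rho
    return total / n

if __name__ == "__main__":
    for b in (1.0, 1.03, 1.5, 2.0):
        print(b, ipr_closed_form(b), ipr_monte_carlo(b))
```

With b = 1.03 the closed form rounds to 2.89, the value reported for Fig. 3(b).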

From our investigations together with more extended studies [16], the complete GUE statistics is conjectured to be obtained only in the high-energy (semiclassical) limit. Until the energy reaches such a limit, the localization of wave functions within chaotic open systems strongly affects the wave-statistical properties, leading to deviations from the RMT predictions based on the ergodicity or uniform randomness of wave functions.

Finally, we note that the classical-path families associated with the localization found in Fig. 3(a) and (b) can be identified as the windows indicated with $\alpha$ and $\beta$ in Fig. 2, respectively. (In Fig. 3(b), only the path family for the localization touching the entrance can be identified in Fig. 2.) We notice that the angle of incidence $\theta_i$ for a given k is irrelevant to that of the path corresponding to the observed localizations directly connected to the entrance.

5 Conclusions

In conclusion, our numerical analyses show that chaotic-scattering wave functions in open systems exhibit remarkably different features from the idealistic GUE predictions. The statistical deviations from the GUE can be understood in terms of wave localization corresponding to classical short-path dynamics.

Acknowledgments

The author is obliged to K.-F. Berggren, A. I. Saichev and A. F. Sadreev for fruitful collaboration leading to the work in Sec. 4. Support from the Swedish Board for Industrial and Technological Development (NUTEK) under Project No. P12144-1 is also acknowledged. Part of the calculations of the wave function statistics were carried out using resources at the National Supercomputer Center (NSC) at Linköping.

References

1. H. Narnhofer (to be published).
2. A. I. Shnirelman, Usp. Mat. Nauk 29, 181 (1974).
3. P. Gerard and E. Leichtnam, Duke Math. J. 71, 559 (1993).
4. S. Zelditch and M. Zworski, Comm. Math. Phys. 175, 673 (1996).
5. L. A. Bunimovich, Funct. Anal. Appl. 8, 254 (1974).
6. K. Nakamura and H. Ishio, J. Phys. Soc. Jpn. 61, 3939 (1992).
7. H. Ishio and J. Burgdörfer, Phys. Rev. B 51, 2013 (1995).
8. C. Porter and R. Thomas, Phys. Rev. 104, 483 (1956).
9. V. N. Prigodin, Phys. Rev. Lett. 74, 1566 (1995).
10. V. N. Prigodin et al., Phys. Rev. Lett. 72, 546 (1994).
11. M. V. Berry, in Chaos and Quantum Physics, ed. M. J. Giannoni, A. Voros and J. Zinn-Justin (Elsevier, Amsterdam, 1990), p. 251.
12. K. Zyczkowski and G. Lenz, Z. Phys. B 82, 299 (1991).
13. G. Lenz and K. Zyczkowski, J. Phys. A 25, 5539 (1992).
14. E. Kanzieper and V. Freilikher, Phys. Rev. B 54, 8737 (1996).
15. R. Pnini and B. Shapiro, Phys. Rev. E 54, R1032 (1996).
16. H. Ishio et al. (unpublished).
17. S.-H. Chung et al., Phys. Rev. Lett. 85, 2482 (2000).
18. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
19. F. Wegner, Z. Phys. B 36, 209 (1980).
20. C. Castellani and L. Peliti, J. Phys. A 19, L429 (1986).
21. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 51, 13403 (1995).
22. K. Müller et al., Phys. Rev. Lett. 78, 215 (1997).
23. V. N. Prigodin and B. L. Altshuler, Phys. Rev. Lett. 80, 1944 (1998).
24. L. Kaplan, Nonlinearity 12, R1 (1999).
25. L. Kaplan, Phys. Rev. Lett. 80, 2582 (1998).
26. L. Kaplan and E. J. Heller, Ann. Phys. 264, 171 (1998).
27. H. Ishio and L. Kaplan (private communication).

Figure 2: Transmission-reflection diagram of classical particles as a function of the initial position y at the entrance of the stadium cavity (horizontal axes from -d/2 to d/2, one panel each for $y(-\theta_i)$ and $y(+\theta_i)$) and the Fermi wave number k corresponding to the angle of incidence $\theta_i$ calculated by the semiclassical quantization condition (n = 1 over the whole range) in the lead. Black and white points correspond to transmission and reflection events, respectively. Two families of short paths are identified with an arrow beside the diagram (see the text).

Figure 3: Contour plot of the wave probability density in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ (n = 1) and (b) $kd/\pi = 4.6553$ (n = 1). The initial wave comes through the left lead into the cavity. The transmission probability is (a) $T_{qm} = 0.55$ and (b) $T_{qm} = 0.36$. The contours show about 97.5% of the largest wave probability density. Thin white lines show some of the short classical orbits corresponding to the localization of the wave probability density. Taken from the work by the authors in Ref. [16] (unpublished).

Figure 4: Probability distribution $P(\rho)$ (steps) and spatial correlation $P_2(kr)$ as a function of kr (thick line in the inset) of local densities in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ (n = 1) and (b) $kd/\pi = 4.6553$ (n = 1). Two thin lines show the GOE (i.e., Eq. (1)) and GUE (i.e., Eq. (2)) cases (Eq. (3) for the inset). Taken from the work by the authors in Ref. [16] (unpublished).

ORIGIN OF QUANTUM PROBABILITIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden
Email: Andrei.Khrennikov@msi.vxu.se

We demonstrate that the origin of the quantum probabilistic rule (which differs from the conventional Bayes formula by the presence of a $\cos\theta$-factor) might be explained by perturbation effects of preparation and measurement procedures. The main consequence of our investigation is that interference could be produced by purely corpuscular objects. In particular, the quantum rule for probabilities (with nontrivial $\cos\theta$-factor) could be simulated for macroscopic physical systems via preparation procedures producing statistical deviations of a special form. We discuss preparation and measurement procedures which may produce probabilistic rules which are neither classical nor quantum, in particular hyperbolic quantum theory.

1 Introduction

It is well known that the conventional probabilistic rule, the formula of total probability (which is based on Bayes' formula for conditional probabilities), cannot be applied to quantum experiments; see, for example, [1]-[12] for extended discussions. It seems that the special features of quantum probabilistic behaviour are just consequences of violations of the conventional probabilistic rule.

In this paper we restrict our investigations to the two-dimensional case. Here the formula of total probability has the form (i = 1, 2):

\[ p(A = a_i) = p(B = b_1)\, p(A = a_i | B = b_1) + p(B = b_2)\, p(A = a_i | B = b_2), \tag{1} \]

where A and B are physical variables which take, respectively, the values $a_1, a_2$ and $b_1, b_2$. The symbols $p(A = a_i | B = b_j)$ denote conditional probabilities. It is one of the most important rules used in applied probability theory. In fact, it is the prediction rule: if we know the probabilities for B and the conditional probabilities, then we can find the probabilities for A. However, this rule cannot be used for the prediction of probabilities observed in experiments with elementary particles. The violation of the conventional probabilistic rule, and the necessity of using a new prediction rule, was found in interference experiments with elementary particles. This astonishing fact was one of the main reasons to build the quantum formalism on the basis of wave-particle duality.

Let $\phi$ be a quantum state. Let $\{|b_i\rangle\}_{i=1}^{2}$ be the basis consisting of eigenvectors of the operator B corresponding to the physical observable B. The quantum probabilistic rule has the form (i = 1, 2):

\[ p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\, \cos\theta, \tag{2} \]

where $p_i = p_\phi(A = a_i)$, $q_j = p_\phi(B = b_j)$, $p_{ij} = p_{b_i}(A = a_j)$, i, j = 1, 2. Here probabilities have indexes corresponding to quantum states.

By denoting $P = p_i$ and $P_1 = q_1 p_{1i}$, $P_2 = q_2 p_{2i}$, we get the standard quantum probabilistic rule for interference of alternatives:

\[ P = P_1 + P_2 + 2\sqrt{P_1 P_2}\, \cos\theta . \]

There is a large diversity of opinions on the origin of violations of the conventional probabilistic rule (1) in quantum mechanics; see [1]-[12]. The common opinion is that violations of (1) are induced by special properties of quantum systems (for example, Dirac, Feynman, Schrödinger). Thus the quantum probabilistic rule must be considered as a peculiarity of nature.
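To see how far the interference rule can depart from the formula of total probability, the two can be evaluated side by side. The sketch below is only an illustration; the function names and the input numbers are hypothetical. It computes the conventional prediction $P_1 + P_2$ and the interference prediction $P_1 + P_2 + 2\sqrt{P_1 P_2}\cos\theta$ for a few phases.

```python
import math

def total_probability(q1, p1i, q2, p2i):
    """Conventional rule: P = q1*p1i + q2*p2i."""
    return q1 * p1i + q2 * p2i

def interference_rule(q1, p1i, q2, p2i, theta):
    """Rule with interference term: P = P1 + P2 + 2*sqrt(P1*P2)*cos(theta)."""
    P1, P2 = q1 * p1i, q2 * p2i
    return P1 + P2 + 2.0 * math.sqrt(P1 * P2) * math.cos(theta)

if __name__ == "__main__":
    q1, q2 = 0.5, 0.5      # hypothetical probabilities of B = b1, b2
    p1i, p2i = 0.5, 0.5    # hypothetical transition probabilities to A = a_i
    print("conventional:", total_probability(q1, p1i, q2, p2i))
    for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
        print(f"theta = {theta:.3f}:", interference_rule(q1, p1i, q2, p2i, theta))
```

At a phase of π/2 the cosine vanishes and the two rules coincide; for other phases the prediction ranges between $(\sqrt{P_1} - \sqrt{P_2})^2$ and $(\sqrt{P_1} + \sqrt{P_2})^2$.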

An interesting investigation of this problem is contained in the paper of J. Summhammer [12]. In contrast to Dirac, Feynman and Schrödinger, he claimed that the quantum probabilistic rule (2) is not a peculiarity of nature, but just a consequence of one special method of the probabilistic description of nature, the so-called method of maximum predictive power.

In this paper we provide a probabilistic analysis of the quantum rule (2). In our analysis 'probability' has the meaning of frequency probability, namely the limit of frequencies in a long sequence of trials (or for a large statistical ensemble). Hence, in fact, we follow R. von Mises' approach to probability [13]. It seems that it would be impossible to find the roots of the quantum rule (2) in the measure-theoretical framework of A. N. Kolmogorov, 1933 [14]. In the measure-theoretical framework probabilities are defined as sets of real numbers having some special mathematical properties. The conventional rule (1) is merely a consequence of the definition of conditional probabilities. In the Kolmogorov framework, to analyse the transition from (1) to (2) is to analyse the transition from one definition to another. In the frequency framework we can analyse the behaviour of trials which induce one or another property of probability. Our analysis shows that the quantum probabilistic rule (2) can, in principle, be a consequence of perturbation effects of preparation and measurement procedures. Thus trigonometric fluctuations of quantum probabilities can be explained without using wave arguments.

In fact, our investigation is strongly based on Dirac's famous analysis of the foundations of quantum mechanics, see [1]. In particular, P. Dirac pointed out that one of the main differences between the classical and quantum theories is that in the quantum case perturbation effects of preparation and measurement procedures play the crucial role. However, P. Dirac could not explain the origin of interference for quantum particles in a purely corpuscular model. He had to appeal to wave arguments: 'If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other' [1].

In this paper we discuss perturbation effects of preparation and measurement procedures. We remark that we do not follow W. Heisenberg [15]: we do not study perturbation effects for individual measurements. We discuss statistical (ensemble) deviations induced by perturbations.^a

We underline again that our probabilistic analysis was possible only due to the rejection of Kolmogorov's measure-theoretical model of probability theory. Of course, each particular experiment (measurement) can be described by Kolmogorov's model; there are no 'quantum probabilities'. Moreover, it seems that there is nothing more than the binomial probability distribution (see the paper of J. Summhammer in the present volume). The most important feature of QUANTUM STATISTICS is not related to a single experiment. We have to consider at least three different experiments (preparation procedures) to observe quantum probabilistic behaviour, namely interference of alternatives. Kolmogorov's model is not adequate to such a situation. In this model all random variables are defined on the same probability space. It is impossible to do this in the case of a few experiments that produce interference of alternatives (at least the author does not see any way to do this). In our analysis probability is a classical relative frequency, but it is not Kolmogorovian (compare with Accardi [3]).

An unexpected consequence of our analysis is that quantum probability rule (2) is just one of possible perturbations (by ensemble fluctuations) of conventional probability rule (1) In principle there might exist experiments which would produce perturbations of conventional probabilistic rule (1) which differ from quantum probabilistic rule (2)

Moreover, if we use the same normalization of the interference term, namely $2\sqrt{P_1 P_2}$, then we can classify all possible probabilistic rules that we have in nature:

1) trigonometric; 2) hyperbolic; 3) hyper-trigonometric.

The hyperbolic probabilistic transformation has a linear space representation that is similar to the standard quantum formalism in the complex Hilbert space. Instead of complex numbers we use the so-called hyperbolic numbers, see, for example, [18], p. 21. The development of hyperbolic quantum mechanics can be interesting for comparative analysis with standard quantum mechanics. In particular, we clarify the role of complex numbers in quantum theory. Complex (as well as hyperbolic) numbers were used to linearize the nonlinear probabilistic rule (which in general could not be linearized over real numbers). Another interesting feature of hyperbolic quantum mechanics is the violation of the principle of superposition. Here we have only some restricted variant of this principle.

^a Such an approach implies the statistical viewpoint on the Heisenberg uncertainty relation: the statistical dispersion principle, see L. Ballentine [16], [17] for the details.

2 Quantum formalism and perturbation effects

1 Frequency probability theory The frequency definition of probability is more or less standard in quantum theory especially in the approach based on preparation and measurement procedures [5] [10] [16] [11]

Let us consider a sequence of physical systems $\pi = (\pi_1, \pi_2, \ldots, \pi_N, \ldots)$. Suppose that the elements of $\pi$ have some property, for example position or spin, and that this property can be described by natural numbers: $L = \{1, 2, \ldots, m\}$, the set of labels. Thus for each $\pi_j \in \pi$ we have a number $x_j \in L$. So $\pi$ induces a sequence

\[ x = (x_1, x_2, \ldots, x_N, \ldots), \quad x_j \in L . \tag{3} \]

For each fixed $\alpha \in L$ we have the relative frequency $\nu_N(\alpha) = n_N(\alpha)/N$ of the appearance of $\alpha$ in $(x_1, x_2, \ldots, x_N)$. Here $n_N(\alpha)$ is the number of elements in $(x_1, x_2, \ldots, x_N)$ with $x_j = \alpha$. R. von Mises [13] said that x satisfies the principle of the statistical stabilization of relative frequencies if, for each fixed $\alpha \in L$, there exists the limit

\[ p(\alpha) = \lim_{N \to \infty} \nu_N(\alpha) . \tag{4} \]

This limit is said to be the probability of $\alpha$. Thus the probability is defined as the limit of relative frequencies. In fact, this definition of probability is used in all experimental investigations. In Kolmogorov's approach [14], probability is defined as a measure. The principle of statistical stabilization is obtained as a mathematical theorem, the law of large numbers.
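The stabilization of relative frequencies can be watched directly in a simulation. The sketch below is illustrative only: the label sequence is generated by a pseudo-random generator with hypothetical weights, which is an assumption of this example and not a von Mises collective in the strict sense. It computes the running relative frequency of a chosen label for every prefix of the sequence.

```python
import random

def relative_frequencies(xs, alpha):
    """Return [nu_1(alpha), nu_2(alpha), ...] for the label sequence xs."""
    freqs = []
    count = 0
    for N, x in enumerate(xs, start=1):
        if x == alpha:
            count += 1          # n_N(alpha): occurrences of alpha among x_1..x_N
        freqs.append(count / N)
    return freqs

def simulate_labels(n, weights, seed=0):
    """Draw n labels from L = {1, ..., m} with the given (hypothetical) weights."""
    rng = random.Random(seed)
    labels = list(range(1, len(weights) + 1))
    return rng.choices(labels, weights=weights, k=n)

if __name__ == "__main__":
    xs = simulate_labels(100_000, weights=[0.3, 0.7])
    nu = relative_frequencies(xs, alpha=1)
    for N in (10, 100, 1_000, 10_000, 100_000):
        print(N, nu[N - 1])
```

For small N the frequency fluctuates noticeably; for large N it settles near the weight assigned to the label, which is the behaviour the limit in the text formalizes.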

2 Preparation and measurement procedures and quantum formalism. We consider a statistical ensemble S of quantum particles described by a quantum state $\phi$. This ensemble is produced by some preparation procedure $\mathcal{E}$, see, for example, [4], [5], [16], [10], [11] for details; see also P. Dirac [1]: 'In practice the conditions could be imposed by a suitable preparation of the system, consisting perhaps in passing it through various kinds of sorting apparatus, such as slits and polarimeters, the system being left undisturbed after the preparation.'

There are two discrete physical observables $B = b_1, b_2$ and $A = a_1, a_2$.

The total number of particles in S is equal to N. Suppose that there are $n_i^b$, i = 1, 2, particles in S with $B = b_i$ and $n_i^a$, i = 1, 2, particles in S with $A = a_i$.

Suppose that, among those particles with $B = b_i$, there are $n_{ij}$, i, j = 1, 2, particles with $A = a_j$ (see (R) below to specify the meaning of 'with'). So

\[ n_i^b = n_{i1} + n_{i2}, \quad n_j^a = n_{1j} + n_{2j}, \quad i, j = 1, 2 . \]

(R) We follow Einstein and use the objective realist model, in which both B and A are objective properties of a quantum particle, see [5], [4], [10] for the details. In particular, here each elementary particle has simultaneously defined position and momentum. In such a model we can consider in the ensemble S the sub-ensembles $S_j(B)$ and $S_j(A)$, j = 1, 2, of particles having the properties $B = b_j$ and $A = a_j$, respectively. Set

\[ S_{ij}(A, B) = S_i(B) \cap S_j(A) . \]

Then $n_{ij}$ is the number of elements in the ensemble $S_{ij}(A, B)$. We remark that the existence of the objective property ($B = b_i$ and $A = a_j$) need not imply the possibility of measuring this property. For example, such a measurement is impossible in the case of incompatible observables. In general, the property ($B = b_i$ and $A = a_j$) is a kind of hidden objective property.^b

Physical experience says that the following frequency probabilities are well defined for all observables B, A:

\[ q_i = p_\phi(B = b_i) = \lim_{N \to \infty} q_i^{(N)}, \quad q_i^{(N)} = \frac{n_i^b}{N}; \tag{5} \]

\[ p_i = p_\phi(A = a_i) = \lim_{N \to \infty} p_i^{(N)}, \quad p_i^{(N)} = \frac{n_i^a}{N} . \tag{6} \]
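The bookkeeping of the counts in the ensemble S can be sketched as follows. The table of joint counts is hypothetical and, in the objective realist model (R), would be hidden rather than measurable; the function name `ensemble_frequencies` is likewise an assumption of this example. The sketch derives the marginal counts and the relative frequencies of (5) and (6), and checks that inside the single ensemble S the conventional rule holds identically when the conditional frequencies are taken from S itself.

```python
def ensemble_frequencies(n):
    """n[i][j]: number of particles in S with B = b_(i+1) and A = a_(j+1).

    Returns (N, n_b, n_a, q, p) where n_b, n_a are the marginal counts and
    q, p are the finite-N relative frequencies for B and A.
    """
    N = sum(sum(row) for row in n)
    n_b = [sum(row) for row in n]                  # counts with B = b_i
    n_a = [n[0][j] + n[1][j] for j in range(2)]    # counts with A = a_j
    q = [x / N for x in n_b]
    p = [x / N for x in n_a]
    return N, n_b, n_a, q, p

if __name__ == "__main__":
    n = [[30, 20], [10, 40]]   # hypothetical joint counts n_ij
    N, n_b, n_a, q, p = ensemble_frequencies(n)
    print("N =", N, "n_b =", n_b, "n_a =", n_a)
    print("q =", q, "p =", p)
    # Within S: p_j = sum_i q_i * (n_ij / n_b_i), an exact counting identity.
    for j in range(2):
        rhs = sum(q[i] * n[i][j] / n_b[i] for i in range(2))
        print("j =", j + 1, "p_j =", p[j], "sum_i q_i n_ij/n_b_i =", rhs)
```

The identity is a matter of counting alone, so any departure from the conventional rule has to come from conditional frequencies obtained in differently prepared ensembles.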

Let the quantum states $|b_i\rangle$ be eigenstates of the operator B. Let us consider statistical ensembles $T_i$, i = 1, 2, of quantum particles described by the quantum states $|b_i\rangle$. These ensembles are produced by some preparation procedures $\mathcal{E}_i$. For instance, we can suppose that particles produced by the preparation procedure $\mathcal{E}$ (for the quantum state $\phi$) pass through additional filters $F_i$, i = 1, 2. In the quantum formalism we have

\[ \phi = \sqrt{q_1}\, |b_1\rangle + \sqrt{q_2}\, e^{i\theta} |b_2\rangle . \tag{7} \]

b Attempts to use objective realism in quantum theory were strongly criticized, especially in connection with the EPR-Bell considerations. Moreover, many authors (for example, P. Dirac [1] and R. Feynman [2]) claimed that the contradiction between objective realism and quantum theory can be observed just by comparing the conventional and quantum probabilistic rules (see d'Espagnat [4] for an extended discussion). However, in this paper we demonstrate that there is no direct contradiction between objective realism and the quantum probabilistic rule.


In the objective realist model (R) this representation may induce the illusion that the ensembles $T_i$, $i = 1, 2$, for the states $|b_i\rangle$ must be identified with the sub-ensembles $S_i(B)$ of the ensemble $S$ for the state $\varphi$. However, there are no physical reasons for such an identification.

The additional filter $F_i$ ($i = 1, 2$) changes the $A$-property of quantum particles. In general the probability distribution of the property $A$ for the ensemble $S_i(B) = \{\pi \in S : B(\pi) = b_i\}$ differs from the corresponding probability distribution for the ensemble $T_i$.

Suppose that there are $m_{ij}$ particles in the ensemble $T_i$ with $A = a_j$ ($j = 1, 2$) (see footnote c).

The following frequency probabilities are well defined:

$$p_{ij} = p_{|b_i\rangle}(A = a_j) = \lim_{N \to \infty} p_{ij}^{(N)}, \qquad p_{ij}^{(N)} = \frac{m_{ij}}{n_i^b}$$

(by measuring values of the variable $A$ for the statistical ensemble $T_i$ we always observe the stabilization of the relative frequencies $p_{ij}^{(N)}$ to some constant probability $p_{ij}$).

Here it is assumed that the ensemble $T_i$ consists of $n_i^b$ particles, $i = 1, 2$. This assumption is natural if we consider the preparation procedure $\mathcal{E}_i = F_i$, a filter with respect to the value $B = b_i$. Only particles with $B = b_i$ pass this filter. Hence the number of elements in the ensemble $T_i$ (represented by the state $|b_i\rangle$) coincides with the number of elements with $B = b_i$ in the ensemble $S$ (represented by the state $\varphi$).

It is also assumed that $n_i^b = n_i^b(N) \to \infty$, $N \to \infty$. In fact, the latter assumption holds true if both probabilities $q_i$, $i = 1, 2$, are nonzero.

We remark that the probabilities $p_{ij} = p_{|b_i\rangle}(A = a_j)$ cannot be (in general) identified with the conditional probabilities $p_\varphi(A = a_j \mid B = b_i)$. As we have remarked, these probabilities are related to statistical ensembles prepared by different preparation procedures, namely by $\mathcal{E}_i$, $i = 1, 2$, and $\mathcal{E}$. The probabilities $p_{|b_i\rangle}(A = a_j)$ can be found by measuring the $A$-variable for particles belonging to the ensemble $T_i$. The probabilities $p_\varphi(A = a_j \mid B = b_i)$ in general could not be found: these are hidden probabilities with respect to the ensemble $S$.

3 Derivation of quantum probabilistic rule

Here we present the standard Hilbert space calculations.

c We can use the objective realist model (R). Then $m_{ij}$ is just the number of particles in the ensemble $T_i$ having the objective property $A = a_j$. We can also use the contextualist model (C). Then $m_{ij}$ is the number of particles in the ensemble $T_i$ which in the process of an interaction with a measurement device for the physical observable $A$ would give the result $A = a_j$.


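$$\varphi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle.$$

Let $\{|a_i\rangle\}$ be the orthonormal basis consisting of eigenvectors of the operator $A$. We can restrict our considerations to the case

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \qquad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle + e^{i\gamma_2}\sqrt{p_{22}}\,|a_2\rangle. \tag{8}$$

We note that $p_{11} + p_{12} = 1$, $p_{21} + p_{22} = 1$. The first sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_1$; the second sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_2$.

As $\langle b_1 | b_2 \rangle = 0$, we obtain

$$\sqrt{p_{11} p_{21}} + e^{i(\gamma_1 - \gamma_2)}\sqrt{p_{12} p_{22}} = 0.$$

We suppose that all probabilities $p_{ij} > 0$. This is equivalent to saying that $A$ and $B$ are incompatible observables, or that the operators $A$ and $B$ do not commute.

Hence $\sin(\gamma_1 - \gamma_2) = 0$ and $\gamma_2 = \gamma_1 + \pi k$. We also have

$$\sqrt{p_{11} p_{21}} + \cos(\gamma_1 - \gamma_2)\sqrt{p_{12} p_{22}} = 0.$$

This implies that $k = 2l + 1$ and $\sqrt{p_{11} p_{21}} = \sqrt{p_{12} p_{22}}$. As $p_{12} = 1 - p_{11}$ and $p_{21} = 1 - p_{22}$, we obtain that

$$p_{11} = p_{22}, \qquad p_{12} = p_{21}. \tag{9}$$

These equalities are equivalent to the condition $p_{11} + p_{21} = 1$, $p_{12} + p_{22} = 1$. Hence the matrix of probabilities $(p_{ij})$ is a double stochastic matrix, see, for example, [5] for general considerations. Thus, in fact,

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \qquad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle - e^{i\gamma_1}\sqrt{p_{22}}\,|a_2\rangle. \tag{10}$$

So $\varphi = d_1 |a_1\rangle + d_2 |a_2\rangle$, where

$$d_1 = \sqrt{q_1 p_{11}} + e^{i\theta}\sqrt{q_2 p_{21}}, \qquad d_2 = e^{i\gamma_1}\sqrt{q_1 p_{12}} - e^{i(\gamma_1 + \theta)}\sqrt{q_2 p_{22}}.$$

Thus

$$p_1 = p_\varphi(A = a_1) = |d_1|^2 = q_1 p_{11} + q_2 p_{21} + 2\sqrt{q_1 p_{11} q_2 p_{21}}\,\cos\theta, \tag{11}$$

$$p_2 = p_\varphi(A = a_2) = |d_2|^2 = q_1 p_{12} + q_2 p_{22} - 2\sqrt{q_1 p_{12} q_2 p_{22}}\,\cos\theta. \tag{12}$$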
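The Hilbert space calculation above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `quantum_rule` and the sample parameter values are illustrative assumptions. It builds the amplitudes $d_1$, $d_2$ from (7) and (10) for a double stochastic matrix and compares $|d_i|^2$ with the right-hand sides of (11) and (12).

```python
import cmath
import math


def quantum_rule(q1, p11, theta, gamma1=0.3):
    """Compare Born probabilities |d_i|^2 with formulas (11) and (12).

    q1, p11, theta and gamma1 are free illustrative inputs (assumptions);
    the remaining entries follow from normalization and from (9).
    """
    q2 = 1.0 - q1
    p12 = 1.0 - p11
    p21, p22 = p12, p11  # double stochasticity, equation (9)

    # Coordinates of phi in the basis |a_1>, |a_2>, using (7) and (10).
    d1 = math.sqrt(q1 * p11) + cmath.exp(1j * theta) * math.sqrt(q2 * p21)
    d2 = cmath.exp(1j * gamma1) * (
        math.sqrt(q1 * p12) - cmath.exp(1j * theta) * math.sqrt(q2 * p22)
    )
    born = (abs(d1) ** 2, abs(d2) ** 2)

    # Right-hand sides of (11) and (12).
    formula = (
        q1 * p11 + q2 * p21 + 2 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta),
        q1 * p12 + q2 * p22 - 2 * math.sqrt(q1 * p12 * q2 * p22) * math.cos(theta),
    )
    return born, formula


if __name__ == "__main__":
    born, formula = quantum_rule(q1=0.3, p11=0.6, theta=1.1)
    print(born, formula)
```

The phase $\gamma_1$ drops out of $|d_2|^2$, which is why it appears only as an optional argument in this sketch.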


3 Probability transformations connecting preparation procedures

Let us forget for the moment about the quantum theory. Let $B$ ($= b_1, b_2$) and $A$ ($= a_1, a_2$) be physical variables. We consider an arbitrary preparation procedure $\mathcal{E}$ for microsystems or macrosystems. Suppose that $\mathcal{E}$ produced an ensemble $S$ of physical systems. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be preparation procedures which are based on filters $F_1$ and $F_2$ corresponding, respectively, to the values $b_1$ and $b_2$ of $B$. Denote the statistical ensembles produced by these preparation procedures by the symbols $T_1$ and $T_2$, respectively. The symbols $N$, $n_i^b$, $n_i^a$, $n_{ij}$, $m_{ij}$ have the same meaning as in the previous considerations. The probabilities $q_i$, $p_{ij}$, $p_i$ are defined in the same way as in the previous considerations. The only difference is that, instead of indexes corresponding to quantum states, we use indexes corresponding to statistical ensembles:

$$q_i = p_S(B = b_i), \qquad p_i = p_S(A = a_i), \qquad p_{ij} = p_{T_i}(A = a_j).$$

We shall restrict our considerations to the case of strictly positive probabilities.

The following simple frequency considerations are basic in our investigation. We would like to represent the frequency $p_i^{(N)}$ (for $A = a_i$ in the ensemble $S$) as the sum of the conventional (Bayes) part

$$q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)}$$

and some perturbation term. Such a perturbation term appears because the frequencies $q_j^{(N)}$ and $p_{ji}^{(N)}$ are calculated with respect to different ensembles. The magnitude of this perturbation term will play the crucial role in our further analysis. We have

$$p_i^{(N)} = \frac{n_i^a}{N} = \frac{n_{1i}}{N} + \frac{n_{2i}}{N} = \frac{m_{1i}}{N} + \frac{m_{2i}}{N} + \frac{n_{1i} - m_{1i}}{N} + \frac{n_{2i} - m_{2i}}{N}.$$

But for $i = 1, 2$ we have

$$\frac{m_{1i}}{N} = \frac{m_{1i}}{n_1^b} \cdot \frac{n_1^b}{N} = p_{1i}^{(N)} q_1^{(N)}, \qquad \frac{m_{2i}}{N} = \frac{m_{2i}}{n_2^b} \cdot \frac{n_2^b}{N} = p_{2i}^{(N)} q_2^{(N)}.$$

Hence

$$p_i^{(N)} = q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)} + \delta_i^{(N)}, \tag{13}$$

where

$$\delta_i^{(N)} = \frac{1}{N}\left[(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})\right], \qquad i = 1, 2.$$
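Decomposition (13) is pure counting, so it can be verified exactly with rational arithmetic. The sketch below is illustrative only: the function name `decompose` and the counts are invented for the example (they are not experimental data), and indices are 0-based, so `n[j][i]` stands for $n_{j+1,\,i+1}$.

```python
from fractions import Fraction


def decompose(N, n, m, i):
    """Return (p_i, Bayes part, delta_i) for hypothetical counts.

    n[j][i] : number of particles in S with B = b_j and A = a_i
    m[j][i] : number of particles in T_j with A = a_i
    It is assumed, as in the text, that T_j contains n_j^b particles.
    """
    nb = [sum(n[j]) for j in range(2)]  # n_j^b = n_{j1} + n_{j2}
    assert sum(nb) == N
    assert all(sum(m[j]) == nb[j] for j in range(2))

    p_i = Fraction(n[0][i] + n[1][i], N)  # n_i^a / N
    # Conventional part: sum over j of q_j^(N) * p_ji^(N).
    bayes = sum(Fraction(nb[j], N) * Fraction(m[j][i], nb[j]) for j in range(2))
    # Perturbation term delta_i^(N).
    delta = Fraction((n[0][i] - m[0][i]) + (n[1][i] - m[1][i]), N)
    return p_i, bayes, delta


if __name__ == "__main__":
    # Invented counts, chosen only to illustrate the bookkeeping.
    n = [[300, 200], [100, 400]]
    m = [[250, 250], [200, 300]]
    print(decompose(1000, n, m, 0))
```

With these invented counts the frequency of $A = a_1$ is $2/5$, the Bayes part is $9/20$, and the perturbation term is $-1/20$.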


In fact, this rest term depends on the statistical ensembles $S$, $T_1$, $T_2$: $\delta_i^{(N)} = \delta_i^{(N)}(S, T_1, T_2)$.

4 Behaviour of fluctuations

First we remark that $\lim_{N \to \infty} \delta_i^{(N)}$ exists for all physical measurements. We always observe that

$$p_i^{(N)} \to p_i, \qquad q_i^{(N)} \to q_i, \qquad p_{ij}^{(N)} \to p_{ij}, \qquad N \to \infty.$$

Thus there exist the limits

$$\delta_i = \lim_{N \to \infty} \delta_i^{(N)} = p_i - q_1 p_{1i} - q_2 p_{2i}.$$

This coefficient $\delta_i$ is the statistical deviation produced by the perturbation effect of the preparation procedure $\mathcal{E}_i$ (the quantities $\delta_i^{(N)}$ are experimental statistical deviations).

Suppose that the preparation procedures $\mathcal{E}_i$, $i = 1, 2$ (typically filters $F_i$), produce negligibly small (with respect to the size $N$ of the statistical ensemble) changes in properties of particles. Then

$$\delta_i^{(N)} \to 0, \qquad N \to \infty. \tag{14}$$

This asymptotic implies conventional probabilistic rule (1) In particular this rule can be used in all experiments of classical physics Hence preparation and measurement procedures of classical physics produce experimental statistical deviations with asymptotic (14) We also have such a behaviour in the case of compatible observables in quantum physics

Moreover, we can obtain the same conventional probabilistic rule for incompatible observables $B$ and $A$ if the phase factor $\theta = \frac{\pi}{2} + \pi k$. Therefore the conventional probabilistic rule (1) is not directly related to commutativity of the corresponding operators in quantum theory. It is a consequence of asymptotic (14).

Despite the same asymptotic (14), there is a crucial difference between classical observations (and compatible observations) and decoherence, $\theta = \frac{\pi}{2} + \pi k$, for incompatible observations. In the first case $\delta_i^{(N)} \approx 0$, $N \to \infty$, because both

$$\delta_{i1}^{(N)} = \frac{1}{N}(n_{1i} - m_{1i}) \approx 0, \qquad \delta_{i2}^{(N)} = \frac{1}{N}(n_{2i} - m_{2i}) \approx 0, \qquad N \to \infty.$$

In an ideal classical experiment we have

$$n_{1i} = m_{1i} \quad \text{and} \quad n_{2i} = m_{2i}.$$

Here the preparation procedures $\mathcal{E}_j$ (filters with respect to the values $b_j$ of the variable $B$) do not change values of the $A$-variable at all. In the case of decoherence of incompatible observables the statistical deviations $\delta_{i1}^{(N)}$ and $\delta_{i2}^{(N)}$ are not negligibly small. So perturbations can be sufficiently strong. However, we still observe (14) as a consequence of the compensation effect of perturbations:


$$\delta_{i1}^{(N)} \approx -\delta_{i2}^{(N)}.$$

Suppose now that the filters $F_i$, $i = 1, 2$, produce changes in properties of particles that are not negligibly small (from the statistical viewpoint). Then the statistical deviations

$$\lim_{N \to \infty} \delta_i^{(N)} = \delta_i \neq 0. \tag{15}$$

Here we obtain probabilistic rules which differ from the conventional one (1). In particular, this implies that behaviour (15) cannot be produced in experiments of classical physics (or for compatible observables in quantum physics).

A rather special class of statistical deviations (15) is produced in experiments of quantum physics. However, behaviour of form (15) is not a specific feature of quantum measurements (see further considerations).

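To study the behaviour of the fluctuations $\delta_i^{(N)}$ carefully, we represent them as

$$\delta_i^{(N)} = 2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}\;\lambda_i^{(N)},$$

where

$$\lambda_i^{(N)} = \frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}}.$$

These are normalized (experimental) statistical deviations. We have used the fact that

$$q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)} = \frac{n_1^b}{N} \cdot \frac{m_{1i}}{n_1^b} \cdot \frac{n_2^b}{N} \cdot \frac{m_{2i}}{n_2^b} = \frac{m_{1i} m_{2i}}{N^2}.$$

In the limit $N \to \infty$ we get

$$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\;\lambda_i,$$

where the coefficients $\lambda_i = \lim_{N \to \infty} \lambda_i^{(N)}$, $i = 1, 2$. Thus we found the general probabilistic transformation (for three preparation procedures) that can be obtained as a perturbation of the conventional probabilistic rule ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\;\lambda_i. \tag{16}$$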

Of course, we are free in the choice of a normalization constant in the perturbation term. We use $2\sqrt{q_1 q_2 p_{1i} p_{2i}}$ by analogy with the quantum formalism. In fact, such a normalization was found in the quantum formalism to get the representation of probabilities with the aid of complex numbers. Complex numbers were introduced in the quantum formalism to linearize the nonlinear


probabilistic transformation $q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\,\cos\theta$. To do this, we use the formula ($c, d > 0$):

$$c + d + 2\sqrt{cd}\,\cos\theta = \left|\sqrt{c} + \sqrt{d}\,e^{i\theta}\right|^2. \tag{17}$$

The square root $\sqrt{c} + \sqrt{d}\,e^{i\theta}$ gives the possibility to use linear transformations. Thus we do not see anything mystical in the appearance of complex numbers in quantum theory. This is a consequence of the impossibility of real linearization of the nonlinear probabilistic transformation.
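Formula (17) is elementary, but a quick numerical check makes the linearization concrete. This is a small sketch with illustrative function names (`lhs_17`, `rhs_17`), not anything taken from the paper.

```python
import cmath
import math


def lhs_17(c, d, theta):
    """Left-hand side of (17): the nonlinear trigonometric expression."""
    return c + d + 2.0 * math.sqrt(c * d) * math.cos(theta)


def rhs_17(c, d, theta):
    """Right-hand side of (17): squared modulus of a complex amplitude."""
    amplitude = math.sqrt(c) + math.sqrt(d) * cmath.exp(1j * theta)
    return abs(amplitude) ** 2


if __name__ == "__main__":
    for c, d, theta in [(0.2, 0.3, 0.0), (0.15, 0.35, 1.0), (0.4, 0.1, 3.0)]:
        print(lhs_17(c, d, theta), rhs_17(c, d, theta))
```

The amplitude inside `rhs_17` depends linearly on $\sqrt{c}$ and $\sqrt{d}$, which is the point of the remark about linearization.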

In classical physics the coefficients $\lambda_i = 0$. We have the same situation in quantum physics for all compatible observables, as well as for measurements of incompatible observables for some states. In the general case in quantum physics we can only say that the normalized statistical deviations satisfy

$$|\lambda_i| \le 1. \tag{18}$$

Hence for quantum experiments we always have

$$\left|\frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}}\right| \le 1, \qquad N \to \infty. \tag{19}$$

Thus quantum perturbations induce relatively small (but not negligibly small) statistical variations of properties. We underline again that quantum perturbations give just the proper class of perturbations satisfying condition (19).

Let us consider arbitrary preparation procedures that induce perturbations satisfying (18). We can set

$$\lambda_i = \cos\theta_i, \qquad i = 1, 2,$$

where $\theta_i$ are some phases. Here we can represent the perturbation to the conventional probabilistic rule in the form

$$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\,\cos\theta_i, \qquad i = 1, 2. \tag{20}$$

In this case the probabilistic rule has the form ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\,\cos\theta_i. \tag{21}$$

This is the general form of a trigonometric probabilistic transformation. The usual probabilistic calculations give us

$$1 = p_1 + p_2 = q_1 p_{11} + q_2 p_{21} + q_1 p_{12} + q_2 p_{22} + 2\sqrt{q_1 q_2 p_{11} p_{21}}\,\cos\theta_1 + 2\sqrt{q_1 q_2 p_{12} p_{22}}\,\cos\theta_2$$

$$= 1 + 2\sqrt{q_1 q_2}\left[\sqrt{p_{11} p_{21}}\,\cos\theta_1 + \sqrt{p_{12} p_{22}}\,\cos\theta_2\right].$$


Thus we obtain the relation

$$\sqrt{p_{11} p_{21}}\,\cos\theta_1 + \sqrt{p_{12} p_{22}}\,\cos\theta_2 = 0. \tag{22}$$

Suppose now that the matrix of probabilities is a double stochastic matrix. We get

$$\cos\theta_1 = -\cos\theta_2. \tag{23}$$

We obtain quantum probabilistic transformation (2) We demonstrate that this rule could be derived even in the realist framework Condition (19) has the evident interpretation To explain the mystery of quantum probabilistic rule we must give some physical interpretation to the condition of double stochasticity see section 4 for such an attempt

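We can simulate the quantum probabilistic transformation by using random variables $n_{ij}(\omega)$, $m_{ij}(\omega)$ such that the deviations

$$\epsilon_{i1}^{(N)} = n_{1i} - m_{1i} = 2\xi_{i1}^{(N)}\sqrt{m_{1i} m_{2i}}, \tag{24}$$

$$\epsilon_{i2}^{(N)} = n_{2i} - m_{2i} = 2\xi_{i2}^{(N)}\sqrt{m_{1i} m_{2i}}, \tag{25}$$

where the coefficients $\xi_{ij}^{(N)}$ satisfy the inequality

$$\left|\xi_{i1}^{(N)} + \xi_{i2}^{(N)}\right| \le 1, \qquad N \to \infty. \tag{26}$$

Suppose that $\lambda_i^{(N)} = \xi_{i1}^{(N)} + \xi_{i2}^{(N)} \to \lambda_i$, $N \to \infty$, where $|\lambda_i| \le 1$. We can represent $\lambda_i^{(N)} = \cos\theta_i^{(N)}$. Then $\theta_i^{(N)} \to \theta_i \bmod 2\pi$ when $N \to \infty$. Thus $\lambda_i = \cos\theta_i$.

We remark that the conventional probabilistic rule (which is induced by ensemble fluctuations with $\lambda_i^{(N)} \to 0$) can be observed for fluctuations having relatively large absolute magnitudes. For instance, let

$$\epsilon_{i1}^{(N)} = 2\xi_{i1}^{(N)}\sqrt{m_{1i}}, \qquad \epsilon_{i2}^{(N)} = 2\xi_{i2}^{(N)}\sqrt{m_{2i}}, \qquad i = 1, 2, \tag{27}$$

where the sequences of coefficients $\xi_{i1}^{(N)}$ and $\xi_{i2}^{(N)}$ are bounded ($N \to \infty$). Here

$$\lambda_i^{(N)} = \frac{\xi_{i1}^{(N)}}{\sqrt{m_{2i}}} + \frac{\xi_{i2}^{(N)}}{\sqrt{m_{1i}}} \to 0, \qquad N \to \infty$$

(as usual, we assume that $p_{ij} > 0$).

Example 2.1. Let $N \approx 10^6$, $n_1^b \approx n_2^b \approx 5 \cdot 10^5$, $m_{11} \approx m_{12} \approx m_{21} \approx m_{22} \approx 25 \cdot 10^4$. So $q_1 = q_2 = 1/2$; $p_{11} = p_{12} = p_{21} = p_{22} = 1/2$ (symmetric state). Suppose we have fluctuations (27) with $\xi_{i1}^{(N)} \approx \xi_{i2}^{(N)} \approx 1/2$. Then $\epsilon_{i1}^{(N)} \approx \epsilon_{i2}^{(N)} \approx 500$. So $n_{ij} = 25 \cdot 10^4 \pm 500$. Hence the relative deviation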


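$$\frac{\epsilon_{ji}^{(N)}}{m_{ji}} = \frac{500}{25 \cdot 10^4} \approx 0.002.$$

Thus fluctuations of the relative magnitude $\approx 0.002$ produce the conventional probabilistic rule.

It is evident that fluctuations of essentially larger magnitude,

$$\epsilon_{i1}^{(N)} = 2\xi_{i1}^{(N)} (m_{1i})^{1/2} (m_{2i})^{1/\alpha}, \qquad \epsilon_{i2}^{(N)} = 2\xi_{i2}^{(N)} (m_{2i})^{1/2} (m_{1i})^{1/\beta}, \qquad \alpha, \beta > 2, \tag{28}$$

where $\xi_{i1}^{(N)}$ and $\xi_{i2}^{(N)}$ are bounded sequences ($N \to \infty$), also produce (for $p_{ij} \neq 0$) the conventional probabilistic rule.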

Example 2.2. Let all numbers $N$, $m_{ij}$ be the same as in Example 2.1 and let the deviations have behaviour (28) with $\alpha = \beta = 4$. Here the relative deviation

$$\frac{\epsilon_{ij}^{(N)}}{m_{ij}} \approx 0.045.$$

Remark 2.1. The magnitude of fluctuations can be found experimentally. Let $A$ and $B$ be two physical observables. We prepare three statistical ensembles $S$, $T_1$, $T_2$ corresponding to the states $\varphi$, $|b_1\rangle$, $|b_2\rangle$. By measurements of $B$ and $A$ for $\pi \in S$ we obtain the frequencies $q_1^{(N)}, q_2^{(N)}, p_1^{(N)}, p_2^{(N)}$; by measurements of $A$ for $\pi \in T_1$ and for $\pi \in T_2$ we obtain the frequencies $p_{ij}^{(N)}$. We have

$$f_i(N) = \lambda_i^{(N)} = \frac{p_i^{(N)} - q_1^{(N)} p_{1i}^{(N)} - q_2^{(N)} p_{2i}^{(N)}}{2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}}.$$

It would be interesting to obtain graphs of the functions $f_i(N)$ for different pairs of physical observables. Of course, we know that $\lim_{N \to \infty} f_i(N) = \pm\cos\theta$. However, it may be that such graphs can present a finer structure of quantum states.
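The arithmetic of Examples 2.1 and 2.2 and the estimator of Remark 2.1 can be sketched in a few lines. The function names (`eps_27`, `eps_28`, `normalized_deviation`, `f_estimate`) are illustrative assumptions, and the coefficient $\xi = 1/2$ is taken from Example 2.1 and assumed to carry over to Example 2.2.

```python
import math


def eps_27(xi, m_same):
    """Deviation of type (27): 2 * xi * sqrt(m)."""
    return 2.0 * xi * math.sqrt(m_same)


def eps_28(xi, m_same, m_other, alpha):
    """Deviation of type (28): 2 * xi * m_same^(1/2) * m_other^(1/alpha)."""
    return 2.0 * xi * math.sqrt(m_same) * m_other ** (1.0 / alpha)


def normalized_deviation(eps1, eps2, m1, m2):
    """lambda_i^(N) = (eps_i1 + eps_i2) / (2 sqrt(m_1i m_2i))."""
    return (eps1 + eps2) / (2.0 * math.sqrt(m1 * m2))


def f_estimate(p_i, q1, q2, p1i, p2i):
    """Estimator f_i(N) of Remark 2.1, computed from measured frequencies."""
    return (p_i - q1 * p1i - q2 * p2i) / (2.0 * math.sqrt(q1 * p1i * q2 * p2i))


if __name__ == "__main__":
    m = 25 * 10**4
    xi = 0.5
    e27 = eps_27(xi, m)
    e28 = eps_28(xi, m, m, alpha=4)
    print("Example 2.1 relative deviation:", e27 / m)
    print("Example 2.2 relative deviation:", e28 / m)
    print("lambda for (27):", normalized_deviation(e27, e27, m, m))
    print("lambda for (28):", normalized_deviation(e28, e28, m, m))
```

At this fixed ensemble size the normalized deviations are small but nonzero; the statement in the text concerns their limit as $N \to \infty$.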

3 Hyperbolic and hyper-trigonometric probabilistic transformations

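Let $\mathcal{E}_1$, $\mathcal{E}_2$ be preparation procedures that produce perturbations such that the normalized (experimental) statistical deviations satisfy

$$\left|\lambda_i^{(N)}\right| \ge 1, \qquad N \to \infty. \tag{29}$$

Thus $|\lambda_i| \ge 1$, $i = 1, 2$. Here the coefficients $\lambda_i$ can be represented in the form $\lambda_i = \pm\cosh\theta_i$, $i = 1, 2$. The corresponding probability rule has the following form:

$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\,\cosh\theta_i, \qquad i = 1, 2.$$

The normalization $p_1 + p_2 = 1$ gives the orthogonality relation

$$\sqrt{p_{11} p_{21}}\,\cosh\theta_1 \pm \sqrt{p_{12} p_{22}}\,\cosh\theta_2 = 0. \tag{30}$$

Thus $\cosh\theta_2 = \cosh\theta_1 \sqrt{\dfrac{p_{11} p_{21}}{p_{12} p_{22}}}$ and $\operatorname{sign} \lambda_1 \lambda_2 = -1$.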


This probabilistic transformation can be called a hyperbolic rule. It describes a part of nonconventional probabilistic behaviours which is not described by the trigonometric formalism. Experiments (and preparation procedures $\mathcal{E}$, $\mathcal{E}_1$, $\mathcal{E}_2$) which produce hyperbolic probabilistic behaviour could be simulated on a computer. On the other hand, at the moment we have no natural physical phenomena which are described by the hyperbolic probabilistic formalism. Trigonometric probabilistic behaviour corresponds to essentially better control of properties in the process of preparation than hyperbolic probabilistic behaviour. Of course, the aim of any experimenter is to approach trigonometric behaviour. However, in principle there might exist natural phenomena for which trigonometric quantum behaviour could not be achieved.

Example 3.1. Let $q_1 = \alpha$, $q_2 = 1 - \alpha$, $p_{11} = \ldots = p_{22} = 1/2$. Then

$$p_1 = \frac{1}{2} + \sqrt{\alpha(1 - \alpha)}\,\lambda_1, \qquad p_2 = \frac{1}{2} - \sqrt{\alpha(1 - \alpha)}\,\lambda_1.$$

If $\alpha$ is sufficiently small, then $\lambda_1$ can be, in principle, larger than 1. We can find a phase $\theta$ such that the normalized statistical deviation $\lambda_1 = \cosh\theta$.

Let us consider experiments that produce the hyperbolic probabilistic rule and let the corresponding matrix of probabilities be double stochastic. In this case the orthogonality relation (30) has the form

$$\cosh\theta_1 = \cosh\theta_2 = \cosh\theta.$$

We get the probabilistic transformation

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 q_2 p_{11} p_{21}}\,\cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 q_2 p_{12} p_{22}}\,\cosh\theta.$$

This probabilistic transformation looks similar to the quantum probabilistic transformation. The only difference is the presence of hyperbolic factors instead of trigonometric ones. This similarity gives the possibility to construct a linear space representation of the hyperbolic probabilistic calculus, see section 7.
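The hyperbolic rule for a double stochastic matrix can be explored numerically. The sketch below is illustrative: the function names `hyperbolic_rule` and `is_probability_pair`, and the chosen values of $\alpha$ and $\theta$, are assumptions made for the example. It shows that the two values always sum to 1, but stay inside $[0, 1]$ only when $\sqrt{q_1 q_2}$ is small enough, in the spirit of Example 3.1.

```python
import math


def hyperbolic_rule(q1, p11, theta, sign=1):
    """Hyperbolic transformation for a double stochastic matrix.

    sign = +1 or -1 selects the upper or lower sign in the formulas.
    """
    q2 = 1.0 - q1
    p12 = 1.0 - p11
    p21, p22 = p12, p11  # double stochasticity
    c = math.cosh(theta)
    p1 = q1 * p11 + q2 * p21 + sign * 2.0 * math.sqrt(q1 * q2 * p11 * p21) * c
    p2 = q1 * p12 + q2 * p22 - sign * 2.0 * math.sqrt(q1 * q2 * p12 * p22) * c
    return p1, p2


def is_probability_pair(p1, p2, tol=1e-12):
    """True if both values lie in [0, 1] and sum to 1 (up to tol)."""
    return (
        -tol <= p1 <= 1.0 + tol
        and -tol <= p2 <= 1.0 + tol
        and abs(p1 + p2 - 1.0) <= tol
    )


if __name__ == "__main__":
    # Small alpha, as in Example 3.1: the result is a valid distribution.
    print(hyperbolic_rule(q1=0.01, p11=0.5, theta=1.0))
    # Symmetric q: cosh(theta) > 1 pushes p1 above 1.
    print(hyperbolic_rule(q1=0.5, p11=0.5, theta=1.0))
```

The second call illustrates why hyperbolic behaviour is restricted to a subset of parameter values: normalization holds automatically, positivity does not.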

The reader can easily consider the last possibility: one normalized statistical deviation $|\lambda_i|$ is larger than 1 and the other is less than 1 (hyper-trigonometric probabilistic transformation).

Remark 3.1. The real experimental situation is more complicated. In fact, the phase parameter $\theta$ is connected with the experimental arrangement. In particular, in the standard interference experiments the phase is related to the space-time structure of an experiment. It may be that in some experiments the dependence of the normalized statistical deviation $\lambda$ on $\theta$ is neither trigonometric nor hyperbolic:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\lambda(\theta).$$

However, if the function $|\lambda(\theta)| \le 1$, then we can obtain the trigonometric transformation by just the reparametrization $\theta' = \arccos\lambda(\theta)$.


4 Double stochasticity and correlations between preparation procedures

In this section we study the frequency meaning of the fact that in the quantum formalism the matrix of probabilities is double stochastic. We remark that this is a consequence of the orthogonality of the quantum states $|b_1\rangle$ and $|b_2\rangle$ corresponding to distinct values of a physical observable $B$. We have

$$\frac{p_{11}}{p_{12}} = \frac{p_{22}}{p_{21}}. \tag{31}$$

Suppose that all quantum features are induced by the impossibility to create new ensembles $T_1$ and $T_2$ without changing properties of quantum particles. Suppose that, for example, the preparation procedure $\mathcal{E}_1$ practically destroys the property $A = a_1$ (transforms this property into the property $A = a_2$). So $p_{11} \approx 0$. As a consequence, $\mathcal{E}_1$ makes the property $A = a_2$ dominating. So $p_{12} \approx 1$. Then the preparation procedure $\mathcal{E}_2$ must practically destroy the property $A = a_2$ (transform this property into the property $A = a_1$). So $p_{22} \approx 0$. As a consequence, $\mathcal{E}_2$ makes the property $A = a_1$ dominating. So $p_{21} \approx 1$.

We remark that

We recall that the number of elements in the ensemble $T_i$ is equal to $n_i^b$. Thus

n n -run _ n22 - m 2 2 ^ nil _ 22 bdquobdquo

This is nothing other than the relation between fluctuations of the property $A$ under the transition from the ensemble $S$ to the ensembles $T_1$, $T_2$ and the distribution of this property in the ensemble $S$.

5 Hyperbolic quantum formalism

The mathematical formalism presented in this section can have different physical interpretations. In particular, a quantum state can be interpreted from the orthodox Copenhagen as well as statistical viewpoints.

A hyperbolic algebra $G$, see [18], p. 21, is a two dimensional real algebra with basis $e_0 = 1$ and $e_1 = j$, where $j^2 = 1$. Elements of $G$ have the form $z = x + jy$, $x, y \in \mathbf{R}$. We have $z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)$ and $z_1 z_2 = (x_1 x_2 + y_1 y_2) + j(x_1 y_2 + x_2 y_1)$. This algebra is commutative. We introduce


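the involution in $G$ by setting $\bar{z} = x - jy$. We set $|z|^2 = z\bar{z} = x^2 - y^2$. We remark that $|z| = \sqrt{x^2 - y^2}$ is not well defined for an arbitrary $z \in G$. We set $G_+ = \{z \in G : |z|^2 \ge 0\}$. We remark that $G_+$ is a multiplicative semigroup: $z_1, z_2 \in G_+ \Rightarrow z = z_1 z_2 \in G_+$. It is a consequence of the equality

$$|z_1 z_2|^2 = |z_1|^2 |z_2|^2.$$

Thus for $z_1, z_2 \in G_+$ we have $|z_1 z_2| = |z_1||z_2|$. We introduce

$$e^{j\theta} = \cosh\theta + j\sinh\theta, \qquad \theta \in \mathbf{R}.$$

We remark that

$$e^{j\theta_1} e^{j\theta_2} = e^{j(\theta_1 + \theta_2)}, \qquad \overline{e^{j\theta}} = e^{-j\theta}, \qquad |e^{j\theta}|^2 = \cosh^2\theta - \sinh^2\theta = 1.$$

Hence $z = \pm e^{j\theta}$ always belongs to $G_+$. We also have

$$\cosh\theta = \frac{e^{j\theta} + e^{-j\theta}}{2}, \qquad \sinh\theta = \frac{e^{j\theta} - e^{-j\theta}}{2j}.$$

We set $G_+^* = \{z \in G_+ : |z|^2 > 0\}$. Let $z \in G_+^*$. We have

$$z = |z|\left(\frac{x}{|z|} + j\frac{y}{|z|}\right) = \operatorname{sign} x\,|z|\left(\frac{x\,\operatorname{sign} x}{|z|} + j\frac{y\,\operatorname{sign} x}{|z|}\right).$$

As $\dfrac{x^2}{|z|^2} - \dfrac{y^2}{|z|^2} = 1$, we can represent $\dfrac{x\,\operatorname{sign} x}{|z|} = \cosh\theta$ and $\dfrac{y\,\operatorname{sign} x}{|z|} = \sinh\theta$, where the phase $\theta$ is uniquely defined. We can represent each $z \in G_+^*$ as

$$z = \operatorname{sign} x\,|z|\,e^{j\theta}.$$

By using this representation we can easily prove that $G_+^*$ is a multiplicative group. Here $\dfrac{1}{z} = \dfrac{\operatorname{sign} x}{|z|}\,e^{-j\theta}$. The unit circle in $G$ is defined as $S_1 = \{z \in G : |z|^2 = 1\} = \{z = \pm e^{j\theta},\ \theta \in (-\infty, +\infty)\}$. It is a multiplicative subgroup of $G_+^*$.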
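The algebra $G$ is easy to model directly. The following is a minimal sketch, assuming a small hand-written class (the names `Hyp`, `hexp` and `polar` are illustrative, not from [18]); it implements the product, the involution, $|z|^2$, the exponential $e^{j\theta}$ and the polar representation $z = \operatorname{sign} x\,|z|\,e^{j\theta}$ for $|z|^2 > 0$.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyp:
    """Hyperbolic number z = x + j*y with j*j = 1."""
    x: float
    y: float

    def __add__(self, other):
        return Hyp(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x1 + j y1)(x2 + j y2) = (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)
        return Hyp(self.x * other.x + self.y * other.y,
                   self.x * other.y + self.y * other.x)

    def conj(self):
        return Hyp(self.x, -self.y)

    def norm2(self):
        # |z|^2 = z * conj(z) = x^2 - y^2, which may be negative
        return self.x ** 2 - self.y ** 2


def hexp(theta):
    """e^{j theta} = cosh(theta) + j sinh(theta)."""
    return Hyp(math.cosh(theta), math.sinh(theta))


def polar(z):
    """Return (sign, modulus, theta) with z = sign * modulus * e^{j theta}.

    Defined only when |z|^2 > 0; then |x| > |y|, so y/x lies in (-1, 1).
    """
    n2 = z.norm2()
    if n2 <= 0:
        raise ValueError("polar form requires |z|^2 > 0")
    sign = 1.0 if z.x > 0 else -1.0
    return sign, math.sqrt(n2), math.atanh(z.y / z.x)


if __name__ == "__main__":
    j = Hyp(0.0, 1.0)
    print(j * j)                     # Hyp(x=1.0, y=0.0)
    print(polar(Hyp(-3.0, 1.0)))
```

The phase is computed from $\tanh\theta = y/x$, which follows from the two relations for $\cosh\theta$ and $\sinh\theta$ above because the factor $\operatorname{sign} x / |z|$ cancels.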

A hyperbolic Hilbert space is a $G$-linear space (module), see [18], $E$ with a $G$-linear product: a map $(\cdot, \cdot) : E \times E \to G$ that is

1) linear with respect to the first argument: $(az + bw, u) = a(z, u) + b(w, u)$, $a, b \in G$, $z, w, u \in E$;
2) symmetric: $(z, u) = \overline{(u, z)}$;
3) nondegenerate: $(z, u) = 0$ for all $u \in E$ iff $z = 0$.

If we consider $E$ as just an $\mathbf{R}$-linear space, then $(\cdot, \cdot)$ is a bilinear form which is not positively defined. In particular, in the two dimensional case we have the signature $(+, -, +, -)$.

As in the ordinary quantum formalism, we represent physical states by normalized vectors of the hyperbolic Hilbert space: $\varphi \in E$ and $(\varphi, \varphi) = 1$. We shall consider only dichotomic physical variables and quantum states belonging to the two dimensional Hilbert space. So everywhere below $E$ denotes the two dimensional space. Let $A = a_1, a_2$ and $B = b_1, b_2$ be two dichotomic physical variables. We represent them by the $G$-linear operators $|a_1\rangle\langle a_1| + |a_2\rangle\langle a_2|$


and $|b_1\rangle\langle b_1| + |b_2\rangle\langle b_2|$, where $\{|a_i\rangle\}_{i=1,2}$ and $\{|b_i\rangle\}_{i=1,2}$ are two orthonormal bases in $E$.

Let $\varphi$ be a state (a normalized vector belonging to $E$). We can perform the following operation (which is well defined from the mathematical point of view). We expand the vector $\varphi$ with respect to the basis $\{|b_i\rangle\}_{i=1,2}$:

$$\varphi = \beta_1 |b_1\rangle + \beta_2 |b_2\rangle, \tag{34}$$

where the coefficients (coordinates) $\beta_i$ belong to $G$. As the basis $\{|b_i\rangle\}_{i=1,2}$ is orthonormal, we get (as in the complex case) that

$$|\beta_1|^2 + |\beta_2|^2 = 1. \tag{35}$$

However, we could not automatically use Born's probabilistic interpretation for normalized vectors in the hyperbolic Hilbert space: it may be that $\beta_i \notin G_+$ (in fact, in the complex case we have $\mathbf{C} = \mathbf{C}_+$). We say that a state $\varphi$ is decomposable with respect to the system of states $\{|b_i\rangle\}_{i=1,2}$ ($B$-decomposable) if

$$\beta_i \in G_+. \tag{36}$$

In such a case we can use Born's probabilistic interpretation of vectors in a hyperbolic Hilbert space.

The numbers $q_i = |\beta_i|^2$, $i = 1, 2$, are interpreted as probabilities for the values $B = b_i$ for the $G$-quantum state $\varphi$.

We now repeat these considerations for each state $|b_i\rangle$ by using the basis $\{|a_k\rangle\}_{k=1,2}$. We suppose that each $|b_i\rangle$ is $A$-decomposable. We have

$$|b_1\rangle = \beta_{11} |a_1\rangle + \beta_{12} |a_2\rangle, \qquad |b_2\rangle = \beta_{21} |a_1\rangle + \beta_{22} |a_2\rangle, \tag{37}$$

where the coefficients $\beta_{ik}$ belong to $G_+$. We have automatically

$$|\beta_{11}|^2 + |\beta_{12}|^2 = 1, \qquad |\beta_{21}|^2 + |\beta_{22}|^2 = 1. \tag{38}$$

We can use the probabilistic interpretation of the numbers $p_{11} = |\beta_{11}|^2$, $p_{12} = |\beta_{12}|^2$ and $p_{21} = |\beta_{21}|^2$, $p_{22} = |\beta_{22}|^2$: $p_{ik}$ is the probability for $A = a_k$ in the state $|b_i\rangle$.

Let us consider the matrices $\mathcal{B} = (\beta_{ik})$ and $P = (p_{ik})$. As in the complex case, the matrix $\mathcal{B}$ is unitary: the vectors $u_1 = (\beta_{11}, \beta_{12})$ and $u_2 = (\beta_{21}, \beta_{22})$ are orthonormal. The matrix $P$ is double stochastic.

By using the $G$-linear space calculation (the change of the basis) we get $\varphi = \alpha_1 |a_1\rangle + \alpha_2 |a_2\rangle$, where $\alpha_1 = \beta_1 \beta_{11} + \beta_2 \beta_{21}$ and $\alpha_2 = \beta_1 \beta_{12} + \beta_2 \beta_{22}$.


We remark that decomposability is not transitive. In principle $\varphi$ may be not $A$-decomposable, despite $B$-decomposability of $\varphi$ and $A$-decomposability of the $B$-system.

Suppose that $\varphi$ is $A$-decomposable. Therefore the coefficients $p_k = |\alpha_k|^2$ can be interpreted as probabilities for $A = a_k$ for the $G$-quantum state $\varphi$.

Let us consider states such that the coefficients $\beta_i$, $\beta_{ik}$ belong to $G_+^*$. We can uniquely represent them as

$$\beta_i = \pm\sqrt{q_i}\,e^{j\xi_i}, \qquad \beta_{ik} = \pm\sqrt{p_{ik}}\,e^{j\gamma_{ik}}, \qquad i, k = 1, 2.$$

We find that

$$p_1 = q_1 p_{11} + q_2 p_{21} + 2\epsilon_1 \sqrt{q_1 p_{11} q_2 p_{21}}\,\cosh\theta_1, \tag{39}$$

$$p_2 = q_1 p_{12} + q_2 p_{22} + 2\epsilon_2 \sqrt{q_1 p_{12} q_2 p_{22}}\,\cosh\theta_2, \tag{40}$$

where $\theta_i = \eta + \gamma_i$ and $\eta = \xi_1 - \xi_2$, $\gamma_1 = \gamma_{11} - \gamma_{21}$, $\gamma_2 = \gamma_{12} - \gamma_{22}$ and $\epsilon_i = \pm$. To find the right relation between the signs of the last terms in equations (39), (40) we use the normalization condition

$$|\alpha_1|^2 + |\alpha_2|^2 = 1 \tag{41}$$

(which is a consequence of the normalization of $\varphi$ and orthonormality of the system $\{|a_i\rangle\}_{i=1,2}$). It is equivalent to the equation (condition of orthogonality in the hyperbolic case, see section 8):

$$\sqrt{p_{12} p_{22}}\,\cosh\theta_2 \pm \sqrt{p_{11} p_{21}}\,\cosh\theta_1 = 0.$$

Thus we have to choose opposite signs in equations (39), (40). Unitarity of $\mathcal{B}$ also implies that $\theta_1 - \theta_2 = 0$, so $\gamma_1 = \gamma_2$. We recall that in ordinary quantum mechanics we have similar conditions, but trigonometric functions are used instead of hyperbolic ones and the phases $\gamma_1$ and $\gamma_2$ are such that $\gamma_1 - \gamma_2 = \pi$.

Finally, we get that (unitary) linear transformations in the $G$-Hilbert space (in the domain of decomposable states) represent the hyperbolic transformation of probabilities (see section 8):

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 p_{11} q_2 p_{21}}\,\cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 p_{12} q_2 p_{22}}\,\cosh\theta.$$

This is a kind of hyperbolic interference.

There can be some connection with quantization in Hilbert spaces with indefinite metric, as well as with the theory of relativity. However, at the moment we cannot say anything definite. It seems that by using Lorentz rotations we can produce hyperbolic interference in a similar way as we produce the standard trigonometric interference by using ordinary rotations.
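The hyperbolic interference formulas can be reproduced from the $G$-linear change of basis. The sketch below is an illustration under stated assumptions: hyperbolic numbers are modelled as plain `(x, y)` tuples, all helper names (`h_mul`, `h_add`, `h_norm2`, `h_amp`, `hyperbolic_interference`) are invented for the example, the phases are arbitrary, and the signs and phases of $\beta_{ik}$ are chosen so that $\gamma_1 = \gamma_2$ with opposite signs in the two rows, as the unitarity condition requires.

```python
import math


def h_mul(z, w):
    """Product in G for z = (x, y) ~ x + j*y, with j*j = 1."""
    return (z[0] * w[0] + z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def h_add(z, w):
    return (z[0] + w[0], z[1] + w[1])


def h_norm2(z):
    """|z|^2 = x^2 - y^2."""
    return z[0] ** 2 - z[1] ** 2


def h_amp(r, phase, sign=1.0):
    """sign * r * e^{j phase} as an (x, y) pair."""
    return (sign * r * math.cosh(phase), sign * r * math.sinh(phase))


def hyperbolic_interference(q1, p11, xi1, xi2, g11, g21):
    """Return ((|alpha_1|^2, |alpha_2|^2), (formula p_1, formula p_2))."""
    q2 = 1.0 - q1
    p12 = 1.0 - p11
    p21, p22 = p12, p11              # double stochastic matrix P
    g22 = 0.0
    g12 = g11 - g21 + g22            # enforces gamma_1 = gamma_2

    b1, b2 = h_amp(math.sqrt(q1), xi1), h_amp(math.sqrt(q2), xi2)
    b11, b12 = h_amp(math.sqrt(p11), g11), h_amp(math.sqrt(p12), g12)
    b21 = h_amp(math.sqrt(p21), g21)
    b22 = h_amp(math.sqrt(p22), g22, sign=-1.0)   # opposite sign in row 2

    # Change of basis: alpha_k = beta_1 beta_1k + beta_2 beta_2k.
    a1 = h_add(h_mul(b1, b11), h_mul(b2, b21))
    a2 = h_add(h_mul(b1, b12), h_mul(b2, b22))

    theta = (xi1 - xi2) + (g11 - g21)             # theta = eta + gamma_1
    f1 = q1 * p11 + q2 * p21 + 2 * math.sqrt(q1 * p11 * q2 * p21) * math.cosh(theta)
    f2 = q1 * p12 + q2 * p22 - 2 * math.sqrt(q1 * p12 * q2 * p22) * math.cosh(theta)
    return (h_norm2(a1), h_norm2(a2)), (f1, f2)


if __name__ == "__main__":
    print(hyperbolic_interference(0.01, 0.5, 0.4, 0.1, 0.2, 0.0))
```

For other parameter choices $|\alpha_2|^2$ can become negative, which corresponds to a state that is not $A$-decomposable.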


6 Physical consequences

The wave-particle dualism was created to explain the interference phenomenon for massive elementary particles. In particular, the orthodox Copenhagen interpretation was proposed to find a compromise between corpuscular and wave features of elementary particles. The idea of superposition of distinct properties is, in fact, based on these interference experiments. It is well known that the orthodox Copenhagen interpretation is not free of difficulties (in particular, collapse of the wave function) and even paradoxes (see, for example, Schrödinger [19]). Problems in the orthodox Copenhagen interpretation induce even attempts to exclude corpuscular objects from quantum theory altogether, see, for example, [20] for Schrödinger's critique of the classical concept of a particle. At the moment there is only one alternative to the orthodox Copenhagen interpretation, namely, Einstein's statistical interpretation. By this interpretation the wave function describes distinct statistical features of an ensemble of elementary particles, see L. Ballentine [17] for the details (see also [16], [5], [10], [11]).

However, we must recognize that Einstein's statistical approach could not solve the fundamental problem of quantum theory: it could not explain the appearance of NEW STATISTICS in the purely corpuscular model. We did this in the present paper. On one hand, this is a strong argument in favour of the statistical interpretation of quantum mechanics. On the other hand, one of the main motivations to use the wave-particle duality disappeared.

Nevertheless, our investigation could not be considered as the crucial argument against the wave-particle duality. It is clear that by using purely mathematical analysis we cannot prove or disprove a physical theory. The only thing that we proved is that corpuscular objects (that have no wave features) can exhibit NEW STATISTICS.

In fact, we obtained essentially more than planned: this NEW STATISTICS is not reduced to QUANTUM STATISTICS. In principle, we can propose experiments that induce TRIGONOMETRIC, HYPERBOLIC and HYPER-TRIGONOMETRIC STATISTICS.

We remark that the quantum probabilistic transformation $P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta$ gives the possibility to predict the probability $P$ if we know the probabilities $P_1$ and $P_2$. In principle, there might be created theories based on arbitrary transformations $P = F(P_1, P_2)$. It may be that some rules have linear space representations over exotic number systems, for example, p-adic numbers [20].


Preliminary analysis of probabilistic foundations of quantum mechanics (that induced the present investigation) was performed in the books [11] and [21] (chapter 2) a part of results of this paper was presented in preprints [22]-[24]

Acknowledgements

I would like to thank S. Albeverio, L. Accardi, L. Ballentine, V. Belavkin, E. Beltrametti, W. De Muynck, S. Gudder, T. Hida, A. Holevo, P. Lahti, A. Peres, J. Summhammer, I. Volovich for (sometimes critical) discussions on probabilistic foundations of quantum mechanics.

References

1. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).
2. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
3. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes. The wave-particle dualism. A tribute to Louis de Broglie on his 90th Birthday, ed. S. Diner, D. Fargue, G. Lochak and F. Selleri (D. Reidel Publ. Company, Dordrecht, 297-330, 1984).
4. B. d'Espagnat, Veiled Reality. An analysis of present-day quantum mechanical concepts (Addison-Wesley, 1995).
5. A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, 1994).
6. J. von Neumann, Mathematical foundations of quantum mechanics (Princeton Univ. Press, Princeton, N.J., 1955).
7. E. Schrödinger, Philosophy and the Birth of Quantum Mechanics. Edited by M. Bitbol, O. Darrigol (Editions Frontieres, 1992).
8. J. M. Jauch, Foundations of Quantum Mechanics (Addison-Wesley, Reading, Mass., 1968).
9. P. Busch, M. Grabowski, P. Lahti, Operational Quantum Physics (Springer Verlag, 1995).
10. W. De Muynck, W. De Baere, H. Martens, Found. Phys. 24, 1589-1663 (1994).
11. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
12. J. Summhammer, Int. J. Theor. Phys. 33, 171-178 (1994).
13. R. von Mises, The mathematical theory of probability and statistics (Academic, London, 1964).


14. A. N. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer Verlag, Berlin, 1933); reprinted: Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
15. W. Heisenberg, Z. Physik 43, 172 (1927).
16. L. E. Ballentine, Quantum mechanics (Englewood Cliffs, New Jersey, 1989).
17. L. E. Ballentine, Rev. Mod. Phys. 42, 358-381 (1970).
18. A. Yu. Khrennikov, Superanalysis (Kluwer Academic Publishers, Dordrecht, 1999).
19. E. Schrödinger, Die Naturwiss. 23, 807-812, 824-828, 844-849 (1935).
20. E. Schrödinger, What is an elementary particle? in Gesammelte Abhandlungen (Vieweg and Son, Wien, 1984).
21. A. Yu. Khrennikov, p-adic valued distributions in mathematical physics (Kluwer Academic Publishers, Dordrecht, 1994).
22. A. Yu. Khrennikov, Ensemble fluctuations and the origin of quantum probabilistic rule. Rep. MSI, Växjö Univ., 90, October (2000).
23. A. Yu. Khrennikov, Classification of transformations of probabilities for preparation procedures: trigonometric and hyperbolic behaviours. Preprint quant-ph/0012141, 24 Dec (2000).
24. A. Yu. Khrennikov, Hyperbolic quantum mechanics. Preprint quant-ph/0101002, 31 Dec (2000).


NONCONVENTIONAL VIEWPOINT TO ELEMENTS OF PHYSICAL REALITY BASED ON NONREAL ASYMPTOTICS OF RELATIVE FREQUENCIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se

We study the connection between stabilization of relative frequencies and elements of physical reality. We observe that, besides the standard stabilization with respect to the real metric, other statistical stabilizations can be considered (in particular, with respect to the so-called p-adic metric on the set of rational numbers). Nonconventional statistical stabilizations might be connected with new (nonconventional) elements of reality. We present a few natural examples of statistical phenomena in which relative frequencies of observed events stabilize in the p-adic metric but fluctuate in the standard real metric.

1 Introduction

The present methodology of physical measurements is based on the principle of the statistical stabilization of relative frequencies in a long run of trials. In the mathematical model this principle is represented by the law of large numbers. This approach to measurements is induced by the human representation of physical reality as a reality of stable repetitive phenomena. In the process of evolution we created cognitive structures that correspond to elements of this repetitive physical reality. All modern physical investigations are oriented to the creation of new elements of such a reality.

It must be remarked that the notion of stabilization (of relative frequencies) plays the fundamental role in the creation of this reality. I would like to point out that the conventional meaning of stabilization is based on real numbers. When we say stabilization, we mean stabilization with respect to the standard real metric $\rho_R(x, y) = |x - y|$ (the distance between the points $x$ and $y$ on the real line $R$). Of course, such a choice of the metric that statistically determines elements of physical reality was not just a consequence of the development of one special mathematical theory, real analysis (see footnote b). It seems that the notion of $\rho_R$-stabilization was induced by human practice, in which quantities $n \ll N$ were not important. We created the real physical reality because we used smallness based on the standard order on the set of natural numbers.

a We ask the reader not to connect our vague (common sense) use of the notion of an element of physical reality with the EPR sufficient condition to be an element of reality [1].

b Nevertheless, we must not forget that the human factor played a large role in the expansion of the (presently dominating) model of physical reality based on real numbers. At the beginning, Newton's analysis was propagated as a kind of religion. There were (in particular in France) divine services devoted to Newton's analysis.

It must be underlined that in modern physics the real physical reality (i.e., reality based on $\rho_R$-stability) is in fact identified with the whole of physical reality.

On the other hand, modern mathematics is no longer just real analysis. In particular, the development of general topology [2], [3] induced a large spectrum of new nearness (in particular, metric) structures. In principle, we need no longer identify every stabilization with $\rho_R$-stabilization. There appears a huge set of new possibilities to introduce new forms of stability in physical experiments. Moreover, new stable structures can be considered as new elements of physical reality that, in general, need not belong to the standard real reality.

This idea was presented for the first time in the author's investigations [4], [5] on so-called p-adic physics [6]-[10]. Later we tried to find the place of p-adic probabilities in quantum physics [11], [12] (in particular, to justify on the mathematical level of rigor the use of negative and complex probabilities, as well as to create models with hidden variables that do not produce Bell's inequality). In this paper we give a brief introduction to these probabilistic models and present a few rather natural examples in which relative frequencies of events stabilize with respect to the so-called p-adic metric but fluctuate with respect to $\rho_R$. There is no corresponding element of the real reality, but there is an element of the p-adic reality. The objects considered in the examples could be created on the hard level. In particular, to create a plantation in which the colour of a flower (red or white) is an element of p-adic reality, I need just a tractor and a (sufficiently large) piece of land. Nevertheless, I must agree that such p-adic elements of reality have never been observed in naturally created physical objects.

The reader may be interested in the reasons why we concentrate on statistical stabilization with respect to the p-adic numbers, that is, on p-adic frequency probability theory. The main reason is that p-adic numbers are in fact the unique alternative to real numbers: there is no other possibility to complete the field of rational numbers and obtain a new number field (Ostrowski's theorem; see, for example, [13], [14]).

Our probabilistic foundations are based on a generalization of R. von Mises' frequency theory of probability [15], [16]. At the beginning of the twentieth century, when the foundations of modern probability theory were being laid, the frequency definition of probability proposed by von Mises played an important role. In particular, it was this definition of probability that Kolmogorov used to motivate his axioms of probability theory (see [17]). We also begin the construction of the new theory of probability with a frequency definition of probability.

Von Mises defined the probability of an event as the limit of the relative frequencies of the occurrence of the event when the volume of the statistical sample tends to infinity. This definition is the foundation of mathematical statistics (see, for example, Cramér [18]), in which von Mises' definition is formulated as the principle of statistical stabilization of relative frequencies.

In this paper we propose a general principle of statistical stabilization of relative frequencies. By virtue of this principle, statistical stabilization of relative frequencies $\nu_N = n/N$ can be considered not only in the real topology on $Q$ (all relative frequencies are rational numbers) but also in any other topology on $Q$. Then the probabilities of events belong to the corresponding completion of the field of rational numbers. As special cases we obtain the ordinary real probability theory (von Mises' definition) and the p-adic probability theories, $p = 2, 3, 5, \ldots$

How should one choose the topology of statistical stabilization for a given statistical sample? The topology is determined by the properties of the studied probability model. In essence, we propose this principle: for each probability model there is a corresponding topology (or topologies) of statistical stabilization.

For example, in a random sample there need not be any statistical stabilization of the relative frequencies in the real metric. Thus, from the point of view of real probability theory, this is not a probabilistic object. However, in this random sample one may observe p-adic statistical stabilization of the relative frequencies.

In essence, I am asserting that the foundation of probability theory is provided by rational numbers (relative frequencies) and not real numbers. Real probabilities of events merely represent one of many possibilities that arise in the statistical analysis of a random sample. Such an approach to probability theory agrees well with Volovich's proposition that rational numbers are the foundation of theoretical physics [19]. In accordance with this proposition, everything physical is rational, and number fields that are different from the field of rational numbers arise as an idealization needed for the theoretical description of physical results.

All necessary information on p-adic (and more general m-adic) numbers can be found in Appendix 1 of this paper. However, in the first two sections they are hardly used at all, and we may restrict ourselves to the remark that, in addition to the completion of the field of rational numbers $Q$ with respect to the real metric, there also exist completions with respect to other metrics, and among these completions are the fields of p-adic numbers $Q_p$, $p = 2, 3, 5, \ldots$

2 Analysis of the foundation of probability theory

2.1 Frequency Definition of Probability

As is well known, the frequency definition of probability proposed by von Mises [15] in 1919 played an important role in the construction of the foundations of modern probability theory. This definition exerted a strong influence on the theory of probability measures, the foundations of which were laid by Borel [20], Kolmogorov [17] and Fréchet [21]. There is no point in giving here Kolmogorov's axioms (which can be found in any textbook on probability theory), but it is probably necessary to recall, in its general features, the main propositions of von Mises' theory of probability. The theory is based on infinite sequences $x = (x_1, x_2, \ldots, x_n, \ldots)$ of samplings or observations. If an experiment having $S$ outcomes is made, then $x_j$ can take the values $1, 2, \ldots, S$ (possible outcomes). For the standard experiment of coin tossing we have $S = 2$ and $x_j = 1, 2$. In what follows, possible outcomes of an experiment will be called labels.

However, not every such sequence is regarded as an object of probability theory. The fundamental principle of the frequency theory of probability is the principle of statistical stabilization of the relative frequencies of occurrence of a particular label, and only sequences of samplings that satisfy this principle are regarded as objects of probability theory. Such sequences of samplings are called collectives.

"A collective is a bulk phenomenon or a repeated process; in brief, a series of individual observations for which one is justified in assuming that the relative frequency of occurrence of each individual observable label tends to a definite limiting value" [16].

The probability of an event $E$ is defined as the limit of the sequence of frequencies $\nu_N = n/N$, where $n$ is the number of cases in which the event $E$ is detected in the first $N$ tests.

For the subsequent considerations it is important to note that in the statistical analysis of the results of an experiment only rational numbers (relative frequencies) are obtained.

The principle of statistical stabilization of the relative frequencies is used practically unchanged in mathematical statistics.

"Observation of the frequency $\nu_N$ of a fixed event $E$ for increasing values of $N$ reveals that this frequency has, generally speaking, a tendency to take a more or less constant value at large $N$" (see Cramér [18]).

In defining a collective, von Mises used a further principle, the principle of irregularity of a sequence of tests, i.e., invariance of the limit of the relative frequencies with respect to the selection, made using a definite law, of some subsequence from a given sequence of tests $x = (x_1, x_2, \ldots, x_n, \ldots)$. It is important that the law of this selection should not be based on the difference of the elements of the sequence with respect to the considered label.

"Second, this limiting value must remain unchanged if from the complete sequence we choose arbitrarily any part and consider in what follows only this part" [16].

This principle, like the principle of statistical stabilization of the relative frequencies, is fully in accord with our intuitive ideas of randomness. However, there are some logical difficulties here, associated with the arbitrariness of the choice. A detailed analysis of these logical problems was made by Khinchin [22]; see also [12] for the details. It appears that one must agree with Khinchin's critical comments and consider the frequency theory of probability that is based only on von Mises' first principle, the principle of statistical stabilization of the relative frequencies.

As is noted in [22], the frequency theory of probability based solely on von Mises' first principle is axiomatized and is as rigorous a mathematical theory as Kolmogorov's theory of probability. Here we do not intend to consider von Mises' theory of probability in the framework of an axiomatic approach. Our task is to analyze the principle of stabilization of the frequencies of occurrence of a particular event in a collective.

2.2 Von Mises' Frequency Theory of Probabilities as Objective Foundation of Kolmogorov's Axiomatics

As motivation for his axioms, Kolmogorov used the properties of limits of relative frequencies; see [17]. We shall be interested in the manner in which Kolmogorov's axiom 2 arose. In accordance with this axiom, the probability $P(E)$ of any event $E$ is a nonnegative real number $\le 1$. In [17] Kolmogorov considers von Mises' definition [16] of probability as the limit of the relative frequencies of occurrence of the event $E$. Further, since the relative frequencies $\nu_N(E) = n/N$ are rational numbers that lie between zero and unity, their limits in the real topology are real numbers between zero and unity. Cramér proceeded similarly in the construction of his theory of probability distributions [18].

Khinchin, discussing the advantages of Kolmogorov's axioms over von Mises' frequency theory of probability, noted that, from the formal aspect, the mutual relationship between the axiomatic and frequency theories is characterized in the first place by a higher degree of abstraction of the former.

This higher degree of abstraction was the foundation of the successful development of the theory of probability measures. However, this degree of abstraction is too high, and some properties of the world of real frequencies are lost in it. Essentially, the rational numbers were lost in Kolmogorov's theory of probability. Whereas in von Mises' theory the rational numbers arise as primary objects and real probabilities are obtained as a result of a limiting process for rational frequencies, in Kolmogorov's theory rational frequencies are secondary objects associated with real probabilities (which are here primary) by means of the law of large numbers.

3 General principle of statistical stabilization of relative frequencies

First we emphasize that the probabilities $P$ in von Mises' frequency theory are ideal objects (symbols to denote the sequences of relative frequencies that are stabilized in the field of real numbers). Therefore, real numbers arise here as ideal objects associated with rational sequences of frequencies (see also Borel [20] and Poincaré [23]).

A basis for a broader view of probability theory is provided by the following principle of statistical stabilization of frequencies:

Statistical stabilization (the limiting process) can be considered not only in the real topology on the field of rational numbers $Q$ but also in any other topology on $Q$. The probabilities of events are defined as the limits of the sequences of relative frequencies in the corresponding completions of the field of rational numbers.

For each considered probability model there is a corresponding topology on the field of rational numbers. The metrizable topologies on $Q$ given by absolute values are the most interesting. By virtue of Ostrowski's theorem there are very few such topologies: besides the usual real topology, for which $\rho_R(x, y) = |x - y|$, there exist only the p-adic topologies, $p = 2, 3, \ldots$, where $\rho_p(x, y) = |x - y|_p$. Thus, if we consider only topologies given by absolute values, then besides the usual probability theory over $R$ we obtain only the probability theories over $Q_p$.

It is necessary here to introduce a natural restriction on the topology of statistical stabilization:

The completion $Q_t$ of the field of rational numbers $Q$ with respect to the statistical stabilization topology $t$ is a topological field.

We have deliberately not introduced this restriction into the general principle of statistical stabilization. One can also consider statistical stabilization topologies that are not consistent with the algebraic structure on $Q$. However, probability theory based on such topologies loses many familiar properties. For example, it turns out that the continuity of the addition operation is equivalent to additivity of probabilities, and continuity of the division operation is equivalent to the existence of conditional probabilities.

Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be some collective. We denote the set of all labels for this collective (possible outcomes of an experiment producing this collective) by the symbol $\Pi$. We denote by $\Omega$ the event consisting in the realization of at least one of the labels $\pi \in \Pi$.

Proposition 3.1 The probability of the event $\Omega$ is equal to unity.

To prove this, it is sufficient to use the fact that all the relative frequencies of $\Omega$ are equal to unity.

Let $\nu^{(j)}$, $j = 1, 2$, be the relative frequencies of realization of certain labels $\pi_1$ and $\pi_2$, and let $P_j = \lim \nu^{(j)}$ be the corresponding probabilities. Let the event $A$ be the realization of the label $\pi_1$ or $\pi_2$: $A = \pi_1 \vee \pi_2$. Using the continuity of the addition operation, we obtain

$$P(A) = \lim \nu^{(A)} = \lim (\nu^{(1)} + \nu^{(2)}) = \lim \nu^{(1)} + \lim \nu^{(2)} = P_1 + P_2. \qquad (1)$$

This rule can be generalized to any number of mutually exclusive events.

Proposition 3.2 Let $A_j$, $j = 1, \ldots, k$, be mutually exclusive events (i.e., the sets of labels that define these events are disjoint). Then

$$P(A_1 \vee \cdots \vee A_k) = \sum_{j=1}^{k} P(A_j). \qquad (2)$$

Using the continuity of the subtraction operation, we obtain the following proposition.

Proposition 3.3 For any two events $A$ and $B$ the equation $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$ holds.

In the language of collectives, the rule of addition of probabilities is formulated as follows (see [16]): "Beginning with an original collective possessing more than two labels, an appreciable number of new collectives can be constructed by uniting labels; the elements of the new collective are the same as in the original one, but their labels are unifications of the labels of the original collective. To the unification of labels there corresponds the addition of frequencies."

We consider the set of rational numbers $U = \{x \in Q : 0 \le x \le 1\}$. We denote by the symbol $U_t$ the closure of the set $U$ in the field $Q_t$ (if $t$ is the ordinary real topology, then $U_t = [0, 1]$). An obvious consequence of the definition of probabilities is the following proposition.

Proposition 3.4 The probability $P(E)$ of any event belongs to the set $U_t$.


Conditional probabilities are then introduced into the frequency theory in the same way as in [16]. Suppose there is some initial collective $x = (x_1, x_2, \ldots, x_n, \ldots)$ with probabilities $P_\pi$ of the labels $\pi \in \Pi$. Using the unification rule, we define the probabilities of all groups of labels:

$$P(A) = \sum_{\pi \in A} P_\pi. \qquad (3)$$

We fix some group of labels $B = \pi_{i_1} \vee \cdots \vee \pi_{i_k}$. We are interested in the conditional probability $P(\pi/B)$, $\pi \in B$, of the label $\pi$ given the condition $B$. We form a new collective $x' = (x'_1, x'_2, \ldots, x'_n, \ldots)$, which is obtained from the original one by choosing only the elements with the labels $\pi \in B$. The probability of the label $\pi$ in this new collective is then called the conditional probability of the label $\pi$ under the condition $B$: $P(\pi/B) = \lim \nu^{(\pi/B)}$, where $\nu^{(\pi/B)}$ are the relative frequencies of the label $\pi$ in the new collective. Noting that $\nu^{(\pi/B)} = \nu^{(\pi)}/\nu^{(B)}$, where $\nu^{(\pi)}$ is the relative frequency of the label $\pi$ in the collective $x$ and $\nu^{(B)}$ is the relative frequency of the event $B$ in the collective $x$, we obtain (using the continuity of the division operation)

$$P(\pi/B) = \lim \frac{\nu^{(\pi)}}{\nu^{(B)}} = \frac{\lim \nu^{(\pi)}}{\lim \nu^{(B)}} = \frac{P(\pi)}{P(B)}, \quad P(B) \ne 0. \qquad (4)$$
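The identity between frequencies that underlies (4) already holds for a finite sample, before any limit is taken. The following is a minimal Python sketch of that check; the sample and the helper names `relative_frequency` and `conditional_frequency` are assumptions made for illustration and do not come from the text.

```python
from fractions import Fraction

def relative_frequency(sample, labels):
    """Exact relative frequency of the event 'the label belongs to `labels`'."""
    hits = sum(1 for x in sample if x in labels)
    return Fraction(hits, len(sample))

def conditional_frequency(sample, label, condition):
    """Relative frequency of `label` in the sub-collective selected by `condition`."""
    selected = [x for x in sample if x in condition]
    return relative_frequency(selected, {label})

# Hypothetical finite sample with labels 1, 2, 3 (illustration only).
sample = [1, 2, 3, 1, 2, 1, 3, 3, 2, 1]
B = {1, 2}

nu_pi = relative_frequency(sample, {1})          # frequency of the label 1
nu_B = relative_frequency(sample, B)             # frequency of the event B
nu_pi_given_B = conditional_frequency(sample, 1, B)

print(nu_pi, nu_B, nu_pi_given_B)
```

Because the frequencies are kept as exact rationals, the ratio can be compared without rounding; passing to the limit is then a matter of the chosen topology on $Q$.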

The general formula can be proved similarly.

Proposition 3.5 $P(A/B) = P(A \wedge B)/P(B)$, $P(B) \ne 0$.

We now introduce the concept of independence of events. Analyzing the arguments in the book [16], one notes that the rule of multiplication of probabilities for independent events is equivalent to the continuity of the multiplication operation.

An important property that makes it possible to use p-adic probabilities when considering standard problems of probability theory is the p-adic interpretation of the probabilities zero and one (which are probabilities in the sense of ordinary probability theory).

Indeed, the equation $P(E) = 0$ in ordinary probability theory does not mean that the event $E$ is impossible. It merely means that in a long series of experiments the event $E$ occurs in a very small fraction of cases. However, in a large number of experiments this fraction can be relatively large. Moreover, the equation $P(E) = 0$ lumps together a huge class of events that intuitively appear to have different probabilities. For example, suppose we consider two events $E_1$ and $E_2$, and in the first

$$N = N_k = \Big( \sum_{j=0}^{k} 2^j \Big)^2 \qquad (5)$$

trials the event $E_1$ is realized $n^{(1)} = 2^k$ times and the event $E_2$ is realized

$$n^{(2)} = \sum_{j=0}^{k} 2^j \qquad (6)$$

times. It is intuitively clear that the probabilities of these events must be different. However, in real probability theory

$$P_1 = \lim n^{(1)}/N = P_2 = \lim n^{(2)}/N = 0. \qquad (7)$$

It is different in 2-adic probability theory. Stabilization in the 2-adic topology gives

$$P_1 = 0, \quad P_2 = -1,$$

since in $Q_2$ we have $2^k \to 0$, $k \to \infty$, and for $-1$ we have the representation $-1 = 1 + 2 + 2^2 + \cdots + 2^n + \cdots$. We here encounter for the first time negative numbers for probabilities of events (compare with Wigner [24], Dirac [25], Feynman [26]; see also [27], [28], [12]). Of course, these probabilities are forbidden by Kolmogorov's second axiom in ordinary probability theory (in von Mises' approach they are forbidden by the choice of the topology of statistical stabilization). However, from the point of view of the frequency theory of probability, $P = -1$ is only an ideal object, the symbol that denotes the limit of a sequence of relative frequencies. This symbol is in no way better and in no way worse than the symbol $P = 1/\pi$ in ordinary probability theory.
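The two limits can be checked numerically with exact rational arithmetic. The sketch below is only an illustration of the example in (5)-(7): the function names `padic_norm` and `frequencies` are assumptions introduced here, and the 2-adic norm is computed directly from the definition in Appendix 1.

```python
from fractions import Fraction

def padic_norm(x: Fraction, p: int) -> Fraction:
    """|x|_p = p**(-r), where x = p**r * m/n and p divides neither m nor n."""
    if x == 0:
        return Fraction(0)
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return Fraction(1, p) ** r

def frequencies(k: int):
    """Relative frequencies of E_1 and E_2 after N_k trials, as in (5)-(6)."""
    s = 2 ** (k + 1) - 1          # 1 + 2 + ... + 2**k
    N = s * s
    return Fraction(2 ** k, N), Fraction(s, N)

for k in (2, 5, 10, 20):
    nu1, nu2 = frequencies(k)
    # Columns: k, real values of both frequencies, 2-adic distance of nu1
    # to 0, 2-adic distance of nu2 to -1.
    print(k, float(nu1), float(nu2),
          padic_norm(nu1, 2), padic_norm(nu2 + 1, 2))
```

In the real metric both columns of frequencies shrink towards zero, while the 2-adic distances to $0$ and to $-1$ shrink as powers of $1/2$, which is the splitting of the real probability zero described above.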

In this example, negative p-adic probabilities were used to split the zero conventional (real) probability. So p-adic negative probabilities can be interpreted as infinitely small conventional probabilities. It may be that all negative probabilities that appear in quantum physics can be interpreted in such a way. If the conventional (real) probability is equal to zero, there is no conventional (real) element of reality. However, there is a nonconventional (p-adic) element of reality that is realized with negative probability. Real and p-adic probabilities correspond to different classes of measurement procedures. An element of reality that would be impossible to observe by using a real measurement procedure might be observed by using a p-adic measurement procedure.

One can treat similarly the case of a probability (in the sense of the ordinary theory) equal to unity. For example, suppose

$$N = N_k = \Big( \sum_{j=0}^{k} 2^j \Big)^2, \quad n^{(1)} = \Big( \sum_{j=0}^{k} 2^j \Big)^2 - 2^k, \quad n^{(2)} = \Big( \sum_{j=0}^{k} 2^j \Big)^2 - \sum_{j=0}^{k} 2^j. \qquad (8)$$

In 2-adic probability theory we find that

$$P_1 = 1, \quad P_2 = 1 - \Big( 1 \Big/ \sum_{j=0}^{\infty} 2^j \Big) = 2. \qquad (9)$$

We see here that natural numbers not equal to unity also belong to the set $U_p$.
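A short worked check of (9), given here as a sketch: write $s_k = \sum_{j=0}^{k} 2^j = 2^{k+1} - 1$ (the abbreviation $s_k$ is introduced only for this computation). Since $s_k$ is odd, $|s_k|_2 = 1$, and

$$\frac{n^{(1)}}{N_k} - 1 = -\frac{2^k}{s_k^2}, \qquad \Big| \frac{n^{(1)}}{N_k} - 1 \Big|_2 = 2^{-k},$$

$$\frac{n^{(2)}}{N_k} - 2 = -1 - \frac{1}{s_k} = -\frac{2^{k+1}}{s_k}, \qquad \Big| \frac{n^{(2)}}{N_k} - 2 \Big|_2 = 2^{-(k+1)}.$$

Both 2-adic distances tend to zero as $k \to \infty$, whereas in the real metric both relative frequencies tend to 1.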

In this example, p-adic (integer) probabilities that are larger than 1 were used to split the conventional (real) probability one. So, under the p-adic consideration, a conventional element of reality can be split into a few p-adic elements of reality.

In the framework of p-adic statistical stabilizations there is also nothing seditious about complex probabilities. For example, let $p \equiv 1 \pmod 4$. Then $i = (-1)^{1/2} \in Q_p$. Let

$$i = i_0 + i_1 p + i_2 p^2 + \cdots, \quad i_r = 0, 1, \ldots, p - 1, \qquad (10)$$

be the canonical decomposition of the imaginary unit in powers of $p$. Note also that for any $p$

$$-1 = (p - 1) + (p - 1) p + (p - 1) p^2 + \cdots \qquad (11)$$

Then for rational relative frequencies we have

$$\nu_k = \frac{i_0 + i_1 p + \cdots + i_k p^k}{(p - 1) + (p - 1) p + \cdots + (p - 1) p^k} \to -i \qquad (12)$$

in the p-adic topology.

Geometrically, one may suppose that the new probability theory is a transition from one-dimensional probabilities on the interval $[0, 1]$ to multidimensional probabilities.
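The digits $i_r$ in (10) can be computed explicitly. The following Python sketch uses Hensel lifting, a standard technique that is not described in the text and is swapped in here for illustration; the names `sqrt_minus_one_mod` and `frequency_pair` are assumptions. It builds the numerator and denominator of (12) for $p = 5$ and checks that the square of the frequency is congruent to $-1$ modulo $p^{k+1}$.

```python
def sqrt_minus_one_mod(p: int, k: int) -> int:
    """Return r with 0 <= r < p**k and r*r = -1 (mod p**k), p prime, p = 1 (mod 4)."""
    if p % 4 != 1:
        raise ValueError("a square root of -1 in Q_p requires p = 1 (mod 4)")
    # Find a root modulo p by brute force (the smaller of the two roots).
    r = next(a for a in range(1, p) if (a * a + 1) % p == 0)
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # Hensel (Newton) step: lift the root to the next power of p.
        r = (r - (r * r + 1) * pow(2 * r, -1, modulus)) % modulus
    return r

def frequency_pair(p: int, k: int):
    """Numerator i_0 + ... + i_k p**k and denominator p**(k+1) - 1 of (12)."""
    n = sqrt_minus_one_mod(p, k + 1)
    N = p ** (k + 1) - 1
    return n, N

p = 5
for k in range(6):
    n, N = frequency_pair(p, k)
    # n/N is an ordinary relative frequency (0 <= n <= N) ...
    assert 0 <= n <= N
    # ... whose square is congruent to -1 modulo p**(k+1).
    assert (n * n + N * N) % p ** (k + 1) == 0
    print(k, n, N)
```

The modular inverse `pow(x, -1, m)` requires Python 3.8 or later. Each lifting step leaves the earlier digits unchanged, which is the digit-by-digit stabilization that the p-adic topology measures.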

4 Probability distribution of a collective

Let $x = (x_1, \ldots, x_k, \ldots)$ be some collective and $\Pi$ be the set of labels of this collective. We consider the simplest case, when the set $\Pi$ is finite: $\Pi = \{1, \ldots, S\}$. We denote by $\nu^{(j)}$ the relative frequency of the $j$th label and by $P_j = \lim \nu^{(j)}$ the corresponding probability. In the frequency theory the set of probabilities $P_x = (P_1, \ldots, P_S)$ is called the probability distribution of the collective $x$.


The general principle of statistical stabilization makes it possible to consider not only real distributions but also distributions for other number fields. For one and the same collective $x$ there can exist distributions over different number fields. Thus, in the proposed approach a collective has, in general, an entire spectrum of distributions $P_{x,t} = (P_{1,t}, \ldots, P_{S,t})$, where $t$ are the topologies of statistical stabilization for the given collective. Thus one studies here a more subtle structure of the collective. The relative frequencies are investigated not only for real stabilization but for a complete spectrum of stabilizations.

In connection with the existence of an entire spectrum of probability distributions of a collective, it is necessary to make some comments.

First, this agrees well with von Mises' principle that the collective comes first and the probabilities after. Indeed, a probability distribution is an object derived from a collective, and to one and the same collective there corresponds an entire spectrum of probability distributions, reflecting different properties of the collective.

Second, each statistical stabilization determines some physical property of the investigated object. For example, if in a statistical experiment involving the tossing of a coin the probability of heads is $P_1$ and of tails is $P_2$, then these probabilities are physical characteristics of the coin, like its mass or volume. This question is discussed in detail in the books of Poincaré [23] and von Mises [16].

If we consider from this point of view the new principle of statistical stabilization, we obtain new physical characteristics of the investigated objects. For example, if in the real topology statistical stabilization is absent, then it is not possible to obtain any physical constants in the language of ordinary probability theory. But these constants could exist and be, for example, p-adic numbers. If a collective has not only a real probability distribution but an entire spectrum of other distributions, then besides real constants corresponding to physical properties of the investigated object we obtain an entire spectrum of new constants corresponding to physical properties that were hidden from the real statistics. Note that these new constants can also be ordinary rational numbers.

5 Model examples of p-adic statistics

5.1 Plantation with Red and White Flowers

As one of the first examples of a collective, von Mises considered [16] a plantation sown with flowers of different colors, and he studied the statistical stabilization of the relative frequencies of each of the colors. We shall construct an analogous collective for which p-adic stabilization always occurs but real stabilization is in general absent.

Suppose there are flowers of two types, red (R) and white (W). The plantation (or rather an infinite bed) is sown in a random order with red and white flowers, the flowers being sown in series formed by blocks of $p^l$ flowers, the length of the series (the power of $p$) also being determined in accordance with a random rule.

Namely, suppose there are two generators of random numbers: 1) $j = 0, 1$; 2) $i = 1, 2$ (with probabilities 0.5). If $j = 0$, then a series of red flowers is sown; if $j = 1$, then a series of white ones. The length of each series is defined as follows: the length of the first series is some power $p^{l_1}$ (it can also be determined in accordance with a random rule); if the length of the previous series was $p^{l_m}$, then the length of the next series is $p^{l_{m+1}}$, $l_{m+1} = l_m + i_m$.

We introduce the relative frequencies of the red and white flowers in the first $m$ series: $\nu_m^{(R)} = n_m^{(R)}/N_m$, $\nu_m^{(W)} = n_m^{(W)}/N_m$, where $n_m^{(R)}$ and $n_m^{(W)}$ are the numbers of red and white flowers and $N_m$ is the total number of flowers in these series.

Proposition 5.1 For all generators of the random numbers $j$ and $i$ there is statistical stabilization of the relative frequencies $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the p-adic topology.

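Thus we have defined the p-adic probabilities $P_R = \lim \nu_m^{(R)}$ and $P_W = \lim \nu_m^{(W)}$, and

$$P_R = \Big( \sum_{n=1}^{\infty} (1 - j_n) p^{l_n} \Big) \Big/ \Big( \sum_{n=1}^{\infty} p^{l_n} \Big), \quad P_W = \Big( \sum_{n=1}^{\infty} j_n p^{l_n} \Big) \Big/ \Big( \sum_{n=1}^{\infty} p^{l_n} \Big). \qquad (13)$$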
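The reason for the p-adic stabilization can be sketched in a few lines. The argument below is supplied for illustration; it uses only the construction of the plantation and the strong triangle inequality of Appendix 1, and the symbol $\varepsilon$ is introduced here. Since $l_1 < l_2 < \cdots$, we have $|N_m|_p = p^{-l_1}$ and $|n_m^{(R)}|_p \le p^{-l_1}$. Put $\varepsilon = 1 - j_{m+1} \in \{0, 1\}$. Then

$$\nu_{m+1}^{(R)} - \nu_m^{(R)} = \frac{n_m^{(R)} + \varepsilon p^{l_{m+1}}}{N_m + p^{l_{m+1}}} - \frac{n_m^{(R)}}{N_m} = \frac{p^{l_{m+1}} (\varepsilon N_m - n_m^{(R)})}{N_m N_{m+1}},$$

$$\big| \nu_{m+1}^{(R)} - \nu_m^{(R)} \big|_p \le \frac{p^{-l_{m+1}} \, p^{-l_1}}{p^{-2 l_1}} = p^{-(l_{m+1} - l_1)} \to 0.$$

Under the strong triangle inequality, a sequence whose consecutive differences tend to zero is a Cauchy sequence, so the relative frequencies converge in $Q_p$, whatever values the generators $j$ and $i$ produce.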

Note that, in general, there is no real statistical stabilization for such a random plantation. If the generator of the random numbers $j$ gives series of 0 or 1, then $\nu_m^{(R)}$ and $\nu_m^{(W)}$ can oscillate in the real topology from zero to unity.

Thus a real observer (an investigator who carries out statistical analysis of the sample in the field of real numbers) cannot obtain any statistically regular law. He will obtain only a random variation of the series of real relative frequencies. In contrast, the p-adic observer (the investigator who makes a statistical analysis of the sample in the field of p-adic numbers) will obtain a well-defined law consisting of the stabilization of the digits in the p-adic decomposition of the relative frequencies.

It is evident that in this example from probability theory we observe a new fundamental approach to the investigation of natural phenomena. In accordance with this approach, experimental results must be analyzed not only in the field of real numbers but also in p-adic fields.

Naturally, our example is purely illustrative, but it does appear to reflect many very important properties of p-adic statistics.


Remark 5.1 Intuitively, one supposes that in a real plantation it is possible to find a white flower next to almost every red flower; in contrast, large groups (clusters) of red and white flowers are distributed randomly over a p-adic plantation (one can sow not only a bed but also distribute series of red and white flowers over a plane in accordance with a random rule). A real random plane is obtained if one throws red and white points at random onto the plane; in contrast, a p-adic random plane is obtained if one throws patches of $p^l$ points at a time, of red and white color, onto the plane.

In Appendix 2 we give the results of a statistical analysis of a random modeling of the proposed probability model on a computer. There is very rapid p-adic stabilization of the relative frequencies and no stabilization in the sense of ordinary real probability theory.

Remark 5.2 Evidently, the structure of series formed by powers of $p$ need not necessarily be directly observed in a statistical sample. This structure is introduced by rounding the number of results to powers of $p$. In very large statistical samples one can take into account only the orders of the numbers, and one thereby introduces into the sample a 10-adic structure.

5.2 Random Choice of the Digit of a p-Adic Number

Suppose there are two labels, 1 and 2, and $j$ is a generator of random numbers corresponding to the choice of one of the labels. Each random label is produced in series, the length of the series being determined by a random choice of the next p-adic digit; i.e., there is a generator of random numbers $a$ that takes the values $a = 0, 1, \ldots, p - 1$, and the length of the next series is $a_n p^{n-1}$, $n = 1, 2, \ldots$ We introduce the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$.

Proposition 5.2 For all generators of the random numbers $j$ and $a$ there is statistical stabilization of the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$ in the p-adic topology.

Thus the following p-adic probabilities are defined:

$$P_1 = \Big( \sum_{n=1}^{\infty} (1 - j_n) a_n p^{n-1} \Big) \Big/ \Big( \sum_{n=1}^{\infty} a_n p^{n-1} \Big), \quad P_2 = \Big( \sum_{n=1}^{\infty} j_n a_n p^{n-1} \Big) \Big/ \Big( \sum_{n=1}^{\infty} a_n p^{n-1} \Big). \qquad (14)$$

In the real topology there is, in general, no statistical stabilization.

Appendix 1

Every rational number $x \ne 0$ can be represented in the form

$$x = p^r \frac{m}{n},$$

where $p$ does not divide $m$ and $n$. Here $p$ is a fixed prime. The p-adic absolute value (norm) of the rational number $x$ is defined by the equations $|x|_p = p^{-r}$ for $x \ne 0$, and $|0|_p = 0$. This absolute value has the usual properties: 1) $|x|_p \ge 0$, and $|x|_p = 0 \Leftrightarrow x = 0$; 2) $|xy|_p = |x|_p |y|_p$; and it satisfies the strong triangle inequality: 3) $|x + y|_p \le \max(|x|_p, |y|_p)$.

The completion of the field of rational numbers with respect to the metric $\rho_p(x, y) = |x - y|_p$ is called the field of p-adic numbers and is denoted by the symbol $Q_p$. It is a locally compact field. Numbers in the unit ball $Z_p = \{x \in Q_p : |x|_p \le 1\}$ of the field $Q_p$ are called p-adic integers. From the strong triangle inequality we obtain a theorem which states that a series in the field $Q_p$ converges if and only if its general term tends to zero. Any p-adic number can be represented in a unique manner in the form of a (convergent) series in powers of $p$:

$$x = \sum_{j=k}^{\infty} a_j p^j, \quad a_j = 0, 1, \ldots, p - 1, \quad k = 0, \pm 1, \ldots, \qquad (15)$$

with $|x|_p = p^{-k}$.
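For a rational number the first digits of the series (15) can be computed directly. The following is a minimal Python sketch, assuming $x \ne 0$ and a prime $p$; the function name `padic_expansion` is an assumption introduced for illustration.

```python
from fractions import Fraction

def padic_expansion(x: Fraction, p: int, count: int):
    """Return (k, digits) with x = sum over j of digits[j] * p**(k + j),
    truncated to `count` digits. Requires x != 0 and p prime."""
    if x == 0:
        raise ValueError("0 has no leading digit")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:           # pull powers of p out of the numerator
        num //= p
        k += 1
    while den % p == 0:           # and out of the denominator
        den //= p
        k -= 1
    modulus = p ** count
    # num/den is now a p-adic unit; reduce it modulo p**count.
    residue = num * pow(den, -1, modulus) % modulus
    digits = []
    for _ in range(count):
        digits.append(residue % p)
        residue //= p
    return k, digits

print(padic_expansion(Fraction(-1), 2, 8))     # -1 = 1 + 2 + 2**2 + ...
print(padic_expansion(Fraction(3, 4), 2, 4))
print(padic_expansion(Fraction(1, 3), 2, 6))
```

The returned exponent $k$ gives the norm through $|x|_p = p^{-k}$, and the digits are listed from the lowest power of $p$ upwards.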

One can define similarly m-adic numbers, where $m$ is any natural number, $m \ge 2$. In the general case, property 2) is replaced by the weaker property $|xy|_m \le |x|_m |y|_m$, i.e., $|x|_m$ is a pseudonorm. The completion of the field $Q$ in the metric $\rho_m(x, y) = |x - y|_m$ will not be a field (for $m$ that are not prime). It is only a ring. Here we already encounter some deviations from the ordinary probability rules (which can be extended without any changes to p-adic probabilities). For example, one can have a situation of the following kind: $A$ and $B$ are independent events, $P(A) \ne 0$ and $P(B) \ne 0$, but $P(A \wedge B) = 0$. In particular, the conditional probability $P(A/B)$ is, in general, not defined for an event $B$ having nonvanishing probability.
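The algebraic mechanism that presumably lies behind such examples is the presence of zero divisors in the ring of m-adic numbers when $m$ is not prime. The Python sketch below illustrates this for $m = 10$ by working modulo $10^k$; it is an illustration under that assumption, not a construction taken from the text, and the name `ten_adic_idempotent` is hypothetical.

```python
def ten_adic_idempotent(k: int) -> int:
    """Return e with 0 < e < 10**k, e = 0 (mod 2**k) and e = 1 (mod 5**k)."""
    two_k, five_k = 2 ** k, 5 ** k
    # Chinese remainder construction: a multiple of 2**k that is 1 modulo 5**k.
    return two_k * pow(two_k, -1, five_k) % (10 ** k)

for k in (1, 3, 6):
    modulus = 10 ** k
    e = ten_adic_idempotent(k)
    f = (1 - e) % modulus
    # e and f are nonzero modulo 10**k, yet their product vanishes.
    assert e != 0 and f != 0 and (e * f) % modulus == 0
    print(k, e, f)
```

The residues for successive $k$ agree in their last digits, so they approximate two nonzero 10-adic integers whose product is zero. With the multiplication rule for independent events, two such values would give a vanishing probability for the joint event.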

Appendix 2

We give here the results of a random experiment (modeled on a computer) for a 2-adic plantation. The results of this experiment give a good illustration of a situation in which there is no statistical stabilization in the real topology but there is statistical stabilization in the 2-adic topology. In the following tables, m is the number of a random experiment in which two random numbers are modeled, one corresponding to the choice of a flower and the other to the length of the series of this flower; d is the number of elements in the sample. Because of the exponential growth of the number of elements in the series, d increases very rapidly.

The table of relative frequencies in the field of real numbers is

m     d        ν(w)     ν(r)
4     10       0.1304   0.8696
5     10^2     0.6364   0.3636
6     10^3     0.1913   0.8087
7     10^3     0.0504   0.9496
12    10^5     0.0006   0.9994
13    10^5     0.5335   0.4665
14    10^6     0.1703   0.8297
22    10^9     0.0022   0.9978
23    10^10    0.7453   0.2547

Thus, for the relative frequencies in the field of real numbers, there is no stabilization of even the first digit after the decimal point. We examined large sequences of experiments on the computer, in which the oscillations continued. The calculations in the field $Q_2$ give the following results:

N = 10
ν(w) = 101011111011000000110100010111011000110011011110110001011
ν(r) = 001100000100111111001011101000100111001100100001001110100

N = 20
ν(w) = 10101111101100111011001100101111110000011100111000000001
ν(r) = 00110000010011000100110011010000001111100011000111111110

N = 30
ν(w) = 101011111011001110110011001111111100000000100110110000011
ν(r) = 001100000100110001001100110000000011111111011001001111100

N = 40
ν(w) = 101011111011001110110011001111111100000000010111001110100
ν(r) = 001100000100110001001100110000000011111111101000110001011

Thus, after ten random experiments, 14 digits are stabilized in the 2-adic decomposition for the relative frequency of occurrence of a red flower and 14 digits for a white flower; after 20 experiments the numbers of digits that are stabilized are 27 for both colors; after 30 experiments, 42 digits are stabilized for each; and so forth.
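The digit counts quoted here can be checked directly from the printed expansions. The following is a minimal sketch, assuming that "stabilized digits" means the length of the common leading block of digits between successive values of N; the helper name `stabilized_digits` is hypothetical. The strings are the first expansion listed for each N in the text.

```python
import os

# First 2-adic expansion listed for each N, copied from the text
expansion = {
    10: "101011111011000000110100010111011000110011011110110001011",
    20: "10101111101100111011001100101111110000011100111000000001",
    30: "101011111011001110110011001111111100000000100110110000011",
    40: "101011111011001110110011001111111100000000010111001110100",
}

def stabilized_digits(s, t):
    """Number of leading digits shared by two digit strings."""
    return len(os.path.commonprefix([s, t]))

for n in (10, 20, 30):
    # Compare the expansion after n experiments with the one after n + 10
    print(n, stabilized_digits(expansion[n], expansion[n + 10]))
```

In the 2-adic metric, agreement of the first s digits means the two numbers differ by at most $2^{-s}$, which is why a growing common prefix expresses convergence.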

Appendix 3

We give the results of analysis of a statistical sample in a field of 5-adic numbers. Here N is the number of random experiments, M is the number of elements of the sample, $M_1$ is the number of elements of the first label, and $M_2$ is the number of elements of the second label.

N = 2:  M1 = 002,  M2 = 00002,  M = 00202
M1/M = 1044004400440044004400440044004400440044004400440044
M2/M = 0010440044004400440044004400440044004400440044004400

N = 3:  M1 = 002,  M2 = 000023,  M = 002023
M1/M = 1040303403420004404141041024440040303403420004404141
M2/M = 0014141041024440040303403420004404141041024440040303

N = 4:  M1 = 00200002,  M2 = 000023,  M = 00202302
M1/M = 1040303004000130020234341334320032124414032304024031
M2/M = 0014141440444314424210103110124412320030412140420413

N = 5:  M1 = 00200002,  M2 = 000023004,  M = 002023024
M1/M = 1040301040132010043322212441423102032221232032034142
M2/M = 0014143404312434401122232003021342412223212412410302

N = 6:  M1 = 00200002,  M2 = 00002300403,  M = 00202302403
M1/M = 1040301003131014113132222240403413222311230303113140
M2/M = 0014143441313430331312222204041031222133214141331304

N = 7:  M1 = 00200002,  M2 = 0000230040303,  M = 0020230240303
M1/M = 1040301003202004101343032004014023441101104433243020
M2/M = 0014143441242440343101412440430421003343340011201424

Thus, in the analysis of the sample in the field of 5-adic numbers, there is rapid stabilization of the digits in the 5-adic decomposition of the relative frequencies. For example, after 55 experiments, 78 digits in the 5-adic decomposition of the relative frequencies are stabilized.

When the sample is analyzed in the field of real numbers, there is again no statistical stabilization.
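The first entry of this appendix (N = 2) can be reproduced with a short Python sketch. It rests on an assumption not stated in the text: that the digit strings are written with the lowest power of 5 first, so that 002 stands for 2·5² = 50. The helper names `padic_digits` and `from_digits` are hypothetical.

```python
from fractions import Fraction

def padic_digits(x, p, n):
    """First n digits a_0, a_1, ... of the p-adic expansion of a rational x.

    Assumes x is a p-adic integer, i.e. p does not divide its denominator.
    """
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x is not a p-adic integer")
    digits = []
    for _ in range(n):
        # a_j is x modulo p: numerator times the modular inverse of the denominator
        a = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(a)
        x = (x - a) / p          # strip the digit and shift by one power of p
    return digits

def from_digits(digits, p):
    """Integer whose base-p digits are given lowest power first (assumed ordering)."""
    return sum(a * p**j for j, a in enumerate(digits))

M1 = from_digits([0, 0, 2], 5)          # "002"   -> 50
M2 = from_digits([0, 0, 0, 0, 2], 5)    # "00002" -> 1250
M = M1 + M2                             # "00202" -> 1300

print("".join(map(str, padic_digits(Fraction(M1, M), 5, 12))))
print("".join(map(str, padic_digits(Fraction(M2, M), 5, 12))))
```

Under that digit-ordering assumption the two printed strings agree with the leading digits listed for N = 2. The three-argument `pow` with a negative exponent needs Python 3.8 or later.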

Acknowledgements

I would like to thank L. Ballentine and J. Summhammer for discussions on p-adic probabilities and elements of physical reality.



COMPLEMENTARITY OR SCHIZOPHRENIA: IS PROBABILITY IN QUANTUM MECHANICS INFORMATION OR ONTA?

A. F. KRACKLAUER
E-mail: kracklau@fossi.uni-weimar.de

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM. The fundamental assumption underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM. In addition, the interplay of this paradigm with Bell's no-go theorem for local realistic extensions of QM will be analyzed.

1 Introduction

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. All other diffractable waves, it may be said, carry momentum-energy, not conceptual, abstract information (ideas). All other probabilities are calculational aids and, like abstractions generally, are utterly unaffected by material boundaries. The literature is replete with resolutions of QM conundrums that selectively ignore one or the other of these characteristics; in the end they all fail.

Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM [1]. The fundamental concept underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM [2]. Among the successes of SED, one is a local realistic explanation of the diffraction of particle beams [3]. The core of this explanation is the notion that relative motion through the SED background effectively engenders de Broglie's pilot wave. Given such a pilot wave associated with a particle's motion, the statistical distribution of momentum in a density over phase space can be decomposed in the sense of Fourier analysis, such that the resulting form of Liouville's Equation, under some conditions, is Schrödinger's Equation.

From this viewpoint the schizophrenic character of wave functions can be discussed and understood free of preternatural attributes. These concepts have broad implications for serious philosophical questions, such as the mind-body dichotomy, through teleportation, to popular science fiction effects. In addition, the peculiar nature of probability in QM is clarified.

Although much remains to be done to comprehensively interpret all of QM in terms of SED, many of the by now hoary paradoxes can be rationally deconstructed.

A secondary (but intimately related) issue is that of determining the import of Bell's Theorem for the use of the SED paradigm to reconcile fully the interpretation of QM. Arguments will be presented showing that in his proof Bell (essentially by misconstruing the use of conditional probabilities) called on inappropriate hypothetical presumptions, just as Hermann, de Broglie, Bohm and others found that Von Neumann did before him [4, 5].

2 De Broglie waves as an SED effect

The foundation of the model, or conceptual paradigm, for the mechanism of particle diffraction proposed herein is Stochastic Electrodynamics (SED). Most of SED, for which there exists a substantial literature, is not crucial for the issue at hand [1]. The crux of SED can be characterized as the logical inversion of QM in the following sense. If QM is taken as a valid theory, then ultimately one concludes that there exists a finite ground state for the free electromagnetic field, with energy per mode given by

$$E = \hbar\omega/2. \qquad (1)$$

SED, on the other hand, inverts this logic and axiomatically posits the existence of a random electromagnetic background field with this same spectral energy distribution, and then endeavors to show that ultimately a consequence of the existence of such a background is that physical systems exhibit the behavior otherwise codified by QM. The motivation for SED proponents is to find an intuitive local realistic interpretation for QM, hopefully to resolve the well known philosophical and lexical problems as well as to inspire new attacks on other problems.

The question of the origin of this electromagnetic background is, of course, fundamental. In the historical development of SED, its existence has been posited as an operational hypothesis whose justification rests a posteriori on results. Nevertheless, lurking on the fringes from the beginning has been the idea that this background is the result of self-consistent interaction, i.e., the background arises out of interactions from all other electromagnetic charges in the universe [6].

For present purposes all that is needed is the hypothesis that particles, as systems with charge structure (not necessarily with a net charge), are in equilibrium with electromagnetic signals in the background. Consider, for example, as a prototype system a dipole with characteristic frequency $\omega$. Equilibrium for such a system in its rest frame can be expressed as

$$m_0 c^2 = \hbar\omega_0. \qquad (2)$$

This statement is actually tautological, as it just defines $\omega_0$, for which an exact numerical value will turn out to be practically immaterial.

This equilibrium in each degree of freedom is achieved in the particle's rest frame by interaction with counter-propagating electromagnetic background signals in both polarization modes separately, which on the average add to give a standing wave with antinode at the particle's position:

$$2\cos(k_0 x)\sin(\omega_0 t). \qquad (3)$$

Again, this is essentially a tautological statement, as a particle doesn't see signals with nodes at its location, thereby leaving only the others. Of course, everything is to be understood in an on-the-average, statistical sense.

Now consider Eq. (3) in a translating frame, in particular the rest frame of a slit through which the particle, as a member of a beam ensemble, passes. In such a frame the component signals under a Lorentz transform are Doppler shifted and then add together to give what appears as modulated waves,

$$2\cos(k_0\gamma(x - c\beta t))\,\sin(\omega_0\gamma(t - c^{-1}\beta x)), \qquad (4)$$

for which the second, the modulation factor, has wave length $\lambda = (\gamma\beta k_0)^{-1}$. From the Lorentz transform of Eq. (2), $P = \hbar\gamma\beta k_0$, the factor $\gamma\beta k_0$ can be identified as the de Broglie wave vector from QM as expressed in the slit frame.
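As a numerical illustration of this identification, the following minimal sketch evaluates the modulation wave number for an electron and compares it with the de Broglie value p/ħ. It assumes $k_0 = \omega_0/c$ (so that Eq. (2) gives $k_0 = m_0 c/\hbar$) and uses rounded CODATA constants; the function name and the chosen speed are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837015e-31   # electron rest mass, kg

def modulation_wave_number(m0, beta):
    """gamma * beta * k0, with k0 = m0 c / hbar (assumes k0 = omega0 / c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    k0 = m0 * C / HBAR
    return gamma * beta * k0

beta = 0.01                                   # illustrative speed v = 0.01 c
gamma = 1.0 / math.sqrt(1.0 - beta**2)
k_mod = modulation_wave_number(M_E, beta)
k_de_broglie = gamma * M_E * beta * C / HBAR  # p / hbar with p = gamma m0 v
print(k_mod, k_de_broglie)
```

The two numbers agree because $\hbar\gamma\beta k_0 = \gamma m_0 \beta c$ is just the relativistic momentum.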

In short, it is seen that a particle's de Broglie wave is modulation on what the orthodox theory designates Zitterbewegung. The modulation wave effectively functions as a pilot wave. Unlike de Broglie's original conception, in which the pilot wave emanates from the kernel, here this pilot wave is a kinematic effect of the particle interacting with the SED Background. Because this SED Background is classical electromagnetic radiation, it will diffract according to the usual laws of optics and thereafter modify the trajectory of the particle with which it is in equilibrium [3]. (See Ref. [1], Section 12.3, for a didactical elaboration of these concepts.)

The detailed mechanism for pilot wave steerage is based on observing that the energy pattern of the actual signal that pilot waves are modulating, and to which a particle tunes, comprises a fence or rake-like structure with prongs of varying average heights specified by the pilot wave modulation. These prongs in turn can be considered as forming the boundaries of energy wells in which particles are trapped: a series of micro-Paul-traps, as it were. Intuitively it is clear that where such traps are deepest, particles will tend to be captured and dwell the longest. The exact mechanism moving and restraining particles is radiation pressure, but not as given by the modulation, rather by the carrier signal itself. Of course, because these signals are stochastic, well boundaries are bobbing up and down somewhat, so that any given particle, with whatever energy it has, will tend to migrate back and forth into neighboring cells as boundary fluctuations permit. Where the wells are very shallow, however, particles are laterally (in a diffraction setup, say) unconstrained; they tend to vacate such regions and therefore have a low probability of being found there.

The observable consequence of the constraints imposed on the motion of particles is a microscopic effect which can be made manifest only in the observation of many similar systems. For illustration, consider an ensemble of similar particles comprising a beam passing through a slit. Let us assume that these particles are very close to equilibrium with the background, that is, that any effects due to the slit can be considered as slight perturbations on the systematic motion of the beam members.

Given this assumption, each member of the ensemble, with index n, say, will with a certain probability have a given amount of kinetic energy $E_n$ associated with each degree of freedom. Of special interest here is the direction perpendicular to both the beam and the slit, in which, by virtue of the assumed state of near equilibrium with the background, we can take the distribution with respect to energy of the members of the ensemble to be given in the usual way by the Boltzmann factor $e^{-\beta E_n}$, where $\beta$ is the reciprocal of the product of the Boltzmann constant k and the temperature T in kelvin. The temperature in this case is that of the electromagnetic background serving as a thermal bath for the beam particles, with which it is in near equilibrium.

Now the relative probability of finding any given particle, i.e., with energy $E_{n,j}$ or $E_{n,k}$ or ..., trapped in a particular well will be, according to elementary probability, proportional to the sum of the probabilities of finding particles with energy less than the well depth d:

$$\sum_{E_n < d} e^{-\beta E_n} \approx \int_0^d e^{-\beta E}\,dE = \frac{1}{\beta}\left(1 - e^{-\beta d}\right), \qquad (5)$$

where approximating the sum with an integral is tantamount to the recognition that the number of energy levels, if not a priori continuous, is large with respect to the well depth.

If now d in Eq. (5) is expressed as a function of position, we get the probability density as a function of position. For example, for a diffraction pattern from a single slit of width a at distance D, the intensity (essentially the energy density) as a function of lateral position is $E_0\sin^2(\theta)/\theta^2$, where $\theta = k_{\text{pilot wave}}(a/D)\,y$, and the probability of occurrence $P(\theta(y))$ as a function of position would be

$$P(y) \propto \left(1 - e^{-\beta E_0 \sin^2(\theta)/\theta^2}\right). \qquad (6)$$

Whenever the exponent in Eq. (6) is significantly less than one, its rhs is very accurately approximated by the exponent itself, so that one obtains the standard and verified result that the probability of occurrence, $P(y) = \psi^*\psi$ in conventional QM, is proportional to the intensity of a particle's de Broglie (pilot) wave.
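How closely Eq. (6) follows the wave intensity when the exponent is small can be seen in a minimal numerical sketch. The value chosen for the product $\beta E_0$ is hypothetical, and the function names are illustrative only.

```python
import math

def intensity(theta, e0=1.0):
    """Single-slit energy density E0 sin^2(theta)/theta^2 (limit E0 at theta = 0)."""
    if theta == 0.0:
        return e0
    return e0 * (math.sin(theta) / theta) ** 2

def occupation(theta, beta_e0):
    """Unnormalized Eq. (6): 1 - exp(-beta E0 sin^2(theta)/theta^2)."""
    return -math.expm1(-beta_e0 * intensity(theta))

beta_e0 = 0.05   # hypothetical small value of beta * E0
for theta in (0.0, 1.0, 2.0, 4.0):
    exact = occupation(theta, beta_e0)
    linear = beta_e0 * intensity(theta)      # the exponent itself
    print(f"theta = {theta:3.1f}   Eq. (6) = {exact:.6f}   exponent = {linear:.6f}")
```

Since $x - x^2/2 \leq 1 - e^{-x} \leq x$ for $x \geq 0$, the difference between the two columns is bounded by half the square of the exponent, which is the sense in which the approximation is accurate.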

3 Schrodinger Equation

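A consequence of the attachment of a De Broglie pilot wave to each particle is that there exists a Fourier kernel of the following form:

$$e^{i2\mathbf{p}\cdot\mathbf{x}'/\hbar}, \qquad (7)$$

which can be used to decompose the density function of an ensemble of similar particles. Consider an ensemble governed by the Liouville Equation:

$$\frac{\partial\rho}{\partial t} = -\nabla\rho\cdot\frac{\mathbf{p}}{m} - (\nabla_p\rho)\cdot\mathbf{F}. \qquad (8)$$

Now decompose $\rho(\mathbf{x}, \mathbf{p}, t)$ with respect to $\mathbf{p}$ using the De Broglie-Fourier Kernel,

$$\hat\rho(\mathbf{x}, \mathbf{x}', t) = \int e^{i2\mathbf{p}\cdot\mathbf{x}'/\hbar}\,\rho(\mathbf{x}, \mathbf{p}, t)\,d\mathbf{p}, \qquad (9)$$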

Figure 1. A simulated single-slit neutron diffraction pattern (relative intensity versus lateral displacement theta in radians; curves: particle beam, radiation, and chi-squared ×50), showing the closeness of the fit of Eq. (6) to the pure wave diffraction pattern. See Ref. [3] for details.

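to transform the Liouville Equation into

$$\frac{\partial\hat\rho}{\partial t} = -\frac{\hbar}{i2m}\,\nabla'\cdot\nabla\hat\rho + \frac{i2}{\hbar}\,(\mathbf{x}'\cdot\mathbf{F})\,\hat\rho. \qquad (10)$$

To solve, separate variables using

$$\mathbf{r} = \mathbf{x} + \mathbf{x}', \qquad \mathbf{r}' = \mathbf{x} - \mathbf{x}', \qquad (11)$$

to get

$$i\hbar\frac{\partial\hat\rho}{\partial t} = -\frac{\hbar^2}{2m}\nabla_r^2\hat\rho + \frac{\hbar^2}{2m}\nabla_{r'}^2\hat\rho - (\mathbf{r} - \mathbf{r}')\cdot\mathbf{F}\,\hat\rho, \qquad (12)$$

which can (sometimes) be separated by writing

$$\hat\rho(\mathbf{r}, \mathbf{r}') = \psi^*(\mathbf{r}')\,\psi(\mathbf{r}) \qquad (13)$$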

to get Schrödinger's Equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \qquad (14)$$

4 Conclusions

Within this paradigm, Quantum Mechanics is incomplete, as surmised by Einstein, Podolsky and Rosen [4]. It is built on the basis of the Liouville Equation while taking a particular stochastic background into account. The conceptual function of Probability in QM is just as in Statistical Mechanics. Measurement reduces ignorance; it does not precipitate reality. Of course, measurement also disturbs the measured system, but this presents no more fundamental problems than it does in classical physics. Heisenberg uncertainty, on the other hand, is seen to be caused simply by the incessant dynamical perturbation from background signals. In so far as the source of background signals cannot be isolated, this source of uncertainty is intrinsic but not fundamentally novel. For these reasons duality is superfluous. Particles have the same ontological status as in classical physics. Individual particles in a beam pass through one or the other slit in a Young double slit experiment, for example, while their De Broglie piloting waves pass through both slits. Beyond the slit, the particles are induced stochastically to track the nodes of their pilot waves, so that a diffraction pattern is built up mimicking the intensity of the pilot wave.

From within this paradigm, the now infamously paradoxical situations illustrating various problems with the interpretation of QM never arise, or are resolved with elementary reasoning. In particular, wave functions are not vested with an ambiguous nature.

The SED Paradigm also clarifies the appearance of interference among probabilities. Numerous analysts from various viewpoints have discovered the fact that Probability Theory admits structure (used by QM) that goes unexploited in traditional applications. (E.g., see Gudder, Summhammer, this volume.) While each of these approaches provides deep and surprising insights, none really offers any explanation of why and how nature exploits this structure. Just as a certain second order hyperbolic partial differential equation becomes the wave equation as a physics statement only with the introduction, e.g., of Hooke's Law, so this extra probability structure can be made into physics only with an analogue to Hooke's Law.

SED provides that analogue for particle behavior with its model of pilot wave guidance. In this model radiation pressure is responsible for particle guidance [3]. Radiation pressure is proportional to the square of EM fields, i.e., the intensity (in this case of the background field as modified by objects in the environment), which is not additive. Rather, the field amplitudes are additive, and interference arises in the way well understood in classical EM. In other words, QM interference is a manifestation of EM interference. The relevant Hooke's Law analogue is the phenomenon of radiation pressure. For radiation, this is all intimately related, of course, to classical coherence theory as applied to square law photoelectron detectors, which, when properly applied, resolves many QM conundrums, including those instigated by Bell's Theorem surrounding EPR correlations.

Appendix Bells Theorem

The interpretation or paradigm described herein conflicts with the conclusions of Bell's no-go theorem, according to which a local realistic extension of QM should conform with certain restraints that have been shown empirically to be false. To be sure, this paradigm does not deliver the hidden variables for exploitation in calculations, but it does indicate to which features in the universe they pertain: namely, all other charges. The character of these hidden variables is dictated by the fact that they are distinguished only in that they pertain to particles distant from the system of particular interest; thus, internal consistency requires that they be local and realistic [8].

The basic proof

Bell's Theorem purports to establish certain limitations on coincidence probabilities of spin or polarization measurements as calculated using QM if they are to have an underlying deterministic, but still local and realistic, basis describable by extra, as yet hidden, variables $\lambda$ distributed with a density $\rho(\lambda)$. These limitations take the form of inequalities which measurable coincidences must respect. The extraction of one of these inequalities, where the input assumptions are enumerated as Bell made them, proceeds as follows.

Bell's fundamental Ansatz consists of the following equation:

$$P(a, b) = \int d\lambda\,\rho(\lambda)\,A(a, \lambda)\,B(b, \lambda), \qquad (15)$$

where, per explicit assumption, A is not a function of b, nor B of a. This he motivated on the grounds that a measurement at station A, if it respects locality, cannot depend on remote conditions such as the settings of a distant measuring device, i.e., b. In addition, each by definition satisfies

$$|A| \leq 1, \qquad |B| \leq 1. \qquad (16)$$

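Eq. (15) expresses the fact that when the hidden variables are integrated out, the usual results from QM are recovered.

The extraction proceeds by considering the difference of two such coincidence probabilities where the parameters of one measuring station differ:

$$P(a, b) - P(a, b') = \int d\lambda\,\rho(\lambda)\left[A(a, \lambda)B(b, \lambda) - A(a, \lambda)B(b', \lambda)\right], \qquad (17)$$

to which zero in the form

$$A(a, \lambda)B(b, \lambda)A(a', \lambda)B(b', \lambda) - A(a, \lambda)B(b', \lambda)A(a', \lambda)B(b, \lambda) \qquad (18)$$

is added to get

$$P(a, b) - P(a, b') = \int d\lambda\,\rho(\lambda)\,A(a, \lambda)B(b, \lambda)\left(1 \pm A(a', \lambda)B(b', \lambda)\right) - \int d\lambda\,\rho(\lambda)\,A(a, \lambda)B(b', \lambda)\left(1 \pm A(a', \lambda)B(b, \lambda)\right), \qquad (19)$$

which, upon taking absolute values, Bell wrote as

$$|P(a, b) - P(a, b')| \leq \int d\lambda\,\rho(\lambda)\left(1 \pm A(a', \lambda)B(b', \lambda)\right) + \int d\lambda\,\rho(\lambda)\left(1 \pm A(a', \lambda)B(b, \lambda)\right). \qquad (20)$$

Then, using Eq. (15), the Ansatz, and the normalization $\int d\lambda\,\rho(\lambda) = 1$, one gets

$$|P(a, b) - P(a, b')| + |P(a', b') + P(a', b)| \leq 2, \qquad (21)$$

a Bell inequality [9].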

Now, if the QM result for these coincidences, namely $P(a, b) = -\cos(2\theta)$, is put in Eq. (21), it will be found that for $\theta = \pi/8$ the lhs of Eq. (21) becomes $2\sqrt{2}$. Experiments verify this result [10]. Why the discrepancy? According to Bell, it must have been induced by demanding locality, as all else he took to be harmless.
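The quoted value can be reproduced with a few lines of Python. This is a sketch under the assumption that θ in the QM expression is the difference of the two analyzer angles and that successive settings are spaced by π/8; the function names are hypothetical.

```python
import math

def qm_correlation(a, b):
    """QM coincidence result P(a, b) = -cos(2 theta), with theta = a - b (assumed)."""
    return -math.cos(2.0 * (a - b))

def bell_lhs(a, a_prime, b, b_prime):
    """|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)|, the left side of Eq. (21)."""
    return (abs(qm_correlation(a, b) - qm_correlation(a, b_prime))
            + abs(qm_correlation(a_prime, b_prime) + qm_correlation(a_prime, b)))

step = math.pi / 8
# Settings a = 0, b = pi/8, a' = pi/4, b' = 3 pi/8
value = bell_lhs(0.0, 2 * step, step, 3 * step)
print(value, 2 * math.sqrt(2))
```

With these settings each absolute value contributes $\sqrt{2}$, so the sum exceeds the bound of 2 in Eq. (21).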


Critiques

Although Bell's analysis is denoted a theorem, in fact there can be no such thing in Physics; the axiomatic base on which to base a theorem consists of those fundamental theories which the whole enterprise is endeavoring to reveal. Moreover, buried in all mathematics pertaining to the physical world are numerous unarticulated assumptions, some of which are exposed below.

The analytical character of dichotomic functions

In motivating his discussion of the extraction of inequalities, Bell considered the measurement of spin using Stern-Gerlach magnets, or polarization measurements of photons. In both cases single measurements can be seen as individual terms in a symmetric dichotomic series, i.e., having the values ±1. It is therefore natural to ask if the correlation computed using QM, $P(a, b) = -\cos(2\theta)$, and verified empirically, can be the correlation of dichotomic functions. It is easy to show that it cannot be so; consider

$$-\cos(2\theta) = k\int P(x - \theta)\,P(x)\,dx, \qquad (22)$$

where $\rho(\lambda)$ is $k/2\pi$ and where the Ps are dichotomic functions. Now take the derivative with respect to $\theta$ to get

$$2\sin(2\theta) = \int \sum_j \delta(x - \theta_j)\,P(x)\,dx = \sum_j P(\theta_j) = k, \qquad (23)$$

and again:

$$4\cos(2\theta) = 0, \qquad (24)$$

which is false. QED.

Some authors (see, e.g., Aerts, this volume) employ a parameterized dichotomic function to represent measurements. Such a function can be dichotomic in the argument but continuous in the parameter, e.g., of the form $D(\sin(t) - x)$, for which then the correlation is taken to be of the form

$$\mathrm{Corr}(t) = \int_{-\pi}^{\pi} D(x - \sin(2t))\,D(x)\,dx. \qquad (25)$$

However, this approach seems misguided. First, it assumes that the argument of Corr, t, can be identical to the parameter of the dichotomic function $D_t(x)$ rather than the offset in the argument, here x, as befitting a correlation. Moreover, the same sort of consistency test applied above also results in contradictions; therefore, such parameterized functions do not constitute counterexamples invalidating the claim that discontinuous functions cannot have a harmonic correlation. At best, this tactic implicitly results in the correlation of the measurement functions with respect to the continuous parameter t, which is interpreted as the weight or frequency of the dichotomic value. This tactic, however, does not conform with Bell's analysis, in which the dichotomic values are to be correlated; rather, it corresponds with the type of model proposed below, without, however, recognizing Malus' Law as the source of the weights.

Conclusion: There is a fundamental error in Bell's analysis; the QM result is at irreconcilable odds with the conventional understanding of his arguments [11].
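The claim that a correlation of dichotomic functions cannot be harmonic can be illustrated numerically. The sketch below uses one particular dichotomic function, a square wave of period π, chosen here purely as an example (it is not taken from the text), and estimates its offset correlation with a midpoint rule.

```python
import math

def square_wave(x):
    """An example dichotomic function: values +1 and -1, period pi."""
    return 1.0 if math.cos(2.0 * x) >= 0.0 else -1.0

def correlation(theta, steps=20000):
    """Midpoint-rule estimate of (1/pi) * integral over [0, pi] of P(x - theta) P(x) dx."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += square_wave(x - theta) * square_wave(x)
    return total * h / math.pi

for theta in (0.0, math.pi / 8, math.pi / 4):
    print(f"{theta:.4f}  dichotomic: {correlation(theta):+.4f}  harmonic: {math.cos(2 * theta):+.4f}")
```

For this example the correlation falls linearly with the offset (a triangle wave), so its second derivative vanishes wherever it exists, in line with Eq. (24), whereas the harmonic form is curved everywhere.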

This can be revealed alternately, following Sica, by considering four dichotomic sequences (with values ±1 and length N), $a$, $a'$, $b$ and $b'$, and the following two quantities: $a_i b_i + a_i b'_i = a_i(b_i + b'_i)$ and $a'_i b_i - a'_i b'_i = a'_i(b_i - b'_i)$. Sum these expressions over i, divide by N, and take absolute values before adding together to get

$$\left|\frac{1}{N}\sum_i^N a_i b_i + \frac{1}{N}\sum_i^N a_i b'_i\right| + \left|\frac{1}{N}\sum_i^N a'_i b_i - \frac{1}{N}\sum_i^N a'_i b'_i\right| \leq \frac{1}{N}\sum_i^N |a_i|\,|b_i + b'_i| + \frac{1}{N}\sum_i^N |a'_i|\,|b_i - b'_i|. \qquad (26)$$

The rhs equals 2, so this is a Bell Inequality. Conclusion: this Bell Inequality is an arithmetic identity for dichotomic sequences; there is no need to postulate locality in order to extract it [12].
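The arithmetic character of this bound can be checked on arbitrary ±1 sequences. The following is a minimal sketch with randomly generated sequences; the function name `bell_sum` and the sequence length are illustrative choices.

```python
import random

def bell_sum(a, a_prime, b, b_prime):
    """Left side of Eq. (26) for four dichotomic sequences of equal length."""
    n = len(a)
    first = abs(sum(x * (y + z) for x, y, z in zip(a, b, b_prime))) / n
    second = abs(sum(x * (y - z) for x, y, z in zip(a_prime, b, b_prime))) / n
    return first + second

rng = random.Random(0)
n = 1000
# Four independent random sequences with values +1 or -1
a, a_prime, b, b_prime = ([rng.choice((-1, 1)) for _ in range(n)] for _ in range(4))
print(bell_sum(a, a_prime, b, b_prime) <= 2)
```

No assumption about how the sequences were produced enters: for each i, one of $|b_i + b'_i|$ and $|b_i - b'_i|$ is 2 and the other is 0, which is all the bound uses.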

Discrete vice continuous variables

By implication, Bell considered discrete variables, for which the correlation would be

$$\mathrm{Cor}(a, b) = \frac{1}{N}\sum_i^N A_i(a)\,B_i(b). \qquad (27)$$

But experiments measure the number of hits per unit time, given a and b, and then compute the correlation; each event is a density, not a single pair. The data taken in experiments correspond to the read-out for Malus' Law, not the generation of dichotomic sequences for which each term represents an event consisting of a pair of photons with anticorrelated polarization, or a particle pair with anticorrelated spins. This discrepancy is ignored in the standard renditions of Bell's analysis. It is, however, serious, and suggests a different tack.

Consider, following Barut, a model for which the spin axes of pairs of particles have random but totally anticorrelated instantaneous orientation: $\mathbf{S}_1 = -\mathbf{S}_2$ [13]. Each particle then is directed through a Stern-Gerlach magnetic field with orientation a and b, respectively. The observable in each case then would be $A = \mathbf{S}_1\cdot\mathbf{a}$ and $B = \mathbf{S}_2\cdot\mathbf{b}$. Now, by standard theory,

$$\mathrm{Cor}(A, B) = \frac{\langle AB\rangle - \langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \qquad (28)$$

where the angle brackets indicate averages over the range of the variables. This becomes

$$\mathrm{Cor}(A, B) = -\frac{\int d\gamma\,\sin(\gamma)\cos(\gamma - \theta)\cos(\gamma)}{\sqrt{\left(\int d\gamma\,\sin(\gamma)\cos^2(\gamma)\right)^2}}, \qquad (29)$$

which evaluates to $-\cos(\theta)$, i.e., the QM result for spin state correlation. Conclusion: this model, essentially a counterexample to Bell's analysis, shows that continuous functions (vice dichotomic) work. It is more than just natural to ask: where do the gremlins reside in Bell's analysis? There are at least two.

One has to do with the following covert hypothesis. Bell's proof seems to pertain to continuous variables, in that the demand is only that $|A|, |B| \leq 1$. This argument, however, silently also assumes that the averages $\langle A\rangle = \langle B\rangle = 0$. It enters in the derivation of a Bell inequality, where the second term above is ignored as if it were always zero. When it is not zero, Bell inequalities become, e.g.,

$$|P(a, b) - P(a, b')| + |P(a', b') + P(a', b)| \leq 2 + \frac{2\langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \qquad (30)$$

which opens up a broader category of non-quantum models. A second covert gremlin, having broader significance, is discussed below.

Are nonlocal correlations essential

The demand that, in spite of the introduction of hidden variables $\lambda$, a probability P(a, b) averaged over these extra variables reduce to currently used QM expressions implies that

$$P(a, b) = \int P(a, b, \lambda)\,d\lambda. \qquad (31)$$

By basic probability theory, the integrand in this equation is to be decomposed in terms of individual detections in each arm according to Bayes' formula,

$$P(a, b, \lambda) = P(\lambda)\,P(a|\lambda)\,P(b|a, \lambda), \qquad (32)$$

where $P(a|\lambda)$ is a conditional probability. In turn, the integrand above can be converted to the integrand of Bell's Ansatz,

$$P(a, b) = \int A(a, \lambda)\,B(b, \lambda)\,\rho(\lambda)\,d\lambda,$$

iff

$$P(b|a, \lambda) = P(b|\lambda) \quad \forall a. \qquad (33)$$

This equation admits, it seems, two interpretations:

(i) When this equation is true, the ratio of occurrence of outcomes at station B must be statistically independent of the outcomes at A. Therefore, as the hidden variables $\lambda$ are extra and do not duplicate a and b, even if the correlation is considered to be encoded by a $\lambda$, it will not be available to an observer. But the correlation by hypothesis does exist and is to be detectable via the a's and b's; therefore this equation cannot hold. Thus, within this interpretation, Bell's Ansatz is not internally consistent.

(ii) Alternately, if the a on the lhs is superfluous, so is b, so that $P = P(\lambda) = 0$ except at one value of $\lambda$, where it equals 1, or is a Dirac delta function. That is, the correlation is totally encoded by the hidden variables, as follows if a sufficient number of new variables are introduced to render everything deterministic, as often assumed. Consequently, individual products of probabilities at the separate stations, i.e., AB's in Bell's notation, become Dirac delta functions of the $\lambda$. If everything is deterministic, then there can be no overlap of the non-zero values of pairs of probabilities for a given value of $\lambda$, and therefore, in the extraction of a Bell inequality, all quadruple products of P's with pair-wise different values of $\lambda$ in Eq. (19) are identically zero, so that the final form of a Bell inequality is the trivial identity

$$|P(a, b) - P(a, b')| \leq 2. \qquad (34)$$

In either case, locality is not to be so employed as to exclude correlations generated at the conception of the spin-particles or photon pairs, i.e., common causes. The non-existence of instantaneous communication cannot impose a restraint here; it must bear no relationship to the validity of Eq. (33).

In addition, Eq. (34) reconciles Barut's continuous variable model with Bell's analysis.

Bell-Kochen-Specker Theorem

Besides Bell's original theorem, there is another set of no-go theorems ostensibly prohibiting a local realistic extension for QM. In contrast to the theorem analyzed above, they do not make explicit use of locality; rather, they use certain properties (falsely, it turns out) of angular momentum (spin). In general, the proof of these theorems proceeds as follows. The system of interest is described as being in a state $|\psi\rangle$ specified by observables A, B, C, ... A hidden variable theory is then taken to be a mapping v of observables to numerical values v(A), v(B), v(C), ... Use is then made of the fact that if a set of operators all commute, then any relation among these operators, f(A, B, C, ...) = 0, will also be satisfied by their eigenvalues: f(v(A), v(B), v(C), ...) = 0.

The proof of a Kochen-Specker theorem proceeds by displaying a contradiction. Consider, e.g., two spin-1/2 particles, for which the nine separate mutually commuting operators can be arranged in the following 3 by 3 matrix:

$$\begin{array}{ccc} \sigma_x^1 & \sigma_x^2 & \sigma_x^1\sigma_x^2 \\ \sigma_y^2 & \sigma_y^1 & \sigma_y^1\sigma_y^2 \\ \sigma_x^1\sigma_y^2 & \sigma_x^2\sigma_y^1 & \sigma_z^1\sigma_z^2 \end{array} \qquad (35)$$

It is then a little exercise in bookkeeping to verify that any assignment of plus and minus ones for each of the factors in each element of this matrix results in a contradiction; namely, the product of all these operators formed row-wise is plus one and the same product formed column-wise is minus one.14
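The bookkeeping can be checked by brute force. The following is a minimal sketch, assuming the usual form of the constraints for such a 3 by 3 array (every row multiplies to +1, the first two columns to +1 and the last column to -1); the names `ROW_PRODUCTS`, `COL_PRODUCTS` and `consistent_assignments` are illustrative only.

```python
import itertools

# Assumed constraints: each row of the 3x3 array multiplies to +1,
# the columns multiply to +1, +1 and -1 respectively.
ROW_PRODUCTS = (1, 1, 1)
COL_PRODUCTS = (1, 1, -1)


def consistent_assignments():
    """Return every +/-1 assignment to the nine entries that meets all six constraints."""
    found = []
    for values in itertools.product((1, -1), repeat=9):
        grid = [values[0:3], values[3:6], values[6:9]]
        rows_ok = all(
            grid[r][0] * grid[r][1] * grid[r][2] == ROW_PRODUCTS[r] for r in range(3)
        )
        cols_ok = all(
            grid[0][c] * grid[1][c] * grid[2][c] == COL_PRODUCTS[c] for c in range(3)
        )
        if rows_ok and cols_ok:
            found.append(grid)
    return found


print(len(consistent_assignments()))
```

The search comes back empty because multiplying the three row constraints gives +1 for the product of all nine values, while multiplying the three column constraints gives -1 for the same product.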

Now recall that, given a uniform static magnetic field B in the z-direction, the Hamiltonian is $H = \frac{e}{mc} B \sigma_z$, for which the time-dependent solution of the Schrödinger equation is

$$\psi(t) = \frac{1}{\sqrt{2}} \begin{bmatrix} e^{-i\omega t} \\ e^{+i\omega t} \end{bmatrix},$$

and this in turn gives time-dependent expectation values for spin values in the x, y directions:15

$$\langle\sigma_x\rangle = \frac{\hbar}{2}\cos(\omega t), \qquad \langle\sigma_y\rangle = \frac{\hbar}{2}\sin(\omega t), \qquad (36)$$

where $\omega = eB/mc$.


Proof of a Bell-Kochen-Specker theorem depends on simultaneously assigning the eigenvalues $\pm 1$ to $\sigma_x$, $\sigma_y$ and $\sigma_z$ as measurables for each particle. (With some effort, for all other proofs of this theorem, one can find an equivalent assumption.) However, as Barut13 observed, and as can be seen in Eq. (36), if the eigenvalues $\pm 1$ are realizable measurement results in the B-field direction, then in the other two directions the expectation values oscillate out of phase and therefore cannot be simultaneously equal to $\pm 1$. Thus, this variation of a Bell theorem also is defective physics.

A local model for EPR (polarization) Correlations

The following model incorporates the features of polarization correlations without preternatural aspects or the concept of photon. The basic assumption is that the source emits oppositely directed, anticorrelated classical electromagnetic signals

$$E_A = \hat{x}\cos(\nu) + \hat{y}\sin(\nu), \qquad E_B = -\hat{x}\sin(\nu + \theta) + \hat{y}\cos(\nu + \theta), \qquad (37)$$

where factors of the form $\exp(i(\omega t + k \cdot x + \xi(t)))$, where $\xi(t)$ is a random variable, are dropped as they are suppressed by averaging.16 Now, the random variables with physical significance emerging in the detectors, per Malus' Law, are $E_{A,B}$. It is the detectors that digitize the data and create the illusion of photons. But because Maxwell's equations are not linear in intensities, but rather in the fields, a fourth-order field correlation is required to calculate the cross correlation of the intensity:

$$P(a, b) = \kappa \langle (A \cdot B)(B \cdot A) \rangle, \qquad (38)$$

where brackets indicate averages over space-time. (This appears to be the source of entanglement in QM, which is seen to have no basis beyond that found in classical physics.) Here Eq. (38) turns out to be

$$P(+,+) \propto \kappa \int_0^{2\pi} \left(\cos(\nu)\sin(\nu + \theta) - \sin(\nu)\cos(\nu + \theta)\right)^2 d\nu, \qquad (39)$$

which gives $P(+,+) = P(-,-) \propto \kappa\sin^2(\theta)$ and $P(-,+) = P(+,-) \propto \kappa\cos^2(\theta)$. The constant $\kappa$ can be eliminated by computing the ratio of particular events to the total sample space, which here includes coincident detections in all four combinations of detectors averaged over all possible displacement angles $\theta$; thus the denominator is

$$\frac{\kappa}{\pi} \int_0^{2\pi} \left(\sin^2(\theta) + \cos^2(\theta)\right) d\theta = 2\kappa, \qquad (40)$$


so that the ratio becomes

$$P(+,+) = \tfrac{1}{2}\sin^2(\theta), \qquad (41)$$

the QM result. This in turn yields the correlation

$$\mathrm{Cor}(a, b) = \frac{P(+,+) + P(-,-) - P(+,-) - P(-,+)}{P(+,+) + P(-,-) + P(+,-) + P(-,+)},$$

$$\mathrm{Cor}(a, b) = -\cos(2\theta). \qquad (42)$$
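The step from the coincidence weights to the correlation can be checked numerically. The sketch below is only an illustration under stated assumptions: the like-sign weight uses the integrand of Eq. (39), while the unlike-sign integrand (the same expression with one analyzer rotated by a quarter turn) is an assumption introduced here, as are the function names and the midpoint rule.

```python
import math


def integrate(f, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def coincidence_weights(theta):
    """Unnormalized weights for like-sign and unlike-sign coincidences."""
    same = integrate(
        lambda nu: (math.cos(nu) * math.sin(nu + theta)
                    - math.sin(nu) * math.cos(nu + theta)) ** 2,
        0.0, 2.0 * math.pi)
    # Assumed form for the unlike-sign channels (not written out in the text).
    diff = integrate(
        lambda nu: (math.cos(nu) * math.cos(nu + theta)
                    + math.sin(nu) * math.sin(nu + theta)) ** 2,
        0.0, 2.0 * math.pi)
    return same, diff


def correlation(theta):
    """(P++ + P-- - P+- - P-+) / (P++ + P-- + P+- + P-+); the constant cancels."""
    same, diff = coincidence_weights(theta)
    return (2.0 * same - 2.0 * diff) / (2.0 * same + 2.0 * diff)


for theta in (0.0, math.pi / 8, math.pi / 4, 1.0):
    print(theta, correlation(theta), -math.cos(2.0 * theta))
```

Both integrands are independent of the integration variable, so the quadrature is exact up to rounding and the two printed columns agree.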

If the fundamental assumptions involved in this local realistic model are valid, then there would be observable consequences. For example, if radiation on the other side of a photodetector is continuous and not comprised of photons, then photoelectrons are evoked independently in each detector by continuous but (anti)correlated radiation. Thus, the density of photoelectron pairs should be linearly proportional (barring effects caused by limited coherence) to the coincidence window width. On the other hand, if photons are in fact generated in matched pairs at the source, then at very low intensities the detection rate should be relatively insensitive to the coincidence window width, once it is wide enough to capture both electrons.

1. L. de la Peña and A. M. Cetto, The Quantum Dice (Kluwer, Dordrecht, 1996).
2. A. F. Kracklauer, An Intuitive Paradigm for Quantum Mechanics, Physics Essays 5 (2), 226 (1992).
3. A. F. Kracklauer, Found. Phys. Lett. 12 (5), 441 (1999).
4. G. Hermann, Die Naturphilosophischen Grundlagen der Quantenmechanik, Abhandlungen der Fries'schen Schule 6, 75-152 (1935).
5. D. Bohm, Causality and Chance in Modern Physics (Routledge & Kegan Paul Ltd., London, 1957).
6. H. Puthoff, Phys. Rev. A 40, 4857 (1989); 44, 3385 (1991).
7. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
8. J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge University Press, Cambridge, 1987).
9. J. S. Bell, in Foundations of Quantum Mechanics, Proceedings of the International School of Physics Enrico Fermi, course IL (Academic, New York, 1971), p. 171-181; reprinted in Ref. [8].
10. A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum, New York, 1999), review theory and experiments from a current perspective.
11. A. F. Kracklauer, in New Developments on Fundamental Problems in Quantum Mechanics, M. Ferrero and A. van der Merwe (eds.) (Kluwer, Dordrecht, 1997), p. 185.
12. L. Sica, Opt. Commun. 170, 55-60 & 61-66 (1999).
13. A. O. Barut, Found. Phys. 22 (1), 137 (1992).
14. N. D. Mermin, Rev. Mod. Phys. 65 (3), 803 (1993).
15. R. H. Dicke and J. P. Wittke, Introduction to Quantum Mechanics (Addison-Wesley, Reading, 1960), p. 195.
16. A. F. Kracklauer, in Instantaneous Action-at-a-Distance in Modern Physics, A. E. Chubykalo, V. Pope and R. Smirnov-Rueda (eds.) (Nova Science, Commack NY, 1999), p. 379; http://arXiv/quant-ph/0007101; Ann. Fond. L. de Broglie 20 (2), 193 (2000).


A PROBABILISTIC INEQUALITY FOR THE KOCHEN-SPECKER PARADOX

JAN-ÅKE LARSSON
Matematiska Institutionen, Linköpings Universitet, SE-581 83 Linköping, Sweden
E-mail: jalar@mai.liu.se

A probabilistic version of the Kochen-Specker paradox is presented. The paradox is restated in the form of an inequality relating probabilities from a non-contextual hidden-variable model, by formulating the concept of "probabilistic contextuality". This enables an experimental test for contextuality at low experimental error rates. Using the assumption of independent errors, an explicit error bound of 0.71% is derived, below which a Kochen-Specker contradiction occurs.

1 Introduction

The description of quantum-mechanical (QM) processes by hidden variables is a subject being actively researched at present. The interest can be traced to topics where recent improvements in technology have made testing and using QM processes possible. Research in this field is usually intended to provide insight into whether, how, and why QM processes are different from classical processes. Here, the presentation will be restricted to the question whether there is a possibility of describing a certain QM system using a non-contextual hidden-variable model or not. A non-contextual hidden-variable model would be a model where the result of a specific measurement does not depend on the context, i.e., on what other measurements are simultaneously performed on the system. It is already known that for perfect measurements (perfect alignment, no measurement errors), no non-contextual model exists. These results originate in the work of Gleason,1 but a conceptually simpler proof was given by Kochen and Specker2 (KS).

The KS theorem concerns measurements on a QM system consisting of a spin-1 particle. In the QM description of this system, the operators associated with measurement of the spin components along orthogonal directions do not commute, i.e.,

$$\hat{s}_x,\ \hat{s}_y \text{ and } \hat{s}_z \text{ do not commute;} \qquad (1)$$

however, the operators that are associated with measurement of the square of the spin components do commute, i.e.,

$$\hat{s}_x^2,\ \hat{s}_y^2 \text{ and } \hat{s}_z^2 \text{ commute.} \qquad (2)$$


The latter operators (the squared ones) have the eigenvalues 0 and 1, and

$$\hat{s}_x^2 + \hat{s}_y^2 + \hat{s}_z^2 = 2\,\mathbb{1}. \qquad (3)$$

Thus it is possible to simultaneously measure the square of the spin components along three orthogonal vectors, and two of the results will be 1 while the third will be 0. Only this QM property of the system will be used in what follows.
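Properties (1)-(3) can be verified directly on the standard 3 by 3 spin-1 matrices (in units where $\hbar = 1$). This is a minimal sketch in plain Python; the particular matrix representation and the helper names `matmul`, `close` and `commute` are choices made here for illustration.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Standard spin-1 matrices in the basis where s_z is diagonal (hbar = 1).
SX = [[0, S, 0], [S, 0, S], [0, S, 0]]
SY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-12):
    """Entry-wise comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


def commute(a, b):
    return close(matmul(a, b), matmul(b, a))


SX2, SY2, SZ2 = matmul(SX, SX), matmul(SY, SY), matmul(SZ, SZ)
TOTAL = [[SX2[i][j] + SY2[i][j] + SZ2[i][j] for j in range(3)] for i in range(3)]
TWICE_IDENTITY = [[2 if i == j else 0 for j in range(3)] for i in range(3)]

print("components commute:", commute(SX, SY), commute(SY, SZ), commute(SZ, SX))
print("squares commute:", commute(SX2, SY2), commute(SY2, SZ2), commute(SZ2, SX2))
print("squares sum to twice the identity:", close(TOTAL, TWICE_IDENTITY))
```

The check only covers one orthogonal triad of directions; any other triad is related to this one by a rotation, which leaves the three properties unchanged.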

The notation used from now on is intended to avoid confusion with QM notation, since the notions used will be those of (Kolmogorovian) probability theory, not QM. A hidden-variable model will be taken to be a probabilistic model, i.e., the hidden variable $\lambda$ is represented as a point in a probabilistic space $\Lambda$, and sets in this space ("events") have a probability given by the probability measure P. The measurement results are described by random variables (RVs) $X_i(\lambda)$, which take their values in the value space $\{0, 1\}$.

These mappings will depend not only on the hidden variable $\lambda$, but also on the specific directions in which we choose to measure the squared spin components, so that we would have

$$X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda): \Lambda \to \{0, 1\},$$
$$X_2(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda): \Lambda \to \{0, 1\}, \qquad (4)$$
$$X_3(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda): \Lambda \to \{0, 1\}.$$

Here, $X_1$ is the result of the measurement along the first direction ($\mathbf{x}$), $X_2$ along the second ($\mathbf{y}$), and $X_3$ along the third ($\mathbf{z}$). To be able to model the spin-1 system described above, these RVs would need to sum to two, i.e.,

$$\sum_{i=1}^{3} X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2. \qquad (5)$$

This is in itself no guarantee that the model will be accurate but it is the least one would expect from a hidden-variable model yielding the QM behaviour

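In simple experimental setups there is usually only one direction specified (the direction along which the spin component squared is measured). Thus, we would expect that $X_1$ only depends on $\mathbf{x}$ (and $\lambda$). This is referred to as non-contextuality, and more formally this can be written as

$$X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda),$$
$$X_2(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_2(\mathbf{x}', \mathbf{y}, \mathbf{z}', \lambda), \qquad (6)$$
$$X_3(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_3(\mathbf{x}', \mathbf{y}', \mathbf{z}, \lambda).$$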

These two prerequisites are all that is needed to arrive at the Kochen-Specker paradox


2 The Kochen-Specker theorem

A more appropriate name for this section is perhaps "A Kochen-Specker theorem", since there are several variants; the example presented here is from Peres (1993).3 All variants aim for the same thing: to show a contradiction by assigning values to measurement results coming from a non-contextual hidden-variable model. In this particular one,3 a set of 33 three-dimensional vectors is used, depicted in Fig. 1.

Figure 1: The 33 vectors used in the Kochen-Specker theorem. The vectors are from the center of the cube onto one of the spots on the cube's surface (normalized if desired).

The proof is as follows: assume that we have a non-contextual hidden-variable model. Then, for any $\lambda$ (except perhaps for a null set), this model satisfies equations (5) and (6), in particular for the directions in Fig. 1. Now look at Fig. 2(a). The measurement result along one of the coordinate axes must be 0, and along the other axes it must be 1. Let us assume that the 0 is obtained from the measurement along the z axis (the white spot on the cube) and the other two measurements yield 1 (black spots). Measurements along other directions in the xy-plane must also yield 1, as indicated in Fig. 2(a). In Fig. 2(b-d) three more similar choices are made, and having made these assignments, a white spot must be added at the position indicated in Fig. 2(e) because of the two black spots at orthogonal positions, and by this another black spot must be added, being orthogonal to the white one. This procedure continues in Fig. 2(f-j) until all the spots are painted either white or black, as necessitated by the previously painted spots. Finally, in Fig. 2(k), we have three black orthogonal spots, violating equation (5), the condition of QM results. A similar contradiction will occur whatever choices we make in our assignments in Fig. 2(a-d), and we have a proof of the KS theorem.

(Footnote on the white and black spots: these were green and red in Peres.3)

Figure 2: A proof of the Kochen-Specker paradox. Panels (a)-(d): arbitrary choice; panels (e)-(j): orthogonality; panel (k): contradiction.

We have


Theorem 1 (Kochen-Specker) The following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. Measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}): \Lambda \to \{0, 1\}, \quad i = 1, 2, 3.$$

(ii) Non-contextuality. The result along a vector is not changed by rotation around that vector. For example,

$$X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda).$$

(iii) Quantum-mechanical results. For any triad, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2.$$
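The colouring argument can also be run as an exhaustive search. The sketch below rests on an assumption that is not spelled out in the text: the 33 directions are generated in the commonly quoted way, from (0,0,1), (0,1,1), (0,1,√2) and (1,1,√2) by permutations and sign changes, with opposite vectors identified. The rules enforced are that every orthogonal triad in the set has exactly one 0, and that two orthogonal directions are never both 0. All function names are illustrative.

```python
import itertools
import math

SQRT2 = math.sqrt(2.0)


def canonical(v):
    """One representative of the ray {v, -v}: first nonzero component positive."""
    first = next(c for c in v if c != 0)
    return v if first > 0 else tuple(-c for c in v)


def peres_rays():
    """Assumed construction of the 33 directions (see the lead-in)."""
    rays = set()
    for base in [(0, 0, 1), (0, 1, 1), (0, 1, SQRT2), (1, 1, SQRT2)]:
        for perm in itertools.permutations(base):
            for signs in itertools.product((1, -1), repeat=3):
                rays.add(canonical(tuple(s * c for s, c in zip(signs, perm))))
    return sorted(rays)


def orthogonal(u, v):
    return abs(sum(a * b for a, b in zip(u, v))) < 1e-9


def find_triads(rays):
    """All mutually orthogonal triples of directions, as index triples."""
    return [t for t in itertools.combinations(range(len(rays)), 3)
            if all(orthogonal(rays[i], rays[j])
                   for i, j in itertools.combinations(t, 2))]


def ks_colorable(rays, triads):
    """Backtracking search for a 0/1 assignment obeying both rules."""
    n = len(rays)
    nbrs = [[j for j in range(n) if orthogonal(rays[i], rays[j])] for i in range(n)]
    by_ray = [[t for t in triads if i in t] for i in range(n)]
    # Visit directions triad by triad so that constraints close early.
    order, pending = [], list(triads)
    while pending:
        t = max(pending, key=lambda tr: sum(k in order for k in tr))
        pending.remove(t)
        order += [k for k in t if k not in order]
    order += [i for i in range(n) if i not in order]
    value = [None] * n

    def ok(i):
        if value[i] == 0 and any(value[j] == 0 for j in nbrs[i]):
            return False  # two orthogonal directions both 0
        for t in by_ray[i]:
            vals = [value[k] for k in t]
            if None not in vals and sum(vals) != 2:
                return False  # a completed triad must sum to two
        return True

    def assign(pos):
        if pos == n:
            return True
        i = order[pos]
        for v in (0, 1):
            value[i] = v
            if ok(i) and assign(pos + 1):
                return True
        value[i] = None
        return False

    return assign(0)


rays = peres_rays()
triads = find_triads(rays)
print(len(rays), len(triads), ks_colorable(rays, triads))
```

Unlike the pictorial proof, the search does not exploit the symmetry of the cube, so it simply tries every admissible choice in panels (a)-(d) and beyond.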

Note that there is a certain structure to the proof: assignment of measurement results on a finite number of orthogonal triads according to the QM rule, and rotations connecting the measurement results on different triads by non-contextuality. This structure can be made explicit in the statement of the theorem by introducing the set $E_{KS}$ (a "KS set of triads"):

copybullcopybullcopybullcopybull-bull(-i5) (7)

In this set there are n vectors forming N distinct orthogonal triads, where some vectors are present in more than one triad, establishing in total M connections by rotation around a vector. Using this notation, (a restricted version of) the KS theorem is

Theorem 1 (Kochen-Specker) Given a KS set of vector triads $E_{KS}$, the following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}): \Lambda \to \{0, 1\}, \quad i = 1, 2, 3.$$

(ii) Non-contextuality. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the result along that vector is not changed by the rotation. For example,

$$X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda).$$

(iii) Quantum-mechanical results. For any triad in $E_{KS}$, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2.$$

This version of the KS theorem will be useful when formulating a probabilistic version of the theorem

3 The Kochen-Specker inequality

The above discussion is valid in an ideal situation where no measurement errors are present. Introducing measurement errors, these occur as (i) missing detections, (ii) changes in the results along the axis vector when rotating, or (iii) deviations from the sum 2. Since the prerequisites of Theorem 1 are no longer valid, neither is the theorem. However, using probabilistic notions, the theorem can be restated as follows.

Theorem 2 (Kochen-Specker inequality) Given a KS set $E_{KS}$ of N vector triads with M interconnections by rotation, if we have

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}): \Lambda_X \to \{0, 1\}, \quad i = 1, 2, 3,$$

where $\Lambda_X$ is a (possibly proper) subset of $\Lambda$.

(ii) Rotation error bound. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the set of $\lambda$'s where the result along that vector is not changed by the rotation is probabilistically large (has probability at least $1 - \delta$). For example,

$$P\left(\{\lambda : X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda)\}\right) \ge 1 - \delta.$$

(iii) Sum error bound. For any triad in $E_{KS}$, the set of $\lambda$'s where the sum of the results is two is probabilistically large (has probability at least $1 - \epsilon$), i.e.,

$$P\left(\{\lambda : \sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2\}\right) \ge 1 - \epsilon.$$

Then

$$M\delta + N\epsilon \ge 1.$$

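To shorten the proof, the following symmetry of the measurement results is assumed to hold (the proof goes through without the symmetry, but grows notably in size):

$$X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_2(\mathbf{z}, \mathbf{x}, \mathbf{y}, \lambda) = X_3(\mathbf{y}, \mathbf{z}, \mathbf{x}, \lambda). \qquad (8)$$

Proof. By Theorem 1 we have

$$\Big(\bigcap_{M} \{\lambda : X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda)\}\Big) \cap \Big(\bigcap_{N} \{\lambda : \sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2\}\Big) = \emptyset.$$

Then the complement has probability one, and

$$1 = P\Big(\Big(\bigcup_{M} \{\lambda : X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda)\}^c\Big) \cup \Big(\bigcup_{N} \{\lambda : \sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2\}^c\Big)\Big)$$

$$\le \sum_{M} P\left(\{\lambda : X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda)\}^c\right) + \sum_{N} P\left(\{\lambda : \sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2\}^c\right) \qquad (9)$$

$$\le M\delta + N\epsilon.$$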

Here, the probability in (iii) is to be read as the probability of obtaining results for all three $X_i$ and that the sum is two. In other words, it is possible to avoid using the no-enhancement assumption in Theorem 2, but unfortunately inefficient detector devices would contribute no-detection events to both the error rates $\delta$ and $\epsilon$, which puts a rather high demand on experimental equipment. While the no-enhancement assumption can be used in inefficient setups, this may weaken the statement (cf. a similar argument for the GHZ paradox4).

The error rate $\epsilon$ is the probability of getting an error in the sum (both non-detections and the wrong sum are errors here), not the probability of getting an error in an individual result. This makes it easy to extract $\epsilon$ from experimental data, but unfortunately the errors that arise in rotation are not available in the experimental data, so it is not possible to estimate the size of $\delta$ (note that it is not even meaningful to discuss $\delta$ in QM). It is possible to use $\epsilon$ to obtain a bound for $\delta$.

Corollary 3 (Kochen-Specker inequality) Given a KS set of N vector triads $E_{KS}$ with M interconnections by rotation, if Theorem 2 (i-iii) hold, then

$$\delta \ge \frac{1 - N\epsilon}{M}.$$

Obviously, a small $E_{KS}$ set (small N and M) is better, yielding a higher bound for $\delta$ for a given $\epsilon$ (for a few different KS sets, see2,3,5).

In an inexact experiment yielding a large $\epsilon$, one expects the error rate $\delta$ to be large as well, whereas the bound in Corollary 3 will be low because of the large $\epsilon$. A model for this inexact experiment may then be said to be probabilistically non-contextual: the measurement error rate is large enough to allow the changes arising in rotation to be explained as natural errors in the inexact measurement device, rather than being fundamentally contextual. For a good experiment yielding a low $\epsilon$, one expects $\delta$ to be low, but here the bound in Corollary 3 is higher. In a hidden-variable model of this experiment, the changes arising in rotation occur at an unexpectedly high rate, which cannot be explained as due to measurement errors, and a model of this type may be said to be probabilistically contextual. Note that this probabilistic non-contextuality is a weaker notion than the one used in Theorem 1 (ii).
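The trade-off between the two error rates can be tabulated directly from the inequality of Theorem 2, $M\delta + N\epsilon \ge 1$, solved for $\delta$. The following is a minimal sketch; the function name and the sample values of N, M and $\epsilon$ are illustrative assumptions, not data from an experiment.

```python
def rotation_error_lower_bound(n_triads, m_links, eps):
    """Smallest delta compatible with M*delta + N*eps >= 1 (never below zero)."""
    if m_links <= 0:
        raise ValueError("m_links must be positive")
    return max(0.0, (1.0 - n_triads * eps) / m_links)


# Illustrative values only: a set with N = 24 triads and M = 31 connections.
for eps in (0.0, 0.01, 0.02, 0.05):
    print(eps, rotation_error_lower_bound(24, 31, eps))
```

Once $N\epsilon$ reaches 1 the bound is vacuous, which is the regime where a probabilistically non-contextual model cannot be excluded.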

4 Independence

To enable a general statement, the proof of Theorem 2 does not make any assumptions on independence of the errors, but it is possible to give a more quantitative bound for the error rate by introducing independence (for simplicity, at 100% detector efficiency).

Corollary 4 (KS inequality for independent errors) Assuming that the errors are independent at the rate r and that Theorem 2 (i-iii) hold, then both $\delta$ and $\epsilon$ are given by r, and

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) \ge 1.$$

Proof. In the case of independent errors at the rate r, the expressions for the probabilities in Theorem 2 (ii) and (iii) are

(i) $$P\left(\{\lambda : X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda)\}\right) = P(\text{no errors}) + P(\text{flip on both } X_1\text{'s}) = (1 - r)^2 + r^2 = 1 - (2r - 2r^2),$$

(ii) $$P\left(\{\lambda : \sum_i X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2\}\right) = P(\text{no errors}) + P(\text{flip of the 0 and one 1}) = (1 - r)^3 + 2(1 - r)r^2 = 1 - (3r - 5r^2 + 3r^3).$$

The probabilities of these sets are not independent, so from this point on we cannot use independence. The inequality above then follows easily from Theorem 2.

An expression of the form $r \ge f(N, M)$ can now be derived from Corollary 4, but this complicated expression is not central to the present paper. One important observation is that, again, to obtain a contradiction for high error rates (r), a small $E_{KS}$ set is needed (small N and M). Unfortunately, the error rate needs to be very low; e.g., for the $E_{KS}$ in the present example, only an error rate r below 0.71% yields a contradiction in Corollary 4. Please note that there is no experimental check whether the assumption of independent errors holds or not. While the errors in the sum may be possible to check, it is not possible to extract what errors are present in the rotations, or check for independence of those errors (further discussion of independence is necessary but cannot be fit into this limited space).

(Footnote on the present example:) The set contains 33 vectors forming 16 distinct orthonormal bases,3 but some rotations used are not between two of these 16 bases; in some cases a rotation goes from one of the 16 bases to a pair of vectors in the set (where the third needed to form a basis is not in the set), and a subsequent rotation returns us to another of the 16 bases. Thus, in the notation adopted here, a few extra vectors are needed to form $E_{KS}$, yielding n = 41, N = 24 and M = 31. Note that these additional vectors are not needed to yield the KS contradiction, but are only needed in the proof of the inequality in this paper. A more detailed analysis for the initial set of 33 vectors is possible, probably yielding a contradiction at a somewhat higher r than the one obtained from this general analysis, but this is lengthy and will not be done here.
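The critical error rate of Corollary 4 can be located numerically instead of through a closed-form expression. The sketch below uses plain bisection on the left-hand side of the inequality, which is increasing in r on the interval searched; the function names are illustrative, and the values N = 24 and M = 31 are taken as printed in the note above. It is meant as a check of the order of magnitude rather than a reproduction of the quoted figure.

```python
def ks_lhs(r, n_triads, m_links):
    """Left-hand side M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) of the inequality."""
    return (m_links * (2 * r - 2 * r ** 2)
            + n_triads * (3 * r - 5 * r ** 2 + 3 * r ** 3))


def critical_error_rate(n_triads, m_links, tol=1e-12):
    """Error rate at which the left-hand side reaches 1, found by bisection."""
    lo, hi = 0.0, 0.5  # lhs(0) = 0 < 1 and lhs(0.5) > 1 for N, M >= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ks_lhs(mid, n_triads, m_links) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r_star = critical_error_rate(24, 31)
print("critical error rate: %.4f%%" % (100.0 * r_star))
```

Below the returned rate the inequality is violated, so assumptions (i)-(iii) with independent errors cannot all hold; a smaller set (smaller N and M) pushes the threshold upward.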


5 Conclusions

To conclude, for any hidden-variable model we have a bound on the changes arising in rotation:

$$\delta \ge \frac{1 - N\epsilon}{M}.$$

Here N is the number of triads in $E_{KS}$ and M is the number of connections within $E_{KS}$. A proof using few triads with few connections is not only easier to understand, but is also essential to yield a bound usable in real experiments. At a large error rate $\epsilon$, probabilistically non-contextual models cannot be ruled out, since the changes of the results arising in rotation can be attributed to measurement errors. However, a small error rate $\epsilon$ will force any hidden-variable description of the physical system to be probabilistically contextual.

If the assumption of independent errors is used, an explicit bound can be determined for the error rate r,

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) \ge 1, \qquad (13)$$

which is possible to write in the form $r \ge f(N, M)$. Below the bound, we have a KS contradiction. Again, a small KS set is better than a large one, yielding a higher bound. For example, for the KS set used here,3 an r below 0.71% yields a contradiction.

While writing this paper, the author learned from C. Simon that a similar approach was in preparation by him, C. Brukner and A. Zeilinger.6

The author would like to thank A. Kent for discussions. This work was partially supported by the Quantum Information Theory Programme at the European Science Foundation.

1. A. M. Gleason, J. Math. Mech. 6, 885 (1957).
2. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
3. A. Peres, Quantum Theory: Concepts and Methods, Ch. 7 (Kluwer, Dordrecht, 1993).
4. D. M. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990); N. D. Mermin, Phys. Rev. Lett. 65, 1838 (1990); J.-Å. Larsson, Phys. Rev. A 57, R3145 (1998); J.-Å. Larsson, Phys. Rev. A 59, 4801 (1999).
5. A. Peres, J. Phys. A 24, L175 (1991); J. Zimba and R. Penrose, Stud. Hist. Philos. Sci. 24, 697 (1993).
6. C. Simon, C. Brukner and A. Zeilinger, quant-ph/0006043.


QUANTUM STOCHASTICS. THE NEW APPROACH TO THE DESCRIPTION OF QUANTUM MEASUREMENTS

ELENA LOUBENETS
Moscow State Institute of Electronics and Mathematics

Abstract

We propose a new general approach to the description of an arbitrary generalized direct quantum measurement with outcomes in a measurable space. This approach is based on the introduction of the physically important mathematical notion of a family of quantum stochastic evolution operators, describing, in a Hilbert space, the conditional evolution of a quantum system under a direct measurement.

In the frame of the proposed approach, which we call quantum stochastic, all possible schemes of measurements upon a quantum system can be considered.

The quantum stochastic approach (QSA) gives not only the complete statistical description of any quantum measurement (a POV measure and a family of posterior states), but it gives also the complete stochastic description of the random behaviour of a quantum system in a Hilbert space, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement. When a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

In the case of continuous in time measurements, the QSA allows one to define, in the most general case, the notion of the family of posterior pure state trajectories (quantum trajectories) in the Hilbert space of a quantum system and to give their probabilistic treatment.

1 Introduction

The evolution of the isolated quantum system is quantum deterministic, since its behaviour in a complex separable Hilbert space $\mathcal{H}$ is described by a unitary operator $U(t): \mathcal{H} \to \mathcal{H}$ satisfying the Schrödinger equation, whose solutions are reversible in time.

Under a measurement, the behaviour of a quantum system becomes irreversible in time and stochastic: not only is the outcome of a measurement random, being defined with some probability distribution, but the state of a quantum system becomes random as well.

Consider the general scheme of description of any quantum measurement with outcomes of the most general nature possible under a quantum measurement. Such a measurement is usually called generalized.

Let $\Omega$ be a set of outcomes and $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$. Let $\rho_0$ be a state of a quantum system at the instant before a measurement.

The complete statistical description of any generalized quantum measurement implies that for any initial state $\rho_0$ of a quantum system we can present

• the probability distribution of different outcomes of a measurement;
• the statistical description of a state change $\rho_0 \to \rho_{out}$ of the quantum system under a measurement.

We shall also speak of the complete stochastic description of the random behaviour of a quantum system under a measurement, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement.

Introduce some notations.

Let $\mu(E, \rho_0) = \mathrm{Prob}\{\omega \in E; \rho_0\}$, $\forall E \in \mathcal{F}$, be the probability that under a measurement (upon a quantum system being initially in a state $\rho_0$) the observed outcome $\omega$ belongs to a subset E.

Let $\mathrm{Ex}\{Z|E\}$ be the conditional expectation of any von Neumann observable $Z \in \mathcal{L}(\mathcal{H})$, $Z = Z^+$, at the instant immediately after the measurement, provided the observed outcome $\omega \in E$. Here $\mathcal{L}(\mathcal{H})$ denotes the linear space of all linear bounded operators on $\mathcal{H}$.

The statistical (density) operator $\rho_{out}(E, \rho_0)$ is called a posterior state of a quantum system conditioned by the observed outcome $\omega \in E$ if, for any Z, the following relation is valid:

$$\mathrm{Ex}\{Z|E\} = \mathrm{tr}[\rho_{out}(E, \rho_0) Z]. \qquad (1)$$

The unconditional (a priori) state $\rho_{out}(\Omega, \rho_0)$ of a quantum system defines the quantum mean value

$$\mathrm{tr}[\rho_{out}(\Omega, \rho_0) Z] = \mathrm{Ex}\{Z|\Omega\} = \langle Z \rangle_{\rho_{out}(\Omega, \rho_0)} \qquad (2)$$

of any von Neumann observable Z at the instant immediately after the measurement, if the results of a measurement are ignored.

Any conditional state change $\rho_{out}(E, \rho_0)$ of a quantum system under a measurement can be completely described by a family of statistical operators $\{\rho_{out}(\omega, \rho_0), \omega \in \Omega\}$, defined $\mu$-almost everywhere on $\Omega$ and called a family of posterior states.

Specifically, for $\forall E \in \mathcal{F}$ with $\mu(E, \rho_0) \ne 0$,

$$\rho_{out}(E, \rho_0) = \frac{\int_{\omega \in E} \rho_{out}(\omega, \rho_0)\,\mu(d\omega, \rho_0)}{\mu(E, \rho_0)}, \qquad (3)$$


and consequently, due to (1), for any von Neumann observable Z the conditional expectation can be presented as

$$\mathrm{Ex}\{Z|E\} = \frac{\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega, \rho_0) Z]\,\mu(d\omega, \rho_0)}{\mu(E, \rho_0)}. \qquad (4)$$

Every posterior state $\rho_{out}(\omega, \rho_0)$ describes the state of a quantum system conditioned by the sharp outcome $\omega$. In general, however, when outcomes of a measurement are not of discrete character or the observation is not sharp, then, provided the outcome $\omega \in E$, we can only say that after a measurement the quantum system is in a state $\rho_{out}(\omega, \rho_0)$ with probability

$$\chi_E(\omega)\,\frac{\mu(d\omega, \rho_0)}{\mu(E, \rho_0)}, \qquad (5)$$

where $\chi_E(\omega)$ is an indicator function of a subset E.

The a priori state $\rho_{out}(\Omega, \rho_0)$ and the quantum mean value of any von Neumann observable Z at the instant immediately after the measurement are represented through the family of posterior states as

$$\rho_{out}(\Omega, \rho_0) = \int_{\Omega} \rho_{out}(\omega, \rho_0)\,\mu(d\omega, \rho_0), \qquad (6)$$

$$\langle Z \rangle_{\rho_{out}(\Omega, \rho_0)} = \int_{\Omega} \mathrm{tr}[\rho_{out}(\omega, \rho_0) Z]\,\mu(d\omega, \rho_0), \qquad (7)$$

respectively.

The relation (6) can be considered as the usual statistical average over posterior states $\rho_{out}(\omega, \rho_0)$, given with the probability distribution $\mu(d\omega, \rho_0)$.

From (7) it also follows that, in any possible measurement upon an observable Z which could be done immediately at the instant after the first measurement, the probability distribution $\mathrm{Prob}\{z \in \Delta; \rho_{out}(\Omega, \rho_0)\}$ of possible outcomes is given by

$$\mathrm{Prob}\{z \in \Delta; \rho_{out}(\Omega, \rho_0)\} = \int_{\Omega} \mathrm{Prob}\{z \in \Delta; \rho_{out}(\omega, \rho_0)\}\,\mu(d\omega, \rho_0). \qquad (8)$$

This formula can be considered as the quantum analog of Bayes' formula in classical probability theory.

In quantum theory there are two major approaches to the specification of the above-mentioned elements of the description of a quantum measurement.


• The von Neumann approach [1] considers only direct measurements with outcomes in $\mathbb{R}$. According to this approach, only self-adjoint operators on $\mathcal{H}$ are allowed to represent real-valued variables of a quantum system which can be measured (observables). The probability distribution $\mu(E, \rho_0)$ of any measurement is defined as

$$\mu(E, \rho_0) = \mathrm{tr}[\rho_0 P(E)] \qquad (9)$$

through the projection-valued measure $P(\cdot)$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ corresponding, due to the spectral theorem, to the self-adjoint operator representing this observable.

Under the von Neumann approach, the posterior state of a quantum system is defined only in the case of discrete spectrum of a measured quantum variable, and is given by the well-known "jump" of a quantum system under a measurement, prescribed by the von Neumann reduction postulate.

In the case of continuous spectrum of a quantum observable, the description of a state change of a quantum system under a measurement is not formalized.

The simultaneous measurement of n quantum observables is allowed if and only if the corresponding self-adjoint operators, and consequently their spectral projection-valued measures, commute.

• The operational approach [2-8] gives the complete statistical description of any generalized quantum measurement. In the frame of the operational approach, the mathematical notion of a quantum instrument plays the central role. In physical literature a quantum instrument is usually called a superoperator.

Specifically, a mapping $T(\cdot)[\cdot]: \mathcal{F} \times \mathcal{L}(\mathcal{H}) \to \mathcal{L}(\mathcal{H})$ is called a quantum instrument if $T(\cdot)$ is a measure on $(\Omega, \mathcal{F})$ with values $T(E)$, $\forall E \in \mathcal{F}$, being linear bounded normal completely positive maps on $\mathcal{L}(\mathcal{H})$, such that the following normality relation is valid: $T(\Omega)[I] = I$.

Let $T(\cdot)[\cdot]$ be an instrument of a generalized quantum measurement. Then the conditional expectation of any von Neumann observable Z at the instant after a measurement is defined to be

$$\mathrm{Ex}\{Z|E\} = \frac{\mathrm{tr}[\rho_0 T(E)[Z]]}{\mu(E, \rho_0)}, \quad \forall E \in \mathcal{F}. \qquad (10)$$

In case Z = I, from (10) it follows that, in the frame of the operational approach, the probability distribution $\mu(E, \rho_0)$ of outcomes under a measurement is given by

$$\mu(E, \rho_0) = \mathrm{tr}[\rho_0 T(E)[I]], \quad \forall E \in \mathcal{F}. \qquad (11)$$


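The positive operator-valued measure $M(E) = T(E)[I]$, satisfying the condition $M(\Omega) = I$, is called a probability operator-valued measure, or a POV measure for short.

From (1) and (10) it also follows that, for any initial state $\rho_0$, the posterior state $\rho_{out}(E, \rho_0)$ conditioned by the outcome $\omega \in E$ can be represented as

$$\rho_{out}(E, \rho_0) = \frac{T^{*}(E)[\rho_0]}{\mu(E, \rho_0)}, \qquad (12)$$

where $T^{*}(E)[\cdot]$ denotes the dual map, acting on the linear space $\mathcal{T}(\mathcal{H})$ of trace class operators on $\mathcal{H}$ and defined by

$$\mathrm{tr}[S\,T(E)[Z]] = \mathrm{tr}[T^{*}(E)[S]\,Z], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall S \in \mathcal{T}(\mathcal{H}). \qquad (13)$$

For any initial state $\rho_0$ of a quantum system, the family of posterior states $\{\rho_{out}(\omega, \rho_0), \omega \in \Omega\}$ always exists and is defined uniquely, $\mu$-almost everywhere, by the relation

$$\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega, \rho_0) Z]\,\mu(d\omega, \rho_0) = \mathrm{tr}[\rho_0 T(E)[Z]], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall E \in \mathcal{F}. \qquad (14)$$

Due to (13), (14) we have

$$T^{*}(E)[\rho_0] = \int_{\omega \in E} \rho_{out}(\omega, \rho_0)\,\mu(d\omega, \rho_0), \qquad (15)$$

and consequently the posterior state $\rho_{out}(\omega, \rho_0)$ is a density of the measure $T^{*}(\cdot)[\rho_0]$ with respect to the probability scalar measure $\mu(\cdot, \rho_0)$.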

The operational approach is very important for the formalization of the complete statistical description of an arbitrary generalized quantum measurement.

However, the operational approach does not specify the description of a generalized direct quantum measurement, that is, the situation where we have to describe a direct interaction between a measuring device and an observed quantum system, resulting in some observed outcome $\omega$ in a classical world and in the change of a quantum system state conditioned by this outcome.

We would like to emphasize that, in principle, the description of a direct measurement cannot be simply reduced to the quantum theoretical description of a measuring process. We can neither specify definitely the interaction or the quantum state of a measuring device environment, nor describe a measuring device only in quantum theory terms. In fact, under such a scheme the description of a direct quantum measurement is simply postponed to the description of a direct measurement of some observable of the environment of a measuring device.

The operational approach also does not, in general, make it possible to include into consideration the complete stochastic description of the random behaviour of a quantum system under a measurement.

We recall that for the case of discrete outcomes the von Neumann approach gives both the complete statistical description of a direct quantum measurement and the complete stochastic description, in a Hilbert space, of the random behaviour of a quantum system under a single measurement. In particular, if the initial state $\rho_0$ of a quantum system is pure, that is $\rho_0 = |\psi_0\rangle\langle\psi_0|$, and if under a single measurement the outcome $\lambda_j$ is observed, then in the frame of the von Neumann approach the quantum system jumps with certainty to the posterior pure state

$$\frac{P_j \psi_0}{\|P_j \psi_0\|}, \tag{16}$$

where $P_j$ is the projection corresponding to the observed eigenvalue $\lambda_j$. The probability $\mu_j$ of the outcome $\lambda_j$ is given by

$$\mu_j = \|P_j \psi_0\|^2. \tag{17}$$
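The postulates (16), (17) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the qubit, the two projectors and the initial vector are hypothetical data chosen for this example, and the helper names (`apply`, `norm`, `von_neumann_outcomes`) are not taken from the source.

```python
import math

def apply(P, psi):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(P[r][c] * psi[c] for c in range(len(psi))) for r in range(len(P))]

def norm(psi):
    """Euclidean norm of a (possibly complex) vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in psi))

def von_neumann_outcomes(projectors, psi0):
    """Return a list of (mu_j, posterior_j) following (17) and (16).

    An outcome of zero probability gets posterior None.
    """
    results = []
    for P in projectors:
        phi = apply(P, psi0)                              # unnormalized P_j psi_0
        n = norm(phi)
        mu = n ** 2                                       # probability (17)
        post = [a / n for a in phi] if n > 0 else None    # posterior state (16)
        results.append((mu, post))
    return results

# Hypothetical qubit example: projectors onto the two basis vectors.
P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
psi0 = [math.sqrt(0.3), math.sqrt(0.7)]   # normalized initial vector
outcomes = von_neumann_outcomes([P0, P1], psi0)
```

For a normalized $\psi_0$ and a complete family of orthogonal projections the probabilities add up to one, which is what the sketch reproduces for this choice of data.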

We would also like to underline that the description of the stochastic, irreversible in time behaviour of the quantum system under a direct measurement is very important, in particular in the case of continuous in time direct measurements, where the evolution of a continuously observed quantum system cannot be described by reversible in time solutions of the Schrödinger equation.

In quantum theory any physically based problem must be formulated in unitarily equivalent terms, and the results of its consideration must depend neither on the choice of a special representation picture (Schrödinger, Heisenberg or interaction) nor on the choice of a basis in the Hilbert space. That is why in [9] we introduce the notion of a class of unitarily equivalent measuring processes and analyse the invariants of this class.

We show [9] that the description of any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ can be considered in the frame of a new general approach, which we call quantum stochastic, based on the notion of a family of quantum stochastic evolution operators satisfying the orthonormality relation. In the case when a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

The quantum stochastic approach (QSA), which we present in the next section, can be considered as the quantum stochastic generalization of the description of von Neumann measurements for the case of any measurable space of outcomes, an input probability scalar measure of any type on the space of outcomes and any type of a quantum state reduction. Due to the orthonormality relation, the QSA allows one to interpret the posterior pure states defined by quantum stochastic evolution operators as posterior pure state outcomes in a Hilbert space corresponding to different random measurement channels.

Even for the special case of discrete outcomes the QSA differs, due to the orthogonality relation for posterior pure state outcomes, from the somewhat similar looking approaches considered in the physical literature [10, 11], where the so-called measurement or Kraus operators are used for the description of both the statistics of a measurement (a POV measure) and the conditional state change of a quantum system.

The QSA gives not only the complete statistical description of any generalized direct quantum measurement but also the complete stochastic description of the random behaviour of the quantum system under a measurement.

2 Quantum stochastic approach

In this section we introduce the quantum stochastic approach (QSA) to the description of a generalized direct quantum measurement, developed in [9].

Specifically, it was shown in [9] that for any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ upon a quantum system being, at the instant before the measurement, in a state $\rho_0$, there exist:

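• the unique family of complex scalar measures, absolutely continuous with respect to a finite positive scalar measure $\nu(\cdot)$ and satisfying the orthonormality relation

$$\Lambda = \{\pi_{ji}(\omega)\nu(d\omega) : \omega \in \Omega;\ i,j = 1,\dots,N_0\},\ N_0 \le \infty, \qquad \int_{\Omega} \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji}; \tag{18}$$

• the unique (up to phase equivalence) family of $\nu$-measurable operator-valued functions $V_i(\cdot)$ on $\Omega$ satisfying the orthonormality relation, with values $V_i(\omega)$ being linear operators on $\mathcal{H}$, defined for any $\psi \in \mathcal{H}$ $\nu$-almost everywhere on $\Omega$,

$$\mathcal{V} = \{V_i(\omega) : \omega \in \Omega;\ i = 1,\dots,N_0\}, \qquad \int_{\Omega} V_j^{+}(\omega) V_i(\omega)\, \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji} I, \tag{19}$$

and such that for any index $i = 1,\dots,N_0$ and for $\forall E \in \mathcal{F}_B$

$$\int_{\omega \in E} V_i(\omega)\, \pi_{ii}(\omega)\nu(d\omega) \tag{20}$$

is a bounded operator on $\mathcal{H}$. The relation

$$(W_i\psi)(\omega) = V_i(\omega)\psi, \quad \forall \psi \in \mathcal{H}, \tag{21}$$

holding $\nu_i$-almost everywhere on $\Omega$, defines the bounded linear operator $W_i: \mathcal{H} \to \mathcal{L}_2(\Omega, \nu_i; \mathcal{H})$ with the norm $\|W_i\| = 1$. Here $\nu_i(d\omega) = \pi_{ii}(\omega)\nu(d\omega)$;

• the unique sequence of positive numbers $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_{N_0})$ satisfying the relation

$$\sum_{i=1}^{N_0} \alpha_i = 1, \tag{22}$$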

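such that the complete statistical description (a POV measure and a family of posterior states) of a measurement and the complete stochastic description of the random behaviour of a quantum system under a single measurement (a family of posterior pure state outcomes and their probability distribution) are given by:

• The POV measure

$$M(E) = \sum_{i=1}^{N_0} \alpha_i M_i(E), \quad \forall E \in \mathcal{F}_B, \tag{23}$$

with

$$M_i(E) = \int_{\omega \in E} V_i^{+}(\omega) V_i(\omega)\, \nu_i(d\omega). \tag{24}$$

• The family of posterior states

$$\rho_{out}(\omega;\rho_0) = \sum_{i=1}^{N_0} \xi_i(\omega)\, \tau_{out}^{(i)}(\omega;\rho_0), \tag{25}$$

with

$$\tau_{out}^{(i)}(\omega;\rho_0) = V_i(\omega)\rho_0 V_i^{+}(\omega) \tag{26}$$

and

$$\xi_i(\omega) = \frac{\alpha_i \pi_{ii}(\omega)}{\sum_j \alpha_j \pi_{jj}(\omega)\, \mathrm{tr}[\tau_{out}^{(j)}(\omega;\rho_0)]}. \tag{27}$$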


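• The probability scalar measure of the measurement, given by the expression

$$\mu(d\omega;\rho_0) = \sum_i \alpha_i\, \mu^{(i)}(d\omega;\rho_0) \tag{28}$$

through the probability scalar measures

$$\mu^{(i)}(d\omega;\rho_0) = \mathrm{tr}[\tau_{out}^{(i)}(\omega;\rho_0)]\, \nu_i(d\omega). \tag{29}$$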

• The family of random operators (19) describing the stochastic behaviour of the quantum system under a single measurement. Every operator $V_i(\omega)$ defines in the Hilbert space $\mathcal{H}$ a posterior pure state outcome conditioned by the observed result $\omega$ and corresponding to the $i$-th random channel of a measurement.

For any $\psi_0 \in \mathcal{H}$ the following orthonormality relation for a family $\{V_i(\omega)\psi_0 : \omega \in \Omega;\ i = 1,\dots,N_0\}$ of unnormalized posterior pure state outcomes is valid:

$$\int_{\Omega} \langle V_j(\omega)\psi_0, V_i(\omega)\psi_0\rangle\, \pi_{ji}(\omega)\nu(d\omega) = \|\psi_0\|^2 \delta_{ji}. \tag{30}$$

For the definite observed outcome $\omega$ the probability of the posterior pure state outcome $V_i(\omega)\psi_0$ in the Hilbert space is given by

$$Q_i(\omega) = \frac{\alpha_i \pi_{ii}(\omega)\, \|V_i(\omega)\psi_0\|^2}{\sum_j \alpha_j \pi_{jj}(\omega)\, \|V_j(\omega)\psi_0\|^2}. \tag{31}$$

We call $V_i(\omega)$ quantum stochastic evolution operators, and the probability scalar measures $\nu_i(\cdot)$, $\sum_i \alpha_i \nu_i(\cdot)$ and $\mu^{(i)}(\cdot;\rho_0)$, $\mu(\cdot;\rho_0) = \sum_i \alpha_i \mu^{(i)}(\cdot;\rho_0)$ the input and output probability measures, respectively.

Due to the decompositions (23), (25) and (28), $M_i(E)$, $\tau_{out}^{(i)}(\omega;\rho_0)$, $\nu_i(\cdot)$ and $\mu^{(i)}(\cdot;\rho_0)$ are interpreted to present the POV measure, the unnormalized posterior state, and the input and output probability distributions of outcomes in the $i$-th random channel of the measurement, respectively. The statistical weights of the different random channels are given by the numbers $\alpha_i$, $i = 1,\dots,N_0$.

The a priori state

$$\rho_{out}(\Omega;\rho_0) = \sum_i \alpha_i \int_{\Omega} \tau_{out}^{(i)}(\omega;\rho_0)\, \nu_i(d\omega) \tag{32}$$

is the usual statistical average over the unnormalized posterior states $\tau_{out}^{(i)}(\omega;\rho_0)$ with respect to the input probability distribution of outcomes $\nu_i(\cdot)$ in every channel and with respect to the different random channels of the measurement.
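As a hedged illustration of the formulae (23)-(29), the following sketch builds a toy two-channel measurement on a qubit with two outcomes. All concrete data are assumptions made for this example and are not taken from [9]: the counting measure $\nu$ on $\Omega = \{0, 1\}$, $\pi_{ii}(\omega) = 1/2$ with vanishing off-diagonal $\pi_{ji}$, real matrices (so that $V^{+}$ is just the transpose), and the operators $V_1(\omega) = \sqrt{2} P_\omega$, $V_2(\omega) = \sqrt{2} X P_\omega$ with $X$ the bit flip.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def scale(c, A):
    return [[c * a for a in row] for row in A]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Toy data, assumed for illustration only.
OMEGA = [0, 1]                       # space of outcomes
NU = {0: 1.0, 1: 1.0}                # finite positive measure nu (counting measure)
ALPHA = [0.25, 0.75]                 # channel weights, cf. (22)
PI = [{0: 0.5, 1: 0.5},              # pi_11(omega)
      {0: 0.5, 1: 0.5}]              # pi_22(omega); pi_12 = pi_21 = 0 is assumed
P = {0: [[1.0, 0.0], [0.0, 0.0]], 1: [[0.0, 0.0], [0.0, 1.0]]}
X = [[0.0, 1.0], [1.0, 0.0]]         # bit flip, applied after projection in channel 2
ROOT2 = math.sqrt(2.0)
V = [{w: scale(ROOT2, P[w]) for w in OMEGA},
     {w: scale(ROOT2, matmul(X, P[w])) for w in OMEGA}]

def nu_i(i, w):
    """nu_i({w}) = pi_ii(w) nu({w})."""
    return PI[i][w] * NU[w]

def povm(w):
    """M({w}) = sum_i alpha_i V_i^+(w) V_i(w) nu_i({w}), cf. (23)-(24)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i, a in enumerate(ALPHA):
        Mi = scale(nu_i(i, w), matmul(transpose(V[i][w]), V[i][w]))
        total = madd(total, scale(a, Mi))
    return total

def tau(i, w, rho0):
    """Unnormalized posterior state V_i(w) rho0 V_i^+(w), cf. (26)."""
    return matmul(matmul(V[i][w], rho0), transpose(V[i][w]))

def mu(w, rho0):
    """Outcome probability, cf. (28)-(29)."""
    return sum(a * trace(tau(i, w, rho0)) * nu_i(i, w) for i, a in enumerate(ALPHA))

def posterior(w, rho0):
    """Normalized posterior state, cf. (25) and (27)."""
    num = [[0.0, 0.0], [0.0, 0.0]]
    den = 0.0
    for i, a in enumerate(ALPHA):
        t = tau(i, w, rho0)
        num = madd(num, scale(a * PI[i][w], t))
        den += a * PI[i][w] * trace(t)
    return scale(1.0 / den, num)

c = math.sqrt(0.21)
rho0 = [[0.3, c], [c, 0.7]]          # pure state with amplitudes sqrt(0.3), sqrt(0.7)
```

With these assumed data both channels have the same POV measure (the projections $P_\omega$), so the outcome statistics coincide with a projective measurement, while the posterior state is a mixture over the two channels weighted by $\alpha_i$.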


Physically, the introduced notion of different random channels of a measurement corresponds, under the same observed outcome, to different random quantum transitions of the environment of a measuring device, which we cannot, however, specify with certainty.

The triple $\gamma = \{\Lambda, \mathcal{V}, \alpha\}$ is called a quantum stochastic representation of a generalized direct measurement.

We call direct measurements presented by different quantum stochastic representations stochastic representation equivalent if the statistical and stochastic descriptions of these direct measurements are identical.

In the frame of the QSA, von Neumann (projective) measurements present the stochastic representation equivalence class of direct measurements on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for which the complete statistical and the complete stochastic descriptions are given by the von Neumann measurement postulates [1], presented by the formulae (16), (17).

3 Concluding remarks

We present a new general approach to the description of a generalized direct quantum measurement. The proposed approach allows one

• to give the complete statistical description (a POV measure and a family of posterior states) of any quantum measurement;

• to give the complete description, in a Hilbert space, of the stochastic behaviour of a quantum system under a measurement (in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement);

• to formalize the consideration of all possible cases of quantum measurements, including measurements continuous in time;

• to give the semiclassical interpretation of the description of a generalized direct quantum measurement.

4 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Växjö University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton U., Princeton, NJ, 1955).

2. E. B. Davies, J. T. Lewis, An operational approach to quantum probability, Commun. Math. Phys. 17, 239-260 (1970).

3. E. B. Davies, Quantum Theory of Open Systems (Academic Press, London, 1976).

4. A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (Nauka, Moscow, 1980; English translation: North Holland, Amsterdam, 1982).

5. K. Kraus, States, Effects and Operations: Fundamental Notions of Quantum Theory (Springer-Verlag, Berlin, 1983).

6. M. Ozawa, Quantum measuring processes of continuous observables, J. Math. Phys. 25, 79-87 (1984).

7. M. Ozawa, Conditional probability and a posteriori states in quantum mechanics, Publ. RIMS Kyoto Univ. 21, 279-295 (1985).

8. A. Barchielli, V. P. Belavkin, Measurements continuous in time and a posteriori states in quantum mechanics, J. Phys. A: Math. Gen. 24, 1495-1514 (1991).

9. E. R. Loubenets, Quantum stochastic approach to the description of quantum measurements, Research Report N 39, MaPhySto, University of Aarhus, Denmark (2000).

10. A. Peres, Classical intervention in quantum systems. I. The measuring process, Phys. Rev. A 61, 022116 (1-9) (2000).

11. H. Wiseman, Adaptive quantum measurements, Proceedings of the Workshop on Stochastics and Quantum Physics, Miscellanea N 16, 89-93, MaPhySto, University of Aarhus, Denmark (1999).


ABSTRACT MODELS OF PROBABILITY

V. M. MAXIMOV

Institute of Computer Science, Bialystok University

PL15887 Bialystok, ul. Sosnowa 64, POLAND

Probability theory presents a mathematical formalization of intuitive ideas of independent events and of a probability as a measure of randomness. It is based on axioms 1-5 of A. N. Kolmogorov [1] and their generalizations [2]. Different formalized refinements were proposed for such notions as events, independence, random value, etc. [2, 3], whereas the measure of randomness, i.e. numbers from [0,1], remained unchanged. To be precise, we mention some attempts at a generalization of probability theory with negative probabilities [4]. On the other hand, physicists tried to use negative and even complex values of probability to explain some paradoxes in quantum mechanics [5, 6, 7]. Only recently the necessity of a formalization of quantum mechanics and its foundations [8] led to the construction of p-adic probabilities [9, 10, 11], which essentially extended our concept of probability and randomness. Therefore a natural question arises: how to describe algebraic structures whose elements can be used as a measure of randomness? As a consequence, a necessity arises to define the types of randomness corresponding to every such algebraic structure. Possibly this leads to another concept of randomness, of a nature different from the combinatorial-metric conception of Kolmogorov. Apparently, a discrepancy between the real type of randomness corresponding to some experimental data and the model of randomness used for data processing leads to paradoxes [12]. An algebraic structure whose elements can be used to estimate some randomness will be called a probability set $\Phi$. Naturally, the elements of $\Phi$ are the probabilities.

1 What probability sets $\Phi$ are possible

For practical conclusions of probability theory two kinds of events, the so-called certain and uncertain ones, are of importance. Therefore the probability set $\Phi$ must have two types of elements corresponding to certainty and uncertainty. Their main role is that they couple all elements of $\Phi$. We interpret them as a possibility of a determination of any probability $p \in \Phi$ of a random event by an infinite sequence of independent random variables defined by the probability set $\Phi$. In this connection we do not require a formal physical interpretation for certainty.

We would like to preserve, for an abstract probability set $\Phi$, all fundamental properties of probability on [0,1] corresponding to the intuitive ideas of a probability of an event.

An analogous situation occurs in logic. A construction which preserves the main properties of Boolean algebra and possesses some new properties led to the appearance of the logical Lukasiewicz-Tarski system [13, 14].


Definition 1 A set $\Phi$ is called a probability set if it has the following properties:

(i) In $\Phi$ a binary operation $\cdot$ can be defined as a multiplication of probabilities, not necessarily commutative. With respect to that operation the set $\Phi$ is a semigroup. In addition, $\Phi$ consists of three non-intersecting semigroups $\mathbf{O}$, $\mathbf{e}$ and $\mathbf{P}$ such that $\Phi = \mathbf{O} \cup \mathbf{P} \cup \mathbf{e}$. The elements of the semigroup $\mathbf{O}$ play the role of zeros, i.e. $\mathbf{O}$ is a semigroup of zeros. The elements of $\mathbf{e}$ play the role of units, i.e. $\mathbf{e}$ is a semigroup of units. $\mathbf{P}$ is a semigroup of probabilities. Besides, for all $p \in \mathbf{P}$, $\theta \in \mathbf{O}$ we have $\theta \cdot p,\ p \cdot \theta \in \mathbf{O}$, and for all $p \in \mathbf{P}$, $\varepsilon \in \mathbf{e}$ we have $\varepsilon \cdot p,\ p \cdot \varepsilon \in \mathbf{P}$.

It is clear that the zero elements correspond to uncertain events and the unit elements correspond to certain events.

(ii) For some elements of $\Phi$ a commutative and associative operation $+$ of addition is defined. The operations of addition and multiplication are distributive. This means that if for $p, q, r \in \Phi$ the operations $p + q$, $(p + q) + r$ are defined, then the operations $q + r$, $p + (q + r)$ are also defined and the equality $(p + q) + r = p + (q + r)$ holds. In addition, for all $u, v, r$ the operations $u \cdot p + v \cdot q$, $p \cdot u + q \cdot v$ are defined and the equalities $r \cdot (p + q) = r \cdot p + r \cdot q$, $(p + q) \cdot r = p \cdot r + q \cdot r$ hold.

(iii) For all $p \in \mathbf{P}$ there exist a complementary element $\bar{p} \in \mathbf{P}$ and $\varepsilon \in \mathbf{e}$ such that $p + \bar{p} = \varepsilon$.

(iv) The operation + is defined for all elements of O and is not defined for elements of e Besides for all p fi e 6 ^ O a sum p + 6 is defined and p + 6 pound O p + 6 $ e Also for e pound e the inclusion takes place 6 + e pound e but p + e is not defined

(v) In $\Phi$ some topology is introduced such that with respect to that topology the operations $\cdot$ and $+$ are continuous. For an arbitrary neighbourhood $V(\mathbf{O})$ of zeros there exists $p \in \Phi$ such that $p^n \in V(\mathbf{O})$ for $n > n_0(V, p)$.

(vi) If $p, q \in \Phi$ and $p + q \in \mathbf{O}$, then it follows that $p, q \in \mathbf{O}$ (the property of indecomposability of zero). That property is not necessary. For example, in the complex and p-adic probability it may fail.

(vii) The equation $p^2 = p$ always has solutions in $\mathbf{O}$ and $\mathbf{e}$. If the equation $p^2 = p$ has solutions only in $\mathbf{O}$ and in $\mathbf{e}$, then we say that the Kolmogorov condition is valid for the probability set $\Phi$.

The properties (i)-(v) provide the main identity of the calculus of independent probabilities, i.e. if $p_1 + \dots + p_n = \varepsilon \in \mathbf{e}$, $p_i \in \mathbf{P}$, then we have

$$(p_1 + \dots + p_n)^k = \sum p_{i_1} \cdots p_{i_k} = \varepsilon_k \in \mathbf{e}.$$
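For the classical probability set [0,1], where the only unit is 1, the identity above reduces to the statement that the products over all index tuples of length $k$ again sum to 1. The following sketch checks this with exact rational arithmetic; the concrete probabilities and the helper name `power_expansion` are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def power_expansion(ps, k):
    """Sum of p_{i_1} * ... * p_{i_k} over all k-tuples of indices."""
    return sum(prod(t) for t in product(ps, repeat=k))

# Hypothetical distribution with p_1 + p_2 + p_3 = 1.
ps = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
values = [power_expansion(ps, k) for k in range(1, 5)]
```

Each entry of `values` is the total probability of all sequences of $k$ independent trials, so every entry equals 1 for this choice of `ps`.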

Unfortunately, the operations of a direct sum and of a tensor product of [0,1] do not produce a new probability set different from [0,1].

For example, in the case of a direct sum $[0,1] \oplus [0,1]$ with the coordinate-wise multiplication we have pairs $(p, q)$, $p, q \in [0,1]$, as probabilities. Consequently $(p_1, q_1) + (p_2, q_2) = (p_1 + p_2, q_1 + q_2)$ and $(p_1, q_1)(p_2, q_2) = (p_1 p_2, q_1 q_2)$. Obviously the element $(0,0)$ must be a zero. But then $(p,0)(0,q) = (0,0)$. It follows by the zero semigroup properties that $(p,0) \in \mathbf{O}$ or $(0,q) \in \mathbf{O}$. Assume that $(p,0) \in \mathbf{O}$, $p \neq 0$. Then by virtue of the other axioms we obtain $(\tfrac{m}{n} p, 0) \in \mathbf{O}$, $0 < \tfrac{m}{n} < 1$, and therefore by the continuity property the set $\{(p,0),\ p \in [0,1]\}$ is contained in $\mathbf{O}$. Formally this probability set differs from [0,1]. But the factorization with respect to the set $\mathbf{O}$ yields [0,1] once again, with the usual addition and multiplication (see section 2). However, there exists a probability set $\Phi$ satisfying all axioms in the algebra consisting of pairs $(x, y)$, $x, y \in \mathbb{R}$, with the operations of coordinate-wise addition and multiplication.

Indeed consider the set $ on Figl (parallelogram) bounded by vertices 0h 1 mdashh where h lt | Then we can easly verify that if x 21) (222) G $ then (xix22122) G $ The zero set O consists of a single element 0 and a set e consists of a single 1 The topology of $ is induced from R 2 The remaining properties of 4gt can be examined easily Note that the first coordinate x runs over the segment [01]

Since $\mathbb{R}^2$ with the coordinate-wise addition and multiplication is the simplest non-trivial topological semi-field [15], we can consider $\Phi$ as an example of a probability set included in a topological semi-field.

In [16] the foundation of classical probability theory is presented in terms of semi-fields. Thus the construction of probability sets in abstract topological semi-fields can be of interest for applications. In section 3 we consider multidimensional examples of probability sets which may even be non-commutative. These examples go beyond the frames of topological semi-fields.

The zero-indecomposability property can be included or not included into the properties of $\Phi$, depending on the problem. For example, if we consider the entire field of p-adic numbers as a probability set, then the indecomposability property does not hold. Nevertheless, this does not prevent the existence of an analogue of the Bernoulli theorem in p-adic probabilities [10].

Fig. 1

However, we can find sets satisfying all axioms in the field of p-adic numbers. For this purpose we take a p-adic number $q$, $|q|_p < 1$, that is not a root of any algebraic equation with integer coefficients. Then the set of p-adic numbers of the form

$$n_k q^k + n_{k+1} q^{k+1} + \dots + n_r q^r,$$

where $n_k \in \mathbb{N}$ and the rest of the $n_i$ belong to $\mathbb{Z}$, $k, r = 1, 2, 3, \dots$, and of the form

$$1 - m_s q^s + m_{s+1} q^{s+1} + \dots + m_t q^t,$$

where $m_s \in \mathbb{N}$ and the rest of the $m_j$ belong to $\mathbb{Z}$, $s, t = 1, 2, 3, \dots$, together with 0 and 1, is a probability set with the operations of addition and multiplication of p-adic numbers.

The semigroups $\mathbf{O}$ and $\mathbf{e}$ consist of 0 and 1, respectively.

Essentially different examples of probability sets will be considered in sections 3 and 4.

2 Uniqueness of semigroups of zeros and units

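(i) Proposition 1 In the probability set $\Phi$ defined by the operations $\cdot$ and $+$, the semigroups $\mathbf{O}$ and $\mathbf{e}$ satisfying properties (i)-(iv) are unique.

Proof. It is important to note that the semigroups $\mathbf{O}$ and $\mathbf{e}$ possess the maximality property, i.e. they cannot be extended to semigroups $\mathbf{O}'$, $\mathbf{O} \subset \mathbf{O}'$, and $\mathbf{e}'$, $\mathbf{e} \subseteq \mathbf{e}'$ (or $\mathbf{e} \subset \mathbf{e}'$, $\mathbf{O} \subseteq \mathbf{O}'$) preserving the properties (i)-(iv). Indeed, if there is an extension $\mathbf{O}'$, then there is an element $p \notin \mathbf{O}$ such that $p \in \mathbf{O}'$. But this contradicts conditions (iii)-(iv), since on the one hand the operation $p + \varepsilon$, $\varepsilon \in \mathbf{e}$, is not defined for $p \notin \mathbf{O}$, and on the other hand the operation $p + \varepsilon$ is defined for all $\varepsilon \in \mathbf{e}$, since $p \in \mathbf{O}'$.

Now let $\mathbf{O}' = \mathbf{O}$ and $\mathbf{e} \subset \mathbf{e}'$. Then there exists an element $p \in \mathbf{e}'$ with $p \notin \mathbf{e}$. By (iii) there exists $\bar{p} \notin \mathbf{O}$ such that $p + \bar{p} \in \mathbf{e} \subset \mathbf{e}'$. On the other hand, the operation $p + q$ is not defined for $q \notin \mathbf{O}' = \mathbf{O}$ and $p \in \mathbf{e}'$. Thus any pair of semigroups $\mathbf{O}$ and $\mathbf{e}$ satisfying (i)-(iv) is maximal.

For the same reason there exists in $\Phi$ no other pair of semigroups $\mathbf{O}_1$ and $\mathbf{e}_1$ different from $\mathbf{O}$ and $\mathbf{e}$. Indeed, assume that such semigroups exist. Let $\mathbf{O}_1 \neq \mathbf{O}$, $\mathbf{O}_1 \not\subset \mathbf{O}$, $\mathbf{O} \not\subset \mathbf{O}_1$. Then $\exists p \in \mathbf{O}$, $p \notin \mathbf{O}_1$. If $\mathbf{e} \cap \mathbf{e}_1 \neq \emptyset$, then the operation $p + \varepsilon$ is defined for $\varepsilon \in \mathbf{e} \cap \mathbf{e}_1$, since $p \in \mathbf{O}$. On the other hand, the operation $p + \varepsilon$ is not defined for $\varepsilon \in \mathbf{e}_1$, since $p \notin \mathbf{O}_1$. If $\mathbf{e} \cap \mathbf{e}_1 = \emptyset$, we consider an element $p$ such that $p \notin \mathbf{O}$ but $p \in \mathbf{O}_1$. Then by (iv) the sum $p + q$ is defined $\forall q \in \Phi$. On the other hand, the sum $p + \varepsilon$ is not defined for $\varepsilon \in \mathbf{e}$, since $p \notin \mathbf{O}$.

It remains to consider the case when $\mathbf{O} = \mathbf{O}_1$ but $\mathbf{e} \not\supseteq \mathbf{e}_1$. This case does not coincide with the case $\mathbf{O}' = \mathbf{O}$ and $\mathbf{e} \subset \mathbf{e}'$ studied above, but the proof remains the same. Namely, there exists $p \in \mathbf{e}_1$ with $p \notin \mathbf{e}$. By virtue of (iii) there exists an element $\bar{p} \notin \mathbf{O}$ such that $p + \bar{p} \in \mathbf{e}$. At the same time the operation $p + \bar{p}$ is not defined, since $p \in \mathbf{e}_1$ and $\bar{p} \notin \mathbf{O}_1 = \mathbf{O}$.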

(ii) The homomorphism of the probability set $\Phi_1$ into the probability set $\Phi_2$ can be defined as usual, but with the following natural complement.

Definition 2 A mapping $\varphi$ of a probability set $\Phi_1$ into the probability set $\Phi_2$ is defined to be a homomorphism if

(a) $\varphi$ is a semigroup homomorphism with respect to the multiplication.

(b) If a sum $p + q$ is defined in $\Phi_1$, then the sum $\varphi(p) + \varphi(q)$ is also defined in $\Phi_2$ and $\varphi(p + q) = \varphi(p) + \varphi(q)$.

(c) If a sum $\varphi(p) + \varphi(q)$ is defined in $\Phi_2$, then the sum $p + q$ is defined in $\Phi_1$ and consequently, by (iib), we have $\varphi(p + q) = \varphi(p) + \varphi(q)$.

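Proposition 2 Let the probability set $\Phi_2$ be a $\varphi$-homomorphic image of a probability set $\Phi_1$. Let $\Phi_1 = \mathbf{O}_1 \cup \mathbf{P}_1 \cup \mathbf{e}_1$ and $\Phi_2 = \mathbf{O}_2 \cup \mathbf{P}_2 \cup \mathbf{e}_2$, where $\mathbf{O}_i$, $\mathbf{e}_i$ are the semigroups of zeros and units, respectively. Then $\varphi(\mathbf{O}_1) = \mathbf{O}_2$, $\varphi(\mathbf{P}_1) = \mathbf{P}_2$ and $\varphi(\mathbf{e}_1) = \mathbf{e}_2$. Also we have $\varphi(\bar{p}) = \overline{\varphi(p)}$ for all $p \in \mathbf{P}_1$.

Proof. Consider the sets $\mathbf{O}_1' = \varphi^{-1}(\mathbf{O}_2)$, $\mathbf{P}_1' = \varphi^{-1}(\mathbf{P}_2)$, $\mathbf{e}_1' = \varphi^{-1}(\mathbf{e}_2)$. Since the sets $\mathbf{O}_2$, $\mathbf{P}_2$ and $\mathbf{e}_2$ do not intersect pairwise, the sets $\mathbf{O}_1'$, $\mathbf{P}_1'$ and $\mathbf{e}_1'$ also do not intersect pairwise, and $\Phi_1 = \mathbf{O}_1' \cup \mathbf{P}_1' \cup \mathbf{e}_1'$. Since $\mathbf{O}_2$, $\mathbf{P}_2$, $\mathbf{e}_2$ are semigroups, the semigroup properties of $\varphi$ imply that the sets $\mathbf{O}_1'$, $\mathbf{P}_1'$, $\mathbf{e}_1'$ are semigroups in $\Phi_1$. Further, using properties (iia) and (iib), one can easily verify that the sets $\mathbf{O}_1'$ and $\mathbf{e}_1'$ satisfy conditions (i)-(iv) of definition 1 and thus are semigroups of zeros and units. In view of proposition 1 we have $\mathbf{O}_1' = \mathbf{O}_1$ and $\mathbf{e}_1' = \mathbf{e}_1$. It follows that $\mathbf{P}_1' = \mathbf{P}_1$. Then, if $p \in \mathbf{P}_1$, there exists an element $\bar{p} \in \mathbf{P}_1$ such that $p + \bar{p} \in \mathbf{e}_1$. Therefore $\varphi(p + \bar{p}) = \varphi(p) + \varphi(\bar{p}) \in \mathbf{e}_2$ and we can set $\overline{\varphi(p)} = \varphi(\bar{p})$.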

(iii) Let $\Phi$ be an arbitrary probability set with a semigroup of zeros $\mathbf{O}$. Propositions 1 and 2 allow us to consider, instead of the probability set $\Phi$, a homomorphic probability set $\Phi_0$ (by proposition 3 below) whose semigroup of zeros consists of a single element. Denote it by $\mathbf{0}$. Then $\mathbf{0}$ possesses all properties of the usual zero, i.e. $p + \mathbf{0} = p$, $\mathbf{0} \cdot p = p \cdot \mathbf{0} = \mathbf{0}$, $\forall p \in \Phi_0$.

Definition 3 A class of equivalence $K_q$ of an element $q \in \Phi$ is the set of all elements $p \in \Phi$ for which $p + \theta_1 = q + \theta_2$ for some $\theta_1, \theta_2 \in \mathbf{O}$. Set

$$\Phi^0 = \{K_q,\ q \in \Phi\}.$$

From definition 3 it is clear that $K_\theta = \mathbf{O}$ for all $\theta \in \mathbf{O}$. Indeed, let $x \in K_\theta$; then by definition 3 we have $x + \theta_1 = \theta + \theta_2$ for some $\theta_1, \theta_2 \in \mathbf{O}$. By (vi) it follows that $x \in \mathbf{O}$. Further, since $p + \theta = \theta + p$, $\theta \in \mathbf{O}$, we have $p \in K_p$.

The following two lemmas are similar to those for conjugate classes in rings, but the proofs are different.

Lemma 1 If $z \in K_p$ then $K_z = K_p$.

Proof. If $z \in K_p$, then by definition 3 we have $z + \theta_1 = p + \theta_2$ for some $\theta_1, \theta_2 \in \mathbf{O}$. Let $x$ be an arbitrary element of $K_z$. Then by definition 3 we have $x + \theta_3 = z + \theta_4$ for some $\theta_3, \theta_4 \in \mathbf{O}$. Adding $\theta_1$ to this equality and using the addition properties in $\Phi$ and the relation $z + \theta_1 = p + \theta_2$, we obtain

$$(x + \theta_3) + \theta_1 = x + (\theta_3 + \theta_1) = (z + \theta_4) + \theta_1 = (z + \theta_1) + \theta_4 = (p + \theta_2) + \theta_4 = p + (\theta_2 + \theta_4).$$

Since $\theta_3 + \theta_1$ and $\theta_2 + \theta_4$ belong to $\mathbf{O}$, from definition 3 it follows that $x \in K_p$, i.e. $K_z \subset K_p$.

Also, from the relation $p + \theta_2 = z + \theta_1$ it follows that $p \in K_z$. Consequently $K_p \subset K_z$ and we have $K_z = K_p$.

Lemma 2 The classes $K_p$ and $K_q$ either coincide or do not intersect.

Proof. Indeed, let $K_p \cap K_q \neq \emptyset$. If $z \in K_p \cap K_q$, then by Lemma 1 we have $K_z = K_p$ and $K_z = K_q$, i.e. $K_p = K_q$.

Proposition 3 In the set $\Phi^0$ one can introduce the operations of multiplication and addition, naturally induced by the operations in $\Phi$, that transform $\Phi^0$ into a probability set (we denote it by $\Phi_0$). Moreover, the semigroup of zeros of the probability set $\Phi_0$ consists of a single element $K_\theta = \mathbf{O}$, $\forall \theta \in \mathbf{O}$, which possesses the properties of a usual zero.

Proof. Define the set $K_p + K_q$ by a term-by-term addition of elements. The definition of $K_p + K_q$ is correct if $p + q$ is defined. Indeed, let us consider $x \in K_p$, $y \in K_q$. Then by definition 3 we have $x + \theta_1 = p + \theta_2$, $y + \theta_3 = q + \theta_4$ for some $\theta_i \in \mathbf{O}$. Since $p + q$ is defined, properties (ii) and (iv) imply

$$(p + \theta_2) + (q + \theta_4) = (p + q) + (\theta_2 + \theta_4) = (x + y) + (\theta_1 + \theta_3).$$

Consequently $x + y \in K_{p+q}$, and it follows that $K_p + K_q \subset K_{p+q}$.

Similarly we can define the set $K_p \cdot K_q$ by term-by-term multiplication. If $x \in K_p$, $y \in K_q$, we have $x + \theta_1 = p + \theta_2$ and $y + \theta_3 = q + \theta_4$, $\theta_i \in \mathbf{O}$. Multiplying the left-hand and right-hand sides of these equalities and applying the properties of $\mathbf{O}$, we obtain

$$(x + \theta_1)(y + \theta_3) = (p + \theta_2)(q + \theta_4) = x \cdot y + \theta' = p \cdot q + \theta'',$$

where $\theta', \theta'' \in \mathbf{O}$. Consequently $x \cdot y \in K_{pq}$ and therefore $K_p K_q \subset K_{pq}$.

Those inclusions, lemma 2 and properties (iii), (iv) allow us to introduce correctly the operations of multiplication and addition on the classes of $\Phi^0$ by

$$K_p \odot K_q = K_{pq}, \qquad K_p \oplus K_q = K_{p+q}. \tag{1}$$

These operations transform the set $\Phi^0$ into a probability set $\Phi_0$. The zero semigroup of $\Phi_0$ consists of a single class $\mathbf{O} = K_\theta$, $\theta \in \mathbf{O}$, and the semigroup of units $\mathbf{e}_0$ consists of the classes $K_\varepsilon$, $\varepsilon \in \mathbf{e}$. Obviously the properties (i)-(vi) of definition 1 can be easily verified. The class $K_\theta = \mathbf{O}$, $\forall \theta \in \mathbf{O}$, possesses all properties of the usual zero, since $K_q \cdot K_\theta = K_{q\theta} = K_\theta = \mathbf{O}$ and $K_q + K_\theta = K_{q+\theta} = K_q$.

We define $\varphi$ on $\Phi$ as $\varphi(p) = K_p$. Obviously the mapping $\varphi$ satisfies the conditions of definition 2 and therefore is a homomorphism of $\Phi$ into $\Phi_0$.

3 Probabilities with hidden parameters

(i) The idea of hidden variables is very popular in quantum mechanics [17]. With the help of hidden variables many investigators try to overcome some difficulties of quantum mechanics. For example, in [18] the p-adic theory of distributions for hidden variables was proposed in order to solve the Bell's inequality paradox.

On the other hand, we propose to consider the hidden variables as hidden parameters of usual probabilities, so that the latter must be abstract probabilities satisfying the conditions of definition 1.

At first we consider one model of hidden parameters for abstract probabilities.

Definition 4 We say that a set of abstract probabilities $\Phi$ allows hidden parameters $A$ (or $\Phi$ has hidden parameters $A$), where $A$ is a certain topological space, if to each $\alpha \in A$ corresponds a subset $P_\alpha \subset \Phi$ such that $\bigcup_\alpha P_\alpha = \Phi$, and the continuous mappings $\varphi$ and $\psi$ from $A \times A \times \Phi \times \Phi$ into $A$ are defined and possess the following properties. The operations

$$(p, \alpha) + (q, \beta) = (p + q,\ \varphi(\alpha, \beta, p, q)), \tag{2}$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \psi(\alpha, \beta, p, q)), \tag{3}$$

where $p \in P_\alpha$, $q \in P_\beta$, $p + q \in P_{\varphi(\alpha, \beta, p, q)}$, $p \cdot q \in P_{\psi(\alpha, \beta, p, q)}$, define on the set of pairs $(p, \alpha)$, $\alpha \in A$, $p \in P_\alpha$, a probability set denoted by $\Phi(A) \subset \Phi \times A$.

Since the left hand side of (2) and (3) contains the operations in the probability set $\Phi$, the hidden parameters can describe additional properties of probabilities, including some possible physical sense. It is obvious that the principal problem concerning the probability with hidden parameters is as follows: can we distinguish statistically the sequences $\zeta_1(\omega), \dots, \zeta_n(\omega), \dots$ and $\eta_1(\omega), \dots, \eta_n(\omega), \dots$, where $\zeta_k(\omega)$ are independent random variables with identical distributions with respect to usual probabilities from [0,1], and $\eta_k(\omega)$ are independent random variables with the same values as $\zeta_k(\omega)$ but with distributions from the probability set $[0,1] \times A$ and satisfying the condition: if $P\{\zeta_k(\omega) \in B\} = p$, then $P\{\eta_k(\omega) \in B\} = (p, \alpha)$ for some $\alpha \in A$.

(ii) Now we consider the principal construction for different examples of usual probability on [0,1] with hidden parameters.

Proposition 4 Let $\Phi = [0,1]$ and let $A$ be some convex semigroup in an arbitrary Banach algebra over $\mathbb{R}$. Then the set $\Phi \times A = \{(p, \alpha),\ \alpha \in A\}$ forms a probability set with respect to the operations

$$(p, \alpha) + (q, \beta) = \left(p + q,\ \frac{p}{p+q}\alpha + \frac{q}{p+q}\beta\right), \quad p + q \le 1, \tag{4}$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \alpha \cdot \beta). \tag{5}$$

Proof. As the zero set $\mathbf{O}$ we consider the set $\{(0, \alpha),\ \alpha \in A\}$ and as $\mathbf{e}$ we consider the set $\{(1, \alpha),\ \alpha \in A\}$. Then all properties of definition 1 can be easily verified. By proposition 3 all elements of the form $(0, \alpha)$, $\alpha \in A$, can be identified with one zero.

A simple interesting example of this kind can be obtained by considering the set of pairs $(p, q)$, $p, q \in [0,1]$, with the operations

$$(p_1, q_1) + (p_2, q_2) = \left(p_1 + p_2,\ \frac{p_1}{p_1 + p_2} q_1 + \frac{p_2}{p_1 + p_2} q_2\right), \quad 0 < p_1 + p_2 \le 1, \tag{6}$$

$$(p_1, q_1) \cdot (p_2, q_2) = (p_1 \cdot p_2,\ q_1 \cdot q_2). \tag{7}$$
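The operations (6), (7) are easy to experiment with. The sketch below implements them with exact rational arithmetic and exposes the two operations so that the commutativity and associativity of the addition and both distributive laws of definition 1 can be checked on sample data. The sample pairs and the function names `add` and `mul` are assumptions made for this illustration only.

```python
from fractions import Fraction as F

def add(x, y):
    """(p1, q1) + (p2, q2) as in (6); defined only for 0 < p1 + p2 <= 1."""
    (p1, q1), (p2, q2) = x, y
    s = p1 + p2
    if not (0 < s <= 1):
        raise ValueError("sum is not defined")
    # Convex combination of the hidden parameters with weights p1/s and p2/s.
    return (s, (p1 * q1 + p2 * q2) / s)

def mul(x, y):
    """(p1, q1) * (p2, q2) as in (7): coordinate-wise product."""
    (p1, q1), (p2, q2) = x, y
    return (p1 * p2, q1 * q2)

# Hypothetical sample elements (p, q) with p, q in [0, 1].
a = (F(1, 4), F(1, 2))
b = (F(1, 3), F(1, 5))
c = (F(1, 6), F(2, 3))
r = (F(3, 4), F(1, 7))
```

The hidden parameter of a sum is the average of the summands' parameters weighted by their probabilities, which is why the addition stays associative although the second coordinate is not added coordinate-wise.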

Obviously, instead of $q \in [0,1]$ we can take the elements of the Banach algebra of sequences of numbers from [0,1] with coordinate-wise multiplication. We can interpret probabilities $(p, Q)$ with hidden parameters $Q = (q_1, q_2, \dots)$, $0 \le q_i \le 1$, as follows: if an event $S$ occurs with the probability $p$, then the probabilities $q_1, q_2, \dots$ can be considered as probabilities of some independent events $S_1, S_2, \dots$ which can occur when $S$ occurs.

Another example of hidden parameters, interesting from a probabilistic point of view, can be obtained when $q = \|q_{ij}\|$ runs over stochastic matrices. Now we can consider a random index $i$, $i = 1, 2, \dots$, with distribution $(p_i, \|q_{km}\|)$. Thus, if the event $i$ occurs with probability $p_i$, then $q_{ij}$ is the probability of some event $S_j$. This duplicates the previous situation, with the difference that the matrix multiplication implies more interpretations.

The problem of a general description of all mappings $\varphi$ and $\psi$ of the set $[0,1]^4$ into [0,1], or the full description of probabilities [0,1] with hidden parameters from [0,1], remains open.

(iii) As a prototype of a general construction of a probability set $\Phi$ with hidden parameters we can consider a set of positive measures $\mathrm{min}(G)$ on some semigroup structure $G$ with the natural operations of addition and composition of measures.

Indeed, let $G$ be an arbitrary locally compact semigroup. Consider the set $\mathrm{min}(G)$ of all positive measures on $G$ with the weak topology. We can naturally define the operation of convolution (composition) on $\mathrm{min}(G)$ as follows: for $\mu, \nu \in \mathrm{min}(G)$ we set [3]

$$\mu * \nu(B) = \mu \times \nu\{(x, y) : x \cdot y \in B,\ x, y \in G\}, \tag{8}$$

where $\mu \times \nu$ denotes the direct product of the measures $\mu$ and $\nu$ on $G$. Then $\mathrm{min}(G)$ is a semigroup with respect to the convolution. Besides, the addition $(\mu + \nu)(B) = \mu(B) + \nu(B)$ and the multiplication by a positive number $\lambda$, $(\lambda\nu)(B) = \lambda\nu(B)$, are defined on $\mathrm{min}(G)$. Obviously the operations of convolution and addition are distributive. Thus the linear set $\mathrm{min}(G)$ is a convex semigroup with respect to convolution.
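For a finite semigroup the convolution (8) becomes a finite double sum, which makes the stated properties easy to check. The sketch below is an illustration under assumptions of our own choosing: $G$ is taken to be $\{0, 1, 2\}$ with addition modulo 3, measures are stored as dictionaries of point masses, and the names `op`, `convolve`, `add` and `mass` are hypothetical helpers.

```python
from fractions import Fraction as F

G = [0, 1, 2]                      # assumed toy semigroup

def op(x, y):
    """Semigroup operation on G: addition modulo 3 (an assumption)."""
    return (x + y) % 3

def convolve(mu, nu):
    """Convolution (8) for point masses: mass of {g} collects all x, y with x.y = g."""
    out = {g: F(0) for g in G}
    for x in G:
        for y in G:
            out[op(x, y)] += mu[x] * nu[y]
    return out

def add(mu, nu):
    """Setwise addition of measures."""
    return {g: mu[g] + nu[g] for g in G}

def mass(mu):
    """Total mass mu(G)."""
    return sum(mu.values())

# Hypothetical positive measures with total mass at most 1.
mu = {0: F(1, 4), 1: F(1, 8), 2: F(0)}
nu = {0: F(1, 10), 1: F(1, 5), 2: F(1, 5)}
lam = {0: F(1, 3), 1: F(0), 2: F(1, 6)}
```

Since the convolution is bilinear in the point masses, the total mass is multiplicative, $\mu * \nu(G) = \mu(G)\,\nu(G)$, so probability measures are closed under convolution.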

The set min(G) possesses almost all properties of the probability sets with respect to these operations except one: there is no semigroup of units in min(G). But if we restrict min(G) we can obtain a convex semigroup possessing all properties of a probability set. To this end we consider the subset min_1(G) of min(G) consisting of all probability measures, i.e. the set of positive measures $\mu$ for which $\mu(G) = 1$. From (8) it follows that min_1(G) is a semigroup. Consider the convex closed semigroup min_[0,1](G) consisting of all non-negative measures $\mu$ for which $0 \le \mu(G) \le 1$. It can be readily seen that the set min_[0,1](G) with the operations of addition and composition satisfies all properties (3.1)-(3.6) of the probability set with a semigroup of units e = min_1(G).

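Each element $\mu$ from min_[0,1](G) can obviously be represented in the form $p \cdot (\frac{1}{p}\mu)$, where $\mu(G) = p \in [0,1]$, $p \neq 0$, $\frac{1}{p}\mu \in$ min_1(G). If $\mu$ and $\nu$ belong to min_[0,1](G), with $\nu(G) = q$, then we have

$\mu + \nu = p\,(\frac{1}{p}\mu) + q\,(\frac{1}{q}\nu) = (p + q)\left(\frac{p}{p+q}\,\frac{1}{p}\mu + \frac{q}{p+q}\,\frac{1}{q}\nu\right)$ (9)

$\mu\nu = p\,(\frac{1}{p}\mu)\; q\,(\frac{1}{q}\nu) = pq\,(\frac{1}{p}\mu)(\frac{1}{q}\nu)$ (10)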

From (9) and (10) we obtain the following.

Proposition 5. The convex semigroup min_[0,1](G) and the set $\mathcal{P}_{\min_1(G)}$ of elements $(p, \alpha)$, $p \in [0,1]$, $\alpha \in$ min_1(G), with the operations (4), (5) are isomorphic.

The probabilities $(p, \mu)$ can be interpreted similarly to item (ii) above. However, the structure of the multiplication of the semigroup is rather more complicated. Consider an algebra of some events $\Gamma$. Suppose that each such event has a state which can be represented by an element of a group G. Let the probabilities $(p_i, \mu_i)$, $\sum_i (p_i, \mu_i) = (1, \varepsilon)$, assign the distribution on events $\Gamma_i \subset \Gamma$, $\Gamma_i \cap \Gamma_j = \emptyset$. Then the probability $(p_i, \mu_i)$ means the choice of an event $\Gamma_i$ with probability $p_i$ and the choice of a state $g \in G$ with distribution $\mu_i$.

It is obvious that the addition and multiplication of these probabilities must be determined by the physical model obtained from an experiment or theoretically

4 Probability sets with a single unit

If a semigroup G is finite then min_[0,1](G) is a convex set in Euclidean space. We will show that this convex set contains probability subsets with a single unit. A special two-dimensional case of such a probability set was presented in section 1.

(i) Let G be a finite group (commutative or non-commutative) with elements $e_1, e_2, \ldots, e_s$, $s \ge 2$. Consider the group algebra G(R), i.e. the linear space of linear forms $x_1 e_1 + \cdots + x_s e_s$, $x_j \in R$, with the group multiplication of the basis elements $e_j$. Assume that the basis $e_j$ is orthonormal. Let min_1(G) be the simplex formed by the vertices $e_1, e_2, \ldots, e_s$ and let the set min_[0,1](G) be the simplex formed by the vertices $0, e_1, e_2, \ldots, e_s$, see Fig. 2. Then a measure $\mu \in$ min_[0,1](G) can be written as $\mu = p_1 e_1 + \cdots + p_s e_s$, where $0 \le p_i \le 1$ and $\sum_i p_i \le 1$. The geometrical center of min_1(G) is the invariant measure $n_G = \frac{1}{s} e_1 + \cdots + \frac{1}{s} e_s$. For any measure $\mu \in$ min_[0,1](G) we have

$\mu n_G = n_G \mu = \mu(G)\, n_G$ (11)

In the special case $\mu \in$ min_1(G) we have $\mu n_G = n_G \mu = n_G$ and $n_G n_G = n_G$. Denote the line passing through the points 0 and $n_G$ by l. Then, as can be seen from Fig. 2, min_1(G) is a part of the hyperplane orthogonal to the line l and passing through the point $n_G$, and min_[0,1](G) is the part of the positive orthant cut off by min_1(G).

Fig. 2

Indeed, Fig. 2 corresponds to the case s = 3, when G is the cyclic group of three elements. This case is of special interest because the algebra G(R) is isomorphic to the direct sum of the field of real numbers and the field of complex numbers$^{19}$. Consider the cube Q shown in Fig. 2. The cube Q consists of all measures $\mu = \sum_{i=1}^{s} p_i e_i$ for which $0 \le p_i \le \frac{1}{s}$.

Proposition 6. The set Q, considered as a subset of the convex semigroup min_[0,1](G), is a probability set with a single zero 0 and a single unit $n_G$.

Proof. Let us establish that the set Q is a semigroup with respect to the multiplication. Indeed, if $\mu = \sum_i p_i e_i$ and $\nu = \sum_j q_j e_j$ belong to Q, then $0 \le p_i \le \frac{1}{s}$, $0 \le q_j \le \frac{1}{s}$, and therefore we have

$\mu\nu = \sum_{i,j} p_i q_j\, e_i e_j = \sum_k \left( \sum_i p_i q_{i_k} \right) e_k$,

where the indexes $i_k$ are defined uniquely for each i and k by the condition $e_i \cdot e_{i_k} = e_k$, i, k = 1, 2, ..., s. Since G is a group, for any fixed k, k = 1, 2, ..., s, the indexes $i_k$ run over 1, 2, ..., s when i runs over 1, 2, ..., s. Therefore we have

$\sum_i p_i q_{i_k} \le \frac{1}{s} \sum_i q_{i_k} \le \frac{1}{s}$.

Now let us show that a complementary element $\tilde\mu$ exists for each $\mu = p_1 e_1 + \cdots + p_s e_s \in Q$. By Definition 1 we must have $\mu + \tilde\mu \in e$. In our case we set $e = n_G$. Then $\mu + \tilde\mu = n_G$ and therefore $\tilde\mu = n_G - \mu = (\frac{1}{s} - p_1) e_1 + \cdots + (\frac{1}{s} - p_s) e_s \in Q$, since $0 \le p_i \le \frac{1}{s}$, i = 1, 2, ..., s. Finally let us check property (3.4). Indeed, if $\mu \in Q$, $\mu \neq n_G$, then $\mu(G) = \lambda < 1$. Thus by virtue of (11) we have $\mu n_G = n_G \mu = \mu(G)\, n_G = \lambda n_G$.

The remaining properties of Definition 1 for the set Q follow straightforwardly from the properties of the probability set min_[0,1](G).

Note that the Kolmogorov condition (7) holds in Q
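The statements of Proposition 6 can be spot-checked numerically. The following is only a sketch under stated assumptions: G is taken to be the cyclic group of order 3 (the case of Fig. 2), measures are lists of point masses, and the helper names `convolve`, `in_cube` and `random_cube_point` are hypothetical. It samples points of the cube Q and checks closure under convolution, relation (11), and the existence of complementary elements.

```python
from fractions import Fraction
from itertools import product
import random

S = 3                                   # cyclic group Z_3, assumed for illustration
N_G = [Fraction(1, S)] * S              # invariant measure n_G = (1/s, ..., 1/s)

def convolve(mu, nu):
    """Convolution of two measures on Z_S given by their point masses."""
    out = [Fraction(0)] * S
    for x, y in product(range(S), repeat=2):
        out[(x + y) % S] += mu[x] * nu[y]
    return out

def in_cube(mu):
    """Membership in the cube Q: every coordinate lies in [0, 1/s]."""
    return all(0 <= p <= Fraction(1, S) for p in mu)

def random_cube_point(rng):
    """A random point of Q with rational coordinates."""
    return [Fraction(rng.randint(0, 100), 100 * S) for _ in range(S)]

rng = random.Random(0)
samples = [random_cube_point(rng) for _ in range(50)]

# Q is closed under convolution.
closed = all(in_cube(convolve(m, n)) for m in samples for n in samples)

# Relation (11): mu n_G = mu(G) n_G.
invariant = all(convolve(m, N_G) == [sum(m) * g for g in N_G] for m in samples)

# The complementary element n_G - mu stays inside Q.
complements_ok = all(in_cube([g - p for g, p in zip(N_G, m)]) for m in samples)
```

A random sample is of course not a proof; it only illustrates the inequalities used in the argument above.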

(ii) It proves to be possible to construct an even more general kind of probability sets with a single unit as subsets of the set min_[0,1](G). For this purpose we consider an arbitrary convex semigroup S(G) in min_1(G) and the convex set $S_0(G)$ formed by zero (0) and the elements of the set S(G). One can readily see that $S_0(G)$ also satisfies the properties of a probability set in which S(G) is the set of units.

Now we consider the set Q(S, G), which is the intersection of the set $S_0(G)$ and all half-spaces containing zero and bounded by hyperplanes parallel to the faces of $S_0(G)$ and passing through the point $n_G$.

Proposition 7. Let S be an arbitrary convex semigroup in min_1(G), centrally symmetric with respect to the point $n_G$. Then Q(S, G) is a probability set with a single zero and a single unit.

Proof. We shall show that Q(S, G) is a semigroup with respect to convolution, and hence Q(S, G), as a subset of min_[0,1](G), is a probability set with a single unit $n_G$. First note that, in view of the central symmetry of S with respect to $n_G$, the intersection of any face of $S_0(G)$ with any hyperplane passing through the element $n_G$ and parallel to another face lies in the intersection of the faces of $S_0(G)$ and the hyperplane h passing through $n_G$ and perpendicular to the line l.

Fig. 3 shows a plane $\pi$ passing through the point $\mu_0 \in S_0(G)$ and the line l. The rhombus $0 A n_G B$ is the intersection of Q(S, G) with this plane. Each element $\mu$ of this rhombus can be represented as $\mu = n_G - \lambda_1 \mu_1$, where $\mu_1 \in S(G)$, $0 \le \lambda_1 \le 1$. Similarly, for each other element $\nu$ of Q(S, G) we also have $\nu = n_G - \lambda_2 \nu_1$, where $\nu_1 \in S(G)$, $0 \le \lambda_2 \le 1$.

Fig. 3

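Therefore the product $\mu\nu$ equals

$(n_G - \lambda_1 \mu_1)(n_G - \lambda_2 \nu_1) = n_G - \lambda_2 n_G \nu_1 - \lambda_1 \mu_1 n_G + \lambda_1 \lambda_2 \mu_1 \nu_1 = (1 - \lambda_1 - \lambda_2)\, n_G + \lambda_1 \lambda_2\, \mu_1 \nu_1$ (12)

Let us show that the element (12) belongs to Q(S, G). Consider the first case, when either $\lambda_1$ or $\lambda_2$ is greater than $\frac{1}{2}$. Let, for example, $\lambda_1 > \frac{1}{2}$.

Then the point $\mu$ lies in the left-hand side of the rhombus and thus can be represented as $t \cdot \tilde\mu$, $\tilde\mu \in S(G)$, $t \le \frac{1}{2}$. On the other hand, we have $\nu = \tau \cdot \tilde\nu$ for $\nu \in Q(S, G)$, where $\tilde\nu \in S(G)$, $0 \le \tau \le 1$. Therefore the product $\mu\nu$ is equal to $t\tau \cdot \tilde\mu\tilde\nu$, where $\tilde\mu \cdot \tilde\nu \in S(G)$ and $0 \le t\tau \le \frac{1}{2}$. Consequently, by construction of Q(S, G), the measure $\mu\nu$ lies to the left of the hyperplane h (Fig. 3), and consequently $\mu\nu \in Q(S, G)$.

Now consider the case when $\lambda_1 \le \frac{1}{2}$, $\lambda_2 \le \frac{1}{2}$. Then $p = 1 - \lambda_1 - \lambda_2 \ge 0$ and $q = \lambda_1 \lambda_2 \ge 0$. Let us show that the inequality $p + 2q \le 1$ holds, which is equivalent to the inequality $\lambda_1 + \lambda_2 \ge 2\lambda_1\lambda_2$. Indeed, $(\lambda_1 - \lambda_2)^2 = \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Since $0 \le \lambda_1 \le 1$, $0 \le \lambda_2 \le 1$, we have $\lambda_1 + \lambda_2 - 2\lambda_1\lambda_2 \ge \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Whence $p + 2q \le 1$.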

Thus from (12) we have $\mu\nu = p\, n_G + q\, \mu_1\nu_1$, $\mu_1 \nu_1 \in S(G)$, $p, q \ge 0$, $p + 2q \le 1$. Let us show that the measure $m = p\, n_G + q\, \omega$ belongs to Q(S, G) for any measure $\omega \in S(G)$.

Fig. 4 shows the plane passing through the points 0, $\omega$ and $n_G$. The point $m = p\, n_G + q\, \omega$ lies on the line parallel to $0\omega$ and passing through $p\, n_G$.

Now, to prove that m belongs to Q(S, G), it suffices to demonstrate that $|q\omega| \le |\Delta|$. By similarity of the triangles $0\,\omega\,n_G$ and $p n_G\, B\, n_G$ we have

$\frac{|2\Delta|}{|\omega|} = \frac{(1 - p)\,|n_G|}{|n_G|} = 1 - p$.

That is, $|\Delta| = \frac{1}{2}(1 - p)|\omega|$. Then

$\frac{|\Delta|}{|q\omega|} = \frac{\frac{1}{2}(1 - p)}{q} = \frac{1 - p}{2q} \ge 1$

follows from the inequality $p + 2q \le 1$.

Hypothesis. For arbitrary S(G) $\subset$ min_1(G), the set Q(S, G), as a subset of the convex semigroup min_[0,1](G), is a probability set with a single zero 0 and a single unit $n_G$.


We would like to note in connection with the examples of section 1 that a general description of probability sets in topological semi-fields and in the field of p-adic numbers is of a great interest for applications

We hope that problems of an experimental determination of abstract probabilities will be considered in the continuation of this work

5 Acknowledgments

In conclusion I want to express my gratitude to A. Yu. Khrennikov (Vaxjo Univ., Sweden), Yu. V. Prokhorov, O. V. Viskov, I. V. Volovich (all of Steklov Mathematical Institute, Russia), V. Ja. Kozlov (Academy of Cryptography, Russia), V. I. Serdobolskii (Moscow Univ. of Electronics and Math., Russia) and A. K. Kwasniewski (Bialystok Univ., Institute of Computer Science, Poland) for discussions and their advice on the foundations of probability theory and quantum mechanics. This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University.

References

1. A. N. Kolmogorov, Foundations of the theory of probability (Chelsea Publ. Comp., New York, 1956).
2. T. L. Fine, Theories of probabilities, an examination of foundations (Academic Press, New York, 1973).
3. H. Heyer, Probability measures on locally compact groups (Springer-Verlag, Berlin-Heidelberg-New York, 1977).
4. Y. P. Studnev, TV and its applications 12, 727 (1967).
5. R. P. Feynman, Negative probability, in Quantum implications: Essays in Honour of David Bohm, eds. B. J. Hiley and F. D. Peat (Routledge and Kegan Paul, London, 1987).
6. P. Dirac, Rev. Mod. Phys. 17, 195 (1945).
7. O. G. Smolyanov and A. Y. Khrennikov, Dokl. Akademii Nauk USSR 281, 279 (1985).
8. V. S. Vladimirov, I. V. Volovich and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1993).
9. A. Y. Khrennikov, Theor. and Math. Phys. 97, 348 (1993).
10. A. Y. Khrennikov, Doklady Mathematics 55, 402 (1997).
11. A. Y. Khrennikov, Mathematical and physical arguments for the change of Kolmogorov's axiomatics, Trends in Contemporary Inf. Dim. Analysis and Quantum Probability, N1, 215-249 (2000).
12. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes, in The wave-particle dualism (D. Reidel Publ. Company, Dordrecht, 1958).
13. C. C. Chang, Transactions of the Amer. Math. Soc. 86, 467 (1958).
14. R. S. Grigolia, Algebraic analysis of Lukasiewicz-Tarski's n-valued logical systems, in Selected papers on Lukasiewicz sentential calculi (PAN, Ossolineum, Poland, 1977).
15. T. A. Sarymsakov, Topological semi-fields and its applications (FAN, Tashkent, 1989).
16. T. A. Sarymsakov, Topological semi-fields and probability theory (FAN, Tashkent, 1969).
17. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).
18. A. Y. Khrennikov, Physics Letters A 200, 219 (1995).
19. B. L. van der Waerden, Algebra I, Achte Auflage der Modernen Algebra (Springer-Verlag, Berlin-Heidelberg-New York, 1977).


QUANTUM K-SYSTEMS AND THEIR ABELIAN MODELS

H. NARNHOFER
Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, A-1090 Wien
E-mail: narnh@ap.univie.ac.at

In this review the concept of quantum K-systems is studied on one hand based on a set of increasing algebras on the other hand with respect to entropy properties We consider in examples how far it is possible to find abelian models

1 Introduction

Classical ergodic theory is a powerful discipline both in mathematics and physics to analyze mixing properties of a given dynamics Since in physics the mixing properties take place on the microscopic level that is controlled by quantum theory it is natural to try to translate the concepts of classical ergodic theory also into the quantum framework and to study how far these concepts can find their quantum counterpart and whether new features appear

One possibility is the following we start with a classical dynamical system eg a free particle on a hyperbolic manifold with finite measure and quantize the dynamics ie study the properties of the Laplace-Beltrami operator on this manifold Since the manifold has finite measure the Laplace-Beltrami operator has necessarily discrete spectrum1 and the classical mixing properties can only have their footprints in the distribution of the eigenvalues at high energy23 Many deep results have been found on the basis of this approach But in this review we will follow another path of considerations

We start with the classical dynamical system with optimal mixing properties, the Kolmogorov system$^{4,5,6}$. It can be characterized either by its algebraic structure or by properties of its dynamical entropy. Both concepts find their counterpart in quantum systems$^{7}$, but they are not equivalent any more.

First we will give the definition of an algebraic K-system and some definitions of dynamical entropies. One of them relates the quantum system to classical K-systems that can be considered as models of the quantum system. Then we will give examples of algebraic quantum K-systems and will discuss how far they can be represented by classical models. Finally we will give examples of quantum K-systems for which no classical model exists and, on the other hand, a quantum dynamical model that allows the construction of a classical model but for which the algebraic K-property so far cannot be controlled.

2 Classical K-System

Let us repeat the characteristics of a classical dynamical system $(A, \sigma, \mu)$, where we take A to be the abelian algebra built by the characteristic functions over a measure space with measure $\mu$, and $\sigma$ an automorphism over A with $\mu \circ \sigma = \mu$.$^{4,5,6}$

Definition 2.1: We call $(A, A_0, \sigma, \mu)$ a K(olmogorov) system if

$A_0 \subset A, \quad \sigma A_0 \supset A_0, \quad \bigcup_n \sigma^n A_0 = A, \quad \bigcap_n \sigma^{-n} A_0 = \lambda 1$ (2.1)

For a given classical dynamical system $(A, \sigma, \mu)$ we can decide in several ways if some $A_0$ (that is not unique) exists so that $(A, A_0, \sigma, \mu)$ form a K-system.$^{5,6}$

A) Choose some finite subalgebra $B \subset A$ (i.e. some finite partition of the measure space) and construct its past algebra $A_0 = \bigcup_{n \in N} \sigma^{-n} B$. If $A_0$ is a proper subalgebra of A it will increase in time. Check if $\bigcup_n \sigma^n A_0 = A$; if not, B has to be increased. If B is large enough, check if $\bigcap_n \sigma^{-n} A_0 = \lambda 1$.

B) Consider the conditional entropy $H(B|A_0)$. If this expression is strictly positive $\forall B$, $(A, \sigma, \mu)$ is a K-system.

C) If

$\lim_{n \to \infty} H(\sigma^n B | A_0) = H(B) \quad \forall B$ (2.2)

then $(A, \sigma, \mu)$ is a K-system.

The classical K-system can also be characterized by its clustering properties. Let $(A, A_0, \sigma, \mu)$ be a K-system. Then to every $B \in A$, $\varepsilon > 0$ $\exists\, n_0$ such that

$|\mu(B \sigma^{-n} A) - \mu(B)\mu(A)| < \varepsilon \mu(A) \quad \forall A \in A_0,\ n > n_0$ (2.3)

The prototype of a K-system are the Bernoulli shifts (including the Baker transformation). We regard the Bernoulli shift as an infinite tensor product $A = \bigotimes_{l \in Z} B_l$, where $B_l$ is isomorphic to a finite abelian algebra, $B_l \approx B_0 = \{P_1, \ldots, P_k\}$, with projections $P_i$ with expectation values $\mu_i$. The dynamics is given as the shift $\sigma$ over the tensor product. The state $\mu$ has to be translation invariant. It can be the tensor product of the local state, but we allow also spatial correlations. The dynamical entropy is given by

s u p t f l Q c S I | J arB (24) t=0 rlt-l+n J

= s u p i f f M J lt r B j (25)

and coincides with H(B) if the state $\mu$ factorizes.
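The last statement can be illustrated with a short calculation. The sketch below assumes a factorizing (product) state with single-site weights chosen arbitrarily for the example, and checks that the entropy of the partition generated by n consecutive sites grows exactly linearly, so that the entropy per shift step equals H(B). The function names `shannon` and `block_entropy` are hypothetical.

```python
import math
from itertools import product

def shannon(probs):
    """Shannon entropy -sum p ln p of a probability distribution (natural logarithm)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def block_entropy(weights, n):
    """Entropy of the partition generated by n neighbouring sites under the product state."""
    return shannon(math.prod(w) for w in product(weights, repeat=n))

# Expectation values of the projections P_1, ..., P_k at one site (illustrative numbers).
weights = (0.5, 0.3, 0.2)

single_site = shannon(weights)
rates = [block_entropy(weights, n) / n for n in range(1, 6)]
```

For a translation invariant state with spatial correlations the ratio `block_entropy / n` would instead decrease towards the dynamical entropy; the product state is the special case where it is constant.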

3 Algebraic Quantum K-Systems

It is obvious that one can adopt Definition 2.1 directly to define an algebraic quantum K-system. It is also obvious that the definition is not empty, because we can construct the quantum analogue of a Bernoulli shift by taking for B a nonabelian algebra, e.g. a full matrix algebra $M_{k \times k}$. In the following we will first discuss physical applications of this quantum Bernoulli shift and then turn to generalizations.

A A model for Quantum Measurement

We start with a finite-dimensional algebra B and a state $\omega$ over B. In order to determine $\omega$ we have to make many copies of $\omega$ and repeat a variety of measurements. The classical Bernoulli shift consists of projections, and every measurement gives as outcome 0 or 1 on these projections, with probability corresponding to the state $\mu$. By repeated measurements we can determine $\mu$ with exponentially increasing security.

In the quantal situation a measurement corresponds to picking some abelian subalgebra $B_0$ of B, maximal abelian if the measurement is sharp, and again the outcome of the measurement will be 0 or 1 on the projections in $B_0$. To determine the state $\omega$ we have to vary the measurements, respectively the algebras $B_0$. Since the state space over B is compact, it suffices to vary over finitely many $B_0$. Let $\omega(P_j) = p_j$ for $P_j \in B_0$. To get security on the density distribution with respect to $B_0$, the number of experiments has to be of the order $p_j(1 - p_j)/\varepsilon^2$. For the algebra $B_0$ that commutes with the density matrix $\rho$ corresponding to $\omega$, the entropy $S(\rho_{B_0})$ is minimal and approximative security on the density distribution is reached for the smallest number of measurements. For other abelian subalgebras $\bar B_0$ we are satisfied with less security: we have just to be sure that $\rho_{\bar B_0}$ is more mixed than $\rho_{B_0}$. With $p_j = \omega(P_j)$ for $P_j \in B_0$ and $\bar p_j = \omega(\bar P_j)$ for $\bar P_j \in \bar B_0$, the probability that the outcome of $\bar N$ measurements gives a probability $q_j > p_j + \varepsilon$ is

$\exp\left( - \frac{\bar N (\bar p_j - p_j - \varepsilon)^2}{\bar p_j (1 - \bar p_j)} \right)$ (3.1a)

This has to be compared with the security given by N measurements on $B_0$:

$\exp\left( - \frac{N \varepsilon^2}{p_j (1 - p_j)} \right)$ (3.1b)

Therefore the number of experiments $\bar N$ necessary to control $\rho_{\bar B_0}$ is small compared to the number N that fixes $\rho_{B_0}$ and at the same time $\rho$. If we interpret the entropy as a measure of the reliability of a sequence of measurements, we see that it is not changed compared to the classical expression, i.e. the same order of experiments is necessary, and therefore

$S(\rho) = S(\rho_{B_0}) = -\mathrm{Tr}\, \rho \ln \rho$ (3.2)
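The claim that the abelian subalgebra commuting with the density matrix gives the minimal entropy can be made concrete for a single qubit. The following sketch is an illustration under assumptions not taken from the text: the density matrix is real symmetric, $\rho = [[a, c], [c, 1 - a]]$, a sharp measurement is parametrized by an angle `theta` of the measured basis, and the helper names are hypothetical. It compares the von Neumann entropy with the Shannon entropy of the outcome distribution in each basis.

```python
import math

def shannon(probs):
    """Shannon entropy -sum p ln p."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def von_neumann_entropy(a, c):
    """-Tr rho ln rho for the real symmetric density matrix [[a, c], [c, 1 - a]]."""
    r = math.hypot(a - 0.5, c)          # eigenvalues are 1/2 + r and 1/2 - r
    return shannon((0.5 + r, 0.5 - r))

def measured_entropy(a, c, theta):
    """Entropy of the outcomes when measuring in the basis rotated by the angle theta."""
    # Probability of the projection onto (cos theta, sin theta).
    p = a * math.cos(theta) ** 2 + (1 - a) * math.sin(theta) ** 2 + c * math.sin(2 * theta)
    return shannon((p, 1 - p))

a, c = 0.7, 0.3                          # an arbitrary valid qubit state
s_rho = von_neumann_entropy(a, c)

# Angle of the eigenbasis of rho, i.e. of the abelian algebra commuting with rho.
theta_star = 0.5 * math.atan2(2 * c, 2 * a - 1)

grid = [k * math.pi / 180 for k in range(180)]
entropies = [measured_entropy(a, c, t) for t in grid]
```

In every other basis the outcome distribution is more mixed, which is the sense in which the measurements on the other abelian subalgebras require less accuracy.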

Remark: In ref. 8 the Shannon information, resp. the von Neumann entropy (3.2), was questioned to be the appropriate quantity. But in these considerations it was not taken into account that measurements on different abelian subalgebras are correlated. We have incorporated these correlations by taking into account the varying necessary accuracy, and in this way got the desired result.

B Lattice Systems

Again we choose a matrix algebra B and define $A = \bigotimes_{n \in Z} B_n$ as before. But now the algebra describes particles on a lattice (one-dimensional for $n \in Z$), the shift corresponds to space translation, and the translation invariant state describes the system in e.g. the ground state or the equilibrium state with respect to some Hamiltonian, e.g. the Heisenberg ferromagnet. Therefore in general the state will not factorize but be obtained as$^{9}$

$\omega(A) = \lim_{\Lambda \to Z} \frac{\mathrm{Tr}\, e^{-\beta H_\Lambda} A}{\mathrm{Tr}\, e^{-\beta H_\Lambda}}$ (3.3)

We assume that the sequence of local Hamiltonians $H_\Lambda$ determines a time automorphism on the algebra that commutes with space translation. We can assume that $\omega(A)$ is space translation invariant. In order that we have an algebraic K-system on the von Neumann level (in the weak topology) it is necessary that the state is extremal space translation invariant. This can be achieved, if necessary, by a unique decomposition as in the classical situation$^{9}$.

C Fermi Systems

We consider the CAR algebra $A\{a(f), a^\dagger(g)\}$ either over $\mathcal{L}^2(Z)$ or $\mathcal{L}^2(R)$. The shift defines an automorphism over A, and the K-property is satisfied with $A_0 = \{a(f), a^\dagger(f),\ \mathrm{supp}\, f \in Z^-$ or $R^-\}$. This is not a Bernoulli K-system, because creation and annihilation operators anticommute.

D Quantum Stationary Markov Processes

Another example 10 of a K-system is provided by stationary Markov chains Here many variations of the definition of such a Markov chain exist We give an explicit example that again cannot be imbedded into a Bernoulli system

Let $A_0$ be a 2 x 2 matrix algebra and $C = \bigotimes_{n \in Z} C_n$ a Bernoulli system, $C_n$ again a 2 x 2 matrix algebra. Define the map $T_1: A_0 \otimes 1 \to A_0 \otimes C_1$ by

$T_1(\sigma_x \otimes 1) = \sigma_x \otimes \sigma_x$

$T_1(\sigma_y \otimes 1) = \sigma_x \otimes \sigma_y$ (3.4)

$T_1(\sigma_z \otimes 1) = 1 \otimes \sigma_z$
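That such an assignment on the Pauli matrices can be extended to an algebra homomorphism requires the images to satisfy the same multiplication rules as the Pauli matrices themselves. The sketch below checks this with explicit 4 x 4 matrices. It assumes the reading of (3.4) with all signs positive, which is a reconstruction from the garbled source rather than a confirmed formula; the helpers `matmul`, `kron` and `scale` are written out in plain Python for self-containedness.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def scale(c, a):
    """Multiply a matrix by the scalar c."""
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Images of sigma_x, sigma_y, sigma_z under the assumed form of T_1.
T1 = {"x": kron(SX, SX), "y": kron(SX, SY), "z": kron(I2, SZ)}
I4 = kron(I2, I2)

# Pauli relations: each image squares to the identity, and
# T(x)T(y) = i T(z), T(y)T(z) = i T(x), T(z)T(x) = i T(y).
squares_ok = all(matmul(T1[k], T1[k]) == I4 for k in "xyz")
cyclic_ok = (
    matmul(T1["x"], T1["y"]) == scale(1j, T1["z"])
    and matmul(T1["y"], T1["z"]) == scale(1j, T1["x"])
    and matmul(T1["z"], T1["x"]) == scale(1j, T1["y"])
)
```

Since all matrix entries are 0, plus or minus 1, or plus or minus i, the comparisons are exact and no tolerance is needed.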

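On C we consider the shift $\tau$ and a $\tau$-invariant state $\omega$. Therefore we can define

$T = (T_1 \otimes \mathrm{id}_{C \setminus C_1}) \circ (\mathrm{id}_A \otimes \tau)$ (3.5)

Then $A_{[m,n]} = \bigcup_{m \le k \le n} T^k(A_0)$ and $(A_{[-\infty,\infty]}, A_{[-\infty,0]}, T, \varphi \otimes \omega)$ define a K-system for arbitrary states $\varphi$ over $A_0$. It can easily be seen that, though $A_{[-\infty,\infty]}$ can be imbedded in $A \otimes C$, the automorphism T is not asymptotically abelian: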

[Tnax reg l)az regl) = ioyregox ax (36)

E Price-Powers Shift

Another illustrative example for a quantum K-system is the Price-Powers shift$^{11}$.

Let $e_i$ be a unitary satisfying $e_i^2 = 1$. Let

$e_i e_k = (-1)^{g(|i - k|)} e_k e_i$ with $g(|i - k|) \in \{0, 1\}$ (3.7)

Let $\sigma e_k = e_{k+1}$. Then

$(A_0 = \{e_i, i \le 0\},\ A = \{e_i, i \in Z\},\ \sigma,\ \tau)$

form an algebraic K-system, where $\tau$ is the tracial state

$\tau(e_I) = \delta_{I, \emptyset}$ with $e_I = \prod_{i_1, \ldots, i_k \in I} e_{i_1} \cdots e_{i_k}$ (3.8)

Special examples are

a) g(1) = 1, g(k) = 0 otherwise. Then the algebra coincides with $\bigotimes_k M^{k}_{2 \times 2}$, where

$e_{2k} = \sigma_z \otimes \sigma_z \in M_{k+1} \otimes M_k$

$e_{2k+1} = 1 \otimes \sigma_x \in M_k$

b) $g(i) = 1\ \forall i$. Then the algebra coincides with CAR on Z,

$e_i = a_i + a_i^\dagger$

Other explicit examples can be found in ref. 12. In all these examples (A - E) we inherit from the classical theory the following

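Theorem: Let $(A, A_0, \sigma, \omega)$ be a K-system and $\omega$ an extremal translationally invariant state. (That is equivalent to $\bigcap_n \sigma^{-n} A_0 = \lambda 1$ in the strong topology.) Then to every $A$, $\varepsilon > 0$ $\exists\, n_0$ such that

$|\omega(A \sigma^{-n} B) - \omega(A)\omega(B)| < \varepsilon \|B\|, \quad n > n_0,\ B \in A_0$ (3.9)

Therefore we have the same clustering properties as in (2.3).

Proof: If $\omega$ is the tracial state, $\tau(AB) = \tau(BA)$, then in the GNS representation

$\omega(B) = \langle \Omega | \pi(B) | \Omega \rangle$

$\pi(A_0)$ defines a projection operator $P_0$, $P_0 H = \pi(A_0)\Omega$, that is increasing respectively decreasing in $\sigma^n$,

$\omega(A \sigma^{-n} B) = \omega(A \sigma^{-n} P_0 \sigma^{-n} B)$

and

$\mathrm{st\text{-}}\lim_{n \to \infty} \sigma^n P_0 = 1, \quad \mathrm{st\text{-}}\lim_{n \to \infty} \sigma^{-n} P_0 = |\Omega\rangle\langle\Omega|$ (3.10)

If $\omega$ is not the tracial state but a KMS state, it cannot be excluded that $\Omega$ is not only cyclic for $\pi(A)$ but also for $\pi(A_0)$. But in this case the modular operator $\Delta_0$ corresponding to $\pi(A_0)$ can replace $P_0$ for controlling the cluster properties, and satisfies$^{13}$

$\mathrm{st\text{-}}\lim_{n \to \infty} \sigma^{-n} \frac{1}{\Delta_0^{1/2} + 1} \sigma^{n} = \frac{1}{2} |\Omega\rangle\langle\Omega|$ (3.11)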

4 Dynamical Entropy

The dynamical entropy of classical ergodic theory can be interpreted in two different ways

If we use the definition

$h(\sigma) = \sup_B H(\sigma, B) = \sup_B H(B \mid \bigcup_n \sigma^{-n} B)$ (4.1)

then it measures how the algebraic K-system increases and how, in the course of time, our information on the complete system increases.

If we concentrate on the fact that

$\lim_{k \to \infty} H(\sigma^k B \mid \bigcup_n \sigma^{-n} B) = H(B)$ (4.2)

it describes that the remote past becomes more and more irrelevant for the present. Both properties can inspire us to look for an appropriate definition of a dynamical entropy for a quantum dynamical system.

a) For an algebraic K-system we can just copy the definition of a classical K-system.

Definition: Given two subalgebras $A, B \subset M$ and $\omega$ a state over M. Then we define, with $S(\omega|\varphi)$ the relative entropy, the conditional entropy $H_\omega(A|B)$:

$H_\omega(A|B) = \sup \sum_i \left( S(\omega|\omega_i)_A - S(\omega|\omega_i)_B \right)$ (4.3)

Evidently $H(A|B) \ge 0$. By monotonicity of the relative entropy, $H(A|B) = 0$ if $A \subset B$.

Let $(A, A_0, \sigma, \omega)$ be an algebraic K-system. Then $H_\omega(\sigma A_0|A_0)$ measures how fast $A_0$ is increasing. The above expression has not been much investigated. The main reason lies in the fact that for a given quantum dynamical system, different to the classical situation, no strategy is known to decide whether an $A_0$ with the desired properties exists. If it exists, there is no reason to assume that it is unique. In the classical situation the dynamical entropy does not depend on the special choice of $A_0$. In a quantum system, due to the lack of a constructive approach to $A_0$, we also have no chance to compare $H(\sigma A_0|A_0)$ with respect to different past algebras $A_0$.

There exists also another characterization for the amount of increase

For $A \supset A_0$, both type II$_1$ algebras, define $P_0$ the projector on $A_0\Omega$ in the GNS representation of the tracial state over A, $P_0 \in \pi(A_0)'$. Then$^{14}$

$[A : A_0] = \tau(P_0)^{-1}$ (4.4)

$\tau$ the trace over $\pi(A_0)'$.

This definition has been generalized to type III algebras by ref. 15. Note that it is not state dependent. As a typical example it can be evaluated for the Price-Powers shift: both (4.3) and (4.4) are independent of the sequence g and give ln 2 resp. 2. But it should be noted that in general there exists only an order relation$^{16}$

H(aAoAo) lt 2 1 o g M o M-

b) The main obstacle to using (4.3) or (4.4) as a definition for the dynamical entropy comes from the fact that for noncommutative algebras in general $\bigcup_{n=1}^{\infty} \sigma^{-n} B$ will increase in a way that can hardly be controlled.

An illustrative example is given by the following observation$^{17}$.

Take $A = \{a(f), a^\dagger(g),\ f, g \in \mathcal{L}^2(R)\}$, with $\sigma$ the space translation. We know already that it corresponds to a K-system with $A_0 = \{a(f), a^\dagger(g),\ f, g \in \mathcal{L}^2(R^-)\}$. But if we pick $a(e^{-x^2})$ and construct the algebra $\hat A_0 = \{a(e^{-(x-a)^2}),\ a > 0\}$, then $\hat A_0$ coincides with A: if it would not, we could find some f with $\langle f | e^{-(x-a)^2} \rangle = 0\ \forall a > 0$, and this is impossible due to the analyticity properties of the Gauss function.

Due to this fact, ref. 18 proposed the following definition for a dynamical entropy.

Definition: Let M be a hyperfinite von Neumann algebra with a faithful normal trace $\tau$. Let Pf(M) be the family of finite subsets of M. Let $\omega, \chi \subset M$. We write

$\omega \subset_\delta \chi$

if for every $x \in \omega$ there exists a $y \in \chi$ such that

$\tau((x - y)^*(x - y)) < \delta$ (4.5)

Let $\mathcal{F}(M)$ be the family of finite dimensional C* subalgebras of M. Then

$r_\tau(\omega, \delta) = \inf\{\mathrm{rank}\, A : A \in \mathcal{F}(M),\ \omega \subset_\delta A\}$ (4.6)

$ha_\tau(\sigma, \omega, \delta) = \limsup_{n \to \infty} \frac{1}{n} \log r_\tau\left( \bigcup_{j=0}^{n-1} \sigma^j(\omega), \delta \right)$

$ha_\tau(\sigma, \omega) = \sup_{\delta > 0} ha_\tau(\sigma, \omega, \delta)$

$ha_\tau(\sigma) = \sup\{ ha_\tau(\sigma, \omega) : \omega \in Pf(M) \}$ (4.7)

The notation stands for approximation entropy of $\sigma$.

The above definition allows many variations For instance the lim sup can be replaced by a lim inf and we can hope but it is not proven that this does not change the definition

New information can be gained if we change the approximation conditions (45)

The topological entropy uses the approximation in norm. But to keep generality we cannot assume that the full matrix algebra belongs to A. Concentrating on nuclear C* algebras, we have to approximate via completely positive maps $(\varphi, \psi, B)$, with B a finite dimensional algebra, $\varphi: M \to B$ and $\psi: B \to M$, such that

$\|\psi \circ \varphi(a) - a\| < \delta \quad \forall a \in \omega$ (4.8)

$hat(\sigma)$ is defined as $ha_\tau$, only under the new approximation condition. If M is an AF-algebra and therefore possesses a tracial state, then the topological entropy dominates the approximation entropy:

$ha_\tau(\sigma) \le hat(\sigma)$ (4.9)

As another possibility we can approximate ip o p(a) mdash a in the strong topology in a given representation corresponding to a state ip and replace the rank of the best algebra A by the entropy19

s = (ipoip)

All these definitions satisfy the requirement that they coincide with the usual definitions (state dependent dynamical entropy or topological enshytropy) if we apply them to commutative algebras

Let us finally remark that applied to the Price-Powers shift again indeshypendent of g (37)

haT(a) = hat(a) = ht - ltp(a) = ^ H(AoW1 AQ) (410) Li

For further studies we refer to (Stormer Choda Dykema)20 21 22

c) An approach that differs very much from the mathematically motivated definition of Voiculescu is offered by Alicki and Fannes$^{23}$. It is motivated by the concrete method by which we are able to determine the state of a system by experiment: we perform a measurement and repeat the measurement in the course of time. Here we use the idea of the history of a system as discussed e.g. in refs. 24, 25.

A single measurement corresponds to a partition of unity

$\sum_{j=0}^{k-1} x_j^* x_j = 1$ (4.11)

In fact we may think that the x^ are commutative selfadjoint projecshytion operators But by time evolution this commutativity is destroyed anyhow and also for the necessary estimations it is preferable to conshysider this generalized partition of unity without further restrictions on Xi Repetition of the measurement corresponds to a composed partition

X = (x0xbdquo-i)

ax = ((TX0 o-xn_i)

VXdegX = ( ltTXi---Xk)

ie a partition of size k2

(iixXjn) = MX


defines a density matrix of dimension k with entropy

Hx) ~ S(MX (412)

As dynamical entropy h(x) we define

h(x) = limsupmdash H(am~1xdeg---vxdegx) m rn

= limsup mdash S(Mam-ixo axox)

ha) = suph(X) (413)

But here a problem arises if we do not restrict B in the algebra A we lose control on the dynamical entropy For instance if we take as C-algebra the Cuntz algebra9 with 1117j mdash and UfUj = Pj and use the Ui for then the identity map has infinite dynamical entropy If for instance we consider the shift on the lattice system B) then we can choose as natural subalgebra B that is dense in A the algebra of strictly local operators Some weakening of this restriction is possible and this is of course necessary if we want to apply the theory to time evolution with interaction where local operators immediately delocalize But this derealization decreases exponentially fast in space26 therefore B consisting of exponentially localized operators should be sufficient to define a dynamical entropy for time evolution in the sense of Alicki and Fannes As an example we consider the shift on the lattice Then

$h_{AF}(\sigma) = s(\omega) + \ln d$ (4.14)

where $s(\omega)$ is the entropy density corresponding to the state $\omega$ and d is the dimension of the full matrix algebra at each lattice point.

d) As last proposal for the definition of a dynamical entropy we describe the one which in fact has the longest history. First it was proposed by Connes and Stormer for type II algebras$^{27}$ and then generalized in refs. 28 and 29 to general situations. We present the definition given by Sauvageot and Thouvenot$^{30}$, which they showed to be equivalent to the ones in refs. 27 and 29 for hyperfinite algebras. In their definition it is most evident that this dynamical entropy measures how far the quantum system is related to a classical K-system. In addition, concepts developed in this framework also find their application in quantum information theory.

Definition: The entropy defect of an abelian model. Let $(A, \omega)$ be a nonabelian algebra with state $\omega$. Let $(B, \mu)$ be an abelian algebra with state $\mu$ that is coupled to A by a state $\lambda$ over $A \otimes B$ satisfying $\lambda|_A = \omega$, $\lambda|_B = \mu$. Its entropy defect is defined as

$H_\lambda(B|A) = [H_\mu(B) - S(\omega \otimes \mu | \lambda)_{A \otimes B}]$ (4.15)

Theorem: The entropy of the state $\omega$ is given as

$S_A(\omega) = \sup [H_B(\mu) - H_\lambda(B|A)]$ (4.16)

In fact there exist many abelian models that optimize the above expression: every decomposition of $\omega$ into pure states, $\omega = \sum_{i=1}^{n} \mu_i \omega_i$, can be interpreted as an abelian model with $B = \{P_1, \ldots, P_n\}$ and $\mu(P_i) = \mu_i$, $\lambda(P_i \otimes A) = \mu_i \omega_i(A)$.

Due to quantum effects the entropy is not monotonically increasing if we consider an increasing sequence $A_n \subset A_m$, $n < m$. But monotonicity can be regained if we change the definition to

Definition: Let $A \subset C$ and $(B, \mu)$ be an abelian model for $(C, \omega)$.

Then

$H_{\omega, C}(A) = \sup_{(B, \mu, \lambda)} [H_B(\mu) - H_\lambda(B|A)]$ (4.17)

This suggests the definition for a dynamical entropy

Definition: Given $(A, \sigma, \omega)$ a quantum dynamical system. The dynamical entropy is given by

$h_\omega(\sigma) = \sup [H_\mu(P|P_-) - H_\lambda(P|P_- \otimes A)]$ (4.18)

where the supremum is taken over all dynamical abelian models $(B, \mu, \Theta)$ with $\mu \circ \Theta = \mu$ and coupling $\lambda \circ (\Theta \otimes \sigma) = \lambda$, $\lambda|_A = \omega$, $\lambda|_B = \mu$. Here $P_- = \bigcup_{n=1}^{\infty} \Theta^{-n} P$ is the past algebra of the partition P.

Remark: There holds equality between $h_\omega(\sigma)$ and

$\sup [H_\mu(P|P_-) - H_\lambda(P|A)]$ (4.19)

This is based on considering

H(PP-) = lim - H I ekP) )

H(PP_ regA) = lim - H I BkPA) ]

and taking V kP as a new abelian model

It is evident that one can also define the dynamical entropy with respect to a subalgebra C C A

KaC) = sup[iM(P|P_) - HPP- reg C)] (420)

an expression that we need if we want to discuss 2C) in the framework of quantum systems Notice that (419) cannot be replaced in general by an expression like (418)

The main task now is to find abelian models This can be done very similar as for calculating the entropy of a state

Theorem Assume a state w is decomposed

w = ^MiiibdquoWi1in (421)

Define

Consider

lt lt = 1^ WiiraquoWiiiraquo-it l^k

H(C aC ak^C) = 5( W ) - pound S$) + pound ^ S M U ^ ^ - M

(422)

Consider now the decomposition

w = ^ p y 51 E 1 - i W i - - i laquo ^ = Sibdquoiwltilt-- (423) r = -

In the limit limmdash limbdquo^oo (i-e we have to start with a sufficiently large decomposition) the pik converge to an abelian model and all


abelian models can be obtained in this way The detailed proof for this statement can be found in3 0

This theorem enables us to find lower limits for the dynamical entropy Together with the fact that

1 H(CaCak-lC) lt SU(C) + 0(8) (424)

if C C C in the sense of (45) or (48) we also have the upper bound29

h(a) lt sup lim H(C ltr-1C) (425) c k

so that in some cases we can really evaluate the dynamical entropy

5 Some General Considerations on Abelian Models

As we already mentioned, the entropy of a state over a quantum system can be calculated via an abelian model. For a matrix algebra this viewpoint may look superficial, but it has found an important application in the theory of entangled states, where subalgebras $A \otimes B \subset C$ are considered and the entanglement describes that a pure state over $C$ will not be pure as a state over $A$ resp. $B$. This entanglement can be used for quantum communication, and the amount of this applicability is expressed as the entanglement of formation [31] (compare (4.17))

$E_\omega(A) = S(\omega)_A - H_\omega(A) = \min \sum_i \mu_i S(\omega_i)_A$ (5.1)
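The minimum over decompositions in (5.1) can be made concrete with a small numerical sketch. It assumes two qubits, the tracial state on $A \otimes B$, and two hand-picked decompositions into pure states; the helper names are illustrative and not taken from the cited literature.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy -tr(rho ln rho), natural logarithm."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]                      # drop zero eigenvalues (0 ln 0 = 0)
    return float(-np.sum(w * np.log(w)))

def reduced_state_A(psi, dim_a=2, dim_b=2):
    """Restriction of the pure state |psi> on A (x) B to the subalgebra A."""
    m = np.asarray(psi, dtype=complex).reshape(dim_a, dim_b)
    return m @ m.conj().T

def average_entropy(weights, states):
    """sum_i mu_i S(omega_i)_A for a decomposition omega = sum_i mu_i omega_i."""
    return sum(mu * entropy(reduced_state_A(psi)) for mu, psi in zip(weights, states))

s = 1 / np.sqrt(2)
# Two decompositions of the same tracial state on two qubits.
bell = [np.array([s, 0, 0, s]), np.array([s, 0, 0, -s]),
        np.array([0, s, s, 0]), np.array([0, s, -s, 0])]
product = [np.eye(4)[i] for i in range(4)]
weights = [0.25] * 4

bell_value = average_entropy(weights, bell)        # every Bell state contributes ln 2
product_value = average_entropy(weights, product)  # product states contribute 0
```

Both families decompose the same state, but only the product decomposition attains the minimum, so the entanglement of formation of the tracial state vanishes in this toy case.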

Expressed in terms of an abelian model we can also write

$H_\omega(A) = \sup_{\lambda, B_0} S(\omega \otimes \mu)_{A \otimes B_0}$ (5.2)

where $\lambda$ is a state over $B_0 \otimes C$. We have the following inequality. Let $\omega$ as a state over $C$ be written in the GNS representation

$\omega(C) = \langle \Omega | \pi(C) | \Omega \rangle$

and let $C'$ be the commutant in this representation. Then

$S(\omega \otimes \omega)_{A \otimes C_0} \le H_\omega(A) \le S(\omega \otimes \omega)_{A \otimes C'}$ (5.3)

with $C_0$ any abelian subalgebra of $C'$. A maximal abelian subalgebra of $C'$ gives a lower bound to the entropy, and in some cases it even is the best abelian model (compare [32] and the explicit results in [33] for estimates on $E$, i.e. without dynamics), but in other examples ([32], see also the forthcoming 6E) it is evidently too small. If in addition the abelian model has to carry a dynamics, the question arises when the abelian model can be imbedded into the commutant (or whether, by the natural isomorphism, the algebra itself contains a sufficiently large time invariant abelian subalgebra).

Here we have the following results

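Theorem [34]. Assume that $(A, \sigma, \omega)$ is a dynamical system and $\omega$ a tracial state. Assume that the analogue of 1c) (entropic K-system) is satisfied, i.e.

$\lim_{n \to \infty} H(\sigma^n B) = H(A)$ $\forall$ finite dimensional $B \subset A$

Then

$\text{st-}\lim_{n \to \infty} [A_1, \sigma^n A_2] = 0 \quad \forall A_1, A_2 \in A$ (5.4)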

Proof. It suffices to choose $B = P$ for all projection operators in $A$. Then $P$ is its own best abelian model in the calculation of $H(B)$. Refinements of the models $P$, $\sigma^n P$ have to be used to calculate $H(\sigma^n B)$ (compare theorem (4.23)). But they are only possible if $P$ and $\sigma^n P$ nearly commute.

The theorem was generalized to other states [34], but with the restriction that we had to be able to keep control over sufficiently many optimal abelian models. We believe that these restrictions can be removed by a harder analysis.

Another result on footprints of commutativity is the following

Theorem [35]. Assume that in the calculation of the dynamical entropy there exists an optimal abelian model, i.e.

$h(\alpha) = \sup_{B_0} (4.19) = \max_{B_0} (4.19)$ (5.5)

Then the algebra $A$ contains an abelian subalgebra $A_0$ on which $\alpha$ acts as an automorphism. Notice that this does not imply that this abelian subalgebra already is the optimal abelian model.

6 Abelian Models for Algebraic K-Systems

In the following we will discuss the examples of abelian K-systems given in Sect. 3 and how far they allow us to find good abelian models.

A) In this model of a quantized Bernoulli system that completely factorizes, the obvious choice of the abelian model that gives the correct result is

$A_0 = \bigotimes_{n \in \mathbb{Z}} B_0^{(n)}$

where $B_0$ is the abelian algebra that commutes with $\rho$ and describes the measurements with maximal certainty.
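Why the abelian algebra commuting with $\rho$ is the natural choice can be checked for a single site. The sketch below assumes a hypothetical qubit density matrix and compares the Shannon entropy of measurement outcomes in the eigenbasis of $\rho$ with that in a rotated basis; names and numbers are illustrative only.

```python
import numpy as np

def shannon(p):
    """Shannon entropy of a probability vector, natural logarithm."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

def measurement_entropy(rho, basis):
    """Entropy of the outcome probabilities <e_k|rho|e_k>; basis vectors are the columns."""
    probs = np.real(np.einsum("ik,ij,jk->k", basis.conj(), rho, basis))
    return shannon(probs)

rho = np.diag([0.9, 0.1])                               # hypothetical one-site density matrix
eigenbasis = np.eye(2)                                  # projections commuting with rho
rotated = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # a basis that does not commute with rho

von_neumann = shannon(np.linalg.eigvalsh(rho))
h_commuting = measurement_entropy(rho, eigenbasis)
h_rotated = measurement_entropy(rho, rotated)
```

In the commuting basis the outcome entropy equals the von Neumann entropy of $\rho$; any other basis adds uncertainty that is not information about the state.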

B) For the lattice system, for which the state does not factorize any more, it does not suffice to pick a suitable abelian subalgebra at every lattice point. This provides an abelian model, but not an optimal one. According to the observations (4.25) it is clear that an upper bound for the dynamical entropy is given by the entropy density [29], and it seems very plausible that it should not be less. To our knowledge no general proof is available, but for the states that are of physical interest equality is shown.

Already in [29] equality was shown under some compatibility relation between space translation and modular automorphism. In practice, however, it is difficult to check whether this compatibility relation holds. For quasifree states this is possible and was done in [36]. Here an abelian subalgebra was selected for increasing size of the tensor product. This subalgebra delocalizes, but only to such an extent that the convergence of these subalgebras to an abelian model that gives the desired result can be controlled.

In [37] equilibrium states over lattice systems as in [9] were considered and a decomposition offered that in the limit gave the desired result. In [38] the affinity of the dynamical entropy was applied to control these limits and to allow exchanging them. These ideas are generalized in [39], giving the following result.

If you assume that the shift $\sigma$ is asymptotically abelian (i.e. we consider not only lattice algebras but some generalization in the framework of AF-algebras) and you consider a dynamics given by a sequence of local Hamiltonians, then:

The thermodynamic limit of the equilibrium states exists, and they satisfy the KMS property with respect to the dynamics.

For these states the entropy density and the dynamical entropy of the shift coincide. The dynamical entropy of the shift can be used in a thermodynamic variation principle. This variation principle is satisfied exactly by states that are KMS with respect to the time evolution.


The maximal dynamical entropy is achieved by the tracial state and coincides in this state with the Voiculescu dynamical entropy $h_{at}$ (4.9). In all these examples the abelian model is constructed by considering the sequence $\rho = e^{-H_\Lambda}$ and the corresponding minimal projectors in (4.21)-(4.23).

There exists another possibility to construct space translation invariant states on the lattice, namely the method of correlated states.

We start again with our chain $A = \bigotimes_n B_n$. In addition we choose an algebra $C$ (we restrict to finite dimensional ones) and consider some completely positive map $F: C \otimes B \to C$ that we can write as $f_b(c)$, and we demand $f_1(1) = 1$. Let $\rho$ be a state over $C$ satisfying $\rho \circ f_1 = \rho$. Then we define

$\omega(b_1 \otimes \dots \otimes b_k) = \rho(f_{b_1} \circ f_{b_2} \circ \dots \circ f_{b_k}(1))$

where $b_i$ is an operator at the lattice point $i$ (many of them can be 1).

It can be checked that in this way we obtain a translation invariant state. If e.g. $f_b(1) = \omega(b) \cdot 1$ then we obtain a state that is clustering. If we want to have nontrivial correlations between nearest neighbours we have to choose another $F$, but this enforces that there must also be correlations to other neighbours. Space clustering is encoded in the convergence properties of $F$ [40].
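The translation invariance claimed for correlated states can be checked numerically. The following is a minimal sketch under stated assumptions: the map is taken of the special form $f_b(c) = V^\dagger (c \otimes b) V$ with a randomly chosen isometry $V$ (one convenient way to obtain a unital completely positive map, not necessarily the one intended in [40]), and the invariant state is represented by a density matrix `R`. All names are illustrative.

```python
import numpy as np

D_C, D_B = 2, 2   # dimensions of the auxiliary algebra C and of one lattice site (illustrative)

def random_isometry(d_c, d_b, seed=0):
    """Isometry V: C^d_c -> C^d_c (x) C^d_b with V^dagger V = 1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(d_c * d_b, d_c)) + 1j * rng.normal(size=(d_c * d_b, d_c))
    q, _ = np.linalg.qr(x)                # reduced QR: columns of q are orthonormal
    return q

V = random_isometry(D_C, D_B)

def f(b, c):
    """f_b(c) = V^dagger (c (x) b) V; unital because f_1(1) = V^dagger V = 1."""
    return V.conj().T @ np.kron(c, b) @ V

def invariant_density():
    """Density matrix R with tr(R f_1(c)) = tr(R c) for all c."""
    kraus = V.reshape(D_C, D_B, D_C)      # kraus[:, i, :] are the Kraus operators of f_1
    transfer = sum(np.kron(kraus[:, i, :], kraus[:, i, :].conj()) for i in range(D_B))
    vals, vecs = np.linalg.eig(transfer)
    r = vecs[:, np.argmin(abs(vals - 1))].reshape(D_C, D_C)
    r = r / np.trace(r)
    return (r + r.conj().T) / 2

R = invariant_density()

def omega(*ops):
    """omega(b_1 (x) ... (x) b_k) = rho(f_{b_1} o ... o f_{b_k}(1))."""
    c = np.eye(D_C, dtype=complex)
    for b in reversed(ops):               # the innermost map belongs to the last site
        c = f(b, c)
    return np.trace(R @ c)

one = np.eye(D_B)
sz = np.diag([1.0, -1.0])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
```

Padding the observable with the unit on the left uses $\rho \circ f_1 = \rho$, padding on the right uses $f_1(1) = 1$; in both cases the expectation value is unchanged.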

Now the construction of an abelian model is offered by a decomposition of $F$ into finer completely positive maps. Convergence properties in the construction of abelian models, as they are necessary in (4.23), are now controlled by convergence properties of $F$ (that acts over finite dimensional algebras) instead of convergence properties of space correlations. Again we have to choose $B_n$ sufficiently large, i.e. combine sufficiently many lattice points. With appropriate estimates it was shown [41] that for all finitely correlated states ($C$ of finite dimension) the dynamical entropy and the entropy density of the so constructed states coincide.

C) The Fermi Algebra

If we concentrate on the even subalgebra $A_e$ of the CAR algebra, i.e. the algebra consisting of even polynomials in creation and annihilation operators, this is just a special AF-algebra that is asymptotically abelian, and therefore the results in [39] guarantee that for equilibrium states the dynamical entropy of space translation and the entropy density coincide.

If in addition we apply the theorem [29]

$h(\alpha^n) = n\, h(\alpha)$

then obviously

$h_{A_e}(\alpha^n) \le h_A(\alpha^n)$

$= h_\mu(P|P_-) - H(P|P_- \otimes A)$

$\le h_\mu(P|P_-) - H(P|P_- \otimes A_e) + \ln 2$ (6.1)

shows that $h_{A_e}(\alpha) = h_A(\alpha)$.

Nevertheless the noncommutativity of the algebra has consequences

Theorem. If $\omega = \omega \circ \alpha$ then $\omega(A_o) = 0$ for all odd elements $A_o$ in $A$.

Proof

M4gt)|2 N-l

bdquo N n=0

= ^EF U PO^4W (6-2)

The anticommutator vanishes for strictly local odd operators except for $(l - k) = O(1)$. Therefore

$|\omega(A_o)|^2 \le O(1/N)$

We notice that noncommutativity reduces the possibility for invariant states.

Concerning the question for entropic K-systems (2.2): for all even subalgebras

$\lim_{n \to \infty} H(\sigma^n B_e) = H(B_e)$

but for a typical odd subalgebra $A_o$ generated by $a_o + a_o^*$, $h(\alpha, A_o) = 0$.

D) For the stationary quantum Markov chain again an abelian model can be constructed that gives the optimal result, i.e. the entropy density [10]. The main idea in the proof is the fact that apart from the algebra $A$ we can concentrate on the algebra $C$, and inside of this algebra we construct an optimal decomposition. Therefore in the limit of these decompositions we find an abelian model with vanishing entropy defect $H(P|P_- \otimes A)$.

As we already mentioned, the automorphism $\tau$ (as in our special example) will not be asymptotically abelian in general, and therefore the system fails to be an entropic K-system. Similarly as for the Fermi system, we can introduce the gauge automorphism

$\gamma \sigma_x = -\sigma_x$

$\gamma \sigma_y = -\sigma_y$

$\gamma \sigma_z = \sigma_z$
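A gauge automorphism that flips the sign of $\sigma_x$ and $\sigma_y$ while fixing $\sigma_z$ can be realised on a single site as conjugation with $\sigma_z$. The short check below is only an illustration of that one-site action; the name `gamma` is an assumption of this sketch.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def gamma(a):
    """One-site gauge automorphism, implemented as conjugation with sigma_z."""
    return sz @ a @ sz

odd_flip_x = np.allclose(gamma(sx), -sx)          # sigma_x is odd
odd_flip_y = np.allclose(gamma(sy), -sy)          # sigma_y is odd
even_fixed = np.allclose(gamma(sz), sz)           # sigma_z is even
product_even = np.allclose(gamma(sx @ sy), sx @ sy)  # a product of two odd elements is even
```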

The elements invariant under this gauge automorphism are asymptotically abelian under space translation because they become localized in $1 \otimes C$. Therefore again the result corresponds to the results in [39], though the states are constructed in different ways.

E) The last example we want to discuss in this framework is the Price-Powers shift. We have already considered the special case $g(i) = 1$, the Fermi algebra (3Eb). For $g(1) = 1$, $g(l) = 0$ otherwise, the representation (3Ea) already indicates how to construct an abelian model. For $\sigma^2$ we are dealing with a quantum Bernoulli shift that is factorizing, with the obvious choice for an abelian model. Therefore it is easy to construct the abelian model for $\sigma$.

We can consider Bff2 as subalgebra of A therefore oBai is again an abelian subalgebra and for the shift a we consider the abelian model

oBai

with the obvious coupling. Notice that now we have presented an example where the entropy defect of the abelian model does not vanish, i.e. the abelian model is not a subalgebra of the system. For arbitrary $g$ we will in general fail to find an abelian model. We have only to vary the proof (6.2): if $g$ is sufficiently irregular so that for all $w_I \in A$, where $w_I$ are monomials in the generators $e_i$,

$[w_I, \sigma^k w_I]_+ = 0$

for infinitely many $k$, so that

|w(w)|2 = J2 TT UJ^(jkwi)

= jjjl E w([laquolaquo]+) = o (j-ijJ (63)


then $\omega(w_I)$ has to vanish.

In fact it was shown in [42] that it is possible to construct a sequence $g$ so that (6.3) holds for all $w_I$, and therefore the only invariant state is the tracial state. In [43] we proved that with probability one on the set of possible $g$, (6.3) holds, and again we have a unique invariant state. But this argument can be generalized to every coupling to abelian models; therefore every coupling has to be trivial, and the dynamical entropy in the sense of [29] resp. [30] vanishes.
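Whether a monomial anticommutes with its shifts is a purely combinatorial question about the sequence $g$. The sketch below assumes the usual presentation of the Price-Powers algebra by self-adjoint unitaries with $e_i e_j = (-1)^{g(|i-j|)} e_j e_i$; the function names are illustrative.

```python
def commutation_sign(g, I, J):
    """Sign s in w_I w_J = s * w_J w_I for monomials w_I = prod_{i in I} e_i."""
    exponent = sum(g(abs(i - j)) for i in I for j in J if i != j)
    return -1 if exponent % 2 else 1

def anticommuting_shifts(g, I, k_max):
    """All k in 1..k_max for which w_I anticommutes with its k-fold shift."""
    return [k for k in range(1, k_max + 1)
            if commutation_sign(g, I, [i + k for i in I]) == -1]

fermi = lambda n: 1                      # g = 1 everywhere: the Fermi algebra
nearest = lambda n: 1 if n == 1 else 0   # g(1) = 1, g(l) = 0 otherwise

fermi_shifts = anticommuting_shifts(fermi, [0], 5)
nearest_shifts = anticommuting_shifts(nearest, [0], 5)
```

For the Fermi case a single generator anticommutes with every shift, while for the nearest-neighbour sequence this happens only once; an irregular $g$ is needed to obtain infinitely many anticommuting shifts for every monomial.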

The Price-Powers shift was also studied in the context of Voiculescu's dynamical entropy and in the context of the Alicki-Fannes entropy [23, 44]. Here the increasing property is the dominant feature. We obtain

$h_{at}(\sigma) = \frac{1}{2} \ln 2, \quad h_{AF}(\sigma) = \ln 2$ (6.4)

independently of the special sequence $g$.

If we return to our remark that the dynamical entropy describes how information increases, but at the same time becomes more and more irrelevant for classical dynamical systems, we notice that the Voiculescu and the Alicki-Fannes entropies concentrate on the fact that information increases, whereas the entropy of [29] is sensitive to the extent to which information becomes irrelevant.

7 Continuous K-Systems

So far we concentrated on discrete dynamics. But obviously the discrete group of translations $\mathbb{Z}$ can be replaced by $\mathbb{R}$ without varying much of the definitions. Especially due to the linearity of the dynamical entropy (which is proven for [18] and [29])

$h(\alpha^n) = n\, h(\alpha)$ (7.1)

also for the continuous group $\mathbb{R}$ we can choose the subgroup $a\mathbb{Z}$ and can calculate the dynamical entropy (for all possible definitions) for this subgroup. It can be shown that the result will be independent of the scaling parameter $a$.

The definition of an algebraic quantum K-system is also applicable for a continuous group. Only in this case the amount of increase cannot be described by $[A_t : A_0]$; it is either zero or infinity, because $[A_t : A_0] = n [A_{t/n} : A_0]$ and $[A : A_0]$ is either $0$ or $\ge 2$ [14].


This remark shows that a continuous quasifree evolution over a Fermi lattice system ($\alpha_a a(f) = a(e^{iap} f)$, $a \in \mathbb{R}$) can give positive dynamical entropy but cannot correspond to a continuous algebraic K-system:

$[A_t : A_0] = h_{at}(\alpha_t)$

and $h_{at}(\alpha_t) = h_\tau(\alpha_t)$

in the tracial state (compare [39]). This leads to a contradiction if $h_\tau(\alpha_t)$ is bounded.

A prototype of a continuous K-system is given in relativistic quantum field theory.

The Wedge Algebra [45]. Consider the algebra $A_W = \{\phi(x), x_1 > 0\}$ as subalgebra of a quantum field theory $A$. This algebra is mapped into itself by the following automorphisms:

a) $\sigma_x^{(1)}$, the shift in the $x$-direction. Therefore $(A, A_W, \sigma^{(1)}, \langle\Omega| \cdot |\Omega\rangle)$ is a K-system in an irreducible state. The unitary operator implementing $\sigma_x^{(1)}$ is $e^{iP^1 x}$ with spec$(P^1) = \mathbb{R}$.

b) $\ell_t$, the shift in the light direction $x^1 + x^0$. Again $(A, A_W, \ell_t, \langle\Omega| \cdot |\Omega\rangle)$ is a K-system. Now $\ell_t = \mathrm{ad}\, e^{iL^{(1)} t}$ with spec$(L^{(1)}) = \mathbb{R}_+$.

c) $|\Omega\rangle$ is cyclic and separating for $A_W$. Therefore it defines a KMS-automorphism, and this KMS-automorphism coincides with the geometric action of the boost $b_t$. With $(A_W, \ell_1 A_W, b_t, \langle\Omega| \cdot |\Omega\rangle)$ we obtain a new K-system where the K-automorphism is now the modular automorphism $\mathrm{ad}\, b_t = \mathrm{ad}\, e^{iB^{(1)} t}$; $\ell_t$ acts as endomorphism on $A_W$. The generators satisfy

$[B^{(1)}, L^{(1)}] = i L^{(1)}$ (7.2)

These relations can be generalized to the following theorem.

Theorem. Let $(A, A_0, \tau_t, \omega)$ be a modular K-system, i.e. $\tau_t$ the modular automorphism of $A$ and

$\tau_t A_0 \supset A_0$

a) Then the GNS vector $|\Omega\rangle$ implementing $\omega$ is cyclic and separating both for $A$ and $A_0$.

b) Let $\tau_t$ be implemented by $e^{iHt}$, $e^{iHt}|\Omega\rangle = |\Omega\rangle$. Let $\tau_t^0$ be the modular automorphism of $A_0$ implemented by $e^{iH^0 t}$ with $e^{iH^0 t}|\Omega\rangle = |\Omega\rangle$. Then

$G = H^0 - H$ is well defined, $G \ge 0$,

$e^{iGs}$, $s \ge 0$, implements an endomorphism on $A$ with $e^{iG} A e^{-iG} = A_0$,

$[H, G] = iG$ (7.3)

The proof is based on the analyticity properties of the modular operator, taking appropriate care of domain properties [46, 47].

We notice that for quantum modular K-systems endomorphisms arise in a natural way that satisfy the Anosov commutation relations and therefore offer, by Lyapunov exponents, the clustering properties of the automorphism.

Theorem. Let $(A, \tau(t), \sigma(s), \omega)$ be an Anosov system with $\tau$ the K-automorphism and $\sigma$ the Anosov endomorphism.

Take $\chi_a$ to be the characteristic function of $(a, \infty)$ for some $a > 0$. Choose $A$ and $B \in A$ such that

i) $A\Omega \in D(G^r)$ for some $r > 0$,

ii) $\chi_a(G) B\Omega = 0$. As a consequence $\langle\Omega|B|\Omega\rangle = 0$.

Then

$|\omega(A \tau_t B)| \le e^{-tr} a^{-r} \|B\Omega\| \|G^r A\Omega\|$ (7.4)

We refer to [1] and [48].

As for discrete quantum K-systems, we wonder whether the dynamical entropy is positive and whether there exist nontrivial models. Again no general result is available. On the basis of quasifree evolution [49] we can construct models for fermions and bosons that are modular K-systems with positive dynamical entropy. But there exists also a q-deformed quasifree modular system [50]. Here the past algebra has trivial relative commutant, and therefore the algebra does not contain any subalgebra on which the dynamics acts asymptotically abelian, which according to [34] seems to be a requirement for the construction of abelian models.


8 Mixing Properties Without Algebraic K-Property

As already mentioned, no strategy is available up to now to construct for a given quantum dynamical system a subalgebra that satisfies the K-property. A model for which it is still undecided whether we are dealing with an algebraic K-system is the rotation algebra [51].

Definition. The rotation algebra $A_\alpha$ is built by unitary operators $U$, $V$ with

$U \cdot V = e^{i\alpha} V \cdot U$ (8.1)

for some $\alpha \in [0, 2\pi)$.

This algebra arises in a natural way in a physically motivated example. Consider a free particle in a constant magnetic field confined to two dimensions. Then the particle describes Larmor bounds. In the thermodynamic limit these Larmor bounds can be occupied up to a precise filling factor [52]. This thermodynamic limit can most easily be achieved by confining the particles in an additional harmonic potential whose strength is going to zero [53]. Another method, more tailor-made to study electric currents, is periodic boundary conditions. Therefore the algebra is built by $e^{i\alpha v_x}$, $e^{i\beta v_y}$, $e^{inx}$, $e^{imy}$ with

$e^{i\alpha v_x} e^{inx} = e^{in(x+\alpha)} e^{i\alpha v_x}$

$e^{i\beta v_y} e^{imy} = e^{im(y+\beta)} e^{i\beta v_y}$

$e^{i\alpha v_x} e^{i\beta v_y} = e^{i\alpha\beta B} e^{i\beta v_y} e^{i\alpha v_x}$ (8.2)

with $B$ the magnetic field orthogonal to the plane. All other commutators vanish.

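If we introduce

$\exp[in\tilde{x}] = \exp\left[in\left(x - \tfrac{1}{B} v_y\right)\right]$

$\exp[im\tilde{y}] = \exp\left[im\left(y + \tfrac{1}{B} v_x\right)\right]$

then the algebra splits into

$\{e^{i\alpha v_x}, e^{i\beta v_y}\} \otimes \{e^{in\tilde{x}}, e^{im\tilde{y}}\}$

$e^{in\tilde{x}} e^{im\tilde{y}} = e^{-inm/B} e^{im\tilde{y}} e^{in\tilde{x}}$

Therefore the rotation algebra with $\alpha = 1/B$ describes the algebra of the center of the Larmor precession.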
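For rational $\alpha / 2\pi = p/q$ the defining relation (8.1) already has a finite dimensional realisation by the familiar clock and shift matrices. The sketch below constructs this pair; it is an illustration only, and the function name and the choice $p/q = 1/5$ are assumptions.

```python
import numpy as np

def rotation_pair(p, q):
    """Clock and shift matrices with U V = exp(i alpha) V U for alpha = 2 pi p / q."""
    alpha = 2 * np.pi * p / q
    U = np.diag(np.exp(1j * alpha * np.arange(q)))   # clock: multiplies e_k by exp(i alpha k)
    V = np.roll(np.eye(q), 1, axis=0)                # shift: maps e_k to e_{k+1 mod q}
    return alpha, U, V

alpha, U, V = rotation_pair(1, 5)
relation_holds = np.allclose(U @ V, np.exp(1j * alpha) * V @ U)
```

For irrational $\alpha / 2\pi$ no such finite dimensional pair exists, which is one reason why the rational and irrational cases behave so differently in what follows.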

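For $A_\alpha$ there exists a representation on $L^2(T^2)$

$\pi(U_\alpha) = \exp\left[i\left(x + \tfrac{\alpha}{2} p_y\right)\right], \quad \pi(V_\alpha) = \exp\left[i\left(y - \tfrac{\alpha}{2} p_x\right)\right]$ (8.3)

where $p_x$, $p_y$ are the momentum operators $\frac{1}{i}\frac{\partial}{\partial x}$, $\frac{1}{i}\frac{\partial}{\partial y}$ with periodic boundary conditions on the torus. For $|\Omega\rangle = |1\rangle$, the constant function on the torus,

$\pi(U_\alpha)|\Omega\rangle = e^{ix}$

$\pi(V_\alpha)|\Omega\rangle = e^{iy}$ (8.4)

independent of the rotation parameter [54]. On $A_\alpha$ we have the following automorphisms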

4(^C) = J^usv

with

n m

= T n m - ( ) bull

$ad - bc = 1$

$\sigma^{(1)}$, $\sigma^{(2)}$ describe currents and are therefore of physical relevance. $\Theta_T$ describes dilation in $\mathbb{R}^2$ space and reduces to a map on the torus $T^2$ only for discrete values and discrete directions of the dilation. A physical description for $\Theta_T$ can be given if it describes a sudden periodic push to the particle. Whereas $\sigma^{(1)}$ and $\sigma^{(2)}$ have no good mixing behaviour, $\Theta_T$ inherits all mixing properties from the classical torus due to (8.4):

$\langle\Omega|\pi(W_\alpha(z))\, \Theta_T\, \pi(W_\alpha(z))|\Omega\rangle = \langle\Omega|\pi(W_0(z))\, \Theta_T\, \pi(W_0(z))|\Omega\rangle$ (8.5)

But with respect to dynamical entropy the noncommutativity plays an essential role. Let $\lambda$ be the eigenvalue $> 1$ of $T$. Then

$h_{at}(\Theta_T) = \ln \lambda$ for $\alpha$ irrational [18]

$= \ln \lambda$ for $\alpha$ rational

$h_{AF}(\Theta_T) = \ln \lambda$ for all $\alpha$ [55]

$h_{CNT}(\Theta_T) = \ln \lambda$ for $\alpha$ rational

$> 0$ for $\alpha$ depending rationally on $\lambda$ [57]

$= 0$ in general [56]
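As a worked number for the value $\ln \lambda$, one can take the standard Arnold cat matrix as a hypothetical choice of $T$ (the text leaves $T$ general, subject to determinant one):

```python
import numpy as np

T = np.array([[2, 1], [1, 1]])          # illustrative choice: the Arnold cat matrix
determinant = np.linalg.det(T)          # should be 1
lam = max(abs(np.linalg.eigvals(T)))    # the eigenvalue larger than 1
entropy_value = np.log(lam)             # the value ln(lambda) appearing in the list above
```

Here $\lambda = (3 + \sqrt{5})/2$, so $\ln \lambda$ is roughly 0.96.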

In addition it was possible to construct for rational $\alpha$ a subalgebra $A_0$ so that $(A, A_0, \Theta_T, \omega)$ became a K-system [54]. This was possible because $A$ can be looked at as a crossed product of the classical algebra on $T^2$ with a discrete translation group, and by rather general considerations crossed product algebras inherit under some conditions the K-structure of the underlying algebra [56]. Obviously this construction does not give a hint for irrational $\alpha$.

The strong dependence on $\alpha$ of the CNT dynamical entropy is based on the strong dependence of the asymptotic commutation behaviour. Only if $\alpha$ and $\lambda$ are rationally dependent is the system asymptotically abelian, and the commutator converges asymptotically fast to zero. This rapid convergence made it possible to construct an abelian model [57], using the fact that the algebra $A_\alpha$ can be imbedded in, but is not, an AF-algebra. Therefore, different from the approaches for lattice systems, the abelian model cannot be identified, up to convergence problems, with an abelian subalgebra of $A_\alpha$.

9 Time Evolution

As we have seen, in a quantum system there are many possibilities for some kind of mixing behaviour that are not equivalent as in the classical situation. Up to now we concentrated on dynamics that were constructed in such a way that they should give us information on possible ergodic structures.

When the dynamics is given to us by a sequence of local Hamiltonians, we have up to now hardly any control over the asymptotic behaviour, apart from quasifree evolution.

We mention just one result. The x-y model [58] allows a transformation to a quasifree evolution. Therefore we know that it is weakly but not strongly asymptotically abelian. Its dynamical entropy is positive, and all definitions give the same result (with the dimensional correction term for $h_{AF}$). We do not know whether it is an algebraic K-system for a discrete subset in time. For sure it is not a continuous algebraic K-system.


References

1 GG Emch H Narnhofer GL Sewell W Thirring Anosov Actions on Non-Commutative Algebras J Math Phys 3511 5582-5599 (1994)

2 MC Gutzwiller Chaos in classical and quantum mechanics (Springer New York 1990)

3 E Bogomolny F Leyvraz C Schmit Statistical Properties of Eigenvalues for the Modular Group in XIth International Congress of Mathematical Physics Daniel Jagolnitzer ed (International Press Boston 306-323 1995)

4 AN Kolmogorov A new metric invariant of transitive systems and automorphisms of Lebesgue spaces Dokl Akad Nauk 119 861-864 (1958)

5 P Walters An Introduction to Ergodic Theory (Springer New York 1982)

6 IP Cornfeld SV Fomin YaG Sinai Ergodic Theory (Springer New York 1982)

7 H Narnhofer W Thirring Quantum K-Systems Commun Math Phys 125 565-577 (1989)

8 C Brukner A Zeilinger Conceptual Inadequacy of the Shannon Information in Quantum Measurements quant-ph/0006087

9 O Bratteli DW Robinson Operator Algebras and Quantum Statistical Mechanics I II (Springer Berlin Heidelberg New York 1993)

10 B Kümmerer Examples of Markov dilation over 2 x 2 matrices in L Accardi A Frigerio V Gorini eds Quantum Probability and Applications to the Quantum Theory of Irreversible Processes Springer Berlin 1984 228-244 and private communications

11 RT Powers An index theory for semigroups of *-endomorphisms of B(H) and type II_1 factors Canad J Math 40 86-114 (1988) GL Price Shifts of II_1 factors Canad J Math 39 492-511 (1987)

12 H Narnhofer W Thirring Chaotic Properties of the Noncommutative 2-Shift in From Phase Transition to Chaos G Gyorgyi I Kondor S Sasvari T Tel eds World Scientific 1992 530-546

13 H Narnhofer W Thirring Clustering for Algebraic K-Systems Lett Math Phys 30 307-316 (1994)

14 VFR Jones Index for subfactors Invent Math 72 1-25 (1983)

15 R Longo Simple Injective Subfactors Adv Math 63 152-171 (1987) Index of Subfactors and Statistics of Quantum Fields Commun Math Phys 130 285-309 (1990)

16 M Choda Entropy of canonical shifts Trans Amer Math Soc 334 827-849 (1992)


17 H Narnhofer A Pflug W Thirring Mixing and Entropy Increase in Quantum Systems in Symmetry in Nature in honour of Luigi A Radicati di Brozolo Scuola Normale Superiore Pisa 597-626 (1989)

18 DV Voiculescu Dynamical Approximation Entropies and Topological Entropy in Operator Algebras Commun Math Phys 170 249-282 (1995)

19 M Choda A C*-Dynamical Entropy and Applications to Canonical Endomorphisms J Funct Anal 173 453-480 (2000)

20 E Størmer A Survey of noncommutative dynamical entropy Oslo preprint No 18 Dep of Mathematics MSC-class 46L40 (2000)

21 M Choda Entropy on crossed products and entropy on free products preprint (1999)

22 K Dykema Topological entropy of some automorphisms of reduced amalgamated free product C*-algebras preprint (1999)

23 R Alicki F Fannes Defining Quantum Dynamical Entropy Lett Math Phys 32 75-82 (1994)

24 RB Griffiths Consistent histories and the interpretation of quantum mechanics J Stat Phys 36 219-279 (1984)

25 M Gell-Mann J Hartle Alternative decohering histories in quantum mechanics in Proc of the 25th Int Conf on High Energy Physics Vol 2 ed by KK Phua and Y Yamaguchi World Scientific Singapore 1303-1310 (1991)

26 EH Lieb DW Robinson The finite group velocity of quantum spin systems Commun Math Phys 28 251-257 (1972)

27 A Connes E Størmer Entropy of II_1 von Neumann algebras Acta Math 134 289-306 (1972)

28 A Connes Acad Sci Paris 301 I 1-4 (1985)

29 A Connes H Narnhofer W Thirring Dynamical Entropy of C*-Algebras and von Neumann Algebras Commun Math Phys 112 691-719 (1987)

30 JL Sauvageot JP Thouvenot Une nouvelle définition de l'entropie dynamique des systèmes non commutatifs Commun Math Phys 145 411-423 (1992)

31 CH Bennett DP DiVincenzo JA Smolin WK Wootters Mixed state entanglement and quantum error corrections Phys Rev A 54 3824-3851 (1996)

32 F Benatti H Narnhofer A Uhlmann Decomposition of quantum states with respect to entropy Rep Math Phys 38 123-141 (1996)

33 WK Wootters Entanglement of formation of an arbitrary state of two qubits q-ph970929


34 F Benatti H Narnhofer Strong asymptotic abelianess for entropic K-systems Commun Math Phys 136 231-250 (1991) Strong Clustering in Type III Entropic K-Systems Mh Math 124 287-307 (1996)

35 H Narnhofer An Ergodic Abelian Skeleton for Quantum Systems Lett Math Phys 28 85-95 (1993)

36 H Narnhofer W Thirring Dynamical Theory of Quantum Systems and Their Abelian Counterpart in On Klauders Path eds GG Emch GC Hegerfeldt L Streit World Scientific 127-145 (1994)

37 H Narnhofer Free energy and the dynamical entropy of space translation Rep Math Phys 25 345-356 (1988)

38 H Moriya Variational principle and the dynamical entropy of space translation Rev Math Phys 11 1315-1328 (1999)

39 S Neshveyev E Størmer The variational principle for a class of asymptotically abelian C*-algebras MSC-class 46L55 (2000)

40 M Fannes B Nachtergaele RF Werner Finitely correlated states of quantum spin systems Commun Math Phys 144 443-490 (1992)

41 RF Werner private communication

42 H Narnhofer E Størmer W Thirring C*-dynamical systems for which the tensor product formula for entropy fails Ergod Th & Dynam Sys 15 961-968 (1995)

43 H Narnhofer W Thirring C*-dynamical systems that are highly anti-commutative Lett Math Phys 35 145-154 (1995)

44 R Alicki H Narnhofer Comparison of Dynamical Entropies for the Noncommutative Shifts Lett Math Phys 33 241-247 (1995)

45 HJ Borchers On the Revolutionization of Quantum Field Theory by Tomitas Modular Theory ESI preprint 160 pages 148 references

46 HJ Borchers On Modular Inclusion and Spectrum Condition Lett Math Phys 27 311-324 (1993)

47 HW Wiesbrock Halfsided Modular Inclusions of von Neumann Algebras Commun Math Phys 157 83-92 (1993) Commun Math Phys 184 683-685 (1997)

48 H Narnhofer Kolmogorov Systems and Anosov Systems in Quantum Theory review to be publ in IDAQP

49 H Narnhofer W Thirring Realization of Two-Sided Quantum K-Systems Rep Math Phys 45 239-256 (2000)

50 D Shlyakhtenko Free quasifree states Pac Journ of Math 177 329-368 (1997)

51 MA Rieffel Pac J Math 93 415 (1981)

52 RB Laughlin Quantized Hall Conductivity in Two Dimensions Phys Rev B 23 10 5632-5633 (1981)

53 N Ilieva W Thirring Second quantization picture of the edge currents in the fractional quantum Hall effect math-ph/0010038

54 F Benatti H Narnhofer GL Sewell A Non Commutative Version of the Arnold Cat Map Lett Math Phys 21 157-172 (1991)

55 R Alicki J Andries M Fannes P Tuyls Lett Math Phys 35 375-383 (1995)

56 H Narnhofer Ergodic Properties of Automorphisms on the Rotation Algebra Rep Math Phys 39 387-406 (1997)

57 SV Neshveyev On the K property of quantized Arnold cat maps J Math Phys 41 1961-1965 (2000)

58 H Araki T Matsui Commun Math Phys 101 213-246 (1985)


SCATTERING IN QUANTUM TUBES

BORJE NILSSON

School of Mathematics and Systems Engineering Vaxjo University SE-351 95 VAXJO Sweden

E-mail: borje.nilsson@msi.vxu.se

It is possible to fabricate mesoscopic structures where at least one of the dimensions is of the order of the de Broglie wavelength for cold electrons. By using semiconductors composed of more than one material combined with a metal split-gate, two-dimensional quantum tubes may be built. We present a method for predicting the transmission of low-temperature electrons in such a tube. This problem is mathematically related to the transmission of acoustic or electromagnetic waves in a two-dimensional duct. The tube is asymptotically straight with a constant cross-section. Propagation properties for complicated tubes can be synthesised from corresponding results for simpler tubes by the so-called Building Block Method. Conformal mapping techniques are then applied to transform the simple tube with curvature and varying cross-section to a straight, constant cross-section tube with variable refractive index. Stable formulations for the scattering operators in terms of ordinary differential equations are obtained by wave splitting using an invariant imbedding technique. The mathematical framework is also generalised to handle tubes with edges, which are of large technical interest. The numerical method consists of using a standard MATLAB ordinary differential equation solver for the truncated reflection and transmission matrices in a Fourier sine basis. It is proved that the numerical scheme converges with increasing truncation.

1 Introduction

In the search for faster computers, critical parts are becoming smaller. Today it is possible to build mesoscopic structures where some dimensions are of the order of the de Broglie wavelength for cold electrons. Often the electron motion is confined to two dimensions. Consequently it may be necessary, at least for some computer parts, to include quantum effects in the design process.

A large number of studies devoted to such quantum effects have been carried out in recent years, and a review is given by Londegan et al. [1]. Many investigations aim at understanding the physical properties of a particular quantum tube rather than developing reliable mathematical and numerical methods that can be used in a more general context. The research has given valuable knowledge on the physical behaviour but also reports on the limitations of the methods used. For instance, Lin & Jaffe [2] report that a straightforward matching at the boundary of a circular bend does not converge, demonstrating the numerical problems with such a method. An ill-posedness is present in quantum tube scattering, and some type of regularisation is therefore required to avoid large errors. Often the tubes have sharp corners to facilitate manufacturing but also to enhance quantum effects. The presence of corners with attached singularities requires special treatment.

Scattering of electrons in quantum tubes, see figure 1, is in theory related to the scattering of acoustic and electromagnetic waves in ducts. Nilsson [3] treats a general method for the acoustic transmission in curved ducts with varying cross-sections. Well-posedness, i.e. stability, is achieved in an asymptotic sense. The mathematical framework guarantees consistent results and allows for sharp corners, and a proof of numerical convergence is given. We set out to present a quantum version of the results of Nilsson [3]. In this way the problems reported on convergence [2] and on inconsistent mathematical results would be resolved.

The paper is organised as follows. An introduction to scattering in quantum tubes is given in section 2, and a mathematical model is formulated in section 3. The Building Block Method, which is a systematic method to analyse complicated tubes in terms of results for simple tubes, is also briefly described. Then, in section 4, the scattering problem for the curved tube with varying cross-section and constant potential is reformulated as a scattering problem for a straight tube with a varying refractive index. The solution to this problem is presented in section 5, and a discussion of numerical methods is also given.

2 Tubes in quantum heterostructures

A schematic view of a quantum heterostructure is shown in figure 2, following Wu et al. [4]. Electrons are emitted from the n-type doped AlGaAs layer, migrate into the GaAs layer and stay close to the boundary to the AlGaAs layer. In this way a very narrow layer of electrons, which are free to move in a plane, is formed. Nearly all the electrons in this two-dimensional gas are in the same quantum state. By applying a negative potential on the metal electrodes on the top of the heterostructure in figure 1, the electrons are banished from the region below the electrodes. For relatively low voltages the effective potential in the tube for one electron is close to the square-well potential [1]. As a consequence the electrons in the two-dimensional gas are further restricted to a tube that in form is a mirror picture of the gap between the two electrodes. This quantum tube links the electrons between the two two-dimensional gases on both sides of the strip formed by the electrodes.

3 Mathematical model

Consider a two-dimensional tube with interior $\Omega$ according to figure 1. The boundary $\Gamma$ consists of two continuous curves $\Gamma_+$ and $\Gamma_-$ which are piecewise $C^2$. The upper boundary $\Gamma_+$ can be continuously deformed to $\Gamma_-$ within $\Omega$. Outside a bounded region the duct is straight, with constant widths $a$ and $b$ respectively. These terminating ducts are called the left and the right terminating duct, or L and R for short. We use stationary scattering theory for one electron in an effective potential with time dependence $\exp(-iEt/\hbar)$, assuming that the wave function $\psi$ satisfies the time-independent Schrodinger equation $\Delta\psi + k^2\psi = 0$ in $\Omega$, where $k^2 = 2mE/\hbar^2$ and $m$ is the effective mass [5]. Usually $k^2$ is called energy. The effective potential is assumed to be a square well, meaning that $\psi|_\Gamma = 0$.

In a tube with constant cross-section the harmonic wave function $\psi$ can be uniquely decomposed into leftgoing and rightgoing parts by $\psi = \psi^+ + \psi^-$. Superscripts $+$ and $-$ indicate rightgoing (plus) and leftgoing (minus) waves respectively. Let $\psi^+_{in}$ and $\psi^-_{in}$ be known incoming waves in the terminating ducts; $\psi^+_{in}$ is present in the left and $\psi^-_{in}$ in the right one. Let us write

$$\psi = \psi^+_{in} + R^+\psi^+_{in} + T^-\psi^-_{in} \quad \text{in L}, \qquad (3.1a)$$
$$\psi = \psi^-_{in} + R^-\psi^-_{in} + T^+\psi^+_{in} \quad \text{in R}, \qquad (3.1b)$$

where, for example, the last two terms in (3.1a) are minus waves and the equation defines the left reflection mapping $R^+$ that maps the incoming wave to an outgoing one in L. The scattering problem consists of finding the mappings $R^+$, $T^-$, $R^-$ and $T^+$ as functions of energy for a given duct. In summary we have

$$\Delta\psi + k^2\psi = 0 \ \text{in } \Omega, \qquad \psi^+ = \psi^+_{in} \ \text{in L}, \qquad \psi^- = \psi^-_{in} \ \text{in R}. \qquad (3.2)$$

There is always a solution to (3.2), and except for a discrete number of eigenenergies $k^2 = k_n^2$, $n = 1, 2, 3, \ldots$, the solution is unique [6]. When $k^2 = k_n^2$, an eigenenergy, there exists a solution without incoming but with outgoing waves.

The use of the Building Block Method [7] or transfer matrix formalism [8] is very efficient for the solution of scattering problems. In this method a tube with a complicated geometry is divided into two parts, usually where the tube is straight. These two parts are converted to the type shown in figure 1 by extending the terminating tubes to infinity. A sub tube for the tube shown in figure 1 originates from the left part and is depicted in figure 3. The Building Block Method gives a procedure for calculating the mappings $R^+$, $T^-$, $R^-$ and $T^+$ for the entire tube in terms of the corresponding scattering properties for the sub tubes. This procedure can be repeated to get several sub tubes.


Rather than using a general numerical package for conformal mappings, we have for the calculations in this paper employed the Schwarz-Christoffel mapping for a duct with corners, rounding the corners using the methods of Henrici [9]. Required analytic integrations are performed in MATHEMATICA.

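We recall the standard duct theory [6] in a form that illustrates the ill-posedness of the problem. We have

$$\psi = \psi^+ + \psi^- = \sum_{n=1}^{\infty} A_n^+ e^{i\alpha_n x}\varphi_n(y) + \sum_{n=1}^{\infty} A_n^- e^{-i\alpha_n x}\varphi_n(y), \qquad (3.3)$$

with $\varphi_n(y) = \sin(n\pi y/a)$ and $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$, $\operatorname{Im}\alpha_n \ge 0$. It is convenient to define the operator $B_0$ by

$$B_0 f = \sum_{n=1}^{\infty} \alpha_n f_n \varphi_n, \qquad f(y) = \sum_{n=1}^{\infty} f_n \varphi_n(y). \qquad (3.4)$$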

We find that $B_0^2 = \partial_y^2 + k^2$ and $\partial_x\psi^\pm = \pm iB_0\psi^\pm$. The initial value problem

$$\partial_x\psi^+(x) = iB_0\psi^+(x), \qquad \psi^+(0) = \psi_0^+ \qquad (3.5)$$

is ill-posed for $x < 0$ but not for $x > 0$. If an attenuated plus wave is marched to the left, an exponential growth is found. To avoid the ill-posedness, $\psi$ is decomposed, and the plus waves are calculated by marching to the right and the minus waves in the opposite direction.
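The growth of attenuated modes under leftward marching can be illustrated with a minimal sketch. It assumes the modal form of a plus wave in a straight duct, $A_n e^{i\alpha_n x}\sin(n\pi y/a)$ with $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$ and $\operatorname{Im}\alpha_n \ge 0$; the function names `alpha` and `march_plus` and the parameter values are illustrative only and do not come from the paper.

```python
import cmath
import math


def alpha(n, k, a):
    """Axial wavenumber of mode n, on the branch Im(alpha_n) >= 0."""
    value = cmath.sqrt(k * k - (n * math.pi / a) ** 2)
    return value if value.imag >= 0 else -value


def march_plus(amplitudes, k, a, x):
    """Propagate plus-wave modal amplitudes A_n from 0 to x: A_n -> A_n exp(i alpha_n x)."""
    return [A * cmath.exp(1j * alpha(n, k, a) * x)
            for n, A in enumerate(amplitudes, start=1)]


# With k = 2 and a = 1 every mode is attenuated (k < pi / a).
k, a = 2.0, 1.0
start = [1.0, 1.0, 1.0]
to_the_right = march_plus(start, k, a, +1.0)   # amplitudes decay
to_the_left = march_plus(start, k, a, -1.0)    # amplitudes grow, faster for higher n
print([abs(c) for c in to_the_right])
print([abs(c) for c in to_the_left])
```

Because the growth rate increases with the mode number, any small error in a high mode is amplified without bound when a plus wave is marched to the left, which is the instability that the decomposition into plus and minus waves is meant to avoid.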

4 Reformulated scattering problem

To be able to use powerful spectral methods it is advantageous to transform the tube to one with a flat boundary. It is enough, according to the Building Block Method, to consider the scattering in the sub tubes, and we restrict ourselves to the first part as shown in figure 3. One way of transforming the tube is to use a conformal mapping $w(\zeta)$, transforming the interior $\Omega$ of the tube with variable cross-section in the $\zeta = x + iy$ plane (figure 3) to the interior $\Pi$ of a straight tube with constant cross-section in the $w = u + iv$ plane. The straight tube is described by $-\infty < u < \infty$, $0 < v < a$.

Introducing $\phi(u,v) = \psi(x,y)$ we get

$$\partial_u^2\phi + B^2(u)\phi = 0 \quad \text{in } \Pi, \qquad (4.1a)$$
$$\phi(u,0) = \phi(u,a) = 0, \quad u \in \mathbb{R}, \qquad (4.1b)$$

with $B^2(u) = \partial_v^2 + k^2\mu(u,v)$ and $\mu = |d\zeta/dw|^2$. $\mu(u,v)^{1/2}$ can be denoted as a refractive index for the straight tube. In figure 4, $\mu$ related to the simple tube in figure 3 is depicted. The factor $\mu(u,v)$ is asymptotically constant at both ends of the tube, or more precisely $\mu(u,v) = \mu_\pm + O(e^{-c|u|})$, $u \to \pm\infty$, with $\mu_- = 1$ and $\mu_+ = (b/a)^2$.

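We use a first order description and rewrite (4.1a) as

$$\partial_u \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -B^2 & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix}. \qquad (4.2)$$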

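To avoid ill-posedness the decomposition $\phi = \phi^+ + \phi^-$ is introduced, which must be identical to the corresponding decomposition (3.3) in regions where $\mu$ is a constant. The new state variables $(\phi^+, \phi^-)$ are introduced via the linear relation

$$\begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ iC & -iC \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}. \qquad (4.3)$$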

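Solving (4.3) for $\phi^+$ and $\phi^-$, taking the $u$-derivative and using (4.2), we find that

$$\partial_u \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}, \qquad (4.4)$$

where

$$\begin{aligned}
\alpha &= \tfrac{1}{2}\left[(\partial_u C^{-1})C + iC^{-1}B^2 + iC\right], \\
\beta &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C + iC^{-1}B^2 - iC\right], \\
\gamma &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C - iC^{-1}B^2 + iC\right], \\
\delta &= \tfrac{1}{2}\left[(\partial_u C^{-1})C - iC^{-1}B^2 - iC\right].
\end{aligned} \qquad (4.5)$$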

To generalize the concept of transmission operators we make them $u$-dependent, using a notation similar to that of Fishman [10]:

$$\begin{pmatrix} \phi^+(u_2) \\ \phi^-(u_1) \end{pmatrix} = \begin{pmatrix} T^+(u_2,u_1) & R^-(u_1,u_2) \\ R^+(u_2,u_1) & T^-(u_1,u_2) \end{pmatrix} \begin{pmatrix} \phi^+(u_1) \\ \phi^-(u_2) \end{pmatrix}, \qquad (4.6)$$

assuming that $u_1 < u_2$ and suppressing the explicit $v$-dependence. It is assumed for (4.6) that the scattering problem has a unique solution or that homogeneous solutions are removed. A homogeneous solution is usually called a bound state.

Next we find a differential equation for the scattering operators $T^+(u_2,u_1)$, $R^-(u_1,u_2)$, $R^+(u_2,u_1)$ and $T^-(u_1,u_2)$ in (4.6) using the invariant imbedding technique [11, 10]. It is required that the incoming wave from the right, $\phi^-(u_2)$, is vanishing. Then put $u_1 = u$, find $\partial_u\phi^-(u)$ from (4.6), and use (4.4) and (4.6) once more to obtain

$$\partial_u R^+(u_2,u) = \gamma + \delta R^+(u_2,u) - R^+(u_2,u)\alpha - R^+(u_2,u)\beta R^+(u_2,u). \qquad (4.7)$$

In a similar manner we get

$$\partial_u T^+(u_2,u) = -T^+(u_2,u)\alpha - T^+(u_2,u)\beta R^+(u_2,u). \qquad (4.8)$$

The stability properties of (4.7) and (4.8) are of central importance. In the flat regions, where $B = B_+$ or $B_-$, we have $C = B$ and $\partial_u C^{-1} = 0$, implying that $\beta = \gamma = 0$ and $\alpha = -\delta = iB$. Similarly, (4.7) and (4.8) reduce to $\partial_u X^+ = -iBX^+$, $X^+ = R^+$ or $T^+$, equations which are well-posed for marching to the left. The initial values to accompany (4.7) and (4.8) are $R^+(u_2,u_2) = 0$ and $T^+(u_2,u_2) = I$, where $I$ is the identity operator.

We choose $C = B_- + f(u)(B_+ - B_-)$, which is independent of $v$. Here $f$ is increasing and smooth with $\lim_{u\to-\infty} f(u) = 0$ and $\lim_{u\to\infty} f(u) = 1$.
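As a hedged illustration of how (4.7) and (4.8) can be marched to the left, the following sketch treats a scalar toy problem only. It assumes that $\mu$ depends on $u$ alone, so that $B^2(u) = k^2\mu(u) - \pi^2/a^2$ is a number rather than an operator, and it takes $f(u) = \tfrac{1}{2}(1 + \tanh(u/w))$ both for the transition of $\mu$ and for $C$. The function name `scalar_scattering`, the tanh profile, the classical Runge-Kutta stepping and all parameter values are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def scalar_scattering(k, a=1.0, b_over_a=0.6, half_length=20.0,
                      steps=10000, width=1.0):
    """Integrate scalar versions of (4.7)-(4.8) from u2 = +half_length
    leftwards to u1 = -half_length and return (R, T, B_minus, B_plus)."""
    mu_minus, mu_plus = 1.0, b_over_a ** 2
    kt2 = (math.pi / a) ** 2                      # transverse eigenvalue of mode 1
    B_minus = cmath.sqrt(k * k * mu_minus - kt2)
    B_plus = cmath.sqrt(k * k * mu_plus - kt2)

    def f(u):
        return 0.5 * (1.0 + math.tanh(u / width))

    def df(u):
        return 0.5 / (width * math.cosh(u / width) ** 2)

    def rhs(u, R, T):
        C = B_minus + f(u) * (B_plus - B_minus)
        g = -df(u) * (B_plus - B_minus) / C       # (d_u C^{-1}) C for scalar C
        B2 = k * k * (mu_minus + f(u) * (mu_plus - mu_minus)) - kt2
        q = 1j * B2 / C
        al = 0.5 * (g + q + 1j * C)
        be = 0.5 * (-g + q - 1j * C)
        ga = 0.5 * (-g - q + 1j * C)
        de = 0.5 * (g - q - 1j * C)
        dR = ga + de * R - R * al - R * be * R    # scalar form of (4.7)
        dT = -T * al - T * be * R                 # scalar form of (4.8)
        return dR, dT

    h = -2.0 * half_length / steps                # negative step: march to the left
    u, R, T = half_length, 0j, 1.0 + 0j           # R(u2,u2) = 0, T(u2,u2) = 1
    for _ in range(steps):
        r1, t1 = rhs(u, R, T)
        r2, t2 = rhs(u + h / 2, R + h / 2 * r1, T + h / 2 * t1)
        r3, t3 = rhs(u + h / 2, R + h / 2 * r2, T + h / 2 * t2)
        r4, t4 = rhs(u + h, R + h * r3, T + h * t3)
        R += h / 6 * (r1 + 2 * r2 + 2 * r3 + r4)
        T += h / 6 * (t1 + 2 * t2 + 2 * t3 + t4)
        u += h
    return R, T, B_minus, B_plus


R, T, B_minus, B_plus = scalar_scattering(k=6.0)
# For real B^2 the flux balance B_-(1 - |R|^2) = B_+ |T|^2 should hold.
print(abs(R), abs(T), B_minus.real * (1 - abs(R) ** 2), B_plus.real * abs(T) ** 2)
```

In this toy setting the flux balance gives a simple consistency check on the integration; the operator-valued problem of the paper would replace the scalars by truncated matrices.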

5 Solution of the scattering problem

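For the numerical solution of the scattering operator we expand $\phi$ in a Fourier sine series and $\mu$ in a Fourier cosine series,

$$\phi(u,v) = \sum_{n=1}^{\infty} \phi_n(u)\varphi_n(v), \qquad \mu(u,v) = \sum_{n=0}^{\infty} \mu_n(u)\xi_n(v), \qquad (5.1)$$

where $\xi_n(v) = \cos(n\pi v/a)$. Using the notation $\Phi = (\phi_1, \phi_2, \ldots)^T$ we find that

$$\partial_u^2\Phi(u) + B^2(u)\Phi(u) = 0. \qquad (5.2)$$

The matrix elements of $B^2(u)$ are given by

$$B^2(u)_{nm} = \frac{k^2}{2}\left[-\mu_{m+n}(u) + \mu_{m-n}(u) + \mu_{n-m}(u)\right] - \frac{n^2\pi^2}{a^2}\delta_{nm} \qquad (5.3)$$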

and it is understood in (5.3) that $\mu_l(u) = 0$ for negative $l$.

For the tube in the physical $\zeta$-plane we require that locally both the potential and the kinetic part of the energy are finite, that is, both $\int_X |\psi|^2\,dx\,dy < \infty$ and $\int_X |\nabla\psi|^2\,dx\,dy < \infty$ for all finite regions $X$ inside the tube. We say that $\psi$ belongs to the Sobolev space $H^1_{loc}$, meaning that $\psi$ and its first derivatives are locally square integrable. Transformed to the straight duct, the local finite energy requirement means $\int_U |\phi|^2\mu\,du\,dv < \infty$ and $\int_U |\nabla\phi|^2\,du\,dv < \infty$ for all finite regions $U$ inside the tube. For a smooth boundary $\phi$ is more regular: it follows from the theory of Grisvard [12] that also the second derivatives of $\phi$ are square integrable, which means that $\phi \in H^2_{loc}$. According to a graph theorem [13], $\phi \in H^2_{loc}$ implies that $\phi(u,\cdot) \in H^{3/2}(0,a)$, meaning that up to 3/2 derivatives are square integrable. To interpret this regularity with fractional derivatives we define, following Taylor [13], the function space

$$D_s = \left\{ f \in L^2(0,a) : \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s < \infty \right\}, \quad s \ge 0, \qquad (5.4)$$

with $f = \sum_{n=1}^{\infty} f_n\varphi_n$ and $f_n = (f,\varphi_n)/(\varphi_n,\varphi_n)$. $D_s$ is a Hilbert space with the norm

$$\|f\|^2_{D_s} = (f,f)_{D_s} = \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s. \qquad (5.5)$$

Taylor [13] shows that $D_0 = L^2(0,a)$, $D_1 = H^1_0(0,a)$, $D_2 = H^2(0,a) \cap H^1_0(0,a)$, and that $\partial_v D_s = D_{s-1}$, $s \ge 1$. In this terminology we have that for a smooth boundary $\phi(u,\cdot) \in D_{3/2}$.

The operator $\partial_v^2$ is self-adjoint on $D_{3/2}$. Thus we may define $B_\pm$ by

$$B_\pm f = \sum_{n=1}^{\infty} \sqrt{k^2\mu_\pm - n^2\pi^2/a^2}\, f_n\varphi_n, \qquad (5.6)$$

assuming that the branch $\operatorname{Im} \ge 0$ of the square root is taken. It is clear that $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_{3/2} \to D_{3/2}$ and $B_\pm : D_s \to D_{s-1}$, $s \ge 1$.

For tubes with edges in the $\zeta$-duct things are a little more complicated. With no restriction on the sharpness of the edges we cannot improve on $\phi \in H^1_{loc}$, implying $\phi(u,\cdot) \in D_{1/2}$. Then, as an intermediate step in our calculations, $B_\pm\phi$ should be in the space $D_{-1/2}$. Such a derivative must of course be interpreted as a distribution. However, the end result, i.e. the scattered wave function, belongs to $D_{1/2}$. To generalise, we define by duality for positive $s$

$$D_{-s} = \left\{ g : \int_0^a f(v)g(v)\,dv < \infty \ \text{for all } f \in D_s \right\}.$$

Multiplication by $\mu$ is an operator $D_{1/2} \to D_{-1/2}$, and if $s \ge 1/2$ we have the following mapping properties: $B_\pm : D_s \to D_{s-1}$, $\partial_v : D_s \to D_{s-1}$, and $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_s \to D_s$.


The equations (4.7)-(4.8) can only in very special cases be solved in closed form. Therefore some type of numerical scheme is used. Generally, a numerical method cannot give uniform convergence for the entire space $D_s$. In a practical application it is usually sufficient to know the effect of the scattering matrices on the lowest eigenfunctions, the first $N_0$, say. A practical method is therefore to truncate the matrix representation of (4.7)-(4.8) to $N \gg N_0$ and solve the finite-dimensional ordinary differential equation with a standard numerical routine. Nilsson [3] proves that such a procedure converges when $N \to \infty$.
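The truncation step can be sketched by assembling the leading $N \times N$ block of $B^2(u)$ at a fixed $u$. The sketch assumes the matrix elements read as $B^2_{nm} = \tfrac{k^2}{2}[-\mu_{m+n} + \mu_{m-n} + \mu_{n-m}] - (n^2\pi^2/a^2)\delta_{nm}$, with $\mu_l$ the cosine coefficients of $\mu(u,\cdot)$ and $\mu_l = 0$ for negative $l$; this reading follows from projecting $\mu$ onto the sine basis and is an assumption about the garbled original. The function name `b2_matrix` and the sample coefficients are hypothetical.

```python
import math


def b2_matrix(mu_coeffs, k, a, N):
    """Leading N x N block of B^2 at fixed u.

    mu_coeffs[l] is the l-th cosine coefficient of mu(u, .); coefficients
    outside the supplied list (including negative l) are taken as zero.
    """
    def mu(l):
        return mu_coeffs[l] if 0 <= l < len(mu_coeffs) else 0.0

    block = [[0.0] * N for _ in range(N)]
    for n in range(1, N + 1):
        for m in range(1, N + 1):
            value = 0.5 * k * k * (-mu(m + n) + mu(m - n) + mu(n - m))
            if n == m:
                value -= (n * math.pi / a) ** 2
            block[n - 1][m - 1] = value
    return block


# Constant mu = 1 gives the straight-duct result: diagonal entries k^2 - n^2 pi^2 / a^2.
straight = b2_matrix([1.0], k=6.0, a=1.0, N=3)
# A weakly v-dependent mu couples neighbouring modes.
coupled = b2_matrix([1.0, 0.3, 0.1], k=6.0, a=1.0, N=3)
print(straight)
print(coupled)
```

The resulting block can then be used in place of the scalar $B^2$ in a matrix version of the marching equations; increasing $N$ beyond the number of modes of interest gives a practical check on the truncation.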

Presently, numerical results are not available for the quantum tube scattering. However, Nilsson [3] presents results for the acoustic case, where the Neumann rather than the Dirichlet boundary condition applies. He reports that for the lowest order reflection coefficient, $N = 1$, i.e. a scalar solution, is accurate up to $ka = 1.5$, $N = 2$ gives a good description, and $N = 5$ gives a perfect description up to $ka = 6$. Energy conservation holds for all $N$.

References

1. J. T. Londergan, J. P. Carini, D. P. Murdock, Binding and scattering in two-dimensional systems: Applications to quantum wires, waveguides and photonic crystals, Lecture Notes in Physics (Berlin: Springer, 1999).

2. K. Lin, R. L. Jaffe, Bound states and threshold resonances in quantum wires with circular bends, Phys. Rev. B 54, 5750-5762 (1996).

3. B. Nilsson, Acoustic transmission in curved ducts with varying cross-sections, article submitted to Proc. Roy. Soc. A.

4. J. C. Wu, M. N. Wybourne, W. Yindeepol, A. Weisshaar, S. M. Goodnick, Interference phenomena due to a double bend in a quantum wire, Appl. Phys. Lett. 59, 102-104 (1991).

5. J. Davies, The Physics of low-dimensional semiconductors (Cambridge: Cambridge University Press, 1998).

6. M. Cessenat, Mathematical methods in electromagnetism (Singapore: World Scientific Publishing Co., 1996).

7. B. Nilsson, O. Brander, The propagation of sound in cylindrical ducts with mean flow and bulk reacting lining - IV. Several interacting discontinuities, IMA J. Appl. Math. 27, 263-289 (1981).

8. H. Wu, D. W. L. Sprung, J. Martorell, Periodic quantum wires and their quasi-one-dimensional nature, J. Phys. D: Appl. Phys. 26, 798-803 (1993).

9. P. Henrici, Applied and computational complex analysis, Volume I (New York: John Wiley & Sons, 1988).

10. L. Fishman, One-way propagation methods in direct and inverse scalar wave propagation modeling, Radio Science 28(5), 865-876 (1993).

11. R. Bellman, G. M. Wing, An introduction to invariant imbedding, Classics in Applied Mathematics 8 (Philadelphia: Society for Industrial and Applied Mathematics (SIAM), 1992).

12. P. Grisvard, Elliptic problems in nonsmooth domains, Monographs and studies in mathematics 24 (Boston: Pitman, 1985).

13. M. Taylor, Partial differential equations I: Basic theory, Applied mathematical sciences 115 (New York: Springer, 1996).


Figure 1 Two-dimensional quantum tube

Figure 2 Schematic picture of heterostructure and split-gate structure. Layer labels: doped AlGaAs, undoped AlGaAs, undoped GaAs, semi-insulating GaAs.

Figure 3 Sub-tube with interior $\Omega$, upper boundary $\Gamma_+$ and lower boundary $\Gamma_-$; $b/a = 0.6$.

Figure 4 $\mu(u,v)$ in the straight duct. Parameters as in figure 3. $\mu^{1/2}$ is the refractive index.

POSITION EIGENSTATES AND THE STATISTICAL AXIOM OF QUANTUM MECHANICS

L. POLLEY, Physics Dept., Oldenburg University, 26111 Oldenburg, Germany

E-mail: polley@uni-oldenburg.de

Quantum mechanics postulates the existence of states determined by a particle position at a single time. This very concept, in conjunction with superposition, induces much of the quantum-mechanical structure. In particular, it implies that the time evolution obeys the Schrodinger equation, and it can be used to complete a truly basic derivation of the statistical axiom as recently proposed by Deutsch.

1 Quantum probabilities according to Deutsch

A basic argument to see why quantum-mechanical probabilities must be squares of amplitudes (statistical axiom) was given by Deutsch [1, 2]. It is independent of the many-worlds interpretation. Deutsch considers a superposition of the form

$$\sqrt{\frac{m}{m+n}}\,|A\rangle + \sqrt{\frac{n}{m+n}}\,|B\rangle. \qquad (1.1)$$

He introduces an auxiliary degree of freedom, $i = 1, \ldots, m+n$, and replaces $|A\rangle$ and $|B\rangle$ by normalized superpositions:

$$|A\rangle \to \frac{1}{\sqrt{m}}\sum_{i=1}^{m}|A\rangle|i\rangle, \qquad |B\rangle \to \frac{1}{\sqrt{n}}\sum_{i=m+1}^{m+n}|B\rangle|i\rangle. \qquad (1.2)$$

All amplitudes in the grand superposition are equal to $1/\sqrt{m+n}$ and should result in equal probabilities for the detection of the states. This immediately implies the ratio $m : n$ for the probabilities of property A or B.
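The counting step of this argument can be made concrete with a small sketch. It assumes the starting amplitudes $\sqrt{m/(m+n)}$ for A and $\sqrt{n/(m+n)}$ for B, spreads each over $m$ or $n$ auxiliary states with normalized weights, checks that all $m+n$ branches then carry the same amplitude, and only then assigns probabilities by counting. The function name `deutsch_probabilities` is illustrative and not part of the original argument.

```python
import math
from fractions import Fraction


def deutsch_probabilities(m, n):
    """Return (P(A), P(B)) obtained by counting equal-amplitude branches."""
    amp_a = math.sqrt(m / (m + n))
    amp_b = math.sqrt(n / (m + n))
    # Replace |A> by m branches of weight 1/sqrt(m), and |B> by n branches of weight 1/sqrt(n).
    branches = [("A", amp_a / math.sqrt(m))] * m + [("B", amp_b / math.sqrt(n))] * n
    reference = 1.0 / math.sqrt(m + n)
    if not all(math.isclose(amp, reference) for _, amp in branches):
        raise ValueError("branches do not have equal amplitudes")
    total = len(branches)
    count_a = sum(1 for label, _ in branches if label == "A")
    return Fraction(count_a, total), Fraction(total - count_a, total)


print(deutsch_probabilities(2, 3))
```

The probabilities come out as $m/(m+n)$ and $n/(m+n)$, i.e. the squares of the original amplitudes, without the squaring rule being used anywhere except in the normalization of the replacement, which is the point the following paragraphs take up.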

The argument has clear advantages over previous derivations of the statistical axiom. Gleason's theorem [3, 4], for example, is mathematically non-trivial and not well received by many physicists, while von Neumann's assumption $\langle O_1 + O_2\rangle = \langle O_1\rangle + \langle O_2\rangle$ about expectations of observables [5, 6] is difficult to interpret physically if $O_1$ and $O_2$ are non-commuting [4, 5].

However, Deutsch's argument relies in an essential way on the unitarity of the replacement, or the normalization of any physical state vector. Why should a state vector be normalized in the usual sense of summing the squares of amplitudes? It would seem desirable to provide justification for this beyond its being natural [2]. In fact, the reasoning would appear circular without an extra argument about unitarity or normalization. I have proposed [7] to realize the replacement (1.2) physically by the time evolution of a suitable device. Then, what can be said about quantum-mechanical evolution without anticipating the unitarity?

2 Schrodingers equation for a free particle as a consequence of position eigenstates

For free particles, a well-known and elegant way to obtain the Schrodinger equation is via unitary representations of space-time symmetries. Interactions can be introduced via the principle of local gauge invariance. However, this approach to the equation anticipates unitarity.

As I pointed out recently [8], the Schrodinger equation for a free scalar particle is also a consequence of the very concept of a position eigenstate^a in discretized space. To an extent, this just means to regard hopping amplitudes, as they are familiar from solid state theory, as a priori quantum-dynamical entities. The point is to show, however, that a hopping-parameter scenario without unitarity would lead to consequences sufficiently absurd to imply that unitarity must be a property of the physical system. As will be seen below, the absurdity is that a wave-function that makes perfect sense at $t = 0$ would cease to exist anywhere in space at an earlier or later time.

Consider a spinless particle hopping on a 1-dimensional chain of positions $x = na$, where $n$ is integer and $a$ is the lattice spacing.

(Diagram: chain of lattice sites ..., n-1, n, n+1, ... with spacing a.)

Assume the particle is in an eigenstate $|n,t\rangle$ of position number $n$ at time $t$ (using the Heisenberg picture) and it has a possibility to change its position. The information given by a position at one time does not determine which direction the particle should go. Thus the eigenstate $|n,t'\rangle$ necessarily is a superposition when expressed in terms of eigenstates relating to another time $t$. Moreover, because of the same lack of information, positions to the left and right will have to occur symmetrically. If $t' \to t$, only nearest neighbours will be involved. Thus we expect a hopping equation of the form

$$|n,t'\rangle = \alpha\,|n,t\rangle + \beta\,|n+1,t\rangle + \beta\,|n-1,t\rangle.$$

This can be rewritten as a differential equation in $t$:

$$-i\frac{d}{dt}|n,t\rangle = V\,|n,t\rangle + \kappa\,|n+1,t\rangle + \kappa\,|n-1,t\rangle, \qquad \kappa, V \ \text{complex (so far)}.$$

^a Which relies on linear algebra, hence includes the concept of superposition.


Parameters $\alpha, \beta$ and $\kappa, V$ are in an algebraic relation [8] which need not concern us here. To obtain an equation for a wave-function we consider a general state $|\psi\rangle$ composed of simultaneous position eigenstates:

$$|\psi\rangle = \sum_n \psi(n,t)\,|n,t\rangle \qquad \text{(Heisenberg picture)}.$$

This defines the coefficients $\psi(n,t)$ for all $t$. Now take the time derivative on both sides, identify $\psi(n,t)$ with a function $\psi(x,t)$ where $x = na$, and Taylor-expand the shifted values $\psi(x \pm a, t)$. This results in

Finally, take $a \to 0$ on the relevant physical scale. The spatial spreading of the wave-function is then given by the $a^2$ term, and the solution of the equation is

$$\psi(x,t) = e^{-i(V+2\kappa)t}\int \tilde\psi(p)\,e^{ipx}\,e^{-ia^2\kappa p^2 t}\,dp.$$

This time evolution would be unitary if $\kappa$ and $V$ were real. Hence, consider the consequences of a non-real $\kappa$. The integrand would then contain an evolution factor increasing towards positive or negative times like

$$\exp\left(\pm a^2\,\operatorname{Im}\kappa\; p^2 t\right).$$

This would lead to physically absurd conclusions about certain harmless wave-functions like the Lorentz-shape function $\psi(x) = 1/(1 + x^2)$:

- For $\operatorname{Im}\kappa > 0$, the harmless function $\tilde\psi(p) \propto \exp(-|p|)$ would not exist anywhere in space after a short while.

- For $\operatorname{Im}\kappa < 0$, the harmless function could not be prepared for an experiment to be carried out on it after a short while.

In a mathematical sense, of course, it still remains a postulate that the value of $\kappa$ be real. But physically it does seem that unitarity of quantum mechanics is unavoidable once the superposition principle and the concept of position eigenstate are taken for granted.

As for parameter $V$, the factor $e^{-iVt}$ would be raised to the $n$th power in an $n$-particle state and would lead to an absurdity similar to the above, with certain superpositions of $n$-particle states, unless $V$ is real too.
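The effect of a non-real hopping parameter can be illustrated numerically on a finite ring of sites. The sketch below assumes the wave-function equation $\partial_t\psi(n) = -i\,[V\psi(n) + \kappa\psi(n+1) + \kappa\psi(n-1)]$ with periodic boundary conditions; this sign convention, the ring geometry, the function names `evolve` and `norm`, and the parameter values are assumptions made for the example. Each Fourier mode then evolves by the factor $\exp(-i(V + 2\kappa\cos\theta)t)$, whose modulus differs from one as soon as $\kappa$ has an imaginary part.

```python
import cmath
import math


def evolve(psi0, kappa, V, t):
    """Evolve a wave-function on a periodic chain by mode decomposition."""
    N = len(psi0)
    # Discrete Fourier transform of the initial state.
    modes = [sum(psi0[n] * cmath.exp(-2j * math.pi * j * n / N) for n in range(N))
             for j in range(N)]
    # Each plane wave picks up exp(-i (V + 2 kappa cos(theta_j)) t).
    factors = [cmath.exp(-1j * (V + 2 * kappa * math.cos(2 * math.pi * j / N)) * t)
               for j in range(N)]
    return [sum(modes[j] * factors[j] * cmath.exp(2j * math.pi * j * n / N)
                for j in range(N)) / N
            for n in range(N)]


def norm(psi):
    return math.sqrt(sum(abs(c) ** 2 for c in psi))


N = 32
psi0 = [1.0] + [0.0] * (N - 1)               # particle initially on one site
print(norm(evolve(psi0, kappa=1.0, V=0.5, t=5.0)))         # real kappa: norm stays 1
print(norm(evolve(psi0, kappa=1.0 + 0.1j, V=0.5, t=5.0)))  # complex kappa: norm drifts
```

On a lattice the growth is bounded by the finite range of $\cos\theta$; in the continuum limit discussed above the factor involves $p^2$ without bound, which is what makes the wave-function cease to exist.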


3 Driven particle Weyl equation in general space-time

As an example of a particle interacting with external fields we may consider a massless spin 1/2 particle with inhomogeneous hopping conditions [8]. Here the starting point is common eigenstates of spin and position, where position refers to a site on a cubic spatial lattice. A particle in such a state at time $t$ will be in a superposition of neighbouring positions and flipped spins at a time $t' \approx t$. In 3 dimensions, and immediately in terms of a wave-function, the corresponding differential equation is

$$-i\frac{d}{dt}\psi_s(x,t) = \sum_{n,\,s'} H_{nss'}\,\psi_{s'}(x - an, t),$$

where $n$ runs over the lattice directions and $H_{nss'}$ are any complex amplitudes. On-site hopping (time-like direction) is included as $n = 0$. To begin with, a free particle is defined by translational and rotational symmetry. In this case the hopping amplitudes reduce to two independent parameters [8], $\epsilon$ and $\kappa$, both of them complex so far. By Taylor-expanding the wave-function and taking $a \to 0$ we find

$$\partial_t\psi_s(x,t) = \epsilon\,\psi_s(x,t) - a\kappa\,\sigma^n_{ss'}\,\partial_n\psi_{s'}(x,t).$$

If $\kappa$ had an imaginary part it would lead to physical absurdities with the time-evolution of certain harmless wave-functions, similarly to the previous section. For real $\kappa$ we recover the non-interacting Weyl equation.

If we now admit slight (order of $a$) anisotropies and inhomogeneities in the hopping amplitudes by adding some $a\,\gamma_{nss'}(x,t)$ to the hopping constants above, we recover a general-relativistic version of the equation [9], with the $\gamma_{nss'}(x,t)$ acting as spin connection coefficients. Unitarity in this context means that the probability current density

$$j^\alpha(x,t) = \psi^\dagger(x,t)\,\sigma^\alpha\,\psi(x,t)$$

is covariantly conserved:

$$\partial_\alpha j^\alpha + \Gamma^\alpha_{\beta\alpha}\,j^\beta = 0.$$

This is found to hold automatically if the vector connection coefficients are identified as usual [9] through the matrix equation

Imposing no constraints on the spin connection coefficients, we are dealing with a metric-affine space-time here, which can have torsion and whose metric may be covariantly non-constant. The study of space-times of this general structure has been motivated by problems of quantum gravity [9]. It may be interesting to note that nothing but propagation by superposing next-neighbour states needs to be assumed here. In particular, scalar products of state vectors are not needed.

Figure 1 An array of eight cavities of equal shape. The initial state is located in the central cavity. When each channel is opened for an appropriate time, the state evolves to an equal-amplitude superposition of the peripheral cavity-states.

4 Realizing Deutschs substitution as a time evolution

Having demonstrated automatic unitarity on two rather general examples, we can now turn with some confidence to the original issue of completing Deutsch's derivation of the statistical axiom.

To realize the particular substitution (1.2) for state vector (1.1), let us consider a particle with internal eigenstate $|A\rangle$ or $|B\rangle$, such as the polarisations of a photon. Let this particle be placed in a system of cavities^b connected by channels (Fig. 1) which can be opened selectively for internal state $|A\rangle$ or $|B\rangle$. It will be essential in the following that all cavities are of the same shape, because this will enable us to exploit symmetries to a large extent. The location of the particle in a cavity will serve as the auxiliary degree of freedom as in (1.2), except that $|A\rangle$ and $|B\rangle$ before the substitution will be identified with $|A\rangle|0\rangle$ and $|B\rangle|0\rangle$, where $|0\rangle$ corresponds to the central cavity.

^b Or Paul traps, or any other sort of potential well; these are to enable us to store away parts of the wave function so that there is no influence on them by the other parts.

Now let only one of the channels be open at a time. We are then dealing with the wave-function dynamics of a two-cavity subsystem, while the rest of the wave-function is standing by. What law of evolution could we expect? A particle with a well-defined (observed) position 0 at time $t$ will no longer have a well-defined position at time $t'$ if we allow it to pass through a channel without observing it. Thus a state $|0,t'\rangle$, defined by position 0 at time $t'$ (using the Heisenberg picture), will be a superposition when expressed in terms of position states relating to a different time $t$. In particular, if channel $0 \leftrightarrow 1$ is the open one,

$$|0,t'\rangle = \alpha\,|0,t\rangle + \beta\,|1,t\rangle.$$

Likewise, by symmetry of arrangement,

$$|1,t'\rangle = \alpha\,|1,t\rangle + \beta\,|0,t\rangle.$$

It follows that $|0,t\rangle \pm |1,t\rangle$ are stationary states whose dependence on time consists in prefactors

$$(\alpha \pm \beta)^k \quad \text{after } k \text{ time steps.} \qquad (4.1)$$

If the particle is initially in the rest of the cavities, whose channels are shut, we would expect this state not to change with time:

$$|\mathrm{rest},t'\rangle = |\mathrm{rest},t\rangle.$$

Now if (4.1) were not mere phase factors, we could easily construct a superposition of $|0\rangle$, $|1\rangle$ and $|\mathrm{rest}\rangle$ so that, relative to the disconnected cavities, the part of the state vector in the connected cavities would grow indefinitely or vanish in the long run. As there is no physical reason for such an imbalance between the connected and the disconnected cavities, we conclude that

$$\alpha + \beta = e^{i\lambda}, \qquad \alpha - \beta = e^{i\nu}.$$

Having shown evolution through one open channel to be unitary, we can identify an opening time interval [7] $\tau_m$ to realize the following step of the replacement (1.2):

$$\sqrt{m}\,|A\rangle|0\rangle + |\mathrm{rest}\rangle \;\to\; \sqrt{m-1}\,|A\rangle|0\rangle + |A\rangle|1\rangle + |\mathrm{rest}\rangle.$$

Here $|\mathrm{rest}\rangle$ stands for state vectors that are decoupled, such as all $|B\rangle|i\rangle$ and all $|A\rangle|i\rangle$ with $i \ne 0, 1$. Opening other channels analogously, each one for the appropriate $\tau_m$ and internal state, we produce an equal-amplitude superposition

$$\sum_{i=1}^{m}|A\rangle|i\rangle + \sum_{i=m+1}^{m+n}|B\rangle|i\rangle.$$

The probability of finding the particle in a particular cavity is now $1/(m+n)$ as a matter of symmetry. As the internal state is correlated with a cavity by the conduction of the process, the probabilities for A and B immediately follow. These must also be the probabilities for finding A or B in the original state, because properties A and B have remained unchanged during the time evolution.
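The sequence of channel openings can be sketched for a single internal state. The example assumes that one opening acts on the pair (central cavity, cavity $i$) as the symmetric matrix with diagonal entries $\alpha$ and off-diagonal entries $\beta$, with $\alpha + \beta$ and $\alpha - \beta$ pure phases, and that the opening time is chosen so that $\alpha = \cos\theta$, $\beta = i\sin\theta$ with $\sin\theta$ equal to the inverse of the current central amplitude. The function names `channel_step` and `split_equally` and this parametrisation are assumptions for illustration.

```python
import cmath
import math


def channel_step(lam, nu):
    """Hopping amplitudes with alpha + beta = exp(i lam), alpha - beta = exp(i nu)."""
    alpha = 0.5 * (cmath.exp(1j * lam) + cmath.exp(1j * nu))
    beta = 0.5 * (cmath.exp(1j * lam) - cmath.exp(1j * nu))
    return alpha, beta


def split_equally(m):
    """Start with amplitude sqrt(m) in cavity 0 and open channels 0 <-> i in turn."""
    amps = [complex(math.sqrt(m))] + [0j] * m
    for i in range(1, m + 1):
        central = abs(amps[0])
        theta = math.asin(min(1.0, 1.0 / central))   # transfers modulus 1 to cavity i
        alpha, beta = channel_step(theta, -theta)    # alpha = cos(theta), beta = i sin(theta)
        amps[0], amps[i] = (alpha * amps[0] + beta * amps[i],
                            beta * amps[0] + alpha * amps[i])
    return amps


amps = split_equally(5)
print([round(abs(c), 6) for c in amps])
```

After the last opening the central cavity is empty and every peripheral cavity carries an amplitude of modulus one, so the state has the permutational symmetry on which the probability assignment rests; the relative phases produced by the openings do not affect this counting.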

5 Can normalization be replaced by symmetry

An interesting side effect of the above realization of Deutsch's argument is that state vectors need no longer be normalized at all. Permutational symmetry of a superposition suffices to show that all possible outcomes of an experiment must occur with equal frequency. Then the numerical values of the probabilities are fully determined. This feature of quantum probabilities may be relevant to problems of normalization in quantum gravity [10], such as the non-locality of summing $|\psi|^2$ over all of space, or the non-normalizability of the solutions of the Wheeler-DeWitt equation.

References

1. D. Deutsch, Proc. Roy. Soc. Lond. A 455, 3129 (1999); Oxford preprint (1989).

2. B. DeWitt, Int. J. Mod. Phys. 13, 1881 (1998).

3. A. M. Gleason, J. Math. Mech. 6, 885 (1957).

4. A. Peres, Quantum Theory (Kluwer Academic Publishers, Dordrecht, 1995).

5. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin-New York, 1932).

6. A. Bohr, O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

7. L. Polley, quant-ph/9906124.

8. L. Polley, quant-ph/0005051.

9. F. W. Hehl et al., Rev. Mod. Phys. 48, 393 (1976); Phys. Rep. 258, 1 (1995).

10. A. Ashtekar (ed.), Conceptual problems of quantum gravity (Birkhauser, 1991).

IS RANDOM EVENT THE CORE QUESTION? SOME REMARKS AND A PROPOSAL

P. ROCCHI

IBM, via Shangai 53, 00144 Roma, Italy. E-mail: paolorocchi@it.ibm.com

This work addresses the Probability Calculus foundations We begin with considering the relations of the event models today in use with the physical reality Then we propose the structural model of the event and a definition of probability that harmonizes the interpretations sustained by different probabilistic schools

1 Preface

The origin of the Probability Calculus is credited to Pascal, who applied rigorous methods to a matter that until then had been grasped only by gamblers and unreliable individuals. He intended to lay the foundations of a new Geometry, and the random event should be a point in this hypothetical abstract science. Throughout the centuries several scientists shared Pascal's conjecture, which has been accepted without discussion. Instead, in our opinion, an exhaustive and systematic approach to probability requires us to investigate the random event before examining the probability itself. The probability theories do not diverge in their final results; they do not provide different formulas for the total probability and the conditional probability. Instead, they are in contrast on the foundations, that is, in the initial concepts, and this circumstance seems to us a substantial reason to study the random event.

In brief we may say that the probability theories use two main models of the random event the linguistic model and the set model We shall examine them in the ensuing sections However we do not restrict our works to mere criticism but we shall trace a theoretical proposal This one provides a new mathematical model of the random event and a definition of probability which seems capable of harmonizing the various authors appearing today in contrast Kolmogorov and the frequentists the subjectivist and objectivist schools etc In this article we present a few elements taken from the complete theoretical framework [11]

2 Linguistic Model

In general different sentences can describe the same random event Let the propositions p q regard one event and verify the equivalence relationship

p ⇔ q (1)

They form the equivalence class X

X = {p, q, ...} (2)

that constitutes the model of the random event so that we have

P = P(X) (3)

We share the opinion that random events are extremely complex and the linguistic model (2) is consistent with this feature Disciplines which investigate complicated phenomena such as psychology and sociology business management and medicine adopt the linguistic representation and consider other schemes to be too simple and reductive The proposition seems an adequate model except for the following perplexity Each primitive is a simple idea and can be left to intuition only for its fundamental property For example a number a point an entity are elementary concepts Can we declare that the random event is complex and contemporarily assume it is a primary concept The acknowledgement of the complexity opposes the primitive assumption This contrast would at least require an in depth justification that instead is lacking as far as we know

The inconsistency is confirmed in the every-day practice and we examine the linguistic model in relation to the facts

2.1) - Some subjectivists declare that each particular of the event should be described in order to make its uniqueness evident, whereas in usual calculations we accept a sentence such as

The coin comes down heads (4)

Note that only two items are reported the coin and the result The precise date time place and all the particulars that make the event unique and unrepeatable remain implicit In fact the parts of a probabilistic event are not easy to distinguish and to relate in a sentence In conclusion a gap exists between the theoretical assertions and the practical applications of (2)

2.2) - In the Logic of Predicates every phrase has a precise meaning and is liable to be calculated. Programmers using Prolog and Lisp develop inferences. Logical programs can deduce the thesis from the hypothesis using precise clauses. However, this linguistic precision constitutes an exception, and normally the natural language is approximate to the extent that a word must be interpreted. The natural language usually represents a random event in generic terms, whereas the linguistic model (2) should be liable to the probability calculation (3).


3 Ensemble Model

The axiomatic theory [8] assumes that the sample space Ω includes all the possible elementary events. Kolmogorov defines the random event X as a set of particular events E_x

X = {E_x} (5)

when X is a subset of Ω

X ⊂ Ω (6)

and the probability is the measure of X

P = P(X) (7)

The practical application of the theory is immediately clarified by Kolmogorov who defines X as the result of the event

3.1) - This conception causes some perplexities in the light of modern systemic studies. Applied and theoretical works on systems [7] assume the event as the dynamic producing the result from the antecedent item:

input -> EVENT -> output (8)

The result is a part and the event is the whole. The properties of the event are evidently quite different from the properties of the output. We encounter heavy difficulties when we call {E_x} a set of events and at the same time conceive it as a set of results. We cannot merge them without a logical justification. But do we have any?

3.2) - Some probabilistic outcomes cannot be properly modeled as sets and subsets. The spectrum of interference in the two-slit experiment is a well-known case emerging in Quantum Physics [6].

4 Structural Model

We searched for a solution to the difficulties described above and designed a theoretical framework based on the structural model of the random event.

Ludwig von Bertalanffy, father of General Systems Theory, conceives a system, and consequently an event, as an intricate set of items which affect one another [2]. Interacting and connecting is the essential character and the inner nature of events, and we take this idea as the basis of our theoretical proposal. We make the following assumption.

Axiom 4.1) - The idea of relating, of connecting, of linking is a primitive.

This idea suggests two elements, specialized in relating and in being related, that we call relationship and entity. We define them as follows.

Definition 4.2) - The relationship R connects the entities, and we say R has the property of connecting.

Definition 4.3) - The entity E is connected by R, and we say E has the property of being connected.

Intuitively we may say R is the active element and E is the passive one. They are symmetric, complementary and complete, since they exhaust the applications of Axiom 4.1). Relationships and entities are already known in Algebra as operations and elements, as arrows and objects, as edges and vertices. The main difference is that all of these are given as primitives, while R and E derive from the axiomatic concept 4.1). In other words, the properties of the relationship and the entity are openly given in 4.2) and 4.3), while they are implicit in other theories. We underline that Axiom 4.1) is not a theoretical refinement: it will provide the necessary basis for the ensuing inferences.

From Definitions 4.2) and 4.3) it follows that the relationship R links the entity E, and they give the set

$S = (E; R)$ (9)

which is an algebraic structure [4]. In this article we discuss theoretical models with respect to the physical reality, thus we immediately examine how E, R and S provide proper models for events. The parts of an event are entities and relationships. For example, an entity is a die, a spade, heads, tails, a product. The relationship that connects two or more entities is, for example, a device, a force, a physical interaction [3]. In the physical reality an event is a dynamic phenomenon linking $E_{in}$ to $E_{out}$, and from (9) we can deduce this general structure

$S = (E_{in}, E_{out}; R)$ (10)

Using a graph we get

$E_{in} \xrightarrow{\;R\;} E_{out}$ (11)

R is the pivotal element in (10) and (11), and the structural model accurately represents the facts. In addition we get the following advantages.

1. The result $E_{out}$ is distinct from the event S. The parts and the whole are logically separate, and this gives a precise answer to objection 3.1).

2. Relations and entities constitute finite and also infinite sets, so that R and E match both discrete and continuous mathematical formalisms.

3. When $E_{out}$ is an ensemble,

$E_{out} = \{E_x\}$ (12)
$E_{out} \subseteq \Omega$ (13)

the structure reproduces the set model in (5) and (6).

4. The result $E_{out}$ may also be a rational or an irrational number, a real or an imaginary value. It can be calculated by a wave function or by another function, etc., and so we can offer a formal solution to point 3.2).

5. The structure S can include the comprehensive context of the probabilistic event. E.g. the atomic experiment depends upon the observer $E_o$, and we have this exhaustive structure

$S = (E_{in}, E_{out}, E_o; R)$ (14)

We believe that the structural model can make a contribution to Quantum Probability.

6. A simple sentence includes nouns, which are entities, and a verb representing a dynamical evolution. E.g. (4) expresses the following entities and relationship:

The coin ($E_{in}$) comes down ($R$) heads ($E_{out}$) (15)

In short, the algebraic structure fulfils the linguistic model. However, a sentence can be equivocal, whereas the structure S is a rigorous formalism and answers point 2.2).

Note that the set (9) has the associative-dissociative property: the event is the unicum S, which is then defined in terms of the details E and R. If this analysis is insufficient, we reveal the entities $(E_1, E_2, \ldots, E_m)$ and the relations $(R_1, R_2, \ldots, R_p)$; these are exploded at a greater level, and so forth. The structure of levels is the complete and rigorous model of any event:

$S = (E; R) = (E_1, E_2, \ldots, E_m; R_1, R_2, \ldots, R_p) = (E_{11}, E_{12}, \ldots, E_{m1}, E_{m2}, \ldots, E_{mk}; R_{11}, R_{12}, \ldots, R_{p1}, R_{p2}, \ldots, R_{ph})$ (16)

The structure can also be written as

level 0: $S$
level 1: $E; R$
level 2: $E_1, E_2, \ldots, E_m; R_1, R_2, \ldots, R_p$
level 3: $E_{11}, E_{12}, \ldots, E_{m1}, E_{m2}, \ldots, E_{mk}; R_{11}, R_{12}, \ldots, R_{p1}, R_{p2}, \ldots, R_{ph}$ (17)

The multiple-level decomposition is also known in the literature as the hierarchical property [13]. It is applied by professionals in software analysis methodologies [14][10], it is basic in modern ontology [12] and in various other sectors [1]. The progressive explosion of the event is already known in the Probability Calculus, where we use trees connecting the parts and the subparts of a random event. For example, an urn contains x red balls, y green balls and z white balls. What is the probability of getting one white and two green balls in three draws?

We consider the drawing $R_w$ of a white ball w and the drawing $R_g$ of a green ball g. The winning combinations wgg, gwg, ggw are generated by $R_1$, $R_2$ and $R_3$. Intuitively we write this tree connecting three levels:

S
  R1: Rw Rg Rg
  R2: Rg Rw Rg
  R3: Rg Rg Rw (18)

The structure of levels (17) is rigorous and complete. It includes the relations of the event as well as the entities:

level 0: $S$
level 1: $g, w;\ R_1 + R_2 + R_3$
level 2: $wgg, gwg, ggw;\ (R_w R_g R_g) + (R_g R_w R_g) + (R_g R_g R_w)$ (19)

Thanks to this completeness, the structural model provides some insight into what is involved. In particular, if $R_x$ at level k includes the subrelationships of level (k + 1), then $R_x$ connects the entities through these subrelationships. E.g. the structure of levels (19) illustrates the dynamic $R_1$ carried out by $(R_w R_g R_g)$, which physically determine the results. The structure (16) shows that any event is composed of precise macromechanisms and micromechanisms. Any event appears like an industrial apparatus, a mechanical clock or an electronic device including various working parts. This operational analysis, which is based on Axiom 4.1), will be fundamental in the next section.
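
The urn tree can be evaluated numerically. The following is only a minimal sketch under assumptions that the text does not state: the draws are taken without replacement, and the function name and the sample counts are hypothetical. It multiplies the probabilities of the sub-relationships along each branch (Rw Rg Rg), (Rg Rw Rg), (Rg Rg Rw) and then sums the three branches R1 + R2 + R3.

```python
from fractions import Fraction

def prob_white_two_green(x, y, z):
    """Probability of one white and two green balls in three draws.

    Assumption (not stated in the text): draws are without replacement
    from an urn with x red, y green and z white balls, x + y + z >= 3.
    """
    total = x + y + z
    result = Fraction(0)
    # One branch per winning combination, i.e. per relationship R1, R2, R3.
    for branch in ("wgg", "gwg", "ggw"):
        remaining = {"g": y, "w": z}
        left = total
        p = Fraction(1)
        # Each letter is one sub-relationship (Rw or Rg) at the lower level.
        for colour in branch:
            p *= Fraction(remaining[colour], left)
            remaining[colour] -= 1
            left -= 1
        result += p
    return result

# Hypothetical urn: 2 red, 3 green, 1 white.
print(prob_white_two_green(2, 3, 1))
```

Exact fractions are used so that the sum over the three branches can be compared with a direct combinatorial count.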

5 Certain and Uncertain Structures

Probability is the answer to questions such as: Who will win the next football match? Who will be elected in the regional elections? Shall I pass the examinations? Where is the photon now?

These questions show that probability concerns the particulars of an event that is already known as a whole. We see the overall random phenomenon, but we ignore the details that will produce the result. When we ask who will win the next match, we are familiar with the match: we already know the teams which will play, where the match will be held, etc. We master the event; however, we do not have the details that will determine the result. Why do we not have the details?

The cognitive difficulties related to the particulars of a random event have several origins. For example, memory is generic, the reports are not detailed, the particulars are missing because they are disseminated over a vast area, we meet obstacles in the use of instruments, etc.

Ignorance of the microscopic is sometimes a voluntary choice. Every detail could be observed, and yet we decline to know it. For example, a company has collected analytical data, but the executive managers ignore them and evaluate their average values when taking important decisions. Macroscopic knowledge and unawareness of microscopic items provide a precise method. Statisticians adopt this method, which is absolutely scientific.

Let us translate these concepts into the formalism just introduced. Let the event S have level 1, level 2, up to level q. Two cases now arise.

5.1 Certain Structures

The event is entirely described by the relations and the entities of level q. The elements at level (q + 1) exist neither on paper nor in the physical reality. This structure, which is wholly defined and complete, is certain. As an example we take a falling body:

level 0: $S$
level 1: $E_b, E_T; R_f$ (20)

The structure includes the body $E_b$, the Earth $E_T$ and the force of gravity $R_f$ at level 1. These elements exhaustively model the event, and other elements do not exist in the physical world.

5.2 Uncertain Structures

The event is not entirely described by the relations and the entities of level q. The microelements pertaining to level (q + 1) exist in the physical reality and influence the final results in a decisive way; however, the structure does not include them. We call such a partial structure uncertain (or random). As an example we take the flipping of a coin. The structure includes the coin $E_m$ and the launching-falling dynamics $R_m$. The entities $E_t$ (heads) and $E_c$ (tails), and the alternative relations which produce them, appear at the next level:

level 0: $S$
level 1: $E_m; R_m$
level 2: $E_t, E_c; R_t + R_c$
level 3: (21)

The subrelationships of $R_t$ and of $R_c$ produce any specific outcome. They are essential, since they would enable the calculation of any result, and should be listed at level 3 in (21). However, they do not appear, and the structure (21) is uncertain.

6 Probability

A certain event is entirely explained through the structure of levels. The structure clearly indicates how the event runs through q levels, which are exhaustive by definition. On the contrary, the uncertain structure is incomplete and cannot describe how the event runs in the physical reality. Since it is impossible to describe how the event functions, the level (q + 1) being unknown, we inquire when the event behaves, that is, when the random event exists in the physical reality. This inquiry reveals a typically physical approach. The problem eludes whoever develops an abstract study. For the pure theoretician the event S, once defined on paper, exists by definition. The applied scientist, instead, knows the great difference between the definition of a model and its experimental observation.

The structure of levels (16) shows that the event S works through R, therefore we measure the ability of the relationship to connect.

Definition 5.1) - When R links the input to the output in the physical reality, the event S is certain and the measure P(R) equals one:

$P(R) = 1$ (22)

When R does not run in the physical reality, S is impossible in fact and the measure P(R) is zero:

$P(R) = 0$ (23)

If R occasionally runs, P(R) assumes a decimal value. The connection is neither sure nor impossible, and P(R) has a value between zero and one:

$0 < P(R) < 1$ (24)

We call probability the measure P(R) of the operation R, which extensively indicates the occurrence of S. We can add the ensuing remarks.

1. The relationship R is the precise argument of probability, while S is generic.

2. Definition 5.1) is coherent with the common sense of probability, as P(R) gauges the possibility or the impossibility of the random event.

3. In some special events we can define the operation using its outcome. Formally, we state a univocal relation between $E_{out}$ and R,

$E_{out} \Rightarrow R$ (25)

and we calculate the probability of the outcome

$P(E_{out}) = P(R)$ (26)

E.g. the result heads $E_t$ appears whenever $R_t$ works, and we forecast the chances of a gamble from the possible outputs:

$P(E_t) = P(R_t) = 0.5$ (27)

In conclusion, if (12), (13) and (26) are true, Definition 5.1) is consistent with Kolmogorov's theory.

4. Certain structures include only certain elements; impossible elements have no sense and are omitted. The unitary value of probability merely confirms what is already stated in the levels. For example, $P(R_f)$ is one and substantiates the structure of levels (20). Conversely, the uncertain structure lacks the lowest elements, which are essential, and (24) unveils them. The decimal values of probabilities clarify the intervention of the elements at level (q + 1). For example, we ignore the parts of $R_t$ producing the result $E_t$ in (21); instead, the probability (27) is capable of explaining how they work. Exactly half of the occurrences of S are due to the subrelationships of $R_t$, and the other half are activated by the components of $R_c$. The explicative and predictive values of probability in (24) appear absolutely relevant.

7 Experimental Verification

Our inferences are strictly inspired by experience, and Definition 5.1) must be confirmed by the facts. In order to simplify the discussion of practical verification, let the event include either the relationship $R_i$ or NOT $R_i$ at level 2, while level 3 is ignored:

level 0: $S$
level 1: $E; R$
level 2: $E_i$, NOT $E_i$; ($R_i$ + NOT $R_i$)
level 3: (28)

The probability $P(R_i)$ expresses the runs of $R_i$ by definition, thus the number of occurrences $g_s(R_i)$ in the sample s verifies the theoretical value $P(R_i)$. The more $R_i$ connects, the greater is $g_s(R_i)$; vice versa, the less $R_i$ runs, the smaller is $g_s(R_i)$. However, the absolute frequency $g_s(R_i)$ exceeds the range [0, 1], and we select the relative frequency $F_s(R_i)$, which satisfies

$0 \le F_s(R_i) \le 1$ (29)

According to this theory the relative frequency must coincide with the probability calculated theoretically; instead, $F_s(R_i)$ does not coincide with $P(R_i)$. Why? Is there perhaps a systematic error in the experiment?

The relationship $R_i$ at level q works by means of its subrelationships at level (q + 1); however, we do not know in detail how these behave. In particular, a subrelationship at level (q + 1) occurs at random, and a finite number of tests does not allow the subrelationships of $R_i$ to maintain their dynamical contribution to $R_i$. Symmetrically, the occurrences of the subrelationships of NOT $R_i$ are not proportional to P(NOT $R_i$). Every finite sample of tests unbalances $R_i$ and NOT $R_i$. The occurrences of one group are lower than they ought to be and the occurrences of the other are greater, since the subrelationships occur at random. The relative frequencies appear in favour of one group of subrelationships and to the detriment of the other. $F_s(R_i)$ and $F_s$(NOT $R_i$) are necessarily unreliable and disagree with $P(R_i)$ and P(NOT $R_i$). We conclude that the correct trial of probability must be extended over the universe, where the subrelationships of $R_i$ and of NOT $R_i$ do not undergo limitations. The ideal experimentation of $P(R_i)$, which excludes any deforming influence and provides the unaltered value of $F_s(R_i)$, requires the number $G_s$ of tests to be infinite:

$G_s = \infty$ (30)

In this situation the theoretical value $P(R_i)$ and the experimental one coincide:

$F_s(R_i) - P(R_i) = 0$ (31)

The ideal experiment (30) is unattainable, therefore we can only approach it. We define this approximation using the limit

$\lim_{G_s \to \infty} [F_s(R_i) - P(R_i)] = 0$ (32)

The limit affirms that, given the high number N, there is a value $G_s$

$G_s > N$ (33)

such that

$|F_s(R_i) - P(R_i)| < 1/G_s$ (34)

In other words, we repeat the tests a sufficiently high number of times, and the difference between the frequency and the probability will be less than the small number $1/G_s$. The limit (32) ensures a result as fine as desired. It shows that the probability defined by (22), (23), (24) is verifiable in fact and confirms that the present theory has substance.
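
The approach of the relative frequency to the probability can be illustrated with a small simulation. This is only a sketch under stated assumptions: a seeded pseudo-random generator stands in for the unknown subrelationships of level (q + 1), the value P(Ri) = 0.5 is taken from the coin example (27), and the function name is hypothetical. The script prints the observed deviation for growing samples; it does not by itself establish any particular rate of convergence.

```python
import random

def relative_frequency(p, trials, seed=0):
    """Relative frequency Fs of an outcome of probability p in `trials` tests.

    A seeded pseudo-random generator plays the role of the hidden
    subrelationships; this is an illustrative assumption.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.random() < p)
    return hits / trials

# Sample sizes Gs from the single test up to a large sample.
for gs in (1, 10, 1000, 100000):
    fs = relative_frequency(0.5, gs)
    print(gs, fs, abs(fs - 0.5))
```

With a single test the frequency can only be 0 or 1, which is the extreme case discussed in the next section.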

The limit (32), known as the empirical law of chance or the law of large numbers, does not define probability but only explains its experimental verification. It is less meaningful than the law sustained by frequentists [9] and does not give rise to the same conceptual difficulties. The limit (32) does not use probability to describe the approximation of $F_s(R_i)$ to $P(R_i)$ and avoids a certain conceptual tautology.

8 Objective and Subjective Probability

The limit (32) states that the higher the number of tests, the nearer the frequency moves to the probability. Vice versa, the smaller the sample, the less reliable is the experimental control of probability. The maximum deviation emerges in a single test, and the structural model provides the explanation.

One subrelationship of level (q + 1) fires the single experiment, and this subrelationship pertains either to $R_i$ or to NOT $R_i$. In both cases the frequency deviates completely from the probability, which should be decimal.

$G_s$:  1      $> N$        $\infty$
$F_s$:  wrong  approximate  right     (35)

The spectrum (35) is valid in relation to frequency and also in relation to probability. What does this mean?

Any scientific measure takes its meaning under the precise conditions in which it is defined. Therefore a parameter does not have a value for ever, but only in the practical conditions under which it must be tested. This rule also concerns probability. A fairly simple case can clarify the matter.

We define the force as the factor causing the acceleration a of the mass m:

$f = m \cdot a$ (36)

Mechanics defines the force (36) in conditions which pertain exclusively to the inertial system. This is characterized by the property of being stationary or moving in a straight line at constant speed. In the inertial system the mass m is subject to the force and accelerates in accordance with (36). Conversely, the body can move without any mechanical solicitation in the non-inertial reference. The force cannot be tested, and definition (36) is meaningless when the system is not inertial.

In general, a scientific measure takes on a significance only under the experimental conditions pertaining to it, and out of this context it objectively has no meaning. The same criterion applies to probability, with additional difficulties due to the experimental conditions, which are expressed by the limit (32) and are somewhat complex. We do not have two alternative and mutually exclusive reference systems, inertial and non-inertial; instead we have the continuous spectrum (35). Probability is correctly tested, and thus takes on a right and objective significance, when

$G_s = \infty$ (37)

This is unattainable, and we use a large sample

$G_s > N$ (38)

The higher the number of tests, the more objective is the verification of probability. Probability loses significance as $G_s$ decreases. The test is absolutely meaningless when

$G_s = 1$ (39)

Probability is very useful (see point 3 in section 6), and we calculate P(R) even if (39) is true. In the single event, however, the probability does not exist, as De Finetti paradoxically states [5]. Probability can only orientate the personal expectation, namely probability takes on a subjective significance.

$G_s$:  1           $> N$        $\infty$
$F_s$:  wrong       approximate  right
$P$:    subjective               objective   (40)

Note that the subjectivist schools focus their attention on the single event, while the general event is a repetition of single events. This remark highlights once again that the incongruences between various authors are rooted in the modeling of the random event.

In substance, $F_s(R_i)$ and $P(R_i)$ have a correct and objective meaning when they refer to the entire inductive base. As the number of experiments decreases, the precision of $F_s(R_i)$ decreases and the objectivity of $P(R_i)$ decreases progressively, down to the point (39) at which the numerical value of $F_s(R_i)$ is systematically wrong and the value of $P(R_i)$ is subjective.

9 Conclusions

Our theoretical proposal arose from a critical approach to the probabilistic event; in particular, we started by examining the relation between the theoretical models in use today and the physical reality. We believe the algebraic structure meets the needs better than the linguistic and the set models. Besides the theoretical merits that we listed in the previous pages, we highlight that structures of levels are already applied in several fields, and in the Probability Calculus too.

The definition of probability that derives from the structural model is consistent with common sense and with the probabilistic schools. The different interpretations of probability, which today are conflicting, are unified within our framework. We judge this to be a significant feature that may stimulate the scientific debate.

The reader may find some parts of this paper sketchy and insufficiently explained; we regret the conciseness. Other considerations and further calculations have been developed in [11], but exhaustive discussions cannot be included here.

References

1. Ahl V., Allen T.F.H., Hierarchy theory: a vision, vocabulary and epistemology (Columbia Univ. Press, NY, 1996).
2. von Bertalanffy L., General system theory (Braziller, NY, 1968).
3. Chen P.S., The entity-relationship model: toward a unified view of data, ACM Transactions on Database Systems, vol. 1, n. 1 (1976).
4. Corry L., Modern algebra and the rise of mathematical structures (Verlag, NY, 1996).
5. de Finetti B., Theory of probability (Wiley & Sons, NY, 1975).
6. Feynman R., The concept of probability in quantum mechanics, Proceedings Symp. on Math. and Prob., California University Press (1951).
7. Kalman R.E., Falb P.L., Arbib M.A., Topics in mathematical system theory (McGraw, NY, 1969).
8. Kolmogorov A.N., Foundations of the theory of probability (Chelsea, NY, 1956).
9. von Mises R., The mathematical theory of probability and statistics (Academic Press, London, 1964).
10. Rocchi P., Technology + culture = software (IOS Press, Amsterdam, 2000).
11. Rocchi P., La probabilità è oggettiva o soggettiva? (Pitagora, Bologna, 1998).
12. Uschold M.L., Building ontologies: toward a unified methodology, Proc. Expert Systems, Cambridge (1996).
13. Takahara Y., Mesarovic M.D., Macko D., Theory of hierarchical multilevel systems (Academic Press, NY, 1970).
14. Yourdon E., Modern structured analysis (Englewood Cliffs, NY, 1989).


CONSTRUCTIVE FOUNDATIONS OF RANDOMNESS

V. I. SERDOBOLSKII
Moscow 109028, B. Trekhsviatitelskii 3/12, MGIEM
E-mail: vserd@mail.ru

The ideas of complexity and randomness are developed in a successively constructive theory. The Kolmogorov complexity is reconsidered as a minimization process. Basic theorems are proved for these processes. A new notion of complexity based on sequential prefix coding algorithms (S-algorithms) is proposed. It is proved that a constructive infinite binary sequence is algorithmically stationary iff it is an S-encoded random sequence.

1 Introduction

In 1963 A. N. Kolmogorov [1] suggested an algorithmic approach to the foundations of probability. His new definition of probability was based on the notion of complexity, which was defined as the length of the minimal description: for a binary word x the complexity function is defined as

$K(x) = \min_{A(p) = x} |p|$ (1)

where p are (shorter) binary words and the minimum is evaluated over all possible algorithms A. A remarkable property of this approach was that randomness defined algorithmically in this way was proved to display all traditional laws of probability. However, the function K(x) defined by (1) in a traditional intuitive approach cannot be effectively calculated, since it is not a partially recursive function. In fact, this function is computable only for finitely many words x [2]. In [3] it was shown that K(x) is not partially recursive for any universal algorithm. In [4] the definition (1) was called a heuristic basis for various approximations. In [5] the author writes that the non-constructive form of the definition (1) leads to some difficulties, so that many important relations hold only to within an error term measured by the logarithm of the complexity. To offer a constructive definition of randomness, it would be desirable to call an infinite sequence random if all its initial segments (prefixes) are incompressible. However, it was proved [6] that such sequences do not exist. Kolmogorov proposed a definition of randomness (K-randomness), but he wrote that it was to be improved.

In this paper we reconsider fundamental relations of the Kolmogorov complexity theory and develop a successively constructive formalism. The main idea is that, as far as we deal with algorithms, we must explicitly take into account the current time of their performance. Thus a static notion of minimal description must be replaced by the process of minimization. Here we suggest a rigorous formalism in which it is possible to replace the somewhat obscure intuitive reasoning of the existing complexity theory by formal investigation of strings of symbols. We present a survey of basic results of the Kolmogorov complexity theory in terms of processes of step-by-step performance of algorithms. We also introduce a new form of complexity based on a restriction to algorithms coding sequentially from left to right (S-algorithms). A constructive infinite binary sequence can be called stationary if the frequencies of all finite blocks of digits in it converge. We prove that a sequence is stationary iff it is the transformation of an incompressible (up to a logarithmic term) sequence by a sequential left-to-right encoding algorithm.

Let us define the objects of the investigation and fix notations. We study binary words x, which are finite chains of binary digits and at the same time binary numbers. These words are transformed by algorithmic procedures A, which can be represented by Turing algorithms (Turing machines) or, equivalently, by partially computable (partially recursive) functions. We also study infinite sequences $x^\infty$ of binary digits, which can be considered at the same time as infinite sequences of words x of increasing length n, i.e. initial segments of $x^\infty$. In the constructive approach these sequences must be generated by some finite algorithms (generating functions). We write A(x) = y if A halts at some finite step and yields y. If A(x) does not halt, we assign to A(x) a special non-halting sign. We will often need to perform algorithms step by step. Let $A_t(x)$ denote the result of the performance of A(x) for t steps: $A_t(x) = y$ if A(x) halts at a step $t' \le t$ and yields y. We assign the non-halting sign to $A_t(x)$ if A(x) does not halt or halts only at a moment $t' > t$. Let $|x|$ denote the length of a binary word x.

2 Kolmogorov Complexity

According to Kolmogorov, the complexity of a binary word is the length of a minimal program generating this word. To make this definition completely constructive, we first must explicitly describe the minimization procedure. To minimize a partially computable function f(x), we combine the search of x with counting the number of steps of an algorithm that evaluates f(x). Let us use the uniform increasing numeration N = 1, 2, ... of n-tuples of arguments; for example, let N = 1, 2, 3, 4, 5, ... represent pairs (1, 1), (1, 2), (2, 1), (2, 2), (1, 3), ...
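
One numeration of pairs that agrees with the five pairs listed above can be sketched as follows. The text fixes only the first five pairs, so the continuation used here (enumerating by shells of constant maximum coordinate) and the function name are assumptions made for illustration.

```python
from itertools import count, islice

def pairs():
    """Enumerate all pairs of positive integers, shell by shell.

    Shell m contains every pair whose larger coordinate equals m:
    first (1, m), ..., (m-1, m), then (m, 1), ..., (m, m).
    The order beyond the first five pairs is an assumption.
    """
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
        for j in range(1, m + 1):
            yield (m, j)

# The pairs numbered N = 1, ..., 5.
print(list(islice(pairs(), 5)))
```

Every pair (x, t) receives a finite number N, which is what allows the search over arguments and the counting of steps to be interleaved in a single process.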

Define the standard minimization process for A(x) as follows:

$\min_x A(x) = \{A(x, N),\ N = 1, 2, \ldots\}$,

where $N = (x, t)$, $A(x, 0)$ is the non-halting sign, and $A(x, N) = \min\,(A(x, N-1), A_t(x))$ for $N \ge 1$. In the minimization process the non-halting sign can be treated as infinity. If A(x) halts for a computable number of steps t, then the minimization process ends and $\min_x A(x)$ is a computable function. If no such t exists, we can then say that the function A(x) has no bottom.

Consider the universal Turing machine U. By definition, $U(A, p) = A(p)$ in the domain, where (and in the following) the same letter A also denotes the text of the algorithm. Let $|A|$ denote the length of the text A.

Theorem 1. There exist computable functions such that the mass problem of halting of their minimization process is algorithmically unsolvable.

Proof. Consider the indicator function: $\mathrm{ind}(x, t) = 0$ if $U_t(x)$ with $x = (A, p)$ halts exactly at the step t, so that $U_t(x) = A(p)$; otherwise $\mathrm{ind}(x, t) = 1$. Denote

$\varphi(x, t) = \prod_{\tau \le t} \mathrm{ind}(x, \tau)$.

The minimization process $\varphi(x, 1), \varphi(x, 2), \ldots$ is finite iff U(x) halts. But the halting problem for the universal Turing machine U is algorithmically unsolvable.

Now we can define the complexity as follows.

Definition 1. Given a binary word x and an algorithm (partially computable function) A, the complexity of x with respect to A is $K(x, A) = \{K(x, A, N),\ N = 1, 2, \ldots\}$, where

$K(x, A, N) = \min_{(p,t) \le N,\ A(p) = x} |p|$.

In this definition A(p) is called a generating algorithm and p is called a program or a code for x.

So the complexity is defined as a process, not as a function. If A(x) halts for some x, then the sequence $K(x, A) = \{K(x, A, N),\ N = 1, 2, \ldots\}$ converges to a constant for some computable $N = N_0$, and we can say that the complexity function K(x) is defined. Otherwise no such constructive function exists.
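
The idea of complexity as a process can be illustrated with a toy example. The sketch below is not the construction of the paper: the decoder is a hypothetical, very restricted algorithm (not a universal machine), its step count is invented for illustration, and the numeration of pairs (program index, step budget) is one possible choice. It only shows how K(x, A, N) stays undefined (here: infinity) until some numbered pair yields x, and never increases afterwards.

```python
from itertools import count, islice, product

def programs():
    """Yield all non-empty binary words in order of increasing length."""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def pairs():
    """Yield pairs (i, t) of positive integers shell by shell (an assumed order)."""
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
        for t in range(1, m + 1):
            yield (m, t)

def toy_decoder(p, t):
    """Hypothetical step-bounded decoder A_t(p).

    '0' + w  prints the word w literally;
    '1' + b  prints int(b, 2) zeros (no zeros if b is empty).
    A run is charged len(output) + 1 steps; None stands for the
    non-halting sign when the budget t is too small.
    """
    if p[0] == "0":
        out = p[1:]
    else:
        out = "0" * (int(p[1:], 2) if len(p) > 1 else 0)
    return out if len(out) + 1 <= t else None

def complexity_process(x, n_max):
    """Return the list K(x, A, N) for N = 1, ..., n_max (inf = not yet defined)."""
    gen, seen = programs(), []
    best, values = float("inf"), []
    for i, t in islice(pairs(), n_max):
        while len(seen) < i:          # materialize programs up to index i
            seen.append(next(gen))
        p = seen[i - 1]
        if toy_decoder(p, t) == x:    # program p yields x within t steps
            best = min(best, len(p))
        values.append(best)
    return values

values = complexity_process("0000", 729)
print(values[0], values[-1])
```

For this toy decoder the process stabilizes because only finitely many programs are shorter than the first one found; for a universal machine the moment of stabilization is in general not computable, which is the point of Theorem 1.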

To compare minimization processes we need a special technique.

Definition 2. Given two minimization processes

$\min_x A(x) = \{A(x, N),\ N = 1, 2, \ldots\}$, $\min_x B(x) = \{B(x, M),\ M = 1, 2, \ldots\}$,

we write $A(x) \preceq B(x)$ if for each M there exists an $N_0$ such that for all $N > N_0$ the inequality $A(x, N) \le B(x, M)$ holds.

If both processes halt, we can write simply $A(x) \le B(x)$. If $A(x) \preceq B(x)$ and $A(x) \succeq B(x)$, we say that the strong equivalence holds and write $A(x) \sim B(x)$. Define also a weak equivalence $A(x) \approx B(x)$ if $A(x) \preceq B(x) + c$ along with $B(x) \preceq A(x) + c$.

The algorithmic theory of complexity started with the discovery of universal descriptions and universal complexity. This basic discovery was made simultaneously and independently by Kolmogorov and R. Solomonoff in 1960-1964 (see [7]).

This theory is developed to study minimal descriptions of arbitrarily long words x with finite algorithms. This means that $|A| \le c$. All basic results are obtained with accuracy up to constants c, which are supposed to be independent of x.

Definition 3. The complexity of the word x with respect to an algorithm A is the process $K(x, A) = \{K(x, A, N),\ N = 1, 2, \ldots\}$, where

$K(x, A, N) = \min_{(p,t) \le N,\ A_t(p) = x} |p|$.

We use two methods of the complexity theory: upper estimates of the complexity are derived by the construction of explicit generating procedures; lower estimates are obtained by counting the variety of words and their programs.

Theorem 2. For any algorithm A we have

$K(x, U) \preceq K(x, A) + c_A$,

where $c_A$ depends only on A but not on x.

Proof. Count the steps of A(x) by steps of the universal Turing machine performing A. For each N we can find a number M such that

$K(x, U, N) = \min_{(z,t) \le N,\ U_t(z) = x} |z| \le \min_{B:\,|B| \le c}\ \min_{(p,t) \le N,\ U_t(B,p) = x} |(B, p)| \le \min_{(p,t) \le N,\ U_t(A,p) = x} (c_A + |p|) \le c_A + \min_{(p,t) \le M,\ A(p) = x} |p| = c_A + K(x, A)$,

where $c_A$ is a constant depending only on A. This completes the proof.

This statement is called the Invariance Theorem. Its significance is that it introduces a universal measure of complexity, which is calculated by trying different algorithms with different input words. Let us fix a particular universal Turing machine U as a reference machine and set K(x) = K(x, U).

Let us call the difference $|x| - K(x)$ the number of regularities.

Remark 1. Given $n = |x|$, the fraction of words x with the number of regularities more than m is no more than $2^{-m}$.

This follows from the fact that there are only $2^{n-m}$ programs p of length $n - m$. So almost all words are incompressible up to a slowly increasing function of n.
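
The counting behind Remark 1 can be checked with exact arithmetic. The sketch below is an illustration under one assumption of ours: "more than m regularities" is read as "some program shorter than n - m", so the programs counted are all binary words of length 0, ..., n - m - 1. The function name is hypothetical.

```python
from fractions import Fraction

def max_fraction_compressible(n, m):
    """Upper bound on the fraction of n-bit words having a program
    shorter than n - m (each program generates at most one word)."""
    short_programs = 2 ** (n - m) - 1   # words of length 0, ..., n-m-1
    return Fraction(short_programs, 2 ** n)

# Example: words of length 10, more than 3 regularities.
bound = max_fraction_compressible(10, 3)
print(bound, bound < Fraction(1, 2 ** 3))
```

The bound does not depend on the choice of the generating algorithm, only on the number of available short programs.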

Remark 2. $K(x) \preceq |x| + c$. This is obvious, since we can use the identity algorithm A(x) = x as a generating algorithm.

Note that the minimization process in Theorem 2 can be made more efficient if we restrict p by $|p| \le |x| + c$.

The complexity of finite words depends strongly on the additive constant c. Therefore the main object of study will be the complexity of words x of arbitrarily great lengths n.

Theorem 3. If f(x) is a partially computable function, then $K(f(x)) \preceq K(x) + c$.

Proof. Suppose the algorithm evaluating f(x) halts. Given an arbitrary algorithm A, we construct the composition B = fA. By Definition 3 and Theorem 2, for each N we can find M and a constant c independent of x such that

$K(f(x), U, N) = \min_{(z,t) \le N,\ U_t(z) = f(x)} |z| \le \min_{B:\,|B| \le c}\ \min_{(p,t) \le M,\ B_t(p) = f(x)} |p| + c \le \min_{(p,t) \le M,\ f(A_t(p)) = f(x)} |p| + c \le \min_{(p,t) \le M,\ A_t(p) = x} |p| + c = K(x, A) + c \preceq K(x) + c$.

The theorem is proved.

Example. Let $x = 0^n$ (n zeros). Then $K(x) \preceq K(n) + c \preceq \log n + c$. If $n = 1^m$, then $K(x) \preceq \log\log n + c$. Clearly, K(x) is not monotone in n.

By definition, it is impossible to present a conceivable example of a high-complexity word.

To separate a number n in a chain, we define a special self-delimiting code for an integer n as follows: $\bar n = 0^m 1 n$, where $m = \log n$, with the length $|\bar n| = 2 \log n + 1$; or a more refined code $\bar n = 0^{\log m} 1 m n$ of length $|\bar n| \le \log n + 2 + 2 \log\log n$. Here (and in the following) $\log x$ for $x > 0$ denotes a function equal to the integer nearest from above to the standard logarithmic function $\log x$, and only positive arguments of $\log x$ are considered (if $x \le 0$, then the expressions containing $\log x$ are supposed to equal 0).

Note that the set of codes $\bar n$ is a prefix-free set. More sparing self-delimiting codes can be obtained by further iterations. Denote their length by $\log^* n = \log n + \log\log n + \log\log\log n + \ldots$ (the iterated logarithm).
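
The first self-delimiting code can be sketched in a few lines. This is an illustration under an assumption of ours: m is taken to be the number of binary digits of n (which may differ by one from the rounded logarithm used in the text), and the function names are hypothetical. The decoder reads exactly one code word from the front of a longer bit string, which is what the prefix-free property makes possible.

```python
def encode(n):
    """Self-delimiting code 0^m 1 n for an integer n >= 1,
    with m = number of binary digits of n (an assumption)."""
    b = bin(n)[2:]
    return "0" * len(b) + "1" + b

def decode(stream):
    """Read one code word from the front of a bit string.

    Returns the decoded integer and the unread remainder."""
    m = stream.index("1")              # the run of zeros announces the length m
    body = stream[m + 1 : 2 * m + 1]   # the next m digits are n itself
    return int(body, 2), stream[2 * m + 1 :]

# Two numbers written one after another can be separated again.
chain = encode(5) + encode(12)
first, rest = decode(chain)
second, tail = decode(rest)
print(first, second, repr(tail))
```

The code word has length 2m + 1, in agreement with the length formula above when m is read as the number of digits of n.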

Theorem 4. $K(x, y) \preceq K(x) + K(y) + 2 \log |x| + 1$.

Proof. It suffices to use programs for (x, y) of the form $p = 0^m 1 p_1 p_2$, where $m = \log |p_1|$, $A(p_1) = x$, $B(p_2) = y$, and $0^m$ serves to separate $p_1$ from $p_2$.

3 Incompressibility

Now we consider algorithmically generated infinite sequences of digits xdegdeg that are treated as sequences of words x |x| = n = 1 2

We cite (in a simplified form) two theorems by Martin-L6f [6]

Theorem 5 Any constructive xdegdeg contains infinitely many words x of length n with K(x)ltn mdash logn + c

Theorem 6 For almost all sequences xdegdeg for any e gt 0 for all words x of length n gt no with some computable no we have K(x) gt n mdash (1 + e) logn

Thus the complexity of a typical constructive binary sequence fluctuates between the lower bound n mdash (1 + e)logn and n

The idea to define randomness as algorithmic incompressibility was put forward by Kolmogorov [2] and G. J. Chaitin [8]. There exist no sequences in which all words are $c$-incompressible.

Definition 4 (Kolmogorov). An infinite binary sequence is called K-random if it contains infinitely many words $x$ with $K(x) \ge |x| - c$.

Remark 3. Almost all sequences $x^\infty$ are K-random.

This follows from the fact that there is only a fraction $2^{-c}$ of words $x$ for which $K(x) \le |x| - c$.
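The counting behind this remark can be made explicit. The sketch below assumes that descriptions are binary words and counts how many of them are shorter than n - c bits; the function name is hypothetical, and the additive constants hidden in K are ignored.

```python
from fractions import Fraction


def max_fraction_compressible(n: int, c: int) -> Fraction:
    """Upper bound on the fraction of n-bit words that can have a description
    shorter than n - c bits: there are only 2^(n-c) - 1 such descriptions."""
    short_descriptions = 2 ** (n - c) - 1  # binary words of length 0, 1, ..., n - c - 1
    return Fraction(short_descriptions, 2 ** n)
```

Since each description produces at most one word, the bound is always below $2^{-c}$, whatever the word length n.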

Definition 5. An infinite binary sequence $x^\infty = \{x\}$ is called L-random if for some $c$ we have $K(x) \ge n - c\log n$ for all words $x$, $n = |x|$.

Theorem 6 states that almost all binary sequences are L-random. Stepping aside from the incompressibility idea, Martin-Löf [6] suggested another notion of randomness based on the idea of universal tests. Martin-Löf randomness (ML-randomness) follows from Kolmogorov randomness. If $x^\infty$ is Martin-Löf random, then for any $\varepsilon > 0$ we have $K(x) \ge n - (1 + \varepsilon)\log n$ from some $n$ onwards.

These properties yield three notions of randomness, each implying the next: K → ML → L.

Now let us restrict the classes of algorithms.


4 Reversible Complexity

Let us restrict ourselves to reversible algorithms.

Definition 6. An algorithm $A(p)$ is called reversible (an R-algorithm) if one can find another algorithm $B = A^{-1}$ such that $A(p) = x$ implies $B(x) = p$ and vice versa.

These algorithms establish a one-to-one correspondence between inputs and outputs. We can say that $B(x)$ is an encoding algorithm and $A(p)$ is a decoding algorithm.

Definition 7. The R-complexity of a word $x$ is defined as the process $K_R(x) = \{K_R(x, N)\}$, $N = 1, 2, \dots$, where

$$K_R(x, N) = \min_{A:\ |A| \le c}\ \min_{(p,t) \le N,\ U_t(A, p) = x} |p|,$$

where $A$ are R-algorithms and the minimization process is shortened by discovering the first root of the equation $A(p) = x$.

Since the class of R-algorithms includes the identity algorithm, we have $K_R(x) \le |x| + c$.

Definition 8. A function (an algorithm) $A(x)$ is called unidomain if there are no pairs $x_1 \ne x_2$ such that $A(x_1) = A(x_2)$.

Proposition 1. A function $A(x)$ is unidomain iff it is reversible.

Proof. First let $A$ be unidomain. Using $A$, let us construct an algorithm $B(y)$ as follows:

    for (p, t) = 1, 2, ... do
        if A_t(p) = y then B(y) = p; halt
    endfor

If $A(x) = y$, then this algorithm provides the first root of this equation and halts. If the equation $A(x) = y$ has no root, then $B(y)$ is undefined (the search never halts). Conversely, if $A$ is a reversible algorithm, then there exists an algorithm $B(y)$ such that $A(x) = y$ implies $B(y) = x$, and the argument of $A$ is recovered uniquely.
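The search over pairs (p, t) used in this proof can be sketched in Python. The modelling choices here are assumptions made for illustration only: words are represented by positive integers, an algorithm is a generator that yields once per computation step and returns its output, and `double_slowly` is a hypothetical toy unidomain algorithm.

```python
from itertools import count


def pairs():
    """Enumerate all pairs (p, t) of positive integers, diagonal by diagonal."""
    for s in count(2):
        for p in range(1, s):
            yield p, s - p


def run_steps(algorithm, p, t):
    """A_t(p): the output if algorithm(p) halts within t steps, otherwise None."""
    computation = algorithm(p)
    for _ in range(t):
        try:
            next(computation)
        except StopIteration as stop:
            return stop.value
    return None


def inverse(algorithm, y, max_pairs=None):
    """B(y): the first root p of A(p) = y found in the (p, t) enumeration.
    Without max_pairs the search runs forever when no root exists."""
    for k, (p, t) in enumerate(pairs()):
        if max_pairs is not None and k >= max_pairs:
            return None
        if run_steps(algorithm, p, t) == y:
            return p


def double_slowly(p):
    """Toy unidomain algorithm: works for p steps, then outputs 2 * p."""
    for _ in range(p):
        yield
    return 2 * p
```

The cut-off `max_pairs` exists only to make the sketch testable; the algorithm B of the proof has no such bound and simply does not halt on values outside the range of A.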

Theorem 7. There exists no algorithm $W$ such that for any algorithm $A$ we have $W(A) = 1$ if $A$ is a reversible algorithm and $W(A) = 0$ if not.

Proof. To prove this assertion, it suffices to prove it for some special class of $A$. Let $N$ be a nullifying algorithm such that for any $x$ we have $N(x) = 0$, and let $B$ be an arbitrary algorithm. Choose $A$ so that $A(0) = 0$, $A(1) = N(B(1))$, and $A(n) = n$ for $n > 1$. This algorithm is not unidomain iff $B(1)$ halts. However, the mass problem of algorithm halting is algorithmically unsolvable. This proves the theorem.


Theorem 8. The complexity $K_R(x) \approx K(x)$.

Proof. The relation $K(x) \le K_R(x) + c$ follows from the definitions. Let us prove the converse relation. Let $K(x)$ be given by a sequence of functions

$$K(x, N) = \min_{A:\ |A| \le c}\ \min_{(A,p,t) \le N,\ A_t(p) = x} |p|,$$

where $A$ are arbitrary algorithms. Given $A$, the minimization here is carried out over all roots of the equation $A_t(p) = x$. We replace the evaluation of all roots for a single algorithm $A_t$ by evaluating roots of a number of equations. Let us enumerate the roots of the equation $A(p) = x$ in the process $(p, t) = 1, 2, \dots$ Construct the algorithm $B(\nu, p)$ as follows:

    k = 0
    for (q, r) = 1, 2, ... do
        if A_r(q) = x then
            k = k + 1
            if k = ν and p = q then B = x; halt
    endfor

The function $B(\nu, p) = x$ iff $p$ is the root number $\nu$; otherwise $B(\nu, p)$ is undefined. By construction, for fixed $\nu$ the function $B(\nu, p)$ is unidomain. The theorem statement follows.

Knowing the complexity of a word $x$, we can constructively evaluate its minimal codes. Minimizing descriptions of physical events $x$ can be considered as a process of cognition of $x$ by a search for regularities producing the phenomenon $x$. It is known that all elementary physical processes are time-reversible. Reversible generating algorithms, generally speaking, can be less efficient in producing long words. The equivalence $K(x) \approx K_R(x)$ stated by Theorem 8 can be interpreted as the absence of phenomena that can be produced but not cognized within the framework of the algorithmic theory.

5 Complexity and Information

Kolmogorov discovered [2], [9] that information theory can be developed from the algorithmic definition of complexity.

The conditional complexity of a binary word $x$ with respect to the word $y$ is defined as the minimal length of a program that generates $x$ from $y$:

$$K(x \mid y, A) = \min_{(p,t):\ A_t(p, y) = x} |p|.$$

Theorem 9. There exists an optimal algorithm $V$ such that for any algorithm $A$ we have

$$K(x \mid y) \overset{\mathrm{def}}{=} K(x \mid y, V) \le K(x \mid y, A) + c.$$


Example. We have $K(0^n \mid n) \le c$, where the constant $c$ is the length of the algorithm generating $0^n$ from $n$.

We show the connection between the notion of complexity and optimal coding in the Shannon information theory. Suppose the words $x$ of length $n$ are partitioned from left to right into sequences of $k$ blocks $b_s$ of binary digits of identical length $l$, with $m = 2^l$ distinct blocks in total, $n = kl$. Denote by $f_s$ the empirical frequency of the occurrence of $b_s$ in $x$. The Shannon entropy per block is defined as

$$H(f) = -\sum_s f_s \log f_s.$$

Theorem 10. Let a word $x$ be partitioned into $k$ blocks of length $l$. Then $k^{-1} K(x) \le H(f) + c\,\log k / k$, where $c$ depends on $l$ but not on $x$.

Proof. Use a special code not depending on the source of information (a universal code). To specify $x$, we can fix the numbers $k_s = k f_s$ of occurrences of each block $b_s$ for all blocks $s$ of length $l$, and the number of $x$ among the

$$\frac{k!}{k_1!\, k_2! \cdots k_m!}$$

words with these occurrence numbers, $m = 2^l$, where $k_1 + \dots + k_m = k$. Applying the Stirling formula, we find that the length of this code is no more than $m\log k + kH(f) + c\log k$. The theorem statement follows.
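The two ingredients of this code, the block counts and the index among all arrangements with the same counts, can be checked numerically. The following Python sketch assumes base-2 logarithms and uses hypothetical helper names; it is an illustration of the counting step, not the author's construction.

```python
from collections import Counter
from math import factorial, log2


def block_counts(x: str, l: int) -> Counter:
    """Occurrence numbers k_s of the blocks of length l in the word x."""
    if len(x) % l:
        raise ValueError("the length of x must be a multiple of l")
    return Counter(x[i:i + l] for i in range(0, len(x), l))


def empirical_entropy(counts: Counter) -> float:
    """Shannon entropy per block, H(f) = -sum_s f_s log2 f_s, with f_s = k_s / k."""
    k = sum(counts.values())
    return -sum(c / k * log2(c / k) for c in counts.values())


def log2_arrangements(counts: Counter) -> float:
    """log2 of k! / (k_1! ... k_m!), the bits needed to index x among its rearrangements."""
    arrangements = factorial(sum(counts.values()))
    for c in counts.values():
        arrangements //= factorial(c)
    return log2(arrangements)
```

For any word, `log2_arrangements(counts)` does not exceed `k * empirical_entropy(counts)`, so the index costs at most kH(f) bits, and the counts themselves add only about m log k bits.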

Thus $K(x)$ can be considered as the entropy and $K(y \mid x)$ as the conditional entropy. The information in $x$ about $y$ is $I(x : y) = K(y) - K(y \mid x)$.

Remark 4. For arbitrary words $x$ and $y$,

$$K(y \mid x) \le K(y) + c \quad\text{and}\quad K(x, y) = K(x) + K(y \mid x) + c\log|x|.$$

Indeed, consider a special code for $(x, y)$ of the form $p_1 p_2$, where $p_1$ is a self-delimiting code for $x$ and $p_2$ is a code for $y$. We have

$$K(x, y) \le \min_{A, B:\ |A| \le c,\ |B| \le c}\ \min_{(p_1, p_2, t):\ A_t(p_1) = x,\ B_t(p_2) = y} \left(|p_1| + |p_2|\right).$$

This is the required statement. Note that the measure of the information $I(x : y)$ is non-negative only asymptotically for long $x$ and $y$. The logarithmic correction term can be ascribed to the individual description of $x$, in contrast to the traditional description in terms of distributions.


6 Frequency Rates

The stability of frequency rates, which is assumed a priori in the conventional concept of probability, can be deduced in the algorithmic theory.

Denote the empirical rate of occurrences of 1 in $x$ by $f(x, 1)$. The stability of frequency rates can be stated as follows.

Theorem 11. Given an L-random $x^\infty$, there is $c > 0$ such that for each word $x$ in it

$$\left|f(x, 1) - \tfrac12\right|^2 \le c\,\frac{\log n}{n},$$

where $c$ does not depend on $n$.

Proof. Use a special code $p$ for $x$ as follows. Let $k = n f(x, 1)$ and $p = (k, j)$, where $j = 1, \dots, C$ enumerates all words $x$ of length $n$ with $k$ units. Use the prefix codes for $(k, j)$ of the form $\bar k j$ with $|\bar k| = \log^* k \le 2\log n$. Thus

$$K(x) \le |(k, j)| \le 2\log n + \log C.$$

Using the Stirling formula, we find that $\log C \le nH(k/n) + c\log n$, where the entropy $H(f) = -f\log f - (1 - f)\log(1 - f)$, $f = k/n$. It satisfies the inequality $H(f) \le 1 - 2\left(f - \tfrac12\right)^2$. Combining these formulas, we obtain the desired result.
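The two inequalities used in this proof can be checked numerically. The sketch below assumes base-2 logarithms and takes C to be the binomial coefficient counting the words of length n with k units; the function names are hypothetical.

```python
from math import comb, log2


def binary_entropy(f: float) -> float:
    """H(f) = -f log2 f - (1 - f) log2 (1 - f), with H(0) = H(1) = 0."""
    if f <= 0.0 or f >= 1.0:
        return 0.0
    return -f * log2(f) - (1 - f) * log2(1 - f)


def bounds_hold(n: int) -> bool:
    """Check log2 C(n, k) <= n H(k/n) and H(f) <= 1 - 2 (f - 1/2)^2 for k = 0, ..., n."""
    for k in range(n + 1):
        f = k / n
        h = binary_entropy(f)
        if log2(comb(n, k)) > n * h + 1e-9:
            return False
        if h > 1 - 2 * (f - 0.5) ** 2 + 1e-12:
            return False
    return True
```

Chaining the two bounds gives $K(x) \le n - 2n\left(f - \tfrac12\right)^2 + c\log n$, which is compatible with the L-randomness bound $K(x) \ge n - c\log n$ only if the squared deviation of $f$ from 1/2 is of order $\log n / n$.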

Remark 5. If $\left|f(x, 1) - \tfrac12\right|^2 > c/n$, then $K(x) \le n - \tfrac12\log n + c$. This inequality shows the effect of a regularity when the number of units is too close to $n/2$.

The refinement is natural. We consider a partition of $x^\infty = \{x\}$ into blocks of digits $b$ of identical length $|b| = l$. Denote by $f(x, b)$ the fraction of blocks $b_i = b$ in the partition of a word $x$ of length $n = kl$. Denote $\pi = 2^{-l}$.

Theorem 12. Given an L-random sequence $x^\infty = \{x\}$ and a block of digits $b$ of length $l$, for all words $x$ of length $n$ we have

$$\left|f(x, b) - 2^{-l}\right|^2 \le c(b)\,\frac{\log n}{n}.$$

A number of other specifically probabilistic laws, deduced previously by intuitive reasoning, can be proved similarly.

7 Prefix Complexity

In 1974-1975 another approach to complexity was developed, starting from the concept of prefix complexity (by L. A. Levin, P. Gacs, and G. J. Chaitin [10-12]).


Definition 9. A set of words is called prefix-free if there are no pairs of different words such that one is the beginning of the other.

Lemma 1. (1) If $\{p_i\}$ is a prefix-free set, $n_i = |p_i|$, $i = 1, 2, \dots$, then the Kraft inequality

$$\sum_{i = 1, 2, \dots} 2^{-n_i} \le 1$$

holds; (2) if the numbers $n_1, n_2, \dots$ satisfy the Kraft inequality, then one can find binary words $p_1, p_2, \dots$ of lengths $n_1, n_2, \dots$ such that the set $\{p_i\}$ is prefix-free.

These words can be constructed by the well-known Fano-Shannon procedure.
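Both parts of the lemma can be illustrated for finite lists of lengths. The Python sketch below uses a standard cumulative-sum construction (shortest words first), which is in the spirit of the Fano-Shannon procedure but is not claimed to be that procedure exactly; all function names are hypothetical, and the lengths are assumed to be positive integers.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality, computed exactly."""
    return sum(Fraction(1, 2 ** n) for n in lengths)


def prefix_code(lengths):
    """Return binary words with the given lengths that form a prefix-free set."""
    if kraft_sum(lengths) > 1:
        raise ValueError("the lengths violate the Kraft inequality")
    words = [None] * len(lengths)
    position = Fraction(0)  # left end of the next free dyadic interval in [0, 1)
    for index in sorted(range(len(lengths)), key=lambda i: lengths[i]):
        n = lengths[index]
        # position is a multiple of 2^-n because all earlier lengths are <= n
        words[index] = format(int(position * 2 ** n), "b").zfill(n)
        position += Fraction(1, 2 ** n)
    return words


def is_prefix_free(words):
    """True if no word is the beginning of a different word in the list."""
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )
```

Each word is the binary expansion of the left end of a dyadic interval of width $2^{-n_i}$; the intervals are disjoint exactly because the Kraft sum does not exceed 1.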

Definition 10. An algorithm is called a prefix algorithm if its domain is a prefix-free set. The prefix complexity of a word $x$ with respect to a prefix algorithm $A$ is defined as the process $K_P(x, A) = \{K_P(x, A, N)\}$, $N = 1, 2, \dots$, where

$$K_P(x, A, N) = \min_{(p,t) \le N,\ A_t(p) = x} |p|.$$

The set of prefix algorithms is an enumerable set.

Theorem 13. There exists a universal prefix algorithm $V$ such that for any prefix algorithm $A$ we have

$$K_P(x) \overset{\mathrm{def}}{=} K_P(x, V) \le K_P(x, A) + c_A.$$

To deal with prefix algorithms, we notice that we can recover the word $x = 0^n$ ($n$ zeros) from $n$, but we cannot encode numbers $n$ as simple integers since they are not prefix-free. Using self-delimiting codes, we obtain prefix-free codes of length $n + \log n$.

Remark 6. $K(x) \le K_P(x) \le K(x) + \log^*|x|$.

Remark 7. $K_P(x, y) \le K_P(x) + K_P(y) + c$. In contrast to $K(x)$, here we do not need an end marker for the word $x$, since $x$ is recognized as a prefix.

Theorem 14 [12]. For any fixed length $n$ of words $x$ we have $\max_x K_P(x) \ge n + \log n - c$.

Theorem 15 [13]. An infinite sequence $x^\infty$ is Martin-Löf random iff $K_P(x) \ge |x| - c$ for all words $x$.


For most $x^\infty$ we have $K_P(x) \ge |x| - c$ for all $x$. Thus the prefix complexity of almost all sequences fluctuates within the bounds $|x|$ and $|x| + \log|x|$ (with accuracy up to $c$).

8 Universal Probability

The idea of a universal a priori probability was put forward by Solomonoff in [4]. For a binary word $x$, he introduced the probability $P(x) = 2^{-|p(x)|}$, where $p(x)$ is a minimal description of $x$. However,

$$\sum_x 2^{-|p(x)|} = \infty.$$

To obtain normalizable algorithmic probabilities, the Kraft inequality for a prefix-free set was proposed, and this led to the development of the theory of prefix complexity [10-12]. Let us reformulate its basic results in a successively constructive form.

Definition 11. The algorithmic probability of $x$ is defined by the process

$$P(x) = \{2^{-K_P(x, N)}\}, \quad N = 1, 2, \dots$$

Example. If $x = 0^n$, then $K_P(x) \le \log n + 2\log\log n + c$. Hence $P(x) \ge c/(n\log^2 n)$.

Definition 12. The universal a priori probability is defined by $Q(x) = \{Q(x, U, N)\}$, $N = (p, t) = 1, 2, \dots$, where $U$ is the universal prefix algorithm and

$$Q(x, U, N) = Q(x, U, N - 1) + \mathrm{ind}(U_t(p) = x)\, 2^{-|p|},$$

where the indicator function equals 1 iff $U_t(p)$ halts exactly at the step number $t$, otherwise 0.
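The recursion of Definition 12 can be imitated with a toy machine. In the Python sketch below, `toy_machine` is a hypothetical stand-in for the universal prefix algorithm U (it is not universal): its domain is the prefix-free set of words $1^k 0$, and on $1^k 0$ it halts at step $k + 1$ with output $0^{k \bmod 3}$. The numbering of words and pairs is likewise an assumption made for illustration.

```python
from fractions import Fraction
from itertools import count, islice


def pairs():
    """Enumerate pairs (i, t) of positive integers, diagonal by diagonal."""
    for s in count(2):
        for i in range(1, s):
            yield i, s - i


def word(i: int) -> str:
    """The i-th binary word in length-lexicographic order: '', '0', '1', '00', ..."""
    return bin(i)[3:]


def toy_machine(p: str):
    """Hypothetical prefix machine: returns (halting step, output) or None outside its domain."""
    if not p or p[-1] != "0" or "0" in p[:-1]:
        return None
    k = len(p) - 1
    return k + 1, "0" * (k % 3)


def q_stages(x: str, stages: int):
    """Successive values Q(x, N), N = 1, ..., stages, for the toy machine."""
    q, history = Fraction(0), []
    for i, t in islice(pairs(), stages):
        p = word(i)
        if toy_machine(p) == (t, x):  # the machine halts on p exactly at step t with output x
            q += Fraction(1, 2 ** len(p))
        history.append(q)
    return history
```

For this toy machine the limit of Q for the empty word is 4/7, and every stage gives a lower approximation; for a genuinely universal machine one cannot say how close a given stage is to the limit, which is the point of the remark that follows.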

Since the mass problem of the universal machine halting is algorithmically unsolvable, the sequence $Q(x)$ has no ceiling.

The following Coding Theorem shows that these two formulations define processes differing by no more than a constant.

Theorem 16. For each $x$ we have $K_P(x) \approx -\log Q(x)$.

In [14] a non-constructive infinite binary fraction was considered:

$$\Omega = \sum_x Q(x) \le 1.$$


The real number $\Omega$ was called the universal algorithm halting probability. It can be interpreted as a process $\{\Omega(N)\}$, $N = 1, 2, \dots$, with

$$\Omega(N) = \sum_{(x, p, t) \le N} \mathrm{ind}(U_t(p) = x)\, 2^{-|p|},$$

where the indicator function equals 1 iff $U_t(p)$ halts exactly at the moment $t$ yielding $x$, otherwise 0.

The monotone increasing sequence $\Omega(N)$ is bounded from above and has no ceiling. Knowing the first digits of $\Omega(N)$, $N = 1, 2, \dots$, we can accumulate in $\Omega$ solutions of all constructive problems of bounded complexity. C. Bennett and M. Gardner would call $\Omega$ the number of Wisdom [15].

9 Sequentially Coding Algorithms

We suggest the following extension of the complexity theory, obtained by restricting it to algorithms that code sequentially from left to right.

A set $P$ of code words is called complete-code if any half-infinite sequence can be represented as a concatenation of codes from $P$.

Definition 13. A one-to-one constructive function $T: X \leftrightarrow Y$ is called a coding table if it is defined on complete-code prefix-free sets $X$ and $Y$.

Definition 14. An algorithm $A$ evaluating a coding table $T: X \leftrightarrow Y$ is called a sequential coder, or an S-algorithm, if

(1) for any concatenation $x = x_1 x_2 \dots x_k$ of words $x_i$ from $X$ we have $A(x) = A(x_1) A(x_2) \dots A(x_k)$;

(2) for any concatenation $y = A(x_1) A(x_2) \dots A(x_k)$ we also have $A(x_1 x_2 \dots x_k) = y$.
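A sequential coder for a finite coding table can be sketched as follows. The table used in the usage note is a hypothetical example in which both the set of keys and the set of values are prefix-free with Kraft sum 1; the function name `sequential_coder` is also an assumption.

```python
def sequential_coder(table: dict):
    """Build a left-to-right coder A from a coding table T: X -> Y (X prefix-free)."""

    def code(word: str) -> str:
        output, buffer = [], ""
        for digit in word:
            buffer += digit
            if buffer in table:  # a complete code word of X has been read
                output.append(table[buffer])
                buffer = ""
        if buffer:
            raise ValueError("input is not a concatenation of code words")
        return "".join(output)

    return code
```

With `table = {"0": "11", "10": "0", "11": "10"}`, the coder maps `"01011"` to `"11010"`, and the coder built from the inverted table maps it back. Because X is prefix-free, the first match found while reading from the left is the only possible parse, so property (1) of the definition holds by construction.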

The set of S-algorithms is recursively enumerable.

Definition 15. The S-complexity of a word $x$ with respect to an S-algorithm $A$ is a process $K_S(x, A) = \{K_S(x, A, N)\}$, $N = 1, 2, \dots$, where

$$K_S(x, A, N) \overset{\mathrm{def}}{=} \min_{(p,t) \le N,\ A_t(p) = x} |p|.$$

Theorem 17. There exists a (universal) S-algorithm $V$ such that for any S-algorithm $A$ we have

$$K_S(x) = K_S(x, V) \le K_S(x, A) + c_A,$$

where $c_A$ does not depend on $x$.


Since the class of S-algorithms contains the identity algorithm (with $A(0) = 0$, $A(1) = 1$), we have $K_S(x) \le |x| + c$. If $f(x)$ is a partially computable function evaluated by some S-algorithm, then $K_S(f(x)) \le K_S(x) + c$.

Obviously, $K(x) \le K_S(x) \le K_P(x)$. But we only have $K_S(x, y) \le K_P(x) + K_S(y)$, since the sequentially coding algorithm can separate the leftmost prefix from the remaining ones.

For words $x = 0^n$ we have $K_S(x) \le \log n$. For almost all sequences $x^\infty$, for all sufficiently long words $x$ in it, for any $c > 1$ we have $K_S(x) \ge K(x) \ge |x| - c\log|x|$.

Definition 16. A binary sequence is called S-random if for all words $x$, $K_S(x) \ge |x| - c\log|x|$, where $c$ does not depend on $x$.

Definition 17. A binary sequence $x^\infty = \{x\}$ is algorithmically stationary if for any block $b$ of digits in it there exists the limit $\lim_{|x| \to \infty} f(b, x)$.

Any L-random sequence is algorithmically stationary.

Lemma 2. If a binary sequence $y^\infty = \{y\}$ is produced from an algorithmically stationary sequence $x^\infty = \{x\}$ by an S-algorithm $A$ so that $y = A(x)$, then the sequence $y^\infty$ is also algorithmically stationary.

Proof. Suppose $y^\infty$ is produced from $x^\infty$ by $y = A(x)$, where $A$ is an S-algorithm. The algorithm $A$ defines a prefix-free domain $X$ and a code-complete range of values $Y$. Choose a block of digits $b$. Using the completeness of $Y$, we have $b = y_1 y_2 \dots y_k$, where $y_i \in Y$, $i = 1, 2, \dots, k$. By the sequential property, we can find a program $a = x_1 x_2 \dots x_k$ with all $x_i \in X$ such that $A(a) = b$. The frequencies $f(a, x) = f(b, y)$. This proves the lemma.

Lemma 3. $K_S(K_S(x)) \le K_S(x) + c$.

Proof. Note that S-algorithms are such that the composition $AB$ of two S-algorithms $A$ and $B$ is again an S-algorithm. For a fixed $N$ we find

$$K_S(x, N) = \min_{A:\ |A| \le c}\ \min_{(p,t) \le N,\ A_t(p) = x} |p|,$$

and for the minimizing value $p = p_0$,

$$K_S(p_0, M) = \min_{B:\ |B| \le c}\ \min_{(y,t) \le M,\ B_t(y) = p_0} |y|.$$

Let $y = y_0$ be the minimizing value of a code for $p_0$. Since for some $t$, $A_t B_t(y) = x$ (if both algorithms halt), it is clear that $K_S(x) \le |y| + c$. We obtain $K(x) \le K_S(p_0) \approx K_S(K_S(x))$.

Theorem 18. An infinite binary sequence $x^\infty$ is algorithmically stationary iff it is an S-algorithm transformation of some S-random sequence.


Proof. First assume that $y = A(x)$ for all $x \in x^\infty$ and $K_S(x) \ge |x| - c\log|x|$. We have $K(x) \ge K_S(x) - \log|x|$. So $K(x) \ge |x| - c'\log|x|$, $c' \ge c + 1$. By Theorem 12, the sequence $x^\infty$ is stationary, and by Lemma 2 so is $y^\infty$.

To prove the converse, assume that $x^\infty = \{x\}$ is stationary. We find $\min K_S(x, N)$ for $(p, t) \le N$; let $p$ be a minimum code for $x$, $A_t(p) = x$ for some $t$, if $A_t(p)$ halts. Here $A: P \to X$ has the domain $P$ and the range $X$, both prefix-free and code-complete. Since $X$ is code-complete, we can express $x$ as $x_1 x_2 \dots x_k$ with $x_i \in X$ and $A(p_i) = x_i$ with $p_i \in P$, $i = 1, \dots, k$. By Lemma 3, we have $K_S(p) \ge |p| - c$. It follows that $p = p_1 p_2 \dots p_k$ is log-incompressible. The proof is complete.

The comparison of different notions of complexity and randomness shows that they differ by no more than a logarithmic term. Taking the stationarity theorems into account, it seems plausible to suggest a common definition of randomness of infinite sequences $x^\infty = \{x\}$ as incompressibility up to the term $c\log|x|$, where $c$ does not depend on $x$.

In conclusion, it is my pleasure to express sincere gratitude to Prof. V. M. Maximov for encouraging discussions.

References

1. A. N. Kolmogorov, Grundlagen der Wahrscheinlichkeitsrechnung (Springer Verlag, 1933; in English: Chelsea, New York, 1956).
2. A. N. Kolmogorov, Problems of Information Transfer 1, 1, 1-7 (1965).
3. L. Longren, Computer and Information Sciences 2, 165-175 (1967).
4. R. J. Solomonoff, Progress of Symposia in Applied Math., AMS 43 (1962); IEEE Trans. on Inform. Theory 4, 5, 662-664 (1968).
5. Li Ming, P. Vitanyi, An Introduction to Kolmogorov Complexity (Springer, Berlin-Heidelberg-New York, 1993).
6. P. Martin-Löf, Information and Control 9, 602-619 (1966); Zeits. Warsch. Verw. Geb. 19, 225-230 (1971).
7. A. N. Shiryaev, The Annals of Probability 17, 3, 866-944 (1989).
8. G. J. Chaitin, J. ACM 16, 145-159 (1969).
9. A. N. Kolmogorov, Russian Math. Survey 38, 4, 27-36 (1983).
10. L. A. Levin, Problems of Information Transmission 10, 3, 206-210 (1974).
11. P. Gacs, Soviet Math. Doklady 15, 1477-1480 (1974).
12. G. J. Chaitin, J. ACM 22, 329-340 (1975).
13. V. V. Vjugin, Semiotika i Informatika (in Russian) 16, 14-43 (1981); V. A. Uspenskii, SIAM J. Theory Probab. Appl. 32, 387-412 (1987).
14. R. J. Solomonoff, Information and Control 7, 1-22 (1964).
15. C. H. Bennett, M. Gardner, Sci. American 241, 11, 20-34 (1979).


STRUCTURE OF PROBABILISTIC INFORMATION AND QUANTUM LAWS

JOHANN SUMMHAMMER
Atominstitut der Österreichischen Universitäten
Stadionallee 2, A-1020 Vienna, Austria
E-mail: summhammer@ati.ac.at

The acquisition and representation of basic experimental information under the probabilistic paradigm is analysed. The multinomial probability distribution is identified as governing all scientific data collection, at least in principle. For this distribution there exist unique random variables whose standard deviation becomes asymptotically invariant of physical conditions. Representing all information by means of such random variables gives the quantum mechanical probability amplitude and a real alternative. For predictions, the linear evolution law (Schrödinger or Dirac equation) turns out to be the only way to extend the invariance property of the standard deviation to the predicted quantities. This indicates that quantum theory originates in the structure of gaining pure probabilistic information, without any mechanical underpinning.

1 Introduction

The probabilistic paradigm proposed by Born is well accepted for comparing experimental results to quantum theoretical predictions. It states that only the probabilities of the outcomes of an observation are determined by the experimental conditions. In this paper we wish to place this paradigm first. We shall investigate its consequences without assuming quantum theory or any other physical theory. We look at this paradigm as defining the method of the investigation of nature. This consists in the collection of information in probabilistic experiments performed under well controlled conditions, and in the efficient representation of this information. Realising that the empirical information is necessarily finite permits putting limits on what can at best be extracted from this information, and therefore also on what can at best be said about the outcomes of future experiments. At first, this has nothing to do with laws of nature. But it tells us what optimal laws look like under probability. Interestingly, the quantum mechanical probability calculus is found as almost the best possibility. It meets with difficulties only when it must make predictions from a low amount of input information. We find that the quantum mechanical way of prediction does nothing but take the initial uncertainty volume of the representation space of the finite input information and move this volume about, without compressing or expanding it. However, we emphasize that any mechanistic imagery of particles, waves, fields, even space, must be seen as what they are: the human brain's way of portraying sensory impressions, mere images in our minds. Taking them as corresponding to anything in nature, while going a long way in the design of experiments, can become very counterproductive to science's task of finding laws. Here, the correct path seems to be the search for invariant structures in the empirical information, without any models. Once embarked on this road, the old question of how nature really is no longer seeks an answer in the muscular domain of mass, force, torque, and the like, which classical physics took as such unshakeable primary notions (not surprisingly, considering our ape origin, I cannot help commenting). Rather, one asks: Which of the structures principally detectable in probabilistic information are actually realized?

In the following sections we shall analyse the process of scientific investigation of nature under the probabilistic paradigm. We shall first look at how we gain information, then how we should best capture this information into numbers, and finally what the ideal laws for making predictions should look like. The last step will bring the quantum mechanical time evolution, but will also indicate a problem due to finite information.

2 Gaining experimental information

Under the probabilistic paradigm, basic physical observation is not very different from tossing a coin or blindly picking balls from an urn. One sets up specific conditions and checks what happens. And then one repeats this many times to gather statistically significant amounts of information. The difference to classical probabilistic experiments is that in quantum experiments one must carefully monitor the conditions and ensure they are the same for each trial. Any noticeable change constitutes a different experimental situation and must be avoided.^a

^a Strictly speaking, identical trials are impossible. A deeper analysis of why one can neglect remote conditions might lead to an understanding of the notion of spatial distance, about which relativity says nothing, and which is badly missing in today's physics.

Formally, one has a probabilistic experiment in which a single trial can give $K$ different outcomes, one of which happens. The probabilities of these outcomes, $p_1, \dots, p_K$ ($\sum_j p_j = 1$), are determined by the conditions. But they are unknown. In order to find their values, and thereby the values of physical quantities functionally related to them, one does $N$ trials. Let us assume the outcomes $j = 1, \dots, K$ happen $L_1, \dots, L_K$ times, respectively ($\sum_j L_j = N$). The $L_j$ are random variables, subject to the multinomial probability distribution. Listing $L_1, \dots, L_K$ represents the complete information gained in the $N$ trials. The customary way of representing the information is, however, by other random variables, the so-called relative frequencies $\nu_j = L_j/N$. Clearly, they also obey the multinomial probability distribution.

Examples

A trial in a spin-1/2 Stern-Gerlach experiment has two possible outcomes. This experiment is therefore governed by the binomial probability distribution. A trial in a GHZ experiment has eight possible outcomes, because each of the three particles can end up in one of two detectors [2]. Here, the relative frequencies follow the multinomial distribution of order eight. Measuring an intensity in a detector which can only fire or not fire is in fact an experiment where one repeatedly checks whether a firing occurs in a sufficiently small time interval. Thus one has a binomial experiment. If the rate of firing is small, the binomial distribution can be approximated by the Poisson distribution.

We must emphasize that the multinomial probability distribution is of utmost importance to physics under the probabilistic paradigm. This can be seen as follows. The conditions of a probabilistic experiment must be verified by auxiliary measurements. These are usually coarse classical measurements, but should actually also be probabilistic experiments of the most exacting standards. The probabilistic experiment of interest must therefore be done by ensuring that, for each of its trials, the probabilities of the outcomes of the auxiliary probabilistic experiments are the same. Consequently, empirical science is characterized by a succession of data-takings of multinomial probability distributions of various orders. The laws of physics are contained in the relations between the random variables from these different experiments. Since the statistical verification of these laws is again ruled by the properties of the multinomial probability distribution, we should expect that the inner structure of the multinomial probability distribution will appear in one form or another in the fundamental laws of physics. In fact, we might be led to the bold conjecture that, under the probabilistic paradigm, basic physical law is no more than the structures implicit in the multinomial probability distribution. There is no escape from this distribution. Whichever way we turn, we stumble across it as the unavoidable tool for connecting empirical data to physical ideas.

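The multinomial probability distribution of order $K$ is obtained when calculating the probability that in $N$ trials the outcomes $1, \dots, K$ occur $L_1, \dots, L_K$ times, respectively:

$$\mathrm{Prob}(L_1, \dots, L_K \mid N, p_1, \dots, p_K) = \frac{N!}{L_1! \cdots L_K!}\, p_1^{L_1} \cdots p_K^{L_K}. \qquad (2.1)$$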

The expectation values of the relative frequencies are

$$\bar\nu_j = p_j, \qquad (2.2)$$

and their standard deviations are

$$\Delta\nu_j = \sqrt{\frac{p_j (1 - p_j)}{N}}. \qquad (2.3)$$
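The mean and standard deviation of a relative frequency can be checked by summing over the multinomial distribution directly. The Python sketch below is a brute-force illustration for small N; the function names and the example probabilities in the usage note are assumptions.

```python
from itertools import product
from math import factorial, sqrt


def multinomial_prob(counts, probs) -> float:
    """Probability of observing the outcome counts L_1, ..., L_K in N = sum(counts) trials."""
    coefficient = factorial(sum(counts))
    for L in counts:
        coefficient //= factorial(L)
    value = float(coefficient)
    for L, p in zip(counts, probs):
        value *= p ** L
    return value


def frequency_moments(N: int, probs, j: int):
    """Exact mean and standard deviation of the relative frequency L_j / N."""
    mean = second = 0.0
    for counts in product(range(N + 1), repeat=len(probs)):
        if sum(counts) != N:
            continue
        weight = multinomial_prob(counts, probs)
        nu = counts[j] / N
        mean += weight * nu
        second += weight * nu * nu
    return mean, sqrt(second - mean * mean)
```

For example, with `probs = (0.2, 0.3, 0.5)` and `N = 12`, the first outcome has mean frequency 0.2 and standard deviation `sqrt(0.2 * 0.8 / 12)`, up to rounding.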

3 Efficient representation of probabilistic information

The reason why probabilistic information is most often represented by the relative frequencies $\nu_j$ seems to be history. Probability theory originated as a method of estimating fractions of countable sets when inspecting all elements was not possible (good versus bad apples in a large plantation, desirable versus undesirable outcomes in games of chance, etc.). The relative frequencies and their limits were the obvious entities to work with. But the information can be represented equally well by other random variables $\chi_j$, as long as these are one-to-one mappings $\chi_j(\nu_j)$, so that no information is lost. The question is whether there exists a most efficient representation.

To answer this, let us see what we know about the limits $p_1, \dots, p_K$ before the experiment, but having decided to do $N$ trials. Our analysis is equivalent for all $K$ outcomes, so that we can pick out one and drop the subscript. We can use Chebyshev's inequality [4] to estimate the width of the interval to which the probability $p$ of the chosen outcome is pinned down.^b

^b Chebyshev's inequality states: For any random variable whose standard deviation exists, the probability that the value of the random variable deviates by more than $k$ standard deviations from its expectation value is less than or equal to $k^{-2}$. Here, $k$ is a free confidence parameter greater than 1.

If $N$ is not too small, we get

$$w_p = 2k\sqrt{\frac{\nu(1 - \nu)}{N}}, \qquad (3.1)$$

where $k$ is a free confidence parameter (Eq. (3.1) is not valid at $\nu = 0$ or 1). Before the experiment we do not know $\nu$, so we can only give the upper limit

$$w_p \le \frac{k}{\sqrt{N}}. \qquad (3.2)$$

But we can be much more specific about the limit $\bar\chi$ of the random variable $\chi(\nu)$, for which we require that, at least for large $N$, the standard deviation $\Delta\chi$ shall be independent of $p$ (or of $\bar\chi$ for that matter, since there will exist a function $p(\bar\chi)$),

$$\Delta\chi = \frac{C}{\sqrt{N}}, \qquad (3.3)$$

where $C$ is an arbitrary real constant. For the derivation of the function $\chi(\nu)$ it is easiest to make use of the illustration in Fig. 1. Although it already shows the solution, the argument is general enough so that the particular form of the discussed function does not matter. First we note that $\chi(\nu)$ shall be smooth, differentiable, and strictly monotonic. For sufficiently large $N$ the probability distribution of $\nu$ can be approximated by a normal distribution centered at $\bar\nu$ and with standard deviation $\Delta\nu$. In other words, it will approach the gaussian form

$$\mathrm{Prob}(\nu \mid N, p) \approx r \exp\left[-\frac{(\nu - \bar\nu)^2}{2(\Delta\nu)^2}\right], \qquad (3.4)$$

where $r$ is the normalization factor. But clearly, the corresponding probability distribution of $\chi$ will also tend to the gaussian form, of standard deviation $\Delta\chi$. (For instance, take the probability distributions of $\nu$ and $\chi$ for $p = .5$. These are the ones in the middle, as shown in Fig. 1.) And if $N$ is large, both $\Delta\nu$ and $\Delta\chi$ will be small, so that in the range of $\chi$ and $\nu$ where the probability is significantly different from zero, the curve $\chi(\nu)$ can be approximated by its tangent,

$$\chi \approx \chi(\bar\nu) + \left(\frac{d\chi}{d\nu}\right)_{\nu = \bar\nu} (\nu - \bar\nu). \qquad (3.5)$$

Then it follows that the characteristic width of the probability distribution of $\chi$, which is $\Delta\chi$, will be proportional to the characteristic width of the probability distribution of $\nu$, which is $\Delta\nu$. The proportionality constant will be $d\chi/d\nu$, because this is by how much the distribution for $\nu$ gets squeezed or stretched to become the one for $\chi$. So we have, for large $N$,

$$\frac{\Delta\chi}{\Delta\nu} = \frac{d\chi}{d\nu}. \qquad (3.6)$$

Use of (2.3) and (3.3) in (3.6) and integration yields

$$\chi = C \arcsin(2\nu - 1) + \theta, \qquad (3.7)$$

where $\theta$ is an arbitrary real constant. For comparison with $\nu$ we confine $\chi$ to $[0, 1]$ and thus set $C = \pi^{-1}$ and $\theta = .5$, as was already done in Fig. 1. Then we have $\Delta\chi = 1/(\pi\sqrt{N})$, and upon application of Chebyshev's inequality we get the interval $w_\chi$ to which we can pin down the unknown limit $\bar\chi$ as

$$w_\chi = \frac{2k}{\pi\sqrt{N}}. \qquad (3.8)$$

Clearly, this is narrower than the upper limit for $w_p$ in Eq. (3.2). Having done no experiment at all, we have better knowledge of the value of $\bar\chi$ than of the value of $p$, although both can only be in the interval $[0, 1]$. And note that the actual experimental data will add nothing to the accuracy with which we know $\bar\chi$, but they may add to the accuracy with which we know $p$. Nevertheless, even with data, $w_p$ may still be larger than $w_\chi$, especially when $p$ is around 0.5.
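The claim that the standard deviation of $\chi$ hardly depends on $p$, while that of $\nu$ does, can be checked by an exact summation over the binomial distribution. The Python sketch below assumes the normalized form of $\chi$ with $C = 1/\pi$ and $\theta = 1/2$; the function names are hypothetical.

```python
from math import asin, comb, pi, sqrt


def chi(nu: float) -> float:
    """Efficient random variable with C = 1/pi and theta = 1/2."""
    return asin(2 * nu - 1) / pi + 0.5


def exact_std(N: int, p: float, transform) -> float:
    """Exact standard deviation of transform(L / N) for L ~ Binomial(N, p)."""
    mean = second = 0.0
    for L in range(N + 1):
        weight = comb(N, L) * p ** L * (1 - p) ** (N - L)
        value = transform(L / N)
        mean += weight * value
        second += weight * value * value
    return sqrt(second - mean * mean)


def spreads(N: int, probabilities):
    """Standard deviations of nu and of chi for each given probability p."""
    nu_stds = [exact_std(N, p, lambda nu: nu) for p in probabilities]
    chi_stds = [exact_std(N, p, chi) for p in probabilities]
    return nu_stds, chi_stds
```

Calling `spreads(100, [0.07, 0.25, 0.50, 0.75, 0.93])` uses the five probabilities of Fig. 1: the standard deviations of $\nu$ vary by almost a factor of two, while those of $\chi$ all stay close to $1/(\pi\sqrt{N})$, with small deviations at finite $N$ near the ends of the interval.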

For the representation of information, the random variable $\chi$ is the proper choice, because it disentangles the two aspects of empirical information: the number of trials $N$, which is determined by the experimenter, not by nature, and the actual data, which are only determined by nature. The experimenter controls the accuracy $w_\chi$ by deciding $N$; nature supplies the data $\chi$, and thereby the whereabouts of $\bar\chi$. In the real domain, the only other random variables with this property are the linear transformations afforded by $C$ and $\theta$. From the physical point of view, $\chi$ is of interest because its standard deviation is an invariant of the physical conditions as contained in $p$ or $\bar\chi$. The random variable $\chi$ expresses empirical information with a certain efficiency, eliminating a numerical distortion that is due to the structure of the multinomial distribution, and which is apparent in all other random variables. We shall call $\chi$ an efficient random variable (ER). More generally, we shall call any random variable an ER whose standard deviation is asymptotically invariant of the limit the random variable tends to, Eq. (3.3).

Another graphical depiction of the relation between $\nu$ and $\chi$ can be given by drawing a semicircle of diameter 1, along which we plot $\nu$ (Fig. 2a). By orthogonal projection onto the semicircle we get the random variable $\zeta = [\pi + 2\arcsin(2\nu - 1)]/4$, and thereby $\chi$ when we choose different constants. The drawing also suggests a simple way to obtain a complex ER. We scale the semicircle by an arbitrary real factor $a$, tilt it by an arbitrary angle $\varphi$, and place it into the complex plane as shown in Fig. 2b. This gives the random variable

$$\beta = a\left(\sqrt{\nu(1 - \nu)} + i\nu\right) e^{i\varphi} + b, \qquad (3.9)$$

where $b$ is an arbitrary complex constant. We get a very familiar special case by setting $a = 1$ and $b = 0$:

$$\psi = \left(\sqrt{\nu(1 - \nu)} + i\nu\right) e^{i\varphi}. \qquad (3.10)$$


Figure 1. Functional relation between the random variables $\nu$ and $\chi$, and their respective probability distributions as expected for $N = 100$ trials, plotted for five different values of $p$: .07, .25, .50, .75, and .93. The bar above each probability distribution indicates twice its standard deviation. Notice that the standard deviations of $\nu$ differ considerably for different $p$, while those of $\chi$ are all the same, as required in Eq. (3.3).


Figure 2. (a) Graphical construction of the efficient random variable $\zeta$ (and thereby of $\chi$) from the observed relative frequency $\nu$. $\zeta$ is measured along the arc. (b) Similar construction of the efficient random variable $\beta$. It is given by its coordinates in the complex plane. The quantum mechanical probability amplitude $\psi$ is the normalized case of $\beta$, obtained by setting $a = 1$ and $b = 0$.


For large $N$ the probability distribution of $\nu$ becomes gaussian, but also that of any smooth function of $\nu$, as we have already seen in Fig. 1. Therefore the standard deviation of $\psi$ is obtained as

$$\Delta\psi = \left|\frac{\partial\psi}{\partial\nu}\right|\Delta\nu = \frac{1}{2\sqrt{N}} \qquad (3.11)$$

Obviously the random variable $\psi$ is an ER. It fulfills $|\psi|^2 = \nu$, and we recognize it as the probability amplitude of quantum theory, which we would infer from the observed relative frequency $\nu$. Note, however, that the intuitive way of getting the quantum mechanical probability amplitude, namely by simply taking $\sqrt{\nu}\exp(i\alpha)$, where $\alpha$ is an arbitrary phase, does not give us an ER.
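The contrast between the amplitude of eq. (3.10) and the intuitive choice $\sqrt{\nu}\exp(i\alpha)$ can be illustrated numerically. The following is only a minimal Monte Carlo sketch, not part of the original paper: the function names, the number of repetitions and the seed are assumptions made for illustration. It estimates the standard deviation of both amplitudes over repeated binomial experiments with $N$ trials.

```python
import cmath
import math
import random


def efficient_amplitude(nu, phase=0.0):
    """Amplitude of eq. (3.10): (sqrt(nu(1-nu)) + i nu) exp(i phase)."""
    return (math.sqrt(nu * (1.0 - nu)) + 1j * nu) * cmath.exp(1j * phase)


def naive_amplitude(nu, alpha=0.0):
    """The intuitive amplitude sqrt(nu) exp(i alpha)."""
    return math.sqrt(nu) * cmath.exp(1j * alpha)


def sampled_std(amplitude, p, n_trials, n_runs=1500, seed=1):
    """Monte Carlo standard deviation of amplitude(nu) over repeated experiments."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_runs):
        # One experiment: count how often the outcome occurs in n_trials trials.
        hits = sum(1 for _ in range(n_trials) if rng.random() < p)
        values.append(amplitude(hits / n_trials))
    mean = sum(values) / n_runs
    return math.sqrt(sum(abs(z - mean) ** 2 for z in values) / n_runs)


if __name__ == "__main__":
    n = 200
    print("asymptotic value 1/(2 sqrt(N)):", 1 / (2 * math.sqrt(n)))
    for p in (0.2, 0.5, 0.8):
        print(p,
              sampled_std(efficient_amplitude, p, n),
              sampled_std(naive_amplitude, p, n))
```

Under these assumptions the first column of estimates should stay close to $1/(2\sqrt{N})$ for every $p$, whereas the estimates for the intuitive amplitude shrink as $p$ grows, since its first-order standard deviation is $\sqrt{1-p}/(2\sqrt{N})$.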

We have now two ways of representing the obtained information by ERs: either the real valued $\chi$ or the complex valued $\beta$. Since the relative frequency of each of the $K$ outcomes of a general probabilistic experiment can be converted to its respective efficient random variable, the information is efficiently represented by the vector $(\chi_1, \ldots, \chi_K)$ or by the vector $(\beta_1, \ldots, \beta_K)$. The latter is equivalent to the quantum mechanical state vector if we normalize it: $(\psi_1, \ldots, \psi_K)$.

At this point it is not clear whether fundamental science could be built solely on the real ERs $\chi_j$, or whether it must rely on the complex ERs $\beta_j$ and, for practical reasons, on the normalized case $\psi_j$, as suggested by current formulations of quantum theory. We cannot address this problem here, but mention that working with the $\beta_j$ or $\psi_j$ can lead to nonsensical predictions, while working with the $\chi_j$ never does, so that the former are more sensitive to inconsistencies in the input data [6]. Therefore we use only the $\psi_j$ in the next section, but will not read them as if we were doing quantum theory.

4 Predictions

Let us now see whether the representation of probabilistic information by ERs suggests specific laws for predictions. A prediction is a statement on the expected values of the probabilities of the different outcomes of a probabilistic experiment which has not yet been done, or whose data we just do not yet know, on the basis of auxiliary probabilistic experiments which have been done and whose data we do know. We intend to make a prediction for a probabilistic experiment with $Z$ outcomes, and wish to calculate the quantities $\phi_s$ ($s = 1, \ldots, Z$), which shall be related to the predicted probabilities $P_s$ as $P_s = |\phi_s|^2$. We do not presuppose that the $\phi_s$ are ERs.

We assume we have done $M$ different auxiliary probabilistic experiments of various multinomial order $K_m$, $m = 1, \ldots, M$, and we think that they provided all the input information needed to predict the $\phi_s$ and therefore the $P_s$. With (13) the obtained information is represented by the ERs $\psi_j^m$, where $m$ denotes the experiment and $j$ labels a possible outcome in it ($j = 1, \ldots, K_m$). Then the predictions are

$$\phi_s = \phi_s\left(\psi_1^1, \ldots, \psi_{K_M}^M\right) \qquad (4.1)$$

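and their standard deviations are, by the usual convolution of gaussians as approximations of the multinomial distributions,

$$\Delta\phi_s = \sqrt{\sum_{m=1}^{M} \frac{1}{4N_m} \sum_{j=1}^{K_m} \left|\frac{\partial\phi_s}{\partial\psi_j^m}\right|^2} \qquad (4.2)$$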

where $N_m$ is the number of trials of the $m$th auxiliary experiment. If we wish the $\phi_s$ to be ERs we must demand that the $\Delta\phi_s$ depend only on the $N_m$. (A technical requirement is that in each of the $M$ auxiliary experiments one of the phases of the ERs $\psi_j^m$ cannot be chosen freely; otherwise the second summations in (16) could not go to $K_m$, but only to $K_m - 1$.) Then the derivatives in (16) must be constants, implying that the $\phi_s$ are linear in the $\psi_j^m$. However, we cannot simply assume such linearity, because (15) contains the laws of physics, which cannot be known a priori. But we want to point out that a linear relation for (15) has very exceptional properties, so that it would be nice if we found it realized in nature. To be specific, if the $N_m$ are sufficiently large, linearity would afford predictive power which no other functional relation could achieve. It would be sufficient to know the number of trials of each auxiliary probabilistic experiment in order to specify the accuracy of the predicted $\phi_s$. No data would be needed, only a decision how many trials each auxiliary experiment will be given. Moreover, even the slightest increase of the amount of input information, by only doing one more trial in any of the auxiliary experiments, would lead to better accuracy of the predicted $\phi_s$ by bringing a definite decrease of the $\Delta\phi_s$. This latter property is absent in virtually all other functional relations conceivable for (15). In fact, most nonlinear relations would allow more input information to result in less accurate predictions. This would undermine the very idea of empirical science, namely that by observation our knowledge about nature can only increase, never just stay the same, let alone decrease. For this reason we assume linearity and apply it to a concrete example.
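The two properties claimed for a linear relation can be made concrete with a small sketch. It assumes that each input ER $\psi_j^m$ has variance $1/(4N_m)$ and that the variances add, so that for constant derivatives the standard deviation of a prediction depends only on the trial numbers. The function name and the coefficient values below are hypothetical and serve only as an illustration.

```python
import math


def prediction_std(coefficients, trials):
    """Standard deviation of a linear prediction phi = sum_m sum_j c[m][j] * psi[m][j].

    coefficients[m][j] plays the role of the constant derivative d(phi)/d(psi_j^m);
    trials[m] is the number of trials N_m of auxiliary experiment m.
    """
    variance = 0.0
    for c_m, n_m in zip(coefficients, trials):
        # Each ER of experiment m contributes |c|^2 / (4 N_m) to the variance.
        variance += sum(abs(c) ** 2 for c in c_m) / (4.0 * n_m)
    return math.sqrt(variance)


# Hypothetical coefficients for two auxiliary experiments (K_1 = 2, K_2 = 3).
COEFFS = [[0.6, 0.8j], [0.5, -0.5j, 0.7]]

if __name__ == "__main__":
    before = prediction_std(COEFFS, [100, 250])
    after = prediction_std(COEFFS, [101, 250])  # one more trial in experiment 1
    print(before, after)
```

No measured data enter the calculation, and adding a single trial to any auxiliary experiment lowers the returned value, which is the monotonicity the text asks for.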

We take a particle in a one dimensional box of width $w$. Alice repeatedly prepares the particle in a state only she knows. At time $t$ after the preparation Bob measures the position by subdividing the box into $K$ bins of width $w/K$ and checking in which he finds the particle. In $N$ trials Bob obtains the relative frequencies $\nu_1, \ldots, \nu_K$, giving a good idea of the particle's position probability distribution at time $t$. He represents this information by the ERs $\psi_j$ of (10), and wants to use it to predict the position probability distribution at time $T$ ($T > t$).

First he predicts for $t + dt$. With (15) the predicted $\phi_s$ must be linear in the $\psi_j$ if they are to be ERs:

$$\phi_s(t + dt) = \sum_{j=1}^{K} a_{sj}\,\psi_j \qquad (4.3)$$

Clearly, when $dt \to 0$ we must have $a_{sj} = 1$ for $s = j$ and $a_{sj} = 0$ otherwise, so we can write

$$a_{sj}(t) = \delta_{sj} + g_{sj}(t)\,dt \qquad (4.4)$$

where $g_{sj}(t)$ are the complex elements of a matrix $G$, and we included the possibility that they depend on $t$. Using matrix notation and writing the $\phi_s$ and $\psi_j$ as column vectors, we have

$$\vec{\phi}(t + dt) = [\mathbf{1} + G(t)\,dt]\,\vec{\psi} \qquad (4.5)$$

For a prediction for time $t + 2dt$ we must apply another such linear transformation to the prediction we had for $t + dt$:

$$\vec{\phi}(t + 2dt) = [\mathbf{1} + G(t + dt)\,dt]\,\vec{\phi}(t + dt) \qquad (4.6)$$

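Replacing $t + dt$ by $t$ and using $\vec{\phi}(t + dt) = \vec{\phi}(t) + \frac{\partial\vec{\phi}(t)}{\partial t}\,dt$, we have

$$\frac{\partial\vec{\phi}(t)}{\partial t} = G(t)\,\vec{\phi}(t) \qquad (4.7)$$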

With (10) the input vector was normalized, $|\vec{\psi}|^2 = 1$. We also demand this from the vector $\vec{\phi}$. This results in the constraint that the diagonal elements $g_{ss}$ must be imaginary and the off-diagonal elements must fulfill $g_{sj} = -g_{js}^{*}$. And then we have obviously an evolution equation just as we know it from quantum theory.
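The effect of the constraint on $G$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it draws a random matrix with imaginary diagonal and off-diagonal elements obeying $g_{sj} = -g_{js}^{*}$ (reading the constraint with a complex conjugate), applies the step $[\mathbf{1} + G\,dt]$ many times, and monitors the norm. The function names, the matrix size and the step count are arbitrary choices.

```python
import math
import random


def random_generator(k, seed=0):
    """Random K x K matrix G with imaginary diagonal and g_sj = -conj(g_js)."""
    rng = random.Random(seed)
    g = [[0j] * k for _ in range(k)]
    for s in range(k):
        g[s][s] = 1j * rng.uniform(-1, 1)
        for j in range(s + 1, k):
            g[s][j] = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            g[j][s] = -g[s][j].conjugate()
    return g


def evolve(g, psi, total_time, steps):
    """Apply the step [1 + G dt] repeatedly, with dt = total_time / steps."""
    dt = total_time / steps
    phi = list(psi)
    k = len(phi)
    for _ in range(steps):
        phi = [phi[s] + dt * sum(g[s][j] * phi[j] for j in range(k))
               for s in range(k)]
    return phi


def norm_squared(vec):
    """Sum of squared moduli of the components."""
    return sum(abs(z) ** 2 for z in vec)


if __name__ == "__main__":
    start = [1 / math.sqrt(3)] * 3
    final = evolve(random_generator(3), start, total_time=1.0, steps=20000)
    print(norm_squared(final))  # stays close to 1 for small dt
```

The norm is preserved only up to terms of order $dt$ because the finite step is a first-order approximation; a matrix that violates the constraint, for example a real positive multiple of the identity, makes the norm grow visibly.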

For a quantitative prediction we need to know $G(t)$ and the phases $\varphi_j$ of the initial $\psi_j$. We had assumed the $\varphi_j$ to be arbitrary. But now we see that they influence the prediction, and therefore they attain physical significance. $G(t)$ is a unitary complex $K \times K$ matrix. For fixed conditions it is independent of time, and with the properties found above it is given by $K^2 - 1$ real numbers. The initial vector $\vec{\psi}$ has $K$ complex components. It is normalized and one phase is free, so that it is fixed by $2K - 2$ real numbers. Altogether $K^2 + 2K - 3 = (K + 3)(K - 1)$ numbers are needed to enable prediction. Since one probabilistic experiment yields $K - 1$ numbers, Bob must do $K + 3$ probabilistic experiments with different delay times between Alice's preparation and his measurement to obtain sufficient input information. But neither Planck's constant nor the particle's mass are needed. It should be noted that this analysis remains unaltered if the initial vector $\vec{\psi}$ is obtained from measurement of joint probability distributions of several particles. Therefore (21) also contains entanglement between particles.
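The counting argument is simple arithmetic and can be written out as a short sketch. The function names are hypothetical; the counts themselves are the ones stated in the text.

```python
def numbers_for_generator(k):
    """Real numbers fixing the time-independent matrix G: K^2 - 1."""
    return k * k - 1


def numbers_for_initial_vector(k):
    """Real numbers fixing the normalized initial vector with one free phase: 2K - 2."""
    return 2 * k - 2


def experiments_needed(k):
    """Number of K-outcome experiments, each yielding K - 1 independent numbers."""
    total = numbers_for_generator(k) + numbers_for_initial_vector(k)
    per_experiment = k - 1
    if total % per_experiment != 0:
        raise ValueError("total is expected to be a multiple of K - 1")
    return total // per_experiment


if __name__ == "__main__":
    for k in (2, 3, 10):
        print(k, experiments_needed(k))
```

For every $K \geq 2$ the result equals $K + 3$, in agreement with the factorization $K^2 + 2K - 3 = (K + 3)(K - 1)$.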

5 Discussion

This paper was based on the insight that under the probabilistic paradigm data from observations are subject to the multinomial probability distribution. For the representation of the empirical information we searched for random variables which are stripped of numerical artefacts. They should therefore have an invariance property. We found as unique random variables a real and a complex class of efficient random variables (ERs). They capture the obtained information more efficiently than others, because their standard deviation is an asymptotic invariant of the physical conditions. The quantum mechanical probability amplitude is the normalized case of the complex class. It is natural that fundamental probabilistic science should use such random variables rather than any others as the representors of the observed information, and therefore as the carriers of meaning.

Using the ERs for prediction has given us an evolution prescription which is equivalent to the quantum theoretical way of applying a sequence of infinitesimal rotations to the state vector in Hilbert space [7]. It seems that simply analysing how we gain empirical information and what we can say from it about expected future information, and not succumbing to the lure of the question what is behind this information, can give us a basis for doing physics. This confirms the operational approach to science. And it is in support of Wheeler's It-from-Bit hypothesis [8], Weizsacker's ur-theory [9], Eddington's idea that information increase itself defines the rest [10], Anandan's conjecture of absence of dynamical laws [11], Bohr and Ulfbeck's hypothesis of mere symmetry [12], or the recent 1 Bit = 1 Constituent hypothesis of Brukner and Zeilinger [13].

In view of the analysis presented here, the quantum theoretical probability calculus is an almost trivial consequence of probability theory, but not as applied to objects or anything physical, but as applied to the naked data of probabilistic experiments. If we continue this idea we encounter a deeper problem, namely whether the space which we consider physical, this 3- or higher dimensional manifold in which we normally assume the world to unfurl [14], cannot also be understood as a peculiar way of representing data. Kant conjectured this, in somewhat different words, over 200 years ago [15]. And indeed it is clearly so if we imagine the human observer as a robot who must find a compact memory representation of the gigantic data stream it receives through its senses [16]. That is why our earlier example of the particle in a box should only be seen as an illustration by means of familiar terms. It should not imply that we accept the naive conception of space, or of things like particles in it, although this view works well in everyday life and in the laboratory, as long as we are not doing quantum experiments. We think that a full acceptance of the probabilistic paradigm as the basis of empirical science will eventually require an attack on the notions of spatial distance and spatial dimension from the point of view of optimal representation of probabilistic information.

Finally, we want to remark on a difference between our analysis and quantum theory. We have emphasized that the standard deviations of the ERs $\chi$ and $\psi$ become independent of the limits of these ERs only when we have infinitely many trials. But there is a departure for finitely many trials, especially for values of $p$ close to 0 and close to 1. With some imagination this can be noticed in Fig. 1 in the top and bottom probability distributions of $\chi$, which are a little bit wider than those in the middle. But as we always have only finitely many trials, there should exist random variables which fulfill our requirement for an ER even better than $\chi$ and $\psi$. This implies that predictions based on these unknown random variables should also be more precise. Whether we should see this as a fluke of statistics or as a need to amend quantum theory is a debatable question. But it should be testable. We need to have a number of different probabilistic experiments, all of which are done with only very few trials. From this we want to predict the outcomes of another probabilistic experiment, which is then also done with only few trials. Presumably the optimal procedure of prediction will not be the one we have presented here (and therefore not quantum theory). The difficulty with such tests is, however, that in the usual interpretation of data statistical theory and quantum theory are treated as separate, while one message of this paper may also be that under the probabilistic paradigm the bottom level of physical theory should be equivalent to optimal representation of probabilistic information, and this theory should not be in need of additional, purely statistical theories to connect it to actual data. We are discussing this problem in a future paper [17].
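The finite-$N$ departure can be quantified without simulation by summing over the exact binomial distribution. The sketch below uses the real random variable $\zeta = [\pi + 2\arcsin(2\nu - 1)]/4$ from the construction in Fig. 2a, whose asymptotic standard deviation is $1/(2\sqrt{N})$; the function names are assumptions made for illustration.

```python
import math


def zeta(nu):
    """Real efficient random variable measured along the arc (Fig. 2a)."""
    return (math.pi + 2.0 * math.asin(2.0 * nu - 1.0)) / 4.0


def exact_std(p, n_trials, transform=zeta):
    """Exact standard deviation of transform(nu) under the binomial distribution."""
    pmf = [math.comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
           for k in range(n_trials + 1)]
    vals = [transform(k / n_trials) for k in range(n_trials + 1)]
    mean = sum(w * v for w, v in zip(pmf, vals))
    return math.sqrt(sum(w * (v - mean) ** 2 for w, v in zip(pmf, vals)))


if __name__ == "__main__":
    n = 100
    print("asymptotic value:", 1 / (2 * math.sqrt(n)))
    for p in (0.07, 0.25, 0.50, 0.75, 0.93):
        print(p, exact_std(p, n))
```

For $N = 100$ the values for $p$ near 0 or 1 come out slightly larger than those near $p = 0.5$, which is the small widening the text describes.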


Acknowledgments

This paper is a result of pondering what I am doing in the lab, how it can be that in the evening I know more than I knew in the morning, and discussing this with G. Krenn, K. Svozil, C. Brukner, M. Zukovski and a number of other people.

References

1. M. Born, Zeitschrift f. Physik 37, 863 (1926); Brit. J. Philos. Science 4, 95 (1953).

2. D. Bouwmeester et al., Phys. Rev. Lett. 82, 1345 (1999), and references therein.

3. W. Feller, An Introduction to Probability Theory and its Applications (John Wiley and Sons, New York, 3rd edition, 1968), Vol. 1, p. 168.

4. ibid., p. 233.

5. The connection of this relation to quantum physics was first stressed by W. K. Wootters, Phys. Rev. D 23, 357 (1981).

6. We give the example in quant-ph/0008098.

7. Several authors have noted that probability theory itself suggests quantum theory: A. Lande, Am. J. Phys. 42, 459 (1974); A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, Dordrecht, 1998); D. I. Fivel, Phys. Rev. A 50, 2108 (1994).

8. J. A. Wheeler, in Quantum Theory and Measurement, eds. J. A. Wheeler and W. H. Zurek (Princeton University Press, Princeton, 1983), 182.

9. C. F. von Weizsacker, Aufbau der Physik (Hanser, Munich, 1985); Holger Lyre, Int. J. Theor. Phys. 34, 1541 (1995). Also quant-ph/9703028.

10. C. W. Kilmister, Eddington's Search for a Fundamental Theory (Cambridge University Press, 1994).

11. J. Anandan, Found. Phys. 29, 1647 (1999).

12. A. Bohr and O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

13. C. Brukner and A. Zeilinger, Phys. Rev. Lett. 83, 3354 (1999).

14. A penetrating analysis of the view of space implied by quantum theory is given by U. Mohrhoff, Am. J. Phys. 68 (8), 728 (2000).

15. Immanuel Kant, Critik der reinen Vernunft (Critique of Pure Reason), Riga (1781). There should be many English translations.

16. E. T. Jaynes introduced the reasoning robot in his book Probability Theory: The Logic of Science, in order to eliminate the problem of subjectivism that has been plaguing probability theory and quantum theory alike. The book is freely available at http://bayes.wustl.edu/etj/prob.html

17. J. Summhammer (to be published).


QUANTUM CRYPTOGRAPHY IN SPACE AND BELL'S THEOREM

IGOR VOLOVICH

Steklov Mathematical Institute, Gubkin St. 8,

GSP-1, 117966 Moscow, Russia

E-mail: volovich@mi.ras.ru

Bell's theorem states that some quantum correlations can not be represented by classical correlations of separated random variables. It has been interpreted as incompatibility of the requirement of locality with quantum mechanics. We point out that in fact the space part of the wave function was neglected in the proof of Bell's theorem. However, this space part is crucial for considerations of the property of locality of a quantum system. Actually the space part leads to an extra factor in quantum correlations, and as a result the ordinary proof of Bell's theorem fails in this case. Bell's theorem constitutes an important part in quantum cryptography. The promise of secure cryptographic quantum key distribution schemes is based on the use of Bell's theorem in the spin space. In many current quantum cryptography protocols the space part of the wave function is neglected. As a result such schemes can be secure against eavesdropping attacks in the abstract spin space, but they could be insecure in the real three-dimensional space. We discuss an approach to the security of quantum key distribution in space by using a special preparation of the space part of the wave function.

1 Introduction

Bell's theorem [1] states that there are quantum correlation functions that can not be represented as classical correlation functions of separated random variables. It has been interpreted as incompatibility of the requirement of locality with the statistical predictions of quantum mechanics. For a recent discussion of Bell's theorem see, for example, [2]-[17] and references therein. It is now widely accepted, as a result of Bell's theorem and related experiments, that local realism must be rejected.

Evidently, the very formulation of the problem of locality in quantum mechanics is based on ascribing a special role to the position in ordinary three-dimensional space. It is rather surprising, therefore, that the space dependence of the wave function is neglected in discussions of the problem of locality in relation to Bell's inequalities. Actually it is the space part of the wave function which is relevant to the consideration of the problem of locality.

In this note we point out that the space part of the wave function leads to an extra factor in quantum correlation, and as a result the ordinary proof of Bell's theorem fails in this case. We present a criterion of locality (or nonlocality) of quantum theory in a realist model of hidden variables. We argue that predictions of quantum mechanics can be consistent with Bell's inequalities for Gaussian wave functions, and hence Einstein's local realism is restored in this case.

Bell's theorem constitutes an important part in quantum cryptography [19]. It is now generally accepted that techniques of quantum cryptography can allow secure communications between distant parties [18]-[25]. The promise of secure cryptographic quantum key distribution schemes is based on the use of quantum entanglement in the spin space and on the quantum no-cloning theorem. An important contribution of quantum cryptography is a mechanism for detecting eavesdropping.

However, in many current quantum cryptography protocols the space part of the wave function is neglected. But it is exactly the space part of the wave function that describes the behaviour of particles in ordinary real three-dimensional space. As a result such schemes can be secure against eavesdropping attacks in the abstract spin space, but could be insecure in the real three-dimensional space.

It follows that proofs of the security of quantum cryptography schemes which neglect the space part of the wave function could fail against attacks in the real three-dimensional space. We will discuss how one can try to improve the security of quantum cryptography schemes in space by using a special preparation of the space part of the wave function.

2 Bell's Inequality

In the presentation of Bell's theorem we will follow [17], where one can also find more references. The mathematical formulation of Bell's theorem reads

$$\cos(\alpha - \beta) \neq E\,\xi_\alpha \eta_\beta \qquad (2.1)$$

where $\xi_\alpha$ and $\eta_\beta$ are two random processes such that $|\xi_\alpha| \le 1$, $|\eta_\beta| \le 1$, and $E$ is the expectation. Let us discuss in more detail the physical interpretation of this result. Consider a pair of spin one-half particles formed in the singlet spin state and moving freely towards two detectors (Alice and Bob). If one neglects the space part of the wave function, then the quantum mechanical correlation of two spins in the singlet state $\psi_{spin}$ is

$$D_{spin}(a, b) = \langle \psi_{spin} | \sigma \cdot a \otimes \sigma \cdot b | \psi_{spin} \rangle = -a \cdot b \qquad (2.2)$$

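Here $a$ and $b$ are two unit vectors in three-dimensional space, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices, and $\psi_{spin}$ is the singlet state vector.

Bell's theorem states that the function $D_{spin}(a, b)$, Eq. (2.2), can not be represented in the form

$$P(a, b) = \int \xi(a, \lambda)\,\eta(b, \lambda)\,d\rho(\lambda) \qquad (2.3)$$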

i.e.

$$D_{spin}(a, b) \neq P(a, b) \qquad (2.4)$$

Here $\xi(a, \lambda)$ and $\eta(b, \lambda)$ are random fields on the sphere, $|\xi(a, \lambda)| \le 1$, $|\eta(b, \lambda)| \le 1$, and $d\rho(\lambda)$ is a positive probability measure, $\int d\rho(\lambda) = 1$. The parameters $\lambda$ are interpreted as hidden variables in a realist theory. It is clear that Eq. (2.4) can be reduced to Eq. (2.1).

One has the following Bell-Clauser-Horne-Shimony-Holt (CHSH) inequality:

$$|P(a, b) - P(a, b') + P(a', b) + P(a', b')| \le 2 \qquad (2.5)$$

On the other hand, there are vectors ($ab = a'b = a'b' = -ab' = \sqrt{2}/2$) for which one has

$$|D_{spin}(a, b) - D_{spin}(a, b') + D_{spin}(a', b) + D_{spin}(a', b')| = 2\sqrt{2} \qquad (2.6)$$

Therefore, if one supposes that $D_{spin}(a, b) = P(a, b)$, then one gets a contradiction.
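The value $2\sqrt{2}$ can be reproduced directly from the singlet state. The following sketch is an illustration with assumed names and an assumed choice of coplanar measurement directions satisfying $ab = a'b = a'b' = -ab' = \sqrt{2}/2$; it evaluates the expectation of $\sigma \cdot a \otimes \sigma \cdot b$ explicitly with $4 \times 4$ matrices.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
# Singlet state (|01> - |10>)/sqrt(2) in the basis |00>, |01>, |10>, |11>.
SINGLET = (0j, 1 / math.sqrt(2) + 0j, -1 / math.sqrt(2) + 0j, 0j)


def spin_along(n):
    """2x2 matrix sigma . n for a unit 3-vector n."""
    return [[sum(n[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def kron(a, b):
    """Kronecker product of two 2x2 matrices (row index 2i+k, column 2j+l)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def d_spin(a, b):
    """Singlet expectation value of (sigma . a) tensor (sigma . b)."""
    op = kron(spin_along(a), spin_along(b))
    value = sum(SINGLET[r].conjugate() * op[r][c] * SINGLET[c]
                for r in range(4) for c in range(4))
    return value.real


def chsh(a, a_prime, b, b_prime):
    """Absolute value of the CHSH combination of spin correlations."""
    return abs(d_spin(a, b) - d_spin(a, b_prime)
               + d_spin(a_prime, b) + d_spin(a_prime, b_prime))


if __name__ == "__main__":
    c = math.sqrt(2) / 2
    print(chsh((1, 0, 0), (0, 1, 0), (c, c, 0), (-c, c, 0)))
```

With these directions each of the four correlations has magnitude $\sqrt{2}/2$ and the signs combine so that the CHSH expression reaches $2\sqrt{2}$, above the bound 2 of inequality (2.5).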

It will be shown below that if one takes into account the space part of the wave function, then the quantum correlation in the simplest case will take the form $g\cos(\alpha - \beta)$ instead of just $\cos(\alpha - \beta)$, where the parameter $g$ describes the location of the system in space and time. In this case one can get the representation

$$g\cos(\alpha - \beta) = E\,\xi_\alpha \eta_\beta \qquad (2.7)$$

if $g$ is small enough (see below). The factor $g$ gives a contribution to visibility or efficiency of detectors that are used in the phenomenological description of detectors.

3 Localized Detectors

In the previous section the space part of the wave function of the particles was neglected. However, it is exactly the space part that is relevant to the discussion of locality. The complete wave function is $\psi = (\psi_{\alpha\beta}(r_1, r_2))$, where $\alpha$ and $\beta$ are spinor indices and $r_1$ and $r_2$ are vectors in three-dimensional space.


We suppose that Alice and Bob have detectors which are located within the two localized regions $O_A$ and $O_B$ respectively, well separated from one another.

The quantum correlation describing the measurements of spins by Alice and Bob at their localized detectors is

$$G(a, O_A, b, O_B) = \langle \psi | \sigma \cdot a\,P_{O_A} \otimes \sigma \cdot b\,P_{O_B} | \psi \rangle \qquad (3.1)$$

Here $P_O$ is the projection operator onto the region $O$. Let us consider the case when the wave function has the form of the product of the spin function and the space function, $\psi = \psi_{spin}\,\phi(r_1, r_2)$. Then one has

$$G(a, O_A, b, O_B) = g(O_A, O_B)\,D_{spin}(a, b) \qquad (3.2)$$

where the function

$$g(O_A, O_B) = \int_{O_A \times O_B} |\phi(r_1, r_2)|^2\,dr_1\,dr_2 \qquad (3.3)$$

describes the correlation of particles in space. It is the probability to find one particle in the region $O_A$ and another particle in the region $O_B$. One has

$$0 \le g(O_A, O_B) \le 1 \qquad (3.4)$$

Remark. In relativistic quantum field theory there is no nonzero strictly localized projection operator that annihilates the vacuum. It is a consequence of the Reeh-Schlieder theorem. Therefore, apparently, the function $g(O_A, O_B)$ should always be strictly smaller than 1. I am grateful to W. Luecke for this remark.

Now one inquires whether one can write the representation

$$g(O_A, O_B)\,D_{spin}(a, b) = \int \xi(a, O_A, \lambda)\,\eta(b, O_B, \lambda)\,d\rho(\lambda) \qquad (3.5)$$

Note that if we are interested in the conditional probability of finding the projection of spin along the vector $a$ for particle 1 in the region $O_A$ and the projection of spin along the vector $b$ for particle 2 in the region $O_B$, then we have to divide both sides of Eq. (3.5) by $g(O_A, O_B)$.

The factor $g$ is important. In particular, one can write the following representation [15] for $0 \le g \le 1/2$:

$$g\cos(\alpha - \beta) = \int_0^{2\pi} \sqrt{2g}\cos(\alpha - \lambda)\,\sqrt{2g}\cos(\beta - \lambda)\,\frac{d\lambda}{2\pi} \qquad (3.6)$$
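The representation for $0 \le g \le 1/2$ can be verified numerically. The sketch below assumes the reading in which the hidden variable $\lambda$ is uniformly distributed on $[0, 2\pi)$ and the two random variables are $\xi = \sqrt{2g}\cos(\alpha - \lambda)$ and $\eta = \sqrt{2g}\cos(\beta - \lambda)$; the function name and the quadrature rule are illustrative choices.

```python
import math


def hidden_variable_average(g, alpha, beta, n_points=2000):
    """Average of xi * eta over lambda uniform on [0, 2 pi), by the midpoint rule."""
    if not 0.0 <= g <= 0.5:
        # For g > 1/2 the variables would exceed the bound |xi|, |eta| <= 1.
        raise ValueError("representation requires 0 <= g <= 1/2")
    amplitude = math.sqrt(2.0 * g)
    total = 0.0
    for i in range(n_points):
        lam = 2.0 * math.pi * (i + 0.5) / n_points
        xi = amplitude * math.cos(alpha - lam)
        eta = amplitude * math.cos(beta - lam)
        total += xi * eta
    return total / n_points


if __name__ == "__main__":
    g, alpha, beta = 0.4, 0.3, 1.1
    print(hidden_variable_average(g, alpha, beta), g * math.cos(alpha - beta))
```

The restriction to $g \le 1/2$ enters only through the bound $\sqrt{2g} \le 1$ on the two random variables; the average itself equals $g\cos(\alpha - \beta)$ because the mean of $\cos(\alpha - \lambda)\cos(\beta - \lambda)$ over a full period is $\frac{1}{2}\cos(\alpha - \beta)$.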

Let us now apply these considerations to quantum cryptography


4 Quantum Key Distribution

Ekert [19] showed that one can use the EPR correlations to establish a secret random key between two parties (Alice and Bob). Bell's inequalities are used to check the presence of an intermediate eavesdropper (Eve). There are two stages to the Ekert protocol: the first stage over a quantum channel, the second over a public channel.

The quantum channel consists of a source that emits pairs of spin one-half particles in a singlet state. The particles fly apart towards Alice and Bob, who, after the particles have separated, perform measurements on spin components along one of three directions given by unit vectors $a$ and $b$. In the second stage Alice and Bob communicate over a public channel. They announce in public the orientation of the detectors they have chosen for particular measurements. Then they divide the measurement results into two separate groups: a first group for which they used different orientations of the detectors, and a second group for which they used the same orientation of the detectors. Now Alice and Bob can reveal publicly the results they obtained, but within the first group of measurements only. This allows them, by using Bell's inequality, to establish the presence of an eavesdropper (Eve). The results of the second group of measurements can be converted into a secret key. One supposes that Eve has a detector which is located within the region $O_E$, and she is described by hidden variables $\lambda$.

We will interpret Eve as a hidden variable in a realist theory and will study whether the quantum correlation Eq. (3.2) can be represented in the form Eq. (2.3). From (2.5), (2.6) and (3.5) one can see that if the inequality

$$g(O_A, O_B) \le 1/\sqrt{2} \qquad (4.1)$$

is valid for regions $O_A$ and $O_B$ which are well separated from one another, then there is no violation of the CHSH inequalities (2.5), and therefore Alice and Bob can not detect the presence of an eavesdropper. On the other hand, if for a pair of well separated regions $O_A$ and $O_B$ one has

$$g(O_A, O_B) > 1/\sqrt{2} \qquad (4.2)$$

then there could be a violation of the realist locality in these regions for a given state. Then, in principle, one can hope to detect an eavesdropper in these circumstances.
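The threshold $1/\sqrt{2}$ follows from scaling the maximal spin-space CHSH value $2\sqrt{2}$ by the factor $g$ and comparing with the classical bound 2. A minimal sketch of this comparison, with hypothetical function names, is:

```python
import math

# Largest CHSH value of the pure spin correlation (g = 1).
MAX_SPIN_CHSH = 2.0 * math.sqrt(2.0)
# Value of g above which the scaled correlation can exceed the bound 2.
G_THRESHOLD = 1.0 / math.sqrt(2.0)


def max_chsh_value(g):
    """Largest CHSH combination reachable with the correlation g * D_spin."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g is a probability and must lie in [0, 1]")
    return g * MAX_SPIN_CHSH


def violation_possible(g):
    """True if the CHSH bound 2 can be exceeded, i.e. if g > 1/sqrt(2)."""
    if not 0.0 <= g <= 1.0:
        raise ValueError("g is a probability and must lie in [0, 1]")
    return g > G_THRESHOLD


if __name__ == "__main__":
    for g in (0.3, 0.7, 0.75, 1.0):
        print(g, max_chsh_value(g), violation_possible(g))
```

In this reading, any estimate of $g$ for a concrete experimental geometry decides whether a CHSH test in the protocol can reveal an eavesdropper at all.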

Note that if we set $g(O_A, O_B) = 1$ in (3.5), as was done in the original proof of Bell's theorem, then it means that we did a special preparation of the states of the particles to be completely localized inside the detectors. There exist such well localized states (see, however, the previous Remark), but there exist also other states, with wave functions which are not very well localized inside the detectors, and still particles in such states are also observed in detectors. The fact that a particle is observed inside the detector does not mean, of course, that its wave function is strictly localized inside the detector before the measurement. Actually one has to perform a thorough investigation of the preparation and the evolution of our entangled states in space and time if one needs to estimate the function $g(O_A, O_B)$.

5 Gaussian Wave Functions

Now let us consider the criterion of locality for Gaussian wave functions. We will show that with a reasonable accuracy there is no violation of locality in this case. Let us take the wave function $\phi$ of the form $\phi = \psi_1(r_1)\psi_2(r_2)$, where the individual wave functions have the moduli

$$|\psi_1(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 r^2/2}, \qquad |\psi_2(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 (r - l)^2/2} \qquad (5.1)$$

We suppose that the length of the vector $l$ is much larger than $1/m$. We can make measurements of $P_{O_A}$ and $P_{O_B}$ for any well separated regions $O_A$ and $O_B$. Let us suppose a rather unfavourable case for the criterion of locality, when the wave functions of the particles are almost localized inside the regions $O_A$ and $O_B$ respectively. In such a case the function $g(O_A, O_B)$ can take values near its maximum. We suppose that the region $O_A$ is given by $|r_i| < 1/m$, $r = (r_1, r_2, r_3)$, and the region $O_B$ is obtained from $O_A$ by translation by $l$. Hence $\psi_1(r_1)$ is a Gaussian function with modulus appreciably different from zero only in $O_A$, and similarly $\psi_2(r_2)$ is localized in the region $O_B$. Then we have

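$$g(O_A, O_B) = \left(\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-x^2/2}\,dx\right)^6 \qquad (5.2)$$

One can estimate (5.2) as

$$g(O_A, O_B) < \left(\frac{2}{\pi}\right)^3 \qquad (5.3)$$

which is smaller than $1/2$. Therefore the locality criterion (4.1) is satisfied in this case.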
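The size of $g$ in this Gaussian example can be evaluated with the error function. The sketch below rests on assumptions that should be stated plainly: each coordinate of each particle is taken to be Gaussian with standard deviation $1/m$, each detector region is taken to be a cube of half-width $1/m$ centred on the packet, and the overlap of one packet with the other region is neglected because $|l| \gg 1/m$. The function names are hypothetical.

```python
import math


def localization_factor(half_width_in_sigmas=1.0):
    """Probability that both particles are found inside their cubic regions.

    One coordinate contributes erf(a / sqrt(2)), the Gaussian probability of
    falling within a standard deviations of the centre; two particles with
    three coordinates each give the sixth power.
    """
    one_dim = math.erf(half_width_in_sigmas / math.sqrt(2.0))
    return one_dim ** 6


def crude_upper_bound(half_width_in_sigmas=1.0):
    """Bound obtained by replacing the Gaussian integrand by its maximum value 1."""
    one_dim = 2.0 * half_width_in_sigmas / math.sqrt(2.0 * math.pi)
    return one_dim ** 6


if __name__ == "__main__":
    print(localization_factor(), crude_upper_bound(), 1 / math.sqrt(2))
```

Under these assumptions the factor comes out near 0.10 and the crude bound equals $(2/\pi)^3 \approx 0.26$, both below $1/2$ and hence below the threshold $1/\sqrt{2}$.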

Let us recall that there is a well known effect of expansion of wave packets due to the free time evolution. If $\varepsilon$ is the characteristic length of the Gaussian wave packet describing a particle of mass $M$ at time $t = 0$, then at time $t$ the characteristic length $\varepsilon_t$ will be

$$\varepsilon_t = \varepsilon \sqrt{1 + \frac{\hbar^2 t^2}{M^2 \varepsilon^4}}$$

It tends to $(\hbar/M\varepsilon)\,t$ as $t \to \infty$. Therefore the locality criterion is always satisfied for nonrelativistic particles if the regions $O_A$ and $O_B$ are far enough from each other. The case of relativistic particles will be considered in a separate publication.
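The spreading can be illustrated with numbers. The sketch below assumes the standard textbook expression for a free Gaussian packet, $\varepsilon_t = \varepsilon\sqrt{1 + (\hbar t / M\varepsilon^2)^2}$, with the width convention chosen so that the large-time limit is $(\hbar/M\varepsilon)\,t$ as quoted in the text. The function names and the example values (an electron with an initial width of one micrometre) are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant in J s
ELECTRON_MASS = 9.1093837015e-31  # kg, used only as an example value


def packet_width(t, width0, mass):
    """Characteristic length of a free Gaussian wave packet after time t."""
    return width0 * math.sqrt(1.0 + (HBAR * t / (mass * width0 ** 2)) ** 2)


def asymptotic_width(t, width0, mass):
    """Large-time limit (hbar / (M * width0)) * t of the characteristic length."""
    return HBAR * t / (mass * width0)


if __name__ == "__main__":
    for t in (0.0, 1e-9, 1e-6, 1.0):
        print(t, packet_width(t, 1e-6, ELECTRON_MASS),
              asymptotic_width(t, 1e-6, ELECTRON_MASS))
```

For a light particle the linear regime is reached quickly, so a packet that starts well localized is spread over a region much larger than a detector by the time it has travelled a macroscopic distance.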

6 Conclusions

It is shown in this note that if we do not neglect the space part of the wave function of two particles, then the prediction of quantum mechanics can be consistent with Bell's inequalities. One can say that Einstein's local realism is restored in this case.

It would be interesting to investigate whether one can prepare a reasonable wave function for which the condition of nonlocality (4.2) is satisfied for a pair of well separated regions. In principle the function $g(O_A, O_B)$ can approach its maximal value 1 if the wave functions of the particles are very well localized within the detector regions $O_A$ and $O_B$ respectively. However, perhaps to establish such a localization one has to destroy the original entanglement, because it was created far away from the detectors.

It is shown that the presence of the space part in the wave function of two particles in the entangled state leads to a problem in the proof of the security of quantum key distribution. To detect the eavesdropper's presence by using Bell's inequality we have to estimate the function $g(O_A, O_B)$. Only a special quantum key distribution protocol has been discussed here, but it seems there are similar problems in other quantum cryptographic schemes as well.

We do not claim in this note that it is in principle impossible to increase the detectability of the eavesdropper. However, it is not clear to the present author how to do it without a thorough investigation of the process of preparation of the entangled state and then its evolution in space and time towards Alice and Bob.

In the previous section Eve was interpreted as an abstract hidden variable. However, one can assume that more information about Eve is available. In particular, one can assume that she is located somewhere in space in a region $O_E$. It seems one has to study a generalization of the function $g(O_A, O_B)$ which depends not only on the Alice and Bob locations $O_A$ and $O_B$ but also on the Eve location $O_E$, and try to find a strategy which leads to an optimal value of this function.


7 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions. This work is supported in part also by RFFI 99-01-00105 and INTAS 99-0590.

References

1. J.S. Bell, Physics 1, 195 (1964).

2. A. Peres, Quantum Theory: Concepts and Methods, Kluwer, Dordrecht, 1993.

3. L.E. Ballentine, Quantum Mechanics, Prentice-Hall, 1990.

4. W.M. de Muynck, W. De Baere and H. Martens, Found. of Physics (1994) 1589.

5. D.M. Greenberger, M.A. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).

6. S.L. Braunstein, A. Mann and M. Revzen, Phys. Rev. Lett. 68, 3259 (1992).

7. N.D. Mermin, Am. J. Phys. 62, 880 (1994).

8. G.M. D'Ariano, L. Maccone, M.F. Sacchi and A. Garuccio, Tomographic test of Bell's inequality, quant-ph/9907091.

9. Luigi Accardi and Massimo Regoli, Locality and Bell's inequality, quant-ph/0007005.

10. Andrei Khrennikov, Non-Kolmogorov probability models and modified Bell's inequality, quant-ph/0003017.

11. Almut Beige, William J. Munro and Peter L. Knight, A Bell's Inequality Test with Entangled Atoms, quant-ph/0006054.

12. F. Benatti and R. Floreanini, On Bell's locality tests with neutral kaons, hep-ph/9812353.

13. A. Khrennikov, Statistical measure of ensemble nonreproducibility and correction to Bell's inequality, Nuovo Cimento 115B (2000) 179.

14. W.A. Hofer, Information transfer via the phase: A local model of Einstein-Podolsky-Rosen experiments, quant-ph/0006005.

15. Igor Volovich and Yaroslav Volovich, Bell's Theorem and Random Variables, quant-ph/0009058.

16. N. Gisin, V. Scarani, W. Tittel and H. Zbinden, Optical tests of quantum nonlocality: from EPR-Bell tests towards experiments with moving observers, quant-ph/0009055.

17. Igor V. Volovich, Bell's Theorem and Locality in Space, quant-ph/0012010.

18. C.H. Bennett and G. Brassard, in Proc. of the IEEE Int. Conf. on Computers, Systems and Signal Processing, Bangalore, India (IEEE, New York, 1984), p. 175.

19. A.K. Ekert, Phys. Rev. Lett. 67 (1991) 661.

20. D.S. Naik, C.G. Peterson, A.G. White, A.J. Berglund and P.G. Kwiat, Entangled state quantum cryptography: Eavesdropping on the Ekert protocol, quant-ph/9912105.

21. Gilles Brassard, Norbert Lutkenhaus, Tal Mor and Barry C. Sanders, Security Aspects of Practical Quantum Cryptography, quant-ph/9911054.

22. Kei Inoue, Takashi Matsuoka and Masanori Ohya, New approach to Epsilon-entropy and Its comparison with Kolmogorov's Epsilon-entropy, quant-ph/9806027.

23. Hoi-Kwong Lo, Will Quantum Cryptography ever become a successful technology in the marketplace?, quant-ph/9912011.

24. Akihisa Tomita and Osamu Hirota, Security of classical noise-based cryptography, quant-ph/0002044.

25. Yong-Sheng Zhang, Chuan-Feng Li and Guang-Can Guo, Quantum key distribution via quantum encryption, quant-ph/0011034.

373

INTERACTING STOCHASTIC PROCESS AND RENORMALIZATION THEORY

YAROSLAV VOLOVICH

Physics Department, Moscow State University

Vorobievi Gori, 119899 Moscow, Russia

E-mail yaroslav-Vmailru

A stochastic process with self-interaction as a model of quantum field theory is studied. We consider an Ornstein-Uhlenbeck stochastic process $x(t)$ with interaction of the form $x^{(\alpha)}(t)^4$, where $\alpha$ indicates the fractional derivative. Using Bogoliubov's R-operation we investigate ultraviolet divergencies for the various parameters $\alpha$. Ultraviolet properties of this one-dimensional model in the case $\alpha = 3/4$ are similar to those in the $\varphi^4_4$ theory, but there are extra counterterms. It is shown that the model is two-loops renormalizable. For $5/8 \le \alpha < 3/4$ the model has a finite number of divergent Feynman diagrams. In the case $\alpha = 2/3$ the model is similar to the $\varphi^4_3$ theory. If $0 \le \alpha < 5/8$ then the model does not have ultraviolet divergencies at all. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

1 Introduction

There is a very fruitful interrelation between probability theory and quantum field theory [1-6]. In this note we consider a stochastic process that shows the same divergencies as quantum electrodynamics or $\varphi^4$ theory in 4-dimensional spacetime. This stochastic process corresponds to a one-dimensional Euclidean quantum field theory with a quartic interaction that contains fractional derivatives. This one-dimensional model can be used for studying the fundamental problem of non-perturbative investigation of renormalized quantum field theory [1,3]. It can also find applications in the theory of phase transitions [5,6].

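The Interacting Stochastic Process

Let $x(t) = x(t,\omega)$ be an Ornstein-Uhlenbeck stochastic process with the correlation function

$$E\,x(t)x(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{ip(t-\tau)}}{p^2+m^2}\,dp = \frac{1}{2m}\,e^{-m|t-\tau|},$$

where $m > 0$. There exists a spectral representation of the Ornstein-Uhlenbeck stochastic process [8]:

$$x(t,\omega) = \int e^{ikt}\,\zeta(dk,\omega),$$

where $\zeta(dk,\omega)$ is a stochastic measure. We define the fractional derivative $x^{(\alpha)}$ as

$$x^{(\alpha)}(t,\omega) = \int |k|^{\alpha} e^{ikt}\,\zeta(dk,\omega). \tag{1.2}$$

If $0 \le \alpha < 1/2$ then $x^{(\alpha)}(t)$ is a stochastic process. If $\alpha \ge 1/2$ then one needs a regularization, described below. We will use distribution notation and write

$$\zeta(dk,\omega) = \tilde{x}(k,\omega)\,dk, \qquad \tilde{x}(k,\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} x(t,\omega)\,e^{-ikt}\,dt.$$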
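The statement that the fractional derivative is an ordinary stochastic process only for orders below 1/2 can be illustrated numerically. The sketch below is not from the original text: it assumes the spectral density $1/(2\pi(k^2+m^2))$ implied by the correlation function above and the $|k|^{\alpha}$ convention for the fractional derivative, and it computes the variance of the process with momenta restricted to $|k| \le \kappa$. The function names, the midpoint rule and the step count are illustrative choices. The closed form used for comparison is the standard integral $\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{|k|^{2\alpha}}{k^2+m^2}\,dk = \frac{m^{2\alpha-1}}{2\cos(\pi\alpha)}$, valid for $0 \le \alpha < 1/2$.

```python
import math


def cutoff_variance(alpha, kappa, m=1.0, steps=200_000):
    """Variance of the cutoff fractional derivative:
    (1/2pi) * integral over [-kappa, kappa] of |k|^(2 alpha) / (k^2 + m^2),
    evaluated with a composite midpoint rule (an illustrative choice)."""
    h = kappa / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k ** (2 * alpha) / (k * k + m * m)
    # The integrand is even in k, so [-kappa, kappa] gives twice the [0, kappa] value.
    return 2.0 * total * h / (2.0 * math.pi)


def limit_variance(alpha, m=1.0):
    """Closed form of the kappa -> infinity limit; only valid for 0 <= alpha < 1/2."""
    return m ** (2 * alpha - 1) / (2.0 * math.cos(math.pi * alpha))


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75):
        values = [cutoff_variance(alpha, kappa) for kappa in (10.0, 100.0, 1000.0)]
        print(alpha, [round(v, 4) for v in values])
```

For orders below 1/2 the printed values settle as the cutoff grows, while at 1/2 and above they keep increasing (logarithmically at exactly 1/2), which is why a regularization is needed there.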

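We want to give a meaning to the following correlation functions

$$K(t_1,\dots,t_N) = \frac{E\big(x(t_1)\cdots x(t_N)\,e^{-\lambda U}\big)}{E\big(e^{-\lambda U}\big)} \tag{1.3}$$

for all $N = 1, 2, \dots$ Here

$$U = \int_{-\infty}^{\infty} :x^{(\alpha)}(\tau)^4:\; g(\tau)\,d\tau, \tag{1.4}$$

where $g(\tau)$ is a nonnegative test function with compact support (the volume cut-off), $x^{(\alpha)}(t)$ denotes the fractional derivative (1.2), $\lambda > 0$, and $:x^{(\alpha)}(\tau)^4:$ is the Wick normal product. We will denote the expectation value as $E(A) = \langle A\rangle$. In this notation, $\langle x(t)x(\tau)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{ip(t-\tau)}}{p^2+m^2}\,dp$.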
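For orientation, the Wick normal product of a fourth power can be written out explicitly. The identity below is the standard one for a centered Gaussian variable and is not taken from the original text. The symbols $\xi$ and $\sigma^2 = \langle \xi^2\rangle$ are introduced only for this illustration, and the formula assumes that $\sigma^2$ is finite, which for the fractional derivative of order 1/2 or higher holds only after a cutoff is introduced.

```latex
:\!\xi^4\!: \;=\; \xi^4 - 6\sigma^2\,\xi^2 + 3\sigma^4,
\qquad
\langle :\!\xi^4\!: \rangle \;=\; 3\sigma^4 - 6\sigma^4 + 3\sigma^4 \;=\; 0 .
```

The vanishing expectation reflects that Wick ordering removes the contractions of a vertex with itself, so no such self-contractions appear in the diagrams.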

For the correlation function (1.3) one has the perturbative expansion

$$\langle x(t_1)\cdots x(t_N)\,e^{-\lambda U}\rangle = \sum_{n=0}^{\infty}\frac{(-\lambda)^n}{n!}\,\langle x(t_1)\cdots x(t_N)\,U^n\rangle. \tag{1.5}$$

If $\alpha \ge 5/8$ then the expectation value in (1.5) has no meaning because there are ultraviolet divergencies. We have to introduce a cutoff stochastic process $x_\kappa(t)$ [3]:

$$x_\kappa(t,\omega) = \int_{-\kappa}^{\kappa} e^{ikt}\,\zeta(dk,\omega).$$

Instead of $U$ in (1.3) we put

$$U_\kappa = \int :x_\kappa^{(\alpha)}(\tau)^4:\; g(\tau)\,d\tau,$$

where

$$x_\kappa^{(\alpha)}(t,\omega) = \int_{-\kappa}^{\kappa} |k|^{\alpha} e^{ikt}\,\zeta(dk,\omega).$$

(Footnote: stochastic differential equations with fractional derivatives [7] are also considered on $p$-adic number fields.)

The problem is to prove that after the renormalization there exists a limit of the correlation functions

$$\langle x(t_1)\cdots x(t_N)\,e^{-\lambda U_\kappa}\rangle_{\mathrm{ren}}$$

as $\kappa \to \infty$ in each order of the perturbation expansion. We will consider this problem below by using the Bogoliubov-Parasiuk R-operation and the standard language of Feynman diagrams.

In the momentum representation we obtain an expression of the form

$$\langle \tilde{x}(p_1)\cdots\tilde{x}(p_N)\,e^{-\lambda U}\rangle = \sum_{\Gamma} G_\Gamma(p_1,\dots,p_N).$$

Here the sum runs over all Feynman diagrams $\Gamma$ with $N$ external legs that can be built using 4-vertices corresponding to the $x^{(\alpha)}(\tau)^4$ term. The contribution from a connected diagram with $n$ 4-vertices and $L$ internal lines has the form

$$\int \prod_{i=1}^{l} dk_i \prod_{j=1}^{L}\frac{|q_j|^{2\alpha}}{q_j^2+m^2}, \tag{1.6}$$

where $l = L - (n-1)$ and the $q_j$ are linear combinations of the internal momenta $k_1,\dots,k_l$ and the external momenta $p_1,\dots,p_N$.

The canonical degree $D(\Gamma)$ of a proper diagram is defined by the dimension of the corresponding Feynman integral with respect to the integration variables. Using (1.6) we have

$$D = D(\Gamma) = (2\alpha - 2)L + l = (2\alpha - 1)L - n + 1. \tag{1.7}$$

If for a given diagram $D < 0$ then this diagram is superficially finite, otherwise it is divergent. Let us consider a proper diagram with $n$ vertices, $L$ internal lines and $E$ legs. We have the following relation:

$$4n = 2L + E. \tag{1.8}$$

Note that for any nontrivial connected diagram

$$2n \ge L \ge n \ge 2, \tag{1.9}$$

$$E \le 2n. \tag{1.10}$$
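As a worked illustration (added here, not part of the original text), the degree formula $D = (2\alpha - 1)L - n + 1$ can be evaluated for the three index sets with two vertices that satisfy $4n = 2L + E$. This is power counting only, under the assumption that the formula above applies as stated; it shows where the threshold values of $\alpha$ come from.

```latex
\begin{align*}
n=2,\ L=4,\ E=0: &\quad D = 4(2\alpha-1) - 1 = 8\alpha - 5 \;\ge 0 \iff \alpha \ge 5/8,\\
n=2,\ L=3,\ E=2: &\quad D = 3(2\alpha-1) - 1 = 6\alpha - 4 \;\ge 0 \iff \alpha \ge 2/3,\\
n=2,\ L=2,\ E=4: &\quad D = 2(2\alpha-1) - 1 = 4\alpha - 3 \;\ge 0 \iff \alpha \ge 3/4.
\end{align*}
```

So the vacuum diagram is the first to diverge as $\alpha$ grows, the two-leg diagram follows at 2/3, and the four-leg diagram at 3/4.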

Theorem. If $\alpha < 5/8$ then all Feynman diagrams of the interacting stochastic process are superficially finite. If $5/8 \le \alpha < 3/4$ then there exists a finite number of divergent diagrams; moreover, all divergent diagrams have only 0 or 2 legs. If $\alpha = 3/4$ then the model is renormalizable and all divergent diagrams have only 0, 2 or 4 external lines. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

Proof. Let us prove the first statement of the theorem, i.e. if $\alpha < 5/8$ then $D < 0$ for any $n \ge 2$. Using (1.7) and (1.9) we have

$$D\big|_{\alpha<5/8} < \Big(2\cdot\frac{5}{8}-1\Big)L - n + 1 = \frac{L-4n+4}{4} \le \frac{2n-4n+4}{4} = \frac{2-n}{2} \le 0. \tag{1.11}$$

From (1.11) it follows that $D < 0$ for any $\alpha < 5/8$. Let us consider $\alpha = 5/8$. Similarly to (1.11), from (1.7) we have

$$D\big|_{\alpha=5/8} = \frac{L-4n+4}{4} \le \frac{2-n}{2} \le 0. \tag{1.12}$$

Therefore only a two-point ($n = 2$) diagram could be divergent (in this case $D = 0$). Using (1.8), we rewrite (1.12) in the form

$$D\big|_{\alpha=5/8} = \frac{4-(E+L)}{4}. \tag{1.13}$$

From (1.13) it follows that only the diagram with $E = 0$, $L = 4$, $n = 2$ is divergent. In the case when $5/8 < \alpha < 3/4$ we can write

$$\alpha = \frac{3}{4} - \varepsilon, \tag{1.14}$$

where $0 < \varepsilon < 1/8$. Substituting (1.14) into (1.7) and using (1.9) we have

$$D\big|_{\alpha=3/4-\varepsilon} = \frac{L}{2} - 2L\varepsilon - n + 1 \le \frac{2n}{2} - 2n\varepsilon - n + 1 = 1 - 2n\varepsilon. \tag{1.15}$$

Thus for any given $\varepsilon > 0$ (and therefore any $\alpha < 3/4$) there exists a number $N$ such that for any $n > N$ the canonical dimension $D < 0$. Hence there exists only a finite number of divergent diagrams. Rewriting (1.15) with the help of (1.8) in the form

$$D\big|_{\alpha=3/4-\varepsilon} = -2L\varepsilon + \frac{4-E}{4},$$

it follows that $D \ge 0$ only if $E < 4$, i.e. $E = 0$ or $E = 2$, and the model is super-renormalizable.

Let us consider the case when $\alpha = 3/4$. Using (1.8) and (1.7) we have

$$D\big|_{\alpha=3/4} = 1 - \frac{E}{4}. \tag{1.16}$$

The equality (1.16) means that all divergent diagrams have only 0, 2 or 4 legs and the model is renormalizable.

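Finally, if $\alpha > 3/4$ we have, from (1.7) and (1.8),

$$D\big|_{\alpha>3/4} > \frac{L}{2} - n + 1 = 1 - \frac{E}{4}. \tag{1.17}$$

Therefore, if $\alpha > 3/4$ then $D$ exceeds $1 - E/4$ by $2(\alpha - 3/4)L$, which grows with the number of internal lines, so divergent proper diagrams occur for every number of legs and the model is nonrenormalizable. ∎ Examples of applications of this theorem can be found in [9].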
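The case analysis in the proof can be cross-checked by brute force. The following sketch is an illustration added here, not the author's method. It assumes the degree formula $D = (2\alpha - 1)L - n + 1$ together with the constraints $4n = 2L + E$ and $2n \ge L \ge n \ge 2$, and it enumerates index triples $(n, L, E)$ only; whether a proper diagram with those numbers actually exists is not checked. The function names and the vertex bound `max_vertices` are hypothetical. Exact rational arithmetic is used so that the borderline cases $D = 0$ are not lost to rounding.

```python
from fractions import Fraction


def canonical_degree(alpha, n, L):
    """Canonical degree D = (2*alpha - 1)*L - n + 1 for n vertices and L internal lines."""
    return (2 * alpha - 1) * L - n + 1


def divergent_diagrams(alpha, max_vertices):
    """Return all (n, L, E, D) with D >= 0 for 2 <= n <= max_vertices.

    Only the counting constraints are imposed: n <= L <= 2n and E = 4n - 2L.
    """
    found = []
    for n in range(2, max_vertices + 1):
        for L in range(n, 2 * n + 1):
            E = 4 * n - 2 * L  # number of external legs; lies between 0 and 2n
            D = canonical_degree(alpha, n, L)
            if D >= 0:
                found.append((n, L, E, D))
    return found


if __name__ == "__main__":
    for alpha in (Fraction(1, 2), Fraction(5, 8), Fraction(2, 3),
                  Fraction(3, 4), Fraction(4, 5)):
        hits = divergent_diagrams(alpha, max_vertices=10)
        legs = sorted({E for (_, _, E, _) in hits})
        print(f"alpha = {alpha}: {len(hits)} divergent index sets, legs = {legs}")
```

Raising `max_vertices` leaves the lists unchanged for $\alpha < 3/4$, adds more entries with 0, 2 or 4 legs at $\alpha = 3/4$, and adds entries with ever more legs above $3/4$.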

2 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Växjö University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. N.N. Bogoliubov and D.V. Shirkov, Introduction to the Theory of Quantum Fields, Nauka, Moscow, 1973.

2. T. Hida, Brownian Motion, Springer-Verlag, 1980.

3. J. Glimm and A. Jaffe, Quantum Physics: A Functional Integral Point of View, Springer-Verlag, 1987.

4. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise: An Infinite Dimensional Calculus, Kluwer Academic, 1993.

5. J. Kogut, K. Wilson, Phys. Reports 12C, p. 75, 1974.

6. A.Z. Patashinski and V.L. Pokrovski, The Fluctuational Theory of Phase Transitions, Nauka, Moscow, 1975.

7. V.S. Vladimirov, Generalized functions over the field of p-adic numbers, Russian Math. Surveys 43:5 (1988).

8. I.I. Gihman and A.V. Skorohod, Introduction to the Theory of Random Processes, Nauka, Moscow, 1977.

9. Ya.I. Volovich, Interacting stochastic process and renormalization theory, quant-ph/0008063.

ISBN 981-02-4846-6

www.worldscientific.com

  • Foreword
  • Contents
  • Preface
  • Locality and Bell's Inequality
    • 1 Inequalities among numbers
    • 2 The Bell inequality
    • 3 Implications of the Bell's inequalities for the singlet correlations
    • 4 Bell on the meaning of Bell's inequality
    • 5 Critique of Bell's vital assumption
    • 6 The role of the counterfactual argument in Bell's proof
    • 7 Proofs of Bell's inequality based on counting arguments
    • 8 The quantum probabilistic analysis
    • 9 The realism of ballot boxes and the corresponding statistics
    • 10 The realism of chameleons and the corresponding statistics
    • 11 Bell's inequalities and the chameleon effect
    • 12 Physical implausibility of Bell's argument
    • 13 The role of the single probability space in CHSH's proof
    • 14 The role of the counterfactual argument in CHSH's proof
    • 15 Physical difference between the CHSH's and the original Bell's inequalities
    • References
  • Refutation of Bell's Theorem
    • 1 Introduction
    • 2 The EPRB gedanken experiment
    • 3 The CHSH function
    • 4 Strongly objective interpretation
    • 5 Weakly objective interpretation
    • 6 Conclusion
    • References
  • Probability Conservation and the State Determination Problem
    • 1 Introduction
    • 2 Conservation of Probability
    • 3 Determination of the phase function
    • 4 Validity and range of applicability
    • 5 Evolution of a Gaussian Wave Packet
    • 6 Operational Issues
    • Acknowledgments
    • References
  • Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws
    • 1 Introduction
    • 2 Ontic and epistemic descriptions
    • 3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility
    • 4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility
    • 5 Summary and Open Questions
    • Acknowledgments
    • References
  • Interpretations of Probability and Quantum Theory
    • 1 Introduction
    • 2 Interpretations of Probability
    • 3 The Axioms of Probability
    • 4 Probability in Quantum Mechanics
    • 5 Conclusions
    • References
  • Forcing Discretization and Determination in Quantum History Theories
    • 1 Introduction
    • 2 Outcome determination via contextual models
    • 3 Unitary ortho- and projective structure
    • 4 Representing quantum history theory
    • 5 Further discussion
    • Acknowledgments
    • References
  • Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality
    • 1 Realist and empiricist interpretations of quantum mechanics
    • 2 EPR experiments and Bell experiments
    • 3 Bell's inequality in quantum mechanics
    • 4 Bell's inequality in stochastic and deterministic hidden-variables theories
    • 5 Analogy between thermodynamics and quantum mechanics
    • 6 Conclusions
    • References
  • Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble
    • 1 Introduction
    • 2 The Ginibre ensembles
    • Acknowledgements
    • References
  • Some Remarks on Hardy Functions Associated with Dirichlet Series
    • 1 Introduction
    • 2 Hardyfication of Dirichlet series
    • 3 Factorization of n
    • 4 Applications
    • References
  • Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit
    • 1 Introduction
    • 2 There is a lot to add to classical equilibrium statistics from our experience with Small systems
    • 3 Relation of the topology of S(E N) to the Yang-Lee zeros of Z(T u V)
    • 4 The regions of positive curvature A1 of s(es ns) correspond to phase transitions of first order
    • 5 Boltzmann's principle and non-equilibrium thermodynamics
    • 6 Macroscopic observables imply the EPS-probability
    • 7 On Einstein's objections against the EPS-probability
    • 8 Fractal distributions in phase space, Second Law
    • 9 Conclusion
    • Appendix
    • Acknowledgement
    • References
  • An Approach to Quantum Probability
    • 1 Introduction
    • 2 Formulation
    • 3 Wave Functions and Hilbert Space
    • 4 Spin
    • 5 Traditional Quantum Mechanics
    • 6 Concluding Remarks
    • References
  • Innovation Approach to Stochastic Processes and Quantum Dynamics
    • 1 Introduction
    • 2 Review of defining a stochastic process and white noise analysis
    • 3 Relations to Quantum Dynamics
    • 4 Addenda to foundations of the theories: Concluding remarks
    • Acknowledgements
    • References
  • Statistics and Ergodicity of Wave Functions in Chaotic Open Systems
    • 1 Introduction
    • 2 Classical Nonergodicity and Short-Path Dynamics
    • 3 Universal Description of Wave Function Statistics
    • 4 Numerical Analyses and Discussions
    • 5 Conclusions
    • Acknowledgments
    • References
  • Origin of Quantum Probabilities
    • 1 Introduction
    • 2 Quantum formalism and perturbation effects
    • 3 Probability transformations connecting preparation procedures
    • 3 Hyperbolic and hyper-trigonometric probabilistic transformations
    • 4 Double stochasticity and correlations between preparation procedures
    • 5 Hyperbolic quantum formalism
    • 6 Physical consequences
    • Acknowledgements
    • References
  • Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies
    • 1 Introduction
    • 2 Analysis of the foundation of probability theory
    • 3 General principle of statistical stabilization of relative frequencies
    • 4 Probability distribution of a collective
    • 5 Model examples of p-adic statistics
    • Acknowledgements
    • References
  • Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?
    • 1 Introduction
    • 2 De Broglie waves as an SED effect
    • 3 Schrödinger Equation
    • 4 Conclusions
  • A Probabilistic Inequality for the Kochen-Specker Paradox
    • 1 Introduction
    • 2 The Kochen-Specker theorem
    • 3 The Kochen-Specker inequality
    • 4 Independence
    • 5 Conclusions
  • Quantum Stochastics: The New Approach to the Description of Quantum Measurements
    • 1 Introduction
    • 2 Quantum stochastic approach
    • 3 Concluding remarks
    • 4 Acknowledgments
    • References
  • Abstract Models of Probability
    • 1 What probability sets o are possible
    • 2 Uniqueness of semigroups of zeros and units
    • 3 Probabilities with hidden parameters
    • 4 Probability sets with a single unit
    • 5 Acknowledgments
    • References
  • Quantum K-Systems and their Abelian Models
    • 1 Introduction
    • 2 Classical K-System
    • 3 Algebraic Quantum K-Systems
    • 4 Dynamical Entropy
    • 5 Some General Considerations on Abelian Models
    • 6 Abelian Models for Algebraic K-Systems
    • 7 Continuous K-Systems
    • 8 Mixing Properties Without Algebraic K-Property
    • 9 Time Evolution
    • References
  • Scattering in Quantum Tubes
    • 1 Introduction
    • 2 Tubes in quantum heterostructures
    • 3 Mathematical model
    • 4 Reformulated scattering problem
    • 5 Solution of the scattering problem
    • References
  • Position Eigenstates and the Statistical Axiom of Quantum Mechanics
    • 1 Quantum probabilities according to Deutsch
    • 2 Schrödinger's equation for a free particle as a consequence of position eigenstates
    • 3 Driven particle: Weyl equation in general space-time
    • 4 Realizing Deutsch's substitution as a time evolution
    • 5 Can normalization be replaced by symmetry?
    • References
  • Is Random Event the Core Question? Some Remarks and a Proposal
    • 1 Preface
    • 2 Linguistic Model
    • 3 Ensemble Model
    • 4 Structural Model
    • 5 Certain and Uncertain Structures
    • 6 Probability
    • 7 Experimental Verification
    • 8 Objective and Subjective Probability
    • 9 Conclusions
    • References
  • Constructive Foundations of Randomness
    • 1 Introduction
    • 2 Kolmogorov Complexity
    • 3 Incompressibility
    • 4 Reversible Complexity
    • 5 Complexity and Information
    • 6 Frequency Rates
    • 7 Prefix Complexity
    • 8 Universal Probability
    • 9 Sequentially Coding Algorithms
    • References
  • Structure of Probabilistic Information and Quantum Laws
    • 1 Introduction
    • 2 Gaining experimental information
    • 3 Efficient representation of probabilistic information
    • 4 Predictions
    • 5 Discussion
    • Acknowledgments
    • References
  • Quantum Cryptography in Space and Bell's Theorem
    • 1 Introduction
    • 2 Bell's Inequality
    • 3 Localized Detectors
    • 4 Quantum Key Distribution
    • 5 Gaussian Wave Functions
    • 6 Conclusions
    • 7 Acknowledgments
    • References
  • Interacting Stochastic Process and Renormalization Theory
    • 1 Introduction
    • 2 Acknowledgments
    • References

Published by

World Scientific Publishing Co. Pte. Ltd.

P O Box 128, Farrer Road, Singapore 912805

USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data. A catalogue record for this book is available from the British Library.

FOUNDATIONS OF PROBABILITY AND PHYSICS. PQ-QP: Quantum Probability and White Noise Analysis - Vol. 13

Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4846-6

Printed in Singapore by World Scientific Printers (S) Pte Ltd

V

Foreword

With the present proceedings of a conference on Foundations of Probability and Physics we continue the QP series mdash the first volume of which appeared more than twenty years ago The series had its origin in proceedings of conshyferences and workshops on quantum probability and related topics Initially published by Springer-Verlag World Scientific has now been the publisher for about ten years Much has changed in the world of quantum probability in the last two decades Quantum probabilistic methods became a mature subject in mathematics and mathematical physics The number of well-established scienshytists who have turned their scientific interest to the field of quantum probability is impressively increasing Scientifically and numerically strong schools of quanshytum probability evolved in the past years Moreover the highly interdisciplinary character of quantum probability became more and more evident Especially the close connections to white noise analysis aroused the interest of classical and quantum probabilists and stimulated mutual exchange and cooperation fruitful for both parties

Taking into account this development during the previous QP conferences we discussed comprehensively and in detail the future profile and main goals of the series Some changes in the alignment and the objectives of the series reshysulted from these discussions First of all the new title reflects the intention to unify white noise analysis and quantum probability It is important and essenshytial to bring together classical and quantum probabilists and the success of the World Scientific journal Infinite Dimensional Analysis Quantum Probability and Related Topics shows that such an alliance will benefit both parties Furshythermore we should be open to a wide audience of scientists and to a broad spectrum of themes The present volume represents such a field being not very closely connected to quantum probability and white noise analysis but of general interest to the readership of the series

Future volumes of the series will include proceedings of conferences or workshyshops lecture notes of schools but also monographs on topics in quantum probshyability and white noise analysis

Finally we would like to thank all former editors of the series for their excellent job they did We especially appreciate the enthusiastic commitment of Luigi Accardi who initiated the series and was the responsible editor for many years

Wolfgang Freudenberg


Contents

Foreword  v

Preface  xi

Locality and Bell's Inequality  1
L. Accardi and M. Regoli

Refutation of Bell's Theorem  29
G. Adenier

Probability Conservation and the State Determination Problem  39
S. Aerts

Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws  50
H. Atmanspacher, R. C. Bishop and A. Amann

Interpretations of Probability and Quantum Theory  71
L. E. Ballentine

Forcing Discretization and Determination in Quantum History Theories  85
B. Coecke

Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality  95
W. M. De Muynck

Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble  115
M. M. Duras

Some Remarks on Hardy Functions Associated with Dirichlet Series  121
W. Ehm

Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit  131
D. H. E. Gross

An Approach to Quantum Probability  147
S. Gudder

Innovation Approach to Stochastic Processes and Quantum Dynamics  161
T. Hida

Statistics and Ergodicity of Wave Functions in Chaotic Open Systems  170
H. Ishio

Origin of Quantum Probabilities  180
A. Khrennikov

Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies  201
A. Khrennikov

Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?  219
A. F. Kracklauer

A Probabilistic Inequality for the Kochen-Specker Paradox  236
J.-A. Larsson

Quantum Stochastics: The New Approach to the Description of Quantum Measurements  246
E. Loubenets

Abstract Models of Probability  257
V. M. Maximov

Quantum K-Systems and their Abelian Models  274
H. Narnhofer

Scattering in Quantum Tubes  303
B. Nilsson

Position Eigenstates and the Statistical Axiom of Quantum Mechanics  314
L. Polley

Is Random Event the Core Question? Some Remarks and a Proposal  321
P. Rocchi

Constructive Foundations of Randomness  335
V. I. Serdobolskii

Structure of Probabilistic Information and Quantum Laws  350
J. Summhammer

Quantum Cryptography in Space and Bell's Theorem  364
Volovich

Interacting Stochastic Process and Renormalization Theory  373
Y. Volovich


Preface

This volume constitutes the proceedings of the Conference Foundations of Probability and Physics, held in Växjö (Småland, Sweden) from 25 November to 1 December 2000.

The Organizing Committee of the Conference: L. Accardi (Rome, Italy), W. De Muynck (Eindhoven, the Netherlands), T. Hida (Meijo University, Japan), A. Khrennikov (Växjö University, Sweden) and U. V. Maximov (Belostok, Poland).

The purpose of the Conference (tentatively the first of a series) was to bring together scientists (physicists as well as mathematicians) who are interested in probabilistic foundations of physics. Emphasis was placed on both theory and experiment, the underlying objective being to offer the physical and mathematical scientific communities a truly interdisciplinary Conference as a privileged place for scientific interaction among theoreticians and experimentalists. Given the increased role of probabilistic foundations in physical applications (Einstein-Podolsky-Rosen correlation experiments, Bell's inequality, quantum information, computing and teleportation), as well as the need to reconsider foundations at the beginning of a new millennium, the organizers of the Conference decided that it was just the right time to take the scientific risk of trying this.

Since the creation of Statistical Mechanics, probabilistic description has played an increasingly important role in physics. A new crucial step in the development of the statistical approach to physics was made in the process of the creation of quantum mechanics. The founders of quantum theory recognized that the quantum formalism could not provide a description of physical processes for individual elementary particles. The understanding of this surprising fact induced numerous debates on the possibilities of individual and probabilistic descriptions and the relations between them. These debates are characterized by a large diversity of opinions on the origin of quantum stochasticity.

One of the viewpoints is that quantum stochasticity differs from classical stochasticity, so that quantum (statistical) mechanics cannot be reduced to classical statistical mechanics. This viewpoint implies the conventional interpretation of quantum mechanics.

By this interpretation we cannot use objective realism in the quantum description of reality. The very fundamental physical quantities, such as the position and momentum of an elementary particle, cannot be considered as properties of the object, the elementary particle. The elementary particle can be in a state that is a superposition of alternatives. Only the act of measurement gives the possibility to choose between these alternatives.


We recall historical roots of the origin of such a viewpoint namely the idea of superposition

In fact the whole quantum building was built on two experimental cornerstones: 1) the experiment on photoelectric emission; 2) the two slit experiment.

The first experiment definitely demonstrated that light has a corpuscular structure (discrete structure of energy).

However, the second experiment demonstrated that photons (corpuscular objects) do not follow the standard CLASSICAL STATISTICS. The conventional rule for the addition of probabilistic alternatives

$$P = P_1 + P_2$$

is violated in the interference experiments. Instead of this rule, probabilities observed in interference experiments follow the quantum rule for the addition of probabilistic alternatives

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta$$

Thus in general the classical rule is perturbed by the $\cos\theta$-factor.

The appearance of NEW STATISTICS induced a revolution in theoretical physics: a reconsideration of the role of all basic elements of the physical theory. The common opinion was (and is) that the quantum probabilistic rule cannot be explained by a purely corpuscular model. To explain this rule we must appeal to wave arguments (see, for example, Dirac's book for a detailed analysis of the roots of the quantum mechanical formalism).

This implies the wave-particle dualism and Bohr's principle of complementarity. This was a crucial change of the whole picture of physical reality (at least at the micro-level).

We underline again that all these revolutionary changes had a purely probabilistic root, namely the appearance of the new probabilistic rule. We also underline that the founders of quantum mechanics in fact did not provide a deep probabilistic analysis of the problem. Instead, they analysed other elements of the physical model. Such an analysis induces the new description of physical reality that we have already discussed, namely quantum reality. We will never know the real reasons for such a development of the theoretical study of the results of experiments with elementary particles at the beginning of the last century.

a Of course, we must also mention that the necessity for a departure from classical mechanics was shown by experiments demonstrating the remarkable stability of atoms and molecules. The forces known in classical electrodynamics are inadequate for the explanation of this phenomenon. However, the quantum mechanical explanation of such stability is in fact based on the same arguments as the explanation of the photoelectric effect.

b P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).

It might be that one of the reasons was the absence of a mathematical theory of probability: A. N. Kolmogorov proposed the modern axiomatics of probability theory only in 1933.

During the round table at this conference Prof. T. Hida and Prof. I. Volovich pointed to the fundamental role of direct contacts between physicists and mathematicians in the creation of new physical theories. It may be that the absence of direct collaboration between the quantum physical and probabilistic communities was the main root of the absence of a deep probabilistic analysis of quantum behaviour.

Debates on the foundations of quantum mechanics continued with new excitement in connection with the Einstein-Podolsky-Rosen (EPR) paradox. Unfortunately, the probabilistic element played only a minor role in the EPR considerations. The notion of probability one was used (in a rather formal way) in the formulation of the sufficient condition to be an element of physical reality. A new probabilistic impulse to the debates on the foundations of quantum mechanics was given by Bell's inequality. However, we must recognize that Bell's probabilistic considerations were performed on a formal level that cannot be considered satisfactory (at least from the point of view of a mathematician). It may be that this absence of a deep probabilistic analysis of the EPR and Bell arguments was one of the main reasons why investigations concentrated on nonlocality and no-go theorems for hidden variables.

The main aim of the conference Foundations of Probability and Physics was to provide a probabilistic analysis of the foundations of physics, classical as well as quantum (in particular, the EPR and Bell arguments). The present volume contains the results of such analysis. It gives a general picture of the probabilistic foundations of modern physics. Foundations of probability were considered in close connection with foundations of physics. We demonstrated that probability plays a fundamental role in models of physical reality. It seems to be impossible to split probabilistic and physical problems. On the one hand, many important problems that look purely physical are in fact just probabilistic problems. On the other hand, the right meaning of probability can be found only on the basis of physical investigations. Such a meaning depends strongly on the physical model.

The conference and the present volume give a good example of fruitful collaboration between physicists and mathematicians, and stimulate research on the foundations of probability and physics, especially quantum physics.

We would like to thank the Swedish Natural Science Foundation, the Swedish Technical Science Foundation, Växjö University and Växjö Commune for financial support that made the Conference possible. We would also like to thank Prof. Magnus Söderström, the Rector of Växjö University, for support of fundamental investigations and, in particular, of this Conference.

Andrei Khrennikov
International Center for Mathematical Modelling in Physics and Cognitive Sciences
University of Växjö, Sweden
December 2000


LOCALITY AND BELL'S INEQUALITY

LUIGI ACCARDI, MASSIMO REGOLI
Centro Vito Volterra
Università di Roma "Tor Vergata", Roma, Italy
Email: accardi@volterra.mat.uniroma2.it

We prove that the locality condition is irrelevant to Bell's inequality. We check that the real origin of Bell's inequality is the assumption of applicability of classical (Kolmogorovian) probability theory to quantum mechanics. We describe the chameleon effect, which allows one to construct an experiment realizing a local, realistic, classical, deterministic and macroscopic violation of the Bell inequalities.

1 Inequalities among numbers

In this section we summarize some elementary inequalities among numbers which correspond to the different forms of the Bell inequality one meets in the literature. Since some confusion has arisen about the mutual relationships among these inequalities, in particular their (in)equivalence and the cases of equality, such a summary might not be totally useless.

Lemma (1) For any two numbers $a, c \in [-1, 1]$ the following equivalent inequalities hold:

$$|a \pm c| \le 1 \pm ac \qquad (1)$$

Moreover, equality in (1) holds if and only if either $a = \pm 1$ or $c = \pm 1$.

Proof The equivalence of the two inequalities (1) follows from the fact that one is obtained from the other by changing the sign of $c$, and $c$ is arbitrary in $[-1, 1]$.

Since for any $a, c \in [-1, 1]$ one has $1 \pm ac \ge 0$, (1) is equivalent to

$$|a \pm c|^2 = a^2 + c^2 \pm 2ac \le (1 \pm ac)^2 = 1 + a^2 c^2 \pm 2ac$$

and this is equivalent to

$$a^2 (1 - c^2) + c^2 \le 1$$

which is identically satisfied because $1 - c^2 \ge 0$ and therefore

$$a^2 (1 - c^2) + c^2 \le 1 - c^2 + c^2 = 1 \qquad (2)$$

Notice that in (2) equality holds if and only if $a^2 = 1$, i.e. $a = \pm 1$. Since exchanging $a$ and $c$ in (1) leaves the inequality unchanged, the thesis follows.
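As a numerical sanity check of Lemma (1), the following sketch evaluates both inequalities in (1) on a grid of $[-1, 1]^2$ and lists the grid points where equality occurs. It is only an illustration and not part of the original argument; the function names, the grid size and the tolerance are assumptions chosen here.

```python
import itertools

def grid(n=41):
    """Evenly spaced points in [-1, 1], endpoints included (illustrative choice)."""
    return [-1 + 2 * k / (n - 1) for k in range(n)]

def lemma1_holds(a, c, tol=1e-12):
    """True if |a + c| <= 1 + a*c and |a - c| <= 1 - a*c (up to rounding)."""
    return abs(a + c) <= 1 + a * c + tol and abs(a - c) <= 1 - a * c + tol

def check_lemma1(n=41):
    """Check inequality (1) at every grid point of [-1, 1] x [-1, 1]."""
    return all(lemma1_holds(a, c) for a, c in itertools.product(grid(n), repeat=2))

def equality_points(n=41, tol=1e-12):
    """Grid points where |a - c| equals 1 - a*c (up to rounding)."""
    return [(a, c) for a, c in itertools.product(grid(n), repeat=2)
            if abs(abs(a - c) - (1 - a * c)) <= tol]
```

On this grid the equality points all lie on the boundary of the square, in agreement with the equality statement of the Lemma.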


Corollary (2) For any three numbers $a, b, c \in [-1, 1]$ the following equivalent inequalities hold:

$$|ab \pm cb| \le 1 \pm ac \qquad (3)$$

and equality holds if and only if $b = \pm 1$ and either $a = \pm 1$ or $c = \pm 1$.

Proof For $b \in [-1, 1]$,

$$|ab \pm cb| = |b| \cdot |a \pm c| \le |a \pm c| \qquad (4)$$

so the thesis follows from Lemma (1). In (4) equality holds if and only if $b = \pm 1$, so also the second statement follows from Lemma (1).

Lemma (3) For any numbers $a, a', b, b', c \in [-1, 1]$ one has

$$|ab - bc| + |ab' + b'c| \le 2 \qquad (5)$$

$$ab + ab' + a'b' - a'b \le 2 \qquad (6)$$

In (5) equality holds if and only if $b, b' = \pm 1$ and either $a$ or $c = \pm 1$.

Proof Adding the two inequalities in (3) one finds (5). The left hand side of (6) is $\le$ than

$$|ab - ba'| + |ab' + b'a'| \qquad (7)$$

and, replacing $a'$ by $c$, (7) becomes the left hand side of (5). Therefore (6) holds. If $b, b' = \pm 1$ and either $a$ or $c = \pm 1$, equality holds in (3), hence in (5). Conversely, suppose that equality holds in (5) and suppose that either $|b| < 1$ or $|b'| < 1$. Then, writing $a'$ for $c$, we arrive at the contradiction

$$2 = |b| \cdot |a - a'| + |b'| \cdot |a + a'| < |a - a'| + |a + a'| \le (1 - aa') + (1 + aa') = 2 \qquad (8)$$

So, if equality holds in (5), we must have $|b| = |b'| = 1$. In this case (5) becomes

$$|a - a'| + |a + a'| = 2 \qquad (9)$$

and we know from Lemma (1) that the identity (9) can take place if and only if either $a$ or $a' = \pm 1$.
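The bounds (5) and (6) can be probed by random sampling. The sketch below is a hedged illustration, not a proof: `lhs5` and `lhs6` are names introduced here for the left hand sides of (5) and (6), with (6) read as $ab + ab' + a'b' - a'b$, and the sample size and seed are arbitrary assumptions.

```python
import random

def lhs5(a, b, bp, c):
    """Left hand side of (5): |ab - bc| + |ab' + b'c|."""
    return abs(a * b - b * c) + abs(a * bp + bp * c)

def lhs6(a, ap, b, bp):
    """Left hand side of (6): ab + ab' + a'b' - a'b."""
    return a * b + a * bp + ap * bp - ap * b

def max_over_samples(n=20000, seed=0):
    """Largest values of both left hand sides over n random points of [-1, 1]^5."""
    rng = random.Random(seed)
    m5 = m6 = float("-inf")
    for _ in range(n):
        a, ap, b, bp, c = (rng.uniform(-1, 1) for _ in range(5))
        m5 = max(m5, lhs5(a, b, bp, c))
        m6 = max(m6, lhs6(a, ap, b, bp))
    return m5, m6
```

The bound 2 is attained at extreme points, for instance `lhs5(1, 1, 1, -1)` and `lhs6(1, 1, 1, 1)`, both of which have all arguments equal to $\pm 1$.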


Corollary (4) If $a, a', b, b', c \in \{-1, 1\}$ then the inequalities (3), (6) and (5) are equivalent and equality holds in all of them.

Proof From Lemma (1) we know that the inequalities (1) and (2) are equivalent. From Lemma (3) we know that (3) implies (5). Choosing $b' = a$ in (5), since $a = \pm 1$, (5) becomes

$$|ab - cb| \le 1 - ac$$

which is (3). The left hand side of (6) is

$$a(b + b') + a'(b' - b) \qquad (10)$$

In our assumptions either $(b + b')$ or $(b' - b)$ is zero, so the modulus of (10) is either equal to

$$|a(b + b')| = |b + b'| = 2$$

or to

$$|a'(b' - b)| = |b' - b| = 2$$

Corollary (5) If $a, b, c \in (-1, 1)$ then the inequality (5), hence a fortiori (6), is strictly weaker than (3).

Proof We have already proved that (3) implies (5), hence (6). On the other hand, (5) is equivalent to

$$|ab - bc| \le (1 - ac) + (1 + ac - |ab' + b'c|) \qquad (11)$$

By Lemma (1), $1 + ac - |ab' + b'c| \ge 0$ and equality holds if and only if $|b'| = 1$ and either $a$ or $c$ is $\pm 1$. From this the thesis follows.

2 The Bell inequality

Corollary (1) (Bell inequality) Let $A, B, C, D$ be random variables defined on the same probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, 1]$. Then the following inequalities hold:

$$|E(AB) - E(BC)| \le 1 - E(AC) \qquad (1)$$

$$|E(AB) + E(BC)| \le 1 + E(AC) \qquad (2)$$

$$|E(AB) - E(BC)| + |E(AD) + E(DC)| \le 2 \qquad (3)$$

where $E$ denotes the expectation value in the probability space of the four variables. Moreover, (1) is equivalent to (2) and, if either $A$ or $C$ has values $\pm 1$, then the three inequalities are equivalent.

Proof Lemma (1.1) implies the following inequalities (interpreted pointwise on $\Omega$):

$$|AB - BC| \le 1 - AC$$

$$|AB + BC| \le 1 + AC$$

$$|AB - BC| + |AD + DC| \le 2$$

from which (1), (2), (3) follow by taking expectations and using the fact that $|E(X)| \le E(|X|)$. The equivalence is established by the same arguments as in Lemma (1.1).
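The key point of the proof is that all four variables are evaluated at the same sample point, so the pointwise inequalities survive averaging. The following sketch illustrates this with empirical means; the four functions of the sample point used for $A, B, C, D$ are arbitrary assumptions chosen here, as are the sample size and seed.

```python
import math
import random

def bell_expectations(n=5000, seed=1):
    """Empirical means of AB, BC, AC, AD, DC for variables on ONE sample space."""
    rng = random.Random(seed)
    sums = {"AB": 0.0, "BC": 0.0, "AC": 0.0, "AD": 0.0, "DC": 0.0}
    for _ in range(n):
        x = rng.uniform(-1, 1)          # the sample point omega
        # Four arbitrary [-1, 1]-valued functions of the same omega.
        A, B, C, D = x, x ** 3, math.sin(5 * x), -abs(x)
        sums["AB"] += A * B
        sums["BC"] += B * C
        sums["AC"] += A * C
        sums["AD"] += A * D
        sums["DC"] += D * C
    return {key: value / n for key, value in sums.items()}
```

Any other choice of $[-1, 1]$-valued functions of the same sample point gives means satisfying (1), (2) and (3), because each summand already satisfies the corresponding pointwise inequality.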

Remark (2) Bell's original proof, as well as almost all of the available proofs of Bell's inequality, deal only with the case of random variables assuming only the values $+1$ and $-1$. The present generalization is not without interest because it dispenses with the assumption that the classical random variables used to describe quantum observables have the same set of values as the latter ones: a hidden variable theory is required to reproduce the results of quantum theory only when the hidden parameters are averaged over.

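Theorem (3) Let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$. Then the following inequalities hold:

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| \le 1 - E(S_a^{(1)} S_c^{(1)}) \qquad (4)$$

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_c^{(1)} S_b^{(2)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \qquad (5)$$

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| + |E(S_a^{(1)} S_c^{(2)}) + E(S_c^{(1)} S_c^{(2)})| \le 2 \qquad (6)$$

Proof This is a rephrasing of Corollary (2).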


3 Implications of Bell's inequalities for the singlet correlations

To apply Bell's inequalities to the singlet correlations considered in the EPR paradox, it is enough to observe that they imply the following.

Lemma (1) In the ordinary three-dimensional euclidean space there exist sets of three unit length vectors $a, b, c$ such that it is not possible to find a probability space $(\Omega, \mathcal{F}, P)$ and six random variables $S_x^{(j)}$ ($x = a, b, c$; $j = 1, 2$) defined on $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$ whose correlations are given by

$$E(S_x^{(1)} \cdot S_y^{(2)}) = -x \cdot y, \qquad x, y = a, b, c \qquad (1)$$

where, if $x = (x_1, x_2, x_3)$, $y = (y_1, y_2, y_3)$ are two three-dimensional vectors, $x \cdot y$ denotes their euclidean scalar product, i.e. the sum $x_1 y_1 + x_2 y_2 + x_3 y_3$.

Remark In the usual EPR-type experiments the random variables $S_a^{(j)}, S_b^{(j)}, S_c^{(j)}$ represent the spin (or polarization) of particle $j$ of a singlet pair along the three directions $a, b, c$ in space. The expression in the right-hand side of (1) is the singlet correlation of two spin or polarization observables, theoretically predicted by quantum theory and experimentally confirmed by the Aspect-type experiments.

Proof Suppose that for any choice of the unit vectors $x = a, b, c$ there exist random variables $S_x^{(j)}$ as in the statement of the Lemma. Then, using Bell's inequality in the form (2.5) with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, we obtain

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \qquad (2)$$

Now notice that, if $x = y$ is chosen in (1), we obtain

$$E(S_x^{(1)} \cdot S_x^{(2)}) = -x \cdot x = -|x|^2 = -1, \qquad x = a, b, c$$

and, since $|S_x^{(j)}| \le 1$, this is possible if and only if $S_x^{(1)} = -S_x^{(2)}$ ($x = a, b, c$) $P$-almost everywhere. Using this, (2) becomes equivalent to

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(2)})$$

or, again using (1), to

$$|a \cdot b + b \cdot c| \le 1 + a \cdot c \qquad (3)$$


If the three vectors $a, b, c$ are chosen to be in the same plane and such that $a$ is perpendicular to $c$ and $b$ lies between $a$ and $c$, forming an angle $\theta$ with $a$, then the inequality (3) becomes

$$\cos\theta + \sin\theta \le 1, \qquad 0 < \theta < \pi/2 \qquad (4)$$

But the maximum of the function $\theta \mapsto \sin\theta + \cos\theta$ in the interval $[0, \pi/2]$ is $\sqrt{2}$ (obtained for $\theta = \pi/4$). Therefore, for $\theta$ close to $\pi/4$ the left-hand side of (4) will be close to $\sqrt{2}$, which is more than 1. In conclusion, for such a choice of the unit vectors $a, b, c$, random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}, S_c^{(2)}$ as in the statement of the Lemma cannot exist.
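The violation can be checked with a few lines of arithmetic. The sketch below assumes the concrete coordinates $a = (1, 0, 0)$, $c = (0, 1, 0)$ and $b = (\cos\theta, \sin\theta, 0)$, which is one choice compatible with the geometric description above; the function name is introduced here for illustration only.

```python
import math

def singlet_lhs_rhs(theta):
    """Both sides of |a.b + b.c| <= 1 + a.c for coplanar a, b, c with a perpendicular to c."""
    a = (1.0, 0.0, 0.0)
    c = (0.0, 1.0, 0.0)
    b = (math.cos(theta), math.sin(theta), 0.0)

    def dot(x, y):
        return sum(xi * yi for xi, yi in zip(x, y))

    return abs(dot(a, b) + dot(b, c)), 1 + dot(a, c)
```

At $\theta = \pi/4$ the left hand side equals $\sqrt{2}$ while the right hand side equals 1, so the inequality fails; at $\theta = 0$ both sides equal 1.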

Definition (2) A local realistic model for the EPR (singlet) correlations is defined by

(1) a probability space $(\Omega, \mathcal{F}, P)$;

(2) for every unit vector $x$ in the three-dimensional euclidean space, two random variables $S_x^{(1)}, S_x^{(2)}$ defined on $\Omega$ and with values in the interval $[-1, +1]$, whose correlations for any $x, y$ are given by equation (1).

Corollary (3) If $a, b, c$ are chosen so as to violate (4), then a local realistic model for the EPR correlations in the sense of Definition (2) does not exist.

Proof Its existence would contradict Lemma (1).

Remark In the literature one usually distinguishes two types of local realistic models: deterministic and stochastic ones. Both are included in Definition (2): the deterministic models are defined by random variables $S_x^{(j)}$ with values in the set $\{-1, +1\}$, while in the stochastic models the random variables take values in the interval $[-1, +1]$. The original paper [7] was devoted to the deterministic case. Starting from [9], several papers have been introduced to justify the stochastic models. We prefer to distinguish the definition of the models from their justification.

4 Bell on the meaning of Bell's inequality

In the last section of [8] (submitted before [7] but published after) Bell briefly describes Bohm's hidden variable interpretation of quantum theory, underlining its nonlocal character. He then raises the question that "there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character" and, in a footnote added during the proof corrections, he claims that "Since the completion of this paper such a proof has been found [7]".

In the short Introduction to [7] Bell reaffirms the same ideas, namely that the result proven by him in this paper shows that "any such [hidden variable] theory which reproduces exactly the quantum mechanical predictions must have a grossly nonlocal structure".

The proof goes along the following scheme: Bell proves an inequality in which, according to what he says (cf. the statement after formula (1) in [7]):

"The vital assumption [2] is that the result B for particle 2 does not depend on the setting a of the magnet for particle 1, nor A on b."

The paper [2] mentioned in the above statement is nothing but the Einstein, Podolsky, Rosen paper [11], and the locality issue is further emphasized by the fact that he reports Einstein's famous statement [12]: "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system $S_2$ is independent of what is done with the system $S_1$, which is spatially separated from the former."

Stated otherwise: according to Bell, Bell's inequality is a consequence of the locality assumption.

It follows that a theory which violates the above mentioned inequality also violates the vital assumption needed, according to Bell, for its deduction, i.e. locality.

Since the experiments prove the violation of this inequality, Bell concludes that quantum theory does not admit a local completion; in particular, quantum mechanics is a nonlocal theory. To use again Bell's words: "the statistical predictions of quantum mechanics are incompatible with separable predetermination" ([7], p. 199). Moreover, this incompatibility has to be understood in the sense that "in a theory in which parameters are added to quantum mechanics to determine the results of individual measurements, without changing the statistical predictions, there must be a mechanism whereby the setting of one measuring device can influence the reading of another instrument, however remote. Moreover, the signal involved must propagate instantaneously".

5 Critique of Bell's vital assumption

An assumption should be considered vital for a theorem if without it the theorem cannot be proved


To favor Bell, let us require much less. Namely, let us agree to consider his assumption vital if the theorem cannot be proved by taking as its hypothesis the negation of this assumption.

If even this minimal requirement is not satisfied, then we must conclude that the given assumption has nothing to do with the theorem.

Notice that Bell expresses his locality condition by the requirement that the result $B$ for particle 2 should not depend on the setting $a$ of the magnet for particle 1 (cf. the citation in the preceding section). Let us denote by $M_1$ ($M_2$) the space of all possible measurement settings on system 1 (2).

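Theorem (1) For each unit vector $x$ in the three-dimensional euclidean space ($x \in \mathbb{R}^3$, $|x| = 1$) let be given two random variables $S_x^{(1)}, S_x^{(2)}$ (spin of particle 1 (2) in direction $x$) defined on a space $\Omega$ with a probability $P$ and with values in the 2-point set $\{+1, -1\}$. Fix 3 of these unit vectors $a, b, c$ and suppose that the corresponding random variables satisfy the following non-locality condition [violating Bell's vital assumption]: suppose that the probability space $\Omega$ has the following structure

$$\Omega = \Lambda \times M_1 \times M_2 \qquad (1)$$

so that, for some functions $F_a^{(1)}, F_a^{(2)} : \Lambda \times M_1 \times M_2 \to [-1, 1]$,

$$S_a^{(1)}(\omega) = F_a^{(1)}(\lambda, m_1, m_2) \qquad (S_a^{(1)} \text{ depends on } m_2) \qquad (2)$$

$$S_a^{(2)}(\omega) = F_a^{(2)}(\lambda, m_1, m_2) \qquad (S_a^{(2)} \text{ depends on } m_1) \qquad (3)$$

with $m_1 \in M_1$, $m_2 \in M_2$, and similarly for $b$ and $c$ [nothing changes in the proof if we add further dependences; for example, $F_a^{(2)}$ may depend on all the $S_x^{(1)}(\omega)$ and $F_a^{(1)}$ on all the $S_x^{(2)}(\omega)$].

Then the random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the inequality

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_c^{(1)} S_b^{(2)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle \qquad (4)$$

If moreover the singlet condition

$$\langle S_x^{(1)} \cdot S_x^{(2)} \rangle = -1, \qquad x = a, b, c \qquad (5)$$

is also satisfied, then Bell's inequality holds in the form

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_c^{(1)} S_b^{(2)} \rangle| \le 1 + \langle S_a^{(1)} S_c^{(2)} \rangle \qquad (6)$$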


Proof The random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the assumptions of Corollary (2.3), therefore (4) holds. If also condition (5) is satisfied then, since the variables take values in the set $\{-1, +1\}$, with probability 1 one must have

$$S_x^{(1)} = -S_x^{(2)} \qquad (x = a, b, c) \qquad (7)$$

and therefore $\langle S_a^{(1)} S_c^{(1)} \rangle = -\langle S_a^{(1)} S_c^{(2)} \rangle$. Using this identity, (4) becomes (6).
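To illustrate Theorem (1), the sketch below builds a toy model on $\Omega = \Lambda \times M_1 \times M_2$ in which every outcome depends on both settings, and then checks inequality (4) on empirical averages. The outcome function `spin`, its numerical coefficients, the three-element setting spaces and the sampling scheme are all assumptions invented for this illustration; they are not taken from the paper.

```python
import math
import random

DIRECTIONS = "abc"

def spin(particle, x, lam, m1, m2):
    """Hypothetical +/-1 outcome depending on BOTH settings m1 and m2 (nonlocal)."""
    phase = lam + 0.37 * m1 + 0.61 * m2 + 0.13 * DIRECTIONS.index(x) + 0.5 * particle
    return 1 if math.sin(2 * math.pi * phase) >= 0 else -1

def correlations(n=20000, seed=2):
    """Empirical <S_a1 S_b2>, <S_c1 S_b2>, <S_a1 S_c1> on one common sample space."""
    rng = random.Random(seed)
    ab = cb = ac = 0
    for _ in range(n):
        # One point omega = (lam, m1, m2) of Omega = Lambda x M1 x M2.
        lam, m1, m2 = rng.random(), rng.randrange(3), rng.randrange(3)
        sa1 = spin(1, "a", lam, m1, m2)
        sc1 = spin(1, "c", lam, m1, m2)
        sb2 = spin(2, "b", lam, m1, m2)
        ab += sa1 * sb2
        cb += sc1 * sb2
        ac += sa1 * sc1
    return ab / n, cb / n, ac / n
```

Whatever nonlocal dependence is put into `spin`, inequality (4) holds for the returned averages, because the three variables are functions on the same sample space and the inequality already holds at each sample point.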

Summing up: Theorem (1) proves that Bell's inequality is satisfied if one takes as hypothesis the negation of his vital assumption. From this we conclude that Bell's vital assumption not only is not vital but in fact has nothing to do with Bell's inequality.

REMARK Using Lemma (14.1) below we can allow the observables to take values in $[-1, 1]$ also in Theorem (1).

REMARK The above discussion is not a refutation of the Bell inequality: it is a refutation of Bell's claim that his formulation of locality is an essential assumption for its validity. Since the locality assumption is irrelevant for the proof of Bell's inequality, it follows that this inequality cannot discriminate between local and nonlocal hidden variable theories, as claimed both in the introduction and in the conclusions of Bell's paper.

In particular, Theorem (1) gives an example of situations in which:

(i) Bell's locality condition is violated while his inequality is satisfied.

In a recent experiment with M. Regoli [4] we have produced examples of situations in which:

(ii) Bell's locality condition is satisfied while his inequality is violated.

6 The role of the counterfactual argument in Bell's proof

Bell uses the counterfactual argument in an essential way in his proof, because it is easy to check that formula (13) in [7] is the one which allows him to reduce, in the proof of his inequality, all considerations to the A-variables ($S_a^{(1)}$ in our notations, while Bell's B-variables are the $S_a^{(2)}$ in our notations). The pairs of chameleons (cf. section (10)) as well as the experiment of [4] provide a counterexample precisely to this formula.


7 Proofs of Bell's inequality based on counting arguments

There is a widespread illusion that the above mentioned critiques can be exorcized by restricting one's considerations to results of measurements. The following considerations show why this is an illusion.

The counting arguments usually used to prove the Bell inequality are all based on the following scheme. In the same notations used up to now, one considers $N$ simultaneous measurements of the singlet pairs of observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ and one denotes by $S_{x,\nu}^{(j)}$ the results of the $\nu$-th measurement of $S_x^{(j)}$ ($j = 1, 2$; $x = a, b, c$; $\nu = 1, \dots, N$). With these notations one can calculate the empirical correlations on the samples, that is

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{\nu=1}^{N} S_{a,\nu}^{(1)} S_{b,\nu}^{(2)} \qquad (1)$$

(and similarly for the other ones). In the Bell inequality 3 such correlations are involved:

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \qquad (2)$$

Thus in the three experiments observer 1 has to measure $S_a^{(1)}$ in the first and third experiment and $S_c^{(1)}$ in the second, while observer 2 has to measure $S_b^{(2)}$ in the first and second experiment and $S_c^{(2)}$ in the third. Therefore the directions $a$ and $b$ can be chosen arbitrarily by the two observers and it is not necessary that observer 1 is informed of the choice of observer 2 or conversely. However, the direction $c$ has to be chosen by both observers and therefore, at least on this direction, there should be a preliminary agreement among the two observers. This preliminary information can be replaced by a procedure in which each observer chooses at will the three directions and only those choices are considered for which it happens (by chance) that the second choice of observer 1 coincides with the third of observer 2 (cf. section (15) for further discussion of this point). Whichever procedure has been chosen, after the results of the experiments one can compute the 3 empirical correlations

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) S_b^{(2)}(p_j^{(1)}) \qquad (3)$$

$$\langle S_c^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) S_b^{(2)}(p_j^{(2)}) \qquad (4)$$

$$\langle S_a^{(1)} S_c^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(3)}) S_c^{(2)}(p_j^{(3)}) \qquad (5)$$

where $p_j^{(3)}$ means the $j$-th point of the 3-d experiment, etc. If we try to apply the Bell argument directly to the empirical data given by the right hand sides of (3), (4), (5), we meet the expression

$$\frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) S_b^{(2)}(p_j^{(2)}) \qquad (6)$$

from which we immediately see that, if we try to apply Bell's reasoning to the empirical data, we are stuck at the first step because we find a sum of terms of the type

$$S_a^{(1)}(p_j^{(1)}) S_b^{(2)}(p_j^{(1)}) - S_c^{(1)}(p_j^{(2)}) S_b^{(2)}(p_j^{(2)}) \qquad (7)$$

to which the inequalities among numbers of section (1) cannot be applied because in general the identity

$$S_b^{(2)}(p_j^{(1)}) = S_b^{(2)}(p_j^{(2)}) \qquad (8)$$

does not hold. More explicitly, since the expression (7) above is of the form

$$ab - b'c$$

with $a, b, b', c \in \{\pm 1\}$, the only possible upper bound for it is 2 and not $1 - ac$. Even supposing that we, in order to uphold Bell's thesis, can introduce a cleaning operation [3] (cf. [4]) which eliminates all the points in which (8) is not satisfied, we would arrive at the inequality

$$\left| \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) S_b^{(2)}(p_j^{(2)}) \right| \le 1 - \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) S_c^{(1)}(p_j^{(2)}) \qquad (9)$$

and in order to deduce from this something comparable with the experiments we need to use the counterfactual argument, assessing that

$$S_c^{(1)}(p_j^{(2)}) = -S_c^{(2)}(p_j^{(2)}) \qquad (10)$$

But in the second experiment $S_b^{(2)}$ and not $S_c^{(2)}$ has been measured. Thus to postulate the validity of (10) means to postulate that the value assumed by $S_c^{(1)}$ in the second experiment is the same that we would have found if $S_c^{(2)}$ and not $S_b^{(2)}$ had been measured. The chameleon effect provides a counterexample to this statement.
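The remark that an expression of the form $ab - b'c$ with $\pm 1$ entries is bounded only by 2, and not by $1 - ac$, can be verified by exhaustive enumeration. The sketch below lists all sign assignments for which the bound $1 - ac$ fails; the function name is an assumption made for illustration.

```python
from itertools import product

def violations():
    """All (a, b, b', c) in {-1, +1}^4 with |a*b - b'*c| > 1 - a*c."""
    return [(a, b, bp, c)
            for a, b, bp, c in product((-1, 1), repeat=4)
            if abs(a * b - bp * c) > 1 - a * c]
```

Every violating assignment has $b \ne b'$, which is exactly the situation in which the two experiments returned different values for the observable measured by observer 2; when $b = b'$ the bound $1 - ac$ always holds.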

8 The quantum probabilistic analysis

Given the results of sections (5), (6), (7), it is then legitimate to ask: if Bell's vital assumption is irrelevant for the deduction of Bell's inequality, which is the really vital assumption which guarantees the validity of this inequality?

This natural question was first answered in [1], and this result motivated the birth of quantum probability as something more than a mere noncommutative generalization of probability theory: in fact a necessity motivated by experimental data.

Theorem (2.3) has only two assumptions:

(i) that the random variables take values in the interval $[-1, +1]$;

(ii) that the random variables are defined on the same probability space.

Since we are dealing with spin variables, assumption (i) is reasonable. Let us consider assumption (ii). This is equivalent to the claim that the three probability measures $P_{ab}, P_{cb}, P_{ac}$, representing the distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ respectively, can be obtained by restriction from a single probability measure $P$ representing the distribution of the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$.

This is indeed a strong assumption because, due to the incompatibility of the spin variables along non parallel directions, the three correlations

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \qquad (1)$$

can only be estimated in different, in fact mutually incompatible, series of experiments. If we label each series of experiments by the corresponding pair (i.e. $(a, b)$, $(b, c)$, $(c, a)$), then we cannot exclude the possibility that also the probability measure in each series of experiments will depend on the corresponding pair. In other words, each of the measures $P_{ab}, P_{bc}, P_{ca}$ describes the joint statistics of a pair of commuting observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, and there is no a priori reason to postulate that all these joint distributions for pairs can be deduced from a single distribution for the quadruple $\{S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}\}$.

We have already proved in Theorem (2.3) that this strong assumption implies the validity of the Bell inequality. Now let us prove that it is the truly vital assumption for the validity of this inequality, i.e. that if this assumption is dropped, i.e. if no single distribution for quadruples exists, then it is an easy exercise to construct counterexamples violating Bell's inequality. To this goal one can use the following lemma.

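Lemma (1) Let be given three probability measures $P_{ab}, P_{ac}, P_{cb}$ on a given (measurable) space $(\Omega, \mathcal{F})$ and let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ be functions defined on $(\Omega, \mathcal{F})$ with values in the interval $[-1, +1]$ and such that the probability measure $P_{ab}$ (resp. $P_{cb}$, $P_{ac}$) is the distribution of the pair $(S_a^{(1)}, S_b^{(2)})$ (resp. $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$). For each pair define the corresponding correlation

$$\kappa_{ab} := \langle S_a^{(1)} S_b^{(2)} \rangle = \int S_a^{(1)} S_b^{(2)} \, dP_{ab} \qquad (1)$$

and suppose that, for $\varepsilon, \varepsilon' = \pm$, the joint probabilities for pairs

$$P_{xy}^{\varepsilon \varepsilon'} := P(S_x^{(1)} = \varepsilon ;\ S_y^{(2)} = \varepsilon')$$

satisfy

$$P_{xy}^{++} = P_{xy}^{--}, \qquad P_{xy}^{+-} = P_{xy}^{-+} \qquad (2)$$

$$P_x^{+} = P_x^{-} = 1/2 \qquad (3)$$

Then the Bell inequality

$$|\kappa_{ab} - \kappa_{bc}| \le 1 - \kappa_{ac} \qquad (4)$$

is equivalent to

$$|P_{ab}^{++} - P_{bc}^{++}| + P_{ac}^{++} \le \frac{1}{2} \qquad (5)$$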

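Proof. The inequality (4) is equivalent to

$$ |2P_{ab}^{++} - 2P_{ab}^{+-} - 2P_{bc}^{++} + 2P_{bc}^{+-}| \le 1 - 2P_{ac}^{++} + 2P_{ac}^{+-} \qquad (6) $$

Using the identity (equivalent to (3))

$$ P_{xy}^{+-} = \frac{1}{2} - P_{xy}^{++} \qquad (7) $$

the left hand side of (4) becomes the modulus of

$$ 2(P_{ab}^{++} - P_{ab}^{+-}) - 2(P_{bc}^{++} - P_{bc}^{+-}) = 2\left(P_{ab}^{++} - \tfrac{1}{2} + P_{ab}^{++}\right) - 2\left(P_{bc}^{++} - \tfrac{1}{2} + P_{bc}^{++}\right) = 4(P_{ab}^{++} - P_{bc}^{++}) \qquad (8) $$

and, again using (7), the right hand side of (6) is equal to

$$ 1 - 2\left(P_{ac}^{++} - \tfrac{1}{2} + P_{ac}^{++}\right) = 2 - 4P_{ac}^{++} \qquad (9) $$

Summing up, (4) is equivalent to

$$ |P_{ab}^{++} - P_{bc}^{++}| \le \frac{1}{2} - P_{ac}^{++} \qquad (10) $$

which is (5).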
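The equivalence of (4) and (5) can be checked mechanically. The following is only a sketch under the assumptions of the Lemma: each pair measure is parametrized by its single free number $P^{++} \in [0, 1/2]$, and the function names (`correlation`, `bell_holds`, `probability_form_holds`) are illustrative, not taken from the text. Exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def correlation(p_pp):
    """kappa_xy under conditions (2)-(3): 2*P++ - 2*P+-, with P+- = 1/2 - P++."""
    p_pm = HALF - p_pp
    return 2 * p_pp - 2 * p_pm  # algebraically equal to 4*P++ - 1

def bell_holds(p_ab, p_bc, p_ac):
    """Inequality (4): |kappa_ab - kappa_bc| <= 1 - kappa_ac."""
    return abs(correlation(p_ab) - correlation(p_bc)) <= 1 - correlation(p_ac)

def probability_form_holds(p_ab, p_bc, p_ac):
    """Inequality (5): |P_ab++ - P_bc++| + P_ac++ <= 1/2."""
    return abs(p_ab - p_bc) + p_ac <= HALF

# Grid of admissible values P++ in {0, 1/20, ..., 1/2}.
grid = [Fraction(k, 20) for k in range(11)]
triples = list(product(grid, repeat=3))

mismatches = [t for t in triples if bell_holds(*t) != probability_form_holds(*t)]
violations = [t for t in triples if not bell_holds(*t)]
print(len(mismatches), len(violations))
```

On this grid the two inequalities agree at every point, while a sizeable part of the grid violates both; this is what the corollary below exploits.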

Corollary (2) There exist triples of $P_{ab}$, $P_{ac}$, $P_{cb}$ on the 4-point space $\{+1, -1\} \times \{+1, -1\}$ which satisfy conditions (2), (3) of Lemma (1) and are not compatible with any probability measure $P$ on the 6-point space $\{+1, -1\} \times \{+1, -1\} \times \{+1, -1\}$.

Proof. Because of conditions (2), (3), the probability measures $P_{ab}$, $P_{ac}$, $P_{cb}$ are uniquely determined by the three numbers

$$ P_{ab}^{++},\ P_{ac}^{++},\ P_{cb}^{++} \in [0, 1/2] \qquad (11) $$

Thus, if we choose these three numbers so that the inequality (5) is not satisfied, the Bell inequality (4) cannot be satisfied because of Lemma (1).
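A concrete instance of such a choice may help. The sketch below is an illustration only: the values $P_{ab}^{++} = 1/2$, $P_{cb}^{++} = 0$, $P_{ac}^{++} = 1/2$ and the helper names are chosen here, not in the text. It builds the three 2x2 tables from conditions (2) and (3) and evaluates both sides of (4).

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def pair_measure(p_pp):
    """Table on {+1,-1} x {+1,-1} fixed by P++ through conditions (2) and (3)."""
    p_pm = HALF - p_pp
    return {(+1, +1): p_pp, (-1, -1): p_pp, (+1, -1): p_pm, (-1, +1): p_pm}

def correlation(measure):
    """Expectation of the product of the two +/-1 valued variables."""
    return sum(x * y * w for (x, y), w in measure.items())

def first_marginal(measure):
    """Probability that the first variable equals +1."""
    return sum(w for (x, _), w in measure.items() if x == +1)

P_ab = pair_measure(HALF)         # S_a^(1) and S_b^(2) always equal
P_cb = pair_measure(Fraction(0))  # S_c^(1) and S_b^(2) always opposite
P_ac = pair_measure(HALF)         # S_a^(1) and S_c^(2) always equal

k_ab, k_cb, k_ac = (correlation(m) for m in (P_ab, P_cb, P_ac))
lhs, rhs = abs(k_ab - k_cb), 1 - k_ac
print(lhs, rhs)  # 2 0
```

Each table is a legitimate probability measure with uniform marginals, yet the left hand side of (4) is 2 while the right hand side is 0, so by Lemma (1) no single measure can reproduce the three tables.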

9 The realism of ballot boxes and the corresponding statistics

The fact that there is no a priori reason to postulate that the joint distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ can be deduced from a single distribution for the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ does not necessarily mean that such a common joint distribution does not exist.

On the contrary in several physically meaningful situations we have good reasons to expect that such a joint distribution should exist even if it might not be accessible to direct experimental verification

This is a simple consequence of the so-called hypothesis of realism, which is justified whenever we are entitled to believe that the results of our measurements are pre-determined. In the words of Bell: "Since we can predict in advance the result of measuring any chosen component of $\sigma_2$, by previously measuring the same component of $\sigma_1$, it follows that the result of any such measurement must actually be predetermined."

Consider for example a box containing pairs of balls. Suppose that the experiments allow one to measure either the color, or the weight, or the material of which each ball is made, but the rules of the game are that on each ball only one measurement at a time can be performed. Suppose moreover that the experiments show that, for each property, only two values are realized and that, whenever a simultaneous measurement of the same property on the two elements of a pair is performed, the resulting answers are always discordant. Up to a change of convention and in appropriate units, we can always suppose that these two values are $\pm 1$, and we shall do so in the following.

Then the joint distributions of pairs (of properties relative to different balls) are accessible to experiment but those of triples or quadruples are not

Nevertheless it is reasonable to postulate that in the box there is a well defined (although purely Platonic, in the sense of not being accessible to experiment) number of balls with each given color, weight and material. These numbers give the relative frequencies of triples of properties for each element of the pair, hence, using the perfect anticorrelation, a family of joint probabilities for all the possible sextuples. More precisely, due to the perfect anticorrelation, the relative frequencies of the triples of properties

$$ [S_a^{(1)} = a_1],\ [S_b^{(1)} = b_1],\ [S_c^{(1)} = c_1] $$

where $a_1, b_1, c_1 = \pm 1$, are equal to the relative frequencies of the sextuples of properties

$$ [S_a^{(1)} = a_1],\ [S_b^{(1)} = b_1],\ [S_c^{(1)} = c_1],\ [S_a^{(2)} = -a_1],\ [S_b^{(2)} = -b_1],\ [S_c^{(2)} = -c_1] $$

and since we are confining ourselves to the case of 3 properties and 2 particles, the above ones, when $a_1, b_1, c_1$ vary in all possible ways in the set $\{\pm 1\}$, are all the possible configurations. In this situation the counterfactual argument is applicable, and in fact we have used it to deduce the joint distribution of sextuples from the joint distributions of triples.
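The ballot box construction can be mimicked in a few lines. This is a hypothetical sketch, not a model taken from the text: the box is filled with uniformly random triples, properties are indexed 0, 1, 2 for $a, b, c$, and the inequality tested at the end is written in the form of Bell's 1964 paper, $|E(a,b) - E(a,c)| \le 1 + E(b,c)$, which is the form that follows pointwise from perfect anticorrelation.

```python
import random
from collections import Counter
from fractions import Fraction

random.seed(0)

# Each pair is fixed by the triple (a1, b1, c1) of ball 1;
# ball 2 carries the opposite values (perfect anticorrelation).
triples = [tuple(random.choice((+1, -1)) for _ in range(3)) for _ in range(10_000)]
box = [(t, tuple(-v for v in t)) for t in triples]

triple_freq = Counter(t for t, _ in box)
sextuple_freq = Counter(t1 + t2 for t1, t2 in box)

def E(x, y):
    """Exact correlation <S_x^(1) S_y^(2)>: property x on ball 1, property y on ball 2."""
    return Fraction(sum(b1[x] * b2[y] for b1, b2 in box), len(box))

A, B, C = 0, 1, 2
bell_1964_holds = abs(E(A, B) - E(A, C)) <= 1 + E(B, C)
print(E(A, A), bell_1964_holds)
```

Every sextuple frequency coincides with the frequency of the triple that generates it, and since all pair correlations come from one and the same list of pairs, the inequality is satisfied whatever the composition of the box.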


10 The realism of chameleons and the corresponding statistics

According to the quantum probabilistic interpretation, what Einstein, Podolsky, Rosen, Bell and several others who have discussed this topic call the hypothesis of realism should be called, in a more precise way, the hypothesis of the ballot box realism, as opposed to the hypothesis of the chameleon realism.

The point is that, according to the quantum probabilistic interpretation, the term predetermined should not be confused with the term realized a priori, which has been discussed in section (9): it might be conditionally deduced according to the scheme: if such and such will happen, I will react so and so.

The chameleon provides a simple example of this distinction: a chameleon becomes deterministically green on a leaf and brown on a log. In this sense we can surely claim that its color on a leaf is predetermined. However this does not mean that the chameleon was green also before jumping on the leaf.

The chameleon metaphor describes a mechanism which is perfectly local, even deterministic, and surely classical and macroscopic; moreover there are no doubts that the situation it describes is absolutely realistic. Yet this realism, being different from the ballot box realism, allows one to free from metaphysics statements of the orthodox interpretation such as: the act of measurement creates the value of the measured observable. To many this looks like metaphysics or magic, but look how natural it sounds when you think of the color of a chameleon.

Finally, and most important for its implications for the EPR argument, the chameleon realism provides a simple and natural counterexample of a situation in which the results are predetermined but the counterfactual argument is not applicable.

Imagine in fact a box in which there are many pairs of chameleons. In each pair there is exactly one healthy chameleon, which becomes green on a leaf and brown on a log, and one mutant, which becomes brown on a leaf and green on a log; moreover exactly one of the chameleons in each pair weighs 100 grams and exactly one 200 grams. A measurement consists in separating the members of each pair, each one in a smaller box, and in performing one and only one measurement on each member of each pair.

The color on the leaf, color on the log and weight are 2-valued observables (because we do not know a priori if we are measuring the healthy or the mutant chameleon). Thus, with respect to the observables color on the leaf, color on the log and weight, the pairs of chameleons behave exactly as EPR pairs: whenever the same observable is measured on both elements of a pair, the results are opposite. However, suppose I measure the color on the leaf of one element of a pair and the weight of the other one, and suppose the answers I find are green and 100 grams. Can I conclude that the second element of the pair is brown and weighs 100 grams? Clearly not, because there is no reason to believe that the second member of the pair, of which the weight was measured while in a box, was also on a leaf.

From this point of view the measurement interaction enters the very definition of an observable. However, also in this interpretation, which is more similar to the quantum mechanical situation, the counterfactual argument cannot be applied, because it amounts to answering brown to the question: which is the color on the leaf, if I have measured the weight and if I know that the chameleon is the mutant one (this because the measurement of the other one gave green on the leaf)? But this answer is not correct, because it could well be that inside the box there is a leaf and the chameleon is interacting with it while I am measuring its weight, but it could also be that it is interacting with a log, also contained inside the box, in which case, being a mutant, it would be green.
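The chameleon box is easy to simulate, and doing so makes the failure of the counterfactual step explicit. The sketch below is only illustrative: the labels `"healthy"`, `"mutant"`, `"leaf"`, `"log"` and the helper names are choices made here, following the verbal description above.

```python
import random

random.seed(1)
GREEN, BROWN = "green", "brown"

def color(kind, support):
    """Deterministic response of a chameleon to the support it sits on."""
    healthy = {"leaf": GREEN, "log": BROWN}
    mutant = {"leaf": BROWN, "log": GREEN}
    return (healthy if kind == "healthy" else mutant)[support]

def make_pair():
    """One healthy and one mutant animal, one of 100 g and one of 200 g."""
    kinds = random.sample(["healthy", "mutant"], 2)
    weights = random.sample([100, 200], 2)
    return list(zip(kinds, weights))  # [(kind, weight) of member 1, of member 2]

pairs = [make_pair() for _ in range(1000)]

# Same observable measured on both members: the answers are always opposite.
opposite = all(
    color(k1, "leaf") != color(k2, "leaf")
    and color(k1, "log") != color(k2, "log")
    and w1 != w2
    for (k1, w1), (k2, w2) in pairs
)

# Member 1 was green on a leaf, so member 2 is the mutant. Its color while it
# is being weighed still depends on what it sits on inside the box.
possible_colors = {color("mutant", support) for support in ("leaf", "log")}
print(opposite, sorted(possible_colors))
```

The pairs reproduce the EPR-type anticorrelations for equal observables, while the color of the weighed animal remains undetermined until its environment is specified: each answer is predetermined given the measurement context, but not realized a priori.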

Therefore, if we can produce an example of a 2-particle system in which the Heisenberg evolution of each particle's observable satisfies Bell's locality condition, but the Schroedinger evolution of the state, i.e. the expectation value $\langle \cdot \rangle$, depends on the pair $(a, b)$ of measured observables, we can claim that this counterexample abides by the same definition of locality as Bell's theorem.

11 Bell's inequalities and the chameleon effect

Definition (1) Let $S$ be a physical system and $O$ a family of observable quantities relative to this system. We say that the chameleon effect is realized on $S$ if, for any measurement $M$ of an observable $A \in O$, the dynamical evolution of $S$ depends on the observable $A$. If $D$ denotes the state space of $S$, this means that the change of state from the beginning to the end of the experiment is described by a map (a one-parameter group or semigroup in the case of continuous time)

$$ T_A : D \to D $$

Remark. The explicit form of the dependence of $T_A$ on $A$ depends on both the system and the measurement, and many concrete examples can be constructed. An example in the quantum domain is discussed in [3], and the experiment of [4] realizes an example in the classical domain.

Remark. If the system $S$ is composed of two subsystems $S_1$ and $S_2$, we can also consider the case in which the evolutions of the two subsystems are different, in the sense that for system 1 we have one form of functional dependence $T_A^{(1)}$ of the evolution associated to the observable $A$, and for system 2 we have another form of functional dependence $T_A^{(2)}$. In the experiment of [4] the state space is the unit disk $D$ in the plane, the observables are parametrized by angles in $[0, 2\pi)$ (or equivalently by unit vectors in the unit circle) and for each observable $S_a^{(1)}$ of system 1

and for each observable $S_a^{(2)}$ of system 2

where $R_a$ denotes (counterclockwise) rotation of an angle $a$.

Let us consider Bell's inequalities by assuming that a chameleon effect is present. Denoting $E$ the common initial state of the composite system (1,2) (e.g. singlet state), the state at the end of the measurement will be

Now replace $S_x^{(j)}$ by

$$ \tilde S_x^{(j)} := S_x^{(j)} \circ T_x^{(j)} $$

Since the $\tilde S_x^{(j)}$ take values $\pm 1$, we know from Theorem (23) that, if we postulate the existence of joint probabilities for the triple $\tilde S_a^{(1)}, \tilde S_b^{(2)}, \tilde S_c^{(1)}$ compatible with the two correlations $E(\tilde S_a^{(1)} \tilde S_b^{(2)})$, $E(\tilde S_c^{(1)} \tilde S_b^{(2)})$, then the inequality

$$ |E(\tilde S_a^{(1)} \tilde S_b^{(2)}) - E(\tilde S_c^{(1)} \tilde S_b^{(2)})| \le 1 - E(\tilde S_a^{(1)} \tilde S_c^{(1)}) $$

holds, and if we also have the singlet condition

$$ E\big(S_x^{(1)}(T_x^{(1)} p)\, S_x^{(2)}(T_x^{(2)} p)\big) = -1 \qquad (1) $$

then a.e.

and we have Bell's inequality. Thus, if we postulate the same probability space, even the chameleon effect alone is not sufficient to guarantee violation of Bell's inequality.

Therefore the fact that the three experiments are done on different and incompatible samples must play a crucial role

As far as the chameleon effect is concerned, let us notice that, in the above statement of the problem, the fact that we use a single initial probability measure $E$ is equivalent to postulating that at time $t = 0$ the three pairs of observables

$$ (S_a^{(1)}, S_b^{(2)}), \quad (S_c^{(1)}, S_b^{(2)}), \quad (S_a^{(1)}, S_c^{(2)}) $$

admit a common joint distribution, in fact $E$.

12 Physical implausibility of Bell's argument

In this section we show that, combining the chameleon effect with the fact that the three experiments refer to different samples, even in very simple situations no cleaning conditions can lead to a proof of Bell's inequality.

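If we try to apply Bell's reasoning to the empirical data, we have to start from the expression

$$ \left| \frac{1}{N} \sum_j S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - \frac{1}{N} \sum_j S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \right| \qquad (1) $$

which we majorize by

$$ \frac{1}{N} \sum_j \left| S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \right| \qquad (2) $$

But if we try to apply the inequality among numbers to the expression

$$ \left| S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \right| \qquad (3) $$

we see that we are not dealing with the situation covered by Corollary (12), i.e.

$$ |ab - cb| \le 1 - ac \qquad (4) $$

because, since

$$ S_b^{(2)}(T_b^{(2)} p_j^{(1)}) \ne S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \qquad (5) $$

the left hand side of (4) must be replaced by

$$ |ab - cb'| \qquad (6) $$

whose maximum for $a, b, c, b' \in [-1, +1]$ is 2 and not $1 - ac$.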

Bell's implicit assumption of the single probability space is equivalent to the postulate that, for each $j = 1, \dots, N$,

$$ p_j^{(1)} = p_j^{(2)} \qquad (7) $$

Physically this means that the hidden parameter in the first experiment is the same as the hidden parameter in the second experiment. This is surely a very implausible assumption. Notice however that, without this assumption, Bell's argument cannot be carried through and we cannot deduce the inequality, because we must stop at equation (2).

13 The role of the single probability space in CHSH's proof

Clauser, Horne, Shimony, Holt [9] introduced the variant (26) of the Bell inequality for quadruples $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$, which is based on the following inequality among numbers $a, b, b', a' \in [-1, 1]$:

$$ |ab + ab' + a'b - a'b'| \le 2 \qquad (1) $$

Section (1) already contains a proof of (1). A direct proof follows from

$$ |b + b'| + |b - b'| \le 2 \qquad (2) $$

because

$$ |ab + ab' + a'b - a'b'| = |a(b + b') + a'(b - b')| \le |a| \cdot |b + b'| + |a'| \cdot |b - b'| \le |b + b'| + |b - b'| \le 2 $$

The proof of (2) is obvious.

Remark (1) Notice that an inequality of the form

$$ |a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| \le 2 \qquad (3) $$

would be obviously false. In fact, for example, the choice

$$ a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = b_4 = 1, \qquad a_4 = -1 $$

would give

$$ |a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| = 4 $$

That is, for the validity of (1) it is absolutely essential that the number $a$ is the same in the first and the second term, and similarly for $a'$ in the 3rd and the 4th, $b'$ in the 2nd and the 4th, $b$ in the first and the 3rd.
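Both bounds can be confirmed by brute force. Since the two expressions are linear in each variable separately, their maxima over $[-1, 1]$ are attained at the vertices $\pm 1$, so it is enough to enumerate sign choices. The variable names below are illustrative.

```python
from itertools import product

signs = (+1, -1)

# Shared numbers as in (1): a and a' each appear twice, and so do b and b'.
shared_max = max(
    abs(a * b + a * b2 + a2 * b - a2 * b2)
    for a, a2, b, b2 in product(signs, repeat=4)
)

# Eight unrelated numbers as in (3).
free_max = max(
    abs(a1 * b1 + a2 * b2 + a3 * b3 - a4 * b4)
    for a1, b1, a2, b2, a3, b3, a4, b4 in product(signs, repeat=8)
)

print(shared_max, free_max)  # 2 4
```

The bound 2 is therefore a property of the shared variables, not of the four products taken separately.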

This inequality among numbers can be extended to pairs of random variables by introducing the following postulates:

(P1) Instead of four numbers $a, b, b', a' \in [-1, 1]$, one considers four functions

$$ S_a^{(1)},\ S_b^{(2)},\ S_{a'}^{(1)},\ S_{b'}^{(2)} $$

all defined on the same space $\Lambda$ (whose points are called hidden parameters) and with values in $[-1, 1]$.

(P2) One postulates that there exists a probability measure $P$ on $\Lambda$ which defines the joint distribution of each of the following four pairs of functions

$$ (S_a^{(1)}, S_b^{(2)}), \quad (S_a^{(1)}, S_{b'}^{(2)}), \quad (S_{a'}^{(1)}, S_b^{(2)}), \quad (S_{a'}^{(1)}, S_{b'}^{(2)}) \qquad (4) $$

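Remark (2) Notice that (P2) automatically implies that the joint distributions of the four pairs of functions can be deduced from a joint distribution of the whole quadruple, i.e. the existence of a single Kolmogorov model for these four pairs.

With these premises, for each $\lambda \in \Lambda$ one can apply the inequality (1) to the four numbers

$$ S_a^{(1)}(\lambda),\ S_b^{(2)}(\lambda),\ S_{a'}^{(1)}(\lambda),\ S_{b'}^{(2)}(\lambda) $$

and deduce that

$$ |S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda)| \le 2 \qquad (5) $$

From this, taking $P$-averages, one obtains

$$ \left| \langle S_a^{(1)} S_b^{(2)} \rangle + \langle S_a^{(1)} S_{b'}^{(2)} \rangle + \langle S_{a'}^{(1)} S_b^{(2)} \rangle - \langle S_{a'}^{(1)} S_{b'}^{(2)} \rangle \right| = \qquad (6) $$

$$ = \left| \int \left( S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right) dP(\lambda) \right| \le \qquad (7) $$

$$ \le \int \left| S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right| dP(\lambda) \le 2 \qquad (8) $$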

Remark (3) Notice that in the step from (6) to (7) we have used in an essential way the existence of a joint distribution for the whole quadruple, i.e. the fact that all these random variables can be realized in the same probability space.

In EPR type experiments we are interested in the case in which the four pairs $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$ come from four mutually incompatible experiments. Let us assume that there is a hidden parameter determining the result of each of these experiments. This means that we interpret the number $S_a^{(1)}(\lambda)$ as the value of the spin of particle 1 in direction $a$, determined by the hidden parameter $\lambda$.

There is obviously no reason to postulate that the hidden parameter determining the result of the first experiment is exactly the same one which determines the result of the second experiment. However, when CHSH consider the quantity (5), they are implicitly making the much stronger assumption that the same hidden parameter $\lambda$ determines the results of all four experiments. This assumption is quite unreasonable from the physical point of view, and in any case it is a much stronger assumption than simply postulating the existence of hidden parameters. The latter assumption would allow CHSH only to consider the expression

$$ S_a^{(1)}(\lambda_1) S_b^{(2)}(\lambda_1) + S_a^{(1)}(\lambda_2) S_{b'}^{(2)}(\lambda_2) + S_{a'}^{(1)}(\lambda_3) S_b^{(2)}(\lambda_3) - S_{a'}^{(1)}(\lambda_4) S_{b'}^{(2)}(\lambda_4) \qquad (9) $$

and, as shown in Remark (1) above, the maximum of this expression is not 2 but 4, and this does not allow one to deduce the Bell inequality.

14 The role of the counterfactual argument in CHSH's proof

Contrary to Bell's original argument, the CHSH proof of the Bell inequality does not explicitly use the counterfactual argument. The fact that one can perform experiments also on quadruples, rather than on triples as originally proposed by Bell, has led some authors to claim that the counterfactual argument is not essential in the deduction of the Bell inequality. However we have just seen in section (7) that the hidden assumption of Bell's proof, i.e. the realizability of all the random variables involved in the same probability space, is also present in the CHSH argument. The following lemma shows that, under the singlet assumption, the conclusion of the counterfactual argument follows from the hidden assumption of Bell and of CHSH.

Lemma (1) If $f$ and $g$ are random variables defined on a probability space $(\Lambda, P)$ and with values in $[-1, 1]$, then

$$ \langle fg \rangle := \int_\Lambda fg \, dP = -1 $$

if and only if

$$ P(fg = -1) = 1 $$

Proof. If $P(fg > -1) > 0$, then

$$ \int_\Lambda fg \, dP = -P(fg = -1) + \int_{fg > -1} fg \, dP > -P(fg = -1) - P(fg > -1) = -1 $$

Corollary (2) Suppose that all the random variables in (x3) are realized in the same probability space. Then, if the singlet condition

$$ \langle S_x^{(1)} S_x^{(2)} \rangle = -1 \qquad (1) $$

is satisfied, the condition

$$ S_x^{(1)} = -S_x^{(2)} \qquad (2) $$

(i.e. formula (13) in Bell's '64 paper) is true almost everywhere.

Proof. Follows from Lemma (1) with the choice $f = S_x^{(1)}$, $g = S_x^{(2)}$.

Summing up: if you want to compare the predictions of a hidden variable theory with quantum theory in the EPR experiment (so that at least we admit the validity of the singlet law), then the hidden assumption of realizability of all the random variables in (3) in the same probability space (without which Bell's inequality cannot be proved) implies the same conclusion as the counterfactual argument. Stated otherwise: the counterfactual argument is implicit when you postulate the singlet condition and the realizability on a single probability space. It does not matter if you use triples or quadruples.

15 Physical difference between the CHSH and the original Bell inequalities

In the CHSH scheme

$$ (a, b), \quad (a', b), \quad (a, b'), \quad (a', b') $$

the agreement required by the experimenters is the following:
- 1 will measure the same observable in experiments I and III, and the same observable in experiments II and IV;
- 2 will measure the same observable in experiments I and II, and the same observable in experiments III and IV.

Here there is no a priori restriction on the choice of the observables to be measured.

In the Bell scheme the experimentalists agree that:
- 1 measures the same observable in experiments I and III;
- 2 measures the same observable in experiments I and II;
- 1 and 2 choose a priori, i.e. before the experiment begins, a direction $c$ and agree that 1 will measure spin in direction $c$ in experiment II and 2 will measure spin in direction $c$ in experiment III (strong agreement).

The strong agreement can be replaced by the following (weak agreement):
- 1 and 2 choose a priori, i.e. before the experiment begins, a finite set of directions $c_1, \dots, c_K$ and agree that 1 will measure spin in a direction chosen randomly among the directions $c_1, \dots, c_K$ in experiment II and 2 will do the same in experiment III.

In this scheme there is an a priori restriction on the choice of some of the observables to be measured.

If the directions fixed a priori in the plane are $K$, then the probability of a coincidence, corresponding to a totally random (equiprobable) choice, is

$$ P(c^{(1)} = c^{(2)}) = \sum_{\alpha=1}^{K} P(c^{(1)} = c_\alpha,\ c^{(2)} = c_\alpha) = \sum_{\alpha=1}^{K} \frac{1}{K^2} = \frac{1}{K} $$

This shows that, contrary to the CHSH scheme, the choice has to be restricted to a finite number of possibilities, otherwise the probability of coincidence will be zero.
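The coincidence probability can be reproduced by direct enumeration. The sketch assumes, as in the weak agreement, that the two experimenters choose independently and uniformly among $K$ directions; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product

def coincidence_probability(K):
    """Probability that two independent uniform choices among K directions coincide."""
    p_each = Fraction(1, K * K)  # probability of any ordered pair of choices
    return sum(p_each for i, j in product(range(K), repeat=2) if i == j)

for K in (1, 2, 5, 100):
    print(K, coincidence_probability(K))
```

The result is $1/K$ for every $K$, which tends to zero as the set of admissible directions grows, in agreement with the remark above.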

From this point of view we can claim that the Clauser, Horne, Shimony, Holt formulation of Bell's inequalities realizes a small improvement with respect to Bell's original formulation.

Reproduction of the EPR correlations by the chameleon effect

Consider a classical dynamical system composed of two particles (1,2). Let $S$ denote the state space of each of the particles and suppose that, at time $t = t_0$ (initial time), the state $\mu_1^0$ of particle 1 and the state $\mu_2^0$ of particle 2 coincide:

$$ \mu_1^0 = \mu_2^0 \qquad (1) $$

Starting from time $t_0$, the two particles begin to move in opposite directions and, after a time interval of length $T$, two independent and non-communicating experimenters simultaneously perform a measurement on each particle.

Experimenter 1 (resp. 2) can choose among three different measurements, corresponding to the observables

$$ S_a^{(1)},\ S_b^{(1)},\ S_c^{(1)} \quad (\text{resp. } S_a^{(2)},\ S_b^{(2)},\ S_c^{(2)}) \qquad (2) $$

of particle 1 (resp. particle 2). We suppose that both particles satisfy the chameleon effect described by the following definition.

DEFINITION (1) Let $S$ be the state space of a dynamical system $\sigma$, let $I$ be a set and, for each $x \in I$, let be given a function

$$ S_x : S \to \mathbb{R}, \qquad x \in I \qquad (3) $$

representing an observable of the system. The system $\sigma$ is said to realize the chameleon effect with respect to the observables (3) if, whenever the observable $S_x$ is measured, the dynamical evolution of the system

$$ T_x : S \to S, \qquad x \in I \qquad (4) $$

depends on the measured observable $S_x$.

In our case we consider only two instants of time, the initial one and the one when the measurement takes place, and we omit time from our notations. Moreover in our case we have two particles and each particle is far away from the other one, hence it can only feel the interaction with the measurement apparatus near to it. So, combining the locality principle with the chameleon effect, we conclude that, if experimenter 1 (resp. 2) chooses to measure the observable $S_x^{(1)}$ (resp. $S_y^{(2)}$), then particle 1 (resp. 2) will evolve according to the dynamics

$$ T_{1,x} \quad (\text{resp. } T_{2,y}) \qquad (5) $$

In our case the variables $x, y$ can be any element of the set $\{a, b, c\}$.

Suppose that experimenter 1 chooses to measure $S_a^{(1)}$ and experimenter 2 chooses to measure $S_b^{(2)}$.

Let $\mu_1$ (resp. $\mu_2$) denote the final state, i.e. the state at the time when the measurement occurs, of particle 1 (resp. 2). Condition (1) is then equivalent to

$$ T_{1,a}^{-1} \mu_1 = T_{2,b}^{-1} \mu_2 \qquad (6) $$

The empirical correlations of the measurements will then be

$$ \frac{1}{N} \sum S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, \chi(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2) \qquad (7) $$

where $\chi(\cdot)$ is a $\delta$-like factor taking into account the fact that only the configurations satisfying condition (6) give a non-zero contribution to the correlations.

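Now suppose that the state space $S$ is the real line $\mathbb{R}$. Thus the empirical correlations (7) are

$$ \kappa_{ab} = Z \int\!\!\int S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, \delta(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2)\, d\mu_1\, d\mu_2 \qquad (8) $$

where $Z$ is a normalization constant. With the change of variables

$$ T_{1,a}^{-1} \mu_1 = \lambda_1, \qquad T_{2,b}^{-1} \mu_2 = \lambda_2 \qquad (9) $$

(8) becomes

$$ Z \int\!\!\int S_a^{(1)}(T_{1,a} \lambda_1)\, S_b^{(2)}(T_{2,b} \lambda_2)\, \delta(\lambda_1 - \lambda_2)\, dT_{1,a}(\lambda_1)\, dT_{2,b}(\lambda_2) \qquad (10) $$

Now introduce the notations

$$ S_x^{(j)}(T_{j,x} \lambda) =: \tilde S_x^{(j)}(\lambda), \qquad j = 1, 2, \quad x = a, b \qquad (11) $$

With these notations, supposing, as is always possible, that $T'_{1,a}(\lambda_1),\ T'_{2,b}(\lambda_2) > 0$, (10) becomes

$$ Z \int\!\!\int \tilde S_a^{(1)}(\lambda_1)\, \tilde S_b^{(2)}(\lambda_2)\, \delta(\lambda_1 - \lambda_2)\, T'_{1,a}(\lambda_1)\, T'_{2,b}(\lambda_2)\, d\lambda_1\, d\lambda_2 = Z \int \tilde S_a^{(1)}(\lambda)\, \tilde S_b^{(2)}(\lambda)\, T'_{1,a}(\lambda)\, T'_{2,b}(\lambda)\, d\lambda $$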

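Now let us make the following choices:

$$ \lambda \in [0, 2\pi] \iff \mathrm{supp}\, \tilde S^{(j)} \subseteq [0, 2\pi] \qquad (12) $$

$$ Z = \frac{1}{2\pi} \qquad (13) $$

$$ T'_{2,b} = \sqrt{2\pi} \qquad (14) $$

$$ T'_{1,a}(\lambda) = \frac{\sqrt{2\pi}}{4}\, |\cos(\lambda - a)| \qquad (15) $$

$$ \tilde S_x^{(1)}(\lambda) = \mathrm{sgn}(\cos(\lambda - x)), \qquad \tilde S_x^{(2)} = -\tilde S_x^{(1)} \qquad (16) $$

With these choices the correlations (8) become

$$ \langle S_a^{(1)} S_b^{(2)} \rangle = -\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, \frac{1}{4}\, |\cos(\lambda - a)|\, d\lambda \qquad (17) $$

$$ = -\frac{1}{4} \int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - b))\, \cos(\lambda - a)\, d\lambda = -\cos(b - a) = -a \cdot b $$

which are the EPR correlations.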
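The value of the integral (17) can be confirmed numerically. The sketch below uses a plain midpoint rule; the function name and the number of subintervals are arbitrary choices made here, and the agreement is only approximate because the integrand has sign jumps.

```python
import math

def sgn(x):
    """Sign of x as an integer in {-1, 0, +1}."""
    return (x > 0) - (x < 0)

def chameleon_correlation(a, b, n=100_000):
    """Midpoint-rule estimate of
    -(1/4) * integral over [0, 2*pi] of sgn(cos(l-a)) * sgn(cos(l-b)) * |cos(l-a)| dl."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        total += sgn(math.cos(lam - a)) * sgn(math.cos(lam - b)) * abs(math.cos(lam - a))
    return -0.25 * total * h

for a, b in [(0.0, 0.0), (0.0, math.pi / 4), (0.3, 2.0), (1.0, 1.0 + math.pi)]:
    print(round(chameleon_correlation(a, b), 4), round(-math.cos(b - a), 4))
```

For each pair of angles the two printed numbers agree to the displayed precision, consistent with the closed form $-\cos(b - a)$ derived above.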

References

1. L. Accardi, Phys. Rep. 77, 169-192 (1981).

2. L. Accardi, Urne e camaleonti. Dialogo sulla realta, le leggi del caso e la teoria quantistica (Il Saggiatore, 1997). Japanese translation: Maruzen (2000); Russian translation, ed. by Igor Volovich (PHASIS Publishing House, 2000); English translation by Daniele Tartaglia, to appear.

3. L. Accardi, On the EPR paradox and the Bell inequality, Volterra Preprint N. 350 (1998).

4. L. Accardi, M. Regoli, Quantum probability and the interpretation of quantum mechanics: a crucial experiment. Invited talk at the workshop "The applications of mathematics to the sciences of nature: critical moments and aspects", Arcidosso, June 28-July 1 (1999). To appear in the proceedings of the workshop. Preprint Volterra N. 399 (1999).

5. L. Accardi, M. Regoli, Local realistic violation of Bell's inequality: an experiment. Conference given by the first-named author at the Dipartimento di Fisica, Universita di Pavia, on 24-02-2000. Preprint Volterra N. 402.

6. L. Accardi, M. Regoli, Non-locality and quantum theory: new experimental evidence. Invited talk given by the first-named author at the Conference "Quantum paradoxes", University of Nottingham, on 4-05-2000. Preprint Volterra N. 421.

7. J. S. Bell, Physics 1, 3, 195-200 (1964).

8. J. S. Bell, Rev. Mod. Phys. 38, 447-452 (1966).

9. J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Phys. Rev. Letters 49, 1804-1806 (1969); J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge Univ. Press, 1987).

10. J. F. Clauser, M. A. Horne, Phys. Rev. D 10, 2 (1974).

11. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).

12. A. Einstein, in Albert Einstein: Philosopher Scientist, edited by P. A. Schilpp, Library of Living Philosophers (Evanston, Illinois, 1949).

Refutation of Bell's Theorem

Guillaume ADENIER
Louis Pasteur University, Strasbourg, France

E-mail guillaumeadenierulpu-strasbgfr

Bell's Theorem was developed on the basis of considerations involving a linear combination of spin correlation functions, each of which has a distinct pair of arguments. The simultaneous presence of these different pairs of arguments in the same equation can be understood in two radically different ways: either as strongly objective, that is, all correlation functions pertain to the same set of particle pairs, or as weakly objective, that is, each correlation function pertains to a different set of particle pairs. It is demonstrated that once this meaning is determined, no discrepancy appears between local realistic theories and quantum mechanics: the discrepancy in Bell's Theorem is due only to a meaningless comparison between a local realistic inequality written within the strongly objective interpretation (thus relevant to a single set of particle pairs) and a quantum mechanical prediction derived from a weakly objective interpretation (thus relevant to several different sets of particle pairs).

1 Introduction

Bell's Theorem [1] exhibits a peculiar discrepancy between any local realistic theory and Quantum Mechanics, which leads to empirically distinguishable alternatives. The quandary is that neither local realistic conceptions nor Quantum Mechanics are easy to abandon. Indeed, classical physics and common sense are usually based upon the former, while the latter is rightly presented as the most successful theory of all time. Several experiments have been done: all but a few [2] show violations of Bell inequalities [3]. Yet the ideas brought forth by Bell's Theorem are so disconcerting that there is still incredulity, not to mention antipathy, evoked by the verdict. The purpose of this article is to provide a refutation of this theorem within a strictly quantum theoretical framework, without the use of outside assumptions.

2 The EPRB gedanken experiment

2.1 Spin observables and singlet state

Bell's theorem is usually based on a didactic reformulation of the EPR (Einstein, Podolsky and Rosen [4]) gedanken experiment due to D. Bohm [5]. In this EPRB gedanken experiment, a pair of spin-1/2 particles with total spin zero is produced such that each particle moves away from the source in opposite directions along the y-axis. Two Stern-Gerlach devices are placed at opposite points (left and right) on the y-axis and are oriented respectively along the directions $\mathbf{u}$ and $\mathbf{v}$. The Hilbert space associated with the entire EPRB system is $\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R$, where $\mathcal{H}_L$ and $\mathcal{H}_R$ are the Hilbert spaces associated with each Stern-Gerlach device respectively. The spin observable has two counterparts in this new product space $\mathcal{H}$, as

$$ \boldsymbol{\sigma}_L \cdot \mathbf{u} = \boldsymbol{\sigma} \cdot \mathbf{u} \otimes I_R \qquad (1) $$

$$ \boldsymbol{\sigma}_R \cdot \mathbf{v} = I_L \otimes \boldsymbol{\sigma} \cdot \mathbf{v} \qquad (2) $$

where $I_L$ and $I_R$ are the identity operators of $\mathcal{H}_L$ and $\mathcal{H}_R$. Contrary to the observables $\boldsymbol{\sigma} \cdot \mathbf{u}$ and $\boldsymbol{\sigma} \cdot \mathbf{v}$, which are mutually non-commuting when $\mathbf{u} \ne \mathbf{v}$, these new observables $\boldsymbol{\sigma}_L \cdot \mathbf{u}$ and $\boldsymbol{\sigma}_R \cdot \mathbf{v}$ do commute, reflecting the fact that the Stern-Gerlach devices are arbitrarily far from each other and are thus measuring distinct subsystems. The product of these two observables is therefore also an observable, and can be understood as a spin correlation observable, corresponding to the joint spin measurement of both Stern-Gerlach devices. Its eigenvectors are $|\varepsilon_L, \mathbf{u}\rangle \otimes |\varepsilon_R, \mathbf{v}\rangle$ with corresponding eigenvalues $\varepsilon_L \varepsilon_R$, where each $\varepsilon$ is either $+1$ or $-1$.

In an EPRB gedanken experiment the source produces particle pairs with zero total spin, represented by the singlet state

$$ |\psi\rangle = \frac{1}{\sqrt{2}} \big[\, |+, \mathbf{n}\rangle \otimes |-, \mathbf{n}\rangle - |-, \mathbf{n}\rangle \otimes |+, \mathbf{n}\rangle \,\big] \qquad (3) $$

where $\mathbf{n}$ is an arbitrary unit vector, which can usually be omitted since the singlet state is invariant under rotation [6].

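2.2 Statistical properties and hidden-variables

The expectation value of a spin observable for the singlet state $|\psi\rangle$ is zero:

$$ \langle\psi|\, \boldsymbol{\sigma} \cdot \mathbf{u} \otimes I_R \,|\psi\rangle = 0, \qquad \langle\psi|\, I_L \otimes \boldsymbol{\sigma} \cdot \mathbf{v} \,|\psi\rangle = 0 \qquad (4) $$

whatever $\mathbf{u}$ and $\mathbf{v}$, as follows from the rotational invariance of the singlet state. Likewise, the expectation value of the spin correlation observable [6,7] is

$$ E^{\psi}(\mathbf{u}, \mathbf{v}) = \langle\psi|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi\rangle \qquad (5) $$

$$ = -\mathbf{u} \cdot \mathbf{v} \qquad (6) $$

which depends only on the relative angle between $\mathbf{u}$ and $\mathbf{v}$.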
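Equations (4) and (6) can be checked with explicit matrices. The following is a minimal sketch in plain Python, assuming the standard Pauli matrices and the basis ordering $|++\rangle, |+-\rangle, |-+\rangle, |--\rangle$ along the z-axis; all helper names are introduced here for illustration.

```python
import math

# Pauli matrices as nested lists.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def spin(u):
    """sigma . u for a unit vector u = (ux, uy, uz)."""
    return [[u[0] * SX[r][c] + u[1] * SY[r][c] + u[2] * SZ[r][c] for c in range(2)]
            for r in range(2)]

def kron(A, B):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[A[r // 2][c // 2] * B[r % 2][c % 2] for c in range(4)] for r in range(4)]

def expectation(op, psi):
    """Real part of <psi| op |psi> for a 4x4 operator and a 4-component state."""
    op_psi = [sum(op[r][c] * psi[c] for c in range(4)) for r in range(4)]
    return sum(psi[r].conjugate() * op_psi[r] for r in range(4)).real

# Singlet state (|+-> - |-+>) / sqrt(2).
s = 1 / math.sqrt(2)
singlet = [0j, s + 0j, -s + 0j, 0j]

def E(u, v):
    """Expectation of (sigma_L . u)(sigma_R . v) = (sigma . u) x (sigma . v) in the singlet."""
    return expectation(kron(spin(u), spin(v)), singlet)

theta = 0.7
u = (0.0, 0.0, 1.0)
v = (math.sin(theta), 0.0, math.cos(theta))
print(round(E(u, v), 6), round(-math.cos(theta), 6))
print(round(expectation(kron(spin(u), I2), singlet), 6))
```

The first line prints two equal numbers, as in (6), and the second prints a vanishing single-side expectation value, as in (4).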

In a local realistic hidden-variables model, a single particle pair is supposed to be entirely characterised by means of a set of hidden-variables, which are symbolically represented by a parameter $\lambda$, so that the measurement result on the left along $\mathbf{u}$ can be written as $A(\mathbf{u}, \lambda)$ and the result on the right along $\mathbf{v}$ as $B(\mathbf{v}, \lambda)$. Although the hidden-variables model is supposed to be fully deterministic, it must also be capable of reproducing the stochastic nature of the EPRB gedanken experiment expressed in Eqs. (4) and (6). For that purpose, the complete state specification $\lambda_i$ of any particle pair with label $i$ must be a random variable [8]: its complete state $\lambda_i$ is supposed to be drawn randomly according to a probability distribution $\rho$.

Consider a set of $N$ particle pairs $i = 1, \dots, N$; the mean value of joint spin measurements for this set is

$$ M^{\rho}(\mathbf{u}, \mathbf{v}) = \frac{1}{N} \sum_{i=1}^{N} A(\mathbf{u}, \lambda_i)\, B(\mathbf{v}, \lambda_i) \qquad (7) $$

3 The CHSH function

In order to establish Bells Theorem a linear combination of correlation funcshytions c(a b) with different arguments 9 is considered once when these correlashytion functions are expectation values E^av) given by Quantum Mechanics ie Eq(6) and once when they are mean values M p (u v ) given by local hidden-variables theories Eq(7) then the results are to be compared A well known choice of such a linear combination is the CHSH (Clauser Home Shi-mony and Holt10) function written with four pairs of arguments

S = |c(ab) - c ( a b ) +c (a b ) + c(a b ) | (8)

The exact meaning of the simultaneous presence of these different argushyments in a CHSH function must be clarified Basically there are two possible interpretations the strongly objective interpretation and the weakly objective interpretation1112

Strongly Objective Interpretation implies that all correlation functions are relevant to the same set of N particle pairs As such they cannot be relevant to actual experiments but rather with what result would have been obtained if measured on the same set of N particle pairs along different directions

Weakly Objective Interpretation implies that each correlation function is actually to be measured on distinct sets of N particle pairs that is for each pair only one joint spin measurement is to be executed

32

The CHSH function was actually developed specifically for experimental convenience10 and many experiments have been done (the most famous being Aspects13) obviously invoking the natural interpretation namely the weakly objective one Nevertheless the strongly objective interpretation must also be considered since it remains a possible interpretation a priori and since the choice between strong and weak objectivity is not made at all explicit in many papers including Bells

It must be stressed that these interpretations are radically different not only epistemologically but also physically Indeed the strongly objective inshyterpretation pertains to a single set of N particle pairs characterised by the corresponding set of parameters A i = 1 TV whereas the weakly obshyjective interpretation pertains to no less than 4 sets of N particle pairs The fact is that a finite set of N particle pairs characterised by A cant be identishycally reproduced either theoretically (for each complete state A of any particle pair i is a random variable as defined in Section 22) or empirically (for the experimenter has no control over the complete state of a particle pair in a sinshyglet state) Hence in the weakly objective interpretation these four sets are necessarily four different sets of particle pairs 7 14 respectively characterised by four different sets of hidden-variables parameters Aij ^2i ^3i a n d A4J

The difference between the two interpretations can therefore be embodied in the number of degrees of freedom of the whole system. Let $f$ be the number of degrees of freedom of a single particle pair. In the strongly objective interpretation the number of degrees of freedom of the whole CHSH system is then $Nf$, whereas in the weakly objective interpretation it is four times as large, that is $4Nf$. Thus, before initiating Bell's analysis, one has to choose explicitly one interpretation and stick to it.

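4 Strongly objective interpretation

4.1 Local realistic inequality within strongly objective interpretation

The local realistic formulation of the CHSH function within strong objectivity is written

$$S^{\mathrm{hv}}_{\mathrm{strong}} = \left| M(\mathbf{a},\mathbf{b}) - M(\mathbf{a},\mathbf{b}') + M(\mathbf{a}',\mathbf{b}) + M(\mathbf{a}',\mathbf{b}') \right| \qquad (9)$$

which (using Eq. 7) becomes, after factorisation, a summation in which each term can take only two values [2,7]:

$$A(\mathbf{a},\lambda_i)\left[ B(\mathbf{b},\lambda_i) - B(\mathbf{b}',\lambda_i) \right] + A(\mathbf{a}',\lambda_i)\left[ B(\mathbf{b},\lambda_i) + B(\mathbf{b}',\lambda_i) \right] = \pm 2 \qquad (10)$$

so that the most restrictive local realistic inequality within the strongly objective interpretation is

$$S^{\mathrm{hv}}_{\mathrm{strong}} \le 2. \qquad (11)$$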

This is the well-known generalised formulation of Bell's inequality due to CHSH [10]. It must be stressed once more, however, that this inequality has been established only within the strongly objective interpretation, which means that each expectation value is relevant to the same set of N particle pairs. Hence this result cannot be compared directly with results from real experimental tests, where in fact mean values from four distinct sets of N particle pairs are measured.
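The two-valuedness of each term in Eq. (10) can be checked by brute force. The lines below are only an illustrative sketch and not part of the original derivation; the names `strong_term` and `strong_values` are assumptions introduced here. All four outcomes are attributed to the same particle pair, as the strongly objective interpretation requires.

```python
from itertools import product

def strong_term(A_a, A_ap, B_b, B_bp):
    """One summand of Eq. (10): all four outcomes belong to the same pair."""
    return A_a * (B_b - B_bp) + A_ap * (B_b + B_bp)

# Enumerate every assignment of +/-1 outcomes for a single particle pair.
strong_values = {strong_term(*outcomes)
                 for outcomes in product((-1, 1), repeat=4)}
print(sorted(strong_values))
```

Since every summand lies in this set, the absolute value of an average over N such summands cannot exceed 2, which is the content of inequality (11).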

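4.2 Quantum mechanical prediction within strongly objective interpretation

The quantum prediction for the CHSH function within the strongly objective interpretation is written

$$S^{\mathrm{qm}}_{\mathrm{strong}} = \left| E(\mathbf{a},\mathbf{b}) - E(\mathbf{a},\mathbf{b}') + E(\mathbf{a}',\mathbf{b}) + E(\mathbf{a}',\mathbf{b}') \right|. \qquad (12)$$

This equation is usually directly evaluated by replacing each expectation value by the scalar product result of Eq. (6). This, unfortunately, is all too hasty.

Indeed, in order to understand better the quantum mechanical meaning of Eq. (12), it is advantageous to take a step backward, using Eq. (5):

$$S^{\mathrm{qm}}_{\mathrm{strong}} = \big| \langle\psi|(\boldsymbol{\sigma}_L\cdot\mathbf{a})(\boldsymbol{\sigma}_R\cdot\mathbf{b})|\psi\rangle - \langle\psi|(\boldsymbol{\sigma}_L\cdot\mathbf{a})(\boldsymbol{\sigma}_R\cdot\mathbf{b}')|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L\cdot\mathbf{a}')(\boldsymbol{\sigma}_R\cdot\mathbf{b})|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L\cdot\mathbf{a}')(\boldsymbol{\sigma}_R\cdot\mathbf{b}')|\psi\rangle \big|. \qquad (13)$$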

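The four spin correlation observables in this equation are non-commuting observables (this can be shown by calculating the commutator of $(\boldsymbol{\sigma}_L\cdot\mathbf{u})(\boldsymbol{\sigma}_R\cdot\mathbf{v})$ and $(\boldsymbol{\sigma}_L\cdot\mathbf{u})(\boldsymbol{\sigma}_R\cdot\mathbf{v}')$ with $\mathbf{v} \neq \mathbf{v}'$), so that the meaning of their combination must be questioned.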
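The commutator mentioned above can be evaluated numerically. The following is a minimal sketch using NumPy; the helper names (`spin`, `correlation_observable`, `commutator`) and the particular directions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Pauli matrices.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(u):
    """Return sigma . u for a real 3-vector u."""
    return u[0] * SX + u[1] * SY + u[2] * SZ

def correlation_observable(u, v):
    """(sigma_L . u)(sigma_R . v) on the two-particle space."""
    return np.kron(spin(u), spin(v))

def commutator(P, Q):
    return P @ Q - Q @ P

# Example directions (arbitrary choice): b and b_prime are not parallel.
a = np.array([0.0, 0.0, 1.0])
b = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)
b_prime = np.array([-1.0, 0.0, 1.0]) / np.sqrt(2)

same_norm = np.linalg.norm(
    commutator(correlation_observable(a, b), correlation_observable(a, b)))
diff_norm = np.linalg.norm(
    commutator(correlation_observable(a, b), correlation_observable(a, b_prime)))
print(same_norm, diff_norm)
```

The second norm is non-zero, so the two correlation observables acting on the same pair space do not commute for these directions.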

According to von Neumann [15], any linear combination of expectation values of different observables $R, S, \ldots$ is meaningful in quantum mechanics,

$$\langle R + S + \cdots \rangle_\psi = \langle R \rangle_\psi + \langle S \rangle_\psi + \cdots \qquad (14)$$

even if $R, S, \ldots$ are non-commuting observables. However, as was stressed by d'Espagnat [11,16], quantum mechanics is only a weakly objective theory, and expectation values given by quantum mechanics are also weakly objective statements, that is to say statements relevant to observations. When $R, S, \ldots$ are non-commuting observables, the expectation values therefore cannot be simultaneously relevant to the same set of N systems: each expectation value is necessarily relevant to a distinct set of N systems. Therefore, the only possible meaning of Eq. (13) is weakly objective, not strongly objective as desired. Of course, this does not imply that Quantum Mechanics cannot provide any meaning at all for the CHSH function; it implies only that this meaning cannot be strongly objective.

Since the local realistic inequality $S^{\mathrm{hv}}_{\mathrm{strong}} \le 2$ cannot be compared with any strongly objective prediction given by Quantum Mechanics, Bell's Theorem cannot be verified with a strongly objective interpretation given to the CHSH function. Hence there is no choice but to rely on the weakly objective interpretation in order to compare hidden-variables theories and Quantum Mechanics.

5 Weakly objective interpretation

5.1 Quantum mechanical prediction within weakly objective interpretation

It was shown in Section 3 that strong objectivity and weak objectivity pertain to different physical systems. This difference should therefore appear in the relevant equations. Indeed, the correlation expressed in Eq. (6) is relevant to spin measurements performed on particles that once constituted a single parent particle. Yet two particles issued from two distinct parents have never interacted with each other, so that spin measurements performed on such particle pairs cannot be correlated. Hence, if left and right spin measurements are performed on two distinct sets of N particle pairs instead of the same set, there should be no correlation, and this property should appear in a generalised spin correlation function (i.e. generalised to the case of spin measurements performed on different sets of particle pairs).

This can be easily done within a quantum theoretical framework by means of a distinct EPRB space for each set of N particle pairs. Let $\mathcal{H}_j$ be the EPRB Hilbert space associated with the $j$th set of particle pairs. In this Hilbert space the EPRB gedanken experiment is represented by the singlet state $|\psi_j\rangle$ (see Section 2):

$$|\psi_j\rangle = \frac{1}{\sqrt{2}}\left[\, |+\rangle\otimes|-\rangle - |-\rangle\otimes|+\rangle \,\right]. \qquad (15)$$

The whole CHSH experiment with the four sets of particle pairs can then be expressed in terms of a new tensor product space $\mathcal{H}_{1234} = \mathcal{H}_1\otimes\mathcal{H}_2\otimes\mathcal{H}_3\otimes\mathcal{H}_4$, in which the state vector is

$$|\psi_{1234}\rangle = |\psi_1\rangle\otimes|\psi_2\rangle\otimes|\psi_3\rangle\otimes|\psi_4\rangle. \qquad (16)$$

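The counterparts of observables in $\mathcal{H}_{1234}$ are obtained as in Section 2.1. For instance, the observable pertaining to the right Stern-Gerlach device for the 2nd set of particle pairs is

$$\boldsymbol{\sigma}_{2R}\cdot\mathbf{u} = I_1\otimes(\boldsymbol{\sigma}_R\cdot\mathbf{u})\otimes I_3\otimes I_4 \qquad (17)$$

where $I_j$ is the identity operator of the EPRB space $\mathcal{H}_j$. Hence the expectation value of the product of two spin observables, the first belonging to the $k$th set and the second to the $l$th set, is

$$E_{kl}(\mathbf{u},\mathbf{v}) = \langle\psi_{1234}|(\boldsymbol{\sigma}_{kL}\cdot\mathbf{u})(\boldsymbol{\sigma}_{lR}\cdot\mathbf{v})|\psi_{1234}\rangle \qquad (18)$$

and this is the generalised expectation value of spin correlation observables that was sought. The expectation value for measurements performed on the same set ($k = l$) of particle pairs is already known, Eq. (6), and $E_{kk}(\mathbf{u},\mathbf{v})$ should provide the same result. Indeed, using Eqs. (16) and (17) leads to

$$E_{kk}(\mathbf{u},\mathbf{v}) = \langle\psi_k|(\boldsymbol{\sigma}_{kL}\cdot\mathbf{u})(\boldsymbol{\sigma}_{kR}\cdot\mathbf{v})|\psi_k\rangle = -\mathbf{u}\cdot\mathbf{v} \qquad (19)$$

but when $k \neq l$ the result is quite different:

$$E_{kl}(\mathbf{u},\mathbf{v}) = \langle\psi_k|\boldsymbol{\sigma}_{kL}\cdot\mathbf{u}|\psi_k\rangle\,\langle\psi_l|\boldsymbol{\sigma}_{lR}\cdot\mathbf{v}|\psi_l\rangle = 0 \qquad (20)$$

in accord with Eq. (4). There are indeed no correlations between two sets of particle pairs, as stipulated in the beginning of this section.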
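The two cases of the generalised expectation value can be illustrated numerically. The sketch below is a simplification under stated assumptions: it uses only two sets of pairs (k and l) rather than four, since identity factors on the remaining sets do not change the expectation values, and the helper names and measurement directions are chosen here for illustration only.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)

def spin(u):
    """Return sigma . u for a real 3-vector u."""
    return u[0] * SX + u[1] * SY + u[2] * SZ

def left(u):
    """sigma_L . u acting on one pair space."""
    return np.kron(spin(u), I2)

def right(v):
    """sigma_R . v acting on one pair space."""
    return np.kron(I2, spin(v))

def expectation(state, op):
    return np.vdot(state, op @ state).real

# Singlet state of one particle pair.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Product state of two independent sets k and l.
psi_kl = np.kron(singlet, singlet)

u = np.array([0.0, 0.0, 1.0])
v = np.array([np.sin(np.pi / 3), 0.0, np.cos(np.pi / 3)])

# k = l: both devices act on the same pair (first tensor factor).
E_same = expectation(psi_kl, np.kron(left(u) @ right(v), I4))
# k != l: left device on pair k, right device on pair l.
E_diff = expectation(psi_kl, np.kron(left(u), right(v)))
print(E_same, E_diff)
```

For these directions the same-set value equals minus the scalar product of u and v, while the mixed-set value vanishes, in line with the two cases stated above.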

Now, contrary to what was done in Section 4.2, it is possible to proceed here in full accord with the quantum mechanical postulates, because spin correlation observables built from operators such as the one given in Eq. (17) are mutually commuting, so that a linear combination of these commuting observables is an observable as well. The CHSH experiment can therefore be described by a new observable

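$$\hat{S}_{\mathrm{weak}} = (\boldsymbol{\sigma}_{1L}\cdot\mathbf{a})(\boldsymbol{\sigma}_{1R}\cdot\mathbf{b}) - (\boldsymbol{\sigma}_{2L}\cdot\mathbf{a})(\boldsymbol{\sigma}_{2R}\cdot\mathbf{b}') + (\boldsymbol{\sigma}_{3L}\cdot\mathbf{a}')(\boldsymbol{\sigma}_{3R}\cdot\mathbf{b}) + (\boldsymbol{\sigma}_{4L}\cdot\mathbf{a}')(\boldsymbol{\sigma}_{4R}\cdot\mathbf{b}') \qquad (21)$$

and the quantum prediction for the CHSH function within a weakly objective interpretation is therefore obtained by calculating the expectation value of the observable $\hat{S}_{\mathrm{weak}}$ when the system is in the quantum state $|\psi_{1234}\rangle$,

$$S^{\mathrm{qm}}_{\mathrm{weak}} = \left| \langle\psi_{1234}|\hat{S}_{\mathrm{weak}}|\psi_{1234}\rangle \right| \qquad (22)$$

which, using Eqs. (17), (18) and (19), is

$$S^{\mathrm{qm}}_{\mathrm{weak}} = \left| E_{11}(\mathbf{a},\mathbf{b}) - E_{22}(\mathbf{a},\mathbf{b}') + E_{33}(\mathbf{a}',\mathbf{b}) + E_{44}(\mathbf{a}',\mathbf{b}') \right|. \qquad (23)$$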

This equation is not ambiguous (as was Eq. 12): it is a linear combination of expectation values, each relevant to a distinct set of N particle pairs. This equation is therefore weakly objective, as requested.

Finally, using Eq. (19) yields

$$S^{\mathrm{qm}}_{\mathrm{weak}} = \left| \mathbf{a}\cdot\mathbf{b} - \mathbf{a}\cdot\mathbf{b}' + \mathbf{a}'\cdot\mathbf{b} + \mathbf{a}'\cdot\mathbf{b}' \right| \qquad (24)$$

with a well-known maximum equal to

$$\max\left(S^{\mathrm{qm}}_{\mathrm{weak}}\right) = 2\sqrt{2}. \qquad (25)$$

This numerical result is indeed the one given in the literature, the only difference here being the fact that the meaning of this result is unambiguously weakly objective. Quantum Mechanics, which is a weakly objective theory [11], provides a clear answer to the CHSH function understood as a weakly objective question.
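The maximum quoted in Eq. (25) can be reproduced with a few lines of arithmetic. This is only a sketch: the coplanar directions at 0, 90, 45 and 135 degrees are one standard choice assumed here, and the coarse grid search over multiples of 15 degrees is an illustration rather than a proof of the bound.

```python
import math

def unit(angle_deg):
    """Unit vector in the plane at the given angle."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

def E(u, v):
    """Singlet spin correlation of Eq. (19): E(u, v) = -u . v."""
    return -(u[0] * v[0] + u[1] * v[1])

def chsh(a, a_prime, b, b_prime):
    """CHSH combination of Eq. (23) with the sign pattern (+, -, +, +)."""
    return abs(E(a, b) - E(a, b_prime) + E(a_prime, b) + E(a_prime, b_prime))

S = chsh(unit(0), unit(90), unit(45), unit(135))

# Coarse grid search (a fixed at 0 degrees, other angles in 15 degree steps).
angles = range(0, 360, 15)
best = max(chsh(unit(0), unit(ap), unit(b), unit(bp))
           for ap in angles for b in angles for bp in angles)
print(S, best, 2 * math.sqrt(2))
```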

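5.2 Local realistic inequality within weakly objective interpretation

The last step consists in comparing the quantum prediction $S^{\mathrm{qm}}_{\mathrm{weak}}$ with its local realistic counterpart $S^{\mathrm{hv}}_{\mathrm{weak}}$. As was stressed in Section 3, the $j$th set of particle pairs must be characterised by a distinct set of hidden-variables parameters $\{\lambda_{j,i};\ i = 1, \ldots, N\}$. Hence, to the generalised expectation value of the spin correlation observable, Eq. (18), corresponds the generalised mean value of joint spin measurements

$$M_{kl}(\mathbf{u},\mathbf{v}) = \frac{1}{N}\sum_{i=1}^{N} A(\mathbf{u},\lambda_{k,i})\,B(\mathbf{v},\lambda_{l,i}) \qquad (26)$$

which is a priori capable of reproducing not only the $k = l$ prediction, Eq. (19), but also the $k \neq l$ prediction, Eq. (20). The local realistic CHSH function with a weakly objective interpretation is therefore

$$S^{\mathrm{hv}}_{\mathrm{weak}} = \left| M_{11}(\mathbf{a},\mathbf{b}) - M_{22}(\mathbf{a},\mathbf{b}') + M_{33}(\mathbf{a}',\mathbf{b}) + M_{44}(\mathbf{a}',\mathbf{b}') \right| \qquad (27)$$

and that is, explicitly,

$$S^{\mathrm{hv}}_{\mathrm{weak}} = \left| \frac{1}{N}\sum_{i=1}^{N}\left[ A(\mathbf{a},\lambda_{1,i})B(\mathbf{b},\lambda_{1,i}) - A(\mathbf{a},\lambda_{2,i})B(\mathbf{b}',\lambda_{2,i}) + A(\mathbf{a}',\lambda_{3,i})B(\mathbf{b},\lambda_{3,i}) + A(\mathbf{a}',\lambda_{4,i})B(\mathbf{b}',\lambda_{4,i}) \right] \right|. \qquad (28)$$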

This expression is to be compared with the one pertaining to the strongly objective interpretation (Section 4.1), which contained terms that could be factored. Here, since each term is different from the others, no factorisation is possible, i.e. there is no way to derive a Bell inequality [7]. This is not the first time this fact has been noticed; unfortunately, no conclusion was drawn then. Yet this fact cannot be ignored, for it has been shown in Section 4 that Bell's Theorem cannot be demonstrated within a strongly objective interpretation.

Here the only local realistic inequality that can be derived is obtained by considering (as was done with Eq. (10)) the possible numerical values of each term of the summation in Eq. (28), for which the extrema are +4 and -4, so that the narrowest local realistic inequality that can be derived from Eq. (28) is nothing but

$$S^{\mathrm{hv}}_{\mathrm{weak}} \le 4. \qquad (29)$$

This most restrictive local realistic inequality (which can also be found in Accardi [17]) is not incompatible with the quantum mechanical prediction, as the maximum of $S^{\mathrm{qm}}_{\mathrm{weak}}$ is $2\sqrt{2}$. This shows that experiments intended to test Bell's Theorem were unfortunately not testing the strongly objective inequality, Eq. (11), which is a Bell inequality, but this weakly objective one, Eq. (29), since all experimental tests are necessarily executed in a weakly objective way, due to the irreducible incompatibility between spin measurements. As was stressed by Sica [18] and Accardi [17], a local realistic inequality is nothing but an arithmetic identity, and inequality (29) is definitely too lax to be violated by experimental tests.
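The extrema of a single summand of Eq. (28) can be enumerated in the same way as for Eq. (10). The sketch below is illustrative only (the name `weak_term` is an assumption); the eight outcomes are treated as independent because each product belongs to a different set of particle pairs.

```python
from itertools import product

def weak_term(outcomes):
    """One summand of Eq. (28): each product comes from a different set of pairs."""
    A1, B1, A2, B2, A3, B3, A4, B4 = outcomes
    return A1 * B1 - A2 * B2 + A3 * B3 + A4 * B4

# Enumerate all 2**8 assignments of +/-1 outcomes.
weak_values = {weak_term(o) for o in product((-1, 1), repeat=8)}
print(sorted(weak_values))
```

With no shared hidden variable between the four products, nothing restricts a summand to the values plus or minus 2, and the bound on the average is 4.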

6 Conclusion

It was shown that Bell's Theorem cannot be derived, either within a strongly objective interpretation of the CHSH function, because Quantum Mechanics gives no strongly objective results for the CHSH function (see Section 4.2), or within a weakly objective interpretation, because the only derivable local realistic inequality is never violated, either by Quantum Mechanics or by experiments (see Section 5.2). It was demonstrated that the discrepancy in Bell's Theorem is due only to a meaningless comparison between $S^{\mathrm{hv}}_{\mathrm{strong}} \le 2$ and $S^{\mathrm{qm}}_{\mathrm{weak}} = 2\sqrt{2}$, where the former is relevant to a system with $Nf$ degrees of freedom, whereas the latter is relevant to one with $4Nf$ (see Section 3). The only meaningful comparison is between the weakly objective local realistic inequality $S^{\mathrm{hv}}_{\mathrm{weak}} \le 4$ and the weakly objective quantum prediction $S^{\mathrm{qm}}_{\mathrm{weak}} = 2\sqrt{2}$, but these results are not incompatible. Bell's Theorem, therefore, is refuted.

References

1. J. S. Bell, Physics 1, 195 (1964).
2. F. Selleri, Le grand débat de la mécanique quantique (Champs Flammarion, Paris, 1986).
3. A. Aspect, Nature 398, 189 (1999).
4. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
5. D. Bohm, Phys. Rev. 85, 166 (1952).
6. D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
7. A. Bohm, Quantum Mechanics: Foundations and Applications (Springer-Verlag, New York, 1979).
8. J. S. Bell, in Proceedings of the International School of Physics Enrico Fermi, Course IL, Foundations of Quantum Mechanics (Academic, New York, 1971), p. 171.
9. J. S. Bell, Epistemological Letters, p. 2 (July 1975).
10. J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
11. B. d'Espagnat, Veiled Reality: An Analysis of Present-Day Quantum Mechanical Concepts (Addison-Wesley, 1995).
12. B. d'Espagnat, arXiv:quant-ph/9802046.
13. A. Aspect, J. Dalibard and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
14. A. Khrennikov, arXiv:quant-ph/0006017.
15. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1955).
16. B. d'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Massachusetts, 1976).
17. L. Accardi, arXiv:quant-ph/0007005.
18. L. Sica, Opt. Commun. 170, 55 (1999).

PROBABILITY CONSERVATION AND THE STATE DETERMINATION PROBLEM

S. AERTS
Free University of Brussels, Triomflaan 2, Brussels, Belgium
E-mail: saerts@vub.ac.be

The problem of finding an operational definition for the wave vector is briefly examined from a historical point of view. Led by an old idea of Feenberg, we integrate the one-dimensional probability conservation equation to obtain a closed formula that determines the state vector in the spinless case. The formula that determines the state does not depend on the (real) potential, external fields having their influence on the state only through the time derivative of the probability density function in position space. We apply the method to the simple case of a free Gaussian wave packet. Some problems regarding the operational status of the quantities involved are discussed.

1 Introduction

It is well known that Heisenberg constructed the matrix formulation of quantum mechanics by keeping in close accordance with what might be labelled the principle of operationality. Roughly, one can describe this principle as a determination to introduce only measurable quantities. Schrödinger, more concerned with Anschaulichkeit than with operationality, introduced rather unscrupulously the concept of a wave function. He initially interpreted the wave function as a charge density in space, but this interpretation is difficult to extend to several-particle problems [a]. The interpretation that would stand the test of time, as testified by its being awarded the Nobel prize in 1954, was due to Born. In analogy with the theory of electromagnetic radiation, in which the intensity is the square of the amplitude, Born took the step to interpret the intensity of an electromagnetic wave in a given region of space as proportional to the relative frequency of a photon detection in that region, and the probabilistic interpretation was born. However, this correspondence still does not make the wave function an operational quantity, as for every density $\rho(x,t)$ there are infinitely many $\varphi(x,t)$ such that with $\psi(x,t) = \sqrt{\rho(x,t)}\,e^{i\varphi(x,t)}$ we get $\psi^*(x,t)\psi(x,t) = \rho(x,t)$. The problem is then to find suitable functions that we can approximate experimentally in a statistical way and that, in some well-chosen combination, yield the same information as the complete wave function.

In order to make the question mathematically more precise, Prugovecki [2] introduced the notion of informational completeness. A family $\mathcal{F} = \{O_i;\ i \in I\}$ of bounded operators on a Hilbert space $\mathcal{H}$ is called informationally complete iff for every two density operators $\rho$ and $\rho'$ the equality $\mathrm{Tr}(\rho O_i) = \mathrm{Tr}(\rho' O_i)$ for all $i \in I$ implies $\rho = \rho'$. This definition implies that the set of expectation values of an informationally complete set of operators allows only one state operator from which the expectation values could have been derived. What characterizes such a set? In a classical statistical framework we can calculate all macroscopic quantities from a single density function $\rho(p,q)$ in phase space. Hence, by analogy, one is naturally led to the following interesting question, originally due to Pauli [3]: is it sufficient to know the probability density functions of position and momentum to determine unambiguously the quantum mechanical state of the physical system? In the quantum mechanical case it is sufficient to know the wave function in coordinate space, $\psi(x,t)$, since the corresponding wave function for the same system in momentum space, $\tilde{\psi}(p,t)$, is given by its Fourier transform. Hence we can phrase the problem in a more mathematical way: is it possible to determine a square integrable function uniquely from both its modulus and the modulus of its Fourier transform?

Possibly the first non-trivial counterexamples came from Bargmann [b], who constructed explicit examples of wave functions $\psi_1$ and $\psi_2$ that give rise to the same probability distributions for position and momentum, but give a different probability distribution for a third operator that does not commute with the position or momentum operator. This leads to the remarkable conclusion that the wave function in its coordinate representation contains more information than the corresponding probability densities in position and momentum together. Due to Bargmann we know the answer to be negative in a physically relevant way [c], and what is now commonly referred to as the Pauli problem is either the problem of determining the set of states that share the same modulus and the modulus of their Fourier transform, or the problem of finding a set of observables that are informationally complete. The problems are related but not identical, and we prefer to refer to the first version of the problem as the Pauli problem and to the second as simply the state determination problem. It seems much more work has been done on the state determination problem, which is not surprising given the fact that the Pauli problem is a special case of it. With the exception of the production of counterexamples such as Bargmann's, the first instructive results regarding the Pauli problem were obtained only in 1978 by Corbett and Hurst [5]. In their paper they construct physically important classes of functions that are uniquely determined by their position and momentum distributions. However, they also show that there exist dense subsets of states that are not uniquely determined by their position and momentum distributions, and as a consequence any state can be approximated in norm by a non-unique state. Extensions, comments and counterexamples to their work can be found in Friedman [6] and Pavicic [7]. Nevertheless, the complete characterization of the set of states that share modulus and the modulus of their Fourier transform is still open.

As for the state determination problem, we can split the work into those who were primarily concerned with establishing a set of observables that is informationally complete (or disproving a certain set to have this property) and those that set out to characterize such sets. The first group includes Feenberg [8] (1933), Moyal [9] (1949), Gale, Guth and Trammell [10] (1968), Band and Park [11,12,13] (1970-1971) and many more [14,15,16]. We will not go into the reconstruction of the state by placing the entity in different potentials, a method pioneered by Lamb [17] and one that inspired many similar approaches such as those of Wiesbrock [18] and Weigert [19], nor will we mention the vast literature pertaining to the measurement of the Wigner distribution, known as phase-space tomography.

However, concerning the characterization of informationally complete sets, we cannot help but make the following elementary remarks. Suppose we have a non-trivial (i.e. not a multiple of the identity) self-adjoint operator $A$ that commutes with every member of a set of operators $S$ in a Hilbert space $\mathcal{H}$. It is well known that the one-parameter family of unitary operators $\exp(itA)$ also commutes with every element of $S$. Now take any $\psi$ that is not an eigenvector of $A$. For any observable in $S$, the state $\psi_t = \exp(itA)\psi$ gives the same expectation value for this operator, whatever numerical value $t$ has. But if $t \neq s$, it follows that $\psi_t \neq \psi_s$ (for the relation of this with superselection rules see Wick, Wightman and Wigner (1952) [20], Emch and Piron (1963) [21] and Piron [22]). Hence $S$ is not an informationally complete set of observables. So a necessary condition for a set of observables to be informationally complete is maximality in the sense of Dirac, in other words, that there be no other non-trivial operator that commutes with every member of the set. However, this is far from sufficient. As Busch and Lahti [23] have shown, it is easy to derive [d] from the considerations above that no commuting set of observables is informationally complete. Maximal commuting sets of observables serve as a means of state preparation, not state identification. This means that, at least for continuous variables, the Pauli set $\{P, Q\}$ is in a certain sense the minimal set that one could possibly hope to be informationally complete (although Bargmann has shown this in general not to be the case).

[a] For a rescue attempt of the original Schrödinger interpretation, see Dorling [1].
[b] Bargmann never seems to have published these results himself, and as a result little reference is given to his work in the literature. However, the examples can be found in Reichenbach [4].
[c] The problem re-appeared unaltered in the 1958 edition of Pauli's book, more than a decade after the first counterexamples.
[d] One arrives at this result by allowing A to be a member of S.
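That position and momentum densities alone need not fix the state can be illustrated with a simple pair of wave functions. The following sketch is not claimed to be Bargmann's own example; it assumes a chirped Gaussian and its complex conjugate, evaluated on a discrete grid with NumPy's FFT standing in for the Fourier transform. The grid size, spacing and chirp parameter `a` are arbitrary illustrative choices.

```python
import numpy as np

N, dx, a = 1024, 0.05, 1.0
# Grid chosen so that x[N - j] = -x[j]; an even function then has an even DFT.
x = (np.arange(N) - N // 2) * dx

envelope = np.exp(-x**2 / 2)
psi_1 = envelope * np.exp(1j * a * x**2)   # chirped Gaussian
psi_2 = np.conj(psi_1)                     # its complex conjugate

norm = np.sqrt(np.sum(np.abs(psi_1)**2) * dx)
psi_1, psi_2 = psi_1 / norm, psi_2 / norm

# Same modulus in position space and in (discrete) momentum space ...
position_match = np.allclose(np.abs(psi_1), np.abs(psi_2))
momentum_match = np.allclose(np.abs(np.fft.fft(psi_1)),
                             np.abs(np.fft.fft(psi_2)))
# ... yet the two states are not the same ray: their overlap is below 1.
overlap = abs(np.sum(np.conj(psi_1) * psi_2) * dx)
print(position_match, momentum_match, overlap)
```

The two states agree in both moduli because the wave function is even in x, so that conjugation merely reflects the momentum distribution onto itself.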

2 Conservation of Probability

What we will present in this article is an elaboration on the reasoning followed by Feenberg. Consider the time-dependent Schrödinger equation in $\psi$ with a real [e] potential $V$, using the shorthand $\psi$ for $\psi(\mathbf{r},t)$:

$$\frac{\partial\psi}{\partial t} = -\frac{\hbar}{2im}\nabla^2\psi + \frac{1}{i\hbar}V\psi.$$

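Multiply by $\psi^*$ and add this to the complex conjugate of the above equation multiplied by $\psi$. After some elementary vector operator manipulation we find what is commonly known as the conservation law of probability:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \qquad \rho = \psi^*\psi, \qquad \mathbf{j} = \frac{\hbar}{2im}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right).$$

Substitution of the polar representation of the wave vector, $\psi(\mathbf{r},t) = \sqrt{\rho}\,e^{i\varphi}$ ($\varphi$ assumed real), into the former equation yields a second order partial differential equation,

$$\frac{\partial\rho}{\partial t} + \frac{\hbar}{m}\nabla\cdot\left(\rho\,\nabla\varphi\right) = 0,$$

which is in fact a Fokker-Planck equation with zero diffusion coefficient and the phase serving as a potential.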

Feenberg's argument is a uniqueness result based on this last equation. It amounts to showing that any two phase functions that satisfy this equation and some gentle boundary conditions differ by at most a constant. His 1933 thesis is hard to get hold of, but the argument was (erroneously [10,15]) extended by Kemble [24] to three spatial dimensions in his much easier to find handbook on quantum mechanics. What we will do here is go back to the original one-dimensional idea, but rather than trying to establish a uniqueness result, we will show that in this simple case a solution can be obtained by direct integration.

3 Determination of the phase function

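So $\rho$ and $\varphi$ satisfy the conservation law as given by the last equation. Rewriting this equation in one dimension, evaluated at a specific time instant $t = t_0$, gives us

$$\rho(x,t_0)\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left[\frac{\partial\rho(x,t)}{\partial t}\right]_{t=t_0} = 0.$$

Assume for the time being that $\rho(x,t_0) \neq 0$ and divide the equation by $\rho(x,t_0)$:

$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\ln\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left[\frac{\partial\ln\rho(x,t)}{\partial t}\right]_{t=t_0} = 0.$$

Assuming $\rho(x,t_0)$ and its time derivative to be known functions, we can solve for the unknown phase $\varphi(x,t_0)$. Set

$$\Phi(x) = \frac{\partial\varphi}{\partial x}, \qquad f(x) = \frac{\partial\ln\rho(x,t_0)}{\partial x}, \qquad g(x) = -\frac{m}{\hbar}\left[\frac{\partial\ln\rho(x,t)}{\partial t}\right]_{t=t_0}.$$

As all quantities are evaluated at the same time instant $t = t_0$, we will not bother to give further notational reference to this fact. In what follows we will also abbreviate (with abuse of language) $[\partial\ln\rho(x,t)/\partial t]_{t=t_0}$ as $\partial_t\ln\rho(x)$. Applying these transformations, the equation becomes

$$\frac{d\Phi}{dx} + f(x)\,\Phi = g(x).$$

So we have transformed the second order partial differential equation into an ordinary first order linear differential equation with a source $g(x)$ at a fixed time instant. The solution of the homogeneous equation is $\Phi_h = \exp[-\int f(x)\,dx] = \rho^{-1}(x)$. The general solution, with $c$ chosen to fit the boundary condition, is $\Phi(x) = \Phi_h(x)\left(c + \int^x g(s)\rho(s)\,ds\right)$ [f]. We have to integrate this result once more to get $\varphi(x)$:

$$\varphi(x) = \int^x \Phi_h(r)\left(c + \int^r g(s)\rho(s)\,ds\right)dr$$

$$= \int^x \rho^{-1}(r)\left[c - \frac{m}{\hbar}\int^r \rho(s)\,\partial_t\ln\rho(s)\,ds\right]dr$$

$$= \int^x \left(c - \frac{m}{\hbar}\int^r \partial_t\rho(s)\,ds\right)\frac{dr}{\rho(r)}.$$

[e] The imaginary part of a complex potential can be used to mimic creation and annihilation effects. Although this is sometimes a useful approximation, such results violate the continuity equation, and for a more reliable analysis one should really use a second quantized theory.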

4 Validity and range of applicability

The solution is seen to be a two-parameter family of curves, one for every value of the constant $c$ and one for every lower limit, say $x_0$, of the $r$ integration. The result of changing the lower integration limit is only the addition of an overall constant to $\varphi(x,t)$. Because we know the quantum mechanical expectation values and probabilities to be invariant under such an addition, we set this constant equal to zero. The value of the constant $c$ can potentially affect the phase in a more profound way. Depending on the particular $\rho(r,t)$ used, $\rho^{-1}(r,t)$ might diverge when $\rho(r,t)$ is zero for some value(s) of $r$ or, even worse, for some interval $\Delta r$. First of all, we assumed in our derivation that $\rho(r,t) \neq 0$, but this restriction can easily be removed. Indeed, suppose we have $n$ places $x_n$ where the density does equal zero. A solution $\psi_i$ is then obtained for each interval $]x_i, x_{i+1}[$ by means of our equation. The total solution $\psi$ is obtained by pasting all the $\psi_i$ together by requiring continuity of $\psi$ and $\nabla\psi$ [g]. Now, continuity of $\psi$ and $\nabla\psi$ implies continuity of their respective complex conjugates and hence of $\rho$ and $\nabla\rho$. If we are to infer the phase from actual data, it seems reasonable to require $\varphi$ also to be continuous. In fact, the conservation equation requires it to be twice differentiable. If any cutting and pasting is necessary to obtain the solution, we can easily see that the constant $c$ should be the same for any two pasted pieces. Hence, if the cut is applied at a pole, $c$ has to be zero [h] for $\varphi$ to be continuous. We arrive at the same conclusion when we use the same reasoning on a point adjacent to the support of $\rho$. Hence we arrive at the main result of our paper:

$$\psi(x,t_0) = \sqrt{\rho(x,t_0)}\,\exp\left(-i\,\frac{m}{\hbar}\int^x \frac{dr}{\rho(r,t_0)}\int^r \partial_t\rho(s,t_0)\,ds\right).$$

Note that the state does not contain reference to the potential. External fields will show up in the state indirectly, as a consequence of the time dependence of $\rho$. The assumptions that underlie the derivation of the equation are a spinless one-dimensional particle that acts under a real potential $V$, being prepared in a pure state; in short, all that is required for a particle to obey the one-dimensional dynamical Schrödinger equation. However restricted this class is, it does include many examples that can be found in standard textbooks on quantum mechanics.

[f] The lower limit of the s integration is absorbed in the constant c.
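The double integration can be tried out numerically. The sketch below is only an illustration under explicit assumptions: it takes c = 0, uses the minus sign that follows from the continuity equation with current (hbar/m) rho times the phase gradient, works in units with m = hbar = 1, replaces the lower limit of the inner integral by the left edge of a wide grid, and uses a free Gaussian density at t = 0 with mean wave number `k0` as test input. The function name `reconstruct_phase` is hypothetical.

```python
import numpy as np

def reconstruct_phase(x, rho, dt_rho, m=1.0, hbar=1.0):
    """phi(x) = -(m/hbar) * int^x [int^r dt_rho(s) ds] / rho(r) dr, with c = 0."""
    dx = np.diff(x)
    # Inner integral by the cumulative trapezoid rule, starting at the grid edge.
    inner = np.concatenate(
        ([0.0], np.cumsum(0.5 * (dt_rho[1:] + dt_rho[:-1]) * dx)))
    integrand = -(m / hbar) * inner / rho
    # Outer integral, again by the cumulative trapezoid rule.
    return np.concatenate(
        ([0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dx)))

# Test input: Gaussian density of unit width moving with velocity hbar*k0/m.
k0, sigma = 2.0, 1.0
x = np.linspace(-8.0, 8.0, 4001)
rho = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
dt_rho = k0 * x / sigma**2 * rho          # time derivative of rho at t = 0

phi = reconstruct_phase(x, rho, dt_rho)
# Remove the arbitrary additive constant by referring the phase to x = 0.
phi_centered = phi - phi[np.argmin(np.abs(x))]
mask = np.abs(x) < 2.0
max_dev = np.max(np.abs(phi_centered[mask] - k0 * x[mask]))
print(max_dev)
```

The comparison is restricted to the central region because near the grid edges the density is extremely small and the ratio of two tiny numbers becomes numerically unreliable.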

Comparing the result we have found to those in the literature, we find the closest match with a result obtained by Gale, Guth and Trammell [10]. They apply the definitions of $\rho(\mathbf{r})$ and $\mathbf{j}(\mathbf{r})$ to show that knowing these is sufficient for the determination of the phase. They then discuss a gedanken experiment for establishing the probability current by measuring the expectation of the velocity, and argue by means of this experiment and an intuitive argument that the current $\mathbf{j}(\mathbf{r})$ equals $\rho(\mathbf{r})\langle\mathbf{v}(\mathbf{r})\rangle$ for some $\mathbf{r}$ inside a small space region that is supposed to contain the particle. Our result was obtained by a direct integration and as a consequence is exact. It is, however, difficult to extend to higher dimensions, for two reasons. The first is the fact that the expression for the probability current in the presence of a vector potential becomes $\mathbf{J}(\mathbf{x},t) = \mathrm{Re}\{\psi^*(\mathbf{x},t)[\mathbf{p}/m - (q/mc)\mathbf{A}]\psi(\mathbf{x},t)\}$, and depending on the form of the vector potential it is not obvious to what function of the phase this corresponds. If the vector potential corresponds to a uniform magnetic field, or in the absence of a vector potential (in which case one can transform the equation into a Poisson equation), one can solve the continuity equation by employing standard techniques. However, one then encounters a second problem. Providing an initial value for the phase (which is unproblematic, as the phase is only determined within an additive constant) is no longer sufficient; instead we need an initial boundary function. Hence we have to resort to other principles to determine the phase on such a boundary in order to solve the problem. Of course, the principle of conservation may still serve the purpose of reducing the family of admissible functions for the phase of the amplitude. We will now illustrate the principle by applying it to a Gaussian wave packet. Later we will expound a few operational issues regarding the quantities involved in the solution given above.

[g] This continuity demand is in fact a necessity, because the validity of the equation of probability conservation (and a fortiori of the Schrödinger equation) requires $\psi$ and $\nabla\psi$ to be continuous. A notable but unproblematic exception is that of an infinite potential step.
[h] The value of c might be non-zero in applications where the continuity equation only expresses conservation of the probability flux in some intermediate region, the boundaries (possibly at infinity) containing sinks or sources of probability.

5 Evolution of a Gaussian Wave Packet

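The full time-dependent wave function for a free Gaussian wave packet is

$$\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/4}\left[1 + \frac{i\hbar t}{2m(\Delta x)_0^2}\right]^{-1/2}\exp\left[\frac{-x^2/4(\Delta x)_0^2 + ik_0x - ik_0^2\hbar t/2m}{1 + i\hbar t/2m(\Delta x)_0^2}\right].$$

From this we easily calculate $\rho(x,t)$:

$$\rho(x,t) = \psi^*(x,t)\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/2}\left[1 + \frac{\hbar^2t^2}{4m^2(\Delta x)_0^4}\right]^{-1/2}\exp\left[-\frac{(x - k_0\hbar t/m)^2}{2(\Delta x)_0^2\left[1 + \hbar^2t^2/4m^2(\Delta x)_0^4\right]}\right].$$

Now assume we did not know the wave function, only the probability density and its time derivative at some time instant $t = 0$. In an abbreviated form (with easy identification of the coefficients $a$, $b$, $c$ and $d = k_0\hbar/m$; here $c$ is a coefficient of the density and not the integration constant, which is zero) we can write the probability as

$$\rho(x,t) = a\,(1 + bt^2)^{-1/2}\exp\left[-\frac{(x - dt)^2}{c\,(1 + bt^2)}\right].$$

At time $t = 0$ this gives us $\rho(x,0) = a\exp(-x^2/c)$. The derivative of $\rho$ with respect to the time parameter is

$$\partial_t\rho(x,0) = \frac{\partial}{\partial t}\left\{a\,(1 + bt^2)^{-1/2}\exp\left[-\frac{(x - dt)^2}{c\,(1 + bt^2)}\right]\right\}_{t=0} = \frac{2ad}{c}\,x\exp\left(-\frac{x^2}{c}\right).$$

So the phase becomes

$$\varphi(x,0) = -\frac{m}{\hbar}\int_0^x\left[\int_{-\infty}^r \partial_t\rho(s,0)\,ds\right]\frac{dr}{\rho(r,0)}$$

$$= -\frac{2d}{c}\frac{m}{\hbar}\int_0^x\left[\int_{-\infty}^r s\exp\left(-\frac{s^2}{c}\right)ds\right]\exp\left(\frac{r^2}{c}\right)dr$$

$$= \frac{m}{\hbar}\,d\int_0^x \exp\left(-\frac{r^2}{c}\right)\exp\left(\frac{r^2}{c}\right)dr$$

$$= \frac{k_0\hbar}{m}\,\frac{m}{\hbar}\,x$$

$$= k_0x$$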

which is precisely the desired phase of the wave function at $t = 0$.

6 Operational Issues

Expounding Feenbergs uniqueness result Reichenbach points out that we can recover the phase by numerical computation if we know p(x to) and dtp(x t) t=t0 bull In order to establish these quantities Reichenbach outlines the following proshycedure4 We take an ensemble A of identically prepared systems such that the ensemble can be properly described by a pure state ifgt Now select at random two sub-ensembles from A say B and C For each system in B we measure at the time to the value of a As the results will vary we obtain in this way a distribution p(xto)- Likewise for each system in C we we measure at the time ti the value of x obtaining a distribution p(xti) The quotient

p(xt0) - p(xh)

h mdash to

47

is then supposed to approximate $\partial_t \rho(x,t)$ for $t \in [t_0, t_1]$ if the interval $[t_0, t_1]$ is chosen sufficiently small. The wave function can then be obtained through numerical approximation and represents the state of the systems that are left untouched in the original ensemble A.

There is a problem with Reichenbach's procedure for determining these quantities that is of equal concern to our method. Although it is entirely possible to position the detector wherever one wants it to be, hence effectively controlling $x$ in $\rho(x,t)$, it is an annoying peculiarity of quanta that one cannot determine when a detection will take place. One places a detector and simply waits for a detection count to happen. The problem seems related to what Mielnik has called the screen problem in a provocative and enlightening paper by the same name [25]. As Mielnik points out, experimentalists perform a lot of experiments, but none resembling an instantaneous check of particle position. Indeed, a measurement setup typically consists of a source whose emission undergoes a series of transformations (i.e. an optical bench or a potential) and is subsequently detected by a fixed detector or a set of fixed detectors. If we are to describe operational means of measuring densities at some time instant, we will have to do so by such a typical setup.

To produce anything remotely satisfactory we will need a few assumptions. A first assumption is that if a particle is detected at some time instant $t_0$ in position $x$, the intricate mechanism between the measurement apparatus and the particle that is responsible for its detection does not depend on $t_0$ and in this sense has no effect on the value of $\rho(x,t)$. However unnatural the assumption might be from a physical point of view, it seems to underlie the statistical interpretation of $\int_\Omega |\psi(x,t)|^2\, dV$ as an instantaneous localization probability of the system in a state $\psi$, in a space region $\Omega$ and at a time instant $t$. In so far as our analysis depends on this assumption, so does the standard interpretation of quantum mechanics.

The next assumption is that we are able to control the release of the particle in a certain state within a sufficiently small time interval $\Delta t$, such that within this small time interval the density can reasonably be approximated by a linear function. This can be achieved by placing a shutter mechanism behind the source. Naturally, the shutter opening time has to be substantially less than the coherence time of the particle. A sufficiently short opening time can only be established by experiment, and one can never be quite sure whether there would still be more oscillations on a much shorter time scale. A density function with a larger variation will be harder to approximate, as it requires a shorter shutter opening time and hence will result in a lower detection rate. The wave packet then participates in the transformations we may have set up (optical bench, Stern-Gerlach) and is detected. The time interval between the shutter release and the detection time is noted, together with the position of the detector.


After many such recordings we gather all the data to reconstruct $\rho(x,t)$. How many samples do we need? If the samples were taken at equidistant $\Delta t$ and $\Delta x$, we could do a Fourier synthesis and apply the Shannon-Whittaker sampling theorem. However, because the detection times $t_n$ are not equidistant (at best they follow some statistical pattern), we need Frame Theory (Duffin and Schaeffer [26]) to reconstruct band-limited signals from irregularly spaced samples $f(t_n)$. The derivative with respect to time can then be derived from the reconstructed signal, and the phase derived by means of the proposed equation.
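As a rough numerical illustration of reconstruction from irregularly spaced samples, the sketch below recovers a band-limited periodic signal from randomly placed sample times by least squares and then differentiates the reconstruction. It is not the Duffin-Schaeffer frame algorithm itself, only a simple stand-in for it; the band limit `K`, the number of samples, the unit time interval and the synthetic test signal are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5                                   # assumed band limit (highest harmonic)
freqs = np.arange(-K, K + 1)            # 2K+1 Fourier modes on [0, 1)

# Hypothetical "true" band-limited signal with random complex coefficients.
c_true = rng.normal(size=freqs.size) + 1j * rng.normal(size=freqs.size)

def synthesize(coeffs, t):
    """Evaluate sum_k coeffs[k] * exp(2*pi*i*freqs[k]*t) at the times t."""
    return np.exp(2j * np.pi * np.outer(t, freqs)) @ coeffs

# Irregular "detection times": more samples than unknown coefficients.
t_n = np.sort(rng.uniform(0.0, 1.0, size=40))
f_n = synthesize(c_true, t_n)

# Least-squares inversion of the sampling operator f -> (f(t_n))_n.
E = np.exp(2j * np.pi * np.outer(t_n, freqs))
c_rec, *_ = np.linalg.lstsq(E, f_n, rcond=None)

# The time derivative follows analytically from the recovered coefficients.
t_grid = np.linspace(0.0, 1.0, 200, endpoint=False)
df_rec = synthesize(2j * np.pi * freqs * c_rec, t_grid)
df_true = synthesize(2j * np.pi * freqs * c_true, t_grid)

max_coeff_error = np.max(np.abs(c_rec - c_true))
max_deriv_error = np.max(np.abs(df_rec - df_true))
print(max_coeff_error, max_deriv_error)
```

The reconstruction is exact up to rounding only because the synthetic data are noise-free and the number of samples exceeds the number of modes; real detection data would require an estimate of the band limit and some treatment of counting noise.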

Acknowledgments

The author wishes to acknowledge a helpful discussion with John Corbett regarding the subject of this paper

References

1. J. Dorling, Schrödinger: Centenary celebration of a polymath, ed. C.W. Kilmister (Cambridge, 1987).
2. E. Prugovecki, Int. J. Theor. Phys. 16, pp. 321-331 (1977).
3. W. Pauli, Encyclopedia of Physics, Vol. V, p. 17 (Springer-Verlag, Berlin, 1958).
4. H. Reichenbach, Philosophic Foundations of Quantum Mechanics (University of California Press, 1948).
5. J.V. Corbett, C.A. Hurst, J. Austral. Math. Soc. B20, 182-201 (1978).
6. C.N. Friedman, J. Austral. Math. Soc. B30, 298 (1987).
7. M. Pavicic, Phys. Lett. A 122, 280 (1987).
8. E. Feenberg, The Scattering of Slow Electrons in Neutral Atoms, Thesis, Harvard University (1933).
9. J.E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949).
10. W. Gale, E. Guth and G.T. Trammell, Phys. Rev. A 165, 1434-1436 (1968).
11. W. Band, J. Park, Found. Phys. 1, No. 2, pp. 133-144 (1970).
12. J. Park, W. Band, Found. Phys. 1, No. 4, pp. 339-357 (1971).
13. W. Band, J. Park, Am. J. Phys. 47, pp. 188-191 (1979).
14. A. Royer, Phys. Rev. Lett. 55, p. 2745 (1985).
15. A. Royer, Found. Phys. 19, 3 (1989).
16. W. Stulpe, M. Singer, Found. Phys. Lett. 3, 153 (1990).
17. W.E. Lamb, Phys. Today 22(4), 23 (1969).
18. H.-W. Wiesbrock, Int. J. Theor. Phys. 26, p. 1175 (1987).
19. S. Weigert, Phys. Rev. A 45, pp. 7688-7696 (1992).
20. G.C. Wick, A.S. Wightman, E.P. Wigner, Phys. Rev. 88, pp. 101-105 (1952).
21. E.C. Emch, C. Piron, J. Math. Phys. 4, pp. 469-473 (1963).
22. C. Piron, Helv. Phys. Acta 42, pp. 330-338 (1969).
23. P. Bush, P.J. Lahti, Found. Phys. 19, p. 633 (1971).
24. E.C. Kemble (McGraw-Hill, New York, 1937).
25. B. Mielnik, Found. Phys. 24, 8, pp. 1113-1129 (1994).
26. R.J. Duffin, A.C. Schaeffer, Trans. Amer. Math. Soc. 72, 341-366 (1952).


EXTRINSIC AND INTRINSIC IRREVERSIBILITY IN PROBABILISTIC DYNAMICAL LAWS

H. ATMANSPACHER
Institut für Grenzgebiete der Psychologie und Psychohygiene,
Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: haa@igpp.de
and
Max-Planck-Institut für extraterrestrische Physik,
D-85740 Garching, Germany

R. C. BISHOP
Institut für Grenzgebiete der Psychologie und Psychohygiene,
Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: rcb@igpp.de

A. AMANN
Universitätsklinik für Anästhesie, Leopold-Franzens-Universität,
Anichstr. 35, A-6020 Innsbruck, Austria
E-mail: anton.amann@uibk.ac.at
and
Institut für Allgemeine, Anorganische und Theoretische Chemie, Abteilung für theoretische Chemie, Leopold-Franzens-Universität,
Innrain 52a, A-6020 Innsbruck, Austria

Two distinct conceptions for the relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of physical systems are outlined. The standard extrinsic concept of irreversibility is based on the notion of an open system interacting with its environment. An alternative intrinsic concept of irreversibility does not explicitly refer to any environment at all. Basic aspects of the two concepts are presented and compared with each other. The significance of the terms extrinsic and intrinsic is discussed.

1 Introduction

The relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of empirical systems has been a long-standing problem in physics. In most standard approaches, fundamental dynamical laws, such as Newton's, Maxwell's, Einstein's, or Schrödinger's equations, describe the temporal evolution of isolated systems. Irreversible dynamical laws are typically regarded as emerging from the interaction between systems and their environment, i.e. from considering open systems.

In contrast to this extrinsic conception of irreversibility, there is a group of scientists who insist that some kinds of irreversibility are intrinsic, i.e. some kinds of irreversible laws are fundamental. On this view, mainly advocated by Prigogine and colleagues in Brussels and Austin, the switch from extrinsic to intrinsic irreversibility goes along with a switch from particular kinds of deterministic descriptions to particular kinds of probabilistic descriptions.

In general, the two viewpoints are considered to be distinct, sometimes even entirely incompatible. It is the main goal of this contribution to show that there are both differences and similarities between them. As a consequence, it does not make much sense to prefer one of them at the expense of the other. It is much more interesting to explore whether particular aspects of each of the two views can be constructively related to each other in order to increase our insight into the issue of irreversibility.

In the following, both conceptions will be presented in some detail and compared. It is suggested that the distinction of ontic and epistemic categorial frameworks for some problems associated with irreversibility is particularly useful when focusing on a conceptual discussion. Such a distinction serves to clarify both common and distinct aspects of extrinsic and intrinsic irreversibility, and it helps to frame a number of open questions concerning them.

In Section 2, ontic and epistemic descriptions are briefly introduced. We use an algebraic framework for this introduction since this has proven fruitful in related problem areas. Section 3 outlines some basic issues with respect to the ontic states of closed quantum systems and their time-reversal invariant dynamical evolution. Subsequently, two ways to conceive of extrinsic irreversibility are described. In one of them, epistemic states are represented by (reduced) density operators; in the other, they are represented by probability distributions of pure states. Section 4 presents the intrinsic conception of irreversibility. One major line of research in this regard deals with transformations from invertible K-systems to non-invertible exact systems; the other uses the concept of rigged Hilbert spaces to extend the state of a system beyond Hilbert space. Section 5 summarizes the main points and indicates some open questions.

2 Ontic and epistemic descriptions

2.1 General issues

Can nature be observed and described as it is in itself, independent of those who observe and describe, that is to say, nature as it is when nobody looks? This question has been debated throughout the history of philosophy with no clear answer either way. Each perspective has strengths and weaknesses, and in each epoch has had its critics and proponents. In contemporary terminology, the two perspectives can be distinguished as the topics of ontology and epistemology. Ontological questions refer to the structure and behavior of a system as such, whereas epistemological questions refer to knowledge (or information) about systems.

In philosophical discourse it is considered a serious fallacy to confuse these two types of questions. For instance, Fetzer and Almeder emphasize that an ontic answer to an epistemic question (or vice versa) normally commits a category mistake [1]. Nevertheless, such mistakes are frequently committed in many fields of research when addressing subjects where the distinction between ontological and epistemological arguments is important.

The ontic/epistemic distinction refers to states and properties of a system as such or in its relation to observers; hence it is an ontological distinction.$^a$

In physics, the rise of quantum theory with its interpretational problems was one of the first major challenges to the ontic/epistemic distinction. The Bohr-Einstein discussions in the 1920s and 1930s serve as a famous historical example. Einstein's arguments were generally ontically motivated; that is to say, he emphasized a viewpoint independent of observers or measurements. By contrast, Bohr's emphasis was generally epistemically motivated, focusing on what we could know and infer from observed quantum phenomena. Since Bohr and Einstein never made their basic viewpoints explicit, it is not surprising that they talked past each other in a number of respects [2].

Examples of approaches trying to avoid the confusions of the Bohr-Einstein discussions are Heisenberg's distinction of actuality and potentiality [3], Bohm's ideas on explicate and implicate orders [5], or d'Espagnat's scheme of an empirical, weakly objective reality and an objective (veiled) reality independent of observers and their minds [5]. Further terms fitting into the ontic side of these distinctions are latency [6], propensity [7], or disposition [8]. See also Jammer's discussion of these notions, including their criticism and additional references [9].

A first attempt to draw an explicit distinction between ontic and epistemic descriptions for quantum systems was introduced by Scheibe [10], who himself, however, strongly emphasized the epistemic realm. Later, Primas developed this distinction in the formal framework of algebraic quantum theory [11]. The basic structure of the ontic/epistemic distinction, which will be made more precise below, can be roughly characterized as follows (for more details the reader is referred to [11, 12]).

$^a$ On the other hand, the distinction between ontological and epistemological problems can be considered as epistemological insofar as both areas represent fields of (philosophical) knowledge.


Ontic states describe all properties of a physical system exhaustively. (Exhaustive in this context means that an ontic state is precisely the way it is, without any reference to epistemic knowledge or ignorance.) Ontic states are the referents of individual descriptions; the properties of the system are treated as intrinsic$^b$ properties. As an important example, ontic states refer to closed systems; they are empirically inaccessible. Typically, their temporal evolution (dynamics) is reversible and follows fundamental deterministic laws.

Epistemic states describe our (usually non-exhaustive) knowledge of the properties of a physical system, i.e. based on a finite partition of the relevant phase space. The referents of statistical descriptions are epistemic states; the properties of the system are treated as contextual properties. Epistemic states refer to open systems; they are, at least in principle, empirically accessible. Typically, their temporal evolution (dynamics) follows irreversible laws.

The combination of the ontic/epistemic distinction with the formalism of algebraic quantum theory provides a framework that is both formally and conceptually satisfying. Although the formalism of algebraic quantum theory is often hard to handle for specific physical applications, it offers significant clarifications concerning the basic structure and the philosophical implications of quantum theory. For instance, the modern achievements of algebraic quantum theory make clear in what sense pioneer quantum mechanics (which von Neumann implicitly formulated epistemically [13]) as well as classical and statistical mechanics can be considered as special cases of a more general theory. Compared to the framework of von Neumann's monograph [13], important extensions are obtained by giving up the irreducibility of the algebra of observables (not admitting observables which commute with every observable in the same algebra) and the restriction to locally compact phase spaces (admitting only finitely many degrees of freedom). As a consequence, modern quantum physics is able to deal with open systems in addition to isolated ones; it can involve infinitely many degrees of freedom, such as the infinitely many modes of a radiation field; it can properly consider interactions with the environment of a system; superselection rules, classical observables, and phase transitions can be formulated, which would be impossible in an irreducible algebra of observables; there exist infinitely many representations inequivalent to the Fock representation; and non-automorphic, irreversible dynamical evolutions can be successfully incorporated and even derived.

$^b$ In a more technical terminology, one speaks of observables (mathematically represented by operators) rather than properties of a system. Prima facie, the term observable has nothing to do with the actual observability of a corresponding property.

In addition to this remarkable progress, the mathematical rigor of algebraic quantum theory in combination with the ontic/epistemic distinction allows us to address a number of unresolved conceptual and interpretational problems of pioneer quantum mechanics from a new perspective. First, the distinction between different concepts of states as well as observables provides a much better understanding of many confusing issues in earlier conceptions, including alleged paradoxes such as those of Einstein, Podolsky, and Rosen (EPR) [14]. Second, a clear-cut characterization of different concepts of states and observables is a necessary precondition to explore new approaches, beyond von Neumann's projection postulate, toward the central problem that pervades all quantum theory: the measurement problem. Third, a number of much-discussed interpretations of quantum theory and their variants can be appreciated more properly if they are considered from the perspective of an algebraic formulation.

One of the most striking differences between the concepts of ontic and epistemic states is their difference concerning operational access, i.e. observability and measurability. At first sight it might appear pointless to keep a level of description which is not related to what can be operationalized empirically. However, a most appealing feature at this ontic level is the existence of first principles and fundamental laws that cannot be obtained at the epistemic level. Furthermore, it is possible to rigorously deduce (e.g. to GNS-construct; cf. [12, 15]) a proper epistemic description from an ontic description if enough details about the empirically given situation are known. These aspects show that the crucial point is not to decide whether ontic or epistemic levels of discussion are right or wrong in a mutually exclusive sense. There are always ontic and epistemic elements to be taken into account for a proper description of a system. This requires the definition of ontic and epistemic terms to be relativized with respect to some selected framework within a set of (hierarchical) descriptions (see [16] for details and examples). The problem is then to use the proper level of description for a given context, and to develop and explore well-defined relations between different levels.

These relations are not universally prescribed; they depend on contexts of various kinds. The concepts of reduction and emergence are of crucial significance here. In contrast to the majority of publications dealing with these topics, it is possible to precisely specify their meaning in mathematical terms. Contexts or contingent conditions can be formally incorporated as topologies in which particular asymptotic limits give rise to novel, emergent properties unavailable without those contexts (see [15] for more details). It should also be mentioned that the distinction between ontic and epistemic descriptions is neither identical with that of parts and wholes nor with that of micro- and macrostates as used in statistical mechanics or thermodynamics. The thermodynamic limit of an infinite number of degrees of freedom provides only one example of a contextual topology; others are the Born-Oppenheimer limit in molecular physics or the short-wavelength limit for geometrical optics.

These examples indicate that the usefulness, or even inevitability, of the ontic/epistemic distinction is not restricted to quantum systems. It plays a significant role in the description of classical systems as well. More specifically, it has been shown in detail that for systems exhibiting deterministic chaos the distinction of ontic and epistemic descriptions is necessary if category mistakes and corresponding interpretational fallacies are to be avoided [17].

3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility

3.1 Time-Reversal Symmetry in Closed Systems

Let us start with a closed quantum system which can be considered without any reference to an environment. The pure state $\phi$ of such a system is an extremal positive linear functional on a C*-algebra $\mathcal{A}$. The state $\phi \in \mathcal{A}^*$, where $\mathcal{A}^*$ is the dual of $\mathcal{A}$, is then called an ontic state of the closed system. If a Hilbert space representation of $\mathcal{A}$ is possible, $\phi$ can be represented as a state vector $\psi \in \mathcal{H}$, characterized by the expectation values $\langle\psi|A|\psi\rangle$ of all observables $A \in \mathcal{A}$. Under particular conditions, the dynamics of $\phi$ is given by the time-reversal invariant Schrödinger equation.

In the traditional Hilbert space representation the algebra $\mathcal{A}$ of observables is irreducible: there are no nontrivial observables that commute with every other observable. Due to the Stone-von Neumann theorem, every representation of the canonical commutation relations is then equivalent to the Schrödinger representation. In the more general setting of a Fock space (sum of tensor products of one-particle Hilbert spaces) the same holds for Fock representations.

A restriction of $\phi$ to a subsystem is not a pure state in general; hence it is in general illegitimate to consider a closed quantum system as consisting of closed subsystems. As a consequence, an ontic state $\phi$ characterizes an individual, undivided whole, not consisting of subsystems with their own ontic states. This is the level of description to which the notions of quantum nonlocality or quantum holism apply. Since the concept of an environment does not make sense for ontic states of closed systems, it is illegitimate to speak about their entanglement or interaction with another state.

If one introduces a distinction (Heisenberg cut) to create subsystems in a closed system, then these subsystems in general are open. For example, one can then consider an object entangled and/or interacting with its environment. The epistemic state $\eta$ of those subsystems can be represented in two conceptually different ways.

3.2 Density Operators as Non-Pure States

The first, more or less familiar representation of an epistemic state $\eta$ is given by a (reduced) density operator $D \in \mathcal{M}_*$, where $\mathcal{M}_*$ is the predual of a W*-algebra $\mathcal{M}$ of contextual observables. The expectation value in the state $D$ is given by $\mathrm{Tr}(DM)$ for observables $M \in \mathcal{M}$. The epistemic state $\eta$ represented by $D$ is a non-pure state. EPR-correlations between subsystem and environment are generic if the contextual algebra of observables is non-commutative.
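A finite-dimensional toy case may make the step from a pure state of the whole to a non-pure reduced state concrete. The sketch below takes a maximally entangled two-qubit vector as a stand-in for "object plus environment", forms the reduced density operator $D$ of the object by a partial trace, and evaluates $\mathrm{Tr}(DM)$ for one observable. The choice of state, the dimensions, and the helper name `partial_trace_env` are assumptions of the example; the algebraic setting of the text (W*-algebras, infinitely many degrees of freedom) is not captured by it.

```python
import numpy as np

# Bell-type pure state of object (first factor) and environment (second factor).
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
rho_total = np.outer(psi, psi.conj())

def partial_trace_env(rho, dim_obj=2, dim_env=2):
    """Trace out the environment factor of a density matrix on obj (x) env."""
    r = rho.reshape(dim_obj, dim_env, dim_obj, dim_env)
    return np.einsum("ajbj->ab", r)

D = partial_trace_env(rho_total)

# Purity Tr(rho^2) equals 1 exactly for pure states and is smaller otherwise.
purity_total = np.trace(rho_total @ rho_total).real
purity_reduced = np.trace(D @ D).real

# Expectation value Tr(D M) for the observable M = sigma_z of the object.
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
expectation = np.trace(D @ sigma_z).real

print(purity_total, purity_reduced, expectation)
```

For this state the whole is pure while the reduced operator is the maximally mixed one, which is the elementary version of the statement that the restriction of a pure state to a subsystem is in general not pure.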

The term contextual observables derives from the fact that their construction requires the selection of a context, defined by a subset of relevant observables $B \in \mathcal{B} \subset \mathcal{A}$ and a reference state (e.g. vacuum state, KMS state) distinguished by some appropriate stability condition. This context induces the weak closure of $\mathcal{B}$ and gives rise to a contextual topology in $\mathcal{M}$. If the context is known well enough, then the GNS representation is a powerful constructive tool to implement a proper contextual topology (see e.g. [15]).

The dynamics of $D$ is of Schrödinger type plus dissipative terms (e.g. a master equation), so that the time-reversal invariance of the Schrödinger equation can be broken [18, 19].
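To illustrate what "Schrödinger type plus dissipative terms" can look like, the following sketch integrates a master equation of Lindblad form for a two-level system with a single decay channel, and compares it with the purely unitary evolution. The specific equation, the decay operator `L`, the parameter values and the Runge-Kutta integrator are assumptions chosen for the illustration; they are not taken from refs. [18, 19].

```python
import numpy as np

# Basis convention for this example: index 0 = ground state, index 1 = excited state.
sz = np.array([[1, 0], [0, -1]], dtype=complex)
L = np.array([[0, 1], [0, 0]], dtype=complex)    # decay operator |ground><excited|

omega, gamma, t_final = 2.0, 0.5, 2.0            # assumed frequency, decay rate, duration
H = 0.5 * omega * sz

def lindblad_rhs(rho, gamma):
    """Right-hand side: -i[H, rho] + gamma (L rho L^+ - 1/2 {L^+ L, rho})."""
    LdL = L.conj().T @ L
    dissipator = L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return -1j * (H @ rho - rho @ H) + gamma * dissipator

def evolve(rho, gamma, steps=2000):
    """Fourth-order Runge-Kutta integration from t = 0 to t_final."""
    dt = t_final / steps
    for _ in range(steps):
        k1 = lindblad_rhs(rho, gamma)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, gamma)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, gamma)
        k4 = lindblad_rhs(rho + dt * k3, gamma)
        rho = rho + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho

psi0 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
rho0 = np.outer(psi0, psi0.conj())

rho_closed = evolve(rho0, gamma=0.0)    # Schroedinger part only
rho_open = evolve(rho0, gamma=gamma)    # with the dissipative term

purity_closed = np.trace(rho_closed @ rho_closed).real
purity_open = np.trace(rho_open @ rho_open).real
excited_population = rho_open[1, 1].real
print(purity_closed, purity_open, excited_population)
```

Without the dissipative term the purity stays at one and the evolution can be run backwards; with it, the excited population decays as $\tfrac{1}{2}e^{-\gamma t}$ and the initially pure state becomes mixed.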

3.3 Probability Distributions of Pure States

If the epistemic state $\eta$ of an open system is approximately pure owing to a clever dressing of object and environment ($b$ indicates bare objects and environments, and $d$ indicates dressed objects and environments),

$\mathcal{H}^{b}_{obj} \otimes \mathcal{H}^{b}_{env} = \mathcal{H}^{d}_{obj} \otimes \mathcal{H}^{d}_{env}$,

$\eta$ can be represented (estimated) by a probability distribution $\mu$ of pure states. (A dressing procedure is clever if it minimizes EPR-correlations between object and environment, or if it maximizes the integrity of both object and environment [20].) $\mathcal{H}^{d}_{obj}$ is the proper Hilbert space for an approximately pure epistemic state $\eta$. Although $\eta$ can be uniquely extended to a normal state on $\mathcal{M}$ (represented by a density operator), the pure states and their distribution $\mu$ themselves do not make sense on $\mathcal{M}$. The relevant observables are elements of a C*-subalgebra $\mathcal{B} \subset \mathcal{A}$.


The dynamics of $\mu$ is of Schrödinger type plus stochastic terms (e.g. an Itô/Stratonovich equation), so that the time-reversal invariance of the Schrödinger equation can be broken. The stochastic aspect of the time evolution (of approximately pure states of the object) originates from the fact that the (initial) state of the environment cannot be determined and therefore must be treated as a stochastic variable. Starting from an initial pure state $\rho_0$, one gets time-evolved states $\rho_{t,\omega}$, where $\omega$ is the stochastic variable. First steps of such an approach toward single open quantum systems, not based exclusively on decompositions of density-operator dynamics, were proposed in [21, 22].

For a large class of stochastic dynamics of approximately pure states of objects, one ends up with one particular distribution $\mu_\infty$ of pure states in the limit $t \to \infty$, independently of the initial conditions (such dynamical objects are called ergodic). Splitting the underlying C*-algebra $\mathcal{B}$ into two subsystems with two C*-subalgebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$, is then admitted under particular conditions. In an ideal situation, all those pure states onto which the probability measures $\mu_t$ extend are product states with respect to the tensor product $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$. This situation never arises in practice, but most relevant pure states can be product states or almost product states if the dressing tensorization is chosen appropriately [23].

3.4 Dynamics of Measurement: a Simple Example

Any dynamical description of measurement has to start from a proper decomposition of a system into a dressed object and its dressed environment. It is crucial to keep in mind that such a decomposition is a logical precondition for the dynamics of measurement, insofar as the Hamiltonian of the composed system needs to be written as a sum

$H = H_{obj} \otimes 1 + 1 \otimes H_{env} + H_{int}$ (1)

An illustrative heuristic example has been extensively discussed by Primas [24]. Consider the simple case of a two-level quantum object (spin 1/2 system) with the Hamiltonian

$H_{obj} = \frac{\hbar}{2} \sum_{\nu=1}^{3} \Omega_\nu \sigma_\nu$ (2)

a sufficiently nontrivial boson field environment

$H_{env} = \sum_{\nu=1}^{3} \sum_{k} \omega_k\, a^{*}_{k\nu} a_{k\nu}$ (3)

and an interaction

$H_{int} = \sum_{\nu=1}^{3} \sigma_\nu \otimes A_\nu$ (4)

where

$A_\nu = \sum_{k} \lambda_{k\nu}\, a_{k\nu} + \mathrm{c.c.}$ (5)

If such a decomposition has been properly carried out (cf. Sec. 3.3), then it is possible to derive the expectation values

$M(t) = \langle\psi_t|\sigma|\psi_t\rangle$ (6)

$a(t) = \langle\chi_t|A|\chi_t\rangle$ (7)

with respect to the (approximate) product state

$\Psi_t = \psi_t \otimes \chi_t$ (8)

Corresponding to the product state $\Psi_t$, the C*-algebra of intrinsic observables in the composed system of dressed object and dressed environment is

$\mathcal{A} = \mathcal{A}_{obj} \otimes \mathcal{A}_{env}$ (9)

$\mathcal{A}_{obj}$ is the C*-algebra of 2 x 2 matrices, and $\mathcal{A}_{env}$ is the C*-algebra of intrinsic observables of an environment with infinitely many degrees of freedom.

The equations of motion for the expectation values $M(t)$ and $a(t)$ are given by

$\dot{M}(t) = M(t) \times \Omega + M(t) \times a(t)$ (10)

$\dot{a}_{k\nu}(t) = -i\omega_k\, a_{k\nu}(t) - \frac{i}{2}\lambda_{k\nu}\, M_\nu(t)$ (11)

They describe the feedback between object and environment. More precisely, they describe the polarization $M$ of the object under the influence of the environment, and the motion of the environment observable $a$ (boson operator) under the polarizing influence of the object. The solution of the second equation, referring to the observables of the environment (or the measuring system, respectively), has a retarded and an advanced part:

$a^{\mathrm{ret}}_{k\nu}(t) = \exp(-i\omega_k t)\, a_{k\nu}(0) - \frac{i}{2}\lambda_{k\nu} \int_0^t \exp(-i\omega_k (t-s))\, M_\nu(s)\, ds \quad (t > 0)$ (12)

$a^{\mathrm{adv}}_{k\nu}(t) = \exp(-i\omega_k t)\, a_{k\nu}(0) + \frac{i}{2}\lambda_{k\nu} \int_t^0 \exp(-i\omega_k (t-s))\, M_\nu(s)\, ds \quad (t < 0)$ (13)

A bidirectionally deterministic system can be described in terms of a superposition of a backward deterministic (forward non-deterministic) and a forward deterministic (backward non-deterministic) process, which are equally relevant a priori. Selecting one of these solutions and disregarding the other requires the time inversion symmetry of the compound system to be broken. For this purpose one can apply the principle of causality (past-determinacy, error-free retrodiction, no anticipation) as a heuristic argument for the selection of the retarded solution.

It has been argued that the retarded, i.e. the backward deterministic, forward non-deterministic, solution is a K-flow$^c$ on a state space with infinitely many degrees of freedom [24]. In the simplest case, the relaxation time for this K-flow is the time constant $\tau_\nu$ of an exponentially decaying correlation function (for details see [24]),

$K_\nu(t) \propto \exp(-t/\tau_\nu)$ (14)

At this point we are still at the level of description of intrinsic observables, needed for the specification of initial conditions of the K-flow. Conceptually, this K-flow represents a stochastic process which corresponds to chaos in the sense of Wiener [25] rather than chaos in the sense of Kolmogorov and Sinai (i.e. a dissipative dynamics). By introducing a context via a reference state with respect to which stability in a particular sense (hopefully more general than thermal equilibrium) can be checked, one can proceed to (GNS-constructed) contextual observables.

$^c$ Note that K-flows or K-systems play an important role in one of the approaches of intrinsic irreversibility (see Sec. 4.1). It would be interesting, but exceeds the scope of this paper, to explore the question of whether the process of measurement as described here can be conceived as intrinsically irreversible. In this respect see e.g. [26].

3.5 General Features of Extrinsic Irreversibility

The breaking of time-reversal symmetry in the framework of extrinsic irreversibility corresponds to the conceptual transition from closed systems with ontic states to open systems with epistemic states. Such a transition can be understood by dividing a closed system into open, more or less EPR-correlated subsystems (e.g. object and environment) and by selecting a subset of relevant observables. The proper state concepts are epistemic. There are then two different statistical representations for different epistemic state concepts. One statistical representation expresses a probability distribution of pure states (Sec. 3.3), whereas the usual one focuses on reduced density operators (Sec. 3.2).

The interaction of the open subsystems is described by dynamical laws different from the time-reversal invariant dynamics of a closed system. Breaking the time-reversal invariance of a unitary group evolution generates two semigroups, which can be endowed with two arrows of time opposite to each other. It should be pointed out that the forward arrow cannot be selected by physical reasons alone. Extra-physical arguments, such as consistency with experience, causality, etc., must be invoked.

4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility

In contrast to the extrinsic concept of irreversibility, there is an alternative concept of intrinsic irreversibility, mainly advocated by Prigogine and collaborators (more recently also by Bohm). They propose describing states of any system generically with distributions $\rho$ (i.e. probability distributions or density operators). The claim is that the state $\rho$ of systems beyond a particular degree of complexity evolves irreversibly by itself, i.e. without any relationship to an environment. There are essentially two lines of research pursuing this proposal.

4.1 Λ-Transformation from K-Systems to Exact Systems

The notion of the Λ-transformation was developed by Misra, Courbage, and Prigogine in the 1970s. It is essentially based on the theory of ergodic systems. In particular, the concept of Kolmogorov systems, briefly K-systems, is of central significance in this context.

Definition 1 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be an invertible transformation such that $S$ and $S^{-1}$ are measurable and measure preserving. The transformation $S$ is called a K-automorphism if there exists a σ-algebra $\mathcal{A}_0$ such that the following three conditions are satisfied:
(i) $S^{-1}(\mathcal{A}_0) \subset \mathcal{A}_0$;
(ii) the σ-algebra $\bigcap_{n=0}^{\infty} S^{-n}(\mathcal{A}_0)$ is trivial (i.e. contains only sets of measure 1 or 0);
(iii) the smallest σ-algebra containing $\bigcup_{n=0}^{\infty} S^{n}(\mathcal{A}_0)$ is identical to $\mathcal{A}$.

Another way to characterize (classical) K-systems is by way of the existence of positive Lyapunov exponents, equivalent to a strictly positive Kolmogorov-Sinai entropy. The properties of K-systems imply mixing and ergodicity. K-systems are invertible transformations; hence their deterministic dynamics, given by $\rho(t) = U_t\, \rho(0)$, is reversible ($U_t$ is a unitary evolution operator acting on $\rho$). A standard example is the (2-dimensional) baker transformation.

Another important class of mixing systems refers to so-called exact systems.

Definition 2 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be a measure preserving transformation such that $S(A) \in \mathcal{A}$ for each $A \in \mathcal{A}$. If $\lim_{n \to \infty} \mu(S^n(A)) = 1$ for every $A \in \mathcal{A}$ with $\mu(A) > 0$, then $S$ is called exact.

Exact systems are represented by non-invertible transformations; hence their stochastic dynamics, given by $\tilde\rho(t) = W_t\, \tilde\rho(0)$, is irreversible. $W_t$ is a semigroup evolution operator acting on a distribution $\tilde\rho$ rather than $\rho$. For instance, an exact system obtained from the baker transformation is the dyadic transformation

S(x) = 2x (mod 1)
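The irreversible evolution of distributions under the dyadic transformation can be made explicit with its Frobenius-Perron operator, $(Pf)(x) = \tfrac{1}{2}\,[f(x/2) + f((x+1)/2)]$, which plays the role of one step of the semigroup acting on densities. The sketch below applies it to densities that are piecewise constant on $N = 8$ equal bins, for which the operator acts exactly on the vector of bin heights. The discretization and the two initial densities are assumptions of the example.

```python
import numpy as np

def frobenius_perron_dyadic(f):
    """One step of (Pf)(x) = [f(x/2) + f((x+1)/2)] / 2 for a density that is
    piecewise constant on len(f) equal bins of [0, 1) (len(f) must be even)."""
    n = f.size
    i = np.arange(n)
    return 0.5 * (f[i // 2] + f[(i + n) // 2])

n_bins = 8
f0 = np.zeros(n_bins)
f0[0] = n_bins                      # all probability in the first bin
g0 = np.zeros(n_bins)
g0[1] = n_bins                      # all probability in the second bin

f, g = f0.copy(), g0.copy()
for _ in range(3):                  # log2(8) = 3 steps suffice here
    f = frobenius_perron_dyadic(f)
    g = frobenius_perron_dyadic(g)

print(f, g)
```

Two different initial densities are mapped onto the same uniform density, so the evolution of densities cannot be inverted, in line with the semigroup character of $W_t$.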

A theorem by Rokhlin [28] says that every exact system is the factor of a K-system. This means that K-systems can be transformed into exact systems by their projections (or factors; see [27]). More generally, a factor of a K-system can be obtained by restriction to dilating fibers or unstable manifolds. Hence it is intuitively clear that the invertibility of a K-system gets lost by its transformation into an exact system.
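The relation between the baker transformation and the dyadic transformation can be checked directly. The sketch below uses one common convention for the baker map on the unit square (stretching along $x$, compressing along $y$) and exact rational arithmetic; the function names and the sample points are assumptions of the example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """Baker transformation: stretch along x, compress along y, cut and stack."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def baker_inverse(x, y):
    """Inverse of the baker transformation."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1)."""
    return (2 * x) % 1

point = (Fraction(3, 10), Fraction(7, 10))
image = baker(*point)

# The baker map is invertible: the original point is recovered exactly.
recovered = baker_inverse(*image)

# Projection onto the dilating x-direction yields the dyadic map (a factor).
projection_matches = image[0] == dyadic(point[0])

# The dyadic map is not invertible: two distinct points share one image.
collision = dyadic(Fraction(1, 4)) == dyadic(Fraction(3, 4))

print(recovered == point, projection_matches, collision)
```

The information needed to invert the map is carried by the contracting $y$-coordinate; projecting it away leaves the non-invertible dyadic transformation.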

According to Misra et al. [29, 30], the relations between the two kinds of dynamics, $U_t$ and $W_t$, and the two state concepts, $\rho$ and $\tilde\rho$, are provided by a similarity transformation $\Lambda$ according to

$W_t = \Lambda\, U_t\, \Lambda^{-1}$

$\tilde\rho = \Lambda \rho$

Wightman's question [31] as to the meaning of $\tilde\rho$ in his review of [30] gets an immediate answer if one applies Rokhlin's theorem to construct $\Lambda$ (cf. [32]). The transformed distribution $\tilde\rho$ is the projection of $\rho$ onto a dilating subspace. This can easily be seen for the examples of the baker transformation and the dyadic transformation. In the more complicated case of continuous-time nonlinear (hyperbolic) systems, the corresponding procedure would be a projection onto the unstable manifolds, i.e. those directions along which the Lyapunov exponents are positive and add up to the Kolmogorov-Sinai entropy (cf. [33, 34]). As an important conceptual feature, such projections select a time direction.

A crucial formal feature associated with the irreversibility due to $W_t$ is that a properly constructed $\Lambda$ (and hence $\Lambda U_t \Lambda^{-1}$) preserves the positivity of the state distributions only for positive times. A conceptual discussion of this point can be found in [35]. For a more detailed formal account of the role which positivity preservation plays in the transformation between irreversible semigroups and chaotic dynamics, see [36] and references given there.

4.2 Rigged Hilbert Space Representation

Intrinsic irreversibility has also been implemented in an approach based on an extension of the usual Hilbert space representation of the state of a system. This approach makes use of the so-called rigged Hilbert space (RHS) construction, first introduced by the Russian mathematician Gelfand and his collaborators [37]. Roberts [38] and Bohm [39] independently showed how Dirac's formalism could be justified with complete mathematical rigor in a RHS. By the end of the 1970s it turned out that some basic physical problems of Hilbert space quantum mechanics, notably in the context of decaying states or resonances, could be clarified in terms of RHS ([40] and references therein).

Very briefly, a RHS (Gelfand triplet) can be understood as follows. Let $\Psi$ be an abstract linear scalar product space, and complete $\Psi$ with respect to two topologies. The first topology is the standard norm topology, yielding a separable Hilbert space $\mathcal{H}$. The second topology $\tau_\Phi$ is defined by a countable set of norms

$\|\phi\|_n = \sqrt{(\phi, \phi)_n}, \quad n = 0, 1, 2, \ldots$ (15)

where $\phi \in \Phi$ and the scalar product is given by

$(\phi, \phi')_n = (\phi, (\Delta + 1)^n \phi'), \quad n = 0, 1, 2, \ldots$ (16)

where $\Delta$ is the Nelson operator $\Delta = \sum_i X_i^2$ [41]. The $X_i$ are operators representing the observables for the system in question and are the generators for the Nelson operator. Furthermore, the operator $\Delta + 1$ is a nuclear operator and ensures that $\Phi$ is a nuclear space (cf. [42, 39]). An operator is nuclear if it is linear, essentially self-adjoint, and its inverse is Hilbert-Schmidt. An operator $A^{-1}$ is Hilbert-Schmidt if $A^{-1} = \sum_i \lambda_i P_i$, where the $P_i$ are mutually orthogonal projection operators on a finite-dimensional vector space and $\sum_i \lambda_i^2 < \infty$, the $\lambda_i$ denoting the eigenvalues of $A^{-1}$ [39]. We then have the Gelfand triplet of spaces

$\Phi \subset \mathcal{H} \subset \Phi^\times$ (17)

where $\Phi^\times$ is the dual to the space $\Phi$.

The Nelson operator fully determines the choice of function space when it comes to choosing a realization of the space $\Phi$. However, there are many different inequivalent irreducible representations of an enveloping algebra of a Lie group used to generate a Nelson operator describing physical systems. Therefore, further restrictions on the choice of function space for a realization of $\Phi$ are required. The particular characteristics of the physical context of the system being modeled provide some of these restrictions, analogous to the situation for GNS constructions in the transition from C*- to W*-algebras in algebraic quantum mechanics [23]. Additional restrictions may be required due to the convergence properties desired for test functions in $\Phi$ and $\Phi^\times$.

Bohm and colleagues applied the RHS approach to intrinsic irreversibility in the context of scattering and decay phenomena [40, 43]. Antoniou and Prigogine [44] extended the approach to broader contexts. The core idea in both versions is that a unitary group operator $U_t = \exp(-iHt)$, $-\infty < t < \infty$, generated by a Hamiltonian $H$ may, under very general circumstances, be extended from $\mathcal{H}$ to $\Phi^\times$ (restricted to $\Phi$). For scattering processes, $\Phi$ is the intersection of the Hardy class functions with the Schwartz class functions. Because of continuity and completeness requirements, $U_t: \Phi^\times \to \Phi^\times$ ($U_t: \Phi \to \Phi$) can be extended to the upper half plane $\Phi^\times_+$ (restricted to $\Phi_+$) for positive times and to the lower half plane $\Phi^\times_-$ ($\Phi_-$) for negative times [43]. The extension of $U_t$ to $\Phi^\times$ (restriction to $\Phi$) forms two semigroups because the extension (restriction) cannot be defined for replacement of $t$ with $-t$. Thus semigroup evolution falls out of the analysis quite naturally in the RHS framework.

4.3 General Features of Intrinsic Irreversibility

In the intrinsic conception of irreversibility, states of a system are generically represented by distributions in a suitable state space, where pure states are $\delta$ functions. The trajectories of individual points either (1) are considered irrelevant because empirically inaccessible (as in the $\Lambda$-transformation approach) or (2) make minimal contributions to the collective behavior of the system when a sufficient number of Poincaré resonances are present (as in the RHS approach). For systems beyond a particular degree of complexity (K-systems, Poincaré resonances, etc.), the dynamics of the system is governed by irreversible evolution laws, regardless of interactions with an environment.

While the $\Lambda$-transformation approach has only been applied to the baker map, the RHS approach has been applied to nonlinear maps, Friedrichs models, scattering experiments, and other decay phenomena. In the latter approach, exact Golden Rules for decay and survival probabilities and their rates can be derived in agreement with experimental observations [43].

[Footnote d, on the dual space introduced with Eq. (17): The dual space $\Phi^\times$ is the space of linear functionals acting on elements of $\Phi$; its topology is induced by the choice of $\tau_\Phi$, and it includes distributions among its elements.]

In both approaches, the transition from reversible to irreversible dynamical evolution laws is achieved by breaking the time-reversal symmetry in specific ways, leading to two semigroups. The time direction of the semigroups, however, is not given by either the $\Lambda$-transformation or RHS approaches. Physical considerations alone are insufficient to select the forward arrow, and one must appeal to consistency with experience, causality, or other criteria.

5 Summary and Open Questions

There are two basic points at which extrinsic and intrinsic notions of irreversibility coincide. The first is that both notions explicitly break the time-reversal symmetry of reversible dynamical laws. This is clearly the case for the standard external view, in which the transition from fundamental reversible laws to contextual irreversible laws corresponds to the transition from ontic states of closed systems to epistemic states of open systems. But even for the alternative intrinsic view, irreversibility is an emergent feature [45]. In the framework of the $\Lambda$-transformation, the time-reversal symmetry of K-systems is broken, leading to irreversible exact systems. In the RHS representation, a similar symmetry breaking is achieved by the transition from Hilbert space to the rigging spaces $\Phi$ and $\Phi^\times$.

The breaking of time-reversal symmetry always produces two semigroups, which can be endowed with opposite temporal directions. Selection criteria must be used to select one of these two directions for a preferred mode of description. In both extrinsic and intrinsic approaches, there is no such criterion available based on physical reasoning alone. The selection is based on extra-physical arguments such as causality, experience, and others. This second point of agreement between extrinsic and intrinsic irreversibility raises the interesting question of what conditions the proper direction of time has to satisfy. It could be argued that, up to the condition that it is the same for all physical systems, the selection is arbitrary.

There are two basic points at which extrinsic and intrinsic notions of irreversibility apparently differ. One of them concerns the role of the environment; the other has to do with the state concepts used in the two approaches. Briefly speaking, the role of the environment and the distinction of different state concepts is crucial in the standard framework of extrinsic irreversibility. The conceptual framework of the formalisms referring to intrinsic irreversibility neither (1) explicitly contains the concept of an environment nor (2) distinguishes between different state concepts.

These observations do not necessarily imply that intrinsic irreversibility really can dispense with points (1) and (2). It is likely that the two points play crucial roles even though they do not explicitly appear in the formalism and its usual interpretation.

The projection (factorization), which is the crucial part of a $\Lambda$-transformation, can be considered as the selection of an exact subsystem of the original K-system. Obviously, the $\Lambda$-transformation is not universal but context-dependent. Conceptually, the irreversible evolution of $\tilde{\rho} = \Lambda\rho$ due to $W_t$ could then be attributed to the restriction of the K-system to an exact subsystem. This might lead to interesting analogies with aspects of extrinsic irreversibility if the subsystem cannot be described as a closed subsystem. Concrete empirical applications of the $\Lambda$-transformation are not yet available. They would be necessary to check the significance of a physical environment, which is not explicit in the formalism.

Concerning the distinction between ontic and epistemic state concepts, it is clear that the approach of intrinsic irreversibility starts at the level of distributions rather than points. In the space of distributions, $\delta$ functions are special cases that could be related to points in a state space underlying the distribution space considered. In this way, a connection between distributions as epistemic states and points as ontic states is possible. The general claim in the $\Lambda$-transformation framework of intrinsic irreversibility, though, is that ontic states in the sense of phase points are meaningless or irrelevant since they are empirically inaccessible.

But is it justified to consider ontic states as generally irrelevant because they are empirically inaccessible? Reversible fundamental laws refer to ontic states, and it is not easy to formulate physics without them. The monographs by Ludwig [46], which consistently avoid any ontic elements, are an illustrative example. Moreover, special techniques to break symmetries often enable a unique derivation of irreversible contextual laws if the fundamental laws plus contexts are known. This also holds for the symmetry breaking used to derive intrinsic irreversibility from time-reversal invariant evolution in the $\Lambda$-transformation approach. The empirical inaccessibility of ontic states notwithstanding, one should therefore not dismiss their overall relevance too quickly.

In the RHS approach, there is no contradiction with the formal arguments in the case of extrinsic irreversibility insofar as the extension of $U_t$ from $\mathcal{H}$ into $\Phi^\times$ leads from reversibility to irreversibility. In this case, irreversibility is a feature arising during the transition from states in $\mathcal{H}$ to states whose state space is defined with respect to contexts. In the algebraic framework of Sec. 3, such contexts are reflected by a contextual topology on M. As mentioned in Sec. 4.2, physical contexts may not be known sufficiently well to determine $\Phi^\times$ uniquely. The physical examples used to demonstrate the significance of the RHS formulation (e.g., decay) suggest that a physical environment is inevitable, although this is not explicit in the formalism.

The relationship between ontic and epistemic states in the RHS approach is more subtle than in the $\Lambda$-transformation approach. As Petrosky and Prigogine argue [47, 48], the presence of a sufficient number of Poincaré resonances in so-called large Poincaré systems (LPS) rapidly converts the smooth, infinitely differentiable trajectories of the phase space points into random walks. Though the trajectories are not considered to be empirically inaccessible, their effects are limited to the formation of higher and higher orders of correlations as the dynamics evolves. The phase space points can represent ontic states, but the correlations also have an ontic status. Correlations very rapidly come to dominate the dynamics of all collective modes of behavior of LPS (e.g., the approach to equilibrium) as the correlations diffuse throughout the system. In this way, the effects of individual points and trajectories become irrelevant to the dynamics of the whole, and thus one can argue that the distribution description is an ontic description of the system's behavior.

In this way, the distinction between ontic and epistemic states might be a powerful conceptual tool even at the level of distributions alone. There is a conceptual difference between a probability distribution conceived as a distribution over an ensemble of individual pure states (as in the ^-statistical representation) and a probability distribution conceived as an individual whole. The latter concept is sometimes indicated in the context of intrinsic irreversibility and can be considered as an ontic version of the former (cf. the notion of relative onticity [16]). For instance, continuum mechanics requires a formulation which needs ontically interpreted holistic distributions from the very beginning, since its description in terms of an ensemble of points would violate basic physical laws.

Among the adherents of intrinsic irreversibility, it is claimed that the holistic concept of a distribution as a whole entails predictions, e.g., related to the dynamics of correlations in large systems, which cannot be obtained with the concept of a probability distribution of individual pure states. This claim particularly refers to situations far from thermal equilibrium. Based on Gallavotti's approach, which describes systems far from equilibrium in terms of SRB-measures [49], i.e., in an ensemble description, this claim may become testable (see also [50] for a brief discussion).

After all, it is possible to view the intrinsic approach to irreversibility as emphasizing the relative importance of the advanced level of complexity of systems with nontrivial correlations over environmental effects. While extrinsic irreversibility addresses the importance of an environment, intrinsic irreversibility should not primarily be understood as focusing on the neglect of such an environment (e.g., the environment may be a necessary condition for the existence of the dynamics). Instead, it is perhaps more appropriate to understand intrinsic irreversibility as irreversibility intrinsic to the dynamics of a system, given a particular degree of its complexity.

Acknowledgments

Helpful comments by L. Accardi, L. Ballentine, H. Narnhofer, and I. Volovich during the discussion of this contribution at the conference are much appreciated. We are grateful to H. Primas for remarks on an earlier version of this paper.

References

1. J.H. Fetzer and R.F. Almeder, Glossary of Epistemology/Philosophy of Science (Paragon House, New York, 1993), p. 100f.

2. D. Howard, Space-time and separability: problems of identity and individuation in fundamental physics. In Potentiality, Entanglement, and Passion-at-a-Distance, ed. by R.S. Cohen, M. Horne, and J. Stachel (Kluwer, Dordrecht, 1997), pp. 113-141.

3. W. Heisenberg, Physics and Philosophy (Harper and Row, New York, 1958).

4. D. Bohm, Wholeness and the Implicate Order (Routledge and Kegan Paul, London, 1980).

5. B. d'Espagnat, Veiled Reality (Addison-Wesley, Reading, 1995).

6. H. Margenau, Reality in quantum mechanics. Phil. Science 16, 287-302 (1949), here p. 297.

7. K.R. Popper, The propensity interpretation of probability and quantum mechanics. In Observation and Interpretation in the Philosophy of Physics - With special reference to Quantum Mechanics, ed. by S. Körner in collaboration with M.H.L. Pryce (Constable, London, 1957), pp. 65-70. [Reprinted by Dover, New York, 1962.]

8. R. Harré, Is there a basic ontology for the physical sciences? Dialectica 51, 17-34 (1997).

9. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974), pp. 448-453, 504-507.

10. E. Scheibe, The Logical Analysis of Quantum Mechanics (Pergamon, Oxford, 1973), pp. 82-88.


11. H. Primas, Mathematical and philosophical questions in the theory of open and macroscopic quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 233-257.

12. H. Primas, Endo- and exotheories of matter. In Inside Versus Outside, ed. by H. Atmanspacher and G.J. Dalenoort (Springer, Berlin, 1994), pp. 163-193.

13. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932). English translation: Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).

14. A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777-780 (1935).

15. H. Primas, Emergence in exact natural sciences. Acta Polytechnica Scandinavica Ma 91, 83-98 (1998). See also Primas, Chemistry, Quantum Mechanics, and Reductionism (Springer, Berlin, 1983), Chap. 6.

16. H. Atmanspacher and F. Kronz, Relative onticity. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 273-294.

17. H. Atmanspacher, Ontic and epistemic descriptions of chaotic systems. In Computing Anticipatory Systems: CASYS 99, ed. by D. Dubois (Springer, Berlin, 2000), pp. 465-478.

18. E. Fick and G. Sauermann, Quantenstatistik dynamischer Prozesse, IIa: Antwort- und Relaxationstheorie (Harri Deutsch, Thun, 1986).

19. R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II (Springer, Berlin, 1985).

20. H. Primas, The Cartesian cut, the Heisenberg cut, and disentangled observers. In Symposia on the Foundations of Modern Physics. Wolfgang Pauli as a Philosopher, ed. by K.V. Laurikainen and C. Montonen (World Scientific, Singapore, 1993), pp. 245-269.

21. A. Amann, Structure, dynamics, and spectroscopy of single molecules: a challenge to quantum mechanics. J. Math. Chem. 18, 247-308 (1995).

22. A. Amann and H. Atmanspacher, Fluctuations in the dynamics of single quantum systems. Stud. Hist. Phil. Mod. Phys. 29, 151-182 (1998).

23. A. Amann and H. Atmanspacher, C*- and W*-algebras of observables, their interpretation, and the problem of measurement. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 57-79.

24. H. Primas, Induced nonlinear time evolution of open quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 259-280.

25. N. Wiener, The homogeneous chaos. Am. J. Math. 60, 897-936 (1938).

26. C.M. Lockhart and B. Misra, Irreversibility and measurement in quantum mechanics. Physica A 136, 47-76 (1986). Cf. H. Primas, Math. Rev. 87k, 81006 (1987).

27. A. Lasota and M.C. Mackey, Chaos, Fractals, and Noise (Springer, Berlin, 1995).

28. V.A. Rokhlin, Exact endomorphisms of Lebesgue spaces. Izv. Akad. Nauk SSSR Ser. Mat. 25, 499-530 (1964); transl. in Am. Math. Soc. Transl. 39, 1-36 (1964).

29. B. Misra, Nonequilibrium entropy, Lyapounov variables, and ergodic properties of classical systems. Proc. Ntl. Acad. Sci. USA 75, 1627-1631 (1978).

30. B. Misra, I. Prigogine, and M. Courbage, From deterministic dynamics to probabilistic descriptions. Physica A 98, 1-26 (1979).

31. A. Wightman, Review of Misra, Prigogine, and Courbage [30]. Math. Rev. 82e, 58066 (1982).

32. Z. Suchanecki, On lambda and internal time operators. Physica A 187, 249-266 (1992).

33. H. Atmanspacher and H. Scheingraber, A fundamental link between system theory and statistical mechanics. Found. Phys. 17, 939-963 (1987).

34. H. Atmanspacher, Dynamical entropy in dynamical systems. In Time, Temporality, Now, ed. by H. Atmanspacher and E. Ruhnau (Springer, Berlin, 1997), pp. 325-344.

35. R.W. Batterman, Randomness and probability in dynamical theories: on the proposals of the Prigogine school. Philosophy of Science 58, 241-263 (1991).

36. I. Antoniou, K. Gustafson, and Z. Suchanecki, On the inverse problem of statistical physics: from irreversible semigroups to chaotic dynamics. Physica A 252, 345-361 (1998).

37. I.M. Gelfand and N.Ya. Vilenkin, Generalized Functions, Vol. 4 (Academic, New York, 1964). Russian original published 1961 in Moscow.

38. J.E. Roberts, The Dirac bra and ket formalism. Journal of Mathematical Physics 7, 1097-1104 (1966).

39. A. Bohm, Rigged Hilbert space and mathematical descriptions of physical systems. In Lectures in Theoretical Physics IX A: Mathematical Methods of Theoretical Physics, ed. by W.E. Brittin, A.O. Barut, and M. Guenin (Gordon and Breach, New York, 1967), pp. 255-317.


40. A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors, and Gelfand Triplets, Lecture Notes in Physics Vol. 348, ed. by A. Bohm and J.D. Dollard (Springer, Berlin, 1989).

41. E. Nelson, Analytic vectors. Annals of Mathematics 70, 572-615 (1959).

42. F. Treves, Topological Vector Spaces, Distributions, and Kernels (Academic Press, New York, 1967).

43. A. Bohm, S. Maxson, M. Loewe, and M. Gadella, Quantum mechanical irreversibility. Physica A 236, 485-549 (1997).

44. I. Antoniou and I. Prigogine, Intrinsic irreversibility and integrability of dynamics. Physica A 192, 443-464 (1993).

45. T. Petrosky and I. Prigogine, The Liouville space extension of quantum mechanics. Adv. Chem. Phys. XCIX, 1-120 (1997), here p. 71.

46. G. Ludwig, Foundations of Quantum Mechanics, Vols. 1, 2 (Springer, Berlin, 1983, 1985).

47. T. Petrosky and I. Prigogine, Poincaré resonances and the extension of classical dynamics. Chaos, Solitons & Fractals 7, 441-497 (1996).

48. T. Petrosky and I. Prigogine, The extension of classical dynamics for unstable Hamiltonian systems. Computers & Mathematics with Applications 34, 1-44 (1997).

49. G. Gallavotti, Chaotic dynamics, fluctuations, nonequilibrium ensembles. CHAOS 8, 384-392 (1998).

50. D. Ruelle, Gaps and new ideas in our understanding of nonequilibrium. Physica A 263, 540-544 (1999).


INTERPRETATIONS OF PROBABILITY AND QUANTUM THEORY

L. E. BALLENTINE

Department of Physics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada

e-mail ballentisfuca

There is a peculiar similarity between Probability Theory and Quantum Mechanics: both subjects are mature and successful, yet both remain subject to controversy about their foundations and interpretation. I first present a classification of the various interpretations of probability, arguing that they should not be thought of as rivals, but rather as applications of a general theory to different kinds of subject matter. An axiom system that makes conditional probability the fundamental concept is put forward as being superior to Kolmogorov's axioms. I then discuss the relevance to quantum theory of the various interpretations of probability, the applicability of classical probability theory within quantum mechanics, and the relations between the interpretation of probability and the interpretation of quantum mechanics.

1 Introduction

There are many connections between Probability Theory and Quantum Mechanics, the most notable being that Quantum Mechanics uses Probability Theory in its fundamental interpretation, not merely as a technique. But I wish to concentrate on a more peculiar similarity. Although both subjects are mature and successful, both remain subject to controversy about their foundations and interpretation. There may be even more interpretations of probability than there are of quantum theory. Can one bring some degree of order to this subject?

Probability Theory, being a branch of mathematics, is defined by a set of axioms. So it can legitimately be applied to any entity that satisfies those axioms. Most of the interpretations of probability can be viewed as applications of the formal theory to different subject matters. It is therefore misguided to argue over which is the correct interpretation. Most of them are correct within their appropriate domain of application. But it is still reasonable to ask whether there is a general overarching form of Probability Theory, of which all the various interpretations can be seen as special cases applied to special subject matters.

I shall propose such a classification of the various interpretations of probability. To do so, it is necessary to overlook small differences and to lump closely related interpretations into a few broad categories. I expect this classification to be controversial, but I believe that it is a step in the right direction. I shall consider only theories that are based on the same or equivalent sets of axioms. Hence generalizations such as negative probabilities are not included in this scheme, although I shall briefly refer to them later. After describing the major categories of interpretation of probability, I will discuss the relevance of each to quantum mechanics.

2 Interpretations of Probability

Many different interpretations of probability are examined in detail by T. L. Fine [1]. I propose to overlook many of the fine differences, and hence classify them into a few major groups, shown in Figure 1. References to most of the authors named in Fig. 1, and critical analyses of their ideas, are given by Fine [1].

2.1 The Theory of Inductive Inference

I propose that the Theory of Inductive Inference be taken as the master theory, and that all other interpretations be regarded as special cases applicable in more restricted contexts. This point of view was expressed most completely by E. T. Jaynes in his book Probability Theory: The Logic of Science, which unfortunately was not completed during his lifetime.

Within this interpretation, probability is assigned to propositions. The notation $P(A|C)$ is to be read as "the probability of A under the condition C". Probability is regarded as a logical relation among propositions that is weaker than entailment. Inductive logic reduces to deductive logic in the limit of probability values 0 and 1. Probability is an objective relation and should not be confused with degrees of belief.

The propositions to which probability is assigned may have any particular content. If we specialize to propositions about repeated experiments, we obtain the Ensemble-Frequency theory. If we specialize to propositions about personal belief, we obtain Subjective probability. If we specialize to propositions about indeterministic or unpredictable events, we obtain the Propensity theory.

Although $P(A|C)$ is a logical relation between proposition A and the conditioning information C, it is not merely a formal syntactic relation. The content (meaning) of A and C must be invoked to evaluate $P(A|C)$. There is no magic formula to translate arbitrary information into probabilities. Jaynes has given solutions to this problem in some important special cases (symmetry groups, marginalization), but there is as yet no general solution.


[Figure 1 shows the master theory at the top and three special cases below it:

The Logic of Inductive Inference (E. T. Jaynes, R. T. Cox, H. Jeffreys): $P(A|C)$ is the probability that proposition A is true, given the information C.

Ensemble and Frequency (Kolmogorov, Bernoulli, von Mises): measure on a set; limit frequency in an ordered sequence.

Propensity (K. R. Popper): $P(A|C)$ is the propensity for event A to occur under the condition C.

Subjective and Personal (de Finetti, L. J. Savage, I. J. Good): incomplete knowledge; degrees of reasonable belief.]

Figure 1. Classification of the interpretations of Probability.

2.2 Ensemble and Frequency Theories

One of the most common interpretations of probability is as a limit frequency in an ordered sequence. The ratio of the number $n$ of occurrences of a particular type in a sequence of $N$ events, $n/N$, is identified with the probability. This interpretation is useful in analyzing repeated experiments, but it has the difficulty that in a random sequence the ratio $n/N$ need not have a limit. The ensemble interpretation is a generalization of the frequency interpretation, in which probability is identified with a measure on a set that need not be ordered. It is closely associated with Kolmogorov's axiom system, which will be discussed later.
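The frequency ratio $n/N$ can be tracked along a simulated sequence of trials. The sketch below is only an illustration under stated assumptions: the function name, the probability value 0.3, and the use of a seeded pseudo-random generator are all hypothetical choices, and a finite run cannot settle whether a limit exists.

```python
import random

def relative_frequencies(p, n_trials, seed=0):
    """Return the running ratio n/N after each of n_trials simulated events.

    p is the assumed chance that a single event is of the type being counted.
    """
    rng = random.Random(seed)   # seeded so the run is reproducible
    count = 0                   # n: occurrences so far
    ratios = []
    for big_n in range(1, n_trials + 1):
        if rng.random() < p:
            count += 1
        ratios.append(count / big_n)
    return ratios

ratios = relative_frequencies(p=0.3, n_trials=20000)
for big_n in (10, 100, 1000, 20000):
    print(big_n, ratios[big_n - 1])
```

The early ratios fluctuate strongly, and nothing in the sequence itself guarantees convergence, which is the difficulty noted above.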

2.3 Subjective Probability

Subjectivism has its place, and subjective probability provides an excellent way to describe degrees of reasonable belief. But in science subjectivism can be like a virus, and we must guard against its infection. In general, the probability $P(A|C)$ expresses an objective relation between A and C, determined by the totality of the information C, and not by anyone's personal opinions. Jaynes tried to ensure objectivity through the pedagogical device of introducing a robot that is programmed to reason consistently, using only the information that is given to it. But even Jaynes sometimes slipped from objective to personal probabilities in his examples, without apparently being aware of doing so. Indeed, the contamination of Inductive Logic Probability by subjectivism may have been a major barrier to its acceptance.

2.4 Propensity

Propensity is a form of causality that is weaker than determinism [3, 4]. Generally speaking, probability expresses logical relations rather than causal relations. (Recall the old saying, "Correlation does not imply causality.") However, causality is a special kind of logical relation, and propensity theory deals with just that special case. The propensity interpretation of probability is natural in situations, such as those described by quantum mechanics, in which events cannot be predicted with certainty from their antecedents.

3 The Axioms of Probability

The axioms of probability theory can be given in several different forms; however, those given by R. T. Cox [5, 6] are particularly convenient.

Axiom 1: $0 \le P(A|B) \le 1$

Axiom 2: $P(A|A) = 1$

Axiom 3: $P(\neg A|B) = 1 - P(A|B)$

Axiom 4: $P(A \& B|C) = P(A|C)\, P(B|A \& C)$

Here the notation is as follows: $\neg A$ means "not A"; $A \& B$ means "A and B"; $A \vee B$ means "either A or B".


Axiom 2 states that the probability of a certainty (A given A) is one. Axiom 1 states that no probabilities are greater than the probability of a certainty. Axiom 3 expresses the notion that the probability of non-occurrence of an event increases as the probability of its occurrence decreases. It also implies $P(\neg A|A) = 0$: an impossibility (not A given A) has zero probability. Axiom 4 is the least intuitive. The probability of both A and B (under some condition C) is equal to the probability of A multiplied by the probability of B given A.

The probabilities of negation ($\neg A$) and conjunction ($A \& B$) each require an axiom. However, no further axioms are required to treat disjunction, because $A \vee B = \neg(\neg A \& \neg B)$; in words, "A or B" is equivalent to the negation of "neither A nor B". This allows us to deduce a theorem:

$P(A \vee B|C) = P(A|C) + P(B|C) - P(A \& B|C)$ (1)
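One way to reach this theorem from the axioms is sketched below. The sequence of steps is a reconstruction offered for illustration, not a derivation quoted from the text; it uses only Axioms 3 and 4, the identity $A \vee B = \neg(\neg A \,\&\, \neg B)$, and the assumption that logically equivalent propositions such as $\neg A \,\&\, B$ and $B \,\&\, \neg A$ receive equal probabilities.

```latex
\begin{align*}
P(A \vee B \mid C)
 &= P\bigl(\neg(\neg A \,\&\, \neg B) \mid C\bigr) \\
 &= 1 - P(\neg A \,\&\, \neg B \mid C) && \text{(Axiom 3)} \\
 &= 1 - P(\neg A \mid C)\, P(\neg B \mid \neg A \,\&\, C) && \text{(Axiom 4)} \\
 &= 1 - P(\neg A \mid C)\,\bigl[1 - P(B \mid \neg A \,\&\, C)\bigr] && \text{(Axiom 3)} \\
 &= P(A \mid C) + P(\neg A \,\&\, B \mid C) && \text{(Axioms 3, 4)} \\
 &= P(A \mid C) + P(B \mid C)\,\bigl[1 - P(A \mid B \,\&\, C)\bigr] && \text{(Axioms 4, 3)} \\
 &= P(A \mid C) + P(B \mid C) - P(A \,\&\, B \mid C) && \text{(Axiom 4)}
\end{align*}
```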

If A and B are mutually exclusive, then we obtain

$P(A \vee B|C) = P(A|C) + P(B|C)$ (2)

which is often taken to be an axiom, and may be used in place of Axiom 3.

Several remarks about these axioms are in order. First, the notion of randomness plays no fundamental role in the theory. Hence we need not enquire whether our variables and events are random as a prerequisite to applying probability theory.

Second, these axioms are not arbitrary. They are uniquely determined (apart from formal changes that do not affect the content) by conditions of plausibility and consistency (see Cox [5] and Jaynes [2]):

(i) The probability of A on some given evidence determines also the probability of not A on the same evidence.

(ii) The probability on given evidence that both A and B are true is determined by their separate probabilities, one on the given evidence and the other on that evidence plus the assumption that the first is true.

(iii) If a complex proposition can be composed in more than one way [e.g., $(A \& B) \& C$ or $A \& (B \& C)$], then all ways of computing its probability must lead to the same answer.

Notice that in (i) and (ii) only the existence of certain connections is assumed, but not their mathematical form. The consistency condition (iii) then leads to the mathematical forms of the axioms. Therefore, anyone who proposes an inequivalent alternative to Cox's axioms (such as allowing negative probabilities) has an obligation to explain how and why he departs from these conditions of plausibility and consistency.


Finally, a very important remark: All probabilities are conditional.

The use of the single-variable notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information C is obvious from the context and is unchanging throughout the problem. Many fallacies and paradoxes follow from ignoring this principle.

3.1 Kolmogorov's axioms

If the fundamental axioms that define Probability Theory are those given above, then what is the status of Kolmogorov's well-known axioms? According to Kolmogorov's axioms, probability is assigned to subsets of a universal set $\Omega$, with the following rules:

(1) $P(\Omega) = 1$.

(2) $P(f) \ge 0$ for any $f$ in $\Omega$.

(3) If $f_1, \ldots, f_n$ are disjoint, then $P(f) = \sum_j P(f_j)$, where $f$ is the union of $f_1, \ldots, f_n$.

(4) If $f_i \to \emptyset$ (the empty set), then $P(f_i) \to 0$.

The answer, I believe, is that Kolmogorov's axioms provide a mathematical model of probability theory (defined by Cox's axioms) on the theory of measurable sets. A mathematical model is useful because it reduces the consistency of one theory to that of another. (A familiar example is the algebra of complex numbers, which can be modeled by the algebra of ordered pairs of reals.) Thus any doubts about the consistency of Probability Theory may be laid to rest because of the existence of Kolmogorov's model.
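The sense in which a measure on a set models the conditional-probability axioms can be checked by brute force on a small example. The following is a minimal sketch under stated assumptions: the four-element universal set, the uniform measure, and the function names are hypothetical, and conditional probability is taken to be the usual ratio of measures, defined only when the conditioning set has nonzero measure.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset(range(4))  # hypothetical universal set with four points

def measure(event):
    """Uniform measure on subsets of OMEGA (an assumed Kolmogorov model)."""
    return Fraction(len(event), len(OMEGA))

def prob(a, c):
    """P(A|C) modeled as measure(A & C) / measure(C); C must be non-empty."""
    return measure(a & c) / measure(c)

def subsets(s):
    items = sorted(s)
    return [frozenset(combo)
            for r in range(len(items) + 1)
            for combo in combinations(items, r)]

def check_cox_axioms():
    """Verify Axioms 1-4 for every admissible choice of subsets."""
    events = subsets(OMEGA)
    for c in events:
        if not c:
            continue  # conditioning on the empty set is undefined
        assert prob(c, c) == 1                           # Axiom 2
        for a in events:
            assert 0 <= prob(a, c) <= 1                  # Axiom 1
            assert prob(OMEGA - a, c) == 1 - prob(a, c)  # Axiom 3
            if a & c:  # P(B|A&C) is only defined when A&C is non-empty
                for b in events:
                    # Axiom 4: P(A&B|C) = P(A|C) P(B|A&C)
                    assert prob(a & b, c) == prob(a, c) * prob(b, a & c)
    return True

print(check_cox_axioms())
```

Exact rational arithmetic is used so that the equalities in Axioms 2 to 4 can be tested without rounding error.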

There are several objections to taking Kolmogorov's axioms as a foundation for Probability Theory, rather than merely as a model:

• The universal set $\Omega$ is often fictitious. The propositions to which probabilities are assigned are not subsets of a set.
• Conditional probability is relegated to secondary status, while the mathematical fiction of absolute probability is made primary.
• Probability theory and Measure theory are distinct subjects. The interesting problems of one are not closely related to the interesting problems of the other. For example, measure theory deals mostly with infinite sets, culminating with the construction of non-measurable sets, which have no probabilistic interpretation. But in probability theory one seldom needs to consider an infinite number of conjunctions and disjunctions. On the other hand, the important problem of translating qualitative information into probabilities has no measure-theoretic analog.


4 Probability in Quantum Mechanics

4.1 Relevant and Irrelevant Interpretations of Probability

Which of the interpretations of probability are relevant to quantum mechanics? The ensemble-frequency interpretation is obviously relevant, and is widely used in discussing the statistics of repeated experiments on similarly prepared states. Indeed, the standard description of an idealized experiment is: (1) prepare a state; (2) measure an observable of the system; (3) repeat the previous two steps until sufficient statistical data has been accumulated; (4) compare the relative frequencies of this data with the probabilities predicted by quantum theory.

The propensity interpretation is in accord with the ensemble-frequency interpretation whenever it is applied to repeated experiments, but it also allows one to make meaningful statements about individual events. The propensity interpretation is more natural when one considers time-dependent states, and hence time-dependent probabilities. Consider the following examples.

(i) A source produces $s = 1/2$ particles polarized at an angle $\phi$ relative to some coordinate axis. A Stern-Gerlach magnet has its field gradient axis oriented at an angle $\theta$. What is the probability that such a particle incident on the apparatus will emerge with spin up?

The formal answer is, of course, $p = \cos^2[(\theta - \phi)/2]$, but what does this mean?

According to the propensity interpretation it means: The propensity (chance) of the particle emerging with spin up is $p$.

According to the ensemble-frequency interpretation it means: In a long run of similar experiments, the fraction of particles emerging with spin up will be (approximately) $p$.

(ii) Now let the magnet be re-oriented in some arbitrary manner before each particle is released, so that $\theta$ is different in each case.

According to the propensity interpretation we say nearly the same thing: The propensity (chance) has a different value $p = p_\theta$ in each case.

But in the ensemble-frequency interpretation one must conceptually embed each event in an imaginary long run of experiments having the same value of $\theta$ in order to make a frequency statement.


(iii) Suppose next that the polarization direction $\phi$ of the particles is unknown. Can it be inferred from the data of (ii)?

In the ensemble-frequency interpretation the answer would appear to be No. A long run of events for each value of $\theta$ would be necessary to estimate $p_\theta$ as a frequency, and hence to determine its dependence on $\theta$.

In the propensity interpretation the answer is Yes. Bayesian inference (equivalent to maximum likelihood if the prior probability distribution for $\phi$ is uniform) can determine the most probable value of $\phi$, even if there is only one event for each value of $\theta$.
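The inference described in (iii) can be sketched numerically. The following is a minimal illustration, not the author's procedure: it assumes the spin-up probability $\cos^2[(\theta - \phi)/2]$ quoted in (i), draws one event per randomly chosen magnet angle, and locates the maximum of the likelihood on a grid. The function names, the grid search, and the simulated data are all hypothetical choices made for this example.

```python
import math
import random

def p_up(theta, phi):
    """Spin-up probability for magnet angle theta and polarization angle phi."""
    return math.cos((theta - phi) / 2.0) ** 2

def simulate(phi_true, n_events, rng):
    """One event per magnet setting: theta is re-drawn before every particle."""
    data = []
    for _ in range(n_events):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        up = rng.random() < p_up(theta, phi_true)
        data.append((theta, up))
    return data

def log_likelihood(phi, data):
    """Log-likelihood of the observed up/down events for a candidate phi."""
    eps = 1e-12  # guards against log(0) for outcomes that phi makes impossible
    total = 0.0
    for theta, up in data:
        p = p_up(theta, phi)
        total += math.log(max(p if up else 1.0 - p, eps))
    return total

def estimate_phi(data, grid_size=360):
    """Maximum-likelihood estimate of phi on a uniform grid over [0, 2*pi).

    With a uniform prior this is also the most probable value of phi.
    """
    grid = [2.0 * math.pi * k / grid_size for k in range(grid_size)]
    return max(grid, key=lambda phi: log_likelihood(phi, data))

rng = random.Random(0)          # fixed seed so the run is repeatable
phi_true = 1.0                  # hypothetical "unknown" polarization angle
data = simulate(phi_true, 1000, rng)
phi_hat = estimate_phi(data)
```

No value of $\theta$ is repeated in the simulated data, so no frequency can be formed for any single setting, yet the likelihood still singles out a value of $\phi$ close to the one used to generate the events.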

I have never seen a coherent exposition of QM based on a subjective interpretation of quantum probabilities as representing knowledge. This point (which has also been argued at length by Popper [8]) is worth emphasizing, because the interpretation of probabilities as knowledge seems to be a tenet of the Copenhagen interpretation.

Two persons (with limited knowledge of QM) might have different reasonable beliefs about the position of the electron in the hydrogen atom, and those beliefs could be represented by subjective probabilities. But such ignorance probabilities have nothing to do with $|\psi(x)|^2$ from the Schroedinger equation. $|\psi(x)|^2$ is an objective propensity, not a subjective degree of belief.

The so-called Uncertainty principle, $\Delta x \, \Delta p \geq \hbar/2$, has nothing to do with subjective knowledge or ignorance. Its meaning is that in any physical preparation of a state, the values of $x$ and $p$ will not be reproducible, the widths of their distributions being related by the inequality. The widths $\Delta x$ and $\Delta p$ are objective, predictable, and measurable parameters, which should not be called uncertainties. Indeed, the name Indeterminacy principle is preferable to Uncertainty principle.^a

Subjective probabilities can occur in the information games that are played in quantum communication theory. Consider a typical example.

Bob prepares some quantum state but keeps it secret. He tells Alice only that it is one of four (usually nonorthogonal) possible states, and she must try to infer what the hidden state is from a measurement. Alice's incomplete knowledge of that hidden state can be expressed as a subjective probability. Suppose also that Bob tells Carol that the unknown state is one of three possibilities. Carol's knowledge is different from Alice's, and hence her subjective probability will be different. But both of these subjective knowledge probabilities are quite distinct from the objective quantum probabilities (propensities) that would be calculated by solving Schroedinger's equation for Bob's state preparation apparatus.

^a When I once heard Heisenberg speak (about 1964) he used the term Indeterminacy principle. In his early writings he used the words Ungenauigkeit (inexactness), Unbestimmtheit (indeterminacy), and Unsicherheit (uncertainty), with various shades of meaning.

I suspect that the subjective knowledge interpretation of QM probabilities came about by accident: the founders of QM may have believed (erroneously) that probability can only be a measure of knowledge/ignorance. Max Born has written that Heisenberg did not know what a matrix was when he was inventing what later became known as matrix mechanics. It is therefore not very radical to suppose that the founders of quantum mechanics had an inadequate understanding of probability.

4.2 Fallacies in the use of Probability

Unsound arguments to the effect that classical probability theory does not apply to QM are woefully common. Before examining an actual argument to that effect, let us first consider a simple classical paradox.

The Bookie's Paradox. A bookie needs to fix the odds on a star track runner who has a 60% chance of winning any race that he enters. There is a race in Paris and a race in Tokyo scheduled on the same day, so he cannot enter both, and we do not know which he will enter. What is the probability that he will win at least one of these races?

Let A = (winning in Paris), and let B = (winning in Tokyo). Clearly A and B are mutually exclusive events, so $P(A \vee B) = P(A) + P(B)$. The probability of his winning at least one race is $0.6 + 0.6 = 1.2$. But this is absurd, since $1.2 > 1$.

The paradox is resolved by taking account of a principle that was noted in Sec. 3:

All probabilities are conditional. The notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information $C$ is obvious from the context and unchanging throughout the problem.

Let us therefore be more precise about the conditions involved. Let $E_P$ = (entering in Paris), and let $E_T$ = (entering in Tokyo). Then clearly we have

$P(A|E_P) = 0.6$, $P(B|E_P) = 0$, $P(A|E_T) = 0$, $P(B|E_T) = 0.6$.


Additivity, $P(A \vee B|C) = P(A|C) + P(B|C)$, holds for the same condition $C$ in all terms. But $P(A|E_P)$ and $P(B|E_T)$ are not additive by any valid rule, so the absurd conclusion reached above followed only from an erroneous application of probability theory.
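The resolution can be checked with a few lines of arithmetic. The sketch below is only an illustration under an added assumption that is not in the text: the runner enters exactly one of the two races, and the common condition $C$ assigns some probability `p_enter_paris` (a hypothetical parameter) to his entering in Paris. With every term conditioned on the same $C$, additivity gives a sensible answer.

```python
def p_win_at_least_one(p_enter_paris, p_win_given_entry=0.6):
    """P(A or B | C) computed with all terms conditioned on the same C.

    p_enter_paris is the assumed P(E_P | C); the runner is assumed to enter
    exactly one race, so P(E_T | C) = 1 - P(E_P | C).
    """
    p_enter_tokyo = 1.0 - p_enter_paris
    # Law of total probability, using P(A|E_P) = 0.6, P(A|E_T) = 0, etc.
    p_a = p_win_given_entry * p_enter_paris + 0.0 * p_enter_tokyo
    p_b = 0.0 * p_enter_paris + p_win_given_entry * p_enter_tokyo
    # A and B are mutually exclusive, so additivity applies under C.
    return p_a + p_b

results = [p_win_at_least_one(q) for q in (0.0, 0.3, 0.5, 1.0)]
```

Whatever value is assumed for `p_enter_paris`, the result stays at 0.6 up to rounding, never 1.2, because $P(A|C)$ and $P(B|C)$ are each smaller than the conditional values $P(A|E_P)$ and $P(B|E_T)$.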

Double-slit Fallacy. A common fallacy about the 2-slit experiment is of exactly the same form. The experiment consists of three parts:

(a) Open slit 1, close slit 2. The probability of a particle arriving at the point $X$ on the screen is $P_1(X)$.

(b) Open slit 2, close slit 1. The probability of a particle arriving at $X$ is now $P_2(X)$.

(c) Open both slits 1 and 2. The probability of a particle arriving at $X$ is $P_{12}(X)$.

Now passage through slit 1 and through slit 2 are mutually exclusive, so we deduce

$P_{12}(X) = P_1(X) + P_2(X)$,

which is empirically false. It is then concluded (fallaciously) that classical probability theory does not apply in quantum mechanics.

The above reasoning embodies essentially the same fallacy as does the Bookie's paradox, and it is resolved similarly by paying proper attention to the conditional nature of the probabilities.

Let condition $C_1$ = (slit 1 open, slit 2 closed). Let $C_2$ = (slit 2 open, slit 1 closed). Let $C_3$ = (both slits open).

We observe empirically that $P(X|C_1) + P(X|C_2) \neq P(X|C_3)$

(due, of course, to interference). But this fact is fully compatible with classical probability theory.
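A small numerical sketch makes the inequality concrete. It is an idealization introduced here, not part of the text: each slit is treated as a point source of unit-modulus amplitude, the geometry parameters are arbitrary illustrative values, and the quantities computed are unnormalized relative intensities, each proportional to the probability density under its own condition $C_1$, $C_2$ or $C_3$.

```python
import cmath
import math

def amplitudes(x, slit_separation=1.0, wavelength=0.1, screen_distance=100.0):
    """Unit-modulus amplitudes at screen position x from slit 1 and slit 2."""
    k = 2.0 * math.pi / wavelength
    r1 = math.hypot(screen_distance, x - slit_separation / 2.0)
    r2 = math.hypot(screen_distance, x + slit_separation / 2.0)
    return cmath.exp(1j * k * r1), cmath.exp(1j * k * r2)

def relative_intensities(x):
    """Relative intensities at x under C1, C2 and C3 (not normalized)."""
    a1, a2 = amplitudes(x)
    i1 = abs(a1) ** 2        # C1: only slit 1 open
    i2 = abs(a2) ** 2        # C2: only slit 2 open
    i12 = abs(a1 + a2) ** 2  # C3: both open, amplitudes add before squaring
    return i1, i2, i12

i1, i2, i12 = relative_intensities(0.0)  # centre of the screen
```

At the centre of the screen the two path lengths are equal, so the both-slits value is twice the sum of the single-slit values. The three numbers refer to three different conditions, so no rule of probability theory requires them to add.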

4.3 Quantum Probabilities

Quantum probabilities are not essentially different from classical probabilities, but like quantum theory itself, they do require some care in their interpretation. H. Jeffreys [7] remarked that the probability statements of quantum mechanics are incomplete, because a probability is always relative to a set of data and the data are not specified. In our terminology, Jeffreys is saying that all probabilities are conditional, and the conditions need to be specified to


make the probability statement meaningful. This can be accomplished through a propensity interpretation of quantum probabilities, with proper attention being given to the basic concepts of measurement and state preparation. When that is done, it can be demonstrated [9,10] that quantum probabilities obey all of the axioms of classical probability theory. The demonstration is straightforward, but too lengthy to review here, so I shall only remark on some conceptual points.

(a) The standard formula $P(A = a_n | \Psi) = |\langle a_n | \Psi \rangle|^2$, where $A |a_n\rangle = a_n |a_n\rangle$, should be read as:

The probability (propensity) for a measurement of the dynamical variable $A$ to yield the value $a_n$, conditional on the preparation of the state $\Psi$, is $|\langle a_n | \Psi \rangle|^2$.

Note that the propensity is conditioned by the physical process of state preparation, and not by anyone's beliefs or opinions.
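As a worked instance of the formula in (a), the spin-1/2 example of Sec. 4.1 can be evaluated directly. The sketch assumes the usual real two-component representation of spin-1/2 states polarized in a plane (an assumption of this example, as are the function names): a state polarized at angle $\phi$ is $(\cos(\phi/2), \sin(\phi/2))$, and the spin-up and spin-down eigenvectors along $\theta$ are built the same way.

```python
import math

def state(phi):
    """Prepared state Psi: spin-1/2 polarized at angle phi (real components)."""
    return (math.cos(phi / 2.0), math.sin(phi / 2.0))

def eigenvectors(theta):
    """Spin-up and spin-down eigenvectors for a magnet oriented at angle theta."""
    up = (math.cos(theta / 2.0), math.sin(theta / 2.0))
    down = (-math.sin(theta / 2.0), math.cos(theta / 2.0))
    return up, down

def born_probability(eigvec, psi):
    """|<a_n|Psi>|^2 for real two-component vectors."""
    overlap = eigvec[0] * psi[0] + eigvec[1] * psi[1]
    return overlap ** 2

theta, phi = 1.3, 0.4            # arbitrary illustrative angles
up, down = eigenvectors(theta)
psi = state(phi)
p_up = born_probability(up, psi)
p_down = born_probability(down, psi)
```

The two probabilities sum to one, and `p_up` agrees with $\cos^2[(\theta - \phi)/2]$. Both inputs are physical settings, the preparation angle $\phi$ and the apparatus angle $\theta$, and nothing in the calculation refers to an observer's knowledge.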

(b) One can also calculate the probability of a measurement result, conditioned by state preparation and the results of other measurements,

$P(B = b_m | (A = a_n) \,\&\, \Psi)$.

However, it is necessary that the measurement processes be described dynamically, as an interaction between the object and the apparatus. Simplistic application of the Projection Postulate is liable to give an incorrect answer [11].

(c) No difficulties of principle arise if the probabilities are conditioned on actual events of state preparation and measurement. But assigning probabilities to hypothetical unmeasured values is not always possible. This problem is encountered if we try to introduce joint probability distributions for (unmeasured values of) non-commuting observables, and require the marginal distributions to agree with the quantum probabilities of the individual observables.

In the case of position and momentum, we would like to have a joint distribution $P(x,p)$ that satisfies

$P(x,p) \geq 0$ (3)

$\int P(x,p)\, dp = |\langle x | \Psi \rangle|^2$ (4)

$\int P(x,p)\, dx = |\langle p | \Psi \rangle|^2$ (5)

There are infinitely many solutions to this problem [12], but there is no apparent physical reason for any one of them to be preferred.

However, in the case of angular momentum, where we might seek a joint distribution $P(J_x, J_y, J_z)$ for the three angular momentum components, it is


not difficult to show that no such function can yield the quantum probabilities of the three components as marginals. However, this has more to do with Kochen-Specker [13] difficulties (the impossibility of assigning values to all quantum observables consistent with all the relevant constraints) than with probability theory. There is no case in which a quantum probability is well defined but violates an axiom of classical probability theory.

5 Conclusions

In this paper I have suggested a scheme whereby all the major interpretations of probability are unified, with the separate interpretations now seen as applications of the general theory to particular subject matters. That such different ideas as ensemble-frequency theories, propensity theory, and subjective degrees of reasonable belief can all be encompassed within a single framework is both useful and surprising. Because they can all be described by the same mathematical axioms, it is easy to switch from one kind of probability to another, as may be appropriate in a particular problem. But, on the other hand, one can ask why such different things as frequencies, propensities, and degrees of belief should necessarily obey the same axiom system. This question should stimulate further foundational research.

For the case of degrees of reasonable belief, this work has already been completed by Cox [5,6], who showed that certain conditions of plausibility and consistency determine the axioms essentially uniquely. "Essentially unique" means subject only to formal transformations that do not alter the content of the theory. Therefore any alternative inequivalent system of plausible reasoning could be shown to suffer from some degree of inconsistency.

Khrennikov [14] has studied limit frequencies outside of any theory of probability, imposing only a condition of stabilization: that in a long sequence the frequencies should approach a limit. He has found many different cases to be possible, some of which lie outside of probability theory. It will be interesting to see whether these new logical possibilities are realized in nature. If not, then his stabilization condition will have to be supplemented by other conditions.

The greatest need for more foundational research is in the case of propensity. Although it clearly can be described by the axioms of probability theory, it is not yet clear why it must be so described.

Although I have dealt only with versions of probability theory that are derivable from the same axioms, I expect that the classification of interpretations (Fig. 1) may also be useful for generalized theories, such as those that admit negative probabilities [15]. For such generalizations, we should ask which of the interpretations do they support? Can such generalized probabilities be


interpreted as frequencies? As propensities? As degrees of belief? Or must they be given some entirely new interpretation?

There are connections between the interpretations of probability and of quantum mechanics. This must be so because quantum mechanics does not predict events, but only the probabilities of events. If one adheres exclusively to a frequency interpretation of probability, then one is bound to assert that a quantum state describes only an ensemble of similarly prepared systems. If, on the other hand, one adopts a propensity interpretation of probability, then it becomes possible to make meaningful probability statements about an individual system. However, the empirically testable content of those statements can be realized only by measurements on an ensemble of similarly prepared systems. Thus the frequency interpretation is not made obsolete by the propensity interpretation, but merely broadened. The subjective interpretation of probability can be used in some situations, such as when the observer is not fully informed about the state preparation procedure. But it is never correct to interpret $|\psi|^2$ as representing knowledge (except perhaps in the trivial case in which the observer's knowledge is complete and in perfect accord with reality).

References

1. T.L. Fine, Theories of Probability, an Examination of Foundations (Academic Press, New York, 1973).
2. E.T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, forthcoming); an incomplete version of this work is available electronically at http://bayes.wustl.edu
3. K.R. Popper, in Observation and Interpretation, ed. S. Korner (Butterworths, London, 1957).
4. K.R. Popper, Realism and the Aim of Science (Hutchinson, London, 1983).
5. R.T. Cox, The Algebra of Probable Inference (Johns Hopkins University Press, Baltimore, MD, 1961).
6. R.T. Cox, Am. J. Phys. 14, 1 (1946).
7. H. Jeffreys, Scientific Inference (Cambridge University Press, Cambridge, 1973), sec 1031.
8. K.R. Popper, Quantum Theory and the Schism in Physics (Hutchinson, London, 1982).
9. L.E. Ballentine, Quantum Mechanics - A Modern Development (World Scientific, Singapore, 1998), Ch. 1.5, 2.4, 9.6.
10. L.E. Ballentine, Am. J. Phys. 54, 883 (1986).
11. L.E. Ballentine, Found. Phys. 20, 1329 (1990).
12. L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, ed. G.T. Moore and M.O. Scully (Plenum, New York, 1986), pp. 97-117.
13. S. Kochen and E.P. Specker, J. Math. Mech. 17, 59 (1967).
14. A. Khrennikov, Nonconventional approach to elements of physical reality based on nonreal asymptotics of relative frequencies, Proc. Conf. Foundations of Probability and Physics, Vaxjo-2000 (WSP, Singapore, 2001).
15. A. Khrennikov, Interpretations of Probability (VSP, Utrecht, 1999).


FORCING DISCRETIZATION AND DETERMINATION IN QUANTUM HISTORY THEORIES

BOB COECKE
Imperial College of Science, Technology & Medicine, Theoretical Physics Group,
The Blackett Laboratory, South Kensington, London SW7 2BZ
and
Free University of Brussels, Department of Mathematics, Pleinlaan 2, B-1050 Brussels
E-mail: bocoecke@vub.ac.be

We present a formally deterministic representation for quantum history theories where we obtain the probabilistic structure via a discrete contextual variable: no continuous probabilities are as such involved at the primal level.

1 Introduction

In this paper we propose and study a model for history theories in which the probability structure emerges from a finite number of contextual happenings, any next happening having a fixed chance to occur under the condition that the previous one happened. Although this model cannot have a canonical mathematical status, since it has been proved that this type of representation in general admits no essentially unique smallest one [8,11], it provides insight in the emergence of logicality in the History Projection Operator setting [14], and it illustrates how deterministic behavior can be encoded beyond those interpretations of quantum history theories that are interpretationally restricted by so-called consistency or quasi-consistency (e.g. approximate decoherence). The particular motivation for this paradigm case study finds its origin in structural considerations towards a theory of quantum gravity [4,15,19]. As argued in [16], although the relative frequency interpretation of probability justifies the continuous interval as the codomain for value assignment, in the quantum gravity regime standard ideas of space and time might break down in such a way that the idea of spatial or temporal ensembles is inappropriate. For the other main interpretations of probability (subjective, logical or propensity) there seems to be no compelling a priori reason why probabilities should be real numbers. Our model should be envisioned as a deconstructive step, unraveling the probabilistic continuum as it appears in standard quantum theory and reducing it explicitly to a discrete temporal sequence of (contextual) events. The temporal sequence that emerges as such is then easier to manipulate towards alternative encoding of contextual events, e.g. in propositional terms. It also enables a separate treatment of internal (the system's) and external (the context's) time-encoding variable.

Although quantum history theories are currently most frequently envisioned in a context of so-called decoherence, we prefer to take the minimal perspective that a history theory is a theory that deals with sequential quantum measurements, but remains essentially a dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator-approach [14]. We also mention recently studied sequential structures in the context of quantum logic, of which references can be found in [10], resulting in a dynamic disjunctive quantum logic which provides an appropriate formal context to discuss the logicality of history theories.

A general theory on deterministic contextual models can be found in [8]. Note here that what we consider as contextuality is that in a measurement there is an interaction between the system and its context, and that precisely this interaction to some extent may influence the outcome of a measurement. A lack of knowledge on the precise interaction then yields quantum-type uncertainties. Besides this interpretational issue, classical representations are important since we think classically, so even without giving any conceptual significance to the representation, it provides a mode to think deterministically in terms of determined trajectories of the system's state, without having to reconcile with concrete non-canonical constructs like pilot-wave mechanics.

2 Outcome determination via contextual models

We will present the required results in full abstraction, such that the reader clearly sees which structural ingredient of quantum theory determines existence of contextual models. For details and proofs we refer to [8]. Let $B(\mathbb{R})$ denote the Borel subsets of $\mathbb{R}$.

Definition 1. A probabilistic measurement system is given by: (i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) For each $e \in \mathcal{E}$, an outcome set $O_e \in B(\mathbb{R})$, a $\sigma$-field $B(O_e)$ of $O_e$-subsets and (Kolmogorovian) probability measures $P_{p,e}: B(O_e) \to [0,1]$ for each $p \in \Sigma$.

The canonical example is that of quantum theory, with every Hilbert space ray $\psi$ representing a state, every self-adjoint operator $H$ representing a measurement with its spectrum $O_H \subseteq \mathbb{R}$ as outcome set, where the $\sigma$-structure $B(O_H)$ is inherited from that of $B(\mathbb{R})$, and with probability measures $P_{\psi,H}(E) := \langle \psi | P_E \psi \rangle$, where $P_E$ denotes the spectral projector for $E \in B(O_H)$. For the benefit of insight and also for notational convenience, we will from now on assume that the measurements $e \in \mathcal{E}$ are represented in a one to one way by their outcome sets $O_e$; note that whenever $\mathcal{E}$ can be represented by points of $\mathbb{R}^\nu$, it then suffices to consider $\mathbb{R} \times \mathbb{R}^\nu = \mathbb{R}^{1+\nu}$ instead of $\mathbb{R}$ to fulfill this assumption,


taking $O_e \times \{e\}$ as the corresponding outcome set. We stress however that the results listed below also hold in absence of this assumption [8, §1].

Definition 2. A pre-probabilistic hidden measurement system is given by: (i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) Sets $\mathcal{O} \subseteq B(\mathbb{R})$ and $\Lambda$ that parameterize $\mathcal{E}$, i.e. $\mathcal{E} = \{e_{\lambda,O} \,|\, \lambda \in \Lambda, O \in \mathcal{O}\}$, and each $e_{\lambda,O} \in \mathcal{E}$ goes equipped with a map $\varphi_{\lambda,O}: \Sigma \to O$.

We can represent $\{\varphi_{\lambda,O} \,|\, \lambda \in \Lambda\}$ as $\varphi_O: \Sigma \times \Lambda \to O: (p, \lambda) \mapsto \varphi_{\lambda,O}(p)$, giving $\Lambda$ a similar formal status as the set of states $\Sigma$, or as $\Delta\Lambda_O: \Sigma \times B(O) \to \mathcal{P}(\Lambda): (p, E) \mapsto \{\lambda \,|\, \varphi_O(p, \lambda) \in E\}$, where $\mathcal{P}(\Lambda)$ denotes the set of subsets of $\Lambda$. The core of this definition is that, given a state $p \in \Sigma$ and a value $\lambda \in \Lambda$, we have a completely determined outcome $\varphi_O(p, \lambda)$. These pre-probabilistic hidden measurement systems encode as such fully deterministic settings.

Definition 3. Whenever for a given pre-probabilistic hidden measurement system $(\Sigma, \mathcal{E}(\mathcal{O}, \Lambda), \{\varphi_O\}_{O \in \mathcal{O}})$ there exists a $\sigma$-field $B(\Lambda)$ of $\Lambda$-subsets that satisfies $\bigcup_{O \in \mathcal{O}} \{\Delta\Lambda_O(p, E) \,|\, (p, E) \in \Sigma \times B(O)\} \subseteq B(\Lambda)$, it defines a probabilistic hidden measurement system if a probability measure $\mu: B(\Lambda) \to [0,1]$ is also specified.

The condition on $\Delta\Lambda$ requires that all $\Delta\Lambda_O(p, E)$ are $B(\Lambda)$-measurable, such that to all triples $(p, O, E)$ we can assign a value $P_{p,O}(E) := \mu(\Delta\Lambda_O(p, E)) \in [0,1]$. As such, any probabilistic hidden measurement system defines a measurement system. The question then arises whether every probabilistic measurement system (MS) can be encoded as a probabilistic hidden measurement system (HMS). The answer to this question is yes [8, §4.2, Theorem 1, 2, 3]: There always exists a canonical HMS-representation for $\Lambda = [0,1]$, $B(\Lambda) = B([0,1])$ (i.e. the Borel sets in $[0,1]$) and $\mu_u([0,a]) = a$, i.e. uniformly distributed; the proof goes via a construction using the Loomis-Sikorski Theorem [17,20] and Marczewski's Lemma [13]. It makes as such sense to investigate how the different possible HMS-representations for different non-isomorphic pairs $(B(\Lambda), \mu)$ are structured; below it will become clear what we mean here by non-isomorphic.

First we will discuss an example that illustrates the above; it traces back to [1], and details and illustrations can be found in [2,8]. Consider the states of a spin-1/2 entity encoded as a point on the Poincare sphere $\Sigma_0\ (= \mathbb{C}^2/\mathbb{C}) \subset \mathbb{R}^3$. Then any pair of antipodically located points of $\Sigma_0$ encodes mutual orthogonal states, as such encodes mutual orthogonal one-dimensional projectors, and thus a (dichotomic) measurement. Let $p \in \Sigma_0$, let $(\alpha, \neg\alpha)$ be a pair of mutual orthogonal points of $\Sigma_0$ and let $\Lambda$ be the diagonal connecting $\alpha$ and $\neg\alpha$. Let $x_p \in \Lambda$ be the orthogonal projection of $p$ on the diagonal $\Lambda$. Then for $\lambda \in [x_p, \neg\alpha]$, i.e. $x_p \in [\alpha, \lambda]$, we set $\varphi_\alpha(p, \lambda) = \alpha$, and for $\lambda \in [\alpha, x_p[$, i.e. $x_p \in \,]\lambda, \neg\alpha]$, we set $\varphi_\alpha(p, \lambda) = \neg\alpha$. One then verifies that for $\mu: B([\alpha, \neg\alpha]) \to [0,1]: [\alpha, (1-x)\alpha + x\neg\alpha] \mapsto x$, i.e. uniformly distributed,


we obtain exactly the probability structure for spin-1/2 in quantum theory.^a An interpretational proposal of this model could be the following [1,2,3]: Rather than decomposing states as in so-called hidden variable theories, here we decompose the measurements in deterministic ones; the probability measure $\mu$ should then be envisioned as encoding the lack of knowledge on the interaction of the measured system with its environment, including the measurement device.
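The spin-1/2 construction can be checked with a short calculation. The sketch below is an illustration under stated conventions that are not fixed by the text: the sphere has unit radius, the state $p$ is specified by its angle from $\alpha$, and points on the diagonal are labelled by a coordinate in $[0,1]$ running from $\alpha$ (0) to $\neg\alpha$ (1). The function names are hypothetical.

```python
import math

def projection_coordinate(angle):
    """Coordinate x in [0, 1] of x_p = (1 - x) * alpha + x * (not alpha).

    `angle` is the angle between the state p and alpha on the unit sphere,
    so the projection of p on the diagonal sits at cos(angle) = 1 - 2x.
    """
    return (1.0 - math.cos(angle)) / 2.0

def outcome(angle, lam):
    """Deterministic outcome phi(p, lam) for a hidden value lam in [0, 1]."""
    x = projection_coordinate(angle)
    # lam in [x_p, not alpha] gives alpha; lam in [alpha, x_p[ gives not alpha.
    return "alpha" if lam >= x else "not alpha"

def prob_alpha(angle):
    """Uniform measure of the set of lam for which the outcome is alpha."""
    return 1.0 - projection_coordinate(angle)

angle = 1.1  # arbitrary illustrative angle between p and alpha
quantum = math.cos(angle / 2.0) ** 2
model = prob_alpha(angle)

# Crude check of the measure by scanning a grid of hidden values.
grid = [(k + 0.5) / 10000 for k in range(10000)]
fraction = sum(outcome(angle, lam) == "alpha" for lam in grid) / len(grid)
```

Each pair `(angle, lam)` fixes the outcome completely. The probability $\cos^2(\text{angle}/2)$ only appears once `lam` is averaged over with the uniform measure, which is the sense in which the randomness is attributed to the measurement context rather than to the state.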

We now introduce a notion of relative size of HMS-representations, justifying the use of "smaller". Given a $\sigma$-algebra^b $B$ and probability measure $\mu: B \to [0,1]$, denote by $B/\mu$ the $\sigma$-algebra of equivalence classes $[E]$ with respect to the relation

$E \sim E'$ iff $\mu(E \cap E'^c) = \mu(E' \cap E^c) = 0$,

i.e. iff $E$ and $E'$ coincide up to a symmetric difference of measure zero. The ordering of $B/\mu$ is inherited from $B$. For notational convenience denote the induced measure $B/\mu \to [0,1]: [E] \mapsto \mu(E)$ again by $\mu$. Given two pairs $(B, \mu)$ and $(B', \mu')$ consisting of separable $\sigma$-algebras and probability measures on them, set

$(B, \mu) \leq (B', \mu') \Leftrightarrow \exists f: B/\mu \to B'/\mu'$, an injective $\sigma$-morphism.

We call $(B, \mu)$ and $(B', \mu')$ equivalent, denoted $(B, \mu) \sim (B', \mu')$, whenever $f$ in the above is a $\sigma$-isomorphism. Given two MS $(\Sigma, \mathcal{E})$ and $(\Sigma', \mathcal{E}')$ we set

$(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}') \Leftrightarrow \exists s: \Sigma \to \Sigma'$, $\exists t: \mathcal{E} \to \mathcal{E}'$, both bijections; $\forall e \in \mathcal{E}$, $\exists f_e: B(O_e) \to B(O'_{t(e)})$, a $\sigma$-isomorphism; $\forall p \in \Sigma$, $\forall e \in \mathcal{E}$: $P'_{s(p),t(e)} \circ f_e = P_{p,e}$.

Via this equivalence relation we can define a relation $\leq_{MS}$ between classes of measurement systems $M$ and $M'$ as $M \leq_{MS} M'$ if for all $(\Sigma, \mathcal{E}) \in M$ there exists $(\Sigma', \mathcal{E}') \in M'$ such that $(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}')$, i.e. if $M$ is included in $M'$ up to MS-equivalence. We can then prove the following:

(i) $(B, \mu) \sim (B', \mu')$ if and only if $(B, \mu) \leq (B', \mu')$ and $(B', \mu') \leq (B, \mu)$ [8, §3, Lemma 1]; thus the equivalence classes with respect to $\sim$ constitute a partially ordered set (poset) for the ordering induced by $\leq$. We will denote the set of these equivalence classes by $\mathbb{M}$; a class in it will be denoted via a member of it as $[B, \mu]$.

(ii) When setting $\mathbb{M}_{HMS} = \{M[B(\Lambda), \mu] \,|\, [B(\Lambda), \mu] \in \mathbb{M}\}$, where $M[B(\Lambda), \mu]$ stands for all HMS with $B(\Lambda')$ and $\mu'$ such that $(B(\Lambda'), \mu') \in [B(\Lambda), \mu]$, we have that $(B(\Lambda), \mu) \leq (B(\Lambda'), \mu')$ and $M[B(\Lambda), \mu] \leq_{MS} M[B(\Lambda'), \mu']$ are equivalent [8, §3, Theorem 2]. This then results in:

Theorem 1. $(\mathbb{M}, \leq)$ and $(\mathbb{M}_{HMS}, \leq_{MS})$ are isomorphic posets.

One of the crucial ingredients in (ii) above, and also in the proof for general existence with $\Lambda = [0,1]$, is the following: when setting $\Delta\mathbb{M}(\Sigma, \mathcal{E}) = \{(B(O_e), P_{p,e}) \,|\, p \in \Sigma, e \in \mathcal{E}\}$ we obtain that $\Sigma, \mathcal{E}$ admits a HMS-representation with $B(\Lambda)$ and $\mu$ if and only if $\Delta\mathbb{M}(\Sigma, \mathcal{E}) \leq (B(\Lambda), \mu)$, where the order applies pointwisely to the elements of $\Delta\mathbb{M}(\Sigma, \mathcal{E})$ [8, §4.2, Theorem 1]. Using this and Theorem 1 above we can now translate properties of $\mathbb{M}$ to propositions on the existence of certain HMS-representations. We obtain the following:

^a As shown in [6,9], this deterministic model for spin-1/2 in $\mathbb{R}^3$ can be generalized to $\mathbb{R}^3$-models for arbitrary spin-N/2. The states are then represented in the so called Majorana representation [18,5], i.e. as N copies of $\Sigma_0$. Correct probabilistic behavior is then obtained by introducing entanglement between the N different spin-1/2 systems.

^b I.e. a pointless $\sigma$-field. In particular it follows from the Loomis-Sikorski theorem [17,20] that all separable $\sigma$-algebras (i.e. which contain a countable dense subset) can be represented as a $\sigma$-field; it as such also follows that assuming that $B(\Lambda)$ is a $\sigma$-field and not an abstracted $\sigma$-algebra imposes no formal restriction.

(i) $(\mathbb{M}, \leq)$ is not a join-semilattice, thus: In general there exists no smallest HMS-representation. As such, we will have to refine our study to particular settings where we are able to make statements on whether there exists a smallest one, and if not, whether we can say at least something on the cardinality of $\Lambda$.

(ii) One can prove a number of criteria on $\Delta\mathbb{M}(\Sigma, \mathcal{E})$ that force $(B(\Lambda), \mu) \sim (B([0,1]), \mu_u)$, as such assuring existence of a smallest representation. Among these the following: Let $\mathbb{M}_{finite} = \{(B(X), \mu) \in \mathbb{M} \,|\, X \text{ is finite}\}$. If $\mathbb{M}_{finite} \subseteq \Delta\mathbb{M}(\Sigma, \mathcal{E})$ then $\Lambda$ cannot be discrete. It then follows, for example, that quantum theory restricted to measurements with a finite number of outcomes still requires $\Lambda = [0,1]$.

(iii) Let $\mathbb{M}_N = \{(B(X), \mu) \in \mathbb{M} \,|\, X \text{ has at most } N \text{ elements}\}$. If $\Delta\mathbb{M}(\Sigma, \mathcal{E}) \subseteq \mathbb{M}_N$ then there exists a HMS-representation with $\Lambda = \mathbb{N}$. Thus quantum theory restricted to those measurements with at most a fixed number $N$ of outcomes has a discrete HMS-representation.

(iv) If $\Delta\mathbb{M}(\Sigma, \mathcal{E}) = \mathbb{M}_N$ then there exists no smallest HMS-representation. Neither does it exist when fixing the number of outcomes. So there is no essentially unique smallest HMS-representation for $N$-outcome quantum theory.

Although there exists no smallest, and as such no canonical, discrete HMS-representation, we will give the construction of one solution for dichotomic (or propositional) quantum theory, i.e. $N = 2$, since this will constitute the core of the model presented in this paper. We will follow [8, §2], to which we also refer for a construction for arbitrary $N$. Let us denote the quantum mechanical probability to obtain a positive outcome in a measurement of a proposition or question $a$ on a system in state $p$ as $P_p(a)$; the outcome set consists here of "we obtain a positive answer for the question $a$", slightly abusively denoted


as $a$ itself, and "we obtain a negative answer for the question $a$", denoted as $\neg a$. Set inductively for $\lambda \in \mathbb{N}$:^c

$\varphi_a(p, \lambda) = a$ iff $P_p(a) \geq \frac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \frac{1}{2^i} \delta(\varphi_a(p, i), a)$, and $\varphi_a(p, \lambda) = \neg a$ otherwise.

One verifies that for $\mu(\lambda) = \frac{1}{2^\lambda}$ we obtain the correct probabilities in the resulting HMS-model. This provides a discrete alternative for the above discussed $\mathbb{R}^3$-model for spin-1/2. The model, including the projection $x_p$, remains the same, although we don't consider $[\alpha, \neg\alpha]$ as $\Lambda$ anymore. Let $\lambda \in \Lambda = \mathbb{N}$. Set $x_n^\lambda = (1 - \frac{n}{2^\lambda})\alpha + (\frac{n}{2^\lambda})\neg\alpha$ for $n \in \mathbb{Z}_{2^\lambda - 1}$. For $x_p \in [\alpha, x_1^\lambda[ \,\cup\, [x_2^\lambda, x_3^\lambda[ \,\cup\, [x_4^\lambda, x_5^\lambda[ \,\cup \ldots \cup\, [x_{2^\lambda - 2}^\lambda, x_{2^\lambda - 1}^\lambda[$ we set $\varphi_\alpha(p, \lambda) = \alpha$, and $\varphi_\alpha(p, \lambda) = \neg\alpha$ otherwise. Then for $\mu: B(\mathbb{N}) \to [0,1]: \lambda \mapsto \frac{1}{2^\lambda}$ we obtain again quantum probability. Geometrically this means that, as compared to the first model, where the values of $\lambda \in \Lambda$ represent points on the diagonal, i.e. a continuous interval, or again equivalently, decompositions of an interval in two intervals, we now consider decompositions of an interval in $2^\lambda$ equally long parts, of which there are only a discrete number of possibilities. We refer to [8] for details and illustrations.
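The inductive definition amounts to reading off the binary digits of $P_p(a)$, which can be sketched as follows. This is a minimal illustration, assuming the rule "outcome $a$ for context $\lambda$ iff $P_p(a)$ is at least $1/2^\lambda$ plus the weights $1/2^i$ of the earlier contexts that gave $a$" and the measure $\mu(\lambda) = 1/2^\lambda$; the function names and the truncation at a finite number of contexts are choices made for this example.

```python
def outcomes(prob_a, max_lambda):
    """Deterministic outcomes for contexts lambda = 1, ..., max_lambda.

    Entry lambda - 1 is True when the outcome for that context is a.
    """
    results = []
    partial = 0.0  # sum of 1/2**i over earlier contexts i whose outcome was a
    for lam in range(1, max_lambda + 1):
        is_a = prob_a >= partial + 0.5 ** lam
        if is_a:
            partial += 0.5 ** lam
        results.append(is_a)
    return results

def recovered_probability(prob_a, max_lambda=50):
    """Total weight mu(lambda) = 1/2**lambda of the contexts that yield a."""
    flags = outcomes(prob_a, max_lambda)
    return sum(0.5 ** lam for lam, is_a in enumerate(flags, start=1) if is_a)

first_contexts = outcomes(0.75, 4)
approx = recovered_probability(0.3)
```

For $P_p(a) = 0.75$ the first two contexts give $a$ and the next two give $\neg a$, matching the binary expansion 0.11. For a generic value such as 0.3 the summed weights approach the target probability as more contexts are included, although every individual outcome is fixed once $p$ and $\lambda$ are given.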

3 Unitary ortho- and projective structure

In the above discussed $\mathbb{R}^3$-models, rotational symmetries were implicit in their spatial geometry. However, in general the decompositions of measurements over $\mu: B(\Lambda) \to [0,1]$ go measurement by measurement, so additional structure, if there is any, has to be put in by hand. It is probably fair to say that these contextual models only become non-trivial and useful when encoding physical symmetries within the maps $\varphi_a$ in an appropriate manner. For sake of the argument we will distinguish between three types of symmetries that can be encoded, namely unitary, ortho- and projective ones.

i. Unitary symmetries. When considering quantum measurements with discrete non-degenerate spectrum we can represent the outcomes $o_{e,i}$ by the corresponding eigenstates $p_{e,i}$ via spectral decomposition, i.e., there exists an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\Sigma)$ for each $e \in \mathcal{E}$. Then specification of $\varphi_{e_0} : \Sigma \times \Lambda \to \{p_{e_0,i}\}_i$ and $\mu$ for one measurement $e_0 \in \mathcal{E}$ fixes it for any other $e \in \mathcal{E}$ by symmetry, $\varphi_e = (U \circ \varphi_{e_0} \circ U^{-1}) : \Lambda \times \Sigma \to \{p_{e,i}\}_i$, where $U : \Sigma \to \Sigma$ is the unitary transformation that satisfies $U(p_{e_0,i}) = p_{e,i}$, and $\mu_e = \mu$. This is exactly the

c We agree on $\mathbb{N} = \{1, 2, \ldots\}$. Note here that already from the non-uniqueness of binary decomposition, e.g. $\frac{1}{2} = \frac{1}{2^1} = \sum_{i \in \mathbb{N}} \frac{1}{2^{i+1}}$, it follows that the construction below is not canonical. Obviously there are also less pathological differences between the different non-comparable discrete representations [8].


symmetry encoded in the above described $\mathbb{E}^3$-models. Note in particular that in this perspective the pairs $(a, \neg a)$ and $(\neg a, \neg(\neg a))$ should not be envisioned as merely a change of names of the outcomes, but truly as putting the measurement device (or at least its detecting part) upside down (see footnote d). In this setting, where we represent outcomes as states, the assignment of an outcome can now be envisioned as a true change of state $f_{e,\lambda} : \Sigma \to \Sigma\ (\supseteq O_e) : p \mapsto \varphi_e(p, \lambda)$, as such allowing us to describe the behavior of the system under concatenated measurements.

ii. Projective symmetries. For degenerate quantum measurements the outcomes require representation by higher dimensional subspaces, so identification in terms of states now requires an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\mathcal{P}(\Sigma))$. The behavior of states of the system under concatenated measurements then requires specification of a family of projectors $\{\pi_T : \Sigma \to T \mid T \in O_e\}$, e.g., the orthogonal projectors $\pi_A : \Sigma \to A : p \mapsto A \wedge (p \vee A^\perp)$ on the corresponding subspace $A$ in quantum theory. The above discussed non-degenerate case also fits in this picture by setting $O_e \subseteq \{\{p\} \mid p \in \Sigma\}$, where now each $\pi_{\{p\}} : \Sigma \to \{p\}$ is uniquely determined (having a singleton codomain).

iii. Orthosymmetries. The existence of an orthocomplementation on the lattice of closed subspaces of a Hilbert space provides a dichotomic representation for measurements, which can be envisioned as a pair consisting of a (to be verified) proposition $a$ and its negation $\neg a$, in quantum theory yielding $\pi_{\neg A} : \Sigma \to A^\perp : p \mapsto A^\perp \wedge (p \vee A)$. In terms of linear operator calculus we have $\pi_{\neg A} = 1 - \pi_A$, both of them being orthogonal projectors.

4 Representing quantum history theory

Although quantum history theory involves sequential measurements, one of its goals is to remain an essentially dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator approach [14]. The key idea here is that the form of logicality aimed at in [14] represents faithfully in the Hilbert space tensor product (see footnote e). Let $A = (\alpha_{t_i})_i$ be a

d The attentive reader will note that it is at this point that we escape the so-called hidden variable no-go theorems. They arise when trying to impose contextual symmetries within the states of the system, by requiring that values of observables are independent of the chosen context, e.g., the proof of the Kochen-Specker theorem. Our newly introduced variable $\lambda \in \Lambda$ follows contextual manipulations in an obvious manner.

e At this point we mention that in the study of sequential phenomena in the axiomatic quantum theory perspective on quantum logic, sequentiality and compoundness both turn out to be specifications of a universal causal duality [10], as such providing a metaphysical perspective on the use of tensor products both for the description of compound physical systems and sequential processes.


(so-called homogeneous) quantum history proposition with temporal support $(t_1, t_2, \ldots, t_n)$. Then, rather than representing this as a sequence of subspaces $(A_i)_i$ or projectors $(\pi_i)_i$, we will either represent $A$ as a pure tensor $\otimes_i A_i$ in the lattice of closed subspaces of the tensor product of the corresponding Hilbert spaces, or as the orthogonal projector $\otimes_i \pi_i$ on this subspace. The crucial property of this representation is then that $\neg A$ again encodes as a projector, namely $\mathrm{id} - \otimes_i \pi_i$ [14], clarifying the notations $\pi_A$ and $\pi_{\neg A}$. Moreover, if $\{A^i\}_i$ is a set of so-called disjoint history propositions, i.e., $\otimes_k A^i_k \perp \otimes_k A^j_k$ for $i \neq j$, then the history proposition that expresses the disjunction of $\{A^i\}_i$ sensu [14] is exactly encoded as the projector $\sum_i \otimes_k \pi^i_k$. We get as such a kind of logical setting that is still encoded in terms of projectors. Note that $\pi_{\neg A}$ is not of the form $\otimes_i \pi_i$ but of the form $\sum_i \otimes_k \pi^i_k$, breaking the structural symmetry between a proposition and its negation in ordinary quantum theory.

We will now transcribe the observations in the two previous sections to this setting, in order to provide a contextual deterministic model for quantum history theory with discretely originating probabilities. One could say that we will apply a split picture in terms of Schrödinger-Heisenberg: we assume that on the level of unitary evolution we apply the Heisenberg picture, such that we can fix notation without reference to this evolution, but for changes of state due to measurement we will (obviously) express this in the state space. When encoding outcomes in terms of states we need to consider n copies of $\Sigma$, encoding the trajectories due to the measurements. In view of the considerations made above it will be no surprise that we will consider these trajectories as of the form $\otimes_i p_i$ in the tensor product $\otimes_i \Sigma_i$. This will require the introduction of the following pseudo-projector:

• $\pi^\otimes : \Sigma \to \otimes_i \Sigma_i : p \mapsto p^\otimes := p \otimes \pi_1(p) \otimes \ldots \otimes (\pi_{n-1} \circ \ldots \circ \pi_1)(p)$.

Setting $\Sigma^\otimes := \pi^\otimes[\Sigma] = \{p^\otimes \mid p \in \Sigma\}$, then $\pi^\otimes : \Sigma \to \Sigma^\otimes$ encodes a bijective representation of $\Sigma$. Noting that $P_p(A) = \langle p^\otimes \mid \pi_A\, p^\otimes \rangle$ is the probability given by quantum theory to obtain $A$, we then set inductively, for fixed $\lambda \in \mathbb{N}$, that $\varphi_A(p, \lambda) = A$ if and only if

• $\langle p^\otimes \mid \pi_A\, p^\otimes \rangle \geq \dfrac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \dfrac{\delta(\varphi_A(p,i),\,A)}{2^i}$,

and $\varphi_A(p, \lambda) = \neg A$ otherwise. The outcome trajectories in case we obtain $A$ are then given in terms of initial states by $(\pi_A \circ \pi^\otimes) : \Sigma \to \otimes_i A_i$. The value $\lambda \in \mathbb{N}$ can be envisioned as follows. We assume it to be a number of contextual events, either real or virtual depending on one's taste, and we assume that, given that some events already happened, the chance of a next one happening is equal to the chance that it doesn't happen. So we actually consider a finite number of probabilistically balanced consecutive binary decisive processes, where the result of the previous one determines whether we actually


will perform the next one. Unitary symmetries are induced in the obvious way as tensored unitary operators $\otimes_i U_i$. This model then produces the statistical behavior of quantum history theory.

The breaking of the structural symmetry between a proposition and its negation manifests itself in the most explicit way, in the sense that when we have a determined outcome $\neg A$ we don't have a determined trajectory in our model. Obviously one could build a fully deterministic model that also determines this, by concatenation of individual deterministic models (one for each element in the temporal support), but we feel that this would not be in accordance with the propositional flavor a history theory aims at. The negation $\neg A$ is indeed cognitive, and not ontological with respect to the actual executed physical procedure, or in other words, the system's context, and one cannot expect an ontological model to encode this in terms of a formal duality. Explicitly, $\neg(A \otimes B)$ can be written both as $(\mathcal{H} \otimes \neg B) \oplus (\neg A \otimes B)$ and $(\neg A \otimes \mathcal{H}) \oplus (A \otimes \neg B)$, which clearly define different procedures with respect to imposed change of state due to the measurement. Even more explicitly, setting $\mathcal{HPO}(\{\mathcal{H}_k\}_k) := \{\oplus_i \otimes_k A^i_k \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$, for $\mathcal{L}(\mathcal{H}_k)$ the lattice of closed subspaces of $\mathcal{H}_k$, the ontologically faithful hull of $\mathcal{HPO}(\{\mathcal{H}_k\}_k)$ consists then of all ortho-ideals $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k)) :=$

• $\{\downarrow\![\{\otimes_k A^i_k\}_i] \mid A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$,

where $\downarrow\![-]$ assigns to a set of pure tensors all pure tensors in $\otimes_k \mathcal{H}_k$ that are smaller than at least one in the given set, this with respect to the ordering in $\mathcal{L}(\otimes_k \mathcal{H}_k)$; the downset $\downarrow\![-]$ construction makes $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$ inherit the $\mathcal{L}(\otimes_k \mathcal{H}_k)$-order as intersection. If a particular decomposition is specified as an element of $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$, which means full specification of the physical procedure, where summation over different sequences of pure tensors is now envisioned as choice of procedure, we can provide a deterministic contextual model, the choice of procedure itself becoming an additional variable. In conclusion, the HPO-setting loses part of the physical ontology that goes with an operational perspective on quantum theory, and as such, if we want to provide a deterministic representation for general inhomogeneous history propositions sensu the one we obtained for the homogeneous ones, we formally need to restore this part of the physical ontology, e.g., as $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$.

5 Further discussion

In this paper we didn't provide an answer, and we didn't even pose a question. We just provided a new way to think about things, slightly confronting the

A choice that is motivated by the traditional consistent history setting and its interpretation as well as by a particular semantical perspective on quantum logic as a whole


usual consistency or decoherence perspective for history theories. Even if one does not subscribe to the underlying deterministic nature of the model, it still exhibits what a minimal representation of the indeterministic ingredients can be, as such representing it in a more tangible way. With respect to the non-existence of a smallest representation: in view of other physical considerations it could be that one of the constructible discrete models presents itself as the truly canonical one, e.g., through equilibrium or other thermodynamical considerations, or metastatistical ones emerging from additional modelization.

Acknowledgments

We thank Chris Isham for useful discussions on the content of this paper.

References

1. D. Aerts, J. Math. Phys. 27, 202 (1986).
2. D. Aerts, Int. J. Theor. Phys. 32, 2207 (1993).
3. D. Aerts, Found. Phys. 24, 1227 (1994).
4. G. K. Au, Interview with A. Ashtekar, C. J. Isham and E. Witten, The Quest for Quantum Gravity, arXiv: gr-qc/9506001 (1995).
5. H. Bacry, J. Math. Phys. 15, 1686 (1974).
6. B. Coecke, Helv. Phys. Acta 68, 396 (1995).
7. B. Coecke, Found. Phys. Lett. 8, 437 (1995).
8. B. Coecke, Helv. Phys. Acta 70, 442, 462 (1997); arXiv: quant-ph/0008061 & 0008062; Tatra Mt. Math. Publ. 10, 63.
9. B. Coecke, Found. Phys. 28, 1347 (1998).
10. B. Coecke et al., Found. Phys. Lett. 14 (2001); arXiv: quant-ph/0009100.
11. N. Gisin and C. Piron, Lett. Math. Phys. 5, 379 (1981).
12. S. Gudder, J. Math. Phys. 11, 431 (1970).
13. A. Horn and H. Tarski, Trans. AMS 64, 467 (1948).
14. C. J. Isham, J. Math. Phys. 23, 2157 (1994).
15. C. J. Isham, Structural Issues in Quantum Gravity, in General Relativity and Gravitation: GR14, pp. 167 (World Scientific, Singapore, 1997).
16. C. J. Isham and J. Butterfield, Found. Phys. 30, 1707 (2000).
17. L. Loomis, Bull. AMS 53, 757 (1947).
18. E. Majorana, Nuovo Cimento 9, 43 (1932).
19. C. Rovelli, Strings, Loops and Others: A Critical Survey of the Present Approaches to Quantum Gravity, Plenary Lecture at GR15, Poona, India (1998); arXiv: gr-qc/9803024.
20. R. Sikorski, Fund. Math. 35, 247 (1948).


INTERPRETATIONS OF QUANTUM MECHANICS, AND INTERPRETATIONS OF VIOLATION OF BELL'S INEQUALITY

WILLEM M. DE MUYNCK
Theoretical Physics, Eindhoven University of Technology,
POB 513, 5600 MB Eindhoven, the Netherlands
E-mail: W.M.d.Muynck@tue.nl

The discussion of the foundations of quantum mechanics is complicated by the fact that a number of different issues are closely entangled. Three of these issues are i) the interpretation of probability, ii) the choice between realist and empiricist interpretations of the mathematical formalism of quantum mechanics, iii) the distinction between measurement and preparation. It will be demonstrated that an interpretation of violation of Bell's inequality by quantum mechanics as evidence of non-locality of the quantum world is a consequence of a particular choice between these alternatives. Also, a distinction must be drawn between two forms of realism, viz. a) realist interpretations of quantum mechanics, b) the possibility of hidden-variables (sub-quantum) theories.

1 Realist and empiricist interpretations of quantum mechanics

In realist interpretations of the mathematical formalism of quantum mechanics, state vector and observable are thought to refer to the microscopic object in the usual way presented in most textbooks. Although, of course, preparing and measuring instruments are often present, these are not taken into account in the mathematical description (unless, as in the theory of measurement, the subject is the interaction between object and measuring instrument).

In an empiricist interpretation quantum mechanics is thought to describe relations between input and output of a measurement process. A state vector is just a label of a preparation procedure; an observable is a label of a measuring instrument. In an empiricist interpretation quantum mechanics is not thought to describe the microscopic object. This, of course, does not imply that this object would not exist; it only means that it is not described by quantum mechanics. Explanation of relations between input and output of a measurement process should be provided by another theory, e.g. a hidden-variables (sub-quantum) theory. This is analogous to the way the theory of rigid bodies describes the empirical behavior of a billiard ball, or to the description by thermodynamics of the thermodynamic properties of a volume of gas, explanations being relegated to theories describing the microscopic (atomic) properties of the systems.

Although a term like 'observable' (rather than 'physical quantity') is evidence of the empiricist origin of quantum mechanics (compare Heisenberg [1]), there has always existed a strong tendency toward a realist interpretation in which observables are considered as properties of the microscopic object, more or less analogous to classical ones. Likewise, many physicists are used to thinking about electrons as wave packets flying around in space, without bothering too much about the 'Unanschaulichkeit' that for Schrödinger [2] was such a problematic feature of quantum theory. Without entering into a detailed discussion of the relative merits of either of these interpretations (e.g. de Muynck [3]), it is noted here that an empiricist interpretation is in agreement with the operational way theory and experiment are compared in the laboratory. Moreover, it is free of paradoxes, which have their origin in a realist interpretation. As will be seen in the next section, the difference between realist and empiricist interpretations is highly relevant when dealing with the EPR problem.

2 EPR experiments and Bell experiments

In figure 1 the experiment is depicted that was proposed by Einstein, Podolsky and Rosen [4] to study (in)completeness of quantum mechanics. A pair of particles (1 and 2) is prepared in an entangled state and allowed to separate. A measurement is performed on particle 1. It is essential to the EPR reasoning that particle 2 does not interact with any measuring instrument, thus allowing one to consider so-called 'elements of physical reality' of this particle, which can be considered as objective properties, attributable to particle 2 independently of what happens to particle 1. By EPR this arrangement was presented as a way to perform a measurement on particle 2 without in any way disturbing this particle.

Figure 1: EPR experiment. [Diagram label: measuring instrument for Q or P.]

The EPR experiment should be compared to correlation measurements of the type performed by Aspect et al. [5,6] to test Bell's inequality (cf. figure 2). In these latter experiments particle 2, too, is interacting with a measuring instrument. In the literature these experiments are often referred to as EPR experiments too, thus neglecting the fundamental difference between the two measurement arrangements of figures 1 and 2. This negligence has been responsible for quite a bit of confusion, and should preferably be avoided by referring to the latter experiments as Bell experiments rather than EPR ones. In EPR experiments particle 2 is not subject to a measurement, but to a (conditional) preparation (conditional on the measurement result obtained for particle 1). This is especially clear in an empiricist interpretation, because here measurement results cannot exist unless a measuring instrument is present, its pointer positions corresponding to the measurement results.

Figure 2: Bell experiment.

Unfortunately, the EPR experiment of figure 1 was presented by EPR as a measurement performed on particle 2, and accepted by Bohr as such. That this could happen is a consequence of the fact that both Einstein and Bohr entertained a realist interpretation of quantum mechanical observables (note that they differed with respect to the interpretation of the state vector), the only difference being that Einstein's realist interpretation was an objectivistic one (in which observables are considered as properties of the object, possessed independently of any measurement: the EPR 'elements of physical reality'), whereas Bohr's was a contextualistic realism (in which observables are only well-defined within the context of the measurement). Note that in Bell experiments the EPR reasoning would break down because, due to the interaction of particle 2 with its measuring instrument, there cannot exist 'elements of physical reality'.

Much confusion could have been avoided if Bohr had maintained his interactional view of measurement. However, by accepting the EPR experiment as a measurement of particle 2 he had to weaken his interpretation to a relational one (e.g. Popper [7], Jammer [8]), allowing the observable of particle 2 to be co-determined by the measurement context for particle 1. This introduced, for the first time, non-locality in the interpretation of quantum mechanics. But this could easily have been avoided if Bohr had required that for a measurement of particle 2 a measuring instrument should be actually interacting with this very particle, with the result that an observable of particle n (n = 1, 2) can be co-determined in a local way by the measurement context of that particle only. This, incidentally, would have made the EPR 'elements of physical reality' completely obsolete, and would have been quite a bit less confusing than the answer Bohr [9] actually gave (to the effect that the definition of the EPR 'element of physical reality' would be ambiguous because it did not take into account the measurement arrangement for the other particle), thus promoting the non-locality idea.

Summarizing, the idea of EPR non-locality is a consequence of i) a neglect of the difference between EPR and Bell experiments (equating 'elements of physical reality' to measurement results), ii) a realist interpretation of quantum mechanics (considering measurement results as properties of the microscopic object, i.e. particle 2). In an empiricist interpretation there is no reason to assume any non-locality.

It is often asserted that non-locality is proven by the Aspect experiments because these are violating Bell's inequality. The reason for such an assertion is that it is thought that locality is a necessary condition for a derivation of Bell's inequality. However, as will be demonstrated in the following, this cannot be correct, since this inequality can be derived from quite different assumptions. Also, experiments like the Aspect ones, although violating Bell's inequality, do not exhibit any trace of non-locality, because their measurement results are completely consistent with the postulate of local commutativity, implying that relative frequencies of measurement results are independent of which measurements are performed in causally disconnected regions. Admittedly, this does not logically exclude a certain non-locality at the individual level, being unobservable at the statistical level of quantum mechanical probability distributions. However, from a physical point of view a 'peaceful coexistence' between locality at the (physically relevant) statistical level and non-locality at the individual level is extremely implausible. Unobservability of the latter would require a kind of conspiracy not unlike the one making the 19th century world aether unobservable. For this reason the non-locality explanation of the experimental violation of Bell's inequality does not seem to be very plausible, and it seems wise to look for alternative explanations.

Since locality is never the only assumption in deriving Bell's inequality, such alternative explanations do exist. Thus, Einstein's assumption of the existence of 'elements of physical reality' is such an additional assumption. More generally, in Bell's derivation [10] the existence of hidden variables is one. Is it still possible to derive Bell's inequality if these assumptions are abolished? Moreover, even assuming the possibility of hidden-variables theories, are there in Bell's derivation no hidden assumptions additional to the locality assumption?

Bell's inequality refers to a set of four quantum mechanical observables, $A_1$, $B_1$, $A_2$ and $B_2$, observables with different/identical indices being compatible/incompatible. In the Aspect experiments measurements of the four possible compatible pairs are performed (in these experiments $A_n$ and $B_n$ refer to polarization observables of photon $n$, $n = 1, 2$, respectively). Bell's inequality can typically be derived for the stochastic quantities of a classical Kolmogorovian probability theory. Hence, violation of Bell's inequality is an indication that observables $A_1$, $B_1$, $A_2$ and $B_2$ are not stochastic quantities in the sense of Kolmogorov's probability theory. In particular, there cannot exist a quadrivariate joint probability distribution of these four observables. Such a non-existence is a consequence of the incompatibility of certain of the observables. Since incompatibility is a local affair, this is another reason to doubt the non-locality explanation of the violation of Bell's inequality.

In the following, derivations of Bell's inequality will be scrutinized to see whether the locality assumption is as crucial as was assumed by Bell. In doing so it is necessary to distinguish derivations in quantum mechanics from derivations in hidden-variables theories.

3 Bell's inequality in quantum mechanics

For dichotomic observables having values $\pm 1$, Bell's inequality is given according to

$$\left| \langle A_1 A_2 \rangle - \langle A_1 B_2 \rangle - \langle B_1 B_2 \rangle - \langle B_1 A_2 \rangle \right| \leq 2. \qquad (1)$$

A more general inequality, being valid for arbitrary values of the observables, is the BCHS inequality

$$-1 \leq p(b_1, a_2) + p(b_1, b_2) + p(a_1, b_2) - p(a_1, a_2) - p(b_1) - p(b_2) \leq 0, \qquad (2)$$

from which (3.1) can be derived for the dichotomic case. Because of its independence of the values of the observables, inequality (3.2) is preferable by far over inequality (3.1). Bell's inequality may be violated if some of the observables are incompatible: $[A_1, B_1]_- \neq O$, $[A_2, B_2]_- \neq O$.
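A small numerical sketch may make the last remark concrete. It assumes, purely for illustration, a spin-1/2 singlet state, for which the standard quantum correlation between analyser angles $x$ and $y$ is $-\cos(x - y)$ (the Aspect experiments use photon polarization instead, where the angle dependence is doubled). The function names and the chosen angles are hypothetical; they are one convenient choice for which the combination of inequality (3.1) exceeds 2.

```python
import math


def singlet_correlation(angle_1, angle_2):
    """Quantum expectation <A B> for a spin-1/2 singlet (assumed model state)."""
    return -math.cos(angle_1 - angle_2)


def bell_combination(a1, b1, a2, b2, corr=singlet_correlation):
    """<A1 A2> - <A1 B2> - <B1 B2> - <B1 A2> for the given analyser angles."""
    return corr(a1, a2) - corr(a1, b2) - corr(b1, b2) - corr(b1, a2)


# Incompatible settings on each side: A1, B1 at 0 and pi/2; A2, B2 at 3pi/4 and pi/4.
violating_value = bell_combination(0.0, math.pi / 2, 3 * math.pi / 4, math.pi / 4)

# Degenerate case A1 = B1 and A2 = B2: only compatible observables are involved.
compatible_value = bell_combination(0.0, 0.0, 1.0, 1.0)
```

With these angles the combination equals $2\sqrt{2}$, whereas in the degenerate case, where no incompatible observables occur, it stays within the bound of 2.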

I shall now discuss two derivations of Bell's inequality which can be formulated within the quantum mechanical formalism, and which do not rely on the existence of hidden variables. The first one relies on a 'possessed values' principle, stating that

'possessed values' principle = values of quantum mechanical observables may be attributed to the object as objective properties, possessed by the object independent of observation.

The 'possessed values' principle can be seen as an expression of the objectivistic-realist interpretation of the quantum mechanical formalism preferred by Einstein (compare the EPR 'elements of physical reality'). The important point is that by this principle well-defined values are simultaneously attributed to incompatible observables. If $a_i^{(n)}, b_j^{(n)} = \pm 1$ are the values of $A_i$ and $B_j$ for the $n$th of a sequence of $N$ particle pairs, then we have

$$-2 \leq a_1^{(n)} a_2^{(n)} - a_1^{(n)} b_2^{(n)} - b_1^{(n)} b_2^{(n)} - b_1^{(n)} a_2^{(n)} \leq 2,$$

from which it directly follows that the quantities

$$\langle A_1 A_2 \rangle = \frac{1}{N} \sum_{n=1}^{N} a_1^{(n)} a_2^{(n)}, \quad \text{etc.}$$

must satisfy Bell's inequality (3.1) (a similar derivation was first given by Stapp [11], although starting from quite a different interpretation). The essential point in the derivation is the assumption of the existence of a quadruple of values $(a_1, b_1, a_2, b_2)$ for each of the particle pairs.
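The counting argument behind this derivation is easy to check by brute force. The following sketch, with hypothetical function names, enumerates all sixteen quadruples of values $\pm 1$ and then averages the same expression over an arbitrary (here pseudo-random) sequence of quadruples; the random sample merely stands in for any sequence of particle pairs to which possessed values are attributed.

```python
import random
from itertools import product


def bell_term(a1, b1, a2, b2):
    """a1*a2 - a1*b2 - b1*b2 - b1*a2 for one quadruple of possessed values."""
    return a1 * a2 - a1 * b2 - b1 * b2 - b1 * a2


# Every one of the 16 possible quadruples gives either -2 or +2.
all_terms = {bell_term(*quad) for quad in product((-1, 1), repeat=4)}


def averaged_combination(quadruples):
    """<A1A2> - <A1B2> - <B1B2> - <B1A2> computed from a list of quadruples."""
    return sum(bell_term(*quad) for quad in quadruples) / len(quadruples)


rng = random.Random(0)  # fixed seed, illustrative data only
sample = [tuple(rng.choice((-1, 1)) for _ in range(4)) for _ in range(1000)]
sample_value = averaged_combination(sample)
```

Since each individual term is $\pm 2$, any average over quadruples necessarily lies between $-2$ and $2$, whatever the statistics of the sequence.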

From the experimental violation of Bell's inequality it follows that an objectivistic-realist interpretation of the quantum mechanical formalism, encompassing the 'possessed values' principle, is impossible. Violation of Bell's inequality entails failure of the 'possessed values' principle (no quadruples available). In view of the important role measurement is playing in the interpretation of quantum mechanics this is hardly surprising. As is well-known, due to the incompatibility of some of the observables the existence of a quadruple of values can only be attained on the basis of doubtful counterfactual reasoning. If a realist interpretation is feasible at all, it seems to have to be a contextualistic one, in which the values of observables are co-determined by the measurement arrangement. In the case of Bell experiments non-locality does not seem to be involved.

As a second possibility to derive Bell's inequality within quantum mechanics we should consider derivations of the BCHS inequality (3.2) from the existence of a quadrivariate probability distribution $p(a_1, b_1, a_2, b_2)$ by Fine [12] and Rastall [13] (also de Muynck [14]). Hence, from violation of Bell's inequality the non-existence of a quadrivariate joint probability distribution follows. In view of the fact that incompatible observables are involved, this once again is hardly surprising.
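That the BCHS inequality follows from the mere existence of a quadrivariate distribution can be illustrated numerically. In the sketch below each observable is reduced to a binary indicator (1 if the value of interest occurs, 0 otherwise), the joint distribution is an arbitrary normalized table over the sixteen indicator quadruples, and the probabilities in (3.2) are computed as its marginals. The helper names `bchs_value` and `random_joint` are hypothetical, and this is a numerical check under those assumptions rather than the derivation given in the cited references.

```python
import random
from itertools import product

A1, B1, A2, B2 = 0, 1, 2, 3  # positions inside an indicator quadruple


def bchs_value(joint):
    """Middle expression of the BCHS inequality, from marginals of `joint`."""
    def marginal(*required):
        # probability that all listed indicators equal 1
        return sum(p for quad, p in joint.items()
                   if all(quad[i] == 1 for i in required))

    return (marginal(B1, A2) + marginal(B1, B2) + marginal(A1, B2)
            - marginal(A1, A2) - marginal(B1) - marginal(B2))


def random_joint(rng):
    """An arbitrary normalized quadrivariate distribution (illustrative)."""
    weights = [rng.random() for _ in range(16)]
    total = sum(weights)
    return {quad: w / total
            for quad, w in zip(product((0, 1), repeat=4), weights)}


def point_mass(quad):
    """Distribution concentrated on a single quadruple of indicator values."""
    return {q: (1.0 if q == quad else 0.0) for q in product((0, 1), repeat=4)}
```

For every point mass the expression equals either $-1$ or $0$, and because it is linear in the joint distribution, every mixture of point masses, i.e. every quadrivariate distribution, stays within $[-1, 0]$.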

A priori there are two possible reasons for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$. First, it is possible that the limit $\lim_{N \to \infty} N(a_1, b_1, a_2, b_2)/N$ of the relative frequencies of quadruples of measurement results does not exist. Since, however, Bell's inequality already follows from the existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$ with finite $N$, and the limit $N \to \infty$ is never involved in any experimental implementation, this answer does not seem to be sufficient. Therefore the reason for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$ can only be the non-existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$. This seems to reduce the present case to the previous one. Bell's inequality can be violated because quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$ do not exist.

Could non-locality explain the non-existence of quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$? Indeed, it could. If the value of $A_1$, say, is co-determined by the measurement arrangement of particle 2, then non-locality could entail

$$a_1(A_2) \neq a_1(B_2), \qquad (3)$$

thus preventing the existence of one single value of observable $A_1$ for the two Aspect experiments involving this observable. This precisely is the 'non-locality explanation' referred to above. This explanation is close to Bohr's 'ambiguity' answer to EPR referred to in section 2, stating that the definition of an 'element of physical reality' of observable $A_1$ must depend on the measurement context of particle 2.

As will be demonstrated next, there is a more plausible local explanation, however, based on the inequality

$$a_1(A_1) \neq a_1(B_1), \qquad (4)$$

expressing that the value of $A_1$, say, will depend on whether either $A_1$ or $B_1$ is measured. Inequality (3.4) could be seen as an implementation of Heisenberg's disturbance theory of measurement, to the effect that observables incompatible with the actually measured one are disturbed by the measurement. That such an effect is really occurring in the Aspect experiments can be seen from the generalized Aspect experiment depicted in figure 3. This experiment should be compared with the Aspect switching experiment, in which the switches have been replaced by two semi-transparent mirrors (transmissivities $\gamma_1$ and $\gamma_2$, respectively). The four Aspect experiments are special cases of the generalized one, having $\gamma_n = 0$ or $1$, $n = 1, 2$.

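Restricting for a moment to one side of the interferometer, it is possible to calculate the joint detection probabilities of the two detectors according to (rows $i = +, -$; columns $j = +, -$)

$$\left( p_{\gamma_1}(a_{1i}, b_{1j}) \right) = \begin{pmatrix} 0 & \gamma_1 \langle E^{(1)}_+ \rangle \\ (1 - \gamma_1) \langle F^{(1)}_+ \rangle & 1 - \gamma_1 \langle E^{(1)}_+ \rangle - (1 - \gamma_1) \langle F^{(1)}_+ \rangle \end{pmatrix}, \qquad (5)$$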

in which $\{E^{(1)}_+, E^{(1)}_-\}$ and $\{F^{(1)}_+, F^{(1)}_-\}$ are the spectral representations of the two polarization observables ($A_1$ and $B_1$) in directions $\theta_1$ and $\theta_1'$, respectively. The values $a_{1i} = +, -$ and $b_{1j} = +, -$ correspond to yes/no registration of a photon by the detector. $p_{\gamma_1}(+, +) = 0$ means that, like in the switching experiment, only one of the detectors can register photon 1. There is, however, a fundamental difference with the switching experiment, because in this latter experiment the photon wave packet is sent either toward one detector or the other, whereas in the present one it is split so as to interact coherently with both detectors. This makes it possible to interpret the right hand part of the generalized experiment of figure 3 as a joint non-ideal measurement of the incompatible polarization observables in directions $\theta_1$ and $\theta_1'$ (e.g. de Muynck et al. [15]), the joint probability distribution of the observables being given by (5).

Figure 3: Generalized Aspect experiment.

It is not possible to extensively discuss here the relevance of experiments of the generalized type for understanding Heisenberg's disturbance theory of measurement and its relation to the Heisenberg uncertainty relations (see e.g. de Muynck [16]). The important point is that such experiments do not fit into the standard (Dirac-von Neumann) formalism, in which a probability is an expectation value of a projection operator. Indeed, from (5) it follows that $p_{\gamma_1}(a_{1i}, b_{1j}) = \mathrm{Tr}\, \rho R^{(1)}_{ij}$, with operators $R^{(1)}_{ij}$ given by

$$\left( R^{(1)}_{ij} \right) = \begin{pmatrix} O & \gamma_1 E^{(1)}_+ \\ (1 - \gamma_1) F^{(1)}_+ & \gamma_1 E^{(1)}_- + (1 - \gamma_1) F^{(1)}_- \end{pmatrix}. \qquad (6)$$

The set of operators $\{R^{(1)}_{ij}\}$ constitutes a so-called positive operator-valued measure (POVM). Only generalized measurements corresponding to POVMs are able to describe joint non-ideal measurements of incompatible observables. By calculating the marginals of probability distribution $p_{\gamma_1}(a_{1i}, b_{1j})$ it is possible to see that for each value of $\gamma_1$ information is obtained on both polarization observables, be it that information on polarization in direction $\theta_1$ gets more non-ideal as $\gamma_1$ decreases, while information on polarization in direction $\theta_1'$ is getting more ideal. This is in perfect agreement with the idea of mutual disturbance in a joint measurement of incompatible observables. The explanation of the non-existence of a single measurement result for observable $A_1$, say, as implied by inequality (3.4), is corroborated by this analysis.
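The statement about the marginals can be checked with a minimal numerical sketch. It assumes, for illustration only, a single photon in a pure linear-polarization state (in the actual experiment the relevant state is that of one photon of an entangled pair), real $2 \times 2$ projectors for the two analyser directions, and the assignment in which the detector behind the $\theta_1$ analyser fires with probability $\gamma_1 \langle E_+ \rangle$. All function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]


def projector(theta):
    """Projector onto linear polarization at angle theta (real 2x2 matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def combine(x, m1, y, m2):
    """Entrywise linear combination x*m1 + y*m2 of two 2x2 matrices."""
    return [[x * m1[r][c] + y * m2[r][c] for c in range(2)] for r in range(2)]


def povm(gamma, theta, theta_prime):
    """Four operators R_ij of one interferometer arm, keyed by (a, b) outcomes."""
    e_plus = projector(theta)
    e_minus = combine(1.0, IDENTITY, -1.0, e_plus)
    f_plus = projector(theta_prime)
    f_minus = combine(1.0, IDENTITY, -1.0, f_plus)
    return {
        ("+", "+"): ZERO,
        ("+", "-"): combine(gamma, e_plus, 0.0, ZERO),
        ("-", "+"): combine(1.0 - gamma, f_plus, 0.0, ZERO),
        ("-", "-"): combine(gamma, e_minus, 1.0 - gamma, f_minus),
    }


def expectation(op, psi):
    """<psi|op|psi> for a real unit vector psi (assumed pure state)."""
    return sum(psi[r] * op[r][c] * psi[c] for r in range(2) for c in range(2))


def joint_distribution(gamma, theta, theta_prime, psi):
    """Joint detection probabilities p(a, b) for the given state."""
    return {key: expectation(op, psi)
            for key, op in povm(gamma, theta, theta_prime).items()}
```

In this sketch the four operators sum to the identity, the marginal for a click behind the $\theta_1$ analyser is $\gamma_1 \langle E_+ \rangle$, and the marginal for a click behind the $\theta_1'$ analyser is $(1 - \gamma_1) \langle F_+ \rangle$, so lowering $\gamma_1$ degrades the information on one polarization while improving it on the other.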


The analysis can easily be extended to the joint detection probabilities of the whole experiment of figure 3. The joint detection probability distribution of all four detectors is given by the expectation value of a quadrivariate POVM $\{R_{ijk\ell}\}$ according to

$$p_{\gamma_1 \gamma_2}(a_{1i}, b_{1j}, a_{2k}, b_{2\ell}) = \mathrm{Tr}\, \rho R_{ijk\ell}. \qquad (7)$$

This POVM can be expressed in terms of the POVMs of the left and right interferometer arms according to

$$R_{ijk\ell} = R^{(1)}_{ij} R^{(2)}_{k\ell}. \qquad (8)$$

It is important to note that the existence of the quadrivariate joint probability distribution (7), and the consequent satisfaction of Bell's inequality, is a consequence of the existence of quadruples of measurement results, available because it is possible to determine for each individual particle pair what is the result of each of the four detectors. Although, because of (35), locality is also assumed, this does not play an essential role. Under the condition that a quadruple of measurement results exists for each individual photon pair, Bell's inequality would be satisfied also if, due to non-local interaction, $R_{ijkl}$ were not a product of operators of the two arms of the interferometer. The reason why the standard Aspect experiments do not satisfy Bell's inequality is the non-existence of a quadrivariate joint probability distribution yielding the bivariate probabilities of these experiments as marginals. Such a non-existence is strongly suggested by Heisenberg's idea of mutual disturbance in a joint measurement of incompatible observables. This is corroborated by the easily verifiable fact that the quadrivariate joint probability distributions of the standard Aspect experiments, obtained from (7) and (35) by taking $\gamma_n$ to be either 1 or 0, are all distinct. Moreover, in general the quadrivariate joint probability distribution (7) for one standard Aspect experiment does not yield the bivariate ones of the other experiments as marginals. Although it is not strictly excluded that a quadrivariate joint probability distribution might exist having the bivariate probabilities of the standard Aspect experiments as marginals (hence different from the ones referred to above), the mathematical formalism of quantum mechanics does not give any reason to surmise its existence. As far as quantum mechanics is concerned, the standard Aspect experiments need not satisfy Bell's inequality.
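The last statement can be illustrated numerically. The following is only a sketch under stated assumptions, not the calculation of the paper: it assumes the polarization-entangled two-photon state $(|HH\rangle + |VV\rangle)/\sqrt{2}$, for which the ideal quantum mechanical correlation is $E(\theta_1, \theta_2) = \cos 2(\theta_1 - \theta_2)$, and it assumes the CHSH form of Bell's inequality, $|S| \leq 2$ (inequality (34) is not reproduced in this section and may be written differently). The function names and the analyzer angles are hypothetical choices made for the illustration.

```python
import math

def polarization_correlation(theta_1, theta_2):
    """Ideal quantum correlation for the assumed state (|HH> + |VV>)/sqrt(2)."""
    return math.cos(2.0 * (theta_1 - theta_2))

def chsh(theta_a1, theta_b1, theta_a2, theta_b2):
    """Assumed CHSH combination S = E(A1,A2) - E(A1,B2) + E(B1,A2) + E(B1,B2)."""
    return (polarization_correlation(theta_a1, theta_a2)
            - polarization_correlation(theta_a1, theta_b2)
            + polarization_correlation(theta_b1, theta_a2)
            + polarization_correlation(theta_b1, theta_b2))

# Illustrative analyzer directions: A1, B1 in arm 1 and A2, B2 in arm 2.
chsh_value = chsh(math.radians(0.0), math.radians(45.0),
                  math.radians(22.5), math.radians(67.5))
print(chsh_value)  # exceeds the bound 2 that a quadrivariate joint distribution would imply
```

For these angles the combination equals $2\sqrt{2}$, so the four bivariate distributions of the ideal experiments cannot all be marginals of one quadrivariate joint probability distribution.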


4 Bell's inequality in stochastic and deterministic hidden-variables theories

In stochastic hidden-variables theories quantum mechanical probabilities are usually given as

$p(a_1) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda),$ (1)

in which $\Lambda$ is the space of hidden variable $\lambda$ (to be compared with classical phase space), $p(a_1|\lambda)$ is the conditional probability of measurement result $A_1 = a_1$ if the value of the hidden variable was $\lambda$, and $p(\lambda)$ is the probability of $\lambda$. It should be noticed that expression (41) fits perfectly into an empiricist interpretation of the quantum mechanical formalism, in which measurement result $a_1$ refers to a pointer position of a measuring instrument, the object being described by the hidden variable. Since $p(a_1|\lambda)$ may depend on the specific way the measurement is carried out, the stochastic hidden-variables model corresponds to a contextualistic interpretation of quantum mechanical observables. Deterministic hidden-variables theories are just special cases in which $p(a_1|\lambda)$ is either 1 or 0. In the deterministic case it is possible to associate in a unique way (although possibly dependent on the measurement procedure) the value $a_1$ to the phase space point $\lambda$ the object is prepared in. A disadvantage of a deterministic theory is that the physical interaction of object and measuring instrument is left out of consideration, thus suggesting measurement result $a_1$ to be a (possibly contextually determined) property of the object. In order to have maximal generality it is preferable to deal with the stochastic case.

For Bell experiments we have

$p(a_1, a_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, a_2|\lambda),$ (2)

a condition of conditional statistical independence

$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda),$ (3)

expressing that the measurement procedures of $A_1$ and $A_2$ do not influence each other (so-called locality condition).

As is well known, the locality condition was thought by Bell to be the crucial condition allowing a derivation of his inequality. This does not seem to be correct, however. As a matter of fact, Bell's inequality can be derived if a quadrivariate joint probability distribution exists [12, 13]. In a stochastic hidden-variables theory such a distribution could be represented by

$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda),$ (4)


without any necessity that the conditional probability be factorizable in order that Bell's inequality be satisfied (although for the generalized experiment discussed in section 3 it would be reasonable to require that $p(a_1, b_1, a_2, b_2|\lambda) = p(a_1, b_1|\lambda)\, p(a_2, b_2|\lambda)$). Analogous to the quantum mechanical case, it is sufficient that for each individual preparation (here parameterized by $\lambda$) a quadruple of measurement results exists. If Heisenberg measurement disturbance is a physically realistic effect in the experiments at issue, it should be described by the hidden-variables theory as well. Therefore the explanation of the non-existence of such quadruples is the same as in quantum mechanics.
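The step from the existence of quadruples to Bell's inequality can be made explicit. The following worked derivation is a sketch under assumptions that are not taken from this section: the outcomes are taken to be dichotomic, $a_1, a_2, b_1, b_2 = \pm 1$, and the inequality is written in the CHSH form, which may differ from the form of inequality (34).

```latex
% Assumption: dichotomic outcomes a_1, a_2, b_1, b_2 \in \{+1, -1\}.
% For every quadruple either a_2 - b_2 = 0 or a_2 + b_2 = 0, hence
\[
  a_1 (a_2 - b_2) + b_1 (a_2 + b_2) = \pm 2 .
\]
% Averaging over a quadrivariate joint distribution p(a_1, b_1, a_2, b_2),
% whose bivariate marginals give the correlations \langle A_1 A_2 \rangle etc.:
\[
  \bigl| \langle A_1 A_2 \rangle - \langle A_1 B_2 \rangle
       + \langle B_1 A_2 \rangle + \langle B_1 B_2 \rangle \bigr|
  \le \sum_{a_1, b_1, a_2, b_2} p(a_1, b_1, a_2, b_2)\,
      \bigl| a_1 (a_2 - b_2) + b_1 (a_2 + b_2) \bigr| = 2 .
\]
```

Only the existence of the joint distribution enters this argument; no factorization of $p(a_1, b_1, a_2, b_2|\lambda)$ is used.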

However, with respect to the possibility of deriving Bell's inequality there is an important difference between quantum mechanics and the stochastic hidden-variables theories of the kind discussed here. Whereas quantum mechanics does not yield any indication as regards the existence of a quadrivariate joint probability distribution returning the bivariate probabilities of the Aspect experiments as marginals, local stochastic hidden-variables theory does. Indeed, using the single-observable conditional probabilities assumed to exist in the local theory (compare (3)), it is possible to construct a quadrivariate joint probability distribution according to

$p(a_1, a_2, b_1, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda)\, p(a_2|\lambda)\, p(b_1|\lambda)\, p(b_2|\lambda),$ (5)

satisfying all requirements. It should be noted that (42) does not describe the results of any joint measurement of the four observables that are involved. Quadruples $(a_1, a_2, b_1, b_2)$ are obtained here by combining measurement results found in different experiments, assuming the same value of $\lambda$ in all experiments. For this reason the physical meaning of this probability distribution is not clear. However, this does not seem to be important. The existence of (42) as a purely mathematical constraint is sufficient to warrant that any stochastic hidden-variables theory in which (2) and (3) are satisfied must require that the standard Aspect experiments obey Bell's inequality. Admittedly, there is a possibility that (42) might not be a valid mathematical entity, because it is based on multiplication of the probability distributions $p(a|\lambda)$, which might be distributions in the sense of Schwartz' distribution theory. However, the remark made with respect to the existence of probability distributions as infinite-$N$ limits of relative frequencies is valid also here: the reasoning does not depend on this limit, but is equally applicable to relative frequencies in finite sequences.
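The construction can be checked on a toy model. The sketch below makes several assumptions that are not in the text: the hidden-variable space is replaced by a finite set of five values, outcomes are dichotomic ($\pm 1$), the conditional probabilities are drawn at random, and the inequality is tested in the CHSH form. All function and variable names are hypothetical. The code builds the product distribution, verifies that its bivariate marginals coincide with the probabilities given by (2) and (3), and evaluates the CHSH combination.

```python
import itertools
import random

OUTCOMES = (+1, -1)
INDEX = {"A1": 0, "A2": 1, "B1": 2, "B2": 3}  # position of each outcome in a quadruple

def random_local_model(n_lambda, rng):
    """Toy model: weights p(lambda) and probabilities p(+1 | lambda) per observable."""
    weights = [rng.random() for _ in range(n_lambda)]
    total = sum(weights)
    p_lambda = [w / total for w in weights]
    p_plus = {obs: [rng.random() for _ in range(n_lambda)] for obs in INDEX}
    return p_lambda, p_plus

def p_single(p_plus, obs, outcome, lam):
    """Single-observable conditional probability p(outcome | lambda)."""
    return p_plus[obs][lam] if outcome == +1 else 1.0 - p_plus[obs][lam]

def quadrivariate(p_lambda, p_plus):
    """Joint p(a1, a2, b1, b2) as a sum over lambda of products of conditionals."""
    joint = {}
    for quad in itertools.product(OUTCOMES, repeat=4):
        total = 0.0
        for lam, weight in enumerate(p_lambda):
            term = weight
            for obs, position in INDEX.items():
                term *= p_single(p_plus, obs, quad[position], lam)
            total += term
        joint[quad] = total
    return joint

def bivariate_local(p_lambda, p_plus, obs_x, obs_y):
    """p(x, y) = sum over lambda of p(lambda) p(x|lambda) p(y|lambda)."""
    return {(x, y): sum(w * p_single(p_plus, obs_x, x, lam) * p_single(p_plus, obs_y, y, lam)
                        for lam, w in enumerate(p_lambda))
            for x, y in itertools.product(OUTCOMES, repeat=2)}

def marginal(joint, obs_x, obs_y):
    """Bivariate marginal of the quadrivariate distribution."""
    result = {}
    for quad, prob in joint.items():
        key = (quad[INDEX[obs_x]], quad[INDEX[obs_y]])
        result[key] = result.get(key, 0.0) + prob
    return result

def correlation(bivariate):
    return sum(x * y * prob for (x, y), prob in bivariate.items())

rng = random.Random(0)
p_lambda, p_plus = random_local_model(5, rng)
joint = quadrivariate(p_lambda, p_plus)
pairs = [("A1", "A2"), ("A1", "B2"), ("B1", "A2"), ("B1", "B2")]
max_marginal_error = max(
    abs(marginal(joint, x, y)[key] - bivariate_local(p_lambda, p_plus, x, y)[key])
    for x, y in pairs for key in itertools.product(OUTCOMES, repeat=2))
E = {pair: correlation(marginal(joint, *pair)) for pair in pairs}
chsh_value = E[("A1", "A2")] - E[("A1", "B2")] + E[("B1", "A2")] + E[("B1", "B2")]
print(max_marginal_error, chsh_value)
```

Because all sums are finite, this toy version does not touch the question of whether products of distributions in the sense of Schwartz are well defined.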

The question is whether this reasoning is sufficient to conclude that no local hidden-variables theory can reproduce quantum mechanics. Such a conclusion would only be justified if locality were the only assumption in deriving Bell's inequality. If there were any additional assumption in this derivation, then violation of Bell's inequality could possibly be blamed on the invalidity of this additional assumption rather than on locality. Evidently, one such additional assumption is the existence of hidden variables. A belief in the completeness of the quantum mechanical formalism would indeed be a sufficient reason to reject this assumption, thus increasing pressure on the locality assumption. Since, however, an empiricist interpretation is hardly reconcilable with such a completeness belief, we have to take hidden-variables theories seriously, and look for the possibility of additional assumptions within such theories.

In expression (41) one such assumption is evident, viz. the existence of the conditional probability $p(a_1|\lambda)$. The assumption of the applicability of this quantity in a quantum mechanical measurement is far less innocuous than appears at first sight. If quantum mechanical measurements really can be modeled by equality (41), this implies that a quantum mechanical measurement result is determined, either in a stochastic or in a deterministic sense, by an instantaneous value $\lambda$ of the hidden variable, prepared independently of the measurement to be performed later. It is questionable whether this is a realistic assumption, in particular if hidden variables have the character of rapidly fluctuating stochastic variables. As a matter of fact, every individual quantum mechanical measurement takes a certain amount of time, and it will in general be virtually impossible to determine the precise instant to be taken as the initial time of the measurement, as well as the precise value of the stochastic variable at that moment. Hence, hidden-variables theories of the kind considered here may be too specific.

Because of the assumption of a non-contextual preparation of the hidden variable, such theories were called quasi-objectivistic stochastic hidden-variables theories in de Muynck and van Stekelenborg [17] (dependence of the conditional probabilities $p(a_1|\lambda)$ on the measurement procedure preventing complete objectivity of the theory). In the past attention has mainly been restricted to quasi-objectivistic hidden-variables theories. It is questionable, however, whether the assumption of quasi-objectivity is a possible one for hidden-variables theories purporting to reproduce quantum mechanical measurement results. The existence of the quadrivariate probability distribution (42) only excludes quasi-objectivistic local hidden-variables theories (either stochastic or deterministic) from the possibility of reproducing quantum mechanics. As will be seen in the next section, it is far more reasonable to blame quasi-objectivity than locality for this, thus leaving the possibility of local hidden-variables theories that are not quasi-objectivistic.


5 Analogy between thermodynamics and quantum mechanics

The essential feature of expression (41) is the possibility to attribute, either in a stochastic or in a deterministic way, measurement result $a_1$ to an instantaneous value of hidden variable $\lambda$. The question is whether this is a reasonable assumption within the domain of quantum mechanical measurement. Are the conditional probabilities $p(a_1|\lambda)$ experimentally relevant within this domain? In order to give a tentative answer to this question we shall exploit the analogy between thermodynamics and quantum mechanics, considered already a long time ago by many authors (e.g. de Broglie [18], Bohm et al. [19, 20], Nelson [21, 22]):

Quantum mechanics          →   Hidden-variables theory
$(A_1, A_2, B_1, B_2)$          $\lambda$
        ↕                              ↕
Thermodynamics             →   Classical statistical mechanics
$(p, T, S)$                     $\{q_n, p_n\}$

In this analogy thermodynamics and quantum mechanics are considered as phenomenological theories, to be reduced to more fundamental microscopic theories. The reduction of thermodynamics to classical statistical mechanics is thought to be analogous to a possible reduction of quantum mechanics to stochastic hidden-variables theory. Due to certain restrictions imposed on preparations and measurements within the domains of the phenomenological theories, their domains of application are thought to be contained in, but smaller than, the domains of the microscopic theories.

In order to assess the nature and the importance of such restrictions, let us first look at thermodynamics. As is well known (e.g. Hollinger and Zenzen [23]), thermodynamics is valid only under a condition of molecular chaos, assuring the existence of local equilibrium^a necessary for the ergodic hypothesis to be satisfied. Thermodynamics only describes measurements of quantities (like pressure, temperature, and entropy) that are defined for such equilibrium states. From an operational point of view this implies that measurements within the domain of thermodynamics do not yield information on the object system valid for one particular instant of time; it is time-averaged information, time averaging being replaced, under the ergodic hypothesis, by ensemble averaging. In the Gibbs theory this ensemble is represented by the canonical density function $Z^{-1} e^{-H(\{q_n, p_n\})/kT}$ on phase space. This state is called a macrostate, to be distinguished from the microstate $\{q_n, p_n\}$, representing the point in phase space the classical object is in at a certain instant of time.

^a In equilibrium thermodynamics, equilibrium is assumed to be even global.

The restricted validity of thermodynamics is manifest in a two-fold way: i) through the restriction of all possible density functions on phase space to the canonical ones; ii) through the restriction of thermodynamical quantities (observables) to functionals on the space of thermodynamic states. Physically, this can be interpreted as a restriction of the domain of application of thermodynamics to those measurement procedures probing only properties of the macrostates. This implies that such measurements only yield information that is averaged over times exceeding the relaxation time needed to reach a state of (local) equilibrium. Thus, it is important to note that thermodynamic quantities are quite different from the physical quantities of classical statistical mechanics, the latter ones being represented by functions of the microstate $\{q_n, p_n\}$, and hence referring to a particular instant of time^b. Only if it were possible to perform measurements faster than the relaxation time would it be necessary to consider such non-thermodynamic quantities. Such measurements then are outside the domain of application of thermodynamics. Thus, if we have a cubic container containing a volume of gas in a microstate initially concentrated at its center, and if we could measure at a single instant of time either the total kinetic energy or the force exerted on the boundary of the container, then these results would not be equal to thermodynamic temperature and pressure^c, respectively, because this microstate is not an equilibrium state. Only after the gas has reached equilibrium within the volume defined by the container does (equilibrium) thermodynamics become applicable.

Within the domain of application of thermodynamics the microstate of the system may change appreciably without the macrostate being affected. Indeed, a macrostate is equivalent to an (ergodic) trajectory $\{q_n(t), p_n(t)\}_{\mathrm{ergodic}}$. We might exploit as follows the difference between micro- and macrostates for characterizing objectivity of a physical theory. Whereas the microstate is thought to yield an objective description of the (microscopic) object, the macrostate just describes certain phenomena to be attributed to the object system only while being observed under conditions valid within the domain of application of the theory. In this sense classical mechanics is an objective theory, all quantities being instantaneous properties of the microstate. Thermodynamic quantities, only being attributable to the macrostate (i.e. to an ergodic trajectory), cannot be seen, however, as properties belonging to the object at a certain instant of time. Of course, we might attribute the thermodynamic quantity to the event in space-time represented by the trajectory, but it should be realized that this event is not determined solely by the preparation of the microstate, but is determined as well by the macroscopic arrangement serving to define the macrostate.

^b Note that a definition of an instantaneous temperature by means of the equality $\frac{3}{2} n k T = \sum_i p_i^2 / 2m_i$ does not make sense, as can easily be seen by applying this definition to an ideal gas in a container freely falling in a gravitational field.
^c Thermodynamic pressure is defined for the canonical ensemble by $p = kT \frac{\partial}{\partial V} \log Z$.

Figure 4: Incompatible thermodynamic arrangements.

In order to illustrate this, consider two identical cubic containers differing only in their orientations (cf. figure 4). In principle the same microstate may be prepared in the two containers. Because of the different orientations, however, the macrostates evolving from this microstate during the time the gas is reaching equilibrium with the container are different (for different orientations of the container we have $H_1 \neq H_2$, and hence $e^{-H_1/kT}/Z_1 \neq e^{-H_2/kT}/Z_2$, since $H = T + V$ and $V_1 \neq V_2$ because potential energy is infinite outside a container). This implies that thermodynamic macrostates may be different even though starting from the same microstate. Macrostates in thermodynamics have a contextual meaning. It is important to note that, since the container is part of the preparing apparatus, this contextuality is connected here to preparation rather than to measurement. Consequently, whereas classical quantities $f(\{q_n, p_n\})$ can be interpreted as objective properties, thermodynamic quantities are non-objective, the non-objectivity being of a contextual nature.

Let us now suppose that quantum mechanics is related to hidden-variables theory analogously to the way thermodynamics is related to classical mechanics, the analogy maybe being even closer for non-equilibrium thermodynamics (only local equilibrium being assumed) than for the thermodynamics of global equilibrium processes. Support for this idea was found in de Muynck and van Stekelenborg [17], where it was demonstrated that in the Husimi representation of quantum mechanics by means of non-negative probability distribution functions on phase space, an analogous restriction to a canonical set of distributions obtains as in thermodynamics. In particular, it was demonstrated that the dispersionfree states $p(q, p) = \delta(q - q_0)\,\delta(p - p_0)$ are not canonical in this sense. This implies that within the domain of quantum mechanics it does not make sense to consider the preparation of the object in a microstate with a well-defined value of the hidden variables $(q, p)$.

In the analogy, quantum mechanical observables like $A_1, A_2, B_1, B_2$ should be compared to thermodynamic quantities like pressure, temperature, and entropy. The central issue in the analogy is the fact that thermodynamic quantities like pressure and temperature cannot be conditioned on the instantaneous phase space variable $\{q_n, p_n\}$ (microstate). Expressions like $p(\{q_n, p_n\})$ and $T(\{q_n, p_n\})$ are meaningless within thermodynamics. Thermodynamic quantities are conditioned on macrostates, corresponding to ergodic paths in phase space. Analogously, a quantum mechanical observable might not correspond to an instantaneous property of the object, but might have to be associated with an (ergodic) path in hidden-variables space $\Lambda$ (macrostate) rather than with an instantaneous value $\lambda$ (microstate).

On the basis of the analogy between thermodynamics and quantum mechanics it is possible to state the following conjectures:

• Quantum mechanical measurements (analogous to thermodynamic measurements) do not probe microstates but macrostates.

• Quantum mechanical quantities (analogous to thermodynamic quantities) should be conditioned on macrostates.

A hidden-variables macrostate will be symbolically indicated by $\lambda^t$. For quantum mechanical measurements the conditional probabilities $p(a_1|\lambda)$ of (41) should then be replaced by $p(a_1|\lambda^t)$. Concomitantly, quantum mechanical probabilities should be represented in the hidden-variables theory by a functional integral

$p(a_1) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t),$ (1)

in which the integration is over all possible macrostates consistent with the preparation procedure.

By itself, conditioning of quantum mechanical observables on macrostates rather than microstates is not sufficient to prevent derivation of Bell's inequality. As a matter of fact, on the basis of expression (43) a quadrivariate joint probability distribution can be defined, analogous to (42), according to

$p(a_1, a_2, b_1, b_2) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t)\, p(a_2|\lambda^t)\, p(b_1|\lambda^t)\, p(b_2|\lambda^t),$ (2)

from which Bell's inequality can be derived just as well. There is, however, one important aspect that up till now has not sufficiently been taken into account, viz. contextuality. In the construction of (44) it is assumed that the macrostate $\lambda^t$ is applicable in each of the measurement arrangements of observables $A_1$, $A_2$, $B_1$, and $B_2$. Because of the incompatibility of some of these observables this is an implausible assumption. On the basis of the thermodynamic analogy it is to be expected that macrostates $\lambda^t$ will depend on the measurement context of a specific observable. Since $[A_1, B_1]_- \neq O$ we will have

$\lambda^t_{A_1} \neq \lambda^t_{B_1},$ (3)

and analogously for $A_2$ and $B_2$. Then for the Bell experiments measuring the pairs $(A_1, A_2)$ and $(A_1, B_2)$, respectively, we have

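$p(a_1, a_2) = \int d\lambda^t_{A_1 A_2}\, p(\lambda^t_{A_1 A_2})\, p(a_1|\lambda^t_{A_1 A_2})\, p(a_2|\lambda^t_{A_1 A_2}),$ (4)

$p(a_1, b_2) = \int d\lambda^t_{A_1 B_2}\, p(\lambda^t_{A_1 B_2})\, p(a_1|\lambda^t_{A_1 B_2})\, p(b_2|\lambda^t_{A_1 B_2}).$ (5)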

Now the contextuality expressed by inequality (45) prevents the construction of a quadrivariate joint probability distribution analogous to (44). Hence, as in the quantum mechanical approach, in the local non-objectivistic hidden-variables theory a derivation of Bell's inequality is also prevented, due to the local contextuality involved in the interaction of the particle and the measuring instrument it is directly interacting with.

6 Conclusions

Our conclusion is that if quantum mechanical measurements do probe macrostates $\lambda^t$ rather than microstates $\lambda$, then Bell's inequality cannot be derived for quantum mechanical measurements. Both in quantum mechanics and in hidden-variables theories Bell's inequality is a consequence of the assumption that the theory is yielding an objective description of reality, in the sense that the preparation of the microscopic object, as far as relevant to the realization of the measurement result, can be thought to be independent of the measurement arrangement. The important point to be noticed is that, although in Bell experiments the preparation of the particle pair at the source (i.e. the microstate) can be considered to be independent of the measurement procedures to be carried out later (and hence one and the same microstate can be assumed in different Bell experiments), the measurement result is only determined by the macrostate, which is co-determined by the interaction with the measuring instruments. It really seems that the Copenhagen maxim of the impossibility of attributing quantum mechanical measurement results to the object as objective properties, possessed independently of the measurement, should be taken very seriously, and implemented also in hidden-variables theories purporting to reproduce the quantum mechanical results. The quantum mechanical dice is only cast after the object has been interacting with the measuring instrument, even though its result can be deterministically determined by the (sub-quantum mechanical) microstate.

The thermodynamic analogy suggests which experiments could be done in order to transcend the boundaries of the domain of application of quantum mechanics. If it were possible to perform experiments that probe the microstate $\lambda$ rather than the macrostate $\lambda^t$, then we would be in the domain of (quasi-)objectivistic hidden-variables theories. Because of (42) it is then to be expected that Bell's inequality should be satisfied for such experiments. In such experiments preparation and measurement must be completed well within the relaxation time of the microstates. Such times have been estimated by Bohm [24], for the sake of illustration, as the time light needs to cover a distance of the order of the size of an atom ($10^{-18}$ s, say). If this is correct, then all present-day experimentation is well within the range of quantum mechanics, thus explaining the seemingly universal applicability of this latter theory. With hindsight this would explain why Aspect's switching experiment is corroborating quantum mechanics: the applied switching frequency (50 MHz), although sufficient to warrant locality, has been far too low to beat the local relaxation processes in each of the measuring instruments separately.

It has often been felt that the most surprising feature of Bell experiments is the possibility (in certain states) of a strict correlation between the measurement results of the two measured observables, without being able to attribute this to a previous preparation of the object (no 'elements of physical reality'). For many physicists the existence of such strict correlations has been reason enough to doubt Bohr's Copenhagen solution to renounce causal explanation of measurement results, and to replace determinism by complementarity. It seems that the urge for causal reasoning has been so strong that even within the Copenhagen interpretation a certain causality has been accepted, even a non-local one, in an EPR experiment (cf. figure 1) determining a measurement result for particle 2 by the measurement of particle 1. This, however, should rather be seen as an internal inconsistency of this interpretation, caused by a tendency to make the Copenhagen interpretation as realist as possible. In a consistent application of the Copenhagen interpretation to Bell experiments, such experiments could be interpreted as measurements of bivariate correlation observables. The certainty of obtaining a certain (bivariate) eigenvalue of such an observable would not be more surprising than the certainty of obtaining a certain eigenvalue of a univariate one if the state vector is the corresponding eigenvector.

It is important to note that this latter interpretation of Bell experiments takes seriously the Copenhagen idea that quantum mechanics need not explain the specific measurement result found in an individual measurement. Indeed, in order to compare theory and experiment it would be sufficient that quantum mechanics just describe the relative frequencies found in such measurements. In this view quantum mechanics is just a phenomenological theory, describing (not explaining) observations in a way analogous to thermodynamics in its own domain of application. Explanations should be provided by more fundamental theories, describing the mechanisms behind the observable phenomena. Hence, the Copenhagen completeness thesis should be rejected (although this need not imply a return to determinism).

This approach has important consequences. One consequence is that the non-existence, within quantum mechanics, of elements of physical reality does not imply that elements of physical reality do not exist at all. They could be elements of the more fundamental theories. In section 5 it was discussed how an analogy between quantum mechanics and thermodynamics could be exploited to spell this out. Elements of physical reality could correspond to hidden-variables microstates $\lambda$. The determinism necessary to explain the strict correlations referred to above would be explained if, within a given measurement context, a microstate would define a unique macrostate $\lambda^t$. This demonstrates how it could be possible that quantum mechanical measurement results cannot be attributed to the object as properties possessed prior to measurement, and that there is nevertheless sufficient determinism to yield a local explanation of strict correlations of quantum mechanical measurement results in certain Bell experiments.

Another important aspect of a dissociation of phenomenological and fundamental aspects of measurement is the possibility of an empiricist interpretation of quantum mechanics. As demonstrated by the generalized Aspect experiment discussed in section 3, an empiricist approach needs a generalization of the mathematical formalism of quantum mechanics, in which an observable is represented by a POVM rather than by a projection-valued measure corresponding to a self-adjoint operator of the standard formalism. Such a generalization has been very important in assessing the meaning of Bell's inequality. In the major part of the literature of the past this subject has been dealt with on the basis of the (restricted) standard formalism. However, some conclusions drawn from the restricted formalism are not cogent when viewed in the generalized one (for instance, because von Neumann's projection postulate is not applicable in general). For this reason we must be very careful when accepting conclusions drawn from the standard formalism. This in particular holds true for the issue of non-locality.


References

1. W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925).
2. E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, 1983, p. 152)).
3. W.M. de Muynck, Synthese 102, 293 (1995).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981).
6. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
7. K.R. Popper, Quantum theory and the schism in physics (Rowman and Littlefield, Totowa, 1982).
8. M. Jammer, The philosophy of quantum mechanics (Wiley, New York, 1974).
9. N. Bohr, Phys. Rev. 48, 696 (1935).
10. J.S. Bell, Physics 1, 195 (1964).
11. H.P. Stapp, Phys. Rev. D 3, 1303 (1971); Il Nuovo Cim. 29B, 270 (1975).
12. A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982).
13. P. Rastall, Found. of Phys. 13, 555 (1983).
14. W.M. de Muynck, Phys. Lett. A 114, 65 (1986).
15. W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994).
16. W.M. de Muynck, Found. of Phys. 30, 205 (2000).
17. W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988).
18. L. de Broglie, La thermodynamique de la particule isolée (Gauthier-Villars, 1964); L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes (Springer-Verlag, 1995).
19. D. Bohm, Phys. Rev. 89, 458 (1953).
20. D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954).
21. E. Nelson, Dynamical theories of Brownian motion (Princeton University Press, 1967).
22. E. Nelson, Quantum fluctuations (Princeton University Press, 1985).
23. H.B. Hollinger and M.J. Zenzen, The Nature of Irreversibility (D. Reidel Publishing Company, Dordrecht, 1985, sect. 4.4).
24. D. Bohm, Phys. Rev. 85, 166, 180 (1952).


DISCRETE HESSIANS IN STUDY OF QUANTUM STATISTICAL SYSTEMS: COMPLEX GINIBRE ENSEMBLE

M M DURAS

Institute of Physics Cracow University of Technology ulica Podchorazych 1 PL-30084 Cracow Poland

E-mail: mduras@riad.usk.pk.edu.pl

The Ginibre ensemble of nonhermitean random Hamiltonian matrices $K$ is considered. Each quantum system described by $K$ is a dissipative system, and the eigenenergies $Z_i$ of the Hamiltonian are complex-valued random variables. The second difference of complex eigenenergies is viewed as a discrete analog of the Hessian with respect to the labelling index. The results are considered in view of Wigner and Dyson's electrostatic analogy. An extension of the space of dynamics of random magnitudes is performed by introduction of a discrete space of labelling indices.

1 Introduction

Random Matrix Theory (RMT) studies quantum Hamiltonian operators $H$ which are random matrix variables. Their matrix elements $H_{ij}$ are independent random scalar variables [1-8]. Among others, the following Gaussian Random Matrix ensembles (GRME) have been studied: orthogonal (GOE), unitary (GUE), and symplectic (GSE), as well as the circular ensembles: orthogonal (COE), unitary (CUE), and symplectic (CSE). The choice of ensemble is based on the quantum symmetries ascribed to the Hamiltonian $H$. The Hamiltonian $H$ acts on the quantum space $V$ of eigenfunctions. It is assumed that $V$ is an $N$-dimensional Hilbert space $V = F^N$, where the real, complex, or quaternion field $F = \mathbb{R}, \mathbb{C}, \mathbb{H}$ corresponds to GOE, GUE, or GSE, respectively. If the Hamiltonian matrix $H$ is hermitean, $H = H^\dagger$, then the probability density function of $H$ reads:

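$f_{\mathcal{H}}(H) = \mathcal{C}_{H\beta} \exp\left[-\beta \cdot \frac{1}{2} \cdot \mathrm{Tr}(H^2)\right],$ (1)

$\mathcal{C}_{H\beta} = \left(\frac{\beta}{2\pi}\right)^{\mathcal{N}_{H\beta}/2},$

$\mathcal{N}_{H\beta} = N + \frac{1}{2} N (N - 1) \beta,$

$\int f_{\mathcal{H}}(H)\, dH = 1,$

$dH = \prod_{i=1}^{N} \prod_{j \geq i}^{N} \prod_{\gamma=0}^{D-1} dH_{ij}^{(\gamma)},$

$H_{ij} = (H_{ij}^{(0)}, \ldots, H_{ij}^{(D-1)}) \in F,$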

where the parameter $\beta$ assumes the values $\beta = 1, 2, 4$ for GOE($N$), GUE($N$), GSE($N$), respectively, and $\mathcal{N}_{H\beta}$ is the number of independent matrix elements of the hermitean Hamiltonian $H$. The Hamiltonian $H$ belongs to the Lie group of hermitean $N \times N$ matrices, and the matrix Haar's measure $dH$ is invariant under transformations from the unitary group U($N$, $F$). The eigenenergies $E_i$, $i = 1, \ldots, N$, of $H$ are real-valued random variables, $E_i = E_i^*$. It was Eugene Wigner who first dealt with the eigenenergy level repulsion phenomenon, studying nuclear spectra [1-3]. RMT is now applicable in many branches of physics: nuclear physics (slow neutron resonances, highly excited complex nuclei), condensed phase physics (fine metallic particles, random Ising model [spin glasses]), quantum chaos (quantum billiards, quantum dots), disordered mesoscopic systems (transport phenomena), quantum chromodynamics, quantum gravity, field theory.
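The counting formula and the normalization constant of Eq. (1) can be checked in a few lines. This is a minimal sketch with hypothetical function names. It counts the independent real parameters of a hermitean $N \times N$ matrix directly ($N$ real diagonal entries plus $\beta$ real components for each entry above the diagonal), and it tests the normalization only in the simplest case $N = 1$, where Eq. (1) reduces to a one-dimensional Gaussian, using a plain trapezoidal rule on a finite interval.

```python
import math

def independent_parameters(n, beta):
    """N_{H beta} = N + N (N - 1) beta / 2 for beta = 1, 2, 4 (GOE, GUE, GSE)."""
    if beta not in (1, 2, 4):
        raise ValueError("beta must be 1, 2 or 4")
    return n + n * (n - 1) * beta // 2

def count_parameters_directly(n, beta):
    """Count real parameters entry by entry for a hermitean N x N matrix."""
    count = 0
    for i in range(n):
        for j in range(i, n):
            # Diagonal entries are real; entries above the diagonal carry beta real components.
            count += 1 if i == j else beta
    return count

def density_n1(h, beta):
    """Eq. (1) for N = 1: C * exp(-beta h^2 / 2) with C = (beta / (2 pi))^(1/2)."""
    return math.sqrt(beta / (2.0 * math.pi)) * math.exp(-0.5 * beta * h * h)

def trapezoid(func, lower, upper, steps):
    """Simple trapezoidal rule, sufficient for a rapidly decaying Gaussian."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    total += sum(func(lower + k * width) for k in range(1, steps))
    return total * width

counts_match = all(independent_parameters(n, beta) == count_parameters_directly(n, beta)
                   for n in range(1, 8) for beta in (1, 2, 4))
normalizations = {beta: trapezoid(lambda h: density_n1(h, beta), -12.0, 12.0, 24000)
                  for beta in (1, 2, 4)}
print(counts_match, normalizations)
```

For $N = 2$ the formula gives 3, 4 and 6 independent real parameters for GOE, GUE and GSE, respectively.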

2 The Ginibre ensembles

Jean Ginibre considered another example of GRME, dropping the assumption of hermiticity of Hamiltonians and thus defining a generic $F$-valued Hamiltonian $K$ [1, 2, 9, 10]. Hence, $K$ belongs to the general linear Lie group GL($N$, $F$), and the matrix Haar's measure $dK$ is invariant under transformations from that group. The distribution of $K$ is given by:

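$f_{\mathcal{K}}(K) = \mathcal{C}_{K\beta} \exp\left[-\beta \cdot \frac{1}{2} \cdot \mathrm{Tr}(K^\dagger K)\right],$ (2)

$\mathcal{N}_{K\beta} = N^2 \beta,$

$\int f_{\mathcal{K}}(K)\, dK = 1,$

$dK = \prod_{i=1}^{N} \prod_{j=1}^{N} \prod_{\gamma=0}^{D-1} dK_{ij}^{(\gamma)},$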

where $\beta = 1, 2, 4$ stands for the real, complex, and quaternion Ginibre ensembles, respectively. Therefore, the eigenenergies $Z_i$ of a quantum system ascribed to the Ginibre ensemble are complex-valued random variables. The eigenenergies $Z_i$, $i = 1, \ldots, N$, of the nonhermitean Hamiltonian $K$ are not real-valued random variables, $Z_i \neq Z_i^*$. Jean Ginibre postulated the following joint probability density function of the random vector of complex eigenvalues $Z_1, \ldots, Z_N$ for $N \times N$ Hamiltonian matrices $K$ for $\beta = 2$ [1, 2, 9, 10]:

$P(z_1, \ldots, z_N) = \prod_{j=1}^{N} \frac{1}{\pi \cdot j!} \cdot \prod_{i<j}^{N} |z_i - z_j|^2 \cdot \exp\left(-\sum_{j=1}^{N} |z_j|^2\right),$ (3)

where $z_i$ are complex-valued sample points ($z_i \in \mathbf{C}$). We emphasize here Wigner and Dyson's electrostatic analogy. A Coulomb gas of $N$ unit charges moving on the complex plane (Gauss's plane) $\mathbf{C}$ is considered. The vectors of positions of the charges are $z_i$, and the potential energy of the system is

$$U(z_1, \ldots, z_N) = -\sum_{i<j} \ln|z_i - z_j| + \frac{1}{2} \sum_{i} |z_i|^2. \tag{4}$$

If the gas is in thermodynamical equilibrium at temperature $T = \frac{1}{2k_B}$ ($\beta = \frac{1}{k_B T} = 2$, $k_B$ is Boltzmann's constant), then the probability density function of the vectors of positions is $P(z_1, \ldots, z_N)$, Eq. (3). Therefore, complex eigenenergies $Z_i$ of the quantum system are analogous to vectors of positions of charges of the Coulomb


gas. Moreover, complex-valued spacings $\Delta^1 Z_i$ of complex eigenenergies of the quantum system,

$$\Delta^1 Z_i = Z_{i+1} - Z_i, \quad i = 1, \ldots, (N-1), \tag{5}$$

are analogous to vectors of relative positions of electric charges. Finally, complex-valued second differences $\Delta^2 Z_i$ of complex eigenenergies,

$$\Delta^2 Z_i = Z_{i+2} - 2Z_{i+1} + Z_i, \quad i = 1, \ldots, (N-2), \tag{6}$$

are analogous to vectors of relative positions of vectors of relative positions of electric charges.
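The electrostatic analogy can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names are illustrative and not taken from the paper. It samples a complex Ginibre matrix (entries normalised so that $E|K_{ij}|^2 = 1$, matching the weight $\exp(-\mathrm{Tr}\,K^{\dagger}K)$ for $\beta = 2$), and confirms that the Boltzmann factor $\exp(-2U)$ of the Coulomb energy (4) coincides with the unnormalised density (3). The order in which `numpy.linalg.eigvals` returns the eigenvalues is arbitrary, so the indexing $i$ of the differences below is only one possible choice.

```python
import numpy as np


def ginibre_eigenvalues(n, rng):
    """Eigenvalues of an n x n complex Ginibre matrix.

    Assumption: real and imaginary parts of each entry have variance 1/2,
    so that E|K_ij|^2 = 1 (weight exp(-Tr K^dagger K), i.e. beta = 2).
    """
    k = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2.0)
    return np.linalg.eigvals(k)


def coulomb_energy(z):
    """Potential energy U(z_1, ..., z_N) of the two-dimensional Coulomb gas, Eq. (4)."""
    i, j = np.triu_indices(len(z), k=1)  # all index pairs with i < j
    return -np.sum(np.log(np.abs(z[i] - z[j]))) + 0.5 * np.sum(np.abs(z) ** 2)


def unnormalised_density(z):
    """prod_{i<j} |z_i - z_j|^2 * exp(-sum_j |z_j|^2), i.e. Eq. (3) without its constant."""
    i, j = np.triu_indices(len(z), k=1)
    return np.prod(np.abs(z[i] - z[j]) ** 2) * np.exp(-np.sum(np.abs(z) ** 2))


rng = np.random.default_rng(0)
z = ginibre_eigenvalues(6, rng)

boltzmann_factor = np.exp(-2.0 * coulomb_energy(z))  # beta = 2
density = unnormalised_density(z)

first_diff = np.diff(z, n=1)   # spacings, Eq. (5)
second_diff = np.diff(z, n=2)  # second differences, Eq. (6)
```

For a small matrix such as $N = 6$ the two quantities `boltzmann_factor` and `density` agree to floating-point accuracy; for large $N$ one would compare logarithms instead to avoid underflow.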

The eigenenergies $Z_i = Z(i)$ can be treated as values of a function $Z$ of the discrete parameter $i = 1, \ldots, N$. The Jacobian of $Z_i$ reads

$$\mathrm{Jac}\, Z_i = \frac{\partial Z_i}{\partial i} \simeq \frac{\Delta^1 Z_i}{\Delta^1 i} = \Delta^1 Z_i. \tag{7}$$

We readily have that the spacing is a discrete analog of the Jacobian, since the indexing parameter $i$ belongs to the discrete space of indices $i \in I = \{1, \ldots, N\}$. Therefore, the first derivative with respect to $i$ reduces to the first differential quotient. The Hessian is a Jacobian applied to the Jacobian. We immediately have the formula for the discrete Hessian for the eigenenergies $Z_i$:

$$\mathrm{Hess}\, Z_i = \frac{\partial^2 Z_i}{\partial i^2} \simeq \frac{\Delta^2 Z_i}{(\Delta^1 i)^2} = \Delta^2 Z_i. \tag{8}$$

Thus, the second difference of $Z$ is a discrete analog of the Hessian of $Z$. One emphasizes that both the Jacobian and the Hessian work on the discrete index space of indices $i$. The finite differences of order higher than two are discrete analogs of compositions of Jacobians with Hessians of $Z$.

The eigenenergies $E_i$, $i \in I$, of the hermitean Hamiltonian $H$ are increasingly ordered real-valued random variables. They are values of the discrete function $E_i = E(i)$. The first differences of adjacent eigenenergies,

$$\Delta^1 E_i = E_{i+1} - E_i, \quad i = 1, \ldots, (N-1), \tag{9}$$

are analogous to vectors of relative positions of electric charges of a one-dimensional Coulomb gas. Each is simply the spacing of two adjacent energies. Real-valued second differences $\Delta^2 E_i$ of eigenenergies,

$$\Delta^2 E_i = E_{i+2} - 2E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{10}$$


are analogous to vectors of relative positions of vectors of relative positions of charges of a one-dimensional Coulomb gas. The $\Delta^2 Z_i$ have their real parts $\mathrm{Re}\,\Delta^2 Z_i$ and imaginary parts $\mathrm{Im}\,\Delta^2 Z_i$, as well as radii (moduli) $|\Delta^2 Z_i|$ and main arguments (angles) $\mathrm{Arg}\,\Delta^2 Z_i$. The $\Delta^2 Z_i$ are extensions of the real-valued second differences

$$\Delta^2 E_i = E_{i+2} - 2E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{11}$$

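of adjacent, increasingly ordered, real-valued eigenenergies $E_i$ of the Hamiltonian $H$ defined for GOE, GUE, GSE, and the Poisson ensemble PE (where the Poisson ensemble is composed of uncorrelated, randomly distributed eigenenergies) [11, 12, 13, 14, 15]. The Jacobian and Hessian operators of the energy function $E(i) = E_i$ for these ensembles read

$$\mathrm{Jac}\, E_i = \frac{\partial E_i}{\partial i} \simeq \frac{\Delta^1 E_i}{\Delta^1 i} = \Delta^1 E_i$$

and

$$\mathrm{Hess}\, E_i = \frac{\partial^2 E_i}{\partial i^2} \simeq \frac{\Delta^2 E_i}{(\Delta^1 i)^2} = \Delta^2 E_i.$$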

The treatment of first and second differences of eigenenergies as discrete analogs of Jacobians and Hessians allows one to consider these eigenenergies as magnitudes with statistical properties studied in the discrete space of indices. The labelling index $i$ of the eigenenergies is an additional variable of motion; hence the space of indices $I$ augments the space of dynamics of random magnitudes.
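For the hermitean case the discrete Jacobian and Hessian reduce to ordinary finite differences of the sorted spectrum. The sketch below assumes NumPy; the helper name `gue_eigenenergies` and the overall scale of the matrix are illustrative choices, not the paper's normalisation. It draws one GUE-type matrix, orders its eigenenergies increasingly, and forms the first and second differences.

```python
import numpy as np


def gue_eigenenergies(n, rng):
    """Increasingly ordered eigenenergies of a random hermitean (GUE-type) matrix.

    Assumption: the overall scale of the matrix is illustrative only.
    """
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (a + a.conj().T) / 2.0        # hermitean by construction
    return np.linalg.eigvalsh(h)      # eigvalsh returns real values in ascending order


rng = np.random.default_rng(1)
energies = gue_eigenenergies(8, rng)

jacobian = np.diff(energies, n=1)   # discrete Jacobian: spacings E_{i+1} - E_i
hessian = np.diff(energies, n=2)    # discrete Hessian: E_{i+2} - 2 E_{i+1} + E_i
```

Because the energies are ordered, every entry of `jacobian` is nonnegative, and `hessian` is the first difference of `jacobian`, which is the discrete counterpart of "a Jacobian applied to the Jacobian".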

Acknowledgements

It is my pleasure to most deeply thank Professor Antoni Ostoja-Gajewski for continuous help. I also thank Professor Wlodzimierz Wojcik for giving me access to computer facilities.

References

1. F. Haake, Quantum Signatures of Chaos (Springer-Verlag, Berlin Heidelberg New York, 1990), Chapters 1, 3, 4, 8, pp. 1-11, 33-77, 202-213.

2. T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, Phys. Rept. 299, 189-425 (1998).

3. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapters 1, 2, 9, pp. 1-54, 182-193.

4. L. E. Reichl, The Transition to Chaos In Conservative Classical Systems: Quantum Manifestations (Springer-Verlag, New York, 1992), Chapter 6, p. 248.

5. O. Bohigas, in Proceedings of the Les Houches Summer School on Chaos and Quantum Physics (North-Holland, Amsterdam, 1991), p. 89.

6. C. E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).

7. T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey, and S. S. M. Wong, Rev. Mod. Phys. 53, 385 (1981).

8. C. W. J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).

9. J. Ginibre, J. Math. Phys. 6, 440 (1965).

10. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapter 15, pp. 294-310.

11. M. M. Duras and K. Sokalski, Phys. Rev. E 54, 3142 (1996).

12. M. M. Duras, Finite difference and finite element distributions in statistical theory of energy levels in quantum systems (PhD thesis, Jagellonian University, Cracow, 1996).

13. M. M. Duras and K. Sokalski, Physica D125, 260 (1999).

14. M. M. Duras, Description of Quantum Systems by Random Matrix Ensembles of Large Dimensions, in Proceedings of the Sixth International Conference on Squeezed States and Uncertainty Relations, 24 May-29 May 1999, Naples, Italy (NASA, Greenbelt, Maryland, at press, 2000).

15. M. M. Duras, J. Opt. B: Quantum Semiclass. Opt. 2, 287 (2000).


SOME REMARKS ON HARDY FUNCTIONS ASSOCIATED WITH DIRICHLET SERIES

W. EHM

Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstrasse 3a, 79098 Freiburg, Germany. E-mail: ehm@igpp.de

A simple method of associating a Hardy function with a Dirichlet series is described and applied to some examples connected with the Riemann zeta function. The theory of Hardy functions then is used to derive integral tests of the Riemann hypothesis, generalizing a recent result of Balazard, Saias, and Yor [1].

1 Introduction

The most famous example of a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ converging absolutely in the half plane $\Re z > 1$ is the Riemann zeta function $\zeta(z)$, which has all coefficients $a_n = 1$. It has a simple pole at $z = 1$ and can be extended as a meromorphic function with no other singularities to the whole complex plane [6].

A simple method of associating a Hardy function with a Dirichlet series of that kind consists in multiplying $f(z)$ by $(z-1)/z^2$: the factor $(z-1)/z$ removes the pole at $z = 1$, and the division by $z$ achieves square integrability along vertical lines. Moreover, the zeros of $f(z)$ remain unchanged by this modification. The motivation for passing from $f(z)$ to $f(z)(z-1)/z^2$ is to utilize the theory of Hardy functions, especially factorization of Hardy functions, for the study of the zeta function.

In section 2 of this note we give conditions under which the function $f(z)(z-1)/z^2$ has an analytic continuation as a Hardy function beyond the abscissa of convergence of the Dirichlet series $f(z)$. The criterion is tested on three examples, all related to the Riemann zeta function. Factorization of the Hardy function $\zeta(z)(z-1)/z^2$, which is briefly discussed in section 3, is used in section 4 to derive some integral tests of the Riemann hypothesis. The content of the Riemann hypothesis, hereafter abbreviated RH, is Riemann's yet unproven conjecture that all non-real zeros of the $\zeta$ function lie on the line $\Re z = 1/2$ in the complex plane. It has received increasing interest among physicists since the discovery of striking similarities in the distribution of the zeros of the zeta function and the spectrum of large random matrices [2].

The idea to utilize Hardy functions in connection with the zeta function, including integral tests of the Riemann hypothesis, is not new. See the recent article of Balazard, Saias, and Yor [1], who initially work with Hardy functions in the disc, then pass to the half plane $\Re z > 1/2$ by conformal mapping. In our


approach, based on the function $\zeta(z)(z-1)/z^2$, which also appears in recent work of Burnol [4], we deal with half plane Hardy functions from the beginning. This leads to somewhat more general results in a natural fashion.

2 Hardyfication of Dirichlet series

The basic result of this section is the following

Theorem. Given a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ with a finite abscissa of convergence, let functions $A$ and $\phi$ be defined by

$$A(x) = \sum_{1 \leq n \leq x} a_n, \qquad \phi(x) = \sum_{1 \leq n \leq e^x} a_n (1 - x + \log n) \quad (x \in \mathbf{R}). \tag{1}$$

Suppose that $A(x) = O(x)$ as $x \to \infty$, and let

$$\lambda = \limsup_{N \to \infty} \frac{\log^+ |D_N|}{\log N}, \quad \text{where} \quad D_N = A(N) - \sum_{n=1}^{N-1} \frac{A(n)}{n}. \tag{2}$$

Then the function $f(z)(z-1)/z^2$ can be represented as the Laplace transform of $\phi(x)$ in the half plane $\Re z > \lambda$:

$$f(z)(z-1)/z^2 = \int_0^{\infty} e^{-zx} \phi(x)\, dx \quad (\Re z > \lambda). \tag{3}$$

Proof. Fix an integer $N \geq 1$ and let $\log N \leq x < \log(N+1)$. Then

$$|\phi(x) - \phi(\log N)| = (x - \log N)\, |A(N)| \leq |A(N)| \log\frac{N+1}{N} = O(1)$$

as $N \to \infty$, by the assumed growth behavior of $A(x)$. Combining this with

$$\phi(\log(n+1)) - \phi(\log n) = a_{n+1} - A(n) \log\frac{n+1}{n} = a_{n+1} - A(n)/n + O(n^{-1})$$

we get for $N = [e^x] \to \infty$

$$\phi(x) = a_1 + \sum_{n=1}^{N-1} \left[\phi(\log(n+1)) - \phi(\log n)\right] + O(1) = a_1 + \sum_{n=1}^{N-1} \left[a_{n+1} - A(n)/n + O(n^{-1})\right] + O(1) = D_N + O(\log N),$$

and thus, for every $\epsilon > 0$, $\phi(x) = O(e^{x(\lambda+\epsilon)})$, $x \uparrow \infty$, by the definition of $\lambda$. Since $\phi$ vanishes on the left half line, it follows that the integral on the right-hand side of (3) converges absolutely in the half plane $\Re z > \lambda$. It remains to show that this Laplace transform coincides with $f(z)(z-1)/z^2$ in the half plane $\Re z > \sigma_a$, where $\sigma_a$ denotes the abscissa of absolute convergence of $f(z)$.

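To that end let us write $\eta(z) = f(z)(z-1)/z^2$ and introduce truncated versions

$$f_N(z) = \sum_{n=1}^{N} a_n n^{-z}, \qquad \eta_N(z) = f_N(z)(z-1)/z^2, \qquad \phi_N(x) = \sum_{1 \leq n \leq \min(N, e^x)} a_n (1 - x + \log n),$$

$N \geq 1$, and set $h_{\sigma,N}(x) = e^{-\sigma x} \phi_N(x)$. Using

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{itx}}{(\sigma + it)^q}\, dt = \begin{cases} x^{q-1} e^{-\sigma x}/(q-1)! & \text{if } x > 0, \\ 0 & \text{if } x < 0 \end{cases}$$

(for every integer $q \geq 1$, $\sigma > 0$) we get for fixed $\sigma > \sigma_a$

$$\begin{aligned} \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \eta_N(\sigma + it)\, dt &= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \sum_{n=1}^{N} a_n n^{-\sigma - it}\, \frac{\sigma + it - 1}{(\sigma + it)^2}\, dt \\ &= \sum_{n=1}^{N} a_n n^{-\sigma}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{it(x - \log n)} \left[\frac{1}{\sigma + it} - \frac{1}{(\sigma + it)^2}\right] dt \\ &= \sum_{1 \leq n \leq \min(N, e^x)} a_n n^{-\sigma} e^{-\sigma(x - \log n)} \left(1 - (x - \log n)\right) = h_{\sigma,N}(x) \end{aligned} \tag{4}$$

almost everywhere in $x \in \mathbf{R}$, the Fourier integrals being understood in the $L^2$ sense. Note that $\eta(z)$ is square integrable along every line $\Re z = \sigma$ with $\sigma > \sigma_a$. Clearly $\eta_N(\sigma + it)$ converges to $\eta(\sigma + it)$ in $L^2(dt)$, so $h_{\sigma,N}$ is a Cauchy sequence in $L^2(dx)$ by Parseval's formula. The pointwise limit $h_\sigma(x)$ of $h_{\sigma,N}(x)$ then also is the $L^2(dx)$ limit, so that by (4), $h_\sigma(x)$ and $\eta(\sigma + it)$ represent a Fourier transform pair for every $\sigma > \sigma_a$. Therefore

$$\eta(\sigma + it) = \hat{h}_\sigma(t) = \int_0^{\infty} h_\sigma(x) e^{-ixt}\, dx = \int_0^{\infty} e^{-(\sigma + it)x} \phi(x)\, dx \tag{5}$$

holds almost everywhere in $t$ ($\sigma > \sigma_a$), hence everywhere in $\Re z > \sigma_a$ by continuity. This shows that the Laplace transform of $\phi$ represents the analytic continuation of $\eta$ to the region $\Re z > \lambda$, completing the proof.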
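The Laplace representation (3) can be checked numerically in the simplest case $a_n = 1$, $f = \zeta$. The sketch below uses only the Python standard library and is an illustration under stated assumptions, not part of the proof: it uses the closed form $\phi(x) = (1-x)N + \log N!$ with $N = [e^x]$, a midpoint rule, and a truncation of the integral at a hypothetical cut-off `x_max`. At $z = 2$ the left-hand side of (3) is $\zeta(2)/4 = \pi^2/24$.

```python
import math


def phi_zeta(x):
    """phi(x) = sum_{1 <= n <= e^x} (1 - x + log n) for a_n = 1, in closed form."""
    if x < 0:
        return 0.0                      # phi vanishes on the left half line
    n = int(math.floor(math.exp(x)))    # N = [e^x]
    return (1.0 - x) * n + math.lgamma(n + 1)   # log N! = lgamma(N + 1)


def laplace_phi(z, x_max=14.0, steps=280_000):
    """Midpoint-rule approximation of int_0^{x_max} exp(-z x) phi(x) dx for real z.

    Assumptions: x_max and steps are illustrative; the neglected tail is tiny for z = 2.
    """
    h = x_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.exp(-z * x) * phi_zeta(x)
    return total * h


z = 2.0
lhs = (math.pi ** 2 / 6.0) * (z - 1.0) / z ** 2   # zeta(2) (z - 1) / z^2
rhs = laplace_phi(z)
```

Since $\phi$ has unit jumps at the points $x = \log n$, the midpoint rule converges only to first order in the step size; the two sides nevertheless agree to several decimal places with the settings above.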

Let $H^2_\sigma$ denote the Hardy space consisting of all functions $g(z)$ which are analytic for $\Re z > \sigma$ and such that $\sup_{\sigma' > \sigma} \int_{-\infty}^{\infty} |g(\sigma' + it)|^2\, dt < \infty$. The growth behavior of $\phi(x)$ established in the proof implies $h_\sigma \in L^2$ for every $\sigma > \lambda$, so that by (5) and Parseval's formula we obtain the following

Corollary. Under the conditions of the theorem, the function $f(z)(z-1)/z^2$ belongs to every Hardy space $H^2_\sigma$, $\sigma > \lambda$.

Example 1. Let $a_n = 1$ for all $n$, that is, $f(z) = \zeta(z)$. Then $D_N = 1$, $N \geq 1$, so that $\lambda = 0$. A more careful analysis shows that $\phi(x)$ is nonnegative and grows linearly as $x$ tends to infinity. Consequently, $\zeta(z)(z-1)/z^2$ is a member of every Hardy space $H^2_\sigma$, $\sigma > 0$, but not of $H^2_0$. The nonnegativity allows one to associate with $\phi$ an exponential family $\mathcal{P} = \{p_\sigma, \sigma > 0\}$ of probability densities with support $[0, \infty)$, by setting

$$p_\sigma(x) = h_\sigma(x)/\eta(\sigma) = \phi(x) e^{-\sigma x}/\eta(\sigma) \quad (x \in \mathbf{R},\ \sigma > 0). \tag{6}$$

The function $\zeta(z)(z-1)/z^2$ was also considered by Burnol [4] in connection with a closure problem in function space known as the Nyman-Beurling real variable form of the Riemann hypothesis.

It may be interesting to note here that although $h_\sigma$ is square integrable for every $\sigma > 0$, it is not true that $h_{\sigma,N} \to h_\sigma$ in $L^2$ if $\sigma \leq 1$. In fact, we have

$$\liminf_{N \to \infty} \|h_{\sigma,N} - h_\sigma\|_2 > 0, \quad 0 < \sigma \leq 1. \tag{7}$$

Proof. Note first that for $x \geq \log N \to \infty$

$$\begin{aligned} \phi(x) - \phi_N(x) &= \sum_{N < n \leq e^x} (1 - x + \log n) = (1-x)([e^x] - N) + \log([e^x]!) - \log N! \\ &= (1-x)([e^x] - N) + ([e^x] + \tfrac{1}{2}) \log[e^x] - [e^x] - (N + \tfrac{1}{2}) \log N + N + O(1) \\ &= (N + \tfrac{1}{2})(\log[e^x] - \log N) + ([e^x] - N)(\log[e^x] - x) + O(1) \\ &= (N + \tfrac{1}{2})(x - \log N) + O(1) \end{aligned} \tag{8}$$

on using Stirling's formula and the inequalities $0 \leq x - \log[e^x] \leq 2e^{-x}$ ($x > 0$). The estimate (8) shows that there exists a finite constant $B > 0$ such that $\phi(x) - \phi_N(x) \geq N(x - \log N)$ for all large $N$ and $x \geq B + \log N$. Therefore

$$\|h_{\sigma,N} - h_\sigma\|_2^2 \geq \int_{B + \log N}^{\infty} (\phi(x) - \phi_N(x))^2 e^{-2\sigma x}\, dx \geq N^2 \int_{B + \log N}^{\infty} (x - \log N)^2 e^{-2\sigma x}\, dx = N^{2 - 2\sigma} \int_B^{\infty} y^2 e^{-2\sigma y}\, dy$$

for all large $N$, and assertion (7) follows.

Example 2. Let $f(z) = \sum_p p^{-z} \log p$, where the sum extends over all prime numbers. This example is related to the logarithmic derivative of the zeta function, as may be seen from the product representation $\zeta(z) = \prod_p (1 - p^{-z})^{-1}$. For $\Re z > 1$,

$$-\frac{\zeta'(z)}{\zeta(z)} = \sum_p \frac{\log p}{p^z - 1} = \sum_p \frac{\log p}{p^z} + \sum_p \frac{\log p}{p^z (p^z - 1)},$$

and since the last series converges for $\Re z > 1/2$, it suffices to consider $f(z)$ as far as the analytic continuation of $\zeta'(z)/\zeta(z)$ is concerned.

The series $f(z)$ would have convergence abscissa 1/2, implying the RH, if the associated sequence $D_N$ satisfied condition (2) with $\lambda = 1/2$. For a numerical check we computed $D_N$ for $N$ up to 5 million. A plot of $\log^+ |D_N| / \log N$ versus $\log N$ (thinned out to every 200th data point; the general picture is not affected thereby) is shown in Figure 1 (a). Within the considered range, the observed behavior is well in accordance with a possible value of $\lambda = 1/2$. Notice the obvious connection with the classical criterion saying that the RH is equivalent to the error estimate $\sum_{p \leq x} \log p - x = O(x^{1/2 + \epsilon})$ ($\forall \epsilon > 0$) in the prime number theorem (Edwards [6], Sect. 5.5). Incidentally, $\phi(x)$ seems to be nonnegative in this case, too, as a plot of $\phi(x)$ for small $x$-values indicates.

Example 3. Let $f(z) = 1/\zeta(z) = \sum_{n=1}^{\infty} \mu(n) n^{-z}$, with $\mu$ the Möbius function. It is well known that the RH is equivalent to the condition $A(N) = \sum_{n \leq N} \mu(n) = O(N^{1/2 + \epsilon})$ (for every $\epsilon > 0$), that is, to $\lambda = 1/2$. The analogous plot for this case is shown in Figure 1 (b), with similar findings.
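A numerical check of this kind can be sketched as follows. This is a minimal illustration assuming NumPy, not the author's program: the function names and the range `n_max` (far below the 5 million used in the text) are hypothetical. It computes the sequence $D_N$ of (2) for $a_n = 1$ (Example 1) and for $a_n = \mu(n)$ (Example 3), together with the plotted criterion $\log^+ |D_N| / \log N$.

```python
import numpy as np


def mobius_sieve(n_max):
    """Moebius function mu(0..n_max) via a simple sieve (index 0 is unused and set to 0)."""
    mu = np.ones(n_max + 1, dtype=np.int64)
    is_prime = np.ones(n_max + 1, dtype=bool)
    is_prime[:2] = False
    for p in range(2, n_max + 1):
        if is_prime[p]:
            is_prime[2 * p::p] = False
            mu[p::p] *= -1            # one sign flip per distinct prime factor
            mu[p * p::p * p] = 0      # numbers with a squared prime factor
    mu[0] = 0
    return mu


def d_sequence(a):
    """D_N = A(N) - sum_{n=1}^{N-1} A(n)/n for N = 1..len(a), where a[0] holds a_1."""
    a = np.asarray(a, dtype=float)
    big_a = np.cumsum(a)                                   # A(1), ..., A(N)
    n = np.arange(1, len(a) + 1)
    partial = np.concatenate(([0.0], np.cumsum(big_a / n)[:-1]))  # sum over n < N
    return big_a - partial


def criterion(d):
    """log^+ |D_N| / log N for N = 2..len(d)."""
    n = np.arange(2, len(d) + 1)
    log_plus = np.log(np.maximum(np.abs(d[1:]), 1.0))      # log^+ y = max(log y, 0)
    return log_plus / np.log(n)


n_max = 100_000
mu = mobius_sieve(n_max)
d_zeta = d_sequence(np.ones(n_max))     # Example 1: a_n = 1
d_mobius = d_sequence(mu[1:])           # Example 3: a_n = mu(n)
ratio = criterion(d_mobius)
```

For Example 1 every entry of `d_zeta` equals 1, in agreement with the text. Plotting `ratio` against $\log N$ would give a small-scale analogue of Figure 1 (b); no conclusion about $\lambda$ can of course be drawn from such a finite range.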

3 Factorization of $\eta$

From now on we shall restrict attention to the case $f = \zeta$. For brevity we write $\eta(z) = \zeta(z)(z-1)/z^2$ throughout the sequel. Recall from the previous section that $\eta$ belongs to every Hardy space $H^2_\sigma$, $\sigma > 0$. Being a Hardy function, $\eta$ admits a useful factorization, some applications of which will be discussed in the next section. The zeros of $\eta$ in the right half plane $\Re z > 0$, which coincide with the non-trivial zeros of the zeta function, are generically denoted by $\rho$. The $\rho$'s are known to lie symmetrically with respect to both the real axis and the critical line $\Re z = 1/2$. That is, whenever $\rho$ is a zero, then so are the mirror images $\bar{\rho}$, $1 - \bar{\rho}$, and $1 - \rho$.

Figure 1. Convergence abscissa of Laplace transform equal to 1/2. Plot of criterion $\log^+ |D_N| / \log N$ versus $\log N$ for (a) Example 2, (b) Example 3.

Let $\sigma > 0$ be fixed. According to the factorization theorem for Hardy functions (see e.g. Dym and McKean [5] (ch. 2.7) or Hoffman [8] (p. 132, 133)), $\eta$ can be represented as the product of an outer and an inner function on the half plane $\Re z > \sigma$. More precisely,

$$\eta(z) = H_\sigma(z) B_\sigma(z) \quad (\Re z > \sigma), \tag{9}$$

where the outer function is given by

$$H_\sigma(z) = \exp\left\{\frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{t(z - \sigma) + i}{t + i(z - \sigma)}\, \frac{dt}{1 + t^2}\right\}, \tag{10}$$

and the inner function reduces in the present case to a Blaschke product $B_\sigma$, which is composed of the zeros $\rho$ of $\eta$ with $\Re\rho > \sigma$ and their mirror images after reflection at the line $\Re z = \sigma$, $2\sigma - \bar{\rho}$. Explicitly,

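$$B_\sigma(z) = \prod_{\Re\rho > \sigma} \frac{z - \rho}{z - (2\sigma - \bar{\rho})}\, \frac{|1 - (\rho - \sigma)^2|}{1 - (\rho - \sigma)^2}. \tag{11}$$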

These formulae are easily obtained from the familiar ones for the half plane $\Re z > 0$ [8] by shifting both the complex variable and the zeros by $\sigma$. The inner factor simplifies to a Blaschke product for the following reasons: (i) $\eta$ has an analytic continuation across the line $\Re z = \sigma$ to the entire right half plane, so that there is no singular factor; (ii) the constant $c$ appearing in the general factorization formula reduces to unity because $B_\sigma(\sigma) = 1$ and $H_\sigma(\sigma) = \eta(\sigma)$, as is readily verified. For real arguments $z = s$, taking first logarithms, then real parts on both sides of (9), one obtains for $s > \sigma > 0$

$$\log\eta(s) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{s - \sigma}{(s - \sigma)^2 + t^2}\, dt + \sum_{\Re\rho > \sigma} \log\left|\frac{s - \rho}{s - (2\sigma - \bar{\rho})}\right|. \tag{12}$$

Note that $\eta(s)$ is positive for $s > 0$, being the Laplace transform of a nonnegative function.

4 Applications

The factorization of $\eta$ gives rise to various tests of the RH. A first example is obtained by setting $\sigma = 1/2$ in (12). The sum on the right-hand side of (12) vanishes if and only if $\zeta(z)$ has no zero within the region $\Re z > 1/2$. Therefore the RH is true if and only if for some (and then for all) $s > 1/2$

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\tfrac{1}{2} + it)|\, \frac{s - \tfrac{1}{2}}{(s - \tfrac{1}{2})^2 + t^2}\, dt = \log\eta(s). \tag{13}$$

This criterion is equivalent to the condition that $\eta(z)$ be an outer function for the half plane $\Re z > 1/2$; cf. Dym and McKean [5], Sect. 2.7. For $s = 1$ it assumes a particularly neat form. The right-hand side vanishes, since $\eta(1) = 1$, and the left-hand side can be simplified, and one gets the following criterion for the truth of the RH, due to Balazard, Saias, and Yor [1]:

$$\int_{-\infty}^{\infty} \frac{\log|\zeta(\tfrac{1}{2} + it)|}{\tfrac{1}{4} + t^2}\, dt = 0. \tag{14}$$
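One way to see the simplification at $s = 1$ (offered here as a plausible route, not necessarily the author's) is to note that $|\eta(\tfrac{1}{2} + it)| = |\zeta(\tfrac{1}{2} + it)| / (\tfrac{1}{4} + t^2)^{1/2}$, so the integral in (13) splits into the zeta term of (14) and a multiple of $\int \log(\tfrac{1}{4} + t^2)/(\tfrac{1}{4} + t^2)\, dt$. The latter vanishes by the standard formula $\int_{-\infty}^{\infty} \log(a^2 + t^2)/(a^2 + t^2)\, dt = (2\pi/a) \log(2a)$ at $a = 1/2$. The sketch below, assuming NumPy and a hypothetical helper name, confirms this numerically via the substitution $t = a \tan\theta$.

```python
import numpy as np


def weighted_log_integral(a, n=400_000):
    """Midpoint approximation of int_{-inf}^{inf} log(a^2 + t^2) / (a^2 + t^2) dt.

    With t = a tan(theta): dt / (a^2 + t^2) = dtheta / a and
    log(a^2 + t^2) = 2 log(a) - 2 log(cos(theta)), theta in (-pi/2, pi/2).
    """
    h = np.pi / n
    theta = -np.pi / 2.0 + (np.arange(n) + 0.5) * h   # midpoints avoid the endpoints
    integrand = (2.0 * np.log(a) - 2.0 * np.log(np.cos(theta))) / a
    return np.sum(integrand) * h


value_half = weighted_log_integral(0.5)   # closed form: (2 pi / a) log(2a) = 0
value_one = weighted_log_integral(1.0)    # closed form: 2 pi log 2
```

The value at $a = 1$ serves as a control against the closed form; the value at $a = 1/2$ is zero up to discretisation error, which is why only the $\log|\zeta|$ term survives in (14).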

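Another example results from the formula

$$(\log\eta)'(\sigma) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)|/\eta(\sigma)\right] \frac{dt}{t^2} - 2 \sum_{\Re\rho > \sigma} \Re\, (\rho - \sigma)^{-1} \tag{15}$$

($\sigma > 0$), which can be derived from (12) by subtracting $\log\eta(\sigma)$ on both sides, dividing by $s - \sigma$, and then taking the limit $s \downarrow \sigma$. The interchange of limits and integration (or summation) can be justified by dominated convergence.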


Putting $\sigma = 1/2$ in (15), one obtains the following differential version of the integral tests (13), (14). The RH is true if and only if

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)|/\eta(\tfrac{1}{2})\right] \frac{dt}{t^2} = (\log\eta)'(\tfrac{1}{2}). \tag{16}$$

This statement can be amplified in various ways. First, it is possible to evaluate $(\log\eta)'(\tfrac{1}{2})$ explicitly, $(\log\eta)'(\tfrac{1}{2}) = \tfrac{\gamma}{2} + \tfrac{1}{2}\log(8\pi) + \tfrac{\pi}{4} - 6$, and for $\sigma = 1/2$ the sum in (15) can be written in a more symmetric form. One thus obtains the relation

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left|\frac{\eta(\tfrac{1}{2} + it)}{\eta(\tfrac{1}{2})}\right| \frac{dt}{t^2} - \left(\frac{\gamma}{2} + \frac{1}{2}\log(8\pi) + \frac{\pi}{4} - 6\right) = \sum_\rho \frac{|\Re\rho - \tfrac{1}{2}|}{|\rho - \tfrac{1}{2}|^2}, \tag{17}$$

in which the sum extends over all zeros in the critical strip. Note that (17) quantifies the difference between the two sides of (16) as a weighted sum of the absolute deviations of the real parts of the zeros from 1/2.
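The constant appearing in (16) and (17) is easy to evaluate. The short sketch below uses only the Python standard library; reading $\gamma$ as Euler's constant and the decomposition $(\log\eta)'(z) = \zeta'(z)/\zeta(z) + 1/(z-1) - 2/z$ follow from the definition of $\eta$, while the variable names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded to double precision

# (log eta)'(1/2) = gamma/2 + (1/2) log(8 pi) + pi/4 - 6
log_eta_prime_half = EULER_GAMMA / 2.0 + 0.5 * math.log(8.0 * math.pi) + math.pi / 4.0 - 6.0

# Since (log eta)'(z) = zeta'(z)/zeta(z) + 1/(z - 1) - 2/z, the last two terms
# contribute -2 - 4 = -6 at z = 1/2, so the remainder is zeta'(1/2)/zeta(1/2).
zeta_log_derivative_half = log_eta_prime_half + 6.0
```

Numerically, $(\log\eta)'(\tfrac{1}{2})$ is about $-3.3139$, so the right-hand side of (16) is negative.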

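Secondly, there is a connection with logarithmic Hilbert transforms, also called logarithmic dispersion relations [3]. Suppose we had $\eta(z) \neq 0$ for $\Re z > 1/2$. Then $\eta$ itself would be an outer function, $\eta(z) = H_{1/2}(z)$, with $H_{1/2}$ as in (10).

Taking imaginary parts in this equation, one can show with a little algebra that for $z - 1/2 = a + ib$, $a > 0$, one then has

$$\begin{aligned} \Im\log\eta(z) = {} & \frac{1}{\pi} \int_{-\infty}^{\infty} \left(\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|\right) \left[\frac{1}{t - b} - \frac{t}{1 + t^2}\right] dt \\ & - \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t - b}\, \frac{a^2}{a^2 + (t - b)^2}\, dt. \end{aligned} \tag{18}$$

Fix any $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$. Then the last term in (18) converges to zero as $a \downarrow 0$. Therefore, using the fact that $|\eta(\tfrac{1}{2} + it)|$ is an even function of $t$, one obtains in the limit the logarithmic dispersion relation

$$\Im\log\eta(\tfrac{1}{2} + ib) = \frac{2b}{\pi} \int_0^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t^2 - b^2}\, dt, \tag{19}$$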

which expresses the phase of $\eta$ on the boundary $\Re z = 1/2$ as an integral of its log modulus along that line. Recall that this relation is a consequence of the assumed outer function character of $\eta$, that is, of the RH. In fact, the validity of (19) for every $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$ is also sufficient for the RH. To see this, divide both sides of (19) by $b$ and let $b \downarrow 0$. Then the left side tends to $(\log\eta)'(\tfrac{1}{2})$, the right side to $\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)|/\eta(\tfrac{1}{2})\right] \frac{dt}{t^2}$, so in the limit we get the condition (16), shown above to be equivalent to the RH.

Finally, we note that $-(\log\eta)'(\sigma)$ equals the first moment of the probability density $p_\sigma$, cf. (6). In view of (16) and (15) this raises the question whether the integral term in these relations admits of a probabilistic interpretation, too. Relevant to this question is the observation, going back to Khintchine, that for every $\sigma > 1$ the function $f_\sigma(t) = \zeta(\sigma + it)/\zeta(\sigma)$ is the characteristic function of an infinitely divisible distribution; cf. Example 6, p. 75 in Gnedenko and Kolmogorov [7]. This can be verified by rewriting the product representation of the zeta function (for $\sigma > 1$) in the form

$$\frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \prod_p \frac{1 - p^{-\sigma}}{1 - p^{-\sigma - it}} = \exp\left\{\sum_p \sum_{n=1}^{\infty} \frac{1}{n p^{n\sigma}} \left(e^{-itn\log p} - 1\right)\right\} \tag{20}$$

and noting that $f_\sigma(t)$ is thus represented as a product of terms of the form $\exp(a(e^{ibt} - 1))$, each of which is the characteristic function of a Poisson random variable with intensity $a$ and values in the lattice $\{kb,\ k = 0, 1, 2, \ldots\}$.
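The representation (20) can be checked numerically for a sample point. The sketch below assumes NumPy; the function names and the truncation limits are illustrative assumptions. It compares the ratio $\zeta(\sigma + it)/\zeta(\sigma)$, computed from truncated Dirichlet series, with the compound Poisson form on the right of (20), computed from a truncated sum over primes.

```python
import numpy as np


def primes_up_to(n):
    """All primes <= n by the sieve of Eratosthenes."""
    sieve = np.ones(n + 1, dtype=bool)
    sieve[:2] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = False
    return np.flatnonzero(sieve)


def zeta_ratio_direct(sigma, t, n_max=200_000):
    """zeta(sigma + it) / zeta(sigma) from truncated Dirichlet series (sigma > 1)."""
    n = np.arange(1, n_max + 1, dtype=float)
    log_n = np.log(n)
    numerator = np.sum(np.exp(-(sigma + 1j * t) * log_n))
    denominator = np.sum(np.exp(-sigma * log_n))
    return numerator / denominator


def zeta_ratio_poisson(sigma, t, p_max=200_000, n_terms=60):
    """Right-hand side of Eq. (20), truncated at primes <= p_max and n <= n_terms."""
    log_p = np.log(primes_up_to(p_max).astype(float))
    exponent = 0.0 + 0.0j
    for n in range(1, n_terms + 1):
        intensity = np.exp(-n * sigma * log_p) / n      # mass 1 / (n p^{n sigma})
        exponent += np.sum(intensity * (np.exp(-1j * t * n * log_p) - 1.0))
    return np.exp(exponent)


direct = zeta_ratio_direct(2.0, 1.0)
poisson = zeta_ratio_poisson(2.0, 1.0)
```

At $\sigma = 2$, $t = 1$ the two values agree to roughly five decimal places, the remaining difference being due to the truncations. The modulus of the result does not exceed 1, as it must for a characteristic function.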

In order to connect this fact with the above question, it is convenient to introduce the Lévy measure $F_\sigma$ which puts mass $(n p^{n\sigma})^{-1}$ at each of the points $-n\log p$, $n \geq 1$, $p$ prime. Then (20) becomes $\log\frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \int (e^{itx} - 1)\, F_\sigma(dx)$, so taking real parts in this equation and using $\int_{-\infty}^{\infty} (1 - \cos tx)/t^2\, dt = \pi|x|$ ($x \in \mathbf{R}$), one obtains

$$\begin{aligned} \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\zeta(\sigma + it)|/\zeta(\sigma)\right] \frac{dt}{t^2} &= \frac{1}{\pi} \int_{-\infty}^{\infty} \int (\cos tx - 1)\, F_\sigma(dx)\, \frac{dt}{t^2} \\ &= \int \frac{1}{\pi} \int_{-\infty}^{\infty} (\cos tx - 1)\, \frac{dt}{t^2}\, F_\sigma(dx) = -\int |x|\, F_\sigma(dx) = \int x\, F_\sigma(dx). \end{aligned}$$

Thus we find that the essential part of the integral in question equals the first moment of the Lévy measure $F_\sigma$. The other part, stemming from the factor $(z-1)/z^2$, can be incorporated by introducing a signed, absolutely continuous measure $G_\sigma$ with density $x^{-1}\left[2e^{\sigma x} - e^{(\sigma - 1)x}\right]$ on $(-\infty, 0)$ (zero on $[0, \infty)$). One then has

$$\log\left[\eta(\sigma + it)/\eta(\sigma)\right] = \int (e^{itx} - 1)\, (F_\sigma - G_\sigma)(dx)$$

and hence

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)|/\eta(\sigma)\right] \frac{dt}{t^2} = \int x\, (F_\sigma - G_\sigma)(dx) \quad (\sigma > 1).$$

These calculations give a more detailed picture of how the factor $(z-1)/z^2$ regularizes the zeta function as $\sigma \downarrow 1$: it compensates the flow of mass of $F_\sigma$ towards $-\infty$ by the subtraction of measures $G_\sigma$ such that the first moment of $F_\sigma - G_\sigma$ remains bounded. Evidently, other ways of renormalizing the Lévy measure as $\sigma \downarrow 1$ are also conceivable and may be interesting to explore.

References

1. M. Balazard, E. Saias, and M. Yor, Adv. Math. 143, 284 (1999).

2. M. V. Berry and J. P. Keating, SIAM Review 41, 236 (1999).

3. R. E. Burge, M. A. Fiddy, A. H. Greenaway, and G. Ross, Proc. R. Soc. London A 350, 191 (1976).

4. J.-F. Burnol, <http://arXiv.org/abs/math/0001013> (2000).

5. H. Dym and H. P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem (Academic Press, New York, 1976).

6. H. M. Edwards, The Theory of the Riemann Zeta Function (Academic Press, New York, 1974).

7. B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, 1954).

8. K. Hoffman, Banach Spaces of Analytic Functions (Dover, New York, 1988).


ENSEMBLE PROBABILISTIC EQUILIBRIUM AND NON-EQUILIBRIUM THERMODYNAMICS WITHOUT THE THERMODYNAMICAL LIMIT

D. H. E. GROSS

Hahn-Meitner-Institut Berlin, Bereich Theoretische Physik, Glienickerstr. 100, 14109 Berlin, Germany, and Freie Universität Berlin, Fachbereich Physik; Email: gross@hmi.de

Boltzmann's principle $S = k \ln W$ allows one to extend equilibrium thermo-statistics to Small systems without invoking the thermodynamic limit [2, 3]. As the limit hides more than it clarifies the origin of phase transitions, a deeper and more transparent understanding is thus possible. The main clue is to base statistical probability on ensemble averaging and not on time averaging. It is argued that, due to the incomplete information obtained by macroscopic measurements, thermodynamics handles ensembles or finite-sized sub-manifolds in phase space and not single time-dependent trajectories. Therefore, ensemble averages are the natural objects of statistical probabilities. This is the physical origin of coarse-graining, which is no longer a mathematical ad hoc assumption. The probabilities $P(M)$ of macroscopic measurements $M$ are given by the ratio $P(M) = W(M)/W$ of the volumes of the sub-manifold $\mathcal{M}$ of the microcanonical ensemble with the constraint $M$ to the one without. From this concept all equilibrium thermodynamics can be deduced quite naturally, including the most sophisticated phenomena of phase transitions for Small systems.

Boltzmann's principle is generalized to non-equilibrium Hamiltonian systems with possibly fractal distributions $\mathcal{M}$ in $6N$-dim. phase space by replacing the conventional Riemann integral for the volume in phase space by its corresponding box-counting volume. This is equal to the volume of the closure $\overline{\mathcal{M}}$. With this extension the Second Law is derived without invoking the thermodynamic limit. The irreversibility in this approach is due to the replacement of the phase-space volume of the fractal sub-manifold $\mathcal{M}$ by the volume of its closure $\overline{\mathcal{M}}$. The physical reason for this replacement is that macroscopic measurements cannot distinguish $\mathcal{M}$ from $\overline{\mathcal{M}}$. Whereas the former is not changing in time, due to Liouville's theorem, the volume of the closure can be larger. In contrast to conventional coarse graining, the box-counting volume is defined in the limit of infinite resolution. I.e., there is no artificial loss of information.

1 Introduction

Recently the interest in the thermo-statistical behavior of non-extensive many-body systems, like atomic nuclei, atomic clusters, soft-matter, biological systems, and also self-gravitating astro-physical systems, has led us to consider thermo-statistics without using the thermodynamic limit. This is most safely done by going back to Boltzmann. Einstein calls Boltzmann's definition of entropy, as e.g. written on his famous epitaph,

$$S = k \cdot \ln W, \tag{1}$$

Boltzmann's principle [4], from which Boltzmann was able to deduce thermodynamics. Here $W$ is the number of micro-states at given energy $E$ of the $N$-body system in the spatial volume $V$:

$$W(E, N, V) = \mathrm{tr}\left[\epsilon_0\, \delta(E - H_N)\right], \tag{2}$$

$$\mathrm{tr}\left[\delta(E - H_N)\right] = \int \frac{d^{3N}p\; d^{3N}q}{N!\,(2\pi\hbar)^{3N}}\, \delta(E - H_N). \tag{3}$$

$\epsilon_0$ is a suitable energy constant to make $W$ dimensionless, $H_N$ is the $N$-particle Hamilton function, and the $N$ positions $q$ are restricted to the volume $V$, whereas the momenta $p$ are unrestricted. In what follows, we remain on the level of classical mechanics. The only reminders of the underlying quantum mechanics are the measure of the phase space in units of $2\pi\hbar$ and the factor $1/N!$, which respects the indistinguishability of the particles (Gibbs paradox). In contrast to Boltzmann [5, 6], who used the principle only for dilute gases, and to Schrödinger [7], who thought equation (1) is useless otherwise, I take the principle as the fundamental, generic definition of entropy. In the following sections I will demonstrate that this definition of thermo-statistics works well, especially also at higher densities and at phase transitions, without invoking the thermodynamic limit.

2 There is a lot to add to classical equilibrium statistics from our experience with Small systems

Following Lieb [8], extensivity [a] and the existence of the thermodynamic limit $N \to \infty|_{N/V = \mathrm{const}}$ are essential conditions for conventional (canonical) thermodynamics to apply. Certainly, this implies also the homogeneity of the system. Phase transitions are somehow foreign to this. The essence of first order transitions is that the systems become inhomogeneous and split into different phases separated by interfaces. In the conventional Yang-Lee theory, phase transitions are represented by the positive zeros of the grand-canonical partition sum, where the grand-canonical formalism breaks down (Yang-Lee singularities). In the following we show that the micro-canonical ensemble gives much more detailed and more natural insight, which corresponds to the experimental identification of phase transitions.

[a] Dividing extensive systems into larger pieces, the total energy and entropy are equal to the sum of those of the pieces.

There is a whole group of physical many-body systems, called Small in the following, which cannot be addressed by conventional thermo-statistics:

• nuclei

• atomic clusters

• polymers

• soft matter (biological) systems

• astrophysical systems

• first order transitions are distinguished from continuous transitions by the appearance of phase-separations and interfaces with surface tension. If the range of the force or the thickness of the surface layers is such that the number of surface particles is not negligible compared to the total number of particles, these systems are non-extensive.

For such systems the thermodynamic limit does not exist or makes no sense. Either the range of the forces (Coulomb, gravitation) is of the order of the linear dimensions of these systems, and/or they are strongly inhomogeneous, e.g. at phase-separation.

Boltzmann's principle does not invoke the thermodynamic limit, nor additivity, nor extensivity, nor concavity of the entropy $S(E, N)$ (downwards bending). This has been largely forgotten for a hundred years. We have to go back to pre-Gibbsian times. It is a purely geometrical definition of the entropy and applies as well to Small systems. Moreover, the entropy $S(E, N)$ as defined above is everywhere single-valued and multiply differentiable. There are no singularities in it. This is the simplest access to equilibrium statistics [9]. We will explore its consequences in this contribution. Moreover, we will see that in this way we get simultaneously the complete information about the three crucial parameters characterizing a phase transition of first order: transition temperature $T_{tr}$, latent heat per atom $q_{lat}$, and surface tension $\sigma_{surf}$. Boltzmann's famous epitaph above (eq. 1) contains everything that can be said about equilibrium thermodynamics in its most condensed form. $W$ is the volume of the sub-manifold at sharp energy in the $6N$-dim. phase space.


3 Relation of the topology of $S(E, N)$ to the Yang-Lee zeros of $Z(T, \mu, V)$

In conventional thermo-statistics, phase transitions are indicated by zeros of the grand-canonical partition function $Z(T, \mu, V)$; $V$ is the volume. See more details in refs. [1, 2, 3, 10].

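$$\begin{aligned} Z(T, \mu, V) &= \iint_0^{\infty} \frac{dE}{\epsilon_0}\, dN\; e^{-[E - \mu N - TS(E)]/T} \\ &= \frac{V^2}{\epsilon_0} \iint_0^{\infty} de\, dn\; e^{-V[e - \mu n - Ts(e, n)]/T} \\ &\approx e^{\,\mathrm{const.} + \mathrm{lin.} + \mathrm{quadr.}} \end{aligned} \tag{4}$$

in the thermodynamic limit $V \to \infty|_{N/V = \mathrm{const}}$. The double Laplace integral (4) can be evaluated asymptotically for large $V$ by expanding the exponent, as indicated in the last line, to second order in $\Delta e$, $\Delta n$ around the stationary point $e_s$, $n_s$, where the linear term vanishes:

$$\frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_s, \qquad \frac{\mu}{T} = -\left.\frac{\partial S}{\partial N}\right|_s, \qquad \frac{P}{T} = \left.\frac{\partial S}{\partial V}\right|_s; \tag{5}$$

the only term remaining to be integrated is the quadratic one. If the two eigen-curvatures satisfy $\lambda_1 < 0$, $\lambda_2 < 0$, this is then a Gaussian integral and yields

$$Z(T, \mu, V) = \frac{V^2}{\epsilon_0}\, e^{-V[e_s - \mu n_s - Ts(e_s, n_s)]/T} \iint_{-\infty}^{\infty} dv_1\, dv_2\; e^{V[\lambda_1 v_1^2 + \lambda_2 v_2^2]/2}, \tag{6}$$

$$Z(T, \mu, V) = e^{-F(T, \mu, V)/T}, \tag{7}$$

$$\frac{F(T, \mu, V)}{V} \to e_s - \mu n_s - Ts_s + \frac{T \ln\left(\sqrt{\det(e_s, n_s)}\right)}{V} + o\left(\frac{\ln V}{V}\right). \tag{8}$$

Here $\det(e_s, n_s)$ is the determinant of the curvatures of $s(e, n)$, and $v_1$, $v_2$ are the eigenvectors of $d$:

$$\det(e, n) = \begin{Vmatrix} \dfrac{\partial^2 s}{\partial e^2} & \dfrac{\partial^2 s}{\partial n\, \partial e} \\ \dfrac{\partial^2 s}{\partial e\, \partial n} & \dfrac{\partial^2 s}{\partial n^2} \end{Vmatrix} = \begin{Vmatrix} s_{ee} & s_{en} \\ s_{ne} & s_{nn} \end{Vmatrix} = \lambda_1 \lambda_2, \qquad \lambda_1 \geq \lambda_2. \tag{9}$$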

[Figure 1: plot of $s(e) - 25 - 11.5\, e$ versus $e$ for Na, $N_0 = 1000$, $P = 1$ atm, with the energies $e_1$, $e_2$, $e_3$, the latent heat $q_{lat}$ and the entropy loss $\Delta s_{surf}$ marked.]

Figure 1. MMMC simulation of the entropy $s(e)$ per atom ($e$ in eV per atom) of a system of $N_0 = 1000$ sodium atoms with realistic interaction at an external pressure of 1 atm. At the energy per atom $e_1$ the system is in the pure liquid phase and at $e_3$ in the pure gas phase, of course with fluctuations. The latent heat per atom is $q_{lat} = e_3 - e_1$. Attention: the curve $s(e)$ is artificially sheared by subtracting a linear function $25 + e \cdot 11.5$ in order to make the convex intruder visible. $s(e)$ is always a steeply monotonic rising function. We clearly see the global concave (downwards bending) nature of $s(e)$ and its convex intruder. Its depth is the entropy loss due to the additional correlations by the interfaces. From this one can calculate the surface tension per surface atom, $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$. The double tangent is the concave hull of $s(e)$. Its derivative gives the Maxwell line in the caloric curve $T(e)$ at $T_{tr}$. In the thermodynamic limit the intruder would disappear and $s(e)$ would approach the double tangent (Maxwell line) from below.

In the cases studied here $\lambda_2 < 0$, but $\lambda_1$ can be positive or negative. If $\det(e_s, n_s)$ is positive ($\lambda_1 < 0$), the last two terms in eq. (8) go to 0, and we obtain the familiar result $f(T, \mu, V \to \infty) = e_s - \mu n_s - Ts_s$. I.e., the curvature $\lambda_1$ of the entropy surface $s(e, n, V)$ decides whether the grand-canonical ensemble agrees with the fundamental micro ensemble in the thermodynamic limit. If this is the case, $\ln[Z(T, \mu)]$ or $f(T, \mu)$ is analytical in $e^{\mu/T}$ and, due to Yang and Lee, we have a single, stable phase. Otherwise, the Yang-Lee zeros reflect anomalous points/regions of $\lambda_1 \geq 0$ ($\det(e, n) \leq 0$). This is crucial. As $\det(e_s, n_s)$ can be studied for finite or even small systems as well, this is the only proper extension of phase transitions to Small systems.
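The curvature criterion can be illustrated with a short numerical sketch, assuming NumPy. The two entropy functions below are hypothetical toy examples chosen for their simple Hessians, not results from the paper, and the classification strings are illustrative labels. The code estimates the curvature matrix of $s(e, n)$ by central differences, extracts $\lambda_1 \geq \lambda_2$, and applies the sign test on $\lambda_1$ and on the determinant (9).

```python
import numpy as np


def curvature_eigenvalues(s, e, n, h=1e-4):
    """Eigenvalues (lambda_1 >= lambda_2) of the Hessian of s(e, n), by central differences."""
    s_ee = (s(e + h, n) - 2.0 * s(e, n) + s(e - h, n)) / h ** 2
    s_nn = (s(e, n + h) - 2.0 * s(e, n) + s(e, n - h)) / h ** 2
    s_en = (s(e + h, n + h) - s(e + h, n - h)
            - s(e - h, n + h) + s(e - h, n - h)) / (4.0 * h ** 2)
    lam = np.linalg.eigvalsh(np.array([[s_ee, s_en], [s_en, s_nn]]))  # ascending order
    return lam[1], lam[0]


def classify(lam1, lam2):
    """Sign test of the text: det > 0 with lambda_1 < 0 means a single stable phase."""
    det = lam1 * lam2
    if lam1 < 0.0 and det > 0.0:
        return "single stable phase"
    return "anomalous region (lambda_1 >= 0)"


def s_concave(e, n):
    """Hypothetical concave toy entropy; Hessian is diag(-1/e^2, -1/n^2)."""
    return np.log(e) + np.log(n)


def s_intruder(e, n):
    """Hypothetical toy entropy with a convex intruder in e near e = 0."""
    return e ** 2 - e ** 4 - n ** 2


lam_concave = curvature_eigenvalues(s_concave, 1.0, 1.0)
lam_intruder = curvature_eigenvalues(s_intruder, 0.0, 0.3)
```

For the concave toy function both eigen-curvatures are $-1$ and the determinant is positive; for the toy function with an intruder one finds $\lambda_1 = 2 > 0$ and a negative determinant, which is the situation the text associates with a first order transition.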

4 The regions of positive curvature $\lambda_1$ of $s(e_s, n_s)$ correspond to phase transitions of first order

We will now discuss the physical origin of convex (upwards bending) intruders in the entropy surface in two examples.

In table (1) we compare the liquid-gas phase transition in sodium clusters of a few hundred atoms with that of the bulk at 1 atm, cf. also fig. (1).

Figure (2) shows how, for a small system (Potts, $q = 3$, lattice gas with 50 × 50 points), all phenomena of phase transitions can be studied from the topology of the determinant of curvatures (9) in the micro-canonical ensemble.

Table 1: Parameters of the liquid-gas transition of small sodium clusters (MMMC calculation [1]) in comparison with the bulk, for rising number $N_0$ of atoms. $N_{surf}$ is the average number of surface atoms of all clusters together.

Na   N_0             200      1000     3000     bulk
     T_tr [K]        940      990      1095     1156
     q_lat [eV]      0.82     0.91     0.94     0.923
     s_boil          10.1     10.7     9.9      9.267
     Δs_surf         0.55     0.56     0.44
     N_surf          39.94    98.53    186.6    ∞
     σ/T_tr          2.75     5.68     7.07     7.41
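The entries of Table 1 can be cross-checked against each other. The following Python sketch is not part of the original calculation. It assumes that $s_{boil}$ is the latent heat per atom divided by $k_B T_{tr}$, and it uses the relation $\sigma/T_{tr} = \Delta s_{surf} N_0 / N_{surf}$ quoted in the caption of figure 1. The function names and the rounded value of the Boltzmann constant are introduced here for illustration only.

```python
# Boltzmann constant in eV per kelvin (rounded; an input of this sketch).
K_B = 8.617333e-5

# Rows of Table 1 as read from the text:
# N0 -> (T_tr [K], q_lat [eV], s_boil, ds_surf, N_surf, sigma/T_tr)
TABLE = {
    200: (940.0, 0.82, 10.1, 0.55, 39.94, 2.75),
    1000: (990.0, 0.91, 10.7, 0.56, 98.53, 5.68),
    3000: (1095.0, 0.94, 9.9, 0.44, 186.6, 7.07),
}


def boiling_entropy(q_lat, t_tr):
    """Assumed relation: entropy of boiling per atom, in units of k_B."""
    return q_lat / (K_B * t_tr)


def surface_tension_over_t(ds_surf, n0, n_surf):
    """Relation from the caption of figure 1: entropy loss per surface atom."""
    return ds_surf * n0 / n_surf


for n0, (t_tr, q_lat, s_boil, ds_surf, n_surf, sigma) in TABLE.items():
    print(
        n0,
        round(boiling_entropy(q_lat, t_tr), 2),
        round(surface_tension_over_t(ds_surf, n0, n_surf), 2),
    )
```

Within the rounding of the tabulated values, both relations reproduce the $s_{boil}$ and $\sigma/T_{tr}$ rows, which supports the reading of the decimal points in the table.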

5 Boltzmann's principle and non-equilibrium thermodynamics

Before we proceed we must comment on Einstein's attitude to the principle [11]. Originally, Boltzmann called $W$ the "Wahrscheinlichkeit" (probability), i.e. the relative time a system spends (along a time-dependent path) in a given region of the $6N$-dim. phase space. Our interpretation of $W$ as the number of "complexions" (Boltzmann's second interpretation) or quantum states (trace) with the same energy was criticized by Einstein [4] as artificial. It is exactly that criticized interpretation of $W$ which I use here and which works so excellently [1]. In section 7 I will come back to this fundamental point.

After succeeding in deducing equilibrium statistics, including all phenomena of phase transitions, from Boltzmann's principle even for "Small" systems, i.e. non-extensive many-body systems, it is challenging to explore how far this "most conservative and restrictive way to thermodynamics" [9] is able to describe also the approach of (possibly "Small") systems to equilibrium and the Second Law of Thermodynamics.

Thermodynamics describes the development of macroscopic features of many-body systems without specifying them microscopically in all details. Before we address the Second Law, we have to clarify what we mean by the label "macroscopic observable".

6 Macroscopic observables imply the EPS-probability

Figure 2: Contour plot of the curvature determinant of the Potts-3 lattice gas (horizontal axis: $e$). Dark grey line: $d = 0$, boundary of the region of phase coexistence, the triangle $AP_mB$. Light grey line: minimum of $d(e,n)$ in the direction of the largest curvature, second order transition. In the triangle $AP_mC$: ordered (solid) phase. Above and right of the line $CP_mB$: disordered (gas) phase. The crossing $P_m$ of the boundary lines is a multi-critical point. The light grey region around the multi-critical point $P_m$ corresponds to a flat region of $d(e,n) \approx 0$.

A single point $\{q_i(t), p_i(t)\}_{i=1 \ldots N}$ in the $N$-body phase space corresponds to a detailed specification of the system with all degrees of freedom (d.o.f.) completely fixed at time $t$ (microscopic determination). Fixing only the total energy $E$ of an $N$-body system leaves the other $(6N-1)$ degrees of freedom unspecified. A second system with the same energy is most likely not in the same microscopic state as the first; it will be at another point in phase space, and the other d.o.f. will be different. I.e. the measurement of the total energy $H_N$, or of any other macroscopic observable $M$, determines a $(6N-1)$-dimensional sub-manifold $\mathcal{E}$ or $\mathcal{M}$ in phase space. All points in the $N$-body phase space consistent with the given value of $E$ and volume $V$, i.e. all points in the $(6N-1)$-dimensional sub-manifold $\mathcal{E}(N,V)$ of phase space, are equally consistent with this measurement. $\mathcal{E}(N,V)$ is the microcanonical ensemble. This example tells us that any macroscopic measurement is incomplete and defines a sub-manifold of points in phase space, not a single point. An additional measurement of another macroscopic quantity $B\{q,p\}$ reduces $\mathcal{E}$ further to the cross-section $\mathcal{E} \cap \mathcal{B}$, a $(6N-2)$-dimensional subset of points in $\mathcal{E}$ with the volume

$$W(B,E,N,V) = \frac{1}{N!}\int \left(\frac{d^3q\, d^3p}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q,p\})\, \delta(B - B\{q,p\}). \qquad (10)$$


If $H_N\{q,p\}$ as well as $B\{q,p\}$ are continuously differentiable functions of their arguments, which we assume in the following, $\mathcal{E} \cap \mathcal{B}$ is closed. In the following we use $W$ for the Riemann or Liouville volume of a manifold.

Microcanonical thermostatics gives the probability $P(B,E,N,V)$ to find the $N$-body system in the sub-manifold $\mathcal{E} \cap \mathcal{B}(E,N,V)$:

$$P(B,E,N,V) = \frac{W(B,E,N,V)}{W(E,N,V)} = e^{\ln[W(B,E,N,V)] - S(E,N,V)}. \qquad (11)$$

This is what Krylov seems to have had in mind [12] and what I will call the "ensemble probabilistic formulation of statistical mechanics (EPS)".

Similarly, thermodynamics describes the development in time of some macroscopic observable $B\{q_t,p_t\}$ of a system which was specified at an earlier time $t_0$ by another macroscopic measurement $A\{q_0,p_0\}$. It is related to the volume of the sub-manifold $\mathcal{M}(t) = \mathcal{A}(t_0) \cap \mathcal{B}(t) \cap \mathcal{E}$:

$$W(A,B,E,t) = \frac{1}{N!}\int \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \delta(B - B\{q_t,p_t\})\, \delta(A - A\{q_0,p_0\})\, \epsilon_0\, \delta(E - H\{q_t,p_t\}), \qquad (12)$$

where $\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}$ is the set of trajectories solving the Hamilton-Jacobi equations

$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1 \cdots N \qquad (13)$$

with the initial conditions $\{q(t = t_0) = q_0;\ p(t = t_0) = p_0\}$. For a very large system with $N \sim 10^{23}$ the probability $P(B(t))$ to find a given value $B(t)$ is usually sharply peaked as a function of $B$. Ordinary thermodynamics treats systems in the thermodynamic limit $N \to \infty$ and gives only $\langle B(t) \rangle$. However, here we are interested in formulating the Second Law for "Small" systems, i.e. we are interested in the whole distribution $P(B(t))$, not only in its mean value $\langle B(t) \rangle$. Thermodynamics does not describe the temporal development of a single system (single point in the $6N$-dim. phase space).

There is an important property of macroscopic measurements: Whereas the macroscopic constraint $A\{q_0,p_0\}$ determines (usually) a compact region $\mathcal{A}(t_0)$ in $\{q_0,p_0\}$, this does not need to be the case at later times $t \gg t_0$: $\mathcal{A}(t)$, defined by $A\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\}$, might become a fractal, i.e. "spaghetti"-like, manifold (cf. fig. 3) as a function of $\{q_t,p_t\}$ in $\mathcal{E}$ at $t \to \infty$ and lose compactness.

This can be expressed in mathematical terms: There exist series of points $\{a_n\} \in \mathcal{A}(\infty)$ which converge to a point $a_{n \to \infty}$ which is not in $\mathcal{A}(\infty)$. E.g. such points may have intruded from the phase space complementary to $\mathcal{A}(t_0)$. Illustrative examples for this evolution of an initially compact sub-manifold into a fractal set are the baker transformation discussed in this context in refs. [13,14]. Then no macroscopic (incomplete) measurement at time $t = \infty$ can resolve $a_\infty$ from its immediate neighbors $a_n$ in phase space with distances $|a_n - a_\infty|$ less than any arbitrarily small $\delta$. In other words, at the time $t \gg t_0$ no macroscopic measurement with its incomplete information about $\{q_t,p_t\}$ can decide whether $\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\} \in \mathcal{A}(t_0)$ or not. I.e. any macroscopic theory like thermodynamics can only deal with the closure of $\mathcal{A}(t)$. If necessary, the sub-manifold $\mathcal{A}(t)$ must be artificially closed to $\overline{\mathcal{A}(t)}$, as developed further in section 8. Clearly, in this approach this is the physical origin of irreversibility. We come back to this in section 8.

7 On Einstein's objections against the EPS-probability

According to Abraham Pais, "Subtle is the Lord" [11], Einstein was critical with regard to the definition of relative probabilities by eq. (11), Boltzmann's counting of "complexions". He considered it as artificial and not corresponding to the immediate picture of probability used in the actual problem: "The word probability is used in a sense that does not conform to its definition as given in the theory of probability. In particular, cases of equal probability are often hypothetically defined in instances where the theoretical pictures used are sufficiently definite to give a deduction rather than a hypothetical assertion" [4]. He preferred to define probability by the relative time a system (a trajectory of a single point moving with time in the $N$-body phase space) spends in a subset of the phase space. However, is this really the immediate picture of probability used in statistical mechanics? This definition demands the ergodicity of the trajectory in phase space. As we discussed above, thermodynamics, as any other macroscopic theory, handles incomplete, macroscopic information about the $N$-body system. Consequently it handles the temporal evolution of finite-sized sub-manifolds (ensembles), not of single points in phase space. The typical outcomes of macroscopic measurements are calculated. Nobody waits in a macroscopic measurement, e.g. of the temperature, long enough that an atom can cross the whole system.

In this respect, I think the EPS version of statistical mechanics is closer to the experimental situation than the duration-time of a single trajectory. Moreover, in an experiment on a small system like a nucleus, the excited nucleus, which may then fragment statistically later on, is produced by a multiple repetition of scattering events, and statistical averages are taken. No ergodic covering of the whole phase space by a single trajectory in time is demanded. At the high excitations of the nuclei in the fragmentation region their lifetime would be too short for that. This is analogous to the statistics of a falling ball on a Galton's nail-board, where a single trajectory also does not touch all nails but is random. Only after many repetitions is the smooth binomial distribution established. As I am discussing here the Second Law in finite systems, this is the correct scenario, not the time average over a single ergodic trajectory.
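The Galton-board analogy can be made concrete with a small simulation. The Python sketch below is an illustration only and is not taken from the text: it assumes an idealized board on which each nail deflects the ball left or right with probability 1/2, and the function names, the number of rows and the number of balls are arbitrary choices. A single ball visits only one nail per row, but the histogram over many repetitions approaches the binomial distribution.

```python
import random
from math import comb


def galton_counts(rows, balls, seed=0):
    """Drop `balls` balls through `rows` rows of nails and count the final bins."""
    rng = random.Random(seed)
    counts = [0] * (rows + 1)
    for _ in range(balls):
        # Each nail sends the ball one step to the right with probability 1/2.
        bin_index = sum(rng.random() < 0.5 for _ in range(rows))
        counts[bin_index] += 1
    return counts


def binomial_pmf(rows):
    """Binomial probabilities for `rows` fair left/right decisions."""
    return [comb(rows, k) / 2 ** rows for k in range(rows + 1)]


rows, balls = 10, 20000
counts = galton_counts(rows, balls)
freqs = [c / balls for c in counts]
max_dev = max(abs(f - p) for f, p in zip(freqs, binomial_pmf(rows)))
print("largest deviation from the binomial distribution:", max_dev)
```

The ensemble of repetitions, not the path of one ball, carries the statistics, which is the point made above for the fragmenting nucleus.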

8 Fractal distributions in phase space Second Law

Let us examine the following Gedanken experiment: Suppose the probability to find our system at points $\{q_t,p_t\}$ in phase space is uniformly distributed for times $t < t_0$ over the sub-manifold $\mathcal{E}(N,V_1)$ of the $N$-body phase space at energy $E$ and spatial volume $V_1$. At time $t > t_0$ we allow the system to spread over the larger volume $V_2 > V_1$ without changing its energy. If the system is dynamically mixing, the majority of trajectories $\{q_t,p_t\}$ in phase space starting from points $\{q_0,p_0\}$ with $q_0 \in V_1$ at $t_0$ will now spread over the larger volume $V_2$. Of course, the Liouvillean measure of the distribution $\mathcal{M}\{q_t,p_t\}$ in phase space at $t > t_0$ will remain the same ($= \mathrm{tr}[\mathcal{E}(N,V_1)]$) [15]. (The label $\{q_0 \in V_1\}$ of the integral means that the positions $\{q_0\}$ are restricted to the volume $V_1$; the momenta $\{p_0\}$ are unrestricted.)

$$\mathrm{tr}[\mathcal{M}\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}]_{\{q_0 \in V_1\}} = \int_{\{q_0 \in V_1\}} \frac{1}{N!} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\})$$

$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!} \left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_0,p_0\}), \qquad (14)$$

because of

$$\frac{\partial\{q_t,p_t\}}{\partial\{q_0,p_0\}} = 1. \qquad (15)$$

But as already argued by Gibbs, the distribution $\mathcal{M}\{q_t,p_t\}$ will be filamented like ink in water and will approach any point of $\mathcal{E}(N,V_2)$ arbitrarily closely. $\mathcal{M}\{q_t,p_t\}$ becomes dense in the new, larger $\mathcal{E}(N,V_2)$ for times sufficiently larger than $t_0$ (strictly in the limit $t \to \infty$). The closure $\overline{\mathcal{M}}$ becomes equal to $\mathcal{E}(N,V_2)$. This is clearly expressed by Lebowitz [16,17].

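In order to express this fact mathematically, we have to redefine Boltzmann's definition of entropy, eq. (1), and introduce the following fractal "measure" for integrals like (3) or (10):

$$W(E,N,t \gg t_0) = \frac{1}{N!}\int_{\{q_0 \in V_1\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}). \qquad (16)$$

With the transformation

$$\int \left(d^3q_t\, d^3p_t\right)^N \cdots = \int d\sigma_1 \cdots d\sigma_{6N} \cdots, \qquad (17)$$

$$d\sigma_{6N} = \frac{1}{\|\nabla H\|} \sum_i \left(\frac{\partial H}{\partial q_i}\, dq_i + \frac{\partial H}{\partial p_i}\, dp_i\right) = \frac{1}{\|\nabla H\|}\, dE, \qquad (18)$$

$$\|\nabla H\| = \sqrt{\sum_i \left[\left(\frac{\partial H}{\partial q_i}\right)^2 + \left(\frac{\partial H}{\partial p_i}\right)^2\right]}, \qquad (19)$$

$$W(E,N,t \gg t_0) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int_{\{q_0 \in V_1\}} d\sigma_1 \cdots d\sigma_{6N-1}\, \frac{\epsilon_0}{\|\nabla H\|}, \qquad (20)$$

we replace $\mathcal{M}$ by its closure $\overline{\mathcal{M}}$ and define now

$$W(E,N,t \gg t_0) \to M(E,N,t \gg t_0) = \langle G(\mathcal{E}(N,V_2)) \rangle\; \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)], \qquad (21)$$

where $\langle G(\mathcal{E}(N,V_2)) \rangle$ is the average of $\epsilon_0 / (N!\,(2\pi\hbar)^{3N}\, \|\nabla H\|)$ over the (larger) manifold $\mathcal{E}(N,V_2)$, and $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ is the box-counting volume of $\mathcal{M}(E,N,t \gg t_0)$, which is the same as the volume of $\overline{\mathcal{M}}$, see below.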

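To obtain $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ we cover the $d$-dim. sub-manifold $\mathcal{M}(t)$, here with $d = (6N-1)$, of the phase space by a grid with spacing $\delta$ and count the number $N_\delta \propto \delta^{-d}$ of boxes of size $\delta^{6N}$ which contain points of $\mathcal{M}$. Then we determine

$$\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] = \underline{\lim}_{\delta \to 0}\, \delta^d\, N_\delta[\mathcal{M}(E,N,t \gg t_0)] \qquad (22)$$

with $\underline{\lim}\, * = \inf[\lim *]$, or symbolically

$$M(E,N,t \gg t_0) = B_d\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) \qquad (23)$$

$$\to \frac{1}{N!} \int_{\{q_t \in V_2\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}) = W(E,N,V_2) \ge W(E,N,V_1), \qquad (24)$$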

Figure 3: The compact set $\mathcal{M}(t_0)$, left side, develops into an increasingly folded "spaghetti"-like distribution in phase space with rising time $t$. (Panel labels: $V_a$ and $V_b$ at $t < t_0$; $V_a + V_b$ at $t > t_0$.) This figure shows only the early form of the distribution. At much larger times it will become more and more fractal. The grid illustrates the boxes of the box-counting method. All boxes which overlap with $\mathcal{M}(t)$ are counted in $N_\delta$ in eq. (22).

where $B_d\!\!\int$ means that this integral should be evaluated via the box-counting volume (22), here with $d = 6N-1$. This is illustrated by figure 3. With this extension of eq. (3), Boltzmann's entropy (1) is at time $t \to \infty$ equal to the logarithm of the larger phase space $W(E,N,V_2)$. This is the Second Law of Thermodynamics. The box-counting is also used in the definition of the Kolmogorov entropy, the average rate of entropy gain [18,19]. Of course, still at $t_0$, $\mathcal{M}(t_0) = \overline{\mathcal{M}(t_0)} = \mathcal{E}(N,V_1)$:

$$M(E,N,t_0) = B_d\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!} \left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) \qquad (25)$$

$$\equiv \int_{\{q_0 \in V_1\}} \frac{1}{N!} \left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) = W(E,N,V_1). \qquad (26)$$
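The difference between the conserved Liouville volume and the coarse-grained box volume can be illustrated with the baker transformation mentioned in section 6. The Python sketch below is a toy model and not the author's calculation: it assumes the map $(x, y) \to (2x \bmod 1, (y + \lfloor 2x \rfloor)/2)$ on the unit square, an initial set $[0,1] \times [0,1/2]$, and a grid of 16 rows; all function names are hypothetical. Because the set always spans the full $x$-range, it is stored as a list of $y$-intervals, and exact fractions avoid rounding effects.

```python
from fractions import Fraction


def baker_step(intervals):
    """Image of the set [0,1] x Y under the baker map; Y is a list of disjoint y-intervals."""
    half = Fraction(1, 2)
    lower = [(a / 2, b / 2) for a, b in intervals]                 # from x < 1/2
    upper = [(a / 2 + half, b / 2 + half) for a, b in intervals]   # from x >= 1/2
    return lower + upper


def lebesgue_area(intervals):
    """Exact area of [0,1] x Y (the conserved Liouville volume)."""
    return sum(b - a for a, b in intervals)


def box_covered_area(intervals, m):
    """Area of all grid rows of height 1/m that overlap the set (box-counting at fixed delta)."""
    delta = Fraction(1, m)
    occupied = 0
    for k in range(m):
        lo, hi = k * delta, (k + 1) * delta
        if any(a < hi and b > lo for a, b in intervals):
            occupied += 1
    return occupied * delta


y_set = [(Fraction(0), Fraction(1, 2))]   # initial compact set: lower half of the square
history = []
for step in range(6):
    history.append((step, lebesgue_area(y_set), box_covered_area(y_set, 16)))
    y_set = baker_step(y_set)

for step, area, covered in history:
    print(step, area, covered)
```

In this toy model the exact area stays at 1/2 for all steps, while the area of the occupied boxes at the fixed resolution 1/16 jumps to 1 once the stripes are finer than the grid. For any finite time a sufficiently fine grid would again give 1/2, so the result depends on taking $t \to \infty$ before $\delta \to 0$.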

The box-counting volume is analogous to the standard method to determine the fractal dimension of a set of points [18] by the box-counting dimension:

$$\dim_{box}[\mathcal{M}(E,N,t \gg t_0)] = \lim_{\delta \to 0}\, \frac{\ln N_\delta[\mathcal{M}(E,N,t \gg t_0)]}{-\ln \delta}. \qquad (27)$$
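As a worked instance of eq. (27), the following Python sketch estimates the box-counting dimension of the middle-third Cantor set. The Cantor set is not discussed in the text; it is chosen here only because its box counts are known exactly. The function names are hypothetical, and the construction uses integer arithmetic in units of $3^{-\mathrm{level}}$ to avoid rounding.

```python
from math import log


def cantor_left_endpoints(level):
    """Left endpoints of the 2**level intervals of the level-th construction step,
    as integers in units of 3**(-level)."""
    points = [0]
    for _ in range(level):
        # Rescale the previous step by 3 and add its copy shifted by 2/3 of the new unit.
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points


def box_count(points, level, k):
    """Number of boxes of size delta = 3**(-k) that contain at least one point (k <= level)."""
    scale = 3 ** (level - k)
    return len({p // scale for p in points})


def box_dimension(level, k):
    """ln N_delta / (-ln delta) at delta = 3**(-k), as in eq. (27) before the limit."""
    points = cantor_left_endpoints(level)
    return log(box_count(points, level, k)) / (k * log(3))


print(box_dimension(8, 5), log(2) / log(3))
```

For this set the ratio is independent of $k$ and equals $\ln 2/\ln 3$, so the limit $\delta \to 0$ is trivial here.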


Like the box-counting dimension, $\mathrm{vol}_{box}$ has the peculiarity that it is equal to the volume of the smallest closed covering set. E.g. the box-counting volume of the set of rational numbers $\{\mathbf{Q}\}$ between 0 and 1 is $\mathrm{vol}_{box}\{\mathbf{Q}\} = 1$, and thus equal to the measure of the real numbers, cf. Falconer [18], section 3.1. This is the reason why $\mathrm{vol}_{box}$ is not a measure in its mathematical definition, because then we should have

$$\mathrm{vol}_{box}\Big[\sum_{i \in \mathbf{Q}} (\mathcal{M}_i)\Big] = \sum_{i \in \mathbf{Q}} \mathrm{vol}_{box}[\mathcal{M}_i] = 0, \qquad (28)$$

therefore the quotation marks for the box-counting "measure".

Coming back to the end of section 6, the volume $W(A,B,\cdots,t)$ of the relevant ensemble, the closure $\overline{\mathcal{M}(t)}$, must be "measured" by something like the box-counting "measure" (22, 23) with the box-counting integral $B_d\!\!\int$, which must replace the integral in eq. (3). Due to the fact that the box-counting volume is equal to the volume of the smallest closed covering set, the new, extended definition of the phase-space integral, eq. (23), is for compact sets like the equilibrium distribution $\mathcal{E}$ identical to the old one, eq. (3). Therefore, one can simply replace the old Boltzmann definition of the number of complexions, and with it of the entropy, by the new one (23).
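The statement about the rational numbers can be checked directly for a finite grid. The Python sketch below is an illustration with hypothetical function names: it enumerates the rationals in $[0,1]$ up to a chosen maximal denominator and evaluates $\delta\, N_\delta$ for boxes of size $\delta = 1/m$. As soon as the denominators reach $m$, every box contains a rational, so the box-counting volume is 1, although the set is countable and has Lebesgue measure 0.

```python
from fractions import Fraction


def rationals_up_to(max_den):
    """All rationals p/q in [0, 1] with denominator q <= max_den."""
    return {Fraction(p, q) for q in range(1, max_den + 1) for p in range(q + 1)}


def box_volume(points, m):
    """delta * N_delta for the boxes [k/m, (k+1)/m) on [0, 1], with delta = 1/m.
    The point x = 1 is assigned to the last box."""
    occupied = {min(int(x * m), m - 1) for x in points}
    return Fraction(len(occupied), m)


print(box_volume(rationals_up_to(32), 32))   # every box is hit
print(box_volume(rationals_up_to(4), 32))    # a coarse subset hits only a few boxes
```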

9 Conclusion

Macroscopic measurements $M$ determine only a very few of all $6N$ d.o.f. Any macroscopic theory like thermodynamics deals with the volumes $M$ of the corresponding closed sub-manifolds $\overline{\mathcal{M}}$ in the $6N$-dim. phase space, not with single points. The averaging over ensembles or finite sub-manifolds in phase space becomes especially important for the microcanonical ensemble of a finite system.

Because of this necessarily coarse information, macroscopic measurements, and with them also macroscopic theories, are unable to distinguish fractal sets $\mathcal{M}$ from their closures $\overline{\mathcal{M}}$. Therefore, I make the conjecture: the proper manifolds determined by a macroscopic theory like thermodynamics are the closed $\overline{\mathcal{M}}$. However, an initially closed subset of points at time $t_0$ does not necessarily evolve again into a closed subset at $t \gg t_0$. I.e. the closure operation and the $t \to \infty$ limit do not commute, and the macroscopic dynamics becomes irreversible. The $\lim_{t \to \infty}$ and $\lim_{\delta \to 0}$ may be linked as e.g. $\delta > \mathrm{const.}/t$, with the $\delta \to 0$ limit taken after the $t \to \infty$ limit.

Here is the origin of the misunderstanding by the famous reversibility paradoxes which were invented by Loschmidt [20] and Zermelo [21,22] and which bothered Boltzmann so much [23,24]. These paradoxes address trajectories of single points in the $N$-body phase space, which must return after Poincaré's recurrence time or which must run backwards if all momenta are exactly reversed. Therefore, Loschmidt and Zermelo concluded that the entropy should decrease as well as it was increasing before. The specification of a single point demands of course a microscopically exact specification of all $6N$ degrees of freedom, not a determination of a few macroscopic degrees of freedom only. No entropy is defined for a single point.

By our formulation of thermo-statistics, various non-trivial limiting processes can be avoided. Neither does one invoke the thermodynamic limit of a homogeneous system with infinitely many particles, nor does one rely on the ergodic hypothesis of the equivalence of (very long) time averages and ensemble averages. The use of ensemble averages is justified directly by the very nature of macroscopic (incomplete) measurements. Coarse-graining appears as a natural consequence of this. The box-counting method mirrors the averaging over the overwhelming number of non-determined degrees of freedom. Of course, a fully consistent theory must use this averaging explicitly. Then one would not depend on the order of the limits $\lim_{\delta \to 0} \lim_{t \to \infty}$, as it was tacitly assumed here. Presumably, the rise of the entropy can then be seen already at finite times, when the fractality of the distribution in phase space is not yet fully developed. The coarse-graining is no longer a mathematical ad hoc assumption. Moreover, the Second Law is in the EPS-formulation of statistical mechanics not linked to the thermodynamic limit, as was thought up to now [16,17].

Appendix

In the mathematical theory of fractals [18] one usually uses the Hausdorff measure or the Hausdorff dimension of the fractal [19]. This, however, would be wrong in Statistical Mechanics. Here I want to point out the difference between the box-counting "measure" and the proper Hausdorff measure of a manifold of points in phase space. Without going into too much mathematical detail, we can make this clear again with the same example as above: The Hausdorff measure of the rational numbers $\in [0,1]$ is 0, whereas the Hausdorff measure of the real numbers $\in [0,1]$ is 1. Therefore, the Hausdorff measure of a set is a proper measure. The Hausdorff measure of the fractal distribution in phase space $\mathcal{M}(t \to \infty)$ is the same as that of $\mathcal{M}(t_0)$, $W(E,N,V_1)$. Measured by the Hausdorff measure, the phase space volume of the fractal distribution $\mathcal{M}(t \to \infty)$ is conserved and Liouville's theorem applies. This would demand that thermodynamics could distinguish any point inside the fractal from any point outside of it, independently of how close it is. This, however, is impossible for any macroscopic theory that can only address macroscopic information, where all unobserved degrees of freedom are averaged over. That is the deep reason why the box-counting "measure" must be taken and where irreversibility comes from.

Acknowledgement

I thank E. G. D. Cohen and Pierre Gaspard for detailed discussions.

References

1. D. H. E. Gross, Microcanonical thermodynamics: Phase transitions in "Small" systems, Lecture Notes in Physics (World Scientific, Singapore, 2000).

2. D. H. E. Gross and E. Votyakov, Phase transitions in "small" systems, Eur. Phys. J. B 15, 115-126 (2000), http://arXiv.org/abs/cond-mat/9911257.

3. D. H. E. Gross, Micro-canonical statistical mechanics of some non-extensive systems, http://arXiv.org/abs/astro-ph/cond-mat/0004268 (2000).

4. A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt, Annalen der Physik 17, 132 (1905).

5. L. Boltzmann, Über die Beziehung eines allgemeinen mechanischen Satzes zum Hauptsatz der Wärmelehre, Sitzungsbericht der Akademie der Wissenschaften, Wien 2, 67-73 (1877).

6. L. Boltzmann, Über die Begründung einer kinetischen Gastheorie auf anziehende Kräfte allein, Wiener Berichte 89, 714 (1884).

7. E. Schrödinger, Statistical Thermodynamics, a Course of Seminar Lectures, delivered in January-March 1944 at the School of Theoretical Physics (Cambridge University Press, London, 1946).

8. Elliott H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Report, cond-mat/9708200, 310, 1-96 (1999).

9. J. Bricmont, Science of chaos or chaos in science?, Physicalia Magazine, Proceedings of the New York Academy of Science, to appear, 1-50 (2000).

10. D. H. E. Gross, Phase transitions in "small" systems - a challenge for thermodynamics, http://arXiv.org/abs/cond-mat/0006087, page 8 (2000).

11. A. Pais, Subtle is the Lord, chapter 4, pages 60-78 (Oxford University Press, Oxford, 1982).

12. N. S. Krylov, Works on the Foundation of Statistical Physics (Princeton University Press, Princeton, 1979).

13. R. F. Fox, Entropy evolution for the baker map, Chaos 8, 462-465 (1998).

14. T. Gilbert, J. R. Dorfman, and P. Gaspard, Entropy production, fractals, and relaxation to equilibrium, Phys. Rev. Lett. 85, 1606, nlin.CD/000301 (2000).

15. H. Goldstein, Classical Mechanics (Addison-Wesley, Reading, Mass., 1959).

16. J. L. Lebowitz, Microscopic origins of irreversible macroscopic behavior, Physica A 263, 516-527 (1999).

17. J. L. Lebowitz, Statistical mechanics: A selective review of two central issues, Rev. Mod. Phys. 71, S346-S357 (1999).

18. K. Falconer, Fractal Geometry - Mathematical Foundations and Applications (John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1990).

19. E. W. Weisstein, Concise Encyclopedia of Mathematics (CRC Press, London, New York, Washington DC, 1999), CD-ROM edition 1, 20.5.99.

20. J. Loschmidt, Wienerberichte 73, 128 (1876).

21. E. Zermelo, Wied. Ann. 57, 778-784 (1896).

22. E. Zermelo, Über die mechanische Erklärung irreversibler Vorgänge, Wied. Ann. 60, 392-398 (1897).

23. E. G. D. Cohen, Boltzmann and statistical mechanics, in Boltzmann's Legacy, 150 Years after his Birth, http://xxx.lanl.gov/abs/cond-mat/9608054 (Atti dell'Accademia dei Lincei, Rome, 1997).

24. E. G. D. Cohen, Boltzmann and Statistical Mechanics, volume 371 of Dynamics: Models and Kinetic Methods for Nonequilibrium Many Body Systems, J. Karkheck, editor, 223-238 (Kluwer, Dordrecht, The Netherlands, 2000).


AN APPROACH TO QUANTUM PROBABILITY

STAN GUDDER
Department of Mathematics
University of Denver
Denver, Colorado 80208
sgudder@cs.du.edu

We present an approach to quantum probability that is motivated by the Feynman formalism. This approach shows that there is a realistic description of quantum mechanics and that nonrelativistic quantum theory can be derived from simple postulates of quantum probability. The basic concepts in this framework are measurements and actions. The measurements are similar to the dynamic variables of classical mechanics and the random variables of classical probability theory. The actions correspond to quantum mechanical states. An influence between configurations of a physical system is defined in terms of an action. The fundamental postulate of this approach is that the probability density at a measurement outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing the measurement.

1 Introduction

We shall discuss a new approach to quantum probability that combines a reformulation of the mathematical foundations of quantum mechanics and the basic tenets of probability theory. This approach is motivated by the Feynman formalism [1], and it answers various puzzling questions about traditional quantum mechanics. Some of these questions are the following.

1. Where does the quantum mechanical Hilbert space $H$ come from?

2. Why are states represented by unit vectors in $H$ and observables by self-adjoint operators on $H$?

3. Why does the probability have its postulated form?

4. Why do the position and momentum operators have their particular forms?

5. Why does a physical theory that must give real-valued results involve complex amplitudes or states?

6. Is there a realistic description of quantum mechanics?

Our philosophy is that quantum probability theory need not be the same as classical probability theory. That is, the probability need not be given by a measure. However, the predictions of quantum probability theory should agree with experimental long run relative frequencies. We shall show that there is a realistic description of quantum mechanics. In other words, a quantum system has properties independent of observation. We also show that nonrelativistic quantum mechanics can be derived from simple postulates of this approach. Our presentation is a modified version of the discussion in Gudder [2].

2 Formulation

We denote the set of possible configurations of a physical system $\mathcal{S}$ by $\Omega$ and call $\Omega$ a sample space. If $X$ is a measurement on $\mathcal{S}$, then executing $X$ results in a unique outcome depending on the configuration $\omega$ of $\mathcal{S}$. To be precise, we define a measurement to be a map $X$ from $\Omega$ onto its range $R(X) \subseteq \mathbb{R}$ satisfying

(M1) $R(X)$ is the base space of a measure space $(R(X), \Sigma_X, \mu_X)$.

(M2) $X^{-1}(x)$ is the base space of a measurable space $(X^{-1}(x), \Sigma_X^x)$ for every $x \in R(X)$.

We call the elements of $R(X)$ X-outcomes, and the sets in $\Sigma_X$ are X-events. Note that $X^{-1}(x)$ corresponds to the set of configurations resulting in outcome $x$ when $X$ is executed, and we call $X^{-1}(x)$ the X-fiber over $x$. The measure $\mu_X$ represents an a priori weight due to our knowledge of the system (for example, we may know the energy of $\mathcal{S}$ or we might assume the energy has a certain value). In the case of total ignorance, the weight is taken to be counting measure in the discrete case and uniform measure in the continuous case. This framework gives a realistic theory because a configuration $\omega$ determines the properties of $\mathcal{S}$ independent of any particular observation. That is, $\omega$ determines the outcomes of all measurements simultaneously. Notice that measurements are similar to the dynamical variables of classical mechanics and the random variables of classical probability theory. The sample space $\Omega$ gives an underlying level of reality upon which traditional quantum mechanics can be constructed.

If $X$ is a measurement, an X-action is a pair

$$(S, \{\mu_X^x : x \in R(X)\})$$

where $S: \Omega \to \mathbb{R}$ and $\mu_X^x$ is a measure on $(X^{-1}(x), \Sigma_X^x)$. As we shall see, actions correspond to quantum states. For simplicity, we frequently denote an action by $S$, and we remark that $S$ depends on our model of $\mathcal{S}$ and also on our knowledge of $\mathcal{S}$. We define the influence between $\omega, \omega' \in \Omega$ relative to $S$ by

$$F_S(\omega, \omega') = N_S^2 \cos[S(\omega) - S(\omega')] \qquad (1)$$

where $N_S > 0$ is a normalization constant. The appearance of the cosine in (1) is not arbitrary; it can be derived from the regularity conditions of continuity and causality [2,5].

We now make a fundamental reformulation of the probability concept [2,5]. We postulate that the probability density $P_{X,S}(x)$ of an X-outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing $X$. Precisely, we postulate that $F_S(\omega, \omega')$ is integrable and that

$$P_{X,S}(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} F_S(\omega, \omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega'). \qquad (2)$$

Also, to ensure that $P_{X,S}(x)$ is indeed a probability density, we assume that $P_{X,S}(x)$ is measurable with respect to $\Sigma_X$ and that

$$\int_{R(X)} P_{X,S}(x)\, \mu_X(dx) = 1. \qquad (3)$$

Equation (3) can be employed to find $N_S$. To show that $P_{X,S}(x) \ge 0$ we have

$$P_{X,S}(x) = N_S^2 \int_{X^{-1}(x)} \int_{X^{-1}(x)} [\cos S(\omega) \cos S(\omega') + \sin S(\omega) \sin S(\omega')]\, \mu_X^x(d\omega)\, \mu_X^x(d\omega')$$

$$= N_S^2 \left\{ \left[\int_{X^{-1}(x)} \cos S(\omega)\, \mu_X^x(d\omega)\right]^2 + \left[\int_{X^{-1}(x)} \sin S(\omega)\, \mu_X^x(d\omega)\right]^2 \right\} \ge 0. \qquad (4)$$

We conclude that $P_{X,S}(x)$ is a probability density on $(R(X), \Sigma_X, \mu_X)$.

If $B \in \Sigma_X$ is an X-event, we define the $(X,S)$-probability of $B$ by

$$P_{X,S}(B) = \int_B P_{X,S}(x)\, \mu_X(dx). \qquad (5)$$

Then $P_{X,S}: \Sigma_X \to [0,1]$ is a probability measure on $(R(X), \Sigma_X)$ that we call the S-distribution of $X$. If $h: R(X) \to \mathbb{R}$ is $P_{X,S}$-integrable, then the S-expectation of $h(X)$ is defined by

$$E_S(h(X)) = \int_{R(X)} h(x)\, P_{X,S}(dx) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx). \qquad (6)$$

In particular, if $h$ is the identity function, the S-expectation of $X$ becomes

$$E_S(X) = \int_{R(X)} x\, P_{X,S}(x)\, \mu_X(dx). \qquad (7)$$

Influence is a strictly quantum phenomenon that is not present in classical physics. In the classical limit, $F_S(\omega, \omega')$ approaches a delta function $\delta_{\omega'}(\omega)$. In this limit, $F_S(\omega, \omega') = 0$ for $\omega \ne \omega'$ and there is no influence between distinct configurations. We then have $P_{X,S}(x) = \mu_X^x(X^{-1}(x))$, which gives a classical probability framework.
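The non-negativity of the density, which follows from the cosine addition formula, can be verified numerically for a discrete fiber. The Python sketch below is an illustration under stated assumptions: the fiber is a finite list of configurations with counting measure, the action values are arbitrary numbers chosen here, and the function names are hypothetical. It compares the double sum of influences with the sum of two squares.

```python
import math


def density_from_influences(actions, norm=1.0):
    """Double sum of the influences N^2 cos(S_j - S_k) over one fiber (counting measure)."""
    return norm ** 2 * sum(math.cos(a - b) for a in actions for b in actions)


def density_from_squares(actions, norm=1.0):
    """The same quantity written as N^2 [(sum cos S)^2 + (sum sin S)^2]."""
    c = sum(math.cos(a) for a in actions)
    s = sum(math.sin(a) for a in actions)
    return norm ** 2 * (c * c + s * s)


fiber_actions = [0.3, 1.7, 4.0, 5.5]   # arbitrary action values S(omega) on one fiber
print(density_from_influences(fiber_actions), density_from_squares(fiber_actions))
```

Individual influences can be negative, but their double sum over a fiber never is, which is why the postulate yields a density.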

We can extend this theory to include expectations of other functions on $\Omega$. Let $g: \Omega \to \mathbb{R}$ be a function that is integrable along X-fibers. We define the $(X,S)$-expectation of $g$ at $x$ by

$$E_{X,S}(g)(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} g(\omega)\, F_S(\omega, \omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega'). \qquad (8)$$

This is the natural generalization of (2) from a probability density to an expectation density. If $E_{X,S}(g)$ is integrable, then the $(X,S)$-expectation of $g$ is given by

$$E_{X,S}(g) = \int_{R(X)} E_{X,S}(g)(x)\, \mu_X(dx). \qquad (9)$$

In particular, if $g(\omega) = h(X(\omega))$, then

$$E_{X,S}(g)(x) = h(x)\, P_{X,S}(x)$$

and

$$E_{X,S}(g) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx) = E_S(h(X)).$$

This shows that (9) is an extension of (6).

We can also use this formalism to compute probabilities of events in $\Omega$. Let $A \subseteq \Omega$ and denote the characteristic function of $A$ by $\chi_A$. If $\chi_A$ is integrable along X-fibers, we define, analogously to classical probability theory, the $(X,S)$-pseudoprobability of $A$ by

$$P_{X,S}(A) = E_{X,S}(\chi_A).$$

It follows from (3) and (9) that $P_{X,S}(\Omega) = 1$ and $P_{X,S}$ is countably additive. However, $P_{X,S}$ may have negative values, which is why it is called a pseudo-probability. Nevertheless, there are $\sigma$-algebras of subsets of $\Omega$ on which $P_{X,S}$ is a probability measure. For example, if $A = X^{-1}(B)$ for $B \in \Sigma_X$, then it can be shown that $P_{X,S}(A) = P_{X,S}(B)$ [2]. Therefore, in this case the pseudoprobability reduces to the distribution $P_{X,S}$. We shall consider some less trivial examples later.

3 Wave Functions and Hilbert Space

This section employs the formalism of Section 2 to derive the wave functions and Hilbert space of traditional quantum mechanics. It is not necessary to do this, because the needed probability formulas have been presented in Section 2. However, as we shall see, the Hilbert space formulation gives more convenient and concise notations.

Applying (4) we obtain

$$P_{X,S}(x) = \left| \int_{X^{-1}(x)} N_S\, e^{iS(\omega)}\, \mu_X^x(d\omega) \right|^2. \qquad (10)$$

We call the function

$$f_S(\omega) = N_S\, e^{iS(\omega)} \qquad (11)$$

the S-amplitude function and define the $(X,S)$-wave function by

$$f_{X,S}(x) = \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega). \qquad (12)$$

From (10) and (12) we obtain

$$P_{X,S}(x) = |f_{X,S}(x)|^2. \qquad (13)$$

We also have

$$F_S(\omega, \omega') = N_S^2\, \mathrm{Re}\, e^{iS(\omega)} e^{-iS(\omega')} = \mathrm{Re}\, f_S(\omega)\, \overline{f_S(\omega')}. \qquad (14)$$

Equation (10) shows how the complex numbers arise in quantum mechanics. The complex numbers are not needed for the computation of $P_{X,S}$, because we can always write $F_S(\omega, \omega')$ in the form (1). They are merely a convenience that gives a simple and concise formula. Equation (11) gives the Feynman amplitude function, which we have now derived from deeper principles, and (12) is Feynman's prescription that the amplitude of an outcome $x$ is the sum (or integral) of the amplitudes of the configurations (or alternatives) that result in $x$ when $X$ is executed.
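Feynman's prescription can be tried out on the simplest case of two alternatives. The Python sketch below is an illustration and not part of the paper: it assumes a fiber with two configurations, counting measure and $N_S = 1$, and the function names are hypothetical. The squared modulus of the summed amplitudes reproduces the familiar interference pattern $2 + 2\cos\varphi$, where $\varphi$ is the difference of the two actions.

```python
import cmath
import math


def wave_function(actions, norm=1.0):
    """Sum of the amplitudes N exp(i S(omega)) over one fiber (counting measure)."""
    return sum(norm * cmath.exp(1j * s) for s in actions)


def density(actions, norm=1.0):
    """Probability density as the squared modulus of the wave function."""
    return abs(wave_function(actions, norm)) ** 2


for phase in (0.0, math.pi / 2, math.pi):
    # Two alternatives whose actions differ by `phase`.
    print(phase, density([0.0, phase]), 2 + 2 * math.cos(phase))
```

The two columns agree, with constructive interference at $\varphi = 0$ and complete cancellation at $\varphi = \pi$. The complex form is only a compact way of writing the real double sum of cosines.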

If $B \in \Sigma_X$, applying (5) and (13) gives

$$P_{X,S}(B) = \int_B |f_{X,S}(x)|^2\, \mu_X(dx) \qquad (15)$$

and this is the usual probabilistic formula of traditional quantum mechanics. It follows from (3) that $f_{X,S}$ is a unit vector in the Hilbert space $L^2(R(X), \Sigma_X, \mu_X)$, and this derives the quantum Hilbert space and the vector form for a state. If $\mathcal{A}_X$ is a set of X-actions, then the Hilbert space $H_X \subseteq L^2(R(X), \Sigma_X, \mu_X)$ generated by the set of wave functions $\{f_{X,S} : S \in \mathcal{A}_X\}$ is called an X-Hilbert space. Some X-actions may not be relevant for physical reasons, so we may want $\mathcal{A}_X$ to be a proper subset of the set of all X-actions.

If $g: \Omega \to \mathbb{R}$ is integrable along X-fibers and $S \in \mathcal{A}_X$, we define the $(X,S)$-amplitude average of $g$ at $x$ by

$$f_{X,S}(g)(x) = \int_{X^{-1}(x)} g(\omega)\, f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x)} g(\omega)\, e^{iS(\omega)}\, \mu_X^x(d\omega). \qquad (16)$$

Applying (8) and (14) we obtain

$$E_{X,S}(g)(x) = \mathrm{Re} \left[ \int_{X^{-1}(x)} g(\omega)\, f_S(\omega)\, \mu_X^x(d\omega)\ \overline{\int_{X^{-1}(x)} f_S(\omega')\, \mu_X^x(d\omega')} \right] = \mathrm{Re}\, f_{X,S}(g)(x)\, \overline{f_{X,S}(x)}.$$

It follows from (9) that

$$E_{X,S}(g) = \mathrm{Re}\, \langle f_{X,S}(g), f_{X,S} \rangle. \qquad (17)$$

Define the linear operator $\hat g$ on $H_X$ by $\hat g f_{X,S}(x) = f_{X,S}(g)(x)$ and extend by linearity. If the operator $\hat g$ is self-adjoint on $H_X$, we call $g$ an X-observable and we have

$$E_{X,S}(g) = \langle \hat g f_{X,S}, f_{X,S} \rangle \qquad (18)$$

for all $S \in \mathcal{A}_X$. We then say that $g$ is represented by the self-adjoint operator $\hat g$ on $H_X$. This derives the representation of observables by self-adjoint operators.


For a simple example of a representation, let $g: \Omega \to \mathbb{R}$ be a constant function $g(\omega) = c$. Then (16) gives

$$f_{X,S}(g)(x) = c \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega) = c\, f_{X,S}(x).$$

Hence, $g$ is an X-observable and is represented by the self-adjoint operator $cI$. As another example, letting $g = X$ we have by (16) that

$$f_{X,S}(X)(x) = x\, f_{X,S}(x).$$

It follows that $X$ is represented by the self-adjoint operator $\hat X$ on $H_X$ given by $\hat X u(x) = x\, u(x)$. We conclude that $H_X$ is a Hilbert space in which $\hat X$ is diagonal. More generally, since

$$f_{X,S}(h(X))(x) = h(x)\, f_{X,S}(x) \qquad (19)$$

we see that $h(X)$ is represented by the self-adjoint operator $h(X)^\wedge u(x) = h(x)\, u(x)$. Moreover, the spectral measure $P^{\hat X}$ is given by $P^{\hat X}(B)\, u(x) = \chi_B(x)\, u(x)$, and applying (15) gives

$$P_{X,S}(B) = \langle P^{\hat X}(B) f_{X,S}, f_{X,S} \rangle$$

which is again a standard probabilistic formula. Finally, for $A \subseteq \Omega$, the $(X,S)$-pseudoprobability becomes by (17)

$$P_{X,S}(A) = \mathrm{Re}\, \langle f_{X,S}(\chi_A), f_{X,S} \rangle \qquad (20)$$

where by (16) we have

$$f_{X,S}(\chi_A)(x) = \int_{X^{-1}(x) \cap A} f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x) \cap A} e^{iS(\omega)}\, \mu_X^x(d\omega). \qquad (21)$$

4 Spin

We now illustrate the framework presented in the last two sections by presenting a model for spin 1/2 measurements. Fix a direction corresponding to the $z$ axis and assume that the spin $j_z$ in the $z$ direction is known (either $1/2$ or $-1/2$). Let $\omega \in [0,\pi]$ denote a direction whose angle to the $z$ axis is $\omega$. By symmetry the spin distribution should depend only on $\omega$. Let $\Omega = [0,\pi]$, $\theta \in \Omega$, and let $X\colon \Omega \to \{-1/2, 1/2\}$ be the function

$$X(\omega) = -1/2 \ \text{for } \omega \in [0,\theta] \quad\text{and}\quad X(\omega) = 1/2 \ \text{for } \omega \in (\theta,\pi]$$

We make $X$ into a measurement by defining

$$\mu_X(-1/2) = \mu_X(1/2) = 1$$

and endowing $X^{-1}(-1/2) = [0,\theta]$ and $X^{-1}(1/2) = (\theta,\pi]$ with the usual Borel structure. The function $X$ corresponds to a spin 1/2 measurement in the $\theta$ direction. Letting $\theta$ vary, we obtain an infinite number of spin measurements, each applied in a different direction. Observe that a sample point $\omega \in \Omega$ determines the spin in every direction simultaneously.

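For $j_z = 1/2$ we define the $X$-action $(S, \mu_X^{-1/2}, \mu_X^{1/2})$ given by $S(\omega) = \omega$, where $\mu_X^{-1/2}$, $\mu_X^{1/2}$ are $\mu/2$ and $\mu$ is Lebesgue measure restricted to $X^{-1}(-1/2)$, $X^{-1}(1/2)$, respectively. We then have

$$F_S(\omega,\omega') = \cos(\omega - \omega')$$

(we shall see that $N_S = 1$). The probabilities become

$$\begin{aligned} P_{X,S}(-1/2) &= \tfrac14 \int_0^\theta\!\!\int_0^\theta \cos(\omega-\omega')\,d\omega\,d\omega' \\ &= \tfrac14\Big[\int_0^\theta \cos\omega\,d\omega\Big]^2 + \tfrac14\Big[\int_0^\theta \sin\omega\,d\omega\Big]^2 \\ &= \tfrac14\sin^2\theta + \tfrac14(1-\cos\theta)^2 = \sin^2\tfrac{\theta}{2} \end{aligned} \tag{22}$$

$$\begin{aligned} P_{X,S}(1/2) &= \tfrac14 \int_\theta^\pi\!\!\int_\theta^\pi \cos(\omega-\omega')\,d\omega\,d\omega' \\ &= \tfrac14\Big[\int_\theta^\pi \cos\omega\,d\omega\Big]^2 + \tfrac14\Big[\int_\theta^\pi \sin\omega\,d\omega\Big]^2 \\ &= \tfrac14\sin^2\theta + \tfrac14(1+\cos\theta)^2 = \cos^2\tfrac{\theta}{2} \end{aligned} \tag{23}$$

Since $P_{X,S}(-1/2) + P_{X,S}(1/2) = 1$, we see that $N_S = 1$. Notice that (22) and (23) are the usual probability distribution for spin in the $\theta$ direction when $j_z = 1/2$.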
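The two probabilities can be checked numerically. The following is a minimal sketch, not part of the original paper: it integrates the influence function $\cos(\omega-\omega')$ over each fiber with the measure $\mu/2$ using a midpoint rule. The function names and the grid size `n` are illustrative assumptions.

```python
import numpy as np

def fiber_probability(a, b, n=400):
    """Integrate the influence function cos(w - w') over [a, b] x [a, b]
    with respect to (Lebesgue measure / 2) in each variable (midpoint rule)."""
    h = (b - a) / n
    w = a + (np.arange(n) + 0.5) * h          # midpoint nodes on [a, b]
    influence = np.cos(w[:, None] - w[None, :])
    return influence.sum() * (h / 2.0) ** 2

def spin_probabilities(theta):
    """Return (P(-1/2), P(+1/2)) for the j_z = +1/2 action S(w) = w."""
    return fiber_probability(0.0, theta), fiber_probability(theta, np.pi)

theta = 1.2
p_minus, p_plus = spin_probabilities(theta)
print(p_minus, np.sin(theta / 2) ** 2)   # numerical value next to sin^2(theta/2)
print(p_plus, np.cos(theta / 2) ** 2)    # numerical value next to cos^2(theta/2)
```

Each printed pair should agree up to the quadrature error, and the two probabilities should sum to one, which is the normalization argument used in the text.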

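For $j_z = -1/2$ we define the $X$-action $(S', \nu_X^{-1/2}, \nu_X^{1/2})$ given by $S'(\omega) = \omega$ for $\omega \in (0,\pi)$ and $S'(\omega) = -\pi/2$ for $\omega \in \{0,\pi\}$, and $\nu_X^{-1/2} = \delta_0 + \mu/2$, $\nu_X^{1/2} = \delta_\pi + \mu/2$, where $\delta_0$, $\delta_\pi$ are the Dirac point measures at $0$, $\pi$, respectively. A similar but more tedious calculation gives

$$P_{X,S'}(-1/2) = \cos^2\tfrac{\theta}{2}$$

$$P_{X,S'}(1/2) = \sin^2\tfrac{\theta}{2}$$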

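which is the usual distribution for spin in the $\theta$ direction when $j_z = -1/2$.

We now examine the wave functions and Hilbert space corresponding to this model. The $S$-amplitude function becomes $f_S(\omega) = e^{i\omega}$ and the $(X,S)$-wave function $f_{X,S}$ is given by

$$f_{X,S}(-1/2) = \tfrac12 \int_0^\theta e^{i\omega}\,d\omega = \tfrac{i}{2}\,(1 - e^{i\theta})$$

$$f_{X,S}(1/2) = \tfrac12 \int_\theta^\pi e^{i\omega}\,d\omega = \tfrac{i}{2}\,(1 + e^{i\theta})$$

The $S'$-amplitude function becomes $f_{S'}(\omega) = e^{i\omega}$ for $\omega \in (0,\pi)$ and $f_{S'}(\omega) = -i$ for $\omega \in \{0,\pi\}$, and the $(X,S')$-wave function $f_{X,S'}$ is given by

$$f_{X,S'}(-1/2) = \int_{[0,\theta]} f_{S'}(\omega)\,\nu_X^{-1/2}(d\omega) = -i + \tfrac12\int_0^\theta e^{i\omega}\,d\omega = -\tfrac{i}{2}\,(1 + e^{i\theta})$$

$$f_{X,S'}(1/2) = \int_{(\theta,\pi]} f_{S'}(\omega)\,\nu_X^{1/2}(d\omega) = -i + \tfrac12\int_\theta^\pi e^{i\omega}\,d\omega = -\tfrac{i}{2}\,(1 - e^{i\theta})$$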

The $X$-Hilbert space is clearly $\mathbb{C}^2$, and we can represent $f_{X,S}$ and $f_{X,S'}$ in $\mathbb{C}^2$ (up to an overall phase) by the unit vectors

$$v_S = \tfrac12\,(1 - e^{i\theta},\ 1 + e^{i\theta}), \qquad v_{S'} = \tfrac12\,(1 + e^{i\theta},\ 1 - e^{i\theta})$$

Notice that $v_S \perp v_{S'}$. Also, when $\theta = 0$, $v_S = (0,1)$ and $v_{S'} = (1,0)$, which are the usual eigenvectors for the spin 1/2 operator in the $z$ direction. We can treat this as a measurement and the general $X$ as an observable. It can be shown that the matrix for $X$ in the standard basis $(1,0)$, $(0,1)$ becomes

$$\hat{X} = \frac12 \begin{pmatrix} \cos\theta & i\sin\theta \\ -i\sin\theta & -\cos\theta \end{pmatrix} = \frac12\cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \frac12\sin\theta \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$

which is the usual form for a spin 1/2 matrix in the direction $\theta$.

We can extend this analysis to higher order spins [3]. Moreover, this framework gives a realistic model for the Bohm version of the EPR problem [4]. The reason that Bell's theorem is not contradicted is that Bell's inequalities are derived using classical probability theory, whereas we have employed quantum probability theory.
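As a sanity check on the spin model, the sketch below (an illustration, not taken from the paper) builds the two wave functions from the fiber integrals of $e^{i\omega}$, keeping their phase factors, and the spin matrix in the $\theta$ direction. It then checks normalization, orthogonality, and the eigenvalues $\pm 1/2$. The function names are assumptions made for this example.

```python
import numpy as np

def wave_vectors(theta):
    """Wave functions (f(-1/2), f(+1/2)) for j_z = +1/2 and j_z = -1/2,
    obtained from the fiber integrals of exp(i w) with measure mu/2
    (plus the Dirac contributions -i in the j_z = -1/2 case)."""
    e = np.exp(1j * theta)
    v_s = 0.5j * np.array([1 - e, 1 + e])
    v_s_prime = -0.5j * np.array([1 + e, 1 - e])
    return v_s, v_s_prime

def spin_matrix(theta):
    """Matrix (1/2) [[cos, i sin], [-i sin, -cos]] quoted in the text."""
    c, s = np.cos(theta), np.sin(theta)
    return 0.5 * np.array([[c, 1j * s], [-1j * s, -c]])

theta = 0.9
v_s, v_s_prime = wave_vectors(theta)
print(np.linalg.norm(v_s), np.linalg.norm(v_s_prime))  # both unit vectors
print(abs(np.vdot(v_s, v_s_prime)))                    # orthogonality
print(np.linalg.eigvalsh(spin_matrix(theta)))          # eigenvalues of the spin matrix
```

The squared moduli of the components of `v_s` reproduce the probabilities $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$ of the $j_z = 1/2$ case.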


5 Traditional Quantum Mechanics

We now show that this formalism contains traditional nonrelativistic quantum mechanics. For simplicity we consider a single spinless particle in one dimension, although this work easily generalizes to three dimensions. We take our sample space to be the phase space

$$\Omega = \mathbb{R}^2 = \{(q,p) : q, p \in \mathbb{R}\}$$

The two most important measurements are the position and momentum, given by $Q(q,p) = q$, $P(q,p) = p$, respectively. However, as is frequently done in quantum mechanics, we shall investigate the $q$-representation of the system. In this case $Q$ is considered a measurement and $P\colon \Omega \to \mathbb{R}$ is viewed as a function on $\Omega$ which, as we shall show, is a $Q$-observable.

Each $Q$-fiber $Q^{-1}(q) = \{(q,p) : p \in \mathbb{R}\}$ can be identified with $\mathbb{R}$. We make $Q$ a measurement by endowing its range $R(Q) = \mathbb{R}$ with Lebesgue measure and its fibers with the usual Borel structure of $\mathbb{R}$. Only certain $Q$-actions $(S, \mu_Q^q, q \in \mathbb{R})$ correspond to traditional quantum states, and these can be derived from natural postulates. We assume that $\mu_Q^q$ is absolutely continuous relative to Lebesgue measure on $\mathbb{R}$ and that $\mu_Q^q$ is independent of $q$. This is because sets of Lebesgue measure zero are too small to have any effect on the outcomes of position measurements, and there is no a priori reason to distinguish between $Q$-fibers. It follows from the Radon-Nikodym theorem that there exists a nonnegative Lebesgue measurable function $\xi\colon \mathbb{R} \to \mathbb{R}$ such that

$$\mu_Q^q(dp) = (2\pi\hbar)^{-1/2}\,\xi(p)\,dp \tag{24}$$

We take $S\colon \Omega \to \mathbb{R}$ to have the form

$$S(q,p) = \frac{qp}{\hbar} + \eta(p) \tag{25}$$

This form is natural because $qp$ is the classical action, and adding a function of momentum gives a quantum fluctuation. We could also add a function of $q$, but it is easy to see that this would just multiply the wave function by a constant phase, which would not alter the probabilistic formulas. Denote by $\mathcal{A}_Q$ the set of $Q$-actions that have the form (24), (25).

Applying (12) for $S \in \mathcal{A}_Q$, we find that the $(Q,S)$-wave function becomes

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \xi(p)\, e^{i\eta(p)}\, e^{iqp/\hbar}\,dp$$

Defining

$$\phi(p) = \xi(p)\, e^{i\eta(p)} \tag{26}$$

and denoting the inverse Fourier transform by $\check{\ }$, we have

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \phi(p)\, e^{iqp/\hbar}\,dp = \check{\phi}(q) \tag{27}$$

In order for (3) to be satisfied, $\check{\phi}$ must be a unit vector in $L^2(\mathbb{R}, dq)$, or equivalently $\phi$ must be a unit vector in $L^2(\mathbb{R}, dp)$. However, every vector in $L^2(\mathbb{R}, dp)$ has the form (26) for some functions $\xi\colon \mathbb{R} \to \mathbb{R}^+$, $\eta\colon \mathbb{R} \to \mathbb{R}$. It follows that the $Q$-Hilbert space becomes the traditional Hilbert space $H_Q = L^2(\mathbb{R}, dq)$ and $f_{Q,S}$ is the usual wave function (or state).

Let $(S, \mu_Q^q, q \in \mathbb{R})$ be a fixed $Q$-action in $\mathcal{A}_Q$ of the form (24), (25), and let $\psi(q) = f_{Q,S}(q)$, $\phi(p) = \xi(p)e^{i\eta(p)}$. Applying (16) and (27) we have

$$f_{Q,S}(P)(q) = (2\pi\hbar)^{-1/2} \int p\,\phi(p)\, e^{iqp/\hbar}\,dp = -i\hbar\,\frac{d}{dq}\,(2\pi\hbar)^{-1/2} \int \phi(p)\, e^{iqp/\hbar}\,dp = -i\hbar\,\frac{d\psi}{dq}(q)$$

More generally, if $n$ is a positive integer we obtain

$$f_{Q,S}(P^n)(q) = \Big(-i\hbar\,\frac{d}{dq}\Big)^{n} \psi(q) \tag{28}$$

Moreover, applying (18) we have

$$E_{Q,S}(P^n) = \int \Big[\Big(-i\hbar\,\frac{d}{dq}\Big)^{n} \psi(q)\Big]\, \overline{\psi(q)}\,dq$$

which is the usual quantum expectation formula. We conclude from (28) that $P^n$ is a $Q$-observable and is represented by the operator $(-i\hbar\, d/dq)^n$. Moreover, if $V\colon \mathbb{R} \to \mathbb{R}$ is measurable, we see from (19) that $V(Q)$ is a $Q$-observable and is represented by the operator $\widehat{V(Q)}u(q) = V(q)u(q)$. This, together with our observation concerning $P$, gives a derivation of the Bohr correspondence principle.
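The identification of the amplitude average of $P$ with $-i\hbar\, d\psi/dq$ can be illustrated numerically. The sketch below is an assumption-laden example, not the author's computation: it takes $\hbar = 1$, a Gaussian $\phi(p)$ centred at $p_0 = 1$ with $\sigma^2 = 0.25$, evaluates both sides by simple quadrature on a finite momentum grid, and approximates the derivative by a central difference.

```python
import numpy as np

HBAR = 1.0                                   # units with hbar = 1 (assumption)
p = np.linspace(-7.0, 9.0, 4001)             # momentum grid covering the Gaussian
dp = p[1] - p[0]
# Illustrative phi(p): Gaussian with mean 1.0 and variance 0.25
phi = (np.pi * 0.25) ** -0.25 * np.exp(-(p - 1.0) ** 2 / (2 * 0.25))

def psi(q):
    """Wave function psi(q) = (2 pi hbar)^(-1/2) * integral phi(p) e^{iqp/hbar} dp."""
    return np.sum(phi * np.exp(1j * q * p / HBAR)) * dp / np.sqrt(2 * np.pi * HBAR)

def amplitude_average_of_P(q):
    """Amplitude average of P: same integral with an extra factor p."""
    return np.sum(p * phi * np.exp(1j * q * p / HBAR)) * dp / np.sqrt(2 * np.pi * HBAR)

def minus_i_hbar_derivative(q, h=1e-4):
    """Central-difference approximation of -i hbar dpsi/dq."""
    return -1j * HBAR * (psi(q + h) - psi(q - h)) / (2 * h)

for q in (-1.0, 0.0, 0.7):
    print(q, amplitude_average_of_P(q), minus_i_hbar_derivative(q))
```

The two complex numbers printed for each $q$ should agree to within the finite-difference error.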

We now consider probability distributions. We have already seen in (15) that

$$P_{Q,S}(B) = \int_B |\psi(q)|^2\,dq$$

which is the usual distribution of $Q$. It is more interesting to compute the probability of $A = P^{-1}(B)$ for the momentum function $P$. We have from (21) that

$$f_{Q,S}(\chi_A)(q) = (2\pi\hbar)^{-1/2} \int_B \phi(p)\, e^{iqp/\hbar}\,dp = (\chi_B\phi)^{\vee}(q)$$

Hence by (20) and the Plancherel formula we obtain

$$P_{Q,S}\big[P^{-1}(B)\big] = \int (\chi_B\phi)^{\vee}(q)\,\overline{\check{\phi}(q)}\,dq = \int (\chi_B\phi)(p)\,\overline{\phi(p)}\,dp = \int_B |\phi(p)|^2\,dp$$

Again, this is the usual momentum distribution. This gives an example in which $P_{Q,S}$ is an actual probability measure on a $\sigma$-algebra of subsets of $\Omega$.

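Until now we have treated time as fixed. We now briefly consider dynamics. Let $\psi(q,t)$ be a smooth function. Our previous formulas hold with $\psi(q)$ replaced by $\psi(q,t)$ and $\mu_Q^q$ replaced by $\mu_{Q,t}^q$. We now derive Schrödinger's equation from Hamilton's equation of classical mechanics, $dp/dt = -\partial H/\partial q$. Suppose the energy function has the form

$$H(q,p) = \frac{p^2}{2m} + V(q)$$

We assume that Hamilton's equation holds in the amplitude average. Applying (16) we have

$$\frac{d}{dt}\int p\, f_S(q,p,t)\,\mu_{Q,t}^q(dp) = -\frac{\partial}{\partial q}\int H(q,p)\, f_S(q,p,t)\,\mu_{Q,t}^q(dp)$$

Hence

$$\frac{d}{dt}\int p\,\phi(p,t)\, e^{iqp/\hbar}\,dp = -\frac{\partial}{\partial q}\int H(q,p)\,\phi(p,t)\, e^{iqp/\hbar}\,dp$$

Applying (28) and (19) gives

$$-i\hbar\,\frac{\partial}{\partial t}\frac{\partial\psi}{\partial q} = -\frac{\partial}{\partial q}\Big[-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial q^2} + V(q)\psi\Big]$$

Interchanging the order of differentiation on the left side of this equation and integrating with respect to $q$ gives Schrödinger's equation.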

6 Concluding Remarks

In this paper we have presented a realistic, contextual, nonlocal approach to quantum probability theory. The formalism is realistic because each sample point $\omega \in \Omega$ uniquely determines a value $X(\omega)$ for any measurement $X$. In this way a physical system $\mathcal{S}$ possesses all of its attributes independent of whether they are measured. Although the sample space $\Omega$ exists and we can discuss its properties, $\Omega$ is not physically accessible in general. This is because the sample points may not correspond to physical states which can be prepared in the laboratory or at least exist in nature. We may think of $\Omega$ as a hidden variable completion of quantum mechanics. This approach is contextual because it is necessary to specify a particular basic measurement $X$. Once $X$ is specified, a Hilbert space $H_X$ can be constructed, and $H_X$ provides an $X$-representation for $\mathcal{S}$. Of course, one may choose a different basic measurement $Y$, and then the $Y$-representation will give a different picture of $\mathcal{S}$. For example, in traditional quantum mechanics we usually choose the position representation or the momentum representation to describe $\mathcal{S}$. For a given basic measurement $X$ and an action $S$, we have given a method for constructing the probability distribution $P_{X,S}$ of $X$. We have shown that $P_{X,S}$ may be found in terms of a state vector $f_{X,S} \in H_X$, and these correspond to physically accessible states. In $H_X$ the measurement $X$ and functions of $X$ are diagonal and hence represented by random variables. Other measurements, which we call observables to distinguish them from $X$, are represented by self-adjoint operators on $H_X$, and their usual distributions follow in a natural way. The theory is nonlocal because the distribution $P_{X,S}$ is specified by an influence function $F_S(\omega,\omega')$. This function provides an influence between pairs of sample points which, in a spacetime model, may be spacelike separated.

There is considerable controversy concerning various interpretations and approaches to probability theory. I believe that three types of probabilities are necessary for a description of quantum mechanics. The probabilities and distributions of measurement results in the laboratory are usually computed using long run relative frequencies. Even though a measurement $X$ may involve a microscopic system $\mathcal{S}$ (for example, the position of an electron), $\mathcal{S}$ must interact with a macroscopic apparatus in order to obtain an observable outcome. The theoretician's task is to find the distribution $P_X$ of $X$. This theoretical distribution should agree with the long run relative frequencies found in the laboratory or give predictions that can eventually be tested experimentally. Since there are serious, well-known difficulties in dealing with abstract theories of relative frequencies, it is convenient and perhaps even necessary to use the standard Kolmogorovian probability theory for describing $P_X$. Now $P_X$ is a probability measure that satisfies the axioms of standard probability theory. However, the method for computing $P_X$ is characteristic of quantum mechanics and is not found in any classical theory. Richard Feynman, whose work has motivated the present paper, once said that nobody really understands quantum mechanics. I think that what he meant is that nobody understands why nature has chosen to compute probabilities in this unusual way. As presented here, the probability density for $P_X$ is found by employing an influence function. The advantage of this method is that it is physically motivated and avoids complex numbers. An equivalent method, which is usually employed in quantum mechanics, is to take the absolute value squared of the wave function.

The quantum probability approach that we have presented contains standard probability theory as a special case. Thus we only need two types of probabilities to describe quantum mechanics. Standard probability theory, as developed by Kolmogorov, is a distillation of hundreds of years of experience with empirical and theoretical studies of chance phenomena. The founders of the subject were concerned with games of chance, statistics and the behavior of macroscopic objects. They were not aware of microscopic objects and quantum mechanics and had no reason to design a probability theory for describing such situations. It is therefore not surprising that a new theory, called quantum probability theory, had to be developed to serve these purposes.

References

1. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
2. S. Gudder, Int. J. Theor. Phys. 32, 1747 (1993).
3. S. Gudder, Int. J. Theor. Phys. 32, 824 (1993).
4. S. Gudder, Quantum probability and the EPR argument, Ann. Found. Louis de Broglie 20, 167 (1994).
5. G. Hemion, Int. J. Theor. Phys. 29, 1335 (1990).


INNOVATION APPROACH TO STOCHASTIC PROCESSES AND QUANTUM DYNAMICS

TAKEYUKI HIDA Department of Mathematics

Meijo University TenpakuNagoya 468-8502

and Nagoya University (Professor Emeritus)

The theory of stochastic processes developed extensively in the twentieth century, and a beautiful connection with quantum dynamics was established. It seems to be a good time now to revisit the foundations of stochastic processes and quantum mechanics, with the hope that the attempt will suggest some further directions for these two intimately related disciplines. For this purpose we review some topics in white noise analysis and observe motivations from physics and how they have actually been realized.

1 Introduction

We shall discuss the analysis of random complex systems and its connection with quantum dynamics. In particular, we analyse some stochastic processes X(t) and random fields X(C) by using the innovation, and revisit quantum dynamics in connection with stochastic analysis. Actually, our aim is to study those random complex systems, including quantum fields, by using white noise analysis.

The basic idea of our analysis is that we first discuss stochastic processes by taking a basic and standard system of random variables, and then express the given process as a function of the system that has been provided. The system of such variables, from which we have started, is called idealized elemental random variables (abbr. ierv). The idea of taking such a system is in line with the

Reductionism.

One might think that this thought seems to be similar to the reductionism in physics. Before we come to this point, it is interesting to refer to the lecture given by P. W. Anderson at the University of Tokyo in 1999. His title included "Emergence" together with reductionism, and he gave a good interpretation.

Following the reductionism, the next step is to form a function of the iervs so that the function represents the given random complex system. It is nothing but

Synthesis.

Then naturally follows the analysis of the functions which have been formed in our setup. Thus the goal is the analysis of the function (which may be called a functional) to identify the random complex system in question.

The first step of taking suitable system of iervs has been influenced by the way how to understand the notion of a stochastic process We therefore have a quick review of the definition of a stochastic process starting from the idea of J Bernoulli (Ars Conjectandi 1713) S Bernstein (1933) and P Levy on the definition of a stochastic process (1947) where we are suggested to consider the innovation of a stochastic process It is viewed as a system of iervs which will be specified to be a white noise

The analysis of white noise functionals has many significant characteristics which are fitting for the investigation of quantum mechanical phenomena. Thus we shall be able to show examples to which white noise theory is efficiently applied.

Having had great contribution by many authors the theory developed in our line has become the present state

AMS 2000 Mathematics Subject Classification 60H40 White Noise Theory

2 Review of defining a stochastic process and white noise analysis

There is a traditional, and in fact original, way of defining a stochastic process. Let us refer to Lévy's definition of a stochastic process given in his book [3], Chapt. II: "une fonction aléatoire X(t) du temps t dans lequel le hasard intervient à chaque instant" (a random function X(t) of the time t in which chance intervenes at every instant). The "hasard" is expressed as an infinitesimal random variable Y(t) which is independent of the observed values of X(s), s ≤ t, in the past. The random variable Y(t) is nothing but the innovation of the process X(t).

Formally speaking, the Y(t), which is usually an infinitesimal random variable, contains the information that was gained by the X(t) during the time interval [t, t + dt). To express this idea P. Lévy proposed a formula called an infinitesimal equation for the variation $\delta X(t)$:

$$\delta X(t) = \Phi(X(s), s \le t, Y(t), t, dt)$$

where $\Phi$ is a non-random functional. Although this equation has only a formal significance, it still offers many suggestions.

It would be fine if the given process is expressed as a functional of Y(t) in the following manner:

$$X(t) = \Psi(Y(s), s \le t, t)$$

where $\Psi$ is a sure (non-random) function.

Such a trick may be called the Reduction and Synthesis method. The above expression is causal in the sense that the X(t) is expressed as a function of Y(s), s ≤ t, and never uses Y(s) with s > t.

Note that this method of defining a stochastic process is more important than the function space type distribution.

The collection Y(s) is a system of iervs so that the above expression is a realization of the synthesis We are particularly interested in the case where the system of iervs is taken to be a white noise and thus ready to discuss white noise analysis

So far we have discussed the theory only for a stochastic process It is in fact quite natural to extend the theory for a random field X(C) indexed by an ovaloid say a contour or closed surface A generalization of the infinitesimal equation is

$$\delta X(C) = \Phi(X(C'), C' < C, Y(s), s \in C, C, \delta C)$$

The $\{Y(s), s \in C\}$ is the innovation. Such a generalization can be done because of the use of the innovation.

We note that white noise analysis has many advantages, which are quickly mentioned below.

1) It is an infinite dimensional analysis. Actually, our stochastic analysis can be systematically done by taking a white noise as a system of iervs to express the given random complex systems. Indeed, the analysis is essentially infinite dimensional, as will be seen in what follows.

2) Infinite dimensional harmonic analysis. The white noise measure supported by the space $E^*$ of generalized functions on the parameter space $R^d$ is invariant under the rotations of $E$. Hence a harmonic analysis arising from the group will naturally be discussed. The group contains significant subgroups which describe essentially infinite dimensional characters.

3) Generalizations to random fields X(C) are discussed in a similar manner to X(t), so far as innovation is concerned. Needless to say, X(C) enjoys more profound characteristic properties.

4) Connection with the classical functional analysis. The so-called S-transform applied to white noise functionals provides a bridge connecting white noise functionals and classical functionals of ordinary functions. We can therefore appeal to the classical theory of functionals established in the first half of the twentieth century.

5) Good connection with quantum dynamics as will be seen in the next section

For the differential and integral calculus of white noise functionals using the annihilation $\partial_t$ and creation $\partial_t^*$ operators, the class of generalized functionals, harmonic analysis including Fourier analysis, the Lévy Laplacian $\Delta_L$, complexification and other theories, we refer to the monograph [12] and other literature.

3 Relations to Quantum Dynamics

We now explain briefly some topics in quantum dynamics to which white noise theory can be applied What we are going to present here may seem to be separate topics each other but behind the description always is a white noise

1) Representation of the canonical commutation relations for Boson field This topic is well known

Let $\dot{B}(t)$ be a white noise and let $\partial_t$ denote the $\dot{B}(t)$-derivative. Then it is an annihilation operator, and its dual operator $\partial_t^*$ stands for the creation. They satisfy the commutation relations

$$[\partial_t, \partial_s] = [\partial_t^*, \partial_s^*] = 0$$

$$[\partial_t, \partial_s^*] = \delta(t-s)$$

From these, a representation of the canonical commutation relations is given for Bosonic particles.

It is noted that the following assertion holds

Proposition There are continuously many irreducible representations of the canonical commutation relations

White noises with different variances are inequivalent to each other, which proves the assertion.

2) Reflection positivity (T-positivity)

A stationary multiple Markov (say N-ple Markov) Gaussian process has a spectral density function $f(\lambda)$ of a particular type. Namely

On the other hand, it is proved that

Proposition. The covariance function $\gamma(h)$ of a stationary T-positive Gaussian process is expressed in the form

$$\gamma(h) = \int_0^\infty \exp[-|h|x]\,dv(x)$$

where $v$ is a positive finite measure.

By applying this assertion to the N-ple Markov Gaussian process, we claim that T-positivity requires $c_k > 0$ for every $k$.

Note that in the strictly N-ple Markov case this condition is not satisfied

It is our hope that this result would be generalized to the cases of general stochastic processes of multiple Markov properties

3) A path integral formulation

One of the realizations of Dirac-Feynman's idea of the path integral may be given by the following method using generalized white noise functionals. First we establish a class of possible trajectories when a Lagrangian $L(x, \dot{x})$ is given. Let $x$ be the classical trajectory determined by the Lagrangian. As soon as we come to quantum dynamics, we have to consider fluctuating paths $y$. We propose they are given by

$$y(s) = x(s) + \sqrt{\frac{\hbar}{m}}\, B(s)$$

The average over the paths is replaced with the expectation with respect to the probability measure for which the Brownian motion B(t) is defined. Thus the propagator $G(y_1, y_2, t)$ is given by

$$E\Big\{ N \exp\Big[\frac{i}{\hbar}\int_0^t L(y,\dot{y})\,ds + \frac12\int_0^t \dot{B}(s)^2\,ds\Big] \cdot \delta(y(t) - y_2) \Big\}$$

With this setup actual computations have been done to get exact formulae of the propagators (L Streit et al)

4) Dirichlet forms in infinite dimensions. With the help of positive generalized white noise functionals we prove criteria for closability of energy forms. See [3].

5) Random fields X(C)

A random field X(C) depending on a parameter C, which is taken to be a certain smooth and closed manifold in a Euclidean space, naturally enjoys a more complex probabilistic structure than a stochastic process X(t) depending on the time t. It therefore has good connections with quantum fields in physics.

We are particularly interested in the case where X(C) has a causal representation in terms of white noise. Some typical examples are listed below.

5.1) Markov property and multiple Markov properties. We are led by Dirac's paper [1] to define the Markov property. For the Gaussian case a reasonable definition has been given (see [15]) by using the canonical representation in terms of white noise, where the canonical property of a representation can be introduced as a generalization of that for a Gaussian process. Some attempts have been made for some non-Gaussian fields (see [17]). For the Gaussian case, multiple Markov properties have been defined. It is now an interesting question to find conditions under which a Gaussian random field satisfies a multiple Markov property.

5.2) Stochastic variational equations of Langevin type. Let C run through a class $\mathbf{C}$ of concentric circles. The problem is to solve the following stochastic variational equation of Langevin type:

$$\delta X(C) = -\lambda X(C) \int_C \delta n(s)\,ds + X_0 \int_C v(s)\,\partial_s^*\,\delta n(s)\,ds$$

The explicit solution is given by using the S-transform and the classical theory of functionals.

5.3) We have made an attempt to define a random field $X(C)$, $C \in \mathbf{C}$, which satisfies conformal invariance. Reversibility can also be discussed.

Example. Linear parameter case. A Brownian bridge. For $t \in [0,1]$ it is defined by

$$X(t) = (1-t) \int_0^t \frac{1}{1-u}\,\dot{B}(u)\,du$$

Reversibility can be guaranteed not only by the time reflection but also by whiskers (one-parameter subgroups defined by deformation of the parameter) in the conformal group that leaves the unit time interval invariant.
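The representation of the Brownian bridge as a causal functional of white noise can be illustrated by simulation. The following is a minimal sketch under stated assumptions, not a construction from the paper: the stochastic integral is discretized with independent Gaussian increments on a uniform grid (integrand evaluated at interval midpoints), and the sample variance is compared with the known Brownian bridge variance $t(1-t)$. The function name, grid, seed and sample size are all illustrative choices.

```python
import numpy as np

def simulate_bridge(n_paths=20000, n_steps=100, t_end=0.5, seed=0):
    """Simulate X(t) = (1 - t) * int_0^t dB(u) / (1 - u) on (0, t_end]."""
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    u_mid = (np.arange(n_steps) + 0.5) * dt                      # midpoints of the sub-intervals
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))   # white noise increments
    integral = np.cumsum(dB / (1.0 - u_mid), axis=1)             # running stochastic integral
    t = (np.arange(n_steps) + 1) * dt                            # right endpoints
    return t, (1.0 - t) * integral

t, x = simulate_bridge()
sample_var = x.var(axis=0)
print(t[-1], sample_var[-1], t[-1] * (1 - t[-1]))   # sample variance next to t(1 - t)
```

Because X(t) at each time uses only increments up to t, the sketch also reflects the causal character of the representation.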

We now come to the case of a random field. Let $\mathbf{C}$ be the class of concentric circles. Assume $0 < r_0 < r < r_1$. Denote by $C_r$ the circle with radius $r$. Then we define

(ft) - yfi^^bw w^w^

This is a canonical representation. To show reversibility, we apply the inversion with respect to the circle with radius $\sqrt{r_0 r_1}$.

We claim that it is possible to have a generalization to the case where C is taken to be a class of curves obtained by a conformal mapping of concentric circles

Remark 1. It is noted that the white noise x(t) is regarded as a representation of the parameter t, so that propagation of randomness (fluctuation) is expressed in terms of x(t) instead of the time t itself. Namely, the way random complex phenomena develop, in particular reversibility, has an explicit description in terms of white noise, as is seen in the above example.

Remark 2 See the papers [1] by Dirac and [13] by Polyakov to have suggestions on a generalization of the path integral

4 Addenda to foundations of the theories Concluding remarks

Before the concluding remarks are given, we should like to add some facts as addenda to §1 regarding the foundations of probability theory.

From the brief history mentioned in §1, we understand the reason why a white noise, that is, a system of iervs, is introduced. It is a generalized stochastic process, so that we need some additional consideration when reasonable functionals, in general nonlinear functionals, of white noise are introduced. In physics we have met interesting cases where those nonlinear functionals of white noise are requested: canonical commutation relations for quantum fields, where the degree of freedom is continuously infinite; Feynman's path integrals, as was discussed in 3) of the last section; and variational equations for a random field. On the other hand, we were lucky when a class of generalized white noise functionals was introduced in 1975, since the theory of generalized functions was established and some attempt had been made to apply it to the theory of generalized stochastic processes. To have further fruitful results, we have been given a powerful method to study random fields indexed by a manifold. It is the so-called innovation approach, where our reductionism does not depend on the higher dimensionality of the parameter space. With these in mind we can come to the concluding remarks.

As the concluding remarks some of proposed future directions are now in order

1 One is concerned with good applications of the Levy Laplacian Its signifishycance is that it is an operator that is essentially infinite dimensional

2 A two-dimensional Brownian path is considered to have some optimality in occupying the territory This property should reflect to forming a model of physical phenomena

3 A systematic approach to invariance of random fields under transformation groups will be discussed

4 Stochastic Variational Calculus for random fields

With the classical results on variational calculus we can proceed further white noise analysis

Acknowledgements. The author is grateful to Professor A. Khrennikov, who invited him to give a talk at this conference. Thanks are due to the Academic Frontier Project at Meijo University for the support of this work.

References

1. P. A. M. Dirac, The Lagrangian in quantum mechanics, Phys. Z. Soviet Union 3, 64-72 (1933).
2. S. Tomonaga, On a relativistically invariant formulation of the quantum theory of wave fields, Prog. Theor. Phys. 1, 27-42 (1946).
3. P. Lévy, Processus stochastiques et mouvement brownien (Gauthier-Villars, 1948; 2nd ed. 1965).
4. P. Lévy, Nouvelle notice sur les travaux scientifiques de M. Paul Lévy, Janvier 1964, Part III: Processus stochastiques (unpublished manuscript).
5. T. Hida, Canonical representations of Gaussian processes and their applications, Mem. College of Science, Univ. of Kyoto A 33, 109-155 (1960).
6. T. Hida, Stationary stochastic processes (Princeton Univ. Press, 1970).
7. T. Hida, Brownian motion (Iwanami Pub. Co., 1975; English ed. Springer-Verlag, 1980).
8. T. Hida, Analysis of Brownian functionals, Carleton Math. Lecture Notes 13 (1975).
9. T. Hida, Innovation approach to random complex systems, Pub. Volterra Center 433 (2000).
10. T. Hida and L. Streit, On quantum theory in terms of white noise, Nagoya Math. J. 68, 21-34 (1977).
11. T. Hida, J. Potthoff and L. Streit, Dirichlet forms and white noise analysis, Commun. Math. Phys. 116, 235-245 (1988).
12. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise: an infinite dimensional calculus (Kluwer Academic Pub., 1993).
13. A. M. Polyakov, Quantum geometry of Bosonic strings, Phys. Lett. 103B, 207-210 (1981).
14. J. Schwinger, Brownian motion of a quantum oscillator, J. of Math. Phys. 2, 407-432 (1961).
15. Si Si, Gaussian processes and Gaussian random fields, Quantum Informational (World Scientific Pub. Co., 2000).
16. L. Streit and T. Hida, Generalized Brownian functionals and the Feynman integral, Stoch. Processes Appl. 16, 55-69 (1983).
17. L. Accardi and Si Si, Innovation approach to multiple Markov properties of some non Gaussian random fields, to appear.


STATISTICS AND ERGODICITY OF WAVE FUNCTIONS IN CHAOTIC OPEN SYSTEMS

H ISHIO Department of Physics and Measurement Technology Linkoping University

S-581 83 Linkoping Sweden E-mail hirisifmliuse

and Division of Natural Science Osaka Kyoiku University Kashiwara

Osaka 582-8582 Japan E-mail ishioccosaka-kyoikuacjp

In general quantum chaotic systems are considered to be described in the context of the random matrix theory ie by random Gaussian variables (real or complex) in an appropriate universality class In reality however quantum states inside a chaotic open system are not given by a statistically homogeneous random state We show some numerical evidences of such statistical inhomogeneity for ballistic transport through two-dimensional chaotic open billiards and argue about their relation to the corresponding classical dynamics

1 Introduction

The quantum-mechanical signature of classical chaos is called quantum chaos. The rigorous definition of chaotic systems in quantum theory has been given very recently for Kolmogorov (K-) and Anosov (C-) systems, on the analogy of the corresponding classical natures [1]. In such systems quantum ergodicity is naturally expected: eigenfunctions are equidistributed in their representation space, and all expectation values of quantum observables coincide with mean values of the corresponding classical observables. It was first noted that a sufficient condition for quantum ergodicity to hold is the ergodicity of the corresponding classical dynamics [2]. More recently the statement was proved in the case of quantum billiards [3,4]. Nowadays quantum ergodicity is one of the few results for which there exist mathematical proofs in the field of quantum chaos.

Quantum ergodicity, however, can be reached only in the semiclassical limit ($\hbar \to 0$). In experiments or numerical simulations for chaotic systems, we often see nonuniversal quantum features far from ergodicity, even in a high (but finite) energy region. In the present work we show some numerical evidence of such statistical inhomogeneity for chaotic open systems. In Sec. 2 we introduce a model of ballistic transport through a chaotic open billiard and show some evidence of nonergodicity in the classical dynamics. We briefly discuss in Sec. 3 the general wave-statistical description of chaotic open systems by the random matrix theory (RMT). In Sec. 4 we show numerical results of fully-quantum calculations of the open billiard model and find that the idealistic description by RMT does not apply in some cases, even in a high energy region. There we focus on the relation between the statistical deviations and wave localization corresponding to classical short paths. Section 5 consists of conclusions.

Figure 1. Typical single trajectory in the open stadium billiard.

2 Classical Nonergodicity and Short-Path Dynamics

We consider a two-dimensional (2D) billiard in which the motion of noninteracting particles confined by Dirichlet boundaries is ballistic. The shape of the boundaries directly determines the nonlinearity of the particle dynamics inside the billiard. One of the prototypes of conservative chaotic systems is the Bunimovich stadium billiard. In the case of a closed stadium billiard it is proved that the system has the K-property [5]. In the case of an open stadium billiard coupled to two narrow leads (see Fig. 1), nonintegrability is still expected; e.g., we can observe a fractal structure in the spectrum of dwell times inside the cavity region [6]. However, Monte Carlo simulation of the classical path-length ($\propto$ dwell time) distribution shows that the distribution function is not the simple exponential decay expected as a signature of ergodicity but a highly structured function owing to short-path dynamics [7].

Another example showing nonergodicity of the classical dynamics in the open stadium billiard is the transmission-reflection diagram of particles shown in Fig. 2. There $y$ is the initial transverse position of each particle incoming from lead 1 (see Fig. 1) at the entrance of the stadium cavity, and $d$ denotes the common width of the attached leads. We apply a semiclassical quantization condition to the momentum of the incoming particles in the lead. The angle of incidence is quantized as $\theta_n = \pm\sin^{-1}[n\pi/(kd)]$ ($n = 1, 2, \ldots$), where we choose the positive and negative $\theta_n$ for the upper and lower direction of particle motion in Fig. 1, respectively, and $k$ is the Fermi wave number of the semiclassical particles. Over the whole range of the diagram we fix the quantized mode number as $n = 1$. Because of the semiclassical quantization condition, $\theta_1$ decreases monotonically as a function of $k$. The black and white points correspond to transmission and reflection events, respectively. The relative measure of the black (white) portion for each $k$ is equal to the classical transmission (reflection) probability $T_{cl}(k)$ ($R_{cl}(k)$). In Fig. 2 we see a number of black and white windows in the chaotic sea. Each of them is associated with a family of short paths connecting lead 1 to lead 2 (for the black windows) or lead 1 back to lead 1 (for the white windows). Such paths are stable with respect to the transmission or reflection event and are expected to make an important contribution, as a family, to the corresponding quantum transport.
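As a small illustration of the quantization condition above, the following Python sketch evaluates the positive branch of the incidence angle, $\theta_n = \sin^{-1}[n\pi/(kd)]$, and shows that it decreases as $k$ grows for fixed $n = 1$. The function name `incidence_angle` and the sample values of `k` and `d` are illustrative assumptions and are not taken from the paper.

```python
import math


def incidence_angle(k, d, n=1):
    """Positive branch of theta_n = asin(n*pi / (k*d)), in radians.

    k: wave number, d: lead width, n: transverse mode number.
    Raises ValueError if mode n cannot propagate (n*pi > k*d).
    """
    s = n * math.pi / (k * d)
    if s > 1.0:
        raise ValueError("mode n is closed: n*pi/(k*d) > 1")
    return math.asin(s)


# Hypothetical sample values: lead width d = 1 and three wave numbers.
d = 1.0
ks = [1.2 * math.pi, 2.0 * math.pi, 4.0 * math.pi]
angles = [incidence_angle(k, d) for k in ks]
```

For example, `incidence_angle(2 * math.pi, 1.0)` gives $\pi/6$, and the list `angles` is strictly decreasing, in line with the monotonic decrease of $\theta_1$ with $k$ stated in the text.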

3 Universal Description of Wave Function Statistics

We write the scaled local density as $\rho(\mathbf{r}) = V|\psi(\mathbf{r})|^2$, where $V$ is the volume of the system in which a single-particle wave function $\psi(\mathbf{r})$ is normalized, and $\mathbf{r}$ is the position. It is well known that the probability distribution of the local densities of a chaotic eigenfunction of a closed system is the Porter-Thomas (P-T) distribution [8]

$$P(\rho) = \frac{1}{\sqrt{2\pi\rho}}\exp(-\rho/2)$$ (1)

described by a Gaussian orthogonal ensemble (GOE) of random matrices when time-reversal symmetry (TRS) is present, i.e., $\psi \in \mathbb{R}$. On the other hand, the distribution is an exponential [8,9]

$$P(\rho) = \exp(-\rho)$$ (2)

described by a Gaussian unitary ensemble (GUE) of random matrices when TRS is broken in the closed system, i.e., $\psi \in \mathbb{C}$. The space-averaged spatial correlation of the local densities of a 2D chaotic wave function with wave number $k$ is also given by [9,10,11]

$$P_2(kr) = \langle\rho(\mathbf{r}_1)\rho(\mathbf{r}_2)\rangle = 1 + cJ_0^2(kr)$$ (3)

where $r = |\mathbf{r}_1 - \mathbf{r}_2|$ and $J_0(x)$ is the Bessel function of zeroth order. The parameter $c$ is chosen as $c = 2$ for GOE (TRS) and $c = 1$ for GUE (broken TRS) eigenfunctions.

Investigations of the continuous transition of the wave function statistics between the GOE and GUE symmetries have also been carried out. Introducing a transition parameter $b \in (1, 2]$, we have the probability distribution [12,13,14,15,16]

$$P_b(\rho) = \frac{b}{2\sqrt{b-1}}\exp\left(-\frac{b^2\rho}{4(b-1)}\right) I_0\left(\frac{b(2-b)\rho}{4(b-1)}\right)$$ (4)

where $I_0(x)$ is the modified Bessel function of zeroth order, and the spatial correlation [17]

$$P_{b,2}(kr) = 1 + \left[1 + \left(\frac{2-b}{b}\right)^2\right]J_0^2(kr).$$ (5)

For $b \to 1$ and $b \to 2$ both equations tend to the GOE and GUE cases, respectively.

On the other hand, systematic statistical investigations of scattering wave functions in open chaotic systems have been carried out only quite recently [16].

The essential point is that the space reciprocity of conservative closed systems, in which each plane wave is paired with a counterpart of the same amplitude running in the opposite direction in phase, is lost in open systems. As a result, the wave function statistics in a chaotic open system is expected to be GUE if the system is completely open [16].

4 Numerical Analyses and Discussions

In this section we show some numerical evidence of wave-statistical inhomogeneity for ballistic transport through the 2D open stadium billiard. Assuming steady current flow through the leads, we solve the time-independent Schrödinger equation for a single particle under Dirichlet boundary conditions, based on the plane-wave-expansion method [6], which gives the reflection and transmission amplitudes as well as the local wave functions for each energy. In the calculation of the statistics, a sample space $A$ ($= V$) is taken in the cavity region corresponding to the closed stadium, and more than one million sample points are used to obtain reliable statistics. We show the numerical results for the wave probability density in Fig. 3 and for the probability distribution $P(\rho)$ and spatial correlation $P_2(kr)$ in Fig. 4.


In Fig. 3(a) we find the so-called bouncing-ball mode in the central region of the stadium cavity, where we see a number of vertical nodes associated with marginally stable classical orbits bouncing vertically between the straight edges. Bouncing-ball states are nonstatistical states, since the amplitude of $\psi$ is strongly localized in the middle region of the stadium (where the space reciprocity holds locally) and is very small in the endcaps (where the space reciprocity does not necessarily hold). As a result, neither $P(\rho)$ nor $P_2(kr)$ for such states follows its universal expression (see Fig. 4(a)). In addition to the bouncing-ball mode, we also see another wave localization, strongly coupled to both the initial and the (open) transmission channels, corresponding to the direct transmission path (see the white line depicted in Fig. 3(a)). Along such a localization a plane wave may propagate with nonzero probability current, partially contributing to the anomaly of the wave statistics [16].

In the higher energy region, where the ratio of the system size $\sqrt{A}$ to the wavelength $\lambda$ is $\sqrt{A}/\lambda \approx 25$ (i.e., in the case of Fig. 3(b)), we might expect GUE statistics. However, we see in Fig. 4(b) that both $P(\rho)$ and $P_2(kr)$ closely follow the GOE predictions.

The reason is a localization effect reminiscent of the phenomenon known as scarring [18], an anomalous localization of quantum probability density along unstable periodic orbits in classically chaotic systems. In order to characterize a localization we usually introduce a moment of the eigenfunction local density $|\psi(\mathbf{r})|^2$, defined by $I_q = V^{-1}\int_V |\psi(\mathbf{r})|^{2q}\,d\mathbf{r}$, with $V$ being the system volume [19,20]. The second moment $I_2$ is known as the inverse participation ratio (IPR). Assuming the normalization condition $\langle|\psi|^2\rangle\ (= I_1) = 1$, we have $I_2 = 1$ for completely ergodic (random and uniform) eigenfunctions, while $I_2 = \infty$ for completely localized eigenfunctions such as $|\psi(\mathbf{r})|^2 \sim V\delta(\mathbf{r})$. The localization effect on wave-function density statistics has been examined analytically in relation to $I_q$ for closed systems [21,22,23] and also numerically using a time-dependent approach, i.e., in terms of recurrences of a test Gaussian wave packet, for closed and weakly (imperfectly) open systems [24,25,26]. The latter works showed that the tail of the wave-function intensity distribution in phase space is dominated by scarring, departing from the RMT predictions.

In contrast, the most prominent effect of the localization of wave probability density in open billiards is the local space reciprocity that holds along classical orbits corresponding to a localization not strongly coupled to any (open) transmission channel (see, e.g., the white lines depicted in Fig. 3(b)). Along such orbits there is no net current, owing to the coherent overlap of time-reversed waves, so that both $P(\rho)$ and $P_2(kr)$ are close to the GOE predictions [16]. For quantitative discussion, the value of the GOE-GUE transition parameter $b$ is calculated numerically from the wave function $\psi(\mathbf{r}) = u(\mathbf{r}) + iv(\mathbf{r})$ by the formula [16]

$$b = \frac{2\langle|\psi|^2\rangle}{\langle|\psi|^2\rangle + \sqrt{\langle|\psi|^2\rangle^2 - 4\left(\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2\right)}}$$ (6)

where $\langle\cdots\rangle$ denotes a space average over $A$. The value obtained for Fig. 3(b) is $b = 1.03$, which corresponds to a case very close to the GOE.
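To make the estimate of the transition parameter concrete, here is a minimal Python sketch that evaluates $b = 2\langle|\psi|^2\rangle / [\langle|\psi|^2\rangle + \sqrt{\langle|\psi|^2\rangle^2 - 4(\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2)}]$ from sampled real and imaginary parts $u$, $v$. The synthetic test data are an assumption made only for illustration: independent Gaussian $u$ and $v$ with variances $1/b$ and $(b-1)/b$, which is one simple model that interpolates between a real (GOE-like) and a fully complex (GUE-like) field. The function name `transition_parameter` is hypothetical, and the paper's actual data come from billiard wave functions, not from this model.

```python
import math
import random


def transition_parameter(u, v):
    """Estimate b from samples of the real (u) and imaginary (v) parts of psi."""
    n = len(u)
    mean_uu = sum(x * x for x in u) / n
    mean_vv = sum(y * y for y in v) / n
    mean_uv = sum(x * y for x, y in zip(u, v)) / n
    mean_rho = mean_uu + mean_vv  # <|psi|^2>
    # Discriminant equals (<u^2> - <v^2>)^2 + 4<uv>^2, so it is never negative
    # up to rounding; max() only guards against tiny floating-point errors.
    disc = mean_rho ** 2 - 4.0 * (mean_uu * mean_vv - mean_uv ** 2)
    return 2.0 * mean_rho / (mean_rho + math.sqrt(max(disc, 0.0)))


# Synthetic check (assumed Gaussian model, not the billiard data).
rng = random.Random(0)
b_true = 1.5
n_samples = 200_000
u = [rng.gauss(0.0, math.sqrt(1.0 / b_true)) for _ in range(n_samples)]
v = [rng.gauss(0.0, math.sqrt((b_true - 1.0) / b_true)) for _ in range(n_samples)]
b_est = transition_parameter(u, v)
```

In the two limiting cases the function returns $b = 1$ for a purely real field and $b = 2$ when $u$ and $v$ are uncorrelated with equal variance, matching the GOE and GUE ends of the transition.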

In the case of open systems the IPR may again play an important role as a measure of localization [27]. In the definition $I_2 = V^{-1}\int_V |\psi(\mathbf{r})|^4\,d\mathbf{r}$, $|\psi(\mathbf{r})|^2$ ($= \rho(\mathbf{r})$) is the scattering-wave local density and $V$ is the area ($A$) of the stadium cavity in our case. For chaotic wave functions normalized as $\langle|\psi|^2\rangle = 1$, we obtain from Eq. (4) the IPR $I_b$ for the transition between the GOE and GUE statistics as

$$I_b = \int_0^\infty \rho^2 P_b(\rho)\,d\rho = \frac{64(b-1)^{5/2}}{\pi b^2}\int_0^\pi \frac{d\theta}{[b + (b-2)\cos\theta]^3} = \frac{3b^2 - 4b + 4}{b^2}.$$ (7)

In the GOE and GUE limits, $I_{b=1} = 3$ and $I_{b=2} = 2$, respectively. For Fig. 3(b) the numerically obtained IPR is $I_2 = 2.89$, which is exactly equal to $I_{b=1.03}$. This means that the enhancement of the IPR by the amplitude of the localized wave is not strong in the case of Fig. 3(b), and that the effect of the localization appears mainly in the value of $b$, which also determines the IPR.
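The closed form $I_b = (3b^2 - 4b + 4)/b^2$ can be checked numerically. The Python sketch below evaluates the formula and compares it with a Monte Carlo estimate of $\langle\rho^2\rangle$ under an assumed model: $\rho = u^2 + v^2$ with independent Gaussian $u$, $v$ of variances $1/b$ and $(b-1)/b$. That model, the function names and the sample size are illustrative assumptions; they are not the billiard calculation described in the text.

```python
import math
import random


def ipr_transition(b):
    """Closed-form IPR for the GOE-GUE transition: (3b^2 - 4b + 4) / b^2."""
    return (3.0 * b * b - 4.0 * b + 4.0) / (b * b)


def ipr_monte_carlo(b, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <rho^2> for rho = u^2 + v^2 (assumed Gaussian model)."""
    rng = random.Random(seed)
    sigma_u = math.sqrt(1.0 / b)
    sigma_v = math.sqrt((b - 1.0) / b)
    total = 0.0
    for _ in range(n_samples):
        rho = rng.gauss(0.0, sigma_u) ** 2 + rng.gauss(0.0, sigma_v) ** 2
        total += rho * rho
    return total / n_samples


ipr_goe = ipr_transition(1.0)      # GOE limit
ipr_gue = ipr_transition(2.0)      # GUE limit
ipr_fig3b = ipr_transition(1.03)   # value of b quoted for Fig. 3(b)
ipr_mc = ipr_monte_carlo(1.5)
```

With these definitions `ipr_goe` is 3, `ipr_gue` is 2, and `ipr_fig3b` rounds to 2.89, the value quoted for Fig. 3(b).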

From our investigations, together with more extended studies [16], the complete GUE statistics is conjectured to be obtained only in the high-energy (semiclassical) limit. Until the energy reaches this limit, the localization of wave functions within chaotic open systems strongly affects the wave-statistical properties, leading to deviations from the RMT predictions based on the ergodicity or uniform randomness of wave functions.

Finally, we note that the classical-path families associated with the localizations found in Fig. 3(a) and (b) can be identified with the windows labeled $\alpha$ and $\beta$ in Fig. 2, respectively. (For Fig. 3(b), only the path family for the localization touching the entrance can be identified in Fig. 2.) We notice that the angle of incidence $\theta_1$ for a given $k$ is unrelated to that of the path corresponding to the observed localizations directly connected to the entrance.

5 Conclusions

In conclusion, our numerical analyses show that chaotic-scattering wave functions in open systems exhibit features remarkably different from the idealized GUE predictions. The statistical deviations from the GUE can be understood in terms of wave localization corresponding to classical short-path dynamics.


Acknowledgments

The author is obliged to K.-F. Berggren, A. I. Saichev, and A. F. Sadreev for fruitful collaboration leading to the work in Sec. 4. Support from the Swedish Board for Industrial and Technological Development (NUTEK) under Project No. P12144-1 is also acknowledged. Part of the calculations of the wave function statistics were carried out using resources of the National Supercomputer Center (NSC) at Linköping.

References

1. H. Narnhofer (to be published).
2. A. I. Shnirelman, Usp. Mat. Nauk 29, 181 (1974).
3. P. Gerard and E. Leichtnam, Duke Math. J. 71, 559 (1993).
4. S. Zelditch and M. Zworski, Comm. Math. Phys. 175, 673 (1996).
5. L. A. Bunimovich, Funct. Anal. Appl. 8, 254 (1974).
6. K. Nakamura and H. Ishio, J. Phys. Soc. Jpn. 61, 3939 (1992).
7. H. Ishio and J. Burgdörfer, Phys. Rev. B 51, 2013 (1995).
8. C. Porter and R. Thomas, Phys. Rev. 104, 483 (1956).
9. V. N. Prigodin, Phys. Rev. Lett. 74, 1566 (1995).
10. V. N. Prigodin et al., Phys. Rev. Lett. 72, 546 (1994).
11. M. V. Berry, in Chaos and Quantum Physics, ed. M. J. Giannoni, A. Voros, and J. Zinn-Justin (Elsevier, Amsterdam, 1990), p. 251.
12. K. Zyczkowski and G. Lenz, Z. Phys. B 82, 299 (1991).
13. G. Lenz and K. Zyczkowski, J. Phys. A 25, 5539 (1992).
14. E. Kanzieper and V. Freilikher, Phys. Rev. B 54, 8737 (1996).
15. R. Pnini and B. Shapiro, Phys. Rev. E 54, R1032 (1996).
16. H. Ishio et al. (unpublished).
17. S.-H. Chung et al., Phys. Rev. Lett. 85, 2482 (2000).
18. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
19. F. Wegner, Z. Phys. B 36, 209 (1980).
20. C. Castellani and L. Peliti, J. Phys. A 19, L429 (1986).
21. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 51, 13403 (1995).
22. K. Müller et al., Phys. Rev. Lett. 78, 215 (1997).
23. V. N. Prigodin and B. L. Altshuler, Phys. Rev. Lett. 80, 1944 (1998).
24. L. Kaplan, Nonlinearity 12, R1 (1999).
25. L. Kaplan, Phys. Rev. Lett. 80, 2582 (1998).
26. L. Kaplan and E. J. Heller, Ann. Phys. 264, 171 (1998).
27. H. Ishio and L. Kaplan (private communication).


[Figure 2: horizontal axes $y(-\theta_1)$ and $y(+\theta_1)$]

Figure 2. Transmission-reflection diagram of classical particles as a function of the initial position $y$ at the entrance of the stadium cavity and the Fermi wave number $k$, corresponding to the angle of incidence $\theta_1$ calculated from the semiclassical quantization condition ($n = 1$ over the whole range) in the lead. Black and white points correspond to transmission and reflection events, respectively. Two families of short paths are identified with arrows beside the diagram (see the text).


Figure 3. Contour plot of the wave probability density in the open stadium billiard for (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). The initial wave comes through the left lead into the cavity. The transmission probability is (a) $T_{qm} = 0.55$ and (b) $T_{qm} = 0.36$. The contours show about 97.5% of the largest wave probability density. Thin white lines show some of the short classical orbits corresponding to the localization of the wave probability density. Taken from the work by the authors in Ref. [12] (unpublished).


[Figure 4: panel (b) with reference curves labeled GOE and GUE; inset horizontal axis $kr$]

Figure 4. Probability distribution (steps) and spatial correlation (thick line in the inset) of local densities in the open stadium billiard for (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). The two thin lines show the GOE (i.e., Eq. (1)) and GUE (i.e., Eq. (2)) cases (Eq. (3) for the inset). Taken from the work by the authors in Ref. [12] (unpublished).


ORIGIN OF QUANTUM PROBABILITIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se

We demonstrate that the origin of the quantum probabilistic rule (which differs from the conventional Bayes formula by the presence of a $\cos\theta$-factor) might be explained by perturbation effects of preparation and measurement procedures. The main consequence of our investigation is that interference could be produced by purely corpuscular objects. In particular, the quantum rule for probabilities (with nontrivial $\cos\theta$-factor) could be simulated for macroscopic physical systems via preparation procedures producing statistical deviations of a special form. We discuss preparation and measurement procedures which may produce probabilistic rules that are neither classical nor quantum, in particular hyperbolic quantum theory.

1 Introduction

It is well known that the conventional probabilistic rule, the formula of total probability (which is based on the Bayes formula for conditional probabilities), cannot be applied to quantum experiments; see, for example, [1]-[12] for extended discussions. It seems that the special features of quantum probabilistic behaviour are just consequences of violations of the conventional probabilistic rule.

In this paper we restrict our investigations to the two-dimensional case. Here the formula of total probability has the form ($i = 1, 2$):

$$p(A = a_i) = p(B = b_1)\,p(A = a_i | B = b_1) + p(B = b_2)\,p(A = a_i | B = b_2),$$ (1)

where $A$ and $B$ are physical variables which take, respectively, the values $a_1, a_2$ and $b_1, b_2$. The symbols $p(A = a_i | B = b_j)$ denote conditional probabilities. This is one of the most important rules used in applied probability theory. In fact, it is a prediction rule: if we know the probabilities for $B$ and the conditional probabilities, then we can find the probabilities for $A$. However, this rule cannot be used for the prediction of probabilities observed in experiments with elementary particles. The violation of the conventional probabilistic rule, and the necessity of using a new prediction rule, was found in interference experiments with elementary particles. This astonishing fact was one of the main reasons to build the quantum formalism on the basis of wave-particle duality.


Let $\phi$ be a quantum state. Let $\{|b_i\rangle\}_{i=1}^{2}$ be the basis consisting of eigenvectors of the operator $\hat{B}$ corresponding to the physical observable $B$. The quantum probabilistic rule has the form ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\cos\theta,$$ (2)

where $p_i = p_\phi(A = a_i)$, $q_j = p_\phi(B = b_j)$, $p_{ij} = p_{b_i}(A = a_j)$, $i, j = 1, 2$. Here the probabilities have indices corresponding to quantum states.

By denoting $P = p_i$ and $P_1 = q_1 p_{1i}$, $P_2 = q_2 p_{2i}$ we get the standard quantum probabilistic rule for interference of alternatives:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\cos\theta.$$

There is a large diversity of opinions on the origin of the violations of the conventional probabilistic rule (1) in quantum mechanics; see [1]-[12]. The common opinion is that violations of (1) are induced by special properties of quantum systems (for example, Dirac, Feynman, Schrödinger). Thus the quantum probabilistic rule must be considered as a peculiarity of nature.

An interesting investigation of this problem is contained in the paper of J. Shummhammer [12]. In contrast to Dirac, Feynman, and Schrödinger, he claimed that the quantum probabilistic rule (2) is not a peculiarity of nature, but just a consequence of one special method of the probabilistic description of nature, the so-called method of maximum predictive power.

In this paper we provide a probabilistic analysis of the quantum rule (2). In our analysis probability has the meaning of frequency probability, namely the limit of frequencies in a long sequence of trials (or for a large statistical ensemble). Hence, in fact, we follow R. von Mises' approach to probability [13]. It seems that it would be impossible to find the roots of the quantum rule (2) in the measure-theoretical framework of A. N. Kolmogorov, 1933 [14]. In the measure-theoretical framework probabilities are defined as sets of real numbers having some special mathematical properties. The conventional rule (1) is merely a consequence of the definition of conditional probabilities. In the Kolmogorov framework, to analyse the transition from (1) to (2) is to analyse the transition from one definition to another. In the frequency framework we can analyse the behaviour of trials which induce one or another property of probability. Our analysis shows that the quantum probabilistic rule (2) can in principle be a consequence of perturbation effects of preparation and measurement procedures. Thus trigonometric fluctuations of quantum probabilities can be explained without using wave arguments.

In fact, our investigation is strongly based on Dirac's famous analysis of the foundations of quantum mechanics; see [1]. In particular, P. Dirac pointed out that one of the main differences between the classical and quantum theories is that in the quantum case perturbation effects of preparation and measurement procedures play the crucial role. However, P. Dirac could not explain the origin of interference for quantum particles in a purely corpuscular model. He had to appeal to wave arguments: "If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other" [1].

In this paper we discuss perturbation effects of preparation and measurement procedures. We remark that we do not follow W. Heisenberg [15]: we do not study perturbation effects for individual measurements. We discuss statistical (ensemble) deviations induced by perturbations (footnote a).

We underline again that our probabilistic analysis was possible only due to the rejection of Kolmogorov's measure-theoretical model of probability theory. Of course, each particular experiment (measurement) can be described by Kolmogorov's model; there are no quantum probabilities. Moreover, it seems that there is nothing more than the binomial probability distribution (see the paper of J. Shummhammer in the present volume). The most important feature of QUANTUM STATISTICS is not related to a single experiment. We have to consider at least three different experiments (preparation procedures) to observe quantum probabilistic behaviour, namely interference of alternatives. Kolmogorov's model is not adequate to such a situation. In this model all random variables are defined on the same probability space. It is impossible to do this in the case of a few experiments that produce interference of alternatives (at least, the author does not see any way to do this). In our analysis probability is classical relative frequency, but it is not Kolmogorov probability (compare with Accardi [3]).

An unexpected consequence of our analysis is that the quantum probability rule (2) is just one of the possible perturbations (by ensemble fluctuations) of the conventional probability rule (1). In principle, there might exist experiments which would produce perturbations of the conventional probabilistic rule (1) which differ from the quantum probabilistic rule (2).

Moreover, if we use the same normalization of the interference term, namely $2\sqrt{P_1 P_2}$, then we can classify all possible probabilistic rules that we have in nature:

1) trigonometric;
2) hyperbolic;
3) hyper-trigonometric.

The hyperbolic probabilistic transformation has a linear space representation that is similar to the standard quantum formalism in the complex Hilbert space. Instead of complex numbers we use so-called hyperbolic numbers; see, for example, [18], p. 21. The development of hyperbolic quantum mechanics can be interesting for comparative analysis with standard quantum mechanics. In particular, we clarify the role of complex numbers in quantum theory. Complex (as well as hyperbolic) numbers were used to linearize a nonlinear probabilistic rule (which in general could not be linearized over real numbers). Another interesting feature of hyperbolic quantum mechanics is the violation of the principle of superposition. Here we have only some restricted variant of this principle.

Footnote a: Such an approach implies the statistical viewpoint on the Heisenberg uncertainty relation, the statistical dispersion principle; see L. Ballentine [16], [17] for the details.

2 Quantum formalism and perturbation effects

1. Frequency probability theory. The frequency definition of probability is more or less standard in quantum theory, especially in the approach based on preparation and measurement procedures [5], [10], [16], [11].

Let us consider a sequence of physical systems $\pi = (\pi_1, \pi_2, \ldots, \pi_N, \ldots)$. Suppose that the elements of $\pi$ have some property, for example position or spin, and this property can be described by natural numbers $L = \{1, 2, \ldots, m\}$, the set of labels. Thus for each $\pi_j \in \pi$ we have a number $x_j \in L$. So $\pi$ induces a sequence

$$x = (x_1, x_2, \ldots, x_N, \ldots), \quad x_j \in L.$$ (3)

For each fixed $\alpha \in L$ we have the relative frequency $\nu_N(\alpha) = n_N(\alpha)/N$ of the appearance of $\alpha$ in $(x_1, x_2, \ldots, x_N)$. Here $n_N(\alpha)$ is the number of elements in $(x_1, x_2, \ldots, x_N)$ with $x_j = \alpha$. R. von Mises [13] said that $x$ satisfies the principle of the statistical stabilization of relative frequencies if for each fixed $\alpha \in L$ there exists the limit

$$p(\alpha) = \lim_{N\to\infty}\nu_N(\alpha).$$ (4)

This limit is said to be the probability of $\alpha$. Thus probability is defined as the limit of relative frequencies. In fact, this definition of probability is used in all experimental investigations. In Kolmogorov's approach [14] probability is defined as a measure. The principle of statistical stabilization is obtained as a mathematical theorem, the law of large numbers.
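The stabilization of relative frequencies can be illustrated with a short Python sketch. It builds a label sequence $x$ and computes $\nu_N(\alpha) = n_N(\alpha)/N$ for growing $N$. The pseudo-random source, the label set `[1, 2, 3]`, the weights and the function name `relative_frequency` are assumptions chosen only for illustration; a pseudo-random generator is of course a Kolmogorov-style stand-in for a physical sequence of trials.

```python
import random


def relative_frequency(x, alpha):
    """nu_N(alpha) = n_N(alpha) / N for the finite sequence x of length N."""
    return sum(1 for xj in x if xj == alpha) / len(x)


# Hypothetical source: labels L = {1, 2, 3} drawn with fixed weights.
rng = random.Random(0)
labels = [1, 2, 3]
weights = [0.5, 0.3, 0.2]
x = rng.choices(labels, weights=weights, k=100_000)

# Relative frequency of the label 1 in the first N elements, for growing N.
freqs = {N: relative_frequency(x[:N], 1) for N in (100, 10_000, 100_000)}
```

As $N$ grows, the entries of `freqs` settle near the weight assigned to the label 1, which is the behaviour that the principle of statistical stabilization postulates for physical sequences.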

2. Preparation and measurement procedures and quantum formalism. We consider a statistical ensemble $S$ of quantum particles described by a quantum state $\phi$. This ensemble is produced by some preparation procedure $\mathcal{E}$; see, for example, [4], [5], [16], [10], [11] for details; see also P. Dirac [1]: "In practice the conditions could be imposed by a suitable preparation of the system, consisting perhaps in passing it through various kinds of sorting apparatus, such as slits and polarimeters, the system being left undisturbed after the preparation."

There are two discrete physical observables $B = b_1, b_2$ and $A = a_1, a_2$.

The total number of particles in $S$ is equal to $N$. Suppose that there are $n_i^b$, $i = 1, 2$, particles in $S$ with $B = b_i$ and $n_i^a$, $i = 1, 2$, particles in $S$ with $A = a_i$.

Suppose that among those particles with $B = b_i$ there are $n_{ij}$, $i, j = 1, 2$, particles with $A = a_j$ (see (R) below to specify the meaning of "with"). So

$$n_i^b = n_{i1} + n_{i2}, \quad n_j^a = n_{1j} + n_{2j}, \quad i, j = 1, 2.$$

(R) We follow Einstein and use the objective realist model, in which both $B$ and $A$ are objective properties of a quantum particle; see [5], [4], [10] for the details. In particular, here each elementary particle has simultaneously defined position and momentum. In such a model we can consider in the ensemble $S$ the sub-ensembles $S_j(B)$ and $S_j(A)$, $j = 1, 2$, of particles having the properties $B = b_j$ and $A = a_j$, respectively. Set

$$S_{ij}(A, B) = S_i(B) \cap S_j(A).$$

Then $n_{ij}$ is the number of elements in the ensemble $S_{ij}(A, B)$. We remark that the existence of the objective property ($B = b_i$ and $A = a_j$) need not imply the possibility of measuring this property. For example, such a measurement is impossible in the case of incompatible observables. In general, the property ($B = b_i$ and $A = a_j$) is a kind of hidden objective property (footnote b).

Physical experience says that the following frequency probabilities are well defined for all observables $B$, $A$:

$$q_i = p_\phi(B = b_i) = \lim_{N\to\infty} q_i^{(N)}, \quad q_i^{(N)} = \frac{n_i^b}{N};$$ (5)

$$p_i = p_\phi(A = a_i) = \lim_{N\to\infty} p_i^{(N)}, \quad p_i^{(N)} = \frac{n_i^a}{N}.$$ (6)

Let the quantum states $|b_i\rangle$ be eigenstates of the operator $\hat{B}$. Let us consider statistical ensembles $T_i$, $i = 1, 2$, of quantum particles described by the quantum states $|b_i\rangle$. These ensembles are produced by some preparation procedures $\mathcal{E}_i$. For instance, we can suppose that particles produced by the preparation procedure $\mathcal{E}$ (for the quantum state $\phi$) pass through additional filters $F_i$, $i = 1, 2$. In the quantum formalism we have

$$\phi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle.$$ (7)

Footnote b: Attempts to use objective realism in quantum theory were strongly criticized, especially in connection with the EPR-Bell considerations. Moreover, many authors (for example, P. Dirac [1] and R. Feynman [2]) claimed that the contradiction between objective realism and quantum theory can be observed just by comparing the conventional and quantum probabilistic rules (see d'Espagnat [4] for an extended discussion). However, in this paper we demonstrate that there is no direct contradiction between objective realism and the quantum probabilistic rule.


In the objective realist model (R) this representation may induce the illusion that the ensembles $T_i$, $i = 1, 2$, for the states $|b_i\rangle$ must be identified with the sub-ensembles $S_i(B)$ of the ensemble $S$ for the state $\phi$. However, there are no physical reasons for such an identification.

The additional filter $F_i$ ($i = 1, 2$) changes the $A$-property of quantum particles. In general, the probability distribution of the property $A$ for the ensemble $S_i(B) = \{\pi \in S : B(\pi) = b_i\}$ differs from the corresponding probability distribution for the ensemble $T_i$.

Suppose that there are $m_{ij}$ particles in the ensemble $T_i$ with $A = a_j$ ($j = 1, 2$) (footnote c).

The following frequency probabilities are well defined: $p_{ij} = p_{b_i}(A = a_j) = \lim_{N\to\infty} p_{ij}^{(N)}$, where the relative frequency $p_{ij}^{(N)} = m_{ij}/n_i^b$ (by measuring values of the variable $A$ for the statistical ensemble $T_i$ we always observe the stabilization of the relative frequencies $p_{ij}^{(N)}$ to some constant probability $p_{ij}$).

Here it is assumed that the ensemble $T_i$ consists of $n_i^b$ particles, $i = 1, 2$. This assumption is natural if we consider the preparation procedure $\mathcal{E}_i = F_i$, a filter with respect to the value $B = b_i$. Only particles with $B = b_i$ pass this filter. Hence the number of elements in the ensemble $T_i$ (represented by the state $|b_i\rangle$) coincides with the number of elements with $B = b_i$ in the ensemble $S$ (represented by the state $\phi$).

It is also assumed that $n_i^b = n_i^b(N) \to \infty$ as $N \to \infty$. In fact, the latter assumption holds true if both probabilities $q_i$, $i = 1, 2$, are nonzero.

We remark that the probabilities $p_{ij} = p_{b_i}(A = a_j)$ cannot (in general) be identified with the conditional probabilities $p_\phi(A = a_j | B = b_i)$. As we have remarked, these probabilities are related to statistical ensembles prepared by different preparation procedures, namely by $\mathcal{E}_i$, $i = 1, 2$, and $\mathcal{E}$. The probabilities $p_{b_i}(A = a_j)$ can be found by measuring the $A$-variable for particles belonging to the ensemble $T_i$. The probabilities $p_\phi(A = a_j | B = b_i)$ in general could not be found; these are hidden probabilities with respect to the ensemble $S$.

3. Derivation of quantum probabilistic rule. Here we present the standard Hilbert space calculations.

Footnote c: We can use the objective realist model (R). Then $m_{ij}$ is just the number of particles in the ensemble $T_i$ having the objective property $A = a_j$. We can also use the contextualist model (C). Then $m_{ij}$ is the number of particles in the ensemble $T_i$ which, in the process of an interaction with a measurement device for the physical observable $A$, would give the result $A = a_j$.


$$\phi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle.$$

Let $\{|a_i\rangle\}_{i=1}^{2}$ be the orthonormal basis consisting of eigenvectors of the operator $\hat{A}$. We can restrict our considerations to the case

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \quad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle + e^{i\gamma_2}\sqrt{p_{22}}\,|a_2\rangle.$$ (8)

We note that $p_{11} + p_{12} = 1$, $p_{21} + p_{22} = 1$. The first sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_1$; the second sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_2$.

As $\langle b_1|b_2\rangle = 0$, we obtain $\sqrt{p_{11}p_{21}} + e^{i(\gamma_1 - \gamma_2)}\sqrt{p_{12}p_{22}} = 0$.

We suppose that all probabilities $p_{ij} > 0$. This is equivalent to saying that $A$ and $B$ are incompatible observables, or that the operators $\hat{A}$ and $\hat{B}$ do not commute.

Hence $\sin(\gamma_1 - \gamma_2) = 0$ and $\gamma_2 = \gamma_1 + \pi k$. We also have $\sqrt{p_{11}p_{21}} + \cos(\gamma_1 - \gamma_2)\sqrt{p_{12}p_{22}} = 0$. This implies that $k = 2l + 1$ and $\sqrt{p_{11}p_{21}} = \sqrt{p_{12}p_{22}}$. As $p_{12} = 1 - p_{11}$ and $p_{21} = 1 - p_{22}$, we obtain that

$$p_{11} = p_{22}, \quad p_{12} = p_{21}.$$ (9)

These equalities are equivalent to the condition $p_{11} + p_{21} = 1$, $p_{12} + p_{22} = 1$. Hence the matrix of probabilities $(p_{ij})$ is a doubly stochastic matrix; see, for example, [5] for general considerations.

Thus, in fact,

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \quad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle - e^{i\gamma_1}\sqrt{p_{22}}\,|a_2\rangle.$$ (10)

So $\phi = d_1|a_1\rangle + d_2|a_2\rangle$, where $d_1 = \sqrt{q_1 p_{11}} + e^{i\theta}\sqrt{q_2 p_{21}}$ and $d_2 = e^{i\gamma_1}\sqrt{q_1 p_{12}} - e^{i(\gamma_1+\theta)}\sqrt{q_2 p_{22}}$. Thus

$$p_1 = p_\phi(A = a_1) = |d_1|^2 = q_1 p_{11} + q_2 p_{21} + 2\sqrt{q_1 p_{11} q_2 p_{21}}\cos\theta,$$ (11)

$$p_2 = p_\phi(A = a_2) = |d_2|^2 = q_1 p_{12} + q_2 p_{22} - 2\sqrt{q_1 p_{12} q_2 p_{22}}\cos\theta.$$ (12)
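The Hilbert space calculation above can be checked numerically. The following Python sketch builds the amplitudes $d_1$, $d_2$ from chosen values of $q_1$, $p_{11}$, $\theta$ and $\gamma_1$, using the double stochasticity $p_{22} = p_{11}$, $p_{21} = p_{12}$, and compares $|d_1|^2$, $|d_2|^2$ with the right-hand sides of the rules for $p_1$ and $p_2$. The function name `quantum_rule` and the numerical inputs are hypothetical and serve only as an illustration.

```python
import cmath
import math


def quantum_rule(q1, p11, theta, gamma1=0.0):
    """Return ((|d1|^2, |d2|^2), (rule for p1, rule for p2)).

    Uses q2 = 1 - q1, p12 = 1 - p11 and the doubly stochastic
    condition p21 = p12, p22 = p11.
    """
    q2 = 1.0 - q1
    p12 = 1.0 - p11
    p21, p22 = p12, p11
    # Amplitudes of phi in the basis |a1>, |a2>.
    d1 = math.sqrt(q1 * p11) + cmath.exp(1j * theta) * math.sqrt(q2 * p21)
    d2 = (cmath.exp(1j * gamma1) * math.sqrt(q1 * p12)
          - cmath.exp(1j * (gamma1 + theta)) * math.sqrt(q2 * p22))
    born = (abs(d1) ** 2, abs(d2) ** 2)
    # Classical (Bayes) part plus or minus the interference term.
    rule = (
        q1 * p11 + q2 * p21 + 2.0 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta),
        q1 * p12 + q2 * p22 - 2.0 * math.sqrt(q1 * p12 * q2 * p22) * math.cos(theta),
    )
    return born, rule


# Hypothetical inputs for illustration.
born, rule = quantum_rule(q1=0.3, p11=0.6, theta=1.1, gamma1=0.4)
```

The two tuples agree to rounding error, and the two probabilities sum to one because the interference terms cancel when $p_{11}p_{21} = p_{12}p_{22}$. Setting $\theta = \pi/2$ removes the interference term and leaves only the conventional part $q_1 p_{1i} + q_2 p_{2i}$.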


3. Probability transformations connecting preparation procedures. Let us forget for the moment about quantum theory. Let $B$ ($= b_1, b_2$) and $A$ ($= a_1, a_2$) be physical variables. We consider an arbitrary preparation procedure $\mathcal{E}$ for microsystems or macrosystems. Suppose that $\mathcal{E}$ produces an ensemble $S$ of physical systems. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be preparation procedures which are based on filters $F_1$ and $F_2$ corresponding, respectively, to the values $b_1$ and $b_2$ of $B$. Denote the statistical ensembles produced by these preparation procedures by the symbols $T_1$ and $T_2$, respectively. The symbols $n_i^b$, $n_i^a$, $n_{ij}$, $m_{ij}$ have the same meaning as in the previous considerations. The probabilities $q_i$, $p_{ij}$, $p_i$ are defined in the same way as in the previous considerations. The only difference is that instead of indices corresponding to quantum states we use indices corresponding to statistical ensembles:

$$q_i = p_S(B = b_i), \quad p_i = p_S(A = a_i), \quad p_{ij} = p_{T_i}(A = a_j).$$

We shall restrict our considerations to the case of strictly positive probabilities.

The following simple frequency considerations are basic in our investigation. We would like to represent the frequency $p_i^{(N)}$ (for $A = a_i$ in the ensemble $S$) as the sum of the conventional (Bayes) part $q_1^{(N)}p_{1i}^{(N)} + q_2^{(N)}p_{2i}^{(N)}$ and some perturbation term. Such a perturbation term appears because the frequencies $q_j^{(N)}$ and $p_{ji}^{(N)}$ are calculated with respect to different ensembles. The magnitude of this perturbation term will play the crucial role in our further analysis. We have

$$p_i^{(N)} = \frac{n_i^a}{N} = \frac{n_{1i}}{N} + \frac{n_{2i}}{N} = \frac{m_{1i}}{N} + \frac{m_{2i}}{N} + \frac{n_{1i} - m_{1i}}{N} + \frac{n_{2i} - m_{2i}}{N}.$$

But for $i = 1, 2$ we have

$$\frac{m_{1i}}{N} = \frac{m_{1i}}{n_1^b}\cdot\frac{n_1^b}{N} = p_{1i}^{(N)}q_1^{(N)}, \quad \frac{m_{2i}}{N} = \frac{m_{2i}}{n_2^b}\cdot\frac{n_2^b}{N} = p_{2i}^{(N)}q_2^{(N)}.$$

Hence

$$p_i^{(N)} = q_1^{(N)}p_{1i}^{(N)} + q_2^{(N)}p_{2i}^{(N)} + \delta_i^{(N)}, \tag{13}$$

where

$$\delta_i^{(N)} = \frac{1}{N}\left[(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})\right], \quad i = 1, 2.$$
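The decomposition (13) is an exact identity between counts, which can be seen in a small sketch with rational arithmetic. The reading of the counts is an assumption made for illustration: `n[j][i]` is taken as the number of systems in $S$ with $B = b_j$ and $A = a_i$, `m[j][i]` as the number of systems in $T_j$ with $A = a_i$, and `n_b[j]` as the size of $T_j$.

```python
from fractions import Fraction

def decompose(N, n_b, n, m, i):
    """Return (p_i, Bayes part, delta_i) for outcome index i (0 or 1)."""
    p_i = Fraction(n[0][i] + n[1][i], N)           # frequency of A = a_i in S
    q = [Fraction(n_b[j], N) for j in range(2)]    # frequencies of B = b_j in S
    p_cond = [Fraction(m[j][i], n_b[j]) for j in range(2)]  # frequencies in T_j
    bayes = q[0] * p_cond[0] + q[1] * p_cond[1]
    delta = Fraction((n[0][i] - m[0][i]) + (n[1][i] - m[1][i]), N)
    return p_i, bayes, delta

# Hypothetical counts: the two filters disturb A in opposite directions.
N = 1000
n_b = (400, 600)
n = [[150, 250], [300, 300]]
m_compensating = [[100, 300], [350, 250]]
m_uncompensated = [[100, 300], [300, 300]]
```

With `m_compensating` the individual deviations are nonzero but cancel, so the Bayes rule holds; with `m_uncompensated` a nonzero $\delta_i^{(N)}$ remains.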


In fact this rest term depends on the statistical ensembles $S, T_1, T_2$: $\delta_i^{(N)} = \delta_i^{(N)}(S, T_1, T_2)$.

4 Behaviour of fluctuations

First we remark that $\lim_{N\to\infty}\delta_i^{(N)}$ exists for all physical measurements. We always observe that

$$p_i^{(N)} \to p_i, \quad q_i^{(N)} \to q_i, \quad p_{ij}^{(N)} \to p_{ij}, \quad N \to \infty.$$

Thus there exist limits $\delta_i = \lim_{N\to\infty}\delta_i^{(N)} = p_i - q_1p_{1i} - q_2p_{2i}$. This coefficient $\delta_i$ is the statistical deviation produced by the perturbation effect of the preparation procedures $\mathcal{E}_j$ (the quantities $\delta_i^{(N)}$ are experimental statistical deviations).

Suppose that the preparation procedures $\mathcal{E}_i$, $i = 1, 2$ (typically filters $F_i$) produce negligibly small (with respect to the size $N$ of the statistical ensemble) changes in properties of particles. Then

$$\delta_i^{(N)} \to 0, \quad N \to \infty. \tag{14}$$

This asymptotic implies the conventional probabilistic rule (1). In particular, this rule can be used in all experiments of classical physics. Hence preparation and measurement procedures of classical physics produce experimental statistical deviations with asymptotic (14). We also have such behaviour in the case of compatible observables in quantum physics.

Moreover, we can obtain the same conventional probabilistic rule for incompatible observables $B$ and $A$ if the phase factor $\theta = \frac{\pi}{2} + \pi k$. Therefore the conventional probabilistic rule (1) is not directly related to commutativity of the corresponding operators in quantum theory. It is a consequence of asymptotic (14).

Despite the same asymptotic (14), there is a crucial difference between classical observations (and compatible observations) and decoherence $\theta = \frac{\pi}{2} + \pi k$ for incompatible observations. In the first case $\delta_i^{(N)} \approx 0$, $N \to \infty$, because both

$$\delta_{i1}^{(N)} = \frac{1}{N}(n_{1i} - m_{1i}) \approx 0, \quad \delta_{i2}^{(N)} = \frac{1}{N}(n_{2i} - m_{2i}) \approx 0, \quad N \to \infty.$$

In an ideal classical experiment we have $n_{1i} = m_{1i}$ and $n_{2i} = m_{2i}$. Here the preparation procedures $\mathcal{E}_j$ (filters with respect to the values $b_j$ of the variable $B$) do not change values of the $A$-variable at all.

In the case of decoherence of incompatible observables the statistical deviations $\delta_{i1}^{(N)}$ and $\delta_{i2}^{(N)}$ are not negligibly small. So perturbations can be sufficiently strong. However, we still observe (14) as a consequence of the compensation effect of perturbations:


$$\delta_{i1}^{(N)} \approx -\delta_{i2}^{(N)}.$$

Suppose now that the filters $F_i$, $i = 1, 2$, produce changes in properties of particles that are not negligibly small (from the statistical viewpoint). Then the statistical deviations

$$\lim_{N\to\infty}\delta_i^{(N)} = \delta_i \neq 0. \tag{15}$$

Here we obtain probabilistic rules which differ from the conventional one (1). In particular, this implies that behaviour (15) cannot be produced in experiments of classical physics (or for compatible observables in quantum physics).

A rather special class of statistical deviations (15) is produced in experiments of quantum physics. However, behaviour of form (15) is not a specific feature of quantum measurements (see further considerations).

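To study carefully the behaviour of the fluctuations $\delta_i^{(N)}$ we represent them as

$$\delta_i^{(N)} = 2\sqrt{q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)}}\;\lambda_i^{(N)},$$

where

$$\lambda_i^{(N)} = \frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i}m_{2i}}}.$$

These are normalized (experimental) statistical deviations. We have used the fact that

$$q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)} = \frac{n_1^b}{N}\cdot\frac{m_{1i}}{n_1^b}\cdot\frac{n_2^b}{N}\cdot\frac{m_{2i}}{n_2^b} = \frac{m_{1i}m_{2i}}{N^2}.$$

In the limit $N \to \infty$ we get

$$\delta_i = 2\sqrt{q_1p_{1i}q_2p_{2i}}\;\lambda_i,$$

where the coefficients $\lambda_i = \lim_{N\to\infty}\lambda_i^{(N)}$, $i = 1, 2$. Thus we found the general probabilistic transformation (for three preparation procedures) that can be obtained as a perturbation of the conventional probabilistic rule ($i = 1, 2$):

$$p_i = q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\;\lambda_i. \tag{16}$$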

Of course, we are free in the choice of a normalization constant in the perturbation term. We use $2\sqrt{q_1q_2p_{1i}p_{2i}}$ by analogy with the quantum formalism. In fact, such a normalization was found in the quantum formalism to get the representation of probabilities with the aid of complex numbers. Complex numbers were introduced in the quantum formalism to linearize the nonlinear


probabilistic transformation $q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\cos\theta$. To do this we use the formula ($c, d > 0$)

$$c + d + 2\sqrt{cd}\cos\theta = \left|\sqrt{c} + \sqrt{d}\,e^{i\theta}\right|^2. \tag{17}$$

The square root $\sqrt{c} + \sqrt{d}\,e^{i\theta}$ gives the possibility to use linear transformations. Thus we do not see anything mystical in the appearance of complex numbers in quantum theory. This is a consequence of the impossibility of real linearization of the nonlinear probabilistic transformation.
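Formula (17) is easy to confirm numerically. The sketch below simply evaluates both sides; the function names are chosen here for illustration.

```python
import cmath
import math

def nonlinear_side(c, d, theta):
    """Left-hand side of (17): the real, nonlinear expression."""
    return c + d + 2 * math.sqrt(c * d) * math.cos(theta)

def complex_side(c, d, theta):
    """Right-hand side of (17): squared modulus of a complex amplitude."""
    return abs(math.sqrt(c) + math.sqrt(d) * cmath.exp(1j * theta)) ** 2
```

The amplitude inside the modulus is a linear combination of $\sqrt{c}$ and $\sqrt{d}$, which is what makes a linear space representation possible.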

In classical physics the coefficients $\lambda_i = 0$. We have the same situation in quantum physics for all compatible observables, as well as for measurements of incompatible observables for some states. In the general case in quantum physics we can only say that the normalized statistical deviations satisfy

$$|\lambda_i| \le 1. \tag{18}$$

Hence for quantum experiments we always have

$$\left|\frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i}m_{2i}}}\right| \le 1, \quad N \to \infty. \tag{19}$$

Thus quantum perturbations induce relatively small (but not negligibly small) statistical variations of properties. We underline again that quantum perturbations give just the proper class of perturbations satisfying condition (19).

Let us consider arbitrary preparation procedures that induce perturbations satisfying (18). We can set

$$\lambda_i = \cos\theta_i, \quad i = 1, 2,$$

where $\theta_i$ are some phases. Here we can represent the perturbation to the conventional probabilistic rule in the form

$$\delta_i = 2\sqrt{q_1p_{1i}q_2p_{2i}}\cos\theta_i, \quad i = 1, 2. \tag{20}$$

In this case the probabilistic rule has the form ($i = 1, 2$)

$$p_i = q_1p_{1i} + q_2p_{2i} + 2\sqrt{q_1q_2p_{1i}p_{2i}}\cos\theta_i. \tag{21}$$

This is the general form of a trigonometric probabilistic transformation. The usual probabilistic calculations give us

$$1 = p_1 + p_2 = q_1p_{11} + q_2p_{21} + q_1p_{12} + q_2p_{22} + 2\sqrt{q_1q_2p_{11}p_{21}}\cos\theta_1 + 2\sqrt{q_1q_2p_{12}p_{22}}\cos\theta_2$$

$$= 1 + 2\sqrt{q_1q_2}\left[\sqrt{p_{11}p_{21}}\cos\theta_1 + \sqrt{p_{12}p_{22}}\cos\theta_2\right].$$


Thus we obtain the relation

$$\sqrt{p_{11}p_{21}}\cos\theta_1 + \sqrt{p_{12}p_{22}}\cos\theta_2 = 0. \tag{22}$$

Suppose now that the matrix of probabilities is a double stochastic matrix. We get

$$\cos\theta_1 = -\cos\theta_2. \tag{23}$$

We obtain the quantum probabilistic transformation (2). We have demonstrated that this rule could be derived even in the realist framework. Condition (19) has an evident interpretation. To explain the mystery of the quantum probabilistic rule we must give some physical interpretation to the condition of double stochasticity; see section 4 for such an attempt.
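The role of relations (22) and (23) can be illustrated with a short sketch of the trigonometric rule (21). The function name and the sample matrix are assumptions made for illustration only.

```python
import math

def trigonometric_rule(q1, P, thetas):
    """Apply rule (21); P[j][i] plays the role of p_{j+1, i+1}."""
    q2 = 1.0 - q1
    return tuple(
        q1 * P[0][i] + q2 * P[1][i]
        + 2 * math.sqrt(q1 * q2 * P[0][i] * P[1][i]) * math.cos(thetas[i])
        for i in range(2)
    )

# A double stochastic matrix: rows and columns both sum to one.
P = ((0.7, 0.3), (0.3, 0.7))
theta1 = 0.8
normalized = trigonometric_rule(0.4, P, (theta1, theta1 + math.pi))  # cos(theta2) = -cos(theta1)
violating = trigonometric_rule(0.4, P, (theta1, theta1))             # relation (23) broken
```

Only the first choice of phases gives probabilities that sum to one, in line with (23).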

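We can simulate the quantum probabilistic transformation by using random variables $n_{ij}(\omega)$, $m_{ij}(\omega)$ such that the deviations

$$\varepsilon_{1i}^{(N)} = n_{1i} - m_{1i} = 2\xi_{1i}^{(N)}\sqrt{m_{1i}m_{2i}}, \tag{24}$$

$$\varepsilon_{2i}^{(N)} = n_{2i} - m_{2i} = 2\xi_{2i}^{(N)}\sqrt{m_{1i}m_{2i}}, \tag{25}$$

where the coefficients $\xi_{ij}^{(N)}$ satisfy the inequality

$$\left|\xi_{1i}^{(N)} + \xi_{2i}^{(N)}\right| \le 1, \quad N \to \infty. \tag{26}$$

Suppose that $\lambda_i^{(N)} = \xi_{1i}^{(N)} + \xi_{2i}^{(N)} \to \lambda_i$, $N \to \infty$, where $|\lambda_i| \le 1$. We can represent $\lambda_i^{(N)} = \cos\theta_i^{(N)}$. Then $\theta_i^{(N)} \to \theta_i \bmod 2\pi$ when $N \to \infty$. Thus $\lambda_i = \cos\theta_i$.

We remark that the conventional probabilistic rule (which is induced by ensemble fluctuations with $\lambda_i^{(N)} \to 0$) can be observed for fluctuations having relatively large absolute magnitudes. For instance, let

$$\varepsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}\sqrt{m_{1i}}, \quad \varepsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}\sqrt{m_{2i}}, \quad i = 1, 2, \tag{27}$$

where the sequences of coefficients $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded ($N \to \infty$). Here

$$\lambda_i^{(N)} = \frac{\xi_{1i}^{(N)}}{\sqrt{m_{2i}}} + \frac{\xi_{2i}^{(N)}}{\sqrt{m_{1i}}} \to 0, \quad N \to \infty$$

(as usual, we assume that $p_{ij} > 0$).

Example 2.1. Let $N \approx 10^6$, $n_1^b \approx n_2^b \approx 5\cdot 10^5$, $m_{11} \approx m_{12} \approx m_{21} \approx m_{22} \approx 25\cdot 10^4$. So $q_1 = q_2 = 1/2$; $p_{11} = p_{12} = p_{21} = p_{22} = 1/2$ (symmetric state). Suppose we have fluctuations (27) with $\xi_{1i}^{(N)} \approx \xi_{2i}^{(N)} \approx 1/2$. Then $\varepsilon_{1i}^{(N)} \approx \varepsilon_{2i}^{(N)} \approx 500$. So $n_{ij} = 25\cdot 10^4 \pm 500$. Hence the relative deviation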


$\varepsilon_{ji}^{(N)}/m_{ji} = 500/(25\cdot 10^4) \approx 0.002$. Thus fluctuations of the relative magnitude $\approx 0.002$ produce the conventional probabilistic rule.
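The arithmetic of this example can be reproduced in a few lines. This is a sketch under the stated numbers ($m_{ji} = 25\cdot 10^4$, $\xi = 1/2$); the helper names are illustrative.

```python
import math

def deviation_27(m, xi):
    """Absolute deviation of form (27): 2 * xi * sqrt(m)."""
    return 2 * xi * math.sqrt(m)

def normalized_deviation_27(m1, m2, xi1, xi2):
    """lambda = (eps1 + eps2) / (2 sqrt(m1 m2)) for deviations of form (27)."""
    eps1 = deviation_27(m1, xi1)
    eps2 = deviation_27(m2, xi2)
    return (eps1 + eps2) / (2 * math.sqrt(m1 * m2))

m = 25 * 10**4
eps = deviation_27(m, 0.5)                       # absolute deviation
relative = eps / m                               # relative deviation
lam = normalized_deviation_27(m, m, 0.5, 0.5)    # normalized deviation
```

In this symmetric case the normalized deviation happens to coincide with the relative deviation; it tends to zero as the counts grow, which is why the conventional rule survives.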

It is evident that fluctuations of essentially larger magnitude

$$\varepsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}(m_{1i})^{1/2}(m_{2i})^{1/\alpha}, \quad \varepsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}(m_{2i})^{1/2}(m_{1i})^{1/\beta}, \quad \alpha, \beta > 2, \tag{28}$$

where $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded sequences ($N \to \infty$), also produce (for $p_{ij} \neq 0$) the conventional probabilistic rule.

Example 2.2. Let all numbers $N$, $m_{ij}$ be the same as in Example 2.1 and let the deviations have behaviour (28) with $\alpha = \beta = 4$. Here the relative deviation $\varepsilon_{ji}^{(N)}/m_{ji} \approx 0.045$.

Remark 2.1. The magnitude of fluctuations can be found experimentally.

Let $A$ and $B$ be two physical observables. We prepare three statistical ensembles $S, T_1, T_2$ corresponding to the states $\varphi$, $|b_1\rangle$, $|b_2\rangle$. By measurements of $B$ and $A$ for $\pi \in S$ we obtain the frequencies $q_1^{(N)}, q_2^{(N)}, p_1^{(N)}, p_2^{(N)}$; by measurements of $A$ for $\pi \in T_1$ and for $\pi \in T_2$ we obtain the frequencies $p_{ij}^{(N)}$. We have

$$f_i(N) = \lambda_i^{(N)} = \frac{p_i^{(N)} - q_1^{(N)}p_{1i}^{(N)} - q_2^{(N)}p_{2i}^{(N)}}{2\sqrt{q_1^{(N)}p_{1i}^{(N)}q_2^{(N)}p_{2i}^{(N)}}}.$$

It would be interesting to obtain graphs of the functions $f_i(N)$ for different pairs of physical observables. Of course, we know that $\lim_{N\to\infty} f_i(N) = \pm\cos\theta$. However, it may be that such graphs can present a finer structure of quantum states.
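The quantity $f_i(N)$ is computable directly from measured frequencies. A minimal sketch follows, with an illustrative function name and synthetic frequencies generated from the trigonometric rule rather than from any real experiment.

```python
import math

def normalized_deviation(p_i, q1, q2, p1i, p2i):
    """f_i(N): deviation from the Bayes rule, in units of 2 sqrt(q1 p1i q2 p2i)."""
    bayes = q1 * p1i + q2 * p2i
    return (p_i - bayes) / (2 * math.sqrt(q1 * p1i * q2 * p2i))

# Synthetic check: build p_1 from the trigonometric rule with a known phase.
q1, q2, p11, p21, theta = 0.4, 0.6, 0.7, 0.3, 1.0
p1 = q1 * p11 + q2 * p21 + 2 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta)
recovered = normalized_deviation(p1, q1, q2, p11, p21)
```

Applied to experimental frequencies for increasing $N$, the same function would trace out the graphs of $f_i(N)$ mentioned above.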

3 Hyperbolic and hyper-trigonometric probabilistic transformations

Let $\mathcal{E}_1, \mathcal{E}_2$ be preparation procedures that produce perturbations such that the normalized (experimental) statistical deviations satisfy

$$|\lambda_i^{(N)}| > 1, \quad N \to \infty. \tag{29}$$

Thus $|\lambda_i| > 1$, $i = 1, 2$. Here the coefficients $\lambda_i$ can be represented in the form $\lambda_i = \pm\cosh\theta_i$, $i = 1, 2$. The corresponding probability rule has the following form:

$$p_i = q_1p_{1i} + q_2p_{2i} \pm 2\sqrt{q_1q_2p_{1i}p_{2i}}\cosh\theta_i, \quad i = 1, 2.$$

The normalization $p_1 + p_2 = 1$ gives the orthogonality relation

$$\sqrt{p_{11}p_{21}}\cosh\theta_1 \pm \sqrt{p_{12}p_{22}}\cosh\theta_2 = 0. \tag{30}$$

Thus $\cosh\theta_2 = \cosh\theta_1\sqrt{\dfrac{p_{11}p_{21}}{p_{12}p_{22}}}$ and $\operatorname{sign}\lambda_1\lambda_2 = -1$.


This probabilistic transformation can be called a hyperbolic rule. It describes a part of nonconventional probabilistic behaviours which is not described by the trigonometric formalism. Experiments (and preparation procedures $\mathcal{E}, \mathcal{E}_1, \mathcal{E}_2$) which produce hyperbolic probabilistic behaviour could be simulated on a computer. On the other hand, at the moment we have no natural physical phenomena which are described by the hyperbolic probabilistic formalism. Trigonometric probabilistic behaviour corresponds to essentially better control of properties in the process of preparation than hyperbolic probabilistic behaviour. Of course, the aim of any experimenter is to approach trigonometric behaviour. However, in principle there might exist natural phenomena for which trigonometric quantum behaviour could not be achieved.

Example 3.1. Let $q_1 = \alpha$, $q_2 = 1 - \alpha$, $p_{11} = \ldots = p_{22} = 1/2$. Then $p_1 = \frac{1}{2} + \sqrt{\alpha(1-\alpha)}\,\lambda_1$, $p_2 = \frac{1}{2} - \sqrt{\alpha(1-\alpha)}\,\lambda_1$. If $\alpha$ is sufficiently small, then $\lambda_1$ can in principle be larger than 1. We can find a phase $\theta$ such that the normalized statistical deviation $\lambda_1 = \cosh\theta$.

Let us consider experiments that produce the hyperbolic probabilistic rule and let the corresponding matrix of probabilities be double stochastic. In this case the orthogonality relation (30) has the form

$$\cosh\theta_1 = \cosh\theta_2 = \cosh\theta.$$

We get the probabilistic transformation

$$p_1 = q_1p_{11} + q_2p_{21} \pm 2\sqrt{q_1q_2p_{11}p_{21}}\cosh\theta,$$

$$p_2 = q_1p_{12} + q_2p_{22} \mp 2\sqrt{q_1q_2p_{12}p_{22}}\cosh\theta.$$
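Example 3.1 can be made concrete with numbers. The sketch below (illustrative names, hypothetical values of $\alpha$ and $\lambda_1$) shows how large $\lambda_1$ may become while $p_1, p_2$ remain genuine probabilities.

```python
import math

def example_probabilities(alpha, lam1):
    """p1, p2 of Example 3.1 for q1 = alpha and all p_ij = 1/2."""
    s = math.sqrt(alpha * (1.0 - alpha))
    return 0.5 + s * lam1, 0.5 - s * lam1

def max_lambda(alpha):
    """Largest |lambda_1| keeping both p1 and p2 inside [0, 1]."""
    return 1.0 / (2.0 * math.sqrt(alpha * (1.0 - alpha)))

alpha, lam1 = 0.01, 3.0
p1, p2 = example_probabilities(alpha, lam1)
theta = math.acosh(lam1)  # hyperbolic phase with lambda_1 = cosh(theta)
```

For $\alpha = 1/2$ the bound equals 1, so a hyperbolic deviation is only possible for an asymmetric ensemble $S$ in this example.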

This probabilistic transformation looks similar to the quantum probabilistic transformation. The only difference is the presence of hyperbolic factors instead of trigonometric ones. This similarity gives the possibility to construct a linear space representation of the hyperbolic probabilistic calculus; see section 7.

The reader can easily consider by himself the last possibility: one normalized statistical deviation $|\lambda_i|$ is larger than 1 and the other is less than 1 (hyper-trigonometric probabilistic transformation).

Remark 3.1. The real experimental situation is more complicated. In fact, the phase parameter $\theta$ is connected with the experimental arrangement. In particular, in the standard interference experiments the phase is related to the space-time structure of an experiment. It may be that in some experiments the dependence of the normalized statistical deviation $\lambda$ on $\theta$ is neither trigonometric nor hyperbolic:

$$P = P_1 + P_2 + 2\sqrt{P_1P_2}\,\lambda(\theta).$$

However, if the function $|\lambda(\theta)| \le 1$, then we can obtain the trigonometric transformation by just the reparametrization $\theta' = \arccos\lambda(\theta)$.
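The split into trigonometric and hyperbolic behaviour, together with the reparametrization of the remark, can be summarized in a short sketch. The function name and return convention are assumptions made for illustration.

```python
import math

def classify_deviation(lam):
    """Return (kind, sign, phase) with lam = sign * cos(phase) or sign * cosh(phase)."""
    if abs(lam) <= 1.0:
        # Trigonometric case: the sign is absorbed into the phase.
        return "trigonometric", 1.0, math.acos(lam)
    # Hyperbolic case: cosh >= 1, so the sign must be kept separately.
    return "hyperbolic", math.copysign(1.0, lam), math.acosh(abs(lam))
```

For example, `classify_deviation(0.5)` yields a trigonometric phase of $\pi/3$, while `classify_deviation(-3.0)` yields a hyperbolic phase with sign $-1$.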


4 Double stochasticity and correlations between preparation procedures

In this section we study the frequency meaning of the fact that in the quantum formalism the matrix of probabilities is double stochastic. We remark that this is a consequence of the orthogonality of the quantum states $|b_1\rangle$ and $|b_2\rangle$ corresponding to distinct values of a physical observable $B$. We have

$$\frac{p_{11}}{p_{12}} = \frac{p_{22}}{p_{21}}. \tag{31}$$

Suppose that all quantum features are induced by the impossibility of creating new ensembles $T_1$ and $T_2$ without changing properties of quantum particles. Suppose that, for example, the preparation procedure $\mathcal{E}_1$ practically destroys the property $A = a_1$ (transforms this property into the property $A = a_2$). So $p_{11} \approx 0$. As a consequence, $\mathcal{E}_1$ makes the property $A = a_2$ dominating. So $p_{12} \approx 1$. Then the preparation procedure $\mathcal{E}_2$ must practically destroy the property $A = a_2$ (transform this property into the property $A = a_1$). So $p_{22} \approx 0$. As a consequence, $\mathcal{E}_2$ makes the property $A = a_1$ dominating. So $p_{21} \approx 1$.

We remark that

We recall that the number of elements in the ensemble T is equal to n Thus

n n -run _ n22 - m 2 2 ^ nil _ 22 bdquobdquo

This is nothing other than the relation between the fluctuations of the property $A$ under the transition from the ensemble $S$ to the ensembles $T_1, T_2$ and the distribution of this property in the ensemble $S$.

5 Hyperbolic quantum formalism

The mathematical formalism presented in this section can have different physical interpretations. In particular, a quantum state can be interpreted from the orthodox Copenhagen as well as the statistical viewpoint.

A hyperbolic algebra $G$, see [18], p. 21, is a two dimensional real algebra with basis $e_0 = 1$ and $e_1 = j$, where $j^2 = 1$. Elements of $G$ have the form $z = x + jy$, $x, y \in R$. We have $z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)$ and $z_1z_2 = (x_1x_2 + y_1y_2) + j(x_1y_2 + x_2y_1)$. This algebra is commutative. We introduce


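the involution in $G$ by setting $\bar{z} = x - jy$. We set $|z|^2 = z\bar{z} = x^2 - y^2$. We remark that $|z| = \sqrt{x^2 - y^2}$ is not well defined for an arbitrary $z \in G$. We set $G_+ = \{z \in G : |z|^2 \ge 0\}$. We remark that $G_+$ is a multiplicative semigroup: $z_1, z_2 \in G_+ \to z = z_1z_2 \in G_+$. It is a consequence of the equality

$$|z_1z_2|^2 = |z_1|^2|z_2|^2.$$

Thus for $z_1, z_2 \in G_+$ we have $|z_1z_2| = |z_1||z_2|$. We introduce

$$e^{j\theta} = \cosh\theta + j\sinh\theta, \quad \theta \in R.$$

We remark that

$$e^{j\theta_1}e^{j\theta_2} = e^{j(\theta_1+\theta_2)}, \quad \overline{e^{j\theta}} = e^{-j\theta}, \quad |e^{j\theta}|^2 = \cosh^2\theta - \sinh^2\theta = 1.$$

Hence $z = \pm e^{j\theta}$ always belongs to $G_+$. We also have $\cosh\theta = \dfrac{e^{j\theta} + e^{-j\theta}}{2}$, $\sinh\theta = \dfrac{e^{j\theta} - e^{-j\theta}}{2j}$. We set $G_+^* = \{z \in G_+ : |z|^2 > 0\}$. Let $z \in G_+^*$. We have

$$z = |z|\left(\frac{x}{|z|} + j\frac{y}{|z|}\right) = \operatorname{sign}x\,|z|\left(\frac{x\operatorname{sign}x}{|z|} + j\frac{y\operatorname{sign}x}{|z|}\right).$$

As $\dfrac{x^2}{|z|^2} - \dfrac{y^2}{|z|^2} = 1$, we can represent $\dfrac{x\operatorname{sign}x}{|z|} = \cosh\theta$ and $\dfrac{y\operatorname{sign}x}{|z|} = \sinh\theta$, where the phase $\theta$ is uniquely defined. We can represent each $z \in G_+^*$ as

$$z = \operatorname{sign}x\,|z|\,e^{j\theta}.$$

By using this representation we can easily prove that $G_+^*$ is a multiplicative group. Here $\dfrac{1}{z} = \dfrac{\operatorname{sign}x}{|z|}e^{-j\theta}$. The unit circle in $G$ is defined as $S_1 = \{z \in G : |z|^2 = 1\} = \{z = \pm e^{j\theta}, \theta \in (-\infty, +\infty)\}$. It is a multiplicative subgroup of $G_+^*$.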
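The arithmetic of the hyperbolic algebra is simple enough to sketch in code. The class below is an illustration only (the names `Hyp`, `hexp` and `polar` are chosen here, not taken from the text); it implements the product rule, the involution, the indefinite squared norm and the polar form $z = \operatorname{sign}x\,|z|\,e^{j\theta}$.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyp:
    """Hyperbolic number z = x + j*y with j*j = 1."""
    x: float
    y: float

    def __add__(self, other):
        return Hyp(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        return Hyp(self.x * other.x + self.y * other.y,
                   self.x * other.y + self.y * other.x)

    def conj(self):
        return Hyp(self.x, -self.y)

    def norm2(self):
        """|z|^2 = x^2 - y^2, which may be negative."""
        return self.x ** 2 - self.y ** 2

def hexp(theta):
    """e^{j theta} = cosh(theta) + j sinh(theta)."""
    return Hyp(math.cosh(theta), math.sinh(theta))

def polar(z):
    """Return (sign, modulus, phase) for z with |z|^2 > 0."""
    n2 = z.norm2()
    if n2 <= 0:
        raise ValueError("polar form requires |z|^2 > 0")
    sign = 1.0 if z.x > 0 else -1.0
    # y/x = tanh(theta), and |y/x| < 1 because x^2 > y^2.
    return sign, math.sqrt(n2), math.atanh(z.y / z.x)
```

Elements with $|z|^2 \le 0$, such as $1 + j$, have no polar form, which is why the Born-type interpretation later requires decomposability.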

A hyperbolic Hilbert space is a $G$-linear space (module), see [18], $E$ with a $G$-linear product: a map $(\cdot,\cdot): E \times E \to G$ that is

1) linear with respect to the first argument: $(az + bw, u) = a(z, u) + b(w, u)$, $a, b \in G$, $z, w, u \in E$;
2) symmetric: $(z, u) = \overline{(u, z)}$;
3) nondegenerate: $(z, u) = 0$ for all $u \in E$ iff $z = 0$.

If we consider $E$ as just an $R$-linear space, then $(\cdot,\cdot)$ is a bilinear form which is not positive definite. In particular, in the two dimensional case we have the signature $(+, -, +, -)$.

As in the ordinary quantum formalism, we represent physical states by normalized vectors of the hyperbolic Hilbert space: $\varphi \in E$ and $(\varphi, \varphi) = 1$. We shall consider only dichotomic physical variables and quantum states belonging to the two dimensional Hilbert space. So everywhere below $E$ denotes the two dimensional space. Let $A = a_1, a_2$ and $B = b_1, b_2$ be two dichotomic physical variables. We represent them by $G$-linear operators $|a_1\rangle\langle a_1| + |a_2\rangle\langle a_2|$


and $|b_1\rangle\langle b_1| + |b_2\rangle\langle b_2|$, where $\{|a_i\rangle\}_{i=1,2}$ and $\{|b_i\rangle\}_{i=1,2}$ are two orthonormal bases in $E$.

Let $\varphi$ be a state (a normalized vector belonging to $E$). We can perform the following operation (which is well defined from the mathematical point of view). We expand the vector $\varphi$ with respect to the basis $\{|b_i\rangle\}_{i=1,2}$:

$$\varphi = \beta_1|b_1\rangle + \beta_2|b_2\rangle, \tag{34}$$

where the coefficients (coordinates) $\beta_i$ belong to $G$. As the basis $\{|b_i\rangle\}_{i=1,2}$ is orthonormal, we get (as in the complex case) that

$$|\beta_1|^2 + |\beta_2|^2 = 1. \tag{35}$$

However, we could not automatically use Born's probabilistic interpretation for normalized vectors in the hyperbolic Hilbert space: it may be that $\beta_i \notin G_+$ (in fact, in the complex case we have $C = C_+$). We say that a state $\varphi$ is decomposable with respect to the system of states $\{|b_i\rangle\}_{i=1,2}$ ($B$-decomposable) if

$$\beta_i \in G_+. \tag{36}$$

In such a case we can use Born's probabilistic interpretation of vectors in a hyperbolic Hilbert space: the numbers $q_i = |\beta_i|^2$, $i = 1, 2$, are interpreted as probabilities for the values $B = b_i$ for the $G$-quantum state $\varphi$.

We now repeat these considerations for each state $|b_i\rangle$ by using the basis $\{|a_k\rangle\}_{k=1,2}$. We suppose that each $|b_i\rangle$ is $A$-decomposable. We have

$$|b_1\rangle = \beta_{11}|a_1\rangle + \beta_{12}|a_2\rangle, \quad |b_2\rangle = \beta_{21}|a_1\rangle + \beta_{22}|a_2\rangle, \tag{37}$$

where the coefficients $\beta_{ik}$ belong to $G_+$. We have automatically

$$|\beta_{11}|^2 + |\beta_{12}|^2 = 1, \quad |\beta_{21}|^2 + |\beta_{22}|^2 = 1. \tag{38}$$

We can use the probabilistic interpretation of the numbers $p_{11} = |\beta_{11}|^2$, $p_{12} = |\beta_{12}|^2$ and $p_{21} = |\beta_{21}|^2$, $p_{22} = |\beta_{22}|^2$: $p_{ik}$ is the probability for $A = a_k$ in the state $|b_i\rangle$.

Let us consider the matrices $B = (\beta_{ik})$ and $P = (p_{ik})$. As in the complex case, the matrix $B$ is unitary: the vectors $u_1 = (\beta_{11}, \beta_{12})$ and $u_2 = (\beta_{21}, \beta_{22})$ are orthonormal. The matrix $P$ is double stochastic.

By using the $G$-linear space calculation (the change of the basis) we get $\varphi = \alpha_1|a_1\rangle + \alpha_2|a_2\rangle$, where $\alpha_1 = \beta_1\beta_{11} + \beta_2\beta_{21}$ and $\alpha_2 = \beta_1\beta_{12} + \beta_2\beta_{22}$.


We remark that decomposability is not transitive. In principle $\varphi$ may be not $A$-decomposable despite $B$-decomposability of $\varphi$ and $A$-decomposability of the $B$-system.

Suppose that $\varphi$ is $A$-decomposable. Therefore the coefficients $p_k = |\alpha_k|^2$ can be interpreted as probabilities for $A = a_k$ for the $G$-quantum state $\varphi$.

Let us consider states such that the coefficients $\beta_i, \beta_{ik}$ belong to $G_+^*$. We can uniquely represent them as

$$\beta_i = \pm\sqrt{q_i}\,e^{j\xi_i}, \quad \beta_{ik} = \pm\sqrt{p_{ik}}\,e^{j\gamma_{ik}}, \quad i, k = 1, 2.$$

We find that

$$p_1 = q_1p_{11} + q_2p_{21} + 2\epsilon_1\sqrt{q_1p_{11}q_2p_{21}}\cosh\theta_1, \tag{39}$$

$$p_2 = q_1p_{12} + q_2p_{22} + 2\epsilon_2\sqrt{q_1p_{12}q_2p_{22}}\cosh\theta_2, \tag{40}$$

where $\theta_i = \eta + \gamma_i$ and $\eta = \xi_1 - \xi_2$, $\gamma_1 = \gamma_{11} - \gamma_{21}$, $\gamma_2 = \gamma_{12} - \gamma_{22}$, and $\epsilon_i = \pm$. To find the right relation between the signs of the last terms in equations (39), (40) we use the normalization condition

$$|\alpha_1|^2 + |\alpha_2|^2 = 1 \tag{41}$$

(which is a consequence of the normalization of $\varphi$ and the orthonormality of the system $\{|a_i\rangle\}_{i=1,2}$). It is equivalent to the equation (condition of orthogonality in the hyperbolic case, see section 8)

$$\sqrt{p_{12}p_{22}}\cosh\theta_2 \pm \sqrt{p_{11}p_{21}}\cosh\theta_1 = 0.$$

Thus we have to choose opposite signs in equations (39), (40). Unitarity of $B$ also implies that $\theta_1 - \theta_2 = 0$, so $\gamma_1 = \gamma_2$. We recall that in ordinary quantum mechanics we have similar conditions, but trigonometric functions are used instead of hyperbolic ones, and the phases $\gamma_1$ and $\gamma_2$ are such that $\gamma_1 - \gamma_2 = \pi$.

Finally, we get that (unitary) linear transformations in the $G$-Hilbert space (in the domain of decomposable states) represent the hyperbolic transformation of probabilities (see section 8):

$$p_1 = q_1p_{11} + q_2p_{21} \pm 2\sqrt{q_1p_{11}q_2p_{21}}\cosh\theta,$$

$$p_2 = q_1p_{12} + q_2p_{22} \mp 2\sqrt{q_1p_{12}q_2p_{22}}\cosh\theta.$$

This is a kind of hyperbolic interference.

There can be some connection with quantization in Hilbert spaces with indefinite metric as well as with the theory of relativity. However, at the moment we cannot say anything definite. It seems that by using Lorentz rotations we can produce hyperbolic interference in a similar way as we produce the standard trigonometric interference by using ordinary rotations.
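Hyperbolic interference can be reproduced with a few lines of arithmetic over $G$. The sketch below is an illustration under simplifying assumptions: hyperbolic numbers are plain `(x, y)` tuples, and the transition coefficients are taken real ($\gamma_{ik} = 0$) with $\beta_{11} = \sqrt{p}$, $\beta_{12} = \beta_{21} = \sqrt{1-p}$, $\beta_{22} = -\sqrt{p}$, one convenient unitary choice that is not claimed to be the only one.

```python
import math

def hmul(z, w):
    """Product in G: (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)."""
    return (z[0] * w[0] + z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def hadd(z, w):
    return (z[0] + w[0], z[1] + w[1])

def hnorm2(z):
    """|z|^2 = x^2 - y^2."""
    return z[0] ** 2 - z[1] ** 2

def hpolar(r, theta):
    """r * e^{j theta} for real r."""
    return (r * math.cosh(theta), r * math.sinh(theta))

def hyperbolic_probabilities(q1, p, xi1, xi2):
    """|alpha_1|^2 and |alpha_2|^2 for the state beta_1|b1> + beta_2|b2>."""
    q2 = 1.0 - q1
    beta1, beta2 = hpolar(math.sqrt(q1), xi1), hpolar(math.sqrt(q2), xi2)
    b11, b12 = (math.sqrt(p), 0.0), (math.sqrt(1.0 - p), 0.0)
    b21, b22 = (math.sqrt(1.0 - p), 0.0), (-math.sqrt(p), 0.0)
    alpha1 = hadd(hmul(beta1, b11), hmul(beta2, b21))
    alpha2 = hadd(hmul(beta1, b12), hmul(beta2, b22))
    return hnorm2(alpha1), hnorm2(alpha2)

def hyperbolic_rule(q1, p, eta):
    """Closed-form hyperbolic interference with theta = eta."""
    q2 = 1.0 - q1
    term = 2 * math.sqrt(q1 * q2 * p * (1.0 - p)) * math.cosh(eta)
    return q1 * p + q2 * (1.0 - p) + term, q1 * (1.0 - p) + q2 * p - term
```

For large $|\eta|$ the second value becomes negative, which corresponds to a state that is not $A$-decomposable.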


6 Physical consequences

The wave-particle dualism was created to explain the interference phenomenon for massive elementary particles. In particular, the orthodox Copenhagen interpretation was proposed to find a compromise between corpuscular and wave features of elementary particles. The idea of superposition of distinct properties is in fact based on these interference experiments. It is well known that the orthodox Copenhagen interpretation is not free of difficulties (in particular, collapse of the wave function) and even paradoxes (see, for example, Schrödinger [19]). Problems in the orthodox Copenhagen interpretation induce even attempts to exclude corpuscular objects from quantum theory altogether; see, for example, [20] for Schrödinger's critique of the classical concept of a particle. At the moment there is only one alternative to the orthodox Copenhagen interpretation, namely Einstein's statistical interpretation. By this interpretation the wave function describes distinct statistical features of an ensemble of elementary particles; see L. Ballentine [17] for the details (see also [16], [5], [10], [11]).

However, we must recognize that Einstein's statistical approach could not solve the fundamental problem of quantum theory: it could not explain the appearance of NEW STATISTICS in the purely corpuscular model. We did this in the present paper. On one hand, this is a strong argument in favour of the statistical interpretation of quantum mechanics. On the other hand, one of the main motivations to use the wave-particle duality disappeared.

Nevertheless, our investigation could not be considered as a crucial argument against the wave-particle duality. It is clear that by using purely mathematical analysis we cannot prove or disprove a physical theory. The only thing that we proved is that corpuscular objects (that have no wave features) can exhibit NEW STATISTICS.

In fact, we obtained essentially more than planned: this NEW STATISTICS is not reduced to QUANTUM STATISTICS. In principle we can propose experiments that induce TRIGONOMETRIC, HYPERBOLIC and HYPER-TRIGONOMETRIC STATISTICS.

We remark that the quantum probabilistic transformation $P = P_1 + P_2 + 2\sqrt{P_1P_2}\cos\theta$ gives the possibility to predict the probability $P$ if we know the probabilities $P_1$ and $P_2$. In principle there might be created theories based on arbitrary transformations $P = F(P_1, P_2)$. It may be that some rules have linear space representations over exotic number systems, for example, p-adic numbers [20].


Preliminary analysis of the probabilistic foundations of quantum mechanics (which induced the present investigation) was performed in the books [11] and [21] (chapter 2); a part of the results of this paper was presented in the preprints [22]-[24].

Acknowledgements

I would like to thank S. Albeverio, L. Accardi, L. Ballentine, V. Belavkin, E. Beltrametti, W. De Muynck, S. Gudder, T. Hida, A. Holevo, P. Lahti, A. Peres, J. Summhammer and I. Volovich for (sometimes critical) discussions on the probabilistic foundations of quantum mechanics.

References

1. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).
2. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
3. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes, in The wave-particle dualism. A tribute to Louis de Broglie on his 90th Birthday, ed. S. Diner, D. Fargue, G. Lochak and F. Selleri (D. Reidel Publ. Company, Dordrecht, 297-330, 1984).
4. B. d'Espagnat, Veiled Reality. An analysis of present-day quantum mechanical concepts (Addison-Wesley, 1995).
5. A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, 1994).
6. J. von Neumann, Mathematical foundations of quantum mechanics (Princeton Univ. Press, Princeton, N.J., 1955).
7. E. Schrödinger, Philosophy and the Birth of Quantum Mechanics, edited by M. Bitbol, O. Darrigol (Editions Frontieres, 1992).
8. J. M. Jauch, Foundations of Quantum Mechanics (Addison-Wesley, Reading, Mass., 1968).
9. P. Busch, M. Grabowski, P. Lahti, Operational Quantum Physics (Springer Verlag, 1995).
10. W. De Muynck, W. De Baere, H. Martens, Found. Phys. 24, 1589-1663 (1994).
11. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
12. J. Summhammer, Int. J. Theor. Phys. 33, 171-178 (1994).
13. R. von Mises, The mathematical theory of probability and statistics (Academic, London, 1964).


14. A. N. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer Verlag, Berlin, 1933); reprinted: Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
15. W. Heisenberg, Z. Physik 43, 172 (1927).
16. L. E. Ballentine, Quantum mechanics (Englewood Cliffs, New Jersey, 1989).
17. L. E. Ballentine, Rev. Mod. Phys. 42, 358-381 (1970).
18. A. Yu. Khrennikov, Superanalysis (Kluwer Academic Publishers, Dordrecht, 1999).
19. E. Schrödinger, Die Naturwiss. 23, 807-812, 824-828, 844-849 (1935).
20. E. Schrödinger, What is an elementary particle?, in Gesammelte Abhandlungen (Vieweg and Son, Wien, 1984).
21. A. Yu. Khrennikov, p-adic valued distributions in mathematical physics (Kluwer Academic Publishers, Dordrecht, 1994).
22. A. Yu. Khrennikov, Ensemble fluctuations and the origin of quantum probabilistic rule, Rep. MSI, Växjö Univ. 90, October (2000).
23. A. Yu. Khrennikov, Classification of transformations of probabilities for preparation procedures: trigonometric and hyperbolic behaviours, Preprint quant-ph/0012141, 24 Dec. (2000).
24. A. Yu. Khrennikov, Hyperbolic quantum mechanics, Preprint quant-ph/0101002, 31 Dec. (2000).


NONCONVENTIONAL VIEWPOINT TO ELEMENTS OF PHYSICAL REALITY BASED ON NONREAL ASYMPTOTICS OF RELATIVE FREQUENCIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se

We study the connection between stabilization of relative frequencies and elements of physical reality. We observe that besides the standard stabilization with respect to the real metric, other statistical stabilizations can be considered (in particular, with respect to the so called p-adic metric on the set of rational numbers). Nonconventional statistical stabilizations might be connected with new (nonconventional) elements of reality. We present a few natural examples of statistical phenomena in which relative frequencies of observed events stabilize in the p-adic metric but fluctuate in the standard real metric.

1 Introduction

The present methodology of physical measurements is based on the principle of the statistical stabilization of relative frequencies in the long run of trials. In the mathematical model this principle is represented by the law of large numbers. This approach to measurements is induced by the human representation of physical reality as a reality of stable repetitive phenomena. In the process of evolution we created cognitive structures that correspond to elements of this repetitive physical reality. All modern physical investigations are oriented to the creation of new elements of such a reality.

It must be remarked that the notion of stabilization (of relative frequencies) plays the fundamental role in the creation of this reality. I would like to point out that the conventional meaning of stabilization is based on real numbers. When we say stabilization, we mean stabilization with respect to the standard real metric $\rho_R(x, y) = |x - y|$ (the distance between points $x$ and $y$ on the real line $R$). Of course, such a choice of the metric that determines statistically elements of physical reality was not just a consequence of the development of one special mathematical theory, real analysis (b). It seems that the notion of $\rho_R$-stabilization was induced by human practice in which quantities $n \ll N$ were not important. We created real physical reality because we used smallness based on the standard order on the set of natural numbers.

(a) We ask the reader not to connect our vague (common sense) use of the notion of an element of physical reality with the EPR sufficient condition to be an element of reality [1].

(b) Nevertheless, we must not forget that the human factor played a large role in the spreading of the (presently dominating) model of physical reality based on real numbers. At the beginning Newton's analysis was propagated as a kind of religion. There were (in particular in France) divine services devoted to Newton's analysis.

It must be underlined that in modern physics the real physical reality (i.e. reality based on $\rho_R$-stability) is in fact identified with the whole physical reality.

On the other hand, modern mathematics is no longer just real analysis. In particular, the development of general topology [2], [3] induced a large spectrum of new nearness (in particular metric) structures. In principle we need no longer identify any stabilization with $\rho_R$-stabilization. There appears a huge set of new possibilities to introduce new forms of stability in physical experiments. Moreover, new stable structures can be considered as new elements of physical reality that in general need not belong to the standard real reality.

This idea was presented for the first time in the author's investigations [4], [5] on so called p-adic physics [6]-[10]. Later we tried to find the place of p-adic probabilities in quantum physics [11], [12] (in particular, to justify on the mathematical level of rigorousness the use of negative and complex probabilities, as well as to create models with hidden variables that do not produce Bell's inequality). In this paper we give a brief introduction to these probabilistic models and present a few rather natural examples in which relative frequencies of events stabilize with respect to the so called p-adic metric but fluctuate with respect to $\rho_R$. There is no corresponding element of the real reality. But there is an element of the p-adic reality. The objects considered in the examples could be created on the hard level. In particular, to create a plantation in which the colour of a flower (red or white) is an element of p-adic reality I need just a tractor and a (sufficiently large) piece of land. Nevertheless, I must agree that such p-adic elements of reality have never been observed in naturally created physical objects.

The reader may be interested in the reasons why we concentrate on statistical stabilization with respect to the p-adic numbers (p-adic frequency probability theory). The main reason is that p-adic numbers are in fact the unique alternative to real numbers: there is no other possibility to complete the field of rational numbers and obtain a new number field (Ostrovskii's theorem, see, for example, [13], [14]).

Our probabilistic foundations are based on the generalization of R. von Mises' frequency theory of probability [15], [16]. At the beginning of this century, when the foundations of modern probability theory were being laid, the


frequency definition of probability proposed by von Mises played an imporshytant role In particular it was this definition of probability that Kolmogorov used to motivate his axioms of probability theory (see [17]) We also begin the construction of the new theory of probability with a frequency definition of probability

Von Mises defined the probability of an event as the limit of the relative frequencies of the occurrence of the event when the volume of the statistical sample tends to infinity This definition is the foundation of mathematical statistics (see example Cramer [18]) in which von Misess definition is formushylated as the principle of statistical stabilization of relative frequencies

In this paper we propose a general principle of statistical stabilization of relative frequencies By virtue of this principle statistical stabilization of relative frequencies u = nN can be considered not only in the real topology on Q (and all relative frequencies are rational numbers) but also in any other topology on Q Then the probabilities of events belong to the corresponding completion of the field of rational numbers As special cases we obtain the ordinary real probability theory (von Misess definition) and p-adic probability theories p = 2 3 5

How should one choose the topology of statistical stabilization for a given statistical sample? The topology is determined by the properties of the probability model under study. In essence, we propose the following principle: for each probability model there is a corresponding topology (or topologies) of statistical stabilization.

For example, in a random sample there need not be any statistical stabilization of the relative frequencies in the real metric. Thus, from the point of view of real probability theory, this is not a probabilistic object. However, in this random sample one may observe p-adic statistical stabilization of the relative frequencies.

In essence, I am asserting that the foundation of probability theory is provided by rational numbers (relative frequencies) and not by real numbers. Real probabilities of events merely represent one of many possibilities that arise in the statistical analysis of a random sample. Such an approach to probability theory agrees well with Volovich's proposition that rational numbers are the foundation of theoretical physics [19]. In accordance with this proposition, everything physical is rational, and number fields different from the field of rational numbers arise as an idealization needed for the theoretical description of physical results.

All necessary information on p-adic (and more general m-adic) numbers can be found in Appendix 1 of this paper. However, in the first two sections they are hardly used at all, and we may restrict ourselves to the remark that, in addition to the completion of the field of rational numbers $\mathbf{Q}$ with respect to the real metric, there also exist completions with respect to other metrics, and among these completions are the fields of p-adic numbers $\mathbf{Q}_p$, $p = 2, 3, 5, \ldots$

2 Analysis of the foundation of probability theory

2.1 Frequency Definition of Probability

As is well known, the frequency definition of probability proposed by von Mises [15] in 1919 played an important role in the construction of the foundations of modern probability theory. This definition exerted a strong influence on the theory of probability measures, the foundations of which were laid by Borel [20], Kolmogorov [17], and Fréchet [21]. There is no point in giving Kolmogorov's axioms here (they can be found in any textbook on probability theory), but it is probably necessary to recall, in general outline, the main propositions of von Mises' theory of probability. The theory is based on infinite sequences $x = (x_1, x_2, \ldots, x_N, \ldots)$ of samplings or observations. If an experiment having $S$ outcomes is made, then $x_j$ can take the values $1, 2, \ldots, S$ (the possible outcomes). For the standard experiment of coin tossing we have $S = 2$ and $x_j = 1, 2$. In what follows, possible outcomes of an experiment will be called labels.

However, not every such sequence is regarded as an object of probability theory. The fundamental principle of the frequency theory of probability is the principle of statistical stabilization of the relative frequencies of occurrence of a particular label, and only sequences of samplings that satisfy this principle are regarded as objects of probability theory. Such sequences of samplings are called collectives.

A collective is a bulk phenomenon or a repeated process in brief a series of individual observations for which one is justified in assuming that the relative frequency of occurrence of each individual observable label tends to a definite limiting value [16]

The probability of an event $E$ is defined as the limit of the sequence of frequencies $\nu_N^{(E)} = n/N$, where $n$ is the number of cases in which the event $E$ is detected in the first $N$ tests.

For the subsequent considerations it is important to note that in the statistical analysis of the results of an experiment only rational numbers (relative frequencies) are obtained.

The principle of statistical stabilization of the relative frequencies is used practically unchanged in mathematical statistics

Observations of the frequency $\nu_N^{(E)}$ of a fixed event $E$ for increasing values of $N$ reveal that this frequency has, generally speaking, a tendency to take a more or less constant value at large $N$ (see Cramér [18]).

In defining a collective, von Mises used a further principle, the principle of irregularity of a sequence of tests, i.e., invariance of the limit of the relative frequencies with respect to the selection, made using a definite law, of some subsequence from a given sequence of tests $x = (x_1, x_2, \ldots, x_N, \ldots)$. It is important that the law of this selection should not be based on the difference of the elements of the sequence with respect to the considered label.

Second this limiting value must remain unchanged if from the complete sequence we choose arbitrarily any part and consider in what follows only this part [16]

This principle, like the principle of statistical stabilization of the relative frequencies, is fully in accord with our intuitive ideas of randomness. However, there are some logical difficulties here associated with the arbitrariness of the choice. A detailed analysis of these logical problems was made by Khinchin [22]; see also [12] for the details. It appears that one must agree with Khinchin's critical comments and consider a frequency theory of probability that is based only on von Mises' first principle, the principle of statistical stabilization of the relative frequencies.

As is noted in [22], the frequency theory of probability based solely on von Mises' first principle can be axiomatized and is as rigorous a mathematical theory as Kolmogorov's theory of probability. Here we do not intend to consider von Mises' theory of probability in the framework of an axiomatic approach. Our task is to analyze the principle of stabilization of the frequencies of occurrence of a particular event in a collective.

2.2 Von Mises' Frequency Theory of Probabilities as Objective Foundation of Kolmogorov's Axiomatics

As motivation for his axioms, Kolmogorov used the properties of limits of relative frequencies; see [17]. We shall be interested in the manner in which Kolmogorov's axiom 2 arose. In accordance with this axiom, the probability $P(E)$ of any event $E$ is a nonnegative real number $\leq 1$. In [17] Kolmogorov considers von Mises' definition [16] of probability as the limit of the relative frequencies of occurrence of the event $E$. Further, since the relative frequencies $\nu^{(E)} = n/N$ are rational numbers that lie between zero and unity, their limits in the real topology are real numbers between zero and unity. Cramér proceeded similarly in the construction of his theory of probability distributions [18].

Khinchin, discussing the advantages of Kolmogorov's axioms over von Mises' frequency theory of probability, noted that, from the formal aspect, the mutual relationship between the axiomatic and frequency theories is characterized in the first place by a higher degree of abstraction of the former.

This higher degree of abstraction was the foundation of the successful development of the theory of probability measures. However, this degree of abstraction is too high, and some properties of the world of real frequencies are lost in it. Essentially, the rational numbers were lost in Kolmogorov's theory of probability. Whereas in von Mises' theory the rational numbers arise as primary objects and real probabilities are obtained as the result of a limiting process for rational frequencies, in Kolmogorov's theory rational frequencies are secondary objects, associated with real probabilities (which are here primary) by means of the law of large numbers.

3 General principle of statistical stabilization of relative frequencies

First we emphasize that the probabilities $P$ in von Mises' frequency theory are ideal objects (symbols that denote the sequences of relative frequencies that stabilize in the field of real numbers). Therefore real numbers arise here as ideal objects associated with rational sequences of frequencies (see also Borel [20] and Poincaré [23]).

A basis for a broader view of probability theory is provided by the following principle of statistical stabilization of frequencies

Statistical stabilization (the limiting process) can be considered not only in the real topology on the field of rational numbers $\mathbf{Q}$ but also in any other topology on $\mathbf{Q}$. The probabilities of events are defined as the limits of the sequences of relative frequencies in the corresponding completions of the field of rational numbers.

For each considered probability model there is a corresponding topology on the field of rational numbers. The metrizable topologies on $\mathbf{Q}$ given by absolute values are the most interesting. By virtue of Ostrowski's theorem there are very few such topologies: besides the usual real topology, for which $\rho(x, y) = |x - y|$, there exist only the p-adic topologies, $p = 2, 3, \ldots$, where $\rho_p(x, y) = |x - y|_p$. Thus, if we consider only topologies given by absolute values, then besides the usual probability theory over $\mathbf{R}$ we obtain only the probability theories over $\mathbf{Q}_p$.

It is here necessary to introduce a natural restriction on the topology of statistical stabilization

The completion $\mathbf{Q}_t$ of the field of rational numbers $\mathbf{Q}$ with respect to the statistical stabilization topology $t$ is a topological field.

We have deliberately not introduced this restriction into the general principle of statistical stabilization. One can also consider statistical stabilization topologies that are not consistent with the algebraic structure on $\mathbf{Q}$. However, probability theory based on such topologies loses many familiar properties. For example, it turns out that the continuity of the addition operation is equivalent to additivity of probabilities, and continuity of the division operation is equivalent to the existence of conditional probabilities.

Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be some collective. We denote the set of all labels for this collective (the possible outcomes of an experiment producing this collective) by the symbol $\Pi$. We denote by $\Omega$ the event consisting in the realization of at least one of the labels $\pi \in \Pi$.

Proposition 3.1. The probability of the event $\Omega$ is equal to unity.

To prove this, it is sufficient to use the fact that all the relative frequencies $\nu^{(\Omega)}$ are equal to unity.

Let $\nu^{(j)}$, $j = 1, 2$, be the relative frequencies of realization of certain labels $\pi_1$ and $\pi_2$, and let $P_j = \lim \nu^{(j)}$ be the corresponding probabilities. Let the event $A$ be the realization of the label $\pi_1$ or $\pi_2$: $A = \pi_1 \vee \pi_2$. Using the continuity of the addition operation, we obtain

$$P(A) = \lim \nu^{(A)} = \lim (\nu^{(1)} + \nu^{(2)}) = \lim \nu^{(1)} + \lim \nu^{(2)} = P_1 + P_2. \qquad (1)$$

This rule can be generalized to any number of mutually exclusive events.

Proposition 3.2. Let $A_j$, $j = 1, \ldots, k$, be mutually exclusive events (i.e., the sets of labels that define these events are disjoint). Then

$$P(A_1 \vee \cdots \vee A_k) = \sum_{j=1}^{k} P(A_j). \qquad (2)$$

Using the continuity of the subtraction operation, we obtain the following proposition.

Proposition 3.3. For any two events $A$ and $B$, the equation $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$ holds.

In the language of collectives, the rule of addition of probabilities is formulated as follows (see [16]). Beginning with an original collective possessing more than two labels, an appreciable number of new collectives can be constructed by uniting labels; the elements of the new collective are the same as in the original one, but their labels are unifications of the labels of the original collective. To the unification of labels there corresponds the addition of frequencies.

We consider the set of rational numbers $U = \{x \in \mathbf{Q} : 0 \leq x \leq 1\}$. We denote by the symbol $U_t$ the closure of the set $U$ in the field $\mathbf{Q}_t$ (if $t$ is the ordinary real topology, then $U_t = [0, 1]$). An obvious consequence of the definition of probabilities is the following proposition.

Proposition 3.4. The probability $P(E)$ of any event belongs to the set $U_t$.


Conditional probabilities are then introduced into the frequency theory in the same way as in [16]. Suppose there is some initial collective $x = (x_1, x_2, \ldots, x_n, \ldots)$ with probabilities $P_\pi$ of the labels $\pi \in \Pi$. Using the unification rule, we define the probabilities of all groups of labels:

$$P(A) = \sum_{\pi \in A} P_\pi. \qquad (3)$$

We fix some group of labels $B = \pi_{i_1} \vee \cdots \vee \pi_{i_k}$. We are interested in the conditional probability $P(\pi|B)$, $\pi \in B$, of the label $\pi$ given the condition $B$. We form a new collective $x' = (x'_1, x'_2, \ldots, x'_n, \ldots)$, which is obtained from the original one by choosing only the elements with labels $\pi \in B$. The probability of the label $\pi$ in this new collective is then called the conditional probability of the label $\pi$ under the condition $B$: $P(\pi|B) = \lim \nu^{(\pi|B)}$, where $\nu^{(\pi|B)}$ are the relative frequencies of the label $\pi$ in the new collective. Noting that $\nu^{(\pi|B)} = \nu^{(\pi)}/\nu^{(B)}$, where $\nu^{(\pi)}$ is the relative frequency of the label $\pi$ in the collective $x$ and $\nu^{(B)}$ is the relative frequency of the event $B$ in the collective $x$, we obtain (using the continuity of the division operation)

$$P(\pi|B) = \lim \frac{\nu^{(\pi)}}{\nu^{(B)}} = \frac{\lim \nu^{(\pi)}}{\lim \nu^{(B)}} = \frac{P(\pi)}{P(B)}, \qquad P(B) \neq 0. \qquad (4)$$

The general formula can be proved similarly.

Proposition 3.5. $P(A|B) = P(A \wedge B)/P(B)$, $P(B) \neq 0$.

We now introduce the concept of independence of events. Analyzing the arguments in the book [16], one notes that the rule of multiplication of probabilities for independent events is equivalent to the continuity of the multiplication operation.
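The frequency identity behind the conditional-probability formula can be checked on a finite sample. The following is only an illustrative sketch: the sample, the label set, and the function names `relative_frequency` and `conditional_frequency` are assumptions made for the example and do not come from [16].

```python
from fractions import Fraction

def relative_frequency(sample, labels):
    """Exact rational frequency of the event 'label belongs to labels'."""
    hits = sum(1 for x in sample if x in labels)
    return Fraction(hits, len(sample))

def conditional_frequency(sample, label, condition):
    """Frequency of `label` in the sub-collective selected by `condition`."""
    selected = [x for x in sample if x in condition]
    return relative_frequency(selected, {label})

# Hypothetical finite piece of a collective with labels 1, 2, 3.
sample = [1, 2, 3, 1, 1, 2, 3, 3, 1, 2]
B = {1, 2}

lhs = conditional_frequency(sample, 1, B)
rhs = relative_frequency(sample, {1}) / relative_frequency(sample, B)
print(lhs, rhs)
```

Both numbers are computed as exact rationals, so the equality holds before any limit is taken; passing to the limit in a chosen topology is then a question of the continuity of division only.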

An important property that makes it possible to use p-adic probabilities when considering standard problems of probability theory is the p-adic interpretation of the probabilities zero and one (which are probabilities in the sense of ordinary probability theory).

Indeed, the equation $P(E) = 0$ in ordinary probability theory does not mean that the event $E$ is impossible. It merely means that in a long series of experiments the event $E$ occurs in a very small fraction of cases. However, in a large number of experiments this fraction can be relatively large. Moreover, the equation $P(E) = 0$ lumps together a huge class of events that intuitively appear to have different probabilities. For example, suppose we consider two events $E_1$ and $E_2$, and in the first

$$N = N_k = \Big(\sum_{j=0}^{k} 2^j\Big)^2 \qquad (5)$$

trials the event $E_1$ is realized $n^{(1)} = 2^k$ times and the event $E_2$ is realized

$$n^{(2)} = \sum_{j=0}^{k} 2^j \qquad (6)$$

times. It is intuitively clear that the probabilities of these events must be different. However, in real probability theory

$$P_1 = \lim n^{(1)}/N = P_2 = \lim n^{(2)}/N = 0. \qquad (7)$$

It is different in 2-adic probability theory. Stabilization in the 2-adic topology gives

$$P_1 = 0, \qquad P_2 = -1,$$

since in $\mathbf{Q}_2$ we have $2^k \to 0$, $k \to \infty$, and for $-1$ we have the representation $-1 = 1 + 2 + 2^2 + \cdots + 2^k + \cdots$. We here encounter for the first time negative numbers for probabilities of events (compare with Wigner [24], Dirac [25], Feynman [26]; see also [27], [28], [12]). Of course, these probabilities are forbidden by Kolmogorov's second axiom in ordinary probability theory (in von Mises' approach they are forbidden by the choice of the topology of statistical stabilization). However, from the point of view of the frequency theory of probability, $P = -1$ is only an ideal object, the symbol that denotes the limit of a sequence of relative frequencies. This symbol is in no way better and in no way worse than the symbol P = jix in ordinary probability theory.
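The two limits in this example can be checked numerically with exact rational arithmetic. The sketch below is illustrative only; the helper names `abs_2adic` and `frequencies` are assumptions introduced for the example. It prints the real values of the two relative frequencies, which both tend to 0, together with the 2-adic distances of the first frequency from 0 and of the second from -1.

```python
from fractions import Fraction

def abs_2adic(x: Fraction) -> Fraction:
    """Return the 2-adic absolute value |x|_2 = 2^(-r) of a rational x."""
    if x == 0:
        return Fraction(0)
    num, den, r = x.numerator, x.denominator, 0
    while num % 2 == 0:      # factors of 2 in the numerator raise r
        num //= 2
        r += 1
    while den % 2 == 0:      # factors of 2 in the denominator lower r
        den //= 2
        r -= 1
    return Fraction(1, 2) ** r

def frequencies(k: int):
    """Relative frequencies of E_1 and E_2 after N_k trials."""
    s = 2 ** (k + 1) - 1     # sum_{j=0}^{k} 2^j
    n_trials = s * s         # N_k
    return Fraction(2 ** k, n_trials), Fraction(s, n_trials)

for k in (2, 5, 10, 20):
    nu1, nu2 = frequencies(k)
    # columns: k, real value of nu1, real value of nu2, |nu1 - 0|_2, |nu2 - (-1)|_2
    print(k, float(nu1), float(nu2), abs_2adic(nu1), abs_2adic(nu2 + 1))
```

In this sketch both 2-adic distances are exact powers of 1/2 that shrink with k, while the real values of the two frequencies become indistinguishable from zero.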

In this example negative p-adic probabilities were used to split the zero conventional (real) probability. So p-adic negative probabilities can be interpreted as infinitely small conventional probabilities. It may be that all negative probabilities that appear in quantum physics can be interpreted in such a way. If the conventional (real) probability is equal to zero, there is no conventional (real) element of reality. However, there is a nonconventional (p-adic) element of reality that is realized with negative probability. Real and p-adic probabilities correspond to different classes of measurement procedures. An element of reality that would be impossible to observe by using a real measurement procedure might be observed by using a p-adic measurement procedure.

One can treat similarly the case of a probability (in the sense of the ordinary theory) equal to unity. For example, suppose

$$N = N_k = \Big(\sum_{j=0}^{k} 2^j\Big)^2, \qquad n^{(1)} = \Big(\sum_{j=0}^{k} 2^j\Big)^2 - 2^k, \qquad n^{(2)} = \Big(\sum_{j=0}^{k} 2^j\Big)^2 - \sum_{j=0}^{k} 2^j. \qquad (8)$$

In 2-adic probability theory we find that

$$P_1 = 1, \qquad P_2 = 1 - \Big(1 \Big/ \sum_{j=0}^{\infty} 2^j\Big) = 2. \qquad (9)$$

We see here that natural numbers not equal to unity also belong to the set $U_p$.

In this example p-adic (integer) probabilities larger than 1 were used to split the conventional (real) probability one. So under the p-adic consideration a conventional element of reality can be split into a few p-adic elements of reality.

In the framework of p-adic statistical stabilization there is also nothing seditious about complex probabilities. For example, let $p \equiv 1 \pmod 4$. Then $i = (-1)^{1/2} \in \mathbf{Q}_p$. Let

$$i = i_0 + i_1 p + i_2 p^2 + \cdots, \qquad i_r = 0, 1, \ldots, p - 1, \qquad (10)$$

be the canonical decomposition of the imaginary unit in powers of $p$. Note also that for any $p$

$$-1 = (p - 1) + (p - 1)p + (p - 1)p^2 + \cdots. \qquad (11)$$

Then for the rational relative frequencies we have

$$\nu_k = \frac{i_0 + i_1 p + \cdots + i_k p^k}{(p - 1) + (p - 1)p + \cdots + (p - 1)p^k} \to -i \qquad (12)$$

in the p-adic topology.

Geometrically, one may suppose that the new probability theory is a transition from one-dimensional probabilities on the interval $[0, 1]$ to multidimensional probabilities.
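The convergence of such frequencies to a square root of -1 can be explored numerically for p = 5. The sketch below is an illustration under stated assumptions: the digits of the imaginary unit are obtained by Newton (Hensel) lifting, the function names are hypothetical, and the limit is taken to be -i, which is what the quotient of the two expansions (numerator tending to i, denominator tending to -1) gives.

```python
from fractions import Fraction

def valuation(x: Fraction, p: int) -> int:
    """p-adic order r of a nonzero rational x, so that |x|_p = p^(-r)."""
    if x == 0:
        raise ValueError("the order of zero is infinite")
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r

def sqrt_minus_one(p: int, n_digits: int) -> int:
    """Integer i_0 + i_1 p + ... whose square is -1 modulo p**n_digits."""
    root = next(a for a in range(1, p) if (a * a + 1) % p == 0)
    modulus = p
    for _ in range(n_digits - 1):
        modulus *= p
        inverse = pow(2 * root, -1, modulus)   # modular inverse, Python 3.8+
        root = (root - (root * root + 1) * inverse) % modulus
    return root

p = 5
i_ref = sqrt_minus_one(p, 30)                  # 30-digit approximation of i
for k in range(1, 8):
    n_k = i_ref % p ** (k + 1)                 # i_0 + i_1 p + ... + i_k p^k
    N_k = p ** (k + 1) - 1                     # (p-1) + (p-1)p + ... + (p-1)p^k
    nu_k = Fraction(n_k, N_k)
    # the p-adic order of nu_k - (-i) grows with k, i.e. nu_k approaches -i
    print(k, float(nu_k), valuation(nu_k + i_ref, p))
```

The real values of the frequencies wander inside [0, 1], whereas the printed p-adic order of the difference from -i increases by at least one with each step.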

4 Probability distribution of a collective

Let $x = (x_1, \ldots, x_k, \ldots)$ be some collective and let $\Pi$ be the set of labels of this collective. We consider the simplest case, when the set $\Pi$ is finite: $\Pi = \{1, \ldots, S\}$. We denote by $\nu^{(j)}$ the relative frequency of the $j$th label and by $P_j = \lim \nu^{(j)}$ the corresponding probability. In the frequency theory the set of probabilities $P_x = (P_1, \ldots, P_S)$ is called the probability distribution of the collective $x$.


The general principle of statistical stabilization makes it possible to consider not only real distributions but also distributions over other number fields. For one and the same collective $x$ there can exist distributions over different number fields. Thus, in the proposed approach a collective has, in general, an entire spectrum of distributions $P_{x,t} = (P_{1,t}, \ldots, P_{S,t})$, where $t$ are the topologies of statistical stabilization for the given collective. Therefore one here studies a more subtle structure of the collective. The relative frequencies are investigated not only for real stabilization but for a complete spectrum of stabilizations.

In connection with the existence of an entire spectrum of probability distributions of a collective, it is necessary to make some comments.

First, this agrees well with von Mises' principle that the collective comes first and the probabilities after. Indeed, a probability distribution is an object derived from a collective, and to one and the same collective there corresponds an entire spectrum of probability distributions, these reflecting different properties of the collective.

Second, each statistical stabilization determines some physical property of the investigated object. For example, if in a statistical experiment involving the tossing of a coin the probability of heads is $P_1$ and of tails is $P_2$, then these probabilities are physical characteristics of the coin, like its mass or volume. This question is discussed in detail in the books of Poincaré [23] and von Mises [16].

If we consider the new principle of statistical stabilization from this point of view, we obtain new physical characteristics of the investigated objects. For example, if statistical stabilization is absent in the real topology, then it is not possible to obtain any physical constants in the language of ordinary probability theory. But these constants could exist and be, for example, p-adic numbers. If a collective has not only a real probability distribution but an entire spectrum of other distributions, then besides the real constants corresponding to physical properties of the investigated object we obtain an entire spectrum of new constants corresponding to physical properties that were hidden from the real statistics. Note that these new constants can also be ordinary rational numbers.

5 Model examples of p-adic statistics

5.1 Plantation with Red and White Flowers

As one of the first examples of a collective, von Mises considered [16] a plantation sown with flowers of different colors, and he studied the statistical stabilization of the relative frequencies of each of the colors. We shall construct an analogous collective for which p-adic stabilization always occurs but real stabilization is in general absent.

Suppose there are flowers of two types, red (R) and white (W). The plantation (or rather an infinite bed) is sown in a random order with red and white flowers, the flowers being sown in series formed by blocks of $p$ flowers, the length of the series (the power of $p$) being also determined in accordance with a random rule.

Namely, suppose there are two generators of random numbers: 1) $j = 0, 1$; 2) $i = 1, 2$ (each value with probability 0.5). If $j = 0$, then a series of red flowers is sown; if $j = 1$, then a series of white ones. The length of each series is defined as follows: the length of the first series is some power $p^{l_1}$ (it can also be determined in accordance with a random rule); if the length of the previous series was $p^{l_m}$, then the length of the next series is $p^{l_{m+1}}$, $l_{m+1} = l_m + i_m$.

We introduce the relative frequencies of the red and white flowers in the first $m$ series: $\nu_m^{(R)} = n_m^{(R)}/N_m$, $\nu_m^{(W)} = n_m^{(W)}/N_m$.

Proposition 5.1. For all generators of the random numbers $j$ and $i$ there is statistical stabilization of the relative frequencies $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the p-adic topology.

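Thus we have defined the p-adic probabilities $P_R = \lim \nu_m^{(R)}$ and $P_W = \lim \nu_m^{(W)}$, and

$$P_R = \Big(\sum_{n=1}^{\infty} (1 - j_n)\, p^{l_n}\Big) \Big/ \Big(\sum_{n=1}^{\infty} p^{l_n}\Big), \qquad P_W = \Big(\sum_{n=1}^{\infty} j_n\, p^{l_n}\Big) \Big/ \Big(\sum_{n=1}^{\infty} p^{l_n}\Big). \qquad (13)$$

Note that in general there is no real statistical stabilization for such a random plantation. If the generator of the random numbers $j$ gives series of 0 or 1, then $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the real topology can oscillate from zero to unity.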

Thus a real observer (an investigator who carries out statistical analysis of the sample in the field of real numbers) cannot obtain any statistically regular law

He will obtain only a random variation of the series of real relative frequencies. In contrast, the p-adic observer (the investigator who makes a statistical analysis of the sample in the field of p-adic numbers) will obtain a well-defined law consisting of the stabilization of the digits in the p-adic decomposition of the relative frequencies.

It is evident that in this example from probability theory we observe a new fundamental approach to the investigation of natural phenomena. In accordance with this approach, experimental results must be analyzed not only in the field of real numbers but also in p-adic fields.
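The contrast between the real and the p-adic observer can be reproduced with a small simulation of the plantation. This is a minimal sketch, not the program used for Appendix 2: the seed, the starting exponent 1, the use of Python's `random` module, and the names `simulate_plantation` and `valuation` are all assumptions made for the illustration.

```python
import random
from fractions import Fraction

def valuation(x: Fraction, p: int) -> int:
    """p-adic order of a nonzero rational x."""
    if x == 0:
        raise ValueError("the order of zero is infinite")
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r

def simulate_plantation(p: int, n_series: int, seed: int = 0):
    """Return the exact relative frequencies of red flowers after each series."""
    rng = random.Random(seed)
    exponent = 1                       # assumed exponent l_1 of the first series
    red = total = 0
    freqs = []
    for _ in range(n_series):
        j = rng.randint(0, 1)          # 0: red series, 1: white series
        i = rng.randint(1, 2)          # increment of the exponent
        length = p ** exponent
        if j == 0:
            red += length
        total += length
        freqs.append(Fraction(red, total))
        exponent += i
    return freqs

freqs = simulate_plantation(p=2, n_series=30)
for m in range(1, len(freqs)):
    diff = freqs[m] - freqs[m - 1]
    # number of leading 2-adic digits shared by consecutive frequencies
    stable = "all" if diff == 0 else valuation(diff, 2)
    print(m + 1, round(float(freqs[m]), 4), stable)
```

The second column shows the real value of the red frequency after each series, and the third shows how many initial 2-adic digits no longer change; since each new series is at least p times longer than the previous one, the latter count must grow with the number of series.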

Naturally our example is purely illustrative but it does appear to reflect many very important properties of p-adic statistics


Remark 5.1. Intuitively, one supposes that in a real plantation it is possible to find a white flower next to almost every red flower; in contrast, large groups (clusters) of red and white flowers are distributed randomly over a p-adic plantation (one can sow not only a bed but also distribute series of red and white flowers over a plane in accordance with a random rule). A real random plane is obtained if one throws red and white points at random onto the plane; in contrast, a p-adic random plane is obtained if one throws patches of $p^l$ points at a time, of red and white color, onto the plane.

In Appendix 2 we give the results of statistical analysis of the results of a random modeling on a computer of the proposed probability model There is very rapid p-adic stabilization of the relative frequencies and no stabilization in the sense of ordinary real probability theory

Remark 52 Evidently the structure of series formed by powers of p need not necessarily be directly observed in a statistical sample This structure is introduced by rounding the number of results to powers of p In very large statistical samples one can take into account only the orders of the numbers and one thereby introduces into the sample a 10-adic structure

5.2 Random Choice of the Digit of a p-Adic Number

Suppose there are two labels, 1 and 2; $j$ is a generator of random numbers corresponding to the choice of one of the labels. Each random label is produced in series, the length of the series being determined by random choice of the next p-adic digit; i.e., there is a generator of random numbers $a$ that takes the values $a = 0, 1, \ldots, p - 1$, and the length of the next series is $a_n p^{n-1}$, $n = 1, 2, \ldots$ We introduce the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$.

Proposition 5.2. For all generators of the random numbers $j$ and $a$ there is statistical stabilization of the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$ in the p-adic topology.

Thus, the following p-adic probabilities are defined:

$$P_1 = \Big(\sum_{n=1}^{\infty} (1 - j_n)\, a_n p^{n-1}\Big) \Big/ \Big(\sum_{n=1}^{\infty} a_n p^{n-1}\Big), \qquad P_2 = \Big(\sum_{n=1}^{\infty} j_n a_n p^{n-1}\Big) \Big/ \Big(\sum_{n=1}^{\infty} a_n p^{n-1}\Big). \qquad (14)$$

In the real topology there is in general no statistical stabilization.

Appendix 1

Every rational number $x \neq 0$ can be represented in the form

$$x = p^r \frac{m}{n},$$

where $p$ does not divide $m$ and $n$. Here $p$ is a fixed prime. The p-adic absolute value (norm) of the rational number $x$ is defined by the equations $|x|_p = p^{-r}$, $x \neq 0$; $|0|_p = 0$. This absolute value has the usual properties: 1) $|x|_p \geq 0$, $|x|_p = 0 \Leftrightarrow x = 0$; 2) $|xy|_p = |x|_p |y|_p$; and it satisfies the strong triangle inequality 3) $|x + y|_p \leq \max(|x|_p, |y|_p)$.

The completion of the field of rational numbers with respect to the metric $\rho_p(x, y) = |x - y|_p$ is called the field of p-adic numbers and is denoted by the symbol $\mathbf{Q}_p$. It is a locally compact field. Numbers in the unit ball $\mathbf{Z}_p = \{x \in \mathbf{Q}_p : |x|_p \leq 1\}$ of the field $\mathbf{Q}_p$ are called p-adic integers. From the strong triangle inequality we obtain a theorem which states that a series in the field $\mathbf{Q}_p$ converges if and only if its general term tends to zero. Any p-adic number can be represented in a unique manner in the form of a (convergent) series in powers of $p$:

$$x = \sum_{j=k}^{\infty} a_j p^j, \qquad a_j = 0, 1, \ldots, p - 1, \quad k = 0, \pm 1, \ldots, \qquad (15)$$

with $|x|_p = p^{-k}$ (for $a_k \neq 0$).
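For rational numbers, both the p-adic absolute value and the leading digits of the expansion in powers of p can be computed exactly. The following sketch is illustrative; the function names `valuation`, `padic_abs`, and `padic_digits` are assumptions made for the example, and digits are returned lowest order first.

```python
from fractions import Fraction

def valuation(x: Fraction, p: int) -> int:
    """Exponent r in x = p^r * m/n with p dividing neither m nor n (x nonzero)."""
    if x == 0:
        raise ValueError("the order of zero is infinite")
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return r

def padic_abs(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value |x|_p = p^(-r), with |0|_p = 0."""
    return Fraction(0) if x == 0 else Fraction(1, p) ** valuation(x, p)

def padic_digits(x: Fraction, p: int, n: int):
    """Return (k, [a_k, ..., a_{k+n-1}]) for x = sum_{j>=k} a_j p^j."""
    if x == 0:
        return 0, [0] * n
    k = valuation(x, p)
    u = x / Fraction(p) ** k                   # unit part: |u|_p = 1
    digits = []
    for _ in range(n):
        a = (u.numerator * pow(u.denominator, -1, p)) % p   # u mod p
        digits.append(a)
        u = (u - a) / p                        # shift to the next digit
    return k, digits

print(padic_digits(Fraction(-1), 2, 8))        # digits of -1 in Q_2
x, y = Fraction(3, 4), Fraction(5, 4)
print(padic_abs(x + y, 2), max(padic_abs(x, 2), padic_abs(y, 2)))
```

The first line prints the digits of -1 in the 2-adic expansion, which are all equal to 1, in agreement with the representation of -1 used in Section 3; the second illustrates the strong triangle inequality on one pair of rationals.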

One can define m-adic numbers similarly, where $m$ is any natural number, $m \geq 2$. In the general case property 2) is replaced by the weaker property $|xy|_m \leq |x|_m |y|_m$, i.e., $|x|_m$ is a pseudonorm. The completion of the field $\mathbf{Q}$ in the metric $\rho_m(x, y) = |x - y|_m$ will not be a field (for $m$ that are not prime). It is only a ring. Here we already encounter some deviations from the ordinary probability rules (which can be extended without any changes to p-adic probabilities). For example, one can have a situation of the following kind: $A$ and $B$ are independent events, $P(A) \neq 0$ and $P(B) \neq 0$, but $P(A \wedge B) = 0$. In particular, the conditional probability $P(A|B)$ is in general not defined for an event $B$ having nonvanishing probability.

Appendix 2

We give here the results of a random experiment (modeled on a computer) for a 2-adic plantation The results of this experiment give a good illustration of a situation in which there is no statistical stabilization in the real topology but there is statistical stabilization in the 2-adic topology In the following tables m is the number of a random experiment in which two random numbers are modeled one corresponding to the choice of a flower and the other to the length of the series of this flower d is the number of elements in the sample Because of the exponential growth of the number of elements in the series d increases very rapidly

The table of relative frequencies in the field of real numbers is

m     d          $\nu^{(W)}$   $\nu^{(R)}$
4     $10$       0.1304        0.8696
5     $10^2$     0.6364        0.3636
6     $10^3$     0.1913        0.8087
7     $10^3$     0.0504        0.9496
12    $10^5$     0.0006        0.9994
13    $10^5$     0.5335        0.4665
14    $10^6$     0.1703        0.8297
22    $10^9$     0.0022        0.9978
23    $10^{10}$  0.7453        0.2547

Thus for the relative frequencies in the field of real numbers there is no stabilization of even the first digit after the decimal point We examined large sequences of experiments on the computer in which the oscillations continued The calculations in the field Q2 give the results

N = 10

$\nu^{(W)}$ = 101011111011000000110100010111011000110011011110110001011
$\nu^{(R)}$ = 001100000100111111001011101000100111001100100001001110100

N = 20

$\nu^{(W)}$ = 10101111101100111011001100101111110000011100111000000001
$\nu^{(R)}$ = 00110000010011000100110011010000001111100011000111111110

N = 30

$\nu^{(W)}$ = 101011111011001110110011001111111100000000100110110000011
$\nu^{(R)}$ = 001100000100110001001100110000000011111111011001001111100

N = 40

$\nu^{(W)}$ = 101011111011001110110011001111111100000000010111001110100
$\nu^{(R)}$ = 001100000100110001001100110000000011111111101000110001011


Thus after ten random experiments 14 digits are stabilized in the 2-adic decomposition for the relative frequency of occurrence of a red flower and 14 digits for a white flower after 20 experiments the numbers of digits that are stabilized are 27 for both colors after 30 experiments 42 digits are stabilized for each and so forth

Appendix 3

We give the results of analysis of a statistical sample in the field of 5-adic numbers. Here $N$ is the number of random experiments, $M$ is the number of elements of the sample, $M_1$ is the number of elements of the first label, and $M_2$ is the number of elements of the second label.

N = 2, $M_1$ = 002, $M_2$ = 00002, $M$ = 00202

$M_1/M$ = 1044004400440044004400440044004400440044004400440044
$M_2/M$ = 0010440044004400440044004400440044004400440044004400

N = 3, $M_1$ = 002, $M_2$ = 000023, $M$ = 002023

$M_1/M$ = 1040303403420004404141041024440040303403420004404141
$M_2/M$ = 0014141041024440040303403420004404141041024440040303

N = 4, $M_1$ = 00200002, $M_2$ = 000023, $M$ = 00202302

$M_1/M$ = 1040303004000130020234341334320032124414032304024031
$M_2/M$ = 0014141440444314424210103110124412320030412140420413

N = 5, $M_1$ = 00200002, $M_2$ = 000023004, $M$ = 002023024

$M_1/M$ = 1040301040132010043322212441423102032221232032034142
$M_2/M$ = 0014143404312434401122232003021342412223212412410302

N = 6, $M_1$ = 00200002, $M_2$ = 00002300403, $M$ = 00202302403

$M_1/M$ = 1040301003131014113132222240403413222311230303113140
$M_2/M$ = 0014143441313430331312222204041031222133214141331304

N = 7, $M_1$ = 00200002, $M_2$ = 0000230040303, $M$ = 0020230240303

$M_1/M$ = 1040301003202004101343032004014023441101104433243020
$M_2/M$ = 0014143441242440343101412440430421003343340011201424

Thus, in the analysis of the sample in the field of 5-adic numbers there is rapid stabilization of the digits in the 5-adic decomposition of the relative frequencies. For example, after 55 experiments 78 digits in the 5-adic decomposition of the relative frequencies are stabilized.
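The first block of the 5-adic data (N = 2) can be recomputed directly, under the assumption that every entry is a string of 5-adic digits written lowest order first, so that 002 stands for 2·5^2 and 00002 for 2·5^4. The helper name `digit_string` is hypothetical, and the sketch only handles primes below 10 because each digit is printed as a single character.

```python
from fractions import Fraction

def digit_string(x: Fraction, p: int, n: int) -> str:
    """First n p-adic digits (lowest order first) of a rational with |x|_p <= 1."""
    modulus = p ** n
    # The reduced denominator is coprime to p, so it is invertible mod p^n.
    residue = x.numerator * pow(x.denominator, -1, modulus) % modulus
    digits = []
    for _ in range(n):
        digits.append(str(residue % p))
        residue //= p
    return "".join(digits)

p = 5
M1 = 2 * p ** 2            # digit string 002
M2 = 2 * p ** 4            # digit string 00002
M = M1 + M2                # digit string 00202

print(digit_string(Fraction(M1, M), p, 20))
print(digit_string(Fraction(M2, M), p, 20))
```

Under this reading, the leading digits produced for the two ratios can be compared directly with the first digits of the N = 2 rows of the data above.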

When the sample is analyzed in the field of real numbers there is again no statistical stabilization

Acknowledgements

I would like to thank L Ballentine and J Summhammer for discussions on p-adic probabilities and elements of physical reality

References

1. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).
2. P. S. Alexandrov, Introduction to general theory of sets and functions (Gostehizdat, Moscow, 1948).
3. R. Engelking, General Topology (PWN, Warszawa, 1977).
4. A. Yu. Khrennikov, Dokl. Akad. Nauk 322, 1075-1079 (1992).
5. A. Yu. Khrennikov, J. of Math. Phys. 32, 932-937 (1991).
6. V. S. Vladimirov, I. V. Volovich, and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1994).
7. Yu. Manin, Springer Lecture Notes in Math. 1111, 59-101 (1985).
8. P. G. O. Freund and E. Witten, Phys. Lett. B 199, 191-195 (1987).
9. A. Yu. Khrennikov, Non-Archimedean Analysis: Quantum Paradoxes, Dynamical Systems and Biological Models (Kluwer Academic Publ., Dordrecht, 1997).
10. S. Albeverio, A. Yu. Khrennikov, and R. Cianci, J. Phys. A: Math. and Gen. 30, 881-889 (1997).
11. A. Yu. Khrennikov, J. of Math. Physics 39, 1388-1402 (1998).
12. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
13. Z. I. Borevich and I. R. Shafarevich, Number Theory (Academic Press, New York, 1966).
14. W. Schikhof, Ultrametric calculus (Cambridge Univ. Press, Cambridge, 1984).
15. R. von Mises, Math. Z. 5, 52-99 (1919).
16. R. von Mises, Probability, Statistics and Truth (Macmillan, London, 1957).
17. A. N. Kolmogorov, Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
18. H. Cramér, Mathematical theory of statistics (Univ. Press, Princeton, 1949).
19. I. V. Volovich, Number Theory as the Ultimate Physical Theory, Preprint CERN, Geneva, TH. 4781/87 (1987).
20. E. Borel, Rend. Circ. Mat. Palermo 27, 247 (1909).
21. M. Fréchet, Recherches théoriques modernes sur la théorie des probabilités (Univ. Press, Paris, 1937-1938).
22. A. Ya. Khinchin, Voprosi Filosofii, No. 1, 92; No. 2, 77 (1961) (in Russian).
23. H. Poincaré, About Science. Collection of works (Nauka, Moscow, 1983).
24. E. Wigner, Quantum-mechanical distribution functions revisited, in Perspectives in quantum theory, W. Yourgrau and A. van der Merwe, editors (MIT Press, Cambridge, MA, 1971).
25. P. A. M. Dirac, Proc. Roy. Soc. London A 180, 1-39 (1942).
26. R. P. Feynman, Negative probability, in Quantum Implications: Essays in Honour of David Bohm, 235-246, B. J. Hiley and F. D. Peat, editors (Routledge and Kegan Paul, London, 1987).
27. W. Mückenheim, Phys. Reports 133, 338-401 (1986).
28. A. Yu. Khrennikov, Int. J. Theor. Phys. 34, 2423-2434 (1995).


COMPLEMENTARITY OR SCHIZOPHRENIA: IS PROBABILITY IN QUANTUM MECHANICS INFORMATION OR ONTA?

A. F. KRACKLAUER
E-mail: kracklau@fossi.uni-weimar.de

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM. The fundamental assumption underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM. In addition, the interplay of this paradigm with Bell's no-go theorem for local realistic extensions of QM will be analyzed.

1 Introduction

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. All other diffractable waves, it may be said, carry momentum-energy, not conceptual, abstract information or ideas. All other probabilities are calculational aids and, like abstractions generally, are utterly unaffected by material boundaries. The literature is replete with resolutions of QM conundrums that selectively ignore one or the other of these characteristics; in the end, they all fail.

Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM [1]. The fundamental concept underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM [2]. Among the successes of SED, one is a local realistic explanation of the diffraction of particle beams [3]. The core of this explanation is the notion that relative motion through the SED background effectively engenders de Broglie's pilot wave. Given such a pilot wave associated with a particle's motion, the statistical distribution of momentum in a density over phase space can be decomposed, in the sense of Fourier analysis, such that the resulting form of Liouville's Equation, under some conditions, is Schrödinger's Equation.

From this viewpoint, the schizophrenic character of wave functions can be discussed and understood free of preternatural attributes. These concepts have broad implications, from serious philosophical questions such as the mind-body dichotomy, through teleportation, to popular science fiction effects. In addition, the peculiar nature of probability in QM is clarified.

Although much remains to be done to comprehensively interpret all of QM in terms of SED, many of the by now hoary paradoxes can be rationally deconstructed.

A secondary (but intimately related) issue is that of determining the import of Bell's Theorem for the use of the SED paradigm to reconcile fully the interpretation of QM. Arguments will be presented showing that in his proof Bell (essentially by misconstruing the use of conditional probabilities) called on inappropriate hypothetical presumptions, just as Hermann, de Broglie, Bohm and others found that von Neumann did before him [4, 5].

2 De Broglie waves as an SED effect

The foundation of the model, or conceptual paradigm, for the mechanism of particle diffraction proposed herein is Stochastic Electrodynamics (SED). Most of SED, for which there exists a substantial literature, is not crucial for the issue at hand [1]. The crux of SED can be characterized as the logical inversion of QM in the following sense. If QM is taken as a valid theory, then ultimately one concludes that there exists a finite ground state for the free electromagnetic field with energy per mode given by

$$E = \hbar\omega/2. \tag{1}$$

SED, on the other hand, inverts this logic and axiomatically posits the existence of a random electromagnetic background field with this same spectral energy distribution, and then endeavors to show that ultimately a consequence of the existence of such a background is that physical systems exhibit the behavior otherwise codified by QM. The motivation for SED proponents is to find an intuitive local realistic interpretation for QM, hopefully to resolve the well known philosophical and lexical problems as well as to inspire new attacks on other problems.


The question of the origin of this electromagnetic background is, of course, fundamental. In the historical development of SED, its existence has been posited as an operational hypothesis whose justification rests a posteriori on results. Nevertheless, lurking on the fringes from the beginning has been the idea that this background is the result of self-consistent interaction; i.e., the background arises out of interactions from all other electromagnetic charges in the universe [6].

For present purposes, all that is needed is the hypothesis that particles, as systems with charge structure (not necessarily with a net charge), are in equilibrium with electromagnetic signals in the background. Consider, for example, as a prototype system a dipole with characteristic frequency $\omega$. Equilibrium for such a system in its rest frame can be expressed as

$$m_0 c^2 = \hbar\omega_0. \tag{2}$$

This statement is actually tautological, as it just defines $\omega_0$, for which an exact numerical value will turn out to be practically immaterial.

This equilibrium in each degree of freedom is achieved in the particle's rest frame by interaction with counter-propagating electromagnetic background signals in both polarization modes separately, which on the average add to give a standing wave with an antinode at the particle's position:

$$2\cos(k_0 x)\sin(\omega_0 t). \tag{3}$$

Again, this is essentially a tautological statement, as a particle doesn't see signals with nodes at its location, thereby leaving only the others. Of course, everything is to be understood in an on-the-average, statistical sense.

Now consider Eq. (3) in a translating frame, in particular the rest frame of a slit through which the particle, as a member of a beam ensemble, passes. In such a frame the component signals under a Lorentz transform are Doppler shifted and then add together to give what appears as modulated waves:

$$2\cos(k_0\gamma(x - c\beta t))\,\sin(\omega_0\gamma(t - c^{-1}\beta x)), \tag{4}$$

for which the second, the modulation factor, has wave length $\lambda = (\gamma\beta k_0)^{-1}$. From the Lorentz transform of Eq. (2), $P = \hbar\gamma\beta k_0$, the factor $\gamma\beta k_0$ can be identified as the de Broglie wave vector from QM as expressed in the slit frame.
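The identification of the modulation wave vector with the de Broglie wave vector can be checked numerically. The following is a minimal sketch, assuming that the background signals satisfy $\omega_0 = c k_0$ (so that Eq. (2) gives $k_0 = m_0 c/\hbar$); the function names and the choice of an electron at $10^6$ m/s are illustrative only and do not come from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299792458.0          # speed of light, m/s
M_E = 9.1093837015e-31   # electron rest mass, kg (illustrative particle)


def modulation_wavevector(m0, v):
    """gamma*beta*k0, with k0 = m0*c/hbar (assumes omega0 = c*k0 in Eq. (2))."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    k0 = m0 * C / HBAR
    return gamma * beta * k0


def de_broglie_wavevector(m0, v):
    """p/hbar with the relativistic momentum p = gamma*m0*v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * m0 * v / HBAR


v = 1.0e6  # m/s, hypothetical beam speed
k_mod = modulation_wavevector(M_E, v)
k_db = de_broglie_wavevector(M_E, v)
print(k_mod, k_db)
```

Under these assumptions the two wave vectors coincide up to rounding, which is all that the identification in the text requires.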

In short, it is seen that a particle's de Broglie wave is modulation on what the orthodox theory designates Zitterbewegung. The modulation wave effectively functions as a pilot wave. Unlike de Broglie's original conception, in which the pilot wave emanates from the kernel, here this pilot wave is a kinematic effect of the particle interacting with the SED Background. Because this SED Background is classical electromagnetic radiation, it will diffract according to the usual laws of optics and thereafter modify the trajectory of the particle with which it is in equilibrium [3]. (See Ref. [1], Section 12.3, for a didactical elaboration of these concepts.)

The detailed mechanism for pilot wave steerage is based on observing that the energy pattern of the actual signal that pilot waves are modulating, and to which a particle tunes, comprises a fence or rake-like structure with prongs of varying average heights specified by the pilot wave modulation. These prongs, in turn, can be considered as forming the boundaries of energy wells in which particles are trapped: a series of micro-Paul-traps, as it were. Intuitively, it is clear that where such traps are deepest, particles will tend to be captured and dwell the longest. The exact mechanism moving and restraining particles is radiation pressure, but not as given by the modulation; rather, by the carrier signal itself. Of course, because these signals are stochastic, well boundaries are bobbing up and down somewhat, so that any given particle, with whatever energy it has, will tend to migrate back and forth into neighboring cells as boundary fluctuations permit. Where the wells are very shallow, however, particles are laterally (in a diffraction setup, say) unconstrained; they tend to vacate such regions and therefore have a low probability of being found there.

The observable consequence of the constraints imposed on the motion of particles is a microscopic effect, which can be made manifest only in the observation of many similar systems. For illustration, consider an ensemble of similar particles comprising a beam passing through a slit. Let us assume that these particles are very close to equilibrium with the background; that is, that any effects due to the slit can be considered as slight perturbations on the systematic motion of the beam members.

Given this assumption, each member of the ensemble, with index $n$ say, will with a certain probability have a given amount of kinetic energy $E_n$ associated with each degree of freedom. Of special interest here is the direction perpendicular to both the beam and the slit, in which, by virtue of the assumed state of near equilibrium with the background, we can take the distribution with respect to energy of the members of the ensemble to be given in the usual way by the Boltzmann factor $e^{-\beta E}$, where $\beta$ is the reciprocal of the product of the Boltzmann constant $k$ and the temperature $T$ in kelvin. The temperature in this case is that of the electromagnetic background serving as a thermal bath for the beam particles, with which it is in near equilibrium.

Now, the relative probability of finding any given particle (i.e., with energy $E_{n,j}$ or $E_{n,k}$ or ...) trapped in a particular well will be, according to elementary probability, proportional to the sum of the probabilities of finding particles with energy less than the well depth $d$:

$$\sum_{E_n < d} e^{-\beta E_n} \approx \int_0^{d} e^{-\beta E}\,dE = \frac{1}{\beta}\left(1 - e^{-\beta d}\right), \tag{5}$$

where approximating the sum with an integral is tantamount to the recognition that the number of energy levels, if not a priori continuous, is large with respect to the well depth.

If now $d$ in Eq. (5) is expressed as a function of position, we get the probability density as a function of position. For example, for a diffraction pattern from a single slit of width $a$ at distance $D$, the intensity (essentially the energy density) as a function of lateral position is $E_0\sin^2(\theta)/\theta^2$, where $\theta = k_{\text{pilot wave}}(a/D)\,y$, and the probability of occurrence $P(\theta(y))$ as a function of position would be

$$P(y) \propto \left(1 - e^{-\beta E_0 \sin^2(\theta)/\theta^2}\right). \tag{6}$$

Whenever the exponent in Eq. (6) is significantly less than one, its rhs is very accurately approximated by the exponent itself, so that one obtains the standard and verified result that the probability of occurrence, $P(y) = \psi^{*}\psi$ in conventional QM, is proportional to the intensity of a particle's de Broglie (pilot) wave.
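The small-exponent approximation invoked here can be illustrated with a short numerical sketch. The value $\beta E_0 = 0.01$ and the sample angles are hypothetical, chosen only to make the exponent much smaller than one; the function names are likewise illustrative.

```python
import math


def sinc_squared(theta):
    """sin^2(theta)/theta^2, with the limiting value 1 at theta = 0."""
    return 1.0 if theta == 0.0 else (math.sin(theta) / theta) ** 2


def well_probability(theta, beta_e0):
    """Right-hand side of Eq. (6): 1 - exp(-beta*E0*sin^2(theta)/theta^2)."""
    return -math.expm1(-beta_e0 * sinc_squared(theta))


def wave_intensity(theta, beta_e0):
    """The exponent itself, proportional to the pure wave diffraction intensity."""
    return beta_e0 * sinc_squared(theta)


beta_e0 = 0.01                      # hypothetical value, much less than one
thetas = [0.0, 0.5, 1.0, 2.0, 4.0]  # sample lateral positions (radians)
max_rel_err = max(
    abs(well_probability(t, beta_e0) - wave_intensity(t, beta_e0))
    / wave_intensity(t, beta_e0)
    for t in thetas
)
print(max_rel_err)
```

For these inputs the relative deviation between Eq. (6) and the wave intensity stays below one percent; it grows as $\beta E_0$ approaches one, which is where the two curves would begin to separate.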

3 Schrodinger Equation

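A consequence of the attachment of a de Broglie pilot wave to each particle is that there exists a Fourier kernel of the following form:

$$e^{-i2\mathbf{p}\cdot\mathbf{x}'/\hbar}, \tag{7}$$

which can be used to decompose the density function of an ensemble of similar particles. Consider an ensemble governed by the Liouville Equation:

$$\frac{\partial\rho}{\partial t} = -\nabla\rho\cdot\frac{\mathbf{p}}{m} + (\nabla_p\rho)\cdot\mathbf{F}. \tag{8}$$

Now decompose $\rho(\mathbf{x}, \mathbf{p})$ with respect to $\mathbf{p}$ using the de Broglie-Fourier kernel,

$$\hat{\rho}(\mathbf{x}, \mathbf{x}', t) = \int e^{-i2\mathbf{p}\cdot\mathbf{x}'/\hbar}\,\rho(\mathbf{x}, \mathbf{p}, t)\,d\mathbf{p}, \tag{9}$$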

Figure 1: A simulated single slit neutron diffraction pattern (relative intensity versus lateral displacement $\theta$ in radians; curves: particle beam, radiation, and Chi(y)-squared (x50)), showing the closeness of the fit of Eq. (6) to the pure wave diffraction pattern. See Ref. [3] for details.

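to transform the Liouville Equation into

$$\frac{\partial\hat{\rho}}{\partial t} = \frac{\hbar}{i2m}\,[\ldots] \tag{10}$$

To solve, separate variables using

$$\mathbf{r} = \mathbf{x} + \mathbf{x}', \qquad \mathbf{r}' = \mathbf{x} - \mathbf{x}', \tag{11}$$

to get

$$[\ldots] \tag{12}$$

which can (sometimes) be separated by writing

$$\hat{\rho}(\mathbf{r}, \mathbf{r}') = \psi(\mathbf{r}')\,\psi^{*}(\mathbf{r}), \tag{13}$$

to get Schrödinger's Equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \tag{14}$$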

4 Conclusions

Within this paradigm, Quantum Mechanics is incomplete, as surmised by Einstein, Podolsky and Rosen [7]. It is built on the basis of the Liouville Equation while taking a particular stochastic background into account. The conceptual function of Probability in QM is just as in Statistical Mechanics. Measurement reduces ignorance; it does not precipitate reality. Of course, measurement also disturbs the measured system, but this presents no more fundamental problems than it does in classical physics. Heisenberg uncertainty, on the other hand, is seen to be caused simply by the incessant dynamical perturbation from background signals. In so far as the source of background signals cannot be isolated, this source of uncertainty is intrinsic, but not fundamentally novel. For these reasons duality is superfluous. Particles have the same ontological status as in classical physics. Individual particles in a beam pass through one or the other slit in a Young double slit experiment, for example, while their de Broglie piloting waves pass through both slits. Beyond the slit, the particles are induced stochastically to track the nodes of their pilot waves, so that a diffraction pattern is built up mimicking the intensity of the pilot wave.

From within this paradigm, the now infamously paradoxical situations illustrating various problems with the interpretation of QM never arise, or are resolved with elementary reasoning. In particular, wave functions are not vested with an ambiguous nature.

The SED Paradigm also clarifies the appearance of interference among probabilities. Numerous analysts, from various viewpoints, have discovered the fact that Probability Theory admits structure (used by QM) that goes unexploited in traditional applications. (E.g., see Gudder, Summhammer; this volume.) While each of these approaches provides deep and surprising insights, none really offers any explanation of why and how nature exploits this structure. Just as a certain second order hyperbolic partial differential equation becomes the wave equation as a physics statement only with the introduction, e.g., of Hooke's Law, so this extra probability structure can be made into physics only with an analogue to Hooke's Law.

SED provides that analogue for particle behavior with its model of pilot wave guidance. In this model, radiation pressure is responsible for particle guidance [3]. Radiation pressure is proportional to the square of EM fields, i.e., the intensity (in this case, of the background field as modified by objects in the environment), which is not additive. Rather, the field amplitudes are additive, and interference arises in the way well understood in classical EM. In other words, QM interference is a manifestation of EM interference. The relevant Hooke's Law analogue is the phenomenon of radiation pressure. For radiation, this is all intimately related, of course, to classical coherence theory as applied to square law photoelectron detectors, which, when properly applied, resolves many QM conundrums, including those instigated by Bell's Theorem surrounding EPR correlations.

Appendix: Bell's Theorem

The interpretation or paradigm described herein conflicts with the conclusions of Bell's no-go theorem, according to which a local realistic extension of QM should conform with certain constraints that have been shown empirically to be false. To be sure, this paradigm does not deliver the hidden variables for exploitation in calculations, but it does indicate to which features in the universe they pertain, namely all other charges. The character of these hidden variables is dictated by the fact that they are distinguished only in that they pertain to particles distant from the system of particular interest; thus, internal consistency requires that they be local and realistic [8].

The basic proof

Bell's Theorem purports to establish certain limitations on coincidence probabilities of spin or polarization measurements, as calculated using QM, if they are to have an underlying deterministic but still local and realistic basis describable by extra, as yet hidden, variables $\lambda$ distributed with a density $\rho(\lambda)$. These limitations take the form of inequalities which measurable coincidences must respect. The extraction of one of these inequalities, where the input assumptions are enumerated as Bell made them, proceeds as follows.

Bell's fundamental Ansatz consists of the following equation:

$$P(a, b) = \int d\lambda\,\rho(\lambda)\,A(a, \lambda)\,B(b, \lambda), \tag{15}$$

where, per explicit assumption, $A$ is not a function of $b$, nor $B$ of $a$. This he motivated on the grounds that a measurement at station A, if it respects locality, cannot depend on remote conditions such as the settings of a distant measuring device, i.e., $b$. In addition, each by definition satisfies

$$|A| \le 1, \qquad |B| \le 1. \tag{16}$$

Eq. (15) expresses the fact that when the hidden variables are integrated out, the usual results from QM are recovered.

The extraction proceeds by considering the difference of two such coincidence probabilities where the parameters of one measuring station differ:

$$P(a, b) - P(a, b') = \int d\lambda\,\rho(\lambda)\,[A(a, \lambda)B(b, \lambda) - A(a, \lambda)B(b', \lambda)], \tag{17}$$

to which zero, in the form

$$\pm A(a, \lambda)B(b, \lambda)A(a', \lambda)B(b', \lambda) \mp A(a, \lambda)B(b', \lambda)A(a', \lambda)B(b, \lambda), \tag{18}$$

is added to get

$$P(a, b) - P(a, b') = \int d\lambda\,\rho(\lambda)\,A(a, \lambda)B(b, \lambda)\,(1 \pm A(a', \lambda)B(b', \lambda)) - \int d\lambda\,\rho(\lambda)\,A(a, \lambda)B(b', \lambda)\,(1 \pm A(a', \lambda)B(b, \lambda)), \tag{19}$$

which, upon taking absolute values, Bell wrote as

$$|P(a, b) - P(a, b')| \le \int d\lambda\,\rho(\lambda)\,(1 \pm A(a', \lambda)B(b', \lambda)) + \int d\lambda\,\rho(\lambda)\,(1 \pm A(a', \lambda)B(b, \lambda)). \tag{20}$$

Then, using the Ansatz of Eq. (15) and the normalization $\int d\lambda\,\rho(\lambda) = 1$, one gets

$$|P(a, b) - P(a, b')| + |P(a', b') + P(a', b)| \le 2, \tag{21}$$

a Bell inequality [9].

Now, if the QM result for these coincidences, namely $P(a, b) = -\cos(2\theta)$, is put in Eq. (21), it will be found that for $\theta = \pi/8$ the lhs of Eq. (21) becomes $2\sqrt{2}$. Experiments verify this result [10]. Why the discrepancy? According to Bell, it must have been induced by demanding locality, as all else he took to be harmless.
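The value $2\sqrt{2}$ can be reproduced with a few lines of arithmetic. The sketch below assumes that $\theta$ in $P(a, b) = -\cos(2\theta)$ is the difference between the two analyzer settings, and it uses one conventional choice of settings spaced by $\pi/8$; both the function names and the specific settings are illustrative rather than taken from the text.

```python
import math


def qm_correlation(a, b):
    """QM coincidence correlation, assuming theta = a - b: P(a, b) = -cos(2*theta)."""
    return -math.cos(2.0 * (a - b))


def bell_lhs(a, a_prime, b, b_prime):
    """Left-hand side of Eq. (21)."""
    return (abs(qm_correlation(a, b) - qm_correlation(a, b_prime))
            + abs(qm_correlation(a_prime, b_prime) + qm_correlation(a_prime, b)))


step = math.pi / 8  # adjacent settings differ by pi/8
lhs = bell_lhs(a=0.0, a_prime=2 * step, b=step, b_prime=3 * step)
print(lhs, 2 * math.sqrt(2))
```

With these settings the left-hand side of Eq. (21) evaluates to $2\sqrt{2}$, which exceeds the bound of 2.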


Critiques

Although Bell's analysis is denoted a theorem, in fact there can be no such thing in Physics: the axiomatic base on which to base a theorem consists of those fundamental theories which the whole enterprise is endeavoring to reveal. Moreover, buried in all mathematics pertaining to the physical world are numerous unarticulated assumptions, some of which are exposed below.

The analytical character of dichotomic functions

In motivating his discussion of the extraction of inequalities, Bell considered the measurement of spin using Stern-Gerlach magnets, or polarization measurements of photons. In both cases, single measurements can be seen as individual terms in a symmetric dichotomic series, i.e., having the values $\pm 1$. It is therefore natural to ask if the correlation computed using QM, $P(a, b) = -\cos(2\theta)$, and verified empirically, can be the correlation of dichotomic functions. It is easy to show that it cannot; consider

$$-\cos(2\theta) = k\int P(x - \theta)\,P(x)\,dx, \tag{22}$$

where $\rho(\lambda)$ is $k/2\pi$ and where the $P$'s are dichotomic functions. Now take the derivative with respect to $\theta$ to get

$$2\sin(2\theta) = \int \sum_j \delta(x - \theta_j)\,P(x)\,dx = \sum_j P(\theta_j) = k, \tag{23}$$

and again,

$$4\cos(2\theta) = 0, \tag{24}$$

which is false. QED.

Some authors (see, e.g., Aerts, this volume) employ a parameterized dichotomic function to represent measurements. Such a function can be dichotomic in the argument but continuous in the parameter, e.g., of the form $D(\sin(t) - x)$, for which then the correlation is taken to be of the form

$$\mathrm{Corr}(t) = \int_{-\pi}^{\pi} D(x - \sin(2t))\,D(x)\,dx. \tag{25}$$

However, this approach seems misguided. First, it assumes that the argument of Corr, $t$, can be identical to the parameter of the dichotomic function $D_t(x)$, rather than the offset in the argument, here $x$, as befitting a correlation. Moreover, the same sort of consistency test applied above also results in contradictions; therefore, such parameterized functions do not constitute counterexamples invalidating the claim that discontinuous functions cannot have a harmonic correlation. At best, this tactic implicitly results in the correlation of the measurement functions with respect to the continuous parameter $t$, which is interpreted as the weight or frequency of the dichotomic value. This tactic, however, does not conform with Bell's analysis, in which the dichotomic values are to be correlated; rather, it corresponds with the type of model proposed below, without however recognizing Malus' Law as the source of the weights.

Conclusion: There is a fundamental error in Bell's analysis; the QM result is at irreconcilable odds with the conventional understanding of his arguments [11].
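The statement of Eqs. (22)-(24), that a correlation of dichotomic functions cannot be harmonic, can be illustrated for one particular dichotomic function. The sketch below picks a square wave of period $\pi$ (the sign of $\cos 2x$) as a hypothetical $P(x)$ and averages $-P(x - \theta)P(x)$ over one period with a midpoint rule; neither the choice of function nor the normalization comes from the text, and a single example is of course not a proof.

```python
import math


def dichotomic(x):
    """Hypothetical dichotomic function: square wave of period pi with values +1/-1."""
    return 1.0 if math.cos(2.0 * x) >= 0.0 else -1.0


def correlation(theta, n=20000):
    """Midpoint-rule average of -P(x - theta) * P(x) over one period [0, pi)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += dichotomic(x - theta) * dichotomic(x)
    return -total / n


for theta in (0.0, math.pi / 8, math.pi / 4, math.pi / 2):
    print(round(theta, 4), round(correlation(theta), 4), round(-math.cos(2 * theta), 4))
```

For this square wave the correlation is piecewise linear in $\theta$: it agrees with $-\cos(2\theta)$ at $\theta = 0$, $\pi/4$ and $\pi/2$, but at $\theta = \pi/8$ it gives about $-0.5$ rather than $-\cos(\pi/4) \approx -0.707$.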

This can be revealed alternately, following Sica, by considering four dichotomic sequences (with values $\pm 1$ and length $N$) $a$, $a'$, $b$ and $b'$, and the following two quantities: $a_i b_i + a_i b'_i = a_i(b_i + b'_i)$ and $a'_i b_i - a'_i b'_i = a'_i(b_i - b'_i)$. Sum these expressions over $i$, divide by $N$, and take absolute values before adding together to get

$$\left|\frac{1}{N}\sum_i^N a_i b_i + \frac{1}{N}\sum_i^N a_i b'_i\right| + \left|\frac{1}{N}\sum_i^N a'_i b_i - \frac{1}{N}\sum_i^N a'_i b'_i\right| \le \frac{1}{N}\sum_i^N |a_i|\,|b_i + b'_i| + \frac{1}{N}\sum_i^N |a'_i|\,|b_i - b'_i|. \tag{26}$$

The rhs equals 2, so this is a Bell Inequality. Conclusion: this Bell Inequality is an arithmetic identity for dichotomic sequences; there is no need to postulate locality in order to extract it [12].
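That Eq. (26) holds for arbitrary dichotomic sequences, with a right-hand side of exactly 2, can be checked directly. The following sketch draws four independent pseudo-random $\pm 1$ sequences; the sequence length, the seed and the function name are arbitrary illustrative choices.

```python
import random


def sica_sides(a, a_p, b, b_p):
    """Return (lhs, rhs) of Eq. (26) for four equal-length sequences of +1/-1."""
    n = len(a)
    lhs = (abs(sum(x * (y + z) for x, y, z in zip(a, b, b_p))) / n
           + abs(sum(x * (y - z) for x, y, z in zip(a_p, b, b_p))) / n)
    # For values +1/-1, exactly one of |b + b'| and |b - b'| is 2 and the other is 0.
    rhs = (sum(abs(x) * abs(y + z) for x, y, z in zip(a, b, b_p))
           + sum(abs(x) * abs(y - z) for x, y, z in zip(a_p, b, b_p))) / n
    return lhs, rhs


rng = random.Random(0)  # fixed seed, for reproducibility only
n = 1000
a, a_p, b, b_p = ([rng.choice((-1, 1)) for _ in range(n)] for _ in range(4))
lhs, rhs = sica_sides(a, a_p, b, b_p)
print(lhs, rhs)
```

No assumption about how the sequences were generated enters the check; the bound follows from the arithmetic of $\pm 1$ values alone.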

Discrete vice continuous variables

By implication, Bell considered discrete variables, for which the correlation would be

$$\mathrm{Cor}(a, b) = \frac{1}{N}\sum_i^N A_i(a)\,B_i(b). \tag{27}$$

But experiments measure the number of hits per unit time, given $a$, $b$, and then compute the correlation; each event is a density, not a single pair. The data taken in experiments corresponds to the read-out for Malus' Law, not the generation of dichotomic sequences for which each term represents an event consisting of a pair of photons with anticorrelated polarization or a particle pair with anticorrelated spins. This discrepancy is ignored in the standard renditions of Bell's analysis. It is, however, serious and suggests a different tack.

Consider, following Barut, a model for which the spin axes of pairs of particles have random but totally anticorrelated instantaneous orientation: $\mathbf{S}_1 = -\mathbf{S}_2$ [13]. Each particle then is directed through a Stern-Gerlach magnetic field with orientation $\mathbf{a}$ and $\mathbf{b}$. The observable in each case then would be $A = \mathbf{S}_1\cdot\mathbf{a}$ and $B = \mathbf{S}_2\cdot\mathbf{b}$. Now, by standard theory,

$$\mathrm{Cor}(A, B) = \frac{\langle AB\rangle - \langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{28}$$

where the angle brackets indicate averages over the range of the variables. This becomes

$$\mathrm{Cor}(A, B) = -\frac{\int d\gamma\,\sin(\gamma)\cos(\gamma - \theta)\cos(\gamma)}{\sqrt{\left(\int d\gamma\,\sin(\gamma)\cos^2(\gamma)\right)^2}}, \tag{29}$$

which evaluates to $-\cos(\theta)$, i.e., the QM result for spin state correlation. Conclusion: this model, essentially a counterexample to Bell's analysis, shows that continuous functions (vice dichotomic) work. It is more than just natural to ask: where do the gremlins reside in Bell's analysis? There are at least two.
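The evaluation of Eq. (29) can be checked by simple quadrature. The sketch below assumes that $\gamma$ runs over $[0, \pi]$ (the limits are not legible in the text) and that the anticorrelation $\mathbf{S}_1 = -\mathbf{S}_2$ supplies the overall minus sign; the midpoint-rule integrator and the function names are illustrative.

```python
import math


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule quadrature of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def barut_correlation(theta):
    """Eq. (29), assuming gamma ranges over [0, pi]."""
    num = integrate(lambda g: math.sin(g) * math.cos(g - theta) * math.cos(g),
                    0.0, math.pi)
    den = integrate(lambda g: math.sin(g) * math.cos(g) ** 2, 0.0, math.pi)
    return -num / math.sqrt(den ** 2)


for theta in (0.0, 0.5, 1.0, 2.0):
    print(theta, barut_correlation(theta), -math.cos(theta))
```

Under these assumptions the numerator integrates to $(2/3)\cos\theta$ and the denominator to $2/3$, so the ratio reproduces $-\cos(\theta)$ to within the quadrature error.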

One has to do with the following covert hypothesis. Bell's proof seems to pertain to continuous variables, in that the demand is only that $|A|$ ($|B|$) $\le 1$. This argument, however, silently also assumes that the averages $\langle A\rangle = \langle B\rangle = 0$. It enters in the derivation of a Bell inequality, where the second term in the numerator of Eq. (28) is ignored as if it is always zero. When it is not zero, Bell inequalities become, e.g.,

$$|P(a, b) - P(a, b')| + |P(a', b') + P(a', b)| \le 2 + \frac{2\langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{30}$$

which opens up a broader category of non-quantum models. A second covert gremlin, having broader significance, is discussed below.

Are nonlocal correlations essential?

The demand that, in spite of the introduction of hidden variables $\lambda$, a probability $P(a, b)$ averaged over these extra variables reduce to currently used QM expressions implies that

$$P(a, b) = \int P(a, b, \lambda)\,d\lambda. \tag{31}$$

By basic probability theory, the integrand in this equation is to be decomposed in terms of individual detections in each arm according to Bayes' formula,

$$P(a, b, \lambda) = P(\lambda)\,P(a|\lambda)\,P(b|a, \lambda), \tag{32}$$

where $P(a|\lambda)$ is a conditional probability. In turn, the integrand above can be converted to the integrand of Bell's Ansatz,

$$P(a, b) = \int A(a, \lambda)\,B(b, \lambda)\,\rho(\lambda)\,d\lambda,$$

iff

$$P(b|a, \lambda) = P(b|\lambda) \quad \forall a. \tag{33}$$

This equation admits, it seems, two interpretations:

(i) When this equation is true, the ratio of occurrence of outcomes at station B must be statistically independent of the outcomes at A. Therefore, as the hidden variables $\lambda$ are extra and do not duplicate $a$ and $b$, even if the correlation is considered to be encoded by $\lambda$, it will not be available to an observer. But the correlation by hypothesis does exist and is to be detectable via the $a$'s and $b$'s; therefore this equation cannot hold. Thus, within this interpretation, Bell's Ansatz is not internally consistent.

(ii) Alternately, if the $a$ on the lhs is superfluous, so is $b$, so that $P = P(\lambda) = 0$ except at one value of $\lambda$, where it equals 1; that is, it is a Dirac delta function. The correlation is then totally encoded by the hidden variables, as follows if a sufficient number of new variables are introduced to render everything deterministic, as often assumed. Consequently, individual products of probabilities at the separate stations, i.e., the $AB$'s in Bell's notation, become Dirac delta functions of $\lambda$. If everything is deterministic, then there can be no overlap of the non-zero values of pairs of probabilities for a given value of $\lambda$, and therefore, in the extraction of a Bell inequality, all quadruple products of $P$'s with pair-wise different values of $\lambda$ in Eq. (19) are identically zero, so that the final form of a Bell inequality is the trivial identity

$$|P(a, b) - P(a, b')| \le 2. \tag{34}$$

In either case, locality is not to be so employed as to exclude correlations generated at the conception of the spin-particle or photon pairs, i.e., common causes. The non-existence of instantaneous communication cannot impose a restraint here; it must bear no relationship to the validity of Eq. (33).

In addition, Eq. (34) reconciles Barut's continuous variable model with Bell's analysis.

Bell-Kochen-Specker Theorem

Besides Bell's original theorem, there is another set of no-go theorems ostensibly prohibiting a local realistic extension for QM. In contrast to the theorem analyzed above, they do not make explicit use of locality; rather, they use certain properties (falsely, it turns out) of angular momentum (spin). In general, the proof of these theorems proceeds as follows. The system of interest is described as being in a state $|\psi\rangle$ specified by observables $A, B, C, \ldots$ A hidden variable theory is then taken to be a mapping $v$ of observables to numerical values $v(A), v(B), v(C), \ldots$ Use is then made of the fact that if a set of operators all commute, then any function of these operators $f(A, B, C, \ldots) = 0$ will also be satisfied by their eigenvalues: $f(v(A), v(B), v(C), \ldots) = 0$.

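The proof of a Kochen-Specker Theorem proceeds by displaying a contradiction; consider, e.g., two spin-1/2 particles, for which the nine separate mutually commuting operators can be arranged in the following 3 by 3 matrix:

$$\begin{matrix} \sigma_x^1 & \sigma_x^2 & \sigma_x^1\sigma_x^2 \\ \sigma_y^2 & \sigma_y^1 & \sigma_y^1\sigma_y^2 \\ \sigma_x^1\sigma_y^2 & \sigma_x^2\sigma_y^1 & \sigma_z^1\sigma_z^2 \end{matrix} \tag{35}$$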

It is then a little exercise in bookkeeping to verify that any assignment of plus and minus ones for each of the factors in each element of this matrix results in a contradiction; namely, the product of all these operators formed row-wise is plus one, and the same product formed column-wise is minus one [14].
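The bookkeeping can be delegated to a brute-force enumeration. The sketch below treats the nine entries of a 3 by 3 array simply as nine slots, each assigned $+1$ or $-1$ (an illustrative simplification: it does not represent the operators themselves), and counts the assignments for which the overall product formed row-wise is $+1$ while the overall product formed column-wise is $-1$.

```python
from itertools import product


def count_consistent_assignments():
    """Count +1/-1 assignments to a 3x3 array whose overall row-wise product
    is +1 while its overall column-wise product is -1."""
    count = 0
    for values in product((-1, 1), repeat=9):
        rows = [values[0:3], values[3:6], values[6:9]]
        row_wise = 1
        for r in rows:
            row_wise *= r[0] * r[1] * r[2]
        col_wise = 1
        for j in range(3):
            col_wise *= rows[0][j] * rows[1][j] * rows[2][j]
        if row_wise == 1 and col_wise == -1:
            count += 1
    return count


n_consistent = count_consistent_assignments()
print(n_consistent)
```

Both products multiply the same nine numbers, so they are always equal and none of the 512 assignments can satisfy both requirements; the enumeration merely makes that explicit.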

Now recall that, given a uniform static magnetic field $B$ in the z-direction, the Hamiltonian is $H = \frac{e}{mc}B\sigma_z$, for which the time-dependent solution of the Schrödinger equation is

$$\psi(t) = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{-i\omega t} \\ e^{+i\omega t} \end{pmatrix},$$

and this in turn gives time-dependent expectation values for spin values in the x, y directions [15]:

$$\langle\sigma_x\rangle = \frac{\hbar}{2}\cos(\omega t), \qquad \langle\sigma_y\rangle = \frac{\hbar}{2}\sin(\omega t), \tag{36}$$

where $\omega = eB/mc$.

Proof of a Bell-Kochen-Specker theorem depends on simultaneously assigning the eigenvalues $\pm 1$ to $\sigma_x$, $\sigma_y$ and $\sigma_z$ as measurables for each particle. (With some effort, for all other proofs of this theorem one can find an equivalent assumption.) However, as Barut [13] observed, and as can be seen in Eq. (36), if the eigenvalues $\pm 1$ are realizable measurement results in the B-field direction, then in the other two directions the expectation values oscillate out of phase and therefore cannot be simultaneously equal to $\pm 1$. Thus, this variation of a Bell theorem also is defective physics.

A local model for EPR (polarization) Correlations

The following model incorporates the features of polarization correlations without preternatural aspects or the concept of photon. The basic assumption is that the source emits oppositely directed, anticorrelated classical electromagnetic signals:

$$\mathbf{E}_A = \hat{x}\cos(\nu) + \hat{y}\sin(\nu), \qquad \mathbf{E}_B = -\hat{x}\sin(\nu + \theta) + \hat{y}\cos(\nu + \theta), \tag{37}$$

where factors of the form $\exp(i(\omega t + \mathbf{k}\cdot\mathbf{x} + \xi(t)))$, where $\xi(t)$ is a random variable, are dropped as they are suppressed by averaging [16]. Now, the random variables with physical significance emerging in the detectors, per Malus' Law, are $E_{A,B}$. It is the detectors that digitize the data and create the illusion of photons. But, because Maxwell's Equations are not linear in intensities, rather in the fields, a fourth order field correlation is required to calculate the cross correlation of the intensity:

$$P(a, b) = \kappa\langle(\mathbf{A}\cdot\mathbf{B})(\mathbf{B}\cdot\mathbf{A})\rangle, \tag{38}$$

where brackets indicate averages over space-time. (This appears to be the source of entanglement in QM, which is seen to have no basis beyond that found in classical physics.) Here Eq. (38) turns out to be

$$P(+, +) \propto \kappa\int_0^{2\pi}\left(\cos(\nu)\sin(\nu + \theta) - \sin(\nu)\cos(\nu + \theta)\right)^2 d\nu, \tag{39}$$

which gives $P(+, +) = P(-, -) \propto \kappa\sin^2(\theta)$ and $P(-, +) = P(+, -) \propto \kappa\cos^2(\theta)$. The constant $\kappa$ can be eliminated by computing the ratio of particular events to the total sample space, which here includes coincident detections in all four combinations of detectors averaged over all possible displacement angles $\theta$; thus, the denominator is

$$\frac{2\kappa}{\pi}\int_0^{\pi}\left(\sin^2(\theta) + \cos^2(\theta)\right)d\theta = 2\kappa, \tag{40}$$

so that the ratio becomes

$$P(+, +) = \frac{1}{2}\sin^2(\theta), \tag{41}$$

the QM result. This in turn yields the correlation

$$\mathrm{Cor}(a, b) = \frac{P(+, +) + P(-, -) - P(+, -) - P(-, +)}{P(+, +) + P(-, -) + P(+, -) + P(-, +)} = -\cos(2\theta). \tag{42}$$
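The chain from Eq. (39) to Eq. (42) can be traced numerically. In the sketch below the average over $\nu$ is taken over $[0, 2\pi)$, which is an assumption, and the integrand used for the unlike-sign coincidences is a hypothetical choice made so as to reproduce the stated $\cos^2(\theta)$ dependence; the text itself gives only the like-sign integrand.

```python
import math


def coincidence_weights(theta, n=720):
    """Average over nu of the squared field projections.
    'same' uses the integrand of Eq. (39); 'opposite' is an assumed analogue
    chosen to yield the cos^2(theta) dependence stated in the text."""
    same = 0.0
    opposite = 0.0
    for i in range(n):
        nu = 2.0 * math.pi * (i + 0.5) / n
        same += (math.cos(nu) * math.sin(nu + theta)
                 - math.sin(nu) * math.cos(nu + theta)) ** 2
        opposite += (math.cos(nu) * math.cos(nu + theta)
                     + math.sin(nu) * math.sin(nu + theta)) ** 2
    return same / n, opposite / n


def correlation(theta):
    """Eq. (42) built from the four coincidence weights; kappa cancels."""
    same, opposite = coincidence_weights(theta)
    p_pp = p_mm = same
    p_pm = p_mp = opposite
    return (p_pp + p_mm - p_pm - p_mp) / (p_pp + p_mm + p_pm + p_mp)


for theta in (0.0, math.pi / 8, 0.7, 1.3):
    print(theta, correlation(theta), -math.cos(2 * theta))
```

Because the integrand of Eq. (39) reduces to $\sin^2(\theta)$ independently of $\nu$, the average is trivial, and the ratio in Eq. (42) reproduces $-\cos(2\theta)$.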

If the fundamental assumptions involved in this local realistic model are valid, then there would be observable consequences. For example, if radiation on the other side of a photodetector is continuous and not comprised of photons, then photoelectrons are evoked independently in each detector by continuous but (anti)correlated radiation. Thus, the density of photoelectron pairs should be linearly proportional (barring effects caused by limited coherence) to the coincidence window width. On the other hand, if photons are in fact generated in matched pairs at the source, then at very low intensities the detection rate should be relatively insensitive to the coincidence window width once it is wide enough to capture both electrons.

References

1. L. de la Peña and A. M. Cetto, The Quantum Dice (Kluwer, Dordrecht, 1996).
2. A. F. Kracklauer, An Intuitive Paradigm for Quantum Mechanics, Physics Essays 5 (2), 226 (1992).
3. A. F. Kracklauer, Found. Phys. Lett. 12 (5), 441 (1999).
4. G. Hermann, Die Naturphilosophischen Grundlagen der Quantenmechanik, Abhandlungen der Friesschen Schule 6, 75-152 (1935).
5. D. Bohm, Causality and Chance in Modern Physics (Routledge & Kegan Paul Ltd., London, 1957).
6. H. Puthoff, Phys. Rev. A 40, 4857 (1989); 44, 3385 (1991).
7. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
8. J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge University Press, Cambridge, 1987).
9. J. S. Bell, in Foundations of Quantum Mechanics, Proceedings of the International School of Physics Enrico Fermi, course IL (Academic, New York, 1971), p. 171-181; reprinted in Ref. [8].
10. A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum, New York, 1999), review theory and experiments from a current perspective.
11. A. F. Kracklauer, in New Developments on Fundamental Problems in Quantum Mechanics, M. Ferrero and A. van der Merwe (eds.) (Kluwer, Dordrecht, 1997), p. 185.
12. L. Sica, Opt. Commun. 170, 55-60 & 61-66 (1999).
13. A. O. Barut, Found. Phys. 22 (1), 137 (1992).
14. N. D. Mermin, Rev. Mod. Phys. 65 (3), 803 (1993).
15. R. H. Dicke and J. P. Wittke, Introduction to Quantum Mechanics (Addison-Wesley, Reading, 1960), p. 195.
16. A. F. Kracklauer, in Instantaneous Action-at-a-Distance in Modern Physics, A. E. Chubykalo, V. Pope and R. Smirnov-Rueda (eds.) (Nova Science, Commack, NY, 1999), p. 379; arXiv:quant-ph/0007101; Ann. Fond. L. de Broglie 20 (2), 193 (2000).

A PROBABILISTIC INEQUALITY FOR THE KOCHEN-SPECKER PARADOX

JAN-ÅKE LARSSON
Matematiska Institutionen, Linköpings Universitet, SE-581 83 Linköping, Sweden
E-mail: jalar@mai.liu.se

A probabilistic version of the Kochen-Specker paradox is presented. The paradox is restated in the form of an inequality relating probabilities from a non-contextual hidden-variable model, by formulating the concept of "probabilistic contextuality". This enables an experimental test for contextuality at low experimental error rates. Using the assumption of independent errors, an explicit error bound of 0.71% is derived, below which a Kochen-Specker contradiction occurs.

1 Introduction

The description of quantum-mechanical (QM) processes by hidden variables is a subject being actively researched at present. The interest can be traced to topics where recent improvements in technology have made testing and using QM processes possible. Research in this field is usually intended to provide insight into whether, how, and why QM processes are different from classical processes. Here, the presentation will be restricted to the question of whether or not it is possible to describe a certain QM system using a non-contextual hidden-variable model. A non-contextual hidden-variable model would be a model where the result of a specific measurement does not depend on the context, i.e., on what other measurements are simultaneously performed on the system. It is already known that for perfect measurements (perfect alignment, no measurement errors) no non-contextual model exists. These results originate in the work of Gleason [1], but a conceptually simpler proof was given by Kochen and Specker [2] (KS).

The KS theorem concerns measurements on a QM system consisting of a spin-1 particle. In the QM description of this system, the operators associated with measurement of the spin components along orthogonal directions do not commute, i.e.,

$$\hat{S}_x,\ \hat{S}_y \text{ and } \hat{S}_z \text{ do not commute;} \qquad (1)$$

however, the operators that are associated with measurement of the square of the spin components do commute, i.e.,

$$\hat{S}_x^2,\ \hat{S}_y^2 \text{ and } \hat{S}_z^2 \text{ commute.} \qquad (2)$$

The latter operators (the squared ones) have the eigenvalues 0 and 1, and

$$\hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 = 2I. \qquad (3)$$

Thus, it is possible to simultaneously measure the square of the spin components along three orthogonal vectors, and two of the results will be 1 while the third will be 0. Only this QM property of the system will be used in what follows.
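The three operator statements above can be checked numerically. The sketch below is an illustration only: it assumes the standard spin-1 matrices in the $S_z$ eigenbasis with $\hbar = 1$ (a choice of representation not made in the text) and uses NumPy.

```python
import numpy as np

# Standard spin-1 matrices in the S_z eigenbasis, hbar = 1 (assumed representation).
s = 1.0 / np.sqrt(2.0)
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Sz = np.diag([1.0, 0.0, -1.0]).astype(complex)

def commutator(a, b):
    """Return the commutator [a, b] = ab - ba."""
    return a @ b - b @ a

squares = [S @ S for S in (Sx, Sy, Sz)]

if __name__ == "__main__":
    # The spin components themselves do not commute.
    print(np.allclose(commutator(Sx, Sy), 0))                      # False
    # The squared components commute pairwise.
    print(all(np.allclose(commutator(a, b), 0)
              for a in squares for b in squares))                  # True
    # Their sum is twice the identity.
    print(np.allclose(sum(squares), 2 * np.eye(3)))                # True
    # Each squared component has eigenvalues 0, 1, 1.
    print([np.round(np.linalg.eigvalsh(sq), 12) for sq in squares])
```

Because the squared operators commute and sum to $2I$, any joint measurement returns two results equal to 1 and one equal to 0, which is the only property used in the argument.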

The notation used from now on is intended to avoid confusion with QM notation, since the notions used will be those of (Kolmogorovian) probability theory, not QM. A hidden-variable model will be taken to be a probabilistic model, i.e., the hidden variable $\lambda$ is represented as a point in a probabilistic space $\Lambda$, and sets in this space (events) have a probability given by the probability measure $P$. The measurement results are described by random variables (RVs) $X_i(\lambda)$, which take their values in the value space $\{0,1\}$.

These mappings will depend not only on the hidden variable $\lambda$, but also on the specific directions in which we choose to measure the squared spin components, so that we would have

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda): \Lambda \to \{0,1\},$$
$$X_2(\mathbf{x},\mathbf{y},\mathbf{z},\lambda): \Lambda \to \{0,1\}, \qquad (4)$$
$$X_3(\mathbf{x},\mathbf{y},\mathbf{z},\lambda): \Lambda \to \{0,1\}.$$

Here, $X_1$ is the result of the measurement along the first direction ($\mathbf{x}$), $X_2$ along the second ($\mathbf{y}$), and $X_3$ along the third ($\mathbf{z}$). To be able to model the spin-1 system described above, these RVs would need to sum to two, i.e.,

$$\sum_{i=1}^{3} X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2. \qquad (5)$$

This is in itself no guarantee that the model will be accurate, but it is the least one would expect from a hidden-variable model yielding the QM behaviour.

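In simple experimental setups, there is usually only one direction specified (the direction along which the spin component squared is measured). Thus, we would expect that $X_1$ only depends on $\mathbf{x}$ (and $\lambda$). This is referred to as non-contextuality, and more formally this can be written as

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda),$$
$$X_2(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_2(\mathbf{x}',\mathbf{y},\mathbf{z}',\lambda), \qquad (6)$$
$$X_3(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_3(\mathbf{x}',\mathbf{y}',\mathbf{z},\lambda).$$

These two prerequisites are all that is needed to arrive at the Kochen-Specker paradox.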

2 The Kochen-Specker theorem

A more appropriate name for this section is perhaps "A Kochen-Specker theorem", since there are several variants; the example presented here is from Peres (1993) [3]. All variants aim for the same thing: to show a contradiction by assigning values to measurement results coming from a non-contextual hidden-variable model. In this particular one [3], a set of 33 three-dimensional vectors is used, depicted in Fig. 1.

Figure 1: The 33 vectors used in the Kochen-Specker theorem. The vectors are from the center of the cube onto one of the spots on the cube's surface (normalized if desired).

The proof is as follows: assume that we have a non-contextual hidden-variable model. Then, for any $\lambda$ (except perhaps for a null set), this model satisfies equations (5) and (6), in particular for the directions in Fig. 1. Now look at Fig. 2(a). The measurement result along one of the coordinate axes must be 0, and along the other axes it must be 1. Let us assume that the 0 is obtained from the measurement along the z axis (the white spot on the cube), and the other two measurements yield 1 (black spots). Measurements along other directions in the xy-plane must also yield 1, as indicated in Fig. 2(a). In Fig. 2(b-d), three more similar choices are made, and having made these assignments, a white spot must be added at the position indicated in Fig. 2(e) because of the two black spots at orthogonal positions, and by this another black spot must be added, being orthogonal to the white one. This procedure continues in Fig. 2(f-j), until all the spots are painted either white or black as necessitated by the previously painted spots. Finally, in Fig. 2(k), we have three black orthogonal spots, violating equation (5), the condition of QM results. A similar contradiction will occur whatever choices we make in our assignments in Fig. 2(a-d), and we have a proof of the KS theorem. We thus have the following theorem.

Footnote: the white and black spots were green and red in Peres [3].

Figure 2: A proof of the Kochen-Specker paradox. Panels (a)-(d): arbitrary choice. Panels (e)-(j): orthogonality. Panel (k): contradiction.

Theorem 1 (Kochen-Specker) The following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. Measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}): \Lambda \to \{0,1\}, \quad i = 1, 2, 3.$$

(ii) Non-contextuality. The result along a vector is not changed by rotation around that vector. For example,

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda).$$

(iii) Quantum-mechanical results. For any triad, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2.$$
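The colouring argument can also be run as an exhaustive search. The sketch below is an illustration under stated assumptions, not the construction used in the text: it assumes that the 33 rays are those obtained from the vectors (0,0,1), (0,1,1), (0,1,√2) and (1,1,√2) by permuting components and changing signs, and it encodes the rules as "no two orthogonal rays both receive 0" and "no orthogonal triad receives three 1s". All function names are hypothetical.

```python
import itertools
import math

def peres_rays():
    """One representative vector per ray; a vector and its negative are one ray."""
    s = math.sqrt(2.0)
    rays = set()
    for base in [(0, 0, 1), (0, 1, 1), (0, 1, s), (1, 1, s)]:
        for perm in itertools.permutations(base):
            for signs in itertools.product((1, -1), repeat=3):
                v = [float(c * sg) for c, sg in zip(perm, signs)]
                first = next(c for c in v if c != 0.0)
                if first < 0:                      # fix the overall sign
                    v = [-c for c in v]
                rays.add(tuple(round(c, 9) + 0.0 for c in v))
    return sorted(rays)

def orthogonal(u, v, tol=1e-6):
    return abs(sum(a * b for a, b in zip(u, v))) < tol

def orthogonal_triads(rays):
    """All index triples of mutually orthogonal rays."""
    return [t for t in itertools.combinations(range(len(rays)), 3)
            if all(orthogonal(rays[a], rays[b])
                   for a, b in itertools.combinations(t, 2))]

def find_coloring(rays):
    """Backtracking search for a 0/1 assignment obeying both rules; None if impossible."""
    n = len(rays)
    triads = orthogonal_triads(rays)
    nbrs = [[j for j in range(n) if j != i and orthogonal(rays[i], rays[j])]
            for i in range(n)]
    triads_of = [[t for t in triads if i in t] for i in range(n)]
    order = []                                     # visit rays triad by triad
    for t in triads:
        order += [k for k in t if k not in order]
    order += [k for k in range(n) if k not in order]
    value = [None] * n

    def consistent(i):
        if value[i] == 0:                          # two orthogonal rays cannot both be 0
            return all(value[j] != 0 for j in nbrs[i])
        # value 1: no triad may consist of three 1s
        return not any(all(value[k] == 1 for k in t) for t in triads_of[i])

    def extend(pos):
        if pos == n:
            return True
        i = order[pos]
        for v in (0, 1):
            value[i] = v
            if consistent(i) and extend(pos + 1):
                return True
        value[i] = None
        return False

    return list(value) if extend(0) else None

if __name__ == "__main__":
    rays = peres_rays()
    print(len(rays), len(orthogonal_triads(rays)))
    print(find_coloring(rays))                     # None means no assignment exists
    print(find_coloring([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # a single triad is colourable
```

A single triad admits an assignment, whereas the search over the full set is expected to return `None`, mirroring the contradiction reached in Fig. 2(k).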

Note that there is a certain structure to the proof: assignment of measurement results on a finite number of orthogonal triads according to the QM rule, and rotations connecting the measurement results on different triads by non-contextuality. This structure can be made explicit in the statement of the theorem by introducing the set $E_{KS}$ (a "KS set of triads"):

[Equation (7), which defines the set $E_{KS}$, is illegible in the source.] (7)

In this set there are $n$ vectors forming $N$ distinct orthogonal triads, where some vectors are present in more than one triad, establishing in total $M$ connections by rotation around a vector. Using this notation, (a restricted version of) the KS theorem is

Theorem 1 (Kochen-Specker) Given a KS set of vector triads $E_{KS}$, the following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}): \Lambda \to \{0,1\}, \quad i = 1, 2, 3.$$

(ii) Non-contextuality. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the result along that vector is not changed by the rotation. For example,

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda).$$

(iii) Quantum-mechanical results. For any triad in $E_{KS}$, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2.$$

This version of the KS theorem will be useful when formulating a probabilistic version of the theorem

3 The Kochen-Specker inequality

The above discussion is valid in an ideal situation where no measurement errors are present. When measurement errors are introduced, they occur as (i) missing detections, (ii) changes in the results along the axis vector when rotating, or (iii) deviations from the sum 2. Since the prerequisites of Theorem 1 are no longer valid, neither is the theorem. However, using probabilistic notions, the theorem can be restated as follows.

Theorem 2 (Kochen-Specker inequality) Given a KS set $E_{KS}$ of $N$ vector triads with $M$ interconnections by rotation, if we have

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}): \Lambda_X \to \{0,1\}, \quad i = 1, 2, 3,$$

where $\Lambda_X$ is a (possibly proper) subset of $\Lambda$.

(ii) Rotation error bound. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the set of $\lambda$s where the result along that vector is not changed by the rotation is probabilistically large (has probability at least $1-\delta$). For example,

$$P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\bigr) \ge 1 - \delta.$$

(iii) Sum error bound. For any triad in $E_{KS}$, the set of $\lambda$s where the sum of the results is two is probabilistically large (has probability at least $1-\varepsilon$), i.e.,

$$P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) \ge 1 - \varepsilon.$$

Then

$$M\delta + N\varepsilon \ge 1.$$

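To shorten the proof, the following symmetry of the measurement results is assumed to hold (the proof goes through without the symmetry, but grows notably in size):

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_2(\mathbf{z},\mathbf{x},\mathbf{y},\lambda) = X_3(\mathbf{y},\mathbf{z},\mathbf{x},\lambda). \qquad (8)$$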

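Proof. By Theorem 1, we have

$$\Bigl(\bigcap_{M} \{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\Bigr) \cap \Bigl(\bigcap_{N} \Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) = \emptyset.$$

Then the complement has probability one, and

$$1 = P\Bigl(\Bigl(\bigcup_{M} \{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}^{c}\Bigr) \cup \Bigl(\bigcup_{N} \Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}^{c}\Bigr)\Bigr)$$
$$\le \sum_{M} P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}^{c}\bigr) + \sum_{N} P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}^{c}\Bigr) \qquad (9)$$
$$\le M\delta + N\varepsilon.$$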

Here, the probability in (iii) is to be read as the probability of obtaining results for all three $X_i$ and that the sum is two. In other words, it is possible to avoid using the no-enhancement assumption in Theorem 2, but unfortunately inefficient detector devices would contribute no-detection events to both the error rates $\delta$ and $\varepsilon$, which puts a rather high demand on experimental equipment. While the no-enhancement assumption can be used in inefficient setups, this may weaken the statement (cf. a similar argument for the GHZ paradox [4]).

The error rate $\varepsilon$ is the probability of getting an error in the sum (both non-detections and the wrong sum are errors here), not the probability of getting an error in an individual result. This makes it easy to extract $\varepsilon$ from experimental data, but unfortunately the errors that arise in rotation are not available in the experimental data, so it is not possible to estimate the size of $\delta$ (note that it is not even meaningful to discuss $\delta$ in QM). It is possible to use $\varepsilon$ to obtain a bound for $\delta$.

Corollary 3 (Kochen-Specker inequality) Given a KS set of $N$ vector triads $E_{KS}$ with $M$ interconnections by rotation, if Theorem 2 (i-iii) hold, then

$$\delta \ge \frac{1 - N\varepsilon}{M}.$$

Obviously, a small $E_{KS}$ set (small $N$ and $M$) is better, yielding a higher bound for $\delta$ for a given $\varepsilon$ (for a few different KS sets, see [2, 3, 5]).

In an inexact experiment yielding a large $\varepsilon$, one expects the error rate $\delta$ to be large as well, whereas the bound in Corollary 3 will be low because of the large $\varepsilon$. A model for this inexact experiment may then be said to be probabilistically non-contextual: the measurement error rate is large enough to allow the changes arising in rotation to be explained as natural errors in the inexact measurement device, rather than being fundamentally contextual. For a good experiment yielding a low $\varepsilon$, one expects $\delta$ to be low, but here the bound in Corollary 3 is higher. In a hidden-variable model of this experiment, the changes arising in rotation occur at an unexpectedly high rate which cannot be explained as due to measurement errors, and a model of this type may be said to be probabilistically contextual. Note that this probabilistic non-contextuality is a weaker notion than the one used in Theorem 1 (ii).
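The trade-off described above can be made concrete by rearranging the inequality $M\delta + N\varepsilon \ge 1$ of Theorem 2 into a lower bound on $\delta$. The helper below is a minimal sketch; its name, the clipping at zero, and the sample counts $N = 24$, $M = 31$ (the values the text quotes later for its example set) are choices made here for illustration.

```python
def rotation_error_lower_bound(n_triads, n_connections, eps):
    """Lower bound on delta implied by M*delta + N*eps >= 1.

    n_triads is N, n_connections is M, eps is the sum error rate.
    A non-positive bound carries no information, so it is clipped to 0.
    """
    return max(0.0, (1.0 - n_triads * eps) / n_connections)

if __name__ == "__main__":
    for eps in (0.0, 0.01, 0.03, 0.05):
        # The bound shrinks as eps grows and vanishes once N*eps >= 1.
        print(eps, rotation_error_lower_bound(24, 31, eps))
```

A large $\varepsilon$ drives the bound to zero (probabilistically non-contextual models are not excluded), while a small $\varepsilon$ forces $\delta$ to be at least of order $1/M$.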

4 Independence

To enable a general statement, the proof of Theorem 2 does not make any assumptions on independence of the errors, but it is possible to give a more quantitative bound for the error rate by introducing independence (for simplicity, at 100% detector efficiency).

Corollary 4 (KS inequality for independent errors) Assuming that the errors are independent at the rate $r$, and that Theorem 2 (i-iii) hold, then both $\delta$ and $\varepsilon$ are given by $r$, and

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) \ge 1.$$

Proof. In the case of independent errors at the rate $r$, the expressions for the probabilities in Theorem 2 (ii) and (iii) are

$$P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\bigr) = P(\text{no errors}) + P(\text{flip on both } X_1\text{'s}) = (1-r)^2 + r^2 = 1 - (2r - 2r^2), \qquad (10)$$

$$P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) = P(\text{no errors}) + P(\text{flip of the 0 and one 1}) = (1-r)^3 + 2(1-r)r^2 = 1 - (3r - 5r^2 + 3r^3). \qquad (11)$$

The probabilities of these sets are not independent, so from this point on we cannot use independence. The inequality above then follows easily from Theorem 2.

An expression of the form $r \ge f(N, M)$ can now be derived from Corollary 4, but this complicated expression is not central to the present paper. One important observation is that, again, to obtain a contradiction for high error rates ($r$), a small $E_{KS}$ set is needed (small $N$ and $M$). Unfortunately, the error rate needs to be very low; e.g., for the $E_{KS}$ in the present example (see the note following this paragraph), only an error rate $r$ below 0.71% yields a contradiction in Corollary 4. Please note that there is no experimental check of whether the assumption of independent errors holds or not. While the errors in the sum may be possible to check, it is not possible to extract what errors are present in the rotations, or to check for independence of those errors (further discussion of independence is necessary but cannot be fit into this limited space).

Note: The set contains 33 vectors forming 16 distinct orthonormal bases [3], but some rotations used are not between two of these 16 bases; in some cases, a rotation goes from one of the 16 bases to a pair of vectors in the set (where the third needed to form a basis is not in the set), and a subsequent rotation returns us to another of the 16 bases. Thus, in the notation adopted here, a few extra vectors are needed to form $E_{KS}$, yielding $n = 41$, $N = 24$ and $M = 31$. Note that these additional vectors are not needed to yield the KS contradiction, but are only needed in the proof of the inequality in this paper. A more detailed analysis for the initial set of 33 vectors is possible, probably yielding a contradiction at a somewhat higher $r$ than the one obtained from this general analysis, but this is lengthy and will not be done here.
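The threshold implied by Corollary 4 need not be written in closed form; it can be located numerically. The sketch below is an illustration only: it evaluates the left-hand side $M(2r - 2r^2) + N(3r - 5r^2 + 3r^3)$ and finds, by bisection, the rate at which it reaches 1. The function names and the bisection interval are assumptions made here. No numerical value is asserted in the comments, since the result depends on the counts supplied and on the exact expressions used, and it need not coincide with the figure quoted in the text.

```python
def independent_error_lhs(r, n_triads, n_connections):
    """Left-hand side of the Corollary 4 inequality at error rate r."""
    delta = 2 * r - 2 * r ** 2               # 1 - [(1-r)^2 + r^2]
    eps = 3 * r - 5 * r ** 2 + 3 * r ** 3    # 1 - [(1-r)^3 + 2(1-r)r^2]
    return n_connections * delta + n_triads * eps

def threshold_rate(n_triads, n_connections, tol=1e-14):
    """Rate r* in (0, 0.5) at which the left-hand side equals 1.

    Both polynomials are increasing on [0, 0.5], so bisection applies.
    For r < r* the inequality of Corollary 4 is violated.
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if independent_error_lhs(mid, n_triads, n_connections) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    r_star = threshold_rate(n_triads=24, n_connections=31)
    print(r_star, independent_error_lhs(r_star, 24, 31))
```

Smaller values of $N$ and $M$ move the threshold upward, which is the sense in which a small KS set is better.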

5 Conclusions

To conclude, for any hidden-variable model we have a bound on the changes arising in rotation:

$$\delta \ge \frac{1 - N\varepsilon}{M}.$$

Here, $N$ is the number of triads in $E_{KS}$ and $M$ is the number of connections within $E_{KS}$. A proof using few triads with few connections is not only easier to understand, but is also essential to yield a bound usable in real experiments. At a large error rate $\varepsilon$, probabilistically non-contextual models cannot be ruled out, since the changes of the results arising in rotation can be attributed to measurement errors. However, a small error rate $\varepsilon$ will force any hidden-variable description of the physical system to be probabilistically contextual.

If the assumption of independent errors is used, an explicit bound can be determined for the error rate $r$:

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) \ge 1, \qquad (13)$$

which it is possible to write in the form $r \ge f(N, M)$. Below the bound we have a KS contradiction. Again, a small KS set is better than a large one, yielding a higher bound. For example, for the KS set used here [3], an $r$ below 0.71% yields a contradiction.

While writing this paper, the author learned from C. Simon that a similar approach was in preparation by him, C. Brukner and A. Zeilinger [6].

The author would like to thank A Kent for discussions This work was partially supported by the Quantum Information Theory Programme at the European Science Foundation

1. A. M. Gleason, J. Math. Mech. 6, 885 (1957).
2. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
3. A. Peres, Quantum Theory: Concepts and Methods, Ch. 7 (Kluwer, Dordrecht, 1993).
4. D. M. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990); N. D. Mermin, Phys. Rev. Lett. 65, 1838 (1990); J.-Å. Larsson, Phys. Rev. A 57, R3145 (1998); J.-Å. Larsson, Phys. Rev. A 59, 4801 (1999).
5. A. Peres, J. Phys. A 24, L175 (1991); J. Zimba and R. Penrose, Stud. Hist. Philos. Sci. 24, 697 (1993).
6. C. Simon, C. Brukner and A. Zeilinger, quant-ph/0006043.

QUANTUM STOCHASTICS: THE NEW APPROACH TO THE DESCRIPTION OF QUANTUM MEASUREMENTS

ELENA LOUBENETS Moscow State Institute of Electronics and Mathematics

Abstract

We propose a new general approach to the description of an arbitrary generalized direct quantum measurement with outcomes in a measurable space. This approach is based on the introduction of the physically important mathematical notion of a family of quantum stochastic evolution operators, describing in a Hilbert space the conditional evolution of a quantum system under a direct measurement.

In the frame of the proposed approach, which we call quantum stochastic, all possible schemes of measurements upon a quantum system can be considered.

The quantum stochastic approach (QSA) gives not only the complete statistical description of any quantum measurement (a POV measure and a family of posterior states), but it gives also the complete stochastic description of the random behaviour of a quantum system in a Hilbert space, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement. When a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

In the case of continuous in time measurements, the QSA allows one to define, in the most general case, the notion of the family of posterior pure state trajectories (quantum trajectories) in the Hilbert space of a quantum system and to give their probabilistic treatment.

1 Introduction

The evolution of the isolated quantum system is quantum deterministic, since its behaviour in a complex separable Hilbert space $\mathcal{H}$ is described by a unitary operator $U(t): \mathcal{H} \to \mathcal{H}$, satisfying the Schrödinger equation, whose solutions are reversible in time.

Under a measurement, the behaviour of a quantum system becomes irreversible in time and stochastic: not only is the outcome of a measurement random, being defined with some probability distribution, but the state of a quantum system becomes random as well.

Consider the general scheme of description of any quantum measurement with outcomes of the most general nature possible under a quantum measurement. Such a measurement is usually called generalized.

Let $\Omega$ be a set of outcomes and $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$. Let $\rho_0$ be a state of a quantum system at the instant before a measurement.

The complete statistical description of any generalized quantum measurement implies that, for any initial state $\rho_0$ of a quantum system, we can present

• the probability distribution of different outcomes of a measurement;
• the statistical description of a state change $\rho_0 \to \rho_{out}$ of the quantum system under a measurement.

We shall also speak of the complete stochastic description of the random behaviour of a quantum system under a measurement, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement.

Introduce some notation. Let $\mu(E;\rho_0) = \mathrm{Prob}\{\omega \in E; \rho_0\}$, $\forall E \in \mathcal{F}$, be the probability that, under a measurement (upon a quantum system being initially in a state $\rho_0$), the observed outcome $\omega$ belongs to a subset $E$.

Let $\mathrm{Ex}\{Z|E\}$ be the conditional expectation of any von Neumann observable $Z \in \mathcal{L}(\mathcal{H})$, $Z = Z^{+}$, at the instant immediately after the measurement, provided the observed outcome $\omega \in E$. Here $\mathcal{L}(\mathcal{H})$ denotes the linear space of all linear bounded operators on $\mathcal{H}$.

The statistical (density) operator $\rho_{out}(E;\rho_0)$ is called a posterior state of a quantum system conditioned by the observed outcome $\omega \in E$ if, for any $Z$, the following relation is valid:

$$\mathrm{Ex}\{Z|E\} = \mathrm{tr}[\rho_{out}(E;\rho_0)Z]. \qquad (1)$$

The unconditional (a priori) state $\rho_{out}(\Omega;\rho_0)$ of a quantum system defines the quantum mean value

$$\mathrm{tr}[\rho_{out}(\Omega;\rho_0)Z] = \mathrm{Ex}\{Z|\Omega\} = \langle Z\rangle_{\rho_{out}(\Omega;\rho_0)} \qquad (2)$$

of any von Neumann observable $Z$ at the instant immediately after the measurement, if the results of a measurement are ignored.

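Any conditional state change $\rho_0 \to \rho_{out}(E;\rho_0)$ of a quantum system under a measurement can be completely described by a family of statistical operators $\{\rho_{out}(\omega;\rho_0), \omega \in \Omega\}$, defined $\mu$-almost everywhere on $\Omega$ and called a family of posterior states.

Specifically, for $\forall E \in \mathcal{F}$, $\mu(E;\rho_0) \ne 0$,

$$\rho_{out}(E;\rho_0) = \frac{\int_{\omega\in E} \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}, \qquad (3)$$

and consequently, due to (1), for any von Neumann observable $Z$ the conditional expectation can be presented as

$$\mathrm{Ex}\{Z|E\} = \frac{\int_{\omega\in E} \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}. \qquad (4)$$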

Every posterior state $\rho_{out}(\omega;\rho_0)$ describes the state of a quantum system conditioned by the sharp outcome $\omega$. In general, however, when outcomes of a measurement are not of discrete character, or the observation is not sharp, then, provided the outcome $\omega \in E$, we can only say that after a measurement the quantum system is in a state $\rho_{out}(\omega;\rho_0)$ with probability

$$\chi_E(\omega)\,\frac{\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}, \qquad (5)$$

where $\chi_E(\omega)$ is an indicator function of a subset $E$.

The a priori state $\rho_{out}(\Omega;\rho_0)$ and the quantum mean value of any von Neumann observable $Z$ at the instant immediately after the measurement are represented through the family of posterior states as

$$\rho_{out}(\Omega;\rho_0) = \int_\Omega \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0), \qquad (6)$$

$$\langle Z\rangle_{\rho_{out}(\Omega;\rho_0)} = \int_\Omega \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0), \qquad (7)$$

respectively.

The relation (6) can be considered as the usual statistical average over posterior states $\rho_{out}(\omega;\rho_0)$, given with the probability distribution $\mu(d\omega;\rho_0)$.

From (7) it also follows that in any possible measurement upon an observable $Z$, which could be done immediately at the instant after the first measurement, the probability distribution $\mathrm{Prob}\{z \in \Delta; \rho_{out}(\Omega;\rho_0)\}$ of possible outcomes is given by

$$\mathrm{Prob}\{z \in \Delta; \rho_{out}(\Omega;\rho_0)\} = \int_\Omega \mathrm{Prob}\{z \in \Delta; \rho_{out}(\omega;\rho_0)\}\,\mu(d\omega;\rho_0). \qquad (8)$$

This formula can be considered as the quantum analog of Bayes formula in classical probability theory

In quantum theory there are two major approaches to the specification of above mentioned elements of the description of a quantum measurement

• The von Neumann approach [1] considers only direct measurements with outcomes in $\mathbb{R}$. According to this approach, only self-adjoint operators on $\mathcal{H}$ are allowed to represent real-valued variables of a quantum system which can be measured (observables). The probability distribution $\mu(E;\rho_0)$ of any measurement is defined as

$$\mu(E;\rho_0) = \mathrm{tr}[\rho_0 P(E)], \qquad (9)$$

through the projection-valued measure $P(\cdot)$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ corresponding, due to the spectral theorem, to the self-adjoint operator representing this observable.

Under the von Neumann approach, the posterior state of a quantum system is defined only in the case of discrete spectrum of a measured quantum variable and is given by the well-known "jump" of a quantum system under a measurement, prescribed by the von Neumann reduction postulate.

In the case of continuous spectrum of a quantum observable the description of a state change of a quantum system under a measurement is not formalized

The simultaneous measurement of n quantum observables is allowed if and only if the corresponding self-adjoint operators and consequently their spectral projection-valued measures commute

• The operational approach [2-8] gives the complete statistical description of any generalized quantum measurement. In the frame of the operational approach, the mathematical notion of a quantum instrument plays the central role. In the physical literature, a quantum instrument is usually called a superoperator.

Specifically, a mapping $T(\cdot)[\cdot]: \mathcal{F} \times \mathcal{L}(\mathcal{H}) \to \mathcal{L}(\mathcal{H})$ is called a quantum instrument if $T(\cdot)$ is a measure on $(\Omega, \mathcal{F})$ with values $T(E)$, $\forall E \in \mathcal{F}$, being linear bounded normal completely positive maps on $\mathcal{L}(\mathcal{H})$ such that the following normalization relation is valid: $T(\Omega)[I] = I$.

Let $T(\cdot)[\cdot]$ be an instrument of a generalized quantum measurement. Then the conditional expectation of any von Neumann observable $Z$ at the instant after a measurement is defined to be

$$\mathrm{Ex}\{Z|E\} = \frac{\mathrm{tr}[\rho_0 T(E)[Z]]}{\mu(E;\rho_0)}, \quad \forall E \in \mathcal{F}. \qquad (10)$$

In the case $Z = I$, from (10) it follows that in the frame of the operational approach the probability distribution $\mu(E;\rho_0)$ of outcomes under a measurement is given by

$$\mu(E;\rho_0) = \mathrm{tr}[\rho_0 T(E)[I]], \quad \forall E \in \mathcal{F}. \qquad (11)$$

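The positive operator-valued measure $M(E) = T(E)[I]$, satisfying the condition $M(\Omega) = I$, is called a probability operator-valued measure, or a POV measure for short.

From (1) and (10) it also follows that for any initial state $\rho_0$ the posterior state $\rho_{out}(E;\rho_0)$ conditioned by the outcome $\omega \in E$ can be represented as

$$\rho_{out}(E;\rho_0) = \frac{T^{*}(E)[\rho_0]}{\mu(E;\rho_0)}, \qquad (12)$$

where $T^{*}(E)[\cdot]$ denotes the dual map acting on the linear space $\mathcal{T}(\mathcal{H})$ of trace class operators on $\mathcal{H}$ and defined by

$$\mathrm{tr}[S\,T(E)[Z]] = \mathrm{tr}[T^{*}(E)[S]\,Z], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall S \in \mathcal{T}(\mathcal{H}). \qquad (13)$$

For any initial state $\rho_0$ of a quantum system, the family of posterior states $\{\rho_{out}(\omega;\rho_0), \omega \in \Omega\}$ always exists and is defined uniquely, $\mu$-almost everywhere, by the relation

$$\int_{\omega\in E} \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0) = \mathrm{tr}[\rho_0 T(E)[Z]], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall E \in \mathcal{F}. \qquad (14)$$

Due to (13) and (14), we have

$$T^{*}(E)[\rho_0] = \int_{\omega\in E} \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0), \qquad (15)$$

and, consequently, the posterior state $\rho_{out}(\omega;\rho_0)$ is a density of the measure $T^{*}(\cdot)[\rho_0]$ with respect to the probability scalar measure $\mu(\cdot;\rho_0)$.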
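For a finite outcome set the relations of the operational approach reduce to matrix algebra. The sketch below is a hedged illustration only: it assumes a two-outcome instrument on a qubit in which each outcome $\omega$ carries a single operator $K_\omega$, so that $T(E)[Z] = \sum_{\omega\in E} K_\omega^{+} Z K_\omega$ and the dual map is $\sum_{\omega\in E} K_\omega \rho K_\omega^{+}$. This discrete form, the parameter `p`, the chosen state and all function names are assumptions made here and do not appear in the text.

```python
import numpy as np

def heisenberg_instrument(kraus, event, Z):
    """T(E)[Z] = sum over w in E of K_w^+ Z K_w (assumed discrete form)."""
    return sum(kraus[w].conj().T @ Z @ kraus[w] for w in event)

def dual_instrument(kraus, event, rho):
    """Dual map on states: sum over w in E of K_w rho K_w^+."""
    return sum(kraus[w] @ rho @ kraus[w].conj().T for w in event)

def outcome_probability(kraus, event, rho0):
    """mu(E; rho0) = tr[rho0 T(E)[I]]."""
    identity = np.eye(rho0.shape[0])
    return float(np.trace(rho0 @ heisenberg_instrument(kraus, event, identity)).real)

def posterior_state(kraus, event, rho0):
    """rho_out(E; rho0) = dual map applied to rho0, divided by mu(E; rho0)."""
    return dual_instrument(kraus, event, rho0) / outcome_probability(kraus, event, rho0)

# Hypothetical example: an unsharp two-outcome measurement of strength p on a qubit.
p = 0.3
kraus = {0: np.diag([1.0, np.sqrt(1 - p)]), 1: np.diag([0.0, np.sqrt(p)])}
rho0 = np.full((2, 2), 0.5)          # pure state with equal real amplitudes
omega_all = [0, 1]

if __name__ == "__main__":
    # Normalization T(Omega)[I] = I and the outcome distribution.
    print(heisenberg_instrument(kraus, omega_all, np.eye(2)))
    print([outcome_probability(kraus, [w], rho0) for w in omega_all])
    # The a priori state as the average of posterior states, weighted by mu.
    average = sum(outcome_probability(kraus, [w], rho0) * posterior_state(kraus, [w], rho0)
                  for w in omega_all)
    print(np.allclose(average, dual_instrument(kraus, omega_all, rho0)))
```

In this finite setting the integral over posterior states becomes a weighted sum, and the posterior state for a single outcome is the normalized image of $\rho_0$ under the dual map.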

The operational approach is very important for the formalization of the complete statistical description of an arbitrary generalized quantum measurement.

However the operational approach does not specify the description of a generalized direct quantum measurement that is the situation where we have to describe a direct interaction between a measuring device and an observed quantum system resulting in some observed outcome w in a classical world and the change of a quantum system state conditioned by this outcome

We would like to emphasize that, in principle, the description of a direct measurement cannot be simply reduced to the quantum theoretical description of a measuring process. We can neither specify definitely the interaction or the quantum state of a measuring device environment, nor describe a measuring device only in quantum theory terms. In fact, under such a scheme the description of a direct quantum measurement is simply postponed to the description of a direct measurement of some observable of the environment of a measuring device.

The operational approach also does not, in general, make it possible to include into consideration the complete stochastic description of the random behaviour of a quantum system under a measurement.

We recall that, for the case of discrete outcomes, the von Neumann approach gives both the complete statistical description of a direct quantum measurement and the complete stochastic description, in a Hilbert space, of the random behaviour of a quantum system under a single measurement. In particular, if the initial state $\rho_0$ of a quantum system is pure, that is, $\rho_0 = |\psi_0\rangle\langle\psi_0|$, and if under a single measurement the outcome $\lambda_j$ is observed, then in the frame of the von Neumann approach the quantum system "jumps" with certainty to the posterior pure state

$$\frac{P_j\psi_0}{\|P_j\psi_0\|}, \qquad (16)$$

where $P_j$ is the projection corresponding to the observed eigenvalue $\lambda_j$. The probability $\mu_j$ of the outcome $\lambda_j$ is given by

$$\mu_j = \|P_j\psi_0\|^2. \qquad (17)$$
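The von Neumann "jump" and its probability can be illustrated for a finite-dimensional observable. The sketch below is a minimal example using NumPy: it builds the spectral projections $P_j$ from an eigendecomposition and returns, for each eigenvalue $\lambda_j$ with non-zero probability, the value $\|P_j\psi_0\|^2$ and the normalized vector $P_j\psi_0/\|P_j\psi_0\|$. The function names, the grouping of eigenvalues by rounding, and the sample observable are assumptions made here for illustration.

```python
import numpy as np

def spectral_projectors(A, decimals=9):
    """Map each distinct eigenvalue of the Hermitian matrix A to its projector.

    Eigenvalues are grouped after rounding, which is adequate for this
    small example but is not a general treatment of near-degeneracy.
    """
    vals, vecs = np.linalg.eigh(A)
    projectors = {}
    for val, vec in zip(vals, vecs.T):
        key = round(float(val), decimals) + 0.0
        projectors[key] = projectors.get(key, 0) + np.outer(vec, vec.conj())
    return projectors

def von_neumann_measurement(A, psi0, cutoff=1e-12):
    """List of (eigenvalue, probability, posterior pure state) triples."""
    results = []
    for eigenvalue, P in spectral_projectors(A).items():
        unnormalized = P @ psi0
        prob = float(np.vdot(unnormalized, unnormalized).real)   # ||P_j psi0||^2
        if prob > cutoff:                                        # skip impossible outcomes
            results.append((eigenvalue, prob, unnormalized / np.sqrt(prob)))
    return results

if __name__ == "__main__":
    A = np.diag([1.0, 1.0, 0.0])               # degenerate eigenvalue 1, simple eigenvalue 0
    psi0 = np.ones(3) / np.sqrt(3.0)
    for eigenvalue, prob, posterior in von_neumann_measurement(A, psi0):
        print(eigenvalue, prob, posterior)
```

For a degenerate eigenvalue the projector is the sum over the whole eigenspace, so the posterior state does not depend on the basis chosen inside that eigenspace.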

We would also like to underline that the description of the stochastic, irreversible in time behaviour of the quantum system under a direct measurement is very important, in particular in the case of continuous in time direct measurements, where the evolution of a continuously observed quantum system cannot be described by reversible in time solutions of the Schrödinger equation.

In quantum theory, any physically based problem must be formulated in unitarily equivalent terms, and the results of its consideration must depend neither on the choice of a special representation picture (Schrödinger, Heisenberg or interaction) nor on the choice of a basis in the Hilbert space. That is why in [9] we introduce the notion of a class of unitarily equivalent measuring processes and analyse the invariants of this class.

We show [9] that the description of any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ can be considered in the frame of a new general approach, which we call quantum stochastic, based on the notion of a family of quantum stochastic evolution operators satisfying the orthonormality relation. In the case when a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

The quantum stochastic approach (QSA), which we present in the next section, can be considered as the quantum stochastic generalization of the description of von Neumann measurements for the case of any measurable space of outcomes, an input probability scalar measure of any type on the space of outcomes, and any type of a quantum state reduction. Due to the orthonormality relation, the QSA allows one to interpret the posterior pure states, defined by quantum stochastic evolution operators, as posterior pure state outcomes in a Hilbert space corresponding to different random measurement channels.

Even for the special case of discrete outcomes, the QSA differs, due to the orthogonality relation for posterior pure state outcomes, from the somewhat similar-looking approaches considered in the physical literature [10, 11], where the so-called "measurement" or Kraus operators are used for the description of both the statistics of a measurement (a POV measure) and the conditional state change of a quantum system.

The QSA gives not only the complete statistical description of any generalized direct quantum measurement, but it gives also the complete stochastic description of the random behaviour of the quantum system under a measurement.

2 Quantum stochastic approach

In this section we introduce the quantum stochastic approach (QSA) to the description of a generalized direct quantum measurement developed in [9]

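Specifically, it was shown in [9] that for any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ upon a quantum system being, at the instant before the measurement, in a state $\rho_0$, there exist:

• the unique family of complex scalar measures, absolutely continuous with respect to a finite positive scalar measure $\nu(\cdot)$ and satisfying the orthonormality relation

$$\Lambda = \Big\{ \pi_{ji}(\omega)\,\nu(d\omega) : \omega \in \Omega;\ i,j = 1,\dots,N_0;\ N_0 \le \infty;\ \int_{\Omega} \pi_{ji}(\omega)\,\nu(d\omega) = \delta_{ji} \Big\}; \qquad (18)$$

• the unique (up to phase equivalence) family of $\nu$-measurable operator-valued functions $V_i(\cdot)$ on $\Omega$ satisfying the orthonormality relation, with values being linear operators on $\mathcal{H}$ defined for any $\psi \in \mathcal{H}$ $\nu$-almost everywhere on $\Omega$,

$$\mathcal{V} = \Big\{ V_i(\omega) : \omega \in \Omega;\ i = 1,\dots,N_0;\ \int_{\Omega} V_j^{\dagger}(\omega)\,V_i(\omega)\,\pi_{ji}(\omega)\,\nu(d\omega) = \delta_{ji}\, I \Big\}, \qquad (19)$$

and such that for any index $i = 1,\dots,N_0$ and for $\forall E \in \mathcal{F}_B$

$$\int_{\omega \in E} V_i(\omega)\,\pi_{ii}(\omega)\,\nu(d\omega) \qquad (20)$$

is a bounded operator on $\mathcal{H}$. The relation

$$(W_i \psi)(\omega) = V_i(\omega)\,\psi, \qquad \forall \psi \in \mathcal{H}, \qquad (21)$$

holding $\nu_i$-almost everywhere on $\Omega$, defines the bounded linear operator $W_i : \mathcal{H} \to L_2(\Omega, \nu_i; \mathcal{H})$ with the norm $\|W_i\| = 1$. Here $\nu_i(d\omega) = \pi_{ii}(\omega)\,\nu(d\omega)$;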

• the unique sequence of positive numbers $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_{N_0})$ satisfying the relation

$$\sum_{i=1}^{N_0} \alpha_i = 1, \qquad (22)$$

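such that the complete statistical description (a POV measure and a family of posterior states) of a measurement and the complete stochastic description of the random behaviour of a quantum system under a single measurement (a family of posterior pure state outcomes and their probability distribution) are given by:

• The POV measure

$$M(E) = \sum_{i=1}^{N_0} \alpha_i M_i(E), \qquad \forall E \in \mathcal{F}_B, \qquad (23)$$

with

$$M_i(E) = \int_{\omega \in E} V_i^{\dagger}(\omega)\,V_i(\omega)\,\nu_i(d\omega). \qquad (24)$$

• The family of posterior states

$$\rho_{out}(\omega, \rho_0) = \sum_{i=1}^{N_0} \xi_i(\omega)\,\tau_{out}^{(i)}(\omega, \rho_0), \qquad (25)$$

with

$$\tau_{out}^{(i)}(\omega, \rho_0) = V_i(\omega)\,\rho_0\,V_i^{\dagger}(\omega) \qquad (26)$$

and

$$\xi_i(\omega) = \frac{\alpha_i\,\pi_{ii}(\omega)}{\sum_j \alpha_j\,\pi_{jj}(\omega)\,\mathrm{tr}[\tau_{out}^{(j)}(\omega, \rho_0)]}. \qquad (27)$$

• The probability scalar measure of the measurement, given by the expression

$$\mu(d\omega; \rho_0) = \sum_i \alpha_i\,\mu^{(i)}(d\omega; \rho_0) \qquad (28)$$

through the probability scalar measures

$$\mu^{(i)}(d\omega; \rho_0) = \mathrm{tr}[\tau_{out}^{(i)}(\omega, \rho_0)]\,\nu_i(d\omega). \qquad (29)$$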

• The family of random operators (19) describing the stochastic behaviour of the quantum system under a single measurement. Every operator $V_i(\omega)$ defines in the Hilbert space a posterior pure state outcome conditioned by the observed result $\omega$ and corresponding to the i-th random channel of a measurement.

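For any $\psi_0 \in \mathcal{H}$ the following orthonormality relation for a family $\{V_i(\omega)\psi_0 : \omega \in \Omega,\ i = 1,\dots,N_0\}$ of unnormalized posterior pure state outcomes is valid:

$$\int_{\Omega} \langle V_j(\omega)\psi_0,\, V_i(\omega)\psi_0 \rangle\, \pi_{ji}(\omega)\,\nu(d\omega) = \delta_{ji}\,\|\psi_0\|^2. \qquad (30)$$

For the definite observed outcome $\omega$ the probability of the posterior pure state outcome $V_i(\omega)\psi_0$ in the Hilbert space is given by

$$Q_i(\omega) = \frac{\alpha_i\,\pi_{ii}(\omega)\,\|V_i(\omega)\psi_0\|^2}{\sum_j \alpha_j\,\pi_{jj}(\omega)\,\|V_j(\omega)\psi_0\|^2}. \qquad (31)$$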

We call $V_i(\omega)$ quantum stochastic evolution operators, and we call the probability scalar measures $\nu_i(\cdot)$, $\sum_i \alpha_i \nu_i(\cdot)$ and $\mu^{(i)}(\cdot; \rho_0)$, $\mu(\cdot; \rho_0) = \sum_i \alpha_i \mu^{(i)}(\cdot; \rho_0)$ the input and output probability measures, respectively.

Due to the decompositions (23), (25) and (28), $M_i(E)$, $\tau_{out}^{(i)}(\omega, \rho_0)$, $\nu_i(\cdot)$ and $\mu^{(i)}(\cdot; \rho_0)$ are interpreted to present the POV measure, the unnormalized posterior state, and the input and the output probability distributions of outcomes in the i-th random channel of the measurement, respectively. The statistical weights of the different random channels are given by the numbers $\alpha_i$, $i = 1,\dots,N_0$.

The a priori state

$$\rho_{out}(\Omega, \rho_0) = \sum_i \alpha_i \int_{\Omega} \tau_{out}^{(i)}(\omega, \rho_0)\,\nu_i(d\omega) \qquad (32)$$

is the usual statistical average over unnormalized posterior states $\tau_{out}^{(i)}(\omega, \rho_0)$ with respect to the input probability distribution of outcomes $\nu_i(\cdot)$ in every channel and with respect to different random channels of the measurement.


Physically, the introduced notion of different random channels of a measurement corresponds, under the same observed outcome, to different random quantum transitions of the environment of a measuring device, which we cannot, however, specify with certainty.

The triple $\gamma = \{\Lambda, \mathcal{V}, \alpha\}$ is called a quantum stochastic representation of a generalized direct measurement.

We call direct measurements presented by different quantum stochastic representations stochastic representation equivalent if the statistical and stochastic descriptions of these direct measurements are identical.

In the frame of the QSA, von Neumann (projective) measurements form the stochastic representation equivalence class of direct measurements on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for which the complete statistical and the complete stochastic description is given by the von Neumann measurement postulates [1], presented by the formulae (16), (17).

3 Concluding remarks

We present a new general approach to the description of a generalized direct quantum measurement. The proposed approach allows one

• to give the complete statistical description (a POV measure and a family of posterior states) of any quantum measurement;

• to give the complete description in a Hilbert space of the stochastic behaviour of a quantum system under a measurement (in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement);

• to formalize the consideration of all possible cases of quantum measurements, including measurements continuous in time;

• to give the semiclassical interpretation of the description of a generalized direct quantum measurement.

4 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton U., Princeton, NJ, 1955).

2. E. B. Davies, J. T. Lewis, An operational approach to quantum probability, Commun. Math. Phys. 17, 239-260 (1970).

3. E. B. Davies, Quantum Theory of Open Systems (Academic Press, London, 1976).

4. A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (Nauka, Moscow, 1980; North Holland, Amsterdam, 1982, English translation).

5. K. Kraus, States, Effects and Operations: Fundamental Notions of Quantum Theory (Springer-Verlag, Berlin, 1983).

6. M. Ozawa, Quantum measuring processes of continuous observables, J. Math. Phys. 25, 79-87 (1984).

7. M. Ozawa, Conditional probability and a posteriori states in quantum mechanics, Publ. RIMS Kyoto Univ. 21, 279-295 (1985).

8. A. Barchielli, V. P. Belavkin, Measurements continuous in time and a posteriori states in quantum mechanics, J. Phys. A: Math. Gen. 24, 1495-1514 (1991).

9. E. R. Loubenets, Quantum stochastic approach to the description of quantum measurements, Research Report N 39, MaPhySto, University of Aarhus, Denmark (2000).

10. A. Peres, Classical intervention in quantum systems. I. The measuring process, Phys. Rev. A 61, 022116 (1-9) (2000).

11. H. Wiseman, Adaptive quantum measurements, Proceedings of the Workshop on Stochastics and Quantum Physics, Miscellanea N 16, 89-93, MaPhySto, University of Aarhus, Denmark (1999).


ABSTRACT MODELS OF PROBABILITY

V. M. MAXIMOV

Institute of Computer Science, Bialystok University

PL15887 Bialystok, ul. Sosnowa 64, POLAND

Probability theory presents a mathematical formalization of intuitive ideas of independent events and of probability as a measure of randomness. It is based on axioms 1-5 of A. N. Kolmogorov [1] and their generalizations [2]. Different formalized refinements were proposed for such notions as events, independence, random value, etc. [2, 3], whereas the measure of randomness, i.e. numbers from [0,1], remained unchanged. To be precise, we mention some attempts at generalizing probability theory with negative probabilities [4]. On the other hand, physicists tried to use negative and even complex values of probability to explain some paradoxes in quantum mechanics [5, 6, 7]. Only recently the necessity of formalizing quantum mechanics and its foundations [8] led to the construction of p-adic probabilities [9, 10, 11], which essentially extended our concept of probability and randomness. Therefore a natural question arises: how to describe algebraic structures whose elements can be used as a measure of randomness? As a consequence, a necessity arises to define the types of randomness corresponding to every such algebraic structure. Possibly this leads to another concept of randomness, whose nature differs from the combinatorial-metric conception of Kolmogorov. Apparently, a discrepancy between the real type of randomness corresponding to some experimental data and the model of randomness used for data processing leads to paradoxes [12]. An algebraic structure whose elements can be used to estimate some randomness will be called a probability set $\Phi$. Naturally, the elements of $\Phi$ are the probabilities.

1 What probability sets $\Phi$ are possible?

For practical conclusions of probability theory two kinds of events, the so-called certain and uncertain ones, are of importance. Therefore the probability set $\Phi$ must have two types of elements, corresponding to certainty and uncertainty. Their main role is that they couple all elements of $\Phi$. We interpret them as a possibility of a determination of any probability $p \in \Phi$ of a random event by an infinite sequence of random independent variables defined by the probability set $\Phi$. In this connection we do not require a formal physical interpretation for certainty.

We would like to preserve, for an abstract probability set $\Phi$, all fundamental properties of probability on [0,1] corresponding to the intuitive idea of the probability of an event.

An analogous situation occurs in logic. A construction which preserves the main properties of Boolean algebra and possesses some new properties led to the appearance of the logical Lukasiewicz-Tarski system [13, 14].


Definition 1 A set $\Phi$ is called a probability set if it has the following properties.

(i) In $\Phi$ a binary operation $\cdot$ is defined as the multiplication of probabilities; it is not necessarily commutative. With respect to that operation the set $\Phi$ is a semigroup. In addition, $\Phi$ consists of three non-intersecting semigroups $\mathcal{O}$, $\mathcal{E}$ and $\mathcal{P}$ such that $\Phi = \mathcal{O} \cup \mathcal{P} \cup \mathcal{E}$. The elements of the semigroup $\mathcal{O}$ will play the role of zeros, i.e. $\mathcal{O}$ is a semigroup of zeros. The elements of $\mathcal{E}$ will play the role of units, i.e. $\mathcal{E}$ is a semigroup of units. $\mathcal{P}$ is a semigroup of probabilities. Besides, for all $p \in \mathcal{P}$, $\theta \in \mathcal{O}$ we have $\theta \cdot p,\ p \cdot \theta \in \mathcal{O}$, and for all $p \in \mathcal{P}$, $\varepsilon \in \mathcal{E}$ we have $\varepsilon \cdot p,\ p \cdot \varepsilon \in \mathcal{P}$.

It is clear that zero elements correspond to uncertain events and the unit elements correspond to certain events.

(ii) For some elements of $\Phi$ a commutative and associative operation $+$ of addition is defined. The operations of addition and multiplication are distributive. This means that if for $p, q, r \in \Phi$ the operations $p + q$, $(p + q) + r$ are defined, then the operations $q + r$, $p + (q + r)$ are also defined and the equality $(p + q) + r = p + (q + r)$ holds. In addition, for all $u, v, r$ the operations $u \cdot p + v \cdot q$, $p \cdot u + q \cdot v$ are defined and the equalities $r \cdot (p + q) = r \cdot p + r \cdot q$, $(p + q) \cdot r = p \cdot r + q \cdot r$ hold.

(iii) For all $p \in \mathcal{P}$ there exist a complementary element $\bar{p} \in \mathcal{P}$ and $\varepsilon \in \mathcal{E}$ such that $p + \bar{p} = \varepsilon$.

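(iv) The operation $+$ is defined for all elements of $\mathcal{O}$ and is not defined for elements of $\mathcal{E}$. Besides, for all $p \in \mathcal{P}$, $\theta \in \mathcal{O}$ the sum $p + \theta$ is defined and $p + \theta \notin \mathcal{O}$, $p + \theta \notin \mathcal{E}$. Also, for $\varepsilon \in \mathcal{E}$ the inclusion $\theta + \varepsilon \in \mathcal{E}$ holds, but $p + \varepsilon$ is not defined.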

(v) In $\Phi$ some topology is introduced such that the operations $\cdot$ and $+$ are continuous with respect to that topology. For an arbitrary neighbourhood $V(\mathcal{O})$ of zeros there exists $p \in \Phi$ such that $p^n \in V(\mathcal{O})$ for $n > n_0(V, p)$.

(vi) If $p, q \in \Phi$ and $p + q \in \mathcal{O}$, then it follows that $p, q \in \mathcal{O}$ (the property of indecomposability of zero). That property is not necessary. For example, for complex and p-adic probabilities it may fail.

(vii) The equation $p^2 = p$ always has solutions in $\mathcal{O}$ and $\mathcal{E}$. If the equation $p^2 = p$ has solutions only in $\mathcal{O}$ and in $\mathcal{E}$, then we will say that the Kolmogorov condition is valid for the probability set $\Phi$.

The properties (i)-(v) provide the main identity of the calculus of independent probabilities, i.e. if $p_1 + \dots + p_n = \varepsilon \in \mathcal{E}$, $p_i \in \mathcal{P}$, then we have

$$(p_1 + \dots + p_n)^k = \sum p_{i_1} \cdots p_{i_k} = \varepsilon^k \in \mathcal{E}.$$
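The identity above can be checked numerically for the ordinary probability set [0,1], where the unit is the number 1. The following is only an illustrative sketch: the function name `sum_of_products` and the sample probabilities are assumptions made here, not part of the original text, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import product
from math import prod  # available in Python 3.8+


def sum_of_products(probs, k):
    """Sum of p_{i1} * ... * p_{ik} over all ordered k-tuples of indices."""
    return sum(prod(t) for t in product(probs, repeat=k))


# Hypothetical probabilities that add up to the unit 1.
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# For every k the expanded k-th power of the sum is again the unit.
for k in range(1, 5):
    assert sum_of_products(probs, k) == 1
```

In an abstract probability set the same expansion relies on distributivity (property (ii)) and on the units forming a semigroup, so that the k-th power of a unit is again a unit.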

Unfortunately, the operations of a direct sum and of a tensor product of [0,1] do not produce a new probability set different from [0,1].

For example, in the case of a direct sum $[0,1] \oplus [0,1]$ with the coordinate-wise multiplication we have pairs $(p, q)$, $p, q \in [0,1]$, as probabilities. Consequently $(p_1, q_1) + (p_2, q_2) = (p_1 + p_2,\ q_1 + q_2)$ and $(p_1, q_1)(p_2, q_2) = (p_1 p_2,\ q_1 q_2)$. Obviously, the element $(0, 0)$ must be a zero. But then $(p, 0)(0, q) = (0, 0)$. It follows by the zero semigroup properties that $(p, 0) \in \mathcal{O}$ or $(0, q) \in \mathcal{O}$. Assume that $(p, 0) \in \mathcal{O}$, $p \neq 0$. Then by virtue of the other axioms we obtain $(\tfrac{m}{n} p, 0) \in \mathcal{O}$, $0 < \tfrac{m}{n} < 1$, and therefore, by the continuity property, the set $\{(p, 0) : p \in [0,1]\}$ is contained in $\mathcal{O}$. Formally the probability set differs from [0,1]. But the factorization with respect to the set $\mathcal{O}$ yields [0,1] once again, with the usual addition and multiplication (see section 2). However, there exists a probability set $\Phi$ satisfying all axioms in the algebra consisting of pairs $(x, y)$, $x, y \in \mathbb{R}$, with the operations of coordinate-wise addition and multiplication.

Indeed, consider the set $\Phi$ in Fig. 1 (a parallelogram) bounded by the vertices $0$, $h$, $1$, $1 - h$, where $h < \tfrac{1}{2}$. Then we can easily verify that if $(x_1, y_1), (x_2, y_2) \in \Phi$ then $(x_1 x_2,\ y_1 y_2) \in \Phi$. The zero set $\mathcal{O}$ consists of the single element 0 and the set $\mathcal{E}$ consists of the single element 1. The topology of $\Phi$ is induced from $\mathbb{R}^2$. The remaining properties of $\Phi$ can be examined easily. Note that the first coordinate $x$ runs over the segment [0,1].

Since $\mathbb{R}^2$ with the coordinate-wise addition and multiplication is the simplest non-trivial topological semi-field [15], we can consider $\Phi$ as an example of a probability set included in a topological semi-field.

In [16] the foundation of classical probability theory is presented in terms of semi-fields. Thus the construction of probability sets in abstract topological semi-fields can be of interest for applications. In section 3 we consider multidimensional examples of probability sets which may even be non-commutative. These examples go beyond the frames of topological semi-fields.

The zero-indecomposability property may or may not be included in the properties of $\Phi$, depending on the problem. For example, if we consider the whole field of p-adic numbers as a probability set, then the indecomposability property does not hold. Nevertheless, this does not prevent the existence of an analogue of the Bernoulli theorem for p-adic probabilities [10].

However, we can find sets satisfying all axioms in the field of p-adic numbers. For this purpose we take a p-adic number $q$, $|q|_p < 1$, that is not a root of any algebraic equation with integer coefficients. Then the set of p-adic numbers of the form

$$n_k q^k + n_{k+1} q^{k+1} + \dots + n_r q^r,$$

where $n_k \in \mathbb{N}$ and the rest of the $n_i$ belong to $\mathbb{Z}$, $k, r = 1, 2, 3, \dots$, and of the form

$$1 - m_s q^s + m_{s+1} q^{s+1} + \dots + m_t q^t,$$

where $m_s \in \mathbb{N}$ and the rest of the $m_j$ belong to $\mathbb{Z}$, $s, t = 1, 2, 3, \dots$, together with 0 and 1, is a probability set with the operations of addition and multiplication of p-adic numbers.

Fig. 1

The semigroups $\mathcal{O}$ and $\mathcal{E}$ consist of 0 and 1, respectively. Essentially different examples of probability sets will be considered in sections 3 and 4.

2 Uniqueness of semigroups of zeros and units

(i) Proposition 1 In the probability set $\Phi$ defined by the operations $\cdot$ and $+$, the semigroups $\mathcal{O}$ and $\mathcal{E}$ satisfying properties (i)-(iv) are unique.

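Proof. It is important to note that the semigroups $\mathcal{O}$ and $\mathcal{E}$ possess the maximality property, i.e. they cannot be extended to semigroups $\mathcal{O}'$, $\mathcal{O} \subset \mathcal{O}'$, and $\mathcal{E}'$, $\mathcal{E} \subset \mathcal{E}'$ (or $\mathcal{E} \subset \mathcal{E}'$, $\mathcal{O} \subset \mathcal{O}'$) preserving the properties (i)-(iv). Indeed, if there is an extension $\mathcal{O}'$, then there is an element $p \notin \mathcal{O}$ such that $p \in \mathcal{O}'$. But this contradicts conditions (iii)-(iv), since on the one hand the operation $p + \varepsilon$, $\varepsilon \in \mathcal{E}$, is not defined for $p \notin \mathcal{O}$, and on the other hand the operation $p + \varepsilon$ is defined for all $\varepsilon \in \mathcal{E}'$ since $p \in \mathcal{O}'$.

Now let $\mathcal{O}' = \mathcal{O}$ and $\mathcal{E} \subset \mathcal{E}'$. Then there exists an element $p \in \mathcal{E}'$ with $p \notin \mathcal{E}$. By (iii) there exists $\bar{p} \notin \mathcal{O}$ such that $p + \bar{p} \in \mathcal{E} \subset \mathcal{E}'$. On the other hand, the operation $p + q$ is not defined for $q \notin \mathcal{O}' = \mathcal{O}$ and $p \in \mathcal{E}'$. Thus any pair of semigroups $\mathcal{O}$ and $\mathcal{E}$ satisfying (i)-(iv) is maximal.

For the same reason there exists in $\Phi$ no other pair of semigroups $\mathcal{O}_1$ and $\mathcal{E}_1$ different from $\mathcal{O}$ and $\mathcal{E}$. Indeed, assume these semigroups exist. Let $\mathcal{O}_1 \neq \mathcal{O}$, $\mathcal{O}_1 \not\subset \mathcal{O}$, $\mathcal{O} \not\subset \mathcal{O}_1$. Then $\exists p \in \mathcal{O}$, $p \notin \mathcal{O}_1$. If $\mathcal{E} \cap \mathcal{E}_1 \neq \emptyset$, then the operation $p + \varepsilon$ is defined for $\varepsilon \in \mathcal{E} \cap \mathcal{E}_1$ since $p \in \mathcal{O}$. On the other hand, the operation $p + \varepsilon$ is not defined for $\varepsilon \in \mathcal{E}_1$ since $p \notin \mathcal{O}_1$. If $\mathcal{E} \cap \mathcal{E}_1 = \emptyset$, we consider an element $p$ such that $p \notin \mathcal{O}$ but $p \in \mathcal{O}_1$. Then by (iv) the sum $p + q$ is defined $\forall q \in \Phi$. On the other hand, the sum $p + \varepsilon$ is not defined for $\varepsilon \in \mathcal{E}$ since $p \notin \mathcal{O}$.

It remains to consider the case when $\mathcal{O} = \mathcal{O}_1$ but $\mathcal{E} \not\supseteq \mathcal{E}_1$. This case does not coincide with the case $\mathcal{O} = \mathcal{O}_1$ and $\mathcal{E} \subset \mathcal{E}_1$ studied above, but the proof remains the same. Namely, there exists $p \in \mathcal{E}_1$ with $p \notin \mathcal{E}$. By virtue of (iii) there exists an element $\bar{p} \notin \mathcal{O}$ such that $p + \bar{p} \in \mathcal{E}$. At the same time the operation $p + \bar{p}$ is not defined, since $p \in \mathcal{E}_1$ and $\bar{p} \notin \mathcal{O}_1 = \mathcal{O}$.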

(ii) The homomorphism of the probability set $\Phi_1$ into the probability set $\Phi_2$ can be defined as usual, but with the following natural complement.

Definition 2 A mapping $\varphi$ of a probability set $\Phi_1$ into the probability set $\Phi_2$ is defined to be a homomorphism if

(a) $\varphi$ is a semigroup homomorphism with respect to the multiplication;

(b) if a sum $p + q$ is defined in $\Phi_1$, then the sum $\varphi(p) + \varphi(q)$ is also defined in $\Phi_2$ and $\varphi(p + q) = \varphi(p) + \varphi(q)$;

(c) if a sum $\varphi(p) + \varphi(q)$ is defined in $\Phi_2$, then the sum $p + q$ is defined in $\Phi_1$ and consequently, by (b), we have $\varphi(p + q) = \varphi(p) + \varphi(q)$.

Proposition 2 Let the probability set $\Phi_2$ be a $\varphi$-homomorphic image of a probability set $\Phi_1$. Let $\Phi_1 = \mathcal{O}_1 \cup \mathcal{P}_1 \cup \mathcal{E}_1$ and $\Phi_2 = \mathcal{O}_2 \cup \mathcal{P}_2 \cup \mathcal{E}_2$, where $\mathcal{O}_i$, $\mathcal{E}_i$ are the semigroups of zeros and units, respectively. Then $\varphi(\mathcal{O}_1) = \mathcal{O}_2$, $\varphi(\mathcal{P}_1) = \mathcal{P}_2$ and $\varphi(\mathcal{E}_1) = \mathcal{E}_2$. Also we have $\varphi(\bar{p}) = \overline{\varphi(p)}$ for all $p \in \mathcal{P}_1$.

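Proof. Consider the sets $\mathcal{O}_1' = \varphi^{-1}(\mathcal{O}_2)$, $\mathcal{P}_1' = \varphi^{-1}(\mathcal{P}_2)$, $\mathcal{E}_1' = \varphi^{-1}(\mathcal{E}_2)$. Since the sets $\mathcal{O}_2$, $\mathcal{P}_2$ and $\mathcal{E}_2$ do not intersect pairwise, the sets $\mathcal{O}_1'$, $\mathcal{P}_1'$ and $\mathcal{E}_1'$ also do not intersect pairwise, and $\Phi_1 = \mathcal{O}_1' \cup \mathcal{P}_1' \cup \mathcal{E}_1'$. Since $\mathcal{O}_2$, $\mathcal{P}_2$, $\mathcal{E}_2$ are semigroups, the semigroup properties of $\varphi$ imply that the sets $\mathcal{O}_1'$, $\mathcal{P}_1'$, $\mathcal{E}_1'$ are semigroups in $\Phi_1$. Further, using properties (a) and (b) of definition 2, one can easily verify that the sets $\mathcal{O}_1'$ and $\mathcal{E}_1'$ satisfy conditions (i)-(iv) of definition 1 and thus are semigroups of zeros and units. In view of proposition 1 we have $\mathcal{O}_1' = \mathcal{O}_1$ and $\mathcal{E}_1' = \mathcal{E}_1$. It follows that $\mathcal{P}_1' = \mathcal{P}_1$. Then, if $p \in \mathcal{P}_1$, there exists an element $\bar{p} \in \mathcal{P}_1$ such that $p + \bar{p} \in \mathcal{E}_1$. Therefore $\varphi(p + \bar{p}) = \varphi(p) + \varphi(\bar{p}) \in \mathcal{E}_2$ and we can set $\overline{\varphi(p)} = \varphi(\bar{p})$.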

(iii) Let $\Phi$ be an arbitrary probability set with a semigroup of zeros $\mathcal{O}$. Propositions 1 and 2 allow one to consider, instead of the probability set $\Phi$, a homomorphic probability set $\Phi_0$ (by proposition 3 below) whose semigroup of zeros consists of a single element. Denote it by $0$. Then $0$ possesses all properties of the usual zero, i.e. $p + 0 = p$, $0 \cdot p = p \cdot 0 = 0$, $\forall p \in \Phi_0$.

Definition 3 A class of equivalence $K_q$ of an element $q \in \Phi$ is the set of all elements $p \in \Phi$ for which $p + \theta_1 = q + \theta_2$ for some $\theta_1, \theta_2 \in \mathcal{O}$. Set

$$\Phi^0 = \{K_q : q \in \Phi\}.$$

From definition 3 it is clear that $K_\theta = \mathcal{O}$ for all $\theta \in \mathcal{O}$. Indeed, let $x \in K_\theta$; then by definition 3 we have $x + \theta_1 = \theta + \theta_2$ for some $\theta_1, \theta_2 \in \mathcal{O}$. By (vi) it follows that $x \in \mathcal{O}$. Further, since $p + \theta = \theta + p$, $\theta \in \mathcal{O}$, we have $p \in K_p$.

The following two lemmas are similar to those for conjugate classes in rings, but the proofs are different.

Lemma 1 If $z \in K_p$ then $K_z = K_p$.

Proof. If $z \in K_p$, then by definition 3 we have $z + \theta_1 = p + \theta_2$ for some $\theta_1, \theta_2 \in \mathcal{O}$. Let $x$ be an arbitrary element of $K_z$. Then by definition 3 we have $x + \theta_3 = z + \theta_4$ for some $\theta_3, \theta_4 \in \mathcal{O}$. Adding $\theta_1$ to this equality and using the addition properties in $\Phi$ and the relation $z + \theta_1 = p + \theta_2$, we obtain

$$(x + \theta_3) + \theta_1 = x + (\theta_3 + \theta_1) = (z + \theta_4) + \theta_1 = (z + \theta_1) + \theta_4 = (p + \theta_2) + \theta_4 = p + (\theta_2 + \theta_4).$$

Since $\theta_3 + \theta_1$ and $\theta_2 + \theta_4$ belong to $\mathcal{O}$, it follows from definition 3 that $x \in K_p$, i.e. $K_z \subset K_p$.

Also, from the relation $p + \theta_2 = z + \theta_1$ it follows that $p \in K_z$. Consequently $K_p \subset K_z$ and we have $K_z = K_p$.


Lemma 2 The classes $K_p$ and $K_q$ either coincide or do not intersect.

Proof. Indeed, let $K_p \cap K_q \neq \emptyset$. If $z \in K_p \cap K_q$, then by Lemma 1 we have $K_z = K_p$ and $K_z = K_q$, i.e. $K_p = K_q$.

Proposition 3 In the set $\Phi^0$ one can introduce the operations of multiplication and addition, naturally induced by the operations in $\Phi$, that transform $\Phi^0$ into a probability set (we denote it by $\Phi_0$). Moreover, the semigroup of zeros of the probability set $\Phi_0$ consists of a single element $K_\theta = \mathcal{O}$, $\forall \theta \in \mathcal{O}$, which possesses the properties of a usual zero.

Proof. Define the set $K_p + K_q$ by a term-by-term addition of elements. The definition of $K_p + K_q$ is correct if $p + q$ is defined. Indeed, let us consider $x \in K_p$, $y \in K_q$. Then by definition 3 we have $x + \theta_1 = p + \theta_2$, $y + \theta_3 = q + \theta_4$ for some $\theta_i \in \mathcal{O}$. Since $p + q$ is defined, properties (ii) and (iv) imply

$$(p + \theta_2) + (q + \theta_4) = (p + q) + (\theta_2 + \theta_4) = (x + y) + (\theta_1 + \theta_3).$$

Consequently $x + y \in K_{p+q}$, and it follows that $K_p + K_q \subset K_{p+q}$.

Similarly, we can define the set $K_p \cdot K_q$ by term-by-term multiplication. If $x \in K_p$, $y \in K_q$, we have $x + \theta_1 = p + \theta_2$ and $y + \theta_3 = q + \theta_4$, $\theta_i \in \mathcal{O}$. Multiplying the left-hand and right-hand sides of these equalities and applying the properties of $\mathcal{O}$, we obtain

$$(x + \theta_1)(y + \theta_3) = (p + \theta_2)(q + \theta_4), \quad \text{i.e.} \quad x \cdot y + \theta' = p \cdot q + \theta'',$$

where $\theta', \theta'' \in \mathcal{O}$. Consequently $x \cdot y \in K_{pq}$ and therefore $K_p \cdot K_q \subset K_{pq}$.

Those inclusions, lemma 2 and properties (iii), (iv) allow one to introduce correctly the operations of multiplication and addition on the classes of $\Phi^0$ by

$$K_p \odot K_q = K_{pq}, \qquad K_p \boxplus K_q = K_{p+q}. \qquad (1)$$

These operations transform the set $\Phi^0$ into a probability semigroup $\Phi_0$. The zero semigroup of $\Phi_0$ consists of a single class $\mathcal{O} = K_\theta$, $\theta \in \mathcal{O}$, and the semigroup of units $\mathcal{E}_0$ consists of the classes $K_\varepsilon$, $\varepsilon \in \mathcal{E}$. Obviously, the properties (i)-(vi) of definition 1 can be easily verified. The class $K_\theta = \mathcal{O}$, $\forall \theta \in \mathcal{O}$, possesses all properties of the usual zero, since $K_q \cdot K_\theta = K_{q\theta} = K_\theta = \mathcal{O}$ and $K_q + K_\theta = K_{q+\theta} = K_q$.

We define $\varphi$ on $\Phi$ as $\varphi(p) = K_p$. Obviously, the mapping $\varphi$ satisfies the conditions of definition 2 and therefore is a homomorphism of $\Phi$ into $\Phi_0 = \Phi^0$.

3 Probabilities with hidden parameters

(i) The idea of hidden variables is very popular in quantum mechanics [17]. With the help of hidden variables many investigators try to overcome some difficulties of quantum mechanics. For example, in [18] a p-adic theory of distributions for hidden variables was proposed to resolve the Bell inequality paradox.

On the other hand, we propose to consider the hidden variables as hidden parameters of the usual probabilities, so that the latter must be abstract probabilities satisfying the conditions of definition 1.

At first we consider one model of hidden parameters for abstract probabilities.

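Definition 4 We say that a set of abstract probabilities $\Phi$ allows hidden parameters $A$ (or $\Phi$ has hidden parameters $A$), where $A$ is a certain topological space, if to each $\alpha \in A$ there corresponds a subset $P_\alpha \subset \Phi$ such that $\bigcup_{\alpha} P_\alpha = \Phi$, and continuous mappings $\varphi$ and $\psi$ from $A \times A \times \Phi \times \Phi$ into $A$ are defined and possess the following properties. The operations

$$(p, \alpha) + (q, \beta) = (p + q,\ \varphi(\alpha, \beta, p, q)), \qquad (2)$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \psi(\alpha, \beta, p, q)), \qquad (3)$$

where $p \in P_\alpha$, $q \in P_\beta$, $p + q \in P_{\varphi(\alpha, \beta, p, q)}$, $p \cdot q \in P_{\psi(\alpha, \beta, p, q)}$, define on the set of pairs $(p, \alpha)$, $\alpha \in A$, $p \in P_\alpha$, a probability set denoted by $\Phi(A) \subset \Phi \times A$.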

Since the left-hand sides of (2) and (3) are the operations in the probability set $\Phi$, the hidden parameters can describe additional properties of probabilities, including some possible physical sense. It is obvious that the principal problem concerning probabilities with hidden parameters is as follows: can we distinguish statistically the sequences $\xi_1(\omega), \dots, \xi_n(\omega), \dots$ and $\eta_1(\omega), \dots, \eta_n(\omega), \dots$, where $\xi_k(\omega)$ are independent random variables with identical distributions with respect to the usual probabilities from [0,1], and $\eta_k(\omega)$ are independent random variables with the same values as $\xi_k(\omega)$ but with distributions from the probability set $[0,1] \times A$ and satisfying the condition: if $P\{\xi_k(\omega) \in B\} = p$ then $P\{\eta_k(\omega) \in B\} = (p, \alpha)$ for some $\alpha \in A$?


(ii) Now we consider the principal construction for different examples of the usual probability on [0,1] with hidden parameters.

Proposition 4 Let $\Phi = [0,1]$ and let $A$ be some convex semigroup in an arbitrary Banach algebra over $\mathbb{R}$. Then the set $\Phi \times A = \{(p, \alpha) : \alpha \in A\}$ forms a probability set with respect to the operations

$$(p, \alpha) + (q, \beta) = \Big(p + q,\ \frac{p}{p+q}\,\alpha + \frac{q}{p+q}\,\beta\Big), \qquad p + q \le 1, \qquad (4)$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \alpha \cdot \beta). \qquad (5)$$

Proof. As the zero set $\mathcal{O}$ we consider the set $\{(0, \alpha) : \alpha \in A\}$ and as $\mathcal{E}$ we consider the set $\{(1, \alpha) : \alpha \in A\}$. Then all properties of definition 1 can be easily verified. By proposition 3 all elements of the form $(0, \alpha)$, $\alpha \in A$, can be identified with one zero.

A simple and interesting example of this kind can be obtained by considering the set of pairs $(p, q)$, $p, q \in [0,1]$, with the operations

$$(p_1, q_1) + (p_2, q_2) = \Big(p_1 + p_2,\ \frac{p_1}{p_1 + p_2}\,q_1 + \frac{p_2}{p_1 + p_2}\,q_2\Big), \qquad 0 < p_1 + p_2 \le 1, \qquad (6)$$

$$(p_1, q_1) \cdot (p_2, q_2) = (p_1 \cdot p_2,\ q_1 \cdot q_2). \qquad (7)$$
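The operations (6) and (7) on pairs can be tried out directly. The sketch below is only an illustration under stated assumptions: the function names `add` and `mul` and the sample pairs are hypothetical, and exact fractions are used so that the checks of associativity and distributivity are not affected by rounding.

```python
from fractions import Fraction as F


def add(x, y):
    """Addition (6): probabilities add, hidden parameters are averaged
    with weights proportional to the probabilities."""
    (p1, q1), (p2, q2) = x, y
    s = p1 + p2
    if not 0 < s <= 1:
        raise ValueError("the sum is defined only for 0 < p1 + p2 <= 1")
    return (s, (p1 * q1 + p2 * q2) / s)


def mul(x, y):
    """Multiplication (7): coordinate-wise product."""
    (p1, q1), (p2, q2) = x, y
    return (p1 * p2, q1 * q2)


# Hypothetical sample probabilities with hidden parameters.
x = (F(1, 4), F(1, 2))
y = (F(1, 2), F(1, 3))
z = (F(1, 8), F(1, 1))
r = (F(2, 3), F(3, 5))

# Addition is associative and multiplication distributes over it.
assert add(add(x, y), z) == add(x, add(y, z))
assert mul(r, add(x, y)) == add(mul(r, x), mul(r, y))
```

The first coordinate behaves exactly like an ordinary probability; only the second coordinate carries the extra information, which is why the pairs cannot be told apart from usual probabilities by looking at the first coordinate alone.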

Obviously, instead of $q \in [0,1]$ we can take the elements of the Banach algebra of sequences of numbers from [0,1] with coordinate-wise multiplication. We can interpret probabilities $(p, q)$ with hidden parameters $q = (q_1, q_2, \dots)$, $0 \le q_i \le 1$, as follows: if an event $S$ occurs with the probability $p$, then $q_1, q_2, \dots$ can be considered as probabilities of some independent events $S_1, S_2, \dots$ which can occur when $S$ occurs.

Another example of hidden parameters, interesting from a probabilistic point of view, can be obtained when $q = \|q_{ij}\|$ runs over stochastic matrices. Now we can consider a random index $i$, $i = 1, 2, \dots$, with distribution $(p_i, \|q_{km}\|)$. Thus, if the event $i$ occurs with probability $p_i$, then $q_{ij}$ is the probability of some event $S_j$. This duplicates the previous situation, with the difference that the matrix multiplication admits more interpretations.

The problem of a general description of all mappings $\varphi$ and $\psi$ of the set $[0,1]^4$ into [0,1], i.e. the full description of probabilities from [0,1] with hidden parameters from [0,1], remains open.


(iii) As a prototype of a general construction of a probability set $\Phi$ with hidden parameters we can consider a set of positive measures min(G) on some semigroup structure $G$ with the natural operations of addition and composition of measures.

Indeed, let $G$ be an arbitrary locally compact semigroup. Consider the set min(G) of all positive measures on $G$ with the weak topology. We can naturally define the operation of convolution (composition) on min(G) as follows: for $\mu, \nu \in$ min(G) we set [3]

$$\mu\nu(B) = \mu \times \nu\,\{(x, y) : x \cdot y \in B;\ x, y \in G\}, \qquad (8)$$

where $\mu \times \nu$ denotes the direct product of the measures $\mu$ and $\nu$ on $G$. Then min(G) is a semigroup with respect to the convolution. Besides, the addition $(\mu + \nu)(B) = \mu(B) + \nu(B)$ and the multiplication by a positive number $\lambda$, $(\lambda\mu)(B) = \lambda\mu(B)$, are defined on min(G). Obviously, the operations of convolution and addition are distributive. Thus the linear set min(G) is a convex semigroup with respect to convolution.

The set min(G) possesses almost all properties of probability sets with respect to these operations except one: there is no semigroup of units in min(G). But if we restrict min(G) we can obtain a convex semigroup possessing all properties of a probability set. To this end we consider the subset $\min_1(G)$ of min(G) consisting of all probability measures, i.e. the set of positive measures $\mu$ for which $\mu(G) = 1$. From (8) it follows that $\min_1(G)$ is a semigroup. Consider the convex closed semigroup $\min_{[0,1]}(G)$ consisting of all non-negative measures $\mu$ for which $0 \le \mu(G) \le 1$. It can be readily seen that the set $\min_{[0,1]}(G)$ with the operations of addition and composition satisfies all properties (i)-(vi) of a probability set with the semigroup of units $\mathcal{E} = \min_1(G)$.

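Each element $\mu$ from $\min_{[0,1]}(G)$ can obviously be represented in the form $p \cdot (\tfrac{1}{p}\mu)$, where $\mu(G) = p \in [0,1]$, $p \neq 0$, $\tfrac{1}{p}\mu \in \min_1(G)$. If $\mu$ and $\nu$ belong to $\min_{[0,1]}(G)$, with $\mu(G) = p$ and $\nu(G) = q$, then we have

$$\mu + \nu = (p + q)\Big[\frac{p}{p+q}\Big(\frac{1}{p}\mu\Big) + \frac{q}{p+q}\Big(\frac{1}{q}\nu\Big)\Big], \qquad (9)$$

$$\mu\nu = p\Big(\frac{1}{p}\mu\Big)\, q\Big(\frac{1}{q}\nu\Big) = pq\,\Big(\frac{1}{p}\mu\Big)\Big(\frac{1}{q}\nu\Big). \qquad (10)$$

From (9) and (10) we obtain the following.

Proposition 5 The convex semigroup $\min_{[0,1]}(G)$ and the set $\Phi \times \min_1(G)$ of elements $(p, \alpha)$, $p \in [0,1]$, $\alpha \in \min_1(G)$, with the operations (4), (5) are isomorphic.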

The probabilities $(p, \mu)$ can be interpreted similarly to item (ii) above. However, the structure of the multiplication of the semigroup is rather more complicated. Consider an algebra of some events $\Gamma$. Suppose that each such event has a state which can be represented by an element of a group $G$. Let the probabilities $(p_i, \mu_i)$, $\sum_i (p_i, \mu_i) = (1, \mu)$, assign the distribution on events $\Gamma_i \subset \Gamma$, $\Gamma_i \cap \Gamma_j = \emptyset$. Then the probability $(p_i, \mu_i)$ means the choice of an event $\Gamma_i$ with the probability $p_i$ and the choice of a state $g \in G$ with the distribution $\mu_i$.

It is obvious that the addition and multiplication of these probabilities must be determined by the physical model obtained from an experiment or theoretically

4 Probability sets with a single unit

If a semigroup $G$ is finite, then $\min_{[0,1]}(G)$ is a convex set in Euclidean space. We will show that this convex set contains probability subsets with a single unit. A special two-dimensional case of such a probability set was presented in section 1.

(i) Let $G$ be a finite group (commutative or non-commutative) with elements $e_1, e_2, \dots, e_s$, $s \ge 2$. Consider the group algebra $G(\mathbb{R})$, i.e. the linear space of linear forms $x_1 e_1 + \dots + x_s e_s$, $x_i \in \mathbb{R}$, with the group multiplication of the basic elements $e_i$. Assume that the basis $e_i$ is orthonormalized. Let $\min_1(G)$ be the simplex formed by the vertices $e_1, e_2, \dots, e_s$ and let the set $\min_{[0,1]}(G)$ be the simplex formed by the vertices $0, e_1, e_2, \dots, e_s$, see Fig. 2. Then a measure $\mu \in \min_{[0,1]}(G)$ can be written as $\mu = p_1 e_1 + \dots + p_s e_s$, where $0 \le p_i \le 1$ and $\sum_i p_i \le 1$. The geometrical center of $\min_1(G)$ is the invariant measure $n_G = \tfrac{1}{s} e_1 + \dots + \tfrac{1}{s} e_s$. For any measure $\mu \in \min_{[0,1]}(G)$ we have

$$\mu\, n_G = n_G\, \mu = \mu(G)\, n_G. \qquad (11)$$

In the special case $\mu \in \min_1(G)$ we have $\mu\, n_G = n_G\, \mu = n_G$ and $n_G^2 = n_G$. Denote the line passing through the points $0$ and $n_G$ by $l$. Then, as can be seen from Fig. 2, $\min_1(G)$ is a part of the hyperplane orthogonal to the line $l$ and passing through the point $n_G$, and $\min_{[0,1]}(G)$ is the part of the positive orthant cut off by $\min_1(G)$.
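Relation (11) is easy to confirm numerically for a small group. The following sketch assumes, purely for illustration, that $G$ is the cyclic group of three elements written additively as indices modulo 3, and that a measure is stored as a list of its weights $p_1, p_2, p_3$; the names `convolve`, `mass` and `n_G` are hypothetical helpers, not notation from the original.

```python
from fractions import Fraction as F

S = 3  # order of the cyclic group Z_3 (an assumption made for this example)


def convolve(mu, nu):
    """Convolution of two measures on Z_3: the weight of e_i * e_j = e_{i+j}
    receives the product of the weights of e_i and e_j."""
    out = [F(0)] * S
    for i, p in enumerate(mu):
        for j, q in enumerate(nu):
            out[(i + j) % S] += p * q
    return out


def mass(mu):
    """Total mass mu(G)."""
    return sum(mu)


n_G = [F(1, S)] * S              # invariant (uniform) measure
mu = [F(1, 2), F(1, 4), F(0)]    # a measure with mu(G) = 3/4

# Relation (11): mu * n_G = n_G * mu = mu(G) * n_G.
assert convolve(mu, n_G) == convolve(n_G, mu) == [mass(mu) * w for w in n_G]
# The invariant measure is idempotent.
assert convolve(n_G, n_G) == n_G
```

The check works for any weights because every group element appears exactly once in each row of the multiplication table, so convolving with the uniform measure simply spreads the total mass evenly.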

Fig. 2

In fact, Fig. 2 corresponds to the case $s = 3$, when $G$ is the cyclic group of three elements. This case is of special interest because the algebra $G(\mathbb{R})$ is isomorphic to the direct sum of the field of real numbers and the field of complex numbers [19]. Consider the cube $Q$ as shown in Fig. 2. The cube $Q$ consists of all measures $\mu = \sum_{i=1}^{s} p_i e_i$ for which $0 \le p_i \le \tfrac{1}{s}$.

Proposition 6 The set $Q$, considered as a subset of the convex semigroup $\min_{[0,1]}(G)$, is a probability set with a single zero $0$ and a single unit $n_G$.

Proof. Let us establish that the set $Q$ is a semigroup with respect to the multiplication. Indeed, if $\mu = \sum_i p_i e_i$, $\nu = \sum_j q_j e_j$ belong to $Q$, then $0 \le p_i \le \tfrac{1}{s}$, $0 \le q_j \le \tfrac{1}{s}$ and therefore we have

$$\mu\nu = \sum_{i,j} p_i q_j\, e_i e_j = \sum_k \Big(\sum_i p_i q_{i_k}\Big) e_k,$$

where the indices $i_k$ are defined uniquely for each $i$ and $k$ by the condition $e_i \cdot e_{i_k} = e_k$, $i, k = 1, 2, \dots, s$. Since $G$ is a group, for any fixed $k$, $k = 1, 2, \dots, s$, the indices $i_k$ run over $1, 2, \dots, s$ when $i$ runs over $1, 2, \dots, s$. Therefore we have

$$\sum_i p_i q_{i_k} \le \frac{1}{s} \sum_i q_{i_k} \le \frac{1}{s}.$$


Now let us show that a complementary element $\bar{\mu}$ exists for each $\mu = p_1 e_1 + \dots + p_s e_s \in Q$. By definition 1 we must have $\mu + \bar{\mu} \in \mathcal{E}$. In our case we set $\mathcal{E} = \{n_G\}$. Then $\mu + \bar{\mu} = n_G$ and therefore $\bar{\mu} = n_G - \mu = (\tfrac{1}{s} - p_1) e_1 + \dots + (\tfrac{1}{s} - p_s) e_s \in Q$, since $0 \le p_i \le \tfrac{1}{s}$, $i = 1, 2, \dots, s$. Finally, let us check property (iv). Indeed, if $\mu \in Q$, $\mu \neq n_G$, then $\mu(G) = \lambda < 1$. Thus by virtue of (11) we have $\mu\, n_G = n_G\, \mu = \mu(G)\, n_G = \lambda\, n_G$.

The remaining properties of definition 1 for the set $Q$ follow straightforwardly from the properties of the probability set $\min_{[0,1]}(G)$.

Note that the Kolmogorov condition (vii) holds in $Q$.
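The two facts used in the proof of Proposition 6, closure of the cube $Q$ under convolution and existence of a complement inside $Q$, can be illustrated for $s = 3$. As before, this is only a sketch under assumptions: $G$ is taken to be the cyclic group of order 3 written as indices modulo 3, and `convolve`, `in_cube` and `complement` are hypothetical helper names.

```python
from fractions import Fraction as F

S = 3  # order of the cyclic group Z_3 (an assumption made for this example)


def convolve(mu, nu):
    """Convolution of measures on Z_3, stored as lists of weights."""
    out = [F(0)] * S
    for i, p in enumerate(mu):
        for j, q in enumerate(nu):
            out[(i + j) % S] += p * q
    return out


def in_cube(mu):
    """Membership in the cube Q: every weight lies in [0, 1/s]."""
    return all(0 <= p <= F(1, S) for p in mu)


def complement(mu):
    """Complementary element: n_G - mu, taken coordinate-wise."""
    return [F(1, S) - p for p in mu]


n_G = [F(1, S)] * S
mu = [F(1, 3), F(1, 6), F(0)]
nu = [F(1, 4), F(1, 3), F(1, 12)]

assert in_cube(mu) and in_cube(nu)
# The product of two elements of Q stays in Q.
assert in_cube(convolve(mu, nu))
# The complement lies in Q and adds up with mu to the single unit n_G.
assert in_cube(complement(mu))
assert [a + b for a, b in zip(mu, complement(mu))] == n_G
```

This only tests particular elements; the general statement rests on the inequality in the proof, which bounds each weight of the product by $1/s$.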

(ii) It proves to be possible to construct an even more general kind of probability set with a single unit as a subset of the set $M_{[0,1]}(G)$. For this purpose we consider an arbitrary convex semigroup S(G) in $M_1(G)$ and the convex set $S_0(G)$ formed by zero (0) and the elements of the set S(G). One can readily see that $S_0(G)$ also satisfies the properties of a probability set in which S(G) is the set of units.

Now we consider the set Q(S, G), which is the intersection of the set $S_0(G)$ with all half-spaces that contain zero and are bounded by hyperplanes parallel to the faces of $S_0(G)$ and passing through the point $\pi_G$.

Proposition 7. Let S be an arbitrary convex semigroup in $M_1(G)$, centrally symmetric with respect to the point $\pi_G$. Then Q(S, G) is a probability set with a single zero and a single unit.

Proof. We shall show that Q(S, G) is a semigroup with respect to convolution, and hence Q(S, G), as a subset of $M_{[0,1]}(G)$, is a probability set with a single unit $\pi_G$. First note that, in view of the central symmetry of S with respect to $\pi_G$, the intersection of any face of $S_0(G)$ with any hyperplane passing through the element $\pi_G$ and parallel to another face lies in the intersection of the faces of $S_0(G)$ and the hyperplane h passing through $\pi_G$ and perpendicular to the line $0\pi_G$.

Fig. 3 shows a plane $\pi$ passing through the point $\mu_0 \in S_0(G)$ and the line $0\pi_G$. The rhombus $0A\pi_G B$ is the intersection of Q(S, G) with this plane. Each element $\mu$ of this rhombus can be represented as $\mu = \pi_G - \lambda_1\mu_1$, where $\mu_1 \in S(G)$, $0 \le \lambda_1 \le 1$. Similarly, for each other element $\nu$ of Q(S, G) we also have $\nu = \pi_G - \lambda_2\nu_1$, where $\nu_1 \in S(G)$, $0 \le \lambda_2 \le 1$.


Fig 3

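Therefore the product $\mu\nu$ equals

$(\pi_G - \lambda_1\mu_1)(\pi_G - \lambda_2\nu_1) = \pi_G - \lambda_2\pi_G\nu_1 - \lambda_1\mu_1\pi_G + \lambda_1\lambda_2\mu_1\nu_1 = (1 - \lambda_1 - \lambda_2)\pi_G + \lambda_1\lambda_2\mu_1\nu_1$. (12)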

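Let us show that the element (12) belongs to Q(S, G). Consider the first case, when either $\lambda_1$ or $\lambda_2$ is greater than $\frac{1}{2}$. Let, for example, $\lambda_1 > \frac{1}{2}$.

Then the point $\mu$ lies in the left-hand side of the rhombus and thus can be represented as $t\mu'$, $\mu' \in S(G)$, $t < \frac{1}{2}$. On the other hand we have $\nu = \tau \cdot \nu'$ for $\nu \in Q(S, G)$, where $\nu' \in S(G)$, $0 \le \tau \le 1$. Therefore the product $\mu\nu$ is equal to $t\tau \cdot \mu'\nu'$, where $\mu' \cdot \nu' \in S(G)$ and $0 \le t\tau \le \frac{1}{2}$. Consequently, by construction of Q(S, G), the measure $\mu\nu$ lies to the left of the hyperplane h (Fig. 3), and consequently $\mu\nu \in Q(S, G)$.

Now consider the case when $\lambda_1 \le \frac{1}{2}$, $\lambda_2 \le \frac{1}{2}$. Then $p = 1 - \lambda_1 - \lambda_2 \ge 0$ and $q = \lambda_1\lambda_2 \ge 0$. Let us show that the inequality $p + 2q \le 1$ holds, which is equivalent to the inequality $\lambda_1 + \lambda_2 \ge 2\lambda_1\lambda_2$. Indeed, $(\lambda_1 - \lambda_2)^2 = \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Since $0 \le \lambda_1 \le 1$, $0 \le \lambda_2 \le 1$, we have $\lambda_1 + \lambda_2 - 2\lambda_1\lambda_2 \ge \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Whence $p + 2q \le 1$.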

Thus from (12) we have $\mu\nu = p\pi_G + q\mu_1\nu_1$, $\mu_1\nu_1 \in S(G)$, $p, q \ge 0$, $p + 2q \le 1$. Let us show that the measure $m = p\pi_G + q\omega$ belongs to Q(S, G) for any measure $\omega \in S(G)$.

Fig. 4 shows the plane passing through the points 0, $\omega$ and $\pi_G$. The point $m = p\pi_G + q\omega$ lies on the line parallel to $0\omega$ and passing through $p\pi_G$.

Now, to prove that m belongs to Q(S, G), it suffices to demonstrate that $|q\omega| \le |A|$. By similarity of the triangles $0\,\omega\,\pi_G$ and $p\pi_G\,B\,\pi_G$ we have

$\frac{|2A|}{|\omega|} = \frac{(1 - p)|\pi_G|}{|\pi_G|} = 1 - p$.

That is, $|A| = \frac{1}{2}(1 - p)|\omega|$. Then

$\frac{|A|}{|q\omega|} = \frac{\frac{1}{2}(1 - p)}{q} = \frac{1 - p}{2q} \ge 1$

follows from the inequality $p + 2q \le 1$.

Hypothesis. For arbitrary $S(G) \subset M_1(G)$ the set Q(S, G), as a subset of the convex semigroup $M_{[0,1]}(G)$, is a probability set with a single zero 0 and a single unit $\pi_G$.


We would like to note, in connection with the examples of Section 1, that a general description of probability sets in topological semi-fields and in the field of p-adic numbers is of great interest for applications.

We hope that problems of an experimental determination of abstract probabilities will be considered in the continuation of this work

5 Acknowledgments

In conclusion I want to express my gratitude to A. Yu. Khrennikov (Vaxjo Univ., Sweden), Yu. V. Prokhorov, O. V. Viskov, I. V. Volovich (all of Steklov Mathematical Institute, Russia), V. Ja. Kozlov (Academy of Cryptography, Russia), V. I. Serdobolskii (Moscow Univ. of Electronics and Math., Russia) and A. K. Kwasniewski (Bialystok Univ., Institute of Computer Science, Poland) for discussions and their advice on the foundations of probability theory and quantum mechanics. This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University.

References

1. A. N. Kolmogorov, Foundation of the probability theory (Chelsea Publ. Comp., New York, 1956).
2. T. L. Fine, Theories of probabilities: an examination of foundations (Academic Press, New York, 1973).
3. H. Heyer, Probability measures on locally compact groups (Springer-Verlag, Berlin-Heidelberg-New York, 1977).
4. Y. P. Studnev, TV and its applications 12, 727 (1967).
5. R. P. Feynman, Negative probability, in Quantum implications: Essays in Honour of David Bohm, eds. B. J. Hiley and F. D. Peat (Routledge and Kegan Paul, London, 1987).
6. P. Dirac, Rev. Mod. Phys. 17, 195 (1945).
7. O. G. Smolyanov and A. Y. Khrennikov, Dokl. Akademii Nauk USSR 281, 279 (1985).
8. V. S. Vladimirov, I. V. Volovich and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1993).
9. A. Y. Khrennikov, Theor. and Math. Phys. 97, 348 (1993).
10. A. Y. Khrennikov, Doklady Mathematics 55, 402 (1997).
11. A. Y. Khrennikov, Mathematical and physical arguments for the change of Kolmogorov's axiomatics, Trends in Contemporary Inf. Dim. Analysis and Quantum Probability, N1, 215-249 (2000).
12. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes, in The wave-particle dualism (D. Reidel Publ. Company, Dordrecht, 1958).
13. C. C. Chang, Transactions of the Amer. Math. Soc. 86, 467 (1958).
14. R. S. Grigolia, Algebraic analysis of Lukasiewicz-Tarski's n-valued logical systems, in Selected papers on Lukasiewicz sentential calculi (PAN, Ossolineum, Poland, 1977).
15. T. A. Sarymsakov, Topological semi-fields and its applications (FAN, Tashkent, 1989).
16. T. A. Sarymsakov, Topological semi-fields and probability theory (FAN, Tashkent, 1969).
17. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).
18. A. Y. Khrennikov, Physics Letters A 200, 219 (1995).
19. B. L. van der Waerden, Algebra I, Achte Auflage der Modernen Algebra (Springer-Verlag, Berlin-Heidelberg-New York, 1977).


QUANTUM K-SYSTEMS AND THEIR ABELIAN MODELS

H. NARNHOFER
Institut fur Theoretische Physik, Universitat Wien
Boltzmanngasse 5, A-1090 Wien
E-mail: narnh@ap.univie.ac.at

In this review the concept of quantum K-systems is studied, on the one hand based on a set of increasing algebras, on the other hand with respect to entropy properties. We consider in examples how far it is possible to find abelian models.

1 Introduction

Classical ergodic theory is a powerful discipline, both in mathematics and physics, to analyze mixing properties of a given dynamics. Since in physics the mixing properties take place on the microscopic level that is controlled by quantum theory, it is natural to try to translate the concepts of classical ergodic theory also into the quantum framework and to study how far these concepts can find their quantum counterpart and whether new features appear.

One possibility is the following: we start with a classical dynamical system, e.g. a free particle on a hyperbolic manifold with finite measure, and quantize the dynamics, i.e. study the properties of the Laplace-Beltrami operator on this manifold. Since the manifold has finite measure, the Laplace-Beltrami operator has necessarily discrete spectrum [1], and the classical mixing properties can only have their footprints in the distribution of the eigenvalues at high energy [2, 3]. Many deep results have been found on the basis of this approach. But in this review we will follow another path of considerations.

We start with the classical dynamical system with optimal mixing properties, the Kolmogorov system [4, 5, 6]. It can be characterized either by its algebraic structure or by properties of its dynamical entropy. Both concepts find their counterpart in quantum systems [7], but they are not equivalent any more.

First we will give the definition of an algebraic K-system and some definitions of dynamical entropies. One of them relates the quantum system to classical K-systems that can be considered as models of the quantum system. Then we will give examples of algebraic quantum K-systems and will discuss how far they can be represented by classical models. Finally we will give examples of quantum K-systems for which no classical model exists and, on the other hand, a quantum dynamical model that allows the construction of a classical model but for which the algebraic K-property so far cannot be controlled.


2 Classical K-System

Let us repeat the characteristics of a classical dynamical system $(A, \sigma, \mu)$, where we take A to be the abelian algebra built by the characteristic functions over a measure space with measure $\mu$, and $\sigma$ an automorphism over A with $\mu \circ \sigma = \mu$ [4, 5, 6].

Definition 2.1. We call $(A, A_0, \sigma, \mu)$ a K(olmogorov) system if

$A_0 \subset A, \quad \sigma A_0 \supset A_0, \quad \bigcup_n \sigma^n A_0 = A, \quad \bigcap_n \sigma^{-n} A_0 = \lambda 1$. (2.1)

For a given classical dynamical system $(A, \sigma, \mu)$ we can decide in several ways whether some $A_0$ (which is not unique) exists so that $(A, A_0, \sigma, \mu)$ forms a K-system [5, 6].

A) Choose some finite subalgebra $B \subset A$ (i.e. some finite partition of the measure space) and construct its past algebra $A_0 = \bigcup_{n \in N} \sigma^{-n} B$. If $A_0$ is a proper subalgebra of A, it will increase in time. Check whether $\bigcup_n \sigma^n A_0 = A$; if not, B has to be increased. If B is large enough, check whether $\bigcap_n \sigma^{-n} A_0 = \lambda 1$.

B) Consider the conditional entropy $H(B|A_0)$. If this expression is strictly positive $\forall B$, then $(A, \sigma, \mu)$ is a K-system.

C) If

$\lim_{n \to \infty} H(\sigma^n B | A_0) = H(B) \quad \forall B$, (2.2)

then $(A, \sigma, \mu)$ is a K-system.

The classical K-system can also be characterized by its clustering properties: Let $(A, A_0, \sigma, \mu)$ be a K-system. Then to every $B \in A$, $\varepsilon > 0$ $\exists\, n_0$ such that

$|\mu(B\,\sigma^{-n}A) - \mu(B)\mu(A)| < \varepsilon\,\mu(A) \quad \forall A \in A_0, \; n > n_0$. (2.3)

The prototypes of a K-system are the Bernoulli shifts (including the Baker transformation). We regard the Bernoulli shift as an infinite tensor product $A = \bigotimes_{l \in Z} B_l$, where $B_l$ is isomorphic to a finite abelian algebra, $B_l \cong B_0 = \{P_1, \dots, P_k\}$, with projections $P_i$ with expectation values $\mu_i$. The dynamics is given as the shift $\sigma$ over the tensor product. The state $\mu$ has to be translation invariant. It can be the tensor product of the local state, but we allow also spatial correlations. The dynamical entropy is given by

s u p t f l Q c S I | J arB (24) t=0 rlt-l+n J

= s u p i f f M J lt r B j (25)

and coincides with $H(B)$ if the state $\mu$ factorizes.
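The last statement can be illustrated for a factorizing state. The sketch below assumes a product state with single-site weights $\mu_1, \dots, \mu_k$ and uses the standard fact that the dynamical entropy of a partition is the limit of the block entropy per site; for a product state the per-site entropy of every block already equals $H(B)$. The function names are illustrative.

```python
import math
from itertools import product

def shannon_entropy(probs):
    """H = -sum p ln p (natural logarithm), ignoring zero weights."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def block_distribution(mu, n):
    """Probabilities of all words of length n under the product state."""
    return [math.prod(word) for word in product(mu, repeat=n)]

def entropy_per_site(mu, n):
    """Entropy of the n-site block divided by n."""
    return shannon_entropy(block_distribution(mu, n)) / n

if __name__ == "__main__":
    mu = [0.5, 0.25, 0.25]  # hypothetical single-site weights
    for n in (1, 2, 4):
        print(n, entropy_per_site(mu, n), shannon_entropy(mu))
```

With spatial correlations the block entropy per site would in general be smaller than $H(B)$, which is why the text singles out the factorizing case.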

3 Algebraic Quantum K-Systems

It is obvious that one can adopt Definition 2.1 directly to define an algebraic quantum K-system. It is also obvious that the definition is not empty, because we can construct the quantum analogue of a Bernoulli shift by taking for B a nonabelian algebra, e.g. a full matrix algebra $M_{k \times k}$. In the following we will first discuss physical applications of this quantum Bernoulli shift and then turn to generalizations.

A A model for Quantum Measurement

We start with a finite-dimensional algebra B and a state $\omega$ over B. In order to determine $\omega$ we have to make many copies of $\omega$ and repeat a variety of measurements. The classical Bernoulli shift consists of projections, and every measurement gives as outcome 0 or 1 on these projections with probability corresponding to the state $\mu$. By repeated measurements we can determine $\mu$ with exponentially increasing security.

In the quantal situation a measurement corresponds to picking some abelian subalgebra $B_0$ of B, maximal abelian if the measurement is sharp, and again the outcome of the measurement will be 0 or 1 on the projections in $B_0$. To determine the state $\omega$ we have to vary the measurements, respectively the algebras $B_0$. Since the state space over B is compact, it suffices to vary over finitely many $B_0$. Let $\omega(P_j) = p_j$ for $P_j \in B_0$. To get security $\varepsilon$ on the density distribution with respect to $B_0$, the number of experiments has to be of the order $p_j(1 - p_j)/\varepsilon^2$. For the algebra $B_0$ that commutes with the density matrix $\rho$ corresponding to $\omega$, the entropy $S(\rho_{B_0})$ is minimal and approximative security on the density distribution is reached for the smallest number of measurements. For other abelian subalgebras $\bar B_0$ we are satisfied with less security: we have just to be sure that $\rho_{\bar B_0}$ is more mixed than $\rho_{B_0}$. Let $p_j = \omega(P_j)$ for $P_j \in B_0$ and $\bar p_j = \omega(\bar P_j)$ for $\bar P_j \in \bar B_0$. The probability that the outcome of N measurements gives a probability $q_j > p_j + \varepsilon$ is

$\exp\left( -\frac{N (p_j - \bar p_j - \varepsilon)^2}{\bar p_j (1 - \bar p_j)} \right)$. (3.1a)

This has to be compared with the security given by N measurements on $B_0$,

$\exp\left( -\frac{N \varepsilon^2}{p_j (1 - p_j)} \right)$. (3.1b)

Therefore the number of experiments $\bar N$ necessary to control $\rho_{\bar B_0}$ is small compared to the number N that fixes $\rho_{B_0}$ and at the same time $\rho$. If we interpret the entropy as a measure of the reliability of a sequence of measurements, we see that it is not changed compared to the classical expression, i.e. the same order of experiments is necessary, and therefore

$S(\rho) = S(\rho_{B_0}) = -\mathrm{Tr}\,\rho \ln \rho$. (3.2)

Remark. In [8] it was questioned whether the Shannon information, resp. the von Neumann entropy (3.2), is the appropriate quantity. But in these considerations it was not taken into account that measurements on different abelian subalgebras are correlated. We have incorporated these correlations by taking into account the varying necessary accuracy, and in this way got the desired result.
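The claim that the measured entropy is minimal for the abelian subalgebra commuting with the density matrix can be illustrated for a single qubit. This is a minimal sketch under stated assumptions: the state is written as $\rho = \frac{1}{2}(1 + \vec r \cdot \vec\sigma)$ with Bloch vector $\vec r$, a sharp measurement is a spin measurement along a unit vector $\vec n$, and the sample-size estimate is the order-of-magnitude expression $p(1-p)/\varepsilon^2$ quoted in the text. The function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of the distribution (p, 1 - p)."""
    return -sum(x * math.log(x) for x in (p, 1.0 - p) if x > 0)

def von_neumann_entropy(r):
    """S(rho) for a qubit with Bloch vector r; eigenvalues are (1 +- |r|)/2."""
    norm = math.sqrt(sum(c * c for c in r))
    return binary_entropy((1.0 + norm) / 2.0)

def measurement_entropy(r, n):
    """Shannon entropy of the outcomes of a spin measurement along unit vector n."""
    dot = sum(a * b for a, b in zip(r, n))
    return binary_entropy((1.0 + dot) / 2.0)

def required_samples(p, eps):
    """Order of the number of experiments needed for accuracy eps."""
    return p * (1.0 - p) / eps ** 2

if __name__ == "__main__":
    r = (0.0, 0.0, 0.6)  # hypothetical state, diagonal in the z basis
    print(von_neumann_entropy(r))
    print(measurement_entropy(r, (0.0, 0.0, 1.0)))  # commuting algebra
    print(measurement_entropy(r, (1.0, 0.0, 0.0)))  # non-commuting algebra
```

Measuring along the direction of $\vec r$ reproduces the von Neumann entropy, while any other direction gives a more mixed outcome distribution and hence a larger Shannon entropy.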

B Lattice Systems

Again we choose a matrix algebra B and define $A = \bigotimes_{n \in Z} B_n$ as before. But now the algebra describes particles on a lattice (one-dimensional for $n \in Z$), the shift corresponds to space translation, and the translation invariant state describes the system in e.g. the ground state or equilibrium state with respect to some Hamiltonian, e.g. the Heisenberg ferromagnet. Therefore in general the state will not factorize but be obtained as [9]

$\omega(A) = \lim_{\Lambda \to Z} \frac{\mathrm{Tr}\, e^{-\beta H_\Lambda} A}{\mathrm{Tr}\, e^{-\beta H_\Lambda}}$. (3.3)

We assume that the sequence of local Hamiltonians $H_\Lambda$ determines a time automorphism on the algebra that commutes with space translation. We can assume that $\omega(A)$ is space translation invariant. In order that we have an algebraic K-system on the von Neumann level (in the weak topology) it is necessary that the state is extremal space translation invariant. This can be achieved, if necessary, by a unique decomposition as in the classical situation [9].


C Fermi Systems

We consider the CAR algebra $A = \{a(f), a^\dagger(g)\}$ either over $l^2(Z)$ or $L^2(R)$. The shift defines an automorphism over A, and the K-property is satisfied with $A_0 = \{a(f), a^\dagger(f), \; \mathrm{supp}\, f \in Z^- \text{ or } R^-\}$. This is not a Bernoulli K-system, because creation and annihilation operators anticommute.

D Quantum Stationary Markov Processes

Another example [10] of a K-system is provided by stationary Markov chains. Here many variations of the definition of such a Markov chain exist. We give an explicit example that again cannot be imbedded into a Bernoulli system.

Let $A_0$ be a $2 \times 2$ matrix algebra and $C = \bigotimes_{n \in Z} C_n$ a Bernoulli system, $C_n$ again a $2 \times 2$ matrix algebra. Define the map $T_1: A_0 \otimes 1 \to A_0 \otimes C_1$ by

$T_1(\sigma_x \otimes 1) = \sigma_x \otimes \sigma_x$,
$T_1(\sigma_y \otimes 1) = \sigma_x \otimes \sigma_y$, (3.4)
$T_1(\sigma_z \otimes 1) = 1 \otimes \sigma_z$.

On C we consider the shift $\tau$ and a $\tau$-invariant state $\omega$. Therefore we can define

$T = (T_1 \otimes \mathrm{id}_C) \circ (\mathrm{id}_{A_0} \otimes \tau)$. (3.5)

Then $A_{[m,n]} = \bigcup_{m \le k \le n} T^k(A_0)$ and $(A_{[-\infty,\infty]}, A_{[-\infty,0]}, T, \varphi \otimes \omega)$ define a K-system for arbitrary states $\varphi$ over $A_0$.

It can easily be seen that, though $A_{[-\infty,\infty]}$ can be imbedded in $A \otimes C$, the automorphism T is not asymptotically abelian:

$[T^n(\sigma_x \otimes 1), \sigma_z \otimes 1] = -2i\, \sigma_y \otimes \sigma_x \otimes \cdots \otimes \sigma_x$. (3.6)
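The algebraic content of (3.4) and (3.6) can be checked with explicit matrices. The sketch below assumes the reading $T_1(\sigma_x \otimes 1) = \sigma_x \otimes \sigma_x$, $T_1(\sigma_y \otimes 1) = \sigma_x \otimes \sigma_y$, $T_1(\sigma_z \otimes 1) = 1 \otimes \sigma_z$ of the garbled source formula, which is the reading under which the images satisfy the Pauli relations. Only one step of the dynamics is tested, and all helper names are illustrative.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def equal(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Assumed images of the Pauli matrices under T_1 (reading of (3.4)).
T1 = {"x": kron(SX, SX), "y": kron(SX, SY), "z": kron(I2, SZ)}

def preserves_pauli_relations():
    """Check squares equal 1 and sigma_a sigma_b = i sigma_c on the images."""
    identity = kron(I2, I2)
    for a, b, c in (("x", "y", "z"), ("y", "z", "x"), ("z", "x", "y")):
        if not equal(matmul(T1[a], T1[a]), identity):
            return False
        if not equal(matmul(T1[a], T1[b]), scale(1j, T1[c])):
            return False
    return True

def one_step_commutator():
    """[T(sigma_x x 1), sigma_z x 1], restricted to A_0 and the first site of C."""
    return commutator(T1["x"], kron(SZ, I2))
```

The commutator returned by `one_step_commutator()` is nonzero, which is the one-step version of the failure of asymptotic abelianness stated in the text.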

E Price-Powers Shift

Another illustrative example for a quantum K-system is the Price-Powers shift [11].

Let $e_i$ be a unitary satisfying $e_i^2 = 1$. Let

$e_i e_k = (-1)^{g(i-k)} e_k e_i$ with $g(i-k) \in \{0, 1\}$. (3.7)

Let $\sigma e_k = e_{k+1}$. Then

$(A_{g,0} = \{e_i, \; i \le 0\}, \quad A_g = \{e_i, \; i \in Z\}, \quad \sigma, \quad \tau)$

form an algebraic K-system, where $\tau$ is the tracial state

$\tau(e_I) = \delta_{I,\emptyset}$ with $e_I = \prod_{i_1, \dots, i_k \in I} e_{i_1} \cdots e_{i_k}$. (3.8)

Special examples are

a) $g(1) = 1$, $g(k) = 0$ otherwise. Then the algebra coincides with $\bigotimes M_{2 \times 2}$, where

$e_{2k} = \sigma_z \otimes \sigma_z \in M_{k+1} \otimes M_k$,
$e_{2k+1} = 1 \otimes \sigma_x \in M_k$.

b) $g(i) = 1 \; \forall i$. Then the algebra coincides with CAR on Z,

$e_i = a_i + a_i^\dagger$.

Other explicit examples can be found in [12]. In all these examples (A-E) we inherit from the classical theory the following

Theorem. Let $(A, A_0, \sigma, \omega)$ be a K-system and $\omega$ an extremal translationally invariant state. (That is equivalent to $\bigcap_n \sigma^{-n} A_0 = \lambda 1$ in the strong topology.) Then to every $A \in A$, $\varepsilon > 0$ $\exists\, n_0$ such that

$|\omega(A\,\sigma^{-n}B) - \omega(A)\omega(B)| < \varepsilon \|B\|, \quad n > n_0, \; B \in A_0$. (3.9)

Therefore we have the same clustering properties as in (2.3).

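Proof. If $\omega$ is the tracial state, $\tau(AB) = \tau(BA)$, then in the GNS representation

$\omega(B) = \langle\Omega|\pi(B)|\Omega\rangle$,

$\pi(A_0)$ defines a projection operator $P_0$, $P_0 H = \overline{\pi(A_0)\Omega}$, that is increasing, respectively decreasing, in $\sigma^n$:

$\omega(A\,\sigma^{-n}B) = \omega(A\,\sigma^{-n}P_0\,\sigma^{-n}B)$

and

$\mathrm{st\text{-}}\lim_{n \to \infty} \sigma^n P_0 = 1, \quad \mathrm{st\text{-}}\lim_{n \to \infty} \sigma^{-n} P_0 = |\Omega\rangle\langle\Omega|$. (3.10)

If $\omega$ is not the tracial state but a KMS state, it cannot be excluded that $\Omega$ is not only cyclic for $\pi(A)$ but also for $\pi(A_0)$. But in this case the modular operator $\Delta_0$ corresponding to $\pi(A_0)$ can replace $P_0$ for controlling the cluster properties and satisfies [13]

$\mathrm{st\text{-}}\lim_{n \to \infty} \sigma^{-n}\, \frac{1}{\Delta_0^{1/2} + 1} = \frac{1}{2} |\Omega\rangle\langle\Omega|$. (3.11)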

4 Dynamical Entropy

The dynamical entropy of classical ergodic theory can be interpreted in two different ways.

If we use the definition

$h(\sigma) = \sup_B H(\sigma, B) = \sup_B H\Big(B \,\Big|\, \bigcup_n \sigma^{-n}B\Big)$, (4.1)

then it measures how the algebraic K-system increases and how in the course of time our information on the complete system increases.

If we concentrate on the fact that

$\lim_{k \to \infty} H\Big(\sigma^k B \,\Big|\, \bigcup_n \sigma^{-n}B\Big) = H(B)$, (4.2)

it describes that the remote past becomes more and more irrelevant for the present. Both properties can inspire us to look for an appropriate definition of a dynamical entropy for a quantum dynamical system.

a) For an algebraic K-system we can just copy the definition of a classical K-system.

Definition. Given two subalgebras $A, B \subset M$ and $\omega$ a state over M. Then we define, with $S(\omega|\varphi)$ the relative entropy, the conditional entropy $H(A|B)$:

$H_\omega(A|B) = \sup_{\omega = \sum_i \omega_i} \sum_i \big( S(\omega|\omega_i)_A - S(\omega|\omega_i)_B \big)$. (4.3)

Evidently $H(A|B) \ge 0$. By monotonicity of the relative entropy, $H(A|B) = 0$ if $A \subset B$.

Let $(A, A_0, \sigma, \omega)$ be an algebraic K-system. Then $H_\omega(\sigma A_0|A_0)$ measures how fast $A_0$ is increasing. The above expression has not been much investigated. The main reason lies in the fact that for a given quantum dynamical system, different from the classical situation, no strategy is known to decide whether an $A_0$ with the desired properties exists. If it exists, there is no reason to assume that it is unique. In the classical situation the dynamical entropy does not depend on the special choice of $A_0$. In a quantum system, due to the lack of a constructive approach to $A_0$, we also have no chance to compare $H(\sigma A_0|A_0)$ with respect to different past algebras $A_0$.

There exists also another characterization for the amount of increase

For $A \supset A_0$, both type II$_1$ algebras, define $P_0$ as the projector on $A_0\Omega$ in the GNS representation of the tracial state over A, $P_0 \in \pi(A_0)'$. Then [14]

$[A : A_0] = \tau(P_0)^{-1}$, (4.4)

with $\tau$ the trace over $\pi(A_0)'$.

This definition has been generalized to type III algebras by [15]. Note that it is not state dependent. As a typical example it can be evaluated for the Price-Powers shift: both (4.3) and (4.4) are independent of the sequence g and give ln 2 resp. 2. But it should be noted that in general there exists only an order relation [16]

H(aAoAo) lt 2 1 o g M o M-

b) The main obstacle to using (4.3) or (4.4) as a definition for the dynamical entropy comes from the fact that for noncommutative algebras in general $\bigcup_{n=1}^{\infty} \sigma^{-n}B$ will increase in a way that can hardly be controlled.

An illustrating example is given by the following observation [17]:

Take $A = \{a(f), a^\dagger(g), \; f, g \in L^2(R)\}$, with $\sigma$ the space translation. We know already that it corresponds to a K-system with $A_0 = \{a(f), a^\dagger(g), \; f, g \in L^2(R^-)\}$. But if we pick $a(e^{-x^2})$ and construct the algebra $\hat A_0 = \{a(e^{-(x-\alpha)^2}), \; \alpha > 0\}$, then $\hat A_0$ coincides with A: if it did not, we could find some f with $\langle f | e^{-(x-\alpha)^2} \rangle = 0 \; \forall \alpha > 0$, and this is impossible due to the analyticity properties of the Gauss function.

Due to this fact, [18] proposed the following definition for a dynamical entropy.


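Definition. Let M be a hyperfinite von Neumann algebra with a faithful normal trace $\tau$. Let Pf(M) be the family of finite subsets of M. Let $\omega, \chi \subset M$. We write

$\omega \subset^{\delta} \chi$

if for every $x \in \omega$ there exists a $y \in \chi$ such that

$\tau((x - y)^*(x - y)) < \delta$. (4.5)

Let F(M) be the family of finite dimensional C* subalgebras of M. Then

$r_\tau(\omega; \delta) = \inf\{\mathrm{rank}\, A : A \in F(M), \; \omega \subset^{\delta} A\}$, (4.6)

$ha_\tau(\sigma, \omega; \delta) = \limsup_{n \to \infty} \frac{1}{n} \log r_\tau\Big( \bigcup_{j=0}^{n-1} \sigma^j(\omega); \delta \Big)$,

$ha_\tau(\sigma, \omega) = \sup_{\delta > 0} ha_\tau(\sigma, \omega; \delta)$,

$ha_\tau(\sigma) = \sup\{ha_\tau(\sigma, \omega) : \omega \in Pf(M)\}$. (4.7)

The notation stands for approximation entropy of $\sigma$.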

The above definition allows many variations. For instance the lim sup can be replaced by a lim inf, and we can hope, but it is not proven, that this does not change the definition.

New information can be gained if we change the approximation condition (4.5).

The topological entropy uses the approximation in norm. But to keep generality we cannot assume that the full matrix algebra belongs to A. Concentrating on nuclear C* algebras, we have to approximate via completely positive maps $(\varphi, \psi, B)$, with B a finite dimensional algebra, $\varphi: M \to B$ and $\psi: B \to M$, such that

$\|\psi \circ \varphi(a) - a\| < \delta \quad \forall a \in \omega$. (4.8)

$hat(\sigma)$ is defined as $ha_\tau$, only under the new approximation condition. If M is an AF-algebra and therefore possesses a tracial state, then the topological entropy dominates the approximation entropy,

$ha_\tau(\sigma) \le hat(\sigma)$. (4.9)

As another possibility we can approximate $\psi \circ \varphi(a) - a$ in the strong topology in a given representation corresponding to a state $\phi$ and replace the rank of the best algebra A by the entropy [19]

s = (ipoip)

All these definitions satisfy the requirement that they coincide with the usual definitions (state dependent dynamical entropy or topological entropy) if we apply them to commutative algebras.

Let us finally remark that, applied to the Price-Powers shift, again independent of g in (3.7),

haT(a) = hat(a) = ht - ltp(a) = ^ H(AoW1 AQ) (410) Li

For further studies we refer to Stormer, Choda and Dykema [20, 21, 22].

c) An approach that differs very much from the mathematically motivated definition of Voiculescu is offered by Alicki and Fannes [23]. It is motivated by the concrete method by which we are able to determine the state of a system experimentally: we perform a measurement and repeat the measurement in the course of time. Here we use the idea of the history of a system as discussed e.g. in [24, 25].

A single measurement corresponds to a partition of unity

$\sum_{j=0}^{k-1} x_j^* x_j = 1$. (4.11)

In fact we may think that the $x_j$ are commutative selfadjoint projection operators. But by time evolution this commutativity is destroyed anyhow, and also for the necessary estimations it is preferable to consider this generalized partition of unity without further restrictions on the $x_j$. Repetition of the measurement corresponds to a composed partition:

$\chi = (x_0, \dots, x_{k-1})$,

$\sigma\chi = (\sigma x_0, \dots, \sigma x_{k-1})$,

$\sigma\chi \circ \chi = (\dots, \sigma(x_i)\, x_j, \dots)$,

i.e. a partition of size $k^2$. The matrix

$M_\chi = (\omega(x_i^* x_j))_{i,j}$

defines a density matrix of dimension k with entropy

$H(\chi) = S(M_\chi)$. (4.12)

As dynamical entropy $h(\chi)$ we define

$h(\chi) = \limsup_m \frac{1}{m} H(\sigma^{m-1}\chi \circ \cdots \circ \sigma\chi \circ \chi) = \limsup_m \frac{1}{m} S(M_{\sigma^{m-1}\chi \circ \cdots \circ \sigma\chi \circ \chi})$,

$h(\sigma) = \sup_\chi h(\chi)$. (4.13)

But here a problem arises: if we do not restrict B in the algebra A, we lose control on the dynamical entropy. For instance, if we take as C*-algebra the Cuntz algebra [9] with $U_i^* U_j = \delta_{ij}$ and $U_j U_j^* = P_j$ and use the $U_i$ for $\chi$, then the identity map has infinite dynamical entropy. If for instance we consider the shift on the lattice system B), then we can choose as natural subalgebra B that is dense in A the algebra of strictly local operators. Some weakening of this restriction is possible, and this is of course necessary if we want to apply the theory to time evolution with interaction, where local operators immediately delocalize. But this delocalization decreases exponentially fast in space [26]; therefore B consisting of exponentially localized operators should be sufficient to define a dynamical entropy for time evolution in the sense of Alicki and Fannes. As an example we consider the shift on the lattice. Then

$h_{AF}(\sigma) = s(\omega) + \ln d$, (4.14)

where $s(\omega)$ is the entropy density corresponding to the state $\omega$ and d is the dimension of the full matrix algebra of each lattice point.

d) As the last proposal for the definition of a dynamical entropy we describe the one which in fact has the longest history. First it was proposed by Connes and Stormer for type II algebras [27] and then generalized in [28] and [29] to general situations. We present the definition given by Sauvageot and Thouvenot [30], which they showed to be equivalent to the ones in [27] and [29] for hyperfinite algebras. In their definition it is most evident that this dynamical entropy measures how far the quantum system is related to a classical K-system. In addition, concepts developed in this framework also find their application in quantum information theory.


Definition (the entropy defect of an abelian model). Let $(A, \omega)$ be a nonabelian algebra with state $\omega$. Let $(B, \mu)$ be an abelian algebra with state $\mu$ that is coupled to A by a state $\lambda$ over $A \otimes B$ satisfying $\lambda|_A = \omega$, $\lambda|_B = \mu$. Its entropy defect is defined as

$H_\lambda(B|A) = H_\mu(B) - S(\omega \otimes \mu | \lambda)_{A \otimes B}$. (4.15)

Theorem. The entropy of the state $\omega$ is given as

$S_A(\omega) = \sup_{(B, \mu, \lambda)} [H_B(\mu) - H_\lambda(B|A)]$. (4.16)

In fact there exist many abelian models that optimize the above expression: every decomposition of $\omega$ into pure states, $\omega = \sum_{i=1}^{n} \mu_i \omega_i$, can be interpreted as an abelian model with $B = \{P_1, \dots, P_n\}$ and $\mu(P_i) = \mu_i$, $\lambda(P_i \otimes A) = \mu_i \omega_i(A)$.
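The entropy defect of such a decomposition can be made concrete for a qubit. The sketch below rests on an assumption that is not spelled out in the text: for a decomposition into pure states the relative entropy term $S(\omega \otimes \mu | \lambda)_{A \otimes B}$ equals $S(\omega)$ (it is the Holevo quantity of the ensemble, and pure components have zero entropy), so the defect reduces to $H_\mu(B) - S(\omega)$. States are represented by Bloch vectors, and the function names are illustrative.

```python
import math

def shannon(weights):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(w * math.log(w) for w in weights if w > 0)

def qubit_entropy(bloch):
    """Von Neumann entropy of the qubit state with the given Bloch vector."""
    norm = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    return shannon([(1.0 + norm) / 2.0, (1.0 - norm) / 2.0])

def mixture(weights, bloch_vectors):
    """Bloch vector of omega = sum_i mu_i omega_i."""
    return tuple(sum(w * v[k] for w, v in zip(weights, bloch_vectors))
                 for k in range(3))

def entropy_defect(weights, pure_states):
    """H_mu(B) - S(omega) for a decomposition into pure qubit states.

    Assumes every vector in pure_states has unit length (a pure state).
    """
    omega = mixture(weights, pure_states)
    return shannon(weights) - qubit_entropy(omega)

if __name__ == "__main__":
    # Spectral (orthogonal) decomposition: the abelian model has no defect.
    print(entropy_defect([0.7, 0.3], [(0, 0, 1), (0, 0, -1)]))
    # Non-orthogonal pure states: same kind of model, strictly positive defect.
    print(entropy_defect([0.5, 0.5], [(0, 0, 1), (1, 0, 0)]))
```

Under this assumption both decompositions reach the supremum in (4.16); they differ only in how much of the abelian entropy $H_\mu(B)$ is wasted as defect.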

Due to quantum effects the entropy is not monotonically increasing if we consider an increasing sequence $A_n \subset A_m$, $n < m$. But monotonicity can be regained if we change the definition to

Definition. Let $A \subset C$ and let $(B, \mu)$ be an abelian model for $(C, \omega)$. Then

$H_{\omega,C}(A) = \sup_{(B, \mu, \lambda)} [H_B(\mu) - H_\lambda(B|A)]$. (4.17)

This suggests the definition for a dynamical entropy:

Definition. Given $(A, \sigma, \omega)$, a quantum dynamical system. The dynamical entropy is given by

$h_\omega(\sigma) = \sup [H_\mu(P|P_-) - H_\lambda(P|P_- \otimes A)]$, (4.18)

where the supremum is taken over all dynamical abelian models $(B, \mu, \Theta)$ with $\mu \circ \Theta = \mu$ and coupling $\lambda \circ (\Theta \otimes \sigma) = \lambda$, $\lambda|_A = \omega$, $\lambda|_B = \mu$. Here $P_- = \bigcup_{n=1}^{\infty} \Theta^{-n}P$ is the past algebra of the partition P.

Remark. There holds equality between $h_\omega(\sigma)$ and

$\sup [H_\mu(P|P_-) - H_\lambda(P|A)]$. (4.19)


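This is based on considering

$H(P|P_-) = \lim_{k \to \infty} \frac{1}{k} H\Big( \bigvee_{l=0}^{k-1} \Theta^l P \Big)$,

$H(P|P_- \otimes A) = \lim_{k \to \infty} \frac{1}{k} H\Big( \bigvee_{l=0}^{k-1} \Theta^l P \,\Big|\, A \Big)$

and taking $\bigvee_{l=0}^{k-1} \Theta^l P$ as a new abelian model.

It is evident that one can also define the dynamical entropy with respect to a subalgebra $C \subset A$,

$h_\omega(\sigma, C) = \sup [H_\mu(P|P_-) - H_\lambda(P|P_- \otimes C)]$, (4.20)

an expression that we need if we want to discuss 2C) in the framework of quantum systems. Notice that (4.19) cannot be replaced in general by an expression like (4.18).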

The main task now is to find abelian models. This can be done in much the same way as for calculating the entropy of a state.

Theorem Assume a state w is decomposed

w = ^MiiibdquoWi1in (421)

Define

Consider

lt lt = 1^ WiiraquoWiiiraquo-it l^k

H(C aC ak^C) = 5( W ) - pound S$) + pound ^ S M U ^ ^ - M

(422)

Consider now the decomposition

w = ^ p y 51 E 1 - i W i - - i laquo ^ = Sibdquoiwltilt-- (423) r = -

In the limit limmdash limbdquo^oo (i-e we have to start with a sufficiently large decomposition) the pik converge to an abelian model and all


abelian models can be obtained in this way. The detailed proof for this statement can be found in [30].

This theorem enables us to find lower limits for the dynamical entropy Together with the fact that

1 H(CaCak-lC) lt SU(C) + 0(8) (424)

if C C C in the sense of (45) or (48) we also have the upper bound29

h(a) lt sup lim H(C ltr-1C) (425) c k

so that in some cases we can really evaluate the dynamical entropy

5 Some General Considerations on Abelian Models

As we already mentioned, the entropy of a state over a quantum system can be calculated via an abelian model. For a matrix algebra this viewpoint may look superficial, but it has found an important application in the theory of entangled states, where subalgebras $A \otimes B \subset C$ are considered and the entanglement describes that a pure state over C will not be pure as a state over A resp. B. This entanglement can be used for quantum communication, and the amount of this applicability is expressed as the entanglement of formation [31] (compare (4.17)):

$E_\omega(A) = S(\omega)_A - H_\omega(A) = \min \sum_i \mu_i S(\omega_i)_A$. (5.1)
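For a pure state over C the only decomposition is the trivial one, so the minimum in (5.1) is simply the entropy of the restriction to A. The sketch below illustrates this special case for two qubits, assuming C is the $4 \times 4$ matrix algebra, A the first tensor factor, and the pure state given by an amplitude matrix $c_{ij}$ (state vector $\sum_{ij} c_{ij} |i\rangle|j\rangle$). The function names are illustrative, and mixed states, where a genuine minimization is needed, are not covered.

```python
import math

def reduced_eigenvalues(c):
    """Eigenvalues of the reduced density matrix rho_A = c c^dagger / norm.

    c is a 2x2 nested list of (possibly complex) amplitudes c[i][j].
    For a 2x2 density matrix with trace 1 and determinant d the
    eigenvalues are (1 +- sqrt(1 - 4 d)) / 2, and det(c c^dagger) = |det c|^2.
    """
    norm = sum(abs(c[i][j]) ** 2 for i in range(2) for j in range(2))
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    d = abs(det) ** 2 / norm ** 2
    disc = math.sqrt(max(0.0, 1.0 - 4.0 * d))
    return (1.0 + disc) / 2.0, (1.0 - disc) / 2.0

def entanglement_pure(c):
    """Entropy of the restriction to A, i.e. E for a pure two-qubit state."""
    return -sum(p * math.log(p) for p in reduced_eigenvalues(c) if p > 0)

if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    print(entanglement_pure([[1.0, 0.0], [0.0, 0.0]]))  # product state
    print(entanglement_pure([[s, 0.0], [0.0, s]]))      # maximally entangled
```

A product state gives zero, while the maximally entangled state gives ln 2, the largest value possible for a qubit.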

Expressed in terms of an abelian model we can also write

$H_\omega(A) = \sup_{\lambda, B_0} S(\omega \otimes \mu | \lambda)_{A \otimes B_0}$, (5.2)

where $\lambda$ is a state over $B_0 \otimes C$.

We have the following inequality: Let $\omega$ as a state over C be written in the GNS representation

$\omega(C) = \langle\Omega|\pi(C)|\Omega\rangle$

and let $C'$ be the commutant in this representation. Then

$S(\omega \otimes \omega | \omega)_{A \otimes C_0'} \le H_\omega(A) \le S(\omega \otimes \omega | \omega)_{A \otimes C'}$ (5.3)

with $C_0'$ any abelian subalgebra of $C'$. A maximal abelian subalgebra of $C'$ gives a lower bound to the entropy, and in some cases it even is the best abelian model (compare [32] and the explicit results in [33] for estimates on E, i.e. without dynamics), but in other examples ([32], see also the forthcoming 6E) it is evidently too small. If in addition the abelian model has to carry a dynamics, the question arises when the abelian model can be imbedded into the commutant (or whether, by the natural isomorphism, the algebra itself contains a sufficiently large time invariant abelian subalgebra).

Here we have the following results

Theorem [34]. Assume that $(A, \sigma, \omega)$ is a dynamical system and $\omega$ a tracial state. Assume that the analogue of 2C) (entropic K-system) is satisfied, i.e.

lim H(onB) = H(A) V finite dimensional B C A nmdashtoo

Then

$\mathrm{st\text{-}}\lim_{n \to \infty} [A, \sigma^n B] = 0 \quad \forall A, B$. (5.4)

Proof. It suffices to choose $B = \{P\}$ for all projection operators P in A. Then $\{P\}$ is its own best abelian model in the calculation of $H(B)$. Refinements of the models $\{P\}$, $\{\sigma^n P\}$ have to be used to calculate $H(\sigma^n B)$ (compare the theorem with (4.23)). But they are only possible if P and $\sigma^n P$ nearly commute.

The theorem was generalized to other states [34], but with the restriction that we had to be able to keep control over sufficiently many optimal abelian models. We believe that these restrictions can be removed by a harder analysis.

Another result on footprints of commutativity is the following

Theorem [35]. Assume that in the calculation of the dynamical entropy there exists an optimal abelian model, i.e.

$h(\sigma) = \sup\, (4.19) = \max_{\lambda, \mu, B_0}\, (4.19)$; (5.5)

then the algebra A contains an abelian subalgebra $A_0$ on which $\sigma$ acts as an automorphism. Notice that this does not imply that this abelian subalgebra already is the optimal abelian model.

6 Abelian Models for Algebraic K-Systems

In the following we will discuss the examples of algebraic K-systems given in Sect. 3 and how far they allow us to find good abelian models.

A) In this model of a quantized Bernoulli system that completely factorizes, the obvious choice of the abelian model that gives the correct result is

$A_0 = \bigotimes_{n \in Z} B_{0,n}$,

where $B_0$ is the abelian algebra that commutes with $\rho$ and describes the measurements with maximal certainty.

B) For the lattice system for which the state does not factorize any more it does not suffice to pick a suitable abelian subalgebra at every lattice point This provides an abelian model but not an optimal one Accordshying to the observations (425) it is clear that an upper bound for the dynamical entropy is given by the entropy density 29 and it seems very plausible that it should not be less To our knowledge no general proof is available but for the states that are of physical interest equality is shown

Already in 29 equality was shown under some compatibility relation beshytween space translation and modular automorphism Only in reality it is difficult to check whether this compatibility relation holds For quasifree states this is possible and was done in 3 6 Here an abelian subalgebra was selected for increasing size of the tensor product This subalgebra delocalizes but only to such an extent that the convergence of these subalgebras to an abelian model that gives the desired result can be controlled

In 37 equilibrium states over lattice systems as in 9 were considered and a decomposition offered that in the limit gave the desired result 38 applied the affinity of the dynamical entropy to control these limits and allow to exchange them His ideas are generalized in39 giving the following result

If you assume that the shift $\sigma$ is asymptotically abelian (i.e. we consider not only lattice algebras but some generalization in the framework of AF-algebras) and you consider a dynamics given by a sequence of local Hamiltonians, then:

The thermodynamic limit of the equilibrium states exists, and they satisfy the KMS property with respect to the dynamics.

For these states the entropy density and the dynamical entropy of the shift coincide. The dynamical entropy of the shift can be used in a thermodynamic variation principle. This variation principle is satisfied exactly by states that are KMS with respect to the time evolution.


The maximal dynamical entropy is achieved by the tracial state and coincides in this state with the Voiculescu dynamical entropy $h_{at}$ (4.9). In all these examples the abelian model is constructed by considering the sequence $\rho = e^{-H_\Lambda}$ and the corresponding minimal projectors in (4.21-23).

There exists another possibility to construct space translation invariant states on the lattice, namely the method of correlated states.

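We start again with our chain $\mathcal{A} = \bigotimes_n \mathcal{B}_n$. In addition we choose an algebra $\mathcal{C}$ (we restrict to finite dimensional ones) and consider some completely positive map $F: \mathcal{C} \otimes \mathcal{B} \to \mathcal{C}$ that we can write as $f_b(c)$, and we demand $f_1(1) = 1$. Let $\Omega$ be a state over $\mathcal{C}$ satisfying $\Omega \circ f_1 = \Omega$. Then we define

$$\omega(b_1 \otimes b_2 \otimes \cdots \otimes b_k) = \Omega\big(f_{b_1} \circ f_{b_2} \circ \cdots \circ f_{b_k}(1)\big),$$

where $b_i$ is an operator at the lattice point $i$ (many of them can be $1$).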

It can be checked that in this way we obtain a translation invariant state. If e.g. $f_b(1) = \omega(b) \cdot 1$, then we obtain a state that is clustering. If we want to have nontrivial correlations between nearest neighbours we have to choose another $f$, but this enforces that there must also be correlations to other neighbours. Space clustering is encoded in the convergence properties of $(f_1)^n$.

Now the construction of an abelian model is offered by a decomposition of $F$ into finer completely positive maps. Convergence properties in the construction of abelian models, as required in (4.23), are now controlled by convergence properties of $F$ (which acts over finite dimensional algebras) instead of convergence properties of space correlations. Again we have to choose $\mathcal{B}_n$ sufficiently large, i.e. combine sufficiently many lattice points. With appropriate estimates it was shown [41] that for all finitely correlated states ($\mathcal{C}$ of finite dimension) the dynamical entropy and the entropy density of the states so constructed coincide.

C) The Fermi Algebra

If we concentrate on the even subalgebra $\mathcal{A}_e$ of the CAR algebra, i.e. the algebra consisting of even polynomials in creation and annihilation operators, this is just a special AF-algebra that is asymptotically abelian, and therefore the results in [39] guarantee that for equilibrium states the dynamical entropy of space translation and the entropy density coincide.

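If in addition we apply the theorem [29]

$$h(\sigma^n) = n\,h(\sigma),$$

then obviously

$$h_{\mathcal{A}_e}(\sigma^n) \le h_{\mathcal{A}}(\sigma^n) = h_\mu(P|P_-) - H(P|P_- \otimes \mathcal{A}) \le h_\mu(P|P_-) - H(P|P_- \otimes \mathcal{A}_e) + \ln 2 \quad (6.1)$$

shows that $h_{\mathcal{A}_e}(\sigma) = h_{\mathcal{A}}(\sigma)$, since after division by $n$ the correction $\ln 2$ does not contribute in the limit $n \to \infty$.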

Nevertheless the noncommutativity of the algebra has consequences

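Theorem: If $\omega = \omega \circ \sigma$, then $\omega(A_o) = 0$ for all odd elements $A_o$ in $\mathcal{A}$.

Proof: For self-adjoint $A_o$ (which suffices)

$$|\omega(A_o)|^2 = \Big|\frac{1}{N}\sum_{n=0}^{N-1}\omega(\sigma^n A_o)\Big|^2 \le \frac{1}{2N^2}\sum_{n,m=0}^{N-1}\omega\big([\sigma^n A_o, \sigma^m A_o]_+\big). \quad (6.2)$$

The anticommutator vanishes for strictly local odd operators except for $|n - m| = O(1)$. Therefore

$$|\omega(A_o)|^2 \le O\!\left(\frac{1}{N}\right) \quad \forall N.$$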

We notice that noncommutativity reduces the possibility for invariant states

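Concerning the question of entropic K-systems (2.2): for all even subalgebras $\mathcal{B}_e$

$$\lim_{n \to \infty} h(\sigma^n, \mathcal{B}_e) = H(\mathcal{B}_e),$$

but for a typical odd subalgebra $\mathcal{A}_o = \{a_0 + a_0^{*}\}''$ one has $h(\sigma, \mathcal{A}_o) = 0$.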

D) For the stationary quantum Markov chain again an abelian model can be constructed that gives the optimal result, i.e. the entropy density [10]. The main idea in the proof is that, apart from the algebra $\mathcal{A}$, we can concentrate on the algebra $\mathcal{C}$, and inside this algebra we construct an optimal decomposition. Therefore in the limit of these decompositions we find an abelian model with vanishing entropy defect $H(P|P_- \otimes \mathcal{A})$.


As we already mentioned, the automorphism $\tau$ (as in our special example) will in general not be asymptotically abelian, and therefore the system fails to be an entropic K-system. As for the Fermi system, we can introduce the gauge automorphism

$$\gamma\,\sigma_x = -\sigma_x, \qquad \gamma\,\sigma_y = -\sigma_y, \qquad \gamma\,\sigma_z = \sigma_z .$$

The elements invariant under this gauge automorphism are asymptotically abelian under space translation because they become localized in $1 \otimes \mathcal{C}$. Therefore again the result corresponds to the results in [39], though the states are constructed in different ways.
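A gauge automorphism that flips the sign of $\sigma_x$ and $\sigma_y$ and leaves $\sigma_z$ fixed can be checked on a single site. The following minimal sketch assumes that it is implemented by conjugation with $\sigma_z$; this implementation and the helper names `matmul`, `gauge` and `neg` are illustrative assumptions, not taken from the text.

```python
# Pauli matrices as nested lists (a single lattice site).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gauge(m):
    """Assumed implementation of the gauge automorphism: m -> sigma_z m sigma_z."""
    return matmul(SZ, matmul(m, SZ))


def neg(m):
    """Entrywise negative of a matrix."""
    return [[-x for x in row] for row in m]
```

With this choice `gauge(SX)` equals `neg(SX)`, `gauge(SY)` equals `neg(SY)`, and `gauge(SZ)` equals `SZ`, so the even elements (here the algebra generated by $\sigma_z$) are exactly the invariant ones.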

E) The last example we want to discuss in this framework is the Price-Powers shift. We have already considered the special case $g(i) = 1$, the Fermi algebra (3.E.b). For $g(1) = 1$, $g(i) = 0$ otherwise, the representation (3.E.a) already indicates how to construct an abelian model. For $\sigma^2$ we are dealing with a quantum Bernoulli shift that factorizes, with the obvious choice for an abelian model. Therefore it is easy to construct the abelian model for $\sigma$.

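We can consider $\mathcal{B}_{\sigma^2}$ as a subalgebra of $\mathcal{A}$; therefore $\sigma\mathcal{B}_{\sigma^2}$ is again an abelian subalgebra, and for the shift $\sigma$ we consider the abelian model

$$\mathcal{B}_{\sigma^2} \otimes \sigma\mathcal{B}_{\sigma^2}$$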

with the obvious coupling. Notice that now we have presented an example where the entropy defect of the abelian model does not vanish, i.e. the abelian model is not a subalgebra of the system. For arbitrary $g$ we will in general fail to find an abelian model. We only have to vary the proof (6.2). If $g$ is sufficiently irregular, so that for all $w_I \in \mathcal{A}$, where the $w_I$ are monomials in the generators with indices $i \in I$,

$$[w_I, \sigma^k w_I]_+ = 0$$

for infinitely many $k$, so that

$$|\omega(w_I)|^2 = \Big|\frac{1}{N}\sum_{j=1}^{N}\omega(\sigma^{k_j} w_I)\Big|^2 \le \frac{1}{2N^2}\sum_{j,l=1}^{N}\omega\big([\sigma^{k_j} w_I, \sigma^{k_l} w_I]_+\big) = O\!\left(\frac{1}{N}\right), \quad (6.3)$$

then $\omega(w_I)$ has to vanish.

In fact it was shown in [42] that it is possible to construct a sequence $g$ so that (6.3) holds for all $w_I$, and therefore the only invariant state is the tracial state. In [43] we proved that with probability one on the set of possible $g$ (6.3) holds, and again we have a unique invariant state. But this argument can be generalized to every coupling to abelian models; therefore every coupling has to be trivial and the dynamical entropy in the sense of [29] resp. [30] vanishes.

The Price-Powers shift was also studied in the context of Voiculescu's dynamical entropy and in the context of the Alicki-Fannes entropy [23, 44]. Here the increasing property is the dominant feature. We obtain

$$h_{at}(\sigma) = \tfrac{1}{2}\ln 2, \qquad h_{AF}(\sigma) = \ln 2 \quad (6.4)$$

independently of the special sequence $g$.

If we return to our remark that, for classical dynamical systems, the dynamical entropy describes how information increases but at the same time becomes more and more irrelevant, we notice that the Voiculescu and the Alicki-Fannes entropies concentrate on the fact that information increases, whereas the entropy of [29] is sensitive to the rate at which information becomes irrelevant.

7 Continuous K-Systems

So far we concentrated on discrete dynamics. But obviously the discrete group of translations $\mathbb{Z}$ can be replaced by $\mathbb{R}$ without varying much of the definitions. In particular, due to the linearity of the dynamical entropy (which is proven for [18] and [29]),

$$h(\sigma^n) = n\,h(\sigma), \quad (7.1)$$

also for the continuous group $\mathbb{R}$ we can choose the subgroup $a\mathbb{Z}$ and calculate the dynamical entropy (for all possible definitions) for this subgroup. It can be shown that the result will be independent of the scaling parameter $a$.

The definition of an algebraic quantum K-system is also applicable to a continuous group. Only in this case the amount of increase cannot be described by the index $[\mathcal{A}_t : \mathcal{A}_0]$: it is either zero or infinity, because $[\mathcal{A}_t : \mathcal{A}_0] = n[\mathcal{A}_{t/n} : \mathcal{A}_0]$ and $[\mathcal{A} : \mathcal{A}_0]$ is either $0$ or $\geq 2$ [14].

This remark shows that a continuous quasifree evolution over a Fermi lattice system ($\sigma_\alpha a(f) = a(e^{i\alpha p} f)$, $\alpha \in \mathbb{R}$) can give positive dynamical entropy but cannot correspond to a continuous algebraic K-system:

$$[\mathcal{A}_t : \mathcal{A}_0] = h_{at}(\sigma_t)$$

and

$$h_{at}(\sigma_t) = h_\tau(\sigma_t)$$

in the tracial state (compare [39]). This leads to a contradiction if $h_\tau(\sigma_t)$ is bounded.

A prototype of a continuous K-system is given in relativistic quantum field theory

The Wedge Algebra [45]: Consider the algebra $\mathcal{A}_W = \{\phi(x),\ x_1 > 0\}$ as a subalgebra of a quantum field theory $\mathcal{A}$. This algebra is mapped into itself by the following automorphisms:

a) $\sigma_x$, the shift in the $x$-direction. Therefore $(\mathcal{A}, \mathcal{A}_W, \sigma_x, \langle\Omega|\cdot|\Omega\rangle)$ is a K-system in an irreducible state. The unitary operator implementing $\sigma_x$ is $e^{iP^1 x}$ with $\operatorname{spec}(P^1) = \mathbb{R}$.

b) $\ell_t$, the shift in the light direction $x^1 + x^0$. Again $(\mathcal{A}, \mathcal{A}_W, \ell_t, \langle\Omega|\cdot|\Omega\rangle)$ is a K-system. Now $\ell_t = \operatorname{ad} e^{iLt}$ with $\operatorname{spec}(L) = \mathbb{R}^+$.

c) $|\Omega\rangle$ is cyclic and separating for $\mathcal{A}_W$. Therefore it defines a KMS-automorphism, and this KMS-automorphism coincides with the geometric action of the boost $b_t$. With $(\mathcal{A}_W, \ell_1\mathcal{A}_W, b_t, \langle\Omega|\cdot|\Omega\rangle)$ we obtain a new K-system where the K-automorphism is now the modular automorphism $b_t = \operatorname{ad} e^{iBt}$; $\ell_t$ acts as an endomorphism on $\mathcal{A}_W$. The generators satisfy

$$[B, L] = iL. \quad (7.2)$$

These relations can be generalized to the following theorem

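Theorem: Let $(\mathcal{A}, \mathcal{A}_0, \tau_t, \omega)$ be a modular K-system, i.e. $\tau_t$ the modular automorphism of $\mathcal{A}$ and

$$\tau_t \mathcal{A}_0 \supset \mathcal{A}_0 .$$

a) Then the GNS vector $|\Omega\rangle$ implementing $\omega$ is cyclic and separating both for $\mathcal{A}$ and $\mathcal{A}_0$.

b) Let $\tau_t$ be implemented by $e^{iHt}$, $e^{iHt}|\Omega\rangle = |\Omega\rangle$. Let $\tau_t^0$ be the modular automorphism of $\mathcal{A}_0$, implemented by $e^{iH^0 t}$ with $e^{iH^0 t}|\Omega\rangle = |\Omega\rangle$. Then

$G = H^0 - H$ is well defined, $G \geq 0$,

$e^{iGs}$, $s \geq 0$, implements an endomorphism on $\mathcal{A}$ with $e^{iG}\mathcal{A}e^{-iG} = \mathcal{A}_0$,

$$[H, G] = iG. \quad (7.3)$$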

The proof is based on the analyticity properties of the modular operator, taking appropriate care of domain properties [46, 47].

We notice that for quantum modular K-systems endomorphisms arise in a natural way that satisfy the Anosov commutation relations and therefore offer, via Lyapunov exponents, the clustering properties of the automorphism.

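Theorem: Let $(\mathcal{A}, \tau(t), \sigma(s), \omega)$ be an Anosov system with $\tau$ the K-automorphism and $\sigma$ the Anosov endomorphism.

Take $\chi_a$ to be the characteristic function of $(a, \infty)$ for some $a > 0$. Choose $A$ and $B \in \mathcal{A}$ such that

i) $A|\Omega\rangle \in D(G^r)$ for some $r > 0$,

ii) $\chi_a(G)B|\Omega\rangle = 0$. As a consequence $\langle\Omega|B|\Omega\rangle = 0$.

Then

$$|\omega(A\tau_t B)| \le e^{-tr} a^{-r}\, \|B\Omega\|\, \|G^r A\Omega\| . \quad (7.4)$$

We refer to [1] and [48].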

As for discrete quantum K-systems, we wonder whether the dynamical entropy is positive and whether there exist nontrivial models. Again no general result is available. On the basis of quasifree evolution [49] we can construct models for fermions and bosons that are modular K-systems with positive dynamical entropy. But there also exists a $q$-deformed quasifree modular system [50]. Here the past algebra has trivial relative commutant, and therefore the algebra does not contain any subalgebra on which the dynamics acts asymptotically abelian, which according to [34] seems to be a requirement for the construction of abelian models.


8 Mixing Properties Without Algebraic K-Property

As already mentioned, no strategy is available up to now to construct, for a given quantum dynamical system, a subalgebra that satisfies the K-property. A model for which it is still undecided whether we are dealing with an algebraic K-system is the rotation algebra [51].

Definition: The rotation algebra $\mathcal{A}_\alpha$ is built by unitary operators $U$, $V$ with

$$U \cdot V = e^{i\alpha}\, V \cdot U \quad (8.1)$$

for some $\alpha \in [0, 2\pi)$. This algebra arises in a natural way in a physically motivated example. Consider a free particle in a constant magnetic field confined to two dimensions. Then the particle describes Larmor orbits. In the thermodynamic limit these Larmor orbits can be occupied up to a precise filling factor [52]. This thermodynamic limit can most easily be achieved by confining the particles in an additional harmonic potential whose strength goes to zero [53]. Another method, more tailor-made for studying electric currents, is to impose periodic boundary conditions. Therefore the algebra is built by $e^{i\alpha v_x}$, $e^{i\beta v_y}$, $e^{inx}$, $e^{imy}$ with

$$e^{i\alpha v_x} e^{inx} = e^{in(x+\alpha)} e^{i\alpha v_x},$$

$$e^{i\beta v_y} e^{imy} = e^{im(y+\beta)} e^{i\beta v_y},$$

$$e^{i\alpha v_x} e^{i\beta v_y} = e^{i\alpha\beta B} e^{i\beta v_y} e^{i\alpha v_x}, \quad (8.2)$$

with $B$ the magnetic field orthogonal to the plane. All other commutators vanish.

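If we introduce

$$\exp[in\tilde{x}] = \exp\Big[in\Big(x - \frac{1}{B}v_y\Big)\Big], \qquad \exp[im\tilde{y}] = \exp\Big[im\Big(y + \frac{1}{B}v_x\Big)\Big],$$

then the algebra splits into

$$\{e^{i\alpha v_x}, e^{i\beta v_y}\} \otimes \{e^{in\tilde{x}}, e^{im\tilde{y}}\}, \qquad e^{in\tilde{x}} e^{im\tilde{y}} = e^{-inm/B}\, e^{im\tilde{y}} e^{in\tilde{x}} .$$

Therefore the rotation algebra with $\alpha = 1/B$ describes the algebra of the center of the Larmor precession.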
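For rational rotation parameter the defining relation $U V = e^{i\alpha} V U$ of the rotation algebra can be realized by finite matrices. The following sketch uses the standard clock and shift matrices for $\alpha = 2\pi p/q$; this finite-dimensional realization and the function names are illustrative assumptions and are not the torus representation discussed in the text.

```python
import cmath


def clock_shift(p, q):
    """Return (w, U, V) with w = exp(2*pi*i*p/q), U the clock and V the shift matrix."""
    w = cmath.exp(2j * cmath.pi * p / q)
    # U is diagonal with entries 1, w, w^2, ...
    U = [[w ** j if i == j else 0 for j in range(q)] for i in range(q)]
    # V maps basis vector e_j to e_{j+1} (indices modulo q).
    V = [[1 if i == (j + 1) % q else 0 for j in range(q)] for i in range(q)]
    return w, U, V


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def relation_defect(p, q):
    """Largest entry of U V - w V U; it should vanish up to rounding."""
    w, U, V = clock_shift(p, q)
    uv, vu = matmul(U, V), matmul(V, U)
    return max(abs(uv[i][j] - w * vu[i][j]) for i in range(q) for j in range(q))
```

For example, `relation_defect(1, 5)` is zero up to floating point rounding. For irrational $\alpha$ no such finite-dimensional realization exists, which is one way to see why the rational and irrational cases behave differently.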

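For $\mathcal{A}_\alpha$ there exists a representation on $L^2(T^2)$,

$$\pi(V_\alpha) = \exp[i(y - \alpha p_x)], \quad (8.3)$$

where $p_x$, $p_y$ are the momentum operators $\frac{1}{i}\frac{\partial}{\partial x}$, $\frac{1}{i}\frac{\partial}{\partial y}$ with periodic boundary conditions on the torus. For $|\Omega\rangle = |1\rangle$, the constant function on the torus,

$$\pi(U_\alpha)|\Omega\rangle = e^{ix}, \qquad \pi(V_\alpha)|\Omega\rangle = e^{iy}, \quad (8.4)$$

independent of the rotation parameter. On $\mathcal{A}_\alpha$ we have the following automorphisms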

4(^C) = J^usv

with

n m

= T n m - ( ) bull

$ad - bc = 1$

$\sigma^{(1)}$, $\sigma^{(2)}$ describe currents and are therefore of physical relevance. $\Theta_T$ describes a dilation in $\mathbb{R}^2$ and reduces to a map on the torus $T^2$ only for discrete values and discrete directions of the dilation. A physical interpretation of $\Theta_T$ can be given if it describes a sudden periodic push to the particle. Whereas $\sigma^{(1)}$ and $\sigma^{(2)}$ have no good mixing behaviour, $\Theta_T$ inherits all mixing properties from the classical torus due to (8.4):

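$$\langle\Omega|\pi(W_\alpha(z))\,\Theta_T^n\,\pi(W_\alpha(z))|\Omega\rangle = \langle\Omega|\pi(W_0(z))\,\Theta_T^n\,\pi(W_0(z))|\Omega\rangle . \quad (8.5)$$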

But with respect to dynamical entropy the noncommutativity plays an essential role. Let $\lambda$ be the eigenvalue $> 1$ of $T$. Then

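$$\begin{aligned}
h_{at}(\Theta_T) &= \ln\lambda && \text{for } \alpha \text{ irrational [18]} \\
&= \ln\lambda && \text{for } \alpha \text{ rational} \\
h_{AF}(\Theta_T) &= \ln\lambda && \text{for all } \alpha \text{ [55]} \\
h_{CNT}(\Theta_T) &= \ln\lambda && \text{for } \alpha \text{ rational} \\
&> 0 && \text{for } \alpha \text{ depending rationally on } \lambda \text{ [57]} \\
&= 0 && \text{in general [56]}
\end{aligned}$$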
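The value $\ln\lambda$ appearing in these formulas is easy to evaluate for a concrete integer matrix with determinant one. The sketch below is only an illustration: the function name is hypothetical, and the example matrix with rows $(2, 1)$ and $(1, 1)$ (the usual Arnold cat map) is an assumption, not a choice made in the text.

```python
import math


def log_expanding_eigenvalue(a, b, c, d):
    """Return ln(lambda) for T = [[a, b], [c, d]] with ad - bc = 1 and trace > 2.

    The eigenvalues solve lambda^2 - (a + d) * lambda + 1 = 0, so for trace > 2
    there is exactly one eigenvalue larger than 1.
    """
    if a * d - b * c != 1:
        raise ValueError("determinant must be 1")
    trace = a + d
    if trace <= 2:
        raise ValueError("no eigenvalue larger than 1 for this trace")
    lam = (trace + math.sqrt(trace * trace - 4)) / 2
    return math.log(lam)
```

For the cat map example the expanding eigenvalue is $(3 + \sqrt{5})/2$, so `log_expanding_eigenvalue(2, 1, 1, 1)` returns its logarithm.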

In addition it was possible to construct for rational $\alpha$ a subalgebra $\mathcal{A}_0$ so that $(\mathcal{A}, \mathcal{A}_0, \Theta_T, \omega)$ became a K-system [54]. This was possible because $\mathcal{A}$ can be looked at as a crossed product of the classical algebra on $T^2$ with a discrete translation group, and by rather general considerations crossed product algebras inherit under some conditions the K-structure of the underlying algebra [56]. Obviously this construction does not give a hint for irrational $\alpha$.

The strong dependence on $\alpha$ of the CNT dynamical entropy is based on the strong dependence of the asymptotic commutation behaviour. Only if $\alpha$ and $\lambda$ are rationally dependent is the system asymptotically abelian, and the commutator then converges rapidly to zero. This rapid convergence made it possible to construct an abelian model [57], using the fact that the algebra $\mathcal{A}_\alpha$ can be imbedded in, but is not, an AF-algebra. Therefore, different from the approaches for lattice systems, the abelian model cannot be identified, up to convergence problems, with an abelian subalgebra of $\mathcal{A}_\alpha$.

9 Time Evolution

As we have seen in a quantum system there are many possibilities for some kind of mixing behaviour that are not equivalent as in the classical situation Up to now we concentrated on dynamics that were constructed in such a way that they should give us information on possible ergodic structures

When dynamics is given to us by a sequence of local Hamiltonians we have up to now hardly control on the asymptotic behaviour apart from quasifree evolution

We mention just one result. The x-y model [58] allows a transformation to a quasifree evolution. Therefore we know that it is weakly but not strongly asymptotically abelian. Its dynamical entropy is positive and all definitions give the same result (with the dimensional correction term for $h_{AF}$). We do not know whether it is an algebraic K-system for a discrete subset in time. For sure it is not a continuous algebraic K-system.


References

1. G.G. Emch, H. Narnhofer, G.L. Sewell, W. Thirring, Anosov Actions on Non-Commutative Algebras, J. Math. Phys. 35/11, 5582-5599 (1994).

2. M.C. Gutzwiller, Chaos in classical and quantum mechanics (Springer, New York, 1990).

3. E. Bogomolny, F. Leyvraz, C. Schmit, Statistical Properties of Eigenvalues for the Modular Group, in XIth International Congress of Mathematical Physics, Daniel Jagolnitzer ed. (International Press, Boston, 306-323, 1995).

4. A.N. Kolmogorov, A new metric invariant of transitive systems and automorphisms of Lebesgue spaces, Dokl. Akad. Nauk 119, 861-864 (1958).

5. P. Walters, An Introduction to Ergodic Theory (Springer, New York, 1982).

6. I.P. Cornfeld, S.V. Fomin, Ya.G. Sinai, Ergodic Theory (Springer, New York, 1982).

7. H. Narnhofer, W. Thirring, Quantum K-Systems, Commun. Math. Phys. 125, 565-577 (1989).

8. C. Brukner, A. Zeilinger, Conceptual Inadequacy of the Shannon Information in Quantum Measurements, quant-ph/0006087.

9. O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics I, II (Springer, Berlin, Heidelberg, New York, 1993).

10. B. Kümmerer, Examples of Markov dilation over 2 x 2 matrices, in L. Accardi, A. Frigerio, V. Gorini, eds., Quantum Probability and Applications to the Quantum Theory of Irreversible Processes, Springer, Berlin 1984, 228-244, and private communications.

11. R.T. Powers, An index theory for semigroups of *-endomorphisms of B(H) and type II_1 factors, Canad. J. Math. 40, 86-114 (1988); G.L. Price, Shifts of II_1 factors, Canad. J. Math. 39, 492-511 (1987).

12. H. Narnhofer, W. Thirring, Chaotic Properties of the Noncommutative 2-Shift, in From Phase Transition to Chaos, G. Gyorgyi, I. Kondor, S. Sasvari, T. Tel, eds., World Scientific 1992, 530-546.

13. H. Narnhofer, W. Thirring, Clustering for Algebraic K-Systems, Lett. Math. Phys. 30, 307-316 (1994).

14. V.F.R. Jones, Index for subfactors, Invent. Math. 72, 1-25 (1983).

15. R. Longo, Simple Injective Subfactors, Adv. Math. 63, 152-171 (1987); Index of Subfactors and Statistics of Quantum Fields, Commun. Math. Phys. 130, 285-309 (1990).

16. M. Choda, Entropy of canonical shifts, Trans. Amer. Math. Soc. 334, 827-849 (1992).

17. H. Narnhofer, A. Pflug, W. Thirring, Mixing and Entropy Increase in Quantum Systems, in Symmetry in Nature, in honour of Luigi A. Radicati di Brozolo, Scuola Normale Superiore, Pisa, 597-626 (1989).

18. D.V. Voiculescu, Dynamical Approximation Entropies and Topological Entropy in Operator Algebras, Commun. Math. Phys. 170, 249-282 (1995).

19. M. Choda, A C*-Dynamical Entropy and Applications to Canonical Endomorphisms, J. Funct. Anal. 173, 453-480 (2000).

20. E. Størmer, A Survey of noncommutative dynamical entropy, Oslo preprint No. 18, Dep. of Mathematics, MSC-class 46L40 (2000).

21. M. Choda, Entropy on crossed products and entropy on free products, preprint (1999).

22. K. Dykema, Topological entropy of some automorphisms of reduced amalgamated free product C*-algebras, preprint (1999).

23. R. Alicki, M. Fannes, Defining Quantum Dynamical Entropy, Lett. Math. Phys. 32, 75-82 (1994).

24. R.B. Griffiths, Consistent histories and the interpretation of quantum mechanics, J. Stat. Phys. 36, 219-279 (1984).

25. M. Gell-Mann, J. Hartle, Alternative decohering histories in quantum mechanics, in Proc. of the 25th Int. Conf. on High Energy Physics, Vol. 2, ed. by K.K. Phua and Y. Yamaguchi, World Scientific, Singapore, 1303-1310 (1991).

26. E.H. Lieb, D.W. Robinson, The finite group velocity of quantum spin systems, Commun. Math. Phys. 28, 251-257 (1972).

27. A. Connes, E. Størmer, Entropy of II_1 von Neumann algebras, Acta Math. 134, 289-306 (1972).

28. A. Connes, Acad. Sci. Paris 301 I, 1-4 (1985).

29. A. Connes, H. Narnhofer, W. Thirring, Dynamical Entropy of C*-Algebras and von Neumann Algebras, Commun. Math. Phys. 112, 691-719 (1987).

30. J.L. Sauvageot, J.P. Thouvenot, Une nouvelle définition de l'entropie dynamique des systèmes non commutatifs, Commun. Math. Phys. 145, 411-423 (1992).

31. C.H. Bennett, D.P. DiVincenzo, J.A. Smolin, W.K. Wootters, Mixed state entanglement and quantum error corrections, Phys. Rev. A 54, 3824-3851 (1996).

32. F. Benatti, H. Narnhofer, A. Uhlmann, Decomposition of quantum states with respect to entropy, Rep. Math. Phys. 38, 123-141 (1996).

33. W.K. Wootters, Entanglement of formation of an arbitrary state of two qubits, q-ph970929.

34. F. Benatti, H. Narnhofer, Strong asymptotic abelianess for entropic K-systems, Commun. Math. Phys. 136, 231-250 (1991); Strong Clustering in Type III Entropic K-Systems, Mh. Math. 124, 287-307 (1996).

35. H. Narnhofer, An Ergodic Abelian Skeleton for Quantum Systems, Lett. Math. Phys. 28, 85-95 (1993).

36. H. Narnhofer, W. Thirring, Dynamical Theory of Quantum Systems and Their Abelian Counterpart, in On Klauder's Path, eds. G.G. Emch, G.C. Hegerfeldt, L. Streit, World Scientific, 127-145 (1994).

37. H. Narnhofer, Free energy and the dynamical entropy of space translation, Rep. Math. Phys. 25, 345-356 (1988).

38. H. Moriya, Variational principle and the dynamical entropy of space translation, Rev. Math. Phys. 11, 1315-1328 (1999).

39. S. Neshveyev, E. Størmer, The variational principle for a class of asymptotically abelian C*-algebras, MSC-class 46L55 (2000).

40. M. Fannes, B. Nachtergaele, R.F. Werner, Finitely correlated states of quantum spin systems, Commun. Math. Phys. 144, 443-490 (1992).

41. R.F. Werner, private communication.

42. H. Narnhofer, E. Størmer, W. Thirring, C*-dynamical systems for which the tensor product formula for entropy fails, Ergod. Th. & Dynam. Sys. 15, 961-968 (1995).

43. H. Narnhofer, W. Thirring, C*-dynamical systems that are highly anticommutative, Lett. Math. Phys. 35, 145-154 (1995).

44. R. Alicki, H. Narnhofer, Comparison of Dynamical Entropies for the Noncommutative Shifts, Lett. Math. Phys. 33, 241-247 (1995).

45. H.J. Borchers, On the Revolutionization of Quantum Field Theory by Tomita's Modular Theory, ESI preprint, 160 pages, 148 references.

46. H.J. Borchers, On Modular Inclusion and Spectrum Condition, Lett. Math. Phys. 27, 311-324 (1993).

47. H.W. Wiesbrock, Halfsided Modular Inclusions of von Neumann Algebras, Commun. Math. Phys. 157, 83-92 (1993); Commun. Math. Phys. 184, 683-685 (1997).

48. H. Narnhofer, Kolmogorov Systems and Anosov Systems in Quantum Theory, review, to be publ. in IDAQP.

49. H. Narnhofer, W. Thirring, Realization of Two-Sided Quantum K-Systems, Rep. Math. Phys. 45, 239-256 (2000).

50. D. Shlyakhtenko, Free quasifree states, Pac. Journ. of Math. 177, 329-368 (1997).

51. M.A. Rieffel, Pac. J. Math. 93, 415 (1981).

52. R.B. Laughlin, Quantized Hall Conductivity in Two Dimensions, Phys. Rev. B 23/10, 5632-5633 (1981).

53. N. Ilieva, W. Thirring, Second quantization picture of the edge currents in the fractional quantum Hall effect, math-ph/0010038.

54. F. Benatti, H. Narnhofer, G.L. Sewell, A Non Commutative Version of the Arnold Cat Map, Lett. Math. Phys. 21, 157-172 (1991).

55. R. Alicki, J. Andries, M. Fannes, P. Tuyls, Lett. Math. Phys. 35, 375-383 (1995).

56. H. Narnhofer, Ergodic Properties of Automorphisms on the Rotation Algebra, Rep. Math. Phys. 39, 387-406 (1997).

57. S.V. Neshveyev, On the K property of quantized Arnold cat maps, J. Math. Phys. 41, 1961-1965 (2000).

58. H. Araki, T. Matsui, Commun. Math. Phys. 101, 213-246 (1985).


SCATTERING IN QUANTUM TUBES

BÖRJE NILSSON

School of Mathematics and Systems Engineering, Växjö University, SE-351 95 VÄXJÖ, Sweden

E-mail: borje.nilsson@msi.vxu.se

It is possible to fabricate mesoscopic structures where at least one of the dimensions is of the order of the de Broglie wavelength for cold electrons. By using semiconductors composed of more than one material, combined with a metal split-gate, two-dimensional quantum tubes may be built. We present a method for predicting the transmission of low-temperature electrons in such a tube. This problem is mathematically related to the transmission of acoustic or electromagnetic waves in a two-dimensional duct. The tube is asymptotically straight with a constant cross-section. Propagation properties for complicated tubes can be synthesised from corresponding results for simpler tubes by the so-called Building Block Method. Conformal mapping techniques are then applied to transform the simple tube with curvature and varying cross-section to a straight, constant cross-section tube with variable refractive index. Stable formulations for the scattering operators in terms of ordinary differential equations are obtained by wave splitting using an invariant imbedding technique. The mathematical framework is also generalised to handle tubes with edges, which are of large technical interest. The numerical method consists of using a standard MATLAB ordinary differential equation solver for the truncated reflection and transmission matrices in a Fourier sine basis. It is proved that the numerical scheme converges with increasing truncation.

1 Introduction

In the search for faster computers, critical parts are becoming smaller. Today it is possible to build mesoscopic structures where some dimensions are of the order of the de Broglie wavelength for cold electrons. Often the electron motion is confined to two dimensions. Consequently it may be necessary, at least for some computer parts, to include quantum effects in the design process.

A large number of studies devoted to such quantum effects have been carried out in recent years, and a review is given by Londegan et al. [1]. Many investigations aim at understanding the physical properties of a particular quantum tube rather than developing reliable mathematical and numerical methods that can be used in a more general context. The research has given valuable knowledge on the physical behaviour but also reports on the limitations of the methods used. For instance, Lin & Jaffe [2] report that a straightforward matching at the boundary of a circular bend does not converge, demonstrating the numerical problems with such a method. An illposedness is present in quantum tube scattering, and some type of regularisation is therefore required to avoid large errors. Often the tubes have sharp corners to facilitate manufacturing but also to enhance quantum effects. The presence of corners with attached singularities requires special treatment.

Scattering of electrons in quantum tubes, see figure 1, is theoretically related to the scattering of acoustic and electromagnetic waves in ducts. Nilsson [3] treats a general method for the acoustic transmission in curved ducts with varying cross-sections. Wellposedness, i.e. stability, is achieved in an asymptotic sense. The mathematical framework guarantees consistent results and allows for sharp corners, and a proof of numerical convergence is given. We set out to present a quantum version of the results of Nilsson [3]. In this way the problems reported on convergence [2] and on inconsistent mathematical results would be resolved.

The paper is organised as follows. An introduction to scattering in quantum tubes is given in section 2, and a mathematical model is formulated in section 3. The Building Block Method, which is a systematic method to analyse complicated tubes in terms of results for simple tubes, is also briefly described. Then, in section 4, the scattering problem for the curved tube with varying cross-section and constant potential is reformulated as a scattering problem for a straight tube with a varying refractive index. The solution to this problem is presented in section 5, and a discussion of numerical methods is also given.

2 Tubes in quantum heterostructures

A schematic view of a quantum heterostructure is shown in figure 2, following Wu et al. [4]. Electrons are emitted from the n-type doped AlGaAs layer, migrate into the GaAs layer and stay close to the boundary to the AlGaAs layer. In this way a very narrow layer of electrons, which are free to move in a plane, is formed. Nearly all the electrons in this two-dimensional gas are in the same quantum state. By applying a negative potential on the metal electrodes on the top of the heterostructure in figure 1, the electrons are banished from the region below the electrodes. For relatively low voltages the effective potential in the tube for one electron is close to the square-well potential [1]. As a consequence the electrons in the two-dimensional gas are further restricted to a tube that in form is a mirror picture of the gap between the two electrodes. This quantum tube links the electrons between the two two-dimensional gases on both sides of the strip formed by the electrodes.

3 Mathematical model

Consider a two-dimensional tube with interior $\Omega$ according to figure 1. The boundary $\Gamma$ consists of two continuous curves $\Gamma_+$ and $\Gamma_-$ which are piecewise $C^2$. The upper boundary $\Gamma_+$ can be continuously deformed to $\Gamma_-$ within $\Omega$. Outside a bounded region the duct is straight with constant widths $a$ and $b$, respectively. These terminating ducts are called the left and the right terminating duct, or L and R for short. We use stationary scattering theory for one electron in an effective potential with time dependence $\exp(-iEt/\hbar)$, assuming that the wave function $\psi$ satisfies the time-independent Schrödinger equation $\Delta\psi + k^2\psi = 0$ in $\Omega$, where $k^2 = 2mE/\hbar^2$ and $m$ is the effective mass [5]. Usually $k^2$ is called energy. The effective potential is assumed to be a square well, meaning that $\psi|_\Gamma = 0$.

In a tube with constant cross-section the harmonic wavefunction $\psi$ can be uniquely decomposed into leftgoing and rightgoing parts by $\psi = \psi^+ + \psi^-$. Superscripts $+$ and $-$ indicate rightgoing (plus) and leftgoing (minus) waves, respectively. Let $\psi^+_{in}$ and $\psi^-_{in}$ be known incoming waves in the terminating ducts; $\psi^+_{in}$ is present in the left and $\psi^-_{in}$ in the right one. Let us write

$$\begin{cases} \psi = \psi^+_{in} + R^+\psi^+_{in} + T^-\psi^-_{in} & \text{in L}, \\ \psi = \psi^-_{in} + R^-\psi^-_{in} + T^+\psi^+_{in} & \text{in R}, \end{cases} \quad (3.1)$$

where, for example, the last two terms in (3.1a) are minus waves, and the equation defines the left reflection mapping $R^+$ that maps the incoming wave to an outgoing one in L. The scattering problem consists of finding the mappings $R^+$, $T^-$, $R^-$ and $T^+$ as functions of energy for a given duct. In summary we have

$$\begin{cases} \Delta\psi + k^2\psi = 0 & \text{in } \Omega, \\ \psi^+ = \psi^+_{in} & \text{in L}, \\ \psi^- = \psi^-_{in} & \text{in R}. \end{cases} \quad (3.2)$$

There is always a solution to (3.2), and except for a discrete set of eigenenergies $k^2 = k_n^2$, $n = 1, 2, 3, \ldots$, the solution is unique [6]. When $k^2 = k_n^2$ is an eigenenergy, there exists a solution without incoming but with outgoing waves.

The use of the Building Block Method [7] or transfer matrix formalism [8] is very efficient for the solution of scattering problems. In this method a tube with a complicated geometry is divided into two parts, usually where the tube is straight. These two parts are converted to the type shown in figure 1 by extending the terminating tubes to infinity. A sub tube for the tube shown in figure 1 originates from the left part and is depicted in figure 3. The Building Block Method gives a procedure for calculating the mappings $R^+$, $T^-$, $R^-$ and $T^+$ for the entire tube in terms of the corresponding scattering properties of the sub tubes. This procedure can be repeated to get several sub tubes.
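How the scattering mappings of two sub tubes can be combined is illustrated by the following sketch for a single propagating mode, where each mapping reduces to a complex number. It uses the standard multiple-reflection (cascade) formula, which is an assumption made here for illustration and not necessarily the exact procedure of the Building Block Method; the names `Block` and `cascade` are hypothetical, and both blocks are assumed to share the same reference plane at the junction.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Single-mode scattering data of a sub tube (reflection and transmission)."""
    r_plus: complex   # reflection of a wave incident from the left
    t_plus: complex   # transmission from left to right
    r_minus: complex  # reflection of a wave incident from the right
    t_minus: complex  # transmission from right to left


def cascade(left, right):
    """Combine two sub tubes by summing all reflections between them."""
    # Geometric series of round trips between the two blocks.
    d = 1.0 / (1.0 - left.r_minus * right.r_plus)
    return Block(
        r_plus=left.r_plus + left.t_minus * right.r_plus * d * left.t_plus,
        t_plus=right.t_plus * d * left.t_plus,
        r_minus=right.r_minus + right.t_plus * left.r_minus * d * right.t_minus,
        t_minus=left.t_minus * d * right.t_minus,
    )
```

A transparent block `Block(0, 1, 0, 1)` leaves any other block unchanged under `cascade`, which is a quick sanity check. In the multi-mode case the scalars become operators (matrices after truncation) and the division becomes an operator inverse, so the order of the factors matters.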


Rather than using a general numerical package for conformal mappings, we have for the calculations in this paper employed the Schwarz-Christoffel mapping for a duct with corners, rounding the corners using the methods of Henrici [9]. Required analytic integrations are performed in MATHEMATICA.

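We recall the standard duct theory [6] in a form that illustrates the illposedness of the problem, and we have

$$\psi = \psi^+ + \psi^- = \sum_{n=1}^{\infty} A_n^+ e^{i\alpha_n x}\varphi_n(y) + \sum_{n=1}^{\infty} A_n^- e^{-i\alpha_n x}\varphi_n(y) \quad (3.3)$$

with $\varphi_n(y) = \sin(n\pi y/a)$ and $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$, $\operatorname{Im}\alpha_n \geq 0$. It is convenient to define the operator $B_0$ by

$$\begin{cases} B_0 f = \sum_{n=1}^{\infty} \alpha_n f_n \varphi_n, \\ f(y) = \sum_{n=1}^{\infty} f_n \varphi_n(y). \end{cases} \quad (3.4)$$

We find that $B_0^2 = \partial_y^2 + k^2$ and $\partial_x\psi^\pm = \pm iB_0\psi^\pm$. The initial value problem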

$$\begin{cases} \partial_x\psi^+(x) = iB_0\psi^+(x), \\ \psi^+(0) = \psi^+_0 \end{cases} \quad (3.5)$$

is illposed for $x < 0$ but not for $x > 0$. If an attenuated plus wave is marched to the left, exponential growth is found. To avoid the illposedness, $\psi$ is decomposed, and the plus waves are calculated by marching to the right and the minus waves by marching in the opposite direction.
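The illposedness of marching a plus wave in the wrong direction can be made concrete mode by mode. The sketch below computes the axial wavenumbers $\sqrt{k^2 - n^2\pi^2/a^2}$ with non-negative imaginary part and the factor $e^{i\alpha_n\,\Delta x}$ by which a plus-wave amplitude changes over a distance $\Delta x$; the function names and the sample values of $k$ and $a$ are illustrative assumptions.

```python
import cmath
import math


def axial_wavenumbers(k, a, n_max):
    """Return alpha_n = sqrt(k^2 - (n*pi/a)^2), n = 1..n_max, with Im(alpha_n) >= 0."""
    alphas = []
    for n in range(1, n_max + 1):
        alpha = cmath.sqrt(complex(k * k - (n * math.pi / a) ** 2, 0.0))
        if alpha.imag < 0:
            alpha = -alpha  # enforce the branch with non-negative imaginary part
        alphas.append(alpha)
    return alphas


def plus_wave_factor(alpha, dx):
    """Amplitude factor of a plus-wave mode marched a distance dx (dx < 0: to the left)."""
    return cmath.exp(1j * alpha * dx)
```

With the assumed values $k = 5$ and $a = 1$ only the first mode propagates. Marching the second (evanescent) mode one unit to the right damps it, whereas marching it one unit to the left amplifies it by the reciprocal factor, and higher modes grow even faster, so small errors in high modes are blown up when marching in the unstable direction.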

4 Reformulated scattering problem

To be able to use powerful spectral methods it is advantageous to transform the tube to one with a flat boundary. According to the Building Block Method it is enough to consider the scattering in the sub tubes, and we restrict ourselves to the first part as shown in figure 3. One way of transforming the tube is to use a conformal mapping $w(\zeta)$ transforming the interior $\Omega$ of the tube with variable cross-section in the $\zeta = x + iy$ plane (figure 3) to the interior $\Pi$ of a straight tube with constant cross-section in the $w = u + iv$ plane. The straight tube is described by $-\infty < u < \infty$, $0 < v < a$.

Introducing $\phi(u, v) = \psi(x, y)$ we get

$$\begin{cases} \partial_u^2\phi + B^2(u)\phi = 0 & \text{in } \Pi, \\ \phi(u, 0) = \phi(u, a) = 0, & u \in \mathbb{R}, \end{cases} \quad (4.1)$$

with $B^2(u) = \partial_v^2 + k^2\mu(u, v)$ and $\mu = |d\zeta/dw|^2$. $\mu(u, v)^{-1}$ can be denoted as a refractive index for the straight tube. In figure 4, $\mu$ related to the simple tube in figure 3 is depicted. The factor $\mu(u, v)$ is asymptotically constant at both ends of the tube, or more precisely $\mu(u, v) = \mu_\pm + O(e^{-c|u|})$, $u \to \pm\infty$, with $\mu_- = 1$ and $\mu_+ = (b/a)^2$.

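We use a first order description and rewrite (4.1a) as

$$\partial_u \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -B^2 & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix}. \qquad (4.2)$$

To avoid ill-posedness the decomposition $\phi = \phi^+ + \phi^-$ is introduced, which must be identical to the corresponding decomposition (3.3) in regions where $\mu$ is a constant. The new state variables $(\phi^+, \phi^-)$ are introduced via the linear relation

$$\begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ iC & -iC \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}. \qquad (4.3)$$

Solving (4.3) for $\phi^+$ and $\phi^-$, taking the $u$-derivative and using (4.2), we find that

$$\partial_u \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}, \qquad (4.4)$$

where

$$\begin{aligned}
\alpha &= \tfrac{1}{2}\left[(\partial_u C^{-1})C + iC^{-1}B^2 + iC\right], \\
\beta &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C + iC^{-1}B^2 - iC\right], \\
\gamma &= \tfrac{1}{2}\left[-(\partial_u C^{-1})C - iC^{-1}B^2 + iC\right], \\
\delta &= \tfrac{1}{2}\left[(\partial_u C^{-1})C - iC^{-1}B^2 - iC\right].
\end{aligned} \qquad (4.5)$$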
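As a quick consistency check on the coefficients in (4.5), the following sketch evaluates them in the scalar (one-mode) case, where all operators reduce to numbers. The function name and the argument `dCinv`, standing for $\partial_u C^{-1}$, are illustrative assumptions and not part of the original text.

```python
def coefficients(B2: complex, C: complex, dCinv: complex = 0.0):
    """Scalar (one-mode) version of (4.5); dCinv stands for d(C^{-1})/du."""
    alpha = 0.5 * (dCinv * C + 1j * B2 / C + 1j * C)
    beta = 0.5 * (-dCinv * C + 1j * B2 / C - 1j * C)
    gamma = 0.5 * (-dCinv * C - 1j * B2 / C + 1j * C)
    delta = 0.5 * (dCinv * C - 1j * B2 / C - 1j * C)
    return alpha, beta, gamma, delta


if __name__ == "__main__":
    B = 2.0  # hypothetical flat-region value with C = B and dC^{-1}/du = 0
    print(coefficients(B * B, B))  # beta and gamma vanish, alpha = -delta = iB
```

In a flat region the off-diagonal coefficients vanish, so the plus and minus waves decouple. In general $\alpha + \gamma = iC$ and $\beta + \delta = -iC$, which reproduces the second row of (4.3).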

To generalize the concept of transmission operators we make them $u$-dependent, using a similar notation as Fishman10:

$$\begin{pmatrix} \phi^+(u_2) \\ \phi^-(u_1) \end{pmatrix} = \begin{pmatrix} T^+(u_2,u_1) & R^-(u_1,u_2) \\ R^+(u_2,u_1) & T^-(u_1,u_2) \end{pmatrix} \begin{pmatrix} \phi^+(u_1) \\ \phi^-(u_2) \end{pmatrix}, \qquad (4.6)$$

assuming that $u_1 < u_2$ and suppressing the explicit $v$-dependence. It is assumed for (4.6) that the scattering problem has a unique solution or that homogeneous solutions are removed. A homogeneous solution is usually called a bound state.

Next we find a differential equation for the scattering operators $T^+(u_2,u_1)$, $R^-(u_1,u_2)$, $R^+(u_2,u_1)$ and $T^-(u_1,u_2)$ in (4.6) using the invariant imbedding technique11,10. It is required that the incoming wave from the right, $\phi^-(u_2)$, is vanishing. Then put $u_1 = u$, find $\partial_u\phi^-(u)$ from (4.6), and use (4.6) once more to obtain

$$\partial_u R^+(u_2,u) = \gamma + \delta R^+(u_2,u) - R^+(u_2,u)\alpha - R^+(u_2,u)\beta R^+(u_2,u). \qquad (4.7)$$

In a similar manner we get

$$\partial_u T^+(u_2,u) = -T^+(u_2,u)\alpha - T^+(u_2,u)\beta R^+(u_2,u). \qquad (4.8)$$

The stability properties of (4.7) and (4.8) are of central importance. In the flat regions where $B = B_+$ or $B_-$ we have $C = B$ and $\partial_u C^{-1} = 0$, implying that $\beta = \gamma = 0$ and $\alpha = -\delta = iB$. Similarly (4.7) and (4.8) reduce to $\partial_u X^+ = -iBX^+$, $X^+ = R^+$ or $T^+$, equations which are well-posed for marching to the left. The initial values to accompany (4.7) and (4.8) are $R^+(u_2,u_2) = 0$ and $T^+(u_2,u_2) = I$, where $I$ is the identity operator.

We choose $C = B_- + f(u)(B_+ - B_-)$, which is independent of $v$. Here $f$ is increasing and smooth with $\lim_{u\to-\infty} f(u) = 0$ and $\lim_{u\to\infty} f(u) = 1$.
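The leftward marching of the Riccati equation (4.7) can be sketched in the scalar (one-mode) case. Everything specific here is an assumption made for illustration: $B^2(u)$ is taken as a smooth interpolation between two constant values, $f$ is a logistic function of width `L`, and a fixed-step fourth-order Runge-Kutta scheme stands in for the "standard numerical routine".

```python
import math


def logistic(u: float, L: float) -> float:
    """Smooth increasing f(u) with f -> 0 as u -> -inf and f -> 1 as u -> +inf."""
    return 1.0 / (1.0 + math.exp(-u / L))


def reflection(b_minus: float, b_plus: float, L: float = 1.0,
               u1: float = -15.0, u2: float = 15.0, steps: int = 6000) -> complex:
    """March the scalar version of (4.7) from u2 (where R+ = 0) leftward to u1."""

    def rhs(u: float, R: complex) -> complex:
        f = logistic(u, L)
        df = f * (1.0 - f) / L
        B2 = b_minus ** 2 + f * (b_plus ** 2 - b_minus ** 2)  # assumed profile of B^2(u)
        C = b_minus + f * (b_plus - b_minus)                  # C = B- + f(u)(B+ - B-)
        dCinvC = -df * (b_plus - b_minus) / C                 # (d C^{-1}/du) C for scalar C
        alpha = 0.5 * (dCinvC + 1j * B2 / C + 1j * C)
        beta = 0.5 * (-dCinvC + 1j * B2 / C - 1j * C)
        gamma = 0.5 * (-dCinvC - 1j * B2 / C + 1j * C)
        delta = 0.5 * (dCinvC - 1j * B2 / C - 1j * C)
        return gamma + delta * R - R * alpha - R * beta * R

    h = (u1 - u2) / steps  # negative step: marching to the left
    u, R = u2, 0j
    for _ in range(steps):
        k1 = rhs(u, R)
        k2 = rhs(u + h / 2, R + h / 2 * k1)
        k3 = rhs(u + h / 2, R + h / 2 * k2)
        k4 = rhs(u + h, R + h * k3)
        R += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        u += h
    return R


if __name__ == "__main__":
    print(abs(reflection(2.0, 1.0, L=1.0)))   # smooth transition: weak reflection
    print(abs(reflection(2.0, 1.0, L=0.05)))  # sharper transition: stronger reflection
```

In this scalar setting the modulus of the computed reflection coefficient stays below the value $|b_- - b_+|/(b_- + b_+)$ of an abrupt step and increases as the transition is made sharper.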

5 Solution of the scattering problem

For the numerical solution of the scattering operator we expand $\phi$ in a Fourier sine series and $\mu$ in a Fourier cosine series,

$$\phi(u,v) = \sum_{n=1}^{\infty} \phi_n(u)\varphi_n(v), \qquad \mu(u,v) = \sum_{n=0}^{\infty} \mu_n(u)\xi_n(v), \qquad (5.1)$$

where $\xi_n(v) = \cos(n\pi v/a)$. Using the notation $\boldsymbol{\phi} = (\phi_1, \phi_2, \ldots)^T$ we find that

$$\partial_u^2 \boldsymbol{\phi}(u) + B^2(u)\boldsymbol{\phi}(u) = 0. \qquad (5.2)$$

The matrix elements of $B^2(u)$ are given by

$$B^2(u)_{nm} = \frac{k^2}{2}\left[-\mu_{m+n}(u) + \mu_{m-n}(u) + \mu_{n-m}(u)\right] - \frac{n^2\pi^2}{a^2}\delta_{nm}, \qquad (5.3)$$

and it is understood in (5.3) that $\mu_l(u) = 0$ for negative $l$.

For the tube in the physical $\zeta$-plane we require that locally both the potential and the kinetic part of the energy are finite, that is, both $\int_X |\psi|^2\,dx\,dy < \infty$ and $\int_X |\nabla\psi|^2\,dx\,dy < \infty$ for all finite regions $X$ inside the tube. We say that $\psi$ belongs to the Sobolev space $H^1_{loc}$, meaning that $\psi$ and its first derivatives are locally square integrable. Transformed to the straight duct, the local finite energy requirement means $\int_U |\phi|^2\mu\,du\,dv < \infty$ and $\int_U |\nabla\phi|^2\,du\,dv < \infty$ for all finite regions $U$ inside the tube. For a smooth boundary $\phi$ is more regular: it follows from the theory of Grisvard12 that also the second derivatives of $\phi$ are square integrable, which means that $\phi \in H^2_{loc}$. According to a graph theorem13, $\phi \in H^2_{loc}$ implies that $\phi(u,\cdot) \in H^{3/2}(0,a)$, meaning that up to 3/2 derivatives are square integrable. To interpret this regularity with fractional derivatives we define, following Taylor13, the function space

$$D_s = \left\{ f \in L^2(0,a) : \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s < \infty \right\}, \quad s \ge 0, \qquad (5.4)$$

with $f = \sum_{n=1}^{\infty} f_n\varphi_n$ and $f_n = (f,\varphi_n)/(\varphi_n,\varphi_n)$. $D_s$ is a Hilbert space with the norm

$$\|f\|^2_{D_s} = (f,f)_{D_s} = \sum_{n=1}^{\infty} |f_n|^2 (1+n^2)^s. \qquad (5.5)$$

Taylor13 shows that $D_0 = L^2(0,a)$, $D_1 = H^1_0(0,a)$, $D_2 = H^2(0,a) \cap H^1_0(0,a)$ and that $\partial_v D_s = D_{s-1}$, $s \ge 1$. In this terminology we have that for a smooth boundary $\phi(u,\cdot) \in D_{3/2}$.
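The norm (5.5) is easy to evaluate for a finite list of sine coefficients. The sketch below is only an illustration; the function name and the sample coefficients $f_n = 1/n$ are assumptions.

```python
def ds_norm_squared(coeffs, s):
    """Partial sum of |f_n|^2 (1 + n^2)^s for n = 1..len(coeffs); coeffs[0] is f_1."""
    return sum(abs(fn) ** 2 * (1 + n * n) ** s
               for n, fn in enumerate(coeffs, start=1))


if __name__ == "__main__":
    # Coefficients f_n = 1/n: the s = 0 sums settle down (f is in D_0 = L^2),
    # while the s = 1/2 sums keep growing with the cutoff (f is not in D_{1/2}).
    for cutoff in (10, 100, 1000, 10000):
        coeffs = [1.0 / n for n in range(1, cutoff + 1)]
        print(cutoff, ds_norm_squared(coeffs, 0.0), ds_norm_squared(coeffs, 0.5))
```

The larger $s$ is, the faster the coefficients $f_n$ must decay for $f$ to belong to $D_s$.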

The operator $\partial_v^2$ is self-adjoint on $D_{3/2}$. Thus we may define $B_{\pm}$ by

$$B_{\pm} f = \sum_{n=1}^{\infty} \sqrt{k^2\mu_{\pm} - n^2\pi^2/a^2}\; f_n\varphi_n, \qquad (5.6)$$

assuming that the branch $\operatorname{Im} \ge 0$ of the square root is taken. It is clear that $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_{3/2} \to D_{3/2}$ and $B_{\pm}: D_s \to D_{s-1}$, $s \ge 1$.

For tubes with edges in the $\zeta$-duct things are a little more complicated. With no restriction on the sharpness of the edges we cannot improve on $\phi \in H^1_{loc}$, implying $\phi(u,\cdot) \in D_{1/2}$. Then, as an intermediate step in our calculations, $B_{\pm}\phi$ should be in the space $D_{-1/2}$. Such a derivative must of course be interpreted as a distribution. However, the end result, i.e. the scattered wave function, belongs to $D_{1/2}$. To generalise, we define by duality for positive $s$

$$D_{-s} = \left\{ g : \int_0^a f(v)g(v)\,dv < \infty \ \text{for all } f \in D_s \right\}.$$

Multiplication by $\mu$ is an operator $D_{1/2} \to D_{-1/2}$, and if $s \ge 1/2$ we have the following mapping properties: $B_{\pm}: D_s \to D_{s-1}$, $\partial_v: D_s \to D_{s-1}$, and $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_s \to D_s$.


The equations (4.7)-(4.8) can only in very special cases be solved in closed form. Therefore some type of numerical scheme is used. Generally a numerical method cannot give uniform convergence for the entire space $D_s$. In a practical application it is usually sufficient to know the effect of the scattering matrices on the lowest eigenfunctions, the first $N_0$, say. A practical method is therefore to truncate the matrix representation of (4.7)-(4.8) to $N \gg N_0$ and solve the finite-dimensional ordinary differential equation with a standard numerical routine. Nilsson3 proves that such a procedure converges when $N \to \infty$.
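The truncation step can be sketched by assembling the $N \times N$ block of the matrix $B^2(u)$ at a fixed $u$ from the cosine coefficients of $\mu$. The sketch assumes the convention $\mu = \sum_{l \ge 0} \mu_l \cos(l\pi v/a)$ with $\mu_l = 0$ for negative $l$; the function name and the sample coefficients are hypothetical.

```python
import math


def b2_matrix(mu, k, a, N):
    """N x N truncation of the matrix of B^2 = d^2/dv^2 + k^2 mu in the sine basis.

    mu[l] is the l-th cosine coefficient of mu at a fixed u; coefficients with
    negative or out-of-range index are treated as zero.
    """
    def m(l):
        return mu[l] if 0 <= l < len(mu) else 0.0

    rows = []
    for n in range(1, N + 1):
        row = []
        for j in range(1, N + 1):
            value = 0.5 * k * k * (m(j - n) + m(n - j) - m(j + n))
            if n == j:
                value -= (n * math.pi / a) ** 2
            row.append(value)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # hypothetical cosine coefficients of mu at one value of u
    for row in b2_matrix([0.7, 0.2, 0.05], k=3.0, a=1.0, N=3):
        print(row)
```

For a constant $\mu$ (only $\mu_0$ non-zero) the matrix is diagonal with entries $k^2\mu_0 - n^2\pi^2/a^2$, as in the flat regions.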

Presently numerical results are not available for the quantum tube scattering. However, Nilsson3 presents results for the acoustic case, where the Neumann rather than the Dirichlet boundary condition applies. He reports that for the lowest order reflection coefficient $N = 1$, i.e. a scalar solution, is accurate up to ka = 15, $N = 2$ gives a good and $N = 5$ gives a perfect description up to ka = 6. Energy conservation holds for all $N$.

References

1 J T Londergan, J P Carini, D P Murdock, Binding and scattering in two-dimensional systems: Applications to quantum wires, waveguides and photonic crystals, Lecture notes in physics (Berlin, Springer, 1999)

2 K Lin, R L Jaffe, Bound states and threshold resonances in quantum wires with circular bends, Phys Rev B 54, 5750-5762 (1996)

3 B Nilsson, Acoustic transmission in curved ducts with varying cross-sections, Article submitted to Proc Roy Soc A

4 J C Wu, M N Wybourne, W Yindeepol, A Weisshaar, S M Goodnick, Interference phenomena due to a double bend in a quantum wire, Appl Phys Lett 59, 102-104 (1991)

5 J Davies, The Physics of low-dimensional semiconductors (Cambridge, Cambridge University Press, 1998)

6 M Cessenat, Mathematical methods in electromagnetism (Singapore, World Scientific Publishing Co, 1996)

7 B Nilsson, O Brander, The propagation of sound in cylindrical ducts with mean flow and bulk reacting lining - IV Several interacting discontinuities, IMA J Appl Math 27, 263-289 (1981)

8 H Wu, D W L Sprung, J Martorell, Periodic quantum wires and their quasi-one-dimensional nature, J Phys D Appl Phys 26, 798-803 (1993)

9 P Henrici, Applied and computational complex analysis, Volume I (New York, John Wiley & Sons, 1988)

10 L Fishman, One-way propagation methods in direct and inverse scalar wave propagation modeling, Radio Science 28(5), 865-876 (1993)

11 R Bellman, G M Wing, An introduction to invariant imbedding, Classics in Applied Mathematics 8 (Philadelphia, Society for Industrial and Applied Mathematics (SIAM), 1992)

12 P Grisvard, Elliptic problems in nonsmooth domains, Monographs and studies in mathematics 24 (Boston, Pitman, 1985)

13 M Taylor, Partial differential equations I, Basic theory, Applied mathematical sciences 115 (New York, Springer, 1996)


Figure 1 Two-dimensional quantum tube

Figure 2 Schematic picture of heterostructure and split-gate structure (layers labelled in the figure: doped AlGaAs, undoped AlGaAs, undoped GaAs, semi-insulating GaAs)


Figure 3 Sub-tube with interior $\Omega$, upper boundary $\Gamma_+$ and lower boundary $\Gamma_-$; $b/a = 0.6$

Figure 4 $\mu(u,v)$ in the straight duct. Parameters as in figure 3. $\mu^{1/2}$ is the refractive index


POSITION EIGENSTATES AND THE STATISTICAL AXIOM OF QUANTUM MECHANICS

L POLLEY Physics Dept Oldenburg University 26111 Oldenburg Germany

E-mail polley@uni-oldenburg.de

Quantum mechanics postulates the existence of states determined by a particle position at a single time. This very concept, in conjunction with superposition, induces much of the quantum-mechanical structure. In particular, it implies the time evolution to obey the Schrödinger equation, and it can be used to complete a truly basic derivation of the statistical axiom as recently proposed by Deutsch.

1 Quantum probabilities according to Deutsch

A basic argument to see why quantum-mechanical probabilities must be squares of amplitudes (statistical axiom) was given by Deutsch1,2. It is independent of the many-worlds interpretation. Deutsch considers a superposition of the form

$$\sqrt{\frac{m}{m+n}}\,|A\rangle + \sqrt{\frac{n}{m+n}}\,|B\rangle. \qquad (1.1)$$

He introduces an auxiliary degree of freedom $i = 1, \ldots, m+n$ and replaces $|A\rangle$ and $|B\rangle$ by normalized superpositions:

$$|A\rangle \to \frac{1}{\sqrt{m}}\sum_{i=1}^{m} |A\rangle|i\rangle, \qquad |B\rangle \to \frac{1}{\sqrt{n}}\sum_{i=m+1}^{m+n} |B\rangle|i\rangle. \qquad (1.2)$$

All amplitudes in the grand superposition are equal to $1/\sqrt{m+n}$ and should result in equal probabilities for the detection of the states. This immediately implies the ratio $m : n$ for the probabilities of property A or B.
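The counting step of the argument can be made concrete with a small sketch. It assumes initial amplitudes $\sqrt{m/(m+n)}$ and $\sqrt{n/(m+n)}$ for A and B, so that the replacement yields equal amplitudes $1/\sqrt{m+n}$; function names are illustrative.

```python
import math


def grand_superposition(m: int, n: int) -> dict:
    """Amplitudes of the states (label, i) after the replacement by normalized sums."""
    amp_a = math.sqrt(m / (m + n))  # assumed initial amplitude of A
    amp_b = math.sqrt(n / (m + n))  # assumed initial amplitude of B
    state = {}
    for i in range(1, m + 1):
        state[("A", i)] = amp_a / math.sqrt(m)
    for i in range(m + 1, m + n + 1):
        state[("B", i)] = amp_b / math.sqrt(n)
    return state


def probabilities_by_counting(m: int, n: int):
    """Equal amplitudes give equal probabilities, so A and B follow by counting."""
    state = grand_superposition(m, n)
    total = len(state)
    count_a = sum(1 for (label, _) in state if label == "A")
    return count_a / total, (total - count_a) / total


if __name__ == "__main__":
    print(probabilities_by_counting(3, 2))
```

No squaring of amplitudes enters the counting: the squares $m/(m+n)$ and $n/(m+n)$ of the initial amplitudes emerge as the fractions of equally weighted terms.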

The argument has clear advantages over previous derivations of the statistical axiom. Gleason's theorem3,4, for example, is mathematically non-trivial and not well received by many physicists, while von Neumann's assumption $\langle O_1 + O_2\rangle = \langle O_1\rangle + \langle O_2\rangle$ about expectations of observables5,6 is difficult to interpret physicswise if $O_1$ and $O_2$ are non-commuting4,5.

However, Deutsch's argument relies in an essential way on the unitarity of the replacement, or the normalization of any physical state vector. Why should a state vector be normalized in the usual sense of summing the squares of amplitudes? It would seem desirable to provide justification for this beyond its being natural2. In fact, the reasoning would appear circular without an extra argument about unitarity or normalization. I have proposed7 to realize the replacement (1.2) physically by the time evolution of a suitable device. Then, what can be said about quantum-mechanical evolution without anticipating the unitarity?

2 Schrödinger's equation for a free particle as a consequence of position eigenstates

For free particles, a well-known and elegant way to obtain the Schrödinger equation is via unitary representations of space-time symmetries. Interactions can be introduced via the principle of local gauge invariance. However, this approach to the equation anticipates unitarity.

As I pointed out recently8, the Schrödinger equation for a free scalar particle is also a consequence of the very concept of a position eigenstate(a) in discretized space. To an extent, this just means to regard hopping amplitudes, as they are familiar from solid state theory, as a priori quantum-dynamical entities. The point is to show, however, that a hopping-parameter scenario without unitarity would lead to consequences sufficiently absurd to imply that unitarity must be a property of the physical system. As will be seen below, the absurdity is that a wave-function that makes perfect sense at $t = 0$ would cease to exist anywhere in space at an earlier or later time.

Consider a spinless particle hopping on a 1-dimensional chain of positions $x = na$, where $n$ is an integer and $a$ is the lattice spacing.

[Diagram: chain of lattice sites ..., n-1, n, n+1, ... with spacing a]

Assume the particle is in an eigenstate $|n,t\rangle$ of position number $n$ at time $t$ (using the Heisenberg picture) and it has a possibility to change its position. The information given by a position at one time does not determine which direction the particle should go. Thus the eigenstate $|n,t'\rangle$ necessarily is a superposition when expressed in terms of eigenstates relating to another time $t$. Moreover, because of the same lack of information, positions to the left and right will have to occur symmetrically. If $t' \to t$, only nearest neighbours will be involved. Thus we expect a hopping equation of the form

$$|n,t'\rangle = \alpha\,|n,t\rangle + \beta\,|n+1,t\rangle + \beta\,|n-1,t\rangle.$$

This can be rewritten as a differential equation in $t$:

$$-i\frac{d}{dt}|n,t\rangle = V\,|n,t\rangle + \kappa\,|n+1,t\rangle + \kappa\,|n-1,t\rangle, \qquad \kappa, V \ \text{complex (so far)}.$$

(a) Which relies on linear algebra, hence includes the concept of superposition.


Parameters $\alpha, \beta$ and $\kappa, V$ are in an algebraic relation8 which need not concern us here. To obtain an equation for a wave-function, we consider a general state $|\psi\rangle$ composed of simultaneous position eigenstates:

$$|\psi\rangle = \sum_n \psi(n,t)\,|n,t\rangle \qquad \text{(Heisenberg picture)}.$$

This defines the coefficients $\psi(n,t)$ for all $t$. Now take the time derivative on both sides, identify $\psi(n,t)$ with a function $\psi(x,t)$ where $x = na$, and Taylor-expand the shifted values $\psi(x \pm a, t)$. This results in a differential equation for $\psi(x,t)$ with terms up to order $a^2$.

Finally, take $a \to 0$ on the relevant physical scale. The spatial spreading of the wave-function is then given by the $a^2$ term, and the solution of the equation is

$$\psi(x,t) = e^{-i(V+2\kappa)t}\int \tilde\psi(p)\,e^{ipx}\,e^{-ia^2\kappa p^2 t}\,dp.$$

This time evolution would be unitary if $\kappa$ and $V$ were real. Hence, consider the consequences of a non-real $\kappa$. The integrand would then contain an evolution factor increasing towards positive or negative times like

$$\exp\left(\pm a^2\,\operatorname{Im}\kappa\; p^2 t\right).$$

This would lead to physically absurd conclusions about certain harmless wave-functions, like the Lorentz-shape function $\psi(x) = 1/(1+x^2)$:

bull For $\operatorname{Im}\kappa > 0$ the harmless function $\tilde\psi(p) \propto \exp(-|p|)$ would not exist anywhere in space after a short while.

bull For $\operatorname{Im}\kappa < 0$ the harmless function could not be prepared for an experiment to be carried out on it after a short while.

In a mathematical sense of course it still remains a postulate that the value of K be real But physicswise it does seem that unitarity of quantum mechanics is unavoidable once the superposition principle and the concept of position eigenstate are taken for granted

As for parameter $V$, the factor $e^{-iVt}$ would be raised to the $n$th power in an $n$-particle state and would lead to an absurdity similar to the above with certain superpositions of $n$-particle states, unless $V$ is real, too.
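The role of a real hopping parameter can be checked numerically on a finite ring of lattice sites. The sketch below is an illustration under stated assumptions: the wave-function coefficients are taken to obey $\dot\psi(n) = -i\,[V\psi(n) + \kappa\psi(n+1) + \kappa\psi(n-1)]$ with periodic boundary conditions, the initial profile is a sampled Lorentz shape, and the function name is hypothetical.

```python
import cmath
import math


def norm_ratio(kappa: complex, V: complex, t: float, sites: int = 64) -> float:
    """Ratio ||psi(t)||^2 / ||psi(0)||^2 for hopping on a ring with `sites` positions."""
    # Lorentz-shaped initial profile centred on the ring (hypothetical choice)
    psi0 = [1.0 / (1.0 + (n - sites // 2) ** 2) for n in range(sites)]
    norm0 = sum(abs(c) ** 2 for c in psi0)
    norm_t = 0.0
    for q in range(sites):
        theta = 2.0 * math.pi * q / sites
        # unitary discrete Fourier coefficient of mode q
        c_q = sum(psi0[n] * cmath.exp(-1j * theta * n) for n in range(sites))
        c_q /= math.sqrt(sites)
        lam = V + 2.0 * kappa * math.cos(theta)  # mode q evolves as exp(-i*lam*t)
        norm_t += abs(c_q * cmath.exp(-1j * lam * t)) ** 2
    return norm_t / norm0


if __name__ == "__main__":
    print(norm_ratio(1.0, 0.5, 3.0))         # real kappa and V: norm is conserved
    print(norm_ratio(1.0 + 0.1j, 0.5, 3.0))  # non-real kappa: norm is not conserved
```

On the finite ring the norm merely changes by a finite factor; in the continuum limit discussed above the growth factor in $p^2$ is unbounded, which is what makes the wave-function cease to exist.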


3 Driven particle Weyl equation in general space-time

As an example of a particle interacting with external fields we may consider a massless spin 1/2 particle with inhomogeneous hopping conditions8. Here the starting point is common eigenstates of spin and position, where position refers to a site on a cubic spatial lattice. A particle in such a state at time $t$ will be in a superposition of neighbouring positions and flipped spins at a time $t' \approx t$. In 3 dimensions, and immediately in terms of a wave-function, the corresponding differential equation is

$$-i\frac{d}{dt}\psi_s(\mathbf{x},t) = \sum_{\text{lattice directions } \mathbf{n}} H_{\mathbf{n}ss'}\,\psi_{s'}(\mathbf{x} - a\mathbf{n}, t),$$

where $H_{\mathbf{n}ss'}$ are any complex amplitudes. On-site hopping (time-like direction) is included as $\mathbf{n} = 0$. To begin with, a free particle is defined by translational and rotational symmetry. In this case the hopping amplitudes reduce to two independent parameters8, $\epsilon$ and $\kappa$, both of them complex so far. By Taylor-expanding the wave-function and taking $a \to 0$ we find

$$\partial_t\psi_s(\mathbf{x},t) = \epsilon\,\psi_s(\mathbf{x},t) - a\kappa\,\sigma^n_{ss'}\,\partial_n\psi_{s'}(\mathbf{x},t).$$

If $\kappa$ had an imaginary part, it would lead to physical absurdities with the time-evolution of certain harmless wave-functions, similarly to the previous section. For real $\kappa$ we recover the non-interacting Weyl equation.

If we now admit slight (order of $a$) anisotropies and inhomogeneities in the hopping amplitudes by adding some $a\,\gamma_{\mu ss'}(x,t)$ to the hopping constants above, we recover a general-relativistic version of the equation9, with the $\gamma_{\mu ss'}(x,t)$ acting as spin connection coefficients. Unitarity in this context means that the probability current density

$$j^\alpha(x,t) = \psi^\dagger(x,t)\,\sigma^\alpha\,\psi(x,t)$$

is covariantly conserved:

$$\partial_\alpha j^\alpha + \Gamma^\alpha_{\beta\alpha}\,j^\beta = 0.$$

This is found to hold automatically if the vector connection coefficients are identified as usual9 through the matrix equation

Imposing no constraints on the spin connection coefficients, we are dealing with a metric-affine space-time here, which can have torsion and whose metric may be covariantly non-constant. The study of space-times of this general structure has been motivated by problems of quantum gravity9. It may be interesting to note that nothing but propagation by superposing next-neighbour states needs to be assumed here. In particular, scalar products of state vectors are not needed.

Figure 1 An array of eight cavities of equal shape. The initial state is located in the central cavity. When each channel is opened for an appropriate time, the state evolves to an equal-amplitude superposition of the peripheral cavity-states.

4 Realizing Deutsch's substitution as a time evolution

Having demonstrated automatic unitarity on two rather general examples, we can now turn with some confidence to the original issue of completing Deutsch's derivation of the statistical axiom.

To realize the particular substitution (1.2) for state vector (1.1), let us consider a particle with internal eigenstate $|A\rangle$ or $|B\rangle$, such as the polarisations of a photon. Let this particle be placed in a system of cavities(b) connected by channels (Fig 1) which can be opened selectively for internal state $|A\rangle$ or $|B\rangle$. It will be essential in the following that all cavities are of the same shape, because this will enable us to exploit symmetries to a large extent. The location of the particle in a cavity will serve as the auxiliary degree of freedom as in (1.2), except that $|A\rangle$ and $|B\rangle$ before the substitution will be identified with $|A\rangle|0\rangle$ and $|B\rangle|0\rangle$, where $|0\rangle$ corresponds to the central cavity.

(b) Or Paul traps, or any other sort of potential well; these are to enable us to store away parts of the wave function so that there is no influence on them by the other parts.

Now let only one of the channels be open at a time. We are then dealing with the wave-function dynamics of a two-cavity subsystem while the rest of the wave-function is standing by. What law of evolution could we expect? A particle with a well-defined (observed) position 0 at time $t'$ will no longer have a well-defined position at time $t$ if we allow it to pass through a channel without observing it. Thus a state $|0,t'\rangle$ defined by position 0 at time $t'$ (using the Heisenberg picture) will be a superposition when expressed in terms of position states relating to a different time $t$. In particular, if channel $0 \leftrightarrow 1$ is the open one,

$$|0,t'\rangle = \alpha\,|0,t\rangle + \beta\,|1,t\rangle.$$

Likewise, by symmetry of arrangement,

$$|1,t'\rangle = \alpha\,|1,t\rangle + \beta\,|0,t\rangle.$$

It follows that $|0,t\rangle \pm |1,t\rangle$ are stationary states whose dependence on time consists in prefactors

$$(\alpha \pm \beta)^k \quad \text{after } k \text{ time steps.} \qquad (4.1)$$

If the particle is initially in the rest of the cavities, whose channels are shut, we would expect this state not to change with time:

$$|\mathrm{rest},t'\rangle = |\mathrm{rest},t\rangle.$$

Now if (4.1) were not mere phase factors, we could easily construct a superposition of $|0\rangle$, $|1\rangle$ and $|\mathrm{rest}\rangle$ so that, relative to the disconnected cavities, the part of the state vector in the connected cavities would grow indefinitely or vanish in the long run. As there is no physical reason for such an imbalance between the connected and the disconnected cavities, we conclude that

$$\alpha + \beta = e^{i\varphi}, \qquad \alpha - \beta = e^{i\varphi'}.$$

Having shown evolution through one open channel to be unitary, we can identify an opening time interval7 $\tau_m$ to realize the following step of the replacement (1.2):

$$\sqrt{m}\,|A\rangle|0\rangle + |\mathrm{rest}\rangle \to \sqrt{m-1}\,|A\rangle|0\rangle + |A\rangle|1\rangle + |\mathrm{rest}\rangle.$$

Here $|\mathrm{rest}\rangle$ stands for state vectors that are decoupled, such as all $|B\rangle|i\rangle$ and all $|A\rangle|i\rangle$ with $i \ne 0, 1$. Opening other channels analogously, each one for the appropriate $\tau_m$ and internal state, we produce an equal-amplitude superposition

$$\sum_{i=1}^{m} |A\rangle|i\rangle + \sum_{i=m+1}^{m+n} |B\rangle|i\rangle.$$

The probability of finding the particle in a particular cavity is now $1/(m+n)$ as a matter of symmetry. As the internal state is correlated with a cavity by the conduction of the process, the probabilities for A and B immediately follow. These must also be the probabilities for finding A or B in the original state, because properties A and B have remained unchanged during the time evolution.
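The sequence of channel openings can be sketched as a chain of two-state rotations acting on the amplitudes. This is an illustration only: the transfer through each channel is modelled as a real rotation, with relative phases assumed to be absorbed into the cavity states, and the function name is hypothetical.

```python
import math


def distribute(m: int) -> list:
    """Amplitudes [centre, cavity 1, ..., cavity m] after m successive channel openings.

    The start is sqrt(m) in the central cavity; opening channel j moves
    amplitude 1 into cavity j by a unitary two-state rotation.
    """
    amps = [math.sqrt(m)] + [0.0] * m
    for j in range(1, m + 1):
        remaining = m - j + 1                       # squared amplitude still in the centre
        c = math.sqrt((remaining - 1) / remaining)  # cosine of the mixing angle
        s = math.sqrt(1.0 / remaining)              # sine of the mixing angle
        centre, cavity = amps[0], amps[j]
        amps[0] = c * centre - s * cavity
        amps[j] = s * centre + c * cavity
    return amps


if __name__ == "__main__":
    print(distribute(4))  # centre emptied, four peripheral cavities with equal amplitude
```

Each rotation preserves the sum of squared amplitudes, so the total weight $m$ of the A part ends up spread equally over $m$ cavities; the B part is treated in the same way with $n$ cavities.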

5 Can normalization be replaced by symmetry

An interesting side effect of the above realization of Deutsch's argument is that state vectors need no longer be normalized at all. Permutational symmetry of a superposition suffices to show that all possible outcomes of an experiment must occur with equal frequency. Then the numerical values of the probabilities are fully determined. This feature of quantum probabilities may be relevant to problems of normalization in quantum gravity10, such as the non-locality of summing $|\psi|^2$ over all of space, or the non-normalizability of the solutions of the Wheeler-DeWitt equation.

References

1 D Deutsch, Proc Roy Soc Lond A 455, 3129 (1999); Oxford preprint (1989)

2 B DeWitt, Int J Mod Phys 13, 1881 (1998)

3 A M Gleason, J Math Mech 6, 885 (1957)

4 A Peres, Quantum Theory (Kluwer Academic Publishers, Dordrecht, 1995)

5 J von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin-New-York, 1932)

6 A Bohr, O Ulfbeck, Rev Mod Phys 67, 1 (1995)

7 L Polley, quant-ph/9906124

8 L Polley, quant-ph/0005051

9 F W Hehl et al, Rev Mod Phys 48, 393 (1976); Phys Rep 258, 1 (1995)

10 A Ashtekar (ed), Conceptual problems of quantum gravity (Birkhäuser, 1991)


IS RANDOM EVENT THE CORE QUESTION SOME REMARKS AND A PROPOSAL

P ROCCHI

IBM via Shangai 53 00144 Roma Italy E-mail paolorocchi@it.ibm.com

This work addresses the Probability Calculus foundations We begin with considering the relations of the event models today in use with the physical reality Then we propose the structural model of the event and a definition of probability that harmonizes the interpretations sustained by different probabilistic schools

1 Preface

The origin of the Probability Calculus is credited to Pascal, who applied rigorous methods to a matter that had been grasped only by gamblers and unreliable individuals until then. He intended to lay the foundations of a new Geometry, and the random event should be a point in this hypothetical abstract science. Throughout the centuries several scientists shared Pascal's conjecture, which has been accepted without discussion. Instead, in our opinion, an exhaustive and systematic approach to probability requires us to investigate the random event before examining the probability itself. The probability theories do not diverge in their final results and do not provide different formulas for the total probability and the conditioned probability; instead they are in contrast on the foundations, to wit in the initial concepts, and this circumstance seems to us a substantial reason to study the random event.

In brief we may say that the probability theories use two main models of the random event the linguistic model and the set model We shall examine them in the ensuing sections However we do not restrict our works to mere criticism but we shall trace a theoretical proposal This one provides a new mathematical model of the random event and a definition of probability which seems capable of harmonizing the various authors appearing today in contrast Kolmogorov and the frequentists the subjectivist and objectivist schools etc In this article we present a few elements taken from the complete theoretical framework [11]

2 Linguistic Model

In general different sentences can describe the same random event. Let the propositions p, q, ... regard one event and verify the equivalence relationship

p ⇔ q (1)

They form the equivalence class X

X = {p, q, ...} (2)

that constitutes the model of the random event so that we have

P = P(X) (3)

We share the opinion that random events are extremely complex and the linguistic model (2) is consistent with this feature Disciplines which investigate complicated phenomena such as psychology and sociology business management and medicine adopt the linguistic representation and consider other schemes to be too simple and reductive The proposition seems an adequate model except for the following perplexity Each primitive is a simple idea and can be left to intuition only for its fundamental property For example a number a point an entity are elementary concepts Can we declare that the random event is complex and contemporarily assume it is a primary concept The acknowledgement of the complexity opposes the primitive assumption This contrast would at least require an in depth justification that instead is lacking as far as we know

The inconsistency is confirmed in the every-day practice and we examine the linguistic model in relation to the facts

2.1) - Some subjectivists declare that each particular of the event should be described in order to make evident its uniqueness, whereas in usual calculations we accept a sentence such as

The coin comes down heads (4)

Note that only two items are reported the coin and the result The precise date time place and all the particulars that make the event unique and unrepeatable remain implicit In fact the parts of a probabilistic event are not easy to distinguish and to relate in a sentence In conclusion a gap exists between the theoretical assertions and the practical applications of (2)

2.2) - In the Logic of Predicates every phrase has a precise meaning and is liable to be calculated. Programmers using Prolog and Lisp develop inferences. Logical programs can deduce the thesis from the hypothesis using precise clauses. However, this linguistic precision constitutes an exception, and normally the natural language is approximate to the extent that a word must be interpreted. The natural language usually represents a random event in generic terms, whereas the linguistic model (2) should be liable to the probability calculation (3).


3 Ensemble Model

The axiomatic theory [8] assumes that the sample space Ω includes all the possible elementary events. Kolmogorov defines the random event X as a set of particular events Ex

X = {Ex} (5)

when X is a subset of Ω

X ⊆ Ω (6)

and the probability is the measure of X

P = P(X) (7)

The practical application of the theory is immediately clarified by Kolmogorov who defines X as the result of the event

3.1) - This conception causes some perplexities in the light of modern systemic studies. Applied and theoretical works on systems [7] assume the event as the dynamic producing the result from the antecedent item:

input → EVENT → output (8)

The result is a part and the event is the whole. The properties of the event are evidently quite different from the properties of the output. We encounter heavy difficulties when we call {Ex} a set of events and at the same time conceive it as a set of results. We cannot merge them without a logical justification. But do we have any?

3.2) - Some probabilistic outcomes cannot be properly modeled as sets and subsets. The spectrum of interference in the two slit experiment is a well-known case emerging in Quantum Physics [6].


4 Structural Model

We searched for a solution of the above written difficulties and we designed a theoretical framework based on the structure model for the random event

Ludwig von Bertalanffy, father of the General Systems Theory, conceives a system, and consequently an event, as an intricate set of items which affect one another [2]. Interacting and connecting is the essential character and the inner nature of events, and we take this idea as the basis of our theoretical proposal. We make the following assumption:

Axiom 4.1) - The idea of relating, of connecting, of linking is a primitive.

This idea suggests two elements specialized in relating and in being related, which we call entity and relationship. We define them as follows:

Definition 4.2) - The relationship R connects the entities, and we say R has the property of connecting.

Definition 4.3) - The entity E is connected by R, and we say E has the property of being connected.

Intuitively we may say R is the active element and E is the passive one. They are symmetric, complementary and complete, since they exhaust the applications of Axiom 4.1). Relationships and entities are already known in Algebra as operations and elements, as arrows and objects, as edges and vertices. The main difference is that all of them are given as primitive, while R and E derive from the axiomatic concept 4.1). In other words, the properties of the relationship and the entity are openly given in 4.2) and 4.3), while they are implicit in other theories. We underline that Axiom 4.1) is not a theoretical refinement and will provide the necessary basis to the ensuing inferences.

From Definitions 4.2) and 4.3) it follows that the relationship R links the entity E, and they give the set

S = (E, R) (9)

which is an algebraic structure [4]. In this article we discuss theoretical models with respect to the physical reality, thus we immediately examine how E, R and S provide proper models for events. The parts of an event are entities and relationships. As an example, an entity is a dice, a spade, heads, tails, a product. The relationship that connects two or more entities is, for instance, a device, a force, a physical interaction [3]. In the physical reality an event is a dynamic phenomenon linking Ein to Eout, and from (9) we can deduce this general structure

S = (Ein, Eout, R) (10)

Using a graph we get

Ein → R → Eout (11)

R is the pivotal element in (10) and (11) and the structural model represents accurately the facts In addition we get the following advantages

1 The result Eout is distinct from the event S. The parts and the whole are logically separate, and they give a precise answer to objection 3.1).

2 Relations and entities constitute finite and also infinite sets, so that R and E match with both discrete and continuous mathematical formalism.

3 When Eout is an ensemble

Eout = {Ex} (12)
Eout ⊆ Ω (13)

the structure accomplishes the set model in (5) and (6).

4 The result Eout may also be a rational or an irrational number, a real or an imaginary value. It can be calculated by a wave function or by another function etc, and we can offer a formal solution to point 3.2).

5 The structure S can include the comprehensive context of the probabilistic event. E.g. the atomic experiment depends upon the observer Eo, and we have this exhaustive structure

S = (Ein, Eout, Eo, R) (14)

We believe that the structural model can give a contribution to Quantum Probability.

6 A simple sentence includes nouns that are entities and a verb representing a dynamical evolution. E.g. (4) expresses the following entities and relationship

The coin | comes down | heads
Ein | R | Eout (15)

In short, the algebraic structure accomplishes the linguistic model. However, a sentence can be equivocal, whereas the structure S is a rigorous formalism and answers point 2.2).

Note that the set (9) has the associative dissociative property, namely the event is a unicum S, then it is defined in terms of the details E and R. If this analysis is insufficient, we reveal the entities (E1, E2, ..., Em) and the relations (R1, R2, ..., Rp); these are exploded at a greater level, and so forth. The structure of levels is the complete and rigorous model of any event

S =
= (E, R) =
= (E1, E2, ..., Em, R1, R2, ..., Rp) =
= (E11, E12, ..., Em1, Em2, ..., Emk, R11, R12, ..., Rp1, Rp2, ..., Rph) (16)

The structure can also be written as

level 0: S
level 1: E, R
level 2: E1, E2, ..., Em, R1, R2, ..., Rp
level 3: E11, E12, ..., Em1, Em2, ..., Emk, R11, R12, ..., Rp1, Rp2, ..., Rph (17)

The multiple level decomposition is known also as the hierarchical property in the literature [13]. It is applied by professionals in software analysis methodologies [14][10], it is basic in modern ontology [12] and in various other sectors [1]. The progressive explosion of the event is already known in the Probability Calculus, where we use trees connecting the parts and the subparts of a random event. For example, an urn contains x red balls, y green balls and z white balls. What is the probability of getting a white and two green balls through three draws?

We consider the drawing Rw of a white ball w and Rg of a green ball g. The winning combinations wgg, gwg, ggw are generated by R1, R2 and R3. Intuitively we write this tree connecting three levels

[Tree (18): root S; branches R1, R2, R3; leaves (Rw Rg Rg), (Rg Rw Rg), (Rg Rg Rw)]

The structure of levels (17) is rigorous and complete. It includes the relations of the event as well as the entities

level 0: S
level 1: g, w, R1+R2+R3
level 2: wgg, gwg, ggw, (Rw Rg Rg)+(Rg Rw Rg)+(Rg Rg Rw) (19)

Thanks to this completeness the structural model provides some insight into what is involved. In particular, if Rx at level k includes the subrelationships of level (k + 1), then Rx connects the entities through these subrelationships. E.g. the structure of levels (19) illustrates the dynamic R1 carried out by (Rw Rg Rg), which physically determine the results. The structure (16) proves that any event is composed of precise macromechanisms and micromechanisms. Any event appears like an industrial apparatus, a mechanical clock or an electronic device including various working parts. This operational analysis, which is based on Axiom 4.1), will be fundamental in the next section.
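The urn example can be worked through by enumerating the three winning orders, one for each of R1, R2 and R3. The sketch assumes that the balls are drawn without replacement, which the text does not state explicitly; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def draw_sequence_probability(sequence, counts):
    """Probability of drawing the given colour sequence without replacement."""
    remaining = dict(counts)
    total = sum(remaining.values())
    p = Fraction(1)
    for colour in sequence:
        p *= Fraction(remaining[colour], total)
        remaining[colour] -= 1
        total -= 1
    return p


def one_white_two_green(x_red: int, y_green: int, z_white: int) -> Fraction:
    """Sum over the winning orders wgg, gwg, ggw (the sub-relations R1, R2, R3)."""
    counts = {"r": x_red, "g": y_green, "w": z_white}
    orders = set(permutations("wgg"))
    return sum(draw_sequence_probability(order, counts) for order in orders)


if __name__ == "__main__":
    # hypothetical urn with 2 red, 3 green and 1 white ball (at least 3 balls needed)
    print(one_white_two_green(2, 3, 1))
```

Each term of the sum corresponds to one leaf of the tree, that is to one chain of elementary drawings such as (Rw Rg Rg).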

5 Certain and Uncertain Structures

Probability is the answer to questions such as: Who will win the next football match? Who will be elected in the regional elections? Shall I pass the examinations? Where is the photon now?

These questions show that probability concerns the particulars of an event that is already known as a whole. We see the overall random phenomenon, but we do not know the details that will produce the result. When we ask who will win the next match, we are familiar with the match: we already know which teams will play, where the match will be held, etc. We master the event, yet we do not have the details that will determine the result. Why do we not have these details?

The cognitive difficulties related to the particulars of a random event have several origins. For example, memory is generic, the reports are not detailed, the particulars are missing because they are scattered over a vast area, we meet obstacles in the use of instruments, etc.

Ignorance of the microscopic level is sometimes a voluntary choice. Every detail could be observed, and yet we decline to know it. For example, a company has collected analytical data, but the executive managers disregard them and rely on average values when taking important decisions. Macroscopic knowledge combined with unawareness of microscopic items constitutes a precise method, which statisticians adopt and which is fully scientific.

Let us translate these concepts into the formalism just introduced. Let the event $S$ have level 1, level 2, up to level $q$. Two cases arise.


5.1 Certain Structures

The event is entirely described by the relations and the entities of level $q$. The elements at level $(q + 1)$ exist neither on paper nor in physical reality. This structure, which is wholly defined and complete, is certain. As an example we take a falling body:

level 0: $S$
level 1: $E_b, E_T, R_f$ (20)

The structure includes the body $E_b$, the Earth $E_T$ and the force of gravity $R_f$ at level 1. These elements model the event exhaustively, and no other elements exist in the physical world.

5.2 Uncertain Structures

The event is not entirely described by the relations and the entities of level $q$. The microelements pertaining to level $(q + 1)$ exist in physical reality and influence the final results in a decisive way; however, the structure does not include them. We call such a partial structure uncertain (or random). As an example we take the flipping of a coin. The structure includes the coin $E_m$ and the launching-falling dynamics $R_m$. The entities $E_t$ (heads) and $E_c$ (tails), and the alternative relations $R_t$ and $R_c$ that produce them, appear at the next level:

level 0: $S$
level 1: $E_m, R_m$
level 2: $E_t, E_c, R_t + R_c$
level 3: (21)

The subrelationships of $R_t$ and of $R_c$ produce any specific outcome. They are essential, since they would enable the calculation of any result, and they should be listed at level 3 in (21). However, they do not appear, and the structure (21) is uncertain.
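The distinction between the structures (20) and (21) can be pictured as a small data structure. This is a hypothetical sketch: the class name `LevelStructure` and the flag `hidden_sublevel` are introduced here for illustration only, and the flag simply records whether elements of level $(q + 1)$ exist in reality without being listed.

```python
from dataclasses import dataclass, field

@dataclass
class LevelStructure:
    """Levels 1..q of an event, plus a flag for unlisted elements at level q + 1."""
    name: str
    levels: list = field(default_factory=list)  # levels[i] lists the elements of level i + 1
    hidden_sublevel: bool = False               # True if level q + 1 exists but is not listed

    def is_certain(self) -> bool:
        """A structure is certain when nothing below level q is left out."""
        return not self.hidden_sublevel

# Structure (20): a falling body, fully described at level 1.
falling_body = LevelStructure("falling body", [["Eb", "ET", "Rf"]])

# Structure (21): a coin flip, whose level 3 exists physically but is not listed.
coin_flip = LevelStructure(
    "coin flip",
    [["Em", "Rm"], ["Et", "Ec", "Rt+Rc"]],
    hidden_sublevel=True,
)

if __name__ == "__main__":
    for s in (falling_body, coin_flip):
        print(s.name, "certain" if s.is_certain() else "uncertain")
```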

6 Probability

A certain event is entirely explained through the structure of levels. The structure clearly indicates how the event runs through $q$ levels, which are exhaustive by definition. On the contrary, the uncertain structure is incomplete and cannot describe how the event runs in physical reality. Since it is impossible to describe how the event functions, because level $(q + 1)$ is unknown, we inquire when the event behaves, that is, when the random event exists in physical reality. This inquiry reveals a typically physical approach. The problem eludes whoever develops an abstract study: for the pure theoretician, the event $S$, once defined on paper, exists by definition. The applied scientist instead knows the great difference between the definition of a model and its experimental observation.

The structure of levels (16) shows that the event $S$ works through $R$; therefore we measure the ability of the relationship to connect.

Definition 5.1. When $R$ links the input to the output in physical reality, the event $S$ is certain and the measure $P(R)$ equals one:

$$P(R) = 1 \tag{22}$$

When $R$ does not run in physical reality, $S$ is impossible in fact and the measure $P(R)$ is zero:

$$P(R) = 0 \tag{23}$$

If $R$ runs occasionally, $P(R)$ assumes a decimal value. The connection is neither sure nor impossible, and $P(R)$ has a value between zero and one:

$$0 < P(R) < 1 \tag{24}$$

We call probability the measure $P(R)$ of the operation $R$, which extensively indicates the occurrence of $S$. We can add the following remarks.

1. The relationship $R$ is the precise argument of probability, while $S$ is generic.

2. Definition 5.1 is coherent with the common sense of probability, as $P(R)$ gauges the possibility or the impossibility of the random event.

3. In some special events we can define the operation through its outcome. Formally, we state a univocal relation between $E_{out}$ and $R$

$$E_{out} \Rightarrow R \tag{25}$$

and we calculate the probability of the outcome

$$P(E_{out}) = P(R) \tag{26}$$

E.g., the result heads $E_t$ appears whenever $R_t$ works, and we forecast the chances of a gamble from the possible outputs:

$$P(E_t) = P(R_t) = 0.5 \tag{27}$$

In conclusion, if (12), (13) and (26) are true, Definition 5.1 is consistent with Kolmogorov's theory.

4. Certain structures include only certain elements; impossible elements make no sense and are omitted. The unit value of probability merely confirms what is already stated in the levels. For example, $P(R_f)$ is one and substantiates the structure of levels (20). Conversely, the uncertain structure lacks the lowest elements, which are essential, and (24) reveals them. The decimal values of probability clarify the intervention of the elements at level $(q + 1)$. For example, we do not know the parts of $R_t$ that produce the result $E_t$ in (21), yet the probability (27) is capable of explaining how they work: exactly half of the occurrences of $S$ are due to the subrelationships of $R_t$, and the other half are produced by the components of $R_c$. The explicative and predictive value of probability in (24) appears highly relevant.

7 Experimental Verification

Our inferences are strictly inspired by experience, and Definition 5.1 must be confirmed by the facts. In order to simplify the discussion of practical verification, let the event include either the relationship $R_i$ or NOT $R_i$ at level 2, and let level 3 be unknown:

level 0: $S$
level 1: $E, R$
level 2: $E_i$, NOT $E_i$, ($R_i$ + NOT $R_i$)
level 3: (28)

By definition, the probability $P(R_i)$ expresses the runs of $R_i$; thus the number of occurrences $g_s(R_i)$ in the sample $s$ verifies the theoretical value $P(R_i)$. The more often $R_i$ connects, the larger $g_s(R_i)$ is; vice versa, the less often $R_i$ runs, the smaller $g_s(R_i)$ is. However, the absolute frequency $g_s(R_i)$ exceeds the range $[0, 1]$, so we select the relative frequency $F_s(R_i)$, which satisfies

$$0 \le F_s(R_i) \le 1 \tag{29}$$

According to this theory, the relative frequency should coincide with the probability calculated theoretically; instead, $F_s(R_i)$ does not coincide with $P(R_i)$. Why? Is there perhaps a systematic error in the experiment?

The relationship $R_i$ at level $q$ works by means of its subrelationships at level $(q + 1)$; however, we do not know in detail how these behave. In particular, a subrelationship at level $(q + 1)$ occurs at random, and a finite number of tests does not allow the subrelationships of $R_i$ to maintain their dynamical contribution to $R_i$. Symmetrically, the subrelationships of NOT $R_i$ are not proportional to $P(\text{NOT } R_i)$. Every finite sample of tests unbalances $R_i$ and NOT $R_i$: the occurrences of one group are lower than they ought to be and the occurrences of the other are greater, since the subrelationships are random. The relative frequencies favour one group of subrelationships to the detriment of the other; $F_s(R_i)$ and $F_s(\text{NOT } R_i)$ are necessarily unreliable and disagree with $P(R_i)$ and $P(\text{NOT } R_i)$. We conclude that the correct trial of probability must be extended over the universe, where the subrelationships of $R_i$ and of NOT $R_i$ do not undergo limitations. The ideal experimentation of $P(R_i)$, which excludes any deforming influence and provides the unaltered value of $F_s(R_i)$, requires the number $G_s$ of tests to be infinite:

$$G_s = \infty \tag{30}$$

In this situation the theoretical value $P(R_i)$ and the experimental one coincide:

$$F_s(R_i) - P(R_i) = 0 \tag{31}$$

The ideal experiment (30) is unattainable; therefore we can only approach it. We define this approximation using the limit

$$\lim_{G_s \to \infty} \left[ F_s(R_i) - P(R_i) \right] = 0 \tag{32}$$

The limit affirms that, given a high number $N$, there is a value $G_s$

$$G_s > N \tag{33}$$

such that

$$|F_s(R_i) - P(R_i)| < 1/G_s \tag{34}$$

In other words, if we repeat the tests a sufficiently high number of times, the difference between the frequency and the probability will be less than the small number $1/G_s$. The limit (32) ensures a result as fine as desired. It shows that the probability defined by (22), (23), (24) is verifiable in fact and confirms that the present theory has substance.
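The approach of the relative frequency to the probability can be observed in a simulated experiment. The following is a minimal sketch, assuming independent trials drawn from Python's pseudo-random generator with a fixed seed; the function names and the sample sizes are illustrative. It only reports the observed gap $|F_s - P|$ and does not test the bound (34), since in simulations of this kind the gap typically shrinks at a rate of about $1/\sqrt{G_s}$.

```python
import random

def relative_frequency(p, n_tests, rng):
    """Relative frequency Fs of an outcome of probability p in n_tests independent trials."""
    hits = sum(1 for _ in range(n_tests) if rng.random() < p)
    return hits / n_tests

def frequency_gaps(p=0.5, sample_sizes=(1, 10, 1_000, 100_000), seed=0):
    """Map each sample size Gs to the observed gap |Fs - P|."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return {g: abs(relative_frequency(p, g, rng) - p) for g in sample_sizes}

if __name__ == "__main__":
    for g, gap in frequency_gaps().items():
        print(f"Gs = {g:>6}: |Fs - P| = {gap:.4f}")
```

With $P = 0.5$ a single test always gives $F_s = 0$ or $F_s = 1$, so the gap is exactly 0.5, while the largest sample gives a gap close to zero.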

The limit (32), known as the empirical law of chance or the law of large numbers, does not define probability but only explains its experimental verification. It is less far-reaching than the law upheld by frequentists [9] and does not give rise to the same conceptual difficulties. The limit (32) does not use probability to describe the approximation of $F_s(R_i)$ to $P(R_i)$, and so avoids a certain conceptual tautology.

8 Objective and Subjective Probability

The limit (32) states that the higher the number of tests, the closer the frequency moves to the probability. Vice versa, the smaller the sample, the less reliable is the experimental control of probability. The maximum deviation emerges in a single test, and the structural model provides the explanation.

One subrelationship of level $(q + 1)$ fires in the single experiment, and this subrelationship pertains either to $R_i$ or to NOT $R_i$. In both cases the frequency deviates completely from the probability, which should be decimal:

$G_s$: 1 | $> N$ | $\infty$
$F_s$: wrong | approximate | right (35)

The spectrum (35) is valid in relation to frequency and also in relation to probability. What does this mean?

Any scientific measure takes its meaning under the precise conditions in which it is defined. Therefore a parameter does not have a value forever, but only in the practical conditions under which it must be tested. This rule also concerns probability. A fairly simple case can clarify the matter.

We define force as the factor causing the acceleration $a$ of the mass $m$:

$$f = m \cdot a \tag{36}$$

Mechanics defines the force (36) under conditions that pertain exclusively to the inertial system, which is characterized by the property of being stationary or moving straight and steadily. In the inertial system the mass $m$ undergoes the force and accelerates in accordance with (36). Conversely, in a non-inertial reference the body can move without any mechanical solicitation. The force cannot be tested, and definition (36) is meaningless, when the system is not inertial.

In general, a scientific measure takes on significance only under the experimental conditions pertaining to it; outside this context it objectively has no meaning. The same criterion applies to probability, with additional difficulties due to the experimental conditions, which are expressed by the limit (32) and are somewhat complex. We do not have two alternative and mutually exclusive reference systems, inertial and non-inertial; instead we have the continuous spectrum (35). Probability is correctly tested, and thus takes on a correct and objective significance, when

$$G_s = \infty \tag{37}$$

This is unattainable, and we use a large sample

$$G_s > N \tag{38}$$

The higher the number of tests, the more objective the verification of probability. Probability loses significance as $G_s$ decreases. The test is absolutely meaningless when

$$G_s = 1 \tag{39}$$

Probability is very useful (see point 3 in Section 6), and we calculate $P(R)$ even if (39) is true. In the single event, however, probability does not exist, as De Finetti paradoxically states [5]. Probability can only orient personal expectation; that is, probability takes on a subjective significance:

$G_s$: 1 | $> N$ | $\infty$
$F_s$: wrong | approximate | right
$P$: subjective | | objective (40)

Note that the subjectivist schools focus their attention on the single event, while the general event is a repetition of single events. This remark shows once again that the incongruences between various authors are rooted in the modeling of the random event.

In substance, $F_s(R_i)$ and $P(R_i)$ have a correct and objective meaning when they refer to the entire inductive base. As the number of experiments decreases, the precision of $F_s(R_i)$ decreases and the objectivity of $P(R_i)$ decreases progressively, down to the point (39), at which the numerical value of $F_s(R_i)$ is systematically wrong and the value of $P(R_i)$ is subjective.


9 Conclusions

Our theoretical proposal arose from a critical approach to the probabilistic event; in particular, we started by examining the relation between the theoretical models in use today and physical reality. We believe the algebraic structure meets the needs better than the linguistic and the set models. Besides the theoretical merits listed in the previous pages, we highlight that structures of levels are already applied in several fields, including the Probability Calculus.

The definition of probability that derives from the structural model is consistent with common sense and with the probabilistic schools. The different interpretations of probability, which today are in conflict, are unified within our framework. We judge this to be a significant feature that may stimulate the scientific debate.

The reader may find some parts of this paper sketchy and insufficiently explained; we regret the conciseness. Other considerations and further calculations have been developed in [11], but exhaustive discussions cannot be included here.

References

1. Ahl V., Allen T.F.H., Hierarchy theory: a vision, vocabulary and epistemology (Columbia Univ. Press, NY, 1996).
2. von Bertalanffy L., General system theory (Braziller, NY, 1968).
3. Chen P.S., The entity-relationship model: toward a unified view of data, ACM Transactions on Database Systems, vol. 1, n. 1 (1976).
4. Corry L., Modern algebra and the rise of mathematical structures (Verlag, NY, 1996).
5. de Finetti B., Theory of probability (Wiley & Sons, NY, 1975).
6. Feynman R., The concept of probability in quantum mechanics, Proceedings Symp. on Math. and Prob., California University Press (1951).
7. Kalman R.E., Falb P.L., Arbib M.A., Topics in mathematical system theory (McGraw, NY, 1969).
8. Kolmogorov A.N., Foundations of the theory of probability (Chelsea, NY, 1956).
9. von Mises R., The mathematical theory of probability and statistics (Academic Press, London, 1964).
10. Rocchi P., Technology + culture = software (IOS Press, Amsterdam, 2000).
11. Rocchi P., La probabilità è oggettiva o soggettiva? (Pitagora, Bologna, 1998).
12. Uschold M.L., Building ontologies: toward a unified methodology, Proc. Expert Systems, Cambridge (1996).
13. Takahara Y., Mesarovic M.D., Macko D., Theory of hierarchical multilevel systems (Academic Press, NY, 1970).
14. Yourdon E., Modern structured analysis (Englewood Cliffs, NY, 1989).


CONSTRUCTIVE FOUNDATIONS OF RANDOMNESS

V. I. SERDOBOLSKII
Moscow 109028, B. Trekhsviatitelskii 3/12, MGIEM
E-mail: vserd@mail.ru

The ideas of complexity and randomness are developed in a successively constructive theory. The Kolmogorov complexity is reconsidered as a minimization process. Basic theorems are proved for the processes. A new notion of complexity based on sequential prefix coding algorithms (S-algorithms) is proposed. It is proved that a constructive infinite binary sequence is algorithmically stationary iff it is an S-encoded random sequence.

1 Introduction

In 1963 A. N. Kolmogorov [1] suggested an algorithmic approach to the foundations of probability. His new definition of probability was based on the notion of complexity, defined as the length of the minimal description: for a binary word $x$, the complexity function is defined as

$$K(x) = \min_{A(p) = x} |p| \tag{1}$$

where $p$ are (shorter) binary words and the minimum is evaluated over all possible algorithms $A$. A remarkable property of this approach was that the algorithmically defined randomness was proved to display all the traditional laws of probability. However, the function $K(x)$ defined by (1) in the traditional intuitive approach cannot be effectively calculated, since it is not a partially recursive function. In fact, this function is computable only for finitely many words $x$ [2]. In [3] it was shown that $K(x)$ is not partially recursive for any universal algorithm. In [4] the definition (1) was called a heuristic basis for various approximations. In [5] the author writes that the non-constructive form of the definition (1) leads to some difficulties, so that many important relations hold only to within an error term measured by the logarithm of the complexity. To offer a constructive definition of randomness, it would be desirable to call an infinite sequence random if all initial segments (prefixes) in it are incompressible. However, it was proved [6] that such sequences do not exist. Kolmogorov proposed a definition of randomness (K-randomness), but he wrote that it was to be improved.

In this paper we reconsider the fundamental relations of the Kolmogorov complexity theory and develop a successively constructive formalism. The main idea is that, as far as we deal with algorithms, we must explicitly take into account the current time of their performance. Thus the static notion of a minimal description must be replaced by the process of minimization. Here we suggest a rigorous formalism in which it is possible to replace the somewhat obscure intuitive reasoning of the existing complexity theory by a formal investigation of strings of symbols. We present a survey of basic results of the Kolmogorov complexity theory in terms of processes of step-by-step performance of algorithms. We also introduce a new form of complexity based on a restriction to algorithms coding sequentially from left to right (S-algorithms). Constructive infinite binary sequences can be called stationary if the frequencies of all finite blocks of digits in them converge. We prove that a sequence is stationary iff it is the transformation of an incompressible (up to a logarithmic term) sequence by a sequential left-to-right encoding algorithm.

Let us define the objects of the investigation and fix notation. We study binary words $x$, which are finite chains of binary digits and, at the same time, binary numbers. These words are transformed by algorithmic procedures $A$, which can be represented by Turing algorithms (Turing machines) or, equivalently, by partially computable (partially recursive) functions. We also study infinite sequences $x^\infty$ of binary digits, which can be considered at the same time as infinite sequences of words $x$ of increasing length $n$, i.e., initial segments of $x^\infty$. In the constructive approach these sequences must be generated by some finite algorithms (generating functions). We write $A(x) = y$ if $A$ halts at some finite step and yields $y$. If $A(x)$ does not halt, the value $A(x)$ is denoted by a special sign for an undefined result. We will often need to perform algorithms step by step. Let $A_t(x)$ denote the result of the performance of $A(x)$ for $t$ steps: $A_t(x) = y$ if $A(x)$ halts at a step $t' \le t$ and yields $y$. The value $A_t(x)$ is undefined, and is denoted by the same sign, if $A(x)$ does not halt or halts only at a moment $t' > t$. Let $|x|$ denote the length of the binary word $x$.

2 Kolmogorov Complexity

According to Kolmogorov, the complexity of a binary word is the length of a minimal program generating this word. To make this definition completely constructive, we must first explicitly describe the minimization procedure. To minimize a partially computable function $f(x)$, we combine the search over $x$ with counting the number of steps of an algorithm that evaluates $f(x)$. Let us use the uniform increasing numeration $N = 1, 2, \dots$ of $n$-tuples of arguments; for example, let $N = 1, 2, 3, 4, 5, \dots$ represent the pairs $(1, 1), (1, 2), (2, 1), (2, 2), (1, 3), \dots$
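One numeration of pairs that reproduces the five pairs listed above goes through the pairs in order of increasing $\max(a, b)$. The sketch below is an assumption about how the numeration continues: the text fixes only its first five members, and the order chosen inside each later shell is one of several compatible possibilities. The names `pairs` and `pair_number` are illustrative.

```python
from itertools import count, islice

def pairs():
    """Yield all pairs (a, b) with a, b >= 1, shell by shell in order of max(a, b)."""
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
            yield (m, i)
        yield (m, m)

def pair_number(a, b):
    """Return the 1-based number N of the pair (a, b) in the numeration above."""
    m = max(a, b)
    before = (m - 1) ** 2  # number of pairs in all earlier shells
    if a == b:
        return before + 2 * m - 1
    i = min(a, b)
    return before + 2 * i - 1 if a < b else before + 2 * i

if __name__ == "__main__":
    print(list(islice(pairs(), 9)))
```

Every pair receives a finite number, so a search that follows this numeration eventually reaches any given argument and any given number of steps.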

Define the standard minimization process for $A(x)$ as follows:

$$\min_x A(x) = \{A(x, N),\ N = 1, 2, \dots\}$$

where $N = (x, t)$, $A(x, 0)$ is undefined, and $A(x, N) = \min(A(x, N - 1), A_t(x))$ for $N \ge 1$. In the minimization process the undefined sign can be treated as infinity. If $A(x)$ halts for a computable number of steps $t$, then the minimization process ends and $\min_x A(x)$ is a computable function. If no such $t$ exists, we say that the function $A(x)$ has no bottom.

Consider the universal Turing machine $U$: by definition, $U(A, p) = A(p)$ in the domain, where (and in the following) the same letter $A$ also denotes the text of the algorithm. Let $|A|$ denote the length of the text $A$.

Theorem 1. There exist computable functions such that the mass problem of the halting of their minimization process is algorithmically unsolvable.

Proof. Consider the indicator function: $\mathrm{ind}(x, t) = 0$ if $U_t(x)$ with $x = (A, p)$ halts exactly at the step $t$, so that $U_t(x) = A(p)$; otherwise $\mathrm{ind}(x, t) = 1$. Denote

$$\varphi(x, t) = \prod_{\tau \le t} \mathrm{ind}(x, \tau)$$

The minimization process $\varphi(x, 1), \varphi(x, 2), \dots$ is finite iff $U(x)$ halts. But the halting problem for the universal Turing machine $U$ is algorithmically unsolvable.
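The standard minimization process defined before Theorem 1 can be sketched as a running minimum over the numbered pairs $N = (x, t)$. This is an illustration under stated assumptions: a step-bounded algorithm is modelled as a Python function `a_t(x, t)` that returns `None` when $A$ has not halted within $t$ steps, the undefined value is treated as infinity, `toy_algorithm` is a made-up example, and the pair numeration is one possible choice.

```python
import math
from itertools import count, islice

def pairs():
    """Yield pairs (x, t), x, t >= 1, in order of increasing max(x, t)."""
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
            yield (m, i)
        yield (m, m)

def toy_algorithm(x, t):
    """Toy A_t(x): never halts on even x; on odd x halts after x steps with (x - 3)**2 + 2."""
    if x % 2 == 0 or t < x:
        return None
    return (x - 3) ** 2 + 2

def minimization_process(a_t, n_max):
    """Return [A(x, 1), ..., A(x, n_max)], the running minima over the pairs N = (x, t)."""
    best = math.inf  # the undefined sign is treated as infinity
    process = []
    for x, t in islice(pairs(), n_max):
        value = a_t(x, t)
        if value is not None:
            best = min(best, value)
        process.append(best)
    return process

if __name__ == "__main__":
    print(minimization_process(toy_algorithm, 9))
```

The process never increases. For the toy function it reaches the value 2 at $x = 3$ once the step budget $t = 3$ has been tried, even though the function does not halt on even arguments.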

Now we can define the complexity as follows.

Definition 1. Given a binary word $x$ and an algorithm (partially computable function) $A$, the complexity of $x$ with respect to $A$ is $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$, where

$$K(x, A, N) = \min_{(p, t) \le N,\ A(p) = x} |p|$$

In this definition $A(p)$ is called a generating algorithm and $p$ is called a program or a code for $x$.

So the complexity is defined as a process, not as a function. If $A(x)$ halts for some $x$, then the sequence $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$ converges to a constant for some computable $N = N_0$, and we can say that the complexity function $K(x)$ is defined. Otherwise no such constructive function exists.
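The complexity process can be illustrated with a toy generating algorithm. Everything concrete in the sketch below is an assumption made for the example: the decoder `toy_decoder` (a leading `0` copies the rest of the program, a leading `1` reads the rest as a binary number $n$ and outputs $n$ zeros), its step count, the numbering `word` of binary programs, and the numeration of pairs $(p, t)$.

```python
import math
from itertools import count, islice

def pairs():
    """Yield pairs (k, t), k, t >= 1, in order of increasing max(k, t)."""
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
            yield (m, i)
        yield (m, m)

def word(k):
    """k-th binary word: 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ..."""
    return bin(k)[3:]

def toy_decoder(p, t):
    """Toy A_t(p): returns the output, or None if A has not halted within t steps."""
    if not p:
        return None  # the empty program never halts in this toy model
    out = p[1:] if p[0] == "0" else "0" * int(p[1:] or "0", 2)
    return out if len(out) + 1 <= t else None  # assumed cost: one step per output bit, plus one

def complexity(x, decoder, n):
    """K(x, A, N): the shortest |p| among the first N pairs (p, t) with A_t(p) = x."""
    best = math.inf
    for k, t in islice(pairs(), n):
        p = word(k)
        if len(p) < best and decoder(p, t) == x:
            best = len(p)
    return best

if __name__ == "__main__":
    x = "0" * 8
    for n in (100, 3000, 3100, 4000):
        print(n, complexity(x, toy_decoder, n))
```

For the word of eight zeros the process stays at infinity until the program `11000` is reached with enough steps, and then settles at the value 5.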

To compare minimization processes we need a special technique.

Definition 2. Given two minimization processes

$$\min_x A(x) = \{A(x, N),\ N = 1, 2, \dots\}, \qquad \min_x B(x) = \{B(x, M),\ M = 1, 2, \dots\}$$

we write $A(x) \lesssim B(x)$ if for each $M$ there exists an $N_0$ such that for all $N > N_0$ the inequality $A(x, N) \le B(x, M)$ holds.

If both processes halt, we can write simply $A(x) \le B(x)$. If $A(x) \lesssim B(x)$ and $A(x) \gtrsim B(x)$, we say that the strong equivalence holds and write $A(x) \sim B(x)$. Define also a weak equivalence $A(x) \approx B(x)$ if $A(x) \lesssim B(x) + c$ along with $B(x) \lesssim A(x) + c$.

The algorithmic theory of complexity started with the discovery of universal descriptions and universal complexity. This basic discovery was made simultaneously and independently by Kolmogorov and R. Solomonoff in 1960-1964 (see [7]).

This theory is developed to study minimal descriptions of arbitrarily long words $x$ with finite algorithms, which means that $|A| \le c$. All basic results are obtained with accuracy up to constants $c$, which are supposed to be independent of $x$.

Definition 3. The complexity of the word $x$ with respect to an algorithm $A$ is the process $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$, where

$$K(x, A, N) = \min_{(p, t) \le N,\ A_t(p) = x} |p|$$

We use two methods of the complexity theory: upper estimates of the complexity are derived by the construction of explicit generating procedures; lower estimates are obtained by counting the variety of words and their programs.

Theorem 2. For any algorithm $A$ we have

$$K(x, U) \lesssim K(x, A) + c_A$$

where $c_A$ depends only on $A$ but not on $x$.

Proof. Count the steps of $A(x)$ by the steps of the universal Turing machine performing $A$. For each $N$ we can find a number $M$ such that

$$K(x, U, N) = \min_{(z, t) \le N,\ U_t(z) = x} |z| \le \min_{B:\ |B| \le c}\ \min_{(p, t) \le N,\ U_t(B, p) = x} |(B, p)| \le \min_{(p, t) \le N,\ U_t(A, p) = x} (c_A + |p|) \le c_A + \min_{(p, t) \le M,\ A_t(p) = x} |p| = c_A + K(x, A, M)$$

where $c_A$ is a constant depending only on $A$. This completes the proof.

This statement is called the Invariance Theorem. Its significance is that it introduces a universal measure of complexity, which is calculated by trying different algorithms with different input words. Let us fix a particular universal Turing machine $U$ as a reference machine and set $K(x) = K(x, U)$.


Let us call the difference $|x| - K(x)$ the number of regularities.

Remark 1. Given $n = |x|$, the fraction of words $x$ with the number of regularities more than $m$ is no more than $2^{-m}$.

This follows from the fact that there are only $2^{n-m}$ programs $p$ of length $n - m$. So almost all words are incompressible up to a slowly increasing function of $n$.
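The counting argument behind Remark 1 can be checked with a few lines of arithmetic. The sketch assumes that a word with more than $m$ regularities needs a program strictly shorter than $n - m$, and that each program generates at most one word; the function names are illustrative.

```python
def programs_shorter_than(length):
    """Number of binary words p with |p| < length, the empty word included."""
    return 2 ** length - 1 if length > 0 else 0

def max_compressible_fraction(n, m):
    """Upper bound on the fraction of n-bit words x with |x| - K(x) > m."""
    # Each program yields at most one word, so at most this many words can qualify.
    return programs_shorter_than(n - m) / 2 ** n

if __name__ == "__main__":
    n = 20
    for m in (1, 5, 10):
        print(m, max_compressible_fraction(n, m), 2 ** -m)
```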

Remark 2. $K(x) \lesssim |x| + c$. This is obvious, since we can use the identity algorithm $A(x) = x$ as a generating algorithm.

Note that the minimization process in Theorem 2 can be made more efficient if we restrict $p$ by $|p| \le |x| + c$.

The complexity of finite words depends strongly on the additive constant $c$. Therefore the main object of study will be the complexity of words $x$ of arbitrarily great length $n$.

Theorem 3. If $f(x)$ is a partially computable function, then $K(f(x)) \lesssim K(x) + c$.

Proof. Suppose the algorithm evaluating $f(x)$ halts. Given an arbitrary algorithm $A$, we construct the composition $B = fA$. By Definition 3 and Theorem 2, for each $N$ we can find $M$ and a constant $c$ independent of $x$ such that

$$K(f(x), U, N) = \min_{(z, t) \le N,\ U_t(z) = f(x)} |z| \le \min_{B:\ |B| \le c}\ \min_{(p, t) \le M,\ B_t(p) = f(x)} |p| + c \le \min_{(p, t) \le M,\ f(A_t(p)) = f(x)} |p| + c \le \min_{(p, t) \le M,\ A_t(p) = x} |p| + c = K(x, A) + c \lesssim K(x) + c$$

The theorem is proved.

Example. Let $x = 0^n$ ($n$ zeros). Then $K(x) \lesssim K(n) + c \lesssim \log n + c$. If $n = 1^m$, then $K(x) \lesssim \log \log n + c$. Clearly, $K(x)$ is not monotone in $n$.

By definition it is impossible to present a conceivable example of a high-complexity word

To separate a number $n$ in a chain, we define a special self-delimiting code for an integer $n$ as follows: $\bar{n} = 0^m 1 n$, where $m = \log n$, with the length $|\bar{n}| = 2 \log n + 1$; or a more refined code $\bar{n} = 0^{\log m} 1 m n$ of length $|\bar{n}| \le \log n + 2 + 2 \log \log n$. Here (and in the following) $\log x$ for $x > 0$ denotes a function equal to the integer nearest from above to the standard logarithmic function $\log x$, and only positive arguments of $\log x$ are considered (if $x \le 0$, then the expressions containing $\log x$ are supposed to equal 0).

Note that the set of codes $\bar{n}$ is a prefix-free set. More sparing self-delimiting codes can be obtained by further iterations. Denote their length by $\log^* n = \log n + \log \log n + \log \log \log n + \dots$ (the iterated logarithm).
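The first self-delimiting code, $0^m 1 n$, can be sketched as follows. One assumption is made explicit here: $m$ is taken to be the number of binary digits of $n$, which can differ by one from the rounded logarithm used in the text (for example at $n = 1$ and at powers of two). The names `encode` and `decode` are illustrative.

```python
def encode(n):
    """Self-delimiting code 0^m 1 b, where b is the binary form of n and m = len(b)."""
    if n < 1:
        raise ValueError("only positive integers are encoded")
    b = bin(n)[2:]
    return "0" * len(b) + "1" + b

def decode(stream):
    """Read one code word from the front of `stream`; return (n, rest of the stream)."""
    m = stream.index("1")             # the run of zeros gives the length of the payload
    body = stream[m + 1 : 2 * m + 1]  # the next m digits are n in binary
    if len(body) < m:
        raise ValueError("truncated code word")
    return int(body, 2), stream[2 * m + 1 :]

if __name__ == "__main__":
    chain = encode(5) + encode(12)
    first, rest = decode(chain)
    second, rest = decode(rest)
    print(chain, first, second)
```

Because no code word is a prefix of another, a chain of code words can be cut into its members without any separators.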

Theorem 4. $K(x, y) \lesssim K(x) + K(y) + 2 \log ||z|| + 1$.

Proof. It suffices to use programs for $(x, y)$ of the form $p = 0^m 1 p_1 p_2$, where $m = \log p_1$, $A(p_1) = x$, $B(p_2) = y$, and $0^m$ serves to separate $p_1$ from $p_2$.

3 Incompressibility

Now we consider algorithmically generated infinite sequences of digits $x^\infty$, which are treated as sequences of words $x$, $|x| = n = 1, 2, \dots$

We cite (in a simplified form) two theorems by Martin-Löf [6].

Theorem 5. Any constructive $x^\infty$ contains infinitely many words $x$ of length $n$ with $K(x) \lesssim n - \log n + c$.

Theorem 6. For almost all sequences $x^\infty$, for any $\varepsilon > 0$, for all words $x$ of length $n > n_0$ with some computable $n_0$, we have $K(x) > n - (1 + \varepsilon) \log n$.

Thus the complexity of a typical constructive binary sequence fluctuates between the lower bound $n - (1 + \varepsilon) \log n$ and $n$.

The idea of defining randomness as algorithmic incompressibility was put forward by Kolmogorov [2] and G. J. Chaitin [8]. There exist no sequences in which all words are $c$-incompressible.

Definition 4 (Kolmogorov). An infinite binary sequence is called K-random if it contains infinitely many words $x$ with $K(x) \gtrsim |x| - c$.

Remark 3. Almost all sequences $x^\infty$ are K-random.

This follows from the fact that there is only a portion $2^{-c}$ of words $x$ for which $K(x) \lesssim |x| - c$.

Definition 5. An infinite binary sequence $x^\infty = \{x\}$ is called L-random if for some $c$ we have $K(x) \gtrsim n - c \log n$ for all words $x$, $n = |x|$.

Theorem 6 states that almost all binary sequences are L-random.

Stepping aside from the incompressibility idea, Martin-Löf [6] suggested another notion of randomness, based on the idea of universal tests. The Martin-Löf randomness (ML-randomness) follows from the Kolmogorov randomness. If $x^\infty$ is Martin-Löf random, then for any $\varepsilon > 0$ we have $K(x) \gtrsim n - (1 + \varepsilon) \log n$ from some $n$ onwards.

These properties suggest three notions of randomness, each implying the next: K $\to$ ML $\to$ L.

Now let us restrict classes of algorithms


4 Reversible Complexity

Let us restrict ourselves to reversible algorithms.

Definition 6. An algorithm $A(p)$ is called reversible (an R-algorithm) if one can find another algorithm $B = A^{-1}$ such that $A(p) = x$ implies $B(x) = p$ and vice versa.

These algorithms establish a one-to-one correspondence between inputs and outputs. We can say that $B(x)$ is an encoding algorithm and $A(p)$ is a decoding algorithm.

Definition 7. The R-complexity of a word $x$ is defined as the process $K_R(x) = \{K_R(x, N),\ N = 1, 2, \dots\}$, where

$$K_R(x, N) = \min_{A:\ |A| \le c}\ \min_{(p, t) \le N,\ U_t(A, p) = x} |p|$$

where $A$ are R-algorithms and the minimization process is shortened by discovering the first root of the equation $A(p) = x$.

Since the class of R-algorithms includes the identity algorithm, we have $K_R(x) \le |x| + c$.

Definition 8. A function (an algorithm) $A(x)$ is called unidomain if there are no pairs $x_1 \ne x_2$ such that $A(x_1) = A(x_2)$.

Proposition 1. A function $A(x)$ is unidomain iff it is reversible.

Proof. First let $A$ be unidomain. Using $A$, let us construct an algorithm $B(y)$ as follows:

for (p, t) = 1, 2, ... do
    if A_t(p) = y then B(y) = p; halt
endfor

If $A(x) = y$, then this algorithm provides the first root of this equation and halts. If $A(x)$ is undefined, then $B(y)$ is undefined as well. Conversely, if $A$ is a reversible algorithm, then there exists an algorithm $B(y)$ such that $A(x) = y$ implies $B(y) = x$, and the argument of $A$ is recovered uniquely.
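The construction of $B(y)$ in the proof can be tried out on a toy unidomain algorithm. The sketch departs from the proof in one respect, stated plainly: the search is cut off after a finite `budget` of pairs and then returns `None`, whereas the algorithm of the proof simply never halts when no root exists. The algorithm `doubling`, its step count, the numbering `word` of programs, and the pair numeration are assumptions made for the example.

```python
from itertools import count, islice

def pairs():
    """Yield pairs (k, t), k, t >= 1, in order of increasing max(k, t)."""
    for m in count(1):
        for i in range(1, m):
            yield (i, m)
            yield (m, i)
        yield (m, m)

def word(k):
    """k-th binary word: 1 -> '', 2 -> '0', 3 -> '1', 4 -> '00', ..."""
    return bin(k)[3:]

def doubling(p, t):
    """Toy unidomain A_t(p): doubles every bit of p, taking |p| + 1 steps."""
    return "".join(2 * bit for bit in p) if len(p) + 1 <= t else None

def invert(a_t, y, budget=100_000):
    """Bounded version of B(y): the first p found with A_t(p) = y, or None if the budget runs out."""
    for k, t in islice(pairs(), budget):
        p = word(k)
        if a_t(p, t) == y:
            return p
    return None

if __name__ == "__main__":
    print(invert(doubling, "0011"))        # a root exists
    print(invert(doubling, "01", 2_000))   # no root: the bounded search gives up
```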

Theorem 7. There exists no algorithm $W$ such that for any algorithm $A$ we have $W(A) = 1$ if $A$ is a reversible algorithm and $W(A) = 0$ if not.

Proof. To prove this assertion, it suffices to prove it for some special class of $A$. Let $N$ be a nullifying algorithm such that $N(x) = 0$ for any $x$, and let $B$ be an arbitrary algorithm. Choose $A$ so that $A(0) = 0$, $A(1) = N(B(1))$ and $A(n) = n$ for $n > 1$. This algorithm is not unidomain iff $B(1)$ halts. However, the mass problem of algorithm halting is algorithmically unsolvable. This proves the theorem.


Theorem 8. The complexity $K_R(x) \approx K(x)$.

Proof. The relation $K(x) \lesssim K_R(x) + c$ follows from the definitions. Let us prove the converse relation. Let $K(x)$ be given by a sequence of functions

$$K(x, N) = \min_{A:\ |A| \le c}\ \min_{(A, p, t) \le N,\ A_t(p) = x} |p|$$

where $A$ are arbitrary algorithms. Given $A$, the minimization here is carried out over all roots of the equation $A_t(p) = x$. We replace the evaluation of all roots for a single algorithm $A_t$ by evaluating the roots of a number of equations. Let us numerate the roots of the equation $A(p) = x$ in the process $(p, t) = 1, 2, \dots$ Construct the algorithm $B(\nu, p)$ as follows:

k = 0
for (q, τ) = 1, 2, ... do
    if A_τ(q) = x then k = k + 1
    if k = ν and p = q then B = x; halt
endfor

The function $B(\nu, p) = x$ iff $p$ is the root number $\nu$; otherwise $B(\nu, p)$ is undefined. By construction, for fixed $\nu$ the function $B(\nu, p)$ is unidomain. The theorem statement follows.

Knowing the complexity of a word $x$, we can constructively evaluate its minimal codes. Minimizing descriptions of physical events $x$ can be considered as a process of cognition of $x$ by a search for the regularities producing the phenomenon $x$. It is known that all elementary physical processes are time-reversible. The reversible generating algorithms, generally speaking, can be less efficient in producing long words. The equivalence $K(x) \approx K_R(x)$ stated by Theorem 8 can be interpreted as the absence of phenomena that can be produced but not cognized within the frames of the algorithmic theory.

5 Complexity and Information

Kolmogorov discovered [2] [9] that information theory can be developed from the algorithmic definition of complexity

The conditional complexity of a binary word $x$ with respect to the word $y$ is defined as the minimal length of a program that generates $x$ from $y$:

$$K(x|y, A) = \min_{(p, t):\ A_t(p, y) = x} |p|$$

Theorem 9. There exists an optimal algorithm $V$ such that for any algorithm $A$ we have

$$K(x|y) \stackrel{\mathrm{def}}{=} K(x|y, V) \lesssim K(x|y, A) + c$$

Example. We have $K(0^n | n) \lesssim c$, where the constant $c$ is the length of the algorithm generating $0^n$ from $n$.

We show the connection between the notion of complexity and optimal coding in the Shannon information theory. Suppose the words $x$ of length $n$ are partitioned from left to right into sequences of $k$ blocks $b_s$ of binary digits of identical length $l$, with $m = 2^l$ different blocks in total and $n = kl$. Denote by $f_s$ the empirical frequency of the occurrence of $b_s$ in $x$. The Shannon entropy per block is defined as

$$H(f) = -\sum_s f_s \log f_s$$
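The empirical entropy per block can be computed directly from a word. The sketch assumes logarithms to base 2, so that the entropy is measured in bits, and it skips blocks that never occur, following the usual convention that they contribute zero. The function name is illustrative.

```python
import math
from collections import Counter

def block_entropy(x, l):
    """Empirical Shannon entropy per block, in bits, of the word x cut into blocks of length l."""
    if l < 1 or len(x) % l != 0:
        raise ValueError("len(x) must be a multiple of the block length l >= 1")
    k = len(x) // l
    counts = Counter(x[i * l:(i + 1) * l] for i in range(k))
    # The frequency f_s of block s is its count divided by the number of blocks k.
    return -sum((c / k) * math.log2(c / k) for c in counts.values())

if __name__ == "__main__":
    for x in ("0101", "0011", "00011011"):
        print(x, block_entropy(x, 2))
```

A word made of one repeated block has zero entropy, while a word in which all four blocks of length 2 occur equally often reaches the maximum of 2 bits per block.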

Theorem 10. Let a word $x$ be partitioned into $k$ blocks of length $l$. Then $k^{-1} K(x) \lesssim H(f) + c \log k / k$, where $c$ depends on $l$ but not on $x$.

Proof. Use a special code not depending on the source of information (a universal code). To specify $x$, we can fix the numbers $k_s = k f_s$ of occurrences of each block $b_s$ for all blocks $s$ of length $l$, and the number of $x$ among the

$$\frac{k!}{k_1!\, k_2! \cdots k_m!}$$

words with these occurrence numbers, where $m = 2^l$ and $k_1 + \dots + k_m = k$. Applying the Stirling formula, we find that the length of this code is no more than $m \log k + k H(f) + c \log k$. The theorem statement follows.

Thus $K(x)$ can be considered as the entropy and $K(y \mid x)$ as the conditional entropy. The information in $x$ about $y$ is $I(x : y) = K(y) - K(y \mid x)$.

Remark 4 For arbitrary words $x$ and $y$,

$$K(y \mid x) \le K(y) + c \quad \text{and} \quad K(x, y) = K(x) + K(y \mid x) + c \log |x|.$$

Indeed, consider a special code for $(x, y)$ of the form $p_1 p_2$, where $p_1$ is a self-delimiting code for $x$ and $p_2$ is a code for $y$. We have

$$K(x, y) \le \min_{A, B:\ |A| \le c,\ |B| \le c}\ \min_{(p_1, p_2, t):\ A_t(p_1) = x,\ B_t(p_2) = y} \left( |p_1| + |p_2| \right).$$

This is the required statement. Note that the measure of information $I(x : y)$ is non-negative only asymptotically, for long $x$ and $y$. The logarithmic correction term can be prescribed to the individual description of $x$, in contrast to the traditional description in terms of distributions.

6 Frequency Rates

The stability of frequency rates, which is assumed a priori in the conventional concept of probability, can be deduced in the algorithmic theory.

Denote the empirical rate of occurrences of 1 in $x$ by $f(x, 1)$. The stability of frequency rates can be stated as follows.

Theorem 11 Given an L-random sequence $x^\infty$, there exists $c > 0$ such that for each word $x$ of length $n$ in it

$$\left| f(x, 1) - \tfrac{1}{2} \right|^2 \le c\, \frac{\log n}{n},$$

where $c$ does not depend on $n$.

Proof Use a special code $p$ for $x$ as follows. Let $k = n f(x, 1)$ and $p = (k, j)$, where $j = 1, \ldots, C$ enumerates all words $x$ of length $n$ with $k$ units. Use prefix codes for $(k, j)$ of the form $\bar{k} j$ with $|\bar{k}| \le 2 \log n$. Thus

$$K(x) \le |(k, j)| \le 2 \log n + \log C.$$

Using the Stirling formula, we find that $\log C \le n H(k/n) + c \log n$, where the entropy $H(f) = -f \log f - (1 - f) \log(1 - f)$, $f = k/n$. It satisfies the inequality $H(f) \le 1 - 2 (f - 1/2)^2$. Combining these formulas, we obtain the desired result.
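The two estimates used in this proof can be checked numerically for small $n$. The following is a minimal sketch, assuming logarithms to base 2 (the text does not state the base); the function names are hypothetical and only serve the illustration.

```python
import math


def binary_entropy(f):
    """H(f) = -f log2 f - (1 - f) log2(1 - f), with H(0) = H(1) = 0."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)


def log2_binomial(n, k):
    """log2 of C(n, k), the number of words of length n with k units."""
    return math.log2(math.comb(n, k))


def check_bounds(n):
    """Return True if both inequalities hold for every k = 0, ..., n."""
    for k in range(n + 1):
        f = k / n
        h = binary_entropy(f)
        # log C <= n H(k/n); the extra c log n slack is not even needed here
        if log2_binomial(n, k) > n * h + 1e-9:
            return False
        # H(f) <= 1 - 2 (f - 1/2)^2
        if h > 1 - 2 * (f - 0.5) ** 2 + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check_bounds(200))
```

The small tolerances only guard against floating-point rounding; they are not part of the inequalities themselves.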

Remark 5 If $|f(x, 1) - 1/2|^2 > c/n$ then $K(x) \le n - \tfrac{1}{2} \log n + c$. This inequality shows the effect of a regularity when the number of units is too close to $n/2$.

The refinement is natural. We consider a partition of $x^\infty = \{x\}$ into blocks of digits $b$ of identical length $|b| = l$. Denote by $f(x, b)$ the relative number of blocks $b = b_i$ in the partition of a word $x$ of length $n = kl$. Denote $\pi = 2^{-l}$.

Theorem 12 Given an L-random sequence $x^\infty = \{x\}$ and a block of digits $b$ of length $l$, for all words $x$ of length $n$ we have

$$\left| f(x, b) - 2^{-l} \right|^2 \le c(b)\, \frac{\log n}{n}.$$

A number of other specifically probabilistic laws, deduced previously by intuitive reasoning, can be proved similarly.

7 Prefix Complexity

In 1974-1975 another approach to complexity was developed, starting from the concept of prefix complexity (by L. A. Levin, P. Gacs and G. J. Chaitin [10-12]).

Definition 9 A set of words is called prefix-free if there are no pairs of different words such that one is the beginning of the other.

Lemma 1 (1) If $\{p_i\}$ is a prefix-free set and $n_i = |p_i|$, $i = 1, 2, \ldots$, then the Kraft inequality

$$\sum_{i = 1, 2, \ldots} 2^{-n_i} \le 1$$

holds.

(2) If numbers $n_1, n_2, \ldots$ satisfy the Kraft inequality, then one can find binary words $p_1, p_2, \ldots$ of lengths $n_1, n_2, \ldots$ such that the set $\{p_i\}$ is prefix-free.

These words can be constructed by the well-known Fano-Shannon procedure.
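Part (2) of the lemma can be illustrated for a finite list of lengths. The sketch below uses a canonical construction (sort the lengths, then take the binary expansion of the running sum of $2^{-n_i}$); this is one standard way to realize such a code and is not necessarily the exact procedure the author has in mind. All names are hypothetical, and lengths are assumed to be positive integers.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Exact value of the sum of 2^(-n_i) over the given lengths."""
    return sum(Fraction(1, 2 ** n) for n in lengths)


def prefix_code(lengths):
    """Return binary words with the given lengths forming a prefix-free set."""
    if kraft_sum(lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    # Process shorter words first so the running sum is a multiple of 2^(-n).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    words = [None] * len(lengths)
    acc = Fraction(0)
    for i in order:
        n = lengths[i]
        # The first n binary digits of acc form the code word.
        words[i] = format(int(acc * 2 ** n), "0{}b".format(n))
        acc += Fraction(1, 2 ** n)
    return words


def is_prefix_free(words):
    """True if no word is the beginning of a different word in the list."""
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


if __name__ == "__main__":
    code = prefix_code([1, 2, 3, 3])
    print(code)  # ['0', '10', '110', '111']
    print(is_prefix_free(code))
```

Exact rational arithmetic is used so that the Kraft sum is compared with 1 without rounding error.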

Definition 10 An algorithm is called a prefix algorithm if its domain is a prefix-free set. The prefix complexity of a word $x$ with respect to a prefix algorithm $A$ is defined as the process $K_P(x, A) = K_P(x, A, N)$, $N = 1, 2, \ldots$, where

$$K_P(x, A, N) = \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

The set of prefix algorithms is an enumerable set.

Theorem 13 There exists a universal prefix algorithm $V$ such that for any prefix algorithm $A$ we have

$$K_P(x) \stackrel{\mathrm{def}}{=} K_P(x, V) \le K_P(x, A) + c_A.$$

To deal with prefix algorithms, we notice that we can recover the word $x = 0^n$ ($n$ zeros) from $n$, but we cannot encode numbers $n$ as simple integers since they are not prefix-free. Using self-delimiting codes, we obtain prefix-free codes of length $n + \log n$.

Remark 6 $K(x) \le K_P(x) \le K(x) + \log(z)$.

Remark 7 $K_P(x, y) \le K_P(x) + K_P(y) + c$. In contrast to $K(x)$, here we do not need an end marker for the word $x$, since $x$ is recognized as a prefix.

Theorem 14 [12] For any fixed length $n$ of words $x$ we have $\max_x K_P(x) \ge n + \log n - c$.

Theorem 15 [13] An infinite sequence $x^\infty$ is Martin-Löf random iff $K_P(x) \ge |x| - c$ for all words $x$.

For most $x^\infty$ we have $K_P(x) \ge |x| - c$ for all $x$. Thus the prefix complexity of almost all sequences fluctuates within the bounds $|x|$ and $|x| + \log |x|$ (with accuracy up to $c$).

8 Universal Probability

The idea of a universal a priori probability was put forward by Solomonoff in [4]. For a binary word $x$ he introduced the probability $P(x) = 2^{-|p(x)|}$, where $p(x)$ is a minimal description of $x$. However,

$$\sum_x 2^{-|p(x)|} = \infty.$$

To obtain normalizable algorithmic probabilities, the Kraft inequality for a prefix-free set was proposed, and this led to the development of the theory of prefix complexity [10-12]. Let us reformulate its basic results in a successively constructive form.

Definition 11 The algorithmic probability of $x$ is defined by the process

$$P(x) = 2^{-K_P(x, N)}, \quad N = 1, 2, \ldots$$

Example If $x = 0^n$ then $K_P(x) \le \log n + 2 \log \log n + c$. Hence $P(x) \ge c / (n \log^2 n)$.

Definition 12 The universal a priori probability is defined by $Q(x) = Q(x, U, N)$, $N = (p, t) = 1, 2, \ldots$, where $U$ is the universal prefix algorithm and

$$Q(x, U, N) = Q(x, U, N - 1) + \mathrm{ind}(U_t(p) = x)\, 2^{-|p|},$$

where the indicator function equals 1 iff $U_t(p)$ halts exactly at step number $t$, otherwise 0.

Since the mass problem of the universal machine halting is algorithmically unsolvable, the sequence $Q(x)$ has no ceiling.

The following Coding Theorem shows that these two formulations define processes differing by no more than a constant.

Theorem 16 For each $x$ we have $K_P(x) \approx -\log Q(x)$.

In [14] a non-constructive infinite binary fraction was considered:

$$\Omega = \sum_x Q(x) \le 1.$$

The real number $\Omega$ was called the universal algorithm halting probability. It can be interpreted as a process $\Omega(N)$, $N = 1, 2, \ldots$, with

$$\Omega(N) = \sum_{(x, p, t) \le N} \mathrm{ind}(U_t(p) = x)\, 2^{-|p|},$$

where the indicator function equals 1 iff $U_t(p)$ halts exactly at the moment $t$ yielding $x$, otherwise 0.

The monotone increasing sequence $\Omega(N)$ is bounded from above and has no ceiling. Knowing the first signs of $\Omega(N)$, $N = 1, 2, \ldots$, we can accumulate in $\Omega$ solutions of all constructive problems of bounded complexity. C. Bennett and M. Gardner would call $\Omega$ the number of Wisdom [15].

9 Sequentially Coding Algorithms

We suggest the following extension of the complexity theory, obtained by restricting attention to algorithms that code sequentially from left to right.

A set $P$ of code words is called code-complete if any half-infinite sequence can be represented as a concatenation of codes from $P$.

Definition 13 A one-to-one constructive function $T: X \leftrightarrow Y$ is called a coding table if it is defined on code-complete prefix-free sets $X$ and $Y$.

Definition 14 An algorithm $A$ evaluating a coding table $T: X \leftrightarrow Y$ is called a sequential coder, or an S-algorithm, if

(1) for any concatenation $x = x_1 x_2 \ldots x_k$ of words $x_i$ from $X$ we have $A(x) = A(x_1) A(x_2) \ldots A(x_k)$;

(2) for any concatenation $y = A(x_1) A(x_2) \ldots A(x_k)$ we also have $A(x_1 x_2 \ldots x_k) = y$.

The set of S-algorithms is recursively enumerable.

Definition 15 The S-complexity of a word $x$ with respect to an S-algorithm $A$ is a process $K_S(x, A) = K_S(x, A, N)$, $N = 1, 2, \ldots$, where

$$K_S(x, A, N) \stackrel{\mathrm{def}}{=} \min_{(p,t) \le N:\ A_t(p) = x} |p|.$$

Theorem 17 There exists a (universal) S-algorithm $V$ such that for any S-algorithm $A$ we have

$$K_S(x) = K_S(x, V) \le K_S(x, A) + c_A,$$

where $c_A$ does not depend on $x$.

Since the class of S-algorithms contains the identity algorithm (with $A(0) = 0$, $A(1) = 1$), we have $K_S(x) \le |x| + c$. If $f(x)$ is a partially computable function evaluated by some S-algorithm, then $K_S(f(x)) \le K_S(x) + c$.

Obviously, $K(x) \le K_S(x) \le K_P(x)$. But we only have $K_S(x, y) \le K_P(x) + K_S(y)$, since the sequentially coding algorithm can separate the leftmost prefix from the remaining ones.

For words $x = 0^n$ we have $K_S(x) \le \log n$. For almost all sequences $x^\infty$, for all sufficiently long words $x$ in it and for any $c > 1$, we have $K_S(x) \ge K(x) \ge |x| - c \log |x|$.

Definition 16 A binary sequence is called S-random if for all words $x$ in it $K_S(x) \ge |x| - c \log |x|$, where $c$ does not depend on $x$.

Definition 17 A binary sequence $x^\infty = \{x\}$ is algorithmically stationary if for any block $b$ of digits in it there exists the limit $\lim_{|x| \to \infty} f(b, x)$.

Any L-random sequence is algorithmically stationary.

Lemma 2 If a binary sequence $y^\infty = \{y\}$ is produced from an algorithmically stationary sequence $x^\infty = \{x\}$ by an S-algorithm $A$, so that $y = A(x)$, then the sequence $y^\infty$ is also algorithmically stationary.

Proof Suppose $y^\infty$ is produced from $x^\infty$ by $y = A(x)$, where $A$ is an S-algorithm. The algorithm $A$ defines a prefix-free domain $X$ and a code-complete range of values $Y$. Choose a block of digits $b$. Using the completeness of $Y$, we have $b = y_1 y_2 \ldots y_k$, where $y_i \in Y$, $i = 1, 2, \ldots, k$. By the sequential property, we can find a program $a = x_1 x_2 \ldots x_k$ with all $x_i \in X$ such that $A(a) = b$. The frequencies $f(a, x) = f(b, y)$. This proves the lemma.

Lemma 3 $K_S(K_S(x)) \le K_S(x) + c$.

Proof Note that S-algorithms are such that the composition $AB$ of two S-algorithms $A$ and $B$ is again an S-algorithm. For a fixed $N$ we find

$$K_S(x, N) = \min_{A:\ |A| \le c}\ \min_{(p,t) \le N:\ A_t(p) = x} |p|,$$

and for the minimizing value $p = p_0$

$$K_S(p_0, M) = \min_{B:\ |B| \le c}\ \min_{(y,t) \le M:\ B_t(y) = p_0} |y|.$$

Let $y = y_0$ be the minimizing value of a code for $p_0$. Since for some $t$ we have $A_t B_t(y) = x$ (if both algorithms halt), it is clear that $K_S(x) \le |y| + c$. We obtain $K(x) \le K_S(p) \approx K_S(K_S(x))$.

Theorem 18 An infinite binary sequence $x^\infty$ is algorithmically stationary iff it is an S-algorithm transformation of some S-random sequence.

Proof First assume that $y = A(x)$ for all $x \in x^\infty$ and $K_S(x) \ge |x| - c \log |x|$. We have $K(x) \ge K_S(x) - \log |x|$. So $K(x) \ge |x| - c' \log |x|$, $c' \ge c + 1$. By Theorem 12 the sequence $x^\infty$ is stationary.

To prove the converse, assume that $x^\infty = \{x\}$ is stationary. We find $\min K_S(x, N)$ for $(p, t) \le N$; let $p$ be a minimum code for $x$, so that $A_t(p) = x$ for some $t$ if $A_t(p)$ halts. Here $A: P \to X$ has the domain $P$ and the range $X$, both prefix-free and code-complete. Since $X$ is code-complete, we can express $x$ as $x_1 x_2 \ldots x_k$ with $x_i \in X$ and $A(p_i) = x_i$ with $p_i \in P$, $i = 1, \ldots, k$. By Lemma 3 we have $K_S(p) \ge |p| - c$. It follows that $p = p_1 p_2 \ldots p_k$ is log-incompressible. The proof is complete.

The comparison of different notions of complexity and randomness shows that they differ by no more than a logarithmic term. Taking the stationarity theorems into account, it seems plausible to suggest a common definition of randomness of infinite sequences $x^\infty = \{x\}$ as incompressibility up to the term $c \log |x|$, where $c$ does not depend on $x$.

In conclusion, it is my pleasure to express my sincere gratitude to Prof. V. M. Maximov for encouraging discussions.

References

1. A. N. Kolmogorov, Grundlagen der Wahrscheinlichkeitsrechnung (Springer Verlag, 1933; in English: Chelsea, New York, 1956).
2. A. N. Kolmogorov, Problems of Information Transfer 1, 1, 1-7 (1965).
3. L. Longren, Computer and Information Sciences 2, 165-175 (1967).
4. R. J. Solomonoff, Progress of Symposia in Applied Math., AMS 43 (1962); IEEE Trans. on Inform. Theory 4, 5, 662-664 (1968).
5. Li Ming, P. Vitanyi, An Introduction to Kolmogorov Complexity (Springer, Berlin-Heidelberg-New York, 1993).
6. P. Martin-Löf, Information and Control 9, 602-619 (1966); Zeits. Wahrsch. Verw. Geb. 19, 225-230 (1971).
7. A. N. Shiryaev, The Annals of Probability 17, 3, 866-944 (1989).
8. G. J. Chaitin, J. ACM 16, 145-159 (1969).
9. A. N. Kolmogorov, Russian Math. Survey 38, 4, 27-36 (1983).
10. L. A. Levin, Problems of Information Transmission 10, 3, 206-210 (1974).
11. P. Gacs, Soviet Math. Doklady 15, 1477-1480 (1974).
12. G. J. Chaitin, J. ACM 22, 329-340 (1975).
13. V. V. Vjugin, Semiotika i Informatika (in Russian) 16, 14-43 (1981); V. A. Uspenskii, SIAM J. Theory Probab. Appl. 32, 387-412 (1987).
14. R. J. Solomonoff, Information and Control 7, 1-22 (1964).
15. C. H. Bennett, M. Gardner, Sci. American 241, 11, 20-34 (1979).

STRUCTURE OF PROBABILISTIC INFORMATION AND QUANTUM LAWS

JOHANN SUMMHAMMER
Atominstitut der Österreichischen Universitäten
Stadionallee 2, A-1020 Vienna, Austria
E-mail: summhammer@ati.ac.at

The acquisition and representation of basic experimental information under the probabilistic paradigm is analysed. The multinomial probability distribution is identified as governing all scientific data collection, at least in principle. For this distribution there exist unique random variables, whose standard deviation becomes asymptotically invariant of physical conditions. Representing all information by means of such random variables gives the quantum mechanical probability amplitude and a real alternative. For predictions, the linear evolution law (Schrödinger or Dirac equation) turns out to be the only way to extend the invariance property of the standard deviation to the predicted quantities. This indicates that quantum theory originates in the structure of gaining pure, probabilistic information, without any mechanical underpinning.

1 Introduction

The probabilistic paradigm proposed by Born is well accepted for comparing experimental results to quantum theoretical predictions. It states that only the probabilities of the outcomes of an observation are determined by the experimental conditions. In this paper we wish to place this paradigm first. We shall investigate its consequences without assuming quantum theory or any other physical theory. We look at this paradigm as defining the method of the investigation of nature. This consists in the collection of information in probabilistic experiments performed under well controlled conditions, and in the efficient representation of this information. Realising that the empirical information is necessarily finite permits us to put limits on what can at best be extracted from this information, and therefore also on what can at best be said about the outcomes of future experiments. At first, this has nothing to do with laws of nature. But it tells us what optimal laws look like under probability. Interestingly, the quantum mechanical probability calculus is found as almost the best possibility. It meets with difficulties only when it must make predictions from a low amount of input information. We find that the quantum mechanical way of prediction does nothing but take the initial uncertainty volume of the representation space of the finite input information and move this volume about, without compressing or expanding it. However, we emphasize that any mechanistic imagery of particles, waves, fields, even space, must be seen as what they are: the human brain's way of portraying sensory impressions, mere images in our minds. Taking them as corresponding to anything in nature, while going a long way in the design of experiments, can become very counterproductive to science's task of finding laws. Here, the correct path seems to be the search for invariant structures in the empirical information, without any models. Once embarked on this road, the old question of how nature really is no longer seeks an answer in the muscular domain of mass, force, torque and the like, which classical physics took as such unshakeable primary notions (not surprisingly, considering our ape origin, I cannot help commenting). Rather, one asks: Which of the structures principally detectable in probabilistic information are actually realized?

In the following sections we shall analyse the process of scientific investigation of nature under the probabilistic paradigm. We shall first look at how we gain information, then how we should best capture this information into numbers, and finally what the ideal laws for making predictions should look like. The last step will bring the quantum mechanical time evolution, but will also indicate a problem due to finite information.

2 Gaining experimental information

Under the probabilistic paradigm, basic physical observation is not very different from tossing a coin or blindly picking balls from an urn. One sets up specific conditions and checks what happens. And then one repeats this many times to gather statistically significant amounts of information. The difference to classical probabilistic experiments is that in quantum experiments one must carefully monitor the conditions and ensure they are the same for each trial. Any noticeable change constitutes a different experimental situation and must be avoided.^a

Formally, one has a probabilistic experiment in which a single trial can give $K$ different outcomes, one of which happens. The probabilities of these outcomes, $p_1, \ldots, p_K$ ($\sum p_j = 1$), are determined by the conditions. But they are unknown. In order to find their values, and thereby the values of physical quantities functionally related to them, one does $N$ trials. Let us assume the outcomes $j = 1, \ldots, K$ happen $L_1, \ldots, L_K$ times, respectively ($\sum L_j = N$). The $L_j$ are random variables, subject to the multinomial probability distribution. Listing $L_1, \ldots, L_K$ represents the complete information gained in the $N$ trials. The customary way of representing the information is, however, by other random variables, the so called relative frequencies $\nu_j = L_j / N$. Clearly, they also obey the multinomial probability distribution.

^a Strictly speaking, identical trials are impossible. A deeper analysis of why one can neglect remote conditions might lead to an understanding of the notion of spatial distance, about which relativity says nothing, and which is badly missing in today's physics.

Examples

A trial in a spin-1/2 Stern-Gerlach experiment has two possible outcomes. This experiment is therefore governed by the binomial probability distribution. A trial in a GHZ experiment has eight possible outcomes, because each of the three particles can end up in one of two detectors [2]. Here, the relative frequencies follow the multinomial distribution of order eight. Measuring an intensity in a detector which can only fire or not fire is in fact an experiment where one repeatedly checks whether a firing occurs in a sufficiently small time interval. Thus one has a binomial experiment. If the rate of firing is small, the binomial distribution can be approximated by the Poisson distribution.

We must emphasize that the multinomial probability distribution is of utmost importance to physics under the probabilistic paradigm. This can be seen as follows: The conditions of a probabilistic experiment must be verified by auxiliary measurements. These are usually coarse classical measurements, but should actually also be probabilistic experiments of the most exacting standards. The probabilistic experiment of interest must therefore be done by ensuring that for each of its trials the probabilities of the outcomes of the auxiliary probabilistic experiments are the same. Consequently, empirical science is characterized by a succession of data-takings of multinomial probability distributions of various orders. The laws of physics are contained in the relations between the random variables from these different experiments. Since the statistical verification of these laws is again ruled by the properties of the multinomial probability distribution, we should expect that the inner structure of the multinomial probability distribution will appear in one form or another in the fundamental laws of physics. In fact, we might be led to the bold conjecture that, under the probabilistic paradigm, basic physical law is no more than the structures implicit in the multinomial probability distribution. There is no escape from this distribution. Whichever way we turn, we stumble across it as the unavoidable tool for connecting empirical data to physical ideas.

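The multinomial probability distribution of order $K$ is obtained when calculating the probability that in $N$ trials the outcomes $1, \ldots, K$ occur $L_1, \ldots, L_K$ times, respectively:

$$\mathrm{Prob}(L_1, \ldots, L_K \mid N, p_1, \ldots, p_K) = \frac{N!}{L_1! \cdots L_K!}\, p_1^{L_1} \cdots p_K^{L_K}. \tag{2.1}$$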
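Eq. (2.1) can be evaluated directly for small cases. The sketch below (plain Python, hypothetical function names) computes the multinomial probability and, by brute-force enumeration of all outcome counts, the mean and standard deviation of one relative frequency $L_j / N$. These can be compared with $p_j$ and with the textbook value $\sqrt{p_j (1 - p_j) / N}$. The enumeration is only practical for small $N$ and $K$.

```python
import math
from itertools import product


def multinomial_pmf(counts, probs):
    """Probability of observing the given outcome counts in sum(counts) trials."""
    n = sum(counts)
    coeff = math.factorial(n)
    for c in counts:
        coeff //= math.factorial(c)  # exact integer multinomial coefficient
    value = float(coeff)
    for c, p in zip(counts, probs):
        value *= p ** c
    return value


def frequency_moments(n_trials, probs, j=0):
    """Exact mean and standard deviation of nu_j = L_j / N by enumeration."""
    k = len(probs)
    mean = 0.0
    second = 0.0
    for counts in product(range(n_trials + 1), repeat=k):
        if sum(counts) != n_trials:
            continue  # keep only count vectors that add up to N
        w = multinomial_pmf(counts, probs)
        nu = counts[j] / n_trials
        mean += w * nu
        second += w * nu * nu
    return mean, math.sqrt(max(second - mean * mean, 0.0))


if __name__ == "__main__":
    probs = (0.2, 0.3, 0.5)
    mean, std = frequency_moments(12, probs, j=0)
    print(mean, std, math.sqrt(probs[0] * (1 - probs[0]) / 12))
```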

The expectation values of the relative frequencies are

$$\bar{\nu}_j = p_j, \tag{2.2}$$

and their standard deviations are

$$\Delta\nu_j = \sqrt{\frac{p_j (1 - p_j)}{N}}. \tag{2.3}$$

3 Efficient representation of probabilistic information

The reason why probabilistic information is most often represented by the relative frequencies $\nu_j$ seems to be historical. Probability theory originated as a method of estimating fractions of countable sets when inspecting all elements was not possible (good versus bad apples in a large plantation, desirable versus undesirable outcomes in games of chance, etc.). The relative frequencies and their limits were the obvious entities to work with. But the information can be represented equally well by other random variables $\chi_j$, as long as these are one-to-one mappings $\chi_j(\nu_j)$, so that no information is lost. The question is whether there exists a most efficient representation.

To answer this, let us see what we know about the limits $p_1, \ldots, p_K$ before the experiment, but having decided to do $N$ trials. Our analysis is equivalent for all $K$ outcomes, so that we can pick out one and drop the subscript. We can use Chebyshev's inequality [4] to estimate the width of the interval to which the probability $p$ of the chosen outcome is pinned down.^b

If $N$ is not too small, we get

$$w_p = 2k \sqrt{\frac{\nu (1 - \nu)}{N}}, \tag{3.1}$$

where $k$ is a free confidence parameter. (Eq. (3.1) is not valid at $\nu = 0$ or 1.) Before the experiment we do not know $\nu$, so we can only give the upper limit

$$w_p \le \frac{k}{\sqrt{N}}. \tag{3.2}$$

^b Chebyshev's inequality states: For any random variable whose standard deviation exists, the probability that the value of the random variable deviates by more than $k$ standard deviations from its expectation value is less than or equal to $k^{-2}$. Here, $k$ is a free confidence parameter greater than 1.

But we can be much more specific about the limit $\bar{\chi}$ of the random variable $\chi(\nu)$, for which we require that, at least for large $N$, the standard deviation $\Delta\chi$ shall be independent of $p$ (or of $\bar{\chi}$ for that matter, since there will exist a function $p(\bar{\chi})$):

$$\Delta\chi = \frac{C}{\sqrt{N}}, \tag{3.3}$$

where $C$ is an arbitrary real constant. For the derivation of the function $\chi(\nu)$ it is easiest to make use of the illustration in Fig. 1. Although it already shows the solution, the argument is general enough so that the particular form of the discussed function does not matter. First we note that $\chi(\nu)$ shall be smooth, differentiable and strictly monotonic. For sufficiently large $N$ the probability distribution of $\nu$ can be approximated by a normal distribution centered at $\bar{\nu}$ and with standard deviation $\Delta\nu$. In other words, it will approach the gaussian form

$$\mathrm{Prob}(\nu \mid N, p) \approx \tau \exp\left[ -\frac{(\nu - \bar{\nu})^2}{2 (\Delta\nu)^2} \right], \tag{3.4}$$

where $\tau$ is the normalization factor. But clearly, the corresponding probability distribution of $\chi$ will also tend to the gaussian form, of standard deviation $\Delta\chi$. (For instance, take the probability distributions of $\nu$ and $\chi$ for $p = .5$. These are the ones in the middle, as shown in Fig. 1.) And if $N$ is large, both $\Delta\nu$ and $\Delta\chi$ will be small, so that in the range of $\chi$ and $\nu$ where the probability is significantly different from zero, the curve $\chi(\nu)$ can be approximated by its tangent,

$$\chi \approx \chi(\bar{\nu}) + \left( \frac{\partial\chi}{\partial\nu} \right)_{\nu = \bar{\nu}} (\nu - \bar{\nu}). \tag{3.5}$$

Then it follows that the characteristic width of the probability distribution of $\chi$, which is $\Delta\chi$, will be proportional to the characteristic width of the probability distribution of $\nu$, which is $\Delta\nu$. The proportionality constant will be $\partial\chi / \partial\nu$, because this is by how much the distribution for $\nu$ gets squeezed or stretched to become the one for $\chi$. So we have for large $N$

$$\frac{\Delta\chi}{\Delta\nu} = \left| \frac{\partial\chi}{\partial\nu} \right|. \tag{3.6}$$

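Use of the standard deviation of $\nu$ from Section 2 together with (3.3) in (3.6), and integration, yields

$$\chi = C \arcsin(2\nu - 1) + \theta, \tag{3.7}$$

where $\theta$ is an arbitrary real constant. For comparison with $\nu$ we confine $\chi$ to $[0, 1]$ and thus set $C = \pi^{-1}$ and $\theta = .5$, as was already done in Fig. 1. Then we have $\Delta\chi = 1 / (\pi \sqrt{N})$, and upon application of Chebyshev's inequality we get the interval $w_\chi$ to which we can pin down the unknown limit $\bar{\chi}$ as

$$w_\chi = \frac{2k}{\pi \sqrt{N}}. \tag{3.8}$$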

Clearly, this is narrower than the upper limit for $w_p$ in Eq. (3.2). Having done no experiment at all, we have better knowledge of the value of $\bar{\chi}$ than of the value of $p$, although both can only be in the interval $[0, 1]$. And note that the actual experimental data will add nothing to the accuracy with which we know $\bar{\chi}$, but they may add to the accuracy with which we know $p$. Nevertheless, even with data, $w_p$ may still be larger than $w_\chi$, especially when $p$ is around 0.5.
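The invariance of the standard deviation of $\chi$ can be checked without simulation by summing over the exact binomial distribution. The sketch below is only an illustration: it assumes $C = 1/\pi$ and $\theta = 0.5$ as in the text, uses hypothetical function names, and compares the exact standard deviation of $\chi$ with the asymptotic value $1 / (\pi \sqrt{N})$ for several $p$, next to the standard deviation of $\nu$, which does depend on $p$.

```python
import math


def chi(nu, c=1 / math.pi, theta=0.5):
    """chi = C arcsin(2 nu - 1) + theta, confined to [0, 1] for these constants."""
    return c * math.asin(2 * nu - 1) + theta


def binomial_pmf(n, l, p):
    """Probability of l occurrences of the chosen outcome in n trials."""
    return math.comb(n, l) * p ** l * (1 - p) ** (n - l)


def std_of(transform, n, p):
    """Exact standard deviation of transform(L / n) under the binomial law."""
    mean = 0.0
    second = 0.0
    for l in range(n + 1):
        w = binomial_pmf(n, l, p)
        value = transform(l / n)
        mean += w * value
        second += w * value * value
    return math.sqrt(max(second - mean * mean, 0.0))


if __name__ == "__main__":
    n = 400
    target = 1 / (math.pi * math.sqrt(n))
    for p in (0.25, 0.5, 0.75):
        # columns: p, std of nu (varies with p), std of chi, asymptotic value
        print(p, std_of(lambda v: v, n, p), std_of(chi, n, p), target)
```

For $p$ very close to 0 or 1 and moderate $N$ the gaussian approximation behind the derivation is poorer, so larger deviations from the asymptotic value should be expected there.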

For the representation of information the random variable $\chi$ is the proper choice, because it disentangles the two aspects of empirical information: the number of trials $N$, which is determined by the experimenter, not by nature, and the actual data, which are only determined by nature. The experimenter controls the accuracy $w_\chi$ by deciding $N$; nature supplies the data $\chi$, and thereby the whereabouts of $\bar{\chi}$. In the real domain the only other random variables with this property are the linear transformations afforded by $C$ and $\theta$. From the physical point of view $\chi$ is of interest, because its standard deviation is an invariant of the physical conditions as contained in $p$ or $\bar{\chi}$. The random variable $\chi$ expresses empirical information with a certain efficiency, eliminating a numerical distortion that is due to the structure of the multinomial distribution, and which is apparent in all other random variables. We shall call $\chi$ an efficient random variable (ER). More generally, we shall call any random variable an ER whose standard deviation is asymptotically invariant of the limit the random variable tends to, Eq. (3.3).

Another graphical depiction of the relation between $\nu$ and $\chi$ can be given by drawing a semicircle of diameter 1 along which we plot $\nu$ (Fig. 2a). By orthogonal projection onto the semicircle we get the random variable $\zeta = [\pi + 2 \arcsin(2\nu - 1)] / 4$, and thereby $\chi$, when we choose different constants. The drawing also suggests a simple way to obtain a complex ER. We scale the semicircle by an arbitrary real factor $a$, tilt it by an arbitrary angle $\varphi$, and place it into the complex plane as shown in Fig. 2b. This gives the random variable

$$\beta = a \left( \sqrt{\nu (1 - \nu)} + i \nu \right) e^{i\varphi} + b, \tag{3.9}$$

where $b$ is an arbitrary complex constant. We get a very familiar special case by setting $a = 1$ and $b = 0$:

$$\psi = \left( \sqrt{\nu (1 - \nu)} + i \nu \right) e^{i\varphi}. \tag{3.10}$$

Figure 1. Functional relation between the random variables $\nu$ and $\chi$, and their respective probability distributions as expected for $N = 100$ trials, plotted for five different values of $p$: .07, .25, .50, .75 and .93. The bar above each probability distribution indicates twice its standard deviation. Notice that the standard deviations of $\nu$ differ considerably for different $p$, while those of $\chi$ are all the same, as required in Eq. (3.3).

Figure 2. (a) Graphical construction of the efficient random variable $\zeta$ (and thereby of $\chi$) from the observed relative frequency $\nu$; $\zeta$ is measured along the arc. (b) Similar construction of the efficient random variable $\beta$. It is given by its coordinates in the complex plane. The quantum mechanical probability amplitude $\psi$ is the normalized case of $\beta$, obtained by setting $a = 1$ and $b = 0$.

For large $N$ the probability distribution of $\nu$ becomes gaussian, but also that of any smooth function of $\nu$, as we have already seen in Fig. 1. Therefore the standard deviation of $\psi$ is obtained as

$$\Delta\psi = \left| \frac{\partial\psi}{\partial\nu} \right| \Delta\nu = \frac{1}{2 \sqrt{N}}. \tag{3.11}$$

Obviously the random variable $\psi$ is an ER. It fulfills $|\psi|^2 = \nu$, and we recognize it as the probability amplitude of quantum theory, which we would infer from the observed relative frequency $\nu$. Note, however, that the intuitive way of getting the quantum mechanical probability amplitude, namely by simply taking $\sqrt{\nu} \exp(i\alpha)$, where $\alpha$ is an arbitrary phase, does not give us an ER.
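Both claims in this paragraph can be checked with a few lines of code. The sketch below is an illustration under stated assumptions: it uses the asymptotic relation $\Delta(\cdot) = |\partial(\cdot)/\partial\nu|\, \Delta\nu$ with $\Delta\nu = \sqrt{\nu (1 - \nu) / N}$, approximates the derivative by a central finite difference, and uses hypothetical function names. For $\psi$ the result should be $1 / (2\sqrt{N})$ regardless of $\nu$; for the naive amplitude $\sqrt{\nu}\, e^{i\alpha}$ it comes out as $\sqrt{1 - \nu} / (2\sqrt{N})$, which depends on $\nu$.

```python
import cmath
import math


def psi(nu, phase=0.0):
    """Efficient complex amplitude (sqrt(nu (1 - nu)) + i nu) e^{i phase}."""
    return (math.sqrt(nu * (1 - nu)) + 1j * nu) * cmath.exp(1j * phase)


def naive_amplitude(nu, phase=0.0):
    """The intuitive amplitude sqrt(nu) e^{i phase}."""
    return math.sqrt(nu) * cmath.exp(1j * phase)


def predicted_std(amplitude, nu, n_trials, h=1e-6):
    """|d amplitude / d nu| * Delta nu, using a central finite difference."""
    slope = abs(amplitude(nu + h) - amplitude(nu - h)) / (2 * h)
    return slope * math.sqrt(nu * (1 - nu) / n_trials)


if __name__ == "__main__":
    n = 100
    for nu in (0.1, 0.5, 0.9):
        print(nu, predicted_std(psi, nu, n), predicted_std(naive_amplitude, nu, n))
```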

We now have two ways of representing the obtained information by ERs, either the real valued $\chi$ or the complex valued $\beta$. Since the relative frequency of each of the $K$ outcomes of a general probabilistic experiment can be converted to its respective efficient random variable, the information is efficiently represented by the vector $(\chi_1, \ldots, \chi_K)$ or by the vector $(\beta_1, \ldots, \beta_K)$. The latter is equivalent to the quantum mechanical state vector if we normalize it: $(\psi_1, \ldots, \psi_K)$.

At this point it is not clear whether fundamental science could be built solely on the real ERs $\chi_j$, or whether it must rely on the complex ERs $\beta_j$ and, for practical reasons, on the normalized case $\psi_j$, as suggested by current formulations of quantum theory. We cannot address this problem here, but mention that working with the $\beta_j$ or $\psi_j$ can lead to nonsensical predictions, while working with the $\chi_j$ never does, so that the former are more sensitive to inconsistencies in the input data [6]. Therefore we use only the $\psi_j$ in the next section, but will not read them as if we were doing quantum theory.

4 Predictions

Let us now see whether the representation of probabilistic information by ERs suggests specific laws for predictions. A prediction is a statement on the expected values of the probabilities of the different outcomes of a probabilistic experiment which has not yet been done, or whose data we just do not yet know, on the basis of auxiliary probabilistic experiments which have been done and whose data we do know. We intend to make a prediction for a probabilistic experiment with $Z$ outcomes, and wish to calculate the quantities $\phi_s$ ($s = 1, \ldots, Z$), which shall be related to the predicted probabilities $P_s$ as $P_s = |\phi_s|^2$. We do not presuppose that the $\phi_s$ are ERs.

We assume we have done $M$ different auxiliary probabilistic experiments of various multinomial order $K_m$, $m = 1, \ldots, M$, and we think that they provided all the input information needed to predict the $\phi_s$ and therefore the $P_s$. With (3.10) the obtained information is represented by the ERs $\psi_j^m$, where $m$ denotes the experiment and $j$ labels a possible outcome in it ($j = 1, \ldots, K_m$). Then the predictions are

$$\phi_s = \phi_s\left( \psi_1^1, \ldots, \psi_{K_M}^M \right), \tag{4.1}$$

and their standard deviations are, by the usual convolution of gaussians as approximations of the multinomial distributions,

$$\Delta\phi_s = \sqrt{ \sum_{m=1}^{M} \frac{1}{4 N_m} \sum_{j=1}^{K_m} \left| \frac{\partial\phi_s}{\partial\psi_j^m} \right|^2 }, \tag{4.2}$$

where $N_m$ is the number of trials of the $m$th auxiliary experiment. If we wish the $\phi_s$ to be ERs, we must demand that the $\Delta\phi_s$ depend only on the $N_m$. (A technical requirement is that in each of the $M$ auxiliary experiments one of the phases of the ERs $\psi_j^m$ cannot be chosen freely, otherwise the second summations in (4.2) could not go to $K_m$, but only to $K_m - 1$.) Then the derivatives in (4.2) must be constants, implying that the $\phi_s$ are linear in the $\psi_j^m$. However, we cannot simply assume such linearity, because (4.1) contains the laws of physics, which cannot be known a priori. But we want to point out that a linear relation for (4.1) has very exceptional properties, so that it would be nice if we found it realized in nature. To be specific, if the $N_m$ are sufficiently large, linearity would afford predictive power which no other functional relation could achieve: It would be sufficient to know the number of trials of each auxiliary probabilistic experiment in order to specify the accuracy of the predicted $\phi_s$. No data would be needed, only a decision how many trials each auxiliary experiment will be given. Moreover, even the slightest increase of the amount of input information, by only doing one more trial in any of the auxiliary experiments, would lead to better accuracy of the predicted $\phi_s$, by bringing a definite decrease of the $\Delta\phi_s$. This latter property is absent in virtually all other functional relations conceivable for (4.1). In fact, most nonlinear relations would allow more input information to result in less accurate predictions. This would undermine the very idea of empirical science, namely that by observation our knowledge about nature can only increase, never just stay the same, let alone decrease. For this reason we assume linearity and apply it to a concrete example.

We take a particle in a one dimensional box of width $w$. Alice repeatedly prepares the particle in a state only she knows. At time $t$ after the preparation Bob measures the position by subdividing the box into $K$ bins of width $w / K$ and checking in which he finds the particle. In $N$ trials Bob obtains the relative frequencies $\nu_1, \ldots, \nu_K$, giving a good idea of the particle's position probability distribution at time $t$. He represents this information by the ERs $\psi_j$ of (3.10) and wants to use it to predict the position probability distribution at time $T$ ($T > t$).

First he predicts for $t + dt$. With (4.1) the predicted $\phi_s$ must be linear in the $\psi_j$ if they are to be ERs,

K

lt)s(t + dt) = J2asjxpj (43) i= i

Clearly, when $dt \to 0$ we must have $a_{sj} = 1$ for $s = j$ and $a_{sj} = 0$ otherwise, so we can write

$$a_{sj}(t) = \delta_{sj} + g_{sj}(t)\,dt, \qquad (44)$$

where the $g_{sj}(t)$ are the complex elements of a matrix $G$, and we included the possibility that they depend on $t$. Using matrix notation and writing the $\phi_s$ and $\psi_j$ as column vectors, we have

$$\vec{\phi}(t + dt) = [1 + G(t)\,dt]\,\vec{\psi}. \qquad (45)$$

For a prediction for time $t + 2dt$ we must apply another such linear transformation to the prediction we had for $t + dt$:

$$\vec{\phi}(t + 2dt) = [1 + G(t + dt)\,dt]\,\vec{\phi}(t + dt). \qquad (46)$$

Replacing $t + dt$ by $t$ and using $\vec{\phi}(t + dt) = \vec{\phi}(t) + \frac{d\vec{\phi}(t)}{dt}\,dt$, we have

$$\frac{d\vec{\phi}(t)}{dt} = G(t)\,\vec{\phi}(t). \qquad (47)$$

With (10) the input vector was normalized, $|\vec{\psi}|^2 = 1$. We also demand this from the vector $\vec{\phi}$. This results in the constraint that the diagonal elements $g_{ss}$ must be imaginary and the off-diagonal elements must fulfill $g_{sj} = -g_{js}^{*}$. And then we obviously have an evolution equation just as we know it from quantum theory.
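The normalization constraint can be checked numerically. The following is a minimal sketch, not taken from the paper: it builds a random matrix with imaginary diagonal and $g_{sj} = -g_{js}^{*}$ (the helper names `random_antihermitian`, `step` and `norm_drift` are hypothetical) and applies one step of (45). Under these assumptions the squared norm changes only at order $dt^2$.

```python
import random

def random_antihermitian(K, rng):
    """K x K matrix with imaginary diagonal and g[s][j] = -conj(g[j][s])."""
    g = [[0j] * K for _ in range(K)]
    for s in range(K):
        g[s][s] = 1j * rng.uniform(-1, 1)
        for j in range(s + 1, K):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            g[s][j] = z
            g[j][s] = -z.conjugate()
    return g

def step(psi, g, dt):
    """One application of phi = [1 + G dt] psi, as in Eq. (45)."""
    K = len(psi)
    return [psi[s] + dt * sum(g[s][j] * psi[j] for j in range(K))
            for s in range(K)]

def norm_sq(v):
    return sum(abs(c) ** 2 for c in v)

def norm_drift(K=4, dt=1e-3, seed=0):
    """Change of the squared norm after one step from a normalized vector."""
    rng = random.Random(seed)
    g = random_antihermitian(K, rng)
    psi = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(K)]
    n = norm_sq(psi) ** 0.5
    psi = [c / n for c in psi]
    return norm_sq(step(psi, g, dt)) - 1.0

if __name__ == "__main__":
    # The drift should shrink roughly a hundredfold when dt shrinks tenfold.
    print(norm_drift(dt=1e-3), norm_drift(dt=1e-4))
```

The first-order term cancels because $G + G^{\dagger} = 0$, so only the $dt^2$ remainder survives.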

For a quantitative prediction we need to know $G(t)$ and the phases $\varphi_j$ of the initial $\psi_j$. We had assumed the $\varphi_j$ to be arbitrary. But now we see that they influence the prediction, and therefore they attain physical significance. $G(t)$ is a unitary complex $K \times K$ matrix. For fixed conditions it is independent of time, and with the properties found above it is given by $K^2 - 1$ real numbers. The initial vector $\vec{\psi}$ has $K$ complex components. It is normalized and one phase is free, so that it is fixed by $2K - 2$ real numbers. Altogether $K^2 + 2K - 3 = (K + 3)(K - 1)$ numbers are needed to enable prediction. Since one probabilistic experiment yields $K - 1$ numbers, Bob must do $K + 3$ probabilistic experiments with different delay times between Alice's preparation and his measurement to obtain sufficient input information. But neither Planck's constant nor the particle's mass are needed. It should be noted that this analysis remains unaltered if the initial vector $\vec{\psi}$ is obtained from measurement of joint probability distributions of several particles. Therefore (21) also contains entanglement between particles.
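The parameter count in this paragraph is simple arithmetic and can be sketched as follows. The function name `experiments_needed` is a hypothetical helper for illustration only.

```python
def experiments_needed(K):
    """Number of K-outcome experiments Bob needs, following the count in the text."""
    g_params = K * K - 1        # real numbers fixing G
    psi_params = 2 * K - 2      # real numbers fixing the initial vector
    total = g_params + psi_params
    per_experiment = K - 1      # independent relative frequencies per experiment
    assert total == (K + 3) * (K - 1)
    return total // per_experiment

if __name__ == "__main__":
    for K in (2, 3, 10):
        print(K, experiments_needed(K))
```

For a two-bin box ($K = 2$) this gives five experiments at different delay times.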

5 Discussion

This paper was based on the insight that under the probabilistic paradigm data from observations are subject to the multinomial probability distribution. For the representation of the empirical information we searched for random variables which are stripped of numerical artefacts. They should therefore have an invariance property. We found, as unique random variables, a real and a complex class of efficient random variables (ERs). They capture the obtained information more efficiently than others because their standard deviation is an asymptotic invariant of the physical conditions. The quantum mechanical probability amplitude is the normalized case of the complex class. It is natural that fundamental probabilistic science should use such random variables, rather than any others, as the representors of the observed information, and therefore as the carriers of meaning.

Using the ERs for prediction has given us an evolution prescription which is equivalent to the quantum theoretical way of applying a sequence of infinitesimal rotations to the state vector in Hilbert space [7]. It seems that simply analysing how we gain empirical information and what we can say from it about expected future information, and not succumbing to the lure of the question what is behind this information, can give us a basis for doing physics. This confirms the operational approach to science. And it is in support of Wheeler's It-from-Bit hypothesis [8], Weizsäcker's ur-theory [9], Eddington's idea that information increase itself defines the rest [10], Anandan's conjecture of absence of dynamical laws [11], Bohr and Ulfbeck's hypothesis of mere symmetry [12], or the recent 1 Bit = 1 Constituent hypothesis of Brukner and Zeilinger [13].

In view of the analysis presented here, the quantum theoretical probability calculus is an almost trivial consequence of probability theory, but not as applied to objects or anything physical, but as applied to the naked data of probabilistic experiments. If we continue this idea we encounter a deeper problem, namely whether the space which we consider physical, this 3- or higher-dimensional manifold in which we normally assume the world to unfurl [14], cannot also be understood as a peculiar way of representing data. Kant conjectured this, in somewhat different words, over 200 years ago [15]. And indeed it is clearly so if we imagine the human observer as a robot who must find a compact memory representation of the gigantic data stream it receives through its senses [16]. That is why our earlier example of the particle in a box should only be seen as an illustration by means of familiar terms. It should not imply that we accept the naive conception of space, or of things like particles in it, although this view works well in everyday life and in the laboratory, as long as we are not doing quantum experiments. We think that a full acceptance of the probabilistic paradigm as the basis of empirical science will eventually require an attack on the notions of spatial distance and spatial dimension from the point of view of optimal representation of probabilistic information.

Finally, we want to remark on a difference of our analysis to quantum theory. We have emphasized that the standard deviations of the ERs $\chi$ and $\psi$ become independent of the limits of these ERs only when we have infinitely many trials. But there is a departure for finitely many trials, especially for values of $p$ close to 0 and close to 1. With some imagination this can be noticed in Fig. 1 in the top and bottom probability distributions, which are a little bit wider than those in the middle. But as we always have only finitely many trials, there should exist random variables which fulfill our requirement for an ER even better than $\chi$ and $\psi$. This implies that predictions based on these unknown random variables should also be more precise. Whether we should see this as a fluke of statistics or as a need to amend quantum theory is a debatable question. But it should be testable. We need to have a number of different probabilistic experiments, all of which are done with only very few trials. From this we want to predict the outcomes of another probabilistic experiment, which is then also done with only few trials. Presumably the optimal procedure of prediction will not be the one we have presented here (and therefore not quantum theory). The difficulty with such tests is, however, that in the usual interpretation of data, statistical theory and quantum theory are treated as separate, while one message of this paper may also be that under the probabilistic paradigm the bottom level of physical theory should be equivalent to optimal representation of probabilistic information, and this theory should not be in need of additional purely statistical theories to connect it to actual data. We are discussing this problem in a future paper [17].


Acknowledgments

This paper is a result of pondering what I am doing in the lab, how it can be that in the evening I know more than I knew in the morning, and discussing this with G. Krenn, K. Svozil, C. Brukner, M. Zukovski and a number of other people.

References

1. M. Born, Zeitschrift f. Physik 37, 863 (1926); Brit. J. Philos. Science 4, 95 (1953).

2. D. Bouwmeester et al., Phys. Rev. Lett. 82, 1345 (1999), and references therein.

3. W. Feller, An Introduction to Probability Theory and its Applications (John Wiley and Sons, New York, 3rd edition, 1968), Vol. 1, p. 168.

4. ibid., p. 233.

5. The connection of this relation to quantum physics was first stressed by W. K. Wootters, Phys. Rev. D 23, 357 (1981).

6. We give the example in quant-ph/0008098.

7. Several authors have noted that probability theory itself suggests quantum theory: A. Lande, Am. J. Phys. 42, 459 (1974); A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, Dordrecht, 1998); D. I. Fivel, Phys. Rev. A 50, 2108 (1994).

8. J. A. Wheeler, in Quantum Theory and Measurement, eds. J. A. Wheeler and W. H. Zurek (Princeton University Press, Princeton, 1983), 182.

9. C. F. von Weizsäcker, Aufbau der Physik (Hanser, Munich, 1985); Holger Lyre, Int. J. Theor. Phys. 34, 1541 (1995). Also quant-ph/9703028.

10. C. W. Kilmister, Eddington's Search for a Fundamental Theory (Cambridge University Press, 1994).

11. J. Anandan, Found. Phys. 29, 1647 (1999).

12. A. Bohr and O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

13. C. Brukner and A. Zeilinger, Phys. Rev. Lett. 83, 3354 (1999).

14. A penetrating analysis of the view of space implied by quantum theory is given by U. Mohrhoff, Am. J. Phys. 68 (8), 728 (2000).

15. Immanuel Kant, Critik der reinen Vernunft (Critique of Pure Reason), Riga (1781). There should be many English translations.

16. E. T. Jaynes introduced the reasoning robot in his book Probability Theory: The Logic of Science in order to eliminate the problem of subjectivism that has been plaguing probability theory and quantum theory alike. The book is freely available at http://bayes.wustl.edu/etj/prob.html

17. J. Summhammer (to be published).

QUANTUM CRYPTOGRAPHY IN SPACE AND BELL'S THEOREM

IGOR VOLOVICH

Steklov Mathematical Institute, Gubkin St. 8,

GSP-1, 117966 Moscow, Russia

E-mail: volovich@mi.ras.ru

Bell's theorem states that some quantum correlations cannot be represented by classical correlations of separated random variables. It has been interpreted as incompatibility of the requirement of locality with quantum mechanics. We point out that in fact the space part of the wave function was neglected in the proof of Bell's theorem. However, this space part is crucial for considerations of the property of locality of a quantum system. Actually, the space part leads to an extra factor in quantum correlations, and as a result the ordinary proof of Bell's theorem fails in this case. Bell's theorem constitutes an important part of quantum cryptography. The promise of secure cryptographic quantum key distribution schemes is based on the use of Bell's theorem in the spin space. In many current quantum cryptography protocols the space part of the wave function is neglected. As a result, such schemes can be secure against eavesdropping attacks in the abstract spin space, but they could be insecure in the real three-dimensional space. We discuss an approach to the security of quantum key distribution in space by using a special preparation of the space part of the wave function.

1 Introduction

Bell's theorem [1] states that there are quantum correlation functions that cannot be represented as classical correlation functions of separated random variables. It has been interpreted as incompatibility of the requirement of locality with the statistical predictions of quantum mechanics. For a recent discussion of Bell's theorem see, for example, [2-17] and references therein. It is now widely accepted, as a result of Bell's theorem and related experiments, that local realism must be rejected.

Evidently, the very formulation of the problem of locality in quantum mechanics is based on ascribing a special role to the position in ordinary three-dimensional space. It is rather surprising, therefore, that the space dependence of the wave function is neglected in discussions of the problem of locality in relation to Bell's inequalities. Actually, it is the space part of the wave function which is relevant to the consideration of the problem of locality.

In this note we point out that the space part of the wave function leads to an extra factor in the quantum correlation, and as a result the ordinary proof of Bell's theorem fails in this case. We present a criterion of locality (or nonlocality) of quantum theory in a realist model of hidden variables. We argue that predictions of quantum mechanics can be consistent with Bell's inequalities for Gaussian wave functions, and hence Einstein's local realism is restored in this case.

Bell's theorem constitutes an important part of quantum cryptography [19]. It is now generally accepted that techniques of quantum cryptography can allow secure communications between distant parties [18-25]. The promise of secure cryptographic quantum key distribution schemes is based on the use of quantum entanglement in the spin space and on the quantum no-cloning theorem. An important contribution of quantum cryptography is a mechanism for detecting eavesdropping.

However, in many current quantum cryptography protocols the space part of the wave function is neglected. But it is exactly the space part of the wave function that describes the behaviour of particles in ordinary real three-dimensional space. As a result, such schemes can be secure against eavesdropping attacks in the abstract spin space but could be insecure in the real three-dimensional space.

It follows that proofs of the security of quantum cryptography schemes which neglect the space part of the wave function could fail against attacks in the real three-dimensional space. We will discuss how one can try to improve the security of quantum cryptography schemes in space by using a special preparation of the space part of the wave function.

2 Bell's Inequality

In the presentation of Bell's theorem we will follow [17], where one can also find more references. The mathematical formulation of Bell's theorem reads

$$\cos(\alpha - \beta) \neq E\,\xi_\alpha \eta_\beta \qquad (2.1)$$

where $\xi_\alpha$ and $\eta_\beta$ are two random processes such that $|\xi_\alpha| \le 1$, $|\eta_\beta| \le 1$, and $E$ is the expectation. Let us discuss in more detail the physical interpretation of this result. Consider a pair of spin one-half particles formed in the singlet spin state and moving freely towards two detectors (Alice and Bob). If one neglects the space part of the wave function, then the quantum mechanical correlation of two spins in the singlet state $\psi_{spin}$ is

$$D_{spin}(a, b) = \langle \psi_{spin} | \sigma \cdot a \otimes \sigma \cdot b | \psi_{spin} \rangle = -a \cdot b. \qquad (2.2)$$

Here $a$ and $b$ are two unit vectors in three-dimensional space and $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices.
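The singlet correlation $-a \cdot b$ of Eq. (2.2) can be verified directly with small matrices. The sketch below is illustrative only; it assumes the usual singlet $(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle)/\sqrt{2}$ in the $\sigma_3$ basis, and the helper names `sigma_dot`, `kron` and `d_spin` are hypothetical.

```python
import math

# Pauli matrices as nested lists
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def sigma_dot(n):
    """The 2x2 matrix sigma . n for a 3-vector n."""
    return [[n[0] * SX[i][j] + n[1] * SY[i][j] + n[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]

def kron(A, B):
    """Kronecker product of two 2x2 matrices (a 4x4 matrix)."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def d_spin(a, b):
    """Expectation of (sigma.a) x (sigma.b) in the singlet state."""
    s = 1.0 / math.sqrt(2.0)
    psi = [0.0, s, -s, 0.0]          # (|01> - |10>)/sqrt(2)
    M = kron(sigma_dot(a), sigma_dot(b))
    Mpsi = [sum(M[r][c] * psi[c] for c in range(4)) for r in range(4)]
    return sum(complex(psi[r]).conjugate() * Mpsi[r] for r in range(4)).real

if __name__ == "__main__":
    theta = 0.7
    a = (1.0, 0.0, 0.0)
    b = (math.cos(theta), math.sin(theta), 0.0)
    print(d_spin(a, b), -math.cos(theta))
```

The two printed numbers should agree up to rounding, since $a \cdot b = \cos\theta$ for this choice of vectors.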


Bell's theorem states that the function $D_{spin}(a, b)$, Eq. (2.2), cannot be represented in the form

$$P(a, b) = \int \xi(a, \lambda)\,\eta(b, \lambda)\,d\rho(\lambda), \qquad (2.3)$$

i.e.

$$D_{spin}(a, b) \neq P(a, b). \qquad (2.4)$$

Here $\xi(a, \lambda)$ and $\eta(b, \lambda)$ are random fields on the sphere, $|\xi(a, \lambda)| \le 1$, $|\eta(b, \lambda)| \le 1$, and $d\rho(\lambda)$ is a positive probability measure, $\int d\rho(\lambda) = 1$. The parameters $\lambda$ are interpreted as hidden variables in a realist theory. It is clear that Eq. (2.4) can be reduced to Eq. (2.1).

One has the following Bell-Clauser-Horne-Shimony-Holt (CHSH) inequality:

$$|P(a, b) - P(a, b') + P(a', b) + P(a', b')| \le 2. \qquad (2.5)$$

On the other hand, there are vectors ($a \cdot b = a' \cdot b = a' \cdot b' = -a \cdot b' = \sqrt{2}/2$) for which one has

$$|D_{spin}(a, b) - D_{spin}(a, b') + D_{spin}(a', b) + D_{spin}(a', b')| = 2\sqrt{2}. \qquad (2.6)$$

Therefore, if one supposes that $D_{spin}(a, b) = P(a, b)$, then one gets a contradiction.
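The value $2\sqrt{2}$ can be reproduced with one concrete choice of coplanar unit vectors satisfying the stated scalar products. In this sketch the angles $0, \pi/4, \pi/2, 3\pi/4$ are an assumed example, and `unit`, `d_spin` and `chsh` are hypothetical helper names.

```python
import math

def unit(theta):
    """Unit vector in the x-y plane at angle theta."""
    return (math.cos(theta), math.sin(theta), 0.0)

def d_spin(a, b):
    """Singlet correlation D_spin(a, b) = -a.b."""
    return -sum(x * y for x, y in zip(a, b))

def chsh(corr, a, a2, b, b2):
    """CHSH combination before taking the absolute value."""
    return corr(a, b) - corr(a, b2) + corr(a2, b) + corr(a2, b2)

# a.b = a'.b = a'.b' = -a.b' = sqrt(2)/2 for these angles
a, b = unit(0.0), unit(math.pi / 4)
a2, b2 = unit(math.pi / 2), unit(3 * math.pi / 4)
value = chsh(d_spin, a, a2, b, b2)

if __name__ == "__main__":
    print(abs(value), 2 * math.sqrt(2))
```

The combination itself evaluates to $-2\sqrt{2}$; its absolute value exceeds the bound 2 that any representation of the form (2.3) must respect.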

It will be shown below that if one takes into account the space part of the wave function, then the quantum correlation in the simplest case will take the form $g \cos(\alpha - \beta)$ instead of just $\cos(\alpha - \beta)$, where the parameter $g$ describes the location of the system in space and time. In this case one can get the representation

$$g \cos(\alpha - \beta) = E\,\xi_\alpha \eta_\beta \qquad (2.7)$$

if $g$ is small enough (see below). The factor $g$ gives a contribution to the visibility or efficiency of detectors that is used in the phenomenological description of detectors.

3 Localized Detectors

In the previous section the space part of the wave function of the particles was neglected. However, it is exactly the space part that is relevant to the discussion of locality. The complete wave function is $\psi = (\psi_{\alpha\beta}(r_1, r_2))$, where $\alpha$ and $\beta$ are spinor indices and $r_1$ and $r_2$ are vectors in three-dimensional space.

We suppose that Alice and Bob have detectors which are located within the two localized regions $O_A$ and $O_B$ respectively, well separated from one another.

The quantum correlation describing the measurements of spins by Alice and Bob at their localized detectors is

$$G(a, O_A, b, O_B) = \langle \psi | \sigma \cdot a\,P_{O_A} \otimes \sigma \cdot b\,P_{O_B} | \psi \rangle. \qquad (3.1)$$

Here $P_O$ is the projection operator onto the region $O$. Let us consider the case when the wave function has the form of the product of the spin function and the space function, $\psi = \psi_{spin}\,\phi(r_1, r_2)$. Then one has

$$G(a, O_A, b, O_B) = g(O_A, O_B)\,D_{spin}(a, b), \qquad (3.2)$$

where the function

$$g(O_A, O_B) = \int_{O_A \times O_B} |\phi(r_1, r_2)|^2\,dr_1\,dr_2 \qquad (3.3)$$

describes the correlation of particles in space. It is the probability to find one particle in the region $O_A$ and another particle in the region $O_B$. One has

$$0 \le g(O_A, O_B) \le 1. \qquad (3.4)$$

Remark. In relativistic quantum field theory there is no nonzero strictly localized projection operator that annihilates the vacuum. It is a consequence of the Reeh-Schlieder theorem. Therefore, apparently, the function $g(O_A, O_B)$ should always be strictly smaller than 1. I am grateful to W. Luecke for this remark.

Now one inquires whether one can write the representation

$$g(O_A, O_B)\,D_{spin}(a, b) = \int \xi(a, O_A, \lambda)\,\eta(b, O_B, \lambda)\,d\rho(\lambda). \qquad (3.5)$$

Note that if we are interested in the conditional probability of finding the projection of spin along the vector $a$ for particle 1 in the region $O_A$ and the projection of spin along the vector $b$ for particle 2 in the region $O_B$, then we have to divide both sides of Eq. (3.5) by $g(O_A, O_B)$.

The factor $g$ is important. In particular, one can write the following representation [15] for $0 \le g \le 1/2$:

$$g \cos(\alpha - \beta) = \int_0^{2\pi} \sqrt{2g}\,\cos(\alpha - \lambda)\,\sqrt{2g}\,\cos(\beta - \lambda)\,\frac{d\lambda}{2\pi}. \qquad (3.6)$$
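The representation (3.6) can be checked by averaging over a uniform grid of the hidden variable $\lambda$. This is a minimal sketch; the function name `hidden_variable_average` and the grid size are illustrative choices, not part of the paper.

```python
import math

def hidden_variable_average(g, alpha, beta, steps=3600):
    """Average of xi * eta over lambda uniform on [0, 2*pi): right side of (3.6)."""
    amp = math.sqrt(2.0 * g)          # |xi|, |eta| <= 1 requires g <= 1/2
    total = 0.0
    for i in range(steps):
        lam = 2.0 * math.pi * i / steps
        xi = amp * math.cos(alpha - lam)
        eta = amp * math.cos(beta - lam)
        total += xi * eta
    return total / steps

if __name__ == "__main__":
    g, alpha, beta = 0.3, 0.4, 1.7
    print(hidden_variable_average(g, alpha, beta), g * math.cos(alpha - beta))
```

The bound $g \le 1/2$ enters only through the requirement that the amplitude $\sqrt{2g}$ of each factor not exceed 1.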

Let us now apply these considerations to quantum cryptography.

4 Quantum Key Distribution

Ekert [19] showed that one can use the EPR correlations to establish a secret random key between two parties (Alice and Bob). Bell's inequalities are used to check the presence of an intermediate eavesdropper (Eve). There are two stages to the Ekert protocol: the first stage over a quantum channel, the second over a public channel.

The quantum channel consists of a source that emits pairs of spin one-half particles in a singlet state. The particles fly apart towards Alice and Bob, who, after the particles have separated, perform measurements on spin components along one of three directions given by unit vectors $a$ and $b$. In the second stage Alice and Bob communicate over a public channel. They announce in public the orientation of the detectors they have chosen for particular measurements. Then they divide the measurement results into two separate groups: a first group for which they used different orientations of the detectors, and a second group for which they used the same orientation of the detectors. Now Alice and Bob can reveal publicly the results they obtained, but within the first group of measurements only. This allows them, by using Bell's inequality, to establish the presence of an eavesdropper (Eve). The results of the second group of measurements can be converted into a secret key. One supposes that Eve has a detector which is located within the region $O_E$ and that she is described by hidden variables $\lambda$.

We will interpret Eve as a hidden variable in a realist theory and will study whether the quantum correlation Eq. (3.2) can be represented in the form Eq. (2.3). From (2.5), (2.6) and (3.5) one can see that if the following inequality

$$g(O_A, O_B) \le 1/\sqrt{2} \qquad (4.1)$$

is valid for regions $O_A$ and $O_B$ which are well separated from one another, then there is no violation of the CHSH inequalities (2.5), and therefore Alice and Bob cannot detect the presence of an eavesdropper. On the other hand, if for a pair of well separated regions $O_A$ and $O_B$ one has

$$g(O_A, O_B) > 1/\sqrt{2}, \qquad (4.2)$$

then there could be a violation of the realist locality in these regions for a given state. Then, in principle, one can hope to detect an eavesdropper in these circumstances.
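The threshold $1/\sqrt{2}$ follows from scaling the maximal CHSH value $2\sqrt{2}$ of the pure spin correlation by the space factor $g$ and comparing with the bound 2. A small sketch, with hypothetical function names:

```python
import math

def max_chsh_with_space_factor(g):
    """Largest CHSH value of g * D_spin, i.e. 2*sqrt(2) scaled by g."""
    return 2.0 * math.sqrt(2.0) * g

def violation_possible(g):
    """True when condition (4.2) holds, i.e. g exceeds 1/sqrt(2)."""
    return g > 1.0 / math.sqrt(2.0)

if __name__ == "__main__":
    for g in (0.5, 0.7, 0.71, 1.0):
        print(g, max_chsh_with_space_factor(g), violation_possible(g))
```

Only for $g$ above roughly 0.707 does the scaled CHSH value exceed 2, which is the regime in which the text says an eavesdropper could in principle be detected.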

Note that if we set $g(O_A, O_B) = 1$ in (3.5), as was done in the original proof of Bell's theorem, then it means we did a special preparation of the states of the particles to be completely localized inside the detectors. There exist such well localized states (see, however, the previous Remark), but there also exist other states, with wave functions which are not very well localized inside the detectors, and still particles in such states are also observed in detectors. The fact that a particle is observed inside the detector does not mean, of course, that its wave function is strictly localized inside the detector before the measurement. Actually, one has to perform a thorough investigation of the preparation and the evolution of our entangled states in space and time if one needs to estimate the function $g(O_A, O_B)$.

5 Gaussian Wave Functions

Now let us consider the criterion of locality for Gaussian wave functions. We will show that, with reasonable accuracy, there is no violation of locality in this case. Let us take the wave function $\phi$ of the form $\phi = \psi_1(r_1)\psi_2(r_2)$, where the individual wave functions have the moduli

$$|\psi_1(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 r^2/2}, \qquad |\psi_2(r)|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 (r - l)^2/2}. \qquad (5.1)$$

We suppose that the length of the vector $l$ is much larger than $1/m$. We can make measurements of $P_{O_A}$ and $P_{O_B}$ for any well separated regions $O_A$ and $O_B$. Let us suppose a rather unfavorable case for the criterion of locality, when the wave functions of the particles are almost localized inside the regions $O_A$ and $O_B$ respectively. In such a case the function $g(O_A, O_B)$ can take values near its maximum. We suppose that the region $O_A$ is given by $|r_i| < 1/m$, $r = (r_1, r_2, r_3)$, and that the region $O_B$ is obtained from $O_A$ by translation by $l$. Hence $\psi_1(r_1)$ is a Gaussian function with modulus appreciably different from zero only in $O_A$, and similarly $\psi_2(r_2)$ is localized in the region $O_B$. Then we have

$$g(O_A, O_B) = \left(\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-x^2/2}\,dx\right)^{6}. \qquad (5.2)$$

One can estimate (5.2) as

$$g(O_A, O_B) < \left(\frac{2}{\pi}\right)^{3}, \qquad (5.3)$$

which is smaller than 1/2. Therefore the locality criterion (4.1) is satisfied in this case.
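The Gaussian estimate can be evaluated numerically. The sketch below assumes that the one-dimensional factor is $(1/\sqrt{2\pi})\int_{-1}^{1} e^{-x^2/2}\,dx = \mathrm{erf}(1/\sqrt{2})$ and that it is raised to the sixth power (three dimensions, two particles); the function names are hypothetical.

```python
import math

def g_gaussian_cube():
    """Probability that both Gaussian particles lie in their cubes |r_i| < 1/m."""
    one_dim = math.erf(1.0 / math.sqrt(2.0))  # one-dimensional Gaussian weight in [-1, 1]
    return one_dim ** 6

def g_upper_bound():
    """Bound obtained by replacing the integrand with its maximum value 1."""
    return (2.0 / math.pi) ** 3

if __name__ == "__main__":
    print(g_gaussian_cube(), g_upper_bound(), 1.0 / math.sqrt(2.0))
```

Both the value and its bound lie below 1/2, and hence below the threshold $1/\sqrt{2}$ of the locality criterion.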

Let us recall the well known effect of expansion of wave packets due to the free time evolution. If $\epsilon$ is the characteristic length of the Gaussian wave packet describing a particle of mass $M$ at time $t = 0$, then at time $t$ the characteristic length $\epsilon_t$ will be

$$\epsilon_t = \epsilon \sqrt{1 + \frac{\hbar^2 t^2}{M^2 \epsilon^4}}.$$

It tends to $(\hbar/M\epsilon)\,t$ as $t \to \infty$. Therefore the locality criterion is always satisfied for nonrelativistic particles if the regions $O_A$ and $O_B$ are far enough from each other. The case of relativistic particles will be considered in a separate publication.
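As an illustration of the spreading, the sketch below assumes the standard free Gaussian packet law $\epsilon_t = \epsilon\sqrt{1 + \hbar^2 t^2/(M^2\epsilon^4)}$, which has the asymptotic form quoted in the text. The function names and the electron example values are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s

def packet_width(eps0, mass, t, hbar=HBAR):
    """Width of a free Gaussian packet at time t (assumed standard formula)."""
    return eps0 * math.sqrt(1.0 + (hbar * t / (mass * eps0 ** 2)) ** 2)

def asymptotic_width(eps0, mass, t, hbar=HBAR):
    """Large-time limit (hbar / (M * eps0)) * t."""
    return hbar * t / (mass * eps0)

if __name__ == "__main__":
    m_e = 9.1093837e-31   # electron mass in kg
    eps0 = 1e-9           # initial width of 1 nm
    for t in (0.0, 1e-15, 1e-3):
        print(t, packet_width(eps0, m_e, t), asymptotic_width(eps0, m_e, t))
```

For an electron localized to a nanometre, the packet is already far in the linear regime after a millisecond of free flight.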

6 Conclusions

It is shown in this note that if we do not neglect the space part of the wave function of two particles, then the prediction of quantum mechanics can be consistent with Bell's inequalities. One can say that Einstein's local realism is restored in this case.

It would be interesting to investigate whether one can prepare a reasonable wave function for which the condition of nonlocality (4.2) is satisfied for a pair of well separated regions. In principle, the function $g(O_A, O_B)$ can approach its maximal value 1 if the wave functions of the particles are very well localized within the detector regions $O_A$ and $O_B$ respectively. However, perhaps to establish such a localization one has to destroy the original entanglement, because it was created far away from the detectors.

It is shown that the presence of the space part in the wave function of two particles in the entangled state leads to a problem in the proof of the security of quantum key distribution. To detect the eavesdropper's presence by using Bell's inequality we have to estimate the function $g(O_A, O_B)$. Only a special quantum key distribution protocol has been discussed here, but it seems there are similar problems in other quantum cryptographic schemes as well.

We do not claim in this note that it is in principle impossible to increase the detectability of the eavesdropper. However, it is not clear to the present author how to do it without a thorough investigation of the process of preparation of the entangled state and then its evolution in space and time towards Alice and Bob.

In the previous section Eve was interpreted as an abstract hidden variable. However, one can assume that more information about Eve is available. In particular, one can assume that she is located somewhere in space in a region $O_E$. It seems one has to study a generalization of the function $g(O_A, O_B)$ which depends not only on the Alice and Bob locations $O_A$ and $O_B$ but also on the Eve location $O_E$, and try to find a strategy which leads to an optimal value of this function.

7 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions. This work is supported in part also by RFFI 99-01-00105 and INTAS 99-0590.

References

1. J. S. Bell, Physics 1, 195 (1964).

2. A. Peres, Quantum Theory: Concepts and Methods, Kluwer, Dordrecht, 1993.

3. L. E. Ballentine, Quantum Mechanics, Prentice-Hall, 1990.

4. W. M. de Muynck, W. De Baere and H. Martens, Found. of Physics (1994) 1589.

5. D. M. Greenberger, M. A. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).

6. S. L. Braunstein, A. Mann and M. Revzen, Phys. Rev. Lett. 68, 3259 (1992).

7. N. D. Mermin, Am. J. Phys. 62, 880 (1994).

8. G. M. D'Ariano, L. Maccone, M. F. Sacchi and A. Garuccio, Tomographic test of Bell's inequality, quant-ph/9907091.

9. Luigi Accardi and Massimo Regoli, Locality and Bell's inequality, quant-ph/0007005.

10. Andrei Khrennikov, Non-Kolmogorov probability models and modified Bell's inequality, quant-ph/0003017.

11. Almut Beige, William J. Munro and Peter L. Knight, A Bell's Inequality Test with Entangled Atoms, quant-ph/0006054.

12. F. Benatti and R. Floreanini, On Bell's locality tests with neutral kaons, hep-ph/9812353.

13. A. Khrennikov, Statistical measure of ensemble nonreproducibility and correction to Bell's inequality, Nuovo Cimento 115B (2000) 179.

14. W. A. Hofer, Information transfer via the phase: A local model of Einstein-Podolsky-Rosen experiments, quant-ph/0006005.

15. Igor Volovich, Yaroslav Volovich, Bell's Theorem and Random Variables, quant-ph/0009058.

16. N. Gisin, V. Scarani, W. Tittel, H. Zbinden, Optical tests of quantum nonlocality: from EPR-Bell tests towards experiments with moving observers, quant-ph/0009055.

17. Igor V. Volovich, Bell's Theorem and Locality in Space, quant-ph/0012010.

18. C. H. Bennett and G. Brassard, in Proc. of the IEEE Int. Conf. on Computers, Systems and Signal Processing, Bangalore, India (IEEE, New York, 1984) p. 175.

19. A. K. Ekert, Phys. Rev. Lett. 67 (1991) 661.

20. D. S. Naik, C. G. Peterson, A. G. White, A. J. Berglund, P. G. Kwiat, Entangled state quantum cryptography: Eavesdropping on the Ekert protocol, quant-ph/9912105.

21. Gilles Brassard, Norbert Lutkenhaus, Tal Mor, Barry C. Sanders, Security Aspects of Practical Quantum Cryptography, quant-ph/9911054.

22. Kei Inoue, Takashi Matsuoka, Masanori Ohya, New approach to Epsilon-entropy and Its comparison with Kolmogorov's Epsilon-entropy, quant-ph/9806027.

23. Hoi-Kwong Lo, Will Quantum Cryptography ever become a successful technology in the marketplace?, quant-ph/9912011.

24. Akihisa Tomita, Osamu Hirota, Security of classical noise-based cryptography, quant-ph/0002044.

25. Yong-Sheng Zhang, Chuan-Feng Li, Guang-Can Guo, Quantum key distribution via quantum encryption, quant-ph/0011034.

INTERACTING STOCHASTIC PROCESS AND RENORMALIZATION THEORY

YAROSLAV VOLOVICH

Physics Department Moscow State University

Vorobievi Gori 119899Moscow Russia

E-mail yaroslav-Vmailru

A stochastic process with self-interaction as a model of quantum field theory is studied. We consider an Ornstein-Uhlenbeck stochastic process $x(t)$ with interaction of the form $(x^{(\alpha)}(t))^4$, where $\alpha$ indicates the fractional derivative. Using Bogoliubov's R-operation we investigate ultraviolet divergencies for the various parameters $\alpha$. Ultraviolet properties of this one-dimensional model in the case $\alpha = 3/4$ are similar to those in the $\varphi^4_4$ theory, but there are extra counterterms. It is shown that the model is two-loops renormalizable. For $5/8 \le \alpha < 3/4$ the model has a finite number of divergent Feynman diagrams. In the case $\alpha = 2/3$ the model is similar to the $\varphi^4_3$ theory. If $0 \le \alpha < 5/8$ then the model does not have ultraviolet divergencies at all. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

1 Introduction

There is a very fruitful interrelation between probability theory and quantum field theory [1-6]. In this note we consider a stochastic process that shows the same divergencies as quantum electrodynamics or $\varphi^4$ theory in the 4-dimensional spacetime. This stochastic process corresponds to one-dimensional Euclidean quantum field theory with the quartic interaction that contains fractional derivatives. This one-dimensional model can be used for studying the fundamental problem of non-perturbative investigation of renormalized quantum field theory [1, 3]. It can also find applications in the theory of phase transitions [5, 6].

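The Interacting Stochastic Process. Let $x(t) = x(t, \omega)$ be an Ornstein-Uhlenbeck stochastic process with the correlation function

$$E\,x(t)\,x(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{ip(t - \tau)}}{p^2 + m^2}\,dp = \frac{e^{-m|t - \tau|}}{2m},$$

where $m > 0$. There exists a spectral representation of the Ornstein-Uhlenbeck stochastic process [8]:

$$x(t, \omega) = \int e^{ikt}\,\zeta(dk, \omega),$$

where $\zeta(dk, \omega)$ is a stochastic measure. We define the fractional derivative $x^{(\alpha)}$ as

$$x^{(\alpha)}(t, \omega) = \int |k|^{\alpha} e^{ikt}\,\zeta(dk, \omega). \qquad (1.2)$$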

If $0 \le \alpha < 1/2$ then $x^{(\alpha)}(t)$ is a stochastic process. If $\alpha \ge 1/2$ then one needs a regularization described below. We will use distribution notations and write

$$\zeta(dk, \omega) = \tilde{x}(k, \omega)\,dk, \qquad \tilde{x}(k, \omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} x(t, \omega)\,e^{-ikt}\,dt.$$

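We want to give a meaning to the following correlation functions

$$K(t_1, \ldots, t_N) = \frac{E\left(x(t_1) \cdots x(t_N)\,e^{-\lambda U}\right)}{E\left(e^{-\lambda U}\right)} \qquad (1.3)$$

for all $N = 1, 2, \ldots$ Here

$$U = \int_{-\infty}^{\infty} :\!(x^{(\alpha)}(\tau))^4\!:\,g(\tau)\,d\tau, \qquad (1.4)$$

where $g(\tau)$ is a nonnegative test function with compact support (the volume cut-off), $x^{(\alpha)}(\tau)$ denotes the fractional derivative (1.2), $\lambda > 0$, and $:\!(x^{(\alpha)}(\tau))^4\!:$ is the Wick normal product. We will denote the expectation value as $E(A) = \langle A \rangle$. In this notation $\langle x(t)\,x(\tau) \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{ip(t - \tau)}}{p^2 + m^2}\,dp$.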

For the correlation function (1.3) one has the perturbative expansion

$$\langle x(t_1) \cdots x(t_N)\,e^{-\lambda U} \rangle = \sum_{n=0}^{\infty} \frac{(-\lambda)^n}{n!} \langle x(t_1) \cdots x(t_N)\,U^n \rangle. \qquad (1.5)$$

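If $\alpha \ge 5/8$ then the expectation value in (1.5) has no meaning, because there are ultraviolet divergencies. We have to introduce a cutoff stochastic process $x_\kappa(t)$ [3]:

$$x_\kappa(t, \omega) = \int_{-\kappa}^{\kappa} e^{ikt}\,\zeta(dk, \omega).$$

Instead of $U$ in (1.3) we put

$$U_\kappa = \int :\!(x_\kappa^{(\alpha)}(\tau))^4\!:\,g(\tau)\,d\tau,$$

where

$$x_\kappa^{(\alpha)}(t, \omega) = \int_{-\kappa}^{\kappa} |k|^{\alpha} e^{ikt}\,\zeta(dk, \omega).$$

(Footnote: Stochastic differential equations with fractional derivatives [7] are considered also on p-adic number fields.)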

The problem is to prove that after the renormalization there exists a limit of the correlation functions

$$\langle x(t_1) \cdots x(t_N)\,e^{-\lambda U_\kappa} \rangle_{ren}$$

as $\kappa \to \infty$ in each order of the perturbation expansion. We will consider this problem below by using the Bogoliubov-Parasiuk R-operation and the standard language of the Feynman diagrams.

In the momentum representation we obtain an expression of the form

$$\langle \tilde{x}(p_1) \cdots \tilde{x}(p_N)\,e^{-\lambda U} \rangle = \sum_{\Gamma} G_\Gamma(p_1, \ldots, p_N).$$

Here the sum runs over all Feynman diagrams $\Gamma$ with $N$ external legs that can be built up using 4-vertices corresponding to the $:\!(x^{(\alpha)})^4\!:$ term. Contributions from the connected diagrams with $n$ 4-vertices and $L$ internal lines have the form

$$\int \prod_{j=1}^{l} dk_j \prod_{j=1}^{L} \frac{|q_j|^{2\alpha}}{q_j^2 + m^2}, \qquad (1.6)$$

where $l = L - (n - 1)$, and the $q_i$ are linear combinations of the internal momenta $k_1, \ldots, k_l$ and external momenta $p_1, \ldots, p_N$.

The canonical degree $D(\Gamma)$ of a proper diagram is defined by the dimension of the corresponding Feynman integral with respect to the integration variables. Using (1.6) we have

$$D = D(\Gamma) = (2\alpha - 2)L + l = (2\alpha - 1)L - n + 1. \qquad (1.7)$$

If for a given diagram $D < 0$ then this diagram is superficially finite; otherwise it is divergent. Let us consider a proper diagram with $n$ vertices, $L$ internal lines and $E$ legs. We have the following relation:

$$4n = 2L + E. \qquad (1.8)$$

Note that for any nontrivial connected diagram

$$2n \ge L \ge n \ge 2, \qquad (1.9)$$

$$E \le 2n. \qquad (1.10)$$
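The power counting in (1.7)-(1.10) can be explored by brute force. The sketch below enumerates the integer triples $(n, L, E)$ allowed by (1.8) and (1.9) and lists those with $D \ge 0$; it does not check that a diagram with these numbers actually exists, and the function names are hypothetical.

```python
from fractions import Fraction

def degree(alpha, n, L):
    """Canonical degree D = (2*alpha - 1) * L - n + 1 of Eq. (1.7)."""
    return (2 * alpha - 1) * L - n + 1

def divergent_diagrams(alpha, n_max):
    """Triples (n, L, E) with 4n = 2L + E, n <= L <= 2n, n >= 2 and D >= 0."""
    found = []
    for n in range(2, n_max + 1):
        for L in range(n, 2 * n + 1):
            E = 4 * n - 2 * L
            if degree(alpha, n, L) >= 0:
                found.append((n, L, E))
    return found

if __name__ == "__main__":
    for alpha in (Fraction(1, 2), Fraction(5, 8), Fraction(11, 16), Fraction(3, 4)):
        print(alpha, divergent_diagrams(alpha, 8))
```

Exact rational arithmetic via `Fraction` avoids rounding at the boundary cases $D = 0$, which decide whether a diagram counts as divergent.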


Theorem. If $\alpha < 5/8$ then all Feynman diagrams of the interacting stochastic process are superficially finite. If $5/8 \le \alpha < 3/4$ then there exists a finite number of divergent diagrams; moreover, all divergent diagrams have only 0 or 2 legs. If $\alpha = 3/4$ then the model is renormalizable and all divergent diagrams have only 0, 2 or 4 external lines. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

Proof. Let us prove the first statement of the theorem, i.e. if $\alpha < 5/8$ then $D < 0$ for any $n \ge 2$. Using (1.7) and (1.9) we have

$$D\big|_{\alpha < 5/8} < \frac{L}{4} - n + 1 = \frac{L - 4n + 4}{4} \le \frac{2n - 4n + 4}{4} = \frac{2 - n}{2} \le 0. \qquad (1.11)$$

From (1.11) it follows that $D < 0$ for any $\alpha < 5/8$. Let us consider $\alpha = 5/8$. Similarly to (1.11), from (1.7) we have

$$D\big|_{\alpha = 5/8} = \frac{L - 4n + 4}{4} \le \frac{2 - n}{2} \le 0. \qquad (1.12)$$

Therefore only two-point (n = 2) diagram could be divergent (in this case D = 0) Rewriting (112) in the form

D A-(E + L)

alt58 (113)

Prom (113) it follows that only diagram with E = 0 L mdash A n = 2 is divergent In the case when 58 lt a lt 34 we can write

$a = \frac{3}{4} - \varepsilon$ (1.14)

where $0 < \varepsilon < 1/8$. Substituting (1.14) into (1.7) and using (1.9) we have

$D|_{a=3/4-\varepsilon} = \frac{L}{2} - 2L\varepsilon - n + 1 \le \frac{2n}{2} - 2n\varepsilon - n + 1 = 1 - 2n\varepsilon$ (1.15)

Thus for any given $\varepsilon > 0$ (and therefore any $a < 3/4$) there exists a number $N$ such that for any $n > N$ the canonical dimension $D < 0$. Hence there exists only a finite number of divergent diagrams. Rewriting (1.15) with the help of (1.8) in the form

$D|_{a=3/4-\varepsilon} = -2L\varepsilon + \frac{4 - E}{4}$

it follows that $D \ge 0$ only if $E < 4$, i.e. $E = 0$ or $E = 2$, and the model is super-renormalizable.

Let us consider the case $a = 3/4$. Using (1.8) and (1.7) we have

$D|_{a=3/4} = 1 - \frac{E}{4}$ (1.16)

The equality (1.16) means that all divergent diagrams have only 0, 2 or 4 legs and the model is renormalizable.

Finally, if $a > 3/4$ we have

D = - - n + l = gt ^ gt 0 (117) agt34 2 1 2

Therefore, if $a > 3/4$ then all proper diagrams are divergent. ∎

Examples of application of this theorem can be found in Ref. 9.
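The power counting used in the proof can be explored numerically. The following is a minimal sketch, not part of the original paper: it assumes only the formula $D = (2a - 1)L - n + 1$, the relation $4n = 2L + E$ and the bounds $2n \ge L \ge n \ge 2$ quoted above. The function names `canonical_degree` and `divergent_diagrams` are illustrative, and exact rational arithmetic is a choice made here to avoid rounding at the boundary $D = 0$.

```python
from fractions import Fraction


def canonical_degree(a, n, E):
    """Canonical degree D = (2a - 1) L - n + 1 for a diagram with
    n 4-vertices and E external legs; L follows from 4n = 2L + E."""
    if (4 * n - E) % 2 != 0:
        raise ValueError("4n - E must be even, since 4n = 2L + E")
    L = (4 * n - E) // 2
    return (2 * a - 1) * L - n + 1


def divergent_diagrams(a, max_n):
    """Return all (n, E, D) with D >= 0 for 2 <= n <= max_n.

    E runs over even values from 0 to 2n, which is equivalent to
    the bound n <= L <= 2n on the number of internal lines.
    """
    found = []
    for n in range(2, max_n + 1):
        for E in range(0, 2 * n + 1, 2):
            D = canonical_degree(a, n, E)
            if D >= 0:
                found.append((n, E, D))
    return found


if __name__ == "__main__":
    # Illustrative values of a, one for each regime of the theorem.
    for a in (Fraction(1, 2), Fraction(5, 8), Fraction(2, 3),
              Fraction(3, 4), Fraction(7, 8)):
        print(a, divergent_diagrams(a, max_n=4))
```

Under these assumptions the enumeration returns no divergent diagrams for $a = 1/2$, a single one ($n = 2$, $E = 0$) for $a = 5/8$, only diagrams with $E \le 2$ for $a = 2/3$, and only diagrams with $E \le 4$ for $a = 3/4$. For $a = 7/8$ it also returns diagrams with more than four legs.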

2 Acknowledgments

This investigation was supported by a grant of the Swedish Royal Academy of Sciences for collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. N. N. Bogoliubov and D. V. Shirkov, Introduction to the Theory of Quantum Fields, Nauka, Moscow, 1973.
2. T. Hida, Brownian Motion, Springer-Verlag, 1980.
3. J. Glimm and A. Jaffe, Quantum Physics: A Functional Integral Point of View, Springer-Verlag, 1987.
4. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise: An Infinite Dimensional Calculus, Kluwer Academic, 1993.
5. J. Kogut and K. Wilson, Phys. Reports 12C, p. 75, 1974.
6. A. Z. Patashinski and V. L. Pokrovski, The Fluctuational Theory of Phase Transitions, Nauka, Moscow, 1975.
7. V. S. Vladimirov, Generalized functions over the field of p-adic numbers, Russian Math. Surveys 43:5 (1988).
8. I. I. Gihman and A. V. Skorohod, Introduction to the Theory of Random Processes, Nauka, Moscow, 1977.
9. Ya. I. Volovich, Interacting stochastic process and renormalization theory, quant-ph/0008063.

ISBN 981-02-4846-6

www.worldscientific.com

  • Foreword
  • Contents
  • Preface
  • Locality and Bell's Inequality
    • 1 Inequalities among numbers
    • 2 The Bell inequality
    • 3 Implications of the Bell's inequalities for the singlet correlations
    • 4 Bell on the meaning of Bell's inequality
    • 5 Critique of Bell's vital assumption
    • 6 The role of the counterfactual argument in Bell's proof
    • 7 Proofs of Bell's inequality based on counting arguments
    • 8 The quantum probabilistic analysis
    • 9 The realism of ballot boxes and the corresponding statistics
    • 10 The realism of chameleons and the corresponding statistics
    • 11 Bell's inequalities and the chameleon effect
    • 12 Physical implausibility of Bell's argument
    • 13 The role of the single probability space in CHSH's proof
    • 14 The role of the counterfactual argument in CHSH's proof
    • 15 Physical difference between the CHSH's and the original Bell's inequalities
    • References
  • Refutation of Bell's Theorem
    • 1 Introduction
    • 2 The EPRB gedanken experiment
    • 3 The CHSH function
    • 4 Strongly objective interpretation
    • 5 Weakly objective interpretation
    • 6 Conclusion
    • References
  • Probability Conservation and the State Determination Problem
    • 1 Introduction
    • 2 Conservation of Probability
    • 3 Determination of the phase function
    • 4 Validity and range of applicability
    • 5 Evolution of a Gaussian Wave Packet
    • 6 Operational Issues
    • Acknowledgments
    • References
  • Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws
    • 1 Introduction
    • 2 Ontic and epistemic descriptions
    • 3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility
    • 4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility
    • 5 Summary and Open Questions
    • Acknowledgments
    • References
  • Interpretations of Probability and Quantum Theory
    • 1 Introduction
    • 2 Interpretations of Probability
    • 3 The Axioms of Probability
    • 4 Probability in Quantum Mechanics
    • 5 Conclusions
    • References
  • Forcing Discretization and Determination in Quantum History Theories
    • 1 Introduction
    • 2 Outcome determination via contextual models
    • 3 Unitary ortho- and projective structure
    • 4 Representing quantum history theory
    • 5 Further discussion
    • Acknowledgments
    • References
  • Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality
    • 1 Realist and empiricist interpretations of quantum mechanics
    • 2 EPR experiments and Bell experiments
    • 3 Bell's inequality in quantum mechanics
    • 4 Bell's inequality in stochastic and deterministic hidden-variables theories
    • 5 Analogy between thermodynamics and quantum mechanics
    • 6 Conclusions
    • References
  • Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble
    • 1 Introduction
    • 2 The Ginibre ensembles
    • Acknowledgements
    • References
  • Some Remarks on Hardy Functions Associated with Dirichlet Series
    • 1 Introduction
    • 2 Hardyfication of Dirichlet series
    • 3 Factorization of n
    • 4 Applications
    • References
  • Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit
    • 1 Introduction
    • 2 There is a lot to add to classical equilibrium statistics from our experience with Small systems
    • 3 Relation of the topology of S(E N) to the Yang-Lee zeros of Z(T u V)
    • 4 The regions of positive curvature A1 of s(es ns) correspond to phase transitions of first order
    • 5 Boltzmann's principle and non-equilibrium thermodynamics
    • 6 Macroscopic observables imply the EPS-probability
    • 7 On Einstein's objections against the EPS-probability
    • 8 Fractal distributions in phase space, Second Law
    • 9 Conclusion
    • Appendix
    • Acknowledgement
    • References
  • An Approach to Quantum Probability
    • 1 Introduction
    • 2 Formulation
    • 3 Wave Functions and Hilbert Space
    • 4 Spin
    • 5 Traditional Quantum Mechanics
    • 6 Concluding Remarks
    • References
  • Innovation Approach to Stochastic Processes and Quantum Dynamics
    • 1 Introduction
    • 2 Review of defining a stochastic process and white noise analysis
    • 3 Relations to Quantum Dynamics
    • 4 Addenda to foundations of the theories: Concluding remarks
    • Acknowledgements
    • References
  • Statistics and Ergodicity of Wave Functions in Chaotic Open Systems
    • 1 Introduction
    • 2 Classical Nonergodicity and Short-Path Dynamics
    • 3 Universal Description of Wave Function Statistics
    • 4 Numerical Analyses and Discussions
    • 5 Conclusions
    • Acknowledgments
    • References
  • Origin of Quantum Probabilities
    • 1 Introduction
    • 2 Quantum formalism and perturbation effects
    • 3 Probability transformations connecting preparation procedures
    • 3 Hyperbolic and hyper-trigonometric probabilistic transformations
    • 4 Double stochasticity and correlations between preparation procedures
    • 5 Hyperbolic quantum formalism
    • 6 Physical consequences
    • Acknowledgements
    • References
  • Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies
    • 1 Introduction
    • 2 Analysis of the foundation of probability theory
    • 3 General principle of statistical stabilization of relative frequencies
    • 4 Probability distribution of a collective
    • 5 Model examples of p-adic statistics
    • Acknowledgements
    • References
  • Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?
    • 1 Introduction
    • 2 De Broglie waves as an SED effect
    • 3 Schrödinger Equation
    • 4 Conclusions
  • A Probabilistic Inequality for the Kochen-Specker Paradox
    • 1 Introduction
    • 2 The Kochen-Specker theorem
    • 3 The Kochen-Specker inequality
    • 4 Independence
    • 5 Conclusions
  • Quantum Stochastics: The New Approach to the Description of Quantum Measurements
    • 1 Introduction
    • 2 Quantum stochastic approach
    • 3 Concluding remarks
    • 4 Acknowledgments
    • References
  • Abstract Models of Probability
    • 1 What probability sets o are possible
    • 2 Uniqueness of semigroups of zeros and units
    • 3 Probabilities with hidden parameters
    • 4 Probability sets with a single unit
    • 5 Acknowledgments
    • References
  • Quantum K-Systems and their Abelian Models
    • 1 Introduction
    • 2 Classical K-System
    • 3 Algebraic Quantum K-Systems
    • 4 Dynamical Entropy
    • 5 Some General Considerations on Abelian Models
    • 6 Abelian Models for Algebraic K-Systems
    • 7 Continuous K-Systems
    • 8 Mixing Properties Without Algebraic K-Property
    • 9 Time Evolution
    • References
  • Scattering in Quantum Tubes
    • 1 Introduction
    • 2 Tubes in quantum heterostructures
    • 3 Mathematical model
    • 4 Reformulated scattering problem
    • 5 Solution of the scattering problem
    • References
  • Position Eigenstates and the Statistical Axiom of Quantum Mechanics
    • 1 Quantum probabilities according to Deutsch
    • 2 Schrödinger's equation for a free particle as a consequence of position eigenstates
    • 3 Driven particle Weyl equation in general space-time
    • 4 Realizing Deutsch's substitution as a time evolution
    • 5 Can normalization be replaced by symmetry
    • References
  • Is Random Event the Core Question? Some Remarks and a Proposal
    • 1 Preface
    • 2 Linguistic Model
    • 3 Ensemble Model
    • 4 Structural Model
    • 5 Certain and Uncertain Structures
    • 6 Probability
    • 7 Experimental Verification
    • 8 Objective and Subjective Probability
    • 9 Conclusions
    • References
  • Constructive Foundations of Randomness
    • 1 Introduction
    • 2 Kolmogorov Complexity
    • 3 Incompressibility
    • 4 Reversible Complexity
    • 5 Complexity and Information
    • 6 Frequency Rates
    • 7 Prefix Complexity
    • 8 Universal Probability
    • 9 Sequentially Coding Algorithms
    • References
  • Structure of Probabilistic Information and Quantum Laws
    • 1 Introduction
    • 2 Gaining experimental information
    • 3 Efficient representation of probabilistic information
    • 4 Predictions
    • 5 Discussion
    • Acknowledgments
    • References
  • Quantum Cryptography in Space and Bell's Theorem
    • 1 Introduction
    • 2 Bell's Inequality
    • 3 Localized Detectors
    • 4 Quantum Key Distribution
    • 5 Gaussian Wave Functions
    • 6 Conclusions
    • 7 Acknowledgments
    • References
  • Interacting Stochastic Process and Renormalization Theory
    • 1 Introduction
    • 2 Acknowledgments
    • References

Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4846-6

Printed in Singapore by World Scientific Printers (S) Pte Ltd


Foreword

With the present proceedings of a conference on Foundations of Probability and Physics we continue the QP series, the first volume of which appeared more than twenty years ago. The series had its origin in proceedings of conferences and workshops on quantum probability and related topics. Initially published by Springer-Verlag, the series has now been published by World Scientific for about ten years. Much has changed in the world of quantum probability in the last two decades. Quantum probabilistic methods have become a mature subject in mathematics and mathematical physics. The number of well-established scientists who have turned their scientific interest to the field of quantum probability is increasing impressively. Scientifically and numerically strong schools of quantum probability have evolved in the past years. Moreover, the highly interdisciplinary character of quantum probability has become more and more evident. In particular, the close connections to white noise analysis have aroused the interest of classical and quantum probabilists and stimulated mutual exchange and cooperation fruitful for both parties.

Taking this development into account, during the previous QP conferences we discussed comprehensively and in detail the future profile and main goals of the series. Some changes in the alignment and the objectives of the series resulted from these discussions. First of all, the new title reflects the intention to unify white noise analysis and quantum probability. It is important and essential to bring together classical and quantum probabilists, and the success of the World Scientific journal Infinite Dimensional Analysis, Quantum Probability and Related Topics shows that such an alliance will benefit both parties. Furthermore, we should be open to a wide audience of scientists and to a broad spectrum of themes. The present volume represents such a field, being not very closely connected to quantum probability and white noise analysis but of general interest to the readership of the series.

Future volumes of the series will include proceedings of conferences and workshops and lecture notes of schools, but also monographs on topics in quantum probability and white noise analysis.

Finally, we would like to thank all former editors of the series for the excellent job they did. We especially appreciate the enthusiastic commitment of Luigi Accardi, who initiated the series and was the responsible editor for many years.

Wolfgang Freudenberg


Contents

Foreword  v

Preface  xi

Locality and Bell's Inequality  1
  L. Accardi and M. Regoli

Refutation of Bell's Theorem  29
  G. Adenier

Probability Conservation and the State Determination Problem  39
  S. Aerts

Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws  50
  H. Atmanspacher, R. C. Bishop and A. Amann

Interpretations of Probability and Quantum Theory  71
  L. E. Ballentine

Forcing Discretization and Determination in Quantum History Theories  85
  B. Coecke

Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality  95
  W. M. De Muynck

Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble  115
  M. M. Duras

Some Remarks on Hardy Functions Associated with Dirichlet Series  121
  W. Ehm

Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit  131
  D. H. E. Gross

An Approach to Quantum Probability  147
  S. Gudder

Innovation Approach to Stochastic Processes and Quantum Dynamics  161
  T. Hida

Statistics and Ergodicity of Wave Functions in Chaotic Open Systems  170
  H. Ishio

Origin of Quantum Probabilities  180
  A. Khrennikov

Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies  201
  A. Khrennikov

Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?  219
  A. F. Kracklauer

A Probabilistic Inequality for the Kochen-Specker Paradox  236
  J.-A. Larsson

Quantum Stochastics: The New Approach to the Description of Quantum Measurements  246
  E. Loubenets

Abstract Models of Probability  257
  V. M. Maximov

Quantum K-Systems and their Abelian Models  274
  H. Narnhofer

Scattering in Quantum Tubes  303
  B. Nilsson

Position Eigenstates and the Statistical Axiom of Quantum Mechanics  314
  L. Polley

Is Random Event the Core Question? Some Remarks and a Proposal  321
  P. Rocchi

Constructive Foundations of Randomness  335
  V. I. Serdobolskii

Structure of Probabilistic Information and Quantum Laws  350
  J. Summhammer

Quantum Cryptography in Space and Bell's Theorem  364
  Volovich

Interacting Stochastic Process and Renormalization Theory  373
  Y. Volovich


Preface

This volume constitutes the proceedings of the Conference Foundations of Probability and Physics, held in Vaxjo (Smoland, Sweden) from 25 November to 1 December 2000.

The Organizing Committee of the Conference: L. Accardi (Rome, Italy), W. De Muynck (Eindhoven, the Netherlands), T. Hida (Meijo University, Japan), A. Khrennikov (Vaxjo University, Sweden) and U. V. Maximov (Belostok, Poland).

The purpose of the Conference (tentatively the first of a series) was to bring together scientists (physicists as well as mathematicians) who are interested in the probabilistic foundations of physics. Emphasis was placed on both theory and experiment, the underlying objective being to offer the physical and mathematical scientific communities a truly interdisciplinary Conference as a privileged place for scientific interaction among theoreticians and experimentalists. Due to the increased role of probabilistic foundations in physical applications (Einstein-Podolsky-Rosen correlation experiments, Bell's inequality, quantum information, computing and teleportation), as well as the necessity to reconsider foundations at the beginning of the new millennium, the organizers of the Conference decided that it was just the right time for taking the scientific risk of trying this.

Since the creation of statistical mechanics, probabilistic description has played a more and more important role in physics. A new crucial step in the development of the statistical approach to physics was made in the process of the creation of quantum mechanics. The founders of quantum theory recognized that the quantum formalism could not provide a description of physical processes for individual elementary particles. The understanding of this surprising fact induced numerous debates on the possibilities of individual and probabilistic descriptions and the relations between them. These debates are characterized by a large diversity of opinions on the origin of quantum stochasticity.

One of the viewpoints is that quantum stochasticity differs from classical stochasticity, so quantum (statistical) mechanics could not be reduced to classical statistical mechanics. This viewpoint implies the conventional interpretation of quantum mechanics.

By this interpretation we could not use objective realism in the quantum description of reality. The very fundamental physical quantities, such as, for example, the position and momentum of an elementary particle, could not be considered as properties of the object, the elementary particle. The elementary particle can be in a state that is a superposition of alternatives. Only the act of a measurement gives the possibility to choose between these alternatives.


We recall the historical roots of the origin of such a viewpoint, namely the idea of superposition.

In fact, the whole quantum building was built on two experimental cornerstones: 1) the experiment on photoelectric emission; 2) the two-slit experiment.

The first experiment definitely demonstrated that light has a corpuscular structure (discrete structure of energy).

However, the second experiment demonstrated that photons (corpuscular objects) do not follow the standard CLASSICAL STATISTICS. The conventional rule for the addition of probabilistic alternatives

$$P = P_1 + P_2$$

is violated in the interference experiments. Instead of this rule, the probabilities observed in interference experiments follow the quantum rule for the addition of probabilistic alternatives:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta$$

Thus, in general, the classical rule is perturbed by the $\cos\theta$-factor.

The appearance of NEW STATISTICS induced a revolution in theoretical physics: a reconsideration of the role of all basic elements of the physical theory. The common opinion was (and is) that the quantum probabilistic rule cannot be explained by a purely corpuscular model. To explain this rule we must appeal to wave arguments (see, for example, Dirac's book for a detailed analysis of the roots of the quantum mechanical formalism).
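
As a small numerical illustration of the two addition rules (a sketch only: the function names and the value 0.25 for the single-slit probabilities are hypothetical choices, not taken from any experiment discussed here), one may compare the classical and the interference results for a few phase angles:

```python
import math


def classical_rule(p1: float, p2: float) -> float:
    """Classical addition of probabilistic alternatives: P = P1 + P2."""
    return p1 + p2


def interference_rule(p1: float, p2: float, theta: float) -> float:
    """Quantum rule: P = P1 + P2 + 2*sqrt(P1*P2)*cos(theta)."""
    return p1 + p2 + 2.0 * math.sqrt(p1 * p2) * math.cos(theta)


if __name__ == "__main__":
    p1 = p2 = 0.25  # hypothetical single-slit probabilities (illustration only)
    for theta in (0.0, math.pi / 2, math.pi):
        print(f"theta={theta:.3f}  classical={classical_rule(p1, p2):.3f}  "
              f"quantum={interference_rule(p1, p2, theta):.3f}")
```

The two rules agree only when the cosine term vanishes; for the other phases the interference term raises or lowers the classical sum.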

This implies the wave-particle dualism and Bohr's principle of complementarity. This was a crucial change of the whole picture of physical reality (at least at the micro-level).

We underline again that all these revolutionary changes had a purely probabilistic root, namely the appearance of the new probabilistic rule. We also underline that the founders of quantum mechanics in fact did not provide a deep probabilistic analysis of the problem. Instead, they analysed other elements of the physical model, and such an analysis induced the new description of physical reality that we have already discussed, namely quantum reality. We will never know the real reasons for such a development of the theoretical study of the results of experiments with elementary particles at the beginning of the last century.

(a) Of course, we must also mention that the necessity for a departure from classical mechanics was shown by experiments demonstrating the remarkable stability of atoms and molecules. The forces known in classical electrodynamics are inadequate for the explanation of this phenomenon. However, the quantum mechanical explanation of such stability is in fact based on the same arguments as the explanation of the photoelectric effect.

(b) P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).

It might be that one of the reasons was the absence of a mathematical theory of probability: A. N. Kolmogorov proposed the modern axiomatics of probability theory only in 1933.

During the round table at this conference, Prof. T. Hida and Prof. I. Volovich pointed out the fundamental role of direct contacts between physicists and mathematicians in the creation of new physical theories. It may be that the absence of direct collaboration between the quantum physical and probabilistic communities was the main root of the absence of a deep probabilistic analysis of quantum behaviour.

Debates on the foundations of quantum mechanics were continued with new excitement in connection with the Einstein-Podolsky-Rosen (EPR) paradox. Unfortunately, the probabilistic element played a minor role in the EPR considerations: the notion of probability one was used (in a rather formal way) in the formulation of the sufficient condition to be an element of physical reality. A new probabilistic impulse to the debates on the foundations of quantum mechanics was given by Bell's inequality. However, we must recognize that Bell's probabilistic considerations were performed on a formal level that cannot be considered satisfactory (at least from the point of view of a mathematician). It may be that this absence of a deep probabilistic analysis of the EPR and Bell arguments was one of the main reasons why investigations concentrated in the direction of nonlocality and no-go theorems for hidden variables.

The main aim of the conference Foundations of Probability and Physics was to provide a probabilistic analysis of the foundations of physics, classical as well as quantum (in particular, of the EPR and Bell arguments). The present volume contains results of such analysis. It gives a general picture of the probabilistic foundations of modern physics. Foundations of probability were considered in close connection with foundations of physics. We demonstrated that probability plays a fundamental role in models of physical reality. It seems to be impossible to split probabilistic and physical problems. On the one hand, many important problems that look purely physical are in fact just probabilistic problems. On the other hand, the right meaning of probability can be found only on the basis of physical investigations. Such a meaning depends strongly on a physical model.

The conference and the present volume give a good example of fruitful collaboration between physicists and mathematicians and stimulate research on the foundations of probability and physics, especially quantum physics.

We would like to thank the Swedish Natural Science Foundation, the Swedish Technical Science Foundation, Växjö University and Växjö Commune for the financial support that made the Conference possible. We would also like to thank Prof. Magnus Söderström, the Rector of Växjö University, for support of fundamental investigations and, in particular, of this Conference.

Andrei Khrennikov
International Center for Mathematical Modelling in Physics and Cognitive Sciences
University of Växjö, Sweden
December 2000

LOCALITY AND BELL'S INEQUALITY

LUIGI ACCARDI, MASSIMO REGOLI
Centro Vito Volterra, Università di Roma Tor Vergata, Roma, Italy
Email: accardi@volterra.mat.uniroma2.it

We prove that the locality condition is irrelevant to the Bell inequality. We check that the real origin of Bell's inequality is the assumption of applicability of classical (Kolmogorovian) probability theory to quantum mechanics. We describe the chameleon effect, which allows one to construct an experiment realizing a local, realistic, classical, deterministic and macroscopic violation of the Bell inequalities.

1 Inequalities among numbers

In this section we summarize some elementary inequalities among numbers which correspond to the different forms of the Bell inequality one meets in the literature. Since some confusion has arisen about the mutual relationships among these inequalities, in particular their (in)equivalence and the cases of equality, such a summary might not be totally useless.

Lemma (1). For any two numbers $a, c \in [-1,1]$ the following equivalent inequalities hold:

$$|a \pm c| \le 1 \pm ac \tag{1}$$

Moreover, equality in (1) holds if and only if either $a = \pm 1$ or $c = \pm 1$.

Proof. The equivalence of the two inequalities (1) follows from the fact that one is obtained from the other by changing the sign of $c$, and $c$ is arbitrary in $[-1,1]$.

Since for any $a, c \in [-1,1]$ one has $1 \pm ac \ge 0$, (1) is equivalent to

$$|a \pm c|^2 = a^2 + c^2 \pm 2ac \le (1 \pm ac)^2 = 1 + a^2c^2 \pm 2ac$$

and this is equivalent to

$$a^2(1 - c^2) + c^2 \le 1$$

which is identically satisfied because $1 - c^2 \ge 0$ and therefore

$$a^2(1 - c^2) + c^2 \le 1 - c^2 + c^2 = 1 \tag{2}$$

Notice that in (2) equality holds if and only if $a^2 = 1$, i.e. $a = \pm 1$, or $1 - c^2 = 0$, i.e. $c = \pm 1$, in agreement with the fact that exchanging $a$ and $c$ in (1) leaves the inequality unchanged. The thesis follows.
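
The statement of Lemma (1), including the characterization of the equality cases, can be checked exhaustively on a finite grid. The following sketch uses exact rational arithmetic; the grid spacing 1/10 and all names are our own illustrative choices.

```python
from fractions import Fraction
from itertools import product


def lemma1_holds(a, c):
    """Both inequalities of (1): |a + c| <= 1 + ac and |a - c| <= 1 - ac."""
    return abs(a + c) <= 1 + a * c and abs(a - c) <= 1 - a * c


def lemma1_equality(a, c):
    """True when both inequalities of (1) hold as equalities."""
    return abs(a + c) == 1 + a * c and abs(a - c) == 1 - a * c


# Exact grid on [-1, 1] with step 1/10 (an arbitrary illustrative choice).
GRID = [Fraction(k, 10) for k in range(-10, 11)]

all_hold = all(lemma1_holds(a, c) for a, c in product(GRID, GRID))
equality_cases = {(a, c) for a, c in product(GRID, GRID) if lemma1_equality(a, c)}
boundary_cases = {(a, c) for a, c in product(GRID, GRID)
                  if abs(a) == 1 or abs(c) == 1}
```

On this grid `all_hold` is true and `equality_cases` coincides with `boundary_cases`, as the Lemma predicts.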

Corollary (2). For any three numbers $a, b, c \in [-1,1]$ the following equivalent inequalities hold:

$$|ab \pm cb| \le 1 \pm ac \tag{3}$$

and equality holds if and only if $b = \pm 1$ and either $a = \pm 1$ or $c = \pm 1$.

Proof. For $b \in [-1,1]$,

$$|ab \pm cb| = |b| \cdot |a \pm c| \le |a \pm c| \tag{4}$$

so the thesis follows from Lemma (1). In (4) equality holds if and only if $b = \pm 1$, so the second statement also follows from Lemma (1).

Lemma (3). For any numbers $a, a', b, b', c \in [-1,1]$ one has

$$|ab - bc| + |ab' + b'c| \le 2 \tag{5}$$

$$ab + ab' + a'b' - a'b \le 2 \tag{6}$$

In (5) equality holds if and only if $b, b' = \pm 1$ and either $a$ or $c = \pm 1$.

Proof. Adding the two inequalities in (3), with $b$ replaced by $b'$ in the second one, one finds (5). The left hand side of (6) is less than or equal to

$$|ab - ba'| + |ab' + b'a'| \tag{7}$$

and, replacing $a'$ by $c$, (7) becomes the left hand side of (5). Therefore (6) holds. If $b, b' = \pm 1$ and either $a$ or $c = \pm 1$, equality holds in (3), hence in (5). Conversely, suppose that equality holds in (5) and suppose that either $|b| < 1$ or $|b'| < 1$. Then, writing $a'$ for $c$ as in (7), we arrive at the contradiction

$$2 = |b| \cdot |a - a'| + |b'| \cdot |a + a'| < |a - a'| + |a + a'| \le (1 - aa') + (1 + aa') = 2 \tag{8}$$

So, if equality holds in (5), we must have $|b| = |b'| = 1$. In this case (5) becomes

$$|a - a'| + |a + a'| = 2 \tag{9}$$

and we know from Lemma (1) that the identity (9) can take place if and only if either $a$ or $a' = \pm 1$.
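
The bounds (5) and (6) can also be checked by brute force. The sketch below evaluates both left-hand sides on a coarse exact grid (the grid and the function names are illustrative assumptions) and records the largest value reached.

```python
from fractions import Fraction
from itertools import product

# Coarse exact grid on [-1, 1]: -1, -1/2, 0, 1/2, 1 (illustrative choice).
GRID = [Fraction(k, 2) for k in range(-2, 3)]


def lhs_5(a, b, b_prime, c):
    """Left-hand side of (5): |ab - bc| + |ab' + b'c|."""
    return abs(a * b - b * c) + abs(a * b_prime + b_prime * c)


def lhs_6(a, a_prime, b, b_prime):
    """Left-hand side of (6): ab + ab' + a'b' - a'b."""
    return a * b + a * b_prime + a_prime * b_prime - a_prime * b


max_5 = max(lhs_5(*t) for t in product(GRID, repeat=4))
max_6 = max(lhs_6(*t) for t in product(GRID, repeat=4))
```

Both maxima equal 2 on this grid, so the bound is attained but never exceeded.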

Corollary (4). If $a, a', b, b', c \in \{-1, 1\}$ then the inequalities (3), (6) and (5) are equivalent and equality holds in all of them.

Proof. From Lemma (1) we know that the inequalities (1) and (2) are equivalent. From Lemma (3) we know that (3) implies (5). Choosing $b' = a$ in (5), since $a = \pm 1$, (5) becomes

$$|ab - cb| \le 1 - ac$$

which is (3). The left hand side of (6) is

$$a(b + b') + a'(b' - b) \tag{10}$$

In our assumptions either $(b + b')$ or $(b' - b)$ is zero, so (10) is equal, in absolute value, either to

$$|a(b + b')| = |b + b'| = 2$$

or to

$$|a'(b' - b)| = |b' - b| = 2$$
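
Since only finitely many sign choices are involved, the equality cases of Corollary (4) can be enumerated directly. In the sketch below (names are ours) equality in (6) is tested in absolute value, because expression (10) takes both values $+2$ and $-2$ over the 32 sign choices.

```python
from itertools import product

SIGNS = (-1, 1)


def check_corollary_4():
    """Enumerate all sign choices and test equality in (3), (5), (6)."""
    for a, a_prime, b, b_prime, c in product(SIGNS, repeat=5):
        # (3): equality for both choices of sign.
        if abs(a * b + c * b) != 1 + a * c or abs(a * b - c * b) != 1 - a * c:
            return False
        # (5): the left-hand side is exactly 2.
        if abs(a * b - b * c) + abs(a * b_prime + b_prime * c) != 2:
            return False
        # (6): expression (10) has absolute value 2.
        if abs(a * b + a * b_prime + a_prime * b_prime - a_prime * b) != 2:
            return False
    return True


values_of_10 = {a * (b + bp) + ap * (bp - b)
                for a, ap, b, bp in product(SIGNS, repeat=4)}
```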

Corollary (5). If $a, b, c \in (-1,1)$ then the inequality (5), hence a fortiori (6), is strictly weaker than (3).

Proof. We have already proved that (3) implies (5), hence (6). On the other hand, (5) is equivalent to

$$|ab - bc| \le (1 - ac) + (1 + ac - |ab' + b'c|) \tag{11}$$

By Lemma (1), $1 + ac - |ab' + b'c| \ge 0$ and equality holds if and only if $|b'| = 1$ and either $a$ or $c$ is $\pm 1$. From this the thesis follows.

2 The Bell inequality

Corollary (1) (Bell inequality). Let $A, B, C, D$ be random variables defined on the same probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1,1]$. Then the following inequalities hold:

$$|E(AB - BC)| \le 1 - E(AC) \tag{1}$$

$$|E(AB + BC)| \le 1 + E(AC) \tag{2}$$

$$|E(AB - BC)| + |E(AD + DC)| \le 2 \tag{3}$$

where $E$ denotes the expectation value in the probability space of the four variables. Moreover, (1) is equivalent to (2) and, if either $A$ or $C$ has values $\pm 1$, then the three inequalities are equivalent.

Proof. Lemma (1.1) implies the following inequalities (interpreted pointwise on $\Omega$):

$$|AB - BC| \le 1 - AC$$

$$|AB + BC| \le 1 + AC$$

$$|AB - BC| + |AD + DC| \le 2$$

from which (1), (2), (3) follow by taking expectation and using the fact that $|E(X)| \le E(|X|)$. The equivalence is established by the same arguments as in Lemma (1.1).
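
The role of the common probability space can be made concrete with a finite model. The following sketch (all names, the number of sample points and the random seed are our own assumptions) draws random weights and random values in $[-1,1]$ for $A, B, C, D$ on one finite sample space and measures by how much each of (1), (2), (3) is satisfied.

```python
import random


def expectation(weights, values):
    """Expectation of a random variable on a finite probability space."""
    return sum(w * v for w, v in zip(weights, values))


def random_model(rng, n_points=6):
    """Random finite probability space with four [-1, 1]-valued variables."""
    raw = [rng.random() + 1e-9 for _ in range(n_points)]
    total = sum(raw)
    weights = [r / total for r in raw]
    variables = {name: [rng.uniform(-1.0, 1.0) for _ in range(n_points)]
                 for name in "ABCD"}
    return weights, variables


def bell_gaps(weights, variables):
    """Right-hand side minus left-hand side for inequalities (1), (2), (3)."""
    def corr(x, y):
        products = [p * q for p, q in zip(variables[x], variables[y])]
        return expectation(weights, products)

    gap1 = (1 - corr("A", "C")) - abs(corr("A", "B") - corr("B", "C"))
    gap2 = (1 + corr("A", "C")) - abs(corr("A", "B") + corr("B", "C"))
    gap3 = 2 - (abs(corr("A", "B") - corr("B", "C"))
                + abs(corr("A", "D") + corr("D", "C")))
    return gap1, gap2, gap3


rng = random.Random(0)  # fixed seed, for reproducibility only
worst_gap = min(min(bell_gaps(*random_model(rng))) for _ in range(1000))
```

Up to rounding errors the smallest gap is never negative: as long as the four variables live on one probability space, no choice of weights or values violates the inequalities.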

Remark (2). Bell's original proof, as well as almost all the available proofs of Bell's inequality, deals only with the case of random variables assuming only the values $+1$ and $-1$. The present generalization is not without interest because it dispenses with the assumption that the classical random variables used to describe quantum observables have the same set of values as the latter ones: a hidden variable theory is required to reproduce the results of quantum theory only when the hidden parameters are averaged over.

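Theorem (3). Let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_d^{(2)}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1,+1]$. Then the following inequalities hold:

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| \le 1 - E(S_a^{(1)} S_c^{(1)}) \tag{4}$$

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_c^{(1)} S_b^{(2)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{5}$$

$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| + |E(S_a^{(1)} S_d^{(2)}) + E(S_c^{(1)} S_d^{(2)})| \le 2 \tag{6}$$

Proof. This is a rephrasing of Corollary (1), with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, $D = S_d^{(2)}$.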

3 Implications of Bell's inequalities for the singlet correlations

To apply Bell's inequalities to the singlet correlations considered in the EPR paradox, it is enough to observe that they imply the following:

Lemma (1). In the ordinary three-dimensional euclidean space there exist sets of three unit length vectors $a, b, c$ such that it is not possible to find a probability space $(\Omega, \mathcal{F}, P)$ and six random variables $S_x^{(j)}$ ($x = a, b, c$; $j = 1, 2$), defined on $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1,+1]$, whose correlations are given by

$$E(S_x^{(1)} \cdot S_y^{(2)}) = -x \cdot y; \qquad x, y = a, b, c \tag{1}$$

where, if $x = (x_1, x_2, x_3)$, $y = (y_1, y_2, y_3)$ are two three-dimensional vectors, $x \cdot y$ denotes their euclidean scalar product, i.e. the sum $x_1 y_1 + x_2 y_2 + x_3 y_3$.

Remark. In the usual EPR-type experiments the random variables $S_a^{(j)}, S_b^{(j)}, S_c^{(j)}$ represent the spin (or polarization) of particle $j$ of a singlet pair along the three directions $a, b, c$ in space. The expression on the right-hand side of (1) is the singlet correlation of two spin or polarization observables, theoretically predicted by quantum theory and experimentally confirmed by the Aspect-type experiments.

Proof. Suppose that for any choice of the unit vectors $x = a, b, c$ there exist random variables $S_x^{(j)}$ as in the statement of the Lemma. Then, using Bell's inequality in the form (2.5) with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, we obtain

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{2}$$

Now notice that, if $x = y$ is chosen in (1), we obtain

$$E(S_x^{(1)} \cdot S_x^{(2)}) = -x \cdot x = -|x|^2 = -1; \qquad x = a, b, c$$

and, since $|S_x^{(j)}| \le 1$, this is possible if and only if $S_x^{(1)} = -S_x^{(2)}$ ($x = a, b, c$) $P$-almost everywhere. Using this, (2) becomes equivalent to

$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(2)})$$

or again, using (1), to

$$|a \cdot b + b \cdot c| \le 1 + a \cdot c \tag{3}$$

If the three vectors $a, b, c$ are chosen to be in the same plane and such that $a$ is perpendicular to $c$ and $b$ lies between $a$ and $c$, forming an angle $\theta$ with $a$, then the inequality (3) becomes

$$\cos\theta + \sin\theta \le 1; \qquad 0 < \theta < \pi/2 \tag{4}$$

But the maximum of the function $\theta \mapsto \sin\theta + \cos\theta$ in the interval $[0, \pi/2]$ is $\sqrt{2}$ (obtained for $\theta = \pi/4$). Therefore, for $\theta$ close to $\pi/4$, the left-hand side of (4) will be close to $\sqrt{2}$, which is more than 1. In conclusion, for such a choice of the unit vectors $a, b, c$, random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}, S_c^{(2)}$ as in the statement of the Lemma cannot exist.
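
The choice of vectors made in the proof is easy to evaluate numerically. The sketch below (function names are ours) inserts the singlet correlation $-x \cdot y$ into the two sides of the inequality for coplanar vectors with $a$ perpendicular to $c$ and $b$ at angle $\theta$ from $a$.

```python
import math


def singlet_correlation(x, y):
    """Singlet correlation of equation (1): E(S_x^(1) S_y^(2)) = -x.y."""
    return -sum(xi * yi for xi, yi in zip(x, y))


def bell_sides(theta):
    """Both sides of the Bell inequality for the coplanar choice of the proof.

    a and c are orthogonal unit vectors; b lies between them at angle theta
    from a.  Returns (lhs, rhs) with
    lhs = |E(S_a^(1) S_b^(2)) + E(S_c^(1) S_b^(2))| and
    rhs = 1 - E(S_a^(1) S_c^(2)).
    """
    a = (1.0, 0.0, 0.0)
    c = (0.0, 1.0, 0.0)
    b = (math.cos(theta), math.sin(theta), 0.0)
    lhs = abs(singlet_correlation(a, b) + singlet_correlation(c, b))
    rhs = 1 - singlet_correlation(a, c)
    return lhs, rhs


if __name__ == "__main__":
    for theta in (0.0, math.pi / 8, math.pi / 4, 3 * math.pi / 8):
        lhs, rhs = bell_sides(theta)
        print(f"theta={theta:.4f}  lhs={lhs:.4f}  rhs={rhs:.4f}  "
              f"violated={lhs > rhs}")
```

At $\theta = \pi/4$ the left-hand side equals $\sqrt{2}$ while the right-hand side equals 1.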

Definition (2). A local realistic model for the EPR (singlet) correlations is defined by:

(1) a probability space $(\Omega, \mathcal{F}, P)$;

(2) for every unit vector $x$ in the three-dimensional euclidean space, two random variables $S_x^{(1)}, S_x^{(2)}$ defined on $\Omega$ and with values in the interval $[-1,+1]$, whose correlations for any $x, y$ are given by equation (1).

Corollary (3). If $a, b, c$ are chosen so as to violate (4), then a local realistic model for the EPR correlations in the sense of Definition (2) does not exist.

Proof. Its existence would contradict Lemma (1).

Remark. In the literature one usually distinguishes two types of local realistic models: deterministic and stochastic ones. Both are included in Definition (2): the deterministic models are defined by random variables $S_x^{(j)}$ with values in the set $\{-1, +1\}$, while in the stochastic models the random variables take values in the interval $[-1,+1]$. The original paper [7] was devoted to the deterministic case. Starting from [9], several papers have sought to justify the stochastic models. We prefer to distinguish the definition of the models from their justification.

4 Bell on the meaning of Bell's inequality

In the last section of [8] (submitted before [7] but published after), Bell briefly describes Bohm's hidden variable interpretation of quantum theory, underlining its nonlocal character. He then raises the question that "there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character", and in a footnote added during the proof corrections he claims that "Since the completion of this paper such a proof has been found".

In the short Introduction to [7] Bell reaffirms the same ideas, namely that the result proven by him in this paper shows that any such [hidden variable] theory which reproduces exactly the quantum mechanical predictions must have a grossly nonlocal structure.

The proof goes along the following scheme: Bell proves an inequality in which, according to what he says (cf. the statement after formula (1) in [7]):

The vital assumption [2] is that the result B for particle 2 does not depend on the setting a of the magnet for particle 1, nor A on b.

The paper [2] mentioned in the above statement is nothing but the Einstein, Podolsky, Rosen paper [11], and the locality issue is further emphasized by the fact that he reports Einstein's famous statement [12]: "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system $S_2$ is independent of what is done with the system $S_1$, which is spatially separated from the former."

Stated otherwise: according to Bell, Bell's inequality is a consequence of the locality assumption.

It follows that a theory which violates the above mentioned inequality also violates the vital assumption needed, according to Bell, for its deduction, i.e. locality.

Since the experiments prove the violation of this inequality, Bell concludes that quantum theory does not admit a local completion; in particular, quantum mechanics is a nonlocal theory. To use again Bell's words: "the statistical predictions of quantum mechanics are incompatible with separable predetermination" ([7], p. 199). Moreover, this incompatibility has to be understood in the sense that "in a theory in which parameters are added to quantum mechanics to determine the results of individual measurements, without changing the statistical predictions, there must be a mechanism whereby the setting of one measuring device can influence the reading of another instrument, however remote. Moreover, the signal involved must propagate instantaneously".

5 Critique of Bell's vital assumption

An assumption should be considered vital for a theorem if, without it, the theorem cannot be proved.

To favor Bell, let us require much less. Namely, let us agree to consider his assumption vital if the theorem cannot be proved by taking as its hypothesis the negation of this assumption.

If even this minimal requirement is not satisfied, then we must conclude that the given assumption has nothing to do with the theorem.

Notice that Bell expresses his locality condition by the requirement that the result B for particle 2 should not depend on the setting a of the magnet for particle 1 (cf. the citation in the preceding section). Let us denote by $M_1$ ($M_2$) the space of all possible measurement settings on system 1 (2).

Theorem (1). For each unit vector $x$ in the three-dimensional euclidean space ($x \in \mathbb{R}^3$, $|x| = 1$) let be given two random variables $S_x^{(1)}, S_x^{(2)}$ (spin of particle 1 (2) in direction $x$), defined on a space $\Omega$ with a probability $P$ and with values in the 2-point set $\{+1, -1\}$. Fix 3 of these unit vectors $a, b, c$ and suppose that the corresponding random variables satisfy the following nonlocality condition [violating Bell's vital assumption]: suppose that the probability space $\Omega$ has the following structure:

$$\Omega = \Lambda \times M_1 \times M_2 \tag{1}$$

so that, for some functions $F_a^{(1)}, F_a^{(2)} : \Lambda \times M_1 \times M_2 \to [-1,1]$,

$$S_a^{(1)}(\omega) = F_a^{(1)}(\lambda, m_1, m_2) \qquad (S_a^{(1)} \text{ depends on } m_2) \tag{2}$$

$$S_a^{(2)}(\omega) = F_a^{(2)}(\lambda, m_1, m_2) \qquad (S_a^{(2)} \text{ depends on } m_1) \tag{3}$$

with $m_1 \in M_1$, $m_2 \in M_2$, and similarly for $b$ and $c$ [nothing changes in the proof if we add further dependences; for example, $F_a^{(2)}$ may depend on all the $S_x^{(1)}(\omega)$ and $F_a^{(1)}$ on all the $S_x^{(2)}(\omega)$].

Then the random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the inequality

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_c^{(1)} S_b^{(2)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle \tag{4}$$

If moreover the singlet condition

$$\langle S_x^{(1)} \cdot S_x^{(2)} \rangle = -1; \qquad x = a, b, c \tag{5}$$

is also satisfied, then Bell's inequality holds in the form

$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_c^{(1)} S_b^{(2)} \rangle| \le 1 + \langle S_a^{(1)} S_c^{(2)} \rangle \tag{6}$$

Proof. The random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the assumptions of Theorem (2.3), therefore (4) holds. If also condition (5) is satisfied then, since the variables take values in the set $\{-1, +1\}$, with probability 1 one must have

$$S_x^{(1)} = -S_x^{(2)} \qquad (x = a, b, c) \tag{7}$$

and therefore $\langle S_a^{(1)} S_c^{(1)} \rangle = -\langle S_a^{(1)} S_c^{(2)} \rangle$. Using this identity, (4) becomes (6).
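
To illustrate the point of Theorem (1), one can build an explicitly nonlocal toy model and verify inequality (4) on it. In the sketch below every observable is a random $\pm 1$-valued function of the whole triple $(\lambda, m_1, m_2)$, so each outcome depends on both measurement settings; the sizes of $\Lambda$, $M_1$, $M_2$, the seed and all names are our own assumptions.

```python
import itertools
import random


def random_nonlocal_model(rng, n_lambda=4, n_m1=3, n_m2=3):
    """Random probability on Omega = Lambda x M1 x M2 with +-1 observables.

    Each observable is an arbitrary function of (lambda, m1, m2), so the
    outcome for particle 1 may depend on m2 and conversely (nonlocality).
    """
    omega = list(itertools.product(range(n_lambda), range(n_m1), range(n_m2)))
    raw = [rng.random() + 1e-9 for _ in omega]
    total = sum(raw)
    prob = {w: r / total for w, r in zip(omega, raw)}
    # "a1" stands for S_a^(1), "c1" for S_c^(1), "b2" for S_b^(2).
    spins = {name: {w: rng.choice((-1, 1)) for w in omega}
             for name in ("a1", "c1", "b2")}
    return prob, spins


def correlation(prob, spins, x, y):
    """Expectation of the product of two observables."""
    return sum(p * spins[x][w] * spins[y][w] for w, p in prob.items())


def bell_gap(prob, spins):
    """Right-hand side minus left-hand side of inequality (4)."""
    lhs = abs(correlation(prob, spins, "a1", "b2")
              - correlation(prob, spins, "c1", "b2"))
    rhs = 1 - correlation(prob, spins, "a1", "c1")
    return rhs - lhs


rng = random.Random(0)  # fixed seed, for reproducibility only
worst_gap = min(bell_gap(*random_nonlocal_model(rng)) for _ in range(500))
```

Up to rounding errors the gap is never negative, whatever the dependence on the settings, because all three observables are defined on the same probability space.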

Summing up: Theorem (1) proves that Bell's inequality is satisfied if one takes as hypothesis the negation of his vital assumption. From this we conclude that Bell's vital assumption not only is not vital but, in fact, has nothing to do with Bell's inequality.

REMARK. Using Lemma (14.1) below we can allow the observables to take values in $[-1,1]$ also in Theorem (1).

REMARK. The above discussion is not a refutation of the Bell inequality: it is a refutation of Bell's claim that his formulation of locality is an essential assumption for its validity. Since the locality assumption is irrelevant for the proof of Bell's inequality, it follows that this inequality cannot discriminate between local and nonlocal hidden variable theories, as claimed both in the introduction and in the conclusions of Bell's paper.

In particular, Theorem (1) gives an example of situations in which:

(i) Bell's locality condition is violated while his inequality is satisfied.

In a recent experiment with M. Regoli [4] we have produced examples of situations in which:

(ii) Bell's locality condition is satisfied while his inequality is violated.

6 The role of the counterfactual argument in Bell's proof

Bell uses the counterfactual argument in an essential way in his proof, because it is easy to check that formula (13) in [7] is the one which allows him to reduce, in the proof of his inequality, all considerations to the A-variables ($S_a^{(1)}$ in our notations, while Bell's B-variables are the $S_a^{(2)}$ in our notations). The pairs of chameleons (cf. section (10)), as well as the experiment of [4], provide a counterexample precisely to this formula.

7 Proofs of Bell's inequality based on counting arguments

There is a widespread illusion that the above mentioned critiques can be exorcized by restricting one's considerations to results of measurements. The following considerations show why this is an illusion.

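The counting arguments usually used to prove the Bell inequality are all based on the following scheme. In the same notations used up to now, one considers $N$ simultaneous measurements of the singlet pairs of observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ and one denotes by $S_{x,\nu}^{(j)}$ the result of the $\nu$-th measurement of $S_x^{(j)}$ ($j = 1, 2$; $x = a, b, c$; $\nu = 1, \dots, N$). With these notations one can calculate the empirical correlations on the samples, that is

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{\nu=1}^{N} S_{a,\nu}^{(1)} S_{b,\nu}^{(2)} \tag{1}$$

(and similarly for the other ones). In the Bell inequality, 3 such correlations are involved:

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{2}$$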

Thus in the three experiments observer 1 has to measure $S_a^{(1)}$ in the first and third experiment and $S_c^{(1)}$ in the second, while observer 2 has to measure $S_b^{(2)}$ in the first and second experiment and $S_c^{(2)}$ in the third. Therefore the directions $a$ and $b$ can be chosen arbitrarily by the two observers, and it is not necessary that observer 1 is informed of the choice of observer 2 or conversely. However, the direction $c$ has to be chosen by both observers, and therefore at least on this direction there should be a preliminary agreement between the two observers. This preliminary information can be replaced by a procedure in which each observer chooses at will the three directions, and only those choices are considered for which it happens (by chance) that the second choice of observer 1 coincides with the third of observer 2 (cf. section (15) for further discussion of this point). Whichever procedure has been chosen, after the results of the experiments one can compute the 3 empirical correlations

$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) \, S_b^{(2)}(p_j^{(1)}) \tag{3}$$

$$\langle S_c^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) \, S_b^{(2)}(p_j^{(2)}) \tag{4}$$

$$\langle S_a^{(1)} S_c^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(3)}) \, S_c^{(2)}(p_j^{(3)}) \tag{5}$$

where $p_j^{(3)}$ means the $j$-th point of the 3-d experiment, etc. If we try to apply the Bell argument directly to the empirical data given by the right hand sides of (3), (4), (5), we meet the expression

$$\frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) \, S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) \, S_b^{(2)}(p_j^{(2)}) \tag{6}$$

from which we immediately see that, if we try to apply Bell's reasoning to the empirical data, we are stuck at the first step, because we find a sum of terms of the type

$$S_a^{(1)}(p_j^{(1)}) \, S_b^{(2)}(p_j^{(1)}) - S_c^{(1)}(p_j^{(2)}) \, S_b^{(2)}(p_j^{(2)}) \tag{7}$$

to which the inequalities among numbers of section (1) cannot be applied because in general

$$S_b^{(2)}(p_j^{(1)}) \ne S_b^{(2)}(p_j^{(2)}) \tag{8}$$

More explicitly, since the expression (7) above is of the form

$$ab - b'c$$

with $a, b, b', c \in \{\pm 1\}$, the only possible upper bound for it is 2 and not $1 - ac$.

Even supposing that we, in order to uphold Bell's thesis, can introduce a cleaning operation [3] (cf. [4]) which eliminates all the points at which the two values compared in (8) differ, we would arrive at the inequality

$$\left| \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) \, S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)}) \, S_b^{(2)}(p_j^{(2)}) \right| \le 1 - \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)}) \, S_c^{(1)}(p_j^{(2)}) \tag{9}$$

and, in order to deduce from this something comparable with the experiments, we need to use the counterfactual argument, asserting that

$$S_c^{(1)}(p_j^{(2)}) = -S_c^{(2)}(p_j^{(2)}) \tag{10}$$

But in the second experiment $S_b^{(2)}$, and not $S_c^{(2)}$, has been measured. Thus to postulate the validity of (10) means to postulate that the value of $S_c^{(2)}$ inferred from the second experiment is the same that we would have found if $S_c^{(2)}$, and not $S_b^{(2)}$, had been measured. The chameleon effect provides a counterexample to this statement.
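
The obstruction met at the first step can be seen by enumerating the possible values of a term of the form $ab - b'c$. The sketch below (names are ours) lists all sign choices and separates the case $b = b'$, in which the pointwise bound $1 - ac$ applies, from the general case.

```python
from itertools import product

SIGNS = (-1, 1)


def empirical_term(a, b, b_prime, c):
    """A term of type (7): a*b - b'*c, with b and b' from different runs."""
    return a * b - b_prime * c


# Largest absolute value when b and b' are unrelated.
largest_term = max(abs(empirical_term(*t)) for t in product(SIGNS, repeat=4))

# Sign choices (a, b, b', c) for which the bound 1 - ac fails.
violations = [t for t in product(SIGNS, repeat=4)
              if abs(empirical_term(*t)) > 1 - t[0] * t[3]]

# When b = b' the bound of Corollary (1.2) is always respected.
bound_holds_if_equal = all(abs(empirical_term(a, b, b, c)) <= 1 - a * c
                           for a, b, c in product(SIGNS, repeat=3))
```

All the violating choices have $b \ne b'$, which is exactly the situation of data taken at different points of different experiments.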

8 The quantum probabilistic analysis

Given the results of sections (5), (6), (7), it is then legitimate to ask: if Bell's vital assumption is irrelevant for the deduction of Bell's inequality, which is the really vital assumption which guarantees the validity of this inequality?

This natural question was first answered in [1], and this result motivated the birth of quantum probability as something more than a mere noncommutative generalization of probability theory: in fact, a necessity motivated by experimental data.

Theorem (2.3) has only two assumptions:

(i) that the random variables take values in the interval $[-1,+1]$;

(ii) that the random variables are defined on the same probability space.

Since we are dealing with spin variables, assumption (i) is reasonable.

Let us consider assumption (ii). This is equivalent to the claim that the three probability measures $P_{ab}, P_{ac}, P_{cb}$, representing the distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$ respectively, can be obtained by restriction from a single probability measure $P$, representing the distribution of the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$.

This is indeed a strong assumption because, due to the incompatibility of the spin variables along non-parallel directions, the three correlations

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{1}$$

can only be estimated in different, in fact mutually incompatible, series of experiments. If we label each series of experiments by the corresponding pair (i.e. $(a,b)$, $(b,c)$, $(c,a)$), then we cannot exclude the possibility that also the probability measure in each series of experiments will depend on the corresponding pair. In other words, each of the measures $P_{ab}, P_{bc}, P_{ca}$ describes the joint statistics of a pair of commuting observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, and there is no a priori reason to postulate that all these joint distributions for pairs can be deduced from a single distribution for the quadruple $\{S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}\}$.

We have already proved in Theorem (2.3) that this strong assumption implies the validity of the Bell inequality. Now let us prove that it is the truly vital assumption for the validity of this inequality, i.e. that, if this assumption is dropped, i.e. if no single distribution for quadruples exists, then it is an easy exercise to construct counterexamples violating Bell's inequality. To this goal one can use the following lemma.

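Lemma (1). Let be given three probability measures $P_{ab}, P_{ac}, P_{cb}$ on a given (measurable) space $(\Omega, \mathcal{F})$ and let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ be functions defined on $(\Omega, \mathcal{F})$ with values in the interval $[-1,+1]$ and such that the probability measure $P_{ab}$ (resp. $P_{cb}$, $P_{ac}$) is the distribution of the pair $(S_a^{(1)}, S_b^{(2)})$ (resp. $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$). For each pair define the corresponding correlation

$$\kappa_{ab} := \langle S_a^{(1)} S_b^{(2)} \rangle = \int S_a^{(1)} S_b^{(2)} \, dP_{ab}$$

(here and below the index pairs $bc$ and $cb$ refer to the same pair $(S_c^{(1)}, S_b^{(2)})$) and suppose that, for $\varepsilon, \varepsilon' = \pm$, the joint probabilities for pairs

$$P_{xy}^{\varepsilon \varepsilon'} := P(S_x^{(1)} = \varepsilon ; \; S_y^{(2)} = \varepsilon')$$

satisfy

$$P_{xy}^{++} = P_{xy}^{--}; \qquad P_{xy}^{+-} = P_{xy}^{-+} \tag{2}$$

$$P_x^{+} = P_x^{-} = 1/2 \tag{3}$$

Then the Bell inequality

$$|\kappa_{ab} - \kappa_{bc}| \le 1 - \kappa_{ac} \tag{4}$$

is equivalent to

$$|P_{ab}^{++} - P_{bc}^{++}| + P_{ac}^{++} \le \frac{1}{2} \tag{5}$$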

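Proof. The inequality (4) is equivalent to

$$|2P_{ab}^{++} - 2P_{ab}^{+-} - 2P_{bc}^{++} + 2P_{bc}^{+-}| \le 1 - 2P_{ac}^{++} + 2P_{ac}^{+-} \tag{6}$$

Using the identity (equivalent to (3))

$$P_{xy}^{+-} = \frac{1}{2} - P_{xy}^{++} \tag{7}$$

the left hand side of (4) becomes the modulus of

$$2(P_{ab}^{++} - P_{ab}^{+-}) - 2(P_{bc}^{++} - P_{bc}^{+-}) = 2\left(P_{ab}^{++} - \frac{1}{2} + P_{ab}^{++}\right) - 2\left(P_{bc}^{++} - \frac{1}{2} + P_{bc}^{++}\right) = 4(P_{ab}^{++} - P_{bc}^{++}) \tag{8}$$

and, again using (7), the right hand side of (6) is equal to

$$1 - 2\left(P_{ac}^{++} - \frac{1}{2} + P_{ac}^{++}\right) = 2 - 4P_{ac}^{++} \tag{9}$$

Summing up, (4) is equivalent to

$$|P_{ab}^{++} - P_{bc}^{++}| \le \frac{1}{2} - P_{ac}^{++} \tag{10}$$

which is (5).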
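
The equivalence between (4) and (5) can be confirmed by direct computation. In the following sketch (exact rational arithmetic; the grid and the names are our own choices) each pair distribution is parametrized by the single number $P_{xy}^{++} \in [0, 1/2]$, as allowed by (2) and (7).

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def correlation(p_pp):
    """kappa_xy = 2 P++ - 2 P+-, with P+- = 1/2 - P++ as in identity (7)."""
    p_pm = HALF - p_pp
    return 2 * p_pp - 2 * p_pm


def bell_4(p_ab, p_bc, p_ac):
    """Inequality (4): |kappa_ab - kappa_bc| <= 1 - kappa_ac."""
    return abs(correlation(p_ab) - correlation(p_bc)) <= 1 - correlation(p_ac)


def condition_5(p_ab, p_bc, p_ac):
    """Inequality (5): |P_ab++ - P_bc++| + P_ac++ <= 1/2."""
    return abs(p_ab - p_bc) + p_ac <= HALF


# Exact grid of admissible values of P++ in [0, 1/2] (step 1/20).
GRID = [Fraction(k, 20) for k in range(0, 11)]

equivalent_everywhere = all(bell_4(*t) == condition_5(*t)
                            for t in product(GRID, repeat=3))
```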

Corollary (2). There exist triples $P_{ab}, P_{ac}, P_{cb}$ on the 4-point space $\{+1, -1\} \times \{+1, -1\}$ which satisfy conditions (2), (3) of Lemma (1) and are not compatible with any probability measure $P$ on the 8-point space $\{+1, -1\} \times \{+1, -1\} \times \{+1, -1\}$.

Proof. Because of conditions (2), (3), the probability measures $P_{ab}, P_{ac}, P_{cb}$ are uniquely determined by the three numbers

$$P_{ab}^{++}, \; P_{ac}^{++}, \; P_{cb}^{++} \in [0, 1/2] \tag{11}$$

Thus, if we choose these three numbers so that the inequality (5) is not satisfied, the Bell inequality (4) cannot be satisfied, because of Lemma (1).
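
One explicit choice of the three numbers in (11) is worked out below. The values 2/5, 1/10, 2/5 are hypothetical, selected only because they violate (5); the helper names are ours.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def pair_measure(p_pp):
    """Pair distribution on {+1,-1} x {+1,-1} fixed by P++ via (2) and (7)."""
    p_pm = HALF - p_pp
    return {(+1, +1): p_pp, (-1, -1): p_pp, (+1, -1): p_pm, (-1, +1): p_pm}


def correlation(measure):
    """Correlation of the two +-1-valued coordinates under the measure."""
    return sum(s1 * s2 * p for (s1, s2), p in measure.items())


# Hypothetical values of P_ab++, P_bc++, P_ac++ that violate (5).
P_ab = pair_measure(Fraction(2, 5))
P_bc = pair_measure(Fraction(1, 10))
P_ac = pair_measure(Fraction(2, 5))

k_ab, k_bc, k_ac = correlation(P_ab), correlation(P_bc), correlation(P_ac)
bell_satisfied = abs(k_ab - k_bc) <= 1 - k_ac
```

Each of the three measures is a legitimate probability distribution with uniform marginals, yet the resulting correlations $3/5$, $-3/5$, $3/5$ violate (4); hence the three measures cannot be restrictions of a single joint distribution.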

9 The realism of ballot boxes and the corresponding statistics

The fact that there is no a priori reason to postulate that the joint distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ can be deduced from a single distribution for the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ does not necessarily mean that such a common joint distribution does not exist.

On the contrary, in several physically meaningful situations we have good reasons to expect that such a joint distribution should exist, even if it might not be accessible to direct experimental verification.

This is a simple consequence of the so-called hypothesis of realism, which is justified whenever we are entitled to believe that the results of our measurements are pre-determined. In the words of Bell: "Since we can predict in advance the result of measuring any chosen component of $\sigma_2$, by previously measuring the same component of $\sigma_1$, it follows that the result of any such measurement must actually be predetermined."

Consider, for example, a box containing pairs of balls. Suppose that the experiments allow one to measure either the color, or the weight, or the material of which each ball is made, but the rules of the game are that on each ball only one measurement at a time can be performed. Suppose moreover that the experiments show that, for each property, only two values are realized and that, whenever a simultaneous measurement of the same property on the two elements of a pair is performed, the resulting answers are always discordant. Up to a change of convention, and in appropriate units, we can always suppose that these two values are $\pm 1$, and we shall do so in the following.

Then the joint distributions of pairs (of properties relative to different balls) are accessible to experiment, but those of triples or quadruples are not.

Nevertheless, it is reasonable to postulate that in the box there is a well defined (although purely Platonic, in the sense of not being accessible to experiment) number of balls with each given color, weight and material. These numbers give the relative frequencies of triples of properties for each element of the pair, hence, using the perfect anticorrelation, a family of joint probabilities for all the possible sextuples. More precisely, due to the perfect anticorrelation, the relative frequency of the triples of properties

$$[S_a^{(1)} = a_1], \quad [S_b^{(1)} = b_1], \quad [S_c^{(1)} = c_1]$$

where $a_1, b_1, c_1 = \pm 1$, is equal to the relative frequency of the sextuples of properties

$$[S_a^{(1)} = a_1], \; [S_b^{(1)} = b_1], \; [S_c^{(1)} = c_1], \; [S_a^{(2)} = -a_1], \; [S_b^{(2)} = -b_1], \; [S_c^{(2)} = -c_1]$$

and, since we are confining ourselves to the case of 3 properties and 2 particles, the above ones, when $a_1, b_1, c_1$ vary in all possible ways in the set $\{\pm 1\}$, are all the possible configurations. In this situation the counterfactual argument is applicable, and in fact we have used it to deduce the joint distribution of sextuples from the joint distributions of triples.
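
The ballot box can be simulated directly. In the sketch below (the random frequencies, the seed and all names are our own assumptions) each pair in the box is described by the triple $(a_1, b_1, c_1)$ carried by the first ball, the second ball carrying the opposite triple; the pair correlations derived from such a box are then inserted into Bell's inequality in the form (5.6).

```python
import itertools
import random

TRIPLES = list(itertools.product((-1, 1), repeat=3))  # values (a1, b1, c1)
INDEX = {"a": 0, "b": 1, "c": 2}


def random_box(rng):
    """Random relative frequencies of the 8 possible triples in the box."""
    raw = [rng.random() + 1e-9 for _ in TRIPLES]
    total = sum(raw)
    return {t: r / total for t, r in zip(TRIPLES, raw)}


def pair_correlation(box, x, y):
    """<S_x^(1) S_y^(2)>, where ball 2 carries the negated triple of ball 1."""
    return sum(f * t[INDEX[x]] * (-t[INDEX[y]]) for t, f in box.items())


def bell_gap(box):
    """Right-hand side minus left-hand side of the inequality (5.6)."""
    lhs = abs(pair_correlation(box, "a", "b") - pair_correlation(box, "c", "b"))
    rhs = 1 + pair_correlation(box, "a", "c")
    return rhs - lhs


rng = random.Random(0)  # fixed seed, for reproducibility only
worst_gap = min(bell_gap(random_box(rng)) for _ in range(1000))
```

Whatever the composition of the box, the gap is non-negative up to rounding: ballot box realism always produces statistics compatible with the Bell inequality.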

10 The realism of chameleons and the corresponding statistics

According to the quantum probabilistic interpretation, what Einstein, Podolsky, Rosen, Bell and several others who have discussed this topic call the hypothesis of realism should be called, more precisely, the hypothesis of the ballot box realism, as opposed to the hypothesis of the chameleon realism.

The point is that, according to the quantum probabilistic interpretation, the term predetermined should not be confused with the term realized a priori, which has been discussed in section (9): it might be conditionally decided, according to the scheme: if such and such will happen, I will react so and so.

The chameleon provides a simple example of this distinction: a chameleon becomes deterministically green on a leaf and brown on a log. In this sense we can surely claim that its color on a leaf is predetermined. However, this does not mean that the chameleon was green also before jumping on the leaf.

The chameleon metaphor describes a mechanism which is perfectly local, even deterministic, and surely classical and macroscopic; moreover, there are no doubts that the situation it describes is absolutely realistic. Yet this realism, being different from the ballot box realism, allows one to free from metaphysics statements of the orthodox interpretation such as: the act of measurement creates the value of the measured observable. To many this looks metaphysical or magical, but look how natural it sounds when you think of the color of a chameleon.

Finally, and most important for its implications for the EPR argument, the chameleon realism provides a simple and natural counterexample of a situation in which the results are predetermined but the counterfactual argument is not applicable.

Imagine in fact a box in which there are many pairs of chameleons In each pair there is exactly an healthy one which becomes green on a leaf and brown on a log and a mutant one which becomes brown on a leaf and green on a log moreover exactly one of the chameleons in each pair weights 100 grams and exactly one 200 grams A measurement consists in separating the members of each pair each one in a smaller box and in performing one and only one measurement on each member of each pair

The color on the leaf, the color on the log and the weight are 2-valued observables (because we do not know a priori if we are measuring the healthy or the mutant chameleon). Thus, with respect to the observables color on the leaf, color on the log and weight, the pairs of chameleons behave exactly as EPR pairs: whenever the same observable is measured on both elements of a pair, the results are opposite. However, suppose I measure the color on the leaf of one element of a pair and the weight of the other one, and suppose the answers I find are green and 100 grams. Can I conclude that the second element of the pair is brown and weighs 100 grams? Clearly not, because there is no reason to believe that the second member of the pair, whose weight was measured while it was in a box, was also on a leaf.

From this point of view the measurement interaction enters the very definition of an observable. However, also in this interpretation, which is more similar to the quantum mechanical situation, the counterfactual argument cannot be applied, because it amounts to answering brown to the question: which is the color on the leaf, if I have measured the weight and if I know that the chameleon is the mutant one (this because the measurement of the other one gave green on the leaf)? But this answer is not correct: it could well be that inside the box there is a leaf and the chameleon is interacting with it while I am measuring its weight, but it could also be that it is interacting with a log, also contained inside the box, in which case, being a mutant, it would be green.
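The chameleon pairs described above can be sketched in a few lines of Python. This is only an illustrative toy model: the dictionary layout and the function names (make_pair, color, measure, always_opposite) are assumptions introduced here, not part of the original argument.

```python
import random

def make_pair(rng):
    """One pair: exactly one healthy and one mutant, exactly one 100 g and one 200 g."""
    kinds = ["healthy", "mutant"]
    weights = [100, 200]
    rng.shuffle(kinds)
    rng.shuffle(weights)
    return [{"kind": k, "weight": w} for k, w in zip(kinds, weights)]

def color(chameleon, support):
    """Color taken by the chameleon on the given support ('leaf' or 'log')."""
    healthy = {"leaf": "green", "log": "brown"}
    mutant = {"leaf": "brown", "log": "green"}
    table = healthy if chameleon["kind"] == "healthy" else mutant
    return table[support]

def measure(chameleon, observable):
    """Measure 'weight', 'color_on_leaf' or 'color_on_log' on one chameleon."""
    if observable == "weight":
        return chameleon["weight"]
    support = "leaf" if observable == "color_on_leaf" else "log"
    return color(chameleon, support)

def always_opposite(observable, n_pairs=1000, seed=0):
    """True if measuring the same observable on both members always gives opposite results."""
    rng = random.Random(seed)
    for _ in range(n_pairs):
        first, second = make_pair(rng)
        if measure(first, observable) == measure(second, observable):
            return False
    return True
```

In this sketch the color is a function of the chameleon and of the support it interacts with, so a weight measurement performed in a box with an unknown support leaves the question "which color on the leaf?" without a counterfactual answer, while the same-observable results remain perfectly anticorrelated.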

Therefore, if we can produce an example of a 2-particle system in which the Heisenberg evolution of each particle's observable satisfies Bell's locality condition, but the Schroedinger evolution of the state, i.e. the expectation value $\langle \cdot \rangle$, depends on the pair (a, b) of measured observables, we can claim that this counterexample abides by the same definition of locality as Bell's theorem.

11 Bell's inequalities and the chameleon effect

Definition (1). Let S be a physical system and O a family of observable quantities relative to this system. We say that the chameleon effect is realized on S if, for any measurement M of an observable $A \in O$, the dynamical evolution of S depends on the observable A. If D denotes the state space of S, this means that the change of state from the beginning to the end of the experiment is described by a map (a one-parameter group or semigroup in the case of continuous time)

$T_A : D \to D$

Remark. The explicit form of the dependence of $T_A$ on A depends on both the system and the measurement, and many concrete examples can be constructed. An example in the quantum domain is discussed in [3], and the experiment of [4] realizes an example in the classical domain.

Remark. If the system S is composed of two sub-systems $S_1$ and $S_2$, we can also consider the case in which the evolutions of the two subsystems are different, in the sense that for system 1 we have one form of functional dependence $T_A^{(1)}$ of the evolution associated to the observable A, and for system 2 we have another form of functional dependence $T_A^{(2)}$. In the experiment of [4] the state space is the unit disk D in the plane, the observables are parametrized by angles in $[0, 2\pi)$ (or equivalently by unit vectors in the unit circle) and for each observable $S_a^{(1)}$ of system 1

and for each observable $S_a^{(2)}$ of system 2

where $R_a$ denotes (counterclockwise) rotation of an angle a.

Let us consider Bell's inequalities by assuming that a chameleon effect is present. Denoting E the common initial state of the composite system (1, 2) (e.g. singlet state), the state at the end of the measurement will be

Now replace $S_x^{(j)}$ by

$\tilde S_x^{(j)} = S_x^{(j)} \circ T_x^{(j)}$

Since the $\tilde S_x^{(j)}$ take values $\pm 1$, we know from Theorem (23) that, if we postulate the existence of joint probabilities for the triple $\tilde S_a^{(1)}, \tilde S_b^{(2)}, \tilde S_c^{(1)}$ compatible with the two correlations $E(\tilde S_a^{(1)} \tilde S_b^{(2)})$, $E(\tilde S_c^{(1)} \tilde S_b^{(2)})$, then the inequality

$|E(\tilde S_a^{(1)} \tilde S_b^{(2)}) - E(\tilde S_c^{(1)} \tilde S_b^{(2)})| \le 1 - E(\tilde S_a^{(1)} \tilde S_c^{(1)})$

holds, and if we also have the singlet condition

$E\big(S_x^{(1)}(T_x^{(1)} p)\, S_x^{(2)}(T_x^{(2)} p)\big) = -1$ (1)

then a.e.

and we have Bell's inequality. Thus, if we postulate the same probability space, even the chameleon effect alone is not sufficient to guarantee violation of Bell's inequality.

Therefore the fact that the three experiments are done on different and incompatible samples must play a crucial role.


As far as the chameleon effect is concerned, let us notice that, in the above statement of the problem, the fact that we use a single initial probability measure E is equivalent to postulating that, at time t = 0, the three pairs of observables

$(S_a^{(1)}, S_b^{(2)}),\ (S_c^{(1)}, S_b^{(2)}),\ (S_a^{(1)}, S_c^{(1)})$ admit a common joint distribution, in fact E.

12 Physical implausibility of Bell's argument

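In this section we show that, combining the chameleon effect with the fact that the three experiments refer to different samples, even in very simple situations no cleaning conditions can lead to a proof of Bell's inequality.

If we try to apply Bell's reasoning to the empirical data, we have to start from the expression

$\left| \frac{1}{N}\sum_j S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - \frac{1}{N}\sum_j S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \right|$ (1)

which we majorize by

$\frac{1}{N}\sum_j \left| S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)}) \right|$ (2)

But if we try to apply the inequality among numbers to the expression

$S_a^{(1)}(T_a^{(1)} p_j^{(1)})\, S_b^{(2)}(T_b^{(2)} p_j^{(1)}) - S_c^{(1)}(T_c^{(1)} p_j^{(2)})\, S_b^{(2)}(T_b^{(2)} p_j^{(2)})$ (3)

we see that we are not dealing with the situation covered by Corollary (12), i.e.

$|ab - cb| \le 1 - ac$ (4)

because, since

$S_b^{(2)}(T_b^{(2)} p_j^{(1)}) \ne S_b^{(2)}(T_b^{(2)} p_j^{(2)})$ (5)

the left hand side of (4) must be replaced by

$|ab - cb'|$ (6)

whose maximum for $a, b, c, b' \in [-1, +1]$ is 2 and not $1 - ac$.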
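The difference between the two expressions $|ab - cb|$ and $|ab - cb'|$ can be checked by brute force on the extreme values $\pm 1$ (both expressions are linear in each variable, so their extrema over $[-1, +1]$ are attained there). The following is a minimal sketch; the variable names are chosen here for illustration only.

```python
from itertools import product

SIGNS = (-1, 1)

# Same b in both terms: |ab - cb| <= 1 - ac holds for every choice of signs.
same_b_holds = all(
    abs(a * b - c * b) <= 1 - a * c
    for a, b, c in product(SIGNS, repeat=3)
)

# Different b and b' in the two terms: the bound 1 - ac can be violated.
violations = [
    (a, b, c, b2)
    for a, b, c, b2 in product(SIGNS, repeat=4)
    if abs(a * b - c * b2) > 1 - a * c
]

# Maximum of |ab - cb'| over all sign choices.
max_with_different_b = max(
    abs(a * b - c * b2) for a, b, c, b2 in product(SIGNS, repeat=4)
)
```

For instance the choice a = b = c = 1, b' = -1 gives $|ab - cb'| = 2$ while $1 - ac = 0$.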


Bell's implicit assumption of the single probability space is equivalent to the postulate that, for each j = 1, ..., N,

$p_j^{(1)} = p_j^{(2)}$ (7)

Physically this means that the hidden parameter in the first experiment is the same as the hidden parameter in the second experiment. This is surely a very implausible assumption. Notice however that without this assumption Bell's argument cannot be carried through and we cannot deduce the inequality, because we must stop at equation (2).

13 The role of the single probability space in CHSH's proof

Clauser, Horne, Shimony, Holt [9] introduced the variant (26) of the Bell inequality for quadruples $(a, b), (a, b'), (a', b), (a', b')$, which is based on the following inequality among numbers $a, b, b', a' \in [-1, 1]$:

$|ab + ab' + a'b - a'b'| \le 2$ (1)

Section (1) already contains a proof of (1). A direct proof follows from

$|b + b'| + |b - b'| \le 2$ (2)

because

$|ab + ab' + a'b - a'b'| = |a(b + b') + a'(b - b')|$

$\le |a| \cdot |b + b'| + |a'| \cdot |b - b'| \le |b + b'| + |b - b'| \le 2$

The proof of (2) is obvious.

Remark (1). Notice that an inequality of the form

$|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| \le 2$ (3)

would be obviously false. In fact, for example, the choice

$a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = b_4 = 1, \quad a_4 = -1$

would give $|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| = 4$.


That is, for the validity of (1) it is absolutely essential that the number a is the same in the first and the second term, and similarly for a' in the 3rd and the 4th, b' in the 2nd and the 4th, and b in the first and the 3rd.
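The role played by the repeated numbers can be made concrete with a small enumeration. The sketch below, whose function names are assumptions introduced here, compares the four-number expression, in which each number appears twice, with the eight-number expression, in which every term has its own pair.

```python
from itertools import product

SIGNS = (-1, 1)

def chsh_shared(a, b, a2, b2):
    """ab + ab' + a'b - a'b' with the SAME four numbers reused across terms."""
    return a * b + a * b2 + a2 * b - a2 * b2

def chsh_independent(a1, b1, a2, b2, a3, b3, a4, b4):
    """a1 b1 + a2 b2 + a3 b3 - a4 b4 with a separate pair in every term."""
    return a1 * b1 + a2 * b2 + a3 * b3 - a4 * b4

# Largest absolute value over all +-1 assignments.
max_shared = max(abs(chsh_shared(*v)) for v in product(SIGNS, repeat=4))
max_independent = max(abs(chsh_independent(*v)) for v in product(SIGNS, repeat=8))
```

With shared numbers the maximum is 2; with eight independent numbers it is 4, reached for example by the choice of the Remark, `chsh_independent(1, 1, 1, 1, 1, 1, -1, 1)`.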

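This inequality among numbers can be extended to pairs of random variables by introducing the following postulates:

(P1) Instead of four numbers $a, b, b', a' \in [-1, 1]$, one considers four functions

$S_a^{(1)},\ S_b^{(2)},\ S_{a'}^{(1)},\ S_{b'}^{(2)}$

all defined on the same space $\Lambda$ (whose points are called hidden parameters) and with values in $[-1, 1]$.

(P2) One postulates that there exists a probability measure P on $\Lambda$ which defines the joint distribution of each of the following four pairs of functions:

$(S_a^{(1)}, S_b^{(2)}),\ (S_a^{(1)}, S_{b'}^{(2)}),\ (S_{a'}^{(1)}, S_b^{(2)}),\ (S_{a'}^{(1)}, S_{b'}^{(2)})$ (4)

Remark (2). Notice that (P2) automatically implies that the joint distributions of the four pairs of functions can be deduced from a joint distribution of the whole quadruple, i.e. the existence of a single Kolmogorov model for these four pairs.

With these premises, for each $\lambda \in \Lambda$ one can apply the inequality (1) to the four numbers

$S_a^{(1)}(\lambda),\ S_b^{(2)}(\lambda),\ S_{a'}^{(1)}(\lambda),\ S_{b'}^{(2)}(\lambda)$

and deduce that

$|S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda)| \le 2$ (5)

From this, taking P-averages, one obtains

$|\langle S_a^{(1)} S_b^{(2)} \rangle + \langle S_a^{(1)} S_{b'}^{(2)} \rangle + \langle S_{a'}^{(1)} S_b^{(2)} \rangle - \langle S_{a'}^{(1)} S_{b'}^{(2)} \rangle| =$ (6)

$= \left| \int \big( S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \big)\, dP(\lambda) \right| \le$ (7)

$\le \int \left| S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right| dP(\lambda) \le 2$ (8)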

Remark (3). Notice that in the step from (6) to (7) we have used in an essential way the existence of a joint distribution for the whole quadruple, i.e. the fact that all these random variables can be realized in the same probability space.

In EPR type experiments we are interested in the case in which the four pairs $(a, b), (a, b'), (a', b), (a', b')$ come from four mutually incompatible experiments. Let us assume that there is a hidden parameter determining the result of each of these experiments. This means that we interpret the number $S_a^{(1)}(\lambda)$ as the value of the spin of particle 1 in direction a, determined by the hidden parameter $\lambda$.

There is obviously no reason to postulate that the hidden parameter determining the result of the first experiment is exactly the same one which determines the result of the second experiment. However, when CHSH consider the quantity (5), they are implicitly making the much stronger assumption that the same hidden parameter $\lambda$ determines the results of all four experiments. This assumption is quite unreasonable from the physical point of view, and in any case it is a much stronger assumption than simply postulating the existence of hidden parameters. The latter assumption would allow CHSH only to consider the expression

$S_a^{(1)}(\lambda_1) S_b^{(2)}(\lambda_1) + S_a^{(1)}(\lambda_2) S_{b'}^{(2)}(\lambda_2) + S_{a'}^{(1)}(\lambda_3) S_b^{(2)}(\lambda_3) - S_{a'}^{(1)}(\lambda_4) S_{b'}^{(2)}(\lambda_4)$ (9)

and, as shown in Remark (1) above, the maximum of this expression is not 2 but 4, and this does not allow one to deduce the Bell inequality.

14 The role of the counterfactual argument in CHSH's proof

Contrary to Bell's original argument, the CHSH proof of the Bell inequality does not use the counterfactual argument explicitly. The fact that one can perform experiments also on quadruples, rather than on triples as originally proposed by Bell, has led some authors to claim that the counterfactual argument is not essential in the deduction of the Bell inequality. However, we have just seen in section (7) that the hidden assumption in Bell's proof, i.e. the realizability of all the random variables involved in the same probability space, is also present in the CHSH argument. The following lemma shows that, under the singlet assumption, the conclusion of the counterfactual argument follows from the hidden assumption of Bell and of CHSH.


Lemma (1). If f and g are random variables defined on a probability space $(\Lambda, P)$ and with values in $[-1, 1]$, then

$\langle fg \rangle := \int_\Lambda fg\, dP = -1$

if and only if $P(fg = -1) = 1$.

Proof. If $P(fg > -1) > 0$, then

$\int_\Lambda fg\, dP = -P(fg = -1) + \int_{fg > -1} fg\, dP > -P(fg = -1) - P(fg > -1) = -1$

Corollary (2). Suppose that all the random variables in (x3) are realized in the same probability space. Then, if the singlet condition

$\langle S_x^{(1)} S_x^{(2)} \rangle = -1$ (1)

is satisfied, the condition

$S_x^{(1)} = -S_x^{(2)}$ (2)

(i.e. formula (13) in Bell's '64 paper) is true almost everywhere.

Proof. Follows from Lemma (1) with the choice $f = S_x^{(1)}$, $g = S_x^{(2)}$.

Summing up: if you want to compare the predictions of a hidden variable theory with quantum theory in the EPR experiment (so that at least we admit the validity of the singlet law), then the hidden assumption of realizability of all the random variables in (3) in the same probability space (without which Bell's inequality cannot be proved) implies the same conclusion as the counterfactual argument. Stated otherwise: the counterfactual argument is implicit when you postulate the singlet condition and the realizability on a single probability space. It does not matter if you use triples or quadruples.

15 Physical difference between the CHSH and the original Bell inequalities

In the CHSH scheme

$(a, b),\ (a', b),\ (a, b'),\ (a', b')$

the agreement required by the experimenters is the following:
- 1 will measure the same observable in experiments I and III and the same observable in experiments II and IV
- 2 will measure the same observable in experiments I and II and the same observable in experiments III and IV

Here there is no restriction a priori on the choice of the observables to be measured.

In the Bell scheme the experimentalists agree that:
- 1 measures the same observable in experiments I and III
- 2 measures the same observable in experiments I and II
- 1 and 2 choose a priori, i.e. before the experiment begins, a direction c and agree that 1 will measure spin in direction c in experiment II and 2 will measure spin in direction c in experiment III (strong agreement)

The strong agreement can be replaced by the following (weak agreement):
- 1 and 2 choose a priori, i.e. before the experiment begins, a finite set of directions $c_1, \dots, c_K$ and agree that 1 will measure spin in a direction chosen randomly among the directions $c_1, \dots, c_K$ in experiment II and 2 will do the same in experiment III

In this scheme there is an a priori restriction on the choice of some of the observables to be measured.

If the directions fixed a priori in the plane are K, then the probability of a coincidence corresponding to a totally random (equiprobable) choice is

$P(c^{(1)} = c^{(2)}) = \sum_{\alpha=1}^{K} P(c^{(1)} = c_\alpha,\ c^{(2)} = c_\alpha) = \sum_{\alpha=1}^{K} \frac{1}{K^2} = \frac{1}{K}$

This shows that, contrary to the CHSH scheme, the choice has to be restricted to a finite number of possibilities, otherwise the probability of coincidence will be zero.
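The coincidence probability under the weak agreement can be computed exactly by enumeration. The sketch below assumes, as in the text, that the two experimenters choose independently and equiprobably among K directions; the function name is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

def coincidence_probability(K):
    """Exact probability that two independent equiprobable choices among K directions agree."""
    outcomes = list(product(range(K), repeat=2))   # all K*K joint choices, equally likely
    hits = sum(1 for c1, c2 in outcomes if c1 == c2)
    return Fraction(hits, len(outcomes))
```

The result is 1/K for every K, so it tends to zero as the number of admissible directions grows, which is the reason why the choice must be restricted to a finite set.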

From this point of view we can claim that the Clauser, Horne, Shimony, Holt formulation of Bell's inequalities realizes a small improvement with respect to Bell's original formulation.

16 Reproduction of the EPR correlations by the chameleon effect

Consider a classical dynamical system composed of two particles (1, 2). Let S denote the state space of each of the particles and suppose that at time $t = t_0$ (initial time) the state $\mu_1^0$ of particle 1 and the state $\mu_2^0$ of particle 2 coincide:

$\mu_1^0 = \mu_2^0$ (1)

Starting from time $t_0$ the two particles begin to move in opposite directions and, after a time interval of length T, two independent and non communicating experimenters simultaneously perform a measurement on each particle.

Experimenter 1 (resp. 2) can choose among three different measurements, corresponding to the observables

$S_a^{(1)}, S_b^{(1)}, S_c^{(1)}$ (resp. $S_a^{(2)}, S_b^{(2)}, S_c^{(2)}$) (2)

of particle 1 (resp. particle 2). We suppose that both particles satisfy the chameleon effect described by the following:

DEFINITION (1). Let S be the state space of a dynamical system $\sigma$, let I be a set and, for each $x \in I$, let be given a function

$S_x : S \to \mathbb{R}, \quad x \in I$ (3)

representing an observable of the system. The system $\sigma$ is said to realize the chameleon effect with respect to the observables (3) if, whenever the observable $S_x$ is measured, the dynamical evolution of the system

$T_x^t : S \to S, \quad t \in \mathbb{R}$ (4)

depends on the measured observable $S_x$.

In our case we consider only two instants of time, the initial one and the one when the measurement takes place, and we omit time from our notations. Moreover, in our case we have two particles and each particle is far away from the other one, hence it can only feel the interaction with the measurement apparatus near to it. So, combining the locality principle with the chameleon effect, we conclude that, if experimenter 1 (resp. 2) chooses to measure the observable $S_x^{(1)}$ (resp. $S_y^{(2)}$), then particle 1 (resp. 2) will evolve according to the dynamics

$T_{1,x}$ (resp. $T_{2,y}$) (5)

In our case the variables x, y can be any element of the set $\{a, b, c\}$.

Suppose that experimenter 1 chooses to measure $S_a^{(1)}$ and experimenter 2 chooses to measure $S_b^{(2)}$.

Let $\mu_1$ (resp. $\mu_2$) denote the final state, i.e. the state at the time when the measurement occurs, of particle 1 (resp. 2). Condition (1) is then equivalent to

$T_{1,a}^{-1}\mu_1 = T_{2,b}^{-1}\mu_2$ (6)

The empirical correlations of the measurements will then be

$\frac{1}{N}\sum S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, F(T_{1,a}^{-1}\mu_1 - T_{2,b}^{-1}\mu_2)$ (7)

where $F(\cdot)$ is a $\delta$-like factor taking into account the fact that only the configurations satisfying condition (6) give a non zero contribution to the correlations.

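Now suppose that the state space S is the real line $\mathbb{R}$. Thus the empirical correlations (7) are

$\langle S_a^{(1)} S_b^{(2)} \rangle = Z \int\!\!\int S_a^{(1)}(\mu_1)\, S_b^{(2)}(\mu_2)\, \delta(T_{1,a}^{-1}\mu_1 - T_{2,b}^{-1}\mu_2)\, d\mu_1\, d\mu_2$ (8)

where Z is a normalization constant. With the change of variables

$T_{1,a}^{-1}\mu_1 = \lambda_1, \quad T_{2,b}^{-1}\mu_2 = \lambda_2$ (9)

(8) becomes

$Z \int\!\!\int S_a^{(1)}(T_{1,a}\lambda_1)\, S_b^{(2)}(T_{2,b}\lambda_2)\, \delta(\lambda_1 - \lambda_2)\, dT_{1,a}(\lambda_1)\, dT_{2,b}(\lambda_2)$ (10)

Now introduce the notations

$S_x^{(j)}(T_{j,x}\lambda_j) =: \tilde S_x^{(j)}(\lambda_j), \quad j = 1, 2, \quad x = a, b$ (11)

With these notations, supposing, as is always possible, that $T'_{1,a}(\lambda_1), T'_{2,b}(\lambda_2) > 0$, (10) becomes

$Z \int\!\!\int \tilde S_a^{(1)}(\lambda_1)\, \tilde S_b^{(2)}(\lambda_2)\, \delta(\lambda_1 - \lambda_2)\, T'_{1,a}(\lambda_1)\, T'_{2,b}(\lambda_2)\, d\lambda_1\, d\lambda_2 =$

$= Z \int \tilde S_a^{(1)}(\lambda)\, \tilde S_b^{(2)}(\lambda)\, T'_{1,a}(\lambda)\, T'_{2,b}(\lambda)\, d\lambda$

Now let us make the following choices:

$\lambda \in [0, 2\pi] \iff \mathrm{supp}\, \tilde S_x^{(j)} \subseteq [0, 2\pi]$ (12)

$Z = \frac{1}{2\pi}$ (13)

$T'_{2,b} = \sqrt{2\pi}$ (14)

$T'_{1,a}(\lambda) = \frac{\sqrt{2\pi}}{4}\, |\cos(\lambda - a)|$ (15)

$\tilde S_x^{(1)}(\lambda) = \mathrm{sgn}(\cos(\lambda - x)), \quad \tilde S_x^{(2)} = -\tilde S_x^{(1)}$ (16)

With these choices the correlations (8) become

$\langle S_a^{(1)} S_b^{(2)} \rangle = -\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, \frac{1}{4}\, |\cos(\lambda - a)|\, d\lambda$ (17)

$= -\frac{1}{4}\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - b))\, \cos(\lambda - a)\, d\lambda = -\cos(b - a) = -a \cdot b$

which are the EPR correlations.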
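The value of the final integral can be checked numerically. The sketch below evaluates $-\frac{1}{4}\int_0^{2\pi} \mathrm{sgn}(\cos(\lambda - a))\, \mathrm{sgn}(\cos(\lambda - b))\, |\cos(\lambda - a)|\, d\lambda$ with a plain midpoint rule and compares it with $-\cos(b - a)$; the function names and the number of grid points are assumptions made here, and the check concerns only the integral, not the physical derivation.

```python
import math

def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)

def chameleon_correlation(a, b, n=100_000):
    """Midpoint-rule estimate of
    -(1/4) * integral_0^{2 pi} sgn(cos(l - a)) * sgn(cos(l - b)) * |cos(l - a)| dl."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        total += (
            sgn(math.cos(lam - a))
            * sgn(math.cos(lam - b))
            * abs(math.cos(lam - a))
        )
    return -0.25 * total * h
```

For angles a and b of two unit vectors in the plane, the estimate agrees with $-\cos(b - a)$ up to the discretization error of the grid.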

References

1. L. Accardi, Phys. Rep. 77, 169-192 (1981).
2. L. Accardi, Urne e camaleonti. Dialogo sulla realtà, le leggi del caso e la teoria quantistica (Il Saggiatore, 1997). Japanese translation, Maruzen (2000); Russian translation, ed. by Igor Volovich (PHASIS Publishing House, 2000); English translation by Daniele Tartaglia, to appear.
3. L. Accardi, On the EPR paradox and the Bell inequality, Volterra Preprint N. 350 (1998).
4. L. Accardi, M. Regoli, Quantum probability and the interpretation of quantum mechanics: a crucial experiment. Invited talk at the workshop "The applications of mathematics to the sciences of nature: critical moments and aspects", Arcidosso, June 28-July 1 (1999). To appear in the proceedings of the workshop. Preprint Volterra N. 399 (1999).
5. L. Accardi, M. Regoli, Local realistic violation of Bell's inequality: an experiment. Conference given by the first-named author at the Dipartimento di Fisica, Università di Pavia, on 24-02-2000. Preprint Volterra N. 402.
6. L. Accardi, M. Regoli, Non-locality and quantum theory: new experimental evidence. Invited talk given by the first-named author at the Conference "Quantum paradoxes", University of Nottingham, on 4-05-2000. Preprint Volterra N. 421.
7. J. S. Bell, Physics 1, 3, 195-200 (1964).
8. J. S. Bell, Rev. Mod. Phys. 38, 447-452 (1966).
9. J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Phys. Rev. Letters 49, 1804-1806 (1969); J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge Univ. Press, 1987).
10. J. F. Clauser, M. A. Horne, Phys. Rev. D 10, 2 (1974).
11. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).
12. A. Einstein, in Albert Einstein: Philosopher-Scientist, edited by P. A. Schilpp, Library of Living Philosophers (Evanston, Illinois, 1949).


Refutation of Bell's Theorem

Guillaume ADENIER, Louis Pasteur University, Strasbourg, France

E-mail guillaumeadenierulpu-strasbgfr

Bell's Theorem was developed on the basis of considerations involving a linear combination of spin correlation functions, each of which has a distinct pair of arguments. The simultaneous presence of these different pairs of arguments in the same equation can be understood in two radically different ways: either as strongly objective, that is, all correlation functions pertain to the same set of particle pairs, or as weakly objective, that is, each correlation function pertains to a different set of particle pairs. It is demonstrated that once this meaning is determined, no discrepancy appears between local realistic theories and quantum mechanics: the discrepancy in Bell's Theorem is due only to a meaningless comparison between a local realistic inequality written within the strongly objective interpretation (thus relevant to a single set of particle pairs) and a quantum mechanical prediction derived from a weakly objective interpretation (thus relevant to several different sets of particle pairs).

1 Introduction

Bell's Theorem [1] exhibits a peculiar discrepancy between any local realistic theory and Quantum Mechanics, which leads to empirically distinguishable alternatives. The quandary is that neither local realistic conceptions nor Quantum Mechanics are easy to abandon. Indeed, classical physics and common sense are usually based upon the former, while the latter is rightly presented as the most successful theory of all times. Several experiments have been done; all but a few [2] show violations of Bell inequalities [3]. Yet the ideas brought forth by Bell's Theorem are so disconcerting that there is still incredulity, not to mention antipathy, evoked by the verdict. The purpose of this article is to provide a refutation of this theorem within a strictly quantum theoretical framework, without the use of outside assumptions.

2 The EPRB gedanken experiment

2.1 Spin observables and singlet state

Bell's theorem is usually based on a didactic reformulation of the EPR (Einstein, Podolsky and Rosen [4]) gedanken experiment due to D. Bohm [5]. In this EPRB gedanken experiment, a pair of spin-1/2 particles with total spin zero is produced such that each particle moves away from the source in opposite directions along the y-axis. Two Stern-Gerlach devices are placed at opposite points (left and right) on the y-axis and are oriented respectively along the directions u and v. The Hilbert space associated with the entire EPRB system is $\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R$, where $\mathcal{H}_L$ and $\mathcal{H}_R$ are the Hilbert spaces associated with each Stern-Gerlach device respectively. The spin observable has two counterparts in this new product space $\mathcal{H}$:

$\sigma_L \cdot u = \sigma \cdot u \otimes I_R$ (1)

$\sigma_R \cdot v = I_L \otimes \sigma \cdot v$ (2)

where $I_L$ and $I_R$ are the identity operators of $\mathcal{H}_L$ and $\mathcal{H}_R$. Contrary to the observables $\sigma \cdot u$ and $\sigma \cdot v$, which are mutually non commuting when $u \ne v$, these new observables $\sigma_L \cdot u$ and $\sigma_R \cdot v$ do commute, reflecting the fact that the Stern-Gerlach devices are arbitrarily far from each other and are thus measuring distinct subsystems. The product of these two observables is therefore also an observable, and can be understood as a spin correlation observable corresponding to the joint spin measurement of both Stern-Gerlach devices. Its eigenvectors are $|\epsilon_L, u\rangle \otimes |\epsilon_R, v\rangle$ with corresponding eigenvalues $\epsilon_L \cdot \epsilon_R$, where each $\epsilon$ is either +1 or -1.

In an EPRB gedanken experiment the source produces particle pairs with zero total spin, represented by the singlet state

$|\psi\rangle = \frac{1}{\sqrt{2}} \left[\, |+, n\rangle \otimes |-, n\rangle - |-, n\rangle \otimes |+, n\rangle \,\right]$ (3)

where n is an arbitrary unit vector, which can usually be omitted since the singlet state is invariant under rotation [6].

2.2 Statistical properties and hidden-variables

The expectation value of a spin observable for the singlet state $|\psi\rangle$ is zero:

$\langle\psi| \sigma \cdot u \otimes I_R |\psi\rangle = 0, \quad \langle\psi| I_L \otimes \sigma \cdot v |\psi\rangle = 0$ (4)

whatever u and v, as follows from the rotational invariance of the singlet state. Likewise, the expectation value of the spin correlation observable [6,7] is

$E^{\psi}(u, v) = \langle\psi| (\sigma_L \cdot u)(\sigma_R \cdot v) |\psi\rangle$ (5)

$= -u \cdot v$ (6)

which depends only on the relative angle between u and v.
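The singlet expectation values can be verified with explicit 4x4 matrices. The following sketch uses plain Python lists in the basis in which the spin matrix along z is diagonal; the helper names (kron, spin, expectation, correlation) are assumptions introduced for this illustration.

```python
import math

I2 = [[1, 0], [0, 1]]

# Singlet state in the ordered basis |++>, |+->, |-+>, |-->.
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]

def spin(u):
    """Pauli matrix sigma . u for a real unit vector u = (ux, uy, uz)."""
    ux, uy, uz = u
    return [[uz, ux - 1j * uy], [ux + 1j * uy, -uz]]

def expectation(op, psi):
    """Real part of <psi| op |psi> for a Hermitian matrix op."""
    n = len(psi)
    op_psi = [sum(op[i][j] * psi[j] for j in range(n)) for i in range(n)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(n)).real

def correlation(u, v):
    """Expectation of (sigma_L . u)(sigma_R . v) = (sigma . u) x (sigma . v) in the singlet state."""
    return expectation(kron(spin(u), spin(v)), SINGLET)
```

For unit vectors u and v, `correlation(u, v)` returns the scalar product of u and v with the sign reversed, while the one-sided observables `kron(spin(u), I2)` and `kron(I2, spin(v))` have vanishing expectation value.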


In a local realistic hidden-variables model, a single particle pair is supposed to be entirely characterised by means of a set of hidden-variables, which are symbolically represented by a parameter $\lambda$, so that the measurement result on the left along u can be written as $A(u, \lambda)$ and the result on the right along v as $B(v, \lambda)$. Although the hidden-variables model is supposed to be fully deterministic, it must also be capable of reproducing the stochastic nature of the EPRB gedanken experiment expressed in Eqs. (4) and (6). For that purpose, the complete state specification $\lambda_i$ of any particle pair with label i must be a random variable [8]: its complete state $\lambda_i$ is supposed to be drawn randomly according to a probability distribution $\rho$.

Consider a set of N particle pairs, i = 1, ..., N. The mean value of joint spin measurements for this set is

$M^{\rho}(u, v) = \frac{1}{N} \sum_{i=1}^{N} A(u, \lambda_i)\, B(v, \lambda_i)$ (7)
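The mean value of joint measurements over a set of N pairs can be illustrated with a toy deterministic model. The particular response functions below (a sign rule on a hidden angle drawn uniformly on the circle) are an assumption made here purely for illustration; the text does not commit to any specific form of A and B.

```python
import math
import random

def A(u_angle, lam):
    """Hypothetical left result (+1 or -1) for setting angle u_angle and hidden angle lam."""
    return 1 if math.cos(lam - u_angle) >= 0 else -1

def B(v_angle, lam):
    """Hypothetical right result, opposite to the left one for equal settings."""
    return -A(v_angle, lam)

def sample_lambdas(n, seed=0):
    """Draw n hidden angles uniformly on [0, 2 pi); the distribution is an assumption."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]

def mean_correlation(u_angle, v_angle, lambdas):
    """(1/N) * sum over i of A(u, lambda_i) * B(v, lambda_i) for one fixed set of pairs."""
    return sum(A(u_angle, l) * B(v_angle, l) for l in lambdas) / len(lambdas)
```

Every result is fixed by the hidden angle, yet the mean over the set is a statistical quantity: it equals -1 for equal settings and is close to 0 for orthogonal ones.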

3 The CHSH function

In order to establish Bell's Theorem, a linear combination of correlation functions c(a, b) with different arguments [9] is considered, once when these correlation functions are expectation values $E^{\psi}(u, v)$ given by Quantum Mechanics, i.e. Eq. (6), and once when they are mean values $M^{\rho}(u, v)$ given by local hidden-variables theories, Eq. (7); then the results are to be compared. A well known choice of such a linear combination is the CHSH (Clauser, Horne, Shimony and Holt [10]) function, written with four pairs of arguments:

$S = |c(a, b) - c(a, b') + c(a', b) + c(a', b')|$ (8)

The exact meaning of the simultaneous presence of these different arguments in a CHSH function must be clarified. Basically, there are two possible interpretations: the strongly objective interpretation and the weakly objective interpretation [11,12].

Strongly Objective Interpretation: implies that all correlation functions are relevant to the same set of N particle pairs. As such, they cannot be relevant to actual experiments, but rather to what results would have been obtained if measured on the same set of N particle pairs along different directions.

Weakly Objective Interpretation: implies that each correlation function is actually to be measured on distinct sets of N particle pairs, that is, for each pair only one joint spin measurement is to be executed.


The CHSH function was actually developed specifically for experimental convenience [10], and many experiments have been done (the most famous being Aspect's [13]), obviously invoking the natural interpretation, namely the weakly objective one. Nevertheless, the strongly objective interpretation must also be considered, since it remains a possible interpretation a priori, and since the choice between strong and weak objectivity is not made at all explicit in many papers, including Bell's.

It must be stressed that these interpretations are radically different, not only epistemologically but also physically. Indeed, the strongly objective interpretation pertains to a single set of N particle pairs, characterised by the corresponding set of parameters $\{\lambda_i;\ i = 1, \dots, N\}$, whereas the weakly objective interpretation pertains to no less than 4 sets of N particle pairs. The fact is that a finite set of N particle pairs characterised by $\{\lambda_i\}$ cannot be identically reproduced, either theoretically (for each complete state $\lambda_i$ of any particle pair i is a random variable, as defined in Section 2.2) or empirically (for the experimenter has no control over the complete state of a particle pair in a singlet state). Hence, in the weakly objective interpretation, these four sets are necessarily four different sets of particle pairs [7,14], respectively characterised by four different sets of hidden-variables parameters $\{\lambda_{1,i}\}$, $\{\lambda_{2,i}\}$, $\{\lambda_{3,i}\}$ and $\{\lambda_{4,i}\}$.

The difference between the two interpretations can therefore be embodied in the number of degrees of freedom of the whole system. Let f be the degrees of freedom of a single particle pair. In the strongly objective interpretation, the degrees of freedom of the whole CHSH system is then Nf, whereas in the weakly objective interpretation it is 4 times as large, that is 4Nf. Thus, before initiating Bell's analysis, one has to choose explicitly one interpretation and stick to it.

4 Strongly objective interpretation

4.1 Local realistic inequality within strongly objective interpretation

The local realistic formulation of the CHSH function within strong objectivity is written

$S^{\rho}_{strong} = |M^{\rho}(a, b) - M^{\rho}(a, b') + M^{\rho}(a', b) + M^{\rho}(a', b')|$ (9)

which (using Eq. 7) becomes, after factorisation, a summation where each term can have two values [2,7]:

$A(a, \lambda_i) \left[ B(b, \lambda_i) - B(b', \lambda_i) \right] + A(a', \lambda_i) \left[ B(b, \lambda_i) + B(b', \lambda_i) \right] = \pm 2$ (10)

so that the most restrictive local realistic inequality within the strongly objective interpretation is

$S^{\rho}_{strong} \le 2$ (11)

This is the well known generalised formulation of Bell's inequality due to CHSH [10]. It must be stressed once more, however, that this inequality has been established only within the strongly objective interpretation, which means that each expectation value is relevant to the same set of N particle pairs. Hence this result cannot be compared directly with results from real experimental tests, where in fact mean values from four distinct sets of N particle pairs are measured.
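The factorisation argument can be checked by enumeration: for one and the same pair, whatever the four predetermined results are, the combination of Eq. (10) is always +2 or -2, so its mean over a single set of N pairs cannot exceed 2 in absolute value. The sketch below uses arbitrary random +-1 assignments in place of any specific model; the names chsh_term and strong_chsh are introduced here for illustration.

```python
import random
from itertools import product

def chsh_term(A_a, A_a2, B_b, B_b2):
    """A(a)[B(b) - B(b')] + A(a')[B(b) + B(b')] for the results of one single pair."""
    return A_a * (B_b - B_b2) + A_a2 * (B_b + B_b2)

# All values the term can take when the four results are +-1.
possible_values = {chsh_term(*v) for v in product((-1, 1), repeat=4)}

def strong_chsh(results):
    """|mean of chsh_term| over one set of pairs; results holds (A(a), A(a'), B(b), B(b')) per pair."""
    return abs(sum(chsh_term(*r) for r in results)) / len(results)

rng = random.Random(0)
one_set = [tuple(rng.choice((-1, 1)) for _ in range(4)) for _ in range(1000)]
value_on_one_set = strong_chsh(one_set)
```

The bound relies on all four results being defined for each pair of the same set, which is exactly the strongly objective reading.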

4.2 Quantum mechanical prediction within strongly objective interpretation

The quantum prediction for the CHSH function within the strongly objective interpretation is written

$S^{\psi}_{strong} = |E^{\psi}(a, b) - E^{\psi}(a, b') + E^{\psi}(a', b) + E^{\psi}(a', b')|$ (12)

This equation is usually directly evaluated by replacing each expectation value by the scalar product result of Eq. (6). This, unfortunately, is all too hasty.

Indeed, in order to understand better the quantum mechanical meaning of equation (12), it is advantageous to take a step backward, using equation (5):

$S^{\psi}_{strong} = |\langle\psi| (\sigma_L \cdot a)(\sigma_R \cdot b) |\psi\rangle - \langle\psi| (\sigma_L \cdot a)(\sigma_R \cdot b') |\psi\rangle + \langle\psi| (\sigma_L \cdot a')(\sigma_R \cdot b) |\psi\rangle + \langle\psi| (\sigma_L \cdot a')(\sigma_R \cdot b') |\psi\rangle|$ (13)

The four spin correlation observables in this equation are non commuting observables (this can be shown by calculating the commutator of $(\sigma_L \cdot u)(\sigma_R \cdot v)$ and $(\sigma_L \cdot u)(\sigma_R \cdot v')$ with $v \ne v'$), so that the meaning of their combination must be questioned.
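The non-commutativity of two spin correlation observables sharing the left direction can be verified directly. The sketch below computes the commutator of $(\sigma \cdot u) \otimes (\sigma \cdot v)$ and $(\sigma \cdot u) \otimes (\sigma \cdot v')$ with plain Python lists; the helper names are assumptions introduced for this illustration.

```python
def kron(A, B):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]

def spin(u):
    """Pauli matrix sigma . u for a real unit vector u = (ux, uy, uz)."""
    ux, uy, uz = u
    return [[uz, ux - 1j * uy], [ux + 1j * uy, -uz]]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    """AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def max_abs_entry(M):
    """Largest modulus among the entries of M."""
    return max(abs(x) for row in M for x in row)

u = (0.0, 0.0, 1.0)
v = (0.0, 0.0, 1.0)
v_prime = (1.0, 0.0, 0.0)

different_v = commutator(kron(spin(u), spin(v)), kron(spin(u), spin(v_prime)))
same_v = commutator(kron(spin(u), spin(v)), kron(spin(u), spin(v)))
```

For this choice of directions the commutator has entries of modulus 2, while it vanishes identically when v' = v.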

According to Von Neumann [15], any linear combination of expectation values of different observables R, S, ... is meaningful in quantum mechanics:

$\langle R + S + \dots \rangle_{\psi} = \langle R \rangle_{\psi} + \langle S \rangle_{\psi} + \dots$ (14)

even if R, S, ... are non commuting observables. However, as was stressed by d'Espagnat [11,16], quantum mechanics is only a weakly objective theory, and expectation values given by quantum mechanics are also weakly objective statements, that is to say, statements relevant to observations, so that when R, S, ... are non commuting observables, the expectation values cannot be simultaneously relevant to the same set of N systems: each expectation value is necessarily relevant to a distinct set of N systems. Therefore the only possible meaning of equation (13) is weakly objective, not strongly objective as desired. Of course, this does not imply that Quantum Mechanics cannot provide any meaning at all for the CHSH function; it implies only that this meaning cannot be strongly objective.

Since the local realistic inequality $S^{\rho}_{strong} \le 2$ cannot be compared with any strongly objective prediction given by Quantum Mechanics, Bell's Theorem cannot be verified with a strongly objective interpretation given to the CHSH function. Hence there is no choice but to rely on the weakly objective interpretation in order to compare hidden-variables theories and Quantum Mechanics.

5 Weakly objective interpretation

5.1 Quantum mechanical prediction within weakly objective interpretation

It was shown in Section 3 that strong objectivity and weak objectivity pertain to different physical systems. This difference should therefore appear in the relevant equations. Indeed, the correlation expressed in Eq. (6) is relevant to spin measurements performed on particles that once constituted a single parent particle. Yet two particles issued from two distinct parents never have interacted with each other, so that spin measurements performed on such particle pairs cannot be correlated. Hence, if left and right spin measurements are performed on two distinct sets of N particle pairs instead of the same set, there should be no correlation, and this property should appear in a generalised spin correlation function (i.e. generalised to the case of spin measurements performed on different sets of particle pairs).

This can be easily done within a quantum theoretical framework by means of a distinct EPRB space for each set of N particle pairs. Let $\mathcal{H}_j$ be the EPRB Hilbert space associated with the jth set of particle pairs. In this Hilbert space, the EPRB gedanken experiment is represented by the singlet state $|\psi_j\rangle$ (see Section 2):

$|\psi_j\rangle = \frac{1}{\sqrt{2}} \left[\, |+\rangle \otimes |-\rangle - |-\rangle \otimes |+\rangle \,\right]$ (15)

The whole CHSH experiment with the four sets of particle pairs can then be expressed in terms of a new tensor product space $\mathcal{H}_{1234} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3 \otimes \mathcal{H}_4$, in which the state vector is

$|\psi_{1234}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes |\psi_3\rangle \otimes |\psi_4\rangle$ (16)


The counterparts of observables in $\mathcal{H}_{1234}$ are obtained as in Section 2.1. For instance, the observable pertaining to the right Stern-Gerlach device for the 2nd set of particle pairs is

$$\sigma_{2R}\cdot\mathbf{u} = I_1\otimes(\sigma_R\cdot\mathbf{u})\otimes I_3\otimes I_4 \tag{17}$$

where $I_j$ is the identity operator of the EPRB space $\mathcal{H}_j$. Hence the expectation value of the product of two spin observables, the first belonging to the kth set and the second to the lth set, is

$$E_{kl}(\mathbf{u},\mathbf{v}) = \langle\psi_{1234}|(\sigma_{kL}\cdot\mathbf{u})(\sigma_{lR}\cdot\mathbf{v})|\psi_{1234}\rangle \tag{18}$$

and this is the generalised expectation value of spin correlation observables that was sought. The expectation value for measurements performed on the same set (k = l) of particle pairs is already known, Eq. (6), and $E_{kk}(\mathbf{u},\mathbf{v})$ should provide the same result. Indeed, using Eqs. (16) and (17) leads to

$$E_{kk}(\mathbf{u},\mathbf{v}) = \langle\psi_k|(\sigma_L\cdot\mathbf{u})(\sigma_R\cdot\mathbf{v})|\psi_k\rangle = -\mathbf{u}\cdot\mathbf{v} \tag{19}$$

but when $k \neq l$ the result is quite different:

$$E_{kl}(\mathbf{u},\mathbf{v}) = \langle\psi_k|\sigma_L\cdot\mathbf{u}|\psi_k\rangle\,\langle\psi_l|\sigma_R\cdot\mathbf{v}|\psi_l\rangle = 0 \tag{20}$$

in accord with Eq. (4). There are indeed no correlations between two sets of particle pairs, as stipulated at the beginning of this section.
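Eqs. (19) and (20) can be checked numerically. The following is a minimal sketch, not the author's calculation: it assumes only two sets of particle pairs instead of four (to keep the matrices at 16 x 16), and the helper names (`kron`, `spin`, `embed`, `correlation`) are hypothetical, introduced here for illustration.

```python
import math

# Pauli matrices and the 2x2 identity as nested lists (standard library only).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def spin(n):
    """Spin observable sigma . n for a unit vector n = (nx, ny, nz)."""
    return [[n[0] * SX[i][j] + n[1] * SY[i][j] + n[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]


def embed(slot, op):
    """Put a one-particle operator in one of four particle slots.

    Slot order (an assumption of this sketch): set 1 left, set 1 right,
    set 2 left, set 2 right. All other slots carry the identity.
    """
    out = [[1]]
    for s in range(4):
        out = kron(out, op if s == slot else I2)
    return out


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Singlet state of one EPRB space, basis order |++>, |+->, |-+>, |-->.
R = 1 / math.sqrt(2)
SINGLET = [0.0, R, -R, 0.0]
# Product of two independent singlets: the analogue of Eq. (16) for two sets.
PSI = [x * y for x in SINGLET for y in SINGLET]


def correlation(k, l, u, v):
    """<psi|(sigma_kL . u)(sigma_lR . v)|psi> for sets k, l in {0, 1}."""
    w = matvec(embed(2 * l + 1, spin(v)), PSI)
    w = matvec(embed(2 * k, spin(u)), w)
    return sum(p * x for p, x in zip(PSI, w)).real  # PSI is real


if __name__ == "__main__":
    theta = 0.7
    u = (0.0, 0.0, 1.0)
    v = (math.sin(theta), 0.0, math.cos(theta))
    print("same set:     ", correlation(0, 0, u, v), "vs", -math.cos(theta))
    print("distinct sets:", correlation(0, 1, u, v))
```

For measurements on the same set the sketch returns $-\mathbf{u}\cdot\mathbf{v}$, and for measurements on distinct sets it returns zero, because each singlet has vanishing single-particle spin expectation values.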

Now, contrary to what was done in Section 4.2, it is possible to proceed here in full accord with the quantum mechanical postulates, because the spin correlation observables built from observables such as the one given in Eq. (17) are mutually commuting, so that a linear combination of these commuting observables is an observable as well. The CHSH experiment can therefore be described by a new observable

$$S_{\mathrm{weak}} = (\sigma_{1L}\cdot\mathbf{a})(\sigma_{1R}\cdot\mathbf{b}) - (\sigma_{2L}\cdot\mathbf{a})(\sigma_{2R}\cdot\mathbf{b}') + (\sigma_{3L}\cdot\mathbf{a}')(\sigma_{3R}\cdot\mathbf{b}) + (\sigma_{4L}\cdot\mathbf{a}')(\sigma_{4R}\cdot\mathbf{b}') \tag{21}$$

and the quantum prediction for the CHSH function within a weakly objective interpretation is therefore obtained by calculating the expectation value of the observable $S_{\mathrm{weak}}$ when the system is in the quantum state $|\psi_{1234}\rangle$:

$$S^{\mathrm{qm}}_{\mathrm{weak}} = \langle\psi_{1234}|S_{\mathrm{weak}}|\psi_{1234}\rangle \tag{22}$$

which, using Eqs. (17), (18) and (19), is

$$S^{\mathrm{qm}}_{\mathrm{weak}} = E_{11}(\mathbf{a},\mathbf{b}) - E_{22}(\mathbf{a},\mathbf{b}') + E_{33}(\mathbf{a}',\mathbf{b}) + E_{44}(\mathbf{a}',\mathbf{b}') \tag{23}$$


This equation is not ambiguous (as was Eq. (12)): it is a linear combination of expectation values, each relevant to a distinct set of N particle pairs. This equation is therefore weakly objective, as requested.

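Finally, using Eq. (19) yields

$$S^{\mathrm{qm}}_{\mathrm{weak}} = -\big(\mathbf{a}\cdot\mathbf{b} - \mathbf{a}\cdot\mathbf{b}' + \mathbf{a}'\cdot\mathbf{b} + \mathbf{a}'\cdot\mathbf{b}'\big) \tag{24}$$

with a well-known maximum equal to

$$\max\big(S^{\mathrm{qm}}_{\mathrm{weak}}\big) = 2\sqrt{2} \tag{25}$$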

This numerical result is indeed the one given in the literature, the only difference here being the fact that the meaning of this result is unambiguously weakly objective. Quantum Mechanics, which is a weakly objective theory, provides a clear answer to the CHSH function understood as a weakly objective question.
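The maximum quoted in Eq. (25) can be illustrated with a short numerical search. This is only a sketch under stated assumptions: the four directions are restricted to a common plane and parametrised by angles, the first direction is fixed at angle 0, and the function names are hypothetical.

```python
import math


def unit(theta):
    """Unit vector in the plane at angle theta."""
    return (math.cos(theta), math.sin(theta))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def s_weak(a, a2, b, b2):
    """CHSH combination -(a.b - a.b' + a'.b + a'.b') for coplanar directions.

    Each correlation is taken as E(u, v) = -u.v, the singlet prediction.
    The arguments are the angles of a, a', b, b'.
    """
    va, va2, vb, vb2 = unit(a), unit(a2), unit(b), unit(b2)
    return -(dot(va, vb) - dot(va, vb2) + dot(va2, vb) + dot(va2, vb2))


def grid_max(steps=32):
    """Brute-force maximum over a grid of angles, with a fixed at 0."""
    angles = [2 * math.pi * i / steps for i in range(steps)]
    return max(s_weak(0.0, a2, b, b2)
               for a2 in angles for b in angles for b2 in angles)


if __name__ == "__main__":
    print(grid_max(), 2 * math.sqrt(2))
```

On this grid the largest value found coincides with $2\sqrt{2}$, reached for instance with angles $0$, $\pi/2$, $5\pi/4$ and $7\pi/4$ for $\mathbf{a}$, $\mathbf{a}'$, $\mathbf{b}$ and $\mathbf{b}'$.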

5.2 Local realistic inequality within weakly objective interpretation

The last step consists in comparing the quantum prediction $S^{\mathrm{qm}}_{\mathrm{weak}}$ with its local realistic counterpart $S^{\mathrm{lr}}_{\mathrm{weak}}$. As was stressed in Section 3, the jth set of particle pairs must be characterised by a distinct set of hidden-variables parameters $\{\lambda_{j,i};\ i = 1,\dots,N\}$. Hence, to the generalised expectation value of the spin correlation observable, Eq. (18), corresponds the generalised mean value of joint spin measurements

$$M_{kl}(\mathbf{u},\mathbf{v}) = \frac{1}{N}\sum_{i=1}^{N} A(\mathbf{u},\lambda_{k,i})\,B(\mathbf{v},\lambda_{l,i}) \tag{26}$$

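which is a priori capable of reproducing not only the $k = l$ prediction, Eq. (19), but also the $k \neq l$ prediction, Eq. (20). The local realistic CHSH function with a weakly objective interpretation is therefore

$$S^{\mathrm{lr}}_{\mathrm{weak}} = M_{11}(\mathbf{a},\mathbf{b}) - M_{22}(\mathbf{a},\mathbf{b}') + M_{33}(\mathbf{a}',\mathbf{b}) + M_{44}(\mathbf{a}',\mathbf{b}') \tag{27}$$

and that is, explicitly,

$$S^{\mathrm{lr}}_{\mathrm{weak}} = \frac{1}{N}\sum_{i=1}^{N}\Big[A(\mathbf{a},\lambda_{1,i})B(\mathbf{b},\lambda_{1,i}) - A(\mathbf{a},\lambda_{2,i})B(\mathbf{b}',\lambda_{2,i}) + A(\mathbf{a}',\lambda_{3,i})B(\mathbf{b},\lambda_{3,i}) + A(\mathbf{a}',\lambda_{4,i})B(\mathbf{b}',\lambda_{4,i})\Big] \tag{28}$$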


This expression is to be compared with the one pertaining to the strongly objective interpretation (Section 4.1), which contained terms that could be factored. Here, since each term is different from the others, no factorisation is possible, i.e. there is no way to derive a Bell inequality.7 This is not the first time this fact has been noticed; unfortunately, no conclusion was drawn then. Yet this fact cannot be ignored, for it has been shown in Section 4 that Bell's Theorem cannot be demonstrated within a strongly objective interpretation.

Here, the only local realistic inequality that can be derived is obtained by considering (as was done with Eq. (10)) the possible numerical values of each term of the summation in Eq. (28), for which the extrema are +4 and -4, so that the narrowest local realistic inequality that can be derived from Eq. (28) is nothing but

$$\big|S^{\mathrm{lr}}_{\mathrm{weak}}\big| \le 4 \tag{29}$$

This most restrictive local realistic inequality (which can also be found in Accardi17) is not incompatible with the quantum mechanical prediction, as the maximum of $S^{\mathrm{qm}}_{\mathrm{weak}}$ is $2\sqrt{2}$. This shows that experiments intended to test Bell's Theorem were unfortunately not testing the strongly objective inequality, Eq. (11), which is a Bell inequality, but this weakly objective one, Eq. (29), since all experimental tests are necessarily executed in a weakly objective way, due to the irreducible incompatibility between spin measurements. As was stressed by Sica18 and Accardi17, a local realistic inequality is nothing but an arithmetic identity, and inequality (29) is definitely too lax to be violated by experimental tests.
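The arithmetic behind the two bounds can be made explicit by enumeration. The sketch below is an illustration and not part of the original argument; the function names are hypothetical, and outcomes are assumed to take the values +1 or -1 only. With a single hidden variable shared by all four terms, each summand is restricted to -2 or +2; with four distinct hidden variables, each product is an independent sign and the summand ranges from -4 to +4.

```python
from itertools import product

SIGNS = (-1, 1)


def summand_values_shared():
    """Values of A(a)B(b) - A(a)B(b') + A(a')B(b) + A(a')B(b')
    when one hidden variable fixes all four outcomes at once."""
    return sorted({aa * bb - aa * bb2 + aa2 * bb + aa2 * bb2
                   for aa, aa2, bb, bb2 in product(SIGNS, repeat=4)})


def summand_values_distinct():
    """Values of the same combination when each product comes from a
    different set of pairs, so each product is an independent sign."""
    return sorted({p1 - p2 + p3 + p4
                   for p1, p2, p3, p4 in product(SIGNS, repeat=4)})


if __name__ == "__main__":
    print("shared hidden variable:   ", summand_values_shared())
    print("distinct hidden variables:", summand_values_distinct())
```

Averaging over N summands cannot exceed the largest summand, which gives the bound 2 in the shared case and the bound 4 of inequality (29) in the distinct case.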

6 Conclusion

It was shown that Bell's Theorem cannot be derived, either within a strongly objective interpretation of the CHSH function, because Quantum Mechanics gives no strongly objective results for the CHSH function (see Section 4.2), or within a weakly objective interpretation, because the only derivable local realistic inequality is never violated, either by Quantum Mechanics or by experiments (see Section 5.2). It was demonstrated that the discrepancy in Bell's Theorem is due only to a meaningless comparison between $|S^{\mathrm{lr}}_{\mathrm{strong}}| \le 2$ and $S^{\mathrm{qm}}_{\mathrm{weak}} = 2\sqrt{2}$, where the former is relevant to a system with $Nf$ degrees of freedom, whereas the latter to one with $4Nf$ (see Section 3). The only meaningful comparison is between the weakly objective local realistic inequality $|S^{\mathrm{lr}}_{\mathrm{weak}}| \le 4$ and the weakly objective quantum prediction $S^{\mathrm{qm}}_{\mathrm{weak}} = 2\sqrt{2}$, but these results are not incompatible. Bell's Theorem, therefore, is refuted.


References

1. J. S. Bell, Physics 1, 195 (1964).
2. F. Selleri, Le grand débat de la mécanique quantique (Champs Flammarion, Paris, 1986).
3. A. Aspect, Nature 398, 189 (1999).
4. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
5. D. Bohm, Phys. Rev. 85, 166 (1952).
6. D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
7. A. Bohm, Quantum Mechanics: Foundations and Applications (Springer-Verlag, New York, 1979).
8. J. S. Bell, in Proceedings of the International School of Physics Enrico Fermi, Course IL: Foundations of Quantum Mechanics (Academic, New York, 1971), p. 171.
9. J. S. Bell, Epistemological Letters, p. 2 (July 1975).
10. J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
11. B. d'Espagnat, Veiled Reality: An Analysis of Present-Day Quantum Mechanical Concepts (Addison-Wesley, 1995).
12. B. d'Espagnat, http://arXiv.org/abs/quant-ph/9802046
13. A. Aspect, J. Dalibard and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
14. A. Khrennikov, http://arXiv.org/abs/quant-ph/0006017
15. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1955).
16. B. d'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Massachusetts, 1976).
17. L. Accardi, http://arXiv.org/abs/quant-ph/0007005
18. L. Sica, Opt. Commun. 170, 55 (1999).


PROBABILITY CONSERVATION AND THE STATE DETERMINATION PROBLEM

S. AERTS
Free University of Brussels
Triomflaan 2, Brussels, Belgium
E-mail: saerts@vub.ac.be

The problem of finding an operational definition for the wave vector is briefly examined from a historical point of view. Led by an old idea of Feenberg, we integrate the one-dimensional probability conservation equation to obtain a closed formula that determines the state vector in the spinless case. The formula that determines the state does not depend on the (real) potential, external fields having their influence on the state only through the time derivative of the probability density function in position space. We apply the method to the simple case of a free Gaussian wave packet. Some problems regarding the operational status of the quantities involved are discussed.

1 Introduction

It is well known that Heisenberg constructed the matrix formulation of quantum mechanics by keeping in close accordance with what might be labelled the principle of operationality. Roughly, one can describe this principle as a determination to introduce only measurable quantities. Schrödinger, more concerned with Anschaulichkeit than operationality, introduced rather unscrupulously the concept of a wave function. He initially interpreted the wave function as a charge density in space, but this interpretation is difficult to extend to several-particle problems.a The interpretation that would stand the test of time, as testified by its being awarded the Nobel prize in 1954, was due to Born. In analogy with the theory of electromagnetic radiation, in which the intensity is the square of the amplitude, Born took the step to interpret the intensity of an electromagnetic wave in a given region of space as proportional to the relative frequency of a photon detection in that region, and the probabilistic interpretation was born. However, this correspondence still does not make it an operational quantity, as for every density $\rho(x,t)$ there are infinitely many $\varphi(x,t)$ such that, with $\psi(x,t) = \sqrt{\rho(x,t)}\,e^{i\varphi(x,t)}$, we get $\psi^*(x,t)\psi(x,t) = \rho(x,t)$. The problem is then to find suitable functions that we can approximate experimentally in a statistical way and that, in some well-chosen combination, yield the same information as the complete wave function.

In order to make the question mathematically more precise, Prugovecki2 introduced the notion of informational completeness. A family $\mathcal{F} = \{O_i : i \in I\}$ of bounded operators on a Hilbert space $\mathcal{H}$ is called informationally complete iff, for every two density operators $\rho$ and $\rho'$, the equality $\mathrm{Tr}(\rho O_i) = \mathrm{Tr}(\rho' O_i)$ for all $i \in I$ implies $\rho = \rho'$. This definition implies that the set of expectation values of an informationally complete set of operators allows only one state operator from which the expectation values could have been derived. What characterizes such a set? In a classical statistical framework we can calculate all macroscopic quantities from a single density function $\rho(p,q)$ in phase space. Hence, by analogy, one is naturally led to the following interesting question, originally due to Pauli3: is it sufficient to know the probability density functions of position and momentum to determine unambiguously the quantum mechanical state of the physical system? In the quantum mechanical case it is sufficient to know the wave function in coordinate space $\psi(x,t)$, since the corresponding wave function for the same system in momentum space $\psi(p,t)$ is given by its Fourier transform. Hence we can phrase the problem in a more mathematical way: is it possible to determine a square integrable function uniquely from both its modulus and the modulus of its Fourier transform? Possibly the first non-trivial counterexamples came from Bargmann,b who constructed explicit examples of wave functions $\psi_1$ and $\psi_2$ that give rise to the same probability distributions for position and momentum, but give a different probability distribution for a third operator that does not commute with the position or momentum operator. This leads to the remarkable conclusion that the wave function in its coordinate representation contains more information than the corresponding probability densities in position and momentum together.

Due to Bargmann we know the answer to be negative in a physically relevant way,c and what is now commonly referred to as the Pauli problem is either the problem of determining the set of states that share the same modulus and the modulus of their Fourier transform, or the problem of finding a set of observables that are informationally complete. The problems are related but not identical, and we prefer to refer to the first version of the problem as the Pauli problem and to the second as simply the state determination problem. It seems much more work has been done on the state determination problem, which is not surprising given the fact that the Pauli problem is a special case of it. With the exception of the production of counterexamples such as Bargmann's, the first instructive results regarding the Pauli problem were obtained only in

a For a rescue attempt of the original Schrödinger interpretation, see Dorling.1
b Bargmann never seems to have published these results himself, and as a result little reference is given to his work in the literature. However, the examples can be found in Reichenbach.4
c The problem re-appeared unaltered in the 1958 edition of Pauli's book, more than a decade after the first counterexamples.


1978 by Corbett and Hurst.5 In their paper they construct physically important classes of functions that are uniquely determined by their position and momentum distributions. However, they also show that there exist dense subsets of states that are not uniquely determined by their position and momentum distributions, and as a consequence any state can be approximated in norm by a non-unique state. Extensions, comments and counterexamples to their work can be found in Friedman6 and Pavicic.7 Nevertheless, the complete characterization of the set of states that share modulus and the modulus of their Fourier transform is still open.

As for the state determination problem, we can split the work into those who were primarily concerned with establishing a set of observables that is informationally complete (or disproving a certain set to have this property) and those who set out to characterize such sets. The first group includes Feenberg8 (1933), Moyal9 (1949), Gale, Guth and Trammell10 (1968), Band and Park11,12,13 (1970-1971) and many more.14,15,16 We will not go into the reconstruction of the state by placing the entity in different potentials, a method pioneered by Lamb17 and one that inspired many similar approaches such as Wiesbrock18 and Weigert19, nor will we mention the vast literature pertaining to the measurement of the Wigner distribution known as phase-space tomography.

However, concerning the characterization of informationally complete sets, we cannot help but make the following elementary remarks. Suppose we have a non-trivial (i.e. not a multiple of the identity) self-adjoint operator $A$ that commutes with every member of a set of operators $\mathcal{S}$ in a Hilbert space $\mathcal{H}$. It is well known that the one-parameter family of unitary operators $\exp(itA)$ also commutes with every element of $\mathcal{S}$. Now take any $\psi$ that is not an eigenvector of $A$. For any observable in $\mathcal{S}$, the state $\psi_t = \exp(itA)\psi$ gives the same expectation value for this operator, whatever numerical value $t$ has. But if $t \neq s$, it follows that $\psi_t \neq \psi_s$ (for the relation of this with superselection rules see Wick, Wightman and Wigner (1952)20, Emch and Piron (1963)21 and Piron22). Hence $\mathcal{S}$ is not an informationally complete set of observables. So a necessary condition for a set of observables to be informationally complete is maximality in the sense of Dirac, in other words, that there be no other non-trivial operator that commutes with every member of the set. However, this is far from sufficient. As Busch and Lahti23 have shown, it is easy to derived from the considerations above that no commuting set of observables is informationally complete. Maximal commuting sets of observables serve as a means of state preparation, not state identification. This means that, at least for continuous variables, the Pauli set $\{P, Q\}$ is in a certain sense the minimal set that one could possibly hope to be informationally complete (although Bargmann has shown this in general not to be the case).

d One arrives at this result by allowing $A$ to be a member of $\mathcal{S}$.
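The elementary remark above can be illustrated on the smallest possible example. The sketch below is an assumption-laden toy case, not taken from the paper: the Hilbert space is two-dimensional, $A = \sigma_z$, the set $\mathcal{S}$ contains only $\sigma_z$, and $\psi = (1, 1)/\sqrt{2}$ is not an eigenvector of $A$. The function names are hypothetical.

```python
import cmath
import math


def evolve(t):
    """psi_t = exp(i t A) psi for A = sigma_z and psi = (1, 1)/sqrt(2)."""
    r = 1 / math.sqrt(2)
    return (cmath.exp(1j * t) * r, cmath.exp(-1j * t) * r)


def expect_sz(psi):
    """Expectation value of sigma_z, the only member of the set S."""
    return abs(psi[0]) ** 2 - abs(psi[1]) ** 2


def expect_sx(psi):
    """Expectation value of sigma_x, which does not commute with A."""
    return 2 * (psi[0].conjugate() * psi[1]).real


if __name__ == "__main__":
    for t in (0.0, 0.3, 1.1):
        psi_t = evolve(t)
        print(t, expect_sz(psi_t), expect_sx(psi_t))
```

The expectation value of $\sigma_z$ is zero for every $t$, so it cannot tell the states $\psi_t$ apart, whereas the expectation value of $\sigma_x$ equals $\cos 2t$ and does distinguish them. A set consisting of $\sigma_z$ alone is therefore not informationally complete.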

2 Conservation of Probability

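What we will present in this article is an elaboration on the reasoning followed by Feenberg. Consider the time-dependent Schrödinger equation in $\psi$ with a reale potential $V$, using the shorthand $\psi$ for $\psi(\mathbf{r},t)$:

$$\frac{\partial\psi}{\partial t} = -\frac{\hbar}{2im}\nabla^2\psi + \frac{1}{i\hbar}V\psi$$

Multiply by $\psi^*$ and add this to the complex conjugate of the above equation multiplied by $\psi$. After some elementary vector operator manipulation we find what is commonly known as the conservation law of probability:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \qquad \rho = \psi^*\psi, \qquad \mathbf{j} = \frac{\hbar}{2im}\big(\psi^*\nabla\psi - \psi\nabla\psi^*\big)$$

Substitution of the polar representation of the wave vector $\psi(\mathbf{r},t) = \sqrt{\rho}\,e^{i\varphi}$ ($\varphi$ assumed real) into the former equation yields a second order partial differential equation, which is in fact a Fokker-Planck equation with zero diffusion coefficient and the phase serving as a potential:

$$\frac{\partial\rho}{\partial t} + \frac{\hbar}{m}\nabla\cdot\big(\rho\,\nabla\varphi\big) = 0$$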

Feenberg's argument is a uniqueness result based on this last equation. It amounts to showing that any two phase functions that satisfy this equation and some gentle boundary conditions differ by at most a constant. His 1933 thesis is hard to get hold of, but the argument was (erroneously10,15) extended by Kemble24 to three spatial dimensions in his much easier to find handbook on quantum mechanics. What we will do here is go back to the original one-dimensional idea, but rather than trying to establish a uniqueness result, we will show that in this simple case a solution can be obtained by direct integration.

3 Determination of the phase function

So $\rho$ and $\varphi$ satisfy the conservation law as given by the last equation. Rewriting this equation in one dimension, evaluated at a specific time instant $t = t_0$, gives us

$$\rho(x,t_0)\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left.\frac{\partial\rho(x,t)}{\partial t}\right|_{t=t_0} = 0$$

e The imaginary part of a complex potential can be used to mimic creation and annihilation effects. Although this is sometimes a useful approximation, such results violate the continuity equation, and for a more reliable analysis one should really use a second quantized theory.

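Assume for the time being that $\rho(x,t_0) \neq 0$ and divide the equation by $\rho(x,t_0)$:

$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial\ln\rho(x,t_0)}{\partial x}\frac{\partial\varphi}{\partial x} + \frac{m}{\hbar}\left.\frac{\partial\ln\rho(x,t)}{\partial t}\right|_{t=t_0} = 0$$

Assuming $\rho(x,t_0)$ and its time derivative to be known functions, we can solve for the unknown phase $\varphi(x,t_0)$. Set

$$\phi = \frac{\partial\varphi}{\partial x}, \qquad f(x) = \frac{\partial\ln\rho(x,t_0)}{\partial x}, \qquad g(x) = -\frac{m}{\hbar}\left.\frac{\partial\ln\rho(x,t)}{\partial t}\right|_{t=t_0}$$

As all quantities are evaluated at the same time instant $t = t_0$, we will not bother to give further notational reference to this fact. In what follows we will also abbreviate (with abuse of language) $\left(\partial\ln\rho(x,t)/\partial t\right)_{t=t_0}$ as $\partial_t\ln\rho(x)$. Applying these transformations, the equation becomes

$$\frac{d\phi}{dx} + f(x)\,\phi = g(x)$$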

So we have transformed the second order partial differential equation into an ordinary first order linear differential equation with a source $g(x)$ at a fixed time instant. The solution of the homogeneous equation is $\phi_h = \exp[-\int f(x)\,dx] = \rho^{-1}(x)$. The general solution, with $c$ chosen to fit the boundary condition, is $\phi(x) = \phi_h(x)\big(c + \int^x g(s)\rho(s)\,ds\big)$. We have to integrate this result once more to get $\varphi(x)$:

$$\varphi(x) = \int^{x}\phi_h(r)\Big(c + \int^{r} g(s)\rho(s)\,ds\Big)dr = \int^{x}\rho^{-1}(r)\Big[c - \frac{m}{\hbar}\int^{r}\rho(s)\,\partial_t\ln\rho(s)\,ds\Big]dr = \int^{x}\Big(c - \frac{m}{\hbar}\int^{r}\partial_t\rho(s)\,ds\Big)\frac{dr}{\rho(r)}$$

4 Validity and range of applicability

The solution is seen to be a two-parameter family of curves, one for every value of the constant $c$ and one for every lower limit, say $x_0$, of the $r$ integration.f The result of changing the lower integration limit is only the addition of an overall constant to $\varphi(x,t)$. Because we know the quantum mechanical expectation values and probabilities to be invariant under such an addition, we set this constant equal to zero. The value of the constant $c$ can potentially affect the phase in a more profound way. Depending on the particular $\rho(r,t)$ used, $c/\rho(r,t)$ might diverge when $\rho(r,t)$ is zero for some value(s) of $r$, or even worse, for some interval $\Delta r$. First of all, we assumed in our derivation that $\rho(r,t) \neq 0$, but this restriction can easily be removed. Indeed, suppose we have $n$ places $x_i$ where the density does equal zero. A solution $\psi_i$ is then obtained for each interval $]x_i, x_{i+1}[$ by means of our equation. The total solution $\psi$ is obtained by pasting all the $\psi_i$ together, requiring continuity of $\psi$ and $\nabla\psi$.g Now continuity of $\psi$ and $\nabla\psi$ implies continuity of their respective complex conjugates, and hence of $\rho$ and $\nabla\rho$. If we are to infer the phase from actual data, it seems reasonable to require $\varphi$ also to be continuous. In fact, the conservation equation requires it to be twice differentiable. If any cutting and pasting is necessary to obtain the solution, we can easily see that the constant $c$ should be the same for any two pasted pieces. Hence, if the cut is applied at a pole, $c$ has to be zeroh for $\varphi$ to be continuous. We arrive at the same conclusion when we use the same reasoning on a point adjacent to the support of $\rho$. Hence we arrive at the main result of our paper:

f The lower limit of the $s$ integration is absorbed in the constant $c$.

$$\psi(x,t_0) = \sqrt{\rho(x,t_0)}\;\exp\!\left(-\,i\,\frac{m}{\hbar}\int^{x}\frac{dr}{\rho(r,t_0)}\int^{r}\partial_t\rho(s,t_0)\,ds\right)$$

Note that the state does not contain reference to the potential. External fields will show up in the state indirectly, as a consequence of the time dependence of $\rho$. The assumptions that underlie the derivation of the equation are a spinless one-dimensional particle that acts under a real potential $V$, being prepared in a pure state; in short, all that is required for a particle to obey the one-dimensional dynamical Schrödinger equation. However restricted this class is, it does include many examples that can be found in standard textbooks on quantum mechanics.
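The double integration that yields the phase can be sketched numerically. The code below is a minimal illustration under stated assumptions rather than the author's procedure: units with $\hbar = m = 1$ by default, $c = 0$, the lower limit of the inner integral taken at the left edge of the grid (where the density is negligible), trapezoidal quadrature, and the sign convention that follows from the one-dimensional continuity equation, $\partial_x\varphi = -(m/\hbar)\,\rho^{-1}\int^{x}\partial_t\rho\,ds$. The test density, a rigidly translating $\mathrm{sech}^2$ profile with velocity $v$, and all function names are hypothetical choices; for such a density the expected phase is $(mv/\hbar)\,x$ up to a constant.

```python
import math


def cumulative_trapezoid(xs, ys):
    """Running trapezoidal integral of ys over xs, starting from zero."""
    out = [0.0] * len(xs)
    for i in range(1, len(xs)):
        out[i] = out[i - 1] + 0.5 * (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1])
    return out


def reconstruct_phase(xs, rho, dt_rho, m=1.0, hbar=1.0):
    """Phase on the grid xs from the density and its time derivative.

    Uses d(phase)/dx = -(m/hbar) * (1/rho) * integral of dt_rho, with c = 0.
    The result is defined up to an additive constant, and it is unreliable
    in the far tails where rho is close to zero.
    """
    inner = cumulative_trapezoid(xs, dt_rho)
    dphase = [-(m / hbar) * inner[i] / rho[i] for i in range(len(xs))]
    return cumulative_trapezoid(xs, dphase)


def translating_sech2(v, half_width=12.0, step=0.01):
    """Grid, density and time derivative for rho(x, t) = sech^2(x - v t)/2 at t = 0."""
    n = int(round(2 * half_width / step)) + 1
    xs = [-half_width + step * i for i in range(n)]
    rho = [0.5 / math.cosh(x) ** 2 for x in xs]
    # d(rho)/dt = -v d(rho)/dx = v tanh(x) sech^2(x)
    dt_rho = [v * math.tanh(x) / math.cosh(x) ** 2 for x in xs]
    return xs, rho, dt_rho


if __name__ == "__main__":
    v = 1.7
    xs, rho, dt_rho = translating_sech2(v)
    phase = reconstruct_phase(xs, rho, dt_rho)
    i_left, i_right = 1000, 1400  # grid points x = -2 and x = +2
    slope = (phase[i_right] - phase[i_left]) / (xs[i_right] - xs[i_left])
    print("recovered slope:", slope, "expected:", v)
```

Only phase differences inside the region where the density is appreciable should be trusted, since the division by $\rho$ amplifies quadrature errors in the tails.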

Comparing the result we have found to those in the literature, we find the closest match with a result obtained by Gale, Guth and Trammell.10 They apply the definitions of $\rho(\mathbf{r})$ and $\mathbf{j}(\mathbf{r})$ to show that knowing these is sufficient for the determination of the phase. They then discuss a gedanken experiment for establishing the probability current by measuring the expectation of the velocity, and argue by means of this experiment and an intuitive argument that the current $\mathbf{j}(\mathbf{r})$ equals $\rho(\mathbf{r})\langle\mathbf{v}(\mathbf{r})\rangle$ for some $\mathbf{r}$ inside a small space region that is supposed to contain the particle. Our result was obtained by a direct integration and as a consequence is exact. It is, however, difficult to extend to higher dimensions, for two reasons. The first is the fact that the expression for the probability current in the presence of a vector potential becomes $\mathbf{J}(\mathbf{x},t) = \mathrm{Re}\{\psi^*(\mathbf{x},t)[\mathbf{p}/m - (q/mc)\mathbf{A}]\psi(\mathbf{x},t)\}$, and depending on the form of the vector potential it is not obvious to what function of the phase this corresponds. If the vector potential corresponds to a uniform magnetic field, or in the absence of a vector potential (in which case one can transform the equation into a Poisson equation), one can solve the continuity equation by employing standard techniques. However, one then encounters a second problem. Providing an initial value for the phase (which is unproblematic, as the phase is only determined within an additive constant) is no longer sufficient; instead we need an initial boundary function. Hence we have to resort to other principles to determine the phase on such a boundary in order to solve the problem. Of course, the principle of conservation may still serve the purpose of reducing the family of admissible functions for the phase of the amplitude. We will now illustrate the principle by applying it to a Gaussian wave packet. Later we will expound a few operational issues regarding the quantities involved in the solution given above.

g This continuity demand is in fact a necessity, because the validity of the equation of probability conservation (and a fortiori of the Schrödinger equation) requires $\psi$ and $\nabla\psi$ to be continuous. A notable but unproblematic exception is that of an infinite potential step.
h The value of $c$ might be non-zero in applications where the continuity equation only expresses conservation of the probability flux in some intermediate region, the boundaries (possibly at infinity) containing sinks or sources of probability.

5 Evolution of a Gaussian Wave Packet

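The full time-dependent wave function for a free Gaussian wave packet is

$$\psi(x,t) = \big[2\pi(\Delta x)_0^2\big]^{-1/4}\left[1 + \frac{i\hbar t}{2m(\Delta x)_0^2}\right]^{-1/2}\exp\!\left[\frac{-x^2/4(\Delta x)_0^2 + ik_0x - ik_0^2\hbar t/2m}{1 + i\hbar t/2m(\Delta x)_0^2}\right]$$

From this we easily calculate $\rho(x,t)$:

$$\rho(x,t) = \psi^*(x,t)\psi(x,t) = \big[2\pi(\Delta x)_0^2\big]^{-1/2}\left[1 + \frac{\hbar^2t^2}{4m^2(\Delta x)_0^4}\right]^{-1/2}\exp\!\left[\frac{-(x - k_0\hbar t/m)^2}{2(\Delta x)_0^2\big[1 + \hbar^2t^2/4m^2(\Delta x)_0^4\big]}\right]$$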

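Now assume we did not know the wave function, only the probability density and its time derivative at some time instant $t = 0$. In an abbreviated form (with easy identification of the coefficients, in particular $c = k_0\hbar/m$ and $d = 2(\Delta x)_0^2$) we can write the probability as

$$\rho(x,t) = a\,(1 + b^2t^2)^{-1/2}\exp\!\left[-\frac{(x - ct)^2}{d\,(1 + b^2t^2)}\right]$$

At time $t = 0$ this gives us $\rho(x,0) = a\exp(-x^2/d)$. The derivative of $\rho$ with respect to the time parameter is

$$\partial_t\rho(x,0) = \frac{\partial}{\partial t}\left[a\,(1 + b^2t^2)^{-1/2}\exp\!\left(-\frac{(x - ct)^2}{d\,(1 + b^2t^2)}\right)\right]_{t=0} = 2a\,\frac{cx}{d}\exp\!\left(-\frac{x^2}{d}\right)$$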

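So the phase becomes

$$\varphi(x,0) = -\frac{m}{\hbar}\int^{x}\frac{dr}{\rho(r,0)}\int^{r}\partial_t\rho(s,0)\,ds = -\frac{2cm}{d\hbar}\int^{x}\exp\!\left(\frac{r^2}{d}\right)\int_{-\infty}^{r}s\,\exp\!\left(-\frac{s^2}{d}\right)ds\,dr = \frac{cm}{\hbar}\int^{x}dr = \frac{m}{\hbar}\frac{k_0\hbar}{m}\,x = k_0x$$

which is precisely the desired phase of the wave function at $t = 0$.

6 Operational Issues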

Expounding Feenberg's uniqueness result, Reichenbach points out that we can recover the phase by numerical computation if we know $\rho(x,t_0)$ and $\partial_t\rho(x,t)|_{t=t_0}$. In order to establish these quantities, Reichenbach outlines the following procedure.4 We take an ensemble A of identically prepared systems such that the ensemble can be properly described by a pure state $\psi$. Now select at random two sub-ensembles from A, say B and C. For each system in B we measure at the time $t_0$ the value of $x$. As the results will vary, we obtain in this way a distribution $\rho(x,t_0)$. Likewise, for each system in C we measure at the time $t_1$ the value of $x$, obtaining a distribution $\rho(x,t_1)$. The quotient

$$\frac{\rho(x,t_1) - \rho(x,t_0)}{t_1 - t_0}$$

is then supposed to approximate $\partial_t\rho(x,t)$ for $t \in [t_0,t_1]$ if the interval $[t_0,t_1]$ is chosen sufficiently small. The wave function can then be obtained through numerical approximation and represents the state of the systems that are left untouched in the original ensemble A.

There is a problem with Reichenbach's procedure for determining these quantities that is of equal concern to our method. Despite the fact that it is entirely possible to position the detector wherever one wants it to be, hence effectively controlling $x$ in $\rho(x,t)$, it is an annoying peculiarity of quanta that one cannot determine when a detection will take place. One places a detector and simply waits for a detection count to happen. The problem seems related to what Mielnik has called the screen problem, in a provocative and enlightening paper by the same name.25 As Mielnik points out, experimentalists perform a lot of experiments, but none resembling an instantaneous check of particle position. Indeed, a measurement setup typically consists of a source; what is emitted undergoes a series of transformations (i.e. an optical bench or a potential) and is subsequently detected by a fixed detector or a set of fixed detectors. If we are to describe operational means of measuring densities at some time instant, we will have to do so by such a typical setup.

To produce anything remotely satisfactory, we will need a few assumptions. A first assumption is that if a particle is detected at some time instant $t_0$ in position $x$, the intricate mechanism between the measurement apparatus and the particle that is responsible for its detection does not depend on $t_0$, and in this sense has no effect on the value of $\rho(x,t)$. However unnatural the assumption might be from a physical point of view, it seems to underlie the statistical interpretation of $\int_\Omega|\psi(x,t)|^2\,dV$ as an instantaneous localization probability of the system in a state $\psi$, in a space region $\Omega$ and at a time instant $t$. In so far as our analysis depends on this assumption, so does the standard interpretation of quantum mechanics. The next assumption is that we are able to control the release of the particle in a certain state within a sufficiently small time interval $\Delta t$, such that within this small time interval the density can reasonably be approximated by a linear function. This can be achieved by placing a shutter mechanism behind the source. Naturally, the shutter opening time has to be substantially less than the coherence time of the particle. A sufficiently short opening time can only be established by experiment, and one can never be quite sure whether there would still be more oscillations on a much shorter time scale. A density function with a larger variation will be harder to approximate, as it requires a shorter shutter opening time and hence will result in a lower detection rate. The wave packet then participates in the transformations we may have set up (optical bench, Stern-Gerlach) and is detected. The time interval between the shutter release and the detection time is noted, together with the position of the detector.


After many such recordings we gather all the data to reconstruct $\rho(x,t)$. How many samples do we need? Well, if the samples were taken at equidistant $\Delta t$ and $\Delta x$, we could do a Fourier synthesis and apply the Shannon-Whittaker sampling theorem. However, due to the non-equidistant spreading of the $t_n$ (at best following some statistical pattern), we need Frame Theory (Duffin and Schaeffer26) to reconstruct band-limited signals from irregularly spaced samples $f(t_n)$. The derivative with respect to time can then be derived from the reconstructed signal, and the phase derived by means of the proposed equation.
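How well a two-time difference quotient approximates the time derivative of the density can be estimated on the free Gaussian packet. The sketch below makes several assumptions that are not in the text: units with $\hbar = m = 1$, initial width $(\Delta x)_0 = 1$ and $k_0 = 1.5$, exact (noise-free) densities instead of measured histograms, and hypothetical function names. It compares the forward quotient between $t_0 = 0$ and $t_1 = \Delta t$ with the analytic derivative at $t = 0$.

```python
import math

# Assumed parameters (hbar = m = 1, initial width 1, k0 = 1.5).
A = 1 / math.sqrt(2 * math.pi)  # normalisation a
B = 0.5                         # spreading rate b = hbar / (2 m width^2)
C = 1.5                         # group velocity c = hbar k0 / m
D = 2.0                         # d = 2 width^2


def rho(x, t):
    """Density of the free Gaussian packet in the abbreviated form."""
    s = 1 + (B * t) ** 2
    return A / math.sqrt(s) * math.exp(-((x - C * t) ** 2) / (D * s))


def dt_rho_exact(x):
    """Analytic time derivative of the density at t = 0."""
    return 2 * A * C * x / D * math.exp(-x * x / D)


def quotient(x, t0, t1):
    """Two-time difference quotient approximating the time derivative."""
    return (rho(x, t1) - rho(x, t0)) / (t1 - t0)


def quotient_error(x, dt):
    """Absolute error of the forward quotient over [0, dt] at position x."""
    return abs(quotient(x, 0.0, dt) - dt_rho_exact(x))


if __name__ == "__main__":
    for dt in (1e-1, 1e-2, 1e-3):
        print(dt, quotient_error(0.5, dt))
```

The error shrinks roughly in proportion to $\Delta t$, which is the quantitative content of the requirement that the interval be chosen sufficiently small. With real data, statistical fluctuations of the measured distributions would add to this discretisation error.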

Acknowledgments

The author wishes to acknowledge a helpful discussion with John Corbett regarding the subject of this paper

References

1. J. Dorling, in Schrödinger: Centenary Celebration of a Polymath, ed. C. W. Kilmister (Cambridge, 1987).
2. E. Prugovecki, Int. J. Theor. Phys. 16, 321-331 (1977).
3. W. Pauli, Encyclopedia of Physics, Vol. V, p. 17 (Springer-Verlag, Berlin, 1958).
4. H. Reichenbach, Philosophic Foundations of Quantum Mechanics (University of California Press, 1948).
5. J. V. Corbett, C. A. Hurst, J. Austral. Math. Soc. B 20, 182-201 (1978).
6. C. N. Friedman, J. Austral. Math. Soc. B 30, 298 (1987).
7. M. Pavicic, Phys. Lett. A 122, 280 (1987).
8. E. Feenberg, The Scattering of Slow Electrons in Neutral Atoms, Thesis, Harvard University (1933).
9. J. E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949).
10. W. Gale, E. Guth and G. T. Trammell, Phys. Rev. A 165, 1434-1436 (1968).
11. W. Band, J. Park, Found. Phys. 1, No. 2, 133-144 (1970).
12. J. Park, W. Band, Found. Phys. 1, No. 4, 339-357 (1971).
13. W. Band, J. Park, Am. J. Phys. 47, 188-191 (1979).
14. A. Royer, Phys. Rev. Lett. 55, 2745 (1985).
15. A. Royer, Found. Phys. 19, 3 (1989).
16. W. Stulpe, M. Singer, Found. Phys. Lett. 3, 153 (1990).
17. W. E. Lamb, Phys. Today 22(4), 23 (1969).
18. H.-W. Wiesbrock, Int. J. Theor. Phys. 26, 1175 (1987).
19. S. Weigert, Phys. Rev. A 45, 7688-7696 (1992).
20. G. C. Wick, A. S. Wightman, E. P. Wigner, Phys. Rev. 88, 101-105 (1952).
21. E. C. Emch, C. Piron, J. Math. Phys. 4, 469-473 (1963).
22. C. Piron, Helv. Phys. Acta 42, 330-338 (1969).
23. P. Busch, P. J. Lahti, Found. Phys. 19, 633 (1989).
24. E. C. Kemble (McGraw-Hill, New York, 1937).
25. B. Mielnik, Found. Phys. 24(8), 1113-1129 (1994).
26. R. J. Duffin, A. C. Schaeffer, Trans. Amer. Math. Soc. 72, 341-366 (1952).


EXTRINSIC AND INTRINSIC IRREVERSIBILITY IN PROBABILISTIC DYNAMICAL LAWS

H. ATMANSPACHER
Institut für Grenzgebiete der Psychologie und Psychohygiene
Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: haa@igpp.de
and
Max-Planck-Institut für extraterrestrische Physik
D-85740 Garching, Germany

R. C. BISHOP
Institut für Grenzgebiete der Psychologie und Psychohygiene
Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: rcb@igpp.de

A. AMANN
Universitätsklinik für Anästhesie, Leopold-Franzens-Universität
Anichstr. 35, A-6020 Innsbruck, Austria
E-mail: anton.amann@uibk.ac.at
and
Institut für Allgemeine, Anorganische und Theoretische Chemie, Abteilung für theoretische Chemie, Leopold-Franzens-Universität
Innrain 52a, A-6020 Innsbruck, Austria

Two distinct conceptions for the relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of physical systems are outlined. The standard extrinsic concept of irreversibility is based on the notion of an open system interacting with its environment. An alternative intrinsic concept of irreversibility does not explicitly refer to any environment at all. Basic aspects of the two concepts are presented and compared with each other. The significance of the terms extrinsic and intrinsic is discussed.

1 Introduction

The relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of empirical systems has been a long-standing problem in physics. In most standard approaches, fundamental dynamical laws, such as Newton's, Maxwell's, Einstein's, or Schrödinger's equations, describe the temporal evolution of isolated systems. Irreversible dynamical laws are typically regarded as emerging from the interaction between systems and their environment, i.e., from considering open systems.

In contrast to this extrinsic conception of irreversibility, there is a group of scientists who insist that some kinds of irreversibility are intrinsic, i.e., some kinds of irreversible laws are fundamental. On this view, mainly advocated by Prigogine and colleagues in Brussels and Austin, the switch from extrinsic to intrinsic irreversibility goes along with a switch from particular kinds of deterministic descriptions to particular kinds of probabilistic descriptions.

In general, the two viewpoints are considered to be distinct, sometimes even entirely incompatible. It is the main goal of this contribution to show that there are both differences and similarities between them. As a consequence, it makes little sense to prefer one of them at the expense of the other. It is much more interesting to explore whether particular aspects of each of the two views can be constructively related to each other in order to increase our insight into the issue of irreversibility.

In the following, both conceptions will be presented in some detail and compared. It is suggested that the distinction between ontic and epistemic categorial frameworks is particularly useful for a conceptual discussion of some problems associated with irreversibility. Such a distinction serves to clarify both common and distinct aspects of extrinsic and intrinsic irreversibility, and it helps to frame a number of open questions concerning them.

In Section 2, ontic and epistemic descriptions are briefly introduced. We use an algebraic framework for this introduction, since this has proven fruitful in related problem areas. Section 3 outlines some basic issues with respect to the ontic states of closed quantum systems and their time-reversal invariant dynamical evolution. Subsequently, two ways to conceive of extrinsic irreversibility are described. In one of them, epistemic states are represented by (reduced) density operators; in the other, they are represented by probability distributions of pure states. Section 4 presents the intrinsic conception of irreversibility. One major line of research in this regard deals with transformations from invertible K-systems to non-invertible exact systems; the other uses the concept of rigged Hilbert spaces to extend the state of a system beyond Hilbert space. Section 5 summarizes the main points and indicates some open questions.

2 Ontic and epistemic descriptions

2.1 General issues

Can nature be observed and described as it is in itself, independent of those who observe and describe (that is to say, nature as it is when nobody looks)? This question has been debated throughout the history of philosophy with no clear answer either way. Each perspective has strengths and weaknesses, and each has had its critics and proponents in every epoch. In contemporary terminology, the two perspectives can be distinguished as the topics of ontology and epistemology. Ontological questions refer to the structure and behavior of a system as such, whereas epistemological questions refer to knowledge (or information) about systems.

In philosophical discourse, it is considered a serious fallacy to confuse these two types of questions. For instance, Fetzer and Almeder emphasize that an ontic answer to an epistemic question (or vice versa) normally commits a category mistake [1]. Nevertheless, such mistakes are frequently committed in many fields of research when addressing subjects where the distinction between ontological and epistemological arguments is important.

The ontic/epistemic distinction refers to states and properties of a system as such or in its relation to observers; hence, it is an ontological distinction (see footnote a).

In physics, the rise of quantum theory with its interpretational problems was one of the first major challenges to the ontic/epistemic distinction. The Bohr-Einstein discussions in the 1920s and 1930s serve as a famous historical example. Einstein's arguments were generally ontically motivated; that is to say, he emphasized a viewpoint independent of observers or measurements. By contrast, Bohr's emphasis was generally epistemically motivated, focusing on what we could know and infer from observed quantum phenomena. Since Bohr and Einstein never made their basic viewpoints explicit, it is not surprising that they talked past each other in a number of respects [2].

Examples of approaches trying to avoid the confusions of the Bohr-Einstein discussions are Heisenberg's distinction of actuality and potentiality [3], Bohm's ideas on explicate and implicate orders [4], or d'Espagnat's scheme of an empirical, weakly objective reality and an objective (veiled) reality independent of observers and their minds [5]. Further terms fitting into the ontic side of these distinctions are latency [6], propensity [7], or disposition [8]. See also Jammer's discussion of these notions, including their criticism and additional references [9].

A first attempt to draw an explicit distinction between ontic and epistemic descriptions for quantum systems was made by Scheibe [10], who himself, however, strongly emphasized the epistemic realm. Later, Primas developed this distinction in the formal framework of algebraic quantum theory [11]. The basic structure of the ontic/epistemic distinction, which will be made more precise below, can be roughly characterized as follows (for more details, the reader is referred to [11, 12]).

Footnote a: On the other hand, the distinction between ontological and epistemological problems can be considered as epistemological insofar as both areas represent fields of (philosophical) knowledge.

Ontic states describe all properties of a physical system exhaustively. (Exhaustive in this context means that an ontic state is precisely the way it is, without any reference to epistemic knowledge or ignorance.) Ontic states are the referents of individual descriptions; the properties of the system are treated as intrinsic properties (see footnote b). As an important example, ontic states refer to closed systems; they are empirically inaccessible. Typically, their temporal evolution (dynamics) is reversible and follows fundamental deterministic laws.

Epistemic states describe our (usually non-exhaustive) knowledge of the properties of a physical system, i.e., based on a finite partition of the relevant phase space. The referents of statistical descriptions are epistemic states; the properties of the system are treated as contextual properties. Epistemic states refer to open systems; they are, at least in principle, empirically accessible. Typically, their temporal evolution (dynamics) follows irreversible laws.

The combination of the ontic/epistemic distinction with the formalism of algebraic quantum theory provides a framework that is both formally and conceptually satisfying. Although the formalism of algebraic quantum theory is often hard to handle for specific physical applications, it offers significant clarifications concerning the basic structure and the philosophical implications of quantum theory. For instance, the modern achievements of algebraic quantum theory make clear in what sense pioneer quantum mechanics (which von Neumann implicitly formulated epistemically [13]) as well as classical and statistical mechanics can be considered as special cases of a more general theory. Compared to the framework of von Neumann's monograph [13], important extensions are obtained by giving up the irreducibility of the algebra of observables (not admitting observables which commute with every observable in the same algebra) and the restriction to locally compact phase spaces (admitting only finitely many degrees of freedom). As a consequence, modern quantum physics is able to deal with open systems in addition to isolated ones; it can involve infinitely many degrees of freedom, such as the infinitely many modes of a radiation field; it can properly consider interactions with the environment of a system; superselection rules, classical observables, and phase transitions can be formulated, which would be impossible in an irreducible algebra of observables; there exist infinitely many representations inequivalent to the Fock representation; and non-automorphic, irreversible dynamical evolutions can be successfully incorporated and even derived.

Footnote b: In a more technical terminology, one speaks of observables (mathematically represented by operators) rather than properties of a system. Prima facie, the term observable has nothing to do with the actual observability of a corresponding property.

In addition to this remarkable progress, the mathematical rigor of algebraic quantum theory in combination with the ontic/epistemic distinction allows us to address a number of unresolved conceptual and interpretational problems of pioneer quantum mechanics from a new perspective. First, the distinction between different concepts of states as well as observables provides a much better understanding of many confusing issues in earlier conceptions, including alleged paradoxes such as those of Einstein, Podolsky, and Rosen (EPR) [14]. Second, a clear-cut characterization of different concepts of states and observables is a necessary precondition for exploring new approaches, beyond von Neumann's projection postulate, toward the central problem that pervades all quantum theory: the measurement problem. Third, a number of much-discussed interpretations of quantum theory and their variants can be appreciated more properly if they are considered from the perspective of an algebraic formulation.

One of the most striking differences between the concepts of ontic and epistemic states is their difference concerning operational access, i.e., observability and measurability. At first sight, it might appear pointless to keep a level of description which is not related to what can be operationalized empirically. However, a most appealing feature at this ontic level is the existence of first principles and fundamental laws that cannot be obtained at the epistemic level. Furthermore, it is possible to rigorously deduce (e.g., to GNS-construct, cf. [12, 15]) a proper epistemic description from an ontic description if enough details about the empirically given situation are known. These aspects show that the crucial point is not to decide whether ontic or epistemic levels of discussion are right or wrong in a mutually exclusive sense. There are always ontic and epistemic elements to be taken into account for a proper description of a system. This requires the definition of ontic and epistemic terms to be relativized with respect to some selected framework within a set of (hierarchical) descriptions (see [16] for details and examples). The problem is then to use the proper level of description for a given context, and to develop and explore well-defined relations between different levels.

These relations are not universally prescribed; they depend on contexts of various kinds. The concepts of reduction and emergence are of crucial significance here. In contrast to the majority of publications dealing with these topics, it is possible to precisely specify their meaning in mathematical terms. Contexts or contingent conditions can be formally incorporated as topologies in which particular asymptotic limits give rise to novel, emergent properties unavailable without those contexts (see [15] for more details). It should also be mentioned that the distinction between ontic and epistemic descriptions is neither identical with that of parts and wholes nor with that of micro- and macrostates as used in statistical mechanics or thermodynamics. The thermodynamic limit of an infinite number of degrees of freedom provides only one example of a contextual topology; others are the Born-Oppenheimer limit in molecular physics or the short-wavelength limit for geometrical optics.

These examples indicate that the usefulness, or even inevitability, of the ontic/epistemic distinction is not restricted to quantum systems. It plays a significant role in the description of classical systems as well. More specifically, it has been shown in detail that, for systems exhibiting deterministic chaos, the distinction of ontic and epistemic descriptions is necessary if category mistakes and corresponding interpretational fallacies are to be avoided [17].

3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility

3.1 Time-Reversal Symmetry in Closed Systems

Let us start with a closed quantum system, which can be considered without any reference to an environment. The pure state $\phi$ of such a system is an extremal, positive linear functional on a C*-algebra $\mathcal{A}$. The state $\phi \in \mathcal{A}^*$, where $\mathcal{A}^*$ is the dual of $\mathcal{A}$, is then called an ontic state of the closed system. If a Hilbert space representation of $\mathcal{A}$ is possible, $\phi$ can be represented as a state vector $\psi \in \mathcal{H}$, characterized by the expectation values $\langle \psi | A \psi \rangle$ of all observables $A \in \mathcal{A}$. Under particular conditions, the dynamics of $\phi$ is given by the time-reversal invariant Schrödinger equation.

In the traditional Hilbert space representation, the algebra $\mathcal{A}$ of observables is irreducible: apart from multiples of the identity, there are no observables commuting with all others. Due to the Stone-von Neumann theorem, every representation of the canonical commutation relations is then equivalent to the Schrödinger representation. In the more general setting of a Fock space (sum of tensor products of one-particle Hilbert spaces), the same holds for Fock representations.

A restriction of $\phi$ to a subsystem is not a pure state in general; hence, it is in general illegitimate to consider a closed quantum system as consisting of closed subsystems. As a consequence, an ontic state $\phi$ characterizes an individual, undivided whole, not consisting of subsystems with their own ontic states. This is the level of description to which the notions of quantum nonlocality or quantum holism apply. Since the concept of an environment does not make sense for ontic states of closed systems, it is illegitimate to speak about their entanglement or interaction with another state.
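The first claim of this paragraph can be checked on the smallest possible example. The following sketch is an illustration and not part of the original text: it assumes a closed system of two qubits in a Bell state (our choice of example), uses exact rational arithmetic, and the helper names `bell_density`, `partial_trace_env`, and `purity` are hypothetical. The total state is pure, while its restriction to one qubit is not.

```python
from fractions import Fraction


def bell_density():
    """Density matrix of the (assumed) Bell state (|00> + |11>)/sqrt(2).

    The unnormalized vector is (1, 0, 0, 1); dividing the outer product
    by its squared norm 2 keeps all entries exact rationals.
    """
    psi = [1, 0, 0, 1]
    return [[Fraction(a * b, 2) for b in psi] for a in psi]


def partial_trace_env(rho):
    """Restrict a two-qubit state to the first qubit (trace out the second).

    Basis ordering is |00>, |01>, |10>, |11>, i.e. index = 2*i + j with
    i labelling the object qubit and j the environment qubit.
    """
    return [[sum(rho[2 * i + j][2 * k + j] for j in range(2))
             for k in range(2)] for i in range(2)]


def purity(rho):
    """Tr(rho^2); it equals 1 exactly when the state is pure."""
    n = len(rho)
    return sum(rho[i][k] * rho[k][i] for i in range(n) for k in range(n))


total = bell_density()
reduced = partial_trace_env(total)
print(purity(total), purity(reduced))
```

In this toy case the reduced state is the maximally mixed state, which is why no ontic (pure) state can be attributed to the subsystem alone.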

If one introduces a distinction (Heisenberg cut) to create subsystems in a closed system, then these subsystems are in general open. For example, one can then consider an object entangled and/or interacting with its environment. The epistemic state $\eta$ of those subsystems can be represented in two conceptually different ways.

3.2 Density Operators as Non-Pure States

The first, more or less familiar, representation of an epistemic state $\eta$ is given by a (reduced) density operator $D \in \mathcal{M}_*$, where $\mathcal{M}_*$ is the predual of a W*-algebra $\mathcal{M}$ of contextual observables. The expectation value of an observable $M \in \mathcal{M}$ in the state represented by $D$ is given by $\mathrm{Tr}(DM)$. The epistemic state $\eta$ represented by $D$ is a non-pure state. EPR-correlations between subsystem and environment are generic if the contextual algebra of observables is non-commutative.

The term contextual observables derives from the fact that their construction requires the selection of a context, defined by a subset of relevant observables $B \in \mathcal{B} \subset \mathcal{A}$ and a reference state (e.g., vacuum state, KMS state) distinguished by some appropriate stability condition. This context induces the weak closure of $\mathcal{B}$ and gives rise to a contextual topology in $\mathcal{M}$. If the context is known well enough, then the GNS representation is a powerful constructive tool to implement a proper contextual topology (see, e.g., [15]).

The dynamics of $D$ is of Schrödinger type plus dissipative terms (e.g., a master equation), so that the time-reversal invariance of the Schrödinger equation can be broken [18, 19].
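As a minimal sketch of how dissipative terms break time-reversal invariance, consider a two-level system with pure dephasing. The specific generator used below, dD/dt = -i[H, D] + gamma (sz D sz - D) with H = (omega/2) sz, is an assumption chosen for illustration and is not claimed to be the master equation of the cited works; the function names and parameter values are hypothetical as well. The trace of D is conserved, but its purity decreases monotonically, so the evolution cannot be run backward within the set of states.

```python
import math

SZ = [[1, 0], [0, -1]]  # Pauli matrix sigma_z


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b, scale=1):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def rhs(rho, omega, gamma):
    """Right-hand side: Schroedinger part plus an (assumed) dephasing term."""
    h = [[omega / 2, 0], [0, -omega / 2]]
    comm = add(matmul(h, rho), matmul(rho, h), -1)    # [H, rho]
    diss = add(matmul(matmul(SZ, rho), SZ), rho, -1)  # sz rho sz - rho
    return [[-1j * comm[i][j] + gamma * diss[i][j] for j in range(2)]
            for i in range(2)]


def evolve(rho, omega, gamma, t, steps=1000):
    """Integrate the master equation from 0 to t with classical RK4."""
    dt = t / steps
    for _ in range(steps):
        k1 = rhs(rho, omega, gamma)
        k2 = rhs(add(rho, k1, dt / 2), omega, gamma)
        k3 = rhs(add(rho, k2, dt / 2), omega, gamma)
        k4 = rhs(add(rho, k3, dt), omega, gamma)
        for k, weight in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            rho = add(rho, k, weight * dt / 6)
    return rho


def purity(rho):
    """Real part of Tr(rho^2)."""
    return sum(rho[i][k] * rho[k][i] for i in range(2) for k in range(2)).real


rho0 = [[0.5, 0.5], [0.5, 0.5]]  # pure superposition state
rho1 = evolve(rho0, omega=1.0, gamma=0.5, t=1.0)
print(purity(rho0), purity(rho1))
```

For this particular generator the off-diagonal element decays as exp(-2 gamma t), so the purity is 1/2 + exp(-4 gamma t)/2; the numerical result can be compared against that closed form.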

3.3 Probability Distributions of Pure States

If the epistemic state $\eta$ of an open system is approximately pure by a clever dressing of object and environment ($b$ indicates bare objects and environments, and $d$ indicates dressed objects and environments),

$\mathcal{H}^{b}_{obj} \otimes \mathcal{H}^{b}_{env} = \mathcal{H}^{d}_{obj} \otimes \mathcal{H}^{d}_{env}$,

$\eta$ can be represented (estimated) by a probability distribution $\mu$ of pure states. (A dressing procedure is clever if it minimizes EPR-correlations between object and environment, or if it maximizes the integrity of both object and environment [20].) $\mathcal{H}^{d}_{obj}$ is the proper Hilbert space for an approximately pure epistemic state $\eta$. Although $\eta$ can be uniquely extended to a normal state on $\mathcal{M}$ (represented by a density operator), the pure states and their distribution $\mu$ themselves do not make sense on $\mathcal{M}$. The relevant observables are elements of a C*-subalgebra $\mathcal{B} \subset \mathcal{A}$.
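Why a distribution of pure states carries more information than the density operator it induces can be seen in a finite-dimensional toy case. This is an assumption made purely for illustration (the text concerns general operator algebras), and the names `projector` and `density_operator` are hypothetical. Two different distributions of pure qubit states yield the same density operator, so the distribution cannot be recovered from the density operator alone.

```python
from fractions import Fraction


def projector(v):
    """Projector onto the ray spanned by a real, nonzero vector v."""
    norm2 = sum(x * x for x in v)
    return [[Fraction(a * b, norm2) for b in v] for a in v]


def density_operator(ensemble):
    """Mix pure states; `ensemble` is a list of (weight, vector) pairs."""
    dim = len(ensemble[0][1])
    d = [[Fraction(0)] * dim for _ in range(dim)]
    for weight, v in ensemble:
        p = projector(v)
        for i in range(dim):
            for j in range(dim):
                d[i][j] += weight * p[i][j]
    return d


half = Fraction(1, 2)
mu_1 = [(half, [1, 0]), (half, [0, 1])]   # equal mixture of |0> and |1>
mu_2 = [(half, [1, 1]), (half, [1, -1])]  # equal mixture of |+> and |->
print(density_operator(mu_1) == density_operator(mu_2))
```

Both ensembles give one half of the identity matrix, although they consist of entirely different pure states.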

The dynamics of $\mu$ is of Schrödinger type plus stochastic terms (e.g., an Itô/Stratonovich equation), so that the time-reversal invariance of the Schrödinger equation can be broken. The stochastic aspect of the time evolution (of approximately pure states of the object) originates from the fact that the (initial) state of the environment cannot be determined and therefore must be treated as a stochastic variable. Starting from an initial pure state, one gets time-evolved pure states that depend on the time $t$ and on $\omega$, where $\omega$ is the stochastic variable. First steps of such an approach toward single open quantum systems, not based exclusively on decompositions of density-operator dynamics, were proposed in [21, 22].

For a large class of stochastic dynamics of approximately pure states of objects, one ends up with one particular distribution $\mu_\infty$ of pure states in the limit $t \to \infty$, independently of the initial conditions (such dynamical objects are called ergodic). Splitting the underlying C*-algebra $\mathcal{B}$ into two subsystems with two C*-subalgebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$, is then admitted under particular conditions. In an ideal situation, all those pure states onto which the probability measures $\mu_t$ extend are product states with respect to the tensor product $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$. This situation never arises in practice, but most relevant pure states can be product states or almost product states if the dressing/tensorization is chosen appropriately [23].

3.4 Dynamics of Measurement: a Simple Example

Any dynamical description of measurement has to start from a proper decomposition of a system into a dressed object and its dressed environment. It is crucial to keep in mind that such a decomposition is a logical precondition for the dynamics of measurement, insofar as the Hamiltonian of the composed system needs to be written as a sum

$H = H_{obj} \otimes 1 + 1 \otimes H_{env} + H_{int}$. (1)

An illustrative heuristic example has been extensively discussed by Primas [24]. Consider the simple case of a two-level quantum object (spin-1/2 system) with the Hamiltonian

$H_{obj} = \frac{\hbar}{2} \sum_{\nu=1}^{3} \Omega_\nu \sigma_\nu$, (2)

a sufficiently nontrivial boson field environment,

$H_{env} = \sum_{\nu=1}^{3} \sum_{k} \omega_k \, a^{*}_{k\nu} a_{k\nu}$, (3)

and an interaction

$H_{int} = \sum_{\nu=1}^{3} \sigma_\nu \otimes A_\nu$, (4)

where

$A_\nu = \sum_{k} \lambda_{k\nu} a_{k\nu} + \mathrm{c.c.}$ (5)

If such a decomposition has been properly carried out (cf. Sec. 3.3), then it is possible to derive the expectation values

$M(t) = \langle \psi_t | \sigma \, \psi_t \rangle$, (6)

$a(t) = \langle \chi_t | A \, \chi_t \rangle$ (7)

with respect to the (approximate) product state

$\Psi_t = \psi_t \otimes \chi_t$. (8)

Corresponding to the product state $\Psi_t$, the C*-algebra of intrinsic observables in the composed system of dressed object and dressed environment is

$\mathcal{A} = \mathcal{A}_{obj} \otimes \mathcal{A}_{env}$. (9)

$\mathcal{A}_{obj}$ is the C*-algebra of 2 x 2 matrices, and $\mathcal{A}_{env}$ is the C*-algebra of intrinsic observables of an environment with infinitely many degrees of freedom.

The equations of motion for the expectation values $M(t)$ and $a(t)$ are given by

$\dot{M}(t) = M(t) \times \Omega + M(t) \times a(t)$, (10)

$\dot{a}_{k\nu}(t) = -i \omega_k \, a_{k\nu}(t) + \frac{i}{2} \lambda_{k\nu} M_\nu(t)$. (11)

They describe the feedback between object and environment. More precisely, they describe the polarization $M$ of the object under the influence of the environment, and the motion of the environment observable $a$ (boson operator) under the polarizing influence of the object. The solution of the second equation, referring to the observables of the environment (or the measuring system, respectively), has a retarded and an advanced part:

$a^{ret}_{k\nu}(t) = \exp(-i \omega_k t) \, a_{k\nu}(0) + \frac{i}{2} \lambda_{k\nu} \int_0^t \exp(-i \omega_k (t - s)) \, M_\nu(s) \, ds$ $(t > 0)$, (12)

$a^{adv}_{k\nu}(t) = \exp(-i \omega_k t) \, a_{k\nu}(0) + \frac{i}{2} \lambda_{k\nu} \int_0^t \exp(-i \omega_k (t - s)) \, M_\nu(s) \, ds$ $(t < 0)$. (13)

A bidirectionally deterministic system can be described in terms of a superposition of a backward deterministic (forward non-deterministic) and a forward deterministic (backward non-deterministic) process, which are equally relevant a priori. Selecting one of these solutions and disregarding the other requires the time inversion symmetry of the compound system to be broken. For this purpose, one can apply the principle of causality (past-determinacy, error-free retrodiction, no anticipation) as a heuristic argument for the selection of the retarded solution.

It has been argued that the retarded, i.e., the backward deterministic, forward non-deterministic, solution is a K-flow (see footnote c) on a state space with infinitely many degrees of freedom [24]. In the simplest case, the relaxation time for this K-flow is the time constant $\tau_\nu$ of an exponentially decaying correlation function (for details, see [24]),

$K_\nu(t) \propto \exp(-t / \tau_\nu)$. (14)

At this point, we are still at the level of description of intrinsic observables, needed for the specification of initial conditions of the K-flow. Conceptually, this K-flow represents a stochastic process which corresponds to chaos in the sense of Wiener [25] rather than chaos in the sense of Kolmogorov and Sinai (i.e., a dissipative dynamics). By introducing a context via a reference state with respect to which stability in a particular sense (hopefully more general than thermal equilibrium) can be checked, one can proceed to (GNS-constructed) contextual observables.

Footnote c: Note that K-flows or K-systems play an important role in one of the approaches of intrinsic irreversibility (see Sec. 4.1). It would be interesting, but exceeds the scope of this paper, to explore the question of whether the process of measurement as described here can be conceived as intrinsically irreversible. In this respect, see, e.g., [26].

3.5 General Features of Extrinsic Irreversibility

The breaking of time-reversal symmetry in the framework of extrinsic irreversibility corresponds to the conceptual transition from closed systems with ontic states to open systems with epistemic states. Such a transition can be understood by dividing a closed system into open, more or less EPR-correlated subsystems (e.g., object and environment) and by selecting a subset of relevant observables. The proper state concepts are epistemic. There are then two different statistical representations for different epistemic state concepts: one expresses a probability distribution of pure states, whereas the usual one focuses on reduced density operators.

The interaction of the open subsystems is described by dynamical laws different from the time-reversal invariant dynamics of a closed system. Breaking the time-reversal invariance of a unitary group evolution generates two semigroups, which can be endowed with two arrows of time opposite to each other. It should be pointed out that the forward arrow cannot be selected by physical reasons alone. Extra-physical arguments, such as consistency with experience, causality, etc., must be invoked.
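Schematically, the step from a group to two semigroups can be written as follows. The labels $W^{+}_t$ and $W^{-}_t$ are our own notation, introduced only for this sketch and not taken from the source.

```latex
% Reversible dynamics: a one-parameter unitary group with inverses
U_t U_s = U_{t+s}, \qquad U_t^{-1} = U_{-t}, \qquad t, s \in \mathbb{R}.

% After symmetry breaking: two semigroups, neither containing inverses
W^{+}_t W^{+}_s = W^{+}_{t+s} \quad (t, s \ge 0), \qquad
W^{-}_t W^{-}_s = W^{-}_{t+s} \quad (t, s \le 0).
```

The formal structure treats both semigroups on an equal footing, which is why the choice of the forward-directed one requires an additional argument.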

4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility

In contrast to the extrinsic concept of irreversibility, there is an alternative concept of intrinsic irreversibility, mainly advocated by Prigogine and collaborators (more recently also by Bohm). They propose describing states of any system generically with distributions $\rho$ (i.e., probability distributions or density operators). The claim is that the state $\rho$ of systems beyond a particular degree of complexity evolves irreversibly by itself, i.e., without any relationship to an environment. There are essentially two lines of research pursuing this proposal.

4.1 Λ-Transformation from K-Systems to Exact Systems

The notion of the $\Lambda$-transformation was developed by Misra, Courbage, and Prigogine in the 1970s. It is essentially based on the theory of ergodic systems. In particular, the concept of Kolmogorov systems, briefly K-systems, is of central significance in this context.

Definition 1 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space, and let $S: X \to X$ be an invertible transformation such that $S$ and $S^{-1}$ are measurable and measure preserving. The transformation $S$ is called a K-automorphism if there exists a $\sigma$-algebra $\mathcal{A}_0 \subset \mathcal{A}$ such that the following three conditions are satisfied: (i) $S^{-1}(\mathcal{A}_0) \subset \mathcal{A}_0$; (ii) the $\sigma$-algebra $\bigcap_{n=0}^{\infty} S^{-n}(\mathcal{A}_0)$ is trivial (i.e., contains only sets of measure 1 or 0); (iii) the smallest $\sigma$-algebra containing $\bigcup_{n=0}^{\infty} S^{n}(\mathcal{A}_0)$ is identical to $\mathcal{A}$.

Another way to characterize (classical) K-systems is by way of the existence of positive Lyapunov exponents, equivalent to a strictly positive Kolmogorov-Sinai entropy. The properties of K-systems imply mixing and ergodicity. K-systems are invertible transformations; hence, their deterministic dynamics, given by $\rho(t) = U_t \rho(0)$, is reversible ($U_t$ is a unitary evolution operator acting on $\rho$). A standard example is the (2-dimensional) baker transformation.
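The invertibility of the baker transformation can be made explicit with a short sketch. The formulas below are the standard textbook form of the map on the unit square (stretching by 2 in x, squeezing by 1/2 in y), which the text does not spell out; the function names are hypothetical, and exact rational arithmetic is used so that the round trip is free of rounding errors.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def baker(point):
    """Baker transformation on the unit square [0, 1) x [0, 1)."""
    x, y = point
    if x < HALF:
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)


def baker_inverse(point):
    """Inverse map: the y coordinate records which branch was taken."""
    x, y = point
    if y < HALF:
        return (x / 2, 2 * y)
    return ((x + 1) / 2, 2 * y - 1)


def iterate(f, point, n):
    """Apply f to point n times."""
    for _ in range(n):
        point = f(point)
    return point


start = (Fraction(3, 7), Fraction(2, 5))
forward = iterate(baker, start, 5)
print(forward, iterate(baker_inverse, forward, 5) == start)
```

The information needed to undo each step is stored in the contracting y direction; this is what is discarded when the map is projected onto x alone.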

Another important class of mixing systems refers to so-called exact systems.

Definition 2 [27]: Let $(X, \mathcal{A}, \mu)$ be a normalized measure space, and let $S: X \to X$ be a measure-preserving transformation such that $S(A) \in \mathcal{A}$ for each $A \in \mathcal{A}$. If $\lim_{n \to \infty} \mu(S^n(A)) = 1$ for every $A \in \mathcal{A}$ with $\mu(A) > 0$, then $S$ is called exact.

Exact systems are represented by non-invertible transformations; hence, their stochastic dynamics, given by $\tilde\rho(t) = W_t \tilde\rho(0)$, is irreversible. $W_t$ is a semigroup evolution operator acting on a distribution $\tilde\rho$ rather than $\rho$. For instance, an exact system obtained from the baker transformation is the dyadic transformation

S(x) = 2x (mod 1)
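Two properties of the dyadic transformation can be checked directly: it is two-to-one, hence not invertible, and the image of a small interval grows until it fills the whole unit interval, which is the behavior required of an exact system. The sketch below is illustrative only; the helper names are hypothetical, and the exactness check is restricted to intervals of the form [0, length) rather than general measurable sets.

```python
from fractions import Fraction


def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1) on [0, 1)."""
    return (2 * x) % 1


def preimages(x):
    """The two distinct points that S maps onto x."""
    return (x / 2, (x + 1) / 2)


def image_measure(length, n):
    """Lebesgue measure of S^n([0, length)) for 0 < length <= 1.

    Each application doubles the interval until it covers [0, 1).
    """
    return min(Fraction(1), length * 2 ** n)


x = Fraction(3, 7)
print(preimages(x), [dyadic(p) for p in preimages(x)])
print([image_measure(Fraction(1, 8), n) for n in range(5)])
```

Since every point has two preimages, the past of a trajectory cannot be reconstructed from its present state.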

A theorem by Rokhlin [28] says that every exact system is the factor of a K-system. This means that K-systems can be transformed into exact systems by their projections (or factors, see [27]). More generally, a factor of a K-system can be obtained by restriction to dilating fibers or unstable manifolds. Hence, it is intuitively clear that the invertibility of a K-system gets lost by its transformation into an exact system.
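For the baker and dyadic transformations, the factor relation can be verified by hand or by machine: projecting onto the expanding coordinate x commutes with the dynamics. The sketch below uses the standard textbook form of both maps; the function names `project` and `commutes` are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def baker(point):
    """Baker transformation on the unit square (stretch in x, squeeze in y)."""
    x, y = point
    if x < Fraction(1, 2):
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)


def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1)."""
    return (2 * x) % 1


def project(point):
    """Projection onto the expanding (x) coordinate."""
    return point[0]


def commutes(point, steps=6):
    """Check project(baker^n(p)) == dyadic^n(project(p)) for n = 1..steps."""
    x = project(point)
    for _ in range(steps):
        point = baker(point)
        x = dyadic(x)
        if project(point) != x:
            return False
    return True


p = (Fraction(3, 7), Fraction(2, 5))
q = (Fraction(3, 7), Fraction(9, 10))  # same x as p, different y
print(commutes(p), commutes(q), baker(p), baker(q))
```

Points that differ only in y are mapped to different points by the baker transformation but are indistinguishable after projection, which is where invertibility is lost.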

According to Misra et al. [29, 30], the relations between the two kinds of dynamics, $U_t$ and $W_t$, and the two state concepts, $\rho$ and $\tilde\rho$, are provided by a similarity transformation $\Lambda$ according to

$W_t = \Lambda U_t \Lambda^{-1}$,

$\tilde\rho = \Lambda \rho$.
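The two relations fit together in the following way. This is a short consistency check, writing $\rho$ for the original distribution and $\tilde\rho$ for the transformed one, and assuming that $\Lambda$ is time-independent and invertible on the relevant domain:

```latex
\tilde\rho(t) = \Lambda \rho(t)
             = \Lambda U_t \rho(0)
             = \Lambda U_t \Lambda^{-1} \Lambda \rho(0)
             = W_t \tilde\rho(0),
\qquad
W_t W_s = \Lambda U_t U_s \Lambda^{-1} = W_{t+s}.
```

The composition law of $W_t$ is thus inherited from that of $U_t$; the restriction to a semigroup enters only through additional requirements on the transformed evolution, such as positivity preservation.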

Wightman's question [31] as to the meaning of $\tilde\rho$, in his review of [30], gets an immediate answer if one applies Rokhlin's theorem to construct $\Lambda$ (cf. [32]). The transformed distribution $\tilde\rho$ is the projection of $\rho$ onto a dilating subspace. This can easily be seen for the examples of the baker transformation and the dyadic transformation. In the more complicated case of continuous-time nonlinear (hyperbolic) systems, the corresponding procedure would be a projection onto the unstable manifolds, i.e., those directions along which the Lyapunov exponents are positive and add up to the Kolmogorov-Sinai entropy (cf. [33, 34]). As an important conceptual feature, such projections select a time direction.

A crucial formal feature associated with the irreversibility due to $W_t$ is that a properly constructed $\Lambda$ (and hence $\Lambda U_t \Lambda^{-1}$) preserves the positivity of the state distributions only for positive times. A conceptual discussion of this point can be found in [35]. For a more detailed formal account of the role which positivity preservation plays in the transformation between irreversible semigroups and chaotic dynamics, see [36] and references given there.

4.2 Rigged Hilbert Space Representation

Intrinsic irreversibility has also been implemented in an approach based on an extension of the usual Hilbert space representation of the state of a system. This approach makes use of the so-called rigged Hilbert space (RHS) construction, first introduced by the Russian mathematician Gelfand and his collaborators [37]. Roberts [38] and Bohm [39] independently showed how Dirac's formalism could be justified with complete mathematical rigor in a RHS. By the end of the 1970s, it turned out that some basic physical problems of Hilbert space quantum mechanics, notably in the context of decaying states or resonances, could be clarified in terms of RHS ([40] and references therein).

Very briefly, a RHS (Gelfand triplet) can be understood as follows. Let $\Psi$ be an abstract linear scalar product space, and complete it with respect to two topologies. The first topology is the standard norm topology, yielding a separable Hilbert space $\mathcal{H}$. The second topology $\tau_\Phi$ is defined by a countable set of norms

$\|\phi\|_n = \sqrt{(\phi, \phi)_n}$, $n = 0, 1, 2, \ldots$, (15)

where $\phi \in \Phi$ and the scalar product is given by

$(\phi, \phi')_n = (\phi, (\Delta + 1)^n \phi')$, $n = 0, 1, 2, \ldots$, (16)

where $\Delta$ is the Nelson operator $\Delta = \sum_i X_i^2$ [41]. The $X_i$ are operators representing the observables for the system in question and are the generators for the Nelson operator. Furthermore, the operator $\Delta + 1$ is a nuclear operator and ensures that $\Phi$ is a nuclear space (cf. [42, 39]). An operator is nuclear if it is linear, essentially self-adjoint, and its inverse is Hilbert-Schmidt. An operator $A^{-1}$ is Hilbert-Schmidt if $A^{-1} = \sum_i \lambda_i P_i$, where the $P_i$ are mutually orthogonal projection operators on finite-dimensional vector spaces and $\sum_i \lambda_i^2 < \infty$, the $\lambda_i$ denoting the eigenvalues of $A^{-1}$ [39]. We then have the Gelfand triplet of spaces

$\Phi \subset \mathcal{H} \subset \Phi^\times$, (17)

where $\Phi^\times$ is the dual to the space $\Phi$.

The Nelson operator fully determines the choice of function space when it comes to choosing a realization of the space $\Phi$. However, there are many different, inequivalent irreducible representations of an enveloping algebra of a Lie group used to generate a Nelson operator describing physical systems. Therefore, further restrictions on the choice of function space for a realization of $\Phi$ are required. The particular characteristics of the physical context of the system being modeled provide some of these restrictions, analogous to the situation for GNS constructions in the transition from C*- to W*-algebras in algebraic quantum mechanics [23]. Additional restrictions may be required due to the convergence properties desired for test functions in $\Phi$ and $\Phi^\times$.
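A standard textbook realization of such a triplet, given here only as an orienting example and not taken from the text above, uses the Schwartz space of rapidly decreasing test functions on the real line as the test function space:

```latex
\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}^\times(\mathbb{R}),
\qquad
\delta_{x_0} : \varphi \mapsto \varphi(x_0), \quad
e_k : \varphi \mapsto \int_{\mathbb{R}} e^{-ikx} \varphi(x) \, dx .
```

The Dirac functional $\delta_{x_0}$ and the plane-wave functional $e_k$ are continuous on the test function space and therefore belong to the dual space, although neither corresponds to a square-integrable function. This is the sense in which the dual space extends the state concept beyond Hilbert space.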

Bohm and colleagues applied the RHS approach to intrinsic irreversibility in the context of scattering and decay phenomena [40, 43]. Antoniou and Prigogine [44] extended the approach to broader contexts. The core idea in both versions is that a unitary group operator $U_t = \exp(-iHt)$, $-\infty < t < \infty$, generated by a Hamiltonian $H$, under very general circumstances may be extended from $\mathcal{H}$ to $\Phi^\times$ (restricted to $\Phi$). For scattering processes, $\Phi$ is the intersection of the Hardy class functions with the Schwartz class functions. Because of continuity and completeness requirements, $U_t: \Phi^\times \to \Phi^\times$ ($U_t: \Phi \to \Phi$) can be extended to the upper half plane $\Phi_+^\times$ (restricted to $\Phi_+$) for positive times and to the lower half plane $\Phi_-^\times$ ($\Phi_-$) for negative times [43]. The extension of $U_t$ to $\Phi^\times$ (restriction to $\Phi$) forms two semigroups because the extension (restriction) cannot be defined for replacement of $t$ with $-t$. Thus, semigroup evolution falls out of the analysis quite naturally in the RHS framework.

4.3 General Features of Intrinsic Irreversibility

In the intrinsic conception of irreversibility, states of a system are generically represented by distributions in a suitable state space, where pure states are $\delta$ functions. The trajectories of individual points either (1) are considered irrelevant because empirically inaccessible (as in the $\Lambda$-transformation approach) or (2) make minimal contributions to the collective behavior of the system when a sufficient number of Poincaré resonances are present (as in the RHS approach). For systems beyond a particular degree of complexity (K-systems, Poincaré resonances, etc.), the dynamics of the system is governed by irreversible evolution laws, regardless of interactions with an environment.

While the $\Lambda$-transformation approach has only been applied to the baker map, the RHS approach has been applied to nonlinear maps, Friedrichs models, scattering experiments, and other decay phenomena. In the latter approach, exact Golden Rules for decay and survival probabilities and their rates can be derived, in agreement with experimental observations [43].

Footnote d: The dual space $\Phi^\times$ is the space of linear functionals acting on elements of $\Phi$; its topology is induced by the choice of $\tau_\Phi$, and it includes distributions among its elements.

In both approaches, the transition from reversible to irreversible dynamical evolution laws is achieved by breaking the time-reversal symmetry in specific ways, leading to two semigroups. The time direction of the semigroups, however, is not given by either the $\Lambda$-transformation or RHS approaches. Physical considerations alone are insufficient to select the forward arrow, and one must appeal to consistency with experience, causality or other criteria.

5 Summary and Open Questions

There are two basic points at which extrinsic and intrinsic notions of irreversibility coincide. The first is that both notions explicitly break the time-reversal symmetry of reversible dynamical laws. This is clearly the case for the standard external view, in which the transition from fundamental reversible laws to contextual irreversible laws corresponds to the transition from ontic states of closed systems to epistemic states of open systems. But even for the alternative intrinsic view, irreversibility is an emergent feature [45]. In the framework of the $\Lambda$-transformation, the time-reversal symmetry of K-systems is broken, leading to irreversible exact systems. In the RHS representation, a similar symmetry breaking is achieved by the transition from Hilbert space to the rigging spaces $\Phi$ and $\Phi^\times$.

The breaking of time-reversal symmetry always produces two semigroups, which can be endowed with opposite temporal directions. Selection criteria must be used to select one of these two directions for a preferred mode of description. In both extrinsic and intrinsic approaches, there is no such criterion available based on physical reasoning alone. The selection is based on extra-physical arguments such as causality, experience and others. This second point of agreement between extrinsic and intrinsic irreversibility raises the interesting question of what conditions the proper direction of time has to satisfy. It could be argued that, up to the condition that it is the same for all physical systems, the selection is arbitrary.

There are two basic points at which extrinsic and intrinsic notions of irreversibility apparently differ. One of them concerns the role of the environment; the other has to do with the state concepts used in the two approaches. Briefly speaking, the role of the environment and the distinction of different state concepts are crucial in the standard framework of extrinsic irreversibility. The conceptual framework of the formalisms referring to intrinsic irreversibility neither (1) explicitly contains the concept of an environment nor (2) distinguishes between different state concepts. These observations do not necessarily imply that intrinsic irreversibility really can dispense with points (1) and (2). It is likely that the two points play crucial roles even though they do not explicitly appear in the formalism and its usual interpretation.

The projection (factorization) which is the crucial part of a $\Lambda$-transformation can be considered as the selection of an exact subsystem of the original K-system. Obviously, the $\Lambda$-transformation is not universal but context-dependent. Conceptually, the irreversible evolution of $\tilde{\rho} = \Lambda\rho$ due to $W_t$ could then be attributed to the restriction of the K-system to an exact subsystem. This might lead to interesting analogies with aspects of extrinsic irreversibility if the subsystem cannot be described as a closed subsystem. Concrete empirical applications of the $\Lambda$-transformation are not yet available. They would be necessary to check the significance of a physical environment, which is not explicit in the formalism.

Concerning the distinction between ontic and epistemic state concepts, it is clear that the approach of intrinsic irreversibility starts at the level of distributions rather than points. In the space of distributions, $\delta$ functions are special cases that could be related to points in a state space underlying the distribution space considered. In this way, a connection between distributions as epistemic states and points as ontic states is possible. The general claim in the $\Lambda$-transformation framework of intrinsic irreversibility, though, is that ontic states in the sense of phase points are meaningless or irrelevant since they are empirically inaccessible.

But is it justified to consider ontic states as generally irrelevant because they are empirically inaccessible? Reversible fundamental laws refer to ontic states, and it is not easy to formulate physics without them. The monographs by Ludwig [46], which consistently avoid any ontic elements, are an illustrative example. Moreover, special techniques to break symmetries often enable a unique derivation of irreversible contextual laws if the fundamental laws plus contexts are known. This also holds for the symmetry breaking used to derive intrinsic irreversibility from time-reversal invariant evolution in the $\Lambda$-transformation approach. The empirical inaccessibility of ontic states notwithstanding, one should therefore not dismiss their overall relevance too quickly.

In the RHS approach, there is no contradiction with the formal arguments in the case of extrinsic irreversibility insofar as the extension of $U_t$ from $\mathcal{H}$ into $\Phi^\times$ leads from reversibility to irreversibility. In this case, irreversibility is a feature arising during the transition from states in $\mathcal{H}$ to states whose state space is defined with respect to contexts. In the algebraic framework of Sec. 3, such contexts are reflected by a contextual topology on M. As mentioned in Sec. 4.2, physical contexts may not be known sufficiently well to determine $\Phi^\times$ uniquely. The physical examples used to demonstrate the significance of the RHS formulation (e.g., decay) suggest that a physical environment is inevitable, although this is not explicit in the formalism.

The relationship between ontic and epistemic states in the RHS approach is more subtle than in the $\Lambda$-transformation approach. As Petrosky and Prigogine argue [47,48], the presence of a sufficient number of Poincaré resonances in so-called large Poincaré systems (LPS) rapidly converts the smooth, infinitely differentiable trajectories of the phase space points into random walks. Though the trajectories are not considered to be empirically inaccessible, their effects are limited to the formation of higher and higher orders of correlations as the dynamics evolves. The phase space points can represent ontic states, but the correlations also have an ontic status. Correlations very rapidly come to dominate the dynamics of all collective modes of behavior of LPS (e.g., the approach to equilibrium) as the correlations diffuse throughout the system. In this way, the effects of individual points and trajectories become irrelevant to the dynamics of the whole, and thus one can argue that the distribution description is an ontic description of the system's behavior.

In this way, the distinction between ontic and epistemic states might be a powerful conceptual tool even at the level of distributions alone. There is a conceptual difference between a probability distribution conceived as a distribution over an ensemble of individual pure states (as in the ^-statistical representation) and a probability distribution conceived as an individual whole. The latter concept is sometimes indicated in the context of intrinsic irreversibility and can be considered as an ontic version of the former (cf. the notion of relative onticity [16]). For instance, continuum mechanics requires a formulation which needs ontically interpreted, holistic distributions from the very beginning, since its description in terms of an ensemble of points would violate basic physical laws.

Among the adherents of intrinsic irreversibility, it is claimed that the holistic concept of a distribution as a whole entails predictions, e.g. related to the dynamics of correlations in large systems, which cannot be obtained with the concept of a probability distribution of individual pure states. This claim particularly refers to situations far from thermal equilibrium. Based on Gallavotti's approach, which describes systems far from equilibrium in terms of SRB-measures [49], i.e. in an ensemble description, this claim may become testable (see also [50] for a brief discussion).

After all, it is possible to view the intrinsic approach to irreversibility as emphasizing the relative importance of the advanced level of complexity of systems with nontrivial correlations over environmental effects. While extrinsic irreversibility addresses the importance of an environment, intrinsic irreversibility should not primarily be understood as focusing on the neglect of such an environment (e.g., the environment may be a necessary condition for the existence of the dynamics). Instead, it is perhaps more appropriate to understand intrinsic irreversibility as irreversibility intrinsic to the dynamics of a system, given a particular degree of its complexity.

Acknowledgments

Helpful comments by L. Accardi, L. Ballentine, H. Narnhofer and I. Volovich during the discussion of this contribution at the conference are much appreciated. We are grateful to H. Primas for remarks on an earlier version of this paper.

References

1. J.H. Fetzer and R.F. Almeder, Glossary of Epistemology/Philosophy of Science (Paragon House, New York, 1993), p. 100f.

2. D. Howard, Space-time and separability: problems of identity and individuation in fundamental physics. In Potentiality, Entanglement, and Passion-at-a-Distance, ed. by R.S. Cohen, M. Horne, and J. Stachel (Kluwer, Dordrecht, 1997), pp. 113-141.

3. W. Heisenberg, Physics and Philosophy (Harper and Row, New York, 1958).

4. D. Bohm, Wholeness and the Implicate Order (Routledge and Kegan Paul, London, 1980).

5. B. d'Espagnat, Veiled Reality (Addison-Wesley, Reading, 1995).

6. H. Margenau, Reality in quantum mechanics. Phil. Science 16, 287-302 (1949), here p. 297.

7. K.R. Popper, The propensity interpretation of probability and quantum mechanics. In Observation and Interpretation in the Philosophy of Physics - With special reference to Quantum Mechanics, ed. by S. Körner in collaboration with M.H.L. Pryce (Constable, London, 1957), pp. 65-70. [Reprinted by Dover, New York, 1962.]

8. R. Harré, Is there a basic ontology for the physical sciences? Dialectica 51, 17-34 (1997).

9. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974), pp. 448-453, 504-507.

10. E. Scheibe, The Logical Analysis of Quantum Mechanics (Pergamon, Oxford, 1973), pp. 82-88.

11. H. Primas, Mathematical and philosophical questions in the theory of open and macroscopic quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 233-257.

12. H. Primas, Endo- and exotheories of matter. In Inside Versus Outside, ed. by H. Atmanspacher and G.J. Dalenoort (Springer, Berlin, 1994), pp. 163-193.

13. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932). English translation: Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).

14. A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777-780 (1935).

15. H. Primas, Emergence in exact natural sciences. Acta Polytechnica Scandinavica Ma 91, 83-98 (1998). See also Primas, Chemistry, Quantum Mechanics, and Reductionism (Springer, Berlin, 1983), Chap. 6.

16. H. Atmanspacher and F. Kronz, Relative onticity. In On Quanta, Mind, and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 273-294.

17. H. Atmanspacher, Ontic and epistemic descriptions of chaotic systems. In Computing Anticipatory Systems: CASYS 99, ed. by D. Dubois (Springer, Berlin, 2000), pp. 465-478.

18. E. Fick and G. Sauermann, Quantenstatistik dynamischer Prozesse, IIa: Antwort- und Relaxationstheorie (Harri Deutsch, Thun, 1986).

19. R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II (Springer, Berlin, 1985).

20. H. Primas, The Cartesian cut, the Heisenberg cut, and disentangled observers. In Symposia on the Foundations of Modern Physics. Wolfgang Pauli as a Philosopher, ed. by K.V. Laurikainen and C. Montonen (World Scientific, Singapore, 1993), pp. 245-269.

21. A. Amann, Structure, dynamics, and spectroscopy of single molecules: a challenge to quantum mechanics. J. Math. Chem. 18, 247-308 (1995).

22. A. Amann and H. Atmanspacher, Fluctuations in the dynamics of single quantum systems. Stud. Hist. Phil. Mod. Phys. 29, 151-182 (1998).

23. A. Amann and H. Atmanspacher, C*- and W*-algebras of observables, their interpretation, and the problem of measurement. In On Quanta, Mind, and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 57-79.

24. H. Primas, Induced nonlinear time evolution of open quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 259-280.

25. N. Wiener, The homogeneous chaos. Am. J. Math. 60, 897-936 (1938).

26. C.M. Lockhart and B. Misra, Irreversibility and measurement in quantum mechanics. Physica A 136, 47-76 (1986). Cf. H. Primas, Math. Rev. 87k, 81006 (1987).

27. A. Lasota and M.C. Mackey, Chaos, Fractals, and Noise (Springer, Berlin, 1995).

28. V.A. Rokhlin, Exact endomorphisms of Lebesgue spaces. Izv. Akad. Nauk SSSR Ser. Mat. 25, 499-530 (1964). Transl. in Am. Math. Soc. Transl. 39, 1-36 (1964).

29. B. Misra, Nonequilibrium entropy, Lyapounov variables, and ergodic properties of classical systems. Proc. Natl. Acad. Sci. USA 75, 1627-1631 (1978).

30. B. Misra, I. Prigogine, and M. Courbage, From deterministic dynamics to probabilistic descriptions. Physica A 98, 1-26 (1979).

31. A. Wightman, Review of Misra, Prigogine, and Courbage [30]. Math. Rev. 82e, 58066 (1982).

32. Z. Suchanecki, On lambda and internal time operators. Physica A 187, 249-266 (1992).

33. H. Atmanspacher and H. Scheingraber, A fundamental link between system theory and statistical mechanics. Found. Phys. 17, 939-963 (1987).

34. H. Atmanspacher, Dynamical entropy in dynamical systems. In Time, Temporality, Now, ed. by H. Atmanspacher and E. Ruhnau (Springer, Berlin, 1997), pp. 325-344.

35. R.W. Batterman, Randomness and probability in dynamical theories: on the proposals of the Prigogine school. Philosophy of Science 58, 241-263 (1991).

36. I. Antoniou, K. Gustafson, and Z. Suchanecki, On the inverse problem of statistical physics: from irreversible semigroups to chaotic dynamics. Physica A 252, 345-361 (1998).

37. I.M. Gel'fand and N.Ya. Vilenkin, Generalized Functions, Vol. 4 (Academic, New York, 1964). Russian original published 1961 in Moscow.

38. J.E. Roberts, The Dirac bra and ket formalism. Journal of Mathematical Physics 7, 1097-1104 (1966).

39. A. Bohm, Rigged Hilbert space and mathematical descriptions of physical systems. In Lectures in Theoretical Physics IX A: Mathematical Methods of Theoretical Physics, ed. by W.E. Brittin, A.O. Barut, and M. Guenin (Gordon and Breach, New York, 1967), pp. 255-317.

40. A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors, and Gel'fand Triplets. Lecture Notes in Physics, Vol. 348, ed. by A. Bohm and J.D. Dollard (Springer, Berlin, 1989).

41. E. Nelson, Analytic vectors. Annals of Mathematics 70, 572-615 (1959).

42. F. Treves, Topological Vector Spaces, Distributions, and Kernels (Academic Press, New York, 1967).

43. A. Bohm, S. Maxson, M. Loewe, and M. Gadella, Quantum mechanical irreversibility. Physica A 236, 485-549 (1997).

44. I. Antoniou and I. Prigogine, Intrinsic irreversibility and integrability of dynamics. Physica A 192, 443-464 (1993).

45. T. Petrosky and I. Prigogine, The Liouville space extension of quantum mechanics. Adv. Chem. Phys. XCIX, 1-120 (1997), here p. 71.

46. G. Ludwig, Foundations of Quantum Mechanics, Vols. 1, 2 (Springer, Berlin, 1983, 1985).

47. T. Petrosky and I. Prigogine, Poincaré resonances and the extension of classical dynamics. Chaos, Solitons & Fractals 7, 441-497 (1996).

48. T. Petrosky and I. Prigogine, The extension of classical dynamics for unstable Hamiltonian systems. Computers & Mathematics with Applications 34, 1-44 (1997).

49. G. Gallavotti, Chaotic dynamics, fluctuations, nonequilibrium ensembles. CHAOS 8, 384-392 (1998).

50. D. Ruelle, Gaps and new ideas in our understanding of nonequilibrium. Physica A 263, 540-544 (1999).

INTERPRETATIONS OF PROBABILITY AND QUANTUM THEORY

L. E. BALLENTINE

Department of Physics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada

e-mail ballentisfuca

There is a peculiar similarity between Probability Theory and Quantum Mechanics: both subjects are mature and successful, yet both remain subject to controversy about their foundations and interpretation. I first present a classification of the various interpretations of probability, arguing that they should not be thought of as rivals, but rather as applications of a general theory to different kinds of subject matter. An axiom system that makes conditional probability the fundamental concept is put forward as being superior to Kolmogorov's axioms. I then discuss the relevance to quantum theory of the various interpretations of probability, the applicability of classical probability theory within quantum mechanics, and the relations between the interpretation of probability and the interpretation of quantum mechanics.

1 Introduction

There are many connections between Probability Theory and Quantum Mechanics, the most notable being that Quantum Mechanics uses Probability Theory in its fundamental interpretation, not merely as a technique. But I wish to concentrate on a more peculiar similarity. Although both subjects are mature and successful, both remain subject to controversy about their foundations and interpretation. There may be even more interpretations of probability than there are of quantum theory. Can one bring some degree of order to this subject?

Probability Theory, being a branch of mathematics, is defined by a set of axioms. So it can legitimately be applied to any entity that satisfies those axioms. Most of the interpretations of probability can be viewed as applications of the formal theory to different subject matters. It is therefore misguided to argue over which is the correct interpretation. Most of them are correct within their appropriate domain of application. But it is still reasonable to ask whether there is a general, overarching form of Probability Theory, of which all the various interpretations can be seen as special cases applied to special subject matters.

I shall propose such a classification of the various interpretations of probability. To do so, it is necessary to overlook small differences and to lump closely related interpretations into a few broad categories. I expect this classification to be controversial, but I believe that it is a step in the right direction. I shall consider only theories that are based on the same or equivalent sets of axioms. Hence generalizations such as negative probabilities are not included in this scheme, although I shall briefly refer to them later. After describing the major categories of interpretation of probability, I will discuss the relevance of each to quantum mechanics.

2 Interpretations of Probability

Many different interpretations of probability are examined in detail by T. L. Fine [1]. I propose to overlook many of the fine differences, and hence classify them into a few major groups, shown in Figure 1. References to most of the authors named in Fig. 1, and critical analyses of their ideas, are given by Fine [1].

2.1 The Theory of Inductive Inference

I propose that the Theory of Inductive Inference be taken as the master theory, and that all other interpretations be regarded as special cases applicable in more restricted contexts. This point of view was expressed most completely by E. T. Jaynes in his book Probability Theory: The Logic of Science, which unfortunately was not completed during his lifetime.

Within this interpretation, probability is assigned to propositions. The notation $P(A|C)$ is to be read as "the probability of A under the condition C". Probability is regarded as a logical relation among propositions that is weaker than entailment. Inductive logic reduces to deductive logic in the limit of probability values 0 and 1. Probability is an objective relation, and should not be confused with degrees of belief.

The propositions to which probability is assigned may have any particular content. If we specialize to propositions about repeated experiments, we obtain the Ensemble-Frequency theory. If we specialize to propositions about personal belief, we obtain Subjective probability. If we specialize to propositions about indeterministic or unpredictable events, we obtain the Propensity theory.

Although $P(A|C)$ is a logical relation between proposition A and the conditioning information C, it is not merely a formal syntactic relation. The content (meaning) of A and C must be invoked to evaluate $P(A|C)$. There is no magic formula to translate arbitrary information into probabilities. Jaynes has given solutions to this problem in some important special cases (symmetry groups, marginalization), but there is as yet no general solution.


[Figure 1: diagram with the Logic of Inductive Inference as the top-level box and the three other interpretations as boxes beneath it.]

The Logic of Inductive Inference (E. T. Jaynes, R. T. Cox, H. Jeffreys): $P(A|C)$ is the probability that proposition A is true, given the information C.

Ensemble and Frequency (Kolmogorov, Bernoulli, von Mises): Measure on a set. Limit frequency in an ordered sequence.

Propensity (K. R. Popper): $P(A|C)$ is the propensity for event A to occur under the condition C.

Subjective and Personal (de Finetti, L. J. Savage, I. J. Good): Incomplete knowledge. Degrees of reasonable belief.

Figure 1. Classification of the interpretations of Probability.

2.2 Ensemble and Frequency Theories

One of the most common interpretations of probability is as a limit frequency in an ordered sequence. The ratio of the number n of occurrences of a particular type in a sequence of N events, $n/N$, is identified with the probability. This interpretation is useful in analyzing repeated experiments, but it has the difficulty that in a random sequence the ratio $n/N$ need not have a limit. The ensemble interpretation is a generalization of the frequency interpretation in which probability is identified with a measure on a set that need not be ordered. It is closely associated with Kolmogorov's axiom system, which will be discussed later.
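The frequency ratio $n/N$ can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function name `running_frequencies`, the event probability 0.3, the seed, and the use of a pseudo-random generator as a stand-in for a repeated experiment are all choices made here for illustration and are not part of the original text.

```python
import random


def running_frequencies(p, n_trials, seed=0):
    """Return the relative frequency n/N after each of n_trials simulated events.

    Each event is of the 'particular type' with probability p (an assumed,
    illustrative value); seed fixes the pseudo-random sequence.
    """
    rng = random.Random(seed)
    n = 0            # number of occurrences so far
    freqs = []
    for big_n in range(1, n_trials + 1):
        if rng.random() < p:
            n += 1
        freqs.append(n / big_n)   # the ratio n/N after N events
    return freqs


freqs = running_frequencies(p=0.3, n_trials=10_000)
for big_n in (10, 100, 1_000, 10_000):
    print(big_n, round(freqs[big_n - 1], 4))
```

In a finite run like this the ratio only fluctuates around 0.3; nothing in the simulation establishes that a limit exists, which is the difficulty noted in the text.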

2.3 Subjective Probability

Subjectivism has its place, and subjective probability provides an excellent way to describe degrees of reasonable belief. But in science, subjectivism can be like a virus, and we must guard against its infection. In general, the probability $P(A|C)$ expresses an objective relation between A and C, determined by the totality of the information C, and not by anyone's personal opinions. Jaynes tried to ensure objectivity through the pedagogical device of introducing a robot that is programmed to reason consistently, using only the information that is given to it. But even Jaynes sometimes slipped from objective to personal probabilities in his examples, without apparently being aware of doing so. Indeed, the contamination of Inductive Logic Probability by subjectivism may have been a major barrier to its acceptance.

2.4 Propensity

Propensity is a form of causality that is weaker than determinism [3,4]. Generally speaking, probability expresses logical relations rather than causal relations. (Recall the old saying, "Correlation does not imply causality".) However, causality is a special kind of logical relation, and propensity theory deals with just that special case. The propensity interpretation of probability is natural in situations, such as those described by quantum mechanics, in which events cannot be predicted with certainty from their antecedents.

3 The Axioms of Probability

The axioms of probability theory can be given in several different forms; however, those given by R. T. Cox [5,6] are particularly convenient.

Axiom 1: $0 \le P(A|B) \le 1$

Axiom 2: $P(A|A) = 1$

Axiom 3: $P(\neg A|B) = 1 - P(A|B)$

Axiom 4: $P(A \& B|C) = P(A|C)\, P(B|A \& C)$

Here the notation is as follows: $\neg A$ means "not A"; $A \& B$ means "A and B"; $A \vee B$ means "either A or B".


Axiom 2 states that the probability of a certainty (A given A) is one. Axiom 1 states that no probabilities are greater than the probability of a certainty. Axiom 3 expresses the notion that the probability of non-occurrence of an event increases as the probability of its occurrence decreases. It also implies $P(\neg A|A) = 0$: an impossibility (not A given A) has zero probability. Axiom 4 is the least intuitive. The probability of both A and B (under some condition C) is equal to the probability of A multiplied by the probability of B given A.

The probabilities of negation ($\neg A$) and conjunction ($A \& B$) each require an axiom. However, no further axioms are required to treat disjunction, because $A \vee B = \neg(\neg A \& \neg B)$; in words, "A or B" is equivalent to the negation of "neither A nor B". This allows us to deduce a theorem:

$P(A \vee B|C) = P(A|C) + P(B|C) - P(A \& B|C)$ (1)

If A and B are mutually exclusive, then we obtain

$P(A \vee B|C) = P(A|C) + P(B|C)$ (2)

which is often taken to be an axiom, and may be used in place of Axiom 3.

Several remarks about these axioms are in order. First, the notion of randomness plays no fundamental role in the theory. Hence we need not enquire whether our variables and events are random as a prerequisite to applying probability theory.
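The step from the axioms to the theorem in Eq. (1) is not spelled out in the text. One possible route is sketched below; it assumes only that conjunction is commutative ($A \& B = B \& A$), that $\neg\neg A = A$, and that none of the conditioning propositions is impossible.

```latex
\begin{align*}
P(A \vee B \mid C)
  &= 1 - P(\neg A \,\&\, \neg B \mid C)
     && \text{Axiom 3, with } A \vee B = \neg(\neg A \,\&\, \neg B) \\
  &= 1 - P(\neg A \mid C)\, P(\neg B \mid \neg A \,\&\, C)
     && \text{Axiom 4} \\
  &= 1 - P(\neg A \mid C)\,\bigl[1 - P(B \mid \neg A \,\&\, C)\bigr]
     && \text{Axiom 3} \\
  &= P(A \mid C) + P(\neg A \,\&\, B \mid C)
     && \text{Axioms 3 and 4} \\
  &= P(A \mid C) + P(B \mid C)\,\bigl[1 - P(A \mid B \,\&\, C)\bigr]
     && \text{Axioms 4 and 3} \\
  &= P(A \mid C) + P(B \mid C) - P(A \,\&\, B \mid C)
     && \text{Axiom 4}
\end{align*}
```

If A and B are mutually exclusive given C, the last term vanishes and Eq. (2) results.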

Second, these axioms are not arbitrary. They are uniquely determined (apart from formal changes that do not affect the content) by conditions of plausibility and consistency (see Cox [5] and Jaynes [2]).

(i) The probability of A on some given evidence determines also the probability of not A on the same evidence.

(ii) The probability on given evidence that both A and B are true is determined by their separate probabilities, one on the given evidence, and the other on that evidence plus the assumption that the first is true.

(iii) If a complex proposition can be composed in more than one way [e.g. $(A \& B) \& C$ or $A \& (B \& C)$], then all ways of computing its probability must lead to the same answer.

Notice that in (i) and (ii) only the existence of certain connections is assumed, but not their mathematical form. The consistency condition (iii) then leads to the mathematical forms of the axioms. Therefore anyone who proposes an inequivalent alternative to Cox's axioms (such as allowing negative probabilities) has an obligation to explain how and why he departs from these conditions of plausibility and consistency.


Finally, a very important remark: All probabilities are conditional.

The use of the single-variable notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information C is obvious from the context and is unchanging throughout the problem. Many fallacies and paradoxes follow from ignoring this principle.

3.1 Kolmogorov's axioms

If the fundamental axioms that define Probability Theory are those given above, then what is the status of Kolmogorov's well-known axioms? According to Kolmogorov's axioms, probability is assigned to subsets of a universal set $\Omega$, with the following rules:

(1) $P(\Omega) = 1$

(2) $P(f) \ge 0$ for any $f$ in $\Omega$

(3) If $f_1, \ldots, f_n$ are disjoint, then $P(f) = \sum_j P(f_j)$, where $f$ is the union of $f_1, \ldots, f_n$.

(4) If $f \to \emptyset$ (the empty set), then $P(f) \to 0$.

The answer, I believe, is that Kolmogorov's axioms provide a mathematical model of probability theory (defined by Cox's axioms) on the theory of measurable sets. A mathematical model is useful because it reduces the consistency of one theory to that of another. (A familiar example is the algebra of complex numbers, which can be modeled by the algebra of ordered pairs of reals.) Thus any doubts about the consistency of Probability Theory may be laid to rest because of the existence of Kolmogorov's model.
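The sense in which a set-theoretic measure models Cox's axioms can be checked by brute force on a small finite example. The sketch below is an illustration only: the four-element universal set (two coin tosses), the uniform measure, and the definition of `cond` as $P(A \cap C)/P(C)$ are assumptions made here, not constructions given in the text.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical universal set: outcomes of two coin tosses.
OMEGA = frozenset({"HH", "HT", "TH", "TT"})


def measure(event):
    """Kolmogorov-style absolute probability: uniform measure on OMEGA."""
    return Fraction(len(event), len(OMEGA))


def cond(a, c):
    """Conditional probability P(A|C), modeled as P(A and C) / P(C)."""
    if not c:
        raise ValueError("P(A|C) is undefined when C is the empty set")
    return measure(a & c) / measure(c)


def all_events():
    """All subsets of OMEGA, each as a frozenset."""
    items = sorted(OMEGA)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def check_cox_axioms():
    """Verify Axioms 1-4 for every combination of events in the model."""
    events = all_events()
    conditions = [e for e in events if e]   # conditioning sets must be non-empty
    for c in conditions:
        assert cond(c, c) == 1                              # Axiom 2
        for a in events:
            assert 0 <= cond(a, c) <= 1                     # Axiom 1
            assert cond(OMEGA - a, c) == 1 - cond(a, c)     # Axiom 3
            for b in events:
                lhs = cond(a & b, c)
                if a & c:
                    assert lhs == cond(a, c) * cond(b, a & c)   # Axiom 4
                else:
                    assert lhs == 0   # P(A|C) = 0, so both sides vanish
    return True


print(check_cox_axioms())
```

Exact rational arithmetic (`Fraction`) is used so that the equalities in Axioms 3 and 4 are tested without floating-point tolerance.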

There are several objections to taking Kolmogorov's axioms as a foundation for Probability Theory, rather than merely as a model.

• The universal set $\Omega$ is often fictitious. The propositions to which probabilities are assigned are not subsets of a set.

• Conditional probability is relegated to secondary status, while the mathematical fiction of absolute probability is made primary.

• Probability theory and Measure theory are distinct subjects. The interesting problems of one are not closely related to the interesting problems of the other. For example, measure theory deals mostly with infinite sets, culminating with the construction of non-measurable sets, which have no probabilistic interpretation. But in probability theory, one seldom needs to consider an infinite number of conjunctions and disjunctions. On the other hand, the important problem of translating qualitative information into probabilities has no measure-theoretic analog.


4 Probability in Quantum Mechanics

4.1 Relevant and Irrelevant Interpretations of Probability

Which of the interpretations of probability are relevant to quantum mechanics? The ensemble-frequency interpretation is obviously relevant, and widely used in discussing the statistics of repeated experiments on similarly prepared states. Indeed, the standard description of an idealized experiment is: (1) prepare a state; (2) measure an observable of the system; (3) repeat the previous two steps until sufficient statistical data has been accumulated; (4) compare the relative frequencies of this data with the probabilities predicted by quantum theory.

The propensity interpretation is in accord with the ensemble-frequency interpretation whenever it is applied to repeated experiments, but it also allows one to make meaningful statements about individual events. The propensity interpretation is more natural when one considers time-dependent states, and hence time-dependent probabilities. Consider the following examples.

(i) A source produces $s = 1/2$ particles, polarized at an angle $\phi$ relative to some coordinate axis. A Stern-Gerlach magnet has its field gradient axis oriented at an angle $\theta$. What is the probability that such a particle incident on the apparatus will emerge with spin up?

The formal answer is, of course, $p = \cos^2[(\theta - \phi)/2]$, but what does this mean?

According to the propensity interpretation, it means: "The propensity (chance) of the particle emerging with spin up is p."

According to the ensemble-frequency interpretation, it means: "In a long run of similar experiments, the fraction of particles emerging with spin up will be (approximately) p."

(ii) Now let the magnet be re-oriented in some arbitrary manner before each particle is released, so that $\theta$ is different in each case.

According to the propensity interpretation, we say nearly the same thing: "The propensity (chance) has a different value $p = p_\theta$ in each case."

But in the ensemble-frequency interpretation, one must conceptually embed each event in an imaginary long run of experiments having the same value of $\theta$ in order to make a frequency statement.


(iii) Suppose next that the polarization direction $\phi$ of the particles is unknown. Can it be inferred from the data of (ii)?

In the ensemble-frequency interpretation, the answer would appear to be "No". A long run of events for each value of $\theta$ would be necessary to estimate $p_\theta$ as a frequency, and hence to determine its dependence on $\theta$.

In the propensity interpretation, the answer is "Yes". Bayesian inference (equivalent to maximum likelihood if the prior probability distribution for $\phi$ is uniform) can determine the most probable value of $\phi$, even if there is only one event for each value of $\theta$.
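The inference described in example (iii) can be sketched numerically. The code below is a minimal illustration under stated assumptions: the true angle `phi_true = 1.0`, the number of events, the uniform random choice of $\theta$, the grid search, and all function names are choices made here, not details given in the text. It uses the spin-up probability $p = \cos^2[(\theta - \phi)/2]$ and a uniform prior, so the most probable $\phi$ coincides with the maximum-likelihood value.

```python
import math
import random


def p_up(theta, phi):
    """Spin-up probability for magnet angle theta and polarization angle phi."""
    return math.cos((theta - phi) / 2.0) ** 2


def simulate(phi_true, n_events, rng):
    """One event per magnet orientation: theta is re-drawn before each particle."""
    data = []
    for _ in range(n_events):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        up = rng.random() < p_up(theta, phi_true)
        data.append((theta, up))
    return data


def log_likelihood(phi, data, eps=1e-12):
    """Log-likelihood of phi given single events at different theta values."""
    total = 0.0
    for theta, up in data:
        p = min(max(p_up(theta, phi), eps), 1.0 - eps)   # guard against log(0)
        total += math.log(p if up else 1.0 - p)
    return total


def most_probable_phi(data, n_grid=360):
    """Posterior mode on a grid; with a uniform prior this is maximum likelihood."""
    grid = [2.0 * math.pi * k / n_grid for k in range(n_grid)]
    return max(grid, key=lambda phi: log_likelihood(phi, data))


rng = random.Random(0)
phi_true = 1.0   # hypothetical 'unknown' polarization angle, in radians
data = simulate(phi_true, 2000, rng)
phi_hat = most_probable_phi(data)
print(phi_hat)
```

No two events share the same $\theta$, so no relative frequency is ever formed for a fixed orientation; the estimate comes entirely from combining single-event likelihoods.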

I have never seen a coherent exposition of QM based on a subjective interpretation of quantum probabilities as representing knowledge. This point (which has also been argued at length by Popper [8]) is worth emphasizing, because the interpretation of probabilities as knowledge seems to be a tenet of the Copenhagen interpretation.

Two persons (with limited knowledge of QM) might have different reasonable beliefs about the position of the electron in the hydrogen atom, and those beliefs could be represented by subjective probabilities. But such ignorance probabilities have nothing to do with $|\psi(x)|^2$ from the Schroedinger equation. $|\psi(x)|^2$ is an objective propensity, not a subjective degree of belief.

The so-called Uncertainty principle, $\Delta x\,\Delta p \ge \hbar/2$, has nothing to do with subjective knowledge or ignorance. Its meaning is that in any physical preparation of a state, the values of $x$ and $p$ will not be reproducible, the widths of their distributions being related by the inequality. The widths $\Delta x$ and $\Delta p$ are objective, predictable, and measurable parameters, which should not be called uncertainties. Indeed, the name Indeterminacy principle is preferable to Uncertainty principle.*

Subjective probabilities can occur in the information games that are played in quantum communication theory. Consider a typical example.

Bob prepares some quantum state but keeps it secret. He tells Alice only that it is one of four (usually nonorthogonal) possible states, and she must try to infer what the hidden state is from a measurement. Alice's incomplete knowledge of that hidden state can be expressed as a subjective probability. Suppose also that Bob tells Carol that the unknown state is one of three possibilities. Carol's knowledge is different from Alice's, and hence her subjective probability will be different. But both of these subjective knowledge probabilities are quite distinct from the objective quantum probabilities (propensities) that would be calculated by solving Schroedinger's equation for Bob's state preparation apparatus.

* When I once heard Heisenberg speak (about 1964), he used the term Indeterminacy principle. In his early writings he used the words Ungenauigkeit (inexactness), Unbestimmtheit (indeterminacy), and Unsicherheit (uncertainty), with various shades of meaning.

I suspect that the subjective knowledge interpretation of QM probabilities came about by accident: the founders of QM may have believed (erroneously) that probability can only be a measure of knowledge/ignorance. Max Born has written that Heisenberg did not know what a matrix was when he was inventing what later became known as matrix mechanics. It is therefore not very radical to suppose that the founders of quantum mechanics had an inadequate understanding of probability.

4.2 Fallacies in the use of Probability

Unsound arguments to the effect that classical probability theory does not apply to QM are woefully common. Before examining an actual argument to that effect, let us first consider a simple classical paradox.

The Bookie's Paradox. A bookie needs to fix the odds on a star track runner who has a 60% chance of winning any race that he enters. There is a race in Paris and a race in Tokyo scheduled on the same day, so he cannot enter both, and we do not know which he will enter. What is the probability that he will win at least one of these races?

Let $A$ = (winning in Paris) and let $B$ = (winning in Tokyo). Clearly $A$ and $B$ are mutually exclusive events, so $P(A \vee B) = P(A) + P(B)$. The probability of his winning at least one race is $0.6 + 0.6 = 1.2$. But this is absurd, since $1.2 > 1$.

The paradox is resolved by taking account of a principle that was noted in Sec. 3:

All probabilities are conditional. The notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information $C$ is obvious from the context and unchanging throughout the problem.

Let us therefore be more precise about the conditions involved. Let $E_P$ = (entering in Paris) and let $E_T$ = (entering in Tokyo). Then clearly we have

$P(A|E_P) = 0.6, \quad P(B|E_P) = 0, \quad P(A|E_T) = 0, \quad P(B|E_T) = 0.6$

Additivity, $P(A \vee B|C) = P(A|C) + P(B|C)$, holds for the same condition $C$ in all terms. But $P(A|E_P)$ and $P(B|E_T)$ are not additive by any valid rule, so the absurd conclusion reached above followed only from an erroneous application of probability theory.
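The resolution can be checked with a few lines of exact arithmetic. This is a minimal sketch: the four conditional probabilities are those given in the text, while `q_paris`, the probability that the runner enters the Paris race, is a hypothetical input that the text does not specify.

```python
from fractions import Fraction

# Conditional probabilities P(event | entry), as given in the text.
P = {
    ("A", "E_P"): Fraction(6, 10),  # wins in Paris, given that he enters Paris
    ("B", "E_P"): Fraction(0),      # cannot win in Tokyo if he enters Paris
    ("A", "E_T"): Fraction(0),
    ("B", "E_T"): Fraction(6, 10),
}

def p_at_least_one(condition):
    """P(A or B | condition): additivity applied under ONE fixed condition."""
    return P[("A", condition)] + P[("B", condition)]

def p_unconditional(q_paris):
    """Law of total probability; q_paris = P(E_P) is a hypothetical input."""
    q_tokyo = 1 - q_paris
    return q_paris * p_at_least_one("E_P") + q_tokyo * p_at_least_one("E_T")

# The fallacy: adding terms that are conditioned on different events.
fallacious_sum = P[("A", "E_P")] + P[("B", "E_T")]

print("fallacious sum:", fallacious_sum)                      # exceeds 1
print("P(A or B | E_P):", p_at_least_one("E_P"))
print("P(A or B), P(E_P) = 1/3:", p_unconditional(Fraction(1, 3)))
```

Whatever value is chosen for `q_paris`, the correctly conditioned result stays at 0.6; only the sum that mixes the two conditions exceeds 1.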

Double-slit Fallacy. A common fallacy about the 2-slit experiment is of exactly the same form. The experiment consists of three parts:

(a) Open slit 1, close slit 2. The probability of a particle arriving at the point $X$ on the screen is $P_1(X)$.

(b) Open slit 2, close slit 1. The probability of a particle arriving at $X$ is now $P_2(X)$.

(c) Open both slits 1 and 2. The probability of a particle arriving at $X$ is $P_{12}(X)$.

Now passage through slit 1 and through slit 2 are mutually exclusive, so we deduce

$P_{12}(X) = P_1(X) + P_2(X)$

which is empirically false. It is then concluded (fallaciously) that classical probability theory does not apply in quantum mechanics.

The above reasoning embodies essentially the same fallacy as does the Bookie's paradox, and it is resolved similarly by paying proper attention to the conditional nature of the probabilities.

Let condition $C_1$ = (slit 1 open, slit 2 closed). Let $C_2$ = (slit 2 open, slit 1 closed). Let $C_3$ = (both slits open).

We observe empirically that

$P(X|C_1) + P(X|C_2) \ne P(X|C_3)$

(due, of course, to interference). But this fact is fully compatible with classical probability theory.
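A toy numerical model makes the point concrete. The sketch below is an illustration under stated assumptions, not a description of any real apparatus: each slit contributes a Gaussian-enveloped plane-wave amplitude on the screen (the function `amplitude` and the parameters `kappa` and `width` are hypothetical), and all three runs share one normalization, chosen so that every emitted particle reaches the screen when both slits are open.

```python
import cmath
import math

def amplitude(x, slit_sign, kappa=3.0, width=2.0):
    """Toy far-field amplitude at screen position x from one slit (assumed model)."""
    envelope = math.exp(-x * x / (4.0 * width * width))
    return envelope * cmath.exp(1j * slit_sign * kappa * x)

xs = [(k - 1000) / 100.0 for k in range(2001)]  # screen positions from -10 to 10
psi1 = [amplitude(x, +1) for x in xs]
psi2 = [amplitude(x, -1) for x in xs]

# Common normalization: with both slits open, all particles are assumed to arrive.
norm = sum(abs(a + b) ** 2 for a, b in zip(psi1, psi2))

p_c1 = [abs(a) ** 2 / norm for a in psi1]                     # P(X | C1)
p_c2 = [abs(b) ** 2 / norm for b in psi2]                     # P(X | C2)
p_c3 = [abs(a + b) ** 2 / norm for a, b in zip(psi1, psi2)]   # P(X | C3)

# Largest pointwise violation of the fallacious "sum rule" across conditions.
gap = max(abs(u + v - w) for u, v, w in zip(p_c1, p_c2, p_c3))

print(f"sum over X of P(X|C3) = {sum(p_c3):.6f}")
print(f"max |P(X|C1) + P(X|C2) - P(X|C3)| = {gap:.6f}")
```

Each of the three lists is non-negative and additive over $X$ under its own condition, so no axiom is violated; the nonzero gap only shows that the axioms do not relate probabilities conditioned on $C_1$, $C_2$ and $C_3$ by simple addition.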

4.3 Quantum Probabilities

Quantum probabilities are not essentially different from classical probabilities, but like quantum theory itself, they do require some care in their interpretation. H. Jeffreys [7] remarked that the probability statements of quantum mechanics are incomplete, because a probability is always relative to a set of data, and the data are not specified. In our terminology, Jeffreys is saying that all probabilities are conditional, and the conditions need to be specified to make the probability statement meaningful. This can be accomplished through a propensity interpretation of quantum probabilities, with proper attention being given to the basic concepts of measurement and state preparation. When that is done, it can be demonstrated [9, 10] that quantum probabilities obey all of the axioms of classical probability theory. The demonstration is straightforward but too lengthy to review here, so I shall only remark on some conceptual points.

(a) The standard formula $P(A = a_n \mid \Psi) = |\langle a_n | \Psi \rangle|^2$, where $A|a_n\rangle = a_n|a_n\rangle$, should be read as:

The probability (propensity) for a measurement of the dynamical variable $A$ to yield the value $a_n$, conditional on the preparation of the state $\Psi$, is $|\langle a_n | \Psi \rangle|^2$.

Note that the propensity is conditioned by the physical process of state preparation, and not by anyone's beliefs or opinions.

(b) One can also calculate the probability of a measurement result conditioned by state preparation and the results of other measurements,

$P(B = b_m \mid (A = a_n) \,\&\, \Psi)$

However, it is necessary that the measurement processes be described dynamically, as an interaction between the object and the apparatus. Simplistic application of the Projection Postulate is liable to give an incorrect answer [11].

(c) No difficulties of principle arise if the probabilities are conditioned on actual events of state preparation and measurement. But assigning probabilities to hypothetical unmeasured values is not always possible. This problem is encountered if we try to introduce joint probability distributions for (unmeasured values of) non-commuting observables, and require the marginal distributions to agree with the quantum probabilities of the individual observables.

In the case of position and momentum, we would like to have a joint distribution $P(x,p)$ that satisfies

$P(x,p) \ge 0$  (3)

$\int P(x,p)\,dp = |\langle x|\Psi\rangle|^2$  (4)

$\int P(x,p)\,dx = |\langle p|\Psi\rangle|^2$  (5)

There are infinitely many solutions to this problem [12], but there is no apparent physical reason for any one of them to be preferred.
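As an illustration of one such solution (chosen here for simplicity, not singled out by the text), the product of the two marginals satisfies (3)-(5) for any normalized state $\Psi$. The label $P_0$ is introduced only for this example.

```latex
% A simple solution of (3)-(5): the product of the quantum marginals.
P_0(x,p) = |\langle x|\Psi\rangle|^2 \, |\langle p|\Psi\rangle|^2 \;\ge\; 0 ,
\qquad
\int P_0(x,p)\,dp
  = |\langle x|\Psi\rangle|^2 \int |\langle p|\Psi\rangle|^2 \,dp
  = |\langle x|\Psi\rangle|^2 ,
\qquad
\int P_0(x,p)\,dx
  = |\langle p|\Psi\rangle|^2 \int |\langle x|\Psi\rangle|^2 \,dx
  = |\langle p|\Psi\rangle|^2 .
```

Each integral uses only the normalization of $\Psi$ in the position and momentum representations. That so featureless a function already qualifies illustrates why no physical reason selects one solution over the others.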

However, in the case of angular momentum, where we might seek a joint distribution $P(J_x, J_y, J_z)$ for the three angular momentum components, it is not difficult to show that no such function can yield the quantum probabilities of the three components as marginals. However, this has more to do with Kochen-Specker [13] difficulties (the impossibility of assigning values to all quantum observables consistent with all the relevant constraints) than with probability theory. There is no case in which a quantum probability is well defined but violates an axiom of classical probability theory.

5 Conclusions

In this paper I have suggested a scheme whereby all the major interpretations of probability are unified, with the separate interpretations now seen as applications of the general theory to particular subject matters. That such different ideas as ensemble-frequency theories, propensity theory, and subjective degrees of reasonable belief can all be encompassed within a single framework is both useful and surprising. Because they can all be described by the same mathematical axioms, it is easy to switch from one kind of probability to another, as may be appropriate in a particular problem. But, on the other hand, one can ask why such different things as frequencies, propensities, and degrees of belief should necessarily obey the same axiom system. This question should stimulate further foundational research.

For the case of degrees of reasonable belief, this work has already been completed by Cox [5, 6], who showed that certain conditions of plausibility and consistency determine the axioms essentially uniquely. Essentially unique means subject only to formal transformations that do not alter the content of the theory. Therefore any alternative, inequivalent system of plausible reasoning could be shown to suffer from some degree of inconsistency.

Khrennikov [14] has studied limit frequencies outside of any theory of probability, imposing only a condition of stabilization: that in a long sequence the frequencies should approach a limit. He has found many different cases to be possible, some of which lie outside of probability theory. It will be interesting to see whether these new logical possibilities are realized in nature. If not, then his stabilization condition will have to be supplemented by other conditions.

The greatest need for more foundational research is in the case of propensity. Although it clearly can be described by the axioms of probability theory, it is not yet clear why it must be so described.

Although I have dealt only with versions of probability theory that are derivable from the same axioms, I expect that the classification of interpretations (Fig. 1) may also be useful for generalized theories, such as those that admit negative probabilities [15]. For such generalizations, we should ask which of the interpretations do they support. Can such generalized probabilities be interpreted as frequencies? As propensities? As degrees of belief? Or must they be given some entirely new interpretation?

There are connections between the interpretations of probability and of quantum mechanics. This must be so, because quantum mechanics does not predict events, but only the probabilities of events. If one adheres exclusively to a frequency interpretation of probability, then one is bound to assert that a quantum state describes only an ensemble of similarly prepared systems. If, on the other hand, one adopts a propensity interpretation of probability, then it becomes possible to make meaningful probability statements about an individual system. However, the empirically testable content of those statements can be realized only by measurements on an ensemble of similarly prepared systems. Thus the frequency interpretation is not made obsolete by the propensity interpretation, but merely broadened. The subjective interpretation of probability can be used in some situations, such as when the observer is not fully informed about the state preparation procedure. But it is never correct to interpret $|\Psi|^2$ as representing knowledge (except perhaps in the trivial case in which the observer's knowledge is complete and in perfect accord with reality).

References

1. T.L. Fine, Theories of Probability, an Examination of Foundations (Academic Press, New York, 1973).

2. E.T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, forthcoming); an incomplete version of this work is available electronically at http://bayes.wustl.edu

3. K.R. Popper, in Observation and Interpretation, ed. S. Korner (Butterworths, London, 1957).

4. K.R. Popper, Realism and the Aim of Science (Hutchinson, London, 1983).

5. R.T. Cox, The Algebra of Probable Inference (Johns Hopkins University Press, Baltimore, MD, 1961).

6. R.T. Cox, Am. J. Phys. 14, 1 (1946).

7. H. Jeffreys, Scientific Inference (Cambridge University Press, Cambridge, 1973), sec 1031.

8. K.R. Popper, Quantum Theory and the Schism in Physics (Hutchinson, London, 1982).

9. L.E. Ballentine, Quantum Mechanics - A Modern Development (World Scientific, Singapore, 1998), Ch. 1.5, 2.4, 9.6.

10. L.E. Ballentine, Am. J. Phys. 54, 883 (1986).

11. L.E. Ballentine, Found. Phys. 20, 1329 (1990).

12. L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, ed. G.T. Moore and M.O. Scully (Plenum, New York, 1986), pp. 97-117.

13. S. Kochen and E.P. Specker, J. Math. Mech. 17, 59 (1967).

14. A. Khrennikov, Nonconventional approach to elements of physical reality based on nonreal asymptotics of relative frequencies, Proc. Conf. Foundations of Probability and Physics, Vaxjo-2000 (WSP, Singapore, 2001).

15. A. Khrennikov, Interpretations of Probability (VSP, Utrecht, 1999).

FORCING DISCRETIZATION AND DETERMINATION IN QUANTUM HISTORY THEORIES

BOB COECKE
Imperial College of Science, Technology & Medicine, Theoretical Physics Group, The Blackett Laboratory, South Kensington, London SW7 2BZ
and
Free University of Brussels, Department of Mathematics, Pleinlaan 2, B-1050 Brussels

E-mail: bocoecke@vub.ac.be

We present a formally deterministic representation for quantum history theories, where we obtain the probabilistic structure via a discrete contextual variable: no continuous probabilities are as such involved at the primal level.

1 Introduction

In this paper we propose and study a model for history theories in which the probability structure emerges from a finite number of contextual happenings, any next happening having a fixed chance to occur under the condition that the previous one happened. Although this model cannot have a canonical mathematical status, since it has been proved that this type of representation in general admits no essentially unique smallest one [8, 11], it provides insight in the emergence of logicality in the History Projection Operator setting [14], and it illustrates how deterministic behavior can be encoded beyond those interpretations of quantum history theories that are interpretationally restricted by so-called consistency or quasi-consistency (e.g., approximate decoherence). The particular motivation for this paradigm case study finds its origin in structural considerations towards a theory of quantum gravity [4, 15, 19]. As argued in [16], although the relative frequency interpretation of probability justifies the continuous interval as the codomain for value assignment, in the quantum gravity regime standard ideas of space and time might break down in such a way that the idea of spatial or temporal ensembles is inappropriate. For the other main interpretations of probability (subjective, logical or propensity) there seems to be no compelling a priori reason why probabilities should be real numbers. Our model should be envisioned as a deconstructive step, unraveling the probabilistic continuum as it appears in standard quantum theory and reducing it explicitly to a discrete temporal sequence of (contextual) events. The as such emerging temporal sequence is then easier to manipulate towards alternative encoding of contextual events, e.g., in propositional terms. It also enables a separate treatment of internal (the system's) and external (the context's) time-encoding variable.

Although quantum history theories are currently most frequently envisioned in a context of so-called decoherence, we prefer to take the minimal perspective that a history theory is a theory that deals with sequential quantum measurements but remains essentially a dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator-approach [14]. We also mention recently studied sequential structures in the context of quantum logic, of which references can be found in [10], resulting in a dynamic disjunctive quantum logic which provides an appropriate formal context to discuss the logicality of history theories.

A general theory on deterministic contextual models can be found in [8]. Note here that what we consider as contextuality is that in a measurement there is an interaction between the system and its context, and that precisely this interaction to some extent may influence the outcome of a measurement. A lack of knowledge on the precise interaction then yields quantum-type uncertainties. Besides this interpretational issue, classical representations are important since we think classically, so even without giving any conceptual significance to the representation, it provides a mode to think deterministically in terms of determined trajectories of the system's state, without having to reconcile with concrete non-canonical constructs like pilot-wave mechanics.

2 Outcome determination via contextual models

We will present the required results in full abstraction, such that the reader clearly sees which structural ingredient of quantum theory determines existence of contextual models. For details and proofs we refer to [8]. Let $\mathcal{B}(\mathbb{R}^N)$ denote the Borel subsets of $\mathbb{R}^N$.

Definition 1. A probabilistic measurement system is given by:
(i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}$;
(ii) For each $e \in \mathcal{E}$, an outcome set $O_e \in \mathcal{B}(\mathbb{R}^N)$, a $\sigma$-field $\mathcal{B}(O_e)$ of $O_e$-subsets, and (Kolmogorovian) probability measures $P_{p,e}: \mathcal{B}(O_e) \to [0,1]$ for each $p \in \Sigma$.

The canonical example is that of quantum theory, with every Hilbert space ray $\psi$ representing a state, every self-adjoint operator $H$ representing a measurement with its spectrum $O_H \subseteq \mathbb{R}$ as outcome set, where the $\sigma$-structure $\mathcal{B}(O_H)$ is inherited from that of $\mathcal{B}(\mathbb{R})$, and with probability measures $P_{\psi,H}(E) := \langle \psi | P_E \psi \rangle$, where $P_E$ denotes the spectral projector for $E \in \mathcal{B}(O_H)$. In benefit of insight, and also for notational convenience, we will from now on assume that the measurements $e \in \mathcal{E}$ are represented in a one-to-one way by their outcome sets $O_e$. Note that whenever $\mathcal{E}$ can be represented by points of $\mathbb{R}^{N'}$, it then suffices to consider $\mathbb{R}^N \times \mathbb{R}^{N'} = \mathbb{R}^{N+N'}$ instead of $\mathbb{R}^N$ to fulfill this assumption, taking $O_e \times \{e\}$ as the corresponding outcome set. We stress however that the results listed below also hold in absence of this assumption [8, §1].

Definition 2. A pre-probabilistic hidden measurement system is given by:
(i) A set of states $\Sigma$ and a set of measurements $\mathcal{E}'$;
(ii) Sets $\mathcal{O} \subseteq \mathcal{B}(\mathbb{R}^N)$ and $\Lambda$ that parameterize $\mathcal{E}'$, i.e., $\mathcal{E}' = \{e_{\lambda,O} \mid \lambda \in \Lambda, O \in \mathcal{O}\}$, and each $e_{\lambda,O} \in \mathcal{E}'$ goes equipped with a map $\varphi_{\lambda,O}: \Sigma \to O$.

We can represent $\{\varphi_{\lambda,O} \mid \lambda \in \Lambda\}$ as $\varphi_O: \Sigma \times \Lambda \to O : (p, \lambda) \mapsto \varphi_{\lambda,O}(p)$, giving $\Lambda$ a similar formal status as the set of states $\Sigma$, or as $\Delta\Lambda_O: \Sigma \times \mathcal{B}(O) \to \mathcal{P}(\Lambda) : (p, E) \mapsto \{\lambda \mid \varphi_O(p, \lambda) \in E\}$, where $\mathcal{P}(\Lambda)$ denotes the set of subsets of $\Lambda$. The core of this definition is that, given a state $p \in \Sigma$ and a value $\lambda \in \Lambda$, we have a completely determined outcome $\varphi_O(p, \lambda)$. These pre-probabilistic hidden measurement systems encode as such fully deterministic settings.

Definition 3. Whenever for a given pre-probabilistic hidden measurement system $(\Sigma, \mathcal{E}'(\mathcal{O}, \Lambda), \{\varphi_O\}_{O \in \mathcal{O}})$ there exists a $\sigma$-field $\mathcal{B}(\Lambda)$ of $\Lambda$-subsets that satisfies $\bigcup_{O \in \mathcal{O}} \{\Delta\Lambda_O(p, E) \mid (p, E) \in \Sigma \times \mathcal{B}(O)\} \subseteq \mathcal{B}(\Lambda)$, it defines a probabilistic hidden measurement system if a probability measure $\mu: \mathcal{B}(\Lambda) \to [0,1]$ is also specified.

The condition on $\Delta\Lambda$ requires that all $\Delta\Lambda_O(p, E)$ are $\mathcal{B}(\Lambda)$-measurable, such that to all triples $(p, O, E)$ we can assign a value $P_{p,O}(E) = \mu(\Delta\Lambda_O(p, E)) \in [0,1]$. As such, any probabilistic hidden measurement system defines a measurement system. The question then arises whether every probabilistic measurement system (MS) can be encoded as a probabilistic hidden measurement system (HMS). The answer to this question is yes [8, §4.2, Theorem 1], [2, 3]: there always exists a canonical HMS-representation for $\Lambda = [0,1]$, $\mathcal{B}(\Lambda) = \mathcal{B}([0,1])$ (i.e., the Borel sets in $[0,1]$) and $\mu_u([0,a]) = a$, i.e., uniformly distributed. The proof goes via a construction using the Loomis-Sikorski Theorem [17, 20] and Marczewski's Lemma [13]. It makes as such sense to investigate how the different possible HMS-representations for different non-isomorphic pairs $(\mathcal{B}(\Lambda), \mu)$ are structured; below it will become clear what we mean here by non-isomorphic.

First we will discuss an example that illustrates the above; it traces back to [1], and details and illustrations can be found in [2, 8]. Consider the states of a spin-$\frac{1}{2}$ entity encoded as a point on the Poincaré sphere $\Sigma_0 \subset \mathbb{R}^3$ (i.e., the rays of $\mathbb{C}^2$). Then any pair of antipodally located points of $\Sigma_0$ encodes mutually orthogonal states, as such encodes mutually orthogonal one-dimensional projectors, and thus a (dichotomic) measurement. Let $p \in \Sigma_0$, let $(\alpha, \neg\alpha)$ be a pair of mutually orthogonal points of $\Sigma_0$ and let $\Lambda_\alpha$ be the diagonal connecting $\alpha$ and $\neg\alpha$. Let $x_p \in \Lambda_\alpha$ be the orthogonal projection of $p$ on the diagonal $\Lambda_\alpha$. Then, for $\lambda \in [x_p, \neg\alpha]$, i.e., $x_p \in [\alpha, \lambda]$, we set $\varphi_\alpha(p, \lambda) = \alpha$, and for $\lambda \in [\alpha, x_p[$, i.e., $x_p \in \,]\lambda, \neg\alpha]$, we set $\varphi_\alpha(p, \lambda) = \neg\alpha$. One then verifies that for $\mu_\alpha: \mathcal{B}([\alpha, \neg\alpha]) \to [0,1] : [\alpha, (1 - x)\alpha + x\neg\alpha] \mapsto x$, i.e., uniformly distributed, we obtain exactly the probability structure for spin-$\frac{1}{2}$ in quantum theory.^a An interpretational proposal of this model could be the following [1, 2, 3]: rather than decomposing states as in so-called hidden variable theories, here we decompose the measurements in deterministic ones; the probability measure $\mu$ should then be envisioned as encoding the lack of knowledge on the interaction of the measured system with its environment, including the measurement device.
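The spin-1/2 model above can be checked numerically. The following sketch rests on assumptions introduced for the example: the diagonal is parameterized by a number `lam` in [0, 1] (0 at the point alpha, 1 at the antipodal point), the state is described by its angle `theta` from alpha on the sphere, and the uniform measure is approximated by a midpoint sum. The function names are hypothetical.

```python
import math

def outcome(theta, lam):
    """Deterministic outcome for the state at angle theta from alpha.

    lam in [0, 1] parameterizes the diagonal: 0 is alpha, 1 is not-alpha.
    """
    x_p = (1.0 - math.cos(theta)) / 2.0  # projection of the state onto the diagonal
    # lam between x_p and not-alpha yields alpha, otherwise not-alpha
    return "alpha" if lam >= x_p else "not-alpha"

def prob_alpha(theta, steps=10_000):
    """Uniform measure of the set of lam yielding alpha (midpoint sum)."""
    hits = sum(outcome(theta, (k + 0.5) / steps) == "alpha" for k in range(steps))
    return hits / steps

for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    quantum = math.cos(theta / 2.0) ** 2  # standard spin-1/2 probability
    print(f"theta = {theta:.3f}: model {prob_alpha(theta):.4f}, quantum {quantum:.4f}")
```

For every fixed pair (`theta`, `lam`) the outcome is fully determined; the quantum probability $\cos^2(\theta/2)$ only appears once `lam` is averaged with the uniform measure, which is the sense in which the measure encodes lack of knowledge about the measurement interaction.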

We now introduce a notion of relative size of HMS-representations, justifying the use of "smaller". Given a $\sigma$-algebra^b $\mathcal{B}$ and probability measure $\mu: \mathcal{B} \to [0,1]$, denote by $\mathcal{B}/\mu$ the $\sigma$-algebra of equivalence classes $[E]$ with respect to the relation

$E \sim E' \iff \mu(E \cap E'^c) = \mu(E' \cap E^c) = 0$

i.e., iff $E$ and $E'$ coincide up to a symmetric difference of measure zero. The ordering of $\mathcal{B}/\mu$ is inherited from $\mathcal{B}$. For notational convenience, denote the induced measure $\mathcal{B}/\mu \to [0,1] : [E] \mapsto \mu(E)$ again by $\mu$. Given two pairs $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ consisting of separable $\sigma$-algebras and probability measures on them, set

$(\mathcal{B}, \mu) \le (\mathcal{B}', \mu') \iff \exists f: \mathcal{B}/\mu \to \mathcal{B}'/\mu'$, an injective $\sigma$-morphism.

We call $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ equivalent, denoted $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$, whenever $f$ in the above is a $\sigma$-isomorphism. Given two MS $(\Sigma, \mathcal{E})$ and $(\Sigma', \mathcal{E}')$ we set

$(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}') \iff \exists s: \Sigma \to \Sigma', \ \exists t: \mathcal{E} \to \mathcal{E}'$, both bijections; $\forall e \in \mathcal{E}, \ \exists f_e: \mathcal{B}(O_e) \to \mathcal{B}(O_{t(e)})$, a $\sigma$-isomorphism; $\forall p \in \Sigma, \forall e \in \mathcal{E}: \ P_{s(p),t(e)} \circ f_e = P_{p,e}$.

Via this equivalence relation we can define a relation $\le_{MS}$ between classes of measurement systems $\mathcal{M}$ and $\mathcal{M}'$ as $\mathcal{M} \le_{MS} \mathcal{M}'$ if for all $(\Sigma, \mathcal{E}) \in \mathcal{M}$ there exists $(\Sigma', \mathcal{E}') \in \mathcal{M}'$ such that $(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}')$, i.e., if $\mathcal{M}$ is included in $\mathcal{M}'$ up to MS-equivalence. We can then prove the following:

(i) $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$ if and only if $(\mathcal{B}, \mu) \le (\mathcal{B}', \mu')$ and $(\mathcal{B}', \mu') \le (\mathcal{B}, \mu)$ [8, §3, Lemma 1]; thus, the equivalence classes with respect to $\sim$ constitute a partially ordered set (poset) for the ordering induced by $\le$. We will denote the set of these equivalence classes by $\mathbb{M}$; a class in it will be denoted via a member of it as $[\mathcal{B}, \mu]$.

^a As shown in [6, 9], this deterministic model for spin-$\frac{1}{2}$ in $\mathbb{R}^3$ can be generalized to $\mathbb{R}^3$-models for arbitrary spin-$N/2$. The states are then represented in the so-called Majorana representation [18, 5], i.e., as $N$ copies of $\Sigma_0$. Correct probabilistic behavior is then obtained by introducing entanglement between the $N$ different spin-$\frac{1}{2}$ systems.

^b I.e., a pointless $\sigma$-field. In particular, it follows from the Loomis-Sikorski theorem [17, 20] that all separable $\sigma$-algebras (i.e., which contain a countable dense subset) can be represented as a $\sigma$-field; it as such also follows that assuming that $\mathcal{B}(\Lambda)$ is a $\sigma$-field and not an abstracted $\sigma$-algebra imposes no formal restriction.

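(ii) When setting $\mathbb{M}_{HMS} = \{\mathcal{M}[\mathcal{B}(\Lambda), \mu] \mid [\mathcal{B}(\Lambda), \mu] \in \mathbb{M}\}$, where $\mathcal{M}[\mathcal{B}(\Lambda), \mu]$ stands for all HMS with $\mathcal{B}(\Lambda')$ and $\mu'$ such that $(\mathcal{B}(\Lambda'), \mu') \in [\mathcal{B}(\Lambda), \mu]$, we have that $(\mathcal{B}(\Lambda), \mu) \le (\mathcal{B}(\Lambda'), \mu')$ and $\mathcal{M}[\mathcal{B}(\Lambda), \mu] \le_{MS} \mathcal{M}[\mathcal{B}(\Lambda'), \mu']$ are equivalent [8, §3, Theorem 2]. This then results in:

Theorem 1. $(\mathbb{M}, \le)$ and $(\mathbb{M}_{HMS}, \le_{MS})$ are isomorphic posets.

One of the crucial ingredients in (ii) above, and also in the proof for general existence with $\Lambda = [0,1]$, is the following: when setting $\Delta\mathcal{M}(\Sigma, \mathcal{E}) = \{(\mathcal{B}(O_e), P_{p,e}) \mid p \in \Sigma, e \in \mathcal{E}\}$, we obtain that $\Sigma, \mathcal{E}$ admits a HMS-representation with $\mathcal{B}(\Lambda)$ and $\mu$ if and only if $\Delta\mathcal{M}(\Sigma, \mathcal{E}) \le (\mathcal{B}(\Lambda), \mu)$, where the order applies pointwise to the elements of $\Delta\mathcal{M}(\Sigma, \mathcal{E})$ [8, §4.2, Theorem 1]. Using this and Theorem 1 above, we can now translate properties of $\mathbb{M}$ to propositions on the existence of certain HMS-representations. We obtain the following: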

(i) $(\mathbb{M}, \le)$ is not a join-semilattice; thus, in general there exists no smallest HMS-representation. As such, we will have to refine our study to particular settings where we are able to make statements whether there exists a smallest one and, if not, whether we can say at least something on the cardinality of $\Lambda$.

(ii) One can prove a number of criteria on $\Delta\mathcal{M}(\Sigma, \mathcal{E})$ that force $(\mathcal{B}(\Lambda), \mu) \sim (\mathcal{B}([0,1]), \mu_u)$, as such assuring existence of a smallest representation. Among these the following: let $\mathbb{M}_{finite} = \{(\mathcal{B}(X), \mu) \in \mathbb{M} \mid X \text{ is finite}\}$; if $\mathbb{M}_{finite} \subseteq \Delta\mathcal{M}(\Sigma, \mathcal{E})$ then $\Lambda$ cannot be discrete. It then follows, for example, that quantum theory restricted to measurements with a finite number of outcomes still requires $\Lambda = [0,1]$.

(iii) Let $\mathbb{M}_N = \{(\mathcal{B}(X), \mu) \in \mathbb{M} \mid X \text{ has at most } N \text{ elements}\}$; if $\Delta\mathcal{M}(\Sigma, \mathcal{E}) \subseteq \mathbb{M}_N$ then there exists a HMS-representation with $\Lambda = \mathbb{N}$. Thus, quantum theory restricted to those measurements with at most a fixed number $N$ of outcomes has a discrete HMS-representation.

(iv) If $\Delta\mathcal{M}(\Sigma, \mathcal{E}) = \mathbb{M}_N$ then there exists no smallest HMS-representation. Neither does it exist when fixing the number of outcomes. So there is no essentially unique smallest HMS-representation for $N$-outcome quantum theory.

Although there exists no smallest, and as such no canonical, discrete HMS-representation, we will give the construction of one solution for dichotomic (or propositional) quantum theory, i.e., $N = 2$, since this will constitute the core of the model presented in this paper. We will follow [8, §2], to which we also refer for a construction for arbitrary $N$. Let us denote the quantum mechanical probability to obtain a positive outcome in a measurement of a proposition or question $\alpha$ on a system in state $p$ as $P_p(\alpha)$. The outcome set consists here of "we obtain a positive answer for the question $\alpha$", slightly abusively denoted as $\alpha$ itself, and "we obtain a negative answer for the question $\alpha$", denoted as $\neg\alpha$. Set inductively for $\lambda \in \mathbb{N}$:^c

$\varphi_\alpha(p, \lambda) = \begin{cases} \alpha & \text{iff } P_p(\alpha) \ge \dfrac{1}{2^\lambda} + \sum_{i=1}^{\lambda-1} \dfrac{1}{2^i}\,\delta(\varphi_\alpha(p, i), \alpha) \\ \neg\alpha & \text{otherwise} \end{cases}$

One verifies that for $\mu(\lambda) = \frac{1}{2^\lambda}$ we obtain the correct probabilities in the resulting HMS-model. This provides a discrete alternative for the above discussed $\mathbb{R}^3$-model for spin-$\frac{1}{2}$. The model, including the projection $x_p$, remains the same, although we don't consider $[\alpha, \neg\alpha]$ as $\Lambda$ anymore. Let $\lambda \in \Lambda = \mathbb{N}$. Set $x_n^\lambda = (1 - \frac{n}{2^\lambda})\alpha + \frac{n}{2^\lambda}\neg\alpha$ for $n \in \{0, 1, \ldots, 2^\lambda\}$. For $x_p \in [\alpha, x_1^\lambda[ \, \cup \, [x_2^\lambda, x_3^\lambda[ \, \cup \, [x_4^\lambda, x_5^\lambda[ \, \cup \ldots \cup \, [x_{2^\lambda - 2}^\lambda, x_{2^\lambda - 1}^\lambda[$ we set $\varphi_\alpha(p, \lambda) = \alpha$, and $\varphi_\alpha(p, \lambda) = \neg\alpha$ otherwise. Then, for $\mu_\alpha: \mathcal{B}(\mathbb{N}) \to [0,1] : \lambda \mapsto \frac{1}{2^\lambda}$, we obtain again quantum probability. Geometrically this means the following: in the first model the values of $\lambda \in \Lambda$ represent points on the diagonal, i.e., a continuous interval, or again equivalently, decompositions of an interval in two intervals; we now consider decompositions of an interval in $2^\lambda$ equally long parts, of which there are only a discrete number of possibilities. We refer to [8] for details and illustrations.
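The inductive rule for the dichotomic case amounts to reading off a binary expansion of $P_p(\alpha)$, and this can be checked with exact rational arithmetic. The sketch below is an illustration only: the quantum probability is passed in directly as a rational number `prob` instead of being computed from a state, and the function names are hypothetical.

```python
from fractions import Fraction

def outcomes(prob, depth):
    """Return [phi(p, 1), ..., phi(p, depth)]; True stands for the outcome alpha.

    prob plays the role of P_p(alpha). The rule for each lam is the inductive
    one: alpha iff prob >= 2**-lam + (sum of 2**-i over earlier alpha outcomes).
    """
    results = []
    accumulated = Fraction(0)  # sum of 2**-i over earlier lam that gave alpha
    for lam in range(1, depth + 1):
        weight = Fraction(1, 2 ** lam)
        positive = prob >= weight + accumulated
        results.append(positive)
        if positive:
            accumulated += weight
    return results

def reconstructed_probability(prob, depth):
    """Total mu-weight of the alpha outcomes, with mu(lam) = 2**-lam."""
    return sum(
        (Fraction(1, 2 ** lam)
         for lam, positive in enumerate(outcomes(prob, depth), start=1)
         if positive),
        Fraction(0),
    )

p = Fraction(5, 8)
print(outcomes(p, 3))                    # [True, False, True]
print(reconstructed_probability(p, 3))   # 5/8

q = Fraction(1, 3)
print(float(q - reconstructed_probability(q, 20)))  # shrinks as depth grows
```

For a dyadic probability such as 5/8 the weighted sum is exact after finitely many values of lambda; for other values it converges from below, the shortfall after `depth` steps staying below $2^{-\text{depth}}$.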

3 Unitary, ortho- and projective structure

In the above discussed $\mathbb{R}^3$-models, rotational symmetries were implicit in their spatial geometry. However, in general the decompositions of measurements over $\mu: \mathcal{B}(\Lambda) \to [0,1]$ go measurement by measurement, so additional structure, if there is any, has to be put in by hand. It is probably fair to say that these contextual models only become non-trivial and useful when encoding physical symmetries within the maps $\varphi_\alpha$ in an appropriate manner. For sake of the argument, we will distinguish between three types of symmetries that can be encoded, namely unitary, ortho- and projective ones.

i. Unitary symmetries. When considering quantum measurements with discrete non-degenerate spectrum, we can represent the outcomes $O_e$ by the corresponding eigenstates $\{p_i\}_i$ via spectral decomposition, i.e., there exists an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\Sigma)$ for each $e \in \mathcal{E}$. Then, specification of $\varphi: \Sigma \times \Lambda \to \{p_i\}_i$ and $\mu$ for one measurement $e_0 \in \mathcal{E}$ fixes it for any other $e \in \mathcal{E}$ by symmetry: $\varphi_e = (U \circ \varphi \circ U^{-1}): \Sigma \times \Lambda \to \{p_{e,i}\}_i$, where $U: \Sigma \to \Sigma$ is the unitary transformation that satisfies $U(p_i) = p_{e,i}$, and $\mu_e = \mu$. This is exactly the symmetry encoded in the above described $\mathbb{R}^3$-models. Note in particular that in this perspective the pairs $(\alpha, \neg\alpha)$ and $(\neg\alpha, \neg(\neg\alpha))$ should not be envisioned as merely a change of names of the outcomes, but truly as putting the measurement device (or at least its detecting part) upside down.^d In this setting, where we represent outcomes as states, the assignment of an outcome can now be envisioned as a true change of state $f_{e,\lambda}: \Sigma \to \Sigma \ (\supseteq O_e) : p \mapsto \varphi_e(p, \lambda)$, as such allowing to describe the behavior of the system under concatenated measurements.

ii. Projective symmetries. For degenerate quantum measurements the outcomes require representation by higher dimensional subspaces, so identification in terms of states now requires an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\mathcal{P}(\Sigma))$. The behavior of states of the system under concatenated measurements then requires specification of a family of projectors $\{\pi_T: \Sigma \to T \mid T \in O_e\}$, e.g., the orthogonal projectors $\pi_A: \Sigma \to A : p \mapsto A \wedge (p \vee A^\perp)$ on the corresponding subspace $A$ in quantum theory. The above discussed non-degenerate case fits also in this picture by setting $O_e \subseteq \{\{p\} \mid p \in \Sigma\}$, where now each $\pi_{\{p\}}: \Sigma \to \{p\}$ is uniquely determined (having a singleton codomain).

iii. Orthosymmetries. The existence of an orthocomplementation on the lattice of closed subspaces of a Hilbert space provides a dichotomic representation for measurements, which can be envisioned as a pair consisting of a (to be verified) proposition $\alpha$ and its negation $\neg\alpha$, in quantum theory yielding $\pi_{\neg A}: \Sigma \to A^\perp : p \mapsto A^\perp \wedge (p \vee A)$. In terms of linear operator calculus we have $\pi_{\neg A} = 1 - \pi_A$, both of them being orthogonal projectors.

^c We agree on $\mathbb{N} = \{1, 2, \ldots\}$. Note here that already by non-uniqueness of binary decomposition, $\frac{1}{2} = \frac{1}{2^1} = \sum_{i \in \mathbb{N}} \frac{1}{2^{i+1}}$, it follows that the construction below is not canonical. Obviously, there are also less pathological differences between the different non-comparable discrete representations [8].

4 Representing quantum history theory

Although quantum history theory involves sequential measurements, one of its goals is to remain an essentially dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator-approach [14]. The key idea here is that the form of logicality aimed at in [14] represents faithfully in the Hilbert space tensor product.^e Let $A = (\alpha_{t_i})_i$ be a (so-called homogeneous) quantum history proposition with temporal support $(t_1, t_2, \ldots, t_n)$. Then, rather than representing this as a sequence of subspaces $(A_i)_i$ or projectors $(\pi_{A_i})_i$, we will either represent $A$ as a pure tensor $\otimes_i A_i$ in the lattice of closed subspaces of the tensor product of the corresponding Hilbert spaces, or as the orthogonal projector $\otimes_i \pi_i$ on this subspace. The crucial property of this representation is then that $\neg A$ again encodes as a projector, namely $\mathrm{id} - \otimes_i \pi_i$ [14], clarifying the notations $\pi_A$ and $\pi_{\neg A}$. Moreover, if $\{A^i\}_i$ is a set of so-called disjoint history propositions, i.e., $\otimes_k A_k^i \perp \otimes_k A_k^j$ for $i \ne j$, then the history proposition that expresses the disjunction of $\{A^i\}_i$ sensu [14] is exactly encoded as the projector $\sum_i \otimes_k \pi_k^i$. We get as such a kind of logical setting that is still encoded in terms of projectors. Note that $\pi_{\neg A}$ is not of the form $\otimes_i \pi_i$, but of the form $\sum_i \otimes_k \pi_k^i$, breaking the structural symmetry between a proposition and its negation in ordinary quantum theory.

^d The attentive reader will note that it is at this point that we escape the so-called hidden variable no-go theorems. They arise when trying to impose contextual symmetries within the states of the system, by requiring that values of observables are independent of the chosen context, e.g., the proof of the Kochen-Specker theorem. Our newly introduced variable $\lambda \in \Lambda$ follows contextual manipulations in an obvious manner.

^e At this point we mention that, in the study of sequential phenomena in the axiomatic quantum theory perspective on quantum logic, sequentiality and compoundness both turn out to be specifications of a universal causal duality [10], as such providing a metaphysical perspective on the use of tensor products both for the description of compound physical systems and sequential processes.

We will now transcribe the observations in the two previous sections to this setting, in order to provide a contextual deterministic model for quantum history theory with discretely originating probabilities. One could say that we will apply a split picture in terms of Schrödinger-Heisenberg: we assume that on the level of unitary evolution we apply the Heisenberg picture, such that we can fix notation without reference to this evolution, but for changes of state due to measurement we will (obviously) express this in the state space. When encoding outcomes in terms of states we need to consider $n$ copies of $\Sigma$, encoding the trajectories due to the measurements. In view of the considerations made above it will be no surprise that we will consider these trajectories as of the form $\otimes_i p_i$ in the tensor product $\otimes_i \Sigma_i$. This will require the introduction of the following pseudo-projector:

$$\pi^{\otimes} : \Sigma \to \otimes_i \Sigma_i : p \mapsto p^{\otimes} = p \otimes \pi_1(p) \otimes \ldots \otimes (\pi_{n-1} \circ \ldots \circ \pi_1)(p).$$

Setting $\Sigma^{\otimes} = \pi^{\otimes}[\Sigma] = \{p^{\otimes} \mid p \in \Sigma\}$, then $\pi^{\otimes} : \Sigma \to \Sigma^{\otimes}$ encodes a bijective representation of $\Sigma$. Noting that $P_p(A) = \langle p^{\otimes} | \pi_A p^{\otimes} \rangle$ is the probability given by quantum theory to obtain $A$, we then set inductively, for fixed $\lambda \in \mathbb{N}$, that $\varphi_A(p, \lambda) = A$ if and only if

[inequality illegible in the source]

and $\varphi_A(p, \lambda) = \neg A$ otherwise. The outcome trajectories in case we obtain $A$ are then given in terms of initial states by $(\pi_A \circ \pi^{\otimes}) : \Sigma \to \otimes_i A_i$. The value $\lambda \in \mathbb{N}$ can be envisioned as follows. We assume it to be a number of contextual events, either real or virtual depending on one's taste, and we assume that, given that some events already happened, the chance of a next one happening is equal to the chance that it doesn't happen. So we actually consider a finite number of probabilistically balanced consecutive binary decisive processes, where the result of the previous one determines whether we actually


will perform the next one. Unitary symmetries are induced in the obvious way as tensored unitary operators $\otimes_i U_i$. This model then produces the statistical behavior of quantum history theory.

The breaking of the structural symmetry between a proposition and its negation manifests itself in the most explicit way, in the sense that when we have a determined outcome $\neg A$ we don't have a determined trajectory in our model. Obviously one could build a fully deterministic model that also determines this by concatenation of individual deterministic models (one for each element in the temporal support), but we feel that this would not be in accordance with the propositional flavor a history theory aims at. The negation $\neg A$ is indeed cognitive and not ontological with respect to the actual executed physical procedure, or in other words the system's context, and one cannot expect an ontological model to encode this in terms of a formal duality. Explicitly, $\neg(A \otimes B)$ can be written both as $(\mathcal{H} \otimes \neg B) \oplus (\neg A \otimes B)$ and $(\neg A \otimes \mathcal{H}) \oplus (A \otimes \neg B)$, which clearly define different procedures with respect to imposed change of state due to the measurement. Even more explicitly, setting

$$\mathcal{HPO}(\{\mathcal{H}_k\}_k) = \Big\{ \sum_i \otimes_k A^i_k \;\Big|\; A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j \Big\},$$

for $\mathcal{L}(\mathcal{H}_k)$ the lattice of closed subspaces of $\mathcal{H}_k$, the ontologically faithful hull of $\mathcal{HPO}(\{\mathcal{H}_k\}_k)$ consists then of all ortho-ideals

$$\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k)) = \Big\{ \downarrow[\{\otimes_k A^i_k\}_i] \;\Big|\; A^i_k \in \mathcal{L}(\mathcal{H}_k),\ \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j \Big\},$$

where $\downarrow[-]$ assigns to a set of pure tensors all pure tensors in $\otimes_k \mathcal{H}_k$ that are smaller than at least one in the given set, this with respect to the ordering in $\mathcal{L}(\otimes_k \mathcal{H}_k)$; the downset $\downarrow[-]$ construction makes $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$ inherit the $\mathcal{L}(\otimes_k \mathcal{H}_k)$-order as intersection. If a particular decomposition is specified as an element of $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$, which means full specification of the physical procedure, where summation over different sequences of pure tensors is now envisioned as choice of procedure, we can provide a deterministic contextual model, the choice of procedure itself becoming an additional variable. Conclusively, the HPO-setting loses part of the physical ontology that goes with an operational perspective on quantum theory, and as such, if we want to provide a deterministic representation for general inhomogeneous history propositions sensu the one we obtained for the homogeneous ones, we formally need to restore this part of the physical ontology, e.g., as $\mathcal{OI}(\mathcal{HPO}(\{\mathcal{H}_k\}_k))$.

5 Further discussion

In this paper we did not provide an answer, and we did not even pose a question. We just provided a new way to think about things, slightly confronting the

A choice that is motivated by the traditional consistent history setting and its interpretation, as well as by a particular semantical perspective on quantum logic as a whole.


usual consistency or decoherence perspective for history theories. Even if one does not subscribe to the underlying deterministic nature of the model, it still exhibits what a minimal representation of the indeterministic ingredients can be, as such representing it in a more tangible way. With respect to the non-existence of a smallest representation: in view of other physical considerations it could be that one of the constructible discrete models presents itself as the truly canonical one, e.g., equilibrium or other thermodynamical considerations, or metastatistical ones emerging from additional modelization.

Acknowledgments

We thank Chris Isham for useful discussions on the content of this paper.

References

1. D. Aerts, J. Math. Phys. 27, 202 (1986).
2. D. Aerts, Int. J. Theor. Phys. 32, 2207 (1993).
3. D. Aerts, Found. Phys. 24, 1227 (1994).
4. G.K. Au, Interview with A. Ashtekar, C.J. Isham and E. Witten, The Quest for Quantum Gravity, arXiv: gr-qc/9506001 (1995).
5. H. Bacry, J. Math. Phys. 15, 1686 (1974).
6. B. Coecke, Helv. Phys. Acta 68, 396 (1995).
7. B. Coecke, Found. Phys. Lett. 8, 437 (1995).
8. B. Coecke, Helv. Phys. Acta 70, 442, 462 (1997); arXiv: quant-ph/0008061 & 0008062; Tatra Mt. Math. Publ. 10, 63.
9. B. Coecke, Found. Phys. 28, 1347 (1998).
10. B. Coecke et al., Found. Phys. Lett. 14 (2001); arXiv: quant-ph/0009100.
11. N. Gisin and C. Piron, Lett. Math. Phys. 5, 379 (1981).
12. S. Gudder, J. Math. Phys. 11, 431 (1970).
13. A. Horn and H. Tarski, Trans. AMS 64, 467 (1948).
14. C.J. Isham, J. Math. Phys. 23, 2157 (1994).
15. C.J. Isham, Structural Issues in Quantum Gravity, in General Relativity and Gravitation: GR14, pp. 167 (World Scientific, Singapore, 1997).
16. C.J. Isham and J. Butterfield, Found. Phys. 30, 1707 (2000).
17. L. Loomis, Bull. AMS 53, 757 (1947).
18. E. Majorana, Nuovo Cimento 9, 43 (1932).
19. C. Rovelli, Strings, Loops and Others: A Critical Survey of the Present Approaches to Quantum Gravity, Plenary Lecture at GR15, Poona, India (1998); arXiv: gr-qc/9803024.
20. R. Sikorski, Fund. Math. 35, 247 (1948).


INTERPRETATIONS OF QUANTUM MECHANICS, AND INTERPRETATIONS OF VIOLATION OF BELL'S INEQUALITY

WILLEM M. DE MUYNCK
Theoretical Physics, Eindhoven University of Technology,
POB 513, 5600 MB Eindhoven, the Netherlands
E-mail: W.M.d.Muynck@tue.nl

The discussion of the foundations of quantum mechanics is complicated by the fact that a number of different issues are closely entangled. Three of these issues are i) the interpretation of probability, ii) the choice between realist and empiricist interpretations of the mathematical formalism of quantum mechanics, iii) the distinction between measurement and preparation. It will be demonstrated that an interpretation of violation of Bell's inequality by quantum mechanics as evidence of non-locality of the quantum world is a consequence of a particular choice between these alternatives. Also, a distinction must be drawn between two forms of realism, viz. a) realist interpretations of quantum mechanics, b) the possibility of hidden-variables (sub-quantum) theories.

1 Realist and empiricist interpretations of quantum mechanics

In realist interpretations of the mathematical formalism of quantum mechanics, state vector and observable are thought to refer to the microscopic object in the usual way presented in most textbooks. Although, of course, preparing and measuring instruments are often present, these are not taken into account in the mathematical description (unless, as in the theory of measurement, the subject is the interaction between object and measuring instrument).

In an empiricist interpretation quantum mechanics is thought to describe relations between input and output of a measurement process. A state vector is just a label of a preparation procedure; an observable is a label of a measuring instrument. In an empiricist interpretation quantum mechanics is not thought to describe the microscopic object. This, of course, does not imply that this object would not exist; it only means that it is not described by quantum mechanics. Explanation of relations between input and output of a measurement process should be provided by another theory, e.g. a hidden-variables (sub-quantum) theory. This is analogous to the way the theory of rigid bodies describes the empirical behavior of a billiard ball, or to the description by thermodynamics of the thermodynamic properties of a volume of gas, explanations being relegated to theories describing the microscopic (atomic) properties of the systems.

Although a term like observable (rather than physical quantity) is evidence of the empiricist origin of quantum mechanics (compare Heisenberg [1]), there has always existed a strong tendency toward a realist interpretation in which observables are considered as properties of the microscopic object, more or less analogous to classical ones. Likewise, many physicists are used to thinking about electrons as wave packets flying around in space, without bothering too much about the Unanschaulichkeit that for Schrödinger [2] was such a problematic feature of quantum theory. Without entering into a detailed discussion of the relative merits of either of these interpretations (e.g. de Muynck [3]), it is noted here that an empiricist interpretation is in agreement with the operational way theory and experiment are compared in the laboratory. Moreover, it is free of paradoxes, which have their origin in a realist interpretation. As will be seen in the next section, the difference between realist and empiricist interpretations is highly relevant when dealing with the EPR problem.

2 EPR experiments and Bell experiments

Figure 1: EPR experiment (measuring instrument for Q or P).

In figure 1 the experiment is depicted that was proposed by Einstein, Podolsky and Rosen [4] to study (in)completeness of quantum mechanics. A pair of particles (1 and 2) is prepared in an entangled state and allowed to separate. A measurement is performed on particle 1. It is essential to the EPR reasoning that particle 2 does not interact with any measuring instrument, thus allowing to consider so-called elements of physical reality of this particle, that can be considered as objective properties, being attributable to particle 2 independently of what happens to particle 1. By EPR this arrangement was presented as a way to perform a measurement on particle 2 without in any way disturbing this particle.

The EPR experiment should be compared to correlation measurements of the type performed by Aspect et al. [5, 6] to test Bell's inequality (cf. figure 2). In these latter experiments also particle 2 is interacting with a measuring instrument. In the literature these experiments are often referred to as EPR experiments too, thus neglecting the fundamental difference between the two measurement arrangements of figures 1 and 2. This negligence has been responsible for quite a bit of confusion, and should preferably be avoided by referring to the latter experiments as Bell experiments rather than EPR ones. In EPR experiments particle 2 is not subject to a measurement, but to a (conditional) preparation (conditional on the measurement result obtained for particle 1). This is especially clear in an empiricist interpretation, because here measurement results cannot exist unless a measuring instrument is present, its pointer positions corresponding to the measurement results.

Figure 2: Bell experiment.

Unfortunately, the EPR experiment of figure 1 was presented by EPR as a measurement performed on particle 2, and accepted by Bohr as such. That this could happen is a consequence of the fact that both Einstein and Bohr entertained a realist interpretation of quantum mechanical observables (note that they differed with respect to the interpretation of the state vector), the only difference being that Einstein's realist interpretation was an objectivistic one (in which observables are considered as properties of the object, possessed independently of any measurement: the EPR elements of physical reality), whereas Bohr's was a contextualistic realism (in which observables are only well-defined within the context of the measurement). Note that in Bell experiments the EPR reasoning would break down because, due to the interaction of particle 2 with its measuring instrument, there cannot exist elements of physical reality.

Much confusion could have been avoided if Bohr had maintained his interactional view of measurement. However, by accepting the EPR experiment as a measurement of particle 2 he had to weaken his interpretation to a relational one (e.g. Popper [7], Jammer [8]), allowing the observable of particle 2 to be co-determined by the measurement context for particle 1. This introduced for the first time non-locality in the interpretation of quantum mechanics. But this could easily have been avoided if Bohr had required that for a measurement of particle 2 a measuring instrument should be actually interacting with this very particle, with the result that an observable of particle $n$ ($n = 1, 2$) can be co-determined in a local way by the measurement context of that particle only. This, incidentally, would have completely made obsolete the EPR elements of physical reality, and would have been quite a bit less confusing than the answer Bohr [9] actually gave (to the effect that the definition of the EPR element of physical reality would be ambiguous because of the fact that it did not take into account the measurement arrangement for the other particle), thus promoting the non-locality idea.

Summarizing, the idea of EPR non-locality is a consequence of i) a neglect of the difference between EPR and Bell experiments (equating elements of physical reality to measurement results), ii) a realist interpretation of quantum mechanics (considering measurement results as properties of the microscopic object, i.e. particle 2). In an empiricist interpretation there is no reason to assume any non-locality.

It is often asserted that non-locality is proven by the Aspect experiments because these are violating Bell's inequality. The reason for such an assertion is that it is thought that non-locality is a necessary condition for a derivation of Bell's inequality. However, as will be demonstrated in the following, this cannot be correct, since this inequality can be derived from quite different assumptions. Also, experiments like the Aspect ones, although violating Bell's inequality, do not exhibit any trace of non-locality, because their measurement results are completely consistent with the postulate of local commutativity, implying that relative frequencies of measurement results are independent of which measurements are performed in causally disconnected regions. Admittedly, this does not logically exclude a certain non-locality at the individual level, being unobservable at the statistical level of quantum mechanical probability distributions. However, from a physical point of view a peaceful coexistence between locality at the (physically relevant) statistical level and non-locality at the individual level is extremely implausible. Unobservability of the latter would require a kind of conspiracy not unlike the one making unobservable the 19th century world aether. For this reason the non-locality explanation of the experimental violation of Bell's inequality does not seem to be very plausible, and it does seem wise to look for alternative explanations.

Since non-locality is never the only assumption in deriving Bell's inequality, such alternative explanations do exist. Thus, Einstein's assumption of the existence of elements of physical reality is such an additional assumption. More generally, in Bell's derivation [10] the existence of hidden variables is one. Is it still possible to derive Bell's inequality if these assumptions are abolished? Moreover, even assuming the possibility of hidden-variables theories, are there in Bell's derivation no hidden assumptions, additional to the locality assumption?

Bell's inequality refers to a set of four quantum mechanical observables, $A_1$, $B_1$, $A_2$ and $B_2$, observables with different/identical indices being compatible/incompatible. In the Aspect experiments measurements of the four possible compatible pairs are performed (in these experiments $A_n$ and $B_n$ refer to polarization observables of photon $n$, $n = 1, 2$, respectively). Bell's inequality can typically be derived for the stochastic quantities of a classical Kolmogorovian probability theory. Hence, violation of Bell's inequality is an indication that observables $A_1$, $B_1$, $A_2$ and $B_2$ are not stochastic quantities in the sense of Kolmogorov's probability theory. In particular, there cannot exist a quadrivariate joint probability distribution of these four observables. Such a non-existence is a consequence of the incompatibility of certain of the observables. Since incompatibility is a local affair, this is another reason to doubt the non-locality explanation of the violation of Bell's inequality.

In the following, derivations of Bell's inequality will be scrutinized to see whether the non-locality assumption is as crucial as was assumed by Bell. In doing so it is necessary to distinguish derivations in quantum mechanics from derivations in hidden-variables theories.

3 Bell's inequality in quantum mechanics

For dichotomic observables having values $\pm 1$, Bell's inequality is given according to

$$\langle A_1 A_2 \rangle - \langle A_1 B_2 \rangle - \langle B_1 B_2 \rangle - \langle B_1 A_2 \rangle \leq 2. \qquad (1)$$

A more general inequality, being valid for arbitrary values of the observables, is the BCHS inequality

$$-1 \leq p(b_1, a_2) + p(b_1, b_2) + p(a_1, b_2) - p(a_1, a_2) - p(b_1) - p(b_2) \leq 0, \qquad (2)$$

from which (31) can be derived for the dichotomic case. Because of its independence of the values of the observables, inequality (32) is preferable by far over inequality (31). Bell's inequality may be violated if some of the observables are incompatible: $[A_1, B_1]_- \neq O$, $[A_2, B_2]_- \neq O$.
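As a rough numerical illustration of such a violation, the sketch below evaluates the combination of correlations appearing in Bell's inequality for one particular choice of analyzer angles. It rests on assumptions that are not taken from the text: the photon pair is taken to be in the polarization-entangled state $(|HH\rangle + |VV\rangle)/\sqrt{2}$, for which the correlation of two $\pm 1$-valued polarization observables at analyzer angles $\theta_1$ and $\theta_2$ is $\cos 2(\theta_1 - \theta_2)$, and the angles (as well as the function names) are chosen only for illustration.

```python
import math

def correlation(theta1_deg, theta2_deg):
    """Correlation <X1 X2> of +/-1 polarization observables at the given angles.

    Assumption (not from the text): the pair is in the state
    (|HH> + |VV>)/sqrt(2), for which <X1 X2> = cos 2(theta1 - theta2).
    """
    return math.cos(2.0 * math.radians(theta1_deg - theta2_deg))

def bell_expression(a1, b1, a2, b2):
    """<A1 A2> - <A1 B2> - <B1 B2> - <B1 A2> for analyzer angles in degrees."""
    return (correlation(a1, a2) - correlation(a1, b2)
            - correlation(b1, b2) - correlation(b1, a2))

# Illustrative angles: A1 and B1 differ by 45 degrees, as do A2 and B2,
# so that the observables on each side are maximally incompatible.
S = bell_expression(a1=0.0, b1=-45.0, a2=22.5, b2=67.5)
print(S, S > 2.0)
```

With these angles each of the four terms contributes $1/\sqrt{2}$, so the expression exceeds the bound 2 of the inequality.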

I shall now discuss two derivations of Bell's inequality which can be formulated within the quantum mechanical formalism, and which do not rely on the existence of hidden variables. The first one is relying on a possessed values principle, stating that

possessed values principle: values of quantum mechanical observables may be attributed to the object as objective properties, possessed by the object independent of observation.

The possessed values principle can be seen as an expression of the objectivistic-realist interpretation of the quantum mechanical formalism preferred by Einstein (compare the EPR elements of physical reality). The important point is that by this principle well-defined values are simultaneously attributed to incompatible observables. If $a_i^{(n)}, b_j^{(n)} = \pm 1$ are the values of $A_i$ and $B_j$ for the $n$th of a sequence of $N$ particle pairs, then we have

$$-2 \leq a_1^{(n)} a_2^{(n)} - a_1^{(n)} b_2^{(n)} - b_1^{(n)} b_2^{(n)} - b_1^{(n)} a_2^{(n)} \leq 2,$$

from which it directly follows that the quantities

$$\langle A_1 A_2 \rangle = \frac{1}{N} \sum_{n=1}^{N} a_1^{(n)} a_2^{(n)}, \quad \text{etc.}$$

must satisfy Bell's inequality (31) (a similar derivation has first been given by Stapp [11], although starting from quite a different interpretation). The essential point in the derivation is the assumption of the existence of a quadruple of values $(a_1, b_1, a_2, b_2)$ for each of the particle pairs.
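The pointwise bound used in this derivation can be checked by brute force. The following minimal sketch enumerates all 16 possible quadruples of possessed values $\pm 1$ and evaluates the combination $a_1 a_2 - a_1 b_2 - b_1 b_2 - b_1 a_2$ for each; the function names are illustrative only.

```python
from itertools import product

def bell_combination(a1, b1, a2, b2):
    """Pointwise combination a1*a2 - a1*b2 - b1*b2 - b1*a2 for one particle pair."""
    return a1 * a2 - a1 * b2 - b1 * b2 - b1 * a2

# All 16 assignments of possessed values +1/-1 to (A1, B1, A2, B2).
values = [bell_combination(*q) for q in product((+1, -1), repeat=4)]
bound_holds = all(-2 <= v <= 2 for v in values)

def average_combination(quadruples):
    """Average of the pointwise combination over a sequence of N particle pairs."""
    return sum(bell_combination(*q) for q in quadruples) / len(quadruples)

print(bound_holds, sorted(set(values)))
```

Since every quadruple gives a value between -2 and 2, any average over a sequence of quadruples, as computed by `average_combination`, lies in the same interval. This is all the derivation needs once a quadruple is assumed to exist for each pair.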

From the experimental violation of Bell's inequality it follows that an objectivistic-realist interpretation of the quantum mechanical formalism, encompassing the possessed values principle, is impossible. Violation of Bell's inequality entails failure of the possessed values principle (no quadruples available). In view of the important role measurement is playing in the interpretation of quantum mechanics this is hardly surprising. As is well-known, due to the incompatibility of some of the observables the existence of a quadruple of values can only be attained on the basis of doubtful counterfactual reasoning. If a realist interpretation is feasible at all, it seems to have to be a contextualistic one, in which the values of observables are co-determined by the measurement arrangement. In the case of Bell experiments non-locality does not seem to be involved.

As a second possibility to derive Bell's inequality within quantum mechanics we should consider derivations of the BCHS inequality (32) from the existence of a quadrivariate probability distribution $p(a_1, b_1, a_2, b_2)$ by Fine [12] and Rastall [13] (also de Muynck [14]). Hence, from violation of Bell's inequality the non-existence of a quadrivariate joint probability distribution follows. In view of the fact that incompatible observables are involved, this, once again, is hardly surprising.
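The implication from a quadrivariate distribution to the BCHS inequality can be illustrated numerically. The sketch below is only a spot check under stated assumptions, not the derivation of the cited authors: outcomes are coded as yes/no (1/0), $p(x, y)$ is read as the probability that both named observables give "yes", and the joint distributions are drawn at random with a fixed seed.

```python
import random
from itertools import product

NAMES = ("a1", "b1", "a2", "b2")

def random_joint(rng):
    """Random probability distribution over yes/no quadruples (a1, b1, a2, b2)."""
    outcomes = list(product((0, 1), repeat=4))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}

def prob_yes(joint, *observables):
    """Marginal probability that all the named observables give 'yes' (1)."""
    idx = [NAMES.index(name) for name in observables]
    return sum(p for o, p in joint.items() if all(o[i] == 1 for i in idx))

def bchs_expression(joint):
    """p(b1,a2) + p(b1,b2) + p(a1,b2) - p(a1,a2) - p(b1) - p(b2)."""
    return (prob_yes(joint, "b1", "a2") + prob_yes(joint, "b1", "b2")
            + prob_yes(joint, "a1", "b2") - prob_yes(joint, "a1", "a2")
            - prob_yes(joint, "b1") - prob_yes(joint, "b2"))

rng = random.Random(0)
samples = [bchs_expression(random_joint(rng)) for _ in range(1000)]
# Small tolerance for floating-point rounding.
all_within = all(-1.0 - 1e-12 <= s <= 1e-12 for s in samples)
print(all_within)
```

Every sampled distribution stays within the bounds because the expression already lies between -1 and 0 for each individual quadruple, and a joint distribution merely averages over quadruples.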

A priori there are two possible reasons for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$. First, it is possible that $\lim_{N \to \infty} N(a_1, b_1, a_2, b_2)/N$ of the relative frequencies of quadruples of measurement results does not exist. Since, however, Bell's inequality already follows from the existence of relative frequency $N(a_1, b_1, a_2, b_2)/N$ with finite $N$, and the limit $N \to \infty$ is never involved in any experimental implementation, this answer does not seem to be sufficient. Therefore the reason for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$ can only be the non-existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$. This seems to reduce the present case to the previous one: Bell's inequality can be violated because quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$ do not exist.

Could non-locality explain the non-existence of quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$? Indeed, it could. If the value of $A_1$, say, is co-determined by the measurement arrangement of particle 2, then non-locality could entail

$$a_1(A_2) \neq a_1(B_2), \qquad (3)$$

thus preventing the existence of one single value of observable $A_1$ for the two Aspect experiments involving this observable. This precisely is the non-locality explanation referred to above. This explanation is close to Bohr's ambiguity answer to EPR referred to in section 2, stating that the definition of an element of physical reality of observable $A_1$ must depend on the measurement context of particle 2.

As will be demonstrated next, there is a more plausible local explanation, however, based on the inequality

$$a_1(A_1) \neq a_1(B_1), \qquad (4)$$

expressing that the value of $A_1$, say, will depend on whether either $A_1$ or $B_1$ is measured. Inequality (34) could be seen as an implementation of Heisenberg's disturbance theory of measurement, to the effect that observables incompatible with the actually measured one are disturbed by the measurement. That such an effect is really occurring in the Aspect experiments can be seen from the generalized Aspect experiment depicted in figure 3. This experiment should be compared with the Aspect switching experiment, in which the switches have been replaced by two semi-transparent mirrors (transmissivities $\gamma_1$ and $\gamma_2$, respectively). The four Aspect experiments are special cases of the generalized one, having $\gamma_n = 0$ or $1$, $n = 1, 2$.

Restricting for a moment to one side of the interferometer, it is possible to calculate the joint detection probabilities of the two detectors according to

$$\left( p_{\gamma_1}(a_{1i}, b_{1j}) \right) = \begin{pmatrix} 0 & \gamma_1 \langle E^{(1)}_+ \rangle \\ (1 - \gamma_1) \langle F^{(1)}_+ \rangle & 1 - \gamma_1 \langle E^{(1)}_+ \rangle - (1 - \gamma_1) \langle F^{(1)}_+ \rangle \end{pmatrix}, \qquad (5)$$

in which $\{E^{(1)}_+, E^{(1)}_-\}$ and $\{F^{(1)}_+, F^{(1)}_-\}$ are the spectral representations of the two polarization observables ($A_1$ and $B_1$) in directions $\theta_1$ and $\theta_1'$, respectively. The values $a_{1i} = +, -$ and $b_{1j} = +, -$ correspond to yes/no registration of a photon by the detector. $p_{\gamma_1}(+, +) = 0$ means that, like in the switching experiment, only one of the detectors can register photon 1. There, however, is a fundamental difference with the switching experiment, because in this latter experiment the photon wave packet is sent either toward one detector or the other, whereas in the present one it is split so as to interact coherently with both detectors. This makes it possible to interpret the right hand part of the generalized experiment of figure 3 as a joint non-ideal measurement of the incompatible polarization observables in directions $\theta_1$ and $\theta_1'$ (e.g. de Muynck et al. [15]), the joint probability distribution of the observables being given by (5).

Figure 3: Generalized Aspect experiment.

It is not possible to extensively discuss here the relevance of experiments of the generalized type for understanding Heisenberg's disturbance theory of measurement and its relation to the Heisenberg uncertainty relations (see e.g. de Muynck [16]). The important point is that such experiments do not fit into the standard (Dirac-von Neumann) formalism, in which a probability is an expectation value of a projection operator. Indeed, from (5) it follows that $p_{\gamma_1}(a_{1i}, b_{1j}) = \mathrm{Tr}\, \rho R^{(1)}_{ij}$ is yielding operators $R^{(1)}_{ij}$ according to

$$\left( R^{(1)}_{ij} \right) = \begin{pmatrix} O & \gamma_1 E^{(1)}_+ \\ (1 - \gamma_1) F^{(1)}_+ & \gamma_1 E^{(1)}_- + (1 - \gamma_1) F^{(1)}_- \end{pmatrix}. \qquad (6)$$

The set of operators $\{R^{(1)}_{ij}\}$ constitutes a so-called positive operator-valued measure (POVM). Only generalized measurements corresponding to POVMs are able to describe joint non-ideal measurements of incompatible observables. By calculating the marginals of probability distribution $p_{\gamma_1}(a_{1i}, b_{1j})$ it is possible to see that for each value of $\gamma_1$ information is obtained on both polarization observables, be it that information on polarization in direction $\theta_1$ gets more non-ideal as $\gamma_1$ decreases, while information on polarization in direction $\theta_1'$ is getting more ideal. This is in perfect agreement with the idea of mutual disturbance in a joint measurement of incompatible observables. The explanation of the non-existence of a single measurement result for observable $A_1$, say, as implied by inequality (34), is corroborated by this analysis.
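The POVM structure and its marginals can be made concrete in a small numerical sketch. Several ingredients below are assumptions made for illustration and do not come from the text: linear polarization is modeled by real 2x2 projectors, the four operators are taken as $O$, $\gamma_1 E_+$, $(1-\gamma_1) F_+$ and $\gamma_1 E_- + (1-\gamma_1) F_-$, and the angles, the value of $\gamma_1$, and the input state are arbitrary.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO = [[0.0, 0.0], [0.0, 0.0]]

def projector(theta_deg):
    """Real 2x2 projector onto linear polarization at angle theta (assumed model)."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c * c, c * s], [c * s, s * s]]

def scale(k, m):
    return [[k * x for x in row] for row in m]

def add(m, n):
    return [[x + y for x, y in zip(r, q)] for r, q in zip(m, n)]

def trace_product(rho, op):
    """Tr(rho op) for 2x2 matrices."""
    return sum(rho[i][j] * op[j][i] for i in range(2) for j in range(2))

def bivariate_povm(gamma, theta, theta_prime):
    """Operators R_ij indexed by (a, b) with a, b in {'+', '-'}."""
    e_plus, f_plus = projector(theta), projector(theta_prime)
    e_minus = add(IDENTITY, scale(-1.0, e_plus))
    f_minus = add(IDENTITY, scale(-1.0, f_plus))
    return {
        ("+", "+"): ZERO,
        ("+", "-"): scale(gamma, e_plus),
        ("-", "+"): scale(1.0 - gamma, f_plus),
        ("-", "-"): add(scale(gamma, e_minus), scale(1.0 - gamma, f_minus)),
    }

gamma = 0.25                      # illustrative transmissivity
povm = bivariate_povm(gamma, theta=0.0, theta_prime=45.0)
rho = projector(30.0)             # illustrative pure input state
probs = {key: trace_product(rho, op) for key, op in povm.items()}

# The four operators should add up to the identity (normalization of the POVM).
total = add(add(povm[("+", "+")], povm[("+", "-")]),
            add(povm[("-", "+")], povm[("-", "-")]))
marginal_a_plus = probs[("+", "+")] + probs[("+", "-")]   # gamma * <E+>
marginal_b_plus = probs[("+", "+")] + probs[("-", "+")]   # (1 - gamma) * <F+>
print(total, marginal_a_plus, marginal_b_plus)
```

In this sketch the marginal for one observable is the ideal probability scaled by $\gamma_1$, and the other by $1 - \gamma_1$, which is one way to read the statement that information on one polarization becomes less ideal as information on the other becomes more ideal.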


The analysis can easily be extended to the joint detection probabilities of the whole experiment of figure 3. The joint detection probability distribution of all four detectors is given by the expectation value of a quadrivariate POVM $\{R_{ijkl}\}$ according to

$$p_{\gamma_1 \gamma_2}(a_{1i}, b_{1j}, a_{2k}, b_{2l}) = \mathrm{Tr}\, \rho R_{ijkl}. \qquad (7)$$

This POVM can be expressed in terms of the POVMs of the left and right interferometer arms according to

$$R_{ijkl} = R^{(1)}_{ij} R^{(2)}_{kl}. \qquad (8)$$

It is important to note that the existence of the quadrivariate joint probability distribution (7), and the consequent satisfaction of Bell's inequality, is a consequence of the existence of quadruples of measurement results, available because it is possible to determine for each individual particle pair what is the result of each of the four detectors. Although because of (35) also locality is assumed, this does not play an essential role. Under the condition that a quadruple of measurement results exists for each individual photon pair, Bell's inequality would be satisfied also if, due to non-local interaction, $R_{ijkl}$ were not a product of operators of the two arms of the interferometer. The reason why the standard Aspect experiments do not satisfy Bell's inequality is the non-existence of a quadrivariate joint probability distribution yielding the bivariate probabilities of these experiments as marginals. Such a non-existence is strongly suggested by Heisenberg's idea of mutual disturbance in a joint measurement of incompatible observables. This is corroborated by the easily verifiable fact that the quadrivariate joint probability distributions of the standard Aspect experiments, obtained from (7) and (35) by taking $\gamma_n$ to be either 1 or 0, are all distinct. Moreover, in general the quadrivariate joint probability distribution (7) for one standard Aspect experiment does not yield the bivariate ones of the other experiments as marginals. Although it is not strictly excluded that a quadrivariate joint probability distribution might exist having the bivariate probabilities of the standard Aspect experiments as marginals (hence, different from the ones referred to above), the mathematical formalism of quantum mechanics does not give any reason to surmise its existence. As far as quantum mechanics is concerned, the standard Aspect experiments need not satisfy Bell's inequality.


4 Bell's inequality in stochastic and deterministic hidden-variables theories

In stochastic hidden-variables theories quantum mechanical probabilities are usually given as

$$p(a_1) = \int_{\Lambda} d\lambda\, p(\lambda)\, p(a_1|\lambda), \qquad (1)$$

in which $\Lambda$ is the space of hidden variable $\lambda$ (to be compared with classical phase space), and $p(a_1|\lambda)$ is the conditional probability of measurement result $A_1 = a_1$ if the value of the hidden variable was $\lambda$, and $p(\lambda)$ the probability of $\lambda$. It should be noticed that expression (41) fits perfectly into an empiricist interpretation of the quantum mechanical formalism, in which measurement result $a_1$ is referring to a pointer position of a measuring instrument, the object being described by the hidden variable. Since $p(a_1|\lambda)$ may depend on the specific way the measurement is carried out, the stochastic hidden-variables model corresponds to a contextualistic interpretation of quantum mechanical observables. Deterministic hidden-variables theories are just special cases in which $p(a_1|\lambda)$ is either 1 or 0. In the deterministic case it is possible to associate in a unique way (although possibly dependent on the measurement procedure) the value $a_1$ to the phase space point $\lambda$ the object is prepared in. A disadvantage of a deterministic theory is that the physical interaction of object and measuring instrument is left out of consideration, thus suggesting measurement result $a_1$ to be a (possibly contextually determined) property of the object. In order to have maximal generality it is preferable to deal with the stochastic case.

For Bell experiments we have

$$p(a_1, a_2) = \int_{\Lambda} d\lambda\, p(\lambda)\, p(a_1, a_2|\lambda), \qquad (2)$$

a condition of conditional statistical independence,

$$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda), \qquad (3)$$

expressing that the measurement procedures of $A_1$ and $A_2$ do not influence each other (so-called locality condition).

As is well-known, the locality condition was thought by Bell to be the crucial condition allowing a derivation of his inequality. This does not seem to be correct, however. As a matter of fact, Bell's inequality can be derived if a quadrivariate joint probability distribution exists [12, 13]. In a stochastic hidden-variables theory such a distribution could be represented by

$$p(a_1, b_1, a_2, b_2) = \int_{\Lambda} d\lambda\, p(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda), \qquad (4)$$

without any necessity that the conditional probability be factorizable in order that Bell's inequality be satisfied (although for the generalized experiment discussed in section 3 it would be reasonable to require that $p(a_1, b_1, a_2, b_2|\lambda) = p(a_1, b_1|\lambda)\, p(a_2, b_2|\lambda)$). Analogous to the quantum mechanical case, it is sufficient that for each individual preparation (here parameterized by $\lambda$) a quadruple of measurement results exists. If Heisenberg measurement disturbance is a physically realistic effect in the experiments at issue, it should be described by the hidden-variables theory as well. Therefore the explanation of the non-existence of such quadruples is the same as in quantum mechanics.

However with respect to the possibility of deriving Bells inequality there is an important difference between quantum mechanics and the stochastic hidden-variables theories of the kind discussed here Whereas quantum meshychanics does not yield any indication as regards the existence of a quadrivariate joint probability distribution returning the bivariate probabilities of the Asshypect experiments as marginals local stochastic hidden-variables theory does Indeed using the single-observable conditional probabilities assumed to exist in the local theory (compare (3)) it is possible to construct a quadrivariate joint probability distribution according to

$$p(a_1,a_2,b_1,b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda)\,p(a_2|\lambda)\,p(b_1|\lambda)\,p(b_2|\lambda) \qquad (42)$$

satisfying all requirements. It should be noted that (42) does not describe the results of any joint measurement of the four observables that are involved. Quadruples $(a_1, a_2, b_1, b_2)$ are obtained here by combining measurement results found in different experiments, assuming the same value of $\lambda$ in all experiments. For this reason the physical meaning of this probability distribution is not clear. However, this does not seem to be important. The existence of (42) as a purely mathematical constraint is sufficient to warrant that any stochastic hidden-variables theory in which (2) and (3) are satisfied must require that the standard Aspect experiments obey Bell's inequality. Admittedly, there is a possibility that (42) might not be a valid mathematical entity, because it is based on multiplication of the probability distributions $p(a|\lambda)$, which might be distributions in the sense of Schwartz distribution theory. However, the remark made with respect to the existence of probability distributions as infinite-$N$ limits of relative frequencies is valid here also: the reasoning does not depend on this limit, but is equally applicable to relative frequencies in finite sequences.
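The product construction of the quadrivariate distribution can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the hidden variable is taken to be discrete with a handful of values, outcomes are restricted to ±1, and the weights and conditional probabilities are drawn at random; none of these choices (nor the helper names `random_local_model`, `p_cond`, `quadrivariate`, `chsh`) come from the text. The sketch builds the joint distribution as a sum over $\lambda$ of products of single-observable conditionals and evaluates the CHSH form of Bell's inequality for the pairs $(A_1,A_2)$, $(A_1,B_2)$, $(B_1,A_2)$, $(B_1,B_2)$.

```python
import itertools
import random

OUTCOMES = (-1, +1)
OBSERVABLES = ("a1", "a2", "b1", "b2")  # order used for the 4-tuples below


def random_local_model(n_lambda, rng):
    """Draw hypothetical weights p(lambda_k) and conditionals p(x = +1 | lambda_k)."""
    raw = [rng.random() for _ in range(n_lambda)]
    norm = sum(raw)
    p_lambda = [w / norm for w in raw]
    p_plus = {obs: [rng.random() for _ in range(n_lambda)] for obs in OBSERVABLES}
    return p_lambda, p_plus


def p_cond(p_plus, obs, value, k):
    """Single-observable conditional probability p(obs = value | lambda_k)."""
    return p_plus[obs][k] if value == +1 else 1.0 - p_plus[obs][k]


def quadrivariate(p_lambda, p_plus):
    """Discrete analogue of the product construction: for each quadruple, sum over
    lambda of p(lambda) p(a1|lambda) p(a2|lambda) p(b1|lambda) p(b2|lambda)."""
    joint = {}
    for values in itertools.product(OUTCOMES, repeat=4):
        total = 0.0
        for k, weight in enumerate(p_lambda):
            term = weight
            for obs, value in zip(OBSERVABLES, values):
                term *= p_cond(p_plus, obs, value, k)
            total += term
        joint[values] = total
    return joint


def correlation(joint, i, j):
    """Expectation of the product of outcomes i and j under the joint distribution."""
    return sum(v[i] * v[j] * p for v, p in joint.items())


def chsh(joint):
    """CHSH combination for the pairs (A1,A2), (A1,B2), (B1,A2), (B1,B2)."""
    a1, a2, b1, b2 = 0, 1, 2, 3
    return (correlation(joint, a1, a2) + correlation(joint, a1, b2)
            + correlation(joint, b1, a2) - correlation(joint, b1, b2))


rng = random.Random(0)
p_lambda, p_plus = random_local_model(n_lambda=5, rng=rng)
joint = quadrivariate(p_lambda, p_plus)
print("normalisation:", sum(joint.values()))
print("CHSH value:", chsh(joint))
```

Whatever random model is drawn, the CHSH value stays within [-2, 2], because it is an average over definite quadruples; the bivariate marginals of `joint` coincide with the factorized bivariate probabilities of the local model.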

The question is whether this reasoning is sufficient to conclude that no local hidden-variables theory can reproduce quantum mechanics. Such a conclusion would only be justified if locality were the only assumption in deriving Bell's inequality. If there were any additional assumption in this derivation, then violation of Bell's inequality could possibly be blamed on the invalidity of this additional assumption rather than on locality. Evidently, one such additional assumption is the existence of hidden variables. A belief in the completeness of the quantum mechanical formalism would indeed be a sufficient reason to reject this assumption, thus increasing pressure on the locality assumption. Since, however, an empiricist interpretation is hardly reconcilable with such a completeness belief, we have to take hidden-variables theories seriously, and look for the possibility of additional assumptions within such theories.

In expression (41) one such assumption is evident, viz. the existence of the conditional probability $p(a_1|\lambda)$. The assumption of the applicability of this quantity in a quantum mechanical measurement is far less innocuous than appears at first sight. If quantum mechanical measurements really can be modeled by equality (41), this implies that a quantum mechanical measurement result is determined, either in a stochastic or in a deterministic sense, by an instantaneous value $\lambda$ of the hidden variable, prepared independently of the measurement to be performed later. It is questionable whether this is a realistic assumption, in particular if hidden variables have the character of rapidly fluctuating stochastic variables. As a matter of fact, every individual quantum mechanical measurement takes a certain amount of time, and it will in general be virtually impossible to determine the precise instant to be taken as the initial time of the measurement, as well as the precise value of the stochastic variable at that moment. Hence, hidden-variables theories of the kind considered here may be too specific.

Because of the assumption of a non-contextual preparation of the hidden variable, such theories were called quasi-objectivistic stochastic hidden-variables theories in de Muynck and van Stekelenborg [17] (dependence of the conditional probabilities $p(a_1|\lambda)$ on the measurement procedure preventing complete objectivity of the theory). In the past, attention has mainly been restricted to quasi-objectivistic hidden-variables theories. It is questionable, however, whether the assumption of quasi-objectivity is a possible one for hidden-variables theories purporting to reproduce quantum mechanical measurement results. The existence of the quadrivariate probability distribution (42) only excludes quasi-objectivistic local hidden-variables theories (either stochastic or deterministic) from the possibility of reproducing quantum mechanics. As will be seen in the next section, it is far more reasonable to blame quasi-objectivity than locality for this, thus leaving open the possibility of local hidden-variables theories that are not quasi-objectivistic.


5 Analogy between thermodynamics and quantum mechanics

The essential feature of expression (41) is the possibility to attribute, either in a stochastic or in a deterministic way, measurement result $a_1$ to an instantaneous value of the hidden variable $\lambda$. The question is whether this is a reasonable assumption within the domain of quantum mechanical measurement. Are the conditional probabilities $p(a_1|\lambda)$ experimentally relevant within this domain? In order to give a tentative answer to this question we shall exploit the analogy between thermodynamics and quantum mechanics, considered already a long time ago by many authors (e.g. de Broglie [18], Bohm et al. [19,20], Nelson [21,22]).

Quantum mechanics ($A_1, A_2, B_1, B_2$) → Hidden-variables theory ($\lambda$)
↕ ↕
Thermodynamics ($p, T, S$) → Classical statistical mechanics ($\{q_n, p_n\}$)

In this analogy thermodynamics and quantum mechanics are considered as phenomenological theories, to be reduced to more fundamental microscopic theories. The reduction of thermodynamics to classical statistical mechanics is thought to be analogous to a possible reduction of quantum mechanics to stochastic hidden-variables theory. Due to certain restrictions imposed on preparations and measurements within the domains of the phenomenological theories, their domains of application are thought to be contained in, but smaller than, the domains of the microscopic theories.

In order to assess the nature and the importance of such restrictions, let us first look at thermodynamics. As is well known (e.g. Hollinger and Zenzen [23]), thermodynamics is valid only under a condition of molecular chaos, assuring the existence of local equilibrium necessary for the ergodic hypothesis to be satisfied. Thermodynamics only describes measurements of quantities (like pressure, temperature, and entropy) that are defined for such equilibrium states. From an operational point of view this implies that measurements within the domain of thermodynamics do not yield information on the object system valid for one particular instant of time; it is time-averaged information, time averaging being replaced, under the ergodic hypothesis, by ensemble averaging. In the Gibbs theory this ensemble is represented by the canonical density function $Z^{-1}e^{-H(\{q_n,p_n\})/kT}$ on phase space. This state is called a macrostate, to be distinguished from the microstate $\{q_n,p_n\}$, representing the point in phase space the classical object is in at a certain instant of time.

The restricted validity of thermodynamics is manifest in a two-fold way: i) through the restriction of all possible density functions on phase space to the canonical ones; ii) through the restriction of thermodynamical quantities (observables) to functionals on the space of thermodynamic states. Physically, this can be interpreted as a restriction of the domain of application of thermodynamics to those measurement procedures probing only properties of the macrostates. This implies that such measurements only yield information that is averaged over times exceeding the relaxation time needed to reach a state of (local) equilibrium. Thus, it is important to note that thermodynamic quantities are quite different from the physical quantities of classical statistical mechanics, the latter ones being represented by functions of the microstate $\{q_n,p_n\}$, and hence referring to a particular instant of time (see footnote b). Only if it were possible to perform measurements faster than the relaxation time would it be necessary to consider such non-thermodynamic quantities. Such measurements then are outside the domain of application of thermodynamics. Thus, if we have a cubic container containing a volume of gas in a microstate initially concentrated at its center, and if we could measure at a single instant of time either the total kinetic energy or the force exerted on the boundary of the container, then these results would not be equal to thermodynamic temperature and pressure (see footnote c), respectively, because this microstate is not an equilibrium state. Only after the gas has reached equilibrium within the volume defined by the container does (equilibrium) thermodynamics become applicable.

Footnote a: In equilibrium thermodynamics, equilibrium is assumed to be even global.

Within the domain of application of thermodynamics the microstate of the system may change appreciably without the macrostate being affected. Indeed, a macrostate is equivalent to an (ergodic) trajectory $\{q_n(t), p_n(t)\}_{\text{ergodic}}$. We might exploit the difference between micro- and macrostates for characterizing objectivity of a physical theory as follows. Whereas the microstate is thought to yield an objective description of the (microscopic) object, the macrostate just describes certain phenomena to be attributed to the object system only while being observed under conditions valid within the domain of application of the theory. In this sense classical mechanics is an objective theory, all quantities being instantaneous properties of the microstate. Thermodynamic quantities, only being attributable to the macrostate (i.e. to an ergodic trajectory), cannot be seen, however, as properties belonging to the object at a certain instant of time. Of course, we might attribute the thermodynamic quantity to the event in space-time represented by the trajectory, but it should be realized that this event is not determined solely by the preparation of the microstate, but is determined as well by the macroscopic arrangement serving to define the macrostate.

Footnote b: Note that a definition of an instantaneous temperature by means of the equality $\frac{3}{2} n k T = \sum_i p_i^2/2m_i$ does not make sense, as can easily be seen by applying this definition to an ideal gas in a container freely falling in a gravitational field.

Footnote c: Thermodynamic pressure is defined for the canonical ensemble by $p = kT\,\frac{\partial}{\partial V} \log Z$.

Figure 4: Incompatible thermodynamic arrangements.

In order to illustrate this, consider two identical cubic containers differing only in their orientations (cf. figure 4). In principle, the same microstate may be prepared in the two containers. Because of the different orientations, however, the macrostates evolving from this microstate during the time the gas is reaching equilibrium with the container are different (for different orientations of the container we have $H_1 \neq H_2$, and hence $e^{-H_1/kT}/Z_1 \neq e^{-H_2/kT}/Z_2$, since $H = T + V$ and $V_1 \neq V_2$ because potential energy is infinite outside a container). This implies that thermodynamic macrostates may be different even though starting from the same microstate. Macrostates in thermodynamics have a contextual meaning. It is important to note that, since the container is part of the preparing apparatus, this contextuality is connected here to preparation rather than to measurement. Consequently, whereas classical quantities $f(\{q_n,p_n\})$ can be interpreted as objective properties, thermodynamic quantities are non-objective, the non-objectivity being of a contextual nature.

Let us now suppose that quantum mechanics is related to hidden-variables theory analogously to the way thermodynamics is related to classical mechanics, the analogy perhaps being even closer for non-equilibrium thermodynamics (only local equilibrium being assumed) than for the thermodynamics of global equilibrium processes. Support for this idea was found in de Muynck and van Stekelenborg [17], where it was demonstrated that in the Husimi representation of quantum mechanics by means of non-negative probability distribution functions on phase space, an analogous restriction to a canonical set of distributions obtains as in thermodynamics. In particular, it was demonstrated that the dispersion-free states $\rho(q,p) = \delta(q - q_0)\,\delta(p - p_0)$ are not canonical in this sense. This implies that within the domain of quantum mechanics it does not make sense to consider the preparation of the object in a microstate with a well-defined value of the hidden variables $(q,p)$.

In the analogy, quantum mechanical observables like $A_1, A_2, B_1, B_2$ should be compared to thermodynamic quantities like pressure, temperature, and entropy. The central issue in the analogy is the fact that thermodynamic quantities like pressure and temperature cannot be conditioned on the instantaneous phase space variable $\{q_n,p_n\}$ (microstate). Expressions like $p(\{q_n,p_n\})$ and $T(\{q_n,p_n\})$ are meaningless within thermodynamics. Thermodynamic quantities are conditioned on macrostates, corresponding to ergodic paths in phase space. Analogously, a quantum mechanical observable might not correspond to an instantaneous property of the object, but might have to be associated with an (ergodic) path in hidden-variables space $\Lambda$ (macrostate) rather than with an instantaneous value $\lambda$ (microstate).

On the basis of the analogy between thermodynamics and quantum mechanics it is possible to state the following conjectures:

• Quantum mechanical measurements (analogous to thermodynamic measurements) do not probe microstates but macrostates.

• Quantum mechanical quantities (analogous to thermodynamic quantities) should be conditioned on macrostates.

A hidden-variables macrostate will be symbolically indicated by $\lambda^t$. For quantum mechanical measurements the conditional probabilities $p(a_1|\lambda)$ of (41) should then be replaced by $p(a_1|\lambda^t)$. Concomitantly, quantum mechanical probabilities should be represented in the hidden-variables theory by a functional integral

$$p(a_1) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t) \qquad (43)$$

in which the integration is over all possible macrostates consistent with the preparation procedure

By itself, conditioning of quantum mechanical observables on macrostates rather than microstates is not sufficient to prevent derivation of Bell's inequality. As a matter of fact, on the basis of expression (43) a quadrivariate joint probability distribution can be defined, analogous to (42), according to

$$p(a_1,a_2,b_1,b_2) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t)\,p(a_2|\lambda^t)\,p(b_1|\lambda^t)\,p(b_2|\lambda^t) \qquad (44)$$

from which Bell's inequality can be derived just as well. There is, however, one important aspect that up till now has not sufficiently been taken into account, viz. contextuality. In the construction of (44) it is assumed that the macrostate $\lambda^t$ is applicable in each of the measurement arrangements of observables $A_1$, $A_2$, $B_1$, and $B_2$. Because of the incompatibility of some of these observables this is an implausible assumption. On the basis of the thermodynamic analogy it is to be expected that macrostates $\lambda^t$ will depend on the measurement context of a specific observable. Since $[A_1, B_1]_- \neq O$, we will have

$$\lambda^t_{A_1} \neq \lambda^t_{B_1} \qquad (45)$$

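and analogously for $A_2$ and $B_2$. Then for the Bell experiments measuring the pairs $(A_1, A_2)$ and $(A_1, B_2)$, respectively, we have

$$p(a_1,a_2) = \int d\lambda^t_{A_1A_2}\, p(\lambda^t_{A_1A_2})\, p(a_1|\lambda^t_{A_1A_2})\, p(a_2|\lambda^t_{A_1A_2}) \qquad (46)$$

$$p(a_1,b_2) = \int d\lambda^t_{A_1B_2}\, p(\lambda^t_{A_1B_2})\, p(a_1|\lambda^t_{A_1B_2})\, p(b_2|\lambda^t_{A_1B_2}) \qquad (47)$$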

Now the contextuality expressed by inequality (45) prevents the construction of a quadrivariate joint probability distribution analogous to (44). Hence, as in the quantum mechanical approach, also in the local non-objectivistic hidden-variables theory a derivation of Bell's inequality is prevented, due to the local contextuality involved in the interaction of the particle and the measuring instrument it is directly interacting with.
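What is at stake when no common quadruple is available can be made concrete with a short numerical sketch. It is not part of the argument above and rests on two assumptions that are not taken from the text: the textbook spin-1/2 singlet correlation $E(\theta_1,\theta_2) = -\cos(\theta_1-\theta_2)$ stands in for the quantum mechanical prediction, and the four analyser angles are hypothetical choices. The sketch first enumerates all definite quadruples of ±1 outcomes, for which the CHSH combination is always ±2, and then evaluates the same combination from four pairwise correlations that are each computed in their own measurement context.

```python
import itertools
import math


def chsh(e_a1a2, e_a1b2, e_b1a2, e_b1b2):
    """CHSH combination of the four pairwise correlations."""
    return e_a1a2 + e_a1b2 + e_b1a2 - e_b1b2


# Every definite quadruple of +/-1 outcomes gives exactly +2 or -2, so any
# probability mixture of quadruples is confined to the interval [-2, 2].
quadruple_values = {
    chsh(a1 * a2, a1 * b2, b1 * a2, b1 * b2)
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4)
}


def singlet_correlation(theta_1, theta_2):
    """Assumed textbook spin-1/2 singlet correlation E = -cos(theta_1 - theta_2)."""
    return -math.cos(theta_1 - theta_2)


# Hypothetical analyser angles (radians): A1, B1 for particle 1; A2, B2 for particle 2.
A1, B1, A2, B2 = 0.0, math.pi / 2, math.pi / 4, -math.pi / 4

quantum_value = chsh(
    singlet_correlation(A1, A2),
    singlet_correlation(A1, B2),
    singlet_correlation(B1, A2),
    singlet_correlation(B1, B2),
)
print(sorted(quadruple_values), quantum_value)
```

With these angles the magnitude of the pairwise-context value is $2\sqrt{2}$, which no single quadrivariate distribution can reproduce; this is compatible with the text only because each pair of observables is assigned its own macrostate.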

6 Conclusions

Our conclusion is that if quantum mechanical measurements do probe macrostates $\lambda^t$ rather than microstates $\lambda$, then Bell's inequality cannot be derived for quantum mechanical measurements. Both in quantum mechanics and in hidden-variables theories, Bell's inequality is a consequence of the assumption that the theory yields an objective description of reality, in the sense that the preparation of the microscopic object, as far as relevant to the realization of the measurement result, can be thought to be independent of the measurement arrangement. The important point to be noticed is that, although in Bell experiments the preparation of the particle pair at the source (i.e. the microstate) can be considered to be independent of the measurement procedures to be carried out later (and hence one and the same microstate can be assumed in different Bell experiments), the measurement result is only determined by the macrostate, which is co-determined by the interaction with the measuring instruments. It really seems that the Copenhagen maxim of the impossibility of attributing quantum mechanical measurement results to the object as objective properties, possessed independently of the measurement, should be taken very seriously, and implemented also in hidden-variables theories purporting to reproduce the quantum mechanical results. The quantum mechanical die is only cast after the object has been interacting with the measuring instrument, even though its result can be deterministically determined by the (sub-quantum mechanical) microstate.

The thermodynamic analogy suggests which experiments could be done in order to transcend the boundaries of the domain of application of quantum mechanics. If it were possible to perform experiments that probe the microstate $\lambda$ rather than the macrostate $\lambda^t$, then we would be in the domain of (quasi-)objectivistic hidden-variables theories. Because of (42) it is then to be expected that Bell's inequality should be satisfied for such experiments. In such experiments preparation and measurement must be completed well within the relaxation time of the microstates. Such times have been estimated by Bohm [24], for the sake of illustration, as the time light needs to cover a distance of the order of the size of an atom ($10^{-18}$ s, say). If this is correct, then all present-day experimentation is well within the range of quantum mechanics, thus explaining the seemingly universal applicability of this latter theory. In hindsight this would explain why Aspect's switching experiment corroborates quantum mechanics: the applied switching frequency (50 MHz), although sufficient to warrant locality, has been far too low to beat the local relaxation processes in each of the measuring instruments separately.

It has often been felt that the most surprising feature of Bell experiments is the possibility (in certain states) of a strict correlation between the measurement results of the two measured observables, without being able to attribute this to a previous preparation of the object (no "elements of physical reality"). For many physicists the existence of such strict correlations has been reason enough to doubt Bohr's Copenhagen solution to renounce causal explanation of measurement results, and to replace determinism by complementarity. It seems that the urge for causal reasoning has been so strong that even within the Copenhagen interpretation a certain causality has been accepted, even a non-local one, in an EPR experiment (cf. figure 1) determining a measurement result for particle 2 by the measurement of particle 1. This, however, should rather be seen as an internal inconsistency of this interpretation, caused by a tendency to make the Copenhagen interpretation as realist as possible. In a consistent application of the Copenhagen interpretation to Bell experiments such experiments could be interpreted as measurements of bivariate correlation observables. The certainty of obtaining a certain (bivariate) eigenvalue of such an observable would not be more surprising than the certainty of obtaining a certain eigenvalue of a univariate one if the state vector is the corresponding eigenvector.

It is important to note that this latter interpretation of Bell experiments takes seriously the Copenhagen idea that quantum mechanics need not explain the specific measurement result found in an individual measurement. Indeed, in order to compare theory and experiment it would be sufficient that quantum mechanics just describe the relative frequencies found in such measurements. In this view quantum mechanics is just a phenomenological theory, describing (not explaining) observations in a way analogous to thermodynamics in its own domain of application. Explanations should be provided by more fundamental theories, describing the mechanisms behind the observable phenomena. Hence, the Copenhagen completeness thesis should be rejected (although this need not imply a return to determinism).

This approach has important consequences. One consequence is that the non-existence, within quantum mechanics, of elements of physical reality does not imply that elements of physical reality do not exist at all. They could be elements of the more fundamental theories. In section 5 it was discussed how an analogy between quantum mechanics and thermodynamics could be exploited to spell this out. Elements of physical reality could correspond to hidden-variables microstates $\lambda$. The determinism necessary to explain the strict correlations referred to above would be explained if, within a given measurement context, a microstate defined a unique macrostate $\lambda^t$. This demonstrates how it could be possible that quantum mechanical measurement results cannot be attributed to the object as properties possessed prior to measurement, and that there yet is sufficient determinism to yield a local explanation of strict correlations of quantum mechanical measurement results in certain Bell experiments.

Another important aspect of a dissociation of phenomenological and fundamental aspects of measurement is the possibility of an empiricist interpretation of quantum mechanics. As demonstrated by the generalized Aspect experiment discussed in section 3, an empiricist approach needs a generalization of the mathematical formalism of quantum mechanics, in which an observable is represented by a POVM rather than by a projection-valued measure corresponding to a self-adjoint operator of the standard formalism. Such a generalization has been very important in assessing the meaning of Bell's inequality. In the major part of the literature of the past this subject has been dealt with on the basis of the (restricted) standard formalism. However, some conclusions drawn from the restricted formalism are not cogent when viewed in the generalized one (for instance, because von Neumann's projection postulate is not applicable in general). For this reason we must be very careful when accepting conclusions drawn from the standard formalism. This in particular holds true for the issue of non-locality.


References

1. W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925).
2. E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, 1983), p. 152).
3. W.M. de Muynck, Synthese 102, 293 (1995).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981).
6. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
7. K.R. Popper, Quantum theory and the schism in physics (Rowman and Littlefield, Totowa, 1982).
8. M. Jammer, The philosophy of quantum mechanics (Wiley, New York, 1974).
9. N. Bohr, Phys. Rev. 48, 696 (1935).
10. J.S. Bell, Physics 1, 195 (1964).
11. H.R. Stapp, Phys. Rev. D 3, 1303 (1971); Il Nuovo Cim. 29B, 270 (1975).
12. A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982).
13. P. Rastall, Found. of Phys. 13, 555 (1983).
14. W.M. de Muynck, Phys. Lett. A 114, 65 (1986).
15. W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994).
16. W.M. de Muynck, Found. of Phys. 30, 205 (2000).
17. W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988).
18. L. de Broglie, La thermodynamique de la particule isolée (Gauthier-Villars, 1964); L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes (Springer-Verlag, 1995).
19. D. Bohm, Phys. Rev. 89, 458 (1953).
20. D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954).
21. E. Nelson, Dynamical theories of Brownian motion (Princeton University Press, 1967).
22. E. Nelson, Quantum fluctuations (Princeton University Press, 1985).
23. H.B. Hollinger and M.J. Zenzen, The Nature of Irreversibility (D. Reidel Publishing Company, Dordrecht, 1985), sect. 4.4.
24. D. Bohm, Phys. Rev. 85, 166, 180 (1952).

DISCRETE HESSIANS IN STUDY OF QUANTUM STATISTICAL SYSTEMS: COMPLEX GINIBRE ENSEMBLE

M M DURAS

Institute of Physics, Cracow University of Technology, ulica Podchorazych 1, PL-30084 Cracow, Poland

E-mail: mduras@riad.usk.pk.edu.pl

The Ginibre ensemble of nonhermitean random Hamiltonian matrices $K$ is considered. Each quantum system described by $K$ is a dissipative system, and the eigenenergies $Z_i$ of the Hamiltonian are complex-valued random variables. The second difference of complex eigenenergies is viewed as a discrete analog of the Hessian with respect to the labelling index. The results are considered in view of Wigner and Dyson's electrostatic analogy. An extension of the space of dynamics of random magnitudes is performed by introduction of the discrete space of labelling indices.

1 Introduction

Random Matrix Theory (RMT) studies quantum Hamiltonian operators $H$ which are random matrix variables. Their matrix elements $H_{ij}$ are independent random scalar variables [1-8]. Among others, the following Gaussian Random Matrix ensembles (GRME) have been studied: orthogonal (GOE), unitary (GUE), and symplectic (GSE), as well as the circular ensembles: orthogonal (COE), unitary (CUE), and symplectic (CSE). The choice of ensemble is based on the quantum symmetries ascribed to the Hamiltonian $H$. The Hamiltonian $H$ acts on the quantum space $V$ of eigenfunctions. It is assumed that $V$ is an $N$-dimensional Hilbert space $V = F^N$, where the real, complex, or quaternion field $F = \mathbf{R}, \mathbf{C}, \mathbf{H}$ corresponds to GOE, GUE, or GSE, respectively. If the Hamiltonian matrix $H$ is hermitean, $H = H^\dagger$, then the probability density function of $H$ reads

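$$f_H(H) = C_{H\beta} \exp\left[-\beta \cdot \frac{1}{2} \cdot \mathrm{Tr}(H^2)\right] \qquad (1)$$

$$C_{H\beta} = \left(\frac{\beta}{2\pi}\right)^{N_{H\beta}/2}$$

$$N_{H\beta} = N + \frac{1}{2} N(N-1)\beta$$

$$\int f_H(H)\, dH = 1$$

$$dH = \prod_{i=1}^{N} \prod_{j \geq i}^{N} \prod_{\gamma=0}^{D-1} dH_{ij}^{(\gamma)}$$

$$H_{ij} = (H_{ij}^{(0)}, \dots, H_{ij}^{(D-1)}) \in F$$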

where the parameter $\beta$ assumes the values $\beta = 1, 2, 4$ for GOE($N$), GUE($N$), GSE($N$), respectively, and $N_{H\beta}$ is the number of independent matrix elements of the hermitean Hamiltonian $H$. The Hamiltonian $H$ belongs to the Lie group of hermitean $N \times N$ matrices, and the matrix Haar measure $dH$ is invariant under transformations from the unitary group U($N$, $F$). The eigenenergies $E_i$, $i = 1, \dots, N$, of $H$ are real-valued random variables, $E_i = E_i^*$. It was Eugene Wigner who first dealt with the eigenenergy level repulsion phenomenon while studying nuclear spectra [1-3]. RMT is now applicable in many branches of physics: nuclear physics (slow neutron resonances, highly excited complex nuclei), condensed phase physics (fine metallic particles, random Ising model [spin glasses]), quantum chaos (quantum billiards, quantum dots), disordered mesoscopic systems (transport phenomena), quantum chromodynamics, quantum gravity, and field theory.
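The hermitean case can be illustrated with the smallest non-trivial example. The sketch below is only an illustration under stated assumptions: it takes $N = 2$ and $\beta = 2$, so that the density is taken to be proportional to $\exp(-\mathrm{Tr}\,H^2)$, and it uses the closed-form eigenvalues of a 2×2 hermitean matrix; the function names and the sample size are hypothetical. It draws matrices from this density, confirms that the eigenenergies are real and ordered, and counts how rarely the two levels come close to each other (level repulsion).

```python
import math
import random


def sample_gue_2x2(rng):
    """Draw a 2x2 hermitean H with density proportional to exp(-Tr H^2) (beta = 2)."""
    # Tr H^2 = H11^2 + H22^2 + 2 (Re H12)^2 + 2 (Im H12)^2, so the diagonal
    # entries have variance 1/2 and each off-diagonal component has variance 1/4.
    h11 = rng.gauss(0.0, math.sqrt(0.5))
    h22 = rng.gauss(0.0, math.sqrt(0.5))
    h12 = complex(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5))
    return h11, h22, h12


def eigenenergies(h11, h22, h12):
    """Closed-form eigenvalues of [[h11, h12], [conj(h12), h22]] in ascending order."""
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), abs(h12))
    return mean - radius, mean + radius


rng = random.Random(1)
spacings = []
for _ in range(2000):
    e1, e2 = eigenenergies(*sample_gue_2x2(rng))
    spacings.append(e2 - e1)

small = sum(1 for s in spacings if s < 0.1) / len(spacings)
print("fraction of spacings below 0.1:", small)
```

Both eigenvalues are real by construction, and the fraction of very small spacings is tiny, in line with the level repulsion mentioned above.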

2 The Ginibre ensembles

Jean Ginibre considered another example of GRME, dropping the assumption of hermiticity of Hamiltonians and thus defining a generic $F$-valued Hamiltonian $K$ [1,2,9,10]. Hence, $K$ belongs to the general linear Lie group GL($N$, $F$), and the matrix Haar measure $dK$ is invariant under transformations from that group. The distribution of $K$ is given by

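$$f_K(K) = C_{K\beta} \exp\left[-\beta \cdot \frac{1}{2} \cdot \mathrm{Tr}(K^\dagger K)\right] \qquad (2)$$

$$N_{K\beta} = N^2 \beta$$

$$\int f_K(K)\, dK = 1$$

$$dK = \prod_{i=1}^{N} \prod_{j=1}^{N} \prod_{\gamma=0}^{D-1} dK_{ij}^{(\gamma)}$$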

where $\beta = 1, 2, 4$ stands for the real, complex, and quaternion Ginibre ensembles, respectively. Therefore, the eigenenergies $Z_i$ of a quantum system ascribed to the Ginibre ensemble are complex-valued random variables. The eigenenergies $Z_i$, $i = 1, \dots, N$, of the nonhermitean Hamiltonian $K$ are not real-valued random variables, $Z_i \neq Z_i^*$. Jean Ginibre postulated the following joint probability density function of the random vector of complex eigenvalues $Z_1, \dots, Z_N$ for $N \times N$ Hamiltonian matrices $K$ for $\beta = 2$ [1,2,9,10]:

$$P(z_1, \dots, z_N) = \prod_{j=1}^{N} \frac{1}{\pi \cdot j!} \cdot \prod_{i<j}^{N} |z_i - z_j|^2 \cdot \exp\left(-\sum_{j=1}^{N} |z_j|^2\right) \qquad (3)$$

where $z_i$ are complex-valued sample points ($z_i \in \mathbf{C}$).

We emphasize here Wigner and Dyson's electrostatic analogy. A Coulomb gas of $N$ unit charges moving on the complex plane (Gauss's plane) $\mathbf{C}$ is considered. The vectors of positions of the charges are $z_i$, and the potential energy of the system is

$$U(z_1, \dots, z_N) = -\sum_{i<j} \ln|z_i - z_j| + \frac{1}{2} \sum_i |z_i|^2 \qquad (4)$$
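The link between this potential energy and the eigenvalue density of Eq. (3) can be checked numerically. The sketch below is a minimal illustration, assuming the energy has the form $U = -\sum_{i<j}\ln|z_i - z_j| + \frac{1}{2}\sum_i |z_i|^2$ and an inverse temperature of 2; the point positions and function names are arbitrary illustrative choices. It compares the Boltzmann factor $e^{-2U}$ with the unnormalised factor $\prod_{i<j}|z_i - z_j|^2 \exp(-\sum_j |z_j|^2)$.

```python
import math


def coulomb_energy(z):
    """U = -sum_{i<j} ln|z_i - z_j| + (1/2) sum_i |z_i|^2 for distinct points z."""
    n = len(z)
    pair_term = sum(math.log(abs(z[i] - z[j]))
                    for i in range(n) for j in range(i + 1, n))
    confinement = 0.5 * sum(abs(zi) ** 2 for zi in z)
    return -pair_term + confinement


def ginibre_weight(z):
    """Unnormalised factor prod_{i<j} |z_i - z_j|^2 * exp(-sum_j |z_j|^2)."""
    n = len(z)
    vandermonde = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            vandermonde *= abs(z[i] - z[j]) ** 2
    return vandermonde * math.exp(-sum(abs(zi) ** 2 for zi in z))


BETA = 2.0  # assumed inverse temperature of the analogy
points = [0.3 + 0.1j, -0.5 + 0.4j, 0.2 - 0.7j]  # arbitrary illustrative positions
boltzmann = math.exp(-BETA * coulomb_energy(points))
print(boltzmann, ginibre_weight(points))
```

The two printed numbers agree up to rounding, so the two expressions differ only by the normalisation constant $\prod_j (\pi \cdot j!)^{-1}$.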

If the gas is in thermodynamical equilibrium at temperature $T = \frac{1}{2k_B}$ ($\beta = \frac{1}{k_B T} = 2$, where $k_B$ is Boltzmann's constant), then the probability density function of the vectors of positions is $P(z_1, \dots, z_N)$, Eq. (3). Therefore, the complex eigenenergies $Z_i$ of the quantum system are analogous to the vectors of positions of the charges of the Coulomb gas. Moreover, the complex-valued spacings $\Delta^1 Z_i$ of the complex eigenenergies of the quantum system,

$$\Delta^1 Z_i = Z_{i+1} - Z_i, \quad i = 1, \dots, (N-1) \qquad (5)$$

are analogous to vectors of relative positions of electric charges. Finally, the complex-valued second differences $\Delta^2 Z_i$ of complex eigenenergies,

$$\Delta^2 Z_i = Z_{i+2} - 2Z_{i+1} + Z_i, \quad i = 1, \dots, (N-2) \qquad (6)$$

are analogous to vectors of relative positions of vectors of relative positions of electric charges

The eigenenergies $Z_i = Z(i)$ can be treated as values of a function $Z$ of the discrete parameter $i = 1, \dots, N$. The Jacobian of $Z_i$ reads

$$\mathrm{Jac}\,Z_i = \frac{\partial Z_i}{\partial i} \simeq \frac{\Delta^1 Z_i}{\Delta^1 i} = \Delta^1 Z_i \qquad (7)$$

We readily have that the spacing is a discrete analog of the Jacobian, since the indexing parameter $i$ belongs to the discrete space of indices, $i \in I = \{1, \dots, N\}$. Therefore the first derivative with respect to $i$ reduces to the first differential quotient. The Hessian is a Jacobian applied to the Jacobian. We immediately have the formula for the discrete Hessian of the eigenenergies $Z_i$:

$$\mathrm{Hess}\,Z_i = \frac{\partial^2 Z_i}{\partial i^2} \simeq \frac{\Delta^2 Z_i}{(\Delta^1 i)^2} = \Delta^2 Z_i \qquad (8)$$

Thus the second difference of $Z$ is a discrete analog of the Hessian of $Z$. One emphasizes that both the Jacobian and the Hessian work on the discrete space of indices $i$. The finite differences of order higher than two are discrete analogs of compositions of Jacobians with Hessians of $Z$.
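The statement that the discrete Hessian is the discrete Jacobian applied twice can be verified directly. The following sketch assumes the forward-difference definitions $Z_{i+1} - Z_i$ and $Z_{i+2} - 2Z_{i+1} + Z_i$; the list `levels` holds made-up complex numbers that merely stand in for eigenenergies ordered by the labelling index.

```python
def first_differences(z):
    """Spacings Z_{i+1} - Z_i for i = 1, ..., N-1."""
    return [z[i + 1] - z[i] for i in range(len(z) - 1)]


def second_differences(z):
    """Second differences Z_{i+2} - 2 Z_{i+1} + Z_i for i = 1, ..., N-2."""
    return [z[i + 2] - 2 * z[i + 1] + z[i] for i in range(len(z) - 2)]


# Hypothetical complex "eigenenergies" listed in order of the labelling index.
levels = [0.0 + 0.0j, 1.0 + 0.5j, 1.5 + 2.0j, 3.0 + 2.5j]

spacings = first_differences(levels)   # discrete Jacobian
hessians = second_differences(levels)  # discrete Hessian
print(spacings)
print(hessians)
```

Applying `first_differences` to `spacings` gives the same list as `second_differences(levels)`, and a sequence that is linear in the index has vanishing second differences.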

The eigenenergies $E_i$, $i \in I$, of the hermitean Hamiltonian $H$ are real-valued random variables ordered increasingly. They are values of the discrete function $E_i = E(i)$. The first differences of adjacent eigenenergies,

$$\Delta^1 E_i = E_{i+1} - E_i, \quad i = 1, \dots, (N-1) \qquad (9)$$

are analogous to vectors of relative positions of electric charges of a one-dimensional Coulomb gas. Each is simply the spacing of two adjacent energies. The real-valued second differences $\Delta^2 E_i$ of eigenenergies,

$$\Delta^2 E_i = E_{i+2} - 2E_{i+1} + E_i, \quad i = 1, \dots, (N-2) \qquad (10)$$

are analogous to vectors of relative positions of vectors of relative positions of charges of a one-dimensional Coulomb gas. The $\Delta^2 Z_i$ have their real parts $\mathrm{Re}\,\Delta^2 Z_i$ and imaginary parts $\mathrm{Im}\,\Delta^2 Z_i$, as well as radii (moduli) $|\Delta^2 Z_i|$ and main arguments (angles) $\mathrm{Arg}\,\Delta^2 Z_i$. The $\Delta^2 Z_i$ are extensions of the real-valued second differences

$$\Delta^2 E_i = E_{i+2} - 2E_{i+1} + E_i, \quad i = 1, \dots, (N-2) \qquad (11)$$

of adjacent, increasingly ordered, real-valued eigenenergies $E_i$ of the Hamiltonian $H$ defined for GOE, GUE, GSE, and the Poisson ensemble PE (where the Poisson ensemble is composed of uncorrelated randomly distributed eigenenergies) [11-15]. The Jacobian and Hessian operators of the energy function $E(i) = E_i$ for these ensembles read

$$\mathrm{Jac}\,E_i = \frac{\partial E_i}{\partial i} \simeq \frac{\Delta^1 E_i}{\Delta^1 i} = \Delta^1 E_i \qquad (12)$$

and

$$\mathrm{Hess}\,E_i = \frac{\partial^2 E_i}{\partial i^2} \simeq \frac{\Delta^2 E_i}{(\Delta^1 i)^2} = \Delta^2 E_i \qquad (13)$$

The treatment of first and second differences of eigenenergies as discrete analogs of Jacobians and Hessians allows one to consider these eigenenergies as magnitudes with statistical properties studied in the discrete space of indices. The labelling index $i$ of the eigenenergies is an additional variable of motion; hence the space of indices $I$ augments the space of dynamics of random magnitudes.

Acknowledgements

It is my pleasure to most deeply thank Professor Antoni Ostoja-Gajewski for continuous help I also thank Professor Wlodzimierz Wojcik for his giving me access to computer facilities

References

1. F. Haake, Quantum Signatures of Chaos (Springer-Verlag, Berlin Heidelberg New York, 1990), Chapters 1, 3, 4, 8, pp. 1-11, 33-77, 202-213.

2. T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, Phys. Rept. 299, 189-425 (1998).

3. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapters 1, 2, 9, pp. 1-54, 182-193.

4. L. E. Reichl, The Transition to Chaos In Conservative Classical Systems: Quantum Manifestations (Springer-Verlag, New York, 1992), Chapter 6, p. 248.

5. O. Bohigas, in Proceedings of the Les Houches Summer School on Chaos and Quantum Physics (North-Holland, Amsterdam, 1991), p. 89.

6. C. E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).

7. T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey, and S. S. M. Wong, Rev. Mod. Phys. 53, 385 (1981).

8. C. W. J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).

9. J. Ginibre, J. Math. Phys. 6, 440 (1965).

10. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapter 15, pp. 294-310.

11. M. M. Duras and K. Sokalski, Phys. Rev. E 54, 3142 (1996).

12. M. M. Duras, Finite difference and finite element distributions in statistical theory of energy levels in quantum systems (PhD thesis, Jagellonian University, Cracow, 1996).

13. M. M. Duras and K. Sokalski, Physica D125, 260 (1999).

14. M. M. Duras, Description of Quantum Systems by Random Matrix Ensembles of Large Dimensions, in Proceedings of the Sixth International Conference on Squeezed States and Uncertainty Relations, 24 May-29 May 1999, Naples, Italy (NASA, Greenbelt, Maryland, at press, 2000).

15. M. M. Duras, J. Opt. B: Quantum Semiclass. Opt. 2, 287 (2000).


SOME REMARKS ON HARDY FUNCTIONS ASSOCIATED WITH DIRICHLET SERIES

W. EHM

Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstrasse 3a, 79098 Freiburg, Germany. E-mail: ehm@igpp.de

A simple method of associating a Hardy function with a Dirichlet series is described and applied to some examples connected with the Riemann zeta function. The theory of Hardy functions is then used to derive integral tests of the Riemann hypothesis, generalizing a recent result of Balazard, Saias and Yor [1].

1 Introduction

The most famous example of a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ converging absolutely in the half plane $\Re z > 1$ is the Riemann zeta function $\zeta(z)$, which has all coefficients $a_n = 1$. It has a simple pole at $z = 1$ and can be extended as a meromorphic function with no other singularities to the whole complex plane [6].

A simple method of associating a Hardy function with a Dirichlet series of that kind consists in multiplying $f(z)$ by $(z-1)/z^2$: the factor $(z-1)$ removes the pole at $z = 1$, and the division by $z^2$ achieves square integrability along vertical lines. Moreover, the zeros of $f(z)$ remain unchanged by this modification. The motivation for passing from $f(z)$ to $f(z)(z-1)/z^2$ is to utilize the theory of Hardy functions, especially factorization of Hardy functions, for the study of the zeta function.

In section 2 of this note we give conditions under which the function $f(z)(z-1)/z^2$ has an analytic continuation as a Hardy function beyond the abscissa of convergence of the Dirichlet series $f(z)$. The criterion is tested on three examples, all related to the Riemann zeta function. Factorization of the Hardy function $\zeta(z)(z-1)/z^2$, which is briefly discussed in section 3, is used in section 4 to derive some integral tests of the Riemann hypothesis. The content of the Riemann hypothesis, hereafter abbreviated RH, is Riemann's yet unproven conjecture that all non-real zeros of the $\zeta$ function lie on the line $\Re z = 1/2$ in the complex plane. It has received increasing interest among physicists since the discovery of striking similarities in the distribution of the zeros of the zeta function and the spectrum of large random matrices [2].

The idea to utilize Hardy functions in connection with the zeta function, including integral tests of the Riemann hypothesis, is not new. See the recent article of Balazard, Saias and Yor [1], who initially work with Hardy functions in the disc, then pass to the half plane $\Re z > 1/2$ by conformal mapping. In our approach, based on the function $\zeta(z)(z-1)/z^2$, which also appears in recent work of Burnol [4], we deal with half plane Hardy functions from the beginning. This leads to somewhat more general results in a natural fashion.

2 Hardyfication of Dirichlet series

The basic result of this section is the following

Theorem. Given a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ with a finite abscissa of convergence, let functions $A$ and $\phi$ be defined by

$$A(x) = \sum_{1 \le n \le x} a_n, \qquad \phi(x) = \sum_{1 \le n \le e^x} a_n\,(1 - x + \log n) \qquad (x \in \mathbb{R}). \tag{1}$$

Suppose that $A(x) = O(x)$ as $x \to \infty$, and let

$$\lambda = \limsup_{N \to \infty} \frac{\log^+ |D_N|}{\log N}, \quad \text{where } D_N = A(N) - \sum_{n=1}^{N-1} \frac{A(n)}{n}. \tag{2}$$

Then the function $f(z)(z-1)/z^2$ can be represented as the Laplace transform of $\phi(x)$ in the half plane $\Re z > \lambda$:

$$f(z)\,\frac{z-1}{z^2} = \int_0^{\infty} e^{-zx}\,\phi(x)\,dx \qquad (\Re z > \lambda). \tag{3}$$

Proof. Fix an integer $N \ge 1$ and let $\log N \le x < \log(N+1)$. Then

$$|\phi(x) - \phi(\log N)| = (x - \log N)\,|A(N)| \le |A(N)|\,\log\frac{N+1}{N} = O(1)$$

as $N \to \infty$, by the assumed growth behavior of $A(x)$. Combining this with

$$\phi(\log(n+1)) - \phi(\log n) = a_{n+1} - A(n)\log\frac{n+1}{n} = a_{n+1} - \frac{A(n)}{n} + O(n^{-1})$$

we get for $N = [e^x] \to \infty$

$$\phi(x) = a_1 + \sum_{n=1}^{N-1}\big[\phi(\log(n+1)) - \phi(\log n)\big] + O(1) = a_1 + \sum_{n=1}^{N-1}\Big[a_{n+1} - \frac{A(n)}{n} + O(n^{-1})\Big] + O(1) = D_N + O(\log N),$$

and thus, for every $\epsilon > 0$, $\phi(x) = O(e^{x(\lambda+\epsilon)})$, $x \to \infty$, by the definition of $\lambda$. Since $\phi$ vanishes on the left half line, it follows that the integral on the right-hand side of (3) converges absolutely in the half plane $\Re z > \lambda$. It remains to show that this Laplace transform coincides with $f(z)(z-1)/z^2$ in the half plane $\Re z > \sigma_a$, where $\sigma_a$ denotes the abscissa of absolute convergence of $f(z)$.

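To that end let us write $\eta(z) = f(z)(z-1)/z^2$ and introduce truncated versions

$$f_N(z) = \sum_{n=1}^{N} a_n n^{-z}, \qquad \eta_N(z) = f_N(z)\,\frac{z-1}{z^2}, \qquad \phi_N(x) = \sum_{1 \le n \le \min(N, e^x)} a_n\,(1 - x + \log n),$$

$N \ge 1$, and set $h_{\sigma,N}(x) = e^{-\sigma x}\phi_N(x)$. Using

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{itx}}{(\sigma+it)^q}\,dt = \begin{cases} \dfrac{x^{q-1}e^{-\sigma x}}{(q-1)!} & \text{if } x > 0,\\[1ex] 0 & \text{if } x < 0 \end{cases}$$

(for every integer $q \ge 1$, $\sigma > 0$) we get for fixed $\sigma > \sigma_a$

$$\begin{aligned}
\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{itx}\,\eta_N(\sigma+it)\,dt
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{itx}\sum_{n=1}^{N} a_n n^{-\sigma-it}\,\frac{\sigma+it-1}{(\sigma+it)^2}\,dt\\
&= \sum_{n=1}^{N} a_n n^{-\sigma}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{it(x-\log n)}\Big[\frac{1}{\sigma+it} - \frac{1}{(\sigma+it)^2}\Big]dt\\
&= \sum_{1 \le n \le \min(N,e^x)} a_n n^{-\sigma} e^{-\sigma(x-\log n)}\big(1-(x-\log n)\big) = h_{\sigma,N}(x)
\end{aligned} \tag{4}$$

almost everywhere in $x \in \mathbb{R}$, the Fourier integrals being understood in the $L^2$ sense. Note that $\eta(z)$ is square integrable along every line $\Re z = \sigma$ with $\sigma > \sigma_a$. Clearly $\eta_N(\sigma+it)$ converges to $\eta(\sigma+it)$ in $L^2(dt)$, so $h_{\sigma,N}$ is a Cauchy sequence in $L^2(dx)$ by Parseval's formula. The pointwise limit $h_\sigma(x)$ of $h_{\sigma,N}(x)$ then also is the $L^2(dx)$ limit, so that by (4) $h_\sigma(x)$ and $\eta(\sigma+it)$ represent a Fourier transform pair for every $\sigma > \sigma_a$. Therefore

$$\eta(\sigma+it) = \hat h_\sigma(t) = \int_0^{\infty} h_\sigma(x)\,e^{-ixt}\,dx = \int_0^{\infty} e^{-(\sigma+it)x}\,\phi(x)\,dx \tag{5}$$

holds almost everywhere in $t$ ($\sigma > \sigma_a$), hence everywhere in $\Re z > \sigma_a$ by continuity. This shows that the Laplace transform of $\phi$ represents the analytic continuation of $\eta$ to the region $\Re z > \lambda$, completing the proof.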
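The representation (3) can be checked numerically in the simplest case $a_n = 1$, $z = 2$, where the left-hand side is $\zeta(2)/4 = \pi^2/24$. The sketch below is only an illustration under stated assumptions: it uses the closed form $\phi(x) = N(1-x) + \log N!$ with $N = [e^x]$, a plain trapezoidal rule, and a cut-off of the integral at $x = 20$; the function names are hypothetical.

```python
import math


def phi(x):
    """phi(x) = sum_{1 <= n <= e^x} (1 - x + log n) for a_n = 1; zero for x < 0."""
    if x < 0:
        return 0.0
    n_max = math.floor(math.exp(x))
    # sum_{n <= N} log n = log N! = lgamma(N + 1)
    return n_max * (1.0 - x) + math.lgamma(n_max + 1)


def laplace_transform(z, upper=20.0, steps=100_000):
    """Trapezoidal approximation of int_0^upper exp(-z*x) * phi(x) dx."""
    h = upper / steps
    total = 0.5 * (phi(0.0) + phi(upper) * math.exp(-z * upper))
    for k in range(1, steps):
        x = k * h
        total += phi(x) * math.exp(-z * x)
    return h * total


if __name__ == "__main__":
    z = 2.0
    numeric = laplace_transform(z)
    exact = (math.pi ** 2 / 6) * (z - 1) / z ** 2  # zeta(2) = pi^2 / 6
    print(f"numeric = {numeric:.5f}, zeta(2)(z-1)/z^2 = {exact:.5f}")
```

Because $\phi$ jumps by $a_n$ at each point $x = \log n$, the trapezoidal rule converges only to first order in the step size here, so agreement to three or four digits is what one should expect from this sketch.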

Let $H^2_\sigma$ denote the Hardy space consisting of all functions $g(z)$ which are analytic for $\Re z > \sigma$ and such that $\sup_{\sigma' > \sigma}\int_{-\infty}^{\infty}|g(\sigma'+it)|^2\,dt < \infty$. The growth behavior of $\phi(x)$ established in the proof implies $h_\sigma \in L^2$ for every $\sigma > \lambda$, so that by (5) and Parseval's formula we obtain the following.

Corollary. Under the conditions of the theorem, the function $f(z)(z-1)/z^2$ belongs to every Hardy space $H^2_\sigma$, $\sigma > \lambda$.

Example 1. Let $a_n = 1$ for all $n$, that is, $f(z) = \zeta(z)$. Then $D_N = 1$, $N \ge 1$, so that $\lambda = 0$. A more careful analysis shows that $\phi(x)$ is nonnegative and grows linearly as $x$ tends to infinity. Consequently $\zeta(z)(z-1)/z^2$ is a member of every Hardy space $H^2_\sigma$, $\sigma > 0$, but not of $H^2_0$. The nonnegativity allows one to associate with $\phi$ an exponential family $\mathcal{P} = \{p_\sigma, \sigma > 0\}$ of probability densities with support $[0, \infty)$ by setting

$$p_\sigma(x) = h_\sigma(x)/\eta(\sigma) = \phi(x)\,e^{-\sigma x}/\eta(\sigma) \qquad (x \in \mathbb{R},\ \sigma > 0). \tag{6}$$

The function $\zeta(z)(z-1)/z^2$ was also considered by Burnol [4] in connection with a closure problem in function space known as the Nyman-Beurling real variable form of the Riemann hypothesis.

It may be interesting to note here that although $h_\sigma$ is square integrable for every $\sigma > 0$, it is not true that $h_{\sigma,N} \to h_\sigma$ in $L^2$ if $\sigma \le 1$. In fact we have

$$\liminf_{N \to \infty} \|h_{\sigma,N} - h_\sigma\|_2 > 0, \qquad 0 < \sigma \le 1. \tag{7}$$

Proof. Note first that for $x \ge \log N \to \infty$

$$\begin{aligned}
\phi(x) - \phi_N(x) &= \sum_{N < n \le e^x}(1 - x + \log n) = (1-x)\big([e^x] - N\big) + \log([e^x]!) - \log N!\\
&= (1-x)\big([e^x]-N\big) + \big([e^x]+\tfrac12\big)\log[e^x] - [e^x] - \big(N+\tfrac12\big)\log N + N + O(1)\\
&= \big(N+\tfrac12\big)\big(\log[e^x] - \log N\big) + \big([e^x]-N\big)\big(\log[e^x] - x\big) + O(1)\\
&= \big(N+\tfrac12\big)(x - \log N) + O(1)
\end{aligned} \tag{8}$$

on using Stirling's formula and the inequalities $0 \le x - \log[e^x] \le 2e^{-x}$ ($x \ge 0$). The estimate (8) shows that there exists a finite constant $B > 0$ such that $\phi(x) - \phi_N(x) \ge N(x - \log N)$ for all large $N$ and $x \ge B + \log N$. Therefore

$$\|h_{\sigma,N} - h_\sigma\|_2^2 \ge \int_{B+\log N}^{\infty}\big(\phi(x) - \phi_N(x)\big)^2 e^{-2\sigma x}\,dx \ge N^2\int_{B+\log N}^{\infty}(x-\log N)^2 e^{-2\sigma x}\,dx = N^{2-2\sigma}\int_B^{\infty} y^2 e^{-2\sigma y}\,dy$$

for all large $N$, and assertion (7) follows.

Example 2. Let $f(z) = \sum_p p^{-z}\log p$, where the sum extends over all prime numbers. This example is related to the logarithmic derivative of the zeta function, as may be seen from the product representation $\zeta(z) = \prod_p (1 - p^{-z})^{-1}$. For $\Re z > 1$,

$$-\frac{\zeta'(z)}{\zeta(z)} = \sum_p \frac{\log p}{p^z - 1} = \sum_p \frac{\log p}{p^z} + \sum_p \frac{\log p}{p^z(p^z-1)},$$

and since the last series converges for $\Re z > 1/2$, it suffices to consider $f(z)$ as far as the analytic continuation of $\zeta'(z)/\zeta(z)$ is concerned.

The series $f(z)$ would have convergence abscissa 1/2, implying the RH, if the associated sequence $D_N$ satisfied condition (2) with $\lambda = 1/2$. For a numerical check we computed $D_N$ for $N$ up to 5 million. A plot of $\log^+|D_N| / \log N$ versus $\log N$ (thinned out to every 200th data point; the general picture is not affected thereby) is shown in Figure 1(a). Within the considered range the observed behavior is well in accordance with a possible value of $\lambda = 1/2$. Notice the obvious connection with the classical criterion saying that the RH is equivalent to the error estimate $\sum_{p \le x}\log p - x = O(x^{1/2+\epsilon})$ ($\forall \epsilon > 0$) in the prime number theorem (Edwards [6], Sect. 5.5). Incidentally, $\phi(x)$ seems to be nonnegative in this case, too, as a plot of $\phi(x)$ for small $x$-values indicates.

Example 3. Let $f(z) = 1/\zeta(z) = \sum_{n=1}^{\infty}\mu(n)\,n^{-z}$, with $\mu$ the Möbius function. It is well-known that the RH is equivalent to the condition $A(N) = \sum_{n \le N}\mu(n) = O(N^{1/2+\epsilon})$ (for every $\epsilon > 0$), that is, to $\lambda = 1/2$. The analogous plot for this case is shown in Figure 1(b), with similar findings.
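The numerical check described in Examples 2 and 3 can be imitated on a small scale. The following sketch is not the code behind Figure 1; it is an assumed reimplementation that takes $D_N = A(N) - \sum_{n<N} A(n)/n$ as in (2), uses a simple sieve for the Möbius function, and evaluates the criterion $\log^+|D_N|/\log N$. All function names are hypothetical.

```python
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not square-free
    return mu


def d_sequence(coeffs):
    """coeffs[n-1] = a_n; return [D_1, ..., D_N], D_N = A(N) - sum_{n<N} A(n)/n."""
    out = []
    partial = 0.0    # A(n)
    weighted = 0.0   # sum_{m<n} A(m)/m
    for n, a in enumerate(coeffs, start=1):
        partial += a
        out.append(partial - weighted)
        weighted += partial / n
    return out


def growth_exponent(d_n, n):
    """The plotted criterion log^+|D_N| / log N, for N >= 2."""
    if d_n == 0:
        return 0.0
    return max(math.log(abs(d_n)), 0.0) / math.log(n)


if __name__ == "__main__":
    limit = 100_000
    d = d_sequence(mobius_sieve(limit)[1:])
    for n in (10, 100, 1000, 10_000, 100_000):
        print(n, round(growth_exponent(d[n - 1], n), 3))
```

For the constant sequence $a_n = 1$ the same routine returns $D_N = 1$ for all $N$, in agreement with Example 1.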

3 Factorization of $\eta$

From now on we shall restrict attention to the case $f = \zeta$. For brevity we write $\eta(z) = \zeta(z)(z-1)/z^2$ throughout the sequel. Recall from the previous section that $\eta$ belongs to every Hardy space $H^2_\sigma$, $\sigma > 0$. Being a Hardy function, $\eta$ admits a useful factorization, some applications of which will be discussed in the next section. The zeros of $\eta$ in the right half plane $\Re z > 0$, which coincide with the non-trivial zeros of the zeta function, are generically denoted by $\rho$. The $\rho$'s are known to lie symmetrically with respect to both the real axis and the critical line $\Re z = 1/2$. That is, whenever $\rho$ is a zero, then so are the mirror images $\bar\rho$, $1-\bar\rho$ and $1-\rho$.

Figure 1: Convergence abscissa of Laplace transform equal to 1/2. Plot of criterion $\log^+|D_N| / \log N$ versus $\log N$ for (a) Example 2, (b) Example 3.

Let $\sigma > 0$ be fixed. According to the factorization theorem for Hardy functions (see e.g. Dym and McKean [5] (ch. 2.7) or Hoffman [8] (p. 132, 133)), $\eta$ can be represented as the product of an outer and an inner function on the half plane $\Re z > \sigma$. More precisely,

$$\eta(z) = H_\sigma(z)\,B_\sigma(z) \qquad (\Re z > \sigma), \tag{9}$$

where the outer function is given by

$$H_\sigma(z) = \exp\left\{\frac{1}{\pi}\int_{-\infty}^{\infty}\log|\eta(\sigma+it)|\;\frac{t(z-\sigma)+i}{t+i(z-\sigma)}\;\frac{dt}{1+t^2}\right\}, \tag{10}$$

and the inner function reduces in the present case to a Blaschke product $B_\sigma$, which is composed of the zeros $\rho$ of $\eta$ with $\Re\rho > \sigma$ and their mirror images after reflection at the line $\Re z = \sigma$, $2\sigma - \bar\rho$. Explicitly

l-p-o D M _ TT z ~ P l 1 ( i i )

These formulae are easily obtained from the familiar ones for the half plane $\Re z > 0$ [8] by shifting both the complex variable and the zeros by $\sigma$. The inner factor simplifies to a Blaschke product for the following reasons: (i) $\eta$ has an analytic continuation across the line $\Re z = \sigma$ to the entire right half plane, so that there is no singular factor; (ii) the constant $c$ appearing in the general factorization formula reduces to unity because $B_\sigma(\sigma) = 1$ and $H_\sigma(\sigma) = \eta(\sigma)$, as is readily verified. For real arguments $z = s$, taking first logarithms, then real parts on both sides of (9), one obtains for $s > \sigma > 0$

$$\log\eta(s) = \frac{1}{\pi}\int_{-\infty}^{\infty}\log|\eta(\sigma+it)|\;\frac{s-\sigma}{(s-\sigma)^2+t^2}\,dt + \sum_{\Re\rho>\sigma}\log\left|\frac{s-\rho}{s-(2\sigma-\bar\rho)}\right|. \tag{12}$$

Note that $\eta(s)$ is positive for $s > 0$, being the Laplace transform of a nonnegative function.

4 Applications

The factorization of $\eta$ gives rise to various tests of the RH. A first example is obtained by setting $\sigma = 1/2$ in (12). The sum on the right-hand side of (12) vanishes if and only if $\zeta(z)$ has no zero within the region $\Re z > 1/2$. Therefore the RH is true if and only if for some (and then for all) $s > 1/2$

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\log|\eta(\tfrac12+it)|\;\frac{s-\tfrac12}{(s-\tfrac12)^2+t^2}\,dt = \log\eta(s). \tag{13}$$

This criterion is equivalent to the condition that $\eta(z)$ be an outer function for the half plane $\Re z > 1/2$, cf. Dym and McKean [5], Sect. 2.7. For $s = 1$ it assumes a particularly neat form. The right-hand side vanishes and the left-hand side can be simplified, and one gets the following criterion for the truth of the RH, due to Balazard, Saias and Yor [1]:

$$\int_{-\infty}^{\infty}\frac{\log|\zeta(\tfrac12+it)|}{\tfrac14+t^2}\,dt = 0. \tag{14}$$

Another example results from the formula

$$(\log\eta)'(\sigma) = \frac{1}{\pi}\int_{-\infty}^{\infty}\log\big[|\eta(\sigma+it)|/\eta(\sigma)\big]\,\frac{dt}{t^2} - 2\sum_{\Re\rho>\sigma}\Re\big[(\rho-\sigma)^{-1}\big] \tag{15}$$

($\sigma > 0$), which can be derived from (12) by subtracting $\log\eta(\sigma)$ on both sides, dividing by $s - \sigma$, and then taking the limit $s \downarrow \sigma$. The interchange of limits and integration (or summation) can be justified by dominated convergence.

Putting $\sigma = 1/2$ in (15), one obtains the following differential version of the integral tests (13), (14). The RH is true if and only if

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\log\left|\frac{\eta(\tfrac12+it)}{\eta(\tfrac12)}\right|\frac{dt}{t^2} = (\log\eta)'(\tfrac12). \tag{16}$$

This statement can be amplified in various ways. First, it is possible to evaluate $(\log\eta)'(\tfrac12)$ explicitly, $(\log\eta)'(\tfrac12) = \tfrac{\gamma}{2} + \tfrac12\log(8\pi) + \tfrac{\pi}{4} - 6$, and for $\sigma = 1/2$ the sum in (15) can be written in a more symmetric form. One thus obtains the relation

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\log\left|\frac{\eta(\tfrac12+it)}{\eta(\tfrac12)}\right|\frac{dt}{t^2} - \Big(\frac{\gamma}{2} + \frac12\log(8\pi) + \frac{\pi}{4} - 6\Big) = \sum_{\rho}\frac{|\Re\rho - \tfrac12|}{|\rho - \tfrac12|^2}, \tag{17}$$

in which the sum extends over all zeros in the critical strip. Note that (17) quantifies the difference between the two sides of (16) as a weighted sum of the absolute deviations of the real parts of the zeros from 1/2.

Secondly, there is a connection with logarithmic Hilbert transforms, also called logarithmic dispersion relations [3]. Suppose we had $\eta(z) \neq 0$ for $\Re z > 1/2$. Then $\eta$ itself would be an outer function, that is, $\eta(z) = H_{1/2}(z)$ for $\Re z > 1/2$, with $H_{1/2}$ as in (10).

Taking imaginary parts in this equation one can show, with a little algebra, that for $z - 1/2 = a + ib$, $a > 0$, one then has

$$\Im\log\eta(z) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\log|\eta(\tfrac12+it)| - \log|\eta(\tfrac12+ib)|}{t-b}\,dt - \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\log|\eta(\tfrac12+it)| - \log|\eta(\tfrac12+ib)|}{t-b}\;\frac{a^2}{a^2+(t-b)^2}\,dt. \tag{18}$$

Fix any $b > 0$ such that $\eta(\tfrac12+ib) \neq 0$. Then the last term in (18) converges to zero as $a \downarrow 0$. Therefore, using the fact that $|\eta(\tfrac12+it)|$ is an even function of $t$, one obtains in the limit the logarithmic dispersion relation

$$\Im\log\eta(\tfrac12+ib) = \frac{2b}{\pi}\int_0^{\infty}\frac{\log|\eta(\tfrac12+it)| - \log|\eta(\tfrac12+ib)|}{t^2-b^2}\,dt, \tag{19}$$

which expresses the phase of $\eta$ on the boundary $\Re z = 1/2$ as an integral of its log modulus along that line. Recall that this relation is a consequence of the assumed outer function character of $\eta$, that is, of the RH. In fact, the validity of (19) for every $b > 0$ such that $\eta(\tfrac12+ib) \neq 0$ is also sufficient for the RH. To see this, divide both sides of (19) by $b$ and let $b \downarrow 0$. Then the left side tends to $(\log\eta)'(\tfrac12)$, the right side to $\frac{2}{\pi}\int_0^{\infty}\log\big[|\eta(\tfrac12+it)|/\eta(\tfrac12)\big]\,\frac{dt}{t^2}$, so in the limit we get the condition (16), shown above to be equivalent to the RH.

Finally, we note that $-(\log\eta)'(\sigma)$ equals the first moment of the probability density $p_\sigma$, cf. (6). In view of (16) and (15) this raises the question whether the integral term in these relations admits of a probabilistic interpretation, too. Relevant to this question is the observation, going back to Khintchine, that for every $\sigma > 1$ the function $f_\sigma(t) = \zeta(\sigma+it)/\zeta(\sigma)$ is the characteristic function of an infinitely divisible distribution, cf. Example 6, p. 75 in Gnedenko and Kolmogorov [7]. This can be verified by rewriting the product representation of the zeta function (for $\sigma > 1$) in the form

$$\frac{\zeta(\sigma+it)}{\zeta(\sigma)} = \prod_p\frac{1-p^{-\sigma}}{1-p^{-\sigma-it}} = \exp\left\{\sum_p\sum_{n=1}^{\infty}\frac{e^{-itn\log p}-1}{n\,p^{n\sigma}}\right\} \tag{20}$$

and noting that $f_\sigma(t)$ is thus represented as a product of terms of the form $\exp(a(e^{ibt}-1))$, each of which is the characteristic function of a Poisson random variable with intensity $a$ and values in the lattice $\{kb,\ k = 0, 1, 2, \ldots\}$.
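The identity behind (20) holds prime by prime, so it can be verified numerically on a truncated Euler product. The sketch below is only an illustration: it restricts the product to a short, arbitrarily chosen list of primes and truncates the inner sum at `n_max` terms; the names `euler_ratio` and `compound_poisson_exponent` are hypothetical.

```python
import cmath
import math

PRIMES = [2, 3, 5, 7, 11, 13]  # assumed truncation of the Euler product


def euler_ratio(sigma, t, primes=PRIMES):
    """Truncated product of (1 - p^-sigma) / (1 - p^-(sigma + it))."""
    value = 1.0 + 0.0j
    for p in primes:
        value *= (1 - p ** (-sigma)) / (1 - p ** (-(sigma + 1j * t)))
    return value


def compound_poisson_exponent(sigma, t, primes=PRIMES, n_max=200):
    """Truncated double sum of (exp(-i t n log p) - 1) / (n p^(n sigma))."""
    total = 0j
    for p in primes:
        log_p = math.log(p)
        for n in range(1, n_max + 1):
            weight = math.exp(-n * sigma * log_p) / n  # Levy mass at -n log p
            total += (cmath.exp(-1j * t * n * log_p) - 1) * weight
    return total


if __name__ == "__main__":
    sigma, t = 2.0, 1.3
    lhs = euler_ratio(sigma, t)
    rhs = cmath.exp(compound_poisson_exponent(sigma, t))
    print(abs(lhs - rhs))
```

Each prime contributes a superposition of Poisson terms with jump sizes $-n\log p$ and intensities $(n p^{n\sigma})^{-1}$, so the printed difference should be at the level of rounding error.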

In order to connect this fact with the above question it is convenient to introduce the Lévy measure $F_\sigma$ which puts mass $(n p^{n\sigma})^{-1}$ at each of the points $-n\log p$, $n \ge 1$, $p$ prime. Then (20) becomes $\log\frac{\zeta(\sigma+it)}{\zeta(\sigma)} = \int(e^{itx}-1)\,F_\sigma(dx)$, so taking real parts in this equation and using $\int_{-\infty}^{\infty}(1-\cos tx)/t^2\,dt = \pi|x|$ ($x \in \mathbb{R}$), one obtains

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\log\big[|\zeta(\sigma+it)|/\zeta(\sigma)\big]\,\frac{dt}{t^2} = \frac{1}{\pi}\int_{-\infty}^{\infty}\int(\cos tx - 1)\,F_\sigma(dx)\,\frac{dt}{t^2} = \int\frac{1}{\pi}\int_{-\infty}^{\infty}(\cos tx - 1)\,\frac{dt}{t^2}\,F_\sigma(dx) = -\int|x|\,F_\sigma(dx) = \int x\,F_\sigma(dx).$$

Thus we find that the essential part of the integral in question equals the first moment of the Lévy measure $F_\sigma$. The other part, stemming from the factor $(z-1)/z^2$, can be incorporated by introducing a signed absolutely continuous measure $G_\sigma$ with density $x^{-1}\big[2e^{\sigma x} - e^{(\sigma-1)x}\big]$ on $(-\infty, 0)$ (zero on $[0, \infty)$). One then has

$$\log\frac{\eta(\sigma+it)}{\eta(\sigma)} = \int(e^{itx}-1)\,(F_\sigma - G_\sigma)(dx)$$

and hence

$$\frac{1}{\pi}\int_{-\infty}^{\infty}\log\big[|\eta(\sigma+it)|/\eta(\sigma)\big]\,\frac{dt}{t^2} = \int x\,(F_\sigma - G_\sigma)(dx) \qquad (\sigma > 1).$$

These calculations give a more detailed picture of the way the factor $(z-1)/z^2$ regularizes the zeta function as $\sigma \downarrow 1$: it compensates the flow of mass of $F_\sigma$ towards $-\infty$ by the subtraction of measures $G_\sigma$ such that the first moment of $F_\sigma - G_\sigma$ remains bounded. Evidently, other ways of renormalizing the Lévy measure as $\sigma \downarrow 1$ are also conceivable and may be interesting to explore.

References

1. M. Balazard, E. Saias, and M. Yor, Adv. Math. 143, 284 (1999).

2. M. V. Berry and J. P. Keating, SIAM Review 41, 236 (1999).

3. R. E. Burge, M. A. Fiddy, A. H. Greenaway, and G. Ross, Proc. R. Soc. London A 350, 191 (1976).

4. J.-F. Burnol, <http://arXiv.org/abs/math/0001013> (2000).

5. H. Dym and H. P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem (Academic Press, New York, 1976).

6. H. M. Edwards, The Theory of the Riemann Zeta Function (Academic Press, New York, 1974).

7. B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, 1954).

8. K. Hoffman, Banach Spaces of Analytic Functions (Dover, New York, 1988).


ENSEMBLE PROBABILISTIC EQUILIBRIUM AND NON-EQUILIBRIUM THERMODYNAMICS WITHOUT THE THERMODYNAMICAL LIMIT

D. H. E. GROSS

Hahn-Meitner-Institut Berlin, Bereich Theoretische Physik, Glienickerstr. 100, 14109 Berlin, Germany, and Freie Universität Berlin, Fachbereich Physik. Email: gross@hmi.de

Boltzmann's principle $S = k \ln W$ allows one to extend equilibrium thermo-statistics to Small systems without invoking the thermodynamic limit [2,3]. As the limit hides more than it clarifies the origin of phase transitions, a deeper and more transparent understanding is thus possible. The main clue is to base statistical probability on ensemble averaging and not on time averaging. It is argued that, due to the incomplete information obtained by macroscopic measurements, thermodynamics handles ensembles or finite-sized sub-manifolds in phase space and not single time-dependent trajectories. Therefore ensemble averages are the natural objects of statistical probabilities. This is the physical origin of coarse-graining, which is no longer a mathematical ad hoc assumption. The probabilities $P(M)$ of macroscopic measurements $M$ are given by the ratio $P(M) = W(M)/W$ of the volumes of the sub-manifold $\mathcal{M}$ of the microcanonical ensemble with the constraint $M$ to the one without. From this concept all equilibrium thermodynamics can be deduced quite naturally, including the most sophisticated phenomena of phase transitions for Small systems.

Boltzmann's principle is generalized to non-equilibrium Hamiltonian systems with possibly fractal distributions $\mathcal{M}$ in $6N$-dim. phase space by replacing the conventional Riemann integral for the volume in phase space by its corresponding box-counting volume. This is equal to the volume of the closure $\overline{\mathcal{M}}$. With this extension the Second Law is derived without invoking the thermodynamic limit. The irreversibility in this approach is due to the replacement of the phase-space volume of the fractal sub-manifold $\mathcal{M}$ by the volume of its closure $\overline{\mathcal{M}}$. The physical reason for this replacement is that macroscopic measurements cannot distinguish $\mathcal{M}$ from $\overline{\mathcal{M}}$. Whereas the former is not changing in time due to Liouville's theorem, the volume of the closure can be larger. In contrast to conventional coarse graining, the box-counting volume is defined in the limit of infinite resolution, i.e. there is no artificial loss of information.

1 Introduction

Recently, the interest in the thermo-statistical behavior of non-extensive many-body systems like atomic nuclei, atomic clusters, soft-matter, biological systems, and also self-gravitating astro-physical systems has led us to consider thermo-statistics without using the thermodynamic limit. This is most safely done by going back to Boltzmann. Einstein considers Boltzmann's definition of entropy, as e.g. written on his famous epitaph

$$S = k \cdot \ln W \tag{1}$$

as Boltzmann's principle [4], from which Boltzmann was able to deduce thermodynamics. Here $W$ is the number of micro-states at given energy $E$ of the $N$-body system in the spatial volume $V$:

$$W(E,N,V) = \mathrm{tr}\big[\epsilon_0\,\delta(E - H_N)\big] \tag{2}$$

$$\mathrm{tr}\big[\delta(E - H_N)\big] = \int\frac{d^{3N}p\;d^{3N}q}{N!\,(2\pi\hbar)^{3N}}\,\delta(E - H_N). \tag{3}$$

$\epsilon_0$ is a suitable energy constant to make $W$ dimensionless, $H_N$ is the $N$-particle Hamilton-function, and the $N$ positions $q$ are restricted to the volume $V$, whereas the momenta $p$ are unrestricted. In what follows we remain on the level of classical mechanics. The only reminders of the underlying quantum mechanics are the measure of the phase space in units of $2\pi\hbar$ and the factor $1/N!$, which respects the indistinguishability of the particles (Gibbs paradoxon). In contrast to Boltzmann [5,6], who used the principle only for dilute gases, and to Schrödinger [7], who thought equation (1) is useless otherwise, I take the principle as the fundamental, generic definition of entropy. In the following sections I will demonstrate that this definition of thermo-statistics works well, especially also at higher densities and at phase transitions, without invoking the thermodynamic limit.

2 There is a lot to add to classical equilibrium statistics from our experience with Small systems

Following Lieb [8], extensivity (a) and the existence of the thermodynamic limit $N \to \infty|_{N/V=\mathrm{const}}$ are essential conditions for conventional (canonical) thermodynamics to apply. Certainly, this implies also the homogeneity of the system. Phase transitions are somehow foreign to this. The essence of first order transitions is that the systems become inhomogeneous and split into different phases separated by interfaces. In the conventional Yang-Lee theory, phase transitions are represented by the positive zeros of the grand-canonical partition sum, where the grand-canonical formalism breaks down (Yang-Lee singularities). In the following we show that the micro-canonical ensemble gives much more detailed and more natural insight, which corresponds to the experimental identification of phase transitions.

(a) Dividing extensive systems into larger pieces, the total energy and entropy are equal to the sum of those of the pieces.

There is a whole group of physical many-body systems, called Small in the following, which cannot be addressed by conventional thermo-statistics:

• nuclei

• atomic clusters

• polymers

• soft matter (biological) systems

• astrophysical systems

• first order transitions are distinguished from continuous transitions by the appearance of phase-separations and interfaces with surface tension. If the range of the force or the thickness of the surface layers is such that the number of surface particles is not negligible compared to the total number of particles, these systems are non-extensive.

For such systems the thermodynamic limit does not exist or makes no sense. Either the range of the forces (Coulomb, gravitation) is of the order of the linear dimensions of these systems, and/or they are strongly inhomogeneous, e.g. at phase-separation.

Boltzmann's principle does not invoke the thermodynamic limit, nor additivity, nor extensivity, nor concavity of the entropy $S(E,N)$ (downwards bending). This has been largely forgotten for a hundred years. We have to go back to pre-Gibbsian times. It is a purely geometrical definition of the entropy and applies as well to Small systems. Moreover, the entropy $S(E,N)$ as defined above is everywhere single-valued and multiply differentiable. There are no singularities in it. This is the most simple access to equilibrium statistics [9]. We will explore its consequences in this contribution. Moreover, we will see that in this way we get simultaneously the complete information about the three crucial parameters characterizing a phase transition of first order: transition temperature $T_{tr}$, latent heat per atom $q_{lat}$, and surface tension $\sigma_{surf}$. Boltzmann's famous epitaph above (eq. 1) contains everything that can be said about equilibrium thermodynamics in its most condensed form. $W$ is the volume of the sub-manifold at sharp energy in the $6N$-dim. phase space.

3 Relation of the topology of $S(E,N)$ to the Yang-Lee zeros of $Z(T,\mu,V)$

In conventional thermo-statistics, phase transitions are indicated by zeros of the grand-canonical partition function $Z(T,\mu,V)$, where $V$ is the volume. See more details in [1,2,3,10].

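$$\begin{aligned}
Z(T,\mu,V) &= \iint_0^{\infty}\frac{dE}{\epsilon_0}\,dN\;e^{-[E-\mu N-TS(E,N)]/T}\\
&= \frac{V^2}{\epsilon_0}\iint_0^{\infty}de\,dn\;e^{-V[e-\mu n-Ts(e,n)]/T}\\
&\approx \frac{V^2}{\epsilon_0}\iint_0^{\infty}de\,dn\;e^{-V[\text{const.}+\text{lin.}+\text{quadr.}]/T}
\end{aligned} \tag{4}$$

in the thermodynamic limit $V \to \infty|_{N/V=\mathrm{const}}$. The double Laplace integral (4) can be evaluated asymptotically for large $V$ by expanding the exponent, as indicated in the last line, to second order in $\Delta e$, $\Delta n$ around the stationary point $e_s, n_s$ where the linear term vanishes:

$$\frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_s, \qquad -\frac{\mu}{T} = \left.\frac{\partial S}{\partial N}\right|_s, \qquad \frac{P}{T} = \left.\frac{\partial S}{\partial V}\right|_s; \tag{5}$$

the only term remaining to be integrated is the quadratic one. If the two eigen-curvatures $\lambda_1 < 0$, $\lambda_2 < 0$, this is then a Gaussian integral and yields

$$Z(T,\mu,V) = \frac{V^2}{\epsilon_0}\,e^{-V[e_s-\mu n_s-Ts(e_s,n_s)]/T}\iint_{-\infty}^{\infty}dv_1\,dv_2\;e^{V[\lambda_1 v_1^2+\lambda_2 v_2^2]/2} \tag{6}$$

$$Z(T,\mu,V) = e^{-F(T,\mu,V)/T} \tag{7}$$

$$\frac{F(T,\mu,V)}{V} \to e_s - \mu n_s - Ts_s + \frac{T\ln\big(\sqrt{\det(e_s,n_s)}\big)}{V} + o\Big(\frac{\ln V}{V}\Big). \tag{8}$$

Here $\det(e_s,n_s)$ is the determinant of the curvatures of $s(e,n)$, and $v_1, v_2$ are the eigenvectors of $d$:

$$\det(e,n) = \begin{Vmatrix}\dfrac{\partial^2 s}{\partial e^2} & \dfrac{\partial^2 s}{\partial n\,\partial e}\\[1.5ex] \dfrac{\partial^2 s}{\partial e\,\partial n} & \dfrac{\partial^2 s}{\partial n^2}\end{Vmatrix} = \begin{Vmatrix}s_{ee} & s_{en}\\ s_{ne} & s_{nn}\end{Vmatrix} = \lambda_1\lambda_2, \qquad \lambda_1 \ge \lambda_2. \tag{9}$$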

Figure 1: MMMC simulation of the entropy $s(e)$ per atom ($e$ in eV per atom) of a system of $N_0 = 1000$ sodium atoms with realistic interaction at an external pressure of 1 atm. At the energy per atom $e_1$ the system is in the pure liquid phase and at $e_3$ in the pure gas phase, of course with fluctuations. The latent heat per atom is $q_{lat} = e_3 - e_1$. Attention: the curve $s(e)$ is artificially sheared by subtracting a linear function $25 + e \cdot 11.5$ in order to make the convex intruder visible. $s(e)$ is always a steeply monotonic rising function. We clearly see the global concave (downwards bending) nature of $s(e)$ and its convex intruder. Its depth is the entropy loss due to the additional correlations by the interfaces. From this one can calculate the surface tension per surface atom $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$. The double tangent is the concave hull of $s(e)$. Its derivative gives the Maxwell line in the caloric curve $T(e)$ at $T_{tr}$. In the thermodynamic limit the intruder would disappear and $s(e)$ would approach the double tangent (Maxwell line) from below.

In the cases studied here $\lambda_2 < 0$, but $\lambda_1$ can be positive or negative. If $\det(e_s,n_s)$ is positive ($\lambda_1 < 0$), the last two terms in eq. (8) go to 0 and we obtain the familiar result $f(T,\mu,V \to \infty) = e_s - \mu n_s - Ts_s$. I.e. the curvature $\lambda_1$ of the entropy surface $s(e,n,V)$ decides whether the grand-canonical ensemble agrees with the fundamental micro ensemble in the thermodynamic limit. If this is the case, $\ln[Z(T,\mu)]$ or $f(T,\mu)$ is analytical in $e^{\beta\mu}$, and due to Yang and Lee we have a single stable phase. Otherwise, the Yang-Lee zeros reflect anomalous points/regions of $\lambda_1 \ge 0$ ($\det(e,n) \le 0$). This is crucial. As $\det(e_s,n_s)$ can be studied for finite or even small systems as well, this is the only proper extension of phase transitions to Small systems.
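The criterion above reduces to the sign of the larger eigen-curvature $\lambda_1$ of the $2 \times 2$ curvature matrix in (9). The following sketch is an illustration only: it takes the three second derivatives of $s(e,n)$ at a point as given numbers, and the function names and the wording of the returned labels are assumptions made for this example.

```python
import math


def eigen_curvatures(s_ee, s_en, s_nn):
    """Eigenvalues (lambda_1 >= lambda_2) of [[s_ee, s_en], [s_en, s_nn]]."""
    mean = 0.5 * (s_ee + s_nn)
    radius = math.hypot(0.5 * (s_ee - s_nn), s_en)
    return mean + radius, mean - radius


def classify(s_ee, s_en, s_nn):
    """Label a point of the entropy surface by the sign of lambda_1."""
    lam1, _lam2 = eigen_curvatures(s_ee, s_en, s_nn)
    if lam1 < 0:
        return "single phase (concave entropy surface, det > 0)"
    return "anomalous region (convex intruder, lambda_1 >= 0)"


if __name__ == "__main__":
    print(classify(-2.0, 0.0, -1.0))   # both curvatures negative
    print(classify(1.0, 0.3, -1.0))    # one positive curvature
```

When $\lambda_2 < 0$, as in the cases studied here, the sign of $\lambda_1$ is opposite to the sign of the determinant $\lambda_1\lambda_2$, so either quantity can be used for the classification.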

4 The regions of positive curvature $\lambda_1$ of $s(e_s,n_s)$ correspond to phase transitions of first order

We will now discuss the physical origin of convex (upwards bending) intruders in the entropy surface in two examples.

In table (1) we compare the liquid-gas phase transition in sodium clusters of a few hundred atoms with that of the bulk at 1 atm, cf. also fig. (1).

Figure (2) shows how, for a small system (Potts, $q = 3$ lattice gas with $50 \times 50$ points), all phenomena of phase transitions can be studied from the topology of the determinant of curvatures (9) in the micro-canonical ensemble.

Table 1: Parameters of the liquid-gas transition of small sodium clusters (MMMC-calculation [1]) in comparison with the bulk for rising number $N_0$ of atoms. $N_{surf}$ is the average number of surface atoms of all clusters together.

    Na
    N_0                 200       1000      3000      bulk
    T_tr [K]            940       990       1095      1156
    q_lat [eV]          0.82      0.91      0.94      0.923
    s_boil              10.1      10.7      9.9       9.267
    Δs_surf             0.55      0.56      0.44
    N_surf              39.94     98.53     186.6     ∞
    σ/T_tr              2.75      5.68      7.07      7.41

5 Boltzmanns principle and non-equilibrium thermodynamics

Before we proceed we must comment on Einstein's attitude to the principle [11]. Originally, Boltzmann called $W$ the Wahrscheinlichkeit (probability), i.e. the relative time a system spends (along a time-dependent path) in a given region of $6N$-dim. phase space. Our interpretation of $W$ to be the number of complexions (Boltzmann's second interpretation) or quantum states (trace) with the same energy was criticized by Einstein [4] as artificial. It is exactly that criticized interpretation of $W$ which I use here and which works so excellently [1]. In section 7 I will come back to this fundamental point.

After succeeding to deduce equilibrium statistics, including all phenomena of phase transitions, from Boltzmann's principle even for Small systems, i.e. non-extensive many-body systems, it is challenging to explore how far this most conservative and restrictive way to thermodynamics [9] is able to describe also the approach of (eventually Small) systems to equilibrium and the Second Law of Thermodynamics.

Thermodynamics describes the development of macroscopic features of many-body systems without specifying them microscopically in all details. Before we address the Second Law, we have to clarify what we mean by the label macroscopic observable.

6 Macroscopic observables imply the EPS-probability

A single point qi(t)Pi(t)i=iN in the Af-body phase space corresponds to a detailed specification of the system with all degrees of freedom (dof) com-

137

1

0 8

0 6

0 4

0 2

0 - 2 - 1 5 - 1 - 0 5 0

e Figure 2 Conture plot of the curvature determinant of Potts-3 lattice gas Dark grey line d = 0 boundary of the region of phase coexistence the triangle APmB Light grey line minimum of d(en) in the direction of the largest curvature second order transition In the triangle APmC ordered (solid) phase Above and right of the line CPmB disordered (gas) phase The crossing Pm of the boundary lines is a multi critical point The light gray region around the multi-critical point Pm corresponds to a flat region of d(e n) ~ 0

pletely fixed at time t (microscopic determination) Fixing only the total energy E of an iV-body system leaves the other (6N mdash l)-degrees of freeshydom unspecified A second system with the same energy is most likely not in the same microscopic state as the first it will be at another point in phase space the other dof will be different Ie the measurement of the total energy HN or any other macroscopic observable M determines a (QN mdash 1)-dimensional sub-manifold pound or M in phase space All points in iV-body phase space consistent with the given value of E and volume V ie all points in the (6N mdash l)-dimensional sub-manifold poundNV) of phase space are equally consistent with this measurement pound(NV) is the microcanonical ensemble This example tells us that any macroscopic measurement is incomplete and defines a sub-manifold of points in phase space not a single point An addishytional measurement of another macroscopic quantity Bqp reduces pound further to the cross-section pound O B a (6iV mdash 2)-dimensional subset of points in pound with the volume

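$$W(B,E,N,V) = \frac{1}{N!}\int \left(\frac{d^3q\, d^3p}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N\{q,p\})\, \delta(B - \hat{B}\{q,p\}) \qquad (10)$$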


If $\hat{H}_N\{q,p\}$ as well as $\hat{B}\{q,p\}$ are continuously differentiable functions of their arguments, which we assume in the following, $\mathcal{E} \cap \mathcal{B}$ is closed. In the following we use $W$ for the Riemann or Liouville volume of a manifold.

Microcanonical thermostatics gives the probability $P(B,E,N,V)$ to find the $N$-body system in the sub-manifold $\mathcal{E} \cap \mathcal{B}(E,N,V)$:

$$P(B,E,N,V) = \frac{W(B,E,N,V)}{W(E,N,V)} = e^{\ln[W(B,E,N,V)] - S(E,N,V)} \qquad (11)$$

This is what Krylov seems to have had in mind [12] and what I will call the "ensemble probabilistic formulation of statistical mechanics (EPS)".

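Similarly, thermodynamics describes the development of some macroscopic observable $\hat{B}\{q_t,p_t\}$ in time of a system which was specified at an earlier time $t_0$ by another macroscopic measurement $\hat{A}\{q_0,p_0\}$. It is related to the volume of the sub-manifold $\mathcal{M}(t) = \mathcal{A}(t_0) \cap \mathcal{B}(t) \cap \mathcal{E}$:

$$W(A,B,E,t) = \frac{1}{N!}\int \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^{N} \delta(B - \hat{B}\{q_t,p_t\})\, \delta(A - \hat{A}\{q_0,p_0\})\, \epsilon_0\, \delta(E - \hat{H}\{q_t,p_t\}) \qquad (12)$$

where $\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}$ is the set of trajectories solving Hamilton's equations

$$\dot{q}_i = \frac{\partial \hat{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \hat{H}}{\partial q_i}, \qquad i = 1, \ldots, N \qquad (13)$$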

with the initial conditions $\{q(t = t_0) = q_0;\ p(t = t_0) = p_0\}$. For a very large system with $N \sim 10^{23}$ the probability to find a given value $B(t)$, $P(B(t))$, is usually sharply peaked as a function of $B$. Ordinary thermodynamics treats systems in the thermodynamic limit $N \to \infty$ and gives only $\langle B(t) \rangle$. However, here we are interested in formulating the Second Law for "Small" systems, i.e. we are interested in the whole distribution $P(B(t))$, not only in its mean value $\langle B(t) \rangle$. Thermodynamics does not describe the temporal development of a single system (a single point in the $6N$-dim. phase space).
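How sharply $P(B)$ peaks as $N$ grows can be illustrated with a toy model that is not taken from the text: $N$ non-interacting particles in a box, with $B$ the number of particles found in the left half. Under that assumption the phase-space volume $W(B)$ is proportional to a binomial coefficient, and the EPS probability of eq. (11) reduces to $W(B)/W$. The function names `eps_probability` and `relative_width` are hypothetical.

```python
import math

def eps_probability(n_particles, k_left):
    """P(B = k) = W(B) / W for an ideal gas in a box split into equal halves.

    Toy assumption: particle positions are independent, so the number of
    microstates with k particles on the left is proportional to C(N, k)
    and the total is proportional to 2**N.
    """
    return math.comb(n_particles, k_left) / 2 ** n_particles

def relative_width(n_particles):
    """Standard deviation of the fraction k/N under the distribution above."""
    probs = [eps_probability(n_particles, k) for k in range(n_particles + 1)]
    mean = sum(k * p for k, p in enumerate(probs)) / n_particles
    var = sum((k / n_particles - mean) ** 2 * p for k, p in enumerate(probs))
    return math.sqrt(var)

widths = {n: relative_width(n) for n in (10, 100, 1000)}
for n, w in widths.items():
    print(n, round(w, 4))
```

In this toy model the width of the distribution of $B/N$ shrinks like $1/(2\sqrt{N})$, so for a small system the whole distribution matters, while for $N \sim 10^{23}$ only the mean value is visible.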

There is an important property of macroscopic measurements: Whereas the macroscopic constraint $\hat{A}\{q_0,p_0\}$ determines (usually) a compact region $\mathcal{A}(t_0)$ in $\{q_0,p_0\}$, this does not need to be the case at later times $t \gg t_0$: $\mathcal{A}(t)$, defined by $\hat{A}\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\}$, might become a fractal, i.e. "spaghetti-like", manifold, cf. fig. 3, as a function of $\{q_t,p_t\}$ in $\mathcal{E}$ at $t \to \infty$, and lose compactness.

This can be expressed in mathematical terms: There exist series of points $\{a_n\} \in \mathcal{A}(\infty)$ which converge to a point $a_{n\to\infty}$ which is not in $\mathcal{A}(\infty)$. E.g., such points may have intruded from the phase space complementary to $\mathcal{A}(t_0)$. An illustrative example of this evolution of an initially compact sub-manifold into a fractal set is the baker transformation, discussed in this context in refs. [13,14]. Then no macroscopic (incomplete) measurement at time $t = \infty$ can resolve $a_\infty$ from its immediate neighbors $a_n$ in phase space with distances $|a_n - a_\infty|$ less than any arbitrarily small $\delta$. In other words, at the time $t \gg t_0$ no macroscopic measurement with its incomplete information about $\{q_t,p_t\}$ can decide whether $\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\} \in \mathcal{A}(t_0)$ or not. I.e., any macroscopic theory like thermodynamics can only deal with the closure of $\mathcal{A}(t)$. If necessary, the sub-manifold $\mathcal{A}(t)$ must be artificially closed to $\overline{\mathcal{A}(t)}$, as developed further in section 8. Clearly, in this approach this is the physical origin of irreversibility. We come back to this in section 8.

7 On Einstein's objections against the EPS-probability

According to Abraham Pais, Subtle is the Lord [11], Einstein was critical with regard to the definition of relative probabilities by eq. (11), Boltzmann's counting of "complexions". He considered it as artificial and not corresponding to the immediate picture of probability used in the actual problem: "The word probability is used in a sense that does not conform to its definition as given in the theory of probability. In particular, cases of equal probability are often hypothetically defined in instances where the theoretical pictures used are sufficiently definite to give a deduction rather than a hypothetical assertion" [4]. He preferred to define probability by the relative time a system (a trajectory of a single point moving with time in the $N$-body phase space) spends in a subset of the phase space. However, is this really the immediate picture of probability used in statistical mechanics? This definition demands the ergodicity of the trajectory in phase space. As we discussed above, thermodynamics, like any other macroscopic theory, handles incomplete, macroscopic information about the $N$-body system. It consequently handles the temporal evolution of finite-sized sub-manifolds (ensembles), not single points in phase space. The typical outcomes of macroscopic measurements are calculated. Nobody waits in a macroscopic measurement, e.g. of the temperature, long enough that an atom can cross the whole system.

In this respect, I think the EPS version of statistical mechanics is closer to the experimental situation than the duration-time of a single trajectory. Moreover, in an experiment on a small system like a nucleus, the excited nucleus, which then may fragment statistically later on, is produced by a multiple repetition of scattering events, and statistical averages are taken. No ergodic covering of the whole phase space by a single trajectory in time is demanded.


At the high excitations of the nuclei in the fragmentation region their life-time would be too short for that. This is analogous to the statistics of a falling ball on Galton's nail-board, where a single trajectory likewise does not touch all nails but is random. Only after many repetitions is the smooth binomial distribution established. As I am discussing here the Second Law in finite systems, this is the correct scenario, not the time average over a single ergodic trajectory.
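The Galton-board analogy can be made concrete with a small simulation. The sketch below is an illustration only: it assumes an idealized board on which the ball goes left or right with equal probability at every nail, and the names `drop_ball` and `galton_histogram` are hypothetical. A single drop visits only one nail per row; the binomial shape appears only in the ensemble of many repeated drops.

```python
import random
from collections import Counter
from math import comb

def drop_ball(n_rows, rng):
    """One trajectory: at each of n_rows nails the ball goes left (0) or right (1)."""
    return sum(rng.randint(0, 1) for _ in range(n_rows))

def galton_histogram(n_rows, n_balls, seed=0):
    """Relative frequencies of the final bins after n_balls independent drops."""
    rng = random.Random(seed)
    counts = Counter(drop_ball(n_rows, rng) for _ in range(n_balls))
    return [counts[k] / n_balls for k in range(n_rows + 1)]

n_rows = 10
freq = galton_histogram(n_rows, n_balls=20000)
binomial = [comb(n_rows, k) / 2 ** n_rows for k in range(n_rows + 1)]

# Largest deviation between the ensemble histogram and the binomial law.
max_dev = max(abs(f - b) for f, b in zip(freq, binomial))
print(max_dev)
```

The histogram is an ensemble average over repetitions, which is the situation of the fragmentation experiments described above, not a time average along one trajectory.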

8 Fractal distributions in phase space, Second Law

Let us examine the following Gedanken experiment: Suppose the probability to find our system at points $\{q_t,p_t\}_1^N$ in phase space is uniformly distributed for times $t < t_0$ over the sub-manifold $\mathcal{E}(N,V_1)$ of the $N$-body phase space at energy $E$ and spatial volume $V_1$. At time $t > t_0$ we allow the system to spread over the larger volume $V_2 > V_1$ without changing its energy. If the system is dynamically mixing, the majority of trajectories $\{q_t,p_t\}_1^N$ in phase space starting from points $\{q_0,p_0\}$ with $q_0 \in V_1$ at $t_0$ will now spread over the larger volume $V_2$. Of course, the Liouvillean measure of the distribution $\mathcal{M}\{q_t,p_t\}$ in phase space at $t > t_0$ will remain the same ($= \mathrm{tr}[\mathcal{E}(N,V_1)]$) [15]. (The label $\{q_0 \in V_1\}$ of the integral means that the positions $\{q_0\}_1^N$ are restricted to the volume $V_1$; the momenta $\{p_0\}_1^N$ are unrestricted.)

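$$\mathrm{tr}\big[\mathcal{M}\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}\big]_{\{q_0 \in V_1\}} = \int_{\{q_0\{q_t,p_t\} \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N\{q_t,p_t\})$$

$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N\{q_0,p_0\}), \qquad (14)$$

$$\text{because of } \frac{\partial\{q_t,p_t\}}{\partial\{q_0,p_0\}} = 1. \qquad (15)$$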

But, as already argued by Gibbs, the distribution $\mathcal{M}\{q_t,p_t\}$ will be filamented like ink in water and will approach any point of $\mathcal{E}(N,V_2)$ arbitrarily closely. $\mathcal{M}\{q_t,p_t\}$ becomes dense in the new, larger $\mathcal{E}(N,V_2)$ for times sufficiently larger than $t_0$ (strictly, in the limit $t \to \infty$). The closure $\overline{\mathcal{M}}$ becomes equal to $\mathcal{E}(N,V_2)$. This is clearly expressed by Lebowitz [16,17].

In order to express this fact mathematically, we have to redefine Boltzmann's definition of entropy, eq. (1), and introduce the following fractal "measure" for integrals like (3) or (10):

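$$W(E,N,t \gg t_0) = \frac{1}{N!}\int_{\{q_0\{q_t,p_t\} \in V_1\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N\{q_t,p_t\}) \qquad (16)$$

With the transformation

$$\int \left(d^3q_t\, d^3p_t\right)^{N} \cdots = \int d\sigma_1 \cdots d\sigma_{6N} \cdots \qquad (17)$$

$$d\sigma_{6N} = \frac{1}{\|\nabla \hat{H}\|} \sum_i \left(\frac{\partial \hat{H}}{\partial q_i}\, dq_i + \frac{\partial \hat{H}}{\partial p_i}\, dp_i\right) = \frac{1}{\|\nabla \hat{H}\|}\, dE \qquad (18)$$

$$\|\nabla \hat{H}\| = \sqrt{\sum_i \left(\frac{\partial \hat{H}}{\partial q_i}\right)^2 + \sum_i \left(\frac{\partial \hat{H}}{\partial p_i}\right)^2} \qquad (19)$$

$$W(E,N,t \gg t_0) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int_{\{q_0\{q_t,p_t\} \in V_1\}} d\sigma_1 \cdots d\sigma_{6N-1}\, \frac{\epsilon_0}{\|\nabla \hat{H}\|}, \qquad (20)$$

we replace $\mathcal{M}$ by its closure $\overline{\mathcal{M}}$ and define now:

$$W(E,N,t \gg t_0) \to M(E,N,t \gg t_0) = \langle G(\mathcal{E}(N,V_2)) \rangle \cdot \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)], \qquad (21)$$

where $\langle G(\mathcal{E}(N,V_2)) \rangle$ is the average of $\frac{\epsilon_0}{N!\,(2\pi\hbar)^{3N}\,\|\nabla \hat{H}\|}$ over the (larger) manifold $\mathcal{E}(N,V_2)$, and $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ is the box-counting volume of $\mathcal{M}(E,N,t \gg t_0)$, which is the same as the volume of $\overline{\mathcal{M}}$, see below.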

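To obtain $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ we cover the $d$-dim. sub-manifold $\mathcal{M}(t)$, here with $d = (6N-1)$, of the phase space by a grid with spacing $\delta$ and count the number $N_\delta \propto \delta^{-d}$ of boxes of size $\delta^{6N}$ which contain points of $\mathcal{M}$. Then we determine

$$\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] = \overline{\lim_{\delta \to 0}}\ \delta^{d} N_\delta[\mathcal{M}(E,N,t \gg t_0)] \qquad (22)$$

with $\overline{\lim} * = \inf[\lim *]$, or symbolically:

$$M(E,N,t \gg t_0) = {B_d}\!\!\int_{\{q_0\{q_t,p_t\} \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N) \qquad (23)$$

$$\to \frac{1}{N!}\int_{\{q_t \in V_2\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N\{q_t,p_t\}) = W(E,N,V_2) \ge W(E,N,V_1), \qquad (24)$$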

[Figure 3: The compact set $\mathcal{M}(t_0)$, left side (volumes $V_a$ and $V_b$ separated, $t < t_0$), develops into an increasingly folded "spaghetti"-like distribution in phase space with rising time $t$ (right side, volume $V_a + V_b$, $t > t_0$). This figure shows only the early form of the distribution. At much larger times it will become more and more fractal. The grid illustrates the boxes of the box-counting method. All boxes which overlap with $\mathcal{M}(t)$ are counted in $N_\delta$ in eq. (22).]

where ${B_d}\!\!\int$ means that this integral should be evaluated via the box-counting volume (22), here with $d = 6N - 1$. This is illustrated by figure 3. With this extension of eq. (3), Boltzmann's entropy (1) is at time $t \to \infty$ equal to the logarithm of the larger phase space $W(E,N,V_2)$. This is the Second Law of Thermodynamics. The box-counting is also used in the definition of the Kolmogorov entropy, the average rate of entropy gain [18,19]. Of course, still at $t_0$, $\overline{\mathcal{M}(t_0)} = \mathcal{M}(t_0) = \mathcal{E}(N,V_1)$:

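$$M(E,N,t_0) = {B_d}\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N) \qquad (25)$$

$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^{N} \epsilon_0\, \delta(E - \hat{H}_N) = W(E,N,V_1). \qquad (26)$$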
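The difference between the conserved Liouville volume and the growing box-counting volume can be seen in a toy computation. The sketch below is not the $6N$-dimensional construction of the text: it assumes the two-dimensional baker map on the unit square as a stand-in for mixing Hamiltonian dynamics, samples the initial set by random points, and evaluates $\delta^d N_\delta$ at one fixed grid spacing rather than in the limit $\delta \to 0$. The names `baker` and `box_volume` are hypothetical.

```python
import random

def baker(x, y):
    """One step of the area-preserving baker map on the unit square."""
    if x < 0.5:
        return 2.0 * x, 0.5 * y
    return 2.0 * x - 1.0, 0.5 * y + 0.5

def box_volume(points, n_grid):
    """delta**2 * N_delta for a grid of n_grid x n_grid boxes (delta = 1/n_grid)."""
    occupied = {(int(x * n_grid), int(y * n_grid)) for x, y in points}
    return len(occupied) / n_grid ** 2

rng = random.Random(1)
# Initial compact set: the strip 0 <= y < 1/4, with Liouville area 1/4.
points = [(rng.random(), 0.25 * rng.random()) for _ in range(40000)]

volumes = []
for step in range(9):
    volumes.append(box_volume(points, n_grid=32))
    points = [baker(x, y) for x, y in points]
print(volumes)
```

The map preserves area, so the true (Liouville) area of the evolved set stays at 1/4 for all times. The box-counting estimate at fixed resolution nevertheless rises once the stripes become thinner than the grid spacing and ends at the area of the whole square, which plays the role of the larger manifold $\mathcal{E}(N,V_2)$ here.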

The box-counting volume is analogous to the standard method to determine the fractal dimension of a set of points [18] by the box-counting dimension:

$$\dim_{box}[\mathcal{M}(E,N,t \gg t_0)] = \overline{\lim_{\delta \to 0}}\ \frac{\ln N_\delta[\mathcal{M}(E,N,t \gg t_0)]}{-\ln \delta} \qquad (27)$$


Like the box-counting dimension, $\mathrm{vol}_{box}$ has the peculiarity that it is equal to the volume of the smallest closed covering set. E.g., the box-counting volume of the set of rational numbers $\{Q\}$ between 0 and 1 is $\mathrm{vol}_{box}\{Q\} = 1$, and thus equal to the measure of the real numbers, cf. Falconer [18], section 3.1. This is the reason why $\mathrm{vol}_{box}$ is not a measure in its mathematical definition, because then we should have

$$\mathrm{vol}_{box}\Big[\sum_{i \in \{Q\}} (\mathcal{M}_i)\Big] = \sum_{i \in \{Q\}} \mathrm{vol}_{box}[\mathcal{M}_i] = 0, \qquad (28)$$

therefore the quotation marks for the box-counting "measure". Coming back to the end of section 6, the volume $W(A,B,\cdots,t)$ of the relevant ensemble, the closure $\overline{\mathcal{M}(t)}$, must be "measured" by something like the box-counting "measure" (22, 23) with the box-counting integral ${B_d}\!\!\int$, which must replace the integral in eq. (3). Due to the fact that the box-counting volume is equal to the volume of the smallest closed covering set, the new, extended definition of the phase-space integral, eq. (23), is for compact sets like the equilibrium distribution $\mathcal{E}$ identical to the old one, eq. (3). Therefore, one can simply replace the old Boltzmann definition of the number of complexions, and with it of the entropy, by the new one (23).
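The statement that the box-counting volume of the rationals in $[0,1]$ equals 1 can be checked on a finite sample. The sketch below assumes a truncation that is not in the text: only fractions with denominator up to 64 are used, and the grid spacings are chosen coarse enough for this sample. The names `rationals_up_to` and `box_volume_1d` are hypothetical.

```python
from fractions import Fraction
from math import gcd

def rationals_up_to(max_den):
    """All reduced fractions p/q in [0, 1] with q <= max_den (a finite sample of Q)."""
    return {Fraction(p, q)
            for q in range(1, max_den + 1)
            for p in range(q + 1)
            if gcd(p, q) == 1}

def box_volume_1d(points, n_boxes):
    """delta * N_delta on [0, 1] with delta = 1/n_boxes, in exact arithmetic."""
    # The point x = 1 is assigned to the last box.
    occupied = {min(int(x * n_boxes), n_boxes - 1) for x in points}
    return Fraction(len(occupied), n_boxes)

sample = rationals_up_to(64)
for n_boxes in (8, 16, 32, 64):
    print(n_boxes, box_volume_1d(sample, n_boxes))
```

Every box contains at least one sample point, so the estimate equals the length of the whole interval, although the sample is a finite set of points of total length zero. A sum over the single points, as in eq. (28), would give 0 instead, which is why the box-counting volume is not additive.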

9 Conclusion

Macroscopic measurements $\hat{M}$ determine only a very few of all $6N$ dof. Any macroscopic theory like thermodynamics deals with the volumes $M$ of the corresponding closed sub-manifolds $\overline{\mathcal{M}}$ in the $6N$-dim. phase space, not with single points. The averaging over ensembles or finite sub-manifolds in phase space becomes especially important for the microcanonical ensemble of a finite system.

Because of this necessarily coarse information, macroscopic measurements, and with them also macroscopic theories, are unable to distinguish fractal sets $\mathcal{M}$ from their closures $\overline{\mathcal{M}}$. Therefore, I make the conjecture: the proper manifolds determined by a macroscopic theory like thermodynamics are the closed $\overline{\mathcal{M}}$. However, an initially closed subset of points at time $t_0$ does not necessarily evolve again into a closed subset at $t \gg t_0$. I.e., the closure operation and the $t \to \infty$ limit do not commute, and the macroscopic dynamics becomes irreversible. The $\lim_{t\to\infty}$ and $\lim_{\delta\to 0}$ may be linked as e.g. $\delta > \mathrm{const}/t$, with the $\delta \to 0$ limit taken after the $t \to \infty$ limit.

Here is the origin of the misunderstanding by the famous reversibility paradoxes which were invented by Loschmidt [20] and Zermelo [21,22] and which bothered Boltzmann so much [23,24]. These paradoxes address trajectories of single points in the $N$-body phase space, which must return after Poincaré's recurrence time or which must run backwards if all momenta are exactly reversed. Therefore, Loschmidt and Zermelo concluded that the entropy should decrease as well as it was increasing before. The specification of a single point demands of course a microscopically exact specification of all $6N$ degrees of freedom, not a determination of a few macroscopic degrees of freedom only. No entropy is defined for a single point.

By our formulation of thermo-statistics various non-trivial limiting processes can be avoided. Neither does one invoke the thermodynamic limit of a homogeneous system with infinitely many particles, nor does one rely on the ergodic hypothesis of the equivalence of (very long) time averages and ensemble averages. The use of ensemble averages is justified directly by the very nature of macroscopic (incomplete) measurements. Coarse-graining appears as a natural consequence of this. The box-counting method mirrors the averaging over the overwhelming number of non-determined degrees of freedom. Of course, a fully consistent theory must use this averaging explicitly. Then one would not depend on the order of the limits $\lim_{\delta\to 0}\lim_{t\to\infty}$, as was tacitly assumed here. Presumably, the rise of the entropy can then already be seen at finite times, when the fractality of the distribution in phase space is not yet fully developed. The coarse-graining is no longer a mathematical ad hoc assumption. Moreover, the Second Law is, in the EPS-formulation of statistical mechanics, not linked to the thermodynamic limit, as was thought up to now [16,17].

Appendix

In the mathematical theory of fractals [18] one usually uses the Hausdorff measure or the Hausdorff dimension of the fractal [19]. This, however, would be wrong in Statistical Mechanics. Here I want to point out the difference between the box-counting "measure" and the proper Hausdorff measure of a manifold of points in phase space. Without going into too much mathematical detail, we can make this clear again with the same example as above: The Hausdorff measure of the rational numbers in $[0,1]$ is 0, whereas the Hausdorff measure of the real numbers in $[0,1]$ is 1. Therefore, the Hausdorff measure of a set is a proper measure. The Hausdorff measure of the fractal distribution in phase space $\mathcal{M}(t \to \infty)$ is the same as that of $\mathcal{M}(t_0)$, $W(E,N,V_1)$. Measured by the Hausdorff measure, the phase space volume of the fractal distribution $\mathcal{M}(t \to \infty)$ is conserved and Liouville's theorem applies. This would demand that thermodynamics could distinguish any point inside the fractal from any point outside of it, independently of how close it is. This, however, is impossible for any macroscopic theory that can only address macroscopic information, where all unobserved degrees of freedom are averaged over. That is the deep reason why the box-counting "measure" must be taken, and where irreversibility comes from.

Acknowledgement

I thank E. G. D. Cohen and Pierre Gaspard for detailed discussions.

References

1. D. H. E. Gross, Microcanonical Thermodynamics: Phase Transitions in "Small" Systems, Lecture Notes in Physics (World Scientific, Singapore, 2000).

2. D. H. E. Gross and E. Votyakov, Phase transitions in "small" systems, Eur. Phys. J. B 15, 115-126 (2000), http://arXiv.org/abs/cond-mat/9911257.

3. D. H. E. Gross, Micro-canonical statistical mechanics of some non-extensive systems, httparXiv orgabsastro-phcond-mat0004268 (2000).

4. A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt, Annalen der Physik 17, 132 (1905).

5. L. Boltzmann, Über die Beziehung eines allgemeinen mechanischen Satzes zum Hauptsatz der Wärmelehre, Sitzungsbericht der Akademie der Wissenschaften, Wien 2, 67-73 (1877).

6. L. Boltzmann, Über die Begründung einer kinetischen Gastheorie auf anziehende Kräfte allein, Wiener Berichte 89, 714 (1884).

7. E. Schrödinger, Statistical Thermodynamics, a Course of Seminar Lectures, delivered in January-March 1944 at the School of Theoretical Physics (Cambridge University Press, London, 1946).

8. Elliott H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Report, cond-mat/9708200, 310, 1-96 (1999).

9. J. Bricmont, Science of chaos or chaos in science?, Physicalia Magazine, Proceedings of the New York Academy of Science, to appear, 1-50 (2000).

10. D. H. E. Gross, Phase transitions in "small" systems - a challenge for thermodynamics, http://arXiv.org/abs/cond-mat/0006087, page 8 (2000).

11. A. Pais, Subtle is the Lord, chapter 4, pages 60-78 (Oxford University Press, Oxford, 1982).

12. N. S. Krylov, Works on the Foundation of Statistical Physics (Princeton University Press, Princeton, 1979).

13. R. F. Fox, Entropy evolution for the baker map, Chaos 8, 462-465 (1998).

14. T. Gilbert, J. R. Dorfman, and P. Gaspard, Entropy production, fractals, and relaxation to equilibrium, Phys. Rev. Lett. 85, 1606, nlinCD000301 (2000).

15. H. Goldstein, Classical Mechanics (Addison-Wesley, Reading, Mass., 1959).

16. J. L. Lebowitz, Microscopic origins of irreversible macroscopic behavior, Physica A 263, 516-527 (1999).

17. J. L. Lebowitz, Statistical mechanics: A selective review of two central issues, Rev. Mod. Phys. 71, S346-S357 (1999).

18. K. Falconer, Fractal Geometry - Mathematical Foundations and Applications (John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1990).

19. E. W. Weisstein, Concise Encyclopedia of Mathematics (CRC Press, London, New York, Washington D.C., 1999), CD-ROM edition 1 205 99.

20. J. Loschmidt, Wienerberichte 73, 128 (1876).

21. E. Zermelo, Wied. Ann. 57, 778-784 (1896).

22. E. Zermelo, Über die mechanische Erklärung irreversibler Vorgänge, Wied. Ann. 60, 392-398 (1897).

23. E. G. D. Cohen, Boltzmann and statistical mechanics, in Boltzmann's Legacy, 150 Years after his Birth, http://xxx.lanl.gov/abs/cond-mat/9608054 (Atti dell'Accademia dei Lincei, Rome, 1997).

24. E. G. D. Cohen, Boltzmann and Statistical Mechanics, volume 371 of Dynamics: Models and Kinetic Methods for Nonequilibrium Many Body Systems, J. Karkheck, editor, 223-238 (Kluwer, Dordrecht, The Netherlands, 2000).


AN APPROACH TO QUANTUM PROBABILITY

STAN GUDDER
Department of Mathematics, University of Denver, Denver, Colorado 80208

sgudder@cs.du.edu

We present an approach to quantum probability that is motivated by the Feynman formalism. This approach shows that there is a realistic description of quantum mechanics and that nonrelativistic quantum theory can be derived from simple postulates of quantum probability. The basic concepts in this framework are measurements and actions. The measurements are similar to the dynamic variables of classical mechanics and the random variables of classical probability theory. The actions correspond to quantum mechanical states. An influence between configurations of a physical system is defined in terms of an action. The fundamental postulate of this approach is that the probability density at a measurement outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing the measurement.

1 Introduction

We shall discuss a new approach to quantum probability that combines a reformulation of the mathematical foundations of quantum mechanics and the basic tenets of probability theory. This approach is motivated by the Feynman formalism [1], and it answers various puzzling questions about traditional quantum mechanics. Some of these questions are the following.

1. Where does the quantum mechanical Hilbert space $H$ come from?

2. Why are states represented by unit vectors in $H$ and observables by self-adjoint operators on $H$?

3. Why does the probability have its postulated form?

4. Why do the position and momentum operators have their particular forms?

5. Why does a physical theory that must give real-valued results involve complex amplitudes or states?

6. Is there a realistic description of quantum mechanics?

Our philosophy is that quantum probability theory need not be the same as classical probability theory. That is, the probability need not be given by a measure. However, the predictions of quantum probability theory should agree with experimental long-run relative frequencies. We shall show that there is a realistic description of quantum mechanics. In other words, a quantum system has properties independent of observation. We also show that nonrelativistic quantum mechanics can be derived from simple postulates of this approach. Our presentation is a modified version of the discussion in Gudder [2].

2 Formulation

We denote the set of possible configurations of a physical system $\mathcal{S}$ by $\Omega$ and call $\Omega$ a sample space. If $X$ is a measurement on $\mathcal{S}$, then executing $X$ results in a unique outcome depending on the configuration $\omega$ of $\mathcal{S}$. To be precise, we define a measurement to be a map $X$ from $\Omega$ onto its range $R(X) \subseteq \mathbb{R}$ satisfying

(M1) $R(X)$ is the base space of a measure space $(R(X), \Sigma_X, \mu_X)$.

(M2) $X^{-1}(x)$ is the base space of a measurable space $(X^{-1}(x), \Sigma_X^x)$ for every $x \in R(X)$.

We call the elements of $R(X)$ $X$-outcomes, and the sets in $\Sigma_X$ are $X$-events. Note that $X^{-1}(x)$ corresponds to the set of configurations resulting in outcome $x$ when $X$ is executed, and we call $X^{-1}(x)$ the $X$-fiber over $x$. The measure $\mu_X$ represents an a priori weight due to our knowledge of the system (for example, we may know the energy of $\mathcal{S}$, or we might assume the energy has a certain value). In the case of total ignorance, the weight is taken to be counting measure in the discrete case and uniform measure in the continuous case. This framework gives a realistic theory because a configuration $\omega$ determines the properties of $\mathcal{S}$ independent of any particular observation. That is, $\omega$ determines the outcomes of all measurements simultaneously. Notice that measurements are similar to the dynamical variables of classical mechanics and the random variables of classical probability theory. The sample space $\Omega$ gives an underlying level of reality upon which traditional quantum mechanics can be constructed.

If $X$ is a measurement, an $X$-action is a pair

$$\left(S,\ \{\mu_X^x : x \in R(X)\}\right)$$

where $S\colon \Omega \to \mathbb{R}$ and $\mu_X^x$ is a measure on $(X^{-1}(x), \Sigma_X^x)$. As we shall see, actions correspond to quantum states. For simplicity, we frequently denote an action by $S$, and we remark that $S$ depends on our model of $\mathcal{S}$ and also on our knowledge of $\mathcal{S}$. We define the influence between $\omega, \omega' \in \Omega$ relative to $S$ by

$$F_S(\omega,\omega') = N_S^2 \cos[S(\omega) - S(\omega')] \qquad (1)$$

where $N_S > 0$ is a normalization constant. The appearance of the cosine in (1) is not arbitrary; it can be derived from the regularity conditions of continuity and causality [2,5].

We now make a fundamental reformulation of the probability concept [2,5]. We postulate that the probability density $P_{X,S}(x)$ of an $X$-outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing $X$. Precisely, we postulate that $F_S(\omega,\omega')$ is integrable and that

$$P_{X,S}(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} F_S(\omega,\omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega') \qquad (2)$$

Also, to ensure that $P_{X,S}(x)$ is indeed a probability density, we assume that $P_{X,S}(x)$ is measurable with respect to $\Sigma_X$ and that

$$\int_{R(X)} P_{X,S}(x)\, \mu_X(dx) = 1 \qquad (3)$$

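Equation (3) can be employed to find $N_S$. To show that $P_{X,S}(x) \ge 0$ we have

$$P_{X,S}(x) = N_S^2 \int_{X^{-1}(x)} \int_{X^{-1}(x)} \left[\cos S(\omega) \cos S(\omega') + \sin S(\omega) \sin S(\omega')\right] \mu_X^x(d\omega)\, \mu_X^x(d\omega')$$

$$= N_S^2 \left\{ \left[\int_{X^{-1}(x)} \cos S(\omega)\, \mu_X^x(d\omega)\right]^2 + \left[\int_{X^{-1}(x)} \sin S(\omega)\, \mu_X^x(d\omega)\right]^2 \right\} \ge 0 \qquad (4)$$

We conclude that $P_{X,S}(x)$ is a probability density on $(R(X), \Sigma_X, \mu_X)$.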

If $B \in \Sigma_X$ is an $X$-event, we define the $(X,S)$-probability of $B$ by

$$P_{X,S}(B) = \int_B P_{X,S}(x)\, \mu_X(dx) \qquad (5)$$

Then $P_{X,S}\colon \Sigma_X \to [0,1]$ is a probability measure on $(R(X), \Sigma_X)$ that we call the $S$-distribution of $X$. If $h\colon R(X) \to \mathbb{R}$ is $P_{X,S}$-integrable, then the $S$-expectation of $h(X)$ is defined by

$$E_S(h(X)) = \int_{R(X)} h(x)\, P_{X,S}(dx) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx) \qquad (6)$$

In particular, if $h$ is the identity function, the $S$-expectation of $X$ becomes

$$E_S(X) = \int_{R(X)} x\, P_{X,S}(x)\, \mu_X(dx) \qquad (7)$$

Influence is a strictly quantum phenomenon that is not present in classical physics. In the classical limit, $F_S(\omega,\omega')$ approaches a delta function $\delta_\omega(\omega')$. In this limit, $F_S(\omega,\omega') = 0$ for $\omega \ne \omega'$ and there is no influence between distinct configurations. We then have $P_{X,S}(x) = \mu_X^x(X^{-1}(x))$, which gives a classical probability framework.

We can extend this theory to include expectations of other functions on $\Omega$. Let $g\colon \Omega \to \mathbb{R}$ be a function that is integrable along $X$-fibers. We define the $(X,S)$-expectation of $g$ at $x$ by

$$E_{X,S}(g)(x) = \int_{X^{-1}(x)} \int_{X^{-1}(x)} g(\omega)\, F_S(\omega,\omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega') \qquad (8)$$

This is the natural generalization of (2) from a probability density to an expectation density. If $E_{X,S}(g)$ is integrable, then the $(X,S)$-expectation of $g$ is given by

$$E_{X,S}(g) = \int_{R(X)} E_{X,S}(g)(x)\, \mu_X(dx) \qquad (9)$$

In particular, if $g(\omega) = h(X(\omega))$, then

$$E_{X,S}(g)(x) = h(x)\, P_{X,S}(x)$$

and

$$E_{X,S}(g) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx) = E_S(h(X))$$

This shows that (9) is an extension of (6). We can also use this formalism to compute probabilities of events in $\Omega$. Let $A \subseteq \Omega$ and denote the characteristic function of $A$ by $\chi_A$. If $\chi_A$ is integrable along $X$-fibers, we define, analogously as in classical probability theory, the $(X,S)$-pseudoprobability of $A$ by

$$\widetilde{P}_{X,S}(A) = E_{X,S}(\chi_A)$$

It follows from (3) and (9) that $\widetilde{P}_{X,S}(\Omega) = 1$ and $\widetilde{P}_{X,S}$ is countably additive. However, $\widetilde{P}_{X,S}$ may have negative values, which is why it is called a pseudoprobability. Nevertheless, there are $\sigma$-algebras of subsets of $\Omega$ on which $\widetilde{P}_{X,S}$ is a probability measure. For example, if $A = X^{-1}(B)$ for $B \in \Sigma_X$, then it can be shown that $\widetilde{P}_{X,S}(A) = P_{X,S}(B)$ [2]. Therefore, in this case $\widetilde{P}_{X,S}$ reduces to the distribution $P_{X,S}$. We shall consider some less trivial examples later.
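That a pseudoprobability can indeed be negative is easy to see in a small discrete case. The sketch below is an illustration under assumptions that are not in the text: a single outcome whose fiber contains three configurations, counting measure on the fiber, and arbitrarily chosen action values. The names `influence` and `pseudo_probability` are hypothetical.

```python
import math

# Hypothetical discrete model: one X-fiber with three configurations,
# counting measure on the fiber, and action values S(omega) in radians.
S = {"a": 0.0, "b": 2.5, "c": 2.5}
fiber = list(S)

def influence(w1, w2, norm_sq):
    """F_S(w, w') = N_S^2 cos[S(w) - S(w')], eq. (1)."""
    return norm_sq * math.cos(S[w1] - S[w2])

# Fix N_S^2 by normalization, eq. (3): there is a single outcome x,
# so its probability density must equal 1.
raw_total = sum(influence(w1, w2, 1.0) for w1 in fiber for w2 in fiber)
norm_sq = 1.0 / raw_total

def pseudo_probability(subset):
    """E_{X,S}(chi_A): eqs. (8) and (9) with g the indicator of A, as a double sum."""
    return sum(influence(w1, w2, norm_sq) for w1 in subset for w2 in fiber)

p_a = pseudo_probability(["a"])
p_bc = pseudo_probability(["b", "c"])
print(p_a, p_bc, p_a + p_bc)
```

The two values add up to one, as additivity requires, but the value assigned to the single configuration `a` is negative because its influence with the other two configurations is destructive.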

3 Wave Functions and Hilbert Space

This section employs the formalism of Section 2 to derive the wave functions and Hilbert space of traditional quantum mechanics It is not necessary to do this because the needed probability formulas have been presented in Section 2 However as we shall see the Hilbert space formulation gives more convenient and concise notations

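Applying (4) we obtain

$$P_{X,S}(x) = \left| \int_{X^{-1}(x)} N_S\, e^{iS(\omega)}\, \mu_X^x(d\omega) \right|^2 \qquad (10)$$

We call the function

$$f_S(\omega) = N_S\, e^{iS(\omega)} \qquad (11)$$

the $S$-amplitude function and define the $(X,S)$-wave function by

$$f_{X,S}(x) = \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega) \qquad (12)$$

From (10) and (12) we obtain

$$P_{X,S}(x) = |f_{X,S}(x)|^2 \qquad (13)$$

We also have

$$F_S(\omega,\omega') = N_S^2\, \mathrm{Re}\, e^{iS(\omega)} e^{-iS(\omega')} = \mathrm{Re}\, f_S(\omega)\, \overline{f_S(\omega')} \qquad (14)$$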

Equation (10) shows how the complex numbers arise in quantum mechanics. The complex numbers are not needed for the computation of $P_{X,S}$, because we can always write $F_S(\omega,\omega')$ in the form (1). They are merely a convenience that gives a simple and concise formula. Equation (11) gives the Feynman amplitude function, which we have now derived from deeper principles, and (12) is Feynman's prescription that the amplitude of an outcome $x$ is the sum (or integral) of the amplitudes of the configurations (or alternatives) that result in $x$ when $X$ is executed.
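The equivalence between the real double sum of influences and the squared modulus of the summed amplitudes can be checked numerically. The sketch below assumes a discrete model that is not in the text: two outcomes, counting measure on the fibers and on the set of outcomes, and arbitrarily chosen action values. The names `double_sum` and `wave_function` are hypothetical.

```python
import cmath
import math

# Hypothetical discrete model: two outcomes, each fiber given as a list of
# action values S(omega); counting measure on fibers and on the outcomes.
fibers = {
    "x1": [0.0, 0.4, 1.3],
    "x2": [2.0, 2.9],
}

def double_sum(actions):
    """Unnormalized eq. (2): sum of cos[S(w) - S(w')] over all pairs in a fiber."""
    return sum(math.cos(s1 - s2) for s1 in actions for s2 in actions)

def wave_function(actions, norm):
    """Eq. (12): f_{X,S}(x) = sum over the fiber of N_S exp(i S(w))."""
    return sum(norm * cmath.exp(1j * s) for s in actions)

# Eq. (3): choose N_S so that the densities sum to one.
norm = 1.0 / math.sqrt(sum(double_sum(a) for a in fibers.values()))

# Real route, eq. (2), versus complex route, eq. (13).
density_real = {x: norm ** 2 * double_sum(a) for x, a in fibers.items()}
density_amp = {x: abs(wave_function(a, norm)) ** 2 for x, a in fibers.items()}
print(density_real)
print(density_amp)
```

Both routes give the same densities. The complex form needs one pass over each fiber, while the real form sums over all pairs, which is one way to read the remark that complex numbers are a convenience and not a necessity.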

If $B \in \Sigma_X$, applying (5) and (13) gives

$$P_{X,S}(B) = \int_B |f_{X,S}(x)|^2\, \mu_X(dx) \qquad (15)$$

and this is the usual probabilistic formula of traditional quantum mechanics. It follows from (3) that $f_{X,S}$ is a unit vector in the Hilbert space $L^2(R(X), \Sigma_X, \mu_X)$, and this derives the quantum Hilbert space and the vector form for a state. If $\mathcal{A}_X$ is a set of $X$-actions, then the Hilbert space $H_X \subseteq L^2(R(X), \Sigma_X, \mu_X)$ generated by the set of wave functions $\{f_{X,S} : S \in \mathcal{A}_X\}$ is called an $X$-Hilbert space. Some $X$-actions may not be relevant for physical reasons, so we may want $\mathcal{A}_X$ to be a proper subset of the set of all $X$-actions.

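If $g\colon \Omega \to \mathbb{R}$ is integrable along $X$-fibers and $S \in \mathcal{A}_X$, we define the $(X,S)$-amplitude average of $g$ at $x$ by

$$f_{X,S}(g)(x) = \int_{X^{-1}(x)} g(\omega)\, f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x)} g(\omega)\, e^{iS(\omega)}\, \mu_X^x(d\omega) \qquad (16)$$

Applying (8) and (14) we obtain

$$E_{X,S}(g)(x) = \mathrm{Re} \int_{X^{-1}(x)} g(\omega)\, f_S(\omega)\, \mu_X^x(d\omega) \int_{X^{-1}(x)} \overline{f_S(\omega')}\, \mu_X^x(d\omega') = \mathrm{Re}\, f_{X,S}(g)(x)\, \overline{f_{X,S}(x)}$$

It follows from (9) that

$$E_{X,S}(g) = \mathrm{Re}\, \langle f_{X,S}(g), f_{X,S} \rangle \qquad (17)$$

Define the linear operator $\hat{g}$ on $H_X$ by $\hat{g} f_{X,S}(x) = f_{X,S}(g)(x)$ and extend by linearity. If the operator $\hat{g}$ is self-adjoint on $H_X$, we call $g$ an $X$-observable and we have

$$E_{X,S}(g) = \langle \hat{g} f_{X,S}, f_{X,S} \rangle \qquad (18)$$

for all $S \in \mathcal{A}_X$. We then say that $g$ is represented by the self-adjoint operator $\hat{g}$ on $H_X$. This derives the representation of observables by self-adjoint operators.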


For a simple example of a representation, let $g\colon \Omega \to \mathbb{R}$ be a constant function $g(\omega) = c$. Then (16) gives

$$f_{X,S}(g)(x) = c \int_{X^{-1}(x)} f_S(\omega)\, \mu_X^x(d\omega) = c\, f_{X,S}(x)$$

Hence, $g$ is an $X$-observable and is represented by the self-adjoint operator $cI$. As another example, letting $g = X$ we have by (16) that

$$f_{X,S}(X)(x) = x\, f_{X,S}(x)$$

It follows that $X$ is represented by the self-adjoint operator $\hat{X}$ on $H_X$ given by $\hat{X}u(x) = x\,u(x)$. We conclude that $H_X$ is a Hilbert space in which $\hat{X}$ is diagonal. More generally, since

$$f_{X,S}(h(X))(x) = h(x)\, f_{X,S}(x) \qquad (19)$$

we see that $h(X)$ is represented by the self-adjoint operator $\widehat{h(X)}\,u(x) = h(x)\,u(x)$. Moreover, the spectral measure $P^{\hat{X}}$ is given by $P^{\hat{X}}(B)\,u(x) = \chi_B(x)\,u(x)$, and applying (15) gives

$$P_{X,S}(B) = \langle P^{\hat{X}}(B) f_{X,S}, f_{X,S} \rangle$$

which is again a standard probabilistic formula. Finally, for $A \subseteq \Omega$ the $(X,S)$-pseudoprobability becomes, by (17),

$$\widetilde{P}_{X,S}(A) = \mathrm{Re}\, \langle f_{X,S}(\chi_A), f_{X,S} \rangle \qquad (20)$$

where by (16) we have

$$f_{X,S}(\chi_A)(x) = \int_{X^{-1}(x) \cap A} f_S(\omega)\, \mu_X^x(d\omega) = N_S \int_{X^{-1}(x) \cap A} e^{iS(\omega)}\, \mu_X^x(d\omega) \qquad (21)$$

4 Spin

We now illustrate the framework presented in the last two sections by presenting a model for spin-1/2 measurements. Fix a direction corresponding to the $z$ axis and assume that the spin $j_z$ in the $z$ direction is known (either $1/2$ or $-1/2$). Let $\omega \in [0,\pi]$ denote a direction whose angle to the $z$ axis is $\omega$. By symmetry, the spin distribution should depend only on $\omega$. Let $\Omega = [0,\pi]$, $\theta \in \Omega$, and let $X\colon \Omega \to \{-1/2, 1/2\}$ be the function

$$X(\omega) = -1/2 \text{ for } \omega \in [0,\theta] \quad \text{and} \quad X(\omega) = 1/2 \text{ for } \omega \in (\theta,\pi]$$

We make $X$ into a measurement by defining

$$\mu_X(\{-1/2\}) = \mu_X(\{1/2\}) = 1$$

and endowing $X^{-1}(-1/2) = [0,\theta]$ and $X^{-1}(1/2) = (\theta,\pi]$ with the usual Borel structure. The function $X$ corresponds to a spin-1/2 measurement in the $\theta$ direction. Letting $\theta$ vary, we obtain an infinite number of spin measurements, each applied in a different direction. Observe that a sample point $\omega \in \Omega$ determines the spin in every direction simultaneously.

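For $j_z = 1/2$, we define the $X$-action $\left(S, \{\mu_X^{-1/2}, \mu_X^{1/2}\}\right)$ given by $S(\omega) = \omega$, and $\mu_X^{-1/2}, \mu_X^{1/2}$ are $\mu/2$, where $\mu$ is Lebesgue measure restricted to $X^{-1}(-1/2)$, $X^{-1}(1/2)$, respectively. We then have

$$F_S(\omega,\omega') = \cos(\omega - \omega')$$

(we shall see that $N_S = 1$). The probabilities become

$$P_{X,S}(-1/2) = \tfrac14 \int_0^\theta \int_0^\theta \cos(\omega - \omega')\, d\omega\, d\omega' = \tfrac14 \left[\int_0^\theta \cos\omega\, d\omega\right]^2 + \tfrac14 \left[\int_0^\theta \sin\omega\, d\omega\right]^2 = \tfrac14 \sin^2\theta + \tfrac14 (1 - \cos\theta)^2 = \sin^2\tfrac{\theta}{2} \qquad (22)$$

$$P_{X,S}(1/2) = \tfrac14 \int_\theta^\pi \int_\theta^\pi \cos(\omega - \omega')\, d\omega\, d\omega' = \tfrac14 \left[\int_\theta^\pi \cos\omega\, d\omega\right]^2 + \tfrac14 \left[\int_\theta^\pi \sin\omega\, d\omega\right]^2 = \tfrac14 \sin^2\theta + \tfrac14 (1 + \cos\theta)^2 = \cos^2\tfrac{\theta}{2} \qquad (23)$$

Since $P_{X,S}(-1/2) + P_{X,S}(1/2) = 1$, we see that $N_S = 1$. Notice that (22) and (23) are the usual probability distribution for spin in the $\theta$ direction when $j_z = 1/2$.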
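The closed forms (22) and (23) can be cross-checked by evaluating the double integral of the influence directly. The sketch below is a numerical illustration only: it assumes a simple midpoint rule, an arbitrary test angle, and the hypothetical function name `influence_integral`.

```python
import math

def influence_integral(lo, hi, n=400):
    """Midpoint-rule estimate of (1/4) * double integral of cos(w - w') over [lo, hi]^2.

    The factor 1/4 comes from the two fiber measures mu/2; N_S = 1 is assumed.
    """
    h = (hi - lo) / n
    nodes = [lo + (k + 0.5) * h for k in range(n)]
    total = sum(math.cos(w1 - w2) for w1 in nodes for w2 in nodes)
    return 0.25 * total * h * h

theta = 1.1  # arbitrary test angle in radians
p_down = influence_integral(0.0, theta)      # outcome -1/2, fiber [0, theta]
p_up = influence_integral(theta, math.pi)    # outcome +1/2, fiber (theta, pi]

# Compare with the closed forms sin^2(theta/2) and cos^2(theta/2).
print(p_down, math.sin(theta / 2) ** 2)
print(p_up, math.cos(theta / 2) ** 2)
```

The two numerical values agree with $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$ up to the discretization error and add up to one, consistent with $N_S = 1$.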

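For $j_z = -1/2$, we define the $X$-action $\left(S', \{\nu_X^{-1/2}, \nu_X^{1/2}\}\right)$ given by $S'(\omega) = \omega$ for $\omega \in (0,\pi)$ and $S'(\omega) = -\pi/2$ for $\omega \in \{0,\pi\}$, and $\nu_X^{-1/2} = \delta_0 + \mu/2$, $\nu_X^{1/2} = \delta_\pi + \mu/2$, where $\delta_0, \delta_\pi$ are the Dirac point measures at $0, \pi$, respectively. A similar but more tedious calculation gives

$$P_{X,S'}(-1/2) = \cos^2\tfrac{\theta}{2}, \qquad P_{X,S'}(1/2) = \sin^2\tfrac{\theta}{2}$$

which is the usual distribution for spin in the $\theta$ direction when $j_z = -1/2$.

We now examine the wave functions and Hilbert space corresponding to this model. The $S$-amplitude function becomes $f_S(\omega) = e^{i\omega}$, and the $(X,S)$-wave function $f_{X,S}$ is given by

$$f_{X,S}(-1/2) = \tfrac12 \int_0^\theta e^{i\omega}\, d\omega = \tfrac{i}{2}\left(1 - e^{i\theta}\right)$$

$$f_{X,S}(1/2) = \tfrac12 \int_\theta^\pi e^{i\omega}\, d\omega = \tfrac{i}{2}\left(1 + e^{i\theta}\right)$$

The $S'$-amplitude function becomes $f_{S'}(\omega) = e^{i\omega}$ for $\omega \in (0,\pi)$ and $f_{S'}(\omega) = -i$ for $\omega \in \{0,\pi\}$, and the $(X,S')$-wave function $f_{X,S'}$ is given by

$$f_{X,S'}(-1/2) = \int_{[0,\theta]} f_{S'}(\omega)\, \nu_X^{-1/2}(d\omega) = -i + \tfrac12 \int_0^\theta e^{i\omega}\, d\omega = -\tfrac{i}{2}\left(1 + e^{i\theta}\right)$$

$$f_{X,S'}(1/2) = \int_{(\theta,\pi]} f_{S'}(\omega)\, \nu_X^{1/2}(d\omega) = -i + \tfrac12 \int_\theta^\pi e^{i\omega}\, d\omega = -\tfrac{i}{2}\left(1 - e^{i\theta}\right)$$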

The $X$-Hilbert space is clearly $\mathbb{C}^2$, and we can represent $f_{X,S}$ and $f_{X,S'}$ in $\mathbb{C}^2$ by the unit vectors

$$v_S = \tfrac12 \left(1 - e^{i\theta},\ 1 + e^{i\theta}\right), \qquad v_{S'} = \tfrac12 \left(1 + e^{i\theta},\ 1 - e^{i\theta}\right)$$

Notice that $v_S \perp v_{S'}$. Also, when $\theta = 0$, $v_S = (0,1)$ and $v_{S'} = (1,0)$, which are the usual eigenvectors for the spin-1/2 operator in the $z$ direction. We can treat this as a measurement and the general $X$ as an observable. It can be shown that the matrix for $\hat{X}$ in the standard basis $(1,0)$, $(0,1)$ becomes

$$\hat{X} = \frac12 \begin{pmatrix} \cos\theta & i\sin\theta \\ -i\sin\theta & -\cos\theta \end{pmatrix} = \frac12 \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \frac12 \sin\theta \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$

which is the usual form for a spin-1/2 matrix in the direction $\theta$.

We can extend this analysis to higher order spins [3]. Moreover, this framework gives a realistic model for the Bohm version of the EPR problem [4]. The reason that Bell's theorem is not contradicted is that Bell's inequalities are derived using classical probability theory, and we have employed quantum probability theory.
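The basic properties of the two state vectors of the spin model can be verified with a few lines of complex arithmetic. The sketch below makes assumptions beyond the text: the vectors are taken as the wave functions with their overall phase factors dropped, the components are ordered as (outcome $-1/2$, outcome $+1/2$), and the names `spin_vectors` and `inner` are hypothetical.

```python
import cmath
import math

def spin_vectors(theta):
    """Unit vectors in C^2 for j_z = +1/2 (v_s) and j_z = -1/2 (v_sp).

    Components are ordered as (outcome -1/2, outcome +1/2); overall phases
    of the wave functions are dropped, which is an assumption of this sketch.
    """
    e = cmath.exp(1j * theta)
    v_s = ((1 - e) / 2, (1 + e) / 2)
    v_sp = ((1 + e) / 2, (1 - e) / 2)
    return v_s, v_sp

def inner(u, v):
    """Standard inner product on C^2, conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

theta = 0.7  # arbitrary test angle in radians
v_s, v_sp = spin_vectors(theta)

norm_s = abs(inner(v_s, v_s))
norm_sp = abs(inner(v_sp, v_sp))
overlap = abs(inner(v_s, v_sp))
prob_down = abs(v_s[0]) ** 2              # should match sin^2(theta/2)
expected_down = math.sin(theta / 2) ** 2
print(norm_s, norm_sp, overlap, prob_down, expected_down)
```

Both vectors have unit norm, they are orthogonal for every angle, and the squared moduli of their components reproduce the spin probabilities of the model.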


5 Traditional Quantum Mechanics

We now show that this formalism contains traditional nonrelativistic quantum mechanics. For simplicity we consider a single spinless particle in one dimension, although this work easily generalizes to three dimensions. We take our sample space to be the phase space

$$\Omega = \mathbb{R}^2 = \{(q,p) : q, p \in \mathbb{R}\}$$

The two most important measurements are the position and momentum given by $Q(q,p) = q$, $P(q,p) = p$ respectively. However, as is frequently done in quantum mechanics, we shall investigate the $Q$-representation of the system. In this case $Q$ is considered a measurement and $P : \Omega \to \mathbb{R}$ is viewed as a function on $\Omega$ which, as we shall show, is a $Q$-observable.

Each $Q$-fiber $Q^{-1}(q) = \{(q,p) : p \in \mathbb{R}\}$ can be identified with $\mathbb{R}$. We make $Q$ a measurement by endowing its range $R(Q) = \mathbb{R}$ with Lebesgue measure and its fibers with the usual Borel structure of $\mathbb{R}$. Only certain $Q$-actions $(S, \mu_Q^q : q \in \mathbb{R})$ correspond to traditional quantum states, and these can be derived from natural postulates. We assume that $\mu_Q^q$ is absolutely continuous relative to Lebesgue measure on $\mathbb{R}$ and that $\mu_Q^q$ is independent of $q$. This is because sets of Lebesgue measure zero are too small to have any effect on the outcomes of position measurements, and there is no a priori reason to distinguish between $Q$-fibers. It follows from the Radon-Nikodym theorem that there exists a nonnegative Lebesgue measurable function $\xi : \mathbb{R} \to \mathbb{R}$ such that

$$\mu_Q^q(dp) = (2\pi\hbar)^{-1/2}\,\xi(p)\,dp \qquad (24)$$

We take $S : \Omega \to \mathbb{R}$ to have the form

$$S(q,p) = \frac{qp}{\hbar} + \eta(p) \qquad (25)$$

This form is natural because $qp$ is the classical action and adding a function of momentum gives a quantum fluctuation. We could also add a function of $q$, but it is easy to see that this would just multiply the wave function by a constant phase, which would not alter the probabilistic formulas. Denote by $A_Q$ the set of $Q$-actions that have the form (24), (25).

Applying (12) for $S \in A_Q$, we find that the $(Q,S)$-wave function becomes

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \xi(p)\,e^{i\eta(p)}\,e^{iqp/\hbar}\,dp$$

Defining

$$\phi(p) = \xi(p)\,e^{i\eta(p)} \qquad (26)$$


and denoting the inverse Fourier transform by $\vee$, we have

$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \phi(p)\,e^{iqp/\hbar}\,dp = \phi^\vee(q) \qquad (27)$$

In order for (3) to be satisfied, $\phi^\vee$ must be a unit vector in $L^2(\mathbb{R}, dq)$ or, equivalently, $\phi$ must be a unit vector in $L^2(\mathbb{R}, dp)$. However, every vector in $L^2(\mathbb{R}, dp)$ has the form (26) for some functions $\xi : \mathbb{R} \to \mathbb{R}^+$, $\eta : \mathbb{R} \to \mathbb{R}$. It follows that the $Q$-Hilbert space becomes the traditional Hilbert space $H_Q = L^2(\mathbb{R}, dq)$ and $f_{Q,S}$ is the usual wave function (or state).

Let $(S, \mu_Q^q : q \in \mathbb{R})$ be a fixed $Q$-action in $A_Q$ of the form (24), (25), and let $\psi(q) = f_{Q,S}(q)$, $\phi(p) = \xi(p)e^{i\eta(p)}$. Applying (16) and (27) we have

$$f_{Q,S}(P)(q) = (2\pi\hbar)^{-1/2} \int p\,\phi(p)\,e^{iqp/\hbar}\,dp = -i\hbar\frac{d}{dq}\,(2\pi\hbar)^{-1/2}\int \phi(p)\,e^{iqp/\hbar}\,dp = -i\hbar\frac{d\psi}{dq}(q)$$

More generally, if $n$ is a positive integer we obtain

$$f_{Q,S}(P^n)(q) = \left(-i\hbar\frac{d}{dq}\right)^n \psi(q) \qquad (28)$$

Moreover, applying (18) we have

$$E_{Q,S}(P^n) = \int \left[\left(-i\hbar\frac{d}{dq}\right)^n \psi(q)\right] \bar{\psi}(q)\,dq$$

which is the usual quantum expectation formula. We conclude from (28) that $P^n$ is a $Q$-observable and is represented by the operator $(-i\hbar\,d/dq)^n$. Moreover, if $V : \mathbb{R} \to \mathbb{R}$ is measurable, we see from (19) that $V(Q)$ is a $Q$-observable and is represented by the operator $V(Q)^\wedge u(q) = V(q)u(q)$. This, together with our observation concerning $P$, gives a derivation of the Bohr correspondence principle.

We now consider probability distributions. We have already seen in (15) that

$$P_{Q,S}(B) = \int_B |\psi(q)|^2\,dq$$

which is the usual distribution of $Q$. It is more interesting to compute the probability of $A = P^{-1}(B)$ for the momentum function $P$. We have from (21) that

$$f_{Q,S}(\chi_A)(q) = (2\pi\hbar)^{-1/2} \int_B \phi(p)\,e^{iqp/\hbar}\,dp = (\chi_B\phi)^\vee(q)$$


Hence, by (20) and the Plancherel formula, we obtain

$$P_{Q,S}\left[P^{-1}(B)\right] = \int (\chi_B\phi)^\vee(q)\,\bar{\psi}(q)\,dq = \int (\chi_B\phi)(p)\,\bar{\phi}(p)\,dp = \int_B |\phi(p)|^2\,dp$$

Again this is the usual momentum distribution. This gives an example in which $P_{Q,S}$ is an actual probability measure on a $\sigma$-algebra of subsets of $\Omega$.
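The relations of this section between the $Q$-representation and the momentum amplitude can be illustrated numerically. The sketch below assumes units with $\hbar = 1$, a Gaussian modulus $\xi$ and a quadratic phase $\eta$; these choices, the grids and all names are hypothetical and serve only as an example. It builds $\psi$ as the inverse Fourier transform of $\phi = \xi e^{i\eta}$, then compares the expectation of $P$ and the probability of $P^{-1}(B)$ for $B = [0, 2]$ computed in the $Q$-representation with the same quantities computed directly from $|\phi|^2$.

```python
import cmath
import math

HBAR = 1.0   # illustrative units with hbar = 1 (assumption)
P0 = 1.0     # centre of the momentum amplitude (hypothetical choice)

def phi(p):
    """phi(p) = xi(p) * exp(i * eta(p)) with Gaussian xi and quadratic eta."""
    xi = math.pi ** -0.25 * math.exp(-0.5 * (p - P0) ** 2)
    return xi * cmath.exp(1j * 0.3 * p * p)

def midpoints(a, b, n):
    """Midpoints of n equal cells on [a, b] and the cell width."""
    h = (b - a) / n
    return [a + (k + 0.5) * h for k in range(n)], h

P_GRID, DP = midpoints(-6.0, 8.0, 280)
Q_GRID, DQ = midpoints(-10.0, 10.0, 400)

def inverse_transform(values):
    """q -> (2 pi hbar)^(-1/2) * sum_p values[p] * exp(i q p / hbar) * dp on Q_GRID."""
    c = DP / math.sqrt(2.0 * math.pi * HBAR)
    return [c * sum(v * cmath.exp(1j * q * p / HBAR) for v, p in zip(values, P_GRID))
            for q in Q_GRID]

phi_vals = [phi(p) for p in P_GRID]
psi = inverse_transform(phi_vals)

norm_p = sum(abs(v) ** 2 for v in phi_vals) * DP
norm_q = sum(abs(v) ** 2 for v in psi) * DQ

# E(P) in the Q-representation, with -i*hbar*d/dq as a central difference.
mean_p_q = sum(
    (-1j * HBAR * (psi[k + 1] - psi[k - 1]) / (2.0 * DQ) * psi[k].conjugate()).real
    for k in range(1, len(psi) - 1)
) * DQ
mean_p_p = sum(p * abs(v) ** 2 for v, p in zip(phi_vals, P_GRID)) * DP

# Probability that the momentum lies in B = [0, 2], computed both ways.
in_B = [v if 0.0 <= p <= 2.0 else 0.0 for v, p in zip(phi_vals, P_GRID)]
chi_psi = inverse_transform(in_B)
prob_q = sum((a * b.conjugate()).real for a, b in zip(chi_psi, psi)) * DQ
prob_p = sum(abs(v) ** 2 for v in in_B) * DP
```

Within discretization error, `norm_q` matches `norm_p`, `mean_p_q` matches `mean_p_p`, and `prob_q` matches `prob_p`, which is the content of the Plancherel step.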

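Until now we have treated time as fixed. We now briefly consider dynamics. Let $\psi(q,t)$ be a smooth function. Our previous formulas hold with $\psi(q)$ replaced by $\psi(q,t)$ and $\mu_Q^q$ replaced by $\mu_{Q,t}^q$. We now derive Schrödinger's equation from Hamilton's equation of classical mechanics, $dp/dt = -\partial H/\partial q$. Suppose the energy function has the form

$$H(q,p) = \frac{p^2}{2m} + V(q)$$

We assume that Hamilton's equation holds in the amplitude average. Applying (16) we have

$$\frac{\partial}{\partial t}\int p\,f_S(q,p,t)\,\mu_{Q,t}^q(dp) = -\frac{\partial}{\partial q}\int H(q,p)\,f_S(q,p,t)\,\mu_{Q,t}^q(dp)$$

Hence

$$\frac{\partial}{\partial t}\int p\,\phi(p,t)\,e^{iqp/\hbar}\,dp = -\frac{\partial}{\partial q}\int H(q,p)\,\phi(p,t)\,e^{iqp/\hbar}\,dp$$

Applying (28) and (19) gives

$$-i\hbar\frac{\partial}{\partial t}\frac{\partial\psi}{\partial q} = -\frac{\partial}{\partial q}\left[-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial q^2} + V(q)\psi\right]$$

Interchanging the order of differentiation on the left side of this equation and integrating with respect to $q$ gives Schrödinger's equation.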
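The final step can be written out as follows. This is a sketch that assumes the integration "constant" $c(t)$ arising from the $q$-integration vanishes, for instance because $\psi$ and its derivatives decay as $|q| \to \infty$; that assumption is not stated explicitly in the text.

```latex
\begin{align*}
-i\hbar\,\frac{\partial}{\partial q}\frac{\partial\psi}{\partial t}
  &= -\frac{\partial}{\partial q}\left[-\frac{\hbar^2}{2m}
     \frac{\partial^2\psi}{\partial q^2} + V(q)\,\psi\right]
  && \text{(order of differentiation interchanged)} \\
i\hbar\,\frac{\partial\psi}{\partial t}
  &= -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial q^2} + V(q)\,\psi + c(t)
  && \text{(integrated in } q\text{)} \\
i\hbar\,\frac{\partial\psi}{\partial t}
  &= -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial q^2} + V(q)\,\psi
  && \text{(assuming } c(t) = 0\text{)}
\end{align*}
```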

6 Concluding Remarks

In this paper we have presented a realistic, contextual, nonlocal approach to quantum probability theory. The formalism is realistic because each sample point $\omega \in \Omega$ uniquely determines a value $X(\omega)$ for any measurement $X$. In this way, a physical system $\mathcal{S}$ possesses all of its attributes independent of whether they are measured. Although the sample space $\Omega$ exists and we can discuss its properties, $\Omega$ is not physically accessible in general. This is because the sample points may not correspond to physical states which can be prepared in the laboratory or at least exist in nature. We may think of $\Omega$ as a hidden variable completion of quantum mechanics. This approach is contextual because it is necessary to specify a particular basic measurement $X$. Once $X$ is specified, a Hilbert space $H_X$ can be constructed, and $H_X$ provides an $X$-representation for $\mathcal{S}$. Of course, one may choose a different basic measurement $Y$, and then the $Y$-representation will give a different picture of $\mathcal{S}$. For example, in traditional quantum mechanics we usually choose the position representation or the momentum representation to describe $\mathcal{S}$. For a given basic measurement $X$ and an action $S$, we have given a method for constructing the probability distribution $P_{X,S}$ of $X$. We have shown that $P_{X,S}$ may be found in terms of a state vector $f_{X,S} \in H_X$, and these correspond to physically accessible states. In $H_X$ the measurement $X$ and functions of $X$ are diagonal and hence represented by random variables. Other measurements, which we call observables to distinguish them from $X$, are represented by self-adjoint operators on $H_X$, and their usual distributions follow in a natural way. The theory is nonlocal because the distribution $P_{X,S}$ is specified by an influence function $F_S(\omega, \omega')$. This function provides an influence between pairs of sample points which, in a spacetime model, may be spacelike separated.

There is considerable controversy concerning various interpretations of and approaches to probability theory. I believe that three types of probabilities are necessary for a description of quantum mechanics. The probabilities and distributions of measurement results in the laboratory are usually computed using long run relative frequencies. Even though a measurement $X$ may involve a microscopic system $\mathcal{S}$ (for example, the position of an electron), $\mathcal{S}$ must interact with a macroscopic apparatus in order to obtain an observable outcome. The theoretician's task is to find the distribution $P_X$ of $X$. This theoretical distribution should agree with the long run relative frequencies found in the laboratory or give predictions that can eventually be tested experimentally. Since there are serious, well-known difficulties in dealing with abstract theories of relative frequencies, it is convenient and perhaps even necessary to use the standard Kolmogorovian probability theory for describing $P_X$. Now $P_X$ is a probability measure that satisfies the axioms of standard probability theory. However, the method for computing $P_X$ is characteristic of quantum mechanics and is not found in any classical theory. Richard Feynman, whose work has motivated the present paper, once said that nobody really understands quantum mechanics. I think that what he meant is that nobody understands why nature has chosen to compute probabilities in this unusual way. As presented here, the probability density for $P_X$ is found by employing an influence function. The advantage of this method is that it is physically motivated and avoids complex numbers. An equivalent method, which is usually employed in quantum mechanics, is to take the absolute value squared of the wave function.

The quantum probability approach that we have presented contains standard probability theory as a special case. Thus we only need two types of probabilities to describe quantum mechanics. Standard probability theory, as developed by Kolmogorov, is a distillation of hundreds of years of experience with empirical and theoretical studies of chance phenomena. The founders of the subject were concerned with games of chance, statistics and the behavior of macroscopic objects. They were not aware of microscopic objects and quantum mechanics and had no reason to design a probability theory for describing such situations. It is therefore not surprising that a new theory, called quantum probability theory, had to be developed to serve these purposes.

References

1. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).

2. S. Gudder, Int. J. Theor. Phys. 32, 1747 (1993).

3. S. Gudder, Int. J. Theor. Phys. 32, 824 (1993).

4. S. Gudder, Quantum probability and the EPR argument, Ann. Found. Louis de Broglie 20, 167 (1994).

5. G. Hemion, Int. J. Theor. Phys. 29, 1335 (1990).


INNOVATION APPROACH TO STOCHASTIC PROCESSES AND QUANTUM DYNAMICS

TAKEYUKI HIDA
Department of Mathematics, Meijo University, Tenpaku, Nagoya 468-8502

and Nagoya University (Professor Emeritus)

The theory of stochastic processes developed extensively in the twentieth century, and a beautiful connection with quantum dynamics was established. It seems to be a good time now to revisit the foundations of stochastic processes and quantum mechanics, with the hope that the attempt will suggest some further directions for these two intimately related disciplines. For this purpose we review some topics in white noise analysis and observe motivations from physics and how they have actually been realized.

1 Introduction

We shall discuss the analysis of random complex systems and its connection with quantum dynamics. In particular, we analyse some stochastic processes $X(t)$ and random fields $X(C)$ by using the innovation, and we revisit quantum dynamics in connection with stochastic analysis. Our aim is to study those random complex systems, including quantum fields, by using white noise analysis.

The basic idea of our analysis is that we first discuss stochastic processes by taking a basic and standard system of random variables, and then express the given process as a function of that system. The system of variables from which we start is called idealized elemental random variables (abbr. i.e.r.v.). The idea of taking such a system is in line with

Reductionism.

One might think that this thought is similar to reductionism in physics. Before we come to this point, it is interesting to refer to the lecture given by P.W. Anderson at the University of Tokyo in 1999. His title included Emergence together with reductionism, and he gave a good interpretation.

Following the reductionism, the next step is to form a function of the i.e.r.v.'s so that the function represents the given random complex system. It is nothing but

Synthesis.


Then naturally follows the analysis of the functions which have been formed in our setup. The goal is therefore the analysis of the function (which may be called a functional) in order to identify the random complex system in question.

The first step, taking a suitable system of i.e.r.v.'s, has been influenced by the way the notion of a stochastic process is understood. We therefore give a quick review of the definition of a stochastic process, starting from the ideas of J. Bernoulli (Ars Conjectandi, 1713), S. Bernstein (1933) and P. Lévy (1947), where it is suggested that we consider the innovation of a stochastic process. The innovation is viewed as a system of i.e.r.v.'s, which will be specified to be a white noise.

The analysis of white noise functionals has many significant characteristics which are fitting for the investigation of quantum mechanical phenomena. Thus we shall be able to show examples to which white noise theory is efficiently applied.

Thanks to great contributions by many authors, the theory developed along our line has reached its present state.

AMS 2000 Mathematics Subject Classification: 60H40 (White Noise Theory).

2 Review of defining a stochastic process and white noise analysis

There is a traditional, and in fact original, way of defining a stochastic process. Let us refer to Lévy's definition of a stochastic process given in his book [3], Chapt. II: "une fonction aléatoire X(t) du temps t dans lequel le hasard intervient à chaque instant" (a random function $X(t)$ of time $t$ in which chance intervenes at each instant). The "hasard" (chance) is expressed as an infinitesimal random variable $Y(t)$ which is independent of the observed values of $X(s)$, $s \le t$, in the past. The random variable $Y(t)$ is nothing but the innovation of the process $X(t)$.

Formally speaking, the $Y(t)$, which is usually an infinitesimal random variable, contains the information that was gained by the $X(t)$ during the time interval $[t, t+dt)$. To express this idea, P. Lévy proposed a formula, called an infinitesimal equation, for the variation $\delta X(t)$:

$$\delta X(t) = \Phi(X(s), s \le t, Y(t), t, dt)$$

where $\Phi$ is a non-random functional. Although this equation has only a formal significance, it is still very suggestive.

It would be fine if the given process were expressed as a functional of $Y(t)$ in the following manner:

$$X(t) = \Psi(Y(s), s \le t, t)$$

where $\Psi$ is a sure (non-random) function. Such a trick may be called the Reduction and Synthesis method. The above expression is causal in the sense that $X(t)$ is expressed as a function of $Y(s)$, $s \le t$, and never uses $Y(s)$ with $s > t$.

Note that this method of defining a stochastic process is more important than the function space type distribution.

The collection $\{Y(s)\}$ is a system of i.e.r.v.'s, so that the above expression is a realization of the synthesis. We are particularly interested in the case where the system of i.e.r.v.'s is taken to be a white noise, and we are thus ready to discuss white noise analysis.
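A discrete-time analogue may help to fix the ideas of reduction, synthesis and causality. The sketch below is not taken from the paper: it assumes a first-order autoregressive sequence $X_n = aX_{n-1} + Y_n$ driven by independent Gaussian variables $Y_n$, which play the role of the innovation. The coefficient and the function names are hypothetical.

```python
import random

def simulate_ar1(a, innovations, x0=0.0):
    """X_n = a * X_{n-1} + Y_n: the new randomness entering at step n is Y_n."""
    xs, x = [], x0
    for y in innovations:
        x = a * x + y
        xs.append(x)
    return xs

def recover_innovations(a, xs, x0=0.0):
    """Reduction: Y_n = X_n - a * X_{n-1} uses only values of X up to time n."""
    prev, ys = x0, []
    for x in xs:
        ys.append(x - a * prev)
        prev = x
    return ys

def causal_synthesis(a, ys, n):
    """Synthesis: X_n = sum over k <= n of a^(n-k) * Y_k, never using Y_k with k > n."""
    return sum(a ** (n - k) * ys[k] for k in range(n + 1))

rng = random.Random(0)
A = 0.8  # hypothetical coefficient
Y = [rng.gauss(0.0, 1.0) for _ in range(50)]
X = simulate_ar1(A, Y)
Y_rec = recover_innovations(A, X)
```

Here `recover_innovations` returns the driving sequence exactly, and `causal_synthesis(A, Y, n)` reproduces `X[n]` from the innovations up to time `n` only, which is the discrete counterpart of the causal expression above.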

So far we have discussed the theory only for a stochastic process. It is in fact quite natural to extend the theory to a random field $X(C)$ indexed by an ovaloid, say a contour or closed surface. A generalization of the infinitesimal equation is

$$\delta X(C) = \Phi(X(C'), C' < C, Y(s), s \in C, C, \delta C)$$

The system $\{Y(s), s \in C\}$ is the innovation. Such a generalization can be done because of the use of the innovation.

We note that white noise analysis has many advantages, which are quickly mentioned below.

1) It is an infinite dimensional analysis. Our stochastic analysis can be carried out systematically by taking a white noise as a system of i.e.r.v.'s to express the given random complex systems. Indeed, the analysis is essentially infinite dimensional, as will be seen in what follows.

2) Infinite dimensional harmonic analysis. The white noise measure, supported by the space $E^*$ of generalized functions on the parameter space $\mathbb{R}^d$, is invariant under the rotations of $E$. Hence a harmonic analysis arising from the rotation group will naturally be discussed. The group contains significant subgroups which describe essentially infinite dimensional characters.

3) Generalizations to random fields $X(C)$ are discussed in a manner similar to $X(t)$, so far as the innovation is concerned. Needless to say, $X(C)$ enjoys more profound characteristic properties.


4) Connection with classical functional analysis. The so-called $S$-transform applied to white noise functionals provides a bridge connecting white noise functionals and classical functionals of ordinary functions. We can therefore appeal to the classical theory of functionals established in the first half of the twentieth century.

5) Good connection with quantum dynamics, as will be seen in the next section.

For the differential and integral calculus of white noise functionals using the annihilation operators $\partial_t$ and creation operators $\partial_t^*$, the class of generalized functionals, harmonic analysis including Fourier analysis, the Lévy Laplacian $\Delta_L$, complexification and other theories, we refer to the monograph [12] and other literature.

3 Relations to Quantum Dynamics

We now explain briefly some topics in quantum dynamics to which white noise theory can be applied. What we are going to present here may seem to be topics separate from each other, but behind the description there is always a white noise.

1) Representation of the canonical commutation relations for a Boson field. This topic is well known.

Let $\dot{B}(t)$ be a white noise and let $\partial_t$ denote the $\dot{B}(t)$-derivative. Then it is an annihilation operator, and its dual operator $\partial_t^*$ stands for the creation. They satisfy the commutation relations

$$[\partial_t, \partial_s] = [\partial_t^*, \partial_s^*] = 0$$

$$[\partial_t, \partial_s^*] = \delta(t - s)$$

From these, a representation of the canonical commutation relations is given for Bosonic particles.

It is noted that the following assertion holds.

Proposition. There are continuously many irreducible representations of the canonical commutation relations.

White noises with different variances are inequivalent to each other, which proves the assertion.

2) Reflection positivity (T-positivity)


A stationary multiple Markov (say $N$-ple Markov) Gaussian process has a spectral density function $f(\lambda)$ of a particular type. Namely

On the other hand, it is proved that

Proposition. The covariance function $\gamma(h)$ of a stationary T-positive Gaussian process is expressed in the form

$$\gamma(h) = \int_0^\infty \exp[-|h|x]\,d\nu(x)$$

where $\nu$ is a positive finite measure.

By applying this assertion to the $N$-ple Markov Gaussian process, we claim that T-positivity requires $c_k > 0$ for every $k$.

Note that in the strictly $N$-ple Markov case this condition is not satisfied.

It is our hope that this result will be generalized to the case of general stochastic processes with multiple Markov properties.

3) A path integral formulation

One of the realizations of the Dirac-Feynman idea of the path integral may be given by the following method using generalized white noise functionals. First we establish a class of possible trajectories when a Lagrangian $L(x, \dot{x})$ is given. Let $x$ be the classical trajectory determined by the Lagrangian. As soon as we come to quantum dynamics, we have to consider fluctuating paths $y$. We propose that they are given by

$$y(s) = x(s) + \sqrt{\frac{\hbar}{m}}\,B(s)$$

The average over the paths is replaced with the expectation with respect to the probability measure for which the Brownian motion $B(t)$ is defined. Thus the propagator $G(y_1, y_2, t)$ is given by

$$E\left\{N \exp\left[\frac{i}{\hbar}\int_0^t L(y, \dot{y})\,ds + \frac{1}{2}\int_0^t \dot{B}(s)^2\,ds\right] \cdot \delta(y(t) - y_2)\right\}$$

With this setup, actual computations have been done to get exact formulae for the propagators (L. Streit et al.).


4) Dirichlet forms in infinite dimensions. With the help of positive generalized white noise functionals, we prove criteria for closability of energy forms. See [3].

5) Random fields $X(C)$

A random field $X(C)$ depending on a parameter $C$, which is taken to be a certain smooth and closed manifold in a Euclidean space, naturally enjoys a more complex probabilistic structure than a stochastic process $X(t)$ depending on the time $t$. It therefore has good connections with quantum fields in physics.

We are particularly interested in the case where $X(C)$ has a causal representation in terms of white noise. Some typical examples are listed below.

5.1) Markov property and multiple Markov properties. Dirac's paper [1] suggests how to define the Markov property. For the Gaussian case a reasonable definition has been given (see [15]) by using the canonical representation in terms of white noise, where the canonical property of a representation can be introduced as a generalization of that for a Gaussian process. Some attempts have been made for some non-Gaussian fields (see [17]). For the Gaussian case, multiple Markov properties have been defined. It is now an interesting question to find conditions under which a Gaussian random field satisfies a multiple Markov property.

5.2) Stochastic variational equations of Langevin type. Let $C$ run through a class $\mathbf{C}$ of concentric circles. The problem is to solve the following stochastic variational equation of Langevin type:

$$\delta X(C) = -\lambda X(C) \int_C \delta n(s)\,ds + X_0 \int_C v(s)\,\partial_s^*\,\delta n(s)\,ds$$

The explicit solution is given by using the $S$-transform and the classical theory of functionals.

5.3) We have made an attempt to define a random field $\{X(C), C \in \mathbf{C}\}$ which satisfies conformal invariance. Reversibility can also be discussed.

Example. Linear parameter case: a Brownian bridge. For $t \in [0,1]$ it is defined by

$$X(t) = (1 - t)\int_0^t \frac{1}{1-u}\,\dot{B}(u)\,du$$


Reversibility can be guaranteed not only by the time reflection but also by whiskers (one-parameter subgroups defined by deformation of the parameter) in the conformal group that leaves the unit time interval invariant.
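The canonical representation of the Brownian bridge, $X(t) = (1-t)\int_0^t (1-u)^{-1}\dot{B}(u)\,du$, can be explored by simulation. The sketch below treats $\dot{B}(u)\,du$ as a Brownian increment on a uniform grid and estimates the variance of $X(t)$, which for a Brownian bridge is $t(1-t)$ and is therefore symmetric under the time reflection $t \mapsto 1-t$. The grid size, number of paths, seed and function names are illustrative assumptions.

```python
import math
import random

def bridge_path(n_steps, rng):
    """Discretise X(t) = (1 - t) * int_0^t dB(u) / (1 - u) on a uniform grid.

    Returns X at t_k = k / n_steps for k = 0 .. n_steps - 1; the endpoint
    t = 1 is left out because of the 1 / (1 - u) singularity.
    """
    dt = 1.0 / n_steps
    integral, path = 0.0, [0.0]
    for k in range(n_steps - 1):
        u_mid = (k + 0.5) * dt                      # evaluate 1/(1-u) at the cell midpoint
        integral += rng.gauss(0.0, math.sqrt(dt)) / (1.0 - u_mid)
        path.append((1.0 - (k + 1) * dt) * integral)
    return path

def sample_variances(n_paths=4000, n_steps=100, seed=1):
    """Monte Carlo estimate of E[X(t_k)^2]; the mean is zero, so this is the variance."""
    rng = random.Random(seed)
    sums = [0.0] * n_steps
    for _ in range(n_paths):
        for k, x in enumerate(bridge_path(n_steps, rng)):
            sums[k] += x * x
    return [s / n_paths for s in sums]

var = sample_variances()
```

With these settings `var[50]` estimates the variance at $t = 0.5$, and `var[25]`, `var[75]` estimate it at the reflected times $t = 0.25$ and $t = 0.75$.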

We now come to the case of a random field. Let $\mathbf{C}$ be the class of concentric circles. Assume $0 < r_0 < r < r_1$. Denote by $C_r$ the circle with radius $r$. Then we define

(ft) - yfi^^bw w^w^

This is a canonical representation. To show reversibility, we apply the inversion with respect to the circle with radius $\sqrt{r_0 r_1}$.

We claim that it is possible to have a generalization to the case where $\mathbf{C}$ is taken to be a class of curves obtained by a conformal mapping of concentric circles.

Remark 1. It is noted that the white noise $x(t)$ is regarded as a representation of the parameter $t$, so that the propagation of randomness (fluctuation) is expressed in terms of $x(t)$ instead of the time $t$ itself. Namely, the way random complex phenomena develop, in particular reversibility, has an explicit description in terms of white noise, as is seen in the above example.

Remark 2. See the papers [1] by Dirac and [13] by Polyakov for suggestions on a generalization of the path integral.

4 Addenda to foundations of the theories. Concluding remarks

Before the concluding remarks are given, we should like to add some facts, as addenda to §1, regarding the foundations of probability theory.

From the brief history mentioned in §1, we understand the reason why a white noise, that is, a system of i.e.r.v.'s, is introduced. It is a generalized stochastic process, so that we need some additional consideration when reasonable functionals, in general nonlinear functionals, of white noise are introduced. In physics we have met interesting cases where such nonlinear functionals of white noise are required: canonical commutation relations for quantum fields, where the number of degrees of freedom is continuously infinite; Feynman's path integrals, as discussed in 3) of the last section; and variational equations for a random field. On the other hand, we were lucky when a class of generalized white noise functionals was introduced in 1975, since the theory of generalized functions had been established and some attempts had been made to apply it to the theory of generalized stochastic processes. To obtain further fruitful results, we have been given a powerful method to study random fields indexed by a manifold. It is the so-called innovation approach, where our reductionism does not care about the higher dimensionality of the parameter space. With these in mind we can come to the concluding remarks.

As concluding remarks, some proposed future directions are now in order.

1. One is concerned with good applications of the Lévy Laplacian. Its significance is that it is an operator that is essentially infinite dimensional.

2. A two-dimensional Brownian path is considered to have some optimality in occupying territory. This property should be reflected in forming a model of physical phenomena.

3. A systematic approach to the invariance of random fields under transformation groups will be discussed.

4. Stochastic variational calculus for random fields. With the classical results on variational calculus, we can develop white noise analysis further.

Acknowledgements. The author is grateful to Professor A. Khrennikov, who invited him to give a talk at this conference. Thanks are due to the Academic Frontier Project at Meijo University for the support of this work.

References

1. P.A.M. Dirac, The Lagrangian in quantum mechanics, Phys. Z. Soviet Union 3, 64-72 (1933).

2. S. Tomonaga, On a relativistically invariant formulation of the quantum theory of wave fields, Prog. Theor. Phys. 1, 27-42 (1946).

3. P. Lévy, Processus stochastiques et mouvement brownien (Gauthier-Villars, 1948; 2nd ed. 1965).

4. P. Lévy, Nouvelle notice sur les travaux scientifiques de M. Paul Lévy, Janvier 1964, Part III: Processus stochastiques (unpublished manuscript).

5. T. Hida, Canonical representations of Gaussian processes and their applications, Mem. College of Science, Univ. of Kyoto A 33, 109-155 (1960).

6. T. Hida, Stationary stochastic processes (Princeton Univ. Press, 1970).

7. T. Hida, Brownian motion (Iwanami Pub. Co., 1975; English ed. Springer-Verlag, 1980).

8. T. Hida, Analysis of Brownian functionals, Carleton Math. Lecture Notes 13 (1975).

9. T. Hida, Innovation approach to random complex systems, Pub. Volterra Center 433 (2000).

10. T. Hida and L. Streit, On quantum theory in terms of white noise, Nagoya Math. J. 68, 21-34 (1977).

11. T. Hida, J. Potthoff and L. Streit, Dirichlet forms and white noise analysis, Commun. Math. Phys. 116, 235-245 (1988).

12. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise: an infinite dimensional calculus (Kluwer Academic Pub., 1993).

13. A.M. Polyakov, Quantum geometry of Bosonic strings, Phys. Lett. 103B, 207-210 (1981).

14. J. Schwinger, Brownian motion of a quantum oscillator, J. of Math. Phys. 2, 407-432 (1961).

15. Si Si, Gaussian processes and Gaussian random fields, Quantum Information II (World Scientific Pub. Co., 2000).

16. L. Streit and T. Hida, Generalized Brownian functionals and the Feynman integral, Stoch. Processes Appl. 16, 55-69 (1983).

17. L. Accardi and Si Si, Innovation approach to multiple Markov properties of some non Gaussian random fields, to appear.


STATISTICS AND ERGODICITY OF WAVE FUNCTIONS IN CHAOTIC OPEN SYSTEMS

H. ISHIO
Department of Physics and Measurement Technology, Linköping University, S-581 83 Linköping, Sweden
E-mail: hiris@ifm.liu.se

and

Division of Natural Science, Osaka Kyoiku University, Kashiwara, Osaka 582-8582, Japan
E-mail: ishio@cc.osaka-kyoiku.ac.jp

In general, quantum chaotic systems are considered to be described in the context of random matrix theory, i.e., by random Gaussian variables (real or complex) in an appropriate universality class. In reality, however, quantum states inside a chaotic open system are not given by a statistically homogeneous random state. We show some numerical evidence of such statistical inhomogeneity for ballistic transport through two-dimensional chaotic open billiards and discuss its relation to the corresponding classical dynamics.

1 Introduction

The quantum-mechanical signature of classical chaos is called quantum chaos. A rigorous definition of chaotic systems in quantum theory has been given very recently for Kolmogorov (K-) and Anosov (C-) systems, by analogy with the corresponding classical properties [1]. In such systems, quantum ergodicity is naturally expected: eigenfunctions are equidistributed in their representation space, and all expectation values of quantum observables coincide with mean values of the corresponding classical observables. It was first noted that a sufficient condition for quantum ergodicity to hold is the ergodicity of the corresponding classical dynamics [2]. More recently, the statement was proved in the case of quantum billiards [3,4]. Nowadays quantum ergodicity is one of the few results in the field of quantum chaos for which mathematical proofs exist.

Quantum ergodicity, however, can be reached only in the semiclassical limit ($\hbar \to 0$). In experiments or numerical simulations for chaotic systems, we often see nonuniversal quantum features far from ergodicity, even in a high (but finite) energy region. In the present work we show some numerical evidence of such statistical inhomogeneity for chaotic open systems. In Sec. 2 we introduce a model of ballistic transport through a chaotic open billiard and show some evidence of nonergodicity in the classical dynamics. We briefly discuss in Sec. 3 the general wave-statistical description of chaotic open systems by random matrix theory (RMT). In Sec. 4 we show numerical results of fully quantum calculations of the open billiard model and find that the idealistic description by RMT does not apply in some cases, even in a high energy region. There we focus on the relation between the statistical deviations and wave localization corresponding to classical short paths. Section 5 consists of conclusions.

Figure 1. Typical single trajectory in the open stadium billiard.

2 Classical Nonergodicity and Short-Path Dynamics

We consider a two-dimensional (2D) billiard where the motion of noninteracting particles confined by Dirichlet boundaries is ballistic. The shape of the boundaries directly determines the nonlinearity of particle dynamics inside the billiard. One of the prototypes of conservative chaotic systems is a Bunimovich stadium billiard. In the case of a closed stadium billiard, it is proved that the system has the K-property [5]. In the case of an open stadium billiard coupled to two narrow leads (see Fig. 1), nonintegrability is still expected; e.g., we can observe a fractal structure in the spectrum of dwell times inside the cavity region [6]. However, the Monte Carlo simulation of the classical path-length ($\propto$ dwell time) distribution shows that the distribution function is not a simple exponential decay function, as would be a signature of ergodicity, but a highly structured function owing to short-path dynamics [7].

Another example showing nonergodicity of classical dynamics in the case of the open stadium billiard is a transmission-reflection diagram of particles, as shown in Fig. 2. There, $y$ is the initial transversal position of each particle incoming from lead 1 (see Fig. 1) at the entrance of the stadium cavity, and $d$ denotes the common width of the attached leads. We apply a semiclassical quantization condition to the momentum of the incoming particles in the lead. The angle of incidence is quantized as $\theta_i = \pm\sin^{-1}[(n\pi)/(kd)]$ ($n = 1, 2, \ldots$), where we choose the positive and negative $\theta_i$ for the upper and lower direction of particle motion in Fig. 1, respectively, and $k$ is the Fermi wave number of the semiclassical particles. Over the whole range of the diagram we fix the quantized mode number as $n = 1$. Because of the semiclassical quantization condition, $\theta_i$ monotonically decreases as a function of $k$. The distributed black and white points correspond to transmission and reflection events, respectively. The relative measure of the black (white) portion for each $k$ is equal to the classical transmission (reflection) probability $T_{cl}(k)$ ($R_{cl}(k)$). In Fig. 2 we see a number of black and white windows in the chaotic sea. Each of them is associated with a family of short paths connecting lead 1 to lead 2 (for the black) or back to lead 1 (for the white). Such paths are stable in the event of transmission and reflection and are expected to make an important contribution, as a family, to the corresponding quantum transport.
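The semiclassical quantization of the incidence angle, $\theta_i = \pm\sin^{-1}[n\pi/(kd)]$, is simple enough to state as a small function. The sketch below is illustrative only: the function name, the `sign` argument and the convention of returning `None` when $n\pi > kd$ (no real angle exists) are assumptions, not part of the paper.

```python
import math

def incidence_angle(k, d, n=1, sign=+1):
    """Return theta_i = sign * asin(n*pi / (k*d)), or None if n*pi > k*d.

    k: wave number, d: lead width, n: transverse mode number (n = 1, 2, ...),
    sign: +1 for the upper and -1 for the lower direction of motion.
    """
    x = n * math.pi / (k * d)
    if x > 1.0:
        return None  # the condition has no real solution for this mode
    return sign * math.asin(x)

# For fixed n = 1 and d = 1 the angle decreases as k grows.
angles = [incidence_angle(k, 1.0) for k in (4.0, 8.0, 16.0, 32.0)]
```

For example, with $d = 1$ and $n = 1$ the list `angles` is strictly decreasing, in line with the monotonic decrease of $\theta_i$ with $k$ noted in the text.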

3 Universal Description of Wave Function Statistics

We write the scaled local density as $\rho(\mathbf{r}) = V|\psi(\mathbf{r})|^2$, where $V$ is the volume of the system in which a single-particle wave function $\psi(\mathbf{r})$ is normalized in terms of the position $\mathbf{r}$. It is well known that the probability distribution of the local densities of a chaotic eigenfunction of a closed system is the Porter-Thomas (P-T) distribution [8]

$$P(\rho) = \frac{1}{\sqrt{2\pi\rho}}\exp(-\rho/2) \qquad (1)$$

described by a Gaussian orthogonal ensemble (GOE) of random matrices when time-reversal symmetry (TRS) is present, i.e., $\psi \in \mathbb{R}$. On the other hand, the distribution is an exponential [8]

$$P(\rho) = \exp(-\rho) \qquad (2)$$

described by a Gaussian unitary ensemble (GUE) of random matrices when TRS is broken in the closed system, i.e., $\psi \in \mathbb{C}$. The space-averaged spatial correlation of the local densities of a 2D chaotic wave function with wave number $k$ is also given by [9,10,11]

$$P_2(kr) = \langle\rho(\mathbf{r}_1)\rho(\mathbf{r}_2)\rangle = 1 + cJ_0^2(kr) \qquad (3)$$


where $r = |\mathbf{r}_1 - \mathbf{r}_2|$ and $J_0(x)$ is the Bessel function of zeroth order. The parameter $c$ is chosen as $c = 2$ for GOE (TRS) and $c = 1$ for GUE (broken TRS) eigenfunctions.
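The two universal distributions and the values of $c$ can be illustrated by sampling. The sketch below assumes that the wave function amplitude at a point is a standard real Gaussian variable in the GOE case and a complex Gaussian variable with independent real and imaginary parts in the GUE case, scaled so that $\langle\rho\rangle = 1$. Reading the correlation (3) as $1 + cJ_0^2(kr)$ and using $J_0(0) = 1$, the second moment at coincident points should be $\langle\rho^2\rangle = 1 + c$. The sample size, seed and names are illustrative.

```python
import random

def density_moments(n_samples, complex_wave, seed=0):
    """Monte Carlo estimates of <rho> and <rho^2> for rho = |psi|^2 with unit mean."""
    rng = random.Random(seed)
    s1 = s2 = 0.0
    for _ in range(n_samples):
        if complex_wave:
            # Re and Im are independent with variance 1/2 each, so <|psi|^2> = 1.
            re = rng.gauss(0.0, 0.5 ** 0.5)
            im = rng.gauss(0.0, 0.5 ** 0.5)
            rho = re * re + im * im
        else:
            # Real amplitude with unit variance: rho follows the Porter-Thomas law.
            rho = rng.gauss(0.0, 1.0) ** 2
        s1 += rho
        s2 += rho * rho
    return s1 / n_samples, s2 / n_samples

goe_mean, goe_second = density_moments(200_000, complex_wave=False)
gue_mean, gue_second = density_moments(200_000, complex_wave=True)
```

Up to sampling error, `goe_second` is close to 3 (that is, $c = 2$) and `gue_second` is close to 2 (that is, $c = 1$), while both means are close to 1.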

The continuous transition of the wave function statistics between the GOE and GUE symmetries has also been investigated. Introducing a transition parameter $b \in [1, 2]$, we have the probability distribution [12, 13, 14, 15, 16]

$P_b(\rho) = \frac{b}{2\sqrt{b-1}} \exp\left(-\frac{b^2\rho}{4(b-1)}\right) I_0\left(\frac{b(2-b)\rho}{4(b-1)}\right)$ (4)

where $I_0(x)$ is the modified Bessel function of zeroth order, and the spatial correlation [17]

$P_{b,2}(kr) = 1 + \left(1 + \left(\frac{2-b}{b}\right)^2\right) J_0^2(kr)$. (5)

For $b \to 1$ and $b \to 2$, both equations tend to the GOE and GUE cases, respectively.

On the other hand, systematic statistical investigations of scattering wave functions in open chaotic systems have been carried out only quite recently [16].

It is essential that the space reciprocity of conservative closed systems, which means that each plane wave is tied to a counterpart with the same amplitude running in the opposite direction in phase, is lost in open systems. As a result, the wave function statistics in a chaotic open system is expected to be that of the GUE if the system is completely open [16].

4 Numerical Analyses and Discussions

We show in this section some numerical evidence of wave statistical inhomogeneity for ballistic transport through the 2D open stadium billiard. Assuming steady current flow through the leads, we solve the time-independent Schrödinger equation for a single particle under Dirichlet boundary conditions, based on the plane-wave-expansion method [6], giving reflection and transmission amplitudes as well as local wave functions for each energy. In the calculation of the statistics, a sample space $A$ ($= V$) is taken in the cavity region corresponding to the closed stadium, and more than one million sample points are used to obtain reliable statistics. We show the numerical results for the wave probability density in Fig. 3, and for the probability distribution $P(\rho)$ and spatial correlation $P_2(kr)$ in Fig. 4.


In Fig. 3(a) we find the so-called bouncing-ball mode in the central region of the stadium cavity, where we see a number of vertical nodes associated with marginally stable classical orbits bouncing vertically between the straight edges. Bouncing-ball states are nonstatistical states, since the amplitude of $\psi$ is strongly localized in the middle region of the stadium (the space reciprocity holds locally) and is very small in the endcaps (the space reciprocity does not necessarily hold). As a result, both $P(\rho)$ and $P_2(kr)$ for such states do not follow their universal expressions (see Fig. 4(a)). In addition to the bouncing-ball mode, we also see another wave localization, strongly coupled to both the initial and the (open) transmission channels, corresponding to the direct transmission path (see the white line depicted in Fig. 3(a)). Along such a localization, a plane wave may propagate with nonzero probability current, partially contributing to the anomaly of the wave statistics [16].

In the higher energy region, where the ratio of the system size $\sqrt{A}$ to the wavelength $\lambda$ is $\sqrt{A}/\lambda \approx 25$ (i.e. in the case of Fig. 3(b)), we may expect the GUE statistics. However, we see in Fig. 4(b) that both $P(\rho)$ and $P_2(kr)$ closely follow the GOE.

The reason is a localization effect reminiscent of the phenomenon known as a scar [18], describing an anomalous localization of quantum probability density along unstable periodic orbits in classically chaotic systems. In order to characterize a localization, we usually introduce a moment of the eigenfunction local density $|\psi(\mathbf{r})|^2$, defined by $I_q = V^{-1} \int_V |\psi(\mathbf{r})|^{2q}\, d\mathbf{r}$, with $V$ being the system volume [19, 20]. The second moment $I_2$ is known as the inverse participation ratio (IPR). Assuming a normalization condition $\langle |\psi|^2 \rangle$ ($= I_1$) $= 1$, we have $I_2 = 1$ for completely ergodic (random and uniform) eigenfunctions, while $I_2 = \infty$ for completely localized eigenfunctions like $|\psi(\mathbf{r})|^2 \sim V\delta(\mathbf{r})$. The localization effect on wave-function density statistics has been examined analytically in relation to $I_q$ for closed systems [21, 22, 23], and also numerically using a time-dependent approach, i.e. in terms of recurrences of a test Gaussian wave packet, for closed and weakly (imperfectly) open systems [24, 25, 26]. In the latter work it was shown that the tail of the wave-function intensity distribution in phase space is dominated by scarring, departing from the RMT predictions.

In contrast, the most prominent effect of the localization of wave probability density in open billiards is the local space reciprocity holding along the classical orbits that correspond to the localization and are not strongly coupled to any (open) transmission channel (see e.g. the white lines depicted in Fig. 3(b)). Along such orbits there is no net current, owing to the coherent overlap of time-reversed waves, so that both $P(\rho)$ and $P_2(kr)$ are close to the GOE predictions [16]. For a quantitative discussion, the value of the GOE-GUE transition parameter $b$ is calculated numerically from the wave function $\psi(\mathbf{r}) = u(\mathbf{r}) + iv(\mathbf{r})$ by the formula [16]

$b = \frac{2\langle |\psi|^2 \rangle}{\langle |\psi|^2 \rangle + \sqrt{\langle |\psi|^2 \rangle^2 - 4\left(\langle u^2 \rangle \langle v^2 \rangle - \langle uv \rangle^2\right)}}$ (6)

where $\langle \cdots \rangle$ denotes a space average on $A$. The value obtained for Fig. 3(b) is $b = 1.03$, which corresponds to a case very close to the GOE.
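To make Eq. (6) concrete, here is a minimal Python sketch that estimates $b$ from a list of complex samples $\psi = u + iv$, replacing the space average $\langle \cdots \rangle$ by an average over the samples. The function name `transition_parameter` and the tiny sample lists are hypothetical; they only illustrate the two limiting cases and are not the authors' numerical procedure.

```python
import math

def transition_parameter(psi):
    """Estimate b from complex samples psi = u + i v using Eq. (6).

    Averages over the list stand in for the space average <...>.
    """
    n = len(psi)
    u2 = sum(z.real ** 2 for z in psi) / n        # <u^2>
    v2 = sum(z.imag ** 2 for z in psi) / n        # <v^2>
    uv = sum(z.real * z.imag for z in psi) / n    # <uv>
    rho = u2 + v2                                 # <|psi|^2>
    disc = rho * rho - 4.0 * (u2 * v2 - uv * uv)
    # disc equals (u2 - v2)^2 + 4 uv^2 analytically; clamp rounding noise.
    return 2.0 * rho / (rho + math.sqrt(max(disc, 0.0)))

# A purely real wave function (GOE-like limit).
b_real = transition_parameter([1 + 0j, -2 + 0j, 0.5 + 0j])
# Equal, uncorrelated real and imaginary parts (GUE-like limit).
b_complex = transition_parameter([1 + 0j, 1j, -1 + 0j, -1j])
```

A purely real sample set gives $b = 1$, and equal uncorrelated real and imaginary parts give $b = 2$, matching the limits quoted for Eqs. (4) and (5).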

In the case of open systems, the IPR may again play an important role as a measure of localization [27]. In the definition $I_2 = V^{-1} \int_V |\psi(\mathbf{r})|^4\, d\mathbf{r}$, $|\psi(\mathbf{r})|^2$ ($= \rho(\mathbf{r})$) is the scattering-wave local density and $V$ is the area ($A$) of the stadium cavity in our case. For chaotic wave functions normalized as $\langle |\psi|^2 \rangle = 1$, we obtain from Eq. (4) the IPR $I_b$ for the transition between the GOE and GUE statistics as

$I_b = \int_0^\infty \rho^2 P_b(\rho)\, d\rho = \frac{64 (b-1)^{5/2}}{\pi b^5} \int_0^\pi \frac{d\theta}{\left[1 + \left(\frac{2}{b} - 1\right)\cos\theta\right]^3} = \frac{3b^2 - 4b + 4}{b^2}$. (7)

In the GOE and GUE limits, $I_{b=1} = 3$ and $I_{b=2} = 2$, respectively. For Fig. 3(b) the numerically obtained IPR is $I_2 = 2.89$, which is exactly equal to $I_{b=1.03}$. This means that the enhancement of the IPR by the amplitude of the localized wave is not strong in the case of Fig. 3(b), and that the effect of the localization appears mainly in the value of $b$, which also determines the IPR.
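The closed form in Eq. (7) is easy to check numerically. The sketch below evaluates $(3b^2 - 4b + 4)/b^2$ and compares it with a Monte Carlo estimate of $\langle \rho^2 \rangle$. The sampling model is an assumption made for this illustration: $u$ and $v$ are taken as independent zero-mean Gaussians with $\langle u^2 \rangle = 1/b$ and $\langle v^2 \rangle = (b-1)/b$, which is one choice consistent with Eq. (6) and the normalization $\langle |\psi|^2 \rangle = 1$. The function names are hypothetical.

```python
import math
import random

def ipr_closed_form(b):
    """IPR for the GOE-GUE transition, Eq. (7): (3b^2 - 4b + 4) / b^2."""
    return (3.0 * b * b - 4.0 * b + 4.0) / (b * b)

def ipr_monte_carlo(b, samples=200_000, seed=0):
    """Estimate <rho^2> with rho = u^2 + v^2 (assumed Gaussian model)."""
    rng = random.Random(seed)
    sigma_u = math.sqrt(1.0 / b)          # assumed <u^2> = 1/b
    sigma_v = math.sqrt((b - 1.0) / b)    # assumed <v^2> = (b-1)/b
    total = 0.0
    for _ in range(samples):
        u = rng.gauss(0.0, sigma_u)
        v = rng.gauss(0.0, sigma_v)
        total += (u * u + v * v) ** 2
    return total / samples

ipr_fig3b = ipr_closed_form(1.03)   # compare with the quoted value 2.89
```

With $b = 1.03$ the closed form rounds to 2.89, in line with the value quoted for Fig. 3(b).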

From our investigations, together with more extended studies [16], the complete GUE statistics is conjectured to be obtained only in the high-energy (semiclassical) limit. Until the energy reaches such a limit, the localization of wave functions within chaotic open systems strongly affects the wave statistical properties, leading to deviations from the RMT predictions based on the ergodicity or uniform randomness of wave functions.

Finally, we note that the classical-path families associated with the localization found in Fig. 3(a) and (b) can be identified as the windows indicated by $\alpha$ and $\beta$ in Fig. 2, respectively. (In Fig. 3(b), only the path family for the localization touching the entrance can be identified in Fig. 2.) We notice that the angle of incidence $\theta_i$ for a given $k$ is unrelated to that of the path corresponding to the observed localizations directly connected to the entrance.

5 Conclusions

In conclusion, our numerical analyses show that chaotic-scattering wave functions in open systems exhibit remarkably different features from the idealized GUE predictions. The statistical deviations from the GUE can be understood in terms of wave localization corresponding to classical short-path dynamics.


Acknowledgments

The author is obliged to K.-F. Berggren, A. I. Saichev and A. F. Sadreev for fruitful collaboration leading to the work in Sec. 4. Support from the Swedish Board for Industrial and Technological Development (NUTEK) under Project No. P12144-1 is also acknowledged. Part of the calculations of the wave function statistics were carried out using a resource at the National Supercomputer Center (NSC) at Linköping.

References

1. H. Narnhofer (to be published).
2. A. I. Shnirelman, Usp. Mat. Nauk 29, 181 (1974).
3. P. Gerard and E. Leichtnam, Duke Math. J. 71, 559 (1993).
4. S. Zelditch and M. Zworski, Comm. Math. Phys. 175, 673 (1996).
5. L. A. Bunimovich, Funct. Anal. Appl. 8, 254 (1974).
6. K. Nakamura and H. Ishio, J. Phys. Soc. Jpn. 61, 3939 (1992).
7. H. Ishio and J. Burgdörfer, Phys. Rev. B 51, 2013 (1995).
8. C. Porter and R. Thomas, Phys. Rev. 104, 483 (1956).
9. V. N. Prigodin, Phys. Rev. Lett. 74, 1566 (1995).
10. V. N. Prigodin et al., Phys. Rev. Lett. 72, 546 (1994).
11. M. V. Berry, in Chaos and Quantum Physics, ed. M. J. Giannoni, A. Voros and J. Zinn-Justin (Elsevier, Amsterdam, 1990), p. 251.
12. K. Zyczkowski and G. Lenz, Z. Phys. B 82, 299 (1991).
13. G. Lenz and K. Zyczkowski, J. Phys. A 25, 5539 (1992).
14. E. Kanzieper and V. Freilikher, Phys. Rev. B 54, 8737 (1996).
15. R. Pnini and B. Shapiro, Phys. Rev. E 54, R1032 (1996).
16. H. Ishio et al. (unpublished).
17. S.-H. Chung et al., Phys. Rev. Lett. 85, 2482 (2000).
18. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
19. F. Wegner, Z. Phys. B 36, 209 (1980).
20. C. Castellani and L. Peliti, J. Phys. A 19, L429 (1986).
21. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 51, 13403 (1995).
22. K. Müller et al., Phys. Rev. Lett. 78, 215 (1997).
23. V. N. Prigodin and B. L. Altshuler, Phys. Rev. Lett. 80, 1944 (1998).
24. L. Kaplan, Nonlinearity 12, R1 (1999).
25. L. Kaplan, Phys. Rev. Lett. 80, 2582 (1998).
26. L. Kaplan and E. J. Heller, Ann. Phys. 264, 171 (1998).
27. H. Ishio and L. Kaplan (private communication).


[Figure 2: plot omitted; horizontal axes $y(-\theta_i)$ and $y(+\theta_i)$, each running from $-d/2$ to $d/2$.]

Figure 2. Transmission-reflection diagram of classical particles as a function of the initial position $y$ at the entrance of the stadium cavity and the Fermi wave number $k$, corresponding to the angle of incidence $\theta_i$ calculated by the semiclassical quantization condition ($n = 1$ over the whole range) in the lead. Black and white points correspond to transmission and reflection events, respectively. Two families of short paths are identified with an arrow beside the diagram (see the text).


Figure 3. Contour plot of the wave probability density in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). The initial wave comes through the left lead into the cavity. The transmission probability is (a) $T_{qm} = 0.55$ and (b) $T_{qm} = 0.36$. The contours show about 97.5% of the largest wave probability density. Thin white lines show some of the short classical orbits corresponding to the localization of the wave probability density. Taken from the work by the authors in Ref. [12] (unpublished).


[Figure 4: plot omitted; panels (a) and (b) with GOE and GUE reference curves, inset horizontal axis $kr$.]

Figure 4. Probability distribution (steps) and spatial correlation (thick line in the inset) of local densities in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). Two thin lines show the GOE (i.e. Eq. (1)) and GUE (i.e. Eq. (2)) cases (Eq. (3) for the inset). Taken from the work by the authors in Ref. [12] (unpublished).


ORIGIN OF QUANTUM PROBABILITIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se

We demonstrate that the origin of the quantum probabilistic rule (which differs from the conventional Bayes formula by the presence of a $\cos\theta$-factor) might be explained by perturbation effects of preparation and measurement procedures. The main consequence of our investigation is that interference could be produced by purely corpuscular objects. In particular, the quantum rule for probabilities (with a nontrivial $\cos\theta$-factor) could be simulated for macroscopic physical systems via preparation procedures producing statistical deviations of a special form. We discuss preparation and measurement procedures which may produce probabilistic rules that are neither classical nor quantum, in particular hyperbolic quantum theory.

1 Introduction

It is well known that the conventional probabilistic rule, the formula of total probability (which is based on the Bayes formula for conditional probabilities), cannot be applied to quantum experiments; see, for example, [1]-[12] for extended discussions. It seems that the special features of quantum probabilistic behaviour are just consequences of violations of the conventional probabilistic rule.

In this paper we restrict our investigations to the two-dimensional case. Here the formula of total probability has the form ($i = 1, 2$):

$p(A = a_i) = p(B = b_1)\, p(A = a_i \mid B = b_1) + p(B = b_2)\, p(A = a_i \mid B = b_2)$, (1)

where $A$ and $B$ are physical variables which take, respectively, the values $a_1, a_2$ and $b_1, b_2$. The symbols $p(A = a_i \mid B = b_j)$ denote conditional probabilities. It is one of the most important rules used in applied probability theory. In fact, it is the prediction rule: if we know the probabilities for $B$ and the conditional probabilities, then we can find the probabilities for $A$. However, this rule cannot be used for the prediction of probabilities observed in experiments with elementary particles. The violation of the conventional probabilistic rule, and the necessity of using a new prediction rule, was found in interference experiments with elementary particles. This astonishing fact was one of the main reasons to build the quantum formalism on the basis of wave-particle duality.
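As a small worked example of the conventional rule (1), the sketch below computes $p(A = a_i)$ from the probabilities of $B$ and the conditional probabilities. The function name `total_probability` and the numerical values are hypothetical and serve only to show the arithmetic.

```python
def total_probability(q, p_cond, i):
    """Formula of total probability, Eq. (1).

    q[j]         = p(B = b_j)
    p_cond[j][i] = p(A = a_i | B = b_j)
    """
    return sum(q[j] * p_cond[j][i] for j in range(len(q)))

# Hypothetical two-valued example.
q = [0.3, 0.7]
p_cond = [[0.2, 0.8],
          [0.5, 0.5]]
p_a1 = total_probability(q, p_cond, 0)  # 0.3*0.2 + 0.7*0.5
p_a2 = total_probability(q, p_cond, 1)  # 0.3*0.8 + 0.7*0.5
```

Because each row of `p_cond` sums to one, the two predicted probabilities sum to one as well.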


Let $\phi$ be a quantum state. Let $\{|b_i\rangle\}_{i=1}^{2}$ be the basis consisting of eigenvectors of the operator $B$ corresponding to the physical observable $B$. The quantum probabilistic rule has the form ($i = 1, 2$):

$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 p_{1i} q_2 p_{2i}} \cos\theta$, (2)

where $p_i = p_\phi(A = a_i)$, $q_j = p_\phi(B = b_j)$, $p_{ij} = p_{b_i}(A = a_j)$, $i, j = 1, 2$. Here probabilities have indexes corresponding to quantum states.

By denoting $P = p_i$ and $P_1 = q_1 p_{1i}$, $P_2 = q_2 p_{2i}$ we get the standard quantum probabilistic rule for interference of alternatives:

$P = P_1 + P_2 + 2\sqrt{P_1 P_2} \cos\theta$.

There is a large diversity of opinions on the origin of violations of the conventional probabilistic rule (1) in quantum mechanics; see [1]-[12]. The common opinion is that violations of (1) are induced by special properties of quantum systems (for example, Dirac, Feynman, Schrödinger). Thus the quantum probabilistic rule must be considered as a peculiarity of nature.

An interesting investigation of this problem is contained in the paper of J. Summhammer [12]. In contrast to Dirac, Feynman and Schrödinger, he claimed that the quantum probabilistic rule (2) is not a peculiarity of nature, but just a consequence of one special method of the probabilistic description of nature, the so-called method of maximum predictive power.

In this paper we provide a probabilistic analysis of the quantum rule (2). In our analysis, probability has the meaning of the frequency probability, namely the limit of frequencies in a long sequence of trials (or for a large statistical ensemble). Hence, in fact, we follow R. von Mises' approach to probability [13]. It seems that it would be impossible to find the roots of the quantum rule (2) in the measure-theoretical framework of A. N. Kolmogorov, 1933 [14]. In the measure-theoretical framework, probabilities are defined as sets of real numbers having some special mathematical properties. The conventional rule (1) is merely a consequence of the definition of conditional probabilities. In the Kolmogorov framework, to analyse the transition from (1) to (2) is to analyse the transition from one definition to another. In the frequency framework we can analyse the behaviour of trials which induce one or another property of probability. Our analysis shows that the quantum probabilistic rule (2) can be, in principle, a consequence of perturbation effects of preparation and measurement procedures. Thus trigonometric fluctuations of quantum probabilities can be explained without using wave arguments.

In fact, our investigation is strongly based on Dirac's famous analysis of the foundations of quantum mechanics; see [1]. In particular, P. Dirac pointed out that one of the main differences between the classical and quantum theories is that in the quantum case perturbation effects of preparation and measurement procedures play the crucial role. However, P. Dirac could not explain the origin of interference for quantum particles in the purely corpuscular model. He had to appeal to wave arguments: "If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other" [1].

In this paper we discuss perturbation effects of preparation and measurement procedures. We remark that we do not follow W. Heisenberg [15]: we do not study perturbation effects for individual measurements. We discuss statistical (ensemble) deviations induced by perturbations.^a

We underline again that our probabilistic analysis was possible only due to the rejection of Kolmogorov's measure-theoretical model of probability theory. Of course, each particular experiment (measurement) can be described by Kolmogorov's model; there are no quantum probabilities. Moreover, it seems that there is nothing more than the binomial probability distribution (see the paper of J. Summhammer in the present volume). The most important feature of QUANTUM STATISTICS is not related to a single experiment. We have to consider at least three different experiments (preparation procedures) to observe quantum probabilistic behaviour, namely interference of alternatives. Kolmogorov's model is not adequate to such a situation. In this model all random variables are defined on the same probability space. It is impossible to do this in the case of a few experiments that produce interference of alternatives (at least, the author does not see any way to do this). In our analysis, probability is the classical relative frequency, but it is not Kolmogorov probability (compare with Accardi [3]).

An unexpected consequence of our analysis is that the quantum probability rule (2) is just one of the possible perturbations (by ensemble fluctuations) of the conventional probability rule (1). In principle, there might exist experiments which would produce perturbations of the conventional probabilistic rule (1) that differ from the quantum probabilistic rule (2).

Moreover, if we use the same normalization of the interference term, namely $2\sqrt{P_1 P_2}$, then we can classify all possible probabilistic rules that we have in nature:

1) trigonometric; 2) hyperbolic; 3) hyper-trigonometric.

The hyperbolic probabilistic transformation has a linear space representation that is similar to the standard quantum formalism in the complex Hilbert space. Instead of complex numbers we use so-called hyperbolic numbers; see, for example, [18], p. 21. The development of hyperbolic quantum mechanics can be interesting for comparative analysis with standard quantum mechanics. In particular, we clarify the role of complex numbers in quantum theory. Complex (as well as hyperbolic) numbers were used to linearize the nonlinear probabilistic rule (which in general could not be linearized over the real numbers). Another interesting feature of hyperbolic quantum mechanics is the violation of the principle of superposition. Here we have only some restricted variant of this principle.

^a Such an approach implies the statistical viewpoint on the Heisenberg uncertainty relation: the statistical dispersion principle; see L. Ballentine [16], [17] for the details.

2 Quantum formalism and perturbation effects

1. Frequency probability theory. The frequency definition of probability is more or less standard in quantum theory, especially in the approach based on preparation and measurement procedures [5], [10], [16], [11].

Let us consider a sequence of physical systems $\pi = (\pi_1, \pi_2, \ldots, \pi_N, \ldots)$. Suppose that the elements of $\pi$ have some property, for example position or spin, and that this property can be described by natural numbers $L = \{1, 2, \ldots, m\}$, the set of labels. Thus for each $\pi_j \in \pi$ we have a number $x_j \in L$. So $\pi$ induces a sequence

$x = (x_1, x_2, \ldots, x_N, \ldots)$, $x_j \in L$. (3)

For each fixed $\alpha \in L$ we have the relative frequency $\nu_N(\alpha) = n_N(\alpha)/N$ of the appearance of $\alpha$ in $(x_1, x_2, \ldots, x_N)$. Here $n_N(\alpha)$ is the number of elements in $(x_1, x_2, \ldots, x_N)$ with $x_j = \alpha$. R. von Mises [13] said that $x$ satisfies the principle of the statistical stabilization of relative frequencies if for each fixed $\alpha \in L$ there exists the limit

$p(\alpha) = \lim_{N \to \infty} \nu_N(\alpha)$. (4)

This limit is said to be the probability of $\alpha$. Thus the probability is defined as the limit of relative frequencies. In fact, this definition of probability is used in all experimental investigations. In Kolmogorov's approach [14], probability is defined as a measure. The principle of statistical stabilization is then obtained as a mathematical theorem, the law of large numbers.
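The stabilization of relative frequencies can be illustrated with a short simulation. In the sketch below, the function `relative_frequencies` returns $\nu_N(\alpha)$ for $N = 1, 2, \ldots$; the label set, the weights and the sequence length are hypothetical, and a pseudo-random generator merely stands in for a physical source of systems.

```python
import random

def relative_frequencies(labels, alpha):
    """Return nu_N(alpha) = n_N(alpha) / N for N = 1, ..., len(labels)."""
    count = 0
    freqs = []
    for N, x in enumerate(labels, start=1):
        if x == alpha:
            count += 1
        freqs.append(count / N)
    return freqs

# Hypothetical source: labels L = {1, 2, 3} drawn with fixed weights.
rng = random.Random(1)
sequence = rng.choices([1, 2, 3], weights=[0.3, 0.5, 0.2], k=50_000)
nu = relative_frequencies(sequence, 1)
```

For such a source the tail of `nu` settles near the weight assigned to label 1, which is the behaviour that Eq. (4) takes as the definition of $p(\alpha)$.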

2. Preparation and measurement procedures and quantum formalism. We consider a statistical ensemble $S$ of quantum particles described by a quantum state $\phi$. This ensemble is produced by some preparation procedure $\mathcal{E}$; see, for example, [4], [5], [16], [10], [11] for details. See also P. Dirac [1]: "In practice the conditions could be imposed by a suitable preparation of the system, consisting perhaps in passing it through various kinds of sorting apparatus, such as slits and polarimeters, the system being left undisturbed after the preparation."

There are two discrete physical observables $B = b_1, b_2$ and $A = a_1, a_2$.

The total number of particles in $S$ is equal to $N$. Suppose that there are $n_i^b$, $i = 1, 2$, particles in $S$ with $B = b_i$ and $n_i^a$, $i = 1, 2$, particles in $S$ with $A = a_i$.

Suppose that among those particles with $B = b_i$ there are $n_{ij}$, $i, j = 1, 2$, particles with $A = a_j$ (see (R) below to specify the meaning of "with"). So

$n_i^b = n_{i1} + n_{i2}$, $n_j^a = n_{1j} + n_{2j}$, $i, j = 1, 2$.

(R) We follow Einstein and use the objective realist model, in which both $B$ and $A$ are objective properties of a quantum particle; see [5], [4], [10] for the details. In particular, here each elementary particle has simultaneously defined position and momentum. In such a model we can consider in the ensemble $S$ sub-ensembles $S_j(B)$ and $S_j(A)$, $j = 1, 2$, of particles having the properties $B = b_j$ and $A = a_j$, respectively. Set

$S_{ij}(A, B) = S_i(B) \cap S_j(A)$.

Then $n_{ij}$ is the number of elements in the ensemble $S_{ij}(A, B)$. We remark that the existence of the objective property ($B = b_i$ and $A = a_j$) need not imply the possibility of measuring this property. For example, such a measurement is impossible in the case of incompatible observables. In general, the property ($B = b_i$ and $A = a_j$) is a kind of hidden objective property.^b

Physical experience says that the following frequency probabilities are well defined for all observables $B$, $A$:

$q_i = p_\phi(B = b_i) = \lim_{N \to \infty} q_i^{(N)}$, $q_i^{(N)} = \frac{n_i^b}{N}$; (5)

$p_i = p_\phi(A = a_i) = \lim_{N \to \infty} p_i^{(N)}$, $p_i^{(N)} = \frac{n_i^a}{N}$. (6)

Let the quantum states $|b_i\rangle$ be eigenstates of the operator $B$. Let us consider statistical ensembles $T_i$, $i = 1, 2$, of quantum particles described by the quantum states $|b_i\rangle$. These ensembles are produced by some preparation procedures $\mathcal{E}_i$. For instance, we can suppose that particles produced by a preparation procedure $\mathcal{E}$ (for the quantum state $\phi$) pass through additional filters $F_i$, $i = 1, 2$. In the quantum formalism we have

$\phi = \sqrt{q_1}\, |b_1\rangle + \sqrt{q_2}\, e^{i\theta} |b_2\rangle$. (7)

^b Attempts to use objective realism in quantum theory were strongly criticized, especially in connection with the EPR-Bell considerations. Moreover, many authors (for example, P. Dirac [1] and R. Feynman [2]) claimed that the contradiction between objective realism and quantum theory can be observed just by comparing the conventional and quantum probabilistic rules (see d'Espagnat [4] for an extended discussion). However, in this paper we demonstrate that there is no direct contradiction between objective realism and the quantum probabilistic rule.


In the objective realist model (R) this representation may induce the illusion that the ensembles $T_i$, $i = 1, 2$, for the states $|b_i\rangle$ must be identified with the sub-ensembles $S_i(B)$ of the ensemble $S$ for the state $\phi$. However, there are no physical reasons for such an identification.

The additional filter $F_i$ ($i = 1, 2$) changes the $A$-property of quantum particles. In general, the probability distribution of the property $A$ for the ensemble $S_i(B) = \{\pi \in S : B(\pi) = b_i\}$ differs from the corresponding probability distribution for the ensemble $T_i$.

Suppose that there are $m_{ij}$ particles in the ensemble $T_i$ with $A = a_j$ ($j = 1, 2$).^c

The following frequency probabilities are well defined: $p_{ij} = p_{b_i}(A = a_j) = \lim_{N \to \infty} p_{ij}^{(N)}$, where the relative frequency $p_{ij}^{(N)} = \frac{m_{ij}}{n_i^b}$ (by measuring values of the variable $A$ for the statistical ensemble $T_i$ we always observe the stabilization of the relative frequencies $p_{ij}^{(N)}$ to some constant probability $p_{ij}$).

Here it is assumed that the ensemble $T_i$ consists of $n_i^b$ particles, $i = 1, 2$. This assumption is natural if we take the preparation procedure $\mathcal{E}_i$ to be a filter $F_i$ with respect to the value $B = b_i$. Only particles with $B = b_i$ pass this filter. Hence the number of elements in the ensemble $T_i$ (represented by the state $|b_i\rangle$) coincides with the number of elements with $B = b_i$ in the ensemble $S$ (represented by the state $\phi$).

It is also assumed that $n_i^b = n_i^b(N) \to \infty$, $N \to \infty$. In fact, the latter assumption holds true if both probabilities $q_i$, $i = 1, 2$, are nonzero.

We remark that the probabilities $p_{ij} = p_{b_i}(A = a_j)$ cannot (in general) be identified with the conditional probabilities $p_\phi(A = a_j \mid B = b_i)$. As we have remarked, these probabilities are related to statistical ensembles prepared by different preparation procedures, namely by $\mathcal{E}_i$, $i = 1, 2$, and $\mathcal{E}$. The probabilities $p_{b_i}(A = a_j)$ can be found by measuring the $A$-variable for particles belonging to the ensemble $T_i$. The probabilities $p_\phi(A = a_j \mid B = b_i)$ in general could not be found: these are hidden probabilities with respect to the ensemble $S$.

3 Derivation of quantum probabilistic rule Here we present the standard Hilbert space calculations

^c We can use the objective realist model (R). Then $m_{ij}$ is just the number of particles in the ensemble $T_i$ having the objective property $A = a_j$. We can also use the contextualist model (C). Then $m_{ij}$ is the number of particles in the ensemble $T_i$ which, in the process of an interaction with a measurement device for the physical observable $A$, would give the result $A = a_j$.


$\phi = \sqrt{q_1}\, |b_1\rangle + \sqrt{q_2}\, e^{i\theta} |b_2\rangle$.

Let $\{|a_i\rangle\}$ be the orthonormal basis consisting of eigenvectors of the operator $A$. We can restrict our considerations to the case

$|b_1\rangle = \sqrt{p_{11}}\, |a_1\rangle + e^{i\gamma_1} \sqrt{p_{12}}\, |a_2\rangle$, $|b_2\rangle = \sqrt{p_{21}}\, |a_1\rangle + e^{i\gamma_2} \sqrt{p_{22}}\, |a_2\rangle$. (8)

We note that $p_{11} + p_{12} = 1$, $p_{21} + p_{22} = 1$. The first sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_1$; the second sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_2$.

As $\langle b_1 | b_2 \rangle = 0$, we obtain $\sqrt{p_{11} p_{21}} + e^{i(\gamma_1 - \gamma_2)} \sqrt{p_{12} p_{22}} = 0$.

We suppose that all probabilities $p_{ij} > 0$. This is equivalent to saying that $A$ and $B$ are incompatible observables, or that the operators $A$ and $B$ do not commute.

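Hence $\sin(\gamma_1 - \gamma_2) = 0$ and $\gamma_2 = \gamma_1 + \pi k$. We also have $\sqrt{p_{11} p_{21}} + \cos(\gamma_1 - \gamma_2) \sqrt{p_{12} p_{22}} = 0$. This implies that $k = 2l + 1$ and $\sqrt{p_{11} p_{21}} = \sqrt{p_{12} p_{22}}$. As $p_{12} = 1 - p_{11}$ and $p_{21} = 1 - p_{22}$, we obtain that

$p_{11} = p_{22}$, $p_{12} = p_{21}$. (9)

These equalities are equivalent to the condition $p_{11} + p_{21} = 1$, $p_{12} + p_{22} = 1$. Hence the matrix of probabilities $(p_{ij})$ is a doubly stochastic matrix; see, for example, [5] for general considerations. Thus, in fact,

$|b_1\rangle = \sqrt{p_{11}}\, |a_1\rangle + e^{i\gamma_1} \sqrt{p_{12}}\, |a_2\rangle$, $|b_2\rangle = \sqrt{p_{21}}\, |a_1\rangle - e^{i\gamma_1} \sqrt{p_{22}}\, |a_2\rangle$. (10)

So $\phi = d_1 |a_1\rangle + d_2 |a_2\rangle$, where $d_1 = \sqrt{q_1 p_{11}} + e^{i\theta} \sqrt{q_2 p_{21}}$, $d_2 = e^{i\gamma_1} \sqrt{q_1 p_{12}} - e^{i(\gamma_1 + \theta)} \sqrt{q_2 p_{22}}$. Thus

$p_1 = p_\phi(A = a_1) = |d_1|^2 = q_1 p_{11} + q_2 p_{21} + 2\sqrt{q_1 p_{11} q_2 p_{21}} \cos\theta$; (11)

$p_2 = p_\phi(A = a_2) = |d_2|^2 = q_1 p_{12} + q_2 p_{22} - 2\sqrt{q_1 p_{12} q_2 p_{22}} \cos\theta$. (12)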
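The Hilbert space calculation leading to Eqs. (11) and (12) can be cross-checked numerically. The sketch below builds the amplitudes $d_1$, $d_2$ for a doubly stochastic matrix ($p_{22} = p_{11}$, $p_{21} = p_{12}$) and compares $|d_i|^2$ with the right-hand sides of (11) and (12). The function names and the sample values of $q_1$, $p_{11}$, $\theta$ and $\gamma_1$ are hypothetical.

```python
import cmath
import math

def amplitudes(q1, p11, theta, gamma1=0.0):
    """Coefficients d1, d2 of phi in the basis |a1>, |a2>."""
    q2, p12 = 1.0 - q1, 1.0 - p11
    p21, p22 = p12, p11                      # Eq. (9): doubly stochastic
    d1 = math.sqrt(q1 * p11) + cmath.exp(1j * theta) * math.sqrt(q2 * p21)
    d2 = (cmath.exp(1j * gamma1) * math.sqrt(q1 * p12)
          - cmath.exp(1j * (gamma1 + theta)) * math.sqrt(q2 * p22))
    return d1, d2

def quantum_rule(q1, p11, theta):
    """Right-hand sides of Eqs. (11) and (12)."""
    q2, p12 = 1.0 - q1, 1.0 - p11
    p21, p22 = p12, p11
    p1 = q1 * p11 + q2 * p21 + 2.0 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta)
    p2 = q1 * p12 + q2 * p22 - 2.0 * math.sqrt(q1 * p12 * q2 * p22) * math.cos(theta)
    return p1, p2

# Hypothetical parameter values.
d1, d2 = amplitudes(q1=0.4, p11=0.7, theta=1.1, gamma1=0.3)
p1, p2 = quantum_rule(q1=0.4, p11=0.7, theta=1.1)
```

The interference terms in $p_1$ and $p_2$ cancel in the sum because of Eq. (9), so the two probabilities add up to one for any $\theta$.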


3. Probability transformations connecting preparation procedures. Let us forget for the moment about quantum theory. Let $B$ ($= b_1, b_2$) and $A$ ($= a_1, a_2$) be physical variables. We consider an arbitrary preparation procedure $\mathcal{E}$ for microsystems or macrosystems. Suppose that $\mathcal{E}$ produced an ensemble $S$ of physical systems. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be preparation procedures which are based on filters $F_1$ and $F_2$ corresponding, respectively, to the values $b_1$ and $b_2$ of $B$. Denote the statistical ensembles produced by these preparation procedures by the symbols $T_1$ and $T_2$, respectively. The symbols

$n_i^b$, $n_i^a$, $n_{ij}$, $m_{ij}$

have the same meaning as in the previous considerations. The probabilities $q_i$, $p_{ij}$, $p_i$ are defined in the same way as in the previous considerations. The only difference is that instead of indexes corresponding to quantum states we use indexes corresponding to statistical ensembles:

$q_i = p_S(B = b_i)$, $p_i = p_S(A = a_i)$, $p_{ij} = p_{T_i}(A = a_j)$.

We shall restrict our considerations to the case of strictly positive probabilities.

The following simple frequency considerations are basic in our investigation. We would like to represent the frequency $p_i^{(N)}$ (for $A = a_i$ in the ensemble $S$) as the sum of the conventional (Bayes) part

$q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)}$

and some perturbation term. Such a perturbation term appears because the frequencies $q_j^{(N)}$ and $p_{ji}^{(N)}$ are calculated with respect to different ensembles. The magnitude of this perturbation term will play the crucial role in our further analysis. We have

$p_i^{(N)} = \frac{n_i^a}{N} = \frac{n_{1i}}{N} + \frac{n_{2i}}{N} = \frac{m_{1i}}{N} + \frac{m_{2i}}{N} + \frac{(n_{1i} - m_{1i})}{N} + \frac{(n_{2i} - m_{2i})}{N}$.

But for $i = 1, 2$ we have

$\frac{m_{1i}}{N} = \frac{m_{1i}}{n_1^b} \cdot \frac{n_1^b}{N} = p_{1i}^{(N)} q_1^{(N)}$, $\frac{m_{2i}}{N} = \frac{m_{2i}}{n_2^b} \cdot \frac{n_2^b}{N} = p_{2i}^{(N)} q_2^{(N)}$.

Hence

$p_i^{(N)} = q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)} + \delta_i^{(N)}$, (13)

where

$\delta_i^{(N)} = \frac{1}{N} \left[ (n_{1i} - m_{1i}) + (n_{2i} - m_{2i}) \right]$, $i = 1, 2$.
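The decomposition (13) is a purely arithmetic identity on counts, which the following sketch verifies for one invented data set. The counts `n` (for the ensemble $S$) and `m` (for the filtered ensembles $T_1$, $T_2$) are hypothetical; the only constraint imposed is that row $j$ of `m` sums to $n_j^b$, as assumed in the text. The function name `decompose` is likewise an assumption.

```python
def decompose(N, n, m, i):
    """Split p_i^(N) into its Bayes part and the deviation delta_i^(N).

    n[j][i] = n_{ji}: systems in S with B = b_j and A = a_i
    m[j][i] = m_{ji}: systems in T_j with A = a_i
    """
    nb = [sum(n[0]), sum(n[1])]                     # n_1^b, n_2^b
    q = [nb[0] / N, nb[1] / N]                      # q_j^(N)
    p_t = [m[0][i] / nb[0], m[1][i] / nb[1]]        # p_{ji}^(N)
    bayes = q[0] * p_t[0] + q[1] * p_t[1]
    delta = ((n[0][i] - m[0][i]) + (n[1][i] - m[1][i])) / N
    p = (n[0][i] + n[1][i]) / N                     # n_i^a / N
    return p, bayes, delta

# Hypothetical counts for N = 1000 systems.
N = 1000
n = [[300, 100], [200, 400]]   # n_1^b = 400, n_2^b = 600
m = [[250, 150], [330, 270]]   # filters changed some A-values
p, bayes, delta = decompose(N, n, m, 0)
```

With these counts the Bayes part is 0.58 and the deviation is -0.08, which together reproduce the observed frequency 0.5.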


In fact, this rest term depends on the statistical ensembles $S$, $T_1$, $T_2$:

$\delta_i^{(N)} = \delta_i^{(N)}(S, T_1, T_2)$.

4. Behaviour of fluctuations. First we remark that $\lim_{N \to \infty} \delta_i^{(N)}$ exists for all physical measurements. We always observe that

$p_i^{(N)} \to p_i$, $q_i^{(N)} \to q_i$, $p_{ij}^{(N)} \to p_{ij}$, $N \to \infty$.

Thus there exist limits $\delta_i = \lim_{N \to \infty} \delta_i^{(N)} = p_i - q_1 p_{1i} - q_2 p_{2i}$.

This coefficient $\delta_i$ is the statistical deviation produced by the perturbation effect of the preparation procedures $\mathcal{E}_i$ (the quantities $\delta_i^{(N)}$ are experimental statistical deviations).

Suppose that the preparation procedures $\mathcal{E}_i$, $i = 1, 2$ (typically filters $F_i$) produce negligibly small (with respect to the size $N$ of the statistical ensemble) changes in the properties of particles. Then

$\delta_i^{(N)} \to 0$, $N \to \infty$. (14)

This asymptotic behaviour implies the conventional probabilistic rule (1). In particular, this rule can be used in all experiments of classical physics. Hence preparation and measurement procedures of classical physics produce experimental statistical deviations with the asymptotic behaviour (14). We also have such behaviour in the case of compatible observables in quantum physics.

Moreover, we obtain the same conventional probabilistic rule for incompatible observables $B$ and $A$ if the phase factor is $\theta = \frac{\pi}{2} + \pi k$. Therefore the conventional probabilistic rule (1) is not directly related to the commutativity of the corresponding operators in quantum theory. It is a consequence of the asymptotic behaviour (14).

Despite the same asymptotic behaviour (14), there is a crucial difference between classical observations (and compatible observations) and decoherence, $\theta = \frac{\pi}{2} + \pi k$, for incompatible observations. In the first case $\delta_i^{(N)} \approx 0$, $N \to \infty$, because both

$\delta_{i1}^{(N)} = \frac{1}{N}(n_{1i} - m_{1i}) \approx 0$, $\delta_{i2}^{(N)} = \frac{1}{N}(n_{2i} - m_{2i}) \approx 0$, $N \to \infty$.

In an ideal classical experiment we have

$n_{1i} = m_{1i}$ and $n_{2i} = m_{2i}$.

Here the preparation procedures $\mathcal{E}_i$ (filters with respect to the values $b_i$ of the variable $B$) do not change the values of the $A$-variable at all. In the case of decoherence of incompatible observables, the statistical deviations $\delta_{i1}^{(N)}$ and $\delta_{i2}^{(N)}$ are not negligibly small. So perturbations can be sufficiently strong. However, we still observe (14) as a consequence of the compensation effect of perturbations:

$\delta_{i1}^{(N)} \approx -\delta_{i2}^{(N)}$.

Suppose now that the filters $F_i$, $i = 1, 2$, produce changes in the properties of particles that are not negligibly small (from the statistical viewpoint). Then the statistical deviations satisfy

$\lim_{N \to \infty} \delta_i^{(N)} = \delta_i \neq 0$. (15)

Here we obtain probabilistic rules which differ from the conventional one (1). In particular, this implies that behaviour (15) cannot be produced in experiments of classical physics (or for compatible observables in quantum physics).

A rather special class of statistical deviations (15) is produced in experiments of quantum physics. However, behaviour of the form (15) is not a specific feature of quantum measurements (see further considerations).

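To study carefully the behaviour of the fluctuations $\delta_i^{(N)}$, we represent them as

$\delta_i^{(N)} = 2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}\; \lambda_i^{(N)}$,

where

$\lambda_i^{(N)} = \frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}}$.

These are normalized (experimental) statistical deviations. We have used the fact that

$q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)} = \frac{n_1^b}{N} \cdot \frac{m_{1i}}{n_1^b} \cdot \frac{n_2^b}{N} \cdot \frac{m_{2i}}{n_2^b} = \frac{m_{1i} m_{2i}}{N^2}$.

In the limit $N \to \infty$ we get

$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\; \lambda_i$,

where the coefficients $\lambda_i = \lim_{N \to \infty} \lambda_i^{(N)}$, $i = 1, 2$. Thus we have found the general probabilistic transformation (for three preparation procedures) that can be obtained as a perturbation of the conventional probabilistic rule ($i = 1, 2$):

$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\; \lambda_i$. (16)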

Of course we are free in the choice of a normalization constant in the perturbation term We use 2vqiq2Piipi7 by the analogy with quantum forshymalism In fact such a normalization was found in quantum formalism to get the representation of probabilities with the aid of complex numbers Comshyplex numbers were introduced in quantum formalism to linearize the nonlinear

190

probabilistic transformation q ip i + q2P2raquo + 2-vqiq2PiiP2i cos 6 To do this we use the formula (c d gt 0)

c + d + 2Vcdcos6 = ^+Vdeie2 (17)

The square root yc+Vde9 gives the possibility to use linear transformations Thus we do not see anything mystical in the appearance of complex numbers in quantum theory This is a consequence of the impossibility of real linearization of the nonlinear probabilistic transformation
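Formula (17) can be checked numerically. The following is only a sketch: the function names and the sample values of $c$, $d$ and $\theta$ are illustrative choices, not taken from the paper.

```python
import cmath
import math

def interference_sum(c, d, theta):
    """Left-hand side of (17): c + d + 2*sqrt(c*d)*cos(theta)."""
    return c + d + 2.0 * math.sqrt(c * d) * math.cos(theta)

def squared_modulus(c, d, theta):
    """Right-hand side of (17): |sqrt(c) + sqrt(d)*exp(i*theta)|^2."""
    amplitude = math.sqrt(c) + math.sqrt(d) * cmath.exp(1j * theta)
    return abs(amplitude) ** 2

# Illustrative (hypothetical) sample points with c, d > 0.
samples = [(0.3, 0.2, 0.0), (0.5, 0.1, 1.0), (0.25, 0.25, math.pi), (0.4, 0.6, 2.5)]
max_gap = max(abs(interference_sum(c, d, t) - squared_modulus(c, d, t))
              for c, d, t in samples)
print(max_gap)  # difference between the two sides, up to rounding error
```

The nonlinear real expression on the left is reproduced by the squared modulus of a quantity that is linear in $\sqrt{c}$ and $\sqrt{d}$, which is the sense in which complex amplitudes linearize the rule.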

In classical physics the coefficients $\lambda_i = 0$. We have the same situation in quantum physics for all compatible observables, as well as for measurements of incompatible observables for some states. In the general case in quantum physics we can only say that the normalized statistical deviations satisfy

$$|\lambda_i^{(N)}| \leq 1. \tag{18}$$

Hence, for quantum experiments we always have

$$\left| \frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}} \right| \leq 1, \quad N \to \infty. \tag{19}$$

Thus quantum perturbations induce relatively small (but not negligibly small) statistical variations of properties. We underline again that quantum perturbations give just the proper class of perturbations satisfying condition (19).

Let us consider arbitrary preparation procedures that induce perturbations satisfying (18). We can set

$$\lambda_i = \cos\theta_i, \quad i = 1, 2,$$

where $\theta_i$ are some phases. Here we can represent the perturbation to the conventional probabilistic rule in the form

$$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}} \cos\theta_i, \quad i = 1, 2. \tag{20}$$

In this case the probabilistic rule has the form ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}} \cos\theta_i. \tag{21}$$

This is the general form of a trigonometric probabilistic transformation. The usual probabilistic calculations give us

$$1 = p_1 + p_2 = q_1 p_{11} + q_2 p_{21} + q_1 p_{12} + q_2 p_{22} + 2\sqrt{q_1 q_2 p_{11} p_{21}} \cos\theta_1 + 2\sqrt{q_1 q_2 p_{12} p_{22}} \cos\theta_2$$

$$= 1 + 2\sqrt{q_1 q_2}\left[\sqrt{p_{11} p_{21}} \cos\theta_1 + \sqrt{p_{12} p_{22}} \cos\theta_2\right].$$

Thus we obtain the relation

$$\sqrt{p_{11} p_{21}} \cos\theta_1 + \sqrt{p_{12} p_{22}} \cos\theta_2 = 0. \tag{22}$$

Suppose now that the matrix of probabilities is a double stochastic matrix (so that $p_{11} p_{21} = p_{12} p_{22}$). We get

$$\cos\theta_1 = -\cos\theta_2. \tag{23}$$

We obtain the quantum probabilistic transformation (2). We have demonstrated that this rule can be derived even in the realist framework. Condition (19) has an evident interpretation. To explain the mystery of the quantum probabilistic rule we must give some physical interpretation to the condition of double stochasticity, see section 4 for such an attempt.
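As a small numerical illustration of (21)-(23), the sketch below evaluates the trigonometric rule for a double stochastic matrix with $\theta_2 = \theta_1 + \pi$, so that $\cos\theta_2 = -\cos\theta_1$. The helper name `trig_rule` and all numerical values are hypothetical and chosen only for the illustration.

```python
import math

def trig_rule(q, P, thetas):
    """Evaluate (21): p_i = q1*p_1i + q2*p_2i + 2*sqrt(q1*q2*p_1i*p_2i)*cos(theta_i).

    q      -- pair (q1, q2) of probabilities for B = b1, b2
    P      -- 2x2 matrix, P[j][i] is the probability of A = a_(i+1) given B = b_(j+1)
    thetas -- pair (theta_1, theta_2) of phases
    """
    q1, q2 = q
    result = []
    for i in range(2):
        classical = q1 * P[0][i] + q2 * P[1][i]
        interference = 2.0 * math.sqrt(q1 * q2 * P[0][i] * P[1][i]) * math.cos(thetas[i])
        result.append(classical + interference)
    return result

# Hypothetical inputs: a double stochastic matrix and phases obeying (23).
q = (0.3, 0.7)
P = [[0.6, 0.4],
     [0.4, 0.6]]
theta = 1.1
p1, p2 = trig_rule(q, P, (theta, theta + math.pi))
print(p1, p2, p1 + p2)
```

With these inputs the interference terms cancel in the sum, so the normalization $p_1 + p_2 = 1$ holds although each $p_i$ differs from the conventional value $q_1 p_{1i} + q_2 p_{2i}$.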

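We can simulate the quantum probabilistic transformation by using random variables $n_{ij}(\omega)$, $m_{ij}(\omega)$ such that the deviations

$$\epsilon_{1i}^{(N)} = n_{1i} - m_{1i} = 2\xi_{1i}^{(N)} \sqrt{m_{1i} m_{2i}}, \tag{24}$$

$$\epsilon_{2i}^{(N)} = n_{2i} - m_{2i} = 2\xi_{2i}^{(N)} \sqrt{m_{1i} m_{2i}}, \tag{25}$$

where the coefficients $\xi_{ij}^{(N)}$ satisfy the inequality

$$|\xi_{1i}^{(N)} + \xi_{2i}^{(N)}| \leq 1, \quad N \to \infty. \tag{26}$$

Suppose that $\lambda_i^{(N)} = \xi_{1i}^{(N)} + \xi_{2i}^{(N)} \to \lambda_i$, $N \to \infty$, where $|\lambda_i| \leq 1$. We can represent $\lambda_i^{(N)} = \cos\theta_i^{(N)}$. Then $\theta_i^{(N)} \to \theta_i \bmod 2\pi$ when $N \to \infty$. Thus $\lambda_i = \cos\theta_i$.

We remark that the conventional probabilistic rule (which is induced by ensemble fluctuations with $\lambda_i^{(N)} \to 0$) can be observed for fluctuations having relatively large absolute magnitudes. For instance, let

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)} \sqrt{m_{1i}}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)} \sqrt{m_{2i}}, \quad i = 1, 2, \tag{27}$$

where the sequences of coefficients $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded ($N \to \infty$). Here

$$\lambda_i^{(N)} = \frac{\xi_{1i}^{(N)}}{\sqrt{m_{2i}}} + \frac{\xi_{2i}^{(N)}}{\sqrt{m_{1i}}} \to 0, \quad N \to \infty$$

(as usual, we assume that $p_{ij} > 0$).

Example 2.1. Let $N \approx 10^6$, $n_1^b \approx n_2^b \approx 5 \cdot 10^5$, $m_{11} \approx m_{12} \approx m_{21} \approx m_{22} \approx 25 \cdot 10^4$. So $q_1 = q_2 = 1/2$, $p_{11} = p_{12} = p_{21} = p_{22} = 1/2$ (symmetric state). Suppose we have fluctuations (27) with $\xi_{1i}^{(N)} \approx \xi_{2i}^{(N)} \approx 1/2$. Then $\epsilon_{1i}^{(N)} \approx \epsilon_{2i}^{(N)} \approx 500$. So $n_{ij} = 25 \cdot 10^4 \pm 500$. Hence the relative deviation

$$\frac{\epsilon_{ji}^{(N)}}{m_{ji}} = \frac{500}{25 \cdot 10^4} \approx 0.002.$$

Thus fluctuations of the relative magnitude $\approx 0.002$ produce the conventional probabilistic rule.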
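The arithmetic of Example 2.1 can be reproduced in a few lines. This is a sketch under the assumptions of the example ($m_{ij} = 25 \cdot 10^4$, $\xi = 1/2$, fluctuations of the form (27)); the variable and function names are illustrative.

```python
import math

m = 25 * 10**4   # m_ij from Example 2.1
xi = 0.5         # coefficient xi_ji from Example 2.1

def deviation_27(xi, m_own):
    """Absolute deviation of the form (27): 2 * xi * sqrt(m)."""
    return 2.0 * xi * math.sqrt(m_own)

eps = deviation_27(xi, m)                      # absolute deviation n_ji - m_ji
relative = eps / m                             # relative deviation
lam = (eps + eps) / (2.0 * math.sqrt(m * m))   # normalized deviation lambda_i^(N)
print(eps, relative, lam)
```

For these values the normalized deviation is of the same small order as the relative deviation, which is why the interference term is negligible and the conventional rule is observed.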

It is evident that fluctuations of essentially larger magnitude

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)} (m_{1i})^{1/2} (m_{2i})^{1/\alpha}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)} (m_{2i})^{1/2} (m_{1i})^{1/\beta}, \quad \alpha, \beta > 2, \tag{28}$$

where $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded sequences ($N \to \infty$), also produce (for $p_{ij} \neq 0$) the conventional probabilistic rule.

Example 2.2. Let all numbers $N$, $m_{ij}$ be the same as in Example 2.1 and let the deviations have behaviour (28) with $\alpha = \beta = 4$. Here the relative deviation $\epsilon_{ij}^{(N)} / m_{ij} \approx 0.045$.

Remark 2.1. The magnitude of fluctuations can be found experimentally. Let $A$ and $B$ be two physical observables. We prepare three statistical ensembles $S$, $T_1$, $T_2$ corresponding to the states $\varphi$, $|b_1\rangle$, $|b_2\rangle$. By measurements of $B$ and $A$ for $\pi \in S$ we obtain the frequencies $q_1^{(N)}, q_2^{(N)}, p_1^{(N)}, p_2^{(N)}$; by measurements of $A$ for $\pi \in T_1$ and for $\pi \in T_2$ we obtain the frequencies $p_{ij}^{(N)}$. We have

$$f_i(N) = \lambda_i^{(N)} = \frac{p_i^{(N)} - q_1^{(N)} p_{1i}^{(N)} - q_2^{(N)} p_{2i}^{(N)}}{2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}}.$$

It would be interesting to obtain graphs of the functions $f_i(N)$ for different pairs of physical observables. Of course, we know that $\lim_{N \to \infty} f_i(N) = \pm\cos\theta$. However, it may be that such graphs can present a finer structure of quantum states.
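The estimate described in Remark 2.1 amounts to solving rule (16) for the normalized deviation. A minimal sketch follows; the function names, the classification thresholds and the sample frequencies are hypothetical and serve only as an illustration.

```python
import math

def normalized_deviation(p_i, q1, q2, p1i, p2i):
    """Solve (16) for lambda_i, given measured frequencies (all assumed positive)."""
    return (p_i - q1 * p1i - q2 * p2i) / (2.0 * math.sqrt(q1 * p1i * q2 * p2i))

def classify(lam, tol=1e-9):
    """Name the type of probabilistic behaviour suggested by one value of lambda_i."""
    if abs(lam) <= tol:
        return "conventional"
    if abs(lam) <= 1.0 + tol:
        return "trigonometric"
    return "hyperbolic"

# Hypothetical frequencies, not experimental data.
lam = normalized_deviation(p_i=0.75, q1=0.5, q2=0.5, p1i=0.5, p2i=0.5)
kind = classify(lam)
phase = math.acos(lam) if kind == "trigonometric" else None
print(lam, kind, phase)
```

Plotting `normalized_deviation` against the ensemble size $N$ would give the graphs of $f_i(N)$ mentioned in the remark.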

3 Hyperbolic and hyper-trigonometric probabilistic transformations

Let $\mathcal{E}_1$, $\mathcal{E}_2$ be preparation procedures that produce perturbations such that the normalized (experimental) statistical deviations satisfy

$$|\lambda_i^{(N)}| \geq 1, \quad N \to \infty. \tag{29}$$

Thus $|\lambda_i| \geq 1$, $i = 1, 2$. Here the coefficients $\lambda_i$ can be represented in the form $\lambda_i = \pm\cosh\theta_i$, $i = 1, 2$. The corresponding probability rule has the following form:

$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 q_2 p_{1i} p_{2i}} \cosh\theta_i, \quad i = 1, 2.$$

The normalization $p_1 + p_2 = 1$ gives the orthogonality relation

$$\sqrt{p_{11} p_{21}} \cosh\theta_1 \pm \sqrt{p_{12} p_{22}} \cosh\theta_2 = 0. \tag{30}$$

Thus $\cosh\theta_2 = \cosh\theta_1 \sqrt{\dfrac{p_{11} p_{21}}{p_{12} p_{22}}}$ and $\operatorname{sign} \lambda_1 \lambda_2 = -1$.

This probabilistic transformation can be called a hyperbolic rule. It describes a part of nonconventional probabilistic behaviours which is not described by the trigonometric formalism. Experiments (and preparation procedures $\mathcal{E}_1$, $\mathcal{E}_2$) which produce hyperbolic probabilistic behaviour could be simulated on a computer. On the other hand, at the moment we have no natural physical phenomena which are described by the hyperbolic probabilistic formalism. Trigonometric probabilistic behaviour corresponds to essentially better control of properties in the process of preparation than hyperbolic probabilistic behaviour. Of course, the aim of any experimenter is to approach trigonometric behaviour. However, in principle there might exist natural phenomena for which trigonometric quantum behaviour could not be achieved.

Example 3.1. Let $q_1 = \alpha$, $q_2 = 1 - \alpha$, $p_{11} = \ldots = p_{22} = 1/2$. Then

$$p_1 = \frac{1}{2} + \sqrt{\alpha(1 - \alpha)}\, \lambda_1, \quad p_2 = \frac{1}{2} - \sqrt{\alpha(1 - \alpha)}\, \lambda_1.$$

If $\alpha$ is sufficiently small, then $\lambda_1$ can be in principle larger than 1. We can find a phase $\theta$ such that the normalized statistical deviation $\lambda_1 = \cosh\theta$.

Let us consider experiments that produce the hyperbolic probabilistic rule and let the corresponding matrix of probabilities be double stochastic. In this case the orthogonality relation (30) has the form

$$\cosh\theta_1 = \cosh\theta_2 = \cosh\theta.$$

We get the probabilistic transformation

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 q_2 p_{11} p_{21}} \cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 q_2 p_{12} p_{22}} \cosh\theta.$$

This probabilistic transformation looks similar to the quantum probabilistic transformation. The only difference is the presence of hyperbolic factors instead of trigonometric ones. This similarity gives the possibility to construct a linear space representation of the hyperbolic probabilistic calculus, see section 7.
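The situation of Example 3.1 can be explored numerically. The sketch below assumes the symmetric matrix $p_{ij} = 1/2$ of that example and the upper sign in the hyperbolic rule; the function names and the chosen values of $\alpha$ and $\theta$ are illustrative only.

```python
import math

def hyperbolic_rule(alpha, theta):
    """Example 3.1 with lambda_1 = cosh(theta): returns the pair (p1, p2)."""
    shift = math.sqrt(alpha * (1.0 - alpha)) * math.cosh(theta)
    return 0.5 + shift, 0.5 - shift

def max_phase(alpha):
    """Largest theta >= 0 for which p1 <= 1, i.e. sqrt(alpha*(1-alpha))*cosh(theta) <= 1/2."""
    return math.acosh(0.5 / math.sqrt(alpha * (1.0 - alpha)))

alpha = 0.01   # hypothetical small weight q1
theta = 1.5    # hypothetical phase; cosh(theta) > 1
p1, p2 = hyperbolic_rule(alpha, theta)
print(p1, p2, max_phase(alpha))
```

The smaller $\alpha$ is, the larger the admissible range of $\theta$, which matches the remark that $\lambda_1$ can exceed 1 when $\alpha$ is sufficiently small.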

The reader can easily consider by himself the last possibility: one normalized statistical deviation $|\lambda_i|$ is larger than 1 and the other is less than 1 (hyper-trigonometric probabilistic transformation).

Remark 3.1. The real experimental situation is more complicated. In fact, the phase parameter $\theta$ is connected with the experimental arrangement. In particular, in the standard interference experiments the phase is related to the space-time structure of an experiment. It may be that in some experiments the dependence of the normalized statistical deviation $\lambda$ on $\theta$ is neither trigonometric nor hyperbolic:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\, \lambda(\theta).$$

However, if the function $|\lambda(\theta)| \leq 1$, then we can obtain the trigonometric transformation by just the reparametrization $\theta' = \arccos\lambda(\theta)$.

4 Double stochasticity and correlations between preparation procedures

In this section we study the frequency meaning of the fact that in the quantum formalism the matrix of probabilities is double stochastic. We remark that this is a consequence of orthogonality of the quantum states $|b_1\rangle$ and $|b_2\rangle$ corresponding to distinct values of a physical observable $B$. We have

$$\frac{p_{11}}{p_{12}} = \frac{p_{22}}{p_{21}}. \tag{31}$$

Suppose that all quantum features are induced by the impossibility to create new ensembles $T_1$ and $T_2$ without changing properties of quantum particles. Suppose that, for example, the preparation procedure $\mathcal{E}_1$ practically destroys the property $A = a_1$ (transforms this property into the property $A = a_2$). So $p_{11} \approx 0$. As a consequence, $\mathcal{E}_1$ makes the property $A = a_2$ dominating. So $p_{12} \approx 1$. Then the preparation procedure $\mathcal{E}_2$ must practically destroy the property $A = a_2$ (transform this property into the property $A = a_1$). So $p_{22} \approx 0$. As a consequence, $\mathcal{E}_2$ makes the property $A = a_1$ dominating. So $p_{21} \approx 1$.

We remark that

We recall that the number of elements in the ensemble T is equal to n Thus

n n -run _ n22 - m 2 2 ^ nil _ 22 bdquobdquo

This is nothing other than the relation between the fluctuations of the property $A$ under the transition from the ensemble $S$ to the ensembles $T_1$, $T_2$ and the distribution of this property in the ensemble $S$.

5 Hyperbolic quantum formalism

The mathematical formalism presented in this section can have different physical interpretations. In particular, a quantum state can be interpreted from the orthodox Copenhagen as well as the statistical viewpoint.

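A hyperbolic algebra $G$, see [18], p. 21, is a two dimensional real algebra with basis $e_0 = 1$ and $e_1 = j$, where $j^2 = 1$. Elements of $G$ have the form $z = x + jy$, $x, y \in R$. We have $z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)$ and $z_1 z_2 = (x_1 x_2 + y_1 y_2) + j(x_1 y_2 + x_2 y_1)$. This algebra is commutative. We introduce the involution in $G$ by setting $\bar{z} = x - jy$. We set $|z|^2 = z\bar{z} = x^2 - y^2$. We remark that $|z| = \sqrt{x^2 - y^2}$ is not well defined for an arbitrary $z \in G$. We set $G_+ = \{z \in G : |z|^2 \geq 0\}$. We remark that $G_+$ is a multiplicative semigroup: $z_1, z_2 \in G_+ \to z = z_1 z_2 \in G_+$. It is a consequence of the equality

$$|z_1 z_2|^2 = |z_1|^2 |z_2|^2.$$

Thus, for $z_1, z_2 \in G_+$, we have $|z_1 z_2| = |z_1| |z_2|$. We introduce

$$e^{j\theta} = \cosh\theta + j\sinh\theta, \quad \theta \in R.$$

We remark that

$$e^{j\theta_1} e^{j\theta_2} = e^{j(\theta_1 + \theta_2)}, \quad \overline{e^{j\theta}} = e^{-j\theta}, \quad |e^{j\theta}|^2 = \cosh^2\theta - \sinh^2\theta = 1.$$

Hence $z = \pm e^{j\theta}$ always belongs to $G_+$. We also have

$$\cosh\theta = \frac{e^{j\theta} + e^{-j\theta}}{2}, \quad \sinh\theta = \frac{e^{j\theta} - e^{-j\theta}}{2j}.$$

We set $G_+^* = \{z \in G_+ : |z|^2 > 0\}$. Let $z \in G_+^*$. We have

$$z = |z| \left( \frac{x}{|z|} + j\frac{y}{|z|} \right) = \operatorname{sign} x\, |z| \left( \frac{x \operatorname{sign} x}{|z|} + j\frac{y \operatorname{sign} x}{|z|} \right).$$

As $\dfrac{x^2}{|z|^2} - \dfrac{y^2}{|z|^2} = 1$, we can represent $\dfrac{x \operatorname{sign} x}{|z|} = \cosh\theta$ and $\dfrac{y \operatorname{sign} x}{|z|} = \sinh\theta$, where the phase $\theta$ is uniquely defined. We can represent each $z \in G_+^*$ as

$$z = \operatorname{sign} x\, |z|\, e^{j\theta}.$$

By using this representation we can easily prove that $G_+^*$ is a multiplicative group. Here $\dfrac{1}{z} = \dfrac{\operatorname{sign} x}{|z|} e^{-j\theta}$. The unit circle in $G$ is defined as $S_1 = \{z \in G : |z|^2 = 1\} = \{z = \pm e^{j\theta}, \theta \in (-\infty, +\infty)\}$. It is a multiplicative subgroup of $G_+^*$.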
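The arithmetic of the hyperbolic algebra is easy to experiment with. The class below is a minimal sketch, not code from [18]: the names `Hyperbolic`, `conj`, `norm2`, `h_exp` and `polar` are hypothetical, and the test values are arbitrary.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Hyperbolic:
    """Element z = x + j*y of the hyperbolic algebra G, with j*j = 1."""
    x: float
    y: float

    def __add__(self, other):
        return Hyperbolic(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x1 + j y1)(x2 + j y2) = (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)
        return Hyperbolic(self.x * other.x + self.y * other.y,
                          self.x * other.y + self.y * other.x)

    def conj(self):
        """Involution: x + j*y -> x - j*y."""
        return Hyperbolic(self.x, -self.y)

    def norm2(self):
        """|z|^2 = z * conj(z) = x^2 - y^2 (may be negative)."""
        return self.x ** 2 - self.y ** 2

def h_exp(theta):
    """e^{j*theta} = cosh(theta) + j*sinh(theta)."""
    return Hyperbolic(math.cosh(theta), math.sinh(theta))

def polar(z):
    """Return (sign, modulus, theta) with z = sign * modulus * e^{j*theta}; needs |z|^2 > 0."""
    n2 = z.norm2()
    if n2 <= 0:
        raise ValueError("polar form requires |z|^2 > 0")
    sign = 1.0 if z.x > 0 else -1.0
    return sign, math.sqrt(n2), math.atanh(z.y / z.x)

z1, z2 = Hyperbolic(3.0, 1.0), Hyperbolic(-2.0, 0.5)
product = z1 * z2
sign, modulus, theta = polar(z2)
print(product.norm2(), z1.norm2() * z2.norm2(), sign, modulus, theta)
```

The phase is recovered with `atanh(y / x)` because $\tanh\theta = \sinh\theta / \cosh\theta = y / x$ whenever $|z|^2 > 0$.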

A hyperbolic Hilbert space is a $G$-linear space (module), see [18], $E$ with a $G$-linear product: a map $(\cdot, \cdot) : E \times E \to G$ that is

1) linear with respect to the first argument: $(az + bw, u) = a(z, u) + b(w, u)$, $a, b \in G$, $z, w, u \in E$;
2) symmetric: $(z, u) = \overline{(u, z)}$;
3) nondegenerate: $(z, u) = 0$ for all $u \in E$ iff $z = 0$.

If we consider $E$ as just an $R$-linear space, then $(\cdot, \cdot)$ is a bilinear form which is not positively defined. In particular, in the two dimensional case we have the signature $(+, -, +, -)$.

As in the ordinary quantum formalism, we represent physical states by normalized vectors of the hyperbolic Hilbert space: $\varphi \in E$ and $(\varphi, \varphi) = 1$. We shall consider only dichotomic physical variables and quantum states belonging to the two dimensional Hilbert space. So everywhere below $E$ denotes the two dimensional space. Let $A = a_1, a_2$ and $B = b_1, b_2$ be two dichotomic physical variables. We represent them by the $G$-linear operators $|a_1\rangle\langle a_1| + |a_2\rangle\langle a_2|$ and $|b_1\rangle\langle b_1| + |b_2\rangle\langle b_2|$, where $\{|a_i\rangle\}_{i=1,2}$ and $\{|b_i\rangle\}_{i=1,2}$ are two orthonormal bases in $E$.

Let $\varphi$ be a state (a normalized vector belonging to $E$). We can perform the following operation (which is well defined from the mathematical point of view). We expand the vector $\varphi$ with respect to the basis $\{|b_i\rangle\}_{i=1,2}$:

$$\varphi = \beta_1 |b_1\rangle + \beta_2 |b_2\rangle, \tag{34}$$

where the coefficients (coordinates) $\beta_i$ belong to $G$. As the basis $\{|b_i\rangle\}_{i=1,2}$ is orthonormal, we get (as in the complex case) that

$$|\beta_1|^2 + |\beta_2|^2 = 1. \tag{35}$$

However, we could not automatically use Born's probabilistic interpretation for normalized vectors in the hyperbolic Hilbert space: it may be that $\beta_i \notin G_+$ (in fact, in the complex case we have $C = C_+$). We say that a state $\varphi$ is decomposable with respect to the system of states $\{|b_i\rangle\}_{i=1,2}$ ($B$-decomposable) if

$$\beta_i \in G_+. \tag{36}$$

In such a case we can use Born's probabilistic interpretation of vectors in a hyperbolic Hilbert space.

Numbers $q_i = |\beta_i|^2$, $i = 1, 2$, are interpreted as probabilities for the values $B = b_i$ for the $G$-quantum state $\varphi$.

We now repeat these considerations for each state $|b_i\rangle$ by using the basis $\{|a_k\rangle\}_{k=1,2}$. We suppose that each $|b_i\rangle$ is $A$-decomposable. We have

$$|b_1\rangle = \beta_{11} |a_1\rangle + \beta_{12} |a_2\rangle, \quad |b_2\rangle = \beta_{21} |a_1\rangle + \beta_{22} |a_2\rangle, \tag{37}$$

where the coefficients $\beta_{ik}$ belong to $G_+$. We have automatically

$$|\beta_{11}|^2 + |\beta_{12}|^2 = 1, \quad |\beta_{21}|^2 + |\beta_{22}|^2 = 1. \tag{38}$$

We can use the probabilistic interpretation of the numbers $p_{11} = |\beta_{11}|^2$, $p_{12} = |\beta_{12}|^2$ and $p_{21} = |\beta_{21}|^2$, $p_{22} = |\beta_{22}|^2$: $p_{ik}$ is the probability for $a = a_k$ in the state $|b_i\rangle$.

Let us consider the matrices $B = (\beta_{ik})$ and $P = (p_{ik})$. As in the complex case, the matrix $B$ is unitary: the vectors $u_1 = (\beta_{11}, \beta_{12})$ and $u_2 = (\beta_{21}, \beta_{22})$ are orthonormal. The matrix $P$ is double stochastic.

By using the $G$-linear space calculation (the change of the basis) we get $\varphi = \alpha_1 |a_1\rangle + \alpha_2 |a_2\rangle$, where $\alpha_1 = \beta_1 \beta_{11} + \beta_2 \beta_{21}$ and $\alpha_2 = \beta_1 \beta_{12} + \beta_2 \beta_{22}$.

We remark that decomposability is not transitive. In principle $\varphi$ may be not $A$-decomposable, despite $B$-decomposability of $\varphi$ and $A$-decomposability of the $B$-system.

Suppose that $\varphi$ is $A$-decomposable. Therefore the coefficients $p_k = |\alpha_k|^2$ can be interpreted as probabilities for $a = a_k$ for the $G$-quantum state $\varphi$.

Let us consider states such that the coefficients $\beta_i$, $\beta_{ik}$ belong to $G_+^*$. We can uniquely represent them as

$$\beta_i = \pm\sqrt{q_i}\, e^{j\xi_i}, \quad \beta_{ik} = \pm\sqrt{p_{ik}}\, e^{j\gamma_{ik}}, \quad i, k = 1, 2.$$

We find that

$$p_1 = q_1 p_{11} + q_2 p_{21} + 2\epsilon_1 \sqrt{q_1 p_{11} q_2 p_{21}} \cosh\theta_1, \tag{39}$$

$$p_2 = q_1 p_{12} + q_2 p_{22} + 2\epsilon_2 \sqrt{q_1 p_{12} q_2 p_{22}} \cosh\theta_2, \tag{40}$$

where $\theta_i = \eta + \gamma_i$ and $\eta = \xi_1 - \xi_2$, $\gamma_1 = \gamma_{11} - \gamma_{21}$, $\gamma_2 = \gamma_{12} - \gamma_{22}$ and $\epsilon_i = \pm$. To find the right relation between the signs of the last terms in equations (39), (40) we use the normalization condition

$$|\alpha_1|^2 + |\alpha_2|^2 = 1 \tag{41}$$

(which is a consequence of the normalization of $\varphi$ and orthonormality of the system $\{|a_i\rangle\}_{i=1,2}$). It is equivalent to the equation (condition of orthogonality in the hyperbolic case, see section 8)

$$\sqrt{p_{12} p_{22}} \cosh\theta_2 \pm \sqrt{p_{11} p_{21}} \cosh\theta_1 = 0.$$

Thus we have to choose opposite signs in equations (39), (40). Unitarity of $B$ also implies that $\theta_1 - \theta_2 = 0$, so $\gamma_1 = \gamma_2$. We recall that in ordinary quantum mechanics we have similar conditions, but trigonometric functions are used instead of hyperbolic ones and the phases $\gamma_1$ and $\gamma_2$ are such that $\gamma_1 - \gamma_2 = \pi$.

Finally, we get that (unitary) linear transformations in the $G$-Hilbert space (in the domain of decomposable states) represent the hyperbolic transformation of probabilities (see section 8):

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 p_{11} q_2 p_{21}} \cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 p_{12} q_2 p_{22}} \cosh\theta.$$

This is a kind of hyperbolic interference.

There can be some connection with quantization in Hilbert spaces with indefinite metric, as well as with the theory of relativity. However, at the moment we cannot say anything definite. It seems that by using Lorentz rotations we can produce hyperbolic interference in a similar way as we produce the standard trigonometric interference by using ordinary rotations.
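The hyperbolic interference rule can be checked against a direct computation with hyperbolic amplitudes. The sketch below stores $x + jy$ as a pair `(x, y)`; all helper names and all numerical values ($q_i$, the matrix entries, the phases $\xi_i$ and $\gamma$) are hypothetical choices, with the second row of the amplitude matrix picked so that the rows are orthonormal and $\gamma_1 = \gamma_2 = \gamma$.

```python
import math

def h_exp(theta):
    """e^{j*theta} = cosh(theta) + j*sinh(theta), stored as the pair (x, y)."""
    return (math.cosh(theta), math.sinh(theta))

def h_mul(z, w):
    """Product in G: (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)."""
    return (z[0] * w[0] + z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def h_add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def h_scale(c, z):
    return (c * z[0], c * z[1])

def h_norm2(z):
    """|z|^2 = x^2 - y^2."""
    return z[0] ** 2 - z[1] ** 2

# Hypothetical state and amplitude matrix (P is double stochastic with p11 = p22 = p).
q1, q2 = 0.05, 0.95
p = 0.5
xi1, xi2, gamma = 0.4, 0.1, 0.2

beta1 = h_scale(math.sqrt(q1), h_exp(xi1))
beta2 = h_scale(math.sqrt(q2), h_exp(xi2))
b11 = (math.sqrt(p), 0.0)
b12 = (math.sqrt(1.0 - p), 0.0)
b21 = h_scale(math.sqrt(1.0 - p), h_exp(-gamma))
b22 = h_scale(-math.sqrt(p), h_exp(-gamma))

# Change of basis: alpha_k = beta_1 * beta_1k + beta_2 * beta_2k.
alpha1 = h_add(h_mul(beta1, b11), h_mul(beta2, b21))
alpha2 = h_add(h_mul(beta1, b12), h_mul(beta2, b22))

# Closed form of the hyperbolic rule with theta = (xi1 - xi2) + gamma.
theta = (xi1 - xi2) + gamma
cross = 2.0 * math.sqrt(q1 * q2 * p * (1.0 - p)) * math.cosh(theta)
p1_formula = q1 * p + q2 * (1.0 - p) + cross
p2_formula = q1 * (1.0 - p) + q2 * p - cross
print(h_norm2(alpha1), p1_formula, h_norm2(alpha2), p2_formula)
```

In this example the interference terms enter with opposite signs, so the two probabilities still add up to 1. A different choice of parameters can make `alpha2` leave $G_+$, which illustrates that decomposability is not automatic.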


6 Physical consequences

The wave-particle dualism was created to explain the interference phenomenon for massive elementary particles. In particular, the orthodox Copenhagen interpretation was proposed to find a compromise between corpuscular and wave features of elementary particles. The idea of superposition of distinct properties is in fact based on these interference experiments. It is well known that the orthodox Copenhagen interpretation is not free of difficulties (in particular, collapse of the wave function) and even paradoxes (see, for example, Schrödinger [19]). Problems in the orthodox Copenhagen interpretation have even induced attempts to exclude corpuscular objects from quantum theory altogether, see, for example, [20] for Schrödinger's critique of the classical concept of a particle. At the moment there is only one alternative to the orthodox Copenhagen interpretation, namely Einstein's statistical interpretation. By this interpretation the wave function describes distinct statistical features of an ensemble of elementary particles, see L. Ballentine [17] for the details (see also [16], [5], [10], [11]).

However, we must recognize that Einstein's statistical approach could not solve the fundamental problem of quantum theory: it could not explain the appearance of NEW STATISTICS in the purely corpuscular model. We did this in the present paper. On one hand, this is a strong argument in favour of the statistical interpretation of quantum mechanics. On the other hand, one of the main motivations to use the wave-particle duality disappears.

Nevertheless, our investigation could not be considered as a crucial argument against the wave-particle duality. It is clear that by using purely mathematical analysis we cannot prove or disprove a physical theory. The only thing that we proved is that corpuscular objects (that have no wave features) can exhibit NEW STATISTICS.

In fact, we obtained essentially more than planned: this NEW STATISTICS is not reduced to QUANTUM STATISTICS. In principle, we can propose experiments that induce TRIGONOMETRIC, HYPERBOLIC and HYPER-TRIGONOMETRIC STATISTICS.

We remark that the quantum probabilistic transformation $P = P_1 + P_2 + 2\sqrt{P_1 P_2} \cos\theta$ gives the possibility to predict the probability $P$ if we know the probabilities $P_1$ and $P_2$. In principle, there might be created theories based on arbitrary transformations $P = F(P_1, P_2)$. It may be that some rules have linear space representations over exotic number systems, for example, p-adic numbers [20].

Preliminary analysis of probabilistic foundations of quantum mechanics (that induced the present investigation) was performed in the books [11] and [21] (chapter 2); a part of the results of this paper was presented in the preprints [22]-[24].

Acknowledgements

I would like to thank S. Albeverio, L. Accardi, L. Ballentine, V. Belavkin, E. Beltrametti, W. De Muynck, S. Gudder, T. Hida, A. Holevo, P. Lahti, A. Peres, J. Summhammer and I. Volovich for (sometimes critical) discussions on probabilistic foundations of quantum mechanics.

References

1. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).
2. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
3. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes. The wave-particle dualism. A tribute to Louis de Broglie on his 90th Birthday, ed. S. Diner, D. Fargue, G. Lochak and F. Selleri (D. Reidel Publ. Company, Dordrecht, 297-330, 1984).
4. B. d'Espagnat, Veiled Reality. An analysis of present-day quantum mechanical concepts (Addison-Wesley, 1995).
5. A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, 1994).
6. J. von Neumann, Mathematical foundations of quantum mechanics (Princeton Univ. Press, Princeton, N.J., 1955).
7. E. Schrödinger, Philosophy and the Birth of Quantum Mechanics. Edited by M. Bitbol, O. Darrigol (Editions Frontieres, 1992).
8. J. M. Jauch, Foundations of Quantum Mechanics (Addison-Wesley, Reading, Mass., 1968).
9. P. Busch, M. Grabowski, P. Lahti, Operational Quantum Physics (Springer Verlag, 1995).
10. W. De Muynck, W. De Baere, H. Martens, Found. Phys. 24, 1589-1663 (1994).
11. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
12. J. Summhammer, Int. J. Theor. Phys. 33, 171-178 (1994).
13. R. von Mises, The mathematical theory of probability and statistics (Academic, London, 1964).
14. A. N. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer Verlag, Berlin, 1933); reprinted: Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
15. W. Heisenberg, Z. Physik 43, 172 (1927).
16. L. E. Ballentine, Quantum mechanics (Englewood Cliffs, New Jersey, 1989).
17. L. E. Ballentine, Rev. Mod. Phys. 42, 358-381 (1970).
18. A. Yu. Khrennikov, Superanalysis (Kluwer Academic Publishers, Dordrecht, 1999).
19. E. Schrödinger, Die Naturwiss. 23, 807-812, 824-828, 844-849 (1935).
20. E. Schrödinger, What is an elementary particle? in Gesammelte Abhandlungen (Vieweg and Son, Wien, 1984).
21. A. Yu. Khrennikov, p-adic valued distributions in mathematical physics (Kluwer Academic Publishers, Dordrecht, 1994).
22. A. Yu. Khrennikov, Ensemble fluctuations and the origin of quantum probabilistic rule. Rep. MSI, Växjö Univ., 90, October (2000).
23. A. Yu. Khrennikov, Classification of transformations of probabilities for preparation procedures: trigonometric and hyperbolic behaviours. Preprint quant-ph/0012141, 24 Dec (2000).
24. A. Yu. Khrennikov, Hyperbolic quantum mechanics. Preprint quant-ph/0101002, 31 Dec (2000).

NONCONVENTIONAL VIEWPOINT TO ELEMENTS OF PHYSICAL REALITY BASED ON NONREAL ASYMPTOTICS OF RELATIVE FREQUENCIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se

We study the connection between stabilization of relative frequencies and elements of physical reality. We observe that, besides the standard stabilization with respect to the real metric, there can be considered other statistical stabilizations (in particular, with respect to the so-called p-adic metric on the set of rational numbers). Nonconventional statistical stabilizations might be connected with new (nonconventional) elements of reality. We present a few natural examples of statistical phenomena in which relative frequencies of observed events stabilize in the p-adic metric, but fluctuate in the standard real metric.

1 Introduction

The present methodology of physical measurements is based on the principle of the statistical stabilization of relative frequencies in the long run of trials. In the mathematical model this principle is represented by the law of large numbers. This approach to measurements is induced by the human representation of physical reality as a reality of stable repetitive phenomena. In the process of evolution we created cognitive structures that correspond to elements of this repetitive physical reality. All modern physical investigations are oriented to the creation of new elements of such a reality.

It must be remarked that the notion of stabilization (of relative frequencies) plays the fundamental role in the creation of this reality. I would like to point out that the conventional meaning of stabilization is based on real numbers. When we say stabilization, we mean the stabilization with respect to the standard real metric $\rho_R(x, y) = |x - y|$ (the distance between points x and y on the real line R). Of course, such a choice of the metric that determines statistically elements of physical reality was not just a consequence of the development of one special mathematical theory, real analysis.^b It seems that the notion of $\rho_R$-stabilization was induced by human practice in which quantities $n \ll N$ were not important. We created real physical reality because we used smallness based on the standard order on the set of natural numbers.

^a We ask the reader not to connect our vague (common sense) use of the notion of an element of physical reality with the EPR sufficient condition to be an element of reality [1].
^b Nevertheless, we must not forget that the human factor played a large role in the spreading of the (presently dominating) model of physical reality based on real numbers. At the beginning, Newton's analysis was propagated as a kind of religion. There were (in particular, in France) divine services devoted to Newton's analysis.

It must be underlined that in modern physics the real physical reality (i.e. reality based on the $\rho_R$-stability) is in fact identified with the whole physical reality.

On the other hand, modern mathematics is no longer just real analysis. In particular, the development of general topology [2], [3] induced a large spectrum of new nearness (in particular, metric) structures. In principle, we need no longer identify any stabilization with the $\rho_R$-stabilization. There appears a huge set of new possibilities to introduce new forms of stability in physical experiments. Moreover, new stable structures can be considered as new elements of physical reality that in general need not belong to the standard real reality.

This idea was presented for the first time in the author's investigations [4], [5] on so-called p-adic physics [6]-[10]. Later we tried to find the place of p-adic probabilities in quantum physics [11], [12] (in particular, to justify on the mathematical level of rigorousness the use of negative and complex probabilities, as well as to create models with hidden variables that do not produce Bell's inequality). In this paper we give a brief introduction to these probabilistic models and present a few rather natural examples in which relative frequencies of events stabilize with respect to the so-called p-adic metric, but fluctuate with respect to $\rho_R$. There is no corresponding element of the real reality. But there is an element of the p-adic reality. The objects considered in the examples could be created on the hard-level. In particular, to create a plantation in which the colour of a flower (red or white) is an element of p-adic reality I need just a tractor and a (sufficiently large) piece of land. Nevertheless, I must agree that such p-adic elements of reality were never observed in naturally created physical objects.

The reader may be interested in the reasons why we concentrate on the statistical stabilization with respect to the p-adic numbers (p-adic frequency probability theory). The main reason is that p-adic numbers are in fact the unique alternative to real numbers: there is no other possibility to complete the field of rational numbers and obtain a new number field (Ostrovskii's theorem, see, for example, [13], [14]).

Our probabilistic foundations are based on the generalization of R. von Mises' frequency theory of probability [15], [16]. At the beginning of this century, when the foundations of modern probability theory were being laid, the frequency definition of probability proposed by von Mises played an important role. In particular, it was this definition of probability that Kolmogorov used to motivate his axioms of probability theory (see [17]). We also begin the construction of the new theory of probability with a frequency definition of probability.

Von Mises defined the probability of an event as the limit of the relative frequencies of the occurrence of the event when the volume of the statistical sample tends to infinity. This definition is the foundation of mathematical statistics (see, for example, Cramer [18]), in which von Mises' definition is formulated as the principle of statistical stabilization of relative frequencies.

In this paper we propose a general principle of statistical stabilization of relative frequencies. By virtue of this principle, statistical stabilization of relative frequencies $\nu = n/N$ can be considered not only in the real topology on Q (and all relative frequencies are rational numbers) but also in any other topology on Q. Then the probabilities of events belong to the corresponding completion of the field of rational numbers. As special cases we obtain the ordinary real probability theory (von Mises' definition) and p-adic probability theories, $p = 2, 3, 5, \ldots$

How should one choose the topology of statistical stabilization for a given statistical sample? The topology is determined by the properties of the studied probability model. In essence, we propose this principle: for each probability model there is a corresponding topology (or topologies) of statistical stabilization.

For example, in a random sample there need not be any statistical stabilization of the relative frequencies in the real metric. Thus, from the point of view of real probability theory, this is not a probabilistic object. However, in this random sample one may observe p-adic statistical stabilization of the relative frequencies.

In essence, I am asserting that the foundation of probability theory is provided by rational numbers (relative frequencies) and not real numbers. Real probabilities of events merely represent one of many possibilities that arise in the statistical analysis of a random sample. Such an approach to probability theory agrees well with Volovich's proposition that rational numbers are the foundation of theoretical physics [19]. In accordance with this proposition, everything physical is rational, and number fields that are different from the field of rational numbers arise as an idealization needed for the theoretical description of physical results.

All necessary information on p-adic (and more general m-adic) numbers can be found in Appendix 1 of this paper. However, in the first two sections they are hardly used at all, and we may restrict ourselves to the remark that, in addition to the completion of the field of rational numbers Q with respect to the real metric, there also exist completions with respect to other metrics, and among these completions there are the fields of p-adic numbers $Q_p$, $p = 2, 3, 5, \ldots$
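To make the phrase "other metrics" concrete, the sketch below implements the standard p-adic absolute value $|x|_p = p^{-v_p(x)}$ on rational numbers, where $v_p(x)$ is the exponent of $p$ in $x$. This is the usual textbook definition rather than code from the paper, and the function names are illustrative. The sequence $2^n$ serves as a simple example of a sequence that tends to 0 in the 2-adic metric while it grows without bound in the real metric.

```python
from fractions import Fraction

def p_adic_valuation(x, p):
    """Exponent v such that x = p**v * (a / b) with a and b not divisible by p."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def p_adic_abs(x, p):
    """p-adic absolute value |x|_p = p**(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(1, p) ** p_adic_valuation(x, p)

# Real size grows, 2-adic size shrinks.
for n in (1, 5, 10, 20):
    print(n, 2 ** n, p_adic_abs(2 ** n, 2))
```

The p-adic distance between two rationals is then `p_adic_abs(x - y, p)`, which is the metric with respect to which stabilization of relative frequencies is considered.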

2 Analysis of the foundation of probability theory

2.1 Frequency Definition of Probability. As is well known, the frequency definition of probability proposed by von Mises [15] in 1919 played an important role in the construction of the foundations of modern probability theory. This definition exerted a strong influence on the theory of probability measures, the foundations of which were laid by Borel [20], Kolmogorov [17] and Frechet [21]. There is no point in giving here Kolmogorov's axioms (which can be found in any textbook on probability theory), but it is probably necessary to recall in its general features the main propositions of von Mises' theory of probability. The theory is based on infinite sequences $x = (x_1, x_2, \ldots, x_N, \ldots)$ of samplings or observations. If an experiment having $S$ outcomes is made, then $x_j$ can take the values $1, 2, \ldots, S$ (possible outcomes). For the standard experiment of coin tossing we have $S = 2$ and $x_j = 1, 2$. In what follows, possible outcomes of an experiment will be called labels.

However, not every such sequence is regarded as an object of probability theory. The fundamental principle of the frequency theory of probability is the principle of statistical stabilization of the relative frequencies of occurrence of a particular label, and only sequences of samplings that satisfy this principle are regarded as objects of probability theory. Such sequences of samplings are called collectives.

"A collective is a bulk phenomenon or a repeated process, in brief, a series of individual observations for which one is justified in assuming that the relative frequency of occurrence of each individual observable label tends to a definite limiting value" [16].

The probability of an event $E$ is defined as the limit of the sequence of frequencies $\nu^{(N)} = n/N$, where $n$ is the number of cases in which the event $E$ is detected in the first $N$ tests.

For the subsequent considerations it is important to note that in the statistical analysis of the results of an experiment only rational numbers (relative frequencies) are obtained.
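The observation that an experiment yields only rational numbers can be made explicit by computing the relative frequencies $\nu^{(N)} = n/N$ with exact rational arithmetic. The following sketch simulates a coin-like sequence of labels with a pseudo-random generator; the function name, the seed and the sample size are arbitrary choices for the illustration, and such a simulated sequence is of course not a collective in von Mises' strict sense.

```python
import random
from fractions import Fraction

def relative_frequencies(labels, event_label):
    """Exact relative frequencies nu^(N) = n/N of one label for N = 1, 2, ..., len(labels)."""
    freqs = []
    count = 0
    for N, x in enumerate(labels, start=1):
        if x == event_label:
            count += 1
        freqs.append(Fraction(count, N))
    return freqs

rng = random.Random(0)                                 # fixed seed for reproducibility
sample = [rng.choice((1, 2)) for _ in range(10000)]    # S = 2 labels, as for a coin
nu = relative_frequencies(sample, event_label=1)
print(nu[9], nu[99], nu[9999], float(nu[9999]))
```

Every entry of `nu` is a `Fraction`, so the same sequence can be examined for stabilization either in the real metric or in a p-adic metric without any rounding.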

The principle of statistical stabilization of the relative frequencies is used practically unchanged in mathematical statistics

Observation of the frequency $\nu_N$ of a fixed event $E$ for increasing values of $N$ reveals that this frequency has, generally speaking, a tendency to take a more or less constant value at large $N$ (see Cramer [18]).

In defining a collective, von Mises used a further principle - the principle of irregularity of a sequence of tests, i.e., invariance of the limit of the relative frequencies with respect to the selection, made using a definite law, of some subsequence from a given sequence of tests $x = (x_1, x_2, \ldots, x_n, \ldots)$. It is important that the law of this selection should not be based on the difference of the elements of the sequence with respect to the considered label.

Second, this limiting value must remain unchanged if from the complete sequence we choose arbitrarily any part and consider in what follows only this part [16].

This principle, like the principle of statistical stabilization of the relative frequencies, is fully in accord with our intuitive ideas of randomness. However, there are here some logical difficulties associated with the arbitrariness of the choice. A detailed analysis of these logical problems was made by Khinchin [22]; see also [12] for the details. It appears that one must agree with Khinchin's critical comments and consider the frequency theory of probability that is based only on von Mises's first principle - the principle of statistical stabilization of the relative frequencies.

As is noted in [22], the frequency theory of probability based solely on von Mises's first principle is axiomatized and is as rigorous a mathematical theory as Kolmogorov's theory of probability. Here we do not intend to consider von Mises's theory of probability in the framework of an axiomatic approach. Our task is to analyze the principle of stabilization of the frequencies of occurrence of a particular event in a collective.

2.2 Von Mises Frequency Theory of Probabilities as Objective Foundation of Kolmogorov's Axiomatics

As motivation for his axioms, Kolmogorov used the properties of limits of relative frequencies; see [17]. We shall be interested in the manner in which Kolmogorov's axiom 2 arose. In accordance with this axiom, the probability $P(E)$ of any event $E$ is a nonnegative real number $\le 1$. In [17] Kolmogorov considers von Mises's definition [16] of probability as the limit of the relative frequencies of occurrence of the event $E$. Further, since the relative frequencies $\nu_N(E) = n/N$ are rational numbers that lie between zero and unity, their limits in the real topology are real numbers between zero and unity. Cramer proceeded similarly in the construction of his theory of probability distributions [18].

Khinchin, discussing the advantages of Kolmogorov's axioms over von Mises's frequency theory of probability, noted that from the formal aspect the mutual relationship between the axiomatic and frequency theories is characterized in the first place by a higher degree of abstraction of the former.

This higher degree of abstraction was the foundation of the successful development of the theory of probability measures. However, this degree of abstraction is too high, and some properties of the world of real frequencies are lost in it. Essentially, the rational numbers were lost in Kolmogorov's theory of probability. Whereas in von Mises's theory the rational numbers arise as primary objects and real probabilities are obtained as a result of a limiting process for rational frequencies, in Kolmogorov's theory rational frequencies are secondary objects associated with real probabilities (which are here primary) by means of the law of large numbers.

3 General principle of statistical stabilization of relative frequencies

First, we emphasize that the probabilities $P$ in von Mises's frequency theory are ideal objects (symbols to denote the sequences of relative frequencies that are stabilized in the field of real numbers). Therefore real numbers arise here as ideal objects associated with rational sequences of frequencies (see also Borel [20] and Poincare [23]).

A basis for a broader view of probability theory is provided by the following principle of statistical stabilization of frequencies:

Statistical stabilization (the limiting process) can be considered not only in the real topology on the field of rational numbers $Q$ but also in any other topology on $Q$. The probabilities of events are defined as the limits of the sequences of relative frequencies in the corresponding completions of the field of rational numbers.

For each considered probability model there is a corresponding topology on the field of rational numbers. The metrizable topologies on $Q$ given by absolute values are the most interesting. By virtue of Ostrowski's theorem there are very few such topologies: besides the usual real topology, for which $\rho(x, y) = |x - y|$, there exist only the p-adic topologies, $p = 2, 3, \ldots$, where $\rho_p(x, y) = |x - y|_p$. Thus, if we consider only topologies given by absolute values, then besides the usual probability theory over $R$ we obtain only the probability theories over $Q_p$.

It is here necessary to introduce a natural restriction on the topology of statistical stabilization:

The completion $Q_t$ of the field of rational numbers $Q$ with respect to the statistical stabilization topology $t$ is a topological field.

We have deliberately not introduced this restriction into the general principle of statistical stabilization. One can also consider statistical stabilization topologies that are not consistent with the algebraic structure on $Q$. However, probability theory based on such topologies loses many familiar properties. For example, it turns out that the continuity of the addition operation is equivalent to additivity of probabilities, and continuity of the division operation is equivalent to the existence of conditional probabilities.

Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be some collective. We denote the set of all labels for this collective (possible outcomes of an experiment producing this collective) by the symbol $\Pi$. We denote by $\Omega$ the event consisting in the realization of at least one of the labels $\pi \in \Pi$.

Proposition 3.1. The probability of the event $\Omega$ is equal to unity.

To prove this it is sufficient to use the fact that all the relative frequencies are equal to unity.

Let $\nu^{(j)}$, $j = 1, 2$, be the relative frequencies of realization of certain labels $\pi_1$ and $\pi_2$, and let $P_j = \lim \nu^{(j)}$ be the corresponding probabilities. Let the event $A$ be the realization of the label $\pi_1$ or $\pi_2$: $A = \pi_1 \vee \pi_2$. Using the continuity of the addition operation, we obtain

$$P(A) = \lim \nu^{(A)} = \lim (\nu^{(1)} + \nu^{(2)}) = \lim \nu^{(1)} + \lim \nu^{(2)} = P_1 + P_2. \qquad (1)$$

This rule can be generalized to any number of mutually exclusive events.

Proposition 3.2. Let $A_j$, $j = 1, \ldots, k$, be mutually exclusive events (i.e., the sets of labels that define these events are disjoint). Then

$$P(A_1 \vee \cdots \vee A_k) = \sum_{j=1}^{k} P(A_j). \qquad (2)$$

Using the continuity of the subtraction operation, we obtain the following proposition.

Proposition 3.3. For any two events $A$ and $B$ the equation $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$ holds.

In the language of collectives, the rule of addition of probabilities is formulated as follows (see [16]). Beginning with an original collective possessing more than two labels, an appreciable number of new collectives can be constructed by uniting labels: the elements of the new collective are the same as in the original one, but their labels are unifications of the labels of the original collective. To the unification of labels there corresponds the addition of frequencies.
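The correspondence between uniting labels and adding frequencies already holds exactly at the level of finite samples, before any limit is taken. The following sketch checks this on a hypothetical sample with four labels; the sample and the helper name `frequency` are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction

def frequency(sample, labels):
    """Relative frequency of the event 'the observed label lies in `labels`'."""
    counts = Counter(sample)
    return Fraction(sum(counts[label] for label in labels), len(sample))

# Hypothetical finite initial segment of a collective with labels 1..4.
sample = [1, 3, 2, 2, 3, 1, 1, 4, 2, 1]

# Uniting the disjoint labels 1 and 2 adds their frequencies.
united = frequency(sample, {1, 2})
print(united, frequency(sample, {1}) + frequency(sample, {2}))

# For overlapping events A and B the intersection is subtracted once.
A, B = {1, 2}, {2, 3}
print(frequency(sample, A | B),
      frequency(sample, A) + frequency(sample, B) - frequency(sample, A & B))
```

Passing to the limit in these identities requires only the continuity of addition and subtraction in the chosen completion of $Q$.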

We consider the set of rational numbers $U = \{x \in Q : 0 \le x \le 1\}$. We denote by the symbol $U_t$ the closure of the set $U$ in the field $Q_t$ (if $t$ is the ordinary real topology, then $U_t = [0, 1]$). An obvious consequence of the definition of probabilities is the following proposition.

Proposition 3.4. The probability $P(E)$ of any event belongs to the set $U_t$.

Conditional probabilities are then introduced into the frequency theory in the same way as in [16]. Suppose there is some initial collective $x = (x_1, x_2, \ldots, x_n, \ldots)$ with probabilities $P_\pi$ of the labels $\pi \in \Pi$. Using the unification rule, we define the probabilities of all groups of labels:

$$P(A) = \sum_{\pi \in A} P_\pi. \qquad (3)$$

We fix some group of labels $B = \pi_{i_1} \vee \cdots \vee \pi_{i_k}$. We are interested in the conditional probability $P(\pi|B)$, $\pi \in B$, of the label $\pi$ given the condition $B$. We form a new collective $x' = (x'_1, x'_2, \ldots, x'_n, \ldots)$, which is obtained from the original one by choosing only the elements with the labels $\pi \in B$. The probability of the label $\pi$ in this new collective is then called the conditional probability of the label $\pi$ under the condition $B$: $P(\pi|B) = \lim \nu^{(\pi|B)}$, where $\nu^{(\pi|B)}$ are the relative frequencies of the label $\pi$ in the new collective. Noting that $\nu^{(\pi|B)} = \nu^{(\pi)} / \nu^{(B)}$, where $\nu^{(\pi)}$ is the relative frequency of the label $\pi$ in the collective $x$ and $\nu^{(B)}$ is the relative frequency of the event $B$ in the collective $x$, we obtain (using the continuity of the division operation)

$$P(\pi|B) = \lim \frac{\nu^{(\pi)}}{\nu^{(B)}} = \frac{\lim \nu^{(\pi)}}{\lim \nu^{(B)}} = \frac{P(\pi)}{P(B)}, \qquad P(B) \neq 0. \qquad (4)$$
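The identity $\nu^{(\pi|B)} = \nu^{(\pi)} / \nu^{(B)}$ used in this derivation can be checked directly on a finite sample. In the sketch below the sample and the function names are hypothetical; the sub-collective is formed exactly as described, by keeping only the elements whose label lies in $B$.

```python
from fractions import Fraction

def frequency(sample, labels):
    """Relative frequency of the event 'label in `labels`' in the sample."""
    return Fraction(sum(1 for x in sample if x in labels), len(sample))

def conditional_frequency(sample, label, condition):
    """Frequency of `label` in the sub-collective selected by `condition`."""
    selected = [x for x in sample if x in condition]
    return Fraction(selected.count(label), len(selected))

# Hypothetical finite initial segment of a collective with labels 1..4.
sample = [1, 3, 2, 2, 3, 1, 1, 4, 2, 1]
B = {1, 2}

lhs = conditional_frequency(sample, 1, B)
rhs = frequency(sample, {1}) / frequency(sample, B)
print(lhs, rhs)  # the two exact fractions coincide
```

The step from this finite identity to formula (4) is precisely where continuity of division, and hence the field structure of the completion, is needed.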

The general formula can be proved similarly.

Proposition 3.5. $P(A|B) = P(A \wedge B)/P(B)$, $P(B) \neq 0$.

We now introduce the concept of independence of events. Analyzing the arguments in the book [16], one notes that the rule of multiplication of probabilities for independent events is equivalent to the continuity of the multiplication operation.

An important property that makes it possible to use p-adic probabilities when considering standard problems of probability theory is the p-adic interpretation of the probabilities zero and one (which are probabilities in the sense of ordinary probability theory).

Indeed, the equation $P(E) = 0$ in ordinary probability theory does not mean that the event $E$ is impossible. It merely means that in a long series of experiments the event $E$ occurs in a very small fraction of cases. However, in a large number of experiments this fraction can be relatively large. Moreover, the equation $P(E) = 0$ lumps together a huge class of events that intuitively appear to have different probabilities. For example, suppose we consider two events $E_1$ and $E_2$, and in the first

$$N = N_k = \Big( \sum_{j=0}^{k} 2^j \Big)^2 \qquad (5)$$

trials the event $E_1$ is realized $n^{(1)} = 2^k$ times and the event $E_2$ is realized

$$n^{(2)} = \sum_{j=0}^{k} 2^j \qquad (6)$$

times. It is intuitively clear that the probabilities of these events must be different. However, in real probability theory

$$P_1 = \lim n^{(1)}/N = P_2 = \lim n^{(2)}/N = 0. \qquad (7)$$

It is different in 2-adic probability theory. Stabilization in the 2-adic topology gives $P_1 = 0$, $P_2 = -1$, since in $Q_2$ we have $2^k \to 0$, $k \to \infty$, and for $-1$ we have the representation $-1 = 1 + 2 + 2^2 + \cdots + 2^n + \cdots$.

We here encounter for the first time negative numbers for probabilities of events (compare to Wigner [24], Dirac [25], Feynman [26]; see also [27], [28], [12]). Of course, these probabilities are forbidden by Kolmogorov's second axiom in ordinary probability theory (in von Mises's approach they are forbidden by the choice of the topology of statistical stabilization). However, from the point of view of the frequency theory of probability, $P = -1$ is only an ideal object, the symbol that denotes the limit of a sequence of relative frequencies. This symbol is in no way better and in no way worse than the symbol P = jix in ordinary probability theory.

In this example negative p-adic probabilities were used to split zero conventional (real) probability. So p-adic negative probabilities can be interpreted as infinitely small conventional probabilities. It may be that all negative probabilities that appear in quantum physics can be interpreted in such a way. If the conventional (real) probability is equal to zero, there is no conventional (real) element of reality. However, there is a nonconventional (p-adic) element of reality that is realized with negative probability. Real and p-adic probabilities correspond to different classes of measurement procedures. An element of reality that would be impossible to observe by using a real measurement procedure might be observed by using a p-adic measurement procedure.
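The two limits in this example can be checked numerically. The sketch below is an illustration, not part of the original text: it computes the exact frequencies $n^{(1)}/N_k$ and $n^{(2)}/N_k$ and measures, through a helper `valuation` (an assumed name), how many powers of 2 divide the distance to the claimed 2-adic limits $0$ and $-1$.

```python
from fractions import Fraction

def valuation(x, p):
    """p-adic order of a nonzero rational x, so that |x|_p = p**(-order)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the order of 0 is infinite")
    num, den, order = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return order

def frequencies(k):
    """Exact frequencies of E_1 and E_2 after N_k trials, as in (5) and (6)."""
    s = sum(2 ** j for j in range(k + 1))
    N = s * s
    return Fraction(2 ** k, N), Fraction(s, N)

for k in (2, 5, 10, 20):
    nu1, nu2 = frequencies(k)
    # Real values shrink to 0; the 2-adic orders of nu1 - 0 and nu2 - (-1) grow.
    print(k, float(nu1), float(nu2), valuation(nu1, 2), valuation(nu2 + 1, 2))
```

A growing 2-adic order means a shrinking 2-adic distance, so both frequencies converge in $Q_2$ while remaining indistinguishable (both tending to zero) in $R$.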

One can treat similarly the case of a probability (in the sense of the ordinary theory) equal to unity. For example, suppose

$$N = N_k = \Big( \sum_{j=0}^{k} 2^j \Big)^2, \quad n^{(1)} = \Big( \sum_{j=0}^{k} 2^j \Big)^2 - 2^k, \quad n^{(2)} = \Big( \sum_{j=0}^{k} 2^j \Big)^2 - \sum_{j=0}^{k} 2^j. \qquad (8)$$

In 2-adic probability theory we find that

$$P_1 = 1, \qquad P_2 = 1 - \Big( 1 \Big/ \sum_{j=0}^{\infty} 2^j \Big) = 2. \qquad (9)$$

We see here that natural numbers not equal to unity also belong to the set $U_p$.

In this example p-adic (integer) probabilities which are larger than 1 were used to split conventional (real) probability one. So under the p-adic consideration a conventional element of reality can be split into a few p-adic elements of reality.
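The same kind of check applies to the splitting of probability one. This sketch (with assumed helper names) evaluates the frequencies defined in (8) and confirms that their 2-adic distances to $1$ and $2$ shrink as $k$ grows, while both real values approach $1$.

```python
from fractions import Fraction

def valuation(x, p):
    """p-adic order of a nonzero rational x, so that |x|_p = p**(-order)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the order of 0 is infinite")
    num, den, order = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return order

def frequencies(k):
    """Exact frequencies n1/N and n2/N for the counts defined in (8)."""
    s = 2 ** (k + 1) - 1          # s = 1 + 2 + ... + 2^k
    N = s * s
    return Fraction(N - 2 ** k, N), Fraction(N - s, N)

for k in (2, 5, 10, 20):
    nu1, nu2 = frequencies(k)
    print(k, float(nu1), float(nu2), valuation(nu1 - 1, 2), valuation(nu2 - 2, 2))
```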

In the framework of p-adic statistical stabilizations there is also nothing seditious about complex probabilities. For example, let $p \equiv 1 \pmod 4$. Then $i = (-1)^{1/2} \in Q_p$. Let

$$i = i_0 + i_1 p + i_2 p^2 + \cdots, \qquad i_r = 0, 1, \ldots, p - 1, \qquad (10)$$

be the canonical decomposition of the imaginary unit in powers of $p$. Note also that for any $p$

$$-1 = (p - 1) + (p - 1)p + (p - 1)p^2 + \cdots. \qquad (11)$$

Then for rational relative frequencies we have

$$\nu = \frac{i_0 + i_1 p + \cdots + i_k p^k}{(p - 1) + (p - 1)p + \cdots + (p - 1)p^k} \to -i \qquad (12)$$

in the p-adic topology.

Geometrically, one may suppose that the new probability theory is a transition from one-dimensional probabilities on the interval $[0, 1]$ to multidimensional probabilities.
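The convergence in (12) can be illustrated for a concrete prime. The sketch below is a hedged example rather than the author's computation: it builds the truncations $i_0 + i_1 p + \cdots + i_k p^k$ by a standard Hensel (Newton) lifting of a square root of $-1$ modulo $p$, forms the rational frequency from (12), and checks that its square approaches $-1$ p-adically. The function names are illustrative, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
from fractions import Fraction

def sqrt_minus_one(p, digits):
    """Integer a in [0, p**digits) with a*a = -1 (mod p**digits); needs p = 1 (mod 4)."""
    root = next(r for r in range(1, p) if (r * r + 1) % p == 0)
    modulus = p
    for _ in range(digits - 1):
        modulus *= p
        # Newton step for f(a) = a^2 + 1, carried out modulo the next power of p.
        root = (root - (root * root + 1) * pow(2 * root, -1, modulus)) % modulus
    return root

def frequency(p, k):
    """The rational number in (12): truncated i over truncated -1."""
    numerator = sqrt_minus_one(p, k + 1)   # i_0 + i_1 p + ... + i_k p^k
    denominator = p ** (k + 1) - 1         # (p-1)(1 + p + ... + p^k)
    return Fraction(numerator, denominator)

p = 5
for k in range(6):
    nu = frequency(p, k)
    residual = nu * nu + 1                 # tends to 0 in Q_p if nu -> +i or -i
    print(k, nu, residual.numerator % p ** (k + 1) == 0)
```

Each frequency is an ordinary rational number between 0 and 1; only its limit in $Q_p$ is a square root of $-1$.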

4 Probability distribution of a collective

Let $x = (x_1, \ldots, x_k, \ldots)$ be some collective and $\Pi$ be the set of labels of this collective. We consider the simplest case, when the set $\Pi$ is finite: $\Pi = \{1, \ldots, S\}$. We denote by $\nu^{(j)}$ the relative frequency of the $j$-th label and by $P_j = \lim \nu^{(j)}$ the corresponding probability. In the frequency theory the set of probabilities $P_x = (P_1, \ldots, P_S)$ is called the probability distribution of the collective $x$.

The general principle of statistical stabilization makes it possible to consider not only real distributions but also distributions for other number fields. For one and the same collective $x$ there can exist distributions over different number fields. Thus, in the proposed approach a collective has, in general, an entire spectrum of distributions $P_{x,t} = (P_{1,t}, \ldots, P_{S,t})$, where $t$ are the topologies of statistical stabilization for the given collective. Therefore one here studies a more subtle structure of the collective. The relative frequencies are investigated not only for real stabilization but for a complete spectrum of stabilizations.

In connection with the existence of an entire spectrum of probability distributions of a collective, it is necessary to make some comments.

First, this agrees well with von Mises's principle that the collective comes first and the probabilities after. Indeed, a probability distribution is an object derived from a collective, and to one and the same collective there corresponds an entire spectrum of probability distributions, these reflecting different properties of the collective.

Second, each statistical stabilization determines some physical property of the investigated object. For example, if in a statistical experiment involving the tossing of a coin the probability of heads is $P_1$ and of tails is $P_2$, then these probabilities are physical characteristics of the coin, like its mass or volume. This question is discussed in detail in the books of Poincare [23] and von Mises [16].

If we consider from this point of view the new principle of statistical stabilization, we obtain new physical characteristics of the investigated objects. For example, if in the real topology statistical stabilization is absent, then it is not possible to obtain any physical constants in the language of ordinary probability theory. But these constants could exist and be, for example, p-adic numbers. If a collective has not only a real probability distribution but an entire spectrum of other distributions, then besides real constants corresponding to physical properties of the investigated object we obtain an entire spectrum of new constants corresponding to physical properties that were hidden from the real statistics. Note that these new constants can also be ordinary rational numbers.

5 Model examples of p-adic statistics

5.1 Plantation with Red and White Flowers

As one of the first examples of a collective, von Mises considered [16] a plantation sown with flowers of different colors, and he studied the statistical stabilization of the relative frequencies of each of the colors. We shall construct an analogous collective for which p-adic stabilization always occurs but real stabilization is, in general, absent.

Suppose there are flowers of two types, red (R) and white (W). The plantation (or rather infinite bed) is sown in a random order with red and white flowers, the flowers being sown in series formed by blocks of $p$ flowers, the length of the series (the power of $p$) being also determined in accordance with a random rule.

Namely, suppose there are two generators of random numbers: 1) $j = 0, 1$; 2) $i = 1, 2$ (with probabilities 0.5). If $j = 0$, then a series of red flowers is sown; if $j = 1$, then a series of white ones. The length of each series is defined as follows: the length of the first series is some power $p^{l_1}$ (it can also be determined in accordance with a random rule); if the length of the previous series was $p^{l_m}$, then the length of the next series is $p^{l_{m+1}}$, $l_{m+1} = l_m + i_m$.

We introduce the relative frequencies of the red and white flowers in the first $m$ series: $\nu_m^{(R)} = n_m^{(R)}/N_m$, $\nu_m^{(W)} = n_m^{(W)}/N_m$.

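Proposition 5.1. For all generators of the random numbers $j$ and $i$ there is statistical stabilization of the relative frequencies $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the p-adic topology.

Thus, we have defined p-adic probabilities $P_R = \lim \nu_m^{(R)}$ and $P_W = \lim \nu_m^{(W)}$, and

$$P_R = \Big( \sum_{n=1}^{\infty} (1 - j_n) p^{l_n} \Big) \Big/ \Big( \sum_{n=1}^{\infty} p^{l_n} \Big), \qquad P_W = \Big( \sum_{n=1}^{\infty} j_n p^{l_n} \Big) \Big/ \Big( \sum_{n=1}^{\infty} p^{l_n} \Big). \qquad (13)$$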

Note that in general there is no real statistical stabilization for such a random plantation. If the generator of the random numbers $j$ gives series of 0 or 1, then $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the real topology can oscillate from zero to unity.

Thus, a real observer (an investigator who carries out statistical analysis of the sample in the field of real numbers) cannot obtain any statistically regular law.

He will obtain only a random variation of the series of real relative frequencies. In contrast, the p-adic observer (the investigator who makes a statistical analysis of the sample in the field of p-adic numbers) will obtain a well-defined law consisting of the stabilization of the outcomes in the p-adic decomposition of the relative frequencies.
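The contrast between the two observers can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions, not the program behind the appendices: the seed, the number of series, the starting exponent $l_1 = 1$ and all function names are choices made here. It sows series of length $p^{l_m}$ with $l_{m+1} = l_m + i_m$, tracks the exact frequency of red flowers, and reports both its real value and the p-adic order of its change from one series to the next.

```python
import math
import random
from fractions import Fraction

def valuation(x, p):
    """p-adic order of a rational x (infinite for x = 0)."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den, order = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return order

def plantation(p, steps, seed=0, first_exponent=1):
    """Return [(l_m, nu_m^(R))] for the first `steps` series of the plantation."""
    rng = random.Random(seed)
    exponent, red, total = first_exponent, 0, 0
    history = []
    for _ in range(steps):
        length = p ** exponent
        if rng.randint(0, 1) == 0:        # j = 0: a series of red flowers
            red += length
        total += length
        history.append((exponent, Fraction(red, total)))
        exponent += rng.choice((1, 2))    # l_{m+1} = l_m + i_m with i_m in {1, 2}
    return history

history = plantation(p=2, steps=12)
for (_, previous), (_, current) in zip(history, history[1:]):
    print(f"real nu_R = {float(current):.4f}   "
          f"2-adic order of the change = {valuation(current - previous, 2)}")
```

The printed real values typically keep jumping, because each new series is at least as long as everything sown before it. The p-adic order of the change, on the other hand, is bounded below by $l_{m+1} - l_1$, which follows from a direct computation with $N_{m+1} = N_m + p^{l_{m+1}}$, so the frequencies form a Cauchy sequence in $Q_p$.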

It is evident that in the example of probability theory we observe a new fundamental approach to the investigation of natural phenomena. In accordance with this approach, experimental results must be analyzed not only in the field of real numbers but also in p-adic fields.

Naturally, our example is purely illustrative, but it does appear to reflect many very important properties of p-adic statistics.

Remark 5.1. Intuitively, one supposes that in a real plantation it is possible to find a white flower next to almost every red flower; in contrast, large groups (clusters) of red and white flowers are distributed randomly over a p-adic plantation (one can sow not only a bed but also distribute series of red and white flowers over a plane in accordance with a random rule). A real random plane is obtained if one throws at random red and white points onto the plane; in contrast, a p-adic random plane is obtained if one throws patches of $p^l$ points at a time of red and white color onto the plane.

In Appendix 2 we give the results of statistical analysis of the results of a random modeling on a computer of the proposed probability model. There is very rapid p-adic stabilization of the relative frequencies and no stabilization in the sense of ordinary real probability theory.

Remark 5.2. Evidently, the structure of series formed by powers of $p$ need not necessarily be directly observed in a statistical sample. This structure is introduced by rounding the number of results to powers of $p$. In very large statistical samples one can take into account only the orders of the numbers, and one thereby introduces into the sample a 10-adic structure.

5.2 Random Choice of the Digit of a p-Adic Number

Suppose there are two labels, 1 and 2; $j$ is a generator of random numbers corresponding to the choice of one of the labels. Each random label is produced in series, the length of the series being determined by random choice of the next p-adic digit, i.e., there is a generator of random numbers $\alpha$ that takes the values $\alpha = 0, 1, \ldots, p - 1$, and the length of the $n$-th series is $\alpha_n p^{n-1}$, $n = 1, 2, \ldots$. We introduce the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$.

Proposition 5.2. For all generators of the random numbers $j$ and $\alpha$ there is statistical stabilization of the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$ in the p-adic topology.

Thus, the following p-adic probabilities are defined:

$$P_1 = \Big( \sum_{n=1}^{\infty} (1 - j_n) \alpha_n p^{n-1} \Big) \Big/ \Big( \sum_{n=1}^{\infty} \alpha_n p^{n-1} \Big), \qquad P_2 = \Big( \sum_{n=1}^{\infty} j_n \alpha_n p^{n-1} \Big) \Big/ \Big( \sum_{n=1}^{\infty} \alpha_n p^{n-1} \Big). \qquad (14)$$

In the real topology there is, in general, no statistical stabilization.

Appendix 1

Every rational number $x \neq 0$ can be represented in the form

$$x = p^r \frac{m}{n},$$

where $p$ does not divide $m$ and $n$. Here $p$ is a fixed prime. The p-adic absolute value (norm) for the rational number $x$ is defined by the equations $|x|_p = p^{-r}$, $x \neq 0$, $|0|_p = 0$. This absolute value has the usual properties 1) $|x|_p \ge 0$, $|x|_p = 0 \Leftrightarrow x = 0$; 2) $|xy|_p = |x|_p |y|_p$, and satisfies a strong triangle inequality 3) $|x + y|_p \le \max(|x|_p, |y|_p)$.
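The definition translates directly into a few lines of code. The following sketch (the function name `padic_norm` is an assumption of this example) computes $|x|_p$ for an exact rational $x$ by stripping powers of $p$ from the numerator and the denominator, and then spot-checks properties 2) and 3) on sample values.

```python
from fractions import Fraction

def padic_norm(x, p):
    """|x|_p = p**(-r) for x = p**r * m/n with p dividing neither m nor n."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, r = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        r += 1
    while den % p == 0:
        den //= p
        r -= 1
    return Fraction(1, p) ** r

x, y = Fraction(50, 3), Fraction(7, 25)
print(padic_norm(x, 5), padic_norm(y, 5))
# Property 2): multiplicativity.
print(padic_norm(x * y, 5) == padic_norm(x, 5) * padic_norm(y, 5))
# Property 3): the strong triangle inequality.
print(padic_norm(x + y, 5) <= max(padic_norm(x, 5), padic_norm(y, 5)))
```

Note that large powers of $p$ are small in this norm, which is why sums of ever longer series can converge p-adically while diverging in the real sense.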

The completion of the field of rational numbers with respect to the metric $\rho_p(x, y) = |x - y|_p$ is called the field of p-adic numbers and is denoted by the symbol $Q_p$. It is a locally compact field. Numbers in the unit ball $Z_p = \{x \in Q_p : |x|_p \le 1\}$ of the field $Q_p$ are called integer p-adic numbers. From the strong triangle inequality we obtain a theorem which states that a series in the field $Q_p$ converges if and only if its general term tends to zero. Any p-adic number can be represented in a unique manner in the form of a (convergent) series in powers of $p$:

$$x = \sum_{j=k}^{\infty} a_j p^j, \qquad a_j = 0, 1, \ldots, p - 1, \qquad k = 0, \pm 1, \ldots, \qquad (15)$$

with $|x|_p = p^{-k}$.
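For a rational number the digits $a_j$ in (15) can be computed one at a time: take the residue modulo $p$, subtract it, and divide by $p$. The sketch below does this with exact fractions; the function name and the choice of returning the starting index $k$ together with the digit list are assumptions made for this illustration, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
from fractions import Fraction

def padic_digits(x, p, count):
    """Return (k, [a_k, a_{k+1}, ...]) with `count` digits of x = sum_j a_j p**j."""
    x = Fraction(x)
    if x == 0:
        return 0, [0] * count
    k = 0
    while x.numerator % p == 0:       # pull out positive powers of p
        x /= p
        k += 1
    while x.denominator % p == 0:     # pull out negative powers of p
        x *= p
        k -= 1
    digits = []
    for _ in range(count):
        # The residue of x modulo p is well defined since p does not divide the denominator.
        a = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(a)
        x = (x - a) / p
    return k, digits

print(padic_digits(-1, 2, 8))               # -1 = 1 + 2 + 2^2 + ... in Q_2
print(padic_digits(Fraction(1, 26), 5, 8))  # a 5-adic expansion of a fraction
```

The first call reproduces the representation of $-1$ as a series with all digits equal to $p - 1$, which was used in Section 3.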

One can define similarly m-adic numbers, where $m$ is any natural number, $m \ge 2$. In the general case property 2) is replaced by the weaker property $|xy|_m \le |x|_m |y|_m$, i.e., $|x|_m$ is a pseudonorm. The completion of the field $Q$ in the metric $\rho_m(x, y) = |x - y|_m$ will not be a field (for $m$ that are not prime). It is only a ring. Here we already encounter some deviations from the ordinary probability rules (which can be extended without any changes to p-adic probabilities). For example, one can have a situation of the following kind: $A$ and $B$ are independent events, $P(A) \neq 0$ and $P(B) \neq 0$, but $P(A \wedge B) = 0$. In particular, the conditional probability $P(A|B)$ is in general not defined for an event $B$ having nonvanishing probability.
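The failure of multiplicativity for composite $m$ is easy to see on integers. The sketch below assumes the natural analogue of the p-adic definition, restricted to integers: $|n|_m = m^{-v}$, where $m^v$ is the largest power of $m$ dividing $n$. This restriction and the function name are assumptions of the example, not definitions taken from the text.

```python
from fractions import Fraction

def m_adic_norm(n, m):
    """|n|_m = m**(-v), where m**v is the largest power of m dividing the integer n."""
    if n == 0:
        return Fraction(0)
    v = 0
    while n % m == 0:
        n //= m
        v += 1
    return Fraction(1, m ** v)

# For m = 10 the factors 2 and 5 each have norm 1, but their product does not.
print(m_adic_norm(2, 10), m_adic_norm(5, 10), m_adic_norm(10, 10))
# For a prime modulus the norm is multiplicative.
print(m_adic_norm(15 * 50, 5) == m_adic_norm(15, 5) * m_adic_norm(50, 5))
```

Here $|2 \cdot 5|_{10} < |2|_{10} |5|_{10}$, so only the inequality in the weaker property survives for $m = 10$.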

Appendix 2

We give here the results of a random experiment (modeled on a computer) for a 2-adic plantation. The results of this experiment give a good illustration of a situation in which there is no statistical stabilization in the real topology but there is statistical stabilization in the 2-adic topology. In the following tables, $m$ is the number of a random experiment in which two random numbers are modeled, one corresponding to the choice of a flower and the other to the length of the series of this flower; $d$ is the number of elements in the sample. Because of the exponential growth of the number of elements in the series, $d$ increases very rapidly.

The table of relative frequencies in the field of real numbers is

 m     d        nu^(W)    nu^(R)
 4     10       0.1304    0.8696
 5     10^2     0.6364    0.3636
 6     10^3     0.1913    0.8087
 7     10^3     0.0504    0.9496
12     10^5     0.0006    0.9994
13     10^5     0.5335    0.4665
14     10^6     0.1703    0.8297
22     10^9     0.0022    0.9978
23     10^10    0.7453    0.2547

Thus, for the relative frequencies in the field of real numbers there is no stabilization of even the first digit after the decimal point. We examined large sequences of experiments on the computer in which the oscillations continued. The calculations in the field $Q_2$ give the results:

N = 10

nu^(W) = 101011111011000000110100010111011000110011011110110001011
nu^(R) = 001100000100111111001011101000100111001100100001001110100

N = 20

nu^(W) = 10101111101100111011001100101111110000011100111000000001
nu^(R) = 00110000010011000100110011010000001111100011000111111110

N = 30

nu^(W) = 101011111011001110110011001111111100000000100110110000011
nu^(R) = 001100000100110001001100110000000011111111011001001111100

N = 40

nu^(W) = 101011111011001110110011001111111100000000010111001110100
nu^(R) = 001100000100110001001100110000000011111111101000110001011

Thus, after ten random experiments 14 digits are stabilized in the 2-adic decomposition for the relative frequency of occurrence of a red flower and 14 digits for a white flower; after 20 experiments the numbers of digits that are stabilized are 27 for both colors; after 30 experiments 42 digits are stabilized for each, and so forth.
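The digit counts quoted here can be recovered from the tabulated expansions by measuring how long a common prefix two successive digit strings share. The sketch below copies the four white-flower strings from the listing above; the dictionary name and the helper `stabilized_digits` are illustrative assumptions.

```python
def stabilized_digits(earlier, later):
    """Number of leading digits on which two digit strings agree."""
    count = 0
    for a, b in zip(earlier, later):
        if a != b:
            break
        count += 1
    return count

# 2-adic digit strings of the white-flower frequency, keyed by the number of experiments.
white = {
    10: "101011111011000000110100010111011000110011011110110001011",
    20: "10101111101100111011001100101111110000011100111000000001",
    30: "101011111011001110110011001111111100000000100110110000011",
    40: "101011111011001110110011001111111100000000010111001110100",
}

for earlier, later in ((10, 20), (20, 30), (30, 40)):
    print(earlier, stabilized_digits(white[earlier], white[later]))
```

Comparing each expansion with the next one gives the number of digits that no longer change, which is how the figures 14, 27 and 42 can be read off.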

Appendix 3

We give the results of analysis of a statistical sample in the field of 5-adic numbers. Here N is the number of random experiments, M is the number of elements of the sample, M1 is the number of elements with the first label, and M2 is the number of elements with the second label.

N = 2: M1 = 002, M2 = 00002, M = 00202

M1/M = 1044004400440044004400440044004400440044004400440044
M2/M = 0010440044004400440044004400440044004400440044004400

N = 3: M1 = 002, M2 = 000023, M = 002023

M1/M = 1040303403420004404141041024440040303403420004404141
M2/M = 0014141041024440040303403420004404141041024440040303

N = 4: M1 = 00200002, M2 = 000023, M = 00202302

M1/M = 1040303004000130020234341334320032124414032304024031
M2/M = 0014141440444314424210103110124412320030412140420413

N = 5: M1 = 00200002, M2 = 000023004, M = 002023024

M1/M = 1040301040132010043322212441423102032221232032034142
M2/M = 0014143404312434401122232003021342412223212412410302

N = 6: M1 = 00200002, M2 = 00002300403, M = 00202302403

M1/M = 1040301003131014113132222240403413222311230303113140
M2/M = 0014143441313430331312222204041031222133214141331304

N = 7: M1 = 00200002, M2 = 0000230040303, M = 0020230240303

M1/M = 1040301003202004101343032004014023441101104433243020
M2/M = 0014143441242440343101412440430421003343340011201424

Thus, in the analysis of the sample in the field of 5-adic numbers there is rapid stabilization of the digits in the 5-adic decomposition of the relative frequencies. For example, after 55 experiments 78 digits in the 5-adic decomposition of the relative frequencies are stabilized.

When the sample is analyzed in the field of real numbers, there is again no statistical stabilization.

Acknowledgements

I would like to thank L. Ballentine and J. Summhammer for discussions on p-adic probabilities and elements of physical reality.

References

1. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).
2. P. S. Alexandrov, Introduction to general theory of sets and functions (Gostehizdat, Moscow, 1948).
3. R. Engelking, General Topology (PWN, Warszawa, 1977).
4. A. Yu. Khrennikov, Dokl. Akad. Nauk 322, 1075-1079 (1992).
5. A. Yu. Khrennikov, J. of Math. Phys. 32, 932-937 (1991).
6. V. S. Vladimirov, I. V. Volovich and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1994).
7. Yu. Manin, Springer Lecture Notes in Math. 1111, 59-101 (1985).
8. P. G. O. Freund and E. Witten, Phys. Lett. B 199, 191-195 (1987).
9. A. Yu. Khrennikov, Non-Archimedean Analysis: Quantum Paradoxes, Dynamical Systems and Biological Models (Kluwer Academic Publ., Dordrecht, 1997).
10. S. Albeverio, A. Yu. Khrennikov and R. Cianci, J. Phys. A: Math. and Gen. 30, 881-889 (1997).
11. A. Yu. Khrennikov, J. of Math. Physics 39, 1388-1402 (1998).
12. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
13. Z. I. Borevich and I. R. Shafarevich, Number Theory (Academic Press, New York, 1966).
14. W. Schikhof, Ultrametric calculus (Cambridge Univ. Press, Cambridge, 1984).
15. R. von Mises, Math. Z. 5, 52-99 (1919).
16. R. von Mises, Probability, Statistics and Truth (Macmillan, London, 1957).
17. A. N. Kolmogorov, Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
18. H. Cramer, Mathematical theory of statistics (Univ. Press, Princeton, 1949).
19. I. V. Volovich, Number Theory as the Ultimate Physical Theory, Preprint CERN, Geneva, TH. 4781/87 (1987).
20. E. Borel, Rend. Circ. Mat. Palermo 27, 247 (1909).
21. M. Frechet, Recherches theoriques modernes sur la theorie des probabilites (Univ. Press, Paris, 1937-1938).
22. A. Ya. Khinchin, Voprosi Filosofii, No. 1, 92; No. 2, 77 (1961) (in Russian).
23. A. Poincare, About Science. Collection of works (Nauka, Moscow, 1983).
24. E. Wigner, Quantum-mechanical distribution functions revisited, in Perspectives in quantum theory, W. Yourgrau and A. van der Merwe, editors (MIT Press, Cambridge MA, 1971).
25. P. A. M. Dirac, Proc. Roy. Soc. London A 180, 1-39 (1942).
26. R. P. Feynman, Negative probability, in Quantum Implications: Essays in Honour of David Bohm, 235-246, B. J. Hiley and F. D. Peat, editors (Routledge and Kegan Paul, London, 1987).
27. W. Muckenheim, Phys. Reports 133, 338-401 (1986).
28. A. Yu. Khrennikov, Int. J. Theor. Phys. 34, 2423-2434 (1995).


COMPLEMENTARITY OR SCHIZOPHRENIA: IS PROBABILITY IN QUANTUM MECHANICS INFORMATION OR ONTA?

A. F. KRACKLAUER
E-mail: kracklau@fossi.uni-weimar.de

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM. The fundamental assumption underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM. In addition, the interplay of this paradigm with Bell's no-go theorem for local realistic extensions of QM will be analyzed.

1 Introduction

Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. All other diffractable waves, it may be said, carry momentum and energy, not conceptual abstract information and ideas. All other probabilities are calculational aids and, like abstractions generally, are utterly unaffected by material boundaries. The literature is replete with resolutions of QM conundrums selectively ignoring one or the other of these characteristics; in the end they all fail.

Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM [1]. The fundamental concept underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM [2]. Among the successes of SED, one is a local realistic explanation of the diffraction of particle beams [3]. The core of this explanation is the notion that relative motion through the SED background effectively engenders de Broglie's pilot wave. Given such a pilot wave associated with a particle's motion, the statistical distribution of momentum in a density over phase space can be decomposed in the sense of Fourier analysis such that the resulting form of Liouville's Equation, under some conditions, is Schrodinger's Equation.

From this viewpoint the schizophrenic character of wave functions can be discussed and understood free of preternatural attributes. These concepts have broad implications for serious philosophical questions, such as the mind-body dichotomy, through teleportation, to popular science fiction effects. In addition, the peculiar nature of probability in QM is clarified.

Although much remains to be done to comprehensively interpret all of QM in terms of SED, many of the by now hoary paradoxes can be rationally deconstructed.

A secondary (but intimately related) issue is that of determining the import of Bell's Theorem for the use of the SED paradigm to reconcile fully the interpretation of QM. Arguments will be presented showing that in his proof Bell (essentially by misconstruing the use of conditional probabilities) called on inappropriate hypothetical presumptions, just as Hermann, de Broglie, Bohm and others found that Von Neumann did before him [4, 5].

2 De Broglie waves as an SED effect

The foundation of the model, or conceptual paradigm, for the mechanism of particle diffraction proposed herein is Stochastic Electrodynamics (SED). Most of SED, for which there exists a substantial literature, is not crucial for the issue at hand [1]. The crux of SED can be characterized as the logical inversion of QM in the following sense. If QM is taken as a valid theory, then ultimately one concludes that there exists a finite ground state for the free electromagnetic field, with energy per mode given by

$$E = \hbar\omega/2. \tag{1}$$

SED, on the other hand, inverts this logic and axiomatically posits the existence of a random electromagnetic background field with this same spectral energy distribution, and then endeavors to show that ultimately a consequence of the existence of such a background is that physical systems exhibit the behavior otherwise codified by QM. The motivation for SED proponents is to find an intuitive, local realistic interpretation for QM, hopefully to resolve the well-known philosophical and lexical problems as well as to inspire new attacks on other problems.


The question of the origin of this electromagnetic background is, of course, fundamental. In the historical development of SED, its existence has been posited as an operational hypothesis whose justification rests a posteriori on results. Nevertheless, lurking on the fringes from the beginning has been the idea that this background is the result of self-consistent interaction; i.e., the background arises out of interactions from all other electromagnetic charges in the universe [6].

For present purposes, all that is needed is the hypothesis that particles, as systems with charge structure (not necessarily with a net charge), are in equilibrium with electromagnetic signals in the background. Consider, for example, as a prototype system a dipole with characteristic frequency $\omega_0$. Equilibrium for such a system in its rest frame can be expressed as

$$m_0 c^2 = \hbar\omega_0. \tag{2}$$

This statement is actually tautological, as it just defines $\omega_0$, for which an exact numerical value will turn out to be practically immaterial.

This equilibrium in each degree of freedom is achieved in the particle's rest frame by interaction with counter-propagating electromagnetic background signals in both polarization modes separately, which on the average add to give a standing wave with an antinode at the particle's position:

$$2\cos(k_0 x)\sin(\omega_0 t). \tag{3}$$

Again, this is essentially a tautological statement, as a particle doesn't see signals with nodes at its location, thereby leaving only the others. Of course, everything is to be understood in an on-the-average, statistical sense.

Now consider Eq. (3) in a translating frame, in particular the rest frame of a slit through which the particle, as a member of a beam ensemble, passes. In such a frame the component signals under a Lorentz transform are Doppler shifted and then add together to give what appears as modulated waves:

$$2\cos\big(k_0\gamma(x - c\beta t)\big)\,\sin\big(\omega_0\gamma(t - c^{-1}\beta x)\big), \tag{4}$$

for which the second, the modulation factor, has wavelength $\lambda = (\gamma\beta k_0)^{-1}$. From the Lorentz transform of Eq. (2), $P = \hbar\gamma\beta k_0$, the factor $\gamma\beta k_0$ can be identified as the de Broglie wave vector from QM as expressed in the slit frame.
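The step from Doppler-shifted component signals to the product form of Eq. (4) can be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ and $\omega_0 = 1$ by default and the transform convention $x \to \gamma(x - c\beta t)$, $t \to \gamma(t - \beta x/c)$; the function names are illustrative only and do not come from the original text.

```python
import math

def doppler_sum(x, t, beta, omega0=1.0, c=1.0):
    """Sum of the two counter-propagating signals as seen in the slit frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    # One component is shifted by gamma*(1 + beta), the other by gamma*(1 - beta).
    forward = math.sin(omega0 * gamma * (1.0 + beta) * (t - x / c))
    backward = math.sin(omega0 * gamma * (1.0 - beta) * (t + x / c))
    return forward + backward

def modulated_product(x, t, beta, omega0=1.0, c=1.0):
    """Product form of Eq. (4): carrier factor times modulation factor."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    k0 = omega0 / c
    carrier = math.cos(k0 * gamma * (x - c * beta * t))
    modulation = math.sin(omega0 * gamma * (t - beta * x / c))
    return 2.0 * carrier * modulation

def max_mismatch(beta, n=40):
    """Largest difference between the two forms on an n-by-n grid of (x, t)."""
    worst = 0.0
    for i in range(n):
        for j in range(n):
            x = -5.0 + 10.0 * i / (n - 1)
            t = -5.0 + 10.0 * j / (n - 1)
            diff = abs(doppler_sum(x, t, beta) - modulated_product(x, t, beta))
            worst = max(worst, diff)
    return worst
```

Calling `max_mismatch(0.3)` should return a value at the level of floating-point rounding, which is just the identity $\sin A + \sin B = 2\sin\frac{A+B}{2}\cos\frac{A-B}{2}$ applied to the two shifted signals.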

In short, it is seen that a particle's de Broglie wave is modulation on what the orthodox theory designates Zitterbewegung. The modulation wave effectively functions as a pilot wave. Unlike de Broglie's original conception, in which the pilot wave emanates from the kernel, here this pilot wave is a kinematic effect of the particle interacting with the SED Background. Because this SED Background is classical electromagnetic radiation, it will diffract according to the usual laws of optics and thereafter modify the trajectory of the particle with which it is in equilibrium [3]. (See Ref. [1], Section 12.3, for a didactical elaboration of these concepts.)

The detailed mechanism for pilot wave steerage is based on observing that the energy pattern of the actual signal that pilot waves are modulating, and to which a particle tunes, comprises a fence or rake-like structure with prongs of varying average heights specified by the pilot wave modulation. These prongs in turn can be considered as forming the boundaries of energy wells in which particles are trapped: a series of micro-Paul-traps, as it were. Intuitively it is clear that where such traps are deepest, particles will tend to be captured and dwell the longest. The exact mechanism moving and restraining particles is radiation pressure, but not as given by the modulation; rather, by the carrier signal itself. Of course, because these signals are stochastic, well boundaries are bobbing up and down somewhat, so that any given particle, with whatever energy it has, will tend to migrate back and forth into neighboring cells as boundary fluctuations permit. Where the wells are very shallow, however, particles are laterally (in a diffraction setup, say) unconstrained; they tend to vacate such regions and therefore have a low probability of being found there.

The observable consequence of the constraints imposed on the motion of particles is a microscopic effect which can be made manifest only in the observation of many similar systems. For illustration, consider an ensemble of similar particles comprising a beam passing through a slit. Let us assume that these particles are very close to equilibrium with the background; that is, that any effects due to the slit can be considered as slight perturbations on the systematic motion of the beam members.

Given this assumption, each member of the ensemble, with index n say, will with a certain probability have a given amount of kinetic energy $E_n$ associated with each degree of freedom. Of special interest here is the direction perpendicular to both the beam and the slit, in which, by virtue of the assumed state of near equilibrium with the background, we can take the distribution with respect to energy of the members of the ensemble to be given in the usual way by the Boltzmann factor $e^{-\beta E}$, where $\beta$ is the reciprocal of the product of the Boltzmann constant k and the temperature T in kelvin. The temperature in this case is that of the electromagnetic background serving as a thermal bath for the beam particles with which it is in near equilibrium.

Now the relative probability of finding any given particle trapped in a particular well, whatever its energy $E_n$, will be, according to elementary probability, proportional to the sum of the probabilities of finding particles with energy less than the well depth $d$:

$$\sum_{E_n < d} e^{-\beta E_n} \approx \int_0^d e^{-\beta E}\,dE = \frac{1}{\beta}\left(1 - e^{-\beta d}\right), \tag{5}$$

where approximating the sum with an integral is tantamount to the recognition that the number of energy levels, if not a priori continuous, is large with respect to the well depth.

If now d in Eq. (5) is expressed as a function of position, we get the probability density as a function of position. For example, for a diffraction pattern from a single slit of width a at distance D, the intensity (essentially the energy density) as a function of lateral position is $E_0\sin^2(\theta)/\theta^2$, where $\theta = k_{\text{pilot wave}}(a/D)\,y$, and the probability of occurrence $P(\theta(y))$ as a function of position would be

$$P(y) \propto \left(1 - e^{-\beta E_0 \sin^2(\theta)/\theta^2}\right). \tag{6}$$

Whenever the exponent in Eq. (6) is significantly less than one, its right-hand side is very accurately approximated by the exponent itself, so that one obtains the standard and verified result that the probability of occurrence, $P(y) = \psi^*\psi$ in conventional QM, is proportional to the intensity of a particle's de Broglie (pilot) wave.
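Both approximations used here, the sum-to-integral step of Eq. (5) and the small-exponent limit of Eq. (6), are easy to check numerically. The sketch below is illustrative only: it assumes evenly spaced energy levels for the sum, and the function names and parameter values are hypothetical rather than taken from the text.

```python
import math

def boltzmann_sum(depth, beta, levels):
    """Left side of Eq. (5) for evenly spaced levels below the well depth."""
    spacing = depth / levels
    # Each level contributes its Boltzmann weight times the level spacing.
    return sum(math.exp(-beta * (n + 0.5) * spacing) for n in range(levels)) * spacing

def well_probability(depth, beta):
    """Closed form on the right side of Eq. (5)."""
    return (1.0 - math.exp(-beta * depth)) / beta

def slit_intensity(theta, e0=1.0):
    """Single-slit intensity E0 sin^2(theta)/theta^2, equal to E0 at theta = 0."""
    if theta == 0.0:
        return e0
    return e0 * (math.sin(theta) / theta) ** 2

def occupation(theta, beta, e0=1.0):
    """Unnormalised occurrence probability of Eq. (6)."""
    return 1.0 - math.exp(-beta * slit_intensity(theta, e0))
```

For a small exponent, for example `beta = 0.01` with `e0 = 1.0`, the ratio `occupation(theta, beta) / (beta * slit_intensity(theta))` stays close to one across the pattern, which is the sense in which the occurrence probability tracks the pilot wave intensity.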

3 Schrödinger Equation

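A consequence of the attachment of a de Broglie pilot wave to each particle is that there exists a Fourier kernel of the following form:

$$e^{-i2\mathbf{p}\cdot\mathbf{x}'/\hbar}, \tag{7}$$

which can be used to decompose the density function of an ensemble of similar particles. Consider an ensemble governed by the Liouville Equation:

$$\frac{\partial\rho}{\partial t} = -\frac{\mathbf{p}}{m}\cdot\nabla\rho - \mathbf{F}\cdot\nabla_{p}\rho. \tag{8}$$

Now decompose $\rho(\mathbf{x},\mathbf{p})$ with respect to $\mathbf{p}$ using the de Broglie-Fourier kernel,

$$\hat\rho(\mathbf{x},\mathbf{x}',t) = \int e^{-i2\mathbf{p}\cdot\mathbf{x}'/\hbar}\,\rho(\mathbf{x},\mathbf{p},t)\,d\mathbf{p}, \tag{9}$$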

[Figure 1 plot: relative intensity versus lateral displacement θ in radians, neutron diffraction. Curves: particle beam, radiation, and chi-squared (×50).]

Figure 1. A simulated single-slit neutron diffraction pattern showing the closeness of the fit of Eq. (6) to the pure wave diffraction pattern. See Ref. [3] for details.

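to transform the Liouville Equation into

$$\frac{\partial\hat\rho}{\partial t} = \frac{\hbar}{i2m}\,\nabla'\cdot\nabla\hat\rho - \frac{i2}{\hbar}\,(\mathbf{x}'\cdot\mathbf{F})\,\hat\rho, \tag{10}$$

where $\nabla'$ is the gradient with respect to $\mathbf{x}'$. To solve, separate variables using

$$\mathbf{r} = \mathbf{x} + \mathbf{x}', \qquad \mathbf{r}' = \mathbf{x} - \mathbf{x}', \tag{11}$$

to get

$$i\hbar\frac{\partial\hat\rho}{\partial t} = \frac{\hbar^2}{2m}\left(\nabla_{r}^2 - \nabla_{r'}^2\right)\hat\rho + (\mathbf{r} - \mathbf{r}')\cdot\mathbf{F}\,\hat\rho, \tag{12}$$

which can (sometimes) be separated by writing

$$\hat\rho(\mathbf{r},\mathbf{r}') = \psi^*(\mathbf{r})\,\psi(\mathbf{r}') \tag{13}$$

to get Schrödinger's Equation:

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \tag{14}$$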

4 Conclusions

Within this paradigm, Quantum Mechanics is incomplete, as surmised by Einstein, Podolsky and Rosen [7]. It is built on the basis of the Liouville Equation while taking a particular stochastic background into account. The conceptual function of probability in QM is just as in Statistical Mechanics. Measurement reduces ignorance; it does not precipitate reality. Of course, measurement also disturbs the measured system, but this presents no more fundamental problems than it does in classical physics. Heisenberg uncertainty, on the other hand, is seen to be caused simply by the incessant dynamical perturbation from background signals. Insofar as the source of background signals cannot be isolated, this source of uncertainty is intrinsic but not fundamentally novel. For these reasons duality is superfluous. Particles have the same ontological status as in classical physics. Individual particles in a beam pass through one or the other slit in a Young double-slit experiment, for example, while their de Broglie piloting waves pass through both slits. Beyond the slit the particles are induced stochastically to track the nodes of their pilot waves, so that a diffraction pattern is built up mimicking the intensity of the pilot wave.

From within this paradigm, the now infamously paradoxical situations illustrating various problems with the interpretation of QM never arise or are resolved with elementary reasoning. In particular, wave functions are not vested with an ambiguous nature.

The SED Paradigm also clarifies the appearance of interference among probabilities. Numerous analysts from various viewpoints have discovered the fact that Probability Theory admits structure (used by QM) that goes unexploited in traditional applications. (E.g., see Gudder; Summhammar, this volume.) While each of these approaches provides deep and surprising insights, none really offers any explanation of why and how nature exploits this structure. Just as a certain second-order hyperbolic partial differential equation becomes the wave equation as a physics statement only with the introduction, e.g., of Hooke's Law, so this extra probability structure can be made into physics only with an analogue to Hooke's Law.

SED provides that analogue for particle behavior with its model of pilot wave guidance. In this model radiation pressure is responsible for particle guidance [3]. Radiation pressure is proportional to the square of EM fields, i.e., the intensity (in this case, of the background field as modified by objects in the environment), which is not additive. Rather, the field amplitudes are additive, and interference arises in the way well understood in classical EM. In other words, QM interference is a manifestation of EM interference. The relevant Hooke's Law analogue is the phenomenon of radiation pressure. For radiation, this is all intimately related, of course, to classical coherence theory as applied to square-law photoelectron detectors, which, when properly applied, resolves many QM conundrums, including those instigated by Bell's Theorem surrounding EPR correlations.

Appendix: Bell's Theorem

The interpretation or paradigm described herein conflicts with the conclusions of Bell's no-go theorem, according to which a local realistic extension of QM should conform with certain restraints that have been shown empirically to be false. To be sure, this paradigm does not deliver the hidden variables for exploitation in calculations, but it does indicate to which features in the universe they pertain, namely all other charges. The character of these hidden variables is dictated by the fact that they are distinguished only in that they pertain to particles distant from the system of particular interest; thus internal consistency requires that they be local and realistic [8].

The basic proof

Bell's Theorem purports to establish certain limitations on coincidence probabilities of spin or polarization measurements as calculated using QM, if they are to have an underlying deterministic but still local and realistic basis describable by extra, as yet hidden, variables $\lambda$ distributed with a density $\rho(\lambda)$. These limitations take the form of inequalities which measurable coincidences must respect. The extraction of one of these inequalities, where the input assumptions are enumerated as Bell made them, proceeds as follows.

Bell's fundamental Ansatz consists of the following equation:

$$P(a,b) = \int d\lambda\,\rho(\lambda)\,A(a,\lambda)\,B(b,\lambda), \tag{15}$$

where, per explicit assumption, A is not a function of b, nor B of a. This he motivated on the grounds that a measurement at station A, if it respects locality, cannot depend on remote conditions such as the settings of a distant measuring device, i.e., b. In addition, each by definition satisfies

$$|A| \le 1, \qquad |B| \le 1. \tag{16}$$

Eq. (15) expresses the fact that when the hidden variables are integrated out, the usual results from QM are recovered.

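The extraction proceeds by considering the difference of two such coincidence probabilities, where the parameters of one measuring station differ:

$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)\,[A(a,\lambda)B(b,\lambda) - A(a,\lambda)B(b',\lambda)], \tag{17}$$

to which zero, in the form

$$\pm A(a,\lambda)B(b,\lambda)A(a',\lambda)B(b',\lambda) \mp A(a,\lambda)B(b',\lambda)A(a',\lambda)B(b,\lambda), \tag{18}$$

is added to get

$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)\,A(a,\lambda)B(b,\lambda)\,[1 \pm A(a',\lambda)B(b',\lambda)] - \int d\lambda\,\rho(\lambda)\,A(a,\lambda)B(b',\lambda)\,[1 \pm A(a',\lambda)B(b,\lambda)], \tag{19}$$

which, upon taking absolute values, Bell wrote as

$$|P(a,b) - P(a,b')| \le \int d\lambda\,\rho(\lambda)\,[1 \pm A(a',\lambda)B(b',\lambda)] + \int d\lambda\,\rho(\lambda)\,[1 \pm A(a',\lambda)B(b,\lambda)]. \tag{20}$$

Then, using Eq. (15) (the Ansatz) and the normalization $\int d\lambda\,\rho(\lambda) = 1$, one gets

$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2, \tag{21}$$

a Bell inequality [9].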

Now, if the QM result for these coincidences, namely $P(a,b) = -\cos(2\theta)$, is put in Eq. (21), it will be found that for $\theta = \pi/8$ the left-hand side of Eq. (21) becomes $2\sqrt{2}$. Experiments verify this result [10]. Why the discrepancy? According to Bell, it must have been induced by demanding locality, as all else he took to be harmless.
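The numerical value quoted for the QM correlation can be reproduced with a few lines. The sketch below assumes that $\theta$ is the angle between the two analyzer settings and that the four settings are spaced $\pi/8$ apart; the function names and the particular choice of angles are illustrative assumptions, not taken from the text.

```python
import math

def qm_correlation(a, b):
    """QM coincidence correlation -cos(2*theta) for analyzer angles a and b."""
    return -math.cos(2.0 * (a - b))

def bell_combination(a, a_prime, b, b_prime, corr=qm_correlation):
    """|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| for a given correlation function."""
    first = abs(corr(a, b) - corr(a, b_prime))
    second = abs(corr(a_prime, b_prime) + corr(a_prime, b))
    return first + second

step = math.pi / 8.0
# Settings a = 0, b = pi/8, a' = pi/4, b' = 3*pi/8, so neighbours differ by pi/8.
value = bell_combination(0.0, 2.0 * step, step, 3.0 * step)
```

With these settings `value` equals $2\sqrt{2}$ up to rounding, exceeding the bound of 2.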


Critiques

Although Bell's analysis is denoted a theorem, in fact there can be no such thing in Physics: the axiomatic base on which to base a theorem consists of those fundamental theories which the whole enterprise is endeavoring to reveal. Moreover, buried in all mathematics pertaining to the physical world are numerous unarticulated assumptions, some of which are exposed below.

The analytical character of dichotomic functions

In motivating his discussion of the extraction of inequalities, Bell considered the measurement of spin using Stern-Gerlach magnets, or polarization measurements of photons. In both cases single measurements can be seen as individual terms in a symmetric dichotomic series, i.e., having the values $\pm 1$. It is therefore natural to ask if the correlation computed using QM, $P(a,b) = -\cos(2\theta)$, and verified empirically, can be the correlation of dichotomic functions. It is easy to show that it cannot be so; consider

$$-\cos(2\theta) = k\int D(x - \theta)\,D(x)\,dx, \tag{22}$$

where $\rho(\lambda)$ is $k/2\pi$ and where the D's are dichotomic functions. Now take the derivative with respect to $\theta$ to get

$$2\sin(2\theta) = k\int \sum_j \delta(x - \theta_j)\,D(x)\,dx = k\sum_j D(\theta_j) = k', \tag{23}$$

and again:

$$4\cos(2\theta) = 0, \tag{24}$$

which is false. QED.

Some authors (see, e.g., Aerts, this volume) employ a parameterized dichotomic function to represent measurements. Such a function can be dichotomic in the argument but continuous in the parameter, e.g., of the form $D(\sin(t) - x)$, for which then the correlation is taken to be of the form

$$\mathrm{Corr}(t) = \int_{-\pi}^{\pi} D(x - \sin(2t))\,D(x)\,dx. \tag{25}$$

However, this approach seems misguided. First, it assumes that the argument of Corr, t, can be identical to the parameter of the dichotomic function $D_t(x)$, rather than the offset in the argument, here x, as befitting a correlation. Moreover, the same sort of consistency test applied above also results in contradictions; therefore such parameterized functions do not constitute counterexamples invalidating the claim that discontinuous functions cannot have a harmonic correlation. At best this tactic implicitly results in the correlation of the measurement functions with respect to the continuous parameter t, which is interpreted as the weight or frequency of the dichotomic value. This tactic, however, does not conform with Bell's analysis, in which the dichotomic values are to be correlated; rather, it corresponds with the type of model proposed below, without however recognizing Malus' Law as the source of the weights.

Conclusion: There is a fundamental error in Bell's analysis; the QM result is at irreconcilable odds with the conventional understanding of his arguments [11].

This can be revealed alternately, following Sica, by considering four dichotomic sequences (with values $\pm 1$ and length N) $a$, $a'$, $b$ and $b'$, and the following two quantities: $a_i b_i + a_i b'_i = a_i(b_i + b'_i)$ and $a'_i b_i - a'_i b'_i = a'_i(b_i - b'_i)$. Sum these expressions over i, divide by N, and take absolute values before adding together to get

$$\left|\frac{1}{N}\sum_i^N a_i b_i + \frac{1}{N}\sum_i^N a_i b'_i\right| + \left|\frac{1}{N}\sum_i^N a'_i b_i - \frac{1}{N}\sum_i^N a'_i b'_i\right| \le \frac{1}{N}\sum_i^N |a_i|\,|b_i + b'_i| + \frac{1}{N}\sum_i^N |a'_i|\,|b_i - b'_i|. \tag{26}$$

The right-hand side equals 2, so this is a Bell Inequality. Conclusion: this Bell Inequality is an arithmetic identity for dichotomic sequences; there is no need to postulate locality in order to extract it [12].
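The arithmetic character of this bound can be illustrated by generating arbitrary dichotomic sequences and evaluating both sides of Eq. (26). The following is a small sketch with hypothetical function names; the sequences are drawn at random with no physical model behind them.

```python
import random

def random_sequence(length, rng):
    """A dichotomic sequence of the given length with entries +1 or -1."""
    return [rng.choice((1, -1)) for _ in range(length)]

def bell_sides(a, a_prime, b, b_prime):
    """Return (lhs, rhs) of Eq. (26) for four dichotomic sequences of equal length."""
    n = len(a)
    corr_ab = sum(x * y for x, y in zip(a, b)) / n
    corr_ab_prime = sum(x * y for x, y in zip(a, b_prime)) / n
    corr_a_prime_b = sum(x * y for x, y in zip(a_prime, b)) / n
    corr_a_prime_b_prime = sum(x * y for x, y in zip(a_prime, b_prime)) / n
    lhs = abs(corr_ab + corr_ab_prime) + abs(corr_a_prime_b - corr_a_prime_b_prime)
    # For +/-1 entries, |b + b'| + |b - b'| is 2 term by term.
    rhs = (sum(abs(x) * abs(y + z) for x, y, z in zip(a, b, b_prime)) / n
           + sum(abs(x) * abs(y - z) for x, y, z in zip(a_prime, b, b_prime)) / n)
    return lhs, rhs

rng = random.Random(0)
samples = [bell_sides(*(random_sequence(500, rng) for _ in range(4))) for _ in range(50)]
```

Every pair in `samples` satisfies `lhs <= rhs` with `rhs` equal to 2, whatever the sequences are and however they were produced.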

Discrete vice continuous variables

By implication, Bell considered discrete variables, for which the correlation would be

$$\mathrm{Cor}(a,b) = \frac{1}{N}\sum_i^N A_i(a)\,B_i(b). \tag{27}$$

But experiments measure the number of hits per unit time, given a and b, and then compute the correlation; each event is a density, not a single pair. The data taken in experiments corresponds to the read-out for Malus' Law, not the generation of dichotomic sequences for which each term represents an event consisting of a pair of photons with anticorrelated polarization, or a particle pair with anticorrelated spins. This discrepancy is ignored in the standard renditions of Bell's analysis. It is, however, serious and suggests a different tack.

Consider, following Barut, a model for which the spin axes of pairs of particles have random but totally anticorrelated instantaneous orientations: $\mathbf{S}_1 = -\mathbf{S}_2$ [13]. Each particle then is directed through a Stern-Gerlach magnetic field with orientation $\mathbf{a}$ and $\mathbf{b}$, respectively. The observable in each case then would be $A = \mathbf{S}_1\cdot\mathbf{a}$ and $B = \mathbf{S}_2\cdot\mathbf{b}$. Now, by standard theory,

$$\mathrm{Cor}(A,B) = \frac{\langle AB\rangle - \langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{28}$$

where the angle brackets indicate averages over the range of the variables. This becomes

$$\mathrm{Cor}(A,B) = -\frac{\int d\gamma\,\sin(\gamma)\cos(\gamma - \theta)\cos(\gamma)}{\sqrt{\left(\int d\gamma\,\sin(\gamma)\cos^2(\gamma)\right)^2}}, \tag{29}$$

which evaluates to $-\cos(\theta)$, i.e., the QM result for spin state correlation. Conclusion: this model, essentially a counterexample to Bell's analysis, shows that continuous functions (vice dichotomic) work. It is more than just natural to ask: where do the gremlins reside in Bell's analysis? There are at least two.
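The value of this correlation can be checked by direct averaging. The sketch below assumes that the spin axis is uniformly distributed over the unit sphere and approximates the averages with a midpoint rule; the function name, the grid sizes and the choice of $\mathbf{a}$ along z are illustrative assumptions.

```python
import math

def barut_correlation(theta, n_polar=120, n_azimuth=120):
    """Normalised correlation of A = S1.a and B = S2.b with S2 = -S1 uniform on the sphere."""
    a = (0.0, 0.0, 1.0)
    b = (math.sin(theta), 0.0, math.cos(theta))  # b makes angle theta with a
    acc = {"w": 0.0, "A": 0.0, "B": 0.0, "AB": 0.0, "AA": 0.0, "BB": 0.0}
    for i in range(n_polar):
        polar = math.pi * (i + 0.5) / n_polar
        weight = math.sin(polar)  # solid-angle weight of this polar band
        for j in range(n_azimuth):
            azimuth = 2.0 * math.pi * (j + 0.5) / n_azimuth
            s1 = (math.sin(polar) * math.cos(azimuth),
                  math.sin(polar) * math.sin(azimuth),
                  math.cos(polar))
            obs_a = sum(s * c for s, c in zip(s1, a))
            obs_b = -sum(s * c for s, c in zip(s1, b))  # S2 = -S1
            acc["w"] += weight
            acc["A"] += weight * obs_a
            acc["B"] += weight * obs_b
            acc["AB"] += weight * obs_a * obs_b
            acc["AA"] += weight * obs_a ** 2
            acc["BB"] += weight * obs_b ** 2
    mean = {key: value / acc["w"] for key, value in acc.items() if key != "w"}
    covariance = mean["AB"] - mean["A"] * mean["B"]
    return covariance / math.sqrt(mean["AA"] * mean["BB"])
```

For any angle, `barut_correlation(theta)` should agree with `-math.cos(theta)` to within the quadrature error of the grid.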

One has to do with the following covert hypothesis. Bell's proof seems to pertain to continuous variables, in that the demand is only that $|A|\ (|B|) \le 1$. This argument, however, silently also assumes that the averages $\langle A\rangle = \langle B\rangle = 0$. It enters in the derivation of a Bell inequality, where the second term above is ignored as if it is always zero. When it is not zero, Bell inequalities become, e.g.,

$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2 + \frac{2\langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{30}$$

which opens up a broader category of non-quantum models. A second covert gremlin, having broader significance, is discussed below.

Are nonlocal correlations essential?

The demand that, in spite of the introduction of hidden variables $\lambda$, a probability $P(a,b)$ averaged over these extra variables reduce to currently used QM expressions implies that

$$P(a,b) = \int P(a,b,\lambda)\,d\lambda. \tag{31}$$

By basic probability theory, the integrand in this equation is to be decomposed in terms of individual detections in each arm according to Bayes' formula:

$$P(a,b,\lambda) = P(\lambda)\,P(a|\lambda)\,P(b|a,\lambda), \tag{32}$$

where $P(a|\lambda)$ is a conditional probability. In turn, the integrand above can be converted to the integrand of Bell's Ansatz,

$$P(a,b) = \int A(a,\lambda)\,B(b,\lambda)\,\rho(\lambda)\,d\lambda,$$

iff

$$P(b|a,\lambda) = P(b|\lambda) \quad \forall a. \tag{33}$$

This equation admits, it seems, two interpretations.

(i) When this equation is true, the ratio of occurrence of outcomes at station B must be statistically independent of the outcomes at A. Therefore, as the hidden variables $\lambda$ are extra and do not duplicate a and b, even if the correlation is considered to be encoded by $\lambda$, it will not be available to an observer. But the correlation, by hypothesis, does exist and is to be detectable via the a's and b's; therefore this equation cannot hold. Thus, within this interpretation, Bell's Ansatz is not internally consistent.

(ii) Alternately, if the a on the left-hand side is superfluous, so is b, so that $P = P(\lambda) = 0$ except at one value of $\lambda$, where it equals 1 or is a Dirac delta function. That is, the correlation is totally encoded by the hidden variables, as follows if a sufficient number of new variables are introduced to render everything deterministic, as often assumed. Consequently, individual products of probabilities at the separate stations, i.e., AB's in Bell's notation, become Dirac delta functions of the $\lambda$. If everything is deterministic, then there can be no overlap of the non-zero values of pairs of probabilities for a given value of $\lambda$, and therefore, in the extraction of a Bell inequality, all quadruple products of P's with pair-wise different values of $\lambda$ in Eq. (19) are identically zero, so that the final form of a Bell inequality is the trivial identity

$$|P(a,b) - P(a,b')| \le 2. \tag{34}$$

In either case, locality is not to be so employed as to exclude correlations generated at the conception of the spin-particles or photon pairs, i.e., common causes. The non-existence of instantaneous communication cannot impose a restraint here; it must bear no relationship to the validity of Eq. (33).

In addition, Eq. (34) reconciles Barut's continuous variable model with Bell's analysis.

Bell-Kochen-Specker Theorem

Besides Bell's original theorem, there is another set of no-go theorems ostensibly prohibiting a local realistic extension for QM. In contrast to the theorem analyzed above, they do not make explicit use of locality; rather, they use certain properties (falsely, it turns out) of angular momentum (spin). In general, the proof of these theorems proceeds as follows. The system of interest is described as being in a state $|\psi\rangle$ specified by observables $A, B, C, \ldots$ A hidden variable theory is then taken to be a mapping $v$ of observables to numerical values: $v(A), v(B), v(C), \ldots$ Use is then made of the fact that if a set of operators all commute, then any functional relation among these operators, $f(A, B, C, \ldots) = 0$, will also be satisfied by their eigenvalues: $f(v(A), v(B), v(C), \ldots) = 0$.

The proof of a Kochen-Specker Theorem proceeds by displaying a contradiction; consider, e.g., two spin-1/2 particles, for which the nine separate mutually commuting operators can be arranged in the following 3 by 3 matrix:

$$\begin{matrix} \sigma_x^1 & \sigma_x^2 & \sigma_x^1\sigma_x^2 \\ \sigma_y^2 & \sigma_y^1 & \sigma_y^1\sigma_y^2 \\ \sigma_x^1\sigma_y^2 & \sigma_x^2\sigma_y^1 & \sigma_z^1\sigma_z^2 \end{matrix} \tag{35}$$

It is then a little exercise in bookkeeping to verify that any assignment of plus and minus ones for each of the factors in each element of this matrix results in a contradiction; namely, the product of all these operators formed row-wise is plus one and the same product formed column-wise is minus one [14].
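The bookkeeping can be delegated to an exhaustive search. The sketch below assumes the operator products given in Mermin's presentation [14]: each row of the 3 by 3 array multiplies to plus one, while the columns multiply to plus one, plus one and minus one. It then tries every assignment of $\pm 1$ to the nine entries; the names used are hypothetical.

```python
from itertools import product

# Assumed operator identities (after Mermin): required products of the
# assigned values along each row and each column of the 3x3 array.
ROW_PRODUCTS = (1, 1, 1)
COLUMN_PRODUCTS = (1, 1, -1)

def consistent_assignments():
    """Return every +/-1 assignment to the nine entries meeting all six constraints."""
    found = []
    for values in product((1, -1), repeat=9):
        grid = [values[0:3], values[3:6], values[6:9]]
        rows_ok = all(grid[r][0] * grid[r][1] * grid[r][2] == ROW_PRODUCTS[r]
                      for r in range(3))
        cols_ok = all(grid[0][c] * grid[1][c] * grid[2][c] == COLUMN_PRODUCTS[c]
                      for c in range(3))
        if rows_ok and cols_ok:
            found.append(grid)
    return found
```

The list returned by `consistent_assignments()` is empty: the product of all nine values would have to be plus one (from the rows) and minus one (from the columns) at once.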

Now recall that, given a uniform static magnetic field B in the z-direction, the Hamiltonian is $H = \frac{e}{mc}B\sigma_z$, for which the time-dependent solution of the Schrödinger equation is

$$\psi(t) = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{-i\omega t} \\ e^{+i\omega t} \end{pmatrix},$$

and this in turn gives time-dependent expectation values for spin values in the x, y directions [15]:

$$\langle\sigma_x\rangle = \frac{\hbar}{2}\cos(\omega t), \qquad \langle\sigma_y\rangle = \frac{\hbar}{2}\sin(\omega t), \tag{36}$$

where $\omega = eB/mc$.


Proof of a Bell-Kochen-Specker theorem depends on simultaneously assigning the eigenvalues $\pm 1$ to $\sigma_x$, $\sigma_y$ and $\sigma_z$ as measurables for each particle. (With some effort, for all other proofs of this theorem one can find an equivalent assumption.) However, as Barut [13] observed, and as can be seen in Eq. (36), if the eigenvalues $\pm 1$ are realizable measurement results in the B-field direction, then in the other two directions the expectation values oscillate out of phase and therefore cannot be simultaneously equal to $\pm 1$. Thus this variation of a Bell theorem also is defective physics.

A local model for EPR (polarization) Correlations

The following model incorporates the features of polarization correlations without preternatural aspects or the concept of the photon. The basic assumption is that the source emits oppositely directed, anticorrelated classical electromagnetic signals,

$$\mathbf{E}_A = \hat{x}\cos(\nu) + \hat{y}\sin(\nu), \qquad \mathbf{E}_B = -\hat{x}\sin(\nu + \theta) + \hat{y}\cos(\nu + \theta), \tag{37}$$

where factors of the form $\exp(i(\omega t + \mathbf{k}\cdot\mathbf{x} + \xi(t)))$, where $\xi(t)$ is a random variable, are dropped as they are suppressed by averaging [16]. Now, the random variables with physical significance emerging in the detectors, per Malus' Law, are $E_{A,B}$. It is the detectors that digitize the data and create the illusion of photons. But because Maxwell's Equations are not linear in intensities, rather in the fields, a fourth-order field correlation is required to calculate the cross correlation of the intensity:

$$P(a,b) = \kappa\,\langle(\mathbf{A}\cdot\mathbf{B})(\mathbf{B}\cdot\mathbf{A})\rangle, \tag{38}$$

where brackets indicate averages over space-time. (This appears to be the source of entanglement in QM, which is seen to have no basis beyond that found in classical physics.) Here Eq. (38) turns out to be

$$P(+,+) \propto \kappa\int_0^{2\pi}\big(\cos(\nu)\sin(\nu + \theta) - \sin(\nu)\cos(\nu + \theta)\big)^2 d\nu, \tag{39}$$

which gives $P(+,+) = P(-,-) \propto \kappa\sin^2(\theta)$ and $P(-,+) = P(+,-) \propto \kappa\cos^2(\theta)$. The constant $\kappa$ can be eliminated by computing the ratio of particular events to the total sample space, which here includes coincident detections in all four combinations of detectors averaged over all possible displacement angles $\theta$; thus the denominator is

$$\frac{\kappa}{\pi}\int_0^{2\pi}\big(\sin^2(\theta) + \cos^2(\theta)\big)\,d\theta = 2\kappa, \tag{40}$$

so that the ratio becomes

$$P(+,+) = \tfrac{1}{2}\sin^2(\theta), \tag{41}$$

the QM result. This in turn yields the correlation

$$\mathrm{Cor}(a,b) = \frac{P(+,+) + P(-,-) - P(+,-) - P(-,+)}{P(+,+) + P(-,-) + P(+,-) + P(-,+)} = -\cos(2\theta). \tag{42}$$
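The chain from the field correlation to the final correlation can be traced numerically. In the sketch below the like-sign amplitude follows the integrand of Eq. (39); the unlike-sign amplitude is an assumption chosen so that it reproduces the stated $\cos^2(\theta)$ dependence, and the function names are hypothetical.

```python
import math

def coincidence_probabilities(theta, samples=360):
    """Normalised coincidence probabilities for analyzer displacement theta."""
    like = 0.0
    unlike = 0.0
    for i in range(samples):
        nu = 2.0 * math.pi * (i + 0.5) / samples
        # Integrand of Eq. (39); it reduces to sin(theta) for every nu.
        amp_like = math.cos(nu) * math.sin(nu + theta) - math.sin(nu) * math.cos(nu + theta)
        # Assumed counterpart for unlike detectors; it reduces to cos(theta).
        amp_unlike = math.cos(nu) * math.cos(nu + theta) + math.sin(nu) * math.sin(nu + theta)
        like += amp_like ** 2
        unlike += amp_unlike ** 2
    total = 2.0 * like + 2.0 * unlike  # all four detector combinations
    return {"++": like / total, "--": like / total,
            "+-": unlike / total, "-+": unlike / total}

def correlation(theta):
    """Correlation of Eq. (42) built from the four coincidence probabilities."""
    p = coincidence_probabilities(theta)
    return p["++"] + p["--"] - p["+-"] - p["-+"]
```

For a given angle, `coincidence_probabilities(theta)["++"]` should match $\tfrac{1}{2}\sin^2(\theta)$ and `correlation(theta)` should match $-\cos(2\theta)$, both up to rounding.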

If the fundamental assumptions involved in this local realistic model are valid, then there would be observable consequences. For example, if radiation on the other side of a photodetector is continuous and not comprised of photons, then photoelectrons are evoked independently in each detector by continuous but (anti)correlated radiation. Thus the density of photoelectron pairs should be linearly proportional (barring effects caused by limited coherence) to the coincidence window width. On the other hand, if photons are in fact generated in matched pairs at the source, then at very low intensities the detection rate should be relatively insensitive to the coincidence window width, once it is wide enough to capture both electrons.

References

1. L. de la Peña and A. M. Cetto, The Quantum Dice (Kluwer, Dordrecht, 1996).
2. A. F. Kracklauer, An Intuitive Paradigm for Quantum Mechanics, Physics Essays 5 (2), 226 (1992).
3. A. F. Kracklauer, Found. Phys. Lett. 12 (5), 441 (1999).
4. G. Hermann, Die Naturphilosophischen Grundlagen der Quantenmechanik, Abhandlungen der Friesschen Schule 6, 75-152 (1935).
5. D. Bohm, Causality and Chance in Modern Physics (Routledge & Kegan Paul Ltd., London, 1957).
6. H. Puthoff, Phys. Rev. A 40, 4857 (1989); 44, 3385 (1991).
7. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
8. J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge University Press, Cambridge, 1987).
9. J. S. Bell, in Foundations of Quantum Mechanics, Proceedings of the International School of Physics Enrico Fermi, course IL (Academic, New York, 1971), p. 171-181; reprinted in Ref. [8].
10. A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum, New York, 1999), review theory and experiments from a current perspective.
11. A. F. Kracklauer, in New Developments on Fundamental Problems in Quantum Mechanics, M. Ferrero and A. van der Merwe (eds.) (Kluwer, Dordrecht, 1997), p. 185.
12. L. Sica, Opt. Commun. 170, 55-60 & 61-66 (1999).
13. A. O. Barut, Found. Phys. 22 (1), 137 (1992).
14. N. D. Mermin, Rev. Mod. Phys. 65 (3), 803 (1993).
15. R. H. Dicke and J. P. Wittke, Introduction to Quantum Mechanics (Addison-Wesley, Reading, 1960), p. 195.
16. A. F. Kracklauer, in Instantaneous Action-at-a-Distance in Modern Physics, A. E. Chubykalo, V. Pope and R. Smirnov-Rueda (eds.) (Nova Science, Commack, NY, 1999), p. 379; arXiv quant-ph/0007101; Ann. Fond. L. de Broglie 20 (2), 193 (2000).

A PROBABILISTIC INEQUALITY FOR THE KOCHEN-SPECKER PARADOX

JAN-ÅKE LARSSON
Matematiska Institutionen, Linköpings Universitet, SE-581 83 Linköping, Sweden
E-mail: jalar@mai.liu.se

A probabilistic version of the Kochen-Specker paradox is presented. The paradox is restated in the form of an inequality relating probabilities from a non-contextual hidden-variable model, by formulating the concept of "probabilistic contextuality". This enables an experimental test for contextuality at low experimental error rates. Using the assumption of independent errors, an explicit error bound of 0.71% is derived, below which a Kochen-Specker contradiction occurs.

1 Introduction

The description of quantum-mechanical (QM) processes by hidden variables is a subject being actively researched at present. The interest can be traced to topics where recent improvements in technology have made testing and using QM processes possible. Research in this field is usually intended to provide insight into whether, how, and why QM processes are different from classical processes. Here, the presentation will be restricted to the question whether or not there is a possibility of describing a certain QM system using a non-contextual hidden-variable model. A non-contextual hidden-variable model would be a model where the result of a specific measurement does not depend on the context, i.e., on what other measurements are simultaneously performed on the system. It is already known that for perfect measurements (perfect alignment, no measurement errors) no non-contextual model exists. These results originate in the work of Gleason [1], but a conceptually simpler proof was given by Kochen and Specker [2] (KS).

The KS theorem concerns measurements on a QM system consisting of a spin-1 particle. In the QM description of this system, the operators associated with measurement of the spin components along orthogonal directions do not commute, i.e.,

$$\hat{s}_x,\ \hat{s}_y \text{ and } \hat{s}_z \text{ do not commute;} \tag{1}$$

however, the operators that are associated with measurement of the square of the spin components do commute, i.e.,

$$\hat{s}_x^2,\ \hat{s}_y^2 \text{ and } \hat{s}_z^2 \text{ commute.} \tag{2}$$

The latter operators (the squared ones) have the eigenvalues 0 and 1, and

$$\hat{s}_x^2 + \hat{s}_y^2 + \hat{s}_z^2 = 2\,\mathbb{1}. \tag{3}$$

Thus, it is possible to simultaneously measure the square of the spin components along three orthogonal vectors, and two of the results will be 1 while the third will be 0. Only this QM property of the system will be used in what follows.

The notation used from now on is intended to avoid confusion with QM notation, since the notions used will be those of (Kolmogorovian) probability theory, not QM. A hidden-variable model will be taken to be a probabilistic model, i.e., the hidden variable $\lambda$ is represented as a point in a probabilistic space $\Lambda$, and sets in this space (events) have a probability given by the probability measure P. The measurement results are described by random variables (RVs) $X_i(\lambda)$, which take their values in the value space $\{0,1\}$.

These mappings will depend not only on the hidden variable $\lambda$, but also on the specific directions in which we choose to measure the squared spin components, so that we would have

$$\begin{aligned} X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda)&: \Lambda \to \{0,1\} \\ X_2(\mathbf{x},\mathbf{y},\mathbf{z},\lambda)&: \Lambda \to \{0,1\} \\ X_3(\mathbf{x},\mathbf{y},\mathbf{z},\lambda)&: \Lambda \to \{0,1\} \end{aligned} \tag{4}$$

Here Xi is the result of the measurement along the first direction (x) X2

along the second (y) and X3 along the third (z) To be able to model the spin-1 system described above these RVs would need to sum to two ie

3

^ X i ( x y z A ) = 2 (5) i= l

This is in itself no guarantee that the model will be accurate but it is the least one would expect from a hidden-variable model yielding the QM behaviour

In simple experimental setups there is usually only one direction specified (the direction along which the spin component squared is measured) Thus we would expect that X only depends on x (and A) This is referred to as non-contextuality and more formally this can be written as

Xi(xyzA) =X 1 (x y z A )

X 2 (x y z A)=X 2 (x y z A ) (6)

AT3(xyzA) = X 3 ( x y z A )

These two prerequisites are all that is needed to arrive at the Kochen-Specker paradox

2 The Kochen-Specker theorem

A more appropriate name for this section is perhaps "A Kochen-Specker theorem", since there are several variants; the example presented here is from Peres (1993)3. All variants aim for the same thing: to show a contradiction by assigning values to measurement results coming from a non-contextual hidden-variable model. In this particular one3, a set of 33 three-dimensional vectors is used, depicted in Fig. 1.

Figure 1: The 33 vectors used in the Kochen-Specker theorem. The vectors are from the center of the cube onto one of the spots on the cube's surface (normalized if desired).

The proof is as follows: assume that we have a non-contextual hidden-variable model. Then, for any $\lambda$ (except perhaps for a null set), this model satisfies equations (5) and (6), in particular for the directions in Fig. 1. Now look at Fig. 2(a). The measurement result along one of the coordinate axes must be 0, and along the other axes it must be 1. Let us assume that the 0 is obtained from the measurement along the z axis (the white spot on the cube) and that the other two measurements yield 1 (black spots). Measurements along other directions in the xy-plane must also yield 1, as indicated in Fig. 2(a). In Fig. 2(b-d), three more similar choices are made, and having made these assignments, a white spot must be added at the position indicated in Fig. 2(e) because of the two black spots at orthogonal positions, and by this another black spot must be added, being orthogonal to the white one. This procedure continues in Fig. 2(f-j) until all the spots are painted either white or black, as necessitated by the previously painted spots. Finally, in Fig. 2(k), we have three black orthogonal spots, violating equation (5), the condition of QM results. A similar contradiction will occur whatever choices we make in our assignments in Fig. 2(a-d), and we have a proof of the KS theorem.

Footnote: The spot colours were green and red in Peres3.

Figure 2: A proof of the Kochen-Specker paradox. Panels (a)-(d): arbitrary choice; panels (e)-(j): orthogonality; panel (k): contradiction.

We have

Theorem 1 (Kochen-Specker) The following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. Measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}) : \Lambda \to \{0,1\}, \quad i = 1,2,3.$$

(ii) Non-contextuality. The result along a vector is not changed by rotation around that vector. For example,

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda).$$

(iii) Quantum-mechanical results. For any triad, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2.$$
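The colouring argument can also be checked by exhaustive search. The sketch below is not the step-by-step proof of Fig. 2; it is a brute-force check under the assumption that the 33 directions are those of Peres's set, built here from the coordinate patterns $(0,0,1)$, $(0,1,1)$, $(0,1,\sqrt{2})$ and $(1,1,\sqrt{2})$ with all permutations and sign changes. It enforces the two rules used in the text: a complete orthogonal triad must sum to two, and two orthogonal directions can never both give 0. All function names are hypothetical.

```python
import itertools
import math

ROOT2 = math.sqrt(2)

def canonical(v):
    """Represent the ray {v, -v} by the vector whose first non-zero entry is positive."""
    first = next(c for c in v if c != 0)
    return v if first > 0 else tuple(-c for c in v)

def peres_rays():
    """All distinct rays obtained from the four coordinate patterns."""
    rays = set()
    for pattern in [(0, 0, 1), (0, 1, 1), (0, 1, ROOT2), (1, 1, ROOT2)]:
        for perm in itertools.permutations(pattern):
            for signs in itertools.product((1, -1), repeat=3):
                rays.add(canonical(tuple(s * c for s, c in zip(signs, perm))))
    return sorted(rays)

def orthogonal(u, v):
    return abs(sum(a * b for a, b in zip(u, v))) < 1e-9

def find_assignment(rays):
    """Backtracking search; returns (assignment or None, number of triads)."""
    n = len(rays)
    nbrs = [[j for j in range(n) if j != i and orthogonal(rays[i], rays[j])]
            for i in range(n)]
    triads = [t for t in itertools.combinations(range(n), 3)
              if all(orthogonal(rays[a], rays[b])
                     for a, b in itertools.combinations(t, 2))]
    # Visit rays triad by triad so that violations show up early.
    order = list(dict.fromkeys(k for t in triads for k in t))
    order += [k for k in range(n) if k not in order]
    value = [None] * n

    def ok(i):
        if value[i] == 0 and any(value[j] == 0 for j in nbrs[i]):
            return False  # two orthogonal directions cannot both give 0
        for t in triads:
            if i in t:
                vals = [value[k] for k in t]
                if None not in vals and sum(vals) != 2:
                    return False  # a complete triad must sum to two
        return True

    def extend(pos):
        if pos == n:
            return True
        i = order[pos]
        for v in (0, 1):
            value[i] = v
            if ok(i) and extend(pos + 1):
                return True
        value[i] = None
        return False

    return (list(value) if extend(0) else None), len(triads)

RAYS = peres_rays()
ASSIGNMENT, N_TRIADS = find_assignment(RAYS)
```

Under these assumptions the search returns no assignment, in line with the contradiction reached in Fig. 2(k), and it counts the 16 orthogonal triads mentioned later in the text.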

Note that there is a certain structure to the proof: assignment of measurement results on a finite number of orthogonal triads according to the QM rule, and rotations connecting the measurement results on different triads by non-contextuality. This structure can be made explicit in the statement of the theorem by introducing the set $E_{KS}$ (a KS set of triads).

[Equation (7), the definition of $E_{KS}$, is not recoverable from the source.]

In this set there are $n$ vectors forming $N$ distinct orthogonal triads, where some vectors are present in more than one triad, establishing in total $M$ connections by rotation around a vector. Using this notation, (a restricted version of) the KS theorem is

Theorem 1 (Kochen-Specker) Given a KS set of vector triads $E_{KS}$, the following three prerequisites cannot hold simultaneously for any $\lambda$:

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}) : \Lambda \to \{0,1\}, \quad i = 1,2,3.$$

(ii) Non-contextuality. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the result along that vector is not changed by the rotation. For example,

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda).$$

(iii) Quantum-mechanical results. For any triad in $E_{KS}$, the sum of the results is two, i.e.,

$$\sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2.$$

This version of the KS theorem will be useful when formulating a probabilistic version of the theorem.

3 The Kochen-Specker inequality

The above discussion is valid in an ideal situation where no measurement errors are present. When measurement errors are introduced, they occur as (i) missing detections, (ii) changes in the results along the axis vector when rotating, or (iii) deviations from the sum 2. Since the prerequisites of Theorem 1 are no longer valid, neither is the theorem. However, using probabilistic notions, the theorem can be restated as follows.

Theorem 2 (Kochen-Specker inequality) Given a KS set $E_{KS}$ of $N$ vector triads with $M$ interconnections by rotation, if we have

(i) Realism. For any triad in $E_{KS}$, the measurement results can be described by probability theory, using three (families of) RVs

$$X_i(\mathbf{x},\mathbf{y},\mathbf{z}) : \Lambda_{X_i} \to \{0,1\}, \quad i = 1,2,3,$$

where $\Lambda_{X_i}$ is a (possibly proper) subset of $\Lambda$.

(ii) Rotation error bound. For any pair of triads in $E_{KS}$ related by a rotation around a vector, the set of $\lambda$s where the result along that vector is not changed by the rotation is probabilistically large (has probability greater than $1 - \delta$). For example,

$$P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\bigr) > 1 - \delta.$$

(iii) Sum error bound. For any triad in $E_{KS}$, the set of $\lambda$s where the sum of the results is two is probabilistically large (has probability greater than $1 - \varepsilon$), i.e.,

$$P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) > 1 - \varepsilon.$$

Then

$$M\delta + N\varepsilon > 1.$$

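To shorten the proof, the following symmetry of the measurement results is assumed to hold (the proof goes through without the symmetry, but grows notably in size):

$$X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_2(\mathbf{z},\mathbf{x},\mathbf{y},\lambda) = X_3(\mathbf{y},\mathbf{z},\mathbf{x},\lambda). \tag{8}$$

Proof. By Theorem 1 we have

$$\Bigl(\bigcap_{M} \{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\Bigr) \cap \Bigl(\bigcap_{N} \Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) = \emptyset.$$

Then the complement has probability one, and

$$\begin{aligned}
1 &= P\Bigl(\Bigl(\bigcup_{M} \{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}^{c}\Bigr) \cup \Bigl(\bigcup_{N} \Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}^{c}\Bigr)\Bigr)\\
&\le \sum_{M} P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}^{c}\bigr) + \sum_{N} P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}^{c}\Bigr)\\
&< M\delta + N\varepsilon.
\end{aligned} \tag{9}$$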

Here the probability in (iii) is to be read as the probability of obtaining results for all three $X_i$ and that the sum is two. In other words, it is possible to avoid using the no-enhancement assumption in Theorem 2, but unfortunately, inefficient detector devices would contribute no-detection events to both the error rates $\delta$ and $\varepsilon$, which puts a rather high demand on experimental equipment. While the no-enhancement assumption can be used in inefficient setups, this may weaken the statement (cf. a similar argument for the GHZ paradox4).

The error rate $\varepsilon$ is the probability of getting an error in the sum (both non-detections and the wrong sum are errors here), not the probability of getting an error in an individual result. This makes it easy to extract $\varepsilon$ from experimental data, but unfortunately, the errors that arise in rotation are not available in the experimental data, so it is not possible to estimate the size of $\delta$ (note that it is not even meaningful to discuss $\delta$ in QM). It is possible to use $\varepsilon$ to obtain a bound for $\delta$.

Corollary 3 (Kochen-Specker inequality) Given a KS set of $N$ vector triads $E_{KS}$ with $M$ interconnections by rotation, if Theorem 2 (i-iii) hold, then

$$\delta > \frac{1 - N\varepsilon}{M}.$$

Obviously, a small $E_{KS}$ set (small $N$ and $M$) is better, yielding a higher bound for $\delta$ for a given $\varepsilon$ (for a few different KS sets, see 2, 3, 5).
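Rearranging the inequality of Theorem 2, $M\delta + N\varepsilon > 1$, gives the lower bound $(1 - N\varepsilon)/M$ on $\delta$. A minimal sketch of how this bound behaves is given below; the function name is hypothetical, and the counts $N = 24$, $M = 31$ are the ones quoted later in the paper for the set analysed there.

```python
def rotation_error_bound(n_triads, n_links, eps):
    """Lower bound (1 - N*eps)/M on delta implied by M*delta + N*eps > 1."""
    return (1 - n_triads * eps) / n_links

# Counts quoted later in the paper for the extended set (N triads, M connections).
N_TRIADS, N_LINKS = 24, 31

# The bound shrinks as the sum error rate eps grows and becomes vacuous
# (non-positive) once eps reaches 1/N.
BOUNDS = {eps: rotation_error_bound(N_TRIADS, N_LINKS, eps)
          for eps in (0.0, 0.01, 0.05)}
```

A smaller set (smaller `n_triads` and `n_links`) raises the bound for the same `eps`, which is the point made in the remark above.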

In an inexact experiment yielding a large $\varepsilon$, one expects the error rate $\delta$ to be large as well, whereas the bound in Corollary 3 will be low because of the large $\varepsilon$. A model for this inexact experiment may then be said to be probabilistically non-contextual: the measurement error rate is large enough to allow the changes arising in rotation to be explained as natural errors in the inexact measurement device, rather than being fundamentally contextual. For a good experiment yielding a low $\varepsilon$, one expects $\delta$ to be low, but here the bound in Corollary 3 is higher. In a hidden-variable model of this experiment, the changes arising in rotation occur at an unexpectedly high rate, which cannot be explained as due to measurement errors, and a model of this type may be said to be probabilistically contextual. Note that this probabilistic non-contextuality is a weaker notion than the one used in Theorem 1 (ii).

4 Independence

To enable a general statement, the proof of Theorem 2 does not make any assumptions on independence of the errors, but it is possible to give a more quantitative bound for the error rate by introducing independence (for simplicity, at 100% detector efficiency).

Corollary 4 (KS inequality for independent errors) Assuming that the errors are independent at the rate $r$ and that Theorem 2 (i-iii) hold, then both $\delta$ and $\varepsilon$ are given by $r$, and

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) > 1.$$

Proof. In the case of independent errors at the rate $r$, the expressions for the probabilities in Theorem 2 (ii) and (iii) are

$$\begin{aligned}
P\bigl(\{\lambda : X_1(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = X_1(\mathbf{x},\mathbf{y}',\mathbf{z}',\lambda)\}\bigr) &= P(\text{no errors}) + P(\text{flip on both } X_1\text{s})\\
&= (1-r)^2 + r^2 = 1 - (2r - 2r^2),
\end{aligned} \tag{ii}$$

$$\begin{aligned}
P\Bigl(\Bigl\{\lambda : \sum_i X_i(\mathbf{x},\mathbf{y},\mathbf{z},\lambda) = 2\Bigr\}\Bigr) &= P(\text{no errors}) + P(\text{flip of the 0 and one 1})\\
&= (1-r)^3 + 2(1-r)r^2 = 1 - (3r - 5r^2 + 3r^3).
\end{aligned} \tag{iii}$$

The probabilities of these sets are not independent, so from this point on we cannot use independence. The inequality above then follows easily from Theorem 2.
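The two closed-form probabilities in the proof can be confirmed by enumerating all flip patterns. The sketch below assumes, as in the corollary, that each individual result is flipped independently with probability $r$ and that the error-free triad reads $(1, 1, 0)$; the function names are illustrative only, and exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

def pattern_weight(flips, r):
    """Probability of one particular pattern of independent flips."""
    weight = 1
    for f in flips:
        weight *= r if f else 1 - r
    return weight

def prob_unchanged(r):
    """P(two independently flipped readings of the same value agree)."""
    return sum(pattern_weight(flips, r)
               for flips in product((0, 1), repeat=2)
               if flips[0] == flips[1])

def prob_sum_two(r, truth=(1, 1, 0)):
    """P(the three observed results still sum to two)."""
    total = 0
    for flips in product((0, 1), repeat=3):
        observed = [t ^ f for t, f in zip(truth, flips)]
        if sum(observed) == 2:
            total += pattern_weight(flips, r)
    return total

R = Fraction(1, 100)  # hypothetical error rate used only for the check
ROTATION_OK = prob_unchanged(R) == 1 - (2 * R - 2 * R**2)
SUM_OK = prob_sum_two(R) == 1 - (3 * R - 5 * R**2 + 3 * R**3)
```

The only patterns that keep the sum at two are "no flips" and "flip the 0 together with exactly one of the 1s", which is where the term $2(1-r)r^2$ comes from.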

An expression of the form $r > f(N, M)$ can now be derived from Corollary 4, but this complicated expression is not central to the present paper. One important observation is that, again, to obtain a contradiction for high error rates ($r$), a small $E_{KS}$ set is needed (small $N$ and $M$). Unfortunately, the error rate needs to be very low; e.g., for the $E_{KS}$ in the present example (see the footnote), only an error rate $r$ below 0.71% yields a contradiction in Corollary 4. Please note that there is no experimental check of whether the assumption of independent errors holds or not. While it may be possible to check the errors in the sum, it is not possible to extract what errors are present in the rotations, or to check for independence of those errors (further discussion of independence is necessary but cannot be fitted into this limited space).

Footnote: The set contains 33 vectors forming 16 distinct orthonormal bases3, but some rotations used are not between two of these 16 bases; in some cases, a rotation goes from one of the 16 bases to a pair of vectors in the set (where the third needed to form a basis is not in the set), and a subsequent rotation returns us to another of the 16 bases. Thus, in the notation adopted here, a few extra vectors are needed to form $E_{KS}$, yielding $n = 41$, $N = 24$ and $M = 31$. Note that these additional vectors are not needed to yield the KS contradiction, but are only needed in the proof of the inequality in this paper. A more detailed analysis for the initial set of 33 vectors is possible, probably yielding a contradiction at a somewhat higher $r$ than the one obtained from this general analysis, but this is lengthy and will not be done here.
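The left-hand side of the inequality in Corollary 4 is easy to evaluate for the counts $N = 24$ and $M = 31$ quoted in the footnote. The sketch below is one way to do this, not the author's derivation of $f(N, M)$: it evaluates the expression at a given $r$ and locates by bisection the point where it reaches 1 (the expression is increasing in $r$ on $[0, 1/2]$). Function names are hypothetical.

```python
def ks_lhs(r, n_triads, n_links):
    """Left-hand side M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) of Corollary 4."""
    return (n_links * (2 * r - 2 * r**2)
            + n_triads * (3 * r - 5 * r**2 + 3 * r**3))

def crossing_rate(n_triads, n_links, tol=1e-12):
    """Error rate in (0, 1/2) at which the left-hand side reaches 1 (bisection)."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ks_lhs(mid, n_triads, n_links) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

N_TRIADS, N_LINKS = 24, 31          # values from the footnote
LHS_AT_QUOTED_RATE = ks_lhs(0.0071, N_TRIADS, N_LINKS)
CROSSING = crossing_rate(N_TRIADS, N_LINKS)
```

At $r = 0.0071$ the left-hand side is below 1, so the inequality is violated and a contradiction follows, consistent with the statement in the text that error rates below 0.71% suffice.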


5 Conclusions

To conclude, for any hidden-variable model we have a bound on the changes arising in rotation:

$$\delta > \frac{1 - N\varepsilon}{M}.$$

Here $N$ is the number of triads in $E_{KS}$ and $M$ is the number of connections within $E_{KS}$. A proof using few triads with few connections is not only easier to understand, but is also essential to yield a bound usable in real experiments. At a large error rate $\varepsilon$, probabilistically non-contextual models cannot be ruled out, since the changes of the results arising in rotation can be attributed to measurement errors. However, a small error rate $\varepsilon$ will force any hidden-variable description of the physical system to be probabilistically contextual.

If the assumption of independent errors is used, an explicit bound can be determined for the error rate $r$:

$$M(2r - 2r^2) + N(3r - 5r^2 + 3r^3) > 1, \tag{13}$$

which can be written in the form $r > f(N, M)$. Below the bound, we have a KS contradiction. Again, a small KS set is better than a large one, yielding a higher bound. For example, for the KS set used here3, an $r$ below 0.71% yields a contradiction.

While writing this paper, the author learned from C. Simon that a similar approach was in preparation by him, C. Brukner, and A. Zeilinger6.

The author would like to thank A. Kent for discussions. This work was partially supported by the Quantum Information Theory Programme at the European Science Foundation.

1. A. M. Gleason, J. Math. Mech. 6, 885 (1957).
2. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
3. A. Peres, Quantum Theory: Concepts and Methods, Ch. 7 (Kluwer, Dordrecht, 1993).
4. D. M. Greenberger, M. Horne, A. Shimony, and A. Zeilinger, Am. J. Phys. 58, 1131 (1990); N. D. Mermin, Phys. Rev. Lett. 65, 1838 (1990); J.-A. Larsson, Phys. Rev. A 57, R3145 (1998); J.-A. Larsson, Phys. Rev. A 59, 4801 (1999).
5. A. Peres, J. Phys. A 24, L175 (1991); J. Zimba and R. Penrose, Stud. Hist. Philos. Sci. 24, 697 (1993).
6. C. Simon, C. Brukner, and A. Zeilinger, quant-ph/0006043.

QUANTUM STOCHASTICS. THE NEW APPROACH TO THE DESCRIPTION OF QUANTUM MEASUREMENTS

ELENA LOUBENETS
Moscow State Institute of Electronics and Mathematics

Abstract

We propose a new general approach to the description of an arbitrary generalized direct quantum measurement with outcomes in a measurable space. This approach is based on the introduction of the physically important mathematical notion of a family of quantum stochastic evolution operators, describing in a Hilbert space the conditional evolution of a quantum system under a direct measurement.

In the frame of the proposed approach, which we call quantum stochastic, all possible schemes of measurements upon a quantum system can be considered.

The quantum stochastic approach (QSA) gives not only the complete statistical description of any quantum measurement (a POV measure and a family of posterior states), but also the complete stochastic description of the random behaviour of a quantum system in a Hilbert space, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement. When a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

In the case of continuous in time measurements, the QSA allows one to define, in the most general case, the notion of the family of posterior pure state trajectories (quantum trajectories) in the Hilbert space of a quantum system and to give their probabilistic treatment.

1 Introduction

The evolution of an isolated quantum system is quantum deterministic, since its behaviour in a complex separable Hilbert space $\mathcal{H}$ is described by a unitary operator $U(t): \mathcal{H} \to \mathcal{H}$ satisfying the Schrödinger equation, whose solutions are reversible in time.

Under a measurement, the behaviour of a quantum system becomes irreversible in time and stochastic: not only is the outcome of a measurement random, being defined with some probability distribution, but the state of a quantum system becomes random as well.

Consider the general scheme of description of any quantum measurement with outcomes of the most general nature possible under a quantum measurement. Such a measurement is usually called generalized.

Let $\Omega$ be a set of outcomes and $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$. Let $\rho_0$ be a state of a quantum system at the instant before a measurement.

The complete statistical description of any generalized quantum measurement implies that for any initial state $\rho_0$ of a quantum system we can present

• the probability distribution of different outcomes of a measurement;
• the statistical description of a state change $\rho_0 \to \rho_{out}$ of the quantum system under a measurement.

We shall also speak of the complete stochastic description of the random behaviour of a quantum system under a measurement, in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement.

Introduce some notation. Let $\mu(E;\rho_0) = \mathrm{Prob}\{\omega \in E;\rho_0\}$, $\forall E \in \mathcal{F}$, be the probability that under a measurement (upon a quantum system being initially in a state $\rho_0$) the observed outcome $\omega$ belongs to a subset $E$.

Let $\mathrm{Ex}\{Z|E\}$ be the conditional expectation of any von Neumann observable $Z \in \mathcal{L}(\mathcal{H})$, $Z = Z^{+}$, at the instant immediately after the measurement, provided the observed outcome $\omega \in E$. Here $\mathcal{L}(\mathcal{H})$ denotes the linear space of all bounded linear operators on $\mathcal{H}$.

The statistical (density) operator $\rho_{out}(E;\rho_0)$ is called a posterior state of a quantum system conditioned by the observed outcome $\omega \in E$ if, for any $Z$, the following relation is valid:

$$\mathrm{Ex}\{Z|E\} = \mathrm{tr}[\rho_{out}(E;\rho_0)Z]. \tag{1}$$

The unconditional (a priori) state $\rho_{out}(\Omega;\rho_0)$ of a quantum system defines the quantum mean value

$$\mathrm{tr}[\rho_{out}(\Omega;\rho_0)Z] = \mathrm{Ex}\{Z|\Omega\} = \langle Z\rangle_{\rho_{out}(\Omega;\rho_0)} \tag{2}$$

of any von Neumann observable $Z$ at the instant immediately after the measurement if the results of a measurement are ignored.

Any conditional state change $\rho_{out}(E;\rho_0)$ of a quantum system under a measurement can be completely described by a family of statistical operators $\{\rho_{out}(\omega;\rho_0), \omega \in \Omega\}$, defined $\mu$-almost everywhere on $\Omega$ and called a family of posterior states.

Specifically, for $\forall E \in \mathcal{F}$ with $\mu(E;\rho_0) \neq 0$,

$$\rho_{out}(E;\rho_0) = \frac{\int_{\omega \in E} \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}, \tag{3}$$

and consequently, due to (1), for any von Neumann observable $Z$ the conditional expectation can be presented as

$$\mathrm{Ex}\{Z|E\} = \frac{\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}. \tag{4}$$

Every posterior state $\rho_{out}(\omega;\rho_0)$ describes the state of a quantum system conditioned by the sharp outcome $\omega$. In general, however, when outcomes of a measurement are not of discrete character or the observation is not sharp, then, provided the outcome $\omega \in E$, we can only say that after a measurement the quantum system is in a state $\rho_{out}(\omega;\rho_0)$ with probability

$$\chi_E(\omega)\,\frac{\mu(d\omega;\rho_0)}{\mu(E;\rho_0)}, \tag{5}$$

where $\chi_E(\omega)$ is the indicator function of a subset $E$.

The a priori state $\rho_{out}(\Omega;\rho_0)$ and the quantum mean value of any von Neumann observable $Z$ at the instant immediately after the measurement are represented through the family of posterior states as

$$\rho_{out}(\Omega;\rho_0) = \int_{\Omega} \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0), \tag{6}$$

$$\langle Z\rangle_{\rho_{out}(\Omega;\rho_0)} = \int_{\Omega} \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0), \tag{7}$$

respectively.

The relation (6) can be considered as the usual statistical average over posterior states $\rho_{out}(\omega;\rho_0)$ given with the probability distribution $\mu(d\omega;\rho_0)$.

From (7) it also follows that in any possible measurement of an observable $Z$ which could be done immediately after the first measurement, the probability distribution $\mathrm{Prob}\{z \in \Delta;\rho_{out}(\Omega;\rho_0)\}$ of possible outcomes is given by

$$\mathrm{Prob}\{z \in \Delta;\rho_{out}(\Omega;\rho_0)\} = \int_{\Omega} \mathrm{Prob}\{z \in \Delta;\rho_{out}(\omega;\rho_0)\}\,\mu(d\omega;\rho_0). \tag{8}$$

This formula can be considered as the quantum analog of Bayes' formula in classical probability theory.

In quantum theory there are two major approaches to the specification of the above-mentioned elements of the description of a quantum measurement.

• The von Neumann approach [1] considers only direct measurements with outcomes in $\mathbb{R}$. According to this approach, only self-adjoint operators on $\mathcal{H}$ are allowed to represent real-valued variables of a quantum system which can be measured (observables). The probability distribution $\mu(E;\rho_0)$ of any measurement is defined as

$$\mu(E;\rho_0) = \mathrm{tr}[\rho_0 P(E)] \tag{9}$$

through the projection-valued measure $P(\cdot)$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ corresponding, due to the spectral theorem, to the self-adjoint operator representing this observable.

Under the von Neumann approach, the posterior state of a quantum system is defined only in the case of a discrete spectrum of the measured quantum variable and is given by the well-known jump of a quantum system under a measurement, prescribed by the von Neumann reduction postulate.

In the case of a continuous spectrum of a quantum observable, the description of a state change of a quantum system under a measurement is not formalized.

The simultaneous measurement of $n$ quantum observables is allowed if and only if the corresponding self-adjoint operators and, consequently, their spectral projection-valued measures commute.

• The operational approach [2-8] gives the complete statistical description of any generalized quantum measurement. In the frame of the operational approach, the mathematical notion of a quantum instrument plays the central role. In the physical literature a quantum instrument is usually called a superoperator.

Specifically, a mapping $T(\cdot)[\cdot]: \mathcal{F} \times \mathcal{L}(\mathcal{H}) \to \mathcal{L}(\mathcal{H})$ is called a quantum instrument if $T(\cdot)$ is a measure on $(\Omega, \mathcal{F})$ with values $T(E)$, $\forall E \in \mathcal{F}$, being linear bounded normal completely positive maps on $\mathcal{L}(\mathcal{H})$ such that the following normalization relation is valid: $T(\Omega)[I] = I$.

Let $T(\cdot)[\cdot]$ be an instrument of a generalized quantum measurement. Then the conditional expectation of any von Neumann observable $Z$ at the instant after a measurement is defined to be

$$\mathrm{Ex}\{Z|E\} = \frac{\mathrm{tr}[\rho_0 T(E)[Z]]}{\mu(E;\rho_0)}, \quad \forall E \in \mathcal{F}. \tag{10}$$

In the case $Z = I$, it follows from (10) that in the frame of the operational approach the probability distribution $\mu(E;\rho_0)$ of outcomes under a measurement is given by

$$\mu(E;\rho_0) = \mathrm{tr}[\rho_0 T(E)[I]], \quad \forall E \in \mathcal{F}. \tag{11}$$

The positive operator-valued measure $M(E) = T(E)[I]$, satisfying the condition $M(\Omega) = I$, is called a probability operator-valued measure, or a POV measure for short.

From (1) and (10) it also follows that for any initial state $\rho_0$ the posterior state $\rho_{out}(E;\rho_0)$ conditioned by the outcome $\omega \in E$ can be represented as

$$\rho_{out}(E;\rho_0) = \frac{T_{*}(E)[\rho_0]}{\mu(E;\rho_0)}, \tag{12}$$

where $T_{*}(E)[\cdot]$ denotes the dual map acting on the linear space $\mathcal{T}(\mathcal{H})$ of trace class operators on $\mathcal{H}$ and defined by

$$\mathrm{tr}[S\,T(E)[Z]] = \mathrm{tr}[T_{*}(E)[S]\,Z], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall S \in \mathcal{T}(\mathcal{H}). \tag{13}$$

For any initial state $\rho_0$ of a quantum system, the family of posterior states $\{\rho_{out}(\omega;\rho_0), \omega \in \Omega\}$ always exists and is defined uniquely, $\mu$-almost everywhere, by the relation

$$\int_{\omega \in E} \mathrm{tr}[\rho_{out}(\omega;\rho_0)Z]\,\mu(d\omega;\rho_0) = \mathrm{tr}[\rho_0 T(E)[Z]], \quad \forall Z \in \mathcal{L}(\mathcal{H}),\ \forall E \in \mathcal{F}. \tag{14}$$

Due to (13) and (14) we have

$$T_{*}(E)[\rho_0] = \int_{\omega \in E} \rho_{out}(\omega;\rho_0)\,\mu(d\omega;\rho_0), \tag{15}$$

and consequently the posterior state $\rho_{out}(\omega;\rho_0)$ is a density of the measure $T_{*}(\cdot)[\rho_0]$ with respect to the probability scalar measure $\mu(\cdot;\rho_0)$.

The operational approach is very important for the formalization of the complete statistical description of an arbitrary generalized quantum measurement.

However, the operational approach does not specify the description of a generalized direct quantum measurement, that is, the situation where we have to describe a direct interaction between a measuring device and an observed quantum system, resulting in some observed outcome $\omega$ in a classical world and the change of the quantum system state conditioned by this outcome.

We would like to emphasize that, in principle, the description of a direct measurement cannot simply be reduced to the quantum theoretical description of a measuring process. We can specify definitely neither the interaction nor the quantum state of the environment of a measuring device, nor can we describe a measuring device in quantum-theory terms only. In fact, under such a scheme the description of a direct quantum measurement is simply postponed to the description of a direct measurement of some observable of the environment of a measuring device.

In general, the operational approach also does not make it possible to include the complete stochastic description of the random behaviour of a quantum system under a measurement.

We recall that for the case of discrete outcomes the von Neumann approach gives both the complete statistical description of a direct quantum measurement and the complete stochastic description, in a Hilbert space, of the random behaviour of a quantum system under a single measurement. In particular, if the initial state $\rho_0$ of a quantum system is pure, that is, $\rho_0 = |\psi_0\rangle\langle\psi_0|$, and if under a single measurement the outcome $\lambda_j$ is observed, then in the frame of the von Neumann approach the quantum system jumps with certainty to the posterior pure state

$$\psi_j = \frac{P_j\psi_0}{\|P_j\psi_0\|}, \tag{16}$$

where $P_j$ is the projection corresponding to the observed eigenvalue $\lambda_j$. The probability $\mu_j$ of the outcome $\lambda_j$ is given by

$$\mu_j = \|P_j\psi_0\|^2. \tag{17}$$
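The rules (16) and (17) can be illustrated with a small finite-dimensional example. The sketch below is only an illustration under assumed data: a three-dimensional Hilbert space, a hypothetical observable with one two-fold degenerate eigenvalue (projection `P1`) and one simple eigenvalue (projection `P2`), and a real initial vector `PSI0`.

```python
import math

def apply(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def norm(vec):
    return math.sqrt(sum(abs(v) ** 2 for v in vec))

def projective_measurement(projectors, psi0):
    """Return one (probability, posterior vector) pair per outcome, as in (16)-(17)."""
    results = []
    for p in projectors:
        unnormalized = apply(p, psi0)            # P_j psi_0
        length = norm(unnormalized)              # ||P_j psi_0||
        posterior = [v / length for v in unnormalized] if length > 0 else None
        results.append((length ** 2, posterior))
    return results

# Hypothetical example data.
P1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
P2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
PSI0 = [1 / 3, 2 / 3, 2 / 3]                     # unit vector

RESULTS = projective_measurement([P1, P2], PSI0)
PROBS = [prob for prob, _ in RESULTS]
```

The probabilities add up to one because the projections resolve the identity, and each posterior vector is normalized by construction.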

We would also like to underline that the description of the stochastic, irreversible in time behaviour of a quantum system under a direct measurement is very important, in particular in the case of continuous in time direct measurements, where the evolution of a continuously observed quantum system cannot be described by reversible in time solutions of the Schrödinger equation.

In quantum theory, any physically based problem must be formulated in unitarily equivalent terms, and the results of its consideration must depend neither on the choice of a special representation picture (Schrödinger, Heisenberg or interaction) nor on the choice of a basis in the Hilbert space. That is why in [9] we introduce the notion of a class of unitarily equivalent measuring processes and analyse the invariants of this class.

We show [9] that the description of any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ can be considered in the frame of a new general approach, which we call quantum stochastic, based on the notion of a family of quantum stochastic evolution operators satisfying the orthonormality relation. In the case when a quantum system is isolated, the family of quantum stochastic evolution operators consists of only one element, which is a unitary operator.

The quantum stochastic approach (QSA), which we present in the next section, can be considered as the quantum stochastic generalization of the description of von Neumann measurements for the case of any measurable space of outcomes, an input probability scalar measure of any type on the space of outcomes, and any type of quantum state reduction. Due to the orthonormality relation, the QSA allows one to interpret the posterior pure states defined by quantum stochastic evolution operators as posterior pure state outcomes in a Hilbert space corresponding to different random measurement channels.

Even for the special case of discrete outcomes, the QSA differs, due to the orthogonality relation for posterior pure state outcomes, from the seemingly similar approaches considered in the physical literature [10,11], where the so-called measurement or Kraus operators are used for the description of both the statistics of a measurement (a POV measure) and the conditional state change of a quantum system.

The QSA gives not only the complete statistical description of any generalized direct quantum measurement, but also the complete stochastic description of the random behaviour of the quantum system under a measurement.

2 Quantum stochastic approach

In this section, we introduce the quantum stochastic approach (QSA) to the description of a generalized direct quantum measurement, developed in [9].

Specifically, it was shown in [9] that for any generalized direct quantum measurement with outcomes in a standard Borel space $(\Omega, \mathcal{F}_B)$ upon a quantum system being, at the instant before the measurement, in a state $\rho_0$, there exist

• the unique family of complex scalar measures, absolutely continuous with respect to a finite positive scalar measure $\nu(\cdot)$ and satisfying the orthonormality relation:

$$\Lambda = \{\pi_{ji}(\omega)\nu(d\omega) : \omega \in \Omega;\ i,j = 1,\dots,N_0\},\ N_0 \le \infty, \qquad \int_{\Omega} \pi_{ji}(\omega)\nu(d\omega) = \delta_{ji}; \tag{18}$$

• the unique (up to phase equivalence) family of $\nu$-measurable operator-valued functions $V_i(\cdot)$ on $\Omega$, satisfying the orthonormality relation, with values being linear operators on $\mathcal{H}$ defined for any $\psi \in \mathcal{H}$ $\nu$-almost everywhere on $\Omega$:

$$\mathcal{V} = \{V_i(\omega) : \omega \in \Omega,\ i = 1,\dots,N_0\}, \qquad \int_{\Omega} V_j^{+}(\omega)V_i(\omega)\pi_{ji}(\omega)\nu(d\omega) = \delta_{ji} I, \tag{19}$$

and such that for any index $i = 1,\dots,N_0$ and for $\forall E \in \mathcal{F}_B$

$$\int_{\omega \in E} V_i(\omega)\pi_{ii}(\omega)\nu(d\omega) \tag{20}$$

is a bounded operator on $\mathcal{H}$. The relation

$$(W_i\psi)(\omega) = V_i(\omega)\psi, \quad \forall \psi \in \mathcal{H}, \tag{21}$$

holding $\nu_i$-almost everywhere on $\Omega$, defines the bounded linear operator $W_i: \mathcal{H} \to L_2(\Omega, \nu_i; \mathcal{H})$ with the norm $\|W_i\| = 1$. Here $\nu_i(d\omega) = \pi_{ii}(\omega)\nu(d\omega)$;

• the unique sequence of positive numbers $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_{N_0})$ satisfying the relation

$$\sum_{i=1}^{N_0} \alpha_i = 1, \tag{22}$$

such that the complete statistical description (a POV measure and a family of posterior states) of a measurement and the complete stochastic description of the random behaviour of a quantum system under a single measurement (a family of posterior pure state outcomes and their probability distribution) are given by:

• The POV measure

$$M(E) = \sum_{i=1}^{N_0} \alpha_i M_i(E), \quad \forall E \in \mathcal{F}_B, \tag{23}$$

with

$$M_i(E) = \int_{\omega \in E} V_i^{+}(\omega)V_i(\omega)\nu_i(d\omega). \tag{24}$$

• The family of posterior states

$$\rho_{out}(\omega;\rho_0) = \sum_{i=1}^{N_0} \xi_i(\omega)\,\tau_{out}^{(i)}(\omega;\rho_0), \tag{25}$$

with

$$\tau_{out}^{(i)}(\omega;\rho_0) = V_i(\omega)\rho_0 V_i^{+}(\omega) \tag{26}$$

and

$$\xi_i(\omega) = \frac{\alpha_i \pi_{ii}(\omega)}{\sum_j \alpha_j \pi_{jj}(\omega)\,\mathrm{tr}[\tau_{out}^{(j)}(\omega;\rho_0)]}. \tag{27}$$

• The probability scalar measure of the measurement, given by the expression

$$\mu(d\omega;\rho_0) = \sum_i \alpha_i \mu^{(i)}(d\omega;\rho_0) \tag{28}$$

through the probability scalar measures

$$\mu^{(i)}(d\omega;\rho_0) = \mathrm{tr}[\tau_{out}^{(i)}(\omega;\rho_0)]\,\nu_i(d\omega). \tag{29}$$

• The family of random operators (19), describing the stochastic behaviour of the quantum system under a single measurement. Every operator $V_i(\omega)$ defines in the Hilbert space $\mathcal{H}$ a posterior pure state outcome conditioned by the observed result $\omega$ and corresponding to the $i$-th random channel of a measurement.

For any $\psi_0 \in \mathcal{H}$, the following orthonormality relation for a family $\{V_i(\omega)\psi_0 : \omega \in \Omega,\ i = 1,\dots,N_0\}$ of unnormalized posterior pure state outcomes is valid:

$$\int_{\Omega} \langle V_j(\omega)\psi_0, V_i(\omega)\psi_0\rangle\,\pi_{ji}(\omega)\nu(d\omega) = \delta_{ji}\|\psi_0\|^2. \tag{30}$$

For the definite observed outcome $\omega$, the probability of the posterior pure state outcome $V_i(\omega)\psi_0$ in the Hilbert space $\mathcal{H}$ is given by

$$q_i(\omega) = \frac{\alpha_i \pi_{ii}(\omega)\|V_i(\omega)\psi_0\|^2}{\sum_j \alpha_j \pi_{jj}(\omega)\|V_j(\omega)\psi_0\|^2}. \tag{31}$$

We call $V_i(\omega)$ quantum stochastic evolution operators, and the probability scalar measures $\nu_i(\cdot)$, $\sum_i \alpha_i \nu_i(\cdot)$ and $\mu^{(i)}(\cdot;\rho_0)$, $\mu(\cdot;\rho_0) = \sum_i \alpha_i \mu^{(i)}(\cdot;\rho_0)$ the input and output probability measures, respectively.

Due to the decompositions (23), (25) and (28), $M_i(E)$, $\tau_{out}^{(i)}(\omega;\rho_0)$, $\nu_i(\cdot)$ and $\mu^{(i)}(\cdot;\rho_0)$ are interpreted to present the POV measure, the unnormalized posterior state, and the input and output probability distributions of outcomes in the $i$-th random channel of the measurement, respectively. The statistical weights of the different random channels are given by the numbers $\alpha_i$, $i = 1,\dots,N_0$.

The a priori state

$$\rho_{out}(\Omega;\rho_0) = \sum_i \alpha_i \int_{\Omega} \tau_{out}^{(i)}(\omega;\rho_0)\,\nu_i(d\omega) \tag{32}$$

is the usual statistical average over unnormalized posterior states $\tau_{out}^{(i)}(\omega;\rho_0)$ with respect to the input probability distribution of outcomes $\nu_i(\cdot)$ in every channel and with respect to different random channels of the measurement.
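The formulae for the POV measure, the unnormalized posterior states, the output probabilities and the a priori state can be illustrated in the simplest setting. The sketch below is a toy example under strong assumptions that are not taken from [9]: a single random channel ($N_0 = 1$, so the weight is 1), two discrete outcomes, a uniform input probability measure, real diagonal operators `V` on a two-dimensional space, and a hypothetical sharpness parameter `P`.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def weighted_sum(terms):
    """Sum of weight * matrix over (weight, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(w * m[i][j] for w, m in terms) for j in range(n)]
            for i in range(n)]

P = 0.8                                    # hypothetical sharpness parameter
NU = {0: 0.5, 1: 0.5}                      # input probability measure of the channel
V = {                                      # real matrices, so V^+ is the transpose
    0: [[math.sqrt(2 * P), 0], [0, math.sqrt(2 * (1 - P))]],
    1: [[math.sqrt(2 * (1 - P)), 0], [0, math.sqrt(2 * P)]],
}
RHO0 = [[0.75, 0.25], [0.25, 0.25]]        # initial density matrix

# POV measure of each single outcome: V^+ V weighted by the input measure.
POVM = {w: weighted_sum([(NU[w], matmul(transpose(V[w]), V[w]))]) for w in NU}
# Unnormalized posterior states V rho_0 V^+ and output probabilities.
TAU = {w: matmul(matmul(V[w], RHO0), transpose(V[w])) for w in NU}
MU = {w: trace(TAU[w]) * NU[w] for w in NU}
# Normalized posterior states and the a priori state (average over the input measure).
RHO_OUT = {w: [[x / trace(TAU[w]) for x in row] for row in TAU[w]] for w in NU}
A_PRIORI = weighted_sum([(NU[w], TAU[w]) for w in NU])
POVM_TOTAL = weighted_sum([(1, POVM[w]) for w in NU])
```

In this toy case the POV measure sums to the identity, the output probabilities sum to one, and the a priori state keeps unit trace while its off-diagonal elements are reduced relative to `RHO0`.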

Physically, the introduced notion of different random channels of a measurement corresponds, under the same observed outcome, to different random quantum transitions of the environment of a measuring device, which we cannot, however, specify with certainty.

The triple $\gamma = \{\Lambda, \mathcal{V}, \alpha\}$ is called a quantum stochastic representation of a generalized direct measurement.

We call direct measurements presented by different quantum stochastic representations stochastic representation equivalent if the statistical and stochastic descriptions of these direct measurements are identical.

In the frame of the QSA, von Neumann (projective) measurements represent the stochastic representation equivalence class of direct measurements on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ for which the complete statistical and the complete stochastic description is given by the von Neumann measurement postulates [1], presented by the formulae (16), (17).

3 Concluding remarks

We present a new general approach to the description of a generalized direct quantum measurement. The proposed approach makes it possible:

• to give the complete statistical description (a POV measure and a family of posterior states) of any quantum measurement;

• to give the complete description in a Hilbert space of the stochastic behaviour of a quantum system under a measurement (in the sense of specifying the probabilistic transition law governing the change from the initial state of a quantum system to a final one under a single measurement);

• to formalize the consideration of all possible cases of quantum measurements, including measurements continuous in time;

• to give the semiclassical interpretation of the description of a generalized direct quantum measurement.

4 Acknowledgments

This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Växjö University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, NJ, 1955).

2. E. B. Davies and J. T. Lewis, An operational approach to quantum probability, Commun. Math. Phys. 17, 239-260 (1970).

3. E. B. Davies, Quantum Theory of Open Systems (Academic Press, London, 1976).

4. A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (Nauka, Moscow, 1980; English translation: North Holland, Amsterdam, 1982).

5. K. Kraus, States, Effects and Operations: Fundamental Notions of Quantum Theory (Springer-Verlag, Berlin, 1983).

6. M. Ozawa, Quantum measuring processes of continuous observables, J. Math. Phys. 25, 79-87 (1984).

7. M. Ozawa, Conditional probability and a posteriori states in quantum mechanics, Publ. RIMS Kyoto Univ. 21, 279-295 (1985).

8. A. Barchielli and V. P. Belavkin, Measurements continuous in time and a posteriori states in quantum mechanics, J. Phys. A: Math. Gen. 24, 1495-1514 (1991).

9. E. R. Loubenets, Quantum stochastic approach to the description of quantum measurements, Research Report N 39, MaPhySto, University of Aarhus, Denmark (2000).

10. A. Peres, Classical intervention in quantum systems. I. The measuring process, Phys. Rev. A 61, 022116 (1-9) (2000).

11. H. Wiseman, Adaptive quantum measurements, Proceedings of the Workshop on Stochastics and Quantum Physics, Miscellanea N 16, 89-93, MaPhySto, University of Aarhus, Denmark (1999).


ABSTRACT MODELS OF PROBABILITY

V. M. MAXIMOV

Institute of Computer Science, Bialystok University

PL-15887 Bialystok, ul. Sosnowa 64, POLAND

Probability theory presents a mathematical formalization of the intuitive ideas of independent events and of probability as a measure of randomness. It is based on axioms 1-5 of A. N. Kolmogorov [1] and their generalizations [2]. Different formalized refinements were proposed for such notions as events, independence, random value, etc. [2, 3], whereas the measure of randomness, i.e. numbers from $[0,1]$, remained unchanged. To be precise, we mention some attempts at generalizing probability theory with negative probabilities [4]. On the other hand, physicists tried to use negative and even complex values of probability to explain some paradoxes in quantum mechanics [5, 6, 7]. Only recently the necessity of formalizing quantum mechanics and its foundations [8] led to the construction of p-adic probabilities [9, 10, 11], which essentially extended our concept of probability and randomness. Therefore a natural question arises: how to describe the algebraic structures whose elements can be used as a measure of randomness. As a consequence, it becomes necessary to define the types of randomness corresponding to every such algebraic structure. Possibly this leads to another concept of randomness, of a nature different from the combinatorial-metric conception of Kolmogorov. Apparently, a discrepancy between the real type of randomness of some experimental data and the model of randomness used for data processing leads to paradoxes [12]. An algebraic structure whose elements can be used to estimate some randomness will be called a probability set $\Phi$. Naturally, the elements of $\Phi$ are the probabilities.

1 What probability sets $\Phi$ are possible?

For practical conclusions of probability theory, two kinds of events, so-called certain and uncertain, are of importance. Therefore the probability set $\Phi$ must have two types of elements corresponding to certainty and uncertainty. Their main role is that they couple all elements of $\Phi$. We interpret them as the possibility of determining any probability $p \in \Phi$ of a random event by an infinite sequence of independent random variables defined by the probability set $\Phi$. In this connection we do not require a formal physical interpretation of certainty.

For an abstract probability set $\Phi$ we would like to preserve all fundamental properties of probability on $[0,1]$ that correspond to the intuitive idea of the probability of an event.

An analogous situation occurs in logic. A construction which preserves the main properties of Boolean algebra and possesses some new properties led to the appearance of the logical Lukasiewicz-Tarski system [13, 14].


Definition 1 A set $\Phi$ is called a probability set if it has the following properties:

(i) In $\Phi$ a binary operation $\cdot$ is defined as the multiplication of probabilities; it need not be commutative. With respect to this operation the set $\Phi$ is a semigroup. In addition, $\Phi$ consists of three non-intersecting semigroups $\Theta$, $E$ and $P$ such that $\Phi = \Theta \cup P \cup E$. The elements of the semigroup $\Theta$ play the role of zeros, i.e. $\Theta$ is a semigroup of zeros. The elements of $E$ play the role of units, i.e. $E$ is a semigroup of units. $P$ is a semigroup of probabilities. Besides, for all $p \in P$, $\theta \in \Theta$ we have $\theta \cdot p,\ p \cdot \theta \in \Theta$, and for all $p \in P$, $e \in E$ we have $e \cdot p,\ p \cdot e \in P$.

It is clear that zero elements correspond to uncertain events and the unit elements correspond to certain events.

(ii) For some elements of $\Phi$ a commutative and associative operation $+$ of addition is defined. The operations of addition and multiplication are distributive. This means that if for $p, q, r \in \Phi$ the operations $p + q$, $(p + q) + r$ are defined, then the operations $q + r$, $p + (q + r)$ are also defined and the equality $(p + q) + r = p + (q + r)$ holds. In addition, for all $u, v, r$ the operations $u \cdot p + v \cdot q$, $p \cdot u + q \cdot v$ are defined and the equalities $r \cdot (p + q) = r \cdot p + r \cdot q$, $(p + q) \cdot r = p \cdot r + q \cdot r$ hold.

(iii) For all $p \in P$ there exist a complementary element $\bar{p} \in P$ and $e \in E$ such that $p + \bar{p} = e$.

(iv) The operation $+$ is defined for all elements of $\Theta$ and is not defined for elements of $E$. Besides, for all $p \notin E$, $\theta \in \Theta$ the sum $p + \theta$ is defined and $p + \theta \notin E$ (and $p + \theta \notin \Theta$ when $p \in P$). Also, for $e \in E$ the inclusion $\theta + e \in E$ holds, but $p + e$ is not defined.

(v) In $\Phi$ a topology is introduced such that the operations $\cdot$ and $+$ are continuous with respect to it. For an arbitrary neighbourhood $V(\Theta)$ of zeros there exists $p \in \Phi$ such that $p^n \in V(\Theta)$ for $n > n_0(V, p)$.

(vi) If $p, q \in \Phi$ and $p + q \in \Theta$, then it follows that $p, q \in \Theta$ (the property of indecomposability of zero). This property is not necessary. For example, in the complex and p-adic probability it may fail.

(vii) The equation $p^2 = p$ always has solutions in $\Theta$ and $E$. If the equation $p^2 = p$ has solutions only in $\Theta$ and in $E$, then we say that the Kolmogorov condition is valid for the probability set $\Phi$.

The properties (i)-(v) provide the main identity of the calculus of independent probabilities, i.e. if $p_1 + \dots + p_n = e \in E$, $p_i \in P$, then we have

$$(p_1 + \dots + p_n)^k = \sum p_{i_1} \cdots p_{i_k} = e_k \in E .$$
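The identity can be checked numerically in the ordinary case $\Phi = [0,1]$, where the only unit is 1. The following is a minimal sketch under that assumption; the function name `expanded_power` and the sample probabilities are illustrative and not part of the original text.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expanded_power(probs, k):
    """Sum of all ordered products p_{i1} * ... * p_{ik} (the expanded k-th power)."""
    return sum(prod(combo) for combo in product(probs, repeat=k))

# Illustrative probabilities that add up to the unit 1 of [0, 1].
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# Expanding (p1 + p2 + p3)^3 into 27 ordered products gives the unit again.
total = expanded_power(probs, 3)
print(total)
```

Exact rational arithmetic is used so that the comparison with the unit does not depend on floating-point rounding.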

Unfortunately, the operations of direct sum and tensor product applied to $[0,1]$ do not produce a new probability set different from $[0,1]$.

For example, in the case of a direct sum $[0,1] \oplus [0,1]$ with coordinate-wise multiplication we have the pairs $(p,q)$, $p, q \in [0,1]$, as probabilities. Consequently $(p_1,q_1) + (p_2,q_2) = (p_1 + p_2,\ q_1 + q_2)$ and $(p_1,q_1)(p_2,q_2) = (p_1 p_2,\ q_1 q_2)$. Obviously the element $(0,0)$ must be a zero. But then $(p,0)(0,q) = (0,0)$. It follows from the properties of the zero semigroup that $(p,0) \in \Theta$ or $(0,q) \in \Theta$. Assume that $(p,0) \in \Theta$, $p \neq 0$. Then by virtue of the other axioms we obtain $(\tfrac{m}{n}p, 0) \in \Theta$, $0 < \tfrac{m}{n} < 1$, and therefore by the continuity property the set $\{(p,0),\ p \in [0,1]\}$ is contained in $\Theta$. Formally the probability set differs from $[0,1]$. But the factorization with respect to the set $\Theta$ yields $[0,1]$ once again with the usual addition and multiplication (see section 2). However, there exists a probability set $\Phi$ satisfying all axioms in the algebra consisting of pairs $(x,y)$, $x, y \in R$, with the operations of coordinate-wise addition and multiplication.
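The obstruction in the direct-sum example is the presence of zero divisors under coordinate-wise multiplication. A minimal sketch of this step follows; the helper names `add` and `mul` and the sample values are assumptions made for illustration only.

```python
from fractions import Fraction

def add(u, v):
    """Coordinate-wise addition of pairs."""
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    """Coordinate-wise multiplication of pairs."""
    return (u[0] * v[0], u[1] * v[1])

ZERO = (Fraction(0), Fraction(0))

# Two non-zero pairs of the form (p, 0) and (0, q).
u = (Fraction(1, 2), Fraction(0))
v = (Fraction(0), Fraction(1, 3))

# Their product is the zero pair, so at least one of them has to be
# counted among the zeros of the would-be probability set.
print(mul(u, v) == ZERO)
```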

Indeed, consider the set $\Phi$ in Fig. 1 (a parallelogram) bounded by the vertices $0$, $h$, $1$, $1-h$, where $h < \tfrac{1}{2}$. Then we can easily verify that if $(x_1, y_1), (x_2, y_2) \in \Phi$ then $(x_1 x_2,\ y_1 y_2) \in \Phi$. The zero set $\Theta$ consists of the single element $0$ and the set $E$ consists of the single element $1$. The topology of $\Phi$ is induced from $R^2$. The remaining properties of $\Phi$ can be examined easily. Note that the first coordinate $x$ runs over the segment $[0,1]$.

Since $R^2$ with coordinate-wise addition and multiplication is the simplest non-trivial topological semi-field [15], we can consider $\Phi$ as an example of a probability set included in a topological semi-field.

In [16] the foundation of classical probability theory is presented in terms of semi-fields. Thus the construction of probability sets in abstract topological semi-fields can be of interest for applications. In section 3 we consider multidimensional examples of probability sets which may even be non-commutative. These examples go beyond the framework of topological semi-fields.

The zero-indecomposability property may or may not be included among the properties of $\Phi$, depending on the problem. For example, if we consider the whole field of p-adic numbers as a probability set, then the indecomposability property does not hold. Nevertheless, this does not prevent the existence of an analogue of the Bernoulli theorem for p-adic probabilities [10].

However, we can find sets satisfying all axioms in the field of p-adic numbers. For this purpose we take a p-adic number $q$, $|q|_p < 1$, that is not a root of any algebraic equation with integer coefficients. Then the set of p-adic numbers of the form

$$n_k q^k + n_{k+1} q^{k+1} + \dots + n_r q^r ,$$

where $n_k \in N$ and the rest of the $n_i$ belong to $Z$, $k, r = 1, 2, 3, \dots$, and of the form

$$1 - m_s q^s + m_{s+1} q^{s+1} + \dots + m_t q^t ,$$

where $m_s \in N$ and the rest of the $m_j$ belong to $Z$, $s, t = 1, 2, 3, \dots$, together with 0 and 1, is a probability set with the operations of addition and multiplication of p-adic numbers.

The semigroups $\Theta$ and $E$ consist of 0 and 1 respectively. Essentially different examples of probability sets will be considered in sections 3 and 4.

Fig. 1

2 Uniqueness of semigroups of zeros and units

(i) Proposition 1 In the probability set $\Phi$ defined by the operations $\cdot$ and $+$, the semigroups $\Theta$ and $E$ satisfying properties (i)-(iv) are unique.

Proof It is important to note that the semigroups $\Theta$ and $E$ possess the maximality property, i.e. they cannot be extended to semigroups $\Theta'$, $\Theta \subset \Theta'$, and $E'$, $E \subset E'$ (or $E \subset E'$, $\Theta \subset \Theta'$), preserving the properties (i)-(iv). Indeed, if there is an extension $\Theta'$, then there is an element $p \notin \Theta$ such that $p \in \Theta'$. But this contradicts conditions (iii)-(iv), since on the one hand the operation $p + e$, $e \in E$, is not defined for $p \notin \Theta$, and on the other hand the operation $p + e$ is defined for all $e \in E'$ since $p \in \Theta'$.

Now let $\Theta = \Theta'$ and $E \subset E'$. Then there exists an element $p \in E'$ with $p \notin E$. By (iii) there exists $\bar{p} \notin \Theta$ such that $p + \bar{p} \in E \subset E'$. On the other hand, the operation $p + q$ is not defined for $q \notin \Theta' = \Theta$ and $p \in E'$. Thus any two pairs of semigroups $\Theta$ and $E$ satisfying (i)-(iv) are maximal.

For the same reason there exists in $\Phi$ no other pair of semigroups $\Theta_1$ and $E_1$ different from $\Theta$ and $E$. Indeed, assume these semigroups exist. Let $\Theta_1 \neq \Theta$, $\Theta_1 \not\subset \Theta$, $\Theta \not\subset \Theta_1$. Then there exists $p \in \Theta$, $p \notin \Theta_1$. If $E \cap E_1 \neq \emptyset$, then the operation $p + e$ is defined for $e \in E \cap E_1$ since $p \in \Theta$. On the other hand, the operation $p + e$ is not defined for $e \in E_1$ since $p \notin \Theta_1$. If $E \cap E_1 = \emptyset$, we consider an element $p$ such that $p \notin \Theta$ but $p \in \Theta_1$. Then by (iv) the sum $p + q$ is defined for all $q \in \Phi$. On the other hand, the sum $p + e$ is not defined for $e \in E$ since $p \notin \Theta$.

It remains to consider the case when $\Theta = \Theta_1$ but $E \not\supseteq E_1$. This case does not coincide with the case $\Theta = \Theta_1$ and $E \subset E_1$ studied above, but the proof remains the same. Namely, there exists $p \in E_1$ with $p \notin E$. By virtue of (iii) there exists an element $\bar{p} \notin \Theta$ such that $p + \bar{p} \in E$. At the same time the operation $p + \bar{p}$ is not defined, since $p \in E_1$ and $\bar{p} \notin \Theta_1 = \Theta$.

(ii) The homomorphism of the probability set $\Phi_1$ into the probability set $\Phi_2$ can be defined as usual, but with the following natural complement.

Definition 2 A mapping $\varphi$ of a probability set $\Phi_1$ into the probability set $\Phi_2$ is defined to be a homomorphism if

(a) $\varphi$ is a semigroup homomorphism with respect to the multiplication.

(b) If a sum $p + q$ is defined in $\Phi_1$, then the sum $\varphi(p) + \varphi(q)$ is also defined in $\Phi_2$ and $\varphi(p + q) = \varphi(p) + \varphi(q)$.

(c) If a sum $\varphi(p) + \varphi(q)$ is defined in $\Phi_2$, then the sum $p + q$ is defined in $\Phi_1$ and consequently by (iib) we have $\varphi(p + q) = \varphi(p) + \varphi(q)$.

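Proposition 2 Let the probability set $\Phi_2$ be a $\varphi$-homomorphic image of a probability set $\Phi_1$. Let $\Phi_1 = \Theta_1 \cup P_1 \cup E_1$ and $\Phi_2 = \Theta_2 \cup P_2 \cup E_2$, where $\Theta_i$, $E_i$ are semigroups of zeros and units respectively. Then $\varphi(\Theta_1) = \Theta_2$, $\varphi(P_1) = P_2$ and $\varphi(E_1) = E_2$. Also we have $\varphi(\bar{p}) = \overline{\varphi(p)}$ for all $p \in P_1$.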

Proof Consider the sets $\Theta_1' = \varphi^{-1}(\Theta_2)$, $P_1' = \varphi^{-1}(P_2)$, $E_1' = \varphi^{-1}(E_2)$. Since the sets $\Theta_2$, $P_2$ and $E_2$ do not intersect pairwise, the sets $\Theta_1'$, $P_1'$ and $E_1'$ also do not intersect pairwise and $\Phi_1 = \Theta_1' \cup P_1' \cup E_1'$. Since $\Theta_2$, $P_2$, $E_2$ are semigroups, the semigroup properties of $\varphi$ imply that the sets $\Theta_1'$, $P_1'$, $E_1'$ are semigroups in $\Phi_1$. Further, using properties (iia) and (iib) one can easily verify that the sets $\Theta_1'$ and $E_1'$ satisfy conditions (i)-(iv) of definition 1 and thus are semigroups of zeros and units. In view of proposition 1 we have $\Theta_1' = \Theta_1$ and $E_1' = E_1$. It follows that $P_1' = P_1$. Then if $p \in P_1$ there exists an element $\bar{p} \in P_1$ such that $p + \bar{p} \in E_1$. Therefore $\varphi(p + \bar{p}) = \varphi(p) + \varphi(\bar{p}) \in E_2$ and we can set $\overline{\varphi(p)} = \varphi(\bar{p})$.

(iii) Let $\Phi$ be an arbitrary probability set with a semigroup of zeros $\Theta$. Propositions 1 and 2 allow us to consider, instead of the probability set $\Phi$, a homomorphic probability set $\Phi_0$ (by proposition 3 below) whose semigroup of zeros consists of a single element. Denote it by $0$. Then $0$ possesses all properties of the usual zero, i.e. $p + 0 = p$, $0 \cdot p = p \cdot 0 = 0$ for all $p \in \Phi_0$.

Definition 3 The equivalence class $K_q$ of an element $q \in \Phi$ is the set of all elements $p \in \Phi$ for which $p + \theta_1 = q + \theta_2$ for some $\theta_1, \theta_2 \in \Theta$. Set

$$\Phi_0 = \{K_q,\ q \in \Phi\} .$$

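From definition 3 it is clear that $K_\theta = \Theta$ for all $\theta \in \Theta$. Indeed, let $x \in K_\theta$; then by definition 3 we have $x + \theta_1 = \theta + \theta_2$ for some $\theta_1, \theta_2 \in \Theta$. By (vi) it follows that $x \in \Theta$. Further, since $p + \theta = \theta + p$, $\theta \in \Theta$, we have $p \in K_p$.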

The following two lemmas are similar to those for conjugate classes in rings but the proofs are different

Lemma 1 If $z \in K_p$ then $K_z = K_p$.

Proof If $z \in K_p$, then by definition 3 we have $z + \theta_1 = p + \theta_2$ for some $\theta_1, \theta_2 \in \Theta$. Let $x$ be an arbitrary element of $K_z$. Then by definition 3 we have that $x + \theta_3 = z + \theta_4$ for some $\theta_3, \theta_4 \in \Theta$. Adding $\theta_1$ to this equality and using the addition properties in $\Phi$ and the relation $z + \theta_1 = p + \theta_2$, we obtain

$$(x + \theta_3) + \theta_1 = x + (\theta_3 + \theta_1) = (z + \theta_4) + \theta_1 = (z + \theta_1) + \theta_4 = (p + \theta_2) + \theta_4 = p + (\theta_2 + \theta_4) .$$

Since $\theta_3 + \theta_1$ and $\theta_2 + \theta_4$ belong to $\Theta$, it follows from definition 3 that $x \in K_p$, i.e. $K_z \subset K_p$.

Also, from the relation $p + \theta_2 = z + \theta_1$ it follows that $p \in K_z$. Consequently $K_p \subset K_z$ and we have $K_z = K_p$.


Lemma 2 The classes $K_p$ and $K_q$ either coincide or do not intersect.

Proof Indeed, let $K_p \cap K_q \neq \emptyset$. If $z \in K_p \cap K_q$, then by Lemma 1 we have $K_z = K_p$ and $K_z = K_q$, i.e. $K_p = K_q$.

Proposition 3 In the set $\Phi_0$ one can introduce operations of multiplication and addition, naturally induced by the operations in $\Phi$, that transform $\Phi_0$ into a probability set (we denote it again by $\Phi_0$). Moreover, the semigroup of zeros of the probability set $\Phi_0$ consists of a single element $K_\theta = \Theta$, $\forall \theta \in \Theta$, which possesses the properties of a usual zero.

Proof Define the set $K_p + K_q$ by term-by-term addition of elements. The definition of $K_p + K_q$ is correct if $p + q$ is defined. Indeed, let us consider $x \in K_p$, $y \in K_q$. Then by definition 3 we have that $x + \theta_1 = p + \theta_2$, $y + \theta_3 = q + \theta_4$ for some $\theta_i \in \Theta$. Since $p + q$ is defined, properties (ii) and (iv) imply

$$(p + \theta_2) + (q + \theta_4) = (p + q) + (\theta_2 + \theta_4) = (x + y) + (\theta_1 + \theta_3) .$$

Consequently $x + y \in K_{p+q}$ and it follows that $K_p + K_q \subset K_{p+q}$.

Similarly we can define the set $K_p \cdot K_q$ by term-by-term multiplication. If $x \in K_p$, $y \in K_q$, we have $x + \theta_1 = p + \theta_2$ and $y + \theta_3 = q + \theta_4$, $\theta_i \in \Theta$. Multiplying the left-hand and right-hand sides of these equalities and applying the properties of $\Theta$ we obtain

$$(x + \theta_1)(y + \theta_3) = (p + \theta_2)(q + \theta_4), \quad \text{i.e.} \quad x \cdot y + \theta' = p \cdot q + \theta'' ,$$

where $\theta', \theta'' \in \Theta$. Consequently $x \cdot y \in K_{pq}$ and therefore $K_p K_q \subset K_{pq}$.

These inclusions, lemma 2 and properties (iii), (iv) allow us to introduce correctly the operations of multiplication and addition on the classes of $\Phi_0$ by

$$K_p \odot K_q = K_{pq}, \qquad K_p \oplus K_q = K_{p+q} . \qquad (1)$$

These operations transform the set $\Phi_0$ into a probability set $\Phi_0$. The zero semigroup of $\Phi_0$ consists of a single class $\Theta = K_\theta$, $\theta \in \Theta$, and the semigroup of units $E_0$ consists of the classes $K_e$, $e \in E$. Obviously the properties (i)-(vi) of definition 1 can be easily verified. The class $K_\theta = \Theta$, $\forall \theta \in \Theta$, possesses all properties of the usual zero, since $K_q \cdot K_\theta = K_{q\theta} = K_\theta = \Theta$ and $K_q + K_\theta = K_{q+\theta} = K_q$.

We define $\varphi$ on $\Phi$ as $\varphi(p) = K_p$. Obviously the mapping $\varphi$ satisfies the conditions of definition 2 and therefore is a homomorphism of $\Phi$ onto $\Phi_0$.

3 Probabilities with hidden parameters

(i) The idea of hidden variables is very popular in quantum mechanics [17]. With the help of hidden variables many investigators try to overcome some difficulties of quantum mechanics. For example, in [18] a p-adic theory of distributions for hidden variables was proposed to resolve the paradox of Bell's inequality.

On the other hand, we propose to consider the hidden variables as hidden parameters of usual probabilities, so that the latter must be abstract probabilities satisfying the conditions of definition 1.

First we consider one model of hidden parameters for abstract probabilities.

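Definition 4 We say that a set of abstract probabilities $\Phi$ allows hidden parameters $A$ (or $\Phi$ has hidden parameters $A$), where $A$ is a certain topological space, if to each $\alpha \in A$ there corresponds a subset $P_\alpha \subset \Phi$ such that $\bigcup_\alpha P_\alpha = \Phi$, and continuous mappings $\varphi$ and $\psi$ from $A \times A \times \Phi \times \Phi$ into $A$ are defined and possess the following properties. The operations

$$(p, \alpha) + (q, \beta) = (p + q,\ \varphi(\alpha, \beta, p, q)) \qquad (2)$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \psi(\alpha, \beta, p, q)) \qquad (3)$$

where $p \in P_\alpha$, $q \in P_\beta$, $p + q \in P_{\varphi(\alpha,\beta,p,q)}$, $p \cdot q \in P_{\psi(\alpha,\beta,p,q)}$, define on the set of pairs $(p, \alpha)$, $\alpha \in A$, $p \in P_\alpha$, a probability set denoted by $\Phi(A) \subset \Phi \times A$.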

Since the first components in (2) and (3) are given by the operations in the probability set $\Phi$, the hidden parameters can describe additional properties of probabilities, including some possible physical sense. It is obvious that the principal problem concerning probability with hidden parameters is as follows: can we statistically distinguish the sequences $\xi_1(\omega), \dots, \xi_n(\omega)$ and $\eta_1(\omega), \dots, \eta_n(\omega)$, where $\xi_k(\omega)$ are independent random variables with identical distributions with respect to usual probabilities from $[0,1]$ and $\eta_k(\omega)$ are independent random variables with the same values as $\xi_k(\omega)$ but with distributions from the probability set $[0,1] \times A$ and satisfying the condition: if $P\{\xi_k(\omega) \in B\} = p$ then $P\{\eta_k(\omega) \in B\} = (p, \alpha)$ for some $\alpha \in A$?


(ii) Now we consider the principal construction for different examples of usual probability on $[0,1]$ with hidden parameters.

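Proposition 4 Let $\Phi = [0,1]$ and let $A$ be some convex semigroup in an arbitrary Banach algebra over $R$. Then the set $\Phi \times A = \{(p, \alpha),\ \alpha \in A\}$ forms a probability set with respect to the operations

$$(p, \alpha) + (q, \beta) = \left(p + q,\ \frac{p}{p+q}\,\alpha + \frac{q}{p+q}\,\beta\right), \quad p + q \le 1 \qquad (4)$$

$$(p, \alpha) \cdot (q, \beta) = (p \cdot q,\ \alpha \cdot \beta) \qquad (5)$$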

Proof As the zero set $\Theta$ we consider the set $\{(0, \alpha),\ \alpha \in A\}$ and as $E$ we consider the set $\{(1, \alpha),\ \alpha \in A\}$. Then all properties of definition 1 can be easily verified. By proposition 3 all elements of the form $(0, \alpha)$, $\alpha \in A$, can be identified with one zero.
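The verification left to the reader can be spot-checked numerically. The sketch below assumes the simplest choice of hidden parameters, $A = [0,1]$ with ordinary multiplication (an illustrative assumption, not the general Banach-algebra setting), and encodes operations (4) and (5); the names `add`, `mul` and the sample values are hypothetical.

```python
from fractions import Fraction as F

def add(x, y):
    """Operation (4): probabilities add, hidden parameters are averaged
    with weights p/(p+q) and q/(p+q)."""
    (p, a), (q, b) = x, y
    s = p + q
    if not 0 < s <= 1:
        raise ValueError("sum is only defined for 0 < p + q <= 1")
    return (s, (p * a + q * b) / s)

def mul(x, y):
    """Operation (5): probabilities and hidden parameters multiply separately."""
    (p, a), (q, b) = x, y
    return (p * q, a * b)

# Illustrative elements (probability, hidden parameter).
x = (F(1, 4), F(1, 2))
y = (F(1, 3), F(1, 5))
z = (F(1, 6), F(3, 4))
r = (F(1, 2), F(2, 3))

print(add(add(x, y), z) == add(x, add(y, z)))           # associativity of +
print(mul(r, add(x, y)) == add(mul(r, x), mul(r, y)))   # distributivity
print(add(x, (F(3, 4), F(1, 2)))[0] == 1)               # complement gives a unit (1, alpha)
```

A handful of exact checks is of course not a proof; it only illustrates why the weighted average in (4) is compatible with the product in (5).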

A simple interesting example of this kind can be obtained by considering the set of pairs $(p, q)$, $p, q \in [0,1]$, with the operations

$$(p_1, q_1) + (p_2, q_2) = \left(p_1 + p_2,\ \frac{p_1}{p_1 + p_2}\,q_1 + \frac{p_2}{p_1 + p_2}\,q_2\right), \quad 0 < p_1 + p_2 \le 1 \qquad (6)$$

$$(p_1, q_1) \cdot (p_2, q_2) = (p_1 \cdot p_2,\ q_1 \cdot q_2) \qquad (7)$$

Obviously, instead of $q \in [0,1]$ we can take the elements of the Banach algebra of sequences of numbers from $[0,1]$ with coordinate-wise multiplication. We can interpret probabilities $(p, q)$ with hidden parameters $q = (q_1, q_2, \dots)$, $0 \le q_i \le 1$, as follows: if an event $S$ occurs with probability $p$, then the probabilities $q_1, q_2, \dots$ can be considered as probabilities of some independent events $S_1, S_2, \dots$ which can occur when $S$ occurs.

Another example of hidden parameters, interesting from a probabilistic point of view, can be obtained when $q = \|q_{ij}\|$ runs over stochastic matrices. Now we can consider a random index $i$, $i = 1, 2, \dots$, with distribution $(p_i, \|q_{km}\|)$. Thus if the event $i$ occurs with probability $p_i$, then $q_{ij}$ is the probability of some event $S_j$. This duplicates the previous situation, the difference being that matrix multiplication admits more interpretations.

The problem of a general description of all mappings $\varphi$ and $\psi$ of the set $[0,1]^4$ into $[0,1]$, or the full description of probabilities $[0,1]$ with hidden parameters from $[0,1]$, remains open.


(iii) As a prototype of a general construction of a probability set $\Phi$ with hidden parameters we can consider a set of positive measures $\mathfrak{M}(G)$ on some semigroup structure $G$ with the natural operations of addition and composition of measures.

Indeed, let $G$ be an arbitrary locally compact semigroup. Consider the set $\mathfrak{M}(G)$ of all positive measures on $G$ with the weak topology. We can naturally define the operation of convolution (composition) on $\mathfrak{M}(G)$ as follows: for $\mu, \nu \in \mathfrak{M}(G)$ we set [3]

$$\mu\nu(B) = \mu \times \nu\{(x, y) :\ x \cdot y \in B,\ x, y \in G\} \qquad (8)$$

where $\mu \times \nu$ denotes the direct product of the measures $\mu$ and $\nu$ on $G$. Then $\mathfrak{M}(G)$ is a semigroup with respect to the convolution. Besides, the addition $(\mu + \nu)(B) = \mu(B) + \nu(B)$ and the multiplication by a positive number $\lambda$, $(\lambda\mu)(B) = \lambda\mu(B)$, are defined on $\mathfrak{M}(G)$. Obviously the operations of convolution and addition are distributive. Thus the linear set $\mathfrak{M}(G)$ is a convex semigroup with respect to convolution.

The set $\mathfrak{M}(G)$ possesses almost all properties of probability sets with respect to these operations except one: there is no semigroup of units in $\mathfrak{M}(G)$. But if we restrict $\mathfrak{M}(G)$ we can obtain a convex semigroup possessing all properties of a probability set. To this end we consider the subset $\mathfrak{M}_1(G)$ of $\mathfrak{M}(G)$ consisting of all probability measures, i.e. the set of positive measures $\mu$ for which $\mu(G) = 1$. From (8) it follows that $\mathfrak{M}_1(G)$ is a semigroup. Consider the convex closed semigroup $\mathfrak{M}_{[0,1]}(G)$ consisting of all non-negative measures $\mu$ for which $0 \le \mu(G) \le 1$. It can be readily seen that the set $\mathfrak{M}_{[0,1]}(G)$ with the operations of addition and composition satisfies all properties (i)-(vi) of a probability set with the semigroup of units $E = \mathfrak{M}_1(G)$.

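Each element $\mu$ from $\mathfrak{M}_{[0,1]}(G)$ can obviously be represented in the form $p \cdot (\tfrac{1}{p}\mu)$, where $\mu(G) = p \in [0,1]$, $p \neq 0$, $\tfrac{1}{p}\mu \in \mathfrak{M}_1(G)$. If $\mu$ and $\nu$ belong to $\mathfrak{M}_{[0,1]}(G)$, $\mu(G) = p$, $\nu(G) = q$, then we have

$$\mu + \nu = p\left(\tfrac{1}{p}\mu\right) + q\left(\tfrac{1}{q}\nu\right) = (p + q)\left(\frac{p}{p+q}\cdot\tfrac{1}{p}\mu + \frac{q}{p+q}\cdot\tfrac{1}{q}\nu\right) \qquad (9)$$

$$\mu\nu = p\left(\tfrac{1}{p}\mu\right) q\left(\tfrac{1}{q}\nu\right) = pq\left(\tfrac{1}{p}\mu\right)\left(\tfrac{1}{q}\nu\right) . \qquad (10)$$

From (9) and (10) we obtain the following.

Proposition 5 The convex semigroup $\mathfrak{M}_{[0,1]}(G)$ and the set $\Phi \times \mathfrak{M}_1(G)$ of elements $(p, \alpha)$, $p \in [0,1]$, $\alpha \in \mathfrak{M}_1(G)$, with the operations (4), (5) are isomorphic.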

The probabilities $(p, \mu)$ can be interpreted similarly to item (ii) above. However, the structure of the multiplication of the semigroup is rather more complicated. Consider an algebra of some events $\Gamma$. Suppose that each such event has a state which can be represented by an element of a group $G$. Let the probabilities $(p_i, \mu_i)$, $\sum_i (p_i, \mu_i) = (1, \mu)$, assign the distribution on events $\Gamma_i \subset \Gamma$, $\Gamma_i \cap \Gamma_j = \emptyset$. Then the probability $(p_i, \mu_i)$ means the choice of an event $\Gamma_i$ with probability $p_i$ and the choice of a state $g \in G$ with distribution $\mu_i$.

It is obvious that the addition and multiplication of these probabilities must be determined by the physical model obtained from an experiment or theoretically

4 Probability sets with a single unit

If a semigroup $G$ is finite, then $\mathfrak{M}_{[0,1]}(G)$ is a convex set in Euclidean space. We will show that this convex set contains probability subsets with a single unit. A special two-dimensional case of such a probability set was presented in section 1.

(i) Let $G$ be a finite group (commutative or non-commutative) with elements $e_1, e_2, \dots, e_s$, $s \ge 2$. Consider the group algebra $G(R)$, i.e. the linear space of linear forms $x_1 e_1 + \dots + x_s e_s$, $x_i \in R$, with the group multiplication of the basis elements $e_i$. Assume that the basis $e_i$ is orthonormal. Let $\mathfrak{M}_1(G)$ be the simplex formed by the vertices $e_1, e_2, \dots, e_s$ and let the set $\mathfrak{M}_{[0,1]}(G)$ be the simplex formed by the vertices $0, e_1, e_2, \dots, e_s$, see Fig. 2. Then a measure $\mu \in \mathfrak{M}_{[0,1]}(G)$ can be written as $\mu = p_1 e_1 + \dots + p_s e_s$, where $0 \le p_i \le 1$ and $\sum_i p_i \le 1$. The geometrical center of $\mathfrak{M}_1(G)$ is the invariant measure $n_G = \tfrac{1}{s} e_1 + \dots + \tfrac{1}{s} e_s$. For any measure $\mu \in \mathfrak{M}_{[0,1]}(G)$ we have

$$\mu n_G = n_G \mu = \mu(G)\, n_G . \qquad (11)$$

In the special case $\mu \in \mathfrak{M}_1(G)$ we have $\mu n_G = n_G \mu = n_G$ and $n_G^2 = n_G$. Denote the line passing through the points $0$ and $n_G$ by $l$. Then, as can be seen from Fig. 2, $\mathfrak{M}_1(G)$ is a part of the hyperplane orthogonal to the line $l$ and passing through the point $n_G$, and $\mathfrak{M}_{[0,1]}(G)$ is the part of the positive orthant cut off by $\mathfrak{M}_1(G)$.
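Relation (11) can be illustrated for the cyclic group of three elements. In the sketch below a measure is stored as a list of masses indexed by group elements, and the group operation is assumed to be addition modulo 3; the names `convolve`, `mass` and the sample measure are illustrative.

```python
from fractions import Fraction as F

S = 3  # order of the cyclic group, written additively as {0, 1, 2}

def convolve(mu, nu):
    """Convolution (8) on a finite cyclic group: mass p_i * q_j goes to i + j mod S."""
    out = [F(0)] * S
    for i, p in enumerate(mu):
        for j, q in enumerate(nu):
            out[(i + j) % S] += p * q
    return out

def mass(mu):
    """Total mass mu(G)."""
    return sum(mu)

n_G = [F(1, S)] * S                  # uniform (invariant) measure
mu = [F(1, 2), F(1, 5), F(1, 10)]    # sample measure with mu(G) = 4/5

# Relation (11): convolving with n_G from either side rescales n_G by mu(G).
print(convolve(mu, n_G) == convolve(n_G, mu) == [mass(mu) * w for w in n_G])
print(convolve(n_G, n_G) == n_G)     # n_G is idempotent
```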

Fig. 2

In fact, Fig. 2 corresponds to the case $s = 3$, when $G$ is the cyclic group of three elements. This case is of special interest because the algebra $G(R)$ is isomorphic to the direct sum of the field of real numbers and the field of complex numbers [19]. Consider the cube $Q$ as shown in Fig. 2. The cube $Q$ consists of all measures $\mu = \sum_{i=1}^{s} p_i e_i$ for which $0 \le p_i \le \tfrac{1}{s}$.

Proposition 6 The set $Q$, considered as a subset of the convex semigroup $\mathfrak{M}_{[0,1]}(G)$, is a probability set with a single zero $0$ and a single unit $n_G$.

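Proof Let us establish that the set $Q$ is a semigroup with respect to the multiplication. Indeed, if $\mu = \sum_i p_i e_i$, $\nu = \sum_j q_j e_j$ belong to $Q$, then $0 \le p_i \le \tfrac{1}{s}$, $0 \le q_j \le \tfrac{1}{s}$, and therefore we have

$$\mu\nu = \sum_{i,j} p_i q_j\, e_i e_j = \sum_k \Big(\sum_i p_i q_{i_k}\Big) e_k ,$$

where the indices $i_k = 1, 2, \dots, s$ are defined uniquely for each $i$ and $k$ by the condition $e_i \cdot e_{i_k} = e_k$, $i, k = 1, 2, \dots, s$. Since $G$ is a group, for any fixed $k$, $k = 1, 2, \dots, s$, the indices $i_k$ run over $1, 2, \dots, s$ when $i$ runs over $1, 2, \dots, s$. Therefore we have

$$\sum_i p_i q_{i_k} \le \sum_i \frac{1}{s^2} = \frac{1}{s} .$$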


Now let us show that a complementary element $\bar{\mu}$ exists for each $\mu = p_1 e_1 + \dots + p_s e_s \in Q$. By definition 1 we must have $\mu + \bar{\mu} \in E$. In our case we set $E = \{n_G\}$. Then $\mu + \bar{\mu} = n_G$ and therefore $\bar{\mu} = n_G - \mu = (\tfrac{1}{s} - p_1) e_1 + \dots + (\tfrac{1}{s} - p_s) e_s \in Q$, since $0 \le p_i \le \tfrac{1}{s}$, $i = 1, 2, \dots, s$. Finally, let us check property (iv). Indeed, if $\mu \in Q$, $\mu \neq n_G$, then $\mu(G) = \lambda < 1$. Thus by virtue of (11) we have $\mu n_G = n_G \mu = \mu(G)\, n_G \neq n_G$.

The remaining properties of definition 1 for the set $Q$ follow straightforwardly from the properties of the probability set $\mathfrak{M}_{[0,1]}(G)$.
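The closure of the cube $Q$ under convolution and under complementation can be spot-checked for the cyclic group of order three. This is a sketch under the assumption that the group is written additively modulo 3; the sampling scheme and all names (`in_cube`, `random_cube_point`) are illustrative.

```python
import random
from fractions import Fraction as F

S = 3               # order of the cyclic group
BOUND = F(1, S)     # every coordinate of a point of Q lies in [0, 1/S]

def convolve(mu, nu):
    """Convolution of measures on the cyclic group of order S."""
    out = [F(0)] * S
    for i, p in enumerate(mu):
        for j, q in enumerate(nu):
            out[(i + j) % S] += p * q
    return out

def in_cube(mu):
    """Membership test for the cube Q."""
    return all(0 <= p <= BOUND for p in mu)

def random_cube_point(rng):
    """Random point of Q with rational coordinates in [0, 1/S]."""
    return [F(rng.randint(0, 1000), 1000 * S) for _ in range(S)]

rng = random.Random(0)
samples = [(random_cube_point(rng), random_cube_point(rng)) for _ in range(200)]

# Each coordinate of a product is a sum of S terms, each at most 1/S^2.
closed = all(in_cube(convolve(mu, nu)) for mu, nu in samples)
# The complement n_G - mu has coordinates 1/S - p_i, again in [0, 1/S].
complements_ok = all(in_cube([BOUND - p for p in mu]) for mu, _ in samples)
print(closed, complements_ok)
```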

Note that the Kolmogorov condition (vii) holds in $Q$.

(ii) It proves possible to construct an even more general kind of probability sets with a single unit as subsets of the set $\mathfrak{M}_{[0,1]}(G)$. For this purpose we consider an arbitrary convex semigroup $S(G)$ in $\mathfrak{M}_1(G)$ and the convex set $S_0(G)$ formed by zero (0) and the elements of the set $S(G)$. One can readily see that $S_0(G)$ also satisfies the properties of a probability set in which $S(G)$ is the set of units.

Now we consider the set $Q(S, G)$, which is the intersection of the set $S_0(G)$ and all half-spaces containing zero and bounded by hyperplanes parallel to the faces of $S_0(G)$ and passing through the point $n_G$.

Proposition 7 Let $S$ be an arbitrary convex semigroup in $\mathfrak{M}_1(G)$, centrally symmetric with respect to the point $n_G$. Then $Q(S, G)$ is a probability set with a single zero and a single unit.

Proof We shall show that $Q(S, G)$ is a semigroup with respect to convolution and hence $Q(S, G)$, as a subset of $\mathfrak{M}_{[0,1]}(G)$, is a probability set with a single unit $n_G$. First note that, in view of the central symmetry of $S$ with respect to $n_G$, the intersection of any face of $S_0(G)$ with any hyperplane passing through the element $n_G$ and parallel to another face lies in the intersection of the faces of $S_0(G)$ and the hyperplane $h$ passing through $n_G$ and perpendicular to the line $l$.

Fig. 3 shows a plane $\pi$ passing through the point $\mu_0 \in S_0(G)$ and the line $l$. The rhombus $0 A n_G B$ is the intersection of $Q(S, G)$ with this plane. Each element $\mu$ of this rhombus can be represented as $\mu = n_G - \lambda_1 \mu_1$, where $\mu_1 \in S(G)$, $0 \le \lambda_1 \le 1$. Similarly, for each other element $\nu$ of $Q(S, G)$ we also have $\nu = n_G - \lambda_2 \nu_1$, where $\nu_1 \in S(G)$, $0 \le \lambda_2 \le 1$.

Fig. 3

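Therefore the product $\mu\nu$ equals

$$(n_G - \lambda_1 \mu_1)(n_G - \lambda_2 \nu_1) = n_G - \lambda_2 n_G \nu_1 - \lambda_1 \mu_1 n_G + \lambda_1 \lambda_2 \mu_1 \nu_1 = (1 - \lambda_1 - \lambda_2)\, n_G + \lambda_1 \lambda_2\, \mu_1 \nu_1 . \qquad (12)$$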

Let us show that the element (12) belongs to $Q(S, G)$. Consider the first case, when either $\lambda_1$ or $\lambda_2$ is greater than $\tfrac{1}{2}$. Let, for example, $\lambda_1 > \tfrac{1}{2}$.

Then the point $\mu$ lies in the left-hand side of the rhombus and thus can be represented as $t\mu'$, $\mu' \in S(G)$, $t \le \tfrac{1}{2}$. On the other hand, we have $\nu = \tau \cdot \nu'$ for $\nu \in Q(S, G)$, where $\nu' \in S(G)$, $0 \le \tau \le 1$. Therefore the product $\mu\nu$ is equal to $t\tau \cdot \mu'\nu'$, where $\mu' \cdot \nu' \in S(G)$ and $0 \le t\tau \le \tfrac{1}{2}$. Consequently, by the construction of $Q(S, G)$, the measure $\mu\nu$ lies to the left of the hyperplane $h$ (Fig. 3) and consequently $\mu\nu \in Q(S, G)$.

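Now consider the case when $\lambda_1 \le \tfrac{1}{2}$, $\lambda_2 \le \tfrac{1}{2}$. Then $p = 1 - \lambda_1 - \lambda_2 \ge 0$ and $q = \lambda_1 \lambda_2 \ge 0$. Let us show that the inequality $p + 2q \le 1$ holds, which is equivalent to the inequality $\lambda_1 + \lambda_2 \ge 2\lambda_1\lambda_2$. Indeed, $(\lambda_1 - \lambda_2)^2 = \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Since $0 \le \lambda_1 \le 1$, $0 \le \lambda_2 \le 1$, we have $\lambda_1 + \lambda_2 - 2\lambda_1\lambda_2 \ge \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2 \ge 0$. Whence $p + 2q \le 1$.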
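For convenience, the chain of inequalities used in this case can be written out in one place. The only step not stated explicitly in the text is $\lambda_i \ge \lambda_i^2$ for $0 \le \lambda_i \le 1$; the symbols $\lambda_1, \lambda_2, p, q$ are those of the paragraph above.

```latex
\begin{align*}
p + 2q \le 1
  &\iff (1 - \lambda_1 - \lambda_2) + 2\lambda_1\lambda_2 \le 1
   \iff \lambda_1 + \lambda_2 \ge 2\lambda_1\lambda_2, \\
\lambda_1 + \lambda_2 - 2\lambda_1\lambda_2
  &\ge \lambda_1^2 + \lambda_2^2 - 2\lambda_1\lambda_2
   \qquad (\text{since } \lambda_i \ge \lambda_i^2 \text{ for } 0 \le \lambda_i \le 1) \\
  &= (\lambda_1 - \lambda_2)^2 \ge 0 .
\end{align*}
```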

Thus from (12) we have $\mu\nu = p\, n_G + q\, \mu_1 \nu_1$, $\mu_1 \nu_1 \in S(G)$, $p, q \ge 0$, $p + 2q \le 1$. Let us show that the measure $m = p\, n_G + q\, w$ belongs to $Q(S, G)$ for any measure $w \in S(G)$.

Fig. 4 shows the plane passing through the points $0$, $w$ and $n_G$. The point $m = p\, n_G + q\, w$ lies on the line parallel to $0w$ and passing through $p\, n_G$.

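Now, to prove that $m$ belongs to $Q(S, G)$ it suffices to demonstrate that $|q w| \le |\Delta|$. By similarity of the triangles $0\, w\, n_G$ and $p n_G\, B\, n_G$ we have

$$\frac{|2\Delta|}{|w|} = \frac{(1 - p)\,|n_G|}{|n_G|} = 1 - p .$$

That is, $|\Delta| = \tfrac{1}{2}(1 - p)|w|$. Then

$$\frac{|\Delta|}{|q w|} = \frac{\tfrac{1}{2}(1 - p)}{q} = \frac{1 - p}{2q} \ge 1$$

follows from the inequality $p + 2q \le 1$.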

Hypothesis For an arbitrary $S(G) \subset \mathfrak{M}_1(G)$ the set $Q(S, G)$, as a subset of the convex semigroup $\mathfrak{M}_{[0,1]}(G)$, is a probability set with a single zero $0$ and a single unit $n_G$.


We would like to note in connection with the examples of section 1 that a general description of probability sets in topological semi-fields and in the field of p-adic numbers is of a great interest for applications

We hope that problems of an experimental determination of abstract probabilities will be considered in the continuation of this work

5 Acknowledgments

In conclusion I want to express my gratitude to A. Yu. Khrennikov (Växjö Univ., Sweden), Yu. V. Prokhorov, O. V. Viskov, I. V. Volovich (all of Steklov Mathematical Institute, Russia), V. Ja. Kozlov (Academy of Cryptography, Russia), V. I. Serdobolskii (Moscow Univ. of Electronics and Math., Russia) and A. K. Kwasniewski (Bialystok Univ., Institute of Computer Science, Poland) for discussions and their advice on the foundations of probability theory and quantum mechanics. This investigation was supported by the grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and the Profile Mathematical Modeling of Växjö University.

References

1. A. N. Kolmogorov, Foundation of the probability theory (Chelsea Publ. Comp., New York, 1956).

2. T. L. Fine, Theories of probabilities: an examination of foundations (Academic Press, New York, 1973).

3. H. Heyer, Probability measures on locally compact groups (Springer-Verlag, Berlin-Heidelberg-New York, 1977).

4. Y. P. Studnev, TV and its applications 12, 727 (1967).

5. R. P. Feynman, Negative probability, in Quantum Implications: Essays in Honour of David Bohm, eds. B. J. Hiley and F. D. Peat (Routledge and Kegan Paul, London, 1987).

6. P. Dirac, Rev. Mod. Phys. 17, 195 (1945).

7. O. G. Smolyanov and A. Y. Khrennikov, Dokl. Akademii Nauk USSR 281, 279 (1985).

8. V. S. Vladimirov, I. V. Volovich and E. I. Zelenov, p-adic analysis and mathematical physics (World Scientific Publ., Singapore, 1993).

9. A. Y. Khrennikov, Theor. and Math. Phys. 97, 348 (1993).

10. A. Y. Khrennikov, Doklady Mathematics 55, 402 (1997).

11. A. Y. Khrennikov, Mathematical and physical arguments for the change of Kolmogorov's axiomatics, Trends in Contemporary Inf. Dim. Analysis and Quantum Probability, N 1, 215-249 (2000).

12. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes, in The wave-particle dualism (D. Reidel Publ. Company, Dordrecht, 1958).

13. C. C. Chang, Transactions of the Amer. Math. Soc. 86, 467 (1958).

14. R. S. Grigolia, Algebraic analysis of Lukasiewicz-Tarski's n-valued logical systems, in Selected papers on Lukasiewicz sentential calculi (PAN, Ossolineum, Poland, 1977).

15. T. A. Sarymsakov, Topological semi-fields and its applications (FAN, Tashkent, 1989).

16. T. A. Sarymsakov, Topological semi-fields and probability theory (FAN, Tashkent, 1969).

17. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).

18. A. Y. Khrennikov, Physics Letters A 200, 219 (1995).

19. B. L. van der Waerden, Algebra I, Achte Auflage der Modernen Algebra (Springer-Verlag, Berlin-Heidelberg-New York, 1977).


QUANTUM K-SYSTEMS AND THEIR ABELIAN MODELS

H. NARNHOFER

Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, A-1090 Wien

E-mail: narnh@ap.univie.ac.at

In this review the concept of quantum K-systems is studied, on the one hand based on a set of increasing algebras, on the other hand with respect to entropy properties. We consider in examples how far it is possible to find abelian models.

1 Introduction

Classical ergodic theory is a powerful discipline, both in mathematics and in physics, for analyzing the mixing properties of a given dynamics. Since in physics the mixing properties take place on the microscopic level, which is controlled by quantum theory, it is natural to try to translate the concepts of classical ergodic theory into the quantum framework as well, and to study how far these concepts can find their quantum counterpart and whether new features appear.

One possibility is the following: we start with a classical dynamical system, e.g. a free particle on a hyperbolic manifold with finite measure, and quantize the dynamics, i.e. study the properties of the Laplace-Beltrami operator on this manifold. Since the manifold has finite measure, the Laplace-Beltrami operator necessarily has discrete spectrum [1], and the classical mixing properties can only leave their footprints in the distribution of the eigenvalues at high energy [2,3]. Many deep results have been found on the basis of this approach. But in this review we will follow another path.

We start with the classical dynamical system with optimal mixing properties, the Kolmogorov system [4,5,6]. It can be characterized either by its algebraic structure or by properties of its dynamical entropy. Both concepts find their counterpart in quantum systems [7], but they are not equivalent any more.

First we will give the definition of an algebraic K-system and some definitions of dynamical entropies. One of them relates the quantum system to classical K-systems that can be considered as models of the quantum system. Then we will give examples of algebraic quantum K-systems and will discuss how far they can be represented by classical models. Finally we will give examples of quantum K-systems for which no classical model exists and, on the other hand, a quantum dynamical model that allows the construction of a classical model but for which the algebraic K-property so far cannot be controlled.


2 Classical K-System

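Let us repeat the characteristics of a classical dynamical system $(\mathcal{A}, \sigma, \mu)$, where we take $\mathcal{A}$ to be the abelian algebra built by the characteristic functions over a measure space with measure $\mu$, and $\sigma$ an automorphism over $\mathcal{A}$ with $\mu \circ \sigma = \mu$ [4,5,6].

Definition 2.1 We call $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ a K(olmogorov) system if

$$\mathcal{A}_0 \subset \mathcal{A}, \qquad \sigma\mathcal{A}_0 \supset \mathcal{A}_0, \qquad \bigvee_n \sigma^n\mathcal{A}_0 = \mathcal{A}, \qquad \bigcap_n \sigma^{-n}\mathcal{A}_0 = \lambda\mathbb{1}. \tag{2.1}$$

For a given classical dynamical system $(\mathcal{A}, \sigma, \mu)$ we can decide in several ways whether some $\mathcal{A}_0$ (which is not unique) exists so that $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ forms a K-system [5,6]:

A) Choose some finite subalgebra $\mathcal{B} \subset \mathcal{A}$ (i.e. some finite partition of the measure space) and construct its past algebra $\mathcal{A}_0 = \bigvee_{n \in \mathbb{N}} \sigma^{-n}\mathcal{B}$. If $\mathcal{A}_0$ is a proper subalgebra of $\mathcal{A}$ it will increase in time. Check if $\bigvee_n \sigma^n\mathcal{A}_0 = \mathcal{A}$; if not, $\mathcal{B}$ has to be increased. If $\mathcal{B}$ is large enough, check if $\bigcap_n \sigma^{-n}\mathcal{A}_0 = \lambda\mathbb{1}$.

B) Consider the conditional entropy $H(\mathcal{B}|\mathcal{A}_0)$. If this expression is strictly positive $\forall\,\mathcal{B}$, then $(\mathcal{A}, \sigma, \mu)$ is a K-system.

C) If

$$\lim_{n\to\infty} H(\sigma^n\mathcal{B}|\mathcal{A}_0) = H(\mathcal{B}) \quad \forall\,\mathcal{B} \tag{2.2}$$

then $(\mathcal{A}, \sigma, \mu)$ is a K-system.

The classical K-system can also be characterized by its clustering properties: Let $(\mathcal{A}, \mathcal{A}_0, \sigma, \mu)$ be a K-system. Then to every $B \in \mathcal{A}$, $\varepsilon > 0$, $\exists\, n_0$ such that

$$|\mu(B\,\sigma^{-n}A) - \mu(B)\mu(A)| < \varepsilon\,\mu(A) \quad \forall A \in \mathcal{A}_0,\ n > n_0. \tag{2.3}$$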

The prototype of a K-system is the Bernoulli shift (including the Baker transformation). We regard the Bernoulli shift as an infinite tensor product $\mathcal{A} = \bigotimes_{i \in \mathbb{Z}} \mathcal{B}_i$, where $\mathcal{B}_i$ is isomorphic to a finite abelian algebra, $\mathcal{B}_i \approx \mathcal{B}_0 = \{P_1, \dots, P_k\}$, with projections $P_j$ with expectation values $\mu_j$. The dynamics is given as the shift $\sigma$ over the tensor product. The state $\mu$ has to be translation invariant. It can be the tensor product of the local state, but we also allow spatial correlations. The dynamical entropy is given by

$$h(\sigma) = \sup_{\mathcal{B}} \lim_{n\to\infty} H\Big(\mathcal{B} \,\Big|\, \bigvee_{r=-n}^{-1} \sigma^{r}\mathcal{B}\Big) \tag{2.4}$$

$$= \sup_{\mathcal{B}} H\Big(\mathcal{B} \,\Big|\, \bigvee_{r<0} \sigma^{r}\mathcal{B}\Big) \tag{2.5}$$

and coincides with $H(\mathcal{B})$ if the state $\mu$ factorizes.
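For a factorizing state the last statement can be illustrated with a few lines of code. The sketch below is only an illustration under stated assumptions: the one-site distribution `probs` is hypothetical, and the function names are not from the original text. It computes the Shannon entropy of a block of $n$ independent sites and divides by $n$; for a product state this entropy rate equals the one-site entropy for every $n$.

```python
import itertools
import math


def shannon_entropy(probs):
    """H = -sum p ln p over the strictly positive probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def block_entropy(probs, n):
    """Entropy of the joint distribution of n independent, identical sites."""
    joint = [math.prod(word) for word in itertools.product(probs, repeat=n)]
    return shannon_entropy(joint)


def entropy_rate(probs, n):
    """Entropy per site of a block of length n."""
    return block_entropy(probs, n) / n
```

With spatial correlations the block entropy is subadditive instead of additive, so the rate can only be smaller than the one-site entropy.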

3 Algebraic Quantum K-Systems

It is obvious that one can adopt Definition 2.1 directly to define an algebraic quantum K-system. It is also obvious that the definition is not empty, because we can construct the quantum analogue of a Bernoulli shift by taking for $\mathcal{B}$ a nonabelian algebra, e.g. a full matrix algebra $M_{k\times k}$. In the following we will first discuss physical applications of this quantum Bernoulli shift and then turn to generalizations.

A A model for Quantum Measurement

We start with a finite-dimensional algebra $\mathcal{B}$ and a state $\omega$ over $\mathcal{B}$. In order to determine $\omega$ we have to make many copies of $\omega$ and repeat a variety of measurements. The classical Bernoulli shift consists of projections, and every measurement gives as outcome 0 or 1 on these projections, with probability corresponding to the state $\mu$. By repeated measurements we can determine $\mu$ with exponentially increasing security.

In the quantal situation a measurement corresponds to picking some abelian subalgebra $\mathcal{B}_0$ of $\mathcal{B}$, maximal abelian if the measurement is sharp, and again the outcome of the measurement will be 0 or 1 on the projections in $\mathcal{B}_0$. To determine the state $\omega$ we have to vary the measurements, respectively the algebras $\mathcal{B}_0$. Since the state space over $\mathcal{B}$ is compact, it suffices to vary over finitely many $\mathcal{B}_0$. Let $\omega(P_j) = p_j$ for $P_j \in \mathcal{B}_0$. To get security $\varepsilon$ on the density distribution with respect to $\mathcal{B}_0$, the number of experiments has to be of the order $p_j(1-p_j)/\varepsilon^2$. For the algebra $\mathcal{B}_0$ that commutes with the density matrix $\rho$ corresponding to $\omega$, the entropy $S(\rho_{\mathcal{B}_0})$ is minimal, and approximate security on the density distribution is reached for the smallest number of measurements. For other abelian subalgebras $\bar{\mathcal{B}}_0$ we are satisfied with less security: we just have to be sure that $\rho_{\bar{\mathcal{B}}_0}$ is more mixed than $\rho_{\mathcal{B}_0}$. Let $p_j = \omega(P_j)$ for $P_j \in \mathcal{B}_0$ and $\bar{p}_j = \omega(\bar{P}_j)$ for $\bar{P}_j \in \bar{\mathcal{B}}_0$. The probability that the outcome of $\bar{N}$ measurements gives a probability $q_j > p_j + \varepsilon$ is

$$\exp\left(-\frac{\bar{N}(p_j - \bar{p}_j - \varepsilon)^2}{\bar{p}_j(1-\bar{p}_j)}\right). \tag{3.1a}$$

This has to be compared with the security given by $N$ measurements on $\mathcal{B}_0$:

$$\exp\left(-\frac{N\varepsilon^2}{p_j(1-p_j)}\right). \tag{3.1b}$$

Therefore the number of experiments $\bar{N}$ necessary to control $\rho_{\bar{\mathcal{B}}_0}$ is small compared to the number $N$ that fixes $\rho_{\mathcal{B}_0}$ and at the same time $\rho$. If we interpret the entropy as a measure of the reliability of a sequence of measurements, we see that it is not changed compared to the classical expression, i.e. the same order of experiments is necessary, and therefore

$$S(\rho) = S(\rho_{\mathcal{B}_0}) = -\mathrm{Tr}\,\rho\ln\rho. \tag{3.2}$$

Remark In [8] it was questioned whether the Shannon information resp. von Neumann entropy (3.2) is the appropriate quantity. But in these considerations it was not taken into account that measurements on different abelian subalgebras are correlated. We have incorporated these correlations by taking into account the varying necessary accuracy, and in this way obtained the desired result.
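The statement that the abelian subalgebra commuting with the density matrix gives the minimal entropy, and that any other maximal abelian subalgebra gives a more mixed distribution, can be illustrated numerically. This is a minimal sketch, not the author's construction: the density matrix `rho`, the seed and all function names are assumptions chosen for illustration, and a maximal abelian subalgebra is represented by an orthonormal basis (the columns of a unitary).

```python
import numpy as np


def von_neumann_entropy(rho):
    """S(rho) = -Tr rho ln rho, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # 0 ln 0 is taken as 0
    return float(-np.sum(evals * np.log(evals)))


def measured_entropy(rho, basis):
    """Shannon entropy of p_j = <b_j|rho|b_j>, with b_j the columns of `basis`."""
    probs = np.real(np.einsum("ij,ik,kj->j", basis.conj(), rho, basis))
    probs = probs[probs > 1e-12]
    return float(-np.sum(probs * np.log(probs)))


def random_unitary(dim, seed):
    """A unitary obtained from the QR decomposition of a random complex matrix."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q


# Hypothetical example state, diagonal in the standard basis.
rho = np.diag([0.7, 0.2, 0.1]).astype(complex)
s_rho = von_neumann_entropy(rho)
s_eigenbasis = measured_entropy(rho, np.eye(3, dtype=complex))
s_other = measured_entropy(rho, random_unitary(3, seed=0))
```

Here `s_eigenbasis` agrees with `s_rho`, while `s_other` cannot be smaller, because the diagonal of a density matrix in any basis is majorized by its eigenvalues.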

B Lattice Systems

Again we choose a matrix algebra $\mathcal{B}$ and define $\mathcal{A} = \bigotimes_{n \in \mathbb{Z}} \mathcal{B}_n$ as before. But now the algebra describes particles on a lattice (one-dimensional for $n \in \mathbb{Z}$), the shift corresponds to space translation, and the translation invariant state describes the system in e.g. the ground state or an equilibrium state with respect to some Hamiltonian, e.g. the Heisenberg ferromagnet. Therefore in general the state will not factorize but will be obtained as [9]

$$\omega(A) = \lim_{\Lambda \to \mathbb{Z}} \frac{\mathrm{Tr}\, e^{-\beta H_\Lambda} A}{\mathrm{Tr}\, e^{-\beta H_\Lambda}}. \tag{3.3}$$

We assume that the sequence of local Hamiltonians $H_\Lambda$ determines a time automorphism on the algebra that commutes with space translation. We can assume that $\omega(A)$ is space translation invariant. In order to have an algebraic K-system on the von Neumann level (in the weak topology) it is necessary that the state is extremal space translation invariant. This can be achieved, if necessary, by a unique decomposition as in the classical situation [9].


C Fermi Systems

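We consider the CAR algebra $\mathcal{A} = \{a(f), a^\dagger(g)\}$ either over $\ell^2(\mathbb{Z})$ or $L^2(\mathbb{R})$. The shift defines an automorphism over $\mathcal{A}$, and the K-property is satisfied with $\mathcal{A}_0 = \{a(f), a^\dagger(g);\ \mathrm{supp}\, f, g \subset \mathbb{Z}^-\ \text{or}\ \mathbb{R}^-\}$. This is not a Bernoulli K-system, because creation and annihilation operators anticommute.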

D Quantum Stationary Markov Processes

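Another example [10] of a K-system is provided by stationary Markov chains. Here many variations of the definition of such a Markov chain exist. We give an explicit example that again cannot be imbedded into a Bernoulli system.

Let $\mathcal{A}_0$ be a $2\times2$ matrix algebra and $\mathcal{C} = \bigotimes_{n \in \mathbb{Z}} \mathcal{C}_n$ a Bernoulli system, $\mathcal{C}_n$ again a $2\times2$ matrix algebra. Define the map $T_1: \mathcal{A}_0 \otimes \mathbb{1} \to \mathcal{A}_0 \otimes \mathcal{C}$ by

$$\begin{aligned} T_1(\sigma_x \otimes \mathbb{1}) &= \sigma_x \otimes \sigma_x, \\ T_1(\sigma_y \otimes \mathbb{1}) &= \sigma_x \otimes \sigma_y, \\ T_1(\sigma_z \otimes \mathbb{1}) &= \mathbb{1} \otimes \sigma_z. \end{aligned} \tag{3.4}$$

On $\mathcal{C}$ we consider the shift $\tau$ and a $\tau$-invariant state $\omega$. Therefore we can define

$$T = (T_1 \otimes \mathrm{id}_{\mathcal{C}}) \circ (\mathrm{id}_{\mathcal{A}} \otimes \tau). \tag{3.5}$$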

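Then $\mathcal{A}_{[m,n]} = \bigvee_{m \le k \le n} T^k(\mathcal{A}_0)$ and $(\mathcal{A}_{[-\infty,\infty]}, \mathcal{A}_{[-\infty,0]}, T, \varphi \otimes \omega)$ define a K-system for arbitrary states $\varphi$ over $\mathcal{A}_0$.

It can easily be seen that, though $\mathcal{A}_{[-\infty,\infty]}$ can be imbedded in $\mathcal{A} \otimes \mathcal{C}$, the automorphism $T$ is not asymptotically abelian:

$$[T^n(\sigma_x \otimes \mathbb{1}), \sigma_z \otimes \mathbb{1}] = i\,\sigma_y \otimes \sigma_x \cdots \sigma_x. \tag{3.6}$$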
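The algebraic content of (3.4) and (3.6) can be checked with small matrices. The following sketch rests on several assumptions that are not confirmed by the source: the images of the generators are read as $\sigma_x\otimes\sigma_x$, $\sigma_x\otimes\sigma_y$, $\mathbb{1}\otimes\sigma_z$, the chain $\mathcal{C}$ is truncated to $n$ sites, and $T^n(\sigma_x\otimes\mathbb{1})$ is taken to be $\sigma_x\otimes\sigma_x\otimes\cdots\otimes\sigma_x$ with $n$ factors in $\mathcal{C}$. All helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def kron_all(*ops):
    """Tensor product of the given operators, in order."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


def commutator(a, b):
    return a @ b - b @ a


def is_pauli_triple(x, y, z):
    """True if x, y, z square to 1 and satisfy xy = iz, yz = ix, zx = iy."""
    one = np.eye(x.shape[0])
    squares = all(np.allclose(m @ m, one) for m in (x, y, z))
    products = (np.allclose(x @ y, 1j * z) and np.allclose(y @ z, 1j * x)
                and np.allclose(z @ x, 1j * y))
    return squares and products


# Assumed images of the generators under T_1 on A_0 (x) C_0.
T1_X = kron_all(SX, SX)
T1_Y = kron_all(SX, SY)
T1_Z = kron_all(I2, SZ)


def iterated_image(n):
    """Assumed form of T^n(sigma_x (x) 1): sigma_x on A_0 and on n sites of C."""
    return kron_all(SX, *([SX] * n))


def local_z(n):
    """sigma_z (x) 1 on A_0 and n sites of C."""
    return kron_all(SZ, *([I2] * n))
```

Under these assumptions the images form a Pauli triple, so $T_1$ extends to a homomorphism, and the commutator of `iterated_image(n)` with `local_z(n)` is a nonzero multiple of $\sigma_y\otimes\sigma_x\otimes\cdots\otimes\sigma_x$ whose norm does not decay with $n$, which is the failure of asymptotic abelianness.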

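E Price-Powers Shift

Another illustrative example of a quantum K-system is the Price-Powers shift [11].

Let $e_i$ be a unitary satisfying $e_i^2 = 1$. Let

$$e_i e_k = (-1)^{g(|i-k|)}\, e_k e_i \quad \text{with } g(|i-k|) \in \{0, 1\}. \tag{3.7}$$

Let $\sigma e_k = e_{k+1}$. Then

$$(\mathcal{A}_g = \{e_i,\ i \in \mathbb{Z}\},\ \mathcal{A}_{g,0} = \{e_i,\ i < 0\},\ \sigma,\ \tau)$$

form an algebraic K-system, where $\tau$ is the tracial state

$$\tau(e_I) = \delta_{I,\emptyset} \quad \text{with } e_I = \prod_{i_1, \dots, i_k \in I} e_{i_1} \cdots e_{i_k}. \tag{3.8}$$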

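Special examples are:

a) $g(1) = 1$, $g(k) = 0$ otherwise. Then the algebra coincides with $\bigotimes_k M^{(k)}_{2\times2}$, where

$$e_{2k} = \sigma_z \otimes \sigma_z \in M_k \otimes M_{k+1}, \qquad e_{2k+1} = \mathbb{1} \otimes \sigma_x \in M_k \otimes M_{k+1}.$$

b) $g(i) = 1\ \forall\, i$. Then the algebra coincides with CAR on $\mathbb{Z}$:

$$e_i = a_i + a_i^\dagger.$$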
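Both special cases can be checked on a finite chain. The sketch below is an illustration under stated assumptions, not the author's construction: for b) the fermionic operators are realized by a Jordan-Wigner representation, and for a) the site labelling $e_{2k} = \sigma_z^{(k)}\sigma_z^{(k+1)}$, $e_{2k+1} = \sigma_x^{(k+1)}$ is an assumed reading of the formulas. All function names are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
LOWER = np.array([[0, 1], [0, 0]], dtype=complex)  # one-site annihilator


def kron_all(ops):
    """Tensor product of a non-empty list of operators."""
    return reduce(np.kron, ops)


def site_op(op, site, n):
    """`op` acting on `site` of an n-site chain, identity elsewhere."""
    return kron_all([op if j == site else I2 for j in range(n)])


def car_generators(n):
    """Case b): e_i = a_i + a_i^dagger with Jordan-Wigner annihilators a_i."""
    gens = []
    for i in range(n):
        a_i = kron_all([SZ] * i + [LOWER] + [I2] * (n - i - 1))
        gens.append(a_i + a_i.conj().T)
    return gens


def chain_generators(n_sites):
    """Case a), assumed labelling: e_{2k} = sz_k sz_{k+1}, e_{2k+1} = sx_{k+1}."""
    gens = []
    for k in range(n_sites - 1):
        gens.append(site_op(SZ, k, n_sites) @ site_op(SZ, k + 1, n_sites))
        gens.append(site_op(SX, k + 1, n_sites))
    return gens


def exponent(a, b):
    """Return g with a b = (-1)^g b a, or None if neither relation holds."""
    if np.allclose(a @ b, b @ a):
        return 0
    if np.allclose(a @ b, -(b @ a)):
        return 1
    return None
```

Calling `exponent` on all pairs of generators recovers the bitstream $g$: identically 1 for distinct generators in case b), and 1 only for neighbouring indices in case a).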

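Other explicit examples can be found in [12]. In all these examples (A - E) we inherit from the classical theory the following

Theorem Let $(\mathcal{A}, \mathcal{A}_0, \sigma, \omega)$ be a K-system and $\omega$ an extremal translationally invariant state. (That is equivalent to $\bigcap_n \sigma^{-n}\mathcal{A}_0 = \lambda\mathbb{1}$ in the strong topology.) Then to every $A \in \mathcal{A}$, $\varepsilon > 0$, $\exists\, n_0$ such that

$$|\omega(A\,\sigma^{-n}B) - \omega(A)\omega(B)| < \varepsilon \|B\|, \quad n > n_0,\ B \in \mathcal{A}_0. \tag{3.9}$$

Therefore we have the same clustering properties as in (2.3).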

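Proof If $\omega$ is the tracial state, $\tau(AB) = \tau(BA)$, then in the GNS representation

$$\omega(B) = \langle\Omega|\pi(B)|\Omega\rangle$$

$\pi(\mathcal{A}_0)$ defines a projection operator $P_0$, $P_0\mathcal{H} = \overline{\pi(\mathcal{A}_0)\Omega}$, that is increasing respectively decreasing in $\sigma^n$:

$$\omega(A\,\sigma^{-n}B) = \omega(A\,\sigma^{-n}P_0\,\sigma^{-n}B)$$

and

$$\text{st-}\lim_{n\to\infty} \sigma^n P_0 = 1, \qquad \text{st-}\lim_{n\to\infty} \sigma^{-n} P_0 = |\Omega\rangle\langle\Omega|. \tag{3.10}$$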


If $\omega$ is not the tracial state but a KMS state, it cannot be excluded that $\Omega$ is cyclic not only for $\pi(\mathcal{A})$ but also for $\pi(\mathcal{A}_0)$. But in this case the modular operator $\Delta_0$ corresponding to $\pi(\mathcal{A}_0)$ can replace $P_0$ for controlling the cluster properties, and it satisfies [13]

$$\text{st-}\lim_{n\to\infty} \sigma^{-n}\, \frac{1}{\Delta_0^{1/2} + 1} = \frac{1}{2}\,|\Omega\rangle\langle\Omega|. \tag{3.11}$$

4 Dynamical Entropy

The dynamical entropy of classical ergodic theory can be interpreted in two different ways:

If we use the definition

$$h(\sigma) = \sup_{\mathcal{B}} H(\sigma, \mathcal{B}) = \sup_{\mathcal{B}} H\Big(\mathcal{B} \,\Big|\, \bigvee_{n} \sigma^{-n}\mathcal{B}\Big) \tag{4.1}$$

then it measures how the algebraic K-system increases and how, in the course of time, our information on the complete system increases.

If we concentrate on the fact that

$$\lim_{k\to\infty} H\Big(\sigma^k\mathcal{B} \,\Big|\, \bigvee_{n} \sigma^{-n}\mathcal{B}\Big) = H(\mathcal{B}) \tag{4.2}$$

it describes that the remote past becomes more and more irrelevant for the present. Both properties can inspire us to look for an appropriate definition of a dynamical entropy for a quantum dynamical system.

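a) For an algebraic K-system we can just copy the definition of a classical K-system:

Definition Given two subalgebras $\mathcal{A}, \mathcal{B} \subset \mathcal{M}$ and $\omega$ a state over $\mathcal{M}$. Then, with $S(\omega|\varphi)$ the relative entropy, we define the conditional entropy $H_\omega(\mathcal{A}|\mathcal{B})$ as

$$H_\omega(\mathcal{A}|\mathcal{B}) = \sup_{\omega = \sum_i \omega_i} \sum_i \big( S(\omega|\omega_i)_{\mathcal{A}} - S(\omega|\omega_i)_{\mathcal{B}} \big). \tag{4.3}$$

Evidently $H(\mathcal{A}|\mathcal{B}) \ge 0$. By monotonicity of the relative entropy, $H(\mathcal{A}|\mathcal{B}) = 0$ if $\mathcal{A} \subset \mathcal{B}$.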

Let $(\mathcal{A}, \mathcal{A}_0, \sigma, \omega)$ be an algebraic K-system. Then $H_\omega(\sigma\mathcal{A}_0|\mathcal{A}_0)$ measures how fast $\mathcal{A}_0$ is increasing. The above expression has not been much investigated. The main reason lies in the fact that for a given quantum dynamical system, in contrast to the classical situation, no strategy is known to decide whether an $\mathcal{A}_0$ with the desired properties exists. If it exists, there is no reason to assume that it is unique. In the classical situation the dynamical entropy does not depend on the special choice of $\mathcal{A}_0$. In a quantum system, due to the lack of a constructive approach to $\mathcal{A}_0$, we also have no chance to compare $H(\sigma\mathcal{A}_0|\mathcal{A}_0)$ with respect to different past algebras $\mathcal{A}_0$.

There exists also another characterization for the amount of increase:

For $\mathcal{A} \supset \mathcal{A}_0$, both type II$_1$ algebras, define $P_0$ as the projector on $\overline{\mathcal{A}_0\Omega}$ in the GNS representation of the tracial state over $\mathcal{A}$, $P_0 \in \pi(\mathcal{A}_0)'$. Then [14]

$$[\mathcal{A} : \mathcal{A}_0] = \tau(P_0)^{-1}, \tag{4.4}$$

$\tau$ the trace over $\pi(\mathcal{A}_0)'$.

This definition has been generalized to type III algebras by [15]. Note that it is not state dependent. As a typical example it can be evaluated for the Price-Powers shift: both (4.3) and (4.4) are independent of the sequence $g$ and give $\ln 2$ resp. 2. But it should be noted that in general there exists only an order relation [16]

$$H(\sigma\mathcal{A}_0|\mathcal{A}_0) \le 2\log\,[\sigma\mathcal{A}_0 : \mathcal{A}_0].$$

b) The main obstacle to using (4.3) or (4.4) as a definition for the dynamical entropy comes from the fact that for noncommutative algebras in general $\bigvee_{n=1}^{\infty} \sigma^{-n}\mathcal{B}$ will increase in a way that can hardly be controlled.

An illustrative example is given by the following observation [17]:

Take $\mathcal{A} = \{a(f), a^\dagger(g);\ f, g \in \mathcal{L}^2(\mathbb{R})\}$ with $\sigma$ the space translation. We know already that it corresponds to a K-system with $\mathcal{A}_0 = \{a(f), a^\dagger(g);\ f, g \in \mathcal{L}^2(\mathbb{R}^-)\}$. But if we pick $a(e^{-x^2})$ and construct the algebra $\tilde{\mathcal{A}}_0 = \{a(e^{-(x-a)^2}),\ a > 0\}$, then $\tilde{\mathcal{A}}_0$ coincides with $\mathcal{A}$: if it did not, we could find some $f$ with $\langle f|e^{-(x-a)^2}\rangle = 0\ \forall a > 0$, and this is impossible due to the analyticity properties of the Gauss function.

Due to this fact Voiculescu [18] proposed the following definition for a dynamical entropy:


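Definition Let $\mathcal{M}$ be a hyperfinite von Neumann algebra with a faithful normal trace $\tau$. Let $Pf(\mathcal{M})$ be the family of finite subsets of $\mathcal{M}$. Let $\omega, \chi \subset \mathcal{M}$. We write

$$\omega \subset_\delta \chi$$

if for every $x \in \omega$ there exists a $y \in \chi$ such that

$$\tau((x - y)^*(x - y)) < \delta. \tag{4.5}$$

Let $\mathcal{F}(\mathcal{M})$ be the family of finite dimensional C* subalgebras of $\mathcal{M}$. Then

$$r_\tau(\omega, \delta) = \inf\{\mathrm{rank}\,\mathcal{A};\ \mathcal{A} \in \mathcal{F}(\mathcal{M}),\ \omega \subset_\delta \mathcal{A}\}, \tag{4.6}$$

$$ha_\tau(\sigma, \omega, \delta) = \limsup_{n\to\infty} \frac{1}{n} \log r_\tau\Big(\bigcup_{j=0}^{n-1} \sigma^j(\omega), \delta\Big),$$

$$ha_\tau(\sigma, \omega) = \sup_{\delta > 0} ha_\tau(\sigma, \omega, \delta),$$

$$ha_\tau(\sigma) = \sup\{ha_\tau(\sigma, \omega);\ \omega \in Pf(\mathcal{M})\}. \tag{4.7}$$

The notation stands for approximation entropy of $\sigma$.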

The above definition allows many variations. For instance the lim sup can be replaced by a lim inf, and we can hope, but it is not proven, that this does not change the definition.

New information can be gained if we change the approximation condition (4.5):

The topological entropy uses the approximation in norm. But to keep generality we cannot assume that the full matrix algebra belongs to $\mathcal{A}$. Concentrating on nuclear C* algebras, we have to approximate via completely positive maps $(\varphi, \psi, \mathcal{B})$ with $\mathcal{B}$ a finite dimensional algebra, $\varphi: \mathcal{M} \to \mathcal{B}$ and $\psi: \mathcal{B} \to \mathcal{M}$, such that

$$\|\psi \circ \varphi(a) - a\| < \delta \quad \forall\, a \in \omega. \tag{4.8}$$

$hat(\sigma)$ is defined as $ha_\tau$, only under the new approximation condition. If $\mathcal{M}$ is an AF-algebra and therefore possesses a tracial state, then the topological entropy dominates the approximation entropy:

$$ha_\tau(\sigma) \le hat(\sigma). \tag{4.9}$$


As another possibility we can approximate $\psi \circ \varphi(a) - a$ in the strong topology in a given representation corresponding to a state $\phi$, and replace the rank of the best algebra $\mathcal{A}$ by the entropy [19]

$$s = S(\phi \circ \psi).$$

All these definitions satisfy the requirement that they coincide with the usual definitions (state dependent dynamical entropy or topological entropy) if we apply them to commutative algebras.

Let us finally remark that, applied to the Price-Powers shift, again independent of $g$ (3.7),

$$ha_\tau(\sigma) = hat(\sigma) = ht_\phi(\sigma) = \frac{1}{2} H(\sigma\mathcal{A}_0|\mathcal{A}_0). \tag{4.10}$$

For further studies we refer to (Størmer, Choda, Dykema) [20,21,22].

c) An approach that differs very much from the mathematically motivated definition of Voiculescu is offered by Alicki and Fannes [23]. It is motivated by the concrete method by which we are able to determine the state of a system by experiment: we perform a measurement and repeat the measurement in the course of time. Here we use the idea of the history of a system as discussed e.g. in [24,25].

A single measurement corresponds to a partition of unity

$$\sum_{j=0}^{k-1} x_j^* x_j = \mathbb{1}. \tag{4.11}$$

In fact we may think that the $x_j$ are commuting selfadjoint projection operators. But by time evolution this commutativity is destroyed anyhow, and also for the necessary estimates it is preferable to consider this generalized partition of unity without further restrictions on the $x_j$. Repetition of the measurement corresponds to a composed partition:

$$X = (x_0, \dots, x_{k-1}), \qquad \sigma X = (\sigma x_0, \dots, \sigma x_{k-1}), \qquad \sigma X \circ X = (\dots, \sigma x_i \cdot x_j, \dots),$$

i.e. a partition of size $k^2$.

$$\big(\omega(x_i^* x_j)\big)_{i,j} = \rho[X]$$

defines a density matrix of dimension $k$ with entropy

$$H[X] = S(\rho[X]). \tag{4.12}$$

As dynamical entropy $h(X)$ we define

$$h(X) = \limsup_{m} \frac{1}{m} H[\sigma^{m-1}X \circ \cdots \circ \sigma X \circ X] = \limsup_{m} \frac{1}{m} S(\rho[\sigma^{m-1}X \circ \cdots \circ \sigma X \circ X]),$$

$$h(\sigma) = \sup_X h(X). \tag{4.13}$$
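The objects entering (4.11)-(4.13) are easy to compute for a small matrix algebra. The following sketch is only an illustration under stated assumptions: the state is given by a density matrix `rho`, the dynamics is implemented as conjugation by a unitary, the matrix elements are taken as $\mathrm{Tr}(\rho\, x_i^* x_j)$, and the qubit example (maximally mixed state, projective partition, Hadamard rotation) is hypothetical. Function names are not from the original text.

```python
import numpy as np


def entropy(matrix):
    """von Neumann entropy of a positive matrix with unit trace."""
    evals = np.linalg.eigvalsh(matrix)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))


def is_partition_of_unity(xs):
    """Check sum_j x_j^* x_j = 1."""
    dim = xs[0].shape[0]
    total = sum(x.conj().T @ x for x in xs)
    return np.allclose(total, np.eye(dim))


def partition_matrix(rho, xs):
    """Matrix with entries Tr(rho x_i^* x_j); it has unit trace and is positive."""
    k = len(xs)
    mat = np.empty((k, k), dtype=complex)
    for i, xi in enumerate(xs):
        for j, xj in enumerate(xs):
            mat[i, j] = np.trace(rho @ xi.conj().T @ xj)
    return mat


def evolve(xs, unitary):
    """Time-evolved partition: each x is replaced by U^* x U."""
    return [unitary.conj().T @ x @ unitary for x in xs]


def compose(later, earlier):
    """Composed partition with elements y x, y in `later`, x in `earlier`."""
    return [y @ x for y in later for x in earlier]


# Hypothetical qubit example.
rho = np.eye(2, dtype=complex) / 2
X = [np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)]
hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
XX = compose(evolve(X, hadamard), X)
```

In this example `entropy(partition_matrix(rho, X))` is $\ln 2$ and the composed partition `XX` gives $2\ln 2$, i.e. the entropy grows by $\ln 2$ per step.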

But here a problem arises: if we do not restrict $X$ to a subalgebra $\mathcal{B}$ of the algebra $\mathcal{A}$, we lose control of the dynamical entropy. For instance, if we take as C*-algebra the Cuntz algebra [9] with $U_i^* U_j = \delta_{ij}$ and $U_i U_i^* = P_i$ and use the $U_i$ for $X$, then the identity map has infinite dynamical entropy. If for instance we consider the shift on the lattice system B), then we can choose as natural subalgebra $\mathcal{B}$ that is dense in $\mathcal{A}$ the algebra of strictly local operators. Some weakening of this restriction is possible, and this is of course necessary if we want to apply the theory to time evolution with interaction, where local operators immediately delocalize. But this delocalization decreases exponentially fast in space [26]; therefore $\mathcal{B}$ consisting of exponentially localized operators should be sufficient to define a dynamical entropy for time evolution in the sense of Alicki and Fannes. As an example we consider the shift on the lattice. Then

$$h_{AF}(\sigma) = s(\omega) + \ln d, \tag{4.14}$$

where $s(\omega)$ is the entropy density corresponding to the state $\omega$ and $d$ is the dimension of the full matrix algebra at each lattice point.

d) As the last proposal for the definition of a dynamical entropy we describe the one which in fact has the longest history. It was first proposed by Connes and Størmer for type II algebras [27] and then generalized in [28] and [29] to general situations. We present the definition given by Sauvageot and Thouvenot [30], which they showed to be equivalent to the ones in [27] and [29] for hyperfinite algebras. In their definition it is most evident that this dynamical entropy measures how far the quantum system is related to a classical K-system. In addition, concepts developed in this framework also find their application in quantum information theory.


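Definition The entropy defect of an abelian model: Let $(\mathcal{A}, \omega)$ be a nonabelian algebra with state $\omega$. Let $(\mathcal{B}, \mu)$ be an abelian algebra with state $\mu$ that is coupled to $\mathcal{A}$ by a state $\lambda$ over $\mathcal{A} \otimes \mathcal{B}$ satisfying $\lambda|_{\mathcal{A}} = \omega$, $\lambda|_{\mathcal{B}} = \mu$. Its entropy defect is defined as

$$H_\lambda(\mathcal{B}|\mathcal{A}) = H_\mu(\mathcal{B}) - S(\omega \otimes \mu|\lambda)_{\mathcal{A} \otimes \mathcal{B}}. \tag{4.15}$$

Theorem The entropy of the state $\omega$ is given as

$$S_{\mathcal{A}}(\omega) = \sup_\lambda\, [H_\mu(\mathcal{B}) - H_\lambda(\mathcal{B}|\mathcal{A})]. \tag{4.16}$$

In fact there exist many abelian models that optimize the above expression: every decomposition of $\omega$ into pure states, $\omega = \sum_{i=1}^{n} \mu_i\omega_i$, can be interpreted as an abelian model with $\mathcal{B} = \{P_1, \dots, P_n\}$ and $\mu(P_i) = \mu_i$, $\lambda(P_i \otimes A) = \mu_i\omega_i(A)$.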
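That every pure-state decomposition is optimal can be made concrete for a matrix algebra. The sketch below is a hedged illustration, not the construction of the cited papers. It assumes that for the abelian model built from a decomposition the bracket in (4.16) reduces to $\sum_i \mu_i S(\omega|\omega_i)$, and that for a pure state $\omega_i = |\psi_i\rangle\langle\psi_i|$ this relative entropy equals $-\langle\psi_i|\ln\rho|\psi_i\rangle$ (a convention assumed here). The example state, the seed and all function names are hypothetical.

```python
import numpy as np


def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))


def matrix_log(rho):
    """Logarithm of a strictly positive Hermitian matrix."""
    w, v = np.linalg.eigh(rho)
    return (v * np.log(w)) @ v.conj().T


def pure_decomposition(rho, unitary):
    """Weights mu_i and unit vectors psi_i (columns) with sum_i mu_i |psi_i><psi_i| = rho.

    Different unitaries give different, in general non-orthogonal, decompositions.
    """
    w, v = np.linalg.eigh(rho)
    phis = (v * np.sqrt(w)) @ unitary.T  # unnormalised vectors as columns
    weights = np.sum(np.abs(phis) ** 2, axis=0)
    vectors = phis / np.sqrt(weights)
    return weights, vectors


def decomposition_value(rho, weights, vectors):
    """sum_i mu_i * ( -<psi_i| ln rho |psi_i> ) for the given decomposition."""
    log_rho = matrix_log(rho)
    expectations = np.real(
        np.einsum("ji,jk,ki->i", vectors.conj(), log_rho, vectors))
    return float(-np.sum(weights * expectations))


def random_unitary(dim, seed):
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q


rho = np.diag([0.6, 0.3, 0.1]).astype(complex)
weights, vectors = pure_decomposition(rho, random_unitary(3, seed=1))
value = decomposition_value(rho, weights, vectors)
```

Under these assumptions `value` agrees with `von_neumann_entropy(rho)` for every choice of the unitary, which is the numerical counterpart of the statement that many abelian models reach the supremum.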

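Due to quantum effects the entropy is not monotonically increasing if we consider an increasing sequence $\mathcal{A}_n \subset \mathcal{A}_m$, $n < m$. But monotonicity can be regained if we change the definition to

Definition Let $\mathcal{A} \subset \mathcal{C}$ and $(\mathcal{B}, \mu)$ be an abelian model for $(\mathcal{C}, \omega)$. Then

$$H_{\omega,\mathcal{C}}(\mathcal{A}) = \sup_{(\mathcal{B}, \mu, \lambda)} [H_\mu(\mathcal{B}) - H_\lambda(\mathcal{B}|\mathcal{A})]. \tag{4.17}$$

This suggests the definition for a dynamical entropy:

Definition Given $(\mathcal{A}, \sigma, \omega)$ a quantum dynamical system. The dynamical entropy is given by

$$h_\omega(\sigma) = \sup\, [H_\mu(P|P_-) - H_\lambda(P|P_- \otimes \mathcal{A})], \tag{4.18}$$

where the supremum is taken over all dynamical abelian models $(\mathcal{B}, \mu, \Theta)$ with $\mu \circ \Theta = \mu$ and coupling $\lambda \circ (\Theta \otimes \sigma) = \lambda$, $\lambda|_{\mathcal{A}} = \omega$, $\lambda|_{\mathcal{B}} = \mu$. Here $P_- = \bigvee_{n=1}^{\infty} \Theta^{-n}P$ is the past algebra of the partition $P$.

Remark There holds equality between $h_\omega(\sigma)$ and

$$\sup\, [H_\mu(P|P_-) - H_\lambda(P|\mathcal{A})]. \tag{4.19}$$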


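This is based on considering

$$H(P|P_-) = \lim_{k\to\infty} \frac{1}{k} H\Big(\bigvee_{j=0}^{k-1} \Theta^j P\Big),$$

$$H(P|P_- \otimes \mathcal{A}) = \lim_{k\to\infty} \frac{1}{k} H\Big(\bigvee_{j=0}^{k-1} \Theta^j P \,\Big|\, \mathcal{A}\Big)$$

and taking $\bigvee_{j=0}^{k-1} \Theta^j P$ as a new abelian model.

It is evident that one can also define the dynamical entropy with respect to a subalgebra $\mathcal{C} \subset \mathcal{A}$,

$$h_\omega(\sigma, \mathcal{C}) = \sup\, [H_\mu(P|P_-) - H_\lambda(P|P_- \otimes \mathcal{C})], \tag{4.20}$$

an expression that we need if we want to discuss 2C) in the framework of quantum systems. Notice that (4.19) cannot be replaced in general by an expression like (4.18).

The main task now is to find abelian models. This can be done in a way very similar to the calculation of the entropy of a state.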

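Theorem Assume a state $\omega$ is decomposed

$$\omega = \sum_{i_1, \dots, i_n} \mu_{i_1 \dots i_n}\,\omega_{i_1 \dots i_n}. \tag{4.21}$$

Define

$$\mu^{(l)}_{i_l}\,\omega^{(l)}_{i_l} = \sum_{i_k,\ k \ne l} \mu_{i_1 \dots i_n}\,\omega_{i_1 \dots i_n}.$$

Consider

$$H(\mathcal{C}, \sigma\mathcal{C}, \dots, \sigma^{k-1}\mathcal{C}) = S(\mu) - \sum_l S(\mu^{(l)}) + \sum_l \sum_{i_l} \mu^{(l)}_{i_l}\, S(\omega|\omega^{(l)}_{i_l})_{\sigma^l\mathcal{C}}. \tag{4.22}$$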

Consider now the decomposition

w = ^ p y 51 E 1 - i W i - - i laquo ^ = Sibdquoiwltilt-- (423) r = -

In the limit $\lim_{m}\lim_{n\to\infty}$ (i.e. we have to start with a sufficiently large decomposition) the $\mu_{i_k}$ converge to an abelian model, and all abelian models can be obtained in this way. The detailed proof of this statement can be found in [30].

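This theorem enables us to find lower limits for the dynamical entropy. Together with the fact that

$$\frac{1}{k} H(\mathcal{C}, \sigma\mathcal{C}, \dots, \sigma^{k-1}\mathcal{C}) \le S_\omega(\tilde{\mathcal{C}}) + O(\delta) \tag{4.24}$$

if $\mathcal{C} \subset_\delta \tilde{\mathcal{C}}$ in the sense of (4.5) or (4.8), we also have the upper bound [29]

$$h(\sigma) \le \sup_{\mathcal{C}} \lim_{k} \frac{1}{k} H(\mathcal{C}, \dots, \sigma^{k-1}\mathcal{C}), \tag{4.25}$$

so that in some cases we can really evaluate the dynamical entropy.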

5 Some General Considerations on Abelian Models

As we already mentioned, the entropy of a state over a quantum system can be calculated via an abelian model. For a matrix algebra this viewpoint may look superficial, but it has found an important application in the theory of entangled states, where subalgebras $\mathcal{A} \otimes \mathcal{B} \subset \mathcal{C}$ are considered and the entanglement describes that a pure state over $\mathcal{C}$ will not be pure as a state over $\mathcal{A}$ resp. $\mathcal{B}$. This entanglement can be used for quantum communication, and the amount of this applicability is expressed as the entanglement of formation [31] (compare (4.17)):

$$E_\omega(\mathcal{A}) = S(\omega)_{\mathcal{A}} - H_\omega(\mathcal{A}) = \min \sum_i \mu_i\, S(\omega_i)_{\mathcal{A}}. \tag{5.1}$$

Expressed in terms of an abelian model we can also write

$$H_\omega(\mathcal{A}) = \sup_{\lambda, \mathcal{B}_0} S(\omega \otimes \mu|\lambda)_{\mathcal{A} \otimes \mathcal{B}_0}, \tag{5.2}$$

where $\lambda$ is a state over $\mathcal{B}_0 \otimes \mathcal{C}$.

We have the following inequality: Let $\omega$ as a state over $\mathcal{C}$ be written in the GNS-representation

$$\omega(C) = \langle\Omega|\pi(C)|\Omega\rangle$$

and let $\mathcal{C}'$ be the commutant in this representation. Then

$$S(\omega \otimes \mu|\omega)_{\mathcal{A} \otimes \mathcal{C}_0} \le H_\omega(\mathcal{A}) \le S(\omega \otimes \omega|\omega)_{\mathcal{A} \otimes \mathcal{C}'} \tag{5.3}$$

with $\mathcal{C}_0$ any abelian subalgebra of $\mathcal{C}'$. A maximal abelian subalgebra of $\mathcal{C}'$ gives a lower bound to the entropy, and in some cases it even is the best abelian model (compare [32] and the explicit results in [33] for estimates on $E$, i.e. without dynamics), but in other examples ([32], see also the forthcoming 6E) it is evidently too small. If in addition the abelian model has to carry a dynamics, the question arises when the abelian model can be imbedded into the commutant (or whether, by the natural isomorphism, the algebra itself contains a sufficiently large time invariant abelian subalgebra).

Here we have the following results

Theorem [34] Assume that $(\mathcal{A}, \sigma, \omega)$ is a dynamical system and $\omega$ a tracial state. Assume that the analogue of 2C) (entropic K-system) is satisfied, i.e.

$$\lim_{n\to\infty} h(\sigma^n, \mathcal{B}) = H(\mathcal{B}) \quad \forall \text{ finite dimensional } \mathcal{B} \subset \mathcal{A}.$$

Then

$$\text{st-}\lim_{n\to\infty}\, [A, \sigma^n B] = 0 \quad \forall\, A, B \in \mathcal{A}. \tag{5.4}$$

Proof It suffices to choose $\mathcal{B} = P$ for all projection operators in $\mathcal{A}$. Then $P$ is its own best abelian model in the calculation of $H(\mathcal{B})$. Refinements of the models $P$, $\sigma^n P$ have to be used to calculate $H(\sigma^n\mathcal{B})$ (compare the theorem containing (4.23)). But they are only possible if $P$ and $\sigma^n P$ nearly commute.

The theorem was generalized to other states [34], but with the restriction that we had to be able to keep control over sufficiently many optimal abelian models. We believe that these restrictions can be removed by a harder analysis.

Another result on footprints of commutativity is the following

Theorem [35] Assume that in the calculation of the dynamical entropy there exists an optimal abelian model, i.e.

$$h(\sigma) = \sup_{\mathcal{B}}\, (4.19) = \max_{\mathcal{B}}\, (4.19); \tag{5.5}$$

then the algebra $\mathcal{A}$ contains an abelian subalgebra $\mathcal{A}_0$ on which $\sigma$ acts as an automorphism. Notice that this does not imply that this abelian subalgebra already is the optimal abelian model.

6 Abelian Models for Algebraic K-Systems

In the following we will discuss the examples of algebraic K-systems given in Sect. 3 and how far they allow us to find good abelian models.


A) In this model of a quantized Bernoulli system that completely factorizes, the obvious choice of the abelian model that gives the correct result is

$$\mathcal{A}_0 = \bigotimes_{n \in \mathbb{Z}} \mathcal{B}_{0,n},$$

where $\mathcal{B}_0$ is the abelian algebra that commutes with $\rho$ and describes the measurements with maximal certainty.

B) For the lattice system, for which the state does not factorize any more, it does not suffice to pick a suitable abelian subalgebra at every lattice point. This provides an abelian model, but not an optimal one. According to the observation (4.25) it is clear that an upper bound for the dynamical entropy is given by the entropy density [29], and it seems very plausible that it should not be less. To our knowledge no general proof is available, but for the states that are of physical interest equality has been shown.

Already in [29] equality was shown under some compatibility relation between space translation and modular automorphism. In practice, however, it is difficult to check whether this compatibility relation holds. For quasifree states this is possible and was done in [36]. Here an abelian subalgebra was selected for increasing size of the tensor product. This subalgebra delocalizes, but only to such an extent that the convergence of these subalgebras to an abelian model that gives the desired result can be controlled.

In [37] equilibrium states over lattice systems as in [9] were considered and a decomposition was offered that in the limit gave the desired result. [38] applied the affinity of the dynamical entropy to control these limits and to allow exchanging them. His ideas are generalized in [39], giving the following result:

If you assume that the shift $\sigma$ is asymptotically abelian (i.e. we consider not only lattice algebras but some generalization in the framework of AF-algebras) and you consider a dynamics given by a sequence of local Hamiltonians, then:

The thermodynamic limit of the equilibrium states exists, and they satisfy the KMS property with respect to the dynamics.

For these states the entropy density and the dynamical entropy of the shift coincide. The dynamical entropy of the shift can be used in a thermodynamic variation principle. This variation principle is satisfied exactly by states that are KMS with respect to the time evolution.


The maximal dynamical entropy is achieved by the tracial state and coincides in this state with the Voiculescu dynamical entropy $hat$ (4.9). In all these examples the abelian model is constructed by considering the sequence $\rho = e^{-H_\Lambda}$ and the corresponding minimal projectors in (4.21)-(4.23).

There exists another possibility to construct space translation invariant states on the lattice namely the method of correlated states

We start again with our chain $\mathcal{A} = \bigotimes_n \mathcal{B}_n$. In addition we choose an algebra $\mathcal{C}$ (we restrict to finite dimensional ones) and consider some completely positive map $F: \mathcal{C} \otimes \mathcal{B} \to \mathcal{C}$ that we can write as $f_b(c)$, and we demand $f_{\mathbb{1}}(\mathbb{1}) = \mathbb{1}$. Let $\Omega$ be a state over $\mathcal{C}$ satisfying $\Omega \circ f_{\mathbb{1}} = \Omega$. Then we define

$$\omega(b_1 \otimes \cdots \otimes b_k) = \Omega(f_{b_1} \circ f_{b_2} \circ \cdots \circ f_{b_k}(\mathbb{1})),$$

where $b_i$ is an operator at the lattice point $i$ (many of them can be $\mathbb{1}$).

It can be checked that in this way we obtain a translation invariant state. If e.g. $f_b(\mathbb{1}) = \omega(b) \cdot \mathbb{1}$, then we obtain a state that is clustering. If we want to have nontrivial correlations between nearest neighbours we have to choose another $F$, but this enforces that there must also be correlations to other neighbours. Space clustering is encoded in the convergence properties of $F$ [40].

Now the construction of an abelian model is offered by a decomposition of $F$ into finer completely positive maps. Convergence properties in the construction of abelian models, as is necessary in (4.23), are now controlled by convergence properties of $F$ (which acts over finite dimensional algebras) instead of convergence properties of space correlations. Again we have to choose $\mathcal{B}_n$ sufficiently large, i.e. combine sufficiently many lattice points. With appropriate estimates it was shown [41] that for all finitely correlated states ($\mathcal{C}$ of finite dimension) the dynamical entropy and the entropy density of the states constructed in this way coincide.

C) The Fermi Algebra

If we concentrate on the even subalgebra $\mathcal{A}_e$ of the CAR algebra, i.e. the algebra consisting of even polynomials in creation and annihilation operators, this is just a special AF-algebra that is asymptotically abelian, and therefore the results in [39] guarantee that for equilibrium states the dynamical entropy of space translation and the entropy density coincide.

If in addition we apply the theorem [29]

$$h(\sigma^n) = n\, h(\sigma),$$

then obviously

$$\begin{aligned} h_{\mathcal{A}_e}(\sigma^n) &\le h_{\mathcal{A}}(\sigma^n) \\ &= H_\mu(P|P_-^{(n)}) - H(P|P_-^{(n)} \otimes \mathcal{A}) \\ &\le H_\mu(P|P_-^{(n)}) - H(P|P_-^{(n)} \otimes \mathcal{A}_e) + \ln 2 \end{aligned} \tag{6.1}$$

shows that $h_{\mathcal{A}_e}(\sigma) = h_{\mathcal{A}}(\sigma)$.

Nevertheless the noncommutativity of the algebra has consequences

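Theorem If $\omega = \omega \circ \sigma$, then $\omega(A_o) = 0$ for all odd elements $A_o$ in $\mathcal{A}$.

Proof

$$|\omega(A_o)|^2 = \Big|\frac{1}{N} \sum_{n=0}^{N-1} \omega(\sigma^n A_o)\Big|^2 \le \frac{1}{2N^2} \sum_{k,l=0}^{N-1} \omega([\sigma^k A_o^*, \sigma^l A_o]_+). \tag{6.2}$$

The anticommutator vanishes for strictly local odd operators except for $(l - k) = O(1)$. Therefore

$$|\omega(A_o)|^2 \le \frac{c}{N} \quad \forall N.$$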

We notice that noncommutativity reduces the possibility for invariant states

Concerning the question for entropic K-systems (2.2): for all even subalgebras

$$\lim_{n\to\infty} h(\sigma^n, \mathcal{B}_e) = H(\mathcal{B}_e),$$

but for a typical odd subalgebra $\mathcal{A}_o = \{a_0 + a_0^\dagger\}$, $h(\sigma, \mathcal{A}_o) = 0$.

D) For the stationary quantum Markov chain again an abelian model can be constructed that gives the optimal result, i.e. the entropy density [10]. The main idea in the proof is the fact that, apart from the algebra $\mathcal{A}$, we can concentrate on the algebra $\mathcal{C}$, and inside this algebra we construct an optimal decomposition. Therefore in the limit of these decompositions we find an abelian model with vanishing entropy defect $H(P|P_- \otimes \mathcal{A})$.


As we already mentioned, the automorphism $T$ (as in our special example) will not be asymptotically abelian in general, and therefore the system fails to be an entropic K-system. Similarly to the Fermi system we can introduce the gauge automorphism

$$\gamma\sigma_x = -\sigma_x, \qquad \gamma\sigma_y = -\sigma_y, \qquad \gamma\sigma_z = \sigma_z.$$

The elements invariant under this gauge automorphism are asymptotically abelian under space translation, because they become localized in $\mathbb{1} \otimes \mathcal{C}$. Therefore again the result corresponds to the results in [39], though the states are constructed in different ways.

E) The last example we want to discuss in this framework is the Price-Powers shift. We have already considered the special case $g(i) = 1$, the Fermi algebra (3Eb). For $g(1) = 1$, $g(l) = 0$ otherwise, the representation (3Ea) already indicates how to construct an abelian model. For $\sigma^2$ we are dealing with a quantum Bernoulli shift that is factorizing, with the obvious choice for an abelian model. Therefore it is easy to construct the abelian model for $\sigma$.

We can consider $B_{\sigma^2}$ as subalgebra of $A$; therefore $\sigma B_{\sigma^2}$ is again an abelian subalgebra, and for the shift $\sigma$ we consider the abelian model

$$\sigma B_{\sigma^2}$$

with the obvious coupling. Notice that now we have presented an example where the entropy defect of the abelian model does not vanish, i.e. the abelian model is not a subalgebra of the system. For arbitrary $g$ we will in general fail to find an abelian model. We have only to vary the proof (6.2). If $g$ is sufficiently irregular, so that for all $w_I \in A$, where the $w_I$ are monomials in $a_i$, $i \in I$,

$$[w_I, \sigma^k w_I]_+ = 0$$

for infinitely many $k$, so that

$$|\omega(w_I)|^2 = \Big|\frac{1}{N}\sum_{j=1}^{N}\omega(\sigma^{k_j} w_I)\Big|^2 \le \frac{1}{N^2}\sum_{i,j=1}^{N}\omega\big([\sigma^{k_i} w_I^*, \sigma^{k_j} w_I]_+\big) = O\Big(\frac{1}{N}\Big), \qquad (6.3)$$

then $\omega(w_I)$ has to vanish.

In fact it was shown in [42] that it is possible to construct a sequence $g$ so that (6.3) holds for all $w_I$, and therefore the only invariant state is the tracial state. In [43] we proved that with probability one on the set of possible $g$ (6.3) holds, and again we have a unique invariant state. But this argument can be generalized to every coupling to abelian models; therefore every coupling has to be trivial and the dynamical entropy in the sense of [29] resp. [30] vanishes.

The Price-Powers shift was also studied in the context of Voiculescu's dynamical entropy and in the context of the Alicki-Fannes entropy [23, 44]. Here the increasing property is the dominant feature. We obtain

$$ha_\tau(\sigma) = \tfrac{1}{2}\ln 2, \qquad h_{AF}(\sigma) = \ln 2 \qquad (6.4)$$

independently of the special sequence g

If we return to our remark that, for classical dynamical systems, the dynamical entropy describes how information increases but at the same time becomes more and more irrelevant, we notice that the Voiculescu and the Alicki-Fannes entropies concentrate on the fact that information increases, whereas the entropy of [29] is sensitive to the rate at which information becomes irrelevant.

7 Continuous K-Systems

So far we concentrated on discrete dynamics. But obviously the discrete group of translations Z can be replaced by R without varying much of the definitions. In particular, due to the linearity of the dynamical entropy (which is proven for [18] and [29]),

$$h(\sigma^n) = n\,h(\sigma), \qquad (7.1)$$

also for the continuous group R we can choose the subgroup $a$Z and calculate the dynamical entropy (for all possible definitions) for this subgroup. It can be shown that the result will be independent of the scaling parameter $a$.

The definition of an algebraic quantum K-system is also applicable for a continuous group. Only in this case the amount of increase cannot be described by $[A_t : A_0]$: it is either zero or infinity, because $[A_t : A_0] = n\,[A_{t/n} : A_0]$ and $[A : A_0]$ is either 0 or $\ge 2$ [14].

This remark shows that a continuous quasifree evolution over a Fermi lattice system ($\sigma_\alpha a(f) = a(e^{i\alpha p} f)$, $\alpha \in$ R) can give positive dynamical entropy but cannot correspond to a continuous algebraic K-system:

$$[A_t : A_0] = ha_\tau(\sigma_t)$$

and

$$ha_\tau(\sigma_t) = h_\tau(\sigma_t)$$

in the tracial state (compare [39]). This leads to a contradiction if $h_\tau(\sigma_t)$ is bounded.

A prototype of a continuous K-system is given in relativistic quantum field theory

The Wedge Algebra [45]: Consider the algebra $A_W = \{\phi(x),\ x_1 > 0\}$ as subalgebra of a quantum field theory $A$. This algebra is mapped into itself by the following automorphisms:

a) $\sigma_x$, the shift in the x-direction. Therefore $(A, A_W, \sigma_x, \langle\Omega|\cdot|\Omega\rangle)$ is a K-system in an irreducible state. The unitary operator implementing $\sigma_x$ is $e^{iP^1 x}$ with spec$(P^1)$ = R.

b) $\ell_t$, the shift in the light direction $x^1 + x^0$. Again $(A, A_W, \ell_t, \langle\Omega|\cdot|\Omega\rangle)$ is a K-system. Now $\ell_t = \mathrm{ad}\, e^{iLt}$ with spec$(L)$ = R$^+$.

c) $|\Omega\rangle$ is cyclic and separating for $A_W$. Therefore it defines a KMS-automorphism, and this KMS-automorphism coincides with the geometric action of the boost $b_s$. With $(A_W, \ell_1 A_W, b_s, \langle\Omega|\cdot|\Omega\rangle)$ we obtain a new K-system where the K-automorphism is now the modular automorphism, $\mathrm{ad}\, b_s = \mathrm{ad}\, e^{iBs}$; $\ell_t$ acts as endomorphism on $A_W$. The generators satisfy

$$[B, L] = iL \qquad (7.2)$$

These relations can be generalized to the following theorem

Theorem: Let $(A, A_0, \tau_t, \omega)$ be a modular K-system, i.e. $\tau_t$ the modular automorphism of $A$ and

$$\tau_t A_0 \supset A_0.$$

a) Then the GNS vector $|\Omega\rangle$ implementing $\omega$ is cyclic and separating both for $A$ and $A_0$.

b) Let $\tau_t$ be implemented by $e^{iHt}$, $e^{iHt}|\Omega\rangle = |\Omega\rangle$. Let $\tau_t^0$ be the modular automorphism of $A_0$ implemented by $e^{iH^0 t}$ with $e^{iH^0 t}|\Omega\rangle = |\Omega\rangle$. Then

$G = H^0 - H$ is well defined, $G \ge 0$;

$e^{iGs}$, $s \ge 0$, implements an endomorphism on $A$ with $e^{iG} A e^{-iG} = A_0$;

$$[H, G] = iG. \qquad (7.3)$$

The proof is based on the analyticity properties of the modular operator, taking appropriate care of domain properties [46, 47].

We notice that for quantum modular K-systems endomorphisms arise in a natural way that satisfy the Anosov commutation relations and therefore determine, via Lyapunov exponents, the clustering properties of the automorphism.

Theorem: Let $(A, \tau(t), \sigma(s), \omega)$ be an Anosov system with $\tau$ the K-automorphism and $\sigma$ the Anosov endomorphism.

Take $\chi_a$ to be the characteristic function of $(a, \infty)$ for some $a > 0$. Choose $A$ and $B \in A$ such that

i) $A\Omega \in D(G^r)$ for some $r > 0$,

ii) $\chi_a(G) B\Omega = 0$. As a consequence $\langle\Omega|B|\Omega\rangle = 0$.

Then

$$|\omega(A \tau_t B)| \le e^{-tr} a^{-r}\, \|B\Omega\|\, \|G^r A\Omega\|. \qquad (7.4)$$

We refer to [1] and [48].

As for discrete quantum K-systems, we wonder whether the dynamical entropy is positive and whether there exist nontrivial models. Again no general result is available. On the basis of quasifree evolution [49] we can construct models for fermions and bosons that are modular K-systems with positive dynamical entropy. But there exists also a q-deformed quasifree modular system [50]. Here the past algebra has trivial relative commutant, and therefore the algebra does not contain any subalgebra on which the dynamics acts asymptotically abelian, which according to [34] seems to be a requirement for the construction of abelian models.


8 Mixing Properties Without Algebraic K-Property

As already mentioned, no strategy is available up to now to construct, for a given quantum dynamical system, a subalgebra that satisfies the K-property. A model for which it is still undecided whether we are dealing with an algebraic K-system is the rotation algebra [51].

Definition: The rotation algebra $A_\alpha$ is built by unitary operators $U$, $V$ with

$$U V = e^{i\alpha}\, V U \qquad (8.1)$$

for some $\alpha \in [0, 2\pi)$. This algebra arises in a natural way in a physically motivated example. Consider a free particle in a constant magnetic field confined to two dimensions. Then the particle describes Larmor orbits. In the thermodynamic limit these Larmor orbits can be occupied up to a precise filling factor [52]. This thermodynamic limit can most easily be achieved by confining the particles in an additional harmonic potential whose strength is going to zero [53]. Another method, more tailor-made to study electric currents, is to use periodic boundary conditions. Therefore the algebra is built by $e^{i\alpha v_x}$, $e^{i\beta v_y}$, $e^{inx}$, $e^{imy}$ with

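$$e^{i\alpha v_x} e^{inx} = e^{in(x+\alpha)} e^{i\alpha v_x},$$

$$e^{i\beta v_y} e^{imy} = e^{im(y+\beta)} e^{i\beta v_y},$$

$$e^{i\alpha v_x} e^{i\beta v_y} = e^{i\alpha\beta B}\, e^{i\beta v_y} e^{i\alpha v_x}, \qquad (8.2)$$

with $B$ the magnetic field orthogonal to the plane. All other commutators vanish.

If we introduce

$$\exp[in\bar x] = \exp\Big[in\Big(x - \frac{1}{B}v_y\Big)\Big], \qquad \exp[im\bar y] = \exp\Big[im\Big(y + \frac{1}{B}v_x\Big)\Big],$$

then the algebra splits into

$$\{e^{i\alpha v_x}, e^{i\beta v_y}\} \otimes \{e^{in\bar x}, e^{im\bar y}\}$$

with

$$e^{in\bar x} e^{im\bar y} = e^{-inm/B}\, e^{im\bar y} e^{in\bar x}.$$

Therefore the rotation algebra with $\alpha = 1/B$ describes the algebra of the center of the Larmor precession.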
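The defining relation (8.1) can be checked concretely in the special case where $\alpha/2\pi$ is rational. The following is a minimal sketch, not taken from the paper: it assumes $\alpha = 2\pi p/q$ and uses the standard q-dimensional clock and shift matrices as a hypothetical finite-dimensional realization of $U$ and $V$ (for irrational $\alpha/2\pi$ no such finite-dimensional realization exists). The function names are illustrative only.

```python
import cmath

def clock_shift(p, q):
    """Return (U, V, alpha) with U the clock and V the shift matrix (q x q lists).

    Assumption: alpha = 2*pi*p/q, so that U V = exp(i*alpha) V U can hold
    in finite dimension.
    """
    alpha = 2 * cmath.pi * p / q
    w = cmath.exp(1j * alpha)
    # U is diagonal with entries 1, w, w^2, ...
    U = [[w ** j if j == k else 0j for k in range(q)] for j in range(q)]
    # V maps basis vector e_k to e_{k+1 mod q}
    V = [[1 + 0j if j == (k + 1) % q else 0j for k in range(q)] for j in range(q)]
    return U, V, alpha

def matmul(A, B):
    """Plain matrix product of two square nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutation_defect(p, q):
    """Largest entry of |U V - exp(i*alpha) V U|; should be at rounding level."""
    U, V, alpha = clock_shift(p, q)
    UV = matmul(U, V)
    VU = matmul(V, U)
    phase = cmath.exp(1j * alpha)
    return max(abs(UV[i][j] - phase * VU[i][j])
               for i in range(q) for j in range(q))
```

Calling `commutation_defect(1, 3)` returns a number at the level of floating-point rounding, which confirms the relation for $\alpha = 2\pi/3$ in this toy realization.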

For $A_\alpha$ there exists a representation on $L^2(T^2)$:

7r(Va) = exp [i [y - ^Pz) ] gt (83)

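where $p_x$, $p_y$ are the momentum operators $\frac{1}{i}\frac{\partial}{\partial x}$, $\frac{1}{i}\frac{\partial}{\partial y}$ with periodic boundary conditions on the torus. For $|\Omega\rangle = |1\rangle$, the constant function on the torus,

$$\pi(U_\alpha)|\Omega\rangle = e^{ix}, \qquad \pi(V_\alpha)|\Omega\rangle = e^{iy}, \qquad (8.4)$$

independent of the rotation parameter $\alpha$. On $A_\alpha$ we have the following automorphisms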

4(^C) = J^usv

with

n m

= T n m - ( ) bull

ad mdash be = 1

$\sigma^{(1)}$ and $\sigma^{(2)}$ describe currents and are therefore of physical relevance. $\Theta_T$ describes dilation in R$^2$ space and reduces to a map on the torus $T^2$ only for discrete values and discrete directions of the dilation. A physical description for $\Theta_T$ can be given if it describes a sudden periodic push to the particle. Whereas $\sigma^{(1)}$ and $\sigma^{(2)}$ have no good mixing behaviour, $\Theta_T$ inherits all mixing properties from the classical torus due to (8.4):

(nn(Wa(z))QTn(Wa(z))n) = (QirW0z))QTn(W0(z))il) (85)

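But with respect to dynamical entropy the noncommutativity plays an essential role. Let $\lambda$ be the eigenvalue $> 1$ of $T$. Then

$ha_\tau(\Theta_T) = \ln\lambda$ for $\alpha$ irrational [18]

$\qquad\quad\ = \ln\lambda$ for $\alpha$ rational

$h_{AF}(\Theta_T) = \ln\lambda$ for all $\alpha$ [55]

$h_{CNT}(\Theta_T) = \ln\lambda$ for $\alpha$ rational

$\qquad\quad\ > 0$ for $\alpha$ depending rationally on $\lambda$ [57]

$\qquad\quad\ = 0$ in general [56]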
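For orientation, the value $\ln\lambda$ appearing in this list is easy to evaluate for a concrete matrix. The sketch below is an illustration, not part of the paper: it assumes $T$ is an integer 2x2 matrix with $ad - bc = 1$ and trace larger than 2 (so that an eigenvalue $\lambda > 1$ exists), and the sample matrix with rows (2, 1) and (1, 1) is the usual Arnold cat map, chosen here only as an example.

```python
import math

def cat_map_entropy(a, b, c, d):
    """Return (lambda, ln lambda) for T = [[a, b], [c, d]].

    Assumptions: integer entries, a*d - b*c == 1, and trace > 2 so that
    the larger eigenvalue exceeds 1.
    """
    if a * d - b * c != 1:
        raise ValueError("T must have determinant 1")
    trace = a + d
    if trace <= 2:
        raise ValueError("need trace > 2 for an eigenvalue larger than 1")
    # Eigenvalues solve x^2 - trace*x + 1 = 0; take the larger root.
    lam = (trace + math.sqrt(trace * trace - 4)) / 2
    return lam, math.log(lam)
```

For the sample matrix, `cat_map_entropy(2, 1, 1, 1)` gives $\lambda = (3 + \sqrt{5})/2$ and its logarithm.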

In addition it was possible to construct, for rational $\alpha$, a subalgebra $A_0$ so that $(A, A_0, \Theta_T, \omega)$ became a K-system [54]. This was possible because $A$ can be looked at as a crossed product of the classical algebra on $T^2$ with a discrete translation group, and by rather general considerations crossed product algebras inherit under some conditions the K-structure of the underlying algebra [56]. Obviously this construction does not give a hint for irrational $\alpha$.

The strong dependence on $\alpha$ of the CNT dynamical entropy is based on the strong dependence of the asymptotic commutation behaviour. Only if $\alpha$ and $\lambda$ are rationally dependent is the system asymptotically abelian, and the commutator converges asymptotically fast to zero. This rapid convergence made it possible to construct an abelian model [57], using the fact that the algebra $A_\alpha$ can be imbedded in, but is not, an AF-algebra. Therefore, different from the approaches for lattice systems, the abelian model cannot be identified, up to convergence problems, with an abelian subalgebra of $A_\alpha$.

9 Time Evolution

As we have seen, in a quantum system there are many possibilities for some kind of mixing behaviour that are not equivalent as in the classical situation. Up to now we concentrated on dynamics that were constructed in such a way that they should give us information on possible ergodic structures.

When the dynamics is given to us by a sequence of local Hamiltonians, we have up to now hardly any control over the asymptotic behaviour, apart from quasifree evolution.

We mention just one result. The x-y model [58] allows a transformation to a quasifree evolution. Therefore we know that it is weakly but not strongly asymptotically abelian. Its dynamical entropy is positive and all definitions give the same result (with the dimensional correction term for $h_{AF}$). We do not know whether it is an algebraic K-system for a discrete subset in time. For sure it is not a continuous algebraic K-system.


References

1 GG Emch H Narnhofer GL Sewell W Thirring Anosov Actions on Non-Commutative Algebras J Math Phys 3511 5582-5599 (1994)

2 MC Gutzwiller Chaos in classical and quantum mechanics (Springer New York 1990)

3 E Bogomolny F Leyvraz C Schmit Statistical Properties of Eigenvalues for the Modular Group in XIth International Congress of Mathematical Physics Daniel Jagolnitzer ed (International Press Boston 306-323 1995)

4 AN Kolmogorov A new metric invariant of transitive systems and automorphisms of Lebesgue spaces Dokl Akad Nauk 119 861-864 (1958)

5 P Walters An Introduction to Ergodic Theory (Springer New York 1982)

6 IP Cornfeld SV Fomin YaG Sinai Ergodic Theory (Springer New York 1982)

7 H Narnhofer W Thirring Quantum K-Systems Commun Math Phys 125 565-577 (1989)

8 C Brukner A Zeilinger Conceptual Inadequacy of the Shannon Information in Quantum Measurements quant-ph/0006087

9 O Bratteli DW Robinson Operator Algebras and Quantum Statistical Mechanics I II (Springer Berlin Heidelberg New York 1993)

10 B Kümmerer Examples of Markov dilation over 2 x 2 matrices in L Accardi A Frigerio V Gorini eds Quantum Probability and Applications to the Quantum Theory of Irreversible Processes Springer Berlin 1984 228-244 and private communications

11 RT Powers An index theory for semigroups of *-endomorphisms of B(H) and type II_1 factors Canad J Math 40 86-114 (1988) GL Price Shifts of II_1 factors Canad J Math 39 492-511 (1987)

12 H Narnhofer W Thirring Chaotic Properties of the Noncommutative 2-Shift in From Phase Transition to Chaos G Gyorgyi I Kondor S Sasvari T Tel eds World Scientific 1992 530-546

13 H Narnhofer W Thirring Clustering for Algebraic K-Systems Lett Math Phys 30 307-316 (1994)

14 VFR Jones Index for subfactors Invent Math 72 1-25 (1983)

15 R Longo Simple Injective Subfactors Adv Math 63 152-171 (1987) Index of Subfactors and Statistics of Quantum Fields Commun Math Phys 130 285-309 (1990)

16 M Choda Entropy of canonical shifts Trans Amer Math Soc 334 827-849 (1992)


17 H Narnhofer A Pflug W Thirring Mixing and Entropy Increase in Quantum Systems in Symmetry in Nature in honour of Luigi A Radicati di Brozolo Scuola Normale Superiore Pisa 597-626 (1989)

18 DV Voiculescu Dynamical Approximation Entropies and Topological Entropy in Operator Algebras Commun Math Phys 170 249-282 (1995)

19 M Choda A C*-Dynamical Entropy and Applications to Canonical Endomorphisms J Funct Anal 173 453-480 (2000)

20 E Stormer A Survey of noncommutative dynamical entropy Oslo preprint No 18 Dep of Mathematics MSC-class 46L40 (2000)

21 M Choda Entropy on crossed products and entropy on free products preprint (1999)

22 K Dykema Topological entropy of some automorphisms of reduced amalgamated free product C*-algebras preprint (1999)

23 R Alicki F Fannes Defining Quantum Dynamical Entropy Lett Math Phys 32 75-82 (1994)

24 RB Griffiths Consistent histories and the interpretation of quantum mechanics J Stat Phys 36 219-279 (1984)

25 M Gell-Mann J Hartle Alternative decohering histories in quantum mechanics in Proc of the 25th Int Conf on High Energy Physics Vol 2 ed by KK Phua and Y Yamaguchi World Scientific Singapore 1303-1310 (1991)

26 EH Lieb DW Robinson The finite group velocity of quantum spin systems Commun Math Phys 28 251-257 (1972)

27 A Connes E Stormer Entropy of II_1 von Neumann algebras Acta Math 134 289-306 (1972)

28 A Connes Acad Sci Paris 301 I 1-4 (1985)

29 A Connes H Narnhofer W Thirring Dynamical Entropy of C*-Algebras and von Neumann Algebras Commun Math Phys 112 691-719 (1987)

30 JL Sauvageot JP Thouvenot Une nouvelle définition de l'entropie dynamique des systèmes non commutatifs Commun Math Phys 145 411-423 (1992)

31 CH Bennett DP DiVincenzo JA Smolin WK Wootters Mixed state entanglement and quantum error corrections Phys Rev A 54 3824-3851 (1996)

32 F Benatti H Narnhofer A Uhlmann Decomposition of quantum states with respect to entropy Rep Math Phys 38 123-141 (1996)

33 WK Wootters Entanglement of formation of an arbitrary state of two qubits q-ph970929

34 F Benatti H Narnhofer Strong asymptotic abelianness for entropic K-systems Commun Math Phys 136 231-250 (1991) Strong Clustering in Type III Entropic K-Systems Mh Math 124 287-307 (1996)

35 H Narnhofer An Ergodic Abelian Skeleton for Quantum Systems Lett Math Phys 28 85-95 (1993)

36 H Narnhofer W Thirring Dynamical Theory of Quantum Systems and Their Abelian Counterpart in On Klauders Path eds GG Emch GC Hegerfeldt L Streit World Scientific 127-145 (1994)

37 H Narnhofer Free energy and the dynamical entropy of space translation Rep Math Phys 25 345-356 (1988)

38 H Moriya Variational principle and the dynamical entropy of space translation Rev Math Phys 11 1315-1328 (1999)

39 S Neshveyev E Stormer The variational principle for a class of asymptotically abelian C*-algebras MSC-class 46L55 (2000)

40 M Fannes B Nachtergaele RF Werner Finitely correlated states of quantum spin systems Commun Math Phys 144 443-490 (1992)

41 RF Werner private communication

42 H Narnhofer E Stormer W Thirring C*-dynamical systems for which the tensor product formula for entropy fails Ergod Th & Dynam Sys 15 961-968 (1995)

43 H Narnhofer W Thirring C*-dynamical systems that are highly anticommutative Lett Math Phys 35 145-154 (1995)

44 R Alicki H Narnhofer Comparison of Dynamical Entropies for the Noncommutative Shifts Lett Math Phys 33 241-247 (1995)

45 HJ Borchers On the Revolutionization of Quantum Field Theory by Tomitas Modular Theory ESI preprint 160 pages 148 references

46 HJ Borchers On Modular Inclusion and Spectrum Condition Lett Math Phys 27 311-324 (1993)

47 HW Wiesbrock Halfsided Modular Inclusions of von Neumann Algebras Commun Math Phys 157 83-92 (1993) Commun Math Phys 184 683-685 (1997)

48 H Narnhofer Kolmogorov Systems and Anosov Systems in Quantum Theory review to be publ in IDAQP

49 H Narnhofer W Thirring Realization of Two-Sided Quantum K-Systems Rep Math Phys 45 239-256 (2000)

50 D Shlyakhtenko Free quasifree states Pac Journ of Math 177 329-368 (1997)

51 MA Rieffel Pac J Math 93 415 (1981)

52 RB Laughlin Quantized Hall Conductivity in Two Dimensions Phys Rev B 2310 5632-5633 (1981)

53 N Ilieva W Thirring Second quantization picture of the edge currents in the fractional quantum Hall effect math-ph/0010038

54 F Benatti H Narnhofer GL Sewell A Non Commutative Version of the Arnold Cat Map Lett Math Phys 21 157-172 (1991)

55 R Alicki J Andries M Fannes P Tuyls Lett Math Phys 35 375-383 (1995)

56 H Narnhofer Ergodic Properties of Automorphisms on the Rotation Algebra Rep Math Phys 39 387-406 (1997)

57 SV Neshveyev On the K property of quantized Arnold cat maps J Math Phys 41 1961-1965 (2000)

58 H Araki T Matsui Commun Math Phys 101 213-246 (1985)


SCATTERING IN QUANTUM TUBES

BÖRJE NILSSON

School of Mathematics and Systems Engineering, Växjö University, SE-351 95 Växjö, Sweden

E-mail borjenilssonmsivxuse

It is possible to fabricate mesoscopic structures where at least one of the dimensions is of the order of the de Broglie wavelength for cold electrons. By using semiconductors composed of more than one material, combined with a metal split-gate, two-dimensional quantum tubes may be built. We present a method for predicting the transmission of low-temperature electrons in such a tube. This problem is mathematically related to the transmission of acoustic or electromagnetic waves in a two-dimensional duct. The tube is asymptotically straight with a constant cross-section. Propagation properties for complicated tubes can be synthesised from corresponding results for simpler tubes by the so-called Building Block Method. Conformal mapping techniques are then applied to transform the simple tube with curvature and varying cross-section to a straight, constant cross-section tube with variable refractive index. Stable formulations for the scattering operators in terms of ordinary differential equations are obtained by wave splitting using an invariant imbedding technique. The mathematical framework is also generalised to handle tubes with edges, which are of large technical interest. The numerical method consists of using a standard MATLAB ordinary differential equation solver for the truncated reflection and transmission matrices in a Fourier sine basis. It is proved that the numerical scheme converges with increasing truncation.

1 Introduction

In the search for faster computers, critical parts are becoming smaller. Today it is possible to build mesoscopic structures where some dimensions are of the order of the de Broglie wavelength for cold electrons. Often the electron motion is confined to two dimensions. Consequently it may be necessary, at least for some computer parts, to include quantum effects in the design process.

A large number of studies devoted to such quantum effects have been carried out in recent years, and a review is given by Londergan et al. [1]. Many investigations aim at understanding the physical properties of a particular quantum tube rather than developing reliable mathematical and numerical methods that can be used in a more general context. The research has given valuable knowledge on the physical behaviour but also reports on the limitations of the methods used. For instance, Lin & Jaffe [2] report that a straightforward matching at the boundary of a circular bend does not converge, demonstrating the numerical problems with such a method. An illposedness is present in quantum tube scattering, and some type of regularisation is therefore required to avoid large errors. Often the tubes have sharp corners to facilitate manufacturing but also to enhance quantum effects. The presence of corners with attached singularities requires special treatment.

Scattering of electrons in quantum tubes, see figure 1, is theoretically related to the scattering of acoustic and electromagnetic waves in ducts. Nilsson [3] treats a general method for the acoustic transmission in curved ducts with varying cross-sections. Wellposedness, i.e. stability, is achieved in an asymptotic sense. The mathematical framework guarantees consistent results and allows for sharp corners, and a proof of numerical convergence is given. We set out to present a quantum version of the results of Nilsson [3]. In this way the problems reported on convergence [2] and on inconsistent mathematical results would be resolved.

The paper is organised as follows. An introduction to scattering in quantum tubes is given in section 2, and a mathematical model is formulated in section 3. The Building Block Method, which is a systematic method to analyse complicated tubes in terms of results for simple tubes, is also briefly described. Then, in section 4, the scattering problem for the curved tube with varying cross-section and constant potential is reformulated as a scattering problem for a straight tube with a varying refractive index. The solution to this problem is presented in section 5, and a discussion on numerical methods is also given.

2 Tubes in quantum heterostructures

A schematic view of a quantum heterostructure is shown in figure 2, following Wu et al. [4]. Electrons are emitted from the n-type doped AlGaAs layer, migrate into the GaAs layer and stay close to the boundary to the AlGaAs layer. In this way a very narrow layer of electrons, which are free to move in a plane, is formed. Nearly all the electrons in this two-dimensional gas are in the same quantum state. By applying a negative potential on the metal electrodes on the top of the heterostructure in figure 1, the electrons are expelled from the region below the electrodes. For relatively low voltages the effective potential in the tube for one electron is close to the square-well potential [1]. As a consequence, the electrons in the two-dimensional gas are further restricted to a tube that in form is a mirror picture of the gap between the two electrodes. This quantum tube links the electrons between the two two-dimensional gases on both sides of the strip formed by the electrodes.

3 Mathematical model

Consider a two-dimensional tube with interior $\Omega$ according to figure 1. The boundary $\Gamma$ consists of two continuous curves $\Gamma_+$ and $\Gamma_-$ which are piecewise $C^2$. The upper boundary $\Gamma_+$ can be continuously deformed to $\Gamma_-$ within $\Omega$. Outside a bounded region the duct is straight with constant widths $a$ and $b$, respectively. These terminating ducts are called the left and the right terminating duct, or L and R for short. We use stationary scattering theory for one electron in an effective potential with time dependence $\exp(-iEt/\hbar)$, assuming that the wave function $\psi$ satisfies the time-independent Schrödinger equation $\Delta\psi + k^2\psi = 0$ in $\Omega$, where $k^2 = 2mE/\hbar^2$ and $m$ is the effective mass [5]. Usually $k^2$ is called energy. The effective potential is assumed to be a square well, meaning that $\psi|_\Gamma = 0$.
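To get a feeling for the length scales involved, the relation $k^2 = 2mE/\hbar^2$ can be evaluated numerically. The sketch below is an illustration only: the effective mass ratio 0.067 (a commonly quoted value for electrons in GaAs) and the sample energy of 10 meV are assumptions of this example, not values stated in the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free electron mass, kg
EV = 1.602176634e-19     # one electron volt in joule

def wavenumber(energy_ev, mass_ratio=0.067):
    """Wavenumber k (1/m) from k^2 = 2 m E / hbar^2.

    mass_ratio is the assumed effective mass in units of the free electron mass.
    """
    m = mass_ratio * M_E
    return math.sqrt(2.0 * m * energy_ev * EV) / HBAR

def de_broglie_wavelength(energy_ev, mass_ratio=0.067):
    """de Broglie wavelength 2*pi/k in metres."""
    return 2.0 * math.pi / wavenumber(energy_ev, mass_ratio)
```

With these assumed inputs, `de_broglie_wavelength(0.010)` is a few tens of nanometres, which is the scale at which the tube width becomes comparable to the wavelength.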

In a tube with constant cross-section the harmonic wavefunction $\psi$ can be uniquely decomposed in leftgoing and rightgoing parts by $\psi = \psi^+ + \psi^-$. Super indices $+$ and $-$ indicate rightgoing (plus) and leftgoing (minus) waves, respectively. Let $\psi^+_{in}$ and $\psi^-_{in}$ be known incoming waves in the terminating ducts; $\psi^+_{in}$ is present in the left and $\psi^-_{in}$ in the right one. Let us write

$$\psi = \psi^+_{in} + R^+\psi^+_{in} + T^-\psi^-_{in} \ \text{in } L, \qquad \psi = \psi^-_{in} + R^-\psi^-_{in} + T^+\psi^+_{in} \ \text{in } R, \qquad (3.1)$$

where for example the last two terms in (3.1a) are minus waves, and the equation defines the left reflection mapping $R^+$ that maps the incoming wave to an outgoing one in L. The scattering problem consists of finding the mappings $R^+$, $T^-$, $R^-$ and $T^+$ as functions of energy for a given duct. In summary we have

$$\Delta\psi + k^2\psi = 0 \ \text{in } \Omega, \qquad \psi^+ = \psi^+_{in} \ \text{in } L, \qquad \psi^- = \psi^-_{in} \ \text{in } R. \qquad (3.2)$$

There is always a solution to (3.2), and except for a discrete number of eigenenergies $k^2 = k_n^2$, $n = 1, 2, 3, \ldots$, the solution is unique [6]. When $k^2 = k_n^2$, an eigenenergy, there exists a solution without incoming but with outgoing waves.

The use of the Building Block Method [7] or transfer matrix formalism [8] is very efficient for the solution of scattering problems. In this method a tube with a complicated geometry is divided into two parts, usually where the tube is straight. These two parts are converted to the type shown in figure 1 by extending the terminating tubes to infinity. A sub tube for the tube shown in figure 1 originates from the left part and is depicted in figure 3. The Building Block Method gives a procedure for calculating the mappings $R^+$, $T^-$, $R^-$ and $T^+$ for the entire tube in terms of the corresponding scattering properties for the sub tubes. This procedure can be repeated to get several sub tubes.
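The combination step can be illustrated for the simplest case of a single mode, where each mapping is just a complex number. The sketch below is an assumption-laden illustration rather than the procedure of [7]: it assumes the two sub tubes are joined at a common straight cross-section with the same phase reference, and it sums the multiple reflections between them in closed form. The names `Block` and `combine` are hypothetical.

```python
from typing import NamedTuple

class Block(NamedTuple):
    """Single-mode scattering data of one sub tube (all scalars)."""
    r_plus: complex   # reflection of a plus wave incident from the left
    t_plus: complex   # transmission of a plus wave to the right
    r_minus: complex  # reflection of a minus wave incident from the right
    t_minus: complex  # transmission of a minus wave to the left

def combine(left: Block, right: Block) -> Block:
    """Scattering data of the tube formed by `left` followed by `right`."""
    # Waves bounce between the two blocks; the geometric series sums to 1/denom.
    denom = 1.0 - left.r_minus * right.r_plus
    if denom == 0:
        raise ZeroDivisionError("resonant configuration: no unique solution")
    t_plus = right.t_plus * left.t_plus / denom
    r_plus = left.r_plus + left.t_minus * right.r_plus * left.t_plus / denom
    t_minus = left.t_minus * right.t_minus / denom
    r_minus = right.r_minus + right.t_plus * left.r_minus * right.t_minus / denom
    return Block(r_plus, t_plus, r_minus, t_minus)
```

Combining any block with a perfectly transparent one, `Block(0, 1, 0, 1)`, returns the original block, which is a quick sanity check of the formulas. For several modes the scalars become matrices and the order of the factors matters.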


Rather than using a general numerical package for conformal mappings, we have for the calculations in this paper employed the Schwarz-Christoffel mapping for a duct with corners, rounding the corners using the methods of Henrici [9]. Required analytic integrations are performed in MATHEMATICA.

We recall the standard duct theory [6] in a form that illustrates the illposedness of the problem, and we have

$$\psi = \psi^+ + \psi^- = \sum_{n=1}^{\infty} A_n^+ e^{i\alpha_n x}\varphi_n(y) + \sum_{n=1}^{\infty} A_n^- e^{-i\alpha_n x}\varphi_n(y) \qquad (3.3)$$

with $\varphi_n(y) = \sin(n\pi y/a)$ and $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$, Im $\alpha_n \ge 0$. It is convenient to define the operator $B_0$ by

$$B_0 f = \sum_{n=1}^{\infty} \alpha_n f_n \varphi_n, \qquad f(y) = \sum_{n=1}^{\infty} f_n \varphi_n(y). \qquad (3.4)$$

We find that $B_0^2 = \partial_y^2 + k^2$ and $\partial_x\psi^\pm = \pm iB_0\psi^\pm$. The initial value problem

$$\partial_x\psi^+(x) = iB_0\psi^+(x), \qquad \psi^+(0) = \psi_0^+ \qquad (3.5)$$

is illposed for $x < 0$ but not for $x > 0$. If an attenuated plus wave is marched to the left, an exponential growth is found. To avoid the illposedness, $\psi$ is decomposed, and the plus waves are calculated by marching to the right and minus waves in the opposite direction.
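The growth of an attenuated plus wave under leftward marching can be made concrete with a few lines of code. This is a minimal sketch using only the formula $\alpha_n = \sqrt{k^2 - n^2\pi^2/a^2}$ with Im $\alpha_n \ge 0$ from the text; the function names and the sample values of $k$ and $a$ are assumptions of the example.

```python
import cmath
import math

def axial_wavenumber(k, n, a):
    """alpha_n = sqrt(k^2 - (n*pi/a)^2) on the branch with Im alpha_n >= 0."""
    alpha = cmath.sqrt(k * k - (n * math.pi / a) ** 2)
    if alpha.imag < 0:
        alpha = -alpha
    return alpha

def count_propagating(k, a):
    """Number of modes with real, nonzero alpha_n (n*pi/a < k)."""
    n = 0
    while (n + 1) * math.pi / a < k:
        n += 1
    return n

def plus_wave_magnitude(k, n, a, x):
    """|exp(i*alpha_n*x)|: size of a unit plus wave of mode n at position x."""
    return abs(cmath.exp(1j * axial_wavenumber(k, n, a) * x))
```

For a mode above cutoff, `plus_wave_magnitude` decays for positive `x` and grows exponentially for negative `x`, which is the illposedness described above; for propagating modes the magnitude stays equal to one in both directions.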

4 Reformulated scattering problem

To be able to use powerful spectral methods it is advantageous to transform the tube to one with a flat boundary. According to the Building Block Method it is enough to consider the scattering in the sub tubes, and we restrict ourselves to the first part as shown in figure 3. One way of transforming the tube is to use a conformal mapping $w(\zeta)$ transforming the interior $\Omega$ of the tube with variable cross-section in the $\zeta = x + iy$ plane (figure 3) to the interior $\Pi$ of a straight tube with constant cross-section in the $w = u + iv$ plane. The straight tube is described by $-\infty < u < \infty$, $0 < v < a$.

Introducing $\phi(u, v) = \psi(x, y)$ we get

$$\partial_u^2\phi + B^2(u)\phi = 0 \ \text{in } \Pi, \qquad \phi(u, 0) = \phi(u, a) = 0, \ u \in \mathrm{R}, \qquad (4.1)$$

with $B^2(u) = \partial_v^2 + k^2\mu(u, v)$ and $\mu = |d\zeta/dw|^2$. $\mu(u, v)^{1/2}$ can be denoted as a refractive index for the straight tube. In figure 4, $\mu$ related to the simple tube in figure 3 is depicted. The factor $\mu(u, v)$ is asymptotically constant at both ends of the tube, or more precisely $\mu(u, v) = \mu_\pm + O(e^{-c|u|})$, $u \to \pm\infty$, with $\mu_- = 1$ and $\mu_+ = (b/a)^2$.
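The factor $\mu = |d\zeta/dw|^2$ can be evaluated pointwise once the inverse mapping $\zeta(w)$ is available. The sketch below is an illustration only: it assumes $\zeta(w)$ is supplied as an analytic Python function and approximates the derivative by a central difference, whereas the paper uses a Schwarz-Christoffel mapping. The quadratic test map in the usage note is a hypothetical stand-in, not a tube geometry.

```python
def mu_factor(zeta, w, h=1e-6):
    """Approximate mu = |d zeta / d w|^2 at the complex point w.

    zeta: callable mapping the straight-tube coordinate w = u + i v to
          the physical coordinate zeta = x + i y (assumed analytic near w).
    h:    step of the central difference along the real direction.
    """
    derivative = (zeta(w + h) - zeta(w - h)) / (2.0 * h)
    return abs(derivative) ** 2

def refractive_index(zeta, w, h=1e-6):
    """Square root of mu, the factor multiplying k in the straight tube."""
    return mu_factor(zeta, w, h) ** 0.5
```

For the identity map the factor is 1 everywhere (a straight tube stays straight), and for the stand-in map `lambda w: w * w` the result at `w = 1 + 1j` is close to 8, the exact value of $|2w|^2$ there.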

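We use a first order description and rewrite (4.1a) as

$$\partial_u \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -B^2 & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix}. \qquad (4.2)$$

To avoid illposedness the decomposition $\phi = \phi^+ + \phi^-$ is introduced, which must be identical to the corresponding decomposition (3.3) in regions where $\mu$ is a constant. The new state variables $(\phi^+, \phi^-)$ are introduced via the linear relation

$$\begin{pmatrix} \phi \\ \partial_u\phi \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ iC & -iC \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}. \qquad (4.3)$$

Solving (4.3) for $\phi^+$ and $\phi^-$, taking the $u$-derivative and using (4.2), we find that

$$\partial_u \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^- \end{pmatrix}, \qquad (4.4)$$

where

$$\alpha = \tfrac{1}{2}\big[(\partial_u C^{-1})C + iC^{-1}B^2 + iC\big], \qquad \beta = \tfrac{1}{2}\big[-(\partial_u C^{-1})C + iC^{-1}B^2 - iC\big],$$

$$\gamma = \tfrac{1}{2}\big[-(\partial_u C^{-1})C - iC^{-1}B^2 + iC\big], \qquad \delta = \tfrac{1}{2}\big[(\partial_u C^{-1})C - iC^{-1}B^2 - iC\big]. \qquad (4.5)$$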

To generalize the concept of transmission operators we make them $u$-dependent, using a similar notation as Fishman [10]:

$$\begin{pmatrix} \phi^+(u_2) \\ \phi^-(u_1) \end{pmatrix} = \begin{pmatrix} T^+(u_2, u_1) & R^-(u_1, u_2) \\ R^+(u_2, u_1) & T^-(u_1, u_2) \end{pmatrix} \begin{pmatrix} \phi^+(u_1) \\ \phi^-(u_2) \end{pmatrix}, \qquad (4.6)$$

assuming that $u_1 < u_2$ and suppressing the explicit $v$-dependence. It is assumed for (4.6) that the scattering problem has a unique solution or that homogeneous solutions are removed. A homogeneous solution is usually called a bound state.

Next we find a differential equation for the scattering operators $T^+(u_2, u_1)$, $R^-(u_1, u_2)$, $R^+(u_2, u_1)$ and $T^-(u_1, u_2)$ in (4.6) using the invariant imbedding technique [11, 10]. It is required that the incoming wave from the right, $\phi^-(u_2)$, is vanishing. Then put $u_1 = u$, find $\partial_u\phi^-(u)$ from (4.6), and use (4.6) once more to obtain

$$\partial_u R^+(u_2, u) = \gamma + \delta R^+(u_2, u) - R^+(u_2, u)\alpha - R^+(u_2, u)\beta R^+(u_2, u). \qquad (4.7)$$

In a similar manner we get

$$\partial_u T^+(u_2, u) = -T^+(u_2, u)\alpha - T^+(u_2, u)\beta R^+(u_2, u). \qquad (4.8)$$

The stability properties of (4.7) and (4.8) are of central importance. In the flat regions, where $B = B_+$ or $B_-$, we have $C = B$ and $\partial_u C^{-1} = 0$, implying that $\beta = \gamma = 0$ and $\alpha = -\delta = iB$. Similarly, (4.7) and (4.8) reduce to $\partial_u X^+ = -iBX^+$, $X^+ = R^+$ or $T^+$, equations which are well-posed for marching to the left. The initial values to accompany (4.7) and (4.8) are $R^+(u_2, u_2) = 0$ and $T^+(u_2, u_2) = I$, where $I$ is the identity operator.
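The structure of the Riccati equation (4.7) can be illustrated in the scalar case, i.e. when everything is truncated to a single mode. The following is a sketch under stated assumptions, not the solver used in the paper (which relies on a standard MATLAB routine): the coefficients are supplied as hypothetical callables of $u$, the unknown is a complex number, and the marching uses a classical fourth-order Runge-Kutta step with a negative step size when going to the left.

```python
def march_riccati(alpha, beta, gamma, delta, u_start, u_end, steps, r_start=0j):
    """Integrate dR/du = gamma + delta*R - R*alpha - R*beta*R from u_start to u_end.

    alpha, beta, gamma, delta: callables u -> complex (scalar truncation, assumed).
    r_start: initial value, R(u_start); the text uses R = 0 at u = u_2.
    """
    def rhs(u, r):
        return gamma(u) + delta(u) * r - r * alpha(u) - r * beta(u) * r

    h = (u_end - u_start) / steps  # negative when marching to the left
    u, r = u_start, complex(r_start)
    for _ in range(steps):
        k1 = rhs(u, r)
        k2 = rhs(u + h / 2, r + h / 2 * k1)
        k3 = rhs(u + h / 2, r + h / 2 * k2)
        k4 = rhs(u + h, r + h * k3)
        r += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        u += h
    return r
```

Two quick checks: with constant coefficients $\gamma = 1$, $\beta = -1$, $\alpha = \delta = 0$ the equation becomes $R' = 1 + R^2$, whose solution from $R(0) = 0$ is $\tan u$; and with the flat-region values $\beta = \gamma = 0$, $\alpha = -\delta = iB$ the reflection stays exactly zero, as expected for a straight tube.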

We choose $C = B_- + f(u)(B_+ - B_-)$, which is independent of $v$. Here $f$ is increasing and smooth with $\lim_{u\to-\infty} f(u) = 0$ and $\lim_{u\to\infty} f(u) = 1$.

5 Solution of the scattering problem

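For the numerical solution of the scattering operator we expand $\phi$ in a Fourier sine series and $\mu$ in a Fourier cosine series,

$$\phi(u, v) = \sum_{n=1}^{\infty}\phi_n(u)\varphi_n(v), \qquad \mu(u, v) = \sum_{n=0}^{\infty}\mu_n(u)\xi_n(v), \qquad (5.1)$$

where $\xi_n(v) = \cos(n\pi v/a)$. Using the notation $\boldsymbol{\phi} = (\phi_1, \phi_2, \ldots)^T$ we find that

$$\partial_u^2\boldsymbol{\phi} + \mathbf{B}^2(u)\boldsymbol{\phi} = 0. \qquad (5.2)$$

The matrix elements of $\mathbf{B}^2(u)$ are given by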

k2 n2TT2

B2(u)nm = mdash [-fjm+n(u) - Hm-nu) - Hm + Hn-m(u)] ^Snm (53)

and it is understood in (5.3) that $\mu_l(u) = 0$ for negative $l$.

For the tube in the physical $\zeta$-plane we require that locally both the potential and the kinetic part of the energy are finite, that is both $\int_X |\psi|^2\,dx\,dy < \infty$ and $\int_X |\nabla\psi|^2\,dx\,dy < \infty$ for all finite regions $X$ inside the tube. We say that $\psi$ belongs to the Sobolev space $H^1_{loc}$, meaning that $\psi$ and its first derivatives are locally square integrable. Transformed to the straight duct, the local finite energy requirement means $\int_U |\phi|^2\mu\,du\,dv < \infty$ and $\int_U |\nabla\phi|^2\,du\,dv < \infty$ for all finite regions $U$ inside the tube. For a smooth boundary $\phi$ is more regular: it follows from the theory of Grisvard [12] that also the second derivatives of $\phi$ are square integrable, that is $\phi \in H^2_{loc}$. According to a graph theorem [13], $\phi \in H^2_{loc}$ implies that $\phi(u, \cdot) \in H^{3/2}(0, a)$, meaning that up to 3/2 derivatives are square integrable. To interpret this regularity with fractional derivatives we define, following Taylor [13], the function space

$$D_s = \Big\{ f \in L^2(0, a) : \sum_{n=1}^{\infty} |f_n|^2 (1 + n^2)^s < \infty \Big\}, \quad s \ge 0, \qquad (5.4)$$

with $f = \sum_{n=1}^{\infty} f_n\varphi_n$ and $f_n = (f, \varphi_n)/(\varphi_n, \varphi_n)$. $D_s$ is a Hilbert space with the norm

$$\|f\|_s^2 = (f, f)_s = \sum_{n=1}^{\infty} |f_n|^2 (1 + n^2)^s. \qquad (5.5)$$

Taylor [13] shows that $D_0 = L^2(0, a)$, $D_1 = H^1_0(0, a)$, $D_2 = H^2(0, a) \cap H^1_0(0, a)$ and that $\partial_v D_s = D_{s-1}$, $s \ge 1$. In this terminology we have that for a smooth boundary $\phi(u, \cdot) \in D_{3/2}$.
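The weighted norm (5.5) is straightforward to evaluate for a truncated coefficient sequence. The sketch below is illustrative: it assumes the sine basis $\varphi_n(v) = \sin(n\pi v/a)$ from (3.3), computes the coefficients $f_n = (f, \varphi_n)/(\varphi_n, \varphi_n)$ by a midpoint quadrature (a choice of this example), and sums only finitely many terms, so it gives a lower bound on the full norm.

```python
import math

def sine_coefficients(f, a, n_max, samples=2000):
    """Approximate f_n = (2/a) * integral_0^a f(v) sin(n pi v / a) dv, n = 1..n_max."""
    h = a / samples
    coeffs = []
    for n in range(1, n_max + 1):
        total = 0.0
        for j in range(samples):
            v = (j + 0.5) * h  # midpoint of the j-th subinterval
            total += f(v) * math.sin(n * math.pi * v / a)
        coeffs.append(2.0 / a * total * h)
    return coeffs

def ds_norm_squared(coeffs, s):
    """Truncated version of (5.5): sum of |f_n|^2 (1 + n^2)^s over the given f_n."""
    return sum(abs(fn) ** 2 * (1 + n * n) ** s
               for n, fn in enumerate(coeffs, start=1))
```

Larger `s` weights the high modes more strongly, so a function whose coefficients decay slowly (for instance near a corner) has a finite truncated sum for every cutoff but a sum that keeps growing as more modes are included.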

The operator $\partial_v^2$ is self-adjoint on $D_{3/2}$. Thus we may define $B_\pm$ by

$$B_\pm f = \sum_{n=1}^{\infty} \sqrt{k^2\mu_\pm - n^2\pi^2/a^2}\, f_n\varphi_n, \qquad (5.6)$$

assuming that the branch Im $\ge 0$ of the square root is taken. It is clear that $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_{3/2} \to D_{3/2}$ and $B_\pm : D_s \to D_{s-1}$, $s \ge 1$.

For tubes with edges in the $\zeta$-duct things are a little more complicated. With no restriction on the sharpness of the edges we cannot improve on $\phi \in H^1_{loc}$, implying $\phi(u, \cdot) \in D_{1/2}$. Then, as an intermediate step in our calculations, $B_\pm\phi$ should be in the space $D_{-1/2}$. Such a derivative must of course be interpreted as a distribution. However, the end result, i.e. the scattered wave function, belongs to $D_{1/2}$. To generalise, we define by duality for positive $s$

$$D_{-s} = \Big\{ g : \int f(v)g(v)\,dv < \infty \ \text{for all } f \in D_s \Big\}.$$

Multiplication by $\mu$ is an operator $D_{1/2} \to D_{-1/2}$, and if $s \ge 1/2$ we have the following mapping properties: $B_\pm : D_s \to D_{s-1}$, $\partial_v : D_s \to D_{s-1}$, and $T^+$, $R^-$, $R^+$ and $T^-$ are mappings $D_s \to D_s$.

310

The equations (4.7)-(4.8) can only in very special cases be solved in closed form. Therefore some type of numerical scheme is used. Generally, a numerical method cannot give uniform convergence for the entire space $D^s$. In a practical application it is usually sufficient to know the effect of the scattering matrices on the lowest eigenfunctions, the first $N_0$, say. A practical method is therefore to truncate the matrix representation of (4.7)-(4.8) to $N \gg N_0$ and solve the finite-dimensional ordinary differential equation with a standard numerical routine. Nilsson [3] proves that such a procedure converges when $N \to \infty$.

Presently, numerical results are not available for the quantum tube scattering. However, Nilsson [3] presents results for the acoustic case, where the Neumann rather than the Dirichlet boundary condition applies. He reports that, for the lowest-order reflection coefficient, $N = 1$, i.e. a scalar solution, is accurate up to $ka = 1.5$, $N = 2$ gives a good and $N = 5$ gives a perfect description up to $ka = 6$. Energy conservation holds for all $N$.

References

1. J. T. Londergan, J. P. Carini, D. P. Murdock, Binding and scattering in two-dimensional systems: Applications to quantum wires, waveguides and photonic crystals, Lecture Notes in Physics (Berlin, Springer, 1999).

2. K. Lin, R. L. Jaffe, Bound states and threshold resonances in quantum wires with circular bends, Phys. Rev. B 54, 5750-5762 (1996).

3. B. Nilsson, Acoustic transmission in curved ducts with varying cross-sections, article submitted to Proc. Roy. Soc. A.

4. J. C. Wu, M. N. Wybourne, W. Yindeepol, A. Weisshaar, S. M. Goodnick, Interference phenomena due to a double bend in a quantum wire, Appl. Phys. Lett. 59, 102-104 (1991).

5. J. Davies, The Physics of Low-Dimensional Semiconductors (Cambridge, Cambridge University Press, 1998).

6. M. Cessenat, Mathematical methods in electromagnetism (Singapore, World Scientific Publishing Co., 1996).

7. B. Nilsson, O. Brander, The propagation of sound in cylindrical ducts with mean flow and bulk reacting lining - IV. Several interacting discontinuities, IMA J. Appl. Math. 27, 263-289 (1981).

8. H. Wu, D. W. L. Sprung, J. Martorell, Periodic quantum wires and their quasi-one-dimensional nature, J. Phys. D: Appl. Phys. 26, 798-803 (1993).

9. P. Henrici, Applied and computational complex analysis, Volume I (New York, John Wiley & Sons, 1988).

10. L. Fishman, One-way propagation methods in direct and inverse scalar wave propagation modeling, Radio Science 28(5), 865-876 (1993).

11. R. Bellman, G. M. Wing, An introduction to invariant imbedding, Classics in Applied Mathematics 8 (Philadelphia, Society for Industrial and Applied Mathematics (SIAM), 1992).

12. P. Grisvard, Elliptic problems in nonsmooth domains, Monographs and Studies in Mathematics 24 (Boston, Pitman, 1985).

13. M. Taylor, Partial differential equations I: Basic theory, Applied Mathematical Sciences 115 (New York, Springer, 1996).

Figure 1. Two-dimensional quantum tube.

Figure 2. Schematic picture of heterostructure and split-gate structure. Layer labels: doped AlGaAs, undoped AlGaAs, undoped GaAs, semi-insulating GaAs.

Figure 3. Sub-tube with interior $\Omega$, upper boundary $\Gamma_+$ and lower boundary $\Gamma_-$; $b/a = 0.6$.

Figure 4. $\mu(u,v)$ in the straight duct. Parameters as in figure 3. $\mu^{1/2}$ is the refractive index.

POSITION EIGENSTATES AND THE STATISTICAL AXIOM OF QUANTUM MECHANICS

L. POLLEY, Physics Dept., Oldenburg University, 26111 Oldenburg, Germany

E-mail: polley@uni-oldenburg.de

Quantum mechanics postulates the existence of states determined by a particle position at a single time. This very concept, in conjunction with superposition, induces much of the quantum-mechanical structure. In particular, it implies the time evolution to obey the Schrödinger equation, and it can be used to complete a truly basic derivation of the statistical axiom as recently proposed by Deutsch.

1 Quantum probabilities according to Deutsch

A basic argument to see why quantum-mechanical probabilities must be squares of amplitudes (statistical axiom) was given by Deutsch [1,2]. It is independent of the many-worlds interpretation. Deutsch considers a superposition of the form

$$\sqrt{\frac{m}{m+n}}\,|A\rangle + \sqrt{\frac{n}{m+n}}\,|B\rangle. \qquad (1.1)$$

He introduces an auxiliary degree of freedom, $i = 1, \ldots, m+n$, and replaces $|A\rangle$ and $|B\rangle$ by normalized superpositions:

$$|A\rangle \to \frac{1}{\sqrt{m}} \sum_{i=1}^{m} |A\rangle|i\rangle, \qquad |B\rangle \to \frac{1}{\sqrt{n}} \sum_{i=m+1}^{m+n} |B\rangle|i\rangle. \qquad (1.2)$$

All amplitudes in the grand superposition are equal to $1/\sqrt{m+n}$ and should result in equal probabilities for the detection of the states. This immediately implies the ratio $m : n$ for the probabilities of property $A$ or $B$.

The argument has clear advantages over previous derivations of the statistical axiom. Gleason's theorem [3,4], for example, is mathematically non-trivial and not well received by many physicists, while von Neumann's assumption $\langle O_1 + O_2 \rangle = \langle O_1 \rangle + \langle O_2 \rangle$ about expectations of observables [5,6] is difficult to interpret physicswise if $O_1$ and $O_2$ are non-commuting [4,5].

However, Deutsch's argument relies in an essential way on the unitarity of the replacement, or the normalization of any physical state vector. Why should a state vector be normalized in the usual sense of summing the squares of amplitudes? It would seem desirable to provide justification for this beyond its being natural [2]. In fact, the reasoning would appear circular without an extra argument about unitarity or normalization. I have proposed [7] to realize the replacement (1.2) physically by the time evolution of a suitable device. Then, what can be said about quantum-mechanical evolution without anticipating the unitarity?

2 Schrödinger's equation for a free particle as a consequence of position eigenstates

For free particles, a well-known and elegant way to obtain the Schrödinger equation is via unitary representations of space-time symmetries. Interactions can be introduced via the principle of local gauge invariance. However, this approach to the equation anticipates unitarity.

As I pointed out recently [8], the Schrödinger equation for a free scalar particle is also a consequence of the very concept of a position eigenstate (footnote a) in discretized space. To an extent, this just means to regard hopping amplitudes, as they are familiar from solid state theory, as a priori quantum-dynamical entities. The point is to show, however, that a hopping-parameter scenario without unitarity would lead to consequences sufficiently absurd to imply that unitarity must be a property of the physical system. As will be seen below, the absurdity is that a wave-function that makes perfect sense at $t = 0$ would cease to exist anywhere in space at an earlier or later time.

Consider a spinless particle hopping on a 1-dimensional chain of positions $x = na$, where $n$ is an integer and $a$ is the lattice spacing.

[Diagram: chain of lattice sites ..., n-1, n, n+1, ... with spacing a]

Assume the particle is in an eigenstate $|n,t\rangle$ of position number $n$ at time $t$ (using the Heisenberg picture), and it has a possibility to change its position. The information given by a position at one time does not determine which direction the particle should go. Thus the eigenstate $|n,t\rangle$ necessarily is a superposition when expressed in terms of eigenstates relating to another time $t'$. Moreover, because of the same lack of information, positions to the left and right will have to occur symmetrically. If $t' \to t$, only nearest neighbours will be involved. Thus we expect a hopping equation of the form

$$|n,t\rangle = \alpha\,|n,t'\rangle + \beta\,|n+1,t'\rangle + \beta\,|n-1,t'\rangle.$$

This can be rewritten as a differential equation in $t$:

$$-i\frac{d}{dt}|n,t\rangle = V\,|n,t\rangle + \kappa\,|n+1,t\rangle + \kappa\,|n-1,t\rangle, \qquad \kappa, V \ \text{complex (so far)}.$$

Footnote a: Which relies on linear algebra, hence includes the concept of superposition.

Parameters $\alpha, \beta$ and $\kappa, V$ are in an algebraic relation [8] which need not concern us here. To obtain an equation for a wave-function we consider a general state $|\psi\rangle$ composed of simultaneous position eigenstates:

$$|\psi\rangle = \sum_n \psi(n,t)\,|n,t\rangle \qquad \text{(Heisenberg picture)}.$$

This defines the coefficients $\psi(n,t)$ for all $t$. Now take the time derivative on both sides, identify $\psi(n,t)$ with a function $\psi(x,t)$ where $x = na$, and Taylor-expand the shifted values $\psi(x \pm a, t)$. This results in a differential equation for $\psi(x,t)$ whose leading spatial term is of order $a^2$.
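The expansion used in this step is elementary and can be written out as a sketch; only the symmetric sum of the two shifted values is shown, and nothing beyond Taylor's theorem is assumed:

```latex
\psi(x+a,t) + \psi(x-a,t)
  = 2\,\psi(x,t) + a^2\,\frac{\partial^2 \psi}{\partial x^2}(x,t) + O(a^4)
```

The odd-order terms cancel because left and right neighbours enter with the same amplitude $\kappa$. The constant part combines with the on-site term into $V + 2\kappa$, and the $a^2$ term carries the spatial spreading.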

Finally, take $a \to 0$ on the relevant physical scale. The spatial spreading of the wave-function is then given by the $a^2$ term, and the solution of the equation is

$$\psi(x,t) = e^{-i(V+2\kappa)t} \int \tilde\psi(p)\, e^{ipx}\, e^{-ia^2\kappa p^2 t}\, dp.$$

This time evolution would be unitary if $\kappa$ and $V$ were real. Hence, consider the consequences of a non-real $\kappa$. The integrand would then contain an evolution factor increasing towards positive or negative times like

$$\exp\left(\pm a^2\, \mathrm{Im}\,\kappa\; p^2 t\right).$$

This would lead to physically absurd conclusions about certain harmless wave-functions like the Lorentz-shape function $\psi(x) = 1/(1+x^2)$:

• For $\mathrm{Im}\,\kappa > 0$, the harmless function $\tilde\psi(p) \propto \exp(-|p|)$ would not exist anywhere in space after a short while.

• For $\mathrm{Im}\,\kappa < 0$, the harmless function could not be prepared for an experiment to be carried out on it after a short while.
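The loss of existence can be illustrated numerically. The sketch below takes the modulus of the evolution factor to be $\exp(a^2\,\mathrm{Im}\,\kappa\,p^2 t)$ and integrates $|\tilde\psi(p)|^2$ times its square over a finite momentum window; the parameter `growth`, standing for $a^2\,\mathrm{Im}\,\kappa\,t$, and the function name are hypothetical. If the product is positive, the integral grows without bound as the window widens.

```python
import math

def truncated_norm(growth, cutoff, steps=20000):
    """Trapezoidal estimate of the integral over [-cutoff, cutoff] of
    exp(-2|p|) * exp(2 * growth * p^2).

    exp(-2|p|) is |psi~(p)|^2 for psi~(p) = exp(-|p|); `growth` stands for
    a^2 * Im(kappa) * t (an illustrative parameter, not from the paper).
    """
    h = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        p = -cutoff + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-2.0 * abs(p) + 2.0 * growth * p * p)
    return total * h

if __name__ == "__main__":
    for cutoff in (10.0, 20.0, 40.0):
        print(cutoff, truncated_norm(0.0, cutoff), truncated_norm(0.1, cutoff))
```

With `growth = 0` the result stays at the normalization integral of $e^{-2|p|}$, while any positive value makes the result depend violently on the cutoff, so no limit exists.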

In a mathematical sense, of course, it still remains a postulate that the value of $\kappa$ be real. But physicswise it does seem that unitarity of quantum mechanics is unavoidable once the superposition principle and the concept of position eigenstate are taken for granted.

As for parameter $V$, the factor $e^{-iVt}$ would be raised to the $n$th power in an $n$-particle state, and would lead to an absurdity similar to the above with certain superpositions of $n$-particle states, unless $V$ is real, too.

3 Driven particle: Weyl equation in general space-time

As an example of a particle interacting with external fields, we may consider a massless spin 1/2 particle with inhomogeneous hopping conditions [8]. Here the starting point is common eigenstates of spin and position, where position refers to a site on a cubic spatial lattice. A particle in such a state at time $t$ will be in a superposition of neighbouring positions and flipped spins at a time $t' \approx t$. In 3 dimensions, and immediately in terms of a wave-function, the corresponding differential equation is

$$-i\frac{\partial}{\partial t}\psi_s(\mathbf{x},t) = \sum_{\text{lattice directions } n} H_{nss'}\, \psi_{s'}(\mathbf{x} - a\mathbf{n}, t),$$

where $H_{nss'}$ are any complex amplitudes. On-site hopping (time-like direction) is included as $n = 0$. To begin with, a free particle is defined by translational and rotational symmetry. In this case the hopping amplitudes reduce to two independent parameters [8], $\epsilon$ and $\kappa$, both of them complex so far. By Taylor-expanding the wave-function and taking $a \to 0$ we find

$$\partial_t \psi_s(\mathbf{x},t) = \epsilon\,\psi_s(\mathbf{x},t) - a\kappa\,\sigma^n_{ss'}\,\partial_n \psi_{s'}(\mathbf{x},t).$$

If $\kappa$ had an imaginary part, it would lead to physical absurdities with the time-evolution of certain harmless wave-functions, similarly to the previous section. For real $\kappa$ we recover the non-interacting Weyl equation.

If we now admit for slight (order of $a$) anisotropies and inhomogeneities in the hopping amplitudes, by adding some $a\,\gamma_{nss'}(\mathbf{x},t)$ to the hopping constants above, we recover a general-relativistic version of the equation [9], with the $\gamma_{nss'}(\mathbf{x},t)$ acting as spin connection coefficients. Unitarity in this context means that the probability current density

$$j^\alpha(\mathbf{x},t) = \psi^\dagger(\mathbf{x},t)\,\sigma^\alpha\,\psi(\mathbf{x},t)$$

is covariantly conserved:

$$\partial_\alpha j^\alpha + \Gamma^{\alpha}_{\beta\alpha}\, j^\beta = 0.$$

This is found to hold automatically if the vector connection coefficients are identified, as usual [9], through a matrix equation.

Imposing no constraints on the spin connection coefficients, we are dealing with a metric-affine space-time here, which can have torsion and whose metric may be covariantly non-constant. The study of space-times of this general structure has been motivated by problems of quantum gravity [9]. It may be interesting to note that nothing but propagation by superposing next-neighbour states needs to be assumed here. In particular, scalar products of state vectors are not needed.

Figure 1. An array of eight cavities of equal shape. The initial state is located in the central cavity. When each channel is opened for an appropriate time, the state evolves to an equal-amplitude superposition of the peripheral cavity-states.

4 Realizing Deutsch's substitution as a time evolution

Having demonstrated automatic unitarity on two rather general examples, we can now turn with some confidence to the original issue of completing Deutsch's derivation of the statistical axiom.

To realize the particular substitution (1.2) for state vector (1.1), let us consider a particle with internal eigenstate $|A\rangle$ or $|B\rangle$, such as the polarisations of a photon. Let this particle be placed in a system of cavities (footnote b), connected by channels (Fig. 1) which can be opened selectively for internal state $|A\rangle$ or $|B\rangle$. It will be essential in the following that all cavities are of the same shape, because this will enable us to exploit symmetries to a large extent. The location of the particle in a cavity will serve as the auxiliary degree of freedom as in (1.2), except that $|A\rangle$ and $|B\rangle$ before the substitution will be identified with $|A\rangle|0\rangle$ and $|B\rangle|0\rangle$, where $|0\rangle$ corresponds to the central cavity.

Footnote b: Or Paul traps, or any other sort of potential well; these are to enable us to store away parts of the wave function so that there is no influence on them by the other parts.

Now let only one of the channels be open at a time. We are then dealing with the wave-function dynamics of a two-cavity subsystem, while the rest of the wave-function is standing by. What law of evolution could we expect? A particle with a well-defined (observed) position 0 at time $t$ will no longer have a well-defined position at time $t'$ if we allow it to pass through a channel without observing it. Thus a state $|0,t\rangle$, defined by position 0 at time $t$ (using the Heisenberg picture), will be a superposition when expressed in terms of position states relating to a different time $t'$. In particular, if channel $0 \leftrightarrow 1$ is the open one,

$$|0,t\rangle = \alpha\,|0,t'\rangle + \beta\,|1,t'\rangle.$$

Likewise, by symmetry of arrangement,

$$|1,t\rangle = \alpha\,|1,t'\rangle + \beta\,|0,t'\rangle.$$

It follows that $|0,t\rangle \pm |1,t\rangle$ are stationary states whose dependence on time consists in prefactors

$$(\alpha \pm \beta)^k \quad \text{after } k \text{ time steps.} \qquad (4.1)$$

If the particle is initially in the rest of the cavities, whose channels are shut, we would expect this state not to change with time:

$$|\mathrm{rest},t\rangle = |\mathrm{rest},t'\rangle.$$

Now, if (4.1) were not mere phase factors, we could easily construct a superposition of $|0\rangle$, $|1\rangle$ and $|\mathrm{rest}\rangle$ so that, relative to the disconnected cavities, the part of the state vector in the connected cavities would grow indefinitely or vanish in the long run. As there is no physical reason for such an imbalance between the connected and the disconnected cavities, we conclude that

$$\alpha + \beta = e^{iu}, \qquad \alpha - \beta = e^{iv}.$$
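The role of the prefactors $(\alpha \pm \beta)^k$ can be checked with a short sketch. As a simplification, it treats the symmetric matrix with diagonal entries $\alpha$ and off-diagonal entries $\beta$ as the one-step map on the two cavity amplitudes; the function names and the sample values of $\alpha$, $\beta$ are hypothetical.

```python
import math

def step(state, alpha, beta):
    """One hopping step on the two connected cavities (symmetric 2x2 map)."""
    c0, c1 = state
    return (alpha * c0 + beta * c1, beta * c0 + alpha * c1)

def evolve(state, alpha, beta, k):
    """Apply k identical hopping steps."""
    for _ in range(k):
        state = step(state, alpha, beta)
    return state

def weight(state):
    """Sum of squared moduli of the two amplitudes."""
    return sum(abs(c) ** 2 for c in state)

if __name__ == "__main__":
    theta = 0.3
    # alpha +/- beta = exp(+/- i theta): pure phases, weight is conserved.
    print(weight(evolve((1.0, 0.0), math.cos(theta), 1j * math.sin(theta), 50)))
    # alpha +/- beta = 1.1 and 0.7: one combination grows, the other decays.
    print(evolve((1.0, 1.0), 0.9, 0.2, 10), evolve((1.0, -1.0), 0.9, 0.2, 10))
```

Only in the first case does the connected pair keep a fixed weight relative to an untouched $|\mathrm{rest}\rangle$ component, which is the balance the argument appeals to.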

Having shown evolution through one open channel to be unitary, we can identify an opening time interval [7] $\tau_m$ to realize the following step of the replacement (1.2):

$$\sqrt{m}\,|A\rangle|0\rangle + |\mathrm{rest}\rangle \;\longrightarrow\; \sqrt{m-1}\,|A\rangle|0\rangle + |A\rangle|1\rangle + |\mathrm{rest}\rangle.$$

Here $|\mathrm{rest}\rangle$ stands for state vectors that are decoupled, such as all $|B\rangle|i\rangle$ and all $|A\rangle|i\rangle$ with $i \neq 0, 1$. Opening other channels analogously, each one for the appropriate $\tau_m$ and internal state, we produce an equal-amplitude superposition

$$\sum_{i=1}^{m} |A\rangle|i\rangle + \sum_{i=m+1}^{m+n} |B\rangle|i\rangle.$$

The probability of finding the particle in a particular cavity is now $1/(m+n)$ as a matter of symmetry. As the internal state is correlated with a cavity by the conduction of the process, the probabilities for $A$ and $B$ immediately follow. These must also be the probabilities for finding $A$ or $B$ in the original state, because properties $A$ and $B$ have remained unchanged during the time evolution.
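The successive channel openings can be sketched as follows. The code assumes that each opening acts on the pair (central cavity, opened cavity) as a unitary map with entries $\cos\theta$ on the diagonal and $i\sin\theta$ off the diagonal, with $\theta$ chosen so that exactly one unit of amplitude leaves the centre; this parametrization and all function names are assumptions for illustration. The final probabilities are obtained by counting equal-amplitude cavities, not by normalizing.

```python
import math

def open_channel(c0, remaining):
    """Move one unit of amplitude from the centre into a fresh cavity.

    `remaining` is the number of units still stored in the central cavity;
    cos(theta) = sqrt((remaining - 1) / remaining) is an illustrative choice.
    """
    cos_t = math.sqrt((remaining - 1) / remaining)
    sin_t = math.sqrt(1.0 / remaining)
    return cos_t * c0, 1j * sin_t * c0  # (new central, peripheral amplitude)

def spread(count):
    """Distribute amplitude sqrt(count) over `count` peripheral cavities."""
    c0 = math.sqrt(count)
    peripheral = []
    for remaining in range(count, 0, -1):
        c0, c_out = open_channel(c0, remaining)
        peripheral.append(c_out)
    return c0, peripheral

def probabilities(m, n):
    """Probabilities of A and B by counting equally weighted cavities."""
    amps = [("A", c) for c in spread(m)[1]] + [("B", c) for c in spread(n)[1]]
    mags = [abs(c) for _, c in amps]
    assert max(mags) - min(mags) < 1e-9  # equal-amplitude superposition
    p_a = sum(1 for label, _ in amps if label == "A") / len(amps)
    return p_a, 1.0 - p_a

if __name__ == "__main__":
    print(probabilities(3, 5))
```

Starting from unnormalized amplitudes $\sqrt{m}$ and $\sqrt{n}$, every peripheral cavity ends with an amplitude of unit modulus and the central cavity is emptied, so the ratio $m : n$ follows from symmetry alone.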

5 Can normalization be replaced by symmetry?

An interesting side effect of the above realization of Deutsch's argument is that state vectors need no longer be normalized at all. Permutational symmetry of a superposition suffices to show that all possible outcomes of an experiment must occur with equal frequency. Then the numerical values of the probabilities are fully determined. This feature of quantum probabilities may be relevant to problems of normalization in quantum gravity [10], such as the non-locality of summing $|\psi|^2$ over all of space, or the non-normalizability of the solutions of the Wheeler-DeWitt equation.

References

1. D. Deutsch, Proc. Roy. Soc. Lond. A 455, 3129 (1999); Oxford preprint (1989).

2. B. DeWitt, Int. J. Mod. Phys. A 13, 1881 (1998).

3. A. M. Gleason, J. Math. Mech. 6, 885 (1957).

4. A. Peres, Quantum Theory (Kluwer Academic Publishers, Dordrecht, 1995).

5. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin-New York, 1932).

6. A. Bohr, O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).

7. L. Polley, quant-ph/9906124.

8. L. Polley, quant-ph/0005051.

9. F. W. Hehl et al., Rev. Mod. Phys. 48, 393 (1976); Phys. Rep. 258, 1 (1995).

10. A. Ashtekar (ed.), Conceptual problems of quantum gravity (Birkhäuser, 1991).

IS RANDOM EVENT THE CORE QUESTION? SOME REMARKS AND A PROPOSAL

P. ROCCHI

IBM, via Shangai 53, 00144 Roma, Italy. E-mail: paolorocchi@it.ibm.com

This work addresses the foundations of the Probability Calculus. We begin by considering how the event models in use today relate to physical reality. Then we propose the structural model of the event and a definition of probability that harmonizes the interpretations sustained by different probabilistic schools.

1 Preface

The origin of the Probability Calculus is credited to Pascal, who applied rigorous methods to a matter that until then had been grasped only by gamblers and unreliable individuals. He intended to lay the foundations of a new Geometry, and the random event was to be a point in this hypothetical abstract science. Throughout the centuries several scientists shared Pascal's conjecture, which has been accepted without discussion. Instead, in our opinion, an exhaustive and systematic approach to probability requires us to investigate the argument of probability before examining probability itself. The probability theories do not diverge in their final results: they do not provide different formulas for the total probability and the conditional probability. Instead they are in contrast on the foundations, that is, in the initial concepts, and this circumstance seems to us a substantial reason to study the random event.

In brief, we may say that the probability theories use two main models of the random event: the linguistic model and the set model. We shall examine them in the ensuing sections. However, we do not restrict our work to mere criticism; we shall also trace a theoretical proposal. It provides a new mathematical model of the random event and a definition of probability which seems capable of harmonizing the various authors who today appear in contrast: Kolmogorov and the frequentists, the subjectivist and objectivist schools, etc. In this article we present a few elements taken from the complete theoretical framework [11].

2 Linguistic Model

In general, different sentences can describe the same random event. Let the propositions p, q, ... regard one event and verify the equivalence relationship

p ⇔ q (1)

They form the equivalence class X

X = {p, q, ...} (2)

that constitutes the model of the random event, so that we have

P = P(X) (3)

We share the opinion that random events are extremely complex, and the linguistic model (2) is consistent with this feature. Disciplines which investigate complicated phenomena, such as psychology and sociology, business management and medicine, adopt the linguistic representation and consider other schemes to be too simple and reductive. The proposition seems an adequate model, except for the following perplexity. Each primitive is a simple idea and can be left to intuition only because of this fundamental property. For example, a number, a point, an entity are elementary concepts. Can we declare that the random event is complex and at the same time assume it to be a primary concept? The acknowledgement of complexity opposes the assumption of primitiveness. This contrast would at least require an in-depth justification, which instead is lacking as far as we know.

The inconsistency is confirmed in everyday practice, and we examine the linguistic model in relation to the facts.

2.1) - Some subjectivists declare that each particular of the event should be described in order to make its uniqueness evident, whereas in usual calculations we accept a sentence such as

The coin comes down heads (4)

Note that only two items are reported: the coin and the result. The precise date, time, place and all the particulars that make the event unique and unrepeatable remain implicit. In fact, the parts of a probabilistic event are not easy to distinguish and to relate in a sentence. In conclusion, a gap exists between the theoretical assertions and the practical applications of (2).

2.2) - In the Logic of Predicates every phrase has a precise meaning and is liable to be calculated. Programmers using Prolog and Lisp develop inferences. Logical programs can deduce the thesis from the hypothesis using precise clauses. However, this linguistic precision constitutes an exception, and normally natural language is approximate to the extent that a word must be interpreted. Natural language usually represents a random event in generic terms, whereas the linguistic model (2) should be liable to the probability calculation (3).

3 Ensemble Model

The axiomatic theory [8] assumes that the sample space Ω includes all the possible elementary events. Kolmogorov defines the random event X as a set of particular events Ex

X = {Ex} (5)

where X is a subset of Ω

X ⊂ Ω (6)

and the probability is the measure of X

P = P(X) (7)

The practical application of the theory is immediately clarified by Kolmogorov, who defines X as the result of the event.

3.1) - This conception causes some perplexities in the light of modern systemic studies. Applied and theoretical works on systems [7] regard the event as the dynamic producing the result from the antecedent item:

input → EVENT → output (8)

The result is a part and the event is the whole. The properties of the event are evidently quite different from the properties of the output. We encounter heavy difficulties when we call {Ex} a set of events and at the same time conceive it as a set of results. We cannot merge them without a logical justification. But do we have any?

3.2) - Some probabilistic outcomes cannot be properly modeled as sets and subsets. The spectrum of interference in the two-slit experiment is a well-known case emerging in Quantum Physics [6].

4 Structural Model

We searched for a solution to the above-mentioned difficulties and we designed a theoretical framework based on the structural model for the random event.

Ludwig von Bertalanffy, father of the General Systems Theory, conceives a system, and consequently an event, as an intricate set of items which affect one another [2]. Interacting and connecting is the essential character and the inner nature of events, and we take this idea as the basis of our theoretical proposal. We make the following assumption:

Axiom 4.1) - The idea of relating, of connecting, of linking is a primitive.

This idea suggests two elements, specialized in relating and in being related, that we call entity and relationship. We define them as follows:

Definition 4.2) - The relationship R connects the entities, and we say R has the property of connecting.

Definition 4.3) - The entity E is connected by R, and we say E has the property of being connected.

Intuitively we may say R is the active element and E is the passive one. They are symmetric, complementary and complete, since they exhaust the applications of Axiom 4.1). Relationships and entities are already known in Algebra as operations and elements, as arrows and objects, as edges and vertices. The main difference is that all of those are given as primitives, while R and E derive from the axiomatic concept 4.1). In other words, the properties of the relationship and the entity are openly given in 4.2) and 4.3), while they are implicit in other theories. We underline that Axiom 4.1) is not a theoretical refinement and will provide the necessary basis for the ensuing inferences.

From Definitions 4.2) and 4.3) it follows that the relationship R links the entity E and they give the set

S = (E; R) (9)

which is an algebraic structure [4]. In this article we discuss theoretical models with respect to the physical reality, thus we immediately examine how E, R and S provide proper models for events. The parts of an event are entities and relationships. As an example, an entity is a die, a spade, heads, tails, a product. The relationship that connects two or more entities is, for example, a device, a force, a physical interaction [3]. In the physical reality an event is a dynamic phenomenon linking Ein to Eout, and from (9) we can deduce this general structure

S = (Ein, Eout; R) (10)

Using a graph we get

Ein --R--> Eout (11)

R is the pivotal element in (10) and (11), and the structural model represents the facts accurately. In addition we get the following advantages:

1. The result Eout is distinct from the event S. The parts and the whole are logically separate, and they give a precise answer to objection 3.1).

2. Relations and entities constitute finite and also infinite sets, so that R and E match both discrete and continuous mathematical formalism.

3. When Eout is an ensemble

Eout = {Ex} (12)
Eout ⊂ Ω (13)

the structure encompasses the set model in (5) and (6).

4. The result Eout may also be a rational or an irrational number, a real or an imaginary value. It can be calculated by a wave function or by another function, etc., and we can offer a formal solution to point 3.2).

5. The structure S can include the comprehensive context of the probabilistic event. E.g. the atomic experiment depends upon the observer Eo, and we have this exhaustive structure

S = (Ein, Eout, Eo; R) (14)

We believe that the structural model can give a contribution to Quantum Probability.

6. A simple sentence includes nouns that are entities and a verb representing a dynamical evolution. E.g. (4) expresses the following entities and relationship:

The coin (Ein) comes down (R) heads (Eout) (15)

In short, the algebraic structure encompasses the linguistic model. However, a sentence can be equivocal, whereas the structure S is a rigorous formalism and answers point 2.2).

Note that the set (9) has the associative/dissociative property: the event is a single whole (unicum) S, and it is then defined in terms of the details E and R. If this analysis is insufficient, we reveal the entities (E1, E2, ..., Em) and the relations (R1, R2, ..., Rp); these are exploded at a greater level, and so forth. The structure of levels is the complete and rigorous model of any event

S = (E; R) = (E1, E2, ..., Em; R1, R2, ..., Rp) = (E11, E12, ..., Em1, Em2, ..., Emk; R11, R12, ..., Rp1, Rp2, ..., Rph) (16)

The structure can also be written as

level 0: S
level 1: E; R
level 2: E1, E2, ..., Em; R1, R2, ..., Rp
level 3: E11, E12, ..., Em1, Em2, ..., Emk; R11, R12, ..., Rp1, Rp2, ..., Rph (17)

The multiple-level decomposition is also known in the literature as the hierarchical property [13]. It is applied by professionals in software analysis methodologies [14][10], it is basic in modern ontology [12] and in various other sectors [1]. The progressive explosion of the event is already known in the Probability Calculus, where we use trees connecting the parts and the subparts of a random event. For example, an urn contains x red balls, y green balls and z white balls. What is the probability of getting one white and two green balls through three draws?

We consider the drawing Rw of a white ball w and Rg of a green ball g. The winning combinations wgg, gwg, ggw are generated by R1, R2 and R3. Intuitively we write this tree connecting three levels:

S
|-- R1: Rw Rg Rg
|-- R2: Rg Rw Rg
|-- R3: Rg Rg Rw (18)

The structure of levels (17) is rigorous and complete. It includes the relations of the event as well as the entities:

level 0: S
level 1: g, w; R1 + R2 + R3
level 2: wgg, gwg, ggw; (Rw Rg Rg) + (Rg Rw Rg) + (Rg Rg Rw) (19)

Thanks to this completeness, the structural model provides some insight into what is involved. In particular, if Rx at level k includes the subrelationships of level (k + 1), then Rx connects the entities through these subrelationships. E.g. the structure of levels (19) illustrates the dynamic R1 carried out by (Rw Rg Rg), which physically determine the results. The structure (16) shows that any event is composed of precise macromechanisms and micromechanisms. Any event appears like an industrial apparatus, a mechanical clock or an electronic device including various working parts. This operational analysis, which is based on Axiom 4.1), will be fundamental in the next section.
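The urn tree can be evaluated branch by branch. The sketch below assumes draws without replacement, which the text does not specify, and uses hypothetical function names; each branch R1, R2, R3 is a product of elementary draws, and the event probability is their sum.

```python
from fractions import Fraction
from math import comb

def path_probability(sequence, counts):
    """Probability of drawing the colours in `sequence` in that order,
    without replacement (an assumption; the text leaves this open)."""
    remaining = dict(counts)
    total = sum(remaining.values())
    prob = Fraction(1)
    for colour in sequence:
        prob *= Fraction(remaining[colour], total)
        remaining[colour] -= 1
        total -= 1
    return prob

def one_white_two_green(x, y, z):
    """Sum over the three branches wgg, gwg, ggw of the tree."""
    counts = {"r": x, "g": y, "w": z}
    branches = ["wgg", "gwg", "ggw"]  # generated by R1, R2, R3
    return sum(path_probability(b, counts) for b in branches)

if __name__ == "__main__":
    print(one_white_two_green(3, 4, 5))
```

The branch sum agrees with the direct counting formula C(y,2) C(z,1) / C(x+y+z,3), which gives a quick consistency check on the tree.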

5 Certain and Uncertain Structures

Probability is the answer to questions of this kind: Who will win the next football match? Who will be voted in the regional elections? Shall I pass the examinations? Where is the photon now?

These questions show that probability concerns the particulars of an event that is already known as a whole. We see the overall random phenomenon, but we ignore the details that will produce the result. When we ask who will win the next match, we are familiar with the match: we already know the teams which will play, where the match will be held, etc. We master the event; however, we do not have the details that will determine the result. Why do we not have the details?

The cognitive difficulties related to the particulars of a random event have several origins. For example, memory is generic, the reports are not detailed, the particulars are missing because they are disseminated over a vast area, we meet obstacles in the use of instruments, etc.

Ignorance of the microscopic is sometimes a voluntary choice. Every detail could be observed, and yet we decline to know it. For example, a company has collected analytical data, but the executive managers ignore them and evaluate their average values when taking important decisions. Macroscopic knowledge and unawareness of microscopic items provide a precise method. Statisticians adopt this method, which is absolutely scientific.

Let us translate these concepts into the formalism just introduced. Let the event S have level 1, level 2, up to level q; two cases now arise.

5.1 Certain Structures

The event is entirely described by the relations and the entities of level q. The elements at level (q + 1) exist neither on paper nor in the physical reality. This structure, which is wholly defined and complete, is certain. As an example we take a falling body:

level 0: S
level 1: Eb, ET; Rf (20)

The structure includes the body Eb, the Earth ET and the force of gravity Rf at level 1. These elements exhaustively model the event, and other elements do not exist in the physical world.

5.2 Uncertain Structures

The event is not entirely described by the relations and the entities of level q. The microelements pertaining to level (q + 1) exist in the physical reality and influence the final results in a decisive way; however, the structure does not include them. We call such a structure, which is partial, uncertain (or random). As an example we take the flipping of a coin. The structure includes the coin Em and the launching/falling dynamics Rm. The entities Et (heads) and Ec (tails), and the alternative relations which produce them, appear at the next level:

level 0: S
level 1: Em; Rm
level 2: Et, Ec; Rt + Rc
level 3: (21)

The subrelationships of Rt and of Rc produce any specific outcome. They are essential, since they would enable the calculation of any result, and should be listed at level 3 in (21). However, they do not appear, and the structure (21) is uncertain.

6 Probability

A certain event is entirely explained through the structure of levels. The structure clearly indicates how the event runs through q levels, which are exhaustive by definition. On the contrary, the uncertain structure is incomplete and cannot describe how the event runs in the physical reality. Since it is impossible to describe how the event functions, the level (q + 1) being unknown, we inquire when the event behaves, that is, when the random event exists in the physical reality. This inquiry reveals a typically physical approach. The problem eludes whoever develops an abstract study: for the pure theoretician, the event S, once defined on paper, exists by definition. The practitioner instead knows the great difference between the definition of a model and its experimental observation.

The structure of levels (16) proves that the event S works through R; therefore we measure the ability of the relationship to connect.

Definition 5.1) - When R links the input to the output in physical reality, the event S is certain and the measure P(R) equals one

P(R)= 1 (22)

When R does not run in physical reality, S is impossible in practice and the measure P(R) is zero

P(R) = 0 (23)

If R runs occasionally, P(R) assumes a decimal value. The connection is neither sure nor impossible, and P(R) has a value between zero and one

0 < P(R) < 1 (24)

We call probability the measure P(R) of the operation R, which extensively indicates the occurrence of S. We can add the ensuing remarks.

1. The relationship R is the precise argument of probability, while S is generic.

2. Definition 5.1) is coherent with the common sense on probability, as P(R) gauges the possibility or the impossibility of the random event.

3. In some special events we can define the operation using its outcome. Formally, we state a univocal relation between Eout and R

Eout => R (25)

and we calculate the probability of the outcome

P(Eout) = P(R) (26)

E.g., the result heads Et appears whenever Rt works, and we forecast the chances of a gamble from the possible outputs

P(Et) = P(Rt) = 0.5 (27)

In conclusion, if (12), (13) and (26) are true, Definition 5.1) is consistent with Kolmogorov's theory.

4. Certain structures include only certain elements; impossible elements have no sense and are omitted. The unitary value of probability merely confirms what is already stated in the levels. For example, P(Rf) is one and substantiates the structure of levels (20). Conversely, the uncertain structure lacks the lowest elements, which are essential, and (24) unveils them. The decimal values of probabilities clarify the intervention of the elements at level (q + 1). For example, we ignore the parts of Rt producing the result Et in (21); instead, the probability (27) is capable of explaining how they work. Exactly half of the occurrences of S is due to the subrelationships of Rt, and the other half is activated by the components of Rc. The explicative and predictive values of probability in (24) appear absolutely relevant.

7 Experimental Verification

Our inferences are strictly inspired by experience, and Definition 5.1) must be confirmed by the facts. In order to simplify the discussion of practical verification, let the event include either the relationship Ri or NOT Ri at level 2, while level 3 is ignored

level 0: S    level 1: E, R    level 2: Ei, NOT Ei, (Ri + NOT Ri)    level 3:    (28)

The probability P(Ri) expresses the runs of Ri by definition; thus the occurrences gs(Ri) in the sample s verify the theoretical value P(Ri). The more often Ri connects, the larger gs(Ri) is. Vice versa, the less Ri runs, the smaller gs(Ri) is. However, the absolute frequency gs(Ri) exceeds the range [0, 1], and we select the relative frequency Fs(Ri), which verifies

0 < Fs(Ri) < 1 (29)

According to this theory, the relative frequency must coincide with the probability calculated theoretically; instead, Fs(Ri) does not coincide with P(Ri). Why? Is there perhaps a systematic error in the experiment?

The relationship Ri at level q works by means of its subrelationships at level (q + 1); however, we do not know in detail how these behave. In particular, a subrelationship at level (q + 1) occurs at random, and a finite number of tests does not allow the subrelationships of Ri to maintain their dynamical contribution to Ri. Symmetrically, the subrelationships of NOT Ri are not proportional to P(NOT Ri). Every finite sample of tests unbalances Ri and NOT Ri. The occurrences of one group are lower than they ought to be and the occurrences of the other are greater, since the subrelationships are random. The relative frequencies appear in favour of one group of subrelationships and to the detriment of the other. Fs(Ri) and Fs(NOT Ri) are necessarily unreliable and disagree with P(Ri) and P(NOT Ri). We conclude that the correct trial of probability must be extended over the universe, where the subrelationships of Ri and of NOT Ri do not undergo limitations. The ideal experimentation of P(Ri), which excludes any deforming influence and provides the unaltered value of Fs(Ri), requires the number Gs of tests to be infinite

Gs = ∞ (30)

In this situation the theoretical value P(Ri) and the experimental one coincide

Fs(Ri) - P(Ri) = 0 (31)

The ideal experiment (30) is unattainable; therefore we can only approach it. We define this approximation using the limit

$\lim_{G_s \to \infty} [F_s(R_i) - P(R_i)] = 0$ (32)

The limit affirms that, given the high number N, there is a value Gs

Gs > N (33)

such that

$|F_s(R_i) - P(R_i)| < 1/G_s$ (34)

In other words, we repeat the tests a sufficiently high number of times, and the difference between the frequency and the probability will be less than the small number 1/Gs. The limit (32) ensures a result as fine as desired. It proves that the probability defined by (22), (23), (24) is verifiable in fact and confirms that the present theory has substance.
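The approach of Fs(Ri) to P(Ri) as Gs grows can be observed numerically. The following is only a sketch under stated assumptions: the event is simulated with a pseudo-random generator, P(Ri) = 0.5 is taken from the coin example (27), and the function name and the seed are ours. It prints the observed deviation for several sample sizes and does not by itself establish any particular rate of convergence.

```python
import random


def relative_frequency(num_tests, p=0.5, seed=0):
    """Return Fs(Ri): the fraction of num_tests simulated trials in which Ri runs.

    Assumption: each trial is an independent pseudo-random draw that
    succeeds with probability p (p = 0.5 mirrors the coin example).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_tests) if rng.random() < p)
    return hits / num_tests


if __name__ == "__main__":
    for gs in (1, 10, 100, 10_000, 100_000):
        fs = relative_frequency(gs)
        print(f"Gs={gs:>7}  Fs={fs:.4f}  |Fs-P|={abs(fs - 0.5):.4f}")
```

With a single test (Gs = 1) the frequency can only be 0 or 1, whereas for large samples it settles close to 0.5.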

The limit (32), known as the empirical law of chance or the law of large numbers, does not define probability but explains its experimental verification only. It is less meaningful with respect to the law sustained by frequentists [9] and does not give rise to the same conceptual difficulties. The limit (32) does not use probability to describe the approximation of Fs(Ri) to P(Ri) and avoids a certain conceptual tautology.

8 Objective and Subjective Probability

The limit (32) states that the higher the number of tests, the nearer the frequency moves to the probability. Vice versa, the smaller the sample, the less reliable is the experimental control of probability. The maximum deviation emerges in a single test, and the structural model provides the explanation.

One subrelationship of level (q + 1) fires the single experiment, and this subrelationship pertains either to Ri or to NOT Ri. In both cases the frequency deviates completely from the probability, which should be decimal

Gs:    1        > N            ∞
Fs:    wrong    approximate    right        (35)

The spectrum (35) is valid in relation to frequency and also in relation to probability. What does this mean?

Any scientific measure takes its meaning under the precise conditions in which it is defined. Therefore a parameter does not have a value forever, but only in the practical conditions under which it must be tested. This rule also concerns probability. A fairly simple case can clarify the matter.

We define the force as the factor causing the acceleration a of the mass m

f = m · a (36)

Mechanics defines the force (36) in the conditions which pertain exclusively to the inertial system. This is characterized by the property of being stationary or moving straight and steadily. In the inertial system the mass m undergoes the force and accelerates in accordance with (36). Conversely, the body can move without any mechanical solicitation in a non-inertial reference. The force cannot be tested, and definition (36) is meaningless when the system is not inertial.

In general, a scientific measure takes on significance only under the experimental conditions pertaining to it, and out of this context it objectively has no meaning. The same criterion applies to probability, with additional difficulties due to the experimental conditions, which are expressed by the limit (32) and are somewhat complex. We do not have two alternative and mutually exclusive reference systems, inertial and non-inertial; instead, we have the continuous spectrum (35). Probability is correctly tested, and thus takes on a right and objective significance, when

Gs = ∞ (37)

This is unattainable, and we use a large sample

Gs > N (38)

The higher the number of tests, the more objective is the verification of probability. Probability loses significance as Gs decreases. The test is absolutely meaningless when

Gs = 1 (39)

Probability is very useful (see point 3 in section 6), and we calculate P(R) even if (39) is true. In the single event, however, probability does not exist, as De Finetti paradoxically states [5]. Probability can only orientate the personal expectation; namely, probability takes on a subjective significance

Gs:    1             > N            ∞
Fs:    wrong         approximate    right
P:     subjective                   objective        (40)

Note that the subjectivist schools focus their attention on the single event, while the general event is a repetition of single events. This remark highlights once again that the incongruences between various authors are rooted in the modeling of the random event.

In substance, Fs(Ri) and P(Ri) have a correct and objective meaning when they refer to the entire inductive base. As the number of experiments decreases, the precision of Fs(Ri) decreases and the objectivity of P(Ri) decreases progressively, down to the point (39), at which the numerical value of Fs(Ri) is systematically wrong and the value of P(Ri) is subjective.


9 Conclusions

Our theoretical proposal arose from a critical approach to the probabilistic event; in particular, we started by examining the relation between the theoretical models in use today and physical reality. We believe the algebraic structure meets the needs better than the linguistic and the set models. Besides the theoretical considerations that we listed in the previous pages, we highlight that structures of levels are already applied in several fields, and in Probability Calculus too.

The definition of probability that derives from the structural model is consistent with common sense and with the probabilistic schools. The different interpretations of probability, which today are conflicting, are unified within our framework. We judge this to be a significant feature, which may stimulate the scientific debate.

The reader may find some parts of this paper sketchy and insufficiently explained; we regret the conciseness. Other considerations and further calculations have been developed in [11], but exhaustive discussions cannot be included here.

References

1. Ahl V., Allen T.F.H., Hierarchy theory: a vision, vocabulary and epistemology (Columbia Univ. Press, NY 1996).
2. von Bertalanffy L., General system theory (Braziller, NY 1968).
3. Chen P.S., The entity-relationship model: toward a unified view of data, ACM Transactions on Database Systems, vol. 1, n. 1 (1976).
4. Corry L., Modern algebra and the rise of mathematical structures (Verlag, NY 1996).
5. de Finetti B., Theory of probability (Wiley & Sons, NY 1975).
6. Feynman R., The concept of probability in quantum mechanics, Proceedings Symp. on Math. and Prob., California University Press (1951).
7. Kalman R.E., Falb P.L., Arbib M.A., Topics in mathematical system theory (McGraw, NY 1969).
8. Kolmogorov A.N., Foundations of the theory of probability (Chelsea, NY 1956).
9. von Mises R., The mathematical theory of probability and statistics (Academic Press, London 1964).
10. Rocchi P., Technology + culture = software (IOS Press, Amsterdam 2000).
11. Rocchi P., La probabilità è oggettiva o soggettiva? (Pitagora, Bologna 1998).
12. Uschold M.L., Building ontologies: toward a unified methodology, Proc. Expert Systems, Cambridge (1996).
13. Takahara Y., Mesarovic M.D., Macko D., Theory of hierarchical multilevel systems (Academic Press, NY 1970).
14. Yourdon E., Modern structured analysis (Englewood Cliffs, NJ 1989).


CONSTRUCTIVE FOUNDATIONS OF RANDOMNESS

V. I. SERDOBOLSKII
Moscow 109028, B. Trekhsviatitelskii 3/12, MGIEM
E-mail: vserd@mail.ru

The ideas of complexity and randomness are developed in a successively constructive theory. The Kolmogorov complexity is reconsidered as a minimization process. Basic theorems are proved for the processes. A new notion of complexity based on sequential prefix coding algorithms (S-algorithms) is proposed. It is proved that a constructive infinite binary sequence is algorithmically stationary iff it is an S-encoded random sequence.

1 Introduction

In 1963 A. N. Kolmogorov [1] suggested an algorithmic approach to the foundation of probability. His new definition of probability was based on the notion of complexity, which was defined as the length of the minimal description: for a binary word x, the complexity function is defined as

$K(x) = \min_{A(p) = x} |p|$ (1)

where p are (shorter) binary words and the minimum is evaluated over all possible algorithms A. A remarkable property of this approach was that the randomness thus algorithmically defined was proved to display all traditional laws of probability. However, the function K(x) defined by (1) in a traditional intuitive approach cannot be effectively calculated, since it is not a partially recursive function. In fact, this function is computable only for finitely many words x [2]. In [3] it was shown that K(x) is not partially recursive for any universal algorithm. In [4] the definition (1) was called a heuristic basis for various approximations. In [5] the author writes that the non-constructive form of the definition (1) leads to some difficulties, so that many important relations hold only to within an error term measured by the logarithm of the complexity. To offer a constructive definition of randomness, it would be desirable to call an infinite sequence random if all initial segments (prefixes) in it are incompressible. However, it was proved [6] that such sequences do not exist. Kolmogorov proposed a definition of randomness (K-randomness), but he wrote that it was to be improved.

In this paper we reconsider fundamental relations of the Kolmogorov complexity theory and develop a successively constructive formalism. The main idea is that, as far as we deal with algorithms, we must explicitly take into account the current time of their performance. Thus a static notion of minimal description must be replaced by the process of minimization. Here we suggest a rigorous formalism in which it is possible to replace the somewhat obscure intuitive reasoning of the existing complexity theory by a formal investigation of strings of symbols. We present a survey of basic results of the Kolmogorov complexity theory in terms of processes of step-by-step performance of algorithms. We also introduce a new form of complexity based on a restriction to algorithms coding sequentially from left to right (S-algorithms). A constructive infinite binary sequence can be called stationary if the frequencies of all finite blocks of digits in it converge. We prove that a sequence is stationary iff it is the transformation of an incompressible (up to a logarithmic term) sequence by a sequential left-to-right encoding algorithm.

Let us define the objects of the investigation and fix notations. We study binary words x, which are finite chains of binary digits and at the same time binary numbers. These words are transformed with algorithmic procedures A, which can be represented by Turing algorithms (Turing machines) or, equivalently, by partially computable (partially recursive) functions. We also study infinite sequences $x^\infty$ of binary digits, which can be considered at the same time as infinite sequences of words x of increasing length n, i.e. initial segments of $x^\infty$. In the constructive approach these sequences must be generated by some finite algorithms (generating functions). We write A(x) = y if A halts at some finite step and yields y. If A(x) does not halt, we write $A(x) = \bot$, where $\bot$ stands for the sign of an undefined result. We will often need to perform algorithms step by step. Let $A_t(x)$ denote the result of the performance of A(x) for t steps: $A_t(x) = y$ if A(x) halts at a step $t' \le t$ and yields y. We write $A_t(x) = \bot$ if A(x) does not halt or halts only at a moment $t' > t$. Let |x| denote the length of the binary word x.

2 Kolmogorov Complexity

According to Kolmogorov, the complexity of a binary word is the length of a minimal program generating this word. To make this definition completely constructive, we first must explicitly describe the minimization procedure. To minimize a partially computable function f(x), we combine the search of x with counting the number of steps of an algorithm that evaluates f(x). Let us use the uniform increasing numeration N = 1, 2, ... of n-tuples of arguments; for example, let N = 1, 2, 3, 4, 5 represent the pairs (1, 1), (1, 2), (2, 1), (2, 2), (1, 3).
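One numeration of pairs that reproduces the five pairs listed above can be sketched as follows. The text fixes only the first five pairs, so the continuation used here (pairs grouped by their largest component, in lexicographic order within each group) is our assumption, and the function name is hypothetical.

```python
from itertools import count, islice


def enumerate_pairs():
    """Yield pairs (x, t) of positive integers in a uniform increasing order.

    Assumed ordering: group pairs by m = max(x, t) and list each group
    lexicographically; this agrees with the first five pairs in the text.
    """
    for m in count(1):
        for x in range(1, m + 1):
            for t in range(1, m + 1):
                if max(x, t) == m:
                    yield (x, t)


if __name__ == "__main__":
    for number, pair in enumerate(islice(enumerate_pairs(), 9), start=1):
        print(number, pair)
```

Every pair (x, t) receives a finite number N, so any search over arguments and step counts eventually reaches it.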

Define the standard minimization process for A(x) as follows:

$\min_x A(x) = \{A(x, N),\ N = 1, 2, \dots\}$

where N = (x, t), $A(x, 0) = \bot$ and $A(x, N) = \min(A(x, N - 1), A_t(x))$ for $N \ge 1$. In the minimization process the sign $\bot$ can be treated as infinity. If A(x) halts for a computable number of steps t, then the minimization process ends and $\min_x A(x)$ is a computable function. If no such t exists, we can say that the function A(x) has no bottom.

Consider the universal Turing machine U: by definition, U(A, p) = A(p) in the domain, where (and in the following) the same letter A also denotes the text of the algorithm. Let |A| denote the length of the text A.

Theorem 1. There exist computable functions such that the mass problem of their minimization process halting is algorithmically unsolvable.

Proof. Consider the indicator function ind(x, t) = 0 if $U_t(x)$ with x = (A, p) halts exactly at the step t, so that $U_t(x) = A(p)$; otherwise ind(x, t) = 1. Denote

$\varphi(x, t) = \prod_{\tau \le t} \mathrm{ind}(x, \tau)$

The minimization process $\{\varphi(x, 1), \varphi(x, 2), \dots\}$ is finite iff U(x) halts. But the halting problem for the universal Turing machine U is algorithmically unsolvable.
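The standard minimization process defined above can be illustrated with a small sketch. The assumptions are ours: a step-bounded algorithm is modelled as a Python function returning None while it has not yet halted (None plays the role of the undefined sign and is treated as infinity), the pair numbering is one possible uniform numeration, and toy_algorithm is a hypothetical example rather than anything taken from the theory.

```python
import math
from itertools import count, islice


def enumerate_pairs():
    """Yield pairs (x, t) grouped by max(x, t); an assumed uniform numeration."""
    for m in count(1):
        for x in range(1, m + 1):
            for t in range(1, m + 1):
                if max(x, t) == m:
                    yield (x, t)


def minimization_process(step_bounded, max_n):
    """Yield A(x, N) for N = 1..max_n with A(x, N) = min(A(x, N-1), A_t(x))."""
    best = math.inf  # A(x, 0): the undefined sign, treated as infinity
    for x, t in islice(enumerate_pairs(), max_n):
        value = step_bounded(x, t)  # A_t(x), or None if not halted within t steps
        if value is not None:
            best = min(best, value)
        yield best


def toy_algorithm(x, t):
    """Hypothetical A: needs x steps to halt, then yields (x - 3)**2 + 2."""
    return (x - 3) ** 2 + 2 if t >= x else None


if __name__ == "__main__":
    print(list(minimization_process(toy_algorithm, 9)))
```

The sequence is non-increasing by construction; whether it has reached its final value after a given N is exactly what cannot be decided in general.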

Now we can define the complexity as follows.

Definition 1. Given a binary word x and an algorithm (partially computable function) A, the complexity of x with respect to A is $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$, where

$K(x, A, N) = \min_{(p, t) \le N,\ A_t(p) = x} |p|$

In this definition A(p) is called a generating algorithm, and p is called a program or a code for x.

So the complexity is defined as a process but not as a function. If A(x) halts for some x, then the sequence $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$ converges to a constant for some computable $N = N_0$, and we can say that the complexity function K(x) is defined. Otherwise, no such constructive function exists.
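To make the idea of complexity as a process concrete, here is a sketch with a toy generating algorithm. Everything specific in it is an assumption of ours: the numbering of binary words, the pair numeration, and the hypothetical toy_decoder with its two modes and step counts. It only shows how K(x, A, N) decreases as N grows.

```python
import math
from itertools import count, islice


def enumerate_pairs():
    """Yield pairs (i, t) grouped by max(i, t); an assumed uniform numeration."""
    for m in count(1):
        for i in range(1, m + 1):
            for t in range(1, m + 1):
                if max(i, t) == m:
                    yield (i, t)


def word(i):
    """Map 1, 2, 3, ... to the binary words '', '0', '1', '00', ... (assumed numbering)."""
    return bin(i)[3:]


def toy_decoder(p, t):
    """Hypothetical A_t(p): the output of A on p, or None if not halted within t steps."""
    if p.startswith("1"):  # literal mode: output the rest of the program
        steps, out = len(p), p[1:]
    elif p.startswith("0") and "1" in p:  # run-length mode: '0' + binary n gives n zeros
        n = int(p[1:], 2)
        steps, out = 15 * n, "0" * n  # deliberately slow, to delay its discovery
    else:
        return None  # A does not halt on this program
    return out if t >= steps else None


def complexity_process(x, decoder, max_n):
    """Yield K(x, A, N) for N = 1..max_n: shortest program found so far."""
    best = math.inf
    for i, t in islice(enumerate_pairs(), max_n):
        p = word(i)
        if decoder(p, t) == x:
            best = min(best, len(p))
        yield best


if __name__ == "__main__":
    values = list(complexity_process("0000", toy_decoder, 3600))
    print(sorted(set(values), reverse=True))
```

For the word 0000 the process first finds the literal program and later a shorter run-length program, so the value improves only once enough steps have been allowed.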

To compare minimization processes we need a special technique

Definition 2. Given two minimization processes

$\min_x A(x) = \{A(x, N),\ N = 1, 2, \dots\}$,    $\min_x B(x) = \{B(x, M),\ M = 1, 2, \dots\}$

we write $A(x) \lesssim B(x)$ if for each M there exists an $N_0$ such that for all $N > N_0$ the inequality $A(x, N) \le B(x, M)$ holds.

If both processes halt, we can write simply $A(x) \le B(x)$. If $A(x) \lesssim B(x)$ and $A(x) \gtrsim B(x)$, we say that the strong equivalence holds and write $A(x) \sim B(x)$. Define also a weak equivalence $A(x) \approx B(x)$ if $A(x) \lesssim B(x) + c$ along with $B(x) \lesssim A(x) + c$.

The algorithmic theory of complexity started with the discovery of universal descriptions and universal complexity. This basic discovery was made simultaneously and independently by Kolmogorov and R. Solomonoff in 1960-1964 (see in [7]).

This theory is developed to study minimal descriptions of arbitrarily long words x with finite algorithms. It means that |A| < c. All basic results are obtained with the accuracy up to constants c, which are supposed to be independent of x.

Definition 3. The complexity of the word x with respect to an algorithm A is the process $K(x, A) = \{K(x, A, N),\ N = 1, 2, \dots\}$, where

$K(x, A, N) = \min_{(p, t) \le N,\ A_t(p) = x} |p|$

We use two methods of the complexity theory: upper estimates of the complexity are derived by the construction of explicit generating procedures; lower estimates are obtained by counting the variety of words and their programs.

Theorem 2. For any algorithm A we have

$K(x, U) \lesssim K(x, A) + c_A$

where $c_A$ depends only on A but not on x.

Proof. Count steps of A(x) by steps of the universal Turing machine performing A. For each N we can find a number M such that

$K(x, U, N) = \min_{(z, t) \le N,\ U_t(z) = x} |z| \lesssim \min_{B:\ |B| < c}\ \min_{(p, t) \le N,\ U_t(B, p) = x} |(B, p)| \lesssim$

$\lesssim \min_{(p, t) \le N,\ U_t(A, p) = x} (c_A + |p|) \lesssim c_A + \min_{(p, t) \le M,\ A_t(p) = x} |p| = c_A + K(x, A)$

where $c_A$ is a constant depending only on A. This is the proof.

This statement is called the Invariance Theorem. Its significance is that it introduces a universal measure of complexity, which is calculated by trying different algorithms with different input words. Let us fix a particular universal Turing machine U as a reference machine and set K(x) = K(x, U).

Let us call the difference |x| - K(x) the number of regularities.

Remark 1. Given n = |x|, the fraction of words x with the number of regularities more than m is no more than $2^{-m}$.

This follows from the fact that there are only $2^{n-m}$ programs p of length n - m. So almost all words are incompressible up to a slowly increasing function of n.
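The counting behind Remark 1 can be checked directly. The sketch below is a worked illustration under our own reading: a word with more than m regularities needs a program shorter than n - m, and there are 2^(n-m) - 1 binary words of such lengths. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def short_programs(n, m):
    """Number of binary programs of length less than n - m."""
    if n <= m:
        return 0
    return 2 ** (n - m) - 1


def compressible_fraction_bound(n, m):
    """Upper bound on the fraction of n-bit words with more than m regularities."""
    return Fraction(short_programs(n, m), 2 ** n)


def count_words_shorter_than(length):
    """Brute-force count of binary words of length 0 .. length - 1."""
    return sum(1 for k in range(length) for _ in product("01", repeat=k))


if __name__ == "__main__":
    n, m = 20, 5
    print(compressible_fraction_bound(n, m), "<", Fraction(1, 2 ** m))
```

Since each program generates at most one word, the fraction of words that can be compressed by more than m digits stays below 2^(-m) whatever the length n.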

Remark 2. $K(x) \lesssim |x| + c$. This is obvious, since we can use the identity algorithm A(x) = x as a generating algorithm.

Note that the minimization process in Theorem 2 can be made more efficient if we restrict p with $|p| \le |x| + c$.

The complexity of finite words depends strongly on the additive constant c. Therefore the main object of study will be the complexity of words x of arbitrarily great lengths n.

Theorem 3. If f(x) is a partially computable function, then $K(f(x)) \lesssim K(x) + c$.

Proof. Suppose the algorithm evaluating f(x) halts. Given an arbitrary algorithm A, we construct the composition B = fA. By Definition 3 and Theorem 2, for each N we can find M and a constant c independent of x such that

$K(f(x), U, N) = \min_{(z, t) \le N,\ U_t(z) = f(x)} |z| \lesssim \min_{B:\ |B| < c}\ \min_{(p, t) \le M,\ B_t(p) = f(x)} |p| + c \lesssim$

$\lesssim \min_{(p, t) \le M,\ f(A_t(p)) = f(x)} |p| + c \lesssim \min_{(p, t) \le M,\ A_t(p) = x} |p| + c = K(x, A) + c \lesssim K(x) + c$

The theorem is proved.

Example. Let $x = 0^n$ (n zeros). Then $K(x) \lesssim K(n) + c \lesssim \log n + c$. If $n = 2^m$, then $K(x) \lesssim \log\log n + c$. Clearly, K(x) is not monotone in n.

By definition, it is impossible to present a conceivable example of a high-complexity word.

To separate a number n in a chain, we define a special self-delimiting code for an integer n as follows: $\bar n = 0^m 1 n$, where $m = \log n$, with the length $|\bar n| = 2 \log n + 1$, or a more refined code $\bar{\bar n} = 0^{\log m} 1 m n$ of length $|\bar{\bar n}| \le \log n + 2 + 2 \log\log n$. Here (and in the following) $\log x$ for x > 0 denotes a function equal to the integer nearest from above to the standard logarithmic function log x, and only positive arguments of $\log x$ are considered (if $x \le 0$, then the expressions containing $\log x$ are supposed to equal 0).
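The first self-delimiting code can be sketched as follows. One assumption should be stated loudly: we take m to be the number of binary digits of n, which differs from the rounded logarithm used in the text when n is an exact power of two. The function names are ours.

```python
def encode(n):
    """Return the code 0^m 1 n, with n in binary and m its number of digits.

    Assumption: m is the bit length of n, so the code has length 2m + 1.
    """
    body = bin(n)[2:]
    return "0" * len(body) + "1" + body


def decode(stream):
    """Read one code from the front of stream; return (n, remaining digits)."""
    m = stream.index("1")  # the leading zeros announce the length of n
    body = stream[m + 1 : 2 * m + 1]
    return int(body, 2), stream[2 * m + 1 :]


def is_prefix_free(words):
    """True if no word is the beginning of a different word."""
    return not any(a != b and b.startswith(a) for a in words for b in words)


if __name__ == "__main__":
    chain = encode(5) + encode(12) + "1011"
    first, rest = decode(chain)
    second, tail = decode(rest)
    print(first, second, tail)
```

Because the zeros in front announce how many digits follow, a decoder reading from left to right knows where the number ends without any end marker.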

Note that the set of codes $\bar n$ is a prefix-free set. More sparing self-delimiting codes can be obtained by further iterations. Denote their length by $\log^* n = \log n + \log\log n + \log\log\log n + \dots$ (the iterated logarithm).

Theorem 4. $K(x, y) \lesssim K(x) + K(y) + 2 \log |x| + 1$.

Proof. It suffices to use programs for (x, y) of the form $p = 0^m 1 p_1 p_2$, where $m = \log |p_1|$, $A(p_1) = x$, $B(p_2) = y$, and $0^m 1$ serves to separate $p_1$ from $p_2$.

3 Incompressibility

Now we consider algorithmically generated infinite sequences of digits $x^\infty$ that are treated as sequences of words x, |x| = n = 1, 2, ...

We cite (in a simplified form) two theorems by Martin-Löf [6].

Theorem 5. Any constructive $x^\infty$ contains infinitely many words x of length n with $K(x) \lesssim n - \log n + c$.

Theorem 6. For almost all sequences $x^\infty$, for any $\varepsilon > 0$, for all words x of length $n > n_0$ with some computable $n_0$, we have $K(x) > n - (1 + \varepsilon) \log n$.

Thus the complexity of a typical constructive binary sequence fluctuates between the lower bound $n - (1 + \varepsilon) \log n$ and n.

The idea to define randomness as algorithmic incompressibility was put forward by Kolmogorov [2] and G. J. Chaitin [8]. There exist no sequences in which all words are c-incompressible.

Definition 4 (Kolmogorov). An infinite binary sequence is called K-random if it contains infinitely many words x with $K(x) \gtrsim |x| - c$.

Remark 3. Almost all sequences $x^\infty$ are K-random.

This follows from the fact that there is only a portion $2^{-c}$ of words x for which $K(x) \lesssim |x| - c$.

Definition 5. An infinite binary sequence $x^\infty = \{x\}$ is called L-random if for some c we have $K(x) \gtrsim n - c \log n$ for all words x, n = |x|.

Theorem 6 states that almost all binary sequences are L-random.

Stepping aside from the incompressibility idea, Martin-Löf [6] suggested another notion of randomness based on the idea of universal tests. The Martin-Löf randomness (ML-randomness) follows from the Kolmogorov randomness. If $x^\infty$ is Martin-Löf random, then for any $\varepsilon > 0$ we have $K(x) \gtrsim n - (1 + \varepsilon) \log n$ from some n onwards.

These properties suggest three notions of randomness implied one from the other: K → ML → L.

Now let us restrict classes of algorithms.

4 Reversible Complexity

Let us restrict ourselves to reversible algorithms.

Definition 6. An algorithm A(p) is called reversible (R-algorithm) if one can find another algorithm $B = A^{-1}$ such that A(p) = x implies B(x) = p and vice versa.

These algorithms establish a 1-1 correspondence between inputs and outputs. We can say that B(x) is an encoding algorithm and A(p) is a decoding algorithm.

Definition 7. The R-complexity of a word x is defined as the process $K_R(x) = \{K_R(x, N),\ N = 1, 2, \dots\}$, where

$K_R(x, N) = \min_{A:\ |A| < c}\ \min_{(p, t) \le N,\ U_t(A, p) = x} |p|$

where A are R-algorithms and the minimization process is shortened by discovering the first root of the equation A(p) = x.

Since the class of R-algorithms includes the identity algorithm, we have $K_R(x) \le |x| + c$.

Definition 8. A function (an algorithm) A(x) is called unidomain if there are no pairs $x_1 \ne x_2$ such that $A(x_1) = A(x_2)$.

Proposition 1. A function A(x) is unidomain iff it is reversible.

Proof. First, let A be unidomain. Using A, let us construct an algorithm B(y) as follows:

for (p, t) = 1, 2, ... do
    if A_t(p) = y then B(y) = p; halt
endfor

If A(x) = y, then this algorithm provides the first root of this equation and halts. If $A(x) = \bot$, then we have $B(y) = \bot$. Conversely, if A is a reversible algorithm, then there exists an algorithm B(y) such that A(x) = y implies B(y) = x, and the argument of A is recovered uniquely.
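The inverse algorithm B(y) constructed in the proof can be sketched as a search over numbered pairs (p, t). The assumptions are ours: programs are positive integers, a step-bounded algorithm returns None until it halts, the pair numeration is one possible choice, and a finite search budget is added only so that the sketch always terminates (the algorithm in the proof has no such budget).

```python
from itertools import count, islice


def enumerate_pairs():
    """Yield pairs (p, t) grouped by max(p, t); an assumed uniform numeration."""
    for m in count(1):
        for p in range(1, m + 1):
            for t in range(1, m + 1):
                if max(p, t) == m:
                    yield (p, t)


def invert(step_bounded, y, max_pairs=None):
    """Return the first root p of A(p) = y found by step-by-step search.

    Returns None if the optional budget max_pairs is exhausted; without a
    budget the search does not halt when no root exists.
    """
    for p, t in islice(enumerate_pairs(), max_pairs):
        if step_bounded(p, t) == y:
            return p
    return None


def toy_unidomain(p, t):
    """Hypothetical unidomain A: halts after p steps and yields 3p + 1."""
    return 3 * p + 1 if t >= p else None


if __name__ == "__main__":
    print(invert(toy_unidomain, 22))
    print(invert(toy_unidomain, 23, max_pairs=1000))
```

For a unidomain algorithm the first root is the only root, which is why this search recovers the argument uniquely.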

Theorem 7. There exists no algorithm W such that for any algorithm A we have W(A) = 1 if A can be a reversible algorithm and W(A) = 0 if not.

Proof. To prove this assertion, it suffices to prove it for some special class of A. Let N be a nullifying algorithm such that for any x we have N(x) = 0, and let B be an arbitrary algorithm. Choose A so that A(0) = 0, A(1) = N(B(1)) and A(n) = n for n > 1. This algorithm is not unidomain iff B(1) halts. However, the mass problem of algorithm halting is algorithmically unsolvable. This proves the theorem.

Theorem 8. The complexity $K_R(x) \approx K(x)$.

Proof. The relation $K(x) \lesssim K_R(x) + c$ follows from the definitions. Let us prove the converse relation. Let K(x) be given by a sequence of functions

$K(x, N) = \min_{A:\ |A| < c}\ \min_{(A, p, t) \le N,\ A_t(p) = x} |p|$

where A are arbitrary algorithms. Given A, the minimization here is carried out over all roots of the equation $A_t(p) = x$. We replace the evaluation of all roots for a single algorithm $A_t$ by evaluating roots of a number of equations. Let us enumerate the roots of the equation A(p) = x in the process (p, t) = 1, 2, ... Construct the algorithm B(ν, p) as follows:

k = 0
for (q, τ) = 1, 2, ... do
    if A_τ(q) = x then k = k + 1
    if k = ν and p = q then B = x; halt
endfor

The function B(ν, p) = x iff p is the root number ν; otherwise $B(\nu, p) = \bot$. By construction, for fixed ν the function B(ν, p) is unidomain. The theorem statement follows.

Knowing the complexity of a word x, we can constructively evaluate its minimal codes. Minimizing descriptions of physical events x can be considered as a process of cognition of x by a search for the regularities producing the phenomenon x. It is known that all elementary physical processes are time-reversible. The reversible generating algorithms, generally speaking, can be less efficient in producing long words. The equivalence $K(x) \approx K_R(x)$ stated by Theorem 8 can be interpreted as the absence of phenomena that can be produced but not cognized within the frames of the algorithmic theory.

5 Complexity and Information

Kolmogorov discovered [2], [9] that information theory can be developed from the algorithmic definition of complexity.

The conditional complexity of a binary word x with respect to the word y is defined as the minimal length of a program that generates x from y:

$K(x \mid y, A) = \min_{(p, t):\ A_t(p, y) = x} |p|$

Theorem 9. There exists an optimal algorithm V such that for any algorithm A we have

$K(x \mid y) \stackrel{def}{=} K(x \mid y, V) \lesssim K(x \mid y, A) + c$

Example. We have $K(0^n \mid n) \lesssim c$, where the constant c is the length of the algorithm generating $0^n$ from n.

We show the connection between the notion of complexity and optimal coding in the Shannon information theory. Suppose the words x of length n are partitioned from left to right into sequences of k blocks $b_s$ of binary digits of the identical length l, with $m = 2^l$ distinct blocks in total, n = kl. Denote by $f_s$ the empirical frequency of the occurrence of $b_s$ in x. The Shannon entropy per block is defined as

$H(f) = -\sum_s f_s \log f_s$

Theorem 10. Let a word x be partitioned into k blocks of length l. Then $k^{-1} K(x) \lesssim H(f) + c \log k / k$, where c depends on l but not on x.

Proof. Use a special code not depending on the source of information (universal code). To specify x, we can fix the numbers $k_s = k f_s$ of occurrences of each block $b_s$ for all blocks s of length l, and the number of x among the

$\frac{k!}{k_1! k_2! \cdots k_m!}$

words with these block counts, $m = 2^l$, where $k_1 + \dots + k_m = k$. Applying the Stirling formula, we find that the length of this code is no more than $m \log k + k H(f) + c \log k$. The theorem statement follows.
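The empirical entropy per block used in Theorem 10 is easy to compute. The sketch below takes the usual Shannon form with base-2 logarithms and discards any digits left over after the last whole block; both choices, and the function name, are our assumptions.

```python
import math
from collections import Counter


def block_entropy(x, l):
    """Empirical Shannon entropy (bits per block) of x cut into blocks of length l."""
    k = len(x) // l  # number of whole blocks; leftover digits are ignored
    blocks = [x[i * l : (i + 1) * l] for i in range(k)]
    counts = Counter(blocks)
    return -sum((c / k) * math.log2(c / k) for c in counts.values())


if __name__ == "__main__":
    for sample in ("01" * 8, "00011011", "0001000100010010"):
        h = block_entropy(sample, 2)
        k = len(sample) // 2
        print(f"{sample}: H = {h:.3f} bits per block, k*H = {k * h:.3f}")
```

A periodic word such as 0101...01 has zero block entropy, while a word in which all four 2-digit blocks are equally frequent reaches the maximum of 2 bits per block.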

Thus K(x) can be considered as the entropy and $K(y \mid x)$ as the conditional entropy. The information in x about y is $I(x, y) = K(y) - K(y \mid x)$.

Remark 4. For arbitrary words x and y,

$K(y \mid x) \lesssim K(y) + c$ and $K(x, y) = K(x) + K(y \mid x) + c \log |x|$

Indeed, consider a special code for (x, y) of the form $p_1 p_2$, where $p_1$ is a self-delimiting code for x and $p_2$ is a code for y. We have

$K(x, y) \lesssim \min_{A, B:\ |A| < c,\ |B| < c}\ \min_{(p_1, p_2, t):\ A_t(p_1) = x,\ B_t(p_2) = y} (|p_1| + |p_2|)$

This is the required statement.

Note that the measure of the information I(x, y) is non-negative only asymptotically, for long x and y. The logarithmic correction term can be attributed to the individual description of x, in contrast to the traditional description in terms of distributions.

6 Frequency Rates

The stability of frequency rates that is assumed a priori in the conventional concept of probability can be deduced in the algorithmic theory

Denote the empirical rate of occurrences of 1 in x by f(x, 1). The frequency rates stability can be stated as follows.

Theorem 11. Given an L-random $x^\infty$ and c > 0, for each word x in it

$|f(x, 1) - 1/2|^2 \lesssim c \log n / n$

where c does not depend on n.

Proof. Use a special code p for x as follows. Let k = n f(x, 1) and p = (k, j), where j = 1, ..., C enumerates all words x of length n with k units. Use the prefix codes for (k, j) of the form $\bar k j$ with $|\bar k| \lesssim 2 \log n$. Thus

$K(x) \lesssim |(\bar k, j)| \lesssim 2 \log n + \log C$

Using the Stirling formula, we find that $\log C \le n H(k/n) + c \log n$, where the entropy $H(f) = -f \log f - (1 - f) \log(1 - f)$, f = k/n. It satisfies the inequality $H(f) \le 1 - 2 (f - 1/2)^2$. Combining these formulas, we obtain the desired result.
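The two estimates used in this proof can be checked numerically. The sketch below assumes base-2 logarithms and verifies, on a grid of values, the entropy inequality and the standard bound of the logarithm of the binomial coefficient by n H(k/n) (without the additional logarithmic term). It is a sanity check, not a proof, and the function names are ours.

```python
import math


def binary_entropy(f):
    """H(f) = -f log f - (1 - f) log(1 - f), in bits, with H(0) = H(1) = 0."""
    if f in (0.0, 1.0):
        return 0.0
    return -f * math.log2(f) - (1 - f) * math.log2(1 - f)


def entropy_gap(f):
    """Value of 1 - 2 (f - 1/2)^2 - H(f); non-negative if the inequality holds."""
    return 1 - 2 * (f - 0.5) ** 2 - binary_entropy(f)


def log_words_with_k_units(n, k):
    """log C, where C is the number of words of length n with k units."""
    return math.log2(math.comb(n, k))


if __name__ == "__main__":
    worst = min(entropy_gap(i / 1000) for i in range(1001))
    print("smallest gap on the grid:", worst)
    n = 50
    print(all(log_words_with_k_units(n, k) <= n * binary_entropy(k / n) + 1e-9
              for k in range(n + 1)))
```

The gap vanishes only at f = 1/2, which is why any deviation of the frequency from 1/2 translates into a saving in code length.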

Remark 5. If $|f(x, 1) - 1/2|^2 > c/n$, then $K(x) \lesssim n - \frac{1}{2} \log n + c$. This inequality shows the effect of a regularity when the number of units is too close to n/2.

The refinement is natural. We consider a partition of $x^\infty = \{x\}$ into blocks of digits b of the identical length |b| = l. Denote by f(x, b) the relative number of blocks $b_i = b$ in the partition of a word x of length n = kl. Denote $\pi = 2^{-l}$.

Theorem 12. Given an L-random sequence $x^\infty = \{x\}$ and a block of digits b of length l, for all words x of length n we have

$|f(x, b) - 2^{-l}|^2 \lesssim c(b) \log n / n$

A number of other specifically probabilistic laws, deduced previously by intuitive reasoning, can be proved similarly.

7 Prefix Complexity

In 1974-1975 another approach to complexity was developed, starting from the concept of prefix complexity (by L. A. Levin, P. Gacs, G. J. Chaitin [10-12]).

Definition 9. A set of words is called prefix-free if there are no pairs of different words such that one is the beginning of the other.

Lemma 1. (1) If $\{p_i\}$ is a prefix-free set, $n_i = |p_i|$, i = 1, 2, ..., then the Kraft inequality holds:

$\sum_{i = 1, 2, \dots} 2^{-n_i} \le 1$

(2) If numbers $n_1, n_2, \dots$ satisfy the Kraft inequality, then one can find binary words $p_1, p_2, \dots$ of length $n_1, n_2, \dots$ such that the set $\{p_i\}$ is prefix-free.

These words can be constructed by the well-known Fano-Shannon procedure.
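Part (2) of the lemma can be illustrated by a standard cumulative-sum construction, given here as a sketch in the spirit of the procedure mentioned above rather than as that procedure itself. The assumptions are ours: a finite list of positive lengths, exact arithmetic with fractions, and hypothetical function names.

```python
from fractions import Fraction


def kraft_sum(lengths):
    """Exact value of the sum of 2^(-n_i) over the given lengths."""
    return sum(Fraction(1, 2 ** n) for n in lengths)


def prefix_code(lengths):
    """Build binary words with the given positive lengths forming a prefix-free set.

    Words are assigned in order of increasing length; each word consists of the
    first n digits of the binary expansion of the sum of 2^(-length) over the
    words assigned before it.
    """
    if kraft_sum(lengths) > 1:
        raise ValueError("the lengths violate the Kraft inequality")
    words = {}
    acc = Fraction(0)
    for i in sorted(range(len(lengths)), key=lambda i: lengths[i]):
        n = lengths[i]
        words[i] = format(int(acc * 2 ** n), f"0{n}b")  # acc is a multiple of 2^(-n)
        acc += Fraction(1, 2 ** n)
    return [words[i] for i in range(len(lengths))]


def is_prefix_free(words):
    """True if no word is the beginning of a different word."""
    return not any(a != b and b.startswith(a) for a in words for b in words)


if __name__ == "__main__":
    code = prefix_code([2, 1, 3, 3])
    print(code, is_prefix_free(code))
```

Sorting by length guarantees that every new word starts beyond the interval of binary fractions already covered by the shorter words, so no word can be the beginning of another.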

Definition 10 An algorithm is called a prefix algorithm if its domain is a prefix-free set The prefix complexity of a word x with respect to a prefix alshygorithm A is defined as the process Kp(x A) = Kp(x AN) N = 1 2 where

KP(xAN)= min ||p|| (pt)ltN At=x

The set of prefix algorithms is an enumerable set

Theorem 13 There exists a universal prefix algorithm V such that for any prefix algorithm A we have

KPx) d= KP(x V)ltKP(x A) + cA

To deal with prefix algorithms we notice that we can recover the word x = 0n (n zeros) from n but we cannot encode numbers n as simple integers since they are not prefix-free Using self-delimiting codes we obtain prefix-free codes of length n + log n

Remark 6. $K(x) \le K_P(x) \le K(x) + \log(z)$.

Remark 7. $K_P(xy) \le K_P(x) + K_P(y) + c$. In contrast to $K(x)$, here we do not need an end marker for the word $x$, since $x$ is recognized as a prefix.

Theorem 14 [12]. For any fixed length $n$ of words $x$ we have $\max_x K_P(x) \ge n + \log n - c$.

Theorem 15 [13]. An infinite sequence $x^\infty$ is Martin-Löf random iff $K_P(x) \ge |x| - c$ for all words $x$.


For most of $x^\infty$ we have $K_P(x) \ge |x| - c$ for all $x$. Thus the prefix complexity of almost all sequences fluctuates within the bounds $|x|$ and $|x| + \log|x|$ (with accuracy up to $c$).

8 Universal Probability

The idea of a universal a priori probability was put forward by Solomonoff in [4]. For a binary word $x$ he introduced the probability $P(x) = 2^{-|p(x)|}$, where $p(x)$ is a minimal description of $x$. However,

$\sum_x 2^{-|p(x)|} = \infty$.

To obtain normalizable algorithmic probabilities, the Kraft inequality for a prefix-free set was proposed, and this led to the development of the theory of prefix complexity [10-12]. Let us reformulate its basic results in a successively constructive form.

Definition 11. The algorithmic probability of $x$ is defined by the process

$P(x) = \{2^{-K_P(x, N)}\}$, $N = 1, 2, \dots$

Example. If $x = 0^n$, then $K_P(x) \le \log n + 2\log\log n + c$. Hence $P(x) \ge c/(n\log^2 n)$.
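The bound $\log n + 2\log\log n + c$ in the Example is what a self-delimiting code for the integer $n$ achieves. As an illustration only (the text does not say which self-delimiting code is meant), the sketch below uses the Elias delta code, whose code words form a prefix-free set and whose length is $\lfloor\log_2 n\rfloor + 2\lfloor\log_2(\lfloor\log_2 n\rfloor + 1)\rfloor + 1$ bits.

```python
import math

def elias_gamma(n: int) -> str:
    """Elias gamma code of n >= 1: (len - 1) zeros, then binary(n)."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def elias_delta(n: int) -> str:
    """Elias delta code of n >= 1: gamma(len(binary(n))), then binary(n) without its leading 1."""
    b = bin(n)[2:]
    return elias_gamma(len(b)) + b[1:]

def decode_delta(bits: str):
    """Decode one delta code word from the front of bits; return (n, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    length = int(bits[zeros:2 * zeros + 1], 2)  # number of binary digits of n
    start = 2 * zeros + 1
    n = int("1" + bits[start:start + length - 1], 2)
    return n, bits[start + length - 1:]

for n in (1, 10, 1000, 10 ** 6):
    word = elias_delta(n)
    bound = math.log2(n) + 2 * math.log2(math.log2(n) + 1) + 1
    print(n, len(word), round(bound, 2))
```

Because no code word is a prefix of another, a decoder reading from left to right knows where the description of $n$ ends, which is exactly what a prefix algorithm needs in order to output $0^n$.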

Definition 12. The universal a priori probability is defined by $Q(x) = \{Q(x, U, N)\}$, $N = (p, t) = 1, 2, \dots$, where $U$ is the universal prefix algorithm and

$Q(x, U, N) = Q(x, U, N-1) + \mathrm{ind}(U_t(p) = x)\, 2^{-|p|}$,

where the indicator function equals 1 iff $U_t(p)$ halts exactly at step number $t$, and 0 otherwise.

Since the mass problem of the universal machine halting is algorithmically unsolvable, the sequence $Q(x)$ has no ceiling.

The following Coding Theorem shows that these two formulations define processes differing by no more than a constant.

Theorem 16. For each $x$ we have $K_P(x) \asymp -\log Q(x)$.

In [14] a non-constructive infinite binary fraction was considered:

$\Omega = \sum_x Q(x) \le 1$.


The real number $\Omega$ was called the universal algorithm halting probability. It can be interpreted as a process $\Omega(N)$, $N = 1, 2, \dots$, with

$\Omega(N) = \sum_{(x,p,t) \le N} \mathrm{ind}(U_t(p) = x)\, 2^{-|p|}$,

where the indicator function equals 1 iff $U_t(p)$ halts exactly at the moment $t$ yielding $x$, and 0 otherwise.

The monotone increasing sequence $\Omega(N)$ is bounded from above and has no ceiling. Knowing the first digits of $\Omega(N)$, $N = 1, 2, \dots$, we can accumulate in $\Omega$ solutions of all constructive problems of bounded complexity. C. Bennett and M. Gardner would call $\Omega$ the number of Wisdom [15].

9 Sequentially Coding Algorithms

We suggest the following extension of complexity theory, obtained by restricting attention to algorithms that code sequentially from left to right.

A set $P$ of code words is called code-complete if any half-infinite sequence can be represented as a concatenation of code words from $P$.

Definition 13. A one-to-one constructive function $T: X \leftrightarrow Y$ is called a coding table if it is defined on code-complete prefix-free sets $X$ and $Y$.

Definition 14. An algorithm $A$ evaluating a coding table $T: X \leftrightarrow Y$ is called a sequential coder, or an S-algorithm, if

(1) for any concatenation $x = x_1 x_2 \dots x_k$ of words $x_i$ from $X$ we have $A(x) = A(x_1)A(x_2)\dots A(x_k)$;

(2) for any concatenation $y = A(x_1)A(x_2)\dots A(x_k)$ we also have $A(x_1 x_2 \dots x_k) = y$.

The set of S-algorithms is recursively enumerable.
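Definitions 13 and 14 can be made concrete with a small finite coding table. The sketch below is only an illustration under stated assumptions: the table (with $X = \{0, 10, 11\}$ and $Y = \{1, 00, 01\}$) and the function names are hypothetical, and the coder simply parses its input greedily from left to right, which is unambiguous because $X$ is prefix-free.

```python
def make_sequential_coder(table):
    """Build a left-to-right coder from a dict mapping a prefix-free set X to words in Y."""
    max_len = max(len(w) for w in table)

    def encode(x: str) -> str:
        out, i = [], 0
        while i < len(x):
            for j in range(i + 1, min(len(x), i + max_len) + 1):
                word = x[i:j]
                if word in table:
                    out.append(table[word])
                    i = j
                    break
            else:
                # Incomplete trailing word: a sequential coder would wait for more input.
                break
        return "".join(out)

    return encode

# Hypothetical coding table T: X <-> Y; both sets are prefix-free and code-complete.
T = {"0": "1", "10": "00", "11": "01"}
A = make_sequential_coder(T)
A_inv = make_sequential_coder({y: x for x, y in T.items()})

x = "0" + "10" + "11" + "0"
print(A(x), A_inv(A(x)) == x)
```

Property (1) of Definition 14 corresponds to `A(x1 + x2) == A(x1) + A(x2)` whenever `x1` and `x2` are concatenations of words from the table's domain.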

Definition 15. The S-complexity of a word $x$ with respect to an S-algorithm $A$ is a process $K_S(x, A) = \{K_S(x, A, N)\}$, $N = 1, 2, \dots$, where

$K_S(x, A, N) \stackrel{\mathrm{def}}{=} \min_{(p,t) \le N,\ A_t(p) = x} |p|$.

Theorem 17. There exists a (universal) S-algorithm $V$ such that for any S-algorithm $A$ we have

$K_S(x) = K_S(x, V) \le K_S(x, A) + c_A$,

where $c_A$ does not depend on $x$.


Since the class of S-algorithms contains the identity algorithm (with $A(0) = 0$, $A(1) = 1$), we have $K_S(x) \le |x| + c$. If $f(x)$ is a partially computable function evaluated by some S-algorithm, then $K_S(f(x)) \le K_S(x) + c$.

Obviously $K(x) \le K_S(x) \le K_P(x)$. But we only have $K_S(xy) \le K_P(x) + K_S(y)$, since the sequentially coding algorithm can separate the leftmost prefix from the remaining ones.

For words $x = 0^n$ we have $K_S(x) \le \log n$. For almost all sequences $x^\infty$, for all sufficiently long words $x$ in it, and for any $c > 1$, we have $K_S(x) \ge K(x) \ge |x| - c\log|x|$.

Definition 16. A binary sequence is called S-random if for all words $x$ we have $K_S(x) \ge |x| - c\log|x|$, where $c$ does not depend on $x$.

Definition 17. A binary sequence $x^\infty = \{x\}$ is algorithmically stationary if for any block $b$ of digits in it there exists the limit $\lim_{|x| \to \infty} f(b, x)$.

Any L-random sequence is algorithmically stationary.

Lemma 2. If a binary sequence $y^\infty = \{y\}$ is produced from an algorithmically stationary sequence $x^\infty = \{x\}$ by an S-algorithm $A$, so that $y = A(x)$, then the sequence $y^\infty$ is also algorithmically stationary.

Proof. Suppose $y^\infty$ is produced from $x^\infty$ by $y = A(x)$, where $A$ is an S-algorithm. The algorithm $A$ defines a prefix-free domain $X$ and a code-complete range of values $Y$. Choose a block of digits $b$. Using the completeness of $Y$, we have $b = y_1 y_2 \dots y_k$, where $y_i \in Y$, $i = 1, 2, \dots, k$. By the sequential property we can find a program $a = x_1 x_2 \dots x_k$ with all $x_i \in X$ such that $A(a) = b$. The frequencies satisfy $f(a, x) = f(b, y)$. This proves the lemma.

Lemma 3. $K_S(K_S(x)) \le K_S(x) + c$.

Proof. Note that S-algorithms are such that the composition $AB$ of two S-algorithms $A$ and $B$ is again an S-algorithm. For a fixed $N$ we find

$K_S(x, N) = \min_{A \le c}\ \min_{(p,t) \le N,\ A_t(p) = x} |p|$,

and for the minimizing value $p = p_0$

$K_S(p_0, M) = \min_{B \le c}\ \min_{(y,t) \le M,\ B_t(y) = p_0} |y|$.

Let $y = y_0$ be the minimizing value of a code for $p_0$. Since for some $t$ we have $A_t B_t(y) = x$ (if both algorithms halt), it is clear that $K_S(x) \le |y| + c$. We obtain $K(x) \le K_S(p) \approx K_S(K_S(x))$.

Theorem 18. An infinite binary sequence $x^\infty$ is algorithmically stationary iff it is an S-algorithm transformation of some S-random sequence.

Proof. First assume that $y = A(x)$ for all $x \in x^\infty$ and $K_S(x) \ge |x| - c\log|x|$. We have $K(x) \ge K_S(x) - \log|x|$. So $K(x) \ge |x| - c'\log|x|$, $c' \ge c + 1$. By Theorem 12 the sequence $x^\infty$ is stationary.

To prove the converse, assume that $x^\infty = \{x\}$ is stationary. We find $\min K_S(x, N)$ for $(p, t) \le N$; let $p$ be a minimum code for $x$, with $A_t(p) = x$ for some $t$ if $A_t(p)$ halts. Here $A: P \to X$ has the domain $P$ and the range $X$, both prefix-free and code-complete. Since $X$ is code-complete, we can express $x$ as $x_1 x_2 \dots x_k$ with $x_i \in X$ and $A(p_i) = x_i$ with $p_i \in P$, $i = 1, \dots, k$. By Lemma 3 we have $K_S(p) \ge |p| - c$. It follows that $p = p_1 p_2 \dots p_k$ is log-incompressible. The proof is complete.

The comparison of the different notions of complexity and randomness shows that they differ by no more than a logarithmic term. Taking the stationarity theorems into account, it seems plausible to suggest a common definition of randomness of infinite sequences $x^\infty = \{x\}$ as incompressibility up to the term $c\log|x|$, where $c$ does not depend on $x$.

In conclusion, I have the pleasure of expressing my sincere gratitude to Prof. V. M. Maximov for encouraging discussions.

References

1. A. N. Kolmogorov, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer-Verlag, 1933; in English: Chelsea, New York, 1956).
2. A. N. Kolmogorov, Problems of Information Transmission 1, 1, 1-7 (1965).
3. L. Löfgren, Computer and Information Sciences 2, 165-175 (1967).
4. R. J. Solomonoff, Progress of Symposia in Applied Math., AMS 43 (1962); IEEE Trans. on Inform. Theory 4, 5, 662-664 (1968).
5. Li Ming, P. Vitanyi, An Introduction to Kolmogorov Complexity (Springer, Berlin-Heidelberg-New York, 1993).
6. P. Martin-Löf, Information and Control 9, 602-619 (1966); Zeits. Wahrsch. Verw. Geb. 19, 225-230 (1971).
7. A. N. Shiryaev, The Annals of Probability 17, 3, 866-944 (1989).
8. G. J. Chaitin, J. ACM 16, 145-159 (1969).
9. A. N. Kolmogorov, Russian Math. Surveys 38, 4, 27-36 (1983).
10. L. A. Levin, Problems of Information Transmission 10, 3, 206-210 (1974).
11. P. Gacs, Soviet Math. Doklady 15, 1477-1480 (1974).
12. G. J. Chaitin, J. ACM 22, 329-340 (1975).
13. V. V. Vjugin, Semiotika i Informatika (in Russian) 16, 14-43 (1981); V. A. Uspenskii, SIAM J. Theory Probab. Appl. 32, 387-412 (1987).
14. R. J. Solomonoff, Information and Control 7, 1-22 (1964).
15. C. H. Bennett, M. Gardner, Sci. American 241, 11, 20-34 (1979).


STRUCTURE OF PROBABILISTIC INFORMATION AND QUANTUM LAWS

JOHANN SUMMHAMMER
Atominstitut der Österreichischen Universitäten
Stadionallee 2, A-1020 Vienna, Austria
E-mail: summhammer@ati.ac.at

The acquisition and representation of basic experimental information under the probabilistic paradigm is analysed. The multinomial probability distribution is identified as governing all scientific data collection, at least in principle. For this distribution there exist unique random variables whose standard deviation becomes asymptotically invariant of physical conditions. Representing all information by means of such random variables gives the quantum mechanical probability amplitude and a real alternative. For predictions, the linear evolution law (Schrödinger or Dirac equation) turns out to be the only way to extend the invariance property of the standard deviation to the predicted quantities. This indicates that quantum theory originates in the structure of gaining pure, probabilistic information, without any mechanical underpinning.

1 Introduction

The probabilistic paradigm proposed by Born is well accepted for comparing experimental results to quantum theoretical predictions. It states that only the probabilities of the outcomes of an observation are determined by the experimental conditions. In this paper we wish to place this paradigm first. We shall investigate its consequences without assuming quantum theory or any other physical theory. We look at this paradigm as defining the method of the investigation of nature. This consists in the collection of information in probabilistic experiments performed under well controlled conditions, and in the efficient representation of this information. Realising that the empirical information is necessarily finite permits us to put limits on what can at best be extracted from this information, and therefore also on what can at best be said about the outcomes of future experiments. At first, this has nothing to do with laws of nature. But it tells us what optimal laws look like under probability. Interestingly, the quantum mechanical probability calculus is found as almost the best possibility. It meets with difficulties only when it must make predictions from a low amount of input information. We find that the quantum mechanical way of prediction does nothing but take the initial uncertainty volume of the representation space of the finite input information and move this volume about, without compressing or expanding it. However, we emphasize that any mechanistic imagery of particles, waves, fields, even space, must be seen as what it is: the human brain's way of portraying sensory impressions, mere images in our minds. Taking them as corresponding to anything in nature, while going a long way in the design of experiments, can become very counterproductive to science's task of finding laws. Here, the correct path seems to be the search for invariant structures in the empirical information, without any models. Once embarked on this road, the old question of how nature really is no longer seeks an answer in the muscular domain of mass, force, torque, and the like, which classical physics took as such unshakeable primary notions (not surprisingly, considering our ape origin, I cannot help commenting). Rather, one asks: Which of the structures principally detectable in probabilistic information are actually realized?

In the following sections we shall analyse the process of scientific investigation of nature under the probabilistic paradigm. We shall first look at how we gain information, then at how we should best capture this information into numbers, and finally at what the ideal laws for making predictions should look like. The last step will bring the quantum mechanical time evolution, but will also indicate a problem due to finite information.

2 Gaining experimental information

Under the probabilistic paradigm, basic physical observation is not very different from tossing a coin or blindly picking balls from an urn. One sets up specific conditions and checks what happens. And then one repeats this many times to gather statistically significant amounts of information. The difference to classical probabilistic experiments is that in quantum experiments one must carefully monitor the conditions and ensure they are the same for each trial. Any noticeable change constitutes a different experimental situation and must be avoided (a).

(a) Strictly speaking, identical trials are impossible. A deeper analysis of why one can neglect remote conditions might lead to an understanding of the notion of spatial distance, about which relativity says nothing, and which is badly missing in today's physics.

Formally, one has a probabilistic experiment in which a single trial can give $K$ different outcomes, one of which happens. The probabilities of these outcomes, $p_1, \dots, p_K$ ($\sum_j p_j = 1$), are determined by the conditions. But they are unknown. In order to find their values, and thereby the values of physical quantities functionally related to them, one does $N$ trials. Let us assume the outcomes $j = 1, \dots, K$ happen $L_1, \dots, L_K$ times, respectively ($\sum_j L_j = N$). The $L_j$ are random variables, subject to the multinomial probability distribution. Listing $L_1, \dots, L_K$ represents the complete information gained in the $N$ trials. The customary way of representing the information is, however, by other random variables, the so-called relative frequencies $\nu_j = L_j/N$. Clearly, they also obey the multinomial probability distribution.

Examples

A trial in a spin-1/2 Stern-Gerlach experiment has two possible outcomes. This experiment is therefore governed by the binomial probability distribution. A trial in a GHZ experiment has eight possible outcomes, because each of the three particles can end up in one of two detectors [2]. Here the relative frequencies follow the multinomial distribution of order eight. Measuring an intensity in a detector which can only fire or not fire is in fact an experiment where one repeatedly checks whether a firing occurs in a sufficiently small time interval. Thus one has a binomial experiment. If the rate of firing is small, the binomial distribution can be approximated by the Poisson distribution.

We must emphasize that the multinomial probability distribution is of utmost importance to physics under the probabilistic paradigm. This can be seen as follows: The conditions of a probabilistic experiment must be verified by auxiliary measurements. These are usually coarse classical measurements, but should actually also be probabilistic experiments of the most exacting standards. The probabilistic experiment of interest must therefore be done by ensuring that, for each of its trials, the probabilities of the outcomes of the auxiliary probabilistic experiments are the same. Consequently, empirical science is characterized by a succession of data-takings of multinomial probability distributions of various orders. The laws of physics are contained in the relations between the random variables from these different experiments. Since the statistical verification of these laws is again ruled by the properties of the multinomial probability distribution, we should expect that the inner structure of the multinomial probability distribution will appear in one form or another in the fundamental laws of physics. In fact, we might be led to the bold conjecture that, under the probabilistic paradigm, basic physical law is no more than the structures implicit in the multinomial probability distribution. There is no escape from this distribution. Whichever way we turn, we stumble across it as the unavoidable tool for connecting empirical data to physical ideas.

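The multinomial probability distribution of order $K$ is obtained when calculating the probability that, in $N$ trials, the outcomes $1, \dots, K$ occur $L_1, \dots, L_K$ times, respectively:

$$\mathrm{Prob}(L_1, \dots, L_K \,|\, N, p_1, \dots, p_K) = \frac{N!}{L_1! \cdots L_K!}\, p_1^{L_1} \cdots p_K^{L_K}. \qquad (2.1)$$

The expectation values of the relative frequencies are

$$\bar{\nu}_j = p_j, \qquad (2.2)$$

and their standard deviations are

$$\Delta\nu_j = \sqrt{\frac{p_j(1 - p_j)}{N}}. \qquad (2.3)$$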
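The multinomial probability and the first two moments of a relative frequency can be checked by brute-force enumeration for small $N$. This is a minimal sketch; the function names and the example probabilities are illustrative assumptions. It compares the enumerated standard deviation with the standard multinomial result $\sqrt{p_j(1-p_j)/N}$.

```python
import math
from itertools import product

def multinomial_prob(counts, probs):
    """Probability of observing the given outcome counts in sum(counts) trials."""
    n = sum(counts)
    coeff = math.factorial(n)
    for c in counts:
        coeff //= math.factorial(c)  # stays an integer at every step
    p = float(coeff)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

def relative_frequency_moments(n, probs, j):
    """Exact total probability, mean and standard deviation of nu_j = L_j / n."""
    total = mean = second = 0.0
    for counts in product(range(n + 1), repeat=len(probs)):
        if sum(counts) != n:
            continue
        w = multinomial_prob(counts, probs)
        nu = counts[j] / n
        total += w
        mean += w * nu
        second += w * nu * nu
    return total, mean, math.sqrt(second - mean ** 2)

N, p = 12, (0.2, 0.3, 0.5)  # hypothetical example values
total, mean, std = relative_frequency_moments(N, p, 0)
print(total, mean, std, math.sqrt(p[0] * (1 - p[0]) / N))
```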

3 Efficient representation of probabilistic information

The reason why probabilistic information is most often represented by the relative frequencies $\nu_j$ seems to be history. Probability theory has originated as a method of estimating fractions of countable sets, when inspecting all elements was not possible (good versus bad apples in a large plantation, desirable versus undesirable outcomes in games of chance, etc.). The relative frequencies and their limits were the obvious entities to work with. But the information can be represented equally well by other random variables $\chi_j$, as long as these are one-to-one mappings $\chi_j(\nu_j)$, so that no information is lost. The question is whether there exists a most efficient representation.

To answer this, let us see what we know about the limits $p_1, \dots, p_K$ before the experiment, but having decided to do $N$ trials. Our analysis is equivalent for all $K$ outcomes, so that we can pick out one and drop the subscript. We can use Chebyshev's inequality [4] to estimate the width of the interval to which the probability $p$ of the chosen outcome is pinned down (b).

(b) Chebyshev's inequality states: For any random variable whose standard deviation exists, the probability that the value of the random variable deviates by more than $k$ standard deviations from its expectation value is less than or equal to $k^{-2}$. Here, $k$ is a free confidence parameter greater than 1.

If $N$ is not too small, we get

$$w_p = 2k\sqrt{\frac{\nu(1 - \nu)}{N}}, \qquad (3.1)$$

where $k$ is a free confidence parameter (Eq. (3.1) is not valid at $\nu = 0$ or 1). Before the experiment we do not know $\nu$, so we can only give the upper limit

$$w_p \le \frac{k}{\sqrt{N}}. \qquad (3.2)$$

But we can be much more specific about the limit $\bar{\chi}$ of the random variable $\chi(\nu)$, for which we require that, at least for large $N$, the standard deviation $\Delta\chi$ shall be independent of $p$ (or of $\bar{\chi}$ for that matter, since there will exist a function $p(\bar{\chi})$):

$$\Delta\chi = \frac{C}{\sqrt{N}}, \qquad (3.3)$$

where $C$ is an arbitrary real constant. For the derivation of the function $\chi(\nu)$ it is easiest to make use of the illustration in Fig. 1. Although it already shows the solution, the argument is general enough, so that the particular form of the discussed function does not matter. First we note that $\chi(\nu)$ shall be smooth, differentiable, and strictly monotonic. For sufficiently large $N$ the probability distribution of $\nu$ can be approximated by a normal distribution centered at $\bar{\nu}$ and with standard deviation $\Delta\nu$. In other words, it will approach the gaussian form

$$\mathrm{Prob}(\nu \,|\, N, p) \approx r \exp\left[-\frac{(\nu - \bar{\nu})^2}{2(\Delta\nu)^2}\right], \qquad (3.4)$$

where $r$ is the normalization factor. But clearly, the corresponding probability distribution of $\chi$ will also tend to the gaussian form, of standard deviation $\Delta\chi$. (For instance, take the probability distributions of $\nu$ and $\chi$ for $p = .5$. These are the ones in the middle, as shown in Fig. 1.) And if $N$ is large, both $\Delta\nu$ and $\Delta\chi$ will be small, so that, in the range of $\chi$ and $\nu$ where the probability is significantly different from zero, the curve $\chi(\nu)$ can be approximated by its tangent

$$\chi \approx \chi(\bar{\nu}) + \left(\frac{\partial\chi}{\partial\nu}\right)_{\nu = \bar{\nu}} (\nu - \bar{\nu}). \qquad (3.5)$$

Then it follows that the characteristic width of the probability distribution of $\chi$, which is $\Delta\chi$, will be proportional to the characteristic width of the probability distribution of $\nu$, which is $\Delta\nu$. The proportionality constant will be $|\partial\chi/\partial\nu|$, because this is by how much the distribution for $\nu$ gets squeezed or stretched to become the one for $\chi$. So we have for large $N$

$$\frac{\Delta\chi}{\Delta\nu} = \left|\frac{\partial\chi}{\partial\nu}\right|. \qquad (3.6)$$

Inserting (2.3) and (3.3) into (3.6) and integrating yields

$$\chi = C \arcsin(2\nu - 1) + \theta, \qquad (3.7)$$

where $\theta$ is an arbitrary real constant. For comparison with $\nu$ we confine $\chi$ to [0,1], and thus set $C = \pi^{-1}$ and $\theta = .5$, as was already done in Fig. 1. Then we have $\Delta\chi = 1/(\pi\sqrt{N})$, and upon application of Chebyshev's inequality we get the interval $w_\chi$ to which we can pin down the unknown limit $\bar{\chi}$ as

$$w_\chi = \frac{2k}{\pi\sqrt{N}}. \qquad (3.8)$$

Clearly, this is narrower than the upper limit for $w_p$ in eq. (3.2). Having done no experiment at all, we have better knowledge on the value of $\bar{\chi}$ than on the value of $p$, although both can only be in the interval [0,1]. And note that the actual experimental data will add nothing to the accuracy with which we know $\bar{\chi}$, but they may add to the accuracy with which we know $p$. Nevertheless, even with data, $w_p$ may still be larger than $w_\chi$, especially when $p$ is around 0.5.
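The claimed invariance can be examined numerically. The sketch below assumes the transformation $\chi = \arcsin(2\nu - 1)/\pi + 0.5$ with $N = 100$ and the five values of $p$ quoted for Fig. 1; it computes the exact standard deviations of $\nu$ and $\chi$ from the binomial distribution rather than by sampling. Function names are illustrative.

```python
import math

def binomial_pmf(n: int, l: int, p: float) -> float:
    """Probability of l successes in n trials with success probability p."""
    return math.comb(n, l) * p ** l * (1 - p) ** (n - l)

def chi(nu: float) -> float:
    """Real efficient random variable confined to [0, 1]."""
    return math.asin(2 * nu - 1) / math.pi + 0.5

def std_of(f, n: int, p: float) -> float:
    """Exact standard deviation of f(L / n) for L ~ Binomial(n, p)."""
    weights = [binomial_pmf(n, l, p) for l in range(n + 1)]
    values = [f(l / n) for l in range(n + 1)]
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return math.sqrt(var)

N = 100
for p in (0.07, 0.25, 0.50, 0.75, 0.93):
    print(p, round(std_of(lambda v: v, N, p), 4), round(std_of(chi, N, p), 4))
print("asymptotic value 1/(pi*sqrt(N)):", round(1 / (math.pi * math.sqrt(N)), 4))
```

The second column (the spread of $\nu$) changes with $p$, while the third column should stay close to $1/(\pi\sqrt{N})$, with the largest deviations expected for $p$ near 0 or 1, where the gaussian approximation is poorest.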

For the representation of information the random variable $\chi$ is the proper choice, because it disentangles the two aspects of empirical information: the number of trials $N$, which is determined by the experimenter, not by nature, and the actual data, which are only determined by nature. The experimenter controls the accuracy $w_\chi$ by deciding $N$; nature supplies the data $\chi$, and thereby the whereabouts of $\bar{\chi}$. In the real domain the only other random variables with this property are the linear transformations afforded by $C$ and $\theta$. From the physical point of view $\chi$ is of interest, because its standard deviation is an invariant of the physical conditions as contained in $p$ or $\bar{\chi}$. The random variable $\chi$ expresses empirical information with a certain efficiency, eliminating a numerical distortion that is due to the structure of the multinomial distribution, and which is apparent in all other random variables. We shall call $\chi$ an efficient random variable (ER). More generally, we shall call any random variable an ER whose standard deviation is asymptotically invariant of the limit the random variable tends to, eq. (3.3).

Another graphical depiction of the relation between $\nu$ and $\chi$ can be given by drawing a semicircle of diameter 1 along which we plot $\nu$ (Fig. 2a). By orthogonal projection onto the semicircle we get the random variable $\zeta = [\pi + 2\arcsin(2\nu - 1)]/4$, and thereby $\chi$, when we choose different constants. The drawing also suggests a simple way to obtain a complex ER. We scale the semicircle by an arbitrary real factor $a$, tilt it by an arbitrary angle $\varphi$, and place it into the complex plane as shown in Fig. 2b. This gives the random variable

$$\beta = a\left(\sqrt{\nu(1 - \nu)} + i\nu\right) e^{i\varphi} + b, \qquad (3.9)$$

where $b$ is an arbitrary complex constant. We get a very familiar special case by setting $a = 1$ and $b = 0$:

$$\psi = \left(\sqrt{\nu(1 - \nu)} + i\nu\right) e^{i\varphi}. \qquad (3.10)$$


Figure 1. Functional relation between the random variables $\nu$ and $\chi$, and their respective probability distributions as expected for $N = 100$ trials, plotted for five different values of $p$: .07, .25, .50, .75 and .93. The bar above each probability distribution indicates twice its standard deviation. Notice that the standard deviations of $\nu$ differ considerably for different $p$, while those of $\chi$ are all the same, as required in eq. (3.3).


Figure 2. (a) Graphical construction of the efficient random variable $\zeta$ (and thereby of $\chi$) from the observed relative frequency $\nu$. $\zeta$ is measured along the arc. (b) Similar construction of the efficient random variable $\beta$. It is given by its coordinates in the complex plane. The quantum mechanical probability amplitude $\psi$ is the normalized case of $\beta$, obtained by setting $a = 1$ and $b = 0$.


For large $N$ the probability distribution of $\nu$ becomes gaussian, but also that of any smooth function of $\nu$, as we have already seen in Fig. 1. Therefore the standard deviation of $\psi$ is obtained as

$$\Delta\psi = \left|\frac{\partial\psi}{\partial\nu}\right| \Delta\nu = \frac{1}{2\sqrt{N}}. \qquad (3.11)$$

Obviously the random variable $\psi$ is an ER. It fulfills $|\psi|^2 = \nu$, and we recognize it as the probability amplitude of quantum theory, which we would infer from the observed relative frequency $\nu$. Note, however, that the intuitive way of getting the quantum mechanical probability amplitude, namely by simply taking $\sqrt{\nu}\exp(i\alpha)$, where $\alpha$ is an arbitrary phase, does not give us an ER.
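The contrast between $\psi$ and the naive amplitude $\sqrt{\nu}\exp(i\alpha)$ can be made visible with an exact binomial computation. In the sketch below, the spread of a complex random variable is taken as $\sqrt{E|z - E z|^2}$, which is an assumption about how the standard deviation is meant; phases are set to zero and $N = 100$ is an arbitrary choice.

```python
import cmath
import math

def binomial_pmf(n: int, l: int, p: float) -> float:
    return math.comb(n, l) * p ** l * (1 - p) ** (n - l)

def psi(nu: float, phase: float = 0.0) -> complex:
    """Complex efficient random variable with a = 1, b = 0."""
    return (math.sqrt(nu * (1 - nu)) + 1j * nu) * cmath.exp(1j * phase)

def naive_amplitude(nu: float, phase: float = 0.0) -> complex:
    """The 'intuitive' amplitude sqrt(nu) * exp(i * phase)."""
    return math.sqrt(nu) * cmath.exp(1j * phase)

def complex_std(f, n: int, p: float) -> float:
    """sqrt(E|z - E z|^2) for z = f(L / n), L ~ Binomial(n, p)."""
    weights = [binomial_pmf(n, l, p) for l in range(n + 1)]
    values = [f(l / n) for l in range(n + 1)]
    mean = sum(w * v for w, v in zip(weights, values))
    return math.sqrt(sum(w * abs(v - mean) ** 2 for w, v in zip(weights, values)))

N = 100
for p in (0.25, 0.50, 0.75):
    print(p, round(complex_std(psi, N, p), 4), round(complex_std(naive_amplitude, N, p), 4))
print("1/(2*sqrt(N)):", 1 / (2 * math.sqrt(N)))
```

To first order the spread of the naive amplitude is $\sqrt{(1-p)/(4N)}$, which depends on $p$, whereas the spread of $\psi$ should stay near $1/(2\sqrt{N})$.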

We have now two ways of representing the obtained information by ERs, either the real valued $\chi$ or the complex valued $\beta$. Since the relative frequency of each of the $K$ outcomes of a general probabilistic experiment can be converted to its respective efficient random variable, the information is efficiently represented by the vector $(\chi_1, \dots, \chi_K)$, or by the vector $(\beta_1, \dots, \beta_K)$. The latter is equivalent to the quantum mechanical state vector, if we normalize it: $(\psi_1, \dots, \psi_K)$.

At this point it is not clear whether fundamental science could be built solely on the real ERs $\chi_j$, or whether it must rely on the complex ERs $\beta_j$, and for practical reasons on the normalized case $\psi_j$, as suggested by current formulations of quantum theory. We cannot address this problem here, but mention that working with the $\beta_j$ or $\psi_j$ can lead to nonsensical predictions, while working with the $\chi_j$ never does, so that the former are more sensitive to inconsistencies in the input data [6]. Therefore we use only the $\psi_j$ in the next section, but will not read them as if we were doing quantum theory.

4 Predictions

Let us now see whether the representation of probabilistic information by ERs suggests specific laws for predictions. A prediction is a statement on the expected values of the probabilities of the different outcomes of a probabilistic experiment which has not yet been done, or whose data we just do not yet know, on the basis of auxiliary probabilistic experiments which have been done, and whose data we do know. We intend to make a prediction for a probabilistic experiment with $Z$ outcomes, and wish to calculate the quantities $\phi_s$ ($s = 1, \dots, Z$), which shall be related to the predicted probabilities $P_s$ as $P_s = |\phi_s|^2$. We do not presuppose that the $\phi_s$ are ERs.

We assume we have done $M$ different auxiliary probabilistic experiments of various multinomial order $K_m$, $m = 1, \dots, M$, and we think that they provided all the input information needed to predict the $\phi_s$, and therefore the $P_s$. With (3.10) the obtained information is represented by the ERs $\psi_j^m$, where $m$ denotes the experiment and $j$ labels a possible outcome in it ($j = 1, \dots, K_m$). Then the predictions are

$$\phi_s = \phi_s\left(\psi_1^1, \dots, \psi_{K_1}^1, \dots, \psi_1^M, \dots, \psi_{K_M}^M\right), \qquad (4.1)$$

and their standard deviations are, by the usual convolution of gaussians as approximations of the multinomial distributions,

$$\Delta\phi_s = \sqrt{\sum_{m=1}^{M} \frac{1}{4N_m} \sum_{j=1}^{K_m} \left|\frac{\partial\phi_s}{\partial\psi_j^m}\right|^2}, \qquad (4.2)$$

where $N_m$ is the number of trials of the $m$th auxiliary experiment. If we wish the $\phi_s$ to be ERs, we must demand that the $\Delta\phi_s$ depend only on the $N_m$. (A technical requirement is that in each of the $M$ auxiliary experiments one of the phases of the ERs $\psi_j^m$ cannot be chosen freely; otherwise the second summations in (4.2) could not go to $K_m$, but only to $K_m - 1$.) Then the derivatives in (4.2) must be constants, implying that the $\phi_s$ are linear in the $\psi_j^m$. However, we cannot simply assume such linearity, because (4.1) contains the laws of physics, which cannot be known a priori. But we want to point out that a linear relation for (4.1) has very exceptional properties, so that it would be nice if we found it realized in nature. To be specific, if the $N_m$ are sufficiently large, linearity would afford predictive power which no other functional relation could achieve: It would be sufficient to know the number of trials of each auxiliary probabilistic experiment in order to specify the accuracy of the predicted $\phi_s$. No data would be needed, only a decision how many trials each auxiliary experiment will be given. Moreover, even the slightest increase of the amount of input information, by only doing one more trial in any of the auxiliary experiments, would lead to better accuracy of the predicted $\phi_s$, by bringing a definite decrease of the $\Delta\phi_s$. This latter property is absent in virtually all other functional relations conceivable for (4.1). In fact, most nonlinear relations would allow more input information to result in less accurate predictions. This would undermine the very idea of empirical science, namely that by observation our knowledge about nature can only increase, never just stay the same, let alone decrease. For this reason we assume linearity and apply it to a concrete example.

We take a particle in a one dimensional box of width $w$. Alice repeatedly prepares the particle in a state only she knows. At time $t$ after the preparation Bob measures the position by subdividing the box into $K$ bins of width $w/K$ and checking in which he finds the particle. In $N$ trials Bob obtains the relative frequencies $\nu_1, \dots, \nu_K$, giving a good idea of the particle's position probability distribution at time $t$. He represents this information by the ERs $\psi_j$ of (3.10) and wants to use it to predict the position probability distribution at time $T$ ($T > t$).

First he predicts for $t + dt$. With (4.1) the predicted $\phi_s$ must be linear in the $\psi_j$ if they are to be ERs:

$$\phi_s(t + dt) = \sum_{j=1}^{K} a_{sj}\,\psi_j. \qquad (4.3)$$

Clearly, when $dt \to 0$ we must have $a_{sj} = 1$ for $s = j$ and $a_{sj} = 0$ otherwise, so we can write

$$a_{sj}(t) = \delta_{sj} + g_{sj}(t)\,dt, \qquad (4.4)$$

where $g_{sj}(t)$ are the complex elements of a matrix $G$, and we included the possibility that they depend on $t$. Using matrix notation, and writing the $\phi_s$ and $\psi_j$ as column vectors, we have

$$\vec{\phi}(t + dt) = [1 + G(t)\,dt]\,\vec{\psi}. \qquad (4.5)$$

For a prediction for time $t + 2dt$ we must apply another such linear transformation to the prediction we had for $t + dt$:

$$\vec{\phi}(t + 2dt) = [1 + G(t + dt)\,dt]\,\vec{\phi}(t + dt). \qquad (4.6)$$

Replacing $t + dt$ by $t$, and using $\vec{\phi}(t + dt) = \vec{\phi}(t) + \frac{\partial\vec{\phi}(t)}{\partial t}\,dt$, we have

$$\frac{\partial\vec{\phi}(t)}{\partial t} = G(t)\,\vec{\phi}(t). \qquad (4.7)$$

With (3.10) the input vector was normalized, $|\vec{\psi}|^2 = 1$. We also demand this from the vector $\vec{\phi}$. This results in the constraint that the diagonal elements $g_{ss}$ must be imaginary and the off-diagonal elements must fulfill $g_{sj} = -g_{js}^*$. And then we have obviously an evolution equation just as we know it from quantum theory.
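The role of the constraint on $G$ can be illustrated by iterating the infinitesimal step $[1 + G\,dt]$ numerically. This is a minimal sketch with a time-independent $2 \times 2$ matrix; the numerical entries, the initial vector and the step counts are hypothetical choices, not values from the text. `G_good` has imaginary diagonal elements and off-diagonal elements obeying $g_{sj} = -g_{js}^*$, while `G_bad` violates the off-diagonal condition.

```python
def step(phi, G, dt):
    """One infinitesimal prediction step: phi -> [1 + G dt] phi."""
    K = len(phi)
    return [phi[s] + dt * sum(G[s][j] * phi[j] for j in range(K)) for s in range(K)]

def norm2(phi):
    """Sum of squared moduli, i.e. the total predicted probability."""
    return sum(abs(c) ** 2 for c in phi)

def evolve(phi, G, T, n_steps):
    """Apply n_steps successive steps of size T / n_steps."""
    dt = T / n_steps
    for _ in range(n_steps):
        phi = step(phi, G, dt)
    return phi

# Hypothetical generators for K = 2.
G_good = [[0.3j, 0.5 + 0.2j],
          [-(0.5 - 0.2j), -0.7j]]   # g_21 = -conj(g_12)
G_bad = [[0.3j, 0.5 + 0.2j],
         [0.5 + 0.2j, -0.7j]]       # off-diagonal constraint violated

psi0 = [0.6, 0.8]                   # normalized input vector
print(norm2(evolve(psi0, G_good, 1.0, 20000)))
print(norm2(evolve(psi0, G_bad, 1.0, 20000)))
```

With `G_good` the total probability stays at 1 up to the discretization error of the finite step size, whereas with `G_bad` it drifts away from 1, so the predicted $|\phi_s|^2$ could no longer be read as probabilities.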

For a quantitative prediction we need to know $G(t)$ and the phases $\varphi_j$ of the initial $\psi_j$. We had assumed the $\varphi_j$ to be arbitrary. But now we see that they influence the prediction, and therefore they attain physical significance. $G(t)$ is an anti-Hermitian complex $K \times K$ matrix (the generator of a unitary evolution). For fixed conditions it is independent of time, and with the properties found above it is given by $K^2 - 1$ real numbers. The initial vector $\vec{\psi}$ has $K$ complex components. It is normalized and one phase is free, so that it is fixed by $2K - 2$ real numbers. Altogether $K^2 + 2K - 3 = (K + 3)(K - 1)$ numbers are needed to enable prediction. Since one probabilistic experiment yields $K - 1$ numbers, Bob must do $K + 3$ probabilistic experiments with different delay times between Alice's preparation and his measurement to obtain sufficient input information. But neither Planck's constant nor the particle's mass are needed. It should be noted that this analysis remains unaltered if the initial vector $\vec{\psi}$ is obtained from measurement of joint probability distributions of several particles. Therefore (4.7) also contains entanglement between particles.

5 Discussion

This paper was based on the insight that, under the probabilistic paradigm, data from observations are subject to the multinomial probability distribution. For the representation of the empirical information we searched for random variables which are stripped of numerical artefacts. They should therefore have an invariance property. We found as unique random variables a real and a complex class of efficient random variables (ERs). They capture the obtained information more efficiently than others, because their standard deviation is an asymptotic invariant of the physical conditions. The quantum mechanical probability amplitude is the normalized case of the complex class. It is natural that fundamental probabilistic science should use such random variables rather than any others as the representors of the observed information, and therefore as the carriers of meaning.

Using the ERs for prediction has given us an evolution prescription which is equivalent to the quantum theoretical way of applying a sequence of infinitesimal rotations to the state vector in Hilbert space [7]. It seems that simply analysing how we gain empirical information, and what we can say from it about expected future information, and not succumbing to the lure of the question of what is behind this information, can give us a basis for doing physics. This confirms the operational approach to science. And it is in support of Wheeler's It-from-Bit hypothesis [8], Weizsäcker's ur-theory [9], Eddington's idea that information increase itself defines the rest [10], Anandan's conjecture of absence of dynamical laws [11], Bohr and Ulfbeck's hypothesis of mere symmetry [12], or the recent 1 Bit - 1 Constituent hypothesis of Brukner and Zeilinger [13].

In view of the analysis presented here, the quantum theoretical probability calculus is an almost trivial consequence of probability theory, but not as applied to objects or anything physical, but as applied to the naked data of probabilistic experiments. If we continue this idea, we encounter a deeper problem, namely whether the space which we consider physical, this 3- or higher dimensional manifold in which we normally assume the world to unfurl [14], cannot also be understood as a peculiar way of representing data. Kant conjectured this, in somewhat different words, over 200 years ago [15]. And indeed it is clearly so, if we imagine the human observer as a robot who must find a compact memory representation of the gigantic data stream it receives through its senses [16]. That is why our earlier example of the particle in a box should only be seen as an illustration by means of familiar terms. It should not imply that we accept the naive conception of space or things like particles in it, although this view works well in everyday life and in the laboratory, as long as we are not doing quantum experiments. We think that a full acceptance of the probabilistic paradigm as the basis of empirical science will eventually require an attack on the notions of spatial distance and spatial dimension from the point of view of optimal representation of probabilistic information.

Finally, we want to remark on a difference between our analysis and quantum theory. We have emphasized that the standard deviations of the ERs $\chi$ and $\psi$ become independent of the limits of these ERs only when we have infinitely many trials. But there is a departure for finitely many trials, especially for values of p close to 0 and close to 1. With some imagination this can be noticed in Fig. 1, in which the top and bottom probability distributions are a little bit wider than those in the middle. But as we always have only finitely many trials, there should exist random variables which fulfill our requirement for an ER even better than $\chi$ and $\psi$. This implies that predictions based on these unknown random variables should also be more precise. Whether we should see this as a fluke of statistics or as a need to amend quantum theory is a debatable question. But it should be testable. We need to have a number of different probabilistic experiments, all of which are done with only very few trials. From these we want to predict the outcomes of another probabilistic experiment, which is then also done with only few trials. Presumably the optimal procedure of prediction will not be the one we have presented here (and therefore not quantum theory). The difficulty with such tests is, however, that in the usual interpretation of data, statistical theory and quantum theory are treated as separate, while one message of this paper may also be that, under the probabilistic paradigm, the bottom level of physical theory should be equivalent to optimal representation of probabilistic information, and this theory should not be in need of additional purely statistical theories to connect it to actual data. We are discussing this problem in a future paper [17].


Acknowledgments

This paper is a result of pondering what I am doing in the lab, how it can be that in the evening I know more than I knew in the morning, and discussing this with G. Krenn, K. Svozil, C. Brukner, M. Zukovski and a number of other people.

References

1. M. Born, Zeitschrift f. Physik 37, 863 (1926); Brit. J. Philos. Science 4, 95 (1953).
2. D. Bouwmeester et al., Phys. Rev. Lett. 82, 1345 (1999), and references therein.
3. W. Feller, An Introduction to Probability Theory and its Applications (John Wiley and Sons, New York, 3rd edition, 1968), Vol. 1, p. 168.
4. ibid., p. 233.
5. The connection of this relation to quantum physics was first stressed by W. K. Wootters, Phys. Rev. D 23, 357 (1981).
6. We give the example in quant-ph/0008098.
7. Several authors have noted that probability theory itself suggests quantum theory: A. Lande, Am. J. Phys. 42, 459 (1974); A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, Dordrecht, 1998); D. I. Fivel, Phys. Rev. A 50, 2108 (1994).
8. J. A. Wheeler, in Quantum Theory and Measurement, eds. J. A. Wheeler and W. H. Zurek (Princeton University Press, Princeton, 1983), 182.
9. C. F. von Weizsäcker, Aufbau der Physik (Hanser, Munich, 1985); Holger Lyre, Int. J. Theor. Phys. 34, 1541 (1995). Also quant-ph/9703028.
10. C. W. Kilmister, Eddington's Search for a Fundamental Theory (Cambridge University Press, 1994).
11. J. Anandan, Found. Phys. 29, 1647 (1999).
12. A. Bohr and O. Ulfbeck, Rev. Mod. Phys. 67, 1 (1995).
13. C. Brukner and A. Zeilinger, Phys. Rev. Lett. 83, 3354 (1999).
14. A penetrating analysis of the view of space implied by quantum theory is given by U. Mohrhoff, Am. J. Phys. 68 (8), 728 (2000).
15. Immanuel Kant, Critik der reinen Vernunft (Critique of Pure Reason), Riga (1781). There should be many English translations.
16. E. T. Jaynes introduced the reasoning robot in his book Probability Theory: The Logic of Science in order to eliminate the problem of subjectivism that has been plaguing probability theory and quantum theory alike. The book is freely available at http://bayes.wustl.edu/etj/prob.html
17. J. Summhammer (to be published).


QUANTUM CRYPTOGRAPHY IN SPACE AND BELL'S THEOREM

IGOR VOLOVICH

Steklov Mathematical Institute, Gubkin St. 8,

GSP-1, 117966, Moscow, Russia

E-mail: volovich@mi.ras.ru

Bell's theorem states that some quantum correlations cannot be represented by classical correlations of separated random variables. It has been interpreted as incompatibility of the requirement of locality with quantum mechanics. We point out that in fact the space part of the wave function was neglected in the proof of Bell's theorem. However, this space part is crucial for considerations of the property of locality of a quantum system. Actually, the space part leads to an extra factor in quantum correlations and, as a result, the ordinary proof of Bell's theorem fails in this case. Bell's theorem constitutes an important part of quantum cryptography. The promise of secure cryptographic quantum key distribution schemes is based on the use of Bell's theorem in the spin space. In many current quantum cryptography protocols the space part of the wave function is neglected. As a result, such schemes can be secure against eavesdropping attacks in the abstract spin space, but they could be insecure in the real three-dimensional space. We discuss an approach to the security of quantum key distribution in space by using a special preparation of the space part of the wave function.

1 Introduction

Bell's theorem [1] states that there are quantum correlation functions that cannot be represented as classical correlation functions of separated random variables. It has been interpreted as incompatibility of the requirement of locality with the statistical predictions of quantum mechanics. For a recent discussion of Bell's theorem see, for example, [2-17] and references therein. It is now widely accepted, as a result of Bell's theorem and related experiments, that local realism must be rejected.

Evidently, the very formulation of the problem of locality in quantum mechanics is based on ascribing a special role to the position in ordinary three-dimensional space. It is rather surprising, therefore, that the space dependence of the wave function is neglected in discussions of the problem of locality in relation to Bell's inequalities. Actually, it is the space part of the wave function which is relevant to the consideration of the problem of locality.

In this note we point out that the space part of the wave function leads to an extra factor in the quantum correlation and, as a result, the ordinary proof of Bell's theorem fails in this case. We present a criterion of locality (or nonlocality) of quantum theory in a realist model of hidden variables. We argue that predictions of quantum mechanics can be consistent with Bell's inequalities for Gaussian wave functions, and hence Einstein's local realism is restored in this case.

Bell's theorem constitutes an important part of quantum cryptography [19]. It is now generally accepted that techniques of quantum cryptography can allow secure communications between distant parties [18-25]. The promise of secure cryptographic quantum key distribution schemes is based on the use of quantum entanglement in the spin space and on the quantum no-cloning theorem. An important contribution of quantum cryptography is a mechanism for detecting eavesdropping.

However, in many current quantum cryptography protocols the space part of the wave function is neglected. But it is exactly the space part of the wave function that describes the behaviour of particles in ordinary real three-dimensional space. As a result, such schemes can be secure against eavesdropping attacks in the abstract spin space but could be insecure in the real three-dimensional space.

It follows that proofs of the security of quantum cryptography schemes which neglect the space part of the wave function could fail against attacks in the real three-dimensional space. We will discuss how one can try to improve the security of quantum cryptography schemes in space by using a special preparation of the space part of the wave function.

2 Bells Inequality

In the presentation of Bell's theorem we will follow [17], where one can also find more references. The mathematical formulation of Bell's theorem reads

$$\cos(\alpha - \beta) \neq E\,\xi_\alpha \eta_\beta \tag{2.1}$$

where $\xi_\alpha$ and $\eta_\beta$ are two random processes such that $|\xi_\alpha| \le 1$, $|\eta_\beta| \le 1$, and $E$ is the expectation. Let us discuss in more detail the physical interpretation of this result. Consider a pair of spin one-half particles formed in the singlet spin state and moving freely towards two detectors (Alice and Bob). If one neglects the space part of the wave function, then the quantum mechanical correlation of two spins in the singlet state $\psi_{spin}$ is

$$D_{spin}(a, b) = \langle \psi_{spin} |\, \sigma \cdot a \otimes \sigma \cdot b \,| \psi_{spin} \rangle = -a \cdot b \tag{2.2}$$

Here $a$ and $b$ are two unit vectors in three-dimensional space, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices, and $\psi_{spin}$ is the singlet spin state.

Bell's theorem states that the function $D_{spin}(a, b)$, Eq. (2.2), cannot be represented in the form

$$P(a, b) = \int \xi(a, \lambda)\, \eta(b, \lambda)\, d\rho(\lambda) \tag{2.3}$$

i.e.

$$D_{spin}(a, b) \neq P(a, b) \tag{2.4}$$

Here $\xi(a, \lambda)$ and $\eta(b, \lambda)$ are random fields on the sphere, $|\xi(a, \lambda)| \le 1$, $|\eta(b, \lambda)| \le 1$, and $d\rho(\lambda)$ is a positive probability measure, $\int d\rho(\lambda) = 1$. The parameters $\lambda$ are interpreted as hidden variables in a realist theory. It is clear that Eq. (2.4) can be reduced to Eq. (2.1).

One has the following Bell-Clauser-Horne-Shimony-Holt (CHSH) inequality

$$|P(a, b) - P(a, b') + P(a', b) + P(a', b')| \le 2 \tag{2.5}$$

On the other hand, there are vectors ($a \cdot b = a' \cdot b = a' \cdot b' = -a \cdot b' = \sqrt{2}/2$) for which one has

$$|D_{spin}(a, b) - D_{spin}(a, b') + D_{spin}(a', b) + D_{spin}(a', b')| = 2\sqrt{2} \tag{2.6}$$

Therefore, if one supposes that $D_{spin}(a, b) = P(a, b)$, then one gets a contradiction.

It will be shown below that if one takes into account the space part of the wave function, then the quantum correlation in the simplest case will take the form $g \cos(\alpha - \beta)$ instead of just $\cos(\alpha - \beta)$, where the parameter $g$ describes the location of the system in space and time. In this case one can get the representation

$$g \cos(\alpha - \beta) = E\,\xi_\alpha \eta_\beta \tag{2.7}$$

if $g$ is small enough (see below). The factor $g$ gives a contribution to the visibility or efficiency of detectors that is used in the phenomenological description of detectors.

3 Localized Detectors

In the previous section the space part of the wave function of the particles was neglected. However, it is exactly the space part that is relevant to the discussion of locality. The complete wave function is $\psi = (\psi_{\alpha\beta}(\mathbf{r}_1, \mathbf{r}_2))$, where $\alpha$ and $\beta$ are spinor indices and $\mathbf{r}_1$ and $\mathbf{r}_2$ are vectors in three-dimensional space.

We suppose that Alice and Bob have detectors which are located within the two localized regions $O_A$ and $O_B$ respectively, well separated from one another.

The quantum correlation describing the measurements of spins by Alice and Bob at their localized detectors is

$$G(a, O_A, b, O_B) = \langle \psi |\, \sigma \cdot a\, P_{O_A} \otimes \sigma \cdot b\, P_{O_B} \,| \psi \rangle \tag{3.1}$$

Here $P_O$ is the projection operator onto the region $O$. Let us consider the case when the wave function has the form of the product of the spin function and the space function, $\psi = \psi_{spin}\, \phi(\mathbf{r}_1, \mathbf{r}_2)$. Then one has

$$G(a, O_A, b, O_B) = g(O_A, O_B)\, D_{spin}(a, b) \tag{3.2}$$

where the function

$$g(O_A, O_B) = \int_{O_A \times O_B} |\phi(\mathbf{r}_1, \mathbf{r}_2)|^2\, d\mathbf{r}_1\, d\mathbf{r}_2 \tag{3.3}$$

describes the correlation of particles in space. It is the probability to find one particle in the region $O_A$ and another particle in the region $O_B$. One has

$$0 \le g(O_A, O_B) \le 1 \tag{3.4}$$

Remark. In relativistic quantum field theory there is no nonzero strictly localized projection operator that annihilates the vacuum. It is a consequence of the Reeh-Schlieder theorem. Therefore, apparently, the function $g(O_A, O_B)$ should always be strictly smaller than 1. I am grateful to W. Luecke for this remark.

Now one inquires whether one can write the representation

$$g(O_A, O_B)\, D_{spin}(a, b) = \int \xi(a, O_A, \lambda)\, \eta(b, O_B, \lambda)\, d\rho(\lambda) \tag{3.5}$$

Note that if we are interested in the conditional probability of finding the projection of spin along the vector $a$ for particle 1 in the region $O_A$ and the projection of spin along the vector $b$ for particle 2 in the region $O_B$, then we have to divide both sides of Eq. (3.5) by $g(O_A, O_B)$.

The factor $g$ is important. In particular, one can write the following representation [15] for $0 \le g \le 1/2$:

$$g \cos(\alpha - \beta) = \int_0^{2\pi} \sqrt{2g} \cos(\alpha - \lambda)\, \sqrt{2g} \cos(\beta - \lambda)\, \frac{d\lambda}{2\pi} \tag{3.6}$$
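The representation above can be checked numerically. The following is a minimal sketch and not part of the original argument: the function name `local_model_correlation` and the uniform grid over the hidden variable are illustrative assumptions. The only ingredients taken from the text are the two response functions, sqrt(2g) cos(alpha - lambda) and sqrt(2g) cos(beta - lambda), and the uniform measure d(lambda)/2pi on [0, 2pi).

```python
import math

def local_model_correlation(g, alpha, beta, steps=10_000):
    """Average of xi * eta over a hidden variable lam uniform on [0, 2*pi).

    xi  = sqrt(2g) * cos(alpha - lam)   (Alice's response, |xi|  <= 1 if g <= 1/2)
    eta = sqrt(2g) * cos(beta  - lam)   (Bob's response,   |eta| <= 1 if g <= 1/2)
    """
    amplitude = math.sqrt(2.0 * g)
    total = 0.0
    for k in range(steps):
        lam = 2.0 * math.pi * k / steps   # uniform grid stands in for d(lam)/(2*pi)
        xi = amplitude * math.cos(alpha - lam)
        eta = amplitude * math.cos(beta - lam)
        total += xi * eta
    return total / steps

if __name__ == "__main__":
    g, alpha, beta = 0.4, 0.3, 1.1
    print("hidden-variable average:", local_model_correlation(g, alpha, beta))
    print("g * cos(alpha - beta):  ", g * math.cos(alpha - beta))
```

For g not exceeding 1/2 both response functions stay within [-1, 1], which is what makes them admissible bounded random variables. For larger g the amplitude sqrt(2g) exceeds 1 and this particular construction no longer applies.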

Let us now apply these considerations to quantum cryptography


4 Quantum Key Distribution

Ekert [19] showed that one can use the EPR correlations to establish a secret random key between two parties (Alice and Bob). Bell's inequalities are used to check the presence of an intermediate eavesdropper (Eve). There are two stages to the Ekert protocol, the first stage over a quantum channel, the second over a public channel.

The quantum channel consists of a source that emits pairs of spin one-half particles in a singlet state. The particles fly apart towards Alice and Bob, who, after the particles have separated, perform measurements on spin components along one of three directions given by unit vectors $a$ and $b$. In the second stage Alice and Bob communicate over a public channel. They announce in public the orientation of the detectors they have chosen for particular measurements. Then they divide the measurement results into two separate groups: a first group for which they used different orientations of the detectors, and a second group for which they used the same orientation of the detectors. Now Alice and Bob can reveal publicly the results they obtained, but within the first group of measurements only. This allows them, by using Bell's inequality, to establish the presence of an eavesdropper (Eve). The results of the second group of measurements can be converted into a secret key. One supposes that Eve has a detector which is located within the region $O_E$ and that she is described by hidden variables $\lambda$.
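The sifting step of the protocol can be illustrated with a small simulation. This is a sketch under stated assumptions, not the protocol specification: the angle sets `ALICE_ANGLES` and `BOB_ANGLES` and all function names are hypothetical choices made here (the text only says that each party picks one of three directions), the space part of the wave function is ignored (g = 1), and no eavesdropper is modelled. Outcomes are sampled so that their correlation is -cos of the angle between the two directions, as for the singlet state.

```python
import math
import random

# Assumed detector orientations (angles in a common plane); illustrative only.
ALICE_ANGLES = [0.0, math.pi / 4, math.pi / 2]
BOB_ANGLES = [math.pi / 4, math.pi / 2, 3 * math.pi / 4]

def measure_singlet(theta_a, theta_b, rng):
    """Sample outcomes (s, t) in {+1, -1} with E[s * t] = -cos(theta_a - theta_b)."""
    s = rng.choice([+1, -1])
    p_same = (1.0 - math.cos(theta_a - theta_b)) / 2.0
    t = s if rng.random() < p_same else -s
    return s, t

def run_protocol(rounds, seed=0):
    """Split results into key rounds (same orientation) and test rounds (different)."""
    rng = random.Random(seed)
    key_alice, key_bob, test_rounds = [], [], []
    for _ in range(rounds):
        theta_a = rng.choice(ALICE_ANGLES)
        theta_b = rng.choice(BOB_ANGLES)
        s, t = measure_singlet(theta_a, theta_b, rng)
        if theta_a == theta_b:
            # Same orientation: results are perfectly anticorrelated, Bob flips his bit.
            key_alice.append(s)
            key_bob.append(-t)
        else:
            # Different orientations: revealed publicly and used for the Bell test.
            test_rounds.append((theta_a, theta_b, s, t))
    return key_alice, key_bob, test_rounds

if __name__ == "__main__":
    ka, kb, tests = run_protocol(2000, seed=1)
    print("key length:", len(ka), "keys agree:", ka == kb, "test rounds:", len(tests))
```

In this idealized setting the two keys always agree. The point made in the surrounding text is that the Bell test on the first group only certifies security if the spatial factor g is large enough.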

We will interpret Eve as a hidden variable in a realist theory and will study whether the quantum correlation Eq. (3.2) can be represented in the form Eq. (2.3). From (2.5), (2.6) and (3.5) one can see that if the following inequality

$$g(O_A, O_B) \le 1/\sqrt{2} \tag{4.1}$$

is valid for regions $O_A$ and $O_B$ which are well separated from one another, then there is no violation of the CHSH inequalities (2.5), and therefore Alice and Bob cannot detect the presence of an eavesdropper. On the other hand, if for a pair of well separated regions $O_A$ and $O_B$ one has

$$g(O_A, O_B) > 1/\sqrt{2} \tag{4.2}$$

then there could be a violation of realist locality in these regions for a given state. Then, in principle, one can hope to detect an eavesdropper in these circumstances.
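The threshold 1/sqrt(2) follows from simple arithmetic: with the spatial factor included the correlation is -g (a . b), so the maximal CHSH combination 2 sqrt(2) is scaled by g and exceeds the classical bound 2 only when g > 1/sqrt(2). A minimal sketch of this check is given below; the function names and the particular coplanar angles are illustrative assumptions chosen so that a . b = a' . b = a' . b' = -(a . b') = sqrt(2)/2.

```python
import math

def chsh_value(g):
    """|D(a,b) - D(a,b') + D(a',b) + D(a',b')| for the correlation D = -g * (a . b)."""
    # Coplanar unit vectors given by their angles (an illustrative choice).
    a, a_prime = 0.0, math.pi / 2
    b, b_prime = math.pi / 4, 3 * math.pi / 4

    def corr(x, y):
        # For coplanar unit vectors, a . b = cos(angle between them).
        return -g * math.cos(x - y)

    return abs(corr(a, b) - corr(a, b_prime) + corr(a_prime, b) + corr(a_prime, b_prime))

def violates_chsh(g):
    """True if the scaled quantum correlation exceeds the classical CHSH bound 2."""
    return chsh_value(g) > 2.0

if __name__ == "__main__":
    for g in (1.0, 0.71, 0.70, 0.5):
        print(f"g = {g:.2f}  CHSH = {chsh_value(g):.4f}  violation: {violates_chsh(g)}")
```

The sketch only evaluates one set of directions, the one that maximizes the combination for the pure spin correlation; it does not search over all directions.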

Note that if we set $g(O_A, O_B) = 1$ in (3.5), as was done in the original proof of Bell's theorem, then it means we made a special preparation of the states of the particles so that they are completely localized inside the detectors. There exist such well localized states (see, however, the previous Remark), but there also exist other states whose wave functions are not very well localized inside the detectors, and still particles in such states are also observed in detectors. The fact that a particle is observed inside the detector does not mean, of course, that its wave function is strictly localized inside the detector before the measurement. Actually, one has to perform a thorough investigation of the preparation and the evolution of our entangled states in space and time if one needs to estimate the function $g(O_A, O_B)$.

5 Gaussian Wave Functions

Now let us consider the criterion of locality for Gaussian wave functions. We will show that, with reasonable accuracy, there is no violation of locality in this case. Let us take the wave function $\phi$ of the form $\phi = \psi_1(\mathbf{r}_1)\psi_2(\mathbf{r}_2)$, where the individual wave functions have the moduli

$$|\psi_1(\mathbf{r})|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 \mathbf{r}^2/2}, \qquad |\psi_2(\mathbf{r})|^2 = \left(\frac{m^2}{2\pi}\right)^{3/2} e^{-m^2 (\mathbf{r} - \mathbf{l})^2/2} \tag{5.1}$$

We suppose that the length of the vector $\mathbf{l}$ is much larger than $1/m$. We can make measurements of $P_{O_A}$ and $P_{O_B}$ for any well separated regions $O_A$ and $O_B$. Let us suppose a rather unfavorable case for the criterion of locality, when the wave functions of the particles are almost localized inside the regions $O_A$ and $O_B$ respectively. In such a case the function $g(O_A, O_B)$ can take values near its maximum. We suppose that the region $O_A$ is given by $|r_i| < 1/m$, $\mathbf{r} = (r_1, r_2, r_3)$, and the region $O_B$ is obtained from $O_A$ by translation by $\mathbf{l}$. Hence $\psi_1(\mathbf{r}_1)$ is a Gaussian function with modulus appreciably different from zero only in $O_A$, and similarly $\psi_2(\mathbf{r}_2)$ is localized in the region $O_B$. Then we have

$$g(O_A, O_B) = \left(\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-x^2/2}\, dx\right)^6 \tag{5.2}$$

One can estimate (5.2) as

$$g(O_A, O_B) < \left(\frac{2}{\pi}\right)^3 \tag{5.3}$$

which is smaller than 1/2. Therefore the locality criterion (4.1) is satisfied in this case.
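The size of g in this Gaussian example can be evaluated directly. The sketch below rests on our reading of the setup, which is an assumption: each particle has a Gaussian density of width 1/m in every coordinate, and each detector region is the cube of half-width 1/m centred on the packet, so that every one of the six coordinate integrals equals (1/sqrt(2 pi)) times the integral of exp(-x^2/2) over [-1, 1], that is erf(1/sqrt(2)). The function name `gaussian_g` is illustrative.

```python
import math

def gaussian_g():
    """Probability that both Gaussian particles are found inside their cubes."""
    # One coordinate: (1/sqrt(2*pi)) * integral_{-1}^{1} exp(-x**2 / 2) dx = erf(1/sqrt(2)).
    one_coordinate = math.erf(1.0 / math.sqrt(2.0))
    # Three coordinates for each of the two particles.
    return one_coordinate ** 6

if __name__ == "__main__":
    g = gaussian_g()
    upper_bound = (2.0 / math.pi) ** 3          # crude bound: exp(-x**2 / 2) <= 1 on [-1, 1]
    print("g              =", g)
    print("(2/pi)^3 bound =", upper_bound)
    print("1/sqrt(2)      =", 1.0 / math.sqrt(2.0))
```

Under these assumptions g comes out well below both the crude bound and the threshold 1/sqrt(2), so the CHSH inequality is not violated for this state and these regions.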

Let us recall that there is a well known effect of expansion of wave packets due to the free time evolution. If $\epsilon$ is the characteristic length of the Gaussian wave packet describing a particle of mass $M$ at time $t = 0$, then at time $t$ the characteristic length $\epsilon_t$ will be

$$\epsilon_t = \sqrt{\epsilon^2 + \frac{\hbar^2 t^2}{M^2 \epsilon^2}}$$

It tends to $(\hbar/M\epsilon)\,t$ as $t \to \infty$. Therefore the locality criterion is always satisfied for nonrelativistic particles if the regions $O_A$ and $O_B$ are far enough from each other. The case of relativistic particles will be considered in a separate publication.

6 Conclusions

It is shown in this note that if we do not neglect the space part of the wave function of two particles, then the predictions of quantum mechanics can be consistent with Bell's inequalities. One can say that Einstein's local realism is restored in this case.

It would be interesting to investigate whether one can prepare a reasonable wave function for which the condition of nonlocality (4.2) is satisfied for a pair of well separated regions. In principle, the function $g(O_A, O_B)$ can approach its maximal value 1 if the wave functions of the particles are very well localized within the detector regions $O_A$ and $O_B$ respectively. However, perhaps to establish such a localization one has to destroy the original entanglement, because it was created far away from the detectors.

It is shown that the presence of the space part in the wave function of two particles in the entangled state leads to a problem in the proof of the security of quantum key distribution. To detect the eavesdropper's presence by using Bell's inequality we have to estimate the function $g(O_A, O_B)$. Only a special quantum key distribution protocol has been discussed here, but it seems there are similar problems in other quantum cryptographic schemes as well.

We do not claim in this note that it is in principle impossible to increase the detectability of the eavesdropper. However, it is not clear to the present author how to do it without a thorough investigation of the process of preparation of the entangled state and then its evolution in space and time towards Alice and Bob.

In the previous section Eve was interpreted as an abstract hidden variable. However, one can assume that more information about Eve is available. In particular, one can assume that she is located somewhere in space in a region $O_E$. It seems one has to study a generalization of the function $g(O_A, O_B)$ which depends not only on the Alice and Bob locations $O_A$ and $O_B$ but also on the Eve location $O_E$, and try to find a strategy which leads to an optimal value of this function.


7 Acknowledgments

This investigation was supported by a grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions. This work is also supported in part by RFFI 99-01-00105 and INTAS 99-0590.

References

1. J. S. Bell, Physics 1, 195 (1964).
2. A. Peres, Quantum Theory: Concepts and Methods, Kluwer, Dordrecht, 1993.
3. L. E. Ballentine, Quantum Mechanics, Prentice-Hall, 1990.
4. W. M. de Muynck, W. De Baere and H. Martens, Found. of Physics (1994) 1589.
5. D. M. Greenberger, M. A. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
6. S. L. Braunstein, A. Mann and M. Revzen, Phys. Rev. Lett. 68, 3259 (1992).
7. N. D. Mermin, Am. J. Phys. 62, 880 (1994).
8. G. M. D'Ariano, L. Maccone, M. F. Sacchi and A. Garuccio, Tomographic test of Bell's inequality, quant-ph/9907091.
9. Luigi Accardi and Massimo Regoli, Locality and Bell's inequality, quant-ph/0007005.
10. Andrei Khrennikov, Non-Kolmogorov probability models and modified Bell's inequality, quant-ph/0003017.
11. Almut Beige, William J. Munro and Peter L. Knight, A Bell's Inequality Test with Entangled Atoms, quant-ph/0006054.
12. F. Benatti and R. Floreanini, On Bell's locality tests with neutral kaons, hep-ph/9812353.
13. A. Khrennikov, Statistical measure of ensemble nonreproducibility and correction to Bell's inequality, Nuovo Cimento 115B (2000) 179.
14. W. A. Hofer, Information transfer via the phase: A local model of Einstein-Podolsky-Rosen experiments, quant-ph/0006005.
15. Igor Volovich, Yaroslav Volovich, Bell's Theorem and Random Variables, quant-ph/0009058.
16. N. Gisin, V. Scarani, W. Tittel, H. Zbinden, Optical tests of quantum nonlocality: from EPR-Bell tests towards experiments with moving observers, quant-ph/0009055.
17. Igor V. Volovich, Bell's Theorem and Locality in Space, quant-ph/0012010.
18. C. H. Bennett and G. Brassard, in Proc. of the IEEE Int. Conf. on Computers, Systems and Signal Processing, Bangalore, India (IEEE, New York, 1984), p. 175.
19. A. K. Ekert, Phys. Rev. Lett. 67 (1991) 661.
20. D. S. Naik, C. G. Peterson, A. G. White, A. J. Berglund, P. G. Kwiat, Entangled state quantum cryptography: Eavesdropping on the Ekert protocol, quant-ph/9912105.
21. Gilles Brassard, Norbert Lütkenhaus, Tal Mor, Barry C. Sanders, Security Aspects of Practical Quantum Cryptography, quant-ph/9911054.
22. Kei Inoue, Takashi Matsuoka, Masanori Ohya, New approach to Epsilon-entropy and Its comparison with Kolmogorov's Epsilon-entropy, quant-ph/9806027.
23. Hoi-Kwong Lo, Will Quantum Cryptography ever become a successful technology in the marketplace?, quant-ph/9912011.
24. Akihisa Tomita, Osamu Hirota, Security of classical noise-based cryptography, quant-ph/0002044.
25. Yong-Sheng Zhang, Chuan-Feng Li, Guang-Can Guo, Quantum key distribution via quantum encryption, quant-ph/0011034.


INTERACTING STOCHASTIC PROCESS AND RENORMALIZATION THEORY

YAROSLAV VOLOVICH

Physics Department, Moscow State University,

Vorobievi Gori, 119899 Moscow, Russia

E-mail yaroslav-Vmailru

A stochastic process with self-interaction as a model of quantum field theory is studied. We consider an Ornstein-Uhlenbeck stochastic process $x(t)$ with interaction of the form $x^{(\alpha)}(t)^4$, where $\alpha$ indicates the fractional derivative. Using Bogoliubov's R-operation we investigate ultraviolet divergencies for the various parameters $\alpha$. Ultraviolet properties of this one-dimensional model in the case $\alpha = 3/4$ are similar to those in the $\varphi^4_4$ theory, but there are extra counterterms. It is shown that the model is two-loops renormalizable. For $5/8 \le \alpha < 3/4$ the model has a finite number of divergent Feynman diagrams. In the case $\alpha = 2/3$ the model is similar to the $\varphi^4_3$ theory. If $0 \le \alpha < 5/8$ then the model does not have ultraviolet divergencies at all. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

1 Introduction

There is a very fruitful interrelation between probability theory and quantum field theory [1-6]. In this note we consider a stochastic process that shows the same divergencies as quantum electrodynamics or $\varphi^4$ theory in 4-dimensional spacetime. This stochastic process corresponds to a one-dimensional Euclidean quantum field theory with a quartic interaction that contains fractional derivatives. This one-dimensional model can be used for studying the fundamental problem of non-perturbative investigation of renormalized quantum field theory [1,3]. It can also find applications in the theory of phase transitions [5,6].

The Interacting Stochastic Process. Let $x(t) = x(t, \omega)$ be an Ornstein-Uhlenbeck stochastic process with the correlation function

$$E\, x(t) x(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{ip(t - \tau)}}{p^2 + m^2}\, dp = \frac{1}{2m}\, e^{-m|t - \tau|}$$

where $m > 0$. There exists a spectral representation of the Ornstein-Uhlenbeck stochastic process [8]

$$x(t, \omega) = \int e^{ikt}\, \zeta(dk, \omega)$$

where $\zeta(dk, \omega)$ is a stochastic measure. We define the fractional derivative $x^{(\alpha)}(t, \omega)$ as

$$x^{(\alpha)}(t, \omega) = \int |k|^{\alpha} e^{ikt}\, \zeta(dk, \omega) \tag{1.2}$$

If $0 \le \alpha < 1/2$ then $x^{(\alpha)}(t)$ is a stochastic process. If $\alpha \ge 1/2$ then one needs a regularization, described below. We will use distribution notations and write

$$\zeta(dk, \omega) = \tilde{x}(k, \omega)\, dk, \qquad \tilde{x}(k, \omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} x(t, \omega)\, e^{-ikt}\, dt$$

We want to give a meaning to the following correlation functions

$$K(t_1, \ldots, t_N) = E\left(x(t_1) \cdots x(t_N)\, e^{-\lambda U}\right) / E\left(e^{-\lambda U}\right) \tag{1.3}$$

for all $N = 1, 2, \ldots$ Here

$$U = \int_{-\infty}^{\infty} {:}x^{(\alpha)}(\tau)^4{:}\; g(\tau)\, d\tau \tag{1.4}$$

where $g(\tau)$ is a nonnegative test function with compact support (the volume cut-off), $x^{(\alpha)}(t)$ denotes the fractional derivative (1.2), $\lambda \ge 0$, and ${:}x^{(\alpha)}(\tau)^4{:}$ is the Wick normal product. We will denote the expectation value as $E(A) = \langle A \rangle$. In this notation $\langle x(t) x(\tau) \rangle = \frac{1}{2\pi} \int \frac{e^{ip(t - \tau)}}{p^2 + m^2}\, dp$.

For the correlation function (1.3) one has the perturbative expansion

$$\langle x(t_1) \cdots x(t_N)\, e^{-\lambda U} \rangle = \sum_{n=0}^{\infty} \frac{(-\lambda)^n}{n!} \langle x(t_1) \cdots x(t_N)\, U^n \rangle \tag{1.5}$$

If $\alpha \ge 5/8$ then the expectation value in (1.5) has no meaning because there are ultraviolet divergencies. We have to introduce a cutoff stochastic process $x_\kappa(t)$ [3]

$$x_\kappa(t, \omega) = \int_{-\kappa}^{\kappa} e^{ikt}\, \zeta(dk, \omega)$$

Instead of $U$ in (1.3) we put

$$U_\kappa = \int {:}x_\kappa^{(\alpha)}(\tau)^4{:}\; g(\tau)\, d\tau$$

where

$$x_\kappa^{(\alpha)}(t, \omega) = \int_{-\kappa}^{\kappa} |k|^{\alpha} e^{ikt}\, \zeta(dk, \omega)$$

(Footnote: Stochastic differential equations with fractional derivatives [7] are considered also on p-adic number fields.)

The problem is to prove that after the renormalization there exists a limit of the correlation functions

$$\langle x(t_1) \cdots x(t_N)\, e^{-\lambda U_\kappa} \rangle_{ren}$$

as $\kappa \to \infty$ in each order of the perturbation expansion. We will consider this problem below by using the Bogoliubov-Parasiuk R-operation and the standard language of Feynman diagrams.

In the momentum representation we obtain an expression of the form

$$\langle \tilde{x}(p_1) \cdots \tilde{x}(p_N)\, e^{-\lambda U} \rangle = \sum_{\Gamma} G_\Gamma(p_1, \ldots, p_N)$$

Here the sum runs over all Feynman diagrams $\Gamma$ with $N$ external legs that can be built using 4-vertices corresponding to the $x^4$ term. The contribution from a connected diagram with $n$ 4-vertices and $L$ internal lines has the form

$$G_\Gamma = \int \prod_{j=1}^{l} dk_j \prod_{i=1}^{L} \frac{|q_i|^{2\alpha}}{q_i^2 + m^2} \tag{1.6}$$

where $l = L - (n - 1)$, and the $q_i$ are linear combinations of the internal momenta $k_1, \ldots, k_l$ and the external momenta $p_1, \ldots, p_N$.

The canonical degree $D(\Gamma)$ of a proper diagram is defined by the dimension of the corresponding Feynman integral with respect to the integration variables. Using (1.6) we have

$$D = D(\Gamma) = (2\alpha - 2)L + l = (2\alpha - 1)L - n + 1 \tag{1.7}$$

If for a given diagram $D < 0$, then this diagram is superficially finite; otherwise it is divergent. Let us consider a proper diagram with $n$ vertices, $L$ internal lines and $E$ legs. We have the following relation

$$4n = 2L + E \tag{1.8}$$

Note that for any nontrivial connected diagram

$$2n \ge L \ge n \ge 2 \tag{1.9}$$

$$E \le 2n \tag{1.10}$$


Theorem. If $\alpha < 5/8$ then all Feynman diagrams of the interacting stochastic process are superficially finite. If $5/8 \le \alpha < 3/4$ then there exists a finite number of divergent diagrams; moreover, all divergent diagrams have only 0 or 2 legs. If $\alpha = 3/4$ then the model is renormalizable and all divergent diagrams have only 0, 2 or 4 external lines. Finally, if $\alpha > 3/4$ then the model is nonrenormalizable.

Proof. Let us prove the first statement of the theorem, i.e. if $\alpha < 5/8$ then $D < 0$ for any $n \ge 2$. Using (1.7) and (1.9) we have

$$D\big|_{\alpha < 5/8} < \left(2 \cdot \tfrac{5}{8} - 1\right) L - n + 1 = \frac{L - 4n + 4}{4} \le \frac{2n - 4n + 4}{4} = \frac{2 - n}{2} \le 0 \tag{1.11}$$

From (1.11) it follows that $D < 0$ for any $\alpha < 5/8$. Let us consider $\alpha = 5/8$. Similarly to (1.11), from (1.7) we have

$$D\big|_{\alpha = 5/8} = \frac{L - 4n + 4}{4} \le \frac{2 - n}{2} \le 0 \tag{1.12}$$

Therefore only a two-point ($n = 2$) diagram could be divergent (in this case $D = 0$). Rewriting (1.12) with the help of (1.8) in the form

$$D\big|_{\alpha = 5/8} = \frac{4 - (E + L)}{4} \tag{1.13}$$

we see from (1.13) that only the diagram with $E = 0$, $L = 4$, $n = 2$ is divergent. In the case when $5/8 < \alpha < 3/4$ we can write

$$\alpha = \frac{3}{4} - \epsilon \tag{1.14}$$

where $0 < \epsilon < 1/8$. Substituting (1.14) into (1.7) and using (1.9) we have

$$D\big|_{\alpha = 3/4 - \epsilon} = \frac{L}{2} - 2L\epsilon - n + 1 \le \frac{2n}{2} - 2n\epsilon - n + 1 = 1 - 2n\epsilon \tag{1.15}$$

Thus for any given $\epsilon > 0$ (and therefore any $\alpha < 3/4$) there exists a number $N$ such that for any $n > N$ the canonical dimension $D < 0$. Hence there exists only a finite number of divergent diagrams. Rewriting (1.15) with the help of (1.8) in the form

$$D\big|_{\alpha = 3/4 - \epsilon} = -2L\epsilon + \frac{4 - E}{4}$$

it follows that $D \ge 0$ only if $E < 4$, i.e. $E = 0$ or $E = 2$, and the model is super-renormalizable.

Let us consider the case when $\alpha = 3/4$. Using (1.8) and (1.7) we have

$$D\big|_{\alpha = 3/4} = 1 - \frac{E}{4} \tag{1.16}$$

The equality (1.16) means that all divergent diagrams have only 0, 2 or 4 legs and the model is renormalizable.

Finally, if $\alpha > 3/4$ we have

D = - - n + l = gt ^ gt 0 (117) agt34 2 1 2

Therefore, if $\alpha > 3/4$ then all proper diagrams are divergent. ∎ Examples of application of this theorem can be found in [9].
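The power counting behind the theorem can be tabulated mechanically. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses only the canonical degree D = (2 alpha - 1) L - n + 1 and the relation 4n = 2L + E from the text, lists classes of diagrams labelled by the pair (n, E) with E even and E at most 2n, and does not check that a proper connected diagram actually exists in each class. The function names are hypothetical.

```python
from fractions import Fraction

def canonical_degree(alpha, n, E):
    """Canonical degree D = (2*alpha - 1)*L - n + 1 with L fixed by 4n = 2L + E."""
    L = Fraction(4 * n - E, 2)          # number of internal lines
    return (2 * alpha - 1) * L - n + 1

def divergent_classes(alpha, n_max):
    """All (n, E) with 2 <= n <= n_max, E even, E <= 2n and D >= 0."""
    found = []
    for n in range(2, n_max + 1):
        for E in range(0, 2 * n + 1, 2):
            if canonical_degree(alpha, n, E) >= 0:
                found.append((n, E))
    return found

if __name__ == "__main__":
    for alpha in (Fraction(1, 2), Fraction(5, 8), Fraction(2, 3), Fraction(3, 4), Fraction(7, 8)):
        classes = divergent_classes(alpha, n_max=6)
        legs = sorted({E for _, E in classes})
        print(f"alpha = {alpha}: {len(classes)} divergent (n, E) classes up to n = 6, legs = {legs}")
```

Exact rational arithmetic is used so that the borderline cases D = 0 (for example alpha = 5/8 with n = 2, E = 0) are not lost to rounding.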

2 Acknowledgments

This investigation was supported by a grant of the Swedish Royal Academy of Sciences on the collaboration with states of the former Soviet Union and by the Profile Mathematical Modeling of Vaxjo University. I would like to thank A. Khrennikov for the warm hospitality and fruitful discussions.

References

1. N. N. Bogoliubov and D. V. Shirkov, Introduction to the theory of quantum fields, Nauka, Moscow, 1973.
2. T. Hida, Brownian Motion, Springer-Verlag, 1980.
3. J. Glimm and A. Jaffe, Quantum Physics: A Functional Integral Point of View, Springer-Verlag, 1987.
4. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise: An Infinite Dimensional Calculus, Kluwer Academic, 1993.
5. J. Kogut, K. Wilson, Phys. Reports 12C, p. 75, 1974.
6. A. Z. Patashinski and V. L. Pokrovski, The fluctuational theory of phase transitions, Nauka, Moscow, 1975.
7. V. S. Vladimirov, Generalized functions over the field of p-adic numbers, Russian Math. Surveys 43:5 (1988).
8. I. I. Gihman and A. V. Skorohod, Introduction to Theory of Random Processes, Nauka, Moscow, 1977.
9. Ya. I. Volovich, Interacting stochastic process and renormalization theory, quant-ph/0008063.

ISBN 981-02-4846-6

www.worldscientific.com

  • Foreword
  • Contents
  • Preface
  • Locality and Bell's Inequality
    • 1 Inequalities among numbers
    • 2 The Bell inequality
    • 3 Implications of the Bell's inequalities for the singlet correlations
    • 4 Bell on the meaning of Bell's inequality
    • 5 Critique of Bell's vital assumption
    • 6 The role of the counterfactual argument in Bell's proof
    • 7 Proofs of Bell's inequality based on counting arguments
    • 8 The quantum probabilistic analysis
    • 9 The realism of ballot boxes and the corresponding statistics
    • 10 The realism of chameleons and the corresponding statistics
    • 11 Bell's inequalities and the chameleon effect
    • 12 Physical implausibility of Bell's argument
    • 13 The role of the single probability space in CHSH's proof
    • 14 The role of the counterfactual argument in CHSH's proof
    • 15 Physical difference between the CHSH's and the original Bell's inequalities
    • References
  • Refutation of Bell's Theorem
    • 1 Introduction
    • 2 The EPRB gedanken experiment
    • 3 The CHSH function
    • 4 Strongly objective interpretation
    • 5 Weakly objective interpretation
    • 6 Conclusion
    • References
  • Probability Conservation and the State Determination Problem
    • 1 Introduction
    • 2 Conservation of Probability
    • 3 Determination of the phase function
    • 4 Validity and range of applicability
    • 5 Evolution of a Gaussian Wave Packet
    • 6 Operational Issues
    • Acknowledgments
    • References
  • Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws
    • 1 Introduction
    • 2 Ontic and epistemic descriptions
    • 3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility
    • 4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility
    • 5 Summary and Open Questions
    • Acknowledgments
    • References
  • Interpretations of Probability and Quantum Theory
    • 1 Introduction
    • 2 Interpretations of Probability
    • 3 The Axioms of Probability
    • 4 Probability in Quantum Mechanics
    • 5 Conclusions
    • References
  • Forcing Discretization and Determination in Quantum History Theories
    • 1 Introduction
    • 2 Outcome determination via contextual models
    • 3 Unitary ortho- and projective structure
    • 4 Representing quantum history theory
    • 5 Further discussion
    • Acknowledgments
    • References
  • Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality
    • 1 Realist and empiricist interpretations of quantum mechanics
    • 2 EPR experiments and Bell experiments
    • 3 Bell's inequality in quantum mechanics
    • 4 Bell's inequality in stochastic and deterministic hidden-variables theories
    • 5 Analogy between thermodynamics and quantum mechanics
    • 6 Conclusions
    • References
  • Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble
    • 1 Introduction
    • 2 The Ginibre ensembles
    • Acknowledgements
    • References
  • Some Remarks on Hardy Functions Associated with Dirichlet Series
    • 1 Introduction
    • 2 Hardyfication of Dirichlet series
    • 3 Factorization of n
    • 4 Applications
    • References
  • Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit
    • 1 Introduction
    • 2 There is a lot to add to classical equilibrium statistics from our experience with Small systems
    • 3 Relation of the topology of S(E, N) to the Yang-Lee zeros of Z(T, μ, V)
    • 4 The regions of positive curvature λ₁ of s(e_s, n_s) correspond to phase transitions of first order
    • 5 Boltzmann's principle and non-equilibrium thermodynamics
    • 6 Macroscopic observables imply the EPS-probability
    • 7 On Einstein's objections against the EPS-probability
    • 8 Fractal distributions in phase space, Second Law
    • 9 Conclusion
    • Appendix
    • Acknowledgement
    • References
  • An Approach to Quantum Probability
    • 1 Introduction
    • 2 Formulation
    • 3 Wave Functions and Hilbert Space
    • 4 Spin
    • 5 Traditional Quantum Mechanics
    • 6 Concluding Remarks
    • References
  • Innovation Approach to Stochastic Processes and Quantum Dynamics
    • 1 Introduction
    • 2 Review of defining a stochastic process and white noise analysis
    • 3 Relations to Quantum Dynamics
    • 4 Addenda to foundations of the theories Concluding remarks
    • Acknowledgements
    • References
  • Statistics and Ergodicity of Wave Functions in Chaotic Open Systems
    • 1 Introduction
    • 2 Classical Nonergodicity and Short-Path Dynamics
    • 3 Universal Description of Wave Function Statistics
    • 4 Numerical Analyses and Discussions
    • 5 Conclusions
    • Acknowledgments
    • References
  • Origin of Quantum Probabilities
    • 1 Introduction
    • 2 Quantum formalism and perturbation effects
    • 3 Probability transformations connecting preparation procedures
    • 3 Hyperbolic and hyper-trigonometric probabilistic transformations
    • 4 Double stochasticity and correlations between preparation procedures
    • 5 Hyperbolic quantum formalism
    • 6 Physical consequences
    • Acknowledgements
    • References
  • Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies
    • 1 Introduction
    • 2 Analysis of the foundation of probability theory
    • 3 General principle of statistical stabilization of relative frequencies
    • 4 Probability distribution of a collective
    • 5 Model examples of p-adic statistics
    • Acknowledgements
    • References
  • Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta?
    • 1 Introduction
    • 2 De Broglie waves as an SED effect
    • 3 Schrödinger Equation
    • 4 Conclusions
  • A Probabilistic Inequality for the Kochen-Specker Paradox
    • 1 Introduction
    • 2 The Kochen-Specker theorem
    • 3 The Kochen-Specker inequality
    • 4 Independence
    • 5 Conclusions
  • Quantum Stochastics: The New Approach to the Description of Quantum Measurements
    • 1 Introduction
    • 2 Quantum stochastic approach
    • 3 Concluding remarks
    • 4 Acknowledgments
    • References
  • Abstract Models of Probability
    • 1 What probability sets o are possible
    • 2 Uniqueness of semigroups of zeros and units
    • 3 Probabilities with hidden parameters
    • 4 Probability sets with a single unit
    • 5 Acknowledgments
    • References
  • Quantum K-Systems and their Abelian Models
    • 1 Introduction
    • 2 Classical K-System
    • 3 Algebraic Quantum K-Systems
    • 4 Dynamical Entropy
    • 5 Some General Considerations on Abelian Models
    • 6 Abelian Models for Algebraic K-Systems
    • 7 Continuous K-Systems
    • 8 Mixing Properties Without Algebraic K-Property
    • 9 Time Evolution
    • References
  • Scattering in Quantum Tubes
    • 1 Introduction
    • 2 Tubes in quantum heterostructures
    • 3 Mathematical model
    • 4 Reformulated scattering problem
    • 5 Solution of the scattering problem
    • References
  • Position Eigenstates and the Statistical Axiom of Quantum Mechanics
    • 1 Quantum probabilities according to Deutsch
    • 2 Schrödinger's equation for a free particle as a consequence of position eigenstates
    • 3 Driven particle Weyl equation in general space-time
    • 4 Realizing Deutsch's substitution as a time evolution
    • 5 Can normalization be replaced by symmetry?
    • References
  • Is Random Event the Core Question? Some Remarks and a Proposal
    • 1 Preface
    • 2 Linguistic Model
    • 3 Ensemble Model
    • 4 Structural Model
    • 5 Certain and Uncertain Structures
    • 6 Probability
    • 7 Experimental Verification
    • 8 Objective and Subjective Probability
    • 9 Conclusions
    • References
  • Constructive Foundations of Randomness
    • 1 Introduction
    • 2 Kolmogorov Complexity
    • 3 Incompressibility
    • 4 Reversible Complexity
    • 5 Complexity and Information
    • 6 Frequency Rates
    • 7 Prefix Complexity
    • 8 Universal Probability
    • 9 Sequentially Coding Algorithms
    • References
  • Structure of Probabilistic Information and Quantum Laws
    • 1 Introduction
    • 2 Gaining experimental information
    • 3 Efficient representation of probabilistic information
    • 4 Predictions
    • 5 Discussion
    • Acknowledgments
    • References
  • Quantum Cryptography in Space and Bell's Theorem
    • 1 Introduction
    • 2 Bell's Inequality
    • 3 Localized Detectors
    • 4 Quantum Key Distribution
    • 5 Gaussian Wave Functions
    • 6 Conclusions
    • 7 Acknowledgments
    • References
  • Interacting Stochastic Process and Renormalization Theory
    • 1 Introduction
    • 2 Acknowledgments
    • References
Page 331: Foundations of Probability and Physics
Page 332: Foundations of Probability and Physics
Page 333: Foundations of Probability and Physics
Page 334: Foundations of Probability and Physics
Page 335: Foundations of Probability and Physics
Page 336: Foundations of Probability and Physics
Page 337: Foundations of Probability and Physics
Page 338: Foundations of Probability and Physics
Page 339: Foundations of Probability and Physics
Page 340: Foundations of Probability and Physics
Page 341: Foundations of Probability and Physics
Page 342: Foundations of Probability and Physics
Page 343: Foundations of Probability and Physics
Page 344: Foundations of Probability and Physics
Page 345: Foundations of Probability and Physics
Page 346: Foundations of Probability and Physics
Page 347: Foundations of Probability and Physics
Page 348: Foundations of Probability and Physics
Page 349: Foundations of Probability and Physics
Page 350: Foundations of Probability and Physics
Page 351: Foundations of Probability and Physics
Page 352: Foundations of Probability and Physics
Page 353: Foundations of Probability and Physics
Page 354: Foundations of Probability and Physics
Page 355: Foundations of Probability and Physics
Page 356: Foundations of Probability and Physics
Page 357: Foundations of Probability and Physics
Page 358: Foundations of Probability and Physics
Page 359: Foundations of Probability and Physics
Page 360: Foundations of Probability and Physics
Page 361: Foundations of Probability and Physics
Page 362: Foundations of Probability and Physics
Page 363: Foundations of Probability and Physics
Page 364: Foundations of Probability and Physics
Page 365: Foundations of Probability and Physics
Page 366: Foundations of Probability and Physics
Page 367: Foundations of Probability and Physics
Page 368: Foundations of Probability and Physics
Page 369: Foundations of Probability and Physics
Page 370: Foundations of Probability and Physics
Page 371: Foundations of Probability and Physics
Page 372: Foundations of Probability and Physics
Page 373: Foundations of Probability and Physics
Page 374: Foundations of Probability and Physics
Page 375: Foundations of Probability and Physics
Page 376: Foundations of Probability and Physics
Page 377: Foundations of Probability and Physics
Page 378: Foundations of Probability and Physics
Page 379: Foundations of Probability and Physics
Page 380: Foundations of Probability and Physics
Page 381: Foundations of Probability and Physics
Page 382: Foundations of Probability and Physics
Page 383: Foundations of Probability and Physics
Page 384: Foundations of Probability and Physics
Page 385: Foundations of Probability and Physics
Page 386: Foundations of Probability and Physics
Page 387: Foundations of Probability and Physics
Page 388: Foundations of Probability and Physics
Page 389: Foundations of Probability and Physics
Page 390: Foundations of Probability and Physics
Page 391: Foundations of Probability and Physics