Foundations of Probability and Physics
PQ-QP: Quantum Probability and White Noise Analysis
Volume XIII
Proceedings of the Conference
Foundations of Probability and Physics
Edited by A. Khrennikov
World Scientific
Managing Editor: W. Freudenberg
Advisory Board Members: L. Accardi, T. Hida, R. Hudson and K. R. Parthasarathy
PQ-QP Quantum Probability and White Noise Analysis
Vol. 13: Foundations of Probability and Physics, ed. A. Khrennikov
QP-PQ
Vol. 10: Quantum Probability Communications, eds. R. L. Hudson and J. M. Lindsay
Vol. 9: Quantum Probability and Related Topics, ed. L. Accardi
Vol. 8: Quantum Probability and Related Topics, ed. L. Accardi
Vol. 7: Quantum Probability and Related Topics, ed. L. Accardi
Vol. 6: Quantum Probability and Related Topics, ed. L. Accardi
Proceedings of the Conference
Foundations of Probability and Physics
Växjö, Sweden, 25 November - 1 December 2000
Edited by A. Khrennikov, University of Växjö, Sweden
World Scientific
New Jersey, London, Singapore, Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British Library.
FOUNDATIONS OF PROBABILITY AND PHYSICS
PQ-QP: Quantum Probability and White Noise Analysis, Vol. 13
Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4846-6
Printed in Singapore by World Scientific Printers (S) Pte Ltd
Foreword
With the present proceedings of a conference on Foundations of Probability and Physics we continue the QP series, the first volume of which appeared more than twenty years ago. The series had its origin in proceedings of conferences and workshops on quantum probability and related topics. Initially published by Springer-Verlag, the series has now been published by World Scientific for about ten years. Much has changed in the world of quantum probability in the last two decades. Quantum probabilistic methods have become a mature subject in mathematics and mathematical physics. The number of well-established scientists who have turned their scientific interest to the field of quantum probability is increasing impressively. Scientifically and numerically strong schools of quantum probability have evolved in the past years. Moreover, the highly interdisciplinary character of quantum probability has become more and more evident. In particular, the close connections to white noise analysis have aroused the interest of classical and quantum probabilists and stimulated mutual exchange and cooperation fruitful for both parties.
Taking this development into account, during the previous QP conferences we discussed comprehensively and in detail the future profile and main goals of the series. Some changes in the alignment and the objectives of the series resulted from these discussions. First of all, the new title reflects the intention to unify white noise analysis and quantum probability. It is important and essential to bring together classical and quantum probabilists, and the success of the World Scientific journal Infinite Dimensional Analysis, Quantum Probability and Related Topics shows that such an alliance will benefit both parties. Furthermore, we should be open to a wide audience of scientists and to a broad spectrum of themes. The present volume represents such a field, being not very closely connected to quantum probability and white noise analysis but of general interest to the readership of the series.
Future volumes of the series will include proceedings of conferences or workshops and lecture notes of schools, but also monographs on topics in quantum probability and white noise analysis.
Finally, we would like to thank all former editors of the series for the excellent job they did. We especially appreciate the enthusiastic commitment of Luigi Accardi, who initiated the series and was the responsible editor for many years.
Wolfgang Freudenberg
Contents
Foreword
Preface
Locality and Bell's Inequality. L. Accardi and M. Regoli
Refutation of Bell's Theorem. G. Adenier
Probability Conservation and the State Determination Problem. S. Aerts
Extrinsic and Intrinsic Irreversibility in Probabilistic Dynamical Laws. H. Atmanspacher, R. C. Bishop and A. Amann
Interpretations of Probability and Quantum Theory. L. E. Ballentine
Forcing Discretization and Determination in Quantum History Theories. B. Coecke
Interpretations of Quantum Mechanics and Interpretations of Violation of Bell's Inequality. W. M. De Muynck
Discrete Hessians in Study of Quantum Statistical Systems: Complex Ginibre Ensemble. M. M. Duras
Some Remarks on Hardy Functions Associated with Dirichlet Series. W. Ehm
Ensemble Probabilistic Equilibrium and Non-Equilibrium Thermodynamics without the Thermodynamic Limit. D. H. E. Gross
An Approach to Quantum Probability. S. Gudder
Innovation Approach to Stochastic Processes and Quantum Dynamics. T. Hida
Statistics and Ergodicity of Wave Functions in Chaotic Open Systems. H. Ishio
Origin of Quantum Probabilities. A. Khrennikov
Nonconventional Viewpoint to Elements of Physical Reality Based on Nonreal Asymptotics of Relative Frequencies. A. Khrennikov
Complementarity or Schizophrenia: Is Probability in Quantum Mechanics Information or Onta? A. F. Kracklauer
A Probabilistic Inequality for the Kochen-Specker Paradox. J.-A. Larsson
Quantum Stochastics: The New Approach to the Description of Quantum Measurements. E. Loubenets
Abstract Models of Probability. V. M. Maximov
Quantum K-Systems and their Abelian Models. H. Narnhofer
Scattering in Quantum Tubes. B. Nilsson
Position Eigenstates and the Statistical Axiom of Quantum Mechanics. L. Polley
Is Random Event the Core Question? Some Remarks and a Proposal. P. Rocchi
Constructive Foundations of Randomness. V. I. Serdobolskii
Structure of Probabilistic Information and Quantum Laws. J. Summhammer
Quantum Cryptography in Space and Bell's Theorem. Volovich
Interacting Stochastic Process and Renormalization Theory. Y. Volovich
Preface
This volume constitutes the proceedings of the Conference "Foundations of Probability and Physics", held in Växjö (Småland, Sweden) from 25 November to 1 December 2000.
The Organizing Committee of the Conference: L. Accardi (Rome, Italy), W. De Muynck (Eindhoven, the Netherlands), T. Hida (Meijo University, Japan), A. Khrennikov (Växjö University, Sweden) and U. V. Maximov (Belostok, Poland).
The purpose of the Conference (tentatively the first of a series) was to bring together scientists (physicists as well as mathematicians) who are interested in probabilistic foundations of physics. Emphasis was placed on both theory and experiment, the underlying objective being to offer to the physical and mathematical scientific communities a truly interdisciplinary Conference as a privileged place for scientific interaction among theoreticians and experimentalists. Due to the increased role of probabilistic foundations in physical applications (Einstein-Podolsky-Rosen correlation experiments, Bell's inequality, quantum information, computing and teleportation), as well as the necessity to reconsider foundations at the beginning of a new millennium, the organizers of the Conference decided that it was just the right time to take the scientific risk of trying this.
Since the creation of Statistical Mechanics, probabilistic description has played a more and more important role in physics. The next crucial step in the development of the statistical approach to physics was made in the process of the creation of quantum mechanics. The founders of quantum theory recognized that the quantum formalism could not provide a description of physical processes for individual elementary particles. The understanding of this surprising fact induced numerous debates on the possibilities of individual and probabilistic descriptions and the relations between them. These debates are characterized by a large diversity of opinions on the origin of quantum stochasticity.
One of the viewpoints is that quantum stochasticity differs from classical stochasticity, so quantum (statistical) mechanics cannot be reduced to classical statistical mechanics. This viewpoint implies the conventional interpretation of quantum mechanics.
By this interpretation we cannot use objective realism in the quantum description of reality. The very fundamental physical quantities, such as, for example, the position and momentum of an elementary particle, cannot be considered as properties of the object, the elementary particle. The elementary particle can be in a state that is a superposition of alternatives. Only the act of measurement makes it possible to choose between these alternatives.
We recall the historical roots of such a viewpoint, namely the idea of superposition.
In fact, the whole quantum building was built on two experimental cornerstones: 1) the experiment on photoelectric emission; 2) the two slit experiment.
The first experiment definitely demonstrated that light has a corpuscular structure (discrete structure of energy).
However, the second experiment demonstrated that photons (corpuscular objects) do not follow the standard CLASSICAL STATISTICS. The conventional rule for the addition of probabilistic alternatives,
$$P = P_1 + P_2,$$
is violated in the interference experiments. Instead of this rule, probabilities observed in interference experiments follow the quantum rule for the addition of probabilistic alternatives:
$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta.$$
Thus, in general, the classical rule is perturbed by the $\cos\theta$-factor. The appearance of NEW STATISTICS induced a revolution in theoretical physics: a reconsideration of the role of all basic elements of the physical theory. The common opinion was (and is) that the quantum probabilistic rule cannot be explained by a purely corpuscular model. To explain this rule we must appeal to wave arguments (see, for example, Dirac's book for a detailed analysis of the roots of the quantum mechanical formalism).
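The difference between the two addition rules can be made concrete with a small numerical sketch. The function names and the sample values of $P_1$, $P_2$ and $\theta$ below are illustrative assumptions, not taken from the text; the code only evaluates the two formulas quoted above.

```python
import math


def classical_sum(p1: float, p2: float) -> float:
    """Classical rule for the addition of alternatives: P = P1 + P2."""
    return p1 + p2


def quantum_sum(p1: float, p2: float, theta: float) -> float:
    """Interference rule: P = P1 + P2 + 2*sqrt(P1*P2)*cos(theta)."""
    return p1 + p2 + 2.0 * math.sqrt(p1 * p2) * math.cos(theta)


if __name__ == "__main__":
    p1, p2 = 0.25, 0.25  # illustrative values only
    for theta in (0.0, math.pi / 2, math.pi):
        # The two rules agree only where cos(theta) = 0.
        print(theta, classical_sum(p1, p2), quantum_sum(p1, p2, theta))
```

With these sample values the interference term ranges from fully constructive ($\theta = 0$) to fully destructive ($\theta = \pi$), while the classical sum stays fixed.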
This implies the wave-particle dualism and Bohr's principle of complementarity. This was a crucial change of the whole picture of physical reality (at least at the micro-level).
We underline again that all these revolutionary changes had a purely probabilistic root, namely the appearance of the new probabilistic rule. We also underline that the founders of quantum mechanics in fact did not provide a deep probabilistic analysis of the problem. Instead, they analysed other elements of the physical model. And such an analysis induced the new description of physical reality that we have already discussed, namely quantum reality. We will never know the real reasons for such a development of the theoretical study of the results of experiments with elementary particles at the beginning of the last century.
a Of course, we must also mention that the necessity for a departure from classical mechanics was shown by experiments demonstrating the remarkable stability of atoms and molecules. The forces known in classical electrodynamics are inadequate for the explanation of this phenomenon. However, the quantum mechanical explanation of such stability is in fact based on the same arguments as the explanation of the photoelectric effect.
b P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).
It might be that one of the reasons was the absence of a mathematical theory of probability: A. N. Kolmogorov proposed the modern axiomatics of probability theory only in 1933.
During the round table at this conference, Prof. T. Hida and Prof. I. Volovich pointed out the fundamental role of direct contacts between physicists and mathematicians in the creation of new physical theories. It may be that the absence of direct collaboration between the quantum physical and probabilistic communities was the main root of the absence of a deep probabilistic analysis of quantum behaviour.
Debates on the foundations of quantum mechanics were continued with new excitement in connection with the Einstein-Podolsky-Rosen (EPR) paradox. Unfortunately, the probabilistic element played a minor role in the EPR considerations. The notion of probability one was used (in a rather formal way) in the formulation of the sufficient condition to be an element of physical reality. A new probabilistic impulse to debates on the foundations of quantum mechanics was given by Bell's inequality. However, we must recognize that Bell's probabilistic considerations were performed on a formal level that cannot be considered satisfactory (at least from the point of view of a mathematician). It may be that this absence of a deep probabilistic analysis of the EPR and Bell arguments was one of the main reasons why investigations concentrated on nonlocality and no-go theorems for hidden variables.
The main aim of the conference "Foundations of Probability and Physics" was to provide a probabilistic analysis of the foundations of physics, classical as well as quantum (in particular, the EPR and Bell arguments). The present volume contains the results of such analysis. It gives a general picture of the probabilistic foundations of modern physics. Foundations of probability were considered in close connection with foundations of physics. We demonstrated that probability plays a fundamental role in models of physical reality. It seems to be impossible to split probabilistic and physical problems. On the one hand, many important problems that look purely physical are in fact just probabilistic problems. On the other hand, the right meaning of probability can be found only on the basis of physical investigations. Such a meaning depends strongly on a physical model.
The conference and the present volume give a good example of fruitful collaboration between physicists and mathematicians, and stimulate research on the foundations of probability and physics, especially quantum physics.
We would like to thank the Swedish Natural Science Foundation, the Swedish Technical Science Foundation, Växjö University and Växjö Commune for financial support that made the Conference possible. We would also like to thank Prof. Magnus Söderström, the Rector of Växjö University, for support of fundamental investigations and, in particular, this Conference.
Andrei Khrennikov
International Center for Mathematical Modelling in Physics and Cognitive Sciences
University of Växjö, Sweden
December 2000
LOCALITY AND BELL'S INEQUALITY
LUIGI ACCARDI, MASSIMO REGOLI
Centro Vito Volterra, Università di Roma Tor Vergata, Roma, Italy
Email: accardi@volterra.mat.uniroma2.it
We prove that the locality condition is irrelevant to Bell's inequality. We check that the real origin of Bell's inequality is the assumption of applicability of classical (Kolmogorovian) probability theory to quantum mechanics. We describe the chameleon effect, which allows one to construct an experiment realizing a local, realistic, classical, deterministic and macroscopic violation of the Bell inequalities.
1 Inequalities among numbers
In this section we summarize some elementary inequalities among numbers which correspond to different forms of the Bell inequality one meets in the literature. Since some confusion has arisen about the mutual relationships among these inequalities, in particular their (in)equivalence and the cases of equality, such a summary might not be totally useless.
Lemma (1) For any two numbers $a, c \in [-1, 1]$ the following equivalent inequalities hold:
$$|a \pm c| \le 1 \pm ac \tag{1}$$
Moreover, equality in (1) holds if and only if either $a = \pm 1$ or $c = \pm 1$.
Proof. The equivalence of the two inequalities (1) follows from the fact that one is obtained from the other by changing the sign of $c$, and $c$ is arbitrary in $[-1, 1]$.
Since for any $a, c \in [-1, 1]$ one has $1 \pm ac \ge 0$, (1) is equivalent to
$$|a \pm c|^2 = a^2 + c^2 \pm 2ac \le (1 \pm ac)^2 = 1 + a^2 c^2 \pm 2ac$$
and this is equivalent to
$$a^2(1 - c^2) + c^2 \le 1$$
which is identically satisfied because $1 - c^2 \ge 0$ and therefore
$$a^2(1 - c^2) + c^2 \le 1 - c^2 + c^2 = 1 \tag{2}$$
Notice that in (2) equality holds if and only if $a^2 = 1$, i.e. $a = \pm 1$. Since exchanging $a$ and $c$ in (1) leaves the inequality unchanged, the thesis follows.
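As a sanity check on Lemma (1), the inequality $|a \pm c| \le 1 \pm ac$ and its equality case can be tested exhaustively on a finite grid. The sketch below is only an illustration and not a proof; the grid spacing and the helper name `check_lemma1` are assumptions introduced here. Exact rational arithmetic is used so that the equality case is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# Rational grid on [-1, 1] with step 1/10 (an arbitrary illustrative choice).
GRID = [Fraction(i, 10) for i in range(-10, 11)]


def check_lemma1(grid=GRID):
    """Return (inequality_ok, equality_ok) for |a +/- c| <= 1 +/- a*c on the grid."""
    inequality_ok = True
    equality_ok = True
    for a, c in product(grid, repeat=2):
        for sign in (1, -1):
            lhs = abs(a + sign * c)
            rhs = 1 + sign * a * c
            if lhs > rhs:
                inequality_ok = False
            # Lemma (1): equality holds exactly when a = +/-1 or c = +/-1.
            on_boundary = abs(a) == 1 or abs(c) == 1
            if (lhs == rhs) != on_boundary:
                equality_ok = False
    return inequality_ok, equality_ok


if __name__ == "__main__":
    print(check_lemma1())
```

Both flags come out true on this grid, in agreement with the factorization $(1 \pm ac)^2 - (a \pm c)^2 = (1 - a^2)(1 - c^2)$ that underlies the proof.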
Corollary (2) For any three numbers $a, b, c \in [-1, 1]$ the following equivalent inequalities hold:
$$|ab \pm cb| \le 1 \pm ac \tag{3}$$
and equality holds if and only if $b = \pm 1$ and either $a = \pm 1$ or $c = \pm 1$.
Proof. For $b \in [-1, 1]$,
$$|ab \pm cb| = |b| \cdot |a \pm c| \le |a \pm c| \tag{4}$$
so the thesis follows from Lemma (1). In (4) equality holds if and only if $b = \pm 1$, so also the second statement follows from Lemma (1).
Lemma (3) For any numbers $a, a', b, b', c \in [-1, 1]$ one has
$$|ab - bc| + |ab' + b'c| \le 2 \tag{5}$$
$$ab + ab' + a'b' - a'b \le 2 \tag{6}$$
In (5) equality holds if and only if $b, b' = \pm 1$ and either $a$ or $c = \pm 1$.
Proof. Adding the two inequalities in (3) one finds (5). The left hand side of (6) is less than or equal to
$$|ab - ba'| + |ab' + b'a'| \tag{7}$$
and, replacing $a'$ by $c$, (7) becomes the left hand side of (5). Therefore (6) holds. If $b, b' = \pm 1$ and either $a$ or $c = \pm 1$, equality holds in (3), hence in (5). Conversely, suppose that equality holds in (5) and suppose that either $|b| < 1$ or $|b'| < 1$. Then we arrive at the contradiction
$$2 = |b| \cdot |a - a'| + |b'| \cdot |a + a'| < |a - a'| + |a + a'| \le (1 - aa') + (1 + aa') = 2 \tag{8}$$
So, if equality holds in (5), we must have $|b| = |b'| = 1$. In this case (5) becomes
$$|a - a'| + |a + a'| = 2 \tag{9}$$
and we know from Lemma (1) that the identity (9) can take place if and only if either $a$ or $a' = \pm 1$.
Corollary (4) If $a, a', b, b', c \in \{-1, 1\}$, then the inequalities (3), (6) and (5) are equivalent and equality holds in all of them.
Proof. From Lemma (1) we know that the inequalities (1) and (2) are equivalent. From Lemma (3) we know that (3) implies (5). Choosing $b' = a$ in (5), since $a = \pm 1$, (5) becomes
$$|ab - cb| \le 1 - ac$$
which is (3). The left hand side of (6) is
$$a(b + b') + a'(b' - b) \tag{10}$$
In our assumptions either $(b + b')$ or $(b' - b)$ is zero, so (10) reduces either to $a(b + b')$, for which
$$|a(b + b')| = |b + b'| = 2$$
or to $a'(b' - b)$, for which
$$|a'(b' - b)| = |b' - b| = 2$$
Corollary (5) If $a, b, c \in (-1, 1)$, then the inequality (5) (hence a fortiori (6)) is strictly weaker than (3).
Proof. We have already proved that (3) implies (5), hence (6). On the other hand, (5) is equivalent to
$$|ab - bc| \le (1 - ac) + (1 + ac - |ab' + b'c|) \tag{11}$$
By Lemma (1), $1 + ac - |ab' + b'c| \ge 0$, and equality holds if and only if $|b'| = 1$ and either $a$ or $c$ is $\pm 1$. From this the thesis follows.
2 The Bell inequality
Corollary (1) (Bell inequality) Let $A, B, C, D$ be random variables defined on the same probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, 1]$. Then the following inequalities hold:
$$|E(AB - BC)| \le 1 - E(AC) \tag{1}$$
$$|E(AB + BC)| \le 1 + E(AC) \tag{2}$$
$$|E(AB - BC)| + |E(AD + DC)| \le 2 \tag{3}$$
where $E$ denotes the expectation value in the probability space of the four variables. Moreover, (1) is equivalent to (2), and if either $A$ or $C$ has values $\pm 1$, then the three inequalities are equivalent.
Proof. Lemma (1.1) implies the following inequalities (interpreted pointwise on $\Omega$):
$$|AB - BC| \le 1 - AC$$
$$|AB + BC| \le 1 + AC$$
$$|AB - BC| + |AD + DC| \le 2$$
from which (1), (2), (3) follow by taking expectation and using the fact that $|E(X)| \le E(|X|)$. The equivalence is established by the same arguments as in Lemma (1.1).
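The mechanism of the proof (a pointwise inequality followed by averaging) can be illustrated numerically. The sketch below draws four random variables with values in $[-1, 1]$ from one common sample space and reports the slack of each of the three inequalities. The particular choice of the functions $A, B, C, D$, the sample size and the name `bell_gaps` are arbitrary assumptions made for illustration; any other variables defined on a single probability space would do.

```python
import random


def bell_gaps(n_samples=10_000, seed=0):
    """Slack (right side minus left side) of the three Bell-type inequalities.

    All four variables are functions of the same sample point, so the
    pointwise inequalities hold and survive averaging.
    """
    rng = random.Random(seed)
    s_ab = s_bc = s_ac = s_ad = s_dc = 0.0
    for _ in range(n_samples):
        # One sample point of the common space: (lam, u, d).
        lam = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)
        d = rng.choice((-1.0, 1.0))
        a = lam            # illustrative choices, all with values in [-1, 1]
        b = lam * u
        c = -lam ** 3
        s_ab += a * b
        s_bc += b * c
        s_ac += a * c
        s_ad += a * d
        s_dc += d * c
    e_ab, e_bc, e_ac = s_ab / n_samples, s_bc / n_samples, s_ac / n_samples
    e_ad, e_dc = s_ad / n_samples, s_dc / n_samples
    gap1 = (1 - e_ac) - abs(e_ab - e_bc)
    gap2 = (1 + e_ac) - abs(e_ab + e_bc)
    gap3 = 2 - abs(e_ab - e_bc) - abs(e_ad + e_dc)
    return gap1, gap2, gap3


if __name__ == "__main__":
    print(bell_gaps())
```

The three returned gaps are nonnegative for every seed, because the empirical average is itself an expectation on a (finite) probability space carrying all four variables at once.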
Remark (2) Bell's original proof, as well as the almost totality of the available proofs of Bell's inequality, deal only with the case of random variables assuming only the values $+1$ and $-1$. The present generalization is not without interest because it dispenses with the assumption that the classical random variables used to describe quantum observables have the same set of values as the latter ones: a hidden variable theory is required to reproduce the results of quantum theory only when the hidden parameters are averaged over.
Theorem (3) Let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_d^{(2)}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$. Then the following inequalities hold:
$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| \le 1 - E(S_a^{(1)} S_c^{(1)}) \tag{4}$$
$$|E(S_a^{(1)} S_b^{(2)}) + E(S_c^{(1)} S_b^{(2)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{5}$$
$$|E(S_a^{(1)} S_b^{(2)}) - E(S_c^{(1)} S_b^{(2)})| + |E(S_a^{(1)} S_d^{(2)}) + E(S_c^{(1)} S_d^{(2)})| \le 2 \tag{6}$$
Proof. This is a rephrasing of Corollary (1), with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, $D = S_d^{(2)}$.
3 Implications of Bell's inequalities for the singlet correlations
To apply Bell's inequalities to the singlet correlations considered in the EPR paradox, it is enough to observe that they imply the following.
Lemma (1) In the ordinary three-dimensional euclidean space there exist sets of three unit length vectors $a, b, c$ such that it is not possible to find a probability space $(\Omega, \mathcal{F}, P)$ and six random variables $S_x^{(j)}$ ($x = a, b, c$; $j = 1, 2$) defined on $(\Omega, \mathcal{F}, P)$ and with values in the interval $[-1, +1]$ whose correlations are given by
$$E(S_x^{(1)} \cdot S_y^{(2)}) = -x \cdot y, \qquad x, y = a, b, c \tag{1}$$
where, if $x = (x_1, x_2, x_3)$, $y = (y_1, y_2, y_3)$ are two three-dimensional vectors, $x \cdot y$ denotes their euclidean scalar product, i.e. the sum $x_1 y_1 + x_2 y_2 + x_3 y_3$.
Remark. In the usual EPR-type experiments the random variables $S_a^{(j)}, S_b^{(j)}, S_c^{(j)}$ represent the spin (or polarization) of particle $j$ of a singlet pair along the three directions $a, b, c$ in space. The expression in the right-hand side of (1) is the singlet correlation of two spin or polarization observables, theoretically predicted by quantum theory and experimentally confirmed by the Aspect-type experiments.
Proof. Suppose that for any choice of the unit vectors $x = a, b, c$ there exist random variables $S_x^{(j)}$ as in the statement of the Lemma. Then, using Bell's inequality in the form (2.5) with $A = S_a^{(1)}$, $B = S_b^{(2)}$, $C = S_c^{(1)}$, we obtain
$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 + E(S_a^{(1)} S_c^{(1)}) \tag{2}$$
Now notice that, if $x = y$ is chosen in (1), we obtain
$$E(S_x^{(1)} \cdot S_x^{(2)}) = -x \cdot x = -|x|^2 = -1, \qquad x = a, b, c$$
and, since $|S_x^{(j)}| \le 1$, this is possible if and only if $S_x^{(1)} = -S_x^{(2)}$ ($x = a, b, c$) $P$-almost everywhere. Using this, (2) becomes equivalent to
$$|E(S_a^{(1)} S_b^{(2)}) + E(S_b^{(2)} S_c^{(1)})| \le 1 - E(S_a^{(1)} S_c^{(2)})$$
or, again using (1), to
$$|a \cdot b + b \cdot c| \le 1 + a \cdot c \tag{3}$$
If the three vectors $a, b, c$ are chosen to be in the same plane and such that $a$ is perpendicular to $c$ and $b$ lies between $a$ and $c$, forming an angle $\theta$ with $a$, then the inequality (3) becomes
$$\cos\theta + \sin\theta \le 1, \qquad 0 < \theta < \pi/2 \tag{4}$$
But the maximum of the function $\theta \mapsto \sin\theta + \cos\theta$ in the interval $[0, \pi/2]$ is $\sqrt{2}$ (obtained for $\theta = \pi/4$). Therefore, for $\theta$ close to $\pi/4$ the left-hand side of (4) will be close to $\sqrt{2}$, which is more than 1. In conclusion, for such a choice of the unit vectors $a, b, c$, random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}, S_c^{(2)}$ as in the statement of the Lemma cannot exist.
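The violation can be checked directly from the geometry. The following sketch places $a$ and $c$ along two perpendicular axes, puts $b$ in their plane at angle $\theta$ from $a$, and evaluates both sides of $|a \cdot b + b \cdot c| \le 1 + a \cdot c$. The coordinate choice and the function names are assumptions made for the illustration.

```python
import math


def dot(u, v):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def bell_sides(theta):
    """Return (lhs, rhs) of |a.b + b.c| <= 1 + a.c for the planar configuration."""
    a = (1.0, 0.0, 0.0)
    c = (0.0, 1.0, 0.0)                          # a is perpendicular to c
    b = (math.cos(theta), math.sin(theta), 0.0)  # b at angle theta from a
    lhs = abs(dot(a, b) + dot(b, c))
    rhs = 1.0 + dot(a, c)
    return lhs, rhs


if __name__ == "__main__":
    for theta in (0.0, math.pi / 8, math.pi / 4, 3 * math.pi / 8, math.pi / 2):
        lhs, rhs = bell_sides(theta)
        print(f"theta={theta:.4f}  lhs={lhs:.4f}  rhs={rhs:.4f}  violated={lhs > rhs}")
```

At the endpoints $\theta = 0$ and $\theta = \pi/2$ the two sides are equal up to rounding, while for every angle strictly in between the left side exceeds 1, with the largest excess at $\theta = \pi/4$.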
Definition (2) A local realistic model for the EPR (singlet) correlations is defined by:
(1) a probability space $(\Omega, \mathcal{F}, P)$;
(2) for every unit vector $x$ in the three-dimensional euclidean space, two random variables $S_x^{(1)}, S_x^{(2)}$ defined on $\Omega$ and with values in the interval $[-1, +1]$, whose correlations for any $x, y$ are given by equation (1).
Corollary (3) If $a, b, c$ are chosen so as to violate (4), then a local realistic model for the EPR correlations, in the sense of Definition (2), does not exist.
Proof. Its existence would contradict Lemma (1).
Remark. In the literature one usually distinguishes two types of local realistic models: deterministic and stochastic ones. Both are included in Definition (2): the deterministic models are defined by random variables $S_x^{(j)}$ with values in the set $\{-1, +1\}$, while in the stochastic models the random variables take values in the interval $[-1, +1]$. The original paper [7] was devoted to the deterministic case. Starting from [9], several papers have been introduced to justify the stochastic models. We prefer to distinguish the definition of the models from their justification.
4 Bell on the meaning of Bell's inequality
In the last section of [8] (submitted before [7], but published after), Bell briefly describes Bohm's hidden variable interpretation of quantum theory, underlining its non local character. He then raises the question "that there is no proof that any hidden variable account of quantum mechanics must have this extraordinary character" and, in a footnote added during the proof corrections, he claims that "Since the completion of this paper such a proof has been found [7]".
In the short Introduction to [7], Bell reaffirms the same ideas, namely that the result proven by him in this paper shows that any such [hidden variable] theory which reproduces exactly the quantum mechanical predictions must have a grossly nonlocal structure.
The proof goes along the following scheme: Bell proves an inequality in which, according to what he says (cf. the statement after formula (1) in [7]):
"The vital assumption [2] is that the result B for particle 2 does not depend on the setting a of the magnet for particle 1, nor A on b."
The paper [2] mentioned in the above statement is nothing but the Einstein, Podolsky, Rosen paper [11], and the locality issue is further emphasized by the fact that he reports the famous statement of Einstein [12]: "But on one supposition we should, in my opinion, absolutely hold fast: the real factual situation of the system $S_2$ is independent of what is done with the system $S_1$, which is spatially separated from the former."
Stated otherwise: according to Bell, Bell's inequality is a consequence of the locality assumption.
It follows that a theory which violates the above mentioned inequality also violates the vital assumption needed, according to Bell, for its deduction, i.e. locality.
Since the experiments prove the violation of this inequality, Bell concludes that quantum theory does not admit a local completion; in particular, quantum mechanics is a nonlocal theory. To use again Bell's words: "the statistical predictions of quantum mechanics are incompatible with separable predetermination" ([7], p. 199). Moreover, this incompatibility has to be understood in the sense that "in a theory in which parameters are added to quantum mechanics to determine the results of individual measurements, without changing the statistical predictions, there must be a mechanism whereby the setting of one measuring device can influence the reading of another instrument, however remote. Moreover, the signal involved must propagate instantaneously."
5 Critique of Bell's vital assumption
An assumption should be considered vital for a theorem if, without it, the theorem cannot be proved.
To favor Bell, let us require much less. Namely, let us agree to consider his assumption vital if the theorem cannot be proved by taking as its hypothesis the negation of this assumption.
If even this minimal requirement is not satisfied, then we must conclude that the given assumption has nothing to do with the theorem.
Notice that Bell expresses his locality condition by the requirement that the result B for particle 2 should not depend on the setting a of the magnet for particle 1 (cf. the citation in the preceding section). Let us denote by $M_1$ ($M_2$) the space of all possible measurement settings on system 1 (2).
Theorem (1) For each unit vector $x$ in the three dimensional euclidean space ($x \in \mathbb{R}^3$, $|x| = 1$) let be given two random variables $S_x^{(1)}, S_x^{(2)}$ (spin of particle 1 (2) in direction $x$), defined on a space $\Omega$ with a probability $P$ and with values in the 2-point set $\{+1, -1\}$. Fix 3 of these unit vectors $a, b, c$ and suppose that the corresponding random variables satisfy the following non locality condition [violating Bell's vital assumption]: suppose that the probability space $\Omega$ has the following structure
$$\Omega = \Lambda \times M_1 \times M_2 \tag{1}$$
so that, for some functions $F_a^{(1)}, F_a^{(2)} : \Lambda \times M_1 \times M_2 \to [-1, 1]$,
$$S_a^{(1)}(\omega) = F_a^{(1)}(\lambda, m_1, m_2) \qquad (S_a^{(1)} \text{ depends on } m_2) \tag{2}$$
$$S_a^{(2)}(\omega) = F_a^{(2)}(\lambda, m_1, m_2) \qquad (S_a^{(2)} \text{ depends on } m_1) \tag{3}$$
with $m_1 \in M_1$, $m_2 \in M_2$, and similarly for $b$ and $c$ [nothing changes in the proof if we add further dependences: for example, $F_a^{(2)}$ may depend on all the $S_x^{(1)}(\omega)$ and $F_a^{(1)}$ on all the $S_x^{(2)}(\omega)$].
Then the random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the inequality
$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 - \langle S_a^{(1)} S_c^{(1)} \rangle \tag{4}$$
If moreover the singlet condition
$$\langle S_x^{(1)} \cdot S_x^{(2)} \rangle = -1, \qquad x = a, b, c \tag{5}$$
is also satisfied, then Bell's inequality holds in the form
$$|\langle S_a^{(1)} S_b^{(2)} \rangle - \langle S_b^{(2)} S_c^{(1)} \rangle| \le 1 + \langle S_a^{(1)} S_c^{(2)} \rangle \tag{6}$$
Proof. The random variables $S_a^{(1)}, S_b^{(2)}, S_c^{(1)}$ satisfy the assumptions of Corollary (2.3), therefore (4) holds. If also condition (5) is satisfied then, since the variables take values in the set $\{-1, +1\}$, with probability 1 one must have
$$S_x^{(1)} = -S_x^{(2)} \qquad (x = a, b, c) \tag{7}$$
and therefore $\langle S_a^{(1)} S_c^{(1)} \rangle = -\langle S_a^{(1)} S_c^{(2)} \rangle$. Using this identity, (4) becomes (6).
Summing up: Theorem (1) proves that Bell's inequality is satisfied if one takes as hypothesis the negation of his vital assumption. From this we conclude that Bell's vital assumption not only is not vital but in fact has nothing to do with Bell's inequality.
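The content of Theorem (1) can be illustrated with a small finite model. In the sketch below the sample space is a product $\Lambda \times M_1 \times M_2$ of three finite sets, and each of the three random variables is an arbitrary $\pm 1$-valued table indexed by the full triple $(\lambda, m_1, m_2)$, so every variable is allowed to depend on the setting of the distant apparatus. The set sizes, the random tables, the random weights and the name `nonlocal_model_gap` are all assumptions made for illustration; the code only evaluates inequality (4) exactly on such a model.

```python
import random
from itertools import product


def nonlocal_model_gap(seed=0, n_lambda=5, n_settings=3):
    """Slack of inequality (4) for a randomly generated 'nonlocal' finite model."""
    rng = random.Random(seed)
    # Sample space Omega = Lambda x M1 x M2 (all finite here).
    omega = list(product(range(n_lambda), range(n_settings), range(n_settings)))
    weights = [rng.random() for _ in omega]
    total = sum(weights)
    prob = {w: p / total for w, p in zip(omega, weights)}

    # Each table depends on lambda AND on both settings m1, m2.
    s_a1 = {w: rng.choice((-1, 1)) for w in omega}
    s_b2 = {w: rng.choice((-1, 1)) for w in omega}
    s_c1 = {w: rng.choice((-1, 1)) for w in omega}

    def mean(f, g):
        """Expectation of the product f*g under prob."""
        return sum(prob[w] * f[w] * g[w] for w in omega)

    lhs = abs(mean(s_a1, s_b2) - mean(s_b2, s_c1))
    rhs = 1 - mean(s_a1, s_c1)
    return rhs - lhs


if __name__ == "__main__":
    print(min(nonlocal_model_gap(seed=s) for s in range(100)))
```

The slack is nonnegative (up to rounding) for every generated model, which is the point of the theorem: the inequality needs only a single probability space carrying the three variables, not any locality restriction on how they depend on the settings.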
REMARK. Using Lemma (14.1) below, we can allow that the observables take values in $[-1, 1]$ also in Theorem (1).
REMARK. The above discussion is not a refutation of the Bell inequality: it is a refutation of Bell's claim that his formulation of locality is an essential assumption for its validity. Since the locality assumption is irrelevant for the proof of Bell's inequality, it follows that this inequality cannot discriminate between local and non local hidden variable theories, as claimed both in the introduction and in the conclusions of Bell's paper.
In particular, Theorem (1) gives an example of situations in which:
(i) Bell's locality condition is violated while his inequality is satisfied.
In a recent experiment with M. Regoli [4] we have produced examples of situations in which:
(ii) Bell's locality condition is satisfied while his inequality is violated.
6 The role of the counterfactual argument in Bell's proof
Bell uses the counterfactual argument in an essential way in his proof, because it is easy to check that formula (13) in [7] is the one which allows him to reduce, in the proof of his inequality, all considerations to the A-variables ($S_a^{(1)}$ in our notations, while Bell's B-variables are the $S_a^{(2)}$ in our notations). The pairs of chameleons (cf. section (10)), as well as the experiment of [4], provide a counterexample precisely to this formula.
7 Proofs of Bell's inequality based on counting arguments
There is a widespread illusion that the above mentioned critiques can be exorcized by restricting one's considerations to results of measurements. The following considerations show why this is an illusion.
The counting arguments usually used to prove the Bell inequality are all based on the following scheme. In the same notations used up to now, one considers $N$ simultaneous measurements of the singlet pairs of observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, and one denotes by $S_{x,\nu}^{(j)}$ the results of the $\nu$-th measurement of $S_x^{(j)}$ ($j = 1, 2$; $x = a, b, c$; $\nu = 1, \dots, N$). With these notations one can calculate the empirical correlations on the samples, that is
$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{\nu=1}^{N} S_{a,\nu}^{(1)} S_{b,\nu}^{(2)} \tag{1}$$
(and similarly for the other ones). In the Bell inequality 3 such correlations are involved:
$$\langle S_a^{(1)} S_b^{(2)} \rangle, \qquad \langle S_c^{(1)} S_b^{(2)} \rangle, \qquad \langle S_a^{(1)} S_c^{(2)} \rangle \tag{2}$$
Thus, in the three experiments, observer 1 has to measure $S_a^{(1)}$ in the first and third experiment and $S_c^{(1)}$ in the second, while observer 2 has to measure $S_b^{(2)}$ in the first and second experiment and $S_c^{(2)}$ in the third. Therefore the directions $a$ and $b$ can be chosen arbitrarily by the two observers, and it is not necessary that observer 1 is informed of the choice of observer 2, or conversely. However, the direction $c$ has to be chosen by both observers, and therefore at least on this direction there should be a preliminary agreement among the two observers. This preliminary information can be replaced by a procedure in which each observer chooses at will the three directions, and only those choices are considered for which it happens (by chance) that the second choice of observer 1 coincides with the third of observer 2 (cf. section (15) for further discussion of this point). Whichever procedure has been chosen, after the results of the experiments one can compute the 3 empirical correlations
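$$\langle S_a^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) \qquad (3)$$

$$\langle S_c^{(1)} S_b^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \qquad (4)$$

$$\langle S_a^{(1)} S_c^{(2)} \rangle = \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(3)})\, S_c^{(2)}(p_j^{(3)}) \qquad (5)$$

where $p_j^{(3)}$ means the $j$-th point of the third experiment, etc. If we try to apply the Bell argument directly to the empirical data given by the right hand sides of (3), (4), (5), we meet the expression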
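$$\left| \frac{1}{N} \sum_{j=1}^{N} S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) - \frac{1}{N} \sum_{j=1}^{N} S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \right| \qquad (6)$$

from which we immediately see that, if we try to apply Bell's reasoning to the empirical data, we are stuck at the first step, because we find a sum of terms of the type

$$S_a^{(1)}(p_j^{(1)})\, S_b^{(2)}(p_j^{(1)}) - S_c^{(1)}(p_j^{(2)})\, S_b^{(2)}(p_j^{(2)}) \qquad (7)$$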
to which the inequalities among numbers of section (1) cannot be applied because in general
More explicitly, since the expression (7) above is of the form

$$ab - b'c \qquad (8)$$

with $a, b, b', c \in \{\pm 1\}$, the only possible upper bound for it is 2, and not $1 - ac$. Even supposing that we, in order to uphold Bell's thesis, can introduce a cleaning operation [3] (cf. [4]) which eliminates all the points in which (8) is not satisfied, we would arrive at the inequality
jf E^frf) Wgt) - jf E ^ f W (f) j = i 3 = 1
lt i-^E^W^fef) (9) j = i
and, in order to deduce from this something comparable with the experiments, we need to use the counterfactual argument, asserting that
^ 1 (p 9 ) ) = -sltagt(Pa)) (2h (10)
But in the second experiment $S_b^{(2)}$, and not $S_c^{(2)}$, has been measured. Thus to postulate the validity of (10) means to postulate that the value assumed by $S_b^{(2)}$ in the second experiment is the same that we would have found if $S_c^{(2)}$, and not $S_b^{(2)}$, had been measured. The chameleon effect provides a counterexample to this statement.
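The gap between the two bounds discussed in this section can be checked by brute force. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the values are restricted to $\pm 1$ as in the text. It compares the case where the same number $b$ appears in both products with the case where two different numbers $b$, $b'$ appear.

```python
from itertools import product

SIGNS = (-1, 1)


def bell_gap_same_b():
    """Largest value of |ab - cb| - (1 - ac) over a, b, c in {-1, +1}.

    A result of 0 means |ab - cb| <= 1 - ac always holds (with equality
    attained), i.e. the inequality among numbers is valid when the same b
    enters both products.
    """
    return max(abs(a * b - c * b) - (1 - a * c)
               for a, b, c in product(SIGNS, repeat=3))


def max_with_two_b():
    """Largest value of |ab - cb'| when b and b' are allowed to differ."""
    return max(abs(a * b - c * b2)
               for a, b, b2, c in product(SIGNS, repeat=4))


if __name__ == "__main__":
    print("max of |ab - cb| - (1 - ac):", bell_gap_same_b())
    print("max of |ab - cb'|          :", max_with_two_b())
```

With a shared $b$ the bound $1 - ac$ is never exceeded, whereas with independent $b$, $b'$ the only general bound is 2, which is the point made about the empirical data.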
8 The quantum probabilistic analysis
Given the results of sections (5), (6), (7), it is then legitimate to ask: if Bell's vital assumption is irrelevant for the deduction of Bell's inequality, which is the really vital assumption which guarantees the validity of this inequality?
This natural question was first answered in [1], and this result motivated the birth of quantum probability as something more than a mere noncommutative generalization of probability theory: in fact, a necessity motivated by experimental data.
Theorem (23) has only two assumptions
(i) that the random variables take values in the interval $[-1, +1]$;
(ii) that the random variables are defined on the same probability space.
Since we are dealing with spin variables, assumption (i) is reasonable. Let us consider assumption (ii). This is equivalent to the claim that the three probability measures $P_{ab}$, $P_{cb}$, $P_{ac}$, representing the distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ respectively, can be obtained by restriction from a single probability measure $P$, representing the distribution of the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$.
This is indeed a strong assumption because, due to the incompatibility of the spin variables along non-parallel directions, the three correlations

$$\langle S_a^{(1)} S_b^{(2)} \rangle, \quad \langle S_c^{(1)} S_b^{(2)} \rangle, \quad \langle S_a^{(1)} S_c^{(2)} \rangle \qquad (1)$$

can only be estimated in different, in fact mutually incompatible, series of experiments. If we label each series of experiments by the corresponding pair (i.e. $(a, b)$, $(b, c)$, $(c, a)$), then we cannot exclude the possibility that also the probability measure in each series of experiments will depend on the corresponding pair. In other words, each of the measures $P_{ab}$, $P_{bc}$, $P_{ca}$ describes the joint statistics of a pair of commuting observables $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$, and there is no a priori reason to postulate that all these joint distributions for pairs can be deduced from a single distribution for the quadruple $\{S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}\}$.
We have already proved in Theorem (23) that this strong assumption implies the validity of the Bell inequality. Now let us prove that it is the truly vital assumption for the validity of this inequality, i.e. that if this assumption is dropped, that is, if no single distribution for quadruples exists, then it is an easy exercise to construct counterexamples violating Bell's inequality. To this goal one can use the following lemma.
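Lemma (1) Let $P_{ab}$, $P_{ac}$, $P_{cb}$ be three probability measures on a given (measurable) space $(\Omega, \mathcal{F})$, and let $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ be functions defined on $(\Omega, \mathcal{F})$ with values in the interval $[-1, +1]$ and such that the probability measure $P_{ab}$ (resp. $P_{cb}$, $P_{ac}$) is the distribution of the pair $(S_a^{(1)}, S_b^{(2)})$ (resp. $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$). For each pair define the corresponding correlation

$$K_{ab} := \langle S_a^{(1)} S_b^{(2)} \rangle = \int S_a^{(1)} S_b^{(2)} \, dP_{ab} \qquad (1)$$

and suppose that, for $\varepsilon, \varepsilon' = \pm$, the joint probabilities for pairs

$$P_{xy}^{\varepsilon \varepsilon'} := P(S_x^{(1)} = \varepsilon \, ; \, S_y^{(2)} = \varepsilon')$$

satisfy

$$P_{xy}^{++} = P_{xy}^{--}, \qquad P_{xy}^{+-} = P_{xy}^{-+} \qquad (2)$$

$$P_x^{+} = P_x^{-} = 1/2 \qquad (3)$$

Then the Bell inequality

$$|K_{ab} - K_{cb}| \le 1 - K_{ac} \qquad (4)$$

is equivalent to

$$|P_{ab}^{++} - P_{cb}^{++}| + P_{ac}^{++} \le \frac{1}{2} \qquad (5)$$

Proof. The inequality (4) is equivalent to

$$|2P_{ab}^{++} - 2P_{ab}^{+-} - 2P_{cb}^{++} + 2P_{cb}^{+-}| \le 1 - 2P_{ac}^{++} + 2P_{ac}^{+-} \qquad (6)$$

Using the identity (equivalent to (3))

$$P_{xy}^{+-} = \frac{1}{2} - P_{xy}^{++} \qquad (7)$$

the left hand side of (4) becomes the modulus of

$$2(P_{ab}^{++} - P_{ab}^{+-}) - 2(P_{cb}^{++} - P_{cb}^{+-}) = 2\left(P_{ab}^{++} - \frac{1}{2} + P_{ab}^{++}\right) - 2\left(P_{cb}^{++} - \frac{1}{2} + P_{cb}^{++}\right) = 4(P_{ab}^{++} - P_{cb}^{++}) \qquad (8)$$

and, again using (7), the right hand side of (6) is equal to

$$1 - 2\left(P_{ac}^{++} - \frac{1}{2} + P_{ac}^{++}\right) = 2 - 4P_{ac}^{++} \qquad (9)$$

Summing up, (4) is equivalent to

$$|P_{ab}^{++} - P_{cb}^{++}| \le \frac{1}{2} - P_{ac}^{++} \qquad (10)$$

which is (5).

Corollary (2) There exist triples $P_{ab}$, $P_{ac}$, $P_{cb}$ on the 4-point space $\{+1, -1\} \times \{+1, -1\}$ which satisfy conditions (2), (3) of Lemma (1) and are not compatible with any probability measure $P$ on the 8-point space $\{+1, -1\} \times \{+1, -1\} \times \{+1, -1\}$.

Proof. Because of conditions (2), (3), the probability measures $P_{ab}$, $P_{ac}$, $P_{cb}$ are uniquely determined by the three numbers

$$P_{ab}^{++}, \ P_{ac}^{++}, \ P_{cb}^{++} \in [0, 1/2] \qquad (11)$$

Thus, if we choose these three numbers so that the inequality (5) is not satisfied, the Bell inequality (4) cannot be satisfied, because of Lemma (1).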
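The lemma and its corollary lend themselves to a direct numerical check. The sketch below is only illustrative: the helper names are hypothetical, and it assumes the reading of conditions (2), (3) in which each pair measure on $\{+1,-1\}^2$ is fixed by the single number $P^{++} \in [0, 1/2]$. It verifies on a grid that the Bell inequality (4) and condition (5) agree, and exhibits one choice of the three numbers that violates both.

```python
from fractions import Fraction as F


def pair_measure(p_pp):
    """Symmetric measure on {+1,-1}^2 with P++ = P-- = p_pp and uniform marginals."""
    p_pm = F(1, 2) - p_pp  # identity (7): P+- = 1/2 - P++
    return {(1, 1): p_pp, (-1, -1): p_pp, (1, -1): p_pm, (-1, 1): p_pm}


def correlation(measure):
    """Expectation of the product of the two +/-1 valued coordinates."""
    return sum(x * y * w for (x, y), w in measure.items())


def bell_holds(p_ab, p_cb, p_ac):
    """Bell inequality (4): |K_ab - K_cb| <= 1 - K_ac."""
    k_ab, k_cb, k_ac = (correlation(pair_measure(p)) for p in (p_ab, p_cb, p_ac))
    return abs(k_ab - k_cb) <= 1 - k_ac


def condition_5(p_ab, p_cb, p_ac):
    """Condition (5): |P_ab^{++} - P_cb^{++}| + P_ac^{++} <= 1/2."""
    return abs(p_ab - p_cb) + p_ac <= F(1, 2)


# Exact rational grid on [0, 1/2] for the three free parameters.
grid = [F(i, 20) for i in range(11)]
agree = all(bell_holds(x, y, z) == condition_5(x, y, z)
            for x in grid for y in grid for z in grid)

# One violating choice: P_ab^{++} = 1/2, P_cb^{++} = 0, P_ac^{++} = 1/2.
violating = (F(1, 2), F(0), F(1, 2))

if __name__ == "__main__":
    print("(4) and (5) agree on the grid:", agree)
    print("Bell inequality for the violating triple:", bell_holds(*violating))
```

Exact fractions are used so that the boundary cases of the two inequalities are compared without rounding error.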
9 The realism of ballot boxes and the corresponding statistics
The fact that there is no a priori reason to postulate that the joint distributions of the pairs $(S_a^{(1)}, S_b^{(2)})$, $(S_c^{(1)}, S_b^{(2)})$, $(S_a^{(1)}, S_c^{(2)})$ can be deduced from a single distribution for the quadruple $S_a^{(1)}, S_c^{(1)}, S_b^{(2)}, S_c^{(2)}$ does not necessarily mean that such a common joint distribution does not exist.
On the contrary in several physically meaningful situations we have good reasons to expect that such a joint distribution should exist even if it might not be accessible to direct experimental verification
This is a simple consequence of the so-called hypothesis of realism, which is justified whenever we are entitled to believe that the results of our measurements are pre-determined. In the words of Bell: "Since we can predict in advance the result of measuring any chosen component of $\sigma_2$, by previously measuring the same component of $\sigma_1$, it follows that the result of any such measurement must actually be predetermined."
Consider for example a box containing pairs of balls. Suppose that the experiments allow one to measure either the color, or the weight, or the material of which each ball is made, but the rules of the game are that on each ball only one measurement at a time can be performed. Suppose moreover that the experiments show that, for each property, only two values are realized, and that whenever a simultaneous measurement of the same property on the two elements of a pair is performed, the resulting answers are always discordant. Up to a change of convention, and in appropriate units, we can always suppose that these two values are $\pm 1$, and we shall do so in the following.
Then the joint distributions of pairs (of properties relative to different balls) are accessible to experiment but those of triples or quadruples are not
Nevertheless it is reasonable to postulate that in the box there is a well defined (although purely Platonic, in the sense of not being accessible to experiment) number of balls with each given color, weight and material. These numbers give the relative frequencies of triples of properties for each element of the pair, hence, using the perfect anticorrelation, a family of joint probabilities for all the possible sextuples. More precisely, due to the perfect anticorrelation, the relative frequencies of the triples of properties

$$[S_a^{(1)} = a_1], \ [S_b^{(1)} = b_1], \ [S_c^{(1)} = c_1]$$

where $a_1, b_1, c_1 = \pm 1$, are equal to the relative frequencies of the sextuples of properties

$$[S_a^{(1)} = a_1], \ [S_b^{(1)} = b_1], \ [S_c^{(1)} = c_1], \ [S_a^{(2)} = -a_1], \ [S_b^{(2)} = -b_1], \ [S_c^{(2)} = -c_1]$$

and since we are confining ourselves to the case of 3 properties and 2 particles, the above ones, when $a_1, b_1, c_1$ vary in all possible ways in the set $\pm 1$, are all the possible configurations. In this situation the counterfactual argument is applicable, and in fact we have used it to deduce the joint distribution of sextuples from the joint distributions of triples.
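The ballot box picture can be made concrete with a small simulation. This is a hedged sketch: the box construction, the seed and the function names are illustrative assumptions, not taken from the text. Each pair carries a predetermined triple of $\pm 1$ values for ball 1 and the opposite triple for ball 2, so all pair correlations come from one joint distribution. Because of the perfect anticorrelation, the inequality for pair correlations then takes the form $|K_{ab} - K_{cb}| \le 1 + K_{ac}$, which follows from the inequality among numbers $|ab - cb| \le 1 - ac$ applied to the triple of ball 1.

```python
import random
from fractions import Fraction
from itertools import product


def make_box(n_pairs, seed=0):
    """Box of ball pairs; ball 2 always carries the opposite triple of ball 1."""
    rng = random.Random(seed)
    triples = list(product((-1, 1), repeat=3))
    box = []
    for _ in range(n_pairs):
        a, b, c = rng.choice(triples)
        box.append(((a, b, c), (-a, -b, -c)))
    return box


def pair_correlation(box, i, j):
    """Exact mean of (property i of ball 1) * (property j of ball 2)."""
    return Fraction(sum(b1[i] * b2[j] for b1, b2 in box), len(box))


A, B, C = 0, 1, 2  # indices of the three two-valued properties
box = make_box(10_000)
k_ab = pair_correlation(box, A, B)
k_cb = pair_correlation(box, C, B)
k_ac = pair_correlation(box, A, C)

if __name__ == "__main__":
    print("same property on both balls:", pair_correlation(box, A, A))
    print("|K_ab - K_cb| =", abs(k_ab - k_cb), " 1 + K_ac =", 1 + k_ac)
```

Any other way of filling the box (a different seed, or a biased choice of triples) gives the same conclusion, since the bound holds pair by pair before averaging.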
10 The realism of chameleons and the corresponding statistics
According to the quantum probabilistic interpretation, what Einstein, Podolsky, Rosen, Bell and several others who have discussed this topic call the hypothesis of realism should be called, in a more precise way, the hypothesis of the ballot box realism, as opposed to the hypothesis of the chameleon realism.
The point is that, according to the quantum probabilistic interpretation, the term "predetermined" should not be confused with the term "realized a priori", which has been discussed in section (9): it might be conditionally decided, according to the scheme: "if such and such will happen, I will react so and so".
The chameleon provides a simple example of this distinction: a chameleon becomes deterministically green on a leaf and brown on a log. In this sense we can surely claim that its color on a leaf is predetermined. However this does not mean that the chameleon was green also before jumping on the leaf.
The chameleon metaphor describes a mechanism which is perfectly local, even deterministic, and surely classical and macroscopic; moreover, there is no doubt that the situation it describes is absolutely realistic. Yet this realism, being different from the ballot box realism, frees from metaphysics statements of the orthodox interpretation such as: "the act of measurement creates the value of the measured observable". To many this looks like metaphysics or magic, but look how natural it sounds when you think of the color of a chameleon.
Finally, and most important for its implications for the EPR argument, the chameleon realism provides a simple and natural example of a situation in which the results are predetermined, yet the counterfactual argument is not applicable.
Imagine in fact a box in which there are many pairs of chameleons. In each pair there is exactly one healthy chameleon, which becomes green on a leaf and brown on a log, and one mutant, which becomes brown on a leaf and green on a log; moreover, exactly one of the chameleons in each pair weighs 100 grams and exactly one 200 grams. A measurement consists in separating the members of each pair, each one in a smaller box, and in performing one and only one measurement on each member of each pair.
The color on the leaf, the color on the log and the weight are 2-valued observables (because we do not know a priori if we are measuring the healthy or the mutant chameleon). Thus, with respect to the observables color on the leaf, color on the log and weight, the pairs of chameleons behave exactly as EPR pairs: whenever the same observable is measured on both elements of a pair, the results are opposite. However, suppose I measure the color on the leaf of one element of a pair and the weight of the other one, and suppose the answers I find are green and 100 grams. Can I conclude that the second element of the pair is brown and weighs 100 grams? Clearly not, because there is no reason to believe that the second member of the pair, of which the weight was measured while in a box, was also on a leaf.
From this point of view the measurement interaction enters the very definition of an observable. However, also in this interpretation, which is more similar to the quantum mechanical situation, the counterfactual argument cannot be applied, because it amounts to answering "brown" to the question: which is the color on the leaf, if I have measured the weight and if I know that the chameleon is the mutant one (this because the measurement of the other one gave green on the leaf)? But this answer is not correct: it could well be that inside the box there is a leaf and the chameleon is interacting with it while I am measuring its weight, but it could also be that it is interacting with a log, also contained inside the box, in which case, being a mutant, it would be green.
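The chameleon pairs described above can be modelled in a few lines. This is only a toy sketch under stated assumptions: the class, field and function names are hypothetical, and the two environments are reduced to the strings "leaf" and "log". The point it illustrates is that the color is a deterministic function of the environment met during the measurement, not a value stored in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chameleon:
    mutant: bool
    weight_grams: int

    def color(self, environment: str) -> str:
        """Color taken in a given environment ("leaf" or "log")."""
        healthy_color = {"leaf": "green", "log": "brown"}[environment]
        if not self.mutant:
            return healthy_color
        # The mutant reacts in the opposite way to the healthy chameleon.
        return "brown" if healthy_color == "green" else "green"


def make_pair(first_is_mutant: bool, first_is_light: bool):
    """One healthy and one mutant chameleon, one of 100 g and one of 200 g."""
    w1, w2 = (100, 200) if first_is_light else (200, 100)
    return Chameleon(first_is_mutant, w1), Chameleon(not first_is_mutant, w2)


# Scenario of the text: the first member is green on the leaf (so it is the
# healthy one), and the second member is weighed and found to be 100 grams.
first, second = make_pair(first_is_mutant=False, first_is_light=False)

if __name__ == "__main__":
    print("first on leaf :", first.color("leaf"))
    print("second weight :", second.weight_grams)
    print("second on leaf:", second.color("leaf"))
    print("second on log :", second.color("log"))
```

The same observable measured on both members always gives opposite answers, yet the color of the second member during the weighing depends on what it was interacting with inside its box, which is exactly why the counterfactual inference fails.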
Therefore, if we can produce an example of a 2-particle system in which the Heisenberg evolution of each particle's observables satisfies Bell's locality condition, but the Schroedinger evolution of the state, i.e. the expectation value $\langle \cdot \rangle$, depends on the pair $(a, b)$ of measured observables, we can claim that this counterexample abides by the same definition of locality as Bell's theorem.
11 Bell's inequalities and the chameleon effect
Definition (1) Let S be a physical system and $\mathcal{O}$ a family of observable quantities relative to this system. We say that the chameleon effect is realized on S if, for any measurement M of an observable $A \in \mathcal{O}$, the dynamical evolution of S depends on the observable A. If D denotes the state space of S, this means that the change of state from the beginning to the end of the experiment is described by a map (a one-parameter group or semigroup in the case of continuous time)

$$T_A : D \to D$$

Remark. The explicit form of the dependence of $T_A$ on A depends on both the system and the measurement, and many concrete examples can be constructed. An example in the quantum domain is discussed in [3], and the experiment of [4] realizes an example in the classical domain.
Remark. If the system S is composed of two sub-systems $S_1$ and $S_2$, we can also consider the case in which the evolutions of the two subsystems are different, in the sense that for system 1 we have one form of functional dependence, $T_A^{(1)}$, of the evolution associated to the observable A, and for system 2 we have another form of functional dependence, $T_A^{(2)}$. In the experiment of [4] the state space is the unit disk D in the plane, the observables are parametrized by angles in $[0, 2\pi)$ (or equivalently by unit vectors in the unit circle) and for each observable $S_\alpha^{(1)}$ of system 1
and for each observable Sbdquo of system 2
where Ra denotes (counterclockwise) rotation of an angle a Let us consider Bells inequalities by assuming that a chamaleon effect
is present Denoting E the common initial state of the composite system (12) (eg singlet state) the state at the end of the measurement will be
Now replace $S_x^{(j)}$ by

$$\tilde{S}_x^{(j)} := S_x^{(j)} \circ T_x^{(j)}$$
Since the Sx take values plusmn 1 we know from Theorem (23) that if we postulate
the existence of joint probabilities for the triple 5bdquo S^ Sc compatible with
the two correlations E(si1S^2)) E(si1S^2)) then the inequality
E(S^si2)) - E(S^si2)) lt 1 - E(S^S^)
holds and if we also have the singlet condition
ESpoundTWp)STWp)) = -l (1)
then ae
and we have Bell's inequality. Thus, if we postulate the same probability space, even the chameleon effect alone is not sufficient to guarantee violation of Bell's inequality.
Therefore the fact that the three experiments are done on different and incompatible samples must play a crucial role.
As far as the chameleon effect is concerned, let us notice that, in the above statement of the problem, the fact that we use a single initial probability measure E is equivalent to postulating that at time t = 0 the three pairs of observables
(^U2)) (sMagt) (^U1) admit a common joint distribution in fact E
12 Physical implausibility of Bell's argument
In this section we show that, combining the chameleon effect with the fact that the three experiments refer to different samples, even in very simple situations no cleaning conditions can lead to a proof of Bell's inequality.
If we try to apply Bell's reasoning to the empirical data, we have to start from the expression
~ E^W^sfcr^) -1 E^crJV)^(if Pf) 3 3
(1)
which we majorize by
^ E W^P^iT^p]) - SW(TJ V ) s f (tf V ) (2) N
3
But if we try to apply the inequality among numbers to the expression
SPiT^S^iTiW) - STWpraquo)sl2Traquo) (3)
we see that we are not dealing with the situation covered by Corollary (12), i.e.

$$|ab - cb| \le 1 - ac \qquad (4)$$

because, since
si2)(T^)^S^(T^Py) (5)
the left hand side of (4) must be replaced by

$$|ab - cb'| \qquad (6)$$

whose maximum for $a, b, c, b' \in [-1, +1]$ is 2, and not $1 - ac$.
Bell's implicit assumption of the single probability space is equivalent to the postulate that, for each $j = 1, \ldots, N$,

$$p_j^{(1)} = p_j^{(2)} \qquad (7)$$

Physically this means that the hidden parameter in the first experiment is the same as the hidden parameter in the second experiment. This is surely a very implausible assumption. Notice however that, without this assumption, Bell's argument cannot be carried through and we cannot deduce the inequality, because we must stop at equation (2).
13 The role of the single probability space in CHSH's proof
Clauser, Horne, Shimony, Holt [9] introduced the variant (26) of the Bell inequality for quadruples $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$, which is based on the following inequality among numbers $a, b, b', a' \in [-1, 1]$:

$$|ab + ab' + a'b - a'b'| \le 2 \qquad (1)$$

Section (1) already contains a proof of (1). A direct proof follows from

$$|b + b'| + |b - b'| \le 2 \qquad (2)$$

because

$$|ab + ab' + a'b - a'b'| = |a(b + b') + a'(b - b')| \le |a| \cdot |b + b'| + |a'| \cdot |b - b'| \le |b + b'| + |b - b'| \le 2$$

The proof of (2) is obvious.
Remark (1) Notice that an inequality of the form

$$|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| \le 2 \qquad (3)$$

would be obviously false. In fact, for example, the choice

$$a_1 = b_1 = a_2 = b_2 = a_3 = b_3 = b_4 = 1, \qquad a_4 = -1$$

would give

$$|a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4| = 4$$

That is, for the validity of (1) it is absolutely essential that the number $a$ is the same in the first and the second term, and similarly for $a'$ in the third and the fourth, $b'$ in the second and the fourth, $b$ in the first and the third.
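The contrast between inequality (1) and the false inequality (3) can be checked numerically. The following is a minimal sketch with hypothetical function names: it scans a grid of values in $[-1, 1]$ for the four-number expression and evaluates the eight-number expression at the choice given in Remark (1).

```python
from itertools import product


def chsh(a, b, a2, b2):
    """The combination ab + ab' + a'b - a'b' with shared numbers a, a', b, b'."""
    return a * b + a * b2 + a2 * b - a2 * b2


def unconstrained(a1, b1, a2, b2, a3, b3, a4, b4):
    """The combination a1 b1 + a2 b2 + a3 b3 - a4 b4 with eight free numbers."""
    return a1 * b1 + a2 * b2 + a3 * b3 - a4 * b4


# Grid of quarters in [-1, 1]; these values are exact in binary floating point.
grid = [i / 4 for i in range(-4, 5)]
max_shared = max(abs(chsh(a, b, a2, b2))
                 for a, b, a2, b2 in product(grid, repeat=4))

# Choice of Remark (1): all numbers equal to 1 except a4 = -1.
counterexample = unconstrained(1, 1, 1, 1, 1, 1, -1, 1)

if __name__ == "__main__":
    print("max |ab + ab' + a'b - a'b'| on the grid:", max_shared)
    print("value of a1 b1 + a2 b2 + a3 b3 - a4 b4  :", counterexample)
```

The grid scan is not a proof of (1), which is given above, but it shows the bound 2 being attained and never exceeded, while the unconstrained expression reaches 4.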
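This inequality among numbers can be extended to pairs of random variables by introducing the following postulates:
(P1) Instead of four numbers $a, b, b', a' \in [-1, 1]$, one considers four functions

$$S_a^{(1)}, \ S_b^{(2)}, \ S_{a'}^{(1)}, \ S_{b'}^{(2)}$$

all defined on the same space $\Lambda$ (whose points are called hidden parameters) and with values in $[-1, 1]$.
(P2) One postulates that there exists a probability measure $P$ on $\Lambda$ which defines the joint distribution of each of the following four pairs of functions:

$$(S_a^{(1)}, S_b^{(2)}), \quad (S_a^{(1)}, S_{b'}^{(2)}), \quad (S_{a'}^{(1)}, S_b^{(2)}), \quad (S_{a'}^{(1)}, S_{b'}^{(2)}) \qquad (4)$$

Remark (2) Notice that (P2) automatically implies that the joint distributions of the four pairs of functions can be deduced from a joint distribution of the whole quadruple, i.e. the existence of a single Kolmogorov model for these four pairs. With these premises, for each $\lambda \in \Lambda$ one can apply the inequality (1) to the four numbers

$$S_a^{(1)}(\lambda), \ S_b^{(2)}(\lambda), \ S_{a'}^{(1)}(\lambda), \ S_{b'}^{(2)}(\lambda)$$

and deduce that

$$|S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda)| \le 2 \qquad (5)$$

From this, taking $P$-averages, one obtains

$$|\langle S_a^{(1)} S_b^{(2)} \rangle + \langle S_a^{(1)} S_{b'}^{(2)} \rangle + \langle S_{a'}^{(1)} S_b^{(2)} \rangle - \langle S_{a'}^{(1)} S_{b'}^{(2)} \rangle| = \qquad (6)$$

$$\left| \int \left( S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right) dP(\lambda) \right| \le \qquad (7)$$

$$\le \int \left| S_a^{(1)}(\lambda) S_b^{(2)}(\lambda) + S_a^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) + S_{a'}^{(1)}(\lambda) S_b^{(2)}(\lambda) - S_{a'}^{(1)}(\lambda) S_{b'}^{(2)}(\lambda) \right| dP(\lambda) \le 2 \qquad (8)$$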
Remark (3) Notice that in the step from (6) to (7) we have used in an essential way the existence of a joint distribution for the whole quadruple, i.e. the fact that all these random variables can be realized in the same probability space. In EPR type experiments we are interested in the case in which the four pairs $(a, b)$, $(a, b')$, $(a', b)$, $(a', b')$ come from four mutually incompatible experiments. Let us assume that there is a hidden parameter determining the result of each of these experiments. This means that we interpret the number $S_a^{(1)}(\lambda)$ as the value of the spin of particle 1 in direction $a$, determined by the hidden parameter $\lambda$.
There is obviously no reason to postulate that the hidden parameter determining the result of the first experiment is exactly the same one which determines the result of the second experiment. However, when CHSH consider the quantity (5), they are implicitly making the much stronger assumption that the same hidden parameter $\lambda$ determines the results of all the four experiments. This assumption is quite unreasonable from the physical point of view, and in any case it is a much stronger assumption than simply postulating the existence of hidden parameters. The latter assumption would allow CHSH only to consider the expression

$$S_a^{(1)}(\lambda_1) S_b^{(2)}(\lambda_1) + S_a^{(1)}(\lambda_2) S_{b'}^{(2)}(\lambda_2) + S_{a'}^{(1)}(\lambda_3) S_b^{(2)}(\lambda_3) - S_{a'}^{(1)}(\lambda_4) S_{b'}^{(2)}(\lambda_4) \qquad (9)$$

and, as shown in Remark (1) above, the maximum of this expression is not 2 but 4, and this does not allow one to deduce the Bell inequality.
14 The role of the counterfactual argument in CHSH's proof
Contrary to Bell's original argument, the CHSH proof of the Bell inequality does not use the counterfactual argument explicitly. The fact that one can perform experiments on quadruples, rather than on triples as originally proposed by Bell, has led some authors to claim that the counterfactual argument is not essential in the deduction of the Bell inequality. However, we have just seen in section (7) that the same hidden assumption as in Bell's proof, i.e. the realizability of all the random variables involved in the same probability space, is also present in the CHSH argument. The following lemma shows that, under the singlet assumption, the conclusion of the counterfactual argument follows from the hidden assumption of Bell and of CHSH.
Lemma (1) If $f$ and $g$ are random variables defined on a probability space $(\Lambda, P)$ and with values in $[-1, 1]$, then

$$\langle fg \rangle := \int_\Lambda fg \, dP = -1$$

if and only if

$$P(fg = -1) = 1$$

Proof. If $P(fg > -1) > 0$, then

$$\int_\Lambda fg \, dP = -P(fg = -1) + \int_{fg > -1} fg \, dP > -P(fg = -1) - P(fg > -1) = -1$$

Corollary (2) Suppose that all the random variables involved are realized in the same probability space. Then, if the singlet condition

$$\langle S_x^{(1)} S_x^{(2)} \rangle = -1 \qquad (1)$$

is satisfied, the condition

$$S_x^{(1)} = -S_x^{(2)} \qquad (2)$$

(i.e. formula (13) in Bell's 1964 paper) is true almost everywhere.

Proof. Follows from Lemma (1) with the choice $f = S_x^{(1)}$, $g = S_x^{(2)}$.

Summing up: if you want to compare the predictions of a hidden variable theory with quantum theory in the EPR experiment (so that at least we admit the validity of the singlet law), then the hidden assumption of realizability of all the random variables in the same probability space (without which Bell's inequality cannot be proved) implies the same conclusion as the counterfactual argument. Stated otherwise: the counterfactual argument is implicit when you postulate the singlet condition and the realizability on a single probability space. It does not matter if you use triples or quadruples.
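The lemma can be illustrated on finite distributions. This is a hedged sketch: the two example distributions and the function names are illustrative assumptions. A distribution of the pair $(f, g)$ is represented as a dictionary from value pairs to probabilities, and exact fractions are used so that the strict inequality in the proof is visible without rounding.

```python
from fractions import Fraction as F


def expectation_fg(dist):
    """Expectation of f*g for a finite distribution {(f, g): probability}."""
    return sum(f * g * p for (f, g), p in dist.items())


def prob_fg_minus_one(dist):
    """Probability of the event {f*g = -1}."""
    return sum(p for (f, g), p in dist.items() if f * g == -1)


# All the mass sits on f*g = -1: perfect anticorrelation.
anticorrelated = {(1, -1): F(1, 2), (-1, 1): F(1, 2)}

# One percent of the mass sits on a point where f*g = -1/2 > -1.
leaky = {(1, -1): F(1, 2), (-1, 1): F(49, 100), (F(1, 2), -1): F(1, 100)}

if __name__ == "__main__":
    for name, dist in (("anticorrelated", anticorrelated), ("leaky", leaky)):
        print(name, "E(fg) =", expectation_fg(dist),
              " P(fg = -1) =", prob_fg_minus_one(dist))
```

As soon as any mass leaves the set where $fg = -1$, the expectation is strictly larger than $-1$, which is the content of the proof.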
15 Physical difference between the CHSH and the original Bell inequalities
In the CHSH scheme

$$(a, b), \quad (a', b), \quad (a, b'), \quad (a', b')$$

the agreement required of the experimenters is the following:
- 1 will measure the same observable in experiments I and III, and the same observable in experiments II and IV
- 2 will measure the same observable in experiments I and II, and the same observable in experiments III and IV
Here there is no a priori restriction on the choice of the observables to be measured.
In the Bell scheme the experimentalists agree that:
- 1 measures the same observable in experiments I and III
- 2 measures the same observable in experiments I and II
- 1 and 2 choose a priori, i.e. before the experiment begins, a direction $c$, and agree that 1 will measure spin in direction $c$ in experiment II and 2 will measure spin in direction $c$ in experiment III (strong agreement)
The strong agreement can be replaced by the following (weak agreement):
- 1 and 2 choose a priori, i.e. before the experiment begins, a finite set of directions $c_1, \ldots, c_K$, and agree that 1 will measure spin in a direction chosen randomly among the directions $c_1, \ldots, c_K$ in experiment II, and 2 will do the same in experiment III
In this scheme there is an a priori restriction on the choice of some of the observables to be measured.
If the number of directions fixed a priori in the plane is $K$, then the probability of a coincidence, corresponding to a totally random (equiprobable) choice, is

$$P(c^{(1)} = c^{(2)}) = \sum_{\alpha=1}^{K} P(c^{(1)} = c_\alpha \, ; \, c^{(2)} = c_\alpha) = \sum_{\alpha=1}^{K} \frac{1}{K^2} = \frac{1}{K}$$

where $c^{(1)}$ (resp. $c^{(2)}$) denotes the direction chosen by experimenter 1 (resp. 2). This shows that, contrary to the CHSH scheme, the choice has to be restricted to a finite number of possibilities, otherwise the probability of coincidence will be zero.
From this point of view we can claim that the Clauser, Horne, Shimony, Holt formulation of Bell's inequalities realizes a small improvement with respect to Bell's original formulation.
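The coincidence probability under the weak agreement is easy to verify by enumeration. A minimal sketch, assuming (as in the text) that the two experimenters choose independently and equiprobably among $K$ directions; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def coincidence_probability(k):
    """Probability that two independent uniform choices among k directions agree."""
    outcomes = list(product(range(k), repeat=2))  # all k*k equally likely pairs
    hits = sum(1 for c1, c2 in outcomes if c1 == c2)
    return Fraction(hits, len(outcomes))


if __name__ == "__main__":
    for k in (1, 2, 4, 10, 100):
        print(k, coincidence_probability(k))
```

The result is $1/K$ for every $K$, so the fraction of usable runs shrinks to zero as the set of admissible directions grows.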
16 Reproduction of the EPR correlations by the chameleon effect
Consider a classical dynamical system composed of two particles (1, 2). Let S denote the state space of each of the particles, and suppose that at time $t = t_0$ (initial time) the state $\mu_1^0$ of particle 1 and the state $\mu_2^0$ of particle 2 coincide:

$$\mu_1^0 = \mu_2^0 \qquad (1)$$
Starting from time $t_0$ the two particles begin to move in opposite directions, and after a time interval of length $T$ two independent and non-communicating experimenters simultaneously perform a measurement on each particle.
Experimenter 1 (resp. 2) can choose among three different measurements, corresponding to the observables

$$S_a^{(1)}, \ S_b^{(1)}, \ S_c^{(1)} \quad (\text{resp. } S_a^{(2)}, \ S_b^{(2)}, \ S_c^{(2)}) \qquad (2)$$

of particle 1 (resp. particle 2). We suppose that both particles satisfy the chameleon effect described by the following

DEFINITION (1) Let S be the state space of a dynamical system $\sigma$, let $I$ be a set, and for each $x \in I$ let be given a function

$$S_x : S \to \mathbb{R}, \qquad x \in I \qquad (3)$$

representing an observable of the system. The system $\sigma$ is said to realize the chameleon effect with respect to the observables (3) if, whenever the observable $S_x$ is measured, the dynamical evolution of the system

$$T_x^t : S \to S, \qquad t \in \mathbb{R} \qquad (4)$$

depends on the measured observable $S_x$.
In our case we consider only two instants of time, the initial one and the one when the measurement takes place, and we omit time from our notation. Moreover, in our case we have two particles, and each particle is far away from the other one, hence it can only feel the interaction with the measurement apparatus near to it. So, combining the locality principle with the chameleon effect, we conclude that if experimenter 1 (resp. 2) chooses to measure the observable $S_x^{(1)}$ (resp. $S_y^{(2)}$), then particle 1 (resp. 2) will evolve according to the dynamics

$$T_{1,x} \quad (\text{resp. } T_{2,y}) \qquad (5)$$

In our case the variables $x, y$ can be any element of the set $\{a, b, c\}$.
Suppose that experimenter 1 chooses to measure and experimenter
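Let $\mu_1$ (resp. $\mu_2$) denote the final state, i.e. the state at the time when the measurement occurs, of particle 1 (resp. 2). Condition (1) is then equivalent to

$$T_{1,a}^{-1} \mu_1 = T_{2,b}^{-1} \mu_2 \qquad (6)$$

The empirical correlations of the measurements will then be

$$\frac{1}{N} \sum S_a^{(1)}(\mu_1) \, S_b^{(2)}(\mu_2) \, \delta(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2) \qquad (7)$$

where $\delta(\cdot)$ is a $\delta$-like factor taking into account the fact that only the configurations satisfying condition (6) give a non-zero contribution to the correlations.
Now suppose that the state space S is the real line $\mathbb{R}$. Thus the empirical correlations (7) are

$$\langle S_a^{(1)} S_b^{(2)} \rangle = Z \int\!\!\int S_a^{(1)}(\mu_1) \, S_b^{(2)}(\mu_2) \, \delta(T_{1,a}^{-1} \mu_1 - T_{2,b}^{-1} \mu_2) \, d\mu_1 \, d\mu_2 \qquad (8)$$

where $Z$ is a normalization constant. With the change of variables

$$T_{1,a}^{-1} \mu_1 = \lambda_1, \qquad T_{2,b}^{-1} \mu_2 = \lambda_2 \qquad (9)$$

(8) becomes

$$Z \int\!\!\int S_a^{(1)}(T_{1,a} \lambda_1) \, S_b^{(2)}(T_{2,b} \lambda_2) \, \delta(\lambda_1 - \lambda_2) \, dT_{1,a}(\lambda_1) \, dT_{2,b}(\lambda_2) \qquad (10)$$

Now introduce the notation

$$S_x^{(j)}(T_{j,x} \lambda_j) =: \tilde{S}_x^{(j)}(\lambda_j), \qquad j = 1, 2, \quad x = a, b \qquad (11)$$

With this notation, supposing, as is always possible, that $T'_{1,a}(\lambda_1), T'_{2,b}(\lambda_2) > 0$, (10) becomes

$$Z \int\!\!\int \tilde{S}_a^{(1)}(\lambda_1) \, \tilde{S}_b^{(2)}(\lambda_2) \, \delta(\lambda_1 - \lambda_2) \, T'_{1,a}(\lambda_1) \, T'_{2,b}(\lambda_2) \, d\lambda_1 \, d\lambda_2 = Z \int \tilde{S}_a^{(1)}(\lambda) \, \tilde{S}_b^{(2)}(\lambda) \, T'_{1,a}(\lambda) \, T'_{2,b}(\lambda) \, d\lambda$$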
Now let us make the following choices
$$\lambda \in [0, 2\pi], \qquad \operatorname{supp} \tilde{S}_x^{(j)} \subseteq [0, 2\pi] \qquad (12)$$
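$$Z = \frac{1}{2\pi} \qquad (13)$$

$$T'_{2,b}(\lambda) = \sqrt{2\pi} \qquad (14)$$

$$T'_{1,a}(\lambda) = \frac{\sqrt{2\pi}}{4} \, |\cos(\lambda - a)| \qquad (15)$$

$$\tilde{S}_x^{(1)}(\lambda) = \operatorname{sgn}(\cos(\lambda - x)), \qquad \tilde{S}_x^{(2)} = -\tilde{S}_x^{(1)} \qquad (16)$$

With these choices the correlations (8) become

$$\langle S_a^{(1)} S_b^{(2)} \rangle = -\int_0^{2\pi} \operatorname{sgn}(\cos(\lambda - a)) \, \operatorname{sgn}(\cos(\lambda - b)) \, \frac{1}{4} |\cos(\lambda - a)| \, d\lambda \qquad (17)$$

$$= -\frac{1}{4} \int_0^{2\pi} \operatorname{sgn}(\cos(\lambda - b)) \, \cos(\lambda - a) \, d\lambda = -\cos(b - a) = -a \cdot b$$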
which are the EPR correlations
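The final integral can be checked numerically. The sketch below is only a sanity check under stated assumptions: it takes the integrand to be the product of the two sign functions, with $S^{(2)} = -S^{(1)}$, weighted by $\frac{1}{4}|\cos(\lambda - a)|$ on $[0, 2\pi]$, and it uses a plain midpoint rule with a hypothetical function name and step count. The directions are represented by planar angles $a$, $b$.

```python
import math


def sgn(x):
    """Sign function with the convention sgn(0) = +1."""
    return 1.0 if x >= 0 else -1.0


def chameleon_correlation(a, b, steps=100_000):
    """Midpoint-rule value of the weighted integral over [0, 2*pi]."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        s1 = sgn(math.cos(lam - a))            # observable of particle 1
        s2 = -sgn(math.cos(lam - b))           # observable of particle 2
        weight = 0.25 * abs(math.cos(lam - a))  # local factor depending on a only
        total += s1 * s2 * weight
    return total * h


if __name__ == "__main__":
    for a, b in ((0.0, 0.0), (0.0, math.pi / 3), (0.4, 2.0)):
        print(f"a={a:.2f} b={b:.2f} "
              f"integral={chameleon_correlation(a, b):+.4f} "
              f"-cos(b-a)={-math.cos(b - a):+.4f}")
```

The agreement with $-\cos(b - a)$ holds for any pair of angles, because the product $\operatorname{sgn}(\cos(\lambda - a)) |\cos(\lambda - a)|$ reduces to $\cos(\lambda - a)$ and the remaining integral of $\operatorname{sgn}(\cos(\lambda - b)) \cos(\lambda - a)$ over a full period equals $4 \cos(b - a)$.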
References
1. L. Accardi, Phys. Rep. 77, 169-192 (1981).
2. L. Accardi, Urne e camaleonti. Dialogo sulla realta, le leggi del caso e la teoria quantistica (Il Saggiatore, 1997). Japanese translation, Maruzen (2000); Russian translation, ed. by Igor Volovich (PHASIS Publishing House, 2000); English translation by Daniele Tartaglia, to appear.
3. L. Accardi, On the EPR paradox and the Bell inequality, Volterra Preprint N. 350 (1998).
4. L. Accardi, M. Regoli, Quantum probability and the interpretation of quantum mechanics: a crucial experiment. Invited talk at the workshop "The applications of mathematics to the sciences of nature: critical moments and aspects", Arcidosso, June 28-July 1 (1999). To appear in the proceedings of the workshop. Preprint Volterra N. 399 (1999).
5. L. Accardi, M. Regoli, Local realistic violation of Bell's inequality: an experiment. Conference given by the first-named author at the Dipartimento di Fisica, Universita di Pavia, on 24-02-2000. Preprint Volterra N. 402.
6. L. Accardi, M. Regoli, Non-locality and quantum theory: new experimental evidence. Invited talk given by the first-named author at the Conference "Quantum paradoxes", University of Nottingham, on 4-05-2000. Preprint Volterra N. 421.
7. J. S. Bell, Physics 1, 3, 195-200 (1964).
8. J. S. Bell, Rev. Mod. Phys. 38, 447-452 (1966).
9. J. F. Clauser, M. A. Horne, A. Shimony, R. A. Holt, Phys. Rev. Letters 49, 1804-1806 (1969); J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge Univ. Press, 1987).
10. J. F. Clauser, M. A. Horne, Phys. Rev. D 10, 2 (1974).
11. A. Einstein, B. Podolsky, N. Rosen, Phys. Rev. 47, 777-780 (1935).
12. A. Einstein, in Albert Einstein: Philosopher Scientist, edited by P. A. Schilpp, Library of Living Philosophers (Evanston, Illinois, 1949).
Refutation of Bell's Theorem
Guillaume ADENIER, Louis Pasteur University, Strasbourg, France
E-mail guillaumeadenierulpu-strasbgfr
Bell's Theorem was developed on the basis of considerations involving a linear combination of spin correlation functions, each of which has a distinct pair of arguments. The simultaneous presence of these different pairs of arguments in the same equation can be understood in two radically different ways: either as strongly objective, that is, all correlation functions pertain to the same set of particle pairs, or as weakly objective, that is, each correlation function pertains to a different set of particle pairs. It is demonstrated that once this meaning is determined, no discrepancy appears between local realistic theories and quantum mechanics: the discrepancy in Bell's Theorem is due only to a meaningless comparison between a local realistic inequality written within the strongly objective interpretation (thus relevant to a single set of particle pairs) and a quantum mechanical prediction derived from a weakly objective interpretation (thus relevant to several different sets of particle pairs).
1 Introduction
Bell's Theorem$^1$ exhibits a peculiar discrepancy between any local realistic theory and Quantum Mechanics, which leads to empirically distinguishable alternatives. The quandary is that neither local realistic conceptions nor Quantum Mechanics are easy to abandon. Indeed, classical physics and common sense are usually based upon the former, while the latter is rightly presented as the most successful theory of all times. Several experiments have been done: all but a few$^2$ show violations of Bell inequalities$^3$. Yet the ideas brought forth by Bell's Theorem are so disconcerting that there is still incredulity, not to mention antipathy, evoked by the verdict. The purpose of this article is to provide a refutation of this theorem within a strictly quantum theoretical framework, without the use of outside assumptions.
2 The E P R B gedanken experiment
21 Spin observables and singlet state
Bells theorem is usually based on a didactic reformulation of the EPR (Einshystein Podolsky and Rosen4) gedanken experiment due to D Bohm5 In this EPRB gedanken experiment a pair of spin-| particles with total spin zero is produced such that each particle moves away from the source in opposite directions along the y-axis Two Stern-Gerlach devices are placed at opposite
30
points (left and right) on the y-axis and are oriented respectively along the directions u and v The Hilbert space associated with the entire EPRB system is H = 7ih lt8gtHR where T^L and HR are the Hilbert spaces associated with each Stern-Gerlach device respectively The spin observable has two counterparts in this new product space H as
CTL-U = ltr-u(ggtIR (1)
ltTR bull v = IL reg a bull v (2)
where I I and IR are the identity operators of ~Hh and R Contrary to the observables a bull u and a bull v which are mutually non commuting when u ^ v these new observables ox bull u and OR bull v do commute reflecting the fact that the Stern-Gerlach devices are arbitrarily far from each other and are thus measuring distinct subsystems The product of these two observables is therefore also an observable and can be understood as a spin correlation observable corresponding to the joint spin measurement of both Stern-Gerlach devices Its eigenvectors are |poundLU) ltggt | pound R V ) with corresponding eigenvalues poundL-poundRgt where each e is either +1 or mdash1
In an EPRB gedanken experiment, the source produces particle pairs with zero total spin, represented by the singlet state
$$|\psi\rangle = \frac{1}{\sqrt{2}} \left[\, |+, \mathbf{n}\rangle \otimes |-, \mathbf{n}\rangle - |-, \mathbf{n}\rangle \otimes |+, \mathbf{n}\rangle \,\right], \qquad (3)$$
where $\mathbf{n}$ is an arbitrary unit vector, which can usually be omitted since the singlet state is invariant under rotation [6].
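2.2 Statistical properties and hidden-variables
The expectation value of a spin observable for the singlet state $|\psi\rangle$ is zero,
$$\langle\psi|\, \boldsymbol{\sigma} \cdot \mathbf{u} \otimes I_R \,|\psi\rangle = 0, \qquad \langle\psi|\, I_L \otimes \boldsymbol{\sigma} \cdot \mathbf{v} \,|\psi\rangle = 0, \qquad (4)$$
whatever $\mathbf{u}$ and $\mathbf{v}$, as follows from the rotational invariance of the singlet state. Likewise, the expectation value of the spin correlation observable [6,7] is
$$E^\psi(\mathbf{u}, \mathbf{v}) = \langle\psi|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi\rangle \qquad (5)$$
$$\phantom{E^\psi(\mathbf{u}, \mathbf{v})} = -\mathbf{u} \cdot \mathbf{v}, \qquad (6)$$
which depends only on the relative angle between $\mathbf{u}$ and $\mathbf{v}$.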
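Eqs. (4) to (6) can be checked numerically. The following Python sketch (using NumPy) builds the singlet state in the z basis and evaluates the three expectation values for one arbitrary pair of directions. The helper names `spin` and `expect`, and the particular directions chosen, are illustrative assumptions and not part of the original text.

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def spin(n):
    """Return sigma . n for a 3-vector n (normalised internally)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * SX + n[1] * SY + n[2] * SZ


# Singlet state of Eq. (3), written with n along z.
UP = np.array([1, 0], dtype=complex)
DOWN = np.array([0, 1], dtype=complex)
SINGLET = (np.kron(UP, DOWN) - np.kron(DOWN, UP)) / np.sqrt(2)


def expect(op, state):
    """Expectation value <state|op|state> of a Hermitian operator."""
    return np.vdot(state, op @ state).real


# Two arbitrary unit vectors (illustrative choice).
u = np.array([1.0, 0.0, 0.0])
v = np.array([np.cos(0.7), 0.0, np.sin(0.7)])

left_mean = expect(np.kron(spin(u), I2), SINGLET)         # Eq. (4), left device
right_mean = expect(np.kron(I2, spin(v)), SINGLET)        # Eq. (4), right device
correlation = expect(np.kron(spin(u), spin(v)), SINGLET)  # Eq. (5)

print(left_mean, right_mean, correlation, -np.dot(u, v))
```

The product $(\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v})$ reduces to a single Kronecker product because each factor acts as the identity on the other subsystem.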
In a local realistic hidden-variables model, a single particle pair is supposed to be entirely characterised by a set of hidden variables, symbolically represented by a parameter $\lambda$, so that the measurement result on the left along $\mathbf{u}$ can be written as $A(\mathbf{u}, \lambda)$ and the result on the right along $\mathbf{v}$ as $B(\mathbf{v}, \lambda)$. Although the hidden-variables model is supposed to be fully deterministic, it must also be capable of reproducing the stochastic nature of the EPRB gedanken experiment expressed in Eqs. (4) and (6). For that purpose, the complete state specification $\lambda_i$ of any particle pair with label $i$ must be a random variable [1,8]: it is supposed to be drawn randomly according to a probability distribution $\rho$.
Consider a set of $N$ particle pairs $\{i = 1, \ldots, N\}$; the mean value of joint spin measurements for this set is
$$M^\rho(\mathbf{u}, \mathbf{v}) = \frac{1}{N} \sum_{i=1}^{N} A(\mathbf{u}, \lambda_i)\, B(\mathbf{v}, \lambda_i). \qquad (7)$$
3 The CHSH function
In order to establish Bell's Theorem, a linear combination of correlation functions $c(\mathbf{a}, \mathbf{b})$ with different arguments [9] is considered, once when these correlation functions are expectation values $E^\psi(\mathbf{u}, \mathbf{v})$ given by Quantum Mechanics, i.e. Eq. (6), and once when they are mean values $M^\rho(\mathbf{u}, \mathbf{v})$ given by local hidden-variables theories, Eq. (7); the results are then compared. A well known choice of such a linear combination is the CHSH (Clauser, Horne, Shimony and Holt [10]) function, written with four pairs of arguments:
$$S = \left| c(\mathbf{a}, \mathbf{b}) - c(\mathbf{a}, \mathbf{b}') + c(\mathbf{a}', \mathbf{b}) + c(\mathbf{a}', \mathbf{b}') \right|. \qquad (8)$$
The exact meaning of the simultaneous presence of these different arguments in a CHSH function must be clarified. Basically, there are two possible interpretations: the strongly objective interpretation and the weakly objective interpretation [11,12].
Strongly Objective Interpretation: all correlation functions are relevant to the same set of $N$ particle pairs. As such, they cannot be relevant to actual experiments, but rather to the results that would have been obtained if the same set of $N$ particle pairs had been measured along different directions.
Weakly Objective Interpretation: each correlation function is actually to be measured on a distinct set of $N$ particle pairs; that is, for each pair only one joint spin measurement is to be executed.
The CHSH function was actually developed specifically for experimental convenience [10], and many experiments have been done (the most famous being Aspect's [13]), obviously invoking the natural interpretation, namely the weakly objective one. Nevertheless, the strongly objective interpretation must also be considered, since it remains a possible interpretation a priori, and since the choice between strong and weak objectivity is not made at all explicit in many papers, including Bell's.
It must be stressed that these interpretations are radically different, not only epistemologically but also physically. Indeed, the strongly objective interpretation pertains to a single set of $N$ particle pairs, characterised by the corresponding set of parameters $\{\lambda_i;\ i = 1, \ldots, N\}$, whereas the weakly objective interpretation pertains to no less than 4 sets of $N$ particle pairs. The fact is that a finite set of $N$ particle pairs characterised by $\{\lambda_i\}$ cannot be identically reproduced, either theoretically (for each complete state $\lambda_i$ of any particle pair $i$ is a random variable, as defined in Section 2.2) or empirically (for the experimenter has no control over the complete state of a particle pair in a singlet state). Hence, in the weakly objective interpretation, these four sets are necessarily four different sets of particle pairs [7,14], respectively characterised by four different sets of hidden-variables parameters $\{\lambda_{1,i}\}$, $\{\lambda_{2,i}\}$, $\{\lambda_{3,i}\}$ and $\{\lambda_{4,i}\}$.
The difference between the two interpretations can therefore be embodied in the number of degrees of freedom of the whole system. Let $f$ be the number of degrees of freedom of a single particle pair. In the strongly objective interpretation, the number of degrees of freedom of the whole CHSH system is then $Nf$, whereas in the weakly objective interpretation it is 4 times as large, that is $4Nf$. Thus, before initiating Bell's analysis, one has to choose explicitly one interpretation and stick to it.
4 Strongly objective interpretation
4.1 Local realistic inequality within strongly objective interpretation
The local realistic formulation of the CHSH function within strong objectivity is written
$$S^\rho_{\mathrm{strong}} = \left| M^\rho(\mathbf{a}, \mathbf{b}) - M^\rho(\mathbf{a}, \mathbf{b}') + M^\rho(\mathbf{a}', \mathbf{b}) + M^\rho(\mathbf{a}', \mathbf{b}') \right|, \qquad (9)$$
which (using Eq. (7)) becomes, after factorisation, a summation where each term can take only two values [2,7]:
$$A(\mathbf{a}, \lambda_i) \left[ B(\mathbf{b}, \lambda_i) - B(\mathbf{b}', \lambda_i) \right] + A(\mathbf{a}', \lambda_i) \left[ B(\mathbf{b}, \lambda_i) + B(\mathbf{b}', \lambda_i) \right] = \pm 2, \qquad (10)$$
so that the most restrictive local realistic inequality within the strongly objective interpretation is
$$S^\rho_{\mathrm{strong}} \le 2. \qquad (11)$$
This is the well known generalised formulation of Bell's inequality due to CHSH [10]. It must be stressed once more, however, that this inequality has been established only within the strongly objective interpretation, which means that each mean value is relevant to the same set of $N$ particle pairs. Hence this result cannot be compared directly with results from real experimental tests, where in fact mean values from four distinct sets of $N$ particle pairs are measured.
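The arithmetic behind Eq. (10) can be verified by brute force. The sketch below enumerates every assignment of $\pm 1$ to the four outcomes that a single pair carries in the strongly objective reading; the function name `strong_term` is a hypothetical label introduced only for this illustration.

```python
from itertools import product


def strong_term(A_a, A_ap, B_b, B_bp):
    """Single-pair term of Eq. (10): all four outcomes share the same lambda_i."""
    return A_a * (B_b - B_bp) + A_ap * (B_b + B_bp)


# Enumerate the 2**4 possible outcome assignments for one particle pair.
strong_values = {strong_term(*outcomes) for outcomes in product((+1, -1), repeat=4)}

print(sorted(strong_values))
```

Since every term equals $+2$ or $-2$, the mean over $N$ pairs lies in $[-2, 2]$, which is inequality (11).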
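4.2 Quantum mechanical prediction within strongly objective interpretation
The quantum prediction for the CHSH function within the strongly objective interpretation is written
$$S^\psi_{\mathrm{strong}} = \left| E^\psi(\mathbf{a}, \mathbf{b}) - E^\psi(\mathbf{a}, \mathbf{b}') + E^\psi(\mathbf{a}', \mathbf{b}) + E^\psi(\mathbf{a}', \mathbf{b}') \right|. \qquad (12)$$
This equation is usually evaluated directly by replacing each expectation value by the scalar product result of Eq. (6). This, unfortunately, is all too hasty.
Indeed, in order to better understand the quantum mechanical meaning of Eq. (12), it is advantageous to take a step backward using Eq. (5):
$$S^\psi_{\mathrm{strong}} = \big| \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a})(\boldsymbol{\sigma}_R \cdot \mathbf{b})|\psi\rangle - \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a})(\boldsymbol{\sigma}_R \cdot \mathbf{b}')|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a}')(\boldsymbol{\sigma}_R \cdot \mathbf{b})|\psi\rangle + \langle\psi|(\boldsymbol{\sigma}_L \cdot \mathbf{a}')(\boldsymbol{\sigma}_R \cdot \mathbf{b}')|\psi\rangle \big|. \qquad (13)$$
The four spin correlation observables in this equation are non-commuting observables (this can be shown by calculating the commutator of $(\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v})$ and $(\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}')$ with $\mathbf{v} \neq \mathbf{v}'$), so that the meaning of their combination must be questioned.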
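The commutator mentioned above is easy to evaluate numerically. The following Python sketch (NumPy) does so for one illustrative choice of directions, $\mathbf{u}$ along z, $\mathbf{v}$ along x and $\mathbf{v}'$ along z, and also confirms that the left and right spin observables of Eqs. (1) and (2) commute. The helper names are assumptions made for this example only.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def spin(n):
    """Return sigma . n for a 3-vector n (normalised internally)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * SX + n[1] * SY + n[2] * SZ


def correlation_op(u, v):
    """Spin correlation observable (sigma_L . u)(sigma_R . v)."""
    return np.kron(spin(u), spin(v))


def commutator(X, Y):
    return X @ Y - Y @ X


u = [0.0, 0.0, 1.0]
v = [1.0, 0.0, 0.0]
v_prime = [0.0, 0.0, 1.0]

# Two correlation observables sharing u but with v != v'.
corr_commutator = commutator(correlation_op(u, v), correlation_op(u, v_prime))

# Left and right single-device observables, Eqs. (1) and (2).
local_commutator = commutator(np.kron(spin(u), I2), np.kron(I2, spin(v)))

print(np.linalg.norm(corr_commutator), np.linalg.norm(local_commutator))
```

For this choice the commutator reduces to $I \otimes [\sigma_x, \sigma_z] = I \otimes (-2i\sigma_y)$, which is non-zero, while the single-device observables commute exactly.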
According to von Neumann [15], any linear combination of expectation values of different observables $R, S, \ldots$ is meaningful in quantum mechanics,
$$\langle R + S + \cdots \rangle_\psi = \langle R \rangle_\psi + \langle S \rangle_\psi + \cdots, \qquad (14)$$
even if $R, S, \ldots$ are non-commuting observables. However, as was stressed by d'Espagnat [11,16], quantum mechanics is only a weakly objective theory, and expectation values given by quantum mechanics are also weakly objective statements, that is to say statements relevant to observations. Hence, when $R, S, \ldots$ are non-commuting observables, the expectation values cannot be simultaneously relevant to the same set of $N$ systems: each expectation value is necessarily relevant to a distinct set of $N$ systems. Therefore the only possible meaning of Eq. (13) is weakly objective, not strongly objective as desired. Of course, this does not imply that Quantum Mechanics cannot provide any meaning at all for the CHSH function; it implies only that this meaning cannot be strongly objective.
Since the local realistic inequality $S^\rho_{\mathrm{strong}} \le 2$ cannot be compared with any strongly objective prediction given by Quantum Mechanics, Bell's Theorem cannot be verified with a strongly objective interpretation given to the CHSH function. Hence there is no choice but to rely on the weakly objective interpretation in order to compare hidden-variables theories and Quantum Mechanics.
5 Weakly objective interpretation
5.1 Quantum mechanical prediction within weakly objective interpretation
It was shown in Section 3 that strong objectivity and weak objectivity pertain to different physical systems. This difference should therefore appear in the relevant equations. Indeed, the correlation expressed in Eq. (6) is relevant to spin measurements performed on particles that once constituted a single parent particle. Yet two particles issued from two distinct parents have never interacted with each other, so that spin measurements performed on such particle pairs cannot be correlated. Hence, if left and right spin measurements are performed on two distinct sets of $N$ particle pairs instead of the same set, there should be no correlation, and this property should appear in a generalised spin correlation function (i.e. generalised to the case of spin measurements performed on different sets of particle pairs).
This can easily be done within a quantum theoretical framework by means of a distinct EPRB space for each set of $N$ particle pairs. Let $\mathcal{H}_j$ be the EPRB Hilbert space associated with the $j$th set of particle pairs. In this Hilbert space, the EPRB gedanken experiment is represented by the singlet state $|\psi_j\rangle$ (see Section 2):
$$|\psi_j\rangle = \frac{1}{\sqrt{2}} \left[\, |+\rangle_j \otimes |-\rangle_j - |-\rangle_j \otimes |+\rangle_j \,\right]. \qquad (15)$$
The whole CHSH experiment with the four sets of particle pairs can then be expressed in terms of a new tensor product space $\mathcal{H}_{1234} = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3 \otimes \mathcal{H}_4$, in which the state vector is
$$|\psi_{1234}\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes |\psi_3\rangle \otimes |\psi_4\rangle. \qquad (16)$$
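The counterparts of observables in $\mathcal{H}_{1234}$ are obtained as in Section 2.1. For instance, the observable pertaining to the right Stern-Gerlach device for the 2nd set of particle pairs is
$$\boldsymbol{\sigma}_{2R} \cdot \mathbf{u} = I_1 \otimes (\boldsymbol{\sigma}_R \cdot \mathbf{u}) \otimes I_3 \otimes I_4, \qquad (17)$$
where $I_j$ is the identity operator of the EPRB space $\mathcal{H}_j$. Hence the expectation value of the product of two spin observables, the first belonging to the $k$th set and the second to the $l$th set, is
$$E^\psi_{kl}(\mathbf{u}, \mathbf{v}) = \langle\psi_{1234}|\, (\boldsymbol{\sigma}_{kL} \cdot \mathbf{u})(\boldsymbol{\sigma}_{lR} \cdot \mathbf{v}) \,|\psi_{1234}\rangle, \qquad (18)$$
and this is the generalised expectation value of spin correlation observables that was sought. The expectation value for measurements performed on the same set ($k = l$) of particle pairs is already known, Eq. (6), and $E^\psi_{kk}(\mathbf{u}, \mathbf{v})$ should provide the same result. Indeed, using Eqs. (16) and (17) leads to
$$E^\psi_{kk}(\mathbf{u}, \mathbf{v}) = \langle\psi_k|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u})(\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi_k\rangle = -\mathbf{u} \cdot \mathbf{v}, \qquad (19)$$
but when $k \neq l$ the result is quite different:
$$E^\psi_{kl}(\mathbf{u}, \mathbf{v}) = \langle\psi_k|\, (\boldsymbol{\sigma}_L \cdot \mathbf{u}) \,|\psi_k\rangle \, \langle\psi_l|\, (\boldsymbol{\sigma}_R \cdot \mathbf{v}) \,|\psi_l\rangle = 0, \qquad (20)$$
in accord with Eq. (4). There are indeed no correlations between two sets of particle pairs, as stipulated at the beginning of this section.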
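Eqs. (18) to (20) can be illustrated with a small numerical sketch. To keep the matrices small, the Python code below assumes only two sets of particle pairs (a 16-dimensional space) instead of the four used in the text; the construction for four sets is identical, with more identity factors. The names `embed` and `E` are illustrative.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)


def spin(n):
    """Return sigma . n for a 3-vector n (normalised internally)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return n[0] * SX + n[1] * SY + n[2] * SZ


UP = np.array([1, 0], dtype=complex)
DOWN = np.array([0, 1], dtype=complex)
SINGLET = (np.kron(UP, DOWN) - np.kron(DOWN, UP)) / np.sqrt(2)

# Product state |psi_1> (x) |psi_2>, the two-set analogue of Eq. (16).
STATE = np.kron(SINGLET, SINGLET)


def embed(op, k):
    """Lift a 4x4 operator of one EPRB space to the two-set space (k = 1 or 2)."""
    return np.kron(op, I4) if k == 1 else np.kron(I4, op)


def E(k, l, u, v):
    """Generalised expectation value of Eq. (18) for sets k and l."""
    left = embed(np.kron(spin(u), I2), k)    # sigma_{kL} . u
    right = embed(np.kron(I2, spin(v)), l)   # sigma_{lR} . v
    return np.vdot(STATE, left @ right @ STATE).real


u = np.array([0.0, 0.0, 1.0])
v = np.array([np.sin(1.1), 0.0, np.cos(1.1)])

same_set = E(1, 1, u, v)    # Eq. (19): expected to equal -u.v
cross_set = E(1, 2, u, v)   # Eq. (20): expected to vanish

print(same_set, -np.dot(u, v), cross_set)
```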
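Now, contrary to what was done in Section 4.2, it is possible to proceed here in full accord with the quantum mechanical postulates, because the spin correlation observables built from operators such as the one given in Eq. (17) are mutually commuting, so that a linear combination of these commuting observables is an observable as well. The CHSH experiment can therefore be described by a new observable
$$\hat{S}_{\mathrm{weak}} = (\boldsymbol{\sigma}_{1L} \cdot \mathbf{a})(\boldsymbol{\sigma}_{1R} \cdot \mathbf{b}) - (\boldsymbol{\sigma}_{2L} \cdot \mathbf{a})(\boldsymbol{\sigma}_{2R} \cdot \mathbf{b}') + (\boldsymbol{\sigma}_{3L} \cdot \mathbf{a}')(\boldsymbol{\sigma}_{3R} \cdot \mathbf{b}) + (\boldsymbol{\sigma}_{4L} \cdot \mathbf{a}')(\boldsymbol{\sigma}_{4R} \cdot \mathbf{b}'), \qquad (21)$$
and the quantum prediction for the CHSH function within a weakly objective interpretation is therefore obtained by calculating the expectation value of the observable $\hat{S}_{\mathrm{weak}}$ when the system is in the quantum state $|\psi_{1234}\rangle$,
$$S^\psi_{\mathrm{weak}} = \left| \langle\psi_{1234}|\, \hat{S}_{\mathrm{weak}} \,|\psi_{1234}\rangle \right|, \qquad (22)$$
which, using Eqs. (17), (18) and (19), is
$$S^\psi_{\mathrm{weak}} = \left| E^\psi_{11}(\mathbf{a}, \mathbf{b}) - E^\psi_{22}(\mathbf{a}, \mathbf{b}') + E^\psi_{33}(\mathbf{a}', \mathbf{b}) + E^\psi_{44}(\mathbf{a}', \mathbf{b}') \right|. \qquad (23)$$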
This equation is not ambiguous (as Eq. (12) was): it is a linear combination of expectation values, each relevant to a distinct set of $N$ particle pairs. This equation is therefore weakly objective, as requested.
Finally, using Eq. (19) yields
$$S^\psi_{\mathrm{weak}} = \left| \mathbf{a} \cdot \mathbf{b} - \mathbf{a} \cdot \mathbf{b}' + \mathbf{a}' \cdot \mathbf{b} + \mathbf{a}' \cdot \mathbf{b}' \right|, \qquad (24)$$
with a well known maximum equal to
$$\max\left(S^\psi_{\mathrm{weak}}\right) = 2\sqrt{2}. \qquad (25)$$
This numerical result is indeed the one given in the literature, the only difference here being that the meaning of this result is unambiguously weakly objective. Quantum Mechanics, which is a weakly objective theory [11], provides a clear answer to the CHSH function understood as a weakly objective question.
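The maximum in Eq. (25) can be checked numerically from Eq. (24). The sketch below assumes coplanar unit vectors, so that each scalar product is the cosine of an angle difference, evaluates a standard optimal configuration, and then scans a coarse grid of angles. The function name `chsh_weak` and the grid resolution are illustrative choices.

```python
import numpy as np


def chsh_weak(a, a_prime, b, b_prime):
    """Eq. (24) for coplanar unit vectors given by their polar angles (radians)."""
    dot = lambda x, y: np.cos(x - y)  # scalar product of two coplanar unit vectors
    return np.abs(dot(a, b) - dot(a, b_prime) + dot(a_prime, b) + dot(a_prime, b_prime))


# A standard optimal choice of directions.
s_opt = chsh_weak(0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4)

# Coarse grid search with a fixed at angle 0 (allowed by rotational invariance).
grid = np.linspace(0.0, 2 * np.pi, 48, endpoint=False)
AP, B, BP = np.meshgrid(grid, grid, grid, indexing="ij")
grid_max = chsh_weak(0.0, AP, B, BP).max()

print(s_opt, grid_max, 2 * np.sqrt(2))
```

The grid contains the optimal configuration, so the scan reproduces $2\sqrt{2}$ up to rounding and finds no larger value.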
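5.2 Local realistic inequality within weakly objective interpretation
The last step consists in comparing the quantum prediction $S^\psi_{\mathrm{weak}}$ with its local realistic counterpart $S^\rho_{\mathrm{weak}}$. As was stressed in Section 3, the $j$th set of particle pairs must be characterised by a distinct set of hidden-variables parameters $\{\lambda_{j,i};\ i = 1, \ldots, N\}$. Hence, to the generalised expectation value of the spin correlation observable, Eq. (18), corresponds the generalised mean value of joint spin measurements
$$M^\rho_{kl}(\mathbf{u}, \mathbf{v}) = \frac{1}{N} \sum_{i=1}^{N} A(\mathbf{u}, \lambda_{k,i})\, B(\mathbf{v}, \lambda_{l,i}), \qquad (26)$$
which is a priori capable of reproducing not only the $k = l$ prediction, Eq. (19), but also the $k \neq l$ prediction, Eq. (20). The local realistic CHSH function with a weakly objective interpretation is therefore
$$S^\rho_{\mathrm{weak}} = \left| M^\rho_{11}(\mathbf{a}, \mathbf{b}) - M^\rho_{22}(\mathbf{a}, \mathbf{b}') + M^\rho_{33}(\mathbf{a}', \mathbf{b}) + M^\rho_{44}(\mathbf{a}', \mathbf{b}') \right|, \qquad (27)$$
and that is, explicitly,
$$S^\rho_{\mathrm{weak}} = \left| \frac{1}{N} \sum_{i=1}^{N} \left[ A(\mathbf{a}, \lambda_{1,i}) B(\mathbf{b}, \lambda_{1,i}) - A(\mathbf{a}, \lambda_{2,i}) B(\mathbf{b}', \lambda_{2,i}) + A(\mathbf{a}', \lambda_{3,i}) B(\mathbf{b}, \lambda_{3,i}) + A(\mathbf{a}', \lambda_{4,i}) B(\mathbf{b}', \lambda_{4,i}) \right] \right|. \qquad (28)$$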
This expression is to be compared with the one pertaining to the strongly objective interpretation (Section 4.1), which contained terms that could be factored. Here, since each term is different from the others, no factorisation is possible, i.e. there is no way to derive a Bell inequality [7] (this is not the first time this fact has been noticed; unfortunately, no conclusion was drawn then). Yet this fact cannot be ignored, for it has been shown in Section 4 that Bell's Theorem cannot be demonstrated within a strongly objective interpretation.
Here, the only local realistic inequality that can be derived is obtained by considering, as was done with Eq. (10), the possible numerical values of each term of the summation in Eq. (28), for which the extrema are $+4$ and $-4$, so that the narrowest local realistic inequality that can be derived from Eq. (28) is nothing but
$$S^\rho_{\mathrm{weak}} \le 4. \qquad (29)$$
This most restrictive local realistic inequality (which can also be found in Accardi [17]) is not incompatible with the quantum mechanical prediction, as the maximum of $S^\psi_{\mathrm{weak}}$ is $2\sqrt{2}$. This shows that experiments intended to test Bell's Theorem were unfortunately not testing the strongly objective inequality, Eq. (11), which is a Bell inequality, but this weakly objective one, Eq. (29), since all experimental tests are necessarily executed in a weakly objective way, due to the irreducible incompatibility between spin measurements. As was stressed by Sica [18] and Accardi [17], a local realistic inequality is nothing but an arithmetic identity, and inequality (29) is definitely too lax to be violated by experimental tests.
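The arithmetic bound of Eq. (29) can be checked in the same brute-force way as Eq. (10). The sketch below treats the eight outcomes appearing in one summand of Eq. (28) as independent values $\pm 1$, which is the assumption made in the text when the four sets carry unrelated hidden variables. It only verifies the stated extrema and does not model any experiment; the function name `weak_term` is hypothetical.

```python
from itertools import product


def weak_term(outcomes):
    """One summand of Eq. (28); each product uses its own, independent pair."""
    A1, B1, A2, B2, A3, B3, A4, B4 = outcomes
    return A1 * B1 - A2 * B2 + A3 * B3 + A4 * B4


# Enumerate the 2**8 assignments of +1/-1 to the eight independent outcomes.
weak_values = {weak_term(o) for o in product((+1, -1), repeat=8)}

print(sorted(weak_values))
```

Without the shared $\lambda_i$ that allowed the factorisation in Eq. (10), the summand ranges over all even integers between $-4$ and $+4$.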
6 Conclusion
It was shown that Bell's Theorem cannot be derived either within a strongly objective interpretation of the CHSH function, because Quantum Mechanics gives no strongly objective results for the CHSH function (see Section 4.2), or within a weakly objective interpretation, because the only derivable local realistic inequality is never violated, either by Quantum Mechanics or by experiments (see Section 5.2). It was demonstrated that the discrepancy in Bell's Theorem is due only to a meaningless comparison between $S^\rho_{\mathrm{strong}} \le 2$ and $S^\psi_{\mathrm{weak}} = 2\sqrt{2}$, where the former is relevant to a system with $Nf$ degrees of freedom and the latter to one with $4Nf$ (see Section 3). The only meaningful comparison is between the weakly objective local realistic inequality $S^\rho_{\mathrm{weak}} \le 4$ and the weakly objective quantum prediction $S^\psi_{\mathrm{weak}} = 2\sqrt{2}$, but these results are not incompatible. Bell's Theorem, therefore, is refuted.
References
1. J. S. Bell, Physics 1, 195 (1964).
2. F. Selleri, Le grand débat de la mécanique quantique (Champs, Flammarion, Paris, 1986).
3. A. Aspect, Nature 398, 189 (1999).
4. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
5. D. Bohm, Phys. Rev. 85, 166 (1952).
6. D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Am. J. Phys. 58, 1131 (1990).
7. A. Bohm, Quantum Mechanics: Foundations and Applications (Springer-Verlag, New York, 1979).
8. J. S. Bell, in Proceedings of the International School of Physics Enrico Fermi, course IL: Foundations of Quantum Mechanics (Academic, New York, 1971), p. 171.
9. J. S. Bell, Epistemological Letters, p. 2 (July 1975).
10. J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23, 880 (1969).
11. B. d'Espagnat, Veiled Reality: An Analysis of Present-Day Quantum Mechanical Concepts (Addison-Wesley, 1995).
12. B. d'Espagnat, arXiv:quant-ph/9802046.
13. A. Aspect, J. Dalibard and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
14. A. Khrennikov, arXiv:quant-ph/0006017.
15. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1955).
16. B. d'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Massachusetts, 1976).
17. L. Accardi, arXiv:quant-ph/0007005.
18. L. Sica, Opt. Commun. 170, 55 (1999).
PROBABILITY CONSERVATION AND THE STATE DETERMINATION PROBLEM
S. AERTS
Free University of Brussels, Triomflaan 2, Brussels, Belgium
E-mail: saerts@vub.ac.be
The problem of finding an operational definition for the wave vector is briefly examined from a historical point of view. Led by an old idea of Feenberg, we integrate the one-dimensional probability conservation equation to obtain a closed formula that determines the state vector in the spinless case. The formula that determines the state does not depend on the (real) potential, external fields having their influence on the state only through the time derivative of the probability density function in position space. We apply the method to the simple case of a free Gaussian wave packet. Some problems regarding the operational status of the quantities involved are discussed.
1 Introduction
It is well known that Heisenberg constructed the matrix formulation of quantum mechanics by keeping in close accordance with what might be labelled the principle of operationality. Roughly, one can describe this principle as a determination to introduce only measurable quantities. Schrödinger, more concerned with Anschaulichkeit than operationality, introduced rather unscrupulously the concept of a wave function. He initially interpreted the wave function as a charge density in space, but this interpretation is difficult to extend to several-particle problems [a]. The interpretation that would stand the test of time, as testified by its being awarded the Nobel prize in 1954, was due to Born. In analogy with the theory of electromagnetic radiation, in which the intensity is the square of the amplitude, Born took the step of interpreting the intensity of an electromagnetic wave in a given region of space as proportional to the relative frequency of a photon detection in that region, and the probabilistic interpretation was born. However, this correspondence still does not make the wave function an operational quantity, as for every density $\rho(x,t)$ there are infinitely many $\varphi(x,t)$ such that, with $\psi(x,t) = \sqrt{\rho(x,t)}\, e^{i\varphi(x,t)}$, we get $\psi^*(x,t)\psi(x,t) = \rho(x,t)$. The problem is then to find suitable functions that we can approximate experimentally in a statistical way and that, in some well chosen combination, yield the same information as the complete wave function.
[a] For a rescue attempt of the original Schrödinger interpretation, see Dorling [1].
In order to make the question mathematically more precise, Prugovecki [2] introduced the notion of informational completeness. A family $\mathcal{F} = \{O_i;\ i \in I\}$ of bounded operators on a Hilbert space $\mathcal{H}$ is called informationally complete iff, for every two density operators $\rho$ and $\rho'$, the equality $\mathrm{Tr}(\rho O_i) = \mathrm{Tr}(\rho' O_i)$ for all $i \in I$ implies $\rho = \rho'$. This definition implies that the set of expectation values of an informationally complete set of operators allows only one state operator from which the expectation values could have been derived. What characterizes such a set? In a classical statistical framework, we can calculate all macroscopic quantities from a single density function $\rho(p,q)$ in phase space. Hence, by analogy, one is naturally led to the following interesting question, originally due to Pauli [3]: is it sufficient to know the probability density functions of position and momentum to determine unambiguously the quantum mechanical state of the physical system? In the quantum mechanical case, it is sufficient to know the wave function in coordinate space $\psi(x,t)$, since the corresponding wave function for the same system in momentum space $\tilde{\psi}(p,t)$ is given by its Fourier transform. Hence we can phrase the problem in a more mathematical way: is it possible to determine a square integrable function uniquely from both its modulus and the modulus of its Fourier transform?
Possibly the first non-trivial counterexamples came from Bargmann [b], who constructed explicit examples of wave functions $\psi_1$ and $\psi_2$ that give rise to the same probability distributions for position and momentum, but give a different probability distribution for a third operator that does not commute with the position or momentum operator. This leads to the remarkable conclusion that the wave function in its coordinate representation contains more information than the corresponding probability densities in position and momentum together. Due to Bargmann, we know the answer to be negative in a physically relevant way [c], and what is now commonly referred to as the Pauli problem is either the problem of determining the set of states that share the same modulus and the same modulus of their Fourier transform, or the problem of finding a set of observables that is informationally complete. The problems are related but not identical, and we prefer to refer to the first version of the problem as the Pauli problem and to the second as simply the state determination problem. It seems much more work has been done on the state determination problem, which is not surprising given the fact that the Pauli problem is a special case of it.
[b] Bargmann never seems to have published these results himself, and as a result little reference is given to his work in the literature. However, the examples can be found in Reichenbach [4].
[c] The problem re-appeared unaltered in the 1958 edition of Pauli's book, more than a decade after the first counterexamples.
With the exception of the production of counterexamples such as Bargmann's, the first instructive results regarding the Pauli problem were obtained only in 1978 by Corbett and Hurst [5]. In their paper, they construct physically important classes of functions that are uniquely determined by their position and momentum distributions. However, they also show that there exist dense subsets of states that are not uniquely determined by their position and momentum distributions, and as a consequence any state can be approximated in norm by a non-unique state. Extensions, comments and counterexamples to their work can be found in Friedman [6] and Pavicic [7]. Nevertheless, the complete characterization of the set of states that share modulus and the modulus of their Fourier transform is still open.
As for the state determination problem, we can split the work into those who were primarily concerned with establishing a set of observables that is informationally complete (or disproving a certain set to have this property) and those who set out to characterize such sets. The first group includes Feenberg [8] (1933), Moyal [9] (1949), Gale, Guth and Trammell [10] (1968), Band and Park [11,12,13] (1970-1971), and many more [14,15,16]. We will not go into the reconstruction of the state by placing the entity in different potentials, a method pioneered by Lamb [17] and one that inspired many similar approaches, such as Wiesbrock [18] and Weigert [19], nor will we mention the vast literature pertaining to the measurement of the Wigner distribution, known as phase-space tomography. However, concerning the characterization of informationally complete sets, we cannot help but make the following elementary remarks. Suppose we have a non-trivial (i.e. not a multiple of the identity) self-adjoint operator $A$ that commutes with every member of a set of operators $S$ in a Hilbert space $\mathcal{H}$. It is well known that the one-parameter family of unitary operators $\exp(itA)$ also commutes with every element of $S$. Now take any $\psi$ that is not an eigenvector of $A$. For any observable in $S$, the state $\psi_t = \exp(itA)\psi$ gives the same expectation value for this operator, whatever numerical value $t$ has.
But if $t \neq s$, it follows that $\psi_t \neq \psi_s$ (for the relation of this with superselection rules, see Wick, Wightman and Wigner (1952) [20], Emch and Piron (1963) [21] and Piron [22]). Hence $S$ is not an informationally complete set of observables. So a necessary condition for a set of observables to be informationally complete is maximality in the sense of Dirac; in other words, that there be no other non-trivial operator that commutes with every member of the set. However, this is far from sufficient. As Busch and Lahti [23] have shown, it is easy to derive [d] from the considerations above that no commuting set of observables is informationally complete. Maximal commuting sets of observables serve as a means of state preparation, not state identification. This means that, at least for continuous variables, the Pauli set $\{P, Q\}$ is in a certain sense the minimal set that one could possibly hope to be informationally complete (although Bargmann has shown this in general not to be the case).
[d] One arrives at this result by allowing $A$ to be a member of $S$.
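The remark on commuting operators can be made concrete with a two-level toy example, which is an illustrative assumption and not taken from the text: let the set $S$ contain only $\sigma_z$ and choose $A = \sigma_z$. The Python sketch below shows that the states $\psi_t = \exp(itA)\psi$ all give the same expectation value for $\sigma_z$, although they are physically different states, as a non-commuting observable such as $\sigma_x$ reveals.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# A state that is not an eigenvector of A = sigma_z.
PSI = np.array([1, 1], dtype=complex) / np.sqrt(2)


def evolve(psi, t):
    """Apply exp(i t sigma_z), which is diagonal in the z basis."""
    phases = np.exp(1j * t * np.diag(SZ))
    return phases * psi


def expect(op, state):
    """Expectation value <state|op|state> of a Hermitian operator."""
    return np.vdot(state, op @ state).real


t, s = 0.3, 1.1
psi_t, psi_s = evolve(PSI, t), evolve(PSI, s)

z_t, z_s = expect(SZ, psi_t), expect(SZ, psi_s)   # identical for all t
x_t, x_s = expect(SX, psi_t), expect(SX, psi_s)   # differ: cos(2t) vs cos(2s)
overlap = abs(np.vdot(psi_t, psi_s))              # < 1, so the states differ

print(z_t, z_s, x_t, x_s, overlap)
```

Measuring $\sigma_z$ alone therefore cannot distinguish $\psi_t$ from $\psi_s$, which is the sense in which a commuting set fails to be informationally complete.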
2 Conservation of Probability
What we will present in this article is an elaboration on the reasoning followed by Feenberg. Consider the time-dependent Schrödinger equation in $\psi$ with a real [e] potential $V$, using the shorthand $\psi$ for $\psi(\mathbf{r}, t)$:
$$\frac{\partial \psi}{\partial t} = -\frac{\hbar}{2im} \nabla^2 \psi + \frac{1}{i\hbar} V \psi.$$
Multiply by $\psi^*$ and add this to the complex conjugate of the above equation multiplied by $\psi$. After some elementary vector operator manipulation, we find what is commonly known as the conservation law of probability:
$$\frac{\partial \rho}{\partial t} + \nabla \cdot \left[ \frac{\hbar}{2im} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right) \right] = 0, \qquad \rho = \psi^* \psi.$$
Substitution of the polar representation of the wave vector, $\psi(\mathbf{r}, t) = \sqrt{\rho(\mathbf{r}, t)}\, e^{i\varphi(\mathbf{r}, t)}$ ($\varphi$ assumed real), into the former equation yields a second order partial differential equation, which is in fact a Fokker-Planck equation with zero diffusion coefficient and the phase serving as a potential:
$$\rho \nabla^2 \varphi + \nabla \rho \cdot \nabla \varphi + \frac{m}{\hbar} \frac{\partial \rho}{\partial t} = 0.$$
Feenberg's argument is a uniqueness result based on this last equation. It amounts to showing that any two phase functions that satisfy this equation and some gentle boundary conditions differ by at most a constant. His 1933 thesis is hard to get hold of, but the argument was (erroneously [10,15]) extended by Kemble [24] to three spatial dimensions in his much easier to find handbook on quantum mechanics. What we will do here is go back to the original one-dimensional idea, but rather than trying to establish a uniqueness result, we will show that in this simple case a solution can be obtained by direct integration.
3 Determination of the phase function
So $\rho$ and $\varphi$ satisfy the conservation law as given by the last equation. Rewriting this equation in one dimension, evaluated at a specific time instant $t = t_0$, gives us
$$\rho(x, t_0) \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial \rho(x, t_0)}{\partial x} \frac{\partial \varphi}{\partial x} + \frac{m}{\hbar} \left[ \frac{\partial \rho(x, t)}{\partial t} \right]_{t = t_0} = 0.$$
[e] The imaginary part of a complex potential can be used to mimic creation and annihilation effects. Although this is sometimes a useful approximation, such results violate the continuity equation, and for a more reliable analysis one should really use a second quantized theory.
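Assume for the time being that $\rho(x, t_0) \neq 0$ and divide the equation by $\rho(x, t_0)$:
$$\frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial \ln \rho(x, t_0)}{\partial x} \frac{\partial \varphi}{\partial x} + \frac{m}{\hbar} \left[ \frac{\partial \ln \rho(x, t)}{\partial t} \right]_{t = t_0} = 0.$$
Assuming $\rho(x, t_0)$ and its time derivative to be known functions, we can solve for the unknown phase $\varphi(x, t_0)$. Set
$$\Phi(x) = \frac{\partial \varphi}{\partial x}, \qquad f(x) = \frac{\partial \ln \rho(x, t_0)}{\partial x}, \qquad g(x) = -\frac{m}{\hbar} \left[ \frac{\partial \ln \rho(x, t)}{\partial t} \right]_{t = t_0}.$$
As all quantities are evaluated at the same time instant $t = t_0$, we will not bother to give further notational reference to this fact. In what follows we will also abbreviate (with abuse of language) $\left[ \partial \ln \rho(x, t) / \partial t \right]_{t = t_0}$ as $\partial_t \ln \rho(x)$. Applying these transformations, the equation becomes
$$\frac{\partial \Phi}{\partial x} + f(x)\, \Phi = g(x).$$
So we have transformed the second order partial differential equation into an ordinary first order linear differential equation with a source $g(x)$ at a fixed time instant. The solution of the homogeneous equation is $\Phi_h = \exp\left[ -\int f(x)\, dx \right] = \rho^{-1}(x)$. The general solution, with $c$ chosen to fit the boundary condition, is $\Phi(x) = \Phi_h(x) \left( c + \int^x g(s) \rho(s)\, ds \right)$. We have to integrate this result once more to get $\varphi(x)$:
$$\begin{aligned} \varphi(x) &= \int^x \Phi_h(r) \left( c + \int^r g(s) \rho(s)\, ds \right) dr \\ &= \int^x \rho^{-1}(r) \left[ c - \frac{m}{\hbar} \int^r \rho(s)\, \partial_t \ln \rho(s)\, ds \right] dr \\ &= \int^x \left( c - \frac{m}{\hbar} \int^r \partial_t \rho(s)\, ds \right) \frac{dr}{\rho(r)}. \end{aligned}$$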
4 Validity and range of applicability
The solution is seen to be a two-parameter family of curves: one for every value of the constant $c$ [f] and one for every lower limit, say $x_0$, of the $r$ integration. The result of changing the lower integration limit is only the addition of an overall constant to $\varphi(x, t)$. Because we know the quantum mechanical expectation values and probabilities to be invariant under such an addition, we set this constant equal to zero. The value of the constant $c$ can potentially affect the phase in a more profound way. Depending on the particular $\rho(r, t)$ used, $\rho^{-1}(r, t)$ might diverge when $\rho(r, t)$ is zero for some value(s) of $r$ or, even worse, for some interval $\Delta r$. First of all, we assumed in our derivation that $\rho(r, t) \neq 0$, but this restriction can easily be removed. Indeed, suppose we have $n$ places $x_i$ where the density does equal zero. A solution $\psi_i$ is then obtained for each interval $]x_i, x_{i+1}[$ by means of our equation. The total solution $\psi$ is obtained by pasting all the $\psi_i$ together, requiring continuity of $\psi$ and $\nabla\psi$ [g]. Now continuity of $\psi$ and $\nabla\psi$ implies continuity of their respective complex conjugates, and hence of $\rho$ and $\nabla\rho$. If we are to infer the phase from actual data, it seems reasonable to require $\varphi$ also to be continuous. In fact, the conservation equation requires it to be twice differentiable. If any cutting and pasting is necessary to obtain the solution, we can easily see that the constant $c$ should be the same for any two pasted pieces. Hence, if the cut is applied at a pole, $c$ has to be zero [h] for $\varphi$ to be continuous. We arrive at the same conclusion when we use the same reasoning on a point adjacent to the support of $\rho$. Hence we arrive at the main result of our paper:
[f] The lower limit of the $s$ integration is absorbed in the constant $c$.
$$\psi(x, t_0) = \sqrt{\rho(x, t_0)}\, \exp\left( -i\, \frac{m}{\hbar} \int^x \frac{dr}{\rho(r, t_0)} \int^r \partial_t \rho(s, t_0)\, ds \right).$$
Note that the state does not contain any reference to the potential. External fields will show up in the state indirectly, as a consequence of the time dependence of $\rho$. The assumptions that underlie the derivation of the equation are a spinless one-dimensional particle that acts under a real potential $V$, prepared in a pure state; in short, all that is required for a particle to obey the one-dimensional dynamical Schrödinger equation. However restricted this class is, it does include many examples that can be found in standard textbooks on quantum mechanics.
[g] This continuity demand is in fact a necessity, because the validity of the equation of probability conservation (and a fortiori of the Schrödinger equation) requires $\psi$ and $\nabla\psi$ to be continuous. A notable but unproblematic exception is that of an infinite potential step.
[h] The value of $c$ might be non-zero in applications where the continuity equation only expresses conservation of the probability flux in some intermediate region, the boundaries (possibly at infinity) containing sinks or sources of probability.
Comparing the result we have found to those in the literature, we find the closest match with a result obtained by Gale, Guth and Trammell [10]. They apply the definitions of $\rho(\mathbf{r})$ and $\mathbf{j}(\mathbf{r})$ to show that knowing these is sufficient for the determination of the phase. They then discuss a gedanken experiment for establishing the probability current by measuring the expectation of the velocity, and argue by means of this experiment and an intuitive argument that the current $\mathbf{j}(\mathbf{r})$ equals $\rho(\mathbf{r}) \langle \mathbf{v}(\mathbf{r}) \rangle$ for some $\mathbf{r}$ inside a small space region that is supposed to contain the particle. Our result was obtained by a direct integration and as a consequence is exact. It is, however, difficult to extend to higher dimensions, for two reasons. The first is the fact that the expression for the probability current in the presence of a vector potential becomes $\mathbf{J}(\mathbf{x}, t) = \mathrm{Re}\left\{ \psi^*(\mathbf{x}, t) \left[ \mathbf{p}/m - (q/mc) \mathbf{A} \right] \psi(\mathbf{x}, t) \right\}$, and depending on the form of the vector potential it is not obvious to what function of the phase this corresponds. If the vector potential corresponds to a uniform magnetic field, or in the absence of a vector potential (in which case one can transform the equation into a Poisson equation), one can solve the continuity equation by employing standard techniques. However, one then encounters a second problem. Providing an initial value for the phase (which is unproblematic, as the phase is only determined up to an additive constant) is no longer sufficient; instead we need an initial boundary function. Hence we have to resort to other principles to determine the phase on such a boundary in order to solve the problem. Of course, the principle of conservation may still serve the purpose of reducing the family of admissible functions for the phase of the amplitude. We will now illustrate the principle by applying it to a Gaussian wave packet. Later we will expound a few operational issues regarding the quantities involved in the solution given above.
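The direct-integration formula of this section can also be tried numerically. The following Python sketch (NumPy) assumes units with $\hbar = m = 1$, a free Gaussian packet with initial width 1 and mean wave number $k_0 = 2$ as test density, integration constant $c = 0$, and the sign convention $\varphi(x) = -(m/\hbar) \int^x \rho^{-1}(r) \int^r \partial_t \rho(s)\, ds\, dr$ that follows from the continuity equation. The time derivative is estimated from the density at two nearby times; the helper `cumulative_trapezoid` and all parameter values are illustrative choices.

```python
import numpy as np

HBAR = M = 1.0          # units chosen for the illustration
SIGMA0, K0 = 1.0, 2.0   # initial width and mean wave number (assumed values)


def rho(x, t):
    """Position density of a free Gaussian packet moving with velocity HBAR*K0/M."""
    spread = 1.0 + (HBAR * t / (2.0 * M * SIGMA0**2)) ** 2
    centre = HBAR * K0 * t / M
    norm = 1.0 / np.sqrt(2.0 * np.pi * SIGMA0**2 * spread)
    return norm * np.exp(-((x - centre) ** 2) / (2.0 * SIGMA0**2 * spread))


def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y over x, starting from zero at x[0]."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out


x = np.linspace(-8.0, 8.0, 4001)    # grid containing x = 0 at its midpoint
dt = 1e-4

# Central-difference estimate of the time derivative of the density at t = 0.
dt_rho = (rho(x, dt) - rho(x, -dt)) / (2.0 * dt)

inner = cumulative_trapezoid(dt_rho, x)                  # s integration from the left edge
outer = cumulative_trapezoid(inner / rho(x, 0.0), x)     # r integration
phase = -(M / HBAR) * (outer - outer[len(x) // 2])       # fix phase(0) = 0

# Compare with the known phase k0 * x where the density is not negligible.
mask = np.abs(x) <= 3.0
max_err = np.max(np.abs(phase[mask] - K0 * x[mask]))
print(max_err)
```

The comparison is restricted to $|x| \le 3$ because the division by $\rho$ amplifies numerical errors in the far tails, which is the numerical counterpart of the divergence of $\rho^{-1}$ discussed above.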
5 Evolution of a Gaussian Wave Packet
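The full time-dependent wave function for a free Gaussian wave packet is
$$\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/4}\left[1 + \frac{i\hbar t}{2m(\Delta x)_0^2}\right]^{-1/2}\exp\left[\frac{-x^2/4(\Delta x)_0^2 + ik_0x - ik_0^2\hbar t/2m}{1 + i\hbar t/2m(\Delta x)_0^2}\right].$$
From this we easily calculate $\rho(x,t)$:
$$\rho(x,t) = \psi^*(x,t)\,\psi(x,t) = \left[2\pi(\Delta x)_0^2\right]^{-1/2}\left[1 + \frac{\hbar^2t^2}{4m^2(\Delta x)_0^4}\right]^{-1/2}\exp\left[\frac{-(x - k_0\hbar t/m)^2}{2(\Delta x)_0^2\left[1 + \hbar^2t^2/4m^2(\Delta x)_0^4\right]}\right].$$
Now assume we did not know the wave function, only the probability density and its time derivative at some time instant $t = 0$. In an abbreviated form (with easy identification of the coefficients: $a = [2\pi(\Delta x)_0^2]^{-1/2}$, $b = \hbar/2m(\Delta x)_0^2$, $c = k_0\hbar/m$, $d = 2(\Delta x)_0^2$) we can write the probability density as
$$\rho(x,t) = a\,(1 + b^2t^2)^{-1/2}\exp\left[-\frac{(x - ct)^2}{d\,(1 + b^2t^2)}\right].$$
At time $t = 0$ this gives us $\rho(x,0) = a\exp(-x^2/d)$. The derivative of $\rho$ with respect to the time parameter at $t = 0$ is
$$\partial_t\rho(x,t)\big|_{t=0} = \partial_t\left[a\,(1 + b^2t^2)^{-1/2}\exp\left(-\frac{(x - ct)^2}{d\,(1 + b^2t^2)}\right)\right]_{t=0} = \frac{2acx}{d}\exp\left(-\frac{x^2}{d}\right).$$
The continuity equation with $j = (\hbar/m)\,\rho\,\partial_x\varphi$ gives $\partial_x\varphi = -(m/\hbar)\,\rho^{-1}\int_{-\infty}^{x}\partial_t\rho\,ds$. Fixing the free additive constant by $\varphi(0,0) = 0$, the phase becomes
$$\varphi(x,0) = -\frac{m}{\hbar}\int_0^x\frac{1}{\rho(r,0)}\int_{-\infty}^r\partial_t\rho(s,0)\,ds\,dr$$
$$= -\frac{2c}{d}\,\frac{m}{\hbar}\int_0^x\exp\left(\frac{r^2}{d}\right)\int_{-\infty}^r s\exp\left(-\frac{s^2}{d}\right)ds\,dr$$
$$= \frac{cm}{\hbar}\int_0^x dr$$
$$= \frac{k_0\hbar}{m}\,\frac{m}{\hbar}\,x$$
$$= k_0x,$$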
which is precisely the desired phase of the wave function at $t = 0$.
6 Operational Issues
Expounding Feenberg's uniqueness result, Reichenbach points out that we can recover the phase by numerical computation if we know $\rho(x,t_0)$ and $\partial_t\rho(x,t)|_{t=t_0}$. In order to establish these quantities, Reichenbach outlines the following procedure [4]. We take an ensemble $A$ of identically prepared systems, such that the ensemble can be properly described by a pure state $\psi$. Now select at random two sub-ensembles from $A$, say $B$ and $C$. For each system in $B$ we measure at the time $t_0$ the value of $x$. As the results will vary, we obtain in this way a distribution $\rho(x,t_0)$. Likewise, for each system in $C$ we measure at the time $t_1$ the value of $x$, obtaining a distribution $\rho(x,t_1)$. The quotient
$$\frac{\rho(x,t_1) - \rho(x,t_0)}{t_1 - t_0}$$
is then supposed to approximate $\partial_t\rho(x,t)$ for $t \in [t_0,t_1]$ if the interval $[t_0,t_1]$ is chosen sufficiently small. The wave function can then be obtained through numerical approximation, and represents the state of the systems that are left untouched in the original ensemble $A$.
There is a problem with Reichenbach's procedure for determining these quantities that is of equal concern to our method. Although it is entirely possible to position the detector wherever one wants it to be, hence effectively controlling $x$ in $\rho(x,t)$, it is an annoying peculiarity of quanta that one cannot determine when a detection will take place. One places a detector and simply waits for a detection count to happen. The problem seems related to what Mielnik has called the screen problem, in a provocative and enlightening paper by the same name [25]. As Mielnik points out, experimentalists perform a lot of experiments, but none resembling an instantaneous check of particle position. Indeed, a measurement setup typically consists of a source whose emission undergoes a series of transformations (e.g., an optical bench or a potential) and is subsequently detected by a fixed detector or a set of fixed detectors. If we are to describe operational means of measuring densities at some time instant, we will have to do so by such a typical setup.
To produce anything remotely satisfactory, we will need a few assumptions. A first assumption is that, if a particle is detected at some time instant $t_0$ in position $x$, the intricate mechanism between the measurement apparatus and the particle that is responsible for its detection does not depend on $t_0$, and in this sense has no effect on the value of $\rho(x,t)$. However unnatural the assumption might be from a physical point of view, it seems to underlie the statistical interpretation of $\int_\Omega |\psi(x,t)|^2\,dV$ as an instantaneous localization probability of the system in a state $\psi$, in a space region $\Omega$, and at a time instant $t$. In so far as our analysis depends on this assumption, so
does the standard interpretation of quantum mechanics. The next assumption is that we are able to control the release of the particle in a certain state within a sufficiently small time interval $\Delta t$, such that within this small time interval the density can reasonably be approximated by a linear function. This can be achieved by placing a shutter mechanism behind the source. Naturally, the shutter opening time has to be substantially less than the coherence time of the particle. A sufficiently short opening time can only be established by experiment, and one can never be quite sure whether there would still be more oscillations on a much shorter time scale. A density function with a larger variation will be harder to approximate, as it requires a shorter shutter opening time and hence will result in a lower detection rate. The wave packet then participates in the transformations we may have set up (optical bench, Stern-Gerlach) and is detected. The time interval between the shutter release and the detection time is noted, together with the position of the detector.
After many such recordings, we gather all the data to reconstruct $\rho(x,t)$. How many samples do we need? If the samples were taken at equidistant $\Delta t$ and $\Delta x$, we could do a Fourier synthesis and apply the Shannon-Whittaker sampling theorem. However, due to the non-equidistant spreading of the $t_n$ (at best following some statistical pattern), we need frame theory (Duffin and Schaeffer [26]) to reconstruct band-limited signals from irregularly spaced samples $f(t_n)$. The derivative with respect to time can then be derived from the reconstructed signal, and the phase derived by means of the proposed equation.
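The reconstruction chain described in this section can be sketched numerically for the free Gaussian packet of Section 5. The following Python fragment is only an illustration under stated assumptions, not the author's implementation: it uses units with hbar = m = 1, illustrative values k0 = 2 and (Delta x)_0 = 1, idealized noise-free densities at two times t0 and t1 (instead of detector counts), and the one-dimensional continuity relation d(phi)/dx = -(m/hbar) (1/rho) int d(rho)/dt ds. The names rho, cumtrapz and max_err are hypothetical helpers introduced here.

```python
import numpy as np

# Units and packet parameters are arbitrary illustrative choices (assumptions).
hbar, m = 1.0, 1.0
k0, dx0 = 2.0, 1.0          # mean wave number and initial width (Delta x)_0

def rho(x, t):
    """Probability density of the free Gaussian packet of Section 5."""
    s = 1.0 + (hbar * t / (2.0 * m * dx0**2))**2
    norm = (2.0 * np.pi * dx0**2 * s) ** -0.5
    return norm * np.exp(-(x - hbar * k0 * t / m)**2 / (2.0 * dx0**2 * s))

def cumtrapz(y, x):
    """Cumulative trapezoidal integral of y over x, starting at zero."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

x = np.linspace(-8.0, 8.0, 4001)
t0, t1 = 0.0, 1e-4

# Reichenbach's difference quotient as an estimate of d(rho)/dt at t0.
drho_dt = (rho(x, t1) - rho(x, t0)) / (t1 - t0)

# Continuity equation in one dimension:
# d(phi)/dx = -(m/hbar) * (1/rho) * integral of d(rho)/dt from the left edge to x.
flux_integral = cumtrapz(drho_dt, x)
dphi_dx = -(m / hbar) * flux_integral / rho(x, t0)
phi = cumtrapz(dphi_dx, x)
phi -= np.interp(0.0, x, phi)   # fix the free additive constant: phi(0) = 0

# Compare with the known phase k0 * x where the density is not negligible;
# in the far tails rho is tiny and the quotient is numerically unreliable.
core = np.abs(x) < 3.0
max_err = np.max(np.abs(phi[core] - k0 * x[core]))
```

In this idealized setting the recovered phase should follow k0 * x closely in the central region; with real, irregularly timed detection data the densities would first have to be reconstructed, which this sketch does not attempt.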
Acknowledgments
The author wishes to acknowledge a helpful discussion with John Corbett regarding the subject of this paper.
References
1. J. Dorling, Schrödinger: Centenary Celebration of a Polymath, ed. C.W. Kilmister (Cambridge, 1987).
2. E. Prugovecki, Int. J. Theor. Phys. 16, pp. 321-331 (1977).
3. W. Pauli, Encyclopedia of Physics, Vol. V, p. 17 (Springer-Verlag, Berlin, 1958).
4. H. Reichenbach, Philosophic Foundations of Quantum Mechanics (University of California Press, 1948).
5. J.V. Corbett, C.A. Hurst, J. Austral. Math. Soc. B20, 182-201 (1978).
6. C.N. Friedman, J. Austral. Math. Soc. B30, 298 (1987).
7. M. Pavicic, Phys. Lett. A 122, 280 (1987).
8. E. Feenberg, The Scattering of Slow Electrons in Neutral Atoms, Thesis, Harvard University (1933).
9. J.E. Moyal, Proc. Cambridge Phil. Soc. 45, 99 (1949).
10. W. Gale, E. Guth and G.T. Trammell, Phys. Rev. A 165, 1434-1436 (1968).
11. W. Band, J. Park, Found. Phys. 1, No. 2, pp. 133-144 (1970).
12. J. Park, W. Band, Found. Phys. 1, No. 4, pp. 339-357 (1971).
13. W. Band, J. Park, Am. J. Phys. 47, pp. 188-191 (1979).
14. A. Royer, Phys. Rev. Lett. 55, pp. 2745 (1985).
15. A. Royer, Found. Phys. 19, 3 (1989).
16. W. Stulpe, M. Singer, Found. Phys. Lett. 3, 153 (1990).
17. W.E. Lamb, Phys. Today 22(4), 23 (1969).
18. H.-W. Wiesbrock, Int. J. Theor. Phys. 26, pp. 1175 (1987).
19. S. Weigert, Phys. Rev. A 45, pp. 7688-7696 (1992).
20. G.C. Wick, A.S. Wightman, E.P. Wigner, Phys. Rev. 88, pp. 101-105 (1952).
21. E.C. Emch, C. Piron, J. Math. Phys. 4, pp. 496-473 (1963).
22. C. Piron, Helv. Phys. Acta 42, pp. 330-338 (1969).
23. P. Bush, P.J. Lahti, Found. Phys. 19, pp. 633 (1971).
24. E.C. Kemble (McGraw-Hill, New York, 1937).
25. B. Mielnik, Found. Phys. 24, 8, pp. 1113-1129 (1994).
26. R.J. Duffin, A.C. Schaeffer, Trans. Amer. Math. Soc. 72, 341-366 (1952).
EXTRINSIC AND INTRINSIC IRREVERSIBILITY IN PROBABILISTIC DYNAMICAL LAWS
H. ATMANSPACHER
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: haa@igpp.de
and
Max-Planck-Institut für extraterrestrische Physik, D-85740 Garching, Germany
R. C. BISHOP
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstr. 3a, D-79098 Freiburg, Germany
E-mail: rcb@igpp.de
A. AMANN
Universitätsklinik für Anästhesie, Leopold-Franzens-Universität, Anichstr. 35, A-6020 Innsbruck, Austria
E-mail: anton.amann@uibk.ac.at
and
Institut für Allgemeine, Anorganische und Theoretische Chemie, Abteilung für theoretische Chemie, Leopold-Franzens-Universität, Innrain 52a, A-6020 Innsbruck, Austria
Two distinct conceptions for the relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of physical systems are outlined. The standard extrinsic concept of irreversibility is based on the notion of an open system interacting with its environment. An alternative intrinsic concept of irreversibility does not explicitly refer to any environment at all. Basic aspects of the two concepts are presented and compared with each other. The significance of the terms extrinsic and intrinsic is discussed.
1 Introduction
The relation between reversible, time-reversal invariant laws of nature and the irreversible behavior of empirical systems has been a long-standing problem in physics. In most standard approaches, fundamental dynamical laws, such as Newton's, Maxwell's, Einstein's, or Schrödinger's equations, describe the temporal evolution of isolated systems. Irreversible dynamical laws are typically regarded as emerging from the interaction between systems and their environment, i.e., from considering open systems.
In contrast to this extrinsic conception of irreversibility, there is a group of scientists who insist that some kinds of irreversibility are intrinsic, i.e., some kinds of irreversible laws are fundamental. On this view, mainly advocated by Prigogine and colleagues in Brussels and Austin, the switch from extrinsic to intrinsic irreversibility goes along with a switch from particular kinds of deterministic descriptions to particular kinds of probabilistic descriptions.
In general, the two viewpoints are considered to be distinct, sometimes even entirely incompatible. It is the main goal of this contribution to show that there are both differences and similarities between them. As a consequence, it makes little sense to prefer one of them at the expense of the other. It is much more interesting to explore whether particular aspects of each of the two views can be constructively related to each other, in order to increase our insight into the issue of irreversibility.
In the following, both conceptions will be presented in some detail and compared. It is suggested that the distinction of ontic and epistemic categorial frameworks for some problems associated with irreversibility is particularly useful when focusing on a conceptual discussion. Such a distinction serves to clarify both common and distinct aspects of extrinsic and intrinsic irreversibility, and it helps to frame a number of open questions concerning them.
In Section 2, ontic and epistemic descriptions are briefly introduced. We use an algebraic framework for this introduction, since this has proven fruitful in related problem areas. Section 3 outlines some basic issues with respect to the ontic states of closed quantum systems and their time-reversal invariant dynamical evolution. Subsequently, two ways to conceive of extrinsic irreversibility are described. In one of them, epistemic states are represented by (reduced) density operators; in the other, they are represented by probability distributions of pure states. Section 4 presents the intrinsic conception of irreversibility. One major line of research in this regard deals with transformations from invertible K-systems to non-invertible exact systems; the other uses the concept of rigged Hilbert spaces to extend the state of a system beyond Hilbert space. Section 5 summarizes the main points and indicates some open questions.
2 Ontic and epistemic descriptions
2.1 General issues
Can nature be observed and described as it is in itself, independent of those who observe and describe, that is to say, nature as it is when nobody looks? This question has been debated throughout the history of philosophy, with no clear answer either way. Each perspective has strengths and weaknesses, and in each epoch has had its critics and proponents. In contemporary terminology, the two perspectives can be distinguished as the topics of ontology and epistemology. Ontological questions refer to the structure and behavior of a system as such, whereas epistemological questions refer to knowledge (or information) about systems.
In philosophical discourse, it is considered a serious fallacy to confuse these two types of questions. For instance, Fetzer and Almeder emphasize that an ontic answer to an epistemic question (or vice versa) normally commits a category mistake [1]. Nevertheless, such mistakes are frequently committed in many fields of research when addressing subjects where the distinction between ontological and epistemological arguments is important.
The ontic/epistemic distinction refers to states and properties of a system as such or in its relation to observers; hence it is an ontological distinction.$^a$
In physics, the rise of quantum theory with its interpretational problems was one of the first major challenges to the ontic/epistemic distinction. The Bohr-Einstein discussions in the 1920s and 1930s serve as a famous historical example. Einstein's arguments were generally ontically motivated; that is to say, he emphasized a viewpoint independent of observers or measurements. By contrast, Bohr's emphasis was generally epistemically motivated, focusing on what we could know and infer from observed quantum phenomena. Since Bohr and Einstein never made their basic viewpoints explicit, it is not surprising that they talked past each other in a number of respects [2].
Examples of approaches trying to avoid the confusions of the Bohr-Einstein discussions are Heisenberg's distinction of actuality and potentiality [3], Bohm's ideas on explicate and implicate orders [5], or d'Espagnat's scheme of an empirical, weakly objective reality and an objective (veiled) reality independent of observers and their minds [5]. Further terms fitting into the ontic side of these distinctions are latency [6], propensity [7], or disposition [8]. See also Jammer's discussion of these notions, including their criticism and additional references [9].
A first attempt to draw an explicit distinction between ontic and epistemic descriptions for quantum systems was introduced by Scheibe [10], who himself, however, strongly emphasized the epistemic realm. Later, Primas developed this distinction in the formal framework of algebraic quantum theory [11]. The basic structure of the ontic/epistemic distinction, which will be made more precise below, can be roughly characterized as follows (for more details the reader is referred to [11, 12]).
a On the other hand, the distinction between ontological and epistemological problems can be considered as epistemological, insofar as both areas represent fields of (philosophical) knowledge.
Ontic states describe all properties of a physical system exhaustively. (Exhaustive in this context means that an ontic state is precisely the way it is, without any reference to epistemic knowledge or ignorance.) Ontic states are the referents of individual descriptions; the properties of the system are treated as intrinsic properties.$^b$ As an important example, ontic states refer to closed systems; they are empirically inaccessible. Typically, their temporal evolution (dynamics) is reversible and follows fundamental deterministic laws.
Epistemic states describe our (usually non-exhaustive) knowledge of the properties of a physical system, i.e., based on a finite partition of the relevant phase space. The referents of statistical descriptions are epistemic states; the properties of the system are treated as contextual properties. Epistemic states refer to open systems; they are, at least in principle, empirically accessible. Typically, their temporal evolution (dynamics) follows irreversible laws.
The combination of the ontic/epistemic distinction with the formalism of algebraic quantum theory provides a framework that is both formally and conceptually satisfying. Although the formalism of algebraic quantum theory is often hard to handle for specific physical applications, it offers significant clarifications concerning the basic structure and the philosophical implications of quantum theory. For instance, the modern achievements of algebraic quantum theory make clear in what sense pioneer quantum mechanics (which von Neumann implicitly formulated epistemically [13]) as well as classical and statistical mechanics can be considered as special cases of a more general theory. Compared to the framework of von Neumann's monograph [13], important extensions are obtained by giving up the irreducibility of the algebra of observables (not admitting observables which commute with every observable in the same algebra) and the restriction to locally compact phase spaces (admitting only finitely many degrees of freedom). As a consequence, modern quantum physics is able to deal with open systems in addition to isolated ones; it can involve infinitely many degrees of freedom, such as the infinitely many modes of a radiation field; it can properly consider interactions with the environment of a system; superselection rules, classical observables, and phase transitions can be formulated, which would be impossible in an irreducible algebra of observables; there exist infinitely many representations inequivalent to the Fock representation; and non-automorphic, irreversible dynamical evolutions can be successfully incorporated and even derived.
b In a more technical terminology, one speaks of observables (mathematically represented by operators) rather than properties of a system. Prima facie, the term observable has nothing to do with the actual observability of a corresponding property.
In addition to this remarkable progress, the mathematical rigor of algebraic quantum theory in combination with the ontic/epistemic distinction allows us to address a number of unresolved conceptual and interpretational problems of pioneer quantum mechanics from a new perspective. First, the distinction between different concepts of states as well as observables provides a much better understanding of many confusing issues in earlier conceptions, including alleged paradoxes such as those of Einstein, Podolsky, and Rosen (EPR) [14]. Second, a clear-cut characterization of different concepts of states and observables is a necessary precondition to explore new approaches, beyond von Neumann's projection postulate, toward the central problem that pervades all quantum theory: the measurement problem. Third, a number of much-discussed interpretations of quantum theory and their variants can be appreciated more properly if they are considered from the perspective of an algebraic formulation.
One of the most striking differences between the concepts of ontic and epistemic states is their difference concerning operational access, i.e., observability and measurability. At first sight it might appear pointless to keep a level of description which is not related to what can be operationalized empirically. However, a most appealing feature at this ontic level is the existence of first principles and fundamental laws that cannot be obtained at the epistemic level. Furthermore, it is possible to rigorously deduce (e.g., to GNS-construct, cf. [12, 15]) a proper epistemic description from an ontic description if enough details about the empirically given situation are known. These aspects show that the crucial point is not to decide whether ontic or epistemic levels of discussion are right or wrong in a mutually exclusive sense. There are always ontic and epistemic elements to be taken into account for a proper description of a system. This requires the definition of ontic and epistemic terms to be relativized with respect to some selected framework within a set of (hierarchical) descriptions (see [16] for details and examples). The problem is then to use the proper level of description for a given context, and to develop and explore well-defined relations between different levels.
These relations are not universally prescribed; they depend on contexts of various kinds. The concepts of reduction and emergence are of crucial significance here. In contrast to the majority of publications dealing with these topics, it is possible to precisely specify their meaning in mathematical terms. Contexts, or contingent conditions, can be formally incorporated as topologies in which particular asymptotic limits give rise to novel, emergent properties unavailable without those contexts (see [15] for more details). It should also be mentioned that the distinction between ontic and epistemic descriptions is neither identical with that of parts and wholes nor with that of micro- and macrostates as used in statistical mechanics or thermodynamics. The thermodynamic limit of an infinite number of degrees of freedom provides only one example of a contextual topology; others are the Born-Oppenheimer limit in molecular physics or the short-wavelength limit for geometrical optics.
These examples indicate that the usefulness, or even inevitability, of the ontic/epistemic distinction is not restricted to quantum systems. It plays a significant role in the description of classical systems as well. More specifically, it has been shown in detail that for systems exhibiting deterministic chaos the distinction of ontic and epistemic descriptions is necessary if category mistakes and corresponding interpretational fallacies are to be avoided [17].
3 Breaking Time-Reversal Symmetry: Extrinsic Irreversibility
3.1 Time-Reversal Symmetry in Closed Systems
Let us start with a closed quantum system which can be considered without any reference to an environment. The pure state $\phi$ of such a system is an extremal positive linear functional on a C*-algebra $\mathcal{A}$. The state $\phi \in \mathcal{A}^*$, where $\mathcal{A}^*$ is the dual of $\mathcal{A}$, is then called an ontic state of the closed system. If a Hilbert space representation of $\mathcal{A}$ is possible, $\phi$ can be represented as a state vector $\psi \in \mathcal{H}$, characterized by the expectation values $\langle\psi|A|\psi\rangle$ of all observables $A \in \mathcal{A}$. Under particular conditions, the dynamics of $\phi$ is given by the time-reversal invariant Schrödinger equation.
In the traditional Hilbert space representation, the algebra $\mathcal{A}$ of observables is irreducible: there are no observables (other than multiples of the identity) that commute with every observable in the algebra. Due to the Stone-von Neumann theorem, every representation of the canonical commutation relations is then equivalent to the Schrödinger representation. In the more general setting of a Fock space (sum of tensor products of one-particle Hilbert spaces), the same holds for Fock representations.
A restriction of $\phi$ to a subsystem is not a pure state in general; hence it is in general illegitimate to consider a closed quantum system as consisting of closed subsystems. As a consequence, an ontic state $\phi$ characterizes an individual, undivided whole, not consisting of subsystems with their own ontic states. This is the level of description to which the notions of quantum nonlocality or quantum holism apply. Since the concept of an environment does not make sense for ontic states of closed systems, it is illegitimate to speak about their entanglement or interaction with another state.
If one introduces a distinction (Heisenberg cut) to create subsystems in a closed system, then these subsystems in general are open. For example, one can then consider an object entangled and/or interacting with its environment. The epistemic state $\eta$ of those subsystems can be represented in two conceptually different ways.
3.2 Density Operators as Non-Pure States
The first, more or less familiar, representation of an epistemic state $\eta$ is given by a (reduced) density operator $D \in \mathcal{M}_*$, where $\mathcal{M}_*$ is the predual of a W*-algebra $\mathcal{M}$ of contextual observables. Expectation values with respect to $D$ are given by $\mathrm{Tr}(DM)$ for observables $M \in \mathcal{M}$. The epistemic state $\eta$ represented by $D$ is a non-pure state. EPR-correlations between subsystem and environment are generic if the contextual algebra of observables is non-commutative.
The term contextual observables derives from the fact that their construction requires the selection of a context, defined by a subset of relevant observables $B \in \mathcal{B} \subset \mathcal{A}$ and a reference state (e.g., vacuum state, KMS state) distinguished by some appropriate stability condition. This context induces the weak closure of $\mathcal{B}$ and gives rise to a contextual topology in $\mathcal{M}$. If the context is known well enough, then the GNS representation is a powerful constructive tool to implement a proper contextual topology (see, e.g., [15]).
The dynamics of $D$ is of Schrödinger type plus dissipative terms (e.g., a master equation), so that the time-reversal invariance of the Schrödinger equation can be broken [18, 19].
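How a dissipative term breaks time-reversal invariance can be seen in a toy case. The following Python sketch is an illustration only and is not taken from the cited references: it assumes a single two-level system with hbar = 1, a pure-dephasing generator of Lindblad form, and arbitrary values for the level splitting omega and the rate gamma. The function names rhs and evolve are hypothetical.

```python
import numpy as np

# Illustrative (assumed) parameters: level splitting omega and dephasing rate gamma.
omega, gamma = 1.0, 0.2
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = 0.5 * omega * sz

def rhs(D):
    """Right-hand side dD/dt = -i[H, D] + gamma (sz D sz - D), with hbar = 1."""
    unitary = -1j * (H @ D - D @ H)
    dissipative = gamma * (sz @ D @ sz - D)
    return unitary + dissipative

def evolve(D, t_final, steps=2000):
    """Integrate the master equation with a classical fourth-order Runge-Kutta scheme."""
    dt = t_final / steps
    for _ in range(steps):
        k1 = rhs(D)
        k2 = rhs(D + 0.5 * dt * k1)
        k3 = rhs(D + 0.5 * dt * k2)
        k4 = rhs(D + dt * k3)
        D = D + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return D

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
D0 = np.outer(plus, plus.conj())          # pure initial state
D1 = evolve(D0, t_final=10.0)

purity0 = np.trace(D0 @ D0).real          # equals 1 for a pure state
purity1 = np.trace(D1 @ D1).real          # decreases under dephasing
```

With gamma = 0 the evolution is unitary and the purity stays at 1; with gamma > 0 the off-diagonal elements decay and the purity drops, so the evolution can no longer be run backwards within the same semigroup.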
3.3 Probability Distributions of Pure States
If the epistemic state $\eta$ of an open system is rendered approximately pure by a clever dressing of object and environment ($b$ indicates bare objects and environments, and $d$ indicates dressed objects and environments),
$$\mathcal{H}^b_{\rm obj} \otimes \mathcal{H}^b_{\rm env} = \mathcal{H}^d_{\rm obj} \otimes \mathcal{H}^d_{\rm env},$$
$\eta$ can be represented (estimated) by a probability distribution $\mu$ of pure states. (A dressing procedure is clever if it minimizes EPR-correlations between object and environment, or if it maximizes the integrity of both object and environment [20].) $\mathcal{H}^d_{\rm obj}$ is the proper Hilbert space for an approximately pure epistemic state $\eta$. Although $\eta$ can be uniquely extended to a normal state on $\mathcal{M}$ (represented by a density operator), the pure states and their distribution $\mu$ themselves do not make sense on $\mathcal{M}$. The relevant observables are elements of a C*-subalgebra $\mathcal{B} \subset \mathcal{A}$.
The dynamics of $\rho$ is of Schrödinger type plus stochastic terms (e.g., an Itô/Stratonovich equation), so that the time-reversal invariance of the Schrödinger equation can be broken. The stochastic aspect of the time evolution (of approximately pure states of the object) originates from the fact that the (initial) state of the environment cannot be determined and therefore must be treated as a stochastic variable. Starting from an initial pure state $\rho_0$, one gets time-evolved states $\rho_{t,\omega}$, where $\omega$ is the stochastic variable. First steps of such an approach toward single open quantum systems, not based exclusively on decompositions of density-operator dynamics, were proposed in [21, 22].
For a large class of stochastic dynamics of approximately pure states of objects, one ends up with one particular distribution $\mu_\infty$ of pure states in the limit $t \to \infty$, independently of the initial conditions (such dynamical objects are called ergodic). Splitting the underlying C*-algebra $\mathcal{B}$ into two subsystems with two C*-subalgebras $\mathcal{B}_1$ and $\mathcal{B}_2$, $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$, is then admitted under particular conditions. In an ideal situation, all those pure states onto which the probability measures $\mu_t$ extend are product states with respect to the tensor product $\mathcal{B} = \mathcal{B}_1 \otimes \mathcal{B}_2$. This situation never arises in practice, but most relevant pure states can be product states or almost product states if the dressing tensorization is chosen appropriately [23].
3.4 Dynamics of Measurement: a Simple Example
Any dynamical description of measurement has to start from a proper decomposition of a system into a dressed object and its dressed environment. It is crucial to keep in mind that such a decomposition is a logical precondition for the dynamics of measurement, insofar as the Hamiltonian of the composed system needs to be written as a sum
$$H = H_{\rm obj} \otimes 1 + 1 \otimes H_{\rm env} + H_{\rm int}. \qquad (1)$$
An illustrative heuristic example has been extensively discussed by Primas [24]. Consider the simple case of a two-level quantum object (spin-1/2 system) with the Hamiltonian
$$H_{\rm obj} = \frac{\hbar}{2}\sum_{\nu=1}^{3}\Omega_\nu\sigma_\nu, \qquad (2)$$
a sufficiently nontrivial boson field environment
$$H_{\rm env} = \sum_{\nu=1}^{3}\sum_k \omega_k\, a^*_{k\nu}a_{k\nu}, \qquad (3)$$
and an interaction
$$H_{\rm int} = \sum_{\nu=1}^{3}\sigma_\nu \otimes A_\nu, \qquad (4)$$
where
$$A_\nu = \sum_k \lambda_{k\nu}a_{k\nu} + {\rm c.c.} \qquad (5)$$
If such a decomposition has been properly carried out (cf. Sec. 3.3), then it is possible to derive the expectation values
$$M(t) = \langle\psi_t|\sigma|\psi_t\rangle, \qquad (6)$$
$$a(t) = \langle\chi_t|A|\chi_t\rangle, \qquad (7)$$
with respect to the (approximate) product state
$$\Psi_t = \psi_t^{\rm obj} \otimes \chi_t^{\rm env}. \qquad (8)$$
Corresponding to the product state $\Psi_t$, the C*-algebra of intrinsic observables in the composed system of dressed object and dressed environment is
$$\mathcal{A} = \mathcal{A}_{\rm obj} \otimes \mathcal{A}_{\rm env}. \qquad (9)$$
$\mathcal{A}_{\rm obj}$ is the C*-algebra of $2 \times 2$ matrices, and $\mathcal{A}_{\rm env}$ is the C*-algebra of intrinsic observables of an environment with infinitely many degrees of freedom.
The equations of motion for the expectation values $M(t)$ and $a(t)$ are given by
$$\dot{M}(t) = M(t) \times \Omega + M(t) \times a(t), \qquad (10)$$
$$\dot{\alpha}_{k\nu}(t) = -i\omega_k\alpha_{k\nu}(t) - \frac{i}{2}\lambda_{k\nu}M_\nu(t), \qquad (11)$$
with $\alpha_{k\nu}(t)$ the expectation value of the boson operator $a_{k\nu}$.
They describe the feedback between object and environment. More precisely, they describe the polarization $M$ of the object under the influence of the environment, and the motion of the environment observable $a$ (boson operator) under the polarizing influence of the object. The solution of the second equation, referring to the observables of the environment (or the measuring system, respectively), has a retarded and an advanced part:
$$\alpha^{\rm ret}_{k\nu}(t) = \exp(-i\omega_kt)\,\alpha_{k\nu}(0) - \frac{i}{2}\lambda_{k\nu}\int_0^t \exp(-i\omega_k(t-s))\,M_\nu(s)\,ds \quad (t > 0), \qquad (12)$$
$$\alpha^{\rm adv}_{k\nu}(t) = \exp(-i\omega_kt)\,\alpha_{k\nu}(0) + \frac{i}{2}\lambda_{k\nu}\int_t^0 \exp(-i\omega_k(t-s))\,M_\nu(s)\,ds \quad (t < 0). \qquad (13)$$
A bidirectionally deterministic system can be described in terms of a superposition of a backward deterministic (forward non-deterministic) and a forward deterministic (backward non-deterministic) process, which are equally relevant a priori. Selecting one of these solutions and disregarding the other requires the time-inversion symmetry of the compound system to be broken. For this purpose one can apply the principle of causality (past-determinacy, error-free retrodiction, no anticipation) as a heuristic argument for the selection of the retarded solution.
It has been argued that the retarded, i.e., the backward deterministic, forward non-deterministic solution is a K-flow$^c$ on a state space with infinitely many degrees of freedom [24]. In the simplest case, the relaxation time for this K-flow is the time constant $\tau_\nu$ of an exponentially decaying correlation function (for details see [24]),
$$K_\nu(t) \propto \exp(-t/\tau_\nu). \qquad (14)$$
At this point we are still at the level of description of intrinsic observables needed for the specification of initial conditions of the K-flow. Conceptually, this K-flow represents a stochastic process which corresponds to chaos in the sense of Wiener [25] rather than chaos in the sense of Kolmogorov and Sinai (i.e., a dissipative dynamics). By introducing a context via a reference state with respect to which stability in a particular sense (hopefully more general than thermal equilibrium) can be checked, one can proceed to (GNS-constructed) contextual observables.
c Note that K-flows or K-systems play an important role in one of the approaches of intrinsic irreversibility (see Sec. 4.1). It would be interesting, but exceeds the scope of this paper, to explore the question of whether the process of measurement as described here can be conceived as intrinsically irreversible. In this respect see, e.g., [26].
3.5 General Features of Extrinsic Irreversibility
The breaking of time-reversal symmetry in the framework of extrinsic irreversibility corresponds to the conceptual transition from closed systems with ontic states to open systems with epistemic states. Such a transition can be understood by dividing a closed system into open, more or less EPR-correlated subsystems (e.g., object and environment) and by selecting a subset of relevant observables. The proper state concepts are epistemic. There are then two different statistical representations for different epistemic state concepts. One statistical representation expresses a probability distribution $\mu$ of pure states, whereas the usual one focuses on reduced density operators $D$.
The interaction of the open subsystems is described by dynamical laws different from the time-reversal invariant dynamics of a closed system. Breaking the time-reversal invariance of a unitary group evolution generates two semigroups, which can be endowed with two arrows of time opposite to each other. It should be pointed out that the forward arrow cannot be selected by physical reasons alone. Extra-physical arguments, such as consistency with experience, causality, etc., must be invoked.
4 Breaking Time-Reversal Symmetry: Intrinsic Irreversibility
In contrast to the extrinsic concept of irreversibility, there is an alternative concept of intrinsic irreversibility, mainly advocated by Prigogine and collaborators (more recently also by Bohm). They propose describing states of any system generically with distributions $\rho$ (i.e., probability distributions or density operators). The claim is that the state $\rho$ of systems beyond a particular degree of complexity evolves irreversibly by itself, i.e., without any relationship to an environment. There are essentially two lines of research pursuing this proposal.
4-1 A-Transformation from K-Systems to Exact Systems
The notion of the A-transformation has been developed by Misra Courbage and Prigogine in the 1970s It is essentially based on the theory of ergodic systems In particular the concept of Kolmogorov systems briefly K-systems is of central significance in this context
Definition 1 [27]. Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be an invertible transformation such that $S$ and $S^{-1}$ are measurable and measure preserving. The transformation $S$ is called a K-automorphism if there exists a σ-algebra $\mathcal{A}_0$ such that the following three conditions are satisfied: (i) $S^{-1}(\mathcal{A}_0) \subset \mathcal{A}_0$; (ii) the σ-algebra $\bigcap_{n=0}^{\infty} S^{-n}(\mathcal{A}_0)$ is trivial (i.e., contains only sets of measure 1 or 0); (iii) the smallest σ-algebra containing $\bigcup_{n=0}^{\infty} S^{n}(\mathcal{A}_0)$ is identical to $\mathcal{A}$.
Another way to characterize (classical) K-systems is by way of the existence of positive Lyapunov exponents, equivalent to a strictly positive Kolmogorov-Sinai entropy. The properties of K-systems imply mixing and ergodicity. K-systems are invertible transformations; hence their deterministic dynamics, given by $\rho(t) = U_t \rho(0)$, is reversible ($U_t$ is a unitary evolution operator acting on $\rho$). A standard example is the (2-dimensional) baker transformation.
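The reversibility of the baker transformation can be made concrete with a minimal sketch. The explicit formulas below are the usual textbook form of the map on the unit square and are an assumption of this example (the text does not write them out); the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def baker(x, y):
    """Forward baker map on [0,1) x [0,1): stretch along x, squeeze along y."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2


def baker_inverse(x, y):
    """Inverse baker map: stretch along y, squeeze along x."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1


# Exact rational arithmetic avoids the bit loss that floats suffer under doubling.
start = (Fraction(3, 10), Fraction(7, 10))

forward = start
for _ in range(5):
    forward = baker(*forward)

back = forward
for _ in range(5):
    back = baker_inverse(*back)

print("after 5 steps:", forward)
print("recovered initial point:", back == start)
```

Because every point has exactly one preimage, the initial condition is recovered exactly, which is the sense in which the dynamics of a K-system is reversible.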
Another important class of mixing systems refers to so-called exact systems.
Definition 2 [27]. Let $(X, \mathcal{A}, \mu)$ be a normalized measure space and let $S: X \to X$ be a measure preserving transformation such that $S(A) \in \mathcal{A}$ for each $A \in \mathcal{A}$. If $\lim_{n \to \infty} \mu(S^n(A)) = 1$ for every $A \in \mathcal{A}$ with $\mu(A) > 0$, then $S$ is called exact.
Exact systems are represented by non-invertible transformations; hence their stochastic dynamics, given by $\tilde\rho(t) = W_t \tilde\rho(0)$, is irreversible. $W_t$ is a semigroup evolution operator acting on a distribution $\tilde\rho$ rather than $\rho$. For instance, an exact system obtained from the baker transformation is the dyadic transformation
S(x) = 2x (mod 1)
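How a semigroup acting on distributions loses information can be sketched for the dyadic transformation. The example below uses the Frobenius-Perron operator of the map, $(Pf)(x) = \tfrac{1}{2}[f(x/2) + f((x+1)/2)]$, acting on piecewise-constant densities. Identifying this operator with one step of the semigroup, the bin representation, and the function names are assumptions of this illustration.

```python
from fractions import Fraction


def frobenius_perron_dyadic(density):
    """One step of (P f)(x) = [f(x/2) + f((x+1)/2)] / 2.

    `density` holds the values of f on n equal bins of [0, 1); n must be even.
    For x in bin j, x/2 lies in bin j//2 and (x+1)/2 lies in bin n//2 + j//2.
    """
    n = len(density)
    half = n // 2
    return [(density[j // 2] + density[half + j // 2]) / 2 for j in range(n)]


def evolve(density, steps):
    """Apply the Frobenius-Perron operator `steps` times."""
    for _ in range(steps):
        density = frobenius_perron_dyadic(density)
    return density


# Two different initial densities, each concentrated on a single bin of width 1/8.
first = [Fraction(8)] + [Fraction(0)] * 7
second = [Fraction(0), Fraction(8)] + [Fraction(0)] * 6

print(evolve(first, 1))
print(evolve(first, 3))
print(evolve(second, 3))
```

Both initial densities reach the uniform density after three steps, so the evolution cannot be run backwards to tell them apart.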
A theorem by Rokhlin [28] says that every exact system is the factor of a K-system. This means that K-systems can be transformed into exact systems by their projections (or factors, see [27]). More generally, a factor of a K-system can be obtained by restriction to dilating fibers or unstable manifolds. Hence it is intuitively clear that the invertibility of a K-system gets lost by its transformation into an exact system.
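The factor relation can be checked directly for the baker and dyadic transformations. The sketch below assumes the usual textbook form of the baker map and takes the projection onto the expanding x-coordinate as the factor map; all names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def baker(x, y):
    """Baker map on the unit square (invertible)."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2


def dyadic(x):
    """Dyadic transformation S(x) = 2x (mod 1) on [0, 1) (non-invertible)."""
    return (2 * x) % 1


def project(x, y):
    """Projection onto the dilating (expanding) coordinate."""
    return x


grid = [Fraction(k, 16) for k in range(16)]

# Factor property: projecting after one baker step equals one dyadic step
# applied to the projected point.
commutes = all(
    project(*baker(x, y)) == dyadic(project(x, y)) for x in grid for y in grid
)

# Loss of invertibility: the point 1/4 has two preimages under the dyadic map.
preimages = [x for x in grid if dyadic(x) == Fraction(1, 4)]

print("projection commutes with the dynamics:", commutes)
print("preimages of 1/4:", preimages)
```

The projection discards the contracting coordinate, which is exactly the information needed to invert the map.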
According to Misra et al. [29, 30], the relations between the two kinds of dynamics, $U_t$ and $W_t$, and the two state concepts, $\rho$ and $\tilde\rho$, are provided by a similarity transformation $\Lambda$ according to
$$W_t = \Lambda U_t \Lambda^{-1},$$
$$\tilde\rho = \Lambda \rho.$$
Wightman's question [31] as to the meaning of $\tilde\rho$ in his review of [30] gets an immediate answer if one applies Rokhlin's theorem to construct $\Lambda$ (cf. [32]). The transformed distribution $\tilde\rho$ is the projection of $\rho$ onto a dilating subspace. This can easily be seen for the examples of the baker transformation and the dyadic transformation. In the more complicated case of continuous-time nonlinear (hyperbolic) systems, the corresponding procedure would be a projection onto the unstable manifolds, i.e., those directions along which the Lyapunov exponents are positive and add up to the Kolmogorov-Sinai entropy (cf. [33, 34]). As an important conceptual feature, such projections select a time direction.
A crucial formal feature associated with the irreversibility due to $W_t$ is that a properly constructed $\Lambda$ (and hence $W_t = \Lambda U_t \Lambda^{-1}$) preserves the positivity of the state distributions only for positive times. A conceptual discussion of this point can be found in [35]. For a more detailed formal account of the role which positivity preservation plays in the transformation between irreversible semigroups and chaotic dynamics, see [36] and references given there.
4.2 Rigged Hilbert Space Representation
Intrinsic irreversibility has also been implemented in an approach based on an extension of the usual Hilbert space representation of the state of a system. This approach makes use of the so-called rigged Hilbert space (RHS) construction, first introduced by the Russian mathematician Gelfand and his collaborators [37]. Roberts [38] and Bohm [39] independently showed how Dirac's formalism could be justified with complete mathematical rigor in a RHS. By the end of the 1970s it turned out that some basic physical problems of Hilbert space quantum mechanics, notably in the context of decaying states or resonances, could be clarified in terms of RHS ([40] and references therein).
Very briefly, a RHS (Gelfand triplet) can be understood as follows. Let $\Psi$ be an abstract linear scalar product space, and complete $\Psi$ with respect to two topologies. The first topology is the standard norm topology, yielding a separable Hilbert space $\mathcal{H}$. The second topology $\tau_\Phi$ is defined by a countable set of norms
$$\|\phi\|_n = \sqrt{(\phi, \phi)_n}, \quad n = 0, 1, 2, \ldots \qquad (15)$$
where $\phi \in \Phi$ and the scalar product is given by
$$(\phi, \phi')_n = (\phi, (\Delta + 1)^n \phi'), \quad n = 0, 1, 2, \ldots \qquad (16)$$
where $\Delta$ is the Nelson operator $\Delta = \sum_i X_i^2$ [41]. The $X_i$ are operators representing the observables for the system in question and are the generators for the Nelson operator. Furthermore, the operator $\Delta + 1$ is a nuclear operator and ensures that $\Phi$ is a nuclear space (cf. [42, 39]). An operator is nuclear if it is linear, essentially self-adjoint, and its inverse is Hilbert-Schmidt. An operator $A^{-1}$ is Hilbert-Schmidt if $A^{-1} = \sum_i \lambda_i P_i$, where the $P_i$ are mutually orthogonal projection operators on a finite dimensional vector space and $\sum_i \lambda_i^2 < \infty$, the $\lambda_i$ denoting the eigenvalues of $A^{-1}$ [39]. We then have the Gelfand triplet of spaces
$$\Phi \subset \mathcal{H} \subset \Phi^{\times} \qquad (17)$$
where $\Phi^{\times}$ is the dual to the space $\Phi$.
The Nelson operator fully determines the choice of function space when it comes to choosing a realization of the space $\Phi$. However, there are many different inequivalent irreducible representations of an enveloping algebra of a Lie group used to generate a Nelson operator describing physical systems. Therefore, further restrictions on the choice of function space for a realization of $\Phi$ are required. The particular characteristics of the physical context of the system being modeled provide some of these restrictions, analogous to the situation for GNS constructions in the transition from C*- to W*-algebras in algebraic quantum mechanics [23]. Additional restrictions may be required due to the convergence properties desired for test functions in $\Phi$ and $\Phi^{\times}$.
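As a hedged illustration that is not taken from the text, the most familiar textbook realization of a Gelfand triplet uses the Schwartz space of rapidly decreasing smooth functions as the nuclear space. The choice of Nelson operator below (built from position $Q$ and momentum $P$) is the standard example for the Heisenberg algebra and is an assumption of this sketch; it is not the realization used for scattering systems.

```latex
\begin{gather*}
\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}^{\times}(\mathbb{R}),
\qquad
\Delta = P^2 + Q^2 = -\frac{d^2}{dx^2} + x^2 , \\
(\phi, \phi')_n = \bigl(\phi, (\Delta + 1)^n \phi'\bigr), \quad n = 0, 1, 2, \ldots , \\
\delta_a[\phi] = \phi(a) \quad \text{for all } \phi \in \mathcal{S}(\mathbb{R}).
\end{gather*}
```

In this realization the Dirac functional $\delta_a$ is continuous on $\mathcal{S}(\mathbb{R})$ and therefore belongs to the dual space, although it is not an element of $L^2(\mathbb{R})$. This is the sense in which the dual space contains distributions.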
Bohm and colleagues applied the RHS approach to intrinsic irreversibility in the context of scattering and decay phenomena [40, 43]. Antoniou and Prigogine [44] extended the approach to broader contexts. The core idea in both versions is that a unitary group operator $U_t = \exp(-iHt)$, $-\infty < t < \infty$, generated by a Hamiltonian $H$, under very general circumstances may be extended from $\mathcal{H}$ to $\Phi^{\times}$ (restricted to $\Phi$). For scattering processes, $\Phi$ is the intersection of the Hardy class functions with the Schwartz class functions. Because of continuity and completeness requirements, $U_t: \Phi^{\times} \to \Phi^{\times}$ ($U_t: \Phi \to \Phi$) can be extended to the upper half plane $\Phi^{\times}_{+}$ (restricted to $\Phi_{+}$) for positive times and to the lower half plane $\Phi^{\times}_{-}$ ($\Phi_{-}$) for negative times [43]. The extension of $U_t$ to $\Phi^{\times}$ (restriction to $\Phi$) forms two semigroups because the extension (restriction) cannot be defined for replacement of $t$ with $-t$. Thus semigroup evolution falls out of the analysis quite naturally in the RHS framework.
4.3 General Features of Intrinsic Irreversibility
In the intrinsic conception of irreversibility, states of a system are generically represented by distributions in a suitable state space, where pure states are δ functions. The trajectories of individual points either (1) are considered irrelevant because empirically inaccessible (as in the Λ-transformation approach) or (2) make minimal contributions to the collective behavior of the system when a sufficient number of Poincaré resonances are present (as in the RHS approach). For systems beyond a particular degree of complexity (K-systems, Poincaré resonances, etc.), the dynamics of the system is governed by irreversible evolution laws regardless of interactions with an environment.
While the Λ-transformation approach has only been applied to the baker map, the RHS approach has been applied to nonlinear maps, Friedrichs models, scattering experiments, and other decay phenomena. In the latter approach, exact Golden Rules for decay and survival probabilities and their rates can be derived in agreement with experimental observations [43].
Footnote d: The dual space $\Phi^{\times}$ is the space of linear functionals acting on elements of $\Phi$; its topology is induced by the choice of $\tau_\Phi$, and it includes distributions among its elements.
In both approaches, the transition from reversible to irreversible dynamical evolution laws is achieved by breaking the time-reversal symmetry in specific ways, leading to two semigroups. The time direction of the semigroups, however, is not given by either the Λ-transformation or the RHS approach. Physical considerations alone are insufficient to select the forward arrow, and one must appeal to consistency with experience, causality, or other criteria.
5 Summary and Open Questions
There are two basic points at which extrinsic and intrinsic notions of irreversibility coincide. The first is that both notions explicitly break the time-reversal symmetry of reversible dynamical laws. This is clearly the case for the standard external view, in which the transition from fundamental reversible laws to contextual irreversible laws corresponds to the transition from ontic states of closed systems to epistemic states of open systems. But even for the alternative intrinsic view, irreversibility is an emergent feature [45]. In the framework of the Λ-transformation, the time-reversal symmetry of K-systems is broken, leading to irreversible exact systems. In the RHS representation, a similar symmetry breaking is achieved by the transition from Hilbert space to the rigging spaces $\Phi$ and $\Phi^{\times}$.
The breaking of time-reversal symmetry always produces two semigroups, which can be endowed with opposite temporal directions. Selection criteria must be used to select one of these two directions for a preferred mode of description. In both extrinsic and intrinsic approaches, there is no such criterion available based on physical reasoning alone. The selection is based on extra-physical arguments such as causality, experience, and others. This second point of agreement between extrinsic and intrinsic irreversibility raises the interesting question of what conditions the proper direction of time has to satisfy. It could be argued that, up to the condition that it is the same for all physical systems, the selection is arbitrary.
There are two basic points at which extrinsic and intrinsic notions of irreversibility apparently differ. One of them concerns the role of the environment; the other has to do with the state concepts used in the two approaches. Briefly speaking, the role of the environment and the distinction of different state concepts are crucial in the standard framework of extrinsic irreversibility. The conceptual framework of the formalisms referring to intrinsic irreversibility neither (1) explicitly contains the concept of an environment nor (2) distinguishes between different state concepts.
These observations do not necessarily imply that intrinsic irreversibility really can dispense with points (1) and (2). It is likely that the two points play crucial roles even though they do not explicitly appear in the formalism and its usual interpretation.
The projection (factorization), which is the crucial part of a Λ-transformation, can be considered as the selection of an exact subsystem of the original K-system. Obviously, the Λ-transformation is not universal but context-dependent. Conceptually, the irreversible evolution of $\tilde\rho = \Lambda\rho$ due to $W_t$ could then be attributed to the restriction of the K-system to an exact subsystem. This might lead to interesting analogies with aspects of extrinsic irreversibility if the subsystem cannot be described as a closed subsystem. Concrete empirical applications of the Λ-transformation are not yet available. They would be necessary to check the significance of a physical environment, which is not explicit in the formalism.
Concerning the distinction between ontic and epistemic state concepts, it is clear that the approach of intrinsic irreversibility starts at the level of distributions rather than points. In the space of distributions, δ functions are special cases that could be related to points in a state space underlying the distribution space considered. In this way, a connection between distributions as epistemic states and points as ontic states is possible. The general claim in the Λ-transformation framework of intrinsic irreversibility, though, is that ontic states in the sense of phase points are meaningless or irrelevant since they are empirically inaccessible.
But is it justified to consider ontic states as generally irrelevant because they are empirically inaccessible? Reversible fundamental laws refer to ontic states, and it is not easy to formulate physics without them. The monographs by Ludwig [46], which consistently avoid any ontic elements, are an illustrative example. Moreover, special techniques to break symmetries often enable a unique derivation of irreversible contextual laws if the fundamental laws plus contexts are known. This also holds for the symmetry breaking used to derive intrinsic irreversibility from time-reversal invariant evolution in the Λ-transformation approach. The empirical inaccessibility of ontic states notwithstanding, one should therefore not dismiss their overall relevance too quickly.
In the RHS approach there is no contradiction with the formal arguments in the case of extrinsic irreversibility, insofar as the extension of $U_t$ from $\mathcal{H}$ into $\Phi^{\times}$ leads from reversibility to irreversibility. In this case, irreversibility is a feature arising during the transition from states in $\mathcal{H}$ to states whose state space is defined with respect to contexts. In the algebraic framework of Sec. 3, such contexts are reflected by a contextual topology on M. As mentioned in Sec. 4.2, physical contexts may not be known sufficiently well to determine $\Phi^{\times}$ uniquely. The physical examples used to demonstrate the significance of the RHS formulation (e.g., decay) suggest that a physical environment is inevitable, although this is not explicit in the formalism.
The relationship between ontic and epistemic states in the RHS approach is more subtle than in the Λ-transformation approach. As Petrosky and Prigogine argue [47, 48], the presence of a sufficient number of Poincaré resonances in so-called large Poincaré systems (LPS) rapidly converts the smooth, infinitely differentiable trajectories of the phase space points into random walks. Though the trajectories are not considered to be empirically inaccessible, their effects are limited to the formation of higher and higher orders of correlations as the dynamics evolves. The phase space points can represent ontic states, but the correlations also have an ontic status. Correlations very rapidly come to dominate the dynamics of all collective modes of behavior of LPS (e.g., the approach to equilibrium) as the correlations diffuse throughout the system. In this way, the effects of individual points and trajectories become irrelevant to the dynamics of the whole, and thus one can argue that the distribution description is an ontic description of the system's behavior.
In this way, the distinction between ontic and epistemic states might be a powerful conceptual tool even at the level of distributions alone. There is a conceptual difference between a probability distribution conceived as a distribution over an ensemble of individual pure states (as in the ^-statistical representation) and a probability distribution conceived as an individual whole. The latter concept is sometimes indicated in the context of intrinsic irreversibility and can be considered as an ontic version of the former (cf. the notion of relative onticity [16]). For instance, continuum mechanics requires a formulation which needs ontically interpreted, holistic distributions from the very beginning, since its description in terms of an ensemble of points would violate basic physical laws.
Among the adherents of intrinsic irreversibility, it is claimed that the holistic concept of a distribution as a whole entails predictions, e.g., related to the dynamics of correlations in large systems, which cannot be obtained with the concept of a probability distribution of individual pure states. This claim particularly refers to situations far from thermal equilibrium. Based on Gallavotti's approach, which describes systems far from equilibrium in terms of SRB-measures [49], i.e., in an ensemble description, this claim may become testable (see also [50] for a brief discussion).
After all, it is possible to view the intrinsic approach to irreversibility as emphasizing the relative importance of the advanced level of complexity of systems with nontrivial correlations over environmental effects. While extrinsic irreversibility addresses the importance of an environment, intrinsic irreversibility should not primarily be understood as focusing on the neglect of such an environment (e.g., the environment may be a necessary condition for the existence of the dynamics). Instead, it is perhaps more appropriate to understand intrinsic irreversibility as irreversibility intrinsic to the dynamics of a system, given a particular degree of its complexity.
Acknowledgments
Helpful comments by L. Accardi, L. Ballentine, H. Narnhofer, and I. Volovich during the discussion of this contribution at the conference are much appreciated. We are grateful to H. Primas for remarks on an earlier version of this paper.
References
1. J.H. Fetzer and R.F. Almeder, Glossary of Epistemology/Philosophy of Science (Paragon House, New York, 1993), p. 100f.
2. D. Howard, Space-time and separability: problems of identity and individuation in fundamental physics. In Potentiality, Entanglement, and Passion-at-a-Distance, ed. by R.S. Cohen, M. Horne, and J. Stachel (Kluwer, Dordrecht, 1997), pp. 113-141.
3. W. Heisenberg, Physics and Philosophy (Harper and Row, New York, 1958).
4. D. Bohm, Wholeness and the Implicate Order (Routledge and Kegan Paul, London, 1980).
5. B. d'Espagnat, Veiled Reality (Addison-Wesley, Reading, 1995).
6. H. Margenau, Reality in quantum mechanics. Phil. Science 16, 287-302 (1949), here p. 297.
7. K.R. Popper, The propensity interpretation of probability and quantum mechanics. In Observation and Interpretation in the Philosophy of Physics - With special reference to Quantum Mechanics, ed. by S. Körner in collaboration with M.H.L. Pryce (Constable, London, 1957), pp. 65-70. [Reprinted by Dover, New York, 1962.]
8. R. Harré, Is there a basic ontology for the physical sciences? Dialectica 51, 17-34 (1997).
9. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974), pp. 448-453, 504-507.
10. E. Scheibe, The Logical Analysis of Quantum Mechanics (Pergamon, Oxford, 1973), pp. 82-88.
11. H. Primas, Mathematical and philosophical questions in the theory of open and macroscopic quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 233-257.
12. H. Primas, Endo- and exotheories of matter. In Inside Versus Outside, ed. by H. Atmanspacher and G.J. Dalenoort (Springer, Berlin, 1994), pp. 163-193.
13. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932). English translation: Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).
14. A. Einstein, B. Podolsky, and N. Rosen, Can quantum-mechanical description of physical reality be considered complete? Phys. Rev. 47, 777-780 (1935).
15. H. Primas, Emergence in exact natural sciences. Acta Polytechnica Scandinavica Ma 91, 83-98 (1998). See also Primas, Chemistry, Quantum Mechanics, and Reductionism (Springer, Berlin, 1983), Chap. 6.
16. H. Atmanspacher and F. Kronz, Relative onticity. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 273-294.
17. H. Atmanspacher, Ontic and epistemic descriptions of chaotic systems. In Computing Anticipatory Systems: CASYS 99, ed. by D. Dubois (Springer, Berlin, 2000), pp. 465-478.
18. E. Fick and G. Sauermann, Quantenstatistik dynamischer Prozesse IIa: Antwort- und Relaxationstheorie (Harri Deutsch, Thun, 1986).
19. R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II (Springer, Berlin, 1985).
20. H. Primas, The Cartesian cut, the Heisenberg cut, and disentangled observers. In Symposia on the Foundations of Modern Physics. Wolfgang Pauli as a Philosopher, ed. by K.V. Laurikainen and C. Montonen (World Scientific, Singapore, 1993), pp. 245-269.
21. A. Amann, Structure, dynamics, and spectroscopy of single molecules: a challenge to quantum mechanics. J. Math. Chem. 18, 247-308 (1995).
22. A. Amann and H. Atmanspacher, Fluctuations in the dynamics of single quantum systems. Stud. Hist. Phil. Mod. Phys. 29, 151-182 (1998).
23. A. Amann and H. Atmanspacher, C*- and W*-algebras of observables, their interpretation, and the problem of measurement. In On Quanta, Mind and Matter. Hans Primas in Context, ed. by H. Atmanspacher, A. Amann, and U. Müller-Herold (Kluwer, Dordrecht, 1999), pp. 57-79.
24. H. Primas, Induced nonlinear time evolution of open quantum systems. In Sixty-Two Years of Uncertainty, ed. by A.I. Miller (Plenum, New York, 1990), pp. 259-280.
25. N. Wiener, The homogeneous chaos. Am. J. Math. 60, 897-936 (1938).
26. C.M. Lockhart and B. Misra, Irreversibility and measurement in quantum mechanics. Physica A 136, 47-76 (1986). Cf. H. Primas, Math. Rev. 87k:81006 (1987).
27. A. Lasota and M.C. Mackey, Chaos, Fractals, and Noise (Springer, Berlin, 1995).
28. V.A. Rokhlin, Exact endomorphisms of Lebesgue spaces. Izv. Akad. Nauk SSSR Ser. Mat. 25, 499-530 (1964). Transl. in Am. Math. Soc. Transl. 39, 1-36 (1964).
29. B. Misra, Nonequilibrium entropy, Lyapounov variables, and ergodic properties of classical systems. Proc. Ntl. Acad. Sci. USA 75, 1627-1631 (1978).
30. B. Misra, I. Prigogine, and M. Courbage, From deterministic dynamics to probabilistic descriptions. Physica A 98, 1-26 (1979).
31. A. Wightman, Review of Misra, Prigogine, and Courbage [30]. Math. Rev. 82e:58066 (1982).
32. Z. Suchanecki, On lambda and internal time operators. Physica A 187, 249-266 (1992).
33. H. Atmanspacher and H. Scheingraber, A fundamental link between system theory and statistical mechanics. Found. Phys. 17, 939-963 (1987).
34. H. Atmanspacher, Dynamical entropy in dynamical systems. In Time, Temporality, Now, ed. by H. Atmanspacher and E. Ruhnau (Springer, Berlin, 1997), pp. 325-344.
35. R.W. Batterman, Randomness and probability in dynamical theories: on the proposals of the Prigogine school. Philosophy of Science 58, 241-263 (1991).
36. I. Antoniou, K. Gustafson, and Z. Suchanecki, On the inverse problem of statistical physics: from irreversible semigroups to chaotic dynamics. Physica A 252, 345-361 (1998).
37. I.M. Gelfand and N.Ya. Vilenkin, Generalized Functions, Vol. 4 (Academic, New York, 1964). Russian original published 1961 in Moscow.
38. J.E. Roberts, The Dirac bra and ket formalism. Journal of Mathematical Physics 7, 1097-1104 (1966).
39. A. Bohm, Rigged Hilbert space and mathematical descriptions of physical systems. In Lectures in Theoretical Physics IX A: Mathematical Methods of Theoretical Physics, ed. by W.E. Brittin, A.O. Barut, and M. Guenin (Gordon and Breach, New York, 1967), pp. 255-317.
40. A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors, and Gelfand Triplets, Lecture Notes in Physics, Vol. 348, ed. by A. Bohm and J.D. Dollard (Springer, Berlin, 1989).
41. E. Nelson, Analytic vectors. Annals of Mathematics 70, 572-615 (1959).
42. F. Treves, Topological Vector Spaces, Distributions, and Kernels (Academic Press, New York, 1967).
43. A. Bohm, S. Maxson, M. Loewe, and M. Gadella, Quantum mechanical irreversibility. Physica A 236, 485-549 (1997).
44. I. Antoniou and I. Prigogine, Intrinsic irreversibility and integrability of dynamics. Physica A 192, 443-464 (1993).
45. T. Petrosky and I. Prigogine, The Liouville space extension of quantum mechanics. Adv. Chem. Phys. XCIX, 1-120 (1997), here p. 71.
46. G. Ludwig, Foundations of Quantum Mechanics, Vols. 1, 2 (Springer, Berlin, 1983, 1985).
47. T. Petrosky and I. Prigogine, Poincaré resonances and the extension of classical dynamics. Chaos, Solitons & Fractals 7, 441-497 (1996).
48. T. Petrosky and I. Prigogine, The extension of classical dynamics for unstable Hamiltonian systems. Computers & Mathematics with Applications 34, 1-44 (1997).
49. G. Gallavotti, Chaotic dynamics, fluctuations, nonequilibrium ensembles. CHAOS 8, 384-392 (1998).
50. D. Ruelle, Gaps and new ideas in our understanding of nonequilibrium. Physica A 263, 540-544 (1999).
INTERPRETATIONS OF PROBABILITY AND QUANTUM THEORY
L. E. BALLENTINE
Department of Physics, Simon Fraser University, Burnaby, B.C. V5A 1S6, Canada
e-mail: ballenti@sfu.ca
There is a peculiar similarity between Probability Theory and Quantum Mechanics: both subjects are mature and successful, yet both remain subject to controversy about their foundations and interpretation. I first present a classification of the various interpretations of probability, arguing that they should not be thought of as rivals, but rather as applications of a general theory to different kinds of subject matter. An axiom system that makes conditional probability the fundamental concept is put forward as being superior to Kolmogorov's axioms. I then discuss the relevance to quantum theory of the various interpretations of probability, the applicability of classical probability theory within quantum mechanics, and the relations between the interpretation of probability and the interpretation of quantum mechanics.
1 Introduction
There are many connections between Probability Theory and Quantum Mechanics, the most notable being that Quantum Mechanics uses Probability Theory in its fundamental interpretation, not merely as a technique. But I wish to concentrate on a more peculiar similarity. Although both subjects are mature and successful, both remain subject to controversy about their foundations and interpretation. There may be even more interpretations of probability than there are of quantum theory. Can one bring some degree of order to this subject?
Probability Theory, being a branch of mathematics, is defined by a set of axioms. So it can legitimately be applied to any entity that satisfies those axioms. Most of the interpretations of probability can be viewed as applications of the formal theory to different subject matters. It is therefore misguided to argue over which is the "correct" interpretation. Most of them are correct within their appropriate domain of application. But it is still reasonable to ask whether there is a general, overarching form of Probability Theory, of which all the various interpretations can be seen as special cases applied to special subject matters.
I shall propose such a classification of the various interpretations of probability. To do so, it is necessary to overlook small differences and to lump closely related interpretations into a few broad categories. I expect this classification to be controversial, but I believe that it is a step in the right direction. I shall consider only theories that are based on the same or equivalent sets of axioms. Hence generalizations such as negative probabilities are not included in this scheme, although I shall briefly refer to them later. After describing the major categories of interpretation of probability, I will discuss the relevance of each to quantum mechanics.
2 Interpretations of Probability
Many different interpretations of probability are examined in detail by T. L. Fine [1]. I propose to overlook many of the fine differences, and hence classify them into a few major groups, shown in Figure 1. References to most of the authors named in Fig. 1, and critical analyses of their ideas, are given by Fine [1].
2.1 The Theory of Inductive Inference
I propose that the Theory of Inductive Inference be taken as the master theory, and that all other interpretations be regarded as special cases, applicable in more restricted contexts. This point of view was expressed most completely by E. T. Jaynes in his book Probability Theory: The Logic of Science, which unfortunately was not completed during his lifetime.
Within this interpretation, probability is assigned to propositions. The notation $P(A|C)$ is to be read as "the probability of A under the condition C". Probability is regarded as a logical relation among propositions that is weaker than entailment. Inductive logic reduces to deductive logic in the limit of probability values 0 and 1. Probability is an objective relation, and should not be confused with degrees of belief.
The propositions to which probability is assigned may have any particular content. If we specialize to propositions about repeated experiments, we obtain the Ensemble-Frequency theory. If we specialize to propositions about personal belief, we obtain Subjective probability. If we specialize to propositions about indeterministic or unpredictable events, we obtain the Propensity theory.
Although $P(A|C)$ is a logical relation between proposition A and the conditioning information C, it is not merely a formal syntactic relation. The content (meaning) of A and C must be invoked to evaluate $P(A|C)$. There is no magic formula to translate arbitrary information into probabilities. Jaynes has given solutions to this problem in some important special cases (symmetry groups, marginalization), but there is as yet no general solution.
The Logic of Inductive Inference (E. T. Jaynes, R. T. Cox, H. Jeffreys): $P(A|C)$ is the probability that proposition A is true, given the information C.
Ensemble and Frequency (Kolmogorov, Bernoulli, von Mises): measure on a set; limit frequency in an ordered sequence.
Propensity (K. R. Popper): $P(A|C)$ is the propensity for event A to occur under the condition C.
Subjective and Personal (de Finetti, L. J. Savage, I. J. Good): incomplete knowledge; degrees of reasonable belief.
Figure 1: Classification of the interpretations of Probability.
2.2 Ensemble and Frequency Theories
One of the most common interpretations of probability is as a limit frequency in an ordered sequence. The ratio of the number n of occurrences of a particular type in a sequence of N events, n/N, is identified with the probability. This interpretation is useful in analyzing repeated experiments, but it has the difficulty that in a random sequence the ratio n/N need not have a limit. The ensemble interpretation is a generalization of the frequency interpretation, in which probability is identified with a measure on a set that need not be ordered. It is closely associated with Kolmogorov's axiom system, which will be discussed later.
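The ratio n/N can be watched in a small simulation. The sketch below is only an illustration under stated assumptions: the events are independent pseudo-random trials with a fixed success probability, and the function name, seed, and checkpoints are hypothetical choices.

```python
import random


def running_frequencies(p, checkpoints, seed=0):
    """Return {N: n/N} for a simulated sequence of independent trials.

    Each trial is counted as an occurrence with probability p. The seed is
    fixed so that the same sequence is produced on every run.
    """
    rng = random.Random(seed)
    occurrences = 0
    trials = 0
    ratios = {}
    for target in sorted(checkpoints):
        while trials < target:
            if rng.random() < p:
                occurrences += 1
            trials += 1
        ratios[target] = occurrences / target
    return ratios


freqs = running_frequencies(p=0.5, checkpoints=[10, 100, 1000, 10000, 100000])
for total, ratio in freqs.items():
    print(f"N = {total:6d}   n/N = {ratio:.4f}")
```

The ratio settles near p for large N in a typical run, but nothing in a finite record guarantees a limit, which is the difficulty noted above.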
2.3 Subjective Probability
Subjectivism has its place, and subjective probability provides an excellent way to describe degrees of reasonable belief. But in science, subjectivism can be like a virus, and we must guard against its infection. In general, the probability $P(A|C)$ expresses an objective relation between A and C, determined by the totality of the information C, and not by anyone's personal opinions. Jaynes tried to ensure objectivity through the pedagogical device of introducing a robot that is programmed to reason consistently, using only the information that is given to it. But even Jaynes sometimes slipped from objective to personal probabilities in his examples, without apparently being aware of doing so. Indeed, the contamination of Inductive Logic Probability by subjectivism may have been a major barrier to its acceptance.
2.4 Propensity
Propensity is a form of causality that is weaker than determinism [3, 4]. Generally speaking, probability expresses logical relations rather than causal relations. (Recall the old saying, "Correlation does not imply causality".) However, causality is a special kind of logical relation, and propensity theory deals with just that special case. The propensity interpretation of probability is natural in situations, such as those described by quantum mechanics, in which events cannot be predicted with certainty from their antecedents.
3 The Axioms of Probability
The axioms of probability theory can be given in several different forms; however, those given by R.T. Cox [5, 6] are particularly convenient.
Axiom 1: $0 \le P(A|B) \le 1$
Axiom 2: $P(A|A) = 1$
Axiom 3: $P(\neg A|B) = 1 - P(A|B)$
Axiom 4: $P(A \& B|C) = P(A|C)\, P(B|A \& C)$
Here the notation is as follows: $\neg A$ means "not A"; $A \& B$ means "A and B"; $A \vee B$ means "either A or B".
75
Axiom 2 states that the probability of a certainty (A given A) is one. Axiom 1 states that no probabilities are greater than the probability of a certainty. Axiom 3 expresses the notion that the probability of non-occurrence of an event increases as the probability of its occurrence decreases. It also implies $P(\neg A|A) = 0$: an impossibility (not A given A) has zero probability. Axiom 4 is the least intuitive. The probability of both A and B (under some condition C) is equal to the probability of A multiplied by the probability of B given A.
The probabilities of negation ($\neg A$) and conjunction ($A \& B$) each require an axiom. However, no further axioms are required to treat disjunction, because $A \vee B = \neg(\neg A \& \neg B)$; in words, "A or B" is equivalent to the negation of "neither A nor B". This allows us to deduce a theorem:
$P(A \vee B|C) = P(A|C) + P(B|C) - P(A \& B|C)$ (1)
If A and B are mutually exclusive, then we obtain
$P(A \vee B|C) = P(A|C) + P(B|C)$ (2)
which is often taken to be an axiom, and may be used in place of Axiom 3.
Several remarks about these axioms are in order. First, the notion of randomness plays no fundamental role in the theory. Hence we need not enquire whether our variables and events are "random" as a prerequisite to applying probability theory.
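The deduction of theorem (1) is not spelled out in the text. One possible route, given here only as a sketch, uses Axioms 3 and 4 together with the assumption that conjunction is commutative ($\neg A \& B$ is the same proposition as $B \& \neg A$):

```latex
\begin{align*}
P(A \vee B|C) &= 1 - P(\neg A \& \neg B|C)                  && \text{(definition of } \vee \text{, Axiom 3)} \\
              &= 1 - P(\neg A|C)\,P(\neg B|\neg A \& C)     && \text{(Axiom 4)} \\
              &= 1 - P(\neg A|C)\,[1 - P(B|\neg A \& C)]    && \text{(Axiom 3)} \\
              &= P(A|C) + P(\neg A \& B|C)                  && \text{(Axioms 3 and 4)} \\
              &= P(A|C) + P(B|C)\,[1 - P(A|B \& C)]         && \text{(Axioms 4 and 3)} \\
              &= P(A|C) + P(B|C) - P(A \& B|C)              && \text{(Axiom 4)}
\end{align*}
```

If A and B are mutually exclusive, the last term vanishes and Eq. (2) results.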
Second, these axioms are not arbitrary. They are uniquely determined (apart from formal changes that do not affect the content) by conditions of plausibility and consistency (see Cox [5] and Jaynes [2]):
(i) The probability of A on some given evidence determines also the probability of "not A" on the same evidence.
(ii) The probability on given evidence that both A and B are true is determined by their separate probabilities, one on the given evidence, and the other on that evidence plus the assumption that the first is true.
(iii) If a complex proposition can be composed in more than one way [e.g., $(A \& B) \& C$ or $A \& (B \& C)$], then all ways of computing its probability must lead to the same answer.
Notice that in (i) and (ii) only the existence of certain connections is assumed, but not their mathematical form. The consistency condition (iii) then leads to the mathematical forms of the axioms. Therefore anyone who proposes an inequivalent alternative to Cox's axioms (such as allowing negative probabilities) has an obligation to explain how and why he departs from these conditions of plausibility and consistency.
Finally, a very important remark: All probabilities are conditional.
The use of the single-variable notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information C is obvious from the context and is unchanging throughout the problem. Many fallacies and paradoxes follow from ignoring this principle.
3.1 Kolmogorov's axioms
If the fundamental axioms that define Probability Theory are those given above, then what is the status of Kolmogorov's well-known axioms? According to Kolmogorov's axioms, probability is assigned to subsets of a universal set $\Omega$, with the following rules:
(1) $P(\Omega) = 1$.
(2) $P(f) \ge 0$ for any $f$ in $\Omega$.
(3) If $f_1, \ldots, f_n$ are disjoint, then $P(f) = \sum_j P(f_j)$, where $f$ is the union of $f_1, \ldots, f_n$.
(4) If $f \to \emptyset$ (the empty set), then $P(f) \to 0$.
The answer, I believe, is that Kolmogorov's axioms provide a mathematical model of probability theory (defined by Cox's axioms) on the theory of measurable sets. A mathematical model is useful because it reduces the consistency of one theory to that of another. (A familiar example is the algebra of complex numbers, which can be modeled by the algebra of ordered pairs of reals.) Thus any doubts about the consistency of Probability Theory may be laid to rest because of the existence of Kolmogorov's model.
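The sense in which a set-theoretic measure "models" Cox's axioms can be checked directly on a small case. The following is a minimal sketch, not taken from the text: it assumes a hypothetical four-element universal set with hypothetical weights, models $P(A|C)$ as measure(A ∩ C)/measure(C), and verifies Axioms 1-4 exhaustively wherever the conditioning set has nonzero measure.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical universal set and weights, chosen only for illustration.
OMEGA = frozenset(range(4))
WEIGHT = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}


def measure(s):
    """Additive measure of a subset of OMEGA (Kolmogorov's rules 1-3)."""
    return sum((WEIGHT[w] for w in s), Fraction(0))


def prob(a, c):
    """Conditional probability P(A|C), modeled as measure(A & C) / measure(C)."""
    return measure(a & c) / measure(c)


def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(k) for r in range(len(items) + 1)
            for k in combinations(items, r)]


def check_cox_axioms():
    """Check Cox's Axioms 1-4 for all events and all nonzero-measure conditions."""
    events = subsets(OMEGA)
    conditions = [c for c in events if measure(c) > 0]
    for c in conditions:
        assert prob(c, c) == 1                                    # Axiom 2
        for a in events:
            assert 0 <= prob(a, c) <= 1                           # Axiom 1
            assert prob(OMEGA - a, c) == 1 - prob(a, c)           # Axiom 3
            if measure(a & c) == 0:
                continue  # P(B|A&C) is undefined in this set model
            for b in events:
                assert prob(a & b, c) == prob(a, c) * prob(b, a & c)  # Axiom 4
    return True
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are tested without rounding error. Note that the set model leaves $P(B|A \& C)$ undefined when $A \& C$ has measure zero, which is one place where conditional probability is visibly secondary in this formulation.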
There are several objections to taking Kolmogorov's axioms as a foundation for Probability Theory, rather than merely as a model:
• The universal set $\Omega$ is often fictitious. The propositions to which probabilities are assigned are not subsets of a set.
• Conditional probability is relegated to secondary status, while the mathematical fiction of "absolute probability" is made primary.
• Probability theory and Measure theory are distinct subjects. The interesting problems of one are not closely related to the interesting problems of the other. For example, measure theory deals mostly with infinite sets, culminating with the construction of non-measurable sets, which have no probabilistic interpretation. But in probability theory one seldom needs to consider an infinite number of conjunctions and disjunctions. On the other hand, the important problem of translating qualitative information into probabilities has no measure-theoretic analog.
4 Probability in Quantum Mechanics
4.1 Relevant and Irrelevant Interpretations of Probability
Which of the interpretations of probability are relevant to quantum mechanics? The ensemble-frequency interpretation is obviously relevant, and widely used in discussing the statistics of repeated experiments on similarly prepared states. Indeed, the standard description of an idealized experiment is: (1) prepare a state; (2) measure an observable of the system; (3) repeat the previous two steps until sufficient statistical data has been accumulated; (4) compare the relative frequencies of this data with the probabilities predicted by quantum theory.
The propensity interpretation is in accord with the ensemble-frequency interpretation whenever it is applied to repeated experiments, but it also allows one to make meaningful statements about individual events. The propensity interpretation is more natural when one considers time-dependent states, and hence time-dependent probabilities. Consider the following examples.
(i) A source produces $s = 1/2$ particles polarized at an angle $\phi$ relative to some coordinate axis. A Stern-Gerlach magnet has its field gradient axis oriented at an angle $\theta$. What is the probability that such a particle, incident on the apparatus, will emerge with spin up?
The formal answer is, of course, $p = \{\cos[(\theta - \phi)/2]\}^2$, but what does this mean?
According to the propensity interpretation it means: The propensity (chance) of the particle emerging with spin up is $p$.
According to the ensemble-frequency interpretation it means: In a long run of similar experiments, the fraction of particles emerging with spin up will be (approximately) $p$.
(ii) Now let the magnet be re-oriented in some arbitrary manner before each particle is released, so that $\theta$ is different in each case.
According to the propensity interpretation we say nearly the same thing: The propensity (chance) has a different value $p = p_\theta$ in each case.
But in the ensemble-frequency interpretation one must conceptually embed each event in an imaginary long run of experiments having the same value of $\theta$ in order to make a frequency statement.
(iii) Suppose next that the polarization direction $\phi$ of the particles is unknown. Can it be inferred from the data of (ii)?
In the ensemble-frequency interpretation the answer would appear to be No. A long run of events for each value of $\theta$ would be necessary to estimate $p_\theta$ as a frequency, and hence to determine its dependence on $\theta$.
In the propensity interpretation the answer is Yes. Bayesian inference (equivalent to maximum likelihood if the prior probability distribution for $\phi$ is uniform) can determine the most probable value of $\phi$, even if there is only one event for each value of $\theta$.
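The inference described in example (iii) can be sketched numerically. The code below is an illustrative assumption-laden sketch, not the author's procedure: it simulates one event per randomly chosen magnet angle $\theta$ using $p_\theta = \cos^2[(\theta - \phi)/2]$, then recovers $\phi$ by maximizing the likelihood over a uniform grid (which coincides with the most probable value under a flat prior). The function names, grid size, and sample sizes are all hypothetical.

```python
import math
import random


def spin_up_probability(theta, phi):
    """Propensity for spin up: p = cos^2((theta - phi) / 2)."""
    return math.cos((theta - phi) / 2.0) ** 2


def simulate_events(phi_true, n_events, rng):
    """One event per magnet setting: returns a list of (theta, spin_up) pairs."""
    events = []
    for _ in range(n_events):
        theta = rng.uniform(0.0, 2.0 * math.pi)  # magnet re-oriented before each particle
        up = rng.random() < spin_up_probability(theta, phi_true)
        events.append((theta, up))
    return events


def estimate_phi(events, grid_size=360):
    """Maximum-likelihood estimate of phi on a uniform grid over [0, 2*pi)."""
    eps = 1e-12  # keeps log() finite when the model probability is exactly 0 or 1
    best_phi, best_loglik = 0.0, -math.inf
    for k in range(grid_size):
        phi = 2.0 * math.pi * k / grid_size
        loglik = 0.0
        for theta, up in events:
            p = min(max(spin_up_probability(theta, phi), eps), 1.0 - eps)
            loglik += math.log(p if up else 1.0 - p)
        if loglik > best_loglik:
            best_phi, best_loglik = phi, loglik
    return best_phi


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)
```

A possible usage: `estimate_phi(simulate_events(1.0, 1000, random.Random(12345)))` should return a value close to 1.0 radian, even though no value of $\theta$ is ever repeated, so no frequency is ever estimated for any single setting.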
I have never seen a coherent exposition of QM based on a subjective interpretation of quantum probabilities as representing knowledge. This point (which has also been argued at length by Popper [8]) is worth emphasizing, because the interpretation of probabilities as knowledge seems to be a tenet of the Copenhagen interpretation.
Two persons (with limited knowledge of QM) might have different reasonable beliefs about the position of the electron in the hydrogen atom, and those beliefs could be represented by subjective probabilities. But such ignorance probabilities have nothing to do with $|\psi(x)|^2$ from the Schroedinger equation. $|\psi(x)|^2$ is an objective propensity, not a subjective degree of belief.
The so-called Uncertainty principle, $\Delta x \, \Delta p \ge \hbar/2$, has nothing to do with subjective knowledge or ignorance. Its meaning is that in any physical preparation of a state, the values of x and p will not be reproducible, the widths of their distributions being related by the inequality. The widths $\Delta x$ and $\Delta p$ are objective, predictable, and measurable parameters, which should not be called "uncertainties". Indeed, the name Indeterminacy principle is preferable to Uncertainty principle.^a
Subjective probabilities can occur in the information games that are played in quantum communication theory. Consider a typical example.
Bob prepares some quantum state but keeps it secret. He tells Alice only that it is one of four (usually nonorthogonal) possible states, and she must try to infer what the hidden state is from a measurement. Alice's incomplete knowledge of that hidden state can be expressed as a subjective probability. Suppose also that Bob tells Carol that the unknown state is one of three possibilities. Carol's knowledge is different from Alice's, and hence her subjective probability will be different. But both of these subjective knowledge probabilities are quite distinct from the objective quantum probabilities (propensities) that would be calculated by solving Schroedinger's equation for Bob's state preparation apparatus.
^a When I once heard Heisenberg speak (about 1964), he used the term Indeterminacy principle. In his early writings, he used the words Ungenauigkeit (inexactness), Unbestimmtheit (indeterminacy), and Unsicherheit (uncertainty), with various shades of meaning.
I suspect that the subjective knowledge interpretation of QM probabilities came about by accident: the founders of QM may have believed (erroneously) that probability can only be a measure of knowledge/ignorance. Max Born has written that Heisenberg did not know what a matrix was when he was inventing what later became known as matrix mechanics. It is, therefore, not very radical to suppose that the founders of quantum mechanics had an inadequate understanding of probability.
4.2 Fallacies in the use of Probability
Unsound arguments to the effect that classical probability theory does not apply to QM are woefully common. Before examining an actual argument to that effect, let us first consider a simple classical paradox.
The Bookie's Paradox: A bookie needs to fix the odds on a star track runner, who has a 60% chance of winning any race that he enters. There is a race in Paris and a race in Tokyo scheduled on the same day, so he cannot enter both, and we do not know which he will enter. What is the probability that he will win at least one of these races?
Let A = (winning in Paris), and let B = (winning in Tokyo). Clearly A and B are mutually exclusive events, so $P(A \vee B) = P(A) + P(B)$. The probability of his winning at least one race is $0.6 + 0.6 = 1.2$. But this is absurd, since $1.2 > 1$!
The paradox is resolved by taking account of a principle that was noted in Sec. 3:
All probabilities are conditional. The notation $P(A)$ instead of $P(A|C)$ is permissible only if the conditional information C is obvious from the context and unchanging throughout the problem.
Let us, therefore, be more precise about the conditions involved. Let $E_P$ = (entering in Paris), and let $E_T$ = (entering in Tokyo). Then clearly we have
$P(A|E_P) = 0.6$, $P(B|E_P) = 0$, $P(A|E_T) = 0$, $P(B|E_T) = 0.6$.
Additivity, $P(A \vee B|C) = P(A|C) + P(B|C)$, holds for the same condition C in all terms. But $P(A|E_P)$ and $P(B|E_T)$ are not additive by any valid rule, so the absurd conclusion reached above followed only from an erroneous application of probability theory.
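To see what a correct calculation looks like, suppose, purely for illustration, that the background information C assigns a probability $q$ to his entering in Paris and $1 - q$ to his entering in Tokyo. The value $q$ is a hypothetical parameter, not something given in the problem. With a single condition C in every term, additivity then gives a sensible result:

```latex
\begin{align*}
P(A|C) &= P(A|E_P \& C)\,P(E_P|C) + P(A|E_T \& C)\,P(E_T|C) = 0.6\,q , \\
P(B|C) &= P(B|E_P \& C)\,P(E_P|C) + P(B|E_T \& C)\,P(E_T|C) = 0.6\,(1 - q) , \\
P(A \vee B|C) &= P(A|C) + P(B|C) = 0.6\,q + 0.6\,(1 - q) = 0.6 .
\end{align*}
```

The answer is 0.6 whatever the value of $q$, as one would expect for a runner who wins 60% of the races he enters and enters exactly one.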
Double-slit Fallacy: A common fallacy about the 2-slit experiment is of exactly the same form. The experiment consists of three parts.
(a) Open slit 1, close slit 2. The probability of a particle arriving at the point X on the screen is $P_1(X)$.
(b) Open slit 2, close slit 1. The probability of a particle arriving at X is now $P_2(X)$.
(c) Open both slits 1 and 2. The probability of a particle arriving at X is $P_{12}(X)$.
Now passage through slit 1 and through slit 2 are mutually exclusive, so we deduce
$P_{12}(X) = P_1(X) + P_2(X)$, which is empirically false. It is then concluded (fallaciously) that classical probability theory does not apply in quantum mechanics.
The above reasoning embodies essentially the same fallacy as does the Bookie's paradox, and it is resolved similarly by paying proper attention to the conditional nature of the probabilities.
Let condition $C_1$ = (slit 1 open, slit 2 closed). Let $C_2$ = (slit 2 open, slit 1 closed). Let $C_3$ = (both slits open).
We observe empirically that $P(X|C_1) + P(X|C_2) \ne P(X|C_3)$
(due, of course, to interference). But this fact is fully compatible with classical probability theory.
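A toy numerical model makes the point concrete. The sketch below is an assumption for illustration only (a common Gaussian envelope with opposite phase tilts for the two slits; the parameters `k`, `sigma`, and the grid are hypothetical). It computes the three arrival distributions, one per condition, on a common scale chosen so that arrivals with both slits open sum to one.

```python
import cmath
import math


def screen_points(n=1001, half_width=5.0):
    """Evenly spaced screen positions X in [-half_width, half_width]."""
    return [-half_width + 2.0 * half_width * i / (n - 1) for i in range(n)]


def amplitude(x, slit, k=5.0, sigma=1.0):
    """Toy amplitude at screen position x for passage through one slit (hypothetical model)."""
    sign = 1.0 if slit == 1 else -1.0
    return math.exp(-x * x / (4.0 * sigma ** 2)) * cmath.exp(1j * sign * k * x)


def arrival_probabilities(xs, open_slits, norm):
    """P(X|C): arrival probability at each X, given which slits are open (condition C)."""
    return [abs(sum(amplitude(x, s) for s in open_slits)) ** 2 / norm for x in xs]


xs = screen_points()
# Common scale: arrivals with both slits open sum to one.
norm = sum(abs(amplitude(x, 1) + amplitude(x, 2)) ** 2 for x in xs)

p_c1 = arrival_probabilities(xs, (1,), norm)     # C1: slit 1 open, slit 2 closed
p_c2 = arrival_probabilities(xs, (2,), norm)     # C2: slit 2 open, slit 1 closed
p_c3 = arrival_probabilities(xs, (1, 2), norm)   # C3: both slits open

# Pointwise interference term: P(X|C3) - P(X|C1) - P(X|C2).
interference = [c3 - c1 - c2 for c1, c2, c3 in zip(p_c1, p_c2, p_c3)]
```

Each of the three lists is non-negative and bounded by one, so each is a legitimate probability assignment under its own condition. The nonzero `interference` values show only that three different conditions yield three different distributions; no axiom relates $P(X|C_1)$, $P(X|C_2)$, and $P(X|C_3)$ additively.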
4.3 Quantum Probabilities
Quantum probabilities are not essentially different from classical probabilities, but, like quantum theory itself, they do require some care in their interpretation. H. Jeffreys [7] remarked that the probability statements of quantum mechanics are incomplete because a probability is always relative to a set of data, and the data are not specified. In our terminology, Jeffreys is saying that all probabilities are conditional, and the conditions need to be specified to make the probability statement meaningful. This can be accomplished through a propensity interpretation of quantum probabilities, with proper attention being given to the basic concepts of measurement and state preparation. When that is done, it can be demonstrated [9, 10] that quantum probabilities obey all of the axioms of classical probability theory. The demonstration is straightforward but too lengthy to review here, so I shall only remark on some conceptual points.
(a) The standard formula $P(A = a_n|\Psi) = |\langle a_n|\Psi\rangle|^2$, where $A|a_n\rangle = a_n|a_n\rangle$, should be read as:
The probability (propensity) for a measurement of the dynamical variable A to yield the value $a_n$, conditional on the preparation of the state $\Psi$, is $|\langle a_n|\Psi\rangle|^2$.
Note that the propensity is conditioned by the physical process of state preparation, and not by anyone's beliefs or opinions.
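As a small worked instance of this formula, the sketch below evaluates $|\langle a_n|\Psi\rangle|^2$ for the Stern-Gerlach example of Sec. 4.1. It assumes, for simplicity, that both the polarization direction and the magnet axis lie in the x-z plane, so that the two-component amplitudes can be taken real; the function names are hypothetical.

```python
import math


def spin_state(angle):
    """Spin-1/2 state polarized at `angle` in the x-z plane, as (up_z, down_z) amplitudes."""
    return (math.cos(angle / 2.0), math.sin(angle / 2.0))


def inner(u, v):
    """Inner product <u|v> for real two-component amplitudes."""
    return u[0] * v[0] + u[1] * v[1]


def born_probabilities(theta, phi):
    """P(A = a_n | Psi) = |<a_n|Psi>|^2 for spin measured along theta, state prepared along phi."""
    psi = spin_state(phi)                 # the prepared state Psi
    up = spin_state(theta)                # eigenstate for "spin up along theta"
    down = spin_state(theta + math.pi)    # orthogonal eigenstate, "spin down along theta"
    return inner(up, psi) ** 2, inner(down, psi) ** 2
```

For example, `born_probabilities(1.1, 0.3)` returns a spin-up probability equal to $\cos^2(0.4)$, in agreement with $p = \{\cos[(\theta - \phi)/2]\}^2$, and the two probabilities sum to one. The condition $\Psi$ enters only through the preparation angle $\phi$.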
(b) One can also calculate the probability of a measurement result conditioned by state preparation and the results of other measurements:
$P(B = b_m|(A = a_n) \& \Psi)$.
However, it is necessary that the measurement processes be described dynamically as an interaction between the object and the apparatus. Simplistic application of the Projection Postulate is liable to give an incorrect answer [11].
(c) No difficulties of principle arise if the probabilities are conditioned on actual events of state preparation and measurement. But assigning probabilities to hypothetical unmeasured values is not always possible. This problem is encountered if we try to introduce joint probability distributions for (unmeasured values of) non-commuting observables, and require the marginal distributions to agree with the quantum probabilities of the individual observables.
In the case of position and momentum, we would like to have a joint distribution $P(x,p)$ that satisfies
$P(x,p) \ge 0$ (3)
$\int P(x,p)\,dp = |\langle x|\Psi\rangle|^2$ (4)
$\int P(x,p)\,dx = |\langle p|\Psi\rangle|^2$ (5)
There are infinitely many solutions to this problem [12], but there is no apparent physical reason for any one of them to be preferred.
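The simplest member of this family, shown here only as an illustration, is the uncorrelated product of the two marginals. For a normalized state $\Psi$ it satisfies conditions (3)-(5) by inspection:

```latex
\begin{align*}
P(x,p) &= |\langle x|\Psi\rangle|^2 \, |\langle p|\Psi\rangle|^2 \;\ge\; 0 , \\
\int P(x,p)\,dp &= |\langle x|\Psi\rangle|^2 \int |\langle p|\Psi\rangle|^2\,dp = |\langle x|\Psi\rangle|^2 , \\
\int P(x,p)\,dx &= |\langle p|\Psi\rangle|^2 \int |\langle x|\Psi\rangle|^2\,dx = |\langle p|\Psi\rangle|^2 .
\end{align*}
```

Nothing physical singles out this product form over the other solutions, which is exactly the difficulty noted above.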
However, in the case of angular momentum, where we might seek a joint distribution $P(J_x, J_y, J_z)$ for the three angular momentum components, it is not difficult to show that no such function can yield the quantum probabilities of the three components as marginals. However, this has more to do with Kochen-Specker [13] difficulties (the impossibility of assigning values to all quantum observables consistent with all the relevant constraints) than with probability theory. There is no case in which a quantum probability is well defined but violates an axiom of classical probability theory.
5 Conclusions
In this paper I have suggested a scheme whereby all the major interpretations of probability are unified, with the separate interpretations now seen as applications of the general theory to particular subject matters. That such different ideas as ensemble-frequency theories, propensity theory, and subjective degrees of reasonable belief can all be encompassed within a single framework is both useful and surprising. Because they can all be described by the same mathematical axioms, it is easy to switch from one kind of probability to another, as may be appropriate in a particular problem. But, on the other hand, one can ask why such different things as frequencies, propensities, and degrees of belief should necessarily obey the same axiom system. This question should stimulate further foundational research.
For the case of degrees of reasonable belief, this work has already been completed by Cox [5, 6], who showed that certain conditions of plausibility and consistency determine the axioms essentially uniquely. "Essentially unique" means subject only to formal transformations that do not alter the content of the theory. Therefore any alternative inequivalent system of plausible reasoning could be shown to suffer from some degree of inconsistency.
Khrennikov [14] has studied limit frequencies outside of any theory of probability, imposing only a condition of stabilization: that in a long sequence the frequencies should approach a limit. He has found many different cases to be possible, some of which lie outside of probability theory. It will be interesting to see whether these new logical possibilities are realized in nature. If not, then his stabilization condition will have to be supplemented by other conditions.
The greatest need for more foundational research is in the case of propensity. Although it clearly can be described by the axioms of probability theory, it is not yet clear why it must be so described.
Although I have dealt only with versions of probability theory that are derivable from the same axioms, I expect that the classification of interpretations (Fig. 1) may also be useful for generalized theories, such as those that admit negative probabilities [15]. For such generalizations we should ask: which of the interpretations do they support? Can such generalized probabilities be interpreted as frequencies? As propensities? As degrees of belief? Or must they be given some entirely new interpretation?
There are connections between the interpretations of probability and of quantum mechanics. This must be so because quantum mechanics does not predict events, but only the probabilities of events. If one adheres exclusively to a frequency interpretation of probability, then one is bound to assert that a quantum state describes only an ensemble of similarly prepared systems. If, on the other hand, one adopts a propensity interpretation of probability, then it becomes possible to make meaningful probability statements about an individual system. However, the empirically testable content of those statements can be realized only by measurements on an ensemble of similarly prepared systems. Thus the frequency interpretation is not made obsolete by the propensity interpretation, but merely broadened. The subjective interpretation of probability can be used in some situations, such as when the observer is not fully informed about the state preparation procedure. But it is never correct to interpret $|\psi|^2$ as representing knowledge (except, perhaps, in the trivial case in which the observer's knowledge is complete and in perfect accord with reality).
References
1. T. L. Fine, Theories of Probability: An Examination of Foundations (Academic Press, New York, 1973).
2. E. T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, forthcoming); an incomplete version of this work is available electronically at http://bayes.wustl.edu
3. K. R. Popper, in Observation and Interpretation, ed. S. Körner (Butterworths, London, 1957).
4. K. R. Popper, Realism and the Aim of Science (Hutchinson, London, 1983).
5. R. T. Cox, The Algebra of Probable Inference (Johns Hopkins University Press, Baltimore, MD, 1961).
6. R. T. Cox, Am. J. Phys. 14, 1 (1946).
7. H. Jeffreys, Scientific Inference (Cambridge University Press, Cambridge, 1973), sec 1031.
8. K. R. Popper, Quantum Theory and the Schism in Physics (Hutchinson, London, 1982).
9. L. E. Ballentine, Quantum Mechanics: A Modern Development (World Scientific, Singapore, 1998), Ch. 1.5, 2.4, 9.6.
10. L. E. Ballentine, Am. J. Phys. 54, 883 (1986).
11. L. E. Ballentine, Found. Phys. 20, 1329 (1990).
12. L. Cohen, in Frontiers of Nonequilibrium Statistical Physics, ed. G. T. Moore and M. O. Scully (Plenum, New York, 1986), pp. 97-117.
13. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
14. A. Khrennikov, Nonconventional approach to elements of physical reality based on nonreal asymptotics of relative frequencies, Proc. Conf. Foundations of Probability and Physics, Vaxjo-2000 (WSP, Singapore, 2001).
15. A. Khrennikov, Interpretations of Probability (VSP, Utrecht, 1999).
FORCING DISCRETIZATION AND DETERMINATION IN QUANTUM HISTORY THEORIES
BOB COECKE
Imperial College of Science, Technology & Medicine, Theoretical Physics Group, The Blackett Laboratory, South Kensington, London SW7 2BZ
and
Free University of Brussels, Department of Mathematics, Pleinlaan 2, B-1050 Brussels
E-mail: bocoecke@vub.ac.be
We present a formally deterministic representation for quantum history theories, where we obtain the probabilistic structure via a discrete contextual variable: no continuous probabilities are as such involved at the primal level.
1 Introduction
In this paper we propose and study a model for history theories in which the probability structure emerges from a finite number of contextual happenings, any next happening having a fixed chance to occur under the condition that the previous one happened. Although this model cannot have a canonical mathematical status, since it has been proved that this type of representation in general admits no essentially unique smallest one [8, 11], it provides insight in the emergence of logicality in the History Projection Operator setting [14], and it illustrates how deterministic behavior can be encoded beyond those interpretations of quantum history theories that are interpretationally restricted by so-called consistency or quasi-consistency (e.g., approximate decoherence). The particular motivation for this paradigm case study finds its origin in structural considerations towards a theory of quantum gravity [4, 15, 19]. As argued in [16], although the relative frequency interpretation of probability justifies the continuous interval as the codomain for value assignment, in the quantum gravity regime standard ideas of space and time might break down in such a way that the idea of spatial or temporal ensembles is inappropriate. For the other main interpretations of probability (subjective, logical, or propensity) there seems to be no compelling a priori reason why probabilities should be real numbers. Our model should be envisioned as a deconstructive step, unraveling the probabilistic continuum as it appears in standard quantum theory and reducing it explicitly to a discrete temporal sequence of (contextual) events. The as such emerging temporal sequence is then easier to manipulate towards alternative encoding of contextual events, e.g., in propositional terms. It also enables a separate treatment of internal (the system's) and external (the context's) time-encoding variable.
Although quantum history theories are currently most frequently envisioned in a context of so-called decoherence, we prefer to take the minimal perspective that a history theory is a theory that deals with sequential quantum measurements but remains essentially a dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator approach [14]. We also mention recently studied sequential structures in the context of quantum logic, of which references can be found in [10], resulting in a dynamic disjunctive quantum logic, which provides an appropriate formal context to discuss the logicality of history theories.
A general theory on deterministic contextual models can be found in [8]. Note here that what we consider as contextuality is that in a measurement there is an interaction between the system and its context, and that precisely this interaction to some extent may influence the outcome of a measurement. A lack of knowledge on the precise interaction then yields quantum-type uncertainties. Besides this interpretational issue, classical representations are important since we think classically; so even without giving any conceptual significance to the representation, it provides a mode to think deterministically in terms of determined trajectories of the system's state, without having to reconcile with concrete non-canonical constructs like pilot-wave mechanics.
2 Outcome determination via contextual models
We will present the required results in full abstraction, such that the reader clearly sees which structural ingredient of quantum theory determines existence of contextual models. For details and proofs we refer to [8]. Let $\mathcal{B}(\mathbb{R}^N)$ denote the Borel subsets of $\mathbb{R}^N$.
Definition 1. A probabilistic measurement system is given by: (i) a set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) for each $e \in \mathcal{E}$ an outcome set $O_e \in \mathcal{B}(\mathbb{R}^N)$, a $\sigma$-field $\mathcal{B}(O_e)$ of $O_e$-subsets, and (Kolmogorovian) probability measures $P_{p,e}: \mathcal{B}(O_e) \to [0,1]$ for each $p \in \Sigma$.
The canonical example is that of quantum theory, with every Hilbert space ray $\psi$ representing a state, every self-adjoint operator $H$ representing a measurement with its spectrum $O_H \subset \mathbb{R}$ as outcome set, where the $\sigma$-structure $\mathcal{B}(O_H)$ is inherited from that of $\mathcal{B}(\mathbb{R})$, and with probability measures $P_{\psi,H}(E) := \langle \psi | P_E \psi \rangle$, where $P_E$ denotes the spectral projector for $E \in \mathcal{B}(O_H)$. In benefit of insight and also for notational convenience, we will from now on assume that the measurements $e \in \mathcal{E}$ are represented in a one-to-one way by their outcome sets $O_e$. Note that whenever $\mathcal{E}$ can be represented by points of $\mathbb{R}^M$, it then suffices to consider $\mathbb{R}^N \times \mathbb{R}^M = \mathbb{R}^{N+M}$ instead of $\mathbb{R}^N$ to fulfill this assumption, taking $O_e \times \{e\}$ as the corresponding outcome set. We stress however that the results listed below also hold in absence of this assumption [8, §1].
Definition 2. A pre-probabilistic hidden measurement system is given by: (i) a set of states $\Sigma$ and a set of measurements $\mathcal{E}$; (ii) sets $\mathcal{O} \subseteq \mathcal{B}(\mathbb{R}^N)$ and $\Lambda$ that parameterize $\mathcal{E}$, i.e., $\mathcal{E} = \{e_{\lambda,O} \mid \lambda \in \Lambda, O \in \mathcal{O}\}$, and each $e_{\lambda,O} \in \mathcal{E}$ goes equipped with a map $\varphi_{\lambda,O}: \Sigma \to O$.
We can represent $\{\varphi_{\lambda,O} \mid \lambda \in \Lambda\}$ as $\varphi_O: \Sigma \times \Lambda \to O: (p, \lambda) \mapsto \varphi_{\lambda,O}(p)$, giving $\Lambda$ a similar formal status as the set of states $\Sigma$, or as $\Delta\Lambda_O: \Sigma \times \mathcal{B}(O) \to \mathcal{P}(\Lambda): (p, E) \mapsto \{\lambda \mid \varphi_O(p, \lambda) \in E\}$, where $\mathcal{P}(\Lambda)$ denotes the set of subsets of $\Lambda$. The core of this definition is that given a state $p \in \Sigma$ and a value $\lambda \in \Lambda$ we have a completely determined outcome $\varphi_O(p, \lambda)$. These pre-probabilistic hidden measurement systems encode as such fully deterministic settings.
Definition 3. Whenever for a given pre-probabilistic hidden measurement system $(\Sigma, \mathcal{E}(\mathcal{O}, \Lambda), \{\varphi_O\}_{O \in \mathcal{O}})$ there exists a $\sigma$-field $\mathcal{B}(\Lambda)$ of $\Lambda$-subsets that satisfies $\bigcup_{O \in \mathcal{O}} \{\Delta\Lambda_O(p, E) \mid (p, E) \in \Sigma \times \mathcal{B}(O)\} \subseteq \mathcal{B}(\Lambda)$, it defines a probabilistic hidden measurement system if a probability measure $\mu: \mathcal{B}(\Lambda) \to [0,1]$ is also specified.
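Definitions 2 and 3 can be illustrated on the simplest case of a single measurement with finitely many outcomes. The sketch below is only an illustration under stated assumptions, not the construction of [8]: it takes $\Lambda = [0,1]$ with the uniform measure, represents a state $p$ by its list of outcome probabilities `probs` (a hypothetical encoding), and lets the hidden value `lam` select the outcome through cumulative intervals. The induced probability of an event $E$ is then the measure of the set of hidden values leading into $E$.

```python
from fractions import Fraction


def outcome(probs, lam):
    """Deterministic map phi(p, lam): the outcome index fixed by the state data and lam."""
    cumulative = Fraction(0)
    for index, weight in enumerate(probs):
        cumulative += weight
        if lam < cumulative:
            return index
    return len(probs) - 1  # reached only when lam >= sum(probs), e.g. lam == 1


def lambda_set(probs, event):
    """Delta-Lambda(p, E): intervals [start, end) of lam whose outcome lies in `event`."""
    intervals, start = [], Fraction(0)
    for index, weight in enumerate(probs):
        if index in event and weight > 0:
            intervals.append((start, start + weight))
        start += weight
    return intervals


def induced_probability(probs, event):
    """mu(Delta-Lambda(p, E)) for the uniform measure mu on [0, 1]."""
    return sum((end - start for start, end in lambda_set(probs, event)), Fraction(0))
```

For instance, with `probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]`, the call `induced_probability(probs, {0, 2})` gives `Fraction(2, 3)`. Given $p$ and $\lambda$ the outcome is completely determined; all randomness sits in the measure on $\Lambda$.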
The condition on $\Delta\Lambda$ requires that all $\Delta\Lambda_O(p, E)$ are $\mathcal{B}(\Lambda)$-measurable, such that to all triples $(p, O, E)$ we can assign a value $P_{p,O}(E) := \mu(\Delta\Lambda_O(p, E)) \in [0,1]$. As such, any probabilistic hidden measurement system defines a measurement system. The question then arises whether every probabilistic measurement system (MS) can be encoded as a probabilistic hidden measurement system (HMS). The answer to this question is yes [8, §4.2, Theorem 1], [2, 3]. There always exists a canonical HMS-representation for $\Lambda = [0,1]$, $\mathcal{B}(\Lambda) = \mathcal{B}([0,1])$ (i.e., the Borel sets in $[0,1]$) and $\mu([0,a]) = a$, i.e., uniformly distributed; the proof goes via a construction using the Loomis-Sikorski Theorem [17, 20] and Marczewski's Lemma [13]. It makes as such sense to investigate how the different possible HMS-representations for different non-isomorphic pairs $(\mathcal{B}(\Lambda), \mu)$ are structured; below it will become clear what we mean here by non-isomorphic.
First we will discuss an example that illustrates the above; it traces back to [1], and details and illustrations can be found in [2, 8]. Consider the states of a spin-1/2 entity encoded as a point on the Poincaré sphere $\Sigma_0 \subset \mathbb{R}^3$. Then any pair of antipodically located points of $\Sigma_0$ encodes mutually orthogonal states, as such encodes mutually orthogonal one-dimensional projectors, and thus a (dichotomic) measurement. Let $p \in \Sigma_0$, let $(a, \neg a)$ be a pair of mutually orthogonal points of $\Sigma_0$, and let $\Delta$ be the diagonal connecting $a$ and $\neg a$. Let $x_p \in \Delta$ be the orthogonal projection of $p$ on the diagonal $\Delta$. Then for $\lambda \in [x_p, \neg a]$, i.e., $x_p \in [a, \lambda]$, we set $\varphi(p, \lambda) = a$, and for $\lambda \in [a, x_p[$, i.e., $x_p \in \,]\lambda, \neg a]$, we set $\varphi(p, \lambda) = \neg a$. One then verifies that for $\mu_\Delta: \mathcal{B}([a, \neg a]) \to [0,1]: [a, (1 - x)a + x \neg a] \mapsto x$, i.e., uniformly distributed, we obtain exactly the probability structure for spin-1/2 in quantum theory.^a
An interpretational proposal of this model could be the following [1, 2, 3]: rather than decomposing states as in so-called hidden variable theories, here we decompose the measurements in deterministic ones; the probability measure $\mu$ should then be envisioned as encoding the lack of knowledge on the interaction of the measured system with its environment, including the measurement device.
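The spin-1/2 example can be checked numerically. In the sketch below (an illustration with hypothetical function names), a point of the diagonal is labelled by $x \in [0,1]$ through $(1 - x)a + x \neg a$, so the orthogonal projection of a state $p$ at angle $\theta$ from $a$ has label $x_p = (1 - \cos\theta)/2$. The hidden value is drawn uniformly, the outcome is then fully determined, and the relative frequency of outcome $a$ is compared with the quantum value $\cos^2(\theta/2)$.

```python
import math
import random


def hidden_measurement_outcome(theta, x):
    """Deterministic outcome for a state at angle theta from a, given hidden label x in [0, 1].

    x labels the point (1 - x) a + x (not a) on the diagonal; outcome a occurs
    exactly when that point lies between the projection x_p and (not a).
    """
    x_p = (1.0 - math.cos(theta)) / 2.0  # label of the orthogonal projection of p
    return "a" if x >= x_p else "not a"


def estimated_probability(theta, n_samples, rng):
    """Relative frequency of outcome a when the hidden label is uniform on [0, 1]."""
    hits = sum(hidden_measurement_outcome(theta, rng.random()) == "a"
               for _ in range(n_samples))
    return hits / n_samples
```

Since the set of labels giving outcome $a$ is $[x_p, 1]$, its uniform measure is $1 - x_p = (1 + \cos\theta)/2 = \cos^2(\theta/2)$, which the simulated frequency should approach for large `n_samples`.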
We now introduce a notion of relative size of HMS-representations, justifying the use of "smaller". Given a $\sigma$-algebra^b $\mathcal{B}$ and probability measure $\mu: \mathcal{B} \to [0,1]$, denote by $\mathcal{B}/\mu$ the $\sigma$-algebra of equivalence classes $[E]$ with respect to the relation
$E \sim E'$ iff $\mu(E \cap E'^c) = \mu(E' \cap E^c) = 0$,
i.e., iff $E$ and $E'$ coincide up to a symmetric difference of measure zero. The ordering of $\mathcal{B}/\mu$ is inherited from $\mathcal{B}$. For notational convenience, denote the induced measure $\mathcal{B}/\mu \to [0,1]: [E] \mapsto \mu(E)$ again by $\mu$. Given two pairs $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ consisting of separable $\sigma$-algebras and probability measures on them, set:
$(\mathcal{B}, \mu) \le (\mathcal{B}', \mu') \Leftrightarrow \exists f: \mathcal{B}/\mu \to \mathcal{B}'/\mu'$, an injective $\sigma$-morphism.
We call $(\mathcal{B}, \mu)$ and $(\mathcal{B}', \mu')$ equivalent, denoted $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$, whenever $f$ in the above is a $\sigma$-isomorphism. Given two MS $(\Sigma, \mathcal{E})$ and $(\Sigma', \mathcal{E}')$ we set:
$(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}') \Leftrightarrow \exists s: \Sigma \to \Sigma'$, $\exists t: \mathcal{E} \to \mathcal{E}'$, both bijections; $\forall e \in \mathcal{E}$, $\exists f_e: \mathcal{B}(O_e) \to \mathcal{B}(O_{t(e)})$, a $\sigma$-isomorphism; $\forall p \in \Sigma$, $\forall e \in \mathcal{E}$: $P_{s(p),t(e)} \circ f_e = P_{p,e}$.
Via this equivalence relation we can define a relation $\le_{MS}$ between classes of measurement systems $M$ and $M'$ as $M \le_{MS} M'$ if for all $(\Sigma, \mathcal{E}) \in M$ there exists $(\Sigma', \mathcal{E}') \in M'$ such that $(\Sigma, \mathcal{E}) \sim_{MS} (\Sigma', \mathcal{E}')$, i.e., if $M$ is included in $M'$ up to MS-equivalence. We can then prove the following:
(i) $(\mathcal{B}, \mu) \sim (\mathcal{B}', \mu')$ if and only if $(\mathcal{B}, \mu) \le (\mathcal{B}', \mu')$ and $(\mathcal{B}', \mu') \le (\mathcal{B}, \mu)$ [8, §3, Lemma 1]; thus, the equivalence classes with respect to $\sim$ constitute a partially ordered set (poset) for the ordering induced by $\le$. We will denote the set of these equivalence classes by $\mathbb{M}$; a class in it will be denoted via a member of it as $[\mathcal{B}, \mu]$.
^a As shown in [6, 9], this deterministic model for spin-1/2 in $\mathbb{R}^3$ can be generalized to $\mathbb{R}^3$-models for arbitrary spin-$N/2$. The states are then represented in the so-called Majorana representation [18, 5], i.e., as $N$ copies of $\Sigma_0$. Correct probabilistic behavior is then obtained by introducing entanglement between the $N$ different spin-1/2 systems.
^b I.e., a pointless $\sigma$-field. In particular, it follows from the Loomis-Sikorski theorem [17, 20] that all separable $\sigma$-algebras (i.e., which contain a countable dense subset) can be represented as a $\sigma$-field; it as such also follows that assuming that $\mathcal{B}(\Lambda)$ is a $\sigma$-field and not an abstracted $\sigma$-algebra imposes no formal restriction.
(ii) When setting M H M S = M[BK)ii [B(A)n] pound M where M[B(A)fi] stands for all HMS with B(A) and i such that (S(A) fi) pound [B(A)j] we have that (B(A)i) lt (B(A)M) BndM[B(A)n] ltMS M[B(A)n] are equivalent 8 i 3 Theorem 2 This then results in
Theorem 1 (M lt) and (MH M S ltM S) are isomorphic posets One of the crucial ingredients in (ii) above and also in the proof for genshy
eral existence with A = [01] is the following when setting AM(Epound) = (B(Oe) Ppe)p euro pound e G pound we obtain that pound pound admits a HMS-representation with B(A) and i if and only if AM(E pound) lt (B(A)n) where the order applies pointwisely to the elements of AM(Epound) 8 t 42 Theorem 1 Using this and Theorem 1 above we can now translate properties of M to propositions on the existence of certain HMS-representations We obtain the following
(i) $(\mathbb{M}, \leq)$ is not a join-semilattice; thus, in general, there exists no smallest HMS-representation. As such, we will have to refine our study to particular settings where we are able to state whether there exists a smallest one and, if not, whether we can say at least something on the cardinality of $\Lambda$.
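(ii) One can prove a number of criteria on $\Delta\mathcal{M}(\Sigma, \mathcal{E})$ that force $(\mathcal{B}(\Lambda), \mu)$ to be equivalent to the representation on $\mathcal{B}([0,1])$, as such assuring existence of a smallest representation. Among these is the following: let $\mathbf{M}_{finite} = \{(\mathcal{B}(X), \mu) \in \mathbf{M} \mid X \text{ is finite}\}$; if $\mathbf{M}_{finite} \subseteq \Delta\mathcal{M}(\Sigma, \mathcal{E})$ then $\Lambda$ cannot be discrete. It then follows, for example, that quantum theory restricted to measurements with a finite number of outcomes still requires $\Lambda = [0,1]$.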
(iii) Let $\mathbf{M}_N = \{(\mathcal{B}(X), \mu) \in \mathbf{M} \mid X \text{ has at most } N \text{ elements}\}$; if $\Delta\mathcal{M}(\Sigma, \mathcal{E}) \subseteq \mathbf{M}_N$ then there exists a HMS-representation with $\Lambda = \mathbb{N}$. Thus quantum theory restricted to those measurements with at most a fixed number N of outcomes has a discrete HMS-representation.
(iv) If $\Delta\mathcal{M}(\Sigma, \mathcal{E}) = \mathbf{M}_N$ then there exists no smallest HMS-representation. Neither does it exist when fixing the number of outcomes. So there is no essentially unique smallest HMS-representation for N-outcome quantum theory.
Although there exists no smallest, and as such no canonical, discrete HMS-representation, we will give the construction of one solution for dichotomic (or propositional) quantum theory, i.e., N = 2, since this will constitute the core of the model presented in this paper. We will follow [8, §2], to which we also refer for a construction for arbitrary N. Let us denote the quantum mechanical probability to obtain a positive outcome in a measurement of a proposition or question a on a system in state p as $P_p(a)$; the outcome set consists here of "we obtain a positive answer for the question a", slightly abusively denoted as a itself, and "we obtain a negative answer for the question a", denoted as $\neg a$. Set inductively for $\lambda \in \mathbb{N}$ (footnote c):
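$$\varphi_a(p, \lambda) = \begin{cases} a & \text{iff } P_p(a) \geq \dfrac{1}{2^{\lambda}} + \sum_{i=1}^{\lambda - 1} \dfrac{\delta(\varphi_a(p, i), a)}{2^{i}} \\ \neg a & \text{otherwise} \end{cases}$$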
One verifies that for $\mu(\lambda) = 1/2^{\lambda}$ we obtain the correct probabilities in the resulting HMS-model. This provides a discrete alternative for the above discussed $\mathbb{R}^3$-model for spin-1/2. The model, including the projection $x_p$, remains the same, although we don't consider $[a, \neg a]$ as $\Lambda$ anymore. Let $\lambda \in \Lambda = \mathbb{N}$. Set $x_n = (1 - \frac{n}{2^{\lambda}}) a + \frac{n}{2^{\lambda}} \neg a$ for $n \in \mathbb{Z}_{2^{\lambda} - 1}$. For $x_p \in [a, x_1[ \cup [x_2, x_3[ \cup [x_4, x_5[ \cup \ldots \cup [x_{2^{\lambda} - 2}, x_{2^{\lambda} - 1}[$ we set $\varphi_a(p, \lambda) = a$, and $\varphi_a(p, \lambda) = \neg a$ otherwise. Then for $\mu: \mathcal{B}(\mathbb{N}) \to [0,1]: \lambda \mapsto 1/2^{\lambda}$ we obtain again quantum probability. Geometrically this means that, as compared to the first model, where the values of $\lambda \in \Lambda$ represent points on the diagonal, i.e., a continuous interval, or again equivalently, decompositions of an interval in two intervals, we now consider decompositions of an interval in $2^{\lambda}$ equally long parts, of which there are only a discrete number of possibilities. We refer to [8] for details and illustrations concerning
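The discrete construction above can be illustrated with a short sketch. It assumes the inductive rule is read as a greedy binary expansion of $P_p(a)$: the hidden variable $\lambda$ carries weight $1/2^{\lambda}$, and outcome a is assigned at $\lambda$ exactly when $P_p(a)$ is at least $1/2^{\lambda}$ plus the weights already assigned to a. The function names `outcomes` and `reproduced_probability`, and the truncation parameter `depth`, are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def outcomes(prob, depth):
    """Return [phi(1), ..., phi(depth)]; True stands for outcome a, False for not-a.

    Assumed rule (greedy binary expansion): phi(lam) is True iff
    prob >= 1/2**lam + sum of 1/2**i over earlier i with phi(i) True.
    """
    phi = []
    assigned = Fraction(0)  # total weight already assigned to outcome a
    for lam in range(1, depth + 1):
        weight = Fraction(1, 2 ** lam)
        hit = prob >= assigned + weight
        if hit:
            assigned += weight
        phi.append(hit)
    return phi


def reproduced_probability(prob, depth):
    """Sum the weights mu(lam) = 1/2**lam of all lam <= depth that yield outcome a."""
    return sum(
        Fraction(1, 2 ** (index + 1))
        for index, hit in enumerate(outcomes(prob, depth))
        if hit
    )
```

For a dyadic probability such as 5/8 the weights of the values of $\lambda$ that yield a add up to exactly 5/8; for other probabilities the truncated sum approaches $P_p(a)$ from below as `depth` grows.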
3 Unitary, ortho- and projective structure
In the above discussed $\mathbb{R}^3$ models, rotational symmetries were implicit in their spatial geometry. However, in general the decompositions of measurements over $\mu: \mathcal{B}(\Lambda) \to [0,1]$ go measurement by measurement, so additional structure, if there is any, has to be put in by hand. It is probably fair to say that these contextual models only become non-trivial and useful when encoding physical symmetries within the maps $\varphi_a$ in an appropriate manner. For the sake of argument we will distinguish between three types of symmetries that can be encoded, namely unitary, ortho- and projective ones.
i. Unitary symmetries. When considering quantum measurements with discrete non-degenerated spectrum we can represent the outcomes $o_{e,i}$ by the corresponding eigenstates $p_{e,i}$ via spectral decomposition, i.e., there exists an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\Sigma)$ for each $e \in \mathcal{E}$. Then specification of $\varphi: \Sigma \times \Lambda \to \{p_i\}_i$ and $\mu$ for one measurement $e_0 \in \mathcal{E}$ fixes it for any other $e \in \mathcal{E}$ by symmetry: $\varphi_e = (U \circ \varphi \circ U^{-1}): \Lambda \times \Sigma \to \{p_{e,i}\}_i$, where $U: \Sigma \to \Sigma$ is the unitary transformation that satisfies $U(p_i) = p_{e,i}$, and $\mu_e = \mu$. This is exactly the symmetry encoded in the above described $\mathbb{R}^3$-models. Note in particular that in this perspective the pairs $(a, \neg a)$ and $(\neg a, \neg(\neg a))$ should not be envisioned as merely a change of names of the outcomes, but truly as putting the measurement device (or at least its detecting part) upside down (footnote d). In this setting, where we represent outcomes as states, the assignment of an outcome can now be envisioned as a true change of state $f_{e,\lambda}: \Sigma \to \Sigma\ (\supseteq O_e): p \mapsto \varphi_e(p, \lambda)$, as such allowing to describe the behavior of the system under concatenated measurements.

c We agree on $\mathbb{N} = \{1, 2, \ldots\}$. Note here that already from non-uniqueness of binary decomposition, $\frac{1}{2} = \sum_{i \in \mathbb{N}} \frac{1}{2^{i+1}}$, it follows that the construction below is not canonical. Obviously, there are also less pathological differences between the different non-comparable discrete representations [8].
ii. Projective symmetries. For degenerated quantum measurements the outcomes require representation by higher dimensional subspaces, so identification in terms of states now requires an injective map $\mathcal{B}(O_e) \to \mathcal{P}(\mathcal{P}(\Sigma))$. The behavior of states of the system under concatenated measurements then requires specification of a family of projectors $\{\pi_T: \Sigma \to T \mid T \in O_e\}$, e.g., the orthogonal projectors $\pi_A: \Sigma \to A: p \mapsto A \wedge (p \vee A^{\perp})$ on the corresponding subspace A in quantum theory. The above discussed non-degenerated case fits also in this picture by setting $O_e \subseteq \{\{p\} \mid p \in \Sigma\}$, where now each $\pi_{\{p\}}: \Sigma \to \{p\}$ is uniquely determined (having a singleton codomain).
iii. Orthosymmetries. The existence of an orthocomplementation on the lattice of closed subspaces of a Hilbert space provides a dichotomic representation for measurements, which can be envisioned as a pair consisting of a (to be verified) proposition $a$ and its negation $\neg a$, in quantum theory yielding $\pi_{\neg A}: \Sigma \to A^{\perp}: p \mapsto A^{\perp} \wedge (p \vee A)$. In terms of linear operator calculus we have $\pi_{\neg A} = 1 - \pi_A$, both of them being orthogonal projectors.
4 Representing quantum history theory
Although quantum history theory involves sequential measurements, one of its goals is to remain an essentially dichotomic propositional theory. This is formally encoded in a rigid way in the History Projection Operator-approach [14]. The key idea here is that the form of logicality aimed at in [14] represents faithfully in the Hilbert space tensor product (footnote e). Let $A = (\alpha_{t_i})_i$ be a (so-called homogeneous) quantum history proposition with temporal support $(t_1, t_2, \ldots, t_n)$. Then rather than representing this as a sequence of subspaces $(A_i)_i$ or projectors $(\pi_i)_i$, we will either represent A as a pure tensor $\otimes_i A_i$ in the lattice of closed subspaces of the tensor product of the corresponding Hilbert spaces, or as the orthogonal projector $\otimes_i \pi_i$ on this subspace. The crucial property of this representation is then that $\neg A$ again encodes as a projector, namely $id - \otimes_i \pi_i$ [14], clarifying the notations $\pi_A$ and $\pi_{\neg A}$. Moreover, if $\{A^i\}_i$ is a set of so-called disjoint history propositions, i.e., $\otimes_k A^i_k \perp \otimes_k A^j_k$ for $i \neq j$, then the history proposition that expresses the disjunction of $\{A^i\}_i$ sensu [14] is exactly encoded as the projector $\sum_i \otimes_k \pi^i_k$. We get as such a kind of logical setting that is still encoded in terms of projectors. Note that $\pi_{\neg A}$ is not of the form $\otimes_i \pi_i$ but of the form $\sum_i \otimes_k \pi^i_k$, breaking the structural symmetry between a proposition and its negation in ordinary quantum theory.

d The attentive reader will note that it is at this point that we escape the so-called hidden variable no-go theorems. They arise when trying to impose contextual symmetries within the states of the system by requiring that values of observables are independent of the chosen context, e.g., the proof of the Kochen-Specker theorem. Our newly introduced variable $\lambda \in \Lambda$ follows contextual manipulations in an obvious manner.
e At this point we mention that in the study of sequential phenomena in the axiomatic quantum theory perspective on quantum logic, sequentiality and compoundness both turn out to be specifications of a universal causal duality [10], as such providing a metaphysical perspective on the use of tensor products both for the description of compound physical systems and sequential processes.
We will now transcribe the observations in the two previous sections to this setting in order to provide a contextual deterministic model for quantum history theory with discretely originating probabilities. One could say that we will apply a split picture in terms of Schrödinger-Heisenberg, namely, we assume that on the level of unitary evolution we apply the Heisenberg picture, such that we can fix notation without reference to this evolution, but for changes of state due to measurement we will (obviously) express this in the state space. When encoding outcomes in terms of states we need to consider n copies of $\Sigma$, encoding the trajectories due to the measurements. In view of the considerations made above it will be no surprise that we will consider these trajectories as of the form $\otimes_i p_i$ in the tensor product $\otimes_i \Sigma_i$. This will require the introduction of the following pseudo-projector:
$$\pi^{\otimes}: \Sigma \to \otimes_i \Sigma_i: p \mapsto p^{\otimes} = p \otimes \pi_1(p) \otimes \ldots \otimes (\pi_{n-1} \circ \ldots \circ \pi_1)(p)$$

Setting $\Sigma^{\otimes} = \pi^{\otimes}[\Sigma] = \{p^{\otimes} \mid p \in \Sigma\}$, then $\pi^{\otimes}: \Sigma \to \Sigma^{\otimes}$ encodes a bijective representation of $\Sigma$. Noting that $P_p(A) = \langle p^{\otimes} | \pi_A p^{\otimes} \rangle$ is the probability given by quantum theory to obtain A, we then set inductively for fixed $\lambda \in \mathbb{N}$ that $\varphi_A(p, \lambda) = A$ if and only if

$$\langle p^{\otimes} | \pi_A p^{\otimes} \rangle \geq \frac{1}{2^{\lambda}} + \sum_{i=1}^{\lambda - 1} \frac{\delta(\varphi_A(p, i), A)}{2^{i}}$$

and $\varphi_A(p, \lambda) = \neg A$ otherwise. The outcome trajectories in case we obtain A are then given in terms of initial states by $(\pi_A \circ \pi^{\otimes}): \Sigma \to \otimes_i A_i$. The value $\lambda \in \mathbb{N}$ can be envisioned as follows. We assume it to be a number of contextual events, either real or virtual depending on one's taste, and we assume that, given that some events already happened, the chance of a next one happening is equal to the chance that it doesn't happen. So we actually consider a finite number of probabilistically balanced consecutive binary decisive processes, where the result of the previous one determines whether we actually will perform the next one. Unitary symmetries are induced in the obvious way as tensored unitary operators $\otimes_i U_i$. This model then produces the statistical behavior of quantum history theory.
The breaking of the structural symmetry between a proposition and its negation manifests itself in the most explicit way in the sense that when we have a determined outcome $\neg A$ we don't have a determined trajectory in our model. Obviously one could build a fully deterministic model that also determines this, by concatenation of individual deterministic models (one for each element in the temporal support), but we feel that this would not be in accordance with the propositional flavor a history theory aims at. The negation $\neg A$ is indeed cognitive and not ontological with respect to the actual executed physical procedure, or in other words, the system's context, and one cannot expect an ontological model to encode this in terms of a formal duality. Explicitly, $\neg(A \otimes B)$ can be written both as $(H \otimes \neg B) \oplus (\neg A \otimes B)$ and $(\neg A \otimes H) \oplus (A \otimes \neg B)$, which clearly define different procedures with respect to imposed change of state due to the measurement. Even more explicitly, setting $\mathcal{HPO}(\{H_k\}_k) = \{\oplus_i \otimes_k A^i_k \mid A^i_k \in \mathcal{L}(H_k), \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$, for $\mathcal{L}(H_k)$ the lattice of closed subspaces of $H_k$, the ontologically faithful hull of $\mathcal{HPO}(\{H_k\}_k)$ consists then of all ortho-ideals

$$\mathcal{OI}(\mathcal{HPO}(\{H_k\}_k)) := \{\downarrow[\{\otimes_k A^i_k\}_i] \mid A^i_k \in \mathcal{L}(H_k), \otimes_k A^i_k \perp \otimes_k A^j_k \text{ for } i \neq j\}$$

where $\downarrow[-]$ assigns to a set of pure tensors all pure tensors in $\otimes_k H_k$ that are smaller than at least one in the given set, this with respect to the ordering in $\mathcal{L}(\otimes_k H_k)$; the downset $\downarrow[-]$ construction makes $\mathcal{OI}(\mathcal{HPO}(\{H_k\}_k))$ inherit the $\mathcal{L}(\otimes_k H_k)$-order as intersection. If a particular decomposition is specified as an element of $\mathcal{OI}(\mathcal{HPO}(\{H_k\}_k))$, which means full specification of the physical procedure, where summation over different sequences of pure tensors is now envisioned as choice of procedure, we can provide a deterministic contextual model, the choice of procedure itself becoming an additional variable. In conclusion, the HPO-setting loses part of the physical ontology that goes with an operational perspective on quantum theory, and as such, if we want to provide a deterministic representation for general inhomogeneous history propositions sensu the one we obtained for the homogeneous ones, we formally need to restore this part of the physical ontology, e.g., as $\mathcal{OI}(\mathcal{HPO}(\{H_k\}_k))$.
5 Further discussion
In this paper we didn't provide an answer and we didn't even pose a question. We just provided a new way to think about things, slightly confronting the usual consistency or decoherence perspective for history theories. Even if one does not subscribe to the underlying deterministic nature of the model, it still exhibits what a minimal representation of the indeterministic ingredients can be, as such representing it in a more tangible way. With respect to the non-existence of a smallest representation, in view of other physical considerations it could be that one of the constructible discrete models presents itself as the truly canonical one, e.g., equilibrium or other thermodynamical considerations, or metastatistical ones emerging from additional modelization.

f A choice that is motivated by the traditional consistent history setting and its interpretation, as well as by a particular semantical perspective on quantum logic as a whole.
Acknowledgments
We thank Chris Isham for useful discussions on the content of this paper
References
1. D. Aerts, J. Math. Phys. 27, 202 (1986).
2. D. Aerts, Int. J. Theor. Phys. 32, 2207 (1993).
3. D. Aerts, Found. Phys. 24, 1227 (1994).
4. G.K. Au, Interview with A. Ashtekar, C.J. Isham and E. Witten: The Quest for Quantum Gravity, arXiv: gr-qc/9506001 (1995).
5. H. Bacry, J. Math. Phys. 15, 1686 (1974).
6. B. Coecke, Helv. Phys. Acta 68, 396 (1995).
7. B. Coecke, Found. Phys. Lett. 8, 437 (1995).
8. B. Coecke, Helv. Phys. Acta 70, 442, 462 (1997); arXiv: quant-ph/0008061 & 0008062; Tatra Mt. Math. Publ. 10, 63.
9. B. Coecke, Found. Phys. 28, 1347 (1998).
10. B. Coecke et al., Found. Phys. Lett. 14 (2001); arXiv: quant-ph/0009100.
11. N. Gisin and C. Piron, Lett. Math. Phys. 5, 379 (1981).
12. S. Gudder, J. Math. Phys. 11, 431 (1970).
13. A. Horn and H. Tarski, Trans. AMS 64, 467 (1948).
14. C.J. Isham, J. Math. Phys. 23, 2157 (1994).
15. C.J. Isham, Structural Issues in Quantum Gravity, in: General Relativity and Gravitation: GR14, pp. 167 (World Scientific, Singapore, 1997).
16. C.J. Isham and J. Butterfield, Found. Phys. 30, 1707 (2000).
17. L. Loomis, Bull. AMS 53, 757 (1947).
18. E. Majorana, Nuovo Cimento 9, 43 (1932).
19. C. Rovelli, Strings, Loops and Others: A Critical Survey of the Present Approaches to Quantum Gravity, Plenary Lecture at GR15, Poona, India (1998); arXiv: gr-qc/9803024.
20. R. Sikorski, Fund. Math. 35, 247 (1948).
INTERPRETATIONS OF QUANTUM MECHANICS, AND INTERPRETATIONS OF VIOLATION OF BELL'S INEQUALITY

WILLEM M. DE MUYNCK
Theoretical Physics, Eindhoven University of Technology, POB 513, 5600 MB Eindhoven, the Netherlands
E-mail: W.M.d.Muynck@tue.nl
The discussion of the foundations of quantum mechanics is complicated by the fact that a number of different issues are closely entangled. Three of these issues are i) the interpretation of probability, ii) the choice between realist and empiricist interpretations of the mathematical formalism of quantum mechanics, iii) the distinction between measurement and preparation. It will be demonstrated that an interpretation of violation of Bell's inequality by quantum mechanics as evidence of non-locality of the quantum world is a consequence of a particular choice between these alternatives. Also, a distinction must be drawn between two forms of realism, viz. a) realist interpretations of quantum mechanics, b) the possibility of hidden-variables (sub-quantum) theories.
1 Realist and empiricist interpretations of quantum mechanics
In realist interpretations of the mathematical formalism of quantum mechanics, state vector and observable are thought to refer to the microscopic object in the usual way presented in most textbooks. Although, of course, preparing and measuring instruments are often present, these are not taken into account in the mathematical description (unless, as in the theory of measurement, the subject is the interaction between object and measuring instrument).
In an empiricist interpretation quantum mechanics is thought to describe relations between input and output of a measurement process. A state vector is just a label of a preparation procedure; an observable is a label of a measuring instrument. In an empiricist interpretation quantum mechanics is not thought to describe the microscopic object. This, of course, does not imply that this object would not exist; it only means that it is not described by quantum mechanics. Explanation of relations between input and output of a measurement process should be provided by another theory, e.g., a hidden-variables (sub-quantum) theory. This is analogous to the way the theory of rigid bodies describes the empirical behavior of a billiard ball, or to the description by thermodynamics of the thermodynamic properties of a volume of gas, explanations being relegated to theories describing the microscopic (atomic) properties of the systems.
Although a term like "observable" (rather than "physical quantity") is evidence of the empiricist origin of quantum mechanics (compare Heisenberg [1]), there has always existed a strong tendency toward a realist interpretation in which observables are considered as properties of the microscopic object, more or less analogous to classical ones. Likewise, many physicists tend to think of electrons as wave packets flying around in space, without bothering too much about the "Unanschaulichkeit" that for Schrödinger [2] was such a problematic feature of quantum theory. Without entering into a detailed discussion of the relative merits of either of these interpretations (e.g., de Muynck [3]), it is noted here that an empiricist interpretation is in agreement with the operational way theory and experiment are compared in the laboratory. Moreover, it is free of paradoxes, which have their origin in a realist interpretation. As will be seen in the next section, the difference between realist and empiricist interpretations is highly relevant when dealing with the EPR problem.
2 EPR experiments and Bell experiments
In figure 1 the experiment is depicted that was proposed by Einstein, Podolsky and Rosen [4] to study (in)completeness of quantum mechanics. A pair of particles (1 and 2) is prepared in an entangled state and allowed to separate. A measurement is performed on particle 1. It is essential to the EPR reasoning that particle 2 does not interact with any measuring instrument, thus allowing to consider so-called "elements of physical reality" of this particle, that can be considered as objective properties, being attributable to particle 2 independently of what happens to particle 1. By EPR this arrangement was presented as a way to perform a measurement on particle 2 without in any way disturbing this particle.

Figure 1: EPR experiment (figure label: measuring instrument for Q or P).
The EPR experiment should be compared to correlation measurements of the type performed by Aspect et al. [5, 6] to test Bell's inequality (cf. figure 2). In these latter experiments particle 2, too, is interacting with a measuring instrument. In the literature these experiments are often referred to as EPR experiments too, thus neglecting the fundamental difference between the two measurement arrangements of figures 1 and 2. This negligence has been responsible for quite a bit of confusion, and should preferably be avoided by referring to the latter experiments as Bell experiments rather than EPR ones. In EPR experiments particle 2 is not subject to a measurement, but to a (conditional) preparation (conditional on the measurement result obtained for particle 1). This is especially clear in an empiricist interpretation, because here measurement results cannot exist unless a measuring instrument is present, its pointer positions corresponding to the measurement results.

Figure 2: Bell experiment.
Unfortunately, the EPR experiment of figure 1 was presented by EPR as a measurement performed on particle 2, and accepted by Bohr as such. That this could happen is a consequence of the fact that both Einstein and Bohr entertained a realist interpretation of quantum mechanical observables (note that they differed with respect to the interpretation of the state vector), the only difference being that Einstein's realist interpretation was an objectivistic one (in which observables are considered as properties of the object, possessed independently of any measurement: the EPR elements of physical reality), whereas Bohr's was a contextualistic realism (in which observables are only well-defined within the context of the measurement). Note that in Bell experiments the EPR reasoning would break down because, due to the interaction of particle 2 with its measuring instrument, there cannot exist elements of physical reality.
Much confusion could have been avoided if Bohr had maintained his interactional view of measurement. However, by accepting the EPR experiment as a measurement of particle 2 he had to weaken his interpretation to a relational one (e.g., Popper [7], Jammer [8]), allowing the observable of particle 2 to be co-determined by the measurement context for particle 1. This introduced for the first time non-locality in the interpretation of quantum mechanics. But this could easily have been avoided if Bohr had required that for a measurement of particle 2 a measuring instrument should be actually interacting with this very particle, with the result that an observable of particle n (n = 1, 2) can be co-determined in a local way by the measurement context of that particle only. This, incidentally, would have made the EPR elements of physical reality completely obsolete, and would have been quite a bit less confusing than the answer Bohr [9] actually gave (to the effect that the definition of the EPR element of physical reality would be ambiguous because of the fact that it did not take into account the measurement arrangement for the other particle), thus promoting the non-locality idea.
Summarizing, the idea of EPR non-locality is a consequence of i) a neglect of the difference between EPR and Bell experiments (equating elements of physical reality to measurement results), ii) a realist interpretation of quantum mechanics (considering measurement results as properties of the microscopic object, i.e., particle 2). In an empiricist interpretation there is no reason to assume any non-locality.
It is often asserted that non-locality is proven by the Aspect experiments, because these are violating Bell's inequality. The reason for such an assertion is that it is thought that non-locality is a necessary condition for a derivation of Bell's inequality. However, as will be demonstrated in the following, this cannot be correct, since this inequality can be derived from quite different assumptions. Also, experiments like the Aspect ones, although violating Bell's inequality, do not exhibit any trace of non-locality, because their measurement results are completely consistent with the postulate of local commutativity, implying that relative frequencies of measurement results are independent of which measurements are performed in causally disconnected regions. Admittedly, this does not logically exclude a certain non-locality at the individual level, being unobservable at the statistical level of quantum mechanical probability distributions. However, from a physical point of view a peaceful coexistence between locality at the (physically relevant) statistical level and non-locality at the individual level is extremely implausible. Unobservability of the latter would require a kind of conspiracy not unlike the one making unobservable the 19th century world aether. For this reason the non-locality explanation of the experimental violation of Bell's inequality does not seem to be very plausible, and it does seem wise to look for alternative explanations.
Since non-locality is never the only assumption in deriving Bell's inequality, such alternative explanations do exist. Thus, Einstein's assumption of the existence of elements of physical reality is such an additional assumption. More generally, in Bell's derivation [10] the existence of hidden variables is one. Is it still possible to derive Bell's inequality if these assumptions are abolished? Moreover, even assuming the possibility of hidden-variables theories, are there in Bell's derivation no hidden assumptions additional to the locality assumption?
Bell's inequality refers to a set of four quantum mechanical observables, $A_1$, $B_1$, $A_2$ and $B_2$, observables with different/identical indices being compatible/incompatible. In the Aspect experiments measurements of the four possible compatible pairs are performed (in these experiments $A_n$ and $B_n$ refer to polarization observables of photon n, n = 1, 2, respectively). Bell's inequality can typically be derived for the stochastic quantities of a classical Kolmogorovian probability theory. Hence, violation of Bell's inequality is an indication that observables $A_1$, $B_1$, $A_2$ and $B_2$ are not stochastic quantities in the sense of Kolmogorov's probability theory. In particular, there cannot exist a quadrivariate joint probability distribution of these four observables. Such a non-existence is a consequence of the incompatibility of certain of the observables. Since incompatibility is a local affair, this is another reason to doubt the non-locality explanation of the violation of Bell's inequality.
In the following, derivations of Bell's inequality will be scrutinized to see whether the non-locality assumption is as crucial as was assumed by Bell. In doing so it is necessary to distinguish derivations in quantum mechanics from derivations in hidden-variables theories.
3 Bell's inequality in quantum mechanics
For dichotomic observables, having values $\pm 1$, Bell's inequality is given according to

$$\langle A_1 A_2 \rangle - \langle A_1 B_2 \rangle - \langle B_1 B_2 \rangle - \langle B_1 A_2 \rangle \leq 2. \tag{1}$$

A more general inequality, being valid for arbitrary values of the observables, is the BCHS inequality

$$-1 \leq p(b_1, a_2) + p(b_1, b_2) + p(a_1, b_2) - p(a_1, a_2) - p(b_1) - p(b_2) \leq 0, \tag{2}$$

from which (1) can be derived for the dichotomic case. Because of its independence of the values of the observables, inequality (2) is preferable by far over inequality (1). Bell's inequality may be violated if some of the observables are incompatible: $[A_1, B_1]_- \neq O$, $[A_2, B_2]_- \neq O$.
I shall now discuss two derivations of Bell's inequality, which can be formulated within the quantum mechanical formalism, and which do not rely on the existence of hidden variables. The first one relies on a "possessed values" principle, stating that

values of quantum mechanical observables may be attributed to the object as objective properties, possessed by the object independent of observation.

The "possessed values" principle can be seen as an expression of the objectivistic-realist interpretation of the quantum mechanical formalism preferred by Einstein (compare the EPR elements of physical reality). The important point is that by this principle well-defined values are simultaneously attributed to incompatible observables. If $a_i^{(n)}, b_j^{(n)} = \pm 1$ are the values of $A_i$ and $B_j$ for the $n$th of a sequence of N particle pairs, then we have
$$-2 \leq a_1^{(n)} a_2^{(n)} - a_1^{(n)} b_2^{(n)} - b_1^{(n)} b_2^{(n)} - b_1^{(n)} a_2^{(n)} \leq 2,$$

from which it directly follows that the quantities

$$\langle A_1 A_2 \rangle = \frac{1}{N} \sum_{n=1}^{N} a_1^{(n)} a_2^{(n)}, \text{ etc.},$$
must satisfy Bell's inequality (1) (a similar derivation was first given by Stapp [11], although starting from quite a different interpretation). The essential point in the derivation is the assumption of the existence of a quadruple of values $(a_1, b_1, a_2, b_2)$ for each of the particle pairs.
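The role of the quadruples can be checked numerically. The sketch below enumerates all sixteen possible quadruples of values $\pm 1$ and evaluates the per-pair combination used in the derivation; it then evaluates the same combination of correlations for a quantum example. The quantum part is an assumption not taken from the text: it uses the standard spin-singlet correlation $-\cos(x - y)$ for analyser angles x and y, and the angles are chosen for illustration only. The names `bell_term`, `singlet_correlation` and `quantum_bell_value` are hypothetical.

```python
import math
from itertools import product


def bell_term(a1, b1, a2, b2):
    """Per-pair combination a1*a2 - a1*b2 - b1*b2 - b1*a2 for one quadruple."""
    return a1 * a2 - a1 * b2 - b1 * b2 - b1 * a2


# Every quadruple of possessed values gives exactly -2 or +2,
# so any average over quadruples lies between -2 and 2.
quadruple_values = {bell_term(*q) for q in product((-1, 1), repeat=4)}


def singlet_correlation(x, y):
    """Assumed quantum correlation for a spin singlet (illustration only)."""
    return -math.cos(x - y)


def quantum_bell_value(a1, b1, a2, b2):
    """Same combination as bell_term, built from correlations of compatible pairs."""
    corr = singlet_correlation
    return corr(a1, a2) - corr(a1, b2) - corr(b1, b2) - corr(b1, a2)


# Illustrative angle choice; the magnitude of the result exceeds 2.
S = quantum_bell_value(0.0, -math.pi / 2, math.pi / 4, 3 * math.pi / 4)
```

With these angles the magnitude of `S` is $2\sqrt{2}$, which no average over quadruples can reach, since each quadruple contributes only $-2$ or $+2$.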
From the experimental violation of Bell's inequality it follows that an objectivistic-realist interpretation of the quantum mechanical formalism, encompassing the "possessed values" principle, is impossible. Violation of Bell's inequality entails failure of the "possessed values" principle (no quadruples available). In view of the important role measurement is playing in the interpretation of quantum mechanics this is hardly surprising. As is well known, due to the incompatibility of some of the observables the existence of a quadruple of values can only be attained on the basis of doubtful counterfactual reasoning. If a realist interpretation is feasible at all, it seems to have to be a contextualistic one, in which the values of observables are co-determined by the measurement arrangement. In the case of Bell experiments non-locality does not seem to be involved.
As a second possibility to derive Bell's inequality within quantum mechanics we should consider derivations of the BCHS inequality (2) from the existence of a quadrivariate probability distribution $p(a_1, b_1, a_2, b_2)$ by Fine [12] and Rastall [13] (also de Muynck [14]). Hence, from violation of Bell's inequality the non-existence of a quadrivariate joint probability distribution follows. In view of the fact that incompatible observables are involved, this, once again, is hardly surprising.
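The step from a quadrivariate distribution to the BCHS inequality can be illustrated numerically. The sketch below draws random joint distributions over quadruples of yes/no indicators, computes the bivariate and univariate marginals, and evaluates the BCHS combination. Treating each of $a_1, b_1, a_2, b_2$ as the indicator of one fixed value of the corresponding observable is an assumption made for the illustration; the names `random_joint` and `bchs_value` are hypothetical.

```python
import random
from itertools import product

# Positions of the four indicators inside a quadruple (a1, b1, a2, b2).
A1, B1, A2, B2 = 0, 1, 2, 3


def random_joint(rng):
    """Random probability distribution over the 16 quadruples of 0/1 indicators."""
    weights = {q: rng.random() for q in product((0, 1), repeat=4)}
    total = sum(weights.values())
    return {q: w / total for q, w in weights.items()}


def bchs_value(joint):
    """p(b1,a2) + p(b1,b2) + p(a1,b2) - p(a1,a2) - p(b1) - p(b2) from the marginals."""

    def p(*positions):
        # Marginal probability that all listed indicators equal 1.
        return sum(w for q, w in joint.items() if all(q[i] for i in positions))

    return p(B1, A2) + p(B1, B2) + p(A1, B2) - p(A1, A2) - p(B1) - p(B2)


rng = random.Random(0)
values = [bchs_value(random_joint(rng)) for _ in range(1000)]
```

Every sampled value lies between $-1$ and $0$, because the combination takes only the values $-1$ and $0$ on each individual quadruple and a joint distribution merely averages over quadruples.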
A priori there are two possible reasons for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$. First, it is possible that the limit $\lim_{N \to \infty} N(a_1, b_1, a_2, b_2)/N$ of the relative frequencies of quadruples of measurement results does not exist. Since, however, Bell's inequality already follows from the existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$ with finite N, and the limit $N \to \infty$ is never involved in any experimental implementation, this answer does not seem to be sufficient. Therefore the reason for the non-existence of the quadrivariate joint probability distribution $p(a_1, b_1, a_2, b_2)$ can only be the non-existence of relative frequencies $N(a_1, b_1, a_2, b_2)/N$. This seems to reduce the present case to the previous one: Bell's inequality can be violated because quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$ do not exist.
Could non-locality explain the non-existence of quadruples $(A_1 = a_1, B_1 = b_1, A_2 = a_2, B_2 = b_2)$? Indeed, it could. If the value of $A_1$, say, is co-determined by the measurement arrangement of particle 2, then non-locality could entail

$$a_1(A_2) \neq a_1(B_2), \tag{3}$$

thus preventing the existence of one single value of observable $A_1$ for the two Aspect experiments involving this observable. This precisely is the non-locality explanation referred to above. This explanation is close to Bohr's ambiguity answer to EPR, referred to in section 2, stating that the definition of an element of physical reality of observable $A_1$ must depend on the measurement context of particle 2.
As will be demonstrated next, there is a more plausible local explanation, however, based on the inequality

$$a_1(A_1) \neq a_1(B_1), \tag{4}$$

expressing that the value of $A_1$, say, will depend on whether either $A_1$ or $B_1$ is measured. Inequality (4) could be seen as an implementation of Heisenberg's disturbance theory of measurement, to the effect that observables, incompatible with the actually measured one, are disturbed by the measurement. That such an effect is really occurring in the Aspect experiments can be seen from the generalized Aspect experiment depicted in figure 3. This experiment should be compared with the Aspect switching experiment, in which the switches have been replaced by two semi-transparent mirrors (transmissivities $\gamma_1$ and $\gamma_2$, respectively). The four Aspect experiments are special cases of the generalized one, having $\gamma_n = 0$ or $1$, n = 1, 2.
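Restricting for a moment to one side of the interferometer, it is possible to calculate the joint detection probabilities of the two detectors according to

$$\left( p_{\gamma_1}(a_{1i}, b_{1j}) \right) = \begin{pmatrix} 0 & \gamma_1 \langle E^{(1)}_+ \rangle \\ (1 - \gamma_1) \langle F^{(1)}_+ \rangle & 1 - \gamma_1 \langle E^{(1)}_+ \rangle - (1 - \gamma_1) \langle F^{(1)}_+ \rangle \end{pmatrix}, \tag{5}$$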
in which $\{E^{(1)}_+, E^{(1)}_-\}$ and $\{F^{(1)}_+, F^{(1)}_-\}$ are the spectral representations of the two polarization observables ($A_1$ and $B_1$) in directions $\theta_1$ and $\theta'_1$, respectively. The values $a_{1i} = +/-$, $b_{1j} = +/-$ correspond to yes/no registration of a photon by the detector. $p_{\gamma_1}(+, +) = 0$ means that, like in the switching experiment, only one of the detectors can register photon 1. There is, however, a fundamental difference with the switching experiment, because in this latter experiment the photon wave packet is sent either toward one detector or the other, whereas in the present one it is split so as to interact coherently with both detectors. This makes it possible to interpret the right hand part of the generalized experiment of figure 3 as a joint non-ideal measurement of the incompatible polarization observables in directions $\theta_1$ and $\theta'_1$ (e.g., de Muynck et al. [15]), the joint probability distribution of the observables being given by (5).

Figure 3: Generalized Aspect experiment.
It is not possible to extensively discuss here the relevance of experiments of the generalized type for understanding Heisenberg's disturbance theory of measurement and its relation to the Heisenberg uncertainty relations (see e.g. de Muynck [16]). The important point is that such experiments do not fit into the standard (Dirac-von Neumann) formalism, in which a probability is an expectation value of a projection operator. Indeed, from (5) it follows that $p_{\gamma_1}(a_{1i}, b_{1j}) = \mathrm{Tr}\, \rho R^{(1)}_{ij}$, yielding operators $R^{(1)}_{ij}$ according to
$$\left(R^{(1)}_{ij}\right) = \begin{pmatrix} O & \gamma_1 E^{(1)}_+ \\ (1-\gamma_1) F^{(1)}_+ & \gamma_1 E^{(1)}_- + (1-\gamma_1) F^{(1)}_- \end{pmatrix}. \qquad (6)$$
The set of operators $\{R^{(1)}_{ij}\}$ constitutes a so-called positive operator-valued measure (POVM). Only generalized measurements corresponding to POVMs are able to describe joint non-ideal measurements of incompatible observables. By calculating the marginals of probability distribution $p_{\gamma_1}(a_{1i}, b_{1j})$ it is possible to see that for each value of $\gamma_1$ information is obtained on both polarization observables, be it that information on polarization in direction $\theta_1$ gets more non-ideal as $\gamma_1$ decreases, while information on polarization in direction $\theta_1'$ is getting more ideal. This is in perfect agreement with the idea of mutual disturbance in a joint measurement of incompatible observables. The explanation of the non-existence of a single measurement result for observable $A_1$, say, as implied by inequality (4), is corroborated by this analysis.
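The POVM properties and the marginals described above can be checked numerically. The following is a minimal sketch, assuming a single photon whose polarization is described in a real two-dimensional space; the angles, the value of $\gamma_1$ and the state are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def projector(theta):
    """Projector onto linear polarization along angle theta (radians)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def joint_povm(gamma, theta, theta_prime):
    """Operators R_ij of equation (6); index 0 stands for '+', index 1 for '-'."""
    identity = np.eye(2)
    E_plus = projector(theta)
    F_plus = projector(theta_prime)
    E_minus = identity - E_plus
    F_minus = identity - F_plus
    zero = np.zeros((2, 2))
    return [[zero, gamma * E_plus],
            [(1 - gamma) * F_plus, gamma * E_minus + (1 - gamma) * F_minus]]

gamma = 0.3                      # illustrative transmissivity
R = joint_povm(gamma, theta=0.0, theta_prime=np.pi / 8)

# POVM conditions: every element is positive semidefinite, and they sum to the identity.
total = sum(R[i][j] for i in range(2) for j in range(2))
min_eig = min(np.linalg.eigvalsh(R[i][j]).min() for i in range(2) for j in range(2))

# Joint probabilities p(a_i, b_j) = Tr(rho R_ij) for an illustrative pure state.
psi = np.array([np.cos(0.2), np.sin(0.2)])
rho = np.outer(psi, psi)
p = np.array([[np.trace(rho @ R[i][j]) for j in range(2)] for i in range(2)])

marginal_a_plus = p[0].sum()      # equals gamma * <E_+>
marginal_b_plus = p[:, 0].sum()   # equals (1 - gamma) * <F_+>
```

The two marginals are the ideal probabilities $\langle E^{(1)}_+ \rangle$ and $\langle F^{(1)}_+ \rangle$ scaled by $\gamma_1$ and $1-\gamma_1$, which is one way to read the statement that information on one direction becomes less ideal as information on the other becomes more ideal.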
The analysis can easily be extended to the joint detection probabilities of the whole experiment of figure 3. The joint detection probability distribution of all four detectors is given by the expectation value of a quadrivariate POVM $\{R_{ijk\ell}\}$ according to
$$p_{\gamma_1 \gamma_2}(a_{1i}, b_{1j}, a_{2k}, b_{2\ell}) = \mathrm{Tr}\, \rho R_{ijk\ell}. \qquad (7)$$
This POVM can be expressed in terms of the POVMs of the left and right interferometer arms according to
$$R_{ijk\ell} = R^{(1)}_{ij} R^{(2)}_{k\ell}. \qquad (8)$$
It is important to note that the existence of the quadrivariate joint probability distribution (7), and the consequent satisfaction of Bell's inequality, is a consequence of the existence of quadruples of measurement results, available because it is possible to determine for each individual particle pair what is the result of each of the four detectors. Although, because of (8), also locality is assumed, this does not play an essential role. Under the condition that a quadruple of measurement results exists for each individual photon pair, Bell's inequality would be satisfied also if, due to non-local interaction, $R_{ijk\ell}$ were not a product of operators of the two arms of the interferometer. The reason why the standard Aspect experiments do not satisfy Bell's inequality is the non-existence of a quadrivariate joint probability distribution yielding the bivariate probabilities of these experiments as marginals. Such a non-existence is strongly suggested by Heisenberg's idea of mutual disturbance in a joint measurement of incompatible observables. This is corroborated by the easily verifiable fact that the quadrivariate joint probability distributions of the standard Aspect experiments, obtained from (7) and (8) by taking $\gamma_n$ to be either 1 or 0, are all distinct. Moreover, in general the quadrivariate joint probability distribution (7) for one standard Aspect experiment does not yield the bivariate ones of the other experiments as marginals. Although it is not strictly excluded that a quadrivariate joint probability distribution might exist having the bivariate probabilities of the standard Aspect experiments as marginals (hence different from the ones referred to above), the mathematical formalism of quantum mechanics does not give any reason to surmise its existence. As far as quantum mechanics is concerned, the standard Aspect experiments need not satisfy Bell's inequality.
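The claim that the four standard Aspect experiments yield distinct quadrivariate distributions can be illustrated numerically. The sketch below assumes that the product in (8) acts as a tensor product of the operators of the two arms, and it uses an illustrative entangled polarization state and illustrative angles; none of these specific choices is taken from the text.

```python
import itertools
import numpy as np

def projector(theta):
    """Projector onto linear polarization along angle theta (radians)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def arm_povm(gamma, theta, theta_prime):
    """Operators R_ij of one arm, as in (6); index 0 is '+', index 1 is '-'."""
    E, F, one = projector(theta), projector(theta_prime), np.eye(2)
    return [[np.zeros((2, 2)), gamma * E],
            [(1 - gamma) * F, gamma * (one - E) + (1 - gamma) * (one - F)]]

def quadrivariate(rho, gamma1, gamma2, angles1, angles2):
    """p[i, j, k, l] = Tr(rho R_ij (x) R_kl), following (7) and (8)."""
    R1, R2 = arm_povm(gamma1, *angles1), arm_povm(gamma2, *angles2)
    p = np.empty((2, 2, 2, 2))
    for i, j, k, l in itertools.product(range(2), repeat=4):
        p[i, j, k, l] = np.trace(rho @ np.kron(R1[i][j], R2[k][l])).real
    return p

# Illustrative entangled polarization state (|HH> + |VV>)/sqrt(2).
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(psi, psi)
angles1 = (0.0, np.pi / 4)            # assumed directions for arm 1
angles2 = (np.pi / 8, 3 * np.pi / 8)  # assumed directions for arm 2

# The four standard Aspect experiments: each gamma_n is either 0 or 1.
standard = {(g1, g2): quadrivariate(rho, g1, g2, angles1, angles2)
            for g1 in (0, 1) for g2 in (0, 1)}
```

Each of the four arrays is a normalized probability distribution, but they are supported on different outcome quadruples, so no two of them coincide.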
4 Bell's inequality in stochastic and deterministic hidden-variables theories
In stochastic hidden-variables theories quantum mechanical probabilities are usually given as
$$p(a_1) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda), \qquad (1)$$
in which $\Lambda$ is the space of hidden variable $\lambda$ (to be compared with classical phase space), $p(a_1|\lambda)$ is the conditional probability of measurement result $A_1 = a_1$ if the value of the hidden variable was $\lambda$, and $p(\lambda)$ is the probability of $\lambda$. It should be noticed that expression (1) fits perfectly into an empiricist interpretation of the quantum mechanical formalism, in which measurement result $a_1$ is referring to a pointer position of a measuring instrument, the object being described by the hidden variable. Since $p(a_1|\lambda)$ may depend on the specific way the measurement is carried out, the stochastic hidden-variables model corresponds to a contextualistic interpretation of quantum mechanical observables. Deterministic hidden-variables theories are just special cases in which $p(a_1|\lambda)$ is either 1 or 0. In the deterministic case it is possible to associate in a unique way (although possibly dependent on the measurement procedure) the value $a_1$ to the phase space point $\lambda$ the object is prepared in. A disadvantage of a deterministic theory is that the physical interaction of object and measuring instrument is left out of consideration, thus suggesting measurement result $a_1$ to be a (possibly contextually determined) property of the object. In order to have maximal generality it is preferable to deal with the stochastic case.
For Bell experiments we have
$$p(a_1, a_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, a_2|\lambda), \qquad (2)$$
with a condition of conditional statistical independence,
$$p(a_1, a_2|\lambda) = p(a_1|\lambda)\, p(a_2|\lambda), \qquad (3)$$
expressing that the measurement procedures of $A_1$ and $A_2$ do not influence each other (so-called locality condition).
As is well-known, the locality condition was thought by Bell to be the crucial condition allowing a derivation of his inequality. This does not seem to be correct, however. As a matter of fact, Bell's inequality can be derived if a quadrivariate joint probability distribution exists [12, 13]. In a stochastic hidden-variables theory such a distribution could be represented by
$$p(a_1, b_1, a_2, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1, b_1, a_2, b_2|\lambda), \qquad (4)$$
without any necessity that the conditional probability be factorizable in order that Bell's inequality be satisfied (although for the generalized experiment discussed in section 3 it would be reasonable to require that $p(a_1, b_1, a_2, b_2|\lambda) = p(a_1, b_1|\lambda)\, p(a_2, b_2|\lambda)$). Analogous to the quantum mechanical case, it is sufficient that for each individual preparation (here parameterized by $\lambda$) a quadruple of measurement results exists. If Heisenberg measurement disturbance is a physically realistic effect in the experiments at issue, it should be described by the hidden-variables theory as well. Therefore the explanation of the non-existence of such quadruples is the same as in quantum mechanics.
However, with respect to the possibility of deriving Bell's inequality there is an important difference between quantum mechanics and the stochastic hidden-variables theories of the kind discussed here. Whereas quantum mechanics does not yield any indication as regards the existence of a quadrivariate joint probability distribution returning the bivariate probabilities of the Aspect experiments as marginals, local stochastic hidden-variables theory does. Indeed, using the single-observable conditional probabilities assumed to exist in the local theory (compare (3)), it is possible to construct a quadrivariate joint probability distribution according to
$$p(a_1, a_2, b_1, b_2) = \int_\Lambda d\lambda\, p(\lambda)\, p(a_1|\lambda)\, p(a_2|\lambda)\, p(b_1|\lambda)\, p(b_2|\lambda), \qquad (5)$$
satisfying all requirements. It should be noted that (5) does not describe the results of any joint measurement of the four observables that are involved. Quadruples $(a_1, a_2, b_1, b_2)$ are obtained here by combining measurement results found in different experiments, assuming the same value of $\lambda$ in all experiments. For this reason the physical meaning of this probability distribution is not clear. However, this does not seem to be important. The existence of (5) as a purely mathematical constraint is sufficient to warrant that any stochastic hidden-variables theory in which (2) and (3) are satisfied must require that the standard Aspect experiments obey Bell's inequality. Admittedly, there is a possibility that (5) might not be a valid mathematical entity, because it is based on multiplication of the probability distributions $p(a|\lambda)$, which might be distributions in the sense of Schwartz's distribution theory. However, the remark made with respect to the existence of probability distributions as infinite-$N$ limits of relative frequencies is valid also here: the reasoning does not depend on this limit, but is equally applicable to relative frequencies in finite sequences.
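The construction can be made concrete with a small numerical sketch. It assumes a discrete hidden-variable space with randomly chosen $p(\lambda)$ and conditional probabilities, outcome values $\pm 1$, and the CHSH form of Bell's inequality; all of these are illustrative assumptions, since the text does not fix a particular model or form of the inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lambda = 50                                   # size of a hypothetical discrete hidden-variable space
p_lambda = rng.dirichlet(np.ones(n_lambda))     # p(lambda), sums to 1
# q[x, n] = p(x = +1 | lambda_n); rows ordered as (a1, a2, b1, b2)
q = rng.uniform(size=(4, n_lambda))

def conditional(x):
    """p(value | lambda) for observable x over the values (+1, -1); shape (2, n_lambda)."""
    return np.stack([q[x], 1.0 - q[x]])

# Quadrivariate distribution built as in (5): sum over lambda of
# p(lambda) p(a1|lambda) p(a2|lambda) p(b1|lambda) p(b2|lambda).
joint = np.einsum('n,an,bn,cn,dn->abcd', p_lambda,
                  conditional(0), conditional(1), conditional(2), conditional(3))

values = np.array([1.0, -1.0])

def correlation(x, y):
    """Expectation of the product of observables x and y from the bivariate marginal."""
    summed_out = tuple(ax for ax in range(4) if ax not in (x, y))
    marginal = joint.sum(axis=summed_out)
    return float(values @ marginal @ values)

A1, A2, B1, B2 = 0, 1, 2, 3
# CHSH combination of the four bivariate correlations (assumed form of Bell's inequality).
S = (correlation(A1, A2) + correlation(A1, B2)
     + correlation(B1, A2) - correlation(B1, B2))
```

Because all four correlations are marginals of one quadrivariate distribution, $|S| \leq 2$ holds for any choice of `p_lambda` and `q`. The bivariate marginal over $(a_1, a_2)$ also reproduces the local form given by (2) and (3).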
The question is whether this reasoning is sufficient to conclude that no local hidden-variables theory can reproduce quantum mechanics. Such a conclusion would only be justified if locality were the only assumption in deriving Bell's inequality. If there were any additional assumption in this derivation, then violation of Bell's inequality could possibly be blamed on the invalidity of this additional assumption rather than locality. Evidently, one such additional assumption is the existence of hidden variables. A belief in the completeness of the quantum mechanical formalism would indeed be a sufficient reason to reject this assumption, thus increasing pressure on the locality assumption. Since, however, an empiricist interpretation is hardly reconcilable with such a completeness belief, we have to take hidden-variables theories seriously and look for the possibility of additional assumptions within such theories.
In expression (1) one such assumption is evident, viz. the existence of the conditional probability $p(a_1|\lambda)$. The assumption of the applicability of this quantity in a quantum mechanical measurement is far less innocuous than appears at first sight. If quantum mechanical measurements really can be modeled by equality (1), this implies that a quantum mechanical measurement result is determined, either in a stochastic or in a deterministic sense, by an instantaneous value $\lambda$ of the hidden variable, prepared independently of the measurement to be performed later. It is questionable whether this is a realistic assumption, in particular if hidden variables would have the character of rapidly fluctuating stochastic variables. As a matter of fact, every individual quantum mechanical measurement takes a certain amount of time, and it will in general be virtually impossible to determine the precise instant to be taken as the initial time of the measurement, as well as the precise value of the stochastic variable at that moment. Hence, hidden-variables theories of the kind considered here may be too specific.
Because of the assumption of a non-contextual preparation of the hidden variable, such theories were called quasi-objectivistic stochastic hidden-variables theories in de Muynck and van Stekelenborg [17] (dependence of the conditional probabilities $p(a_1|\lambda)$ on the measurement procedure preventing complete objectivity of the theory). In the past, attention has mainly been restricted to quasi-objectivistic hidden-variables theories. It is questionable, however, whether the assumption of quasi-objectivity is a possible one for hidden-variables theories purporting to reproduce quantum mechanical measurement results. The existence of quadrivariate probability distribution (5) only excludes quasi-objectivistic local hidden-variables theories (either stochastic or deterministic) from the possibility of reproducing quantum mechanics. As will be seen in the next section, it is far more reasonable to blame quasi-objectivity than locality for this, thus leaving the possibility of local hidden-variables theories that are not quasi-objectivistic.
5 Analogy between thermodynamics and quantum mechanics
The essential feature of expression (1) of section 4 is the possibility to attribute, either in a stochastic or in a deterministic way, measurement result $a_1$ to an instantaneous value of hidden variable $\lambda$. The question is whether this is a reasonable assumption within the domain of quantum mechanical measurement. Are the conditional probabilities $p(a_1|\lambda)$ experimentally relevant within this domain? In order to give a tentative answer to this question we shall exploit the analogy between thermodynamics and quantum mechanics, considered already a long time ago by many authors (e.g. de Broglie [18], Bohm et al. [19, 20], Nelson [21, 22]).
Quantum mechanics $(A_1, A_2, B_1, B_2)$ → Hidden-variables theory $(\lambda)$
↕ ↕
Thermodynamics $(p, T, S)$ → Classical statistical mechanics $(\{q_n, p_n\})$

In this analogy thermodynamics and quantum mechanics are considered as phenomenological theories, to be reduced to more fundamental microscopic theories. The reduction of thermodynamics to classical statistical mechanics is thought to be analogous to a possible reduction of quantum mechanics to stochastic hidden-variables theory. Due to certain restrictions imposed on preparations and measurements within the domains of the phenomenological theories, their domains of application are thought to be contained in, but smaller than, the domains of the microscopic theories.
In order to assess the nature and the importance of such restrictions let us first look at thermodynamics. As is well-known (e.g. Hollinger and Zenzen [23]), thermodynamics is valid only under a condition of molecular chaos, assuring the existence of local equilibrium necessary for the ergodic hypothesis to be satisfied. Thermodynamics only describes measurements of quantities (like pressure, temperature, and entropy) being defined for such equilibrium states. From an operational point of view this implies that measurements within the domain of thermodynamics do not yield information on the object system valid for one particular instant of time, but it is time-averaged information, time averaging being replaced, under the ergodic hypothesis, by ensemble averaging. In the Gibbs theory this ensemble is represented by the canonical density function $Z^{-1} e^{-H(\{q_n, p_n\})/kT}$ on phase space. This state is called a macrostate, to be distinguished from the microstate $\{q_n, p_n\}$, representing the point in phase space the classical object is in at a certain instant of time.
The restricted validity of thermodynamics is manifest in a two-fold way: i) through the restriction of all possible density functions on phase space to the canonical ones; ii) through the restriction of thermodynamical quantities (observables) to functionals on the space of thermodynamic states. Physically, this can be interpreted as a restriction of the domain of application of thermodynamics to those measurement procedures probing only properties of the macrostates. This implies that such measurements only yield information that is averaged over times exceeding the relaxation time needed to reach a state of (local) equilibrium. Thus, it is important to note that thermodynamic quantities are quite different from the physical quantities of classical statistical mechanics, the latter ones being represented by functions of the microstate $\{q_n, p_n\}$, and hence referring to a particular instant of time [b]. Only if it were possible to perform measurements faster than the relaxation time would it be necessary to consider such non-thermodynamic quantities. Such measurements then are outside the domain of application of thermodynamics. Thus, if we have a cubic container containing a volume of gas in a microstate initially concentrated at its center, and if we could measure at a single instant of time either the total kinetic energy or the force exerted on the boundary of the container, then these results would not be equal to thermodynamic temperature and pressure [c], respectively, because this microstate is not an equilibrium state. Only after the gas has reached equilibrium within the volume defined by the container, (equilibrium) thermodynamics becomes applicable.

[a] In equilibrium thermodynamics equilibrium is assumed to be even global.
Within the domain of application of thermodynamics the microstate of the system may change appreciably without the macrostate being affected. Indeed, a macrostate is equivalent to an (ergodic) trajectory $\{q_n(t), p_n(t)\}_{\mathrm{ergodic}}$. We might exploit as follows the difference between micro- and macrostates for characterizing objectivity of a physical theory. Whereas the microstate is thought to yield an objective description of the (microscopic) object, the macrostate just describes certain phenomena to be attributed to the object system only while being observed under conditions valid within the domain of application of the theory. In this sense classical mechanics is an objective theory, all quantities being instantaneous properties of the microstate. Thermodynamic quantities, only being attributable to the macrostate (i.e. to an ergodic trajectory), cannot be seen, however, as properties belonging to the object at a certain instant of time. Of course, we might attribute the thermodynamic quantity to the event in space-time represented by the trajectory, but it should be realized that this event is not determined solely by the preparation of the microstate, but is determined as well by the macroscopic arrangement serving
to define the macrostate.

[b] Note that a definition of an instantaneous temperature by means of the equality $\frac{3}{2} n k T = \sum_i p_i^2 / 2m_i$ does not make sense, as can easily be seen by applying this definition to an ideal gas in a container freely falling in a gravitational field.
[c] Thermodynamic pressure is defined for the canonical ensemble by $p = kT\, \partial \log Z / \partial V$.

Figure 4. Incompatible thermodynamic arrangements.

In order to illustrate this, consider two identical cubic containers differing
only in their orientations (cf. figure 4). In principle the same microstate may be prepared in the two containers. Because of the different orientations, however, the macrostates evolving from this microstate during the time the gas is reaching equilibrium with the container are different (for different orientations of the container we have $H_1 \neq H_2$, and hence $e^{-H_1/kT}/Z_1 \neq e^{-H_2/kT}/Z_2$, since $H = T + V$ and $V_1 \neq V_2$ because potential energy is infinite outside a container). This implies that thermodynamic macrostates may be different even though starting from the same microstate. Macrostates in thermodynamics have a contextual meaning. It is important to note that, since the container is part of the preparing apparatus, this contextuality is connected here to preparation rather than to measurement. Consequently, whereas classical quantities $f(\{q_n, p_n\})$ can be interpreted as objective properties, thermodynamic quantities are non-objective, the non-objectivity being of a contextual nature.
Let us now suppose that quantum mechanics is related to hidden-variables theory analogously to the way thermodynamics is related to classical mechanics, the analogy perhaps being even closer for non-equilibrium thermodynamics (only local equilibrium being assumed) than for the thermodynamics of global equilibrium processes. Support for this idea was found in de Muynck and van Stekelenborg [17], where it was demonstrated that in the Husimi representation of quantum mechanics by means of non-negative probability distribution functions on phase space an analogous restriction to a canonical set of distributions obtains as in thermodynamics. In particular, it was demonstrated that the dispersionfree states $\rho(q, p) = \delta(q - q_0)\, \delta(p - p_0)$ are not canonical in this sense. This implies that within the domain of quantum mechanics it does not make sense to consider the preparation of the object in a microstate with a well-defined value of the hidden variables $(q, p)$.
In the analogy quantum mechanical observables like $A_1$, $A_2$, $B_1$, $B_2$ should be compared to thermodynamic quantities like pressure, temperature, and entropy. The central issue in the analogy is the fact that thermodynamic quantities like pressure and temperature cannot be conditioned on the instantaneous phase space variable $\{q_n, p_n\}$ (microstate). Expressions like $p(\{q_n, p_n\})$ and $T(\{q_n, p_n\})$ are meaningless within thermodynamics. Thermodynamic quantities are conditioned on macrostates, corresponding to ergodic paths in phase space. Analogously, a quantum mechanical observable might not correspond to an instantaneous property of the object, but might have to be associated with an (ergodic) path in hidden-variables space $\Lambda$ (macrostate) rather than with an instantaneous value $\lambda$ (microstate).
On the basis of the analogy between thermodynamics and quantum mechanics it is possible to state the following conjectures:
• Quantum mechanical measurements (analogous to thermodynamic measurements) do not probe microstates but macrostates.
• Quantum mechanical quantities (analogous to thermodynamic quantities) should be conditioned on macrostates.
A hidden-variables macrostate will be symbolically indicated by $\lambda^t$. For quantum mechanical measurements the conditional probabilities $p(a_1|\lambda)$ of (1) of section 4 should then be replaced by $p(a_1|\lambda^t)$. Concomitantly, quantum mechanical probabilities should be represented in the hidden-variables theory by a functional integral
$$p(a_1) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t), \qquad (1)$$
in which the integration is over all possible macrostates consistent with the preparation procedure.
By itself, conditioning of quantum mechanical observables on macrostates rather than microstates is not sufficient to prevent derivation of Bell's inequality. As a matter of fact, on the basis of expression (1) a quadrivariate joint probability distribution can be defined, analogous to (5) of section 4, according to
$$p(a_1, a_2, b_1, b_2) = \int d\lambda^t\, p(\lambda^t)\, p(a_1|\lambda^t)\, p(a_2|\lambda^t)\, p(b_1|\lambda^t)\, p(b_2|\lambda^t), \qquad (2)$$
from which Bell's inequality can be derived just as well. There is, however, one important aspect that up till now has not sufficiently been taken into account, viz. contextuality. In the construction of (2) it is assumed that the macrostate $\lambda^t$ is applicable in each of the measurement arrangements of observables $A_1$, $A_2$, $B_1$ and $B_2$. Because of the incompatibility of some of these observables this is an implausible assumption. On the basis of the thermodynamic analogy it is to be expected that macrostates $\lambda^t$ will depend on the measurement context of a specific observable. Since $[A_1, B_1]_- \neq O$ we will have
$$\lambda^t_{A_1} \neq \lambda^t_{B_1}, \qquad (3)$$
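and analogously for $A_2$ and $B_2$. Then for the Bell experiments measuring the pairs $(A_1, A_2)$ and $(A_1, B_2)$, respectively, we have
$$p(a_1, a_2) = \int d\lambda^t_{A_1 A_2}\, p(\lambda^t_{A_1 A_2})\, p(a_1|\lambda^t_{A_1 A_2})\, p(a_2|\lambda^t_{A_1 A_2}), \qquad (4)$$
$$p(a_1, b_2) = \int d\lambda^t_{A_1 B_2}\, p(\lambda^t_{A_1 B_2})\, p(a_1|\lambda^t_{A_1 B_2})\, p(b_2|\lambda^t_{A_1 B_2}). \qquad (5)$$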
Now the contextuality expressed by inequality (3) prevents the construction of a quadrivariate joint probability distribution analogous to (2). Hence, like in the quantum mechanical approach, also in the local non-objectivistic hidden-variables theory a derivation of Bell's inequality is prevented, due to the local contextuality involved in the interaction of the particle and the measuring instrument it is directly interacting with.
6 Conclusions
Our conclusion is that if quantum mechanical measurements do probe macrostates $\lambda^t$ rather than microstates $\lambda$, then Bell's inequality cannot be derived for quantum mechanical measurements. Both in quantum mechanics and in hidden-variables theories Bell's inequality is a consequence of the assumption that the theory is yielding an objective description of reality, in the sense that the preparation of the microscopic object, as far as relevant to the realization of the measurement result, can be thought to be independent of the measurement arrangement. The important point to be noticed is that, although in Bell experiments the preparation of the particle pair at the source (i.e. the microstate) can be considered to be independent of the measurement procedures to be carried out later (and hence one and the same microstate can be assumed in different Bell experiments), the measurement result is only determined by the macrostate, which is co-determined by the interaction with the measuring instruments. It really seems that the Copenhagen maxim of the impossibility of attributing quantum mechanical measurement results to the object as objective properties, possessed independently of the measurement, should be taken very seriously, and implemented also in hidden-variables theories purporting to reproduce the quantum mechanical results. The quantum mechanical die is only cast after the object has been interacting with the measuring instrument, even though its result can be deterministically determined by the (sub-quantum mechanical) microstate.
The thermodynamic analogy suggests which experiments could be done in order to transcend the boundaries of the domain of application of quantum mechanics. If it were possible to perform experiments that probe the microstate $\lambda$ rather than the macrostate $\lambda^t$, then we would be in the domain of (quasi-)objectivistic hidden-variables theories. Because of (5) of section 4 it then is to be expected that Bell's inequality should be satisfied for such experiments. In such experiments preparation and measurement must be completed well within the relaxation time of the microstates. Such times have been estimated by Bohm [24], for the sake of illustration, as the time light needs to cover a distance of the order of the size of an atom ($10^{-18}$ s, say). If this is correct, then all present-day experimentation is well within the range of quantum mechanics, thus explaining the seemingly universal applicability of this latter theory. In hindsight this would explain why Aspect's switching experiment is corroborating quantum mechanics: the applied switching frequency (50 MHz), although sufficient to warrant locality, has been far too low to beat the local relaxation processes in each of the measuring instruments separately.
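The orders of magnitude involved can be made explicit with a short calculation. This is only a rough sketch: the atomic size of $10^{-10}$ m is an assumed round figure, not a value given in the text, so the resulting light-crossing time agrees with the quoted $10^{-18}$ s only to within an order of magnitude.

```python
SPEED_OF_LIGHT = 2.998e8      # m/s
ATOM_SIZE = 1e-10             # m, assumed order of magnitude of an atomic diameter
SWITCHING_FREQUENCY = 50e6    # Hz, the switching frequency quoted for Aspect's experiment

# Time light needs to cross a distance of atomic size.
light_crossing_time = ATOM_SIZE / SPEED_OF_LIGHT   # seconds

# Duration of one switching cycle.
switching_period = 1.0 / SWITCHING_FREQUENCY       # seconds

# How many relaxation times fit into a single switching cycle.
ratio = switching_period / light_crossing_time
```

Under these assumptions one switching cycle lasts more than ten orders of magnitude longer than the estimated relaxation time, which is the sense in which the switching is far too slow to probe microstates.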
It has often been felt that the most surprising feature of Bell experiments is the possibility (in certain states) of a strict correlation between the measurement results of the two measured observables, without being able to attribute this to a previous preparation of the object (no elements of physical reality). For many physicists the existence of such strict correlations has been reason enough to doubt Bohr's Copenhagen solution to renounce causal explanation of measurement results, and to replace determinism by complementarity. It seems that the urge for causal reasoning has been so strong that even within the Copenhagen interpretation a certain causality has been accepted, even a non-local one, in an EPR experiment (cf. figure 1) determining a measurement result for particle 2 by the measurement of particle 1. This, however, should rather be seen as an internal inconsistency of this interpretation, caused by a tendency to make the Copenhagen interpretation as realist as possible. In a consistent application of the Copenhagen interpretation to Bell experiments such experiments could be interpreted as measurements of bivariate correlation observables. The certainty of obtaining a certain (bivariate) eigenvalue of such an observable would not be more surprising than the certainty of obtaining a certain eigenvalue of a univariate one if the state vector is the corresponding eigenvector.
It is important to note that this latter interpretation of Bell experiments takes seriously the Copenhagen idea that quantum mechanics need not explain the specific measurement result found in an individual measurement. Indeed, in order to compare theory and experiment it would be sufficient that quantum mechanics just describe the relative frequencies found in such measurements. In this view quantum mechanics is just a phenomenological theory, describing (not explaining) observations in a way analogous to thermodynamics in its own domain of application. Explanations should be provided by more fundamental theories, describing the mechanisms behind the observable phenomena. Hence, the Copenhagen completeness thesis should be rejected (although this need not imply a return to determinism).
This approach has important consequences. One consequence is that the non-existence, within quantum mechanics, of elements of physical reality does not imply that elements of physical reality do not exist at all. They could be elements of the more fundamental theories. In section 5 it was discussed how an analogy between quantum mechanics and thermodynamics could be exploited to spell this out. Elements of physical reality could correspond to hidden-variables microstates $\lambda$. The determinism necessary to explain the strict correlations referred to above would be explained if, within a given measurement context, a microstate would define a unique macrostate $\lambda^t$. This demonstrates how it could be possible that quantum mechanical measurement results cannot be attributed to the object as properties possessed prior to measurement, and there yet is sufficient determinism to yield a local explanation of strict correlations of quantum mechanical measurement results in certain Bell experiments.
Another important aspect of a dissociation of phenomenological and fundamental aspects of measurement is the possibility of an empiricist interpretation of quantum mechanics. As demonstrated by the generalized Aspect experiment discussed in section 3, an empiricist approach needs a generalization of the mathematical formalism of quantum mechanics, in which an observable is represented by a POVM rather than by a projection-valued measure corresponding to a self-adjoint operator of the standard formalism. Such a generalization has been very important in assessing the meaning of Bell's inequality. In the major part of the literature of the past this subject has been dealt with on the basis of the (restricted) standard formalism. However, some conclusions drawn from the restricted formalism are not cogent when viewed in the generalized one (for instance, because von Neumann's projection postulate is not applicable in general). For this reason we must be very careful when accepting conclusions drawn from the standard formalism. This in particular holds true for the issue of non-locality.
References
1. W. Heisenberg, Zeitschr. f. Phys. 33, 879 (1925).
2. E. Schrödinger, Naturwissenschaften 23, 807, 823, 844 (1935) (English translation in Quantum Theory and Measurement, eds. J.A. Wheeler and W.H. Zurek (Princeton Univ. Press, 1983), p. 152).
3. W.M. de Muynck, Synthese 102, 293 (1995).
4. A. Einstein, B. Podolsky, and N. Rosen, Phys. Rev. 47, 777 (1935).
5. A. Aspect, P. Grangier, and G. Roger, Phys. Rev. Lett. 47, 460 (1981).
6. A. Aspect, J. Dalibard, and G. Roger, Phys. Rev. Lett. 49, 1804 (1982).
7. K.R. Popper, Quantum theory and the schism in physics (Rowman and Littlefield, Totowa, 1982).
8. M. Jammer, The philosophy of quantum mechanics (Wiley, New York, 1974).
9. N. Bohr, Phys. Rev. 48, 696 (1935).
10. J.S. Bell, Physics 1, 195 (1964).
11. H.P. Stapp, Phys. Rev. D 3, 1303 (1971); Il Nuovo Cim. 29B, 270 (1975).
12. A. Fine, Journ. Math. Phys. 23, 1306 (1982); Phys. Rev. Lett. 48, 291 (1982).
13. P. Rastall, Found. of Phys. 13, 555 (1983).
14. W.M. de Muynck, Phys. Lett. A 114, 65 (1986).
15. W.M. de Muynck, W. De Baere, and H. Martens, Found. of Phys. 24, 1589 (1994).
16. W.M. de Muynck, Found. of Phys. 30, 205 (2000).
17. W.M. de Muynck and J.T. van Stekelenborg, Ann. der Phys., 7. Folge, 45, 222 (1988).
18. L. de Broglie, La thermodynamique de la particule isolée (Gauthier-Villars, 1964); L. de Broglie, Diverses questions de mécanique et de thermodynamique classiques et relativistes (Springer-Verlag, 1995).
19. D. Bohm, Phys. Rev. 89, 458 (1953).
20. D. Bohm and J.-P. Vigier, Phys. Rev. 96, 208 (1954).
21. E. Nelson, Dynamical theories of Brownian motion (Princeton University Press, 1967).
22. E. Nelson, Quantum fluctuations (Princeton University Press, 1985).
23. H.B. Hollinger and M.J. Zenzen, The Nature of Irreversibility (D. Reidel Publishing Company, Dordrecht, 1985), sect. 4.4.
24. D. Bohm, Phys. Rev. 85, 166, 180 (1952).
115
DISCRETE HESSIANS IN STUDY OF QUANTUM STATISTICAL SYSTEMS: COMPLEX GINIBRE ENSEMBLE
M M DURAS
Institute of Physics Cracow University of Technology ulica Podchorazych 1 PL-30084 Cracow Poland
E-mail: mduras@riad.usk.pk.edu.pl
The Ginibre ensemble of nonhermitean random Hamiltonian matrices $K$ is considered. Each quantum system described by $K$ is a dissipative system, and the eigenenergies $Z_i$ of the Hamiltonian are complex-valued random variables. The second difference of complex eigenenergies is viewed as a discrete analog of the Hessian with respect to the labelling index. The results are considered in view of Wigner and Dyson's electrostatic analogy. An extension of the space of dynamics of random magnitudes is performed by introduction of the discrete space of labelling indices.
1 Introduction
Random Matrix Theory (RMT) studies quantum Hamiltonian operators $H$ which are random matrix variables. Their matrix elements $H_{ij}$ are independent random scalar variables [1-8]. Among others, the following Gaussian Random Matrix ensembles (GRME) have been studied: orthogonal (GOE), unitary (GUE), and symplectic (GSE), as well as the circular ensembles: orthogonal (COE), unitary (CUE), and symplectic (CSE). The choice of ensemble is based on the quantum symmetries ascribed to the Hamiltonian $H$. The Hamiltonian $H$ acts on the quantum space $V$ of eigenfunctions. It is assumed that $V$ is an $N$-dimensional Hilbert space $V = F^N$, where the real, complex, or quaternion field $F = \mathbf{R}, \mathbf{C}, \mathbf{H}$ corresponds to GOE, GUE, or GSE, respectively. If the Hamiltonian matrix $H$ is hermitean, $H = H^{\dagger}$, then the probability density function of $H$ reads:
$$ f_H(H) = C_{H\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(H^2)\right], \tag{1} $$
$$ C_{H\beta} = \left(\frac{\beta}{2\pi}\right)^{N_{H\beta}/2}, $$
$$ N_{H\beta} = N + \tfrac{1}{2} N(N-1)\beta, $$
$$ \int f_H(H)\, dH = 1, $$
$$ dH = \prod_{i=1}^{N} \prod_{j \ge i}^{N} \prod_{\gamma=0}^{D-1} dH_{ij}^{(\gamma)}, $$
$$ H_{ij} = (H_{ij}^{(0)}, \ldots, H_{ij}^{(D-1)}) \in F, $$
where the parameter $\beta$ assumes the values $\beta = 1, 2, 4$ for GOE($N$), GUE($N$), GSE($N$), respectively, and $N_{H\beta}$ is the number of independent matrix elements of the hermitean Hamiltonian $H$. The Hamiltonian $H$ belongs to the Lie group of hermitean $N \times N$ matrices, and the matrix Haar measure $dH$ is invariant under transformations from the unitary group U($N$, $F$). The eigenenergies $E_i$, $i = 1, \ldots, N$, of $H$ are real-valued random variables, $E_i = E_i^{*}$. It was Eugene Wigner who first dealt with the eigenenergy level repulsion phenomenon, studying nuclear spectra [1-3]. RMT is now applicable in many branches of physics: nuclear physics (slow neutron resonances, highly excited complex nuclei), condensed phase physics (fine metallic particles, random Ising model [spin glasses]), quantum chaos (quantum billiards, quantum dots), disordered mesoscopic systems (transport phenomena), quantum chromodynamics, quantum gravity, and field theory.
2 The Ginibre ensembles
Jean Ginibre considered another example of GRME, dropping the assumption of hermiticity of Hamiltonians and thus defining a generic $F$-valued Hamiltonian $K$ [1,2,9,10]. Hence, $K$ belongs to the general linear Lie group GL($N$, $F$), and the matrix Haar measure $dK$ is invariant under transformations from that group. The distribution of $K$ is given by:
$$ f_K(K) = C_{K\beta} \exp\left[-\beta \cdot \tfrac{1}{2} \cdot \mathrm{Tr}(K^{\dagger} K)\right], \tag{2} $$
$$ N_{K\beta} = N^2 \beta, $$
$$ \int f_K(K)\, dK = 1, $$
$$ dK = \prod_{i=1}^{N} \prod_{j=1}^{N} \prod_{\gamma=0}^{D-1} dK_{ij}^{(\gamma)}, $$
where $\beta = 1, 2, 4$ stands for the real, complex, and quaternion Ginibre ensembles, respectively. Therefore, the eigenenergies $Z_i$ of a quantum system ascribed to the Ginibre ensemble are complex-valued random variables. The eigenenergies $Z_i$, $i = 1, \ldots, N$, of the nonhermitean Hamiltonian $K$ are not real-valued random variables, $Z_i \neq Z_i^{*}$. Jean Ginibre postulated the following joint probability density function of the random vector of complex eigenvalues $Z_1, \ldots, Z_N$ for $N \times N$ Hamiltonian matrices $K$ for $\beta = 2$ [1,2,9,10]:
$$ P(z_1, \ldots, z_N) = \prod_{j=1}^{N} \frac{1}{\pi \cdot j!} \cdot \prod_{i<j}^{N} |z_i - z_j|^2 \cdot \exp\left(-\sum_{j=1}^{N} |z_j|^2\right), \tag{3} $$
where $z_i$ are complex-valued sample points ($z_i \in \mathbf{C}$).
We emphasize here Wigner and Dyson's electrostatic analogy. A Coulomb gas of $N$ unit charges moving on the complex plane (Gauss's plane) $\mathbf{C}$ is considered. The vectors of positions of charges are $z_i$, and the potential energy of the system is:
$$ U(z_1, \ldots, z_N) = -\sum_{i<j} \ln|z_i - z_j| + \frac{1}{2} \sum_{i} |z_i|^2. \tag{4} $$
If the gas is in thermodynamical equilibrium at temperature $T = \frac{1}{2 k_B}$ ($\beta = \frac{1}{k_B T} = 2$, $k_B$ is Boltzmann's constant), then the probability density function of the vectors of positions is $P(z_1, \ldots, z_N)$, Eq. (3). Therefore, the complex eigenenergies $Z_i$ of the quantum system are analogous to the vectors of positions of charges of the Coulomb gas. Moreover, the complex-valued spacings $\Delta^1 Z_i$ of complex eigenenergies of the quantum system,
$$ \Delta^1 Z_i = Z_{i+1} - Z_i, \quad i = 1, \ldots, (N-1), \tag{5} $$
are analogous to vectors of relative positions of electric charges. Finally, the complex-valued second differences $\Delta^2 Z_i$ of complex eigenenergies,
$$ \Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i, \quad i = 1, \ldots, (N-2), \tag{6} $$
are analogous to vectors of relative positions of vectors of relative positions of electric charges.
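The electrostatic analogy can be checked numerically: for $\beta = 2$ the Boltzmann weight $\exp(-\beta U)$ built from Eq. (4) should differ from the density of Eq. (3) only by a configuration-independent constant. The following is a minimal sketch under that reading; the function names and the random test configurations are illustrative assumptions, not part of the original text.

```python
import math
import random

def log_ginibre_density(z):
    """Logarithm of P(z_1, ..., z_N) from Eq. (3)."""
    n = len(z)
    # log of prod_j 1 / (pi * j!)
    log_norm = -sum(math.log(math.pi) + math.lgamma(j + 1) for j in range(1, n + 1))
    # log of prod_{i<j} |z_i - z_j|^2
    log_pairs = sum(
        2.0 * math.log(abs(z[i] - z[j]))
        for i in range(n) for j in range(i + 1, n)
    )
    return log_norm + log_pairs - sum(abs(w) ** 2 for w in z)

def coulomb_energy(z):
    """Potential energy U(z_1, ..., z_N) from Eq. (4)."""
    n = len(z)
    pair_term = -sum(
        math.log(abs(z[i] - z[j]))
        for i in range(n) for j in range(i + 1, n)
    )
    confinement = 0.5 * sum(abs(w) ** 2 for w in z)
    return pair_term + confinement

BETA = 2.0                      # inverse temperature of the analogy
rng = random.Random(0)          # fixed seed, arbitrary test configurations
offsets = []
for _ in range(3):
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]
    # log P + beta * U should be the same constant for every configuration
    offsets.append(log_ginibre_density(z) + BETA * coulomb_energy(z))
```

In this sketch the common offset is just the logarithm of the normalization constant in Eq. (3), so the two descriptions assign the same relative weights to all configurations.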
The eigenenergies $Z_i = Z(i)$ can be treated as values of a function $Z$ of the discrete parameter $i = 1, \ldots, N$. The Jacobian of $Z_i$ reads:
$$ \mathrm{Jac}\, Z_i = \frac{\partial Z_i}{\partial i} \simeq \frac{\Delta^1 Z_i}{\Delta^1 i} = \Delta^1 Z_i. \tag{7} $$
We readily have that the spacing is a discrete analog of the Jacobian, since the indexing parameter $i$ belongs to the discrete space of indices $i \in I = \{1, \ldots, N\}$. Therefore, the first derivative with respect to $i$ reduces to the first differential quotient. The Hessian is a Jacobian applied to the Jacobian. We immediately have the formula for the discrete Hessian of the eigenenergies $Z_i$:
$$ \mathrm{Hess}\, Z_i = \frac{\partial^2 Z_i}{\partial i^2} \simeq \frac{\Delta^2 Z_i}{(\Delta^1 i)^2} = \Delta^2 Z_i. \tag{8} $$
Thus, the second difference of $Z$ is a discrete analog of the Hessian of $Z$. One emphasizes that both the Jacobian and the Hessian work on the discrete space of indices $i$. The finite differences of order higher than two are discrete analogs of compositions of Jacobians with Hessians of $Z$.
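The discrete Jacobian and Hessian of Eqs. (5)-(8) can be sketched in a few lines. The helper names and the sample sequence $Z_i = i^2 + 3i\sqrt{-1}$ below are illustrative assumptions chosen so that the second difference is constant; they do not come from the original text.

```python
def first_differences(values):
    """Discrete Jacobian: Delta^1 Z_i = Z_{i+1} - Z_i, cf. Eqs. (5) and (7)."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]

def second_differences(values):
    """Discrete Hessian: Delta^2 Z_i = Z_{i+2} - 2 Z_{i+1} + Z_i, cf. Eqs. (6) and (8)."""
    return [
        values[i + 2] - 2 * values[i + 1] + values[i]
        for i in range(len(values) - 2)
    ]

# Hypothetical complex "levels" Z_i = i^2 + 3i*sqrt(-1) for i = 1..6.
levels = [complex(i * i, 3 * i) for i in range(1, 7)]
d1 = first_differences(levels)    # N - 1 complex spacings
d2 = second_differences(levels)   # N - 2 complex second differences
```

Applying `first_differences` twice gives the same list as `second_differences`, which is the statement that the Hessian is a Jacobian applied to the Jacobian.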
The eigenenergies $E_i$, $i \in I$, of the hermitean Hamiltonian $H$ are ordered increasingly real-valued random variables. They are values of the discrete function $E_i = E(i)$. The first difference of adjacent eigenenergies is:
$$ \Delta^1 E_i = E_{i+1} - E_i, \quad i = 1, \ldots, (N-1). \tag{9} $$
The first differences are analogous to vectors of relative positions of electric charges of a one-dimensional Coulomb gas. Each is simply the spacing of two adjacent energies. The real-valued second differences $\Delta^2 E_i$ of eigenenergies,
$$ \Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{10} $$
are analogous to vectors of relative positions of vectors of relative positions of charges of a one-dimensional Coulomb gas. The $\Delta^2 Z_i$ have their real parts $\mathrm{Re}\, \Delta^2 Z_i$ and imaginary parts $\mathrm{Im}\, \Delta^2 Z_i$, as well as radii (moduli) $|\Delta^2 Z_i|$ and main arguments (angles) $\mathrm{Arg}\, \Delta^2 Z_i$. The $\Delta^2 Z_i$ are extensions of the real-valued second differences
$$ \Delta^2 E_i = E_{i+2} - 2 E_{i+1} + E_i, \quad i = 1, \ldots, (N-2), \tag{11} $$
of adjacent, ordered increasingly, real-valued eigenenergies $E_i$ of the Hamiltonian $H$ defined for GOE, GUE, GSE, and the Poisson ensemble PE (where the Poisson ensemble is composed of uncorrelated randomly distributed eigenenergies) [11-15]. The Jacobian and Hessian operators of the energy function $E(i) = E_i$ for these ensembles read:
$$ \mathrm{Jac}\, E_i = \frac{\partial E_i}{\partial i} \simeq \frac{\Delta^1 E_i}{\Delta^1 i} = \Delta^1 E_i, \tag{12} $$
and
$$ \mathrm{Hess}\, E_i = \frac{\partial^2 E_i}{\partial i^2} \simeq \frac{\Delta^2 E_i}{(\Delta^1 i)^2} = \Delta^2 E_i. \tag{13} $$
The treatment of first and second differences of eigenenergies as discrete analogs of Jacobians and Hessians allows one to consider these eigenenergies as magnitudes with statistical properties studied in the discrete space of indices. The labelling index $i$ of the eigenenergies is an additional variable of motion; hence the space of indices $I$ augments the space of dynamics of random magnitudes.
Acknowledgements
It is my pleasure to most deeply thank Professor Antoni Ostoja-Gajewski for continuous help. I also thank Professor Wlodzimierz Wojcik for giving me access to computer facilities.
References
1. F. Haake, Quantum Signatures of Chaos (Springer-Verlag, Berlin Heidelberg New York, 1990), Chapters 1, 3, 4, 8, pp. 1-11, 33-77, 202-213.
2. T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, Phys. Rept. 299, 189-425 (1998).
3. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapters 1, 2, 9, pp. 1-54, 182-193.
4. L. E. Reichl, The Transition to Chaos In Conservative Classical Systems: Quantum Manifestations (Springer-Verlag, New York, 1992), Chapter 6, p. 248.
5. O. Bohigas, in Proceedings of the Les Houches Summer School on Chaos and Quantum Physics (North-Holland, Amsterdam, 1991), p. 89.
6. C. E. Porter, Statistical Theories of Spectra: Fluctuations (Academic Press, New York, 1965).
7. T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey, and S. S. M. Wong, Rev. Mod. Phys. 53, 385 (1981).
8. C. W. J. Beenakker, Rev. Mod. Phys. 69, 731 (1997).
9. J. Ginibre, J. Math. Phys. 6, 440 (1965).
10. M. L. Mehta, Random matrices (Academic Press, Boston, 1990), Chapter 15, pp. 294-310.
11. M. M. Duras and K. Sokalski, Phys. Rev. E 54, 3142 (1996).
12. M. M. Duras, Finite difference and finite element distributions in statistical theory of energy levels in quantum systems (PhD thesis, Jagellonian University, Cracow, 1996).
13. M. M. Duras and K. Sokalski, Physica D 125, 260 (1999).
14. M. M. Duras, Description of Quantum Systems by Random Matrix Ensembles of Large Dimensions, in Proceedings of the Sixth International Conference on Squeezed States and Uncertainty Relations, 24 May-29 May 1999, Naples, Italy (NASA, Greenbelt, Maryland, at press, 2000).
15. M. M. Duras, J. Opt. B: Quantum Semiclass. Opt. 2, 287 (2000).
SOME REMARKS ON HARDY FUNCTIONS ASSOCIATED WITH DIRICHLET SERIES
W. EHM
Institut für Grenzgebiete der Psychologie und Psychohygiene, Wilhelmstrasse 3a, 79098 Freiburg, Germany
E-mail: ehm@igpp.de
A simple method of associating a Hardy function with a Dirichlet series is described and applied to some examples connected with the Riemann zeta function. The theory of Hardy functions is then used to derive integral tests of the Riemann hypothesis, generalizing a recent result of Balazard, Saias, and Yor [1].
1 Introduction
The most famous example of a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ converging absolutely in the half plane $\Re z > 1$ is the Riemann zeta function $\zeta(z)$, which has all coefficients $a_n = 1$. It has a simple pole at $z = 1$ and can be extended as a meromorphic function with no other singularities to the whole complex plane [6].
A simple method of associating a Hardy function with a Dirichlet series of that kind consists in multiplying $f(z)$ by $(z-1)/z^2$: the factor $z - 1$ removes the pole at $z = 1$, and the division by $z^2$ achieves square integrability along vertical lines. Moreover, the zeros of $f(z)$ remain unchanged by this modification. The motivation for passing from $f(z)$ to $f(z)(z-1)/z^2$ is to utilize the theory of Hardy functions, especially factorization of Hardy functions, for the study of the zeta function.
In section 2 of this note we give conditions under which the function $f(z)(z-1)/z^2$ has an analytic continuation as a Hardy function beyond the abscissa of convergence of the Dirichlet series $f(z)$. The criterion is tested on three examples, all related to the Riemann zeta function. Factorization of the Hardy function $\zeta(z)(z-1)/z^2$, which is briefly discussed in section 3, is used in section 4 to derive some integral tests of the Riemann hypothesis. The content of the Riemann hypothesis, hereafter abbreviated RH, is Riemann's yet unproven conjecture that all non-real zeros of the $\zeta$ function lie on the line $\Re z = 1/2$ in the complex plane. It has received increasing interest among physicists since the discovery of striking similarities in the distribution of the zeros of the zeta function and the spectrum of large random matrices [2].
The idea to utilize Hardy functions in connection with the zeta function, including integral tests of the Riemann hypothesis, is not new. See the recent article of Balazard, Saias, and Yor [1], who initially work with Hardy functions in the disc, then pass to the half plane $\Re z > 1/2$ by conformal mapping. In our approach, based on the function $\zeta(z)(z-1)/z^2$, which also appears in recent work of Burnol [4], we deal with half plane Hardy functions from the beginning. This leads to somewhat more general results in a natural fashion.
2 Hardyfication of Dirichlet series
The basic result of this section is the following
Theorem. Given a Dirichlet series $f(z) = \sum_{n=1}^{\infty} a_n n^{-z}$ with a finite abscissa of convergence, let functions $A$ and $\phi$ be defined by
$$ A(x) = \sum_{1 \le n \le x} a_n, \qquad \phi(x) = \sum_{1 \le n \le e^x} a_n (1 - x + \log n) \quad (x \in \mathbf{R}). \tag{1} $$
Suppose that $A(x) = O(x)$ as $x \to \infty$, and let
$$ \lambda = \limsup_{N \to \infty} \frac{\log^{+} |D_N|}{\log N}, \quad \text{where} \quad D_N = A(N) - \sum_{n=1}^{N-1} \frac{A(n)}{n}. \tag{2} $$
Then the function $f(z)(z-1)/z^2$ can be represented as the Laplace transform of $\phi(x)$ in the half plane $\Re z > \lambda$:
$$ f(z)(z-1)/z^2 = \int_0^{\infty} e^{-zx} \phi(x)\, dx \quad (\Re z > \lambda). \tag{3} $$
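As a formal plausibility check of Eq. (3), added here as a sketch and not as part of the proof, one may take the Laplace transform of a single term of $\phi$. The term with index $n$ contributes only for $x \ge \log n$, and the substitution $y = x - \log n$ gives, for $\Re z > 0$:

```latex
\int_{\log n}^{\infty} e^{-zx}\,(1 - x + \log n)\,dx
  = n^{-z}\int_{0}^{\infty} e^{-zy}\,(1 - y)\,dy
  = n^{-z}\Bigl(\frac{1}{z} - \frac{1}{z^{2}}\Bigr)
  = n^{-z}\,\frac{z-1}{z^{2}} .
```

Multiplying by $a_n$ and summing over $n$ reproduces $f(z)(z-1)/z^2$ wherever the interchange of sum and integral is justified, for instance in the half plane of absolute convergence. The theorem extends this identity to the half plane $\Re z > \lambda$.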
Proof. Fix an integer $N \ge 1$ and let $\log N \le x < \log(N+1)$. Then
$$ |\phi(x) - \phi(\log N)| = (x - \log N)\, |A(N)| \le |A(N)| \log\frac{N+1}{N} = O(1) $$
as $N \to \infty$, by the assumed growth behavior of $A(x)$. Combining this with
$$ \phi(\log(n+1)) - \phi(\log n) = a_{n+1} - A(n) \log\frac{n+1}{n} = a_{n+1} - A(n)/n + O(n^{-1}) $$
we get, for $N = [e^x] \to \infty$,
$$ \phi(x) = \phi(0) + \sum_{n=1}^{N-1} \left[\phi(\log(n+1)) - \phi(\log n)\right] + O(1) $$
$$ = a_1 + \sum_{n=1}^{N-1} \left[a_{n+1} - A(n)/n + O(n^{-1})\right] + O(1) = D_N + O(\log N), $$
and thus, for every $\epsilon > 0$, $\phi(x) = O(e^{(\lambda+\epsilon)x})$, $x \to \infty$, by the definition of $\lambda$. Since $\phi$ vanishes on the left half line, it follows that the integral on the right-hand side of (3) converges absolutely in the half plane $\Re z > \lambda$. It remains to show that this Laplace transform coincides with $f(z)(z-1)/z^2$ in the half plane $\Re z > \sigma_a$, where $\sigma_a$ denotes the abscissa of absolute convergence of $f(z)$.
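To that end, let us write $\eta(z) = f(z)(z-1)/z^2$ and introduce the truncated versions
$$ f_N(z) = \sum_{n=1}^{N} a_n n^{-z}, \qquad \eta_N(z) = f_N(z)(z-1)/z^2, \qquad \phi_N(x) = \sum_{1 \le n \le \min(N, e^x)} a_n (1 - x + \log n), $$
$N \ge 1$, and set $h_{\sigma,N}(x) = e^{-\sigma x} \phi_N(x)$. Using
$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{itx}}{(\sigma + it)^q}\, dt = \begin{cases} e^{-\sigma x} x^{q-1}/(q-1)! & \text{if } x > 0, \\ 0 & \text{if } x < 0 \end{cases} $$
(for every integer $q \ge 1$, $\sigma > 0$) we get, for fixed $\sigma > \sigma_a$,
$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \eta_N(\sigma + it)\, dt \tag{4} $$
$$ = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{itx} \sum_{n=1}^{N} a_n n^{-\sigma - it}\, \frac{\sigma + it - 1}{(\sigma + it)^2}\, dt $$
$$ = \sum_{n=1}^{N} a_n n^{-\sigma}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{it(x - \log n)} \left[\frac{1}{\sigma + it} - \frac{1}{(\sigma + it)^2}\right] dt $$
$$ = \sum_{1 \le n \le \min(N, e^x)} a_n n^{-\sigma} e^{-\sigma(x - \log n)} \left(1 - (x - \log n)\right) = h_{\sigma,N}(x) $$
almost everywhere in $x \in \mathbf{R}$, the Fourier integrals being understood in the $L^2$ sense. Note that $\eta(z)$ is square integrable along every line $\Re z = \sigma$ with $\sigma > \sigma_a$. Clearly $\eta_N(\sigma + it)$ converges to $\eta(\sigma + it)$ in $L^2(dt)$, so $h_{\sigma,N}$ is a Cauchy sequence in $L^2(dx)$ by Parseval's formula. The pointwise limit $h_\sigma(x)$ of $h_{\sigma,N}(x)$ then also is the $L^2(dx)$ limit, so that by (4), $h_\sigma(x)$ and $\eta(\sigma + it)$ represent a Fourier transform pair for every $\sigma > \sigma_a$. Therefore
$$ \eta(\sigma + it) = \hat{h}_\sigma(t) = \int_0^{\infty} h_\sigma(x) e^{-ixt}\, dx = \int_0^{\infty} e^{-(\sigma + it)x} \phi(x)\, dx \tag{5} $$
holds almost everywhere in $t$ ($\sigma > \sigma_a$), hence everywhere in $\Re z > \sigma_a$ by continuity. This shows that the Laplace transform of $\phi$ represents the analytic continuation of $\eta$ to the region $\Re z > \lambda$, completing the proof.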
Let $\mathcal{H}^2_\sigma$ denote the Hardy space consisting of all functions $g(z)$ which are analytic for $\Re z > \sigma$ and such that $\sup_{\sigma' > \sigma} \int_{-\infty}^{\infty} |g(\sigma' + it)|^2\, dt < \infty$. The growth behavior of $\phi(x)$ established in the proof implies $h_\sigma \in L^2$ for every $\sigma > \lambda$, so that by (5) and Parseval's formula we obtain the following.
Corollary. Under the conditions of the theorem, the function $f(z)(z-1)/z^2$ belongs to every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > \lambda$.
Example 1. Let $a_n = 1$ for all $n$, that is, $f(z) = \zeta(z)$. Then $D_N = 1$, $N \ge 1$, so that $\lambda = 0$. A more careful analysis shows that $\phi(x)$ is nonnegative and grows linearly as $x$ tends to infinity. Consequently, $\zeta(z)(z-1)/z^2$ is a member of every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > 0$, but not of $\mathcal{H}^2_0$. The nonnegativity allows one to associate with $\phi$ an exponential family $\mathcal{P} = \{p_\sigma, \sigma > 0\}$ of probability densities with support $[0, \infty)$ by setting
$$ p_\sigma(x) = h_\sigma(x)/\eta(\sigma) = \phi(x) e^{-\sigma x}/\eta(\sigma) \quad (x \in \mathbf{R},\ \sigma > 0). \tag{6} $$
The function $\zeta(z)(z-1)/z^2$ was also considered by Burnol [4] in connection with a closure problem in function space known as the Nyman-Beurling real variable form of the Riemann hypothesis.
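The quantities $A$, $D_N$ and $\phi$ of Eqs. (1) and (2) are easy to tabulate for Example 1. The sketch below assumes a finite list of coefficients (so `phi` is really the truncated $\phi_N$ once $e^x$ exceeds the list length); all function names are illustrative and not taken from the original.

```python
import math

def d_sequence(coeffs):
    """D_N = A(N) - sum_{n=1}^{N-1} A(n)/n for N = 1..len(coeffs); coeffs[n-1] = a_n."""
    result = []
    partial_sum = 0.0   # A(N)
    running = 0.0       # sum_{n < N} A(n)/n
    for N, a in enumerate(coeffs, start=1):
        partial_sum += a
        result.append(partial_sum - running)
        running += partial_sum / N
    return result

def phi(x, coeffs):
    """phi(x) = sum_{1 <= n <= e^x} a_n (1 - x + log n), limited to the given coefficients."""
    upper = min(len(coeffs), int(math.floor(math.exp(x))))
    return sum(coeffs[n - 1] * (1.0 - x + math.log(n)) for n in range(1, upper + 1))

# Example 1: a_n = 1 for all n, i.e. f = zeta.
zeta_coeffs = [1.0] * 2000
D = d_sequence(zeta_coeffs)
# Sample phi on a grid that stays below log(2000), where no truncation occurs.
phi_samples = [phi(0.01 * k, zeta_coeffs) for k in range(751)]
```

For these coefficients every $D_N$ equals 1, in line with $\lambda = 0$, and the sampled values of $\phi$ stay positive on the grid, consistent with the nonnegativity claimed in Example 1.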
It may be interesting to note here that although $h_\sigma$ is square integrable for every $\sigma > 0$, it is not true that $h_{\sigma,N} \to h_\sigma$ in $L^2$ if $\sigma < 1$. In fact, we have
$$ \liminf_{N \to \infty} \|h_{\sigma,N} - h_\sigma\|_2 > 0, \quad 0 < \sigma < 1. \tag{7} $$
Proof. Note first that for $x > \log N \to \infty$,
$$ \phi(x) - \phi_N(x) = \sum_{N < n \le e^x} (1 - x + \log n) = (1 - x)([e^x] - N) + \log([e^x]!) - \log N! \tag{8} $$
$$ = (1 - x)([e^x] - N) + ([e^x] + \tfrac{1}{2}) \log[e^x] - [e^x] - (N + \tfrac{1}{2}) \log N + N + O(1) $$
$$ = (N + \tfrac{1}{2})(\log[e^x] - \log N) + ([e^x] - N)(\log[e^x] - x) + O(1) $$
$$ = (N + \tfrac{1}{2})(x - \log N) + O(1) $$
on using Stirling's formula and the inequalities $0 \le x - \log[e^x] \le 2e^{-x}$ ($x \ge 0$). The estimate (8) shows that there exists a finite constant $B > 0$ such that $\phi(x) - \phi_N(x) \ge N(x - \log N)$ for all large $N$ and $x \ge B + \log N$. Therefore
$$ \|h_{\sigma,N} - h_\sigma\|_2^2 \ge \int_{B + \log N}^{\infty} (\phi(x) - \phi_N(x))^2 e^{-2\sigma x}\, dx \ge N^2 \int_{B + \log N}^{\infty} (x - \log N)^2 e^{-2\sigma x}\, dx = N^{2 - 2\sigma} \int_{B}^{\infty} y^2 e^{-2\sigma y}\, dy $$
for all large $N$, and assertion (7) follows.
Example 2. Let $f(z) = \sum_p p^{-z} \log p$, where the sum extends over all prime numbers. This example is related to the logarithmic derivative of the zeta function, as may be seen from the product representation $\zeta(z) = \prod_p (1 - p^{-z})^{-1}$. For $\Re z > 1$,
$$ -\frac{\zeta'(z)}{\zeta(z)} = \sum_p \frac{\log p}{p^z - 1} = \sum_p \frac{\log p}{p^z} + \sum_p \frac{\log p}{p^z (p^z - 1)}, $$
and since the last series converges for $\Re z > 1/2$, it suffices to consider $f(z)$ as far as the analytic continuation of $\zeta'(z)/\zeta(z)$ is concerned.
The series $f(z)$ would have convergence abscissa $1/2$, implying the RH, if the associated sequence $D_N$ satisfied condition (2) with $\lambda = 1/2$. For a numerical check we computed $D_N$ for $N$ up to 5 million. A plot of $\log^{+}|D_N| / \log N$ versus $\log N$ (thinned out to every 200th data point; the general picture is not affected thereby) is shown in Figure 1 (a). Within the considered range the observed behavior is well in accordance with a possible value of $\lambda = 1/2$. Notice the obvious connection with the classical criterion saying that the RH is equivalent to the error estimate $\sum_{p \le x} \log p - x = O(x^{1/2 + \epsilon})$ ($\forall\, \epsilon > 0$) in the prime number theorem (Edwards [6], Sect. 5.5). Incidentally, $\phi(x)$ seems to be nonnegative in this case, too, as a plot of $\phi(x)$ for small $x$-values indicates.
Example 3. Let $f(z) = 1/\zeta(z) = \sum_{n=1}^{\infty} \mu(n) n^{-z}$, with $\mu$ the Möbius function. It is well known that the RH is equivalent to the condition $A(N) = \sum_{n \le N} \mu(n) = O(N^{1/2 + \epsilon})$ (for every $\epsilon > 0$), that is, to $\lambda = 1/2$. The analogous plot for this case is shown in Figure 1 (b), with similar findings.
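The numerical check described in Examples 2 and 3 can be imitated on a much smaller scale. The sketch below computes the criterion $\log^{+}|D_N| / \log N$ for the Möbius coefficients of Example 3 up to a modest, arbitrarily chosen limit of $10^4$ (the original computation went up to 5 million); the sieve and helper names are illustrative assumptions.

```python
import math

def mobius_sieve(limit):
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= limit (mu[0] unused)."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(2 * p, limit + 1, p):
                is_prime[m] = False
            for m in range(p, limit + 1, p):
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not squarefree
    mu[0] = 0
    return mu

def d_sequence(a):
    """D[N] = A(N) - sum_{n<N} A(n)/n for N >= 1, with a[n] = a_n and a[0] ignored."""
    D = [0.0] * len(a)
    partial_sum, running = 0.0, 0.0
    for N in range(1, len(a)):
        partial_sum += a[N]
        D[N] = partial_sum - running
        running += partial_sum / N
    return D

def log_plus(x):
    """log^+ x = max(log x, 0) for x >= 0."""
    return max(math.log(x), 0.0) if x > 0 else 0.0

LIMIT = 10_000
mu = mobius_sieve(LIMIT)
D = d_sequence(mu)
criterion = [log_plus(abs(D[N])) / math.log(N) for N in range(2, LIMIT + 1)]
mertens_10 = sum(mu[1:11])      # A(10)
mertens_100 = sum(mu[1:101])    # A(100)
```

Plotting `criterion` against $\log N$ gives a small-scale analogue of Figure 1 (b). Such a finite range can only be suggestive; it says nothing rigorous about the limit superior in Eq. (2).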
3 Factorization of $\eta$
From now on we shall restrict attention to the case $f = \zeta$. For brevity we write $\eta(z) = \zeta(z)(z-1)/z^2$ throughout the sequel. Recall from the previous section that $\eta$ belongs to every Hardy space $\mathcal{H}^2_\sigma$, $\sigma > 0$. Being a Hardy function, $\eta$ admits a useful factorization, some applications of which will be discussed in the next section. The zeros of $\eta$ in the right half plane $\Re z > 0$, which coincide with the non-trivial zeros of the zeta function, are generically denoted by $\rho$. The $\rho$'s are known to lie symmetrically with respect to both the real axis and the critical line $\Re z = 1/2$. That is, whenever $\rho$ is a zero, then so are the mirror images $\bar{\rho}$, $1 - \bar{\rho}$, and $1 - \rho$.
Figure 1. Convergence abscissa of Laplace transform equal to 1/2. Plot of criterion $\log^{+}|D_N| / \log N$ versus $\log N$ for (a) Example 2, (b) Example 3.
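Let $\sigma > 0$ be fixed. According to the factorization theorem for Hardy functions (see e.g. Dym and McKean [5] (Sect. 2.7) or Hoffman [8] (pp. 132, 133)), $\eta$ can be represented as the product of an outer and an inner function on the half plane $\Re z > \sigma$. More precisely,
$$ \eta(z) = H_\sigma(z) B_\sigma(z) \quad (\Re z > \sigma), \tag{9} $$
where the outer function is given by
$$ H_\sigma(z) = \exp\left\{ \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{t(z - \sigma) + i}{t + i(z - \sigma)}\, \frac{dt}{1 + t^2} \right\}, \tag{10} $$
and the inner function reduces in the present case to a Blaschke product $B_\sigma$ which is composed of the zeros $\rho$ of $\eta$ with $\Re \rho > \sigma$ and their mirror images after reflection at the line $\Re z = \sigma$, $2\sigma - \bar{\rho}$. Explicitly,
$$ B_\sigma(z) = \prod_{\Re \rho > \sigma} \frac{|1 - (\rho - \sigma)^2|}{1 - (\rho - \sigma)^2} \cdot \frac{z - \rho}{z - (2\sigma - \bar{\rho})}. \tag{11} $$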
These formulae are easily obtained from the familiar ones for the half plane $\Re z > 0$ [8] by shifting both the complex variable and the zeros by $\sigma$. The inner factor simplifies to a Blaschke product for the following reasons: (i) $\eta$ has an analytic continuation across the line $\Re z = \sigma$ to the entire right half plane, so that there is no singular factor; (ii) the constant $c$ appearing in the general factorization formula reduces to unity because $B_\sigma(\sigma) = 1$ and $H_\sigma(\sigma) = \eta(\sigma)$, as is readily verified. For real arguments $z = s$, taking first logarithms, then real parts on both sides of (9), one obtains for $s > \sigma > 0$
$$ \log \eta(s) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\sigma + it)|\, \frac{s - \sigma}{(s - \sigma)^2 + t^2}\, dt + \sum_{\Re \rho > \sigma} \log\left|\frac{s - \rho}{s - (2\sigma - \bar{\rho})}\right|. \tag{12} $$
Note that $\eta(s)$ is positive for $s > 0$, being the Laplace transform of a nonnegative function.
4 Applications
The factorization of $\eta$ gives rise to various tests of the RH. A first example is obtained by setting $\sigma = 1/2$ in (12). The sum on the right-hand side of (12) vanishes if and only if $\zeta(z)$ has no zero within the region $\Re z > 1/2$. Therefore the RH is true if and only if for some (and then for all) $s > 1/2$,
$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log|\eta(\tfrac{1}{2} + it)|\, \frac{s - \tfrac{1}{2}}{(s - \tfrac{1}{2})^2 + t^2}\, dt = \log \eta(s). \tag{13} $$
This criterion is equivalent to the condition that $\eta(z)$ be an outer function for the half plane $\Re z > 1/2$, cf. Dym and McKean [5], Sect. 2.7. For $s = 1$ it assumes a particularly neat form: the right-hand side vanishes, the left-hand side can be simplified, and one gets the following criterion for the truth of the RH due to Balazard, Saias, and Yor [1]:
$$ \int_{-\infty}^{\infty} \frac{\log|\zeta(\tfrac{1}{2} + it)|}{\tfrac{1}{4} + t^2}\, dt = 0. \tag{14} $$
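The simplification at $s = 1$ rests on two facts: $|\eta(\tfrac{1}{2} + it)| = |\zeta(\tfrac{1}{2} + it)| / (\tfrac{1}{4} + t^2)^{1/2}$, and the vanishing of the integral of $\log(\tfrac{1}{4} + t^2) / (\tfrac{1}{4} + t^2)$ over the real line, which is the case $a = 1/2$ of the standard formula $\int_{-\infty}^{\infty} \log(a^2 + t^2)/(a^2 + t^2)\, dt = (2\pi/a) \log(2a)$. The following sketch checks the second fact numerically; the function name, the substitution $t = a \tan\theta$, and the step count are illustrative choices, not taken from the original.

```python
import math

def log_weight_integral(a, steps=200_000):
    """Midpoint-rule estimate of the integral of log(a^2 + t^2) / (a^2 + t^2) over the real line.

    With t = a * tan(theta): a^2 + t^2 = a^2 / cos(theta)^2 and
    dt / (a^2 + t^2) = d(theta) / a, for theta in (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * h
        total += (2.0 * math.log(a) - 2.0 * math.log(math.cos(theta))) / a
    return total * h

value_half = log_weight_integral(0.5)   # expected to be close to 0
value_one = log_weight_integral(1.0)    # expected to be close to 2*pi*log(2)
```

With this integral equal to zero, the factor $(z-1)/z^2$ drops out of (13) at $s = 1$ and only the $\log|\zeta|$ term of (14) remains.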
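Another example results from the formula
$$ (\log \eta)'(\sigma) = \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)| / \eta(\sigma)\right] \frac{dt}{t^2} - 2 \sum_{\Re \rho > \sigma} \Re\, (\rho - \sigma)^{-1} \tag{15} $$
($\sigma > 0$), which can be derived from (12) by subtracting $\log \eta(\sigma)$ on both sides, dividing by $s - \sigma$, and then taking the limit $s \downarrow \sigma$. The interchange of limits and integration (or summation) can be justified by dominated convergence.
Putting $\sigma = 1/2$ in (15), one obtains the following differential version of the integral tests (13), (14): the RH is true if and only if
$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2})\right] \frac{dt}{t^2} = (\log \eta)'(\tfrac{1}{2}). \tag{16} $$
This statement can be amplified in various ways. First, it is possible to evaluate $(\log \eta)'(\tfrac{1}{2})$ explicitly, $(\log \eta)'(\tfrac{1}{2}) = \frac{\gamma}{2} + \frac{1}{2} \log(8\pi) + \frac{\pi}{4} - 6$, and for $\sigma = 1/2$ the sum in (15) can be written in a more symmetric form. One thus obtains the relation
$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left|\frac{\eta(\tfrac{1}{2} + it)}{\eta(\tfrac{1}{2})}\right| \frac{dt}{t^2} - \left(\frac{\gamma}{2} + \frac{1}{2} \log(8\pi) + \frac{\pi}{4} - 6\right) = \sum_{\rho} \frac{|\Re \rho - \tfrac{1}{2}|}{|\rho - \tfrac{1}{2}|^2}, \tag{17} $$
in which the sum extends over all zeros in the critical strip. Note that (17) quantifies the difference between the two sides of (16) as a weighted sum of the absolute deviations of the real parts of the zeros from 1/2.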
Secondly, there is a connection with logarithmic Hilbert transforms, also called logarithmic dispersion relations [3]. Suppose we had $\eta(z) \neq 0$ for $\Re z > 1/2$. Then $\eta$ itself would be an outer function, that is, $\eta(z) = H_{1/2}(z)$ with $H_{1/2}$ given by (10) for $\sigma = 1/2$.
Taking imaginary parts of the logarithm in this equation, one can show with a little algebra that for $z - 1/2 = a + ib$, $a > 0$, one then has
$$ \Im \log \eta(z) = \frac{1}{\pi} \int_{-\infty}^{\infty} \left(\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|\right) \frac{dt}{t - b} - \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t - b}\, \frac{a^2}{a^2 + (t - b)^2}\, dt. \tag{18} $$
Fix any $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$. Then the last term in (18) converges to zero as $a \downarrow 0$. Therefore, using the fact that $|\eta(\tfrac{1}{2} + it)|$ is an even function of $t$, one obtains in the limit the logarithmic dispersion relation
$$ \Im \log \eta(\tfrac{1}{2} + ib) = \frac{2b}{\pi} \int_0^{\infty} \frac{\log|\eta(\tfrac{1}{2} + it)| - \log|\eta(\tfrac{1}{2} + ib)|}{t^2 - b^2}\, dt, \tag{19} $$
which expresses the phase of $\eta$ on the boundary $\Re z = 1/2$ as an integral of its log modulus along that line. Recall that this relation is a consequence of the assumed outer function character of $\eta$, that is, of the RH. In fact, the validity of (19) for every $b > 0$ such that $\eta(\tfrac{1}{2} + ib) \neq 0$ is also sufficient for the RH. To see this, divide both sides of (19) by $b$ and let $b \downarrow 0$. Then the left side tends to $(\log \eta)'(\tfrac{1}{2})$, the right side to $\frac{2}{\pi} \int_0^{\infty} \log\left[|\eta(\tfrac{1}{2} + it)| / \eta(\tfrac{1}{2})\right] \frac{dt}{t^2}$, so in the limit we get the condition (16), shown above to be equivalent to the RH.
Finally, we note that $-(\log \eta)'(\sigma)$ equals the first moment of the probability density $p_\sigma$, cf. (6). In view of (16) and (15), this raises the question whether the integral term in these relations admits of a probabilistic interpretation, too. Relevant to this question is the observation, going back to Khintchine, that for every $\sigma > 1$ the function $f_\sigma(t) = \zeta(\sigma + it)/\zeta(\sigma)$ is the characteristic function of an infinitely divisible distribution, cf. Example 6, p. 75 in Gnedenko and Kolmogorov [7]. This can be verified by rewriting the product representation of the zeta function (for $\sigma > 1$) in the form
$$ \frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \prod_p \frac{1 - p^{-\sigma}}{1 - p^{-\sigma - it}} = \exp\left\{ \sum_p \sum_{n=1}^{\infty} \frac{e^{-itn \log p} - 1}{n\, p^{n\sigma}} \right\} \tag{20} $$
and noting that $f_\sigma(t)$ is thus represented as a product of terms of the form $\exp(a(e^{ibt} - 1))$, each of which is the characteristic function of a Poisson random variable with intensity $a$ and values in the lattice $\{kb,\ k = 0, 1, 2, \ldots\}$.
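The identity (20) holds prime by prime, so it can be checked numerically on a truncated set of primes. The sketch below compares the truncated Euler product with the exponential of the truncated double sum; the prime bound of 100, the cut-off `n_max`, the test point $\sigma = 2$, $t = 1.3$, and all function names are arbitrary illustrative choices.

```python
import cmath
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit (limit >= 2 assumed)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def ratio_euler_product(sigma, t, primes):
    """Truncated product over p of (1 - p^-sigma) / (1 - p^-(sigma + it))."""
    value = complex(1.0, 0.0)
    for p in primes:
        log_p = math.log(p)
        numerator = 1.0 - math.exp(-sigma * log_p)
        denominator = 1.0 - cmath.exp(-(sigma + 1j * t) * log_p)
        value *= numerator / denominator
    return value

def ratio_compound_poisson(sigma, t, primes, n_max=80):
    """exp of the double sum in Eq. (20), with the inner sum cut off at n_max."""
    exponent = complex(0.0, 0.0)
    for p in primes:
        log_p = math.log(p)
        for n in range(1, n_max + 1):
            mass = math.exp(-n * sigma * log_p) / n   # mass 1 / (n p^(n sigma))
            exponent += mass * (cmath.exp(-1j * t * n * log_p) - 1.0)
    return cmath.exp(exponent)

primes = primes_up_to(100)
lhs = ratio_euler_product(2.0, 1.3, primes)
rhs = ratio_compound_poisson(2.0, 1.3, primes)
```

Each term of the double sum is the exponent of a Poisson characteristic function with intensity $1/(n p^{n\sigma})$ and jump size $-n \log p$, which is how the infinite divisibility becomes visible.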
In order to connect this fact with the above question, it is convenient to introduce the Lévy measure $F_\sigma$ which puts mass $(n p^{n\sigma})^{-1}$ at each of the points $-n \log p$, $n \ge 1$, $p$ prime. Then (20) becomes $\log \frac{\zeta(\sigma + it)}{\zeta(\sigma)} = \int (e^{itx} - 1)\, F_\sigma(dx)$, so taking real parts in this equation and using $\int_{-\infty}^{\infty} (1 - \cos tx)/t^2\, dt = \pi |x|$ ($x \in \mathbf{R}$), one obtains
$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\zeta(\sigma + it)| / \zeta(\sigma)\right] \frac{dt}{t^2} = \frac{1}{\pi} \int_{-\infty}^{\infty} \int (\cos tx - 1)\, F_\sigma(dx)\, \frac{dt}{t^2} $$
$$ = \int \left[\frac{1}{\pi} \int_{-\infty}^{\infty} (\cos tx - 1)\, \frac{dt}{t^2}\right] F_\sigma(dx) = -\int |x|\, F_\sigma(dx) = \int x\, F_\sigma(dx). $$
Thus we find that the essential part of the integral in question equals the first moment of the Lévy measure $F_\sigma$. The other part, stemming from the factor $(z-1)/z^2$, can be incorporated by introducing a signed, absolutely continuous measure $G_\sigma$ with density $x^{-1}\left[2e^{\sigma x} - e^{(\sigma - 1)x}\right]$ on $(-\infty, 0)$ (zero on $[0, \infty)$). One then has
$$ \log \frac{\eta(\sigma + it)}{\eta(\sigma)} = \int (e^{itx} - 1)\, (F_\sigma - G_\sigma)(dx) $$
and hence
$$ \frac{1}{\pi} \int_{-\infty}^{\infty} \log\left[|\eta(\sigma + it)| / \eta(\sigma)\right] \frac{dt}{t^2} = \int x\, (F_\sigma - G_\sigma)(dx) \quad (\sigma > 1). $$
These calculations give a more detailed picture of how the factor $(z-1)/z^2$ regularizes the zeta function as $\sigma \downarrow 1$: it compensates the flow of mass of $F_\sigma$ towards $-\infty$ by the subtraction of measures $G_\sigma$ such that the first moment of $F_\sigma - G_\sigma$ remains bounded. Evidently, other ways of renormalizing the Lévy measure as $\sigma \downarrow 1$ are also conceivable and may be interesting to explore.
References
1. M. Balazard, E. Saias, and M. Yor, Adv. Math. 143, 284 (1999).
2. M. V. Berry and J. P. Keating, SIAM Review 41, 236 (1999).
3. R. E. Burge, M. A. Fiddy, A. H. Greenaway, and G. Ross, Proc. R. Soc. London A 350, 191 (1976).
4. J.-F. Burnol, <http://arXiv.org/abs/math/0001013> (2000).
5. H. Dym and H. P. McKean, Gaussian Processes, Function Theory, and the Inverse Spectral Problem (Academic Press, New York, 1976).
6. H. M. Edwards, The Theory of the Riemann Zeta Function (Academic Press, New York, 1974).
7. B. V. Gnedenko and A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables (Addison-Wesley, Cambridge, 1954).
8. K. Hoffman, Banach Spaces of Analytic Functions (Dover, New York, 1988).
ENSEMBLE PROBABILISTIC EQUILIBRIUM AND NON-EQUILIBRIUM THERMODYNAMICS WITHOUT THE THERMODYNAMICAL LIMIT
D. H. E. GROSS
Hahn-Meitner-Institut Berlin, Bereich Theoretische Physik, Glienickerstr. 100, 14109 Berlin, Germany, and Freie Universität Berlin, Fachbereich Physik
E-mail: gross@hmi.de
Boltzmann's principle $S = k \ln W$ allows one to extend equilibrium thermo-statistics to "Small" systems without invoking the thermodynamic limit [2,3]. As the limit hides more than it clarifies the origin of phase transitions, a deeper and more transparent understanding is thus possible. The main clue is to base statistical probability on ensemble averaging and not on time averaging. It is argued that, due to the incomplete information obtained by macroscopic measurements, thermodynamics handles ensembles or finite-sized sub-manifolds in phase space and not single time-dependent trajectories. Therefore, ensemble averages are the natural objects of statistical probabilities. This is the physical origin of coarse-graining, which is no longer a mathematical ad hoc assumption. The probabilities $P(M)$ of macroscopic measurements $M$ are given by the ratio $P(M) = W(M)/W$ of the volumes of the sub-manifold $\mathcal{M}$ of the microcanonical ensemble with the constraint $M$ to the one without. From this concept all equilibrium thermodynamics can be deduced quite naturally, including the most sophisticated phenomena of phase transitions for "Small" systems.
Boltzmann's principle is generalized to non-equilibrium Hamiltonian systems with possibly fractal distributions $\mathcal{M}$ in $6N$-dim. phase space by replacing the conventional Riemann integral for the volume in phase space by its corresponding box-counting volume. This is equal to the volume of the closure $\overline{\mathcal{M}}$. With this extension the Second Law is derived without invoking the thermodynamic limit. The irreversibility in this approach is due to the replacement of the phase-space volume of the fractal sub-manifold $\mathcal{M}$ by the volume of its closure $\overline{\mathcal{M}}$. The physical reason for this replacement is that macroscopic measurements cannot distinguish $\mathcal{M}$ from $\overline{\mathcal{M}}$. Whereas the former is not changing in time due to Liouville's theorem, the volume of the closure can be larger. In contrast to conventional coarse graining, the box-counting volume is defined in the limit of infinite resolution, i.e. there is no artificial loss of information.
1 Introduction
Recently, the interest in the thermo-statistical behavior of non-extensive many-body systems, like atomic nuclei, atomic clusters, soft-matter, biological systems, and also self-gravitating astro-physical systems, has led to considering thermo-statistics without using the thermodynamic limit. This is most safely done by going back to Boltzmann. Einstein considers Boltzmann's definition of entropy, as e.g. written on his famous epitaph,
$$ S = k \cdot \ln W \tag{1} $$
as Boltzmann's principle [4], from which Boltzmann was able to deduce thermodynamics. Here $W$ is the number of micro-states at given energy $E$ of the $N$-body system in the spatial volume $V$:
$$ W(E, N, V) = \mathrm{tr}[\epsilon_0\, \delta(E - H_N)] \tag{2} $$
$$ \mathrm{tr}[\delta(E - H_N)] = \int \frac{d^{3N}p\; d^{3N}q}{N!\, (2\pi\hbar)^{3N}}\, \delta(E - H_N), \tag{3} $$
$\epsilon_0$ is a suitable energy constant to make $W$ dimensionless, $H_N$ is the $N$-particle Hamilton function, and the $N$ positions $q$ are restricted to the volume $V$, whereas the momenta $p$ are unrestricted. In what follows, we remain on the level of classical mechanics. The only reminders of the underlying quantum mechanics are the measure of the phase space in units of $2\pi\hbar$ and the factor $1/N!$, which respects the indistinguishability of the particles (Gibbs paradox). In contrast to Boltzmann [5,6], who used the principle only for dilute gases, and to Schrödinger [7], who thought equation (1) is useless otherwise, I take the principle as the fundamental, generic definition of entropy. In the following sections I will demonstrate that this definition of thermo-statistics works well, especially also at higher densities and at phase transitions, without invoking the thermodynamic limit.
2 There is a lot to add to classical equilibrium statistics from our experience with Small systems
Following Lieb [8], extensivity (a) and the existence of the thermodynamic limit $N \to \infty|_{N/V = \mathrm{const}}$ are essential conditions for conventional (canonical) thermodynamics to apply. Certainly, this implies also the homogeneity of the system. Phase transitions are somehow foreign to this. The essence of first order transitions is that the systems become inhomogeneous and split into different phases separated by interfaces. In the conventional Yang-Lee theory, phase transitions are represented by the positive zeros of the grand-canonical partition sum where the grand-canonical formalism breaks down (Yang-Lee singularities). In the following we show that the micro-canonical ensemble gives much more detailed and more natural insight, which corresponds to the experimental identification of phase transitions.
(a) Dividing extensive systems into larger pieces, the total energy and entropy are equal to the sum of those of the pieces.
There is a whole group of physical many-body systems called Small in the following which cannot be addressed by conventional thermo-statistics
• nuclei
• atomic clusters
• polymers
• soft matter (biological) systems
• astrophysical systems
• first order transitions are distinguished from continuous transitions by the appearance of phase-separations and interfaces with surface tension. If the range of the force or the thickness of the surface layers is such that the number of surface particles is not negligible compared to the total number of particles, these systems are non-extensive.
For such systems the thermodynamic limit does not exist or makes no sense. Either the range of the forces (Coulomb, gravitation) is of the order of the linear dimensions of these systems, and/or they are strongly inhomogeneous, e.g. at phase-separation.
Boltzmann's principle does not invoke the thermodynamic limit, nor additivity, nor extensivity, nor concavity of the entropy $S(E, N)$ (downwards bending). This has been largely forgotten for a hundred years. We have to go back to pre-Gibbsian times. It is a purely geometrical definition of the entropy and applies as well to "Small" systems. Moreover, the entropy $S(E, N)$ as defined above is everywhere single-valued and multiply differentiable. There are no singularities in it. This is the simplest access to equilibrium statistics [9]. We will explore its consequences in this contribution. Moreover, we will see that this way we get simultaneously the complete information about the three crucial parameters characterizing a phase transition of first order: transition temperature $T_{tr}$, latent heat per atom $q_{lat}$, and surface tension $\sigma_{surf}$. Boltzmann's famous epitaph above (eq. 1) contains everything that can be said about equilibrium thermodynamics in its most condensed form. $W$ is the volume of the sub-manifold at sharp energy in the $6N$-dim. phase space.
3 Relation of the topology of $S(E,N)$ to the Yang-Lee zeros of $Z(T,\mu,V)$
In conventional thermo-statistics phase transitions are indicated by zeros of the grand-canonical partition function $Z(T,\mu,V)$, where $V$ is the volume. See more details in refs. [1, 2, 3, 10].
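$$Z(T,\mu,V) = \iint_0^{\infty} \frac{dE}{\epsilon_0}\, dN\; e^{-[E-\mu N-TS(E,N)]/T}$$
$$= \frac{V^2}{\epsilon_0} \iint_0^{\infty} de\, dn\; e^{-V[e-\mu n-Ts(e,n)]/T} \approx e^{\,\mathrm{const}+\mathrm{lin}+\mathrm{quadr}} \tag{4}$$
in the thermodynamic limit $V \to \infty|_{N/V=\mathrm{const}}$. The double Laplace integral (4) can be evaluated asymptotically for large $V$ by expanding the exponent, as indicated in the last line, to second order in $\Delta e, \Delta n$ around the stationary point $e_s, n_s$, where the linear term vanishes:
$$\frac{1}{T} = \frac{\partial S}{\partial E}, \qquad \frac{\mu}{T} = -\frac{\partial S}{\partial N}, \qquad \frac{P}{T} = \frac{\partial S}{\partial V} \tag{5}$$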
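The only term remaining to be integrated is the quadratic one. If the two eigen-curvatures $\lambda_1 < 0$, $\lambda_2 < 0$, this is then a Gaussian integral and yields:
$$Z(T,\mu,V) = \frac{V^2}{\epsilon_0}\, e^{-V[e_s-\mu n_s-Ts(e_s,n_s)]/T} \iint_{-\infty}^{\infty} dv_1\, dv_2\; e^{V[\lambda_1 v_1^2+\lambda_2 v_2^2]/2} \tag{6}$$
$$Z(T,\mu,V) = e^{-F(T,\mu,V)/T} \tag{7}$$
$$\frac{F(T,\mu,V)}{V} = -\frac{T\ln Z(T,\mu,V)}{V} \to e_s - \mu n_s - Ts_s + \frac{T\ln\left(\sqrt{\det(e_s,n_s)}\right)}{V} + o\!\left(\frac{\ln V}{V}\right) \tag{8}$$
Here $\det(e_s,n_s)$ is the determinant of the curvatures of $s(e,n)$, and $v_1, v_2$ are the eigenvectors of $d$:
$$\det(e,n) = \left\|\begin{matrix} \dfrac{\partial^2 s}{\partial e^2} & \dfrac{\partial^2 s}{\partial n\,\partial e} \\ \dfrac{\partial^2 s}{\partial e\,\partial n} & \dfrac{\partial^2 s}{\partial n^2} \end{matrix}\right\| = \left\|\begin{matrix} s_{ee} & s_{en} \\ s_{ne} & s_{nn} \end{matrix}\right\| = \lambda_1\lambda_2, \qquad \lambda_1 \ge \lambda_2 \tag{9}$$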
[Figure 1: plot of $s(e) - 25 - 11.5\,e$ versus $e$ (axis from 0.3 to 1.3) for Na$_{1000}$ at $P = 1$ atm, with $e_1$, $e_2$, $e_3$, $q_{lat}$ and $\Delta s_{surf}$ marked]

Figure 1: MMMC simulation of the entropy $s(e)$ per atom ($e$ in eV per atom) of a system of $N_0 = 1000$ sodium atoms with realistic interaction at an external pressure of 1 atm. At the energy per atom $e_1$ the system is in the pure liquid phase and at $e_3$ in the pure gas phase, of course with fluctuations. The latent heat per atom is $q_{lat} = e_3 - e_1$. Attention: the curve $s(e)$ is artificially sheared by subtracting a linear function $25 + 11.5\,e$ in order to make the convex intruder visible. $s(e)$ is always a steeply monotonic rising function. We clearly see the global concave (downwards bending) nature of $s(e)$ and its convex intruder. Its depth is the entropy loss due to the additional correlations by the interfaces. From this one can calculate the surface tension per surface atom $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$. The double tangent is the concave hull of $s(e)$. Its derivative gives the Maxwell line in the caloric curve $T(e)$ at $T_{tr}$. In the thermodynamic limit the intruder would disappear and $s(e)$ would approach the double tangent (Maxwell line) from below.
In the cases studied here $\lambda_2 < 0$, but $\lambda_1$ can be positive or negative. If $\det(e_s,n_s)$ is positive ($\lambda_1 < 0$), the last two terms in eq. (8) go to 0 and we obtain the familiar result $f(T,\mu,V \to \infty) = e_s - \mu n_s - Ts_s$. I.e. the curvature $\lambda_1$ of the entropy surface $s(e,n,V)$ decides whether the grand-canonical ensemble agrees with the fundamental micro ensemble in the thermodynamic limit. If this is the case, $\ln[Z(T,\mu)]$ or $f(T,\mu)$ is analytical in $e^{\beta\mu}$ (with $\beta = 1/T$) and, due to Yang and Lee, we have a single stable phase. Otherwise, the Yang-Lee zeros reflect anomalous points/regions of $\lambda_1 > 0$ ($\det(e,n) < 0$). This is crucial. As $\det(e_s,n_s)$ can be studied for finite or even small systems as well, this is the only proper extension of phase transitions to Small systems.
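The asymptotic statement of eq. (8) can be checked numerically in a simplified setting. The following is only a sketch under stated assumptions: it uses a one-variable analogue of eq. (4) (energy only, no particle number), a hypothetical everywhere-concave toy entropy $s(e) = 1.5 \ln e$, and $\epsilon_0 = 1$; the function name `free_energy_density` is illustrative and not taken from the text.

```python
import numpy as np

def free_energy_density(V, T=1.0, n_grid=400_001, e_max=20.0):
    """Return -T ln Z / V for Z = V * integral de exp(-V [e - T s(e)] / T).

    One-variable analogue of eq. (4). The entropy s(e) = 1.5 ln e is an
    assumed toy model that is concave everywhere (no convex intruder).
    """
    e = np.linspace(1e-6, e_max, n_grid)
    de = e[1] - e[0]
    exponent = -V * (e - T * 1.5 * np.log(e)) / T
    peak = exponent.max()  # factor out the peak value to avoid overflow
    ln_Z = np.log(V) + peak + np.log(np.sum(np.exp(exponent - peak)) * de)
    return -T * ln_Z / V

T = 1.0
e_s = 1.5 * T                       # stationary point from 1/T = ds/de
f_s = e_s - T * 1.5 * np.log(e_s)   # limiting value e_s - T s(e_s)

for V in (1e2, 1e3, 1e4):
    # The difference shrinks roughly like ln(V)/V, as the correction terms in eq. (8)
    print(V, free_energy_density(V, T) - f_s)
```

For a negative curvature the difference to $e_s - Ts_s$ vanishes as $V$ grows, which is the one-dimensional counterpart of the case $\lambda_1 < 0$ discussed above.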
4 The regions of positive curvature $\lambda_1$ of $s(e_s,n_s)$ correspond to phase transitions of first order
We will now discuss the physical origin of convex (upwards bending) intruders in the entropy surface in two examples
In table 1 we compare the liquid-gas phase transition in sodium clusters of a few hundred atoms with that of the bulk at 1 atm, cf. also fig. 1.

Figure 2 shows how, for a small system (Potts, $q = 3$, lattice gas with $50 \times 50$ points), all phenomena of phase transitions can be studied from the topology of the determinant of curvatures (9) in the micro-canonical ensemble.

Table 1: Parameters of the liquid-gas transition of small sodium clusters (MMMC-calculation [1]) in comparison with the bulk, for rising number $N_0$ of atoms. $N_{surf}$ is the average number of surface atoms of all clusters together.

Na: N_0          200      1000     3000     bulk
T_tr [K]         940      990      1095     1156
q_lat [eV]       0.82     0.91     0.94     0.923
s_boil           10.1     10.7     9.9      9.267
Δs_surf          0.55     0.56     0.44
N_surf           39.94    98.53    186.6    ∞
σ/T_tr           2.75     5.68     7.07     7.41
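The entries of table 1 can be cross-checked against each other. The sketch below uses the relation $\sigma_{surf}/T_{tr} = \Delta s_{surf} \cdot N_0/N_{surf}$ from the caption of figure 1. It additionally assumes that $s_{boil}$ is the latent heat divided by $k_B T_{tr}$; this is not stated in the text, but it reproduces the tabulated values to within rounding. The function names are illustrative.

```python
K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

# Rows of Table 1: (N_0, T_tr [K], q_lat [eV], s_boil, delta_s_surf, N_surf, sigma/T_tr)
TABLE_1 = [
    (200, 940.0, 0.82, 10.1, 0.55, 39.94, 2.75),
    (1000, 990.0, 0.91, 10.7, 0.56, 98.53, 5.68),
    (3000, 1095.0, 0.94, 9.9, 0.44, 186.6, 7.07),
]

def boiling_entropy(q_lat_ev, t_tr_kelvin):
    """Assumed relation: entropy gain per atom = latent heat / (k_B * T_tr)."""
    return q_lat_ev / (K_B_EV_PER_K * t_tr_kelvin)

def surface_tension_over_t(delta_s_surf, n_0, n_surf):
    """Relation from the caption of figure 1."""
    return delta_s_surf * n_0 / n_surf

for n_0, t_tr, q_lat, s_boil, ds_surf, n_surf, sigma in TABLE_1:
    print(n_0,
          round(boiling_entropy(q_lat, t_tr), 2), s_boil,
          round(surface_tension_over_t(ds_surf, n_0, n_surf), 2), sigma)
```

Each printed row shows the recomputed value next to the tabulated one.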
5 Boltzmann's principle and non-equilibrium thermodynamics
Before we proceed we must comment on Einstein's attitude to the principle [11]. Originally, Boltzmann called $W$ the "Wahrscheinlichkeit" (probability), i.e. the relative time a system spends (along a time-dependent path) in a given region of $6N$-dim. phase space. Our interpretation of $W$ to be the number of "complexions" (Boltzmann's second interpretation) or quantum states (trace) with the same energy was criticized by Einstein [4] as artificial. It is exactly that criticized interpretation of $W$ which I use here and which works so excellently [1]. In section 7 I will come back to this fundamental point.
After succeeding in deducing equilibrium statistics, including all phenomena of phase transitions, from Boltzmann's principle even for Small systems, i.e. non-extensive many-body systems, it is challenging to explore how far this most conservative and restrictive way to thermodynamics [9] is able to describe also the approach of (possibly Small) systems to equilibrium and the Second Law of Thermodynamics.
Thermodynamics describes the development of macroscopic features of many-body systems without specifying them microscopically in all details. Before we address the Second Law, we have to clarify what we mean by the label "macroscopic observable".
6 Macroscopic observables imply the EPS-probability
Figure 2: Contour plot of the curvature determinant of the Potts-3 lattice gas (horizontal axis $e$ from $-2$ to $0$, vertical axis from 0 to 1). Dark grey line: $d = 0$, boundary of the region of phase coexistence, the triangle $AP_mB$. Light grey line: minimum of $d(e,n)$ in the direction of the largest curvature, second order transition. In the triangle $AP_mC$: ordered (solid) phase. Above and right of the line $CP_mB$: disordered (gas) phase. The crossing $P_m$ of the boundary lines is a multi-critical point. The light grey region around the multi-critical point $P_m$ corresponds to a flat region of $d(e,n) \sim 0$.

A single point $\{q_i(t), p_i(t)\}_{i=1 \cdots N}$ in the $N$-body phase space corresponds to a detailed specification of the system with all degrees of freedom (d.o.f.) completely fixed at time $t$ (microscopic determination). Fixing only the total energy $E$ of an $N$-body system leaves the other $(6N-1)$ degrees of freedom unspecified. A second system with the same energy is most likely not in the same microscopic state as the first; it will be at another point in phase space, and the other d.o.f. will be different. I.e. the measurement of the total energy $H_N$, or of any other macroscopic observable $M$, determines a $(6N-1)$-dimensional sub-manifold $\mathcal{E}$ or $\mathcal{M}$ in phase space. All points in $N$-body phase space consistent with the given value of $E$ and volume $V$, i.e. all points in the $(6N-1)$-dimensional sub-manifold $\mathcal{E}(N,V)$ of phase space, are equally consistent with this measurement. $\mathcal{E}(N,V)$ is the microcanonical ensemble. This example tells us that any macroscopic measurement is incomplete and defines a sub-manifold of points in phase space, not a single point. An additional measurement of another macroscopic quantity $B\{q,p\}$ reduces $\mathcal{E}$ further to the cross-section $\mathcal{E} \cap \mathcal{B}$, a $(6N-2)$-dimensional subset of points in $\mathcal{E}$ with the volume:
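$$W(B,E,N,V) = \frac{1}{N!}\int \left(\frac{d^3q\, d^3p}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q,p\})\, \delta(B - B\{q,p\}) \tag{10}$$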
If $H_N\{q,p\}$ as well as $B\{q,p\}$ are continuously differentiable functions of their arguments, which we assume in the following, $\mathcal{E} \cap \mathcal{B}$ is closed. In the following we use $W$ for the Riemann or Liouville volume of a manifold.
Microcanonical thermostatics gives the probability $P(B,E,N,V)$ to find the $N$-body system in the sub-manifold $\mathcal{E} \cap \mathcal{B}(E,N,V)$:
$$P(B,E,N,V) = \frac{W(B,E,N,V)}{W(E,N,V)} = e^{\ln[W(B,E,N,V)] - S(E,N,V)} \tag{11}$$
This is what Krylov seems to have had in mind [12] and what I will call the "ensemble probabilistic formulation of statistical mechanics" (EPS).
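The structure of eq. (11) can be illustrated with a discrete counting model. This is a toy sketch, not the phase-space integral of eq. (10): it assumes $N$ distinguishable particles that each sit in the left or right half of a box, with the macroscopic observable $B$ being the number of particles in the left half. Then $W(B) = \binom{N}{B}$, $W = 2^N$ and $S = \ln W$. All function names are hypothetical.

```python
from math import exp, lgamma, log, sqrt

def ln_w_constrained(n_total, n_left):
    """ln of the number of complexions with n_left particles in the left half."""
    return lgamma(n_total + 1) - lgamma(n_left + 1) - lgamma(n_total - n_left + 1)

def eps_probability(n_total, n_left):
    """P(B) = exp(ln W(B) - S), the discrete analogue of eq. (11)."""
    entropy = n_total * log(2.0)  # S = ln W with W = 2^N complexions in total
    return exp(ln_w_constrained(n_total, n_left) - entropy)

def relative_width(n_total):
    """Standard deviation of B divided by its mean."""
    probs = [eps_probability(n_total, b) for b in range(n_total + 1)]
    mean = sum(b * p for b, p in enumerate(probs))
    var = sum((b - mean) ** 2 * p for b, p in enumerate(probs))
    return sqrt(var) / mean

for n in (10, 100, 10_000):
    print(n, relative_width(n))  # falls off like 1/sqrt(N)
```

For a small system the whole distribution $P(B)$ is broad and carries information beyond the mean value, whereas for large $N$ it becomes sharply peaked.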
Similarly, thermodynamics describes the development of some macroscopic observable $B\{q_t,p_t\}$ in time of a system which was specified at an earlier time $t_0$ by another macroscopic measurement $A\{q_0,p_0\}$. It is related to the volume of the sub-manifold $\mathcal{M}(t) = \mathcal{A}(t_0) \cap \mathcal{B}(t) \cap \mathcal{E}$:
$$W(A,B,E,t) = \frac{1}{N!}\int \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \delta(B - B\{q_t,p_t\})\, \delta(A - A\{q_0,p_0\})\, \epsilon_0\, \delta(E - H\{q_t,p_t\}) \tag{12}$$
where $\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}$ is the set of trajectories solving the Hamilton-Jacobi equations
$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1 \cdots N \tag{13}$$
with the initial conditions $\{q(t=t_0) = q_0;\ p(t=t_0) = p_0\}$. For a very large system with $N \sim 10^{23}$ the probability $P(B(t))$ to find a given value $B(t)$ is usually sharply peaked as a function of $B$. Ordinary thermodynamics treats systems in the thermodynamic limit $N \to \infty$ and gives only $\langle B(t) \rangle$. However, here we are interested in formulating the Second Law for Small systems, i.e. we are interested in the whole distribution $P(B(t))$, not only in its mean value $\langle B(t) \rangle$. Thermodynamics does not describe the temporal development of a single system (a single point in the $6N$-dim. phase space).
There is an important property of macroscopic measurements: Whereas the macroscopic constraint $A\{q_0,p_0\}$ determines (usually) a compact region $\mathcal{A}(t_0)$ in $\{q_0,p_0\}$, this does not need to be the case at later times $t \gg t_0$: $\mathcal{A}(t)$, defined by $A\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\}$, might become a fractal, i.e. spaghetti-like, manifold (cf. fig. 3) as a function of $\{q_t,p_t\}$ in $\mathcal{E}$ at $t \to \infty$ and lose compactness.
This can be expressed in mathematical terms: There exist series of points $\{a_n\} \in \mathcal{A}(\infty)$ which converge to a point $a_{n\to\infty}$ which is not in $\mathcal{A}(\infty)$. E.g. such points may have intruded from the phase space complementary to $\mathcal{A}(t_0)$. An illustrative example for this evolution of an initially compact sub-manifold into a fractal set is the baker transformation, discussed in this context by refs. [13, 14]. Then no macroscopic (incomplete) measurement at time $t = \infty$ can resolve $a_\infty$ from its immediate neighbors $a_n$ in phase space with distances $|a_n - a_\infty|$ less than any arbitrarily small $\delta$. In other words, at the time $t \gg t_0$ no macroscopic measurement with its incomplete information about $\{q_t,p_t\}$ can decide whether $\{q_0\{q_t,p_t\}, p_0\{q_t,p_t\}\} \in \mathcal{A}(t_0)$ or not. I.e. any macroscopic theory like thermodynamics can only deal with the closure of $\mathcal{A}(t)$. If necessary, the sub-manifold $\mathcal{A}(t)$ must be artificially closed to $\overline{\mathcal{A}(t)}$, as developed further in section 8. Clearly, in this approach this is the physical origin of irreversibility. We come back to this in section 8.
7 On Einstein's objections against the EPS-probability
According to Abraham Pais, "Subtle is the Lord" [11], Einstein was critical with regard to the definition of relative probabilities by eq. (11), Boltzmann's counting of "complexions". He considered it as artificial and not corresponding to the immediate picture of probability used in the actual problem: "The word probability is used in a sense that does not conform to its definition as given in the theory of probability. In particular, cases of equal probability are often hypothetically defined in instances where the theoretical pictures used are sufficiently definite to give a deduction rather than a hypothetical assertion" [4]. He preferred to define probability by the relative time a system (a trajectory of a single point moving with time in the $N$-body phase space) spends in a subset of the phase space. However, is this really the immediate picture of probability used in statistical mechanics? This definition demands the ergodicity of the trajectory in phase space. As we discussed above, thermodynamics, as any other macroscopic theory, handles incomplete, macroscopic information about the $N$-body system. It handles, consequently, the temporal evolution of finite-sized sub-manifolds (ensembles), not single points in phase space. The typical outcomes of macroscopic measurements are calculated. Nobody waits in a macroscopic measurement, e.g. of the temperature, long enough that an atom can cross the whole system.
In this respect, I think the EPS version of statistical mechanics is closer to the experimental situation than the duration-time of a single trajectory. Moreover, in an experiment on a small system like a nucleus, the excited nucleus, which then may fragment statistically later on, is produced by a multiple repetition of scattering events, and statistical averages are taken. No ergodic covering of the whole phase space by a single trajectory in time is demanded.
At the high excitations of the nuclei in the fragmentation region their life-time would be too short for that. This is analogous to the statistics of a falling ball on Galton's nail-board, where also a single trajectory is not touching all nails but is random. Only after many repetitions is the smooth binomial distribution established. As I am discussing here the Second Law in finite systems, this is the correct scenario, not the time average over a single ergodic trajectory.
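The Galton board analogy can be made concrete with a small simulation. This is a minimal sketch under simple assumptions: an idealized board with equal left/right probability at every nail, and hypothetical function names. A single ball visits only one nail per row, and the binomial shape appears only in the statistics over many repetitions.

```python
import random
from math import comb

def galton_counts(n_rows, n_balls, seed=0):
    """Histogram of final bins after n_balls independent runs over n_rows of nails."""
    rng = random.Random(seed)
    counts = [0] * (n_rows + 1)
    for _ in range(n_balls):
        # One trajectory: at each nail the ball goes left (0) or right (1)
        bin_index = sum(rng.randint(0, 1) for _ in range(n_rows))
        counts[bin_index] += 1
    return counts

def max_deviation_from_binomial(n_rows, n_balls, seed=0):
    """Largest difference between observed bin frequencies and the binomial law."""
    counts = galton_counts(n_rows, n_balls, seed)
    return max(abs(c / n_balls - comb(n_rows, k) / 2 ** n_rows)
               for k, c in enumerate(counts))

print(galton_counts(10, 1))                     # a single run fills exactly one bin
print(max_deviation_from_binomial(10, 20_000))  # many runs approach the binomial law
```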
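8 Fractal distributions in phase space, Second Law
Let us examine the following Gedanken experiment: Suppose the probability to find our system at points $\{q_t,p_t\}$ in phase space is uniformly distributed for times $t < t_0$ over the sub-manifold $\mathcal{E}(N,V_1)$ of the $N$-body phase space at energy $E$ and spatial volume $V_1$. At time $t > t_0$ we allow the system to spread over the larger volume $V_2 > V_1$ without changing its energy. If the system is dynamically mixing, the majority of trajectories $\{q_t,p_t\}$ in phase space starting from points $\{q_0,p_0\}$ with $q_0 \in V_1$ at $t_0$ will now spread over the larger volume $V_2$. Of course the Liouvillean measure of the distribution $\mathcal{M}\{q_t,p_t\}$ in phase space at $t > t_0$ will remain the same ($= \mathrm{tr}[\mathcal{E}(N,V_1)]$) [15]. (The label $\{q_0 \in V_1\}$ of the integral means that the positions $\{q_0\}$ are restricted to the volume $V_1$; the momenta $\{p_0\}$ are unrestricted.)
$$\mathrm{tr}[\mathcal{M}\{q_t\{q_0,p_0\}, p_t\{q_0,p_0\}\}]_{\{q_0 \in V_1\}} = \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\})$$
$$= \int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_0,p_0\}) \tag{14}$$
$$\text{because of} \qquad \frac{\partial\{q_t,p_t\}}{\partial\{q_0,p_0\}} = 1 \tag{15}$$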
But as already argued by Gibbs, the distribution $\mathcal{M}\{q_t,p_t\}$ will be filamented like ink in water and will approach any point of $\mathcal{E}(N,V_2)$ arbitrarily close. $\mathcal{M}\{q_t,p_t\}$ becomes dense in the new, larger $\mathcal{E}(N,V_2)$ for times sufficiently larger than $t_0$ (strictly in the $\lim_{t\to\infty}$). The closure $\overline{\mathcal{M}}$ becomes equal to $\mathcal{E}(N,V_2)$. This is clearly expressed by Lebowitz [16, 17].
In order to express this fact mathematically, we have to redefine Boltzmann's definition of entropy, eq. (1), and introduce the following fractal "measure" for integrals like (3) or (10):
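$$W(E,N,t \gg t_0) = \frac{1}{N!}\int_{\{q_0 \in V_1\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N\{q_t,p_t\}) \tag{16}$$
With the transformation
$$\int \left(d^3q_t\, d^3p_t\right)^N \cdots = \int d\sigma_1 \cdots d\sigma_{6N} \cdots \tag{17}$$
$$d\sigma_{6N} = \frac{1}{\|\nabla H\|}\sum_i\left(\frac{\partial H}{\partial q_i}\,dq_i + \frac{\partial H}{\partial p_i}\,dp_i\right) = \frac{1}{\|\nabla H\|}\,dE \tag{18}$$
$$\|\nabla H\| = \sqrt{\sum_i\left(\frac{\partial H}{\partial q_i}\right)^2 + \sum_i\left(\frac{\partial H}{\partial p_i}\right)^2} \tag{19}$$
$$W(E,N,t \gg t_0) = \frac{1}{N!\,(2\pi\hbar)^{3N}}\int_{\{q_0 \in V_1\}} d\sigma_1 \cdots d\sigma_{6N-1}\, \frac{\epsilon_0}{\|\nabla H\|} \tag{20}$$
we replace $\mathcal{M}$ by its closure $\overline{\mathcal{M}}$ and define now:
$$W(E,N,t \gg t_0) \to M(E,N,t \gg t_0) = \langle G(\mathcal{E}(N,V_2)) \rangle \cdot \mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] \tag{21}$$
where $\langle G(\mathcal{E}(N,V_2)) \rangle$ is the average of $\frac{\epsilon_0}{N!\,(2\pi\hbar)^{3N}\|\nabla H\|}$ over the (larger) manifold $\mathcal{E}(N,V_2)$, and $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ is the box-counting volume of $\mathcal{M}(E,N,t \gg t_0)$, which is the same as the volume of $\overline{\mathcal{M}}$, see below.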
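To obtain $\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)]$ we cover the $d$-dim. sub-manifold $\mathcal{M}(t)$, here with $d = (6N-1)$, of the phase space by a grid with spacing $\delta$ and count the number $N_\delta \propto \delta^{-d}$ of boxes of size $\delta^{6N}$ which contain points of $\mathcal{M}$. Then we determine
$$\mathrm{vol}_{box}[\mathcal{M}(E,N,t \gg t_0)] = \underline{\lim}_{\delta \to 0}\, \delta^d N_\delta[\mathcal{M}(E,N,t \gg t_0)] \tag{22}$$
with $\underline{\lim} = \inf[\lim]$, or symbolically:
$$M(E,N,t \gg t_0) = B_d\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) \tag{23}$$
$$\to \frac{1}{N!}\int_{\{q_t \in V_2\}} \left(\frac{d^3q_t\, d^3p_t}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) = W(E,N,V_2) \ge W(E,N,V_1) \tag{24}$$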
[Figure 3: panel labels $V_a$, $V_b$, $V_a + V_b$; $t < t_0$, $t > t_0$]

Figure 3: The compact set $\mathcal{M}(t_0)$, left side, develops into an increasingly folded "spaghetti"-like distribution in phase-space with rising time $t$. This figure shows only the early form of the distribution. At much larger times it will become more and more fractal. The grid illustrates the boxes of the box-counting method. All boxes which overlap with $\mathcal{M}(t)$ are counted in $N_\delta$ in eq. (22).
where $B_d\!\int$ means that this integral should be evaluated via the box-counting volume (22), here with $d = 6N-1$. This is illustrated by figure 3. With this extension of eq. (3), Boltzmann's entropy (1) is at time $t \to \infty$ equal to the logarithm of the larger phase space $W(E,N,V_2)$. This is the Second Law of Thermodynamics. The box-counting is also used in the definition of the Kolmogorov entropy, the average rate of entropy gain [18, 19]. Of course, still at $t_0$, $\overline{\mathcal{M}(t_0)} = \mathcal{M}(t_0) = \mathcal{E}(N,V_1)$:
$$M(E,N,t_0) = B_d\!\!\int_{\{q_0 \in V_1\}} \frac{1}{N!}\left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) \tag{25}$$
$$= \frac{1}{N!}\int_{\{q_0 \in V_1\}} \left(\frac{d^3q_0\, d^3p_0}{(2\pi\hbar)^3}\right)^N \epsilon_0\, \delta(E - H_N) = W(E,N,V_1) \tag{26}$$
The box-counting volume is analogous to the standard method to determine the fractal dimension of a set of points [18] by the box-counting dimension:
$$\dim_{box}[\mathcal{M}(E,N,t \gg t_0)] = \lim_{\delta \to 0} \frac{\ln N_\delta[\mathcal{M}(E,N,t \gg t_0)]}{-\ln \delta} \tag{27}$$
Like the box-counting dimension, $\mathrm{vol}_{box}$ has the peculiarity that it is equal to the volume of the smallest closed covering set. E.g. the box-counting volume of the set of rational numbers $\{Q\}$ between 0 and 1 is $\mathrm{vol}_{box}\{Q\} = 1$, and thus equal to the measure of the real numbers, cf. Falconer [18], section 3.1. This is the reason why $\mathrm{vol}_{box}$ is not a measure in its mathematical definition, because then we should have
$$\mathrm{vol}_{box}\Big[\sum_{i \in \{Q\}} (\mathcal{M}_i)\Big] = \sum_{i \in \{Q\}} \mathrm{vol}_{box}[\mathcal{M}_i] = 0 \tag{28}$$
therefore the quotation marks for the box-counting "measure".
Coming back to the end of section 6, the volume $W(A,B,\cdots,t)$ of the relevant ensemble, the closure $\overline{\mathcal{M}(t)}$, must be "measured" by something like the box-counting "measure" (22, 23) with the box-counting integral $B_d\!\int$, which must replace the integral in eq. (3). Due to the fact that the box-counting volume is equal to the volume of the smallest closed covering set, the new, extended definition of the phase-space integral, eq. (23), is for compact sets like the equilibrium distribution $\mathcal{E}$ identical to the old one, eq. (3). Therefore, one can simply replace the old Boltzmann definition of the number of complexions, and with it of the entropy, by the new one (23).
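The difference between the conserved Liouville measure and the box-counting volume at finite resolution can be illustrated with the baker transformation mentioned in section 6. The following is only a two-dimensional sketch, not the $6N$-dimensional construction of the text: the unit square stands in for the larger manifold, its left half for the initial compact set, and the grid spacing is kept fixed at $\delta = 1/32$ instead of taking the limit in eq. (22). The sample size, seed and function names are arbitrary choices.

```python
import numpy as np

def baker_map(x, y):
    """One step of the area-preserving baker transformation on the unit square."""
    upper = np.floor(2.0 * x)  # 0 for x < 1/2, 1 otherwise
    return 2.0 * x - upper, (y + upper) / 2.0

def box_counting_volume(x, y, boxes_per_side):
    """delta^d * N_delta with d = 2: total area of the grid boxes that contain a point."""
    ix = np.minimum((x * boxes_per_side).astype(int), boxes_per_side - 1)
    iy = np.minimum((y * boxes_per_side).astype(int), boxes_per_side - 1)
    occupied = np.unique(ix * boxes_per_side + iy).size
    return occupied / boxes_per_side ** 2

def volumes_over_time(n_steps, n_points=200_000, boxes_per_side=32, seed=0):
    rng = np.random.default_rng(seed)
    x = 0.5 * rng.random(n_points)  # initial compact set: left half of the square
    y = rng.random(n_points)
    volumes = [box_counting_volume(x, y, boxes_per_side)]
    for _ in range(n_steps):
        x, y = baker_map(x, y)
        volumes.append(box_counting_volume(x, y, boxes_per_side))
    return volumes

print(volumes_over_time(8))
```

Each step of the map has unit Jacobian, so the area of the evolving set stays 1/2. In this sketch the coarse-grained volume nevertheless rises from 0.5 to 1.0 once the stripes become thinner than the grid spacing, which at this resolution happens at the sixth step. With a finer grid the rise sets in later, reflecting the dependence on the order of the limits $\delta \to 0$ and $t \to \infty$.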
9 Conclusion
Macroscopic measurements $M$ determine only a very few of all $6N$ d.o.f. Any macroscopic theory like thermodynamics deals with the volumes $M$ of the corresponding closed sub-manifolds $\overline{\mathcal{M}}$ in the $6N$-dim. phase space, not with single points. The averaging over ensembles or finite sub-manifolds in phase space becomes especially important for the microcanonical ensemble of a finite system.
Because of this necessarily coarse information, macroscopic measurements, and with them also macroscopic theories, are unable to distinguish fractal sets $\mathcal{M}$ from their closures $\overline{\mathcal{M}}$. Therefore, I make the conjecture: the proper manifolds determined by a macroscopic theory like thermodynamics are the closed $\overline{\mathcal{M}}$. However, an initially closed subset of points at time $t_0$ does not necessarily evolve again into a closed subset at $t \gg t_0$. I.e. the closure operation and the $t \to \infty$ limit do not commute, and the macroscopic dynamics becomes irreversible. The $\lim_{t\to\infty}$ and $\lim_{\delta\to 0}$ may be linked, as e.g. $\delta > \mathrm{const}/t$, with the $\delta \to 0$ limit taken after the $t \to \infty$ limit.
Here is the origin of the misunderstanding by the famous reversibility paradoxes which were invented by Loschmidt [20] and Zermelo [21, 22] and which bothered Boltzmann so much [23, 24]. These paradoxes address trajectories of single points in the $N$-body phase space which must return after Poincaré's recurrence time or which must run backwards if all momenta are exactly reversed. Therefore, Loschmidt and Zermelo concluded that the entropy should decrease as well as it was increasing before. The specification of a single point demands of course a microscopically exact specification of all $6N$ degrees of freedom, not a determination of a few macroscopic degrees of freedom only. No entropy is defined for a single point.
By our formulation of thermo-statistics various non-trivial limiting processes can be avoided. Neither does one invoke the thermodynamic limit of a homogeneous system with infinitely many particles, nor does one rely on the ergodic hypothesis of the equivalence of (very long) time averages and ensemble averages. The use of ensemble averages is justified directly by the very nature of macroscopic (incomplete) measurements. Coarse-graining appears as a natural consequence of this. The box-counting method mirrors the averaging over the overwhelming number of non-determined degrees of freedom. Of course, a fully consistent theory must use this averaging explicitly. Then one would not depend on the order of the limits $\lim_{\delta\to 0}\lim_{t\to\infty}$, as it was tacitly assumed here. Presumably, the rise of the entropy can then be already seen at finite times, when the fractality of the distribution in phase space is not yet fully developed. The coarse-graining is no longer a mathematical ad hoc assumption. Moreover, the Second Law is in the EPS-formulation of statistical mechanics not linked to the thermodynamic limit, as was thought up to now [16, 17].
Appendix
In the mathematical theory of fractals [18] one usually uses the Hausdorff measure or the Hausdorff dimension of the fractal [19]. This, however, would be wrong in Statistical Mechanics. Here I want to point out the difference between the box-counting "measure" and the proper Hausdorff measure of a manifold of points in phase space. Without going into too much mathematical detail, we can make this clear again with the same example as above: The Hausdorff measure of the rational numbers $\in [0,1]$ is 0, whereas the Hausdorff measure of the real numbers $\in [0,1]$ is 1. Therefore, the Hausdorff measure of a set is a proper measure. The Hausdorff measure of the fractal distribution in phase space $\mathcal{M}(t \to \infty)$ is the same as that of $\mathcal{M}(t_0)$, $W(E,N,V_1)$. Measured by the Hausdorff measure, the phase space volume of the fractal distribution $\mathcal{M}(t \to \infty)$ is conserved and Liouville's theorem applies. This would demand that thermodynamics could distinguish between any point inside the fractal and any point outside of it, independently of how close it is. This, however, is impossible for any macroscopic theory that can only address macroscopic information, where all unobserved degrees of freedom are averaged over. That is the deep reason why the box-counting "measure" must be taken and where irreversibility comes from.
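The example of the rational numbers can be made tangible with exact arithmetic. This is an illustrative sketch with hypothetical function names. It cannot take the full limit over all rationals, so it uses the finite sets of rationals with bounded denominator. Any such finite set has vanishing box-counting volume as the grid is refined, but as soon as the denominators are allowed to reach the number of boxes, every box is occupied.

```python
from fractions import Fraction

def rationals_up_to(max_denominator):
    """All rationals p/q in [0, 1] with q <= max_denominator."""
    return {Fraction(p, q)
            for q in range(1, max_denominator + 1)
            for p in range(q + 1)}

def box_counting_volume_1d(points, n_boxes):
    """delta * N_delta for a grid of n_boxes intervals of length 1/n_boxes on [0, 1]."""
    occupied = {min(int(x * n_boxes), n_boxes - 1) for x in points}
    return Fraction(len(occupied), n_boxes)

# A fixed finite set: its box-counting volume shrinks as the grid is refined
print(box_counting_volume_1d(rationals_up_to(8), 1024))
# Denominators up to the number of boxes: every box contains a rational
print(box_counting_volume_1d(rationals_up_to(64), 64))
```

Since the full set of rationals contains all denominators, it occupies every box at every grid spacing, which is why its box-counting volume is 1 although it is countable and has Hausdorff (and Lebesgue) measure 0.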
Acknowledgement
I thank E. G. D. Cohen and Pierre Gaspard for detailed discussions.
References
1. D. H. E. Gross, Microcanonical thermodynamics: Phase transitions in Small systems, Lecture Notes in Physics (World Scientific, Singapore, 2000).
2. D. H. E. Gross and E. Votyakov, Phase transitions in small systems, Eur. Phys. J. B 15, 115-126 (2000), http://arXiv.org/abs/cond-mat/9911257.
3. D. H. E. Gross, Micro-canonical statistical mechanics of some non-extensive systems, http://arXiv.org/abs/astro-ph/cond-mat/0004268 (2000).
4. A. Einstein, Über einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt, Annalen der Physik 17, 132 (1905).
5. L. Boltzmann, Über die Beziehung eines allgemeinen mechanischen Satzes zum Hauptsatz der Wärmelehre, Sitzungsbericht der Akademie der Wissenschaften, Wien 2, 67-73 (1877).
6. L. Boltzmann, Über die Begründung einer kinetischen Gastheorie auf anziehende Kräfte allein, Wiener Berichte 89, 714 (1884).
7. E. Schrödinger, Statistical Thermodynamics, a Course of Seminar Lectures, delivered in January-March 1944 at the School of Theoretical Physics (Cambridge University Press, London, 1946).
8. Elliott H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Report, cond-mat/9708200, 310, 1-96 (1999).
9. J. Bricmont, Science of chaos or chaos in science?, Physicalia Magazine, Proceedings of the New York Academy of Science, to appear, 1-50 (2000).
10. D. H. E. Gross, Phase transitions in small systems - a challenge for thermodynamics, http://arXiv.org/abs/cond-mat/0006087, page 8 (2000).
11. A. Pais, Subtle is the Lord, chapter 4, pages 60-78 (Oxford University Press, Oxford, 1982).
12. N. S. Krylov, Works on the Foundation of Statistical Physics (Princeton University Press, Princeton, 1979).
13. R. F. Fox, Entropy evolution for the baker map, Chaos 8, 462-465 (1998).
14. T. Gilbert, J. R. Dorfman, and P. Gaspard, Entropy production, fractals, and relaxation to equilibrium, Phys. Rev. Lett. 85, 1606, nlin.CD/000301 (2000).
15. H. Goldstein, Classical Mechanics (Addison-Wesley, Reading, Mass., 1959).
16. J. L. Lebowitz, Microscopic origins of irreversible macroscopic behavior, Physica A 263, 516-527 (1999).
17. J. L. Lebowitz, Statistical mechanics: A selective review of two central issues, Rev. Mod. Phys. 71, S346-S357 (1999).
18. K. Falconer, Fractal Geometry - Mathematical Foundations and Applications (John Wiley & Sons, Chichester, New York, Brisbane, Toronto, Singapore, 1990).
19. E. W. Weisstein, Concise Encyclopedia of Mathematics (CRC Press, London, New York, Washington D.C., 1999), CD-ROM edition 1, 20.5.99.
20. J. Loschmidt, Wienerberichte 73, 128 (1876).
21. E. Zermelo, Wied. Ann. 57, 778-784 (1896).
22. E. Zermelo, Über die mechanische Erklärung irreversibler Vorgänge, Wied. Ann. 60, 392-398 (1897).
23. E. G. D. Cohen, Boltzmann and statistical mechanics, in Boltzmann's Legacy, 150 Years after his Birth, http://xxx.lanl.gov/abs/cond-mat/9608054 (Atti dell'Accademia dei Lincei, Rome, 1997).
24. E. G. D. Cohen, Boltzmann and Statistical Mechanics, volume 371 of Dynamics: Models and Kinetic Methods for Nonequilibrium Many Body Systems, J. Karkheck, editor, 223-238 (Kluwer, Dordrecht, The Netherlands, 2000).
AN APPROACH TO QUANTUM PROBABILITY
STAN GUDDER
Department of Mathematics, University of Denver, Denver, Colorado 80208
sgudder@cs.du.edu
We present an approach to quantum probability that is motivated by the Feynman formalism. This approach shows that there is a realistic description of quantum mechanics and that nonrelativistic quantum theory can be derived from simple postulates of quantum probability. The basic concepts in this framework are measurements and actions. The measurements are similar to the dynamic variables of classical mechanics and the random variables of classical probability theory. The actions correspond to quantum mechanical states. An influence between configurations of a physical system is defined in terms of an action. The fundamental postulate of this approach is that the probability density at a measurement outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing the measurement.
1 Introduction
We shall discuss a new approach to quantum probability that combines a reformulation of the mathematical foundations of quantum mechanics and the basic tenets of probability theory. This approach is motivated by the Feynman formalism [1] and it answers various puzzling questions about traditional quantum mechanics. Some of these questions are the following.
1. Where does the quantum mechanical Hilbert space $H$ come from?
2. Why are states represented by unit vectors in $H$ and observables by self-adjoint operators on $H$?
3. Why does the probability have its postulated form?
4. Why do the position and momentum operators have their particular forms?
5. Why does a physical theory that must give real-valued results involve complex amplitudes or states?
6. Is there a realistic description of quantum mechanics?
Our philosophy is that quantum probability theory need not be the same as classical probability theory. That is, the probability need not be given by a measure. However, the predictions of quantum probability theory should agree with experimental long run relative frequencies. We shall show that there is a realistic description of quantum mechanics. In other words, a quantum system has properties independent of observation. We also show that nonrelativistic quantum mechanics can be derived from simple postulates of this approach. Our presentation is a modified version of the discussion in Gudder [2].
2 Formulation
We denote the set of possible configurations of a physical system $\mathcal{S}$ by $\Omega$ and call $\Omega$ a sample space. If $X$ is a measurement on $\mathcal{S}$, then executing $X$ results in a unique outcome depending on the configuration $\omega$ of $\mathcal{S}$. To be precise, we define a measurement to be a map $X$ from $\Omega$ onto its range $R(X) \subseteq \mathbb{R}$ satisfying
(M1) $R(X)$ is the base space of a measure space $(R(X), \Sigma_X, \mu_X)$.
(M2) $X^{-1}(x)$ is the base space of a measurable space $(X^{-1}(x), \Sigma_X^x)$ for every $x \in R(X)$.
We call the elements of $R(X)$ $X$-outcomes and the sets in $\Sigma_X$ are $X$-events. Note that $X^{-1}(x)$ corresponds to the set of configurations resulting in outcome $x$ when $X$ is executed, and we call $X^{-1}(x)$ the $X$-fiber over $x$. The measure $\mu_X$ represents an a priori weight due to our knowledge of the system (for example, we may know the energy of $\mathcal{S}$ or we might assume the energy has a certain value). In the case of total ignorance, the weight is taken to be counting measure in the discrete case and uniform measure in the continuous case. This framework gives a realistic theory because a configuration $\omega$ determines the properties of $\mathcal{S}$ independent of any particular observation. That is, $\omega$ determines the outcomes of all measurements simultaneously. Notice that measurements are similar to the dynamical variables of classical mechanics and the random variables of classical probability theory. The sample space $\Omega$ gives an underlying level of reality upon which traditional quantum mechanics can be constructed.
If $X$ is a measurement, an $X$-action is a pair
$$(S, \mu_X^x,\ x \in R(X))$$
where $S\colon \Omega \to \mathbb{R}$ and $\mu_X^x$ is a measure on $(X^{-1}(x), \Sigma_X^x)$. As we shall see, actions correspond to quantum states. For simplicity, we frequently denote an action by $S$, and we remark that $S$ depends on our model of $\mathcal{S}$ and also on our knowledge of $\mathcal{S}$. We define the influence between $\omega, \omega' \in \Omega$ relative to $S$ by
$$F_S(\omega,\omega') = N_S^2 \cos[S(\omega) - S(\omega')] \tag{1}$$
where $N_S > 0$ is a normalization constant. The appearance of the cosine in (1) is not arbitrary, but it can be derived from the regularity conditions of continuity and causality [2, 5].
We now make a fundamental reformulation of the probability concept [2, 5]. We postulate that the probability density $P_{X,S}(x)$ of an $X$-outcome $x$ is the sum (or integral) of the influences between each pair of configurations that result in $x$ upon executing $X$. Precisely, we postulate that $F_S(\omega,\omega')$ is integrable and that
$$P_{X,S}(x) = \int_{X^{-1}(x)}\int_{X^{-1}(x)} F_S(\omega,\omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega') \tag{2}$$
Also, to ensure that $P_{X,S}(x)$ is indeed a probability density, we assume that $P_{X,S}(x)$ is measurable with respect to $\Sigma_X$ and that
$$\int_{R(X)} P_{X,S}(x)\, \mu_X(dx) = 1 \tag{3}$$
Equation (3) can be employed to find $N_S$. To show that $P_{X,S}(x) \ge 0$ we have
$$P_{X,S}(x) = N_S^2 \int_{X^{-1}(x)}\int_{X^{-1}(x)} [\cos S(\omega)\cos S(\omega') + \sin S(\omega)\sin S(\omega')]\, \mu_X^x(d\omega)\, \mu_X^x(d\omega')$$
$$= N_S^2 \left\{ \left[\int_{X^{-1}(x)} \cos S(\omega)\, \mu_X^x(d\omega)\right]^2 + \left[\int_{X^{-1}(x)} \sin S(\omega)\, \mu_X^x(d\omega)\right]^2 \right\} \ge 0 \tag{4}$$
We conclude that $P_{X,S}(x)$ is a probability density on $(R(X), \Sigma_X, \mu_X)$.
If $B \in \Sigma_X$ is an $X$-event, we define the $(X,S)$-probability of $B$ by
$$P_{X,S}(B) = \int_B P_{X,S}(x)\, \mu_X(dx) \tag{5}$$
Then $P_{X,S}\colon \Sigma_X \to [0,1]$ is a probability measure on $(R(X), \Sigma_X)$ that we call the $S$-distribution of $X$. If $h\colon R(X) \to \mathbb{R}$ is $\mu_X$-integrable, then the $S$-expectation of $h(X)$ is defined by
$$E_S(h(X)) = \int_{R(X)} h(x)\, P_{X,S}(dx) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx) \tag{6}$$
In particular, if $h$ is the identity function, the $S$-expectation of $X$ becomes
$$E_S(X) = \int_{R(X)} x\, P_{X,S}(x)\, \mu_X(dx) \tag{7}$$
Influence is a strictly quantum phenomenon that is not present in classical physics. In the classical limit, $F_S(\omega,\omega')$ approaches a delta function $\delta_\omega(\omega')$. In this limit $F_S(\omega,\omega') = 0$ for $\omega \ne \omega'$ and there is no influence between distinct configurations. We then have $P_{X,S}(x) = \mu_X^x(X^{-1}(x))$, which gives a classical probability framework.
We can extend this theory to include expectations of other functions on $\Omega$. Let $g\colon \Omega \to \mathbb{R}$ be a function that is integrable along $X$-fibers. We define the $(X,S)$-expectation of $g$ at $x$ by
$$E_{X,S}(g)(x) = \int_{X^{-1}(x)}\int_{X^{-1}(x)} g(\omega)\, F_S(\omega,\omega')\, \mu_X^x(d\omega)\, \mu_X^x(d\omega') \tag{8}$$
This is the natural generalization of (2) from a probability density to an expectation density. If $E_{X,S}(g)$ is integrable, then the $(X,S)$-expectation of $g$ is given by
$$E_{X,S}(g) = \int_{R(X)} E_{X,S}(g)(x)\, \mu_X(dx) \tag{9}$$
In particular, if $g(\omega) = h(X(\omega))$, then
$$E_{X,S}(g)(x) = h(x)\, P_{X,S}(x)$$
and
$$E_{X,S}(g) = \int_{R(X)} h(x)\, P_{X,S}(x)\, \mu_X(dx) = E_S(h(X))$$
This shows that (9) is an extension of (6). We can also use this formalism to compute probabilities of events in $\Omega$. Let $A \subseteq \Omega$ and denote the characteristic function of $A$ by $\chi_A$. If $\chi_A$ is integrable along $X$-fibers we define, analogously as in classical probability theory, the $(X,S)$-pseudoprobability of $A$ by
$$P_{X,S}(A) = E_{X,S}(\chi_A)$$
It follows from (3) and (9) that $P_{X,S}(\Omega) = 1$ and $P_{X,S}$ is countably additive. However, $P_{X,S}$ may have negative values, which is why it is called a pseudo-probability. Nevertheless, there are $\sigma$-algebras of subsets of $\Omega$ on which $P_{X,S}$ is a probability measure. For example, if $A = X^{-1}(B)$ for $B \in \Sigma_X$, then it can be shown that $P_{X,S}(A) = P_{X,S}(B)$ [2]. Therefore, in this case $P_{X,S}$ reduces to the distribution $P_{X,S}$. We shall consider some less trivial examples later.
3 Wave Functions and Hilbert Space
This section employs the formalism of Section 2 to derive the wave functions and Hilbert space of traditional quantum mechanics. It is not necessary to do this, because the needed probability formulas have been presented in Section 2. However, as we shall see, the Hilbert space formulation gives more convenient and concise notation.
Applying (4) we obtain
$$P_{X,S}(x) = \left| \int_{X^{-1}(x)} N_S\,e^{iS(\omega)}\,\mu_X^x(d\omega) \right|^2 \tag{10}$$
We call the function
$$f_S(\omega) = N_S\,e^{iS(\omega)} \tag{11}$$
the S-amplitude function and define the (X, S)-wave function by
$$f_{X,S}(x) = \int_{X^{-1}(x)} f_S(\omega)\,\mu_X^x(d\omega) \tag{12}$$
From (10) and (12) we obtain
$$P_{X,S}(x) = \left| f_{X,S}(x) \right|^2 \tag{13}$$
We also have
$$F_S(\omega,\omega') = N_S^2\,\mathrm{Re}\, e^{iS(\omega)} e^{-iS(\omega')} = \mathrm{Re}\, f_S(\omega)\,\overline{f_S(\omega')} \tag{14}$$
Equation (10) shows how the complex numbers arise in quantum mechanics. The complex numbers are not needed for the computation of $P_{X,S}$, because we can always write $F_S(\omega,\omega')$ in the form (1). They are merely a convenience that gives a simple and concise formula. Equation (11) gives the Feynman amplitude function, which we have now derived from deeper principles, and (12) is Feynman's prescription that the amplitude of an outcome x is the sum (or integral) of the amplitudes of the configurations (or alternatives) that result in x when X is executed.
If $B \in \Sigma_X$, applying (5) and (13) gives
$$P_{X,S}(B) = \int_B \left| f_{X,S}(x) \right|^2 \mu_X(dx) \tag{15}$$
and this is the usual probabilistic formula of traditional quantum mechanics. It follows from (3) that $f_{X,S}$ is a unit vector in the Hilbert space $L^2(R(X), \Sigma_X, \mu_X)$, and this derives the quantum Hilbert space and the vector form for a state. If $\mathcal{A}_X$ is a set of X-actions, then the Hilbert space $H_X \subseteq L^2(R(X), \Sigma_X, \mu_X)$ generated by the set of wave functions $\{f_{X,S} : S \in \mathcal{A}_X\}$ is called an X-Hilbert space. Some X-actions may not be relevant for physical reasons, so we may want $\mathcal{A}_X$ to be a proper subset of the set of all X-actions.
If $g\colon \Omega \to \mathbb{R}$ is integrable along X-fibers and $S \in \mathcal{A}_X$, we define the (X, S)-amplitude average of g at x by
$$f_{X,S}(g)(x) = \int_{X^{-1}(x)} g(\omega)\,f_S(\omega)\,\mu_X^x(d\omega) = N_S \int_{X^{-1}(x)} g(\omega)\,e^{iS(\omega)}\,\mu_X^x(d\omega) \tag{16}$$
Applying (8) and (14) we obtain
$$E_{X,S}(g)(x) = \mathrm{Re} \int_{X^{-1}(x)} g(\omega)\,f_S(\omega)\,\mu_X^x(d\omega) \int_{X^{-1}(x)} \overline{f_S(\omega')}\,\mu_X^x(d\omega') = \mathrm{Re}\, f_{X,S}(g)(x)\,\overline{f_{X,S}(x)}$$
It follows from (9) that
$$E_{X,S}(g) = \mathrm{Re}\,\langle f_{X,S}(g), f_{X,S} \rangle \tag{17}$$
Define the linear operator $\hat{g}$ on $H_X$ by $\hat{g} f_{X,S}(x) = f_{X,S}(g)(x)$ and extend by linearity. If the operator $\hat{g}$ is self-adjoint on $H_X$, we call g an X-observable and we have
$$E_{X,S}(g) = \langle \hat{g} f_{X,S}, f_{X,S} \rangle \tag{18}$$
for all $S \in \mathcal{A}_X$. We then say that g is represented by the self-adjoint operator $\hat{g}$ on $H_X$. This derives the representation of observables by self-adjoint operators.
For a simple example of a representation, let $g\colon \Omega \to \mathbb{R}$ be a constant function $g(\omega) = c$. Then (16) gives
$$f_{X,S}(g)(x) = c \int_{X^{-1}(x)} f_S(\omega)\,\mu_X^x(d\omega) = c\,f_{X,S}(x)$$
Hence g is an X-observable and is represented by the self-adjoint operator $cI$. As another example, letting $g = X$ we have by (16) that
$$f_{X,S}(X)(x) = x\,f_{X,S}(x)$$
It follows that X is represented by the self-adjoint operator $\hat{X}$ on $H_X$ given by $\hat{X}u(x) = x\,u(x)$. We conclude that $H_X$ is a Hilbert space in which X is diagonal. More generally, since
$$f_{X,S}(h(X))(x) = h(x)\,f_{X,S}(x) \tag{19}$$
we see that $h(X)$ is represented by the self-adjoint operator $h(X)^{\wedge}u(x) = h(x)u(x)$. Moreover, the spectral measure $P^X$ is given by $P^X(B)u(x) = \chi_B(x)u(x)$, and applying (15) gives
$$P_{X,S}(B) = \left\| P^X(B) f_{X,S} \right\|^2$$
which is again a standard probabilistic formula. Finally, for $A \subseteq \Omega$, the (X, S)-pseudoprobability becomes by (17)
$$P_{X,S}(A) = \mathrm{Re}\,\langle f_{X,S}(\chi_A), f_{X,S} \rangle \tag{20}$$
where by (16) we have
$$f_{X,S}(\chi_A)(x) = \int_{X^{-1}(x) \cap A} f_S(\omega)\,\mu_X^x(d\omega) = N_S \int_{X^{-1}(x) \cap A} e^{iS(\omega)}\,\mu_X^x(d\omega) \tag{21}$$
4 Spin
We now illustrate the framework presented in the last two sections by presenting a model for spin-1/2 measurements. Fix a direction corresponding to the z axis and assume that the spin $j_z$ in the z direction is known (either 1/2 or -1/2). Let $\omega \in [0,\pi]$ denote a direction whose angle to the z axis is $\omega$. By symmetry, the spin distribution should depend only on $\omega$. Let $\Omega = [0,\pi]$, $\theta \in \Omega$, and let $X\colon \Omega \to \{-1/2, 1/2\}$ be the function
$$X(\omega) = -1/2 \text{ for } \omega \in [0,\theta] \quad\text{and}\quad X(\omega) = 1/2 \text{ for } \omega \in (\theta,\pi]$$
We make X into a measurement by defining
$$\mu_X(-1/2) = \mu_X(1/2) = 1$$
and endowing $X^{-1}(-1/2) = [0,\theta]$ and $X^{-1}(1/2) = (\theta,\pi]$ with the usual Borel structure. The function X corresponds to a spin-1/2 measurement in the $\theta$ direction. Letting $\theta$ vary, we obtain an infinite number of spin measurements, each applied in a different direction. Observe that a sample point $\omega \in \Omega$ determines the spin in every direction simultaneously.
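For $j_z = 1/2$ we define the X-action $(S, \mu_X^{-1/2}, \mu_X^{1/2})$ given by $S(\omega) = \omega$, where $\mu_X^{-1/2}$ and $\mu_X^{1/2}$ are $\mu/2$, with $\mu$ the Lebesgue measure restricted to $X^{-1}(-1/2)$ and $X^{-1}(1/2)$, respectively. We then have
$$F_S(\omega,\omega') = \cos(\omega - \omega')$$
(we shall see that $N_S = 1$). The probabilities become
$$\begin{aligned}
P_{X,S}(-1/2) &= \tfrac{1}{4} \int_0^\theta \int_0^\theta \cos(\omega - \omega')\,d\omega\,d\omega' \\
&= \tfrac{1}{4}\left[\int_0^\theta \cos\omega\,d\omega\right]^2 + \tfrac{1}{4}\left[\int_0^\theta \sin\omega\,d\omega\right]^2 \\
&= \tfrac{1}{4}\sin^2\theta + \tfrac{1}{4}(1 - \cos\theta)^2 = \sin^2\tfrac{\theta}{2}
\end{aligned} \tag{22}$$
$$\begin{aligned}
P_{X,S}(1/2) &= \tfrac{1}{4} \int_\theta^\pi \int_\theta^\pi \cos(\omega - \omega')\,d\omega\,d\omega' \\
&= \tfrac{1}{4}\left[\int_\theta^\pi \cos\omega\,d\omega\right]^2 + \tfrac{1}{4}\left[\int_\theta^\pi \sin\omega\,d\omega\right]^2 \\
&= \tfrac{1}{4}\sin^2\theta + \tfrac{1}{4}(1 + \cos\theta)^2 = \cos^2\tfrac{\theta}{2}
\end{aligned} \tag{23}$$
Since $P_{X,S}(-1/2) + P_{X,S}(1/2) = 1$, we see that $N_S = 1$. Notice that (22) and (23) are the usual probability distribution for spin in the $\theta$ direction when $j_z = 1/2$.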
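The probabilities (22) and (23) can be checked numerically. The following is a minimal sketch, not part of the original paper: it evaluates the double integral of the influence function cos(w - w') against the fibre measures mu/2 by a midpoint rule. The function names and the grid size `n` are illustrative assumptions.

```python
import math


def influence_integral(a, b, n=200):
    """Midpoint-rule value of (1/4) * int_a^b int_a^b cos(w - w2) dw dw2.

    The factor 1/4 comes from the two fibre measures mu/2 (assumed N_S = 1).
    """
    if b <= a:
        return 0.0
    h = (b - a) / n
    nodes = [a + (k + 0.5) * h for k in range(n)]
    total = 0.0
    for w in nodes:
        for w2 in nodes:
            total += math.cos(w - w2)
    return 0.25 * total * h * h


def spin_probabilities(theta, n=200):
    """Return (P(-1/2), P(+1/2)) for the action S(w) = w, i.e. j_z = +1/2."""
    p_minus = influence_integral(0.0, theta, n)     # fibre [0, theta]
    p_plus = influence_integral(theta, math.pi, n)  # fibre (theta, pi]
    return p_minus, p_plus
```

For any angle in [0, pi] the two values should agree with sin^2(theta/2) and cos^2(theta/2) up to the quadrature error, and should sum to 1.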
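For $j_z = -1/2$ we define the X-action $(S', \nu_X^{-1/2}, \nu_X^{1/2})$ given by $S'(\omega) = \omega$ for $\omega \in (0,\pi)$ and $S'(\omega) = -\pi/2$ for $\omega \in \{0,\pi\}$, with $\nu_X^{-1/2} = \delta_0 + \mu/2$ and $\nu_X^{1/2} = \delta_\pi + \mu/2$, where $\delta_0$, $\delta_\pi$ are the Dirac point measures at 0, $\pi$, respectively. A similar but more tedious calculation gives
$$P_{X,S'}(-1/2) = \cos^2\tfrac{\theta}{2}$$
$$P_{X,S'}(1/2) = \sin^2\tfrac{\theta}{2}$$
which is the usual distribution for spin in the $\theta$ direction when $j_z = -1/2$.
We now examine the wave functions and Hilbert space corresponding to this model. The S-amplitude function becomes $f_S(\omega) = e^{i\omega}$ and the (X, S)-wave function $f_{X,S}$ is given by
$$f_{X,S}(-1/2) = \tfrac{1}{2}\int_0^\theta e^{i\omega}\,d\omega = \tfrac{i}{2}\left(1 - e^{i\theta}\right)$$
$$f_{X,S}(1/2) = \tfrac{1}{2}\int_\theta^\pi e^{i\omega}\,d\omega = \tfrac{i}{2}\left(1 + e^{i\theta}\right)$$
The S'-amplitude function becomes $f_{S'}(\omega) = e^{i\omega}$ for $\omega \in (0,\pi)$ and $f_{S'}(\omega) = -i$ for $\omega \in \{0,\pi\}$, and the (X, S')-wave function $f_{X,S'}$ is given by
$$f_{X,S'}(-1/2) = \int_{[0,\theta]} f_{S'}(\omega)\,\nu_X^{-1/2}(d\omega) = -i + \tfrac{1}{2}\int_0^\theta e^{i\omega}\,d\omega = -\tfrac{i}{2}\left(1 + e^{i\theta}\right)$$
$$f_{X,S'}(1/2) = \int_{(\theta,\pi]} f_{S'}(\omega)\,\nu_X^{1/2}(d\omega) = -i + \tfrac{1}{2}\int_\theta^\pi e^{i\omega}\,d\omega = -\tfrac{i}{2}\left(1 - e^{i\theta}\right)$$
The X-Hilbert space is clearly $\mathbb{C}^2$, and we can represent $f_{X,S}$ and $f_{X,S'}$ in $\mathbb{C}^2$ (up to an overall phase) by the unit vectors
$$v_S = \tfrac{1}{2}\left(1 - e^{i\theta},\ 1 + e^{i\theta}\right), \qquad v_{S'} = \tfrac{1}{2}\left(1 + e^{i\theta},\ 1 - e^{i\theta}\right)$$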
Notice that $v_S \perp v_{S'}$. Also, when $\theta = 0$, $v_S = (0,1)$ and $v_{S'} = (1,0)$, which are the usual eigenvectors for the spin-1/2 operator in the z direction. We can treat this as a measurement and the general X as an observable. It can be shown that the matrix for X in the standard basis (1,0), (0,1) becomes
$$\hat{X} = \frac{1}{2}\begin{pmatrix} \cos\theta & i\sin\theta \\ -i\sin\theta & -\cos\theta \end{pmatrix} = \frac{1}{2}\cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \frac{1}{2}\sin\theta \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}$$
which is the usual form for a spin-1/2 matrix in the direction $\theta$.
We can extend this analysis to higher order spins [3]. Moreover, this framework gives a realistic model for the Bohm version of the EPR problem [4]. Bell's theorem is not contradicted, because Bell's inequalities are derived using classical probability theory and we have employed quantum probability theory.
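The wave functions of the spin model can be checked numerically as well. The sketch below is illustrative and not taken from the paper: it assumes the closed forms obtained by evaluating (12) for the two actions S and S', and verifies that the resulting vectors are unit vectors, that they are orthogonal, and that the expectation of X computed through the diagonal operator of (18) equals plus or minus (1/2) cos(theta). All function names are hypothetical.

```python
import cmath
import math


def wave_function_S(theta):
    """(f(-1/2), f(+1/2)) for S(w) = w with fibre measures mu/2 (j_z = +1/2)."""
    e = cmath.exp(1j * theta)
    return (0.5j * (1 - e), 0.5j * (1 + e))


def wave_function_S_prime(theta):
    """(f(-1/2), f(+1/2)) for the action S' with point masses at 0 and pi (j_z = -1/2)."""
    e = cmath.exp(1j * theta)
    return (-0.5j * (1 + e), -0.5j * (1 - e))


def inner(u, v):
    """Inner product on C^2, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def expectation_of_X(f):
    """<X f, f> where X acts diagonally: (X u)(x) = x u(x), x in {-1/2, +1/2}."""
    outcomes = (-0.5, 0.5)
    return sum(x * abs(a) ** 2 for x, a in zip(outcomes, f))
```

With theta = 0 the two vectors reduce, up to phase, to (0, 1) and (1, 0), in line with the remark on the z-direction eigenvectors.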
5 Traditional Quantum Mechanics
We now show that this formalism contains traditional nonrelativistic quantum mechanics. For simplicity, we consider a single spinless particle in one dimension, although this work easily generalizes to three dimensions. We take our sample space to be the phase space
$$\Omega = \mathbb{R}^2 = \{(q,p) : q, p \in \mathbb{R}\}$$
The two most important measurements are the position and momentum, given by $Q(q,p) = q$ and $P(q,p) = p$, respectively. However, as is frequently done in quantum mechanics, we shall investigate the q-representation of the system. In this case Q is considered a measurement and $P\colon \Omega \to \mathbb{R}$ is viewed as a function on $\Omega$ which, as we shall show, is a Q-observable.
Each Q-fiber $Q^{-1}(q) = \{(q,p) : p \in \mathbb{R}\}$ can be identified with $\mathbb{R}$. We make Q a measurement by endowing its range $R(Q) = \mathbb{R}$ with Lebesgue measure and its fibers with the usual Borel structure of $\mathbb{R}$. Only certain Q-actions $(S, \mu_Q^q, q \in \mathbb{R})$ correspond to traditional quantum states, and these can be derived from natural postulates. We assume that $\mu_Q^q$ is absolutely continuous relative to Lebesgue measure on $\mathbb{R}$ and that $\mu_Q^q$ is independent of q. This is because sets of Lebesgue measure zero are too small to have any effect on the outcomes of position measurements, and there is no a priori reason to distinguish between Q-fibers. It follows from the Radon-Nikodym theorem that there exists a nonnegative Lebesgue measurable function $\xi\colon \mathbb{R} \to \mathbb{R}$ such that
$$\mu_Q^q(dp) = (2\pi\hbar)^{-1/2}\,\xi(p)\,dp \tag{24}$$
We take $S\colon \Omega \to \mathbb{R}$ to have the form
$$S(q,p) = \frac{qp}{\hbar} + \eta(p) \tag{25}$$
This form is natural because $qp$ is the classical action, and adding a function of momentum gives a quantum fluctuation. We could also add a function of q, but it is easy to see that this would just multiply the wave function by a phase that is constant on each fiber, which would not alter the probabilistic formulas. Denote by $\mathcal{A}_Q$ the set of Q-actions that have the form (24), (25).
Applying (12) for $S \in \mathcal{A}_Q$, we find that the (Q, S)-wave function becomes
$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \xi(p)\,e^{i\eta(p)}\,e^{iqp/\hbar}\,dp$$
Defining
$$\phi(p) = \xi(p)\,e^{i\eta(p)} \tag{26}$$
and denoting the inverse Fourier transform by $\vee$, we have
$$f_{Q,S}(q) = (2\pi\hbar)^{-1/2} \int \phi(p)\,e^{iqp/\hbar}\,dp = \phi^{\vee}(q) \tag{27}$$
In order for (3) to be satisfied, $\phi^{\vee}$ must be a unit vector in $L^2(\mathbb{R}, dq)$ or, equivalently, $\phi$ must be a unit vector in $L^2(\mathbb{R}, dp)$. However, every vector in $L^2(\mathbb{R}, dp)$ has the form (26) for some functions $\xi\colon \mathbb{R} \to \mathbb{R}^+$, $\eta\colon \mathbb{R} \to \mathbb{R}$. It follows that the Q-Hilbert space becomes the traditional Hilbert space $H_Q = L^2(\mathbb{R}, dq)$ and $f_{Q,S}$ is the usual wave function (or state).
Let $(S, \mu_Q^q, q \in \mathbb{R})$ be a fixed Q-action in $\mathcal{A}_Q$ of the form (24), (25), and let $\psi(q) = f_{Q,S}(q)$, $\phi(p) = \xi(p)e^{i\eta(p)}$. Applying (16) and (27) we have
$$\begin{aligned}
f_{Q,S}(P)(q) &= (2\pi\hbar)^{-1/2} \int p\,\phi(p)\,e^{iqp/\hbar}\,dp \\
&= -i\hbar \frac{d}{dq}\,(2\pi\hbar)^{-1/2} \int \phi(p)\,e^{iqp/\hbar}\,dp = -i\hbar \frac{d\psi}{dq}(q)
\end{aligned}$$
More generally, if n is a positive integer we obtain
$$f_{Q,S}(P^n)(q) = \left(-i\hbar \frac{d}{dq}\right)^n \psi(q) \tag{28}$$
Moreover, applying (18) we have
$$E_{Q,S}(P^n) = \int \left[\left(-i\hbar \frac{d}{dq}\right)^n \psi(q)\right] \overline{\psi(q)}\,dq$$
which is the usual quantum expectation formula. We conclude from (28) that $P^n$ is a Q-observable and is represented by the operator $(-i\hbar\,d/dq)^n$. Moreover, if $V\colon \mathbb{R} \to \mathbb{R}$ is measurable, we see from (19) that $V(Q)$ is a Q-observable and is represented by the operator $V(Q)^{\wedge}u(q) = V(q)u(q)$. This, together with our observation concerning P, gives a derivation of the Bohr correspondence principle.
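The case n = 1 of (28) can be illustrated numerically. The following is a minimal sketch under stated assumptions, none of which come from the paper: units with hbar = 1, a Gaussian momentum amplitude phi centred at a hypothetical value p0, midpoint quadrature for the p-integrals, and a central finite difference for d/dq.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def phi(p, p0=0.7):
    """Illustrative unit-norm Gaussian momentum amplitude centred at p0."""
    return math.pi ** -0.25 * math.exp(-0.5 * (p - p0) ** 2)


def amplitude_average(weight, q, p_min=-12.0, p_max=12.0, n=4000):
    """(2 pi hbar)^(-1/2) * int weight(p) phi(p) exp(i q p / hbar) dp, midpoint rule."""
    dp = (p_max - p_min) / n
    total = 0j
    for k in range(n):
        p = p_min + (k + 0.5) * dp
        total += weight(p) * phi(p) * cmath.exp(1j * q * p / HBAR)
    return total * dp / math.sqrt(2.0 * math.pi * HBAR)


def psi(q):
    """Wave function: the amplitude average of the constant function 1, as in (27)."""
    return amplitude_average(lambda p: 1.0, q)


def momentum_average(q):
    """Amplitude average of P at q: the left side of (28) with n = 1."""
    return amplitude_average(lambda p: p, q)


def minus_i_hbar_derivative(q, h=1e-4):
    """Central-difference approximation of -i hbar dpsi/dq."""
    return -1j * HBAR * (psi(q + h) - psi(q - h)) / (2.0 * h)


def max_discrepancy(points=(-1.5, 0.0, 0.4, 2.0)):
    """Largest gap between the two sides of (28), n = 1, over sample positions."""
    return max(abs(momentum_average(q) - minus_i_hbar_derivative(q)) for q in points)
```

The discrepancy returned by `max_discrepancy()` is limited only by the finite-difference step, which is the numerical counterpart of differentiating under the integral sign.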
We now consider probability distributions. We have already seen in (15) that
$$P_{Q,S}(B) = \int_B |\psi(q)|^2\,dq$$
which is the usual distribution of Q. It is more interesting to compute the probability of $A = P^{-1}(B)$ for the momentum function P. We have from (21) that
$$f_{Q,S}(\chi_A)(q) = (2\pi\hbar)^{-1/2} \int_B \phi(p)\,e^{iqp/\hbar}\,dp = (\chi_B \phi)^{\vee}(q)$$
Hence by (20) and the Plancherel formula we obtain
$$\begin{aligned}
P_{Q,S}\left[P^{-1}(B)\right] &= \mathrm{Re} \int (\chi_B \phi)^{\vee}(q)\,\overline{\psi(q)}\,dq \\
&= \mathrm{Re} \int (\chi_B \phi)(p)\,\overline{\phi(p)}\,dp = \int_B |\phi(p)|^2\,dp
\end{aligned}$$
Again, this is the usual momentum distribution. This gives an example in which $P_{Q,S}$ is an actual probability measure on a $\sigma$-algebra of subsets of $\Omega$.
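The momentum distribution above can be reproduced numerically from formula (20) without invoking the Plancherel formula. The sketch below is illustrative only and rests on assumptions that are not in the paper: hbar = 1, a Gaussian amplitude phi, the interval B = [b_lo, b_hi], and midpoint quadrature with hypothetical grid sizes.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def phi(p):
    """Illustrative unit-norm Gaussian momentum amplitude."""
    return math.pi ** -0.25 * math.exp(-0.5 * p * p)


def inverse_transform(q, p_lo, p_hi, n):
    """(2 pi hbar)^(-1/2) * int_{p_lo}^{p_hi} phi(p) exp(i q p / hbar) dp, midpoint rule."""
    dp = (p_hi - p_lo) / n
    total = 0j
    for k in range(n):
        p = p_lo + (k + 0.5) * dp
        total += phi(p) * cmath.exp(1j * q * p / HBAR)
    return total * dp / math.sqrt(2.0 * math.pi * HBAR)


def momentum_pseudoprobability(b_lo, b_hi, q_max=8.0, nq=400):
    """Re <(chi_B phi)^v, psi> evaluated as an integral over q, following (20)."""
    dq = 2.0 * q_max / nq
    total = 0.0
    for j in range(nq):
        q = -q_max + (j + 0.5) * dq
        restricted = inverse_transform(q, b_lo, b_hi, 200)  # (chi_B phi)^v (q)
        psi_q = inverse_transform(q, -9.0, 9.0, 300)        # psi(q) = phi^v (q)
        total += (restricted * psi_q.conjugate()).real * dq
    return total
```

For B = [0, 1] the result should agree with the integral of |phi|^2 over B, which for this Gaussian equals erf(1)/2.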
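Until now we have treated time as fixed. We now briefly consider dynamics. Let $\psi(q,t)$ be a smooth function. Our previous formulas hold with $\psi(q)$ replaced by $\psi(q,t)$ and $\mu_Q^q$ replaced by $\mu_{Q,t}^q$. We now derive Schrödinger's equation from Hamilton's equation of classical mechanics, $dp/dt = -\partial H/\partial q$. Suppose the energy function has the form
$$H(q,p) = \frac{p^2}{2m} + V(q)$$
We assume that Hamilton's equation holds in the amplitude average. Applying (16) we have
$$\frac{d}{dt} \int p\,f_S(q,p,t)\,\mu_{Q,t}^q(dp) = -\frac{\partial}{\partial q} \int H(q,p)\,f_S(q,p,t)\,\mu_{Q,t}^q(dp)$$
Hence
$$\frac{\partial}{\partial t} \int p\,\phi(p,t)\,e^{iqp/\hbar}\,dp = -\frac{\partial}{\partial q} \int H(q,p)\,\phi(p,t)\,e^{iqp/\hbar}\,dp$$
Applying (28) and (19) gives
$$i\hbar \frac{\partial}{\partial t}\frac{\partial \psi}{\partial q} = \frac{\partial}{\partial q}\left[-\frac{\hbar^2}{2m}\frac{\partial^2 \psi}{\partial q^2} + V(q)\psi\right]$$
Interchanging the order of differentiation on the left side of this equation and integrating with respect to q gives Schrödinger's equation, $i\hbar\,\partial\psi/\partial t = -(\hbar^2/2m)\,\partial^2\psi/\partial q^2 + V(q)\psi$.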
6 Concluding Remarks
In this paper we have presented a realistic, contextual, nonlocal approach to quantum probability theory. The formalism is realistic because each sample point $\omega \in \Omega$ uniquely determines a value $X(\omega)$ for any measurement X. In this way, a physical system $\mathcal{S}$ possesses all of its attributes independent of whether they are measured. Although the sample space $\Omega$ exists and we can discuss its properties, $\Omega$ is not physically accessible in general. This is because the sample points may not correspond to physical states which can be prepared in the laboratory or at least exist in nature. We may think of $\Omega$ as a hidden variable completion of quantum mechanics. This approach is contextual because it is necessary to specify a particular basic measurement X. Once X is specified, a Hilbert space $H_X$ can be constructed and $H_X$ provides an X-representation for $\mathcal{S}$. Of course, one may choose a different basic measurement Y, and then the Y-representation will give a different picture of $\mathcal{S}$. For example, in traditional quantum mechanics we usually choose the position representation or the momentum representation to describe $\mathcal{S}$. For a given basic measurement X and an action S, we have given a method for constructing the probability distribution $P_{X,S}$ of X. We have shown that $P_{X,S}$ may be found in terms of a state vector $f_{X,S} \in H_X$, and these correspond to physically accessible states. In $H_X$ the measurement X and functions of X are diagonal and hence represented by random variables. Other measurements, which we call observables to distinguish them from X, are represented by self-adjoint operators on $H_X$, and their usual distributions follow in a natural way. The theory is nonlocal because the distribution $P_{X,S}$ is specified by an influence function $F_S(\omega,\omega')$. This function provides an influence between pairs of sample points which, in a spacetime model, may be spacelike separated.
There is considerable controversy concerning various interpretations and approaches to probability theory. I believe that three types of probabilities are necessary for a description of quantum mechanics. The probabilities and distributions of measurement results in the laboratory are usually computed using long run relative frequencies. Even though a measurement X may involve a microscopic system $\mathcal{S}$ (for example, the position of an electron), $\mathcal{S}$ must interact with a macroscopic apparatus in order to obtain an observable outcome. The theoretician's task is to find the distribution $P_X$ of X. This theoretical distribution should agree with the long run relative frequencies found in the laboratory or give predictions that can eventually be tested experimentally. Since there are serious, well-known difficulties in dealing with abstract theories of relative frequencies, it is convenient and perhaps even necessary to use the standard Kolmogorovian probability theory for describing $P_X$. Now $P_X$ is a probability measure that satisfies the axioms of standard probability theory. However, the method for computing $P_X$ is characteristic of quantum mechanics and is not found in any classical theory. Richard Feynman, whose work has motivated the present paper, once said that nobody really understands quantum mechanics. I think that what he meant is that nobody understands why nature has chosen to compute probabilities in this unusual way. As presented here, the probability density for $P_X$ is found by employing an influence function. The advantage of this method is that it is physically motivated and avoids complex numbers. An equivalent method, which is usually employed in quantum mechanics, is to take the absolute value squared of the wave function.
The quantum probability approach that we have presented contains standard probability theory as a special case. Thus, we only need two types of probabilities to describe quantum mechanics. Standard probability theory as developed by Kolmogorov is a distillation of hundreds of years of experience with empirical and theoretical studies of chance phenomena. The founders of the subject were concerned with games of chance, statistics and the behavior of macroscopic objects. They were not aware of microscopic objects and quantum mechanics, and had no reason to design a probability theory for describing such situations. It is therefore not surprising that a new theory, called quantum probability theory, had to be developed to serve these purposes.
References
1. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
2. S. Gudder, Int. J. Theor. Phys. 32, 1747 (1993).
3. S. Gudder, Int. J. Theor. Phys. 32, 824 (1993).
4. S. Gudder, Quantum probability and the EPR argument, Ann. Found. Louis De Broglie 20, 167 (1994).
5. G. Hemion, Int. J. Theor. Phys. 29, 1335 (1990).
INNOVATION APPROACH TO STOCHASTIC PROCESSES AND QUANTUM DYNAMICS
TAKEYUKI HIDA Department of Mathematics
Meijo University TenpakuNagoya 468-8502
and Nagoya University (Professor Emeritus)
The theory of stochastic processes developed extensively in the twentieth century, and a beautiful connection with quantum dynamics has been established. It seems to be a good time to revisit the foundations of stochastic processes and quantum mechanics, in the hope that the attempt will suggest further directions for these two closely related disciplines. For this purpose we review some topics in white noise analysis, and observe the motivations from physics and how they have actually been realized.
1 Introduction
We shall discuss the analysis of random complex systems and its connection with quantum dynamics. In particular, we analyse some stochastic processes X(t) and random fields X(C) by using the innovation, and revisit quantum dynamics in connection with stochastic analysis. Our aim is to study those random complex systems, including quantum fields, by using white noise analysis.
The basic idea of our analysis is that we first discuss stochastic processes by taking a basic and standard system of random variables, and then express the given process as a function of that system. The system of variables from which we start is called idealized elemental random variables (abbr. i.e.r.v.). The idea of taking such a system is in line with
Reductionism
One might think that this thought is similar to reductionism in physics. Before we come to this point, it is interesting to refer to the lecture given by P. W. Anderson at the University of Tokyo in 1999. His title included Emergence together with reductionism, and he gave a good interpretation.
Following the reductionism, the next step is to form a function of the i.e.r.v.'s so that the function represents the given random complex system. It is nothing but
Synthesis
Then naturally follows the analysis of the functions that have been formed in our setup. Thus the goal is the analysis of the function (which may be called a functional) in order to identify the random complex system in question.
The first step, taking a suitable system of i.e.r.v.'s, has been influenced by how one understands the notion of a stochastic process. We therefore give a quick review of the definition of a stochastic process, starting from the idea of J. Bernoulli (Ars Conjectandi, 1713), S. Bernstein (1933) and P. Lévy's definition of a stochastic process (1947), where we are led to consider the innovation of a stochastic process. It is viewed as a system of i.e.r.v.'s, which will be specified to be a white noise.
The analysis of white noise functionals has many significant characteristics that are well suited to the investigation of quantum mechanical phenomena. Thus we shall be able to show examples to which white noise theory is efficiently applied.
Thanks to great contributions by many authors, the theory developed along our line has reached its present state.
AMS 2000 Mathematics Subject Classification: 60H40 White Noise Theory
2 Review of defining a stochastic process and white noise analysis
There is a traditional, and in fact original, way of defining a stochastic process. Let us refer to Lévy's definition of a stochastic process given in his book [3], Chapt. II: "une fonction aléatoire X(t) du temps t dans lequel le hasard intervient à chaque instant" (a random function X(t) of time t in which chance intervenes at each instant). The hasard is expressed as an infinitesimal random variable Y(t), which is independent of the observed values of X(s), $s \le t$, in the past. The random variable Y(t) is nothing but the innovation of the process X(t).
Formally speaking, Y(t), which is usually an infinitesimal random variable, contains the information gained by X(t) during the time interval [t, t + dt). To express this idea, P. Lévy proposed a formula, called an infinitesimal equation, for the variation $\delta X(t)$:
$$\delta X(t) = \Phi(X(s), s \le t, Y(t), t, dt)$$
where $\Phi$ is a non-random functional. Although this equation has only a formal significance, it is still highly suggestive.
It would be fine if the given process were expressed as a functional of Y(t) in the following manner:
$$X(t) = \Psi(Y(s), s \le t, t)$$
where $\Psi$ is a sure (non-random) function. Such a trick may be called the Reduction and Synthesis method. The above expression is causal in the sense that X(t) is expressed as a function of Y(s), $s \le t$, and never uses Y(s) with $s > t$.
Note that this method of defining a stochastic process is more important than the definition by a distribution on a function space.
The collection {Y(s)} is a system of i.e.r.v.'s, so that the above expression is a realization of the synthesis. We are particularly interested in the case where the system of i.e.r.v.'s is taken to be a white noise, and we are thus ready to discuss white noise analysis.
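A discrete-time analogue may help to fix the ideas of innovation, reduction and synthesis. The sketch below is an illustrative assumption and not Lévy's or the author's construction: it takes a first-order autoregression X_n = a X_{n-1} + Y_n as a stand-in for the infinitesimal equation, writes X_n as a non-random function of the innovations Y_k with k <= n (synthesis), and recovers the innovations from the observed path (reduction). The coefficient `a` and all function names are hypothetical.

```python
def recursive_process(innovations, a):
    """Discrete stand-in for the infinitesimal equation: X_n = a * X_{n-1} + Y_n."""
    path, x = [], 0.0
    for y in innovations:
        x = a * x + y
        path.append(x)
    return path


def causal_representation(innovations, a):
    """Synthesis: X_n = sum over k <= n of a**(n - k) * Y_k (uses no future Y_k)."""
    return [
        sum(a ** (n - k) * innovations[k] for k in range(n + 1))
        for n in range(len(innovations))
    ]


def recover_innovations(path, a):
    """Reduction: Y_n = X_n - a * X_{n-1}, computed from the observed path alone."""
    result, previous = [], 0.0
    for x in path:
        result.append(x - a * previous)
        previous = x
    return result
```

Both directions are causal: the value at step n depends only on quantities indexed up to n, which mirrors the requirement that X(t) never uses Y(s) with s > t.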
So far we have discussed the theory only for a stochastic process. It is in fact quite natural to extend the theory to a random field X(C) indexed by an ovaloid, say a contour or a closed surface. A generalization of the infinitesimal equation is
$$\delta X(C) = \Phi(X(C'), C' < C, Y(s), s \in C, C, \delta C)$$
The system $\{Y(s), s \in C\}$ is the innovation. Such a generalization can be done because of the use of the innovation.
We note that white noise analysis has many advantages, which are briefly listed below.
1) It is an infinite dimensional analysis. Our stochastic analysis can be done systematically by taking a white noise as a system of i.e.r.v.'s to express the given random complex systems. Indeed, the analysis is essentially infinite dimensional, as will be seen in what follows.
2) Infinite dimensional harmonic analysis. The white noise measure, supported by the space $E^*$ of generalized functions on the parameter space $\mathbb{R}^d$, is invariant under the rotations of E. Hence a harmonic analysis arising from the rotation group will naturally be discussed. The group contains significant subgroups which describe essentially infinite dimensional characters.
3) Generalizations to random fields X(C) are discussed in a similar manner to X(t), so far as the innovation is concerned. Needless to say, X(C) enjoys more profound characteristic properties.
4) Connection with classical functional analysis. The so-called S-transform applied to white noise functionals provides a bridge connecting white noise functionals and classical functionals of ordinary functions. We can therefore appeal to the classical theory of functionals established in the first half of the twentieth century.
5) Good connection with quantum dynamics, as will be seen in the next section.
For the differential and integral calculus of white noise functionals using the annihilation operator $\partial_t$ and the creation operator $\partial_t^*$, the class of generalized functionals, harmonic analysis including Fourier analysis, the Lévy Laplacian $\Delta_L$, complexification and other theories, we refer to the monograph [12] and other literature.
3 Relations to Quantum Dynamics
We now explain briefly some topics in quantum dynamics to which white noise theory can be applied. What we are going to present here may seem to be separate topics, but behind the description there is always a white noise.
1) Representation of the canonical commutation relations for a Boson field. This topic is well known.
Let $\dot{B}(t)$ be a white noise and let $\partial_t$ denote the $\dot{B}(t)$-derivative. Then it is an annihilation operator, and its dual operator $\partial_t^*$ stands for the creation. They satisfy the commutation relations
$$[\partial_t, \partial_s] = [\partial_t^*, \partial_s^*] = 0$$
$$[\partial_t, \partial_s^*] = \delta(t - s)$$
From these, a representation of the canonical commutation relations is given for Bosonic particles.
It is noted that the following assertion holds.
Proposition. There are continuously many irreducible representations of the canonical commutation relations.
White noises with different variances are inequivalent to each other, which proves the assertion.
2) Reflection positivity (T-positivity)
A stationary multiple Markov (say N-ple Markov) Gaussian process has a spectral density function $f(\lambda)$ of a particular type. Namely
On the other hand it is proved that
Proposition. The covariance function $\gamma(h)$ of a stationary T-positive Gaussian process is expressed in the form
$$\gamma(h) = \int_0^\infty \exp[-|h|x]\,d\nu(x)$$
where $\nu$ is a positive finite measure.
By applying this assertion to the N-ple Markov Gaussian process, we claim that T-positivity requires $c_k > 0$ for every k.
Note that in the strictly N-ple Markov case this condition is not satisfied.
It is our hope that this result will be generalized to general stochastic processes with multiple Markov properties.
3) A path integral formulation
One of the realizations of the Dirac-Feynman idea of the path integral may be given by the following method, using generalized white noise functionals. First we establish a class of possible trajectories when a Lagrangian $L(x, \dot{x})$ is given. Let x be the classical trajectory determined by the Lagrangian. As soon as we come to quantum dynamics, we have to consider fluctuating paths y. We propose that they are given by
$$y(s) = x(s) + \sqrt{\frac{\hbar}{m}}\,B(s)$$
The average over the paths is replaced with the expectation with respect to the probability measure for which the Brownian motion B(t) is defined. Thus the propagator $G(y_1, y_2, t)$ is given by
$$E\left\{ N \exp\left[\frac{i}{\hbar}\int_0^t L(y, \dot{y})\,ds + \frac{1}{2}\int_0^t \dot{B}(s)^2\,ds\right] \cdot \delta(y(t) - y_2) \right\}$$
With this setup, actual computations have been done to get exact formulae for the propagators (L. Streit et al.).
4) Dirichlet forms in infinite dimensions. With the help of positive generalized white noise functionals, we prove criteria for closability of energy forms. See [3].
5) Random fields X(C)
A random field X(C) depending on a parameter C, which is taken to be a certain smooth and closed manifold in a Euclidean space, naturally enjoys a more complex probabilistic structure than a stochastic process X(t) depending on the time t. It therefore has good connections with quantum fields in physics.
We are particularly interested in the case where X(C) has a causal representation in terms of white noise. Some typical examples are listed below.
5.1) Markov property and multiple Markov properties. Dirac's paper [1] suggests how to define the Markov property. For the Gaussian case a reasonable definition has been given (see [15]) by using the canonical representation in terms of white noise, where the canonical property of a representation can be introduced as a generalization of that for a Gaussian process. Some attempts have been made for some non-Gaussian fields (see [17]). For the Gaussian case, multiple Markov properties have been defined. It is now an interesting question to find conditions under which a Gaussian random field satisfies a multiple Markov property.
5.2) Stochastic variational equations of Langevin type. Let C run through a class **C** of concentric circles. The problem is to solve the following stochastic variational equation of Langevin type:
$$\delta X(C) = -\lambda X(C) \int_C \delta n(s)\,ds + X_0 \int_C v(s)\,\partial_s^*\,\delta n(s)\,ds$$
The explicit solution is given by using the S-transform and the classical theory of functionals.
5.3) We have made an attempt to define a random field X(C), C ∈ **C**, which satisfies conformal invariance. Reversibility can also be discussed.
Example. Linear parameter case: a Brownian bridge. For $t \in [0,1]$ it is defined by
$$X(t) = (1 - t) \int_0^t \frac{1}{1 - u}\,\dot{B}(u)\,du$$
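As a plausibility check of this representation, the variance and covariance it implies can be computed deterministically. The sketch below is not from the paper; it assumes the Itô isometry, which gives Cov(X(s), X(t)) = (1 - s)(1 - t) times the integral of (1 - u)^(-2) over [0, min(s, t)], and compares the midpoint-rule value with the familiar Brownian bridge covariance min(s, t)(1 - max(s, t)). Function names and the grid size are illustrative.

```python
def kernel_integral(upper, n=2000):
    """Midpoint-rule value of int_0^upper (1 - u)^(-2) du, for 0 <= upper < 1."""
    if upper <= 0.0:
        return 0.0
    h = upper / n
    return sum(h / (1.0 - (k + 0.5) * h) ** 2 for k in range(n))


def bridge_covariance(s, t, n=2000):
    """Cov(X(s), X(t)) implied by the white noise representation (Ito isometry assumed)."""
    return (1.0 - s) * (1.0 - t) * kernel_integral(min(s, t), n)


def bridge_variance(t, n=2000):
    """Var X(t) implied by the representation; expected to be close to t * (1 - t)."""
    return bridge_covariance(t, t, n)
```

The variance vanishes at t = 0 and tends to 0 as t approaches 1, as it should for a process pinned at both ends of the unit interval.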
Reversibility can be guaranteed not only by the time reflection but also by whiskers (one-parameter subgroups defined by deformation of the parameter) in the conformal group that leaves the unit time interval invariant.
We now come to the case of a random field. Let **C** be the class of concentric circles. Assume $0 < r_0 < r < r_1$. Denote by $C_r$ the circle with radius r. Then we define
(ft) - yfi^^bw w^w^
This is a canonical representation. To show reversibility, we apply the inversion with respect to the circle with radius $\sqrt{r_0 r_1}$.
We claim that it is possible to have a generalization to the case where **C** is taken to be a class of curves obtained by a conformal mapping of concentric circles.
Remark 1. It is noted that the white noise x(t) is regarded as a representation of the parameter t, so that the propagation of randomness (fluctuation) is expressed in terms of x(t) instead of the time t itself. Namely, the way in which random complex phenomena develop, in particular reversibility, has an explicit description in terms of white noise, as is seen in the above example.
Remark 2. See the papers [1] by Dirac and [13] by Polyakov for suggestions on a generalization of the path integral.
4 Addenda to foundations of the theories Concluding remarks
Before the concluding remarks are given, we should like to add some facts as addenda to §1 regarding the foundations of probability theory.
From the brief history mentioned in §1, we understand the reason why a white noise, that is, a system of i.e.r.v.'s, is introduced. It is a generalized stochastic process, so that we need some additional consideration when reasonable functionals, in general nonlinear functionals, of white noise are introduced. In physics we have met interesting cases where those nonlinear functionals of white noise are requested: canonical commutation relations for quantum fields, where the degree of freedom is continuously infinite; Feynman's path integrals, as discussed in 3) of the last section; and variational equations for a random field. On the other hand, we were lucky when a class of generalized white noise functionals was introduced in 1975, since the theory of generalized functions had been established and some attempts had been made to apply it to the theory of generalized stochastic processes. To obtain further fruitful results, we have been given a powerful method to study random fields indexed by a manifold. It is the so-called innovation approach, where our reductionism is not affected by the higher dimensionality of the parameter space. With these in mind, we can come to the concluding remarks.
As concluding remarks, some proposed future directions are now in order.
1. One is concerned with good applications of the Lévy Laplacian. Its significance is that it is an operator that is essentially infinite dimensional.
2. A two-dimensional Brownian path is considered to have some optimality in occupying the territory. This property should be reflected in forming a model of physical phenomena.
3. A systematic approach to the invariance of random fields under transformation groups will be discussed.
4. Stochastic variational calculus for random fields. With the classical results on variational calculus, we can proceed further with white noise analysis.
Acknowledgements. The author is grateful to Professor A. Khrennikov, who invited him to give a talk at this conference. Thanks are due to the Academic Frontier Project at Meijo University for the support of this work.
References
1. P. A. M. Dirac, The Lagrangian in quantum mechanics, Phys. Z. Soviet Union 3, 64-72 (1933).
2. S. Tomonaga, On a relativistically invariant formulation of the quantum theory of wave fields, Prog. Theor. Phys. 1, 27-42 (1946).
3. P. Lévy, Processus stochastiques et mouvement brownien (Gauthier-Villars, 1948; 2nd ed. 1965).
4. P. Lévy, Nouvelle notice sur les travaux scientifiques de M. Paul Lévy, Janvier 1964, Part III, Processus stochastiques (unpublished manuscript).
5. T. Hida, Canonical representations of Gaussian processes and their applications, Mem. College of Science, Univ. of Kyoto A 33, 109-155 (1960).
6. T. Hida, Stationary stochastic processes (Princeton Univ. Press, 1970).
7. T. Hida, Brownian motion (Iwanami Pub. Co., 1975; English ed. Springer-Verlag, 1980).
8. T. Hida, Analysis of Brownian functionals, Carleton Math. Lecture Notes 13 (1975).
9. T. Hida, Innovation approach to random complex systems, Pub. Volterra Center 433 (2000).
10. T. Hida and L. Streit, On quantum theory in terms of white noise, Nagoya Math. J. 68, 21-34 (1977).
11. T. Hida, J. Potthoff and L. Streit, Dirichlet forms and white noise analysis, Commun. Math. Phys. 116, 235-245 (1988).
12. T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White noise: an infinite dimensional calculus (Kluwer Academic Pub., 1993).
13. A. M. Polyakov, Quantum geometry of Bosonic strings, Phys. Lett. 103B, 207-210 (1981).
14. J. Schwinger, Brownian motion of a quantum oscillator, J. of Math. Phys. 2, 407-432 (1961).
15. Si Si, Gaussian processes and Gaussian random fields, Quantum Information II (World Scientific Pub. Co., 2000).
16. L. Streit and T. Hida, Generalized Brownian functionals and the Feynman integral, Stoch. Processes Appl. 16, 55-69 (1983).
17. L. Accardi and Si Si, Innovation approach to multiple Markov properties of some non Gaussian random fields, to appear.
STATISTICS AND ERGODICITY OF WAVE FUNCTIONS IN CHAOTIC OPEN SYSTEMS
H. ISHIO
Department of Physics and Measurement Technology, Linköping University, S-581 83 Linköping, Sweden
E-mail: hiris@ifm.liu.se
and
Division of Natural Science, Osaka Kyoiku University, Kashiwara, Osaka 582-8582, Japan
E-mail: ishio@cc.osaka-kyoiku.ac.jp
In general, quantum chaotic systems are considered to be described in the context of random matrix theory, i.e., by random Gaussian variables (real or complex) in an appropriate universality class. In reality, however, quantum states inside a chaotic open system are not given by a statistically homogeneous random state. We show some numerical evidence of such statistical inhomogeneity for ballistic transport through two-dimensional chaotic open billiards and discuss its relation to the corresponding classical dynamics.
1 Introduction
The quantum-mechanical signature of classical chaos is called quantum chaos. A rigorous definition of chaotic systems in quantum theory has been given very recently for Kolmogorov (K-) and Anosov (C-) systems, by analogy with the corresponding classical properties [1]. In such systems quantum ergodicity is naturally expected: eigenfunctions are equidistributed in their representation space, and all expectation values of quantum observables coincide with the mean values of the corresponding classical observables. It was first noted that a sufficient condition for quantum ergodicity to hold is the ergodicity of the corresponding classical dynamics [2]. More recently, the statement was proved in the case of quantum billiards [3,4]. Nowadays, quantum ergodicity is one of the few results in the field of quantum chaos for which mathematical proofs exist.
Quantum ergodicity, however, can be reached only in the semiclassical limit ($\hbar \to 0$). In experiments or numerical simulations for chaotic systems, we often see nonuniversal quantum features far from ergodicity, even in a high (but finite) energy region. In the present work, we show some numerical evidence of such statistical inhomogeneity for chaotic open systems. In Sec. 2, we introduce a model of ballistic transport through a chaotic open billiard and show some evidence of nonergodicity in the classical dynamics. We briefly discuss in Sec. 3 the general wave-statistical description of chaotic open systems by random matrix theory (RMT). In Sec. 4, we show numerical results of fully quantum calculations for the open billiard model and find that the idealized description by RMT does not apply in some cases, even in a high energy region. There we focus on the relation between the statistical deviations and wave localization corresponding to classical short paths. Section 5 consists of conclusions.
Figure 1: Typical single trajectory in the open stadium billiard.
2 Classical Nonergodicity and Short-Path Dynamics
We consider a two-dimensional (2D) billiard where the motion of noninteracting particles confined by Dirichlet boundaries is ballistic. The shape of the boundaries directly determines the nonlinearity of the particle dynamics inside the billiard. One of the prototypes of conservative chaotic systems is the Bunimovich stadium billiard. In the case of a closed stadium billiard, it is proved that the system has the K-property [5]. In the case of an open stadium billiard coupled to two narrow leads (see Fig. 1), nonintegrability is still expected; e.g., we can observe a fractal structure in the spectrum of dwell times inside the cavity region [6]. However, Monte Carlo simulation of the classical path-length ($\propto$ dwell time) distribution shows that the distribution function is not a simple exponential decay, which would be a signature of ergodicity, but a highly structured function owing to short-path dynamics [7].
Another example showing nonergodicity of the classical dynamics in the open stadium billiard is the transmission-reflection diagram of particles shown in Fig. 2. There, $y$ is the initial transverse position of each particle incoming from lead 1 (see Fig. 1) at the entrance of the stadium cavity, and $d$ denotes the common width of the attached leads. We apply the semiclassical quantization condition to the momentum of the incoming particles in the lead. The angle of incidence is quantized as $\theta_i = \pm\sin^{-1}[n\pi/(kd)]$ ($n = 1, 2, \ldots$), where we choose the positive and negative $\theta_i$ for the upper and lower directions of particle motion in Fig. 1, respectively, and $k$ is the Fermi wave number of the semiclassical particles. Over the whole range of the diagram we fix the quantized mode number at $n = 1$. Because of the semiclassical quantization condition, the magnitude of $\theta_i$ decreases monotonically as a function of $k$. The distributed black and white points correspond to transmission and reflection events, respectively. The relative measure of the black (white) portion for each $k$ is equal to the classical transmission (reflection) probability $T_{cl}(k)$ ($R_{cl}(k)$). In Fig. 2 we see a number of black and white windows in the chaotic sea. Each of them is associated with a family of short paths connecting lead 1 to lead 2 (for the black) or back to lead 1 (for the white). Such paths are stable in the event of transmission and reflection and are expected to make an important contribution, as a family, to the corresponding quantum transport.
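The quantization of the angle of incidence can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `incidence_angle` and the sample values of $kd/\pi$ are our own illustrative choices, not taken from the calculations behind Fig. 2.

```python
import math

def incidence_angle(kd_over_pi: float, n: int = 1) -> float:
    """Magnitude of the quantized angle of incidence, asin(n*pi/(k*d)).

    kd_over_pi is the dimensionless combination k*d/pi (illustrative input).
    """
    if kd_over_pi <= 0.0:
        raise ValueError("k*d/pi must be positive")
    ratio = n / kd_over_pi
    if ratio > 1.0:
        # Mode n does not propagate in the lead at this wave number.
        raise ValueError("mode n is not propagating for this k")
    return math.asin(ratio)

# With n fixed at 1, the angle shrinks as k grows.
angles = [incidence_angle(x) for x in (1.5, 2.0, 4.0)]
```

For a fixed mode number the list `angles` is strictly decreasing, which is the monotonic behaviour used to map each $k$ in the diagram to a single angle of incidence.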
3 Universal Description of Wave Function Statistics
We write the scaled local density as $\rho(\mathbf{r}) = V|\psi(\mathbf{r})|^2$, where $V$ is the volume of the system in which a single-particle wave function $\psi(\mathbf{r})$ is normalized in terms of the position $\mathbf{r}$. It is well known that the probability distribution of the local densities of a chaotic eigenfunction of a closed system is the Porter-Thomas (P-T) distribution [8]
$$P(\rho) = \frac{1}{\sqrt{2\pi\rho}}\exp(-\rho/2), \tag{1}$$
described by a Gaussian orthogonal ensemble (GOE) of random matrices, when time-reversal symmetry (TRS) is present, i.e., $\psi \in \mathbb{R}$. On the other hand, the distribution is an exponential [8,9]
$$P(\rho) = \exp(-\rho), \tag{2}$$
described by a Gaussian unitary ensemble (GUE) of random matrices, when TRS is broken in the closed system, i.e., $\psi \in \mathbb{C}$. The space-averaged spatial correlation of the local densities of a 2D chaotic wave function with wave number $k$ is also given by [9,10,11]
$$P_2(kr) = \langle\rho(\mathbf{r}_1)\rho(\mathbf{r}_2)\rangle = 1 + cJ_0^2(kr), \tag{3}$$
where $r = |\mathbf{r}_1 - \mathbf{r}_2|$ and $J_0(x)$ is the Bessel function of zeroth order. The parameter $c$ is chosen as $c = 2$ for GOE (TRS) and $c = 1$ for GUE (broken TRS) eigenfunctions.
Investigations of the continuous transition of the wave function statistics between the GOE and GUE symmetries have also been worked out. Introducing a transition parameter $b \in (1, 2]$, we have the probability distribution [12,13,14,15,16]
$$P_b(\rho) = \frac{b}{2\sqrt{b-1}}\exp\left(-\frac{b^2\rho}{4(b-1)}\right) I_0\left(\frac{b(2-b)\rho}{4(b-1)}\right), \tag{4}$$
where $I_0(x)$ is the modified Bessel function of zeroth order, and the spatial correlation [17]
$$P_{b,2}(kr) = 1 + \left[1 + \left(\frac{2-b}{b}\right)^2\right] J_0^2(kr). \tag{5}$$
For $b \to 1$ and $b \to 2$, both equations tend to the GOE and GUE cases, respectively.
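The two limits can be checked numerically. The sketch below assumes that the transition distribution has the form $P_b(\rho) = \frac{b}{2\sqrt{b-1}}\exp[-b^2\rho/(4(b-1))]\,I_0[b(2-b)\rho/(4(b-1))]$, and it evaluates $I_0$ through the standard integral representation $I_0(x) = \pi^{-1}\int_0^\pi e^{x\cos t}\,dt$. The function names and the step count are illustrative choices.

```python
import math

def p_transition(rho: float, b: float, steps: int = 20000) -> float:
    """Assumed transition distribution P_b(rho) for 1 < b <= 2."""
    a = b * b / (4.0 * (b - 1.0))          # decay rate of the exponential
    c = b * (2.0 - b) / (4.0 * (b - 1.0))  # coefficient inside I_0
    h = math.pi / steps
    # Midpoint rule; exp(-a*rho) and exp(c*rho*cos t) are merged into one
    # exponential so that nothing overflows when b is close to 1.
    total = sum(math.exp(-rho * (a - c * math.cos((j + 0.5) * h)))
                for j in range(steps))
    return b / (2.0 * math.sqrt(b - 1.0)) * total * h / math.pi

def p_goe(rho: float) -> float:
    """Porter-Thomas distribution, Eq. (1)."""
    return math.exp(-rho / 2.0) / math.sqrt(2.0 * math.pi * rho)

def p_gue(rho: float) -> float:
    """Exponential distribution, Eq. (2)."""
    return math.exp(-rho)
```

At $b = 2$ the Bessel factor is $I_0(0) = 1$ and the expression reduces to Eq. (2); for $b$ slightly above 1 the large-argument behaviour of $I_0$ brings it close to Eq. (1).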
On the other hand, systematic statistical investigations of scattering wave functions in open chaotic systems have been carried out only quite recently [16]. It is essential that the space reciprocity of conservative closed systems, which means that each plane wave is paired with a counterpart of the same amplitude running in the opposite direction in phase, is lost in open systems. As a result, the wave function statistics in a chaotic open system is expected to be the GUE if the system is completely open [16].
4 Numerical Analyses and Discussions
We show in this section some numerical evidence of wave statistical inhomogeneity for ballistic transport through the 2D open stadium billiard. Assuming steady current flow through the leads, we solve the time-independent Schrödinger equation for a single particle under Dirichlet boundary conditions, based on the plane-wave-expansion method [6], giving reflection and transmission amplitudes as well as local wave functions for each energy. In the calculation of the statistics, a sample space $A$ ($= V$) is taken in the cavity region corresponding to the closed stadium, and more than one million sample points are used to obtain reliable statistics. We show the numerical results for the wave probability density in Fig. 3 and for the probability distribution $P(\rho)$ and spatial correlation $P_2(kr)$ in Fig. 4.
In Fig. 3(a) we find the so-called bouncing-ball mode in the central region of the stadium cavity, where we see a number of vertical nodes associated with marginally stable classical orbits bouncing vertically between the straight edges. Bouncing-ball states are nonstatistical states, since the amplitude of $\psi$ is strongly localized in the middle region of the stadium (where the space reciprocity holds locally) and is very small in the endcaps (where the space reciprocity does not necessarily hold). As a result, both $P(\rho)$ and $P_2(kr)$ for such states do not follow their universal expressions (see Fig. 4(a)). In addition to the bouncing-ball mode, we also see another wave localization strongly coupled to both the initial and the (open) transmission channels, corresponding to the direct transmission path (see the white line depicted in Fig. 3(a)). Along such a localization, a plane wave may propagate with nonzero probability current, partially contributing to the anomaly of the wave statistics [16].
In the higher energy region, where the ratio of the system size $\sqrt{A}$ to the wavelength $\lambda$ is $\sqrt{A}/\lambda \approx 25$ (i.e., in the case of Fig. 3(b)), we may expect the GUE statistics. However, we see in Fig. 4(b) that both $P(\rho)$ and $P_2(kr)$ follow the GOE closely.
The reason is a localization effect reminiscent of the phenomenon known as a scar [18], which describes an anomalous localization of quantum probability density along unstable periodic orbits in classically chaotic systems. In order to characterize a localization, we usually introduce a moment of the eigenfunction local density $|\psi(\mathbf{r})|^2$, defined by $I_q = V^{-1}\int_V |\psi(\mathbf{r})|^{2q}\,d\mathbf{r}$, with $V$ being the system volume [19,20]. The second moment $I_2$ is known as the inverse participation ratio (IPR). Assuming the normalization condition $\langle|\psi|^2\rangle\ (= I_1) = 1$, we have $I_2 = 1$ for completely ergodic (random and uniform) eigenfunctions, while $I_2 = \infty$ for completely localized eigenfunctions like $|\psi(\mathbf{r})|^2 \sim V\delta(\mathbf{r})$. The localization effect on wave-function density statistics has been examined analytically in relation to $I_q$ for closed systems [21,22,23], and also numerically using a time-dependent approach, i.e., in terms of recurrences of a test Gaussian wave packet, for closed and weakly (imperfectly) open systems [24,25,26]. The latter works showed that the tail of the wave-function intensity distribution in phase space is dominated by scarring, departing from the RMT predictions.
In contrast, the most prominent effect of the localization of wave probability density in open billiards is the local space reciprocity holding along the classical orbits that correspond to the localization and are not strongly coupled to any (open) transmission channel (see, e.g., the white lines depicted in Fig. 3(b)). Along such orbits there is no net current, owing to the coherent overlap of time-reversed waves, so that both $P(\rho)$ and $P_2(kr)$ are close to the GOE predictions [16]. For a quantitative discussion, the value of the GOE-GUE transition parameter $b$ is calculated numerically from the wave function $\psi(\mathbf{r}) = u(\mathbf{r}) + iv(\mathbf{r})$ by the formula [16]
$$b = \frac{2\langle|\psi|^2\rangle}{\langle|\psi|^2\rangle + \sqrt{\langle|\psi|^2\rangle^2 - 4\left(\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2\right)}}, \tag{6}$$
where $\langle\cdots\rangle$ denotes a space average on $A$. The obtained value for Fig. 3(b) is $b = 1.03$, which corresponds to a case very close to the GOE.
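As an illustration of how the transition parameter could be estimated from sampled wave-function values, the sketch below assumes that the formula reads $b = 2\langle|\psi|^2\rangle / \{\langle|\psi|^2\rangle + [\langle|\psi|^2\rangle^2 - 4(\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2)]^{1/2}\}$ and replaces the space average by a plain mean over sample points. The function name and the two four-point test waves are hypothetical and serve only to exercise the limits.

```python
import math

def transition_parameter(psi):
    """Estimate b from complex samples psi = u + i*v using sample means."""
    n = len(psi)
    mean_u2 = sum(z.real ** 2 for z in psi) / n
    mean_v2 = sum(z.imag ** 2 for z in psi) / n
    mean_uv = sum(z.real * z.imag for z in psi) / n
    mean_abs2 = mean_u2 + mean_v2
    disc = mean_abs2 ** 2 - 4.0 * (mean_u2 * mean_v2 - mean_uv ** 2)
    # Guard against tiny negative values caused by rounding.
    return 2.0 * mean_abs2 / (mean_abs2 + math.sqrt(max(disc, 0.0)))

# Purely real samples (TRS-like case) and samples with equal real and
# imaginary weight (fully complex case).
real_wave = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
complex_wave = [1 + 0j, 1j, -1 + 0j, -1j]
```

A purely real wave gives $b = 1$ and a wave with equal, uncorrelated real and imaginary parts gives $b = 2$. The combination $\langle u^2\rangle\langle v^2\rangle - \langle uv\rangle^2$ is unchanged by a global phase rotation of $\psi$, so the estimate does not depend on the overall phase convention.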
In the case of open systems, the IPR may again play an important role as a measure of localization [27]. In the definition $I_2 = V^{-1}\int_V |\psi(\mathbf{r})|^4\,d\mathbf{r}$, $|\psi(\mathbf{r})|^2\ (= \rho(\mathbf{r}))$ is the scattering-wave local density and $V$ is the area ($A$) of the stadium cavity in our case. For chaotic wave functions normalized as $\langle|\psi|^2\rangle = 1$, we obtain from Eq. (4) the IPR $I_2^{(b)}$ for the transition between the GOE and GUE statistics as
$$I_2^{(b)} = \int_0^\infty \rho^2 P_b(\rho)\,d\rho = \frac{2}{\pi}\left(\frac{2\sqrt{b-1}}{b}\right)^5 \int_0^\pi \frac{d\theta}{\left[1 + \left(\frac{2}{b} - 1\right)\cos\theta\right]^3} = \frac{3b^2 - 4b + 4}{b^2}. \tag{7}$$
In the GOE and GUE limits, $I_2^{(b=1)} = 3$ and $I_2^{(b=2)} = 2$, respectively. For Fig. 3(b) the numerically obtained IPR is $I_2 = 2.89$, which is exactly equal to $I_2^{(b=1.03)}$. This means that the enhancement of the IPR by the amplitude of the localized wave is not strong in the case of Fig. 3(b), and that the effect of the localization appears mainly in the value of $b$, which also determines the IPR.
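The quoted numbers can be reproduced with a short check. The sketch below assumes the closed form $I_2^{(b)} = (3b^2 - 4b + 4)/b^2$ and, as a cross-check, an angular-integral form $\frac{2}{\pi}(2\sqrt{b-1}/b)^5\int_0^\pi d\theta\,[1 + (2/b - 1)\cos\theta]^{-3}$; both function names are illustrative.

```python
import math

def ipr_closed_form(b: float) -> float:
    """Assumed closed form of the IPR in the GOE-GUE transition."""
    return (3.0 * b * b - 4.0 * b + 4.0) / (b * b)

def ipr_angular_integral(b: float, steps: int = 20000) -> float:
    """Same quantity from the angular integral, using the midpoint rule."""
    e = 2.0 / b - 1.0
    h = math.pi / steps
    integral = h * sum(1.0 / (1.0 + e * math.cos((j + 0.5) * h)) ** 3
                       for j in range(steps))
    return (2.0 / math.pi) * (2.0 * math.sqrt(b - 1.0) / b) ** 5 * integral

ipr_fig3b = ipr_closed_form(1.03)  # rounds to 2.89
```

The closed form gives 3 at $b = 1$ and 2 at $b = 2$, and for $b = 1.03$ it rounds to 2.89, in agreement with the numerically obtained IPR quoted for Fig. 3(b).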
From our investigations, together with more extended studies [16], the complete GUE statistics is conjectured to be obtained only in the high-energy (semiclassical) limit. Until the energy reaches such a limit, the localization of wave functions within chaotic open systems strongly affects the wave statistical properties, leading to deviations from the RMT predictions based on the ergodicity or uniform randomness of wave functions.
Finally, we note that the classical-path families associated with the localization found in Fig. 3(a) and (b) can be identified as the windows indicated with $\alpha$ and $\beta$ in Fig. 2, respectively. (In Fig. 3(b) only the path family for the localization touching the entrance can be identified in Fig. 2.) We notice that the angle of incidence $\theta_i$ for a given $k$ is unrelated to that of the path corresponding to the observed localizations directly connected to the entrance.
5 Conclusions
In conclusion, our numerical analyses show that chaotic-scattering wave functions in open systems exhibit features remarkably different from the idealized GUE predictions. The statistical deviations from the GUE can be understood in terms of wave localization corresponding to classical short-path dynamics.
Acknowledgments
The author is obliged to K.-F. Berggren, A. I. Saichev and A. F. Sadreev for fruitful collaboration leading to the work in Sec. 4. Support from the Swedish Board for Industrial and Technological Development (NUTEK) under Project No. P12144-1 is also acknowledged. Part of the calculations of the wave function statistics were carried out using a resource at the National Supercomputer Center (NSC) at Linköping.
References
1. H. Narnhofer (to be published).
2. A. I. Shnirelman, Usp. Mat. Nauk 29, 181 (1974).
3. P. Gérard and E. Leichtnam, Duke Math. J. 71, 559 (1993).
4. S. Zelditch and M. Zworski, Comm. Math. Phys. 175, 673 (1996).
5. L. A. Bunimovich, Funct. Anal. Appl. 8, 254 (1974).
6. K. Nakamura and H. Ishio, J. Phys. Soc. Jpn. 61, 3939 (1992).
7. H. Ishio and J. Burgdörfer, Phys. Rev. B 51, 2013 (1995).
8. C. Porter and R. Thomas, Phys. Rev. 104, 483 (1956).
9. V. N. Prigodin, Phys. Rev. Lett. 74, 1566 (1995).
10. V. N. Prigodin et al., Phys. Rev. Lett. 72, 546 (1994).
11. M. V. Berry, in Chaos and Quantum Physics, ed. M. J. Giannoni, A. Voros and J. Zinn-Justin (Elsevier, Amsterdam, 1990), p. 251.
12. K. Zyczkowski and G. Lenz, Z. Phys. B 82, 299 (1991).
13. G. Lenz and K. Zyczkowski, J. Phys. A 25, 5539 (1992).
14. E. Kanzieper and V. Freilikher, Phys. Rev. B 54, 8737 (1996).
15. R. Pnini and B. Shapiro, Phys. Rev. E 54, R1032 (1996).
16. H. Ishio et al. (unpublished).
17. S.-H. Chung et al., Phys. Rev. Lett. 85, 2482 (2000).
18. E. J. Heller, Phys. Rev. Lett. 53, 1515 (1984).
19. F. Wegner, Z. Phys. B 36, 209 (1980).
20. C. Castellani and L. Peliti, J. Phys. A 19, L429 (1986).
21. Y. V. Fyodorov and A. D. Mirlin, Phys. Rev. B 51, 13403 (1995).
22. K. Müller et al., Phys. Rev. Lett. 78, 215 (1997).
23. V. N. Prigodin and B. L. Altshuler, Phys. Rev. Lett. 80, 1944 (1998).
24. L. Kaplan, Nonlinearity 12, R1 (1999).
25. L. Kaplan, Phys. Rev. Lett. 80, 2582 (1998).
26. L. Kaplan and E. J. Heller, Ann. Phys. 264, 171 (1998).
27. H. Ishio and L. Kaplan (private communication).
[Figure 2 plot: two panels with horizontal axes $y(-\theta_i)$ and $y(+\theta_i)$, each running from $-d/2$ through 0 to $d/2$.]
Figure 2: Transmission-reflection diagram of classical particles as a function of the initial position $y$ at the entrance of the stadium cavity and the Fermi wave number $k$, corresponding to the angle of incidence $\theta_i$ calculated by the semiclassical quantization condition ($n = 1$ over the whole range) in the lead. Black and white points correspond to transmission and reflection events, respectively. Two families of short paths are identified with an arrow beside the diagram (see the text).
Figure 3: Contour plot of the wave probability density in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). The initial wave comes through the left lead into the cavity. The transmission probability is (a) $T_{qm} = 0.55$ and (b) $T_{qm} = 0.36$. The contours show about 97.5% of the largest wave probability density. Thin white lines show some of the short classical orbits corresponding to the localization of the wave probability density. Taken from the work by the authors in Ref. [16] (unpublished).
[Figure 4 plot: panels (a) and (b); curves labelled GOE and GUE; inset horizontal axis $kr$.]
Figure 4: Probability distribution (steps) and spatial correlation (thick line in the inset) of local densities in the open stadium billiard for the conditions (a) $kd/\pi = 1.8785$ ($n = 1$) and (b) $kd/\pi = 4.6553$ ($n = 1$). The two thin lines show the GOE (i.e., Eq. (1)) and GUE (i.e., Eq. (2)) cases (Eq. (3) for the inset). Taken from the work by the authors in Ref. [16] (unpublished).
ORIGIN OF QUANTUM PROBABILITIES
ANDREI KHRENNIKOV
International Center for Mathematical Modeling in Physics and Cognitive Sciences, MSI, University of Växjö, S-35195, Sweden
Email: Andrei.Khrennikov@msi.vxu.se
We demonstrate that the origin of the quantum probabilistic rule (which differs from the conventional Bayes' formula by the presence of a $\cos\theta$-factor) might be explained by perturbation effects of preparation and measurement procedures. The main consequence of our investigation is that interference could be produced by purely corpuscular objects. In particular, the quantum rule for probabilities (with a nontrivial $\cos\theta$-factor) could be simulated for macroscopic physical systems via preparation procedures producing statistical deviations of a special form. We discuss preparation and measurement procedures which may produce probabilistic rules that are neither classical nor quantum, in particular, hyperbolic quantum theory.
1 Introduction
It is well known that the conventional probabilistic rule, the formula of total probability (which is based on Bayes' formula for conditional probabilities), cannot be applied to quantum experiments; see, for example, [1]-[12] for extended discussions. It seems that the special features of quantum probabilistic behaviour are just consequences of violations of the conventional probabilistic rule.
In this paper we restrict our investigations to the two-dimensional case. Here the formula of total probability has the form ($i = 1, 2$):
$$p(A = a_i) = p(B = b_1)\,p(A = a_i | B = b_1) + p(B = b_2)\,p(A = a_i | B = b_2), \tag{1}$$
where $A$ and $B$ are physical variables which take, respectively, the values $a_1, a_2$ and $b_1, b_2$. The symbols $p(A = a_i | B = b_j)$ denote conditional probabilities. It is one of the most important rules used in applied probability theory. In fact, it is the prediction rule: if we know the probabilities for $B$ and the conditional probabilities, then we can find the probabilities for $A$. However, this rule cannot be used for the prediction of probabilities observed in experiments with elementary particles. The violation of the conventional probabilistic rule, and the necessity to use a new prediction rule, was found in interference experiments with elementary particles. This astonishing fact was one of the main reasons to build the quantum formalism on the basis of wave-particle duality.
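As a concrete illustration of the conventional prediction rule, the following minimal sketch evaluates the formula of total probability for a two-valued $B$ and a two-valued $A$. The function name and the numerical values of the probabilities are purely illustrative.

```python
def total_probability(q, cond):
    """Formula of total probability for two-valued A and B.

    q[j]       = p(B = b_j)
    cond[j][i] = p(A = a_i | B = b_j)
    Returns [p(A = a_1), p(A = a_2)].
    """
    return [sum(q[j] * cond[j][i] for j in range(2)) for i in range(2)]

q = [0.25, 0.75]                    # illustrative probabilities for B
cond = [[0.5, 0.5], [0.25, 0.75]]   # illustrative conditional probabilities
p = total_probability(q, cond)      # p is [0.3125, 0.6875]
```

The two predicted probabilities sum to one whenever the rows of `cond` do, which is the consistency that the interference term discussed in this paper appears to break.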
Let $\phi$ be a quantum state. Let $\{|b_i\rangle\}_{i=1}^{2}$ be the basis consisting of eigenvectors of the operator $B$ corresponding to the physical observable $B$. The quantum probabilistic rule has the form ($i = 1, 2$):
$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\,\cos\theta, \tag{2}$$
where $p_i = p_\phi(A = a_i)$, $q_j = p_\phi(B = b_j)$, $p_{ij} = p_{b_i}(A = a_j)$, $i, j = 1, 2$. Here probabilities have indexes corresponding to quantum states.
By denoting $P = p_i$ and $P_1 = q_1 p_{1i}$, $P_2 = q_2 p_{2i}$, we get the standard quantum probabilistic rule for interference of alternatives:
$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta.$$
There is a large diversity of opinions on the origin of violations of the conventional probabilistic rule (1) in quantum mechanics; see [1]-[12]. The common opinion is that violations of (1) are induced by special properties of quantum systems (for example, Dirac, Feynman, Schrödinger). Thus the quantum probabilistic rule must be considered as a peculiarity of nature.
An interesting investigation of this problem is contained in the paper of J. Summhammer [12]. In contrast to Dirac, Feynman and Schrödinger, he claimed that the quantum probabilistic rule (2) is not a peculiarity of nature, but just a consequence of one special method of the probabilistic description of nature, the so-called method of maximum predictive power.
In this paper we provide a probabilistic analysis of the quantum rule (2). In our analysis, probability has the meaning of frequency probability, namely, the limit of frequencies in a long sequence of trials (or for a large statistical ensemble). Hence, in fact, we follow R. von Mises' approach to probability [13]. It seems that it would be impossible to find the roots of the quantum rule (2) in the measure-theoretical framework of A. N. Kolmogorov, 1933 [14]. In the measure-theoretical framework, probabilities are defined as sets of real numbers having some special mathematical properties. The conventional rule (1) is merely a consequence of the definition of conditional probabilities. In the Kolmogorov framework, to analyse the transition from (1) to (2) is to analyse the transition from one definition to another. In the frequency framework, we can analyse the behaviour of trials which induce one or another property of probability. Our analysis shows that the quantum probabilistic rule (2) can, in principle, be a consequence of perturbation effects of preparation and measurement procedures. Thus trigonometric fluctuations of quantum probabilities can be explained without using wave arguments.
In fact, our investigation is strongly based on Dirac's famous analysis of the foundations of quantum mechanics; see [1]. In particular, P. Dirac pointed out that one of the main differences between the classical and quantum theories is that in the quantum case perturbation effects of preparation and measurement procedures play the crucial role. However, P. Dirac could not explain the origin of interference for quantum particles in the purely corpuscular model. He had to appeal to wave arguments: "If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other" [1].
In this paper we discuss perturbation effects of preparation and measurement procedures. We remark that we do not follow W. Heisenberg [15]: we do not study perturbation effects for individual measurements. We discuss statistical (ensemble) deviations induced by perturbations.^a
We underline again that our probabilistic analysis was possible only due to the rejection of Kolmogorov's measure-theoretical model of probability theory. Of course, each particular experiment (measurement) can be described by Kolmogorov's model: there are no quantum probabilities. Moreover, it seems that there is nothing more than the binomial probability distribution (see the paper of J. Summhammer in the present volume). The most important feature of QUANTUM STATISTICS is not related to a single experiment. We have to consider at least three different experiments (preparation procedures) to observe quantum probabilistic behaviour, namely, interference of alternatives. Kolmogorov's model is not adequate to such a situation. In this model all random variables are defined on the same probability space. It is impossible to do this in the case of a few experiments that produce interference of alternatives (at least the author does not see any way to do this). In our analysis, probability is classical relative frequency, but it is not Kolmogorov probability (compare with Accardi [3]).
An unexpected consequence of our analysis is that the quantum probability rule (2) is just one of the possible perturbations (by ensemble fluctuations) of the conventional probability rule (1). In principle, there might exist experiments which would produce perturbations of the conventional probabilistic rule (1) that differ from the quantum probabilistic rule (2).
Moreover, if we use the same normalization of the interference term, namely $2\sqrt{P_1 P_2}$, then we can classify all possible probabilistic rules that we have in nature:
1) trigonometric;
2) hyperbolic;
3) hyper-trigonometric.
The hyperbolic probabilistic transformation has a linear space representation that is similar to the standard quantum formalism in the complex Hilbert space. Instead of complex numbers we use so-called hyperbolic numbers; see, for example, [18], p. 21. The development of hyperbolic quantum mechanics can be interesting for comparative analysis with standard quantum mechanics. In particular, we clarify the role of complex numbers in quantum theory. Complex (as well as hyperbolic) numbers were used to linearize a nonlinear probabilistic rule (that in general could not be linearized over real numbers). Another interesting feature of hyperbolic quantum mechanics is the violation of the principle of superposition. Here we have only some restricted variant of this principle.
^a Such an approach implies the statistical viewpoint on the Heisenberg uncertainty relation: the statistical dispersion principle; see L. Ballentine [16], [17] for the details.
2 Quantum formalism and perturbation effects
1. Frequency probability theory. The frequency definition of probability is more or less standard in quantum theory, especially in the approach based on preparation and measurement procedures [5], [10], [16], [11].
Let us consider a sequence of physical systems $\pi = (\pi_1, \pi_2, \ldots, \pi_N, \ldots)$. Suppose that elements of $\pi$ have some property, for example, position or spin, and this property can be described by natural numbers $L = \{1, 2, \ldots, m\}$, the set of labels. Thus for each $\pi_j \in \pi$ we have a number $x_j \in L$. So $\pi$ induces a sequence
$$x = (x_1, x_2, \ldots, x_N, \ldots), \quad x_j \in L. \tag{3}$$
For each fixed $\alpha \in L$ we have the relative frequency $\nu_N(\alpha) = n_N(\alpha)/N$ of the appearance of $\alpha$ in $(x_1, x_2, \ldots, x_N)$. Here $n_N(\alpha)$ is the number of elements in $(x_1, x_2, \ldots, x_N)$ with $x_j = \alpha$. R. von Mises [13] said that $x$ satisfies the principle of the statistical stabilization of relative frequencies if, for each fixed $\alpha \in L$, there exists the limit
$$p(\alpha) = \lim_{N\to\infty} \nu_N(\alpha). \tag{4}$$
This limit is said to be the probability of $\alpha$. Thus the probability is defined as the limit of relative frequencies. In fact, this definition of probability is used in all experimental investigations. In Kolmogorov's approach [14], probability is defined as a measure. The principle of statistical stabilization is obtained as a mathematical theorem, the law of large numbers.
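The stabilization of relative frequencies can be illustrated numerically. In the sketch below a pseudo-random sequence of labels stands in for the sequence $x$; this is an assumption made only for illustration, since a pseudo-random generator is not a von Mises collective. The label weights, the seed and the function name are likewise hypothetical.

```python
import random

def relative_frequency(xs, alpha):
    """nu_N(alpha) = n_N(alpha) / N for the finite sample xs."""
    return sum(1 for x in xs if x == alpha) / len(xs)

rng = random.Random(0)        # fixed seed so the run is reproducible
labels = [1, 2, 3]            # the label set L with m = 3
weights = [0.3, 0.5, 0.2]     # illustrative limiting probabilities
x = rng.choices(labels, weights=weights, k=100_000)

# Relative frequency of the label 1 in growing initial segments of x.
freqs = {n: relative_frequency(x[:n], 1) for n in (100, 10_000, 100_000)}
```

For long initial segments the entries of `freqs` settle near 0.3, which is the behaviour that the principle of statistical stabilization postulates for the sequence as a whole.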
2. Preparation and measurement procedures and quantum formalism. We consider a statistical ensemble $S$ of quantum particles described by a quantum state $\phi$. This ensemble is produced by some preparation procedure $\mathcal{E}$; see, for example, [4], [5], [16], [10], [11] for details. See also P. Dirac [1]: "In practice the conditions could be imposed by a suitable preparation of the system, consisting perhaps in passing it through various kinds of sorting apparatus, such as slits and polarimeters, the system being left undisturbed after the preparation."
There are two discrete physical observables $B = b_1, b_2$ and $A = a_1, a_2$.
The total number of particles in $S$ is equal to $N$. Suppose that there are $n_i^b$, $i = 1, 2$, particles in $S$ with $B = b_i$ and $n_i^a$, $i = 1, 2$, particles in $S$ with $A = a_i$.
Suppose that among those particles with $B = b_i$ there are $n_{ij}$, $i, j = 1, 2$, particles with $A = a_j$ (see (R) below to specify the meaning of "with"). So
$$n_i^b = n_{i1} + n_{i2}, \quad n_j^a = n_{1j} + n_{2j}, \quad i, j = 1, 2.$$
(R) We follow Einstein and use the objective realist model, in which both $B$ and $A$ are objective properties of a quantum particle; see [5], [4], [10] for the details. In particular, here each elementary particle has simultaneously defined position and momentum. In such a model we can consider in the ensemble $S$ the sub-ensembles $S_j(B)$ and $S_j(A)$, $j = 1, 2$, of particles having the properties $B = b_j$ and $A = a_j$, respectively. Set
$$S_{ij}(A, B) = S_i(B) \cap S_j(A).$$
Then $n_{ij}$ is the number of elements in the ensemble $S_{ij}(A, B)$. We remark that the existence of the objective property ($B = b_i$ and $A = a_j$) need not imply the possibility of measuring this property. For example, such a measurement is impossible in the case of incompatible observables. In general, the property ($B = b_i$ and $A = a_j$) is a kind of hidden objective property.^b
Physical experience says that the following frequency probabilities are well defined for all observables $B$, $A$:
$$q_i = p_\phi(B = b_i) = \lim_{N\to\infty} q_i^{(N)}, \quad q_i^{(N)} = \frac{n_i^b}{N}; \tag{5}$$
$$p_i = p_\phi(A = a_i) = \lim_{N\to\infty} p_i^{(N)}, \quad p_i^{(N)} = \frac{n_i^a}{N}. \tag{6}$$
Let the quantum states $|b_i\rangle$ be eigenstates of the operator $B$. Let us consider statistical ensembles $T_i$, $i = 1, 2$, of quantum particles described by the quantum states $|b_i\rangle$. These ensembles are produced by some preparation procedures $\mathcal{E}_i$. For instance, we can suppose that particles produced by a preparation procedure $\mathcal{E}$ (for the quantum state $\phi$) pass through additional filters $F_i$, $i = 1, 2$. In quantum formalism we have
$$\phi = \sqrt{q_1}\,|b_1\rangle + \sqrt{q_2}\,e^{i\theta}\,|b_2\rangle. \tag{7}$$
^b Attempts to use objective realism in quantum theory were strongly criticized, especially in connection with the EPR-Bell considerations. Moreover, many authors (for example, P. Dirac [1] and R. Feynman [2]) claimed that the contradiction between objective realism and quantum theory can be observed just by comparing the conventional and quantum probabilistic rules (see d'Espagnat [4] for an extended discussion). However, in this paper we demonstrate that there is no direct contradiction between objective realism and the quantum probabilistic rule.
In the objective realist model (R), this representation may induce the illusion that the ensembles $T_i$, $i = 1, 2$, for the states $|b_i\rangle$ must be identified with the sub-ensembles $S_i(B)$ of the ensemble $S$ for the state $\phi$. However, there are no physical reasons for such an identification.
The additional filter $F_i$ ($i = 1, 2$) changes the $A$-property of quantum particles. In general, the probability distribution of the property $A$ for the ensemble $S_i(B) = \{\pi \in S : B(\pi) = b_i\}$ differs from the corresponding probability distribution for the ensemble $T_i$.
Suppose that there are $m_{ij}$ particles in the ensemble $T_i$ with $A = a_j$ ($j = 1, 2$).^c
The following frequency probabilities are well defined: $p_{ij} = p_{b_i}(A = a_j) = \lim_{N\to\infty} p_{ij}^{(N)}$, where the relative frequency $p_{ij}^{(N)} = m_{ij}/n_i^b$ (by measuring values of the variable $A$ for the statistical ensemble $T_i$ we always observe the stabilization of the relative frequencies $p_{ij}^{(N)}$ to some constant probability $p_{ij}$).
Here it is assumed that the ensemble $T_i$ consists of $n_i^b$ particles, $i = 1, 2$. This assumption is natural if we consider the preparation procedure $\mathcal{E}_i$ as a filter $F_i$ with respect to the value $B = b_i$: only particles with $B = b_i$ pass this filter. Hence the number of elements in the ensemble $T_i$ (represented by the state $|b_i\rangle$) coincides with the number of elements with $B = b_i$ in the ensemble $S$ (represented by the state $\phi$).
It is also assumed that $n_i^b = n_i^b(N) \to \infty$, $N \to \infty$. In fact, the latter assumption holds true if both probabilities $q_i$, $i = 1, 2$, are nonzero.
We remark that the probabilities $p_{ij} = p_{b_i}(A = a_j)$ cannot (in general) be identified with the conditional probabilities $p_\phi(A = a_j | B = b_i)$. As we have remarked, these probabilities are related to statistical ensembles prepared by different preparation procedures, namely by $\mathcal{E}_i$, $i = 1, 2$, and $\mathcal{E}$. The probabilities $p_{b_i}(A = a_j)$ can be found by measuring the $A$-variable for particles belonging to the ensemble $T_i$. The probabilities $p_\phi(A = a_j | B = b_i)$ in general cannot be found; these are hidden probabilities with respect to the ensemble $S$.
^c We can use the objective realist model (R). Then $m_{ij}$ is just the number of particles in the ensemble $T_i$ having the objective property $A = a_j$. We can also use the contextualist model (C). Then $m_{ij}$ is the number of particles in the ensemble $T_i$ which, in the process of an interaction with a measurement device for the physical observable $A$, would give the result $A = a_j$.
3. Derivation of the quantum probabilistic rule. Here we present the standard Hilbert space calculations:
lttgt = y5x h gt +y^eie b2 gt Let aj gt be the orthonormal basis consisting of eigenvectors of the
operator A We can restrict our considerations to the case
h gt= -vPiT K gt +e I 7 lv pH a2 gt b2 gt= VP2T K gt +en2^p22 a2 gt bull
(8)
We note that $p_{11} + p_{12} = 1$, $p_{21} + p_{22} = 1$. The first sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_1$; the second sum is the probability to observe one of the values of the variable $A$ for the statistical ensemble $T_2$.

As $\langle b_1 | b_2 \rangle = 0$, we obtain $\sqrt{p_{11}p_{21}} + e^{i(\gamma_1 - \gamma_2)}\sqrt{p_{12}p_{22}} = 0$. We suppose that all probabilities $p_{ij} > 0$. This is equivalent to saying that $A$ and $B$ are incompatible observables, or that the operators $A$ and $B$ do not commute.

Hence $\sin(\gamma_1 - \gamma_2) = 0$ and $\gamma_2 = \gamma_1 + \pi k$. We also have $\sqrt{p_{11}p_{21}} + \cos(\gamma_1 - \gamma_2)\sqrt{p_{12}p_{22}} = 0$. This implies that $k = 2l + 1$ and $\sqrt{p_{11}p_{21}} = \sqrt{p_{12}p_{22}}$. As $p_{12} = 1 - p_{11}$ and $p_{21} = 1 - p_{22}$, we obtain that

$$p_{11} = p_{22}, \quad p_{12} = p_{21}. \qquad (9)$$
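These equalities are equivalent to the condition $p_{11} + p_{21} = 1$, $p_{12} + p_{22} = 1$. Hence the matrix of probabilities $(p_{ij})$ is a double stochastic matrix, see, for example, [5] for general considerations. Thus, in fact,

$$|b_1\rangle = \sqrt{p_{11}}\,|a_1\rangle + e^{i\gamma_1}\sqrt{p_{12}}\,|a_2\rangle, \quad |b_2\rangle = \sqrt{p_{21}}\,|a_1\rangle - e^{i\gamma_1}\sqrt{p_{22}}\,|a_2\rangle. \qquad (10)$$

So $\varphi = d_1|a_1\rangle + d_2|a_2\rangle$, where $d_1 = \sqrt{q_1 p_{11}} + e^{i\theta}\sqrt{q_2 p_{21}}$ and $d_2 = e^{i\gamma_1}\sqrt{q_1 p_{12}} - e^{i(\gamma_1+\theta)}\sqrt{q_2 p_{22}}$. Thus

$$p_1 = p_\varphi(A = a_1) = |d_1|^2 = q_1 p_{11} + q_2 p_{21} + 2\sqrt{q_1 p_{11} q_2 p_{21}}\,\cos\theta, \qquad (11)$$

$$p_2 = p_\varphi(A = a_2) = |d_2|^2 = q_1 p_{12} + q_2 p_{22} - 2\sqrt{q_1 p_{12} q_2 p_{22}}\,\cos\theta. \qquad (12)$$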
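As a numerical cross-check of (11) and (12), the following minimal Python sketch computes the probabilities twice: once from the amplitudes $d_1$, $d_2$ and once from the closed formulas. The function name and the parametrization by `q1`, `p11`, `theta`, `gamma1` are assumptions made for this illustration; double stochasticity (9) is imposed by hand.

```python
import cmath
import math


def quantum_probabilities(q1, p11, theta, gamma1=0.0):
    """Return ((|d1|^2, |d2|^2), (rhs of (11), rhs of (12))).

    Illustrative helper (not from the paper). Assumes 0 < q1 < 1 and 0 < p11 < 1.
    """
    q2 = 1.0 - q1
    # Double stochasticity (9): p22 = p11 and p12 = p21 = 1 - p11.
    p12 = 1.0 - p11
    p21, p22 = p12, p11
    # Coordinates of the state in the basis |a1>, |a2>, using decomposition (10).
    d1 = math.sqrt(q1 * p11) + cmath.exp(1j * theta) * math.sqrt(q2 * p21)
    d2 = cmath.exp(1j * gamma1) * (
        math.sqrt(q1 * p12) - cmath.exp(1j * theta) * math.sqrt(q2 * p22)
    )
    from_amplitudes = (abs(d1) ** 2, abs(d2) ** 2)
    from_formulas = (
        q1 * p11 + q2 * p21 + 2 * math.sqrt(q1 * p11 * q2 * p21) * math.cos(theta),
        q1 * p12 + q2 * p22 - 2 * math.sqrt(q1 * p12 * q2 * p22) * math.cos(theta),
    )
    return from_amplitudes, from_formulas
```

Both pairs should agree up to rounding, and each pair should sum to one, because the two interference terms cancel under (9).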
3 Probability transformations connecting preparation procedures

Let us forget for the moment about quantum theory. Let $B$ ($= b_1, b_2$) and $A$ ($= a_1, a_2$) be physical variables. We consider an arbitrary preparation procedure $\mathcal{E}$ for microsystems or macrosystems. Suppose that $\mathcal{E}$ produced an ensemble $S$ of physical systems. Let $\mathcal{E}_1$ and $\mathcal{E}_2$ be preparation procedures which are based on filters $F_1$ and $F_2$ corresponding, respectively, to the values $b_1$ and $b_2$ of $B$. Denote the statistical ensembles produced by these preparation procedures by the symbols $T_1$ and $T_2$, respectively. The symbols $N$, $n_i^b$, $n_i^a$, $n_{ij}$, $m_{ij}$ have the same meaning as in the previous considerations. The probabilities $q_i$, $p_{ij}$, $p_i$ are defined in the same way as in the previous considerations. The only difference is that instead of indexes corresponding to quantum states we use indexes corresponding to statistical ensembles:

$$q_i = p_S(B = b_i), \quad p_i = p_S(A = a_i), \quad p_{ij} = p_{T_i}(A = a_j).$$

We shall restrict our considerations to the case of strictly positive probabilities.
The following simple frequency considerations are basic in our investigation. We would like to represent the frequency $p_i^{(N)}$ (for $A = a_i$ in the ensemble $S$) as the sum of the conventional (Bayes) part

$$q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)}$$

and some perturbation term. Such a perturbation term appears because the frequencies $q_j^{(N)}$ and $p_{ji}^{(N)}$ are calculated with respect to different ensembles. The magnitude of this perturbation term will play the crucial role in our further analysis. We have

$$p_i^{(N)} = \frac{n_i^a}{N} = \frac{n_{1i}}{N} + \frac{n_{2i}}{N} = \frac{m_{1i}}{N} + \frac{m_{2i}}{N} + \frac{n_{1i} - m_{1i}}{N} + \frac{n_{2i} - m_{2i}}{N}.$$

But for $i = 1, 2$ we have

$$\frac{m_{1i}}{N} = \frac{m_{1i}}{n_1^b}\cdot\frac{n_1^b}{N} = p_{1i}^{(N)} q_1^{(N)}, \quad \frac{m_{2i}}{N} = \frac{m_{2i}}{n_2^b}\cdot\frac{n_2^b}{N} = p_{2i}^{(N)} q_2^{(N)}.$$

Hence

$$p_i^{(N)} = q_1^{(N)} p_{1i}^{(N)} + q_2^{(N)} p_{2i}^{(N)} + \delta_i^{(N)}, \qquad (13)$$

where

$$\delta_i^{(N)} = \frac{1}{N}\left[(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})\right], \quad i = 1, 2.$$
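The bookkeeping behind (13) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the 2x2 count tables `n` and `m` are hypothetical, indices are 0-based (so `i = 0` stands for $a_1$), and each ensemble $T_j$ is assumed to contain exactly $n_j^b$ systems, as in the text.

```python
from fractions import Fraction


def frequency_decomposition(N, n, m, i):
    """Split the frequency p_i^(N) into the Bayes part and the deviation delta_i^(N).

    n[j][k]: number of systems in S with B = b_j and A = a_k (hypothetical table).
    m[j][k]: number of systems in T_j with A = a_k (hypothetical table).
    Returns (p_i, bayes_part, delta_i) as exact fractions.
    """
    n_b = [sum(n[0]), sum(n[1])]  # n_j^b: number of systems in S with B = b_j
    if sum(n_b) != N:
        raise ValueError("counts in n must add up to N")
    for j in range(2):
        if sum(m[j]) != n_b[j]:
            raise ValueError("T_j must contain exactly n_j^b systems")
    p_i = Fraction(n[0][i] + n[1][i], N)            # frequency of A = a_i in S
    q = [Fraction(n_b[j], N) for j in range(2)]     # frequencies of B = b_j in S
    p_ji = [Fraction(m[j][i], n_b[j]) for j in range(2)]  # frequencies in T_j
    bayes_part = q[0] * p_ji[0] + q[1] * p_ji[1]
    delta_i = Fraction((n[0][i] - m[0][i]) + (n[1][i] - m[1][i]), N)
    return p_i, bayes_part, delta_i
```

For any admissible tables the identity `p_i == bayes_part + delta_i` holds exactly, which is the content of (13); the interesting question addressed in the text is how `delta_i` behaves as $N$ grows.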
In fact, this rest term depends on the statistical ensembles $S$, $T_1$, $T_2$: $\delta_i^{(N)} = \delta_i^{(N)}(S, T_1, T_2)$.

4 Behaviour of fluctuations

First we remark that $\lim_{N\to\infty} \delta_i^{(N)}$ exists for all physical measurements. We always observe that

$$p_i^{(N)} \to p_i, \quad q_i^{(N)} \to q_i, \quad p_{ij}^{(N)} \to p_{ij}, \quad N \to \infty.$$

Thus there exist limits $\delta_i = \lim_{N\to\infty} \delta_i^{(N)} = p_i - q_1 p_{1i} - q_2 p_{2i}$. This coefficient $\delta_i$ is the statistical deviation produced by the perturbation effect of the preparation procedures $\mathcal{E}_j$ (the quantities $\delta_i^{(N)}$ are experimental statistical deviations).

Suppose that the preparation procedures $\mathcal{E}_i$, $i = 1, 2$ (typically filters $F_i$), produce negligibly small (with respect to the size $N$ of the statistical ensemble) changes in properties of particles. Then

$$\delta_i^{(N)} \to 0, \quad N \to \infty. \qquad (14)$$
This asymptotic implies the conventional probabilistic rule (1). In particular, this rule can be used in all experiments of classical physics. Hence preparation and measurement procedures of classical physics produce experimental statistical deviations with asymptotic (14). We also have such a behaviour in the case of compatible observables in quantum physics.

Moreover, we can obtain the same conventional probabilistic rule for incompatible observables $B$ and $A$ if the phase factor $\theta = \pi/2 + \pi k$. Therefore the conventional probabilistic rule (1) is not directly related to commutativity of the corresponding operators in quantum theory. It is a consequence of asymptotic (14).
Despite the same asymptotic (14), there is a crucial difference between classical observations (and compatible observations) and decoherence, $\theta = \pi/2 + \pi k$, for incompatible observations. In the first case $\delta_i^{(N)} \approx 0$, $N \to \infty$, because both

$$\delta_{i1}^{(N)} = \frac{1}{N}(n_{1i} - m_{1i}) \approx 0, \quad \delta_{i2}^{(N)} = \frac{1}{N}(n_{2i} - m_{2i}) \approx 0, \quad N \to \infty.$$

In an ideal classical experiment we have $n_{1i} = m_{1i}$ and $n_{2i} = m_{2i}$. Here the preparation procedures $\mathcal{E}_j$ (filters with respect to the values $b_j$ of the variable $B$) do not change values of the $A$-variable at all.

In the case of decoherence of incompatible observables the statistical deviations $\delta_{i1}^{(N)}$ and $\delta_{i2}^{(N)}$ are not negligibly small. So perturbations can be sufficiently strong. However, we still observe (14) as a consequence of the compensation effect of perturbations:
$$\delta_{i1}^{(N)} \approx -\delta_{i2}^{(N)}.$$

Suppose now that the filters $F_i$, $i = 1, 2$, produce changes in properties of particles that are not negligibly small (from the statistical viewpoint). Then the statistical deviations

$$\lim_{N\to\infty} \delta_i^{(N)} = \delta_i \neq 0. \qquad (15)$$

Here we obtain probabilistic rules which differ from the conventional one (1). In particular, this implies that behaviour (15) cannot be produced in experiments of classical physics (or for compatible observables in quantum physics).

A rather special class of statistical deviations (15) is produced in experiments of quantum physics. However, behaviour of the form (15) is not a specific feature of quantum measurements (see further considerations).
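To study carefully the behaviour of the fluctuations $\delta_i^{(N)}$ we represent them as

$$\delta_i^{(N)} = 2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}\;\lambda_i^{(N)},$$

where

$$\lambda_i^{(N)} = \frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}}.$$

These are normalized (experimental) statistical deviations. We have used the fact

$$q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)} = \frac{n_1^b}{N}\cdot\frac{m_{1i}}{n_1^b}\cdot\frac{n_2^b}{N}\cdot\frac{m_{2i}}{n_2^b} = \frac{m_{1i} m_{2i}}{N^2}.$$

In the limit $N \to \infty$ we get

$$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\;\lambda_i,$$

where the coefficients $\lambda_i = \lim_{N\to\infty} \lambda_i^{(N)}$, $i = 1, 2$. Thus we found the general probabilistic transformation (for three preparation procedures) that can be obtained as a perturbation of the conventional probabilistic rule ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\;\lambda_i. \qquad (16)$$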
Of course, we are free in the choice of a normalization constant in the perturbation term. We use $2\sqrt{q_1 q_2 p_{1i} p_{2i}}$ by analogy with the quantum formalism. In fact, such a normalization was found in the quantum formalism to get the representation of probabilities with the aid of complex numbers. Complex numbers were introduced in the quantum formalism to linearize the nonlinear probabilistic transformation $q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\cos\theta$. To do this we use the formula ($c, d > 0$):

$$c + d + 2\sqrt{cd}\,\cos\theta = \left|\sqrt{c} + \sqrt{d}\,e^{i\theta}\right|^2. \qquad (17)$$

The square root $\sqrt{c} + \sqrt{d}\,e^{i\theta}$ gives the possibility to use linear transformations. Thus we do not see anything mystical in the appearance of complex numbers in quantum theory. This is a consequence of the impossibility of real linearization of the nonlinear probabilistic transformation.
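Identity (17) is easy to confirm numerically. The short Python sketch below evaluates both sides; the two function names are assumptions introduced only for this check.

```python
import cmath
import math


def interference_sum(c, d, theta):
    """Left-hand side of (17): c + d + 2*sqrt(c*d)*cos(theta), for c, d > 0."""
    return c + d + 2 * math.sqrt(c * d) * math.cos(theta)


def squared_amplitude(c, d, theta):
    """Right-hand side of (17): |sqrt(c) + sqrt(d)*exp(i*theta)|^2."""
    return abs(math.sqrt(c) + math.sqrt(d) * cmath.exp(1j * theta)) ** 2
```

The right-hand side is the squared modulus of a sum of two complex amplitudes, which is exactly the linear-space representation the text refers to.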
In classical physics the coefficients $\lambda_i = 0$. We have the same situation in quantum physics for all compatible observables, as well as for measurements of incompatible observables for some states. In the general case in quantum physics we can only say that the normalized statistical deviations

$$|\lambda_i| \le 1. \qquad (18)$$

Hence for quantum experiments we always have

$$\left|\frac{(n_{1i} - m_{1i}) + (n_{2i} - m_{2i})}{2\sqrt{m_{1i} m_{2i}}}\right| \le 1, \quad N \to \infty. \qquad (19)$$

Thus quantum perturbations induce relatively small (but not negligibly small) statistical variations of properties. We underline again that quantum perturbations give just the proper class of perturbations satisfying condition (19).

Let us consider arbitrary preparation procedures that induce perturbations satisfying (18). We can set

$$\lambda_i = \cos\theta_i, \quad i = 1, 2,$$

where $\theta_i$ are some phases. Here we can represent the perturbation to the conventional probabilistic rule in the form

$$\delta_i = 2\sqrt{q_1 p_{1i} q_2 p_{2i}}\,\cos\theta_i, \quad i = 1, 2. \qquad (20)$$

In this case the probabilistic rule has the form ($i = 1, 2$):

$$p_i = q_1 p_{1i} + q_2 p_{2i} + 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\,\cos\theta_i. \qquad (21)$$
This is the general form of a trigonometric probabilistic transformation. The usual probabilistic calculations give us

$$1 = p_1 + p_2 = q_1 p_{11} + q_2 p_{21} + q_1 p_{12} + q_2 p_{22} + 2\sqrt{q_1 q_2 p_{11} p_{21}}\,\cos\theta_1 + 2\sqrt{q_1 q_2 p_{12} p_{22}}\,\cos\theta_2$$

$$= 1 + 2\sqrt{q_1 q_2}\left[\sqrt{p_{11} p_{21}}\,\cos\theta_1 + \sqrt{p_{12} p_{22}}\,\cos\theta_2\right].$$

Thus we obtain the relation

$$\sqrt{p_{11} p_{21}}\,\cos\theta_1 + \sqrt{p_{12} p_{22}}\,\cos\theta_2 = 0. \qquad (22)$$

Suppose now that the matrix of probabilities is a double stochastic matrix. We get

$$\cos\theta_1 = -\cos\theta_2. \qquad (23)$$

We obtain the quantum probabilistic transformation (2). We have demonstrated that this rule could be derived even in the realist framework. Condition (19) has an evident interpretation. To explain the mystery of the quantum probabilistic rule we must give some physical interpretation to the condition of double stochasticity; see section 4 for such an attempt.
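We can simulate the quantum probabilistic transformation by using random variables $n_{ij}(\omega)$, $m_{ij}(\omega)$ such that the deviations

$$\epsilon_{1i}^{(N)} = n_{1i} - m_{1i} = 2\xi_{1i}^{(N)}\sqrt{m_{1i} m_{2i}}, \qquad (24)$$

$$\epsilon_{2i}^{(N)} = n_{2i} - m_{2i} = 2\xi_{2i}^{(N)}\sqrt{m_{1i} m_{2i}}, \qquad (25)$$

where the coefficients $\xi_{ji}^{(N)}$ satisfy the inequality

$$\left|\xi_{1i}^{(N)} + \xi_{2i}^{(N)}\right| \le 1, \quad N \to \infty. \qquad (26)$$

Suppose that $\lambda_i^{(N)} = \xi_{1i}^{(N)} + \xi_{2i}^{(N)} \to \lambda_i$, $N \to \infty$, where $|\lambda_i| \le 1$. We can represent $\lambda_i^{(N)} = \cos\theta_i^{(N)}$. Then $\theta_i^{(N)} \to \theta_i \bmod 2\pi$ when $N \to \infty$. Thus $\lambda_i = \cos\theta_i$.

We remark that the conventional probabilistic rule (which is induced by ensemble fluctuations with $\lambda_i^{(N)} \to 0$) can be observed for fluctuations having relatively large absolute magnitudes. For instance, let

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}\sqrt{m_{1i}}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}\sqrt{m_{2i}}, \quad i = 1, 2, \qquad (27)$$

where the sequences of coefficients $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded ($N \to \infty$). Here

$$\lambda_i^{(N)} = \frac{\xi_{1i}^{(N)}}{\sqrt{m_{2i}}} + \frac{\xi_{2i}^{(N)}}{\sqrt{m_{1i}}} \to 0, \quad N \to \infty$$

(as usual, we assume that $p_{ij} > 0$).

Example 2.1. Let $N \approx 10^6$, $n_1^b \approx n_2^b \approx 5\cdot 10^5$, $m_{11} \approx m_{12} \approx m_{21} \approx m_{22} \approx 25\cdot 10^4$. So $q_1 = q_2 = 1/2$, $p_{11} = p_{12} = p_{21} = p_{22} = 1/2$ (symmetric state). Suppose we have fluctuations (27) with $\xi_{1i}^{(N)} \approx \xi_{2i}^{(N)} \approx 1/2$. Then $\epsilon_{1i}^{(N)} \approx \epsilon_{2i}^{(N)} \approx 500$. So $n_{ij} = 25\cdot 10^4 \pm 500$. Hence the relative deviation

$$\frac{\epsilon_{ji}^{(N)}}{m_{ji}} = \frac{500}{25\cdot 10^4} \approx 0.002.$$

Thus fluctuations of the relative magnitude $\approx 0.002$ produce the conventional probabilistic rule.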
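The arithmetic of Example 2.1 can be reproduced with a small Python sketch. The helper below is an assumption made for illustration: it takes the counts $m_{1i}$, $m_{2i}$ and the coefficients $\xi_{1i}$, $\xi_{2i}$ of (27) and returns the absolute deviations, the relative deviation of the first one, and the normalized deviation $\lambda_i^{(N)}$.

```python
import math


def fluctuation_27(m1, m2, xi1, xi2):
    """Deviations of type (27) for counts m1 = m_{1i}, m2 = m_{2i} (illustrative helper).

    Returns (eps1, eps2, eps1 / m1, lam), where lam is the normalized deviation
    (eps1 + eps2) / (2 * sqrt(m1 * m2)).
    """
    eps1 = 2 * xi1 * math.sqrt(m1)
    eps2 = 2 * xi2 * math.sqrt(m2)
    lam = (eps1 + eps2) / (2 * math.sqrt(m1 * m2))
    return eps1, eps2, eps1 / m1, lam


# Numbers of Example 2.1: m_{ij} = 25 * 10^4 and xi = 1/2.
eps1, eps2, relative, lam = fluctuation_27(250_000, 250_000, 0.5, 0.5)
```

With these inputs the absolute deviations are 500 while the normalized deviation is of order $10^{-3}$, and it shrinks further as the counts grow, which is why such fluctuations still lead to the conventional rule.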
It is evident that fluctuations of essentially larger magnitude

$$\epsilon_{1i}^{(N)} = 2\xi_{1i}^{(N)}(m_{1i})^{1/2}(m_{2i})^{1/\alpha}, \quad \epsilon_{2i}^{(N)} = 2\xi_{2i}^{(N)}(m_{2i})^{1/2}(m_{1i})^{1/\beta}, \quad \alpha, \beta > 2, \qquad (28)$$

where $\xi_{1i}^{(N)}$ and $\xi_{2i}^{(N)}$ are bounded sequences ($N \to \infty$), also produce (for $p_{ij} \neq 0$) the conventional probabilistic rule.

Example 2.2. Let all numbers $N$, $m_{ij}$ be the same as in Example 2.1 and let the deviations have behaviour (28) with $\alpha = \beta = 4$. Here the relative deviation

$$\frac{\epsilon_{ij}^{(N)}}{m_{ij}} \approx 0.045.$$

Remark 2.1. The magnitude of fluctuations can be found experimentally. Let $A$ and $B$ be two physical observables. We prepare three statistical ensembles $S$, $T_1$, $T_2$ corresponding to the states $\varphi$, $|b_1\rangle$, $|b_2\rangle$. By measurements of $B$ and $A$ for $\pi \in S$ we obtain the frequencies $q_1^{(N)}, q_2^{(N)}, p_1^{(N)}, p_2^{(N)}$; by measurements of $A$ for $\pi \in T_1$ and for $\pi \in T_2$ we obtain the frequencies $p_{ij}^{(N)}$. We have

$$f_i(N) = \lambda_i^{(N)} = \frac{p_i^{(N)} - q_1^{(N)} p_{1i}^{(N)} - q_2^{(N)} p_{2i}^{(N)}}{2\sqrt{q_1^{(N)} p_{1i}^{(N)} q_2^{(N)} p_{2i}^{(N)}}}.$$

It would be interesting to obtain graphs of the functions $f_i(N)$ for different pairs of physical observables. Of course, we know that $\lim_{N\to\infty} f_i(N) = \pm\cos\theta$. However, it may be that such graphs can present a finer structure of quantum states.
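The quantity proposed in Remark 2.1 is straightforward to evaluate from measured frequencies. The following Python sketch is a minimal illustration; the function names are assumptions, and the inputs are taken to be strictly positive frequencies as in the text.

```python
import math


def normalized_deviation(p_i, q1, p1i, q2, p2i):
    """Normalized statistical deviation computed from frequencies.

    p_i: frequency of A = a_i in S; q1, q2: frequencies of B = b_1, b_2 in S;
    p1i, p2i: frequencies of A = a_i in T_1 and T_2. All assumed > 0.
    """
    return (p_i - q1 * p1i - q2 * p2i) / (2 * math.sqrt(q1 * p1i * q2 * p2i))


def phase_if_trigonometric(lam):
    """Return theta with cos(theta) = lam when |lam| <= 1, otherwise None."""
    return math.acos(lam) if abs(lam) <= 1 else None
```

For example, with $q_1 = q_2 = p_{1i} = p_{2i} = 1/2$ an observed frequency $p_i = 0.75$ corresponds to a normalized deviation of $0.5$, that is, to a phase of $\pi/3$; plotting this value against $N$ for real data would give the graphs of $f_i(N)$ mentioned above.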
3 Hyperbolic and hyper-trigonometric probabilistic transformations
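Let $\mathcal{E}_1$, $\mathcal{E}_2$ be preparation procedures that produce perturbations such that the normalized (experimental) statistical deviations

$$|\lambda_i^{(N)}| \ge 1, \quad N \to \infty. \qquad (29)$$

Thus $|\lambda_i| \ge 1$, $i = 1, 2$. Here the coefficients $\lambda_i$ can be represented in the form $\lambda_i = \pm\cosh\theta_i$, $i = 1, 2$. The corresponding probability rule has the following form:

$$p_i = q_1 p_{1i} + q_2 p_{2i} \pm 2\sqrt{q_1 q_2 p_{1i} p_{2i}}\,\cosh\theta_i, \quad i = 1, 2.$$

The normalization $p_1 + p_2 = 1$ gives the orthogonality relation

$$\sqrt{p_{11} p_{21}}\,\cosh\theta_1 \pm \sqrt{p_{12} p_{22}}\,\cosh\theta_2 = 0. \qquad (30)$$

Thus $\cosh\theta_2 = \cosh\theta_1\sqrt{p_{11} p_{21}/(p_{12} p_{22})}$ and $\operatorname{sign}\lambda_1\lambda_2 = -1$.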
This probabilistic transformation can be called a hyperbolic rule. It describes a part of nonconventional probabilistic behaviours which is not described by the trigonometric formalism. Experiments (and preparation procedures $\mathcal{E}$, $\mathcal{E}_1$, $\mathcal{E}_2$) which produce hyperbolic probabilistic behaviour could be simulated on a computer. On the other hand, at the moment we have no natural physical phenomena which are described by the hyperbolic probabilistic formalism. Trigonometric probabilistic behaviour corresponds to essentially better control of properties in the process of preparation than hyperbolic probabilistic behaviour. Of course, the aim of any experimenter is to approach trigonometric behaviour. However, in principle there might exist natural phenomena for which trigonometric quantum behaviour could not be achieved.
Example 3.1. Let $q_1 = \alpha$, $q_2 = 1 - \alpha$, $p_{11} = \dots = p_{22} = 1/2$. Then

$$p_1 = \frac{1}{2} + \sqrt{\alpha(1-\alpha)}\,\lambda_1, \quad p_2 = \frac{1}{2} - \sqrt{\alpha(1-\alpha)}\,\lambda_1.$$

If $\alpha$ is sufficiently small, then $\lambda_1$ can in principle be larger than 1. We can find a phase $\theta$ such that the normalized statistical deviation $\lambda_1 = \cosh\theta$.

Let us consider experiments that produce the hyperbolic probabilistic rule and let the corresponding matrix of probabilities be double stochastic. In this case the orthogonality relation (30) has the form

$$\cosh\theta_1 = \cosh\theta_2 = \cosh\theta.$$

We get the probabilistic transformation

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 q_2 p_{11} p_{21}}\,\cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 q_2 p_{12} p_{22}}\,\cosh\theta.$$
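To see how much room Example 3.1 leaves for a hyperbolic phase, one can evaluate the double stochastic hyperbolic rule directly. The Python sketch below is illustrative only: the function names, the `sign` argument selecting the upper or lower sign, and the helper that bounds the phase by requiring $p_1 \le 1$ in the symmetric case are assumptions of this example.

```python
import math


def hyperbolic_rule(q1, p11, theta, sign=1):
    """Hyperbolic transformation for a double stochastic matrix (p22 = p11, p12 = p21)."""
    q2 = 1.0 - q1
    p12 = 1.0 - p11
    p21, p22 = p12, p11
    c = math.cosh(theta)
    p1 = q1 * p11 + q2 * p21 + sign * 2 * math.sqrt(q1 * q2 * p11 * p21) * c
    p2 = q1 * p12 + q2 * p22 - sign * 2 * math.sqrt(q1 * q2 * p12 * p22) * c
    return p1, p2


def max_phase_symmetric(alpha):
    """Largest theta keeping p1 <= 1 in Example 3.1 (all p_ij = 1/2, q1 = alpha).

    From p1 = 1/2 + sqrt(alpha*(1-alpha))*cosh(theta) <= 1.
    """
    return math.acosh(1.0 / (2.0 * math.sqrt(alpha * (1.0 - alpha))))
```

For $\alpha = 0.01$ the bound on $\cosh\theta$ is about 5, so normalized deviations well above 1 are compatible with $p_1, p_2 \in [0, 1]$; as $\alpha$ approaches $1/2$ the admissible range shrinks to $\theta = 0$.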
This probabilistic transformation looks similar to the quantum probabilistic transformation. The only difference is the presence of hyperbolic factors instead of trigonometric ones. This similarity gives the possibility to construct a linear space representation of the hyperbolic probabilistic calculus, see section 7.

The reader can easily consider the last possibility by himself: one normalized statistical deviation $|\lambda_i|$ is larger than 1 and the other is less than 1 (the hyper-trigonometric probabilistic transformation).

Remark 3.1. The real experimental situation is more complicated. In fact, the phase parameter $\theta$ is connected with the experimental arrangement. In particular, in the standard interference experiments the phase is related to the space-time structure of an experiment. It may be that in some experiments the dependence of the normalized statistical deviation $\lambda$ on $\theta$ is neither trigonometric nor hyperbolic:

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\lambda(\theta).$$

However, if the function $|\lambda(\theta)| \le 1$, then we can obtain the trigonometric transformation by just the reparametrization $\theta' = \arccos\lambda(\theta)$.
4 Double stochasticity and correlations between preparation procedures

In this section we study the frequency meaning of the fact that in the quantum formalism the matrix of probabilities is double stochastic. We remark that this is a consequence of orthogonality of the quantum states $|b_1\rangle$ and $|b_2\rangle$ corresponding to distinct values of a physical observable $B$. We have

$$\frac{p_{11}}{p_{12}} = \frac{p_{22}}{p_{21}}. \qquad (31)$$

Suppose that all quantum features are induced by the impossibility to create new ensembles $T_1$ and $T_2$ without changing properties of quantum particles. Suppose that, for example, the preparation procedure $\mathcal{E}_1$ practically destroys the property $A = a_1$ (transforms this property into the property $A = a_2$). So $p_{11} \approx 0$. As a consequence, $\mathcal{E}_1$ makes the property $A = a_2$ dominating. So $p_{12} \approx 1$. Then the preparation procedure $\mathcal{E}_2$ must practically destroy the property $A = a_2$ (transform this property into the property $A = a_1$). So $p_{22} \approx 0$. As a consequence, $\mathcal{E}_2$ makes the property $A = a_1$ dominating. So $p_{21} \approx 1$.
We remark that
We recall that the number of elements in the ensemble $T_i$ is equal to $n_i^b$. Thus
n n -run _ n22 - m 2 2 ^ nil _ 22 bdquobdquo
This is nothing other than the relation between fluctuations of the property $A$ under the transition from the ensemble $S$ to the ensembles $T_1$, $T_2$ and the distribution of this property in the ensemble $S$.
5 Hyperbolic quantum formalism
The mathematical formalism presented in this section can have different physical interpretations. In particular, a quantum state can be interpreted from the orthodox Copenhagen as well as the statistical viewpoint.
A hyperbolic algebra $G$, see [18], p. 21, is a two dimensional real algebra with basis $e_0 = 1$ and $e_1 = j$, where $j^2 = 1$. Elements of $G$ have the form $z = x + jy$, $x, y \in R$. We have $z_1 + z_2 = (x_1 + x_2) + j(y_1 + y_2)$ and $z_1 z_2 = (x_1 x_2 + y_1 y_2) + j(x_1 y_2 + x_2 y_1)$. This algebra is commutative. We introduce the involution in $G$ by setting $\bar{z} = x - jy$. We set $|z|^2 = z\bar{z} = x^2 - y^2$. We remark that $|z| = \sqrt{x^2 - y^2}$ is not well defined for an arbitrary $z \in G$. We set $G_+ = \{z \in G : |z|^2 \ge 0\}$. We remark that $G_+$ is a multiplicative semigroup: $z_1, z_2 \in G_+ \Rightarrow z = z_1 z_2 \in G_+$. It is a consequence of the equality

$$|z_1 z_2|^2 = |z_1|^2 |z_2|^2.$$

Thus for $z_1, z_2 \in G_+$ we have $|z_1 z_2| = |z_1||z_2|$. We introduce
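$$e^{j\theta} = \cosh\theta + j\sinh\theta, \quad \theta \in R.$$

We remark that

$$e^{j\theta_1} e^{j\theta_2} = e^{j(\theta_1 + \theta_2)}, \quad \overline{e^{j\theta}} = e^{-j\theta}, \quad |e^{j\theta}|^2 = \cosh^2\theta - \sinh^2\theta = 1.$$

Hence $z = \pm e^{j\theta}$ always belongs to $G_+$. We also have

$$\cosh\theta = \frac{e^{j\theta} + e^{-j\theta}}{2}, \quad \sinh\theta = \frac{e^{j\theta} - e^{-j\theta}}{2j}.$$

We set $G_+^* = \{z \in G_+ : |z|^2 > 0\}$. Let $z \in G_+^*$. We have

$$z = |z|\left(\frac{x}{|z|} + j\frac{y}{|z|}\right) = \operatorname{sign}x\,|z|\left(\frac{x\operatorname{sign}x}{|z|} + j\frac{y\operatorname{sign}x}{|z|}\right).$$

As $\frac{x^2}{|z|^2} - \frac{y^2}{|z|^2} = 1$, we can represent $x\operatorname{sign}x/|z| = \cosh\theta$ and $y\operatorname{sign}x/|z| = \sinh\theta$, where the phase $\theta$ is uniquely defined. We can represent each $z \in G_+^*$ as

$$z = \operatorname{sign}x\,|z|\,e^{j\theta}.$$

By using this representation we can easily prove that $G_+^*$ is a multiplicative group. Here $\frac{1}{z} = \frac{\operatorname{sign}x}{|z|}e^{-j\theta}$. The unit circle in $G$ is defined as $S_1 = \{z \in G : |z|^2 = 1\} = \{z = \pm e^{j\theta}, \theta \in (-\infty, +\infty)\}$. It is a multiplicative subgroup of $G_+^*$.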
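The algebraic rules above can be tried out with a small Python class. This is a sketch under stated assumptions: the class name `Hyperbolic`, the helper `hexp` for $e^{j\theta}$, and the function `polar` returning the triple (sign, modulus, phase) are names introduced here for illustration and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperbolic:
    """Element z = x + j*y of the hyperbolic algebra, with j*j = 1."""
    x: float
    y: float

    def __add__(self, other):
        return Hyperbolic(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x1 + j y1)(x2 + j y2) = (x1 x2 + y1 y2) + j (x1 y2 + x2 y1)
        return Hyperbolic(self.x * other.x + self.y * other.y,
                          self.x * other.y + self.y * other.x)

    def conj(self):
        return Hyperbolic(self.x, -self.y)

    def norm2(self):
        # |z|^2 = z * conj(z) = x^2 - y^2; it may be negative.
        return self.x ** 2 - self.y ** 2


def hexp(theta):
    """Hyperbolic exponent e^{j*theta} = cosh(theta) + j*sinh(theta)."""
    return Hyperbolic(math.cosh(theta), math.sinh(theta))


def polar(z):
    """Return (sign, modulus, theta) with z = sign * modulus * e^{j*theta}.

    Only defined when |z|^2 > 0, i.e. |x| > |y|.
    """
    n2 = z.norm2()
    if n2 <= 0:
        raise ValueError("polar form requires |z|^2 > 0")
    sign = 1.0 if z.x > 0 else -1.0
    return sign, math.sqrt(n2), math.atanh(z.y / z.x)
```

A quick experiment such as `polar(Hyperbolic(-3.0, 1.0))` followed by rebuilding the number from the returned triple illustrates the representation $z = \operatorname{sign}x\,|z|\,e^{j\theta}$, while `Hyperbolic(1.0, 2.0).norm2()` shows that the squared modulus can be negative, which is why the positive part of the algebra has to be singled out.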
A hyperbolic Hilbert space is a $G$-linear space (module), see [18], $E$ with a $G$-linear product: a map $(\cdot, \cdot): E \times E \to G$ that is

1) linear with respect to the first argument: $(az + bw, u) = a(z, u) + b(w, u)$, $a, b \in G$, $z, w, u \in E$;
2) symmetric: $(z, u) = \overline{(u, z)}$;
3) nondegenerate: $(z, u) = 0$ for all $u \in E$ iff $z = 0$.

If we consider $E$ as just an $R$-linear space, then $(\cdot, \cdot)$ is a bilinear form which is not positive definite. In particular, in the two dimensional case we have the signature $(+, -, +, -)$.
As in the ordinary quantum formalism, we represent physical states by normalized vectors of the hyperbolic Hilbert space: $\varphi \in E$ and $(\varphi, \varphi) = 1$. We shall consider only dichotomic physical variables and quantum states belonging to the two dimensional Hilbert space. So everywhere below $E$ denotes the two dimensional space. Let $A = a_1, a_2$ and $B = b_1, b_2$ be two dichotomic physical variables. We represent them by $G$-linear operators $|a_1\rangle\langle a_1| + |a_2\rangle\langle a_2|$ and $|b_1\rangle\langle b_1| + |b_2\rangle\langle b_2|$, where $\{|a_i\rangle\}_{i=1,2}$ and $\{|b_i\rangle\}_{i=1,2}$ are two orthonormal bases in $E$.

Let $\varphi$ be a state (a normalized vector belonging to $E$). We can perform the following operation (which is well defined from the mathematical point of view). We expand the vector $\varphi$ with respect to the basis $\{|b_i\rangle\}_{i=1,2}$:

$$\varphi = \beta_1|b_1\rangle + \beta_2|b_2\rangle, \qquad (34)$$

where the coefficients (coordinates) $\beta_i$ belong to $G$. As the basis $\{|b_i\rangle\}_{i=1,2}$ is orthonormal, we get (as in the complex case) that

$$|\beta_1|^2 + |\beta_2|^2 = 1. \qquad (35)$$

However, we cannot automatically use Born's probabilistic interpretation for normalized vectors in the hyperbolic Hilbert space: it may be that $\beta_i \notin G_+$ (in fact, in the complex case we have $C = C_+$). We say that a state $\varphi$ is decomposable with respect to the system of states $\{|b_i\rangle\}_{i=1,2}$ ($B$-decomposable) if

$$\beta_i \in G_+. \qquad (36)$$

In such a case we can use Born's probabilistic interpretation of vectors in a hyperbolic Hilbert space: the numbers $q_i = |\beta_i|^2$, $i = 1, 2$, are interpreted as probabilities for the values $B = b_i$ for the $G$-quantum state $\varphi$.
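We now repeat these considerations for each state $|b_i\rangle$ by using the basis $\{|a_k\rangle\}_{k=1,2}$. We suppose that each $|b_i\rangle$ is $A$-decomposable. We have

$$|b_1\rangle = \beta_{11}|a_1\rangle + \beta_{12}|a_2\rangle, \quad |b_2\rangle = \beta_{21}|a_1\rangle + \beta_{22}|a_2\rangle, \qquad (37)$$

where the coefficients $\beta_{ik}$ belong to $G_+$. We have automatically

$$|\beta_{11}|^2 + |\beta_{12}|^2 = 1, \quad |\beta_{21}|^2 + |\beta_{22}|^2 = 1. \qquad (38)$$

We can use the probabilistic interpretation of the numbers $p_{11} = |\beta_{11}|^2$, $p_{12} = |\beta_{12}|^2$ and $p_{21} = |\beta_{21}|^2$, $p_{22} = |\beta_{22}|^2$: $p_{ik}$ is the probability for $a = a_k$ in the state $|b_i\rangle$.

Let us consider the matrices $B = (\beta_{ik})$ and $P = (p_{ik})$. As in the complex case, the matrix $B$ is unitary: the vectors $u_1 = (\beta_{11}, \beta_{12})$ and $u_2 = (\beta_{21}, \beta_{22})$ are orthonormal. The matrix $P$ is double stochastic.

By using the $G$-linear space calculation (the change of the basis) we get $\varphi = \alpha_1|a_1\rangle + \alpha_2|a_2\rangle$, where $\alpha_1 = \beta_1\beta_{11} + \beta_2\beta_{21}$ and $\alpha_2 = \beta_1\beta_{12} + \beta_2\beta_{22}$.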
We remark that decomposability is not transitive. In principle, $\varphi$ may be not $A$-decomposable, despite $B$-decomposability of $\varphi$ and $A$-decomposability of the $B$-system.

Suppose that $\varphi$ is $A$-decomposable. Therefore the coefficients $p_k = |\alpha_k|^2$ can be interpreted as probabilities for $a = a_k$ for the $G$-quantum state $\varphi$.

Let us consider states such that the coefficients $\beta_i$, $\beta_{ik}$ belong to $G_+^*$. We can uniquely represent them as

$$\beta_i = \pm\sqrt{q_i}\,e^{j\xi_i}, \quad \beta_{ik} = \pm\sqrt{p_{ik}}\,e^{j\gamma_{ik}}, \quad i, k = 1, 2.$$

We find that

$$p_1 = q_1 p_{11} + q_2 p_{21} + 2\epsilon_1\sqrt{q_1 p_{11} q_2 p_{21}}\,\cosh\theta_1, \qquad (39)$$

$$p_2 = q_1 p_{12} + q_2 p_{22} + 2\epsilon_2\sqrt{q_1 p_{12} q_2 p_{22}}\,\cosh\theta_2, \qquad (40)$$

where $\theta_i = \eta + \gamma_i$ and $\eta = \xi_1 - \xi_2$, $\gamma_1 = \gamma_{11} - \gamma_{21}$, $\gamma_2 = \gamma_{12} - \gamma_{22}$, and $\epsilon_i = \pm$. To find the right relation between the signs of the last terms in equations (39), (40) we use the normalization condition

$$|\alpha_1|^2 + |\alpha_2|^2 = 1 \qquad (41)$$

(which is a consequence of the normalization of $\varphi$ and orthonormality of the system $\{|a_i\rangle\}_{i=1,2}$). It is equivalent to the equation (condition of orthogonality in the hyperbolic case, see section 8):

$$\sqrt{p_{12} p_{22}}\,\cosh\theta_2 \pm \sqrt{p_{11} p_{21}}\,\cosh\theta_1 = 0.$$

Thus we have to choose opposite signs in equations (39), (40). Unitarity of $B$ also implies that $\theta_1 - \theta_2 = 0$, so $\gamma_1 = \gamma_2$. We recall that in ordinary quantum mechanics we have similar conditions, but trigonometric functions are used instead of hyperbolic ones and the phases $\gamma_1$ and $\gamma_2$ are such that $\gamma_1 - \gamma_2 = \pi$.

Finally, we get that (unitary) linear transformations in the $G$-Hilbert space (in the domain of decomposable states) represent the hyperbolic transformation of probabilities (see section 8):

$$p_1 = q_1 p_{11} + q_2 p_{21} \pm 2\sqrt{q_1 p_{11} q_2 p_{21}}\,\cosh\theta,$$

$$p_2 = q_1 p_{12} + q_2 p_{22} \mp 2\sqrt{q_1 p_{12} q_2 p_{22}}\,\cosh\theta.$$

This is a kind of hyperbolic interference.

There can be some connection with quantization in Hilbert spaces with indefinite metric, as well as with the theory of relativity. However, at the moment we cannot say anything definite. It seems that by using Lorentz rotations we can produce hyperbolic interference in a similar way as we produce the standard trigonometric interference by using ordinary rotations.
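The algebraic step behind (39) can be checked numerically: for two hyperbolic amplitudes written in polar form, the squared modulus of their sum contains a $\cosh$ interference term. The sketch below represents hyperbolic numbers as plain `(x, y)` tuples; all function names are assumptions made for this illustration, and only the choice of plus signs in the polar forms is covered.

```python
import math


def hmul(a, b):
    """Product of hyperbolic numbers a = (x1, y1), b = (x2, y2), with j*j = 1."""
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def hadd(a, b):
    return (a[0] + b[0], a[1] + b[1])


def hnorm2(a):
    """Squared modulus x^2 - y^2."""
    return a[0] ** 2 - a[1] ** 2


def hpolar(r, phase):
    """The hyperbolic number r * e^{j*phase}."""
    return (r * math.cosh(phase), r * math.sinh(phase))


def alpha1_norm2(q1, q2, p11, p21, xi1, xi2, g11, g21):
    """|alpha_1|^2 for alpha_1 = beta_1*beta_11 + beta_2*beta_21 (plus signs only)."""
    beta1, beta2 = hpolar(math.sqrt(q1), xi1), hpolar(math.sqrt(q2), xi2)
    beta11, beta21 = hpolar(math.sqrt(p11), g11), hpolar(math.sqrt(p21), g21)
    return hnorm2(hadd(hmul(beta1, beta11), hmul(beta2, beta21)))


def rhs_39(q1, q2, p11, p21, xi1, xi2, g11, g21):
    """Right-hand side of (39) with epsilon_1 = +1 and theta_1 = (xi1 - xi2) + (g11 - g21)."""
    theta1 = (xi1 - xi2) + (g11 - g21)
    return q1 * p11 + q2 * p21 + 2 * math.sqrt(q1 * p11 * q2 * p21) * math.cosh(theta1)
```

The two functions agree for arbitrary positive inputs; the additional constraints discussed in the text (normalization and unitarity of the matrix of coefficients) are what restrict the phases in an actual hyperbolic state.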
6 Physical consequences
The wave-particle dualism was created to explain the interference phenomenon for massive elementary particles. In particular, the orthodox Copenhagen interpretation was proposed to find a compromise between corpuscular and wave features of elementary particles. The idea of superposition of distinct properties is in fact based on these interference experiments. It is well known that the orthodox Copenhagen interpretation is not free of difficulties (in particular, collapse of the wave function) and even paradoxes (see, for example, Schrödinger [19]). Problems in the orthodox Copenhagen interpretation have even induced attempts to exclude corpuscular objects from quantum theory altogether; see, for example, [20] for Schrödinger's critique of the classical concept of a particle. At the moment there is only one alternative to the orthodox Copenhagen interpretation, namely Einstein's statistical interpretation. By this interpretation the wave function describes distinct statistical features of an ensemble of elementary particles; see L. Ballentine [17] for the details (see also [16], [5], [10], [11]).

However, we must recognize that Einstein's statistical approach could not solve the fundamental problem of quantum theory: it could not explain the appearance of NEW STATISTICS in the purely corpuscular model. We did this in the present paper. On the one hand, this is a strong argument in favour of the statistical interpretation of quantum mechanics. On the other hand, one of the main motivations to use the wave-particle duality has disappeared.

Nevertheless, our investigation cannot be considered as a crucial argument against the wave-particle duality. It is clear that by using purely mathematical analysis we cannot prove or disprove a physical theory. The only thing that we proved is that corpuscular objects (that have no wave features) can exhibit NEW STATISTICS.

In fact, we obtained essentially more than planned: these NEW STATISTICS are not reduced to QUANTUM STATISTICS. In principle, we can propose experiments that induce TRIGONOMETRIC, HYPERBOLIC and HYPER-TRIGONOMETRIC STATISTICS.

We remark that the quantum probabilistic transformation

$$P = P_1 + P_2 + 2\sqrt{P_1 P_2}\,\cos\theta$$

gives the possibility to predict the probability $P$ if we know the probabilities $P_1$ and $P_2$. In principle, theories might be created that are based on arbitrary transformations

$$P = F(P_1, P_2).$$

It may be that some rules have linear space representations over exotic number systems, for example, p-adic numbers [20].
A preliminary analysis of the probabilistic foundations of quantum mechanics (that induced the present investigation) was performed in the books [11] and [21] (chapter 2); a part of the results of this paper was presented in the preprints [22]-[24].
Acknowledgements
I would like to thank S. Albeverio, L. Accardi, L. Ballentine, V. Belavkin, E. Beltrametti, W. De Muynck, S. Gudder, T. Hida, A. Holevo, P. Lahti, A. Peres, J. Summhammer and I. Volovich for (sometimes critical) discussions on probabilistic foundations of quantum mechanics.
References

1. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1995).
2. R. Feynman and A. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
3. L. Accardi, The probabilistic roots of the quantum mechanical paradoxes. In: The wave-particle dualism. A tribute to Louis de Broglie on his 90th Birthday, ed. S. Diner, D. Fargue, G. Lochak and F. Selleri (D. Reidel Publ. Company, Dordrecht, 297-330, 1984).
4. B. d'Espagnat, Veiled Reality. An analysis of present-day quantum mechanical concepts (Addison-Wesley, 1995).
5. A. Peres, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers, 1994).
6. J. von Neumann, Mathematical foundations of quantum mechanics (Princeton Univ. Press, Princeton, N.J., 1955).
7. E. Schrödinger, Philosophy and the Birth of Quantum Mechanics. Edited by M. Bitbol, O. Darrigol (Editions Frontieres, 1992).
8. J. M. Jauch, Foundations of Quantum Mechanics (Addison-Wesley, Reading, Mass., 1968).
9. P. Busch, M. Grabowski, P. Lahti, Operational Quantum Physics (Springer Verlag, 1995).
10. W. De Muynck, W. De Baere, H. Martens, Found. Phys. 24, 1589-1663 (1994).
11. A. Yu. Khrennikov, Interpretations of probability (VSP Int. Publ., Utrecht, 1999).
12. J. Summhammer, Int. J. Theor. Phys. 33, 171-178 (1994).
13. R. von Mises, The mathematical theory of probability and statistics (Academic, London, 1964).
14. A. N. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung (Springer Verlag, Berlin, 1933); reprinted: Foundations of the Probability Theory (Chelsea Publ. Comp., New York, 1956).
15. W. Heisenberg, Z. Physik 43, 172 (1927).
16. L. E. Ballentine, Quantum mechanics (Englewood Cliffs, New Jersey, 1989).
17. L. E. Ballentine, Rev. Mod. Phys. 42, 358-381 (1970).
18. A. Yu. Khrennikov, Superanalysis (Kluwer Academic Publishers, Dordrecht, 1999).
19. E. Schrödinger, Die Naturwiss. 23, 807-812, 824-828, 844-849 (1935).
20. E. Schrödinger, What is an elementary particle? In: Gesammelte Abhandlungen (Vieweg and Son, Wien, 1984).
21. A. Yu. Khrennikov, p-adic valued distributions in mathematical physics (Kluwer Academic Publishers, Dordrecht, 1994).
22. A. Yu. Khrennikov, Ensemble fluctuations and the origin of quantum probabilistic rule. Rep. MSI, Vaxjo Univ., 90, October (2000).
23. A. Yu. Khrennikov, Classification of transformations of probabilities for preparation procedures: trigonometric and hyperbolic behaviours. Preprint quant-ph/0012141, 24 Dec. (2000).
24. A. Yu. Khrennikov, Hyperbolic quantum mechanics. Preprint quant-ph/0101002, 31 Dec. (2000).
NONCONVENTIONAL VIEWPOINT TO ELEMENTS OF PHYSICAL REALITY BASED ON NONREAL ASYMPTOTICS OF RELATIVE FREQUENCIES

ANDREI KHRENNIKOV

International Center for Mathematical Modeling in Physics and Cognitive Sciences,
MSI, University of Vaxjo, S-35195, Sweden

Email: Andrei.Khrennikov@msi.vxu.se
We study the connection between stabilization of relative frequencies and elements of physical reality. We observe that, besides the standard stabilization with respect to the real metric, other statistical stabilizations can be considered (in particular, with respect to the so-called p-adic metric on the set of rational numbers). Nonconventional statistical stabilizations might be connected with new (nonconventional) elements of reality. We present a few natural examples of statistical phenomena in which relative frequencies of observed events stabilize in the p-adic metric but fluctuate in the standard real metric.
1 Introduction
The present methodology of physical measurements is based on the principle of the statistical stabilization of relative frequencies in a long run of trials. In the mathematical model this principle is represented by the law of large numbers. This approach to measurements is induced by the human representation of physical reality as a reality of stable repetitive phenomena. In the process of evolution we created cognitive structures that correspond to elements of this repetitive physical reality. All modern physical investigations are oriented to the creation of new elements of such a reality.
It must be remarked that the notion of stabil ization (of relative frequenshycies) plays the fundamental role in the creation of this reality I would like to point out that the conventional meaning of stabilization is based on real numbers When we say stabilization we mean the stabilization with respect to the standard real metric pn(xy) = |x mdash y| (the distance between points x and y on the real line R) Of course such a choice of the metric that deshytermines statistically elements of physical reality was not just a consequence of the development of one special mathematical theory real analysis b It
a W e ask the reader not connect our vague (common sense) use of the notion of an element of physical reality with the EPR sufficient condition to be an element of reality [1] bNevertheless we must not forget that the human factor played the large role in the expendshying of the (presently dominating) model of physical reality based on real numbers At the beginning Newtons analysis was propagated as a kind of religion There were (in particular
202
seems that the notion of ^-stabilization was induced by human practice in that quantities n laquo N were not important We created real physical reality because we used smallness based on the standard order on the set of natural numbers
It must be underlined that in modern physics the real physical reality (ie reality based on the 9R-stability) is in fact identified with the whole physical reality
On the other hand the modern mathematics is not more just a real analshyysis In particular the development of general topology [2] [3] induced large spectrum of new nearness (in particular metric) structures In principle we need not more identify any stabilization with the p^-stabilization There apshypears a huge set of new possibilities to introduce new forms of stability in physical experiments Moreover new stable structures can be considered as new elements of physical reality that in general need not belong the standard real reality
This idea was presented for the first time in authors investigations [4] [5] on so called p-adic physics [6]- [10] Later we tried to find the place of p-adic probabilities in quantum physics [11] [12] (in particular to justify on the mathematical level of rigorousness the use of negative and complex probabilishyties as well as create models with hidden variables that do not produce Bells inequality) In this paper we give the brief introduction into these probabilisshytic models as well as present a few rather natural examples in that relative frequencies of events stabilize with respect to so called p-adic metric but flucshytuate with respect to pR There is no corresponding element of the real reality But there is an element of the p-adic reality The objects considered in examshyples could be created on the hard-level In particular to create a plantation in that a colour of the flower (red or white) is the element of p-adic reality I need just a tractor and (sufficiently large) peace of land Nevertheless I must agree that such a p-adic element of reality were never observed in naturally created physical objects
The reader may be interested in why we concentrate on statistical stabilization with respect to the p-adic numbers, that is, on p-adic frequency probability theory. The main reason is that p-adic numbers are in fact the unique alternative to real numbers: there is no other way to complete the field of rational numbers with respect to an absolute value and obtain a new number field (Ostrovskii's theorem; see, for example, [13], [14]).
Our probabilistic foundations are based on a generalization of R. von Mises's frequency theory of probability [15], [16]. At the beginning of this century, when the foundations of modern probability theory were being laid, the frequency definition of probability proposed by von Mises played an important role. In particular, it was this definition of probability that Kolmogorov used to motivate his axioms of probability theory (see [17]). We also begin the construction of the new theory of probability with a frequency definition of probability.
in France) divine services devoted to Newton's analysis
Von Mises defined the probability of an event as the limit of the relative frequencies of the occurrence of the event when the volume of the statistical sample tends to infinity. This definition is the foundation of mathematical statistics (see, for example, Cramér [18]), in which von Mises's definition is formulated as the principle of statistical stabilization of relative frequencies.
In this paper we propose a general principle of statistical stabilization of relative frequencies. By virtue of this principle, statistical stabilization of relative frequencies $\nu = n/N$ can be considered not only in the real topology on $\mathbf{Q}$ (all relative frequencies are rational numbers) but also in any other topology on $\mathbf{Q}$. Then the probabilities of events belong to the corresponding completion of the field of rational numbers. As special cases we obtain the ordinary real probability theory (von Mises's definition) and the p-adic probability theories, $p = 2, 3, 5, \ldots$
How should one choose the topology of statistical stabilization for a given statistical sample? The topology is determined by the properties of the studied probability model. In essence, we propose the following principle: for each probability model there is a corresponding topology (or topologies) of statistical stabilization.
For example, in a random sample there need not be any statistical stabilization of the relative frequencies in the real metric. Thus, from the point of view of real probability theory, this is not a probabilistic object. However, in this random sample one may observe p-adic statistical stabilization of the relative frequencies.
In essence, I am asserting that the foundation of probability theory is provided by rational numbers (relative frequencies) and not by real numbers. Real probabilities of events merely represent one of many possibilities that arise in the statistical analysis of a random sample. Such an approach to probability theory agrees well with Volovich's proposition that rational numbers are the foundation of theoretical physics [19]. In accordance with this proposition, everything physical is rational, and number fields that are different from the field of rational numbers arise as an idealization needed for the theoretical description of physical results.
All necessary information on p-adic (and more general m-adic) numbers can be found in Appendix 1 of this paper. However, in the first two sections they are hardly used at all, and we may restrict ourselves to the remark that, in addition to the completion of the field of rational numbers $\mathbf{Q}$ with respect to the real metric, there also exist completions with respect to other metrics, and among these completions there are the fields of p-adic numbers $\mathbf{Q}_p$, $p = 2, 3, 5, \ldots$
2 Analysis of the foundation of probability theory
2.1 Frequency Definition of Probability
As is well known, the frequency definition of probability proposed by von Mises [15] in 1919 played an important role in the construction of the foundations of modern probability theory. This definition exerted a strong influence on the theory of probability measures, the foundations of which were laid by Borel [20], Kolmogorov [17], and Fréchet [21]. There is no point in giving Kolmogorov's axioms here (they can be found in any textbook on probability theory), but it is probably necessary to recall the general features of the main propositions of von Mises's theory of probability. The theory is based on infinite sequences $x = (x_1, x_2, \ldots, x_n, \ldots)$ of samplings or observations. If an experiment having $S$ outcomes is made, then $x_j$ can take the values $1, 2, \ldots, S$ (the possible outcomes). For the standard experiment of coin tossing we have $S = 2$ and $x_j = 1, 2$. In what follows, the possible outcomes of an experiment will be called labels.
However, not every such sequence is regarded as an object of probability theory. The fundamental principle of the frequency theory of probability is the principle of statistical stabilization of the relative frequencies of occurrence of a particular label, and only sequences of samplings that satisfy this principle are regarded as objects of probability theory. Such sequences of samplings are called collectives.
A collective is a bulk phenomenon or a repeated process, in brief, a series of individual observations for which one is justified in assuming that the relative frequency of occurrence of each individual observable label tends to a definite limiting value [16].
The probability of an event $E$ is defined as the limit of the sequence of frequencies $\nu^{(E)} = n/N$, where $n$ is the number of cases in which the event $E$ is detected in the first $N$ tests.
For the subsequent considerations it is important to note that in the statistical analysis of the results of an experiment only rational numbers (relative frequencies) are obtained.
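The remark that only rational numbers are ever observed can be made concrete with a minimal sketch. The function name `relative_frequencies` and the short coin sequence below are illustrative assumptions, not part of von Mises's theory; the frequencies are kept as exact fractions so that no real-number rounding enters.

```python
from fractions import Fraction

def relative_frequencies(samples, label):
    """Exact relative frequencies n/N of `label` after N = 1, 2, ... observations."""
    freqs = []
    n = 0
    for N, x in enumerate(samples, start=1):
        if x == label:
            n += 1
        freqs.append(Fraction(n, N))  # always a rational number
    return freqs

# Hypothetical finite piece of a coin-tossing sequence with labels 1 and 2.
coin = [1, 2, 2, 1, 1, 2, 1, 1]
nu = relative_frequencies(coin, 1)
print([str(f) for f in nu])
```

Whether such a sequence of fractions stabilizes depends on the topology in which the limit is taken, which is the subject of the general principle discussed in this paper.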
The principle of statistical stabilization of the relative frequencies is used practically unchanged in mathematical statistics.
Observation of the frequency $\nu^{(E)}$ of a fixed event $E$ for increasing values of $N$ reveals that this frequency has, generally speaking, a tendency to take a more or less constant value at large $N$ (see Cramér [18]).
In defining a collective, von Mises used a further principle, the principle of irregularity of a sequence of tests, i.e., invariance of the limit of the relative frequencies with respect to the selection, made using a definite law, of some subsequence from a given sequence of tests $x = (x_1, x_2, \ldots, x_n, \ldots)$. It is important that the law of this selection should not be based on the difference of the elements of the sequence with respect to the considered label.
Second, this limiting value must remain unchanged if from the complete sequence we choose arbitrarily any part and consider in what follows only this part [16].
This principle, like the principle of statistical stabilization of the relative frequencies, is fully in accord with our intuitive ideas of randomness. However, there are some logical difficulties here, associated with the arbitrariness of the choice. A detailed analysis of these logical problems was made by Khinchin [22]; see also [12] for the details. It appears that one must agree with Khinchin's critical comments and consider the frequency theory of probability that is based only on von Mises's first principle, the principle of statistical stabilization of the relative frequencies.
As is noted in [22], the frequency theory of probability based solely on von Mises's first principle is axiomatized and is as rigorous a mathematical theory as Kolmogorov's theory of probability. Here we do not intend to consider von Mises's theory of probability in the framework of an axiomatic approach. Our task is to analyze the principle of stabilization of the frequencies of occurrence of a particular event in a collective.
2.2 Von Mises Frequency Theory of Probabilities as Objective Foundation of Kolmogorov's Axiomatics
As motivation for his axioms, Kolmogorov used the properties of limits of relative frequencies; see [17]. We shall be interested in the manner in which Kolmogorov's axiom 2 arose: in accordance with this axiom, the probability $P(E)$ of any event $E$ is a nonnegative real number $\le 1$. In [17] Kolmogorov considers von Mises's definition [16] of probability as the limit of the relative frequencies of occurrence of the event $E$. Further, since the relative frequencies $\nu^{(E)} = n/N$ are rational numbers that lie between zero and unity, their limits in the real topology are real numbers between zero and unity. Cramér proceeded similarly in the construction of his theory of probability distributions [18].
Khinchin, discussing the advantages of Kolmogorov's axioms over von Mises's frequency theory of probability, noted that, from the formal aspect, the mutual relationship between the axiomatic and frequency theories is characterized in the first place by a higher degree of abstraction of the former.
This higher degree of abstraction was the foundation of the successful development of the theory of probability measures. However, this degree of abstraction is too high, and some properties of the world of real frequencies are lost in it. Essentially, the rational numbers were lost in Kolmogorov's theory of probability. Whereas in von Mises's theory the rational numbers arise as primary objects and real probabilities are obtained as a result of a limiting process for rational frequencies, in Kolmogorov's theory rational frequencies are secondary objects associated with real probabilities (which are here primary) by means of the law of large numbers.
3 General principle of statistical stabilization of relative frequencies
First we emphasize that the probabilities $P$ in von Mises's frequency theory are ideal objects (symbols to denote the sequences of relative frequencies that are stabilized in the field of real numbers). Therefore real numbers arise here as ideal objects associated with rational sequences of frequencies (see also Borel [20] and Poincaré [23]).
A basis for a broader view of probability theory is provided by the following principle of statistical stabilization of frequencies.
Statistical stabilization (the limiting process) can be considered not only in the real topology on the field of rational numbers $\mathbf{Q}$ but also in any other topology on $\mathbf{Q}$. The probabilities of events are defined as the limits of the sequences of relative frequencies in the corresponding completions of the field of rational numbers.
For each considered probability model there is a corresponding topology on the field of rational numbers. The metrizable topologies on $\mathbf{Q}$ given by absolute values are the most interesting. By virtue of Ostrovskii's theorem there are very few such topologies: besides the usual real topology, for which $\rho(x, y) = |x - y|$, there exist only the p-adic topologies, $p = 2, 3, \ldots$, where $\rho_p(x, y) = |x - y|_p$. Thus, if we consider only topologies given by absolute values, then besides the usual probability theory over $\mathbf{R}$ we obtain only the probability theories over $\mathbf{Q}_p$.
It is necessary here to introduce a natural restriction on the topology of statistical stabilization.
The completion $\mathbf{Q}_t$ of the field of rational numbers $\mathbf{Q}$ with respect to the statistical stabilization topology $t$ is a topological field.
We have deliberately not introduced this restriction into the general principle of statistical stabilization. One can also consider statistical stabilization topologies that are not consistent with the algebraic structure on $\mathbf{Q}$. However, probability theory based on such topologies loses many familiar properties. For example, it turns out that the continuity of the addition operation is equivalent to additivity of probabilities, and continuity of the division operation is equivalent to the existence of conditional probabilities.
Let $x = (x_1, x_2, \ldots, x_n, \ldots)$ be some collective. We denote the set of all labels for this collective (the possible outcomes of an experiment producing this collective) by the symbol $\Pi$. We denote by $\Omega$ the event consisting in the realization of at least one of the labels $\pi \in \Pi$.
Proposition 3.1. The probability of the event $\Omega$ is equal to unity.
To prove this, it is sufficient to use the fact that all the relative frequencies $\nu^{(\Omega)}$ are equal to unity.
Let $\nu^{(j)}$, $j = 1, 2$, be the relative frequencies of realization of certain labels $\pi_1$ and $\pi_2$, and let $P_j = \lim \nu^{(j)}$ be the corresponding probabilities. Let the event $A$ be the realization of the label $\pi_1$ or $\pi_2$: $A = \pi_1 \vee \pi_2$. Using the continuity of the addition operation, we obtain
$$P(A) = \lim \nu^{(A)} = \lim\bigl(\nu^{(1)} + \nu^{(2)}\bigr) = \lim \nu^{(1)} + \lim \nu^{(2)} = P_1 + P_2. \tag{1}$$
This rule can be generalized to any number of mutually exclusive events.
Proposition 3.2. Let $A_j$, $j = 1, \ldots, k$, be mutually exclusive events (i.e., the sets of labels that define these events are disjoint). Then
$$P(A_1 \vee \cdots \vee A_k) = \sum_{j=1}^{k} P(A_j). \tag{2}$$
Using the continuity of the subtraction operation, we obtain the following proposition.
Proposition 3.3. For any two events $A$ and $B$ the equation $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$ holds.
In the language of collectives, the rule of addition of probabilities is formulated as follows (see [16]). Beginning with an original collective possessing more than two labels, an appreciable number of new collectives can be constructed by uniting labels: the elements of the new collective are the same as in the original one, but their labels are unifications of the labels of the original collective. To the unification of labels there corresponds the addition of frequencies.
We consider the set of rational numbers $U = \{x \in \mathbf{Q} : 0 \le x \le 1\}$. We denote by the symbol $U_t$ the closure of the set $U$ in the field $\mathbf{Q}_t$ (if $t$ is the ordinary real topology, then $U_t = [0, 1]$). An obvious consequence of the definition of probabilities is the following proposition.
Proposition 3.4. The probability $P(E)$ of any event belongs to the set $U_t$.
Conditional probabilities are then introduced into the frequency theory in the same way as in [16]. Suppose there is some initial collective $x = (x_1, x_2, \ldots, x_n, \ldots)$ with probabilities $P_\pi$ of the labels $\pi \in \Pi$. Using the unification rule, we define the probabilities of all groups of labels:
$$P(A) = \sum_{\pi \in A} P_\pi. \tag{3}$$
We fix some group of labels $B = \pi_{i_1} \vee \cdots \vee \pi_{i_k}$. We are interested in the conditional probability $P(\pi \mid B)$, $\pi \in B$, of the label $\pi$ given the condition $B$. We form a new collective $x' = (x'_1, x'_2, \ldots, x'_n, \ldots)$, which is obtained from the original one by choosing only the elements with labels $\pi \in B$. The probability of the label $\pi$ in this new collective is then called the conditional probability of the label $\pi$ under the condition $B$: $P(\pi \mid B) = \lim \nu^{(\pi \mid B)}$, where $\nu^{(\pi \mid B)}$ are the relative frequencies of the label $\pi$ in the new collective. Noting that $\nu^{(\pi \mid B)} = \nu^{(\pi)} / \nu^{(B)}$, where $\nu^{(\pi)}$ is the relative frequency of the label $\pi$ in the collective $x$ and $\nu^{(B)}$ is the relative frequency of the event $B$ in the collective $x$, we obtain (using the continuity of the division operation)
$$P(\pi \mid B) = \lim \frac{\nu^{(\pi)}}{\nu^{(B)}} = \frac{\lim \nu^{(\pi)}}{\lim \nu^{(B)}} = \frac{P_\pi}{P(B)}, \qquad P(B) \ne 0. \tag{4}$$
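The frequency identity behind the conditional probability formula can be checked on a finite piece of a collective. The following is only a sketch: the sample sequence, the label set `B`, and the helper names `frequency` and `condition_on` are illustrative assumptions.

```python
from fractions import Fraction

def frequency(samples, labels):
    """Exact relative frequency of the event 'the label belongs to `labels`'."""
    labels = set(labels)
    hits = sum(1 for x in samples if x in labels)
    return Fraction(hits, len(samples))

def condition_on(samples, B):
    """New collective: keep only the elements whose label belongs to B."""
    B = set(B)
    return [x for x in samples if x in B]

# Hypothetical finite sample with labels 1, 2, 3 and the condition B = {1, 2}.
x = [1, 3, 2, 1, 3, 3, 2, 1, 1, 2]
B = {1, 2}
x_prime = condition_on(x, B)

lhs = frequency(x_prime, {1})                 # frequency of label 1 in the new collective
rhs = frequency(x, {1}) / frequency(x, B)     # ratio of frequencies in the original one
print(lhs, rhs)
```

The identity holds exactly at every finite stage; passing to the limit is where the continuity of division in the chosen completion of the rational numbers is used.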
The general formula can be proved similarly.
Proposition 3.5. $P(A \mid B) = P(A \wedge B)/P(B)$, $P(B) \ne 0$.
We now introduce the concept of independence of events. Analyzing the arguments in the book [16], one notes that the rule of multiplication of probabilities for independent events is equivalent to the continuity of the multiplication operation.
An important property that makes it possible to use p-adic probabilities when considering standard problems of probability theory is the p-adic interpretation of the probabilities zero and one (which are probabilities in the sense of ordinary probability theory).
Indeed, the equation $P(E) = 0$ in ordinary probability theory does not mean that the event $E$ is impossible. It merely means that in a long series of experiments the event $E$ occurs in a very small fraction of cases. However, in a large number of experiments this fraction can be relatively large. Moreover, the equation $P(E) = 0$ lumps together a huge class of events that intuitively appear to have different probabilities. For example, suppose we consider two events $E_1$ and $E_2$, and in the first
$$N = N_k = \Bigl(\sum_{j=0}^{k} 2^j\Bigr)^2 \tag{5}$$
trials the event $E_1$ is realized $n^{(1)} = 2^k$ times and the event $E_2$ is realized
$$n^{(2)} = \sum_{j=0}^{k} 2^j \tag{6}$$
times. It is intuitively clear that the probabilities of these events must be different. However, in real probability theory
$$P_1 = \lim \frac{n^{(1)}}{N} = P_2 = \lim \frac{n^{(2)}}{N} = 0. \tag{7}$$
It is different in 2-adic probability theory. Stabilization in the 2-adic topology gives $P_1 = 0$, $P_2 = -1$, since in $\mathbf{Q}_2$ we have $2^k \to 0$, $k \to \infty$, and for $-1$ we have the representation $-1 = 1 + 2 + 2^2 + \cdots + 2^n + \cdots$ (so that $N_k \to 1$). Here we encounter for the first time negative numbers as probabilities of events (compare with Wigner [24], Dirac [25], Feynman [26]; see also [27], [28], [12]). Of course, these probabilities are forbidden by Kolmogorov's second axiom in ordinary probability theory (in von Mises's approach they are forbidden by the choice of the topology of statistical stabilization). However, from the point of view of the frequency theory of probability, $P = -1$ is only an ideal object, the symbol that denotes the limit of a sequence of relative frequencies. This symbol is in no way better and in no way worse than the symbol $P = 1/\pi$ in ordinary probability theory.
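The two limits in this example can be examined numerically. The sketch below is an illustration under the stated definitions of $N_k$, $n^{(1)}$ and $n^{(2)}$; the helper names `norm_p` and `frequencies` are assumptions made for this illustration. It prints the real values of both frequencies together with their 2-adic distances to the candidate limits 0 and -1.

```python
from fractions import Fraction

def norm_p(x, p):
    """p-adic absolute value |x|_p of a rational number x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r, n, d = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p
        r += 1
    while d % p == 0:
        d //= p
        r -= 1
    return Fraction(1, p) ** r

def frequencies(k):
    """Relative frequencies n1/N and n2/N for N = N_k."""
    s = 2 ** (k + 1) - 1          # sum_{j=0}^{k} 2^j
    N = s * s                     # number of trials N_k
    return Fraction(2 ** k, N), Fraction(s, N)

for k in (2, 5, 10, 20):
    nu1, nu2 = frequencies(k)
    # real values, then 2-adic distances |nu1 - 0|_2 and |nu2 - (-1)|_2
    print(k, float(nu1), float(nu2), norm_p(nu1, 2), norm_p(nu2 + 1, 2))
```

Both real values shrink towards zero, while the 2-adic distances $|\nu^{(1)}|_2 = 2^{-k}$ and $|\nu^{(2)} + 1|_2 = 2^{-(k+1)}$ also tend to zero, which separates the two events in $\mathbf{Q}_2$.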
In this example negative p-adic probabilities were used to split the zero conventional (real) probability. So p-adic negative probabilities can be interpreted as infinitely small conventional probabilities. It may be that all negative probabilities that appear in quantum physics can be interpreted in this way. If a conventional (real) probability is equal to zero, there is no conventional (real) element of reality. However, there is a nonconventional (p-adic) element of reality that is realized with negative probability. Real and p-adic probabilities correspond to different classes of measurement procedures. An element of reality that would be impossible to observe by using a real measurement procedure might be observed by using a p-adic measurement procedure.
One can treat similarly the case of a probability (in the sense of the ordinary theory) equal to unity. For example, suppose
$$N = N_k = \Bigl(\sum_{j=0}^{k} 2^j\Bigr)^2, \qquad n^{(1)} = \Bigl(\sum_{j=0}^{k} 2^j\Bigr)^2 - 2^k, \qquad n^{(2)} = \Bigl(\sum_{j=0}^{k} 2^j\Bigr)^2 - \sum_{j=0}^{k} 2^j. \tag{8}$$
In 2-adic probability theory we find that
$$P_1 = 1, \qquad P_2 = 1 - \sum_{j=0}^{\infty} 2^j = 2. \tag{9}$$
We see here that natural numbers not equal to unity also belong to the set $U_p$.
In this example p-adic (integer) probabilities that are larger than 1 were used to split the conventional (real) probability one. So, under the p-adic consideration, a conventional element of reality can be split into a few p-adic elements of reality.
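The stabilization in the second example can also be seen digit by digit. The sketch below expands the frequencies in powers of 2, lowest order first; the helper names `digits_in_Zp` and `frequencies` are assumptions for this illustration, and the three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
from fractions import Fraction

def digits_in_Zp(x, p, count):
    """First `count` p-adic digits (lowest order first) of a rational p-adic integer x."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x is not a p-adic integer")
    digits = []
    for _ in range(count):
        # the digit is the residue of x modulo p
        a = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(a)
        x = (x - a) / p
    return digits

def frequencies(k):
    """Relative frequencies n1/N and n2/N for N = N_k in the second example."""
    s = 2 ** (k + 1) - 1          # sum_{j=0}^{k} 2^j
    N = s * s
    return Fraction(N - 2 ** k, N), Fraction(N - s, N)

nu1, nu2 = frequencies(5)
print(float(nu1), digits_in_Zp(nu1, 2, 5))   # agrees with the digits of 1
print(float(nu2), digits_in_Zp(nu2, 2, 6))   # agrees with the digits of 2
```

For $k = 5$ both real values are already close to 1, while the leading 2-adic digits coincide with those of 1 and of 2, respectively.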
In the framework of p-adic statistical stabilization there is also nothing seditious about complex probabilities. For example, let $p \equiv 1 \pmod 4$. Then $i = (-1)^{1/2} \in \mathbf{Q}_p$. Let
$$i = i_0 + i_1 p + i_2 p^2 + \cdots, \qquad i_r = 0, 1, \ldots, p - 1, \tag{10}$$
be the canonical decomposition of the imaginary unit in powers of $p$. Note also that for any $p$
$$-1 = (p - 1) + (p - 1)p + (p - 1)p^2 + \cdots. \tag{11}$$
Then for rational relative frequencies we have
$$\nu_k = \frac{i_0 + i_1 p + \cdots + i_k p^k}{(p - 1) + (p - 1)p + \cdots + (p - 1)p^k} \to -i \tag{12}$$
in the p-adic topology.
Geometrically, one may suppose that the new probability theory is a transition from one-dimensional probabilities on the interval $[0, 1]$ to multidimensional probabilities.
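A small numerical sketch of this construction for $p = 5$ follows. The digits of a square root of $-1$ are obtained here by Hensel lifting, which is a technique chosen for this illustration and is not described in the text; the function names are likewise assumptions. Since $-i$ is not a rational number, closeness to it is tested indirectly, through $|\nu_k^2 + 1|_p$ and through the distance between $\nu_k$ and the negative of the truncated expansion of $i$.

```python
from fractions import Fraction

def norm_p(x, p):
    """p-adic absolute value |x|_p of a rational number x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    r, n, d = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p
        r += 1
    while d % p == 0:
        d //= p
        r -= 1
    return Fraction(1, p) ** r

def sqrt_minus_one_mod(p, digits):
    """Integer r in [0, p**digits) with r*r = -1 (mod p**digits), for a prime p = 1 (mod 4)."""
    r = next(a for a in range(2, p) if (a * a + 1) % p == 0)
    modulus = p
    for _ in range(digits - 1):
        modulus *= p
        # Hensel (Newton) step for f(r) = r^2 + 1, one more p-adic digit per step
        r = (r - (r * r + 1) * pow(2 * r, -1, modulus)) % modulus
    return r

def frequency(p, k):
    """Rational frequency nu_k built from the first k + 1 digits of i."""
    numerator = sqrt_minus_one_mod(p, k + 1)   # i_0 + i_1 p + ... + i_k p^k
    denominator = p ** (k + 1) - 1             # (p-1)(1 + p + ... + p^k)
    return Fraction(numerator, denominator)

for k in range(6):
    nu = frequency(5, k)
    i_k = sqrt_minus_one_mod(5, k + 1)
    print(k, nu, float(nu), norm_p(nu * nu + 1, 5), norm_p(nu + i_k, 5))
```

Each $\nu_k$ is an ordinary rational number between 0 and 1, yet $\nu_k^2 + 1$ and $\nu_k + i$ become 5-adically small, in agreement with a limit equal to a square root of $-1$.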
4 Probability distribution of a collective
Let $x = (x_1, \ldots, x_k, \ldots)$ be some collective and $\Pi$ be the set of labels of this collective. We consider the simplest case, when the set $\Pi$ is finite: $\Pi = \{1, \ldots, S\}$. We denote by $\nu^{(j)}$ the relative frequency of the $j$th label and by $P_j = \lim \nu^{(j)}$ the corresponding probability. In the frequency theory the set of probabilities $P_x = (P_1, \ldots, P_S)$ is called the probability distribution of the collective $x$.
The general principle of statistical stabilization makes it possible to consider not only real distributions but also distributions over other number fields. For one and the same collective $x$ there can exist distributions over different number fields. Thus, in the proposed approach a collective has, in general, an entire spectrum of distributions $P_{x,t} = (P_{1,t}, \ldots, P_{S,t})$, where $t$ are the topologies of statistical stabilization for the given collective. Therefore one studies here a more subtle structure of the collective. The relative frequencies are investigated not only for real stabilization but for a complete spectrum of stabilizations.
In connection with the existence of an entire spectrum of probability distributions of a collective, it is necessary to make some comments.
First, this agrees well with von Mises's principle that the collective comes first and the probabilities after. Indeed, a probability distribution is an object derived from a collective, and to one and the same collective there corresponds an entire spectrum of probability distributions, these reflecting different properties of the collective.
Second, each statistical stabilization determines some physical property of the investigated object. For example, if in a statistical experiment involving the tossing of a coin the probability of heads is $P_1$ and that of tails is $P_2$, then these probabilities are physical characteristics of the coin, like its mass or volume. This question is discussed in detail in the books of Poincaré [23] and von Mises [16].
If we consider the new principle of statistical stabilization from this point of view, we obtain new physical characteristics of the investigated objects. For example, if statistical stabilization is absent in the real topology, then it is not possible to obtain any physical constants in the language of ordinary probability theory. But these constants could exist and be, for example, p-adic numbers. If a collective has not only a real probability distribution but an entire spectrum of other distributions, then besides real constants corresponding to physical properties of the investigated object we obtain an entire spectrum of new constants corresponding to physical properties that were hidden from the real statistics. Note that these new constants can also be ordinary rational numbers.
5 Model examples of p-adic statistics
5.1 Plantation with Red and White Flowers
As one of the first examples of a collective, von Mises considered [16] a plantation sown with flowers of different colors, and he studied the statistical stabilization of the relative frequencies of each of the colors. We shall construct an analogous collective for which p-adic stabilization always occurs but real stabilization is, in general, absent.
Suppose there are flowers of two types, red (R) and white (W). The plantation (or rather an infinite bed) is sown in a random order with red and white flowers, the flowers being sown in series formed by blocks of $p^l$ flowers, the length of the series (the power of $p$) also being determined in accordance with a random rule.
Namely, suppose there are two generators of random numbers: 1) $j = 0, 1$; 2) $i = 1, 2$ (with probabilities 0.5). If $j = 0$, then a series of red flowers is sown; if $j = 1$, then a series of white ones. The length of each series is defined as follows: the length of the first series is some power $p^{l_1}$ (it can also be determined in accordance with a random rule); if the length of the previous series was $p^{l_m}$, then the length of the next series is $p^{l_{m+1}}$, $l_{m+1} = l_m + i_m$.
We introduce the relative frequencies of the red and white flowers in the first $m$ series: $\nu_m^{(R)} = n_m^{(R)}/N_m$, $\nu_m^{(W)} = n_m^{(W)}/N_m$.
Proposition 5.1. For all generators of the random numbers $j$ and $i$ there is statistical stabilization of the relative frequencies $\nu_m^{(R)}$ and $\nu_m^{(W)}$ in the p-adic topology.
Thus we have defined the p-adic probabilities $P_R = \lim \nu_m^{(R)}$ and $P_W = \lim \nu_m^{(W)}$, and
$$P_R = \Bigl(\sum_{n=1}^{\infty} (1 - j_n)\, p^{l_n}\Bigr) \Big/ \Bigl(\sum_{n=1}^{\infty} p^{l_n}\Bigr), \qquad P_W = \Bigl(\sum_{n=1}^{\infty} j_n\, p^{l_n}\Bigr) \Big/ \Bigl(\sum_{n=1}^{\infty} p^{l_n}\Bigr). \tag{13}$$
Note that in general there is no real statistical stabilization for such a random plantation. If the generator of the random numbers $j$ gives a series of 0s or 1s, then $\nu_m^{(R)}$ and $\nu_m^{(W)}$ can oscillate from zero to unity in the real topology.
Thus a real observer (an investigator who carries out a statistical analysis of the sample in the field of real numbers) cannot obtain any statistically regular law. He will obtain only a random variation of the series of real relative frequencies. In contrast, the p-adic observer (the investigator who makes a statistical analysis of the sample in the field of p-adic numbers) will obtain a well-defined law consisting of the stabilization of the digits in the p-adic decomposition of the relative frequencies.
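The two observers can be compared on a small simulated plantation. This is a minimal sketch, not the program used for Appendix 2: the function names, the seed, the choice $l_1 = 0$ and the use of Python's `random` module are all assumptions made for illustration (the three-argument `pow` with exponent -1 requires Python 3.8 or later).

```python
import random
from fractions import Fraction

def digits_in_Zp(x, p, count):
    """First `count` p-adic digits (lowest order first) of a rational p-adic integer x."""
    x = Fraction(x)
    digits = []
    for _ in range(count):
        a = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(a)
        x = (x - a) / p
    return digits

def plantation(p, series, seed=0, l1=0):
    """Return [(l_m, nu_m_red)] for the first `series` series of the random plantation."""
    rng = random.Random(seed)
    exponent = l1
    red = total = 0
    history = []
    for _ in range(series):
        j = rng.randint(0, 1)            # 0: series of red flowers, 1: series of white ones
        length = p ** exponent           # the series consists of p**l_m flowers
        if j == 0:
            red += length
        total += length
        history.append((exponent, Fraction(red, total)))
        exponent += rng.randint(1, 2)    # l_{m+1} = l_m + i_m with i_m in {1, 2}
    return history

history = plantation(p=2, series=12, seed=1)
for l_m, nu in history:
    real_value = float(nu)
    padic_digits = "".join(str(a) for a in digits_in_Zp(nu, 2, 16))
    print(f"l_m={l_m:2d}  real={real_value:.4f}  2-adic digits={padic_digits}")
```

In such runs the real column typically keeps moving, while the leading 2-adic digits freeze: a short computation shows that $\nu_{m+1}^{(R)} - \nu_m^{(R)}$ is divisible by $p^{\,l_{m+1} - l_1}$, so that many digits are already final after $m$ series.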
It is evident that in the example of probability theory we observe a new fundamental approach to the investigation of natural phenomena. In accordance with this approach, experimental results must be analyzed not only in the field of real numbers but also in p-adic fields.
Naturally, our example is purely illustrative, but it does appear to reflect many very important properties of p-adic statistics.
Remark 5.1. Intuitively, one supposes that in a real plantation it is possible to find a white flower next to almost every red flower; in contrast, large groups (clusters) of red and white flowers are distributed randomly over a p-adic plantation (one can sow not only a bed but also distribute series of red and white flowers over a plane in accordance with a random rule). A real random plane is obtained if one throws red and white points at random onto the plane; in contrast, a p-adic random plane is obtained if one throws patches of $p^l$ points at a time, of red and white color, onto the plane.
In Appendix 2 we give the results of a statistical analysis of a random modeling on a computer of the proposed probability model. There is very rapid p-adic stabilization of the relative frequencies and no stabilization in the sense of ordinary real probability theory.
Remark 5.2. Evidently, the structure of series formed by powers of $p$ need not necessarily be directly observed in a statistical sample. This structure is introduced by rounding the number of results to powers of $p$. In very large statistical samples one can take into account only the orders of the numbers, and one thereby introduces into the sample a 10-adic structure.
5.2 Random Choice of the Digit of a p-Adic Number
Suppose there are two labels, 1 and 2, and $j$ is a generator of random numbers corresponding to the choice of one of the labels. Each random label is produced in series, the length of the series being determined by a random choice of the next p-adic digit; i.e., there is a generator of random numbers $a$ that takes the values $a = 0, 1, \ldots, p - 1$, and the length of the next series is $a_n p^{n-1}$, $n = 1, 2, \ldots$ We introduce the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$.
Proposition 5.2. For all generators of the random numbers $j$ and $a$ there is statistical stabilization of the relative frequencies $\nu^{(1)}$ and $\nu^{(2)}$ in the p-adic topology.
Thus the following p-adic probabilities are defined:
$$P_1 = \Bigl(\sum_{n=1}^{\infty} (1 - j_n)\, a_n p^{n-1}\Bigr) \Big/ \Bigl(\sum_{n=1}^{\infty} a_n p^{n-1}\Bigr), \qquad P_2 = \Bigl(\sum_{n=1}^{\infty} j_n\, a_n p^{n-1}\Bigr) \Big/ \Bigl(\sum_{n=1}^{\infty} a_n p^{n-1}\Bigr). \tag{14}$$
In the real topology there is, in general, no statistical stabilization.
Appendix 1
Every rational number $x \ne 0$ can be represented in the form
$$x = p^r \frac{m}{n},$$
where $p$ does not divide $m$ and $n$. Here $p$ is a fixed prime. The p-adic absolute value (norm) of the rational number $x$ is defined by the equations $|x|_p = p^{-r}$, $x \ne 0$; $|0|_p = 0$. This absolute value has the usual properties 1) $|x|_p \ge 0$, $|x|_p = 0 \iff x = 0$; 2) $|xy|_p = |x|_p |y|_p$, and satisfies the strong triangle inequality 3) $|x + y|_p \le \max(|x|_p, |y|_p)$.
The completion of the field of rational numbers with respect to the metric $\rho_p(x, y) = |x - y|_p$ is called the field of p-adic numbers and is denoted by the symbol $\mathbf{Q}_p$. It is a locally compact field. Numbers in the unit ball $\mathbf{Z}_p = \{x \in \mathbf{Q}_p : |x|_p \le 1\}$ of the field $\mathbf{Q}_p$ are called integer p-adic numbers. From the strong triangle inequality we obtain a theorem which states that a series in the field $\mathbf{Q}_p$ converges if and only if its general term tends to zero. Any p-adic number can be represented in a unique manner in the form of a (convergent) series in powers of $p$:
$$x = \sum_{j=k}^{\infty} a_j p^j, \qquad a_j = 0, 1, \ldots, p - 1, \quad k = 0, \pm 1, \ldots, \tag{15}$$
with $|x|_p = p^{-k}$ (for $a_k \ne 0$).
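For rational numbers the definitions of this appendix are easy to compute with. The following sketch is an illustration only; the names `valuation`, `norm_p` and `padic_digits` are assumptions, and the three-argument `pow` with exponent -1 requires Python 3.8 or later. As an example it expands $1/26$ in $\mathbf{Q}_5$, which is the ratio $M_1/M$ in the first block of Appendix 3 if the digit strings there are read lowest order first.

```python
from fractions import Fraction

def valuation(x, p):
    """Exponent r in x = p**r * m/n with p dividing neither m nor n (x nonzero)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is not defined")
    r, n, d = 0, x.numerator, x.denominator
    while n % p == 0:
        n //= p
        r += 1
    while d % p == 0:
        d //= p
        r -= 1
    return r

def norm_p(x, p):
    """p-adic absolute value: |x|_p = p**(-r) for x != 0 and |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(1, p) ** valuation(x, p)

def padic_digits(x, p, count):
    """Return (k, [a_k, a_{k+1}, ...]) with x = sum_{j >= k} a_j p**j, for nonzero rational x."""
    x = Fraction(x)
    k = valuation(x, p)
    u = x / Fraction(p) ** k             # p-adic unit: |u|_p = 1
    digits = []
    for _ in range(count):
        a = (u.numerator * pow(u.denominator, -1, p)) % p
        digits.append(a)
        u = (u - a) / p
    return k, digits

print(norm_p(Fraction(50, 1300), 5), padic_digits(Fraction(1, 26), 5, 16))
print(padic_digits(-1, 2, 8))            # -1 = 1 + 2 + 2^2 + ... in Q_2

# Strong triangle inequality on a few sample pairs.
pairs = [(Fraction(1, 5), Fraction(3, 25)), (Fraction(7), Fraction(18)), (Fraction(5), Fraction(-5, 3))]
print(all(norm_p(a + b, 5) <= max(norm_p(a, 5), norm_p(b, 5)) for a, b in pairs))
```

Only finitely many digits of a rational number are computed here; the full expansion of a rational number is eventually periodic, as the example $1/26$ already suggests.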
One can define m-adic numbers similarly, where $m$ is any natural number, $m \ge 2$. In the general case property 2) is replaced by the weaker property $|xy|_m \le |x|_m |y|_m$, i.e., $|x|_m$ is a pseudonorm. The completion of the field $\mathbf{Q}$ in the metric $\rho_m(x, y) = |x - y|_m$ will not be a field (for $m$ that are not prime). It is only a ring. Here we already encounter some deviations from the ordinary probability rules (which can be extended without any changes to p-adic probabilities). For example, one can have a situation of the following kind: $A$ and $B$ are independent events, $P(A) \ne 0$ and $P(B) \ne 0$, but $P(A \wedge B) = 0$. In particular, the conditional probability $P(A \mid B)$ is in general not defined for an event $B$ having nonvanishing probability.
Appendix 2
We give here the results of a random experiment (modeled on a computer) for a 2-adic plantation. The results of this experiment give a good illustration of a situation in which there is no statistical stabilization in the real topology but there is statistical stabilization in the 2-adic topology. In the following tables $m$ is the number of a random experiment in which two random numbers are modeled, one corresponding to the choice of a flower and the other to the length of the series of this flower; $d$ is the number of elements in the sample. Because of the exponential growth of the number of elements in the series, $d$ increases very rapidly.
The table of relative frequencies in the field of real numbers is
m     d        $\nu^{(W)}$   $\nu^{(R)}$
4     10       0.1304        0.8696
5     10^2     0.6364        0.3636
6     10^3     0.1913        0.8087
7     10^3     0.0504        0.9496
12    10^5     0.0006        0.9994
13    10^5     0.5335        0.4665
14    10^6     0.1703        0.8297
22    10^9     0.0022        0.9978
23    10^10    0.7453        0.2547
Thus, for the relative frequencies in the field of real numbers there is no stabilization of even the first digit after the decimal point. We examined long sequences of experiments on the computer, in which the oscillations continued. The calculations in the field $\mathbf{Q}_2$ give the following results.
N = 10
$\nu^{(W)}$ = 101011111011000000110100010111011000110011011110110001011
$\nu^{(R)}$ = 001100000100111111001011101000100111001100100001001110100
N = 20
$\nu^{(W)}$ = 10101111101100111011001100101111110000011100111000000001
$\nu^{(R)}$ = 00110000010011000100110011010000001111100011000111111110
N = 30
$\nu^{(W)}$ = 101011111011001110110011001111111100000000100110110000011
$\nu^{(R)}$ = 001100000100110001001100110000000011111111011001001111100
N = 40
$\nu^{(W)}$ = 101011111011001110110011001111111100000000010111001110100
$\nu^{(R)}$ = 001100000100110001001100110000000011111111101000110001011
Thus, after ten random experiments 14 digits are stabilized in the 2-adic decomposition of the relative frequency of occurrence of a red flower and 14 digits for a white flower; after 20 experiments the number of stabilized digits is 27 for both colors; after 30 experiments 42 digits are stabilized for each, and so forth.
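The count of stabilized digits can be read off mechanically as the length of the common prefix of two digit strings. The sketch below applies this to the first 20 digits of the strings listed above for N = 10 and N = 20; the function name `stabilized_digits` is an assumption for this illustration, and the strings are copied from the lists above.

```python
def stabilized_digits(a, b):
    """Number of leading digits shared by two p-adic digit strings."""
    count = 0
    for da, db in zip(a, b):
        if da != db:
            break
        count += 1
    return count

# First 20 digits of the 2-adic expansions listed above for N = 10 and N = 20.
white_10 = "10101111101100000011"
white_20 = "10101111101100111011"
red_10 = "00110000010011111100"
red_20 = "00110000010011000100"

print(stabilized_digits(white_10, white_20), stabilized_digits(red_10, red_20))
```

Both counts equal 14, in agreement with the statement about the first ten experiments.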
Appendix 3
We give the results of the analysis of a statistical sample in the field of 5-adic numbers. Here $N$ is the number of random experiments, $M$ is the number of elements of the sample, $M_1$ is the number of elements with the first label, and $M_2$ is the number of elements with the second label. All numbers are written by their 5-adic digits, starting from the lowest order.
N = 2: $M_1$ = 002, $M_2$ = 00002, $M$ = 00202
$M_1/M$ = 1044004400440044004400440044004400440044004400440044
$M_2/M$ = 0010440044004400440044004400440044004400440044004400
N = 3: $M_1$ = 002, $M_2$ = 000023, $M$ = 002023
$M_1/M$ = 1040303403420004404141041024440040303403420004404141
$M_2/M$ = 0014141041024440040303403420004404141041024440040303
N = 4: $M_1$ = 00200002, $M_2$ = 000023, $M$ = 00202302
$M_1/M$ = 1040303004000130020234341334320032124414032304024031
$M_2/M$ = 0014141440444314424210103110124412320030412140420413
N = 5: $M_1$ = 00200002, $M_2$ = 000023004, $M$ = 002023024
$M_1/M$ = 1040301040132010043322212441423102032221232032034142
$M_2/M$ = 0014143404312434401122232003021342412223212412410302
N = 6: $M_1$ = 00200002, $M_2$ = 00002300403, $M$ = 00202302403
$M_1/M$ = 1040301003131014113132222240403413222311230303113140
$M_2/M$ = 0014143441313430331312222204041031222133214141331304
N = 7: $M_1$ = 00200002, $M_2$ = 0000230040303, $M$ = 0020230240303
$M_1/M$ = 1040301003202004101343032004014023441101104433243020
$M_2/M$ = 0014143441242440343101412440430421003343340011201424
Thus, in the analysis of the sample in the field of 5-adic numbers there is rapid stabilization of the digits in the 5-adic decomposition of the relative frequencies. For example, after 55 experiments 78 digits in the 5-adic decomposition of the relative frequencies are stabilized.
When the sample is analyzed in the field of real numbers, there is again no statistical stabilization.
Acknowledgements
I would like to thank L Ballentine and J Summhammer for discussions on p-adic probabilities and elements of physical reality
COMPLEMENTARITY OR SCHIZOPHRENIA: IS PROBABILITY IN QUANTUM MECHANICS INFORMATION OR ONTA?
A. F. KRACKLAUER
E-mail: kracklau@fossi.uni-weimar.de
Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM. The fundamental assumption underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM. In addition, the interplay of this paradigm with Bell's no-go theorem for local realistic extensions of QM will be analyzed.
1 Introduction
Of the various complementarities or dualities evident in Quantum Mechanics (QM), among the most vexing is that afflicting the character of a wave function, which at once is to be something ontological, because it diffracts at material boundaries, and something epistemological, because it carries only probabilistic information. All other diffractable waves, it may be said, carry momentum and energy, not conceptual, abstract information or ideas. All other probabilities are calculational aids and, like abstractions generally, are utterly unaffected by material boundaries. The literature is replete with resolutions of QM conundrums that selectively ignore one or the other of these characteristics; in the end, they all fail.
Herein a description of a paradigm, a conceptual model of physical effects, will be presented that perhaps can provide an understanding of this schizophrenic nature of wave functions. It is based on Stochastic Electrodynamics (SED), a candidate theory to elucidate the mysteries of QM [1]. The fundamental concept underlying SED is the supposed existence of a certain sort of random electromagnetic background, the nature of which, it is hoped, will ultimately account for the behavior of atomic scale entities as described usually by QM [2]. Among the successes of SED, one is a local realistic explanation of the diffraction of particle beams [3]. The core of this explanation is the notion that relative motion through the SED background effectively engenders de Broglie's pilot wave. Given such a pilot wave associated with a particle's motion, the statistical distribution of momentum in a density over phase space can be decomposed in the sense of Fourier analysis, such that the resulting form of Liouville's Equation, under some conditions, is Schrödinger's Equation.
From this viewpoint, the schizophrenic character of wave functions can be discussed and understood free of preternatural attributes. These concepts have broad implications, ranging from serious philosophical questions, such as the mind-body dichotomy, through teleportation to popular science-fiction effects. In addition, the peculiar nature of probability in QM is clarified.
Although much remains to be done to comprehensively interpret all of QM in terms of SED, many of the by now hoary paradoxes can be rationally deconstructed.
A secondary (but intimately related) issue is that of determining the import of Bell's Theorem for the use of the SED paradigm to reconcile fully the interpretation of QM. Arguments will be presented showing that in his proof Bell (essentially by misconstruing the use of conditional probabilities) called on inappropriate hypothetical presumptions, just as Hermann, de Broglie, Bohm and others found that von Neumann did before him [4, 5].
2 De Broglie waves as an SED effect
The foundation of the model, or conceptual paradigm, for the mechanism of particle diffraction proposed herein is Stochastic Electrodynamics (SED). Most of SED, for which there exists a substantial literature, is not crucial for the issue at hand [1]. The crux of SED can be characterized as the logical inversion of QM in the following sense: if QM is taken as a valid theory, then ultimately one concludes that there exists a finite ground state for the free electromagnetic field, with energy per mode given by
$$E = \hbar\omega/2. \tag{1}$$
SED, on the other hand, inverts this logic and axiomatically posits the existence of a random electromagnetic background field with this same spectral energy distribution, and then endeavors to show that ultimately a consequence of the existence of such a background is that physical systems exhibit the behavior otherwise codified by QM. The motivation for SED proponents is to find an intuitive, local realistic interpretation for QM, hopefully to resolve the well-known philosophical and lexical problems as well as to inspire new attacks on other problems.
The question of the origin of this electromagnetic background is, of course, fundamental. In the historical development of SED, its existence has been posited as an operational hypothesis whose justification rests a posteriori on results. Nevertheless, lurking on the fringes from the beginning has been the idea that this background is the result of self-consistent interaction, i.e., the background arises out of interactions from all other electromagnetic charges in the universe [6].
For present purposes, all that is needed is the hypothesis that particles, as systems with charge structure (not necessarily with a net charge), are in equilibrium with electromagnetic signals in the background. Consider, for example, as a prototype system, a dipole with characteristic frequency $\omega_0$. Equilibrium for such a system in its rest frame can be expressed as
$$m_0 c^2 = \hbar\omega_0. \tag{2}$$
This statement is actually tautological, as it just defines $\omega_0$, for which an exact numerical value will turn out to be practically immaterial.
This equilibrium in each degree of freedom is achieved, in the particle's rest frame, by interaction with counter-propagating electromagnetic background signals in both polarization modes separately, which on the average add to give a standing wave with antinode at the particle's position:
$$2\cos(k_0 x)\sin(\omega_0 t). \tag{3}$$
Again, this is essentially a tautological statement, as a particle doesn't see signals with nodes at its location, thereby leaving only the others. Of course, everything is to be understood in an on-the-average, statistical sense.
Now consider Eq. (3) in a translating frame, in particular the rest frame of a slit through which the particle, as a member of a beam ensemble, passes. In such a frame the component signals under a Lorentz transform are Doppler shifted and then add together to give what appears as modulated waves:
$$2\cos(k_0\gamma(x - c\beta t))\sin(\omega_0\gamma(t - c^{-1}\beta x)), \tag{4}$$
for which the second, the modulation factor, has wave length $\lambda = (\gamma\beta k_0)^{-1}$. From the Lorentz transform of Eq. (2), $P = \hbar\gamma\beta k_0$, the factor $\gamma\beta k_0$ can be identified as the de Broglie wave vector from QM as expressed in the slit frame.
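The step from Eq. (3) to Eq. (4) can be checked numerically. The sketch below is an illustration only, assuming units with c = 1 by default and the free-space relation ω0 = c k0 for the background signals; the function names are hypothetical. It adds the two Doppler-shifted counter-propagating components in the slit frame and compares the sum with the modulated product form of Eq. (4).

```python
import math


def slit_frame_wave(x, t, k0, beta, c=1.0):
    """Return (sum of Doppler-shifted components, product form of Eq. (4))."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    omega0 = c * k0  # assumption: free-space dispersion for the background
    # The two counter-propagating rest-frame components of the standing
    # wave, expressed in slit-frame coordinates (one red-, one blue-shifted).
    comp_a = math.sin(gamma * (1.0 - beta) * (omega0 * t + k0 * x))
    comp_b = math.sin(gamma * (1.0 + beta) * (omega0 * t - k0 * x))
    summed = comp_a + comp_b
    product = (2.0 * math.cos(k0 * gamma * (x - c * beta * t))
               * math.sin(omega0 * gamma * (t - beta * x / c)))
    return summed, product


def modulation_wavenumber(k0, beta):
    """Spatial wave number of the modulation factor in Eq. (4)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * beta * k0


if __name__ == "__main__":
    for x, t in [(0.3, 0.1), (1.7, 2.5), (-4.0, 0.9)]:
        print(slit_frame_wave(x, t, k0=2.0, beta=0.4))
    print(modulation_wavenumber(k0=2.0, beta=0.4))
```

For small β the modulation wave number γβk0 is much smaller than the carrier wave number γk0, which is why the second factor appears as a slow envelope on the fast oscillation.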
In short, it is seen that a particle's de Broglie wave is a modulation on what the orthodox theory designates Zitterbewegung. The modulation wave effectively functions as a pilot wave. Unlike de Broglie's original conception, in which the pilot wave emanates from the kernel, here this pilot wave is a kinematic effect of the particle interacting with the SED background. Because this SED background is classical electromagnetic radiation, it will diffract according to the usual laws of optics and thereafter modify the trajectory of the particle with which it is in equilibrium [3]. (See Ref. [1], Section 12.3, for a didactical elaboration of these concepts.)
The detailed mechanism for pilot wave steerage is based on observing that the energy pattern of the actual signal that pilot waves are modulating, and to which a particle tunes, comprises a fence or rake-like structure with prongs of varying average heights specified by the pilot wave modulation. These prongs, in turn, can be considered as forming the boundaries of energy wells in which particles are trapped: a series of micro-Paul-traps, as it were. Intuitively, it is clear that where such traps are deepest, particles will tend to be captured and dwell the longest. The exact mechanism moving and restraining particles is radiation pressure, but not as given by the modulation, rather by the carrier signal itself. Of course, because these signals are stochastic, well boundaries are bobbing up and down somewhat, so that any given particle, with whatever energy it has, will tend to migrate back and forth into neighboring cells as boundary fluctuations permit. Where the wells are very shallow, however, particles are laterally (in a diffraction setup, say) unconstrained; they tend to vacate such regions and therefore have a low probability of being found there.
The observable consequence of the constraints imposed on the motion of particles is a microscopic effect, which can be made manifest only in the observation of many similar systems. For illustration, consider an ensemble of similar particles comprising a beam passing through a slit. Let us assume that these particles are very close to equilibrium with the background, that is, that any effects due to the slit can be considered as slight perturbations on the systematic motion of the beam members.
Given this assumption, each member of the ensemble, with index $n$, say, will with a certain probability have a given amount of kinetic energy $E_n$ associated with each degree of freedom. Of special interest here is the direction perpendicular to both the beam and the slit, in which, by virtue of the assumed state of near equilibrium with the background, we can take the distribution with respect to energy of the members of the ensemble to be given in the usual way by the Boltzmann factor $e^{-\beta E}$, where $\beta$ is the reciprocal of the product of the Boltzmann constant $k$ and the temperature $T$ in kelvin. The temperature in this case is that of the electromagnetic background serving as a thermal bath for the beam particles, with which it is in near equilibrium.
Now, the relative probability of finding any given particle, i.e., one with energy $E_n \le d$, trapped in a particular well of depth $d$ will be, according to elementary probability, proportional to the sum of the probabilities of finding particles with energy less than the well depth:
$$\sum_{E_n<d} e^{-\beta E_n} \approx \int_0^d e^{-\beta E}\,dE = \frac{1}{\beta}\left(1 - e^{-\beta d}\right), \tag{5}$$
where approximating the sum with an integral is tantamount to the recognition that the number of energy levels, if not a priori continuous, is large with respect to the well depth.
If now $d$ in Eq. (5) is expressed as a function of position, we get the probability density as a function of position. For example, for a diffraction pattern from a single slit of width $a$ at distance $D$, the intensity (essentially the energy density) as a function of lateral position is $E_0\sin^2(\theta)/\theta^2$, where $\theta = k_{\mathrm{pilot\ wave}}(a/D)y$, and the probability of occurrence $P(\theta(y))$ as a function of position would be
$$P(y) \propto \left(1 - e^{-\beta E_0\sin^2(\theta)/\theta^2}\right). \tag{6}$$
Whenever the exponent in Eq. (6) is significantly less than one, its right-hand side is very accurately approximated by the exponent itself, so that one obtains the standard and verified result that the probability of occurrence, $P(y) = \psi^*\psi$ in conventional QM, is proportional to the intensity of a particle's de Broglie (pilot) wave.
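The small-exponent approximation behind this last step can be illustrated numerically. The following sketch assumes the forms of Eqs. (5) and (6) as written above; the parameter name `inv_kT` (standing for β = 1/kT) and all function names are hypothetical, and the numerical values are arbitrary.

```python
import math


def well_occupation(inv_kT, depth):
    """Eq. (5): integral of exp(-inv_kT * E) for E from 0 to depth."""
    return (1.0 - math.exp(-inv_kT * depth)) / inv_kT


def single_slit_intensity(theta, e0=1.0):
    """Single-slit intensity E0 * sin^2(theta) / theta^2."""
    if theta == 0.0:
        return e0
    return e0 * (math.sin(theta) / theta) ** 2


def relative_probability(theta, inv_kT, e0=1.0):
    """Eq. (6), up to normalization."""
    return 1.0 - math.exp(-inv_kT * single_slit_intensity(theta, e0))


if __name__ == "__main__":
    inv_kT = 0.01  # arbitrary value chosen so that the exponent is small
    for theta in (0.0, 0.5, 1.0, 2.0, 4.0):
        exponent = inv_kT * single_slit_intensity(theta)
        print(theta, relative_probability(theta, inv_kT), exponent)
```

When `inv_kT * e0` is small, the two printed columns agree closely, which is the sense in which the particle distribution follows the wave intensity.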
3 Schrodinger Equation
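A consequence of the attachment of a de Broglie pilot wave to each particle is that there exists a Fourier kernel of the following form:
$$e^{i2\mathbf{p}\cdot\mathbf{x}'/\hbar}, \tag{7}$$
which can be used to decompose the density function of an ensemble of similar particles. Consider an ensemble governed by the Liouville Equation:
$$\frac{\partial\rho}{\partial t} = -\frac{\mathbf{p}}{m}\cdot\nabla\rho - \mathbf{F}\cdot\nabla_p\rho. \tag{8}$$
Now decompose $\rho(\mathbf{x},\mathbf{p},t)$ with respect to $\mathbf{p}$ using the de Broglie-Fourier kernel,
$$\hat\rho(\mathbf{x},\mathbf{x}',t) = \int e^{i2\mathbf{p}\cdot\mathbf{x}'/\hbar}\rho(\mathbf{x},\mathbf{p},t)\,d\mathbf{p}, \tag{9}$$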
[Figure 1. Plot titled "Neutron Diffraction": relative intensity versus lateral displacement in radians (theta). Legend: Particle Beam; Radiation; Chi(y)-squared (x50).]
Figure 1. A simulated single-slit neutron diffraction pattern showing the closeness of the fit of Eq. (6) to the pure wave diffraction pattern. See Ref. [3] for details.
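to transform the Liouville Equation into
$$\frac{\partial\hat\rho}{\partial t} = \frac{i\hbar}{2m}\nabla'\cdot\nabla\hat\rho + \frac{i2}{\hbar}(\mathbf{x}'\cdot\mathbf{F})\hat\rho. \tag{10}$$
To solve, separate variables using
$$\mathbf{r} = \mathbf{x} + \mathbf{x}', \qquad \mathbf{r}' = \mathbf{x} - \mathbf{x}', \tag{11}$$
to get
$$i\hbar\frac{\partial\hat\rho}{\partial t} = -\frac{\hbar^2}{2m}\left(\nabla_r^2 - \nabla_{r'}^2\right)\hat\rho - (\mathbf{r}-\mathbf{r}')\cdot\mathbf{F}\,\hat\rho, \tag{12}$$
which can (sometimes) be separated by writing
$$\hat\rho(\mathbf{r},\mathbf{r}') = \psi^*(\mathbf{r}')\psi(\mathbf{r}), \tag{13}$$
to get Schrödinger's Equation (with $\mathbf{F} = -\nabla V$):
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \tag{14}$$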
4 Conclusions
Within this paradigm, Quantum Mechanics is incomplete, as surmised by Einstein, Podolsky and Rosen [7]. It is built on the basis of the Liouville Equation while taking a particular stochastic background into account. The conceptual function of probability in QM is just as in Statistical Mechanics. Measurement reduces ignorance; it does not precipitate reality. Of course, measurement also disturbs the measured system, but this presents no more fundamental problems than it does in classical physics. Heisenberg uncertainty, on the other hand, is seen to be caused simply by the incessant dynamical perturbation from background signals. Insofar as the source of background signals cannot be isolated, this source of uncertainty is intrinsic, but not fundamentally novel. For these reasons, duality is superfluous. Particles have the same ontological status as in classical physics. Individual particles in a beam pass through one or the other slit in a Young double-slit experiment, for example, while their de Broglie piloting waves pass through both slits. Beyond the slit, the particles are induced stochastically to track the nodes of their pilot waves, so that a diffraction pattern is built up mimicking the intensity of the pilot wave.
From within this paradigm, the now infamously paradoxical situations illustrating various problems with the interpretation of QM never arise, or are resolved with elementary reasoning. In particular, wave functions are not vested with an ambiguous nature.
The SED paradigm also clarifies the appearance of interference among probabilities. Numerous analysts, from various viewpoints, have discovered the fact that Probability Theory admits structure (used by QM) that goes unexploited in traditional applications. (E.g., see Gudder, Summhammer, this volume.) While each of these approaches provides deep and surprising insights, none really offers any explanation of why and how nature exploits this structure. Just as a certain second-order hyperbolic partial differential equation becomes the wave equation as a physics statement only with the introduction, e.g., of Hooke's Law, so this extra probability structure can be made into physics only with an analogue to Hooke's Law.
SED provides that analogue for particle behavior with its model of pilot wave guidance. In this model, radiation pressure is responsible for particle guidance [3]. Radiation pressure is proportional to the square of EM fields, i.e., the intensity (in this case, of the background field as modified by objects in the environment), which is not additive. Rather, the field amplitudes are additive, and interference arises in the way well understood in classical EM. In other words, QM interference is a manifestation of EM interference. The relevant Hooke's Law analogue is the phenomenon of radiation pressure. For radiation, this is all intimately related, of course, to classical coherence theory as applied to square-law photoelectron detectors, which, when properly applied, resolves many QM conundrums, including those instigated by Bell's Theorem surrounding EPR correlations.
Appendix: Bell's Theorem
The interpretation, or paradigm, described herein conflicts with the conclusions of Bell's no-go theorem, according to which a local realistic extension of QM should conform with certain restraints that have been shown empirically to be false. To be sure, this paradigm does not deliver the hidden variables for exploitation in calculations, but it does indicate to which features in the universe they pertain, namely all other charges. The character of these hidden variables is dictated by the fact that they are distinguished only in that they pertain to particles distant from the system of particular interest; thus, internal consistency requires that they be local and realistic [8].
The basic proof
Bell's Theorem purports to establish certain limitations on coincidence probabilities of spin or polarization measurements, as calculated using QM, if they are to have an underlying deterministic, but still local and realistic, basis describable by extra, as yet hidden, variables $\lambda$, distributed with a density $\rho(\lambda)$. These limitations take the form of inequalities which measurable coincidences must respect. The extraction of one of these inequalities, where the input assumptions are enumerated as Bell made them, proceeds as follows.
Bell's fundamental Ansatz consists of the following equation:
$$P(a,b) = \int d\lambda\,\rho(\lambda)A(a,\lambda)B(b,\lambda), \tag{15}$$
where, per explicit assumption, $A$ is not a function of $b$, nor $B$ of $a$. This he motivated on the grounds that a measurement at station A, if it respects locality, cannot depend on remote conditions, such as the settings of a distant measuring device, i.e., $b$. In addition, each, by definition, satisfies
$$|A| \le 1, \qquad |B| \le 1. \tag{16}$$
Eq. (15) expresses the fact that when the hidden variables are integrated out, the usual results from QM are recovered.
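The extraction proceeds by considering the difference of two such coincidence probabilities, where the parameters of one measuring station differ:
$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)\left[A(a,\lambda)B(b,\lambda) - A(a,\lambda)B(b',\lambda)\right], \tag{17}$$
to which zero, in the form
$$\pm\left[A(a,\lambda)B(b,\lambda)A(a',\lambda)B(b',\lambda) - A(a,\lambda)B(b',\lambda)A(a',\lambda)B(b,\lambda)\right], \tag{18}$$
is added to get
$$P(a,b) - P(a,b') = \int d\lambda\,\rho(\lambda)A(a,\lambda)B(b,\lambda)\left(1 \pm A(a',\lambda)B(b',\lambda)\right) - \int d\lambda\,\rho(\lambda)A(a,\lambda)B(b',\lambda)\left(1 \pm A(a',\lambda)B(b,\lambda)\right), \tag{19}$$
which, upon taking absolute values, Bell wrote as
$$|P(a,b) - P(a,b')| \le \int d\lambda\,\rho(\lambda)\left(1 \pm A(a',\lambda)B(b',\lambda)\right) + \int d\lambda\,\rho(\lambda)\left(1 \pm A(a',\lambda)B(b,\lambda)\right). \tag{20}$$
Then, using the Ansatz, Eq. (15), and the normalization $\int d\lambda\,\rho(\lambda) = 1$, one gets
$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2, \tag{21}$$
a Bell inequality [9].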
Now, if the QM result for these coincidences, namely $P(a,b) = -\cos(2\theta)$, is put in Eq. (21), it will be found that for $\theta = \pi/8$ the left-hand side of Eq. (21) becomes $2\sqrt{2}$. Experiments verify this result [10]. Why the discrepancy? According to Bell, it must have been induced by demanding locality, as all else he took to be harmless.
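The value quoted for the QM correlation can be reproduced with a few lines of arithmetic. The sketch below assumes that θ is the difference between the two analyzer settings, and uses one conventional choice of settings, spaced by π/8, which is not stated in the text; the function names are hypothetical.

```python
import math


def qm_correlation(a, b):
    """QM coincidence correlation, taking theta as the setting difference."""
    return -math.cos(2.0 * (a - b))


def bell_lhs(a, a_prime, b, b_prime, corr=qm_correlation):
    """Left-hand side of the Bell inequality of Eq. (21)."""
    return (abs(corr(a, b) - corr(a, b_prime))
            + abs(corr(a_prime, b_prime) + corr(a_prime, b)))


if __name__ == "__main__":
    # Assumed settings: neighbouring analyzers differ by pi/8.
    value = bell_lhs(0.0, math.pi / 4, math.pi / 8, 3 * math.pi / 8)
    print(value, 2.0 * math.sqrt(2.0))
```

Any correlation function bounded as in Eq. (21) would give at most 2 here; the QM expression exceeds that bound for these settings.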
Critiques
Although Bell's analysis is denoted a theorem, in fact there can be no such thing in physics; the axiomatic base on which to base a theorem consists of those fundamental theories which the whole enterprise is endeavoring to reveal. Moreover, buried in all mathematics pertaining to the physical world are numerous unarticulated assumptions, some of which are exposed below.
The analytical character of dichotomic functions
In motivating his discussion of the extraction of inequalities, Bell considered the measurement of spin using Stern-Gerlach magnets, or polarization measurements of photons. In both cases, single measurements can be seen as individual terms in a symmetric dichotomic series, i.e., having the values ±1. It is therefore natural to ask if the correlation computed using QM, $P(a,b) = -\cos(2\theta)$, and verified empirically, can be the correlation of dichotomic functions. It is easy to show that it cannot be; consider
$$-\cos(2\theta) = k\int D(x-\theta)D(x)\,dx, \tag{22}$$
where $\rho(\lambda)$ is $k/2\pi$ and where the $D$'s are dichotomic functions. Now take the derivative with respect to $\theta$ to get
$$2\sin(2\theta) = \int\sum_j\delta(x-\theta_j)D(x)\,dx = \sum_j D(\theta_j) = k, \tag{23}$$
and again,
$$4\cos(2\theta) = 0, \tag{24}$$
which is false. QED.
Some authors (see, e.g., Aerts, this volume) employ a parameterized dichotomic function to represent measurements. Such a function can be dichotomic in the argument but continuous in the parameter, e.g., of the form $D(\sin(t) - x)$, for which then the correlation is taken to be of the form
$$\mathrm{Corr}(t) = \int_{-\pi}^{\pi} D(x - \sin(2t))D(x)\,dx. \tag{25}$$
However, this approach seems misguided. First, it assumes that the argument of the correlation, $t$, can be identical to the parameter of the dichotomic function $D_t(x)$, rather than the offset in the argument, here $x$, as befitting a correlation. Moreover, the same sort of consistency test applied above also results in contradictions; therefore such parameterized functions do not constitute counterexamples invalidating the claim that discontinuous functions cannot have a harmonic correlation. At best, this tactic implicitly results in the correlation of the measurement functions with respect to the continuous parameter $t$, which is interpreted as the weight or frequency of the dichotomic value. This tactic, however, does not conform with Bell's analysis, in which the dichotomic values are to be correlated; rather, it corresponds with the type of model proposed below, without, however, recognizing Malus' Law as the source of the weights.
Conclusion: there is a fundamental error in Bell's analysis; the QM result is at irreconcilable odds with the conventional understanding of his arguments [11].
This can be revealed alternately, following Sica, by considering four dichotomic sequences (with values ±1 and length $N$), $a$, $a'$, $b$ and $b'$, and the following two quantities: $a_ib_i + a_ib'_i = a_i(b_i + b'_i)$ and $a'_ib_i - a'_ib'_i = a'_i(b_i - b'_i)$. Sum these expressions over $i$, divide by $N$, and take absolute values before adding together to get
$$\left|\frac{1}{N}\sum_i^N a_ib_i + \frac{1}{N}\sum_i^N a_ib'_i\right| + \left|\frac{1}{N}\sum_i^N a'_ib_i - \frac{1}{N}\sum_i^N a'_ib'_i\right| \le \frac{1}{N}\sum_i^N |a_i||b_i + b'_i| + \frac{1}{N}\sum_i^N |a'_i||b_i - b'_i|. \tag{26}$$
The right-hand side equals 2, so this is a Bell Inequality. Conclusion: this Bell Inequality is an arithmetic identity for dichotomic sequences; there is no need to postulate locality in order to extract it [12].
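The arithmetic character of Eq. (26) can be checked directly on arbitrary ±1 sequences. The sketch below is an illustration with randomly generated sequences (the generator, seed and function names are assumptions made for the example); it evaluates both sides of Eq. (26) and confirms that the right-hand side is 2 whatever the sequences are.

```python
import random


def sica_sides(a, a_p, b, b_p):
    """Return (lhs, rhs) of Eq. (26) for four equal-length +/-1 sequences."""
    n = len(a)
    s_ab = sum(x * y for x, y in zip(a, b)) / n
    s_abp = sum(x * y for x, y in zip(a, b_p)) / n
    s_apb = sum(x * y for x, y in zip(a_p, b)) / n
    s_apbp = sum(x * y for x, y in zip(a_p, b_p)) / n
    lhs = abs(s_ab + s_abp) + abs(s_apb - s_apbp)
    # For +/-1 values exactly one of |b + b'| and |b - b'| is 2, the other 0.
    total = sum(abs(x) * abs(y + z) for x, y, z in zip(a, b, b_p))
    total += sum(abs(x) * abs(y - z) for x, y, z in zip(a_p, b, b_p))
    return lhs, total / n


def random_dichotomic(n, rng):
    """A random sequence of +1 and -1 values of length n."""
    return [rng.choice((-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    seqs = [random_dichotomic(1000, rng) for _ in range(4)]
    print(sica_sides(*seqs))
```

No assumption about how the sequences were produced, local or otherwise, enters the check; only the values ±1 matter.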
Discrete vice continuous variables
By implication, Bell considered discrete variables, for which the correlation would be
$$\mathrm{Cor}(a,b) = \frac{1}{N}\sum_i^N A_i(a)B_i(b). \tag{27}$$
But experiments measure the number of hits per unit time, given $a$, $b$, and then compute the correlation; each event is a density, not a single pair. The data taken in experiments correspond to the read-out for Malus' Law, not the generation of dichotomic sequences for which each term represents an event consisting of a pair of photons with anticorrelated polarization or a particle pair with anticorrelated spins. This discrepancy is ignored in the standard renditions of Bell's analysis. It is, however, serious and suggests a different tack.
Consider, following Barut, a model for which the spin axes of pairs of particles have random but totally anticorrelated instantaneous orientations: $\mathbf{S}_1 = -\mathbf{S}_2$ [13]. Each particle then is directed through a Stern-Gerlach magnetic field with orientation $\mathbf{a}$ or $\mathbf{b}$, respectively. The observable in each case then would be $A = \mathbf{S}_1\cdot\mathbf{a}$ and $B = \mathbf{S}_2\cdot\mathbf{b}$. Now, by standard theory,
$$\mathrm{Cor}(A,B) = \frac{\langle AB\rangle - \langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{28}$$
where the angle brackets indicate averages over the range of the variables. This becomes
$$\mathrm{Cor}(A,B) = -\frac{\int\sin(\gamma)\,d\gamma\,\cos(\gamma-\theta)\cos(\gamma)}{\sqrt{\left(\int d\gamma\,\sin(\gamma)\cos^2(\gamma)\right)^2}}, \tag{29}$$
which evaluates to $-\cos(\theta)$, i.e., the QM result for spin state correlation. Conclusion: this model, essentially a counterexample to Bell's analysis, shows that continuous functions (vice dichotomic) work. It is more than just natural to ask: where do the gremlins reside in Bell's analysis? There are at least two.
One has to do with the following covert hypothesis. Bell's proof seems to pertain to continuous variables, in that the demand is only that $|A|, |B| \le 1$. This argument, however, silently also assumes that the averages $\langle A\rangle = \langle B\rangle = 0$. It enters in the derivation of a Bell inequality, where the second term in the numerator of Eq. (28) is ignored, as if it is always zero. When it is not zero, Bell inequalities become, e.g.,
$$|P(a,b) - P(a,b')| + |P(a',b') + P(a',b)| \le 2 + \frac{2\langle A\rangle\langle B\rangle}{\sqrt{\langle A^2\rangle\langle B^2\rangle}}, \tag{30}$$
which opens up a broader category of non-quantum models. A second covert gremlin, having broader significance, is discussed below.
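The evaluation of the correlation for Barut's model, Eq. (29), can be checked by simple quadrature. The sketch below assumes that the polar angle γ runs from 0 to π and that the overall minus sign stems from the anticorrelation of the two spin axes; the midpoint rule and the function names are choices made for this illustration, not part of the original argument.

```python
import math


def midpoint_integral(f, lo, hi, steps=5000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def barut_correlation(theta):
    """Numerical value of Eq. (29), assuming gamma ranges over [0, pi]."""
    numerator = -midpoint_integral(
        lambda g: math.sin(g) * math.cos(g - theta) * math.cos(g),
        0.0, math.pi)
    norm = midpoint_integral(
        lambda g: math.sin(g) * math.cos(g) ** 2, 0.0, math.pi)
    return numerator / math.sqrt(norm ** 2)


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, 2.0):
        print(theta, barut_correlation(theta), -math.cos(theta))
```

The two printed columns agree to within the quadrature error, in line with the stated result.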
Are nonlocal correlations essential
The demand that, in spite of the introduction of hidden variables $\lambda$, a probability $P(a,b)$ averaged over these extra variables reduce to currently used QM expressions implies that
$$P(a,b) = \int P(a,b,\lambda)\,d\lambda. \tag{31}$$
By basic probability theory, the integrand in this equation is to be decomposed in terms of individual detections in each arm according to Bayes' formula,
$$P(a,b,\lambda) = P(\lambda)P(a|\lambda)P(b|a,\lambda), \tag{32}$$
where $P(a|\lambda)$ is a conditional probability. In turn, the integrand above can be converted to the integrand of Bell's Ansatz,
$$P(a,b) = \int A(a,\lambda)B(b,\lambda)\rho(\lambda)\,d\lambda,$$
iff
$$P(b|a,\lambda) = P(b|\lambda) \quad \forall a. \tag{33}$$
This equation admits, it seems, two interpretations:
(i) When this equation is true, the ratio of occurrence of outcomes at station B must be statistically independent of the outcomes at A. Therefore, as the hidden variables $\lambda$ are extra and do not duplicate $a$ and $b$, even if the correlation is considered to be encoded by a $\lambda$, it will not be available to an observer. But the correlation by hypothesis does exist and is to be detectable via the $a$'s and $b$'s; therefore this equation cannot hold. Thus, within this interpretation, Bell's Ansatz is not internally consistent.
(ii) Alternately, if the $a$ on the left-hand side is superfluous, so is $b$, so that $P = P(\lambda) = 0$ except at one value of $\lambda$, where it equals 1, or is a Dirac delta function. That is, the correlation is totally encoded by the hidden variables, as follows if a sufficient number of new variables are introduced to render everything deterministic, as often assumed. Consequently, individual products of probabilities at the separate stations, i.e., $AB$'s in Bell's notation, become Dirac delta functions of the $\lambda$. If everything is deterministic, then there can be no overlap of the non-zero values of pairs of probabilities for a given value of $\lambda$, and therefore, in the extraction of a Bell inequality, all quadruple products of $P$'s with pair-wise different values of $\lambda$ in Eq. (19) are identically zero, so that the final form of a Bell inequality is the trivial identity
$$|P(a,b) - P(a,b')| \le 2. \tag{34}$$
In either case, locality is not to be so employed as to exclude correlations generated at the conception of the spin-particle or photon pairs, i.e., common causes. The nonexistence of instantaneous communication cannot impose a restraint here; it must bear no relationship to the validity of Eq. (33).
In addition, Eq. (34) reconciles Barut's continuous variable model with Bell's analysis.
Bell-Kochen-Specker Theorem
Besides Bell's original theorem, there is another set of no-go theorems ostensibly prohibiting a local realistic extension for QM. In contrast to the theorem analyzed above, they do not make explicit use of locality; rather, they use certain properties (falsely, it turns out) of angular momentum (spin). In general, the proof of these theorems proceeds as follows. The system of interest is described as being in a state $|\psi\rangle$ specified by observables $A, B, C, \ldots$ A hidden variable theory is then taken to be a mapping $v$ of observables to numerical values: $v(A), v(B), v(C), \ldots$ Use is then made of the fact that if a set of operators all commute, then any function of these operators $f(A, B, C, \ldots) = 0$ will also be satisfied by their eigenvalues: $f(v(A), v(B), v(C), \ldots) = 0$.
The proof of a Kochen-Specker Theorem proceeds by displaying a contradiction; consider, e.g., two spin-1/2 particles, for which the nine separate mutually commuting operators can be arranged in the following 3 by 3 matrix:
$$\begin{array}{ccc}\sigma_x^1 & \sigma_x^2 & \sigma_x^1\sigma_x^2\\ \sigma_y^2 & \sigma_y^1 & \sigma_y^1\sigma_y^2\\ \sigma_x^1\sigma_y^2 & \sigma_x^2\sigma_y^1 & \sigma_z^1\sigma_z^2\end{array} \tag{35}$$
It is then a little exercise in bookkeeping to verify that any assignment of plus and minus ones for each of the factors in each element of this matrix results in a contradiction; namely, the product of all these operators formed row-wise is plus one and the same product formed column-wise is minus one [14].
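The bookkeeping can be delegated to a short program. The sketch below assumes the standard two-particle arrangement of Pauli operators for the 3 by 3 array (rows: σx¹, σx², σx¹σx²; σy², σy¹, σy¹σy²; σx¹σy², σx²σy¹, σz¹σz²) and uses plain nested lists so that no external library is needed; the helper names are hypothetical.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def sign_of_identity(m):
    """Return +1 if m is the identity, -1 if m is minus the identity."""
    n = len(m)
    for s in (1, -1):
        if all(abs(m[i][j] - (s if i == j else 0)) < 1e-12
               for i in range(n) for j in range(n)):
            return s
    raise ValueError("matrix is not plus or minus the identity")


def row_and_column_signs():
    """Signs of the operator products along each row and each column."""
    x1, y1, z1 = kron(X, I2), kron(Y, I2), kron(Z, I2)
    x2, y2, z2 = kron(I2, X), kron(I2, Y), kron(I2, Z)
    square = [[x1, x2, matmul(x1, x2)],
              [y2, y1, matmul(y1, y2)],
              [matmul(x1, y2), matmul(x2, y1), matmul(z1, z2)]]
    rows = [sign_of_identity(reduce(matmul, row)) for row in square]
    cols = [sign_of_identity(reduce(matmul, [square[r][c] for r in range(3)]))
            for c in range(3)]
    return rows, cols


if __name__ == "__main__":
    print(row_and_column_signs())
```

Every row multiplies to plus the identity, while the columns multiply to plus, plus and minus the identity, so the overall row-wise product is +1 and the overall column-wise product is -1. No assignment of fixed values ±1 to the nine entries can reproduce both.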
Now recall that, given a uniform static magnetic field $B$ in the $z$-direction, the Hamiltonian is $H = \frac{e}{mc}B\sigma_z$, for which the time-dependent solution of the Schrödinger equation is $\psi(t) = \frac{1}{\sqrt{2}}\begin{bmatrix}e^{-i\omega t}\\ e^{+i\omega t}\end{bmatrix}$, and this in turn gives time-dependent expectation values for spin values in the $x$, $y$ directions [15]:
$$\langle\sigma_x\rangle = \frac{\hbar}{2}\cos(\omega t), \qquad \langle\sigma_y\rangle = \frac{\hbar}{2}\sin(\omega t), \tag{36}$$
where $\omega = eB/mc$.
Proof of a Bell-Kochen-Specker theorem depends on simultaneously assigning the eigenvalues ±1 to $\sigma_x$, $\sigma_y$ and $\sigma_z$ as measurables for each particle. (With some effort, for all other proofs of this theorem one can find an equivalent assumption.) However, as Barut [13] observed, and as can be seen in Eq. (36), if the eigenvalues ±1 are realizable measurement results in the B-field direction, then in the other two directions the expectation values oscillate out of phase and therefore cannot be simultaneously equal to ±1. Thus, this variation of a Bell theorem also is defective physics.
A local model for EPR (polarization) Correlations
The following model incorporates the features of polarization correlations withshyout preternatural aspects or the concept of photon The basic assumption is that the source emits oppositely directed anticorrelated classical electromagshynetic signals
EA = xcos(i) +ys in( f ) EB = mdash xsin( + 6) + y cos(i + 9) (37)
where factors of the form exp(i(wt + k bull x + pound(t)) where pound(pound) is a random variable are dropped as they are suppressed by averaging16 Now the random variables with physical significance emerging in the detectors per Malus Law are EA B It is the detectors that digitize the data and create the illusion of photons But because Maxwells Equations are not linear in intensities rather in the fields a fourth order field correlation is required to calculate the cross correlation of the intensity
P(a b) = Klt(A- B)(B bull A) gt (38)
where brackets indicate averages over space-time (This appears to be the source of entanglement in QM which is seen to have no basis beyond that found in classical physics) Here Eq (38) turns out to be
P ( + +) ltXK (COS(J) sin(i + 6) - sin(i) cos(i + 6)fdv (39) Jo
which gives P ( + + ) = P ( - - ) oc tsin2(0) a n d P ( - + ) = P ( - - ) ocfccos2(0) The constant K can be eliminated by computing the ratio of particular events to the total sample space which here includes coincident detections in all four combinations of detectors averaged over all possible displacement angles 6 thus the denominator is
$$\frac{2K}{2\pi}\int_0^{2\pi}\left(\sin^2(\theta) + \cos^2(\theta)\right)d\theta = 2K, \qquad (40)$$
so that the ratio becomes
$$P(+,+) = \frac{1}{2}\sin^2(\theta), \qquad (41)$$
the QM result. This in turn yields the correlation
$$\mathrm{Cor}(a, b) = \frac{P(+,+) + P(-,-) - P(+,-) - P(-,+)}{P(+,+) + P(-,-) + P(+,-) + P(-,+)},$$

$$\mathrm{Cor}(a, b) = -\cos(2\theta). \qquad (42)$$
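The chain from Eq. (39) to Eq. (42) can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the average over $\nu$ runs over a full period from 0 to $2\pi$, and it takes the unlike-detector probabilities to be proportional to $\cos^2(\theta)$ as stated in the text. The function names are illustrative only. Since the integrand of Eq. (39) reduces to $\sin^2(\theta)$ by the sine difference identity, the constant $K$ cancels in every ratio and is omitted.

```python
import math

def integrand_39(nu, theta):
    # Integrand of Eq. (39); by the sine difference identity it equals sin^2(theta).
    return (math.cos(nu) * math.sin(nu + theta)
            - math.sin(nu) * math.cos(nu + theta)) ** 2

def average_over_nu(theta, steps=2000):
    # Midpoint-rule average over nu in [0, 2*pi] (assumed integration range).
    h = 2.0 * math.pi / steps
    total = sum(integrand_39((k + 0.5) * h, theta) for k in range(steps))
    return total * h / (2.0 * math.pi)

def probabilities(theta):
    same = average_over_nu(theta)      # proportional to sin^2(theta)
    diff = 1.0 - same                  # cos^2(theta), as stated in the text
    norm = 2.0 * same + 2.0 * diff     # all four detector combinations
    return {"++": same / norm, "--": same / norm,
            "+-": diff / norm, "-+": diff / norm}

def correlation(theta):
    # Numerator of the correlation; the denominator is 1 after normalization.
    p = probabilities(theta)
    return p["++"] + p["--"] - p["+-"] - p["-+"]
```

For any angle, `probabilities(theta)["++"]` should agree with Eq. (41) and `correlation(theta)` with Eq. (42) up to floating-point rounding.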
If the fundamental assumptions involved in this local realistic model are valid, then there would be observable consequences. For example, if radiation on the other side of a photodetector is continuous and not comprised of photons, then photoelectrons are evoked independently in each detector by continuous but (anti)correlated radiation. Thus the density of photoelectron pairs should be linearly proportional (barring effects caused by limited coherence) to the coincidence window width. On the other hand, if photons are in fact generated in matched pairs at the source, then at very low intensities the detection rate should be relatively insensitive to the coincidence window width, once it is wide enough to capture both electrons.
References

1. L. de la Peña and A. M. Cetto, The Quantum Dice (Kluwer, Dordrecht, 1996).
2. A. F. Kracklauer, An Intuitive Paradigm for Quantum Mechanics, Physics Essays 5 (2), 226 (1992).
3. A. F. Kracklauer, Found. Phys. Lett. 12 (5), 441 (1999).
4. G. Hermann, Die Naturphilosophischen Grundlagen der Quantenmechanik, Abhandlungen der Fries'schen Schule 6, 75-152 (1935).
5. D. Bohm, Causality and Chance in Modern Physics (Routledge & Kegan Paul Ltd., London, 1957).
6. H. Puthoff, Phys. Rev. A 40, 4857 (1989); 44, 3385 (1991).
7. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
8. J. S. Bell, Speakable and unspeakable in quantum mechanics (Cambridge University Press, Cambridge, 1987).
9. J. S. Bell, in Foundations of Quantum Mechanics, Proceedings of the International School of Physics Enrico Fermi, course IL (Academic, New York, 1971), p. 171-181; reprinted in Ref. [8].
10. A. Afriat and F. Selleri, The Einstein, Podolsky and Rosen Paradox (Plenum, New York, 1999), review theory and experiments from a current perspective.
11. A. F. Kracklauer, in New Developments on Fundamental Problems in Quantum Mechanics, M. Ferrero and A. van der Merwe (eds.) (Kluwer, Dordrecht, 1997), p. 185.
12. L. Sica, Opt. Commun. 170, 55-60 & 61-66 (1999).
13. A. O. Barut, Found. Phys. 22 (1), 137 (1992).
14. N. D. Mermin, Rev. Mod. Phys. 65 (3), 803 (1993).
15. R. H. Dicke and J. P. Wittke, Introduction to Quantum Mechanics (Addison-Wesley, Reading, 1960), p. 195.
16. A. F. Kracklauer, in Instantaneous Action-at-a-Distance in Modern Physics, A. E. Chubykalo, V. Pope and R. Smirnov-Rueda (eds.) (Nova Science, Commack, NY, 1999), p. 379; arXiv:quant-ph/0007101; Ann. Fond. L. de Broglie 20 (2), 193 (2000).
A PROBABILISTIC INEQUALITY FOR THE KOCHEN-SPECKER PARADOX
JAN-ÅKE LARSSON
Matematiska Institutionen, Linköpings Universitet,
SE-581 83 Linköping, Sweden
E-mail: jalar@mai.liu.se
A probabilistic version of the Kochen-Specker paradox is presented. The paradox is restated in the form of an inequality relating probabilities from a non-contextual hidden-variable model, by formulating the concept of "probabilistic contextuality". This enables an experimental test for contextuality at low experimental error rates. Using the assumption of independent errors, an explicit error bound of 0.71% is derived, below which a Kochen-Specker contradiction occurs.
1 Introduction
The description of quantum-mechanical (QM) processes by hidden variables is a subject of active research at present. The interest can be traced to topics where recent improvements in technology have made testing and using QM processes possible. Research in this field is usually intended to provide insight into whether, how, and why QM processes are different from classical processes. Here, the presentation will be restricted to the question of whether or not a certain QM system can be described using a non-contextual hidden-variable model. A non-contextual hidden-variable model would be a model where the result of a specific measurement does not depend on the context, i.e., on which other measurements are simultaneously performed on the system. It is already known that for perfect measurements (perfect alignment, no measurement errors) no non-contextual model exists. These results originate in the work of Gleason [1], but a conceptually simpler proof was given by Kochen and Specker [2] (KS).
The KS theorem concerns measurements on a QM system consisting of a spin-1 particle. In the QM description of this system, the operators associated with measurement of the spin components along orthogonal directions do not commute, i.e.,
$$\hat{S}_x,\ \hat{S}_y \text{ and } \hat{S}_z \text{ do not commute;} \qquad (1)$$
however, the operators that are associated with measurement of the square of the spin components do commute, i.e.,
$$\hat{S}_x^2,\ \hat{S}_y^2 \text{ and } \hat{S}_z^2 \text{ commute.} \qquad (2)$$
The latter operators (the squared ones) have the eigenvalues 0 and 1, and

$$\hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 = 2\,\hat{I}. \qquad (3)$$
Thus, it is possible to simultaneously measure the square of the spin components along three orthogonal vectors, and two of the results will be 1 while the third will be 0. Only this QM property of the system will be used in what follows.
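The properties (1)-(3) can be verified directly in one concrete representation. The sketch below is only an illustration: it assumes the standard spin-1 matrices in the $S_z$ eigenbasis with $\hbar = 1$, which the text does not write out, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    # Plain 3x3 matrix product; entries may be complex.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def commute(a, b):
    return close(matmul(a, b), matmul(b, a))

# Standard spin-1 matrices in the S_z eigenbasis (assumed representation, hbar = 1).
r = 1.0 / math.sqrt(2.0)
SX = [[0, r, 0], [r, 0, r], [0, r, 0]]
SY = [[0, -1j * r, 0], [1j * r, 0, -1j * r], [0, 1j * r, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

SQUARES = [matmul(s, s) for s in (SX, SY, SZ)]

# Sum of the three squared operators, to be compared with 2 * identity (Eq. 3).
TOTAL = [[sum(sq[i][j] for sq in SQUARES) for j in range(3)] for i in range(3)]
TWO_IDENTITY = [[2 if i == j else 0 for j in range(3)] for i in range(3)]
```

In this representation each squared operator is idempotent with trace 2, so its eigenvalues are 1, 1 and 0, in line with the statement that two of the three results are 1 and the third is 0.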
The notation used from now on is intended to avoid confusion with QM notation, since the notions used will be those of (Kolmogorovian) probability theory, not QM. A hidden-variable model will be taken to be a probabilistic model, i.e., the hidden variable $\lambda$ is represented as a point in a probabilistic space $\Lambda$, and sets in this space ("events") have a probability given by the probability measure $P$. The measurement results are described by random variables (RVs) $X_i(\lambda)$, which take their values in the value space $\{0, 1\}$.
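These mappings will depend not only on the hidden variable $\lambda$ but also on the specific directions in which we choose to measure the squared spin components, so that we would have

$$\begin{aligned}
X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &: \Lambda \to \{0, 1\} \\
X_2(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &: \Lambda \to \{0, 1\} \\
X_3(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &: \Lambda \to \{0, 1\}
\end{aligned} \qquad (4)$$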
Here, $X_1$ is the result of the measurement along the first direction ($\mathbf{x}$), $X_2$ along the second ($\mathbf{y}$), and $X_3$ along the third ($\mathbf{z}$). To be able to model the spin-1 system described above, these RVs would need to sum to two, i.e.,

$$\sum_{i=1}^{3} X_i(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) = 2. \qquad (5)$$
This is in itself no guarantee that the model will be accurate, but it is the least one would expect from a hidden-variable model yielding the QM behaviour.
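As a small illustration of the sum constraint (5), the value assignments it allows on a single orthogonal triad can simply be enumerated. This is a sketch for orientation only, and the function name is hypothetical.

```python
from itertools import product

def triad_outcomes():
    # All value triples (X_1, X_2, X_3) in {0, 1}^3 that sum to two, as in Eq. (5).
    return [values for values in product((0, 1), repeat=3) if sum(values) == 2]

OUTCOMES = triad_outcomes()
```

Exactly three triples remain, one for each choice of the direction that receives the value 0.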
In simple experimental setups there is usually only one direction specified (the direction along which the spin component squared is measured). Thus, we would expect that $X_1$ only depends on $\mathbf{x}$ (and $\lambda$), and similarly for $X_2$ and $X_3$. This is referred to as non-contextuality, and more formally this can be written as
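$$\begin{aligned}
X_1(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &= X_1(\mathbf{x}, \mathbf{y}', \mathbf{z}', \lambda) \\
X_2(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &= X_2(\mathbf{x}', \mathbf{y}, \mathbf{z}', \lambda) \\
X_3(\mathbf{x}, \mathbf{y}, \mathbf{z}, \lambda) &= X_3(\mathbf{x}', \mathbf{y}', \mathbf{z}, \lambda)
\end{aligned} \qquad (6)$$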
These two prerequisites are all that is needed to arrive at the Kochen-Specker paradox.
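To make the two prerequisites concrete, the sketch below treats a non-contextual assignment (for one fixed value of the hidden variable) as a single value per direction, and searches for assignments that satisfy the sum constraint on every triad. The direction labels and the choice of two triads sharing one direction are hypothetical, chosen only for illustration. This is not the KS construction itself: for so small a set, consistent assignments exist, and the contradiction only appears for larger, suitably chosen sets of directions.

```python
from itertools import product

# Hypothetical direction labels: two orthogonal triads sharing the direction "x".
TRIADS = [("x", "y", "z"), ("x", "y'", "z'")]
DIRECTIONS = sorted({d for triad in TRIADS for d in triad})

def noncontextual_assignments(triads, directions):
    """Return all maps direction -> {0, 1} whose values sum to 2 on every triad.

    Using one value per direction, regardless of the triad it appears in,
    is what encodes non-contextuality here.
    """
    found = []
    for values in product((0, 1), repeat=len(directions)):
        v = dict(zip(directions, values))
        if all(sum(v[d] for d in triad) == 2 for triad in triads):
            found.append(v)
    return found

ASSIGNMENTS = noncontextual_assignments(TRIADS, DIRECTIONS)
```

Passing a larger list of triads to `noncontextual_assignments` runs the same brute-force check on any finite set of directions, although the search grows exponentially with the number of directions.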