Modeling infinitesimals with 2x2 matrices

categories - Category Theory list

* Modeling infinitesimals with 2x2 matrices
From: Vaughan Pratt @ 2004-04-24  6:45 UTC
  To: categories

At some point I'll try to collect my thoughts on Sol Feferman's Thursday
lecture on his alternative to Grothendieck universes, which he objected to
as entailing an infinity of inaccessible cardinals.  (What was Grothendieck's
view of inaccessible cardinals vis a vis his universes?)

During the lecture it struck me that his approach was quite like Robinson's
approach to infinitesimals, in that it constructed lots of models of what
was needed, took the common theory, then constructed a single model from
the many, using techniques of Vaught and others to avoid losing too much
of the common spirit of the many guided by the common theory (not sure if
that captures the idea completely faithfully, but it's something like that).

Thus distracted, I found myself wondering yet again why the d^2 = 0 property
was so difficult for an infinitesimal d.  Having been mulling over the
quaternions lately, it seemed to me there was something of an analogy there,
some property so built into our very psyche that we can't let go of it.
Hamilton finally dropped commutativity, along with any reservations he
might have harbored about vandalizing stone bridges in his own town.

For the quaternions, d^2 = 0 implies d = 0, so this doesn't help.  However the
quaternions have a sibling algebra, just as noncommutative, and of exactly
the same vector space dimension (in fact the only Clifford such, i.e. the
only other real 4D vector space for which ij+ji=0 for all orthogonal vectors
i,j having no real component), that is even better known than the quaternions
(imagine that).

Namely the Clifford algebra of 2x2 real matrices, as a 4D real vector space,
made an algebra with matrix multiplication.

Why not model d as the matrix
0 1
0 0?

This is a perfectly good quantity, adding and scaling just like any
real, e.g. 2d =
0 2
0 0.

And obviously d^2 = 0.

Standard reals x would have the form

x 0
0 x

1+d would therefore be

1 1
0 1

(1+d)^2 then becomes

1 2
0 1

as common sense would indicate.
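
The arithmetic above can be checked mechanically.  What follows is a minimal
sketch in plain Python, with 2x2 matrices as nested tuples; the helper names
(matmul, matadd, ZERO, ONE, D) are illustrative choices, not anything from
the original post.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matadd(p, q):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(p[i][j] + q[i][j] for j in range(2)) for i in range(2))

ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))   # the standard real 1
D = ((0, 1), (0, 0))     # the proposed infinitesimal d

one_plus_d = matadd(ONE, D)
print(matmul(D, D))                    # ((0, 0), (0, 0))
print(matmul(one_plus_d, one_plus_d))  # ((1, 2), (0, 1))
```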

The determinant of d being 0, one can't divide by it.  But who in their
right mind would want to divide by a quantity infinitesimally close to zero?
Obviously that's going to produce an infinitely large quantity; if you want
to do that, why not just go ahead and divide by zero itself?  As Douglas
Adams pointed out, you may think the store down the road is a fair way away,
but other galaxies are even further away.  To a nematode they're all far away.

On the other hand
1 2
0 1
has a perfectly good reciprocal, namely
1 -2
0 1
again as suggested by common sense.

So the proposal is to base calculus on a field-like object that is a field
in the large, but zero divide errors set in when one gets infinitesimally
close to zero.  Basically what happens with IEEE floating point arithmetic,
but modeled with 2x2 real matrices rather than 64-bit numbers.

Oh, but what about the noncommutativity of 2x2 matrices, might that mess
something up?

Actually no, this two-dimensional algebra consisting of matrices of the form
a b
0 a
is commutative.  So only the zero divisors really close to 0 constitute
any departure at all from the field axioms.

The diagonal element a is the standard real part and the off-diagonal
element b in the upper right gives the infinitesimal displacement.

So we have a real commutative associative algebra of refined numbers,
having a real part and an infinitesimal part, whose only zero divisors are
the infinitesimals.  We don't *have* to think of them as matrices because
we can just write its elements as x+yd by analogy with x+iy, where d is
the above matrix representing the prototypical infinitesimal.  The square
of i is -1, and the square of d is 0.
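
Written as x+yd, these numbers need no matrices at all.  The sketch below is
one possible encoding, assuming exact rational coefficients; the class name
Refined and its methods are hypothetical, chosen only for illustration.  It
shows the product rule forced by d^2 = 0, the reciprocal of 1+2d, and the
failure of division when the real part vanishes.

```python
from fractions import Fraction

class Refined:
    """A number x + y*d with d*d = 0 (x real part, y infinitesimal part)."""

    def __init__(self, x, y=0):
        self.x = Fraction(x)
        self.y = Fraction(y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        return Refined(self.x + other.x, self.y + other.y)

    def __mul__(self, other):
        # (x1 + y1 d)(x2 + y2 d) = x1 x2 + (x1 y2 + y1 x2) d, since d*d = 0
        return Refined(self.x * other.x, self.x * other.y + self.y * other.x)

    def reciprocal(self):
        # 1/(x + y d) = 1/x - (y/x^2) d; undefined when the real part is 0
        if self.x == 0:
            raise ZeroDivisionError("infinitesimals are zero divisors")
        return Refined(1 / self.x, -self.y / self.x**2)

    def __repr__(self):
        return f"{self.x} + {self.y}d"

d = Refined(0, 1)
u = Refined(1, 2)                              # 1 + 2d, i.e. (1 + d)^2
print(d * d)                                   # 0 + 0d
print(u.reciprocal())                          # 1 + -2d
print(u * Refined(3, 5) == Refined(3, 5) * u)  # True
```

The product formula is symmetric in its two arguments, which is the
commutativity noted above.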

Moreover x and y in x+yd can be complex.  We then have numbers x+iy+ud+ivd,
which can be parsed as either refined complex numbers, namely complex numbers
with refined coefficients x+ud+i(y+vd), or complex refined numbers, namely
refined numbers with complex coefficients x+iy+(u+iv)d.  This is still a
real associative algebra, which through force of habit people will no doubt
want to call a complex commutative associative algebra, but it could just
as legitimately be called a refined associative algebra.

Ok, what about commutative?  Well, the complex numbers are commutative and
the refined numbers are commutative, so how could refining complex numbers
make any difference?

Well, the reason I wrote x+yd rather than x+dy is that, even though the
*natural* thing to do is to make i commute with d, if instead we make
id+di=0, the defining condition for Clifford algebras, then we can fit the
whole thing into 2x2 *real* matrices!

Here I'm using the following 2x2 real matrices for i and d respectively:

(0 -1) (0 1)
(1  0) (0 0)

But now notice that the matrices for 1,i,d,id form a basis for all the
2x2 matrices.  In fact *any* 2x2 matrix [[p,q],[r,s]] can be decomposed as

(s -r) + (p-s q+r)
(r  s)   ( 0   0 )

(I'd appreciate feedback from anyone for whom the above doesn't typeset
readably.)

So to read an arbitrary 2x2 real matrix as a refined complex number, take the
bottom row reversed as the complex part and the departure of the top row from
the usual matrix representation of complex numbers as the infinitesimal part,
taking care to get both signs right.
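
The reading rule just described can be sketched in Python as follows, writing
the matrix as ((p, q), (r, s)).  The names decompose, recompose and DI are
illustrative assumptions.  Note that the fourth basis matrix used here, with
a single 1 in the top left corner, is the matrix product of d and i taken in
that order.

```python
ONE = ((1, 0), (0, 1))
I = ((0, -1), (1, 0))
D = ((0, 1), (0, 0))
DI = ((1, 0), (0, 0))   # the matrix product D*I

def decompose(m):
    """Coefficients on (ONE, I, D, DI) of the matrix ((p, q), (r, s)).

    The bottom row reversed, (s, r), gives the complex part s + r*i;
    the departure of the top row, (p - s, q + r), gives the rest.
    """
    (p, q), (r, s) = m
    return (s, r, q + r, p - s)

def recompose(coeffs):
    """Rebuild the matrix from its four coefficients."""
    basis = (ONE, I, D, DI)
    return tuple(
        tuple(sum(c * b[i][j] for c, b in zip(coeffs, basis)) for j in range(2))
        for i in range(2)
    )

m = ((5, -2), (7, 3))
print(decompose(m))                  # (3, 7, 5, 2)
print(recompose(decompose(m)) == m)  # True
```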

How did I notice this?  Simple.  I knew (i) that id+di=0 would make it a
Clifford algebra, (ii) there are only two 4D Clifford algebras, and (iii) d^2
= 0 -> d = 0 in the quaternions.  This narrows things down to the 2x2 real
matrices; there are no other associative algebras with these properties.
Getting the above decomposition was then just a matter of solving some
trivial linear equations.

This is so simple, and the infinitesimals have been worried at for so long,
that this *has* to be known already.  But then it would really bug me to
have been the last to learn about it -- why wasn't I told, as they say?

Vaughan Pratt


* Modeling infinitesimals with 2x2 matrices
From: John Baez @ 2004-04-29  0:54 UTC
  To: categories

Vaughan Pratt writes:

> Why not model d as the matrix
>
> 0 1
> 0 0 ?
>
> This is a perfectly good quantity, adding and scaling just like any
> real, e.g.
>
> 2d =  0 2
>       0 0.
>
> And obviously d^2 = 0.

Part of this idea is implicit in the usual algebraic geometry treatment
of infinitesimals as nilpotents.  In addition to the usual "point", such
that complex functions on this space form the commutative ring C, algebraic
geometers like to think about the "point with nth-order nilpotent fuzz",
such that complex functions on this space form the commutative ring
C[d]/<d^n = 0>.   They visualize this as a space slightly bigger than
a point: just big enough to tell the difference between the function 0
and the function whose first n-1 derivatives equal zero!

To deal with this sort of "space" in a precise way, someone like Grothendieck
invented the category of affine schemes, which is just the opposite of the
category of commutative rings.  But affine schemes are happier as part of a
larger category of schemes... and thus topos theory was brought kicking
and screaming into the world.  To see how this led to a really nice treatment
of infinitesimals, see:

F. William Lawvere, Outline of synthetic differential geometry, available
at http://www.acsu.buffalo.edu/~wlawvere/downloadlist.html

or

Anders Kock, Synthetic Differential Geometry, Cambridge U. Press,
Cambridge, 1981.

But, it's also tempting to embed the commutative ring C[d]/<d^n = 0> into
the noncommutative ring of nxn complex matrices, by letting d be a
slightly off-diagonal matrix, like this:

0 1 0 0
0 0 1 0                    (in the case n = 4)
0 0 0 1
0 0 0 0

(Vaughan is considering the case n = 2.)  And this is more like how
Alain Connes thinks of infinitesimals: as part of the bigger world of
noncommutative geometry!
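
A quick check of the n = 4 case, as a minimal sketch in plain Python (the
helper names shift, matmul and power are illustrative only): the matrix
above satisfies d^4 = 0 while d^3 is still nonzero.

```python
def shift(n):
    """n x n matrix with 1s just above the diagonal and 0s elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power(m, e):
    """m**e for e >= 1 by repeated multiplication."""
    result = m
    for _ in range(e - 1):
        result = matmul(result, m)
    return result

n = 4
d = shift(n)
zero = [[0] * n for _ in range(n)]
print(power(d, n - 1) != zero)  # True: d^3 has a single 1 in the top right corner
print(power(d, n) == zero)      # True: d^4 = 0
```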

Best,
jb


* Re: Modeling infinitesimals with 2x2 matrices
From: Vaughan Pratt @ 2004-04-28  5:13 UTC
  To: categories

>But if you extend the domain to the algebra R(2) of 2x2 real matrices,
>the columns indexed by singular matrices now lose some of their entries.
>But not all, and so the solution space ceases to be rectangular.

On reflection this is not so simple in the case when b in a/b is
infinitesimal.  First, noting that R(2) is noncommutative, the requirement
should be phrased as two equations, a = bx and a = xb to prevent multiple
solutions whose diagonal is not constant.

But while this duplication then determines a unique real part (the diagonal),
the two equations fail to pin down the infinitesimal part (the upper right
entry).  That's the sort of thing that happens with matrices of less than
full rank.

When there is no solution, certainly a/b should be considered undefined.
But when there are multiple solutions, the question arises as to whether to
punt completely (as with 0/0) or do something creative such as setting the
undefined infinitesimal part to 0 (as with the ratio of two infinitesimals).

One test is whether the "dominant" term is fixed, but this breaks down
for 0/b.

A better test would be to use the rank of b to decide how much of the
quotient a/b to ignore---if b has rank 1 (a nonzero infinitesimal) then
ignore the infinitesimal part of a/b.

---------------

One virtue of Robinson's approach is its universality with respect to all
first-order definable functions; this however is not sufficient to compensate
for its more counterintuitive aspects.  Now that I'm starting to see that
the zero-divisor approach is less easily managed than I'd first thought,
I'm not so sold on it either at this point (but maybe all its difficulties
have been overcome somewhere...?).

Meanwhile I remain convinced that Boole's finite difference approach to
handling infinitesimals is superior (recall the trick here: setting h=0
instead of some positive quantity like 1 or Planck's constant introduces no
artificial singularities with Boole's method).  His 1860 *A Treatise on the
Calculus of Finite Differences*, substantially revised by J.F. Moulton for
the 1872 edition after Boole's death, is 336 pages of inspired analysis.
(You can get second hand copies for $10 from Amazon; my very second hand
copy has "F.S. Curry, Trin. Coll., Feb. 1881" written on the inside cover.)

The preface to the first edition starts out,

"In the following exposition of the Calculus of Finite Differences,
particular attention has been paid to the connexion of its methods with
those of the Differential Calculus---a connexion which in some instances
involves far more than a merely formal analogy.

Indeed the work is in some measure designed as a sequel to my *Treatise
on Differential Equations*.  And it has been composed on the same plan."

An updated version of this book incorporating the greatly matured perspective
on linear algebra since then could be a worthwhile project for someone
interested in improving on the existing explications of infinitesimals as
real objects.  While Boole's system beats the current crop hands down in
principle (in my view anyway), in outlook it is showing its age.

Category theory creatively applied might also help.  I confess to having no
idea how intuitionistic logic could be brought to bear effectively though.
I can see that not cancelling certain double negations might preserve certain
nuances that convey certain constructively motivated notions, but to my
untrained eye these come across as nuances with a capital N when their
contribution is assessed in the larger picture of alternative approaches
to constructivizing infinitesimals.  That makes me either a beer guzzler
at a wine tasting or the owner of a screwdriver in a room full of hammer
owners depending on one's outlook.  :)

YBMV (Your biases may vary.)

Vaughan Pratt

* Re: Modeling infinitesimals with 2x2 matrices
From: Vaughan Pratt @ 2004-04-26 16:54 UTC
  To: categories

From: Steve Vickers <s.j.vickers@cs.bham.ac.uk>
>I thought that was the whole reason for contemplating infinitesimals in
>the first place - 0/0 cannot be meaningful, but d/d is 1, ed/d is e etc.

Well, you're not alone in thinking that way.  This was the basis for
Robinson's invention of nonstandard analysis: the belief that a/b has to
be defined for all nonzero b in order to make infinitesimals nonparadoxical.

Instead of formulating division a/b as an operation, go back to its motivating
formulation as a system in search of a solution, in this case the system
consisting of the one linear equation a = bx in one unknown.

Had the system been one of ordinary or partial differential equations,
there would be no argument that the solution space could turn out quite
oddly shaped.

Now when a and b are reals the solution space is a rectangle: only the
column indexed by b=0 is undefined.  This remains true when a and b are
extended to the complex numbers, or even to the quaternions.

But if you extend the domain to the algebra R(2) of 2x2 real matrices,
the columns indexed by singular matrices now lose some of their entries.
But not all, and so the solution space ceases to be rectangular.

Robinson believed that the way to make infinitesimals safe for analysis was
to make the solution space for a = bx rectangular.  Today's logicians are
magicians with logic: if logic indicates the impossibility of a rectangular
solution space, no need to abandon that goal, just bend logic until the
solution space does become rectangular.  The students will bend with you,
at least those who've approached nonstandard analysis with the proper
upbeat spirit about how much simpler analysis becomes when infinitesimals
can be objectified.  Power tools are wonderful.

To answer your question (or comment), in the system of refined numbers I
described, if b is infinitesimal and nonzero, a = bx is solvable if and only
if a is infinitesimal.  The division table is no longer rectangular.  So what?
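
The solvability claim can be made concrete with a small sketch, assuming the
commutative refined numbers x+yd encoded as pairs (real, infinitesimal).
The function names mul and solve_by_infinitesimal are hypothetical.  Since
(b1 d)(x0 + x1 d) = b1 x0 d, the equation a = bx forces the real part of a
to be 0, fixes x0, and leaves x1 unconstrained.

```python
from fractions import Fraction

def mul(p, q):
    """(p0 + p1 d)(q0 + q1 d) with d*d = 0, on pairs (real, infinitesimal)."""
    return (p[0] * q[0], p[0] * q[1] + p[1] * q[0])

def solve_by_infinitesimal(a, b):
    """Solve a = b*x when b = (0, b1) with b1 != 0.

    Returns the real part of x (its infinitesimal part is unconstrained),
    or None when no solution exists.
    """
    assert b[0] == 0 and b[1] != 0
    if a[0] != 0:
        return None               # b*x always has real part 0
    return Fraction(a[1]) / Fraction(b[1])

print(solve_by_infinitesimal((0, 1), (0, 1)))  # 1    (d/d)
print(solve_by_infinitesimal((0, 3), (0, 2)))  # 3/2  (3d / 2d)
print(solve_by_infinitesimal((1, 0), (0, 2)))  # None (1 / 2d has no solution)
```

Any choice of infinitesimal part for x satisfies the equation, for example
mul((0, 2), (Fraction(3, 2), 99)) gives back (0, 3).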

One might grumble that a = bx can't have an infinitesimal part when a and
b are both infinitesimals, but in the simple cases that's a plus.  In more
complicated cases, 2x2 matrices aren't enough, you need nxn matrices,
with distance of nonzero entries from the diagonal measuring the degree
of their infinitesimality (if that's a word).  In this case d^n = 0 only
for higher n's.

After thinking along those lines for a bit more the other day, I decided
that even though I liked this approach better than throwing ultrafilters at
it, it still wasn't as good as doing analysis in Boole's finite difference
calculus with h remaining unbound throughout, the approach I'd used since the
early 1970's.  That approach has the great advantage of being able to use the
same analysis in classical and quantum physics by setting h=0 to interpret a
result classically and setting it to Planck's constant to interpret the same
result quantumly.  As a case in point, the same integration formulas can
deliver areas under smooth curves and discrete summations of e.g. n^3, the
latter with h=1. (I already wrote a bit about that two or three messages ago.)
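
As an illustration of that last point, here is a sketch under my own reading
of the claim; the function names falling and antidiff_cube are assumptions.
It relies on the identity x^3 = x^(3) + 3h x^(2) + h^2 x^(1), where x^(n)
denotes x(x-h)...(x-(n-1)h), and on the fact that x^(n+1)/(n+1) is an
antidifference of x^(n) for step h.  The one formula gives the sum of cubes
at h=1 and the area under x^3 at h=0.

```python
from fractions import Fraction

def falling(x, n, h):
    """x (x-h) (x-2h) ... with n factors (the empty product is 1)."""
    result = Fraction(1)
    for k in range(n):
        result *= x - k * h
    return result

def antidiff_cube(x, h):
    """An h-antidifference of x**3, vanishing at x = 0."""
    return (falling(x, 4, h) / 4
            + h * falling(x, 3, h)
            + h * h * falling(x, 2, h) / 2)

n = 10
# h = 1: the formula returns 0^3 + 1^3 + ... + (n-1)^3
print(antidiff_cube(n, 1) == sum(k**3 for k in range(n)))  # True
# h = 0: the formula returns x^4/4, the area under x^3 on [0, x]
print(antidiff_cube(Fraction(2), 0))                        # 4
```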

The right power tools are even more wonderful.

Vaughan Pratt


* Re: Modeling infinitesimals with 2x2 matrices
From: Vaughan Pratt @ 2004-04-25  6:58 UTC
  To: categories

I'm told that Bell's "microlinear calculus" in his 1998 book on infinitesimals
is equivalent to the matrix approach I suggested, so that was not new after
all, other than perhaps its formulation in terms of 2x2 matrices.

On the other hand it is apparently mixed in with Bell's strongly
intuitionistic outlook, whereas it would seem intuitively that something
so simple as a model of d^2 = 0 should transcend whether one is working
intuitionistically or classically.  A more classical version of Bell's account
might be of interest (perhaps to relatively few people on the categories
mailing list though, which seems to have a strongly intuitionistic slant).

Meanwhile I received the April issue of Mathematics Magazine just now, and it
has an article on pp. 118-129 on "Geometry of Generalized Complex Numbers"
by Anthony and Joseph Harkin.  The microlinear calculus, under the names
"Study product" and "parabolic complex numbers," apparently dates back
to Study's 1903 book Geometrie der Dynamen.  The Harkins associate i^2 =
-1,0,1 with respectively Ordinary (i.e. complex) product, Study product,
and Clifford product (though Clifford algebras include ordinary product as
well, the quaternions being a Clifford algebra).

The article makes no mention of infinitesimals, and it would be interesting to
try to find the appropriate infinitesimal interpretations of the geometric
properties of the parabolic complex plane.

One approach I very much like to infinitesimals that I haven't seen in the
nonstandard analysis/infinitesimal literature (but would certainly appreciate
pointers) is one that does all the work with what one might call finitesimals.
A finitesimal h is just a positive real that you plan one day to reduce to
zero, and thus organize everything around it to that end.

Polynomials in R[x] of degree at most d form a (d+1)-dimensional vector space.
The usual basis for this space is the d+1 monomials x^i for i in 0..d.
However if one fixes h > 0 and takes the basis to be 1, x, x(x-h),
x(x-h)(x-2h),... then Boole's difference calculus works essentially
identically to the infinitesimal calculus for polynomials represented in
the monomial basis.  Since h is a free variable throughout the development,
one can do all the work first and then drive h to 0 uniformly everywhere
at the end.  Expressions such as x^i (Knuth writes an underbar under the i
and calls it "x to the falling i") mention h only implicitly and hence don't
change (as symbolic expressions) as h changes, though their numerical values
at any given x change.  The Stirling numbers of the first and second kind,
organized as matrices, constitute linear transformations from the bases
for h=1 to h=0 and back again, respectively.
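
Two of these claims are easy to test exactly.  The sketch below uses
illustrative names (falling, diff, stirling2) and exact rational arithmetic.
It checks that the difference quotient with step h sends the n-factor
product x(x-h)...(x-(n-1)h) to n times the (n-1)-factor product, mirroring
d/dx x^n = n x^(n-1), and that the Stirling numbers of the second kind
rewrite the monomial x^n in the h=1 basis.

```python
from fractions import Fraction

def falling(x, n, h):
    """x (x-h) (x-2h) ... with n factors (the empty product is 1)."""
    result = Fraction(1)
    for k in range(n):
        result *= x - k * h
    return result

def diff(f, x, h):
    """Difference quotient (f(x+h) - f(x)) / h for a fixed h > 0."""
    return (f(x + h) - f(x)) / h

def stirling2(n, k):
    """Stirling number of the second kind, by the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

x = Fraction(7, 3)
for h in (Fraction(1), Fraction(1, 2), Fraction(1, 1000)):
    lhs = diff(lambda t: falling(t, 4, h), x, h)
    print(lhs == 4 * falling(x, 3, h))   # True for every h

n = 5
print(x**n == sum(stirling2(n, k) * falling(x, k, 1) for k in range(n + 1)))  # True
```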

I've looked from time to time at how one might extend this to exponentials
and logarithms, but have never been satisfied with the results.  It would be
nice to know how to deal exactly with exp(it) for nonzero h.  If this were
possible it might give an even nicer constructive treatment of infinitesimals
than the others, and one that didn't care at all whether one was classically
or intuitionistically inclined.

Vaughan Pratt


* Re: Modeling infinitesimals with 2x2 matrices
From: Vaughan Pratt @ 2004-04-24 22:46 UTC
  To: categories

Correction to my suggestion id+di = 0.  Don't do it.  id = di is fine
as it stands for refined complex numbers, which should be represented in
C(2) = 2x2 complex matrices (embeddable in R(4) - 4x4 real matrices)
as the obvious extension of the refined reals x+yd.

I shouldn't have been so smug about 4D Clifford algebras, this algebra of
refined complex numbers doesn't satisfy d^4 = 1, needed if d is to be a
Clifford generator.

And in fact although di =
1 0
0 0
we have id =
0 0
0 1
(I should have checked that more carefully.)
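
The two products can be confirmed with a few lines of Python.  This is only
a sketch, using the real 2x2 matrices for i and d from the first message;
the helper matmul and the variable names are illustrative.  The sum id+di
comes out as the identity matrix rather than zero.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    return tuple(
        tuple(sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

I = ((0, -1), (1, 0))
D = ((0, 1), (0, 0))

di = matmul(D, I)
id_ = matmul(I, D)
total = tuple(tuple(di[r][c] + id_[r][c] for c in range(2)) for r in range(2))

print(di)     # ((1, 0), (0, 0))
print(id_)    # ((0, 0), (0, 1))
print(total)  # ((1, 0), (0, 1))
```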

I thought about trying to make the infinitesimals points on the "light cone"
of R(2) (the singular matrices) but couldn't get that to work.  So 2x2
complex matrices with id = di is the best I could think of.  This works
for modeling the refined complex numbers (barring any other errors), but
with nothing left to motivate  id+di = 0.

The representation x+iy+dv+idw is fine: with idw = diw = wid etc., everything
is commutative.  (I was hoping too hard for the excitement of noncommutativity;
this is boringly noninteractive as it stands.)

Vaughan


Thread overview: 6+ messages
2004-04-24  6:45 Modeling infinitesimals with 2x2 matrices Vaughan Pratt
2004-04-24 22:46 Vaughan Pratt
2004-04-25  6:58 Vaughan Pratt
     [not found] <s.j.vickers@cs.bham.ac.uk>
     [not found] ` <408CCCAA.9090404@cs.bham.ac.uk>
2004-04-26 16:54   ` Vaughan Pratt
2004-04-28  5:13 Vaughan Pratt
2004-04-29  0:54 John Baez
