Sunday, May 20, 2018

Dual Space

I used to not understand dual space. I still don't, but I used to not, also.

But I kid Mitch Hedberg. I've recently been trying to read Carroll's book on relativity, and if you know the slightest anything about GR you know you smack into full florid tensor calculus almost immediately. (Imagination is more important than knowledge my furry ass. I want every sullen weeaboo posting that bon mot on Pinterest to write out the covariant derivative a hundred times.)

I grok parts of the tensor machine, but my current tensor bête noire is dual space. Here the mathozoids have truly attained some unholy capstone of obfuscation. Every account I've come across reads like an elaborate hoax. Like git or the Trump presidency. Like a gaggle of Soho hipsters gushing about a blank canvas hanging in the MoMA.

There exist entire books on dual space that never explain dual space. They simply offer no psychological footholds. Understanding dual space is like trying to climb El Capitan wearing oven mitts.



Here's what Wikipedia has to say about dual space:

In linear algebra, given a vector space V with a basis B of vectors indexed by an index set I (the cardinality of I is the dimensionality of V), its dual set is a set B* of vectors in the dual space V* with the same index set I such that B and B* form a biorthogonal system. The dual set is always linearly independent but does not necessarily span V*. If it does span V*, then B* is called the dual basis or reciprocal basis for the basis B.

Dear Whoever Fingerbarfed This Onto Wikipedia: This isn't an explanation, this is a definition. For this to qualify as an explanation, you must add additional mouth-type sounds in go-in-your-eyeball form sufficient to cause the thinky bing bong happening in your noggin to take root in the reader's.

I suppose I could try to patch this up, try to translate math-speak for persons not living on the 13th plane of the Balrog. Instead, let's just start fresh.

I will begin with imprecise language and fix it after. So for now, humor me.

There is the thing, and there is the description of the thing. Of the former, there is but one. Of the latter, there can be many.

Example: I describe LabKitty as a lovable scamp. You describe LabKitty as a bundle of crippling insecurities wrapped in a thin veneer of nerd humor. Google describes LabKitty as a website that doesn't pull enough traffic to monetize.

Yet, there is but one LabKitty.

Thing and description of the thing.

However, our topic today isn't Hayakawian wanking, it's mathematics. Ergo, for "thing" let us instead say vector and for "description of the thing" let us instead say the components of the vector in a chosen coordinate system.

The difficulty is there are as many coordinate systems as there are people to choose one. You might choose Cartesian and I might choose oblique and Google might choose polar. Each of us then computes two* numbers to describe a given vector -- the components (* for simplicity, I'm assuming two-dimensional space in our chat today). My components are different than your components. Yet, the vector is the same vector.

Dual space is how we convert all this hand waving into a machine. Let's demonstrate with an example. To do so, it behooves us to use the simplest coordinate system in which dual space manifests. Dual space vanishes in Cartesian coordinates (we will revisit this factoid later). So Cartesian is out. Polar and the other bendys involve lots of trig. Yuck. So imma pick oblique.

Nobody uses oblique coordinates if they don't have to, so you may have not run into them. They're not complicated, they just suck. Oblique is like Cartesian, except not orthogonal. Have a picture:

[Figure: the standard Cartesian basis (left) and an oblique basis (right)]

In the left panel, the standard Cartesian basis. In the right panel, an oblique basis. I use e1 and e2 for basis vectors. In standard Cartesian, we have e1 = (1,0) and e2 = (0,1). In oblique, we have whatever. In this example, I picked e1 = (1,0) and e2 = (1,1). The angle between oblique basis vectors can be anything except zero or 180 degrees (that is, they must be linearly independent), and their lengths can be anything except zero. However, they are a perfectly respectable basis.
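
If you want to follow along at a keyboard rather than on paper, here's a minimal numpy sketch (Python is my choice here; nothing in the post depends on it) that sets up both bases and confirms the oblique pair really is a basis -- the matrix holding them has nonzero determinant, so they're linearly independent and span the plane:

    import numpy as np

    # Standard Cartesian basis
    e1_cartesian = np.array([1.0, 0.0])
    e2_cartesian = np.array([0.0, 1.0])

    # The oblique basis used in this post
    e1 = np.array([1.0, 0.0])
    e2 = np.array([1.0, 1.0])

    # Two vectors in the plane form a basis iff the matrix holding them
    # as columns has nonzero determinant (i.e., they are linearly independent)
    E = np.column_stack([e1, e2])
    print(np.linalg.det(E))    # 1.0 -- nonzero, so a perfectly respectable basis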

The task at hand is determining the components of a given vector -- let's call it v -- in our selected basis:

[Figure: the vector v drawn over the Cartesian basis (left) and the oblique basis (right)]

Here I picked v = (2,1). Finding the components of v in Cartesian is trivial -- you just read them off. So we have v = (2,1), aka v = 2 e1 + 1 e2.

In contrast, obtaining the components of v in oblique coordinates requires some work. As any competent linear algebra book will tell you, we first define new vectors e*1 and e*2 satisfying

    e*i • ej = δij

where δij is the Kronecker delta -- that is, δij = 1 for i = j and 0 otherwise. (This equation is not as mysterious as it looks -- in two dimensions it boils down to two equations in two unknowns for each dual vector, as I will soon demonstrate.) We can then express v in the oblique basis as:

    v = Σ (v • e*i) ei

Easy peasy. Turning to our example, we have:

    e*1 • e1 = (e*11)(1) + (e*12)(0) = 1
    e*1 • e2 = (e*11)(1) + (e*12)(1) = 0

(I'm writing e*ij for the jth component of e*i. I don't write this in bold because it's just a number.) Solving simultaneously, we find e*1 = (1,-1).

Similarly:

    e*2 • e1 = (e*21)(1) + (e*22)(0) = 0
    e*2 • e2 = (e*21)(1) + (e*22)(1) = 1

and we find e*2 = (0,1). We now use these to find the components of v in the oblique basis:

    v • e*1 = (2)(1) + (1)(-1) = 1
    v • e*2 = (2)(0) + (1)(1) = 1

or v = 1 e1 + 1 e2. That is, v = (1,1) in our oblique basis.

The good news is once we construct e*1 and e*2, we can find the components of any vector in the space -- call it u -- using u • e*1 and u • e*2. Quite the labor-saving device, I think you would agree.

The vectors e*1 and e*2 are called dual vectors.
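
Here is the same calculation done by machine -- a numpy sketch of my own, not anything the derivation above requires. The defining condition e*i • ej = δij, written in matrix form, says that the matrix whose rows are the dual vectors, times the matrix whose columns are the basis vectors, equals the identity. So the dual vectors are just the rows of the inverse of the basis matrix:

    import numpy as np

    # Oblique basis from the example, as columns of a matrix
    e1 = np.array([1.0, 0.0])
    e2 = np.array([1.0, 1.0])
    E = np.column_stack([e1, e2])

    # e*_i . e_j = delta_ij in matrix form reads D @ E = I, where the rows
    # of D are the dual vectors. Hence D = inv(E).
    D = np.linalg.inv(E)
    print(D[0], D[1])          # [ 1. -1.] [0. 1.]  -- i.e., e*1 and e*2

    # Components of v in the oblique basis: dot v with each dual vector
    v = np.array([2.0, 1.0])
    print(D @ v)               # [1. 1.]  -- i.e., v = 1 e1 + 1 e2

    # The labor-saving part: the same D works for any other vector u
    u = np.array([-1.0, 3.0])
    print(D @ u)               # [-4. 3.] -- check: -4 e1 + 3 e2 = (-1, 3)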

Now, if you worked lots of this kind of problem, you might notice something: The dual vectors exhibit a systematic relationship to the basis vectors. First, e*1 is always orthogonal to e2 and e*2 is always orthogonal to e1. Second, the dual vectors scale in inverse proportion to the basis vectors: e*1 gets shorter if e1 gets longer, and so on.
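
A quick numerical check of both claims (again, a sketch of mine; the choice of numbers is arbitrary): the dual stays orthogonal to the "other" basis vector, and doubling a basis vector halves its dual partner.

    import numpy as np

    def dual(e1, e2):
        # Rows of the inverse of the basis matrix are the dual vectors
        return np.linalg.inv(np.column_stack([e1, e2]))

    e1 = np.array([1.0, 0.0])
    e2 = np.array([1.0, 1.0])
    D = dual(e1, e2)

    # Orthogonality: e*1 . e2 = 0 and e*2 . e1 = 0
    print(D[0] @ e2, D[1] @ e1)                          # 0.0 0.0

    # Inverse scaling: double e1 and the new e*1 is half as long
    D2 = dual(2 * e1, e2)
    print(np.linalg.norm(D[0]), np.linalg.norm(D2[0]))   # 1.414... 0.707...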

I whipped up a little Javascript app to illustrate this:


[Interactive app: a plot of the basis vectors, their duals, and the vector v, with three readout lines beneath]

The app generates a random oblique basis and its dual and allows you to manipulate them. The basis vectors are in blue and the dual vectors are in red.

The three lines of text beneath the plot refer to the fixed vector v, drawn in black. The first line lists the components of v in the oblique basis (this should confirm the example shown above unless you've already clicked some buttons). The second line expresses the components of v using the dual as a basis. You can obtain these components by the same sort of calculation given above, just reversing the roles of e and e*.

Finally, the third line lists the dot product of v with itself and v with its dual. (Just to be clear, "v" in these expressions means v expressed in the basis and "v*" means v expressed in the dual -- lines one and two, respectively.) These results will become important later.

Generate a random oblique basis by pressing the "random" button. Note the dual vectors. Rotate the basis. Note how the dual rotates to maintain orthogonality with the basis. Also note the components of v change as you tweak the basis, but it's always the same v.

Now, scale the basis larger or smaller. Note how the dual scales opposite, as do the components of v. If your basis vectors become twice as long, then your components describing a given vector must become half as big. Why? Because same vector.

Once you have squeezed all possible fun from such activity, continue reading.

The dual vectors are a labor-saving device. But there is more. Note we obtain as many dual vectors as there are basis vectors. Note also the dual vectors are linearly independent (assuming the basis vectors are). Ergo, the dual vectors span the space (R2 in our examples). That is, the dual vectors also form a basis.

Except that's not quite correct. I should have written the preceding paragraph like so:

The dual "vectors" are a labor-saving device. But there is more. Note we obtain as many dual "vectors" as there are basis vectors. Note also the dual "vectors" are linearly independent (assuming the basis vectors are). Ergo, the dual "vectors" span the space (R2 in our examples). That is, the dual "vectors" also form a basis.

See what I did? What's with the scare quotes?

This is where things get weird.

The thingys that live in dual space seem like vectors and I have drawn them like vectors, but they're not vectors. Or rather, they're "vectors" in the abstract sense (objects that are closed under addition and scalar multiplication blah blah), but they're not the arrow-type vectors you know and love. Why not? Well, look how we used the dual: We dotted it with a given vector to find the given vector's components in the basis associated with the dual.

Here you haz a confuse. What makes a dot product if not two vectors??

Brace yourself for some tough love.

Your whole life you have been taught that we take the dot product of two vectors. That is a lie. A simplifying lie, a convenient lie, but a lie nonetheless. Yes, in component form, we write a • b = Σ ai bi. I will have more to say about this notation later. But if you think back to treating a vector as a column of numbers, you will recall a • b really means aᵀ b. The first player in that expression (aᵀ) isn't a vector. It's the transpose of a vector. That transpose is actually a dual.

This is not merely terminology. Consider again our vector v = (2,1). Its length is √5. We get the length by dotting the vector with itself and then taking the square root. The result must be the same no matter what coordinate system we use. The length is the length is the length.

In Cartesian we have | v |² = v • v = Σ vi vi = (2)(2) + (1)(1) = 5. So | v | = √5. So far so good.

Now consider oblique. We just found v = (1,1) in oblique. But (1)(1) + (1)(1) = 2, and √2 ≠ √5. Something has gone wrong.

What has gone wrong is we should have computed the dot product using the dual of v. Or, using our previous notation: | v |² = v* • v. In Cartesian (and only in Cartesian), the components of v* are numerically the same as the components of v. In oblique, however, the components of v in the dual basis are (2,3) -- obtained by the same sort of calculation as before, with the roles of e and e* reversed -- so we have: v* • v = (2)(1) + (3)(1) = 5. Taking the square root now gives the correct length.
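
In code, the whole length business looks like this (my sketch, same example as above): the naive sum of squares of the oblique components gets the length wrong, and pairing the basis components with the dual components gets it right.

    import numpy as np

    e1 = np.array([1.0, 0.0])
    e2 = np.array([1.0, 1.0])
    E = np.column_stack([e1, e2])    # basis vectors as columns
    D = np.linalg.inv(E)             # rows are the dual vectors

    v = np.array([2.0, 1.0])         # v in Cartesian coordinates
    v_basis = D @ v                  # v in the oblique basis: [1. 1.]
    v_dual = E.T @ v                 # v in the dual basis: [2. 3.]
                                     # (dot v with e1 and e2 -- roles reversed)

    print(v_basis @ v_basis)         # 2.0 -- naive sum of squares, wrong
    print(v_dual @ v_basis)          # 5.0 -- |v|^2, so |v| = sqrt(5), correct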

That's the gist of dual space. It's a kind of shadow your basis vectors drag around. You use it to do useful things like compute components and take dot products, retaining (more or less) familiar formulas to do so. Or, taking the opposite view, you've always been using dual space; you just didn't know it. The dual vanishes in standard Cartesian coordinates. If you chase the math around, you would find e*1 = e1 = (1,0) and e*2 = e2 = (0,1). That is, the dual is hidden under the basis vectors. It's like your shadow at noon -- directly under your feet so you don't notice it. That's another reason dual space seems mysterious, what with Cartesian being the pizza of basis vectors. You don't bump into dual space until/unless you engage more exotic coordinate systems.

Long story short: If your basis isn't orthonormal, the basis vectors start unwelcomely inserting themselves into your calculations. We need a way to uninsert them. That way is dual space. It's a tool that systematically accounts for the length and orientation of basis vectors so that operations like the dot product give correct answers.

Epilogue

Tensor analysis (which you'll recall is how we got into this mess) uses special notation to keep vectors and duals organized. We continue to use subscripts for dual components but switch to superscripts for vector components. As such, we don't write v* • v = Σ vᵢ vᵢ but rather v* • v = Σ vᵢ vⁱ (also, the summation sign is usually omitted in tensor equations -- a practice introduced by Einstein -- so we really just write v* • v = vᵢ vⁱ). Instead of "taking the transpose," we speak of "lowering the index."

The product vᵢ vⁱ may look like a trivial modification of the familiar dot product formula, but it contains a profound truth. Glossing over many (many) details, tensor analysis demands that quantities with a raised index transform opposite to quantities with a lowered index -- vectors and dual "vectors" being but one example. The former are called contravariant and the latter covariant. When two such quantities appear together in an expression (e.g., a dot product), the result remains correct no matter what coordinates we switch to.
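
For the concretely minded, here's one way to render that index bookkeeping in numpy (my sketch; the metric tensor isn't introduced in this post, so consider it a spoiler for the tensor machinery): the matrix g with entries gij = ei • ej lowers an index, and the Einstein-summed product of lowered components with raised components reproduces | v |² no matter the basis.

    import numpy as np

    e1 = np.array([1.0, 0.0])
    e2 = np.array([1.0, 1.0])
    E = np.column_stack([e1, e2])

    # Metric tensor: g_ij = e_i . e_j
    g = E.T @ E                          # [[1. 1.], [1. 2.]]

    v = np.array([2.0, 1.0])             # v in Cartesian coordinates
    v_up = np.linalg.inv(E) @ v          # contravariant components v^i: [1. 1.]
    v_down = g @ v_up                    # covariant components v_i = g_ij v^j: [2. 3.]

    # The contraction v_i v^i (summation sign implied, per Einstein)
    print(np.einsum('i,i->', v_down, v_up))   # 5.0 -- |v|^2 in any basis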

Killjoy math professors don't much care for writing "vector" with scare quotes, so instead they invented incomprehensible terminology. A normal person would say "we dot a dual with a vector." Mathozoids say "the dual operates on a vector to give a scalar." That is, a dual vector isn't a vector, it's a function (a functional, to be pedantic, or linear functional to be annoyingly so). It eats a vector and poops a number. That's why textbooks define the dual as a space of linear functionals. As an explanation, that definition is useless. But, hopefully, it now makes sense having crept up on it one step at a time.

Dual space isn't complicated and mysterious; it's just usually unnecessary and almost always horribly explained.
