A Hermitian Matrix Has Real Eigenvalues

When I studied math, I tended to find myself more interested in the “continuous” side of things rather than the discrete. This is calculus and analysis and such, in contrast to things like logic, abstract algebra, number theory, graphs and other things where everything is rather chunky. I suppose I was largely motivated by the use of calculus in physics, which was usually my main focus. It’s easy to see that calculus and continuous functions are essential when studying physics. It’s not as easy to argue for number theory’s direct applicability to an engineer.

Sometimes I feel a bit ashamed about this interest. Calculus and analysis are the next logical steps after real number algebra. One could argue I didn’t allow myself to expand my horizons~~! But, whatever, who cares. They’re still cool. OK? THEY’RE COOL.

It’s very easy to claim that you’re interested in something. It’s almost as easy to hear someone talk about how they’re a fan and try to call them out on it by asking about some trivia. This is often the crux of an argument in the whole “Fake Geek Girl” and Gamergate things.  Similarly, sometimes, it feels like everyone around me is nuts about Local Football Home Team, and I often find myself skeptical of the “purity” of their fanaticism. I need to remind myself that someone can enjoy something and not know everything about it. You can be interested in Watching Football Match and not know the names of everyone (or even anyone) on the team.

The same is true for something like math. It had better be true, since there’s always another thing we could define or discover, and all of the fields already developed aren’t completely understood by a single person. It’s fine to be more interested in one thing rather than another. If you take that too far, you’d end up criticizing people for not being interested in every single thing in existence equally.

It’s all right to wonder if I should look into a certain topic, though. A few years ago, a colleague teaching introductory calculus to high school seniors mentioned that they were working on delta-epsilon proofs in the class. This blew me away! The concept of a limit is usually introduced in a pre-calculus class, or the beginning of a calculus class. I am under the impression that it’s usually a matter of, “yeah, this function gets really close to this point, but doesn’t actually hit it,” and that’s all a student really needs to know until they do some analysis. A delta-epsilon definition is a way to formally define a limit, so there is no uncertainty as to what is actually happening. “Gets really close” ends up being defined exactly. Basically, it says, “For this function f(x), gimme any number bigger than zero, even a SUPER tiny number, and I can give you a distance away from x=c such that, within that distance, f(x) is never further from a limiting value L than your number.”

Okay, maybe that is not super enlightening. But on a side note, it’s fun to think about how much like a playground taunt that is.

I was ready to argue that his students didn’t need to bother with delta-epsilon proofs, that they could learn to work with the fuzzy idea of a limit first and get along just fine in calculus, just as I had. But, I did start to doubt myself. Should I have learned the definition of a limit before trying to use it, in my hubris?

In retrospect: no, that’s silly. Looking at the epsilon-delta definition, I realize it would have taken me ages to get through it, taking away from valuable time spent thinking about the applications of calculus. But, that feeling is there. Should I have known about this?

What does this have to do with Hermitian matrices, you demand, grabbing me by the shoulders and shaking, while my cravat soaks up your spittle.

I had this same feeling this week, when thinking about a topic in linear algebra and physics. In quantum mechanics, matrices are used extensively to describe certain kinds of actions that could be done to a particle or a system. One of those actions could be to take a measurement, such as the amount of energy in the system. If you call your matrix H and your particle current state \psi, then you could represent a measurement of how much energy the system has by multiplying them together.


When you multiply them together, you can get a single number \lambda times the particle state \psi as a result:

H\psi = \lambda\psi

If you measured the energy of the particle, then the value of \lambda is that energy. You can’t just divide both sides by \psi because a particle’s state is a vector of many possibilities, and division by vectors rather than numbers doesn’t mean a whole lot here. (You can multiply both sides by a vector transpose and get into something called Dirac notation, but don’t bother with that now.)

The number \lambda and the state $\psi$ are called eigenvalues and eigenvectors of…

Is this worth describing? If you don’t know this, it might be incomprehensible. It turns out that if H is Hermitian, meaning, it is self-adjoint:

H_{ij} = \bar{H_{ji}},

then it always has real eigenvalues (as opposed to $\lambda$ being a complex number). Physicists interpret this to mean the matrix definitely represents a physical measurement. Hermitian matrices aren’t the only ones with real eigenvalues, but they also have a property that lets you be sure you’ve measured your particle as being in a certain state.

I’ve seen proofs that Hermitian matrices have real eigenvalues. Here are a couple. These start by assuming there is some eigenvalue/eigenvector pair, and using the fact that a vector magnitude is real at some point.

Finding the eigenvalues of a matrix is a matter of solving an equation which involves a determinant:

\det(H-\lambda I) = 0,

where I is the identity matrix. I thought, could I use an expanded determinant to show that the eigenvalues have to be real?

This isn’t so bad with a 2×2 matrix. With

H = \left[\begin{array}{cc} a & b+ci \\ b-ci & d\end{array}\right] ,

the characteristic equation is

0 = \det(H-\lambda I)

= \left| \begin{array}{cc} a-\lambda & b+ci \\ b-ci & d-\lambda\end{array}\right|

= (a-\lambda)(d-\lambda) - (b^2 +c^2)

= \lambda^2 - (a+d)\lambda +ad - (b^2+c^2)

You can show the two solutions for \lambda have to be real by shoving all these numbers in the quadratic formula. The discriminant (remember that?) can’t be negative because it ends up being a sum of squares, (a-d)^2 + 4(b^2+c^2). I’m bored.
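As a quick sanity check on the 2×2 case, here is a minimal Python sketch that plugs the characteristic equation into the quadratic formula. The function name `hermitian_2x2_eigenvalues` and the sample numbers are made up for illustration; the only thing taken from the text is the matrix with real a, b, c, d.

```python
import math

def hermitian_2x2_eigenvalues(a, b, c, d):
    """Eigenvalues of [[a, b+ci], [b-ci, d]] with a, b, c, d real."""
    # Characteristic polynomial: lambda^2 - (a+d)*lambda + ad - (b^2+c^2)
    trace = a + d
    # (a+d)^2 - 4(ad - b^2 - c^2) rearranges to a sum of squares,
    # so the square root below is always of a non-negative number.
    discriminant = (a - d) ** 2 + 4 * (b * b + c * c)
    root = math.sqrt(discriminant)
    return (trace - root) / 2, (trace + root) / 2

# Hypothetical sample matrix: a=2, b=1, c=-3, d=5
low, high = hermitian_2x2_eigenvalues(2.0, 1.0, -3.0, 5.0)
print(low, high)
```

Because `math.sqrt` only ever sees a sum of squares here, it never raises a domain error, which is the "real eigenvalues" claim in miniature.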

My thought after this point was to use mathematical induction. Suppose we’ve shown that an n\times n Hermitian matrix has real eigenvalues. Let’s show that an (n+1) \times (n+1) one does as a consequence.

Maybe this is doable, but I haven’t done it. It occurred to me that the submatrices behind the cofactors of the diagonal entries of a Hermitian matrix (delete the row and column of a diagonal entry) would themselves be Hermitian, and a proof by induction would rest on this.

H = \left[ \begin{array}{cccc}h_{11} & h_{12} & \dots & h_{1,n+1}\\ h_{21} & h_{22} & & h_{2,n+1} \\ \vdots & & \ddots & \vdots \\ h_{n+1,1} & \dots && h_{n+1,n+1} \end{array}\right]

= \left[ \begin{array}{cccc}h_{11} & h_{12} & \dots & h_{1,n+1}\\\\ \overline{h_{12}} & h_{22} &  & h_{2,n+1} \\\\ \vdots & & \ddots & \vdots \\\\\overline{h_{1,n+1}} & \dots && h_{n+1,n+1} \end{array}\right]

My thought was… can you construct a determinant using only cofactors of the diagonal entries?

A 4×4 Hermitian matrix. Each matrix made by removing all entries in the same row and column of a diagonal entry is also Hermitian.

This turned out to be not helpful in an embarrassing way. I asked myself, can you calculate a determinant by expanding over a diagonal, rather than a row or column? I was able to convince myself no, but the fact that I considered it at all seemed messed up. Shouldn’t that be something I should know by heart about determinants?

Similar to a student using limits without knowing the delta-epsilon definition, I realized that I don’t have a solid grasp of what determinants even are, although I’ve used them extensively. It felt like I was missing a huge part of my mathematics education. I don’t think I had ever proven any of the properties of determinants I had used in a linear algebra course.

I didn’t even know how to define a determinant. In high school, we learned how to calculate a 2×2 determinant. We then learned how to calculate determinants for larger matrices, using cofactors (although we didn’t use that word). But I didn’t (and still, don’t really) know what it was.

This doesn’t look unusual. I’ve got three books on linear algebra next to my desk here.

DeFranza and Gagliardi start by defining a 2×2 determinant as just ad-bc. It then tells how to calculate a 3×3 determinant, and then how to calculate larger determinants using cofactors. This seems in line with how I was taught (although this isn’t a coincidence. I took Jim DeFranza’s linear algebra class). The useful properties of determinants come later.

Zelinsky starts off (on page 166 of 266!) with an entire chapter on all of the algebraic properties we’d like determinants to have. It waits 11 pages to actually give explicit formulas for 2×2, then 3×3 matrices. It isn’t until after that that it gives something that feels like a definition.

Kolman starts with this definition right away:

Let A be an n \times n matrix. We define the determinant of A by

|\mathbf{A}| = \sum (\pm) a_{1j_1}a_{2j_2}\dots a_{nj_n}

where the summation ranges over all permutations j_1j_2\dots j_n of the set S = \{1,2,\dots, n\}. The sign is taken as + or – according to whether the permutation is even or odd.
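To see that this permutation sum really is the same thing as the cofactor recipe, here is a small Python sketch of it. The function names are my own labels, and deciding even or odd by counting inversions is just one convenient way to get the sign; it is not taken from Kolman.

```python
from itertools import permutations

def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def determinant(matrix):
    """Determinant as a signed sum over all permutations of the column indices."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        # One entry from each row, with the columns given by the permutation
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

print(determinant([[1, 2], [3, 4]]))                   # the familiar ad - bc
print(determinant([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

The sum has n! terms, so this is only sensible for tiny matrices, which is probably part of why the cofactor version gets taught first.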

Woah. This seems like something fundamental. I had only known determinants as, effectively, recurrence relations. This is a closed form statement of the matrix determinant. Why don’t I recognize this?

Really, though, I can see why this might not be commonly taught. It’s even more cumbersome. But it feels like I missed the nuts and bolts of determinants while using them for so long.

That definition seems ripe for a Levi-Civita symbol.

It’s probably not worth making most people trudge through millions of subscripts. That’s sort of the MO of mathematics, right? You make an easy-to-write definition or theorem, and then use that to save time and energy.

Maybe I’ll try to show Hermitian matrices have real eigenvalues using that definition. Descartes’ rule of signs or Sturm’s theorem might help. But I’m sleepy. So maybe later.


The Martian Tripod Problem and Transcendental Functions

I thought this week about a problem I originally considered about ten years ago. I imagined a source of a laser beam, mounted high up above the ground, shining straight down, and allowed to rotate upwards at a constant rate until it was shining horizontally. The point at which the laser beam touched the ground would travel from directly below the source to the horizon.


The Martian Tripod Problem: What is the location where the beam strikes the ground as a function of time?

The question is, if I know where the source of the laser is, and how quickly it is rotating, can I know exactly where the beam strikes the ground over time?

The problem came about, I’m sure, as I was listening to Jeff Wayne’s Musical Version of the War of the Worlds, imagining beams of Martian heat rays mounted to towering tripods sweeping across the hull of the Thunder Child.

Trigonometry: What’s the length of a side of a right triangle with a constant height?

This is not so bad of a problem when only the geometry is considered. For now let’s call the angle that the laser makes with the “straight down” direction (the vertical) “omega-t”: \omega t. With t a length of time since the laser started shining, we can see that \omega is a sort of speed — when I multiply it by a length of time t, it gives a total angle, which is like a distance. The product \omega t works the same way that 30 miles per hour times 2 hours is 60 miles. In physics we’d call this speed of rotation \omega the angular speed, or angular velocity if you’re considering rotations in all three dimensions. For now, it doesn’t matter what the value of this speed of rotation is.

Setting up the laser at an angle \omega t from the vertical and a height H from the ground, we find it shines at a point H \sec(\omega t) away, and a horizontal distance H \tan(\omega t) to the right.

A laser at height H, at angle \omega t with the vertical, shines at a point H\tan(\omega t) to the right and a full distance H\sec(\omega t) away.

If you’re not so used to working with trig functions, you could get to the image above by first setting up the “classic” trig diagram, with the point a distance (hypotenuse) H away:

A scaled version of the previous image. Divide all lengths by cosine to get a constant height of H.

The above image has all the right ratios, but keeps the hypotenuse constant, not the height. Divide all the lengths by \cos(\omega t), and remember that secant = 1/cosine (by definition).

We’re already done. The point at which the laser beam hits the ground is

\bigg( H \tan (\omega t) , 0 \bigg)

Tangent “blows up” to infinity at $\pi/2$, which corresponds to the laser shining parallel to the ground. It intersects the flat ground an infinite distance away, at the horizon. Hopefully this agrees with your expectations; tangent is defined to act this way.

The Tripod Problem: Incorporating the speed of light

So that’s not super interesting. The real “tripod problem” was this: Where is the point of intersection if the speed of light isn’t infinite? If it takes some time for the laser beam to travel from the source to the ground, and the laser continues to rotate, then the location where the beam strikes will lag behind the orientation of the laser emitter.

This results in a “floppy” trajectory of the laser beam, drooping down to the ground.

A rough estimate of the shape of the laser beam given a very slow speed of light, a very quickly rotating heat ray, or a very tall tripod. Directly underneath the source, the movement of the intersection of the trajectory and the ground is dominated by the rotation of the laser. Far away from the source, the motion of the point on the ground is dominated by the speed of light.

The behavior of the point where the laser strikes the ground is very different with the speed of light restriction. It never reaches the horizon in finite time — the behavior for long lengths of time, and far distances, is totally dominated by the speed of light. It should travel along the ground at a speed approaching c.

The way the pointer location moves depends little on the speed of light directly under the source, where the distance to the ground doesn’t depend strongly on the angle, and a wider angle of the laser covers a small length. As the laser approaches the horizontal, the length covered by each small change in angle increases. The light that will eventually strike very far distances looks more like a point source, since the small angles will be covered in a very short length of time. The trajectory of the beam itself, while it’s still in the air, will look more and more like an expanding circle with time.

When I first mentioned the tripod problem to a friend recently, he had the insight of saying that a laser’s point could definitely travel faster than the speed of light. He could shine a laser at one side of the moon, and then the other. A quick enough rotation on a far enough canvas could result in a pointer appearing to travel faster than the speed of light. Remember, this doesn’t violate anything in relativity. No object is traveling faster than light, rather, a series of events in which different photons strike the distant moon are occurring. This situation is very much like that when the tripod laser is pointed nearly downwards. The speed of the pointer is dominated not by the speed of light, but by the rate of the laser rotation because the distance the light has to travel doesn’t change much when the laser is pointed directly downwards (or from one side of the moon to the other).

This suggests that the speed of the pointer, at very far distances from the tripod, would approach c from slower speeds if the laser were rotating slowly. But, if the laser were rotating quickly enough, it could counterintuitively approach c from faster speeds.

Anyway, let’s try to deal with the problem. Take a look at our trig diagram again.

The laser beam has to travel a distance H\sec(\omega t).

When a certain portion of the laser beam is emitted at a time t (and an angle \omega t), it has to travel a distance H\sec(\omega t). Traveling this distance at speed c takes a length of time equal to

\cfrac{H}{c}\ \sec(\omega t).

Any portion of the heat ray travels in a straight line. Although the beam as a whole is curved, we’re still assuming it’s always traveling directly away from the source (no diffraction, etc.). The laser travels in the same direction and therefore strikes the ground at the point

\big( H\tan(\omega t),0 \big)

at a later time

t^\prime = t +  \cfrac{H}{c}\ \sec(\omega t)

This seems great. We have the basis for a complete understanding of the position of the laser pointer (or toasted Edwardian human) given some time. A portion of the laser, emitted at time t, will strike the ground a horizontal distance H\tan(\omega t) away not at t but at t^\prime above. This allows us to find the location corresponding to any time of emission in the interval

0 \leq t < \cfrac{\pi}{2\omega}.

If we were satisfied with this, the game plan would be to pick a time of emission t, determine how long that portion of the beam traveled, and then pair up the resulting t^\prime with the distance H\tan(\omega t).
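The pairing of emission time with strike time also gives the pointer's speed along the ground without solving for anything, through the chain rule dx/dt' = (dx/dt)/(dt'/dt). The sketch below is a rough check of the earlier guess about approaching c from above or below. The function name and the sample values of H, \omega and c are assumptions chosen for illustration.

```python
import math

def pointer_speed(t, height, omega, c):
    """Speed of the strike point along the ground, dx/dt', for emission time t."""
    angle = omega * t
    sec = 1 / math.cos(angle)
    tan = math.tan(angle)
    dx_dt = height * omega * sec * sec                 # d/dt of H tan(omega t)
    dtprime_dt = 1 + (height * omega / c) * sec * tan  # d/dt of t + (H/c) sec(omega t)
    return dx_dt / dtprime_dt

# Hypothetical setups with c = 1: a fast spinner (H*omega = 3) and a slow one (0.5)
for omega in (3.0, 0.5):
    late = (math.pi / 2 - 1e-3) / omega    # emission time just before horizontal
    print(omega, pointer_speed(0.0, 1.0, omega, 1.0), pointer_speed(late, 1.0, omega, 1.0))
```

Directly under the source the speed is just H\omega, with no c in it at all. Expanding the ratio near \omega t = \pi/2 suggests the limit c is approached from above when H\omega/c is larger than 2 and from below when it is smaller, but treat that threshold as a back-of-envelope result.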

I’m not satisfied, though. I’d like to get a trajectory of the laser pointer: a location as a function of the actual time t^\prime rather than the time that portion of the laser was emitted, t. In order to do this, we’d need to replace the t in the tangent function with an equivalent function of t^\prime. In order to do that, we’d need to solve

t^\prime = t +  \cfrac{H}{c}\ \sec(\omega t)

for t. Good luck.

What we’ve got above is a transcendental equation. This means it is not composed of a finite number of additions, subtractions, multiplications and divisions of our variable t and the constants, as well as rational powers of these. In most cases, and I’m pretty sure in this one, we can’t solve a transcendental equation exactly for the input variable. We cannot write t in terms of t^\prime.

It seems like the best we could do, if we wanted to create an animation with a step by step progression of the position of the pointer, is to prepare ahead of time. Pick an emission time t, find the value of the tangent function to find the distance, find the value of the strike time t^\prime, and record that pair. Then do this many, many times to create a table with more values than we expect someone to ask us for. We could find the position as a function of time with as much precision as we wanted, supposing we were willing to put the effort in.
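The table-building plan can be sketched in a few lines of Python. The name `strike_table`, the even spacing of emission times, and the unit values of H, \omega and c are all assumptions made for the example.

```python
import math

def strike_table(height, omega, c, samples):
    """List of (strike time t', ground position x) pairs.

    Emission times are spaced evenly over 0 <= t < pi/(2*omega).
    """
    t_end = math.pi / (2 * omega)
    table = []
    for k in range(samples):
        t = t_end * k / samples                        # emission time, never reaches t_end
        angle = omega * t
        x = height * math.tan(angle)                   # where this bit of beam lands
        t_strike = t + (height / c) / math.cos(angle)  # t' = t + (H/c) sec(omega t)
        table.append((t_strike, x))
    return table

table = strike_table(height=1.0, omega=1.0, c=1.0, samples=1000)
print(table[0], table[-1])
```

Evenly spaced emission times give very unevenly spaced strike times near the horizon, so a table meant for animation would probably want denser sampling at the end of the interval.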

I wanted a closed form solution to the problem, a trajectory x(t^\prime), and it seems more than out of reach. This annoyed me, until a friend (hey, there, buddo!) reminded me that “closed-form” is just a matter of what I’m allowing as a definition. In fact, like I mentioned in the last post, all of the trig functions are themselves transcendental: They can be written as Taylor series, but these are effectively polynomials of infinite length. The secant in the equation above can be estimated using

1 + \cfrac{1}{2}\ x^2 + \cfrac{5}{24}\ x^4 + \cfrac{61}{720}\ x^6 + \cfrac{277}{8064}\ x^8 +\dots

The problem with this, though, is that this isn’t much better. It would still take an infinite amount of time to achieve the exact value of secant given most x’s. The only reason I’m more comfortable using this and the other trig functions is because I’ve been trained to use this name for them, and rely on calculators or tables to give me the values whenever I need them. Anyone using a trigonometric function table is benefiting from someone else’s hard work to overprepare. When we use a calculator, we are relying on an estimation that is as precise as the manufacturer (or sometimes the user) dictates. One could make this estimation with a Taylor series, or with a more efficient method, but the calculator still won’t give an exact decimal value.
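To get a feel for how quickly the truncated series closes in on secant, here is a short Python sketch that sums the first few terms quoted above and compares against 1/cos(x). The sample point x = 0.5 is an arbitrary choice, and of course `math.cos` is itself just somebody else's approximation.

```python
import math

# Coefficients of 1, x^2, x^4, x^6, x^8 in the secant series quoted above
SECANT_COEFFICIENTS = [1, 1 / 2, 5 / 24, 61 / 720, 277 / 8064]

def secant_partial_sum(x, terms):
    """Sum the first `terms` nonzero terms of the secant series at x."""
    return sum(
        coeff * x ** (2 * k)
        for k, coeff in enumerate(SECANT_COEFFICIENTS[:terms])
    )

x = 0.5
reference = 1 / math.cos(x)
for terms in range(1, len(SECANT_COEFFICIENTS) + 1):
    estimate = secant_partial_sum(x, terms)
    print(terms, estimate, abs(estimate - reference))
```

Each extra term shrinks the error, but no finite number of them makes it zero.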

Any single irrational number, whether it is the solution to an algebraic or a transcendental equation, is another instance of this. I’ve gotten used to writing things like \sqrt{2} or \pi as representations of numbers with clear definitions. These numbers have exact values, but it’s hopeless for me to try writing them down. In a very definite sense, these numbers elude us. I could write or use them to any finite precision I wanted, with millions and millions of digits, so long as I were willing to come up with and use an efficient algorithm to find them, or if I were willing to wait or work a very long time, or both. But, I still wouldn’t have the “exact” value, just one that was plenty good enough for whatever application I had in mind.

These examples remind me: it’s convenient to have named functions like “Cosine” to cover a mathematical idea, but we can’t let this name cover up the meaning of that idea. There are an infinite number of angles whose cosine is a transcendental value. We’re able to use cosine because we can always (right?) reach a higher precision than is necessary in a physical application. I’ve gotten used to working with cosine, and mentally separated it from the solution to the tripod problem, because someone gave it a name that I’ve adopted.

So, I guess I should name the solution. Let’s call the composed function

x(t^\prime) = H\tan\big(\omega t(t^\prime)\big)

where t and t^\prime are related by

t^\prime = t +  \cfrac{H}{c}\ \sec(\omega t)

the heat ray function. We could create a huge table for x(t^\prime), someone could come up with an efficient algorithm for calculating values of x, and in the future we could use these to invade infinite planes with laser pointers more effectively.
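One possible way to compute the heat ray function, sketched in Python: since the strike time only ever increases with the emission time, the relation between t and t^\prime can be inverted numerically by bisection. The name `heat_ray`, the iteration count, and returning `None` before the first light lands are all choices made for this example, and bisection is probably not the most efficient option.

```python
import math

def heat_ray(t_strike, height, omega, c, iterations=200):
    """Ground position of the beam at time t_strike (the t' above).

    Solves t' = t + (H/c) sec(omega t) for the emission time t by bisection,
    then returns H tan(omega t). Returns None before the first light lands.
    """
    if t_strike < height / c:
        return None                      # light emitted at t = 0 has not arrived yet
    lo, hi = 0.0, math.pi / (2 * omega)  # the emission time lies in [lo, hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        arrival = mid + (height / c) / math.cos(omega * mid)
        if arrival > t_strike:
            hi = mid                     # emitted too late, look earlier
        else:
            lo = mid                     # emitted early enough, look later
    return height * math.tan(omega * lo)

# Hypothetical unit values: H = 1, omega = 1, c = 1
print(heat_ray(2.0, 1.0, 1.0, 1.0))
```

Like a calculator's cosine button, this gives a value as precise as the floating point arithmetic allows, not an exact one.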

Tangent of angles approaching 90 degrees

Last week a colleague came to me with a puzzle. He asked me to punch in the tangent of 89 degrees into a nearby TI-83 calculator.

\tan(89^\circ) = 57.28996163

He asked me what was surprising about this number. I wasn’t surprised. I didn’t have an answer for him, although in retrospect I probably should have. He had to tell me this was the number of degrees in a radian. Oh! So it is.

Even further, he said, try punching in tan(89.9), or tan(89.99), etc.

\begin{array}{rl} \tan(89^\circ) &= 57.28996163\\ \tan(89.9^\circ)&=572.9572134\\ \tan(89.99^\circ) &= 5729.577893\\ \tan(89.999^\circ) &= 57295.77951\end{array}

Each one is (about) ten times the previous. (With the TI-83, replacing the last result with each new one, I didn’t see the “about” until later.) This is kinda neat! His question to me was: WHY is this true?

Tangent is a function that accepts an angle and spits out a ratio of lengths. It seems weird that the answer for tan(89) looks like a number of degrees. It is a unitless output, though, and ~57 degrees per radian is also unitless, so I suppose this isn’t much of an issue. The question is, why does it appear that

\tan(89^\circ) = \cfrac{180^\circ}{\pi \text{ rad}}\text{ ?}

Ditching the degree measure,

\tan\left(\cfrac{\pi}{2}-\cfrac{\pi}{180}\right) \stackrel{?}{=} \cfrac{180}{\pi}

My problem was in trying to first tackle this question visually.

tan(89 degrees) is the length of the vertical line lying between an extended radius of a unit circle drawn 89 degrees from the horizontal and the right side of the circle. The numbers above make it seem like it is 180/\pi.

An equivalent image:

A triangle and circle \pi times larger have the same relative lengths.

I tried to explain this by imagining rolling the circle over the tangent line, wrapping the line around the circle, etc. I didn’t get anywhere.

I also tried considering the fact that there’s nothing particularly special about degree measure, except for the fact that 360 is an easy to divide number. Does this happen with other angle units? For example, what about a unit that was, instead of 1/180 of a half circle, a larger 1/100 of a half circle? We could instead take the equation above and ask,

\tan\left(\cfrac{\pi}{2}-\cfrac{\pi}{100}\right) \stackrel{?}{=} \cfrac{100}{\pi},

Is the tangent of one one-hundredth of a half circle short of \pi/2 equal to the number of hundredths of a half circle in a single radian? It looks to be true!

\begin{array}{rl} \tan\left(\cfrac{\pi}{2}-\cfrac{\pi}{100}\right)&= 31.82051595\dots\\\\ \cfrac{100}{\pi} &=31.83098862\dots \end{array}

But this is only approximate. We could extend this to any fractional unit 1/n of a circle:

\tan\left(\cfrac{\pi}{2}-\cfrac{2\pi}{n}\right) \approx \cfrac{n}{2\pi}

Using this different unit, where the approximation is less accurate, I was able to see that the degree version wasn’t exactly true, either. It definitely looks like dividing the circle into more pieces (as degrees do) yields a closer approximation:


In the above, y= \tan\left(\cfrac{\pi}{2}-\cfrac{2\pi}{n}\right) (red) and y=\cfrac{n}{2\pi} (blue) converge for larger n (horizontal).

I was comfortable in concluding now that this wasn’t just a coincidence that relied on degree measure, and could extend this to include using 89.9, 89.99 etc degrees as well. In fact, tacking on .9s to the 1/nths of a circle units works. Just plugging in a bunch of numbers, it looks like

\tan\left(\cfrac{\pi}{2}-\cfrac{2\pi}{n}\ 10^{-a}\right) \approx \cfrac{n}{2\pi}\ 10^a

works for any n, and also extends to any power a, not just the integers.


y=\tan\left(\left(\frac{n}{4}-10^{-x}\right)\frac{2\pi}{n}\right) (red) and y=\frac{n}{2\pi}\cdot10^x (green) lie almost on top of each other for positive x (horizontal). In the link you can see this works for any n by fiddling with a slider.

The question remained, why is this true? Now that I saw it’s only an approximation, I realized that I should be going about this algebraically from the start.

A trick, called the small angle approximation, is used in physics often to get rid of pesky sines and tangents when you’d rather just have an expression with the angle inside.

\begin{array}{rl} \sin x &\approx x\\ \tan x& \approx x \qquad\text{when }x\ll1 \end{array}

This behavior is clear when the functions are written in their Taylor series form:

\begin{array}{rl} \sin x &= x - \cfrac{x^3}{6} + \cfrac{x^5}{120} - \cfrac{x^7}{5040} +\dots \\\\ \tan x &= x +\cfrac{x^3}{3} + \cfrac{2x^5}{15} +\cfrac{17x^7}{315}+\dots \end{array}

When x is real small, all the higher power terms get super small, and the approximation becomes more accurate.

This approximation was my first thought, but there’s a problem: it works for small angles, but my colleague’s puzzle was about angles near 90 degrees. In fact, we can’t even fudge the Taylor series of tangent near here, because there is no Taylor series around 90 degrees. (This is a consequence of the fact that tan(x) blows up to infinity at 90 degrees.)

The problem is solved by noting that working with tangent near 90 degrees is the same as working with another trig function, cotangent, near 0 degrees.

\tan\left(\cfrac{\pi}{2}-x\right) = \cot(x) = \cfrac{1}{\tan(x)}.

Setting everything up:

\begin{array}{rl} \tan\left(\cfrac{\pi}{2}-\cfrac{2\pi}{n}\ 10^{-a}\right) &=  \left(\tan\left(\cfrac{2\pi}{n}\ 10^{-a} \right) \right)^{-1} \\ & = \left(\left(\cfrac{2\pi}{n}\ 10^{-a} \right) +\cfrac{1}{3}\left(\cfrac{2\pi}{n}\ 10^{-a} \right)^3 +\dots \right)^{-1}\\ &\approx  \left(\cfrac{2\pi}{n}\ 10^{-a} \right)^{-1} \qquad \text{(the approximation)}\\ &= \cfrac{n}{2\pi}\ 10^a \end{array}

Done! Having the 10^a instead of any old number is unnecessary; this works for any multiple. However, integer a’s make the trick of having the same digits show up in tan(89), tan(89.9), etc. work.
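The derivation can be checked numerically, and keeping one more term of the series explains the “about”: since \cot x = 1/x - x/3 - \dots, the gap between the approximation and the true tangent should be close to x/3. The Python sketch below uses function names I made up and n = 360 so the small angle is in degrees.

```python
import math

def tangent_near_quarter_turn(n, a):
    """tan(pi/2 - (2*pi/n) * 10**-a), computed as 1/tan of the small angle."""
    x = (2 * math.pi / n) * 10 ** (-a)
    return 1 / math.tan(x)

def approximation(n, a):
    """The leading term n/(2*pi) * 10**a from the derivation above."""
    return n / (2 * math.pi) * 10 ** a

for a in range(4):
    true_value = tangent_near_quarter_turn(360, a)   # tan(89), tan(89.9), ...
    approx = approximation(360, a)
    x = (2 * math.pi / 360) * 10 ** (-a)
    # The last two columns should nearly agree: gap versus x/3
    print(a, true_value, approx, approx - true_value, x / 3)
```

The gap shrinks by a factor of ten each step while the values grow by a factor of ten, which is why the calculator digits line up better and better.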

So, we can show this algebraically. I just wish I had a nice geometric argument.

More on the circular solution to the intersection of two perpendicular lines

In the last post, I showed that the intersection of two perpendicular lines must lie on a circle, so long as the lines are each forced to go through particular points. The final result was a parameterization based on the classic cosine, sine version of a circle, but the bit I found more interesting was the earlier form:

(x,y) = \left(\cfrac{am^2+(d-c)m+b}{m^2+1},\cfrac{dm^2+(b-a)m+c}{m^2+1}\right), \enspace m\in \mathbb{R}

One of the results of the parameterization was that the angle at which the point lay on the circle was not the angle that one of the lines made with the x axis (unless the circle’s center was at the origin). This led to a phase shift in the parameterization from the angle of the line. If we were willing to lose a bit of information, the phase, we could also show that the above is a circle if it satisfies

\left(x-\cfrac{a+b}{2}\right)^2+\left(y-\cfrac{c+d}{2}\right)^2 = R^2


where

R^2 = \left(\cfrac{b-a}{2}\right)^2 + \left(\cfrac{d-c}{2}\right)^2

Since we already have the parameterization above, showing this is true is just a matter of algebra. To start, add and subtract the coordinates of the center of the circle from x and y:

\begin{array}{c} x = \cfrac{am^2+(d-c)m+b}{m^2+1}-\cfrac{a+b}{2}+\cfrac{a+b}{2}\\\\ y=\cfrac{dm^2+(b-a)m+c}{m^2+1}-\cfrac{c+d}{2}+\cfrac{c+d}{2}\end{array}

Finding a common denominator and carefully combining gives

\begin{array}{c} x = \cfrac{(a-b)m^2+2(d-c)m+b-a}{2m^2+2}+\cfrac{a+b}{2}\\\\ y=\cfrac{(d-c)m^2+2(b-a)m+c-d}{2m^2+2}+\cfrac{c+d}{2}\end{array}

We now have a form that, when plugged into the LHS for the circle equation, cancels out the center point coordinates.

\left(x-\cfrac{a+b}{2}\right)^2+\left(y-\cfrac{c+d}{2}\right)^2 = \left(\cfrac{(a-b)m^2+2(d-c)m+(b-a)}{2m^2+2}\right)^2+\left(\cfrac{(d-c)m^2+2(b-a)m+(c-d)}{2m^2+2}\right)^2

The RHS of this thing is a bit easier to work with by letting

p = b-a \quad\text{and}\quad q=d-c.

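It becomes

\left(\cfrac{-pm^2+2qm+p}{2m^2+2}\right)^2+\left(\cfrac{qm^2+2pm-q}{2m^2+2}\right)^2

Careful manipulation yields

\cfrac{(p^2+q^2)(m^4+2m^2+1)}{4(m^2+1)^2}

Nicely, the m’s cancel out completely. This becomes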

\cfrac{p^2+q^2}{4} = \left(\cfrac{p}{2}\right)^2+\left(\cfrac{q}{2}\right)^2 = \left(\cfrac{b-a}{2}\right)^2+\left(\cfrac{d-c}{2}\right)^2
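For anyone who would rather not trust the algebra, the result can be spot-checked with exact rational arithmetic. This is a minimal Python sketch; the function names and the sample points (1, -2) and (4, 3) are invented for the example.

```python
from fractions import Fraction

def intersection(a, b, c, d, m):
    """Intersection point from the parameterization above, for slope m."""
    denom = m * m + 1
    x = (a * m * m + (d - c) * m + b) / denom
    y = (d * m * m + (b - a) * m + c) / denom
    return x, y

def on_circle(a, b, c, d, m):
    """Check the circle equation exactly, using rational arithmetic."""
    x, y = intersection(a, b, c, d, m)
    lhs = (x - (a + b) / 2) ** 2 + (y - (c + d) / 2) ** 2
    rhs = ((b - a) / 2) ** 2 + ((d - c) / 2) ** 2
    return lhs == rhs

# Hypothetical fixed points (a, c) = (1, -2) and (b, d) = (4, 3)
a, b, c, d = (Fraction(v) for v in (1, 4, -2, 3))
print(all(on_circle(a, b, c, d, Fraction(k, 7)) for k in range(-50, 51)))
```

Using `Fraction` instead of floats means the equality test is exact, so a pass here is not a rounding accident. It is still only a check at a handful of slopes, not a proof.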


Parameterizing a circle with the intersection point of two perpendicular lines.

I’ve been really taken with Desmos, an online calculator and easy to use graphing tool. My students have been using it for some time, and I’m especially happy with the “slider” tool that it offers. Whenever you put a letter into a function while graphing, it suggests a value to assign it, and lets you tune that value with the slider. This tool is similar to Mathematica’s Manipulate or Animate functions, which I’ve had success using in previous classes to show how a function depends on its parameters.

My year-long teacher’s Mathematica license recently expired, making it a bit tougher to install on a new device. While I do have access to an unsupported copy, Desmos has more than replaced M-ca for any of my presentation needs.

In a recent class, we were playing around with linear systems and intersecting lines. To show that a negative reciprocal slope leads to a perpendicular line, I assigned a slider to the value m, and made two linear equations with slopes of m and -1/m:

y=mx \qquad\text{and}\qquad y=-\cfrac{1}{m}\ x

The slider has the nice effect of letting you rotate the lines to see that they’re always perpendicular.

Play with it yourself, why don’t ya. You can animate or adjust the slope with the m slider on the left.

The kids were delighted by the pinwheel spinning of the lines as the slope was adjusted. To show that we weren’t limited to lines that passed through the origin, I tacked on a y-intercept to both of the equations, and asked the students, what do you think happens when I adjust the slope now?

My point was to show that the lines remain perpendicular. I would have been pleased to hear that the students could also predict that the point of intersection of the two lines would now move around, instead of being fixed at the origin.

One student went further, however: he was able to predict that the point of intersection of the two lines will always be fixed to a circle.

[Figure: Desmos graph of the two perpendicular lines and the circle traced by their intersection]
You can adjust the slope once again, as well as the points the lines are fixed to pass through, using the sliders. Adjusting only the slope m keeps the point of intersection on the same circle.


The student had seen a connection to his geometry class from the previous year. An inscribed angle is half the measure of the intercepted arc. An angle inscribing half the circle must then be a right angle. What this student had realized was the converse of this statement: that a right angle, formed by two perpendicular lines each forced to pass through particular points, must lie on a circle, and those two points are the endpoints of a diameter of that circle. I thought this was awfully insightful!

I figured it would be neat to try to show that this must be true on my own. Solving the system

\left\{ \begin{array}{c} y =m (x-a)+c\\\\ y=-\frac{1}{m}\ (x-b)+d  \end{array} \right.

gives the point

\left(\frac{a\thinspace m^2+\left(d-c\right)m+b}{m^2+1},\frac{d\thinspace m^2+\left(b-a\right)m+c}{m^2+1}\right)
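If you’d rather not trust my algebra, here is a minimal Python sketch that plugs the point back into both line equations. The function name `intersection` and the sample values for a, b, c, d are my own picks for illustration; nothing here comes from Desmos.

```python
def intersection(m, a, b, c, d):
    """Intersection of y = m(x - a) + c and y = -(1/m)(x - b) + d, for m != 0."""
    denom = m * m + 1
    x = (a * m * m + (d - c) * m + b) / denom
    y = (d * m * m + (b - a) * m + c) / denom
    return x, y


# Arbitrary illustrative values: the lines pass through (a, c) and (b, d).
a, b, c, d = 1.0, 4.0, -2.0, 3.0

for m in (0.5, -2.0, 3.0):
    x, y = intersection(m, a, b, c, d)
    # The point should satisfy both line equations up to rounding error.
    assert abs(y - (m * (x - a) + c)) < 1e-9
    assert abs(y - (-(x - b) / m + d)) < 1e-9
```

The check only covers nonzero slopes, since the second line is vertical when m = 0.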

I thought this was really neat. We haven’t shown that this point lies on a circle yet, but assuming it does, it shows a way to parameterize a circle in terms of m, with each coordinate a ratio of quadratics. Maybe this is something a mathematician would immediately recognize, but it’s new to me!

To show this does lie on a circle, I need to find an appropriate transformation that turns the above into the more familiar

\left( R\cos\theta + x_1 , R\sin\theta + y_1\right)

for a circle of radius R and center (x_1,y_1) . The obvious choice is to connect the slope of one of the lines to the angle on the circle:

m \rightarrow \tan\theta

The parameterization becomes

\left(\frac{a\thinspace \tan^2\theta+\left(d-c\right)\tan\theta+b}{\tan^2\theta+1},\frac{d\thinspace \tan^2\theta+\left(b-a\right)\tan\theta+c}{\tan^2\theta+1}\right)

This is where all your trig identities pay off. Those denominators become squared secants, letting you get rid of the fractions altogether.

\left(\frac{a\thinspace \tan^2\theta+\left(d-c\right)\tan\theta+b}{\sec^2\theta},\frac{d\thinspace \tan^2\theta+\left(b-a\right)\tan\theta+c}{\sec^2\theta}\right)

\bigg(\enspace a\thinspace \sin^2\theta+\left(d-c\right)\sin\theta\cos\theta+b\cos^2\theta\quad,\quad d\thinspace \sin^2\theta+\left(b-a\right)\sin\theta\cos\theta+c\cos^2\theta\enspace\bigg)

I’m having a bit of difficulty with formatting here. I’ll have to just write it like so:

\begin{array}{c} x=a\thinspace \sin^2\theta+\left(d-c\right)\sin\theta\cos\theta+b\cos^2\theta \\\\y=d\thinspace \sin^2\theta+\left(b-a\right)\sin\theta\cos\theta+c\cos^2\theta\end{array}

The middle bits of these should pop out: a sine times a cosine is a part of one of the double angle formulas:

2\sin\theta\cos\theta = \sin2\theta

While we’re tossing in sine of a double angle, we might as well introduce the cosine of the double angle as well. This shows up from the squares:

\sin^2\theta = \cfrac{1-\cos2\theta}{2} \qquad \text{and} \qquad \cos^2\theta=\cfrac{1+\cos2\theta}{2}

Our parameterization becomes

\begin{array}{c} x=a\thinspace\left(\cfrac{1-\cos2\theta}{2}\right) +\left(\cfrac{d-c}{2}\right)\sin 2\theta+b\left(\cfrac{1+\cos2\theta}{2}\right) \\\\y=d\thinspace \left(\cfrac{1-\cos2\theta}{2}\right)+\left(\cfrac{b-a}{2}\right)\sin2\theta+c\left(\cfrac{1+\cos2\theta}{2}\right)\end{array}

What’s neat about this is that the center of the circle now falls out as a constant term at the end, and we’ve maintained some kind of symmetry with the sines and cosines.

\begin{array}{c} x= \left(\cfrac{b-a}{2}\right)\cos2\theta +  \left(\cfrac{d-c}{2}\right)\sin2\theta +  \left(\cfrac{a+b}{2}\right)\\\\ y= - \left(\cfrac{d-c}{2}\right)\cos2\theta +  \left(\cfrac{b-a}{2}\right)\sin2\theta + \left(\cfrac{c+d}{2}\right)\end{array}

Here’s where my trig knowledge stopped. The sine and cosines can be combined, though: a linear combination of sine and cosine should leave a single sine curve, but with a phase angle tossed in.

w \cos\theta + u\sin\theta = \sqrt{w^2+u^2}\enspace \sin\left(\theta+\arctan\frac{w}{u}\right)
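Since I didn’t have this identity memorized, a quick numerical spot check felt reassuring. This is only a sketch: the helper name `combine` is made up, and it assumes u > 0, which is the case where the plain arctangent gives the right phase.

```python
import math


def combine(w, u, theta):
    """Right-hand side of the identity: a single sine with a phase shift.

    Assumes u > 0 so that arctan(w / u) lands in the correct quadrant.
    """
    return math.hypot(w, u) * math.sin(theta + math.atan(w / u))


for w, u in ((1.5, 2.0), (-3.0, 0.5)):
    for k in range(8):
        theta = k * math.pi / 4
        lhs = w * math.cos(theta) + u * math.sin(theta)
        assert abs(lhs - combine(w, u, theta)) < 1e-9
```

When u might be negative, swapping `math.atan(w / u)` for `math.atan2(w, u)` keeps the phase in the right quadrant.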

This is great! We’ve got a way to combine the a, b, c, ds to get something looking like a radius.

\begin{array}{c} x = R \sin\left(2\theta + \arctan\left(\cfrac{b-a}{d-c}\right)\right) +\left(\cfrac{a+b}{2}\right) \\\\ y =R \sin\left(2\theta + \arctan\left(\cfrac{c-d}{b-a}\right)\right) +\left(\cfrac{c+d}{2}\right)\end{array}


R = \sqrt{\left(\cfrac{b-a}{2}\right)^2+\left(\cfrac{d-c}{2}\right)^2}

At this stage, all we need to do is turn that first sine into a cosine (using \sin\theta = \cos\left(\theta-\frac{\pi}{2}\right) ).

\begin{array}{c} x = R \cos\left(2\theta + \arctan\left(\cfrac{b-a}{d-c}\right)-\cfrac{\pi}{2}\right) +\left(\cfrac{a+b}{2}\right) \\\\ y =R \sin\left(2\theta + \arctan\left(\cfrac{c-d}{b-a}\right)\right) +\left(\cfrac{c+d}{2}\right)\end{array}

We’re left with one remaining question: are the phase angles the same?

\arctan\left(\cfrac{b-a}{d-c}\right)-\cfrac{\pi}{2} \quad \stackrel{?}{=}\quad \arctan\left(\cfrac{c-d}{b-a}\right)

A couple more identities that I don’t have memorized clear this up (the second one holds for x > 0):

\arctan(-x) = -\arctan(x) \qquad\text{and}\qquad \arctan\left(\cfrac{1}{x}\right) - \cfrac{\pi}{2} = -\arctan(x)

\longrightarrow \arctan(-x) = \arctan\left(\cfrac{1}{x}\right) - \cfrac{\pi}{2}

Setting x = (d-c)/(b-a) answers the question above: yes! Our parameterization is

\begin{array}{c} x = R \cos\left(2\theta + \arctan\left(\cfrac{c-d}{b-a}\right)\right) +\left(\cfrac{a+b}{2}\right) \\\\ y =R \sin\left(2\theta + \arctan\left(\cfrac{c-d}{b-a}\right)\right) +\left(\cfrac{c+d}{2}\right)\end{array}

If we wanted to make it a bit nicer, replace:

\phi = 2\theta + \arctan\left(\cfrac{c-d}{b-a}\right)

and we get a nice

\begin{array}{c} x = R \cos\phi +\left(\cfrac{a+b}{2}\right) \\\\ y =R \sin\phi +\left(\cfrac{c+d}{2}\right),\end{array}

a circle of radius R centered at x=(a+b)/2 and y=(c+d)/2. Woof! Bark bark! Woof woof bark!
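For anyone who wants to skip the trig and just check the conclusion, here is a small Python sketch. It assumes the intersection formula from earlier; the names `intersection` and `distance_from_center` and the sample values are mine, chosen only for illustration. It confirms that the intersection point stays a distance R from the midpoint of (a, c) and (b, d) as the slope changes.

```python
import math


def intersection(m, a, b, c, d):
    """Intersection of y = m(x - a) + c and y = -(1/m)(x - b) + d."""
    denom = m * m + 1
    x = (a * m * m + (d - c) * m + b) / denom
    y = (d * m * m + (b - a) * m + c) / denom
    return x, y


def distance_from_center(m, a, b, c, d):
    """Distance from the intersection point to the claimed center."""
    x, y = intersection(m, a, b, c, d)
    return math.hypot(x - (a + b) / 2, y - (c + d) / 2)


# Arbitrary illustrative values: the lines pass through (a, c) and (b, d).
a, b, c, d = 1.0, 4.0, -2.0, 3.0
R = math.hypot((b - a) / 2, (d - c) / 2)

for m in (-10.0, -1.0, -0.1, 0.1, 0.5, 2.0, 25.0):
    # Every slope should give a point exactly R away from the center.
    assert abs(distance_from_center(m, a, b, c, d) - R) < 1e-9
```

This is a spot check at a handful of slopes, not a proof, but it agrees with the derivation above.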

Thoughts before watching Star Trek: Discovery

Looking back over my thoughts on Star Trek in general, I’m reminded that Star Trek: Discovery is the only TV show I’ve ever truly felt some investment in before its release. I do feel a sort of obligation to myself to pay attention to the show, to keep track of the details. It’s a very distinct feeling from the Abrams movies. In 2009, I had only watched TNG and TOS, and was also geared to expect a Star Trek movie, along the lines of the TNG movies, which taken all together are tolerable at best, and totally different in tone from the TV shows.

The new series has more incentive to live up to them if it is trying to cater to Star Trek fans, even if they can be nitpicky.

The Abbreviation

A lot of people are snickering about their abbreviation of the name of the show. Star Trek Discovery can be shortened to STD! Ha! Disease is funny! Didn’t the producers realize that, before deciding to use the forbidden fourth letter of the alphabet?

But really, this is a fine way to abbreviate the show’s name. I’m not going to tell you you can’t refer to the show as STD. We should remember, though, that there is an established abbreviation convention for the previous Star Trek series: we only abbreviate the subtitle! Star Trek (1966) is abbreviated TOS for The Original Series, Star Trek: The Next Generation has long been abbreviated TNG, Deep Space Nine DS9, Voyager VOY, Enterprise ENT. CBS’s use of DSC (or the more popular use of DIS) for Discovery is not a cover-up or a workaround, but falls in line with the rule.

The details and tech should serve the story

One of the big technical issues with Star Trek is in the speed of the ships themselves. Warp Factor, a number (1 through 9.995) that describes the ship’s faster-than-light speed, is a brilliant device because it draws the audience’s attention away from the actual speed of the ship, except for implying super-future-fast (warp 5) or flippin-fantastic-future-fast (warp 9). We don’t need to know how quickly the ships are zooming around in terms of kilometers per second for the sake of most stories. The show can be totally inconsistent from one episode to the next in terms of speeds and distances travelled and the times between, but very very few Star Treks are about the specific speeds of the ships. Stating these speeds has the potential to be distracting. It was great when the writers created a function and diagram mapping the warp factor to a real speed, but this would be violated many times afterwards.

The point to remember is that we are not going to get so caught up in this TV show that we’re going to lose sight of the fact that it isn’t real. We acknowledge the inconsistencies and move on, giving us mental energy to think about the story at hand.

This extends to other technologies or plot elements in the show that might seem contradictory, both to each other, or to our own world. Who cares that the Eugenics Wars of the 1990s, cited in the original series and The Wrath of Khan, didn’t seem to happen. Exhaustive in-universe retconning should only be done if it makes an interesting story, or else they’ll be wasting our time.

Some fans really griped that the ship bridge and technology featured in ST:DIS didn’t look like the original series. Again, the point of these visuals is to support a believable story, and not be distracting. For someone who has “lived in” the Star Trek universe their whole lives, thinking of Kirk on the bridge all the time, it might be distracting to try to believe that DIS comes ten years before and yet doesn’t have all the goofy memory tapes and buttons of the original series bridge. However, for most people, recreating the original Enterprise set for a modern television show that’s trying to take itself seriously would be much, much more distracting.

Regardless, it’s fun to nitpick and note details

A fan picking apart details and trying to fit everything together in the puzzle should be able to acknowledge they’re doing it for fun, or for their enjoyment of the show. There doesn’t need to be some higher purpose involved, and it will be in their best interest to learn how to do so without being angry or indignant towards the creators. It will also serve a consumer well to learn how to interpret someone else’s nitpicking and exploration as an exercise in puzzle-solving rather than true criticism. I’m going to have fun recognizing Star Trek elements and seeing how they fit (or don’t!) in the established world.

“Missing” and tacked-on information

Fans have been skeptical about the use of Sarek in the show as Michael Burnham’s adopted Vulcan father. Sarek, introduced in the first series as Spock’s father, serves as the only character we’ve seen connecting Discovery to another Star Trek series. It’s an obscure (enough) character that only fans of Star Trek will really care about. And this is maybe a good point: using him shrinks the universe a bit. Why can’t we have a story about a bunch of people in the same universe who have never met an Original Series character? I’ve heard this argument made with the Aliens franchise, in which many incarnations of the movies, games, or comics seem to feature a family member of Ellen Ripley. It’s easy to be drawn from the narrative when presented with something unlikely. The reminder of a character from an earlier incarnation of Star Trek is designed to connect and bridge us more closely to the show, but it backfires by just reminding us that the show is manufactured to do so.

The fact that Spock has an adopted human half-sister, though, is not much of a retcon. Some fans have said, why did we never hear Spock mention his sister Michael? Why didn’t Sarek, Perrin, or Sybok, Spock’s immediate family members, ever mention her in their appearances in The Original Series, the movies, or The Next Generation?

I’d answer: because she never needed to come up. Spock is notorious for being efficient and logical. Do you think he’d yammer on about his experiences with his sister while in a life-or-death situation onboard the Enterprise? Even at the first appearance of Sarek, Spock appears reluctant to mention that he is his father, defaulting to the business at hand and referring to him as a diplomat. His failure to mention his sister is no plot hole. Although TNG season 7 might make it seem otherwise, a TV show does not need to have an episode devoted to each family member of each main character.

Again, let’s remind ourselves, this is a TV show. Let’s do our best to judge it as one, and not as an historical record of real events. We don’t have to convince ourselves to like it, completely disregarding continuity errors, but we can adjust how strongly we react.



Thoughts and speculation on Star Trek’s popularity


It may be possible that I have thought of Star Trek more than half of the days I’ve been alive. Those days are also pretty skewed towards the second half of my life as well. Is this much of a feat? Sometimes I will think of people or places that have been important in my life, and I’ll take pause to wonder how many weeks or months it’s been since I last thought of them. I’m not sure I have the same issue with this TV show. Plenty of entertainment is manufactured to stick in our minds.

People who like Star Trek can seem pretty evangelical for a bunch of humanists. This comes with any fandom, but it’s a special thing to be able to look at the culture surrounding the show that is often considered the originator of the modern fandom. It might be ridiculous to say that fanzines and fanfiction, slash fiction and shipping, conventions and cosplay were all created from this one 3-season show. People published and shared stories about their favorite characters beforehand, I’m sure. I’m also sure that plenty of these were a bit sexy, played with taboo, and featured author-insertion (no pun intended). Theatrically or historically minded folks dressed up and played parts for their own sake. Interestingly, one of the first major renaissance “faires” was in 1966, the year ST debuted, at the Paramount ranch (although Paramount didn’t own the rights to broadcast the show until 1969). But Star Trek does seem to be a major popularizer for these ideas, and set up some expectations about things like anime conventions and unwritten rules about how to write yourself doing some cool kissing with Matt Smith without receiving too much judgement.

Part of the continuing success of Star Trek is due to its popularity. It’s a franchise, and success leads to success. It’s a household name.

There must be things that originally led to its popularity, though, and that keep it going. It boils down to the following.

It drew from established genres that were underserved and added to them. Gene Roddenberry drew inspiration for Captain Kirk from the Horatio Hornblower novels. I haven’t read these, but it seems that they themselves heavily draw from the fetishized “plucky” character type and high adventure of (boys’) Victorian-era novels and subsequent pulp fiction. In addition, the space-adventure theme of the show probably drew in a lot of grown-up boys who used to read Tom Swift, Flash Gordon, and still looked fondly back on watching Captain Video. The Twilight Zone, and other “short story” format science fiction shows existed beforehand, but it seems they didn’t have the draw of a character connection.

This is the awesome benefit of having a show based on an indefinitely fast space ship. You can have an episodic format which allows for missing a broadcast and still following the story, like a sitcom, but offers a way to feature completely different premises as often as the creators want. This gives more freedom than a sitcom, which returns to the status quo at the end of every episode, and also keeps the same neighbors around.

Star Trek wasn’t really a huge hit except with a vocal minority (you could spin this as an example of the Pareto principle) until it went into syndication, after its cancellation. The Next Generation series wasn’t exactly a hit right off the bat either. But, funnily enough, it likely benefited from a similar drought of science fiction on TV, only a couple decades later. The drop in popularity (and as many others say, quality) of Star Trek in the 90s, as Deep Space Nine, Voyager and Enterprise continued on, was likely an indirect result of TNG’s popularity: the new Star Trek spawned a new interest in the genre. We might not have seen Babylon 5, Stargate, or The X-Files without it, but these might have also saturated the market.

It straddled the lines of being interesting, silly, and thoughtful. Again, Star Trek wasn’t a huge hit at first, but the new-setting-every-week bit was attractive to the science fiction short story lovers. These types are usually looking for ideas that will stick with them for some time afterwards. The ending of The Twilight Zone I’m sure left a hole that Star Trek was able to fill. These ideas were social and political commentary, either unspoken or explicitly stated, or technological and scientific. Deeper personal stories wouldn’t start until the first movie, and The Next Generation, and the political intrigue wouldn’t really pick up until later seasons of TNG (and DS9!).

Viewers at least had some characters to count on week after week. Again, this could have attracted the pulp lovers who fondly remembered reading the Hardy Boys or Nancy Drew (although I’m not sure the Hardy Boys were giving the die-hard fans homoerotic suggestions in the first year of publications). These characters served as the much needed “human” connection that short stories tend to lack — having seen them week after week raises the stakes when they find themselves in peril.

The thinly veiled suggestions of a romantic or sexual relationship between Kirk and Spock bring us to the common assertion that Star Trek was intentionally camp. It’s come to my attention, after seeing Whatever Happened to Baby Jane? at our little Fancy Movie Night, that at the time the camp aesthetic was becoming more popular on television as well as in film. By the way, this might also owe itself to the popularity of Some Like it Hot, also seen at FMN, a film which served to help dismantle the Motion Picture Production Code. The effective removal of this explicitly written code made it a lot easier to play around with the potentially gay relationship, which allowed the creators to bring in other campy qualities.

The title of this section isn’t to suggest being gay is silly, but it may have appeared so to a typical viewer. In addition, the camp aesthetic definitely connected lightheartedness with homosexuality.

Finally, and maybe most obviously, the show was entertaining on a surface level. The traditionally boyish interests of seeing the ship fly around, shooting ray guns, goofy aliens, funny and sexy pajamas, and brawling action I’m sure solidified the interest of plenty of people (and not just boys).

But the thing people like to talk about the most nowadays is…

It introduced a progressive cast of skilled officers and regularly gave social commentary. This is the part that remains the most inspiring. I wonder how essential it was to the show’s initial popularity, but without these aspects, the show would certainly be a lot less interesting, and definitely not worth getting worked up over 50 years later. Having a collection of Earth’s races all working together on the same bridge, backed by Roddenberry’s vision that they have moved beyond interpersonal conflict, stands as a great source of inspiration. In addition to the obviously racial statements, the addition of the Russian Chekov in later seasons, although he was largely there for comic relief, represented a promising future (respect, as well as existence!) for what had been the United States’ chief enemy at the time. (I’ll have plenty of chance in the future to chat about the other representations of the USA’s enemies, the Klingons and the Romulans.)

Human progress was not only shown in social achievement or technological power. Spock himself was often played for laughs due to his rigidity and failure to acknowledge his (clearly present) humanity. Although loved and respected by fans, Spock often was a straw man representing the emotionally bankrupt, completely unempathetic man struggling to maintain feelings of superiority based on knowledge and critical thinking. I think if Spock were a character created today, he would be compared to “mansplainers” or Red Pill types. The fact that the other characters can explain their choices outside of Spock’s coldness, and make fun of him when he lets a bit of emotion shine through, promises a future in which people don’t have to be miserable pedants who only care about things for their academic value. The human still has a place in the 23rd century. We don’t have to replace ourselves or mimic machines as time pushes forward. People liked this idea.


I’ve been thinking about Star Trek even more than usual lately, what with the first episodes of Star Trek Discovery airing last week. I’ve got even more to say about that, but perhaps it can wait for another day.