A short one, just to get the ol’ bloggin’ fingers movin’ again’. It’s starting to get chilly outdoors and they’re simply covered in cobwebs!! Where did they come from? How did I get spider webs on my hands…..?

Last year, 3Blue1Brown produced a series of lovely videos (but really, they all are) on a strange little thing. An object is sent bouncing back and forth between a rigid wall and another moving object with a particular (integer!) mass ratio. If this ratio is a power of 100, the number of collisions will be some of the first digits of pi!! Isn’t that nuts.

Anyone who has taken a high school senior level physics class or above would recognize the need to implement two conservation laws: that of momentum (linear with velocities) and that of energy (quadratic with speeds). Someone with just a bit of programming experience should be able to set up a series of solutions for the speeds after successive collisions, until the “no-collide” condition is met: the two blocks are traveling in the same direction, away from the wall, and gaining distance between them.

Hey, I’ve got a bit of programming experience! I whipped this up to see if I could get the same results.

It looks like they’re doing it right! A fun little exercise. In the spirit of the original prompt I’ll let you find or figure out why it works for yourself. You can always watch Sanderson’s explanation video.

One hint I can give is it is a bit related to the Buffon’s needle technique of finding pi. This is a fun idea that understanding could open some avenues of thought for more difficult estimation problems. Check out the description and explanation on Numberphile:

In some other post I linked to the Wolfram “atlas” of 1-dimensional elementary cellular automata, the list of 256 rules that can apply to a row of “on” or “off” pixels to determine a successive row, in which a pixel’s successive value is only dependent on its own and those of its two neighbors. There are only 256 of these because of this limitation.

We can extend this to a 2-dimensional space. This seems to be easier to follow what’s going on — maybe just by the nature of looking at a computer screen. It’s the space that the Game of Life lives in. A grid of pixels can be on or off, and a successor grid follows, with each pixel’s state determined by its initial state and those of its neighbors. In order to visualize this, you could lay each successor on top of its parent grid to make a 3D object (analogous to forming a 2D image for a 1D array of pixels). But this would be really hard to look through. Instead, just play each image after its successor as an animation.

A ruleset can be described with a series of images, like the 1-D automaton, but again, you’d have to either stack the start example on top of the resulting center pixel, or place them side by side. In addition, there are going to be many more images necessary for a rule, as there are more configurations of on-and-off neighboring cells in 2 dimensions.

This also introduces the uncertainty in what a neighborhood is in 2D. We could consider the 4 cells in the cardinal directions as the only neighbors (the von Neumann neighborhood), or we could tack on the corners, too (the Moore neighborhood).

For the von Neumann neighborhood, each rule alone would require images. With a choice of on or off for the resulting cell for each image, there are 1024 elementary cellular automata. Not a bewildering number, but still a lot.

The Moore neighborhood would have 9 squares, replacing the center square, and so one would need icons for each rule. There are rules of this type. Yikes! There are the same number of squares in the “extended” von Neumann neighborhood, so these numbers apply there as well.

Both of these neighborhoods are probably characterized completely — tracking a few of the properties of 263000 systems doesn’t seem to be out of the realm of possibility, especially when considering that this number can be reduced by a factor of due to horizontal-mirroring, vertical-mirroring, centerpoint-mirroring, and color-inverting of each of these. So, the 512 in the 5-cell neighborhood reduces to only 64 with unique behaviors, and the 9-cell neighborhood reduces to 32,768.

What seemed exciting was to look at larger systems. A Super von Neumann neighborhood of 21 squares will have 4,398,046,511,104 rulesets of 2,097,152 icons each, and the Super Moore neighborhood of 25 squares has 1,125,899,906,842,624 rulesets of 33,554,432 icons each. Now we’re having fun.

It felt like there was some chance of picking something weird out of here.

A ruleset based on an oscillator

One contruction I didn’t mention in the last post was an oscillator — a repeating, unmoving pattern. Some oscillators can be seen in the wakes of the dirty puffers.

Oscillators are emergent in GoL; there isn’t anything about the rule that explicitly reflects an intent of repetition. I was interested in making a rule that was based on an oscillator, and to see if something like spaceships would emerge.

Maggotworld

The simplest oscillator I could think of that seemed interesting was a clockwise spinning line. 3 activated cells, with a diagonal between the first two, would appear to spin around in a circle. If there were three cells in this configuration, the next epoch would deactivate the cell in the back, and activate a new one in the next step around a circle. This resulted in an 8-period oscillator. I originally called the idea Spinworld, but it soon seemed appropriate to call it Maggotworld.

Instead of matching the Super Moore neighborhood around every pixel with one of the 33.5 million in the ruleset, the maggotworld propagator just checks if there are three activated pixels in the right configuration nearby to turn on. There are eight ways this could happen — corresponding to the eight squares that are activated throughout the cycle. This doesn’t mean there are only eight icons of the 33.5 million that lead to an activated central pixel. The pixels toggle; a dead pixel toggles on if there are one, or three maggots approaching it. It toggles off if there are two or four approaching it. The same is likewise true for the tail ends of a maggot — two maggots with the same tail will leave the tail point on, as it is toggled twice. So there are many ways pixels could be left on or off depending on the situation. This seems ripe for emergent behavior!

I ended up trying a bunch of different setups to see how they behaved way back in April, and wrote most of the above then, but haven’t gotten around to this showcase until now.

Random starts

I began by just slapping random starting configurations down and seeing what happened.

After some initial discouragement, some neat patterns did emerge. In the last animation you can spot a few figure 8s, figure triple-Os(?), and maggots passing straight through each other. Not a spaceship, but at least something a bit more complicated.

Not so random starts

If I couldn’t get a spaceship out of this, I could at least look for something cool. Time to get systematic. I placed maggots at varying relative phases and positions to start and let them go.

This was a bit discouraging — none of the ones that touched made any lasting patterns beyond a single oscillator. Anything interesting would have to be made of more than 2 of these, or have some stationary gunk waiting around to push them around a bit.

In one of the random setups, I saw this little friend, which I’ll call a slingshot oscillator.

It’s got a period of 14. Having a different period is… useful? How is any of this useful? I dunno whatever.

Another neat result would be to get a pattern that turns on cells outside of the normal oscillator radius. Here’s one. It looks like a really low resolution hand-style cursor. Let’s call it the flinger.

The last thing of this type is a little u. I like this because, while it seems like these rules easily reduce the number of lit cells, this generates an extra sixth cell by the end of its process.

Behavior in grids

If I couldn’t get a spaceship flying away in empty space, maybe I could get something neat with stationary patterns placed ahead of time. I started by plopping oscillators in a lattice of lit points. It’s real easy to get some ongoing domino action this way.

Vertical motion. There’s a lot of trash left behind, but the key to see here is that only one or two lit squares next to a lit bar drives the motion. The lit side determines which way it travels.

Trams and fuses

The last one looked neat. I tried to make more “trams” following a track. Here are a bunch. There are also configurations that rip the track apart. Let’s call em fuses.

Some of the styles are repeated with a hole in the track to show how they deal with that. Holes are also placed at different locations along the track for the same style of tram to show off how they travel: trams hop two spaces at a time, not one. An “oddly” placed hole will force it to reflect or scatter differently than an “evenly” placed one.

I was taken with the self-repairing track on the bottom.

This seemed like a way to create a caterpillar-like spaceship of indefinite length that scootches forward a couple squares each cycle. I wasn’t able to figure how to get it to repeat, thought.

Another fuse had an end that, after some acrobatics, sent another tram that flipped the orientation of the fuse.

Timers

One last thing and I’ll let you go, I swear. A row of lit squares can be used as a timer of indefinite length. This could go up above with the other oscillators, but the fact that this can be stretched to as long as you need it puts it in its own class.

These can also take a turn. These turns could be used to stop the motion, or to initiate the timer in the first place:

Maybe these are more appropriately called fuses? But I already used that word.

Showing these little tricks and behaviors is interesting enough alone for me, but it would be really swell if we could figure out how to produce oscillators / timers of truly any period. I’m also still wondering if there is an elusory spaceship, one without a grid or track to support it, hiding in these rules. The holy grail after that is to get these things to do binary logic, but let’s save that for another day.

I Lied

I said I had only one last thing but I totally lied to you. You must be feeling pretty silly right now! Just one more bit. I promise. 🤞

Before finishing up with the maggotworld animations, I had tried a lot of different rulesets to see how they’d act. One of them was simple: A cell would turn on if it had an odd number of neighbors (in the Moore neighborhood), and off otherwise.

What a pretty pattern! I was pleased to learn later that I had recreated something called Fredkin’s Replicator. You can see some explanation in Numberphile’s Terrific Toothpick Patterns (starting at 14:20). A lecture also by Neil Sloane gives more insight into analyzing this and other “odd-ruled” cellular automata. Do yourself a favor and check out at least the Numberphile vid.

I was reminded recently that I actually have a blog. I have a Cellular Automata Part II post that’s been sitting around since April, but here’s something quick before I get around to polishing that up.

During a prescribed Art Night, I decided to come up with a visualization of long texts. I grabbed a few examples of literature from Project Gutenberg as plain text files, and converted to binary using each letter’s ASCII character code.

An entire word will then create a string of ones and zeroes 7 times longer. The visualization comes in by turning string into something like the output of a Lindenmayer system. A line is drawn, and at each step it reads the next character. A one instructs the drawing to take a right turn, and a zero instructs it to take a left turn.

Starting from the bottom of an image, drawing upward, and taking these turns one at a time gives a blocky figure.

This was pretty quick to implement. It initially took about 10 or 15 minutes to finish for single-MB size books using repeated AppendTos in Mathematica to construct a list of successive points along the path. A rewrite using Reap/Sow made these finish in less than a minute. I have no learning about optimization, except that the latter is supposed to be fast in M-ca. I suspect that Append rewrites the entire list each time, while Reap/Sow does not. There are likely 800 other better languages in which to write this kind of thing, but hey! Heyo! Woo!

Gutenberg adds in header and footer to each text file, saying some standard stuff about where it was downloaded and when it was written, etc. For each of these, I removed everything except the nitty gritty, proper text. This would only serve to tack on a bit at the beginnings and ends, ultimately not changing much to the shapes. They would have been rotated, though.

Here are the results for 4 books. Here’s a zoomable pdf of the Frankenstein one. In each of these, a green point marks the beginning, and a red point the end. Frankenstein’s green point, for example, is in the lower right, and its red is on the ear of the evil head in the upper left. Boo!

It was really fun to get these results. I acknowledge that there is probably little meaning in these images from a stylometric point of view, but it is interesting to observe the differences. I’m sure these are very sensitive to format, the words themselves, and the ASCII codes for English letters. The Bible is loaded with many paragraph breaks and colons for each verse, in contrast to the long paragraphs of Ulysses. Ensuring that each character begins with a right turn (1) guarantees that each image is going to have some kind of curliness. A single typo wouldn’t change the shape of the subsequent curve, but it would lead to a rotation about that point.

So, is it just happenstance that Ulysses is so much more unfolded than Moby Dick? Is there something fundamentally different about the second half of Moby Dick that makes it hover around the end point?

An issue with the visualizations themselves is that they do not show any overlap as a line covers regions it has already followed. A way around this would be to assign an increasing height to each point, and show these paths as text tunnels in space. These could be seen as a 3D model or as projected images from different directions. Maybe I’ll try this out.

The full image files showing these walks are about 10x the size of the original text files. A text file storing the list of ordered pairs of the path is about 100x the original text file size. So, a forseeable application of this technique could be to make text file sizes much larger, as well as difficult to read and process.

As a final thought, it was easy to call these random walks. Clearly these aren’t random in that they carry the meaning of the texts, but perhaps English letters appear random after being converted in this way. A way to test this is by looking at the distance from the starting point — A true random walk of steps ought to be a distance of roughly from the starting point.

None of these seem to be plots of distance = sqrt steps.

Conway’s game of life is maybe the most well known ruleset for cellular automata. Each cell in a grid has some possible set of states, and set of rules for changing state based on its own and it’s (usually) immediate neighbor cells. Time elapses with a number of discrete steps, sometimes called epochs. In Game of Life, a cell can either be “alive” or “dead” (on or off), and will live for the next time step only if it has one or two alive neighbors.

Other rulesets lead to neat patterns. Here’s a fun one with many cell states. It lets you rate the results as it scans some starting parameter space.

So, we’re certainly not limited to one ruleset. Coming up with one is easy — finding if there are any neat patterns maybe not so much. It might be better to first explore simpler rules — ones in only 1 dimension. In this case, a linear space is segmented into cells, and each cell has only two neighbors. Given some configuration of three cells in a row, the middle cell’s state in the next epoch will be dependent on its own state as well as the two alongside. When considering how to change any cell, since there are only three cells to consider, and only two possible states for each, there are only arrangements to consider when changing any cell. Because of this small number, it’s no problem to display the rule of the automaton as a list of diagrams, each showing one of the eight starting arrangements and the resulting state of the middle cell in the next epoch.

With some initial state, for example:

The next line is produced by scanning over this line and matching the top row with the one from the rule icons.

Note that this wraps around, connecting the two edges. This process then continues indefinitely for successive lines. We’re free to pick any length of line and any arrangement of on and off cells as an initial condition, but once this is done, the rest is automated. This particular initial condition for this rule ends up looking like this:

This is kind of neat, but these processes don’t get to show off unless they’ve got some room to work with. Here’s the same rule applied to an initial state of a single black cell in a long row of white.

The repeating triangles are a pattern characteristic of some of the more interesting rules. Rules 30, 90, 110, and 184 are supposedly the most cited in scientific or mathematics articles.

Because there are only 8 arrangements of cells we need to specify to determine a 1-D ruleset, and each arrangement offers the choice of the middle cell turning on or off, there are possible rules of this type. You can see links to them all here. One could argue there are only 128 rules, 64, or even 32, since negating all the colors, taking a mirror image, or negating the mirror image doesn’t change the behavior much. This means each rule is one of a set of four that operate in the same way.

I’ve already linked to the Wolfram Atlas a few times. The images above were taken from there or made using Mathematica. Stephen Wolfram, of Mathematica fame, is also well-known for his work on cellular automata. He’s argued that these things may be more fundamental to nature than mathematics. It’s considered controversial. I haven’t read A New Kind of Science, a book where he lays this all out, but it’s free and online. Maybe worth checking out. It seems like some of the argument hinges on the fact that some of these automata are Turing complete — they can be set up to operate and produce results like any computer program.

I’d just like to note that my interest in cellular automata is based pretty much entirely on having fun. It’s 11 o’ clock on a Saturday night.

A step after checking out the 1-D rules is to explore the 2-D rules. Next I’ll show off a rule and try to do something interesting with it.

The Wrath of Khan is usually considered the best Star Trek movie out of the thirteen (or ten (or six), depending on how you count). Even if it’s not your favorite, it definitely tends to be the undeniable compromise. It’s got action, a bit of life advice, tension and suspense, fun space action, heroic and villainous sacrifice, and a few believable female characters, especially in comparison to the docile bunnies of the original series and the robotic Ilia of The Motion Picture.

Also in contrast with the original movie is the presence of a clear enemy. Khan is a real hole and a half, having no sympathetic qualities despite a reasonable motivation for revenge. V-ger is threatening but mysterious and literally nebulous. The movie is extremely approachable, allowing anyone to enjoy seeing a Bad Man fighting our heroes. You can also think a bit more deeply about Kirk’s struggle with aging if you want, too. That definitely doesn’t make this a cerebral or philosophical flick, but at least opens the movie up to people wanting more than just an action movie.

Most action movies glorify pluckiness. Quick thinking, resourcefulness, physical strength and speed all help the hero solve the problem at the last moment. This is a great way to keep the suspense up — you can feel that the cause is lost just before our hero comes through in the clutch. It’s a classic admirable quality in film and literature. It is especially revered in 19th century English works, Hollywood movies, and anything influenced by those (nearly everything in popular culture).

If our team details a military plan of action complete with contingencies, and then carries it out without a hitch, well. Okay. Good for them. But we want to see something that’s different from work.

An alternative to this is an Ocean’s Eleven style presentation, where the plan is described as a voice over during the execution, or with interstitial scenes.

Wrath of Khan does neither of these things, though. This isn’t strange at all. The Ulysses / Homestuck / Christopher Nolan style of nonlinear storytelling is pretty new as a pop culture thang.

Wrath of Khan is a story about sticking to protocol. It’s a rare instance in film, even in the Star Trek universe, when the guy in control has most everything figured out ahead of time. An old go-getter realizing the wisdom of experience and accepting loss and the unavoidable is the obvious theme of the movie. Supporting this theme, though, is a pragmatic lesson about planning and protocol.

There are three instances when things don’t go as expected. The first and the third support the top-level theme. The Kobayashi Maru test and the background about Kirk cheating it sets us up; Ok, Kirk is going to have to learn how to approach the no-win scenario. Later, when Spock sacrifices himself, Kirk realizes he wasn’t able to save everything in the end and he learns a lesson about aging and loss.

It’s pretty goofy that Kirk learns this lesson only when he loses Spock, but not all the other people that have died on his ship over the years.

In the middle is the screw-up that causes nearly all of the trouble of the movie. When Khan approaches the Enterprise with the Reliant, with no communications, Saavik quotes Star Fleet protocol: Put the shields up, you idiot! For some reason, Kirk doesn’t, and Spock defends him. Why doesn’t he do that? Shields are never described as a precious resource, or particularly difficult to raise or lower. Thinking along the lines of the movie’s theme, he’s probably feeling some thrill of skirting danger.

This screws everything up. Khan gets the chance to fire on the warp core and disable the Enterprise. Kirk very quickly owns up, and tells Saavik that she should keep reminding him of the regulations. If the ship hadn’t been disabled, the Constitution class ship would outmatch the Miranda easily.

After this point, everything is done by the book. Kirk stalls for time expertly while the crew looks up the Reliant’s bypass code. They explain this is a failsafe — a technical workaround planned ahead of time in case of such a situation. They’re able to take down the Reliant’s shields and hit hard, taking away Khan’s advantage.

The entire operation on the Regula asteroid involves a pre-planned code between Kirk and Spock to ensure Khan misunderstands the situation. The code makes Khan think that the Enterprise will return after two days of repairs, when they only needed two hours. The other stranded characters, as well as the viewer might be surprised when they find out that the ship is back early, but Kirk isn’t surprised at all. They’re following through with an agreed-upon plan. In Khantrast, the villain decides to hunt down Kirk and his ship in spite of his crew’s suggestions to escape with the starship Reliant, a fantastic, futuristic home to a bunch of fossils from the 90s.

Even when he’s keeping up the facade of being stranded, Kirk is going by the book. He starts with the most pressing issue, survival, while David is accusing him of ignoring escape.

Later in the Mutara Nebula, while not explicitly a by-the-book maneuver, the 500 meter drop which circumvents Khan’s “two dimensional thinking” probably reflects the Starfleet veterans’ training as their experience. The experience comment aligns with the coming-to-terms-with-aging theme, but it’s also hard to believe a space-naval strategy course wouldn’t give some statement like “Please use all three dimensions available in a fight” on the first day. It’s such a basic maneuver it hardly deserves a name. In submarine warfare, finding an enemy ship, overtaking it out of range, dropping below it and waiting for it to pass overhead before rising to fire torpedoes is a classic World War II tactic called an end-around (pdf). Kirk and crew just did the last bit.

The take-away here is that Kirk really messes up by not adhering to standard procedure in one instance, which leads to all of the further problems in the movie. Later he acts conservatively, avoiding brash, spur-of-the-moment decisions. From his point of view, everything goes swimmingly after the initial attack. Yes, Spock has to sacrifice himself in a last-ditch effort to fix the warp engine, but only after every other option had been exhausted.

It’s a palette cleanser to see a blockbuster-style movie based on planning. Aging gracefully is still the backbone motif of the movie, and it’s supported by the reliance on plan, not pluck. This doesn’t get in the way at all of delivering the story. It’s invisible. That’s good! But a rewatch, while keeping in mind their use of protocol, can help one appreciate how the characters got into and out of the whole mess.

When I studied math, I tended to find myself more interested in the “continuous” side of things rather than the discrete. This is calculus and analysis and such, in contrast to things like logic, abstract algebra, number theory, graphs and other things where everything is rather chunky. I suppose I was largely motivated by the use of calculus in physics, which was usually my main focus. It’s easy to see that calculus and continuous functions are essential when studying physics. It’s not as easy to argue for number theory’s direct applicability to an engineer.

Sometimes I feel a bit ashamed about this interest. Calculus and analysis are the next logical steps after real number algebra. One could argue I didn’t allow myself to expand my horizons~~! But, whatever, who cares. They’re still cool. OK? THEY’RE COOL.

It’s very easy to claim that you’re interested in something. It’s almost as easy to hear someone talk about how they’re a fan and try to call them out on it by asking about some trivia. This is often the crux of an argument in the whole “Fake Geek Girl” and Gamergate things. Similarly, sometimes, it feels like everyone around me is nuts about Local Football Home Team, and I often find myself skeptical of the “purity” of their fanaticism. I need to remind myself that someone can enjoy something and not know everything about it. You can be interested in Watching Football Match and not know the names of everyone (or even anyone) on the team.

The same is true for something like math. It had better be true, since there’s always another thing we could define or discover, and all of the fields already developed aren’t completely understood by a single person. It’s fine to be more interested in one thing rather than another. If you take that too far, you’d end up criticizing people for not being interested in every single thing in existence equally.

It’s all right to wonder if I should look into a certain topic, though. A few years ago, a colleague teaching introductory calculus to high school seniors mentioned that they were working on delta-epsilon proofs in the class. This blew me away! The concept of a limit is usually introduced in a pre-calculus class, or the beginning of a calculus class. I am under the impression that it’s usually a matter of, “yeah, this function gets really close to this point, but doesn’t actually hit it,” and that’s all a student really needs to know until they do some analysis. A delta-epsilon definition is a way to formally define a limit, so there is no uncertainty as to what is actually happening. “Gets really close” ends up being defined exactly — basically, it says, “For this function , gimme any number bigger than zero, even a SUPER tiny number, and I can give you a distance away from such that $latex $f(x)$ is never further from a limiting value L than your number.”

Okay, maybe that is not super enlightening. But on a side note, it’s fun to think about how much like a playground taunt that is.

I was ready to argue that his students didn’t need to bother with delta-epsilon proofs, that they could learn to work with the fuzzy idea of a limit first and get along just fine in calculus, just as I had. But, I did start to doubt myself. Should I have learned the definition of a limit before trying to use it, in my hubris?

In retrospect: no, that’s silly. Looking at epsilon-delta definition, I realize it would have taken me ages to get through it, taking away from valuable time spent thinking about the applications of calculus. But, that feeling is there. Should I have known about this?

What does this have to do with Hermitian matrices, you demand, grabbing me by the shoulders and shaking, while my cravat soaks up your spittle.

I had this same feeling this week, when thinking about a topic in linear algebra and physics. In quantum mechanics, matrices are used extensively to describe certain kinds of actions that could be done to a particle or a system. One of those actions could be to take a measurement, such as the amount of energy in the system. If you call your matrix H and your particle current state , then you could represent a measurement of how much energy the system has by multiplying them together.

When you multiply them together, you can get a single number times the particle state as a result. If H measured the energy of the particle, then the value of is that energy. You can’t just divide both sides by because a particle’s state is a vector of many possibilities, and division by vectors rather than numbers doesn’t mean a whole lot here. (You can multiply both sides by a vector transpose and get into something called Dirac notation, but don’t bother with that now.)

The number and the state $\psi$ are called eigenvalues and eigenvectors of…

Is this worth describing? If you don’t know this, it might be incomprehensible. It turns out that if H is Hermitian, meaning, it is self-adjoint:

,

then it always has real eigenvalues (as opposed to $\lambda$ being a complex number). Physicists interpret this to mean the matrix definitely represents a physical measurement. Hermitian matrices aren’t the only ones with real eigenvalues, but they also have a property that lets you be sure you’ve measured your particle as being in a certain state.

I’ve seen proofs that Hermitian matrices have real eigenvalues. Here are a couple. These start by assuming there is some eigenvalue/eigenvector pair, and using the fact that a vector magnitude is real at some point.

Finding the eigenvalues of a matrix is a matter of solving an equation which involves a determinant:

,

where I is the matrix identity. I thought, could I use an expanded determinant to show that the eigenvalues have to be real?

This isn’t so bad with a 2×2 matrix. With

,

the characteristic equation is

You can show the two solutions for have to be real by shoving all these numbers in the quadratic formula. The discriminant (remember that?) is positive because it ends up being a sum of squares. I’m bored.

My thought after this point was to use mathematical induction. We’ve shown that an Hermitian matrix has real eigenvalues. Let’s show that an one does as a consequence.

Maybe this is doable, but I haven’t done it. It occurred to me that all the cofactors of diagonal entries in a Hermitian matrix would be themselves Hermitian, and a proof by induction would rest on this.

My thought was… can you construct a determinant using only cofactors of the diagonal entries?

This turned out to be not helpful in an embarassing way. I asked myself, can you calculate a determinant by expanding over a diagonal, rather than a row or column? I was able to convince myself no, but the fact that I considered it at all seemed messed up. Shouldn’t that be something I should know by heart about determinants?

Similar to a student using limits without knowing the delta-epsilon definition, I realized that I don’t have a solid grasp of what determinants even are, although I’ve used them extensively. It felt like I was missing a huge part of my mathematics education. I don’t think I had ever proven any of the properties of determinants I had used in a linear algebra course.

I didn’t even know how to define a determinant. In high school, we learned how to calculate a 2×2 determinant. We then learned how to calculate determinants for larger matrices, using cofactors (although we didn’t use that word). But I didn’t (and still, don’t really) know what it was.

This doesn’t look unusual. I’ve got three books on linear algebra next to my desk here.

DeFranza and Gagliardi start by defining a 2×2 determinant as just ad-bc. It then tells how to calculate a 3×3 determinant, and then how to calculate larger determinants using cofactors. This seems in line with how I was taught (although this isn’t a coincidence. I took Jim DeFranza’s linear algebra class). The useful properties of determinants come later.

Zelinsky starts off (on page 166 of 266!) with an entire chapter on all of the algebraic properties we’d like determinants to have. It waits 11 pages to actually give explicit formulas for 2×2, then 3×3 matrices. It isn’t until after that that it gives something that feels like a definition.

Let A be an n x n matrix. We define the determinant of A by

where the summation ranges over all permutations of the set . The sign is taken as + or – according to whether the permutation is even or odd.

Woah. This seems like something fundamental. I had only known determinants as, effectively, recurrence relations. This is a closed form statement of the matrix determinant. Why don’t I recognize this?

Really, though, I can see why this might not be commonly taught. It’s even more cumbersome. But it feels like I missed the nuts and bolts of determinants while using them for so long.

That definition seems ripe for a Levi-Cevita symbol.

It’s probably not worth making most people trudge through millions of subscripts. That’s sort of the MO of mathematics, right? You make an easy-to-write definition or theorem, and then use that to save time and energy.

Maybe I’ll try to show Hermitian matrices have real eigenvalues using that definition. Descartes’ rule of signs or Sturm’s theorem might help. But I’m sleepy. So maybe later.

I thought this week about a problem I originally considered about ten years ago. I imagined a source of a laser beam, mounted high up above the ground, shining straight down, and allowed to rotate upwards at a constant rate until it was shining horizontally. The point at which the laser beam touched the ground would travel from directly below the source to the horizon.

The question is, if I know where the source of the laser is, and how quickly it is rotating, can I know exactly where the beam strikes the ground over time?

The problem came about, I’m sure, as I was listening to Jeff Wayne’s Musical Version of the War of the Worlds, imagining beams of Martian heat rays mounted to towering tripods sweeping across the hull of the Thunder Child.

Trigonometry: What’s the length of a side of a right triangle with a constant height?

This is not so bad of a problem when only the geometry is considered. For now let’s call the angle that the laser makes with the “straight down” direction (the vertical) “omega-t”: . With a length of time since the laser started shining, we can see that is a sort of speed — when I multiply it by a length of time , it gives a total angle, which is like a distance. The product works the same way that 30 miles per hour times 2 hours is 60 miles. In physics we’d call this speed of rotation the angular speed, or angular velocity if you’re considering rotations in all three dimensions. For now, it doesn’t matter what the value of this speed of rotation is.

Setting up the laser at an angle from the vertical and a height from the ground, we find it shines at a point away, and a horizontal distance to the right.

If you’re not so used to working with trig functions, you could get to the image above by first setting up the “classic” trig diagram, with the point a distance (hypotenuse) away:

The above image has all the right ratios, but keeps the hypotenuse constant, not the height. Divide all the lengths by , and remember that secant = 1/cosine (by definition).

We’re already done. The point at which the laser beam hits the ground is

Tangent “blows up” to infinity at $\pi/2$, which corresponds to the laser shining parallel to the ground. It intersects the flat ground an infinite distance away, at the horizon. Hopefully this agrees with your expectations; tangent is defined to act this way.

The Tripod Problem: Incorporating the speed of light

So that’s not super interesting. The real “tripod problem” was this: Where is the point of intersection if the speed of light isn’t infinite? If it takes some time for the laser beam to travel from the source to the ground, and the laser continues to rotate, then the location where the beam strikes will lag behind the orientation of the laser emitter.

This results in a “floppy” trajectory of the laser beam, drooping down to the ground

The behavior of the point where the laser strikes the ground is very different with the speed of light restriction. It never reaches the horizon in finite time — the behavior for long lengths of time, and far distances, is totally dominated by the speed of light. It should travel along the ground at a speed approaching c.

The way the pointer location moves depends little on the speed of light directly under the source, where the distance to the ground doesn’t depend strongly on the angle, and a wider angle of the laser covers a small length. As the laser approaches the horizontal, the length covered by each small change in angle increases. The light that will eventually strike very far distances looks more like a point source, since the small angles will be covered in a very short length of time. The trajectory of the beam itself, while it’s still in the air, will look more and more like an expanding circle with time.

When I first mentioned the tripod problem to a friend recently, he had the insight of saying that a laser’s point could definitely travel faster than the speed of light. He could shine a laser at one side of the moon, and then the other. A quick enough rotation on a far enough canvas could result in a pointer appearing to travel faster than the speed of light. Remember, this doesn’t violate anything in relativity. No object is traveling faster than light, rather, a series of events in which different photons strike the distant moon are occurring. This situation is very much like that when the tripod laser is pointed nearly downwards. The speed of the pointer is dominated not by the speed of light, but by the rate of the laser rotation because the distance the light has to travel doesn’t change much when the laser is pointed directly downwards (or from one side of the moon to the other).

This suggests that the speed of the pointer, at very far distances from the tripod, would approach c from slower speeds if the laser were rotating slowly. But, if the laser were rotating quickly enough, could counterintuitively approach c from faster speeds.

Anyway, let’s try to deal with the problem. Take a look at our trig diagram again.

When a certain portion of the laser beam is emitted at a time (and an angle ), it has to travel a distance . Traveling this distance at speed c takes a length of time equal to

.

Any portion of the heat ray travels in a straight line. Although the beam as a whole is curved, we’re still assuming it’s always traveling directly away from the source (no diffraction, etc.). The laser travels in the same direction and therefore strikes the ground at the point

at a later time

This seems great. We have the basis for a complete understanding of the position of the laser pointer (or toasted Edwardian human) given some time. A portion of the laser, emitted at time , will strike the ground a horizontal distance away not at but at above. This allows us to find the location corresponding to any time of emission in the interval

.

If we were satisfied with this, the game plan would be to pick a time of emission , determine how long that portion of the beam traveled, and then pair up the resulting with the distance .

I’m not satisfied, though. I’d like to get a trajectory of the laser pointer: a location as a function of the actual time rather than the time that portion of the laser was emitted, . In order to do this, we’d need to replace the in the tangent function with an equivalent function of . In order to do that, we’d need to solve

for . Good luck.

What we’ve got above is a transcendental equation. This means it is not composed of a finite number of additions, subtractions, multiplications and divisions of our variable and the constants, as well as rational powers of these. In most cases, and I’m pretty sure in this one, we can’t solve a transcendental equation exactly for the input variable. We cannot write in terms of .

It seems like the best we could do, if we wanted to create an animation with a step by step progression of the position of the pointer, is to prepare ahead of time. Pick an emission time , find the value of the tangent function to find the distance, find the value of the strike time , and record that pair. Then do this many, many times to create a table with more values than we expect someone to ask us for. We could find the position as a function of time with as much precision as we wanted, supposing we were willing to put the effort in.

I wanted a closed form solution to the problem, a trajectory , and it seems more than out of reach. This annoyed me, until a friend (hey, there, buddo!) reminded me that “closed-form” is just a matter of what I’m allowing as a definition. In fact, like I mentioned in the last post, all of the trig functions are themselves transcendental: They can be written as Taylor polynomials, but these are polynomials of infinite length. The secant in the equation above can be estimated using

The problem with this, though, is that this isn’t much better. It would still take an infinite amount of time to achieve the exact value of secant given most x’s. The only reason I’m more comfortable using this and the other trig functions is because I’ve been trained to use this name for them, and rely on calculators or tables to give me the values whenever I need them. Anyone using a trigonometric function table is benefiting from someone else’s hard work to overprepare. When we use a calculator, we are relying on an estimation that is as precise as the manufacturer (or sometimes the user) dictates. One could make this estimation with a Taylor series, or with a more efficient method, but the calculator still wont give an exact decimal value.

Any single irrational number, whether it is the solution to an algebraic or a transcendental equation, is another instance of this. I’ve gotten used writing things like or as representations of numbers with clear definitions. These numbers have exact values, but it’s hopeless for me to try writing them down. In a very definite sense, these numbers elude us. I could write or use them to any finite precision I wanted, with millions and millions of digits, so long as I were willing to come up with and use an efficient algorithm to find them, or if I were were willing to wait or work a very long time, or both. But, I still wouldn’t have the “exact” value, just one that was plenty good enough for whatever application I had in mind.

These examples remind me: it’s convenient to have named functions like “Cosine” to cover a mathematical idea, but we can’t let this name cover up the meaning of that idea. There are an infinite number of angles whose cosine is a transcendental value. We’re able to use cosine because we can always (right?) reach a higher precision than is necessary in a physical application. I’ve gotten used to working with cosine, and mentally separated it from the solution to the tripod problem, because someone gave it a name that I’ve adopted.

So, I guess I should name the solution. Let’s call the composed function

where and are related by

the heat ray function. We could create a huge table for x(t), someone could come up with an efficient algorithm for calculating values of x, and in the future we could use these to invade infinite planes with laser pointers more effectively.

Last week a colleague came to me with a puzzle. He asked me to punch in the tangent of 89 degrees into a nearby TI-83 calculator.

He asked me what was surprising about this number. I wasn’t surprised. I didn’t have an answer for him, although in retrospect I probably should have. He had to tell me this was the number of degrees in a radian. Oh! So it is.

Even further, he said, try punching in tan(89.9), or tan(89.99), etc.

Each one is (about) ten times the previous. (With the TI-83, replacing the last result with each new one, I didn’t see the “about” until later.) This is kinda neat! His question to me was: WHY is this true?

Tangent is a function that accepts an angle and spits out a ratio of lengths. It seems weird that the answer for tan(89) looks like a number of degrees. It is a unitless output, though, and ~57 degrees per radian is also unitless, so I suppose this isn’t much of an issue. The question is, why does it appear that

Ditching the degree measure,

My problem was in trying to first tackle this question visually.

An equivalent image:

I tried to explain this by imagining rolling the circle over the tangent line, wrapping the line around the circle, etc. I didn’t get anywhere.

I also tried considering the fact that there’s nothing particularly special about degree measure, except for the fact that 360 is an easy to divide number. Does this happen with other angle units? For example, what about a unit that was, instead of 1/360 of a circle, a larger 1/100 of a circle? We could instead take the equation above and ask,

Is the tangent of one one-hundredth of a circle short of equal to the number of hundredths of a circle in a single radian? It looks to be true!

But this is only approximate. We could extend this to any fractional unit of a circle:

Using this different unit, where the approximation is less accurate, I was able to see that the degree version wasn’t exactly true, either. It definitely looks like dividing the circle into a larger number (360, rather than 100) yields a closer approximation:

I was comfortable in concluding now that this wasn’t just a coincidence that relied on degree measure, and could extend this to include using 89.9, 89.99 etc degrees as well. In fact, tacking on .9s to the 1/nths of a circle units works. Just plugging in a bunch of numbers, it looks like

works for any n, and also extends to any power a, not just the integers.

The question remained, why is this true? Now that I saw it’s only an approximation, I realized that I should be going about this algebraically from the start.

A trick, called the small angle approximation, is used in physics often to get rid of pesky sines and tangents when you’d rather just have an expression with the angle inside.

This behavior is clear when the functions are written in their Taylor series form:

When x is real small, all the higher power terms get super small, and the approximation becomes more accurate.

This approximation was my first thought, but there’s a problem: it works for small angles, but my colleague’s puzzle was about angles near 90 degrees. In fact, we can’t even fudge the Taylor series of tangent near here, because there is no Taylor series around 90 degrees. (This is a consequence of the fact that tan(x) blows up to infinity at 90 degrees.)

The problem is solved by noting that working with tangent near 90 degrees is the same as working with another trig function, cotangent, near 0 degrees.

Setting everything up:

Done! Having the instead of any old number is unneccessary — this works for any multiple. However, integer a‘s makes the trick of having the same digits show up in tan(89), tan(89.9), etc. work.

So, we can show this algebraically. I just wish I had a nice geometric argument.

In the last post, I showed that the intersection of two perpendicular lines must lie on a circle, so long as the lines are each forced to go through particular points. The final result was a parameterization based on the classic cosine, sine version of a circle, but the bit I found more interesting was the earlier form:

One of the results of the parameterization was that the angle at which the point lied on the circle was not the angle at which one of the lines made with the x axis (unless the circle’s center was at the origin). This led to a phase shift in the parameterization from the angle of the line. If we were willing to lose a bit of information, the phase, we could also show that the above is a circle if it satisfies

with

Since we already have the parameterization with m above, showing this is true is just a matter of algebra. To start, add and subtract the coordinates of the center of the circle from x and y:

Finding a common denominator and carefully combining gives

We now have a form that, when plugged into the LHS for the circle equation, cancels out the center point coordinates.

The RHS of this thing is a bit easier to work with with by letting

It becomes

Careful manipulation yields

Nicely, the m‘s cancel out completely. This becomes