HOW TO BUILD A ROLLER COASTER
WITH ROCKET SCIENCE!
PART 1 OF 3
A really cool article in an October 2012 issue of IEEE Spectrum explained how scientists managed to turn the International Space Station using only gravity and gyroscopes, without help from the engines, saving millions of dollars' worth of rocket fuel.1 The same mathematical techniques could also be used to calculate ways to turn satellites rapidly to refocus them on new targets. In a discussion of the scientific theory used in this project, the article mentioned the famous brachistochrone problem. Under specific conditions, the problem asks for the path by which travel between two points can be achieved most quickly.
Imagine that you have been assigned to a team of engineers designing a roller-coaster track. The roller-coaster will climb slowly using engine power up to point A, where the ride begins. You can assume that the roller-coaster will begin at point A at a standstill, with no velocity. After that point, to save fuel, you've been instructed not to use the engines at all, so you can only use the attraction of gravity and the normal force of the track itself to power and direct the motion of the roller-coaster. The track will be thickly waxed and slippery, so you can ignore friction altogether. Your job is to build the first section of the track, a section that begins at point A and ends slightly lower, at point B. You want to construct the track in the ideal shape so that the roller coaster gets from point A to point B as quickly as possible. The customers need to hear the air roaring past them as they ride, or you'll earn a reputation for lame roller-coaster construction. In what shape should you construct the arc of the track to take the coaster from point A to point B in the shortest amount of time? Reworded to involve fewer roller-coasters and more mind-numbing theoretical jargon about point masses, this problem is famous as the brachistochrone problem, from the Greek for 'shortest time.'
The Spectrum article poses the brachistochrone problem and states the solution, but does not explain how the solution can be determined. I think this is unfortunate, because the methods used to solve the problem are as interesting as the problem itself. These methods form the basis of optimization theory, which has many exciting applications, including those discussed in the Spectrum article.
Both Wikipedia2 and Wolfram MathWorld3 have pages devoted to the brachistochrone problem. Unfortunately, at the time I am writing this, Wikipedia does not show how to take advantage of optimization theory by using the Euler-Lagrange equation to solve the problem. MathWorld does present a solution using the Euler-Lagrange equation, but it does not fully explain the logical context of the mathematics, and the solution skips some steps. In this post I provide a thorough, step-by-step examination of the brachistochrone problem in order to expose the elegance with which optimization theory finds complicated answers to simple questions. A basic knowledge of calculus is required to understand the explanation that follows.
FINDING VELOCITY: CONSERVATION OF ENERGY
It is intuitive to guess that the optimal solution is a straight line between A and B, because a straight line is the shortest distance between two points. Travel time, however, depends not only on the distance involved but also on the speed with which that distance is traversed. For example, Philadelphia is closer to New York than San Francisco is, but I bet I could fly to San Francisco from New York faster than you could walk to Philadelphia from New York, because airplanes move more quickly than pedestrians. Similarly, although a straight line represents the shortest distance between A and B, the roller-coaster would not accelerate as quickly along a straight path as it would along a path with a steeper initial descent. Therefore, in order to determine the optimal path, we need to consider velocity as well as distance.
What is the velocity of the roller-coaster? It is not constant: gravity and the normal force of the track make the roller-coaster accelerate. However, since we have assumed that friction is negligible, we can use conservation of energy to determine the velocity of the roller-coaster at any individual point along the track. The only forms of the coaster's energy that change are gravitational potential energy and kinetic energy.
Let m be the roller-coaster's mass, g be the acceleration of gravity, h be the height of the coaster above the ground, and v be the magnitude of its velocity. Then, we know from conservation of energy that $$mgh+\frac{1}{2}mv^2=C$$ where C is some constant, since the total energy is constant.
Thus:
$$\Delta \left( mgh+\frac{1}{2} mv^2 \right) = \Delta C = 0$$
$$\Delta mgh+ \Delta \frac{1}{2} mv^2 = 0$$
$$\Delta \frac{1}{2} mv^2 = - \Delta mgh$$
Factor out m:
$$\Delta \frac{1}{2} v^2 = - \Delta gh$$
Since constants do not change:
$$\frac{1}{2} \Delta v^2 = -g \Delta h$$
$$\Delta v^2 = -2g \Delta h$$
Let x0 denote the initial value of x for any variable x. Then:
$$v^2 - v_0^2 = -2g(h-h_0)$$
Since the initial velocity v0 is zero:
$$v^2 = - 2g \left( h-h_0 \right)$$
$$v = \sqrt{-2g \left( h-h_0 \right) }$$
This is correct but awkward. In order to remove ugly negative signs and simplify the resulting expression, it is convenient for us to introduce a variable y representing depth, which is the opposite of height. The depth y increases along an upside-down vertical axis centered at A so that y(A) = 0 with positive y pointing downwards. As the roller-coaster rolls downwards, the y-value of its position increases while h decreases. Thus, y is a normalized version of the height h:
$$\Delta y = - \Delta h$$
$$y(A) = y_0 = 0$$
Using this more convenient coordinate:
$$\Delta v^2 = -2g \Delta h$$
$$\Delta v^2 = 2g \Delta y$$
$$v^2 - v_0^2 = 2g \left(y-y_0 \right)$$
Since the roller-coaster begins at point A:
$$v^2 - v_0^2 = 2g \left(y-y \left(A \right) \right)$$
Since y(A) = 0:
$$v^2 - v_0^2 = 2gy$$
Since the initial velocity is zero:
$$v^2 = 2gy$$
$$v = \sqrt{2gy}$$
This looks much nicer. Now, we know the velocity of the roller-coaster at any given point along the track, since we know the height of the track at that point (this should be specified in the design of the track's shape).
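To get a feel for the numbers, here is a minimal Python sketch of this formula. The value g = 9.81 m/s^2 and the sample depths are my own assumptions for illustration, and the function name `speed_at_depth` is made up for this example.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2 (not specified in the text)


def speed_at_depth(y, g=G):
    """Speed of a frictionless coaster that started at rest at A,
    once it has dropped a depth y below A: v = sqrt(2 g y)."""
    if y < 0:
        raise ValueError("depth y must be non-negative: the coaster cannot rise above A")
    return math.sqrt(2.0 * g * y)


if __name__ == "__main__":
    # Hypothetical sample depths in meters.
    for depth in (0.0, 1.0, 5.0, 20.0):
        print(f"depth {depth:5.1f} m -> speed {speed_at_depth(depth):6.2f} m/s")
```

Notice that the mass never appears: in this frictionless model, a heavy coaster and a light one reach the same speed at the same depth.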
TIME ELAPSED AS A PATH INTEGRAL
Any time an object travels along a path, the time it takes to travel along that path is equal to the length of the path (the distance traversed) divided by the average speed of the object. In this problem, the average speed of the object in question is not easy to determine, because the speed of the object is constantly changing due to the acceleration of gravity and the normal force of the track. Therefore, it is convenient to break up the track into small sections so that the speed of the object is approximately constant within each of those sections. Then, the time taken to cross each section of the track is equal to the length of that section divided by the speed of the object when it is in that part of the track. The total time taken will be equal to the sum of these individual times. In the limit as we break up the track into an infinite number of infinitely small sections in order to justify our assumption that the speed of the roller-coaster is constant within any one section, we are effectively integrating over the arc length of the path.
Thus, we have determined:
$$t = \int \limits_{path} \! \frac{\mathrm{d}s}{v}$$
where t is time, ds is an infinitesimal element of arc length, v is the velocity at that point of the track, and the integral is taken over some path of interest from A to B.
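The "chop the track into small sections" idea can be sketched directly in Python. This is only an illustration under my own assumptions: g = 9.81 m/s^2, a hypothetical straight ramp 10 m wide with a 5 m drop, and the speed within each section taken as the average of the speeds at its two ends, computed from v = sqrt(2gy). The names `travel_time` and `straight_ramp` are invented for this sketch.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def travel_time(points, g=G):
    """Sum of (section length) / (section speed) over short straight sections.

    `points` is a list of (x, y) pairs, where y is the depth below A.
    The first point is A, where the coaster starts at rest.
    Each section must end below A, otherwise its speed would be zero.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ds = math.hypot(x1 - x0, y1 - y0)       # length of this section
        v_start = math.sqrt(2.0 * g * y0)       # speed entering the section
        v_end = math.sqrt(2.0 * g * y1)         # speed leaving the section
        total += ds / (0.5 * (v_start + v_end))  # time = length / average speed
    return total


def straight_ramp(width, drop, n):
    """n straight sections from A = (0, 0) to B = (width, drop)."""
    return [(width * i / n, drop * i / n) for i in range(n + 1)]


if __name__ == "__main__":
    width, drop = 10.0, 5.0  # hypothetical ramp dimensions in meters
    length = math.hypot(width, drop)
    # Uniform acceleration down an incline gives t = length * sqrt(2 / (g * drop)).
    print("incline formula:", length * math.sqrt(2.0 / (G * drop)))
    print("sum of sections:", travel_time(straight_ramp(width, drop, 1000)))
```

On a straight ramp the acceleration is uniform, so averaging the two end speeds is exact for every section, and the sum can be checked against the textbook incline formula. For a curved track the sum is only an approximation that improves as the sections shrink.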
In order to justify using calculus, we assume that the shape of the track can be described by a smooth, differentiable function. This is not only convenient because it allows us to use calculus; it also makes physical sense. Any path with a sharp corner that made it non-differentiable would be poorly designed. We want the roller-coaster to slide smoothly along the track, not bounce or fall off. In an extreme counterexample, if we built an L-shaped track that went straight down and then straight across, the roller-coaster would fall smack into its own tracks. This could injure the passengers, who might become upset. For the kinds of tracks we would actually want to design, calculus does apply.
Let's incorporate our earlier result from conservation of energy into the arc length integral:
$$t = \int \limits_{path} \! \frac{\mathrm{d}s}{v}$$
$$v = \sqrt{2gy}$$
$$t = \int \limits_{path} \! \frac{\mathrm{d}s}{\sqrt{2gy}}$$
So far, so good. Now, we need to get rid of the annoying ds term. Arc length is an interesting abstract concept but not an easy one to work with mathematically. The image below from Wikipedia4 shows that any arc can be approximated by concatenating multiple hypotenuses. In the limit as we use infinitely many infinitesimal hypotenuses in a line integral over the arc length, the approximation converges to the exact arc length.
We can take advantage of this to reframe our expression for travel time, t, in terms of our vertical y variable from earlier and a horizontal position variable, x, that increases in the lateral direction from A towards B. In particular, we can use the Pythagorean theorem to express ds in terms of dy and dx:
$$\mathrm{d}s^2 = \mathrm{d}x^2 + \mathrm{d}y^2$$
$$\mathrm{d}s = \sqrt{\mathrm{d}x^2 + \mathrm{d}y^2}$$
Abuse of notation with algebra on infinitesimals produces a nicer integrand:
$$\mathrm{d}y = \frac{\mathrm{d}y}{\mathrm{d}x} \ \mathrm{d}x$$
$$\mathrm{d}y^2 = \left(\frac{\mathrm{d}y}{\mathrm{d}x} \right)^2 \ \mathrm{d}x^2$$
$$\mathrm{d}s = \sqrt{\mathrm{d}x^2 + \left(\frac{\mathrm{d}y}{\mathrm{d}x} \right)^2 \ \mathrm{d}x^2}$$
$$\mathrm{d}s = \sqrt{\mathrm{d}x^2 \left(1 + \left(\frac{\mathrm{d}y}{\mathrm{d}x} \right)^2 \right)}$$
$$\mathrm{d}s = \sqrt{ \left(1 + \left(\frac{\mathrm{d}y}{\mathrm{d}x} \right)^2 \right)} \ \mathrm{d}x$$
We can use this result to convert our line integral over arc length (ds) into a regular integral across the horizontal axis (dx). In doing so, we must assume that we are only considering paths that can be represented by single-valued functions of horizontal position (x) that produce the height at each position in a one-to-one correspondence. Again, in addition to being mathematically convenient, this makes logical sense. Although loop-de-loops are exciting, they are clearly not the fastest way to get from point A to point B. In general, any path that doubles back over or under itself wastes time. Therefore, we can proceed without any grave concern. Limiting ourselves to nice functions is not a great loss; practically speaking, finding the optimal smooth, single-valued path is often equivalent to finding the optimal path in general.
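If the algebra on infinitesimals feels suspicious, a quick numerical check may help. The sketch below measures the length of a curve in two ways: by adding up many short hypotenuses, and by integrating the square root of 1 + (dy/dx)^2 over x with a simple midpoint rule. The test curve y = x^2 on the interval from 0 to 1 is my own choice, picked only because its arc length has a known closed form, and the function names are invented for this example.

```python
import math


def polyline_length(f, a, b, n):
    """Approximate the arc length of y = f(x) on [a, b] with n hypotenuses."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(
        math.hypot(xs[i + 1] - xs[i], f(xs[i + 1]) - f(xs[i]))
        for i in range(n)
    )


def integral_length(dfdx, a, b, n):
    """Midpoint-rule estimate of the integral of sqrt(1 + (dy/dx)^2) dx."""
    h = (b - a) / n
    return h * sum(
        math.sqrt(1.0 + dfdx(a + (i + 0.5) * h) ** 2) for i in range(n)
    )


def curve(x):
    return x * x       # hypothetical test curve y = x^2


def slope(x):
    return 2.0 * x     # its derivative dy/dx


# Closed-form arc length of y = x^2 on [0, 1].
EXACT = (2.0 * math.sqrt(5.0) + math.asinh(2.0)) / 4.0

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, polyline_length(curve, 0.0, 1.0, n), integral_length(slope, 0.0, 1.0, n))
    print("closed form:", EXACT)
```

Both estimates should approach the closed-form value as n grows, which is what the limit argument says.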
Henceforth, let f' denote the derivative of f with respect to x, for any function or variable f. Then:
$$\mathrm{d}s = \sqrt{1+y'^2} \, \mathrm{d}x$$
Now substitute this result into our previous expression for travel time:
$$t = \int \limits_{path} \! \frac{\mathrm{d}s}{\sqrt{2gy}}$$
$$t = \int \limits_{path} \! \frac{\sqrt{1+y'^2} \mathrm{d}x}{\sqrt{2gy}}$$
Henceforth, let the x-values of A and B be called a and b. Then:
$$t = \int_a^b \! \frac{\sqrt{1+y'^2} \mathrm{d}x}{\sqrt{2gy}}$$
$$t = \int_a^b \! \sqrt{\frac{1+y'^2}{2gy}} \, \mathrm{d}x$$
Alright! We've now produced a fairly simple expression for calculating the time that the roller-coaster would take to travel across a given path in terms of the vertical depth of the path (y) and its slope (y') as the path progresses from left to right away from A and towards B (dx). Given any two paths, we could evaluate the integral for each of the paths and compare the values. The path with the smaller integral would be better, because it would take a shorter time to traverse. However, since there are an infinite number of possible paths, we can't compare all of them to find the optimal path. How can we minimize the time taken to get from A to B, using our knowledge of how to find the time taken to traverse a given path? In order to progress, we need to review some of the basic theory behind a method we use often in calculus: differentiation to find extrema.
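Comparing two candidate paths with this integral is easy to try numerically. The sketch below is only a rough illustration under my own assumptions: g = 9.81 m/s^2, a hypothetical layout with B 10 m across from A and 5 m below it, and two made-up tracks, a straight ramp and a parabola that starts steeper and flattens out at B. The integrand blows up at A, where y = 0, so I use a midpoint rule that never evaluates it exactly there. It converges slowly near A, so treat the digits as approximate.

```python
import math

G = 9.81                 # assumed gravitational acceleration in m/s^2
WIDTH, DROP = 10.0, 5.0  # hypothetical position of B relative to A, in meters


def travel_time(y, dydx, a, b, n=100_000, g=G):
    """Midpoint-rule estimate of t = integral from a to b of
    sqrt((1 + y'^2) / (2 g y)) dx, where y is depth below A."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h  # midpoint of the i-th slice, never exactly at A
        total += math.sqrt((1.0 + dydx(x) ** 2) / (2.0 * g * y(x)))
    return total * h


def line_y(x):
    return DROP * x / WIDTH               # straight ramp from A to B


def line_slope(x):
    return DROP / WIDTH


def curve_y(x):
    u = x / WIDTH
    return DROP * (2.0 * u - u * u)       # parabola: steeper start, flat at B


def curve_slope(x):
    return DROP * (2.0 - 2.0 * x / WIDTH) / WIDTH


if __name__ == "__main__":
    print("straight ramp:", travel_time(line_y, line_slope, 0.0, WIDTH), "s")
    print("curved ramp:  ", travel_time(curve_y, curve_slope, 0.0, WIDTH), "s")
```

With these particular numbers the curved ramp comes out ahead of the straight one, even though it is longer. That only ranks two hand-picked paths, though; it says nothing about whether some third path would beat both.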