

Straight Lines

When we think of a straight line, we usually think of a line in the Euclidean sense; that is, c(t)=p+tX, where p is a point contained in the line, t is a real number, and X is a vector that points parallel to the line. If we consider Euclidean space as a manifold, we would say that X is in the tangent space T_{c(t)}(\mathbb E^n), because c'(t)=X. One important observation to make is that all along c(t), X never changes; i.e., we never accelerate. That is, if we move along the curve, we never speed up or slow down, and we never turn.

In the language of my post on covariant derivatives, this is easy to express:

\nabla_{c'} c' \equiv 0

The geometric interpretation is simple here: in the direction of the velocity vector, the velocity vector doesn’t change. You can probably see the punchline coming by now. If we generalize to a curve c(t) on a manifold M, c(t) is a geodesic if \nabla_{c'} c' \equiv 0.

Now, you may notice that we can trace out the same curve if we tweak the parameter t so that we could accelerate on the curve (we wouldn’t turn, but we could speed up or slow down). That is, we could have an alternate parametrization. But in order to have a geodesic, we need \nabla_{c'} c' \equiv 0, so \nabla_{c'}\left<c',c' \right> = 2\left<\nabla_{c'} c',c' \right> = 0, and therefore \|c'\| is a constant along the curve. This gives us a unique parametrization of the curve, up to a constant scaling factor on the parameter. In fact, if we consider such a scaling factor, we get that \nabla_{c'(st)} c'(st) = \nabla_{s c'(t)} s c'(t) = s^2\nabla_{c'(t)} c'(t) = 0, so a geodesic with a constant scaling factor on its parameter is still a geodesic (and obviously has the same image). This motivates the following definition: if \|c'\| = 1 then the geodesic is called a normal geodesic.
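A quick numerical sanity check of the two claims above, as a minimal sketch. It assumes the unit sphere sitting in \mathbb E^3, where the covariant derivative of the velocity along a curve is the tangential part of the ordinary acceleration. The function names are made up for illustration.

```python
import math

def great_circle(t):
    """Unit-speed great circle on the unit sphere (the equator)."""
    return (math.cos(t), math.sin(t), 0.0)

def reparametrized(t):
    """Same image as great_circle, but traversed with changing speed."""
    return great_circle(t * t)

def second_derivative(curve, t, h=1e-4):
    """Central finite-difference approximation of c''(t)."""
    a, b, c = curve(t - h), curve(t), curve(t + h)
    return tuple((x - 2 * y + z) / (h * h) for x, y, z in zip(a, b, c))

def tangential_acceleration(curve, t):
    """Project c''(t) onto the tangent plane of the sphere at c(t).

    On the unit sphere the unit normal at a point p is p itself, so the
    tangential part of a vector w is w - <w, p> p.
    """
    p = curve(t)
    acc = second_derivative(curve, t)
    normal_part = sum(a * x for a, x in zip(acc, p))
    return tuple(a - normal_part * x for a, x in zip(acc, p))

def norm(v):
    return math.sqrt(sum(x * x for x in v))

geodesic_residual = norm(tangential_acceleration(great_circle, 0.7))
reparam_residual = norm(tangential_acceleration(reparametrized, 0.7))
print(geodesic_residual)  # close to zero: constant-speed great circle
print(reparam_residual)   # roughly 2: same image, but not a geodesic
```

The second curve never turns, but it speeds up, so its tangential acceleration is nonzero and it fails the geodesic equation even though it traces the same great circle.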

The Exponential Map

Say that some curve \gamma(t) is a geodesic. Then \nabla_{\gamma'}\gamma' = 0 is a second-order differential equation in t. If we assume that \gamma(0) = p, and \gamma'(0) = v, then we have the required conditions for existence and uniqueness of a solution to the differential equation. That is, given a point p\in M and tangent vector v\in T_p(M), there is a unique geodesic \gamma_v that passes through p with velocity v.

The exponential map \text{exp}_p:T_p(M)\to M is defined as \text{exp}_p(v) = \gamma_v(1), assuming that 1 is in the domain of \gamma_v. The exponential map is fairly important when talking about Riemannian manifolds, and it turns out that it is smooth and a local diffeomorphism near the origin of T_p(M). The latter means that there is a neighborhood U of p on which a unique smooth inverse exists. This inverse is the logarithmic map, \text{log}_p:U\to T_p(M).
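To make the two maps concrete, here is a minimal sketch on the unit sphere in \mathbb E^3, where geodesics are great circles and both maps have well-known closed forms. The sphere and the function names are my choice of example, not part of the definition above.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def exp_map(p, v):
    """exp_p(v) on the unit sphere: walk distance |v| along the great
    circle through p in the direction of the tangent vector v."""
    r = norm(v)
    if r == 0.0:
        return tuple(p)
    return tuple(math.cos(r) * a + math.sin(r) * b / r for a, b in zip(p, v))

def log_map(p, q):
    """log_p(q) on the unit sphere, valid when q is not antipodal to p."""
    c = max(-1.0, min(1.0, dot(p, q)))   # clamp against rounding error
    theta = math.acos(c)                 # angular distance from p to q
    w = tuple(b - c * a for a, b in zip(p, q))  # part of q orthogonal to p
    wn = norm(w)
    if wn == 0.0:
        # q == p; the antipodal case q == -p has no well-defined logarithm
        return tuple(0.0 for _ in p)
    return tuple(theta * x / wn for x in w)

p = (0.0, 0.0, 1.0)          # north pole
v = (0.3, -0.4, 0.0)         # tangent vector at p, length 0.5
q = exp_map(p, v)
v_back = log_map(p, q)
print(q)
print(v_back)                # should reproduce v up to rounding
```

The round trip only works for \|v\| < \pi; past that, the exponential map on the sphere stops being injective, which is why the inverse is only defined locally.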

The exponential map is so important, in fact, that it appears in many of the important theorems in Riemannian geometry, like the Hopf-Rinow Theorem and the Cartan-Hadamard Theorem. It’s also essential to understanding the effects of curvature on a Riemannian manifold.

Arc Length

At this point we can ask about the relationship between arc length and geodesics. Assume that we have some smooth function \alpha : [a,b]\times(-\epsilon,\epsilon)\to M. We can compute the change in arc length L[c_s] over the family of curves c_s = \alpha | [a,b]\times\{s\}:

\frac d{ds}L[c_s] = \frac d{ds}\int_a^b\left<c_s'(t),c_s'(t)\right>^{1/2}dt = \int_a^b\nabla_S\left<T,T\right>^{1/2}dt
= \frac 1 2\int_a^b\left<T,T\right>^{-1/2}\nabla_S\left<T,T\right>dt = \int_a^b\left<T,T\right>^{-1/2}\left<\nabla_S T,T\right>dt

The variables S,T that we substitute here are the tangent vector fields given by the differential of \alpha with respect to the variables s,t. The rest is just calculus. Since s,t are independent of each other, we know that their derivatives commute and so we can say that [S,T] = 0. Because the Riemannian connection is torsion-free, \nabla_S T - \nabla_T S = [S,T] = 0, which means that we can make the switch \nabla_S T = \nabla_T S:

\frac d{ds}L[c_s] = \int_a^b\left<T,T\right>^{-1/2}\left<\nabla_T S,T\right>dt
= \int_a^b\left<T,T\right>^{-1/2}\left(T\left<S,T\right>-\left<S,\nabla_T T\right>\right)dt

If we consider the curve c_0, and consider that we can always reparametrize a curve without loss of generality so that l = \left<T,T\right>^{1/2} is a constant,

\frac d{ds}L[c_s]\mid_{s = 0} = l^{-1} \left(\left<S,T\right>\mid_a^b-\int_a^b\left<S,\nabla_T T\right>dt\right)

This is called the first variation formula. The function \alpha is called a variation. If we assume that all the c_s are curves that join two points in M, then we know that S vanishes at the endpoints. If we further assume that c_0 is a geodesic, then the integral vanishes (because \nabla_T T = 0). What this means is that geodesics are critical points of the arc length function L for curves that join two points.
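The formula can be checked numerically, again as a sketch on the unit sphere. The assumptions here are all mine: the base curve is an arc of a latitude circle at colatitude \phi_0 (a geodesic only when \phi_0 = \pi/2, the equator), and the variation pushes the curve toward or away from the pole by s\sin(\pi t/b), which vanishes at both endpoints. For this particular variation the right-hand side of the formula works out to \cos\phi_0 \cdot 2b/\pi.

```python
import math

def point(colat, lon):
    """Point on the unit sphere with the given colatitude and longitude."""
    return (math.sin(colat) * math.cos(lon),
            math.sin(colat) * math.sin(lon),
            math.cos(colat))

def length(colat0, s, b=1.0, n=2000):
    """Approximate arc length of c_s(t), t in [0, b], by a fine polyline."""
    total = 0.0
    prev = None
    for k in range(n + 1):
        t = b * k / n
        bump = math.sin(math.pi * t / b)   # vanishes at both endpoints
        cur = point(colat0 + s * bump, t)
        if prev is not None:
            total += math.dist(prev, cur)
        prev = cur
    return total

def dlength_ds(colat0, h=1e-4):
    """Central-difference estimate of d/ds L[c_s] at s = 0."""
    return (length(colat0, h) - length(colat0, -h)) / (2 * h)

def first_variation(colat0, b=1.0):
    """-(1/l) * integral of <S, nabla_T T> dt for this specific variation."""
    return math.cos(colat0) * 2 * b / math.pi

equator = dlength_ds(math.pi / 2)
latitude = dlength_ds(math.pi / 3)
print(equator)                                  # close to zero (geodesic)
print(latitude, first_variation(math.pi / 3))   # the two should agree
```

For the equator the derivative vanishes, as expected for a geodesic. For the latitude circle it does not, and the finite-difference estimate should match the integral term of the formula.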

We can’t claim that every geodesic segment minimizes the distance between two points (a minimizing geodesic segment does exist between nearby points, but deciding whether a given geodesic is minimizing requires the second variation formula, which I won’t get into in this post). To see this, consider the case when M is the unit sphere, with the usual angular metric. If we consider any two distinct, non-antipodal points, there is a great circle path that joins them whose length is the angular distance between them, \delta. However, there is also a path of length 2\pi - \delta that goes around “the long way” along the same great circle and joins the points as well. This path is also a geodesic segment, so it is a critical point of arc length, but it is clearly not a minimum. It isn’t a maximum either, since any curve can be made longer by adding wiggles; the first variation formula only tells us that it is a critical point.

It’s easy to see that the first variation formula gives us a lot of power in talking about the geometry of a Riemannian manifold. The source that I use actually motivates the definition of a geodesic from an effort to minimize arc length using the first variation formula. I prefer to motivate it from the “straight line” perspective.


Much of this material comes from Comparison Theorems in Riemannian Geometry by Jeff Cheeger and David G. Ebin.


Riemannian Connections

For the project that I’m working on, I needed to know the basics of Riemannian connections. Connections confused the hell out of me until I took a few days to really absorb them. I’m writing down my interpretation here so that I can burn it into the neurons, and hopefully help someone else trying to understand the same topic.

Covariant Derivatives of Scalar Functions

A connection is also called a covariant derivative. One of the principles of differential geometry is that everything should behave the same regardless of which coordinate system you work in, so we’d like a way to take the derivative of a quantity along an arbitrary direction. When we consider a scalar function f, the covariant derivative is just the directional derivative. If X = \sum_{i=1}^n a_i E_i :

\nabla_X f = Xf = \sum_{i=1}^n a_i \frac{\partial f}{\partial x_i}

I found it extremely useful to think of the covariant derivative as a linear operator:

\nabla_X f = \left(\sum_{i=1}^n a_i \frac\partial{\partial x_i}\right)f

Covariant Derivatives of Vector Fields

If we want to apply \nabla_X to a vector field Y, then we can apply the operator:

\nabla_X Y = \left(\sum_{i=1}^n a_i \frac\partial{\partial x_i}\right)Y = \sum_{i=1}^n a_i \frac{\partial Y}{\partial x_i}

Immediately we can see an interpretation for \nabla_X Y: see how Y changes with respect to each coordinate direction, and then sum the resulting vectors together, weighted by each component of X. It’s easy to see how this gives us a coordinate-free derivative of a vector field. What we have right now is called an affine connection.
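That recipe is short enough to write out. The following is a minimal sketch for flat Euclidean space in Cartesian coordinates only, using finite differences; the field Y and the helper names are hypothetical choices for illustration.

```python
def Y(p):
    """Example vector field on the plane: a rotation field."""
    x, y = p
    return (-y, x)

def partial(field, p, i, h=1e-6):
    """Central-difference estimate of the partial derivative of the
    vector field with respect to coordinate x_i at the point p."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return tuple((a - b) / (2 * h) for a, b in zip(field(plus), field(minus)))

def covariant_derivative_flat(X, field, p):
    """Sum of a_i * dY/dx_i: how Y changes along each coordinate
    direction, weighted by the components a_i of X."""
    result = [0.0] * len(field(p))
    for i, a_i in enumerate(X):
        for k, comp in enumerate(partial(field, p, i)):
            result[k] += a_i * comp
    return tuple(result)

result = covariant_derivative_flat((1.0, 2.0), Y, (0.5, -1.0))
print(result)  # approximately (-2.0, 1.0)
```

Here \partial Y/\partial x = (0,1) and \partial Y/\partial y = (-1,0), so with X = (1,2) the weighted sum is (-2,1).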

Affine Connections

Affine connections have two properties: linearity in X and the product rule on fY. Both are immediate from the operator representation:

\nabla_{fU+gV} Y = f\nabla_U Y + g\nabla_V Y
\nabla_X fY = (\nabla_X f)Y + f\nabla_X Y

This means that we can expand the representation in X:

\nabla_X Y = \sum_{i=1}^n a_i\nabla_{E_i}Y

It should be pretty obvious that \nabla_{E_i}Y is the same as \partial Y/\partial x_i, in that they both represent how Y changes in the unit direction of x_i. If you’ve been paying attention, you’ve probably been wondering about how we compute these constructs. It’s fairly straightforward to assume that in Cartesian coordinates, we just differentiate each component of Y. What about in other bases? Well, assuming that Y = \sum_{j=1}^n b_j E_j, we can just apply the product rule on the terms:

\nabla_X Y = \sum_{i=1}^n a_i\nabla_{E_i} \sum_{j=1}^n b_j E_j
= \sum_{i=1}^n a_i \left(\sum_{j=1}^n \left(\nabla_{E_i} b_j\right) E_j + \sum_{j=1}^n b_j \nabla_{E_i} E_j\right)
= \sum_{i,j} a_i \frac{\partial b_j}{\partial x_i} E_j + \sum_{i,j} a_i b_j \nabla_{E_i} E_j

In Cartesian coordinates, the second term is going to vanish, because the coordinate directions don’t change with respect to any direction. So our assumption about Cartesian coordinates is correct. In other bases, we can just think of the second term as a corrective factor for the curvature of the coordinate frames. In most texts, the vector \nabla_{E_i} E_j = \sum_{k=1}^n \Gamma_{ij}^k E_k is defined, where the \Gamma_{ij}^k are called Christoffel symbols. I won’t get into them here, except to say that they have some important symmetries.
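As a small worked example of that corrective term, assume polar coordinates (r,\theta) on the plane with the coordinate frame E_r = \partial/\partial r, E_\theta = \partial/\partial\theta. Written in Cartesian components, the frame vectors can simply be differentiated component by component:

```latex
E_r = (\cos\theta, \sin\theta), \qquad E_\theta = (-r\sin\theta, r\cos\theta)

\nabla_{E_r} E_r = \frac{\partial E_r}{\partial r} = 0

\nabla_{E_r} E_\theta = \frac{\partial E_\theta}{\partial r} = (-\sin\theta, \cos\theta) = \frac 1 r E_\theta

\nabla_{E_\theta} E_r = \frac{\partial E_r}{\partial \theta} = (-\sin\theta, \cos\theta) = \frac 1 r E_\theta

\nabla_{E_\theta} E_\theta = \frac{\partial E_\theta}{\partial \theta} = (-r\cos\theta, -r\sin\theta) = -r E_r
```

Reading off coefficients, the only nonzero Christoffel symbols are \Gamma_{r\theta}^\theta = \Gamma_{\theta r}^\theta = 1/r and \Gamma_{\theta\theta}^r = -r. The symmetry in the lower two indices is one of the symmetries mentioned above.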

Riemannian Connections

If you’re familiar with this material, you may have noticed that I’ve hand-waved a lot. There’s a lot of machinery that needs to be set up to prove existence and uniqueness of all these constructs. It’s also machinery that works fairly well in Euclidean space, but we can’t make the same assumptions on general smooth manifolds. We’d like a connection that works on general manifolds, but we need to make some extra assumptions. A Riemannian connection is an affine connection with some extra properties:

\nabla_X Y - \nabla_Y X = \left[X,Y\right]
\nabla_X\left<U,V\right> = \left<\nabla_X U,V\right> + \left<U,\nabla_X V\right>

Where \left<\cdot,\cdot\right> is an inner product on the tangent space, and \left[\cdot,\cdot\right] is the Lie bracket. The first condition imposes a restriction on the coordinate frames that states that the frames must be torsion-free; that is, the coordinate frames may not twist when moving in any particular direction. The second just imposes the product rule on the inner product. Euclidean space already has these properties, so the covariant derivative as I described it above is a Riemannian connection.

These extra rules basically allow us to assume that a connection \nabla is unique on any particular smooth manifold that has an inner product defined on its tangent space, and that we can use the above formula to write it out explicitly. There’s a lot more to it, of course, but we have enough to work with. I’ll be writing more posts that cover this topic, but I encourage you to read up on it yourself and derive your own intuition of what’s going on.
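For reference, the explicit expression that the two extra properties force on \nabla is usually called the Koszul formula. It is standard textbook material rather than something derived in this post, so treat it as a pointer for further reading:

```latex
2\left<\nabla_X Y, Z\right> = X\left<Y,Z\right> + Y\left<X,Z\right> - Z\left<X,Y\right>
  + \left<[X,Y],Z\right> - \left<[X,Z],Y\right> - \left<[Y,Z],X\right>
```

The right-hand side involves only the inner product and the Lie bracket, so it determines \nabla_X Y completely. That is where the uniqueness comes from.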