1  Introduction

Mechanics is the study of motion, and motion is described using the spatial positions of a physical system as a function of time. So, the best place to start is by carefully defining how we will talk about space and time.

One warning Taylor (2005) gives that I’ll amplify: if my choices of notation are not what you’re used to, that’s a good thing! There are many conventions and notation choices out there; to read scientific books, papers, etc. more broadly, you need to get comfortable with changes to notation. In fact, there are some ways in which my notation will deviate from Taylor’s notation - I will be extra careful to call these differences out, however. I will also comment on popular alternatives in notation sometimes.

1.0.1 Units and dimensions

The foundation of science is reproducibility, which means if you and I do the same experiment, we will find the same results. But defining what “the same results” means requires us to agree on some ground rules. One very important detail is choice of units (look up the sad story of the Mars Climate Orbiter!), but you should be familiar with the basics of scientific units by now, so we’ll just agree to work in SI and move on.

Actually, one piece of terminology first: we will work in SI units, but a slightly more general concept than units is the idea of dimensions. A dimension is a basic physical property of some system. Meters, micrometers, feet, and light-years are all different units, but they all measure the same dimension, which is length. The important physical dimensions for mechanics are:

  • Length (SI base unit: meters, m), [L] for short
  • Time (SI base unit: seconds, s), [T] for short
  • Mass (SI base unit: kilogram, kg), [M] for short

and in fact, that’s it: we can derive everything else. For example, energy (SI unit: joules, J, which is a derived unit rather than a base unit) has dimensions of [M] \cdot [L]^2 / [T]^2, and correspondingly the joule is equal to \textrm{J} = \textrm{kg} \cdot \textrm{m}^2 / \textrm{s}^2. As long as we only work in SI base units, we can use dimensions and units more or less interchangeably (and I will mix the terms together), but it’s worth remembering that there is a difference.

(Note: there are some other concepts that aren’t simply derived from this set of three dimensions, like charge and temperature. But length, time, and mass are all we will need for mechanics!)

The existence of dimensions gives a really important distinction between physics problems and pure math problems. The simple requirement of matching dimensions can greatly restrict the space of allowed solutions, and sometimes lets you guess the form of the right answer without a full solution (this is called dimensional analysis.) Even if you solve a problem in full, checking dimensions is a nice filter to easily verify your results (this will save you hours of confusion and many mistakes over the course of your physics career!)
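As a small illustration of checking dimensions mechanically, here is a minimal Python sketch. It assumes we represent a dimension as a tuple of exponents (mass, length, time); the names `dim_mul`, `dim_pow` and `dim_div` are made up for this example and are not from any library.

```python
# Dimensions as exponent tuples (mass, length, time).
# All names here are illustrative, not part of any standard library.
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)

def dim_mul(a, b):
    """Multiplying two quantities adds their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))

def dim_pow(a, n):
    """Raising a quantity to the power n multiplies its exponents by n."""
    return tuple(x * n for x in a)

def dim_div(a, b):
    """Dividing two quantities subtracts their dimension exponents."""
    return dim_mul(a, dim_pow(b, -1))

VELOCITY = dim_div(LENGTH, TIME)        # [L]/[T]
ACCELERATION = dim_div(VELOCITY, TIME)  # [L]/[T]^2
ENERGY = dim_mul(MASS, dim_pow(VELOCITY, 2))  # [M][L]^2/[T]^2

# Kinetic energy (1/2) m v^2: the factor 1/2 is dimensionless.
kinetic = dim_mul(MASS, dim_pow(VELOCITY, 2))
# Gravitational potential energy m g h.
potential = dim_mul(dim_mul(MASS, ACCELERATION), LENGTH)

print(ENERGY)                # (1, 2, -2)
print(kinetic == potential)  # True
```

Both expressions come out with the dimensions of energy, which is the kind of quick check that catches many algebra mistakes.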

Exercise: Tutorial 1A

Here, you should complete Tutorial 1A on “Math and Modeling.” (Tutorials are not included with these lecture notes; if you’re in the class, you will find them on Canvas.)

1.1 Vectors and coordinate systems

Having settled on units, we next have to agree on a coordinate system to describe where things are in our experiments. We live in three dimensions, so we need three coordinates - three numbers - to uniquely describe a given point in space. (You can take this as a definition of what “three dimensions” means, in fact.) We also need to agree on an origin, O, from which our coordinates are measured.

All of the below should be review at least partly, but it will be a good opportunity to refresh your memory and let me set up some math notation, for which I’ll generally try to follow Taylor. In addition to the coordinate definitions themselves, we’ll also be considering how vectors change as we change coordinates. (Vectors are essential to most of the physics we’ll study in this class!)

We also have to agree on our time coordinates, but since there’s only one dimension of time, that’s easy: if my time axis is t and yours is t', the only possible difference is that we might disagree on the origin, i.e. my t=0 s might be your t'=2 s. Unless explicitly said otherwise, I’ll assume that we always have a single common time axis t with common origin, and just worry about the other coordinates.

1.1.1 Rectangular coordinates

Let’s start with the most familiar coordinate system, called rectangular or sometimes rectilinear or Cartesian coordinates. These are the coordinates (x,y,z) that describe distances along a set of three perpendicular axes. Any choice of three axes will do, as long as they’re all mutually perpendicular!

At this point, it’s good to introduce vector notation, which will be very helpful in thinking about relating different coordinate systems. We define three unit vectors \hat{x}, \hat{y}, \hat{z} which point along the corresponding axes. Then we can write any other vector in terms of the unit vectors. For example, the position vector \vec{r} describes the location of an object relative to the origin:

\vec{r} = x \hat{x} + y \hat{y} + z \hat{z} \tag{1.1}

Common alternative names for the unit vectors are \hat{i}, \hat{j}, \hat{k} and \hat{e}_1, \hat{e}_2, \hat{e}_3; the latter generic-looking set are sometimes used to describe other coordinate systems, so beware!

(Note on notation: I will use arrows to denote vectors; Taylor uses bold-face, so he would write the vector above as \mathbf{r}. Bold-face looks clean but arrow notation is much more practical when writing by hand!)

As mentioned above, keeping track of units is very important, so when we introduce new things I’ll try to mention their dimensions as well. Since \vec{r} has units of distance, and so do the individual lengths x,y,z, we notice that the unit vectors themselves have no units - they just point in a direction.

When we want to refer to components of a vector by name, we’ll use a subscript equal to the corresponding unit vector: for example, if \vec{v} is a velocity, then v_x is the component of the velocity in the x-direction (which, unlike a speed, can be negative). For the position vector, we have r_x = x.

1.1.2 The dot product

Vector information about an object and its motion is great, but often we want to know things like: “what is the speed of my object?” or “how far are these two masses from each other?” Speed and distance are examples of scalar information: they are quantities that don’t care about direction, which means they definitely aren’t vectors!

One useful way to get scalar information from vector information is the dot product, which multiplies two vectors together component by component and gives back a number: \vec{a} \cdot \vec{b} = a_x b_x + a_y b_y + a_z b_z. \tag{1.2}

We can immediately use this to define useful things. For example, the dot product of a vector with itself gives us the square of its length, just by the Pythagorean theorem: |\vec{r}| = \sqrt{ x^2 + y^2 + z^2} = \sqrt{\vec{r} \cdot \vec{r}} \tag{1.3} where paired vertical lines |...| give the absolute value for a regular number, but the length for a vector. They’re sort of the same thing, because a vector’s length can’t be negative, and it’s easy to show that |-\vec{r}| = |\vec{r}|. When it isn’t ambiguous, I’ll usually write this simply as r = |\vec{r}|.

In general, if you go look up some trigonometry formulas, it’s not too hard to show that \vec{a} \cdot \vec{b} = ab \cos \theta, \tag{1.4} where \theta is the angle between the two vectors. I won’t derive this formula, but I will apply one other concept that I’ll emphasize over and over, which is checking limits and special cases (another technique to save you from hours of frustration!) First special case: if \vec{b} = \vec{a}, then the angle \theta is obviously zero, and we recover the formula \vec{a} \cdot \vec{a} = |\vec{a}|^2 from above. Our check is successful!

Another interesting limit is when \theta = 90^\circ: the formula tells us that the dot product should be ab \cos 90^\circ = 0. If we pick our coordinate axes so that \vec{a} = a \hat{x} and \vec{b} = b \hat{y}, for example, then it’s easy to confirm this from the definition. So this check passes too.
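The two special cases above are easy to try numerically. The following is a minimal Python sketch using plain tuples for vectors; the helper names `dot`, `length` and `angle_between` are my own choices for illustration.

```python
import math

def dot(a, b):
    """Dot product: multiply component by component and add up."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length of a vector, from the dot product with itself."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle theta from a . b = a b cos(theta); assumes nonzero vectors."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

a = (3.0, 0.0, 0.0)   # along x
b = (0.0, 2.0, 0.0)   # along y
c = (1.0, 1.0, 0.0)   # halfway between x and y

print(dot(a, b))                          # 0.0, perpendicular vectors
print(dot(c, c), length(c) ** 2)          # both approximately 2
print(math.degrees(angle_between(a, c)))  # approximately 45
```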

A natural question you might ask is: what if I’ve already picked my coordinate axes, and my vectors \vec{a} and \vec{b} aren’t along the coordinate axes? Why does the argument above still work? A really important point to remember is that vectors exist independent of our choice of coordinates. Mathematically, this is true by definition: obviously in physics, it had better be true, because a block sliding down a ramp only has a single velocity vector even if you and I measure it differently.

Since \vec{a} \cdot \vec{b} only depends on the vectors and not on any components, it is coordinate-independent. This is why I was allowed to pick my coordinates above to work out what happens to the dot product when \theta = 90^\circ.

Scalars, vectors, and coordinates

A scalar quantity, like the distance between two objects, is independent of coordinate system. A vector quantity, like the velocity of an object, also exists independent of coordinate choices! However, the components of a vector will change when we change coordinates.
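To see this concretely, here is a short Python sketch that rotates the coordinate axes about the z-axis and compares components and dot products before and after. The function `rotate_z` and the particular vectors and angle are arbitrary choices made for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def rotate_z(v, alpha):
    """Components of the same vector v in axes rotated by alpha about z."""
    x, y, z = v
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * x + s * y, -s * x + c * y, z)

a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 1.0)
alpha = 0.7  # rotation angle in radians, chosen arbitrarily

a_rot = rotate_z(a, alpha)
b_rot = rotate_z(b, alpha)

print(a, a_rot)                      # the components change...
print(dot(a, b), dot(a_rot, b_rot))  # ...but the dot product does not
```

The two dot products agree up to floating-point rounding, even though every x and y component is different in the rotated axes.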

We emphasized that unit vectors like \hat{x} just point in a direction: they carry no units and their length doesn’t change. Using the dot product, we can define a unit vector pointing in the direction of any vector by dividing its length out: \hat{a} = \frac{1}{\sqrt{(\vec{a} \cdot \vec{a})}} \vec{a}. \tag{1.5} or using our informal notation for the length and inverting, \vec{a} = a \hat{a}. As another example, we can write the position vector as \vec{r} = r \hat{r}.

One more useful feature of the dot product is that we can use it to project out components. For example, the x component of an arbitrary vector \vec{v} is given by: v_x = \vec{v} \cdot \hat{x} and similarly for v_y and v_z. This might seem kind of trivial if we already know what \vec{v} looks like in rectangular coordinates - but this simple observation is often useful if we’re changing coordinate systems.

The other important vector product to know is the cross product, \vec{a} \times \vec{b}. However, it will be a while until we actually have a use for it this semester, so I’ll defer talking about it until later on.

1.1.3 Cylindrical coordinates

Let’s do cylindrical coordinates next, because they only swap out two out of three coordinates from Cartesian coordinates: \hat{z} is kept the same. Ignoring the z-direction, we want to swap out the other two coordinates (x,y) for 2-d polar coordinates:

Warning: Name of the cylindrical radius

Taylor does something I think is confusing, which is to use r instead of \rho if he’s in two dimensions. I’ll always use \rho for the polar radius, keeping r for the three-dimensional position vector, i.e. distance to the origin.

(The symbol \rho we use for the distance to the origin in the x-y plane, or equivalently the distance to the z-axis, is the greek letter “rho”. Greek letters are commonly used for variables in math and physics; a complete list can be found on Wikipedia.) As we can read off the sketch, the relationship between the coordinates is x = \rho \cos \phi \\ y = \rho \sin \phi or going backwards, \rho = \sqrt{x^2 + y^2} \\ \tan \phi = \frac{y}{x}.

Warning: Inverse tangent and polar coordinates

A word of warning about this last formula: you might be tempted to just use an inverse function to “simplify” it and write \phi = \tan^{-1} (y/x). The problem with this will be obvious if we plot the \tan^{-1} function:

The range of \tan^{-1}, i.e. its set of possible output values, is (-\pi/2, \pi/2). But this only covers half of the plane! The issue is that if we reflect a point from (x,y) \rightarrow (-x,-y), the value of the ratio y/x doesn’t change: in terms of angles, \tan (\phi + \pi) = \tan \phi. So if you use the ratio y/x to find what the coordinate \phi is, double-check what quadrant your point is supposed to be in!

If you’re doing this numerically, it’s not hard to write a computer program that will use \tan^{-1} and then check the signs and add \pi if needed. However, the program needs to know both x and y separately. In Mathematica, if you give two numbers, i.e. ArcTan[x,y], it gives you exactly this corrected result for \phi. In other programming languages, the ‘quadrant-aware’ version of the function is usually called atan2 (arctan2 in NumPy); note that it typically takes its arguments in the opposite order, atan2(y, x).
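Here is a quick Python sketch of the difference, using `math.atan` and `math.atan2` from the standard library. The wrapper names `polar_angle_naive` and `polar_angle` are just labels for this example. Note that `math.atan2` takes y first, and returns an angle in the range (-\pi, \pi].

```python
import math

def polar_angle_naive(x, y):
    """Inverse tangent of the ratio y/x: loses the quadrant (and fails at x = 0)."""
    return math.atan(y / x)

def polar_angle(x, y):
    """Quadrant-aware angle; note the argument order atan2(y, x)."""
    return math.atan2(y, x)

for x, y in [(1.0, 1.0), (-1.0, -1.0), (-1.0, 1.0)]:
    naive = math.degrees(polar_angle_naive(x, y))
    aware = math.degrees(polar_angle(x, y))
    print((x, y), naive, aware)
# (1, 1):   both versions give about 45 degrees
# (-1, -1): naive gives about 45, atan2 gives about -135
# (-1, 1):  naive gives about -45, atan2 gives about 135
```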

If we want to do integrals in cylindrical coordinates, we need the volume element:

dV = \rho\ d\rho\ d\phi\ dz. \tag{1.6}

A tip to help remember this formula is to think about checking units. The units of volume are [L]^3, but the angle \phi is dimensionless, so if you just write out d\rho d\phi dz you should notice that you’re missing a unit of length - which comes from the extra factor of \rho.
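As a sanity check on the volume element, we can add up little pieces \rho\ d\rho\ d\phi\ dz numerically and compare with the familiar volume of a cylinder, \pi R^2 h. This is only a rough sketch using a midpoint Riemann sum; the function name `cylinder_volume` and the grid size are arbitrary choices.

```python
import math

def cylinder_volume(R, h, n=40):
    """Midpoint Riemann sum of dV = rho d_rho d_phi d_z over a cylinder."""
    d_rho = R / n
    d_phi = 2 * math.pi / n
    d_z = h / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho      # midpoint of the i-th radial slice
        for _j in range(n):          # integrand does not depend on phi...
            for _k in range(n):      # ...or on z, but we sum anyway
                total += rho * d_rho * d_phi * d_z
    return total

R, h = 2.0, 3.0
print(cylinder_volume(R, h))  # close to pi R^2 h
print(math.pi * R**2 * h)
```

If you drop the factor of `rho` inside the sum, the result is 2\pi R h instead, which has the wrong units for a volume.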

One important note about volume elements, since I like to emphasize both geometric and algebraic solutions. We know that the Cartesian volume element is dV = dx dy dz, so maybe another way to derive this formula is to just use the coordinate change formulas above. If you try this, you’ll find that it doesn’t work! The problem is that when we’re computing a volume element, we have to account for the directions of the unit vectors; in Cartesian coordinates the element is just a cube, but in other coordinate systems we get a different shape. The correct general formula is actually the triple product dV = (\vec{dx} \times \vec{dy}) \cdot \vec{dz}. Since all of the axes are perpendicular in Cartesian coordinates, this just becomes simply dx dy dz. But in the current example, the difference matters; if you keep track of things properly, using this formula as the starting point should yield the correct cylindrical dV.

On to the unit vectors. Remember that the idea of a coordinate unit vector is that it points in the direction in which that coordinate increases; if we pick a point and sketch the polar coordinates on, it should be easy to see geometrically which way \hat{\rho} and \hat{\phi} point. But now, if we choose two different points and identify \hat{\rho} and \hat{\phi}, it’s easy to see that we have a new complication: the directions \hat{\rho} and \hat{\phi} depend on what point we are asking about!

We can do a bit of trigonometry on any given point and easily show the general relationship \hat{\rho} = \cos \phi \hat{x} + \sin \phi \hat{y} \\ \hat{\phi} = -\sin \phi \hat{x} + \cos \phi \hat{y} The way in which they depend on our coordinates isn’t so bad: their directions just change with the angle \phi.

Instead of going through the geometry, I’ll show you an algebraic way to derive the unit vectors instead. When we take the derivative of a vector \vec{v} with respect to some other variable s, the new vector d\vec{v}/ds gives us both the rate and the direction of change with respect to s. So when we say “the \rho direction”, what we mean is the direction of the vector d\vec{r}/d\rho. Recalling that we can rescale any vector to a unit vector by dividing its length out, we have the equations: \hat{\rho} = \frac{d\vec{r}/d\rho}{|d\vec{r}/d\rho|} \\ \hat{\phi} = \frac{d\vec{r}/d\phi}{|d\vec{r}/d\phi|}

We start by writing \vec{r} out in Cartesian components:

\vec{r} = x\hat{x} + y\hat{y} + z\hat{z}

Next, we substitute in the formulas for x and y in terms of cylindrical coordinates: \vec{r} = \rho \cos \phi \hat{x} + \rho \sin \phi \hat{y} + z\hat{z}

Now we can take derivatives, remembering that the derivative of a vector is still a vector: \frac{d\vec{r}}{d\rho} = \cos \phi \hat{x} + \sin \phi \hat{y} and \frac{d\vec{r}}{d\phi} = -\rho \sin \phi \hat{x} + \rho \cos \phi \hat{y}

These are the correct directions for \hat{\rho} and \hat{\phi} already; to make them unit vectors, we just have to normalize. It’s easy to show that d\vec{r}/d\rho is already a unit vector, while the length of the other vector is \left| \frac{d\vec{r}}{d\phi} \right| = \sqrt{\rho^2 (\sin^2 \phi + \cos^2\phi)} = \rho. So finally, we have \hat{\rho} = \frac{d\vec{r}/d\rho}{|d\vec{r}/d\rho|} = \cos \phi \hat{x} + \sin \phi \hat{y} \tag{1.7} \hat{\phi} = \frac{d\vec{r}/d\phi}{|d\vec{r}/d\phi|} = -\sin \phi \hat{x} + \cos \phi \hat{y} \tag{1.8}

matching the result above.
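The same recipe (differentiate \vec{r}, then normalize) can be tried numerically. The sketch below approximates the derivatives with central finite differences and compares against Equations 1.7 and 1.8; the function names and the step size `eps` are assumptions made for this example.

```python
import math

def position(rho, phi, z):
    """Cartesian components of r for given cylindrical coordinates."""
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def unit_vectors_numeric(rho, phi, z, eps=1e-6):
    """rho-hat and phi-hat from central differences of the position vector."""
    dr_drho = tuple((p - m) / (2 * eps) for p, m in
                    zip(position(rho + eps, phi, z), position(rho - eps, phi, z)))
    dr_dphi = tuple((p - m) / (2 * eps) for p, m in
                    zip(position(rho, phi + eps, z), position(rho, phi - eps, z)))
    return normalize(dr_drho), normalize(dr_dphi)

def unit_vectors_exact(phi):
    """Equations 1.7 and 1.8."""
    rho_hat = (math.cos(phi), math.sin(phi), 0.0)
    phi_hat = (-math.sin(phi), math.cos(phi), 0.0)
    return rho_hat, phi_hat

print(unit_vectors_numeric(2.0, 1.1, 0.5))
print(unit_vectors_exact(1.1))  # should agree to many decimal places
```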

Let’s get a bit of practice with using these cylindrical unit vectors.

Exercise: Position vectors in cylindrical coordinates

For the point (x,y) = (1,1), what is the position vector \vec{r} in terms of cylindrical unit vectors?

Answer:

The simplest way to understand this is by drawing the unit vectors on the sketch. You can do this just by geometric reasoning (the directions that \rho and \phi increase), or use our formulas above to find \hat{\rho} = \frac{1}{\sqrt{2}} (\hat{x} + \hat{y}) \hat{\phi} = \frac{1}{\sqrt{2}} (-\hat{x} + \hat{y}) with either method leading to the following diagram:

We see that \vec{r} is pointing directly along the \hat{\rho} direction, so it has no \hat{\phi} component. The length of the vector is \sqrt{2}, so we have simply \vec{r} = \sqrt{2} \hat{\rho}.

Notice that if we pick any point in the plane, the vector \vec{r} never has any \hat{\phi} component! This means that, for example, all of the points on the circle of radius \sqrt{2} from the origin have position vectors that are written the same way, \vec{r} = \sqrt{2} \hat{\rho}. The vectors themselves are different, of course: the angular dependence is hidden within \hat{\rho} itself.

As the exercise above strongly hints, it’s easy to show that the general expression for the position vector in cylindrical coordinates is \vec{r} = \rho \hat{\rho} + z \hat{z} \tag{1.9}

with no \hat{\phi} component! If you are ever tempted to add a \phi \hat{\phi} term, you should be stopped by noticing it has the wrong units; \vec{r} should have units of distance, but \phi is unitless and so is \hat{\phi}, so something is wrong with \phi \hat{\phi}.

It’s also very important to point out that in cylindrical coordinates, the decomposition of a vector depends on what point it starts at. If we took the same vector \hat{x} + \hat{y} and started it at the point (1,-1), now it is pointing purely in the \hat{\phi} direction! We’ll mostly be dealing with vectors from the origin like \vec{r}, in which case this won’t be an issue, but I wanted to point it out anyway so you’re aware.
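Here is a short Python sketch of this point: it projects the same vector \hat{x} + \hat{y} onto the cylindrical unit vectors at two different base points, using the dot product. The function name `cylindrical_components` is hypothetical, and the sketch works in the x-y plane only.

```python
import math

def cylindrical_components(v, base_point):
    """(v_rho, v_phi) of a 2D vector v, using the unit vectors at base_point."""
    x, y = base_point
    phi = math.atan2(y, x)                     # angle of the base point
    rho_hat = (math.cos(phi), math.sin(phi))
    phi_hat = (-math.sin(phi), math.cos(phi))
    v_rho = v[0] * rho_hat[0] + v[1] * rho_hat[1]
    v_phi = v[0] * phi_hat[0] + v[1] * phi_hat[1]
    return v_rho, v_phi

v = (1.0, 1.0)  # the vector x-hat + y-hat

print(cylindrical_components(v, (1.0, 1.0)))   # approximately (sqrt(2), 0)
print(cylindrical_components(v, (1.0, -1.0)))  # approximately (0, sqrt(2))
```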

1.1.4 Time derivatives and unit vectors

As we’ve just emphasized, a key conceptual difference that appears in cylindrical coordinates vs. rectangular coordinates is that the unit vectors depend on what point we are looking at. This is very easy to see just by drawing two different vectors \vec{r}_1 and \vec{r}_2, and noting the directions of the unit vectors:

Looking forward to the physics a bit, in the context of mechanics \vec{r}_2 might represent a time-evolved version of \vec{r}_1 for some physical system, i.e. \vec{r}_1 = \vec{r}(t_1) and \vec{r}_2 = \vec{r}(t_2). If so, then it’s obvious that the unit vectors \hat{\rho} and \hat{\phi} themselves must be time-dependent.

Here I introduce some new notation, since we’ll be taking lots and lots of time derivatives: a dot over a quantity indicates acting on it with d/dt, \dot{q} \equiv \frac{dq}{dt}.

(Bonus notation: triple equals \equiv is the equivalence symbol and means “is defined to be”.) This applies both to scalars and vectors, so for example we can write \vec{v} = \dot{\vec{r}}. Finally, multiple dots can be used for multiple derivatives, so \ddot{q} = \frac{d^2 q}{dt^2}.

If you look at Taylor, he’ll give you a nice geometric argument for the time derivatives, with the result:

\dot{\vec{r}} = \dot{\rho} \hat{\rho} + \rho \dot{\phi} \hat{\phi}. \tag{1.10}

Instead of repeating his argument, let’s check the results using some limits. If \phi is constant, then \dot{\phi} = 0 and the velocity vector just reduces to one-dimensional speed in the \hat{\rho} direction, which makes sense. On the other hand, if \dot{\rho} = 0, then we are dealing with perfectly circular motion; in this case, we can recognize |\dot{\vec{r}}| = \rho \dot{\phi} as the standard expression relating tangential speed to angular speed (it might look more familiar if we write it instead as v = R \omega.)

We can also use algebra to find the same result. Recalling Equation 1.7 and taking the time derivative explicitly, we find that \frac{d\hat{\rho}}{dt} = -\dot{\phi} \sin \phi \hat{x} + \dot{\phi} \cos \phi \hat{y} \\ = \dot{\phi} \hat{\phi}, recognizing the form of the other unit vector from Equation 1.8 and substituting it back in. Then from Equation 1.9, we can see that (ignoring the \hat{z} component) the first derivative of the position vector is, by the product rule, \dot{\vec{r}} = \dot{\rho} \hat{\rho} + \rho \frac{d\hat{\rho}}{dt} and plugging in gives us back the result Equation 1.10.

Exercise: Tutorial 1B

Here, you should complete Tutorial 1B on “Time derivatives of the position vector”. (Tutorials are not included with these lecture notes; if you’re in the class, you will find them on Canvas.)

As explored in the tutorial, the other unit-vector derivative d\hat{\phi} / dt is crucial for finding the second derivative of the position vector, \ddot{\vec{r}} (also known as the acceleration - this will be very important when we start doing physics with these math results!) Let’s finish the derivation here, going slowly: \ddot{\vec{r}} = \frac{d^2 \vec{r}}{dt^2} = \frac{d}{dt} \left( \dot{\rho} \hat{\rho} + \rho \dot{\phi} \hat{\phi} \right) \\ = \ddot{\rho} \hat{\rho} + \dot{\rho} \frac{d\hat{\rho}}{dt} + (\dot{\rho} \dot{\phi} + \rho \ddot{\phi}) \hat{\phi} + \rho \dot{\phi} \frac{d\hat{\phi}}{dt} \\ = \ddot{\rho} \hat{\rho} + \dot{\rho} \dot{\phi} \hat{\phi} + (\dot{\rho} \dot{\phi} + \rho \ddot{\phi}) \hat{\phi} - \rho \dot{\phi}^2 \hat{\rho}

\Rightarrow \ddot{\vec{r}} = (\ddot{\rho} - \rho \dot{\phi}^2) \hat{\rho} + (\rho \ddot{\phi} + 2 \dot{\rho} \dot{\phi}) \hat{\phi}. \tag{1.11}
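Equations 1.10 and 1.11 can be checked numerically against a direct calculation in Cartesian components. The sketch below assumes a made-up trajectory, \rho(t) = 1 + t/2 and \phi(t) = t^2, and compares finite-difference derivatives of (x, y) with the cylindrical formulas. All function names and step sizes here are illustrative choices.

```python
import math

def rho(t):
    return 1.0 + 0.5 * t   # assumed trajectory: rho_dot = 0.5, rho_ddot = 0

def phi(t):
    return t ** 2          # assumed trajectory: phi_dot = 2t, phi_ddot = 2

def cartesian(t):
    return (rho(t) * math.cos(phi(t)), rho(t) * math.sin(phi(t)))

def numeric_velocity(t, h=1e-5):
    p, m = cartesian(t + h), cartesian(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, m))

def numeric_acceleration(t, h=1e-4):
    p, c, m = cartesian(t + h), cartesian(t), cartesian(t - h)
    return tuple((a - 2 * b + d) / h ** 2 for a, b, d in zip(p, c, m))

def unit_vectors(t):
    rho_hat = (math.cos(phi(t)), math.sin(phi(t)))
    phi_hat = (-math.sin(phi(t)), math.cos(phi(t)))
    return rho_hat, phi_hat

def formula_velocity(t):
    """Equation 1.10, converted back to Cartesian components."""
    rho_dot, phi_dot = 0.5, 2 * t
    rh, ph = unit_vectors(t)
    return tuple(rho_dot * r + rho(t) * phi_dot * p for r, p in zip(rh, ph))

def formula_acceleration(t):
    """Equation 1.11, converted back to Cartesian components."""
    rho_dot, rho_ddot = 0.5, 0.0
    phi_dot, phi_ddot = 2 * t, 2.0
    a_rho = rho_ddot - rho(t) * phi_dot ** 2
    a_phi = rho(t) * phi_ddot + 2 * rho_dot * phi_dot
    rh, ph = unit_vectors(t)
    return tuple(a_rho * r + a_phi * p for r, p in zip(rh, ph))

t = 0.8
print(numeric_velocity(t), formula_velocity(t))
print(numeric_acceleration(t), formula_acceleration(t))
```

The pairs should agree to several decimal places, limited only by the finite-difference approximation.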

Right now this isn’t a very illuminating result; we’ll come back to it below in the context of Newton’s laws and motion.

1.1.5 Spherical coordinates

In spherical coordinates, we adopt r itself as one of our coordinates, in combination with two angles that let us rotate around to any point in space. We keep the angle \phi in the x-y plane, and add the angle \theta which is taken from the positive \hat{z}-axis:

(Confusingly, \theta is usually called the “polar angle”, thinking of the z-axis as the “pole”. In this case, \phi is called the “azimuthal angle”. See Taylor (2005), p.135.)

Warning: spherical angles in math vs. physics

Be aware that these are the physics conventions for what to call these angles. Mathematicians tend to prefer the opposite choice, using \theta as the azimuthal angle and \phi the polar. On top of the angle name confusion, math books (and some physics texts) will also use \rho for the spherical distance and r the polar distance. Be very careful when looking at other resources!

The relationship between spherical and cylindrical coordinates is actually relatively simple to work out, as we can see by looking at a cross-section containing both \vec{r} and \hat{z}:

It’s easy to see from the sketch that z = r \cos \theta \\ \rho = r \sin \theta We can then take this and plug in one more step to get the formulas for rectangular coordinates: x = r \sin \theta \cos \phi \\ y = r \sin \theta \sin \phi \\ z = r \cos \theta If you forget exactly where the sine and cosines go in this expression, I find it’s easiest to think about converting from cylindrical coordinates. I’ll skip the derivation of the volume element since it’s more involved, but the result is important for doing integrals: dV = r^2 \sin \theta\ dr\ d\theta\ d\phi. \tag{1.12}
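For reference, here is a minimal Python sketch of the conversion formulas, going through the cylindrical radius \rho = r \sin \theta as an intermediate step. The function names are hypothetical, the physics convention for the angles is assumed, and the inverse conversion assumes r > 0.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Physics convention: theta from the +z axis, phi in the x-y plane."""
    rho = r * math.sin(theta)   # cylindrical radius
    return (rho * math.cos(phi), rho * math.sin(phi), r * math.cos(theta))

def cartesian_to_spherical(x, y, z):
    """Inverse conversion; assumes the point is not the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)    # in [0, pi]
    phi = math.atan2(y, x)      # quadrant-aware, in (-pi, pi]
    return (r, theta, phi)

point = spherical_to_cartesian(2.0, 0.6, -2.0)
print(point)
print(cartesian_to_spherical(*point))  # approximately (2.0, 0.6, -2.0)
```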

We could find results for the unit vectors in spherical coordinates \hat{r}, \hat{\theta}, \hat{\phi} in terms of the Cartesian unit vectors, but we’re not going to be doing vector calculus in these coordinates for a while, so I’ll put this off for now - it’s a bit messy compared to cylindrical. I will simply note that basically by definition, the decomposition of the position vector \vec{r} is extremely simple in spherical coordinates: \vec{r} = r \hat{r}. \tag{1.13}

1.2 Motion and Newton’s laws

Now that we’ve set up our coordinate system basics, let’s turn back to physics, starting with Newton’s laws of motion. Taylor discusses some fundamentals about defining mass and force, but I’ll let you read that on your own. Here, let’s just remind ourselves what the laws are:

Newton’s Laws of Motion
  1. In the absence of forces, an object moves with constant velocity.
  2. An object of mass m subject to a net force will accelerate according to the relation \vec{F} = m \vec{a}.
  3. If object 1 exerts a force on object 2, there is an equal and opposite force exerted on object 1 by object 2,

\vec{F}_{12} = -\vec{F}_{21}.

In our modern understanding, the first law is more or less redundant, because the second law immediately tells us that if \vec{F} = 0, then \vec{a} = 0; since \vec{a} = d\vec{v}/dt, no acceleration means constant velocity. (This isn’t quite true, because you can think of the first law as something to check to make sure you’re in an inertial frame where the second law will hold; see the discussion in Taylor, chapter 1.) This semester, we will do everything in inertial frames of reference - which just means we have to avoid situations in which our coordinate systems are tied to accelerating objects. (For a more detailed discussion of this topic, see the aside on reference frames below.)

We mentioned vector time derivatives before, but let’s talk a bit more about them, and define some terms. Taking the time derivative of the position vector \vec{r} gives us the velocity vector, \vec{v} = \frac{d\vec{r}}{dt}, and one more derivative gives us the acceleration vector, \vec{a} = \frac{d\vec{v}}{dt} = \frac{d^2 \vec{r}}{dt^2}. Thus, using dot notation to make things more compact, we can write Newton’s second law as \vec{F} = m\vec{a} = m\ddot{\vec{r}}. \tag{1.14}

1.2.1 Newton’s laws in rectangular coordinates

Let’s think about how this breaks into components, starting with rectangular coordinates. The velocity vector then becomes: \frac{d}{dt} \vec{r} = \frac{d}{dt} \left( x \hat{x} + y \hat{y} + z \hat{z} \right) \\ = \dot{x} \hat{x} + \dot{y} \hat{y} + \dot{z} \hat{z}. \tag{1.15} Taking another time derivative for acceleration will just give us double-dots instead of single-dots; again, the unit vectors don’t depend on time. So we have for Newton’s second law: F_x \hat{x} + F_y \hat{y} + F_z \hat{z} = m \left(\ddot{x} \hat{x} + \ddot{y} \hat{y} + \ddot{z} \hat{z} \right)

This is, in fact, just three separate copies of the same equation, one for each direction: F_x = m\ddot{x} \\ F_y = m\ddot{y} \\ F_z = m\ddot{z}

Our vector equation has split apart into a system of differential equations. These equations are collectively known as the equations of motion, because if we solve them, we know the answer for \vec{r}(t) - we know what the motion of the system will look like over time.

Equations of motion

The equations of motion are the system of differential equations that we solve for any given problem in order to find \vec{r}(t). (If we know \vec{r}(t), then we know how our system’s position evolves with time - this is the motion!) This semester, the equations of motion will always come from Newton’s second law.

Before we even move on from rectangular coordinates, it’s important to note that we have a great deal of freedom in choice of coordinate system. For example, consider the motion of a block sliding down a ramp with friction:

Taylor does this as example 1.1, in fact; if you look at his solution, he takes his x-axis to be parallel to the ramp and y-axis perpendicular to it, as pictured in the top right (the black coordinate axes.) But we could also work in the green coordinate system, where y' is in the direction of gravity instead. The choice is arbitrary - the physics is the same! Of course, the algebra might be easier if we pick our coordinates well, and in fact Taylor’s choice gives the simplest math if we follow the solution through.

(Just to make this point once again: notice that I have drawn two coordinate systems, but only one free-body diagram. The force vectors are the same in any coordinates, even if how we break them into components is different!)

You should already be familiar with solving simple problems using Newton’s laws and free-body diagrams! If you need a refresher, the bonus example below contains the full solution in the primed coordinates.

We’ll treat this as a two-dimensional problem, as indicated, which just means that we ignore the z coordinate. (This is valid because of Newton’s first law: if there are no z-direction forces, then there is no interesting z-direction motion at all.)

To proceed, we’ll need to split all the forces into components. Let’s draw some angles on our free-body diagram:

(If where I put the \theta’s in my diagram isn’t obvious to you, draw more parallel lines in and use right triangles to identify which angles are \theta and which are 90^\circ - \theta.) From the diagram, the forces split into components are, using the often-convenient notation that \vec{F} = (F_x, F_y):

\vec{N} = (N \sin \theta, N \cos \theta) \\ \vec{F}_f = (-F_f \cos \theta, F_f \sin \theta) \\ \vec{F}_g = (0, -mg)

Now, we know that the normal force must balance the components of all other forces acting perpendicular to the surface of the ramp, since the block does not accelerate into or off of the ramp. From the diagram, we can just read off that its magnitude is equal to a component of the gravitational force, N = mg \cos \theta. The second fact we know is that the magnitude of the frictional force is proportional to the normal force, F_f = \mu N = \mu mg \cos \theta. Plugging back in, then, we have the net forces in the x' and y' directions: \vec{F}_{\textrm{net}} = \vec{N} + \vec{F}_f + \vec{F}_g \\ = \left(N \sin \theta - F_f \cos \theta \right) \hat{x}' + \left(N \cos \theta + F_f \sin \theta - mg \right) \hat{y}' \\ = mg \left( \cos \theta \sin \theta - \mu \cos^2 \theta \right) \hat{x}' + mg \left(\cos^2 \theta + \mu \cos \theta \sin \theta - 1 \right) \hat{y}'

This looks a little messy, but let’s keep going. Now that we have the net force, we can solve for the accelerations using \vec{F}_{\textrm{net}} = m \ddot{\vec{r}}, which gives us two equations looking at each component (and cancelling off the mass): \ddot{x}' = g \cos \theta (\sin \theta - \mu \cos \theta) \\ \ddot{y}' = -g (1 - \cos^2 \theta - \mu \sin \theta \cos \theta) \\ = -g \sin \theta (\sin \theta - \mu \cos \theta) where I’ve used a trig identity to simplify the last line; now the x' and y' equations look very similar.

The good news is that even though these still don’t look so nice, everything on the right-hand side of both equations is constant, so to solve for the motion, all we have to do is integrate twice, for example: \dot{x}'(t) = v_{x',0} + \int_0^{t} dt' \ddot{x}'(t') = v_{x',0} + gt \cos \theta (\sin \theta - \mu \cos \theta) \\ x'(t) = x'_0 + \int_0^t dt' \dot{x}'(t') = x'_0 + v_{x',0} t + \frac{1}{2} gt^2 \cos \theta (\sin \theta - \mu \cos \theta) and similarly, y'(t) = y'_0 + v_{y',0} t -\frac{1}{2} gt^2 \sin \theta (\sin \theta - \mu \cos \theta). (Inside the integrals, t' is just the integration variable, not a primed coordinate.) If we assume starting from rest, then v_{x',0}, v_{y',0} are set to zero; if we also start at the origin, then x'_0 = y'_0 = 0 and we have our final result, shown below.

Solving for the motion in the primed coordinates gives the result (starting at rest from the origin): x'(t) = \frac{1}{2} gt^2 \cos \theta ( \sin \theta - \mu \cos \theta) \tag{1.16} y'(t) = -\frac{1}{2} gt^2 \sin \theta (\sin \theta - \mu \cos \theta). \tag{1.17}

Here \mu is the coefficient of kinetic friction for the moving block.
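As a cross-check on Equations 1.16 and 1.17, we can also step the equations of motion forward in time numerically and compare with the closed-form result. This is a rough Python sketch, assuming g = 9.8 m/s^2 and a block that starts at rest at the origin; the function names and the example values of \theta and \mu are arbitrary.

```python
import math

G = 9.8  # m/s^2, assumed value of the gravitational acceleration

def ramp_acceleration(theta, mu, g=G):
    """Accelerations (x', y') of the sliding block, from the net force."""
    common = g * (math.sin(theta) - mu * math.cos(theta))
    return (common * math.cos(theta), -common * math.sin(theta))

def integrate(theta, mu, t_final, n_steps=1000):
    """Step position and velocity forward from rest at the origin."""
    ax, ay = ramp_acceleration(theta, mu)
    dt = t_final / n_steps
    x = y = vx = vy = 0.0
    for _ in range(n_steps):
        x += vx * dt + 0.5 * ax * dt ** 2
        y += vy * dt + 0.5 * ay * dt ** 2
        vx += ax * dt
        vy += ay * dt
    return x, y

def closed_form(theta, mu, t, g=G):
    """Equations 1.16 and 1.17."""
    x = 0.5 * g * t ** 2 * math.cos(theta) * (math.sin(theta) - mu * math.cos(theta))
    y = -0.5 * g * t ** 2 * math.sin(theta) * (math.sin(theta) - mu * math.cos(theta))
    return x, y

theta, mu, t = math.radians(30.0), 0.2, 2.0
print(integrate(theta, mu, t))
print(closed_form(theta, mu, t))  # should agree with the line above
```

Because the acceleration is constant, this stepping scheme reproduces the closed-form answer up to rounding. Note also that y'/x' = -\tan \theta, so the block moves along the surface of the ramp, as it should.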

Exercise: checking limits

First, convince yourself that the units of the solution above make sense. Then show that if \theta \rightarrow 90^\circ, you get the expected result (freefall, since the ramp is now vertical.)

As a bonus, can you figure out why taking the other limit \theta \rightarrow 0^\circ appears to give nonsensical results? (Hint: think about both static and kinetic friction…)

Answer:

If \theta \rightarrow 90^\circ, then the ramp becomes completely vertical, so the motion should only be in the y' direction, and it should just be freefall. At 90^\circ we have \cos \theta = 0 and \sin \theta = 1, so x'(t) \xrightarrow{\theta = 90^\circ} x_0' + \frac{1}{2} gt^2 (0) (1 - 0\mu) = x_0' \\ y'(t) \xrightarrow{\theta = 90^\circ} y_0' - \frac{1}{2} gt^2 (1) (1 - 0\mu) = y_0' - \frac{1}{2} gt^2 so indeed, we have freefall in the y' direction and no motion in the x' direction.

How about the opposite limit, \theta \rightarrow 0^\circ? Since the motion is entirely caused by gravity, what we expect to find is that as the ramp becomes completely flat, all motion stops. Plugging in again, we find this time that x'(t) \xrightarrow{\theta = 0^\circ} x_0' + \frac{1}{2} gt^2 (1) (0 - 1\mu) = x_0' - \frac{1}{2} \mu gt^2 \\ y'(t) \xrightarrow{\theta = 0^\circ} y_0' - \frac{1}{2} gt^2 (0) (0 - 1\mu) = y_0'

The y'(t) result is fine, but our block will apparently start sliding backwards in the x' direction if the ramp is laid flat! What went wrong with our calculation?! (Think about it yourself, before you continue reading…)

In fact, our calculation above is just fine. The problem that was revealed when \theta \rightarrow 0^\circ has to do with our starting assumptions. In particular, we assumed a single coefficient of friction \mu. Since our block is in motion, this must be the coefficient of kinetic friction, \mu_k. However, we know that there is also a coefficient of static friction \mu_s that must be overcome if the block is starting at rest, F_{f,s} \leq \mu_s mg \cos \theta, which will keep the block from moving (i.e. give zero accelerations) if the other forces are not large enough. We can take the x'-direction net force and solve: F_{\textrm{net}, x'} = N \sin \theta - F_f \cos \theta = 0 \\ mg \cos \theta \sin \theta = F_f \cos \theta \\ mg \sin \theta \leq \mu_s mg \cos \theta \\ \tan \theta \leq \mu_s

So the full picture is: if the angle \theta is small enough that \tan \theta \leq \mu_s, then the block just won’t move at all. Since generally \mu_k \leq \mu_s, we have \tan \theta \geq \mu_k whenever static friction is overcome. The offending term which changed the sign of our x'-direction motion then satisfies (\sin \theta - \mu_k \cos \theta) \geq \sin \theta - \tan \theta \cos \theta = 0, and we don’t have any problems with backwards-moving blocks anymore.
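The static-versus-kinetic logic above can be sketched in a few lines of Python. The function names and the default g = 9.81 m/s^2 are hypothetical choices of mine; the acceleration returned is the one along the ramp, g (\sin \theta - \mu_k \cos \theta), which is what the solution above implies for a sliding block.

```python
import math

def block_slides(theta, mu_s):
    """True if gravity overcomes static friction, i.e. tan(theta) > mu_s."""
    return math.tan(theta) > mu_s

def ramp_acceleration(theta, mu_s, mu_k, g=9.81):
    """Acceleration along the ramp for a block released from rest.

    theta is in radians; g = 9.81 m/s^2 is an assumed default.
    """
    if not block_slides(theta, mu_s):
        return 0.0  # static friction holds the block in place
    return g * (math.sin(theta) - mu_k * math.cos(theta))

# Illustrative values: a flat ramp and a steep ramp.
print(ramp_acceleration(0.0, mu_s=0.5, mu_k=0.4))
print(ramp_acceleration(math.radians(45.0), mu_s=0.5, mu_k=0.4))
```

With the static check in place, the flat-ramp case gives zero acceleration rather than a block sliding backwards.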

Since the physics is the same, we should also be able to show that this answer matches the one Taylor finds in the unprimed coordinates. This is a simple exercise in trigonometry, or dot products, so I’ll leave it to you!

Exercise: changing coordinates

Apply a change of coordinates to the results above for x'(t) and y'(t), and show that you reproduce the answers given in Taylor (1.1) for x(t) and y(t).

Answer:

I’ll show you the answer for x(t) only below; it’s straightforward to follow the same steps for y(t) (and you know what you should find!)

Let’s put our knowledge of vector coordinate systems to use and work out the coordinate change. First, we add the angle \theta to our coordinate diagram:

To see how the coordinate systems are related, let’s think in terms of the unit vectors. Using Equation 1.4, we can easily see that \hat{x} \cdot \hat{x}' = \cos \theta and \hat{x} \cdot \hat{y}' = \cos (\theta + \frac{\pi}{2}) = -\sin \theta using a trig identity. You can convince yourself that the minus sign is correct by inspecting the diagram. Next, we take the position vector \vec{r} and decompose in both coordinate systems: \vec{r} = x \hat{x} + y\hat{y} = x' \hat{x}' + y' \hat{y}'. Observing that x = \hat{x} \cdot \vec{r} and using the dot products we just found on the right-hand side of the above equation, we immediately find that x(t) = x'(t) \cos \theta - y'(t) \sin \theta where I’ve put the time-dependence back to remind us that this relation is always true.

Putting our solutions together, then, we have x(t) = \cos \theta \left(\frac{1}{2} gt^2 \cos \theta (\sin \theta - \mu \cos \theta)\right) - \sin \theta \left( -\frac{1}{2} gt^2 \sin \theta (\sin \theta - \mu \cos \theta) \right) \\ = \frac{1}{2} gt^2 (\sin \theta - \mu \cos \theta) and if we open up the textbook, we’ll find that this matches Taylor’s solution exactly.
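If you’d like to double-check the trigonometry numerically, here is a small Python sketch of the same coordinate change. All of the numbers are illustrative assumptions. The y check also assumes that \hat{y} is \hat{x} rotated by +90 degrees, which I have not derived above.

```python
import math

g, mu, t = 9.81, 0.2, 1.3        # illustrative values (assumed)
theta = math.radians(25.0)

common = 0.5 * g * t**2 * (math.sin(theta) - mu * math.cos(theta))
x_prime = common * math.cos(theta)     # Equation 1.16
y_prime = -common * math.sin(theta)    # Equation 1.17

# x = x-hat . r, using the dot products worked out above.
x_from_primed = x_prime * math.cos(theta) - y_prime * math.sin(theta)
x_direct = common  # (1/2) g t^2 (sin(theta) - mu cos(theta))

# Assumed orientation: y-hat is x-hat rotated by +90 degrees.
y_from_primed = x_prime * math.sin(theta) + y_prime * math.cos(theta)

print(x_from_primed, x_direct, y_from_primed)
```

The first two printed numbers should agree up to rounding, and the third should be zero up to rounding, since the block never leaves the surface of the ramp.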

Note: this is an important concept - but also, hopefully one you have seen before. We won’t be making use of the explicit reference-frame notation introduced below at all this semester, which is why this is an aside and not part of the main lecture notes. If you want to know more about what “inertial frame” really means, read on!

Whenever we do classical mechanics, we have to specify a reference frame. Reference frames are choices of coordinate systems, but because we’re dealing with both time AND space, they have an extra complication as compared to fixed coordinates: two frames can be moving relative to one another. Following Taylor, we’ll use a script \mathcal{S} to denote a reference frame. \mathcal{S} includes:

  • A coordinate system,
  • An origin \mathcal{O}, and finally
  • Some specification of how \mathcal{O} is or isn’t moving, relative to some physical object or other point of reference.

Our chosen coordinates (x', y') for the ramp problem are an example of a reference frame \mathcal{S}'; Taylor’s rotated coordinates (x,y) are a different reference frame \mathcal{S}. Both frames have the same origin, and both are fixed with respect to the ramp, which means there’s also no relative motion between \mathcal{S} and \mathcal{S}'. However, we could come up with a third frame \mathcal{S}'' with coordinates (x'', y''), defined by the following relations: x''(t) = x'(t) - ut \\ y''(t) = y'(t) In other words, the origin of \mathcal{S}'' is moving horizontally to the right with constant speed u.

Since we already solved for the motion in \mathcal{S}', we can just use the coordinate change to find the motion in \mathcal{S}'':

x''(t) = x''_0 + (v_{x',0} - u) t + \frac{1}{2} gt^2 \cos \theta (\sin \theta - \mu \cos \theta) \\ y''(t) = y''_0 + v_{y',0} t -\frac{1}{2} gt^2 \sin \theta (\sin \theta - \mu \cos \theta). Notice that if our block starts at rest with respect to the ramp, it will appear to be moving to the left with initial speed u in our moving coordinates.

Just using a change of frame on our solution is always valid; if we have solved the equations of motion in one reference frame, we know the answer in any other frame. However, we have to be careful if we start in a given reference frame and try to apply Newton’s laws. If you try it in the \mathcal{S}'' frame, the forces will all be exactly the same, and you’ll find the same answer that we got above. However, when you have moving frames you have to be careful - Newton’s laws do not work (without modification) in accelerating frames! Accelerating frames are also called non-inertial frames, because the law of inertia (Newton’s first law) doesn’t hold - and neither does the second law.

We can explicitly see the breakdown of Newton’s laws if we try to use them in a very simple accelerating frame, provided by a moving elevator:

We drop a ball inside an elevator, from initial height y_0. From the point of view of frame \mathcal{S}, fixed with respect to the ground, the elevator is moving straight down with constant acceleration a. As pictured, we can define a second frame of reference \mathcal{S}' which is fixed to the elevator (so the point of view of an experimenter riding the elevator, basically.)

Let’s work out the motion in frame \mathcal{S} first. Neglecting air resistance, the only force acting on the ball is gravity: \vec{F}_g = -mg\hat{y} Using Newton’s second law, \vec{F}_{\textrm{net}} = -mg \hat{y} = m\vec{a} = m(\ddot{x} \hat{x} + \ddot{y} \hat{y}) so nothing happens in the \hat{x}-direction, but vertically we have a constant acceleration, \ddot{y} = -g. If we drop the ball at rest, integrating twice gives us y(t) = y_0 - \frac{1}{2} gt^2. So far, we haven’t mentioned the elevator at all; it enters in when we ask how far the ball is from the floor of the elevator at any given time. From the diagram, we can see right away that \Delta y(t) = y(t) + \frac{1}{2} at^2 = y_0 + \frac{1}{2} (a-g) t^2. In particular, notice that if the elevator is also free falling (a=g), then the ball will appear to be suspended above the floor.
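Here is a minimal Python sketch of the ground-frame result, in case you want to play with different elevator accelerations. The function names are hypothetical. I’m assuming the elevator floor starts at y = 0 at rest, which is what the formula for \Delta y(t) above implies, and g = 9.81 m/s^2.

```python
def ball_height_ground_frame(t, y0, g=9.81):
    """Height of the dropped ball in the ground frame S."""
    return y0 - 0.5 * g * t**2

def floor_height_ground_frame(t, a):
    """Height of the elevator floor in S.

    Assumes the floor starts at y = 0, at rest, and accelerates
    downward at a constant rate a.
    """
    return -0.5 * a * t**2

def ball_above_floor(t, y0, a, g=9.81):
    """Distance from the elevator floor to the ball, Delta y(t)."""
    return ball_height_ground_frame(t, y0, g) - floor_height_ground_frame(t, a)

# Illustrative values: an elevator in free fall versus one at rest.
print(ball_above_floor(0.5, y0=1.5, a=9.81))
print(ball_above_floor(0.5, y0=1.5, a=0.0))
```

Setting a equal to g keeps the ball at its starting height above the floor, while a = 0 recovers the ordinary drop.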

What about the frame \mathcal{S}'? Well, the only force in the problem is still gravity, so we conclude right away that y'(t) = y'_0 - \frac{1}{2} gt^2. y'(t) is exactly the same thing as \Delta y(t), the distance from the floor to the ball. But we don’t see a anywhere, and in particular we don’t see any possibility for the ball to be suspended in mid-air. This is a contradiction!

Obviously, one of our answers is wrong, and if you remember the fine print on Newton’s laws, you may already know the answer: the calculation in \mathcal{S}' is wrong, because \mathcal{S}' is an accelerating (non-inertial) frame of reference.

The good news is that we’re never forced to use a non-inertial frame, because frames of reference are our choice! Our calculation in the frame of an observer outside the elevator is perfectly fine. But it’s an important point to be aware of.

1.2.2 Newton’s laws in cylindrical coordinates

Now that we’ve reviewed both Newton’s laws and curvilinear coordinate systems, let’s have a closer look at what happens when we put the two together. We’ll focus mainly on cylindrical coordinates, which will show the important features; spherical is more complicated but not really different.

Remember that one of the nice things about Newton’s laws in Cartesian coordinates is that they split apart into three separate equations for the x,y,z directions. Let’s remind ourselves why: Newton’s second law reads \vec{F} = m\ddot{\vec{r}} \\ F_x \hat{x} + F_y \hat{y} + F_z \hat{z} = m \frac{d^2}{dt^2} \left( x \hat{x} + y \hat{y} + z\hat{z} \right) and then acting with the time derivatives and matching components, we read off the individual equations F_x = m\ddot{x} and so on.

Now, we can do the same thing in cylindrical coordinates, using our results from above: F_\rho \hat{\rho} + F_\phi \hat{\phi} + F_z \hat{z} = m \frac{d^2}{dt^2} \left( \rho \hat{\rho} + z \hat{z} \right) You might be tempted to just start matching terms again and conclude that F_\rho \stackrel{?}{=} m\ddot{\rho}. But there’s an immediate problem, which is that there doesn’t seem to be a \hat{\phi} term on the right-hand side! Does this mean that F_\phi is just irrelevant, no matter how large the force is? (That seems like a strange conclusion…)

The resolution to this problem is that the simple argument above is ignoring the time dependence of the cylindrical unit vectors. In fact, we already did the hard work above: Equation 1.11 contains the full result for \ddot{\vec{r}} taking this into account properly. Plugging that result in, we have

F_\rho \hat{\rho} + F_\phi \hat{\phi} + F_z \hat{z} = m (\ddot{\rho} - \rho \dot{\phi}^2) \hat{\rho} + m(\rho \ddot{\phi} + 2 \dot{\rho} \dot{\phi}) \hat{\phi} + m \ddot{z} \hat{z}

from which we can read off the equations of motion: F_\rho = m(\ddot{\rho} - \rho \dot{\phi}^2) \tag{1.18} F_\phi = m(\rho \ddot{\phi} + 2\dot{\rho} \dot{\phi}) \tag{1.19} and F_z = m\ddot{z} as in rectangular coordinates.
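If you want some reassurance that the extra terms in Equations 1.18 and 1.19 are really there, here is a numerical check in Python. It takes a made-up trajectory \rho(t), \phi(t) (purely hypothetical, chosen by me), converts to Cartesian x and y, estimates the acceleration by finite differences, and projects onto \hat{\rho} and \hat{\phi}. The result is compared against the right-hand sides of the two equations divided by m.

```python
import math

def rho(t):
    return 1.0 + 0.5 * t      # hypothetical radial motion

def phi(t):
    return 0.3 * t**2         # hypothetical angular motion

def cartesian(t):
    return rho(t) * math.cos(phi(t)), rho(t) * math.sin(phi(t))

def second_derivative(f, t, h=1e-4):
    """Central finite-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

t = 1.2
ax = second_derivative(lambda s: cartesian(s)[0], t)
ay = second_derivative(lambda s: cartesian(s)[1], t)

# Project the Cartesian acceleration onto rho-hat and phi-hat at time t.
c, s = math.cos(phi(t)), math.sin(phi(t))
a_rho_numeric = ax * c + ay * s
a_phi_numeric = -ax * s + ay * c

# Right-hand sides of Equations 1.18 and 1.19, divided by m,
# using exact derivatives of the trajectory chosen above.
rho_dot, rho_ddot = 0.5, 0.0
phi_dot, phi_ddot = 0.6 * t, 0.6
a_rho_formula = rho_ddot - rho(t) * phi_dot**2
a_phi_formula = rho(t) * phi_ddot + 2.0 * rho_dot * phi_dot

print(a_rho_numeric, a_rho_formula)
print(a_phi_numeric, a_phi_formula)
```

Each printed pair should agree up to the finite-difference error. Note that the radial acceleration is nonzero here even though \ddot{\rho} = 0 for this trajectory, which is exactly the point of the -\rho \dot{\phi}^2 term.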

The good news is that we now have dependence on all three coordinates, and on all three components of the force. The bad news is that this is going to give much more complicated differential equations for us to solve! (Attacking them directly for the most general problems, using the alternative Lagrangian formulation, is something you’ll come back to next semester.)

However, there are a number of specific problems where cylindrical coordinates are the best choice for solving using Newton’s laws.

Exercise: Tutorial 1C

Here, you should complete Tutorial 1C on “Motion in polar coordinates”. (Tutorials are not included with these lecture notes; if you’re in the class, you will find them on Canvas.)

Another good example of a problem best solved in cylindrical coordinates is a simple pendulum:

Note that I’ve turned the coordinates around a bit compared to how we usually draw them; this is so that I can identify the angle \phi = 0 with the lowest point of the pendulum. (Again, physical intuition: we know that hanging straight down is a special point because the pendulum won’t move if we start it there. So we anticipate the answer will be a bit simpler with these coordinates.)

Once again, the full setup and solution for the pendulum is included as a bonus example below; we won’t go through it in lecture. But I will go through the start, to make a point. We should begin with a free-body diagram:

Let’s draw angles onto our free-body diagram and identify the unit-vector directions in cylindrical coordinates:

We immediately see that \hat{\phi} is perpendicular to \vec{T}, which means that T_\phi is zero! The vector expression for tension is always the same, \vec{T} = -T \hat{\rho}, regardless of where the pendulum is. This is the payoff of using cylindrical coordinates; it makes our force equations “nicer”!

In lecture I will skip to the answer at this point, but if you want some practice you can click to go through the bonus example in detail.

For the pendulum, the cylindrical radius is fixed to the length of the string, \rho = L. This means that time derivatives of \rho vanish, leaving us with the equations F_\rho = -mL \dot{\phi}^2 \\ F_\phi = mL \ddot{\phi}

Notice, by the way, that if we were studying purely circular motion (e.g. the pendulum is being spun around on a flat surface instead of hanging vertically), we would have \dot{\phi} = v/L, and the first equation becomes F_\rho = -mL \left( \frac{v}{L} \right)^2 = -m\frac{v^2}{L}

which is the familiar equation for centripetal force. (If we further assume uniform circular motion, the angular speed \dot{\phi} is constant, so the second equation is just F_\phi = 0.)

Back to the more general case of a pendulum.

As for gravity, we will need to decompose it into radial and tangential pieces. This time I’ll do it the geometric way, but note that you can also start with \vec{F}_g = -mg\hat{y} and use algebra as an alternative way to convert. (Try it out if you’re not comfortable with vector algebra yet!) From the sketch above, we readily find that \vec{F}_g = mg \cos \phi \hat{\rho} - mg \sin \phi \hat{\phi}

Now we can plug back in to Newton’s laws in cylindrical components. Combining the radial forces, the first equation reads mg \cos \phi - T = -mL \dot{\phi}^2 and the second is -mg \sin \phi = mL \ddot{\phi}.

To solve for the motion, we can completely ignore the first equation, because it contains the unknown tension force T. (If we care about the tension, like if we know our string has a max tension and we want to predict whether it will break, we can go back and check at the end once we know \phi(t).) The second equation simplifies to the final result given below.

The results of the example calculation above are a single equation of motion for the angle \phi of the pendulum: \ddot{\phi} = -\frac{g}{L} \sin \phi.

Let’s check units, always a good practice: \phi is dimensionless, g has units of m/s^2, and L has units of m, so we have 1/s^2 on the right. This matches the left exactly, because d/dt itself has units of 1/s. (If this isn’t clear to you, think about the basic definition of a derivative: du/dt comes from the infinitesimal limit of \Delta u / \Delta t, and the units of \Delta t are clearly the same as the units of t.)

In this case both sides depend on the unknown \phi, so we can’t just integrate with respect to time. In fact, this is actually a surprisingly hard differential equation to solve, for such a simple system! There is no analytic solution in terms of elementary functions in general, but we can find an approximate solution: if we assume the angle \phi is always small (the ‘small-angle approximation’), then we can Taylor expand the right-hand side about \phi=0, \sin \phi \approx \phi + \ldots giving the simpler equation \ddot{\phi} = -\frac{g}{L} \phi. This still depends on \phi on both sides, but it has a relatively simple general solution, of the form \phi(t) = A \sin (\omega t) + B \cos (\omega t) where \omega = \sqrt{g/L}. At this point, we’re not ready to study how to get that general solution yet, but it’s easy for you to plug it back in and check that it works.
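To get a feel for how good the small-angle approximation is, here is a Python sketch that integrates the full equation \ddot{\phi} = -(g/L) \sin \phi numerically and compares it with the approximate solution. The integrator (classical fourth-order Runge-Kutta), the function names, and the values L = 1 m and g = 9.81 m/s^2 are all my own assumptions for illustration. For a pendulum released from rest at angle \phi_0, the initial conditions fix A = 0 and B = \phi_0 in the general solution.

```python
import math

def simulate_pendulum(phi0, L=1.0, g=9.81, t_end=2.0, dt=1e-3):
    """Integrate phi'' = -(g/L) sin(phi) from rest using classical RK4.

    Returns the angle phi at time t_end. L, g, t_end, dt are assumed values.
    """
    def deriv(phi, omega):
        return omega, -(g / L) * math.sin(phi)

    phi, omega = phi0, 0.0
    for _ in range(round(t_end / dt)):
        k1p, k1w = deriv(phi, omega)
        k2p, k2w = deriv(phi + 0.5 * dt * k1p, omega + 0.5 * dt * k1w)
        k3p, k3w = deriv(phi + 0.5 * dt * k2p, omega + 0.5 * dt * k2w)
        k4p, k4w = deriv(phi + dt * k3p, omega + dt * k3w)
        phi += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        omega += dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return phi

def small_angle(phi0, t, L=1.0, g=9.81):
    """Small-angle solution for release from rest: phi0 * cos(omega t)."""
    return phi0 * math.cos(math.sqrt(g / L) * t)

for phi0 in (0.05, 2.0):  # a small and a large starting angle, in radians
    print(phi0, simulate_pendulum(phi0), small_angle(phi0, 2.0))
```

For the small starting angle the two answers should be nearly identical, while for the large one they disagree badly, because the true period grows with amplitude and the approximate solution knows nothing about that.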

We’re starting to see that there are examples of equations of motion where the forces depend on coordinates, and we can’t just solve by simply integrating. In fact, this is a great point for us to go back to math, and start to study the more general theory of differential equations.