Session 1: Introduction to Theoretical Physics
George E. Hrabovsky
MAST
This is the first of a series of articles/book chapters/web pages that I intend to write about the nature and processes of theoretical physics. I plan to author corresponding mathematics writings that will explain in greater depth the justifications and deeper subtleties of the mathematical techniques that will be brought to bear. In this series I will present mathematical techniques, often without justification and with the understanding that you will be able to explore those ideas more fully elsewhere. This writing is an overview of theoretical physics; the first cut through the material, if you will. I will then expand each of the main topics into their own writings; the second cut through the material. This expansion of the material will proceed until I have become too exhausted to continue, at which point someone else can pick it up. In reality, at some point I will make a decision to stop a certain expansion in favor of those that are more interesting to me; you are free to expand the material as you like.
In its fundamental essence this will constitute a complete course in theoretical physics from the ground up. I assume that you have little background in physics, and that you have had exposure to some algebra and geometry at the high school level. Beyond that you will need to keep a certain mental agility to be able to cope with the uncertainties and ambiguities inherent in physics.
In what follows I will begin by giving an illustrative example of theoretical physics. I will then describe, in a very broad sense, what we consider to be the subject matter of physics. This will be followed by describing what theoretical physics is about. Then I will discuss the mathematical and computational aspects of physics. This will be followed by a description of things to do before starting the next writing.
If you are reading this, you are interested in theoretical physics. But what is theoretical physics? I have come up with several definitions, based on how you approach the subject.
The modeling approach to theoretical physics. Another way of describing this would be the phenomenon-centered approach, whose goal is to understand a specific phenomenon by developing either a mathematical or computational model. You begin by choosing a phenomenon to study. Then you choose an approach to representing the phenomenon: can you represent it as a particle? A field? Some continuous distribution of matter? Then you choose a mathematical formulation. Examples of mathematical formulations are Newtonian mechanics, Maxwell's equations, Lorentz covariance, the Maxwell-Boltzmann distribution, and so on. Such formulations constitute much of the material of most textbooks and courses on physics. You then adapt your approach to the mathematical formulation, thus developing a mathematical representation of your phenomenon. You then use physical, mathematical, and/or computational arguments and methods to make predictions in the form of tables, plots, and/or formulas. By studying these results in different circumstances you can extend your understanding of the phenomenon. This is the most direct method of doing theoretical physics: it is a straight application of mathematical or computational methods. It is certainly the most structured way of doing theoretical physics.
The constructive approach to theoretical physics. This can be thought of as the method used to develop a new formulation of a physical theory. Examples are the Lagrangian formulation of mechanics, the Lagrangian formulation of electrodynamics, the Eulerian formulation of fluid dynamics, the path-integral formulation of quantum mechanics, and so on. You begin by choosing how you represent objects in your developing theory. Then you choose some quantity, or set of quantities, to base your construction on. Then you choose an argument to base your construction on. Are you seeking symmetries? Are you arguing from some conserved quantity? Are you assuming that your quantity is minimized? For example, in the Lagrangian formulation you create a new quantity called the Lagrangian and then work out the consequences when the integral of the Lagrangian, called the action, is minimized. This leads to the Euler-Lagrange equations of motion, a new formulation of classical mechanics. This is a much more difficult, but more powerful, method: you build the formulation yourself. The difficulty stems from the lack of structural guidelines for creating a new formulation.
The abstract approach to theoretical physics. This is the mode where you take a number of specific cases and generalize their results. For example, knowing that when a derivative is zero the corresponding quantity is unchanged, you take the vanishing derivative of momentum in many specific cases and generalize that into the law of conservation of momentum. This sort of activity is very difficult, since there are few guidelines for how to proceed beyond what is already known.
The unification approach to theoretical physics. This is based on the idea that it would be nice if there were a single theory governing a wide range of phenomena. There is no real reason to believe that this is true in general, which is one difficulty with practical application. Another difficulty is that all of our equations are, to one degree or another, approximations of reality. So the fact that equations in different fields look alike is another way of saying that the approximations are similar. Does that mean the phenomena are also similar? Sometimes. Isaac Newton unified gravity at the surface of the Earth with gravity away from the Earth. James Clerk Maxwell unified electricity, magnetism, and light. Abdus Salam, Sheldon Glashow, and Steven Weinberg unified electromagnetism and the weak nuclear force. Unifying the electroweak theory with the strong interaction is a work in progress, and even less progress has been made in unifying gravity with the other interactions.
So this, then, is the general nature of theoretical physics.
In this section I intend to show how an idea in physics evolves. This will demonstrate that ideas change and this change leads to greater accuracy.
The ancient Greek philosophers, often led by Aristotle, had the idea that gravitation was a natural tendency for objects to be attracted to an almost mystical place in the world. This special place was the center of the Earth and the heavier an object was the more strongly attracted to the center it would be. In other words, their weight determined their proper place and they all settled into that place. This was their idea of gravity. Today scientists laugh at that idea, but what tells us that this idea is wrong? What is the right idea?
The fact that Aristotle's idea of gravity was wrong took a long time to be realized. It was Galileo who put the proverbial "nail in the coffin". His argument went something like this (note: I will enumerate the arguments so they are easier to follow; this will be a standard procedure for proofs and derivations):
We will assume that an object that is heavy falls faster than a lighter object—as they are each trying to get to their proper place in the world. This idea explains why it is possible to pick up small objects, but not buildings or mountains; the latter being in their proper places. This is the idea promoted by Aristotle.
What happens when we strap a lighter object to a heavy one? There are two possibilities: either the combined object acts like a single object, or it does not. This is an example of the law of the excluded middle. Something either is or it is not; there is no middle. These possibilities lead to the next two arguments. Examining each of these two situations is an example of a proof technique called case analysis.
If the combination forms a single object, that single object is heavier than either of the two components. By the assumption in step 1 the single heavier object must fall faster than the heavier of the two component objects.
If the combination does not form a composite object, then, by the assumption made in step 1, the lighter object will fall slower than the heavier. Since they are connected by the strap, the lighter object will slow the rate of fall of the heavier object, so the combination will not fall as fast as the heavier object.
These two arguments lead to the prediction that the same combination of objects falls both faster and slower than the heavier of the two component objects alone. A situation where a given assertion leads to two or more opposing outcomes is called a contradiction. No assertion that leads to a contradiction can be true. This method of proof is proof by contradiction, or reductio ad absurdum. Let us say that you are trying to prove an assertion. The first step in a proof by contradiction is to assume your assertion to be false. You then show that this falsehood leads to a contradiction. Since no assertion leading to a contradiction can be true, the falsehood is then itself false. This proves your original assertion cannot be false. By the law of the excluded middle, it must then be true. This completes a proof by contradiction.
In this case we have proved that Aristotle's assertion that objects fall at a rate according to their weight is false; this is the same as proving that objects fall in a way that is independent of their weight. In fact, this principle is the law of falling bodies. To state this law explicitly, objects fall under the influence of gravity independent of their weight. This implies that the influence of gravity is the same for all objects.
Once the prediction had been made that objects fall independently of their weight, experiments were performed that confirmed this result.
This is a fantastic example of the process of theoretical physics! We took an established idea, showed that it leads to contradictory predictions, formulated a new hypothesis, and confirmed it by both logical reasoning and physical experiment.
There is a well-worn definition of physics as the study of matter and energy. I do not like that definition, as it is almost misleading. Physics concerns itself with the most fundamental principles of the universe around us. This boils down to understanding the most elementary constituents of matter and the interactions between them.
Wait a minute! What about energy? The truth is we do not really know what energy is. We can calculate energy for many different situations. We can use these calculations to learn about different situations, but these calculations result from matter and interactions. All we know about energy is that it is some number we can calculate and use in calculations.
The attempt to understand matter by studying idealized objects without regard to shape or size is called particle theory. The first step in understanding any physics is to try to simplify the situation by removing all complications and then working out all of the consequences of the situation. The particle is this kind of simplification. For such a simple explanation, it is very rich in principles and consequences. In the last century, particle physics has also taken on a definition relating to subatomic particles.
The attempt to understand interactions between collections of matter by examining properties that seem to be everywhere is called field theory. Here we look at a property like temperature. We then state that this property exists everywhere we are considering. Thus there is a temperature at every point we can possibly look at in the situation we are studying. This situation is said to represent a field, in this case a temperature field.
The theories of matter result from the inevitable reintroduction of nature's complications into our idealized theories. Once we have studied many simple ideas, we need to make them more realistic by reintroducing some of the complications that we removed in the process of simplification. We can treat matter in bulk as a kind of matter field. We can also examine matter and the interactions of matter at ever smaller scales, where the simple ideas no longer hold.
Applied physics is a collection of disciplines that use physics to describe specific phenomena. These have the character of being much more complicated than pure physics, since they deal with situations where the simplifications of pure physics do not always hold. The simplified theories of pure physics have removed complications that must be considered in the more realistic situations covered by applied physics. Here we include astrophysics, atmospheric physics, biophysics, physical chemistry, the physical theory of computation and information, electronics, engineering physics, geophysics, physical hydrology, materials physics, and physical oceanography.
Most everyone has heard the term laws of physics, but what is a law of physics? How does it come about? What makes it a law? Let us say that you have been thinking about the relationship between the pressure, volume, and temperature of a gas. After a while you become so curious that you do some experiments and measure the pressure of a gas for different volumes at constant temperature. You find that the pressure, symbolized by P, is proportional to the inverse of the volume, symbolized by V. We write this symbolically,
P ∝ 1/V    (1)
where ∝ is the proportionality symbol. After some more analysis we note that (1) becomes an exact equality when the inverse of the volume is multiplied by some constant determined by the physical system under study. We will symbolize this constant by c, so we have,
P = c/V    (2)
We can rewrite this,
P V = c    (3)
This is called Boyle's law. It is one of the basic gas laws.
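To see the law in action (the numbers here are invented for illustration): if a sample of gas occupies 2 liters at a pressure of 1 atmosphere, then c = PV = 2 atmosphere-liters. Squeezing the same gas down to 1 liter at the same temperature raises the pressure to P = c/V = 2 atmospheres.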
All such laws are similar in two ways. First, they are similar in that they are all true within what I call their region of applicability, that is when the assumptions that were made when they were discovered are still valid. Second, they all break down in some way when those assumptions are no longer valid, in other words when the law is used outside of its region of applicability.
So, is physics just a collection of laws? No; such a collection is an absolute statement of fact and is unable to extend itself beyond the regions of applicability of the laws. It also does not explore the relationships between the various laws. Any list of laws of physics will, by necessity, be restrictive. How, then, does physics advance into regions not covered by existing laws? As we will see in the next section, it happens by modifying existing laws, or by creating new ones.
What is a physical theory? It turns out that the answer to this question is a little counterintuitive from the point of view of the general idea of what a theory is. A scientific theory is a body of work leading to a self-consistent idea that is considered to be a fact. In most cases there is no controversy about the theory in question.
The program of theoretical physics is all about developing physical theories. This takes us back to the section on the nature of theoretical physics; where you choose the goal of your physical theory.
For a model-based approach the process begins by forming primitive, intuitive, and ill-defined notions about what you are studying. From this beginning you construct precise ideas and give them symbolic representation. You then formulate relationships between these ideas from observation, experiment, or theoretical work; these are the physical laws of the previous section. Often they are stated in the form, "Let us assume...". By manipulating these statements, making physical arguments, and making calculations for specific situations, we can make predictions with these statements. This type of prediction is called a model. Some models are based on mathematical derivations, some are computer simulations. A body of models linked by physical argument, derivation methods, and/or computer simulations constitutes a physical theory. Specifically, you begin by choosing a particle theory, a field theory, a theory of matter, or a theory of applied physics. Then you choose either a mathematical or computational formulation and begin to study the problem you are interested in. Often this process starts with a study of how the quantities you are interested in are related through their basic dimensions, such as length, time, and mass; this is called dimensional analysis. Another initial approach is to estimate the orders of magnitude of the quantities you will study. Both of these techniques have come to be known as back-of-the-envelope physics, and they allow you to judge whether an answer you get makes sense or not.
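As an example of such back-of-the-envelope reasoning (a standard exercise; the specifics are my own illustration, not drawn from the text above): suppose you want to know how the period T of a swinging pendulum depends on its length L, its mass m, and the gravitational acceleration g. T is measured in seconds, L in meters, m in kilograms, and g in meters per second squared. The only way to combine L, m, and g into something with units of time is √(L/g); the mass cannot enter at all, since nothing else carries units of kilograms to cancel it. So before doing any detailed mechanics we expect T ∝ √(L/g), and indeed for small swings the full analysis gives T = 2π√(L/g).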
For a constructive approach the process begins in much the same way. Again you choose a particle theory, a field theory, a theory of matter, or a theory of applied physics. Instead of choosing an existing mathematical or computational formulation, you invent a new quantity to play with. This should be based upon the existing quantities of the theory, but viewed in a new way. This new way of viewing the new quantity can be based on functional relationships with the known quantities. It could be based on finding symmetries, ways of changing the model that leave the quantity unchanged. It could be based on conservation, using the fact that a conserved quantity does not change. It could be based on minimization, the idea that a model will always take the least value of some quantity in order to be most efficient. No matter what it is based on, you then work out the ramifications of this new formulation.
For an abstract approach you begin by examining one or more models. Then you look for things that are both common to them all and not already encompassed by the existing theory. This is somewhat ambiguous because it is as much art as it is science to find such things. Once you find such a common element you assume that it is true and you work out its ramifications.
For the unification approach, you begin by deciding what to unify. Then you try to figure out how to perform this unification. Such a unification will result in a single formulation that encompasses all of the things you want to unify. Then you attempt to prove that the unification scheme is mathematically viable. Then you try to make models based on it.
The formulation and solution of physics problems by means of mathematical structures and techniques is called mathematical physics. It is composed of two main parts: the development of new mathematics, and the application of existing mathematics to physics problems.
In physics most of the models we develop will involve one quantity changing with respect to another. Systems that change in such a way are referred to as dynamical systems. Let us say that we are modeling the change in location with respect to time. We say that for time, t, we have a corresponding value of location, x_t. Now we measure the location n time steps later, at time t + n; the value of the location will be x_(t+n). We can then say that the change in location, Δx, is,
Δx = x_(t+n) - x_t    (4)
This is called an increment in x. Should this expression be equal to some function of t, then we call it a finite difference equation. This is the simplest model of a changing system, and it forms the basis for numerical computer models. We can get the average (or mean) change in position with time if we take the ratio of the change in location to the corresponding change in time, which we will call Δt,
Δx/Δt = (x_(t+n) - x_t)/Δt    (5)
This is called a divided difference. It is important to note that we can also write this in another form, using the standard notation for functions and writing the location as f(t),
Δx/Δt = (f(t + Δt) - f(t))/Δt    (6)
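To make this concrete (a made-up example): if the location is given by x = f(t) = t² and we take t = 1 and Δt = 0.1, then f(1.1) - f(1) = 1.21 - 1 = 0.21, and the divided difference is 0.21/0.1 = 2.1, an estimate of how fast x is changing near t = 1.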
If we think about (5) or (6), we can see that if we decrease the time increment in the denominator then the estimate of the average will become more accurate. The value of the mean change will get closer to the actual value. On the other hand, if we make the denominator equal to 0 we get the disaster known as a singularity, where the mean value expands without limit.
Fortunately, an invention has been created to prevent this singularity from happening. We can state that we will take the limit of the mean value such that Δt approaches zero. This means that the distance representing the time interval, in essence the measurement error, drops to practically nothing. To actually calculate this result we must perform the following steps:
Write f(t) explicitly; that is, you write out the entire expression for f(t).
Write f(t + Δt) explicitly and expand algebraically as required.
Write f(t + Δt) - f(t) explicitly and evaluate the difference as required.
Divide f(t + Δt) - f(t) through by Δt and evaluate the quotient as required.
Substitute 0 for all instances of Δt. This is taking the limit as Δt goes to 0. Perform all necessary multiplications by 0.
What you are left with is called a derivative of the function. We write it in many different ways that are all equivalent,
dx/dt = df/dt = f'(t) = lim_{Δt → 0} [f(t + Δt) - f(t)]/Δt    (7)
This expression gives the best possible measure of the change of a function with respect to another quantity: it is the value that the divided differences approach as the increment shrinks. For example,
Assume f(t) = t³.
Then f(t + Δt) = (t + Δt)³ = t³ + 3t²Δt + 3t(Δt)² + (Δt)³.
Then f(t + Δt) - f(t) = 3t²Δt + 3t(Δt)² + (Δt)³.
Dividing by Δt we have 3t² + 3tΔt + (Δt)².
We take the limit as Δt goes to 0, leaving df/dt = 3t².
Now, we can use this principle to approximate the value of the function at points near some chosen point a. To do this we use the following formula, where f'(a) is the derivative evaluated at a,
f(t) ≈ f(a) + f'(a)(t - a)    (8)
This is called the linear approximation. We can use our example above to calculate 11³.
Choose a=10. I chose this value because 10 is close to 11 and the cube is easy to calculate.
f(a) = 10³ = 1000.
We know that f'(a) = 3a² = 3 × 10² = 300.
We can calculate 11-10 = 1.
So, 11³ ≈ f(10) + f'(10) × (11 - 10) = 1000 + 300 × 1 = 1300.
Note that the actual value of 11³ is 1331. We were off by 31, which is an error of about 2.3%; not too bad.
For more details see references [1], [2], [3], and [4]. It is important to note the relationship between the finite difference and the derivative. In essence they serve the same purpose: the derivative is for situations where variables change smoothly and continuously, and the finite difference is for situations where variables change in discrete steps.
As the derivative can be thought of as the calculus form of subtraction (based on the finite difference), what about addition? Is there a calculus form of addition? Yes; when we accumulate all the little bits of something considered in differentiation (taking the derivative) we call that integration. Integration is the opposite of differentiation. If you look at a table of derivatives and you start with the derivative, the function the derivative came from is the integral. In the example above, we began with t³ and its derivative 3t²; were we to integrate this derivative we would get t³ plus a constant as the result. We can write,
∫ 3t² dt = t³ + c    (9)
Here c represents an arbitrary constant that can be determined from the details of the problem being studied. When we differentiate a constant it goes away (since there is no change in a constant). When we integrate, we must always add an arbitrary constant, since such a constant may have been present in the original function and leaves no trace after differentiation. Expression (9) is called an indefinite integral. The symbol ∫ is called a summa; it is a stretched-out s, signifying a summation. What is being summed are all the bits of t described by the symbol dt. More details about this can be found in references [1], [2], [3], and [4].
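To see how the constant is fixed in practice (the condition here is invented for illustration): suppose the quantity described by (9) is known to have the value 5 at t = 0. Substituting t = 0 into t³ + c gives 0 + c = 5, so c = 5 and the particular result is t³ + 5.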
Just as there is a finite difference corresponding to the derivative in the discrete case, so too is there a discrete analogue of the integral. Let us say that we are summing a collection of n small increments of t; we can write,
Δt_1 + Δt_2 + ... + Δt_n = Σ_{i=1}^{n} Δt_i    (10)
This is called a series. If n=∞, then we call it an infinite series. The series is the discrete analogue for the integral. Again you can find out more in references [1], [2], [3], and [4].
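As a small illustration of this correspondence (my own example, not one from the text): the integral of 3t² from t = 0 to t = 1 is exactly 1, and we can approximate it with a ten-term series by summing 3(0.1 i)² × 0.1 for i = 1 to 10, which gives 0.003 × 385 = 1.155. As the increments are made smaller the series creeps toward the value of the integral.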
Computational physics is similar to mathematical physics in that it is a means of formulating and solving physics problems. Computational physics involves both the development of new computational structures and the application of existing ones.
We will be using the computer algebra system called Mathematica to develop our computational models. In Mathematica we represent a general finite difference equation, like (4) above, by a user-created function, sketched below.
The general form of a function definition in Mathematica is functionname[arguments]. The _ is a symbol for pattern matching: whatever replaces the t_ pattern replaces every instance of t on the right-hand side of the definition. Thus delx[a,b] becomes x[a+b]-x[a].
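A minimal sketch of such a definition and its use (the argument names t and n are my choice; the function name delx and the result x[a+b]-x[a] come from the discussion above):

    delx[t_, n_] := x[t + n] - x[t]   (* the increment of x, as in (4) *)
    delx[a, b]                        (* evaluates to x[a + b] - x[a] *)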
The average, the divided difference of (5), can be written as a similar user-created function.
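A minimal sketch, assuming we pass the time increment explicitly and call the function avgdx (both names are my choice):

    avgdx[t_, dt_] := (x[t + dt] - x[t])/dt   (* the divided difference of (5) *)
    avgdx[a, b]                               (* evaluates to (x[a + b] - x[a])/b *)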
We can take the limit of an expression using the Limit command. The syntax is Limit[expression, variable -> value].
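For example, we can apply it to the divided difference worked out by hand above for f(t) = t³ (this particular input is my illustration; the symbol dt stands in for Δt):

    Limit[(3 t^2 dt + 3 t dt^2 + dt^3)/dt, dt -> 0]   (* returns 3 t^2 *)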
We can calculate a derivative using the D command. The syntax is D[function, independent variable]. For example, the derivative of t³ with respect to t is 3t².
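In Mathematica this is the one-line computation:

    D[t^3, t]   (* returns 3 t^2 *)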
We can also write the linear approximation (8) in Mathematica. Here we use the command Function[variable, body][argument] to specify the functional form we are going to use. Then we use the symbol /. (ReplaceAll) to substitute values for specified symbols. We again calculate an approximation to 11³.
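One way to write this is sketched below (the names f, approx, a, and b are my choice):

    f = Function[t, t^3];                            (* the functional form *)
    approx = f[a] + (D[f[t], t] /. t -> a) (b - a);  (* the linear approximation (8) *)
    approx /. {a -> 10, b -> 11}                     (* returns 1300 *)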
This is what we expected. We can calculate integrals using the Integrate[function, independent variable] command. We integrate 3t².
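A sketch of the computation (note that Mathematica omits the arbitrary constant c):

    Integrate[3 t^2, t]   (* returns t^3 *)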
We can find a discrete sum using the Sum[function, iterator] command, where the iterator has the form {variable, start, end}.
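For example, summing the first n whole numbers (this particular sum is my illustration, not one from the text):

    Sum[i, {i, 1, n}]   (* returns n (n + 1)/2 *)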
Begin building your library of references. Definitely begin with references [1], [2], [3], [4], and [9]. Reference [5] is a gold mine, and I use it now—even though I have the immensely powerful computer algebra system (CAS) Mathematica. Check out the web sites listed in [6]. Work some problems in calculus and do not be concerned with how you did on them, just get some experience.
Get a computer. Plain and simple—a computer is absolutely essential these days. Get one and use it every day.
Get a Computer Algebra System (CAS). I recommend Mathematica, but if you cannot afford it there is a free alternative in Maxima, which you can download from http://maxima.sourceforge.net/. Other alternatives are available, but they require LISP or C++ to be installed on your system and you need to build them yourself; this can be a lot of work. Learn to use your CAS and then use it to check your work. There is an inexpensive home version of Mathematica that is a fully capable version of the software. It runs about $300, so it is well within most budgets.
To become proficient at theoretical physics, you must do it. Each of these writings will include a section on practice problems. These will be solved in the next writing.
Write out three functions of t whose properties you understand.
Express each function from problem 1 as a divided difference as in (6).
Express each function from problem 1 as a derivative in t.
I have presented a basic definition of what theoretical physics is all about. I have also presented a few mathematical tools to get you started. All-in-all this is a good beginning. Remember that this is the first session in studying a field that has enjoyed rapid expansion for the last four hundred years.
While you work through these writings it is a good idea to read popular accounts and to keep track of recent events in physics; I particularly like reference [7] for this. I suggest reference [8] as a means of trying to keep abreast of what is new. It is the arXiv (pronounced "archive") preprint server and should be daily required reading for theoretical physicists. I doubt that you will get much out of it for a long time, but it can motivate you. A lot of review articles appear there and you can use them to frame your interests. One way of doing this is to read until you encounter something you don't understand and then work backwards, looking things up until you reach a point where you do understand.
[1] Gilbert Strang, (1991), Calculus, Wellesley Cambridge Press. Also available at the MIT Open Courseware site http://ocw.mit.edu/ans7870/resources/Strang/strangtext.htm as a free download. Gilbert Strang is a fine writer and famous professor. This book is pretty good, not my favorite, but I like it.
[2] Paul Dawkins, (2007), Calculus I, available at http://tutorial.math.lamar.edu/pdf/CalcI/CalcI_Complete.pdf as a free download. Also available from this web site are Calculus II, Calculus III, Algebra, Linear Algebra, and Differential Equations. These books are all very nice.
[3] Frank Ayres, Jr., Elliott Mendelson, (1999), Calculus, 4th edition, McGraw-Hill; this is one of the Schaum's Outline series. If you are not familiar with the Schaum's Outline series, they are an outline of the theory along with hundreds of solved practice problems; in this case there are 1,103 such problems!
[4] Dan Sloughter, (2000), Difference Equations to Differential Equations, as a free download located here: http://synechism.org/drupal/de2de/. This is my favorite textbook for single-variable calculus. He has also written The Calculus of Functions of Several Variables (available at http://synechism.org/drupal/cfsv), Yet Another Calculus Text (dealing with hypercomplex numbers, available at http://synechism.org/drupal/yact), and the introduction to advanced calculus A Primer of Real Analysis (available at http://synechism.org/drupal/pra).
[5] Murray R. Spiegel, (1995), Mathematical Handbook, 34th printing, McGraw-Hill; this is one of the Schaum's Outline series.
[6] There are four web sites that warehouse lots of math and physics books that are free. The most active one is Free Science and is located at http://freescience.info/index.php, which has all areas of science and technology. Then there is the Free Book Centre located at http://www.freebookcentre.net/, this site has lots of computer science, math, electronics, and even medical books. Then there is Free Online Books at http://books.pspxworld.com/ where they have math and computer science books. My favorite, with math and physics, is Textbooks in Mathematics, located at http://mathbooks.110mb.com/mylist.php, though this list is not as active as it once was.
[7] Leonard Susskind, (2008), The Black Hole War: My Battle with Stephen Hawking to Make the World Safe for Quantum Mechanics, Little, Brown and Company. While written for the general public, this book provides deep insights into the methods and attitudes of theoretical physicists.
[8] The arXiv preprint server. Available at http://arxiv.org/.
[9] Wilfred Kaplan, Donald J. Lewis, (1970), Calculus and Linear Algebra, Volumes 1 and 2, Wiley and Sons, reprinted in 2007 by the Scholarly Publishing Office of the University of Michigan. The first volume in this remarkable series is available as a free download from the University of Michigan: http://quod.lib.umich.edu/cgi/t/text/text-idx?c=spobooks;idno=5597602.0001.001. Volume 2 is available from the same source: http://quod.lib.umich.edu/cgi/t/text/text-idx?c=spobooks;idno=5597602.0002.001. The nice thing about this series is that it covers not only calculus but also linear algebra.