# A brief introduction to Inverse Theory

GEO 5210 – Seismology I L14-1
Lecture 14 – Seismic Inversion Part I
1. A brief introduction to Inverse Theory
Inverse theory in seismology has many applications. Some of the most important are:
1) Earthquake location
2) Determining 1D velocity and density structure of the Earth
3) Seismic tomography – i.e., 3D velocity variations.
Many problems in the physical sciences can be written in the form of a linear equation:
𝐀𝐀𝐀𝐀=𝐝𝐝
Where we define:
A = model/data kernel. This provides the relationship between model parameters and data.
m = model vector. Contains the unknown model parameters.
d = data vector. Contains our observations.
Example: Straight line fitting
A simple example would be temperature measurements (Ti) made as a function of depth (zi) in the Earth.
Looking at these data we see what looks to be a linear increase of temperature with depth. So, let’s try and describe our ith temperature measurement as follows:
𝑇𝑇𝑖𝑖=𝑎𝑎+𝑏𝑏𝑧𝑧𝑖𝑖
That is, we want to find the equation of a straight line that best fits our observations. We can write this in matrix form:
GEO 5210 – Seismology I L14-2
⎣⎢⎢⎢⎡𝑇𝑇1𝑇𝑇2𝑇𝑇3𝑇𝑇4𝑇𝑇5⎦⎥⎥⎥⎤=⎣⎢⎢⎢⎡11111𝑧𝑧1𝑧𝑧2𝑧𝑧3𝑧𝑧4𝑧𝑧5⎦⎥⎥⎥⎤􁉂𝑎𝑎𝑏𝑏􁉃
Where our data vector (d) contains our temperature measurements (T1, T2, …, T5), our model vector (m) contains the unknowns (our slope – b and intercept – a), and the model kernel (A) shows the relationship we are assuming between our data and unknowns.
Just multiplying out the first term of our matrix above we can see that this represents the relationship we are looking for:
𝑇𝑇1=𝑎𝑎+𝑏𝑏𝑧𝑧1
If the system above was evenly determined (i.e., we had the same number of equations and unknowns) then we could easily find the solution of 𝐀𝐀𝐀𝐀=𝐝𝐝 by just solving for m. That is:
𝐦 =𝐀𝐀−1𝐝𝐝
We can see why this is called inverse theory, because to find our solutions (our model parameters) we have to find the inverse of the model kernel.
But, our example problem had 2 unknowns and 5 observations. This is often the case that we encounter, i.e., we usually have more observations than unknowns – this is referred to as an overdetermined system.
Overdetermined Systems
Recall that a matrix A is invertible if there exists a matrix B such that: AB = BA = I. Non-square matrices are not invertible. Hence, in our example above we have a 5 × 2 matrix, which is not square and not directly invertible.
Traditionally, what is done is to look for the solution that provides the best fit in the least squares sense. Consider the next image:
GEO 5210 – Seismology I L14-3
Here we show our observed temperature readings (red circles) and a possible line fit to these observations. The black cross shows the predicted (model) value for this line at each depth. What we calculate is the residual. Where we define the residual = observed value – model value.
The least-squares solution provides the best-fit where we define the best-fit as the line for which the sum of the squared residuals has its least value. That is, we seek to minimize:
𝑆𝑆=Σ𝑟𝑟𝑖𝑖2𝑛𝑛𝑖𝑖=1
Where ri is the ith residual.
We set up our equations as:
𝐀𝐀𝐀𝐀=𝐝𝐝
The solution by least squares can be determined from multiplying each side of the above equation by the transpose of A: 𝐀𝐀𝑡𝑡 (see your linear algebra text for proof).
⇒𝐀𝐀𝑡𝑡𝐀𝐀𝐀𝐀=𝐀𝐀𝑡𝑡𝐝𝐝
Now we solve for m:
⇒𝐦 =(𝐀𝐀𝑡𝑡𝐀𝐀)−𝟏𝟏𝐀𝐀𝑡𝑡𝐝𝐝
Note that in our example A is a 5 × 2 matrix. So, At is a 2 × 5 matrix. So, multiplying 𝐀𝐀𝑡𝑡𝐀𝐀 we get a 2 × 2 matrix, which is square, and possibly invertible.
Underdetermined Systems
Underdetermined systems are also possible. That is, we have more unknowns than equations. If we try the least squares solution we find that 𝐀𝐀𝑡𝑡𝐀𝐀 is singular (i.e., it has no inverse). There are an infinite number of solutions. That is, if we try to fit a straight line to a single data point, any line will fit through that data point:
GEO 5210 – Seismology I L14-4
This is unfortunate because sometimes parts of problems we are trying to solve may be underdetermined.
The traditional approach to handling this is to look for the simplest solution. That is, we search for the model values that have the shortest length in the least squares sense. Specifically we seek to minimize:
𝑆𝑆=Σ𝑚𝑚𝑖𝑖2𝑖𝑖
In this case, the solution is:
𝐦 =𝐀𝐀𝑡𝑡(𝐀𝐀𝐀𝐀𝑡𝑡)−1𝐝𝐝
To make this clear, let’s take a look at an example. Suppose we have just one measurement:
T1 = 5° at z1 = 10 m.
So, we set up our matrix equation as:
𝐀𝐀𝐀𝐀=𝐝𝐝
⇒[1𝑧𝑧1]􁉂𝑎𝑎𝑏𝑏􁉃=[𝑇𝑇1]
⇒[110]􁉂𝑎𝑎𝑏𝑏􁉃=[5]
Solving 𝐦 =𝐀𝐀𝑡𝑡(𝐀𝐀𝐀𝐀𝑡𝑡)−1𝐝𝐝 we get:
⇒􁉂𝑎𝑎𝑏𝑏􁉃=􁉂110􁉃􁉀[110]􁉂110􁉃􁉁−1[5]
⇒􁉂𝑎𝑎𝑏𝑏􁉃=􁉂110􁉃(101)−1[5]
⇒􁉂𝑎𝑎𝑏𝑏􁉃=
􁉂110􁉃􁉂1101􁉃[
5]
⇒􁉂𝑎𝑎𝑏𝑏􁉃=􁉎510150101􁉏
GEO 5210 – Seismology I L14-5
⇒𝑎𝑎=5101=0.0495 (the intercept)
⇒𝑏𝑏=50101=0.495 (the slope)
And our solution length is:
𝑆𝑆=Σ𝑚𝑚𝑖𝑖2𝑖𝑖=𝑎𝑎2+𝑏𝑏2=.247
So, we get a straight line passing through our point that looks like:
But is this solution really of any interest to us?
The problem in geophysics is that we often encounter cases where parts of our problem are over-determined and other parts are under-determined. So, typically we try to seek a compromise between minimizing the prediction error (the residuals) and the solution length.
Damped Least Squares
The damped least squares solution is:
𝐦 =(𝐀𝐀𝑡𝑡𝐀𝐀+λ2𝐈𝐈)−1𝐀𝐀𝑡𝑡𝐝𝐝
Where λ is a damping factor.
We try to choose a value of λ that minimizes both the prediction error and solution length.
• If λ2 is large  the emphasis is on minimizing the solution length.
GEO 5210 – Seismology I L14-6
The effect of a large damping factor is to produce a smooth model (i.e., we have a small model length) but not necessarily a great fit to the data.
• If λ2 is small  the emphasis is on minimizing the residuals (prediction error).
The effect of a small damping factor is have very small residuals but a very rough model (and a very large solution length).
We typically try to find a balance between these two end-member cases. For example, we can try a range of values for λ and produce a trade-off curve as shown below. Here we try to find the optimal solution as a balance between model roughness and model misfit. Calculating model roughness can be done in a variety of ways, for example by looking at the 2nd derivative of the model values.
GEO 5210 – Seismology I L14-7
But, one must always be cautious of these solutions. The optimal model doesn’t necessarily imply it is really the Earth we are looking at.
The figure below is an example of the effect of damping in a seismic tomography model. This is from the S-wave tomography model S20RTS by Jeroen Ritsema. It shows a cross-section through the Earth (cutting through the central Pacific LLSVP). Here we can see the effect that choosing a different damping parameter can have on the resulting model. On the left we have a heavily damped model which gives us a very smooth model. Whereas on the right we have less damping that results in a much rougher model and stronger perturbations in velocity.
2. Tau-P Inversion
Previously we noted that travel-time vs. distance or ray parameter vs. distance plots do not produce singularly valued functions. To get around this limitation we generate a new function called a Tau-p or τ(p) curve.
GEO 5210 – Seismology I L14-8
Setting up the inverse problem:
Recall from a previous lecture that we defined Tau(p) as follows:
𝜏𝜏(𝑝𝑝)=2􀶱(𝑢𝑢2−𝑝𝑝2)12 𝑑𝑑𝑑 𝑧𝑧𝑝𝑝0
We can write this instead for a stack of discrete layers:
𝜏𝜏􀵫𝑝𝑝𝑗𝑗􀵯=2􀷍􀵫𝑢𝑢𝑖𝑖2−𝑝𝑝𝑗𝑗2􀵯12𝑛𝑛𝑖𝑖=1ℎ𝑖𝑖; 𝑢𝑢𝑖𝑖>𝑝𝑝𝑗𝑗
That is, we have measured τ at a number of discrete ray parameters (p1, p2, p3, …, pm). And we decide to parameterize our Earth model as a series of n-layers, each with thickness h:
Let’s think about this problem in the sense of straight-line fitting segments of our travel-time vs. distance curve:
GEO 5210 – Seismology I L14-9
That is, what if we approximate the real travel-time curve (blue), by a series of straight line segments (red segments). Each one of these segments has a ray parameter p, and let’s assume that the slowness is u. This is strictly only true if we are looking at refracted arrivals (headwaves – see the aside at the end of this lecture), but is a good first approximation in any case, and is true in the limit that our straight line segment lengths go to zero.
Let’s consider a few values of this equation.
𝜏𝜏􀵫𝑝𝑝𝑗𝑗􀵯=2􀷍􀵫𝑢𝑢𝑖𝑖2−𝑝𝑝𝑗𝑗2􀵯12𝑛𝑛𝑖𝑖=1ℎ𝑖𝑖; 𝑢𝑢𝑖𝑖>𝑝𝑝𝑗𝑗
j = 1
𝜏𝜏(𝑝𝑝1)=2􀷍􀵫𝑢𝑢𝑖𝑖2−𝑝𝑝12􀵯12𝑛𝑛𝑖𝑖=1ℎ𝑖𝑖
⇒𝜏𝜏(𝑝𝑝1)=2ℎ1(𝑢𝑢12−𝑝𝑝12)12+2

2(𝑢𝑢22−𝑝𝑝12)12+⋯
Recall that in our setup we assume that u1 = p1, u2 = p2, etc. Hence our first term (𝑢𝑢12−𝑝𝑝12)=0. Since we have increasing velocity with depth, that means that u2 < u1. And u1 = p1, so that u2 < p1. Thus, our second term (𝑢𝑢22−𝑝𝑝12)12 is undefined (hence, the condition that 𝑢𝑢𝑖𝑖>𝑝𝑝𝑗𝑗).
So, 𝜏𝜏(𝑝𝑝1)=0. This makes sense since our first straight-line segment passes through the origin.
j = 2
𝜏𝜏(𝑝𝑝2)=2ℎ1(𝑢𝑢12−𝑝𝑝22)12+2
ℎ2(𝑢𝑢22−𝑝𝑝22)12
The second term is zero here, so:
⇒𝜏𝜏(𝑝𝑝2)=2ℎ1(𝑢𝑢12−𝑝𝑝22)12
j = 3
Just considering the non-zero terms:
𝜏𝜏(𝑝𝑝3)=2ℎ1(𝑢𝑢12−𝑝𝑝32)12+2

2(𝑢𝑢22−𝑝𝑝32)12
All terms
So, it should be obvious by now that we can write this in the form of a matrix:
GEO 5210 – Seismology I L14-10
􀵦𝜏𝜏(𝑝𝑝2)𝜏𝜏(𝑝𝑝3)⋮𝜏𝜏(𝑝𝑝𝑚𝑚)􀵪=⎣⎢⎢⎢⎢⎡2(𝑢𝑢12−𝑝𝑝22)1200…2(𝑢𝑢12−𝑝𝑝32)122(𝑢𝑢22−𝑝𝑝32)120…⋮2(𝑢𝑢12−𝑝𝑝𝑚𝑚2)12⋮2(𝑢𝑢22−𝑝𝑝𝑚𝑚2)12⋱⋮…2(𝑢𝑢𝑛𝑛2−𝑝𝑝𝑚𝑚2)12⎦⎥⎥⎥⎥⎤􀵦ℎ1ℎ2⋮ℎ𝑛𝑛􀵪
Which is a matrix equation of the form:
𝝉𝝉=𝐀𝐀𝐀𝐀
Which can be inverted for the layer thicknesses (h).
3. 1D Seismic Velocity Models
The first 1-D models of Earth’s seismic velocity structure were set up by Jeffreys and Bullen. Their 1940 JB model consisted of 7 layers labeled A through G as follows:
Layer
Deepest Depth (km)
Region
A
33
Crust
B
413
Upper Mantle
C
984
Mantle Transition
D
2898
Lower Mantle
E
4982
Outer Core
F
5121
Core Transition
G
6371
Inner Core
The layer names A through G are no longer used. But, it is interesting to note that early on the lowermost part of the D layer was observed to be different, hence the D layer was split into the D’ and Dʺ layers. Dʺ is still used for the lowermost part of the mantle. Another interesting feature was the F-layer, or the Core transition zone. This was dropped in future models, but now researchers are putting forth further evidence that there is a transition region from the outer- to inner- core, and hence the F-layer is making a comeback.
More recent developments are:
1981 – PREM (Preliminary Reference Earth Model) – by Dziewonski and Anderson.
Note that these models typically invert for seismic velocities as polynomial functions in different bands of the Earth as opposed to a stack of layers as we discussed above.
GEO 5210 – Seismology I L14-11
PREM model.
1991 – IASP91 (International Association of Seismology and the Physics of the Earth’s Interior) – by Kennett and Engdahl.
This model primarily improves the seismic velocities at the mantle transition zone.
1995 – AK135 – by Kennett, Engdahl, and Buland.
The main difference in model AK135 from IASP91 is in the velocities near the boundary of the inner core.
The next figure shows the difference between the JB40 model and the IASP91 model. As can be seen the variations aren’t that dramatic.
GEO 5210 – Seismology I L14-12
From Stein & Wysession.
The Moon
The only other planetary body for which we have 1D seismic velocity profiles is the moon. Recall that we installed seismometers on the moon during the Apollo missions:
Apollo 12 Alsep (Apollo Lunar Surface Experiments Package)
GEO 5210 – Seismology I L14-13
Several competing models currently exist. But, one of the best models is by Nakamura 1983. Note that these models do not resolve structure down to the lunar core.
Model Nakamura 1983. Plot by Yao Yao (green – S-wave velocity; red – P-wave velocity)
4. Density Structure
Models like PREM also include density as a function of depth. How is this determined? The answer can be seen by looking at the Adams-Williamson equation.
Assume that the Earth is made of infinitesimally thin spherical shells with uniform properties.
We know at depth that pressure (P) is given by:
𝑃𝑃=𝜌𝜌𝜌𝜌ℎ
Where,
g = gravitational acceleration.
ρ = density
h = depth
If we consider a change in pressure ΔP with increasing depth Δh:
Δ𝑃𝑃=𝜌𝜌𝜌𝜌Δℎ
⇒Δ𝑃𝑃Δℎ=𝜌𝜌𝜌𝜌
GEO 5210 – Seismology I L14-14
And, in the limit Δℎ⟶0
⇒d𝑃𝑃dℎ=𝜌𝜌𝜌𝜌
That is the change in pressure with increasing depth is +ρg. Going the other way, with increasing radius:
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 =−𝑔 (𝑟𝑟)𝜌𝜌(𝑟𝑟)
That is, the pressure decreases with increasing radius.
We also know that the acceleration of gravity at radius r is (just equate Newton’s second law with Newton’s law of universal gravitation):
𝑔 (𝑟𝑟)=𝐺𝐺𝑀𝑀𝑟𝑟𝑟𝑟2
Where,
Mr = Mass within radius r.
Thus,
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 =−𝐺𝐺𝑀𝑀𝑟𝑟𝑟𝑟2𝜌𝜌(𝑟𝑟)
From the chain rule:
𝑑𝑑𝑑 𝑑𝑑𝑑 =𝑑𝑑𝑑 𝑑𝑑𝑑 𝑑𝑑𝑑 𝑑𝑑𝑑
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 =−𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟2𝑑𝑑𝑑 𝑑𝑑𝑑 (Eqn 1)
Recall that bulk modulus (k) (or Incompressibility) is a materials ability to resist being compressed under a pressure equal in all directions.
Bulk modulus = k VVP−ΔΔ=,
Where P is the Pressure (Force/Area) applied to a volume of material given by V. Graphically, if we apply an equal Pressure (P) on all sides…
GEO 5210 – Seismology I L14-15
Then the box deforms as follows…
Where the dashed lines show the volume of the original, uncompressed, box.
For small volumes:
𝑘𝑘=𝑑𝑑𝑑 𝑑𝑑𝑑𝑑−𝑉 􀵗; (𝐸𝐸𝐸𝐸𝐸𝐸 2)
Noting that we have a minus sign so that with increasing pressure we get a smaller volume.
Since we know that 𝜌𝜌=𝑚𝑚/𝑉 we can write:
𝑑𝑑𝑑 𝑑𝑑𝑑𝑑=𝑑𝑑𝑑𝑑𝑑𝑑􁉀𝑚𝑚𝑉 􁉁
Carrying out the derivative:
⇒𝑑𝑑𝑑 𝑑𝑑𝑑𝑑=−𝑚𝑚𝑉 2
And substituting back in 𝜌𝜌=𝑚𝑚/𝑉 :
⇒𝑑𝑑𝑑 𝑑𝑑𝑑𝑑=−𝜌𝜌𝑉
GEO 5210 – Seismology I L14-16
⇒𝑑𝑑𝑑𝑑𝑉 =−𝑑𝑑𝑑 𝜌𝜌
Now, if we substitute this into Eqn 2:
⇒𝑘𝑘=𝑑𝑑𝑑 𝑑𝑑𝑑𝑑−𝑉 􀵗=𝑑𝑑𝑑 𝑑𝑑𝑑 𝜌𝜌=𝜌𝜌𝑑𝑑𝑑 𝑑𝑑𝑑
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 =𝜌𝜌𝑘𝑘
And substituting this back into Eqn 1:
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 =−𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟2𝑑𝑑𝑑 𝑑𝑑𝑑 =−𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟2 𝜌𝜌(𝑟𝑟)𝑘𝑘
Now, if we recall our seismic wave speeds:
P-wave velocity = 𝛼𝛼=􀶨𝑘𝑘+43𝜇𝜇𝜌𝜌
S-wave velocity = 𝛽𝛽=􀶧𝜇𝜇𝜌𝜌
⇒𝜇𝜇=𝛽𝛽2𝜌𝜌
⇒𝛼𝛼=􀶨𝑘𝑘+43𝜇𝜇𝜌𝜌=􀶨𝑘𝑘+43𝛽𝛽2𝜌𝜌𝜌𝜌
⇒𝛼𝛼2=𝑘𝑘𝜌𝜌+4𝛽𝛽23
Rearranging:
⇒𝑘𝑘𝜌𝜌=𝛼𝛼2−4𝛽𝛽23
⇒𝜌𝜌𝑘𝑘=1𝛼𝛼2−43𝛽𝛽2
And substituting back into our previous equation for the rate of change of density with respect to radius:
GEO 5210 – Seismology I L14-17
⇒𝑑𝑑𝑑 𝑑𝑑𝑑 = −𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟2 𝜌𝜌(𝑟𝑟)𝑘𝑘=−𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟21𝛼𝛼2−43𝛽𝛽2
If we let 𝜑𝜑=𝛼𝛼2−43𝛽𝛽2, then we arrive at the Adams-Williamson equation:
𝑑𝑑𝑑 𝑑𝑑𝑑 = −𝐺𝐺𝑀𝑀𝑟𝑟𝜌𝜌(𝑟𝑟)𝑟𝑟2𝜑𝜑
This equation gives us the density gradient with respect to radius. Unfortunately, density is also on the RHS of this equation, which is what we are interested in solving for. We already know the seismic velocities however in great detail (see previous lecture!). So, how do we solve this problem?
What we do is an iterative approach. That is, we start at the Earth’s surface at a known density:
𝜌𝜌(6371 𝑘𝑘𝑘 )=3300𝑘𝑘𝑘 𝑚𝑚3⁄
We know the Mass of the Earth (Mr) and the seismic wave speeds, so we can now calculate the gradient 𝑑𝑑𝑑 𝑑𝑑𝑑 . Hence, with this gradient we can linearly extrapolate the density to a new smaller radius:
Now, with this new density, we can calculate a new Mr at the new radius, and repeat. Hence, it is an iterative process. Remember, we also have important constraints that must be satisfied:
• Total Mass of the Earth: ME = 5.974×1024 kg.
• Mass at each successive level: 𝑀𝑀𝑟𝑟=𝑀𝑀𝐸𝐸−4𝜋𝜋∫𝜌𝜌(𝑟𝑟)𝑟𝑟2𝑑𝑑𝑑 𝑅𝑅𝐸𝐸𝑟𝑟
• Moment of inertia constant: 𝐼𝐼𝑚𝑚𝑎𝑎2=0.3307
There are a couple problems with direct implementation of this approach that also need to be accounted for.
GEO 5210 – Seismology I L14-18
• This assumes adiabatic compression. But, convection is happening in the mantle so an extra term accounting for convection must be included.
• This approach also assumes that the composition is constant throughout the Earth. This is readily accounted for by introducing density jumps at compositional boundaries.

Don't use plagiarized sources. Get Your Custom Essay on
A brief introduction to Inverse Theory
Just from \$10/Page
Basic features
• Free title page and bibliography
• Unlimited revisions
• Plagiarism-free guarantee
• Money-back guarantee
On-demand options
• Writer’s samples
• Part-by-part delivery
• Overnight delivery
• Copies of used sources
Paper format
• 275 words per page
• 12 pt Arial/Times New Roman
• Double line spacing
• Any citation style (APA, MLA, Chicago/Turabian, Harvard)

# Our guarantees

We value our customers and so we ensure that what we do is 100% original..
With us you are guaranteed of quality work done by our qualified experts.Your information and everything that you do with us is kept completely confidential.

### Money-back guarantee

You have to be 100% sure of the quality of your product to give a money-back guarantee. This describes us perfectly. Make sure that this guarantee is totally transparent.

### Zero-plagiarism guarantee

The Product ordered is guaranteed to be original. Orders are checked by the most advanced anti-plagiarism software in the market to assure that the Product is 100% original. The Company has a zero tolerance policy for plagiarism.

### Free-revision policy

The Free Revision policy is a courtesy service that the Company provides to help ensure Customer’s total satisfaction with the completed Order. To receive free revision the Company requires that the Customer provide the request within fourteen (14) days from the first completion date and within a period of thirty (30) days for dissertations.

The Company is committed to protect the privacy of the Customer and it will never resell or share any of Customer’s personal information, including credit card data, with any third party. All the online transactions are processed through the secure and reliable online payment systems.

### Fair-cooperation guarantee

By placing an order with us, you agree to the service we provide. We will endear to do all that it takes to deliver a comprehensive paper as per your requirements. We also count on your cooperation to ensure that we deliver on this mandate.

## Calculate the price of your order

550 words
We'll send you the first draft for approval by September 11, 2018 at 10:52 AM
Total price:
\$26
The price is based on these factors: