Some antiderivatives are difficult (or impossible) to compute. In such cases, numerical approximation of a definite integral may be a better (or the only) option.
In Calculus 1, we learned about the "numerical_integral" command in Sage. We are not going to use this command in this lab. Instead, we will explore various approximation techniques.
The simplest numerical approximation comes from the definition of the definite integral as a limit of Riemann sums. Each Riemann sum is an approximation of the definite integral using rectangles (with one side on the x-axis), and we can make the approximation better by increasing the number of rectangles.
Our goal is to approximate $\displaystyle\int_a^bf(x)\,dx$.
We will use $n$ rectangles of equal width, which means the width of each rectangle is $\Delta x=\frac{b-a}{n}$.
The height of the rectangle will be given by the value of the function $f$ at some point in the base of that rectangle.
If we choose the left endpoint of each rectangle, we will find the left Riemann sum. If we choose the right endpoint of each rectangle, we will find the right Riemann sum.
The endpoints of the rectangles are $a,\ a+\Delta x,\ a+2\Delta x,\ a+3\Delta x,\ \ldots,\ a+n\Delta x=b$.
So the left Riemann sum is $\displaystyle LS=f(a)\cdot\Delta x+f(a+\Delta x)\cdot\Delta x+\cdots+f(a+(n-1)\Delta x)\cdot\Delta x=\sum_{i=0}^{n-1}\left(f(a+i\Delta x)\cdot\Delta x\right)$.
Similarly, the right Riemann sum is $\displaystyle RS=f(a+\Delta x)\cdot\Delta x+f(a+2\Delta x)\cdot\Delta x+\cdots+f(b)\cdot\Delta x=\sum_{i=1}^{n}\left(f(a+i\Delta x)\cdot\Delta x\right)$.
The only difference in these formulas is the index of summation.
Consider $\displaystyle\int_0^2\left(x^2-x\right)\, dx$.
Here are pictures with $n=5$ rectangles.
The animation below shows graphs of left Riemann sums for increasing values of $n$. You can see that the rectangles begin to fill in the area under the curve.
The formulas below compute the left and right Riemann sums.
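As a minimal sketch in plain Python (not the worksheet's original Sage cell), the two sums can be written directly from the summation formulas above. The function names `left_sum` and `right_sum` are illustrative choices, not Sage commands.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def left_sum(f, a, b, n):
    """Left Riemann sum: heights taken at a + i*dx for i = 0, ..., n-1."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(n))

def right_sum(f, a, b, n):
    """Right Riemann sum: heights taken at a + i*dx for i = 1, ..., n."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

ls5 = left_sum(f, 0, 2, 5)
rs5 = right_sum(f, 0, 2, 5)
print(ls5, rs5)  # approximately 0.32 and 1.12
```

The only difference between the two functions is the range of the index `i`, exactly as in the formulas.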
For the sake of comparison, the exact value is $\frac{2}{3}\approx0.66667$.
Let's change the number of rectangles to 10 and see how our approximation improves.
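A possible way to redo the computation with $n=10$ is sketched here in plain Python. The helper `riemann_sum` and its `start` argument are hypothetical names used only for this illustration: `start=0` gives the left sum and `start=1` gives the right sum.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def riemann_sum(f, a, b, n, start):
    # start=0 uses i = 0..n-1 (left sum); start=1 uses i = 1..n (right sum)
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(start, start + n))

exact = 2 / 3
ls10 = riemann_sum(f, 0, 2, 10, 0)
rs10 = riemann_sum(f, 0, 2, 10, 1)
print(ls10, rs10)  # approximately 0.48 and 0.88
print(abs(ls10 - exact), abs(rs10 - exact))  # size of each error
```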
Remember that the exact answer is $\frac{2}{3}$. Both of these are closer to the correct answer than before, but they are still not very close.
Of course, we can increase our accuracy by using larger values of $n$, but it would be very nice if there were more accurate approximation methods. Fortunately, there are.
The Midpoint Rule is another Riemann sum approximation, but instead of using the left or right endpoints, we will use the midpoint of each subinterval.
Notice that in our example above, a left or right rectangle is usually either too big (an over-estimate) or too small (an under-estimate) on its whole subinterval.
If we use the midpoint, then part of the rectangle will typically be above the curve and part will be below, so the over- and under-estimates tend to cancel and give us a better estimate.
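In the same notation as the left and right sums, the midpoint of the subinterval that starts at $a+i\Delta x$ is $a+\left(i+\frac{1}{2}\right)\Delta x$, so the Midpoint Rule with $n$ rectangles can be written as $\displaystyle M_n=\sum_{i=0}^{n-1}\left(f\left(a+\left(i+\tfrac{1}{2}\right)\Delta x\right)\cdot\Delta x\right)$.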
Consider $\displaystyle\int_0^2\left(x^2-x\right)\, dx$.
Here is a plot and calculation for the Midpoint Rule using $n=5$.
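The calculation can be sketched in plain Python as follows. This is an illustration rather than the worksheet's original Sage cell, and `midpoint_rule` is a name chosen here.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def midpoint_rule(f, a, b, n):
    """Riemann sum whose heights are taken at the midpoint of each subinterval."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

m5 = midpoint_rule(f, 0, 2, 5)
print(m5)  # approximately 0.64
```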
The exact value is $\frac{2}{3}\approx0.667$, and the Midpoint Rule gives $0.64$, which is within $0.03$ of the exact value with only 5 rectangles.
Recall that for $n=5$, the left sum was 0.32 and the right sum was 1.12. The Midpoint Rule has done a good job of balancing out the under- and over-estimates.
So far we have been approximating our function using a constant (horizontal line = degree 0 polynomial) on each subinterval.
We may be able to improve our approximation if we use a degree 1 polynomial instead (i.e., a non-horizontal line). For each subinterval, we'll use the secant line based on the left and right endpoints. Of course, the resulting shape is not a rectangle but a trapezoid. Thus, this approach is called the Trapezoidal Rule.
Recall that the area of a trapezoid is $\frac{b_1+b_2}{2}h$, where $b_1$ and $b_2$ are the lengths of the bases and $h$ is the height. In this case, the trapezoid is sitting on its side, so the bases are actually vertical and the height is $\Delta x$.
Consider $\displaystyle\int_0^2\left(x^2-x\right)\, dx$.
Here is the plot and calculation for the Trapezoidal Rule using $n=5$.
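A minimal Python sketch of this calculation adds up the trapezoid areas one subinterval at a time, using the area formula $\frac{b_1+b_2}{2}h$ with $h=\Delta x$. The name `trapezoid_rule` is an illustrative choice, not a Sage command.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def trapezoid_rule(f, a, b, n):
    """Sum of n trapezoid areas; each has vertical bases f(left), f(right) and height dx."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        left_base = f(a + i * dx)
        right_base = f(a + (i + 1) * dx)
        total += (left_base + right_base) / 2 * dx
    return total

t5 = trapezoid_rule(f, 0, 2, 5)
print(t5)  # approximately 0.72
```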
Notice that this approximation is worse than what we got from the Midpoint Rule, but it is much better than either the left or right sum.
In fact, numerically the Trapezoidal Rule simply gives the average of the left and right sums!
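To see why, write $T_n$ for the Trapezoidal Rule with $n$ trapezoids and split each trapezoid's area into its two halves: $\displaystyle T_n=\sum_{i=0}^{n-1}\frac{f(a+i\Delta x)+f(a+(i+1)\Delta x)}{2}\cdot\Delta x=\frac{1}{2}\sum_{i=0}^{n-1}\left(f(a+i\Delta x)\cdot\Delta x\right)+\frac{1}{2}\sum_{i=1}^{n}\left(f(a+i\Delta x)\cdot\Delta x\right)=\frac{LS+RS}{2}$.
For our example with $n=5$, this gives $\frac{0.32+1.12}{2}=0.72$.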
In general (but not always), the Midpoint Rule will give a better approximation. If the function $f$ is either concave up or concave down on the entire subinterval, then the linear approximation from the Trapezoidal Rule (the secant line) will be entirely above or entirely below the curve, so its errors all have the same sign. On the other hand, the horizontal line segment from the Midpoint Rule will typically be above the curve on part of the subinterval and below the curve on the rest, so its errors partially cancel.
Despite this numerical disappointment, the idea of increasing the degree of the approximating polynomial was sound. The next step is to use a degree 2 polynomial (quadratic, parabola) to approximate $f$ on each subinterval. This is called Simpson's Rule.
Instead of using a line to approximate our curve on each subinterval, we will use a parabola. Since a parabola is usually closer to the curve than a line, this should give us a better approximation (of course, this is not true if our original curve is a line, but in that case, why are we approximating the integral?).
A line is determined by 2 points, but it takes 3 points to determine a parabola.
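The three points that are normally used are the endpoints of two consecutive subintervals: the left endpoint of the first, the point they share, and the right endpoint of the second. Each parabola therefore spans two subintervals, which forces us to use an even number of subintervals (i.e., $n$ must be even). The number of approximating parabolas is then $n/2$.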
I won't take you through all the calculations (see the textbook, pages 454-456). The algebra is complicated, but here is the final result:
$\int_a^bf(x)\, dx\approx\frac{\Delta x}{3}\left(f(a)+4f(a+\Delta x)+2f(a+2\Delta x)+4f(a+3\Delta x)+2f(a+4\Delta x)+\cdots+2f(a+(n-2)\Delta x)+4f(a+(n-1)\Delta x)+f(b)\right)$
Notice the pattern in the coefficients: $1,4,2,4,2,\ldots,2,4,1$
Consider $\displaystyle\int_0^2\left(x^2-x\right)\, dx$.
Here is the result for Simpson's Rule using $n=10$ (5 parabolas).
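One way to carry out this computation in plain Python is sketched below, following the coefficient pattern $1,4,2,4,\ldots,2,4,1$. The name `simpson_rule` and the error raised for odd $n$ are choices made for this illustration.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def simpson_rule(f, a, b, n):
    """Simpson's Rule with n subintervals (n must be even, giving n/2 parabolas)."""
    if n % 2 != 0:
        raise ValueError("Simpson's Rule needs an even number of subintervals")
    dx = (b - a) / n
    total = f(a) + f(b)                  # the two endpoints have weight 1
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2  # odd-indexed points get 4, even-indexed get 2
        total += weight * f(a + i * dx)
    return total * dx / 3

s10 = simpson_rule(f, 0, 2, 10)
print(s10)  # approximately 0.66667, matching 2/3 up to rounding error
```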
Of course, in this example $f(x)=x^2-x$, which is already a parabola, so Simpson's Rule gives the exact answer.
It is interesting to note that numerically the answer from Simpson's Rule is a weighted average of the answers from the Trapezoidal Rule and the Midpoint Rule, each computed with half as many subintervals. In particular, if $S_{2n}$ is the approximation from Simpson's Rule with $2n$ subintervals ($n$ parabolas), $T_n$ is the approximation from the Trapezoidal Rule with $n$ trapezoids, and $M_n$ is the approximation from the Midpoint Rule with $n$ rectangles, then $S_{2n}=\frac{T_n+2M_n}{3}$.
Here is the calculation for the examples above:
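The check can be sketched in plain Python with $n=5$, comparing $\frac{T_5+2M_5}{3}$ with $S_{10}$ for $f(x)=x^2-x$ on $[0,2]$. The three function names are illustrative, and each rule is written in its compact weighted-sum form.

```python
def f(x):
    # Integrand from the running example: f(x) = x^2 - x
    return x**2 - x

def midpoint_rule(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def trapezoid_rule(f, a, b, n):
    # Weights 1, 2, 2, ..., 2, 1 times dx/2
    dx = (b - a) / n
    interior = sum(f(a + i * dx) for i in range(1, n))
    return (f(a) + 2 * interior + f(b)) * dx / 2

def simpson_rule(f, a, b, n):
    # n is assumed even; weights 1, 4, 2, 4, ..., 2, 4, 1 times dx/3
    dx = (b - a) / n
    odd = sum(f(a + i * dx) for i in range(1, n, 2))
    even = sum(f(a + i * dx) for i in range(2, n, 2))
    return (f(a) + 4 * odd + 2 * even + f(b)) * dx / 3

t5 = trapezoid_rule(f, 0, 2, 5)   # approximately 0.72
m5 = midpoint_rule(f, 0, 2, 5)    # approximately 0.64
s10 = simpson_rule(f, 0, 2, 10)
weighted = (t5 + 2 * m5) / 3
print(weighted, s10)  # both approximately 0.66667
```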
Here is one example that puts all the rules together. (I'll skip the graphs again.)
Let's approximate $\displaystyle\int_1^4 \left(x^5-4x^2+6x-9\right)\, dx$ using $n=10$, $n=50$, and $n=100$.
Compare these to the exact value of $616.5$.
Now let's try $n=50$:
Compare these to the exact value of $616.5$.
Now let's try $n=100$:
Compare these to the exact value of $616.5$.
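The three comparisons above can be reproduced with a sketch like the following, written in plain Python rather than with Sage-specific commands. The function names and the `results` dictionary are illustrative choices. The script prints each approximation next to its absolute error for $n=10$, $50$, and $100$.

```python
def f(x):
    # Integrand for this example
    return x**5 - 4 * x**2 + 6 * x - 9

def left_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

def right_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

def midpoint_rule(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def trapezoid_rule(f, a, b, n):
    # The Trapezoidal Rule equals the average of the left and right sums
    return (left_sum(f, a, b, n) + right_sum(f, a, b, n)) / 2

def simpson_rule(f, a, b, n):
    # n is assumed even; weights 1, 4, 2, 4, ..., 2, 4, 1 times dx/3
    dx = (b - a) / n
    odd = sum(f(a + i * dx) for i in range(1, n, 2))
    even = sum(f(a + i * dx) for i in range(2, n, 2))
    return (f(a) + 4 * odd + 2 * even + f(b)) * dx / 3

rules = {
    "left": left_sum,
    "right": right_sum,
    "midpoint": midpoint_rule,
    "trapezoid": trapezoid_rule,
    "simpson": simpson_rule,
}

exact = 616.5
results = {}
for n in (10, 50, 100):
    results[n] = {name: rule(f, 1, 4, n) for name, rule in rules.items()}
    print(f"n = {n}")
    for name, value in results[n].items():
        print(f"  {name:10s} {value:12.5f}   error {abs(value - exact):.5f}")
```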
What do we see from this example?
All the approximations improve as $n$ increases.
The left and right Riemann sums are not very good approximations.
The Midpoint Rule and Trapezoidal Rule are better than the left and right sums.
The Midpoint Rule is better than the Trapezoidal Rule.
Simpson's Rule is the best.
The function to be integrated, $f(x)=\cos(x)$, is in black. The red vertical lines mark the ends of each parabola. The approximating parabolas are in blue.
I have set $n=10$ (5 parabolas).
Here's the graph with $n=20$ (10 parabolas).