# Bounding sums or integrals of non-negative quantities

A common task in analysis is to obtain bounds on sums

$\displaystyle \sum_{n \in A} f(n)$

or integrals

$\displaystyle \int_A f(x)\ dx$

where ${A}$ is some simple region (such as an interval) in one or more dimensions, and ${f}$ is an explicit (and elementary) non-negative expression involving one or more variables (such as ${n}$ or ${x}$), and possibly also some additional parameters. Often, one would be content with an order of magnitude upper bound such as

$\displaystyle \sum_{n \in A} f(n) \ll X$

or

$\displaystyle \int_A f(x)\ dx \ll X$

where we use ${X \ll Y}$ (or ${Y \gg X}$ or ${X = O(Y)}$) to denote the bound ${|X| \leq CY}$ for some constant ${C}$; sometimes one wishes to also obtain the matching lower bound, thus obtaining

$\displaystyle \sum_{n \in A} f(n) \asymp X$

or

$\displaystyle \int_A f(x)\ dx \asymp X$

where ${X \asymp Y}$ is synonymous with ${X \ll Y \ll X}$. Finally, one may wish to obtain a more precise bound, such as

$\displaystyle \sum_{n \in A} f(n) = (1+o(1)) X$

where ${o(1)}$ is a quantity that goes to zero as the parameters of the problem go to infinity (or some other limit). (For a deeper dive into asymptotic notation in general, see this previous blog post.)

Here are some typical examples of such estimation problems, drawn from recent questions on MathOverflow:

Compared to other estimation tasks, such as that of controlling oscillatory integrals, exponential sums, singular integrals, or expressions involving one or more unknown functions (that are only known to lie in some function spaces, such as an ${L^p}$ space), high-dimensional geometry (or alternatively, large numbers of random variables), or number-theoretic structures (such as the primes), estimation of sums or integrals of non-negative elementary expressions is a relatively straightforward task, and can be accomplished by a variety of methods. The art of obtaining such estimates is typically not explicitly taught in textbooks, other than through some examples and exercises; it is typically picked up by analysts (or those working in adjacent areas, such as PDE, combinatorics, or theoretical computer science) as graduate students, while they work through their thesis or their first few papers in the subject.

Somewhat in the spirit of this previous post on analysis problem solving strategies, I am going to try here to collect some general principles and techniques that I have found useful for these sorts of problems. As with the previous post, I hope this will be something of a living document, and encourage others to add their own tips or suggestions in the comments.

— 1. Asymptotic arithmetic —

Asymptotic notation is designed so that many of the usual rules of algebra and inequality manipulation continue to hold, with the caveat that one has to be careful if subtraction or division is involved. For instance, if one knows that ${A \ll X}$ and ${B \ll Y}$, then one can immediately conclude that ${A + B \ll X+Y}$ and ${AB \ll XY}$, even if ${A,B}$ are negative (note that the notation ${A \ll X}$ or ${B \ll Y}$ automatically forces ${X,Y}$ to be non-negative). Equivalently, we have the rules

$\displaystyle O(X) + O(Y) = O(X+Y); \quad O(X) \cdot O(Y) = O(XY)$

and more generally we have the triangle inequality

$\displaystyle \sum_\alpha O(X_\alpha) = O( \sum_\alpha X_\alpha ).$

(Again, we stress that this sort of rule implicitly requires the ${X_\alpha}$ to be non-negative. As a rule of thumb, if your calculations have arrived at a situation where a signed or oscillating sum or integral appears inside the big-O notation, or on the right-hand side of an estimate, without being “protected” by absolute value signs, then you have probably made a serious error in your calculations.)

Another rule of inequalities that is inherited by asymptotic notation is that if one has two bounds

$\displaystyle A \ll X; \quad A \ll Y \ \ \ \ \ (1)$

for the same quantity ${A}$, then one can combine them into the unified asymptotic bound

$\displaystyle A \ll \min(X, Y). \ \ \ \ \ (2)$

This is an example of a “free move”: a replacement of bounds that does not lose any of the strength of the original bounds, since of course (2) implies (1). In contrast, other ways to combine the two bounds (1), such as taking the geometric mean

$\displaystyle A \ll X^{1/2} Y^{1/2}, \ \ \ \ \ (3)$

while often convenient, are not “free”: the bounds (1) imply the averaged bound (3), but the bound (3) does not imply (1). On the other hand, the inequality (2), while it does not concede any logical strength, can require more calculation to work with, often because one ends up splitting up cases such as ${X \ll Y}$ and ${X \gg Y}$ in order to simplify the minimum. So in practice, when trying to establish an estimate, one often starts with using conservative bounds such as (2) in order to maximize one’s chances of getting any proof (no matter how messy) of the desired estimate, and only after such a proof is found, one tries to look for more elegant approaches using less efficient bounds such as (3).

For instance, suppose one wanted to show that the sum

$\displaystyle \sum_{n=-\infty}^\infty \frac{2^n}{(1+n^2) (1+2^{2n})}$

was convergent. Lower bounding the denominator term ${1+2^{2n}}$ by ${1}$ or by ${2^{2n}}$, one obtains the bounds

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{2^n}{1+n^2} \ \ \ \ \ (4)$

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{2^n}{(1+n^2) 2^{2n}} = \frac{2^{-n}}{1+n^2} \ \ \ \ \ (5)$

so by applying (2) we obtain the unified bound

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{\min(2^n,2^{-n})}{1+n^2}.$

To deal with this bound, we can split into the two contributions ${n \geq 0}$, where the minimum is ${2^{-n}}$, and ${n < 0}$, where the minimum is ${2^n}$. In the former case we see (from the ratio test, for instance) that the sum

$\displaystyle \sum_{n=0}^\infty \frac{2^{-n}}{1+n^2}$

is absolutely convergent, and in the latter case we see that the sum

$\displaystyle \sum_{n=-\infty}^{-1} \frac{2^{n}}{1+n^2}$

is also absolutely convergent, so the entire sum is absolutely convergent. But once one has this argument, one can try to streamline it, for instance by taking the geometric mean of (4), (5) rather than the minimum to obtain the weaker bound

$\displaystyle \frac{2^n}{(1+n^2) (1+2^{2n})} \ll \frac{1}{1+n^2} \ \ \ \ \ (6)$

and now one can conclude without decomposition just by observing the absolute convergence of the doubly infinite sum ${\sum_{n=-\infty}^\infty \frac{1}{1+n^2}}$. This is a less “efficient” estimate, because one has conceded a lot of the decay in the summand by using (6) (the summand used to be exponentially decaying in ${n}$, but is now only polynomially decaying), but it is still sufficient for the purpose of establishing absolute convergence.
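As a quick numerical sanity check of the example above (a sketch only, not part of the argument), one can compare the summand with the minimum of the bounds (4), (5) and with the geometric mean bound (6) over a finite window. The function names and the truncation `N = 200` are illustrative choices and do not come from the text.

```python
def summand(n: int) -> float:
    """The summand 2^n / ((1 + n^2)(1 + 2^(2n)))."""
    return 2.0**n / ((1 + n * n) * (1 + 4.0**n))


def bound_min(n: int) -> float:
    """Minimum of the bounds (4) and (5): min(2^n, 2^-n) / (1 + n^2)."""
    return min(2.0**n, 2.0**-n) / (1 + n * n)


def bound_geometric_mean(n: int) -> float:
    """Geometric mean of (4) and (5), i.e. the bound (6): 1 / (1 + n^2)."""
    return 1.0 / (1 + n * n)


N = 200  # illustrative truncation of the doubly infinite sum
window = range(-N, N + 1)

total = sum(summand(n) for n in window)
total_min = sum(bound_min(n) for n in window)
total_gm = sum(bound_geometric_mean(n) for n in window)

print(f"summand: {total:.6f}  min bound: {total_min:.6f}  bound (6): {total_gm:.6f}")
```

The minimum bound tracks the exponential decay of the summand, while (6) retains only the polynomial decay; both truncated totals stay bounded as `N` grows, which is all that absolute convergence requires.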

One of the key advantages of dealing with order of magnitude estimates, as opposed to sharp inequalities, is that the arithmetic becomes tropical. More explicitly, we have the important rule

$\displaystyle X + Y \asymp \max(X,Y)$

whenever ${X,Y}$ are non-negative, since we clearly have

$\displaystyle \max(X,Y) \leq X+Y \leq 2 \max(X,Y).$

In particular, if ${Y \leq X}$, then ${O(X) + O(Y) = O(X)}$. That is to say, given two orders of magnitude, any term ${O(Y)}$ of equal or lower order to a “main term” ${O(X)}$ can be discarded. This is a very useful rule to keep in mind when trying to estimate sums or integrals, as it allows one to discard many terms that are not contributing to the final answer. It also sets up the fundamental divide and conquer strategy for estimation: if one wants to prove a bound such as ${A \ll X}$, it will suffice to obtain a decomposition

$\displaystyle A = A_1 + \dots + A_k$

or at least an upper bound

$\displaystyle A \ll A_1 + \dots + A_k$

of ${A}$ by some bounded number of components ${A_1,\dots,A_k}$, and establish the bounds ${A_1 \ll X, \dots, A_k \ll X}$ separately. Typically the ${A_1,\dots,A_k}$ will be (morally at least) smaller than the original quantity ${A}$ (for instance, if ${A}$ is a sum of non-negative quantities, each of the ${A_i}$ might be a subsum of those same quantities), which means that such a decomposition is a “free move”, in the sense that it does not risk making the problem harder. (This is because, if the original bound ${A \ll X}$ is to be true, each of the new objectives ${A_1 \ll X, \dots, A_k \ll X}$ must also be true, and so the decomposition can only make the problem logically easier, not harder.) The only costs to such decomposition are that your proofs might be ${k}$ times longer, as you may be repeating the same arguments ${k}$ times, and that the implied constants in the ${A_1 \ll X, \dots, A_k \ll X}$ bounds may be worse than the implied constant in the original ${A \ll X}$ bound. However, in many cases these costs are well worth the benefits of being able to simplify the problem into smaller pieces. As mentioned above, once one successfully executes a divide and conquer strategy, one can go back and try to reduce the number of decompositions, for instance by unifying components that are treated by similar methods, or by replacing strong but unwieldy estimates with weaker, but more convenient estimates.

The above divide and conquer strategy does not directly apply when one is decomposing into an unbounded number of pieces ${A_j}$, ${j=1,2,\dots}$. In such cases, one needs an additional gain in the index ${j}$ that is summable in ${j}$ in order to conclude. For instance, if one wants to establish a bound of the form ${A \ll X}$, and one has located a decomposition or upper bound

$\displaystyle A \ll \sum_{j=1}^\infty A_j$

that looks promising for the problem, then it would suffice to obtain exponentially decaying bounds such as

$\displaystyle A_j \ll 2^{-cj} X$

for all ${j \geq 1}$ and some constant ${c>0}$, since this would imply

$\displaystyle A \ll \sum_{j=1}^\infty 2^{-cj} X \ll X \ \ \ \ \ (7)$

thanks to the geometric series formula. (Here it is important that the implied constants in the asymptotic notation are uniform in ${j}$; a ${j}$-dependent bound such as ${A_j \ll_j 2^{-cj} X}$ would be useless for this application, as then the growth of the implied constant in ${j}$ could overwhelm the exponential decay in the ${2^{-cj}}$ factor.) Exponential decay is in fact overkill; polynomial decay such as

$\displaystyle A_j \ll \frac{X}{j^{1+c}}$

would already be sufficient, although harmonic decay such as

$\displaystyle A_j \ll \frac{X}{j} \ \ \ \ \ (8)$

is not quite enough (the sum ${\sum_{j=1}^\infty \frac{1}{j}}$ diverges logarithmically), although in many such situations one could try to still salvage the bound by working a lot harder to squeeze some additional logarithmic factors out of one’s estimates. For instance, it suffices to improve (8) to

$\displaystyle A_j \ll \frac{X}{j \log^{1+c} j}$

for all ${j \geq 2}$ and some constant ${c>0}$, since (by the integral test) the sum ${\sum_{j=2}^\infty \frac{1}{j\log^{1+c} j}}$ converges (and one can treat the ${j=1}$ term separately if one already has (8)).

Sometimes, when trying to prove an estimate such as ${A \ll X}$, one has identified a promising decomposition with an unbounded number of terms

$\displaystyle A \ll \sum_{j=1}^J A_j$

(where ${J}$ is finite but unbounded) but is unsure of how to proceed next. Often the next thing to do is to study the extreme terms ${A_1}$ and ${A_J}$ of this decomposition, and first try to establish the (presumably simpler) tasks of showing that ${A_1 \ll X}$ and ${A_J \ll X}$. Often once one does so, it becomes clear how to combine the treatments of the two extreme cases to also treat the intermediate cases, obtaining a bound ${A_j \ll X}$ for each individual term, leading to the inferior bound ${A \ll JX}$; this can then be used as a starting point to hunt for additional gains, such as the exponential or polynomial gains mentioned previously, that could be used to remove this loss of ${J}$. (There are more advanced techniques, such as those based on controlling moments such as the square function ${(\sum_{j=1}^J |A_j|^2)^{1/2}}$, or trying to understand the precise circumstances in which a “large values” scenario ${|A_j| \gg X}$ occurs, and how these scenarios interact with each other for different ${j}$, but these are beyond the scope of this post, as they are rarely needed when dealing with sums or integrals of elementary functions.)

— 1.1. Psychological distinctions between exact and asymptotic arithmetic —

The adoption of the “divide and conquer” strategy requires a certain mental shift from the “simplify, simplify” strategy that one is taught in high school algebra. In the latter strategy, one tries to collect terms in an expression to make them as short as possible, for instance by working with a common denominator, with the idea that unified and elegant-looking expressions are “simpler” than sprawling expressions with many terms. In contrast, the divide and conquer strategy is intentionally extremely willing to greatly increase the total length of the expressions to be estimated, so long as each individual component of the expressions appears easier to estimate than the original one. Both strategies are still trying to reduce the original problem to a simpler problem (or collection of simpler sub-problems), but the metric by which one judges whether the problem has become simpler is rather different.

A related mental shift that one needs to adopt in analysis is to move away from the exact identities that are so prized in algebra (and in undergraduate calculus), as the precision they offer is often unnecessary and distracting for the task at hand, and they often fail to generalize to more complicated contexts in which exact identities are no longer available. As a simple example, consider the task of estimating the expression

$\displaystyle \int_0^a \frac{dx}{1+x^2}$

where ${a > 0}$ is a parameter. With a trigonometric substitution, one can evaluate this expression exactly as ${\mathrm{arctan}(a)}$; however, the presence of the arctangent can be inconvenient if one has to do further estimation tasks (for instance, if ${a}$ depends in a complicated fashion on other parameters, which one then also wants to sum or integrate over). Instead, by observing the trivial bounds

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \int_0^a\ dx = a$

and

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \int_0^\infty \frac{dx}{1+x^2} = \frac{\pi}{2}$

one can combine them using (2) to obtain the upper bound

$\displaystyle \int_0^a \frac{dx}{1+x^2} \leq \min( a, \frac{\pi}{2} ) \asymp \min(a,1)$

and similar arguments also give the matching lower bound, thus

$\displaystyle \int_0^a \frac{dx}{1+x^2} \asymp \min(a,1). \ \ \ \ \ (9)$

This bound, while cruder than the exact answer of ${\mathrm{arctan}(a)}$, is often good enough for many applications (particularly in situations where one is willing to concede constants in the bounds), and can be more tractable to work with than the exact answer. Furthermore, these arguments can be adapted without difficulty to treat similar expressions such as

$\displaystyle \int_0^a \frac{dx}{1+x^4}$

for which the exact closed form expression in terms of elementary functions is considerably messier than the arctangent.
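To see the estimate (9) in action, here is a minimal numerical sketch comparing the exact value ${\mathrm{arctan}(a)}$ with the order-of-magnitude model ${\min(a,1)}$ over many scales of ${a}$. The helper names and the sample grid are illustrative assumptions; the point is only that the ratio stays between two absolute constants (here ${\pi/4}$ and ${\pi/2}$).

```python
import math


def exact_value(a: float) -> float:
    """Exact value arctan(a) of the integral of 1/(1+x^2) over [0, a]."""
    return math.atan(a)


def crude_bound(a: float) -> float:
    """Order-of-magnitude model min(a, 1) from (9)."""
    return min(a, 1.0)


samples = [10.0**k for k in range(-6, 7)]  # a ranging over thirteen decades
ratios = [exact_value(a) / crude_bound(a) for a in samples]

for a, r in zip(samples, ratios):
    print(f"a = {a:10.0e}   arctan(a) / min(a, 1) = {r:.4f}")
```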

As a general rule, instead of relying exclusively on exact formulae, one should seek approximations that are valid up to the degree of precision that one seeks in the final estimate. For instance, suppose one wishes to establish the bound

$\displaystyle \sec(x) - \cos(x) = x^2 + O(x^3)$

for all sufficiently small ${x}$. If one was clinging to the exact identity mindset, one could try to look for some trigonometric identity to simplify the left-hand side exactly, but the quicker (and more robust) way to proceed is just to use Taylor expansion up to the specified accuracy ${O(x^3)}$ to obtain

$\displaystyle \cos(x) = 1 - \frac{x^2}{2} + O(x^3)$

which one can invert using the geometric series formula ${(1-y)^{-1} = 1 + y + y^2 + \dots}$ to obtain

$\displaystyle \sec(x) = 1 + \frac{x^2}{2} + O(x^3)$

from which the claim follows. (One could also have computed the Taylor expansion of ${\sec(x)}$ directly, but as this is a series that is usually not memorized, this can take a little bit more time than just computing it to the required accuracy as above.) Note that the notion of “specified accuracy” may have to be interpreted in a relative sense if one is planning to multiply or divide several estimates together. For instance, if one wishes to establish the bound

$\displaystyle \sin(x) \cos(x) = x + O(x^3)$

for small ${x}$, one needs an approximation

$\displaystyle \sin(x) = x + O(x^3)$

to the sine function that is accurate to order ${O(x^3)}$, but one only needs an approximation

$\displaystyle \cos(x) = 1 + O(x^2)$

to the cosine function that is accurate to order ${O(x^2)}$, because the cosine is to be multiplied by ${\sin(x)= O(x)}$. Here the key is to obtain estimates that have a relative error of ${O(x^2)}$, compared to the main term (which is ${1}$ for cosine, and ${x}$ for sine).
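The two Taylor-type bounds above are easy to sanity-check numerically. The following sketch (with hypothetical helper names, and sample points chosen arbitrarily) divides each error by ${x^3}$ and confirms that the quotient stays bounded as ${x}$ shrinks; it is a check of the statements, not a substitute for the expansion argument.

```python
import math


def sec_minus_cos_error(x: float) -> float:
    """sec(x) - cos(x) - x^2, claimed to be O(x^3) for small x."""
    return 1.0 / math.cos(x) - math.cos(x) - x * x


def sin_cos_error(x: float) -> float:
    """sin(x) cos(x) - x, claimed to be O(x^3) for small x."""
    return math.sin(x) * math.cos(x) - x


for x in (0.2, 0.1, 0.05, 0.025):
    e1 = sec_minus_cos_error(x) / x**3
    e2 = sin_cos_error(x) / x**3
    print(f"x = {x:<6}  (sec - cos - x^2)/x^3 = {e1:+.5f}   (sin cos - x)/x^3 = {e2:+.5f}")
```

In this experiment the first quotient shrinks with ${x}$ (the true error is of order ${x^4}$, so ${O(x^3)}$ is not sharp), while the second approaches ${-2/3}$, consistent with ${\sin(x)\cos(x) = \frac{1}{2}\sin(2x)}$.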

On the other hand, some exact formulae are still very useful, particularly if the end result of that formula is clean and tractable to work with (as opposed to involving somewhat exotic functions such as the arctangent). The geometric series formula, for instance, is an extremely handy exact formula, so much so that it is often desirable to control summands by a geometric series purely to use this formula (we already saw an example of this in (7)). Exact integral identities, such as

$\displaystyle \frac{1}{a} = \int_0^\infty e^{-at}\ dt$

or more generally

$\displaystyle \frac{\Gamma(s)}{a^s} = \int_0^\infty e^{-at} t^{s-1}\ dt$

for ${a,s>0}$ (where ${\Gamma}$ is the Gamma function) are also quite commonly used, and fundamental exact integration rules such as the change of variables formula, the Fubini-Tonelli theorem or integration by parts are all essential tools for an analyst trying to prove estimates. Because of this, it is often desirable to estimate a sum by an integral. The integral test is a classic example of this principle in action: a more quantitative version of this test is the bound

$\displaystyle \int_{a}^{b+1} f(t)\ dt \leq \sum_{n=a}^b f(n) \leq \int_{a-1}^b f(t)\ dt \ \ \ \ \ (10)$

whenever ${a \leq b}$ are integers and ${f: [a-1,b+1] \rightarrow {\bf R}}$ is monotone decreasing, or the closely related bound

$\displaystyle \sum_{a \leq n \leq b} f(n) = \int_a^b f(t)\ dt + O( |f(a)| + |f(b)| ) \ \ \ \ \ (11)$

whenever ${a \leq b}$ are reals and ${f: [a,b] \rightarrow {\bf R}}$ is monotone (either increasing or decreasing); see Lemma 2 of this previous post. Such bounds allow one to switch back and forth quite easily between sums and integrals as long as the summand or integrand behaves in a mostly monotone fashion (for instance, if it is monotone increasing on one portion of the domain and monotone decreasing on the other). For more precision, one could turn to more advanced relationships between sums and integrals, such as the Euler-Maclaurin formula or the Poisson summation formula, but these are beyond the scope of this post.

Exercise 1 Suppose ${f: {\bf R} \rightarrow {\bf R}^+}$ obeys the quasi-monotonicity property ${f(x) \ll f(y)}$ whenever ${y-1 \leq x \leq y}$. Show that ${\int_a^{b-1} f(t)\ dt \ll \sum_{n=a}^b f(n) \ll \int_a^{b+1} f(t)\ dt}$ for any integers ${a < b}$.

Exercise 2 Use (11) to obtain the “cheap Stirling approximation”

$\displaystyle n! = \exp( n \log n - n + O(\log n) )$

for any natural number ${n \geq 2}$. (Hint: take logarithms to convert the product ${n! = 1 \times 2 \times \dots \times n}$ into a sum.)
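Before attempting Exercise 2, it can be reassuring to check the claimed approximation numerically. The sketch below (an illustration only, with a hypothetical helper name) uses the standard library function `math.lgamma`, which satisfies `lgamma(n + 1) = log(n!)`, and divides the discrepancy by ${\log n}$; it does not indicate how to prove the bound.

```python
import math


def stirling_log_error(n: int) -> float:
    """log(n!) - (n log n - n), which Exercise 2 asserts is O(log n)."""
    return math.lgamma(n + 1) - (n * math.log(n) - n)


for n in (2, 10, 100, 10**4, 10**6):
    ratio = stirling_log_error(n) / math.log(n)
    print(f"n = {n:>8}   error / log n = {ratio:.4f}")
```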

With practice, you will be able to identify any term in a computation which is already “negligible” or “acceptable” in the sense that its contribution is always going to lead to an error that is smaller than the desired accuracy of the final estimate. One can then work “modulo” these negligible terms and discard them as soon as they appear. This can help remove a lot of clutter in one’s arguments. For instance, if one wishes to establish an asymptotic of the form

$\displaystyle A = X + O(Y)$

for some main term ${X}$ and lower order error ${O(Y)}$, any component of ${A}$ that one can already identify to be of size ${O(Y)}$ is negligible and can be removed from ${A}$ “for free”. Conversely, it can be useful to add negligible terms to an expression, if it makes the expression easier to work with. For instance, suppose one wants to estimate the expression

$\displaystyle \sum_{n=1}^N \frac{1}{n^2}. \ \ \ \ \ (12)$

This is a partial sum for the zeta function

$\displaystyle \sum_{n=1}^\infty \frac{1}{n^2} = \zeta(2) = \frac{\pi^2}{6}$

so it can make sense to add and subtract the tail ${\sum_{n=N+1}^\infty \frac{1}{n^2}}$ to the expression (12) to rewrite it as

$\displaystyle \frac{\pi^2}{6} - \sum_{n=N+1}^\infty \frac{1}{n^2}.$

To deal with the tail, we switch from a sum to the integral using (10) to bound

$\displaystyle \sum_{n=N+1}^\infty \frac{1}{n^2} \ll \int_N^\infty \frac{1}{t^2}\ dt = \frac{1}{N}$

giving us the reasonably accurate bound

$\displaystyle \sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} - O(\frac{1}{N}).$

One can sharpen this approximation somewhat using (11) or the Euler-Maclaurin formula; we leave this to the reader.
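The tail estimate for (12) can be checked directly. In this small sketch (the function name and the sample values of ${N}$ are illustrative), the quantity ${N (\frac{\pi^2}{6} - \sum_{n \leq N} \frac{1}{n^2})}$ is computed; by (10) the tail lies between ${\int_{N+1}^\infty t^{-2}\ dt = \frac{1}{N+1}}$ and ${\int_N^\infty t^{-2}\ dt = \frac{1}{N}}$, so the scaled value should stay between ${N/(N+1)}$ and ${1}$.

```python
import math


def scaled_tail(N: int) -> float:
    """N * (pi^2/6 - sum_{n <= N} 1/n^2); the O(1/N) bound says this is O(1)."""
    partial = math.fsum(1.0 / (n * n) for n in range(1, N + 1))
    return N * (math.pi**2 / 6 - partial)


for N in (10, 100, 1000, 10_000):
    print(f"N = {N:>6}   N * tail = {scaled_tail(N):.6f}")
```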

Another psychological shift when switching from algebraic simplification problems to estimation problems is that one has to be prepared to let go of constraints in an expression that complicate the analysis. Suppose for instance we now wish to estimate the variant

$\displaystyle \sum_{1 \leq n \leq N, \hbox{ square-free}} \frac{1}{n^2}$

of (12), where we are now restricting ${n}$ to be square-free. An identity from analytic number theory (the Euler product identity) lets us calculate the exact sum

$\displaystyle \sum_{n \geq 1, \hbox{ square-free}} \frac{1}{n^2} = \frac{\zeta(2)}{\zeta(4)} = \frac{15}{\pi^2}$

so as before we can write the desired expression as

$\displaystyle \frac{15}{\pi^2} - \sum_{n > N, \hbox{ square-free}} \frac{1}{n^2}.$

Previously, we applied the integral test (10), but this time we cannot do so, because the restriction to square-free integers destroys the monotonicity. But we can simply remove this restriction:

$\displaystyle \sum_{n > N, \hbox{ square-free}} \frac{1}{n^2} \leq \sum_{n > N} \frac{1}{n^2}.$

Heuristically at least, this move only “costs us a constant”, since a positive fraction (${1/\zeta(2)= 6/\pi^2}$, in fact) of all integers are square-free. Now that this constraint has been removed, we can use the integral test as before and obtain the reasonably accurate asymptotic

$\displaystyle \sum_{1 \leq n \leq N, \hbox{ square-free}} \frac{1}{n^2} = \frac{15}{\pi^2} + O(\frac{1}{N}).$
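The square-free asymptotic can be tested with a short sieve. This is a sketch under illustrative assumptions (the helper names and the simple sieve are not from the text): it computes ${N}$ times the gap between ${15/\pi^2}$ and the restricted partial sum, which the argument above shows lies strictly between ${0}$ and ${1}$, and which the density heuristic suggests should be roughly ${6/\pi^2 \approx 0.61}$ for large ${N}$.

```python
import math


def squarefree_flags(N: int) -> list:
    """flags[n] is True exactly when n (1 <= n <= N) is square-free."""
    flags = [True] * (N + 1)
    d = 2
    while d * d <= N:
        # Strike out every multiple of the square d^2.
        for m in range(d * d, N + 1, d * d):
            flags[m] = False
        d += 1
    return flags


def scaled_squarefree_tail(N: int) -> float:
    """N * (15/pi^2 - sum of 1/n^2 over square-free n <= N)."""
    flags = squarefree_flags(N)
    partial = math.fsum(1.0 / (n * n) for n in range(1, N + 1) if flags[n])
    return N * (15.0 / math.pi**2 - partial)


for N in (100, 1000, 10_000):
    print(f"N = {N:>6}   N * tail = {scaled_squarefree_tail(N):.4f}")
```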

— 2. More on decomposition —

The way in which one decomposes a sum or integral such as ${\sum_{n \in A} f(n)}$ or ${\int_A f(x)\ dx}$ is often guided by the “geometry” of ${f}$, and in particular where ${f}$ is large or small (or whether various component terms in ${f}$ are large or small relative to each other). For instance, if ${f(x)}$ comes close to a maximum at some point ${x=x_0}$, then it may make sense to decompose based on the distance ${|x-x_0|}$ to ${x_0}$, or perhaps to treat the cases ${x \leq x_0}$ and ${x>x_0}$ separately. (Note that ${x_0}$ does not literally have to be the maximum in order for this to be a reasonable decomposition; if it is “within reasonable distance” of the maximum, this could still be a good move. As such, it is often not worthwhile to try to compute the maximum of ${f}$ exactly, especially if this exact formula ends up being too complicated to be useful.)

If an expression involves a distance ${|X-Y|}$ between two quantities ${X,Y}$, it is sometimes useful to split into the case ${|X| \leq |Y|/2}$ where ${X}$ is much smaller than ${Y}$ (so that ${|X-Y| \asymp |Y|}$), the case ${|Y| \leq |X|/2}$ where ${Y}$ is much smaller than ${X}$ (so that ${|X-Y| \asymp |X|}$), or the case when neither of the two previous cases apply (so that ${|X| \asymp |Y|}$). The factors of ${2}$ here are not of critical importance; the point is that in each of these three cases, one has some hope of simplifying the expression into something more tractable. For instance, suppose one wants to estimate the expression

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2) (1+(x-b)^2)} \ \ \ \ \ (13)$

in terms of the two real parameters ${a, b}$, which we will take to be distinct for sake of this discussion. This particular integral is simple enough that it can be evaluated exactly (for instance using contour integration techniques), but in the spirit of Principle 1, let us avoid doing so and instead try to decompose this expression into simpler pieces. A graph of the integrand reveals that it peaks when ${x}$ is near ${a}$ or near ${b}$. Inspired by this, one can decompose the region of integration into three pieces:

• (i) The region where ${|x-a| \leq \frac{|a-b|}{2}}$.
• (ii) The region where ${|x-b| \leq \frac{|a-b|}{2}}$.
• (iii) The region where ${|x-a|, |x-b| > \frac{|a-b|}{2}}$.

(This is not the only way to cut up the integral, but it will suffice. Often there is no “canonical” or “elegant” way to perform the decomposition; one should just try to find a decomposition that is convenient for the problem at hand.)

The reason why we want to perform such a decomposition is that in each of the three cases, one can simplify how the integrand depends on ${x}$. For instance, in region (i), we see from the triangle inequality that ${|x-b|}$ is now comparable to ${|a-b|}$, so that this contribution to (13) is comparable to

$\displaystyle \asymp \int_{|x-a| \leq |a-b|/2} \frac{dx}{(1+(x-a)^2) (1+(a-b)^2)}.$

Using a variant of (9), this expression is comparable to

$\displaystyle \asymp \min( 1, |a-b|/2) \frac{1}{1+(a-b)^2} \asymp \frac{\min(1, |a-b|)}{1+(a-b)^2}. \ \ \ \ \ (14)$

The contribution of region (ii) can be handled similarly, and is also comparable to (14). Finally, in region (iii), we see from the triangle inequality that ${|x-a|, |x-b|}$ are now comparable to each other, and so the contribution of this region is comparable to

$\displaystyle \asymp \int_{|x-a|, |x-b| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2}.$

Now that we have centered the integral around ${x=a}$, we will discard the ${|x-b| > |a-b|/2}$ constraint, upper bounding this integral by

$\displaystyle \int_{|x-a| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2}.$

On the one hand this integral is bounded by

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2)^2} = \int_{-\infty}^\infty \frac{dx}{(1+x^2)^2} \asymp 1$

and on the other hand we can bound

$\displaystyle \int_{|x-a| > |a-b|/2} \frac{dx}{(1+(x-a)^2)^2} \leq \int_{|x-a| > |a-b|/2} \frac{dx}{(x-a)^4} \asymp |a-b|^{-3}$

and so we can bound the contribution of (iii) by ${O( \min( 1, |a-b|^{-3} ))}$. Putting all this together, and dividing into the cases ${|a-b| \leq 1}$ and ${|a-b| > 1}$, one can soon obtain a total bound of ${O(\min( 1, |a-b|^{-2}))}$ for the entire integral: when ${|a-b| > 1}$, the contributions (14) of regions (i) and (ii) are of size ${|a-b|^{-2}}$ and dominate the ${O(|a-b|^{-3})}$ contribution of region (iii), while for ${|a-b| \leq 1}$ all three contributions are ${O(1)}$. One can also adapt this argument to show that this bound is sharp up to constants, thus

$\displaystyle \int_{-\infty}^\infty \frac{dx}{(1+(x-a)^2) (1+(x-b)^2)} \asymp \min( 1, |a-b|^{-2}) \asymp \frac{1}{1+|a-b|^2}.$
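As a cross-check on the order of magnitude of (13), one can compare a crude numerical quadrature with the exact value ${\frac{2\pi}{4+(a-b)^2}}$ (which follows, for instance, from the fact that the convolution of two Cauchy densities is again a Cauchy density) and with the model ${\frac{1}{1+|a-b|^2}}$. The sketch below is only illustrative: the function names, the truncation `margin`, and the number of `steps` are assumptions made for this check, and the midpoint rule is used purely for simplicity.

```python
import math


def integrand(x: float, a: float, b: float) -> float:
    """Integrand of (13)."""
    return 1.0 / ((1.0 + (x - a) ** 2) * (1.0 + (x - b) ** 2))


def integral_13(a: float, b: float, margin: float = 200.0, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of (13) on [min(a,b) - margin, max(a,b) + margin]."""
    lo = min(a, b) - margin
    hi = max(a, b) + margin
    h = (hi - lo) / steps
    return h * math.fsum(integrand(lo + (i + 0.5) * h, a, b) for i in range(steps))


def closed_form(a: float, b: float) -> float:
    """Exact value 2*pi / (4 + (a-b)^2) of (13)."""
    return 2.0 * math.pi / (4.0 + (a - b) ** 2)


for d in (0.0, 0.5, 1.0, 10.0, 100.0):
    numeric = integral_13(0.0, d)
    model = 1.0 / (1.0 + d * d)
    print(f"|a-b| = {d:>6}   numeric = {numeric:.6e}   numeric / model = {numeric / model:.4f}")
```

The ratio to the model stays between ${\pi/2}$ and ${2\pi}$ across all separations, which is what comparability up to constants means in practice.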

A powerful and common type of decomposition is dyadic decomposition. If the summand or integrand involves some quantity ${Q}$ in a key way, it is often useful to break up into dyadic regions such as ${2^{j-1} \leq Q < 2^{j}}$, so that ${Q \sim 2^j}$, and then sum over ${j}$. (One can tweak the dyadic range ${2^{j-1} \leq Q < 2^{j}}$ here with minor variants such as ${2^{j} < Q \leq 2^{j+1}}$, or replace the base ${2}$ by some other base, but these modifications mostly have a minor aesthetic impact on the arguments at best.) For instance, one could break up a sum

$\displaystyle \sum_{n=1}^{\infty} f(n) \ \ \ \ \ (15)$

into dyadic pieces

$\displaystyle \sum_{j=1}^\infty \sum_{2^{j-1} \leq n < 2^{j}} f(n)$

and then seek to estimate each dyadic block ${\sum_{2^{j-1} \leq n < 2^{j}} f(n)}$ separately (hoping to get some exponential or polynomial decay in ${j}$). The classical technique of Cauchy condensation is a basic example of this strategy. But one can also dyadically decompose other quantities than ${n}$. For instance one can perform a “vertical” dyadic decomposition (in contrast to the “horizontal” one just performed) by rewriting (15) as

$\displaystyle \sum_{k \in {\bf Z}} \sum_{n \geq 1: 2^{k-1} \leq f(n) < 2^k} f(n);$

since the summand ${f(n)}$ is ${\asymp 2^k}$, we may simplify this to

$\displaystyle \asymp \sum_{k \in {\bf Z}} 2^k \# \{ n \geq 1: 2^{k-1} \leq f(n) < 2^k\}.$

This now converts the problem of estimating the sum (15) to the more combinatorial problem of estimating the size of the dyadic level sets ${\{ n \geq 1: 2^{k-1} \leq f(n) < 2^k\}}$ for various ${k}$. In a similar spirit, we have

$\displaystyle \int_A f(x)\ dx \asymp \sum_{k \in {\bf Z}} 2^k | \{ x \in A: 2^{k-1} \leq f(x) < 2^k \}|$

where ${|E|}$ denotes the Lebesgue measure of a set ${E}$, and now we are faced with a geometric problem of estimating the measure of some explicit set. This allows one to use geometric intuition to solve the problem, instead of multivariable calculus:

Exercise 3 Let ${S}$ be a smooth compact submanifold of ${{\bf R}^d}$. Establish the bound

$\displaystyle \int_{B(0,C)} \frac{dx}{\varepsilon^2 + \mathrm{dist}(x,S)^2} \ll \varepsilon^{-1}$

for all ${0 < \varepsilon < C}$, where the implied constants are allowed to depend on ${C, d, S}$. (This can be accomplished either by a vertical dyadic decomposition, or a dyadic decomposition of the quantity ${\mathrm{dist}(x,S)}$.)

Exercise 4 Solve problem (ii) from the introduction to this post by dyadically decomposing in the ${d}$ variable.
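The vertical dyadic decomposition of a sum described above can be illustrated in a few lines. In this sketch (the function name and the test summand ${f(n) = n^{-3/2}}$ are illustrative assumptions), each positive value ${v}$ is assigned to the level ${k}$ with ${2^{k-1} \leq v < 2^k}$ and then replaced by ${2^k}$; since ${v < 2^k \leq 2v}$, the resulting proxy lies between the true sum and twice the true sum.

```python
import math
from collections import Counter


def vertical_dyadic_sum(values) -> float:
    """Sum over k of 2^k * #{v : 2^(k-1) <= v < 2^k}, for positive values v."""
    # math.frexp(v) returns (m, k) with v = m * 2^k and 0.5 <= m < 1,
    # which is exactly the condition 2^(k-1) <= v < 2^k.
    level_counts = Counter(math.frexp(v)[1] for v in values if v > 0)
    return sum(math.ldexp(1.0, k) * count for k, count in level_counts.items())


values = [1.0 / n**1.5 for n in range(1, 10_001)]  # illustrative summand f(n) = n^(-3/2)
exact = math.fsum(values)
dyadic = vertical_dyadic_sum(values)

print(f"exact sum = {exact:.6f}   dyadic proxy = {dyadic:.6f}   ratio = {dyadic / exact:.4f}")
```

Only the sizes of the level sets enter the proxy, which is why the estimation problem becomes a counting (or measure) problem.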

Remark 5 By such tools as (10), (11), or Exercise 1, one could convert the dyadic sums one obtains from dyadic decomposition into integral variants. However, if one wished, one could “cut out the middle-man” and work with continuous dyadic decompositions rather than discrete ones. Indeed, from the integral identity

$\displaystyle \int_0^\infty 1_{\lambda < Q \leq 2\lambda} \frac{d\lambda}{\lambda} = \log 2$

for any ${Q>0}$, together with the Fubini-Tonelli theorem, we obtain the continuous dyadic decomposition

$\displaystyle \sum_{n \in A} f(n) = \frac{1}{\log 2} \int_0^\infty \sum_{n \in A: \lambda \leq Q(n) < 2\lambda} f(n) \frac{d\lambda}{\lambda}$

for any quantity ${Q(n)}$ that is positive whenever ${f(n)}$ is positive. Similarly if we work with integrals ${\int_A f(x)\ dx}$ rather than sums. This version of dyadic decomposition is occasionally a little more convenient to work with, particularly if one then wants to perform various changes of variables in the ${\lambda}$ parameter which would be tricky to execute if this were a discrete variable.

— 3. Exponential weights —

Many sums involve expressions that are “exponentially large” or “exponentially small” in some parameter. A basic rule of thumb is that any quantity that is “exponentially small” will likely give a negligible contribution when compared against quantities that are not exponentially small. For instance, if an expression involves a term of the form ${e^{-Q}}$ for some non-negative quantity ${Q}$, which is bounded on at least one portion of the domain of summation or integration, then one expects the region where ${Q}$ is bounded to provide the dominant contribution. For instance, if one wishes to estimate the integral

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x}$

for some ${0 < \varepsilon < 1/2}$, this heuristic suggests that the dominant contribution should come from the region ${x = O(1/\varepsilon)}$, in which one can bound ${e^{-\varepsilon x}}$ simply by ${1}$ and obtain an upper bound of

$\displaystyle \ll \int_{x = O(1/\varepsilon)} \frac{dx}{1+x} \ll \log \frac{1}{\varepsilon}.$

To make such a heuristic precise, one can perform a dyadic decomposition in the exponential weight ${e^{-\varepsilon x}}$, or equivalently perform an additive decomposition in the exponent ${\varepsilon x}$, for instance writing

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x} = \sum_{j=1}^\infty \int_{j-1 \leq \varepsilon x < j} e^{-\varepsilon x} \frac{dx}{1+x}.$

Exercise 6 Use this decomposition to rigorously establish the bound

$\displaystyle \int_0^\infty e^{-\varepsilon x} \frac{dx}{1+x} \ll \log \frac{1}{\varepsilon}$

for any ${0 < \varepsilon < 1/2}$.
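The additive decomposition in the exponent can be explored numerically before attempting the exercise. In the sketch below (function names and sample values of ${\varepsilon}$ are my own assumptions), the piece ${j-1 \leq \varepsilon x < j}$ is bounded by replacing ${e^{-\varepsilon x}}$ with ${e^{-(j-1)}}$ and integrating ${\frac{dx}{1+x}}$ exactly, and the result is compared against a direct midpoint-rule evaluation after the substitution ${x = e^u - 1}$.

```python
import math

def decomposition_upper_bound(eps, terms=60):
    """Upper bound for int_0^infty e^{-eps x} dx/(1+x) from the pieces j-1 <= eps*x < j.

    On the j-th piece e^{-eps x} <= e^{-(j-1)}, and the integral of dx/(1+x)
    over that piece equals log((eps + j) / (eps + j - 1)).
    """
    total = 0.0
    for j in range(1, terms + 1):
        total += math.exp(-(j - 1)) * math.log((eps + j) / (eps + j - 1))
    # Pieces with j > terms: each logarithm is at most 1, so the omitted part
    # is at most a geometric series starting at e^{-terms}.
    total += math.exp(-terms) / (1.0 - math.exp(-1.0))
    return total

def integral_numeric(eps, steps=20000):
    """Midpoint rule after x = e^u - 1, which turns dx/(1+x) into du."""
    upper = math.log(1.0 + 50.0 / eps)  # truncate where eps*x = 50
    du = upper / steps
    return sum(math.exp(-eps * math.expm1((i + 0.5) * du)) for i in range(steps)) * du

for eps in (0.4, 0.1, 1e-2, 1e-4):
    print(eps, integral_numeric(eps), decomposition_upper_bound(eps), math.log(1.0 / eps))
```

For small ${\varepsilon}$ the first piece ${j=1}$ contributes ${\log(1 + 1/\varepsilon)}$ and the remaining pieces contribute ${O(1)}$, which is the pattern the exercise asks one to justify.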

Exercise 7 Solve problem (i) from the introduction to this post.

More generally, if one is working with a sum or integral such as

$\displaystyle \sum_{n \in A} e^{\phi(n)} \psi(n)$

or

$\displaystyle \int_A e^{\phi(x)} \psi(x)\ dx$

with some exponential weight ${e^\phi}$ and a lower order amplitude ${\psi}$, then one typically expects the dominant contribution to come from the region where ${\phi}$ comes close to attaining its maximal value. If this maximum is attained on the boundary, then one typically has geometric series behavior away from the boundary, and one can often get a good estimate by establishing a geometric series type bound. For instance, suppose one wants to estimate the error function

$\displaystyle \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\ dt$

for ${z \geq 1}$. In view of the complete integral

$\displaystyle \int_0^\infty e^{-t^2}\ dt = \frac{\sqrt{\pi}}{2}$

we can rewrite this as

$\displaystyle \mathrm{erf}(z) = 1 - \frac{2}{\sqrt{\pi}} \int_z^\infty e^{-t^2}\ dt.$

The exponential weight ${e^{-t^2}}$ attains its maximum at the left endpoint ${t=z}$ and decays quickly away from that endpoint. One could estimate this by dyadic decomposition of ${e^{-t^2}}$ as discussed previously, but a slicker way to proceed here is to use the convexity of ${t^2}$ to obtain a geometric series upper bound

$\displaystyle e^{-t^2} \leq e^{-z^2 - 2 z (t-z)}$

for ${t \geq z}$, which on integration gives

$\displaystyle \int_z^\infty e^{-t^2}\ dt \leq \int_z^\infty e^{-z^2 - 2 z (t-z)}\ dt = \frac{e^{-z^2}}{2z}$

giving the asymptotic

$\displaystyle \mathrm{erf}(z) = 1 - O( \frac{e^{-z^2}}{z})$

for ${z \geq 1}$.
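The convexity bound on the Gaussian tail is easy to compare against a library value. The sketch below uses the standard library function `math.erfc`, with ${\int_z^\infty e^{-t^2}\ dt = \frac{\sqrt{\pi}}{2} \mathrm{erfc}(z)}$; the helper names and the sample values of ${z}$ are assumptions chosen for illustration.

```python
import math

def gaussian_tail(z):
    """int_z^infty e^{-t^2} dt, expressed through the complementary error function."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(z)

def convexity_bound(z):
    """The geometric-series upper bound e^{-z^2} / (2z) obtained from convexity of t^2."""
    return math.exp(-z * z) / (2.0 * z)

for z in (1.0, 1.5, 2.0, 3.0, 5.0):
    tail, bound = gaussian_tail(z), convexity_bound(z)
    print(z, tail, bound, tail / bound)
```

The ratio in the last column stays below ${1}$, as the upper bound requires, and appears to stay bounded away from ${0}$, which is consistent with the matching lower bound asked for in the next exercise.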

Exercise 8 In the converse direction, establish the upper bound

$\displaystyle \mathrm{erf}(z) \leq 1 - c \frac{e^{-z^2}}{z}$

for some absolute constant ${c>0}$ and all ${z \geq 1}$.

Exercise 9 If ${\theta n \leq m \leq n}$ for some ${1/2 < \theta < 1}$, show that

$\displaystyle \sum_{k=m}^n \binom{n}{k} \ll \frac{1}{2\theta-1} \binom{n}{m}.$

(Hint: estimate the ratio between consecutive binomial coefficients ${\binom{n}{k}}$ and then control the sum by a geometric series.)
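The bound in Exercise 9 can be tested with exact integer arithmetic. The sketch below is a numerical check, not a proof. It assumes the explicit constant ${\frac{\theta}{2\theta-1}}$, which is what the ratio estimate ${\binom{n}{k+1}/\binom{n}{k} = \frac{n-k}{k+1} \leq \frac{1-\theta}{\theta}}$ for ${k \geq \theta n}$ suggests; the function names and the test parameters ${n=200}$, ${\theta = 3/5}$ are hypothetical.

```python
import math
from fractions import Fraction

def binomial_tail(n, m):
    """Exact value of sum_{k=m}^{n} C(n, k)."""
    return sum(math.comb(n, k) for k in range(m, n + 1))

def geometric_series_bound(n, m, theta):
    """theta / (2 theta - 1) * C(n, m), the sum of the dominating geometric series.

    theta should be a Fraction (or int) so that the comparison is exact.
    """
    theta = Fraction(theta)
    return theta / (2 * theta - 1) * math.comb(n, m)

n, theta = 200, Fraction(3, 5)
for m in range(math.ceil(theta * n), n + 1, 20):
    tail = binomial_tail(n, m)
    bound = geometric_series_bound(n, m, theta)
    print(m, float(Fraction(tail) / bound))
```

Each printed ratio should be at most ${1}$; since ${\theta < 1}$, this is slightly stronger than the stated bound with ${\frac{1}{2\theta-1}}$.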

When the maximum of the exponent ${\phi}$ occurs in the interior of the region of summation or integration, then one can get good results by some version of Laplace’s method. Consider, for instance, an integral of the form

$\displaystyle \int_a^b e^{\phi(x)} \psi(x)\ dx$

where ${\phi}$ attains a non-degenerate global maximum at some interior point ${x = x_0}$. The rule of thumb here is that

$\displaystyle \int_a^b e^{\phi(x)} \psi(x)\ dx \approx \sqrt{\frac{2\pi}{|\phi''(x_0)|}} e^{\phi(x_0)} \psi(x_0).$

The heuristic justification is as follows. The main contribution should be when ${x}$ is close to ${x_0}$. Here we can perform a Taylor expansion

$\displaystyle \phi(x) \approx \phi(x_0) - \frac{1}{2} |\phi''(x_0)| (x-x_0)^2$

since at a non-degenerate maximum we have ${\phi'(x_0)=0}$ and ${\phi''(x_0) < 0}$. Also, if ${\psi}$ is continuous, then ${\psi(x) \approx \psi(x_0)}$ when ${x}$ is close to ${x_0}$. Thus we should be able to estimate the above integral by the gaussian integral

$\displaystyle \int_{\bf R} e^{\phi(x_0) - \frac{1}{2} |\phi''(x_0)| (x-x_0)^2} \psi(x_0)\ dx$

which can be computed to equal ${\sqrt{\frac{2\pi}{|\phi''(x_0)|}} e^{\phi(x_0)} \psi(x_0)}$ as desired.
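The rule of thumb can be tried out on a concrete integral. The example below is entirely an assumption chosen for illustration: ${\phi(x) = N \sin x}$ on ${[0,\pi]}$ with ${N = 400}$, amplitude ${\psi(x) = \frac{1}{1+x}}$, and a hypothetical helper `laplace_ratio`. The maximum is at ${x_0 = \pi/2}$ with ${\phi''(x_0) = -N}$, and both sides are divided by ${e^{\phi(x_0)}}$ to avoid overflow.

```python
import math

def laplace_ratio(phi, psi, a, b, x0, phi2_at_x0, steps=20000):
    """Ratio of a midpoint-rule value of int_a^b e^{phi} psi dx to the Laplace prediction.

    Both quantities are normalised by e^{phi(x0)}, so only phi(x) - phi(x0) <= 0
    is ever exponentiated.
    """
    dx = (b - a) / steps
    integral = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        integral += math.exp(phi(x) - phi(x0)) * psi(x) * dx
    prediction = math.sqrt(2.0 * math.pi / abs(phi2_at_x0)) * psi(x0)
    return integral / prediction

# Hypothetical example: phi(x) = N sin x, psi(x) = 1/(1+x) on [0, pi].
N = 400.0
ratio = laplace_ratio(
    phi=lambda x: N * math.sin(x),
    psi=lambda x: 1.0 / (1.0 + x),
    a=0.0,
    b=math.pi,
    x0=math.pi / 2,
    phi2_at_x0=-N,
)
print(ratio)
```

The printed ratio should be close to ${1}$, with a discrepancy that shrinks as ${N}$ grows, in line with the heuristic.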

Let us illustrate how this argument can be made rigorous by considering the task of estimating the factorial ${n!}$ of a large number. In contrast to what we did in an earlier exercise, we will proceed using a version of Laplace’s method, relying on the integral representation

$\displaystyle n! = \Gamma(n+1) = \int_0^\infty x^n e^{-x}\ dx.$

As ${n}$ is large, we will consider ${x^n}$ to be part of the exponential weight rather than the amplitude, writing this expression as

$\displaystyle \int_0^\infty e^{-\phi(x)}\ dx$

where

$\displaystyle \phi(x) = x - n \log x.$

The function ${\phi}$ attains a global minimum at ${x_0 = n}$ (so that the weight ${e^{-\phi}}$ is maximized there), with ${\phi'(n) = 0}$ and ${\phi''(n) = 1/n}$. We will therefore decompose this integral into three pieces

$\displaystyle \int_0^{n-R} e^{-\phi(x)}\ dx + \int_{n-R}^{n+R} e^{-\phi(x)}\ dx + \int_{n+R}^\infty e^{-\phi(x)}\ dx \ \ \ \ \ (16)$

where ${0 < R < n}$ is a radius parameter which we will choose later, as it is not immediately obvious for now how to select it.

The main term is expected to be the middle term, so we shall use crude methods to bound the other two terms. For the first part where ${0 < x \leq n-R}$, ${\phi}$ is decreasing, so we can crudely bound ${e^{-\phi(x)} \leq e^{-\phi(n-R)}}$ and thus

$\displaystyle \int_0^{n-R} e^{-\phi(x)}\ dx \leq (n-R) e^{-\phi(n-R)} \leq n e^{-\phi(n-R)}.$

(We expect ${R}$ to be much smaller than ${n}$, so there is not much point to saving the tiny ${-R}$ term in the ${n-R}$ factor.) For the third part where ${x \geq n+R}$, ${\phi}$ is increasing, but bounding ${e^{-\phi(x)}}$ by ${e^{-\phi(n+R)}}$ would not work because of the unbounded nature of ${x}$; some additional decay is needed. Fortunately, we have a strict increase

$\displaystyle \phi'(x) = 1 - \frac{n}{x} \geq 1 - \frac{n}{n+R} = \frac{R}{n+R}$

for ${x \geq n+R}$, so by the mean value theorem we have

$\displaystyle \phi(x) \geq \phi(n+R) + \frac{R}{n+R} (x-n-R)$

and after a short calculation this gives

$\displaystyle \int_{n+R}^\infty e^{-\phi(x)}\ dx \leq \frac{n+R}{R} e^{-\phi(n+R)} \ll \frac{n}{R} e^{-\phi(n+R)}.$
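The two crude bounds on the outer pieces can be checked numerically. The sketch below makes several assumptions for the sake of illustration: ${n = 400}$, ${R = n^{0.6}}$, a truncation of the infinite range at ${10n}$, and hypothetical helper names. To avoid overflow, the integrand is normalised by ${e^{-\phi(n)}}$; that is, it works with ${e^{-(\phi(x) - \phi(n))}}$, which rescales both sides of each bound by the same factor.

```python
import math

def weight(x, n):
    """e^{-(phi(x) - phi(n))} for phi(x) = x - n log x."""
    return math.exp(-(x - n) + n * math.log(x / n))

def midpoint(f, a, b, steps=20000):
    """Plain midpoint rule for int_a^b f."""
    dx = (b - a) / steps
    return sum(f(a + (i + 0.5) * dx) for i in range(steps)) * dx

# Hypothetical parameters.
n = 400
R = n ** 0.6

# First piece: 0 < x <= n - R, bounded by (n - R) times the endpoint value.
left_piece = midpoint(lambda x: weight(x, n), 0.0, n - R)
left_bound = (n - R) * weight(n - R, n)

# Third piece: x >= n + R, truncated at 10 n (the remainder is far below float precision).
right_piece = midpoint(lambda x: weight(x, n), n + R, 10.0 * n)
right_bound = (n + R) / R * weight(n + R, n)

print(left_piece, left_bound)
print(right_piece, right_bound)
```

Both bounds are far from sharp, which is acceptable here because these pieces are only meant to be shown negligible compared to the middle term.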

Now we turn to the important middle term. If we assume ${R \leq n/2}$, then we will have ${\phi'''(x) = O( 1/n^2 )}$ in the region ${n-R \leq x \leq n+R}$, so by Taylor’s theorem with remainder

$\displaystyle \phi(x) = \phi(n) + \phi'(n) (x-n) + \frac{1}{2} \phi''(n) (x-n)^2 + O( \frac{|x-n|^3}{n^2} )$

$\displaystyle = \phi(n) + \frac{(x-n)^2}{2n} + O( \frac{R^3}{n^2} ).$

If we assume that ${R = O(n^{2/3})}$, then the error term is bounded and we can exponentiate to obtain

$\displaystyle e^{-\phi(x)} = (1 + O(\frac{R^3}{n^2})) e^{-\phi(n) - \frac{(x-n)^2}{2n}} \ \ \ \ \ (17)$

for ${n-R \leq x \leq n+R}$ and hence

$\displaystyle \int_{n-R}^{n+R} e^{-\phi(x)}\ dx = (1 + O(\frac{R^3}{n^2})) e^{-\phi(n)} \int_{n-R}^{n+R} e^{-(x-n)^2/2n}\ dx.$

If we also assume that ${R \gg \sqrt{n}}$, we can use the error function type estimates from before to estimate

$\displaystyle \int_{n-R}^{n+R} e^{-(x-n)^2/2n}\ dx = \sqrt{2\pi n} + O( \frac{n}{R} e^{-R^2/2n} ).$

Putting all this together, and choosing ${R}$ suitably, one obtains an asymptotic for ${n!}$.
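As a sketch of how the three estimates might be assembled, one can apply (17) at the endpoints ${x = n \pm R}$ to control the boundary factors ${e^{-\phi(n \pm R)}}$ and then add up the pieces of (16). The specific choice ${R = n^{0.6}}$ below is an illustrative assumption (any ${R}$ with ${\sqrt{n} \ll R \ll n^{2/3}}$ growing suitably would do), and ${n}$ is assumed large enough that ${R \leq n/2}$.

```latex
% Assumption: n large and R = n^{0.6}, so that R <= n/2, R = O(n^{2/3}), R >> sqrt(n).
% Endpoint values, from (17) at x = n \pm R, using \phi(n) = n - n \log n:
e^{-\phi(n \pm R)} \ll e^{-\phi(n)} e^{-R^2/2n}, \qquad e^{-\phi(n)} = n^n e^{-n}.

% Outer pieces of (16), from the two crude bounds:
\int_0^{n-R} e^{-\phi(x)}\ dx + \int_{n+R}^\infty e^{-\phi(x)}\ dx
  \ll n\, e^{-\phi(n)} e^{-R^2/2n}.

% Middle piece, from (17) and the Gaussian integral estimate:
\int_{n-R}^{n+R} e^{-\phi(x)}\ dx
  = \left(1 + O\left(\tfrac{R^3}{n^2}\right)\right) e^{-\phi(n)}
    \left( \sqrt{2\pi n} + O\left( \tfrac{n}{R} e^{-R^2/2n} \right) \right).

% Total: with R = n^{0.6}, R^3/n^2 = n^{-0.2} and n e^{-R^2/2n} = n e^{-n^{0.2}/2} \ll n^{0.3}, so
n! = \left(1 + O(n^{-0.2})\right) \sqrt{2\pi n}\, n^n e^{-n}.
```

This recovers the Stirling approximation with a non-optimal error term; the exponent ${0.2}$ is an artifact of the choice of ${R}$ in this sketch.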

Exercise 10 Solve problem (iii) from the introduction. (Hint: extract out the term ${\frac{k^{2n-4k}}{(n-k)^{2n-4k}}}$ to write as the exponential factor ${e^{\phi(k)}}$, placing all the other terms (which are of polynomial size) in the amplitude function ${\psi(k)}$. The function ${\phi}$ will then attain a maximum at ${k=n/2}$; perform a Taylor expansion and mimic the arguments above.)