Let $S$ be a non-empty finite set. If $X$ is a random variable taking values in $S$, the Shannon entropy $H[X]$ of $X$ is defined as

$$H[X] := -\sum_{s \in S} \mathbf{P}[X = s] \log \mathbf{P}[X = s].$$

There is a nice variational formula that lets one compute logarithms of sums of exponentials in terms of this entropy:

Lemma 1 (Gibbs variational formula). Let $f: S \to \mathbf{R}$ be a function. Then

$$\log \sum_{s \in S} \exp(f(s)) = \sup_X \left( \mathbf{E} f(X) + H[X] \right), \qquad (1)$$

where the supremum ranges over random variables $X$ taking values in $S$.

Proof: Note that shifting $f$ by a constant affects both sides of (1) the same way, so we may normalize $\sum_{s \in S} \exp(f(s)) = 1$. Then $\exp(f)$ is now the probability distribution of some random variable $Y$, and the inequality can be rewritten as

$$\mathbf{E} f(X) + H[X] \leq 0,$$

that is to say

$$\sum_{s \in S} \mathbf{P}[X = s] \log \frac{\mathbf{P}[X = s]}{\mathbf{P}[Y = s]} \geq 0.$$

But this is precisely the Gibbs inequality. $\Box$ (The expression inside the supremum can also be written as $-D_{KL}(X \| Y)$, where $D_{KL}$ denotes the Kullback–Leibler divergence. One can also interpret this inequality as a special case of the Fenchel–Young inequality relating the conjugate convex functions $x \mapsto e^x$ and $y \mapsto y \log y - y$.)
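As a numerical sanity check of the Gibbs variational formula, one can verify on a random instance that the supremum in (1) is attained exactly by the Gibbs distribution $\mathbf{P}[X = s] \propto \exp(f(s))$, and that every other distribution gives a smaller value. This is only an illustrative sketch; the size of $S$ and the range of $f$ are arbitrary choices.

```python
import math
import random

def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)

random.seed(0)
f = [random.uniform(-2.0, 2.0) for _ in range(6)]   # arbitrary f : S -> R with |S| = 6

lhs = math.log(sum(math.exp(v) for v in f))          # log sum_s exp(f(s))

# The Gibbs distribution p(s) proportional to exp(f(s)) attains the supremum.
Z = sum(math.exp(v) for v in f)
gibbs = [math.exp(v) / Z for v in f]
attained = sum(p * v for p, v in zip(gibbs, f)) + entropy(gibbs)

# Any other distribution gives at most lhs (the Gibbs inequality).
worst_gap = 0.0
for _ in range(1000):
    w = [random.random() + 1e-12 for _ in f]
    W = sum(w)
    p = [x / W for x in w]
    value = sum(q * v for q, v in zip(p, f)) + entropy(p)
    worst_gap = max(worst_gap, value - lhs)

assert abs(lhs - attained) < 1e-9
assert worst_gap <= 1e-9
print("Gibbs variational formula verified")
```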

In this note I would like to use this variational formula (which is also known as the Donsker–Varadhan variational formula) to give another proof of the following inequality of Carbery.

Theorem 2 (Generalized Cauchy–Schwarz inequality). Let $n \geq 1$, let $S, T_1, \dots, T_n$ be finite non-empty sets, and let $\pi_i: S \to T_i$ be functions for each $i = 1, \dots, n$. Let $K: S \to \mathbf{R}^+$ and $f_i: T_i \to \mathbf{R}^+$ be positive functions for each $i = 1, \dots, n$. Then

$$\sum_{s \in S} K(s) \prod_{i=1}^n f_i(\pi_i(s)) \leq Q \prod_{i=1}^n \Big( \sum_{t_i \in T_i} f_i(t_i)^n \Big)^{1/n},$$

where $Q$ is the quantity

$$Q := \Big( \sum_{(s_1, \dots, s_n) \in \Omega_n} \prod_{j=1}^n K(s_j) \Big)^{1/n},$$

where $\Omega_n$ is the set of all tuples $(s_1, \dots, s_n) \in S^n$ such that $\pi_i(s_i) = \pi_i(s_{i+1})$ for $1 \leq i \leq n-1$.

Thus for instance, the claim is trivial for $n = 1$. When $n = 2$, the inequality reads

$$\sum_{s \in S} K(s) f_1(\pi_1(s)) f_2(\pi_2(s)) \leq \Big( \sum_{(s_1, s_2) \in \Omega_2} K(s_1) K(s_2) \Big)^{1/2} \Big( \sum_{t_1 \in T_1} f_1(t_1)^2 \Big)^{1/2} \Big( \sum_{t_2 \in T_2} f_2(t_2)^2 \Big)^{1/2},$$

which is easily proven by Cauchy–Schwarz, while for $n = 3$ the inequality reads

$$\sum_{s \in S} K(s) f_1(\pi_1(s)) f_2(\pi_2(s)) f_3(\pi_3(s)) \leq \Big( \sum_{(s_1, s_2, s_3) \in \Omega_3} K(s_1) K(s_2) K(s_3) \Big)^{1/3} \prod_{i=1}^3 \Big( \sum_{t_i \in T_i} f_i(t_i)^3 \Big)^{1/3},$$

which can also be proven by elementary means. However, even for $n = 3$, the existing proofs require the "tensor power trick" in order to reduce to the case when the $f_i$ are step functions (in which case the inequality can be proven elementarily, as discussed in the above paper of Carbery).
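One can probe the $n = 2$ case of the inequality numerically. The following sketch draws random instances (the sizes of $S$, $T_1$, $T_2$ and the maps $\pi_1$, $\pi_2$ are arbitrary choices) and checks that the left-hand side never exceeds the right-hand side; here $\Omega_2$ consists of the pairs $(s_1, s_2)$ with $\pi_1(s_1) = \pi_1(s_2)$.

```python
import random

def carbery_n2(K, pi1, pi2, f1, f2):
    """Return (LHS, RHS) of the n = 2 generalized Cauchy-Schwarz inequality.

    K : dict s -> positive weight; pi1, pi2 : dicts s -> t_i;
    f1, f2 : dicts t_i -> positive value.
    """
    lhs = sum(K[s] * f1[pi1[s]] * f2[pi2[s]] for s in K)
    # Omega_2 = pairs (s1, s2) with pi1(s1) == pi1(s2)
    Q2 = sum(K[s1] * K[s2] for s1 in K for s2 in K if pi1[s1] == pi1[s2])
    rhs = (Q2 ** 0.5
           * sum(v * v for v in f1.values()) ** 0.5
           * sum(v * v for v in f2.values()) ** 0.5)
    return lhs, rhs

random.seed(1)
for _ in range(200):
    S, T1, T2 = range(8), range(3), range(4)
    K = {s: random.uniform(0.1, 2.0) for s in S}
    pi1 = {s: random.choice(list(T1)) for s in S}
    pi2 = {s: random.choice(list(T2)) for s in S}
    f1 = {t: random.uniform(0.1, 2.0) for t in T1}
    f2 = {t: random.uniform(0.1, 2.0) for t in T2}
    lhs, rhs = carbery_n2(K, pi1, pi2, f1, f2)
    assert lhs <= rhs * (1 + 1e-9)
print("n = 2 inequality verified on random instances")
```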

We now prove this inequality. We write $K(s) = \exp(k(s))$ and $f_i(t_i) = \exp(g_i(t_i))$ for some functions $k: S \to \mathbf{R}$ and $g_i: T_i \to \mathbf{R}$. If we take logarithms in the inequality to be proven and apply Lemma 1, the inequality becomes

$$\sup_X \Big( \mathbf{E} k(X) + \sum_{i=1}^n \mathbf{E} g_i(\pi_i(X)) + H[X] \Big) \leq \frac{1}{n} \sup_{(X_1, \dots, X_n)} \Big( \sum_{j=1}^n \mathbf{E} k(X_j) + H[(X_1, \dots, X_n)] \Big) + \frac{1}{n} \sum_{i=1}^n \sup_{Y_i} \Big( n \, \mathbf{E} g_i(Y_i) + H[Y_i] \Big),$$

where $X$ ranges over random variables taking values in $S$, $(X_1, \dots, X_n)$ ranges over tuples of random variables taking values in $\Omega_n$, and $Y_i$ ranges over random variables taking values in $T_i$. Comparing the suprema, the claim now reduces to the following lemma.

Lemma 3 (Conditional expectation computation). Let $X$ be an $S$-valued random variable. Then there exists an $\Omega_n$-valued random variable $(X_1, \dots, X_n)$, where each $X_j$ has the same distribution as $X$, and

$$H[(X_1, \dots, X_n)] = n H[X] - \sum_{i=1}^{n-1} H[\pi_i(X)].$$

(Indeed, given any candidate $X$ on the left-hand side of the previous display, one can take $Y_i := \pi_i(X)$ and the tuple supplied by this lemma on the right-hand side.)

Proof: We induct on $n$. When $n = 1$ we just take $X_1 := X$. Now suppose that $n \geq 2$, and the claim has already been proven for $n - 1$, thus one has already obtained a tuple $(X_1, \dots, X_{n-1}) \in \Omega_{n-1}$ with each $X_j$ having the same distribution as $X$, and

$$H[(X_1, \dots, X_{n-1})] = (n-1) H[X] - \sum_{i=1}^{n-2} H[\pi_i(X)].$$

By hypothesis, $X_{n-1}$ has the same distribution as $X$. For each value $t_{n-1}$ attained by $\pi_{n-1}(X_{n-1})$, we can take conditionally independent copies of $(X_1, \dots, X_{n-1})$ and $X$ conditioned to the events $\pi_{n-1}(X_{n-1}) = t_{n-1}$ and $\pi_{n-1}(X) = t_{n-1}$ respectively, and then concatenate them to form a tuple $(X_1, \dots, X_n)$ in $\Omega_n$, with $X_n$ a further copy of $X$ that is conditionally independent of $(X_1, \dots, X_{n-1})$ relative to $\pi_{n-1}(X_{n-1}) = \pi_{n-1}(X_n)$. One can then use the entropy chain rule to compute

$$H[(X_1, \dots, X_n)] = H[(X_1, \dots, X_{n-1})] + H[X_n \mid \pi_{n-1}(X_n)] = H[(X_1, \dots, X_{n-1})] + H[X] - H[\pi_{n-1}(X)],$$

and the claim now follows from the induction hypothesis. $\Box$
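The inductive step can be illustrated numerically in the base case $n = 2$: gluing two copies of $X$ along a map $\pi_1$ via the conditionally independent coupling $p(s_1, s_2) = p_X(s_1) p_X(s_2) / p_{\pi_1(X)}(\pi_1(s_1))$ (supported on $\pi_1(s_1) = \pi_1(s_2)$) preserves both marginals and satisfies $H[(X_1, X_2)] = 2 H[X] - H[\pi_1(X)]$. A sketch with an arbitrary choice of $S$ and $\pi_1$:

```python
import math
import random

def entropy(dist):
    """Shannon entropy (natural log) of a dict of probabilities."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

random.seed(2)
S = list(range(6))
pi1 = {s: s % 3 for s in S}                      # hypothetical map pi_1 : S -> T_1

w = [random.random() + 0.1 for _ in S]
Z = sum(w)
pX = {s: w[s] / Z for s in S}                    # distribution of X

# Distribution of pi_1(X).
pT = {}
for s in S:
    pT[pi1[s]] = pT.get(pi1[s], 0.0) + pX[s]

# Conditionally independent copies given pi_1(X1) = pi_1(X2):
joint = {(s1, s2): pX[s1] * pX[s2] / pT[pi1[s1]]
         for s1 in S for s2 in S if pi1[s1] == pi1[s2]}

total = sum(joint.values())                       # should be 1

# Both marginals agree with the distribution of X.
m1 = {s: sum(p for (s1, _), p in joint.items() if s1 == s) for s in S}
m2 = {s: sum(p for (_, s2), p in joint.items() if s2 == s) for s in S}

# Entropy identity: H[(X1, X2)] = 2 H[X] - H[pi_1(X)].
gap = entropy(joint) - (2 * entropy(pX) - entropy(pT))

assert abs(total - 1.0) < 1e-9
assert all(abs(m1[s] - pX[s]) < 1e-9 and abs(m2[s] - pX[s]) < 1e-9 for s in S)
assert abs(gap) < 1e-9
print("Lemma 3 verified for n = 2")
```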

With a bit more effort, one can replace $S$ by a more general measure space (and use differential entropy in place of Shannon entropy) to recover Carbery's inequality in full generality; we leave the details to the interested reader.
