
A generalized Cauchy-Schwarz inequality via the Gibbs variational formula


Let ${S}$ be a non-empty finite set. If ${X}$ is a random variable taking values in ${S}$, the Shannon entropy ${H[X]}$ of ${X}$ is defined as

$\displaystyle H[X] = -\sum_{s \in S} {\bf P}[X = s] \log {\bf P}[X = s].$
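In code, this definition is a one-liner; the following minimal Python sketch (not from the original post, and the name `shannon_entropy` is my own) computes ${H[X]}$ in nats from a probability vector:

```python
import math

def shannon_entropy(p):
    """Shannon entropy H[X] = -sum_s P[X=s] log P[X=s], in nats.

    `p` is a sequence of probabilities summing to 1; zero entries
    contribute 0, by the convention 0 log 0 = 0.
    """
    return -sum(q * math.log(q) for q in p if q > 0)

# The uniform distribution on a 4-element set has entropy log 4.
print(abs(shannon_entropy([0.25] * 4) - math.log(4)) < 1e-12)  # True
```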

There is a nice variational formula that lets one compute logs of sums of exponentials in terms of this entropy:

Lemma 1 (Gibbs variational formula) Let ${f: S \rightarrow {\bf R}}$ be a function. Then

$\displaystyle \log \sum_{s \in S} \exp(f(s)) = \sup_X {\bf E} f(X) + {\bf H}[X]. \ \ \ \ \ (1)$

Here ${X}$ is a random variable taking values in ${S}$.

Proof: Note that shifting ${f}$ by a constant affects both sides of (1) the same way, so we may normalize ${\sum_{s \in S} \exp(f(s)) = 1}$. Then ${\exp(f(s))}$ is the probability distribution of some random variable ${Y}$, and the inequality can be rewritten as

$\displaystyle 0 = \sup_X \sum_{s \in S} {\bf P}[X = s] \log {\bf P}[Y = s] -\sum_{s \in S} {\bf P}[X = s] \log {\bf P}[X = s].$

But this is precisely the Gibbs inequality. (The expression inside the supremum can also be written as ${-D_{KL}(X||Y)}$, where ${D_{KL}}$ denotes the Kullback-Leibler divergence. One can also interpret (1) as a special case of the Fenchel-Young inequality relating the conjugate convex functions ${x \mapsto e^x}$ and ${y \mapsto y \log y - y}$.) $\Box$
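As a numerical sanity check of Lemma 1 (a minimal sketch with randomly chosen data, not part of the original argument), one can verify that the Gibbs distribution ${{\bf P}[X=s] \propto \exp(f(s))}$ attains the supremum in (1), while every other distribution falls below it:

```python
import math
import random

def entropy(p):
    """Shannon entropy in nats, with the convention 0 log 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

random.seed(0)
f = [random.uniform(-2, 2) for _ in range(6)]    # a function on a 6-point set S

lhs = math.log(sum(math.exp(v) for v in f))      # log sum_s exp(f(s))

# The Gibbs distribution p(s) = exp(f(s)) / Z attains the supremum.
Z = sum(math.exp(v) for v in f)
gibbs = [math.exp(v) / Z for v in f]
attained = sum(p * v for p, v in zip(gibbs, f)) + entropy(gibbs)
print(abs(lhs - attained) < 1e-9)                # True: supremum attained

# Any other distribution gives E f(X) + H[X] <= lhs (the Gibbs inequality).
for _ in range(1000):
    w = [random.random() for _ in range(6)]
    p = [x / sum(w) for x in w]
    assert sum(pi * v for pi, v in zip(p, f)) + entropy(p) <= lhs + 1e-9
print("all random distributions satisfy the bound")
```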

In this note I would like to use this variational formula (which is also known as the Donsker-Varadhan variational formula) to give another proof of the following inequality of Carbery.

Theorem 2 (Generalized Cauchy-Schwarz inequality) Let ${n \geq 0}$, let ${S, T_1,\dots,T_n}$ be finite non-empty sets, and let ${\pi_i: S \rightarrow T_i}$ be functions for each ${i=1,\dots,n}$. Let ${K: S \rightarrow {\bf R}^+}$ and ${f_i: T_i \rightarrow {\bf R}^+}$ be positive functions for each ${i=1,\dots,n}$. Then

$\displaystyle \sum_{s \in S} K(s) \prod_{i=1}^n f_i(\pi_i(s)) \leq Q \prod_{i=1}^n \Big(\sum_{t_i \in T_i} f_i(t_i)^{n+1}\Big)^{1/(n+1)}$

where ${Q}$ is the quantity

$\displaystyle Q := \Big(\sum_{(s_0,\dots,s_n) \in \Omega_n} K(s_0) \dots K(s_n)\Big)^{1/(n+1)}$

and ${\Omega_n}$ is the set of all tuples ${(s_0,\dots,s_n) \in S^{n+1}}$ such that ${\pi_i(s_{i-1}) = \pi_i(s_i)}$ for ${i=1,\dots,n}$.

Thus for instance, the inequality is trivial when ${n=0}$. When ${n=1}$, the inequality reads

$\displaystyle \sum_{s \in S} K(s) f_1(\pi_1(s)) \leq \Big(\sum_{s_0,s_1 \in S: \pi_1(s_0)=\pi_1(s_1)} K(s_0) K(s_1)\Big)^{1/2}$

$\displaystyle \Big( \sum_{t_1 \in T_1} f_1(t_1)^2\Big)^{1/2},$

which is easily proven by the Cauchy-Schwarz inequality, while for ${n=2}$ the inequality reads

$\displaystyle \sum_{s \in S} K(s) f_1(\pi_1(s)) f_2(\pi_2(s))$

$\displaystyle \leq \Big(\sum_{s_0,s_1, s_2 \in S: \pi_1(s_0)=\pi_1(s_1); \pi_2(s_1)=\pi_2(s_2)} K(s_0) K(s_1) K(s_2)\Big)^{1/3}$

$\displaystyle \Big(\sum_{t_1 \in T_1} f_1(t_1)^3\Big)^{1/3} \Big(\sum_{t_2 \in T_2} f_2(t_2)^3\Big)^{1/3},$

which can also be proven by elementary means. However, even for ${n=3}$, the existing proofs require the “tensor power trick” in order to reduce to the case when the ${f_i}$ are step functions (in which case the inequality can be proven elementarily, as discussed in the above paper of Carbery).
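Before turning to the entropy proof, the ${n=2}$ case of Theorem 2 can be checked by brute force on small sets. The sketch below (with arbitrarily chosen maps ${\pi_1, \pi_2}$ and random positive data, all my own choices rather than anything from the post) confirms the displayed inequality numerically:

```python
import itertools
import random

random.seed(1)
S = range(6)
T1, T2 = range(2), range(3)
pi1 = {s: s % 2 for s in S}   # arbitrary maps pi_i : S -> T_i
pi2 = {s: s % 3 for s in S}
K = {s: random.uniform(0.1, 2.0) for s in S}
f1 = {t: random.uniform(0.1, 2.0) for t in T1}
f2 = {t: random.uniform(0.1, 2.0) for t in T2}

# Left side: sum_s K(s) f1(pi1(s)) f2(pi2(s)).
lhs = sum(K[s] * f1[pi1[s]] * f2[pi2[s]] for s in S)

# Q^3 sums K(s0)K(s1)K(s2) over Omega_2, i.e. tuples with
# pi1(s0) = pi1(s1) and pi2(s1) = pi2(s2).
Q3 = sum(K[s0] * K[s1] * K[s2]
         for s0, s1, s2 in itertools.product(S, repeat=3)
         if pi1[s0] == pi1[s1] and pi2[s1] == pi2[s2])

rhs = (Q3 ** (1 / 3)
       * sum(v ** 3 for v in f1.values()) ** (1 / 3)
       * sum(v ** 3 for v in f2.values()) ** (1 / 3))

print(lhs <= rhs + 1e-9)  # True: the n=2 inequality holds for this data
```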
We now prove this inequality. We write ${K(s) = \exp(k(s))}$ and ${f_i(t_i) = \exp(g_i(t_i))}$ for some functions ${k: S \rightarrow {\bf R}}$ and ${g_i: T_i \rightarrow {\bf R}}$. If we take logarithms in the inequality to be proven and apply Lemma 1, the inequality becomes

$\displaystyle \sup_X {\bf E} k(X) + \sum_{i=1}^n g_i(\pi_i(X)) + {\bf H}[X]$

$\displaystyle \leq \frac{1}{n+1} \sup_{(X_0,\dots,X_n)} {\bf E} k(X_0)+\dots+k(X_n) + {\bf H}[X_0,\dots,X_n]$

$\displaystyle + \frac{1}{n+1} \sum_{i=1}^n \sup_{Y_i} (n+1) {\bf E} g_i(Y_i) + {\bf H}[Y_i]$

where ${X}$ ranges over random variables taking values in ${S}$, ${(X_0,\dots,X_n)}$ ranges over tuples of random variables taking values in ${\Omega_n}$, and ${Y_i}$ ranges over random variables taking values in ${T_i}$.
Comparing the suprema, the claim now reduces to

Lemma 3 (Conditional expectation computation) Let ${X}$ be an ${S}$-valued random variable. Then there exists an ${\Omega_n}$-valued random variable ${(X_0,\dots,X_n)}$, where each ${X_i}$ has the same distribution as ${X}$, and

$\displaystyle {\bf H}[X_0,\dots,X_n] = (n+1) {\bf H}[X]$

$\displaystyle - {\bf H}[\pi_1(X)] - \dots - {\bf H}[\pi_n(X)].$

Proof: We induct on ${n}$. When ${n=0}$ we just take ${X_0 = X}$. Now suppose that ${n \geq 1}$ and the claim has already been proven for ${n-1}$, thus one has already obtained a tuple ${(X_0,\dots,X_{n-1}) \in \Omega_{n-1}}$ with each of ${X_0,\dots,X_{n-1}}$ having the same distribution as ${X}$, and

$\displaystyle {\bf H}[X_0,\dots,X_{n-1}] = n {\bf H}[X] - {\bf H}[\pi_1(X)] - \dots - {\bf H}[\pi_{n-1}(X)].$

By hypothesis, ${\pi_n(X_{n-1})}$ has the same distribution as ${\pi_n(X)}$. For each value ${t_n}$ attained by ${\pi_n(X)}$, we can take conditionally independent copies of ${(X_0,\dots,X_{n-1})}$ and ${X}$ conditioned to the events ${\pi_n(X_{n-1}) = t_n}$ and ${\pi_n(X) = t_n}$ respectively, and then concatenate them to form a tuple ${(X_0,\dots,X_n)}$ in ${\Omega_n}$, with ${X_n}$ a further copy of ${X}$ that is conditionally independent of ${(X_0,\dots,X_{n-1})}$ relative to ${\pi_n(X_n)}$. One can then use the entropy chain rule to compute

$\displaystyle {\bf H}[X_0,\dots,X_n] = {\bf H}[\pi_n(X_n)] + {\bf H}[X_0,\dots,X_n| \pi_n(X_n)]$

$\displaystyle = {\bf H}[\pi_n(X_n)] + {\bf H}[X_0,\dots,X_{n-1}| \pi_n(X_n)] + {\bf H}[X_n| \pi_n(X_n)]$

$\displaystyle = {\bf H}[\pi_n(X)] + {\bf H}[X_0,\dots,X_{n-1}| \pi_n(X_{n-1})] + {\bf H}[X_n| \pi_n(X_n)]$

$\displaystyle = {\bf H}[\pi_n(X)] + ({\bf H}[X_0,\dots,X_{n-1}] - {\bf H}[\pi_n(X_{n-1})])$

$\displaystyle + ({\bf H}[X_n] - {\bf H}[\pi_n(X_n)])$

$\displaystyle ={\bf H}[X_0,\dots,X_{n-1}] + {\bf H}[X_n] - {\bf H}[\pi_n(X_n)]$

and the claim now follows from the induction hypothesis. $\Box$
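To make the construction in Lemma 3 concrete, the following sketch (illustrative only; the specific distribution and the map ${\pi_1}$ are arbitrary choices of mine) builds the ${n=1}$ joint distribution explicitly and confirms the entropy identity ${{\bf H}[X_0,X_1] = 2{\bf H}[X] - {\bf H}[\pi_1(X)]}$:

```python
import math
import random

def H(p):
    """Shannon entropy in nats, with the convention 0 log 0 = 0."""
    return -sum(q * math.log(q) for q in p if q > 0)

random.seed(2)
S = range(5)
pi1 = {s: s % 2 for s in S}                      # arbitrary map pi_1 : S -> {0, 1}
w = [random.uniform(0.1, 1.0) for _ in S]
p = [x / sum(w) for x in w]                      # distribution of X on S

# Distribution of pi_1(X) on T_1 = {0, 1}.
q = [sum(p[s] for s in S if pi1[s] == t) for t in (0, 1)]

# Lemma 3 construction for n=1: X_0 is a copy of X, and X_1 is a further copy
# of X, conditionally independent of X_0 given pi_1(X_0) = pi_1(X_1).
joint = [p[s0] * p[s1] / q[pi1[s0]]
         for s0 in S for s1 in S if pi1[s0] == pi1[s1]]

# H[X_0, X_1] = 2 H[X] - H[pi_1(X)], as Lemma 3 predicts.
print(abs(H(joint) - (2 * H(p) - H(q))) < 1e-9)  # True
```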
With a bit more effort, one can replace ${S}$ by a more general measure space (and use differential entropy in place of Shannon entropy) to recover Carbery's inequality in full generality; we leave the details to the interested reader.