
Maximum entropy probability distribution

From Wikipedia, the free encyclopedia

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class (usually defined in terms of specified properties or measures), then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

Definition of entropy and differential entropy

If X is a continuous random variable with probability density p(x), then the differential entropy of X is defined as[1][2][3]

    H(X) = −∫ p(x) log p(x) dx.

If X is a discrete random variable with distribution given by Pr(X = x_k) = p_k for k = 1, 2, …, then the entropy of X is defined as

    H(X) = −Σ_k p_k log p_k.

The seemingly divergent term p_k log p_k is replaced by zero whenever p_k = 0.

This is a special case of more general forms described in the articles Entropy (information theory), Principle of maximum entropy, and differential entropy. In connection with maximum entropy distributions, this is the only one needed, because maximizing H(X) will also maximize the more general forms.

The base of the logarithm is not important, as long as the same one is used consistently: a change of base merely rescales the entropy. Information theorists may prefer to use base 2 in order to express the entropy in bits; mathematicians and physicists often prefer the natural logarithm, in which case the entropy is measured in nats.
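
As a quick illustration of these definitions and of the role of the logarithm base, the following minimal Python sketch (not part of the article; the function names and example densities are choices made here) computes a discrete entropy with the 0 · log 0 = 0 convention and approximates a differential entropy by numerical quadrature:

```python
# A minimal sketch: the two entropies defined above, computed numerically.
# Assumes NumPy and SciPy are available; the example densities are illustrative.
import numpy as np
from scipy.integrate import quad

def discrete_entropy(p, base=np.e):
    """H(X) = -sum p_k log p_k, with 0*log(0) treated as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0                      # drop zero-probability terms
    return -np.sum(p[nz] * np.log(p[nz])) / np.log(base)

def differential_entropy(pdf, a, b, base=np.e):
    """H(X) = -integral of p(x) log p(x) over [a, b], by numerical quadrature."""
    integrand = lambda x: -pdf(x) * np.log(pdf(x)) if pdf(x) > 0 else 0.0
    value, _ = quad(integrand, a, b)
    return value / np.log(base)

# Fair six-sided die: entropy is ln(6) nats, or log2(6) ~ 2.585 bits.
print(discrete_entropy([1/6] * 6), discrete_entropy([1/6] * 6, base=2))

# Standard normal density: differential entropy is 0.5*ln(2*pi*e) ~ 1.4189 nats.
phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(differential_entropy(phi, -10, 10))
```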

However, the choice of reference measure is crucial, even though the Lebesgue measure is typically used and often defended as a "natural" choice: which measure is chosen determines the entropy and, consequently, the maximum entropy distribution.

Distributions with measured constants

Many statistical distributions of practical interest are those for which the moments or other measurable quantities are constrained to be constants. The following theorem by Ludwig Boltzmann gives the form of the probability density under these constraints.

Continuous case

Suppose S is a closed subset of the real numbers and we choose to specify n measurable functions f_1, …, f_n and n numbers a_1, …, a_n. We consider the class C of all real-valued random variables which are supported on S (i.e. whose density function is zero outside of S) and which satisfy the n moment conditions:

    E[f_j(X)] ≥ a_j   for all j = 1, …, n.

If there is a member of C whose density function is positive everywhere in S, and if there exists a maximal entropy distribution for C, then its probability density p(x) has the following form:

    p(x) = exp( Σ_{j=0}^{n} λ_j f_j(x) )   for all x in S,

where we assume that f_0(x) = 1. The constant λ_0 and the n Lagrange multipliers λ = (λ_1, …, λ_n) solve the constrained optimization problem with a_0 = 1 (which ensures that p integrates to unity):[4]

    maximize  Σ_{j=0}^{n} λ_j a_j − ∫_S exp( Σ_{j=0}^{n} λ_j f_j(x) ) dx   subject to  λ_1, …, λ_n ≥ 0.

Using the Karush–Kuhn–Tucker conditions, it can be shown that the optimization problem has a unique solution, because the objective function in the optimization is concave in λ.

Note that when the moment constraints are equalities (instead of inequalities), that is,

    E[f_j(X)] = a_j   for all j = 1, …, n,

then the constraint condition λ_1, …, λ_n ≥ 0 can be dropped, which makes the optimization over the Lagrange multipliers unconstrained.
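
The following minimal Python sketch illustrates this optimization numerically under assumptions made here for illustration: a single equality constraint E[X] = 1, the support truncated to [0, 30] so the integral is finite, and a generic optimizer from SciPy. It maximizes the objective above over (λ_0, λ_1) and recovers a density very close to e^{−x}, the exponential distribution discussed in the examples below:

```python
# A minimal numerical sketch (assumptions: equality constraints, support truncated to
# [0, 30]; function and variable names are illustrative, not from the article).
# We maximize  sum_j lambda_j a_j - integral of exp(sum_j lambda_j f_j(x)) dx
# for f_0(x) = 1, a_0 = 1 and f_1(x) = x, a_1 = 1, and recover p(x) ~ exp(-x).
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

a = np.array([1.0, 1.0])              # a_0 = 1 (normalization), a_1 = E[X] = 1
f = [lambda x: 1.0, lambda x: x]      # f_0, f_1

def neg_dual(lam):
    integral, _ = quad(lambda x: np.exp(sum(l * fj(x) for l, fj in zip(lam, f))), 0.0, 30.0)
    return -(lam @ a - integral)

res = minimize(neg_dual, x0=np.zeros(2), method="Nelder-Mead")
lam0, lam1 = res.x
print("lambda_0 = %.3f, lambda_1 = %.3f" % (lam0, lam1))     # approx. 0 and -1

p = lambda x: np.exp(lam0 + lam1 * x)                        # maximum entropy density
print("normalization = %.3f, mean = %.3f"
      % (quad(p, 0, 30)[0], quad(lambda x: x * p(x), 0, 30)[0]))
```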

Discrete case

Suppose S = {x_1, x_2, …} is a (finite or infinite) discrete subset of the reals, and that we choose to specify n functions f_1, …, f_n and n numbers a_1, …, a_n. We consider the class C of all discrete random variables X which are supported on S and which satisfy the n moment conditions

    E[f_j(X)] ≥ a_j   for all j = 1, …, n.

If there exists a member of class C which assigns positive probability to all members of S, and if there exists a maximum entropy distribution for C, then this distribution has the following shape:

    Pr(X = x_k) = exp( Σ_{j=0}^{n} λ_j f_j(x_k) )   for all x_k in S,

where we assume that f_0(x) = 1 and the constants λ_0, λ_1, …, λ_n solve the constrained optimization problem with a_0 = 1:[5]

    maximize  Σ_{j=0}^{n} λ_j a_j − Σ_{x_k ∈ S} exp( Σ_{j=0}^{n} λ_j f_j(x_k) )   subject to  λ_1, …, λ_n ≥ 0.

Again as above, if the moment conditions are equalities (instead of inequalities), then the constraint condition λ_1, …, λ_n ≥ 0 is not present in the optimization.

Proof in the case of equality constraints

In the case of equality constraints, this theorem is proved with the calculus of variations and Lagrange multipliers. The constraints can be written as

    ∫_{−∞}^{∞} f_j(x) p(x) dx = a_j,   j = 0, 1, …, n,

where f_0(x) = 1 and a_0 = 1 express the normalization of p.

We consider the functional

    J[p] = −∫ p(x) ln p(x) dx + η_0 ( ∫ p(x) dx − 1 ) + Σ_{j=1}^{n} λ_j ( ∫ f_j(x) p(x) dx − a_j ),

where η_0 and the λ_j are the Lagrange multipliers. The zeroth constraint ensures the second axiom of probability (total probability one). The other constraints are that the measurements of the function are given constants up to order n. The entropy attains an extremum when the functional derivative is equal to zero:

    δJ/δp(x) = −ln p(x) − 1 + η_0 + Σ_{j=1}^{n} λ_j f_j(x) = 0.

Therefore, the extremal entropy probability distribution in this case must be of the form (writing λ_0 = η_0 − 1)

    p(x) = exp( λ_0 + Σ_{j=1}^{n} λ_j f_j(x) ) = c · exp( Σ_{j=1}^{n} λ_j f_j(x) ),

remembering that the constant c = e^{λ_0} is fixed by the normalization ∫ p(x) dx = 1. It can be verified that this is the maximal solution by checking that the variation around this solution is always negative.

Uniqueness of the maximum

Suppose p and q are distributions satisfying the expectation-constraints. Letting 0 < α < 1 and considering the mixture r = αp + (1 − α)q, it is clear that this distribution satisfies the expectation-constraints and, furthermore, has as support supp(r) = supp(p) ∪ supp(q). From basic facts about entropy, it holds that H(r) ≥ αH(p) + (1 − α)H(q). Taking the limits α → 1 and α → 0 respectively yields H(r) ≥ H(p) and H(r) ≥ H(q).

It follows that a distribution satisfying the expectation-constraints and maximising entropy must necessarily have full support; i.e. the distribution is almost everywhere strictly positive. It follows that the maximising distribution must be an internal point in the space of distributions satisfying the expectation-constraints, that is, it must be a local extremum. Thus it suffices to show that the local extremum is unique in order to show that the entropy-maximising distribution is unique (and this also shows that the local extremum is the global maximum).

Suppose p and q are local extrema. Reformulating the above computations, these are characterised by parameters via

    p(x) = exp( λ · f(x) ) / Z(λ),   where Z(λ) = ∫_S exp( λ · f(x) ) dx,

and similarly for q with parameter vector μ, where λ = (λ_1, …, λ_n), μ = (μ_1, …, μ_n) and f(x) = (f_1(x), …, f_n(x)). We now note a series of identities: via the satisfaction of the expectation-constraints and utilising gradients / directional derivatives, one has

    ∇ log Z(λ) = E_p[ f(X) ] = a,

and similarly for q, i.e. ∇ log Z(μ) = E_q[ f(X) ] = a. Letting u = λ − μ, one obtains:

    0 = u · ( ∇ log Z(λ) − ∇ log Z(μ) ) = u · ∇² log Z(ν) u,

where ν = θλ + (1 − θ)μ for some θ ∈ (0, 1). Computing further, one has

    u · ∇² log Z(ν) u = Var_{p_ν}( u · f(X) ),

where p_ν is similar to the distribution above, only parameterised by ν. Assuming that no non-trivial linear combination of the observables is almost everywhere (a.e.) constant (which e.g. holds if the observables are independent and not a.e. constant), it holds that u · f(X) has non-zero variance, unless u = 0. By the above equation it is thus clear that the latter must be the case. Hence λ − μ = u = 0, so the parameters characterising the local extrema p and q are identical, which means that the distributions themselves are identical. Thus, the local extremum is unique and, by the above discussion, the maximum is unique, provided a local extremum actually exists.

Caveats

Note that not all classes of distributions contain a maximum entropy distribution. It is possible that a class contains distributions of arbitrarily large entropy (e.g. the class of all continuous distributions on R with mean 0 but arbitrary standard deviation), or that the entropies are bounded above but there is no distribution which attains the maximal entropy.[a] It is also possible that the expected value restrictions for the class C force the probability distribution to be zero in certain subsets of S. In that case our theorem doesn't apply, but one can work around this by shrinking the set S.
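
A small numerical illustration of the first caveat (a sketch assuming SciPy; the particular standard deviations are arbitrary): the entropy of N(0, σ²) is ½ ln(2πeσ²), which grows without bound as σ increases, so the class of continuous distributions with mean 0 and unrestricted standard deviation has no maximum entropy member.

```python
# Entropy of mean-zero normals grows like ln(sigma): no maximum exists in this class.
from scipy.stats import norm

for sigma in [1, 10, 100, 1000]:
    print(sigma, norm(loc=0, scale=sigma).entropy())   # 0.5*ln(2*pi*e*sigma^2)
```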

Examples

Every probability distribution is trivially a maximum entropy probability distribution under the constraint that the distribution has its own entropy. To see this, rewrite the density as p(x) = exp( ln p(x) ) and compare to the expression of the theorem above. By choosing f(x) = ln p(x) to be the measurable function and

    ∫ exp( f(x) ) f(x) dx = −H

to be the constant, p(x) is the maximum entropy probability distribution under the constraint

    ∫ p(x) f(x) dx = −H.

Nontrivial examples are distributions that are subject to multiple constraints that are different from the assignment of the entropy. These are often found by starting with the same procedure f(x) = ln p(x) and finding that f(x) can be separated into a sum of parts.

A table of examples of maximum entropy distributions is given in Lisman (1972)[6] and Park & Bera (2009).[7]

Uniform and piecewise uniform distributions

The uniform distribution on the interval [a, b], whose density is 1/(b − a) on [a, b], is the maximum entropy distribution among all continuous distributions which are supported in the interval [a, b], and thus the probability density is 0 outside of the interval. This uniform density can be related to Laplace's principle of indifference, sometimes called the principle of insufficient reason. More generally, if we are given a subdivision a = a_0 < a_1 < ... < a_k = b of the interval [a, b] and probabilities p_1, ..., p_k that add up to one, then we can consider the class of all continuous distributions such that

    Pr(a_{j−1} ≤ X < a_j) = p_j   for j = 1, …, k.

The density of the maximum entropy distribution for this class is constant on each of the intervals [a_{j−1}, a_j), where it equals p_j / (a_j − a_{j−1}). The uniform distribution on the finite set {x_1, ..., x_n} (which assigns a probability of 1/n to each of these values) is the maximum entropy distribution among all discrete distributions supported on this set.
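
A quick numerical check (a sketch assuming SciPy; the beta densities are arbitrary comparison points, not from the article): on [0, 1] the uniform distribution attains differential entropy ln(1) = 0, and non-uniform densities on the same interval come out strictly lower.

```python
# Differential entropies of distributions supported on [0, 1].
from scipy.stats import uniform, beta

print(uniform(0, 1).entropy())   # 0.0 = ln(1), the maximum on [0, 1]
print(beta(2, 2).entropy())      # negative, hence smaller
print(beta(5, 1).entropy())      # negative, hence smaller
```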

Positive and specified mean: the exponential distribution

The exponential distribution, for which the density function is

    p(x; λ) = λ e^{−λx}   for x ≥ 0,

is the maximum entropy distribution among all continuous distributions supported in [0, ∞) that have a specified mean of 1/λ.

In the case of distributions supported on [0, ∞), the maximum entropy distribution depends on relationships between the first and second moments. In specific cases, it may be the exponential distribution, it may be another distribution, or no maximum entropy distribution may exist.[8]
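
A quick numerical check of the exponential case (a sketch assuming SciPy; the gamma alternatives are arbitrary comparison points with the same mean 1):

```python
# Among distributions on [0, inf) with mean 1, the exponential has the largest entropy.
from scipy.stats import expon, gamma

print(expon(scale=1.0).entropy())            # 1.0 nat, the maximum for mean 1
print(gamma(a=2.0, scale=0.5).entropy())     # ~ 0.88, smaller
print(gamma(a=0.5, scale=2.0).entropy())     # ~ 0.78, smaller
```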

Specified mean and variance: the normal distribution

The normal distribution N(μ, σ²), for which the density function is

    p(x; μ, σ) = (1 / (σ √(2π))) e^{−(x − μ)² / (2σ²)},

has maximum entropy among all real-valued distributions supported on (−∞, ∞) with a specified variance σ² (a particular moment). The same is true when the mean μ and the variance σ² are specified (the first two moments), since entropy is translation invariant on (−∞, ∞). Therefore, the assumption of normality imposes the minimal prior structural constraint beyond these moments. (See the differential entropy article for a derivation.)
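
A quick numerical check (a sketch assuming SciPy; the comparison distributions are arbitrary choices scaled to unit variance): the normal attains ½ ln(2πe) ≈ 1.419 nats, and the others come out lower.

```python
# All three distributions below have variance 1; the normal has the largest entropy.
import numpy as np
from scipy.stats import norm, laplace, uniform

print(norm(0, 1).entropy())                                     # ~ 1.419
print(laplace(scale=1/np.sqrt(2)).entropy())                    # ~ 1.347
print(uniform(loc=-np.sqrt(3), scale=2*np.sqrt(3)).entropy())   # ~ 1.242
```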

Discrete distributions with specified mean

Among all the discrete distributions supported on the set {x_1, ..., x_n} with a specified mean μ, the maximum entropy distribution has the following shape:

    Pr(X = x_k) = C r^{x_k}   for k = 1, …, n,

where the positive constants C and r can be determined by the requirements that the sum of all the probabilities must be 1 and the expected value must be μ.

For example, suppose a large number N of dice are thrown, and you are told that the sum of all the shown numbers is S. Based on this information alone, what would be a reasonable assumption for the number of dice showing 1, 2, ..., 6? This is an instance of the situation considered above, with {x_1, ..., x_6} = {1, ..., 6} and μ = S/N.
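
A minimal Python sketch of this dice question (the value μ = S/N = 4.5 is an arbitrary example, and the use of SciPy's scalar root finder is a choice made here, not part of the article): it solves for r and C in p_k = C r^k from the normalization and mean conditions.

```python
# Maximum entropy probabilities p_k = C * r**k on {1, ..., 6} with a given mean.
import numpy as np
from scipy.optimize import brentq

faces = np.arange(1, 7)

def mean_given_r(r):
    w = r ** faces
    return (faces * w).sum() / w.sum()

mu = 4.5                                            # example observed average S/N
r = brentq(lambda r: mean_given_r(r) - mu, 1e-6, 50.0)
p = r ** faces / (r ** faces).sum()                 # C = 1 / sum_k r**k
print("r = %.4f" % r)
print("p =", np.round(p, 4), " mean =", round(float(faces @ p), 4))
```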

Finally, among all the discrete distributions supported on the infinite set {x_1, x_2, ...} with mean μ, the maximum entropy distribution has the shape:

    Pr(X = x_k) = C r^{x_k}   for k = 1, 2, …,

where again the constants C and r are determined by the requirements that the sum of all the probabilities must be 1 and the expected value must be μ. For example, in the case that x_k = k, this gives

    Pr(X = k) = C r^k   for k = 1, 2, …,

such that the respective maximum entropy distribution is the geometric distribution.

Circular random variables

For a continuous random variable θ distributed about the unit circle, the von Mises distribution maximizes the entropy when the real and imaginary parts of the first circular moment are specified[9] or, equivalently, the circular mean and circular variance are specified.

When the mean and variance of the angles θ modulo 2π are specified, the wrapped normal distribution maximizes the entropy.[9]

Maximizer for specified mean, variance and skew

There exists an upper bound on the entropy of continuous random variables on R with a specified mean, variance, and skew. However, there is no distribution which achieves this upper bound, because the candidate form p(x) = c exp(λ_1 x + λ_2 x² + λ_3 x³) is unbounded (and hence cannot be normalized) when λ_3 ≠ 0 (see Cover & Thomas (2006: chapter 12)).

However, the maximum entropy is ε-achievable: a distribution's entropy can be arbitrarily close to the upper bound. Start with a normal distribution of the specified mean and variance. To introduce a positive skew, perturb the normal distribution upward by a small amount at a value many σ larger than the mean. The skewness, being proportional to the third moment, will be affected more than the lower order moments.
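
A rough numerical illustration of this construction (all numbers are illustrative choices, not from the article): mixing a standard normal with a tiny, narrow bump far above the mean produces a clearly positive skewness while the differential entropy stays close to that of the unperturbed normal.

```python
# Perturb a standard normal with a small bump at x = 10 and inspect moments and entropy.
import numpy as np

x = np.linspace(-15, 15, 200001)
dx = x[1] - x[0]
phi = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)

eps = 1e-3
p = (1 - eps) * phi(x) + eps * phi((x - 10) / 0.1) / 0.1   # perturbed density

mean = np.sum(x * p) * dx
var = np.sum((x - mean) ** 2 * p) * dx
skew = np.sum((x - mean) ** 3 * p) * dx / var ** 1.5
entropy = -np.sum(np.where(p > 0, p * np.log(p), 0.0)) * dx

print("mean = %.3f, variance = %.3f, skewness = %.2f" % (mean, var, skew))
print("entropy = %.3f nats vs. 1.419 for the unperturbed standard normal" % entropy)
```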

This is a special case of the general case in which the exponential of any odd-order polynomial in x will be unbounded on R. For example, c e^{λx} will likewise be unbounded on R, but when the support is limited to a bounded or semi-bounded interval the upper entropy bound may be achieved (e.g. if x lies in the interval [0, ∞) and λ < 0, the exponential distribution will result).

Maximizer for specified mean and deviation risk measure

Every distribution with log-concave density is a maximal entropy distribution with specified mean μ and deviation risk measure D.[10]

In particular, the maximal entropy distribution with specified mean μ and deviation D is:

  • The normal distribution N(μ, D²), if D is the standard deviation;
  • The Laplace distribution, if D is the average absolute deviation;[6]
  • The distribution with density of the form f(x) = c exp(ax + b[(x − μ)_−]²), if D is the standard lower semi-deviation, where a, b, c are constants and the function (x)_− := min(x, 0) returns only the negative values of its argument, otherwise zero.[10]

Other examples

In the table below, each listed distribution maximizes the entropy for the particular set of functional constraints listed in the third column, and the constraint that x be included in the support of the probability density, which is listed in the fourth column.[6][7]

Several listed examples (Bernoulli, geometric, exponential, Laplace, Pareto) are trivially true, because their associated constraints are equivalent to the assignment of their entropy. They are included anyway because their constraint is related to a common or easily measured quantity.

For reference, Γ(x) is the gamma function, ψ(x) is the digamma function, B(p, q) is the beta function, and γ_E is the Euler–Mascheroni constant.

Table of probability distributions and corresponding maximum entropy constraints
Distribution name | Probability density / mass function | Maximum entropy constraint | Support
Uniform (discrete) | f(k) = 1/(b − a + 1) | None | {a, a + 1, …, b}
Uniform (continuous) | f(x) = 1/(B − A) | None | [A, B]
Bernoulli | f(k) = p^k (1 − p)^{1−k} | E[X] = p | {0, 1}
Geometric | f(k) = (1 − p)^{k−1} p | E[X] = 1/p | {1, 2, 3, …}
Exponential | f(x) = λ e^{−λx} | E[X] = 1/λ | [0, ∞)
Laplace | f(x) = (1/(2b)) e^{−|x − μ|/b} | E[|X − μ|] = b | (−∞, ∞)
Asymmetric Laplace | f(x) = (λ/(κ + 1/κ)) e^{−(x − m) λ s κ^{s}}, where s = sgn(x − m) | E[(X − m) s κ^{s}] = 1/λ | (−∞, ∞)
Pareto | f(x) = α x_m^{α} / x^{α+1} | E[ln X] = 1/α + ln x_m | [x_m, ∞)
Normal | f(x) = (1/√(2πσ²)) e^{−(x − μ)²/(2σ²)} | E[X] = μ, E[(X − μ)²] = σ² | (−∞, ∞)
Truncated normal | (see article) | (see article) | [a, b]
von Mises | f(θ) = e^{κ cos(θ − μ)} / (2π I_0(κ)) | E[cos Θ] = (I_1(κ)/I_0(κ)) cos μ, E[sin Θ] = (I_1(κ)/I_0(κ)) sin μ | [0, 2π)
Rayleigh | f(x) = (x/σ²) e^{−x²/(2σ²)} | E[X²] = 2σ², E[ln X] = (ln(2σ²) − γ_E)/2 | [0, ∞)
Beta | f(x) = x^{α−1} (1 − x)^{β−1} / B(α, β) | E[ln X] = ψ(α) − ψ(α + β), E[ln(1 − X)] = ψ(β) − ψ(α + β) | [0, 1]
Cauchy | f(x) = 1/(π(1 + x²)) | E[ln(1 + X²)] = 2 ln 2 | (−∞, ∞)
Chi | f(x) = (2^{1−k/2} / Γ(k/2)) x^{k−1} e^{−x²/2} | E[X²] = k, E[ln X] = (ψ(k/2) + ln 2)/2 | [0, ∞)
Chi-squared | f(x) = (1/(2^{k/2} Γ(k/2))) x^{k/2−1} e^{−x/2} | E[X] = k, E[ln X] = ψ(k/2) + ln 2 | [0, ∞)
Erlang | f(x) = λ^k x^{k−1} e^{−λx} / (k − 1)! | E[X] = k/λ, E[ln X] = ψ(k) − ln λ | [0, ∞)
Gamma | f(x) = x^{k−1} e^{−x/θ} / (θ^k Γ(k)) | E[X] = kθ, E[ln X] = ψ(k) + ln θ | [0, ∞)
Lognormal | f(x) = (1/(xσ√(2π))) e^{−(ln x − μ)²/(2σ²)} | E[ln X] = μ, E[(ln X − μ)²] = σ² | (0, ∞)
Maxwell–Boltzmann | f(x) = (√(2/π)/a³) x² e^{−x²/(2a²)} | E[X²] = 3a², E[ln X] = ln a + 1 − (γ_E + ln 2)/2 | [0, ∞)
Weibull | f(x) = (k/λ^k) x^{k−1} e^{−(x/λ)^k} | E[X^k] = λ^k, E[ln X] = ln λ − γ_E/k | [0, ∞)
Multivariate normal | f(x) = exp(−(x − μ)ᵀ Σ^{−1} (x − μ)/2) / √((2π)^N det Σ) | E[X] = μ, E[(X − μ)(X − μ)ᵀ] = Σ | R^N
Binomial | f(k) = C(n, k) p^k (1 − p)^{n−k} | E[X] = np, within the class of n-generalized binomial distributions[11] | {0, 1, …, n}
Poisson | f(k) = λ^k e^{−λ} / k! | E[X] = λ, within the class of ∞-generalized binomial distributions[11] | {0, 1, 2, …}
Logistic | f(x) = e^{−(x − μ)/s} / (s (1 + e^{−(x − μ)/s})²) | E[X] = μ, E[ln(1 + e^{−(X − μ)/s})] = 1 | (−∞, ∞)

The maximum entropy principle can be used to upper bound the entropy of statistical mixtures.[12]

See also

  • Entropy (information theory)
  • Principle of maximum entropy
  • Differential entropy
  • Exponential family

Notes

  1. ^ For example, the class of all continuous distributions X on R with E(X) = 0 and E(X²) = E(X³) = 1 (see Cover, Ch 12).

Citations

  1. ^ Williams, D. (2001). Weighing the Odds. Cambridge University Press. pp. 197–199. ISBN 0-521-00618-X.
  2. ^ Bernardo, J.M.; Smith, A.F.M. (2000). Bayesian Theory. Wiley. pp. 209, 366. ISBN 0-471-49464-X.
  3. ^ O'Hagan, A. (1994). Bayesian Inference. Kendall's Advanced Theory of Statistics, Vol. 2B. Edward Arnold. Section 5.40. ISBN 0-340-52922-9.
  4. ^ Botev, Z.I.; Kroese, D.P. (2011). "The generalized cross entropy method, with applications to probability density estimation" (PDF). Methodology and Computing in Applied Probability. 13 (1): 1–27. doi:10.1007/s11009-009-9133-7. S2CID 18155189.
  5. ^ Botev, Z.I.; Kroese, D.P. (2008). "Non-asymptotic bandwidth selection for density estimation of discrete data". Methodology and Computing in Applied Probability. 10 (3): 435. doi:10.1007/s11009-007-9057-zv (inactive 1 July 2025). S2CID 122047337.{{cite journal}}: CS1 maint: DOI inactive as of July 2025 (link)
  6. ^ a b c Lisman, J. H. C.; van Zuylen, M. C. A. (1972). "Note on the generation of most probable frequency distributions". Statistica Neerlandica. 26 (1): 19–23. doi:10.1111/j.1467-9574.1972.tb00152.x.
  7. ^ a b Park, Sung Y.; Bera, Anil K. (2009). "Maximum entropy autoregressive conditional heteroskedasticity model" (PDF). Journal of Econometrics. 150 (2): 219–230. CiteSeerX 10.1.1.511.9750. doi:10.1016/j.jeconom.2008.12.014. Archived from the original (PDF) on 2025-08-06. Retrieved 2025-08-06.
  8. ^ Dowson, D.; Wragg, A. (September 1973). "Maximum-entropy distributions having prescribed first and second moments". IEEE Transactions on Information Theory (correspondence). 19 (5): 689–693. doi:10.1109/tit.1973.1055060. ISSN 0018-9448.
  9. ^ a b Jammalamadaka, S. Rao; SenGupta, A. (2001). Topics in circular statistics. New Jersey: World Scientific. ISBN 978-981-02-3778-3. Retrieved 2025-08-06.
  10. ^ a b Grechuk, Bogdan; Molyboha, Anton; Zabarankin, Michael (2009). "Maximum entropy principle with general deviation measures". Mathematics of Operations Research. 34 (2): 445–467. doi:10.1287/moor.1090.0377 – via researchgate.net.
  11. ^ a b Harremoës, Peter (2001). "Binomial and Poisson distributions as maximum entropy distributions". IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936. S2CID 16171405.
  12. ^ Nielsen, Frank; Nock, Richard (2017). "MaxEnt upper bounds for the differential entropy of univariate continuous distributions". IEEE Signal Processing Letters. 24 (4). IEEE: 402–406. Bibcode:2017ISPL...24..402N. doi:10.1109/LSP.2017.2666792. S2CID 14092514.

References

  • Cover, T. M.; Thomas, J. A. (2006). Elements of Information Theory (2nd ed.). Hoboken, NJ: Wiley-Interscience. Chapter 12, "Maximum Entropy".