From Wikipedia, the free encyclopedia

A normal Q–Q plot of randomly generated, independent standard exponential data, (X ~ Exp(1)). This Q–Q plot compares a sample of data on the vertical axis to a statistical population on the horizontal axis. The points follow a strongly nonlinear pattern, suggesting that the data are not distributed as a standard normal (X ~ N(0,1)). The offset between the line and the points suggests that the mean of the data is not 0. The median of the points can be determined to be near 0.7.
A normal Q–Q plot comparing randomly generated, independent standard normal data on the vertical axis to a standard normal population on the horizontal axis. The linearity of the points suggests that the data are normally distributed.
A Q–Q plot of a sample of data versus a Weibull distribution. The deciles of the distributions are shown in red. Three outliers are evident at the high end of the range. Otherwise, the data fit the Weibull(1,2) model well.
A Q–Q plot comparing the distributions of standardized daily maximum temperatures at 25 stations in the US state of Ohio in March and in July. The curved pattern suggests that the central quantiles are more closely spaced in July than in March, and that the July distribution is skewed to the left compared to the March distribution. The data cover the period 1893–2001.

In statistics, a Q–Q plot (quantile–quantile plot) is a probability plot, a graphical method for comparing two probability distributions by plotting their quantiles against each other.[1] A point (x, y) on the plot corresponds to one of the quantiles of the second distribution (y-coordinate) plotted against the same quantile of the first distribution (x-coordinate). This defines a parametric curve where the parameter is the index of the quantile interval.

If the two distributions being compared are similar, the points in the Q–Q plot will approximately lie on the identity line y = x. If the distributions are linearly related, the points in the Q–Q plot will approximately lie on a line, but not necessarily on the line y = x. Q–Q plots can also be used as a graphical means of estimating parameters in a location-scale family of distributions.

A Q–Q plot is used to compare the shapes of distributions, providing a graphical view of how properties such as location, scale, and skewness are similar or different in the two distributions. Q–Q plots can be used to compare collections of data, or theoretical distributions. The use of Q–Q plots to compare two samples of data can be viewed as a non-parametric approach to comparing their underlying distributions. A Q–Q plot is generally more diagnostic than comparing the samples' histograms, but is less widely known. Q–Q plots are commonly used to compare a data set to a theoretical model.[2][3] This can provide an assessment of goodness of fit that is graphical, rather than reducing to a numerical summary statistic. Q–Q plots are also used to compare two theoretical distributions to each other.[4] Since Q–Q plots compare distributions, there is no need for the values to be observed as pairs, as in a scatter plot, or even for the numbers of values in the two groups being compared to be equal.

The term "probability plot" sometimes refers specifically to a Q–Q plot, sometimes to a more general class of plots, and sometimes to the less commonly used P–P plot. The probability plot correlation coefficient (PPCC) is a quantity derived from the idea of Q–Q plots; it measures the agreement of a fitted distribution with observed data and is sometimes used as a means of fitting a distribution to data.

Definition and construction

Q–Q plot for first opening/final closing dates of Washington State Route 20, versus a normal distribution.[5] Outliers are visible in the upper right corner.

A Q–Q plot is a plot of the quantiles of two distributions against each other, or a plot based on estimates of the quantiles. The pattern of points in the plot is used to compare the two distributions.

The main step in constructing a Q–Q plot is calculating or estimating the quantiles to be plotted. If one or both of the axes in a Q–Q plot is based on a theoretical distribution with a continuous cumulative distribution function (CDF), all quantiles are uniquely defined and can be obtained by inverting the CDF. If a theoretical probability distribution with a discontinuous CDF is one of the two distributions being compared, some of the quantiles may not be defined, so an interpolated quantile may be plotted. If the Q–Q plot is based on data, there are multiple quantile estimators in use. Rules for forming Q–Q plots when quantiles must be estimated or interpolated are called plotting positions.

A simple case is where one has two data sets of the same size. In that case, to make the Q–Q plot, one orders each set in increasing order, then pairs off and plots the corresponding values. A more complicated construction is the case where two data sets of different sizes are being compared. To construct the Q–Q plot in this case, it is necessary to use an interpolated quantile estimate so that quantiles corresponding to the same underlying probability can be constructed.
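The two constructions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `empirical_quantile` and `qq_points` are hypothetical, and the choice of linear interpolation at equally spaced probabilities is one assumption among several possible quantile estimators.

```python
from typing import List, Sequence, Tuple


def empirical_quantile(sorted_data: Sequence[float], p: float) -> float:
    """Linearly interpolated quantile of already-sorted data, for 0 <= p <= 1."""
    if not sorted_data:
        raise ValueError("empty data")
    pos = p * (len(sorted_data) - 1)          # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac


def qq_points(xs: Sequence[float], ys: Sequence[float]) -> List[Tuple[float, float]]:
    """Return the (x, y) pairs of a two-sample Q-Q plot."""
    sx, sy = sorted(xs), sorted(ys)
    n = min(len(sx), len(sy))
    if n == 0:
        raise ValueError("both samples must be non-empty")
    if len(sx) == len(sy):
        # Same size: pair off the ordered values directly.
        return list(zip(sx, sy))
    # Different sizes: evaluate both samples at the same probabilities.
    # With this spacing the smaller sample contributes its own sorted values
    # and the larger sample is interpolated.
    probs = [0.5] if n == 1 else [k / (n - 1) for k in range(n)]
    return [(empirical_quantile(sx, p), empirical_quantile(sy, p)) for p in probs]
```

For equal-sized samples the function only sorts and zips; interpolation is needed only when the sizes differ.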

More abstractly,[4] given two cumulative probability distribution functions F and G, with associated quantile functions F⁻¹ and G⁻¹ (the inverse function of the CDF is the quantile function), the Q–Q plot draws the q-th quantile of F against the q-th quantile of G for a range of values of q. Thus, the Q–Q plot is a parametric curve indexed over [0,1] with values in the real plane ℝ².
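As a small illustration of this parametric view, the sketch below evaluates the curve q ↦ (F⁻¹(q), G⁻¹(q)) for two normal distributions using `statistics.NormalDist` from the Python standard library. The helper name `theoretical_qq` and the particular choice of N(0, 1) against N(1, 2²) are assumptions made for the example.

```python
from statistics import NormalDist
from typing import List, Tuple


def theoretical_qq(f: NormalDist, g: NormalDist, num: int = 9) -> List[Tuple[float, float]]:
    """Points (F^-1(q), G^-1(q)) for q = 1/(num+1), ..., num/(num+1)."""
    qs = [i / (num + 1) for i in range(1, num + 1)]   # stay strictly inside (0, 1)
    return [(f.inv_cdf(q), g.inv_cdf(q)) for q in qs]


standard = NormalDist(mu=0.0, sigma=1.0)
shifted_scaled = NormalDist(mu=1.0, sigma=2.0)
points = theoretical_qq(standard, shifted_scaled)

# The two distributions are linearly related, so every point lies on y = 1 + 2x.
for x, y in points:
    print(f"{x: .4f}  {y: .4f}")
```

Because the second distribution is a shifted and scaled copy of the first, the points fall on a straight line that is not the identity line.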

Typically, for an analysis of normality, the vertical axis shows the values of the variable of interest, say x with CDF F(x), and the horizontal axis represents N⁻¹(F(x)), where N⁻¹(·) represents the inverse cumulative normal distribution function.

Interpretation

The points plotted in a Q–Q plot are always non-decreasing when viewed from left to right. If the two distributions being compared are identical, the Q–Q plot follows the 45° line y = x. If the two distributions agree after linearly transforming the values in one of the distributions, then the Q–Q plot follows some line, but not necessarily the line y = x. If the general trend of the Q–Q plot is flatter than the line y = x, the distribution plotted on the horizontal axis is more dispersed than the distribution plotted on the vertical axis. Conversely, if the general trend of the Q–Q plot is steeper than the line y = x, the distribution plotted on the vertical axis is more dispersed than the distribution plotted on the horizontal axis. Q–Q plots are often arced, or S-shaped, indicating that one of the distributions is more skewed than the other, or that one of the distributions has heavier tails than the other.

Although a Q–Q plot is based on quantiles, in a standard Q–Q plot it is not possible to determine which point in the Q–Q plot determines a given quantile. For example, it is not possible to determine the median of either of the two distributions being compared by inspecting the Q–Q plot. Some Q–Q plots indicate the deciles to make determinations such as this possible.

The intercept and slope of a linear regression between the quantiles give a measure of the relative location and relative scale of the samples. If the median of the distribution plotted on the horizontal axis is 0, the intercept of a regression line is a measure of location, and the slope is a measure of scale. The distance between medians is another measure of relative location reflected in a Q–Q plot. The "probability plot correlation coefficient" (PPCC) is the correlation coefficient between the paired sample quantiles. The closer the correlation coefficient is to one, the closer the distributions are to being shifted, scaled versions of each other. For distributions with a single shape parameter, the probability plot correlation coefficient plot (PPCC plot) provides a method for estimating the shape parameter: one computes the correlation coefficient for different values of the shape parameter and uses the value that gives the best fit, just as if one were comparing distributions of different types.
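The regression and correlation summaries described above can be computed directly from the paired quantiles. The following sketch uses ordinary least squares and the Pearson correlation; the function name `fit_line_and_ppcc` is hypothetical, and the demonstration data are constructed (not observed) so that the expected answer is known.

```python
from math import sqrt
from statistics import NormalDist
from typing import Sequence, Tuple


def fit_line_and_ppcc(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Return (intercept, slope, correlation) for paired quantiles (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("quantiles must not be constant")
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope, sxy / sqrt(sxx * syy)


# Constructed example: vertical quantiles are exactly 10 + 3 * (standard normal quantiles).
n = 20
theoretical = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
observed = [10 + 3 * z for z in theoretical]
intercept, slope, r = fit_line_and_ppcc(theoretical, observed)
```

Here the horizontal distribution has median 0, so the intercept recovers the location (10) and the slope recovers the scale (3), while the correlation is 1 up to rounding because the relationship is exactly linear.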

Another common use of Q–Q plots is to compare the distribution of a sample to a theoretical distribution, such as the standard normal distribution N(0,1), as in a normal probability plot. As in the case when comparing two samples of data, one orders the data (formally, computes the order statistics), then plots them against certain quantiles of the theoretical distribution.[3]
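A minimal sketch of this sample-versus-theory construction for the standard normal case follows. It assumes the k / (n + 1) plotting positions, which is only one of several conventions, and the function name `normal_qq_points` is hypothetical.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple


def normal_qq_points(data: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair each order statistic with a standard normal quantile.

    Returns (theoretical quantile, ordered sample value) pairs, using the
    plotting positions k / (n + 1) for k = 1, ..., n (an assumed convention).
    """
    n = len(data)
    ordered = sorted(data)                    # the order statistics
    std = NormalDist()                        # N(0, 1)
    return [(std.inv_cdf(k / (n + 1)), x) for k, x in enumerate(ordered, start=1)]


pts = normal_qq_points([2.1, -0.3, 0.8])
for theoretical, observed in pts:
    print(f"{theoretical: .4f}  {observed: .1f}")
```

The horizontal coordinates depend only on the sample size; the data enter only through their sorted values on the vertical axis.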

Plotting positions

The choice of quantiles from a theoretical distribution can depend upon context and purpose. One choice, given a sample of size n, is k / n for k = 1, …, n, as these are the quantiles that the sampling distribution realizes. The last of these, n / n, corresponds to the 100th percentile – the maximum value of the theoretical distribution, which is sometimes infinite. Other choices are the use of (k − 0.5) / n, or instead to space the n points such that there is an equal distance between all of them and also between the two outermost points and the edges of the interval, using k / (n + 1).[6]
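The three choices mentioned above differ only in how the probabilities are spaced inside [0, 1]. The sketch below lists them side by side; the function names are hypothetical labels chosen for this example.

```python
from typing import List


def positions_k_over_n(n: int) -> List[float]:
    """k / n: the last position is 1, i.e. the 100th percentile."""
    return [k / n for k in range(1, n + 1)]


def positions_midpoint(n: int) -> List[float]:
    """(k - 0.5) / n: the midpoint of each of n equal-width intervals."""
    return [(k - 0.5) / n for k in range(1, n + 1)]


def positions_equal_gaps(n: int) -> List[float]:
    """k / (n + 1): equal gaps between points and to both edges of [0, 1]."""
    return [k / (n + 1) for k in range(1, n + 1)]


for name, fn in [("k/n", positions_k_over_n),
                 ("(k-0.5)/n", positions_midpoint),
                 ("k/(n+1)", positions_equal_gaps)]:
    print(name, fn(4))
```

Only the first rule places a point at probability 1, which is why it can require an infinite theoretical quantile; the other two keep every position strictly inside (0, 1).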

Many other choices have been suggested, both formal and heuristic, based on theory or simulations relevant in context. The following subsections discuss some of these. A narrower question is choosing a maximum (estimation of a population maximum), known as the German tank problem, for which similar "sample maximum, plus a gap" solutions exist, most simply m + m/n − 1. A more formal application of this uniformization of spacing occurs in maximum spacing estimation of parameters.
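As a worked instance of the "sample maximum, plus a gap" estimate m + m/n − 1, the short sketch below treats m as the largest observed value and n as the sample size. The function name and the sample values are made up for illustration.

```python
from typing import Sequence


def estimate_population_maximum(sample: Sequence[int]) -> float:
    """Sample maximum plus the average gap: m + m/n - 1."""
    if not sample:
        raise ValueError("empty sample")
    m = max(sample)
    n = len(sample)
    return m + m / n - 1


# Hypothetical serial numbers: m = 60, n = 4, so the estimate is 60 + 15 - 1.
estimate = estimate_population_maximum([19, 40, 42, 60])
print(estimate)  # 74.0
```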

Expected value of the order statistic for a uniform distribution

The k / (n + 1) approach equals that of plotting the points according to the probability that the last of (n + 1) randomly drawn values will not exceed the k-th smallest of the first n randomly drawn values.[7][8]
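The probability statement above can be checked numerically. The following Monte Carlo sketch draws n + 1 uniform values per trial and counts how often the last one does not exceed the k-th smallest of the first n. The function name, trial count, and seed are arbitrary assumptions, and the result is a simulation estimate rather than an exact value.

```python
import random


def prob_last_not_exceeding_kth(k: int, n: int, trials: int = 40000, seed: int = 0) -> float:
    """Estimate P(last of n+1 draws <= k-th smallest of the first n draws)."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        first_n = sorted(rng.random() for _ in range(n))
        last = rng.random()
        if last <= first_n[k - 1]:            # k-th smallest has index k - 1
            hits += 1
    return hits / trials


n = 4
for k in range(1, n + 1):
    print(k, round(prob_last_not_exceeding_kth(k, n), 3), k / (n + 1))
```

The estimated frequencies should lie close to k / (n + 1), up to simulation noise.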

Expected value of the order statistic for a standard normal distribution

In using a normal probability plot, the quantiles one uses are the rankits, the quantile of the expected value of the order statistic of a standard normal distribution.

More generally, the Shapiro–Wilk test uses the expected values of the order statistics of the given distribution; the resulting plot and line yield the generalized least squares estimate for location and scale (from the intercept and slope of the fitted line).[9] Although this is not too important for the normal distribution (the location and scale are estimated by the mean and standard deviation, respectively), it can be useful for many other distributions.

However, this requires calculating the expected values of the order statistic, which may be difficult if the distribution is not normal.
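For the standard normal case, the expected value of the k-th order statistic can be approximated numerically from the usual order-statistic density, n! / ((k − 1)! (n − k)!) · Φ(x)^(k−1) · (1 − Φ(x))^(n−k) · φ(x). The sketch below integrates x times that density with the trapezoidal rule; the function name `rankit`, the integration limits, and the step count are assumptions chosen for illustration, not a recommended numerical method.

```python
from math import comb
from statistics import NormalDist


def rankit(k: int, n: int, steps: int = 4000, limit: float = 8.0) -> float:
    """Approximate E[X_(k)] for a standard normal sample of size n."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    std = NormalDist()
    coef = k * comb(n, k)                     # equals n! / ((k-1)! (n-k)!)
    h = 2 * limit / steps
    total = 0.0
    for i in range(steps + 1):
        x = -limit + i * h
        c = std.cdf(x)
        density = coef * c ** (k - 1) * (1 - c) ** (n - k) * std.pdf(x)
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end weights
        total += weight * x * density
    return total * h


# For n = 2 the exact values are -1/sqrt(pi) and +1/sqrt(pi).
print(rankit(1, 2), rankit(2, 2))
```

Even in this simple case an integral must be evaluated for every k, which illustrates why approximations to these expected values are popular in practice.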

Median of the order statistics

Alternatively, one may use estimates of the median of the order statistics, which one can compute based on estimates of the median of the order statistics of a uniform distribution and the quantile function of the distribution; this was suggested by Filliben (1975).[9]

This can be easily generated for any distribution for which the quantile function can be computed. However, the resulting estimates of location and scale are no longer precisely the least squares estimates, though they differ significantly only for small n.

Heuristics

Several different formulas have been used or proposed as affine symmetrical plotting positions. Such formulas have the form (k − a) / (n + 1 − 2a) for some value of a in the range from 0 to 1, which gives a range between k / (n + 1) and (k − 1) / (n − 1).

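Expressions include:

  - k / (n + 1)
  - (k − 0.3) / (n + 0.4)[10]
  - (k − 0.3175) / (n + 0.365)[11][note 1]
  - (k − 0.326) / (n + 0.348)[12]
  - (k − 1/3) / (n + 1/3)[note 2]
  - (k − 0.375) / (n + 0.25)[note 3]
  - (k − 0.44) / (n + 0.12)[note 4]
  - (k − 0.5) / n[14]
  - (k − 1) / (n − 1)[note 5]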

For large sample size, n, there is little difference between these various expressions.
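The family (k − a) / (n + 1 − 2a) is easy to parameterize in code. The sketch below is a minimal illustration with a hypothetical function name; it also shows the symmetry property that the positions of the k-th smallest and k-th largest points sum to 1.

```python
from typing import List


def plotting_positions(n: int, a: float) -> List[float]:
    """Affine symmetrical plotting positions (k - a) / (n + 1 - 2a), k = 1..n."""
    if not 0 <= a <= 1:
        raise ValueError("a must lie in [0, 1]")
    denom = n + 1 - 2 * a
    if n < 1 or denom <= 0:
        raise ValueError("need n >= 1 and n + 1 - 2a > 0")
    return [(k - a) / denom for k in range(1, n + 1)]


print(plotting_positions(4, 0))     # a = 0   gives k / (n + 1)
print(plotting_positions(4, 0.5))   # a = 0.5 gives (k - 0.5) / n
print(plotting_positions(4, 1))     # a = 1   gives (k - 1) / (n - 1)

# Symmetry: p_k + p_(n+1-k) = 1 for any admissible a.
p = plotting_positions(7, 0.375)
print([round(p[i] + p[-1 - i], 12) for i in range(len(p))])
```

Because n + 1 − 2a grows with n while a stays bounded, the positions for different values of a converge as the sample size increases.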

Filliben's estimate

The order statistic medians are the medians of the order statistics of the distribution. These can be expressed in terms of the quantile function and the order statistic medians for the continuous uniform distribution by:

    N(i) = G(U(i))

where N(i) is the i-th order statistic median of the desired distribution, U(i) are the uniform order statistic medians, and G is the quantile function for the desired distribution. The quantile function is the inverse of the cumulative distribution function (the probability that X is less than or equal to some value). That is, given a probability, we want the corresponding quantile of the cumulative distribution function.

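James J. Filliben uses the following estimates for the uniform order statistic medians:[15]

    m(i) = 1 − 0.5^(1/n)                 for i = 1
    m(i) = (i − 0.3175) / (n + 0.365)    for i = 2, 3, …, n − 1
    m(i) = 0.5^(1/n)                     for i = n

The reason for this estimate is that the order statistic medians do not have a simple form.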
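A minimal sketch of this estimate follows, assuming the commonly quoted form of Filliben's approximation: 0.5^(1/n) for the largest point, one minus that value for the smallest, and (i − 0.3175) / (n + 0.365) in between. The function names are hypothetical, and the normal distribution is used only as an example of a quantile function G.

```python
from statistics import NormalDist
from typing import List


def filliben_uniform_medians(n: int) -> List[float]:
    """Filliben's estimates m(1), ..., m(n) of the uniform order statistic medians."""
    if n < 1:
        raise ValueError("n must be at least 1")
    m = [0.0] * n
    m[-1] = 0.5 ** (1.0 / n)                  # i = n
    m[0] = 1.0 - m[-1]                        # i = 1 (for n = 1 both equal 0.5)
    for i in range(2, n):                     # i = 2, ..., n - 1
        m[i - 1] = (i - 0.3175) / (n + 0.365)
    return m


def normal_order_statistic_medians(n: int) -> List[float]:
    """Apply the standard normal quantile function G to each uniform median."""
    std = NormalDist()
    return [std.inv_cdf(u) for u in filliben_uniform_medians(n)]


print(filliben_uniform_medians(5))
print(normal_order_statistic_medians(5))
```

Swapping `NormalDist().inv_cdf` for another quantile function gives the corresponding estimates for that distribution, which is what makes this approach convenient outside the normal case.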

Software

The R programming language comes with functions to make Q–Q plots, namely qqnorm and qqplot from the stats package. The fastqq package implements faster plotting for large numbers of data points.

Notes

  1. ^ Note that this also uses a different expression for the first & last points. [1] cites the original work by Filliben (1975). This expression is an estimate of the medians of U(k).
  2. ^ A simple (and easy to remember) formula for plotting positions; used in BMDP statistical package.
  3. ^ This is Blom (1958)'s earlier approximation and is the expression used in MINITAB.
  4. ^ This plotting position was used by Irving I. Gringorten[13] to plot points in tests for the Gumbel distribution.
  5. ^ Used by Filliben (1975), these plotting points are equal to the modes of U(k).

References

Citations

  1. ^ Wilk, M.B.; Gnanadesikan, R. (1968), "Probability plotting methods for the analysis of data", Biometrika, 55 (1), Biometrika Trust: 1–17, doi:10.1093/biomet/55.1.1, JSTOR 2334448, PMID 5661047.
  2. ^ Gnanadesikan (1977), p. 199.
  3. ^ a b Thode (2002), Section 2.2.2, Quantile-Quantile Plots, p. 21
  4. ^ a b Gibbons & Chakraborti (2003), p. 144
  5. ^ "SR 20 – North Cascades Highway – Opening and Closing History". North Cascades Passes. Washington State Department of Transportation. October 2009. Retrieved 8 February 2009.
  6. ^ Weibull, Waloddi (1939), "The Statistical Theory of the Strength of Materials", IVA Handlingar, Royal Swedish Academy of Engineering Sciences (151)
  7. ^ Madsen, H.O.; et al. (1986), Methods of Structural Safety
  8. ^ Makkonen, L. (2008), "Bringing closure to the plotting position controversy", Communications in Statistics – Theory and Methods, 37 (3): 460–467, doi:10.1080/03610920701653094, S2CID 122822135
  9. ^ a b Thode, Henry C. (2002), Testing for Normality, CRC Press, p. 31, ISBN 978-0-8247-9613-6
  10. ^ Benard, A.; Bos-Levenbach, E. C. (September 1953). "The plotting of observations on probability paper". Statistica Neerlandica (in Dutch). 7: 163–173. doi:10.1111/j.1467-9574.1953.tb00821.x.
  11. ^ "1.3.3.21. Normal Probability Plot". itl.nist.gov. Retrieved 16 February 2022.
  12. ^ Distribution free plotting position, Yu & Huang
  13. ^ Gringorten, Irving I. (1963). "A plotting rule for extreme probability paper". Journal of Geophysical Research. 68 (3): 813–814. Bibcode:1963JGR....68..813G. doi:10.1029/JZ068i003p00813. ISSN 2156-2202.
  14. ^ Hazen, Allen (1914), "Storage to be provided in the impounding reservoirs for municipal water supply", Transactions of the American Society of Civil Engineers (77): 1547–1550
  15. ^ Filliben (1975).
