Notation in probability and statistics
This content was retrieved from Wikipedia: http://en.wikipedia.org/wiki/Notation_in_probability_and_statistics
Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.
Probability theory
- Random variables are usually written in upper case roman letters: X, Y, etc.
- Particular realizations of a random variable are written in corresponding lower case letters. For example, x_{1}, x_{2}, …, x_{n} could be a sample corresponding to the random variable X. A cumulative probability is formally written P(X ≤ x) to differentiate the random variable from its realization.
- The probability is sometimes written ℙ to distinguish it from other functions and from a generic measure P, which avoids having to state "P is a probability". ℙ(X ∈ A) is short for P({ω ∈ Ω : X(ω) ∈ A}), where Ω is the event space and X(ω) is a random variable. The notation Pr(A) is used as an alternative.
- P(A ∩ B) or P[B ∩ A] indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted P(X, Y), the joint probability mass function or probability density function f(x, y), and the joint cumulative distribution function F(x, y).
- P(A ∪ B) or P[B ∪ A] indicates the probability of either event A or event B occurring ("or" in this case means one or the other or both).
- σ-algebras are usually written with uppercase calligraphic letters (e.g. ℱ for the set of sets on which we define the probability P).
- Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. f(x) or f_{X}(x).
- Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. F(x) or F_{X}(x).
- Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative: F̄(x) = 1 − F(x), or denoted as S(x).
- In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
- Some common operators:
- E[X] : expected value of X
- var[X] : variance of X
- cov[X, Y] : covariance of X and Y
- "X is independent of Y" is often written X ⊥ Y or X ⫫ Y, and "X is independent of Y given W" is often written X ⫫ Y | W or X ⊥ Y | W.
- P(A | B), the conditional probability, is the probability of A given B, i.e., of A after B is observed.
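As an illustration of the event notation above, the following minimal sketch evaluates P(A ∩ B), P(A ∪ B), P(A | B), E[X] and var[X] on a small finite sample space. The two-dice setup, the events A and B, and the helper name `prob` are assumptions chosen for the example and are not part of the conventions themselves.

```python
from fractions import Fraction
from itertools import product

# Sample space Ω: ordered outcomes of two fair six-sided dice (assumed example).
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) under the uniform measure on omega; `event` is a predicate on outcomes."""
    favourable = sum(1 for w in omega if event(w))
    return Fraction(favourable, len(omega))


def A(w):
    """Event A: the first die shows 6."""
    return w[0] == 6


def B(w):
    """Event B: the total of both dice is at least 10."""
    return w[0] + w[1] >= 10


def X(w):
    """Random variable X(ω): the total of both dice."""
    return w[0] + w[1]


p_A = prob(A)
p_B = prob(B)
p_A_and_B = prob(lambda w: A(w) and B(w))  # P(A ∩ B)
p_A_or_B = prob(lambda w: A(w) or B(w))    # P(A ∪ B)
p_A_given_B = p_A_and_B / p_B              # P(A | B) = P(A ∩ B) / P(B)

# E[X] and var[X] computed directly from their definitions.
E_X = sum(Fraction(X(w)) for w in omega) / len(omega)
var_X = sum((X(w) - E_X) ** 2 for w in omega) / len(omega)

print(p_A_and_B, p_A_or_B, p_A_given_B, E_X, var_X)
```

Exact fractions are used so that identities such as P(A ∪ B) = P(A) + P(B) − P(A ∩ B) can be checked without rounding error.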
Statistics
- Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
- A tilde (~) denotes "has the probability distribution of".
- Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., θ̂ is an estimator for θ.
- The arithmetic mean of a series of values x_{1}, x_{2}, ..., x_{n} is often denoted by placing an "overbar" over the symbol, e.g. x̄, pronounced "x bar".
- Some commonly used symbols for sample statistics are given below:
- the sample mean x̄,
- the sample variance s^{2},
- the sample standard deviation s,
- the sample correlation coefficient r,
- the sample cumulants k_{r}.
- Some commonly used symbols for population parameters are given below:
- the population mean μ,
- the population variance σ^{2},
- the population standard deviation σ,
- the population correlation ρ,
- the population cumulants κ_{r}.
- x_{(k)} is used for the k-th order statistic, where x_{(1)} is the sample minimum and x_{(n)} is the sample maximum from a total sample size n.
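The sample statistics listed above can be computed with Python's standard `statistics` module, as in the following sketch. The data values and the helper `order_statistic` are made up for illustration, and the n − 1 divisor for s^{2} is an assumption (it is what `statistics.variance` uses); the source does not fix a divisor.

```python
import math
import statistics

# Hypothetical paired sample (x_1, y_1), ..., (x_n, y_n).
x = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0]
n = len(x)

x_bar = statistics.mean(x)   # sample mean x̄
s2 = statistics.variance(x)  # sample variance s^2 (n − 1 divisor)
s = statistics.stdev(x)      # sample standard deviation s

# Sample correlation coefficient r, from the sample covariance of x and y.
y_bar = statistics.mean(y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
r = s_xy / (s * statistics.stdev(y))


def order_statistic(sample, k):
    """Return x_(k), the k-th smallest value, with k counted from 1."""
    return sorted(sample)[k - 1]


x_min = order_statistic(x, 1)  # x_(1), the sample minimum
x_max = order_statistic(x, n)  # x_(n), the sample maximum

print(x_bar, s2, s, r, x_min, x_max)
```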
Critical values
The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value x_{α} such that F(x_{α}) = 1 − α where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
- z_{α} or z(α) for the standard normal distribution
- t_{α,ν} or t(α,ν) for the t-distribution with ν degrees of freedom
- χ^{2}_{α,ν} or χ^{2}(α,ν) for the chi-squared distribution with ν degrees of freedom
- F_{α,ν_{1},ν_{2}} or F(α,ν_{1},ν_{2}) for the F-distribution with ν_{1} and ν_{2} degrees of freedom
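For the standard normal case, the definition F(x_{α}) = 1 − α can be evaluated with `statistics.NormalDist` from the Python standard library, as sketched below. The function name `z_upper` is an assumption made for this example. The t, chi-squared and F critical values are not covered, because the standard library has no inverse cdf for those distributions; a third-party statistics library would be needed for them.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_upper(alpha):
    """Upper critical value z_α, i.e. the z with Φ(z) = 1 − α."""
    return std_normal.inv_cdf(1 - alpha)


z_05 = z_upper(0.05)    # roughly 1.645
z_025 = z_upper(0.025)  # roughly 1.960

# Check the defining property: the value is exceeded with probability α.
exceed_05 = 1 - std_normal.cdf(z_05)

print(round(z_05, 4), round(z_025, 4), round(exceed_05, 4))
```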
Linear algebra
- Matrices are usually denoted by boldface capital letters, e.g. A.
- Column vectors are usually denoted by boldface lowercase letters, e.g. x.
- The transpose operator is denoted by either a superscript T (e.g. A^{T}) or a prime symbol (e.g. A′).
- A row vector is written as the transpose of a column vector, e.g. x^{T} or x′.
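The column-vector and transpose conventions above can be mirrored in code. The sketch below stores matrices as plain lists of rows, which is a choice made for this example; the helper names `transpose` and `matmul` are likewise assumptions, not standard functions.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_columns = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in b_columns]
            for row in a]


A = [[1, 2, 3],
     [4, 5, 6]]        # a 2×3 matrix A
x = [[1], [2], [3]]    # a column vector x, stored as a 3×1 matrix

A_T = transpose(A)      # A^T, a 3×2 matrix
x_T = transpose(x)      # x^T, a row vector (1×3)
Ax = matmul(A, x)       # the product A x, a 2×1 column vector
x_T_x = matmul(x_T, x)  # x^T x, a 1×1 matrix holding the squared length of x

print(A_T, x_T, Ax, x_T_x)
```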
Abbreviations
Common abbreviations include:
- a.e. almost everywhere
- a.s. almost surely
- cdf cumulative distribution function
- cmf cumulative mass function
- df degrees of freedom (also ν)
- i.i.d. independent and identically distributed
- pdf probability density function
- pmf probability mass function
- r.v. random variable
- w.p. with probability; wp1 with probability 1
See also
- Glossary of probability and statistics
- Combinations and permutations
- Typographical conventions in mathematical formulae
- History of mathematical notation
References
- Halperin, Max; Hartley, H. O.; Hoel, P. G. (1965), "Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation", The American Statistician, 19 (3): 12–14, doi:10.2307/2681417, JSTOR 2681417
External links
- Earliest Uses of Symbols in Probability and Statistics, maintained by Jeff Miller.
This page is based on the copyrighted Wikipedia article "Notation in probability and statistics"; it is used under the Creative Commons
Attribution-ShareAlike 3.0 Unported License (CC-BY-SA). You may
redistribute it, verbatim or modified, providing that you comply with
the terms of the CC-BY-SA