Models:
- Bernoulli: distribution of a single experiment with a binary outcome (happened / did not happen). The parameter $p$ can be read as its expectation, i.e. the probability that the event happens.
- Binomial: number of times the event happens in $n$ i.i.d. Bernoulli experiments, parameterized by $n$ and $p$, the probability of happening in each experiment. Its formula is formed by summing over all arrangements in which exactly the given number of experiments come out as "happened".
- Poisson: if the probability is evenly distributed over a continuous space, and the rate of arrival (expected number of events) in that space is $\lambda$, then split the space into $n$ equal subspaces and test each one: the probability of the event happening in a subspace is $\lambda / n$, and the number of events in the whole space is the limit of the Binomial distribution as $n \to \infty$.
- Exponential: distribution of the time until the first event. If the rate of arrival per time unit is $\lambda$, then the rate of arrival over a span of $t$ time units is $\lambda t$, and the probability that nothing happens for $t$ time units is calculated by plugging $k = 0$ into the Poisson PMF. Transform (take the complement to get the CDF), take the derivative, and boom, we get the PDF of the exponential distribution.
- Gamma: extension of the Exponential distribution: distribution of the time until the $n$-th event happens.
Bernoulli Distribution
Probability for a single trial: $P(X = 1) = p$, $P(X = 0) = 1 - p$.
Binomial Distribution
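With probability $p$ in each trial, the probability of $k$ occurrences in $n$ trials is
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$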
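Why is it called the binomial distribution?
The PMF values for $k = 0, \dots, n$ are exactly the terms of the binomial expansion of $(p + (1 - p))^n$.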
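A minimal sketch for checking the binomial PMF $\binom{n}{k} p^k (1-p)^{n-k}$ numerically. The function name `binomial_pmf` and the values of `n` and `p` are chosen here for illustration only; they are not from the notes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): choose which k of the n trials succeed."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative parameters (assumed, not from the notes).
n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The PMF values are the terms of the expansion of (p + (1 - p))^n, so they sum to 1.
total = sum(pmf)
# The mean of n Bernoulli(p) trials is n * p.
mean = sum(k * q for k, q in enumerate(pmf))
print(total, mean)
```

The sum over all `k` should come out at 1 (up to floating-point error), and the mean at `n * p`.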
Poisson Process
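A random process $\{N(t), t \ge 0\}$ is called a Poisson process when:
- Starts from zero:
  - the number of events at time zero is 0: $N(0) = 0$
- Memoryless:
  - no matter where the timing starts, the distribution is the same: $N(s + t) - N(s)$ has the same distribution as $N(t)$
- Independence: events do not interfere with each other
- Sparsity: the probability of two events happening at the same instant is almost 0
  - I think it could be written as: $P(N(t + h) - N(t) \ge 2) = o(h)$ as $h \to 0$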
Notation Misuse
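$N(t)$ can be seen as a random variable: the number of events that happen within time $t$. Here $t$ is a duration, not an instant; but when time is taken to start from zero by default, $t$ can also stand for the instant at the end of that duration.
Poisson Process: the basic model for random events occurring in continuous time
Rate $\lambda$: the rate of occurrence per unit time
- Counting view: distribution of the number of events per unit time (discrete), the Poisson distribution
- Interval view: the time between two adjacent events (continuous), the exponential distribution
- Waiting view: the total time from time zero until the $n$-th event happens (continuous), the Gamma distribution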
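The three views can be compared on one simulated process. This is a sketch under an assumption: it builds the process from exponential inter-arrival gaps (the standard construction, matching the exponential result derived later in these notes). The helper `simulate_arrivals`, the seed, and all parameter values are hypothetical choices for illustration.

```python
import random

def simulate_arrivals(lam, horizon, rng):
    """Arrival times in [0, horizon), built by accumulating exponential gaps."""
    times = []
    t = rng.expovariate(lam)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times

rng = random.Random(0)          # fixed seed so the run is repeatable
lam, horizon = 3.0, 20_000.0    # assumed rate and simulated duration
arrivals = simulate_arrivals(lam, horizon, rng)

# Counting view: number of events in each unit-length interval (Poisson, mean lam).
counts = [0] * int(horizon)
for t in arrivals:
    counts[int(t)] += 1
mean_count = sum(counts) / len(counts)

# Interval view: time between adjacent events (exponential, mean 1 / lam).
gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
mean_gap = sum(gaps) / len(gaps)

# Waiting view: time spanned by n consecutive events (Gamma, mean n / lam).
n = 5
waits = [arrivals[i + n] - arrivals[i] for i in range(0, len(arrivals) - n, n)]
mean_wait = sum(waits) / len(waits)

print(mean_count, mean_gap, mean_wait)
```

With these settings the three sample means should land close to `lam`, `1 / lam`, and `n / lam` respectively.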
Poisson Distribution
How do we calculate the probability of a given number of events in a time interval, or formally, how do we calculate $P(N(t) = k)$?
Define: $\lambda = E[N(1)]$, and we want: $P(N(1) = k)$.
Approach: use binomial distribution.
Poisson Distribution: Deduction
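The Binomial distribution describes the probability of the event happening $k$ times out of $n$ tests. If we divide a space (for example, the unit time interval we are interested in now) into $n$ small subspaces, check whether the event happens in each one, and take the limit $n \to \infty$, we get the probability of $k$ events happening in that continuous space.
As we divide the time unit into $n$ small, equal subspaces, as $n \to \infty$ every variable $N_i\left( \frac{1}{n} \right)$ (the count in the $i$-th subspace) is Bernoulli distributed, as defined by the Poisson process:
$$P\left( N_i\left( \frac{1}{n} \right) = 1 \right) = p, \quad P\left( N_i\left( \frac{1}{n} \right) = 0 \right) = 1 - p$$
and thus the event "happening $k$ times in a time unit" becomes "happening $k$ times in $n$ tests, $n \to \infty$", so the probability can be calculated with the PMF of the Binomial distribution.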
$$\begin{align} P(N(1) = k) & = P\left( \lim_{ n \to \infty } \sum_{i=1}^{n} N_{i}\left( \frac{1}{n} \right) = k \right) \\ & = P\left\{ \text{$k$ out of $n$ i.i.d. tests succeed} \right\} \implies \text{use Binomial PMF} \end{align}$$
But what is $p$?
As all "test results" are i.i.d. and Bernoulli distributed, $N_i\left( \frac{1}{n} \right) \sim \text{Bernoulli}(p)$, so $E\left[ N_{i}\left( \frac{1}{n} \right) \right] = p$, and
$$\begin{align} E\left[ N(1) \right] & = E\left[ \sum_{i=1}^{n} N_{i}\left( \frac{1}{n} \right) \right] \\ & = nE\left[ N_{i}\left( \frac{1}{n} \right) \right] \\ & = np \end{align}$$
Let's say $\lambda$ is the expectation of $N(1)$, then $p = \frac{\lambda}{n}$.
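so:
$$\begin{align} P(N(1) = k) & = \lim_{n \to \infty} \binom{n}{k} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n-k} \\ & = \lim_{n \to \infty} \frac{n(n-1)\cdots(n-k+1)}{n^k} \cdot \frac{\lambda^k}{k!} \cdot \left( 1 - \frac{\lambda}{n} \right)^{n} \left( 1 - \frac{\lambda}{n} \right)^{-k} \\ & = \frac{\lambda^k e^{-\lambda}}{k!} \end{align}$$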
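The limit of the Binomial PMF with $p = \lambda / n$ can be checked numerically. A small sketch, where the function names and the values of `lam` and `k` are assumptions made for illustration:

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(k successes in n i.i.d. Bernoulli(p) tests)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(N(1) = k) for arrival rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam, k = 2.0, 3  # assumed example values

# Split the unit interval into n subspaces, each with success probability lam / n,
# and record how far the Binomial PMF is from the Poisson PMF.
errors = {
    n: abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
    for n in (10, 100, 1000, 10000)
}
for n, err in errors.items():
    print(n, err)
```

The gap should shrink as `n` grows, roughly in proportion to `1 / n`.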
Poisson distribution: Properties
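By the definition of $\lambda$, the expectation of the Poisson distribution should be $\lambda$:
$$E[N(1)] = \sum_{k=0}^{\infty} k \, \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda$$
The variance of the Poisson distribution is also $\lambda$:
$$\operatorname{Var}[N(1)] = E\left[ N(1)^2 \right] - E[N(1)]^2 = (\lambda^2 + \lambda) - \lambda^2 = \lambda$$
and $\lambda$ is called the "arrival rate".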
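As a sanity check, the mean and variance can be computed from the Poisson PMF $\lambda^k e^{-\lambda} / k!$ by truncating the series. This is only a sketch: the cut-off at 100 terms and the value of `lam` are assumptions that work for small rates.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(N(1) = k) for arrival rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 4.0          # assumed arrival rate
ks = range(100)    # truncated support; the tail beyond 100 is negligible for lam = 4

mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
print(mean, var)
```

Both values should agree with `lam` up to floating-point error.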
Exponential Distribution
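What is the probability of a given wait time? Or, since wait time is a continuous variable, what is the probability that the wait time is greater than some threshold $t$? Formally: define $T$ as the wait time, what is $P(T > t)$?
If it's a Poisson process, then this is basically saying that you wait $t$ and nothing happens, which is the event $N(t) = 0$. Since $N(t)$ is Poisson distributed with mean $\lambda t$:
$$P(T > t) = P(N(t) = 0) = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t}$$
And the probability has no memory:
$$P(T > s + t \mid T > s) = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(T > t)$$
It's straightforward to get the CDF and PDF of $T$:
CDF: $F(t) = P(T \le t) = 1 - e^{-\lambda t}$
PDF: $f(t) = \frac{dF(t)}{dt} = \lambda e^{-\lambda t}$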
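A short numerical sketch of the two facts used here, assuming the survival function $P(T > t) = e^{-\lambda t}$: the memoryless property, and the PDF being the derivative of the CDF. The function names and the values of `lam`, `s`, `t`, and `h` are illustrative assumptions.

```python
from math import exp

def survival(t, lam):
    """P(T > t) = P(N(t) = 0): nothing happens for t time units."""
    return exp(-lam * t)

def cdf(t, lam):
    """P(T <= t), the complement of the survival function."""
    return 1 - survival(t, lam)

def pdf(t, lam):
    """Derivative of the CDF with respect to t."""
    return lam * exp(-lam * t)

lam, s, t = 1.5, 0.7, 1.2  # assumed example values

# Memoryless: having already waited s does not change the chance of waiting t more.
conditional = survival(s + t, lam) / survival(s, lam)

# PDF as the derivative of the CDF, approximated by a central difference.
h = 1e-6
numeric_pdf = (cdf(t + h, lam) - cdf(t - h, lam)) / (2 * h)

print(conditional, survival(t, lam))
print(numeric_pdf, pdf(t, lam))
```

Each printed pair should match closely, the first up to rounding and the second up to the finite-difference error.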
Pick the right tool
These functions (PDF, CDF) are just different views of how a random variable is distributed. Pick the right tool for the problem you are currently tackling.
Gamma Distribution
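If a variable $T_n$ describes the time it takes for $n$ events of a Poisson process to happen (equivalently, the sum of $n$ i.i.d. exponential wait times), it follows a Gamma distribution.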
Deduction of Distribution
- Deduction: from general premises to a more specific conclusion
- Induction: from special cases to a general form
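The probability that you wait longer than $t$ for the $n$ events to happen is the same as the probability that no more than $n - 1$ events happen within time $t$. Take the opposite event (you wait no more than $t$ time units, so at least $n$ events happen within $t$) and you have the CDF of your wait time $T_n$:
$$F(t) = P(T_n \le t) = P(N(t) \ge n) = \sum_{k=n}^{\infty} \frac{(\lambda t)^k e^{-\lambda t}}{k!}$$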
Take the derivative and you get a telescoping series: the second part of each term cancels the first part of the next one, so only the first part of the $k = n$ term is left, and the PDF is
$$\begin{align} f(t) &= \frac{d F(t)}{dt} \\ &= \sum_{k=n}^{\infty} \left[ \frac{\lambda^k t^{k-1} e^{-\lambda t}}{(k-1)!} - \frac{\lambda^{k+1} t^k e^{-\lambda t}}{k!} \right] \\ &= \frac{\lambda^n t^{n-1} e^{-\lambda t}}{(n-1)!} \quad \text{(only the first term is left)} \\ &= \frac{t^{\alpha-1} e^{-t/\beta}}{\Gamma(\alpha) \beta^{\alpha}}, \quad \beta = \frac{1}{\lambda},\ \alpha = n,\ \Gamma(\alpha) = (\alpha-1)! \end{align}$$
Explanation:
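- the Gamma function: $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x} \, dx$, which equals $(\alpha - 1)!$ for a positive integer $\alpha$
- scale parameter: $\beta = \frac{1}{\lambda}$, actually it's the reciprocal of the arrival rate
- shape parameter: $\alpha = n$, actually it's the number of events to wait for.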
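To see that the Gamma PDF and the Poisson-based CDF agree, one can integrate the PDF numerically and compare it with $P(N(t) \ge n)$. This is a sketch only: the function names, the midpoint rule with 20,000 steps, and the parameter values are assumptions made for illustration.

```python
from math import exp, factorial, gamma

def gamma_pdf(t, alpha, beta):
    """Gamma PDF with shape alpha and scale beta."""
    return t ** (alpha - 1) * exp(-t / beta) / (gamma(alpha) * beta**alpha)

def wait_cdf(t, n, lam):
    """P(T_n <= t) = P(N(t) >= n) = 1 - P(N(t) <= n - 1)."""
    return 1 - sum((lam * t) ** k * exp(-lam * t) / factorial(k) for k in range(n))

lam, n, t = 2.0, 3, 1.5  # assumed arrival rate, event count, and time threshold

# Midpoint-rule integral of the PDF from 0 to t, with alpha = n and beta = 1 / lam.
steps = 20_000
dt = t / steps
integral = sum(gamma_pdf((i + 0.5) * dt, n, 1 / lam) for i in range(steps)) * dt

print(integral, wait_cdf(t, n, lam))
```

The two printed numbers should agree to several decimal places, which is consistent with $\alpha = n$ and $\beta = 1/\lambda$.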