Flash and JavaScript are required for this feature.

Download the video from iTunes U or the Internet Archive.

**Description:** In this lecture, the professor discussed Poisson process, distribution of number of arrivals, and distribution of interarrival times.

**Instructor:** John Tsitsiklis

Lecture 14: Poisson Process I

The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: So last time we started talking about random processes. A random process is a random experiment that evolves over time. And conceptually, it's important to realize that it's a single probabilistic experiment that has many stages. Actually, it has an infinite number of stages. And we discussed the simplest random process there is, the Bernoulli process, which is nothing but the sequence of Bernoulli trials-- an infinite sequence of Bernoulli trials. For example, flipping a coin over and over.

Once we understand what's going on with that process, then what we want is to move into a continuous time version of the Bernoulli process. And this is what we will call the Poisson process. And for the Poisson process, we're going to do exactly the same things that we did for the Bernoulli process. That is, talk about the number of arrivals during a given time period, and talk also about the time between consecutive arrivals, and for the distribution of inter-arrival times.

So let's start with a quick review of what we discussed last time. First, a note about language. If you think of coin tosses, we then talk about heads and tails. If you think of these as a sequence of trials, you can talk about successes and failures. The language that we will be using will be more the language of arrivals. That is, if in a given slot you have a success, you say that something arrived. If you have a failure, nothing arrived. And that language is a little more convenient and more natural, especially when we talk about continuous time-- to talk about arrivals instead of successes.

But in any case, for the Bernoulli process let's keep, for a little bit, the language of successes. Whereas working in discrete time, we have time slots. During each time slot, we have an independent Bernoulli trial. There is probability p of having a success. Different slots are independent of each other. And this probability p is the same for any given time slot.

So for this process we will discuss the one random variable of interest, which is the following. If we have n time slots, or n trials, how many arrivals will there be? Or how many successes will there be? Well, this is just given by the binomial PMF. Number of successes in n trials is a random variable that has a binomial PMF, and we know what this is.

Then we talked about inter-arrival times. The time until the first arrival happens has a geometric distribution. And we have seen that from some time ago. Now if you start thinking about the time until k arrivals happen, and we denote that by Yk, this is the time until the first arrival happens. And then after the first arrival happens, you have to wait some time until the second arrival happens, and so on. And then the time from the (k -1)th arrival, until arrival number k.

The important thing to realize here is that because the process has a memorylessness property, once the first arrival comes, it's as if we're starting from scratch and we will be flipping our coins until the next arrival comes. So the time it will take until the next arrival comes will also be a geometric random variable. And because different slots are independent, whatever happens after the first arrival is independent from whatever happened before. So T1 and T2 will be independent random variables. And similarly, all the way up to Tk.

So the time until the k-th arrival is a sum of independent geometric random variables, with the same parameter p. And we saw last time that we can find the probability distribution of Yk. The probability that Yk takes a value of t is equal to-- there's this combinatorial factor here, and then you get p to the k, (1-p) to the (t-k), and this formula is true for t equal to k, k+1, and so on. And this distribution has a name. It's called the Pascal PMF.

So this is all there is to know about the Bernoulli process. One important comment is to realize what exactly this memorylessness property is saying. So I discussed it a little bit last time. Let me reiterate it. So we have a Bernoulli process, which is a sequence of Bernoulli trials. And these are (0,1) random variables that keep going on forever.

So someone is watching this movie of Bernoulli trials B_t. And at some point, they say they think, or something interesting has happened, why don't you come in and start watching? So at some time t, they tell you to come in and start watching. So what you will see once you come in will be this future trials.

So actually what you will see is a random process, whose first random variable is going to be the first one that you see, B_(t +1). The second one is going to be this, and so on. So this is the process that's seen by the person who's asked to come in and start watching at that time. And the claim is that this process is itself a Bernoulli process, provided that the person who calls you into the room does not look into the future. The person who calls you into the room decides to call you in only on the basis of what they have seen so far.

So for example, who calls you into the room might have a rule that says, as soon as I see a sequence of 3 heads, I ask the other person to come in. So if they use that particular rule, it means that when you're called in, the previous 3 were heads. But this doesn't give you any information about the future. And so the future ones will be just independent Bernoulli trials.

If on the other hand, the person who calls you in has seen the movie before and they use a rule, such as, for example, I call you in just before 3 heads show up for the first time. So the person calls you in based on knowledge that these two would be three heads. If they have such foresight-- if they can look into the future-- then X1, X2, X3, they're certain to be three heads, so they do not correspond to random independent Bernoulli trials.

So to rephrase this, the process is memoryless. It does not matter what has happened in the past. And that's true even if you are called into the room and start watching at a random time, as long as that random time is determined in a causal way on the basis of what has happened so far.

So you are called into the room in a causal manner, just based on what's happened so far. What you're going to see starting from that time will still be a sequence of independent Bernoulli trials. And this is the argument that we used here, essentially, to argue that this T2 is an independent random variable from T1.

So a person is watching the movie, sees the first success. And on the basis of what they have seen-- they have just seen the first success-- they ask you to come in. You come in. What you're going to see is a sequence of Bernoulli trials. And you wait this long until the next success comes in. What you see is a Bernoulli process, as if the process was just starting right now. And that convinces us that this should be a geometric random variable of the same kind as this one, as independent from what happened before.

All right. So this is pretty much all there is to know about the Bernoulli process. Plus the two things that we did at the end of the last lecture where we merge two independent Bernoulli processes, we get a Bernoulli process. If we have a Bernoulli process and we split it by flipping a coin and sending things one way or the other, then we get two separate Bernoulli processes. And we see that all of these carry over to the continuous time. And our task for today is basically to work these continuous time variations.

So the Poisson process is a continuous time version of the Bernoulli process. Here's the motivation for considering it a Bernoulli process. So you have that person whose job is to sit outside the door of a bank. And they have this long sheet, and for every one second slot, they mark an X if a person came in, or they mark something else if no one came in during that slot.

Now the bank manager is a really scientifically trained person and wants very accurate results. So they tell you, don't use one second slots, use milliseconds slots. So you have all those slots and you keep filling if someone arrived or not during that slot. Well then you come up with an idea. Why use millisecond slots and keep putting crosses or zero's into each slot? It's much simpler if I just record the exact times when people came in. So time is continuous.

I don't keep doing something at every time slot. But instead of the time axis, I mark the times at which customers arrive. So there's no real need for slots. The only information that you want is when did we have arrivals of people. And we want to now model a process of this kind happening in continuous time, that has the same flavor, however, as the Bernoulli process. So that's the model we want to develop.

OK. So what are the properties that we're going to have? First, we're going to assume that intervals over the same length behave probabilistically in an identical fashion. So what does that mean? Think of an interval of some given length. During the interval of that length, there's going to be a random number of arrivals. And that random number of arrivals is going to have a probability distribution. So that probability distribution-- let's denote it by this notation.

We fix t, we fix the duration. So this is fixed. And we look at the different k's. The probability of having 0 arrivals, the probability of 1 arrival, the probability of 2 arrivals, and so on. So this thing is essentially a PMF. So it should have the property that the sum over all k's of this P_(k, tau) should be equal to 1.

Now, hidden inside this notation is an assumption of time homogeneity. That is, this probability distribution for the number of arrivals only depends on the length of the interval, but not the exact location of the interval on the time axis.

That is, if I take an interval of length tau, and I ask about the number of arrivals in this interval. And I take another interval of length tau, and I ask about the number of arrivals during that interval. Number of arrivals here, and number of arrivals there have the same probability distribution, which is denoted this way.

So the statistical behavior of arrivals here is the same as the statistical behavioral of arrivals there. What's the relation with the Bernoulli process? It's very much like the assumption-- the Bernoulli process-- that in different slots, we have the same probability of success. Every slot looks probabilistically as any other slot.

So similarly here, any interval of length tau looks probabilistically as any other interval of length tau. And the number of arrivals during that interval is a random variable described by these probabilities. Number of arrivals here is a random variable described by these same probabilities. So that's our first assumption.

Then what else? In the Bernoulli process we had the assumption that different time slots were independent of each other. Here we do not have time slots, but we can still think in a similar way and impose the following assumption, that these joint time intervals are statistically independent. What does that mean?

Does a random number of arrivals during this interval, and the random number of arrivals during this interval, and the random number of arrivals during this interval-- so these are three different random variables-- these three random variables are independent of each other. How many arrivals we got here is independent from how many arrivals we got there.

So this is similar to saying that different time slots were independent. That's what we did in discrete time. The continuous time analog is this independence assumption. So for example, in particular, number of arrivals here is independent from the number of arrivals there. So these are two basic assumptions about the process.

Now in order to write down a formula, eventually, about this probability distribution-- which is our next objective, we would like to say something specific about this distribution of number of arrivals-- we need to add a little more structure into the problem.

And we're going to make the following assumption. If we look at the time interval of length delta-- and delta now is supposed to be a small number, so a picture like this-- during a very small time interval, there is a probability that we get exactly one arrival, which is lambda times delta. Delta is the length of the interval and lambda is a proportionality factor, which is sort of the intensity of the arrival process.

Bigger lambda means that a little interval is more likely to get an arrival. So there's a probability lambda times delta of 1 arrival. The remaining probability goes to 0 arrivals. And when delta is small, the probability of 2 arrivals can be approximated by 0. So this is a description of what happens during a small, tiny slot.

Now this is something that's supposed to be true in some limiting sense, when delta is very small. So the exact version of this statement would be that this is an equality, plus order of delta squared terms. So this is an approximate equality. And what approximation means is that in the limit of small deltas, the dominant terms-- the constant and the first order term are given by this.

Now when delta is very small, second order terms in delta do not matter. They are small compared to first order terms. So we ignore this. So you can either think in terms of an exact relation, which is the probabilities are given by this, plus delta squared terms. Or if you want to be a little more loose, you just write here, as an approximate equality. And the understanding is that this equality holds-- approximately becomes more and more correct as delta goes to 0.

So another version of that statement would be that if you take the limit as delta goes to 0, of p, the probability of having 1 arrival in an interval of length delta, divided by delta, this is equal to lambda. So that would be one version of an exact statement of what we are assuming here.

So this lambda, we call it the arrival rate, or the intensity of the process. And clearly, if you double lambda, then a little interval is likely -- you expect to get -- the probability of obtaining an arrival during that interval has doubled. So in some sense we have twice as intense arrival process.

If you look at the number of arrivals during delta interval, what is the expected value of that random variable? Well with probability lambda delta we get 1 arrival. And with the remaining probability, we get 0 arrivals. So it's just lambda times delta. So expected number of arrivals during a little interval is lambda times delta. So expected number of arrivals is proportional to lambda, and that's again why we call lambda the arrival rate.

If you send delta to the denominator in this equality, it tells you that lambda is the expected number of arrivals per unit time. So the arrival rate is expected number of arrivals per unit time. And again, that justifies why we call lambda the intensity of this process.

All right. So where are we now? For the Bernoulli process, the number of arrivals during a given interval of length n had the PMF that we knew it was the binomial PMF. What is the formula for the corresponding PMF for the continuous time process? Somehow we would like to use our assumptions and come up with the formula for this quantity.

So this tells us about the distribution of number of arrivals during an interval of some general length. We have made assumptions about the number of arrivals during an interval of small length. An interval of big length is composed of many intervals of small length, so maybe this is the way to go. Take a big interval, and split it into many intervals of small length.

So we have here our time axis. And we have an interval of length tau. And I'm going to split it into lots of little intervals of length delta. So how many intervals are we going to have? The number of intervals is going to be the total time, divided by delta.

Now what happens during each one of these little intervals? As long as the intervals are small, what you have is that during an interval, you're going to have either 0 or 1 arrival. The probability of more than 1 arrival during a little interval is negligible.

So with this picture, you have essentially a Bernoulli process that consists of so many trials. And during each one of those trials, we have a probability of success, which is lambda times delta.

Different little intervals here are independent of each other. That's one of our assumptions, that these joint time intervals are independent. So approximately, what we have is a Bernoulli process. We have independence. We have the number of slots of interest. And during each one of the slots we have a certain probability of success.

So if we think of this as another good approximation of the Poisson process-- with the approximation becoming more and more accurate as delta goes to 0 -- what we should do would be to take the formula for the PMF of number of arrivals in a Bernoulli process, and then take the limit as delta goes to 0.

So in the Bernoulli process, the probability of k arrivals is n choose k, and then you have p to the k. Now in our case, we have here lambda times delta, delta is tau over n. Delta is tau over n, so p is lambda times tau divided by n. So here's our p -- Lambda tau over n -- to the power k, and then times one minus this-- this is our one minus p-- to the power n-k.

So this is the exact formula for the Bernoulli process. For the Poisson process, what we do is we take that formula and we let delta go to 0. As delta goes to 0, n goes to infinity. So that's the limit that we're taking.

On the other hand, this expression lambda times tau-- lambda times tau, what is it going to be? Lambda times tau is equal to n times p. n times p, is that what I want? No, let's see. Lambda tau is np. Yeah. So lambda tau is np.

All right. So we have this relation, lambda tau equals np. These two numbers being equal kind of makes sense. np is the expected number of successes you're going to get in the Bernoulli process. Lambda tau-- since lambda is the arrival rate and you have a total time of tau, lambda tau you can think of it as the number of expected arrivals in the Bernoulli process.

We're doing a Bernoulli approximation to the Poisson process. We take the formula for the Bernoulli, and now take the limit as n goes to infinity. Now lambda tau over n is equal to p, so it's clear what this term is going to give us. This is just p to the power k.

It will actually take a little more work than that. Now I'm not going to do the algebra, but I'm just telling you that one can take the limit in this formula here, as n goes to infinity. And that will give you another formula, the final formula for the Poisson PMF.

One thing to notice is that here you have something like 1 minus a constant over n, to the power n. And you may recall from calculus a formula of this kind, that this converges to e to the minus c. If you remember that formula from calculus, then you will expect that here, in the limit, you are going to get something like an e to the minus lambda tau. So indeed, we will get such a term.

There is some work that needs to be done to find the limit of this expression, times that expression. The algebra is not hard, it's in the text. Let's not spend more time doing this. But let me just give you the formula of what comes at the end. And the formula that comes at the end is of this form.

So what matters here is not so much the specific algebra that you will do to go from this formula to that one. It's kind of straightforward. What's important is the idea that the Poisson process, by definition, can be approximated by a Bernoulli process in which we have a very large number of slots-- n goes to infinity. Whereas we have a very small probability of success during each time slot. So a large number of slots, but tiny probability of success during each slot. And we take the limit as the slots become smaller and smaller.

So with this approximation we end up with this particular formula. And this is the so-called Poisson PMF. Now this function P here -- has two arguments. The important thing to realize is that when you think of this as a PMF, you fix t to tau. And for a fixed tau, now this is a PMF. As I said before, the sum over k has to be equal to 1. So for a given tau, these probabilities add up to 1. The formula is moderately messy, but not too messy. One can work with it without too much pain.

And what's the mean and variance of this PMF? Well what's the expected number of arrivals? If you think of this Bernoulli analogy, we know that the expected number of arrivals in the Bernoulli process is n times p. In the approximation that we're using in these procedure, n times p is the same as lambda tau. And that's why we get lambda tau to be the expected number of arrivals. Here I'm using t instead of tau. The expected number of arrivals is lambda t.

So if you double the time, you expect to get twice as many arrivals. If you double the arrival rate, you expect to get twice as many arrivals. How about the formula for the variance? The variance of the Bernoulli process is np, times one minus p.

What does this go to in the limit? In the limit that we're taking, as delta goes to zero, then p also goes to zero. The probability of success in any given slot goes to zero. So this term becomes insignificant. So this becomes n times p, which is again lambda t, or lambda tau.

So the variance, instead of having this more complicated formula of the variance is the Bernoulli process, here it gets simplified and it's lambda t. So interestingly, the variance in the Poisson process is exactly the same as the expected value. So you can look at this as just some interesting coincidence.

So now we're going to take this formula and see how to use it. First we're going to do a completely trivial, straightforward example. So 15 years ago when that example was made, email was coming at a rate of five messages per hour. I wish that was the case today. And now emails that are coming in, let's say during the day-- the arrival rates of emails are probably different in different times of the day. But if you fix a time slot, let's say 1:00 to 2:00 in the afternoon, there's probably a constant rate. And email arrivals are reasonably well modeled by a Poisson process.

Speaking of modeling, it's not just email arrivals. Whenever arrivals happen in a completely random way, without any additional structure, the Poisson process is a good model of these arrivals. So the times at which car accidents will happen, that's a Poisson processes. If you have a very, very weak light source that's shooting out photons, just one at a time, the times at which these photons will go out is well modeled again by a Poisson process. So it's completely random.

Or if you have a radioactive material where one atom at a time changes at random times. So it's a very slow radioactive decay. The time at which these alpha particles, or whatever we get emitted, again is going to be described by a Poisson process. So if you have arrivals, or emissions, that happen at completely random times, and once in a while you get an arrival or an event, then the Poisson process is a very good model for these events.

So back to emails. Get them at a rate of five messages per day, per hour. In 30 minutes this is half an hour. So what we have is that lambda t, total number of arrivals is-- the expected number of arrivals is-- lambda is five, t is one-half, if we talk about hours. So lambda t is two to the 0.5.

The probability of no new messages is the probability of zero, in time interval of length t, which, in our case, is one-half. And then we look back into the formula from the previous slide, and the probability of zero arrivals is lambda t to the power zero, divided by zero factorial, and then an e to the lambda t. And you plug in the numbers that we have. Lambda t to the zero power is one. Zero factorial is one. So we're left with e to the minus 2.5. And that number is 0.08.

Similarly, you can ask for the probability that you get exactly one message in half an hour. And that would be-- the probability of one message in one-half an hour-- is going to be lambda t to the first power, divided by 1 factorial, e to the minus lambda t, which-- as we now get the extra lambda t factor-- is going to be 2.5, e to the minus 2.5. And the numerical answer is 0.20. So this is how you use the PMF formula for the Poisson distribution that we had in the previous slide.

All right. So this was all about the distribution of the number of arrivals. What else did we do last time? Last time we also talked about the time it takes until the k-th arrival. OK. So let's try to figure out something about this particular distribution. We can derive the distribution of the time of the k-th arrival by using the exact same argument as we did last time.

So now the time of the k-th arrival is a continuous random variable. So it has a PDF. Since we are in continuous time, arrivals can happen at any time. So Yk is a continuous random variable. But now let's think of a time interval of length little delta. And use our usual interpretation of PDFs. The PDF of a random variable evaluated at a certain time times delta, this is the probability that the Yk falls in this little interval.

So as I've said before, this is the best way of thinking about PDFs. PDFs give you probabilities of little intervals. So now let's try to calculate this probability. For the k-th arrival to happen inside this little interval, we need two things. We need an arrival to happen in this interval, and we need k minus one arrivals to happen during that interval.

OK. You'll tell me, but it's possible that we might have the k minus one arrival happen here, and the k-th arrival to happen here. In principle, that's possible. But in the limit, when we take delta very small, the probability of having two arrivals in the same little slot is negligible. So assuming that no two arrivals can happen in the same mini slot, then for the k-th one to happen here, we must have k minus one during this interval.

Now because we have assumed that these joint intervals are independent of each other, this breaks down into the probability that we have exactly k minus one arrivals, during the interval from zero to t, times the probability of exactly one arrival during that little interval, which is lambda delta.

We do have a formula for this from the previous slide, which is lambda t, to the k minus 1, over k minus one factorial, times e to minus lambda t. And then lambda times delta. Did I miss something?

Yeah, OK. All right. And now you cancel this delta with that delta. And that gives us a formula for the PDF of the time until the k-th arrival. This PDF, of course, depends on the number k. The first arrival is going to happen somewhere in this range of time. So this is the PDF that it has.

The second arrival, of course, is going to happen later. And the PDF is this. So it's more likely to happen around these times. The third arrival has this PDF, so it's more likely to happen around those times.

And if you were to take k equal to 100, you might get a PDF-- it's extremely unlikely that the k-th arrival happens in the beginning, and it might happen somewhere down there, far into the future. So depending on which particular arrival we're talking about, it has a different probability distribution. The time of the 100th arrival, of course, is expected to be a lot larger than the time of the first arrival.

Incidentally, the time of the first arrival has a PDF whose form is quite simple. If you let k equal to one here, this term disappears. That term becomes a one. You're left with just lambda, e to the minus lambda. And you recognize it, it's the exponential distribution. So the time until the first arrival in a Poisson process is an exponential distribution.

What was the time of the first arrival in the Bernoulli process? It was a geometric distribution. Well, not coincidentally, these two look quite a bit like the other. A geometric distribution has this kind of shape. The exponential distribution has that kind of shape. The geometric is just a discrete version of the exponential. In the Bernoulli case, we are in discrete time. We have a PMF for the time of the first arrival, which is geometric.

In the Poisson case, what we get is the limit of the geometric as you let those lines become closer and closer, which gives you the exponential distribution. Now the Poisson process shares all the memorylessness properties of the Bernoulli process. And the way one can argue is just in terms of this picture.

Since the Poisson process is the limit of Bernoulli processes, whatever qualitative processes you have in the Bernoulli process remain valid for the Poisson process. In particular we have this memorylessness property. You let the Poisson process run for some time, and then you start watching it. What ever happened in the past has no bearing about the future.

Starting from right now, what's going to happen in the future is described again by a Poisson process, in the sense that during every little slot of length delta, there's going to be a probability of lambda delta of having an arrival. And that probably lambda delta is the same-- is always lambda delta-- no matter what happened in the past of the process.

And in particular, we could use this argument to say that the time until the k-th arrival is the time that it takes for the first arrival to happen. OK, let me do it for k equal to two. And then after the first arrival happens, you wait a certain amount of time until the second arrival happens.

Now once the first arrival happened, that's in the past. You start watching. From now on you have mini slots of length delta, each one having a probability of success lambda delta. It's as if we started the Poisson process from scratch. So starting from that time, the time until the next arrival is going to be again an exponential distribution, which doesn't care about what happened in the past, how long it took you for the first arrival.

So these two random variables are going to be independent and exponential, with the same parameter lambda. So among other things, what we have done here is we have essentially derived the PDF of the sum of k independent exponentials. The time of the k-th arrival is the sum of k inter-arrival times. The inter-arrival times are all independent of each other because of memorylessness. And they all have the same exponential distribution.

And by the way, this gives you a way to simulate the Poisson process. If you wanted to simulate it on your computer, you would have one option to break time into tiny, tiny slots. And for every tiny slot, use your random number generator to decide whether there was an arrival or not. To get it very accurate, you would have to use tiny, tiny slots. So that would be a lot of computation.

The more clever way of simulating the Poisson process is you use your random number generator to generate a sample from an exponential distribution and call that your first arrival time. Then go back to the random number generator, generate another independent sample, again from the same exponential distribution. That's the time between the first and the second arrival, and you keep going that way.

So as a sort of a quick summary, this is the big picture. This table doesn't tell you anything new. But it's good to have it as a reference, and to look at it, and to make sure you understand what all the different boxes are. Basically the Bernoulli process runs in discrete time. The Poisson process runs in continuous time.

There's an analogy of arrival rates, p per trial, or intensity per unit time. We did derive, or sketched the derivation for the PMF of the number of arrivals. And the Poisson distribution, which is the distribution that we get, this Pk of t. Pk and t is the limit of the binomial when we take the limit in this particular way, as delta goes to zero, and n goes to infinity.

The geometric becomes an exponential in the limit. And the distribution of the time of the k-th arrival-- we had a closed form formula last time for the Bernoulli process. We got the closed form formula this time for the Poisson process. And we actually used exactly the same argument to get these two closed form formulas.

All right. So now let's talk about adding or merging Poisson processes. And there's two statements that we can make here. One has to do with adding Poisson random variables, just random variables. There's another statement about adding Poisson processes. And the second is a bigger statement than the first. But this is a warm up. Let's work with the first statement.

So the claim is that the sum of independent Poisson random variables is Poisson. OK. So suppose that we have a Poisson process with rate-- just for simplicity-- lambda one. And I take the interval from zero to two. And that take then the interval from two until five. The number of arrivals during this interval-- let's call it n from zero to two-- is going to be a Poisson random variable, with parameter, or with mean, two.

The number of arrivals during this interval is n from time two until five. This is again a Poisson random variable with mean equal to three, because the arrival rate is 1 and the duration of the interval is three. These two random variables are independent. They obey the Poisson distribution that we derived before. If you add them, what you get is the number of arrivals during the interval from zero to five.

Now what kind of distribution does this random variable have? Well this is the number of arrivals over an interval of a certain length in a Poisson process. Therefore, this is also Poisson with mean five.

Because for the Poisson process we know that this number of arrivals is Poisson, this is Poisson, but also the number of overall arrivals is also Poisson. This establishes that the sum of a Poisson plus a Poisson random variable gives us another Poisson random variable. So adding Poisson random variables gives us a Poisson random variable.

But now I'm going to make a more general statement that it's not just number of arrivals during a fixed time interval-- it's not just numbers of arrivals for given time intervals-- but rather if you take two different Poisson processes and add them up, the process itself is Poisson in the sense that this process is going to satisfy all the assumptions of a Poisson process.

So the story is that you have a red bulb that flashes at random times at the rate of lambda one. It's a Poisson process. You have an independent process where a green bulb flashes at random times. And you happen to be color blind, so you just see when something is flashing. So these two are assumed to be independent Poisson processes. What can we say about the process that you observe?

So in the processes that you observe, if you take a typical time interval of length little delta, what can happen during that little time interval? The red process may have something flashing. So red flashes. Or the red does not. And for the other bulb, the green bulb, there's two possibilities. The green one flashes. And the other possibility is that the green does not.

OK. So there's four possibilities about what can happen during a little slot. The probability that the red one flashes and the green one flashes, what is this probability? It's lambda one delta that the first one flashes, and lambda two delta that the second one does. I'm multiplying probabilities here because I'm making the assumption that the two processes are independent.

OK. Now the probability that the red one flashes is lambda one delta. But the green one doesn't is one, minus lambda two delta. Here the probability would be that the red one does not, times the probability that the green one does. And then here we have the probability that none of them flash, which is whatever is left. But it's one minus lambda one delta, times one minus lambda two delta.

Now we're thinking about delta as small. So think of the case where delta goes to zero, but in a way that we keep the first order terms. We keep the delta terms, but we throw away the delta squared terms. Delta squared terms are much smaller than the delta terms when delta becomes small. If we do that-- if we only keep the order of delta terms-- this term effectively disappears. This is delta squared. So we make it zero. So the probability of having simultaneously a red and a green flash during a little interval is negligible.

What do we get here? Lambda delta times one survives, but this times that doesn't. So we can throw that away. So the approximation that we get is lambda one delta. Similarly here, this goes away. We're left with a lambda two delta. And this is whatever remains, whatever is left.

So what do we have? That there is a probability of seeing a flash, either a red or a green, which is lambda one delta, plus lambda two delta. So if we take a little interval of length delta here, it's going to see an arrival with probability approximately lambda one, plus lambda two, delta.

So every slot in this merged process has an arrival probability with a rate which is the sum of the rates of these two processes. So this is one part of the definition of the Poisson process. There's a few more things that one would need to verify. Namely, that intervals of the same length have the same probability distribution and that different slots are independent of each other.

This can be argued by starting from here because different intervals in this process are independent from each other. Different intervals here are independent from each other. It's not hard to argue that different intervals in the merged process will also be independent of each other.

So the conclusion that comes at the end is that this process is a Poisson process, with a total rate which is equal to the sum of the rate of the two processes. And now if I tell you that an arrival happened in the merged process at a certain time, how likely is it that it came from here? How likely is it?

We go to this picture. Given that an arrival occurred-- which is the event that this or that happened-- what is the probability that it came from the first process, the red one? Well it's the probability of this divided by the probability of this, times that.

Given that this event occurred, you want to find the conditional probability of that sub event. So we're asking the question, out of the total probability of these two, what fraction of that probability is assigned here? And this is lambda one delta, after we ignore the other terms.

This is lambda two delta. So that fraction is going to be lambda one, over lambda one plus lambda two. What does this tell you? If lambda one and lambda two are equal, given that I saw an arrival here, it's equally likely to be red or green. But if the reds have a much higher arrival rate, when I see an arrival here, it's more likely this number will be large. So it's more likely to have come from the red process.

OK so we'll continue with this story and do some applications next time.