The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: OK. Today we're all through with Markov chains, or at least with finite-state Markov chains. And we're going on to renewal processes. As part of that, we will spend a good deal of time talking about the strong law of large numbers, and convergence with probability one.

The idea of convergence with probability one, at least to me, is by far the most difficult part of the course. It's very abstract mathematically. It looks like it's simple, and it's one of those things where you start to think you understand it, and then at a certain point you realize that you don't. And this has been happening to me for 20 years now. I keep thinking I really understand this idea of convergence with probability one, and then I see some strange example again. And I say there's something very peculiar about this whole idea. And I'm going to illustrate that for you at the end of the lecture today.

But for the most part, I will be talking not so much about renewal processes as about this set of mathematical issues that we have to understand in order to be able to look at renewal processes in the simplest way. One of the funny things about the strong law of large numbers and how it gets applied to renewal processes is that although the idea of convergence with probability one is sticky and strange, once you understand it, it is one of the easiest things there is to use. And therefore, once you become comfortable with it, you can use it to do things which would be very hard to do in any other way. And because of that, most people feel they understand it better than they actually do. And that's the reason why it sometimes crops up when you're least expecting it, and you find there's something very peculiar.
OK, so let's start out by talking a little bit about renewal processes, and then talking about this convergence, and the strong law of large numbers, and what it does to all of this. This is just review. We talked about arrival processes when we started talking about Poisson processes. Renewal processes are a special kind of arrival process, and Poisson processes are a special kind of renewal process. So this is something you're already sort of familiar with.

All arrival processes we will tend to treat in one of three equivalent ways, which is the same thing we did with Poisson processes. A stochastic process, we said, is a family of random variables. But in this case, we always view it as three families of random variables, which are all related, and each of which defines the others. And you jump back and forth from looking at one to looking at the other. As you saw with Poisson processes, you really want to do this, because if you stick to only one way of looking at it, you really only pick up about a quarter, or a half, of the picture.

OK. So this one picture gives us the relationship between the arrival epochs of an arrival process, the inter-arrival intervals X1, X2, X3, and the counting process N(t). And you use whichever one is easiest for whatever you plan to do. For defining a renewal process, the easy thing to do is to look at the inter-arrival intervals, because the definition of a renewal process is that it's an arrival process for which the inter-renewal intervals are independent and identically distributed. So any arrival process where the inter-arrival intervals have that property, namely are IID, is a renewal process.

OK, renewal processes are characterized by, and the name comes from, the idea that you start over at each renewal. This idea of starting over is something that we talk about more later on. And it's a little bit strange, and a little bit fishy. It's like with a Poisson process: you look at different intervals, and they're independent of each other. And we sort of know what that means by now.
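Here is a minimal Python sketch of those three views, just to make the bookkeeping concrete; the uniform inter-arrival distribution is an arbitrary choice for illustration, not something from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inter-arrival intervals X1, X2, ... drawn IID (any non-negative
# IID law gives a renewal process; uniform is just a stand-in).
X = rng.uniform(0.5, 1.5, size=1000)

# Arrival epochs: S_n = X_1 + ... + X_n.
S = np.cumsum(X)

# Counting process: N(t) = number of arrival epochs S_n <= t.
def N(t):
    return np.searchsorted(S, t, side="right")

print(S[:3])    # first three arrival epochs
print(N(10.0))  # number of arrivals up to t = 10
```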
OK, you look at the arrival epochs for a Poisson process. Are they independent of each other? Of course not. The arrival epochs are the sums of the inter-arrival intervals. The inter-arrival intervals are the things that are independent. And the arrival epochs are the sums of inter-arrival intervals. If you know that the first arrival epoch takes 10 times longer than its mean, then the second arrival epoch is going to be kind of long, too. It's got to be at least 10 times as long as the mean of the inter-arrival intervals, because each arrival epoch is a sum of these inter-arrival intervals. It's the inter-arrival intervals that are independent. So when you say that one interval is independent of the other, yes, you know exactly what you mean. And the idea is very simple. It's the same idea here.

But then you start to think you understand this, and you start to use it in a funny way. And suddenly you're starting to say that the arrival epochs are independent from one time to the other, which they certainly aren't. What renewal theory does is it lets you treat the gross characteristics of a process in a very simple and straightforward way. So you're breaking up the process into two sets of views about it. One is the long-term behavior, which you treat by renewal theory, and you use this one exotic theory in a simple and straightforward way for every different renewal process you look at. And then you have this usually incredibly complicated kind of thing going on inside of each inter-renewal interval. And the nice thing about renewal theory is it lets you look at that complicated thing without worrying about what's going on outside. So the local characteristics can be studied without worrying about the long-term interactions.
One example of this, and one of the reasons we looked at Markov chains before we look at renewal processes, is that a Markov chain is one of the nicest examples there is of a renewal process, when you look at it in the right way. If you have a recurrent Markov chain, then the interval from one time entering a particular recurrent state until the next time you enter that recurrent state is a renewal. So we look at the sequence of times at which we enter this one given state. We enter state one over here. We enter state one again over here. We enter state one again, and so forth. We're ignoring everything that goes on between entries to state one. But every time you enter state one, you're in the same situation as you were the last time you entered state one. You're in the same situation, in the sense that the intervals from state one until you reach state one again are independent of what they were before. In other words, when you enter state one, your successive state transitions from there are statistically the same as they were before.

So it's the same situation as we saw with Poisson processes, and it's the same kind of renewal, where when you talk about a renewal, you have to be very careful about what it is that's a renewal. Once you're careful about it, it's clear what's going on.

One of the things we're going to find out now is one of the things that we failed to point out before when we talked about finite-state Markov chains. One of the most interesting characteristics is that the expected amount of time from one entry to a recurrent state until the next time you enter that recurrent state is 1 over pi sub i, where pi sub i is the steady-state probability of that state. We didn't do that. It's a little tricky to do that in terms of Markov chains; it's almost trivial to do it in terms of renewal processes. And what's more, when we do it in terms of renewal processes, you will see that it's obvious, and you will never forget it.
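For reference, the result just previewed can be written as follows; the derivation via renewal theory comes later in the course.

```latex
% Expected recurrence time of a recurrent state i, in terms of its
% steady-state probability \pi_i:
\[
  \mathbb{E}[\,T_{ii}\,] \;=\; \frac{1}{\pi_i},
\]
% where T_{ii} is the time from one entry to state i until the next
% entry to state i.
```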
If we did it in terms of Markov chains, it would be some long, tedious derivation, and you'd get this nice answer, and you'd say, why did that nice answer occur? And you wouldn't have any idea. When you look at renewal processes, it's obvious why it happens. And we'll see why that is very soon.

Also, after we finish renewal processes, the next thing we're going to do is to talk about countable-state Markov chains: Markov chains with a countably infinite set of states. If you don't have a background in renewal theory when you start to look at that, you get very confused. So renewal theory will give us the right tool to look at those more complicated Markov chains. OK. So the theory of Markov chains with a countably infinite state space comes largely from renewal processes. So yes, we'll be interested in understanding that.

OK. Another example is the G/G/m queue. The text talked a little bit, and we might have talked in class a little bit, about this strange notation queueing theorists use. There are always at least three letters separated by slashes to talk about what kind of queue you're talking about. The first letter describes the arrival process for the queue. G means it's a general arrival process, which doesn't really mean it's a general arrival process. It means the arrival process is renewal. Namely, it says the arrival process has IID inter-arrivals, but you don't know what their distribution is. You would call that M if you meant a Poisson process, which would mean memoryless inter-arrivals. The second G stands for the service-time distribution. Again, we assume that no matter how many servers you have, no matter how the servers work, the time to serve one user is independent of the time to serve other users, but the distribution of that time is a general distribution. It would be M if you meant a memoryless distribution, which would mean an exponential distribution.
Finally, the thing at the end says we're talking about a queue with m servers. So the point here is we're talking about a relatively complicated thing. Can you talk about this in terms of renewals? Yes, you can, but it's not quite obvious how to do it. You would think that the obvious way of viewing a complicated queue like this is to look at what happens from one busy period to the next busy period. You would think the busy periods would be independent of each other. But they're not quite. Suppose you finish one busy period, and when you finish that busy period, one customer has just finished being served. But at that point, you're in the middle of waiting for the next customer to arrive. And since that's a general distribution, the amount of time you have to wait for that next customer to arrive depends on a whole lot of things in the previous interval. So how can you talk about renewals here?

You talk about renewals by waiting until that next arrival comes. When that next arrival comes to terminate the idle period between busy periods, at that time you're in the same state that you were in when the whole thing started before, when you had the first arrival come in. At that point, you had one arrival there being served; you go through some long complicated thing; eventually the busy period is over; eventually, then, another arrival comes in. And presto, at that point, you're statistically back where you started. You're statistically back where you started in terms of all inter-arrival times at that point.

And even though it's intuitively obvious that those things are independent of each other, we're really going to have to sort that out a little bit, because you come upon many situations where this is not obvious. So if you don't know how to sort it out when it is obvious, you're not going to know how to sort it out when it's not obvious. But anyway, that's another example of where we have renewals.
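To make that renewal structure concrete, here is a small sketch for the single-server case (m = 1), using the standard Lindley recursion for waiting times; the uniform distributions are arbitrary stand-ins for the general inter-arrival and service laws, not anything from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical G/G/1 instance: IID inter-arrival times A_k and IID
# service times S_k (uniform chosen purely for illustration).
A = rng.uniform(0.5, 1.5, size=n)   # time between arrivals k-1 and k
S = rng.uniform(0.2, 1.2, size=n)   # service time of arrival k

# Lindley recursion for the waiting time W_k of arrival k:
#   W_k = max(0, W_{k-1} + S_{k-1} - A_k),  with W_0 = 0.
W = np.zeros(n)
for k in range(1, n):
    W[k] = max(0.0, W[k - 1] + S[k - 1] - A[k])

# Renewal epochs: arrivals that find the system empty (W_k == 0) end an
# idle period and start a new busy period; the process restarts
# statistically at those arrivals.
renewals = np.flatnonzero(W == 0.0)
print(len(renewals), "renewals among", n, "arrivals")
```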
OK. We want to talk about convergence now, this idea of convergence with probability one. It's based on the idea of numbers converging to some limit. And I'm always puzzled about how much to talk about this, because all of you, when you first study calculus, talk about limits. For most of you, if you're engineers, when you study calculus, it goes in one ear and out the other, because you don't have to understand this very much. Because for all the things you deal with, the limits exist very nicely, and there's no problem. So you can ignore it. And then you hear about these epsilons and deltas, and I do the same thing. I can deal with an epsilon, but as soon as you have an epsilon and a delta, I go into orbit. I have no idea what's going on anymore until I sit down and think about it very, very carefully. Fortunately, when we have a sequence of numbers, we only have an epsilon. We don't have a delta. So things are a little bit simpler.

I should warn you, though, that you can't let this go in one ear and out the other, because at this point we are using the convergence of numbers to be able to talk about convergence of random variables, and convergence of random variables is indeed not a simple topic. Convergence of numbers is a simple topic made complicated by mathematicians. Any good mathematician who hears me say this will be furious. Because in fact, when you think about what they've done, they've taken something which is simple but looks complicated, and they've turned it into something which looks complicated in another way, but is really the simplest way to deal with it. So let's do that and be done with it, and then we can start using it for random variables.

A sequence b1, b2, b3, and so forth of real numbers (real numbers or complex numbers, it doesn't make any difference) is said to converge to a limit b
if for each real epsilon greater than zero, there is an integer m such that the magnitude of b_n minus b is less than or equal to epsilon for all n greater than or equal to m. Now, how many people can look at that and understand it? Be honest. Good. Some of you can. How many people look at that, and their mind just, ah! How many people are in that category? I am. But if I'm the only one, that's good.

OK. There's an equivalent way to talk about this. A sequence of numbers, real or complex, is said to converge to a limit b if for each integer k greater than zero, there's an integer m(k) such that the magnitude of b_n minus b is less than or equal to 1 over k for all n greater than or equal to m(k). OK. And the argument there is: pick any epsilon you want to, no matter how small. And then you pick a k such that 1 over k is less than or equal to epsilon. According to this definition, b_n minus b being less than or equal to 1 over k in magnitude ensures that you have this condition up here that we're talking about. When the magnitude of b_n minus b is less than or equal to 1 over k, then also the magnitude of b_n minus b is less than or equal to epsilon.

In other words, when you look at this, you're starting to see what this definition really means. Here, you don't really care about all epsilon. All you care about is that this holds true for small enough epsilon. And the trouble is there's no way to specify a "small enough" epsilon. So the only way we can do this is to say "for all epsilon." But what the argument says is that if you can assert this statement for a sequence of smaller and smaller values of epsilon, that's all you need. Because as soon as this is true for one value of epsilon, it's true for all larger values of epsilon.

Now, let me show you a picture which, unfortunately, is a kind of complicated picture. It's the picture that says what that argument was really talking about.
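For reference, the two equivalent definitions just discussed, written out:

```latex
% Convergence of a sequence of numbers b_1, b_2, ... to a limit b:
\[
  \lim_{n\to\infty} b_n = b
  \;\iff\;
  \forall \epsilon > 0,\ \exists\, m(\epsilon):\
  |b_n - b| \le \epsilon \ \text{ for all } n \ge m(\epsilon)
\]
% Equivalent form, quantified over integers only:
\[
  \phantom{\lim_{n\to\infty} b_n = b}
  \;\iff\;
  \forall k \in \mathbb{Z}^{+},\ \exists\, m(k):\
  |b_n - b| \le \tfrac{1}{k} \ \text{ for all } n \ge m(k).
\]
```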
So if you don't understand the picture, you were kidding yourself when you said you thought you understood what the definition said. What the picture says, in terms of this 1 over k business: it says if you have a sequence of numbers b1, b2, b3 (excuse me for insulting you by talking about something so trivial, but believe me, as soon as we start talking about random variables, this trivial thing mixed with so many other things will start to become less trivial, and you really need to understand what this is saying). So we're saying that if we have a sequence b1, b2, b3, b4, b5 and so forth, what that second idea of convergence says is that there's an M1 which says that for all n greater than or equal to M1, the terms b4, b5, b6, b7, and so on all lie within this limit here, between b plus 1 and b minus 1. There's a number M2, which says that as soon as n gets bigger than M2, all these numbers lie between these two tighter limits. There's a number M3, which says all of these numbers lie between these limits.

So it's essentially saying that you can draw a pipe, and as n increases, you squeeze this pipe gradually down. You don't know how fast you can squeeze it down when you're talking about convergence. You might have something that converges very slowly, and then M1 will be way out here. M2 will be way over there. M3 will be off on the other side of Vassar Street, and so forth. But there always is such an M1, M2, and M3, which says these numbers are getting closer and closer to b. And they're staying closer and closer to b.

An example, which we'll come back to, where you don't have convergence is the following kind of thing. b1 is equal to 3/4, in this case. b5 is equal to 3/4. b25 is equal to 3/4. b at 5 cubed, 125, is equal to 3/4. And so forth. These values of n at which b sub n is equal to 3/4 get more and more rare. So in some sense, this sequence here, where b2 up to b4 is zero,
b6 up to b24 is zero, and so forth. This is some kind of convergence, also. But it's not what anyone would call convergence. I mean, as far as numbers are concerned, there's only one kind of convergence that people ever talk about, and it's this kind of convergence here. Although these numbers are getting close to b in some sense, that's not viewed as convergence. So here, even though almost all the numbers are close to b, they don't stay close to b. They always pop up at some place in the future, and that destroys the whole idea of convergence. It destroys most theorems about convergence. That's an example where you don't have convergence.

OK, random variables are really a lot more complicated than numbers. I mean, a random variable is a function from the sample space to the real numbers. All of you know that's not really what a random variable is. All of you know that a random variable is a number that wiggles around a little bit, rather than being fixed at what you ordinarily think of a number as being, right? Since that's a very imprecise notion, and the precise notion is very complicated, to build up your intuition about this, you have to really think hard about what convergence of random variables means.

For convergence in distribution, it's not the random variables, but the distribution functions of the random variables that converge. In other words, for a sequence of random variables Z1, Z2, Z3 and so forth, the distribution function of Z sub n, evaluated at each real value z, converges for each z at which the distribution function of the limiting random variable is continuous. We all studied that. We know what that means now.

For convergence in probability, we talked about convergence in probability in two ways. One with an epsilon and a delta, saying that for every epsilon and delta, something happens for all large enough n. And then we saw that it was a little easier to describe.
It was a little easier to describe by saying that for convergence in probability, these distribution functions have to converge to a unit step. And that's enough. They converge to a unit step at every z except where the step is. We talked about that.

For convergence with probability one, and this is the thing we want to talk about today, this is the one that sounds so easy, and which is really tricky. I don't want to scare you about this. If you're not scared about it to start with, I don't want to scare you. But I would like to convince you that if you think you understand it and you haven't spent a lot of time thinking about it, you're probably due for a rude awakening at some point.

So for convergence with probability one, the set of sample paths that converge has probability one. In other words, the sequence Y1, Y2, and so forth converges to zero with probability one. And now I'm going to talk about converging to zero rather than converging to some random variable. Because if you're interested in a sequence of random variables Z1, Z2 that converge to some other random variable Z, you can get rid of a lot of the complication by just saying, let's define a random variable Y sub n which is equal to Z sub n minus Z. And then what we're interested in is whether these random variables Y sub n converge to 0. We can forget about what it's converging to, and only worry about it converging to 0.

OK, so when we do that, this sequence of random variables converges to 0 with probability 1 if the probability of the set of sample points for which the sample path converges to 0, if that set of sample paths, has probability 1. Namely, for almost everything in the space, for almost everything in this peculiar sense of probability, if that holds true, then you say you have convergence with probability 1.

Now, that looks straightforward, and I hope it is. You can memorize it or do whatever you want to do with it.
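In symbols, the definition just stated is:

```latex
% Convergence with probability 1 (to 0), as a statement about
% sample paths over the sample space \Omega:
\[
  \Pr\Bigl(\bigl\{\omega \in \Omega :
      \lim_{n\to\infty} Y_n(\omega) = 0 \bigr\}\Bigr) \;=\; 1 .
\]
% For convergence of Z_n to a random variable Z, apply this
% to Y_n = Z_n - Z.
```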
We're going to go on now and prove an important theorem about convergence with probability 1. I'm going to give a proof here in class that's a little more detailed than the proof I give in the notes. I don't like to give proofs in class. I think it's a lousy idea, because when you're studying a proof, you have to go at your own pace. But the problem is, I know that students-- and I was once a student myself, and I'm still a student-- if I see a proof, I will only look at enough of it to say, ah, I get the idea of it. And then I will stop. And for this one, you need a little more than the idea of it, because it's something we're going to build on all the time. So I want to go through this proof carefully. And I hope that most of you will follow most of it. And the parts of it that you don't follow, I hope you'll go back and think about, because this is really important.

OK, so the theorem says: let this sequence of random variables satisfy the condition that the sum, from n equals 1 to infinity, of the expected value of the magnitude of Y sub n is less than infinity. As usual, there's a misprint there: the bracket in the expected value of Y sub n should be there, and the sum is supposed to be less than infinity. Let me write that down. The sum from n equals 1 to infinity of the expected value of the magnitude of Y sub n is less than infinity. So it's a finite sum.

So we're talking about these Y sub n's when we start talking about the strong law of large numbers. Y sub n is going to be something like the sum of n underlying random variables divided by n. In other words, it's going to be the sample average, or something like that. And these sample averages, if you have mean 0, are going to get small. The question is, when you sum all of these things that are getting small, do you still get something which is small? When you're dealing with the weak law of large numbers, it's not necessary that that sum be small. It's only necessary that each of the terms get small.
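Stated compactly, the theorem to be proved is:

```latex
% If the expected magnitudes are summable, the sequence converges
% to 0 with probability 1:
\[
  \sum_{n=1}^{\infty} \mathbb{E}\bigl[\,|Y_n|\,\bigr] < \infty
  \quad\Longrightarrow\quad
  \Pr\Bigl(\lim_{n\to\infty} Y_n = 0\Bigr) = 1 .
\]
```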
Here we're saying, let's assume also that this sum stays finite. OK, so we want to prove that under this condition, with probability 1 these sequences converge to 0; the individual sample sequences converge. OK, so let's go through the proof now. And as I say, I won't do this to you very often. But I think for this one, it's sort of necessary.

OK, so first we'll use the Markov inequality. And I'm dealing with a finite value of m here. The probability that the sum of a finite set of these magnitudes of Y sub n is greater than alpha is less than or equal to the expected value of that random variable divided by alpha. Namely, this random variable here: the sum from n equals 1 to m of the magnitude of Y sub n. That's just a random variable. And the probability that that random variable is greater than alpha is less than or equal to the expected value of that random variable divided by alpha.

OK, well now, this quantity here is increasing in m. The magnitude of Y sub n is a non-negative quantity. You take the expectation of a non-negative quantity, if it has an expectation, which we're assuming here; for this sum to be less than infinity, all of these things have to have expectations. So as we increase m, this gets bigger and bigger. So this quantity here is going to be less than or equal to the sum from n equals 1 to infinity of the expected value of the magnitude of Y sub n, divided by alpha.

What I'm being careful about here is all of the things that happen when you go from finite m to infinite m. And I'm using what you know about finite m, and then being very careful about going to infinite m. And I'm going to try to explain why as we do it. But here, it's straightforward. The expected value of a finite sum is equal to the finite sum of the expected values. When you go to the limit as m goes to infinity, you don't know whether these expected values exist or not. You're sort of confused on both sides of this equation. So we're sticking to finite values here. Then, we're taking this quantity and going to the limit as m goes to infinity.
This quantity has to get bigger and bigger as m goes to infinity, so this quantity has to be less than or equal to this. This now, for a given alpha, is just a number. It's nothing more than a number, so we can deal with it pretty easily as we make alpha big enough. But for most of the argument, we're going to view alpha as being fixed.

OK, so now the probability that this finite sum is greater than alpha is less than or equal to this. This was the thing we just finished proving on the other page. This is less than or equal to that. That's what I repeated, so I'm not cheating you at all here.

Now, it's a pain to write that down all the time. So let's let the set A sub m be the set of sample points omega such that this finite sum of the magnitudes of Y sub n of omega is greater than alpha. For each value of omega, this is just a number. The sum of the magnitudes of Y sub n is a random variable. It takes on a numerical value for every omega in the sample space. So A sub m is the set of points in the sample space for which this quantity here is bigger than alpha. So we can rewrite this now as just the probability of A sub m. So this is equivalent to saying the probability of A sub m is less than or equal to this number here. For a fixed alpha, this is a number. This is something which can vary with m.

Now we're dealing with a sample space, which is a little strange. We're talking about sample points, and we're saying that this magnitude of Y sub n at a particular sample point omega is greater than or equal to 0. Therefore, A sub m is a subset of A sub m plus 1. In other words, as m gets larger and larger, this sum here gets larger and larger. Therefore, the set of omega for which this increasing sum is greater than alpha gets bigger and bigger. And that's the thing that we're saying here: A sub m is included in A sub m plus 1 for m greater than or equal to 1.
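Putting those steps in symbols, as sketched on the slides:

```latex
% Markov inequality for finite m, then monotonicity in m:
\[
  \Pr\Bigl(\sum_{n=1}^{m} |Y_n| > \alpha\Bigr)
  \;\le\; \frac{1}{\alpha}\,\mathbb{E}\Bigl[\sum_{n=1}^{m} |Y_n|\Bigr]
  \;=\; \frac{1}{\alpha}\sum_{n=1}^{m} \mathbb{E}\bigl[|Y_n|\bigr]
  \;\le\; \frac{1}{\alpha}\sum_{n=1}^{\infty} \mathbb{E}\bigl[|Y_n|\bigr].
\]
% The sets A_m, and their nesting (the partial sums are non-decreasing):
\[
  A_m \;=\; \Bigl\{\omega : \sum_{n=1}^{m} |Y_n(\omega)| > \alpha\Bigr\},
  \qquad A_m \subseteq A_{m+1} \ \text{ for } m \ge 1 .
\]
```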
OK, so the left side of this quantity here, as a function of m, is a non-decreasing bounded sequence of real numbers. Yes, the probability of something is just a real number. A probability is a number. So this quantity here is a real number. It's a sequence which is non-decreasing, so it keeps moving upward. What I'm trying to do now is: I went to the limit over here; I want to go to the limit here. And so I have a sequence of numbers in m. This sequence of numbers is non-decreasing, so it's moving up. Every one of those quantities is bounded by this quantity here. So I have an increasing sequence of real numbers which is bounded on top. What happens when you have a sequence of real numbers which is bounded? I have a slide to prove this, but I'm not going to prove it because it's tedious.

Here we have this probability, the probability of A sub m. Here I have the probability of A sub m plus 1, and so forth. Here I have this limit up here. All of this sequence of numbers, and there's an infinite sequence of them, they're all non-decreasing. They're all bounded by this number here. And what happens? Well, either we go up to there as a limit, or else we stop sometime earlier as a limit. I should prove this, but it's something we use all the time. It's a sequence of increasing, or non-decreasing, numbers. If it's bounded by something, it has to have a finite limit. The limit is less than or equal to this quantity. It might be strictly less, but the limit has to exist. And the limit has to be less than or equal to the bound b.

OK, that's what we're saying here. When we go to this limit, this limit of the probability of A sub m is less than or equal to this number here. OK, now I use this property of nested sets: when you have A sub 1 nested inside of A sub 2, nested inside of A sub 3, what we'd like to do is go to this limit.
The limit, unfortunately, doesn't make any sense in general. But this property of the axioms, it's equation number 9 in chapter 1, says that we can do something that's almost as good. What it says is that as we go to this limit here, what we get is that the probability of this infinite union is equal to the limit, as m goes to infinity, of the probability of A sub m. OK, look up equation 9, and you'll see that's exactly what it says. If you think this is obvious, it's not. It ain't obvious at all, because nothing very much about this union is clear. We know that this union must be a measurable set. It must have a probability. We don't know much more about it. But anyway, that property tells us that this is true.

OK, so here's where we are at this point. I don't think I've skipped something, have I? Oh, no, that's the thing I didn't want to talk about. OK, so A sub m is the set of omega which satisfy this for finite m. The probability of the union of all of these sets over all m is then less than or equal to this bound that we had.

OK, so I even hate giving proofs of this sort, because it's a set of simple ideas, but to track down every one of them is difficult. The text doesn't track down every one of them. And that's what I'm trying to do here.

We have two possibilities here, and we're looking at this limit here. This limiting sum, which for each omega is just a non-decreasing sequence of real numbers. So one possibility is that this sequence of real numbers goes above alpha. The other possibility is that it stays less than or equal to alpha. If it's less than or equal to alpha, then every one of these partial sums is less than or equal to alpha, and omega is not in this union here. If the sum is bigger than alpha, then one of the partial sums is bigger than alpha, and omega is in this set.
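The step being invoked, written out:

```latex
% Continuity of probability for nested sets (equation (9) of chapter 1),
% applied to A_1 \subseteq A_2 \subseteq ... and the earlier bound:
\[
  \Pr\Bigl(\bigcup_{m=1}^{\infty} A_m\Bigr)
  \;=\; \lim_{m\to\infty} \Pr(A_m)
  \;\le\; \frac{1}{\alpha}\sum_{n=1}^{\infty} \mathbb{E}\bigl[|Y_n|\bigr].
\]
```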
So what all of that says, and you're just going to have to look at that, because it's one of these tedious arguments, is that the probability of the omega such that this infinite sum is greater than alpha is less than or equal to this number here. At this point, we have made a major change in what we're doing. Before, we were talking about numbers like probabilities, numbers like expected values. Here, suddenly, we are talking about sample points. We're talking about the probability of a set of sample points such that the sum is greater than alpha. Yes?

AUDIENCE: I understand how, if the whole sum is less than or equal to alpha, then every element is. But did you say that if it's greater than alpha, then at least one element is greater than alpha? Why is that?

PROFESSOR: Well, because either the sum is less than or equal to alpha, or it's greater than alpha. And if it's less than or equal to alpha, then omega is not in this set. So the alternative is that omega has to be in this set. The other way of looking at it is: if you have a sequence of numbers which is approaching a limit, and that limit is bigger than alpha, then one of the terms has to be bigger than alpha. Yes?

AUDIENCE: I think the confusion is between the partial sums and the terms of the sum. Does that make sense? It's each partial sum, not each term in the sum.

PROFESSOR: Yes. Except I don't see how that answers the question. The point here is, if each partial sum is less than or equal to alpha, then the limit has to be less than or equal to alpha. That's what I was saying on the other page. If you have a sequence of numbers which has an upper bound on it, then you have to have a limit. And that limit has to be less than or equal to alpha. So that's this case here. We have a sum of numbers, and as we go to the limit, as m gets larger and larger, these partial sums have to go to a limit.
766 00:45:45,210 --> 00:45:48,070 If the partial sums are all less than or equal to alpha, 767 00:45:48,070 --> 00:45:51,130 then the infinite sum is less than or equal to alpha, and 768 00:45:51,130 --> 00:45:53,810 omega is not in this set here. 769 00:45:53,810 --> 00:45:55,540 And otherwise, it is. 770 00:45:55,540 --> 00:45:59,220 OK, if I talk more about it, I'll get more confused. 771 00:45:59,220 --> 00:46:01,985 So I think the slides are clear. 772 00:46:09,020 --> 00:46:17,710 Now, if we look at the case where alpha is greater than or 773 00:46:17,710 --> 00:46:26,180 equal to this sum, and we take the complement of the set, the 774 00:46:26,180 --> 00:46:30,380 probability of the set of omega for which this sum is 775 00:46:30,380 --> 00:46:32,900 less than or equal to alpha has-- 776 00:46:32,900 --> 00:46:36,830 oh, let's forget about this for the moment. 777 00:46:36,830 --> 00:46:40,440 If I take the complement of this set, the probability of 778 00:46:40,440 --> 00:46:43,560 the set of omega, such that the sum is less than or equal 779 00:46:43,560 --> 00:46:47,190 to alpha, is greater than 1 minus this 780 00:46:47,190 --> 00:46:48,820 expected value here. 781 00:46:48,820 --> 00:46:51,480 Now I'm saying, let's look at the case where alpha is big 782 00:46:51,480 --> 00:46:55,280 enough that it's greater than this number here. 783 00:46:55,280 --> 00:47:00,820 So this probability is greater than 1 minus this number. 784 00:47:00,820 --> 00:47:06,720 So if the sum is less than or equal to alpha for any given 785 00:47:06,720 --> 00:47:12,220 omega, then this quantity here converges. 786 00:47:12,220 --> 00:47:14,620 Now I'm talking about sample sequences. 787 00:47:14,620 --> 00:47:18,630 I'm saying I have an increasing sequence of numbers 788 00:47:18,630 --> 00:47:21,540 corresponding to one particular sample point. 789 00:47:21,540 --> 00:47:25,500 This increasing sequence of numbers is bounded: 790 00:47:25,500 --> 00:47:28,760 each element of it is less than or equal to alpha, so the 791 00:47:28,760 --> 00:47:31,820 limit of it is less than or equal to alpha. 792 00:47:31,820 --> 00:47:37,050 And what that says is that the limit of Y sub n of omega 793 00:47:37,050 --> 00:47:39,760 has to be equal to 0 for that sample point. 794 00:47:39,760 --> 00:47:41,500 This is all a sample-point argument. 795 00:47:46,280 --> 00:47:53,500 And what that says then is the probability of omega, such 796 00:47:53,500 --> 00:47:57,580 that this limit here is equal to 0, that's this quantity 797 00:47:57,580 --> 00:48:01,310 here, which is the same as this quantity, which has to be 798 00:48:01,310 --> 00:48:02,935 greater than this quantity. 799 00:48:08,570 --> 00:48:11,100 This implies this. 800 00:48:11,100 --> 00:48:16,260 Therefore, the probability of this has to be bigger than 801 00:48:16,260 --> 00:48:17,950 this probability here. 802 00:48:17,950 --> 00:48:22,480 Now, if we let alpha go to infinity, what that says is 803 00:48:22,480 --> 00:48:26,310 this quantity goes to 0 and the probability of the set of 804 00:48:26,310 --> 00:48:30,730 omega, such that this limit is equal to 0, is equal to 1. 805 00:48:35,010 --> 00:48:39,640 I think if I try to spend 20 more minutes talking about 806 00:48:39,640 --> 00:48:42,860 that in more detail, it won't get any clearer. 807 00:48:42,860 --> 00:48:47,020 It is one of these very tedious arguments where you 808 00:48:47,020 --> 00:48:51,380 have to sit down and follow it step by step.
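For reference, the chain of steps just described, collected in one place -- this is a summary, not a new argument:

\[ \Pr\Bigl\{ \omega : \sum_{n=1}^{\infty} Y_n(\omega) \le \alpha \Bigr\} \ge 1 - \frac{1}{\alpha} \sum_{n=1}^{\infty} \mathsf{E}[Y_n], \]

and for every omega in that set, the partial sums are non-decreasing and bounded by alpha, so the sum converges and Y sub n of omega goes to 0. Therefore

\[ \Pr\Bigl\{ \omega : \lim_{n \to \infty} Y_n(\omega) = 0 \Bigr\} \ge 1 - \frac{1}{\alpha} \sum_{n=1}^{\infty} \mathsf{E}[Y_n] \longrightarrow 1 \quad \text{as } \alpha \to \infty. \]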
809 00:48:51,380 --> 00:48:54,500 I wrote the steps out very carefully. 810 00:48:54,500 --> 00:49:02,180 And at this point, I have to leave it as it is. 811 00:49:02,180 --> 00:49:05,540 But the theorem has been proven, at least in what's 812 00:49:05,540 --> 00:49:08,800 written, if not in what I've said. 813 00:49:08,800 --> 00:49:13,050 OK, let's look at an example of this now. 814 00:49:13,050 --> 00:49:17,110 Let's look at the example where these random variables Y 815 00:49:17,110 --> 00:49:21,930 sub n for n greater than or equal to 1, have the 816 00:49:21,930 --> 00:49:23,170 following property. 817 00:49:23,170 --> 00:49:26,190 It's almost the same as the sequence of numbers I talked 818 00:49:26,190 --> 00:49:27,830 about before. 819 00:49:27,830 --> 00:49:31,990 But what I'm going to do now is-- 820 00:49:31,990 --> 00:49:34,630 these are not IID random variables. 821 00:49:34,630 --> 00:49:39,130 If they're IID random variables, you're never going 822 00:49:39,130 --> 00:49:41,310 to have the sum of 823 00:49:41,310 --> 00:49:44,260 the expected values being finite. 824 00:49:44,260 --> 00:49:52,620 How they behave is that for 1 less than or equal to n, less than 5, 825 00:49:52,620 --> 00:49:55,520 you pick one of these random variables in here and make it 826 00:49:55,520 --> 00:49:56,690 equal to 1. 827 00:49:56,690 --> 00:49:59,040 And all the rest are equal to 0. 828 00:49:59,040 --> 00:50:03,560 From 5 to 25, you pick one of the random variables, make it 829 00:50:03,560 --> 00:50:06,400 equal to 1, and all the others are equal to 0. 830 00:50:06,400 --> 00:50:08,130 You choose randomly in here. 831 00:50:08,130 --> 00:50:12,380 From 25 to 125, you pick one random variable, 832 00:50:12,380 --> 00:50:14,020 set it equal to 1. 833 00:50:14,020 --> 00:50:16,910 All the other random variables, you set equal to 0, 834 00:50:16,910 --> 00:50:19,500 and so forth forever after. 835 00:50:19,500 --> 00:50:23,600 OK, so what does that say for the sample points? 836 00:50:23,600 --> 00:50:29,080 If I look at any particular sample point, what I find is 837 00:50:29,080 --> 00:50:36,610 that there's one occurrence of a sample value equal to 1 from 838 00:50:36,610 --> 00:50:38,340 here to here. 839 00:50:38,340 --> 00:50:42,370 There's exactly one that's equal to 1 from here to here. 840 00:50:42,370 --> 00:50:45,970 There's exactly one that's equal to 1 from here to way 841 00:50:45,970 --> 00:50:49,890 out here at 125, and so forth. 842 00:50:49,890 --> 00:50:55,480 This is not a sequence of sample values which converges 843 00:50:55,480 --> 00:51:00,790 because it keeps popping up to 1 at all these values. 844 00:51:00,790 --> 00:51:05,300 So for every omega, Yn of omega is 1 for 845 00:51:05,300 --> 00:51:07,530 some n in this interval, 846 00:51:07,530 --> 00:51:10,110 for every j, and it's 0 elsewhere. 847 00:51:10,110 --> 00:51:15,830 So Yn of omega doesn't converge for any omega. 848 00:51:15,830 --> 00:51:19,310 So the probability that that sequence converges 849 00:51:19,310 --> 00:51:22,390 is not 1, it's 0. 850 00:51:22,390 --> 00:51:26,080 So this is a particularly awful example. 851 00:51:26,080 --> 00:51:29,490 This is a sequence of random variables, which does not 852 00:51:29,490 --> 00:51:31,970 converge with probability 1. 853 00:51:31,970 --> 00:51:38,440 At the same time, the expected value of Y sub n is 1 over 5 854 00:51:38,440 --> 00:51:42,320 to the j plus 1 minus 5 to the j.
855 00:51:42,320 --> 00:51:46,680 That's the probability that you pick that particular n for 856 00:51:46,680 --> 00:51:50,310 the random variable to be equal to 1. 857 00:51:50,310 --> 00:51:54,130 It's equal to this for 5 to the j less than or equal to n, 858 00:51:54,130 --> 00:51:57,350 less than 5 to the j plus 1. 859 00:51:57,350 --> 00:52:01,000 When you add up all of these things, when you add up 860 00:52:01,000 --> 00:52:04,330 the expected value of Yn over this 861 00:52:04,330 --> 00:52:06,120 interval, you get 1. 862 00:52:06,120 --> 00:52:09,460 When you add it up over the next interval, which is much, 863 00:52:09,460 --> 00:52:11,400 much bigger, you get 1 again. 864 00:52:11,400 --> 00:52:12,950 When you add it up over the next 865 00:52:12,950 --> 00:52:14,830 interval, you get 1 again. 866 00:52:14,830 --> 00:52:19,925 So the expected value of the sum-- 867 00:52:19,925 --> 00:52:22,850 the sum of the expected values of the Y sub 868 00:52:22,850 --> 00:52:26,520 n's is equal to infinity. 869 00:52:26,520 --> 00:52:34,850 And what you wind up with then is that this 870 00:52:34,850 --> 00:52:37,195 sequence does not converge-- 871 00:52:49,690 --> 00:52:52,430 This says the theorem doesn't apply at all. 872 00:52:52,430 --> 00:52:57,510 This says that the Y sub n of omega does not converge for 873 00:52:57,510 --> 00:52:59,330 any sample function at all. 874 00:53:03,050 --> 00:53:06,910 This says that according to the theorem, it doesn't have 875 00:53:06,910 --> 00:53:09,200 to converge. 876 00:53:09,200 --> 00:53:11,810 I mean, when you look at an example after working very 877 00:53:11,810 --> 00:53:17,110 hard to prove a theorem, you would like to find that, if the 878 00:53:17,110 --> 00:53:25,990 conditions of the theorem are satisfied, what the theorem 879 00:53:25,990 --> 00:53:28,120 says is satisfied also. 880 00:53:28,120 --> 00:53:31,700 Here, the conditions are not satisfied. 881 00:53:31,700 --> 00:53:33,810 And you also don't have convergence 882 00:53:33,810 --> 00:53:35,370 with probability 1. 883 00:53:35,370 --> 00:53:39,280 You do have convergence in probability, however. 884 00:53:39,280 --> 00:53:42,520 So this gives you a nice example of where you have a 885 00:53:42,520 --> 00:53:47,320 sequence of random variables that converges in probability. 886 00:53:47,320 --> 00:53:51,710 It converges in probability because as n gets larger and 887 00:53:51,710 --> 00:53:56,730 larger, the probability that Y sub n is going to be equal to 888 00:53:56,730 --> 00:54:00,900 anything other than 0 gets very, very small. 889 00:54:00,900 --> 00:54:05,450 So the limit as n goes to infinity of the probability 890 00:54:05,450 --> 00:54:07,920 that Y sub n is greater than epsilon-- 891 00:54:07,920 --> 00:54:11,750 for any epsilon greater than 0, this probability is equal 892 00:54:11,750 --> 00:54:13,760 to 0 for all epsilon. 893 00:54:13,760 --> 00:54:17,670 So this quantity does converge in probability. 894 00:54:17,670 --> 00:54:20,480 It does not converge with probability 1. 895 00:54:20,480 --> 00:54:24,420 It's the simplest example I know of where you don't have 896 00:54:24,420 --> 00:54:30,390 convergence with probability 1 and you do have convergence in 897 00:54:30,390 --> 00:54:32,290 probability. 898 00:54:32,290 --> 00:54:43,440 How about if you're looking at a sequence of sample averages? 899 00:54:43,440 --> 00:54:47,720 Suppose you're looking at S sub n over n where S sub n is 900 00:54:47,720 --> 00:54:51,680 a sum of IID random variables.
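The example just described is easy to simulate. Here is a minimal sketch; the function names, the trial counts, and the choice of index 600 are my own illustrative choices, not anything from the notes:

    import random

    def sample_path(num_blocks=6, rng=None):
        """One sample path of the Y_n example: within each block
        5**j <= n < 5**(j+1), one uniformly chosen index gets Y_n = 1
        and every other index in that block gets Y_n = 0."""
        rng = rng or random.Random()
        path = {}
        for j in range(num_blocks):
            lo, hi = 5 ** j, 5 ** (j + 1)
            chosen = rng.randrange(lo, hi)
            for n in range(lo, hi):
                path[n] = 1 if n == chosen else 0
        return path

    # Every sample path has a 1 somewhere in every block, so Y_n(omega)
    # fails to converge for every single omega.
    ones = sorted(n for n, y in sample_path().items() if y == 1)
    print("indices where Y_n = 1:", ones)

    # Convergence in probability still holds: for n in block j,
    # P(Y_n = 1) = 1/(5**(j+1) - 5**j), which goes to 0 as n grows.
    # Estimate P(Y_600 = 1); the index 600 lies in the block [125, 625).
    trials = 100_000
    hits = sum(1 for _ in range(trials) if random.randrange(125, 625) == 600)
    print("empirical:", hits / trials, " exact:", 1 / (625 - 125))

The printed list shows exactly one 1 per block, forever, while the empirical probability hovers around 0.002, which is the sample-path picture of the two kinds of convergence pulling apart.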
901 00:54:51,680 --> 00:54:57,460 Can you find an example there where when you have a-- 902 00:55:01,290 --> 00:55:05,850 can you find an example where this sequence S sub n over n 903 00:55:05,850 --> 00:55:10,960 does converge in probability, but does not converge with 904 00:55:10,960 --> 00:55:12,470 probability 1? 905 00:55:12,470 --> 00:55:16,660 Unfortunately, that's very hard to do. 906 00:55:16,660 --> 00:55:20,900 And the reason is the main theorem, which we will never 907 00:55:20,900 --> 00:55:30,210 get around to proving here, is that if you have a random 908 00:55:30,210 --> 00:55:36,510 variable X, and the expected value of the magnitude of X is 909 00:55:36,510 --> 00:55:40,780 finite, then the strong law of large numbers holds. 910 00:55:40,780 --> 00:55:47,530 Also, the weak law of large numbers holds, which means that 911 00:55:47,530 --> 00:55:50,600 you're not going to find an example where one holds and 912 00:55:50,600 --> 00:55:53,490 the other doesn't hold. 913 00:55:53,490 --> 00:55:57,910 So you have to go to strange things like this in order to 914 00:55:57,910 --> 00:56:00,110 get these examples that you're looking at. 915 00:56:04,220 --> 00:56:09,510 OK, let's now go from convergence with probability 1 916 00:56:09,510 --> 00:56:14,240 to applying this to the sequence of random variables 917 00:56:14,240 --> 00:56:23,210 where Y sub n is now equal to the sum of n IID random 918 00:56:23,210 --> 00:56:24,520 variables divided by n. 919 00:56:24,520 --> 00:56:29,130 Namely, it's the sample average, and we're looking at 920 00:56:29,130 --> 00:56:32,110 the limit as n goes to infinity 921 00:56:32,110 --> 00:56:33,320 of this sample average. 922 00:56:33,320 --> 00:56:40,500 What's the probability of the set of sample points for which 923 00:56:40,500 --> 00:56:47,580 this sample path converges to X bar? 924 00:56:47,580 --> 00:56:54,330 And the theorem says that this quantity is equal to 1 if the 925 00:56:54,330 --> 00:56:57,940 expected value of the magnitude of X is less than infinity. 926 00:56:57,940 --> 00:57:01,540 We're not going to prove that, but what we are going to prove 927 00:57:01,540 --> 00:57:07,360 is that if the fourth moment of X is 928 00:57:07,360 --> 00:57:09,730 finite, then 929 00:57:09,730 --> 00:57:12,640 this theorem is true. 930 00:57:12,640 --> 00:57:20,480 OK, when we write this from now on, we will sometimes get 931 00:57:20,480 --> 00:57:21,810 more terse. 932 00:57:21,810 --> 00:57:27,350 And instead of writing that the probability of an omega in the 933 00:57:27,350 --> 00:57:33,240 set of sample points such that this limit for a sample point 934 00:57:33,240 --> 00:57:36,510 is equal to X bar, that this whole thing is equal to 1, 935 00:57:36,510 --> 00:57:39,480 we can sometimes write it as the probability that this 936 00:57:39,480 --> 00:57:43,940 limit, which is now a limit of Sn of omega over n, 937 00:57:43,940 --> 00:57:45,100 is equal to X bar. 938 00:57:45,100 --> 00:57:47,450 And that's equal to 1. 939 00:57:47,450 --> 00:57:51,430 Some people write it even more tersely as the limit of S sub 940 00:57:51,430 --> 00:57:56,590 n over n is equal to X bar with probability 1. 941 00:57:56,590 --> 00:58:01,570 This is a very strange statement here because this-- 942 00:58:07,640 --> 00:58:11,320 I mean, what you're saying with this statement is not 943 00:58:11,320 --> 00:58:16,770 that this limit is equal to X bar with probability 1.
944 00:58:16,770 --> 00:58:21,610 It's saying, with probability 1, this limit here exists for 945 00:58:21,610 --> 00:58:25,260 a sample point, and that limit is equal to X bar. 946 00:58:25,260 --> 00:58:28,380 The thing which makes the strong law of large numbers 947 00:58:28,380 --> 00:58:34,520 difficult is not proving that the limit has 948 00:58:34,520 --> 00:58:36,170 a particular value. 949 00:58:36,170 --> 00:58:39,170 If there is a limit, it's always easy to find 950 00:58:39,170 --> 00:58:40,230 what the value is. 951 00:58:40,230 --> 00:58:43,790 The thing which is difficult is figuring out whether it has 952 00:58:43,790 --> 00:58:44,800 a limit or not. 953 00:58:44,800 --> 00:58:51,380 So this statement is fine for people who understand what it 954 00:58:51,380 --> 00:58:55,700 says, but it's kind of confusing otherwise. 955 00:58:55,700 --> 00:59:00,690 Still more tersely, people talk about it as Sn over n 956 00:59:00,690 --> 00:59:04,790 goes to the limit X bar with probability 1. 957 00:59:04,790 --> 00:59:07,940 This is probably an even better way to say it than this 958 00:59:07,940 --> 00:59:09,580 is because this is-- 959 00:59:09,580 --> 00:59:12,580 I mean, this says that there's something strange 960 00:59:12,580 --> 00:59:14,440 in the limit here. 961 00:59:14,440 --> 00:59:19,210 But I would suggest that you write it this way until you 962 00:59:19,210 --> 00:59:20,720 get used to what it's saying. 963 00:59:20,720 --> 00:59:24,620 Because then, when you write it this way, you realize that 964 00:59:24,620 --> 00:59:28,480 what you're talking about is the limit over individual 965 00:59:28,480 --> 00:59:33,520 sample points rather than some kind of more general limit. 966 00:59:33,520 --> 00:59:38,630 And convergence with probability 1 is always that 967 00:59:38,630 --> 00:59:42,420 sort of convergence. 968 00:59:42,420 --> 00:59:46,170 OK, this strong law and the idea of convergence with 969 00:59:46,170 --> 00:59:49,240 probability 1 is really pretty different from the other forms 970 00:59:49,240 --> 00:59:50,610 of convergence, 971 00:59:50,610 --> 00:59:54,420 in the sense that it focuses directly on sample paths. 972 00:59:54,420 --> 00:59:59,080 The other forms of convergence focus on things like the 973 00:59:59,080 --> 01:00:02,900 sequence of expected values, or the sequence of 974 01:00:02,900 --> 01:00:06,980 probabilities -- sequences of numbers, which are the things 975 01:00:06,980 --> 01:00:09,840 you're used to dealing with. 976 01:00:09,840 --> 01:00:15,820 Here you're dealing directly with sample points, and it 977 01:00:15,820 --> 01:00:18,740 makes it more difficult to talk about the rate of 978 01:00:18,740 --> 01:00:21,070 convergence as n approaches infinity. 979 01:00:21,070 --> 01:00:24,180 You can't talk about the rate of convergence here as n 980 01:00:24,180 --> 01:00:25,860 approaches infinity. 981 01:00:25,860 --> 01:00:28,720 If you have any n less than infinity, if you're only 982 01:00:28,720 --> 01:00:34,280 looking at a finite sequence, you have no way of saying 983 01:00:34,280 --> 01:00:38,210 whether any of the sample values over that sequence are 984 01:00:38,210 --> 01:00:41,360 going to converge or whether they're not going to converge, 985 01:00:41,360 --> 01:00:44,670 because you don't know what the rest of them are.
986 01:00:44,670 --> 01:00:48,570 So talking about a rate of convergence with respect to 987 01:00:48,570 --> 01:00:50,680 the strong law of large numbers 988 01:00:50,680 --> 01:00:53,270 doesn't make any sense. 989 01:00:53,270 --> 01:00:55,880 It's connected directly to the standard notion of 990 01:00:55,880 --> 01:01:00,640 convergence of a sequence of numbers when you look at those 991 01:01:00,640 --> 01:01:04,690 numbers along a sample path. 992 01:01:04,690 --> 01:01:07,900 This is what gives the strong law of large numbers its 993 01:01:07,900 --> 01:01:13,920 power, the fact that it's related to this standard idea 994 01:01:13,920 --> 01:01:14,690 of convergence. 995 01:01:14,690 --> 01:01:18,530 The standard idea of convergence is what the whole 996 01:01:18,530 --> 01:01:22,310 theory of analysis is built on. 997 01:01:22,310 --> 01:01:26,030 And there are some very powerful things you can do 998 01:01:26,030 --> 01:01:27,030 with analysis. 999 01:01:27,030 --> 01:01:31,740 And it's because convergence is defined the way that it is. 1000 01:01:31,740 --> 01:01:35,400 When we talk about the strong law of large numbers, we are 1001 01:01:35,400 --> 01:01:39,170 locked into that particular notion of convergence. 1002 01:01:39,170 --> 01:01:41,690 And therefore, it's going to have a lot of power. 1003 01:01:41,690 --> 01:01:44,050 We will see this as soon as we start talking 1004 01:01:44,050 --> 01:01:45,750 about renewal theory. 1005 01:01:45,750 --> 01:01:47,890 And in fact, we'll see it in the proof of the strong law 1006 01:01:47,890 --> 01:01:50,640 that we're going to go through. 1007 01:01:50,640 --> 01:01:53,470 Most of the heavy lifting with the strong law of large 1008 01:01:53,470 --> 01:01:57,780 numbers has been done by the analysis of convergence with 1009 01:01:57,780 --> 01:01:58,740 probability 1. 1010 01:01:58,740 --> 01:02:01,540 The hard thing is this theorem we've just proven. 1011 01:02:01,540 --> 01:02:02,990 And that's tricky. 1012 01:02:02,990 --> 01:02:05,750 And I apologize for getting a little confused about it as we 1013 01:02:05,750 --> 01:02:08,720 went through it, and not explaining all the steps 1014 01:02:08,720 --> 01:02:10,910 completely. 1015 01:02:10,910 --> 01:02:12,950 But as I said, it's hard to follow proofs 1016 01:02:12,950 --> 01:02:14,950 in real time anyway. 1017 01:02:14,950 --> 01:02:17,080 But all of that is done now. 1018 01:02:17,080 --> 01:02:19,700 How do we prove the strong law of large numbers 1019 01:02:19,700 --> 01:02:22,990 now if we accept this convergence 1020 01:02:22,990 --> 01:02:25,370 with probability 1? 1021 01:02:25,370 --> 01:02:29,370 Well, it turns out to be pretty easy. 1022 01:02:29,370 --> 01:02:32,300 We're going to assume that the fourth 1023 01:02:32,300 --> 01:02:35,890 moment of this underlying random variable 1024 01:02:35,890 --> 01:02:38,040 is less than infinity. 1025 01:02:38,040 --> 01:02:43,830 So let's look at the expected value of the sum of n random 1026 01:02:43,830 --> 01:02:47,120 variables taken to the fourth power. 1027 01:02:47,120 --> 01:02:48,790 OK, so what is that? 1028 01:02:48,790 --> 01:02:57,760 It's the expected value of S sub n times S sub n times S 1029 01:02:57,760 --> 01:03:00,030 sub n times S sub n. 1030 01:03:00,030 --> 01:03:05,220 S sub n is the sum of Xi from 1 to n. 1031 01:03:05,220 --> 01:03:07,020 It's also this. 1032 01:03:07,020 --> 01:03:08,840 It's also this. 1033 01:03:08,840 --> 01:03:09,970 It's also this.
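What the slide presumably shows at this point is the product of four copies of the same sum, with separate summation indices:

\[ S_n^4 = \Bigl( \sum_{i=1}^{n} X_i \Bigr) \Bigl( \sum_{j=1}^{n} X_j \Bigr) \Bigl( \sum_{k=1}^{n} X_k \Bigr) \Bigl( \sum_{l=1}^{n} X_l \Bigr), \qquad \mathsf{E}[S_n^4] = \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} \sum_{l=1}^{n} \mathsf{E}[X_i X_j X_k X_l]. \]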
1034 01:03:09,970 --> 01:03:14,150 So the expected value of S sub n to the fourth is the expected 1035 01:03:14,150 --> 01:03:17,810 value of this entire product here. 1036 01:03:17,810 --> 01:03:22,800 I should have a big bracket around all of that. 1037 01:03:22,800 --> 01:03:27,190 If I multiply all of these terms out, each of these terms 1038 01:03:27,190 --> 01:03:29,900 goes from 1 to n, what I'm going to get is the 1039 01:03:29,900 --> 01:03:32,710 sum over i from 1 to n. 1040 01:03:32,710 --> 01:03:35,010 Sum over j from 1 to n. 1041 01:03:35,010 --> 01:03:37,310 Sum over k from 1 to n. 1042 01:03:37,310 --> 01:03:40,810 And a sum over l from 1 to n. 1043 01:03:40,810 --> 01:03:44,710 So I'm going to have the expected value of X sub i 1044 01:03:44,710 --> 01:03:48,370 times X sub j times X sub k times X sub l. 1045 01:03:48,370 --> 01:03:50,100 Let's review what this is. 1046 01:03:50,100 --> 01:04:01,340 X sub i is the random variable for the i-th of these X's. 1047 01:04:01,340 --> 01:04:02,640 I have n X's-- 1048 01:04:02,640 --> 01:04:06,310 X1, X2, X3, up to X sub n. 1049 01:04:06,310 --> 01:04:11,000 What I'm trying to find is the expected value of this sum to 1050 01:04:11,000 --> 01:04:12,460 the fourth power. 1051 01:04:12,460 --> 01:04:15,702 When you look at the sum of something, if I look at the 1052 01:04:15,702 --> 01:04:34,410 sum of numbers [INAUDIBLE] a sub i, times the sum of a 1053 01:04:34,410 --> 01:04:38,740 sub j, writing the second index as j. 1054 01:04:38,740 --> 01:04:40,830 If I just do this, what's it equal to? 1055 01:04:40,830 --> 01:04:45,380 It's equal to the sum over i and j of a sub 1056 01:04:45,380 --> 01:04:47,890 i times a sub j. 1057 01:04:47,890 --> 01:04:50,650 I'm doing exactly the same thing here, but I'm taking the 1058 01:04:50,650 --> 01:04:52,270 expected value of it. 1059 01:04:52,270 --> 01:04:55,540 That's a finite sum. The expected value of the sum is 1060 01:04:55,540 --> 01:05:00,540 equal to the sum of the expected values. 1061 01:05:00,540 --> 01:05:07,240 So if I look at any particular value of X-- 1062 01:05:07,240 --> 01:05:08,580 of this first X here. 1063 01:05:08,580 --> 01:05:11,390 Suppose I look at i equals 1. 1064 01:05:11,390 --> 01:05:17,990 And suppose I look at the expected value of X1 times-- 1065 01:05:17,990 --> 01:05:20,460 and I'll make this anything other than 1. 1066 01:05:20,460 --> 01:05:24,350 I'll make this anything other than 1, and this anything 1067 01:05:24,350 --> 01:05:24,870 other than 1. 1068 01:05:24,870 --> 01:05:27,370 For example, suppose I'm trying to find the expected 1069 01:05:27,370 --> 01:05:35,490 value of X1 times X2 times X10 times X3. 1070 01:05:35,490 --> 01:05:38,110 OK, what is that? 1071 01:05:38,110 --> 01:05:42,910 Since X1, X2, X10, and X3 are all independent of each other, the 1072 01:05:42,910 --> 01:05:47,460 expected value of X1 times all these 1073 01:05:47,460 --> 01:05:52,420 other things is the expected value of X1 conditional on the 1074 01:05:52,420 --> 01:05:54,610 values of these other quantities, times those quantities. 1075 01:05:54,610 --> 01:05:57,105 And then I average over all the other quantities. 1076 01:06:00,210 --> 01:06:03,460 Now, if these are independent random variables, the expected 1077 01:06:03,460 --> 01:06:08,080 value of this given the values of these other quantities is 1078 01:06:08,080 --> 01:06:11,130 just the expected value of X1. 1079 01:06:11,130 --> 01:06:14,740 I'm dealing with a case where the expected value of X is 1080 01:06:14,740 --> 01:06:17,800 equal to 0.
1081 01:06:17,800 --> 01:06:20,330 Assuming X bar equals 0. 1082 01:06:20,330 --> 01:06:26,080 So when I pick i equal to 1 and all of these equal to 1083 01:06:26,080 --> 01:06:31,680 something other than 1, this expected value is equal to 0. 1084 01:06:31,680 --> 01:06:34,760 That's a whole bunch of expected values because that 1085 01:06:34,760 --> 01:06:39,580 includes j equals 2 to n, k equals 2 to n, and l 1086 01:06:39,580 --> 01:06:41,090 equals 2 to n. 1087 01:06:41,090 --> 01:06:45,520 Now, I can do this for i equals 2, i equals 3, 1088 01:06:45,520 --> 01:06:46,770 and so forth. 1089 01:06:48,950 --> 01:06:53,770 If i is different from j, and k, and l, this expected value 1090 01:06:53,770 --> 01:06:55,020 is equal to 0. 1091 01:06:58,340 --> 01:07:02,160 And the same thing if j is different 1092 01:07:02,160 --> 01:07:03,430 from all the others. 1093 01:07:03,430 --> 01:07:05,640 The expected value is equal to 0. 1094 01:07:05,640 --> 01:07:09,150 So how can I get anything that's nonzero? 1095 01:07:09,150 --> 01:07:15,240 Well, if I look at X sub 1 times X sub 1 times X sub 1 1096 01:07:15,240 --> 01:07:18,130 times X sub 1, that gives me the expected 1097 01:07:18,130 --> 01:07:19,690 value of X to the fourth. 1098 01:07:19,690 --> 01:07:22,950 That's not 0, presumably. 1099 01:07:22,950 --> 01:07:24,860 And I have n terms like that. 1100 01:07:29,050 --> 01:07:33,350 Well, I'm getting down to here. 1101 01:07:33,350 --> 01:07:37,540 What we have is two kinds of nonzero terms. 1102 01:07:37,540 --> 01:07:41,930 One of them is where i is equal to j is equal to k is 1103 01:07:41,930 --> 01:07:43,060 equal to l. 1104 01:07:43,060 --> 01:07:46,830 And then we have X sub i to the fourth power. 1105 01:07:46,830 --> 01:07:49,980 And we're assuming that's some finite quantity. 1106 01:07:49,980 --> 01:07:52,890 That's the basic assumption we're using here, the expected 1107 01:07:52,890 --> 01:07:55,740 value of X fourth is less than infinity. 1108 01:07:55,740 --> 01:07:58,470 What other kinds of things can we have? 1109 01:07:58,470 --> 01:08:05,130 Well, if i is equal to j, and if k is equal to l, then I 1110 01:08:05,130 --> 01:08:13,920 have the expected value of 1111 01:08:13,920 --> 01:08:16,890 Xi squared times Xk squared. 1112 01:08:16,890 --> 01:08:17,950 What is that? 1113 01:08:17,950 --> 01:08:22,510 Xi squared is independent of Xk squared because i is 1114 01:08:22,510 --> 01:08:23,710 unequal to k. 1115 01:08:23,710 --> 01:08:26,060 These are independent random variables. 1116 01:08:26,060 --> 01:08:31,250 So the expected value of Xi squared is what? 1117 01:08:31,250 --> 01:08:35,720 It's just the variance of X. This quantity here is the 1118 01:08:35,720 --> 01:08:37,819 variance of X also. 1119 01:08:37,819 --> 01:08:43,729 So I have the variance of X, squared. 1120 01:08:43,729 --> 01:08:50,040 So I have sigma to the fourth power. 1121 01:08:50,040 --> 01:08:55,720 So those are the only terms that I have for this second 1122 01:08:55,720 --> 01:08:59,850 kind of nonzero term where Xi-- 1123 01:08:59,850 --> 01:09:02,160 excuse me, not Xi is equal to Xj. 1124 01:09:02,160 --> 01:09:03,689 That's not what we're talking about. 1125 01:09:03,689 --> 01:09:06,819 Where i is equal to j. 1126 01:09:06,819 --> 01:09:12,330 Namely, we have a sum where i runs from 1 to n, where j runs 1127 01:09:12,330 --> 01:09:16,670 from 1 to n, k runs from 1 to n, and l runs from 1 to n.
1128 01:09:16,670 --> 01:09:21,040 What we're looking at is, for what values of i, j, k, and l 1129 01:09:21,040 --> 01:09:24,500 is this quantity not equal to 0? 1130 01:09:24,500 --> 01:09:28,430 We're saying that if i is equal to j is equal to k is 1131 01:09:28,430 --> 01:09:32,229 equal to l, then for all of those terms, we have the 1132 01:09:32,229 --> 01:09:34,550 expected value of X fourth. 1133 01:09:34,550 --> 01:09:39,640 For all terms in which i is equal to j and k is equal to 1134 01:09:39,640 --> 01:09:44,689 l, for all of those terms, we have the expected value of X 1135 01:09:44,689 --> 01:09:47,609 sub i squared times the expected value of X sub k squared. 1136 01:09:47,609 --> 01:09:50,560 Now, how many of those terms are there? 1137 01:09:50,560 --> 01:09:55,180 Well, i can be any one of n values. 1138 01:09:55,180 --> 01:10:01,220 And j can be any one of how many values? 1139 01:10:01,220 --> 01:10:02,470 It has to be equal to i. 1140 01:10:06,634 --> 01:10:11,120 So with i equal to j, how many things can k be? 1141 01:10:11,120 --> 01:10:17,450 It can't be equal to i because then we would wind up with X 1142 01:10:17,450 --> 01:10:19,550 sub i to the fourth power. 1143 01:10:19,550 --> 01:10:24,650 So we're looking at n minus 1 possible values for k, n 1144 01:10:24,650 --> 01:10:27,430 possible values for i. 1145 01:10:27,430 --> 01:10:30,820 So there are n times n minus 1 of those terms. 1146 01:10:30,820 --> 01:10:32,070 I can also have-- 1147 01:10:41,868 --> 01:10:44,600 let me write it this way. 1148 01:10:44,600 --> 01:10:51,580 I can have X sub i times X sub j with j equal to i, times X sub k times X sub l with l equal to k. 1149 01:10:51,580 --> 01:10:52,800 I can have those terms. 1150 01:10:52,800 --> 01:11:00,470 I can also have X sub i times X sub j with j unequal to i, 1151 01:11:00,470 --> 01:11:08,720 with k equal to j and l equal to i. 1152 01:11:08,720 --> 01:11:10,640 I can have terms like this. 1153 01:11:10,640 --> 01:11:13,650 And that gives me a sigma fourth term also. 1154 01:11:13,650 --> 01:11:18,630 I can also have X sub i times X sub j with j unequal to i, 1155 01:11:18,630 --> 01:11:22,370 where k can be equal to i and l can be equal to j. 1156 01:11:22,370 --> 01:11:24,880 So I really have three kinds of terms. 1157 01:11:24,880 --> 01:11:33,840 I have 3 times n times n minus 1 times the expected 1158 01:11:33,840 --> 01:11:43,690 value of X squared, this quantity squared. 1159 01:11:43,690 --> 01:11:48,190 So that's the total value of the expected value of S sub n to 1160 01:11:48,190 --> 01:11:50,130 the fourth. 1161 01:11:50,130 --> 01:11:55,640 It's the n terms for which i is equal to j is equal to k is 1162 01:11:55,640 --> 01:12:02,700 equal to l plus the 3n times n minus 1 terms in which we have 1163 01:12:02,700 --> 01:12:05,790 two pairs of equal indices. 1164 01:12:05,790 --> 01:12:07,470 So we have that quantity here. 1165 01:12:07,470 --> 01:12:13,010 Now, the expected value of X fourth is the second moment of 1166 01:12:13,010 --> 01:12:16,410 the random variable X squared. 1167 01:12:16,410 --> 01:12:23,870 And the expected value of X squared, squared, is the squared mean of 1168 01:12:23,870 --> 01:12:26,610 the random variable X squared. 1169 01:12:26,610 --> 01:12:30,580 A squared mean is always less than or equal to a second moment, 1170 01:12:30,580 --> 01:12:32,100 because the variance of X squared is non-negative. 1171 01:12:32,100 --> 01:12:36,190 So the expected value of Sn fourth is-- 1172 01:12:38,980 --> 01:12:43,310 well, actually it's less than or equal to 3n squared times 1173 01:12:43,310 --> 01:12:45,690 the expected value of X fourth.
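Assembled, the count just made gives the following; the inequality at the end uses the fact just stated, that a squared mean never exceeds a second moment (here with X bar equal to 0, so the expected value of X squared is sigma squared):

\[ \mathsf{E}[S_n^4] = n\,\mathsf{E}[X^4] + 3n(n-1)\,\sigma^4 \le 3n^2\,\mathsf{E}[X^4], \qquad \sigma^4 = \bigl( \mathsf{E}[X^2] \bigr)^2 \le \mathsf{E}[X^4]. \]

A quick numerical sanity check, entirely my own: take X equal to plus or minus 1 with equal probability, so E[X^4] = sigma^4 = 1 and the formula predicts n + 3n(n-1).

    import random

    n, trials = 10, 200_000
    rng = random.Random(1)
    total = 0.0
    for _ in range(trials):
        # One sample of S_n: a sum of n independent +/-1 variables.
        s = sum(rng.choice((-1, 1)) for _ in range(n))
        total += s ** 4
    # With n = 10, the formula gives 10 + 3*10*9 = 280.
    print("Monte Carlo E[S_n^4]:", total / trials, " formula:", n + 3 * n * (n - 1))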
1174 01:12:45,690 --> 01:12:50,020 And blah, blah, blah, until we get to 3 times the expected 1175 01:12:50,020 --> 01:12:53,300 value of X fourth times the sum from n equals 1 to 1176 01:12:53,300 --> 01:12:56,140 infinity of 1 over n squared. 1177 01:12:56,140 --> 01:12:59,695 Now, is that quantity finite or is it infinite? 1178 01:13:06,050 --> 01:13:09,100 Well, let's talk about three different ways of showing that 1179 01:13:09,100 --> 01:13:13,220 this sum is going to be finite. 1180 01:13:13,220 --> 01:13:17,710 One of the ways is that this is an approximation, a crude 1181 01:13:17,710 --> 01:13:20,860 approximation, of the integral from 1 to 1182 01:13:20,860 --> 01:13:24,395 infinity of 1 over x squared. 1183 01:13:24,395 --> 01:13:26,990 You know that that integral is finite. 1184 01:13:26,990 --> 01:13:30,560 Another way of doing it is you already know that if you take 1185 01:13:30,560 --> 01:13:35,650 1 over n times 1 over n plus 1, you know how to sum that. 1186 01:13:35,650 --> 01:13:37,340 That sum is finite. 1187 01:13:37,340 --> 01:13:41,040 You can bound this by that. 1188 01:13:43,940 --> 01:13:48,570 And the other way of doing it is simply to know that the sum 1189 01:13:48,570 --> 01:13:50,140 of 1 over n squared is finite. 1190 01:13:53,990 --> 01:13:59,740 So what this says is that the sum over n of the expected value of S sub n 1191 01:13:59,740 --> 01:14:04,290 fourth over n fourth is less than infinity. 1192 01:14:04,290 --> 01:14:14,210 That says that the probability of the set of omega for which 1193 01:14:14,210 --> 01:14:20,450 the limit of S sub n fourth over n fourth is equal to 0 is equal to 1. 1194 01:14:20,450 --> 01:14:24,750 In other words, it's saying that S sub n fourth of 1195 01:14:24,750 --> 01:14:30,850 omega over n fourth converges to 0. 1196 01:14:30,850 --> 01:14:34,200 That's not quite what we want, is it? 1197 01:14:34,200 --> 01:14:37,770 But the set of sample points for which this quantity 1198 01:14:37,770 --> 01:14:44,790 converges has probability 1. 1199 01:14:44,790 --> 01:14:47,860 And here is where you see the real power of the strong law 1200 01:14:47,860 --> 01:14:49,410 of large numbers. 1201 01:14:49,410 --> 01:14:57,170 Because if these numbers converge to 0 with probability 1202 01:14:57,170 --> 01:15:11,310 1 -- what happens to the sequence of numbers S sub n fourth of 1203 01:15:11,310 --> 01:15:15,920 omega divided by n to the fourth, this limit-- 1204 01:15:23,488 --> 01:15:28,940 if this was equal to 0, then what is the limit as n 1205 01:15:28,940 --> 01:15:35,541 approaches infinity of S sub n of omega over n? 1206 01:15:38,880 --> 01:15:43,240 If I take the fourth root of this, I get this. 1207 01:15:43,240 --> 01:15:47,220 If this quantity is converging to 0, the fourth root of this 1208 01:15:47,220 --> 01:15:53,830 also has to be converging to 0 on a sample-path basis. The 1209 01:15:53,830 --> 01:15:58,300 fact that this converges means that this converges also. 1210 01:16:00,910 --> 01:16:03,350 Now, you see if you were dealing with convergence in 1211 01:16:03,350 --> 01:16:07,060 probability or something like that, you couldn't play this 1212 01:16:07,060 --> 01:16:09,090 funny game. 1213 01:16:09,090 --> 01:16:12,450 And the ability to play this game is really what makes 1214 01:16:12,450 --> 01:16:16,490 convergence with probability 1 a powerful concept. 1215 01:16:16,490 --> 01:16:19,350 You can do all sorts of strange things with it. 1216 01:16:19,350 --> 01:16:23,590 And we'll talk about that next time.
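Assembled, the proof just given reads as follows; the last step, taking the fourth root along each sample path, is exactly the game described above. The lecture takes X bar equal to 0, and presumably the general case follows by applying the same argument to X minus X bar:

\[ \sum_{n=1}^{\infty} \mathsf{E}\Bigl[ \frac{S_n^4}{n^4} \Bigr] \le 3\,\mathsf{E}[X^4] \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty \;\Longrightarrow\; \Pr\Bigl\{ \omega : \lim_{n \to \infty} \frac{S_n^4(\omega)}{n^4} = 0 \Bigr\} = 1 \;\Longrightarrow\; \Pr\Bigl\{ \omega : \lim_{n \to \infty} \frac{S_n(\omega)}{n} = 0 \Bigr\} = 1. \]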
1217 01:16:23,590 --> 01:16:29,920 But that's why all of this works. 1218 01:16:29,920 --> 01:16:33,960 So that's what says that the probability of the set of 1219 01:16:33,960 --> 01:16:37,570 omega for which the limit of Sn of omega over n 1220 01:16:37,570 --> 01:16:38,820 equals 0 is equal to 1. 1221 01:16:41,840 --> 01:16:45,590 Now, let's look at the strange aspect of 1222 01:16:45,590 --> 01:16:47,880 what we've just done. 1223 01:16:47,880 --> 01:16:51,940 And this is where things get very peculiar. 1224 01:16:51,940 --> 01:16:55,640 Let's look at the Bernoulli case, which by now we all 1225 01:16:55,640 --> 01:16:57,470 understand. 1226 01:16:57,470 --> 01:17:03,260 So we consider a Bernoulli process; all 1227 01:17:03,260 --> 01:17:04,860 moments of X exist. 1228 01:17:04,860 --> 01:17:08,110 The moment-generating function of X exists. 1229 01:17:08,110 --> 01:17:11,590 X is about as well-behaved as you can expect because it only 1230 01:17:11,590 --> 01:17:14,030 has the values 1 or 0. 1231 01:17:14,030 --> 01:17:16,630 So it's very nice. 1232 01:17:16,630 --> 01:17:19,690 The expected value of X is going to be equal 1233 01:17:19,690 --> 01:17:22,070 to p in this case. 1234 01:17:22,070 --> 01:17:28,380 The set of sample paths for which the limit of Sn of omega over n is 1235 01:17:28,380 --> 01:17:31,270 equal to p has probability 1. 1236 01:17:31,270 --> 01:17:37,300 In other words, with probability 1, when you look 1237 01:17:37,300 --> 01:17:40,305 at a sample path and you look at the whole thing from n 1238 01:17:40,305 --> 01:17:43,730 equals 1 off to infinity, and you take the limit of that 1239 01:17:43,730 --> 01:17:47,750 sample path as n goes to infinity, what you get is p. 1240 01:17:47,750 --> 01:17:52,090 And the probability that you get p is equal to 1. 1241 01:17:52,090 --> 01:17:55,880 Well, now, the thing that's disturbing is, if you look at 1242 01:17:55,880 --> 01:17:59,930 another Bernoulli process where the probability of a 1 1243 01:17:59,930 --> 01:18:03,160 is p prime instead of p, 1244 01:18:03,160 --> 01:18:06,630 what happens then? 1245 01:18:06,630 --> 01:18:12,440 With probability 1, you get convergence of Sn of omega 1246 01:18:12,440 --> 01:18:19,820 over n, but the convergence is to p prime instead of to p. 1247 01:18:19,820 --> 01:18:24,790 The events in these two spaces are exactly the same. 1248 01:18:24,790 --> 01:18:28,470 We've changed the probability measure, but we've kept all 1249 01:18:28,470 --> 01:18:30,740 the events the same. 1250 01:18:30,740 --> 01:18:34,040 And by changing the probability measure, we have 1251 01:18:34,040 --> 01:18:43,160 changed one set of probability 1 into a set of probability 0. 1252 01:18:43,160 --> 01:18:46,500 And we changed another set of probability 0 into a set of 1253 01:18:46,500 --> 01:18:48,550 probability 1. 1254 01:18:48,550 --> 01:18:52,130 So we have two different events here. 1255 01:18:52,130 --> 01:18:56,450 On one probability measure, this event has probability 1. 1256 01:18:56,450 --> 01:18:58,700 On the other one, it has probability 0. 1257 01:18:58,700 --> 01:19:04,930 They're both very nice, very well-behaved probabilistic 1258 01:19:04,930 --> 01:19:06,750 situations. 1259 01:19:06,750 --> 01:19:08,560 So that's a little disturbing. 1260 01:19:08,560 --> 01:19:14,140 But then you say, you can pick p in an uncountably infinite 1261 01:19:14,140 --> 01:19:15,940 number of ways.
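To see the two-measure picture concretely, here is a small sketch; the parameter values 0.3 and 0.7, the sample sizes, and the seed are arbitrary choices of mine:

    import random

    def sample_average(p, n, rng):
        """S_n / n for n Bernoulli(p) trials."""
        return sum(rng.random() < p for _ in range(n)) / n

    rng = random.Random(0)
    for p in (0.3, 0.7):
        averages = [sample_average(p, n, rng) for n in (100, 10_000, 1_000_000)]
        print("p =", p, "sample averages:", averages)
    # Under the measure with parameter p, the event
    # {omega : lim S_n(omega)/n = p} has probability 1, and the same
    # event judged under the other measure has probability 0, even
    # though the underlying set of binary sequences is identical.

Each run settles near its own p, which is the sample-path version of the statement that each measure puts probability 1 on its own convergence event and probability 0 on the other's.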
1262 01:19:15,940 --> 01:19:18,680 And for each way you pick p, you have 1263 01:19:18,680 --> 01:19:20,160 uncountably many events. 1264 01:19:23,250 --> 01:19:28,790 Excuse me, for each value of p, you have one event of 1265 01:19:28,790 --> 01:19:31,790 probability 1 for that p. 1266 01:19:31,790 --> 01:19:35,750 So as you go through this uncountable number of p's, 1267 01:19:35,750 --> 01:19:39,480 you 1268 01:19:39,480 --> 01:19:43,560 have an uncountable number of events, each of which has 1269 01:19:43,560 --> 01:19:47,890 probability 1 for its own p. 1270 01:19:47,890 --> 01:19:52,600 And now the set of sequences that converge is, in fact, a 1271 01:19:52,600 --> 01:19:55,270 rather peculiar set to start with. 1272 01:19:55,270 --> 01:19:57,656 So if you look at all the other things that are going to 1273 01:19:57,656 --> 01:20:01,010 happen, there are an awful lot of those events also. 1274 01:20:04,300 --> 01:20:08,780 So what is happening here is that these events that we're 1275 01:20:08,780 --> 01:20:14,750 talking about are indeed very, very peculiar events. 1276 01:20:14,750 --> 01:20:16,530 I mean, all the mathematics works out. 1277 01:20:16,530 --> 01:20:17,850 The mathematics is fine. 1278 01:20:17,850 --> 01:20:21,420 There's no doubt about it. 1279 01:20:21,420 --> 01:20:24,280 In fact, the mathematics of probability 1280 01:20:24,280 --> 01:20:26,570 theory was worked out. 1281 01:20:26,570 --> 01:20:29,940 People like Kolmogorov went to great efforts to make sure 1282 01:20:29,940 --> 01:20:31,960 that all of this worked out. 1283 01:20:31,960 --> 01:20:34,330 But then he wound up with this peculiar kind 1284 01:20:34,330 --> 01:20:36,660 of situation here. 1285 01:20:36,660 --> 01:20:40,170 And that's what happens when you go to an infinite number 1286 01:20:40,170 --> 01:20:43,080 of random variables. 1287 01:20:43,080 --> 01:20:49,320 And it's ugly, but that's the way it is. 1288 01:20:49,320 --> 01:20:55,140 So what I'm arguing here is that when you go from 1289 01:20:55,140 --> 01:21:00,220 finite n to infinite n, and you start interchanging 1290 01:21:00,220 --> 01:21:06,010 limits, and you start taking limits without much care, and 1291 01:21:06,010 --> 01:21:09,280 you start doing all the things that you would like to do, 1292 01:21:09,280 --> 01:21:15,700 thinking that infinite n is sort of the same as finite n. 1293 01:21:15,700 --> 01:21:18,990 In most places in probability, you can do that and you can 1294 01:21:18,990 --> 01:21:20,170 get away with it. 1295 01:21:20,170 --> 01:21:22,800 As soon as you start dealing with the strong law of large 1296 01:21:22,800 --> 01:21:26,130 numbers, you suddenly really have to start being careful 1297 01:21:26,130 --> 01:21:27,810 about this. 1298 01:21:27,810 --> 01:21:31,900 So from now on, we have to be just a little bit careful 1299 01:21:31,900 --> 01:21:36,120 about interchanging limits, interchanging summation and 1300 01:21:36,120 --> 01:21:40,190 integration, interchanging all sorts of things, as soon as we 1301 01:21:40,190 --> 01:21:43,540 have an infinite number of random variables. 1302 01:21:43,540 --> 01:21:47,600 So that's a concern that we have to worry about from here on. 1303 01:21:47,600 --> 01:21:48,850 OK, thank you.