1 00:00:01,161 --> 00:00:03,920 ANNOUNCER: The following content is provided under a Creative 2 00:00:03,920 --> 00:00:05,310 Commons license. 3 00:00:05,310 --> 00:00:07,520 Your support will help MIT OpenCourseWare 4 00:00:07,520 --> 00:00:11,610 continue to offer high quality educational resources for free. 5 00:00:11,610 --> 00:00:14,180 To make a donation or to view additional materials 6 00:00:14,180 --> 00:00:16,670 from hundreds of MIT courses, visit 7 00:00:16,670 --> 00:00:18,540 MIT OpenCourseWare at ocw.mit.edu. 8 00:00:22,824 --> 00:00:25,960 PROFESSOR: So this is a big day mathematically speaking, 9 00:00:25,960 --> 00:00:32,320 because we come to this key idea, which is 10 00:00:32,320 --> 00:00:34,000 a little bit like eigenvalues. 11 00:00:34,000 --> 00:00:36,970 Well, a lot like eigenvalues, but different 12 00:00:36,970 --> 00:00:44,410 because the matrix A now is more usually rectangular. 13 00:00:44,410 --> 00:00:48,250 So for a rectangular matrix, the whole idea of eigenvalues 14 00:00:48,250 --> 00:00:54,160 is shot because if I multiply A times a vector 15 00:00:54,160 --> 00:00:59,590 x in n dimensions, out will come something in m dimensions 16 00:00:59,590 --> 00:01:01,810 and it's not going to equal lambda x. 17 00:01:01,810 --> 00:01:08,620 So Ax equal lambda x is not even possible if A is rectangular. 18 00:01:08,620 --> 00:01:13,330 And even if A is square, what are the problems, just thinking 19 00:01:13,330 --> 00:01:16,420 for a minute about eigenvalues? 20 00:01:16,420 --> 00:01:20,950 The case I wrote up here is the great case 21 00:01:20,950 --> 00:01:24,790 where I have a symmetric matrix and then it's 22 00:01:24,790 --> 00:01:28,720 got a full set of eigenvalues and eigenvectors 23 00:01:28,720 --> 00:01:31,360 and they're orthogonal, all good. 
24 00:01:31,360 --> 00:01:36,730 But for a general square matrix, either the eigenvectors 25 00:01:36,730 --> 00:01:38,800 are complex-- 26 00:01:38,800 --> 00:01:42,550 eigenvalues are complex or the eigenvectors 27 00:01:42,550 --> 00:01:43,870 are not orthogonal. 28 00:01:46,780 --> 00:01:50,410 So we can't stay with eigenvalues forever. 29 00:01:50,410 --> 00:01:52,000 That's what I'm saying. 30 00:01:52,000 --> 00:01:55,840 And this is the right thing to do. 31 00:01:55,840 --> 00:01:57,910 So what are these pieces? 32 00:01:57,910 --> 00:02:07,710 So these are the left and these are the right singular vectors. 33 00:02:07,710 --> 00:02:10,035 So the new word is singular. 34 00:02:16,170 --> 00:02:18,660 And in between go the-- 35 00:02:18,660 --> 00:02:23,790 not the eigenvalues, but the singular values. 36 00:02:23,790 --> 00:02:25,780 So we've got the whole point now. 37 00:02:25,780 --> 00:02:27,690 You've got to pick up on this. 38 00:02:27,690 --> 00:02:32,490 There are two sets of singular vectors, not one. 39 00:02:32,490 --> 00:02:38,540 For eigenvectors, we just had one set, the Q's. 40 00:02:38,540 --> 00:02:41,690 Now we have a rectangular matrix, 41 00:02:41,690 --> 00:02:47,870 we've got one set of left singular vectors in m dimensions, 42 00:02:47,870 --> 00:02:50,630 and we've got another set of right singular vectors 43 00:02:50,630 --> 00:02:53,030 in n dimensions. 44 00:02:53,030 --> 00:02:58,280 And numbers in between are not eigenvalues, 45 00:02:58,280 --> 00:02:59,700 but singular values. 46 00:02:59,700 --> 00:03:00,770 So these guys are-- 47 00:03:03,440 --> 00:03:05,150 let me write what that looks like. 48 00:03:05,150 --> 00:03:11,090 This is, again, a diagonal matrix, sigma 1 to sigma r, 49 00:03:11,090 --> 00:03:13,470 let's say. 50 00:03:13,470 --> 00:03:17,250 So it's, again, a diagonal matrix in the middle. 51 00:03:17,250 --> 00:03:26,670 But the numbers on the diagonal are all positive or 0. 
52 00:03:26,670 --> 00:03:29,110 And they're called singular values. 53 00:03:29,110 --> 00:03:31,920 So it's just a different world. 54 00:03:31,920 --> 00:03:35,940 So really, the first step I have to do, the math step, 55 00:03:35,940 --> 00:03:41,550 is to show that any matrix can be 56 00:03:41,550 --> 00:03:47,630 factored into U times sigma times V transpose. 57 00:03:47,630 --> 00:03:51,810 So that's the parallel to the spectral theorem 58 00:03:51,810 --> 00:03:55,650 that any symmetric matrix could be factored that way. 59 00:03:55,650 --> 00:03:58,380 So you're good for that part. 60 00:03:58,380 --> 00:04:04,230 We just have to do it to see what are U and sigma and V? 61 00:04:04,230 --> 00:04:08,280 What are those vectors and those singular values? 62 00:04:08,280 --> 00:04:09,810 Let's go. 63 00:04:09,810 --> 00:04:16,990 So the key is that A transpose A is a great matrix. 64 00:04:16,990 --> 00:04:24,370 So that's the key to the math is A transpose A. So 65 00:04:24,370 --> 00:04:26,650 what are the properties of A transpose A? 66 00:04:26,650 --> 00:04:28,970 A is rectangular again. 67 00:04:28,970 --> 00:04:34,710 So maybe m by n. A transpose-- 68 00:04:34,710 --> 00:04:36,715 So this was m by n. 69 00:04:36,715 --> 00:04:39,760 And this was n by m. 70 00:04:39,760 --> 00:04:44,500 So we get a result that's n by n. 71 00:04:44,500 --> 00:04:48,630 And what else can you tell me about A transpose A? 72 00:04:48,630 --> 00:04:49,780 It's symmetric. 73 00:04:49,780 --> 00:04:51,610 That's a big deal. 74 00:04:51,610 --> 00:04:53,230 And it's square. 75 00:04:53,230 --> 00:04:55,690 And well yeah, you can tell me more now, 76 00:04:55,690 --> 00:04:58,690 because we talked about something, 77 00:04:58,690 --> 00:05:04,390 a topic that's a little more than symmetric last time. 78 00:05:04,390 --> 00:05:09,070 The matrix A transpose A will be positive definite. 79 00:05:09,070 --> 00:05:13,990 Its eigenvalues are greater or equal to 0. 
80 00:05:13,990 --> 00:05:18,190 And that will mean that we can take their square roots. 81 00:05:18,190 --> 00:05:19,720 And that's what we will do. 82 00:05:19,720 --> 00:05:23,020 So A transpose A will have a factorization. 83 00:05:23,020 --> 00:05:24,380 It's symmetric. 84 00:05:24,380 --> 00:05:29,200 It'll have, like, a Q lambda Q transpose, but I'm 85 00:05:29,200 --> 00:05:31,630 going to call it V lambda-- 86 00:05:31,630 --> 00:05:36,610 no, yeah, lambda-- I'll still call it lambda V transpose. 87 00:05:36,610 --> 00:05:39,700 So these V's-- what do we know about these 88 00:05:39,700 --> 00:05:43,180 V's, the eigenvectors of this guy? 89 00:05:43,180 --> 00:05:46,030 Square, symmetric, positive definite matrix. 90 00:05:46,030 --> 00:05:47,740 So we're in good shape. 91 00:05:47,740 --> 00:05:52,420 And what do we know about the eigenvalues of A transpose A? 92 00:05:52,420 --> 00:05:55,380 They are all positive. 93 00:05:55,380 --> 00:05:59,230 So the eigenvalues are-- well, or equal to 0. 94 00:05:59,230 --> 00:06:02,905 And these guys are orthogonal. 95 00:06:02,905 --> 00:06:04,780 And these guys are greater or equal to 0. 96 00:06:07,330 --> 00:06:09,700 So that's good. 97 00:06:09,700 --> 00:06:12,520 That's one of our-- 98 00:06:12,520 --> 00:06:14,300 We'll depend a lot on that. 99 00:06:14,300 --> 00:06:17,365 But also, you've got to recognize that A, 100 00:06:17,365 --> 00:06:24,310 A transpose is a different guy, A, A transpose. 101 00:06:24,310 --> 00:06:28,300 So what's the shape of A, A transpose? 102 00:06:28,300 --> 00:06:30,160 How big is that? 103 00:06:30,160 --> 00:06:33,180 Now I've got-- what do I have? 104 00:06:33,180 --> 00:06:35,220 M by n times n by m. 105 00:06:35,220 --> 00:06:37,580 So this will be what size? 106 00:06:37,580 --> 00:06:38,340 m by m. 107 00:06:38,340 --> 00:06:44,370 Different shape but with the same eigenvalues-- 108 00:06:44,370 --> 00:06:45,510 the same eigenvalues. 
109 00:06:45,510 --> 00:06:48,210 So it's going to have some other eigenvectors, u-- of course, 110 00:06:48,210 --> 00:06:49,752 I'm going to call them u, because I'm 111 00:06:49,752 --> 00:06:51,330 going to go in over there. 112 00:06:51,330 --> 00:06:53,140 They'll be the same. 113 00:06:53,140 --> 00:06:56,000 Well, they're saying yeah, let me-- 114 00:06:56,000 --> 00:07:02,650 I shouldn't-- I have to say when I say the same, 115 00:07:02,650 --> 00:07:05,690 I can't quite literally mean the very same, 116 00:07:05,690 --> 00:07:10,700 because this has got n eigenvalues and this has m 117 00:07:10,700 --> 00:07:12,650 eigenvalues. 118 00:07:12,650 --> 00:07:16,660 But the missing guys, the ones that are in one of them 119 00:07:16,660 --> 00:07:21,110 and not in the other, depending on the sizes, are zeros. 120 00:07:21,110 --> 00:07:25,670 So really, the heart of the thing, the non-zero eigenvalues 121 00:07:25,670 --> 00:07:26,290 are the same. 122 00:07:29,060 --> 00:07:34,210 Well actually, I've pretty much revealed 123 00:07:34,210 --> 00:07:38,350 what the SVD is going to use. 124 00:07:38,350 --> 00:07:43,520 It's going to use the U's from here and the V's from here. 125 00:07:43,520 --> 00:07:45,280 But that's the story. 126 00:07:45,280 --> 00:07:48,550 You've got to see that story. 127 00:07:48,550 --> 00:07:52,330 So fresh start on the singular value decomposition. 128 00:07:52,330 --> 00:07:54,110 What are we looking for? 129 00:07:54,110 --> 00:07:56,990 Well, as a factorization-- 130 00:07:56,990 --> 00:07:58,030 so we're looking for-- 131 00:08:03,460 --> 00:08:12,450 we want A. We want vectors v, so that when I multiply by v-- 132 00:08:12,450 --> 00:08:17,100 so if it was an eigenvector, it would be Av equal lambda v. 133 00:08:17,100 --> 00:08:20,600 But now for A, it's rectangular. 134 00:08:20,600 --> 00:08:22,530 It hasn't got eigenvectors. 135 00:08:22,530 --> 00:08:31,780 So Av is sigma, that the new singular value times u. 
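The claim above that A transpose A and A, A transpose share their non-zero eigenvalues, with the leftover eigenvalues equal to 0, can be checked by hand on a small case. The following is only a sketch under stated assumptions: the 3 by 2 matrix `A` and the helper names (`transpose`, `matmul`, `eig_sym_2x2`, `det3`, `shifted`) are chosen for this illustration and do not come from the lecture.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    # Row-times-column product of two matrices stored as lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def eig_sym_2x2(S):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def shifted(M, lam):
    """Return M - lam * I."""
    n = len(M)
    return [[M[r][c] - (lam if r == c else 0) for c in range(n)] for r in range(n)]

A = [[1, 1], [0, 1], [1, 0]]       # m = 3, n = 2 (hypothetical example matrix)
AtA = matmul(transpose(A), A)      # n by n, here 2 by 2
AAt = matmul(A, transpose(A))      # m by m, here 3 by 3
lam1, lam2 = eig_sym_2x2(AtA)      # eigenvalues of A^T A

# Each eigenvalue of A^T A is also a root of det(A A^T - lambda I) = 0,
# and the one extra eigenvalue of the larger matrix A A^T is zero.
residuals = [det3(shifted(AAt, lam)) for lam in (lam1, lam2, 0)]
```

All three entries of `residuals` should vanish up to rounding, which is the numerical form of the statement that the two products differ only by zero eigenvalues.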
136 00:08:31,780 --> 00:08:38,169 That's the first guy and the second guy and the rth guy. 137 00:08:38,169 --> 00:08:42,740 I'll stop at r, the rank. 138 00:08:42,740 --> 00:08:43,350 Oh, yeah. 139 00:08:46,047 --> 00:08:46,880 Is that what I want? 140 00:08:50,870 --> 00:08:53,740 A-- let me just see. 141 00:08:53,740 --> 00:08:56,200 Av is sigma u. 142 00:08:56,200 --> 00:08:58,570 Yeah, that's good. 143 00:08:58,570 --> 00:09:02,750 So this is what takes the place of Ax equal lambda x. 144 00:09:02,750 --> 00:09:06,200 A times one set of singular vectors 145 00:09:06,200 --> 00:09:09,590 gives me a number times the other set of singular vectors. 146 00:09:09,590 --> 00:09:13,010 And why did I stop at r, the rank? 147 00:09:13,010 --> 00:09:15,360 Because after that, the sigmas are 0. 148 00:09:15,360 --> 00:09:19,700 So after that, I could have some more guys, 149 00:09:19,700 --> 00:09:29,030 but they'll be in the null space, giving 0, on down to vn. 150 00:09:29,030 --> 00:09:32,460 So these are the important ones. 151 00:09:32,460 --> 00:09:35,910 So that's what I'm looking for. 152 00:09:35,910 --> 00:09:38,680 Let me say it now in words. 153 00:09:38,680 --> 00:09:43,540 I'm looking for a bunch of orthogonal vectors v 154 00:09:43,540 --> 00:09:46,330 so that when I multiply them by A 155 00:09:46,330 --> 00:09:49,910 I get a bunch of orthogonal vectors u. 156 00:09:49,910 --> 00:09:53,570 That is not so clearly possible. 157 00:09:53,570 --> 00:09:56,090 But it is possible. 158 00:09:56,090 --> 00:09:57,320 It does happen. 159 00:09:57,320 --> 00:09:59,840 I'm looking for one set of orthogonal vectors 160 00:09:59,840 --> 00:10:02,960 v in the input space, you could say, 161 00:10:02,960 --> 00:10:07,850 so that the Av's in the output space are also orthogonal. 
162 00:10:11,360 --> 00:10:14,390 In our picture of the fundamental-- 163 00:10:14,390 --> 00:10:18,560 the big picture of linear algebra, 164 00:10:18,560 --> 00:10:27,530 we have v's in this space, and then stuff in the null space. 165 00:10:27,530 --> 00:10:34,190 And we have u's over here in the column space 166 00:10:34,190 --> 00:10:38,160 and some stuff in the null space over there. 167 00:10:38,160 --> 00:10:42,840 And the idea is that I have orthogonal v's here. 168 00:10:42,840 --> 00:10:44,820 And when I multiply by A-- 169 00:10:44,820 --> 00:10:48,920 so multiply by A-- 170 00:10:48,920 --> 00:10:54,760 then I get orthogonal u's over here, orthogonal to orthogonal. 171 00:10:54,760 --> 00:11:00,260 That's what makes the v's and the u's special. 172 00:11:00,260 --> 00:11:01,510 Right? 173 00:11:01,510 --> 00:11:03,370 That's the property. 174 00:11:03,370 --> 00:11:04,960 And then when we write down-- well, 175 00:11:04,960 --> 00:11:07,390 let me write down what that would mean. 176 00:11:07,390 --> 00:11:11,950 So I've just drawn a picture to go with this-- 177 00:11:11,950 --> 00:11:13,930 those equations. 178 00:11:13,930 --> 00:11:16,610 That picture just goes with these equations. 179 00:11:16,610 --> 00:11:18,970 And let me just write down what it means. 180 00:11:18,970 --> 00:11:23,200 It means in matrix-- so I've written it. 181 00:11:23,200 --> 00:11:28,653 Oh yeah, I've written it here in vectors one at a time. 182 00:11:28,653 --> 00:11:30,070 But of course, you know, I'm going 183 00:11:30,070 --> 00:11:33,610 to put those vectors into the columns of a matrix. 184 00:11:33,610 --> 00:11:44,380 So A times v1 up to let's say vr will equal-- 185 00:11:44,380 --> 00:11:45,490 oh yeah. 186 00:11:45,490 --> 00:11:47,780 It equals sigmas times u's. 187 00:11:47,780 --> 00:11:53,420 So this is what I'm after is u1 up to ur 188 00:11:53,420 --> 00:11:58,400 multiplied by sigma 1 along to sigma r. 
189 00:12:01,730 --> 00:12:05,540 What I'm doing now is just to say I'm 190 00:12:05,540 --> 00:12:10,360 converting these individual singular 191 00:12:10,360 --> 00:12:14,680 vectors, each v going into a u, to putting them 192 00:12:14,680 --> 00:12:16,300 all together into a matrix. 193 00:12:16,300 --> 00:12:21,670 And of course, what I've written here is AV equals U sigma, 194 00:12:21,670 --> 00:12:25,300 AV equals U sigma. 195 00:12:28,150 --> 00:12:32,550 That's what that amounts to. 196 00:12:32,550 --> 00:12:38,790 Well, then I'm going to put a V transpose on this side. 197 00:12:38,790 --> 00:12:44,520 And I'm going to get to A equals U sigma V transpose, 198 00:12:44,520 --> 00:12:48,380 multiplying both sides there by V transpose. 199 00:12:48,380 --> 00:12:51,750 I'm kind of writing the same thing in different forms, 200 00:12:51,750 --> 00:12:55,155 matrix form, vector at a time form. 201 00:12:57,690 --> 00:13:01,440 And now we have to find them. 202 00:13:01,440 --> 00:13:06,210 Now I've used up boards saying what we're after, 203 00:13:06,210 --> 00:13:08,430 but now we've got to get there. 204 00:13:08,430 --> 00:13:10,335 So what are the v's and what are the u's? 205 00:13:15,820 --> 00:13:20,700 Well, the cool idea is to think of A transpose A. 206 00:13:20,700 --> 00:13:23,470 So you're with me on what we're looking for. 207 00:13:23,470 --> 00:13:26,210 And now think about A transpose A. 208 00:13:26,210 --> 00:13:29,740 So if this is what I'm hoping for, 209 00:13:29,740 --> 00:13:33,700 what will A transpose A turn out to be? 210 00:13:36,940 --> 00:13:41,770 So big moment that's going to reveal what the v's are. 211 00:13:41,770 --> 00:13:46,000 So if I form A transpose A-- 212 00:13:46,000 --> 00:13:50,080 so A transpose-- so I got to transpose this guy. 213 00:13:50,080 --> 00:13:57,390 So A transpose is V sigma transpose U transpose, right? 214 00:13:57,390 --> 00:14:04,670 And then comes A, which is this, U sigma V transpose. 
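The step from the vector-at-a-time form to the matrix form described above can be written out as follows. This is a sketch rather than the board work itself; it assumes the v's have been completed to a full orthonormal basis so that V is a square orthogonal matrix, and the extra columns correspond to zero singular values.

```latex
\[
A \begin{bmatrix} v_1 & \cdots & v_r \end{bmatrix}
=
\begin{bmatrix} u_1 & \cdots & u_r \end{bmatrix}
\begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_r \end{bmatrix}
\qquad \text{(the equations } A v_i = \sigma_i u_i \text{ side by side)}
\]
\[
AV = U\Sigma
\;\Longrightarrow\;
A V V^{T} = U \Sigma V^{T}
\;\Longrightarrow\;
A = U \Sigma V^{T}
\qquad \text{since } V V^{T} = I .
\]
```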
215 00:14:04,670 --> 00:14:08,040 So why did I do that? 216 00:14:08,040 --> 00:14:12,810 Why is it that A transpose A is the cool thing to look at 217 00:14:12,810 --> 00:14:14,550 to make the problem simpler? 218 00:14:14,550 --> 00:14:20,370 Well, what becomes simpler in that line just written? 219 00:14:20,370 --> 00:14:25,960 U transpose U is the identity, because I'm looking 220 00:14:25,960 --> 00:14:29,830 for orthogonal, in fact orthonormal U's. 221 00:14:29,830 --> 00:14:31,240 So that's the identity. 222 00:14:31,240 --> 00:14:38,980 So this is V sigma transpose sigma V transpose. 223 00:14:38,980 --> 00:14:42,080 And I'll put parentheses around that because that's 224 00:14:42,080 --> 00:14:43,100 a diagonal matrix. 225 00:14:48,800 --> 00:14:50,180 What does that tell me? 226 00:14:50,180 --> 00:14:53,440 What does that tell all of us? 227 00:14:53,440 --> 00:14:55,840 A transpose A has this form. 228 00:14:55,840 --> 00:14:57,420 Now we've seen that form before. 229 00:14:57,420 --> 00:15:01,320 We know that this is a symmetric matrix, symmetric 230 00:15:01,320 --> 00:15:03,090 and even positive definite. 231 00:15:03,090 --> 00:15:06,160 So what are the v's? 232 00:15:06,160 --> 00:15:13,860 The v's are the eigenvectors of A transpose A. 233 00:15:13,860 --> 00:15:20,770 This is the Q lambda Q transpose for that symmetric matrix. 234 00:15:20,770 --> 00:15:24,360 So we know the v's are the eigenvectors, 235 00:15:24,360 --> 00:15:32,040 v is the eigenvectors of A transpose A. 236 00:15:32,040 --> 00:15:35,820 I guess we're also going to get the singular values. 237 00:15:35,820 --> 00:15:41,760 So the sigma transpose sigma, which will be the sigma squared 238 00:15:41,760 --> 00:15:51,800 are the eigenvalues of A transpose A. Good! 
239 00:15:55,010 --> 00:15:58,190 Sort of by looking for the correct thing, U sigma V 240 00:15:58,190 --> 00:16:01,790 transpose and then just using the U transpose U 241 00:16:01,790 --> 00:16:04,640 equal identity, we got it back to something 242 00:16:04,640 --> 00:16:07,800 we perfectly recognize. 243 00:16:07,800 --> 00:16:09,430 A transpose A has that form. 244 00:16:09,430 --> 00:16:11,400 So now we know what the V's are. 245 00:16:11,400 --> 00:16:17,940 And if I do it the other way, which, what's the other way? 246 00:16:17,940 --> 00:16:20,220 Instead of A transpose A, the other way 247 00:16:20,220 --> 00:16:23,760 is to look at A, A transpose. 248 00:16:23,760 --> 00:16:26,910 And if I write all that down, that a 249 00:16:26,910 --> 00:16:32,910 is the U sigma V transpose, and the A transpose is the V sigma 250 00:16:32,910 --> 00:16:35,230 transpose U transpose. 251 00:16:35,230 --> 00:16:39,030 And again, this stuff goes away and leaves me 252 00:16:39,030 --> 00:16:45,610 with U sigma, sigma transpose U transpose. 253 00:16:45,610 --> 00:16:47,990 So I know what the U's are too. 254 00:16:47,990 --> 00:16:53,990 They are eigenvectors of A, A transpose. 255 00:17:00,430 --> 00:17:03,670 Isn't that a beautiful symmetry? 256 00:17:03,670 --> 00:17:06,940 You just-- A transpose A and A, A transpose 257 00:17:06,940 --> 00:17:08,440 are two different guys now. 258 00:17:08,440 --> 00:17:13,089 So each has its own eigenvectors and we use both. 259 00:17:13,089 --> 00:17:15,680 It's just right. 260 00:17:15,680 --> 00:17:19,190 And I just have to take the final step, 261 00:17:19,190 --> 00:17:23,730 and we've established the SVD. 262 00:17:23,730 --> 00:17:26,550 So the final step is to remember what I'm going for here. 263 00:17:29,840 --> 00:17:33,030 A times a v is supposed to be a sigma times a u. 264 00:17:36,210 --> 00:17:37,960 See, what I have to deal with now 265 00:17:37,960 --> 00:17:41,870 is I haven't quite finished. 
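The two products worked out above can be placed side by side. The only facts used are that U transpose U and V transpose V are identity matrices and that sigma is diagonal.

```latex
\[
A^{T}A = (U\Sigma V^{T})^{T}(U\Sigma V^{T})
       = V\Sigma^{T}\,(U^{T}U)\,\Sigma V^{T}
       = V\,(\Sigma^{T}\Sigma)\,V^{T}
\]
\[
AA^{T} = (U\Sigma V^{T})(U\Sigma V^{T})^{T}
       = U\Sigma\,(V^{T}V)\,\Sigma^{T}U^{T}
       = U\,(\Sigma\Sigma^{T})\,U^{T}
\]
```

Both right-hand sides have the form Q lambda Q transpose, so the v's are eigenvectors of A transpose A, the u's are eigenvectors of A, A transpose, and the non-zero diagonal entries of both middle matrices are the sigma squareds.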
266 00:17:41,870 --> 00:17:44,020 It's just perfect as far as it goes, 267 00:17:44,020 --> 00:17:48,710 but it hasn't gone to the end yet because we 268 00:17:48,710 --> 00:17:51,650 could have double eigenvalues and triple eigenvalues, 269 00:17:51,650 --> 00:17:56,590 and all those horrible possibilities. 270 00:17:56,590 --> 00:18:01,400 And if I have triple eigenvalues or double eigenvalues, 271 00:18:01,400 --> 00:18:04,030 then what's the deal with eigenvectors 272 00:18:04,030 --> 00:18:05,750 if I have double eigenvalues? 273 00:18:05,750 --> 00:18:10,630 Suppose a matrix, a symmetric matrix, 274 00:18:10,630 --> 00:18:12,770 has a double eigenvalue. 275 00:18:12,770 --> 00:18:14,610 Let me just take an example. 276 00:18:14,610 --> 00:18:20,950 So a symmetric matrix like, say, diagonal 1, 1, 5. 277 00:18:20,950 --> 00:18:23,550 Why not? 278 00:18:23,550 --> 00:18:25,620 What's the deal with eigenvectors 279 00:18:25,620 --> 00:18:29,510 for that matrix 1, 1, 5? 280 00:18:29,510 --> 00:18:31,610 So 5 has got an eigenvector. 281 00:18:31,610 --> 00:18:35,720 You can see what it is, 0, 0, 1. 282 00:18:35,720 --> 00:18:39,410 What about eigenvectors that go with lambda equal 1 283 00:18:39,410 --> 00:18:42,440 for that matrix? 284 00:18:42,440 --> 00:18:43,230 What's up? 285 00:18:43,230 --> 00:18:48,530 What would be eigenvectors for lambda equal 1? 286 00:18:48,530 --> 00:18:52,040 Unfortunately, there is a whole plane of them. 287 00:18:52,040 --> 00:18:56,840 Any vector of the form x, y, 0. 288 00:18:56,840 --> 00:19:04,220 Any vector in the x, y plane would produce x, y, 0. 289 00:19:04,220 --> 00:19:06,440 So I have a whole plane of eigenvectors. 290 00:19:06,440 --> 00:19:11,420 And I've got to pick two that are orthogonal, which I can do. 291 00:19:11,420 --> 00:19:12,940 And then they have to be-- 292 00:19:12,940 --> 00:19:15,860 in the SVD those two orthogonal guys 293 00:19:15,860 --> 00:19:18,420 have to go to two orthogonal guys. 
294 00:19:18,420 --> 00:19:23,732 In other words, it's a little bit of detail here, 295 00:19:23,732 --> 00:19:28,710 a little getting into this exactly what is-- 296 00:19:28,710 --> 00:19:38,950 well, actually, let me tell you the steps. 297 00:19:38,950 --> 00:19:44,710 So I use this to conclude that the v's, the singular vectors, 298 00:19:44,710 --> 00:19:46,270 should be eigenvectors. 299 00:19:46,270 --> 00:19:49,240 I concluded those guys from this step. 300 00:19:49,240 --> 00:19:52,240 Now I'm not going to use this step so much. 301 00:19:52,240 --> 00:19:57,010 Of course, it's in the back of my mind but I'm not using it. 302 00:19:57,010 --> 00:20:00,820 I'm going to get the u's from here. 303 00:20:00,820 --> 00:20:17,190 So u1 is A v1 over sigma 1, up to ur is A vr over sigma r. 304 00:20:17,190 --> 00:20:20,330 You see what I'm doing here? 305 00:20:20,330 --> 00:20:28,540 I'm picking in a possible plane of things the one I want, 306 00:20:28,540 --> 00:20:29,490 the u's I want. 307 00:20:29,490 --> 00:20:31,510 So I've chosen the v's. 308 00:20:31,510 --> 00:20:33,640 I've chosen the sigmas. 309 00:20:33,640 --> 00:20:37,750 They were fixed for A transpose A. 310 00:20:37,750 --> 00:20:40,780 The eigenvectors are v's, the things-- 311 00:20:40,780 --> 00:20:44,770 the eigenvalues are sigma squared. 312 00:20:44,770 --> 00:20:49,460 And now then this is the u I want. 313 00:20:49,460 --> 00:20:50,980 Are you with me? 314 00:20:50,980 --> 00:20:56,830 So I want to get these u's correct. 315 00:20:56,830 --> 00:20:59,150 And if I have a whole plane of possibilities, 316 00:20:59,150 --> 00:21:01,890 I got to pick the right one. 317 00:21:01,890 --> 00:21:05,600 And now finally, I have to show that it's the right one. 318 00:21:05,600 --> 00:21:08,970 So what is left to show? 319 00:21:08,970 --> 00:21:14,100 I should show that these u's are eigenvectors of A, A transpose. 320 00:21:16,850 --> 00:21:21,050 And I should show that they're orthogonal. 
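The steps just listed (take the v's and sigma squareds from A transpose A, then set each u to A v over sigma) can be tried on a small matrix. What follows is a minimal sketch, not a recommended numerical method: the function name `svd_2col`, the helper `dot`, and the 2 by 2 example matrix are assumptions made for illustration, and the code assumes a matrix with two independent columns whose A transpose A has a non-zero off-diagonal entry.

```python
import math

def svd_2col(A):
    """Reduced SVD of an m-by-2 matrix with independent columns, via A^T A.
    Illustrative only: assumes the off-diagonal entry of A^T A is non-zero."""
    # Entries of the symmetric 2x2 matrix A^T A = [[a, b], [b, d]].
    a = sum(row[0] * row[0] for row in A)
    b = sum(row[0] * row[1] for row in A)
    d = sum(row[1] * row[1] for row in A)
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    U, sigmas, V = [], [], []
    for lam in (mean + radius, mean - radius):   # eigenvalues, largest first
        x, y = b, lam - a                        # eigenvector of A^T A for lam
        norm = math.hypot(x, y)
        v = (x / norm, y / norm)
        sigma = math.sqrt(lam)                   # singular value = sqrt(eigenvalue)
        Av = [row[0] * v[0] + row[1] * v[1] for row in A]
        U.append([c / sigma for c in Av])        # u_i = A v_i / sigma_i
        sigmas.append(sigma)
        V.append(v)
    return U, sigmas, V

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

A = [[3, 0], [4, 5]]                             # hypothetical example matrix
U, sigmas, V = svd_2col(A)

# Rebuild A as sigma_1 u_1 v_1^T + sigma_2 u_2 v_2^T.
rebuilt = [[sum(s * u[i] * v[j] for u, s, v in zip(U, sigmas, V)) for j in range(2)]
           for i in range(len(A))]
```

For this matrix, `dot(U[0], U[1])` comes out as zero up to rounding, which is the orthogonality of the u's that remains to be proved in general, and `rebuilt` reproduces `A`.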
321 00:21:21,050 --> 00:21:22,850 That's the key. 322 00:21:22,850 --> 00:21:27,320 I would like to show that these are orthogonal. 323 00:21:27,320 --> 00:21:30,420 And that's what goes in this picture. 324 00:21:30,420 --> 00:21:32,690 The v's-- I've got orthogonal, guys, 325 00:21:32,690 --> 00:21:36,410 because they're the eigenvectors of a symmetric matrix. 326 00:21:36,410 --> 00:21:37,850 Pick them orthogonal. 327 00:21:37,850 --> 00:21:40,490 But now I'm multiplying by A, so I'm 328 00:21:40,490 --> 00:21:46,370 getting the u which is Av over sigma for the basis vectors. 329 00:21:46,370 --> 00:21:48,290 And I have to show they're orthogonal. 330 00:21:48,290 --> 00:21:52,490 So this is like the final moment. 331 00:21:52,490 --> 00:21:54,590 Does everything come together right? 332 00:21:57,920 --> 00:22:02,480 If I've picked the v's as the eigenvectors of A transpose A, 333 00:22:02,480 --> 00:22:09,420 and then I take these for the u, are they orthogonal? 334 00:22:09,420 --> 00:22:12,900 So I would like to think that we can check that fact 335 00:22:12,900 --> 00:22:15,550 and that it will come out. 336 00:22:15,550 --> 00:22:18,570 Could you just help me through this one? 337 00:22:18,570 --> 00:22:23,716 I'll never ask for anything again, just get the SVD one. 338 00:22:31,560 --> 00:22:37,580 So I would like to show that u1-- 339 00:22:37,580 --> 00:22:40,250 so let me put up what I'm doing. 340 00:22:40,250 --> 00:22:47,610 I'm trying to show that u1 transpose u2 is 0. 341 00:22:47,610 --> 00:22:49,680 They're orthogonal. 342 00:22:49,680 --> 00:22:57,890 So u1 is A v1 over sigma 1. 343 00:22:57,890 --> 00:22:59,040 That's transpose. 344 00:22:59,040 --> 00:23:00,740 That's u1. 345 00:23:00,740 --> 00:23:06,000 And u2 is A v2 over sigma 2. 346 00:23:06,000 --> 00:23:08,280 And I want to get 0. 347 00:23:08,280 --> 00:23:13,560 The whole conversation is ending right here. 348 00:23:13,560 --> 00:23:16,866 Why is that thing 0? 
349 00:23:16,866 --> 00:23:19,370 The v's are orthogonal. 350 00:23:19,370 --> 00:23:21,850 We know the v's are orthogonal. 351 00:23:21,850 --> 00:23:25,070 They're orthogonal eigenvectors of A transpose A. 352 00:23:25,070 --> 00:23:26,580 Let me repeat that. 353 00:23:26,580 --> 00:23:36,970 The v's are orthogonal eigenvectors of A transpose A, 354 00:23:36,970 --> 00:23:40,150 which I know we can find. 355 00:23:40,150 --> 00:23:42,520 Then I chose the u's to be this. 356 00:23:42,520 --> 00:23:44,920 And I want to get the answer 0. 357 00:23:44,920 --> 00:23:48,470 Are you ready to do it? 358 00:23:48,470 --> 00:23:53,210 We want to compute that and get 0. 359 00:23:53,210 --> 00:23:56,580 So what do I get? 360 00:23:56,580 --> 00:23:57,680 We just have to do it. 361 00:23:57,680 --> 00:24:02,180 So I can see that the denominator is that. 362 00:24:02,180 --> 00:24:09,030 So it is v1 transpose A transpose times A v2. 363 00:24:14,870 --> 00:24:17,690 And I'm hoping to get 0. 364 00:24:17,690 --> 00:24:21,060 Do I get 0 here? 365 00:24:21,060 --> 00:24:23,240 You hope so. 366 00:24:23,240 --> 00:24:25,060 v1 is orthogonal to v2. 367 00:24:25,060 --> 00:24:28,900 But I've got A transpose A stuck in the middle there. 368 00:24:28,900 --> 00:24:33,210 So what happens here? 369 00:24:33,210 --> 00:24:34,230 How do I look at that? 370 00:24:37,530 --> 00:24:44,960 v2 is an eigenvector of A transpose A. Terrific! 371 00:24:47,630 --> 00:24:50,670 So this is v1 transpose. 372 00:24:50,670 --> 00:24:54,030 And this is the matrix times v2. 373 00:24:54,030 --> 00:25:00,420 So that's sigma 2 squared v2, isn't it? 374 00:25:00,420 --> 00:25:04,050 It's the eigenvector with eigenvalue sigma 375 00:25:04,050 --> 00:25:07,550 2 squared times v2. 376 00:25:07,550 --> 00:25:10,825 Yeah, divided by sigma 1 sigma 2. 377 00:25:16,410 --> 00:25:18,000 So the A's are out of there now. 378 00:25:21,310 --> 00:25:25,310 So I've just got these numbers, sigma 2 squared. 
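Written on one line, the orthogonality check being carried out here uses only the definition of u1 and u2 as A v over sigma, the fact that v2 is an eigenvector of A transpose A with eigenvalue sigma 2 squared, and the orthogonality of v1 and v2.

```latex
\[
u_1^{T}u_2
= \left(\frac{A v_1}{\sigma_1}\right)^{T}\left(\frac{A v_2}{\sigma_2}\right)
= \frac{v_1^{T}\,(A^{T}A\,v_2)}{\sigma_1\sigma_2}
= \frac{v_1^{T}\,(\sigma_2^{2}\,v_2)}{\sigma_1\sigma_2}
= \frac{\sigma_2}{\sigma_1}\,v_1^{T}v_2
= 0 .
\]
```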
379 00:25:25,310 --> 00:25:28,900 So that would be sigma 2 over sigma 1-- 380 00:25:28,900 --> 00:25:34,660 I've accounted for these numbers here-- times v1 transpose v2. 381 00:25:34,660 --> 00:25:40,810 And now what's up? 382 00:25:40,810 --> 00:25:43,030 They're orthonormal. 383 00:25:43,030 --> 00:25:44,200 We got it. 384 00:25:44,200 --> 00:25:45,610 That's 0. 385 00:25:45,610 --> 00:25:49,020 That is 0 there, yeah. 386 00:25:49,020 --> 00:25:53,710 So not only are the v's orthogonal to each other, 387 00:25:53,710 --> 00:25:57,490 but because they're eigenvectors of A transpose A, when 388 00:25:57,490 --> 00:26:00,460 I do this, I discover that the Av's 389 00:26:00,460 --> 00:26:06,460 are orthogonal to each other over in the column space. 390 00:26:06,460 --> 00:26:11,010 So orthogonal v's in the row space, orthogonal Av's 391 00:26:11,010 --> 00:26:13,110 over in column space. 392 00:26:13,110 --> 00:26:19,260 That was discovered late-- long after eigenvectors. 393 00:26:19,260 --> 00:26:22,980 And it's an interesting history. 394 00:26:22,980 --> 00:26:26,430 And it just comes out right. 395 00:26:26,430 --> 00:26:30,930 And then it was discovered, but not much used, for oh, 396 00:26:30,930 --> 00:26:32,670 100 years probably. 397 00:26:32,670 --> 00:26:38,880 And then people saw that it was exactly the right thing, 398 00:26:38,880 --> 00:26:41,370 and data matrices became important, which 399 00:26:41,370 --> 00:26:47,060 are large rectangular matrices. 400 00:26:47,060 --> 00:26:48,530 And we have not-- 401 00:26:48,530 --> 00:26:52,040 oh, I better say a word, just a word here about actually 402 00:26:52,040 --> 00:26:58,310 computing the v's and sigmas and the u's. 403 00:26:58,310 --> 00:27:02,030 So how would you actually find them? 404 00:27:02,030 --> 00:27:05,870 What I most want to say is you would not 405 00:27:05,870 --> 00:27:09,065 go this A transpose A route. 406 00:27:14,530 --> 00:27:15,750 Why is that? 
407 00:27:15,750 --> 00:27:18,300 Is that a big mistake? 408 00:27:18,300 --> 00:27:24,180 If you have a matrix A, say 5,000 by 10,000, 409 00:27:24,180 --> 00:27:27,480 why is it a mistake to actually use 410 00:27:27,480 --> 00:27:29,790 A transpose A in the computation? 411 00:27:29,790 --> 00:27:33,840 We used it heavily in the proof. 412 00:27:33,840 --> 00:27:37,530 And we could find another proof that wouldn't use it so much. 413 00:27:37,530 --> 00:27:45,310 But why would I not multiply these two together? 414 00:27:45,310 --> 00:27:48,320 It's very big, very expensive. 415 00:27:48,320 --> 00:27:55,450 It adds in a whole lot of roundoff-- 416 00:27:55,450 --> 00:27:57,780 you have a matrix that's now-- 417 00:27:57,780 --> 00:28:02,580 its vulnerability to roundoff errors is squared-- 418 00:28:02,580 --> 00:28:05,220 that's called its condition number-- gets squared. 419 00:28:05,220 --> 00:28:07,560 And you just don't go there. 420 00:28:07,560 --> 00:28:12,330 So the actual computational methods are quite different. 421 00:28:12,330 --> 00:28:14,770 And we'll talk about those. 422 00:28:14,770 --> 00:28:18,550 But the A transpose A, because it's 423 00:28:18,550 --> 00:28:24,100 symmetric positive definite, made the proof so nice. 424 00:28:24,100 --> 00:28:30,640 You've seen the nicest proof, I'd say, of the-- 425 00:28:30,640 --> 00:28:33,560 Now I should think about the geometry. 426 00:28:33,560 --> 00:28:38,150 So what does A equal U sigma-- 427 00:28:38,150 --> 00:28:45,570 Maybe I take another board, but it will fill it. 428 00:28:45,570 --> 00:28:49,110 But it's A equals U sigma V transpose. 429 00:28:52,120 --> 00:28:55,190 So it's got three factors there. 430 00:28:55,190 --> 00:28:57,500 And I would like-- then each factor 431 00:28:57,500 --> 00:28:59,840 is kind of a special matrix. 432 00:28:59,840 --> 00:29:01,980 U and V are orthogonal matrices. 433 00:29:01,980 --> 00:29:05,090 So I think of those as rotations. 
434 00:29:05,090 --> 00:29:07,370 Sigma is a diagonal matrix. 435 00:29:07,370 --> 00:29:09,200 I think of it as stretching. 436 00:29:09,200 --> 00:29:11,310 So now I'm just going to draw the picture. 437 00:29:11,310 --> 00:29:14,450 So here's unit vectors. 438 00:29:17,170 --> 00:29:22,420 And the first thing-- so if I multiply by x, 439 00:29:22,420 --> 00:29:24,370 this is the first thing that happens. 440 00:29:24,370 --> 00:29:26,530 So that rotates. 441 00:29:26,530 --> 00:29:28,900 So here's x's. 442 00:29:28,900 --> 00:29:31,720 Then V transpose x's. 443 00:29:31,720 --> 00:29:35,730 That's still a circle. Lengths don't change 444 00:29:35,730 --> 00:29:39,420 for those, when I multiply by an orthogonal matrix. 445 00:29:39,420 --> 00:29:43,670 But the vectors turned. 446 00:29:43,670 --> 00:29:46,250 It's a rotation. 447 00:29:46,250 --> 00:29:49,580 Could be a reflection, but let's keep it as a rotation. 448 00:29:49,580 --> 00:29:52,250 Now what does sigma do? 449 00:29:52,250 --> 00:29:55,450 So I have this unit circle. 450 00:29:55,450 --> 00:29:56,090 I'm in 2D. 451 00:29:59,460 --> 00:30:03,390 So I'm drawing a picture of the vectors. 452 00:30:03,390 --> 00:30:08,120 These are the unit vectors in 2D, x, y. 453 00:30:08,120 --> 00:30:12,620 They got turned by the orthogonal matrix. 454 00:30:12,620 --> 00:30:17,160 What does sigma do to that picture? 455 00:30:17,160 --> 00:30:20,220 It stretches, because sigma multiplies 456 00:30:20,220 --> 00:30:22,500 by sigma 1 in the first component, 457 00:30:22,500 --> 00:30:24,120 sigma 2 in the second. 458 00:30:24,120 --> 00:30:26,130 So it stretches these guys. 459 00:30:26,130 --> 00:30:30,000 And let's suppose this is number 1 and this is number 2, 460 00:30:30,000 --> 00:30:32,190 this is number 1 and number 2. 
461 00:30:32,190 --> 00:30:36,320 So sigma 1, our convention is sigma 1-- 462 00:30:36,320 --> 00:30:39,360 we always take sigma 1 greater or equal to sigma 2, 463 00:30:39,360 --> 00:30:43,720 greater or equal whatever, greater or equal sigma r. 464 00:30:43,720 --> 00:30:46,530 And they're all positive. 465 00:30:49,110 --> 00:30:51,540 And the rest are 0. 466 00:30:51,540 --> 00:30:53,890 So sigma 1 will be bigger than sigma 2. 467 00:30:53,890 --> 00:30:58,660 So I'm expecting a circle goes to an ellipse 468 00:30:58,660 --> 00:30:59,905 when you stretch-- 469 00:31:02,680 --> 00:31:07,480 I didn't get it quite perfect, but not bad. 470 00:31:07,480 --> 00:31:15,250 So this would be sigma 2 v2, sigma 1 v1, 471 00:31:15,250 --> 00:31:17,890 and this would be sigma 2 v2. 472 00:31:17,890 --> 00:31:19,105 And we now have an ellipse. 473 00:31:22,260 --> 00:31:25,190 So we started with x's in a circle. 474 00:31:25,190 --> 00:31:26,540 We rotated. 475 00:31:26,540 --> 00:31:27,560 We stretched. 476 00:31:27,560 --> 00:31:31,610 And now the final step is take these guys 477 00:31:31,610 --> 00:31:33,650 and multiply them by U. 478 00:31:33,650 --> 00:31:38,150 So this was the sigma V transpose x. 479 00:31:38,150 --> 00:31:42,050 And now I'm ready for the U part which comes last 480 00:31:42,050 --> 00:31:44,060 because it's at the left. 481 00:31:44,060 --> 00:31:45,200 And what happens? 482 00:31:45,200 --> 00:31:46,340 What's the picture now? 483 00:31:50,040 --> 00:31:52,710 What does U do to the ellipse? 484 00:31:52,710 --> 00:31:54,370 It rotates it. 485 00:31:54,370 --> 00:31:56,820 It's another orthogonal matrix. 486 00:31:56,820 --> 00:31:59,370 It rotates it somewhere, maybe there. 487 00:32:03,060 --> 00:32:11,330 And now we see the u's, u2 and u1. 488 00:32:20,950 --> 00:32:23,980 Well, let me think about that. 489 00:32:23,980 --> 00:32:26,460 Basically, that's right.
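[Supplementary sketch] The three-step picture (rotate by V transpose, stretch by sigma, rotate by U) can be traced numerically. This is a minimal sketch assuming NumPy; the 2 by 2 matrix is an arbitrary choice for illustration, and the variable names `circle`, `rotated`, `stretched`, and `final` are introduced here.

```python
import numpy as np

# Example 2 x 2 matrix (an assumption made for illustration).
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])
U, s, Vt = np.linalg.svd(A)          # A = U @ diag(s) @ Vt, with s[0] >= s[1]

# Unit vectors x on the circle, one per column.
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.vstack([np.cos(theta), np.sin(theta)])

rotated = Vt @ circle                # step 1: V^T turns the circle, lengths unchanged
stretched = np.diag(s) @ rotated     # step 2: Sigma stretches it into an ellipse
final = U @ stretched                # step 3: U turns the ellipse

lengths_rotated = np.linalg.norm(rotated, axis=0)
lengths_final = np.linalg.norm(final, axis=0)

print("lengths after V^T stay at 1:", np.allclose(lengths_rotated, 1.0))
print("longest and shortest output vectors:", lengths_final.max(), lengths_final.min())
print("singular values:", s)
```

The longest and shortest vectors on the final ellipse have lengths close to sigma 1 and sigma 2 (only approximately, since the circle is sampled at finitely many angles), and the semi-axes point along u1 and u2 because A v1 = sigma 1 u1 and A v2 = sigma 2 u2.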
490 00:32:29,150 --> 00:32:34,010 So this SVD is telling us something quite remarkable: 491 00:32:34,010 --> 00:32:36,800 that every linear transformation, 492 00:32:36,800 --> 00:32:40,040 every matrix multiplication factors 493 00:32:40,040 --> 00:32:46,020 into a rotation times a stretch times another rotation, 494 00:32:46,020 --> 00:32:50,350 possibly a different one. 495 00:32:50,350 --> 00:32:55,240 Actually, when would the U be the same as the V? 496 00:32:55,240 --> 00:32:56,430 Here's a good question. 497 00:32:56,430 --> 00:33:00,280 When is U the same as V? When are the two sets of singular 498 00:33:00,280 --> 00:33:02,030 vectors just the same? 499 00:33:02,030 --> 00:33:03,190 AUDIENCE: A square. 500 00:33:03,190 --> 00:33:06,220 PROFESSOR: Because A would have to be square. 501 00:33:06,220 --> 00:33:12,340 And we want this to be the same as Q lambda Q 502 00:33:12,340 --> 00:33:15,730 transpose if they're the same. 503 00:33:15,730 --> 00:33:18,430 So the U's would be the same as the V's 504 00:33:18,430 --> 00:33:22,180 when the matrix is symmetric. 505 00:33:22,180 --> 00:33:26,050 And actually we need it to be positive definite. 506 00:33:26,050 --> 00:33:27,830 Why is that? 507 00:33:27,830 --> 00:33:33,290 Because our convention is these guys are greater or equal to 0. 508 00:33:33,290 --> 00:33:36,080 It's going to be the same, then-- 509 00:33:36,080 --> 00:33:42,080 so for a positive definite symmetric matrix, 510 00:33:42,080 --> 00:33:47,740 the S that we started with is the same 511 00:33:47,740 --> 00:33:50,070 as the A on the next line. 512 00:33:50,070 --> 00:33:55,340 Yeah, the Q is the U, the Q transpose is the V transpose, 513 00:33:55,340 --> 00:33:57,780 the lambda is the sigma. 514 00:33:57,780 --> 00:33:59,520 So those are the good matrices. 515 00:33:59,520 --> 00:34:02,580 And they're the ones that you can't improve basically.
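[Supplementary sketch] The claim that U equals V exactly when the matrix is symmetric positive definite can be illustrated with two small matrices. The sketch assumes NumPy; both matrices are chosen here for illustration (the first has eigenvalues 3 and 1, the second has eigenvalues 3 and -1) and are not taken from the lecture.

```python
import numpy as np

# Symmetric positive definite: eigenvalues 3 and 1.
S_pd = np.array([[2.0, 1.0],
                 [1.0, 2.0]])
U, s, Vt = np.linalg.svd(S_pd)
lam = np.sort(np.linalg.eigvalsh(S_pd))[::-1]    # eigenvalues, largest first

print("sigma == lambda:", np.allclose(s, lam))   # singular values are the eigenvalues
print("U == V:", np.allclose(U, Vt.T))           # left and right singular vectors agree

# Symmetric but indefinite: eigenvalues 3 and -1.
S_indef = np.array([[1.0, 2.0],
                    [2.0, 1.0]])
U2, s2, Vt2 = np.linalg.svd(S_indef)

print("singular values:", s2)                    # 3 and 1: the absolute values
print("U == V:", np.allclose(U2, Vt2.T))         # False: u2 = -v2 absorbs the sign
```

In the indefinite case the singular value must stay nonnegative, so the minus sign of the eigenvalue -1 moves into the singular vectors and U no longer matches V.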
516 00:34:02,580 --> 00:34:06,300 They're so good you can't make a positive definite symmetric 517 00:34:06,300 --> 00:34:08,070 matrix better than it is. 518 00:34:08,070 --> 00:34:14,400 Well, maybe diagonalize it or something, but OK. 519 00:34:14,400 --> 00:34:18,239 Now I think of, like, one question here 520 00:34:18,239 --> 00:34:26,830 that helps me, anyway, to keep this figure straight: 521 00:34:26,830 --> 00:34:32,820 I want to count the parameters 522 00:34:32,820 --> 00:34:38,699 in this factorization. 523 00:34:38,699 --> 00:34:40,980 So I am 2 by 2. 524 00:34:40,980 --> 00:34:43,199 I'm 2 by 2. 525 00:34:43,199 --> 00:34:48,110 So A has four numbers, a, b, c, d. 526 00:34:52,929 --> 00:34:55,510 Then I guess I feel that four numbers should 527 00:34:55,510 --> 00:34:59,440 appear on the right hand side. 528 00:34:59,440 --> 00:35:03,190 Somehow the U and the sigma and the V transpose 529 00:35:03,190 --> 00:35:06,050 should use up a total of four numbers. 530 00:35:06,050 --> 00:35:10,780 So we have a counting match between the left side 531 00:35:10,780 --> 00:35:14,050 that's got four numbers a, b, c, d, and the right side 532 00:35:14,050 --> 00:35:18,360 that's got four numbers buried in there somewhere. 533 00:35:18,360 --> 00:35:21,690 So how can we dig them out? 534 00:35:21,690 --> 00:35:23,340 How many numbers in sigma? 535 00:35:23,340 --> 00:35:24,900 That's pretty clear. 536 00:35:28,310 --> 00:35:31,462 Two, sigma 1 and sigma 2. 537 00:35:31,462 --> 00:35:34,560 The two singular values. 538 00:35:34,560 --> 00:35:36,900 How many numbers in this rotation? 539 00:35:39,670 --> 00:35:42,340 So if I had a different color chalk, 540 00:35:42,340 --> 00:35:47,890 I would put 2 for the number of things accounted for by sigma. 541 00:35:47,890 --> 00:35:53,520 How many parameters does a two by two rotation require? 542 00:35:53,520 --> 00:35:54,020 One. 543 00:35:54,020 --> 00:35:56,570 And what's a good word for that one?
544 00:35:59,961 --> 00:36:02,740 Is that one parameter? 545 00:36:02,740 --> 00:36:05,380 It's like I have cos theta, sine theta, 546 00:36:05,380 --> 00:36:07,000 minus sine theta, cos theta. 547 00:36:07,000 --> 00:36:09,610 There's a number theta. 548 00:36:09,610 --> 00:36:12,230 It's the angle it rotates. 549 00:36:12,230 --> 00:36:18,380 So that's one guy to tell the rotation angle, two guys 550 00:36:18,380 --> 00:36:25,740 to tell the stretchings, and one more to tell the rotation 551 00:36:25,740 --> 00:36:29,070 from U, adding up to four. 552 00:36:29,070 --> 00:36:31,620 So those count-- that was a match up 553 00:36:31,620 --> 00:36:35,160 with the four numbers, a, b, c, d that we start with. 554 00:36:35,160 --> 00:36:39,450 Of course, it's a complicated relation between those four 555 00:36:39,450 --> 00:36:43,410 numbers and rotations and stretches, 556 00:36:43,410 --> 00:36:45,320 but it's four equals four anyway. 557 00:36:48,890 --> 00:36:51,720 And I guess if you did three by threes-- 558 00:36:51,720 --> 00:36:54,390 oh, three by threes. 559 00:36:54,390 --> 00:36:56,400 What would happen then? 560 00:36:56,400 --> 00:36:57,780 So let me take three. 561 00:36:57,780 --> 00:37:01,180 Do you want to care for three by threes? 562 00:37:01,180 --> 00:37:04,690 Just, it's sort of satisfying to get four equal four. 563 00:37:04,690 --> 00:37:07,590 But now what do we get for three by three? 564 00:37:10,590 --> 00:37:13,050 We got how many numbers here? 565 00:37:13,050 --> 00:37:13,550 Nine. 566 00:37:17,280 --> 00:37:18,915 So where are those nine numbers? 567 00:37:23,730 --> 00:37:24,420 How many here? 568 00:37:24,420 --> 00:37:26,850 That's usually the easy-- 569 00:37:26,850 --> 00:37:28,470 three. 570 00:37:28,470 --> 00:37:33,310 So what's your guess for how many in a rotation? 571 00:37:33,310 --> 00:37:38,490 In a 3D rotation, you take a sphere and you rotate it.
572 00:37:38,490 --> 00:37:42,570 How many numbers to tell you what's what-- 573 00:37:42,570 --> 00:37:44,410 to tell you what you did? 574 00:37:44,410 --> 00:37:44,910 Three. 575 00:37:44,910 --> 00:37:45,930 We hope three. 576 00:37:45,930 --> 00:37:51,900 Yeah, it's going to be three, three, and three for the three 577 00:37:51,900 --> 00:37:53,520 dimensional world that we live in. 578 00:37:53,520 --> 00:38:00,300 So people who do rotations for a living understand that rotation 579 00:38:00,300 --> 00:38:02,378 in 3D, but how do you see this? 580 00:38:02,378 --> 00:38:03,670 AUDIENCE: Roll, pitch, and yaw. 581 00:38:03,670 --> 00:38:04,390 PROFESSOR: Sorry? 582 00:38:04,390 --> 00:38:05,280 AUDIENCE: Roll, pitch and yaw. 583 00:38:05,280 --> 00:38:06,750 PROFESSOR: Roll, pitch, and yaw. 584 00:38:06,750 --> 00:38:07,740 That sounds good. 585 00:38:07,740 --> 00:38:11,040 I mean, it's three words and we've got it, right? 586 00:38:11,040 --> 00:38:11,970 OK, yeah. 587 00:38:11,970 --> 00:38:13,270 Roll, pitch and yaw. 588 00:38:13,270 --> 00:38:19,030 Yeah, I guess a pilot hopefully, knows about those three. 589 00:38:19,030 --> 00:38:21,310 Yeah, yeah, yeah. 590 00:38:21,310 --> 00:38:22,260 Which is roll? 591 00:38:22,260 --> 00:38:24,030 When you are like forward and back? 592 00:38:26,690 --> 00:38:29,280 Does anybody, anybody? 593 00:38:29,280 --> 00:38:30,840 Roll, pitch, and yaw? 594 00:38:33,780 --> 00:38:35,415 AUDIENCE: Pitch is the up and down one. 595 00:38:35,415 --> 00:38:37,140 PROFESSOR: Pitch is the up and down one. 596 00:38:37,140 --> 00:38:37,640 OK. 597 00:38:37,640 --> 00:38:41,550 AUDIENCE: Roll is like, think of a barrel roll. 598 00:38:41,550 --> 00:38:43,305 And yaw is your side-to-side motion. 599 00:38:43,305 --> 00:38:46,980 PROFESSOR: Oh, yaw, you stay in a plane and you-- 600 00:38:46,980 --> 00:38:48,270 OK, beautiful. 601 00:38:48,270 --> 00:38:50,040 Right, right.
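[Supplementary note] The counts for 2 by 2 and 3 by 3 follow one pattern. As a sketch, assuming a generic square n by n matrix and using the standard fact that a rotation in n dimensions is described by n(n-1)/2 angles, the parameters on the two sides of the factorization match as follows:

```latex
\[
\underbrace{n^{2}}_{\text{entries of } A}
\;=\;
\underbrace{\frac{n(n-1)}{2}}_{\text{angles in } U}
\;+\;
\underbrace{n}_{\text{stretches in } \Sigma}
\;+\;
\underbrace{\frac{n(n-1)}{2}}_{\text{angles in } V^{\mathsf T}}
\]
\[
n = 2:\quad 4 = 1 + 2 + 1,
\qquad
n = 3:\quad 9 = 3 + 3 + 3 .
\]
```

Roll, pitch, and yaw are one common way to name the three angles in the 3 by 3 case.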
602 00:38:50,040 --> 00:38:54,030 And that leads us to our four-- four dimensions. 603 00:38:54,030 --> 00:38:57,090 What's your guess on 4D? 604 00:38:57,090 --> 00:38:59,130 Well, we could do the count again. 605 00:38:59,130 --> 00:39:02,670 If it was 4 by 4, we would have 16 numbers there. 606 00:39:02,670 --> 00:39:06,060 And in the middle, we always have an easy time with that. 607 00:39:06,060 --> 00:39:07,890 That would be 4. 608 00:39:07,890 --> 00:39:10,620 So we've got 12 left to share out. 609 00:39:10,620 --> 00:39:13,110 So six somehow-- six-- 610 00:39:13,110 --> 00:39:17,720 six angles in four dimensions. 611 00:39:17,720 --> 00:39:19,220 Well, we'll leave it there. 612 00:39:19,220 --> 00:39:21,160 Yeah, yeah, yeah. 613 00:39:21,160 --> 00:39:23,210 OK. 614 00:39:23,210 --> 00:39:27,330 So there is the SVD but without an example. 615 00:39:27,330 --> 00:39:30,990 Examples, you know, I would have to compute A transpose A 616 00:39:30,990 --> 00:39:32,220 and find it. 617 00:39:32,220 --> 00:39:35,010 So the text will do that-- 618 00:39:35,010 --> 00:39:37,690 does it for a particular matrix. 619 00:39:37,690 --> 00:39:39,930 Oh! 620 00:39:39,930 --> 00:39:43,740 Yeah, the text does it for a matrix 3, 4, 0, 621 00:39:43,740 --> 00:39:48,260 5 that came out pretty well. 622 00:39:48,260 --> 00:39:50,810 A few facts we could learn though. 623 00:39:50,810 --> 00:39:54,890 So if I multiply all the eigenvalues together 624 00:39:54,890 --> 00:39:58,160 for a matrix A, what do I get? 625 00:39:58,160 --> 00:39:59,850 I get the determinant. 626 00:39:59,850 --> 00:40:04,430 What if I multiply the singular values together? 627 00:40:04,430 --> 00:40:06,170 Well again, I get the determinant. 628 00:40:06,170 --> 00:40:10,190 You can see it right away from the big formula. 629 00:40:10,190 --> 00:40:15,000 Take determinant-- take determinant. 630 00:40:15,000 --> 00:40:17,220 Well, assuming the matrix A is square. 
631 00:40:17,220 --> 00:40:18,990 So it's got a determinant. 632 00:40:18,990 --> 00:40:21,990 Then I take determinant of this product. 633 00:40:21,990 --> 00:40:24,660 I can take the separate determinants. 634 00:40:24,660 --> 00:40:29,890 That has determinant equal to one. 635 00:40:29,890 --> 00:40:39,120 An orthogonal matrix, the determinant is one. 636 00:40:39,120 --> 00:40:40,960 And similarly, here. 637 00:40:40,960 --> 00:40:45,710 So the product of the sigmas is also the determinant. 638 00:40:45,710 --> 00:40:46,240 Yeah. 639 00:40:46,240 --> 00:40:49,410 Yeah, so the product of the sigmas is also the determinant. 640 00:40:49,410 --> 00:40:52,130 The product of the sigmas here will be 15. 641 00:40:52,130 --> 00:40:59,960 But you'll find that sigma one is smaller than lambda 1. 642 00:40:59,960 --> 00:41:02,650 So here are the eigenvalues, lambda 1 643 00:41:02,650 --> 00:41:06,630 less or equal to lambda 2, say. 644 00:41:06,630 --> 00:41:12,380 But the singular values are outside them. 645 00:41:12,380 --> 00:41:13,040 Yeah. 646 00:41:13,040 --> 00:41:14,930 But they still multiply. 647 00:41:14,930 --> 00:41:19,730 Sigma 1 times sigma 2 will still be 15. 648 00:41:19,730 --> 00:41:22,770 And that's the same as lambda 1 times lambda 2. 649 00:41:22,770 --> 00:41:23,270 Yeah. 650 00:41:26,230 --> 00:41:31,930 But overall, computing the examples of the SVD 651 00:41:31,930 --> 00:41:35,510 take more time because-- 652 00:41:35,510 --> 00:41:40,270 well, yeah, you just compute A transpose A and you've got 653 00:41:40,270 --> 00:41:40,930 the v's. 654 00:41:40,930 --> 00:41:42,820 And you're on your way. 655 00:41:42,820 --> 00:41:47,560 And you have to take the square root of the eigenvalues. 656 00:41:47,560 --> 00:41:54,390 So that's the SVD as a piece of pure math. 657 00:41:54,390 --> 00:41:58,170 But of course, what we'll do next time starting right away 658 00:41:58,170 --> 00:42:01,010 is use SVD. 
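[Supplementary sketch] The facts about the example "3, 4, 0, 5" can be checked directly. The sketch assumes NumPy and assumes the matrix is laid out as rows (3, 0) and (4, 5); the transposed layout has the same eigenvalues and singular values. One caveat worth stating: an orthogonal matrix has determinant plus or minus one, so in general the product of the singular values equals the absolute value of the determinant. Here the determinant is positive, so both products are 15.

```python
import numpy as np

# Assumed layout of the lecture's example "3, 4, 0, 5".
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Eigenvalues of A (triangular, so they sit on the diagonal): 3 and 5.
eigvals = np.sort(np.linalg.eigvals(A).real)

# Singular values two ways: square roots of the eigenvalues of A^T A,
# and directly from the SVD routine.
sing_from_AtA = np.sqrt(np.sort(np.linalg.eigvalsh(A.T @ A))[::-1])
sing = np.linalg.svd(A, compute_uv=False)      # sqrt(45) and sqrt(5)

print("eigenvalues:        ", eigvals)
print("singular values:    ", sing)
print("product of lambdas: ", eigvals.prod())
print("product of sigmas:  ", sing.prod())
print("|det A|:            ", abs(np.linalg.det(A)))
```

The singular values, about 6.708 and 2.236, lie outside the eigenvalues 3 and 5, while both products equal 15. Taking square roots of the eigenvalues of A transpose A is fine for a hand-sized example like this one, but it is the route the lecture warns against for large computations.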
659 00:42:01,010 --> 00:42:04,540 And let me tell you even today, the most-- 660 00:42:04,540 --> 00:42:11,910 yeah, yeah, the most important pieces of the SVD. 661 00:42:11,910 --> 00:42:14,300 So what do I mean by pieces of the SVD? 662 00:42:14,300 --> 00:42:16,860 I've got one more blackboard still to write on. 663 00:42:16,860 --> 00:42:18,200 So here we go. 664 00:42:20,930 --> 00:42:30,380 So let me write out A is the u's times the sigmas-- 665 00:42:30,380 --> 00:42:35,300 sigmas 1 to r times the v's-- 666 00:42:35,300 --> 00:42:39,540 v transpose, v1 transpose down to vr transpose. 667 00:42:39,540 --> 00:42:43,240 So those are across. 668 00:42:43,240 --> 00:42:43,790 Yeah. 669 00:42:43,790 --> 00:42:48,940 Actually what I've written here-- 670 00:42:52,700 --> 00:42:57,130 so you could say there is a big economy. 671 00:42:57,130 --> 00:43:01,400 There is a smaller size SVD that has the real stuff that 672 00:43:01,400 --> 00:43:02,720 really counts. 673 00:43:02,720 --> 00:43:07,220 And then there's a larger SVD that has a whole lot of zeros. 674 00:43:07,220 --> 00:43:10,415 So this would be the smaller one, m by r. 675 00:43:13,250 --> 00:43:15,410 This would be r by r. 676 00:43:15,410 --> 00:43:17,570 And these would all be positive. 677 00:43:17,570 --> 00:43:19,430 And this would be r by n. 678 00:43:24,260 --> 00:43:29,720 So that's only using the r non-zeros. 679 00:43:29,720 --> 00:43:33,360 All these guys are greater than zero. 680 00:43:33,360 --> 00:43:37,590 Then the other one we could fill out 681 00:43:37,590 --> 00:43:43,882 to get a square orthogonal matrix, 682 00:43:43,882 --> 00:43:52,060 the sigmas, and square V's, v1 transpose to vn transpose. 683 00:43:52,060 --> 00:43:54,000 So what are the shapes now? 684 00:43:54,000 --> 00:43:57,030 This shape is m by m. 685 00:43:57,030 --> 00:43:59,370 It's a proper orthogonal matrix. 686 00:43:59,370 --> 00:44:01,680 This one also n by n.
687 00:44:01,680 --> 00:44:04,740 So this guy has to be-- this is the sigma now. 688 00:44:04,740 --> 00:44:07,280 So it has to be what size? 689 00:44:07,280 --> 00:44:08,220 m by n. 690 00:44:08,220 --> 00:44:10,800 That's the remaining space. 691 00:44:10,800 --> 00:44:17,460 So it starts with the sigmas, and then it's all zeros, 692 00:44:17,460 --> 00:44:19,140 accounting for null space stuff. 693 00:44:23,130 --> 00:44:23,630 Yeah. 694 00:44:23,630 --> 00:44:29,130 So you should really see that these two are possible. 695 00:44:29,130 --> 00:44:34,100 That all these zeros when you multiply out, 696 00:44:34,100 --> 00:44:37,010 just give nothing, so that really the only thing 697 00:44:37,010 --> 00:44:41,250 that's non-zero is in these bits. 698 00:44:41,250 --> 00:44:43,080 But there is a complete one. 699 00:44:43,080 --> 00:44:49,405 So what are these extra u's that are in the null space of A 700 00:44:49,405 --> 00:44:51,970 A transpose or A transpose A? 701 00:44:51,970 --> 00:44:57,290 Yeah, so two sizes, the large size and the small size. 702 00:44:57,290 --> 00:45:03,500 But then the things that count are all in there. 703 00:45:03,500 --> 00:45:05,820 OK. 704 00:45:05,820 --> 00:45:09,940 So I was going to do one more thing. 705 00:45:09,940 --> 00:45:13,140 Let me see what it was. 706 00:45:17,170 --> 00:45:21,130 So this is section 1.8 of the notes. 707 00:45:21,130 --> 00:45:25,350 And you'll see examples there. 708 00:45:25,350 --> 00:45:33,440 And you'll see a second approach to finding the u's 709 00:45:33,440 --> 00:45:36,700 and v's and sigmas. 710 00:45:36,700 --> 00:45:38,200 I can tell you what that is. 711 00:45:38,200 --> 00:45:47,380 But maybe just to do something nice at the end, 712 00:45:47,380 --> 00:45:57,380 let me tell you about another factorization of A that's 713 00:45:57,380 --> 00:46:09,260 famous in engineering, and it's famous in geometry. 714 00:46:09,260 --> 00:46:12,740 So this is A is U sigma V transpose.
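[Supplementary sketch] The two sizes of the SVD described on the board, the full one (m by m, m by n, n by n) and the small one (m by r, r by r, r by n), can be compared on a small rank-deficient matrix. The sketch assumes NumPy; the 4 by 3 matrix of rank 2 is made up for illustration. Note that NumPy's own `full_matrices=False` option trims to min(m, n) columns rather than to the rank r, so the trimming to r is done by hand here.

```python
import numpy as np

# A 4 x 3 matrix of rank 2: the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
m, n = A.shape

# Full SVD: U is m x m, Vt is n x n, and Sigma must be m x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-10 * s[0]))            # numerical rank (tolerance is a choice)

Sigma_full = np.zeros((m, n))
Sigma_full[:len(s), :len(s)] = np.diag(s)    # sigmas first, then zeros

# Small SVD: keep only the r pieces with sigma > 0.
U_r, Sigma_r, Vt_r = U[:, :r], np.diag(s[:r]), Vt[:r, :]

print("rank r =", r)
print("full shapes: ", U.shape, Sigma_full.shape, Vt.shape)
print("small shapes:", U_r.shape, Sigma_r.shape, Vt_r.shape)
print("both rebuild A:",
      np.allclose(U @ Sigma_full @ Vt, A),
      np.allclose(U_r @ Sigma_r @ Vt_r, A))

# The extra columns multiply zeros: extra u's lie in the null space of A^T,
# extra v's lie in the null space of A.
print("A^T (extra u's) = 0:", np.allclose(A.T @ U[:, r:], 0.0))
print("A (extra v's) = 0:  ", np.allclose(A @ Vt[r:, :].T, 0.0))
```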
715 00:46:12,740 --> 00:46:14,070 We've got that. 716 00:46:14,070 --> 00:46:17,120 Now the other one that I'm thinking of, 717 00:46:17,120 --> 00:46:18,450 I'll tell you its name. 718 00:46:18,450 --> 00:46:23,075 It's called the polar decomposition of a matrix. 719 00:46:26,900 --> 00:46:31,400 And all I want you to see is that it's virtually here. 720 00:46:31,400 --> 00:46:33,160 So polar means-- 721 00:46:33,160 --> 00:46:37,310 what's polar in-- for a complex number, 722 00:46:37,310 --> 00:46:40,990 what's the polar form of a complex number? 723 00:46:40,990 --> 00:46:42,280 AUDIENCE: e to the i theta. 724 00:46:42,280 --> 00:46:44,990 PROFESSOR: Yeah, it's e to the i theta times r. 725 00:46:44,990 --> 00:46:46,010 Yeah. 726 00:46:46,010 --> 00:46:49,100 A real guy-- so the real guy r will 727 00:46:49,100 --> 00:46:52,490 translate into a symmetric guy. 728 00:46:52,490 --> 00:46:55,190 And the e to the i theta will translate into-- 729 00:46:58,970 --> 00:47:02,510 what kind of a matrix reminds you of e to the i theta? 730 00:47:02,635 --> 00:47:03,510 AUDIENCE: Orthogonal. 731 00:47:03,510 --> 00:47:06,650 PROFESSOR: Orthogonal, size 1. 732 00:47:06,650 --> 00:47:08,022 So orthogonal. 733 00:47:10,680 --> 00:47:14,460 So that's very, very kind of nice. 734 00:47:14,460 --> 00:47:18,750 Every matrix factors into a symmetric matrix 735 00:47:18,750 --> 00:47:20,830 times an orthogonal matrix. 736 00:47:20,830 --> 00:47:26,790 And I, of course, describe these as the most important classes 737 00:47:26,790 --> 00:47:27,600 of matrices. 738 00:47:27,600 --> 00:47:33,380 And here, we're saying every matrix is an S times a Q. 739 00:47:33,380 --> 00:47:37,220 And I'm also saying that I can get that quickly out 740 00:47:37,220 --> 00:47:39,410 of the SVD. 741 00:47:39,410 --> 00:47:43,210 So I just want to do it. 742 00:47:43,210 --> 00:47:47,190 So I want to find an S and find a Q out of this.
743 00:47:47,190 --> 00:47:49,200 So to get an S-- 744 00:47:49,200 --> 00:47:50,550 So let me just start it. 745 00:47:50,550 --> 00:47:55,350 U sigma-- but now I'm looking for an S. 746 00:47:55,350 --> 00:47:59,530 So what shall I put in now? 747 00:47:59,530 --> 00:48:02,710 I better put in-- 748 00:48:02,710 --> 00:48:04,950 if I've got U sigma something, 749 00:48:04,950 --> 00:48:06,860 and I want it to be symmetric, I 750 00:48:06,860 --> 00:48:13,860 should put in-- U transpose would do it. 751 00:48:13,860 --> 00:48:15,600 But then if I put in U transpose, 752 00:48:15,600 --> 00:48:20,970 I've got to put in U. So now I've got U sigma. 753 00:48:20,970 --> 00:48:22,560 U transpose U is the identity. 754 00:48:22,560 --> 00:48:26,470 Then I've got to get V transpose. 755 00:48:26,470 --> 00:48:30,490 And have I got what the polar decomposition 756 00:48:30,490 --> 00:48:35,080 is asking for in this line? 757 00:48:35,080 --> 00:48:36,300 So, yeah. 758 00:48:36,300 --> 00:48:38,220 What have I got here? 759 00:48:38,220 --> 00:48:41,570 Where's the S in this? 760 00:48:41,570 --> 00:48:46,460 So you see, I took the SVD and I just put the identity in there, 761 00:48:46,460 --> 00:48:48,080 just shifted things a little. 762 00:48:48,080 --> 00:48:50,860 And now where's the S that I can read off? 763 00:48:55,360 --> 00:49:01,100 First three, that's an S. That's a symmetric matrix. 764 00:49:01,100 --> 00:49:02,420 And where's the Q? 765 00:49:02,420 --> 00:49:06,020 Well, I guess we can see where the Q has to be. 766 00:49:06,020 --> 00:49:11,060 It's here, yeah.
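[Supplementary sketch] The board argument inserts U transpose U into the SVD and regroups: A = (U sigma U transpose)(U V transpose) = S Q. A minimal check, assuming NumPy and a square example matrix chosen here for illustration:

```python
import numpy as np

# Square example matrix (an assumption made for illustration).
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])
U, s, Vt = np.linalg.svd(A)

# A = U Sigma V^T = (U Sigma U^T)(U V^T) = S Q
S = U @ np.diag(s) @ U.T     # symmetric, with eigenvalues sigma_i >= 0
Q = U @ Vt                   # product of two orthogonal matrices, so orthogonal

print("S symmetric:  ", np.allclose(S, S.T))
print("Q orthogonal: ", np.allclose(Q.T @ Q, np.eye(2)))
print("S @ Q == A:   ", np.allclose(S @ Q, A))
print("eigenvalues of S:", np.linalg.eigvalsh(S))   # these are the singular values of A
```

The symmetric factor S plays the role of r in the polar form of a complex number, and the orthogonal factor Q plays the role of e to the i theta.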
767 00:49:11,060 --> 00:49:14,000 Yeah, so just by sticking in U transpose U 768 00:49:14,000 --> 00:49:17,120 and putting the parentheses right, 769 00:49:17,120 --> 00:49:23,860 I recover that decomposition of a matrix, which, 770 00:49:23,860 --> 00:49:27,910 in mechanical engineering language, 771 00:49:27,910 --> 00:49:32,530 tells me that any strain-- 772 00:49:32,530 --> 00:49:36,460 which is like stretching of an elastic thing-- 773 00:49:36,460 --> 00:49:48,160 has a symmetric kind of a stretch and an internal twist. 774 00:49:48,160 --> 00:49:50,320 Yeah. 775 00:49:50,320 --> 00:49:51,970 So that's good. 776 00:49:51,970 --> 00:50:00,130 Well, this was 3, 6, 9 boards filled with matrices. 777 00:50:00,130 --> 00:50:02,840 Well, it is 18.065. 778 00:50:02,840 --> 00:50:04,630 So maybe that's all right. 779 00:50:04,630 --> 00:50:11,890 But the idea is to use them on a matrix of data. 780 00:50:11,890 --> 00:50:16,230 And I'll just tell you the key fact. 781 00:50:16,230 --> 00:50:27,240 The key fact-- if I have a big matrix of data, A, 782 00:50:27,240 --> 00:50:30,000 and if I want to pull out of that matrix 783 00:50:30,000 --> 00:50:33,790 the important part, so that's what 784 00:50:33,790 --> 00:50:36,280 data science has to be doing. 785 00:50:36,280 --> 00:50:42,340 Out of a big matrix, some part of it is noise, some part of it 786 00:50:42,340 --> 00:50:43,510 is signal. 787 00:50:43,510 --> 00:50:46,060 I'm looking for the most important part of the signal 788 00:50:46,060 --> 00:50:46,840 here. 789 00:50:46,840 --> 00:50:49,240 So I'm looking for the most important part of the matrix. 790 00:50:52,620 --> 00:50:57,420 In a way, the biggest numbers, but of course, 791 00:50:57,420 --> 00:51:00,810 I don't look at individual numbers. 792 00:51:00,810 --> 00:51:04,950 So what's the biggest part of the matrix? 793 00:51:04,950 --> 00:51:06,930 What are the principal components?
794 00:51:06,930 --> 00:51:08,700 Now we're really getting in-- 795 00:51:11,670 --> 00:51:13,140 it could be data. 796 00:51:13,140 --> 00:51:15,060 And we want to do statistics, or we 797 00:51:15,060 --> 00:51:18,120 want to see what has high variance, what 798 00:51:18,120 --> 00:51:24,240 has low variance, we'll do these connections with statistics. 799 00:51:24,240 --> 00:51:26,950 But what's the important part of the matrix? 800 00:51:26,950 --> 00:51:33,130 Well, let me look at U sigma V transpose. 801 00:51:33,130 --> 00:51:37,600 Here, yeah, let me look at it. 802 00:51:37,600 --> 00:51:43,900 So what's the one most important part of that matrix? 803 00:51:43,900 --> 00:51:44,950 The right one? 804 00:51:44,950 --> 00:51:46,430 It's a rank one piece. 805 00:51:46,430 --> 00:51:50,750 So when I say a part, of course it's going to be a matrix part. 806 00:51:50,750 --> 00:51:52,900 So the simple matrix building block 807 00:51:52,900 --> 00:51:57,850 is like a rank one matrix, a something, something transpose. 808 00:51:57,850 --> 00:52:00,550 And what should I pull out of that 809 00:52:00,550 --> 00:52:02,890 as being the most important rank one 810 00:52:02,890 --> 00:52:06,230 matrix that's in that product? 811 00:52:06,230 --> 00:52:09,890 So I'll erase the 1.8 while you think 812 00:52:09,890 --> 00:52:16,640 what do I do to pick out the big deal, the thing that the data 813 00:52:16,640 --> 00:52:19,200 is telling me first. 814 00:52:19,200 --> 00:52:24,360 Well, these are orthonormal. 815 00:52:24,360 --> 00:52:26,740 No one is bigger than another one. 816 00:52:26,740 --> 00:52:29,580 These are orthonormal, no one is bigger than another one. 817 00:52:29,580 --> 00:52:35,130 But here, I look here, which is the most important number? 818 00:52:35,130 --> 00:52:37,200 Sigma 1. 819 00:52:37,200 --> 00:52:37,880 Sigma 1. 
820 00:52:37,880 --> 00:52:41,930 So the part I pick out is this biggest number times 821 00:52:41,930 --> 00:52:45,080 its row times its column. 822 00:52:45,080 --> 00:52:55,220 So it's u1 sigma 1 v1 transpose is the top principal part 823 00:52:55,220 --> 00:52:57,320 of the matrix A. It's the leading 824 00:52:57,320 --> 00:53:01,700 part of the matrix A. It's the biggest rank one 825 00:53:01,700 --> 00:53:04,040 part of the matrix is there. 826 00:53:04,040 --> 00:53:07,580 So computing those three guys is the first step 827 00:53:07,580 --> 00:53:10,080 to understanding the data. 828 00:53:10,080 --> 00:53:10,580 Yeah. 829 00:53:10,580 --> 00:53:12,920 So that's what's coming next is-- 830 00:53:12,920 --> 00:53:17,230 and I guess tomorrow, since they moved-- 831 00:53:17,230 --> 00:53:24,410 MIT declared Tuesday to be Monday. 832 00:53:24,410 --> 00:53:25,860 They didn't change Wednesday. 833 00:53:25,860 --> 00:53:32,450 So I'll see you tomorrow for the principal components. 834 00:53:32,450 --> 00:53:34,000 Good.
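[Supplementary sketch] The leading piece u1 sigma 1 v1 transpose can be pulled out of any matrix in a few lines. The sketch assumes NumPy, and a small random 5 by 4 matrix stands in for the "big matrix of data"; the names `A1` and `pieces` are introduced here. It also checks a standard fact, the Eckart-Young theorem, which the lecture itself does not state in this passage: in the spectral norm, what is left after removing the leading piece has size exactly sigma 2.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))      # stand-in data matrix (an assumption)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Leading rank-one piece: biggest sigma times its column u1 times its row v1^T.
A1 = s[0] * np.outer(U[:, 0], Vt[0])

# All the rank-one pieces sigma_i u_i v_i^T add back up to A.
pieces = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s))]

print("rank of A1:", np.linalg.matrix_rank(A1))
print("pieces sum to A:", np.allclose(sum(pieces), A))
print("||A - A1||_2 =", np.linalg.norm(A - A1, 2), " sigma_2 =", s[1])
```

Because the u's and v's are unit vectors, the size of each piece is carried entirely by its sigma, which is why the piece with sigma 1 is the most important one.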