Description
Singular Value Decomposition (SVD) is the primary topic of this lecture. Professor Strang explains and illustrates how the SVD separates a matrix into rank one pieces, and that those pieces come in order of importance.
Summary
Columns of \(V\) are orthonormal eigenvectors of \(A^{\mathtt{T}}A\).
\(Av = \sigma u\) gives orthonormal eigenvectors \(u\) of \(AA^{\mathtt{T}}\).
\(\sigma^2 =\) eigenvalue of \(A^{\mathtt{T}}A\) = eigenvalue of \(AA^{\mathtt{T}} \neq 0\)
\(A\) = (rotation)(stretching)(rotation) \(= U\Sigma V^{\mathtt{T}}\) for every \(A\)
Related section in textbook: I.8
Instructor: Prof. Gilbert Strang
ANNOUNCER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
PROFESSOR: So this is a big day mathematically speaking, because we come to this key idea, which is a little bit like eigenvalues. Well, a lot like eigenvalues, but different because the matrix A now is more usually rectangular. So for a rectangular matrix, the whole idea of eigenvalues is shot because if I multiply A times a vector x in n dimensions, out will come something in m dimensions and it's not going to equal lambda x. So Ax equal lambda x is not even possible if A is rectangular.
And even if A is square, what are the problems, just thinking for a minute about eigenvalues? The case I wrote up here is the great case where I have a symmetric matrix and then it's got a full set of eigenvalues and eigenvectors and they're orthogonal, all good. But for a general square matrix, either the eigenvectors are complex-- eigenvalues are complex or the eigenvectors are not orthogonal. So we can't stay with eigenvalues forever. That's what I'm saying.
And this is the right thing to do. So what are these pieces? So these are the left and these are the right singular vectors. So the new word here is singular. And in between go the-- not the eigenvalues, but the singular values. So we've got the whole point now.
You've got to pick up on this. There are two sets of singular vectors, not one. For eigenvectors, we just had one set, the Q's. Now we have a rectangular matrix, we've got one set of left singular vectors in m dimensions, and we've got another set of right singular vectors in n dimensions. And the numbers in between are not eigenvalues, but singular values.
So these guys are-- let me write what that looks like. This is, again, a diagonal matrix, sigma 1 down to sigma r, let's say. So it's, again, a diagonal matrix in the middle. But the numbers on the diagonal are all positive or 0. And they're called singular values. So it's just a different world.
So really, the first step we have to do, the math step, is to show that any matrix can be factored into U times sigma times V transpose. So that's the parallel to the spectral theorem, that any symmetric matrix could be factored that way. So you're good for that part. We just have to do it, to see what are U and sigma and V. What are those vectors and those singular values? Let's go.
So the key is that A transpose A is a great matrix. So that's the key to the math is A transpose A. So what are the properties of A transpose A? A is rectangular again.
So maybe A is m by n, and A transpose is n by m. So we get a result that's n by n.
And what else can you tell me about A transpose A? It's symmetric. That's a big deal. And it's square. And well, yeah, you can tell me more now, because we talked about something, a topic that's a little more than symmetric, last time.
The matrix A transpose A will be positive semidefinite. Its eigenvalues are greater than or equal to 0. And that will mean that we can take their square roots. And that's what we will do.
So A transpose A will have a factorization. It's symmetric. It'll have, like, a Q lambda Q transpose, but I'm going to call it V lambda-- no, yeah, lambda-- I'll still call it lambda-- V transpose.
So these V's-- what do we know about these V's, the eigenvectors of this guy, a square, symmetric, positive semidefinite matrix? So we're in good shape. And what do we know about the eigenvalues of A transpose A? They are all positive.
So the eigenvalues are positive-- well, or equal to 0. And these guys, the v's, are orthogonal. And these guys, the eigenvalues, are greater than or equal to zero. So that's good.
That's one of our-- We'll depend a lot on that. But also, you've got to recognize that A, A transpose is a different guy, A, A transpose. So what's the shape of A, A transpose? How big is that?
Now I've got-- what do I have? m by n times n by m. So this will be what size? m by m.
Different shape, but with the same eigenvalues-- the same eigenvalues. So it's going to have some other eigenvectors, u-- of course, I'm going to call them u, because they're going to go in over there. The eigenvalues will be the same.
Well, the same-- yeah, let me-- I have to say, when I say the same, I can't quite literally mean the very same, because this has got n eigenvalues and this has m eigenvalues. But the missing guys, the ones that are in one of them and not in the other, depending on the sizes, are zeros. So really, the heart of the thing, the non-zero eigenvalues, are the same.
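A minimal numerical check of that claim, assuming NumPy; the 3 by 2 matrix here is just a made-up example.

```python
# A'A and AA' share their nonzero eigenvalues; the extra ones are zeros.
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])                 # m = 3, n = 2, rectangular

eig_AtA = np.linalg.eigvalsh(A.T @ A)    # n = 2 eigenvalues, all >= 0
eig_AAt = np.linalg.eigvalsh(A @ A.T)    # m = 3 eigenvalues, all >= 0

print(np.sort(eig_AtA)[::-1])            # the two nonzero eigenvalues
print(np.sort(eig_AAt)[::-1])            # the same two values, plus an extra 0
```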
Well actually, I've pretty much revealed what the SVD is going to use. It's going to use the U's from here and the V's from here. But that's the story. You've got to see that story.
So fresh start on the singular value decomposition. What are we looking for? Well, as a factorization-- so we're looking for-- we want A. We want vectors v, so that when I multiply by v-- so if it was an eigenvector, it would be Av equal lambda v.
But now for A, it's rectangular. It hasn't got eigenvectors. So Av is sigma, the new singular value, times u. That's the first guy, and the second guy, and the rth guy. I'll stop at r, the rank.
Oh, yeah. Is that what I want? A-- let me just see. Av is sigma u. Yeah, that's good.
So this is what takes the place of Ax equal lambda x. A times one set of singular vectors gives me a number times the other set of singular vectors. And why did I stop at r, the rank? Because after that, the sigmas are 0. So after that, I could have some more guys, but they'll be in the null space, with sigma 0, on down to v n.
So these are the important ones. So that's what I'm looking for. Let me say it now in words. I'm looking for a bunch of orthogonal vectors v so that when I multiply them by A I get a bunch of orthogonal vectors u.
That is not so clearly possible. But it is possible. It does happen. I'm looking for one set of orthogonal vectors v in the input space, you could say, so that the Av's in the output space are also orthogonal.
In our picture of the fundamental-- the big picture of linear algebra, we have v's in this space, and then stuff in the null space. And we have u's over here in the column space, and some stuff in the null space over there. And the idea is that I have orthogonal v's here. And when I multiply by A-- so multiply by A-- then I get orthogonal u's over here, orthogonal to orthogonal. That's what makes the v's and the u's special. Right?
That's the property. And then when we write down-- well, let me write down what that would mean. So I've just drawn a picture to go with this-- those equations. That picture just goes with these equations. And let me just write down what it means.
It means in matrix-- so I've written it. Oh yeah, I've written it here in vectors, one at a time. But of course, you know, I'm going to put those vectors into the columns of a matrix. So A times v1 up to, let's say, vr will equal-- oh yeah. It equals the sigmas times the u's.
So this is what I'm after: u1 up to ur, multiplied by sigma 1 along to sigma r. What I'm doing now is just to say I'm converting these individual singular vectors-- each v going into a u-- to putting them all together into a matrix. And of course, what I've written here is Av equals u sigma, Av equals u sigma. That's what that amounts to.
Well, then I'm going to put a v transpose on this side. And I'm going to get to A equals u sigma v transpose, multiplying both sides there by v transpose. I'm kind of writing the same thing in different forms, matrix form, vector at a time form.
And now we have to find them. Now I've used up boards saying what we're after, but now we've got to get there. So what are the v's and what are the u's? Well, the cool idea is to think of A transpose A. So you're with me on what we're looking for.
And now think about A transpose A. So if this is what I'm hoping for, what will A transpose A turn out to be? So big moment that's going to reveal what the v's are. So if I form A transpose A-- so A transpose-- so I got to transpose this guy. So A transpose is V sigma transpose U transpose, right?
And then comes A, which is this, U sigma V transpose. So why did I do that? Why is it that A transpose A is the cool thing to look at to make the problem simpler? Well, what becomes simpler in that line just written? U transpose U is the identity, because I'm looking for orthogonal, in fact orthonormal U's.
So that's the identity. So this is V sigma transpose sigma V transpose. And I'll put parentheses around that because that's a diagonal matrix.
What does that tell me? What does that tell all of us? A transpose A has this form. Now we've seen that form before. We know that this is a symmetric matrix, symmetric and even positive definite.
So what are the v's? The v's are the eigenvectors of A transpose A. This is the Q lambda Q transpose for that symmetric matrix. So we know the v's are the eigenvectors, v is the eigenvectors of A transpose A. I guess we're also going to get the singular values.
So the sigma transpose sigma, which will be the sigma squared are the eigenvalues of A transpose A. Good! Sort of by looking for the correct thing, U sigma V transpose and then just using the U transpose U equal identity, we got it back to something we perfectly recognize. A transpose A has that form. So now we know what the V's are.
And if I do it the other way-- which, what's the other way? Instead of A transpose A, the other way is to look at A A transpose. And if I write all that down, that A is the U sigma V transpose, and the A transpose is the V sigma transpose U transpose.
And again, this stuff goes away and leaves me with U sigma, sigma transpose U transpose. So I know what the U's are too. They are eigenvectors of A, A transpose.
Isn't that a beautiful symmetry? You just-- A transpose A and A, A transpose are two different guys now. So each has its own eigenvectors and we use both. It's just right. And I just have to take the final step, and we've established the SVD.
So the final step is to remember what I'm going for here. A times a v is supposed to be a sigma times a u. See, what I have to deal with now is I haven't quite finished. It's just perfect as far as it goes, but it hasn't gone to the end yet because we could have double eigenvalues and triple eigenvalues, and all those horrible possibilities.
And if I have triple eigenvalues or double eigenvalues, then what's the deal with eigenvectors if I have double eigenvalues? Suppose a matrix-- a symmetric matrix-- has a double eigenvalue. Let me just take an example.
So a symmetric matrix like, say, the diagonal matrix with 1, 1, 5-- make it that. Why not? What's the deal with eigenvectors for that matrix 1, 1, 5? So 5 has got an eigenvector. You can see what it is: 0, 0, 1.
What about eigenvectors that go with lambda equal 1 for that matrix? What's up? What would be eigenvectors for lambda equal 1? Unfortunately, there's a whole plane of them. Any vector of the form x, y, 0.
Any vector in the x, y plane would produce x, y, 0. So I have a whole plane of eigenvectors. And I've got to pick two that are orthogonal, which I can do. And then they have to be-- in the SVD those two orthogonal guys have to go to two orthogonal guys.
In other words, it's a little bit of detail here, a little getting into exactly what is-- well, actually, let me tell you the steps. So I use this to conclude that the v's, the singular vectors, should be eigenvectors of A transpose A. I concluded those guys from this step. Now I'm not going to use this step so much. Of course, it's in the back of my mind, but I'm not using it.
I'm going to get the u's from here. So u1 is Av1 over sigma 1, up to ur is Avr over sigma r. You see what I'm doing here? I'm picking, in a possible plane of things, the ones I want, the u's I want.
So I've chosen the v's. I've chosen the sigmas. They were fixed for A transpose A. The eigenvectors are v's, the things-- the eigenvalues are sigma squared. And now then this is the u I want.
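A minimal NumPy sketch of this construction, on a hypothetical 2 by 2 example (a real SVD routine would not go through A transpose A, as the lecture notes a little later).

```python
# Build the SVD the way the proof does: v's from A'A, sigmas, then u_i = A v_i / sigma_i.
import numpy as np

A = np.array([[3., 4.],
              [0., 5.]])                  # hypothetical full-rank example

lam, V = np.linalg.eigh(A.T @ A)          # eigenvalues (ascending) and orthonormal v's
order = np.argsort(lam)[::-1]             # reorder so sigma_1 >= sigma_2
lam, V = lam[order], V[:, order]

sigma = np.sqrt(lam)                      # singular values (all > 0 here, rank 2)
U = (A @ V) / sigma                       # u_i = A v_i / sigma_i, column by column

print(np.allclose(U.T @ U, np.eye(2)))            # the u's come out orthonormal
print(np.allclose(A, U @ np.diag(sigma) @ V.T))   # A = U Sigma V^T
```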
Are you with me? So I want to get these u's correct. And if I have a whole plane of possibilities, I got to pick the right one. And now finally, I have to show that it's the right one. So what is left to show?
I should show that these u's are eigenvectors of A, A transpose. And I should show that they're orthogonal. That's the key. I would like to show that these are orthogonal. And that's what goes in this picture.
The v's-- I've got orthogonal guys, because they're the eigenvectors of a symmetric matrix. Pick them orthogonal. But now I'm multiplying by A, so I'm getting the u, which is Av over sigma, for the basis vectors. And I have to show they're orthogonal.
So this is like the final moment. Does everything come together right? If I've picked the v's as the eigenvectors of A transpose A, and then I take these for the u, are they orthogonal?
So I would like to think that we can check that fact and that it will come out. Could you just help me through this one? I'll never ask for anything again, just get the SVD one. So I would like to show that u1-- so let me put up what I'm doing. I'm trying to show that u1 transpose u2 is 0. They're orthogonal. So u1 is A v1 over sigma 1. That's transpose. That's u1.
And u2 is A v2 over sigma 2. And I want to get 0. The whole conversation is ending right here. Why is that thing 0?
The v's are orthogonal. We know the v's are orthogonal. They're orthogonal eigenvectors of A transpose A. Let me repeat that. The v's are orthogonal eigenvectors of A transpose A, which I know we can find them.
Then I chose the u's to be this. And I want to get the answer 0. Are you ready to do it? We want to compute that and get 0. So what do I get?
We just have to do it. So I can see that the denominator is sigma 1 sigma 2. So it's v1 transpose A transpose times A v2, over that. And I'm hoping to get 0. Do I get 0 here?
You hope so. v1 is orthogonal to v2. But I've got A transpose A stuck in the middle there. So what happens here? How do I look at that?
v2 is an eigenvector of A transpose A. Terrific! So this is v1 transpose. And this is the matrix times v2.
So that's sigma 2 squared times v2, isn't it? v2 is the eigenvector with eigenvalue sigma 2 squared. Yeah, divided by sigma 1 sigma 2. So the A's are out of there now.
So I've just got these numbers, sigma 2 squared. So that would be sigma 2 over sigma 1-- I've accounted for these numbers here-- times v1 transpose v2. And now what's up? They're orthonormal. We got it.
That's 0. That is 0 there, yeah. So not only are the v's orthogonal to each other, but because they're eigenvectors of A transpose A, when I do this, I discover that the Av's are orthogonal to each other over in the column space. So orthogonal v's in the row space, orthogonal Av's over in the column space. That was discovered late-- long after eigenvectors.
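Written out in one line, that board computation is:

$$ \boldsymbol{u}_1^{\mathtt{T}}\boldsymbol{u}_2 \;=\; \frac{(A\boldsymbol{v}_1)^{\mathtt{T}}(A\boldsymbol{v}_2)}{\sigma_1\sigma_2} \;=\; \frac{\boldsymbol{v}_1^{\mathtt{T}}A^{\mathtt{T}}A\boldsymbol{v}_2}{\sigma_1\sigma_2} \;=\; \frac{\sigma_2^2\,\boldsymbol{v}_1^{\mathtt{T}}\boldsymbol{v}_2}{\sigma_1\sigma_2} \;=\; \frac{\sigma_2}{\sigma_1}\,\boldsymbol{v}_1^{\mathtt{T}}\boldsymbol{v}_2 \;=\; 0 $$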
And it's an interesting history. And it just comes out right. And then it was discovered, but not much used, for oh, 100 years probably. And then people saw that it was exactly the right thing, and data matrices became important, which are large rectangular matrices. And we have not-- oh, I better say a word, just a word, here about actually computing the v's and sigmas and the u's.
So how would you actually find them? What I most want to say is you would not go this A transpose A route. Why is that? Why is that a big mistake?
If you have a matrix A, say 5,000 by 10,000, why is it a mistake to actually use A transpose A in the computation? We used it heavily in the proof. And we could find another proof that wouldn't use it so much. But why would I not multiply these two together? It's very big, very expensive.
It adds in a whole lot of round off-- you have a matrix whose vulnerability to round off errors-- that's called its condition number-- gets squared. And you just don't go there. So the actual computational methods are quite different. And we'll talk about those. But the A transpose A, because it's symmetric positive definite, made the proof so nice.
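A small NumPy sketch of that numerical point, on a made-up nearly rank-deficient matrix: forming A transpose A squares the condition number, while the library's SVD works on A itself.

```python
# Forming A'A squares the condition number; a dedicated SVD routine avoids that.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 100))
A[:, -1] = A[:, 0] + 1e-7 * rng.standard_normal(500)   # one nearly dependent column

print(np.linalg.cond(A))           # condition number of A (large)
print(np.linalg.cond(A.T @ A))     # roughly that number squared -- much worse

sigma = np.linalg.svd(A, compute_uv=False)   # the library SVD works on A directly
print(sigma[0], sigma[-1])                   # largest and smallest singular values
```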
You've seen the nicest proof, I'd say, of the SVD. Now I should think about the geometry. So what does A equal U sigma V transpose mean? Maybe I'll take another board, but it will fill it. But it's a good one: U sigma V transpose.
So it's got three factors there. And each factor is kind of a special matrix. U and V are orthogonal matrices. So I think of those as rotations.
Sigma is a diagonal matrix. I think of it as stretching. So now I'm just going to draw the picture. So here's the unit vectors. And the first thing-- so if I multiply A times x, this is the first thing that happens.
So that rotates. So here's the x's. Then the V transpose x's. That's still a circle-- lengths don't change when I multiply by an orthogonal matrix.
But the vectors turned. It's a rotation. Could be a reflection, but let's keep it as a rotation.
Now what does sigma do? So I have this unit circle. I'm in 2D. So I'm drawing a picture of the vectors. These are the unit vectors in 2D, x,y.
They got turned by the orthogonal matrix. What does sigma do to that picture? It stretches, because sigma multiplies by sigma 1 in the first component, sigma 2 in the second. So it stretches these guys.
And let's suppose this is number 1 and this is number 2. So sigma 1-- our convention is sigma 1-- we always take sigma 1 greater or equal to sigma 2, greater or equal, whatever, down to sigma r, the rank. And they're all positive. And the rest are 0. So sigma 1 will be bigger than sigma 2.
So I'm expecting a circle goes to an ellipse when you stretch-- I didn't get it quite perfect, but not bad. So this would be sigma 1 v1, and this would be sigma 2 v2. And we now have an ellipse.
So we started with x's in a circle. We rotated. We stretched. And now the final step is take these guys and multiply them by U. So this was the sigma V transpose x.
And now I'm ready for the u part which comes last because it's at the left. And what happens? What's the picture now? What does u do to the ellipse? It rotates it.
It's another orthogonal matrix. It rotates it somewhere, maybe there. And now we see the u's, u2 and u1. Well, let me think about that. Basically-- yeah, that's right.
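A small sketch of that picture, assuming NumPy and Matplotlib (the 2 by 2 matrix is a hypothetical example): the unit circle is rotated by V transpose, stretched by Sigma, and rotated again by U.

```python
# Rotate (V^T), stretch (Sigma), rotate (U): the unit circle becomes an ellipse.
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[3., 4.],
              [0., 5.]])
U, sigma, Vt = np.linalg.svd(A)

theta = np.linspace(0, 2 * np.pi, 200)
circle = np.vstack([np.cos(theta), np.sin(theta)])   # unit circle, 2 x 200 points

step1 = Vt @ circle                   # still a circle, just rotated
step2 = np.diag(sigma) @ step1        # axis-aligned ellipse, semi-axes sigma_1, sigma_2
step3 = U @ step2                     # rotated ellipse -- exactly A applied to the circle

assert np.allclose(step3, A @ circle)

plt.plot(circle[0], circle[1], label='unit circle')
plt.plot(step3[0], step3[1], label='A times circle')
plt.axis('equal'); plt.legend(); plt.show()
```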
So this SVD is telling us something quite remarkable: that every linear transformation, every matrix multiplication, factors into a rotation times a stretch times a rotation-- possibly a different rotation. Actually, when would the U be the same as the V? Here's a good question. When is U the same as V-- when are the two sets of singular vectors just the same?
AUDIENCE: A square.
PROFESSOR: Because A would have to be square. And we want this to be the same as Q lambda Q transpose if they're the same. So the U's would be the same as the V's when the matrix is symmetric. And actually we need it to be positive definite.
Why is that? Because our convention is these guys are greater than or equal to 0. It's going to be the same, then-- so for a positive definite symmetric matrix, the S that we started with is the same as the A on the next line. Yeah, the Q is the U, the Q transpose is the V transpose, the lambda is the sigma.
So those are the good matrices. And they're the ones that you can't improve basically. They're so good you can't make a positive definite symmetric matrix better than it is. Well, maybe diagonalize it or something, but OK.
Now I think of, like, one question here that helps me, anyway, to keep this figure straight: how I want to count the parameters in this factorization. So say A is 2 by 2. So A has four numbers, a, b, c, d. Then I guess I feel that four numbers should appear on the right hand side.
Somehow the U and the sigma and the V transpose should use up a total of four numbers. So we have a counting match between the left side that's got four numbers a, b, c, d, and the right side that's got four numbers buried in there somewhere. So how can we dig them out? How many numbers in sigma? That's pretty clear.
Two, sigma 1 and sigma 2. The two singular values. How many numbers in this rotation? So if I had a different color chalk, I would put 2 for the number of things accounted for by sigma. How many parameters does a two by two rotation require? One.
And what's a good word for that one? Is that one parameter? It's like I have cos theta, sine theta, minus sine theta, cos theta. There's a number theta. It's the angle it rotates.
So that's one guy to tell the rotation angle, two guys to tell the stretchings, and one more to tell the rotation from U, adding up to four. So those count-- that was a match up with the four numbers, a, b, c, d that we start with. Of course, it's a complicated relation between those four numbers and the rotations and stretches, but it's four equals four anyway.
And I guess if you did three by threes-- oh, three by threes. What would happen then? So let me take three. Do you want to care for three by threes? Just, it's sort of satisfying to get four equal four.
But now what do we get three by three? We got how many numbers here? Nine. So where are those nine numbers? How many here?
That's usually the easy one-- three. So what's your guess for how many in a rotation? In a 3D rotation, you take a sphere and you rotate it. How many numbers to tell you what's what-- to tell you what you did?
Three. We hope three. Yeah, it's going to be three, three, and three for the three dimensional world that we live in. So people who do rotations for a living understand that rotation in 3D, but how do you see this?
AUDIENCE: Roll, pitch, and yaw.
PROFESSOR: Sorry?
AUDIENCE: Roll, pitch and yaw.
PROFESSOR: Roll, pitch, and yaw. That sounds good. I mean, it's three words and we've got it, right? OK, yeah. Roll, pitch and yaw.
Yeah, I guess a pilot hopefully, knows about those three. Yeah, yeah, yeah. Which is roll? When you are like forward and back? Does anybody, anybody? Roll, pitch, and yaw?
AUDIENCE: Pitch is the up and down one.
PROFESSOR: Pitch is the up and down one. OK.
AUDIENCE: Roll is like, think of a barrel roll. And yaw is your side-to-side motion.
PROFESSOR: Oh, yaw, you stay in a plane and you-- OK, beautiful. Right, right. And that leads us to our four-- four dimensions. What's your guess on 4D? Well, we could do the count again.
If it was 4 by 4, we would have 16 numbers there. And in the middle, we always have an easy time with that. That would be 4.
So we've got 12 left to share out. So six somehow-- six-- six angles in four dimensions. Well, we'll leave it there. Yeah, yeah, yeah. OK.
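The count generalizes, since a rotation in n dimensions takes \(n(n-1)/2\) angles:

$$ n^2 \;=\; \underbrace{\tfrac{n(n-1)}{2}}_{U} \;+\; \underbrace{n}_{\Sigma} \;+\; \underbrace{\tfrac{n(n-1)}{2}}_{V} \hspace{12pt} \text{so } 4 = 1+2+1, \hspace{6pt} 9 = 3+3+3, \hspace{6pt} 16 = 6+4+6. $$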
So there is the SVD but without an example. Examples, you know, I would have to compute A transpose A and find it. So the text will do that-- does it for a particular matrix. Oh! Yeah, the text does it for a matrix 3, 4, 0, 5 that came out pretty well.
A few facts we could learn though. So if I multiply all the eigenvalues together for a matrix A, what do I get? I get the determinant. What if I multiply the singular values together? Well again, I get the determinant.
You can see it right away from the big formula. Take determinant-- take determinant. Well, assuming the matrix A is square. So it's got a determinant. Then I take determinant of this product.
I can take the separate determinants. That has determinant equal to one. An orthogonal matrix, the determinant is one. And similarly, here. So the product of the sigmas is also the determinant.
Yeah. Yeah, so the product of the sigmas is also the determinant. The product of the sigmas here will be 15. But you'll find that sigma 2 is smaller than lambda 1. So here are the eigenvalues, lambda 1 less or equal to lambda 2, say.
But the singular values are outside them. Yeah. But they still multiply. Sigma 1 times sigma 2 will still be 15. And that's the same as lambda 1 times lambda 2. Yeah.
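For the matrix in the text, that check works out like this (the eigenvalues of a triangular matrix are its diagonal entries, and the sigmas come from \(A^{\mathtt{T}}A\)):

$$ A=\left[\begin{matrix}3 & 4\\ 0 & 5\end{matrix}\right]: \hspace{12pt} \lambda_1\lambda_2 = 3\cdot 5 = 15, \hspace{12pt} \sigma_1\sigma_2 = \sqrt{45}\cdot\sqrt{5} = \sqrt{225} = 15 = \det A. $$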
But overall, computing examples of the SVD takes more time, because-- well, yeah, you just compute A transpose A and you've got the v's. And you're on your way. And you have to take the square roots of the eigenvalues. So that's the SVD as a piece of pure math. But of course, what we'll do next time, starting right away, is use the SVD.
And let me tell you even today, the most-- yeah, yeah most important pieces of the SVD. So what do I mean by pieces of the SVD? I've got one more blackboard still to write on. So here we go.
So let me write out: A is the u's, times the sigmas-- sigma 1 to sigma r-- times the v's-- v transpose, v1 transpose down to vr transpose. So those go across as rows. Yeah. Actually, what I've written here-- so you could say there are two sizes.
There is a smaller size SVD that has the real stuff that really counts. And then there's a larger SVD that has a whole lot of zeros. So this it would be the smaller one, m by r.
This would be r by r. And these would all be positive. And this would be r by n. So that's only using the r non-zeros. All these guys are greater than zero.
Then the other one: we could fill U out to get a square orthogonal matrix, fill out the sigmas, and fill out the v's-- v1 transpose to vn transpose. So what are the shapes now? This shape is m by m. It's a proper orthogonal matrix. This one is n by n.
So this guy has to be-- this is the sigma now. So it has to be what size? m by n. That's the remaining space. So it starts with the sigmas, and then it's all zeros, accounting for null space stuff.
Yeah. So you should really see that these two are possible. That all these zeros, when you multiply out, just give nothing, so that really the only thing that's non-zero is in these bits. But there is a complete one.
So what are these extra u's that are in the null space of A, A transpose or A transpose A? Yeah, so two sizes, the large size and the small size. But then the things that count are all in there. OK.
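A sketch of the two sizes with NumPy's full_matrices flag; the 5 by 3, rank 2 matrix is a hypothetical example.

```python
# Full SVD vs. the smaller "economy" SVD; only the r nonzero sigmas really matter.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))   # m=5, n=3, rank r=2

U, s, Vt = np.linalg.svd(A, full_matrices=True)      # U is 5x5, Vt is 3x3 (square, orthogonal)
Ue, se, Vte = np.linalg.svd(A, full_matrices=False)  # U is 5x3 (economy size)

r = np.sum(s > 1e-10)                                # numerical rank: 2 here
# the r nonzero pieces already rebuild A; the rest multiplies out to nothing
print(np.allclose(A, U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]))
```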
So I was going to do one more thing. Let me see what it was. So this is section I.8 of the notes. And you'll see examples there. And you'll see a second approach to finding the u's and v's and sigmas.
I can tell you what that is. But maybe, to just do something nice at the end, let me tell you about another factorization of A that's famous in engineering, and it's famous in geometry. So this is-- A is U sigma V transpose. We've got that.
Now the other one that I'm thinking of, I'll tell you its name. It's called the polar decomposition of a matrix. And all I want you to see is that it's virtually here. So a polar means-- what's polar in-- for a complex number, what's the polar form of a complex number?
AUDIENCE: e to the i theta.
PROFESSOR: Yeah, it's e to the i theta times r. Yeah. A real guy-- so the real guy r will translate into a symmetric guy. And the e to the i theta will translate into-- what kind of a matrix reminds you of e to the i theta?
AUDIENCE: Orthogonal.
PROFESSOR: Orthogonal, size 1. So orthogonal. So that's a very, very nice kind of fact: every matrix factors into a symmetric matrix times an orthogonal matrix. And I, of course, describe these as the most important classes of matrices.
And here, we're saying every matrix is an S times a Q. And I'm also saying that I can get that quickly out of the SVD. So I just want to do it. So I want to find an S and find a Q out of this. So to get an S--
So let me just start it. U sigma-- but now I'm looking for an S. So what shall I put in now? I better put in-- if I've got U sigma something, and I want it to be symmetric, U transpose would do it. But then if I put in a U transpose, I've got to put in a U. So now I've got U sigma U transpose U.
U transpose U is the identity. Then I've got to get V transpose. And have I got what the polar decomposition is asking for in this line? So, yeah. What have I got here?
Where's the S in this? So you see, I took the SVD and I just put the identity in there, just shifted things a little. And now, where's the S that I can read off? The first three, U sigma U transpose-- that's an S. That's a symmetric matrix.
And where's the Q? Well, I guess we can see where the Q has to be. It's here, the U V transpose, yeah. Yeah, so just by sticking in U transpose U and putting the parentheses right, I recover that decomposition of a matrix, which in mechanical engineering language tells me that any strain-- which is like the stretching of an elastic thing-- has a symmetric kind of a stretch and an internal twist.
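A minimal NumPy sketch of reading the polar decomposition off the SVD (the 2 by 2 matrix is a hypothetical example): S = U Sigma U transpose and Q = U V transpose.

```python
# Polar decomposition A = S Q, read directly off A = U Sigma V^T.
import numpy as np

A = np.array([[3., 4.],
              [0., 5.]])
U, s, Vt = np.linalg.svd(A)

S = U @ np.diag(s) @ U.T     # symmetric (positive semidefinite) stretch
Q = U @ Vt                   # orthogonal rotation

print(np.allclose(S, S.T))               # S is symmetric
print(np.allclose(Q @ Q.T, np.eye(2)))   # Q is orthogonal
print(np.allclose(A, S @ Q))             # A = S Q
```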
So that's good. Well, this was 3, 6, 9 boards filled with matrices. Well, it is 18.065. So maybe that's all right. But the idea is to use them on a matrix of data.
And I'll just tell you the key fact. The key fact-- if I have a big matrix of data, A, and if I want to pull out of that matrix the important part, so that's what data science has to be doing. Out of a big matrix, some part of it is noise, some part of it is signal. I'm looking for the most important part of the signal here. So I'm looking for the most important part of the matrix.
In a way, the biggest numbers, but of course, I don't look at individual numbers. So what's the biggest part of the matrix? What are the principal components? Now we're really getting in-- it could be data. And we want to do statistics, or we want to see what has high variance, what has low variance, we'll do these connections with statistics.
But what's the important part of the matrix? Well, let me look at U sigma V transpose. Here, yeah, let me look at it. So what's the one most important part of that matrix? The right one?
It's a rank one piece. So when I say a part, of course it's going to be a matrix part. So the simple matrix building block is like a rank one matrix, a something, something transpose. And what should I pull out of that as being the most important rank one matrix that's in that product? So I'll erase the 1.8 while you think what do I do to pick out the big deal, the thing that the data is telling me first.
Well, these are orthonormal. No one is bigger than another one. These are orthonormal, no one is bigger than another one. But here, I look here, which is the most important number? Sigma 1.
Sigma 1. So the part I pick out is this biggest number times its column times its row. So it's u1 sigma 1 v1 transpose-- that is the top principal part of the matrix A. It's the leading part of the matrix A. The biggest rank one part of the matrix is there.
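A minimal NumPy sketch of pulling out that leading piece (hypothetical example matrix): the SVD writes A as a sum of rank one pieces sigma_k u_k v_k transpose, and sigma_1 u_1 v_1 transpose is the biggest one.

```python
# The SVD as a sum of rank-one pieces; the sigma_1 piece is the leading one.
import numpy as np

A = np.array([[3., 4.],
              [0., 5.]])
U, s, Vt = np.linalg.svd(A)

A1 = s[0] * np.outer(U[:, 0], Vt[0])     # sigma_1 u_1 v_1^T, the top rank-one piece
print(A1)

# adding up all the pieces gives A back exactly
A_sum = sum(s[k] * np.outer(U[:, k], Vt[k]) for k in range(len(s)))
print(np.allclose(A, A_sum))
```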
So computing those three guys is the first step to understanding the data. Yeah. So that's what's coming next is-- and I guess tomorrow, since they moved-- MIT declared Tuesday to be Monday.
They didn't change Wednesday. So I'll see you tomorrow for the principal components. Good.
Problems for Lecture 6
From textbook Section I.8
1. A symmetric matrix \(S=S^{\mathtt{T}}\) has orthonormal eigenvectors \(\boldsymbol{v}_1\) to \(\boldsymbol{v}_n\). Then any vector \(\boldsymbol{x}\) can be written as a combination \(\boldsymbol{x} = c_1\boldsymbol{v}_1 + · · · +c_n\boldsymbol{v}_n\). Explain these two formulas:
$$ \boldsymbol{x}^{\mathtt{T}}\boldsymbol{x} = {c_1}^2+ · · · +{c_n}^2 \hspace{12pt} \boldsymbol{x}^{\mathtt{T}} S\boldsymbol{x} = \lambda_1{c_1}^2+ · · · +\lambda_n{c_n}^2 $$
6. Find the σ’s and \(\boldsymbol{v}\)’s and \(\boldsymbol{u}\)’s in the SVD for \(A =\left[\begin{matrix}3 & 4\\ 0 & 5\end{matrix}\right]\). Use equation (12).