MICHALE FEE: OK, let's go ahead and get started. All right, so today we're going to continue talking about feed-forward neural networks, and we're going to keep working on some interesting aspects of linear algebra: matrix transformations. We're going to introduce a new idea from linear algebra, the idea of basis sets. We're going to describe some interesting and important properties of basis sets, such as linear independence. And then we're going to end with just a very simple formulation of how to change between different basis sets.

So let me explain and motivate a little bit more why we're doing these things. As people, as animals, looking out at the world, we are looking at high-dimensional data. We have hundreds of millions of photoreceptors in our retina. Those data get compressed down into about a million nerve fibers that go through our optic nerve up to our brain. So it's a very high-dimensional data set. And then our brain unpacks that data and tries to make sense of it. And it does that by passing that data through layers of neural circuits that make transformations.
And we've talked about how, in going from one layer of neurons to another layer of neurons, there's a feed-forward projection that essentially does what looks like a matrix multiplication, OK? So that's one of the reasons why we're trying to understand what matrix multiplications do.

Now, we talked about some of the matrix transformations that you can see when you do a matrix multiplication. And one of those was a rotation. Matrix multiplications can implement rotations. And rotations are very important for visualizing high-dimensional data. So this is from a website at Google research, where they've implemented different viewers for high-dimensional data: ways of taking high-dimensional data, reducing the dimensionality, and then visualizing what that data looks like. And one of the most important ways that you visualize high-dimensional data is by rotating it and looking at it from different angles. And what you're doing when you do that is you take this high-dimensional data, you rotate it, and you project it into a plane, which is what you're seeing on the screen.
And you can see that you get a lot out of looking at different projections and different rotations of data sets. Also, when you're zooming in on the data, that's another matrix transformation. You can stretch and compress and do all sorts of different things to data.

Now, one of the cool things is that when we study the brain, to try to figure out how it does this really cool process of rotating data through the transformations produced by neural networks, we record from lots of neurons. There's technology now where you can image from thousands, or even tens of thousands, of neurons simultaneously. And again, it's this really high-dimensional data set that we're looking at to try to figure out how the brain works. And so in order to analyze those data, we try to build programs or machines that act like the brain, in order to understand the data that we collect from the brain. It's really cool. So it's kind of fun. As neuroscientists, we're trying to build a brain to analyze the data that we collect from the brain.
All right, so the cool thing is that the math that we're looking at right now, and the kinds of neural networks that we're looking at right now, are exactly the kinds of math and neural networks that you use to explain the brain and to look at data in very powerful ways, all right? So that's what we're trying to do.

So let's start by coming back to our two-layer feed-forward network and looking in a little bit more detail at what it does. OK, so I introduced this two-layer feed-forward network. We have an input layer that has a vector of firing rates, a firing rate that describes each of those input neurons. And the output layer has its own vector of firing rates; that, again, is a list of numbers that describes the firing rate of each neuron in the output layer. And the connections between these two layers are a bunch of synapses, synaptic weights, that we can use to transform the firing rates at the input layer into the firing rates at the output layer. So let's look in a little bit more detail now at what that collection of weights looks like. So we describe it as a matrix.
That's called the weight matrix. The matrix has in it a number for the weight from each of the input neurons to each of the output neurons. The rows are vectors of weights onto each of the output neurons. And we'll see in a couple of slides that the columns are the sets of weights from each input neuron to all the output neurons. A row of this weight matrix is a vector of weights onto one of the output neurons.

All right, so we can compute the firing rates of the neurons in our output layer, for the case of linear neurons in the output layer, simply as a matrix product of this weight matrix times the vector of input firing rates. And that matrix multiplication gives us a vector that describes the firing rates of the output layer. So let me just go through what that looks like. If we define a column vector of firing rates of each of the output neurons, we can write that as the weight matrix times the column vector of the firing rates of the input layer.
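That computation can be sketched in a few lines of NumPy. The weights and firing rates below are made-up values, just for illustration, not numbers from the lecture:

```python
import numpy as np

# Hypothetical weight matrix: W[i, j] is the synaptic weight
# from input neuron j onto output neuron i.
W = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

u = np.array([2.0, 1.0, 3.0])  # input-layer firing rates

v = W @ u  # output-layer firing rates, for linear output neurons
print(v)   # [5. 3. 4.]
```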
We can calculate the firing rate of the first neuron in the output layer as the dot product of that row of the weight matrix with the vector of input firing rates, OK? And that gives us the firing rate: v1 is then the row of W for a equals 1, dotted with u. That is one particular way of thinking about how you're calculating the firing rates in the output layer. And it's called the dot product interpretation of matrix multiplication, all right?

Now, there's a different, sort of complementary, way of thinking about what happens when you do this matrix product that's also important to understand, because it's a different way of thinking about what's going on. We can also think about the columns of this weight matrix. We can think about the weight matrix as a collection of column vectors that we put together into matrix form. So in this particular network here, we can write down this weight matrix, all right? And you can see that this first input neuron connects to output neuron one, so there's a one there. The first input neuron connects to output neuron two, so there's a one there.
The first input neuron does not connect to output neuron three, so there's a zero there, OK? All right. So the columns of the weight matrix represent the pattern of projections from one of the input neurons to all of the output neurons.

All right, so let's just take a look at what would happen if only one of our input neurons were active and all the others were silent. So this neuron is active. What would the output vector look like? What would the pattern of firing rates look like for the output neurons in this case? Anybody? It's straightforward. It's not a trick question. [INAUDIBLE]?

AUDIENCE: So--

MICHALE FEE: If this neuron is firing and these weights are all one or zero.

AUDIENCE: The one neuron, a--

MICHALE FEE: Yes? This--

AUDIENCE: Yeah, [INAUDIBLE].

MICHALE FEE: --would fire, this would fire, and that would not fire, right? Good. So you can write that out as a matrix multiplication.
So the firing rate vector, in this case, would be the dot product of this with this, this with this, and that with that. And what you would see is that the output firing rate vector would look like this first column of the weight matrix. So the output vector would look like 1, 1, 0 if only the first neuron were active.

So you can think of the output firing rate vector as being a contribution from neuron one (and that contribution from neuron one is simply the first column of the weight matrix), plus a contribution from neuron two, which is given by the second column of the weight matrix, and a contribution from input neuron three, which is given by the third column of the weight matrix, OK? So you can think of the output firing rate vector as being a linear combination of a contribution from the first neuron, a contribution from the second neuron, and a contribution from the third neuron. Does that make sense? It's a different way of thinking about it.
In the dot product interpretation, we're summing up all of the weighted inputs onto neuron one from those synapses. We're summing up all the weighted inputs onto neuron two from those synapses, and summing up all the weighted inputs onto neuron three from those synapses. So we're doing it one output neuron at a time. In this other interpretation of the matrix multiplication, we're doing something different. We're asking: what is the contribution to the output from one of the input neurons? What is the contribution to the output from another input neuron? And what is the contribution to the output from yet another input neuron? Does that make sense? OK.

All right, so we have a linear combination of contributions from each of those input neurons. And that's called the outer product interpretation. I'm not going to explain right now why it's called that, but that's how that's referred to. So the output pattern is a linear combination of contributions.

OK, so let's take a look at the effect of some very simple feed-forward networks, OK?
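The two interpretations can be compared side by side in NumPy. The weights and rates below are the same made-up values as before; both views give exactly the same answer as the full matrix product:

```python
import numpy as np

W = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
u = np.array([2.0, 1.0, 3.0])

# Dot product interpretation: one output neuron at a time.
# Each output rate is a row of W dotted with the input vector.
v_rows = np.array([W[a] @ u for a in range(3)])

# Outer product (column) interpretation: a linear combination of
# the columns of W, each weighted by one input neuron's rate.
v_cols = u[0] * W[:, 0] + u[1] * W[:, 1] + u[2] * W[:, 2]

print(np.allclose(v_rows, W @ u))  # True
print(np.allclose(v_cols, W @ u))  # True
```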
So let's just look at a few examples. This is sort of the simplest feed-forward network: each neuron in the input layer connects to one neuron in the output layer with a weight of one. So what is the weight matrix of this network?

AUDIENCE: Identity.

MICHALE FEE: It's the identity matrix. And so the firing rates of the output layer will be exactly the same as the firing rates in the input layer, OK? So there's the weight matrix, which is just the identity matrix. And the firing rate of the output layer is just the identity matrix times the firing rate of the input layer, and so that's equal to the input firing rate, OK?

All right, let's take a slightly more complex network, and let's make each one of those weights independent. They're not all just equal to one, but they're scaled by some constant: lambda 1, lambda 2, and lambda 3. The weight matrix looks like this. It's a diagonal matrix, where each of those weights is on the diagonal.
And in that case, you can see that the output firing rate is just this diagonal matrix times the input firing rate. And you can see that the output firing rate is just the input firing rate where each component is scaled by some constant. Pretty straightforward.

Let's take a look at a case where the weight matrix corresponds to a rotation matrix, OK? So we're going to let the weight matrix look like the rotation matrix that we talked about on Tuesday, where the diagonal elements are the cosine of some rotation angle, and the off-diagonal elements are plus and minus the sine of the rotation angle. So you can see that this weight matrix corresponds to this network, where the projection from input neuron one to output neuron one is cosine phi. Input neuron two to output neuron two is cosine phi. And then these cross-connections are plus and minus sine phi. OK, so what does that do? So we can see that the output firing rate vector is just the product of this rotation matrix times the input firing rate vector. And you can write down each component like that. All right, so what does that do?
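All three weight matrices so far (identity, diagonal, and rotation) can be written out in a short NumPy sketch. The firing rates and lambda values are made up for illustration, and the sign convention on the rotation matrix is chosen to match the sum-and-difference form worked out next:

```python
import numpy as np

u = np.array([2.0, 1.0])  # made-up input-layer firing rates

# 1. Identity weights: the output copies the input.
W_id = np.eye(2)

# 2. Diagonal weights: each component scaled by its own constant.
lam = np.array([0.5, 3.0])   # hypothetical lambda_1, lambda_2
W_diag = np.diag(lam)

# 3. Rotation weights: cosines on the diagonal, plus and minus
#    sines on the off-diagonal.
def rotation_weights(phi):
    return np.array([[np.cos(phi),  np.sin(phi)],
                     [-np.sin(phi), np.cos(phi)]])

print(W_id @ u)    # [2. 1.] -- unchanged
print(W_diag @ u)  # [1. 3.] -- componentwise scaling
```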
So let's take a particular rotation angle. We're going to take a rotation angle of pi over 4, which is 45 degrees. That's what the weight matrix looks like. And we can do that multiplication to find what the output firing rate vector looks like: one of the output neurons has a firing rate that looks like the sum of the two input firing rates, and the other output neuron has a firing rate that looks like the difference between the two input firing rates.

And if you look at what this looks like in the space of firing rates of the input layer and the output layer, we can see what happens, OK? So what we'll often do when we look at the behavior of neural networks is we'll make a plot of the firing rates of the different neurons in the network. And what we'll often do for simple feed-forward networks, and we'll also do this for recurrent networks, is we'll plot the input firing rates in the plane of u1 and u2. And then we can plot the output firing rates in the same plane. So, for example, if we have an input state where u1 equals u2, it will be some point on this diagonal line.
We can then plot the output firing rate on this plane, v1 versus v2. And what will the output firing rate look like? What will the firing rate of v1 look like in this case?

AUDIENCE: [INAUDIBLE]

MICHALE FEE: Yeah, let's say this is one and one. So what will the firing rate of this neuron look like? [INAUDIBLE]?

AUDIENCE: [INAUDIBLE]

MICHALE FEE: What's that?

AUDIENCE: [INAUDIBLE]

MICHALE FEE: So the firing rate of v1 is just this quantity right here, right? So it's u1 plus u2, right? So it's 1 plus 1, over root 2. So it will be big. What will the firing rate of neuron v2 look like? It'll be u2 minus u1, which is?

AUDIENCE: Zero.

MICHALE FEE: Zero. So it will be over here, right? So it will be that input rotated by 45 degrees.

And an input down here: the firing rate of v1 will be the sum of those two. Those two inputs are both negative. So v1 for this input will be big and negative. And v2 will be the difference of u1 and u2, which for anything on this line is?

AUDIENCE: Zero.
MICHALE FEE: Zero. OK. And so that input will be rotated over to here. So you can think of it this way: any input in this space of u1 and u2 will, in the output, just be rotated by, in this case, minus 45 degrees. That's clockwise; the minus rotations are clockwise. So you can predict the output firing rates simply by taking the input firing rates in this plane and rotating them by minus 45 degrees. All right, any questions about that? It's very simple. So this little neural network implements rotations of this input space. That's pretty cool.

Why would you want a network to do rotations? Well, this solves exactly the problem that we were working on last time, when we were talking about our perceptron, where we were trying to classify stimuli that could not be separated in one dimension but, rather, can be separated in two dimensions.
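The worked example above can be checked numerically. Points on the u1 = u2 diagonal land on the v1 axis after the minus-45-degree rotation:

```python
import numpy as np

phi = np.pi / 4  # 45 degrees
W = np.array([[np.cos(phi),  np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

# (1, 1) maps to (sqrt(2), 0): a big v1, and v2 of zero.
print(W @ np.array([1.0, 1.0]))

# (-1, -1) maps to (-sqrt(2), 0): big and negative v1, zero v2.
print(W @ np.array([-1.0, -1.0]))
```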
So if we have different categories, dogs and non-dogs, that can be viewed along different dimensions, like how furry they are, but the two categories can't be separated from each other on the basis of just one dimension of observation. So in this case, what we want to do is take this space of inputs and rotate it into what we'll call a new basis set, so that now we can take the firing rates of these output neurons and use those to separate the different categories from each other. Does that make sense?

OK, so let me show you a few more examples of that. So this is one way to think about what we do when we do color vision, OK? So you know that we have different cones in our retina that are sensitive to different wavelengths. Most colors are combinations of those wavelengths. So let's look at the activity of, say, a cone that's sensitive to wavelength one and the activity of a cone that's sensitive to wavelength two. And then we look around the world.
We'll see a bunch of different objects, a bunch of different stimuli, that activate those two different cones in different ratios. And you might imagine that this axis corresponds to, let's say, how much red there is in a stimulus, and this axis corresponds to how much green there is in a stimulus. But let's say that you're in an environment where there's some cloud of contributions of red and green. So what would this direction correspond to in this cloud? This direction corresponds to more red and more green. What would that correspond to?

AUDIENCE: Brown.

MICHALE FEE: So what I'm trying to get at here is that the sum of those two is sort of the brightness of the object, right? Something that has a little red and a little green will look the same color as something that has more red and more green, right? But what's different about those two stimuli is that one is brighter than the other; the second one is brighter than the first one. But this dimension corresponds to what? Differences in the ratio of those two colors, right?
Sort of changes in the different [AUDIO OUT] wavelengths, and that corresponds to color. So if we can take this space of stimuli and rotate it such that one axis corresponds to the sum of the two colors and the other axis corresponds to the difference of the two colors, then this axis will tell you how bright it is, and this axis will tell you what the hue is, what the color is. Does that make sense?

So there's a simple case where taking a rotation of an input space, of a set of sensors, will give you different information than you would get if you just had one of those stimuli. If you were to just look at the activity of the cone that's giving you a red signal, and one object has more activity in that cone, you don't know whether that object is just brighter or whether it's actually more red. Does that make sense? So doing a rotation gives us signals in single neurons that carry useful information. It can disambiguate different kinds of information. All right, so we can use that simple rotation matrix to perform that kind of separation. So: brightness and color.
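Here is a small numerical sketch of that disambiguation. The cone activities are invented (red, green) pairs, not measured data: the two gray stimuli differ only in the brightness channel (the sum), while the reddish and greenish stimuli share a brightness but have opposite signs in the hue channel (the difference):

```python
import numpy as np

phi = np.pi / 4
W = np.array([[np.cos(phi),  np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

# Hypothetical cone activities: (red, green)
dim_gray    = np.array([1.0, 1.0])  # equal mix, low intensity
bright_gray = np.array([3.0, 3.0])  # equal mix, high intensity
reddish     = np.array([2.0, 1.0])  # more red than green
greenish    = np.array([1.0, 2.0])  # more green than red

for stim in (dim_gray, bright_gray, reddish, greenish):
    v = W @ stim
    # v[0] ~ brightness (sum channel), v[1] ~ hue (difference channel)
    print(v)
```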
Here's another example. I didn't get to talk about this in this class, but barn owls can very exquisitely localize objects by sound. So they hunt, essentially, at night, in the dark. They can hear a mouse scurrying around in the grass. They just listen to that sound, and they can tell exactly where it is, and then they dive down and catch the mouse. So how do they do that? Well, they use timing differences to tell which way the sound is coming from side to side, and they use intensity differences to tell which way the sound is coming from up and down. Now, how do you use intensity differences? Well, their right ear is pointed slightly upwards, and their left ear is pointed slightly downwards. So when they hear a sound that's slightly louder in the right ear and slightly softer in the left ear, they know that it's coming from up above, right? And if it's the other way around, if it's slightly louder in the left ear and softer in the right ear, they know it's coming from below horizontal. And it's an extremely precise system, OK? So here's an example.
453 00:25:00,450 --> 00:25:02,390 So if they're sitting there listening 454 00:25:02,390 --> 00:25:06,030 to the intensity, the amplitude of the sound in the left ear 455 00:25:06,030 --> 00:25:10,040 and the amplitude of the sound in the right ear, 456 00:25:10,040 --> 00:25:14,390 some sounds will be up here with high amplitude in both ears. 457 00:25:14,390 --> 00:25:16,970 Some sounds will be over here, with more 458 00:25:16,970 --> 00:25:22,070 amplitude in the right ear and less amplitude in the left ear. 459 00:25:22,070 --> 00:25:24,800 What does this dimension correspond to? 460 00:25:24,800 --> 00:25:27,764 That dimension corresponds to? 461 00:25:27,764 --> 00:25:28,620 AUDIENCE: Proximity. 462 00:25:28,620 --> 00:25:30,390 MICHALE FEE: Proximity or, overall, 463 00:25:30,390 --> 00:25:32,610 the loudness of the sound, right? 464 00:25:32,610 --> 00:25:36,060 And what does this dimension correspond to? 465 00:25:36,060 --> 00:25:37,650 AUDIENCE: Direction. 466 00:25:37,650 --> 00:25:40,060 MICHALE FEE: The difference in intensity corresponds 467 00:25:40,060 --> 00:25:47,830 to the elevation of the sound relative to the horizontal. 468 00:25:47,830 --> 00:25:48,330 All right? 469 00:25:48,330 --> 00:25:52,200 So, in fact, what happens in the owl's brain 470 00:25:52,200 --> 00:25:57,540 is that these two signals undergo a rotation to produce 471 00:25:57,540 --> 00:26:01,080 activity in some neurons that's sensitive to the overall 472 00:26:01,080 --> 00:26:04,830 loudness and activity in other neurons that's 473 00:26:04,830 --> 00:26:09,190 sensitive to the difference between the intensity 474 00:26:09,190 --> 00:26:10,570 of the two sounds. 475 00:26:10,570 --> 00:26:14,560 It's a measure of the elevation of the sounds. 
476 00:26:14,560 --> 00:26:17,590 All right, so this kind of rotation matrix 477 00:26:17,590 --> 00:26:22,060 is very useful for projecting stimuli 478 00:26:22,060 --> 00:26:26,530 into the right dimension so that they give useful signals. 479 00:26:33,320 --> 00:26:38,600 All right, so let's come back to our matrix transformations 480 00:26:38,600 --> 00:26:41,060 and look in a little bit more detail 481 00:26:41,060 --> 00:26:43,640 about what kinds of transformations 482 00:26:43,640 --> 00:26:45,270 you can do with matrices. 483 00:26:45,270 --> 00:26:50,720 So we talked about how matrices can do 484 00:26:50,720 --> 00:26:53,630 stretch, compression, rotation. 485 00:26:53,630 --> 00:26:56,660 And we're going to talk about a new kind of transformation 486 00:26:56,660 --> 00:26:59,880 that they can do. 487 00:26:59,880 --> 00:27:05,070 So you remember we talked about how a matrix multiplication 488 00:27:05,070 --> 00:27:08,580 implements a transformation from one set of vectors 489 00:27:08,580 --> 00:27:10,500 into another set of vectors? 490 00:27:10,500 --> 00:27:14,250 And the inverse of that matrix transforms back 491 00:27:14,250 --> 00:27:17,820 to the original set of vectors, OK? 492 00:27:17,820 --> 00:27:19,870 So you can make a transformation, 493 00:27:19,870 --> 00:27:21,900 and then you can undo that transformation 494 00:27:21,900 --> 00:27:25,680 by multiplying by the inverse of the matrix. 495 00:27:25,680 --> 00:27:29,760 OK, so we talked about different kinds of transformations 496 00:27:29,760 --> 00:27:31,380 that you can do. 
497 00:27:31,380 --> 00:27:33,790 So if you take the identity matrix 498 00:27:33,790 --> 00:27:35,580 and you make a small perturbation 499 00:27:35,580 --> 00:27:38,730 to both of the diagonal elements, the same perturbation 500 00:27:38,730 --> 00:27:40,860 to both diagonal elements, you're basically 501 00:27:40,860 --> 00:27:43,620 taking a set of vectors and you're stretching them 502 00:27:43,620 --> 00:27:45,810 uniformly in all directions. 503 00:27:45,810 --> 00:27:48,900 If you make a perturbation to just one of the components 504 00:27:48,900 --> 00:27:51,540 of the identity matrix, you can take the data 505 00:27:51,540 --> 00:27:55,200 and stretch it in one direction or stretch it 506 00:27:55,200 --> 00:27:57,060 in the other direction. 507 00:27:57,060 --> 00:28:01,740 If you add something to the first component 508 00:28:01,740 --> 00:28:04,020 and subtract something from the second component, 509 00:28:04,020 --> 00:28:06,630 you can stretch in one direction and compress 510 00:28:06,630 --> 00:28:08,710 in another direction. 511 00:28:08,710 --> 00:28:13,447 We talked about reflections and inversions through the origin. 512 00:28:13,447 --> 00:28:15,030 These are all transformations that are 513 00:28:15,030 --> 00:28:18,840 produced by diagonal matrices. 514 00:28:18,840 --> 00:28:22,260 And the inverse of those diagonal matrices 515 00:28:22,260 --> 00:28:25,800 is just one over the diagonal elements. 516 00:28:25,800 --> 00:28:28,170 OK, we also talked about rotations 517 00:28:28,170 --> 00:28:30,960 that you can do with this rotation matrix. 518 00:28:30,960 --> 00:28:34,380 And then the inverse of the rotation matrix-- 519 00:28:34,380 --> 00:28:39,420 basically, you compute the inverse of a rotation matrix 520 00:28:39,420 --> 00:28:41,610 simply by computing the rotation matrix 521 00:28:41,610 --> 00:28:46,260 using the negative 522 00:28:46,260 --> 00:28:47,550 of the rotation angle. 
523 00:28:50,750 --> 00:28:54,440 And we also talked about how a rotation matrix-- 524 00:28:54,440 --> 00:28:56,480 for a rotation matrix, the inverse 525 00:28:56,480 --> 00:28:58,280 is also equal to the transpose. 526 00:28:58,280 --> 00:29:01,220 And the reason is that rotation matrices 527 00:29:01,220 --> 00:29:04,460 have this antisymmetry, where the off-diagonal elements have 528 00:29:04,460 --> 00:29:06,720 the opposite sign. 529 00:29:06,720 --> 00:29:09,680 One of the things we haven't talked about is-- 530 00:29:09,680 --> 00:29:15,950 so we talked about how this kind of matrix 531 00:29:15,950 --> 00:29:20,960 can produce a stretch along one dimension or a stretch 532 00:29:20,960 --> 00:29:24,560 along the other dimension of the vectors. 533 00:29:24,560 --> 00:29:30,380 But one really important kind of transformation 534 00:29:30,380 --> 00:29:35,180 that we need to understand is how you can produce stretches 535 00:29:35,180 --> 00:29:37,460 in an arbitrary direction, OK? 536 00:29:37,460 --> 00:29:42,380 So not just along the x-axis or along the y-axis, but along any 537 00:29:42,380 --> 00:29:44,990 arbitrary direction. 538 00:29:44,990 --> 00:29:48,410 And the reason we need to know how that works 539 00:29:48,410 --> 00:29:53,240 is because that formulation of how you write down a matrix 540 00:29:53,240 --> 00:29:56,780 to stretch data in any arbitrary direction 541 00:29:56,780 --> 00:30:01,520 is the basis of a lot of really important data analysis 542 00:30:01,520 --> 00:30:04,490 methods, including principal component 543 00:30:04,490 --> 00:30:07,910 analysis and other methods. 544 00:30:07,910 --> 00:30:09,980 So I'm going to walk you through how 545 00:30:09,980 --> 00:30:12,630 to think about making stretches in data 546 00:30:12,630 --> 00:30:14,180 in arbitrary dimensions. 547 00:30:14,180 --> 00:30:18,380 OK, so here's what we're going to walk through. 548 00:30:18,380 --> 00:30:20,000 Let's say we have a set of vectors. 
549 00:30:20,000 --> 00:30:21,093 I just picked-- 550 00:30:21,093 --> 00:30:22,260 I don't know, what is that-- 551 00:30:22,260 --> 00:30:25,940 20 or so random vectors. 552 00:30:25,940 --> 00:30:29,840 So I just called a random number generator 20 times 553 00:30:29,840 --> 00:30:33,720 and just picked 20 random vectors. 554 00:30:33,720 --> 00:30:40,280 And we're going to figure out how to write down a matrix that 555 00:30:40,280 --> 00:30:43,640 will transform that set of vectors 556 00:30:43,640 --> 00:30:48,560 into another set of vectors that stretched along some arbitrary 557 00:30:48,560 --> 00:30:50,870 axis. 558 00:30:50,870 --> 00:30:52,700 Does that make sense? 559 00:30:52,700 --> 00:30:55,450 So how do we do that? 560 00:30:55,450 --> 00:30:59,210 And remember, we know how to do two things. 561 00:30:59,210 --> 00:31:03,070 We know how to stretch a set of vectors along the x-axis. 562 00:31:03,070 --> 00:31:06,680 We know how to stretch vectors along the y-axis, 563 00:31:06,680 --> 00:31:09,020 and we know how to rotate a set of vectors. 564 00:31:09,020 --> 00:31:11,140 So we're just going to combine those two 565 00:31:11,140 --> 00:31:14,500 ingredients to produce this stretch in an arbitrary 566 00:31:14,500 --> 00:31:15,560 direction. 567 00:31:15,560 --> 00:31:18,290 So now I've given you the recipe-- 568 00:31:18,290 --> 00:31:20,200 or I've given you the ingredients. 569 00:31:20,200 --> 00:31:21,900 The recipe's pretty obvious, right? 570 00:31:21,900 --> 00:31:25,450 We're going to take this set of initial vectors. 571 00:31:25,450 --> 00:31:26,150 Good. 572 00:31:26,150 --> 00:31:26,650 Lina? 573 00:31:26,650 --> 00:31:28,960 AUDIENCE: You [INAUDIBLE]. 574 00:31:28,960 --> 00:31:29,890 That's it. 575 00:31:29,890 --> 00:31:30,880 MICHALE FEE: Bingo. 576 00:31:30,880 --> 00:31:32,170 That's it. 
577 00:31:32,170 --> 00:31:33,630 OK, so we're going to take-- 578 00:31:33,630 --> 00:31:36,400 all right, so we're going to rotate this thing 45 degrees. 579 00:31:36,400 --> 00:31:38,500 We take this original set of vectors. 580 00:31:38,500 --> 00:31:39,550 We're going to-- 581 00:31:39,550 --> 00:31:41,830 OK, so first of all, the first thing 582 00:31:41,830 --> 00:31:45,010 we do when we want to take a set of points 583 00:31:45,010 --> 00:31:47,530 and stretch it along an arbitrary direction, 584 00:31:47,530 --> 00:31:49,960 we pick that angle that we want to stretch it 585 00:31:49,960 --> 00:31:51,880 on-- in this case, 45 degrees. 586 00:31:51,880 --> 00:31:55,150 And we write down a rotation matrix corresponding 587 00:31:55,150 --> 00:31:58,490 to that rotation, corresponding to that angle. 588 00:31:58,490 --> 00:32:01,780 So that's the first thing we do. 589 00:32:01,780 --> 00:32:04,550 So we've chosen 45 degrees as the angle 590 00:32:04,550 --> 00:32:06,040 we want to stretch on. 591 00:32:06,040 --> 00:32:08,080 So now we write down a rotation matrix 592 00:32:08,080 --> 00:32:11,230 for a 45-degree rotation. 593 00:32:11,230 --> 00:32:12,730 Then what we're going to do is we're 594 00:32:12,730 --> 00:32:15,820 going to take that set of points and we're 595 00:32:15,820 --> 00:32:19,540 going to rotate it by minus 45 degrees. 596 00:32:24,190 --> 00:32:27,210 So how do we do that? 597 00:32:27,210 --> 00:32:33,000 How do we take any one of those vectors x and rotate it by-- 598 00:32:33,000 --> 00:32:36,800 so that rotation matrix is for plus 45. 599 00:32:36,800 --> 00:32:41,322 How do we rotate that vector by minus 45? 600 00:32:41,322 --> 00:32:44,150 AUDIENCE: [INAUDIBLE] multiply it by the [INAUDIBLE].. 601 00:32:44,150 --> 00:32:44,900 MICHALE FEE: Good. 602 00:32:44,900 --> 00:32:45,760 Say it. 603 00:32:45,760 --> 00:32:47,510 AUDIENCE: Multiply by the inverse of that. 
604 00:32:47,510 --> 00:32:49,170 MICHALE FEE: Yeah, and what's the inverse of a-- 605 00:32:49,170 --> 00:32:49,770 AUDIENCE: Transpose. 606 00:32:49,770 --> 00:32:50,728 MICHALE FEE: Transpose. 607 00:32:50,728 --> 00:32:54,360 So we don't have to go to Matlab and do a matrix 608 00:32:54,360 --> 00:32:55,470 inversion. 609 00:32:55,470 --> 00:32:58,560 We can just do the transpose. 610 00:32:58,560 --> 00:33:03,570 OK, so we take that vector and we multiply it by the transpose. 611 00:33:03,570 --> 00:33:06,060 So that does a minus 45-degree rotation 612 00:33:06,060 --> 00:33:08,082 of all of those points. 613 00:33:08,082 --> 00:33:09,040 And then what do we do? 614 00:33:13,290 --> 00:33:14,250 Lina, you said it. 615 00:33:14,250 --> 00:33:14,760 Stretch it. 616 00:33:14,760 --> 00:33:16,708 Stretch it along? 617 00:33:16,708 --> 00:33:19,180 AUDIENCE: The x-axis? 618 00:33:19,180 --> 00:33:21,520 MICHALE FEE: The x-axis, good. 619 00:33:21,520 --> 00:33:25,040 What does that matrix look like that does that? 620 00:33:25,040 --> 00:33:27,234 Just give me-- yup? 621 00:33:27,234 --> 00:33:29,165 AUDIENCE: 5, 0, 0, 1. 622 00:33:29,165 --> 00:33:30,040 MICHALE FEE: Awesome. 623 00:33:30,040 --> 00:33:30,580 That's it. 624 00:33:30,580 --> 00:33:36,480 So we're going to stretch using a stretch matrix. 625 00:33:36,480 --> 00:33:39,220 So I use phi for a rotation matrix, 626 00:33:39,220 --> 00:33:42,830 and I use lambda for a stretch matrix, a stretch 627 00:33:42,830 --> 00:33:45,940 matrix along x or y. 628 00:33:45,940 --> 00:33:48,790 Lambda is a diagonal matrix, which always just 629 00:33:48,790 --> 00:33:52,910 stretches or compresses along the x or y direction. 630 00:33:52,910 --> 00:33:55,045 And then what do we do? 631 00:33:55,045 --> 00:33:56,320 AUDIENCE: [INAUDIBLE] 632 00:33:56,320 --> 00:33:57,070 MICHALE FEE: Good. 633 00:33:57,070 --> 00:34:00,220 By multiplying by? 634 00:34:00,220 --> 00:34:01,690 By this. 
635 00:34:01,690 --> 00:34:03,700 Excellent. 636 00:34:03,700 --> 00:34:04,210 That's all. 637 00:34:04,210 --> 00:34:05,890 So how do we write this down? 638 00:34:05,890 --> 00:34:09,370 So, remember, here, we're sort of marching through the recipe 639 00:34:09,370 --> 00:34:12,520 from left to right. 640 00:34:12,520 --> 00:34:16,070 When you write down matrices, you go the other way. 641 00:34:16,070 --> 00:34:18,070 So when you do matrix multiplication, 642 00:34:18,070 --> 00:34:22,300 you take your vector x and you multiply it on the left side 643 00:34:22,300 --> 00:34:25,929 by phi transpose. 644 00:34:25,929 --> 00:34:28,810 And then you take that and you multiply that on the left side 645 00:34:28,810 --> 00:34:30,969 by lambda. 646 00:34:30,969 --> 00:34:33,020 And then you take that. 647 00:34:33,020 --> 00:34:34,719 That now gives you these. 648 00:34:34,719 --> 00:34:38,210 And now to get the final answer here, 649 00:34:38,210 --> 00:34:42,565 you multiply again on the left side by phi. 650 00:34:42,565 --> 00:34:44,409 That's it. 651 00:34:44,409 --> 00:34:47,230 That's how you produce an arbitrary stretch-- 652 00:34:47,230 --> 00:34:49,630 a stretch or a compression of data 653 00:34:49,630 --> 00:34:52,480 in an arbitrary direction, all right? 654 00:34:52,480 --> 00:34:55,449 You take the data, the vector. 655 00:34:55,449 --> 00:34:59,200 You multiply it by a rotation matrix transpose, 656 00:34:59,200 --> 00:35:02,560 multiply it by a stretch matrix, a diagonal matrix, 657 00:35:02,560 --> 00:35:07,150 and you multiply it by a rotation matrix. 658 00:35:07,150 --> 00:35:09,376 Rotate, stretch, unrotate. 659 00:35:14,860 --> 00:35:18,610 So let's actually do this for 45 degrees. 660 00:35:18,610 --> 00:35:22,850 So there's our rotation matrix-- 661 00:35:22,850 --> 00:35:27,470 1, minus 1, 1, 1, all times one over the square root of two. 662 00:35:27,470 --> 00:35:31,590 The transpose is 1, 1, minus 1, 1, with the same factor. 663 00:35:31,590 --> 00:35:33,670 And here's our stretch matrix. 
664 00:35:33,670 --> 00:35:37,320 In this case, it was stretched by a factor of two. 665 00:35:37,320 --> 00:35:44,250 So we multiply x by phi transpose, multiply by lambda, 666 00:35:44,250 --> 00:35:46,680 and then multiply by phi. 667 00:35:46,680 --> 00:35:49,320 So we can now write that down. 668 00:35:49,320 --> 00:35:51,600 If you just do those three matrix 669 00:35:51,600 --> 00:35:54,580 multiplications-- those two matrix multiplications, sorry, 670 00:35:54,580 --> 00:35:55,080 yes? 671 00:35:55,080 --> 00:35:56,370 One, two. 672 00:35:56,370 --> 00:35:58,920 Two matrix multiplications. 673 00:35:58,920 --> 00:36:02,797 You get a single matrix that when you multiply it by 674 00:36:02,797 --> 00:36:05,175 x implements this stretch. 675 00:36:08,040 --> 00:36:10,000 Any questions about that? 676 00:36:10,000 --> 00:36:12,910 You should ask me now if you don't understand, 677 00:36:12,910 --> 00:36:16,780 because I want you to be able to do this for an arbitrary-- 678 00:36:16,780 --> 00:36:20,980 so I'm going to give you some angle, 679 00:36:20,980 --> 00:36:25,090 and I'll tell you, construct a matrix that 680 00:36:25,090 --> 00:36:32,290 stretches data along a 30-degree axis by a factor of five. 681 00:36:32,290 --> 00:36:35,410 You should be able to write down that matrix. 682 00:36:35,410 --> 00:36:37,510 All right, so this is what you're going to do, 683 00:36:37,510 --> 00:36:41,710 and that's what that matrix will look like, something like that. 684 00:36:41,710 --> 00:36:48,880 Now, we can stretch these data along a 45-degree axis 685 00:36:48,880 --> 00:36:50,350 by some factor. 686 00:36:50,350 --> 00:36:52,430 It's a factor of two here. 687 00:36:52,430 --> 00:36:53,500 How do we go back? 688 00:36:53,500 --> 00:36:57,340 How do we undo that stretch? 689 00:36:57,340 --> 00:37:01,780 So how do you take the inverse of a product of a bunch 690 00:37:01,780 --> 00:37:03,480 of matrices like this? 
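The "rotate, stretch, unrotate" recipe can be checked numerically. This is a sketch in plain Python (the lecture itself mentions Matlab), for the 45-degree, factor-of-two stretch:

```python
import math

# M = phi @ Lambda @ phi^T: rotate by -45 degrees, stretch by 2
# along the x-axis, then rotate back by +45 degrees.
theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)
phi = [[c, -s], [s, c]]            # rotation by +45 degrees
lam = [[2.0, 0.0], [0.0, 1.0]]     # stretch by 2 along the x-axis

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

# The combined matrix works out to [[1.5, 0.5], [0.5, 1.5]].
M = matmul(phi, matmul(lam, transpose(phi)))

along = matvec(M, [1.0, 1.0])    # on the 45-degree axis: doubled
perp = matvec(M, [1.0, -1.0])    # perpendicular to it: unchanged
```

A vector lying on the 45-degree axis gets doubled, while a vector perpendicular to it passes through untouched, which is exactly the stretch being described.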
691 00:37:03,480 --> 00:37:05,600 So the answer is very simple. 692 00:37:05,600 --> 00:37:10,420 If we want to take the inverse of a product of three matrices, 693 00:37:10,420 --> 00:37:13,570 what we do is we just-- 694 00:37:13,570 --> 00:37:16,790 it's, again, a product of three matrices. 695 00:37:16,790 --> 00:37:20,830 It's a product of the inverse of those three matrices, 696 00:37:20,830 --> 00:37:22,880 but you have to reverse the order. 697 00:37:22,880 --> 00:37:25,660 So if you want to find the inverse of matrix A times B 698 00:37:25,660 --> 00:37:31,270 times C, it's C inverse times B inverse times A inverse. 699 00:37:31,270 --> 00:37:34,970 And you can prove that that's the right answer as follows. 700 00:37:34,970 --> 00:37:42,520 So ABC inverse times ABC should be the identity matrix, right? 701 00:37:42,520 --> 00:37:49,150 So let's replace this by this result here. 702 00:37:49,150 --> 00:37:52,020 So C inverse B inverse A inverse times 703 00:37:52,020 --> 00:37:54,790 ABC would be the identity matrix. 704 00:37:54,790 --> 00:38:01,340 And you can see that right here, A inverse times A is I. 705 00:38:01,340 --> 00:38:03,290 So you can get rid of that. 706 00:38:03,290 --> 00:38:06,740 B inverse times B is I. 707 00:38:06,740 --> 00:38:11,480 C inverse times C is I. 708 00:38:11,480 --> 00:38:14,900 So we just proved that that is the correct way 709 00:38:14,900 --> 00:38:18,070 of taking the inverse of a product of matrices, all right? 710 00:38:20,650 --> 00:38:26,050 So the inverse of this kind of matrix 711 00:38:26,050 --> 00:38:30,210 that stretches data along an arbitrary direction 712 00:38:30,210 --> 00:38:31,080 looks like this. 713 00:38:31,080 --> 00:38:37,050 It's phi transpose inverse lambda inverse phi inverse. 714 00:38:37,050 --> 00:38:40,320 So let's figure out what each one of those things is. 715 00:38:40,320 --> 00:38:44,100 So what is phi transpose inverse, 716 00:38:44,100 --> 00:38:46,228 where phi is a rotation matrix? 
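The rule just proved, that the inverse of ABC is C inverse times B inverse times A inverse, can also be verified numerically. This is a sketch with arbitrary invertible 2-by-2 matrices chosen only for illustration:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    # Inverse of a 2x2 matrix via the determinant formula.
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Three arbitrary invertible matrices (made up for this check).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 1.0]]
C = [[2.0, 0.0], [0.0, 3.0]]

lhs = inv2(matmul(A, matmul(B, C)))               # (ABC)^-1
rhs = matmul(inv2(C), matmul(inv2(B), inv2(A)))   # C^-1 B^-1 A^-1
```

The two results agree element by element, just as the proof with the identity matrix says they must.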
717 00:38:46,228 --> 00:38:47,020 AUDIENCE: Just phi. 718 00:38:47,020 --> 00:38:48,940 MICHALE FEE: Phi, good. 719 00:38:48,940 --> 00:38:51,171 And what is phi inverse? 720 00:38:51,171 --> 00:38:52,464 AUDIENCE: [INAUDIBLE] 721 00:38:52,464 --> 00:38:53,760 MICHALE FEE: [INAUDIBLE]. 722 00:38:53,760 --> 00:38:55,190 Good. 723 00:38:55,190 --> 00:38:58,320 And lambda inverse we'll get to in a second. 724 00:38:58,320 --> 00:39:05,300 So the inverse of this arbitrary rotated stretch matrix 725 00:39:05,300 --> 00:39:12,450 is just another rotated stretch matrix, right? 726 00:39:12,450 --> 00:39:17,860 Where the lambda now has-- 727 00:39:17,860 --> 00:39:21,370 lambda inverse is just given by the inverse of each 728 00:39:21,370 --> 00:39:24,240 of those diagonal elements. 729 00:39:24,240 --> 00:39:28,815 So it's super easy to find the inverse of one 730 00:39:28,815 --> 00:39:33,200 of these matrices that computes this stretch in an arbitrary 731 00:39:33,200 --> 00:39:34,550 direction. 732 00:39:34,550 --> 00:39:36,800 You just keep the same phi. 733 00:39:36,800 --> 00:39:40,940 It's just phi times some diagonal matrix times 734 00:39:40,940 --> 00:39:45,963 phi transpose, but the diagonals are inverted. 735 00:39:45,963 --> 00:39:46,880 Does that make sense? 736 00:39:49,700 --> 00:39:51,110 All right, so let's write it out. 737 00:39:51,110 --> 00:39:55,550 We're going to undo this 45-degree stretch that we just 738 00:39:55,550 --> 00:39:56,520 did. 739 00:39:56,520 --> 00:40:02,060 We're going to do it by rotating, stretching by 1/2 740 00:40:02,060 --> 00:40:04,520 instead of stretching by two. 741 00:40:04,520 --> 00:40:09,060 So you can see that compresses now along the x-axis. 742 00:40:09,060 --> 00:40:10,790 And then we rotate back, and we're back 743 00:40:10,790 --> 00:40:14,380 to our original data. 744 00:40:14,380 --> 00:40:17,110 Any questions about that? 
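In the same spirit, here is a quick numerical sketch that the inverse of phi lambda phi-transpose is phi lambda-inverse phi-transpose, using the 45-degree, factor-of-two stretch from before:

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)
phi = [[c, -s], [s, c]]              # rotation by +45 degrees

lam = [[2.0, 0.0], [0.0, 1.0]]       # stretch by 2 along x
lam_inv = [[0.5, 0.0], [0.0, 1.0]]   # diagonal elements inverted

M = matmul(phi, matmul(lam, transpose(phi)))          # the stretch
M_inv = matmul(phi, matmul(lam_inv, transpose(phi)))  # same phi, 1/lambda

# Their product should be the identity: the stretch is undone.
I = matmul(M, M_inv)
```

Same phi on both sides; only the diagonal entries get flipped to their reciprocals.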
745 00:40:17,110 --> 00:40:19,570 It's really easy, as long as you just 746 00:40:19,570 --> 00:40:25,000 think through what you're doing as you go through those steps, 747 00:40:25,000 --> 00:40:25,870 all right? 748 00:40:25,870 --> 00:40:26,970 Any questions about that? 749 00:40:31,200 --> 00:40:32,410 OK. 750 00:40:32,410 --> 00:40:32,910 Wow. 751 00:40:37,910 --> 00:40:38,690 All right. 752 00:40:38,690 --> 00:40:41,090 So you can actually just write those down 753 00:40:41,090 --> 00:40:46,100 and compute the single matrix that 754 00:40:46,100 --> 00:40:55,710 implements this compression along that 45-degree axis, OK? 755 00:40:55,710 --> 00:40:56,210 All right. 756 00:41:00,040 --> 00:41:02,470 So let me just show you one other example. 757 00:41:02,470 --> 00:41:04,930 And I'll show you something interesting 758 00:41:04,930 --> 00:41:09,680 that happens if you construct a matrix that instead 759 00:41:09,680 --> 00:41:13,400 of stretching along a 45-degree axis does compression 760 00:41:13,400 --> 00:41:16,130 along a 45-degree axis. 761 00:41:16,130 --> 00:41:18,690 So here's our original data. 762 00:41:18,690 --> 00:41:25,155 Let's take that data and rotate it by plus 45 degrees. 763 00:41:28,100 --> 00:41:33,720 Multiplied by lambda, that compresses along the x-axis 764 00:41:33,720 --> 00:41:39,150 and then rotates by minus 45 degrees. 765 00:41:39,150 --> 00:41:44,670 So here's an example where we can take data and compress it 766 00:41:44,670 --> 00:41:48,750 along an axis of minus 45 degrees, all right? 767 00:41:48,750 --> 00:41:50,130 So you can write this down. 768 00:41:50,130 --> 00:41:52,440 So we're going to say we're going to compress 769 00:41:52,440 --> 00:41:54,630 along a minus 45 degree axis. 770 00:41:54,630 --> 00:41:57,450 We write down phi of minus 45. 771 00:41:57,450 --> 00:42:00,453 Notice that when you do this compression or stretching, 772 00:42:00,453 --> 00:42:02,370 there are different ways you can do it, right? 
773 00:42:02,370 --> 00:42:03,930 You can take the data. 774 00:42:03,930 --> 00:42:08,190 You can rotate it this way and then squish along this axis. 775 00:42:08,190 --> 00:42:12,090 Or you could rotate it this way and squish along this axis, 776 00:42:12,090 --> 00:42:12,590 right? 777 00:42:15,770 --> 00:42:18,277 So there are choices for how you do it. 778 00:42:18,277 --> 00:42:19,860 But in the end, you're going to end up 779 00:42:19,860 --> 00:42:23,460 with the same matrix that does all of those equivalent 780 00:42:23,460 --> 00:42:24,420 transformations. 781 00:42:24,420 --> 00:42:25,750 OK, so here we are. 782 00:42:25,750 --> 00:42:27,000 We're going to write this out. 783 00:42:27,000 --> 00:42:28,458 So we're writing down a matrix that 784 00:42:28,458 --> 00:42:32,730 produces this compression along a minus 45-degree axis. 785 00:42:32,730 --> 00:42:34,770 So there's phi of minus 45. 786 00:42:34,770 --> 00:42:37,720 There's lambda, a compression along the x-axis. 787 00:42:37,720 --> 00:42:41,950 So here, it's 0.2, 0, 0, 1. 788 00:42:41,950 --> 00:42:44,250 And here's the phi transpose. 789 00:42:44,250 --> 00:42:52,310 So you write all that out, and you get 0.6, 0.4, 0.4, 0.6. 790 00:42:52,310 --> 00:42:53,540 Let me show you one more. 791 00:42:56,240 --> 00:43:03,980 What happens if we accidentally take this data, we rotate it, 792 00:43:03,980 --> 00:43:09,068 and then we squish the data to zero? 793 00:43:09,068 --> 00:43:10,496 Yes? 794 00:43:10,496 --> 00:43:16,210 AUDIENCE: [INAUDIBLE] 795 00:43:16,210 --> 00:43:17,350 MICHALE FEE: It doesn't. 796 00:43:17,350 --> 00:43:18,340 You can do either one. 797 00:43:21,780 --> 00:43:22,490 Let me go back. 798 00:43:32,940 --> 00:43:34,690 Let me just go back to the very first one. 799 00:43:37,680 --> 00:43:42,390 So here, we rotated clockwise and then 800 00:43:42,390 --> 00:43:46,020 stretched along the x-axis and then unrotated. 
801 00:43:46,020 --> 00:43:51,930 We could have taken these data, rotated counterclockwise, 802 00:43:51,930 --> 00:43:56,695 stretched along the y-axis, and then rotated back, right? 803 00:43:56,695 --> 00:43:57,570 Does that make sense? 804 00:44:01,240 --> 00:44:03,070 You'll still get the same answer. 805 00:44:03,070 --> 00:44:07,750 You'll still get the same answer for this matrix here. 806 00:44:11,940 --> 00:44:13,230 OK, now watch this. 807 00:44:19,560 --> 00:44:23,120 What happens if we take these data, we rotate them, 808 00:44:23,120 --> 00:44:29,650 and then we compress data all the way to zero? 809 00:44:29,650 --> 00:44:32,660 So by compressing the data to a line, 810 00:44:32,660 --> 00:44:34,820 we're multiplying it by zero. 811 00:44:34,820 --> 00:44:40,440 We put a zero in this element of the stretch matrix, all right? 812 00:44:40,440 --> 00:44:41,450 And what happens? 813 00:44:41,450 --> 00:44:46,120 The data get compressed right to zero, OK? 814 00:44:46,120 --> 00:44:47,360 And then we can rotate back. 815 00:44:47,360 --> 00:44:49,460 So we've taken these data. 816 00:44:49,460 --> 00:44:53,150 We can write down a matrix that takes those data 817 00:44:53,150 --> 00:45:00,310 and squishes them to zero along some arbitrary direction. 818 00:45:00,310 --> 00:45:08,510 Now, can we take those data and go back to the original data? 819 00:45:08,510 --> 00:45:10,220 Can we write down a transformation 820 00:45:10,220 --> 00:45:13,310 that takes those and goes back to the original data? 821 00:45:13,310 --> 00:45:15,119 Why not? 822 00:45:15,119 --> 00:45:16,877 AUDIENCE: Lambda doesn't [INAUDIBLE].. 823 00:45:16,877 --> 00:45:17,960 MICHALE FEE: Say it again. 824 00:45:17,960 --> 00:45:19,340 AUDIENCE: Lambda doesn't [INAUDIBLE].. 825 00:45:19,340 --> 00:45:20,090 MICHALE FEE: Good. 826 00:45:20,090 --> 00:45:22,412 What's another way to think about that? 827 00:45:22,412 --> 00:45:24,180 AUDIENCE: We've lost [INAUDIBLE].. 
828 00:45:24,180 --> 00:45:26,310 MICHALE FEE: You've lost that information. 829 00:45:26,310 --> 00:45:30,990 So in order to go back from here to the original data, 830 00:45:30,990 --> 00:45:35,280 you have to have information somewhere here that tells you 831 00:45:35,280 --> 00:45:40,240 how far out to stretch it again when you try to go back. 832 00:45:40,240 --> 00:45:42,160 But in this case, we've compressed everything 833 00:45:42,160 --> 00:45:46,260 to a line, and so there's no information 834 00:45:46,260 --> 00:45:48,140 about how to go back to the original data. 835 00:45:51,610 --> 00:45:54,700 And how do you know if you've done this? 836 00:45:54,700 --> 00:45:58,195 Well, you can take a look at this matrix that you created. 837 00:46:00,720 --> 00:46:03,210 So let's say somebody gave you this matrix. 838 00:46:03,210 --> 00:46:05,810 How would you tell whether you could get 839 00:46:05,810 --> 00:46:07,100 back to the original data? 840 00:46:09,660 --> 00:46:11,890 Any ideas? 841 00:46:11,890 --> 00:46:13,042 Abiba? 842 00:46:13,042 --> 00:46:14,260 AUDIENCE: [INAUDIBLE] 843 00:46:14,260 --> 00:46:15,010 MICHALE FEE: Good. 844 00:46:15,010 --> 00:46:16,177 You look at the determinant. 845 00:46:16,177 --> 00:46:19,480 So if you calculate the determinant of this matrix, 846 00:46:19,480 --> 00:46:21,100 the determinant is zero. 847 00:46:21,100 --> 00:46:23,620 And as soon as you see a zero determinant, 848 00:46:23,620 --> 00:46:27,100 you know right away that you can't go back. 849 00:46:27,100 --> 00:46:28,840 After you've made this transformation, 850 00:46:28,840 --> 00:46:32,320 you can't go back to the original data. 851 00:46:32,320 --> 00:46:36,010 And we're going to get into a little more detail about why 852 00:46:36,010 --> 00:46:39,040 that is and what that means. 853 00:46:39,040 --> 00:46:43,660 And the reason here is that the determinant of lambda is zero. 
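That determinant check can be sketched directly: squishing to zero along the 45-degree axis gives a matrix whose determinant is zero, and two different inputs land on the same output, so there is no way back.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)
phi = [[c, -s], [s, c]]
lam = [[0.0, 0.0], [0.0, 1.0]]   # squish one axis all the way to zero

M = matmul(phi, matmul(lam, transpose(phi)))
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]   # comes out zero

# Two different vectors map to the same point: the squish
# destroyed the information needed to undo it.
p = matvec(M, [1.0, 0.0])
q = matvec(M, [0.0, -1.0])
```

A zero determinant is the algebraic fingerprint of that lost information.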
854 00:46:43,660 --> 00:46:46,780 The determinant of a product of matrices 855 00:46:46,780 --> 00:46:49,210 like this is the product of the determinants. 856 00:46:49,210 --> 00:46:51,940 And in this case, the determinant of the lambda 857 00:46:51,940 --> 00:46:55,510 matrix is zero, and so the determinant of the product 858 00:46:55,510 --> 00:46:57,910 is zero, OK? 859 00:46:57,910 --> 00:47:02,930 All right, so now let's talk about basis sets. 860 00:47:02,930 --> 00:47:07,230 All right, so we can think of vectors in abstract directions. 861 00:47:07,230 --> 00:47:11,190 So if I hold my arm out here and tell you 862 00:47:11,190 --> 00:47:13,220 this is a vector-- there's the origin. 863 00:47:13,220 --> 00:47:15,390 The vector's pointing in that direction. 864 00:47:15,390 --> 00:47:19,020 You don't need a coordinate system 865 00:47:19,020 --> 00:47:21,540 to know which way I'm pointing. 866 00:47:21,540 --> 00:47:25,800 I don't need to tell you my arm is pointing 867 00:47:25,800 --> 00:47:28,470 80 centimeters in that direction and 40 868 00:47:28,470 --> 00:47:30,938 centimeters in that direction and 10 centimeters 869 00:47:30,938 --> 00:47:31,980 in that direction, right? 870 00:47:31,980 --> 00:47:34,200 You don't need a coordinate system 871 00:47:34,200 --> 00:47:38,930 to know which way I'm pointing, right? 872 00:47:38,930 --> 00:47:44,870 But if I want to quantify that vector so 873 00:47:44,870 --> 00:47:47,690 that-- if you want to quantify that vector so that you can 874 00:47:47,690 --> 00:47:50,780 maybe tell somebody else precisely which direction I'm 875 00:47:50,780 --> 00:47:55,890 pointing, you need to write down those numbers, OK? 876 00:47:55,890 --> 00:48:00,280 So you can think of vectors in abstract directions, 877 00:48:00,280 --> 00:48:05,040 but if you want to actually quantify it or write it down, 878 00:48:05,040 --> 00:48:07,410 you need to choose a coordinate system. 
879 00:48:07,410 --> 00:48:10,170 And so to do this, you choose a set 880 00:48:10,170 --> 00:48:13,890 of vectors, special vectors, called a basis set. 881 00:48:13,890 --> 00:48:16,590 And now we just say, here's a vector. 882 00:48:16,590 --> 00:48:21,510 How much is it pointing in that direction, that direction, 883 00:48:21,510 --> 00:48:22,890 and that direction? 884 00:48:22,890 --> 00:48:24,870 And that's called a basis set. 885 00:48:24,870 --> 00:48:28,230 So we can write down our vector now 886 00:48:28,230 --> 00:48:32,490 as a set of three numbers that simply tell us 887 00:48:32,490 --> 00:48:35,520 how far that vector is overlapped 888 00:48:35,520 --> 00:48:39,810 with three other vectors that form the basis set. 889 00:48:39,810 --> 00:48:41,430 So the standard way of doing this 890 00:48:41,430 --> 00:48:47,080 is to describe a vector as a component in the x direction, 891 00:48:47,080 --> 00:48:51,400 which is a vector 1, 0, 0, sort of in the standard notation; 892 00:48:51,400 --> 00:48:53,880 a component in the y direction, which is 0, 893 00:48:53,880 --> 00:48:58,380 1, 0; and a component in the z direction, 0, 0, 1. 894 00:48:58,380 --> 00:49:04,920 So we can write those vectors as standard basis vectors. 895 00:49:04,920 --> 00:49:07,260 The numbers x, y, and z here are called 896 00:49:07,260 --> 00:49:09,150 the coordinates of the vector. 897 00:49:09,150 --> 00:49:13,950 And the vectors e1, e2, and e3 are called the basis vectors. 898 00:49:13,950 --> 00:49:16,380 And this is how you would write that down 899 00:49:16,380 --> 00:49:18,660 for a three-dimensional vector, OK? 900 00:49:18,660 --> 00:49:20,640 Again, the little hat here denotes 901 00:49:20,640 --> 00:49:25,600 that those are unit vectors that have a length one. 
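That decomposition can be written out directly. A tiny sketch, with arbitrary made-up coordinates:

```python
# A vector in R^3 is the sum of its coordinates times the
# standard basis vectors e1, e2, e3.
e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

x, y, z = 2.0, -1.0, 3.0   # arbitrary coordinates for illustration
v = tuple(x * a + y * b + z * c for a, b, c in zip(e1, e2, e3))
```

In the standard basis the coordinates simply are the components of the vector, which is what makes this basis feel invisible.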
902 00:49:25,600 --> 00:49:27,680 All right, so in order to describe an arbitrary 903 00:49:27,680 --> 00:49:30,770 vector in a space of n real numbers, 904 00:49:30,770 --> 00:49:36,620 Rn, the basis vectors each need to have n numbers. 905 00:49:36,620 --> 00:49:39,410 And in order to describe an arbitrary vector in that space, 906 00:49:39,410 --> 00:49:42,710 you need to have n basis vectors. 907 00:49:42,710 --> 00:49:44,570 You need to have-- 908 00:49:44,570 --> 00:49:47,630 in n dimensions, you need to have n basis vectors, 909 00:49:47,630 --> 00:49:52,830 and each one of those basis vectors has to have n numbers in it. 910 00:49:52,830 --> 00:49:55,130 So these vectors here-- 911 00:49:55,130 --> 00:49:59,435 1, 0, 0; 0, 1, 0; and 0, 0, 1-- are called the standard basis. 912 00:50:03,120 --> 00:50:06,120 And each one of these vectors has one element that's one 913 00:50:06,120 --> 00:50:07,200 and the rest are zero. 914 00:50:07,200 --> 00:50:08,340 That's the standard basis. 915 00:50:12,720 --> 00:50:16,450 The standard basis has the property 916 00:50:16,450 --> 00:50:20,470 that any one of those vectors dotted into itself is one. 917 00:50:20,470 --> 00:50:22,060 That's because they're unit vectors. 918 00:50:22,060 --> 00:50:23,440 They have length one. 919 00:50:23,440 --> 00:50:28,960 So e sub i dot e sub i is the length squared of the i-th vector. 920 00:50:28,960 --> 00:50:32,340 And if the length is one, then the length squared is one. 921 00:50:32,340 --> 00:50:36,210 Each vector is orthogonal to all the other vectors. 922 00:50:36,210 --> 00:50:41,440 That means that e1 dot e2 is zero, and e1 dot e3 is zero, 923 00:50:41,440 --> 00:50:43,770 and e2 dot e3 is zero. 924 00:50:43,770 --> 00:50:49,350 You can write that down as e sub i dot e sub j equals zero for i 925 00:50:49,350 --> 00:50:52,580 not equal to j.
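These dot-product properties of the standard basis are easy to check numerically. Here is a minimal sketch in Python with NumPy (the lecture itself works in Matlab, so this translation is just illustrative):

```python
import numpy as np

# The standard basis vectors of R^3: each has a single 1 and the rest zeros
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])

basis = [e1, e2, e3]
for i, ei in enumerate(basis):
    for j, ej in enumerate(basis):
        # 1 when i == j (each vector has unit length),
        # 0 otherwise (the vectors are mutually orthogonal)
        print(i + 1, j + 1, np.dot(ei, ej))
```

The printed table of dot products is exactly the Kronecker delta pattern described next in the lecture.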
926 00:50:52,580 --> 00:50:54,470 You can write all of those properties 927 00:50:54,470 --> 00:50:56,420 down in one equation-- 928 00:50:56,420 --> 00:51:00,920 e sub i dot e sub j equals delta i j. 929 00:51:00,920 --> 00:51:05,950 Delta i j is what's called the Kronecker delta function. 930 00:51:05,950 --> 00:51:09,800 The Kronecker delta function is a one if i equals j and a zero 931 00:51:09,800 --> 00:51:13,430 if i is not equal to j, OK? 932 00:51:13,430 --> 00:51:16,730 So it's a very compact way of writing down this property 933 00:51:16,730 --> 00:51:19,070 that each vector is a unit vector 934 00:51:19,070 --> 00:51:23,370 and each vector is orthogonal to all the other vectors. 935 00:51:23,370 --> 00:51:28,140 And a set with that property is called an orthonormal 936 00:51:28,140 --> 00:51:28,790 basis set. 937 00:51:31,700 --> 00:51:37,600 All right, now, the standard basis is not the only basis-- 938 00:51:37,600 --> 00:51:39,110 sorry. 939 00:51:39,110 --> 00:51:41,850 I'm trying to do x, y, and z here. 940 00:51:41,850 --> 00:51:45,510 So if you have x, y, and z, that's 941 00:51:45,510 --> 00:51:48,690 not the only orthonormal basis set. 942 00:51:48,690 --> 00:51:54,480 Any basis set that is a rotation of those three vectors 943 00:51:54,480 --> 00:51:57,880 is also an orthonormal basis. 944 00:51:57,880 --> 00:52:02,720 Let's write down two other orthogonal unit vectors. 945 00:52:02,720 --> 00:52:06,500 We can write down our vector v in this other basis 946 00:52:06,500 --> 00:52:08,760 set as follows. 947 00:52:08,760 --> 00:52:13,910 We just take our vector v. We can plot the basis vectors 948 00:52:13,910 --> 00:52:15,650 in this other basis. 949 00:52:15,650 --> 00:52:20,370 And we can simply project v onto those other basis vectors. 950 00:52:20,370 --> 00:52:26,540 So we can project v onto f1, and we can project v onto f2.
951 00:52:26,540 --> 00:52:32,030 So we can write v as a sum of a vector in the direction of f1 952 00:52:32,030 --> 00:52:33,890 and a vector in the direction of f2. 953 00:52:36,420 --> 00:52:42,720 You can write down this vector v in this different basis set 954 00:52:42,720 --> 00:52:45,580 as a vector with two components. 955 00:52:45,580 --> 00:52:48,270 This is two dimensional. 956 00:52:48,270 --> 00:52:50,180 This is R2. 957 00:52:50,180 --> 00:52:53,010 You can write it down as a two-component vector-- 958 00:52:53,010 --> 00:52:56,460 v dot f1 and v dot f2. 959 00:52:56,460 --> 00:52:59,460 So that's a simple intuition for what [AUDIO OUT] 960 00:52:59,460 --> 00:53:01,050 in two dimensions. 961 00:53:01,050 --> 00:53:05,370 We're going to develop the formalism for doing this 962 00:53:05,370 --> 00:53:07,100 in arbitrary dimensions, OK? 963 00:53:07,100 --> 00:53:09,620 And it's very simple. 964 00:53:09,620 --> 00:53:14,100 All right, these components here are 965 00:53:14,100 --> 00:53:20,240 called the vector coordinates of this vector in the basis f. 966 00:53:20,240 --> 00:53:26,360 All right, now, basis sets, or basis vectors, 967 00:53:26,360 --> 00:53:29,000 don't have to be orthogonal to each other, 968 00:53:29,000 --> 00:53:31,750 and they don't have to be normal. 969 00:53:31,750 --> 00:53:33,980 They don't have to be unit vectors. 970 00:53:33,980 --> 00:53:37,220 You can write down an arbitrary vector 971 00:53:37,220 --> 00:53:41,570 as a sum of components that aren't 972 00:53:41,570 --> 00:53:43,350 orthogonal to each other. 973 00:53:43,350 --> 00:53:45,080 So you can write down this vector v 974 00:53:45,080 --> 00:53:50,510 as a sum of a component here in the f1 direction 975 00:53:50,510 --> 00:53:53,100 and a component in the f2 direction, 976 00:53:53,100 --> 00:53:56,330 even if f1 and f2 are not orthogonal to each other 977 00:53:56,330 --> 00:53:59,100 and even if they're not unit vectors.
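For the orthonormal case described a moment ago, the projection recipe, take v dot f1 and v dot f2 and then add the pieces back up, really does recover the original vector. A small sketch in Python with NumPy, using a made-up orthonormal basis (the standard basis rotated by 45 degrees, since the transcript doesn't give the slide's actual f1 and f2):

```python
import numpy as np

# A made-up orthonormal basis for R^2: the standard basis rotated by 45 degrees
th = np.pi / 4
f1 = np.array([np.cos(th), np.sin(th)])
f2 = np.array([-np.sin(th), np.cos(th)])

v = np.array([3.0, 5.0])

# For an orthonormal basis, the coordinates are just the projections v . f1, v . f2
vf = np.array([np.dot(v, f1), np.dot(v, f2)])

# Rebuilding v from those components recovers the original vector
v_rebuilt = vf[0] * f1 + vf[1] * f2
print(vf, v_rebuilt)
```

Note this simple projection recipe relies on f1 and f2 being orthonormal; the non-orthogonal case that comes next needs the matrix-inverse machinery instead.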
978 00:53:59,100 --> 00:54:02,930 So, again, v is expressed as a linear combination 979 00:54:02,930 --> 00:54:05,360 of a vector in the f1 direction and a vector 980 00:54:05,360 --> 00:54:07,760 in the f2 direction. 981 00:54:07,760 --> 00:54:12,400 OK, so let's take a vector and decompose it 982 00:54:12,400 --> 00:54:15,420 into an arbitrary basis set f1 and f2. 983 00:54:18,120 --> 00:54:22,510 So v equals c1 f1 plus c2 f2. 984 00:54:22,510 --> 00:54:24,560 The coefficients here are called the coordinates 985 00:54:24,560 --> 00:54:27,020 of the vector in this basis. 986 00:54:27,020 --> 00:54:30,710 And the vector v sub f-- 987 00:54:30,710 --> 00:54:39,620 these numbers, c1 and c2, when combined into this vector, 988 00:54:39,620 --> 00:54:44,840 is called the coordinate vector of v in the basis f1 989 00:54:44,840 --> 00:54:46,880 and f2, OK? 990 00:54:46,880 --> 00:54:47,870 Does that make sense? 991 00:54:47,870 --> 00:54:49,440 Just some terminology. 992 00:54:52,630 --> 00:54:56,440 OK, so let's define this basis, f1 and f2. 993 00:54:56,440 --> 00:55:00,550 We just pick two vectors, two arbitrary vectors. 994 00:55:00,550 --> 00:55:05,740 And I'll explain later that not all choices of vectors work, 995 00:55:05,740 --> 00:55:08,030 but most of them do. 996 00:55:08,030 --> 00:55:11,080 So here are two vectors that we can choose as a basis-- 997 00:55:11,080 --> 00:55:17,105 so 1, 3, which is sort of like this, and minus 2, 1 998 00:55:17,105 --> 00:55:17,980 is kind of like that. 999 00:55:22,442 --> 00:55:24,150 And we're going to write down this vector 1000 00:55:24,150 --> 00:55:26,070 v in this new basis. 1001 00:55:26,070 --> 00:55:30,250 So we have a vector v that's 3, 5 in the standard basis, 1002 00:55:30,250 --> 00:55:35,250 and we're going to rewrite it in this new basis, all right? 1003 00:55:35,250 --> 00:55:37,680 So we're going to find the vector coordinates of v 1004 00:55:37,680 --> 00:55:39,150 in the new basis.
1005 00:55:39,150 --> 00:55:40,840 So we're going to do this as follows. 1006 00:55:40,840 --> 00:55:43,650 We're going to write v as a linear combination of these two 1007 00:55:43,650 --> 00:55:45,240 basis vectors. 1008 00:55:45,240 --> 00:55:49,580 So c1 times f1-- 1009 00:55:49,580 --> 00:55:53,060 1, 3-- plus c2 times f2-- 1010 00:55:53,060 --> 00:55:56,240 minus 2, 1-- is equal to 3, 5. 1011 00:55:56,240 --> 00:55:57,680 That make sense? 1012 00:55:57,680 --> 00:55:58,370 So what is that? 1013 00:55:58,370 --> 00:56:04,010 That is just a system of equations, right? 1014 00:56:04,010 --> 00:56:08,570 And what we're trying to do is solve for c1 and c2. 1015 00:56:08,570 --> 00:56:09,470 That's it. 1016 00:56:09,470 --> 00:56:13,010 So we already did this problem in the last lecture. 1017 00:56:16,420 --> 00:56:18,440 So we have this system of equations. 1018 00:56:18,440 --> 00:56:23,060 We can write this down in the following matrix notation. 1019 00:56:23,060 --> 00:56:29,280 F times vf-- vf is just c1 and c2-- 1020 00:56:29,280 --> 00:56:31,305 equals v. So there's F-- 1021 00:56:31,305 --> 00:56:32,970 1, 3; minus 2, 1. 1022 00:56:32,970 --> 00:56:36,030 Those are our two basis vectors. 1023 00:56:36,030 --> 00:56:41,130 Times c1 c2-- the vector c1, c2-- is equal to 3, 5. 1024 00:56:41,130 --> 00:56:43,620 And we solve for vf. 1025 00:56:43,620 --> 00:56:46,080 In other words, we solve for c1 and c2 1026 00:56:46,080 --> 00:56:56,540 simply by multiplying v by the inverse of this matrix F. 1027 00:56:56,540 --> 00:57:02,820 So the coordinate vector in this new basis set 1028 00:57:02,820 --> 00:57:06,810 is just the old vector times F inverse. 1029 00:57:06,810 --> 00:57:08,430 And what is F? 1030 00:57:08,430 --> 00:57:16,750 F is just the matrix that has the basis vectors 1031 00:57:16,750 --> 00:57:18,175 as the columns of the matrix.
1032 00:57:24,330 --> 00:57:29,730 So the coordinates of this vector in this new basis set 1033 00:57:29,730 --> 00:57:35,260 are given by F inverse times v. We can find the inverse of F. 1034 00:57:35,260 --> 00:57:40,690 So if that's our F, we can calculate the inverse of that. 1035 00:57:40,690 --> 00:57:44,050 Remember, you flip the diagonal elements. 1036 00:57:44,050 --> 00:57:47,020 You multiply the off-diagonals by minus 1, 1037 00:57:47,020 --> 00:57:49,930 and you divide by the determinant. 1038 00:57:49,930 --> 00:58:01,000 So F inverse is this, times v is that, and v sub f is just 13/7 1039 00:58:01,000 --> 00:58:04,380 and minus 4/7. 1040 00:58:04,380 --> 00:58:08,550 So that's just a different way of writing v. 1041 00:58:08,550 --> 00:58:10,710 So there's v in the standard basis. 1042 00:58:10,710 --> 00:58:15,210 There's v in this new basis, all right? 1043 00:58:15,210 --> 00:58:21,890 And all you do to go from the standard basis 1044 00:58:21,890 --> 00:58:25,520 to any arbitrary new basis is multiply the vector 1045 00:58:25,520 --> 00:58:26,270 by F inverse. 1046 00:58:33,800 --> 00:58:38,550 And when you're actually doing this in Matlab, 1047 00:58:38,550 --> 00:58:39,940 this is really simple. 1048 00:58:39,940 --> 00:58:43,800 You just write down a matrix F that has 1049 00:58:43,800 --> 00:58:46,530 the basis vectors in the columns. 1050 00:58:46,530 --> 00:58:49,620 You just use the matrix inverse function, 1051 00:58:49,620 --> 00:58:52,710 and then you multiply that by the data matrix, 1052 00:58:52,710 --> 00:58:54,300 by the data vector. 1053 00:58:54,300 --> 00:58:58,490 All right, so I'm just going to summarize again. 1054 00:58:58,490 --> 00:59:02,060 In order to find the coordinate vector for v in this new basis, 1055 00:59:02,060 --> 00:59:05,780 you construct a matrix F, whose columns 1056 00:59:05,780 --> 00:59:09,000 are just the elements of the basis vectors.
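The Matlab recipe just described translates directly. Here is the same example worked in Python with NumPy, as an illustrative stand-in for the Matlab code on the slide, using the numbers from the lecture:

```python
import numpy as np

# Basis vectors f1 = (1, 3) and f2 = (-2, 1) go in the columns of F
F = np.array([[1.0, -2.0],
              [3.0,  1.0]])

# The vector v in the standard basis
v = np.array([3.0, 5.0])

# Coordinates of v in the new basis: v_f = F^-1 v
vf = np.linalg.inv(F) @ v
print(vf)  # [13/7, -4/7], i.e. approximately [1.857, -0.571]

# Multiplying by F maps the coordinate vector back to the standard basis
print(F @ vf)  # recovers [3, 5]
```

In practice you would solve the linear system rather than form the inverse explicitly (np.linalg.solve(F, v), like Matlab's backslash), but the inverse matches the derivation in the lecture.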
1057 00:59:09,000 --> 00:59:11,720 So if you have two basis vectors, 1058 00:59:11,720 --> 00:59:14,600 it's a two-- remember, each of those basis vectors. 1059 00:59:14,600 --> 00:59:16,850 In two dimensions, there are two basis vectors. 1060 00:59:16,850 --> 00:59:20,180 Each has two numbers, so this is a 2 by 2 matrix. 1061 00:59:20,180 --> 00:59:24,200 In n dimensions, you have n basis vectors. 1062 00:59:24,200 --> 00:59:26,640 Each of the basis vectors has n numbers. 1063 00:59:26,640 --> 00:59:31,440 And so this matrix F is an n by n matrix, all right? 1064 00:59:31,440 --> 00:59:38,730 You know that you can write down v as this basis times v sub f. 1065 00:59:38,730 --> 00:59:41,310 You solve for v sub f by multiplying both sides 1066 00:59:41,310 --> 00:59:42,840 by F inverse, all right? 1067 00:59:42,840 --> 00:59:45,720 That performs what's called a change of basis. 1068 00:59:50,100 --> 00:59:54,670 Now, that only works if F has an inverse. 1069 00:59:54,670 --> 00:59:59,550 So if you're going to choose a new basis to write down 1070 00:59:59,550 --> 01:00:02,250 your vector, you have to be careful to pick one 1071 01:00:02,250 --> 01:00:04,320 that has an inverse, all right? 1072 01:00:04,320 --> 01:00:05,820 And I want to show you what it looks 1073 01:00:05,820 --> 01:00:08,640 like when you pick a basis that doesn't have an inverse 1074 01:00:08,640 --> 01:00:10,110 and what that means. 1075 01:00:10,110 --> 01:00:14,620 All right, and that gets to the idea of linear independence. 1076 01:00:14,620 --> 01:00:20,140 All right, so, remember I said that if in n dimensions, in Rn, 1077 01:00:20,140 --> 01:00:25,390 in order to have a basis in Rn, you have certain requirements? 1078 01:00:25,390 --> 01:00:26,990 Not any vectors will work. 1079 01:00:26,990 --> 01:00:29,920 So let's take a look at these vectors.
1080 01:00:29,920 --> 01:00:32,800 Will those work to describe an-- 1081 01:00:32,800 --> 01:00:35,890 will that basis set work to describe an arbitrary 1082 01:00:35,890 --> 01:00:37,435 vector in three dimensions? 1083 01:00:37,435 --> 01:00:38,050 No? 1084 01:00:38,050 --> 01:00:39,913 Why not? 1085 01:00:39,913 --> 01:00:45,068 AUDIENCE: [INAUDIBLE] vectors, so if you're [INAUDIBLE].. 1086 01:00:45,068 --> 01:00:45,860 MICHALE FEE: Right. 1087 01:00:45,860 --> 01:00:48,950 So the problem is in which coordinate, which axis? 1088 01:00:48,950 --> 01:00:49,700 AUDIENCE: Z-axis. 1089 01:00:49,700 --> 01:00:50,700 MICHALE FEE: The z-axis. 1090 01:00:50,700 --> 01:00:54,020 You can see that you have zeros in all three of those vectors, 1091 01:00:54,020 --> 01:00:56,690 OK? 1092 01:00:56,690 --> 01:00:59,720 You can't describe any vector with this basis 1093 01:00:59,720 --> 01:01:03,169 that has a non-zero component in the z direction. 1094 01:01:08,710 --> 01:01:11,700 And the reason is that any linear combination 1095 01:01:11,700 --> 01:01:16,700 of these three vectors will always lie in the xy plane. 1096 01:01:16,700 --> 01:01:19,310 So you can't describe any vector here 1097 01:01:19,310 --> 01:01:25,720 that has a non-zero z component, all right? 1098 01:01:25,720 --> 01:01:28,330 So what we say is that this set of vectors 1099 01:01:28,330 --> 01:01:31,910 doesn't span all of R3. 1100 01:01:31,910 --> 01:01:36,830 It only spans the xy plane, which 1101 01:01:36,830 --> 01:01:40,225 is what we call a subspace of R3, OK? 1102 01:01:44,990 --> 01:01:47,210 OK, so let's take a look at these three vectors. 1103 01:01:47,210 --> 01:01:48,770 The other thing to notice is that you 1104 01:01:48,770 --> 01:01:52,250 can write any one of these vectors 1105 01:01:52,250 --> 01:01:56,240 as a linear combination of the other two. 1106 01:01:56,240 --> 01:02:01,750 So you can write f3 as a sum of f1 and f2. 
1107 01:02:01,750 --> 01:02:03,850 The sum of those two vectors is equal to that one. 1108 01:02:03,850 --> 01:02:06,850 You can write f2 as f3 minus f1. 1109 01:02:06,850 --> 01:02:09,940 So any of these vectors can be written as a linear combination 1110 01:02:09,940 --> 01:02:11,330 of the others. 1111 01:02:11,330 --> 01:02:15,310 And so that set of vectors is called linearly dependent. 1112 01:02:19,180 --> 01:02:23,630 And any set of linearly dependent vectors cannot form 1113 01:02:23,630 --> 01:02:24,130 a basis. 1114 01:02:26,880 --> 01:02:28,980 And how do you know if a set of vectors 1115 01:02:28,980 --> 01:02:33,480 that you choose for your basis is linearly dependent? 1116 01:02:33,480 --> 01:02:38,560 Well, again, you just find the determinant of that matrix. 1117 01:02:38,560 --> 01:02:44,030 And if it's zero, those vectors are linearly dependent. 1118 01:02:44,030 --> 01:02:48,670 So what that corresponds to is you're taking your data 1119 01:02:48,670 --> 01:02:54,890 and when you transform it into a new basis, 1120 01:02:54,890 --> 01:02:58,220 if the determinant of that matrix F 1121 01:02:58,220 --> 01:03:01,580 is zero, then what you're doing is you're taking those data 1122 01:03:01,580 --> 01:03:05,763 and transforming them to a space where they're being collapsed. 1123 01:03:05,763 --> 01:03:07,430 Let's say if you're in three dimensions, 1124 01:03:07,430 --> 01:03:12,350 those data are being collapsed onto a plane or onto a line, 1125 01:03:12,350 --> 01:03:14,390 OK? 1126 01:03:14,390 --> 01:03:18,510 And that means you can't undo that transformation, all right? 1127 01:03:18,510 --> 01:03:20,730 And the way to tell whether you've got that problem 1128 01:03:20,730 --> 01:03:23,862 is looking at the determinant. 1129 01:03:23,862 --> 01:03:25,820 All right, let me show you one other cool thing 1130 01:03:25,820 --> 01:03:27,920 about the determinant. 
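First, the determinant test for linear dependence just described, sketched in Python with NumPy. The three vectors here are made-up stand-ins for the ones on the slide, chosen so that they all lie in the xy plane and f3 equals f1 plus f2, matching the relationships described above:

```python
import numpy as np

# Three vectors that all lie in the xy plane (z component zero),
# so each is a linear combination of the others -- made-up example values
f1 = np.array([1.0, 0.0, 0.0])
f2 = np.array([0.0, 1.0, 0.0])
f3 = np.array([1.0, 1.0, 0.0])  # f3 = f1 + f2

# Put them in the columns of F and check the determinant
F = np.column_stack([f1, f2, f3])
print(np.linalg.det(F))  # 0.0 -- linearly dependent, so F has no inverse
```

A zero determinant means the transformation collapses 3D data onto a plane (or a line) and cannot be undone, exactly as described above.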
1131 01:03:27,920 --> 01:03:30,500 There's a very simple geometrical interpretation 1132 01:03:30,500 --> 01:03:33,320 of what the determinant is, OK? 1133 01:03:33,320 --> 01:03:34,700 All right, sorry. 1134 01:03:34,700 --> 01:03:37,580 So if F maps your data onto a subspace, 1135 01:03:37,580 --> 01:03:39,290 then the mapping is not reversible. 1136 01:03:39,290 --> 01:03:43,910 OK, so what does the determinant correspond to? 1137 01:03:43,910 --> 01:03:48,770 Let's say in two dimensions, if I have two orthogonal unit 1138 01:03:48,770 --> 01:03:52,670 vectors, you can think of those vectors 1139 01:03:52,670 --> 01:03:58,460 as kind of forming a square in this space. 1140 01:03:58,460 --> 01:04:01,470 Or in three dimensions, if I have three orthogonal vectors, 1141 01:04:01,470 --> 01:04:05,810 you can think of those vectors as defining a cube, OK? 1142 01:04:05,810 --> 01:04:07,700 And if they're unit vectors, then they 1143 01:04:07,700 --> 01:04:10,990 define a cube of volume one. 1144 01:04:10,990 --> 01:04:16,010 Here, you have a square of area one. 1145 01:04:16,010 --> 01:04:21,454 So let's think about this unit volume. 1146 01:04:21,454 --> 01:04:26,120 If I transform those two vectors or those three vectors 1147 01:04:26,120 --> 01:04:30,710 in 3D space by a matrix A, those vectors 1148 01:04:30,710 --> 01:04:34,730 get rotated and transformed. 1149 01:04:34,730 --> 01:04:38,450 They point in different directions, and they define-- 1150 01:04:38,450 --> 01:04:42,150 it's no longer a cube, but they define some sort of rhombus, 1151 01:04:42,150 --> 01:04:43,720 OK? 1152 01:04:43,720 --> 01:04:48,580 You can ask, what is the volume of that rhombus? 1153 01:04:48,580 --> 01:04:53,560 The volume of that rhombus is just the determinant 1154 01:04:53,560 --> 01:04:58,550 of that matrix A.
So now what happens 1155 01:04:58,550 --> 01:05:03,180 if I have a cube in three-dimensional space 1156 01:05:03,180 --> 01:05:06,210 and I multiply it by a matrix that transforms it 1157 01:05:06,210 --> 01:05:10,230 into a rhombus that has zero volume? 1158 01:05:10,230 --> 01:05:12,240 So let's say I have those three vectors. 1159 01:05:12,240 --> 01:05:16,440 It transforms it into, let's say, a square. 1160 01:05:16,440 --> 01:05:20,200 The volume of that square in three dimensional space 1161 01:05:20,200 --> 01:05:22,830 is zero. 1162 01:05:22,830 --> 01:05:25,800 So what that means is I'm transforming my vectors 1163 01:05:25,800 --> 01:05:28,770 into a space that has zero volume 1164 01:05:28,770 --> 01:05:30,640 in the original dimensions, OK? 1165 01:05:30,640 --> 01:05:35,880 So I'm transforming things from 3D into a 2D plane. 1166 01:05:35,880 --> 01:05:39,210 And what that means is I've lost information, 1167 01:05:39,210 --> 01:05:40,260 and I can't go back. 1168 01:05:44,430 --> 01:05:49,840 OK, notice that a rotation matrix, if I take this cube 1169 01:05:49,840 --> 01:05:53,620 and I rotate it, has exactly the same volume 1170 01:05:53,620 --> 01:05:55,940 as it did before I rotated it. 1171 01:05:55,940 --> 01:06:00,400 And so you can always tell when you have a rotation matrix, 1172 01:06:00,400 --> 01:06:04,640 because the determinant of a rotation matrix is one. 1173 01:06:04,640 --> 01:06:11,050 So if you take a matrix A and you find the determinant 1174 01:06:11,050 --> 01:06:12,910 and you find that the determinant is one, 1175 01:06:12,910 --> 01:06:18,190 you know that you have a pure rotation matrix. 1176 01:06:18,190 --> 01:06:20,736 What does it mean if the determinant is minus one? 1177 01:06:24,310 --> 01:06:26,800 What it means is you have a rotation, 1178 01:06:26,800 --> 01:06:32,620 but that one of the axes is inverted, is flipped. 1179 01:06:32,620 --> 01:06:33,730 There's a mirror in there. 
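Both cases, a pure rotation with determinant plus one and a rotation plus mirror flip with determinant minus one, can be checked numerically. A sketch in Python with NumPy, using a made-up 30-degree rotation:

```python
import numpy as np

# A pure 2D rotation matrix (30 degrees): its determinant is +1,
# so it preserves the unit square's area
th = np.pi / 6
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
print(np.linalg.det(R))  # 1.0

# Flip one axis (a mirror) before rotating: the determinant becomes -1
M = R @ np.diag([1.0, -1.0])
print(np.linalg.det(M))  # -1.0
```

The area is preserved in both cases (the absolute value of the determinant is one); the sign is what distinguishes a pure rotation from a rotation with a mirror flip.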
1180 01:06:36,660 --> 01:06:39,470 So you can tell if you have a pure rotation 1181 01:06:39,470 --> 01:06:43,610 or if you have a rotation and one of the axes is flipped. 1182 01:06:43,610 --> 01:06:46,490 Because in the pure rotation, the determinant is one. 1183 01:06:46,490 --> 01:06:53,360 And in an impure rotation, you have a rotation and a mirror 1184 01:06:53,360 --> 01:06:53,860 flip. 1185 01:06:56,890 --> 01:07:02,750 All right, and I just want to make a couple more comments 1186 01:07:02,750 --> 01:07:05,990 about change of basis, OK? 1187 01:07:05,990 --> 01:07:10,580 All right, so let's choose a set of basis vectors 1188 01:07:10,580 --> 01:07:13,370 for our new basis. 1189 01:07:13,370 --> 01:07:17,470 Let's write those into a matrix F. 1190 01:07:17,470 --> 01:07:22,140 It's going to be our matrix of basis vectors. 1191 01:07:22,140 --> 01:07:24,990 If the determinant is not equal to zero, 1192 01:07:24,990 --> 01:07:27,300 then these vectors, that set of vectors, 1193 01:07:27,300 --> 01:07:29,790 are linearly independent. 1194 01:07:29,790 --> 01:07:34,050 That means you cannot write one of those vectors as a linear 1195 01:07:34,050 --> 01:07:35,280 combination of-- 1196 01:07:35,280 --> 01:07:37,800 any one of those vectors as a linear combination 1197 01:07:37,800 --> 01:07:39,800 of the others. 1198 01:07:39,800 --> 01:07:45,230 Those vectors form a complete basis in that n-dimensional 1199 01:07:45,230 --> 01:07:47,820 space. 1200 01:07:47,820 --> 01:07:50,960 The matrix F implements a change of basis, 1201 01:07:50,960 --> 01:07:54,110 and you can go from the standard basis to F 1202 01:07:54,110 --> 01:07:56,600 by multiplying your vector by F inverse 1203 01:07:56,600 --> 01:07:59,570 to get the coordinate vector in your new basis. 1204 01:07:59,570 --> 01:08:05,100 And you can go back from that rotated or transformed basis 1205 01:08:05,100 --> 01:08:10,440 back to the standard basis by multiplying by F, OK? 
1206 01:08:10,440 --> 01:08:14,250 Multiplying by F inverse transforms to the new basis. 1207 01:08:14,250 --> 01:08:16,200 Multiplying by F transforms back. 1208 01:08:19,319 --> 01:08:26,260 If that set of vectors is an orthonormal basis, then-- 1209 01:08:26,260 --> 01:08:31,200 OK, so let's take this matrix F that has columns 1210 01:08:31,200 --> 01:08:32,729 that are the new basis vectors. 1211 01:08:32,729 --> 01:08:38,630 And let's say that those form an orthonormal basis. 1212 01:08:38,630 --> 01:08:42,020 In that case, we can write down-- so, in any case, 1213 01:08:42,020 --> 01:08:46,100 we can write down the transpose of this matrix, F transpose. 1214 01:08:46,100 --> 01:08:51,210 And now the rows of that matrix are the basis vectors. 1215 01:08:51,210 --> 01:08:55,569 Notice that if we multiply F transpose times F, 1216 01:08:55,569 --> 01:08:59,990 we have basis vectors in rows here and columns here. 1217 01:08:59,990 --> 01:09:03,060 So what is F transpose F for the case 1218 01:09:03,060 --> 01:09:05,399 where these are unit vectors that 1219 01:09:05,399 --> 01:09:07,180 are orthogonal to each other? 1220 01:09:07,180 --> 01:09:08,385 What is that product? 1221 01:09:08,385 --> 01:09:09,260 AUDIENCE: [INAUDIBLE] 1222 01:09:09,260 --> 01:09:09,479 MICHALE FEE: It's what? 1223 01:09:09,479 --> 01:09:10,060 AUDIENCE: [INAUDIBLE] 1224 01:09:10,060 --> 01:09:10,810 MICHALE FEE: Good. 1225 01:09:10,810 --> 01:09:14,738 Because F1 dot F1 is one. 1226 01:09:14,738 --> 01:09:17,840 F1 dot F2 is zero. 1227 01:09:17,840 --> 01:09:21,330 F2 dot F1 is zero, and F2 dot F2 is one. 1228 01:09:21,330 --> 01:09:24,140 So that's equal to the identity matrix, right? 1229 01:09:26,880 --> 01:09:30,300 So F transpose equals F inverse. 1230 01:09:30,300 --> 01:09:33,899 If the inverse of a matrix is just its transpose, 1231 01:09:33,899 --> 01:09:35,924 then that matrix is a rotation matrix. 1232 01:09:38,810 --> 01:09:41,100 So F is just the rotation matrix.
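That chain of facts, F transpose F equals the identity, so F transpose equals F inverse, can be verified numerically. A sketch in Python with NumPy, using a made-up 60-degree rotation as the orthonormal basis:

```python
import numpy as np

# Columns of F are an orthonormal basis for R^2 (a made-up 60-degree rotation)
th = np.pi / 3
F = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])

# For an orthonormal basis, F transpose times F is the identity...
print(F.T @ F)

# ...so the inverse is just the transpose
print(np.allclose(np.linalg.inv(F), F.T))  # True

# And F transpose times v gives the coordinates in the new basis: [v.f1, v.f2]
v = np.array([3.0, 5.0])
vf = F.T @ v
print(np.allclose(vf, [np.dot(v, F[:, 0]), np.dot(v, F[:, 1])]))  # True
```

This is why the orthonormal case is so convenient: no matrix inversion is ever needed, just a transpose.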
1233 01:09:41,100 --> 01:09:43,109 All right, now let's see what happens. 1234 01:09:43,109 --> 01:09:48,810 So that means the inverse of F is just this F transpose. 1235 01:09:48,810 --> 01:09:51,359 Let's do this coordinate-- let's [AUDIO OUT] 1236 01:09:51,359 --> 01:09:54,310 change of basis for this case. 1237 01:09:54,310 --> 01:09:58,680 So you can see that v sub f, the coordinate vector in the new 1238 01:09:58,680 --> 01:10:04,770 basis, is F transpose v. Here's F transpose-- 1239 01:10:04,770 --> 01:10:07,020 the basis vectors are in the rows-- 1240 01:10:07,020 --> 01:10:14,525 times v. This is just v dot F1, v dot F2, right? 1241 01:10:14,525 --> 01:10:20,830 So this shows how for an orthonormal basis, 1242 01:10:20,830 --> 01:10:25,270 the transpose, which is the inverse of F-- 1243 01:10:25,270 --> 01:10:27,190 taking the transpose of F times v 1244 01:10:27,190 --> 01:10:29,290 is just taking the dot product of v 1245 01:10:29,290 --> 01:10:32,320 with each of the basis vectors, OK? 1246 01:10:32,320 --> 01:10:36,880 So that ties it back to what we were showing before about how 1247 01:10:36,880 --> 01:10:39,220 to do this change of basis, OK? 1248 01:10:39,220 --> 01:10:42,400 Just tying up those two ways of thinking about it. 1249 01:10:45,190 --> 01:10:53,490 So, again, what we've been developing 1250 01:10:53,490 --> 01:10:56,720 when we talk about change of basis 1251 01:10:56,720 --> 01:11:02,500 are ways of rotating vectors, rotating sets 1252 01:11:02,500 --> 01:11:04,480 of data, into different dimensions, 1253 01:11:04,480 --> 01:11:07,780 into different basis sets so that we 1254 01:11:07,780 --> 01:11:11,510 can look at data from different directions. 1255 01:11:11,510 --> 01:11:14,210 That's all we're doing.
1256 01:11:14,210 --> 01:11:16,370 And you can see that when you look 1257 01:11:16,370 --> 01:11:20,300 at data from different directions, you can get-- 1258 01:11:20,300 --> 01:11:23,720 some views of data, you have a lot of things overlapping, 1259 01:11:23,720 --> 01:11:24,680 and you can't see them. 1260 01:11:24,680 --> 01:11:28,010 But when you rotate those data, now, all of a sudden, 1261 01:11:28,010 --> 01:11:31,820 you can see things clearly that used to be-- 1262 01:11:31,820 --> 01:11:36,590 things get separated in some views, whereas in other views, 1263 01:11:36,590 --> 01:11:39,980 things are kind of mixed up and covering each other, OK? 1264 01:11:39,980 --> 01:11:44,270 And that's exactly what neural networks are doing when they're 1265 01:11:44,270 --> 01:11:48,260 analyzing sensory stimuli. 1266 01:11:48,260 --> 01:11:50,150 They're doing those kinds of rotations 1267 01:11:50,150 --> 01:11:54,440 and untangling the data to see what's 1268 01:11:54,440 --> 01:11:58,400 there in that high-dimensional data, OK? 1269 01:11:58,400 --> 01:12:00,670 All right, that's it.