1
00:00:01,550 --> 00:00:03,920
The following content is
provided under a Creative

2
00:00:03,920 --> 00:00:05,310
Commons license.

3
00:00:05,310 --> 00:00:07,520
Your support will help
MIT OpenCourseWare

4
00:00:07,520 --> 00:00:11,610
continue to offer high quality
educational resources for free.

5
00:00:11,610 --> 00:00:14,180
To make a donation or to
view additional materials

6
00:00:14,180 --> 00:00:18,140
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:18,140 --> 00:00:19,026
at ocw.mit.edu.

8
00:00:23,260 --> 00:00:26,230
GILBERT STRANG: Well, so
three things to mention.

9
00:00:26,230 --> 00:00:33,520
One was you remember last time
I made a list of six or seven

10
00:00:33,520 --> 00:00:38,080
different situations where
Ax equal b and problems

11
00:00:38,080 --> 00:00:39,850
that could arise?

12
00:00:39,850 --> 00:00:44,980
The last one was when A was just
way too big to fit into core.

13
00:00:44,980 --> 00:00:48,220
But in the middle
were other methods.

14
00:00:48,220 --> 00:00:52,030
So other issues,
like the columns

15
00:00:52,030 --> 00:00:58,900
being nearly dependent, where
Gram-Schmidt will come up.

16
00:00:58,900 --> 00:01:01,390
First, I want to say
that I know there

17
00:01:01,390 --> 00:01:03,310
are typos in the two pages.

18
00:01:03,310 --> 00:01:08,710
I thought you might just like to
see the very first draft of two

19
00:01:08,710 --> 00:01:10,000
pages of the book.

20
00:01:12,910 --> 00:01:14,710
People sometimes
ask me, how long

21
00:01:14,710 --> 00:01:17,020
does the book take to write?

22
00:01:17,020 --> 00:01:22,930
So I started when 18.065
started a year ago.

23
00:01:22,930 --> 00:01:27,790
So I'm into the second year,
and it usually takes two years.

24
00:01:27,790 --> 00:01:32,450
And the system is,
I write by hand.

25
00:01:32,450 --> 00:01:38,030
I wrote those opening pages that
you saw, two pages, by hand.

26
00:01:38,030 --> 00:01:41,210
Then I scan them
to Mumbai, where

27
00:01:41,210 --> 00:01:46,160
my best friend in the world
types them with typos.

28
00:01:46,160 --> 00:01:49,370
No problem, because I'm
going to make so many changes

29
00:01:49,370 --> 00:01:52,430
that a few typos are nothing.

30
00:01:52,430 --> 00:01:57,360
Anyway, so he scans
them back to me typed.

31
00:01:57,360 --> 00:02:00,270
And then I start making changes.

32
00:02:00,270 --> 00:02:03,440
If I'm lucky, I have the
chance to talk about it in here.

33
00:02:03,440 --> 00:02:06,060
Then I realize
better things to do.

34
00:02:06,060 --> 00:02:09,600
Then I scan back to him and back
to me and back to him and back

35
00:02:09,600 --> 00:02:10,330
to me.

36
00:02:10,330 --> 00:02:15,540
So that's where the
two years disappear.

37
00:02:15,540 --> 00:02:18,300
Anyway, I'm quite happy
with those two pages,

38
00:02:18,300 --> 00:02:21,300
until I start improving them.

39
00:02:21,300 --> 00:02:27,165
And one other topic from the
past that came out in class.

40
00:02:27,165 --> 00:02:30,420
It wasn't in the notes yet.

41
00:02:30,420 --> 00:02:37,260
Do you remember the day that
we minimized different norms?

42
00:02:37,260 --> 00:02:45,450
L1 or L2 or max, L infinity
norm with the condition

43
00:02:45,450 --> 00:02:50,220
of solving with a constraint
that this equation was

44
00:02:50,220 --> 00:02:51,000
satisfied.

45
00:02:51,000 --> 00:02:54,240
I'm in 2D to be able
to draw a picture.

46
00:02:54,240 --> 00:02:56,520
And the constraint is one line.

47
00:02:56,520 --> 00:02:59,580
And that's about what
the line looks like.

48
00:02:59,580 --> 00:03:04,470
So I'm going to
draw here again--

49
00:03:04,470 --> 00:03:11,640
what I'm doing is putting some
numbers into that insight

50
00:03:11,640 --> 00:03:16,235
I drew about a week ago.

51
00:03:16,235 --> 00:03:17,110
Do you remember that?

52
00:03:17,110 --> 00:03:23,020
Because I thought that really
illustrates how L1 and L2 and L

53
00:03:23,020 --> 00:03:24,280
infinity are different.

54
00:03:24,280 --> 00:03:27,610
Let me draw the L2 one here.

55
00:03:27,610 --> 00:03:30,590
So where's the
point on this line--

56
00:03:30,590 --> 00:03:34,060
so x has to lie on this line.

57
00:03:34,060 --> 00:03:37,780
And where is the point
that has the smallest

58
00:03:37,780 --> 00:03:41,860
sum of squares norm,
standard L2 norm?

59
00:03:41,860 --> 00:03:44,290
So geometrically
where is that point?

60
00:03:44,290 --> 00:03:49,510
Well, what does the set
of points with norm 1

61
00:03:49,510 --> 00:03:52,460
look like for L2?

62
00:03:52,460 --> 00:03:54,260
It's a circle, right.

63
00:03:54,260 --> 00:03:57,860
So we just blow that
circle up or shrink it down

64
00:03:57,860 --> 00:04:00,290
until it touches this thing.

65
00:04:00,290 --> 00:04:03,500
And where it touches,
it'll touch where

66
00:04:03,500 --> 00:04:07,260
the radius is perpendicular.

67
00:04:07,260 --> 00:04:16,350
So there's our best point
in L2, because if we picked

68
00:04:16,350 --> 00:04:21,269
another point, the norm
would have to be bigger

69
00:04:21,269 --> 00:04:22,860
to go through that point.

70
00:04:22,860 --> 00:04:24,960
So that's clearly the first one.

71
00:04:24,960 --> 00:04:27,060
And actually, we
can probably see

72
00:04:27,060 --> 00:04:31,950
what it is, because if we
know those are perpendicular,

73
00:04:31,950 --> 00:04:33,990
I know the slope
of this line. So I

74
00:04:33,990 --> 00:04:39,750
think that the slope of this
line is something like 3/4,

75
00:04:39,750 --> 00:04:44,320
probably coming from
there, or maybe 4/3.

76
00:04:47,720 --> 00:04:48,902
We'll figure it out.

77
00:04:48,902 --> 00:04:49,610
I think it's 4/3.

78
00:04:52,130 --> 00:04:53,840
I'll find that point.

79
00:04:53,840 --> 00:04:57,640
Then the most interesting
one was the L1 norm,

80
00:04:57,640 --> 00:05:02,644
because what was the shape
of the unit ball for L1?

81
00:05:02,644 --> 00:05:03,505
AUDIENCE: A diamond.

82
00:05:03,505 --> 00:05:04,880
GILBERT STRANG:
A diamond, right.

83
00:05:04,880 --> 00:05:05,580
A diamond.

84
00:05:05,580 --> 00:05:10,350
So the diamond that first
touches the line is here.

85
00:05:12,970 --> 00:05:15,370
That's the winning point.

86
00:05:15,370 --> 00:05:19,120
And if the line is
3x1 plus 4x2 equal 1,

87
00:05:19,120 --> 00:05:22,000
then I know that that point is--

88
00:05:22,000 --> 00:05:27,550
x1 will be 0 and x2 will be 1/4.

89
00:05:27,550 --> 00:05:29,335
So that's the
winning point in L1.

90
00:05:32,450 --> 00:05:37,440
And I think I calculated
this right that the winning

91
00:05:37,440 --> 00:05:41,430
point in L2 I think will be--

92
00:05:41,430 --> 00:05:45,240
let's see, this goes up to--

93
00:05:45,240 --> 00:05:47,190
I'm moving down the line.

94
00:05:47,190 --> 00:05:53,400
I think this would
have 3/25, 4/25.

95
00:05:53,400 --> 00:05:55,470
I won't stop to derive that.

96
00:05:55,470 --> 00:05:59,730
But at least-- yeah, the
slope is looking like 4/3.

97
00:05:59,730 --> 00:06:04,560
It goes up 4 when it crosses 3.

98
00:06:04,560 --> 00:06:07,110
And the 4 and the
3 came from there.

99
00:06:07,110 --> 00:06:12,030
And, of course, you notice that
I've scaled it to fit the line.

100
00:06:12,030 --> 00:06:20,730
3 times 3/25 is 9/25
plus 16/25, 25/25 is 1.

101
00:06:20,730 --> 00:06:23,280
And finally, what
about L infinity?

102
00:06:23,280 --> 00:06:24,560
What was the picture there?

103
00:06:24,560 --> 00:06:31,050
What's the unit ball look
like in 2D for the max norm?

104
00:06:31,050 --> 00:06:31,550
It's a--

105
00:06:31,550 --> 00:06:32,450
AUDIENCE: Square.

106
00:06:32,450 --> 00:06:34,190
GILBERT STRANG: Square, right.

107
00:06:34,190 --> 00:06:37,250
So the square will hit there.

108
00:06:37,250 --> 00:06:40,610
These two on the square are--

109
00:06:40,610 --> 00:06:45,350
this is a 45 degree
line now on that square.

110
00:06:45,350 --> 00:06:47,600
It hits it at that sharp point.

111
00:06:47,600 --> 00:06:50,510
And so the x1 and x2 are equal.

112
00:06:50,510 --> 00:06:53,510
And I think they're probably--

113
00:06:53,510 --> 00:06:59,390
let's see, if they're
equal, if we made them 1/7,

114
00:06:59,390 --> 00:07:02,900
then I'd have 3/7
plus 4/7 equals 7/7.

115
00:07:02,900 --> 00:07:06,260
Yeah, I think we
make them 1/7, 1/7.

116
00:07:09,220 --> 00:07:13,300
So that would be the
x infinity point.

117
00:07:13,300 --> 00:07:15,700
3/7 plus 4/7 give 1.
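
The three answers on the board can be checked with a short sketch. This is an illustration, not something shown in the lecture: it assumes Python with exact fractions, the names `a`, `x_l1`, `x_l2`, and `x_linf` are made up here, and the closed forms in the comments are the standard ones for a single constraint with nonzero coefficients.

```python
from fractions import Fraction

# Constraint a1*x1 + a2*x2 = 1 with the lecture's numbers a = (3, 4).
a = [Fraction(3), Fraction(4)]

# L2: the minimizer is a multiple of a, scaled to land on the line: x = a / (a . a).
aa = sum(ai * ai for ai in a)
x_l2 = [ai / aa for ai in a]

# L1: put everything on the coordinate with the largest |a_i|, zero elsewhere.
k = max(range(len(a)), key=lambda i: abs(a[i]))
x_l1 = [Fraction(1) / a[i] if i == k else Fraction(0) for i in range(len(a))]

# L-infinity: all components have equal size t (signs match a), t = 1 / sum|a_i|.
t = Fraction(1) / sum(abs(ai) for ai in a)
x_linf = [t if ai >= 0 else -t for ai in a]

for name, x in [("L1", x_l1), ("L2", x_l2), ("Linf", x_linf)]:
    # Each point must lie on the line 3*x1 + 4*x2 = 1.
    assert sum(ai * xi for ai, xi in zip(a, x)) == 1
    print(name, [str(xi) for xi in x])
```

Giving `a` three entries would cover the one-equation case in 3D under the same assumptions.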

118
00:07:15,700 --> 00:07:18,970
So why do I mention this?

119
00:07:18,970 --> 00:07:21,820
First, because when I did it
before I just drew pictures

120
00:07:21,820 --> 00:07:25,960
without really
solving the problem.

121
00:07:25,960 --> 00:07:30,181
And secondly, because in
thinking ahead about projects,

122
00:07:30,181 --> 00:07:33,880
this is the kind of
project that I would

123
00:07:33,880 --> 00:07:35,470
think is quite interesting.

124
00:07:35,470 --> 00:07:38,650
Obviously, this is p equal 1.

125
00:07:38,650 --> 00:07:40,780
This is p equal 2.

126
00:07:40,780 --> 00:07:42,400
This is p equal infinity.

127
00:07:42,400 --> 00:07:46,030
And I guess it's pretty
clear from the pictures

128
00:07:46,030 --> 00:07:50,750
that as p increases, this
point starts up here.

129
00:07:50,750 --> 00:07:54,670
And moves down the
line and ends up here.

130
00:07:54,670 --> 00:07:59,770
And I've no idea like what the
solution is for a different p

131
00:07:59,770 --> 00:08:01,180
and how it moves.

132
00:08:01,180 --> 00:08:09,940
And to make it more of a
project, what happens in 3D?

133
00:08:09,940 --> 00:08:11,260
What happens in 3D?

134
00:08:11,260 --> 00:08:17,020
In 3D, if I have one equation
in 3D, then I have a plane.

135
00:08:17,020 --> 00:08:21,610
And this would become a
diamond, a 3D diamond.

136
00:08:21,610 --> 00:08:23,560
This would be a 3D sphere.

137
00:08:23,560 --> 00:08:26,050
This would be a 3D cube.

138
00:08:26,050 --> 00:08:30,110
They would expand
to hit that plane.

139
00:08:30,110 --> 00:08:35,190
And I don't know how many
zeros you get in that case.

140
00:08:35,190 --> 00:08:38,299
So that would be the
case of one equation.

141
00:08:38,299 --> 00:08:45,630
So there would be a plane that
these diamond, sphere, and cube

142
00:08:45,630 --> 00:08:47,790
expand until they hit it.

143
00:08:47,790 --> 00:08:51,580
Or you could have two
constraints, two equations.

144
00:08:51,580 --> 00:08:54,480
So if we had two equations
and three unknowns,

145
00:08:54,480 --> 00:08:56,840
that would be a line again.

146
00:08:56,840 --> 00:09:01,250
But how many zeros would we
get in these different cases?

147
00:09:01,250 --> 00:09:06,100
How sparse is L1 going to be?

148
00:09:06,100 --> 00:09:08,480
That's like a recap
of what we did.

149
00:09:11,280 --> 00:09:14,420
It's nice occasionally
to have pictures

150
00:09:14,420 --> 00:09:20,390
showing where the solution is.

151
00:09:20,390 --> 00:09:28,280
Now, I'm coming to the topic of
the day, which is Gram-Schmidt.

152
00:09:28,280 --> 00:09:33,710
And so Gram-Schmidt, number
one, is the standard way

153
00:09:33,710 --> 00:09:41,580
that would be taught in 18.06.

154
00:09:41,580 --> 00:09:44,530
So what's Gram-Schmidt about?

155
00:09:44,530 --> 00:09:49,240
I'll just put down here
general facts of Gram-Schmidt.

156
00:09:49,240 --> 00:09:53,780
We start with a matrix
A. It's got n columns.

157
00:09:58,770 --> 00:10:00,900
But they're not orthogonal.

158
00:10:00,900 --> 00:10:04,560
And, in fact, they may
be badly conditioned.

159
00:10:04,560 --> 00:10:08,320
Columns might be nearly
dependent on the others.

160
00:10:08,320 --> 00:10:11,020
I'm going to assume the
columns are independent,

161
00:10:11,020 --> 00:10:13,540
but they might be
barely independent.

162
00:10:17,320 --> 00:10:20,410
So those lines then would
be sort of like pointing

163
00:10:20,410 --> 00:10:22,450
very nearly parallel.

164
00:10:22,450 --> 00:10:25,390
But Gram-Schmidt
opens up the picture

165
00:10:25,390 --> 00:10:33,850
to get a matrix Q, an orthogonal
matrix with columns Q1 to Qn,

166
00:10:33,850 --> 00:10:36,050
which are orthonormal.

167
00:10:36,050 --> 00:10:40,550
So it gets a
perfect basis of Qs.

168
00:10:40,550 --> 00:10:44,650
And so that's what
Gram-Schmidt does.

169
00:10:44,650 --> 00:10:47,740
And these are different ways
to do it, different ways

170
00:10:47,740 --> 00:10:51,910
to organize the computation.

171
00:10:51,910 --> 00:10:55,150
I really only put
the standard way in.

172
00:10:55,150 --> 00:10:57,150
What is this mysterious R?

173
00:11:01,470 --> 00:11:07,530
So the Qs are
combinations of the As.

174
00:11:07,530 --> 00:11:11,850
So there's some matrix to tell
me what those combinations are.

175
00:11:11,850 --> 00:11:16,980
Or if I go backwards
and say, well, the As

176
00:11:16,980 --> 00:11:20,670
are combinations of the Qs,
that's what I'm about to do.

177
00:11:20,670 --> 00:11:24,330
If I say each A is
a combination of Qs,

178
00:11:24,330 --> 00:11:28,410
that means that my
A matrix is my Q

179
00:11:28,410 --> 00:11:31,410
matrix times some
R matrix, which

180
00:11:31,410 --> 00:11:35,130
tells me the combinations.

181
00:11:35,130 --> 00:11:37,680
When I multiply
by R on the right,

182
00:11:37,680 --> 00:11:40,830
I'm taking combinations
of the columns of Q

183
00:11:40,830 --> 00:11:43,170
and getting the columns of A.

184
00:11:43,170 --> 00:11:48,300
So just like LU, you go
forward with the algorithm

185
00:11:48,300 --> 00:11:53,730
to reach U. Here, we go forward
with the algorithm to reach Q.

186
00:11:53,730 --> 00:12:00,000
But then when we want to put
it in one simple equation,

187
00:12:00,000 --> 00:12:03,300
it turns out to be
better to go backwards

188
00:12:03,300 --> 00:12:08,340
and say how is the original
A related to the final Q,

189
00:12:08,340 --> 00:12:12,020
there has to be some R.

190
00:12:12,020 --> 00:12:15,050
OK, I always feel when I
talk about Gram-Schmidt--

191
00:12:15,050 --> 00:12:18,050
I usually end with
that A equal QR.

192
00:12:18,050 --> 00:12:21,350
And, of course, the Matlab
command is exactly QR.

193
00:12:21,350 --> 00:12:30,830
So in Matlab, the command would
be QR of A, instead of LU of A.

194
00:12:30,830 --> 00:12:35,750
So it would give you Q and R.
That's what Matlab will output.

195
00:12:35,750 --> 00:12:41,510
Now, as I say, Q is the
thing we're constructing.

196
00:12:41,510 --> 00:12:48,960
R is the combinations that
we need to get what we want.

197
00:12:48,960 --> 00:12:53,720
And so it comes at the
end, what the heck was R.

198
00:12:53,720 --> 00:12:57,050
But actually, R is
really a simple idea.

199
00:12:57,050 --> 00:13:00,570
So I want to show that at the
beginning instead of the end.

200
00:13:00,570 --> 00:13:07,460
OK, so I'm going to move
Q over here, as Q inverse.

201
00:13:07,460 --> 00:13:10,289
But what is Q inverse?

202
00:13:10,289 --> 00:13:11,572
AUDIENCE: Q transpose.

203
00:13:11,572 --> 00:13:13,030
GILBERT STRANG: Q
transpose, right,

204
00:13:13,030 --> 00:13:15,590
because I've created
an orthogonal matrix.

205
00:13:15,590 --> 00:13:21,382
So this mysterious
R is Q transpose A.

206
00:13:21,382 --> 00:13:29,450
And let me just sort of make
it grow out into matrices.

207
00:13:29,450 --> 00:13:34,590
That has the Qs along the
rows, Q transpose, of course.

208
00:13:34,590 --> 00:13:39,260
Qn transpose, Q1 transpose.

209
00:13:39,260 --> 00:13:41,780
I'm transposing the
matrix above it.

210
00:13:41,780 --> 00:13:44,060
So these columns become rows.

211
00:13:44,060 --> 00:13:46,540
Times the As, A1 to An.

212
00:13:51,900 --> 00:13:56,790
So what is a typical entry in R?

213
00:13:56,790 --> 00:14:00,750
That's really why I want to say
nothing mysterious about this.

214
00:14:00,750 --> 00:14:02,930
You can see what
you end up with.

215
00:14:02,930 --> 00:14:04,500
It will be right in
front of us here.

216
00:14:07,290 --> 00:14:14,270
What is the entry in
row i, column j of R?

217
00:14:14,270 --> 00:14:21,370
OK, this says that
all those entries in R

218
00:14:21,370 --> 00:14:27,140
are Qi transpose times Aj.

219
00:14:27,140 --> 00:14:29,740
That's the old way
to multiply matrices.

220
00:14:29,740 --> 00:14:36,140
And it's the best way for
this, a row times a column.

221
00:14:36,140 --> 00:14:40,960
In other words, the Rs are
just the inner products,

222
00:14:40,960 --> 00:14:46,530
the dot products of the Qs with
the As, of the Qs with the As.

223
00:14:46,530 --> 00:14:50,380
That's sort of like
nothing mysterious about R.

224
00:14:50,380 --> 00:14:53,200
Because Q is an
orthogonal matrix,

225
00:14:53,200 --> 00:14:58,260
we were able to put
it over here, get

226
00:14:58,260 --> 00:15:02,640
a nice expression for R,
and see what it really is.

227
00:15:02,640 --> 00:15:09,270
So you can do R at the
end or en route.

228
00:15:09,270 --> 00:15:13,740
But that's just the inner
product of the Qs with the A's.
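
A small numerical sketch of R = Q transpose A. The 2 by 2 numbers are illustrative and not from the lecture, the orthonormal columns were worked out by hand for this example, and the helper `dot` is a name assumed here.

```python
import math

def dot(u, v):
    """Inner product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Columns a1, a2 of a small example matrix A (illustrative numbers).
a_cols = [[3.0, 4.0], [1.0, 2.0]]
# Orthonormal columns q1, q2 obtained by hand from a1 and a2.
q_cols = [[0.6, 0.8], [-0.8, 0.6]]

# R = Q^T A: the entry in row i, column j is q_i transpose times a_j.
R = [[dot(qi, aj) for aj in a_cols] for qi in q_cols]

# Going backwards, each column of A is a combination of the q's:
# a_j = sum over i of R[i][j] * q_i, which is the statement A = QR.
rebuilt = [[sum(R[i][j] * q_cols[i][k] for i in range(2)) for k in range(2)]
           for j in range(2)]

for row in R:
    print([round(r, 6) for r in row])
```

The entry R[1][0] comes out as zero up to round-off, because q2 is perpendicular to a1.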

229
00:15:13,740 --> 00:15:16,980
Now, what's Gram-Schmidt?

230
00:15:16,980 --> 00:15:20,310
I'm sort of thinking
you've seen the basic ideas

231
00:15:20,310 --> 00:15:23,340
of Gram-Schmidt,
but let's review.

232
00:15:23,340 --> 00:15:32,230
So I start with a.

233
00:15:32,230 --> 00:15:35,950
So what does
Gram-Schmidt begin with?

234
00:15:35,950 --> 00:15:37,000
a1.

235
00:15:37,000 --> 00:15:38,410
It takes that first column.

236
00:15:38,410 --> 00:15:47,170
So these a's are not
orthogonal generally.

237
00:15:47,170 --> 00:15:50,070
But the first direction is OK.

238
00:15:50,070 --> 00:15:52,290
I have no complaints
about the first direction,

239
00:15:52,290 --> 00:15:58,150
except that a1 might
not be a unit vector.

240
00:15:58,150 --> 00:16:05,980
So q1 will just be a1 over its
norm to have a unit vector.

241
00:16:05,980 --> 00:16:10,100
The whole idea of
Gram-Schmidt is in q2.

242
00:16:10,100 --> 00:16:12,490
So what is q2?

243
00:16:12,490 --> 00:16:15,040
The whole idea is coming here.

244
00:16:15,040 --> 00:16:16,930
It's the only thing
you need to know.

245
00:16:16,930 --> 00:16:18,760
And the picture shows it.

246
00:16:18,760 --> 00:16:20,560
So q2, I start with a2.

247
00:16:24,640 --> 00:16:28,240
But it's not orthogonal to a1.

248
00:16:28,240 --> 00:16:29,020
So what do I do?

249
00:16:31,800 --> 00:16:37,690
I figure out the component
of a2 in the a1 direction

250
00:16:37,690 --> 00:16:39,190
and I remove it.

251
00:16:39,190 --> 00:16:42,605
So I take that vector away
and I'm left with this vector.

252
00:16:42,605 --> 00:16:43,480
So there is a vector.

253
00:16:43,480 --> 00:16:44,440
I'll call that A2.

254
00:16:47,270 --> 00:16:57,010
So A2 is the original little a2
with the a1 direction removed.

255
00:16:59,910 --> 00:17:03,610
So what's the formula
for what I just did?

256
00:17:03,610 --> 00:17:08,230
This is the whole, the key
step that Gram-Schmidt repeats

257
00:17:08,230 --> 00:17:09,910
over and over and over again.

258
00:17:09,910 --> 00:17:12,670
It's truly boring.

259
00:17:12,670 --> 00:17:18,010
So it subtracts-- well,
remember that this

260
00:17:18,010 --> 00:17:20,530
is in the same direction as Q1.

261
00:17:20,530 --> 00:17:26,960
And it's better to work with Q1,
because we've found that guy.

262
00:17:26,960 --> 00:17:28,130
We've got it.

263
00:17:28,130 --> 00:17:30,710
And we know it's a unit vector.

264
00:17:30,710 --> 00:17:35,180
So here's my linear
algebra question.

265
00:17:35,180 --> 00:17:40,630
What's the component of a2
that I want to subtract off?

266
00:17:40,630 --> 00:17:43,880
It's the component in
the direction of q1.

267
00:17:43,880 --> 00:17:48,230
It's this in the
direction of q1.

268
00:17:48,230 --> 00:17:51,520
And let me just
remember, so obviously,

269
00:17:51,520 --> 00:17:54,560
that angle is coming into it.

270
00:17:54,560 --> 00:18:02,010
So that will be a2
transpose q1 times q1.

271
00:18:02,010 --> 00:18:04,460
That's it.

272
00:18:04,460 --> 00:18:06,310
That's the component
that we remove.

273
00:18:12,060 --> 00:18:18,220
And maybe I'd prefer to
write it as q1 transpose a2.

274
00:18:18,220 --> 00:18:19,560
I don't know.

275
00:18:19,560 --> 00:18:21,060
It doesn't matter of course.

276
00:18:21,060 --> 00:18:23,320
The two dot products
are the same.

277
00:18:23,320 --> 00:18:27,420
Maybe I will just--

278
00:18:27,420 --> 00:18:30,480
yeah-- well, maybe not.

279
00:18:30,480 --> 00:18:30,980
Fine.

280
00:18:35,790 --> 00:18:39,700
Now what is that vector
supposed to achieve?

281
00:18:39,700 --> 00:18:44,170
It's supposed to be this vector.

282
00:18:44,170 --> 00:18:45,850
This vector I'm
really going to call

283
00:18:45,850 --> 00:18:52,390
A2, because it's in the
right direction for Q2,

284
00:18:52,390 --> 00:18:55,060
but it is not yet?

285
00:18:55,060 --> 00:18:56,350
AUDIENCE: Normal.

286
00:18:56,350 --> 00:18:57,710
GILBERT STRANG: Normal.

287
00:18:57,710 --> 00:19:00,550
So what is Q2 then?

288
00:19:00,550 --> 00:19:03,860
So I'm saying this guy
got the direction right.

289
00:19:03,860 --> 00:19:06,280
By subtracting
off this vector.

290
00:19:06,280 --> 00:19:07,600
Got that direction right.

291
00:19:07,600 --> 00:19:09,190
Got it as A2.

292
00:19:09,190 --> 00:19:12,850
What is Q2 now that
I want to finish?

293
00:19:12,850 --> 00:19:13,860
I've got the direction.

294
00:19:13,860 --> 00:19:18,060
All I want to do is get
it to be a unit vector.

295
00:19:18,060 --> 00:19:21,720
So I just take A2 over its norm.

296
00:19:26,870 --> 00:19:31,640
That double step is the
whole thing in Gram-Schmidt,

297
00:19:31,640 --> 00:19:33,760
the whole thing.

298
00:19:33,760 --> 00:19:40,220
Subtract off the components
in the directions already set.

299
00:19:40,220 --> 00:19:43,340
Then you get something in a
totally new direction, called

300
00:19:43,340 --> 00:19:48,290
A, capital A. And then
you divide by its length

301
00:19:48,290 --> 00:19:49,720
to make it a unit vector.

302
00:19:49,720 --> 00:19:53,570
And that gives you the new Q.

303
00:19:53,570 --> 00:19:56,620
Just to show that
we've got the point, what

304
00:19:56,620 --> 00:20:01,860
about the next
step, aiming for Q3.

305
00:20:01,860 --> 00:20:04,890
So tell me what A3 should be?

306
00:20:04,890 --> 00:20:07,580
A3, I'm going to
start with this.

307
00:20:07,580 --> 00:20:11,920
And I'm going to
subtract off some stuff.

308
00:20:11,920 --> 00:20:14,460
What am I going to subtract off?

309
00:20:14,460 --> 00:20:15,930
AUDIENCE: The transpose.

310
00:20:15,930 --> 00:20:20,220
GILBERT STRANG: The component,
a3 transpose, right.

311
00:20:20,220 --> 00:20:24,180
Times q1 q1.

312
00:20:24,180 --> 00:20:30,190
And I didn't yet check that
this came out orthogonal to q1,

313
00:20:30,190 --> 00:20:32,260
but I'll come back to that.

314
00:20:32,260 --> 00:20:37,910
Now, have I done everything
I should do here with a3?

315
00:20:37,910 --> 00:20:40,350
No, I've got one
more step to take.

316
00:20:40,350 --> 00:20:43,360
And I should take the
two steps separately.

317
00:20:43,360 --> 00:20:45,360
It's called modified
Gram-Schmidt.

318
00:20:45,360 --> 00:20:51,500
And what I want to do is
subtract off the q2 component.

319
00:20:51,500 --> 00:20:55,340
So what multiple
of q2 do I need?

320
00:20:55,340 --> 00:20:59,240
Because q2 has been set
by the time I get to a3,

321
00:20:59,240 --> 00:21:03,120
so what goes here?

322
00:21:03,120 --> 00:21:04,960
AUDIENCE: a3 transpose

323
00:21:04,960 --> 00:21:07,230
GILBERT STRANG: a3 transpose--

324
00:21:07,230 --> 00:21:07,930
AUDIENCE: q2

325
00:21:07,930 --> 00:21:08,722
GILBERT STRANG: q2.

326
00:21:08,722 --> 00:21:09,290
Thanks.

327
00:21:09,290 --> 00:21:09,790
Thanks.

328
00:21:12,980 --> 00:21:16,370
If you look at a code,
say a Matlab code

329
00:21:16,370 --> 00:21:23,060
to do Gram-Schmidt-- oh,
what's the final q3 then?

330
00:21:23,060 --> 00:21:24,252
AUDIENCE: Normalize it.

331
00:21:24,252 --> 00:21:25,460
GILBERT STRANG: Normalize it.

332
00:21:25,460 --> 00:21:29,570
So you take A3, which is
in the right direction,

333
00:21:29,570 --> 00:21:33,050
and you divide by its
length to get a unit vector.

334
00:21:36,760 --> 00:21:41,170
Let me just come back and
check that I did it right,

335
00:21:41,170 --> 00:21:42,790
that I got the right direction.

336
00:21:42,790 --> 00:21:44,870
So what do I mean by
the right direction?

337
00:21:44,870 --> 00:21:49,360
What should I check
about that guy?

338
00:21:49,360 --> 00:21:54,515
I should check that its
inner product with q1 is?

339
00:21:59,030 --> 00:22:02,420
If this is the right
direction to go,

340
00:22:02,420 --> 00:22:05,930
that way, then I should check--

341
00:22:05,930 --> 00:22:07,550
I have to check--

342
00:22:07,550 --> 00:22:10,250
hopefully, I've got
the formula right--

343
00:22:10,250 --> 00:22:14,840
that its dot product,
inner product with q1 is--

344
00:22:14,840 --> 00:22:15,530
AUDIENCE: 0.

345
00:22:15,530 --> 00:22:16,280
GILBERT STRANG: 0.

346
00:22:16,280 --> 00:22:18,110
Thank you.

347
00:22:18,110 --> 00:22:20,090
So is it obvious that it is?

348
00:22:20,090 --> 00:22:24,453
Take the inner product or the
dot product of that with q1,

349
00:22:24,453 --> 00:22:25,120
what do you get?

350
00:22:28,020 --> 00:22:32,370
You get q1 transpose a2, the same
number as a2 transpose q1.

351
00:22:32,370 --> 00:22:33,780
You get that number.

352
00:22:33,780 --> 00:22:39,690
And over here, you're
getting q1 transpose with q1.

353
00:22:39,690 --> 00:22:42,365
So do you see what I'm doing?

354
00:22:42,365 --> 00:22:45,750
Probably it looks like it
would have been better.

355
00:22:45,750 --> 00:22:51,940
I'm checking that q1
transpose A2 is 0.

356
00:22:54,660 --> 00:22:57,650
Yeah, it is.

357
00:22:57,650 --> 00:23:02,930
q1 transpose a2 here, q1
transpose a2, or a2 transpose

358
00:23:02,930 --> 00:23:04,690
q1, I don't mind.

359
00:23:04,690 --> 00:23:08,840
And I have another
q1 transpose q1 here.

360
00:23:08,840 --> 00:23:10,550
And what is that?

361
00:23:10,550 --> 00:23:12,680
What is q1 transpose q1?

362
00:23:12,680 --> 00:23:13,450
It is?

363
00:23:13,450 --> 00:23:13,950
AUDIENCE: 1.

364
00:23:13,950 --> 00:23:15,200
GILBERT STRANG: 1.

365
00:23:15,200 --> 00:23:16,010
So check.

366
00:23:22,130 --> 00:23:25,460
OK, that's Gram-Schmidt,
standard Gram-Schmidt,

367
00:23:25,460 --> 00:23:29,885
which you have met before.
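
The double step (subtract the components along the q's already set, then divide by the length) can be sketched as a short loop. This is a minimal sketch, not code from the lecture: the function name `gram_schmidt`, the helper `dot`, and the test columns are assumptions, and it removes one q component at a time from the running vector, which is the "separate steps" ordering mentioned above.

```python
import math

def dot(u, v):
    """Inner product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(a_cols):
    """Orthonormalize independent columns, taken in their original order."""
    q_cols = []
    for a in a_cols:
        big_a = list(a)                  # capital A_j starts as the column a_j
        for q in q_cols:
            c = dot(q, big_a)            # component along an earlier q
            big_a = [x - c * qx for x, qx in zip(big_a, q)]
        length = math.sqrt(dot(big_a, big_a))
        q_cols.append([x / length for x in big_a])   # q_j = A_j / ||A_j||
    return q_cols

# Three independent columns (illustrative numbers).
q_cols = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for q in q_cols:
    print([round(x, 4) for x in q])
```

There is no check on the size of `length` here, so nearly dependent columns would be divided by a tiny number.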

368
00:23:32,760 --> 00:23:37,350
Now, I'm ready for a
better Gram-Schmidt.

369
00:23:37,350 --> 00:23:42,250
You could say a better
Gram-Schmidt, because here--

370
00:23:42,250 --> 00:23:45,480
so what's going
to be the difference?

371
00:23:45,480 --> 00:23:50,010
Here, I took the a's in
their original order.

372
00:23:52,830 --> 00:23:57,390
Now, suppose I did
that with elimination.

373
00:23:57,390 --> 00:24:00,330
Elimination, usually we
write as acting on the row.

374
00:24:00,330 --> 00:24:04,820
So thinking about elimination,
I'm thinking about the rows.

375
00:24:04,820 --> 00:24:10,340
What would be the danger in
taking the rows in order, doing

376
00:24:10,340 --> 00:24:17,140
no row exchanges, just figure
out the pivot each time,

377
00:24:17,140 --> 00:24:20,420
and kill the rest of the
column and then move on?

378
00:24:20,420 --> 00:24:23,350
So taking the rows in
the order they came,

379
00:24:23,350 --> 00:24:26,140
whatever it might
be, what's the risk?

380
00:24:26,140 --> 00:24:30,770
And why would
Matlab not do that?

381
00:24:30,770 --> 00:24:34,550
Because something can be
very small and totally

382
00:24:34,550 --> 00:24:37,100
blow up your calculations.

383
00:24:37,100 --> 00:24:39,770
And that's that pivot number.

384
00:24:39,770 --> 00:24:45,620
Sort of the question is,
if a2 is very near a1--

385
00:24:48,490 --> 00:24:50,510
so let me draw a new picture.

386
00:24:50,510 --> 00:24:52,150
So here's the risk.

387
00:24:52,150 --> 00:24:55,110
So a1 was whatever it was.

388
00:24:55,110 --> 00:25:00,510
If a2 is really close
to the same direction,

389
00:25:00,510 --> 00:25:03,500
then I'm subtracting
off almost all of it,

390
00:25:03,500 --> 00:25:08,780
and I've got some tiny little
bit for the new direction.

391
00:25:08,780 --> 00:25:12,020
That's like the
pivot in elimination.

392
00:25:12,020 --> 00:25:16,670
It's the number that
sort of measures what's

393
00:25:16,670 --> 00:25:20,030
new, what the new
row in elimination

394
00:25:20,030 --> 00:25:26,960
or the new column in
Gram-Schmidt gives you.

395
00:25:26,960 --> 00:25:28,850
And if that's too small--

396
00:25:28,850 --> 00:25:31,070
I mean, like in
elimination, as I say,

397
00:25:31,070 --> 00:25:35,210
we would never use
an elimination code

398
00:25:35,210 --> 00:25:39,170
on a general matrix that didn't
check the size of the pivot

399
00:25:39,170 --> 00:25:42,380
and exchange rows
when necessary.

400
00:25:42,380 --> 00:25:45,810
Well, similarly,
with Gram-Schmidt, it

401
00:25:45,810 --> 00:25:48,780
can take the columns in order.

402
00:25:48,780 --> 00:25:52,770
That's the standard Gram-Schmidt
taking the columns in order.

403
00:25:52,770 --> 00:25:57,960
But only if it checks each
time that the little bit--

404
00:26:02,340 --> 00:26:06,600
well, that the new part,
what would be the new part,

405
00:26:06,600 --> 00:26:15,910
is big enough to be able to--
we have to divide by the thing.

406
00:26:15,910 --> 00:26:19,180
And if that A2 is
tiny, is a tiny vector,

407
00:26:19,180 --> 00:26:23,380
then dividing
by A2 and going onwards

408
00:26:23,380 --> 00:26:29,330
is building in round-off error
that we can't remove again.

409
00:26:29,330 --> 00:26:31,150
We're stuck with it.

410
00:26:31,150 --> 00:26:40,320
So that's the column exchange,
column pivoting idea in sort

411
00:26:40,320 --> 00:26:45,870
of a more professional
Gram-Schmidt.

412
00:26:45,870 --> 00:26:50,340
And to do it--

413
00:26:50,340 --> 00:26:56,100
so I have to be able to
compare this little bit,

414
00:26:56,100 --> 00:26:58,650
if it is little, with what?

415
00:26:58,650 --> 00:27:02,210
What am I going to compare with?

416
00:27:02,210 --> 00:27:04,720
In elimination, I looked
down the rest of the column

417
00:27:04,720 --> 00:27:05,560
for a bigger number.

418
00:27:08,980 --> 00:27:15,400
I guess what I have to do is
I have to find this component,

419
00:27:15,400 --> 00:27:20,860
not just from a2, but from
all the remaining a's.

420
00:27:20,860 --> 00:27:23,330
And I'll pick the biggest.

421
00:27:23,330 --> 00:27:26,630
So there, in that
sentence, I said

422
00:27:26,630 --> 00:27:29,780
the main idea of
column exchanges

423
00:27:29,780 --> 00:27:36,850
is once you get q1 set, which
certainly q1 was the easiest

424
00:27:36,850 --> 00:27:41,920
one in the world,
but maybe even for q1

425
00:27:41,920 --> 00:27:44,980
I guess it could even happen.

426
00:27:44,980 --> 00:27:47,410
That would be like
starting with a zero

427
00:27:47,410 --> 00:27:51,040
up in the upper left corner
for elimination, like what

428
00:27:51,040 --> 00:27:54,290
a way to start the day.

429
00:27:54,290 --> 00:28:01,180
Here, if my matrix A
had a tiny little a1,

430
00:28:01,180 --> 00:28:04,000
then I should look
for a bigger one

431
00:28:04,000 --> 00:28:07,780
to get the very first q chosen.

432
00:28:07,780 --> 00:28:12,640
Let's suppose-- give ourselves
a reasonable chance here--

433
00:28:12,640 --> 00:28:19,840
let's suppose a1 was a decent
size, as I've drawn.

434
00:28:24,450 --> 00:28:28,380
The next step might not
be, if I use a2, as I say,

435
00:28:28,380 --> 00:28:31,810
it could be same
direction as a1 virtually,

436
00:28:31,810 --> 00:28:35,280
and then I'm just working
with that little piece.

437
00:28:35,280 --> 00:28:39,460
So what do I have
to do differently?

438
00:28:39,460 --> 00:28:44,220
I have to be able to compare
this little piece with all

439
00:28:44,220 --> 00:28:49,020
the other potential
possibilities.

440
00:28:49,020 --> 00:28:51,930
So let me just write
down what you have

441
00:28:51,930 --> 00:28:55,180
to do in a different order.

442
00:28:55,180 --> 00:28:59,730
So this is now with column
pivoting, column exchange,

443
00:28:59,730 --> 00:29:06,880
column pivoting allowed,
or it's possible.

444
00:29:06,880 --> 00:29:17,500
So to make it possible, I
have to find not only A2,

445
00:29:17,500 --> 00:29:25,480
the piece of little a2, I
have to find a2, the piece.

446
00:29:25,480 --> 00:29:27,160
I'm just going to copy that.

447
00:29:27,160 --> 00:29:34,840
I have to take my second column,
subtract off the q1 part.

448
00:29:34,840 --> 00:29:37,330
And that could be small.

449
00:29:37,330 --> 00:29:39,790
So I have to compare it with--

450
00:29:39,790 --> 00:29:43,000
oh, I haven't
written this page up.

451
00:29:43,000 --> 00:29:47,080
So I haven't got a
notation in mind yet.

452
00:29:47,080 --> 00:29:48,520
I won't give it a name.

453
00:29:48,520 --> 00:29:58,050
I have to also compute at
this step before deciding q2--

454
00:29:58,050 --> 00:30:07,290
now I'm describing how to
decide q2, the second vector.

455
00:30:09,930 --> 00:30:13,530
And I'm saying that
the way to decide q2

456
00:30:13,530 --> 00:30:20,370
is not only to take a piece of
a2 but also the piece of a3.

457
00:30:20,370 --> 00:30:22,050
Look at this piece.

458
00:30:24,870 --> 00:30:26,245
And look at all
the other pieces.

459
00:30:37,910 --> 00:30:39,420
And now, what will be my policy?

460
00:30:43,290 --> 00:30:46,980
Standard Gram-Schmidt
accepted this one,

461
00:30:46,980 --> 00:30:49,060
and didn't look at these.

462
00:30:49,060 --> 00:30:51,850
But, now, I'm going
to look at them all.

463
00:30:51,850 --> 00:30:54,500
And I'm going to
take the largest.

464
00:30:54,500 --> 00:30:56,590
I'm going to take the largest.

465
00:30:56,590 --> 00:31:00,130
And that will be
the A2 that I want.

466
00:31:00,130 --> 00:31:03,360
So it might not be this one.

467
00:31:03,360 --> 00:31:08,310
If this guy is largest, then
I'm taking column 3 first.

468
00:31:08,310 --> 00:31:11,220
And that will be my A2.

469
00:31:11,220 --> 00:31:12,720
And then I'll say fine.

470
00:31:12,720 --> 00:31:17,170
And then q2 will be
that A2 over its norm.

471
00:31:19,860 --> 00:31:21,680
You see the difference?

472
00:31:21,680 --> 00:31:23,832
It's not exciting.

473
00:31:23,832 --> 00:31:25,290
And you might think,
wait a minute,

474
00:31:25,290 --> 00:31:27,810
this is a heck of
a lot more work.

475
00:31:27,810 --> 00:31:30,850
But it isn't.

476
00:31:30,850 --> 00:31:33,430
It isn't actually more
work, because these

477
00:31:33,430 --> 00:31:34,870
are all the things--

478
00:31:34,870 --> 00:31:37,930
these ones that look like
we're paying a price,

479
00:31:37,930 --> 00:31:41,260
we're computing all
these alternatives--

480
00:31:41,260 --> 00:31:44,530
but we had to do that
eventually anyway.

481
00:31:44,530 --> 00:31:45,670
Do you see that?

482
00:31:45,670 --> 00:31:46,750
Let me just say it again.

483
00:31:49,610 --> 00:31:59,310
The standard way took all the
components, like for here.

484
00:31:59,310 --> 00:32:03,760
The standard way waited
until you got to column 3

485
00:32:03,760 --> 00:32:07,180
and then subtracted
off both pieces,

486
00:32:07,180 --> 00:32:11,590
waited until you got to column
4, subtracted off three pieces.

487
00:32:11,590 --> 00:32:14,880
This way, you're subtracting
off the first piece

488
00:32:14,880 --> 00:32:17,140
as soon as you know
what it should be.

489
00:32:17,140 --> 00:32:21,700
As soon as you know
q1, you remove it

490
00:32:21,700 --> 00:32:24,220
from all the remaining vectors.

491
00:32:24,220 --> 00:32:26,050
And you look to see
what's the biggest.

492
00:32:26,050 --> 00:32:27,950
You pick the biggest one.

493
00:32:27,950 --> 00:32:30,220
I said maybe it's this guy.

494
00:32:30,220 --> 00:32:31,590
So you move that one.

495
00:32:31,590 --> 00:32:34,420
Or some permutation
matrix is going to move it

496
00:32:34,420 --> 00:32:36,220
to the second column.

497
00:32:36,220 --> 00:32:38,615
See, it started in
the third column,

498
00:32:38,615 --> 00:32:40,240
but I'm going to move
it to the second,

499
00:32:40,240 --> 00:32:42,220
because it's the biggest.

500
00:32:42,220 --> 00:32:45,520
Then I do the right thing.

501
00:32:45,520 --> 00:32:47,560
I find q2.

502
00:32:47,560 --> 00:32:59,230
And now I go on toward q3.

503
00:32:59,230 --> 00:33:00,460
And how will I find q3?

504
00:33:04,170 --> 00:33:07,200
I'd want to pick the
biggest column to work with.

505
00:33:07,200 --> 00:33:11,230
So I subtract off
the q2 components.

506
00:33:11,230 --> 00:33:13,770
This is like easy to
say, but I had never

507
00:33:13,770 --> 00:33:15,940
like figured it out before.

508
00:33:15,940 --> 00:33:17,440
So let me just say it again.

509
00:33:17,440 --> 00:33:21,260
And then I'll leave
that with you.

510
00:33:21,260 --> 00:33:22,350
How do I get q3?

511
00:33:25,370 --> 00:33:28,220
I've fixed two columns.

512
00:33:28,220 --> 00:33:33,590
They happened to be,
maybe not necessarily

513
00:33:33,590 --> 00:33:36,530
the first two, but two
columns, two q's are set,

514
00:33:36,530 --> 00:33:39,170
and I'm looking
for the next one.

515
00:33:39,170 --> 00:33:43,190
I go on and I look at all the
remaining columns, all of which

516
00:33:43,190 --> 00:33:49,230
have had subtracted off
their q1 and q2 parts.

517
00:33:49,230 --> 00:33:52,950
So I've orthogonalized
with respect to q1 and q2.

518
00:33:52,950 --> 00:33:55,080
I look at all the
remaining things

519
00:33:55,080 --> 00:34:00,140
that I have to work with
and pick the biggest.

520
00:34:00,140 --> 00:34:03,450
Just like picking the biggest
number to go into the pivot.

521
00:34:03,450 --> 00:34:08,330
OK, I don't think I
can say it anymore

522
00:34:08,330 --> 00:34:12,469
without just repeating myself.

523
00:34:12,469 --> 00:34:16,650
And I bring it here to
class, because I had not

524
00:34:16,650 --> 00:34:21,440
sort of appreciated the
point that no extra work was

525
00:34:21,440 --> 00:34:22,489
involved.

526
00:34:22,489 --> 00:34:29,810
You just did these subtractions
for all the remaining columns

527
00:34:29,810 --> 00:34:35,060
of A before you started
on the next job.

528
00:34:35,060 --> 00:34:36,860
Is that OK?

529
00:34:36,860 --> 00:34:40,480
Eventually the notes
will describe that.

530
00:34:40,480 --> 00:34:41,770
Maybe they even do.

531
00:34:41,770 --> 00:34:43,190
Yeah, I think they even do.

532
00:34:43,190 --> 00:34:45,920
I wrote it but I
didn't understand it.

533
00:34:45,920 --> 00:34:49,130
Now, little improvement.

534
00:34:49,130 --> 00:34:50,961
So yes?

535
00:34:50,961 --> 00:34:52,650
AUDIENCE: So are we
permuting every time

536
00:34:52,650 --> 00:34:53,730
to get the biggest pivot?

537
00:34:53,730 --> 00:34:54,409
GILBERT STRANG: Yeah.

538
00:34:54,409 --> 00:34:54,989
Yeah.

539
00:34:54,989 --> 00:34:56,489
Only we don't call them pivots.

540
00:34:56,489 --> 00:34:57,600
Or maybe we should.

541
00:34:57,600 --> 00:35:00,600
I don't know what word is
used to get the biggest

542
00:35:00,600 --> 00:35:02,950
column remaining or something.

543
00:35:02,950 --> 00:35:05,370
Yeah, yeah, each time.

544
00:35:08,430 --> 00:35:11,070
You know, if the columns
were in a stupid order,

545
00:35:11,070 --> 00:35:13,320
this puts them in
the right order.

546
00:35:13,320 --> 00:35:19,345
OK, finally, come these
weird names, Krylov, Russian.

547
00:35:19,345 --> 00:35:22,380
Arnoldi, actually, I
don't know what he is.

548
00:35:22,380 --> 00:35:23,940
And I shouldn't
admit that on tape.

549
00:35:27,560 --> 00:35:31,920
So what's the idea there?

550
00:35:31,920 --> 00:35:34,850
So again, we're
solving Ax equal b.

551
00:35:34,850 --> 00:35:37,680
So this is going to be Krylov.

552
00:35:37,680 --> 00:35:40,400
What was his idea?

553
00:35:40,400 --> 00:35:42,300
Well, I want to
solve Ax equal b.

554
00:35:44,920 --> 00:35:49,980
A is a big matrix, pretty big.

555
00:35:49,980 --> 00:35:51,810
Of course, I don't
plan to invert it.

556
00:35:51,810 --> 00:35:53,130
That would be insane.

557
00:35:55,660 --> 00:36:00,640
What I can do with the matrix
A, especially if it's sparse--

558
00:36:00,640 --> 00:36:07,330
so a large sparse A would be
a good candidate for Krylov.

559
00:36:12,770 --> 00:36:17,920
So what is it that you could
do cheap and fast with a large,

560
00:36:17,920 --> 00:36:23,144
I mean really large, but
really sparse matrix A?

561
00:36:23,144 --> 00:36:24,500
AUDIENCE: Matrix times a vector.

562
00:36:24,500 --> 00:36:27,600
GILBERT STRANG: You can do
a matrix times a vector.

563
00:36:27,600 --> 00:36:28,710
And here's our matrix.

564
00:36:28,710 --> 00:36:31,740
And there is our vector.

565
00:36:31,740 --> 00:36:35,640
So we could start
with a vector b.

566
00:36:35,640 --> 00:36:38,850
We can multiply A times b.

567
00:36:38,850 --> 00:36:42,930
We can multiply A
times A times b.

568
00:36:42,930 --> 00:36:44,430
And, of course, I
write it that way.

569
00:36:44,430 --> 00:36:50,320
I never-- I mean, like if you
multiply A times A first, then

570
00:36:50,320 --> 00:36:59,590
like you turn in
your Matlab account,

571
00:36:59,590 --> 00:37:02,510
because you just have
to do it that way.

572
00:37:02,510 --> 00:37:05,840
And then you keep going, which
of course is A squared b,

573
00:37:05,840 --> 00:37:07,730
but you didn't form A squared.

574
00:37:07,730 --> 00:37:09,380
And then on up to--

575
00:37:09,380 --> 00:37:16,010
in the end, you get to some
A to the, say, k minus 1, times b.

576
00:37:16,010 --> 00:37:20,210
But, of course, that's
computed as A times

577
00:37:20,210 --> 00:37:23,360
the previous one, which was
A times the previous one.

578
00:37:23,360 --> 00:37:29,050
So there is a bunch
of vectors, which

579
00:37:29,050 --> 00:37:30,400
are likely to be independent.

580
00:37:33,050 --> 00:37:35,330
So they span a space.

581
00:37:35,330 --> 00:37:38,520
And it's called
the Krylov space.

582
00:37:38,520 --> 00:37:39,730
So these span.

583
00:37:39,730 --> 00:37:42,230
They're combinations.

584
00:37:42,230 --> 00:37:48,650
Combinations give--
oh, I don't like

585
00:37:48,650 --> 00:37:50,930
that letter k, because
that's also Krylov,

586
00:37:50,930 --> 00:37:54,300
so what shall I say? j.

587
00:37:54,300 --> 00:37:56,300
So I have j vectors.

588
00:37:56,300 --> 00:38:00,780
The original b, Ab, A
squared b, up to that.
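A minimal sketch of building those j vectors, assuming NumPy and a hypothetical function name `krylov_vectors`. Each new vector is A times the previous vector, so only matrix-vector products are used and A squared is never formed.

```python
import numpy as np

def krylov_vectors(A, b, j):
    """Return the n-by-j matrix with columns b, Ab, A^2 b, ..., A^(j-1) b.

    A can be any object that supports `A @ vector` (a dense array here;
    a sparse matrix type would work the same way).
    """
    vectors = [np.asarray(b, dtype=float)]
    for _ in range(j - 1):
        vectors.append(A @ vectors[-1])   # A times the previous vector
    return np.column_stack(vectors)

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
b = np.array([1.0, 1.0])
K = krylov_vectors(A, b, 3)   # columns: b, Ab, A^2 b
```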

589
00:38:00,780 --> 00:38:12,890
So combinations give the
Krylov space, say, we'll

590
00:38:12,890 --> 00:38:17,450
name it after Krylov and
we need a subscript j

591
00:38:17,450 --> 00:38:22,750
to show how big it
is, its dimension.

592
00:38:22,750 --> 00:38:27,640
So that will be the idea.

593
00:38:27,640 --> 00:38:30,610
Well, let me complete the idea.

594
00:38:30,610 --> 00:38:32,760
The idea will be--

595
00:38:32,760 --> 00:38:33,980
there are combinations.

596
00:38:33,980 --> 00:38:41,890
So that's a space, a subspace,
pretty big if j is big.

597
00:38:41,890 --> 00:38:48,046
And I'm going to look for the
best solution in that space.

598
00:38:48,046 --> 00:38:53,780
So I'm not going to solve
Ax equals b exactly.

599
00:38:53,780 --> 00:38:57,140
I'm going to find the
best solution, the closest

600
00:38:57,140 --> 00:39:01,790
solution, the least squares
solution in this Krylov space.

601
00:39:01,790 --> 00:39:04,190
I'm going to let
j be pretty big.

602
00:39:04,190 --> 00:39:08,550
So this space has got
plenty of vectors in it.

603
00:39:08,550 --> 00:39:10,240
I have a basis for this space.

604
00:39:14,230 --> 00:39:19,910
And some combination of these
basis vectors will be my xj.

605
00:39:19,910 --> 00:39:34,300
So, again, xj will be the best
vector, or the closest vector

606
00:39:34,300 --> 00:39:39,830
in this Krylov space, j.
It will be the best vector

607
00:39:39,830 --> 00:39:43,070
in that space, the closest one.

608
00:39:43,070 --> 00:39:45,710
So I know what the space is.

609
00:39:45,710 --> 00:39:48,830
I've reduced the
dimension down to j.

610
00:39:48,830 --> 00:39:51,650
And I can find this best vector.
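One way to read "best vector" is as the x in the Krylov space that makes the residual b minus Ax as small as possible in the least squares sense. That reading, the function name, and the use of the raw basis b, Ab, ... are assumptions made for this sketch. It deliberately uses the unorthogonalized basis, so it is only sensible for small, well-behaved examples.

```python
import numpy as np

def best_in_krylov_space(A, b, j):
    """Least squares x_j over span{b, Ab, ..., A^(j-1) b} (illustrative sketch).

    Assumes "best" means minimizing ||b - A x|| over the Krylov space.
    """
    basis = [np.asarray(b, dtype=float)]
    for _ in range(j - 1):
        basis.append(A @ basis[-1])       # next Krylov vector, no A @ A formed
    K = np.column_stack(basis)            # n-by-j, columns span the Krylov space
    # Write x = K y and choose y to minimize ||b - (A K) y||.
    y, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
    return K @ y

A = np.diag([1.0, 2.0, 3.0])
b = np.ones(3)
for j in (1, 2, 3):
    x_j = best_in_krylov_space(A, b, j)
    print(j, np.linalg.norm(b - A @ x_j))   # residual shrinks as j grows
```

In this 3 by 3 example the space with j = 3 is all of R^3, so x_3 solves Ax = b up to roundoff.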

611
00:39:54,570 --> 00:39:55,960
There's just one catch.

612
00:39:55,960 --> 00:40:00,960
And it's the same
catch that Gram and Schmidt

613
00:40:00,960 --> 00:40:07,900
were aiming to help to remove.

614
00:40:07,900 --> 00:40:10,570
That is this basis that I'm--

615
00:40:10,570 --> 00:40:15,130
right now, I'm working with
all combinations of these guys.

616
00:40:15,130 --> 00:40:20,510
And those could be
very, very dependent.

617
00:40:20,510 --> 00:40:22,500
That might be a terrible basis.

618
00:40:22,500 --> 00:40:26,090
Anytime you want to
do big computations,

619
00:40:26,090 --> 00:40:28,020
what kind of a
basis do you want?

620
00:40:30,942 --> 00:40:31,920
Yes?

621
00:40:31,920 --> 00:40:37,300
So what sort of a basis
is good to project onto

622
00:40:37,300 --> 00:40:41,720
to find the best solution
within that subspace?

623
00:40:41,720 --> 00:40:45,920
So we're sort of
finding a projection.

624
00:40:45,920 --> 00:40:50,040
And you've got vectors
that span the space.

625
00:40:50,040 --> 00:40:55,300
So you know what
you're projecting onto.

626
00:40:55,300 --> 00:41:01,210
But those vectors, they
might be nearly dependent.

627
00:41:01,210 --> 00:41:05,170
They might all be pointing
almost the same direction.

628
00:41:05,170 --> 00:41:08,710
In which case, your
calculations are terrible.

629
00:41:08,710 --> 00:41:10,950
So what do you do?

630
00:41:10,950 --> 00:41:15,910
Orth-- Orthogonalize.

631
00:41:15,910 --> 00:41:18,250
And that's where
Arnoldi comes in.

632
00:41:18,250 --> 00:41:21,280
And there's also a
Hungarian guy named Lanczos.

633
00:41:26,210 --> 00:41:30,160
So that's what they
contribute is how

634
00:41:30,160 --> 00:41:33,750
to orthogonalize that basis.

635
00:41:33,750 --> 00:41:38,900
And then, once you've done that,
you have an orthogonal basis.

636
00:41:38,900 --> 00:41:40,750
And, of course, an
orthogonal basis

637
00:41:40,750 --> 00:41:44,860
is perfect to do a projection.

638
00:41:47,570 --> 00:41:49,550
Everybody has to know that.

639
00:41:49,550 --> 00:41:51,890
Why is an orthogonal
basis so great?

640
00:41:51,890 --> 00:41:53,510
Orthonormal, even.

641
00:41:53,510 --> 00:41:55,040
Let's just remember.

642
00:41:55,040 --> 00:41:56,740
Suppose I have a vector x.

643
00:42:00,146 --> 00:42:01,280
It's unknown here.

644
00:42:01,280 --> 00:42:03,590
But suppose I have it.

645
00:42:03,590 --> 00:42:06,470
And I want to write
it as a combination

646
00:42:06,470 --> 00:42:08,750
of these orthonormal guys.

647
00:42:14,110 --> 00:42:14,750
say, n.

648
00:42:18,310 --> 00:42:21,940
What is it about all
orthonormal q's that

649
00:42:21,940 --> 00:42:25,420
makes this easy to
do, which it would not

650
00:42:25,420 --> 00:42:28,390
be with an arbitrary basis?

651
00:42:28,390 --> 00:42:33,010
So this is really
Q times c, right?

652
00:42:33,010 --> 00:42:36,590
Q times this vector of c's.

653
00:42:36,590 --> 00:42:41,270
The q's are in the columns of
Q. With the c's, we're multiplying

654
00:42:41,270 --> 00:42:44,040
a matrix by a vector.

655
00:42:44,040 --> 00:42:45,750
It's a combination
of the columns.

656
00:42:45,750 --> 00:42:47,540
That's what we get.

657
00:42:47,540 --> 00:42:54,200
And when the q's are
orthogonal, what's the answer?

658
00:42:54,200 --> 00:42:57,990
We can get the
answer straight away.

659
00:42:57,990 --> 00:43:03,870
So here, we're trying to find
the coefficients with respect

660
00:43:03,870 --> 00:43:10,470
to the basis vectors
q of a given vector x.

661
00:43:10,470 --> 00:43:13,770
And what's the answer
to that question?

662
00:43:13,770 --> 00:43:18,900
The point is, usually, to
find the coefficients, c would

663
00:43:18,900 --> 00:43:21,300
have to be Q inverse x.

664
00:43:21,300 --> 00:43:24,510
We'd have to solve that
system of equations.

665
00:43:24,510 --> 00:43:27,240
We do have to solve that
system of equations.

666
00:43:27,240 --> 00:43:32,735
But where's the payoff
from an orthonormal basis?

667
00:43:32,735 --> 00:43:33,610
AUDIENCE: Q inverse--

668
00:43:33,610 --> 00:43:36,910
GILBERT STRANG: Q
inverse is Q transpose.

669
00:43:36,910 --> 00:43:38,590
That's the payoff.

670
00:43:38,590 --> 00:43:45,410
So it's just telling
me that to find c1--

671
00:43:45,410 --> 00:43:47,280
how do I find c1?

672
00:43:47,280 --> 00:43:52,010
This says take the first
q1 transpose with x.

673
00:43:52,010 --> 00:43:53,720
I'll say the same thing here.

674
00:43:53,720 --> 00:43:58,090
Take the first vector with x.

675
00:43:58,090 --> 00:44:01,780
That will be about
c1 q1 transpose q1.

676
00:44:01,780 --> 00:44:05,990
I'm just taking the dot product
of everything there with q1.

677
00:44:05,990 --> 00:44:11,510
And then a c2 q1
transpose q2, and so on.

678
00:44:11,510 --> 00:44:12,836
But what's good?

679
00:44:12,836 --> 00:44:14,460
AUDIENCE: You have zeros.

680
00:44:14,460 --> 00:44:15,710
GILBERT STRANG: Tell me again.

681
00:44:15,710 --> 00:44:17,240
AUDIENCE: The other
ones are zeros.

682
00:44:17,240 --> 00:44:18,698
GILBERT STRANG:
These are all zero.

683
00:44:22,650 --> 00:44:25,530
And the q1 transpose q1 is?

684
00:44:25,530 --> 00:44:26,140
AUDIENCE: 1.

685
00:44:26,140 --> 00:44:26,890
GILBERT STRANG: 1.

686
00:44:26,890 --> 00:44:28,170
So it's perfect.

687
00:44:28,170 --> 00:44:30,450
c1 is q1 transpose x.

688
00:44:30,450 --> 00:44:35,160
And that's exactly
what that tells us.

689
00:44:35,160 --> 00:44:37,470
The first component
is the first row

690
00:44:37,470 --> 00:44:46,760
of Q transpose, which
is q1 transpose with x.
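Written out, the argument is short. The sketch below assumes the expansion has n terms (the spoken "say, n") and that the q's are orthonormal, so that Q transpose times Q is the identity.

```latex
\begin{aligned}
x &= c_1 q_1 + c_2 q_2 + \cdots + c_n q_n = Qc, \\
q_1^{T} x &= c_1\, q_1^{T} q_1 + c_2\, q_1^{T} q_2 + \cdots + c_n\, q_1^{T} q_n
           = c_1 \cdot 1 + 0 + \cdots + 0 = c_1, \\
c &= Q^{T} x \quad \text{because } Q^{T} Q = I .
\end{aligned}
```

The same dot product with q_i gives c_i, so no system of equations has to be solved.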

691
00:44:46,760 --> 00:44:48,420
So that's the idea.

692
00:44:48,420 --> 00:44:52,260
So that's the idea here.

693
00:44:52,260 --> 00:44:56,790
That's the reason for
Arnoldi and Lanczos

694
00:44:56,790 --> 00:45:01,140
being famous is that they
figured out a good way

695
00:45:01,140 --> 00:45:04,070
to orthogonalize that basis.

696
00:45:10,680 --> 00:45:13,500
Do we want to see what they did?

697
00:45:13,500 --> 00:45:16,910
Or those would be in the notes.

698
00:45:16,910 --> 00:45:20,070
Well, how do you do it?

699
00:45:20,070 --> 00:45:21,420
So this is the basis.

700
00:45:21,420 --> 00:45:24,630
This is our not good basis.

701
00:45:24,630 --> 00:45:28,712
And then our good basis
is going to be q's.

702
00:45:31,880 --> 00:45:34,740
So I'll take b to
be-- q1 will be what?

703
00:45:34,740 --> 00:45:36,725
What would be the
right choice for q1?

704
00:45:39,230 --> 00:45:42,520
Well, I'll take that first
vector and normalize it.

705
00:45:46,240 --> 00:45:48,010
We're just doing Gram-Schmidt.

706
00:45:48,010 --> 00:45:50,950
What would q2 be?

707
00:45:50,950 --> 00:45:56,730
How would I find q2 following
the Gram-Schmidt idea?

708
00:45:56,730 --> 00:45:58,950
I take Ab.

709
00:45:58,950 --> 00:46:05,140
I subtract off its component
in this q1 direction

710
00:46:05,140 --> 00:46:07,120
and I normalize.

711
00:46:07,120 --> 00:46:10,510
And all the
Arnoldi-Lanczos algorithm

712
00:46:10,510 --> 00:46:16,000
is is that same
Gram-Schmidt idea applied

713
00:46:16,000 --> 00:46:19,480
to these Krylov vectors.

714
00:46:19,480 --> 00:46:24,430
So Arnoldi-Lanczos-- Arnoldi
is for any matrix and Lanczos

715
00:46:24,430 --> 00:46:28,120
is for a symmetric matrix where
you get some special benefit.
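A minimal sketch of that idea, assuming NumPy and a hypothetical function name `arnoldi_basis`. Two choices here are assumptions rather than anything stated in the lecture: each step multiplies the newest q by A instead of storing the raw powers (this spans the same Krylov space), and the subtractions are done one q at a time on the updated vector (the "modified" ordering). There is no handling of breakdown, when a new vector is already in the span of the earlier q's.

```python
import numpy as np

def arnoldi_basis(A, b, j):
    """Orthonormal q_1, ..., q_j spanning span{b, Ab, ..., A^(j-1) b}.

    Illustrative sketch: Gram-Schmidt applied to the Krylov vectors,
    taken in order with no column exchanges. Assumes no breakdown.
    """
    n = len(b)
    Q = np.zeros((n, j))
    Q[:, 0] = b / np.linalg.norm(b)       # q1 is b, normalized
    for k in range(j - 1):
        v = A @ Q[:, k]                   # next Krylov direction
        for i in range(k + 1):            # subtract components along q_1..q_{k+1}
            v = v - (Q[:, i] @ v) * Q[:, i]
        Q[:, k + 1] = v / np.linalg.norm(v)
    return Q

A = np.diag([1.0, 2.0, 3.0, 4.0])
b = np.ones(4)
Q = arnoldi_basis(A, b, 3)
print(np.round(Q.T @ Q, 12))              # close to the 3-by-3 identity
```

For a symmetric A, most of the inner-loop dot products are zero in exact arithmetic, which is the special benefit that Lanczos exploits.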

716
00:46:30,920 --> 00:46:34,260
So what they did,
you could say now,

717
00:46:34,260 --> 00:46:40,350
they just wrote
down Gram-Schmidt,

718
00:46:40,350 --> 00:46:42,960
in fact, probably the
standard Gram-Schmidt,

719
00:46:42,960 --> 00:46:45,570
because this is a
case where we really

720
00:46:45,570 --> 00:46:47,910
don't want to exchange columns.

721
00:46:47,910 --> 00:46:53,460
I don't want suddenly to
be pushed into this one.

722
00:46:53,460 --> 00:46:55,740
I'd rather take them in
order, because it just

723
00:46:55,740 --> 00:46:56,760
turns out right.

724
00:46:56,760 --> 00:46:58,650
And this is in the notes.

725
00:46:58,650 --> 00:47:01,150
So let me tell
you where this is.

726
00:47:01,150 --> 00:47:02,990
It will be Section II.1.

727
00:47:05,560 --> 00:47:08,500
So Part 2 of the book,
which is where we are,

728
00:47:08,500 --> 00:47:10,690
and the first section.

729
00:47:10,690 --> 00:47:15,010
So what all together is
in this first section when

730
00:47:15,010 --> 00:47:17,470
you look at it?

731
00:47:17,470 --> 00:47:21,160
That section is standard
numerical linear algebra,

732
00:47:21,160 --> 00:47:28,270
what any course in
MIT offers, 18.3--

733
00:47:28,270 --> 00:47:36,910
I'm not sure of the number, 330
maybe, which is full of things

734
00:47:36,910 --> 00:47:38,610
like this.

735
00:47:38,610 --> 00:47:41,600
Krylov would be there,
Arnoldi, Lanczos.

736
00:47:41,600 --> 00:47:44,480
Of course, Gram and
Schmidt would be there.

737
00:47:44,480 --> 00:47:48,960
That's five people who've
thought of the same thing.

738
00:47:48,960 --> 00:47:53,190
And so that Section
II.1 summarizes

739
00:47:53,190 --> 00:47:56,370
what's in really good,
well, a lot of textbooks.

740
00:47:56,370 --> 00:48:04,560
And let me mention a favorite,
a book by Trefethen and Bau.

741
00:48:08,800 --> 00:48:10,150
Or the "Bible."

742
00:48:10,150 --> 00:48:12,130
So this is maybe a
moment to tell you

743
00:48:12,130 --> 00:48:17,680
about two books on classical
numerical linear algebra, what

744
00:48:17,680 --> 00:48:21,050
do you do for matrices
of order 1,000,

745
00:48:21,050 --> 00:48:22,856
not for matrices
of order millions.

746
00:48:26,030 --> 00:48:28,210
That you have to rethink.

747
00:48:28,210 --> 00:48:31,810
So Trefethen-Bau is called
Numerical Linear Algebra.

748
00:48:31,810 --> 00:48:35,860
And do you know the
authors of the Bible

749
00:48:35,860 --> 00:48:38,890
of numerical linear algebra?

750
00:48:42,820 --> 00:48:45,190
So that's a textbook.

751
00:48:45,190 --> 00:48:48,730
And what I'm going to
write down now finally

752
00:48:48,730 --> 00:48:54,880
is 750 pages it's grown
to in its fourth edition.

753
00:48:54,880 --> 00:48:58,870
It's the Bible for all
numerical linear algebra people.

754
00:48:58,870 --> 00:49:03,440
And it's written by
Golub and Van Loan.

755
00:49:08,760 --> 00:49:13,190
So Gene Golub was
a remarkable guy.

756
00:49:13,190 --> 00:49:18,050
He probably didn't write more
than about 11 pages of this.

757
00:49:18,050 --> 00:49:20,300
Charlie wrote most of it.

758
00:49:20,300 --> 00:49:25,770
But Golub was an
amazing person who

759
00:49:25,770 --> 00:49:29,400
traveled the world
and connected people

760
00:49:29,400 --> 00:49:32,900
and left behind
papers to be written

761
00:49:32,900 --> 00:49:35,520
and books to be written.

762
00:49:35,520 --> 00:49:40,530
And so this Golub-Van Loan
is now in the fourth edition.

763
00:49:40,530 --> 00:49:47,230
And it has so much good stuff
and references that it's

764
00:49:47,230 --> 00:49:49,690
like the good reference.

765
00:49:49,690 --> 00:49:52,000
And this is the
good textbook if you

766
00:49:52,000 --> 00:49:56,695
were going to teach a course
on numerical linear algebra.

767
00:49:56,695 --> 00:50:02,350
So I think that I've come
to the point to finish.

768
00:50:02,350 --> 00:50:06,880
So I really have finished
along with the extra attraction

769
00:50:06,880 --> 00:50:12,860
of this different problem,
I finished with Ax equal b.

770
00:50:12,860 --> 00:50:18,040
And, well, at least,
I now move on to what

771
00:50:18,040 --> 00:50:21,158
to do with really,
really large matrices.