1 00:00:17,888 --> 00:00:18,555 [AUDIO PLAYBACK] 2 00:00:18,555 --> 00:00:19,472 - Good morning, class. 3 00:00:19,472 --> 00:00:20,190 [END PLAYBACK] 4 00:00:20,190 --> 00:00:22,300 MICHALE FEE: Hey, let's go ahead and get started. 5 00:00:22,300 --> 00:00:25,150 So we're going to finish spectral analysis today. 6 00:00:25,150 --> 00:00:32,369 So we are going to learn how to make a graphical representation 7 00:00:32,369 --> 00:00:36,780 like this of the spectral and temporal structure of time 8 00:00:36,780 --> 00:00:39,840 series, or in this case, a speech signal recorded 9 00:00:39,840 --> 00:00:42,060 on a microphone. 10 00:00:42,060 --> 00:00:44,370 Well, actually let me just tell you 11 00:00:44,370 --> 00:00:46,580 exactly what it is that we're looking at here. 12 00:00:46,580 --> 00:00:52,080 So this is a spectrogram that displays the amount of power 13 00:00:52,080 --> 00:00:55,830 in this signal as a function of time 14 00:00:55,830 --> 00:00:57,430 and as a function of frequency. 15 00:00:57,430 --> 00:01:02,160 So you remember we've been learning how to construct 16 00:01:02,160 --> 00:01:04,500 the spectrum of a signal. 17 00:01:04,500 --> 00:01:06,930 And today, we're going to learn how 18 00:01:06,930 --> 00:01:11,600 to construct a representation like this, called 19 00:01:11,600 --> 00:01:15,880 a spectrogram, that shows how that spectrum varies over time. 20 00:01:15,880 --> 00:01:17,610 So as you recall, we have learned 21 00:01:17,610 --> 00:01:21,970 how to compute the Fourier transform of a signal. 22 00:01:21,970 --> 00:01:26,010 This is one of the signals that we actually started with. 23 00:01:26,010 --> 00:01:29,820 So if you compute the Fourier transform of this square wave, 24 00:01:29,820 --> 00:01:32,710 you can see that in the frequency domain, 25 00:01:32,710 --> 00:01:34,830 now we plot the amount of, essentially, 26 00:01:34,830 --> 00:01:39,090 the components of this signal at different frequencies. 
27 00:01:39,090 --> 00:01:41,190 So the Fourier transform of this square wave 28 00:01:41,190 --> 00:01:43,350 has a number of peaks. 29 00:01:43,350 --> 00:01:49,350 Each of these peaks corresponds to a cosine contribution 30 00:01:49,350 --> 00:01:54,020 to this time series, OK? 31 00:01:54,020 --> 00:01:54,520 All right. 32 00:01:54,520 --> 00:01:56,740 And we also discussed how you can 33 00:01:56,740 --> 00:01:59,890 compute the power spectrum of a signal 34 00:01:59,890 --> 00:02:01,520 from the Fourier transform. 35 00:02:01,520 --> 00:02:04,510 So here what I've done is I've taken the Fourier 36 00:02:04,510 --> 00:02:07,120 transform of this square wave. 37 00:02:07,120 --> 00:02:09,699 And now we take the square magnitude 38 00:02:09,699 --> 00:02:11,830 of each of these values, and we just 39 00:02:11,830 --> 00:02:14,380 plot the spectrum, the square magnitude 40 00:02:14,380 --> 00:02:16,720 of just the positive frequency components. 41 00:02:16,720 --> 00:02:21,640 For real-valued functions, the power spectrum is symmetric. 42 00:02:21,640 --> 00:02:23,620 The power in each of these-- 43 00:02:23,620 --> 00:02:28,366 at each of these frequencies in the positive, half of the-- 44 00:02:28,366 --> 00:02:31,390 for the positive frequencies is exactly the same 45 00:02:31,390 --> 00:02:34,100 as the power in the negative frequencies. 46 00:02:34,100 --> 00:02:37,060 So if we plot the power spectrum of that square wave, 47 00:02:37,060 --> 00:02:39,430 we can see that there are multiple peaks 48 00:02:39,430 --> 00:02:41,620 at regular intervals. 49 00:02:41,620 --> 00:02:46,000 Now the problem with plotting power spectra on a linear scale 50 00:02:46,000 --> 00:02:48,520 here is that you often have contributions-- 51 00:02:48,520 --> 00:02:51,910 important contributions to signals that actually have 52 00:02:51,910 --> 00:02:55,180 a very small amount of power when you plot them 53 00:02:55,180 --> 00:02:56,120 on a linear scale.
54 00:02:56,120 --> 00:02:58,180 And so you can barely see them. 55 00:02:58,180 --> 00:03:00,580 You can barely see those contributions 56 00:03:00,580 --> 00:03:04,070 at these frequencies here on a linear scale. 57 00:03:04,070 --> 00:03:05,740 But if you plot this on a log scale, 58 00:03:05,740 --> 00:03:09,010 you can see the spectrum much more easily. 59 00:03:09,010 --> 00:03:10,570 So for example, what we've done here 60 00:03:10,570 --> 00:03:13,510 is we've plotted the square magnitude of the Fourier 61 00:03:13,510 --> 00:03:19,840 transform, taken the log base 10 of that spectrum-- 62 00:03:19,840 --> 00:03:23,600 spectrum is the square magnitude of the Fourier transform. 63 00:03:23,600 --> 00:03:28,520 And now we can take the log base 10 of the power spectrum 64 00:03:28,520 --> 00:03:32,830 to get the power in units of bels, 65 00:03:32,830 --> 00:03:37,090 and multiply that by 10 to get the power in units of decibels, 66 00:03:37,090 --> 00:03:37,840 OK? 67 00:03:37,840 --> 00:03:43,210 So each tick mark here of size 10 corresponds 68 00:03:43,210 --> 00:03:46,540 to one order of magnitude in power, OK? 69 00:03:46,540 --> 00:03:51,400 So this peak here is about 10 dB lower than that peak there, 70 00:03:51,400 --> 00:03:56,960 and that corresponds to about a factor of 10 lower in power. 71 00:03:56,960 --> 00:03:58,250 OK, any questions about that? 72 00:03:58,250 --> 00:04:01,420 I want to be able-- want you to understand what 73 00:04:01,420 --> 00:04:03,130 these units of decibels are. 74 00:04:03,130 --> 00:04:05,200 They're going to be on the test. 75 00:04:05,200 --> 00:04:06,040 OK. 76 00:04:06,040 --> 00:04:07,030 Questions about that? 77 00:04:07,030 --> 00:04:08,925 You want to just ask me right now? 78 00:04:08,925 --> 00:04:10,380 OK. 79 00:04:10,380 --> 00:04:12,930 Remember this. 80 00:04:12,930 --> 00:04:14,050 OK. 
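The decibel conversion described above can be sketched in a few lines of Python. This is only an illustration; the function name `power_ratio_to_db` is hypothetical and is not part of the course materials.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a ratio of two powers to decibels.

    log10(ratio) is the ratio in bels; multiplying by 10 gives decibels.
    """
    return 10.0 * math.log10(ratio)

# A peak with one tenth of the power of another peak, then a factor of 100.
for ratio in (0.1, 10.0, 100.0):
    print(ratio, "->", round(power_ratio_to_db(ratio), 3), "dB")
```

Each factor of 10 in power moves the value by 10 dB, which is why one tick mark of size 10 on the plot corresponds to one order of magnitude.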
81 00:04:14,050 --> 00:04:18,050 And keep in mind that the power in a signal 82 00:04:18,050 --> 00:04:21,649 is proportional to the square of the amplitude, OK? 83 00:04:21,649 --> 00:04:26,120 So if I tell you that a signal has 10 times as much amplitude, 84 00:04:26,120 --> 00:04:28,960 it's going to have 100 times as much power. 85 00:04:28,960 --> 00:04:35,500 100 times as much power is 10 to the 2, 86 00:04:35,500 --> 00:04:38,083 so 2 bels, which is 20 decibels. 87 00:04:38,083 --> 00:04:39,710 Does that make sense? 88 00:04:39,710 --> 00:04:40,680 OK. 89 00:04:40,680 --> 00:04:41,180 All right. 90 00:04:41,180 --> 00:04:44,960 So we also talked about some Fourier transforms 91 00:04:44,960 --> 00:04:46,530 of different kinds of functions. 92 00:04:46,530 --> 00:04:49,280 So this is the Fourier transform of a square pulse. 93 00:04:49,280 --> 00:04:50,910 So here I showed you a square pulse 94 00:04:50,910 --> 00:04:53,460 that has a width of 100 milliseconds. 95 00:04:53,460 --> 00:04:56,300 The Fourier transform is this sinc function, 96 00:04:56,300 --> 00:05:00,180 and for a square pulse of width 100 milliseconds, 97 00:05:00,180 --> 00:05:02,120 the sinc function has a half-- 98 00:05:02,120 --> 00:05:07,400 sorry-- has a full width at half height of 12 hertz. 99 00:05:07,400 --> 00:05:11,330 If we have a square pulse that's five times as long, 100 00:05:11,330 --> 00:05:14,990 500 milliseconds long, the Fourier transform 101 00:05:14,990 --> 00:05:17,150 is the sinc function again, but it 102 00:05:17,150 --> 00:05:21,560 has a width of this central lobe here of 2.4 hertz. 103 00:05:21,560 --> 00:05:25,010 So you can see that the longer the pulse, the shorter-- 104 00:05:25,010 --> 00:05:28,320 the narrower the structure in the frequency domain.
105 00:05:28,320 --> 00:05:30,530 Up here, if we'd look at the Fourier transform 106 00:05:30,530 --> 00:05:34,640 of a square pulse that's 25 milliseconds long, 107 00:05:34,640 --> 00:05:37,460 then the Fourier transform is again a sinc function, 108 00:05:37,460 --> 00:05:40,310 and the width of that central lobe is 48 hertz. 109 00:05:40,310 --> 00:05:43,370 So you can see that the width in the time domain 110 00:05:43,370 --> 00:05:46,830 and the width in the frequency domain are inversely related. 111 00:05:46,830 --> 00:05:49,370 So the product of the width in the time domain 112 00:05:49,370 --> 00:05:52,760 and the width in the frequency domain is constant. 113 00:05:52,760 --> 00:05:56,820 And that constant is called the time-bandwidth product 114 00:05:56,820 --> 00:05:58,360 of that signal. 115 00:05:58,360 --> 00:06:02,610 The time-bandwidth product of this square pulse 116 00:06:02,610 --> 00:06:05,170 and sinc function is constant. 117 00:06:05,170 --> 00:06:06,420 It's independent of the width. 118 00:06:06,420 --> 00:06:09,980 The time-bandwidth product is a function of that. 119 00:06:09,980 --> 00:06:16,084 It's a characteristic of that functional form. 120 00:06:16,084 --> 00:06:18,270 OK. 121 00:06:18,270 --> 00:06:21,110 Now we also talked about the convolution theorem, 122 00:06:21,110 --> 00:06:25,430 which relates the way signals get multiplied in time 123 00:06:25,430 --> 00:06:27,840 or convolved in frequency domain. 
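As a rough check of the time-bandwidth product for the square pulse, the following Python sketch finds the half-height point of the sinc function numerically and converts it to a full width in hertz. It assumes the Fourier transform of a pulse of width T is proportional to sin(pi f T) / (pi f T); the helper names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def half_height_point():
    """Bisection for the x in (0, 1) where sinc(x) falls to 0.5."""
    lo, hi = 0.0, 1.0  # sinc(0) = 1 and sinc(1) = 0 bracket the crossing
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sinc(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fwhm_hz(pulse_width_s):
    """Full width at half height of the sinc for a pulse of this width."""
    return 2.0 * half_height_point() / pulse_width_s

for width in (0.025, 0.1, 0.5):
    width_hz = fwhm_hz(width)
    print(width, "s ->", round(width_hz, 2), "Hz, product", round(width * width_hz, 3))
```

The product of the two widths comes out the same for all three pulse lengths, close to 1.2, which is consistent with the 48 hertz, 12 hertz, and 2.4 hertz quoted for 25, 100, and 500 milliseconds.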
124 00:06:27,840 --> 00:06:31,910 So for example, if we have a square pulse in time multiplied 125 00:06:31,910 --> 00:06:35,780 by a cosine function in time to get a windowed cosine-- 126 00:06:35,780 --> 00:06:37,670 so this function is zero everywhere 127 00:06:37,670 --> 00:06:41,760 except it's cosine within this window-- 128 00:06:41,760 --> 00:06:43,700 we can compute the Fourier transform 129 00:06:43,700 --> 00:06:49,020 of this windowed cosine function by convolving the Fourier 130 00:06:49,020 --> 00:06:51,210 transform of the square pulse with the Fourier 131 00:06:51,210 --> 00:06:54,272 transform of the cosine function, like this. 132 00:06:54,272 --> 00:06:56,105 So the Fourier transform of the square pulse 133 00:06:56,105 --> 00:06:57,980 is, again, this sinc function. 134 00:06:57,980 --> 00:07:00,860 The Fourier transform of the cosine 135 00:07:00,860 --> 00:07:03,720 are these two delta functions. 136 00:07:03,720 --> 00:07:06,680 Now if we convolve the sinc function with those two delta 137 00:07:06,680 --> 00:07:10,160 functions, we get a copy of that sinc function 138 00:07:10,160 --> 00:07:13,560 at the location of each of those delta functions. 139 00:07:13,560 --> 00:07:16,828 And that is the Fourier transform, OK? 140 00:07:16,828 --> 00:07:17,870 Any questions about that? 141 00:07:17,870 --> 00:07:21,770 Just a quick review of things we've been talking about. 142 00:07:21,770 --> 00:07:24,220 All right. 143 00:07:24,220 --> 00:07:26,920 So we can look at this Fourier transform here. 144 00:07:26,920 --> 00:07:28,870 We can look at the power spectrum 145 00:07:28,870 --> 00:07:33,560 of this windowed cosine function, like this. 146 00:07:33,560 --> 00:07:35,740 So there's the windowed cosine function. 147 00:07:35,740 --> 00:07:38,410 The power spectrum-- the power spectrum 148 00:07:38,410 --> 00:07:42,310 plotted on a linear scale is just the square magnitude 149 00:07:42,310 --> 00:07:44,440 of what I've plotted here. 
150 00:07:44,440 --> 00:07:47,870 And we're just going to plot the positive frequencies. 151 00:07:47,870 --> 00:07:50,230 That's what the power spectrum of that signal 152 00:07:50,230 --> 00:07:52,210 looks like on a log scale. 153 00:07:52,210 --> 00:07:54,730 So you can see that it has a peak 154 00:07:54,730 --> 00:07:57,310 at 20 hertz, which was the frequency 155 00:07:57,310 --> 00:07:59,550 of the cosine function. 156 00:07:59,550 --> 00:08:01,790 You see some little wiggles out here. 157 00:08:01,790 --> 00:08:03,400 But if you look on a log scale, you 158 00:08:03,400 --> 00:08:05,830 can see that those wiggles off to the side 159 00:08:05,830 --> 00:08:07,940 are actually quite significant. 160 00:08:07,940 --> 00:08:12,370 The first side lobe there has a power that's about 1/10 161 00:08:12,370 --> 00:08:13,840 of the central peak. 162 00:08:13,840 --> 00:08:16,368 That may not matter, sometimes, when 163 00:08:16,368 --> 00:08:18,160 you're looking at the spectrum of a signal, 164 00:08:18,160 --> 00:08:21,640 but sometimes it will matter because those side lobes 165 00:08:21,640 --> 00:08:23,560 there will interfere. 166 00:08:23,560 --> 00:08:28,900 They'll mask the spectrum of other components of this signal 167 00:08:28,900 --> 00:08:32,600 that you may be interested in. 168 00:08:32,600 --> 00:08:35,470 We also talked about how this spectrum 169 00:08:35,470 --> 00:08:39,620 depends on the function that you multiply your cosine by here. 170 00:08:39,620 --> 00:08:42,549 So for example, if you take a cosine 171 00:08:42,549 --> 00:08:47,020 and you multiply it by a Gaussian, the power spectrum 172 00:08:47,020 --> 00:08:49,450 has the shape of a Gaussian. It's 173 00:08:49,450 --> 00:08:53,080 a Gaussian that has a peak at 20 hertz.
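The square-windowed cosine spectrum can be reproduced with a small pure-Python sketch. The sampling rate and the 200 millisecond window length are assumptions chosen for illustration, not values from the lecture; only the 20 hertz cosine frequency comes from the text.

```python
import cmath
import math

FS = 1000.0      # sampling rate in Hz (assumed)
F0 = 20.0        # cosine frequency in Hz, as in the lecture
N_WINDOW = 200   # 200 ms square window (assumed), i.e. 4 full cycles

# Windowed cosine: only the samples inside the square window are nonzero,
# so it is enough to keep those samples.
signal = [math.cos(2 * math.pi * F0 * n / FS) for n in range(N_WINDOW)]

def power_at(freq_hz, x, fs):
    """Square magnitude of the Fourier sum of x at one frequency."""
    s = sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs)
            for n, v in enumerate(x))
    return abs(s) ** 2

freqs = list(range(0, 61))  # positive frequencies in 1 Hz steps
power = [power_at(f, signal, FS) for f in freqs]
peak = max(power)
# Log scale in decibels relative to the peak; the tiny offset avoids log(0).
power_db = [10 * math.log10(p / peak + 1e-12) for p in power]

peak_freq = freqs[power.index(peak)]
print("peak at", peak_freq, "Hz")
print("level at 27 Hz, near the first side lobe:", round(power_db[27], 1), "dB")
```

On a linear scale the side lobe is easy to overlook, but on the decibel scale it sits only a little more than one order of magnitude below the central peak.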
174 00:08:53,080 --> 00:08:55,450 And if you look at the-- 175 00:08:55,450 --> 00:08:57,310 if you look at that spectrum on a log scale, 176 00:08:57,310 --> 00:09:00,580 you can see that it loses it-- 177 00:09:00,580 --> 00:09:03,710 you've lost all of these high-frequency wiggles, 178 00:09:03,710 --> 00:09:04,450 up here. 179 00:09:04,450 --> 00:09:08,110 All of those wiggles come from the sharp edge 180 00:09:08,110 --> 00:09:12,040 of this square pulse windowing function, OK? 181 00:09:12,040 --> 00:09:15,430 So the shape of the spectrum that you get 182 00:09:15,430 --> 00:09:19,180 depends a lot on how you window the function 183 00:09:19,180 --> 00:09:21,630 that you're looking at. 184 00:09:21,630 --> 00:09:23,560 Questions about that? 185 00:09:23,560 --> 00:09:25,360 Right, again, more review. 186 00:09:25,360 --> 00:09:25,860 OK. 187 00:09:25,860 --> 00:09:29,110 So we talked about estimating the spectrum of a signal. 188 00:09:29,110 --> 00:09:32,260 If you have many different measurements of some signal, 189 00:09:32,260 --> 00:09:35,380 you can actually just compute the spectrum of each one. 190 00:09:35,380 --> 00:09:38,590 This little hat here means an estimate of the spectrum. 191 00:09:38,590 --> 00:09:41,380 You compute some estimate of the spectrum of each 192 00:09:41,380 --> 00:09:45,080 of those trials, samples of your data, 193 00:09:45,080 --> 00:09:48,154 and you can just average those together. 194 00:09:48,154 --> 00:09:52,590 OK, now if you have a continuous signal, you can also-- 195 00:09:52,590 --> 00:09:54,450 you could estimate the spectrum just 196 00:09:54,450 --> 00:09:58,380 by taking the Fourier transform of a long recording 197 00:09:58,380 --> 00:09:59,370 of your signal.
198 00:09:59,370 --> 00:10:03,210 But it's much better to break your signal into small pieces, 199 00:10:03,210 --> 00:10:06,630 compute a spectral estimate of each one of those small pieces 200 00:10:06,630 --> 00:10:08,440 and average those together. 201 00:10:08,440 --> 00:10:12,570 Now how do you construct a small sample of a signal? 202 00:10:12,570 --> 00:10:15,480 If you have a continuous signal, how do you 203 00:10:15,480 --> 00:10:17,400 take a small sample of it? 204 00:10:17,400 --> 00:10:19,350 Well, you can think about that as taking 205 00:10:19,350 --> 00:10:21,780 your continuous signal and multiplying it 206 00:10:21,780 --> 00:10:23,760 by a square window. 207 00:10:23,760 --> 00:10:26,550 Setting everything outside that window to zero 208 00:10:26,550 --> 00:10:28,993 and just keeping the part that's in that window. 209 00:10:28,993 --> 00:10:30,660 And you know that when you take a signal 210 00:10:30,660 --> 00:10:34,200 and you multiply it by a square window, what have you done? 211 00:10:34,200 --> 00:10:38,250 You've convolved the spectrum of this original signal 212 00:10:38,250 --> 00:10:41,130 with the spectrum of this square pulse. 213 00:10:41,130 --> 00:10:43,290 And that spectrum of the square pulse 214 00:10:43,290 --> 00:10:45,900 is really a nasty looking thing, right? 215 00:10:45,900 --> 00:10:49,590 It is what we call the Dirichlet kernel, which is just 216 00:10:49,590 --> 00:10:51,660 the power spectrum of a square pulse 217 00:10:51,660 --> 00:10:53,820 that we just talked about, OK? 218 00:10:53,820 --> 00:10:55,680 So that's called the Dirichlet kernel. 219 00:10:55,680 --> 00:11:03,240 And using a square pulse to select out a sample of data 220 00:11:03,240 --> 00:11:07,680 introduces two errors into your spectral estimate. 221 00:11:07,680 --> 00:11:09,330 One is narrowband bias.
222 00:11:09,330 --> 00:11:14,190 It broadens your estimate of the spectrum of, let's say, 223 00:11:14,190 --> 00:11:17,220 sinusoidal or periodic components in your signal. 224 00:11:17,220 --> 00:11:20,980 And it also introduces these side lobes. 225 00:11:20,980 --> 00:11:23,650 So the way we solve that problem is 226 00:11:23,650 --> 00:11:26,200 we break our signal into little pieces, 227 00:11:26,200 --> 00:11:28,240 multiply each of those little pieces 228 00:11:28,240 --> 00:11:31,000 by a smoother windowing function, by something 229 00:11:31,000 --> 00:11:33,280 that isn't a square pulse, multiply it 230 00:11:33,280 --> 00:11:36,700 by something that maybe looks like a Gaussian, or a-- 231 00:11:36,700 --> 00:11:41,700 or half of a cosine function. 232 00:11:41,700 --> 00:11:46,620 That gives us what we call tapered segments of our data. 233 00:11:46,620 --> 00:11:50,280 We can estimate the spectrum of those tapered pieces 234 00:11:50,280 --> 00:11:52,802 and average those together, OK? 235 00:11:52,802 --> 00:11:54,440 Any questions? 236 00:11:54,440 --> 00:11:56,300 OK, again, that's a review. 237 00:11:56,300 --> 00:11:58,310 And I showed you briefly what happens if we 238 00:11:58,310 --> 00:12:00,230 take a little piece of signal. 239 00:12:00,230 --> 00:12:03,500 The blue is white noise with a little bit 240 00:12:03,500 --> 00:12:09,090 of this periodic sine function added to it. 241 00:12:09,090 --> 00:12:11,570 And if you run that analysis, you 242 00:12:11,570 --> 00:12:14,960 can see that there is a large component of the spectrum 243 00:12:14,960 --> 00:12:16,880 that's due to the white noise. 244 00:12:16,880 --> 00:12:19,250 That's this broadband component here. 245 00:12:19,250 --> 00:12:22,700 And that sinusoidal component there gives you 246 00:12:22,700 --> 00:12:26,920 this peak in the spectrum, OK?
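The recipe above (break the signal into pieces, taper each piece, estimate each spectrum, average) can be sketched in pure Python as follows. This is an illustrative sketch and not the course's own code; the Hann taper, the segment length, the overlap, and the 125 hertz test frequency are all assumptions.

```python
import cmath
import math
import random

def hann(n_points):
    """Smooth taper that goes to zero at both edges (an assumed choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def power_spectrum(segment):
    """Square magnitude of the DFT at the non-negative frequencies."""
    n = len(segment)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * m / n)
                    for m, v in enumerate(segment))) ** 2
            for k in range(n // 2 + 1)]

def averaged_spectrum(x, seg_len, step):
    """Average the spectra of overlapping tapered segments of x."""
    taper = hann(seg_len)
    total = [0.0] * (seg_len // 2 + 1)
    count = 0
    for start in range(0, len(x) - seg_len + 1, step):
        tapered = [w * v for w, v in zip(taper, x[start:start + seg_len])]
        for k, p in enumerate(power_spectrum(tapered)):
            total[k] += p
        count += 1
    return [t / count for t in total]

# White noise plus a sinusoid, loosely following the example in the lecture.
random.seed(0)
fs, f_sine = 1000.0, 125.0
x = [random.gauss(0.0, 1.0) + math.sin(2 * math.pi * f_sine * n / fs)
     for n in range(1024)]
spec = averaged_spectrum(x, seg_len=64, step=32)
peak_bin = spec.index(max(spec))
print("peak near", peak_bin * fs / 64, "Hz")
```

The averaged estimate shows a roughly flat broadband floor from the noise and a single peak at the frequency bin of the added sinusoid.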
247 00:12:26,920 --> 00:12:28,260 And there's a-- 248 00:12:28,260 --> 00:12:30,820 I've posted-- or Daniel's posted a function 249 00:12:30,820 --> 00:12:35,350 called wspec.m that implements this spectral estimate 250 00:12:35,350 --> 00:12:36,220 like this. 251 00:12:36,220 --> 00:12:41,560 So now today, we're going to turn to estimating time varying 252 00:12:41,560 --> 00:12:44,500 signals, estimating the spectrum of time varying signals. 253 00:12:44,500 --> 00:12:48,240 So this is a microphone recording of a speech signal. 254 00:12:48,240 --> 00:12:50,103 Let me see if I can play that. 255 00:12:50,103 --> 00:12:50,770 [AUDIO PLAYBACK] 256 00:12:50,770 --> 00:12:51,435 - Hello. 257 00:12:51,435 --> 00:12:51,850 [END PLAYBACK] 258 00:12:51,850 --> 00:12:52,808 MICHALE FEE: All right. 259 00:12:52,808 --> 00:12:56,510 So that's just me saying hello in a robotic voice. 260 00:12:56,510 --> 00:12:57,010 OK. 261 00:12:57,010 --> 00:13:00,490 So that is the signal. 262 00:13:00,490 --> 00:13:02,890 That's basically voltage recorded 263 00:13:02,890 --> 00:13:04,900 on the output of a microphone. 264 00:13:04,900 --> 00:13:07,180 It's got some interesting structure in it, right? 265 00:13:07,180 --> 00:13:13,000 So first these little pulses here, 266 00:13:13,000 --> 00:13:15,470 you see this kind of periodic pulse. 267 00:13:15,470 --> 00:13:16,840 Those are called glottal pulses. 268 00:13:16,840 --> 00:13:19,330 Does anyone know what those are? 269 00:13:19,330 --> 00:13:22,745 What produces those? 270 00:13:22,745 --> 00:13:24,236 No? 271 00:13:24,236 --> 00:13:28,580 OK, so when you speak a voiced sound, 272 00:13:28,580 --> 00:13:30,170 your vocal cords are vibrating. 273 00:13:30,170 --> 00:13:35,090 You have two pieces of flexible tissue 274 00:13:35,090 --> 00:13:38,300 that are close to each other in your trachea. 
275 00:13:38,300 --> 00:13:40,880 As air flows up through your trachea, 276 00:13:40,880 --> 00:13:46,250 the air pressure builds up and pushes the glottal folds apart. 277 00:13:46,250 --> 00:13:51,050 Air begins to flow rapidly through that open space. 278 00:13:51,050 --> 00:13:54,500 At high velocities, the velocity flowing 279 00:13:54,500 --> 00:13:57,110 through the constriction is higher than the velocity of air 280 00:13:57,110 --> 00:13:59,180 anywhere else in the trachea because it's flowing 281 00:13:59,180 --> 00:14:01,160 through a tiny little space. 282 00:14:01,160 --> 00:14:04,520 At high velocities, at constrictions 283 00:14:04,520 --> 00:14:06,800 where you have a high fluid flow, 284 00:14:06,800 --> 00:14:09,140 the pressure actually drops. 285 00:14:09,140 --> 00:14:14,660 And that pulls the vocal folds back together again. 286 00:14:14,660 --> 00:14:17,750 When they snap together, all the airflow stops, 287 00:14:17,750 --> 00:14:22,080 and you have a pulse of negative pressure above the glottis, 288 00:14:22,080 --> 00:14:22,580 right? 289 00:14:22,580 --> 00:14:25,320 Imagine you have airflow coming up. 290 00:14:25,320 --> 00:14:27,260 And all of a sudden, you pinch it off. 291 00:14:27,260 --> 00:14:30,410 There's a sudden drop in the pressure as that mass of air 292 00:14:30,410 --> 00:14:33,650 keeps flowing up, but there's nothing more coming up below. 293 00:14:33,650 --> 00:14:36,380 So you get a sharp drop in the pressure. 294 00:14:36,380 --> 00:14:38,750 Then the air pressure builds up again. 295 00:14:38,750 --> 00:14:41,810 The glottal folds open, velocity increases, 296 00:14:41,810 --> 00:14:43,410 and they snap shut again. 297 00:14:43,410 --> 00:14:46,460 And so that's what happens as you're talking, OK? 298 00:14:46,460 --> 00:14:51,740 And so that periodic signal right there, those 299 00:14:51,740 --> 00:14:54,200 pulses in pressure-- the microphone is 300 00:14:54,200 --> 00:14:55,980 recording pressure, remember. 
301 00:14:55,980 --> 00:15:00,200 So those pulses are due to your glottis snapping 302 00:15:00,200 --> 00:15:06,290 shut each time it closes during the cycle, 303 00:15:06,290 --> 00:15:08,390 during that oscillatory cycle. 304 00:15:08,390 --> 00:15:11,060 The period of that-- those glottal pulses 305 00:15:11,060 --> 00:15:16,550 is about 10 milliseconds in men and about 5 milliseconds 306 00:15:16,550 --> 00:15:17,420 in women. 307 00:15:17,420 --> 00:15:17,930 OK. 308 00:15:17,930 --> 00:15:21,440 But you can see there's a lot of other structure 309 00:15:21,440 --> 00:15:23,780 changes in this signal that go on through time. 310 00:15:23,780 --> 00:15:26,240 But let's start by just looking at the spectrum 311 00:15:26,240 --> 00:15:27,560 of that whole signal. 312 00:15:27,560 --> 00:15:29,670 Now what might we expect? 313 00:15:29,670 --> 00:15:35,090 So if you have periodic pulses at 10 millisecond period, 314 00:15:35,090 --> 00:15:38,110 what should the spectrum look like? 315 00:15:38,110 --> 00:15:43,240 If you have a train of pulses, let's say delta functions 316 00:15:43,240 --> 00:15:46,810 with 10 millisecond period, what would the spectrum of that look 317 00:15:46,810 --> 00:15:49,375 like? 318 00:15:49,375 --> 00:15:52,270 Anybody remember what the spectrum of a train of pulses 319 00:15:52,270 --> 00:15:52,770 looks like? 320 00:15:57,910 --> 00:15:58,630 Almost, yes. 321 00:15:58,630 --> 00:15:59,513 There would be. 322 00:15:59,513 --> 00:16:01,180 But there would be other things as well. 323 00:16:04,230 --> 00:16:06,510 What would a signal look like that just 324 00:16:06,510 --> 00:16:08,470 has a peak at 100 hertz? 325 00:16:08,470 --> 00:16:09,350 What is that? 326 00:16:11,990 --> 00:16:14,390 Has one peak at 100 hertz? 327 00:16:14,390 --> 00:16:18,740 Or let's say [INAUDIBLE] Fourier transform would have 328 00:16:18,740 --> 00:16:22,630 a peak at 100 and at minus 100. 329 00:16:22,630 --> 00:16:24,040 That's just a cosine. 
330 00:16:24,040 --> 00:16:27,070 That's not a train of pulses. 331 00:16:27,070 --> 00:16:29,350 What's the Fourier transform of a train of pulses? 332 00:16:29,350 --> 00:16:31,900 Those of you who are concentrating on this right 333 00:16:31,900 --> 00:16:34,130 now are going to be really glad on the midterm. 334 00:16:34,130 --> 00:16:37,840 What's the Fourier transform of a train of pulses? 335 00:16:37,840 --> 00:16:39,480 OK, let's go back to here because there 336 00:16:39,480 --> 00:16:43,020 was a bit of a hint here at the beginning of lecture. 337 00:16:43,020 --> 00:16:47,820 What's the Fourier transform of a square wave? 338 00:16:47,820 --> 00:16:51,103 Any idea what happens if we make these pulses narrower 339 00:16:51,103 --> 00:16:51,645 and narrower? 340 00:16:56,530 --> 00:16:59,010 The pulses get more and more narrow, 341 00:16:59,010 --> 00:17:01,065 these peaks get bigger and bigger. 342 00:17:01,065 --> 00:17:05,050 And as we go to a train of delta functions, 343 00:17:05,050 --> 00:17:07,480 you just get Fourier transform of a train 344 00:17:07,480 --> 00:17:09,510 of delta functions in time, is just 345 00:17:09,510 --> 00:17:13,349 a train of delta functions in frequency. 346 00:17:13,349 --> 00:17:17,210 The spacing between the peaks in frequency is just 1 347 00:17:17,210 --> 00:17:21,130 over the spacing between the peaks in time, right? 348 00:17:21,130 --> 00:17:22,720 Make sure you know that. 349 00:17:27,410 --> 00:17:30,530 OK, so now let's go back to our speech signal. 350 00:17:34,967 --> 00:17:37,942 These are almost like delta functions. 351 00:17:37,942 --> 00:17:40,150 Maybe not quite, but for now, let's pretend they are. 352 00:17:40,150 --> 00:17:42,820 If those are a train of delta functions spaced 353 00:17:42,820 --> 00:17:45,940 at 10 milliseconds, what is our spectrum going to look like? 354 00:17:53,240 --> 00:17:55,500 I just said it. 355 00:17:55,500 --> 00:17:58,750 What is it going to look like? 
356 00:17:58,750 --> 00:17:59,340 Yep. 357 00:17:59,340 --> 00:18:00,748 Spaced by? 358 00:18:00,748 --> 00:18:02,103 AUDIENCE: One. 359 00:18:02,103 --> 00:18:03,020 MICHALE FEE: Which is? 360 00:18:05,560 --> 00:18:06,530 100 hertz. 361 00:18:06,530 --> 00:18:07,120 Good. 362 00:18:07,120 --> 00:18:10,550 So here's the spectrum of that speech signal. 363 00:18:10,550 --> 00:18:13,088 What do you see? 364 00:18:13,088 --> 00:18:19,150 You see a train of delta functions separated 365 00:18:19,150 --> 00:18:22,820 by about 100 hertz, right? 366 00:18:22,820 --> 00:18:26,210 That's a kilohertz, that's 500 hertz, that's 100 hertz. 367 00:18:26,210 --> 00:18:29,960 So you get a train of delta functions separated 368 00:18:29,960 --> 00:18:31,820 by 100 hertz, OK? 369 00:18:31,820 --> 00:18:34,922 That's called a harmonic stack. 370 00:18:34,922 --> 00:18:35,422 OK. 371 00:18:35,422 --> 00:18:38,410 And the spectrum of a speech signal 372 00:18:38,410 --> 00:18:43,000 has a harmonic stack because the signal has 373 00:18:43,000 --> 00:18:48,540 these short little pulses of pressure in them. 374 00:18:48,540 --> 00:18:50,670 OK, what are these bumps here? 375 00:18:50,670 --> 00:18:53,460 Why is there a bump here, a bump here, and a bump here? 376 00:18:53,460 --> 00:18:54,580 Does anyone know that? 377 00:19:01,984 --> 00:19:07,830 What is it that shapes the sound as you speak? 378 00:19:07,830 --> 00:19:11,442 That makes an "ooh" sound different from an "ahh?" 379 00:19:11,442 --> 00:19:18,180 [INAUDIBLE] This is hello. 380 00:19:18,180 --> 00:19:20,280 Sorry, I'm having trouble with my pointer. 381 00:19:23,058 --> 00:19:24,735 That's hello. 382 00:19:28,598 --> 00:19:32,864 What is it that makes all things sound different? 383 00:19:32,864 --> 00:19:35,500 So the sound, those pulses, are made down 384 00:19:35,500 --> 00:19:37,300 here in your vocal tract.
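The harmonic stack produced by an idealized pulse train can be checked with a short Python sketch. It treats the glottal pulses as unit impulses every 10 milliseconds; the sampling rate and the one-second duration are assumptions made so that each DFT bin is 1 hertz wide.

```python
import cmath
import math

FS = 1000     # samples per second (assumed)
PERIOD = 10   # one pulse every 10 samples, i.e. every 10 ms
N = 1000      # one second of signal, so each DFT bin is 1 Hz wide

# Idealized glottal pulses: a train of unit impulses. Only the pulse
# positions are stored because every other sample is zero.
pulse_times = range(0, N, PERIOD)

def power_at_bin(k):
    """Square magnitude of the DFT of the pulse train at bin k."""
    s = sum(cmath.exp(-2j * math.pi * k * n / N) for n in pulse_times)
    return abs(s) ** 2

# Keep the bins whose power is clearly above floating-point round-off.
peaks = [k for k in range(N // 2 + 1) if power_at_bin(k) > 1.0]
print("peaks at (Hz):", peaks)
```

The only bins with power are the multiples of 100 hertz, which is 1 over the 10 millisecond spacing of the pulses in time.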
385 00:19:37,300 --> 00:19:41,620 As those pulses propagate up from your glottis 386 00:19:41,620 --> 00:19:46,570 to your lips, they [AUDIO OUT] filter, which is your mouth. 387 00:19:46,570 --> 00:19:48,370 And that the shape of that filter 388 00:19:48,370 --> 00:19:52,270 is controlled by the closure of your lips, 389 00:19:52,270 --> 00:19:58,180 by where your tongue is, where different parts of your tongue 390 00:19:58,180 --> 00:20:03,040 are closing the opening in your mouth. 391 00:20:03,040 --> 00:20:08,820 And all of those things produce filters that have peaks. 392 00:20:08,820 --> 00:20:13,050 And the vocal filter has three main peaks 393 00:20:13,050 --> 00:20:15,700 that move around as you move the shape of your mouth. 394 00:20:15,700 --> 00:20:19,740 And those are called formants, OK? 395 00:20:19,740 --> 00:20:20,240 OK. 396 00:20:20,240 --> 00:20:23,850 Now you can see that this temporal structure, 397 00:20:23,850 --> 00:20:26,650 this spectral structure isn't constant in time. 398 00:20:26,650 --> 00:20:29,440 It changes-- right-- throughout this word. 399 00:20:29,440 --> 00:20:33,380 So what we can do is we can take that signal, 400 00:20:33,380 --> 00:20:37,180 and we can compute the spectrum of little parts of it, OK? 401 00:20:37,180 --> 00:20:41,600 So we can take that signal and multiply it by a window 402 00:20:41,600 --> 00:20:46,430 here, a taper here, and get a little sample of the speech 403 00:20:46,430 --> 00:20:48,320 signal and calculate the spectrum of it 404 00:20:48,320 --> 00:20:50,970 just by Fourier transforming, OK? 405 00:20:50,970 --> 00:20:52,080 We can do the same thing. 406 00:20:52,080 --> 00:20:55,470 Shift it over a little bit and compute the spectrum 407 00:20:55,470 --> 00:20:56,670 of that signal, all right? 
408 00:20:56,670 --> 00:21:00,210 So we're going to take a little piece of the signal that 409 00:21:00,210 --> 00:21:04,600 has width in time, capital T-- 410 00:21:04,600 --> 00:21:06,920 OK-- that's the width of the window. 411 00:21:06,920 --> 00:21:09,420 We're going to multiply it by a taper, compute the spectrum. 412 00:21:09,420 --> 00:21:10,920 And we're going to shift that window 413 00:21:10,920 --> 00:21:13,545 by a smaller amount, delta t, so that you 414 00:21:13,545 --> 00:21:15,090 have overlapping windows. 415 00:21:15,090 --> 00:21:17,020 Compute the spectrum of each one, 416 00:21:17,020 --> 00:21:21,111 and then stack all of those up next to each other. 417 00:21:21,111 --> 00:21:25,840 So now you've got a spectrum that's a function of time 418 00:21:25,840 --> 00:21:29,640 and frequency, OK? 419 00:21:29,640 --> 00:21:34,830 So each column is the spectrum of one little piece 420 00:21:34,830 --> 00:21:37,700 of the sound at one moment in time. 421 00:21:37,700 --> 00:21:39,450 Does that make sense? 422 00:21:39,450 --> 00:21:39,950 OK. 423 00:21:39,950 --> 00:21:43,960 And that's where this spectrogram comes from. 424 00:21:43,960 --> 00:21:45,340 Here in this spectrogram, you can 425 00:21:45,340 --> 00:21:50,380 see these horizontal striations are the harmonic stack 426 00:21:50,380 --> 00:21:52,450 produced by the glottal pulse. 427 00:21:52,450 --> 00:21:55,180 This is a really key way that people 428 00:21:55,180 --> 00:21:59,620 study the mechanisms of sound production and speech, 429 00:21:59,620 --> 00:22:04,590 and animal vocalizations, and all kinds of signals, 430 00:22:04,590 --> 00:22:06,004 more generally, OK? 431 00:22:06,004 --> 00:22:09,210 All right, any questions about that? 432 00:22:09,210 --> 00:22:09,940 All right. 433 00:22:09,940 --> 00:22:12,550 Now what's really cool is that you can actually 434 00:22:12,550 --> 00:22:15,190 focus on different things in a signal, OK?
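The sliding-window construction of a spectrogram can be sketched in pure Python. The window length plays the role of T and the step plays the role of delta t; the Hann taper and the two-tone test signal are assumptions for illustration, not the speech recording from the lecture.

```python
import cmath
import math

def hann(n_points):
    """Smooth taper (an assumed choice) that goes to zero at the edges."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def spectrogram(x, win_len, step):
    """Return a list of columns. Each column is the power spectrum of one
    tapered window of length win_len; successive windows shift by step."""
    taper = hann(win_len)
    columns = []
    for start in range(0, len(x) - win_len + 1, step):
        seg = [w * v for w, v in zip(taper, x[start:start + win_len])]
        col = [abs(sum(v * cmath.exp(-2j * math.pi * k * m / win_len)
                       for m, v in enumerate(seg))) ** 2
               for k in range(win_len // 2 + 1)]
        columns.append(col)
    return columns

# Hypothetical test signal: 50 Hz for half a second, then 200 Hz.
FS = 1000.0
WIN = 100   # T = 100 ms
STEP = 25   # delta t = 25 ms, so neighbouring windows overlap
x = [math.sin(2 * math.pi * (50.0 if n < 500 else 200.0) * n / FS)
     for n in range(1000)]
S = spectrogram(x, WIN, STEP)

def dominant_hz(col):
    """Frequency of the largest entry in one spectrogram column."""
    return col.index(max(col)) * FS / WIN

print(len(S), "columns;", dominant_hz(S[0]), "Hz first,", dominant_hz(S[-1]), "Hz last")
```

Each entry of `S` is one column of the picture: the spectrum of the signal around one moment in time.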
435 00:22:15,190 --> 00:22:18,550 So for example, if I compute the spectrogram 436 00:22:18,550 --> 00:22:21,760 with signals where that little window that I'm choosing 437 00:22:21,760 --> 00:22:25,990 is really long, then I have high frequency-- 438 00:22:25,990 --> 00:22:28,720 high resolution in frequency, and the spectrogram 439 00:22:28,720 --> 00:22:30,240 looks like this. 440 00:22:30,240 --> 00:22:33,730 But if I compute the spectrogram 441 00:22:33,730 --> 00:22:37,870 with little windows in time that are very short, 442 00:22:37,870 --> 00:22:41,630 then my frequency resolution is very poor, 443 00:22:41,630 --> 00:22:43,910 but the temporal resolution is very high. 444 00:22:43,910 --> 00:22:46,300 And now you can see the spectrum. 445 00:22:46,300 --> 00:22:48,520 You can see these vertical striations. 446 00:22:48,520 --> 00:22:50,410 Those vertical striations correspond 447 00:22:50,410 --> 00:22:52,810 to the individual glottal pulses. 448 00:22:52,810 --> 00:22:57,700 And we can basically see the spectrum of each pulse coming 449 00:22:57,700 --> 00:23:00,940 through the vocal tract. 450 00:23:00,940 --> 00:23:02,110 Pretty cool, right? 451 00:23:02,110 --> 00:23:05,200 So how you compute the spectrum depends on 452 00:23:05,200 --> 00:23:07,390 what you're actually interested in. 453 00:23:07,390 --> 00:23:11,950 If you want to focus on the glottal pulses, 454 00:23:11,950 --> 00:23:13,450 for example, the pitch of the speech, 455 00:23:13,450 --> 00:23:15,580 you look with a long time window. 456 00:23:15,580 --> 00:23:18,030 If you want to focus on the formants, 457 00:23:18,030 --> 00:23:20,770 here you can see the formants very nicely, 458 00:23:20,770 --> 00:23:22,420 you would use a short time window. 459 00:23:25,210 --> 00:23:28,380 Any questions? 460 00:23:28,380 --> 00:23:34,950 So now I'm going to talk more about the kinds of tapers 461 00:23:34,950 --> 00:23:41,690 that you use to get the best possible spectral estimate.
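The trade-off between long and short windows can be put into rough numbers with the sketch below. It is a crude rule of thumb rather than a precise criterion: the default time-bandwidth value of 1.2 is the square-pulse figure implied by the widths quoted earlier (100 milliseconds giving 12 hertz), and the function names are hypothetical.

```python
def frequency_resolution_hz(window_s, time_bandwidth=1.2):
    """Rough width of a spectral peak for a window of this duration.

    The default 1.2 is the square-pulse value implied by the lecture's
    numbers; smoother tapers give somewhat larger values.
    """
    return time_bandwidth / window_s

def what_is_resolved(window_s, pitch_period_s=0.010):
    """Say which glottal structure a spectrogram window can separate."""
    harmonic_spacing_hz = 1.0 / pitch_period_s
    if frequency_resolution_hz(window_s) < harmonic_spacing_hz:
        return "harmonics"  # horizontal striations, good for pitch
    return "pulses"         # vertical striations, one per glottal pulse

for window_s in (0.050, 0.005):
    print(window_s, "s window:", round(frequency_resolution_hz(window_s)),
          "Hz wide peaks, resolves", what_is_resolved(window_s))
```

With a 10 millisecond pitch period, a 50 millisecond window gives peaks narrower than the 100 hertz harmonic spacing, while a 5 millisecond window is shorter than one pitch period and shows each pulse separately.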
462 00:23:41,690 --> 00:23:45,940 So a perfect taper, in a sense, would give you 463 00:23:45,940 --> 00:23:47,860 perfect temporal resolution. 464 00:23:51,470 --> 00:23:54,560 It would give you really fine temporal resolution. 465 00:23:54,560 --> 00:23:58,370 And it would give you really fine frequency resolution, 466 00:23:58,370 --> 00:24:03,050 but because there is a fundamental limit on the time 467 00:24:03,050 --> 00:24:08,270 bandwidth product, you can't measure frequency infinitely 468 00:24:08,270 --> 00:24:12,410 well with an infinitely short sample of a signal. 469 00:24:12,410 --> 00:24:14,420 Imagine you have a sine wave, and you 470 00:24:14,420 --> 00:24:17,550 took like two samples of a sine wave. 471 00:24:17,550 --> 00:24:20,270 It would be really hard to figure out the frequency, 472 00:24:20,270 --> 00:24:23,270 whereas if you have many, many, many samples of a sine wave, 473 00:24:23,270 --> 00:24:24,720 you can figure out the frequency. 474 00:24:24,720 --> 00:24:26,870 So there's a fundamental limit there. 475 00:24:26,870 --> 00:24:29,180 So there's no such thing as a perfect taper. 476 00:24:29,180 --> 00:24:32,270 If I want to take a sample of my signal 477 00:24:32,270 --> 00:24:37,850 in time, if I have a sample that's limited in time, 478 00:24:37,850 --> 00:24:41,610 if it goes from one time to another time 479 00:24:41,610 --> 00:24:44,830 and is zero outside of that, then in frequency, 480 00:24:44,830 --> 00:24:47,130 it's spread out to infinity. 481 00:24:47,130 --> 00:24:51,480 And so all we can do is choose how it is. 482 00:24:51,480 --> 00:24:54,390 We can either have things look worse in time 483 00:24:54,390 --> 00:24:57,180 and better in frequency or better in time and worse 484 00:24:57,180 --> 00:24:58,150 in frequency. 485 00:25:01,170 --> 00:25:04,170 So the other problem is that when we taper a signal, 486 00:25:04,170 --> 00:25:06,570 we're throwing away data here at the edges.
487 00:25:06,570 --> 00:25:10,860 But if you take a square window and you keep all the data 488 00:25:10,860 --> 00:25:13,020 within that square window, well, you've 489 00:25:13,020 --> 00:25:15,390 got all the data in that window. 490 00:25:15,390 --> 00:25:17,520 But as soon as you taper it, you're 491 00:25:17,520 --> 00:25:19,480 throwing away stuff at the edges. 492 00:25:19,480 --> 00:25:21,570 So you taper it to make it smooth 493 00:25:21,570 --> 00:25:25,740 and improve the spectrum, the spectral estimate, 494 00:25:25,740 --> 00:25:28,050 but you're throwing away data. 495 00:25:28,050 --> 00:25:33,110 So you can actually compute the optimal taper. 496 00:25:33,110 --> 00:25:35,060 Here's how you do that. 497 00:25:35,060 --> 00:25:37,110 What we're going to do is we're going 498 00:25:37,110 --> 00:25:40,020 to think of this as what's called 499 00:25:40,020 --> 00:25:41,820 the spectral concentration problem. 500 00:25:41,820 --> 00:25:45,750 We're going to find a function W. 501 00:25:45,750 --> 00:25:49,170 This is a tapering function that is limited 502 00:25:49,170 --> 00:25:54,870 in time from some minus T/2 to plus T/2. 503 00:25:54,870 --> 00:25:57,020 So it's 0 outside of that. 504 00:25:57,020 --> 00:25:59,580 It concentrates the maximum amount 505 00:25:59,580 --> 00:26:02,310 of energy in its Fourier Transform, 506 00:26:02,310 --> 00:26:08,580 in its power spectrum within a window that has width 2W. 507 00:26:08,580 --> 00:26:11,968 So W is this [INAUDIBLE]. 508 00:26:11,968 --> 00:26:13,560 Does that make sense? 509 00:26:13,560 --> 00:26:15,950 We're going to find a function w that 510 00:26:15,950 --> 00:26:21,140 concentrates as much energy as possible in a square window. 511 00:26:24,804 --> 00:26:27,700 And of course, that's going to have the result 512 00:26:27,700 --> 00:26:29,590 that the energy in the side lobes 513 00:26:29,590 --> 00:26:34,030 is going to be as small as possible.
514 00:26:34,030 --> 00:26:35,930 And there are many different optimizations 515 00:26:35,930 --> 00:26:38,150 you can do in principle. 516 00:26:38,150 --> 00:26:41,750 But this particular optimization is about getting as much 517 00:26:41,750 --> 00:26:44,750 of the power as possible into a central lobe. 518 00:26:44,750 --> 00:26:46,460 Here's this function of time. 519 00:26:46,460 --> 00:26:49,850 We simply calculate the Fourier Transform of W. 520 00:26:49,850 --> 00:26:52,920 We call that U of f. 521 00:26:52,920 --> 00:26:55,770 And now we just write down a parameter that 522 00:26:55,770 --> 00:26:58,440 says how much of that Fourier-- 523 00:26:58,440 --> 00:27:03,750 how much of the power in U is in the window from minus 524 00:27:03,750 --> 00:27:08,160 w to w compared to how much power there is in U 525 00:27:08,160 --> 00:27:12,480 overall, over all frequencies. 526 00:27:12,480 --> 00:27:17,200 So if lambda is 1, then all of the power 527 00:27:17,200 --> 00:27:19,680 is between minus w and w. 528 00:27:19,680 --> 00:27:21,250 Does that make sense? 529 00:27:21,250 --> 00:27:26,560 So you can actually solve this optimization problem, maximize 530 00:27:26,560 --> 00:27:30,340 lambda, and what you find is that there's not just one 531 00:27:30,340 --> 00:27:37,490 function that gives very good concentration of the power 532 00:27:37,490 --> 00:27:38,120 into this band. 533 00:27:38,120 --> 00:27:40,360 There's actually a family of functions. 534 00:27:40,360 --> 00:27:43,600 There's actually k of these functions, 535 00:27:43,600 --> 00:27:48,730 where k is twice the bandwidth times the duration 536 00:27:48,730 --> 00:27:50,005 of the window minus 1. 537 00:27:50,005 --> 00:27:52,810 So there is a family of k functions 538 00:27:52,810 --> 00:27:54,970 called Slepian functions for which lambda 539 00:27:54,970 --> 00:27:56,080 is very close to 1.
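Written out, the concentration parameter described above might look like the following. The symbols w, U, W, T and lambda are the ones used in the lecture; the sign and scaling convention of the Fourier transform is an assumption here, and it does not affect the ratio.

```latex
U(f) = \int_{-T/2}^{T/2} w(t)\, e^{-2\pi i f t}\, dt,
\qquad
\lambda = \frac{\int_{-W}^{W} |U(f)|^{2}\, df}{\int_{-\infty}^{\infty} |U(f)|^{2}\, df},
\qquad 0 \le \lambda \le 1 .
```

Maximizing lambda over all tapers w that vanish outside the window is the spectral concentration problem.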
540 00:27:58,870 --> 00:28:02,540 They're also called discrete prolate spheroidal sequences, 541 00:28:02,540 --> 00:28:03,770 dpss. 542 00:28:03,770 --> 00:28:05,760 And that's the command that Matlab uses 543 00:28:05,760 --> 00:28:08,970 to find those functions, dpss. 544 00:28:08,970 --> 00:28:10,420 Here's what they look like. 545 00:28:10,420 --> 00:28:15,000 So these are five functions that give lambda close 546 00:28:15,000 --> 00:28:23,360 to 1 for a particular bandwidth in a particular time 547 00:28:23,360 --> 00:28:23,860 window. 548 00:28:23,860 --> 00:28:26,830 The n equals 1 function is a single peak. 549 00:28:26,830 --> 00:28:30,190 It looks a lot like a Gaussian, but it's not a Gaussian. 550 00:28:30,190 --> 00:28:33,550 What's fundamentally different between this function 551 00:28:33,550 --> 00:28:35,120 and a Gaussian? 552 00:28:35,120 --> 00:28:40,070 This function goes to 0 outside that time window, 553 00:28:40,070 --> 00:28:42,530 whereas a Gaussian goes on forever. 554 00:28:42,530 --> 00:28:47,290 The second Slepian in this family 555 00:28:47,290 --> 00:28:51,255 has a peak, a positive peak in the left half, 556 00:28:51,255 --> 00:28:55,630 a negative peak in the right. 557 00:28:55,630 --> 00:28:59,300 The third one has positive, negative, positive, 558 00:28:59,300 --> 00:29:01,100 and then goes to 0. 559 00:29:01,100 --> 00:29:04,960 And the higher order functions just have more wiggle. 560 00:29:04,960 --> 00:29:09,290 They all have the property that they go to 0 at the edges. 561 00:29:09,290 --> 00:29:11,260 And the other interesting property 562 00:29:11,260 --> 00:29:14,350 is that these functions are all orthogonal to each other. 563 00:29:14,350 --> 00:29:18,730 That means if you multiply this function times that function 564 00:29:18,730 --> 00:29:21,160 and integrate, you get 0.
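The lecture uses Matlab's dpss command. As a rough Python counterpart, SciPy provides `scipy.signal.windows.dpss`, and the sketch below uses it to check the two properties just described: concentration ratios close to 1 and mutual orthogonality. The window length of 512 samples and p = 2.5 are arbitrary illustrative choices, and passing the lecture's p as SciPy's `NW` argument is an assumption about how the two conventions line up.

```python
import numpy as np
from scipy.signal.windows import dpss

n_samples = 512        # window length in samples (illustrative)
p = 2.5                # time-bandwidth product, passed as SciPy's NW (assumption)
k = int(2 * p - 1)     # number of tapers, k = 2p - 1

# Rows of `tapers` are the Slepian tapers; `ratios` are their concentration ratios (lambda).
tapers, ratios = dpss(n_samples, p, Kmax=k, return_ratios=True)

# Sum of products over the window for every pair of tapers.
# Orthogonal tapers give zeros off the diagonal.
gram = tapers @ tapers.T
```

With SciPy's default normalization for multiple tapers, `gram` should come out close to the identity matrix, which is the discrete version of "multiply two of them, integrate, and get 0".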
565 00:29:21,160 --> 00:29:24,490 Multiply any two of these functions and integrate 566 00:29:24,490 --> 00:29:27,980 over the window minus T/2 to plus T/2 567 00:29:27,980 --> 00:29:29,940 the integral [INAUDIBLE] 568 00:29:29,940 --> 00:29:33,720 What that means is that the spectral estimates you 569 00:29:33,720 --> 00:29:37,680 get by windowing your data with each of these functions 570 00:29:37,680 --> 00:29:41,280 separately are statistically independent. 571 00:29:41,280 --> 00:29:43,860 You actually have multiple different estimates 572 00:29:43,860 --> 00:29:45,757 of the spectrum from the same little piece 573 00:29:45,757 --> 00:29:48,270 of [AUDIO OUT] The other cool thing is 574 00:29:48,270 --> 00:29:51,710 that remember the problem with windowing our [AUDIO OUT] 575 00:29:51,710 --> 00:29:54,900 with one peak like this is we were throwing away 576 00:29:54,900 --> 00:29:55,970 data at the edges. 577 00:29:55,970 --> 00:29:59,940 Well, notice that the higher order Slepian functions 578 00:29:59,940 --> 00:30:01,690 have big peaks at the edges. 579 00:30:01,690 --> 00:30:04,290 And so they are actually measuring the spectrum 580 00:30:04,290 --> 00:30:08,860 of the parts of the signal that are at the edge of the window. 581 00:30:08,860 --> 00:30:12,700 Now notice that those functions start crashing into the edges. 582 00:30:12,700 --> 00:30:16,900 So you start getting sharp, sharp edges out here, 583 00:30:16,900 --> 00:30:21,690 which is why the higher order functions have worse ripples 584 00:30:21,690 --> 00:30:24,990 outside that central lobe. 585 00:30:24,990 --> 00:30:26,160 Any questions about that? 586 00:30:30,860 --> 00:30:33,490 It's a lot [AUDIO OUT] Just remember 587 00:30:33,490 --> 00:30:37,390 that for a given width of the window in time 588 00:30:37,390 --> 00:30:41,740 and width in frequency, there are multiple of these functions 589 00:30:41,740 --> 00:30:47,350 that put the maximum amount of power in this window 2W.
590 00:30:54,310 --> 00:30:55,660 So that's great. 591 00:30:55,660 --> 00:30:56,480 So good question. 592 00:30:56,480 --> 00:31:00,260 What would you do if you're trying to measure something 593 00:31:00,260 --> 00:31:02,620 and you measure it five different times, 594 00:31:02,620 --> 00:31:05,410 how would you get an estimate of what the actual number is? 595 00:31:08,990 --> 00:31:12,740 How would you get an error bar on how good your estimate is? 596 00:31:17,320 --> 00:31:19,960 Standard deviation of your estimates, right. 597 00:31:19,960 --> 00:31:21,550 And that's exactly what you do. 598 00:31:21,550 --> 00:31:23,980 So not only can you get a good estimate 599 00:31:23,980 --> 00:31:27,550 of the average spectrum by averaging all of these things 600 00:31:27,550 --> 00:31:29,950 together, but you can actually get an error bar. 601 00:31:29,950 --> 00:31:31,060 And that's really cool. 602 00:31:34,740 --> 00:31:38,080 So here's the procedure that you use. 603 00:31:38,080 --> 00:31:41,700 And this is what's in that little function W spec.m. 604 00:31:44,500 --> 00:31:48,220 So you select a time window of a particular width. 605 00:31:48,220 --> 00:31:49,810 How do you know what width to choose? 606 00:31:55,630 --> 00:31:56,750 That's part of it. 607 00:31:56,750 --> 00:31:59,380 The other thing is if your signal is changing rapidly 608 00:31:59,380 --> 00:32:02,050 in time and you actually care about that change, 609 00:32:02,050 --> 00:32:02,910 you should choose-- 610 00:32:02,910 --> 00:32:05,800 you're more interested in temporal resolution. 611 00:32:05,800 --> 00:32:07,740 If your signal is really constant, 612 00:32:07,740 --> 00:32:09,147 like it doesn't change very fast, 613 00:32:09,147 --> 00:32:10,480 then you can use bigger windows. 614 00:32:13,210 --> 00:32:15,617 So we're going to choose a time width.
615 00:32:15,617 --> 00:32:17,200 Then what you're going to do is you're 616 00:32:17,200 --> 00:32:19,630 going to select this parameter p, which is just 617 00:32:19,630 --> 00:32:22,720 the time-bandwidth product. And if you've already 618 00:32:22,720 --> 00:32:25,870 chosen T, what you're doing is you're just choosing 619 00:32:25,870 --> 00:32:27,634 the frequency resolution. 620 00:32:30,590 --> 00:32:35,090 Once you compute p and you know T, 621 00:32:35,090 --> 00:32:38,940 you just stuff those numbers into this Matlab function 622 00:32:38,940 --> 00:32:46,280 dpss, which sends back to you this set of functions here. 623 00:32:46,280 --> 00:32:50,540 It sends you back k of those functions, where once you've 624 00:32:50,540 --> 00:32:54,130 chosen p, k is just 2p minus 1. 625 00:32:54,130 --> 00:32:55,130 And then what do you do? 626 00:32:55,130 --> 00:32:57,740 You just take your little snippet of data. 627 00:32:57,740 --> 00:33:01,520 You multiply it by the first taper, compute the spectrum, 628 00:33:01,520 --> 00:33:04,810 compute the Fourier Transform. 629 00:33:04,810 --> 00:33:07,460 And then take your little piece of data, 630 00:33:07,460 --> 00:33:10,720 multiply it by the second one, compute the Fourier transform, 631 00:33:10,720 --> 00:33:11,950 and the power spectrum. 632 00:33:11,950 --> 00:33:15,520 And then you're just going to average. 633 00:33:15,520 --> 00:33:21,190 This square magnitude should be inside the window, 634 00:33:21,190 --> 00:33:22,360 your piece of data. 635 00:33:22,360 --> 00:33:26,170 You Fourier transform it, square magnitude, and then average 636 00:33:26,170 --> 00:33:28,720 all those spectra together. 637 00:33:28,720 --> 00:33:30,970 You see, this is the Fourier transform right here. 638 00:33:30,970 --> 00:33:32,740 This sum is the Fourier transform. 639 00:33:32,740 --> 00:33:35,830 We square magnitude that to get the spectral estimate 640 00:33:35,830 --> 00:33:37,810 of that particular sample.
641 00:33:37,810 --> 00:33:40,840 Then we're going to average that spectrum together 642 00:33:40,840 --> 00:33:46,330 for all the different windowing tapering functions. 643 00:33:46,330 --> 00:33:49,780 Now you get then multiple spectral estimates. 644 00:33:49,780 --> 00:33:52,960 You're going to average them together to get the mean. 645 00:33:52,960 --> 00:33:56,020 And you can also look at the variance 646 00:33:56,020 --> 00:33:57,799 to get the standard deviation. 647 00:34:04,160 --> 00:34:04,985 Questions? 648 00:34:04,985 --> 00:34:05,780 Let's stop there. 649 00:34:05,780 --> 00:34:07,880 That was a lot of stuff. 650 00:34:07,880 --> 00:34:14,230 Let's take a breath and [INAUDIBLE] to see whether we 651 00:34:14,230 --> 00:34:20,086 [AUDIO OUT] Questions? 652 00:34:27,394 --> 00:34:27,894 No. 653 00:34:34,560 --> 00:34:35,960 Don't worry about it. 654 00:34:35,960 --> 00:34:40,670 This is representation of the Fourier transform. 655 00:34:40,670 --> 00:34:44,050 You sum over all the time samples. 656 00:34:44,050 --> 00:34:48,260 This, you will just do as a fast Fourier transform. 657 00:34:48,260 --> 00:34:52,030 So you'll take the data, multiply it 658 00:34:52,030 --> 00:34:58,150 by this taper function, which is the Slepian, 659 00:34:58,150 --> 00:35:00,220 and then do the Fourier transform, take 660 00:35:00,220 --> 00:35:02,350 the square magnitude. 661 00:35:02,350 --> 00:35:06,010 We just want to make sure that we've got the basic idea. 662 00:35:06,010 --> 00:35:09,730 So you've got a long piece of data. 663 00:35:09,730 --> 00:35:17,390 You're going to choose some time window, capital T. You're 664 00:35:17,390 --> 00:35:24,940 going to choose a bandwidth W or this time-bandwidth product p. 665 00:35:24,940 --> 00:35:29,010 Send T and p to this dpss function. 666 00:35:29,010 --> 00:35:34,410 It will send you back a bunch of these dpss functions that 667 00:35:34,410 --> 00:35:35,867 fit in that window.
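The procedure just described (taper the same piece of data with each Slepian function, Fourier transform, take the square magnitude, then take the mean and standard deviation across tapers) can be sketched in Python as follows. This is not the course's Matlab function; `multitaper_spectrum` is a hypothetical name, the test signal is synthetic, and passing p as the `NW` argument of SciPy's `dpss` is an assumption.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(segment, fs, p):
    """Mean and standard deviation of the tapered power spectra of one window."""
    k = int(2 * p - 1)                       # number of tapers, k = 2p - 1
    tapers = dpss(len(segment), p, Kmax=k)   # shape (k, len(segment))
    # Multiply the same data by each taper, Fourier transform, square magnitude.
    spectra = np.abs(np.fft.rfft(tapers * segment, axis=1)) ** 2
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    # Average across tapers for the estimate; the spread gives an error bar.
    return freqs, spectra.mean(axis=0), spectra.std(axis=0)

fs = 10000.0
t = np.arange(1000) / fs                     # one 100 ms window
rng = np.random.default_rng(0)
segment = np.sin(2 * np.pi * 1000.0 * t) + 0.1 * rng.standard_normal(t.size)
freqs, mean_spec, std_spec = multitaper_spectrum(segment, fs, p=2.5)
```

To build a spectrogram, the same function would be applied to each overlapping window in turn, with `mean_spec` becoming one column.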
668 00:35:35,867 --> 00:35:37,700 Now you're going to take your piece of data. 669 00:35:37,700 --> 00:35:40,440 You're going to break it into little windows of that length, 670 00:35:40,440 --> 00:35:43,860 multiply them by each one of the Slepian functions. 671 00:35:43,860 --> 00:35:47,570 Do the Fourier transform of each one of those products. 672 00:35:47,570 --> 00:35:49,670 Average them all altogether. 673 00:35:49,670 --> 00:35:52,250 Take the square magnitude of each one to get the spectrum, 674 00:35:52,250 --> 00:35:55,410 and then average all those spectra together. 675 00:35:55,410 --> 00:36:00,880 So now so what does p do? p chooses 676 00:36:00,880 --> 00:36:06,630 the bandwidth of the Slepian function in that window. 677 00:36:06,630 --> 00:36:11,260 So if you have a window that's 100 milliseconds wide-- 678 00:36:11,260 --> 00:36:14,280 so we're going to take our data and break it 679 00:36:14,280 --> 00:36:16,830 into little pieces that are 100 milliseconds long. 680 00:36:16,830 --> 00:36:21,580 It goes from minus 50 to plus 50 milliseconds. 681 00:36:21,580 --> 00:36:28,290 Choose a window that has a narrow bandwidth, 682 00:36:28,290 --> 00:36:32,170 a small p, then the bandwidth is narrow. 683 00:36:32,170 --> 00:36:35,200 The bandwidth is narrow, because the function is wide, 684 00:36:35,200 --> 00:36:37,060 or you can choose a large bandwidth. 685 00:36:37,060 --> 00:36:38,320 What does that mean? 686 00:36:38,320 --> 00:36:42,560 It's a narrower function in time. 687 00:36:42,560 --> 00:36:46,650 Now if p is 5, you have a broader bandwidth. 688 00:36:46,650 --> 00:36:49,340 And that means that the window, the tapering function 689 00:36:49,340 --> 00:36:53,020 is narrower in time. 690 00:36:53,020 --> 00:36:55,990 Look at the Fourier transform of each of two 691 00:36:55,990 --> 00:36:57,280 different tapering functions.
692 00:36:57,280 --> 00:37:00,970 You can see that if p equals 1.5, 693 00:37:00,970 --> 00:37:03,250 the tapering function is broad. 694 00:37:03,250 --> 00:37:11,370 But that Fourier transform, a kernel in frequency space 695 00:37:11,370 --> 00:37:13,460 is narrower. 696 00:37:13,460 --> 00:37:18,330 Take the p equals 5 function, a broader bandwidth, 697 00:37:18,330 --> 00:37:22,530 it's narrower in time and broader in frequency. 698 00:37:22,530 --> 00:37:25,650 Does that make sense? 699 00:37:25,650 --> 00:37:30,290 p just for a given size time window 700 00:37:30,290 --> 00:37:34,610 tells you how many different samples 701 00:37:34,610 --> 00:37:36,710 we're going to take within that time window. 702 00:37:39,780 --> 00:37:41,610 No. 703 00:37:41,610 --> 00:37:44,950 So let me just go back to this example right here. 704 00:37:44,950 --> 00:37:48,120 So I took this speech signal that I just showed you 705 00:37:48,120 --> 00:37:50,500 that was recorded on the microphone. 706 00:37:50,500 --> 00:37:53,170 I chose a time window of 50 milliseconds. 707 00:37:53,170 --> 00:37:54,910 So I broke the speech signal down 708 00:37:54,910 --> 00:37:57,280 into little 50 millisecond chunks. 709 00:37:57,280 --> 00:38:00,610 I chose a bandwidth of 60 hertz. 710 00:38:00,610 --> 00:38:06,430 That corresponds to p equals 1.5 and k equals 2. 711 00:38:06,430 --> 00:38:09,010 That gives me back a bunch of these little functions. 712 00:38:09,010 --> 00:38:12,090 And I computed this spectrogram. 713 00:38:12,090 --> 00:38:15,250 For this spectrogram, I chose a shorter time window, eight 714 00:38:15,250 --> 00:38:21,130 milliseconds, chose a bandwidth of 375 hertz, which 715 00:38:21,130 --> 00:38:24,550 also corresponds to p equals 1.5 and k equals 2. 716 00:38:24,550 --> 00:38:29,130 And if you compute the spectrogram 717 00:38:29,130 --> 00:38:33,210 with those parameters, you get this example right here.
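The two parameter sets quoted above can be checked with a little arithmetic. The sketch assumes that the quoted bandwidth (60 Hz, 375 Hz) is the full width 2W, so that p = T x W = T x bandwidth / 2. That reading reproduces the stated p = 1.5 and k = 2, but it is an inference from the numbers rather than something the lecture states; `taper_parameters` is a hypothetical helper name.

```python
def taper_parameters(window_s, bandwidth_hz):
    """Return (p, k) for a window duration in seconds and a full bandwidth in Hz."""
    # Assumption: the quoted bandwidth is the full width 2W, so p = T * W.
    p = window_s * bandwidth_hz / 2.0
    k = int(round(2 * p - 1))      # number of tapers, k = 2p - 1
    return p, k

long_window = taper_parameters(0.050, 60.0)     # 50 ms window, 60 Hz bandwidth
short_window = taper_parameters(0.008, 375.0)   # 8 ms window, 375 Hz bandwidth
```

Both calls give the same time-bandwidth product: shrinking the window from 50 ms to 8 ms is paid for by widening the bandwidth from 60 Hz to 375 Hz.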
718 00:38:33,210 --> 00:38:36,840 So in this case, I kept the same p, 719 00:38:36,840 --> 00:38:43,190 the same time-bandwidth product, but I made the time shorter. 720 00:38:43,190 --> 00:38:45,690 So the best way to do this, when you're actually doing this, 721 00:38:45,690 --> 00:38:48,050 practically is just to take a signal 722 00:38:48,050 --> 00:38:51,040 and try some of these different things [INAUDIBLE] 723 00:38:51,040 --> 00:38:52,600 That's really the best way to do it. 724 00:38:52,600 --> 00:38:58,900 You can't-- I don't recommend trying to think through 725 00:38:58,900 --> 00:39:02,543 beforehand too much exactly what it's going to look like if you 726 00:39:02,543 --> 00:39:04,960 choose these different values when it's easier just to try 727 00:39:04,960 --> 00:39:06,793 different things and see what it looks like. 728 00:39:09,440 --> 00:39:09,940 Yes. 729 00:39:09,940 --> 00:39:14,020 AUDIENCE: What are we looking for? 730 00:39:14,020 --> 00:39:16,300 MICHALE FEE: Well, it depends on what you're 731 00:39:16,300 --> 00:39:18,460 trying to get out of the data. 732 00:39:18,460 --> 00:39:21,130 If you want to visualize formants, 733 00:39:21,130 --> 00:39:24,220 you can see that the formants are much clearer. 734 00:39:24,220 --> 00:39:29,440 These different windows give you a different view on the data. 735 00:39:29,440 --> 00:39:31,300 So just look through different windows 736 00:39:31,300 --> 00:39:34,492 and see what looks interesting in the results. 737 00:39:34,492 --> 00:39:35,700 That's the best way to do it. 738 00:39:41,210 --> 00:39:43,190 So I just want to say one more word 739 00:39:43,190 --> 00:39:48,180 about this time-bandwidth product. 740 00:39:48,180 --> 00:39:55,320 So the time-bandwidth product of any function is greater than 1. 741 00:39:55,320 --> 00:40:00,260 So you can make time shorter, but bandwidth is worse.
742 00:40:00,260 --> 00:40:01,960 The way that you can think about this 743 00:40:01,960 --> 00:40:05,620 is that you're sort of looking at your data 744 00:40:05,620 --> 00:40:12,550 through a window in time and frequency. 745 00:40:12,550 --> 00:40:16,390 What you want is to look with infinitely fine resolution 746 00:40:16,390 --> 00:40:19,540 in both time and frequency, but really you 747 00:40:19,540 --> 00:40:23,200 can't have infinite time and frequency resolution. 748 00:40:23,200 --> 00:40:28,540 You're going to be smearing your view of the data 749 00:40:28,540 --> 00:40:32,650 with something that has a minimum area, 750 00:40:32,650 --> 00:40:37,300 the time-bandwidth product which has a minimum size of 1. 751 00:40:40,080 --> 00:40:45,740 You can either make time small and stretch the bandwidth out, 752 00:40:45,740 --> 00:40:49,220 or you can stretch out time and make the bandwidth shorter 753 00:40:49,220 --> 00:40:52,730 or make time short and make the bandwidth long. 754 00:40:52,730 --> 00:40:56,150 But you can't squeeze both, because of 755 00:40:56,150 --> 00:41:00,410 this fundamental limit on the time-bandwidth product. 756 00:41:00,410 --> 00:41:02,120 This all depends on how you're measuring 757 00:41:02,120 --> 00:41:04,190 the time and the bandwidth. 758 00:41:04,190 --> 00:41:06,620 These are kind of funny shaped functions. 759 00:41:06,620 --> 00:41:09,860 So there are different ways of measuring 760 00:41:09,860 --> 00:41:11,420 what the bandwidth of a signal is 761 00:41:11,420 --> 00:41:14,300 or what the time width of a signal is. 762 00:41:14,300 --> 00:41:19,090 Now, the windows that you're looking in time and frequency 763 00:41:19,090 --> 00:41:23,340 with are the smallest time-bandwidth product.
764 00:41:23,340 --> 00:41:25,810 So notice that if the time-bandwidth product is 765 00:41:25,810 --> 00:41:30,540 small, close to 1, the number of tapers 766 00:41:30,540 --> 00:41:35,100 you get in this dpss, this family of functions you get 767 00:41:35,100 --> 00:41:36,190 is just 1. 768 00:41:36,190 --> 00:41:40,790 If p is 1, then k is 2p minus 1, which is 1. 769 00:41:40,790 --> 00:41:43,350 So you only get one window. 770 00:41:43,350 --> 00:41:47,390 You only get one estimate of the spectrum, 771 00:41:47,390 --> 00:41:51,760 but you can also choose to look at your data 772 00:41:51,760 --> 00:41:59,170 with worse, [AUDIO OUT] that have a worse time frequency 773 00:41:59,170 --> 00:42:00,800 time-bandwidth product. 774 00:42:00,800 --> 00:42:02,240 Why would you do that? 775 00:42:02,240 --> 00:42:04,270 Why would you ever look at your data 776 00:42:04,270 --> 00:42:07,120 with functions that have a worse time-bandwidth product? 777 00:42:11,850 --> 00:42:17,160 Well, notice that if the time-bandwidth product is 2, 778 00:42:17,160 --> 00:42:19,440 how many functions do you have? 779 00:42:19,440 --> 00:42:23,030 Why does that matter? Because now you 780 00:42:23,030 --> 00:42:27,020 have three independent estimates of what that spectrum is. 781 00:42:30,320 --> 00:42:33,830 So sometimes you would gladly choose 782 00:42:33,830 --> 00:42:37,910 to have a worse resolution in time and frequency, 783 00:42:37,910 --> 00:42:40,520 because having more independent estimates means 784 00:42:40,520 --> 00:42:41,340 a better estimate. 785 00:42:41,340 --> 00:42:44,090 So sometimes your signal might be changing very slowly. 786 00:42:44,090 --> 00:42:47,200 And then you can use a big time-bandwidth product. 787 00:42:47,200 --> 00:42:49,780 It doesn't matter. 788 00:42:49,780 --> 00:42:52,870 Sometimes your signal is changing very rapidly in time. 789 00:42:52,870 --> 00:42:57,680 And so you want to keep the time-bandwidth product small.
790 00:42:57,680 --> 00:43:03,490 Does that begin to [INAUDIBLE] bigger time-bandwidth products 791 00:43:03,490 --> 00:43:08,000 and now you get even more independent estimates. 792 00:43:08,000 --> 00:43:15,320 Most typically, you choose p's that go from 1.5 to multiples 793 00:43:15,320 --> 00:43:18,805 of 0.5, because then you have an integer number of k's. 794 00:43:22,362 --> 00:43:26,780 But usually, you choose p equals 1.5 or higher 795 00:43:26,780 --> 00:43:29,210 in multiples of 0.5. 796 00:43:29,210 --> 00:43:32,590 If you really care about temporal resolution 797 00:43:32,590 --> 00:43:36,490 and frequency resolution, you want that box 798 00:43:36,490 --> 00:43:38,110 that's smearing out your spectrum 799 00:43:38,110 --> 00:43:40,210 to be as small as possible. 800 00:43:40,210 --> 00:43:44,680 Small as possible means it has an area of 1. 801 00:43:44,680 --> 00:43:48,850 That's the minimum area it can have, 802 00:43:48,850 --> 00:43:51,720 but that only gives you one taper. 803 00:43:51,720 --> 00:43:55,800 But if you really care about both temporal and frequency-- 804 00:43:55,800 --> 00:43:58,498 time and frequency resolution, then that's 805 00:43:58,498 --> 00:43:59,790 the trade-off you have to make. 806 00:44:02,830 --> 00:44:07,890 Or you can smear out more in time, maybe more in frequency, 807 00:44:07,890 --> 00:44:10,710 and you can choose your time-bandwidth product, 808 00:44:10,710 --> 00:44:13,218 in which case you get more tapers and a better 809 00:44:13,218 --> 00:44:14,260 estimate of the spectrum. 810 00:44:21,970 --> 00:44:27,600 So this is state of the art spectral estimation. 811 00:44:27,600 --> 00:44:31,610 It doesn't get better than this. 812 00:44:31,610 --> 00:44:37,690 To put it like this, you're doing it the best possible way, 813 00:44:37,690 --> 00:44:38,880 a bit of digesting. 814 00:44:49,810 --> 00:44:53,950 So let's spend the rest of the lecture 815 00:44:53,950 --> 00:44:56,260 today talking about filtering.
816 00:44:56,260 --> 00:44:59,935 So Matlab has a bunch of really powerful filtering tools. 817 00:45:01,610 --> 00:45:03,700 So here's an example of the kind of thing 818 00:45:03,700 --> 00:45:04,910 where we use filtering. 819 00:45:04,910 --> 00:45:09,770 So this is a [INAUDIBLE] finch song recorded in the lab. 820 00:45:20,704 --> 00:45:22,890 OK, so now I want you to just listen-- 821 00:45:22,890 --> 00:45:24,690 so you were probably listening to the song, 822 00:45:24,690 --> 00:45:27,690 but now listen at very low frequencies. 823 00:45:30,880 --> 00:45:31,810 Tell me what you hear. 824 00:45:31,810 --> 00:45:33,490 Listen at very low frequen-- 825 00:45:37,098 --> 00:45:37,765 [AUDIO PLAYBACK] 826 00:45:37,765 --> 00:45:39,665 [FINCH CHIRPING] 827 00:45:39,665 --> 00:45:40,540 [END PLAYBACK] 828 00:45:40,540 --> 00:45:41,040 [HUMS] 829 00:45:42,380 --> 00:45:46,100 --background, that's hum from the building's air 830 00:45:46,100 --> 00:45:48,220 conditioners, air handling. 831 00:45:48,220 --> 00:45:58,690 It all makes this low rumbling, which adds a lot of noise 832 00:45:58,690 --> 00:46:01,060 to the signal that can make it hard to see 833 00:46:01,060 --> 00:46:06,486 where the syllables are in the [AUDIO OUT] time series. 834 00:46:06,486 --> 00:46:09,227 Here are the syllables right here. 835 00:46:09,227 --> 00:46:10,560 And that's the background noise. 836 00:46:10,560 --> 00:46:13,170 But the background noise is at very low frequencies. 837 00:46:13,170 --> 00:46:16,100 So sometimes you want to just filter stuff like that 838 00:46:16,100 --> 00:46:19,670 away because we don't care about the air conditioner. 839 00:46:19,670 --> 00:46:21,750 We care about the bird's song. 840 00:46:21,750 --> 00:46:25,940 OK, so we can get rid of that by applying-- what kind of filter 841 00:46:25,940 --> 00:46:28,400 would we apply to this signal to get rid 842 00:46:28,400 --> 00:46:31,968 of these low frequencies? 
843 00:46:31,968 --> 00:46:32,468 [INAUDIBLE] 844 00:46:32,468 --> 00:46:33,430 AUDIENCE: A high pass. 845 00:46:33,430 --> 00:46:35,440 MICHALE FEE: A high-pass filter, very good. 846 00:46:35,440 --> 00:46:38,110 OK, so let's put a high pass filter on this. 847 00:46:38,110 --> 00:46:39,820 Now, in the past, previously we've 848 00:46:39,820 --> 00:46:44,560 talked about using convolution to carry out a high pass 849 00:46:44,560 --> 00:46:45,400 filtering function. 850 00:46:45,400 --> 00:46:48,010 But Matlab has all these very powerful tools. 851 00:46:48,010 --> 00:46:50,410 So I wanted to show you what those look like 852 00:46:50,410 --> 00:46:52,490 and how to use them. 853 00:46:52,490 --> 00:46:55,150 OK, so this is a little piece of code 854 00:46:55,150 --> 00:46:58,070 that implements a high-pass filter on that signal. 855 00:46:58,070 --> 00:47:00,310 Now, you can see that all of that low frequency stuff 856 00:47:00,310 --> 00:47:04,630 is [AUDIO OUT] You have a nice clean, silent background. 857 00:47:04,630 --> 00:47:06,430 And now you can see the syllables 858 00:47:06,430 --> 00:47:08,210 on top of that background. 859 00:47:08,210 --> 00:47:09,290 Here's the spectrogram. 860 00:47:09,290 --> 00:47:12,160 You can see that all of that low frequency stuff is gone. 861 00:47:12,160 --> 00:47:16,910 And this is a little bit of sample code here. 862 00:47:16,910 --> 00:47:18,970 I just want to point out a few things. 863 00:47:18,970 --> 00:47:21,760 You give it the Nyquist frequency, which is just 864 00:47:21,760 --> 00:47:23,490 the sampling rate divided by 2. 865 00:47:23,490 --> 00:47:25,990 I'll explain later what that means. 866 00:47:25,990 --> 00:47:27,730 You set a cutoff frequency. 867 00:47:27,730 --> 00:47:35,660 So you tell it to cut off below 500 hertz. 
868 00:47:35,660 --> 00:47:40,440 You put the cutoff and Nyquist frequency together, 869 00:47:40,440 --> 00:47:42,300 you get a ratio of those two that's 870 00:47:42,300 --> 00:47:45,480 basically the fraction of the spectral width 871 00:47:45,480 --> 00:47:47,010 that you're going to cut off. 872 00:47:47,010 --> 00:47:50,880 And then you tell it to give you the parameters 873 00:47:50,880 --> 00:47:52,710 for a Butterworth filter. 874 00:47:52,710 --> 00:47:55,670 It's just one of the kinds of filters that you use. 875 00:47:55,670 --> 00:47:59,190 Tell it whether it's a high-pass or low-pass. 876 00:47:59,190 --> 00:48:04,200 Send that filter, those filter parameters, to this function 877 00:48:04,200 --> 00:48:08,955 called filtfilt. You give it these two parameters, B and A, 878 00:48:08,955 --> 00:48:11,340 and your data vector. 879 00:48:11,340 --> 00:48:14,610 And when you run that, that's what the result looks like, OK? 880 00:48:14,610 --> 00:48:17,924 Let me play that again for you after filtering. 881 00:48:28,310 --> 00:48:32,905 All that low frequency hum is gone. 882 00:48:32,905 --> 00:48:37,790 All right, so here's an example of what-- 883 00:48:37,790 --> 00:48:40,060 I mean, we would never actually do this in the lab. 884 00:48:40,060 --> 00:48:43,160 But this is what it would look like if you wanted to emphasize 885 00:48:43,160 --> 00:48:44,330 that low frequency stuff. 886 00:48:44,330 --> 00:48:48,740 Let's say that you're the air conditioner technician who 887 00:48:48,740 --> 00:48:50,270 comes and wants to figure out what's 888 00:48:50,270 --> 00:48:51,562 wrong with the air conditioner. 889 00:48:51,562 --> 00:48:54,290 And it turns out that the way it sounds really is helpful. 890 00:48:54,290 --> 00:48:57,440 So you now do a low-pass filter. 891 00:48:57,440 --> 00:48:59,570 And you're going to keep the low frequency part.
892 00:48:59,570 --> 00:49:02,690 Because all those annoying birds are making it hard 893 00:49:02,690 --> 00:49:05,310 for you to hear what's wrong with the air conditioner. 894 00:49:05,310 --> 00:49:07,355 OK, so here's-- 895 00:49:15,991 --> 00:49:17,530 Didn't quite get rid of the birds. 896 00:49:17,530 --> 00:49:21,900 But now you can hear the low frequency stuff much better. 897 00:49:21,900 --> 00:49:26,190 OK, all right, so now we just did that by, again, 898 00:49:26,190 --> 00:49:27,290 giving it the Nyquist. 899 00:49:27,290 --> 00:49:30,540 The cutoff, we're going to cut off above 2,000, 900 00:49:30,540 --> 00:49:33,210 pass below 2,000. 901 00:49:33,210 --> 00:49:37,350 We're going to tell it to use a Butterworth filter, now 902 00:49:37,350 --> 00:49:38,610 low-pass. 903 00:49:38,610 --> 00:49:42,030 And again, we just pass it the parameters and the data. 904 00:49:42,030 --> 00:49:43,800 And it sends us back the filtered data. 905 00:49:48,390 --> 00:49:51,450 OK, you can also do a band-pass. 906 00:49:51,450 --> 00:49:56,480 OK, so a band-pass does a high-pass and a low-pass 907 00:49:56,480 --> 00:49:57,540 together. 908 00:49:57,540 --> 00:50:00,530 Now you're filtering out everything above some number 909 00:50:00,530 --> 00:50:02,180 and below some number. 910 00:50:02,180 --> 00:50:04,800 And here we give it a cutoff with two numbers. 911 00:50:04,800 --> 00:50:07,040 So it's going to cut off everything 912 00:50:07,040 --> 00:50:11,030 below 4 kilohertz and everything above 5 kilohertz. 913 00:50:11,030 --> 00:50:15,380 Again, we use the Butterworth filter. 914 00:50:15,380 --> 00:50:18,235 You leave off the tag to get a band-pass filter. 915 00:50:18,235 --> 00:50:19,610 And here's what that sounds like. 916 00:50:24,787 --> 00:50:25,454 [AUDIO PLAYBACK] 917 00:50:25,454 --> 00:50:32,272 [BIRDS CHIRPING] 918 00:50:32,272 --> 00:50:34,707 [END PLAYBACK] 919 00:50:34,707 --> 00:50:37,100 And that's a band-pass filter.
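[Editor's sketch] The Matlab code shown on the slides is not part of this transcript. The steps described above (divide the cutoff by the Nyquist frequency, get Butterworth coefficients B and A, then filter forward and backward) can be sketched roughly in plain Python. Everything here is an assumption made for illustration: the helper names `butter2`, `one_pass` and `forward_backward`, the fixed second-order design, the 8 kHz sampling rate, and the test tones. It also skips the edge handling that a library routine such as Matlab's `filtfilt` adds.

```python
import math

def butter2(cutoff_hz, fs_hz, kind):
    """Second-order Butterworth coefficients (b, a) via the bilinear transform."""
    nyquist = fs_hz / 2.0                 # Nyquist frequency = sampling rate / 2
    wn = cutoff_hz / nyquist              # cutoff as a fraction of the Nyquist frequency
    k = math.tan(math.pi * wn / 2.0)      # pre-warped cutoff
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    if kind == "low":
        b = [k * k * norm, 2.0 * k * k * norm, k * k * norm]
    elif kind == "high":
        b = [norm, -2.0 * norm, norm]
    else:
        raise ValueError("kind must be 'low' or 'high'")
    a = [1.0, 2.0 * (k * k - 1.0) * norm, (1.0 - math.sqrt(2.0) * k + k * k) * norm]
    return b, a

def one_pass(b, a, x):
    """Causal second-order recursive filter with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = b[0] * x[n]
        if n >= 1:
            acc += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            acc += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(acc)
    return y

def forward_backward(b, a, x):
    """Filter forward, then backward, so the two phase delays cancel."""
    forward = one_pass(b, a, x)
    backward = one_pass(b, a, forward[::-1])
    return backward[::-1]

fs = 8000.0                                                      # assumed sampling rate (Hz)
t = [i / fs for i in range(2000)]
hum = [math.sin(2 * math.pi * 60.0 * ti) for ti in t]            # low-frequency hum
song = [0.2 * math.sin(2 * math.pi * 3000.0 * ti) for ti in t]   # stand-in for a syllable
mixed = [h + s for h, s in zip(hum, song)]

b_high, a_high = butter2(500.0, fs, "high")    # cut off below 500 Hz
cleaned = forward_backward(b_high, a_high, mixed)

b_low, a_low = butter2(2000.0, fs, "low")      # pass below 2,000 Hz
hum_only = forward_backward(b_low, a_low, mixed)
```

Away from the ends of the record, `cleaned` tracks the 3 kHz tone with no time shift, which is the point of running the filter in both directions. Applying the high-pass and then the low-pass to the same data would give a crude band-pass, in the spirit of "a band-pass does a high-pass and a low-pass together".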
920 00:50:37,100 --> 00:50:38,090 Questions? 921 00:50:47,990 --> 00:50:50,480 Yeah, so there are many different ways 922 00:50:50,480 --> 00:50:53,240 to do this kind of filtering. 923 00:50:53,240 --> 00:50:56,570 Daniel, do you know how filtfilt actually implements this? 924 00:50:56,570 --> 00:51:00,350 Because Matlab has a bunch of different filtering functions. 925 00:51:00,350 --> 00:51:02,928 And this is just one of them. 926 00:51:02,928 --> 00:51:09,200 [INAUDIBLE] how it's actually implemented [AUDIO OUT] 927 00:51:09,200 --> 00:51:11,000 Right, so there's a filter function, 928 00:51:11,000 --> 00:51:14,750 which actually does a convolution in one direction. 929 00:51:14,750 --> 00:51:18,080 And filtfilt does the convolution one direction 930 00:51:18,080 --> 00:51:20,500 and then the other direction. 931 00:51:20,500 --> 00:51:22,960 And what that does is [AUDIO OUT] 932 00:51:22,960 --> 00:51:27,140 output center with respect to the input, centered [AUDIO OUT] 933 00:51:27,140 --> 00:51:29,460 Anyway, there are different ways of doing it. 934 00:51:29,460 --> 00:51:32,780 And the nice thing about-- 935 00:51:32,780 --> 00:51:33,280 yeah? 936 00:51:33,280 --> 00:51:39,468 AUDIENCE: [INAUDIBLE] 937 00:51:39,468 --> 00:51:44,210 MICHALE FEE: Well, for the bird data, 938 00:51:44,210 --> 00:51:46,838 it doesn't necessarily make all that much sense, 939 00:51:46,838 --> 00:51:47,880 right, on the face of it? 940 00:51:47,880 --> 00:51:49,710 But there are applications where there's 941 00:51:49,710 --> 00:51:52,400 some signal at that particular band. 942 00:51:52,400 --> 00:51:55,070 So for example, let's say you had a speech signal 943 00:51:55,070 --> 00:51:57,500 and you wanted to find out when the formants cross 944 00:51:57,500 --> 00:51:58,790 a certain frequency.
945 00:51:58,790 --> 00:52:03,990 Let's say you wanted to find out if somebody could learn 946 00:52:03,990 --> 00:52:09,700 to speak [AUDIO OUT] if you blocked one of their formants 947 00:52:09,700 --> 00:52:12,340 whenever it comes through a particular frequency. 948 00:52:12,340 --> 00:52:15,070 OK, so let's say I have my second formant 949 00:52:15,070 --> 00:52:17,170 and every time it crosses 2 kilohertz 950 00:52:17,170 --> 00:52:18,370 I play a burst of noise. 951 00:52:18,370 --> 00:52:21,850 And I ask, can I understand if I've knocked 952 00:52:21,850 --> 00:52:24,165 out that particular formant? 953 00:52:24,165 --> 00:52:25,790 I don't know why you'd want to do that. 954 00:52:25,790 --> 00:52:27,880 But maybe it's fun, right? 955 00:52:27,880 --> 00:52:29,720 So I don't know, it might be kind of cool. 956 00:52:29,720 --> 00:52:31,637 So then you would run a band-pass filter right 957 00:52:31,637 --> 00:52:34,240 over the 2-kilohertz band. 958 00:52:34,240 --> 00:52:37,180 And now, you'd get a big signal whenever that formant passed 959 00:52:37,180 --> 00:52:39,037 through that band, right? 960 00:52:39,037 --> 00:52:40,870 And then you would send that to an amplifier 961 00:52:40,870 --> 00:52:44,430 and play a noise burst into the person's ear. 962 00:52:44,430 --> 00:52:47,390 All right, we do things like that with birds 963 00:52:47,390 --> 00:52:51,650 to find out if they can learn to shift the pitch of their song 964 00:52:51,650 --> 00:52:53,510 in response to errors. 965 00:52:53,510 --> 00:52:56,919 OK, so yes, they can. 966 00:52:56,919 --> 00:52:57,419 Yes-- 967 00:52:57,419 --> 00:53:00,320 AUDIENCE: [INAUDIBLE] 968 00:53:00,320 --> 00:53:02,300 MICHALE FEE: Formants are the peaks 969 00:53:02,300 --> 00:53:05,780 in the filter that's formed by your vocal tract 970 00:53:05,780 --> 00:53:11,720 by the [AUDIO OUT] air channel from your glottis to your lips.
971 00:53:17,090 --> 00:53:24,190 The location of those peaks changes [INAUDIBLE] So ahh, 972 00:53:24,190 --> 00:53:26,260 ooh, the difference between those 973 00:53:26,260 --> 00:53:30,880 is just the location of those formant peaks. 974 00:53:30,880 --> 00:53:34,135 All those things just have formants at different locations. 975 00:53:34,135 --> 00:53:46,342 AUDIENCE: [INAUDIBLE] 976 00:53:46,342 --> 00:53:49,860 MICHALE FEE: So explain a little bit more what 977 00:53:49,860 --> 00:53:52,230 you mean by analog interference. 978 00:53:52,230 --> 00:53:55,429 AUDIENCE: [INAUDIBLE] 979 00:53:55,429 --> 00:53:58,540 MICHALE FEE: Oh, OK, like 60 hertz. 980 00:53:58,540 --> 00:54:00,220 OK, so that's a great question. 981 00:54:00,220 --> 00:54:02,940 So let's say that you're doing an experiment. 982 00:54:02,940 --> 00:54:05,460 And you [AUDIO OUT] contamination 983 00:54:05,460 --> 00:54:10,410 of your signal by 60 hertz noise from the outlet. 984 00:54:10,410 --> 00:54:13,350 OK, it's really better to spend the time to figure out 985 00:54:13,350 --> 00:54:15,390 how to get rid of that noise. 986 00:54:15,390 --> 00:54:21,500 But let's say that you [INAUDIBLE] advisor your data 987 00:54:21,500 --> 00:54:24,500 [AUDIO OUT] quite figured out how to get rid of the 60 hertz 988 00:54:24,500 --> 00:54:29,070 yet [AUDIO OUT] How would you get rid of the 60 hertz from 989 00:54:29,070 --> 00:54:30,160 your signal? 990 00:54:30,160 --> 00:54:35,380 You could make what's called a band-stop filter where 991 00:54:35,380 --> 00:54:40,800 you suppress frequencies within a particular band. 992 00:54:40,800 --> 00:54:43,610 Put that band-stop filter at 60 hertz. 993 00:54:43,610 --> 00:54:45,870 The thing is, it's very hard to make 994 00:54:45,870 --> 00:54:48,350 a very narrow band-stop filter. 995 00:54:48,350 --> 00:54:50,700 So we learned this in the last lecture.
996 00:54:50,700 --> 00:55:00,440 How would you get rid of a particular [AUDIO OUT] Yeah, 997 00:55:00,440 --> 00:55:02,660 so take the Fourier transform of your signal, 998 00:55:02,660 --> 00:55:07,560 that 60 hertz [AUDIO OUT] one particular value of the Fourier 999 00:55:07,560 --> 00:55:08,100 transform. 1000 00:55:08,100 --> 00:55:14,588 And you can just set that [AUDIO OUT] 1001 00:55:14,588 --> 00:55:18,290 Because the filtering in that case 1002 00:55:18,290 --> 00:55:20,930 would be knocking down a whole band 1003 00:55:20,930 --> 00:55:26,040 of [AUDIO OUT] frequencies. 1004 00:55:26,040 --> 00:55:32,322 AUDIENCE: [INAUDIBLE] 1005 00:55:32,322 --> 00:55:37,172 MICHALE FEE: Well, it's just that with filtfilt, 1006 00:55:37,172 --> 00:55:39,630 like I said, there are many different ways of doing things. 1007 00:55:39,630 --> 00:55:42,960 filtfilt won't do that for you. 1008 00:55:42,960 --> 00:55:46,980 But once you know this stuff that we've been learning, 1009 00:55:46,980 --> 00:55:49,200 you can go in and do stuff. 1010 00:55:49,200 --> 00:55:51,780 You don't have to have some Matlab function to do it. 1011 00:55:51,780 --> 00:55:54,060 You just know how it all works and you just 1012 00:55:54,060 --> 00:55:55,140 write a program to do it. 1013 00:55:55,140 --> 00:55:59,015 OK, that's pretty cool, right? 1014 00:55:59,015 --> 00:56:01,060 All right-- oh, and here's the band-stop filter. 1015 00:56:04,525 --> 00:56:12,270 [INAUDIBLE] that lag there stop, OK? 1016 00:56:12,270 --> 00:56:13,410 OK, let's keep going. 1017 00:56:13,410 --> 00:56:16,490 Oh, and there's a tool here that's part of Matlab. 1018 00:56:16,490 --> 00:56:20,820 It's called a filter visualization tool, fvtool.
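[Editor's sketch] The alternative to a band-stop filter mentioned a little earlier (take the Fourier transform, find the one coefficient where the 60 hertz sits, set it to zero, transform back) might look like the following in plain Python. This is an illustration under assumptions chosen here, not the lecture's code: a naive DFT stands in for the FFT, the sampling rate and record length are picked so that 60 Hz lands exactly on one frequency sample, and the names `dft`, `idft`, `clean` and `noisy` are made up for the example. When the interference does not complete a whole number of cycles in the record, its power leaks into neighboring samples and zeroing one value no longer removes all of it.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short records."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

fs = 600.0               # assumed sampling rate (Hz)
n = 300                  # 0.5 s of data
delta_f = fs / n         # spacing between frequency samples: 2 Hz
t = [i / fs for i in range(n)]

clean = [math.sin(2 * math.pi * 6.0 * ti) for ti in t]                   # signal of interest
noisy = [c + 0.5 * math.sin(2 * math.pi * 60.0 * ti + 0.3)               # plus 60 Hz pickup
         for c, ti in zip(clean, t)]

spectrum = dft(noisy)
k60 = round(60.0 / delta_f)      # index of the +60 Hz sample
spectrum[k60] = 0                # remove +60 Hz
spectrum[n - k60] = 0            # and its negative-frequency partner at -60 Hz
denoised = [v.real for v in idft(spectrum)]
```

Both the positive- and negative-frequency values have to be zeroed; otherwise the inverse transform is no longer real and half of the 60 Hz amplitude survives.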
1019 00:56:20,820 --> 00:56:25,260 You just run this and you can select different kinds 1020 00:56:25,260 --> 00:56:29,820 of filters that have different kinds of roll-off 1021 00:56:29,820 --> 00:56:32,430 in frequency, that have different properties 1022 00:56:32,430 --> 00:56:33,870 in the time domain. 1023 00:56:33,870 --> 00:56:35,310 It's kind of fun to play with. 1024 00:56:35,310 --> 00:56:38,250 If you have to do filtering on some signal, 1025 00:56:38,250 --> 00:56:39,420 just play around with this. 1026 00:56:39,420 --> 00:56:41,910 Because there are a bunch of different kinds of filters that 1027 00:56:41,910 --> 00:56:47,130 have different weird names like Butterworth and Chebyshev 1028 00:56:47,130 --> 00:56:50,900 and a bunch of other things that have different properties. 1029 00:56:50,900 --> 00:56:52,970 But you can actually just play around with this 1030 00:56:52,970 --> 00:56:57,170 and design your own filter to meet your own [AUDIO OUT] 1031 00:56:57,170 --> 00:57:00,930 OK, so I want to end by spending a little bit of time talking 1032 00:57:00,930 --> 00:57:03,480 about some really cool things about the Fourier transform 1033 00:57:03,480 --> 00:57:06,780 and talk about the Nyquist-Shannon theorem. 1034 00:57:06,780 --> 00:57:08,700 This is really kind of mind boggling. 1035 00:57:08,700 --> 00:57:10,110 It's pretty cool. 1036 00:57:10,110 --> 00:57:15,120 So all right, so remember that when you take the Fourier 1037 00:57:15,120 --> 00:57:17,700 transform-- the fast Fourier transform of something-- 1038 00:57:17,700 --> 00:57:20,390 [INAUDIBLE] take the Fourier transform 1039 00:57:20,390 --> 00:57:25,400 of something analytically, the Fourier transform 1040 00:57:25,400 --> 00:57:27,860 is defined continuously. 1041 00:57:27,860 --> 00:57:33,980 At every value of F, there's a Fourier transform.
1042 00:57:33,980 --> 00:57:36,320 But when we do fast Fourier transforms, 1043 00:57:36,320 --> 00:57:41,360 we've discretized time and we've discretized frequency, right? 1044 00:57:41,360 --> 00:57:43,520 So when we take the fast Fourier transform, 1045 00:57:43,520 --> 00:57:48,130 we get an answer back where we have a value of the Fourier 1046 00:57:48,130 --> 00:57:50,900 transform at a bunch of discrete frequencies. 1047 00:57:50,900 --> 00:57:53,450 So frequency is discretized. 1048 00:57:53,450 --> 00:57:55,822 And we have frequencies, little samples 1049 00:57:55,822 --> 00:57:57,530 of the spectrum at different frequencies, 1050 00:57:57,530 --> 00:58:00,747 that are separated by a little delta f. 1051 00:58:00,747 --> 00:58:06,760 [INAUDIBLE] What does that mean? 1052 00:58:06,760 --> 00:58:12,490 Remember when we were doing a Fourier series? 1053 00:58:12,490 --> 00:58:18,220 What was it that we had to have to write down a Fourier series 1054 00:58:18,220 --> 00:58:20,200 where we can write down an approximation 1055 00:58:20,200 --> 00:58:25,690 to a function as a sum of sine waves at multiples 1056 00:58:25,690 --> 00:58:28,380 of a common frequency? 1057 00:58:28,380 --> 00:58:30,985 What was it about the signal in time 1058 00:58:30,985 --> 00:58:32,110 that allowed us to do that? 1059 00:58:40,660 --> 00:58:41,630 It's periodic. 1060 00:58:41,630 --> 00:58:45,240 We could only do that if the signal is periodic. 1061 00:58:45,240 --> 00:58:52,700 So when we write down our fast Fourier transform of a signal, 1062 00:58:52,700 --> 00:58:55,910 it's discretized in time and frequency. 1063 00:58:55,910 --> 00:59:00,310 What that means is that it's periodic in time.
1064 00:59:00,310 --> 00:59:06,360 So when we pass a signal that we've sampled of some duration 1065 00:59:06,360 --> 00:59:09,360 and the fast Fourier transform algorithm passes back 1066 00:59:09,360 --> 00:59:13,410 a spectrum that's discretized in frequency, what that means is 1067 00:59:13,410 --> 00:59:15,750 that you can think about that signal 1068 00:59:15,750 --> 00:59:19,280 as being periodic in time, OK? 1069 00:59:19,280 --> 00:59:24,160 Now, when you discretize the signal in time, 1070 00:59:24,160 --> 00:59:27,720 you've taken samples of that signal 1071 00:59:27,720 --> 00:59:31,830 in time separated by delta t. 1072 00:59:31,830 --> 00:59:35,680 What does that tell you about the spectrum? 1073 00:59:35,680 --> 00:59:38,360 So when we pass the Fourier transform 1074 00:59:38,360 --> 00:59:41,580 FFT algorithm, a signal that's discretized in time, 1075 00:59:41,580 --> 00:59:48,480 it passes us back this thing here, right, 1076 00:59:48,480 --> 00:59:50,460 with positive frequencies in the first half 1077 00:59:50,460 --> 00:59:53,430 of the vector, the negative frequencies in the second half. 1078 00:59:53,430 --> 00:59:55,650 It's really a piece-- 1079 00:59:55,650 --> 00:59:58,125 it's one period of a periodic spectrum. 1080 01:00:01,749 --> 01:00:05,100 [AUDIO OUT] right? 1081 01:00:05,100 --> 01:00:08,160 Mathematically, if our signal is discretized in time, 1082 01:00:08,160 --> 01:00:10,570 it means the spectrum is periodic. 1083 01:00:10,570 --> 01:00:14,830 And the FFT algorithm is passing back one period. 1084 01:00:14,830 --> 01:00:18,575 And then there's a circular shift to get this thing. 1085 01:00:18,575 --> 01:00:21,920 Does that make sense? 1086 01:00:21,920 --> 01:00:25,990 OK, now, because these are real functions, 1087 01:00:25,990 --> 01:00:30,460 this piece here is exactly equal to that piece. 1088 01:00:30,460 --> 01:00:33,000 It's symmetric. 1089 01:00:33,000 --> 01:00:37,650 The magnitude of the spectrum is symmetric.
1090 01:00:37,650 --> 01:00:40,140 So what does that mean? 1091 01:00:40,140 --> 01:00:43,010 What that means, if our signal has some bandwidth-- 1092 01:00:43,010 --> 01:00:49,990 if the highest frequency is less than some bandwidth B-- 1093 01:00:49,990 --> 01:00:53,050 if the sampling rate is high enough, 1094 01:00:53,050 --> 01:00:57,330 then you can see that the frequency components here 1095 01:00:57,330 --> 01:01:01,680 don't interact with the frequency components here. 1096 01:01:01,680 --> 01:01:03,180 You can see that they're separated. 1097 01:01:06,150 --> 01:01:07,590 OK, one more thing. 1098 01:01:07,590 --> 01:01:15,090 The period of the spectrum is 1 over delta t [INAUDIBLE] which 1099 01:01:15,090 --> 01:01:17,200 is equal to the sampling rate. 1100 01:01:17,200 --> 01:01:20,700 So when we have a signal that's discretized in time, 1101 01:01:20,700 --> 01:01:24,480 the spectrum is periodic and there are multiple copies 1102 01:01:24,480 --> 01:01:27,120 of that spectrum, of this spectrum, 1103 01:01:27,120 --> 01:01:31,820 at intervals of the sampling rate. 1104 01:01:31,820 --> 01:01:35,230 OK, so if the sampling rate is high enough, 1105 01:01:35,230 --> 01:01:38,130 then the positive frequencies are well 1106 01:01:38,130 --> 01:01:40,890 separated from the negative frequencies 1107 01:01:40,890 --> 01:01:47,220 if the sampling rate is higher than twice the bandwidth 1108 01:01:47,220 --> 01:01:55,360 [AUDIO OUT] If I sample the signal at a slower 1109 01:01:55,360 --> 01:01:57,490 and slower rate but it's the same signal, 1110 01:01:57,490 --> 01:02:02,250 you can see at some point that negative frequencies are 1111 01:02:02,250 --> 01:02:05,270 going to start crashing into the positive frequencies.
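[Editor's sketch] A small numerical illustration of the copies of the spectrum "crashing into" each other, using values picked here rather than taken from the lecture: a 70 Hz sine sampled at only 100 Hz yields exactly the same samples as a sign-flipped 30 Hz sine, because the spectral copy centered on the sampling rate puts its negative-frequency peak at 100 - 70 = 30 Hz. The function names are hypothetical.

```python
import math

def sample_sine(freq_hz, fs_hz, n_samples):
    """Samples of sin(2*pi*f*t) taken at t = k / fs."""
    return [math.sin(2.0 * math.pi * freq_hz * k / fs_hz) for k in range(n_samples)]

def apparent_frequency(freq_hz, fs_hz):
    """Frequency in [0, fs/2] that a sampled sine at freq_hz cannot be told apart from."""
    folded = freq_hz % fs_hz
    return fs_hz - folded if folded > fs_hz / 2.0 else folded

fs = 100.0                                  # sampling rate below twice 70 Hz
too_fast = sample_sine(70.0, fs, 50)        # undersampled 70 Hz sine
alias = sample_sine(30.0, fs, 50)           # 30 Hz sine at the same sampling rate

# The two sets of samples differ only by a sign flip (a phase reversal).
mismatch = max(abs(a + b) for a, b in zip(too_fast, alias))
```

Here `apparent_frequency(70.0, 100.0)` folds back to 30 Hz, while any frequency below half the sampling rate is returned unchanged.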
1112 01:02:05,270 --> 01:02:09,140 So you can see that you don't run into this problem 1113 01:02:09,140 --> 01:02:13,640 as long as the sampling rate is greater than twice the highest 1114 01:02:13,640 --> 01:02:15,272 frequency [INAUDIBLE] 1115 01:02:15,272 --> 01:02:17,552 So what? 1116 01:02:17,552 --> 01:02:20,530 So who cares? 1117 01:02:20,530 --> 01:02:22,040 What's so bad about this? 1118 01:02:22,040 --> 01:02:27,680 Well, it turns out that if you sample 1119 01:02:27,680 --> 01:02:30,955 at a frequency higher than twice the bandwidth, the highest 1120 01:02:30,955 --> 01:02:32,330 frequency in the signal, then you 1121 01:02:32,330 --> 01:02:33,570 can do something really cool. 1122 01:02:36,636 --> 01:02:39,800 You can perfectly reconstruct the original signal 1123 01:02:39,800 --> 01:02:45,320 even though you've sampled it only discretely. 1124 01:02:45,320 --> 01:02:47,960 Put an arbitrary signal in. 1125 01:02:47,960 --> 01:02:49,580 You can sample it discretely. 1126 01:02:49,580 --> 01:02:52,160 And as long as you've sampled it at more than twice the highest 1127 01:02:52,160 --> 01:02:53,960 frequency in the original signal, 1128 01:02:53,960 --> 01:02:58,190 you can perfectly reconstruct the original signal. 1129 01:02:58,190 --> 01:02:59,660 Back to this. 1130 01:02:59,660 --> 01:03:03,080 Here's our discretely sampled signal. 1131 01:03:03,080 --> 01:03:04,820 There is the spectrum. 1132 01:03:04,820 --> 01:03:05,690 It's periodic. 1133 01:03:05,690 --> 01:03:07,820 Let's say that the sampling rate is 1134 01:03:07,820 --> 01:03:11,420 more than twice the bandwidth. 1135 01:03:11,420 --> 01:03:13,670 How would I reconstruct the original signal? 1136 01:03:26,981 --> 01:03:31,410 But remember that the convolution theorem 1137 01:03:31,410 --> 01:03:36,760 says that by multiplying in the frequency domain, 1138 01:03:36,760 --> 01:03:40,550 I'm convolving in the time domain, OK?
1139 01:03:40,550 --> 01:03:45,670 So remember that this piece right here 1140 01:03:45,670 --> 01:03:50,970 was the spectrum of the original signal, right? 1141 01:03:50,970 --> 01:03:55,370 As I sampled it in time, I added these [AUDIO OUT] 1142 01:03:55,370 --> 01:03:58,910 copies at intervals of the sampling rate. 1143 01:03:58,910 --> 01:04:02,810 If I want to get the original signal back, 1144 01:04:02,810 --> 01:04:05,720 I can just put a square window around this, 1145 01:04:05,720 --> 01:04:10,115 keep that, and throw away all the others. 1146 01:04:10,115 --> 01:04:12,030 [INAUDIBLE] 1147 01:04:12,030 --> 01:04:15,180 By sampling regularly, I've just added these other copies. 1148 01:04:15,180 --> 01:04:18,360 But they're far enough away that I can just throw them off. 1149 01:04:18,360 --> 01:04:21,610 I can set them to zero. 1150 01:04:21,610 --> 01:04:25,720 Now, when I put a square window in the frequency domain, 1151 01:04:25,720 --> 01:04:29,650 what am I doing in the time domain? 1152 01:04:29,650 --> 01:04:31,960 Multiply by a square window in frequency, 1153 01:04:31,960 --> 01:04:35,210 what am I doing in time? 1154 01:04:35,210 --> 01:04:35,710 [INAUDIBLE] 1155 01:04:35,710 --> 01:04:40,870 So basically what I do is I take the original signal sampled 1156 01:04:40,870 --> 01:04:41,890 regularly in time. 1157 01:04:41,890 --> 01:04:44,055 And I just convolve it with what? 1158 01:04:44,055 --> 01:04:46,150 What's the Fourier transform of a square pulse? 1159 01:04:50,450 --> 01:04:55,030 [INAUDIBLE] If I could just convolve the time domain 1160 01:04:55,030 --> 01:04:56,537 [INAUDIBLE] with a kernel, that's 1161 01:04:56,537 --> 01:04:58,370 the Fourier transform of that square window. 1162 01:04:58,370 --> 01:05:00,080 It's just the sinc function. 1163 01:05:00,080 --> 01:05:03,640 And when I do that, I get back the original function. 1164 01:05:03,640 --> 01:05:05,320 But it's actually easier to do.
1165 01:05:05,320 --> 01:05:07,240 Rather than convolving with a sinc function, 1166 01:05:07,240 --> 01:05:10,500 it's easier just to multiply in the frequency domain. 1167 01:05:10,500 --> 01:05:14,920 So I can basically get back my sampled function 1168 01:05:14,920 --> 01:05:19,540 at arbitrarily fine [AUDIO OUT] 1169 01:05:19,540 --> 01:05:21,160 Here's how you actually do that. 1170 01:05:21,160 --> 01:05:24,180 That process is called zero-padding. 1171 01:05:24,180 --> 01:05:27,710 OK, so what you can do is you can take a function, Fourier 1172 01:05:27,710 --> 01:05:30,470 transform it, get the spectrum. 1173 01:05:30,470 --> 01:05:32,105 And what the Fourier transform hands 1174 01:05:32,105 --> 01:05:36,130 us back is just this piece right here [INAUDIBLE] 1175 01:05:36,130 --> 01:05:41,960 But what I can do is I can just move those other peaks away. 1176 01:05:41,960 --> 01:05:44,080 So that's what my FFT sends back to me. 1177 01:05:44,080 --> 01:05:45,640 Now what am I going to do, I'm just 1178 01:05:45,640 --> 01:05:52,830 going to push that away and add zeros in the middle. 1179 01:05:52,830 --> 01:06:01,440 Now, inverse Fourier transform, and [INAUDIBLE] So the sampling 1180 01:06:01,440 --> 01:06:04,320 rate is just the number of frequency samples 1181 01:06:04,320 --> 01:06:05,580 I have times delta f. 1182 01:06:05,580 --> 01:06:08,400 And here I'm just adding a bunch of frequency samples 1183 01:06:08,400 --> 01:06:09,780 that are zero. 1184 01:06:09,780 --> 01:06:16,020 And my new delta t is just going to be 1 over that new sampling 1185 01:06:16,020 --> 01:06:16,530 rate. 1186 01:06:16,530 --> 01:06:17,500 Here's an example. 1187 01:06:17,500 --> 01:06:19,860 This is a little bit of code that does it. 1188 01:06:19,860 --> 01:06:23,010 Here I've taken a sine wave at 20 hertz. 1189 01:06:23,010 --> 01:06:26,070 You can see 50 millisecond spacing 1190 01:06:26,070 --> 01:06:30,370 sampled four times per cycle.
1191 01:06:30,370 --> 01:06:32,560 I just run this little zero-padding algorithm. 1192 01:06:32,560 --> 01:06:38,161 And you can see that it sends me back these red dots. 1193 01:06:38,161 --> 01:06:42,690 [INAUDIBLE] have more completely reconstructed the sine wave 1194 01:06:42,690 --> 01:06:43,710 that I sampled. 1195 01:06:43,710 --> 01:06:47,400 OK, but you can do that with any function 1196 01:06:47,400 --> 01:06:52,150 as long as the highest frequency in your original signal 1197 01:06:52,150 --> 01:06:56,730 is less than [AUDIO OUT] half the sampling rate. 1198 01:07:00,090 --> 01:07:04,420 [INAUDIBLE] 1199 01:07:04,420 --> 01:07:07,360 So zero-padding, so what I showed you here 1200 01:07:07,360 --> 01:07:11,280 is that zero-padding in the frequency domain 1201 01:07:11,280 --> 01:07:16,290 gives you higher sampling, faster sampling, 1202 01:07:16,290 --> 01:07:17,130 in the time domain. 1203 01:07:17,130 --> 01:07:21,230 OK, and you can also do the same thing. 1204 01:07:21,230 --> 01:07:24,480 You can also zero-pad in the time domain 1205 01:07:24,480 --> 01:07:28,060 to give finer spacing in the frequency domain. 1206 01:07:28,060 --> 01:07:30,800 FFT samples will be closer together 1207 01:07:30,800 --> 01:07:32,340 in the frequency domain. 1208 01:07:32,340 --> 01:07:34,100 OK, so here's how you do that. 1209 01:07:34,100 --> 01:07:36,830 So you take a little piece of data. 1210 01:07:36,830 --> 01:07:40,220 You multiply it by your DPSS taper. 1211 01:07:40,220 --> 01:07:43,840 And then just add a bunch of zeros. 1212 01:07:43,840 --> 01:07:45,720 And then take the Fourier transform 1213 01:07:45,720 --> 01:07:49,310 of that longer piece with all those zeros added to it. 1214 01:07:49,310 --> 01:07:52,030 And when you do that, what you're going to get back 1215 01:07:52,030 --> 01:07:57,550 is an FFT that has the samples in frequency 1216 01:07:57,550 --> 01:07:58,690 more finely spaced. 
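[Editor's sketch] The zero-padding code shown in the lecture is not in the transcript. The frequency-domain procedure described above (take the transform, push the negative-frequency half away, add zeros in the middle, inverse transform) could be written along these lines. The assumptions are all choices made for this example: a naive DFT stands in for the FFT, the 20 hertz sine is sampled at 80 Hz (four samples per cycle) for exactly four cycles, and the function name `upsample_by_zero_padding` and the phase offset are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short records."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def upsample_by_zero_padding(x, factor):
    """Resample x on a grid `factor` times finer by padding its spectrum with zeros."""
    n = len(x)
    m = n * factor
    spectrum = dft(x)
    padded = [0j] * m
    n_pos = (n + 1) // 2
    for k in range(n_pos):                 # DC and positive frequencies stay at the start
        padded[k] = spectrum[k]
    for k in range(1, n_pos):              # negative frequencies are pushed to the end
        padded[m - k] = spectrum[n - k]
    if n % 2 == 0:                         # share the Nyquist sample between +fs/2 and -fs/2
        padded[n // 2] += spectrum[n // 2] / 2
        padded[m - n // 2] += spectrum[n // 2] / 2
    # The inverse transform divides by m rather than n, so rescale by `factor`.
    return [factor * v.real for v in idft(padded)]

fs = 80.0                                  # four samples per cycle of a 20 Hz sine
n = 16                                     # exactly four cycles
samples = [math.sin(2 * math.pi * 20.0 * k / fs + 0.4) for k in range(n)]

factor = 8
dense = upsample_by_zero_padding(samples, factor)
new_dt = 1.0 / (fs * factor)               # new delta t = 1 / (new sampling rate)
```

Every `factor`-th value of `dense` reproduces an original sample, and the values in between fall on the underlying sine. The reconstruction is exact here because the record holds a whole number of cycles and the sine lies below half the sampling rate.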
1217 01:07:58,690 --> 01:08:01,060 Your delta f is going to be smaller. 1218 01:08:01,060 --> 01:08:03,430 Now, that doesn't [AUDIO OUT] frequency resolution. 1219 01:08:03,430 --> 01:08:06,905 There's no magic getting around the minimum 1220 01:08:06,905 --> 01:08:08,090 time-bandwidth product. 1221 01:08:08,090 --> 01:08:08,590 OK? 1222 01:08:08,590 --> 01:08:12,460 But you have more samples in frequency. 1223 01:08:12,460 --> 01:08:14,400 All right, any questions? 1224 01:08:17,930 --> 01:08:21,680 [INAUDIBLE] We're going to be starting a new topic next time. 1225 01:08:21,680 --> 01:08:24,789 We're done with spectral analysis.
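[Editor's sketch] Zero-padding in the time domain, as described above, can be illustrated as follows. The specifics are assumptions made for the example: a Hann taper stands in for the DPSS taper used in the lecture, the record is 20 samples of a 12 Hz sine at a 100 Hz sampling rate, a naive DFT stands in for the FFT, and the helper names are made up.

```python
import cmath
import math

def magnitude_spectrum(x, fs_hz):
    """Return (frequencies, magnitudes) for the non-negative DFT frequency samples."""
    n = len(x)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs_hz / n)
        mags.append(abs(coeff))
    return freqs, mags

def peak_frequency(freqs, mags):
    """Frequency of the largest spectral magnitude."""
    return freqs[max(range(len(mags)), key=mags.__getitem__)]

fs = 100.0                                 # assumed sampling rate (Hz)
n = 20                                     # short piece of data: 0.2 s
segment = [math.sin(2 * math.pi * 12.0 * k / fs) for k in range(n)]

# Hann taper, used here only as a simple stand-in for a DPSS taper.
taper = [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]
tapered = [s * w for s, w in zip(segment, taper)]

padded = tapered + [0.0] * (200 - n)       # add a bunch of zeros after the data

coarse_freqs, coarse_mags = magnitude_spectrum(tapered, fs)
fine_freqs, fine_mags = magnitude_spectrum(padded, fs)

coarse_df = coarse_freqs[1] - coarse_freqs[0]   # fs / 20  = 5 Hz
fine_df = fine_freqs[1] - fine_freqs[0]         # fs / 200 = 0.5 Hz
```

The padded spectrum is sampled ten times more finely, so `peak_frequency(fine_freqs, fine_mags)` can land close to 12 Hz instead of being forced onto a 5 Hz grid. The width of the spectral peak is still set by the 20-sample taper, so two tones closer together than that width would remain unresolved.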