1 00:00:15,737 --> 00:00:17,820 PROFESSOR: It's going to be a great lecture today. 2 00:00:17,820 --> 00:00:18,720 It's about proteins. 3 00:00:18,720 --> 00:00:20,040 I love proteins. 4 00:00:20,040 --> 00:00:21,780 Don't forget the handout, yeah. 5 00:00:21,780 --> 00:00:24,880 OK, so I'm going to briefly wrap up 6 00:00:24,880 --> 00:00:27,130 the lecture we were doing on Friday because there were 7 00:00:27,130 --> 00:00:29,520 a couple of things that I wanted to make a note of, 8 00:00:29,520 --> 00:00:32,430 and then we'll move on to section 2.3 9 00:00:32,430 --> 00:00:35,280 about amino acids, peptides, and proteins. 10 00:00:35,280 --> 00:00:37,980 Now, in the last class, I introduced you 11 00:00:37,980 --> 00:00:41,340 to the lipidic molecules, and you can pick them out 12 00:00:41,340 --> 00:00:44,430 of a lineup because they are rich in carbon-carbon and 13 00:00:44,430 --> 00:00:45,970 carbon-hydrogen bonds. 14 00:00:45,970 --> 00:00:49,350 As you can see here in these line-angle drawings, 15 00:00:49,350 --> 00:00:51,930 the majority of a lot of these molecules 16 00:00:51,930 --> 00:00:55,110 is carbon-carbon or carbon-hydrogen. 17 00:00:55,110 --> 00:00:59,280 They are molecules that are mostly hydrophobic, so there 18 00:00:59,280 --> 00:01:00,820 are some terminologies here. 19 00:01:00,820 --> 00:01:01,320 Whoops. 20 00:01:01,320 --> 00:01:05,700 Hydrophobic, which can also be referred to as lipophilic. 21 00:01:05,700 --> 00:01:11,310 You either-- you can hate water and love fatty acid or fatty 22 00:01:11,310 --> 00:01:15,000 types of materials, so both those terms are synonymous. 23 00:01:15,000 --> 00:01:18,380 And some of the lipids are what are known as amphipathic, 24 00:01:18,380 --> 00:01:21,720 and they include hydrophobic and hydrophilic components. 
25 00:01:21,720 --> 00:01:24,780 There are a couple of tiny terms that I 26 00:01:24,780 --> 00:01:26,910 didn't mention explicitly, so I just 27 00:01:26,910 --> 00:01:29,250 want to go ahead and do that now. 28 00:01:29,250 --> 00:01:33,260 For example, in this phospholipid structure-- 29 00:01:33,260 --> 00:01:34,560 and we'll talk about these-- 30 00:01:34,560 --> 00:01:39,510 they have long chain fatty acids attached via esters 31 00:01:39,510 --> 00:01:41,640 to this glycerol unit, so there's 32 00:01:41,640 --> 00:01:44,310 one here and the second one here, 33 00:01:44,310 --> 00:01:47,190 and then what's known as the polar head group. 34 00:01:47,190 --> 00:01:51,030 In those fatty acids, they could be fully saturated. 35 00:01:51,030 --> 00:01:54,750 It means they have no double bonds in the structure, 36 00:01:54,750 --> 00:02:06,080 so that term saturated is equivalent to no double bonds, 37 00:02:06,080 --> 00:02:08,289 so no carbon-carbon double bonds. 38 00:02:08,289 --> 00:02:10,810 Or they could be unsaturated, where 39 00:02:10,810 --> 00:02:16,260 there is a double bond within it, 40 00:02:16,260 --> 00:02:20,370 so that's one or more double bonds. 41 00:02:20,370 --> 00:02:23,690 And those double bonds take on a particular shape 42 00:02:23,690 --> 00:02:27,680 because there's not freedom of rotation around double bonds 43 00:02:27,680 --> 00:02:30,080 the same way there is around single bonds. 44 00:02:30,080 --> 00:02:32,330 So those single bonds, you can twist them around 45 00:02:32,330 --> 00:02:35,780 and twist them around, but the double bond geometry is fixed. 46 00:02:35,780 --> 00:02:37,760 And so double bonds, we refer to them 47 00:02:37,760 --> 00:02:41,820 as either trans, where the two groups are on opposite sides, 48 00:02:41,820 --> 00:02:46,760 leaving the double bond, or we refer to them as cis, 49 00:02:46,760 --> 00:02:49,100 where the two groups are on the same side. 
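The saturated versus unsaturated distinction described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_fatty_acid` and the "carbons:double bonds" shorthand (for example `18:1`) are assumptions brought in for the example, not something introduced in the lecture.

```python
def classify_fatty_acid(shorthand: str) -> str:
    """Classify a fatty acid written as 'carbons:double_bonds', e.g. '18:1'.

    The shorthand is an assumed input format for this sketch.
    """
    carbons, double_bonds = (int(part) for part in shorthand.split(":"))
    if carbons < 2 or double_bonds < 0:
        raise ValueError(f"not a sensible fatty acid shorthand: {shorthand!r}")
    # Saturated means no carbon-carbon double bonds at all;
    # unsaturated means one or more.
    return "saturated" if double_bonds == 0 else "unsaturated"


if __name__ == "__main__":
    for example in ("16:0", "18:1", "18:2"):
        print(example, classify_fatty_acid(example))
```

The sketch deliberately ignores cis/trans geometry, which would need to be recorded per double bond.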
50 00:02:49,100 --> 00:02:51,890 And we tend to use that cis and trans 51 00:02:51,890 --> 00:02:55,380 sort of naming system in a lot of other contexts as well, 52 00:02:55,380 --> 00:02:57,020 but you almost always want to remember 53 00:02:57,020 --> 00:02:59,360 that trans is as far away as possible, 54 00:02:59,360 --> 00:03:02,030 cis is closer than trans. 55 00:03:02,030 --> 00:03:05,750 All right, so I'm just going to take you forward 56 00:03:05,750 --> 00:03:08,400 to the phospholipid structure. 57 00:03:08,400 --> 00:03:21,730 This is very important: semi-permeable membranes 58 00:03:21,730 --> 00:03:26,320 are made up through the non-covalent, supramolecular 59 00:03:26,320 --> 00:03:30,340 association of phospholipid monomer units. 60 00:03:30,340 --> 00:03:32,530 Here's a monomer unit up here. 61 00:03:32,530 --> 00:03:35,170 You see it has an amphipathic structure, 62 00:03:35,170 --> 00:03:38,920 with a lot of hydrophobicity but also hydrophilicity, 63 00:03:38,920 --> 00:03:43,870 and these molecules assemble into supramolecular structures 64 00:03:43,870 --> 00:03:46,630 that form the boundaries of your cells. 65 00:03:46,630 --> 00:03:50,080 Saying they are semi-permeable tells us a little bit 66 00:03:50,080 --> 00:03:52,510 about what can go through them. 67 00:03:52,510 --> 00:03:55,187 If they were fully permeable, anything could come and go 68 00:03:55,187 --> 00:03:56,770 and they wouldn't be much use frankly. 69 00:03:56,770 --> 00:03:58,990 It's like leaving the door open the whole time. 70 00:03:58,990 --> 00:04:01,300 But because they're semi-permeable, 71 00:04:01,300 --> 00:04:04,870 only a few things can come and go without extra help, 72 00:04:04,870 --> 00:04:09,080 and other things need active mechanisms to go through. 73 00:04:09,080 --> 00:04:12,620 So let's take a look at the boundary here. 
74 00:04:12,620 --> 00:04:15,268 So when you see a membrane bilayer, 75 00:04:15,268 --> 00:04:17,560 they are-- they're often shown looking like this, where 76 00:04:17,560 --> 00:04:20,440 every one of these units is a phospholipid, 77 00:04:20,440 --> 00:04:23,620 and there's water on both sides of the phospholipid, 78 00:04:23,620 --> 00:04:27,100 because that polar head group is interacting 79 00:04:27,100 --> 00:04:29,140 with water on both sides. 80 00:04:29,140 --> 00:04:32,380 So down here could be the inside of the cell. 81 00:04:32,380 --> 00:04:34,960 Up here could be the outside of the cell. 82 00:04:34,960 --> 00:04:38,180 And a lot of cells, especially eukaryotic cells, 83 00:04:38,180 --> 00:04:41,830 the ones that make us up, have a lot of endomembranes, 84 00:04:41,830 --> 00:04:44,240 membranes within the cells. 85 00:04:44,240 --> 00:04:47,020 For example, forming the boundary to the nucleus 86 00:04:47,020 --> 00:04:48,710 or to the mitochondria. 87 00:04:48,710 --> 00:04:49,544 Yes? 88 00:04:49,544 --> 00:04:52,210 AUDIENCE: Why is there a [INAUDIBLE] 89 00:04:52,210 --> 00:04:54,230 PROFESSOR: Oh, this guy must be-- 90 00:04:54,230 --> 00:04:59,180 so this looks like it's probably a saturated fatty acid. 91 00:04:59,180 --> 00:05:03,450 So what do you think this one might be, folks? 92 00:05:03,450 --> 00:05:04,380 Unsaturated. 93 00:05:04,380 --> 00:05:06,060 And what's the double-bond geometry? 94 00:05:06,060 --> 00:05:07,170 AUDIENCE: Cis. 95 00:05:07,170 --> 00:05:08,180 PROFESSOR: Cis. 96 00:05:08,180 --> 00:05:08,760 Yeah, it is. 97 00:05:08,760 --> 00:05:12,280 It's like a-- it looks like a ballerina or something. 98 00:05:12,280 --> 00:05:15,210 OK, so we have a lot of concerns, 99 00:05:15,210 --> 00:05:18,630 and we'll see later about how things get in and out of cells. 
100 00:05:18,630 --> 00:05:26,050 But most commonly, things like oxygen or water 101 00:05:26,050 --> 00:05:35,440 and other small hydrophobic molecules 102 00:05:35,440 --> 00:05:39,400 can pass readily in and out through the semi-permeable 103 00:05:39,400 --> 00:05:42,340 barrier, but other things, things that are charged, 104 00:05:42,340 --> 00:05:45,220 things that are big, need a different mechanism 105 00:05:45,220 --> 00:05:46,120 to get in and out. 106 00:05:46,120 --> 00:05:49,750 And we will see later on how proteins provide 107 00:05:49,750 --> 00:05:52,480 the opportunities to cargo things 108 00:05:52,480 --> 00:05:56,770 into cells or out of cells, even very large entities, 109 00:05:56,770 --> 00:05:58,810 and there are certain mechanisms whereby 110 00:05:58,810 --> 00:06:02,650 that happens through a semi-permeable membrane, OK? 111 00:06:02,650 --> 00:06:05,800 I want to show you the other feature of membranes. 112 00:06:05,800 --> 00:06:07,137 They are self-healing. 113 00:06:11,340 --> 00:06:14,250 What this means is if you poke them, 114 00:06:14,250 --> 00:06:17,700 you poke a hole in a cellular membrane, you 115 00:06:17,700 --> 00:06:21,600 basically push apart those non-covalent forces. 116 00:06:21,600 --> 00:06:24,150 Once you take the thing away, be it a needle 117 00:06:24,150 --> 00:06:28,710 or a very fine glass capillary, they seal right back up 118 00:06:28,710 --> 00:06:31,690 to close the hole in the cell membrane, 119 00:06:31,690 --> 00:06:34,950 so that kind of tells us that they're non-covalent forces. 120 00:06:34,950 --> 00:06:37,500 So this is a really cool video of someone 121 00:06:37,500 --> 00:06:41,130 doing micro-injection into eukaryotic cells. 122 00:06:41,130 --> 00:06:44,520 The needle points to the cell, approaches the surface. 
123 00:06:44,520 --> 00:06:46,590 You can drop something into the cell, 124 00:06:46,590 --> 00:06:49,080 and then the cell closes and maintain-- 125 00:06:49,080 --> 00:06:52,840 regains its integrity of the barrier, 126 00:06:52,840 --> 00:06:56,670 so this is a very cool observation. 127 00:06:56,670 --> 00:06:58,140 People do this. 128 00:06:58,140 --> 00:07:00,060 They have to not drink too much coffee 129 00:07:00,060 --> 00:07:02,910 because it's quite complicated to do a lot of micro-injection, 130 00:07:02,910 --> 00:07:06,750 because you can really cause carnage in your cell population 131 00:07:06,750 --> 00:07:10,590 if you're not very dexterous with the micro-injection 132 00:07:10,590 --> 00:07:12,900 but people can be very good at it. 133 00:07:12,900 --> 00:07:14,830 So I just want to ask a couple of questions 134 00:07:14,830 --> 00:07:17,550 before, give you a couple of things to think about 135 00:07:17,550 --> 00:07:19,200 before we close up. 136 00:07:19,200 --> 00:07:20,200 The lipids. 137 00:07:20,200 --> 00:07:22,500 So here's a typical lipid bilayer, 138 00:07:22,500 --> 00:07:25,140 where I've highlighted a single lipid. 139 00:07:25,140 --> 00:07:27,390 And the colors, those are the head groups, 140 00:07:27,390 --> 00:07:32,520 and all in white and gray are the hydrophobic components, 141 00:07:32,520 --> 00:07:35,970 and just one of the phospholipids is highlighted, 142 00:07:35,970 --> 00:07:38,670 and that would be this molecular structure here. 143 00:07:38,670 --> 00:07:42,480 So first of all, what do you think the non-covalent forces 144 00:07:42,480 --> 00:07:45,210 at that membrane interface may be? 145 00:07:45,210 --> 00:07:48,810 That is, what's going on here at the interface? 146 00:07:48,810 --> 00:07:50,790 What are the types of interactions 147 00:07:50,790 --> 00:07:53,760 that you might have there? 
148 00:07:53,760 --> 00:07:55,650 Give you a minute to think about it, 149 00:07:55,650 --> 00:07:57,960 and I want to show you that I'm actually giving you 150 00:07:57,960 --> 00:08:01,800 a clue here, because you can see the structure, negative charge, 151 00:08:01,800 --> 00:08:03,990 positive charge, but also remember 152 00:08:03,990 --> 00:08:08,130 this is a barrier to water, so there are other things going on 153 00:08:08,130 --> 00:08:11,160 with the solvent that the membrane is sitting in 154 00:08:11,160 --> 00:08:14,250 because there's water surrounding that barrier layer. 155 00:08:14,250 --> 00:08:17,740 Anyone want to tell me what the answer is, and why? 156 00:08:17,740 --> 00:08:20,168 Yeah, did you-- are you-- yeah. 157 00:08:20,168 --> 00:08:21,380 AUDIENCE: Hydrogen bonding. 158 00:08:21,380 --> 00:08:24,596 PROFESSOR: Yeah, between what and what? 159 00:08:24,596 --> 00:08:37,409 AUDIENCE: Like the oxygen and [INAUDIBLE] 160 00:08:37,409 --> 00:08:39,230 PROFESSOR: Right, so water. 161 00:08:39,230 --> 00:08:42,659 Water is a good hydrogen bond donor and acceptor, 162 00:08:42,659 --> 00:08:44,670 so there will be hydrogen bonding. 163 00:08:44,670 --> 00:08:46,650 What about amongst all those lipid 164 00:08:46,650 --> 00:08:48,405 head groups, what's the other major force? 165 00:08:51,390 --> 00:08:52,300 Yeah? 166 00:08:52,300 --> 00:08:53,582 AUDIENCE: Electrostatic force. 167 00:08:53,582 --> 00:08:55,290 PROFESSOR: Between the different charges. 168 00:08:55,290 --> 00:08:57,600 So the correct answer here is both of them. 169 00:08:57,600 --> 00:09:00,660 Don't think it's just electrostatic, it's both. 170 00:09:00,660 --> 00:09:04,380 It's electrostatic amongst the head groups, hydrogen bonding 171 00:09:04,380 --> 00:09:08,680 between all that sort of dense bunch of charge, and the water. 172 00:09:08,680 --> 00:09:11,130 And then the other question, what type of molecules 173 00:09:11,130 --> 00:09:12,010 can get across? 
174 00:09:12,010 --> 00:09:14,520 I've already answered that question for you. 175 00:09:14,520 --> 00:09:17,730 Salts are going to need ways to get in and out. 176 00:09:17,730 --> 00:09:21,840 Small proteins are too big to dissolve in that membrane 177 00:09:21,840 --> 00:09:24,270 through passive mechanisms, so we're 178 00:09:24,270 --> 00:09:27,150 going to have to figure out how to get proteins 179 00:09:27,150 --> 00:09:29,160 in and out of cells. 180 00:09:29,160 --> 00:09:31,560 Neurotransmitters, such as this, this 181 00:09:31,560 --> 00:09:34,500 is GABA, or gamma aminobutyric acid. 182 00:09:34,500 --> 00:09:35,550 It's charged. 183 00:09:35,550 --> 00:09:38,400 It just can't get through without a transporter 184 00:09:38,400 --> 00:09:41,280 of some kind, and it's actually proteins 185 00:09:41,280 --> 00:09:44,160 that end up doing the heavy lifting of the transport 186 00:09:44,160 --> 00:09:45,950 processes that we'll see. 187 00:09:45,950 --> 00:09:48,990 OK, so moving along. 188 00:09:48,990 --> 00:09:53,550 This section will be about the building blocks of your protein 189 00:09:53,550 --> 00:09:55,590 macromolecules, which I want to remind 190 00:09:55,590 --> 00:09:59,670 you comprise 50% of all of the macromolecules, 191 00:09:59,670 --> 00:10:02,640 so that suggests it's a pretty important class 192 00:10:02,640 --> 00:10:06,720 of macromolecules that has a lot of different functions. 193 00:10:06,720 --> 00:10:08,460 Now, the amino acid building locks-- 194 00:10:08,460 --> 00:10:10,440 blocks look pretty simple. 195 00:10:10,440 --> 00:10:13,050 They're called amino acids because they 196 00:10:13,050 --> 00:10:21,480 have an amine, the carboxylic acid, 197 00:10:21,480 --> 00:10:24,600 and there's a carbon that is tetrahedral between the 198 00:10:24,600 --> 00:10:27,450 carboxylic acid and the amine. 
199 00:10:27,450 --> 00:10:32,010 And the simplest of those is when those are both hydrogen, 200 00:10:32,010 --> 00:10:35,310 but most of the amino acids are differentiated 201 00:10:35,310 --> 00:10:37,830 from that-- this one I've showed you on the board. 202 00:10:37,830 --> 00:10:42,690 This amino acid is glycine. 203 00:10:42,690 --> 00:10:46,920 Usually, when it's just a lonely amino acid in aqueous solution, 204 00:10:46,920 --> 00:10:51,180 it's in a different charged form, 205 00:10:51,180 --> 00:10:56,650 just consistent with what we talked about in the last class. 206 00:10:56,650 --> 00:10:57,890 And I put it here. 207 00:10:57,890 --> 00:10:59,480 So this is glycine. 208 00:10:59,480 --> 00:11:07,595 It's one of the 20 encoded amino acids. 209 00:11:11,420 --> 00:11:14,090 That means the amino acids that are 210 00:11:14,090 --> 00:11:17,510 made through ribosomal biosynthesis 211 00:11:17,510 --> 00:11:22,080 through a code that's provided by the messenger RNA, 212 00:11:22,080 --> 00:11:27,800 so they are encoded by messenger RNA. 213 00:11:27,800 --> 00:11:30,640 Later on, you'll see all of the beautiful mechanics 214 00:11:30,640 --> 00:11:32,330 of those processes. 215 00:11:32,330 --> 00:11:35,270 Now, this table looks pretty complicated, 216 00:11:35,270 --> 00:11:36,830 so I'm going to deconstruct it a bit. 217 00:11:36,830 --> 00:11:40,470 But what I first of all want to assure you is that these-- 218 00:11:40,470 --> 00:11:43,870 you will always get a handout with these structures on them. 219 00:11:43,870 --> 00:11:46,960 We are not asking you to remember these structures. 220 00:11:46,960 --> 00:11:49,510 You might become familiar with some of them, 221 00:11:49,510 --> 00:11:51,790 but you do not have to remember them. 
222 00:11:51,790 --> 00:11:54,670 You'll have a table that shows them, but on that table, 223 00:11:54,670 --> 00:11:57,880 I won't necessarily give you the information 224 00:11:57,880 --> 00:12:00,730 on what their properties are, because those are things 225 00:12:00,730 --> 00:12:04,090 that you should be able to spot by looking at their chemical 226 00:12:04,090 --> 00:12:05,140 structures, all right? 227 00:12:05,140 --> 00:12:06,400 So that's important. 228 00:12:06,400 --> 00:12:09,430 So these are all line-angle drawings, 229 00:12:09,430 --> 00:12:11,020 so you see the carbon. 230 00:12:11,020 --> 00:12:13,120 The hydrogens aren't shown in there. 231 00:12:13,120 --> 00:12:15,850 The charges are shown for what's called the side 232 00:12:15,850 --> 00:12:29,550 chain, because most of the amino acids have a side chain. 233 00:12:36,120 --> 00:12:38,290 The amino acids are also chiral, but you'll 234 00:12:38,290 --> 00:12:41,830 learn more than you ever wanted to know about chirality in 512, 235 00:12:41,830 --> 00:12:44,780 so I won't weigh you down with any of those properties. 236 00:12:44,780 --> 00:12:47,320 So there is a side chain that dictates the properties 237 00:12:47,320 --> 00:12:48,920 of the amino acids. 238 00:12:48,920 --> 00:12:52,030 One tiny detail, the amino acids that 239 00:12:52,030 --> 00:12:54,970 are encoded in our proteins are all what 240 00:12:54,970 --> 00:12:58,120 are known as alpha amino acids. 241 00:12:58,120 --> 00:12:59,740 There are other amino acids. 242 00:12:59,740 --> 00:13:02,470 GABA, that I showed you on the previous slide, 243 00:13:02,470 --> 00:13:04,390 is not an alpha amino acid. 244 00:13:04,390 --> 00:13:06,820 Actually, it's a gamma amino acid. 245 00:13:06,820 --> 00:13:11,320 These are called alpha amino acids because the amine group 246 00:13:11,320 --> 00:13:14,722 is at the alpha position relative to the carboxyl. 
247 00:13:14,722 --> 00:13:16,930 Don't need to know a lot more about that with respect 248 00:13:16,930 --> 00:13:17,810 to that. 249 00:13:17,810 --> 00:13:20,830 So let's take a look at this set of amino acids, 250 00:13:20,830 --> 00:13:24,490 and what you see is amino acid side chains with rather 251 00:13:24,490 --> 00:13:26,220 different properties. 252 00:13:26,220 --> 00:13:29,950 I've amassed-- here's glycine at the very top. 253 00:13:29,950 --> 00:13:32,140 All amino acids have a three-letter code 254 00:13:32,140 --> 00:13:33,700 or a one-letter code. 255 00:13:33,700 --> 00:13:35,950 I particularly enjoy using one-letter codes 256 00:13:35,950 --> 00:13:38,970 and spelling out people's names in peptides and things 257 00:13:38,970 --> 00:13:39,470 like that. 258 00:13:39,470 --> 00:13:42,640 I'll let you do that in the privacy of your own room. 259 00:13:42,640 --> 00:13:46,180 It's kind of amusing to see if your name actually spells out 260 00:13:46,180 --> 00:13:47,350 a peptide. 261 00:13:47,350 --> 00:13:50,380 Some of us-- I get a little stuck with Barbara 262 00:13:50,380 --> 00:13:54,670 because there are no amino acid one-letter codes with a B. 263 00:13:54,670 --> 00:13:58,090 The next most abundant type of amino acids 264 00:13:58,090 --> 00:14:01,300 have hydrophobic side chains. 265 00:14:01,300 --> 00:14:03,170 What that means is they have a lot of CHs, 266 00:14:03,170 --> 00:14:05,710 but not a lot else, right? 267 00:14:05,710 --> 00:14:07,290 So take a look at them. 268 00:14:07,290 --> 00:14:13,740 Alanine has a methyl group, for example, where I've shown 269 00:14:13,740 --> 00:14:17,730 the R, that would be alanine. 270 00:14:17,730 --> 00:14:19,500 And they get increasingly big. 271 00:14:19,500 --> 00:14:20,950 They're quite large. 272 00:14:20,950 --> 00:14:23,670 Some of them have quite extended side chains. 
273 00:14:23,670 --> 00:14:25,740 Other ones have side chains with rings 274 00:14:25,740 --> 00:14:27,360 with double bonds in them. 275 00:14:27,360 --> 00:14:30,780 Those are what we would designate in organic chemistry 276 00:14:30,780 --> 00:14:32,190 as aromatic. 277 00:14:32,190 --> 00:14:34,320 They show-- they are still hydrophobic, 278 00:14:34,320 --> 00:14:36,250 but they show different properties 279 00:14:36,250 --> 00:14:39,210 to this other set of amino acids. 280 00:14:39,210 --> 00:14:41,340 Some of these amino acids may actually 281 00:14:41,340 --> 00:14:45,330 have polar groups in them, but their major feature 282 00:14:45,330 --> 00:14:47,160 is that they're hydrophobic. 283 00:14:47,160 --> 00:14:49,830 But in an amino acid, such as tyrosine, 284 00:14:49,830 --> 00:14:53,190 you could not only have hydrophobic interactions 285 00:14:53,190 --> 00:14:57,660 with that ring system, but also hydrogen bonding with the OH 286 00:14:57,660 --> 00:15:00,060 on the tyrosine, so some of the amino acids 287 00:15:00,060 --> 00:15:03,120 can do a few different things. 288 00:15:03,120 --> 00:15:05,130 The next set of amino acids are those 289 00:15:05,130 --> 00:15:07,920 that are polar and charged, and I've 290 00:15:07,920 --> 00:15:12,120 shown you the most common state of all of those amino acids, 291 00:15:12,120 --> 00:15:14,670 but you already know that the amine of lysine 292 00:15:14,670 --> 00:15:16,500 is likely to be charged. 293 00:15:16,500 --> 00:15:19,680 This guanidinium group of arginine, take my word for it, 294 00:15:19,680 --> 00:15:20,660 it's charged. 295 00:15:20,660 --> 00:15:22,820 It's a bit more complicated to draw. 
296 00:15:22,820 --> 00:15:26,010 Histidine is also one of those that's annoying to draw, 297 00:15:26,010 --> 00:15:29,370 but the negatively-charged side chains with a carboxylate 298 00:15:29,370 --> 00:15:31,590 are both negatively charged, and that's 299 00:15:31,590 --> 00:15:34,230 something you would remember from the previous class 300 00:15:34,230 --> 00:15:35,190 hopefully. 301 00:15:35,190 --> 00:15:36,930 And then finally, there are amino acids 302 00:15:36,930 --> 00:15:39,360 with polar uncharged side chains, 303 00:15:39,360 --> 00:15:41,140 such as those shown here. 304 00:15:41,140 --> 00:15:44,190 Now, this doesn't look like a very exciting set 305 00:15:44,190 --> 00:15:45,120 of building blocks. 306 00:15:45,120 --> 00:15:49,800 How can life run on things made of 20 relatively 307 00:15:49,800 --> 00:15:52,460 simple building blocks with functional groups? 308 00:15:52,460 --> 00:15:54,870 And it's that the building blocks are not functional 309 00:15:54,870 --> 00:15:56,110 themselves. 310 00:15:56,110 --> 00:16:02,680 It is the polymers that are made up of amino acids, 311 00:16:02,680 --> 00:16:05,830 and I'll always call them AAs because it's easier for me. 312 00:16:05,830 --> 00:16:09,760 The polymers of amino acids are heteropolymers. 313 00:16:14,420 --> 00:16:17,660 That means they're made up of a bunch of different monomer 314 00:16:17,660 --> 00:16:19,760 units when they're called heteropolymers. 315 00:16:26,710 --> 00:16:29,770 And the other important thing about these polymers 316 00:16:29,770 --> 00:16:34,600 is that they are of defined sequence. 317 00:16:38,030 --> 00:16:39,270 What is the sequence? 318 00:16:39,270 --> 00:16:42,510 It's the order in which the amino acids appear. 319 00:16:42,510 --> 00:16:46,430 So I'm writing that down, order. 
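To get a feel for how 20 simple building blocks can be enough, it helps to count how many distinct ordered sequences a heteropolymer of a given length can have. The sketch below assumes each position can independently be any of the 20 encoded amino acids, so a chain of length n has 20^n possible sequences; the function name `count_sequences` is made up for the illustration.

```python
ENCODED_AMINO_ACIDS = 20  # the 20 encoded amino acids from the lecture


def count_sequences(length: int) -> int:
    """Number of distinct ordered sequences of the given length.

    Assumes every position can hold any of the 20 encoded amino acids.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return ENCODED_AMINO_ACIDS ** length


if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        total = count_sequences(n)
        # Report the digit count as well, because the numbers grow very fast.
        print(f"length {n}: {len(str(total))} digits")
```

Even a dipeptide already has 400 possible sequences, and the count grows by a factor of 20 with every added residue.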
320 00:16:46,430 --> 00:16:48,100 And all the functions of proteins 321 00:16:48,100 --> 00:16:51,010 are dictated by the order of the amino acids, 322 00:16:51,010 --> 00:16:53,630 so let's take a look at the sidebar here. 323 00:16:53,630 --> 00:16:56,290 So once again, remember a couple of things 324 00:16:56,290 --> 00:16:59,510 that we will always give you this table to think about. 325 00:16:59,510 --> 00:17:00,967 Ooh, come back. 326 00:17:00,967 --> 00:17:03,550 There are a couple of outliers I just want to mention quickly. 327 00:17:03,550 --> 00:17:05,920 So I talked to you about glycine, 328 00:17:05,920 --> 00:17:09,819 the simplest amino acid with no elaborate side chain. 329 00:17:09,819 --> 00:17:13,260 Proline is a little odd because its side chain is kind of 330 00:17:13,260 --> 00:17:16,210 in a cyclic structure, and towards the end of the class, 331 00:17:16,210 --> 00:17:19,480 I'll talk to you about collagen, whose structure 332 00:17:19,480 --> 00:17:23,109 is totally dependent on the involvement of proline 333 00:17:23,109 --> 00:17:27,160 in the sequence of the amino acids that make up collagen. 334 00:17:27,160 --> 00:17:31,480 And then the last sort of unusual amino acid is cysteine. 335 00:17:31,480 --> 00:17:36,180 It has a thiol, and the one clever thing about cysteine-- 336 00:17:36,180 --> 00:17:39,890 I'm just going to put a bit of a peptide here. 337 00:17:39,890 --> 00:17:49,640 One cysteine, and then I'm going to put a second cysteine, 338 00:17:49,640 --> 00:17:55,120 and these are going to be deemed in a peptidic structure. 339 00:17:55,120 --> 00:17:58,630 What cysteine can do is it can exist either 340 00:17:58,630 --> 00:18:03,130 with the thiol side chain, SH, or it 341 00:18:03,130 --> 00:18:06,280 can be at a different oxidation state 342 00:18:06,280 --> 00:18:12,290 where the two sulfurs are joined to each other. 
343 00:18:12,290 --> 00:18:16,240 So for the most part, your linear arrangement 344 00:18:16,240 --> 00:18:20,530 of amino acids that dictates sequence is solely held by-- 345 00:18:20,530 --> 00:18:24,370 together by the covalent bonds and the peptide backbone 346 00:18:24,370 --> 00:18:26,360 that we'll talk about in a minute. 347 00:18:26,360 --> 00:18:28,900 But occasionally, in folded structures, 348 00:18:28,900 --> 00:18:31,540 if two cysteines are close to each other 349 00:18:31,540 --> 00:18:33,430 and the environment is oxidizing, 350 00:18:33,430 --> 00:18:35,140 they will form a cross-link. 351 00:18:35,140 --> 00:18:37,240 But they're not what drives folding. 352 00:18:37,240 --> 00:18:39,490 They kind of fall into place later on, 353 00:18:39,490 --> 00:18:42,130 but that just sort of sets cysteine apart a little bit 354 00:18:42,130 --> 00:18:44,830 for its properties, all right? 355 00:18:44,830 --> 00:18:47,000 OK, so coming down the side here. 356 00:18:47,000 --> 00:18:50,350 Amino acids are assembled in a unique linear polymer 357 00:18:50,350 --> 00:18:53,650 of defined order, and we designate that defined 358 00:18:53,650 --> 00:18:55,855 sequence the primary sequence. 359 00:19:03,450 --> 00:19:09,850 And proteins can be 1,000 amino acids, 1,500, 100 amino acids. 360 00:19:09,850 --> 00:19:17,910 They can be various lengths where they, you know, 361 00:19:17,910 --> 00:19:20,400 we would generally consider the smallest 362 00:19:20,400 --> 00:19:23,960 protein to be about 40 amino acids, 363 00:19:23,960 --> 00:19:27,270 and you might go up to thousands of amino acids. 364 00:19:27,270 --> 00:19:30,390 I'm going to write 2,000 or more here. 365 00:19:30,390 --> 00:19:33,570 When the proteins are smaller, they 366 00:19:33,570 --> 00:19:37,400 are not capable of adopting too much ordered structure, 367 00:19:37,400 --> 00:19:39,240 and we mostly call them peptides. 
368 00:19:39,240 --> 00:19:41,820 Peptides are sort of shorter sequences, 369 00:19:41,820 --> 00:19:44,260 so peptide sequences. 370 00:19:44,260 --> 00:19:54,640 So this would be a protein, and peptides, probably two 371 00:19:54,640 --> 00:19:57,740 to 39 amino acids, but these breakpoints 372 00:19:57,740 --> 00:19:59,990 are a little bit more vague. 373 00:19:59,990 --> 00:20:04,077 So the primary sequence will define the structure 374 00:20:04,077 --> 00:20:05,660 of a protein, and we're going to start 375 00:20:05,660 --> 00:20:08,600 to talk about the hierarchical structure of proteins 376 00:20:08,600 --> 00:20:12,320 as put in place, and that's the primary sequence. 377 00:20:12,320 --> 00:20:15,950 And that primary sequence is kind of a cool thing 378 00:20:15,950 --> 00:20:18,050 because it's very specific. 379 00:20:18,050 --> 00:20:21,950 It defines-- it's got encoded into its structure, 380 00:20:21,950 --> 00:20:25,820 the three-dimensional fold of the protein, OK? 381 00:20:25,820 --> 00:20:29,630 All the information for the folded, compact, globular 382 00:20:29,630 --> 00:20:33,320 structure that's functional is encoded 383 00:20:33,320 --> 00:20:35,240 in that primary sequence. 384 00:20:35,240 --> 00:20:36,890 It's a cryptic code. 385 00:20:36,890 --> 00:20:39,830 We may not be able to tell by looking at it 386 00:20:39,830 --> 00:20:42,800 what it really looks like, but all the information 387 00:20:42,800 --> 00:20:48,080 is there in order to program the folding into a globular 388 00:20:48,080 --> 00:20:49,220 structure. 389 00:20:49,220 --> 00:20:51,950 So the primary sequence determines the fold, 390 00:20:51,950 --> 00:20:56,060 and it's the fold of the protein that mandates its function. 391 00:20:56,060 --> 00:20:57,830 It's not the sequence of the protein. 392 00:20:57,830 --> 00:20:59,930 The sequence defines the fold. 393 00:20:59,930 --> 00:21:04,730 The fold, the three-dimensional form, defines the function, OK? 
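The rough size breakpoints between peptides and proteins can be written down as a small helper. This is a sketch under stated assumptions: it takes the "two to 39 amino acids" range for peptides at face value and treats about 40 residues as the start of the protein range, even though, as noted above, these breakpoints are vague. The name `classify_chain` and the constant `PROTEIN_MIN` are hypothetical.

```python
PEPTIDE_MIN = 2    # a dipeptide is the shortest peptide
PROTEIN_MIN = 40   # assumed approximate cutoff; the real boundary is fuzzy


def classify_chain(n_residues: int) -> str:
    """Label a chain by its length using the lecture's rough breakpoints."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one amino acid")
    if n_residues < PEPTIDE_MIN:
        return "amino acid"
    if n_residues < PROTEIN_MIN:
        return "peptide"
    return "protein"


if __name__ == "__main__":
    for n in (1, 2, 39, 40, 2000):
        print(n, classify_chain(n))
```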
394 00:21:04,730 --> 00:21:05,900 So that's very important. 395 00:21:05,900 --> 00:21:08,420 And I think it's absolutely amazing 396 00:21:08,420 --> 00:21:11,510 that with a relatively limited set of building blocks, 397 00:21:11,510 --> 00:21:15,830 we can define so many different functions of all the proteins 398 00:21:15,830 --> 00:21:18,290 in our body that may be structural, 399 00:21:18,290 --> 00:21:20,900 they may be catalysts, they may be things 400 00:21:20,900 --> 00:21:23,570 that transfer information from the outside 401 00:21:23,570 --> 00:21:25,130 to the inside of the cell. 402 00:21:25,130 --> 00:21:29,570 All of that is programmed with this rather limited set 403 00:21:29,570 --> 00:21:32,440 of building blocks, OK? 404 00:21:32,440 --> 00:21:36,400 Now, let's talk about peptides 405 00:21:36,400 --> 00:21:38,500 because one gets a little frustrated 406 00:21:38,500 --> 00:21:40,450 looking at single amino acids. 407 00:21:40,450 --> 00:21:44,330 They don't tell us so much about the peptidic structure, 408 00:21:44,330 --> 00:21:48,620 so I'm going to draw two amino acids, 409 00:21:48,620 --> 00:21:51,970 and then I'm going to tell you one important thing. 410 00:21:51,970 --> 00:21:55,870 So let's put R1, and I'm going to draw another amino acid, 411 00:21:55,870 --> 00:21:58,840 and I'm putting it in a particular orientation. 412 00:22:01,870 --> 00:22:07,850 R2, because that designates that these 413 00:22:07,850 --> 00:22:09,990 might be different amino acids. 414 00:22:09,990 --> 00:22:14,270 For example, if R1 is H, there's an implied hydrogen here, 415 00:22:14,270 --> 00:22:16,070 that would be glycine. 416 00:22:16,070 --> 00:22:19,460 If R2 is a methyl group, there's an implied hydrogen there, 417 00:22:19,460 --> 00:22:21,770 that would be alanine, all right? 
418 00:22:21,770 --> 00:22:26,990 When nature bonds all these amino acids together, 419 00:22:26,990 --> 00:22:33,940 it carries out a condensation reaction 420 00:22:33,940 --> 00:22:39,290 to form a peptide bond between these two 421 00:22:39,290 --> 00:22:42,410 components of the amino acid, the amine 422 00:22:42,410 --> 00:22:44,275 and the carboxylic acid. 423 00:22:44,275 --> 00:22:47,840 And now I'm going to draw you the first of the dipeptides 424 00:22:47,840 --> 00:22:49,880 that you'll meet. 425 00:22:49,880 --> 00:22:52,720 And there are so many things to tell you 426 00:22:52,720 --> 00:22:55,120 about these structures, it sort of 427 00:22:55,120 --> 00:22:57,035 drives me crazy thinking about, oh, I 428 00:22:57,035 --> 00:22:58,660 must remember to tell them that or I've 429 00:22:58,660 --> 00:23:01,180 got to remember to tell them that, because the structures 430 00:23:01,180 --> 00:23:01,720 are cool. 431 00:23:01,720 --> 00:23:04,150 R1, R2. 432 00:23:04,150 --> 00:23:11,763 OK, so this is a dipeptide, two amino acids, 433 00:23:11,763 --> 00:23:13,180 and there are some characteristics 434 00:23:13,180 --> 00:23:14,740 I want you to remember. 435 00:23:14,740 --> 00:23:23,570 When we write out peptides, we always write them N to C. 436 00:23:23,570 --> 00:23:33,200 So in that peptide, this would be the carboxyl terminus, 437 00:23:33,200 --> 00:23:35,000 and this would be the amino terminus. 438 00:23:37,890 --> 00:23:40,890 If you don't always remember to write things in this order, 439 00:23:40,890 --> 00:23:43,980 and you tell your friend, oh, go and get this peptide 440 00:23:43,980 --> 00:23:45,990 made, and you put it down in the wrong order, 441 00:23:45,990 --> 00:23:47,910 they'll make the wrong peptide. 442 00:23:47,910 --> 00:23:51,030 So you always-- there is basically an agreement 443 00:23:51,030 --> 00:23:54,450 amongst everyone that we always write from left to right, 444 00:23:54,450 --> 00:23:56,340 the sequence of peptides. 
445 00:23:56,340 --> 00:23:58,500 The next important thing about this structure, 446 00:23:58,500 --> 00:24:03,660 as you look at it, there are several bonds joining 447 00:24:03,660 --> 00:24:06,030 the polymeric structure. 448 00:24:06,030 --> 00:24:10,500 Many of these bonds show free rotations. 449 00:24:10,500 --> 00:24:12,420 You can twist them around, there's nothing 450 00:24:12,420 --> 00:24:13,830 stopping that conversion. 451 00:24:16,370 --> 00:24:18,780 All of these show freedom of rotation. 452 00:24:18,780 --> 00:24:29,030 But the amide, or peptide bond, is 453 00:24:29,030 --> 00:24:31,070 unique in that there's restricted 454 00:24:31,070 --> 00:24:33,180 rotation about that bond. 455 00:24:33,180 --> 00:24:35,360 So it's as if you've got a linear polymer, 456 00:24:35,360 --> 00:24:38,780 but every third bond has kind of stuck 457 00:24:38,780 --> 00:24:41,000 in a particular orientation, which 458 00:24:41,000 --> 00:24:45,350 starts to define a lot of details about protein tertiary 459 00:24:45,350 --> 00:24:46,190 structure. 460 00:24:46,190 --> 00:24:48,647 It's not complete spaghetti. 461 00:24:48,647 --> 00:24:51,230 It's like spaghetti with little bits that haven't been cooked. 462 00:24:51,230 --> 00:24:54,040 They're stiffer than the rest of the sequence. 463 00:24:54,040 --> 00:24:55,640 And the other really important thing 464 00:24:55,640 --> 00:24:58,370 about the peptide structure is that embedded 465 00:24:58,370 --> 00:25:05,710 within that structure, there is the amide or peptide functional 466 00:25:05,710 --> 00:25:15,320 group where, remember, this can be a hydrogen bond acceptor, 467 00:25:15,320 --> 00:25:18,340 and this can be a hydrogen bond donor. 468 00:25:18,340 --> 00:25:21,940 Once you know that, the next few slides will make a lot of sense 469 00:25:21,940 --> 00:25:25,155 as we talk about higher-order structure of proteins. 
470 00:25:25,155 --> 00:25:26,530 So let's just take a look at that 471 00:25:26,530 --> 00:25:28,870 with a slightly longer peptide. 472 00:25:28,870 --> 00:25:31,600 By convention, if I'm going to draw a peptide 473 00:25:31,600 --> 00:25:34,160 that's methionine isoleucine threonine-- 474 00:25:34,160 --> 00:25:35,500 you can look up that names-- 475 00:25:35,500 --> 00:25:37,180 those names on the chart-- 476 00:25:37,180 --> 00:25:39,490 that would be the MIT peptide. 477 00:25:39,490 --> 00:25:41,740 These are the three amino acids. 478 00:25:41,740 --> 00:25:45,430 I'm going to condense them into a tripeptide. 479 00:25:45,430 --> 00:25:47,240 When I condense three amino acids, 480 00:25:47,240 --> 00:25:49,510 I spit out two molecules of water, 481 00:25:49,510 --> 00:25:54,430 and I put in place two amide or peptide bonds. 482 00:25:54,430 --> 00:25:58,420 If I go down this backbone, every third bond 483 00:25:58,420 --> 00:26:00,440 is going to be fixed, fairly fixed. 484 00:26:00,440 --> 00:26:02,950 There's not freedom of rotation around it, 485 00:26:02,950 --> 00:26:06,190 and every third bond is going to have the capacity 486 00:26:06,190 --> 00:26:09,580 to be involved in hydrogen bonding interactions, 487 00:26:09,580 --> 00:26:12,610 as I've suggested here, all right? 488 00:26:12,610 --> 00:26:14,350 What else is there here? 489 00:26:14,350 --> 00:26:19,090 When I write the MIT peptide, I write M first, 490 00:26:19,090 --> 00:26:21,730 I second, T third. 491 00:26:21,730 --> 00:26:25,450 If I wrote TIM, it would be a completely different chemical 492 00:26:25,450 --> 00:26:27,940 structure with different chemical properties, 493 00:26:27,940 --> 00:26:31,600 so the directionality is important to understand, 494 00:26:31,600 --> 00:26:33,290 and there you have it. 495 00:26:33,290 --> 00:26:35,170 So now you can go home and practice your name 496 00:26:35,170 --> 00:26:37,840 in amino acids and draw them out. 
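The bookkeeping for the MIT tripeptide can be sketched in a few lines of Python. This is a minimal illustration, assuming a linear peptide written N-terminus to C-terminus in one-letter code; the names `RESIDUE_NAMES` and `describe_peptide` are hypothetical, and only the three residues from the example are included.

```python
# Hypothetical lookup table: only the three residues used in the lecture's example.
RESIDUE_NAMES = {"M": "methionine", "I": "isoleucine", "T": "threonine"}


def describe_peptide(sequence: str) -> dict:
    """Summarize a linear peptide written N-terminus to C-terminus."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    names = [RESIDUE_NAMES[letter] for letter in sequence]
    # Condensing n amino acids forms n - 1 peptide bonds and releases n - 1 waters.
    bonds = len(sequence) - 1
    return {
        "n_terminus": names[0],
        "c_terminus": names[-1],
        "peptide_bonds": bonds,
        "waters_released": bonds,
    }


mit = describe_peptide("MIT")
tim = describe_peptide("TIM")
print(mit)
print(tim)
```

Both peptides have two peptide bonds, but their termini are swapped, which is why MIT and TIM are different molecules.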
497 00:26:37,840 --> 00:26:41,560 If you draw them out fairly sort of sharply, 498 00:26:41,560 --> 00:26:44,290 then you'll never get confused about what 499 00:26:44,290 --> 00:26:46,883 end's what and where the substituents are, 500 00:26:46,883 --> 00:26:48,550 but it's important to remember as you're 501 00:26:48,550 --> 00:26:51,640 making a dipeptide-- oops, I forget this doesn't work. 502 00:26:51,640 --> 00:26:53,770 As you're condensing a dipeptide, 503 00:26:53,770 --> 00:26:56,110 when you're putting these R groups on, one goes up, 504 00:26:56,110 --> 00:26:58,300 one goes down, but these are nuances 505 00:26:58,300 --> 00:27:00,700 of the structure that may be left for-- good 506 00:27:00,700 --> 00:27:02,290 for a later discussion. 507 00:27:02,290 --> 00:27:06,760 So here is now a longer linear peptide, and the suggestion 508 00:27:06,760 --> 00:27:09,400 of a globular structure that might be found 509 00:27:09,400 --> 00:27:12,970 if that peptide was folded up. 510 00:27:12,970 --> 00:27:20,850 And the primary sequence here defines the globular structure, 511 00:27:20,850 --> 00:27:25,330 and the process whereby you go from the extended primary 512 00:27:25,330 --> 00:27:29,530 sequence to the folded structure is called protein folding. 513 00:27:37,130 --> 00:27:39,470 And physical chemists and physicists 514 00:27:39,470 --> 00:27:42,380 and computational chemists have for years 515 00:27:42,380 --> 00:27:47,060 tried to understand how we could predict the folded structure 516 00:27:47,060 --> 00:27:49,190 from the primary sequence. 517 00:27:49,190 --> 00:27:52,130 It's not simple because what you're doing 518 00:27:52,130 --> 00:27:55,210 is you're solving a massive energy diagram, 519 00:27:55,210 --> 00:27:57,170 where as you fold a structure up, 520 00:27:57,170 --> 00:28:02,090 you're trying to maximize all those non-covalent forces 521 00:28:02,090 --> 00:28:06,920 for maximum thermodynamic stability, right?
522 00:28:06,920 --> 00:28:09,950 It's kind of a three-dimensional puzzle where you're 523 00:28:09,950 --> 00:28:12,510 trying to have as many hydrogen bonds, 524 00:28:12,510 --> 00:28:15,650 electrostatic interactions, and so on, as you can possibly 525 00:28:15,650 --> 00:28:16,460 make. 526 00:28:16,460 --> 00:28:20,090 So when computational chemists try to fold proteins, 527 00:28:20,090 --> 00:28:22,940 they're basically solving a three-dimensional puzzle 528 00:28:22,940 --> 00:28:25,720 where they are maximizing interactions. 529 00:28:25,720 --> 00:28:28,850 And there are a lot of ab initio and molecular dynamics 530 00:28:28,850 --> 00:28:34,010 programs that are now starting to be able to fold proteins 531 00:28:34,010 --> 00:28:36,890 into fairly reliable structures, but they don't always 532 00:28:36,890 --> 00:28:40,370 get them right because they haven't 533 00:28:40,370 --> 00:28:41,810 gotten all the clues yet. 534 00:28:41,810 --> 00:28:45,500 And also while they may be able to do ab initio 535 00:28:45,500 --> 00:28:48,830 or computational folding with small structures, 536 00:28:48,830 --> 00:28:51,990 the headache gets way bigger the larger the structures get. 537 00:28:51,990 --> 00:28:55,370 So the predictors aren't very good at predicting 538 00:28:55,370 --> 00:28:57,620 big structures, they're getting better 539 00:28:57,620 --> 00:28:59,720 at predicting small structures. 540 00:28:59,720 --> 00:29:05,870 And so just to reinforce to you, the primary sequence 541 00:29:05,870 --> 00:29:09,710 is established by covalent bonds, the peptide bonds, 542 00:29:09,710 --> 00:29:12,200 but the globular tertiary structure 543 00:29:12,200 --> 00:29:15,920 is based on non-covalent interactions, OK? 544 00:29:15,920 --> 00:29:17,870 Now, I want to ask you this.
545 00:29:17,870 --> 00:29:20,390 I love cartoons with science in them, 546 00:29:20,390 --> 00:29:25,910 but you know, 10%, 20% of the time, they make mistakes, 547 00:29:25,910 --> 00:29:28,430 and I felt this one was particularly pertinent. 548 00:29:28,430 --> 00:29:31,490 So a bunch of guys lugging around in a lab and says, 549 00:29:31,490 --> 00:29:34,700 well, we finished the genome map, 550 00:29:34,700 --> 00:29:37,147 now we just have to figure out how to fold it. 551 00:29:37,147 --> 00:29:38,480 What is wrong with that cartoon? 552 00:29:41,970 --> 00:29:42,470 What fold? 553 00:29:42,470 --> 00:29:43,072 Yeah? 554 00:29:43,072 --> 00:29:44,530 AUDIENCE: You want to [INAUDIBLE].. 555 00:29:44,530 --> 00:29:45,197 PROFESSOR: Yeah. 556 00:29:45,197 --> 00:29:46,150 AUDIENCE: [INAUDIBLE] 557 00:29:46,150 --> 00:29:48,180 PROFESSOR: Yeah, the genome doesn't fold. 558 00:29:48,180 --> 00:29:51,660 It's double helical, duplex DNA or something. 559 00:29:51,660 --> 00:29:53,260 You're actually folding proteins, 560 00:29:53,260 --> 00:29:54,960 so the cartoon is not quite right, 561 00:29:54,960 --> 00:29:56,790 but it's sort of kind of cute. 562 00:29:56,790 --> 00:30:00,360 All right, now, when we talk about the non-covalent forces 563 00:30:00,360 --> 00:30:02,490 that hold proteins together, I just 564 00:30:02,490 --> 00:30:04,440 want you to remember from last time 565 00:30:04,440 --> 00:30:08,370 this set of non-covalent forces, because if you understand them 566 00:30:08,370 --> 00:30:11,810 and recognize them, you'll understand how they may occur 567 00:30:11,810 --> 00:30:14,070 in folded protein structures. 568 00:30:14,070 --> 00:30:18,190 All right, so here's a peptide sequence. 569 00:30:18,190 --> 00:30:19,190 Here's a puzzle for you. 570 00:30:19,190 --> 00:30:21,990 You can go back and figure out what the one-letter code spells 571 00:30:21,990 --> 00:30:23,280 there. 
572 00:30:23,280 --> 00:30:27,500 Just take out your table with all the amino acids. 573 00:30:27,500 --> 00:30:29,790 It's appended to the back of your P-set, 574 00:30:29,790 --> 00:30:33,020 and you'll be able to see what that very large peptide spells. 575 00:30:33,020 --> 00:30:34,770 All right, I don't want you working it out 576 00:30:34,770 --> 00:30:35,290 while you're here. 577 00:30:35,290 --> 00:30:37,270 You've got to listen to me for the time being. 578 00:30:37,270 --> 00:30:47,050 OK, so the first order, we get it, there's a primary sequence. 579 00:30:52,210 --> 00:30:54,380 The next thing to think about is what's 580 00:30:54,380 --> 00:30:56,510 known as secondary structure. 581 00:30:56,510 --> 00:30:59,780 It's a higher order than just the primary sequence, 582 00:30:59,780 --> 00:31:05,600 and it's established by non-covalent bonds, 583 00:31:05,600 --> 00:31:08,810 and it's called secondary-- oof, my writing's horrid today. 584 00:31:08,810 --> 00:31:10,420 Secondary structure. 585 00:31:13,740 --> 00:31:26,530 And those are interactions that are put in place exclusively 586 00:31:26,530 --> 00:31:31,090 by interactions between the peptide bonds of what's 587 00:31:31,090 --> 00:31:33,220 known as the peptide backbone. 588 00:31:33,220 --> 00:31:36,490 So if I look at the structure, these are the side chains. 589 00:31:36,490 --> 00:31:41,320 The peptide backbone is this continuous linear sequence. 590 00:31:41,320 --> 00:31:43,840 That's what we would call the peptide backbone, 591 00:31:43,840 --> 00:31:46,450 and the secondary structure is put in place 592 00:31:46,450 --> 00:31:49,750 by hydrogen bonding between components 593 00:31:49,750 --> 00:31:51,370 of the peptide backbone. 594 00:31:51,370 --> 00:31:54,890 So for example, a hydrogen bond, 595 00:31:54,890 --> 00:32:02,810 such as that, or a different hydrogen bonding interaction, 596 00:32:02,810 --> 00:32:03,680 such as that.
597 00:32:03,680 --> 00:32:08,300 Between the atoms that have lone pairs of electrons 598 00:32:08,300 --> 00:32:10,640 and the other atoms-- 599 00:32:10,640 --> 00:32:14,180 heavy atoms that hold a hydrogen that's quite acidic. 600 00:32:14,180 --> 00:32:18,050 And there are a couple of major forms of secondary structure. 601 00:32:18,050 --> 00:32:20,300 What I'm showing you here is what's 602 00:32:20,300 --> 00:32:22,910 known as the alpha helix. 603 00:32:22,910 --> 00:32:26,660 First deduced by Pauling, in fact, through model building, 604 00:32:26,660 --> 00:32:30,230 he said, proteins could form these ordered structures, 605 00:32:30,230 --> 00:32:34,520 and an alpha helix is an ordered structure exclusively made up 606 00:32:34,520 --> 00:32:36,470 from the hydrogen-bonding interactions 607 00:32:36,470 --> 00:32:38,130 of the peptide backbone. 608 00:32:38,130 --> 00:32:40,640 And you can look at this helical structure. 609 00:32:40,640 --> 00:32:44,060 It's a continuous strand of peptide, 610 00:32:44,060 --> 00:32:50,510 but there are hydrogen bonds between COs and NHs all the way 611 00:32:50,510 --> 00:32:54,530 through the backbone, such that this strand of peptide 612 00:32:54,530 --> 00:32:58,190 can fold up into a cylindrical, helical structure, where 613 00:32:58,190 --> 00:33:01,655 all those R groups, the side chains of the amino acids, 614 00:33:01,655 --> 00:33:04,550 are on the perimeter of that helix. 615 00:33:04,550 --> 00:33:08,930 So this secondary structure is an important one 616 00:33:08,930 --> 00:33:13,830 because it's very prevalent in a lot of proteins. 
617 00:33:13,830 --> 00:33:17,030 The next secondary structure is also held together 618 00:33:17,030 --> 00:33:20,180 by hydrogen bonding, and it's interactions 619 00:33:20,180 --> 00:33:24,140 between stretched out strands of peptides that may not 620 00:33:24,140 --> 00:33:26,690 be close to each other in the primary sequence, 621 00:33:26,690 --> 00:33:29,750 but they align in the folded structure. 622 00:33:29,750 --> 00:33:31,280 And so for example, what I've shown 623 00:33:31,280 --> 00:33:34,380 you here is what's known as a-- 624 00:33:34,380 --> 00:33:38,330 this is an anti-parallel beta sheet. 625 00:33:38,330 --> 00:33:42,470 And across that sheet, there are continuous opportunities 626 00:33:42,470 --> 00:33:44,690 for hydrogen bonding interaction. 627 00:33:44,690 --> 00:33:48,350 If the strands run in opposite directions, it's anti-parallel. 628 00:33:48,350 --> 00:33:50,930 If they're in the same direction, it's parallel. 629 00:33:50,930 --> 00:33:57,220 These two secondary structure elements 630 00:33:57,220 --> 00:34:00,130 make up a lot of the sort of basics 631 00:34:00,130 --> 00:34:02,050 of how proteins start to fold. 632 00:34:02,050 --> 00:34:05,830 They're key non-covalent forces, and there are also 633 00:34:05,830 --> 00:34:07,870 other smaller motifs. 634 00:34:07,870 --> 00:34:14,739 One is called a beta turn, where the peptide sequence may 635 00:34:14,739 --> 00:34:18,040 go through a chain reversal, so the sequence 636 00:34:18,040 --> 00:34:19,030 would look like this. 637 00:34:19,030 --> 00:34:20,488 I'm going to just draw it, and I'll 638 00:34:20,488 --> 00:34:24,560 talk to you in a moment about ribbon diagrams. 639 00:34:24,560 --> 00:34:27,560 And this piece here would be the turn, 640 00:34:27,560 --> 00:34:30,108 whereas that would be the interactions enforced 641 00:34:30,108 --> 00:34:30,650 by the sheet. 642 00:34:30,650 --> 00:34:34,409 These are the ordered elements of secondary structure.
643 00:34:34,409 --> 00:34:36,840 You don't have to be able to figure them out, 644 00:34:36,840 --> 00:34:38,659 but you have to be able to pick them out 645 00:34:38,659 --> 00:34:42,679 in order to understand the structure, OK? 646 00:34:42,679 --> 00:34:46,159 So even those simple elements still 647 00:34:46,159 --> 00:34:50,190 it's hard to make big enough structures to have functions. 648 00:34:50,190 --> 00:34:53,480 So as I mentioned in a continuation of the theme, 649 00:34:53,480 --> 00:34:56,540 the protein folding is hierarchical, 650 00:34:56,540 --> 00:35:00,650 you can start to put together elements of secondary structure 651 00:35:00,650 --> 00:35:02,990 to make things that are a little larger. 652 00:35:02,990 --> 00:35:05,780 Helix, turn, helix. 653 00:35:05,780 --> 00:35:08,180 Helix with a different kind of turn, 654 00:35:08,180 --> 00:35:12,650 maybe put in place by a metal ion or something, or a strand, 655 00:35:12,650 --> 00:35:15,020 turn, strand, or now something that's 656 00:35:15,020 --> 00:35:20,180 a composite of these two major types of secondary structure, 657 00:35:20,180 --> 00:35:21,840 the helix and the turn. 658 00:35:21,840 --> 00:35:23,960 And these really start to be proteins 659 00:35:23,960 --> 00:35:27,290 that might be big enough to be able to do something, 660 00:35:27,290 --> 00:35:29,750 but they're all exclusively held together 661 00:35:29,750 --> 00:35:34,940 by non-covalent forces between the amides or peptide bonds 662 00:35:34,940 --> 00:35:37,580 in the backbone of the protein, OK? 663 00:35:37,580 --> 00:35:40,690 Not very exciting just yet. 
664 00:35:40,690 --> 00:35:43,570 Now, one other little clue that people will-- 665 00:35:43,570 --> 00:35:45,730 you might see and you might be confused, 666 00:35:45,730 --> 00:35:47,800 people sometimes, when they're drawing sort 667 00:35:47,800 --> 00:35:51,520 of a quick picture of a protein, they might draw a helix, 668 00:35:51,520 --> 00:35:53,890 but instead of really showing it in detail, 669 00:35:53,890 --> 00:35:55,510 they might show it as a cylinder, 670 00:35:55,510 --> 00:35:59,910 so you might need to pick that out of a structure. 671 00:35:59,910 --> 00:36:02,620 And then I want to call your attention to that, 672 00:36:02,620 --> 00:36:07,370 that in all those motifs, when you join one helix to another, 673 00:36:07,370 --> 00:36:11,090 you might need a turn; a strand to another strand, you 674 00:36:11,090 --> 00:36:12,650 need a turn, and so on. 675 00:36:12,650 --> 00:36:16,730 OK, so this is like taking your very extended sort 676 00:36:16,730 --> 00:36:19,850 of polymer, knowing there are different kinks in it, because 677 00:36:19,850 --> 00:36:22,490 of the backbone bonds, but folding it up 678 00:36:22,490 --> 00:36:26,510 in a structure that maximizes the opportunity 679 00:36:26,510 --> 00:36:30,540 for another order of structure, which we'll talk about now. 680 00:36:30,540 --> 00:36:33,770 All right, so we've seen primary. 681 00:36:33,770 --> 00:36:35,285 Secondary is just with backbone.
682 00:36:38,700 --> 00:36:42,660 And things start to get much more interesting 683 00:36:42,660 --> 00:36:45,240 when we get to tertiary structure, 684 00:36:45,240 --> 00:36:48,420 because tertiary structure is enabled 685 00:36:48,420 --> 00:36:52,740 by all these other interactions, electrostatic, hydrogen 686 00:36:52,740 --> 00:36:55,050 bonding, hydrophobic forces, that 687 00:36:55,050 --> 00:36:57,780 can be put in place due to the side 688 00:36:57,780 --> 00:37:01,402 chains of amino acids interacting with each other 689 00:37:01,402 --> 00:37:02,735 or with the backbone structures. 690 00:37:02,735 --> 00:37:05,580 So I'm going to walk you through this, so you can sort of get 691 00:37:05,580 --> 00:37:08,520 a sense of how these three-dimensional puzzles work 692 00:37:08,520 --> 00:37:10,110 on a very small scale. 693 00:37:10,110 --> 00:37:13,430 So look here, that's a very small motif. 694 00:37:13,430 --> 00:37:16,440 And what I'm going to call your attention to 695 00:37:16,440 --> 00:37:18,630 is when you fold up these motifs, 696 00:37:18,630 --> 00:37:20,640 when the secondary structure is in place, 697 00:37:20,640 --> 00:37:23,340 a lot of the side chains are near each other, 698 00:37:23,340 --> 00:37:26,400 and they can engage in long-distance contacts. 699 00:37:26,400 --> 00:37:29,190 And so for example, I'm going to show you 700 00:37:29,190 --> 00:37:31,650 interactions between side chains, 701 00:37:31,650 --> 00:37:34,260 between side chains and the peptide backbone, 702 00:37:34,260 --> 00:37:36,030 or side chains and water. 703 00:37:36,030 --> 00:37:38,280 But what I want to do is take a look at this and see, 704 00:37:38,280 --> 00:37:41,310 can you put any of those potential interactions 705 00:37:41,310 --> 00:37:44,250 on the drawing that's on your handout? 706 00:37:44,250 --> 00:37:46,140 It's pretty obvious where there's 707 00:37:46,140 --> 00:37:49,782 an electrostatic interaction, right? 
708 00:37:49,782 --> 00:37:50,730 Boop. 709 00:37:50,730 --> 00:37:53,430 OK, between plus-- get those out of the way, 710 00:37:53,430 --> 00:37:54,750 those are the easy ones. 711 00:37:54,750 --> 00:37:58,830 And then interactions between hydrophobic groups, 712 00:37:58,830 --> 00:38:02,380 where they want to amass that lipophilic structure, 713 00:38:02,380 --> 00:38:04,380 so it's not exposed as much to water, 714 00:38:04,380 --> 00:38:06,960 so they cluster, so those are easy. 715 00:38:06,960 --> 00:38:10,470 And then you can start thinking about what are all the hydrogen 716 00:38:10,470 --> 00:38:12,270 bonds you could draw. 717 00:38:12,270 --> 00:38:16,440 Here I've shown one between side chains, 718 00:38:16,440 --> 00:38:18,270 between side chains and backbone, 719 00:38:18,270 --> 00:38:21,180 between side chains and water, and those may all 720 00:38:21,180 --> 00:38:25,110 contribute to the ultimate thermodynamic stability. 721 00:38:25,110 --> 00:38:27,030 Make sure you get your hydrogen bonds right. 722 00:38:27,030 --> 00:38:31,240 Remember, two donors don't interact with each other 723 00:38:31,240 --> 00:38:32,680 and two acceptors don't-- 724 00:38:32,680 --> 00:38:36,420 so this might describe the folding possibilities 725 00:38:36,420 --> 00:38:38,190 of that small motif. 726 00:38:38,190 --> 00:38:40,376 Now what I want to show you-- 727 00:38:40,376 --> 00:38:41,980 I'm going to-- let me-- 728 00:38:46,330 --> 00:38:51,260 is an ab-initio simulation of a folding process. 729 00:38:51,260 --> 00:38:54,350 So let me just get that a little bigger on the screen. 730 00:38:54,350 --> 00:38:57,190 So this is computing. 731 00:38:57,190 --> 00:39:01,720 GB1 is a very small protein that folds reversibly 732 00:39:01,720 --> 00:39:05,380 under appropriate conditions, and what I'm going to do 733 00:39:05,380 --> 00:39:08,050 is forward you through this video. 734 00:39:08,050 --> 00:39:09,440 This is a simulation.
735 00:39:09,440 --> 00:39:10,840 This is all computation. 736 00:39:10,840 --> 00:39:14,350 It's not looking at anything by spectroscopy or in solution 737 00:39:14,350 --> 00:39:15,370 or anything like that. 738 00:39:19,053 --> 00:39:21,220 And what I'm going to do is I'm going to forward you 739 00:39:21,220 --> 00:39:22,870 through the structure. 740 00:39:22,870 --> 00:39:24,750 This is multi-scale modeling. 741 00:39:24,750 --> 00:39:28,110 It's got a lot of details in how it's done, 742 00:39:28,110 --> 00:39:31,530 but the starting point is a very denatured protein, 743 00:39:31,530 --> 00:39:33,450 all stretched out, right? 744 00:39:33,450 --> 00:39:36,570 And what I'm going to do is just show you for a few seconds, 745 00:39:36,570 --> 00:39:38,100 you know, this thing's like trying 746 00:39:38,100 --> 00:39:40,540 to find its thermodynamic minimum, 747 00:39:40,540 --> 00:39:42,360 and it's actually failing pretty badly. 748 00:39:42,360 --> 00:39:45,390 And it does that for about 30-- 749 00:39:45,390 --> 00:39:49,200 60 seconds of the simulations, so I made a point to myself 750 00:39:49,200 --> 00:39:52,260 to take you to about minute one, where things start 751 00:39:52,260 --> 00:39:53,475 to get fairly interesting. 752 00:39:53,475 --> 00:39:55,770 And you're saying, well, what's interesting about that? 753 00:39:55,770 --> 00:39:59,760 You see that nascent helix, in the background, the red 754 00:39:59,760 --> 00:40:02,430 and the blue, is starting to form strands that 755 00:40:02,430 --> 00:40:04,170 are a little bit aligned, and it's 756 00:40:04,170 --> 00:40:07,530 trying to find as many connections as possible 757 00:40:07,530 --> 00:40:09,660 to satisfy a stable structure. 758 00:40:09,660 --> 00:40:12,420 At a certain point in the simulation, 759 00:40:12,420 --> 00:40:15,720 five of the hydrophobic groups are in a little pea. 
760 00:40:15,720 --> 00:40:18,180 They're in a little hydrophobic cluster, 761 00:40:18,180 --> 00:40:21,370 and that's a breakpoint in the folding process, 762 00:40:21,370 --> 00:40:24,100 because that gets everything glued together better, 763 00:40:24,100 --> 00:40:26,310 so that the rest of it now can start 764 00:40:26,310 --> 00:40:30,270 to really find its final place in the folded structure. 765 00:40:30,270 --> 00:40:33,360 These early structures are known as molten globules. 766 00:40:33,360 --> 00:40:36,270 A lot of the interactions are not yet in place, 767 00:40:36,270 --> 00:40:38,610 but the hydrophobic cluster is critical. 768 00:40:38,610 --> 00:40:40,710 But then after that, it's almost as 769 00:40:40,710 --> 00:40:43,110 if you're sliding downhill to get 770 00:40:43,110 --> 00:40:47,110 all the remaining interactions in place to fold the protein, 771 00:40:47,110 --> 00:40:47,850 OK? 772 00:40:47,850 --> 00:40:51,150 So protein folding is a puzzle that 773 00:40:51,150 --> 00:40:54,390 can be solved computationally by maximizing 774 00:40:54,390 --> 00:40:56,340 thermodynamic interactions. 775 00:40:56,340 --> 00:41:00,990 So it's sigma this, sum of this, sum of this, sum of that. 776 00:41:00,990 --> 00:41:03,720 That's going to get difficult the larger the protein gets, 777 00:41:03,720 --> 00:41:07,110 but for small proteins, those simulations really 778 00:41:07,110 --> 00:41:10,020 start to make sense, OK? 779 00:41:10,020 --> 00:41:12,450 All right, so let's just move on here. 780 00:41:12,450 --> 00:41:14,478 Lost-- ah, good. 781 00:41:14,478 --> 00:41:16,020 What did you think of the simulation? 782 00:41:16,020 --> 00:41:17,250 It's kind of cool, right? 783 00:41:17,250 --> 00:41:19,480 So you can find the link in the sidebar. 784 00:41:19,480 --> 00:41:25,030 So just pop these back on now, and that's 785 00:41:25,030 --> 00:41:26,350 the folded structure. 
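The "sum of this, sum of that" description of the folding calculation can be written schematically. This is a rough sketch, assuming the folded state is scored as a sum over the classes of non-covalent interaction named in the lecture; the symbols are illustrative and are not taken from the GB1 simulation shown.

```latex
E_{\text{total}} \;\approx\;
    \sum_{\text{H-bonds}} E_{\text{hb}}
  + \sum_{\text{charge pairs}} E_{\text{elec}}
  + \sum_{\text{hydrophobic contacts}} E_{\text{hyd}}
  + \cdots
```

Maximizing the favorable interactions corresponds to making this total as favorable (as low) as possible. The number of terms grows quickly with chain length, which is one way to see why larger proteins are harder to fold computationally.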
786 00:41:26,350 --> 00:41:28,630 All right, so with many proteins, 787 00:41:28,630 --> 00:41:30,250 they're much more complex than that. 788 00:41:30,250 --> 00:41:34,640 So for example, here's cyclin A. It's involved in cell cycle, 789 00:41:34,640 --> 00:41:38,860 and you can see its alpha helix structure dominantly, 790 00:41:38,860 --> 00:41:41,860 very clearly, all those beautiful alpha helices. 791 00:41:41,860 --> 00:41:44,800 Next to it is the green fluorescent protein, 792 00:41:44,800 --> 00:41:47,440 which is a cylindrical structure made up 793 00:41:47,440 --> 00:41:49,510 of anti-parallel beta sheets. 794 00:41:49,510 --> 00:41:51,610 What's really cool is when you sort of rotate it, 795 00:41:51,610 --> 00:41:53,470 you can see all those sheets, but then it 796 00:41:53,470 --> 00:41:56,590 does this little sort of curtsy to the audience, 797 00:41:56,590 --> 00:41:59,200 and you can look down into the barrel. 798 00:41:59,200 --> 00:42:01,560 And then in some cases, proteins may 799 00:42:01,560 --> 00:42:05,020 be a mixture of secondary structure elements. 800 00:42:05,020 --> 00:42:06,580 Here it's a little hard to tell. 801 00:42:06,580 --> 00:42:08,800 This is triose phosphate isomerase, 802 00:42:08,800 --> 00:42:11,630 but if you look down it, you can see the helices, 803 00:42:11,630 --> 00:42:15,890 and there's also a group of beta strands that are held together. 804 00:42:15,890 --> 00:42:19,390 So in that protein, it's a mixture of alpha helix and beta 805 00:42:19,390 --> 00:42:20,130 sheet. 806 00:42:20,130 --> 00:42:23,890 Now, I'm not going to tell you much about pulling up 807 00:42:23,890 --> 00:42:26,380 Protein Data Bank files right now because I 808 00:42:26,380 --> 00:42:27,955 want to cover the next topic. 809 00:42:27,955 --> 00:42:29,830 And then when we have a few minutes later on, 810 00:42:29,830 --> 00:42:30,850 I'll show you.
811 00:42:30,850 --> 00:42:33,250 But wherever I show you a structure, 812 00:42:33,250 --> 00:42:37,190 I'm trying to show you the Protein Data Bank code, 813 00:42:37,190 --> 00:42:39,220 and in the web site, you can see there 814 00:42:39,220 --> 00:42:42,250 is a free download of PyMOL, which 815 00:42:42,250 --> 00:42:45,490 is the program I used to create all these structures 816 00:42:45,490 --> 00:42:47,740 and movies, so you can really look at things. 817 00:42:47,740 --> 00:42:49,677 And believe me, it took me about three years 818 00:42:49,677 --> 00:42:51,010 to learn how to use it properly. 819 00:42:51,010 --> 00:42:54,530 It'll probably take you about a week or maybe a couple of days. 820 00:42:54,530 --> 00:42:57,610 So if I can learn it, you can certainly learn it. 821 00:42:57,610 --> 00:43:01,460 Now, there is one final element of protein structure 822 00:43:01,460 --> 00:43:03,070 that people get kind of hung up on, 823 00:43:03,070 --> 00:43:05,760 and it's what's called quaternary structure. 824 00:43:05,760 --> 00:43:09,250 It's like, aren't we done yet? 825 00:43:09,250 --> 00:43:13,300 So in addition to all of these, let's say 826 00:43:13,300 --> 00:43:19,320 I have a folded motif, and there's its structure. 827 00:43:19,320 --> 00:43:24,480 That would have primary, secondary, between the strands 828 00:43:24,480 --> 00:43:27,840 or the helix, and tertiary structure, right? 829 00:43:27,840 --> 00:43:34,840 But in some cases, proteins fold up to quaternary structure, 830 00:43:34,840 --> 00:43:43,340 where it's multiple of these units joined together-- 831 00:43:43,340 --> 00:43:47,860 hoo, I could have picked a simpler fold, 832 00:43:47,860 --> 00:43:50,740 but that will get you the general gist of it-- 833 00:43:50,740 --> 00:43:54,190 all right, where these are actually associated 834 00:43:54,190 --> 00:43:56,200 by non-covalent forces. 835 00:43:56,200 --> 00:43:59,110 So there's more than one polypeptide chain.
836 00:43:59,110 --> 00:44:03,550 In fact, here would be four peptide chains coming together 837 00:44:03,550 --> 00:44:05,290 in a higher-order structure that's 838 00:44:05,290 --> 00:44:07,540 made up of four of those units. 839 00:44:07,540 --> 00:44:11,170 The prototypic example of this is the protein 840 00:44:11,170 --> 00:44:13,720 that carries oxygen around in your blood, which 841 00:44:13,720 --> 00:44:19,150 is hemoglobin, and it has four primary sequences 842 00:44:19,150 --> 00:44:24,090 that have come together in a tetrameric quaternary 843 00:44:24,090 --> 00:44:25,360 structure. 844 00:44:25,360 --> 00:44:27,230 Hemoglobin is kind of interesting, 845 00:44:27,230 --> 00:44:31,330 because it's made up of two alpha and two beta subunits. 846 00:44:31,330 --> 00:44:34,330 If all these subunits were identical, 847 00:44:34,330 --> 00:44:37,210 they would be called homooligomers, 848 00:44:37,210 --> 00:44:38,830 all the same pieces. 849 00:44:38,830 --> 00:44:42,230 If they are different, they are called heterooligomers. 850 00:44:42,230 --> 00:44:44,080 We'll see a little bit more about this 851 00:44:44,080 --> 00:44:46,900 when I talk about hemoglobin in the next class, 852 00:44:46,900 --> 00:44:50,260 because the features of the quaternary structure 853 00:44:50,260 --> 00:44:53,890 are very, very important for the proper transport of oxygen, 854 00:44:53,890 --> 00:44:56,800 and single mutations can really mess things up, 855 00:44:56,800 --> 00:45:01,010 and you'll see more about that in the next class. 856 00:45:01,010 --> 00:45:04,030 So just wrap that little bit up, proteins 857 00:45:04,030 --> 00:45:07,460 are condensation polymers of amino acids. 858 00:45:07,460 --> 00:45:12,370 Each protein sequence is defined by covalent bonding. 859 00:45:12,370 --> 00:45:13,660 Native proteins.
860 00:45:13,660 --> 00:45:17,460 Most of them that do not quite have quaternary structure 861 00:45:17,460 --> 00:45:21,370 are folded through secondary and tertiary interactions, 862 00:45:21,370 --> 00:45:23,630 these things that we already talked about, 863 00:45:23,630 --> 00:45:27,790 and folding is defined by how to maximize 864 00:45:27,790 --> 00:45:29,860 all those non-covalent forces to get 865 00:45:29,860 --> 00:45:33,100 the maximum thermodynamic stability 866 00:45:33,100 --> 00:45:35,230 with the maximum number of interactions. 867 00:45:35,230 --> 00:45:37,930 And subunits may also come together 868 00:45:37,930 --> 00:45:40,830 through quaternary structure. 869 00:45:40,830 --> 00:45:44,550 OK, so I'm going to talk to you about several proteins 870 00:45:44,550 --> 00:45:46,770 throughout the course, but for now, I 871 00:45:46,770 --> 00:45:49,890 want to focus you in on a structural protein that 872 00:45:49,890 --> 00:45:53,190 provides mechanical support for tissues. 873 00:45:53,190 --> 00:45:57,240 In the next class, we'll talk about transporters and enzymes, 874 00:45:57,240 --> 00:45:59,340 and as we move on to signaling, things 875 00:45:59,340 --> 00:46:02,680 like receptors and membrane proteins and so on. 876 00:46:02,680 --> 00:46:05,970 So the protein I'm going to describe to you is collagen. 877 00:46:05,970 --> 00:46:09,550 It is the most abundant protein in the human body. 878 00:46:09,550 --> 00:46:10,920 It plays enormous roles. 879 00:46:10,920 --> 00:46:13,200 It's not an enzyme, it's not a catalyst, 880 00:46:13,200 --> 00:46:14,550 it's not a transporter. 881 00:46:14,550 --> 00:46:17,370 It is one of those structural proteins, where 882 00:46:17,370 --> 00:46:20,010 the structure of collagen has evolved 883 00:46:20,010 --> 00:46:23,220 to provide a mechanical stability to lots 884 00:46:23,220 --> 00:46:26,640 of essential components of complex organisms.
885 00:46:26,640 --> 00:46:28,200 And there are many different types 886 00:46:28,200 --> 00:46:32,410 of collagens that are found in different parts of the body. 887 00:46:32,410 --> 00:46:35,940 For example, bone, tendon, cartilage, and so on. 888 00:46:35,940 --> 00:46:38,560 They are all collagen structures, 889 00:46:38,560 --> 00:46:40,050 but they have subtle differences, 890 00:46:40,050 --> 00:46:42,390 maybe some have different, slightly different, 891 00:46:42,390 --> 00:46:45,540 mechanical properties to adapt to the functions 892 00:46:45,540 --> 00:46:48,860 that they perform, OK? 893 00:46:48,860 --> 00:46:51,540 And what I'm going to show you is that a single amino acid 894 00:46:51,540 --> 00:46:55,020 change in the primary sequence of collagen 895 00:46:55,020 --> 00:46:59,380 can destabilize the structure, so it is no longer viable. 896 00:46:59,380 --> 00:47:01,440 And the disease type I'm going to talk to you 897 00:47:01,440 --> 00:47:06,980 about is a set of diseases known as collagenopathies, 898 00:47:06,980 --> 00:47:10,440 and the particular one is called osteogenesis imperfecta. 899 00:47:10,440 --> 00:47:13,980 Osteo always refers to bone because collagen 900 00:47:13,980 --> 00:47:17,490 plays a critical role in the structure of bone. 901 00:47:17,490 --> 00:47:22,410 Bone isn't just bone, it's collagen involved in it. 902 00:47:22,410 --> 00:47:25,940 And this disease is also called brittle bone syndrome. 903 00:47:25,940 --> 00:47:29,880 And here's the X-ray of a baby born with brittle bone 904 00:47:29,880 --> 00:47:33,480 syndrome, and you'll see that the long bones in the upper arm 905 00:47:33,480 --> 00:47:36,750 are all irregular because the bones are brittle, 906 00:47:36,750 --> 00:47:40,580 and they'll break even in utero.
907 00:47:40,580 --> 00:47:42,980 A lot of babies with this defect can't even 908 00:47:42,980 --> 00:47:44,730 be born through the birth canal because it 909 00:47:44,730 --> 00:47:47,700 would crush the bones, and many of them 910 00:47:47,700 --> 00:47:49,680 don't survive very long at all. 911 00:47:49,680 --> 00:47:52,410 Some survive with different kinds of cases, 912 00:47:52,410 --> 00:47:54,120 but their lives are greatly impacted, 913 00:47:54,120 --> 00:47:56,400 and they could just sort of hit a table 914 00:47:56,400 --> 00:47:58,590 and the bones would break, all right? 915 00:47:58,590 --> 00:48:01,020 There are those sort of serious situations 916 00:48:01,020 --> 00:48:05,590 where parents are actually accused of abuse to the child, 917 00:48:05,590 --> 00:48:08,490 but the child actually had brittle bone syndrome, 918 00:48:08,490 --> 00:48:11,460 and it was just through helping them put their clothes on 919 00:48:11,460 --> 00:48:15,600 or taking them upstairs, the bones got broken very readily. 920 00:48:15,600 --> 00:48:18,300 So osteogenesis imperfecta really 921 00:48:18,300 --> 00:48:21,510 describes a collection of these defects. 922 00:48:21,510 --> 00:48:26,640 Now the collagen tertiary structure is shown here. 923 00:48:26,640 --> 00:48:29,290 It's actually made up of a type of helix. 924 00:48:29,290 --> 00:48:31,060 It's not an alpha helix. 925 00:48:31,060 --> 00:48:34,740 It's a polyproline helix, where the individual subunits 926 00:48:34,740 --> 00:48:37,830 in that tertiary structure are fairly long 927 00:48:37,830 --> 00:48:41,070 and extended, and I show you three strands 928 00:48:41,070 --> 00:48:45,420 in this polymeric structure, a yellow, a red, and a green.
929 00:48:45,420 --> 00:48:49,290 And these roll together into a three-helix bundle that 930 00:48:49,290 --> 00:48:52,680 has a fibrillous structure, and then all these structures 931 00:48:52,680 --> 00:48:56,310 come together to make the macromolecular structure that 932 00:48:56,310 --> 00:48:57,260 is collagen. 933 00:48:57,260 --> 00:48:59,400 It's not just one of those fibrils. 934 00:48:59,400 --> 00:49:04,740 It's bundles of those fibrils in a very organized pattern where 935 00:49:04,740 --> 00:49:07,770 you could even see that patterning in electron 936 00:49:07,770 --> 00:49:09,030 microscopy. 937 00:49:09,030 --> 00:49:11,660 And there are many genetic defects of collagen, 938 00:49:11,660 --> 00:49:13,710 and what's so important to think about 939 00:49:13,710 --> 00:49:16,620 is if you have a defect in one strand, 940 00:49:16,620 --> 00:49:20,790 that defect will propagate through every single strand. 941 00:49:20,790 --> 00:49:26,130 If this is one strand made up of three polypeptide chains, 942 00:49:26,130 --> 00:49:29,070 it propagates all the way through the structure. 943 00:49:29,070 --> 00:49:32,700 And I believe I have a little time to just show you, 944 00:49:32,700 --> 00:49:34,980 here's the collagen structure. 945 00:49:34,980 --> 00:49:37,050 I'm just showing you how it's extended. 946 00:49:37,050 --> 00:49:39,480 Those are three independent strands, 947 00:49:39,480 --> 00:49:42,720 and there's a set of magenta residues in the middle, which 948 00:49:42,720 --> 00:49:45,210 come from a defect in the sequence 949 00:49:45,210 --> 00:49:49,800 where a glycine has been changed to an alanine. 950 00:49:49,800 --> 00:49:52,260 So I'm going to show you this movie because it shows 951 00:49:52,260 --> 00:49:54,990 you right at the center of the structure, there 952 00:49:54,990 --> 00:49:57,420 are residues painted in pink. 953 00:49:57,420 --> 00:50:01,260 And what I'm going to do is show you a close-up of that segment.
954 00:50:01,260 --> 00:50:03,270 If you look at those cells they're 955 00:50:03,270 --> 00:50:06,920 all nicely organized, except where that defect is, 956 00:50:06,920 --> 00:50:09,930 and that defect is caused by the change of a hydrogen 957 00:50:09,930 --> 00:50:14,640 to a methyl group on three residues that come together, 958 00:50:14,640 --> 00:50:18,000 and that bulges out that fibrillous structure 959 00:50:18,000 --> 00:50:20,490 and makes it not as compact and beautiful 960 00:50:20,490 --> 00:50:24,113 as it should be in the version that's got the glycine there. 961 00:50:24,113 --> 00:50:25,530 So if you look at it, you can even 962 00:50:25,530 --> 00:50:29,250 see that helix gets bulged out and it's not 963 00:50:29,250 --> 00:50:32,970 as well-aligned as the rest of the structure. 964 00:50:32,970 --> 00:50:35,430 And then that defect gets propagated 965 00:50:35,430 --> 00:50:39,870 into all the fibrils and results in the weakening of the bones. 966 00:50:39,870 --> 00:50:42,720 Either the collagen fails to form properly, 967 00:50:42,720 --> 00:50:44,550 or the collagen, when it forms, it 968 00:50:44,550 --> 00:50:47,080 has much less mechanical stability. 969 00:50:47,080 --> 00:50:48,830 So I think that's a good place to stop 970 00:50:48,830 --> 00:50:53,520 and I'll pick up next time with hemoglobin. 971 00:50:53,520 --> 00:50:57,190 Oh, one last little thing, a couple of things for you to do. 972 00:50:57,190 --> 00:51:00,860 There's a great link on the website to the Protein Data 973 00:51:00,860 --> 00:51:03,490 Bank to see how enzymes work. 974 00:51:03,490 --> 00:51:05,280 And if you have a little time, it 975 00:51:05,280 --> 00:51:07,620 would be awesome if you could just 976 00:51:07,620 --> 00:51:10,590 take a quick flick through those parts of the text. 
977 00:51:10,590 --> 00:51:13,680 These slides are posted with these reading assignments, 978 00:51:13,680 --> 00:51:17,180 and they're posted in color if you want to look at them again.
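As a recap of the collagen defect described in the lecture, here is a minimal Python sketch of a glycine-to-alanine substitution applied to each of the three strands of a triple helix. The strand sequence, the position of the change, and the names `SIDE_CHAIN`, `substitute`, and `strand` are all hypothetical and chosen only for illustration; this is not a real collagen sequence and it does not model the structure itself.

```python
# Side chains (R groups) of the two residues named in the lecture.
# One-letter codes: G = glycine, A = alanine, P = proline.
SIDE_CHAIN = {"G": "H", "A": "CH3"}  # glycine: hydrogen; alanine: methyl


def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Return a copy of `sequence` with the residue at 0-based `position` replaced."""
    if not 0 <= position < len(sequence):
        raise IndexError("position is outside the sequence")
    return sequence[:position] + new_residue + sequence[position + 1:]


# Hypothetical toy strand (an assumption for illustration only).
strand = "GPPGPPGPPGPP"
mutation_site = 6  # 0-based index of a glycine in the toy strand

# Three strands come together, so the same change appears once per strand.
wild_type_helix = [strand] * 3
mutant_helix = [substitute(s, mutation_site, "A") for s in wild_type_helix]

for number, (wt, mut) in enumerate(zip(wild_type_helix, mutant_helix), start=1):
    old, new = wt[mutation_site], mut[mutation_site]
    print(f"strand {number}: {old} ({SIDE_CHAIN[old]}) -> {new} ({SIDE_CHAIN[new]})")
```

The point of the sketch is only the bookkeeping: a single-letter change in the primary sequence swaps a hydrogen side chain for a methyl group, and because three strands pack together, three methyl groups end up at the same position in the bundle.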