1 00:00:00,500 --> 00:00:02,830 The following content is provided under a Creative 2 00:00:02,830 --> 00:00:04,370 Commons license. 3 00:00:04,370 --> 00:00:06,670 Your support will help MIT OpenCourseWare 4 00:00:06,670 --> 00:00:11,030 continue to offer high-quality educational resources for free. 5 00:00:11,030 --> 00:00:13,660 To make a donation or view additional materials 6 00:00:13,660 --> 00:00:17,610 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,610 --> 00:00:18,520 at ocw.mit.edu. 8 00:00:25,693 --> 00:00:27,110 ELIZABETH NOLAN: Last time we were 9 00:00:27,110 --> 00:00:33,500 working on this PKS assembly line that makes a macrolide. 10 00:00:33,500 --> 00:00:37,580 And just as a review of that, we left off 11 00:00:37,580 --> 00:00:42,230 having gone over the domains and module architecture 12 00:00:42,230 --> 00:00:44,310 for this assembly line. 13 00:00:44,310 --> 00:00:48,890 So recall, each module activates a given monomer. 14 00:00:48,890 --> 00:00:51,740 And we can use depictions like this 15 00:00:51,740 --> 00:00:57,610 to show how the PKS builds a growing polyketide chain. 16 00:00:57,610 --> 00:01:00,740 OK, and as you saw in recitation last week, 17 00:01:00,740 --> 00:01:02,900 the actual structure of one of these synthases 18 00:01:02,900 --> 00:01:05,180 is very different than what's depicted 19 00:01:05,180 --> 00:01:08,510 by this left-to-right kind of assembly-line depiction there. 20 00:01:08,510 --> 00:01:10,910 So you saw some amazing conformational changes 21 00:01:10,910 --> 00:01:13,040 of the fatty acid synthase. 22 00:01:13,040 --> 00:01:15,410 And they're all different, but just keep that in mind 23 00:01:15,410 --> 00:01:16,550 when thinking about these. 24 00:01:16,550 --> 00:01:19,760 So this sort of notation is very helpful for us 25 00:01:19,760 --> 00:01:22,100 in terms of thinking about how the biosynthesis goes, 26 00:01:22,100 --> 00:01:26,480 but it's not an accurate representation of structure. 
27 00:01:26,480 --> 00:01:29,210 OK, so where we left off was with looking 28 00:01:29,210 --> 00:01:34,370 at how these optional domains can do chemistry 29 00:01:34,370 --> 00:01:38,300 on the upstream monomer. 30 00:01:38,300 --> 00:01:42,620 And the last thing we're going to do related to this assembly 31 00:01:42,620 --> 00:01:46,850 line is, one, ask how is the polyketide released 32 00:01:46,850 --> 00:01:49,880 from the assembly line when the biosynthesis is over. 33 00:01:49,880 --> 00:01:52,850 And then we'll just do one exercise 34 00:01:52,850 --> 00:01:55,290 looking at the macrolide and working backwards. 35 00:01:55,290 --> 00:01:58,790 So last time, we were looking at the domain organization 36 00:01:58,790 --> 00:02:01,160 to determine what sort of chemistry 37 00:02:01,160 --> 00:02:02,900 happens to a given monomer. 38 00:02:02,900 --> 00:02:04,610 We can do, effectively, the opposite, 39 00:02:04,610 --> 00:02:07,550 looking at a natural product and identify 40 00:02:07,550 --> 00:02:10,370 what those monomers and properties of the assembly line 41 00:02:10,370 --> 00:02:11,450 are. 42 00:02:11,450 --> 00:02:16,610 OK, so in terms of chain release, 43 00:02:16,610 --> 00:02:18,335 there are thioesterase domains. 44 00:02:28,090 --> 00:02:32,480 And these domains are involved in chain release 45 00:02:32,480 --> 00:02:33,820 from the assembly line. 46 00:02:41,730 --> 00:02:44,660 So if you take a look in the final module 47 00:02:44,660 --> 00:02:49,790 here, what we see at the end is a TE for the thioesterase. 48 00:02:49,790 --> 00:02:56,210 OK, and so what happens in the case of DEBS is, 49 00:02:56,210 --> 00:03:00,830 ultimately, the chain gets transferred to a serine residue 50 00:03:00,830 --> 00:03:01,970 on the TE domain. 51 00:03:06,640 --> 00:03:11,520 OK, and I'm just going to draw the polyketide like that. 
52 00:03:11,520 --> 00:03:15,690 And then in this case here, we remember 53 00:03:15,690 --> 00:03:23,790 we have the propionyl-CoA from the loading module, so 54 00:03:23,790 --> 00:03:25,500 the starter unit. 55 00:03:25,500 --> 00:03:29,310 In this case, what happens is there is a macrocyclization. 56 00:03:29,310 --> 00:03:31,770 So we can imagine deprotonation-- 57 00:03:31,770 --> 00:03:35,790 oh, excuse me, I forgot the linkage here. 58 00:03:35,790 --> 00:03:41,130 So for this TE, we no longer have the growing chain 59 00:03:41,130 --> 00:03:43,560 tethered by a Ppant arm. 60 00:03:43,560 --> 00:03:47,670 With the TE domain, it's tethered to the serine residue. 61 00:03:47,670 --> 00:03:51,900 So it's transferred from the thioester to this serine. 62 00:03:51,900 --> 00:03:54,405 And this here is just the polyketide in between. 63 00:03:57,800 --> 00:04:00,150 I'm just abbreviating it. 64 00:04:00,150 --> 00:04:01,560 So we can have that. 65 00:04:01,560 --> 00:04:04,520 We can have attack and loss. 66 00:04:04,520 --> 00:04:09,570 OK, so in this case what we end up with 67 00:04:09,570 --> 00:04:16,410 is the TE domain plus a macrocycle. 68 00:04:21,459 --> 00:04:27,190 And so that's how we end up with the structure as shown here. 69 00:04:27,190 --> 00:04:32,620 So some TE domains will result in formation of a macrocycle. 70 00:04:32,620 --> 00:04:35,890 Some TE domains will catalyze a hydrolytic release. 71 00:04:35,890 --> 00:04:37,450 And you get the linear chain. 72 00:04:37,450 --> 00:04:39,790 So you need to look at the natural product structure. 73 00:04:39,790 --> 00:04:42,760 And based on that structure, you can make an assessment 74 00:04:42,760 --> 00:04:46,150 as to how the TE works. 75 00:04:46,150 --> 00:04:49,450 And so that's also shown in this depiction and one 76 00:04:49,450 --> 00:04:50,750 other depiction in the notes. 77 00:04:50,750 --> 00:04:53,980 So here, the entire chain is drawn. 
78 00:04:53,980 --> 00:04:56,020 And we're seeing deprotonation here 79 00:04:56,020 --> 00:04:58,690 and then attack here to give the macrocycle. 80 00:05:04,360 --> 00:05:13,270 So here is the product of this DEBS. 81 00:05:13,270 --> 00:05:17,680 And what we're going to do as a last exercise with the PKS 82 00:05:17,680 --> 00:05:21,190 is just look at this structure and work through identifying 83 00:05:21,190 --> 00:05:23,740 the monomer units and what optional domains 84 00:05:23,740 --> 00:05:27,430 acted on each monomer. 85 00:05:27,430 --> 00:05:31,960 And basically, where can we start? 86 00:05:31,960 --> 00:05:37,210 So if the thioesterase catalyzes a macrocyclization, 87 00:05:37,210 --> 00:05:42,730 that's an easy starting point, because basically, 88 00:05:42,730 --> 00:05:47,530 the final monomer needs to be involved there. 89 00:05:47,530 --> 00:05:50,200 And we know that the only place we 90 00:05:50,200 --> 00:05:52,390 can get a structure like this is from the starter, 91 00:05:52,390 --> 00:05:54,640 from that propionyl-CoA. 92 00:05:54,640 --> 00:05:59,320 So here, if we just look, we have the monomer 93 00:05:59,320 --> 00:06:02,440 from module 0, the loading. 94 00:06:02,440 --> 00:06:04,870 And then as we learned last time, 95 00:06:04,870 --> 00:06:08,770 each additional unit that gets attached to the growing 96 00:06:08,770 --> 00:06:12,100 polyketide gives two carbons, so two carbon units 97 00:06:12,100 --> 00:06:13,450 to the growing chain. 98 00:06:13,450 --> 00:06:18,300 So we can work our way around by two carbons, 1, 2. 99 00:06:18,300 --> 00:06:25,060 OK, here we have module 1, another two carbons, 100 00:06:25,060 --> 00:06:51,060 module 2 here, module 3, module 4, module 5, and here, 101 00:06:51,060 --> 00:06:51,790 module 6. 
102 00:06:54,860 --> 00:06:57,110 So looking at a structure, you can 103 00:06:57,110 --> 00:06:59,900 begin to dissect what the assembly line will 104 00:06:59,900 --> 00:07:02,810 look like in terms of the number of modules 105 00:07:02,810 --> 00:07:06,290 by counting C2 units to the growing chain here. 106 00:07:06,290 --> 00:07:08,450 And then the other thing we can do 107 00:07:08,450 --> 00:07:11,450 is look at the functional group status 108 00:07:11,450 --> 00:07:14,030 and ask what types of optional domains 109 00:07:14,030 --> 00:07:18,590 needed to be there in order to give a given functional group. 110 00:07:18,590 --> 00:07:22,430 So for instance, in this case, in module 1 here, 111 00:07:22,430 --> 00:07:23,910 we're seeing an OH group. 112 00:07:23,910 --> 00:07:27,770 So we know there had to be the action of a keto reductase 113 00:07:27,770 --> 00:07:30,200 to reduce the ketone. 114 00:07:30,200 --> 00:07:33,410 Here we see we have this carbonyl, 115 00:07:33,410 --> 00:07:35,960 so there was no optional domain. 116 00:07:35,960 --> 00:07:37,640 In this case, what happens? 117 00:07:37,640 --> 00:07:38,780 We have a methylene. 118 00:07:38,780 --> 00:07:42,480 So that ketone we started with was fully reduced. 119 00:07:42,480 --> 00:07:45,270 So in this case, we have the keto reductase, 120 00:07:45,270 --> 00:07:51,050 the dehydratase, and the enoyl reductase. 121 00:07:51,050 --> 00:07:52,760 Again, here we can look at this unit. 122 00:07:52,760 --> 00:07:56,140 We see an OH, which tells us that there 123 00:07:56,140 --> 00:07:58,370 was action of a keto reductase. 124 00:07:58,370 --> 00:08:04,760 And here we have another OH, so we have a keto reductase. 125 00:08:04,760 --> 00:08:11,850 And in this case, we have none, this final one. 126 00:08:11,850 --> 00:08:14,540 And here, I didn't write it, but none 127 00:08:14,540 --> 00:08:17,840 in terms of optional domains. 128 00:08:17,840 --> 00:08:19,730 So this can be pretty fun. 
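The working-backwards reasoning above (an OH implies a ketoreductase, an untouched carbonyl implies no optional domain, a methylene implies the full KR, DH, ER set) can be sketched as a small lookup. This is only an illustration: the names `OPTIONAL_DOMAINS` and `infer_optional_domains` are made up here, and the three-unit example chain is hypothetical rather than the actual DEB assignments.

```python
# Hypothetical lookup: functional group left on a monomer unit -> optional
# domains that must have acted on it, following the reasoning in the lecture.
OPTIONAL_DOMAINS = {
    "ketone": [],                     # carbonyl untouched: no optional domain
    "hydroxyl": ["KR"],               # ketoreductase reduced the ketone to an OH
    "methylene": ["KR", "DH", "ER"],  # fully reduced: KR, then DH, then ER
}


def infer_optional_domains(groups):
    """Map each extender unit (in module order) to its implied optional domains."""
    return {
        f"module {i}": OPTIONAL_DOMAINS[group]
        for i, group in enumerate(groups, start=1)
    }


# Made-up three-unit chain, used only to show the bookkeeping.
example = infer_optional_domains(["hydroxyl", "ketone", "methylene"])
for module, domains in example.items():
    print(module, domains or "none")
```

The number of entries passed in plays the role of counting C2 units: one entry per extender module.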
129 00:08:19,730 --> 00:08:21,320 This is a pretty simple structure, 130 00:08:21,320 --> 00:08:23,040 but as structures get more complex, 131 00:08:23,040 --> 00:08:26,030 you can map out what are the optional domains there. 132 00:08:26,030 --> 00:08:28,970 And maybe you'll see, some other unusual structural features 133 00:08:28,970 --> 00:08:33,020 will indicate there is other optional domains 134 00:08:33,020 --> 00:08:34,471 beyond these three. 135 00:08:34,471 --> 00:08:35,929 And we're going to see some of that 136 00:08:35,929 --> 00:08:37,935 as we move into the non-ribosomal peptides. 137 00:08:40,710 --> 00:08:45,070 So that's given in the notes if you want to practice on that. 138 00:08:45,070 --> 00:08:53,510 So with this, we're going to transition into an NRPS 139 00:08:53,510 --> 00:08:55,700 and look at the assembly line logic 140 00:08:55,700 --> 00:08:58,470 for non-ribosomal peptides. 141 00:08:58,470 --> 00:09:03,650 And so this is a slide from last time, where we considered 142 00:09:03,650 --> 00:09:08,330 the starter units and extender units for fatty acids 143 00:09:08,330 --> 00:09:10,520 and for polyketides. 144 00:09:10,520 --> 00:09:13,100 And so in non-ribosomal peptides, 145 00:09:13,100 --> 00:09:17,510 we also have starter units and extender units, 146 00:09:17,510 --> 00:09:20,150 but in the case of the non-ribosomal peptide, 147 00:09:20,150 --> 00:09:21,640 as the name indicates, we're going 148 00:09:21,640 --> 00:09:25,220 to be thinking about amino acid monomers. 149 00:09:25,220 --> 00:09:28,250 And we're also going to be considering examples where 150 00:09:28,250 --> 00:09:30,380 there is aryl acid monomers. 151 00:09:30,380 --> 00:09:34,310 So these NRPS assembly lines will 152 00:09:34,310 --> 00:09:38,420 form polymers that incorporate amino acid and aryl acid 153 00:09:38,420 --> 00:09:41,090 monomers. 
154 00:09:41,090 --> 00:09:43,910 And this is another slide from last time 155 00:09:43,910 --> 00:09:48,110 that is just summarizing the core domains and then 156 00:09:48,110 --> 00:09:53,300 examples of optional domains for the PKS and NRPS. 157 00:09:53,300 --> 00:09:58,520 So we learned last time that for PKS, every module 158 00:09:58,520 --> 00:10:03,170 will have a KS and a T domain, with the exception 159 00:10:03,170 --> 00:10:04,820 of the loading or starter module. 160 00:10:04,820 --> 00:10:06,590 That has no keto synthase, because there 161 00:10:06,590 --> 00:10:09,400 is no upstream group here. 162 00:10:09,400 --> 00:10:14,660 For NRPS, the core of a module is CAT trio. 163 00:10:14,660 --> 00:10:17,450 So condensation domain, or C domain, 164 00:10:17,450 --> 00:10:20,210 this is the domain that's going to catalyze peptide bond 165 00:10:20,210 --> 00:10:23,180 formation between two of the monomers. 166 00:10:23,180 --> 00:10:25,130 We have an adenylation domain. 167 00:10:25,130 --> 00:10:26,570 And we'll see this does chemistry 168 00:10:26,570 --> 00:10:30,110 similar to the aminoacyl tRNA synthetases. 169 00:10:30,110 --> 00:10:33,140 And then we have the T domains that 170 00:10:33,140 --> 00:10:37,320 are carrier proteins for the monomers and growing chain. 171 00:10:37,320 --> 00:10:40,430 OK, and then within a given NRPS module, 172 00:10:40,430 --> 00:10:42,590 there can also be optional domains. 173 00:10:42,590 --> 00:10:44,720 And just two examples are shown here. 174 00:10:44,720 --> 00:10:48,110 So maybe there is an epimerization of an amino acid. 175 00:10:48,110 --> 00:10:50,240 Maybe there is a methyl group and there 176 00:10:50,240 --> 00:10:52,660 needs to be methyltransferase to put that on. 177 00:10:52,660 --> 00:10:55,130 There is a lot of diversity that comes 178 00:10:55,130 --> 00:10:57,260 into these structures on the basis 179 00:10:57,260 --> 00:10:59,540 of these optional domains. 
180 00:10:59,540 --> 00:11:01,640 And just to highlight that, I've presented 181 00:11:01,640 --> 00:11:05,840 here a list of possible optional domains 182 00:11:05,840 --> 00:11:08,495 you can find in NRPS, or for that matter, 183 00:11:08,495 --> 00:11:12,380 a PKS here, so all sorts of things. 184 00:11:12,380 --> 00:11:16,670 Look at halogenase, cyclase, reductase. 185 00:11:16,670 --> 00:11:21,080 There is tremendous structural diversity that can occur. 186 00:11:21,080 --> 00:11:29,030 OK, so if we consider the NRPS assembly line 187 00:11:29,030 --> 00:11:31,130 structure and notation similar to what 188 00:11:31,130 --> 00:11:36,170 we did with the polyketide synthases, what do we see? 189 00:11:49,840 --> 00:11:52,420 So I'll just draw one with two modules 190 00:11:52,420 --> 00:11:54,850 here, although n can indicate more. 191 00:11:54,850 --> 00:11:59,050 So initially what we have here is a starting or loading 192 00:11:59,050 --> 00:12:08,280 module, OK, so for instance, module 0. 193 00:12:11,520 --> 00:12:20,010 OK, here we have module 1, 2 for extenders. 194 00:12:24,130 --> 00:12:27,775 And here we have a thioesterase for chain release. 195 00:12:31,390 --> 00:12:34,000 So we'll find that in the final module 196 00:12:34,000 --> 00:12:39,190 like what we saw with the polyketide synthase for DEB. 197 00:12:39,190 --> 00:12:45,190 So this whole thing can be called an NRPS here. 198 00:12:47,890 --> 00:12:51,100 And what happens in terms of the action 199 00:12:51,100 --> 00:12:53,920 of these different core domains-- 200 00:12:53,920 --> 00:12:57,410 so A, we have adenylation. 201 00:13:06,520 --> 00:13:11,110 OK, and what these domains do are select 202 00:13:11,110 --> 00:13:22,160 and activate the amino acid or aryl acid monomers. 203 00:13:28,920 --> 00:13:31,430 OK and after these monomers are activated, 204 00:13:31,430 --> 00:13:34,580 the A domain also transfers them to the T domain. 
205 00:13:37,330 --> 00:13:39,660 And we'll go over the chemistry in a minute. 206 00:13:45,960 --> 00:13:50,230 OK, this T domain is like what we saw with the PKS. 207 00:13:50,230 --> 00:14:04,550 We can call it a thiolation domain or a peptidyl carrier 208 00:14:04,550 --> 00:14:05,050 protein. 209 00:14:09,390 --> 00:14:15,840 So these T domains are going to be modified with the Ppant arm, 210 00:14:15,840 --> 00:14:17,845 like what we saw for PKS. 211 00:14:21,420 --> 00:14:25,860 We have the C domain, condensation. 212 00:14:35,030 --> 00:14:38,750 And so this domain catalyzes peptide bond formation. 213 00:14:48,890 --> 00:14:53,070 And I'll just point out here that, in contrast to the keto 214 00:14:53,070 --> 00:14:55,170 synthase we saw in PKS-- 215 00:14:55,170 --> 00:14:58,130 so we saw the keto synthase doing covalent catalysis 216 00:14:58,130 --> 00:15:00,350 via its cysteine residue-- 217 00:15:00,350 --> 00:15:03,560 the condensation domains of NRPS are involved 218 00:15:03,560 --> 00:15:05,230 in non-covalent catalysis. 219 00:15:08,720 --> 00:15:10,830 So that's just an important distinction. 220 00:15:14,170 --> 00:15:20,500 The growing chain does not get attached to the C domain here. 221 00:15:20,500 --> 00:15:23,640 And then we have the TE, so thioesterase, 222 00:15:23,640 --> 00:15:25,735 as we saw, for chain release. 223 00:15:29,130 --> 00:15:35,542 And this can be hydrolytic or macrocyclization. 224 00:15:45,540 --> 00:15:50,090 OK, so let's consider just the example 225 00:15:50,090 --> 00:15:55,760 of an NRPS that is responsible for synthesizing a tripeptide. 226 00:15:55,760 --> 00:15:57,145 So what is the net reaction? 227 00:16:13,670 --> 00:16:18,680 So imagine that we have three amino acid monomers. 
228 00:16:22,910 --> 00:16:26,090 And I'll just point out here too that beyond knowing 229 00:16:26,090 --> 00:16:30,590 an epimerization domain epimerizes an amino acid, 230 00:16:30,590 --> 00:16:33,920 you're not responsible for stereochemistry 231 00:16:33,920 --> 00:16:36,320 in terms of the various structures we'll 232 00:16:36,320 --> 00:16:37,640 look at going through here. 233 00:16:37,640 --> 00:16:40,430 So I'm just not drawing stereochemistry here. 234 00:16:51,000 --> 00:16:52,890 So we have three amino acid monomers. 235 00:17:00,690 --> 00:17:04,200 There is going to be some NRPS that's 236 00:17:04,200 --> 00:17:06,869 responsible for formation of the tripeptide. 237 00:17:06,869 --> 00:17:12,569 And what we'll see is that making a trimer 238 00:17:12,569 --> 00:17:17,130 requires three ATPs, so one ATP per amino acid 239 00:17:17,130 --> 00:17:48,070 or aryl acid monomer, giving us three AMP plus three PPi here 240 00:17:48,070 --> 00:17:55,960 to give us our tripeptide plus three water molecules here. 241 00:17:55,960 --> 00:17:59,420 OK, so how does this happen? 242 00:17:59,420 --> 00:18:02,080 How does the NRPS take these monomers 243 00:18:02,080 --> 00:18:04,960 and build, say, a tripeptide? 244 00:18:04,960 --> 00:18:12,640 We're going to look at the ACV synthetase as a model for this. 245 00:18:12,640 --> 00:18:19,900 And so the ACV tripeptide is important. 246 00:18:19,900 --> 00:18:24,340 It forms the backbone of antibiotics of the penicillin 247 00:18:24,340 --> 00:18:26,140 and cephalosporin classes. 248 00:18:26,140 --> 00:18:29,680 So many of these are used clinically. 249 00:18:29,680 --> 00:18:32,920 So here are the structures of penicillin N 250 00:18:32,920 --> 00:18:34,810 and the cephalosporin. 
251 00:18:34,810 --> 00:18:36,670 So at first inspection, you might not 252 00:18:36,670 --> 00:18:40,240 guess that these are effectively built from a tripeptide, 253 00:18:40,240 --> 00:18:42,820 but what happens is that a non-ribosomal peptide 254 00:18:42,820 --> 00:18:47,590 synthetase, the ACV synthetase, is responsible for forming 255 00:18:47,590 --> 00:18:50,710 two amide bonds between the three starters-- or the three 256 00:18:50,710 --> 00:18:51,730 monomers. 257 00:18:51,730 --> 00:18:53,680 And then there is additional enzymes 258 00:18:53,680 --> 00:18:58,990 that are responsible for modifying that peptide scaffold 259 00:18:58,990 --> 00:19:01,510 to give, say, this four-five fused ring 260 00:19:01,510 --> 00:19:06,190 system or this four-six fused ring system. 261 00:19:06,190 --> 00:19:10,180 OK, so what is the overall reaction of this? 262 00:19:10,180 --> 00:19:13,120 So similar to having these three amino acid 263 00:19:13,120 --> 00:19:17,770 monomers here, what we have are aminoadipate. 264 00:19:17,770 --> 00:19:21,240 We have L-cysteine and L-valine. 265 00:19:21,240 --> 00:19:24,400 And the synthetase takes these three monomers and makes 266 00:19:24,400 --> 00:19:26,900 this molecule here, which is called ACV. 267 00:19:29,430 --> 00:19:33,190 And so if we look at the synthetase in cartoon form, 268 00:19:33,190 --> 00:19:34,810 this is the cartoon. 269 00:19:34,810 --> 00:19:38,080 So we see a loading module, so just AT. 270 00:19:38,080 --> 00:19:41,140 Similar to the PKS, there is no catalytic domain 271 00:19:41,140 --> 00:19:43,750 to make a new bond in the loading module 272 00:19:43,750 --> 00:19:45,640 because there is nothing upstream. 273 00:19:45,640 --> 00:19:48,610 We see a module here, CAT. 274 00:19:48,610 --> 00:19:51,340 We have another CAT trio here. 275 00:19:51,340 --> 00:19:53,050 And then what's this? 276 00:19:53,050 --> 00:19:57,170 This is our first example of an optional domain within an NRPS. 
277 00:19:57,170 --> 00:20:00,160 OK so this E is for epimerization. 278 00:20:00,160 --> 00:20:03,940 I mean what we'll see is that the synthetase epimerizes 279 00:20:03,940 --> 00:20:07,900 L-valine to D-valine during the synthesis, and then 280 00:20:07,900 --> 00:20:09,670 the thioesterase. 281 00:20:09,670 --> 00:20:13,270 So similar to what we did with the PKS, for the NRPS, 282 00:20:13,270 --> 00:20:18,040 you can count T domains as a way to identify the modules 283 00:20:18,040 --> 00:20:20,815 and to figure out how many monomers are involved. 284 00:20:25,310 --> 00:20:29,390 I also just point out-- and this builds upon Colin's comment 285 00:20:29,390 --> 00:20:30,910 from last time-- 286 00:20:30,910 --> 00:20:34,280 is that this assembly line is responsible 287 00:20:34,280 --> 00:20:36,610 only for the synthesis of a tripeptide, 288 00:20:36,610 --> 00:20:38,300 but look at its size. 289 00:20:38,300 --> 00:20:41,270 It's greater than 450 kilodaltons. 290 00:20:41,270 --> 00:20:46,520 That's quite big-- so a large enzyme, 10 different domains, 291 00:20:46,520 --> 00:20:50,000 all just for synthesis of this one tripeptide here. 292 00:20:53,210 --> 00:20:54,350 So what happens? 293 00:21:03,580 --> 00:21:09,240 We're going to go over the action of the A domains 294 00:21:09,240 --> 00:21:13,170 and the T domains first. 295 00:21:13,170 --> 00:21:15,600 And then we'll look at a cartoon in the slides. 296 00:21:22,370 --> 00:21:25,330 So the first points to make are that we 297 00:21:25,330 --> 00:21:30,800 need to have loading of the assembly line. 298 00:21:30,800 --> 00:21:34,480 So amino acids need to be selected and activated. 299 00:21:37,380 --> 00:21:40,420 And that's where these A domains come in. 300 00:21:40,420 --> 00:21:41,710 So what's happening? 
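The "count the T domains" rule for the ACV synthetase can be checked with a short sketch. The domain order below is taken from the cartoon as described in the lecture (loading module A-T, two C-A-T modules, then E and TE); the names `ACVS_DOMAINS`, `count_modules`, and `split_modules` are illustrative, and splitting at each C domain is a simplifying assumption that fits this layout.

```python
# Domain order of the ACV synthetase as described in the lecture cartoon:
# loading module (A-T), two C-A-T modules, an E domain, then the TE.
ACVS_DOMAINS = ["A", "T", "C", "A", "T", "C", "A", "T", "E", "TE"]


def count_modules(domains):
    """Each module carries one T domain, so counting T domains counts modules."""
    return sum(1 for d in domains if d == "T")


def split_modules(domains):
    """Group domains into modules, starting a new module at each C domain."""
    modules = [[]]
    for d in domains:
        if d == "C":
            modules.append([])
        modules[-1].append(d)
    return modules


print(len(ACVS_DOMAINS))            # 10 domains in total
print(count_modules(ACVS_DOMAINS))  # 3 modules, one per monomer
print(split_modules(ACVS_DOMAINS))
```

Three T domains means three monomers, which matches the tripeptide product, and the domain list has the ten domains mentioned for this enzyme.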
301 00:21:45,570 --> 00:21:50,390 So we have some amino acid monomers-- 302 00:21:50,390 --> 00:21:54,650 so maybe it's the L-cysteine, for instance, or the L-valine-- 303 00:21:54,650 --> 00:21:57,710 plus ATP. 304 00:21:57,710 --> 00:22:00,650 The A domain does chemistry similar to what 305 00:22:00,650 --> 00:22:05,270 we saw with the aminoacyl tRNA synthetases 306 00:22:05,270 --> 00:22:08,108 to form an activated intermediate. 307 00:22:14,090 --> 00:22:17,150 So we get an amino adenylate here. 308 00:22:17,150 --> 00:22:18,140 And then what happens? 309 00:22:18,140 --> 00:22:19,085 So the T domain-- 310 00:22:24,120 --> 00:22:34,500 OK, and this T domain must be modified by a PPTase, 311 00:22:34,500 --> 00:22:45,630 like what we saw for the PKS, to have the Ppant arm. 312 00:22:49,550 --> 00:22:52,370 After we have activation of the amino acid 313 00:22:52,370 --> 00:22:54,800 or aryl acid monomer, the A domain 314 00:22:54,800 --> 00:22:58,220 is going to assist with transfer of this monomer to the Ppant 315 00:22:58,220 --> 00:22:59,830 arm of the T domain here. 316 00:23:24,500 --> 00:23:33,040 OK, so we got an aminoacyl-S-T covalently tethered 317 00:23:33,040 --> 00:23:37,240 via a thioester linkage. 318 00:23:37,240 --> 00:23:42,880 So one ATP is consumed per monomer loaded. 319 00:23:42,880 --> 00:23:47,020 And the ATP PPi exchange assay we discussed back 320 00:23:47,020 --> 00:23:50,450 in the translation module for studying the aminoacyl tRNA 321 00:23:50,450 --> 00:23:56,560 synthetases is used all the time to study new A domains 322 00:23:56,560 --> 00:23:58,930 and ask what amino acid or aryl acid 323 00:23:58,930 --> 00:24:02,470 monomers do they activate here. 324 00:24:02,470 --> 00:24:05,390 So that assay comes up in this type of work. 325 00:24:08,290 --> 00:24:17,700 So what happens then in terms of formation of a peptide bond, 326 00:24:17,700 --> 00:24:26,315 we're going to consider condensation by the C domains. 
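The loading chemistry just described can be written out as two half-reactions per monomer. This is a schematic summary of what the A domain and T domain do, not a mechanism; "HS-Ppant-T" is shorthand used here for the thiol of the Ppant arm on a holo T domain.

```latex
\begin{align*}
\text{amino acid} + \text{ATP} &\longrightarrow \text{aminoacyl-AMP} + \text{PP}_i
  && \text{(activation by the A domain)} \\
\text{aminoacyl-AMP} + \text{HS-Ppant-T} &\longrightarrow \text{aminoacyl-S-Ppant-T} + \text{AMP}
  && \text{(transfer to the T domain)}
\end{align*}
```

Adding the two steps gives one ATP converted to AMP and PPi for each monomer loaded, which is the one-ATP-per-monomer count stated in the lecture.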
327 00:24:33,040 --> 00:24:35,330 And so let's just imagine-- 328 00:24:35,330 --> 00:24:37,970 we're just going to draw two modules. 329 00:24:37,970 --> 00:24:47,290 So we have a loading module and then a first extender module. 330 00:24:47,290 --> 00:24:55,160 And the T domains have been post translationally modified 331 00:24:55,160 --> 00:24:56,750 with the Ppant arm. 332 00:24:56,750 --> 00:24:59,962 And the action of the A domains has loaded the amino acids 333 00:24:59,962 --> 00:25:00,545 at this stage. 334 00:25:06,310 --> 00:25:10,180 OK, so we have some amino acid loaded here. 335 00:25:10,180 --> 00:25:19,320 And then we have some amino acid loaded here. 336 00:25:21,920 --> 00:25:23,980 OK, and what happens? 337 00:25:23,980 --> 00:25:26,280 We're going to have nucleophilic attack 338 00:25:26,280 --> 00:25:30,900 from the alpha amino group onto the upstream monomer 339 00:25:30,900 --> 00:25:32,345 and then transfer of this monomer. 340 00:25:35,740 --> 00:25:38,455 And this occurs via the action of the C domain. 341 00:25:58,560 --> 00:26:01,200 We have R2. 342 00:26:01,200 --> 00:26:09,280 And now we have formation of our new peptide bond. 343 00:26:09,280 --> 00:26:16,780 Sorry, this is R2 here. 344 00:26:16,780 --> 00:26:20,290 And as I noted above, there is no covalent catalysis 345 00:26:20,290 --> 00:26:22,240 with the C domain. 346 00:26:22,240 --> 00:26:25,230 Somehow it's helping to bring these chains together 347 00:26:25,230 --> 00:26:28,300 and to allow this nucleophilic attack to occur 348 00:26:28,300 --> 00:26:30,550 and to allow the monomer to be transferred, 349 00:26:30,550 --> 00:26:32,680 but this unit is never transferred 350 00:26:32,680 --> 00:26:33,870 to the C domain itself. 351 00:26:33,870 --> 00:26:34,370 Yeah? 352 00:26:34,370 --> 00:26:36,880 AUDIENCE: So is the C domain responsible for deprotonating 353 00:26:36,880 --> 00:26:37,570 the NH2? 
354 00:26:37,570 --> 00:26:39,790 Or is that just always-- 355 00:26:39,790 --> 00:26:41,230 ELIZABETH NOLAN: Yeah, I don't-- 356 00:26:41,230 --> 00:26:43,610 how this gets deprotonated, I don't know. 357 00:26:43,610 --> 00:26:45,340 But this is back to similar, like what 358 00:26:45,340 --> 00:26:47,230 we saw in the ribosome. 359 00:26:47,230 --> 00:26:50,650 And somehow, this alpha amino group needs to be deprotonated. 360 00:26:50,650 --> 00:26:52,840 And there is something in the environment 361 00:26:52,840 --> 00:26:55,660 of this machine that's allowing that to happen, 362 00:26:55,660 --> 00:26:58,610 but whether it's the C domain or something else, 363 00:26:58,610 --> 00:27:00,580 yeah, I don't know the answer to that. 364 00:27:03,350 --> 00:27:08,890 So let's look at a cartoon of this with this ACV synthase. 365 00:27:08,890 --> 00:27:12,970 So here we have on top the synthase 366 00:27:12,970 --> 00:27:17,290 loaded with the amino acid monomers. 367 00:27:17,290 --> 00:27:22,480 OK, so we see loading module and then two extender modules. 368 00:27:22,480 --> 00:27:24,850 We have the aminoadipate. 369 00:27:24,850 --> 00:27:28,630 So it's not a canonical amino acid, but it's amino-acid-like. 370 00:27:28,630 --> 00:27:32,260 We have the cysteine and the valine. 371 00:27:32,260 --> 00:27:35,860 What happens as these condensation reactions occur, 372 00:27:35,860 --> 00:27:37,600 we get chain elongation. 373 00:27:37,600 --> 00:27:40,660 So this is depicted here in a similar manner 374 00:27:40,660 --> 00:27:43,210 to how that PKS assembly line was depicted. 375 00:27:46,630 --> 00:27:52,540 So formation of two peptide bonds, and then what happens? 376 00:27:52,540 --> 00:27:55,390 Ultimately, we have chain transfer 377 00:27:55,390 --> 00:27:59,320 to a serine residue on the thioesterase domain. 378 00:27:59,320 --> 00:28:02,420 And this is a case where the thioesterase domain catalyzes 379 00:28:02,420 --> 00:28:04,510 this hydrolytic release. 
380 00:28:04,510 --> 00:28:06,400 So as opposed to macrocyclization, 381 00:28:06,400 --> 00:28:09,040 we're seeing activation of a water molecule 382 00:28:09,040 --> 00:28:13,800 and attack, which releases this ACV tripeptide. 383 00:28:13,800 --> 00:28:16,570 OK, and I've drawn the ACV tripeptide here 384 00:28:16,570 --> 00:28:22,330 to indicate effectively getting to this structure. 385 00:28:22,330 --> 00:28:26,020 So what happens after this tripeptide is released 386 00:28:26,020 --> 00:28:28,810 from the assembly line, is that there are additional enzymes 387 00:28:28,810 --> 00:28:30,850 that play a tailoring role. 388 00:28:30,850 --> 00:28:33,370 So like, for proteins we talk about post-translational 389 00:28:33,370 --> 00:28:36,550 modification, for these types of natural products, 390 00:28:36,550 --> 00:28:39,760 we talk about post-assembly-line tailoring. 391 00:28:39,760 --> 00:28:44,760 And so in this case, there are some enzymes such as IPNS, 392 00:28:44,760 --> 00:28:47,920 a non-heme iron enzyme that's responsible 393 00:28:47,920 --> 00:28:53,170 for oxidative cyclization to give the fused ring system 394 00:28:53,170 --> 00:28:59,250 characteristic of these beta-lactams like isopenicillin 395 00:28:59,250 --> 00:29:00,760 N. 396 00:29:00,760 --> 00:29:04,360 We can look at this in another cartoon form. 397 00:29:04,360 --> 00:29:07,750 So here is the holoform. 398 00:29:07,750 --> 00:29:10,720 Recall, we called the T domains apo 399 00:29:10,720 --> 00:29:13,930 when the serine is not post-translationally modified 400 00:29:13,930 --> 00:29:15,550 with the Ppant arm. 401 00:29:15,550 --> 00:29:18,730 And the T domains are holo when the Ppant 402 00:29:18,730 --> 00:29:22,630 arm has been attached, as indicated by this squiggle. 403 00:29:22,630 --> 00:29:26,620 We then have loading of the amino acid monomers 404 00:29:26,620 --> 00:29:29,470 via the action of the A domains. 
405 00:29:29,470 --> 00:29:33,410 So formation of that aminoacyl AMP 406 00:29:33,410 --> 00:29:39,460 or amino adenylate intermediate, so one monomer per module. 407 00:29:39,460 --> 00:29:43,240 We have chain elongation events catalyzed by the condensation 408 00:29:43,240 --> 00:29:44,950 domain. 409 00:29:44,950 --> 00:29:50,200 We have chain transfer to the TE domain as shown here, 410 00:29:50,200 --> 00:29:56,490 chain transfer, and then chain release here, 411 00:29:56,490 --> 00:29:58,440 and then post-assembly-line tailoring. 412 00:30:01,350 --> 00:30:04,380 So with that in mind, what we're going to do now 413 00:30:04,380 --> 00:30:09,120 is look at another non-ribosomal peptide synthetase. 414 00:30:09,120 --> 00:30:12,810 This one synthesizes the backbone 415 00:30:12,810 --> 00:30:16,170 of the antibiotic vancomycin. 416 00:30:16,170 --> 00:30:22,110 And the structure of vancomycin is shown here. 417 00:30:22,110 --> 00:30:24,690 This is an antibiotic that's basically 418 00:30:24,690 --> 00:30:28,530 considered one of last resort for bacterial infections. 419 00:30:28,530 --> 00:30:31,860 And there is a huge problem of vancomycin resistance 420 00:30:31,860 --> 00:30:34,320 in the clinic these days. 421 00:30:34,320 --> 00:30:37,320 So at first glance, this molecule 422 00:30:37,320 --> 00:30:40,170 might not look like it's based on a peptide. 423 00:30:40,170 --> 00:30:41,670 But then if you look more carefully, 424 00:30:41,670 --> 00:30:44,000 you see there is a lot of amide bonds. 425 00:30:44,000 --> 00:30:47,880 And there is also some other things going in 426 00:30:47,880 --> 00:30:50,250 to get this final structure. 427 00:30:50,250 --> 00:30:54,390 So effectively, the backbone of vancomycin 428 00:30:54,390 --> 00:30:59,220 is a polypeptide that's a sevenmer. 
429 00:30:59,220 --> 00:31:02,640 So within this heptapeptide scaffold, 430 00:31:02,640 --> 00:31:05,370 there are two proteinogenic amino acids 431 00:31:05,370 --> 00:31:10,170 and five non-proteinogenic amino acids here. 432 00:31:10,170 --> 00:31:15,750 And because we have seven amino-acid-type monomers, 433 00:31:15,750 --> 00:31:18,810 we need an assembly line that has seven modules, one 434 00:31:18,810 --> 00:31:23,010 module per amino acid monomer. 435 00:31:23,010 --> 00:31:27,780 And what we'll see is that these seven modules are distributed 436 00:31:27,780 --> 00:31:32,200 over three proteins. 437 00:31:32,200 --> 00:31:36,960 We have a case of a thioesterase catalyzing hydrolytic release. 438 00:31:36,960 --> 00:31:38,760 And then we're going to need to think 439 00:31:38,760 --> 00:31:41,520 about what are the other tailoring enzymes involved 440 00:31:41,520 --> 00:31:44,200 in giving vancomycin this structure. 441 00:31:44,200 --> 00:31:46,230 So for instance, look here. 442 00:31:46,230 --> 00:31:49,800 We see there is this aryl-aryl C-C bond. 443 00:31:49,800 --> 00:31:52,710 We see these aryl-ether connections. 444 00:31:52,710 --> 00:31:55,590 And we also have these sugars attached. 445 00:31:55,590 --> 00:31:57,480 And look, there is also an N-methylation 446 00:31:57,480 --> 00:32:02,220 here of leucine 1, so a lot happening. 447 00:32:02,220 --> 00:32:06,510 And the consequence of this post-assembly-line tailoring 448 00:32:06,510 --> 00:32:09,990 is that, what's a linear sevenmer polypeptide ends up 449 00:32:09,990 --> 00:32:13,860 having an architecture that's described as a dome, so 450 00:32:13,860 --> 00:32:15,750 a dome-shaped architecture. 451 00:32:15,750 --> 00:32:20,340 And what vancomycin does is that it blocks biosynthesis 452 00:32:20,340 --> 00:32:24,390 of the bacterial cell wall by binding to a certain lipid 453 00:32:24,390 --> 00:32:26,760 precursor in that. 
454 00:32:26,760 --> 00:32:30,360 So let's look at the assembly line. 455 00:32:30,360 --> 00:32:33,210 And this is just an overview of the tailoring 456 00:32:33,210 --> 00:32:35,040 I just told you about. 457 00:32:35,040 --> 00:32:37,440 And this is the amino acid sequence 458 00:32:37,440 --> 00:32:39,510 in order of the different monomers there 459 00:32:39,510 --> 00:32:44,290 and the identities of the non-proteinogenic amino acids. 460 00:32:44,290 --> 00:32:48,040 So here is the assembly line. 461 00:32:48,040 --> 00:32:53,190 And if we take a look, we have the loading module, AT. 462 00:32:53,190 --> 00:32:59,340 We can count the T domains to give us the modules involved 463 00:32:59,340 --> 00:33:00,210 in extension. 464 00:33:00,210 --> 00:33:02,280 So there are seven T domains. 465 00:33:02,280 --> 00:33:05,460 And look, CAT, CAT, CAT-- 466 00:33:05,460 --> 00:33:09,180 we have a number of optional epimerization domains. 467 00:33:09,180 --> 00:33:12,750 And at the end, we see this TE domain. 468 00:33:12,750 --> 00:33:17,130 And so you can walk through and look at each monomer being 469 00:33:17,130 --> 00:33:19,890 attached to the growing chain. 470 00:33:19,890 --> 00:33:21,750 And then what do we see? 471 00:33:21,750 --> 00:33:24,510 What we see happening down here is 472 00:33:24,510 --> 00:33:29,820 that when we have the linear polypeptide attached 473 00:33:29,820 --> 00:33:32,610 to this module here, what happens 474 00:33:32,610 --> 00:33:35,730 is that there is some tailoring happening 475 00:33:35,730 --> 00:33:39,870 while the polypeptide is still attached to the assembly line. 476 00:33:39,870 --> 00:33:43,510 So enzymes that are not part of the assembly line 477 00:33:43,510 --> 00:33:46,720 but are involved in the biosynthesis can come in. 478 00:33:46,720 --> 00:33:49,890 And sometimes they'll modify the chain 479 00:33:49,890 --> 00:33:53,460 when it's still attached to the NRPS or PKS.
480 00:33:53,460 --> 00:33:55,230 Or sometimes they do the chemistry 481 00:33:55,230 --> 00:33:57,840 after the chain is released. 482 00:33:57,840 --> 00:34:00,810 And often, this is a question that people need 483 00:34:00,810 --> 00:34:02,390 to sort out experimentally. 484 00:34:02,390 --> 00:34:04,785 So in this case here, we see that there 485 00:34:04,785 --> 00:34:08,730 is some oxidative cross-linking that occurs while the chain is 486 00:34:08,730 --> 00:34:10,409 still attached to the T domain. 487 00:34:10,409 --> 00:34:12,780 So there is formation of the aryl-ether bond 488 00:34:12,780 --> 00:34:15,460 and this aryl-aryl bond here. 489 00:34:15,460 --> 00:34:17,190 And then after the chain is released 490 00:34:17,190 --> 00:34:20,190 in a hydrolytic manner, what happens 491 00:34:20,190 --> 00:34:24,059 is the sugars get attached post-assembly-line here. 492 00:34:24,059 --> 00:34:25,139 Do you have a question? 493 00:34:25,139 --> 00:34:27,030 AUDIENCE: Yeah, are the enzymes ever 494 00:34:27,030 --> 00:34:29,690 actually in the assembly line, like the optional domains 495 00:34:29,690 --> 00:34:30,190 of PKS? 496 00:34:30,190 --> 00:34:32,370 Or in this case, is it always such 497 00:34:32,370 --> 00:34:34,260 that the enzymes are separate? 498 00:34:34,260 --> 00:34:36,570 ELIZABETH NOLAN: It will depend on the assembly line. 499 00:34:36,570 --> 00:34:37,987 Yeah, so that's something you need 500 00:34:37,987 --> 00:34:42,719 to look for in the assembly line from the bioinformatics. 501 00:34:42,719 --> 00:34:45,690 So in this case, we're only seeing epimerization domains 502 00:34:45,690 --> 00:34:48,270 in the assembly line, but there can easily 503 00:34:48,270 --> 00:34:50,550 be methyltransferases, or reductases, 504 00:34:50,550 --> 00:34:54,150 or cyclases-- any number of possibilities 505 00:34:54,150 --> 00:34:57,810 within the assembly line itself there. 
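The counting rule used above for the vancomycin assembly line (one T domain per monomer, with anything outside the C-A-T core counted as an optional domain) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `summarize_nrps` and the layout `toy_line` are hypothetical, and the placement of the E domains in the toy layout is made up rather than read off the real synthetase. Only the totals (seven T domains over three proteins, a TE at the end) follow the lecture.

```python
# Core NRPS domains in the lecture's letter notation:
# C = condensation, A = adenylation, T = thiolation (carrier).
CORE_NRPS = {"C", "A", "T"}


def summarize_nrps(proteins):
    """Summarize an assembly line given as a list of proteins,
    each protein being an ordered list of domain names."""
    domains = [d for protein in proteins for d in protein]
    return {
        "proteins": len(proteins),
        # One T domain per module, so counting T domains counts monomers.
        "monomers": domains.count("T"),
        # Anything that is not C, A, T, or the terminal TE is optional.
        "optional": [d for d in domains if d not in CORE_NRPS and d != "TE"],
        "ends_with_TE": bool(domains) and domains[-1] == "TE",
    }


# Hypothetical layout: an A-T loading module followed by six C-A-T modules.
# The E (epimerization) positions are illustrative assumptions only.
toy_line = [
    ["A", "T", "C", "A", "T", "E", "C", "A", "T"],
    ["C", "A", "T", "E", "C", "A", "T", "E", "C", "A", "T"],
    ["C", "A", "T", "TE"],
]

summary = summarize_nrps(toy_line)
print(summary)
```

Writing each domain as its own list entry keeps "T" and "TE" distinct, which a plain substring search over a string like "CATTE" would not.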
506 00:34:57,810 --> 00:35:03,080 And these optional domains will work on the upstream monomer. 507 00:35:03,080 --> 00:35:07,130 This is just an example of the tailoring enzymes involved 508 00:35:07,130 --> 00:35:10,790 for cross-linking of this vancomycin scaffold. 509 00:35:10,790 --> 00:35:15,560 In this case, there are three cytochrome P450 enzymes 510 00:35:15,560 --> 00:35:23,360 that are needed in order to make these cross-links. 511 00:35:23,360 --> 00:35:26,210 And that chemistry is shown here to get 512 00:35:26,210 --> 00:35:29,330 to what's called the vancomycin aglycone, which 513 00:35:29,330 --> 00:35:34,040 means that there are no sugars attached. 514 00:35:34,040 --> 00:35:36,260 And I won't draw this one on the board, 515 00:35:36,260 --> 00:35:38,960 but you can do a similar exercise 516 00:35:38,960 --> 00:35:41,150 with this molecule or any others in terms 517 00:35:41,150 --> 00:35:44,060 of identifying the monomer units from the structure 518 00:35:44,060 --> 00:35:45,740 for yourself. 519 00:35:45,740 --> 00:35:48,980 So if we're looking here, we have 520 00:35:48,980 --> 00:35:52,730 effectively the N-terminus, so the starter, 521 00:35:52,730 --> 00:35:56,270 and then effectively look at the peptide bonds 522 00:35:56,270 --> 00:35:59,900 and work your way through to find the different monomers 523 00:35:59,900 --> 00:36:00,590 here. 524 00:36:00,590 --> 00:36:02,930 So by doing that, if you're given a natural product, 525 00:36:02,930 --> 00:36:05,150 you can figure out how many modules are 526 00:36:05,150 --> 00:36:07,160 needed in the assembly line. 527 00:36:07,160 --> 00:36:08,960 And you can also make an assessment 528 00:36:08,960 --> 00:36:12,680 as to what other types of chemistry might have to happen. 529 00:36:12,680 --> 00:36:15,440 And I'll just keep in mind, for something like this-- 530 00:36:15,440 --> 00:36:18,410 let's just take this for an example with this halogen. 
531 00:36:18,410 --> 00:36:21,410 You might ask, well, is that part of the monomer? 532 00:36:21,410 --> 00:36:24,800 Or is that atom incorporated sometime down the road? 533 00:36:24,800 --> 00:36:26,450 OK, those are types of questions people 534 00:36:26,450 --> 00:36:31,140 who explore biosynthesis of these molecules think about. 535 00:36:31,140 --> 00:36:36,920 OK, so with that in mind, let's take a look at some examples. 536 00:36:36,920 --> 00:36:41,720 And the questions are, what kind of assembly line is this? 537 00:36:41,720 --> 00:36:43,240 How many monomers? 538 00:36:43,240 --> 00:36:47,880 And maybe there will be some extra questions as we go. 539 00:36:47,880 --> 00:36:50,000 So here we have an assembly line that's 540 00:36:50,000 --> 00:36:53,490 required to make an antibiotic called daptomycin. 541 00:36:53,490 --> 00:36:55,580 And a company down the street in Lexington 542 00:36:55,580 --> 00:36:59,570 called Cubist has done a lot of work on this natural product. 543 00:36:59,570 --> 00:37:01,130 So how many monomers are here? 544 00:37:09,324 --> 00:37:13,270 Yeah, 13, right-- so count these T domains 545 00:37:13,270 --> 00:37:15,280 based on what's seen here. 546 00:37:15,280 --> 00:37:16,675 How many optional domains? 547 00:37:19,900 --> 00:37:21,763 AUDIENCE: Three. 548 00:37:21,763 --> 00:37:23,680 ELIZABETH NOLAN: And then what else do we see? 549 00:37:23,680 --> 00:37:26,650 So we see that this assembly line 550 00:37:26,650 --> 00:37:32,980 is divided over three proteins, effectively, here. 551 00:37:32,980 --> 00:37:35,140 And similar to what we saw with DEBS, 552 00:37:35,140 --> 00:37:37,900 when we have a break in the cartoon, 553 00:37:37,900 --> 00:37:41,260 that indicates a new polypeptide chain. 554 00:37:41,260 --> 00:37:41,980 What's missing? 555 00:37:48,693 --> 00:37:49,735 AUDIENCE: Loading module. 
556 00:37:49,735 --> 00:37:52,550 ELIZABETH NOLAN: Yeah, there is no loading module here, 557 00:37:52,550 --> 00:37:55,710 right, no AT at the beginning. 558 00:37:55,710 --> 00:37:56,850 So what's going on? 559 00:38:02,600 --> 00:38:05,370 So in this case, I haven't shown you a structure. 560 00:38:05,370 --> 00:38:08,880 It highlights there are always exceptions to the rule. 561 00:38:08,880 --> 00:38:12,150 What happens here is that the loading module actually 562 00:38:12,150 --> 00:38:18,390 loads a fatty acid, so not a standard monomer for NRPS. 563 00:38:18,390 --> 00:38:21,100 So that fatty acid has to come from somewhere. 564 00:38:21,100 --> 00:38:22,740 And you can think about discussions 565 00:38:22,740 --> 00:38:26,280 here as to where that may have come from. 566 00:38:26,280 --> 00:38:31,580 Look at how big this is, 624 kilodaltons, 783, 256-- 567 00:38:31,580 --> 00:38:34,150 we're on the order of 1.5 megadaltons. 568 00:38:34,150 --> 00:38:37,455 This is huge for a 13-mer natural product. 569 00:38:40,710 --> 00:38:43,410 What about this one? 570 00:38:43,410 --> 00:38:44,490 What do we see here? 571 00:38:49,630 --> 00:38:51,070 So this is a natural product-- 572 00:38:51,070 --> 00:38:55,150 this makes a natural product produced by Streptomyces 573 00:38:55,150 --> 00:38:57,850 that has insecticidal activity. 574 00:38:57,850 --> 00:38:59,320 And it kills parasitic worms. 575 00:38:59,320 --> 00:39:03,610 But anyhow, what kind of natural product 576 00:39:03,610 --> 00:39:05,380 is produced by this assembly line? 577 00:39:08,380 --> 00:39:10,480 We have a polyketide, right? 578 00:39:10,480 --> 00:39:11,335 How many modules? 579 00:39:28,328 --> 00:39:30,065 [INAUDIBLE] the T domains.
580 00:39:30,065 --> 00:39:31,190 AUDIENCE: [INAUDIBLE] 581 00:39:31,190 --> 00:39:32,900 ELIZABETH NOLAN: Yeah, 13 again, right-- 582 00:39:32,900 --> 00:39:36,950 four proteins, 13 modules, so how many 583 00:39:36,950 --> 00:39:38,225 unmodified beta ketones? 584 00:39:41,715 --> 00:39:46,329 What would you want to look for for a modified beta ketone? 585 00:39:46,329 --> 00:39:47,810 AUDIENCE: [INAUDIBLE] 586 00:39:47,810 --> 00:39:50,520 ELIZABETH NOLAN: Exactly, no optional domains-- so how many 587 00:39:50,520 --> 00:39:52,350 of those? 588 00:39:52,350 --> 00:39:56,040 Yeah, right, so two modules, we have one 589 00:39:56,040 --> 00:40:01,635 here and then one over here with no optional domains. 590 00:40:05,250 --> 00:40:06,150 What about this one? 591 00:40:09,980 --> 00:40:11,730 This is for a molecule called bleomycin. 592 00:40:11,730 --> 00:40:17,600 JoAnne is an expert on the mechanism of this molecule. 593 00:40:17,600 --> 00:40:18,300 What's going on? 594 00:40:35,415 --> 00:40:36,810 OK, there is a lot going on. 595 00:40:36,810 --> 00:40:38,160 This one is very complicated. 596 00:40:38,160 --> 00:40:40,530 But in terms of making an assessment 597 00:40:40,530 --> 00:40:44,988 about the type of biosynthetic logic, what do we see here? 598 00:40:44,988 --> 00:40:47,360 AUDIENCE: [INAUDIBLE] 599 00:40:47,360 --> 00:40:49,020 ELIZABETH NOLAN: Right, so what we 600 00:40:49,020 --> 00:40:52,290 see is that there is both non-ribosomal peptide 601 00:40:52,290 --> 00:40:54,830 synthesis happening and polyketide 602 00:40:54,830 --> 00:40:58,230 biosynthesis happening in this assembly line. 603 00:40:58,230 --> 00:41:01,020 And that tells us that the product metabolite 604 00:41:01,020 --> 00:41:03,750 is a PKS-NRPS hybrid. 605 00:41:03,750 --> 00:41:05,410 OK, so what do we see? 606 00:41:05,410 --> 00:41:07,740 We see all of these CAT trios which 607 00:41:07,740 --> 00:41:10,740 are indicative of non-ribosomal peptide biosynthesis. 
608 00:41:10,740 --> 00:41:12,630 And then what's happening here? 609 00:41:12,630 --> 00:41:16,470 We have a module that's using polyketide machinery. 610 00:41:16,470 --> 00:41:20,070 And then we go back to non-ribosomal-peptide-based 611 00:41:20,070 --> 00:41:20,870 logic here. 612 00:41:23,440 --> 00:41:24,960 We have many proteins, right? 613 00:41:24,960 --> 00:41:28,800 So this assembly line is divided over many proteins. 614 00:41:28,800 --> 00:41:32,100 And look, we see that even some of the modules are divided up. 615 00:41:32,100 --> 00:41:35,100 So for instance, this CAT trio is 616 00:41:35,100 --> 00:41:38,080 divided between two proteins. 617 00:41:38,080 --> 00:41:43,209 So you may not have all domains of a module on a given protein. 618 00:41:43,209 --> 00:41:45,630 AUDIENCE: What happens if you have two C domains in a row? 619 00:41:45,630 --> 00:41:48,130 ELIZABETH NOLAN: So where do you see two C domains in a row? 620 00:41:48,130 --> 00:41:51,548 AUDIENCE: Between BlmV and BlmX. 621 00:41:51,548 --> 00:41:53,024 ELIZABETH NOLAN: Five and-- 622 00:41:53,024 --> 00:41:55,405 AUDIENCE: Is that actually in a row? 623 00:41:55,405 --> 00:41:57,530 ELIZABETH NOLAN: Yeah, so then that's the question. 624 00:41:57,530 --> 00:42:00,390 Are they actually in a row? 625 00:42:00,390 --> 00:42:04,072 AUDIENCE: Further down, four Cy cyclases without any C domain. 626 00:42:04,072 --> 00:42:05,780 ELIZABETH NOLAN: Yeah, so that's actually 627 00:42:05,780 --> 00:42:06,738 where I was going next. 628 00:42:06,738 --> 00:42:12,800 So what's going on with the Cy without a C domain? 629 00:42:12,800 --> 00:42:15,950 So what's happening-- and we'll probably, if there is time, 630 00:42:15,950 --> 00:42:19,580 go over an example of this on Friday-- 631 00:42:19,580 --> 00:42:24,350 is that Cy, so these cyclization domains 632 00:42:24,350 --> 00:42:27,480 are a variant on a condensation domain. 
633 00:42:27,480 --> 00:42:30,050 And what they do is, they both catalyze formation 634 00:42:30,050 --> 00:42:33,440 of the peptide bond and then they catalyze-- after that, 635 00:42:33,440 --> 00:42:36,350 they catalyze formation of a heterocycle. 636 00:42:36,350 --> 00:42:38,210 So if you recall, I believe we looked 637 00:42:38,210 --> 00:42:40,430 at the structure of yersiniabactin 638 00:42:40,430 --> 00:42:44,240 during the first lecture on these. 639 00:42:44,240 --> 00:42:45,890 It has a number of heterocycles. 640 00:42:45,890 --> 00:42:49,490 And those form by this Cy domain. 641 00:42:49,490 --> 00:42:51,730 And we can see that here in the structure. 642 00:42:51,730 --> 00:42:53,870 So what I've done on this slide is 643 00:42:53,870 --> 00:42:57,040 just present to you the structures, 644 00:42:57,040 --> 00:42:58,820 so the natural products that result 645 00:42:58,820 --> 00:43:00,890 from these different assembly lines. 646 00:43:00,890 --> 00:43:05,510 And if we take a look at the bleomycin, what do we see here? 647 00:43:05,510 --> 00:43:09,480 We have these two heterocycles that are fused together. 648 00:43:09,480 --> 00:43:15,290 And those are formed via the action of these two cyclization 649 00:43:15,290 --> 00:43:16,980 domains down here. 650 00:43:16,980 --> 00:43:21,020 So effectively, these originate from cysteine. 651 00:43:21,020 --> 00:43:24,110 So cysteines, and serines, and threonines 652 00:43:24,110 --> 00:43:29,180 can end up forming structures like these if there is 653 00:43:29,180 --> 00:43:31,250 the appropriate type of domain. 654 00:43:31,250 --> 00:43:34,490 This molecule is extremely complicated here. 655 00:43:34,490 --> 00:43:36,650 And so it's a good puzzle to look at it 656 00:43:36,650 --> 00:43:41,210 and try to sort out what are the monomers in it in here. 657 00:43:41,210 --> 00:43:46,316 Does anyone know what this does, bleomycin? 
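The reasoning used for the bleomycin cartoon (C-A-T-type modules point to non-ribosomal peptide logic, a module built on polyketide machinery points to PKS logic, and Cy is a variant of the condensation domain) can be written as a small classifier. This is a rough sketch under stated assumptions: the names `classify_module`, `classify_assembly_line`, and `toy_hybrid` are hypothetical, the marker sets are deliberately minimal, and the toy layout is not the real bleomycin domain order.

```python
# Minimal marker sets (an assumption of this sketch, not a complete rule set).
# "AT" as a single token means the PKS acyltransferase; an NRPS A domain
# followed by a T domain is written as two separate tokens, "A" and "T".
PKS_MARKERS = {"KS", "AT"}
NRPS_MARKERS = {"C", "Cy", "A"}  # Cy is treated as a condensation variant


def classify_module(domains):
    """Label one module (a list of domain names) as PKS, NRPS, or unknown."""
    found = set(domains)
    if found & PKS_MARKERS:
        return "PKS"
    if found & NRPS_MARKERS:
        return "NRPS"
    return "unknown"


def classify_assembly_line(modules):
    """Label a whole assembly line from the kinds of modules it contains."""
    kinds = {classify_module(m) for m in modules} - {"unknown"}
    if kinds == {"PKS", "NRPS"}:
        return "PKS-NRPS hybrid"
    return kinds.pop() if kinds else "unknown"


# Hypothetical hybrid: NRPS modules, one PKS module, then NRPS logic again.
toy_hybrid = [
    ["A", "T"],
    ["Cy", "A", "T"],
    ["KS", "AT", "KR", "T"],
    ["C", "A", "T", "TE"],
]

print([classify_module(m) for m in toy_hybrid])
print(classify_assembly_line(toy_hybrid))
```

The input here is a list of logical modules, not proteins, since a single module can be split across two polypeptides as noted for one of the CAT trios.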
658 00:43:46,316 --> 00:43:48,272 AUDIENCE: [INAUDIBLE] 659 00:43:51,700 --> 00:43:56,040 ELIZABETH NOLAN: Well, so it's an anticancer antibiotic here. 660 00:43:56,040 --> 00:43:58,080 It can intercalate into DNA. 661 00:43:58,080 --> 00:44:00,900 And these heterocycles are important for that. 662 00:44:00,900 --> 00:44:03,233 And then it causes strand breaks. 663 00:44:03,233 --> 00:44:04,650 And I've actually learned recently 664 00:44:04,650 --> 00:44:06,450 it's also used for, like, treating warts. 665 00:44:06,450 --> 00:44:09,680 So it will kill HPV that causes warts. 666 00:44:09,680 --> 00:44:14,470 Anyhow, all of these compounds have interesting activities, 667 00:44:14,470 --> 00:44:18,600 which is one reason why they can be of interest. 668 00:44:18,600 --> 00:44:24,450 So with the logic in place, where we're going to close 669 00:44:24,450 --> 00:44:29,670 this module is thinking about how folks study these in lab. 670 00:44:29,670 --> 00:44:33,720 So say you want to figure out the biosynthesis of a molecule 671 00:44:33,720 --> 00:44:36,810 like daptomycin or bleomycin, what 672 00:44:36,810 --> 00:44:39,660 is it that one needs to do? 673 00:44:39,660 --> 00:44:42,030 And something just to keep in mind with this right 674 00:44:42,030 --> 00:44:44,610 off the bat, is that these are huge. 675 00:44:44,610 --> 00:44:46,560 So some of these examples here, if you 676 00:44:46,560 --> 00:44:50,850 take a look at the sizes, they're, like, comparable 677 00:44:50,850 --> 00:44:53,830 to the prokaryotic ribosome. 678 00:44:53,830 --> 00:44:56,550 That's a huge protein assembly. 679 00:44:56,550 --> 00:44:59,640 And that presents a limitation from the standpoint 680 00:44:59,640 --> 00:45:04,020 of doing experimental work, because trying 681 00:45:04,020 --> 00:45:08,670 to overexpress or produce these assembly lines in something 682 00:45:08,670 --> 00:45:11,850 like E. coli is typically just unreasonable.
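To put a number on the size comparison, the three protein masses quoted earlier for the daptomycin assembly line (624, 783, and 256 kilodaltons) can simply be added up. The short Python sketch below does that arithmetic. The ribosome figure of roughly 2.5 megadaltons is not from the lecture; it is a commonly quoted approximate value and is used here only as an assumed point of comparison.

```python
# Protein masses quoted in the lecture for the daptomycin assembly line (kDa).
protein_kda = [624, 783, 256]
monomers = 13  # the product was described as a 13-mer

total_kda = sum(protein_kda)
total_mda = total_kda / 1000
kda_per_monomer = total_kda / monomers

# Assumption: approximate mass of a prokaryotic ribosome, for comparison only.
ribosome_mda_approx = 2.5

print(f"total: {total_kda} kDa ({total_mda:.2f} MDa)")
print(f"protein per monomer incorporated: about {kda_per_monomer:.0f} kDa")
print(f"fraction of assumed ribosome mass: {total_mda / ribosome_mda_approx:.2f}")
```

The sum comes to a bit over 1.6 megadaltons, which agrees with the "on the order of 1.5 megadaltons" estimate given in lecture.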
683 00:45:11,850 --> 00:45:14,610 And in terms of a native producer organism, 684 00:45:14,610 --> 00:45:16,740 say, something like Streptomyces, 685 00:45:16,740 --> 00:45:18,630 we may or may not know conditions 686 00:45:18,630 --> 00:45:22,080 that cause the organism to make the natural product, so 687 00:45:22,080 --> 00:45:24,920 conditions that cause it to express this machinery, 688 00:45:24,920 --> 00:45:26,950 and then, even if it is made, 689 00:45:26,950 --> 00:45:29,220 whether it's in an amount that's useful. 690 00:45:29,220 --> 00:45:33,000 So what happens? 691 00:45:33,000 --> 00:45:35,190 What are we going to do as experimentalists? 692 00:45:35,190 --> 00:45:39,870 So as I said, we need to keep in mind that these machines are 693 00:45:39,870 --> 00:45:41,350 enormous. 694 00:45:41,350 --> 00:45:43,230 And so we need to take this into account 695 00:45:43,230 --> 00:45:46,980 during experimental design. 696 00:45:46,980 --> 00:45:52,260 And these days, bioinformatics drives a lot of the studies. 697 00:45:52,260 --> 00:45:54,697 So rather than first finding a natural product 698 00:45:54,697 --> 00:45:56,280 and determining its structure and then 699 00:45:56,280 --> 00:45:59,700 hunting down the protein machinery, a wealth of genomes 700 00:45:59,700 --> 00:46:01,020 is becoming available. 701 00:46:01,020 --> 00:46:04,140 And so you can use bioinformatics to search 702 00:46:04,140 --> 00:46:08,430 for PKS or NRPS gene clusters. 703 00:46:08,430 --> 00:46:11,880 And then you can make some assessment 704 00:46:11,880 --> 00:46:15,120 as to what type of molecule these gene clusters might 705 00:46:15,120 --> 00:46:16,860 be responsible for making. 706 00:46:16,860 --> 00:46:20,370 So bioinformatics plays a huge role. 707 00:46:20,370 --> 00:46:24,450 And it allows us to predict the domains, 708 00:46:24,450 --> 00:46:28,980 to predict their locations, and predict their boundaries here.
709 00:46:28,980 --> 00:46:32,370 So as I just said, overexpression 710 00:46:32,370 --> 00:46:35,700 of a complete assembly line is generally not feasible. 711 00:46:35,700 --> 00:46:37,440 So what do people do? 712 00:46:37,440 --> 00:46:43,770 People will typically express individual domains or maybe 713 00:46:43,770 --> 00:46:48,930 di-domains and study those in the test tube. 714 00:46:48,930 --> 00:46:54,150 So you can imagine PCR amplifying an A domain or a T 715 00:46:54,150 --> 00:46:58,800 domain, or maybe the A and T domains together, 716 00:46:58,800 --> 00:47:01,170 and then creating some plasmid that allows 717 00:47:01,170 --> 00:47:04,980 you to express that in E. coli. 718 00:47:04,980 --> 00:47:07,230 So there is a lot of overexpression. 719 00:47:07,230 --> 00:47:09,060 The proteins need to be purified, 720 00:47:09,060 --> 00:47:11,820 so maybe something like affinity chromatography that we've 721 00:47:11,820 --> 00:47:13,710 spoken about before. 722 00:47:13,710 --> 00:47:16,170 And then a key point is that, in order 723 00:47:16,170 --> 00:47:18,570 to have any of this chemistry work, 724 00:47:18,570 --> 00:47:22,590 these T domains need to be post-translationally modified 725 00:47:22,590 --> 00:47:24,660 with the Ppant arm. 726 00:47:24,660 --> 00:47:27,710 And if you're overexpressing a T domain from Streptomyces 727 00:47:27,710 --> 00:47:30,750 or some organism in E. coli, you can pretty much 728 00:47:30,750 --> 00:47:34,330 assume there is no PPTase in E. coli 729 00:47:34,330 --> 00:47:36,990 that's going to do this for you. 730 00:47:36,990 --> 00:47:39,810 So you need to do that after the fact. 731 00:47:39,810 --> 00:47:45,720 And so there needs to be a PPTase. 732 00:47:45,720 --> 00:47:49,470 And what we'll see is that there is 733 00:47:49,470 --> 00:47:54,320 a PPTase from B. subtilis called Sfp that's very promiscuous. 734 00:47:54,320 --> 00:47:57,330 It will basically modify any T domain.
735 00:47:57,330 --> 00:48:00,540 And so experimentally, this is what people use, 736 00:48:00,540 --> 00:48:06,450 because often, one has no clue what the endogenous PPTase is 737 00:48:06,450 --> 00:48:10,530 here, so Sfp to the rescue. 738 00:48:10,530 --> 00:48:12,870 In terms of activity assays, so once 739 00:48:12,870 --> 00:48:16,440 you have your domains or di-domains purified, 740 00:48:16,440 --> 00:48:18,210 what happens? 741 00:48:18,210 --> 00:48:20,640 This is the typical flow. 742 00:48:20,640 --> 00:48:24,780 So the first is to characterize the A domains and to ask, 743 00:48:24,780 --> 00:48:29,260 what amino acid or aryl acid is activated by the A domain 744 00:48:29,260 --> 00:48:31,140 and what is the selectivity? 745 00:48:31,140 --> 00:48:32,880 And by getting that information, you 746 00:48:32,880 --> 00:48:35,970 have a good clue as to what monomer a given 747 00:48:35,970 --> 00:48:38,580 module is responsible for. 748 00:48:38,580 --> 00:48:41,970 And the ATP-PPi exchange assay we discussed 749 00:48:41,970 --> 00:48:45,090 in the context of the aminoacyl tRNA synthetases 750 00:48:45,090 --> 00:48:47,100 is commonly employed. 751 00:48:47,100 --> 00:48:50,280 So this is where we used the radiolabeled ATP 752 00:48:50,280 --> 00:48:53,020 and took reversibility into account there. 753 00:48:53,020 --> 00:48:56,680 So go back and review that assay as needed. 754 00:48:56,680 --> 00:48:59,890 There will be some examples of this in the problem set. 755 00:48:59,890 --> 00:49:04,180 So once the A domain activity is known 756 00:49:04,180 --> 00:49:07,850 in terms of preferred monomer, the next question is, 757 00:49:07,850 --> 00:49:12,820 will that A domain transfer the amino acid monomer 758 00:49:12,820 --> 00:49:14,650 to a given T domain?
759 00:49:14,650 --> 00:49:18,820 So you design assays to look for transfer of the activated 760 00:49:18,820 --> 00:49:22,330 monomer to the post-translationally-modified T 761 00:49:22,330 --> 00:49:23,950 domain here. 762 00:49:23,950 --> 00:49:25,960 So in these assays, there is a lot 763 00:49:25,960 --> 00:49:31,630 of work with radiolabels, with HPLC, and mass spec. 764 00:49:31,630 --> 00:49:35,560 So once these T domains are loaded, 765 00:49:35,560 --> 00:49:38,500 you can look for peptide bond formation. 766 00:49:38,500 --> 00:49:43,360 So imagine you have an isolated T domain from a loading module 767 00:49:43,360 --> 00:49:47,380 that you've stuck the amino acid on and then you have this guy, 768 00:49:47,380 --> 00:49:49,690 the next question is, does the C domain 769 00:49:49,690 --> 00:49:53,320 catalyze bond formation reaction? 770 00:49:53,320 --> 00:49:56,640 And again, we'll see there is a lot of use of radiolabels, 771 00:49:56,640 --> 00:50:01,990 HPLC, SDS-PAGE here. 772 00:50:01,990 --> 00:50:05,980 And then you know, there is the question of the TE domain 773 00:50:05,980 --> 00:50:09,250 and the TE domain catalyzing chain release. 774 00:50:09,250 --> 00:50:12,520 So it's quite systematic in terms of how you work through 775 00:50:12,520 --> 00:50:19,360 from identifying an assembly line to then teasing apart 776 00:50:19,360 --> 00:50:21,970 the various activities of the different domains 777 00:50:21,970 --> 00:50:23,530 and different modules. 778 00:50:23,530 --> 00:50:27,430 And so where we'll close this module on Friday 779 00:50:27,430 --> 00:50:32,140 is with looking at the experiments that 780 00:50:32,140 --> 00:50:35,980 were done for the biosynthesis of an iron chelator produced 781 00:50:35,980 --> 00:50:42,550 by E. coli and working through basically you know, how was it 782 00:50:42,550 --> 00:50:47,500 that this NRPS was found? 
783 00:50:47,500 --> 00:50:50,050 What were the experiments done to identify 784 00:50:50,050 --> 00:50:52,330 the different activities of the different domains? 785 00:50:52,330 --> 00:50:55,900 And it's really that work that has 786 00:50:55,900 --> 00:50:58,210 served as a foundation and a paradigm 787 00:50:58,210 --> 00:51:03,700 for many, many further studies of these systems here. 788 00:51:03,700 --> 00:51:05,530 And so with that, we'll close for today. 789 00:51:05,530 --> 00:51:09,570 And there is no class Wednesday, so I'll see you on Friday.