1 00:00:00,500 --> 00:00:02,830 The following content is provided under a Creative 2 00:00:02,830 --> 00:00:04,370 Commons license. 3 00:00:04,370 --> 00:00:06,670 Your support will help MIT OpenCourseWare 4 00:00:06,670 --> 00:00:11,030 continue to offer high quality educational resources for free. 5 00:00:11,030 --> 00:00:13,660 To make a donation or view additional materials 6 00:00:13,660 --> 00:00:17,610 from hundreds of MIT courses, visit MIT OpenCourseWare 7 00:00:17,610 --> 00:00:18,510 at ocw.mit.edu. 8 00:00:25,900 --> 00:00:28,630 ELIZABETH NOLAN: We're going to end the unit on synthesis 9 00:00:28,630 --> 00:00:29,560 today. 10 00:00:29,560 --> 00:00:32,049 And the focus of today's lecture will really 11 00:00:32,049 --> 00:00:34,960 be looking at one system in detail 12 00:00:34,960 --> 00:00:36,460 and the types of experiments that 13 00:00:36,460 --> 00:00:41,290 are done to elucidate a biosynthetic pathway 14 00:00:41,290 --> 00:00:43,720 for a non-ribosomal peptide. 15 00:00:43,720 --> 00:00:47,590 And so just to recap from last time, 16 00:00:47,590 --> 00:00:52,780 if we think about studying assembly lines in lab, 17 00:00:52,780 --> 00:00:58,060 and we're thinking about this for a non-ribosomal peptide 18 00:00:58,060 --> 00:01:00,560 synthetase, what needs to be done? 19 00:01:00,560 --> 00:01:04,239 So first, it's necessary to, typically, 20 00:01:04,239 --> 00:01:16,345 overexpress and purify domains, didomains or modules. 21 00:01:21,260 --> 00:01:25,880 And so on Monday, it came up that often, these proteins 22 00:01:25,880 --> 00:01:30,140 are enormous and it's not possible or feasible to express 23 00:01:30,140 --> 00:01:32,330 entire modules, or entire proteins that 24 00:01:32,330 --> 00:01:33,380 have multiple modules. 
25 00:01:33,380 --> 00:01:37,160 So oftentimes, people will look at individual domains, 26 00:01:37,160 --> 00:01:39,920 or didomains, which are smaller and more amenable 27 00:01:39,920 --> 00:01:44,120 to overexpression in an organism like E. coli. 28 00:01:44,120 --> 00:01:51,740 Then it's necessary to assay for A domain activity. 29 00:01:56,502 --> 00:01:58,710 So recall, the A domains are the adenylation 30 00:01:58,710 --> 00:01:59,310 domains. 31 00:01:59,310 --> 00:02:06,000 And the question is, what monomer 32 00:02:06,000 --> 00:02:11,900 is selected and activated? 33 00:02:11,900 --> 00:02:16,920 And so the ATP-PPi exchange assay comes up here. 34 00:02:16,920 --> 00:02:25,950 There need to be assays for loading 35 00:02:25,950 --> 00:02:33,415 of the T domain, or carrier protein, with the monomer. 36 00:02:36,730 --> 00:02:51,850 Assay for peptide bond formation, which 37 00:02:51,850 --> 00:02:55,180 is the condensation domain. 38 00:02:55,180 --> 00:02:58,030 And then often, some assay for chain 39 00:02:58,030 --> 00:03:00,940 release by the thioesterase domain. 40 00:03:00,940 --> 00:03:05,020 OK, so assay for TE activity. 41 00:03:10,370 --> 00:03:11,060 Chain release. 42 00:03:14,540 --> 00:03:19,640 And so in terms of thinking about these T domains, 43 00:03:19,640 --> 00:03:22,850 we learned that these T domains need to be post-translationally 44 00:03:22,850 --> 00:03:26,090 modified with the Ppant arm, which means we 45 00:03:26,090 --> 00:03:29,930 need an enzyme called a PPTase. 46 00:03:29,930 --> 00:03:36,650 And so in many cases, we don't know what the PPTase 47 00:03:36,650 --> 00:03:39,770 is for a given gene cluster. 48 00:03:39,770 --> 00:03:42,410 And what's done, often in the lab, 49 00:03:42,410 --> 00:03:51,160 is that a PPTase from B. subtilis, named Sfp, 50 00:03:51,160 --> 00:03:55,030 is used in order to post-translationally modify 51 00:03:55,030 --> 00:03:57,130 the T domains with the Ppant arm.
52 00:03:57,130 --> 00:03:59,440 So there is a serine residue in these T domains 53 00:03:59,440 --> 00:04:00,400 that gets modified. 54 00:04:00,400 --> 00:04:03,940 We looked over that in a prior lecture. 55 00:04:03,940 --> 00:04:06,110 So this one is very useful. 56 00:04:06,110 --> 00:04:09,400 And if you don't know the enzyme to use, 57 00:04:09,400 --> 00:04:13,150 people will use recombinant Sfp. And just recall, 58 00:04:13,150 --> 00:04:15,670 we have the T domain. 59 00:04:15,670 --> 00:04:18,250 There's a serine moiety. 60 00:04:18,250 --> 00:04:19,009 We have a PPTase. 61 00:04:23,600 --> 00:04:26,990 That's going to stick on the Ppant arm here. 62 00:04:26,990 --> 00:04:32,960 So we call this apo, holo, and then the amino acid, 63 00:04:32,960 --> 00:04:35,600 or aryl acid monomer, in the case of NRPS, 64 00:04:35,600 --> 00:04:39,090 gets loaded here via a thioester. 65 00:04:39,090 --> 00:04:45,305 And so Sfp can be used to get us here. 66 00:04:48,980 --> 00:04:55,040 And what people have even done is make modified analogs, 67 00:04:55,040 --> 00:04:56,360 where there's some R group. 68 00:04:56,360 --> 00:04:58,700 So you can imagine using chemical synthesis 69 00:04:58,700 --> 00:05:01,850 to load a monomer, or even some other type of group 70 00:05:01,850 --> 00:05:04,640 that, for some reason, you might want to transfer here. 71 00:05:04,640 --> 00:05:08,830 And this Sfp is very promiscuous and it can do that. 72 00:05:08,830 --> 00:05:14,810 And so the take-home here is if you need a PPTase, overexpress, 73 00:05:14,810 --> 00:05:17,210 purify, and utilize Sfp. 74 00:05:17,210 --> 00:05:20,510 Here's just an example for review, 75 00:05:20,510 --> 00:05:23,540 where we have a carrier protein, so a T domain, 76 00:05:23,540 --> 00:05:27,470 and we have the PPTase activity here, Sfp, 77 00:05:27,470 --> 00:05:29,870 attaching this Ppant arm. 78 00:05:29,870 --> 00:05:31,760 And here, it's described with an R group.
79 00:05:31,760 --> 00:05:36,770 And just to give you an example of possibilities here, 80 00:05:36,770 --> 00:05:40,880 there have been many reports of CoA analogs being transferred 81 00:05:40,880 --> 00:05:43,220 to T domains by Sfp. 82 00:05:43,220 --> 00:05:46,850 And these can range from things like an isotope label 83 00:05:46,850 --> 00:05:52,160 to peptides to steroids to some non-ribosomal peptide 84 00:05:52,160 --> 00:05:54,410 derivative, or a fluorophore. 85 00:05:54,410 --> 00:05:56,750 So this has been used as a tool. 86 00:05:56,750 --> 00:06:00,150 And you might ask, why is this possible? 87 00:06:00,150 --> 00:06:03,830 And we can just take a look at the structure of Sfp from B. 88 00:06:03,830 --> 00:06:08,750 subtilis with CoASH and magnesium bound. 89 00:06:08,750 --> 00:06:13,700 What we see is that this end of the CoA 90 00:06:13,700 --> 00:06:16,610 is extended out into the solvent. 91 00:06:16,610 --> 00:06:20,180 And at least in this structure here, it's 92 00:06:20,180 --> 00:06:22,710 not interacting with regions of the protein. 93 00:06:22,710 --> 00:06:24,950 So you can imagine that it's possible to attach 94 00:06:24,950 --> 00:06:28,460 some group, even a bulky group, here 95 00:06:28,460 --> 00:06:31,820 and be able to transfer it there. 96 00:06:31,820 --> 00:06:37,220 So where we're going to focus the rest of the lecture 97 00:06:37,220 --> 00:06:39,560 is on an assembly line responsible 98 00:06:39,560 --> 00:06:42,070 for the biosynthesis of a natural product called 99 00:06:42,070 --> 00:06:46,370 enterobactin, and this is a siderophore. 100 00:06:46,370 --> 00:06:49,820 And so in thinking about this, what I would just 101 00:06:49,820 --> 00:06:52,880 like to first note is that when we talk about these assembly 102 00:06:52,880 --> 00:06:57,290 lines, we can group them into two types, which 103 00:06:57,290 --> 00:07:00,860 are non-iterative and iterative assembly. 104 00:07:00,860 --> 00:07:03,690 And so what does this mean?
105 00:07:03,690 --> 00:07:06,950 So we've seen examples of non-iterative assembly 106 00:07:06,950 --> 00:07:11,870 last time on Monday with the ACV tripeptide and the vancomycin 107 00:07:11,870 --> 00:07:13,140 synthetase. 108 00:07:13,140 --> 00:07:16,550 So in these non-iterative assembly lines, 109 00:07:16,550 --> 00:07:19,820 effectively, each step has its own module. 110 00:07:19,820 --> 00:07:24,920 So each carrier protein, T domain, each condensation 111 00:07:24,920 --> 00:07:29,210 or catalytic domain, is used only once as the chain grows. 112 00:07:29,210 --> 00:07:31,460 And we see the chain passed along from module 113 00:07:31,460 --> 00:07:33,740 to module here. 114 00:07:33,740 --> 00:07:38,540 So also, the PKS we looked at for synthesis of DEB 115 00:07:38,540 --> 00:07:41,630 is one of these non-iterative assembly lines. 116 00:07:41,630 --> 00:07:43,760 So in contrast, the example we're 117 00:07:43,760 --> 00:07:46,880 going to look at today, the enterobactin synthetase, is 118 00:07:46,880 --> 00:07:49,370 an iterative assembly line, and this 119 00:07:49,370 --> 00:07:52,710 is similar to what we saw in fatty acid synthase. 120 00:07:52,710 --> 00:07:56,720 So in these iterative assembly lines, 121 00:07:56,720 --> 00:08:01,680 effectively, only one module is employed over and over again. 122 00:08:01,680 --> 00:08:03,620 So you can have the same carrier protein 123 00:08:03,620 --> 00:08:07,520 and same catalytic domain used for multiple cycles of chain 124 00:08:07,520 --> 00:08:08,960 elongation. 125 00:08:08,960 --> 00:08:11,030 And that's what we saw in fatty acid synthase, 126 00:08:11,030 --> 00:08:12,830 where there are multiple cycles of addition 127 00:08:12,830 --> 00:08:18,440 of a C2 unit via the same domains.
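To make the non-iterative versus iterative contrast concrete, here is a minimal Python sketch. The function names, the module labels, and the cycle count are hypothetical and chosen only for illustration. The sketch tracks how many times each module is used as the chain grows; it does not model any chemistry.

```python
from collections import Counter

def non_iterative(modules):
    """Each (module, monomer) pair acts once; the chain passes module to module."""
    usage = Counter()
    chain = []
    for name, monomer in modules:
        chain.append(monomer)
        usage[name] += 1
    return chain, usage

def iterative(module, cycles):
    """A single module is reused for every elongation cycle."""
    name, monomer = module
    usage = Counter()
    chain = []
    for _ in range(cycles):
        chain.append(monomer)
        usage[name] += 1
    return chain, usage

# ACV-like tripeptide: three modules, each used once (module labels are illustrative).
acv_chain, acv_usage = non_iterative(
    [("module1", "Aad"), ("module2", "Cys"), ("module3", "Val")]
)

# Fatty-acid-like case: the same domains add a C2 unit on every cycle.
# The cycle count here is an illustrative assumption.
fa_chain, fa_usage = iterative(("FAS", "C2"), cycles=7)

print(acv_chain, dict(acv_usage))
print(fa_chain, dict(fa_usage))
```

In the non-iterative case every usage count is 1, whereas in the iterative case one module carries the whole count.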
128 00:08:18,440 --> 00:08:20,540 And so what we're going to see today 129 00:08:20,540 --> 00:08:22,710 is this type of iterative assembly 130 00:08:22,710 --> 00:08:26,120 is responsible for the synthesis of this molecule here. 131 00:08:28,840 --> 00:08:33,159 So first, just an overview of building blocks. 132 00:08:33,159 --> 00:08:36,669 And then we'll talk about why organisms 133 00:08:36,669 --> 00:08:38,830 want to make this molecule, and then 134 00:08:38,830 --> 00:08:42,370 focus on the biosynthetic logic and experiments. 135 00:08:42,370 --> 00:08:47,740 So this molecule, enterobactin, is produced from two monomers. 136 00:08:47,740 --> 00:08:53,350 So we have 2,3-dihydroxybenzoic acid, or DHB, 137 00:08:53,350 --> 00:08:55,810 and we have serine here. 138 00:08:55,810 --> 00:08:59,170 And there is a two-module assembly line 139 00:08:59,170 --> 00:09:03,310 responsible for the synthesis of this natural product. 140 00:09:03,310 --> 00:09:06,040 And that assembly line is shown here. 141 00:09:06,040 --> 00:09:12,760 So we see that there are three proteins, EntE, EntB, and EntF. 142 00:09:12,760 --> 00:09:17,230 We have an initiation module, elongation module, and this TE 143 00:09:17,230 --> 00:09:19,540 domain for termination. 144 00:09:19,540 --> 00:09:24,250 So overall, three separate proteins, two modules, 145 00:09:24,250 --> 00:09:25,580 and seven domains. 146 00:09:25,580 --> 00:09:28,450 So this NRPS is quite small. 147 00:09:28,450 --> 00:09:32,350 And this is an example of a non-ribosomal peptide that's 148 00:09:32,350 --> 00:09:34,000 produced by E. coli. 149 00:09:34,000 --> 00:09:36,340 So E. coli makes this molecule, as well as 150 00:09:36,340 --> 00:09:40,810 some other gram-negative bacteria. 151 00:09:40,810 --> 00:09:44,270 So this is iterative.
152 00:09:44,270 --> 00:09:48,040 We have three of each of these monomers, 153 00:09:48,040 --> 00:09:51,100 yet only two T domains here, so imagine 154 00:09:51,100 --> 00:09:53,530 one responsible for each. 155 00:09:53,530 --> 00:09:56,920 So before we get more into this biosynthetic logic, 156 00:09:56,920 --> 00:09:59,830 let's just take a moment to think about why 157 00:09:59,830 --> 00:10:01,370 this molecule is produced. 158 00:10:01,370 --> 00:10:03,160 So this is a case where we actually 159 00:10:03,160 --> 00:10:06,130 have very good understanding about why an organism is 160 00:10:06,130 --> 00:10:08,200 producing a natural product. 161 00:10:08,200 --> 00:10:11,620 And this actually gives a segue into JoAnne's section 162 00:10:11,620 --> 00:10:15,220 on metal homeostasis, which will come up after cholesterol 163 00:10:15,220 --> 00:10:17,050 after spring break. 164 00:10:17,050 --> 00:10:23,470 So many bacteria use non-ribosomal peptide synthesis 165 00:10:23,470 --> 00:10:26,140 machinery in order to make chelators 166 00:10:26,140 --> 00:10:28,210 in order to acquire iron. 167 00:10:28,210 --> 00:10:32,200 And that's because iron is an essential nutrient 168 00:10:32,200 --> 00:10:34,820 and it's actually quite scarce. 169 00:10:34,820 --> 00:10:37,300 So if you imagine an organism in the soil, 170 00:10:37,300 --> 00:10:41,140 maybe it needs to obtain iron from a rock. 171 00:10:41,140 --> 00:10:44,270 Somehow it needs to get iron from our pool, 172 00:10:44,270 --> 00:10:47,860 and concentrations are very tightly regulated, 173 00:10:47,860 --> 00:10:49,990 and most iron is tightly bound. 174 00:10:49,990 --> 00:10:52,960 And we can also think about this from a standpoint 175 00:10:52,960 --> 00:10:56,440 of solubility, so simple Ksp-type things. 176 00:10:56,440 --> 00:10:59,980 We all know that iron(III), which is the predominant oxidation 177 00:10:59,980 --> 00:11:03,840 state in aerobic conditions, is very insoluble.
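As a back-of-the-envelope check on the solubility argument, here is a short Python sketch that estimates free iron(III) in water at neutral pH from a solubility product. The Ksp value of about 1e-39 for Fe(OH)3 is an assumed, commonly quoted order of magnitude rather than a number from the lecture, and the function name is made up for illustration.

```python
# Assumption: Ksp for Fe(OH)3 taken as ~1e-39 (order-of-magnitude value).
KSP_FE_OH_3 = 1e-39

def free_fe3(ph, ksp=KSP_FE_OH_3):
    """Estimate [Fe3+] in mol/L from Ksp = [Fe3+][OH-]^3, taking pKw = 14."""
    hydroxide = 10.0 ** (ph - 14.0)   # [OH-] in mol/L
    return ksp / hydroxide ** 3

fe = free_fe3(7.0)
print(f"[Fe3+] at pH 7 is roughly {fe:.0e} M")

# Gap between that estimate and a micromolar requirement.
print(f"shortfall relative to 1 uM: about {1e-6 / fe:.0e}-fold")
```

Under these assumptions the estimate sits many orders of magnitude below the micromolar range, which is the gap a high-affinity chelator has to close.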
178 00:11:03,840 --> 00:11:06,090 So our cars rust up here in the Northeast 179 00:11:06,090 --> 00:11:08,820 because they sit outside on the road in the winter, 180 00:11:08,820 --> 00:11:10,520 and that's no good. 181 00:11:10,520 --> 00:11:13,030 So we can think about 10 to the minus 18 molar. 182 00:11:13,030 --> 00:11:16,000 And then if we think about free iron in human serum, 183 00:11:16,000 --> 00:11:18,940 for instance, the concentration is even lower 184 00:11:18,940 --> 00:11:21,640 because there's inherent toxicity associated 185 00:11:21,640 --> 00:11:22,410 with free iron. 186 00:11:22,410 --> 00:11:26,270 And you'll hear about that from JoAnne in more detail later. 187 00:11:26,270 --> 00:11:28,390 So these organisms have a predicament 188 00:11:28,390 --> 00:11:30,760 because for metabolism, they need 189 00:11:30,760 --> 00:11:33,280 iron on the order of micromolar concentrations. 190 00:11:33,280 --> 00:11:37,850 So how does some organism obtain micromolar iron 191 00:11:37,850 --> 00:11:40,060 in environments where, say, that's 10 192 00:11:40,060 --> 00:11:42,180 to the minus 24 molar? 193 00:11:42,180 --> 00:11:45,700 And there's a number of strategies that come up, 194 00:11:45,700 --> 00:11:48,400 but one of the strategies is the biosynthesis 195 00:11:48,400 --> 00:11:52,420 of non-ribosomal peptides that act as metal scavengers 196 00:11:52,420 --> 00:11:53,670 and metal chelators. 197 00:11:53,670 --> 00:11:56,290 And so I just show you two examples here. 198 00:11:56,290 --> 00:11:59,400 And we have enterobactin, which we're going to focus on today. 199 00:11:59,400 --> 00:12:02,080 And this is really just a wonderful molecule.
200 00:12:02,080 --> 00:12:05,145 Yersiniabactin-- and I put this up here, in part, 201 00:12:05,145 --> 00:12:06,520 because there were some questions 202 00:12:06,520 --> 00:12:11,680 about those cyclization domains in the bleomycin gene cluster, 203 00:12:11,680 --> 00:12:14,560 and we looked at that assembly line on Monday. 204 00:12:14,560 --> 00:12:17,080 And this is another example where 205 00:12:17,080 --> 00:12:20,470 cyclization of cysteine residues occurs in order 206 00:12:20,470 --> 00:12:24,690 to give the final natural product via those modified 207 00:12:24,690 --> 00:12:27,840 condensation domains here. 208 00:12:27,840 --> 00:12:30,260 So if we think about enterobactin 209 00:12:30,260 --> 00:12:35,030 for a moment longer, what happens, 210 00:12:35,030 --> 00:12:37,670 effectively, is that this molecule can bind 211 00:12:37,670 --> 00:12:39,980 iron(III) with high affinity. 212 00:12:39,980 --> 00:12:42,560 And the iron-bound form is shown here. 213 00:12:42,560 --> 00:12:45,950 So these aryl acids, these catechol groups, 214 00:12:45,950 --> 00:12:49,620 provide six oxygen donors to the iron center 215 00:12:49,620 --> 00:12:51,050 to get a structure like this. 216 00:12:53,690 --> 00:12:57,140 So in terms of the organism and production, 217 00:12:57,140 --> 00:12:59,750 what happens when these organisms are confronted 218 00:12:59,750 --> 00:13:00,950 with iron limitation? 219 00:13:00,950 --> 00:13:04,400 So essentially, they're starved for an essential nutrient. 220 00:13:04,400 --> 00:13:06,380 They'll turn on biosynthesis. 221 00:13:06,380 --> 00:13:10,760 So they'll express the enterobactin synthetase, 222 00:13:10,760 --> 00:13:13,370 which will allow for production of enterobactin. 223 00:13:13,370 --> 00:13:15,870 So this is happening in the cytoplasm.
224 00:13:15,870 --> 00:13:18,080 So we have those three proteins that 225 00:13:18,080 --> 00:13:22,900 comprise the assembly line that use DHB and L-serine to produce 226 00:13:22,900 --> 00:13:24,860 the natural product. 227 00:13:24,860 --> 00:13:27,440 And then in addition to that biosynthetic machinery, 228 00:13:27,440 --> 00:13:31,940 the organism needs to also express and use 229 00:13:31,940 --> 00:13:34,440 a whole bunch of transport machinery. 230 00:13:34,440 --> 00:13:40,400 So what happens is that this natural product is exported 231 00:13:40,400 --> 00:13:42,720 into the extracellular space. 232 00:13:42,720 --> 00:13:44,650 So this is a gram-negative organism, 233 00:13:44,650 --> 00:13:47,630 so it has an inner membrane and an outer membrane. 234 00:13:47,630 --> 00:13:49,700 And it's in the extracellular space 235 00:13:49,700 --> 00:13:52,940 that enterobactin will scavenge iron(III). 236 00:13:52,940 --> 00:13:54,800 So there's formation of the coordination 237 00:13:54,800 --> 00:13:57,230 complex, shown in cartoon here. 238 00:13:57,230 --> 00:13:59,180 And then there's a dedicated receptor 239 00:13:59,180 --> 00:14:03,020 on the outer membrane that will recognize that iron-bound form 240 00:14:03,020 --> 00:14:05,180 and bring that into the cell. 241 00:14:05,180 --> 00:14:09,530 And then through transport and through breakdown 242 00:14:09,530 --> 00:14:11,450 of the natural product, this iron 243 00:14:11,450 --> 00:14:14,270 can be released and then used. 244 00:14:14,270 --> 00:14:19,100 So iron is a co-factor of many types of proteins and enzymes 245 00:14:19,100 --> 00:14:20,330 here. 246 00:14:20,330 --> 00:14:22,250 So a whole lot is going on. 247 00:14:22,250 --> 00:14:25,560 We're going to focus on the biosynthetic part.
248 00:14:25,560 --> 00:14:28,880 And so in thinking about this, from the standpoint 249 00:14:28,880 --> 00:14:31,800 of a non-ribosomal peptide synthetase, 250 00:14:31,800 --> 00:14:33,750 what's something interesting? 251 00:14:33,750 --> 00:14:36,950 So in the examples we saw last time, 252 00:14:36,950 --> 00:14:41,210 we had the ACV tripeptide, the vancomycin synthetase. 253 00:14:41,210 --> 00:14:45,440 These assembly lines are only forming peptide bonds, 254 00:14:45,440 --> 00:14:48,320 so we saw formation of amide bonds. 255 00:14:48,320 --> 00:14:50,980 If we take a look at enterobactin 256 00:14:50,980 --> 00:14:55,520 and we think about where the monomers are coming from, what do we see? 257 00:14:55,520 --> 00:14:58,430 So this has some C3 symmetry. 258 00:14:58,430 --> 00:15:01,770 And we can see that it's comprised of three of these DHB 259 00:15:01,770 --> 00:15:06,460 serine monomers, so 1, 2, 3. 260 00:15:06,460 --> 00:15:09,740 And effectively, there's formation of amide bonds 261 00:15:09,740 --> 00:15:11,060 between DHB and serine. 262 00:15:11,060 --> 00:15:14,840 So it's shown here and throughout. 263 00:15:14,840 --> 00:15:17,780 But there are also ester linkages formed. 264 00:15:17,780 --> 00:15:21,110 So this ring here is often called a trilactone, 265 00:15:21,110 --> 00:15:23,780 or a macrolactone, and somehow, these three esters 266 00:15:23,780 --> 00:15:25,290 need to be formed. 267 00:15:25,290 --> 00:15:29,246 So how is the enterobactin synthetase doing this? 268 00:15:32,900 --> 00:15:36,140 So if we look at an overview of the enterobactin 269 00:15:36,140 --> 00:15:42,140 synthetase, the gene cluster, what do we learn? 270 00:15:42,140 --> 00:15:45,020 So the first point to make is that there are actually 271 00:15:45,020 --> 00:15:47,000 six proteins required. 272 00:15:47,000 --> 00:15:50,850 So you've seen three so far, in terms of the assembly line. 273 00:15:50,850 --> 00:15:54,290 So we have EntA, B, C, D, E, and F.
274 00:15:54,290 --> 00:15:58,130 And A, B, and C are required for the biosynthesis 275 00:15:58,130 --> 00:16:01,984 of this aryl acid building block here, this DHB. 276 00:16:01,984 --> 00:16:06,680 And then this is a case, rather unusual, 277 00:16:06,680 --> 00:16:09,167 where the PPTase was identified, and we're 278 00:16:09,167 --> 00:16:11,750 going to talk about that more as we go through the experiments. 279 00:16:11,750 --> 00:16:14,120 So I just told you about using Sfp 280 00:16:14,120 --> 00:16:15,890 if you don't know what to do. 281 00:16:15,890 --> 00:16:18,290 This was the case where the researchers 282 00:16:18,290 --> 00:16:20,240 were able to identify the dedicated 283 00:16:20,240 --> 00:16:22,560 PPTase for the assembly line. 284 00:16:22,560 --> 00:16:24,200 So that's EntD. 285 00:16:24,200 --> 00:16:28,240 And then we have B, E, and F that provide 286 00:16:28,240 --> 00:16:32,380 an iterative assembly line that yields the natural product, 287 00:16:32,380 --> 00:16:34,120 as shown here. 288 00:16:34,120 --> 00:16:38,150 OK, so also just note that B is coming up twice. 289 00:16:38,150 --> 00:16:41,200 We're seeing it here and we're seeing it here. 290 00:16:41,200 --> 00:16:42,950 So that should bring up a question, what's 291 00:16:42,950 --> 00:16:44,720 going on with this enzyme? 292 00:16:44,720 --> 00:16:46,895 And then we'll address that as we move forward. 293 00:16:49,790 --> 00:16:54,570 So in terms of thinking about this synthetase, 294 00:16:54,570 --> 00:16:58,100 we'll do an overview and then look at the experiments. 295 00:17:10,475 --> 00:17:21,540 So we have an A domain and B. We have a protein here 296 00:17:21,540 --> 00:17:25,800 that has a T domain and an ICL domain that we'll get back to. 297 00:17:28,369 --> 00:17:44,190 This is EntE, and then here we have EntF. 298 00:17:44,190 --> 00:17:46,160 And then we have our PPTase. 299 00:17:52,950 --> 00:17:58,740 So effectively, here, we can have our initiation.
300 00:18:06,470 --> 00:18:09,180 Here, we have elongation. 301 00:18:13,522 --> 00:18:15,518 And here, termination. 302 00:18:23,510 --> 00:18:26,260 So what is the overview, in terms of what 303 00:18:26,260 --> 00:18:28,990 happens for A domain activity, 304 00:18:28,990 --> 00:18:32,650 loading of the T domains, and peptide bond formation? 305 00:18:32,650 --> 00:18:42,400 So for the overview, we'll first consider 306 00:18:42,400 --> 00:18:46,070 getting a monomer on to EntB. 307 00:18:46,070 --> 00:18:47,545 So EntB has a T domain. 308 00:18:53,500 --> 00:18:56,240 And that has a serine. 309 00:18:56,240 --> 00:19:00,710 The serine needs to be modified by the PPTase EntD. 310 00:19:08,260 --> 00:19:10,230 Holo EntB. 311 00:19:10,230 --> 00:19:12,520 We put on the Ppant arm. 312 00:19:12,520 --> 00:19:21,460 And then what we'll see is that EntE is the A domain that's 313 00:19:21,460 --> 00:19:27,820 responsible for activating DHB and transferring 314 00:19:27,820 --> 00:19:29,387 that monomer to EntB. 315 00:19:48,280 --> 00:19:51,190 So then in terms of EntF and getting 316 00:19:51,190 --> 00:19:53,500 the T domain of EntF loaded, it's 317 00:19:53,500 --> 00:19:55,890 going to be loaded with L-serine. 318 00:19:55,890 --> 00:20:02,620 And so here, you have EntF, again, 319 00:20:02,620 --> 00:20:06,280 focusing on the T domain. 320 00:20:06,280 --> 00:20:14,986 Again, we have that action of EntD 321 00:20:14,986 --> 00:20:18,260 to give us the holo form with the Ppant arm. 322 00:20:18,260 --> 00:20:20,660 And then we see that, in this case, 323 00:20:20,660 --> 00:20:23,660 the A domain is within the same protein. 324 00:20:23,660 --> 00:20:26,760 So the A domain of EntF is going to activate L-serine 325 00:20:26,760 --> 00:20:30,490 and transfer that to the T domain. 326 00:20:30,490 --> 00:20:51,360 So we have EntF, A domain to get us here. 327 00:20:51,360 --> 00:20:54,450 So then what about peptide bond formation?
328 00:20:54,450 --> 00:20:58,590 So we see the C domain, the condensation domain, is in EntF. 329 00:20:58,590 --> 00:21:01,170 And so what we can imagine is that we 330 00:21:01,170 --> 00:21:18,226 have our EntB loaded with the aryl acid monomer 331 00:21:18,226 --> 00:21:28,450 plus EntF loaded with L-serine. 332 00:21:31,740 --> 00:21:33,010 And what's going to happen? 333 00:21:33,010 --> 00:21:36,940 The C domain of EntF is going to catalyze formation 334 00:21:36,940 --> 00:21:58,310 of the amide bond here to give us EntB plus EntF, effectively, 335 00:21:58,310 --> 00:22:05,490 with DHB-serine attached. 336 00:22:24,180 --> 00:22:28,050 So this gives us some insight, just this overview, 337 00:22:28,050 --> 00:22:31,500 in terms of how the amide bond is formed and pretty 338 00:22:31,500 --> 00:22:36,720 much follows what we saw for the ACV tripeptide and vancomycin 339 00:22:36,720 --> 00:22:39,060 biosynthesis for the heptapeptide that 340 00:22:39,060 --> 00:22:41,580 forms its backbone. 341 00:22:41,580 --> 00:22:44,850 So a question we have at this stage is, 342 00:22:44,850 --> 00:22:48,420 well, we see in that structure, in addition to these amides, 343 00:22:48,420 --> 00:22:50,340 there are also esters. 344 00:22:50,340 --> 00:22:51,720 How are those formed? 345 00:22:51,720 --> 00:22:54,980 And then what assays are needed? 346 00:22:54,980 --> 00:22:57,390 And so first, we're going to think about formation 347 00:22:57,390 --> 00:22:59,370 of the ester linkages, and then we'll 348 00:22:59,370 --> 00:23:02,020 launch into the experiments. 349 00:23:02,020 --> 00:23:05,200 So let's take a look at this assembly line. 350 00:23:05,200 --> 00:23:10,090 So we have EntE, the A domain, EntB, this didomain. 351 00:23:10,090 --> 00:23:11,490 That has the T domain. 352 00:23:11,490 --> 00:23:12,690 And here's EntF. 353 00:23:12,690 --> 00:23:15,990 And we see in this cartoon, the T domains 354 00:23:15,990 --> 00:23:18,660 are already modified with the Ppant arm.
355 00:23:18,660 --> 00:23:21,330 And here is the serine residue of the TE domain 356 00:23:21,330 --> 00:23:25,560 that, ultimately, accepts the chain. 357 00:23:25,560 --> 00:23:26,950 So what happens? 358 00:23:26,950 --> 00:23:29,970 If we take a look, so we saw this on the board, 359 00:23:29,970 --> 00:23:34,680 EntB becomes loaded with dihydroxybenzoic acid. 360 00:23:34,680 --> 00:23:37,820 EntF becomes loaded with serine. 361 00:23:37,820 --> 00:23:40,860 The condensation domain catalyzes this formation 362 00:23:40,860 --> 00:23:43,530 of an amide bond between the two monomers. 363 00:23:43,530 --> 00:23:45,180 And then what happens? 364 00:23:45,180 --> 00:23:49,370 We see transfer of this DHB-serine unit to the TE domain 365 00:23:49,370 --> 00:23:50,400 here. 366 00:23:50,400 --> 00:23:52,560 And then we can imagine these two domains being 367 00:23:52,560 --> 00:23:54,702 loaded with monomers again. 368 00:23:54,702 --> 00:23:55,410 And what happens? 369 00:24:01,200 --> 00:24:02,470 What do we think about this? 370 00:24:06,620 --> 00:24:08,840 Effectively, formation of one amide bond, 371 00:24:08,840 --> 00:24:11,390 transfer to the TE domain. 372 00:24:11,390 --> 00:24:13,860 Formation of another amide bond. 373 00:24:13,860 --> 00:24:15,560 And look. 374 00:24:15,560 --> 00:24:19,370 The second moiety here is transferred to the TE domain, 375 00:24:19,370 --> 00:24:22,400 to the initial monomer, via this ester linkage. 376 00:24:22,400 --> 00:24:26,190 This is really unusual behavior for a TE domain. 377 00:24:26,190 --> 00:24:27,380 And what happens again? 378 00:24:27,380 --> 00:24:31,460 We see this happen again, so we get this linear trimer 379 00:24:31,460 --> 00:24:33,290 of enterobactin, effectively. 380 00:24:33,290 --> 00:24:34,940 And then what happens? 381 00:24:34,940 --> 00:24:39,020 Chain release to form the macrolactone here. 382 00:24:39,020 --> 00:24:42,990 We have this group that can come around here.
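The cycle just walked through can be sketched as a toy Python model. The function name, the dictionary keys, and the bond bookkeeping are hypothetical and chosen for illustration. The sketch counts DHB-Ser units and bonds only and says nothing about the actual mechanism or kinetics of the TE domain.

```python
def assemble_enterobactin(target_units=3):
    """Toy model: reuse one initiation/elongation pair until the TE holds a trimer."""
    te_chain = []   # DHB-Ser units accumulating on the TE domain
    cycles = 0
    while len(te_chain) < target_units:
        entb_t = "DHB"                 # EntE activates DHB and loads the EntB T domain
        entf_t = "Ser"                 # EntF A domain loads L-serine on the EntF T domain
        unit = f"{entb_t}-{entf_t}"    # EntF C domain forms the amide bond
        te_chain.append(unit)          # unit moves to the TE domain and joins via an ester
        cycles += 1
    # Chain release: the linear trimer closes to the macrolactone (trilactone).
    return {
        "units": te_chain,
        "cycles": cycles,
        "amide_bonds": len(te_chain),   # one amide per DHB-Ser unit
        "ester_bonds": len(te_chain),   # one ester per unit in the closed ring
        "product": "cyclic trimer",
    }

result = assemble_enterobactin()
print(result)
```

Two T domains and one C domain are reused for all three cycles, which is what makes this assembly line iterative.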
383 00:24:42,990 --> 00:24:45,350 So what is the hypothesis? 384 00:24:45,350 --> 00:24:48,080 The hypothesis that was put forth by the researchers 385 00:24:48,080 --> 00:24:50,600 is that in this assembly line, effectively, 386 00:24:50,600 --> 00:24:54,140 this thioesterase is serving as a waiting room. 387 00:24:54,140 --> 00:24:57,530 And it's allowing these DHB-serine monomers 388 00:24:57,530 --> 00:25:00,830 to wait around and remain attached, such 389 00:25:00,830 --> 00:25:02,450 that these esters can be formed. 390 00:25:02,450 --> 00:25:05,630 And somehow, it senses this appropriate size, 391 00:25:05,630 --> 00:25:09,760 this linear trimer, and then catalyzes chain release, 392 00:25:09,760 --> 00:25:12,902 as shown here. 393 00:25:12,902 --> 00:25:13,985 AUDIENCE: Does it mess up? 394 00:25:13,985 --> 00:25:15,360 ELIZABETH NOLAN: Does it mess up? 395 00:25:15,360 --> 00:25:17,460 AUDIENCE: Yeah, does it always give you a 3 396 00:25:17,460 --> 00:25:21,097 under [INAUDIBLE] circle? 397 00:25:21,097 --> 00:25:23,180 ELIZABETH NOLAN: Yes, to the best of my knowledge. 398 00:25:23,180 --> 00:25:25,620 What's very interesting is that-- so this was worked out 399 00:25:25,620 --> 00:25:28,130 in Chris Walsh's group. 400 00:25:28,130 --> 00:25:30,590 Recently, Alison Butler's lab at Santa Barbara 401 00:25:30,590 --> 00:25:33,800 has discovered an enterobactin analog 402 00:25:33,800 --> 00:25:37,400 that has an additional unit in it here. 403 00:25:37,400 --> 00:25:40,190 So it looks like there are other thioesterases around that 404 00:25:40,190 --> 00:25:43,250 serve as waiting rooms and can accommodate different ring 405 00:25:43,250 --> 00:25:44,330 sizes. 406 00:25:44,330 --> 00:25:48,885 But this one will just give this size. 407 00:25:48,885 --> 00:25:51,260 And that's a very interesting question, just in terms of, 408 00:25:51,260 --> 00:25:53,660 how is this thioesterase doing that?
409 00:25:53,660 --> 00:25:58,640 We need more structural understanding to get at that. 410 00:25:58,640 --> 00:26:00,680 In addition, these are just some overviews 411 00:26:00,680 --> 00:26:02,750 that I've put in the notes, other depictions 412 00:26:02,750 --> 00:26:05,870 of this process and the waiting room 413 00:26:05,870 --> 00:26:09,450 hypothesis from the literature. 414 00:26:09,450 --> 00:26:11,960 So we're going to look at the experiments that 415 00:26:11,960 --> 00:26:14,000 were done to study this. 416 00:26:14,000 --> 00:26:17,030 And I really, 1, like enterobactin, 417 00:26:17,030 --> 00:26:19,850 so I got excited about this molecule as an undergraduate, 418 00:26:19,850 --> 00:26:21,260 actually. 419 00:26:21,260 --> 00:26:27,200 But beyond that, why I like to present on this system, 420 00:26:27,200 --> 00:26:31,250 in terms of experiments, is that many firsts came from it, 421 00:26:31,250 --> 00:26:32,990 and it really serves as a paradigm 422 00:26:32,990 --> 00:26:34,490 for many, many other studies. 423 00:26:34,490 --> 00:26:38,060 So if we just consider the various firsts that 424 00:26:38,060 --> 00:26:41,825 came from the studies of the enterobactin synthetase, 1, 425 00:26:41,825 --> 00:26:44,870 it was the first siderophore synthetase to be studied, 426 00:26:44,870 --> 00:26:48,580 and there's hundreds of siderophores out there 427 00:26:48,580 --> 00:26:51,940 and many have been investigated since this one. 428 00:26:51,940 --> 00:26:54,550 It's the first example of a siderophore synthetase 429 00:26:54,550 --> 00:26:58,040 to be characterized for the Ppant arms. 430 00:26:58,040 --> 00:26:59,840 This was the first identification 431 00:26:59,840 --> 00:27:04,880 of a dedicated PPTase for one of these assembly lines. 432 00:27:04,880 --> 00:27:08,930 And the first identification of the thioesterase domain 433 00:27:08,930 --> 00:27:13,140 that has this behavior of forming this cyclo-oligomer. 
434 00:27:13,140 --> 00:27:18,030 And the first identification of an aryl carrier protein, so 435 00:27:18,030 --> 00:27:23,300 this T domain that carries DHB here. 436 00:27:23,300 --> 00:27:28,850 And in terms of the experiments we'll go through, 437 00:27:28,850 --> 00:27:32,450 these experiments that were devised in this system 438 00:27:32,450 --> 00:27:36,440 have been generalized across many, many assembly lines 439 00:27:36,440 --> 00:27:39,980 and the methods are still routinely used today. 440 00:27:39,980 --> 00:27:42,470 But a major difference I want to point out 441 00:27:42,470 --> 00:27:48,440 is that today, we have so many microbial genomes sequenced 442 00:27:48,440 --> 00:27:52,160 that a lot of work is driven by bioinformatics here, in terms 443 00:27:52,160 --> 00:27:54,800 of identifying that NRPS. 444 00:27:54,800 --> 00:27:58,160 So often, the gene cluster may be identified well 445 00:27:58,160 --> 00:28:01,380 before the natural product is ever isolated. 446 00:28:01,380 --> 00:28:03,290 And this is a case where the natural product 447 00:28:03,290 --> 00:28:07,460 isolation occurred first, so that was done well, well 448 00:28:07,460 --> 00:28:10,310 before here. 449 00:28:10,310 --> 00:28:12,570 And this is a case where we know how 450 00:28:12,570 --> 00:28:15,350 to get the organism to produce this natural product. 451 00:28:15,350 --> 00:28:18,890 You starve the organism of iron and it will start to make it. 452 00:28:18,890 --> 00:28:21,530 In many instances, for other natural products produced 453 00:28:21,530 --> 00:28:22,670 by these assembly lines, 454 00:28:22,670 --> 00:28:24,410 we don't know how to get the organism 455 00:28:24,410 --> 00:28:28,580 to actually make the molecule in a laboratory setting there. 456 00:28:28,580 --> 00:28:32,000 So there's some interesting work being done about that.
457 00:28:32,000 --> 00:28:34,970 Some actually recent work out of Northeastern, 458 00:28:34,970 --> 00:28:36,800 actually trying to grow organisms 459 00:28:36,800 --> 00:28:39,620 in soil-like environments and seeing what they 460 00:28:39,620 --> 00:28:41,630 can be provoked to produce. 461 00:28:41,630 --> 00:28:45,180 If you're curious, I'm happy to give you references. 462 00:28:45,180 --> 00:28:47,840 OK, so where are we going to start, 463 00:28:47,840 --> 00:28:52,440 in terms of characterization of this synthetase here? 464 00:28:52,440 --> 00:28:55,460 We're, more or less, going to follow the logic outlined up 465 00:28:55,460 --> 00:28:57,560 here for this. 466 00:29:00,640 --> 00:29:04,690 So here's the cartoon of the players. 467 00:29:04,690 --> 00:29:07,390 And the first order of business is 468 00:29:07,390 --> 00:29:10,090 that it's necessary to characterize the adenylation 469 00:29:10,090 --> 00:29:11,500 domains. 470 00:29:11,500 --> 00:29:14,380 So we need to ask, what monomers 471 00:29:14,380 --> 00:29:16,540 are selected and activated? 472 00:29:16,540 --> 00:29:19,480 And we have two adenylation domains to consider, 473 00:29:19,480 --> 00:29:23,530 so EntE and the A domain of EntF. 474 00:29:23,530 --> 00:29:26,800 So what was done? 475 00:29:26,800 --> 00:29:30,160 For EntE, where we'll start, this protein 476 00:29:30,160 --> 00:29:34,560 was purified from E. coli and characterized here. 477 00:29:34,560 --> 00:29:36,350 And so how was it characterized? 478 00:29:36,350 --> 00:29:39,670 It was characterized by ATP-PPi exchange, 479 00:29:39,670 --> 00:29:42,670 like what we saw for the aminoacyl-tRNA synthetase 480 00:29:42,670 --> 00:29:44,500 characterization.
481 00:29:44,500 --> 00:29:48,130 And so what was observed is that when 482 00:29:48,130 --> 00:29:54,310 EntE was combined with dihydroxybenzoic acid, ATP 483 00:29:54,310 --> 00:29:58,990 and radiolabeled PPi, there was incorporation of the P32 label 484 00:29:58,990 --> 00:30:01,120 into ATP. 485 00:30:01,120 --> 00:30:08,200 So that indicates formation of this adenylate intermediate. 486 00:30:08,200 --> 00:30:12,310 And resulted in the conclusion that EntE 487 00:30:12,310 --> 00:30:17,060 is the A domain that activates this aryl acid monomer, so 488 00:30:17,060 --> 00:30:18,970 this chemistry, which should be very 489 00:30:18,970 --> 00:30:21,750 familiar at this stage based on our discussions 490 00:30:21,750 --> 00:30:22,750 in the translation unit. 491 00:30:26,080 --> 00:30:30,310 So what about EntF and its A domain? 492 00:30:30,310 --> 00:30:33,460 So again, we're working with E. coli proteins. 493 00:30:33,460 --> 00:30:36,820 EntF was purified from E. coli. 494 00:30:36,820 --> 00:30:42,340 And again, this ATP-PPi assay was performed. 495 00:30:42,340 --> 00:30:45,340 And so in this case, what was observed is 496 00:30:45,340 --> 00:30:51,220 that when EntF was incubated with L-serine, ATP, 497 00:30:51,220 --> 00:30:55,300 and radiolabeled PPi, there was incorporation of the radiolabel 498 00:30:55,300 --> 00:30:59,440 into ATP, which indicates that EntF, 499 00:30:59,440 --> 00:31:04,060 its A domain is responsible for that activation of L-serine. 500 00:31:04,060 --> 00:31:06,880 And so you can imagine in each set of experiments, 501 00:31:06,880 --> 00:31:10,180 the researchers also tried the other monomer, and in 502 00:31:10,180 --> 00:31:14,980 the case of EntF, would have seen no ATP-PPi exchange 503 00:31:14,980 --> 00:31:16,390 with DHB. 504 00:31:16,390 --> 00:31:19,420 And likewise, for EntE, if they tried with L-serine, 505 00:31:19,420 --> 00:31:20,800 there would be no exchange. 
506 00:31:20,800 --> 00:31:24,040 You'd want to see that, in terms of making a robust conclusion 507 00:31:24,040 --> 00:31:25,810 here for that. 508 00:31:29,850 --> 00:31:32,250 So that's good. 509 00:31:32,250 --> 00:31:33,990 Now, the next question is we need 510 00:31:33,990 --> 00:31:40,550 to get these monomers to these T domains here. 511 00:31:40,550 --> 00:31:44,740 And so that's the next step, is to study the T domain. 512 00:31:44,740 --> 00:31:46,760 And something you all need to appreciate 513 00:31:46,760 --> 00:31:48,770 about the time of this work, there 514 00:31:48,770 --> 00:31:51,350 wasn't a whole lot known about PPTases. 515 00:31:51,350 --> 00:31:54,080 There wasn't Sfp that you could borrow from your lab mate, 516 00:31:54,080 --> 00:31:56,660 or maybe you've expressed 100 milligrams for yourself 517 00:31:56,660 --> 00:31:59,810 and you could get that Ppant arm on here. 518 00:31:59,810 --> 00:32:02,420 And so JoAnne may want to elaborate, 519 00:32:02,420 --> 00:32:04,790 but there was a lot of effort to try to figure out, 520 00:32:04,790 --> 00:32:06,500 what is going on here? 521 00:32:06,500 --> 00:32:08,750 JOANNE STUBBE: And graduate students had no thesis. 522 00:32:08,750 --> 00:32:10,780 Because they couldn't get any activity of any 523 00:32:10,780 --> 00:32:13,015 of the enterobactin genes. 524 00:32:13,015 --> 00:32:14,090 ELIZABETH NOLAN: Yeah. 525 00:32:14,090 --> 00:32:15,910 JOANNE STUBBE: Until it was discovered what was going on. 526 00:32:15,910 --> 00:32:16,868 ELIZABETH NOLAN: Right. 527 00:32:16,868 --> 00:32:19,760 So this was a major, major effort, undertaking, 528 00:32:19,760 --> 00:32:22,100 and discovery here. 529 00:32:22,100 --> 00:32:25,460 And so they couldn't find activity, 530 00:32:25,460 --> 00:32:28,520 and that's because these T domains needed to be modified 531 00:32:28,520 --> 00:32:31,740 and they weren't modified. 532 00:32:31,740 --> 00:32:34,910 But some little detective work here.
533 00:32:34,910 --> 00:32:38,210 So in the analysis of purified EntF, 534 00:32:38,210 --> 00:32:40,910 what analysis of this purified protein 535 00:32:40,910 --> 00:32:43,740 had revealed, in some instances, was substoichiometric 536 00:32:43,740 --> 00:32:45,940 phosphopantetheine. 537 00:32:45,940 --> 00:32:50,690 And so is that a contamination or is that meaningful? 538 00:32:50,690 --> 00:32:53,120 In this case, it was a meaningful observation 539 00:32:53,120 --> 00:32:56,900 that, when chased, proved to be very helpful. 540 00:32:56,900 --> 00:33:00,110 It suggested that maybe there's a T domain here 541 00:33:00,110 --> 00:33:01,010 that's modified. 542 00:33:01,010 --> 00:33:04,160 That's something we can infer from this. 543 00:33:04,160 --> 00:33:11,610 So if this Ppant arm is attached to EntF, how does it get there? 544 00:33:11,610 --> 00:33:16,100 And if we rewind and think about what was going on at the time, 545 00:33:16,100 --> 00:33:21,320 it was only shortly before that the PPTase for the acyl carrier 546 00:33:21,320 --> 00:33:24,860 protein of fatty acid synthase was discovered. 547 00:33:24,860 --> 00:33:27,830 So that might beg the question, is it possible 548 00:33:27,830 --> 00:33:32,720 that this enzyme also modifies EntF, 549 00:33:32,720 --> 00:33:35,180 if you don't know much about its substrate scope? 550 00:33:35,180 --> 00:33:39,930 And so that hypothesis was tested and it didn't pan out. 551 00:33:39,930 --> 00:33:44,260 So if EntF was incubated with ACPS 552 00:33:44,260 --> 00:33:47,150 from fatty acid biosynthesis and coASH, 553 00:33:47,150 --> 00:33:49,250 there was no product formation. 554 00:33:49,250 --> 00:33:52,050 There was no transfer of the Ppant arm to here. 555 00:33:52,050 --> 00:33:53,120 Yeah?
556 00:33:53,120 --> 00:33:57,500 AUDIENCE: Was it obvious the fatty acid synthesis-- 557 00:33:57,500 --> 00:34:02,100 was there [INAUDIBLE] synthesis at the time, 558 00:34:02,100 --> 00:34:03,395 or did it have a name? 559 00:34:03,395 --> 00:34:05,270 ELIZABETH NOLAN: I don't think it had a name, 560 00:34:05,270 --> 00:34:07,103 but I defer to JoAnne, who was on the thesis 561 00:34:07,103 --> 00:34:09,400 committee, because this is really the first one. 562 00:34:09,400 --> 00:34:11,817 AUDIENCE: Were the analogs of mercury obvious at the time? 563 00:34:11,817 --> 00:34:14,840 ELIZABETH NOLAN: No, it's really discovery work at this stage. 564 00:34:14,840 --> 00:34:18,320 The question is, is there a possible lead from somewhere? 565 00:34:18,320 --> 00:34:20,449 And if you try it, what will happen? 566 00:34:20,449 --> 00:34:25,639 And really, there is no clue as to what is this modification 567 00:34:25,639 --> 00:34:27,469 and the enzyme involved. 568 00:34:27,469 --> 00:34:30,320 But if you see an enzyme with activity in one system, 569 00:34:30,320 --> 00:34:32,239 maybe it will be active with another. 570 00:34:32,239 --> 00:34:32,929 Maybe not. 571 00:34:32,929 --> 00:34:35,480 And in this case, it didn't work, 572 00:34:35,480 --> 00:34:38,150 but it was something certainly worthwhile to try. 573 00:34:41,300 --> 00:34:44,420 So then what was done? 574 00:34:44,420 --> 00:34:47,540 So there was a search for another PPTase, 575 00:34:47,540 --> 00:34:50,510 and this was done using BLAST. 576 00:34:50,510 --> 00:34:52,909 And what BLAST, this bioinformatics, 577 00:34:52,909 --> 00:34:57,680 revealed was the identification of that enzyme EntD. 578 00:34:57,680 --> 00:35:02,010 So here, we have this EntD, the PPTase here. 579 00:35:02,010 --> 00:35:04,820 And so EntD was overexpressed and purified. 580 00:35:04,820 --> 00:35:08,090 And so in this case, a His-tag was used, affinity column 581 00:35:08,090 --> 00:35:09,590 purification.
582 00:35:09,590 --> 00:35:13,220 And the question is, what happens if we incubate 583 00:35:13,220 --> 00:35:18,770 EntD with EntF and coASH? 584 00:35:18,770 --> 00:35:24,350 And so in these experiments, radiolabeled coASH was used, 585 00:35:24,350 --> 00:35:26,840 and radiolabels are commonly used 586 00:35:26,840 --> 00:35:31,490 to look for transfer of either Ppant arms or monomers, 587 00:35:31,490 --> 00:35:35,340 as we'll see as we go forward, to these domains. 588 00:35:35,340 --> 00:35:40,280 And so the question is, will we see transfer of the radiolabel 589 00:35:40,280 --> 00:35:45,970 to EntF in the presence of EntD and coASH? 590 00:35:45,970 --> 00:35:49,400 And so here are the results from the experiment. 591 00:35:49,400 --> 00:35:53,090 So we have formation of holo EntF, 592 00:35:53,090 --> 00:35:57,980 as monitored by the radiolabel, versus time. 593 00:35:57,980 --> 00:36:03,630 And so what's done, the reaction is run for a given time point. 594 00:36:03,630 --> 00:36:05,840 The reaction is quenched with acid 595 00:36:05,840 --> 00:36:08,120 to precipitate the proteins. 596 00:36:08,120 --> 00:36:11,750 And then you can imagine measuring radioactivity 597 00:36:11,750 --> 00:36:13,810 in the pellet. 598 00:36:13,810 --> 00:36:17,540 coASH will remain in the soluble fraction 599 00:36:17,540 --> 00:36:21,290 and then protein in the pellet here. 600 00:36:21,290 --> 00:36:26,840 You can imagine control assays with EntD included there. 601 00:36:26,840 --> 00:36:29,360 And so what do we see? 602 00:36:29,360 --> 00:36:33,230 So as I said before, that we tried the acyl carrier, 603 00:36:33,230 --> 00:36:37,010 ACPS from fatty acid synthesis. 604 00:36:37,010 --> 00:36:39,020 There's no reaction. 605 00:36:39,020 --> 00:36:39,740 But look. 606 00:36:39,740 --> 00:36:45,000 When EntD is present, we see transfer of this Ppant arm 607 00:36:45,000 --> 00:36:46,940 to the protein here.
608 00:36:46,940 --> 00:36:50,060 So this was a really exciting result at the time. 609 00:36:50,060 --> 00:36:52,610 We have a new enzyme, a new activity, 610 00:36:52,610 --> 00:36:56,770 this post-translational modification there. 611 00:36:56,770 --> 00:36:59,420 And this opens the door to further studies. 612 00:36:59,420 --> 00:37:01,770 If you can get the Ppant arm on, then we 613 00:37:01,770 --> 00:37:05,880 can look at loading the monomers here. 614 00:37:05,880 --> 00:37:12,540 So what's the next step? 615 00:37:12,540 --> 00:37:14,850 We have EntF loaded. 616 00:37:14,850 --> 00:37:17,220 We're also going to want to try to load EntD-- 617 00:37:17,220 --> 00:37:19,980 EntB, excuse me-- here. 618 00:37:19,980 --> 00:37:23,010 But of course, you need to know some more about EntB. 619 00:37:26,040 --> 00:37:30,030 And so let's think about that. 620 00:37:30,030 --> 00:37:33,750 I'll also note, just noted here, the next step 621 00:37:33,750 --> 00:37:37,200 is to look at loading of L-serine onto this moiety 622 00:37:37,200 --> 00:37:39,640 here, as drawn. 623 00:37:39,640 --> 00:37:44,350 And you can think about how to do that experimentally. 624 00:37:44,350 --> 00:37:47,160 So what about EntB? 625 00:37:47,160 --> 00:37:52,230 This was another mystery, in terms of experimental work 626 00:37:52,230 --> 00:37:55,080 and exploration. 627 00:37:55,080 --> 00:38:00,240 And so initially, EntB was purified and characterized 628 00:38:00,240 --> 00:38:04,260 for its activity that led towards the biosynthesis 629 00:38:04,260 --> 00:38:06,900 of the DHB monomer. 630 00:38:06,900 --> 00:38:10,560 So this ICL domain is involved in the series 631 00:38:10,560 --> 00:38:13,880 of reactions that give DHB. 632 00:38:13,880 --> 00:38:16,790 On a historical note, it was thought 633 00:38:16,790 --> 00:38:18,560 there was another protein.
634 00:38:18,560 --> 00:38:21,470 And this protein was named EntG that 635 00:38:21,470 --> 00:38:25,940 was thought to be required for enterobactin biosynthesis. 636 00:38:25,940 --> 00:38:32,030 And EntG would be the T domain that is for the aryl acid. 637 00:38:32,030 --> 00:38:34,430 So effectively, it would be this T domain, 638 00:38:34,430 --> 00:38:40,220 or aryl carrier protein for dihydroxybenzoic acid. 639 00:38:40,220 --> 00:38:45,830 But the problem was they couldn't find a gene for EntG. 640 00:38:45,830 --> 00:38:49,910 And so as it turned out, what more detective 641 00:38:49,910 --> 00:38:54,070 work showed is that this EntG is actually just 642 00:38:54,070 --> 00:38:56,720 the C-terminus of EntB here. 643 00:38:56,720 --> 00:39:01,700 So they realized that EntB has another role, another function, 644 00:39:01,700 --> 00:39:05,910 and that in addition to having this function in the synthesis 645 00:39:05,910 --> 00:39:08,720 of dihydroxybenzoic acid, because of this domain 646 00:39:08,720 --> 00:39:11,270 at the C-terminus, it's also the carrier 647 00:39:11,270 --> 00:39:16,160 protein for this monomer here. 648 00:39:16,160 --> 00:39:22,080 So how is this sorted out here? 649 00:39:22,080 --> 00:39:26,400 What we can do is just take a look at a sequence alignment. 650 00:39:26,400 --> 00:39:30,770 And so this is from one of the papers about all 651 00:39:30,770 --> 00:39:32,750 of these explorations. 652 00:39:32,750 --> 00:39:37,040 And effectively, what we're taking a look at 653 00:39:37,040 --> 00:39:43,315 are known [INAUDIBLE] phosphopantetheinylation sites, 654 00:39:43,315 --> 00:39:44,780 the proteins. 655 00:39:44,780 --> 00:39:48,350 So something is known about fatty acid synthesis 656 00:39:48,350 --> 00:39:53,750 and some other carrier proteins here from different organisms.
657 00:39:53,750 --> 00:39:56,600 And so effectively, if we just look 658 00:39:56,600 --> 00:39:58,490 at this region of the alignment, we 659 00:39:58,490 --> 00:40:03,620 see this serine residue with an [INAUDIBLE] 660 00:40:03,620 --> 00:40:08,013 And this happens to be serine 245 of EntB. 661 00:40:11,260 --> 00:40:17,060 So this led to the hypothesis that maybe this serine 245 662 00:40:17,060 --> 00:40:19,570 towards the C-terminus of EntB 663 00:40:19,570 --> 00:40:23,680 is the site where the Ppant arm is attached here. 664 00:40:23,680 --> 00:40:26,800 And so this means that some experiments 665 00:40:26,800 --> 00:40:33,010 are needed to show that EntB has this carrier protein, or T 666 00:40:33,010 --> 00:40:37,810 domain, and that it can be modified with the Ppant arm. 667 00:40:37,810 --> 00:40:40,810 And it was predicted EntD would do this. 668 00:40:40,810 --> 00:40:46,120 And also, that once modified with the Ppant arm, 669 00:40:46,120 --> 00:40:49,460 the aryl acid can be transferred to EntB. 670 00:40:49,460 --> 00:40:59,762 So if we just think about EntB for a minute, 671 00:40:59,762 --> 00:41:03,620 So we have the N-terminal domain. 672 00:41:03,620 --> 00:41:05,085 Here's the C-terminal domain. 673 00:41:09,840 --> 00:41:14,070 Here's the ICL domain. 674 00:41:14,070 --> 00:41:21,150 Here's the T domain for an aryl carrier protein. 675 00:41:21,150 --> 00:41:29,360 So amino acid 1, 285. 676 00:41:29,360 --> 00:41:30,540 This is 188. 677 00:41:30,540 --> 00:41:32,760 It's not quite drawn to scale. 678 00:41:32,760 --> 00:41:40,800 So we have serine 245 around here, which 679 00:41:40,800 --> 00:41:45,265 is the serine of interest for post-translational modification 680 00:41:45,265 --> 00:41:48,460 with the Ppant arm.
681 00:41:48,460 --> 00:41:55,960 And so what was done is that assays were performed, where 682 00:41:55,960 --> 00:42:02,280 EntB was incubated with EntD and radiolabeled coASH, 683 00:42:02,280 --> 00:42:05,400 like what we saw for the studies of EntF. 684 00:42:05,400 --> 00:42:08,370 But they made a few additional constructs. 685 00:42:08,370 --> 00:42:14,910 So they considered full length EntB, so as shown here. 686 00:42:14,910 --> 00:42:20,130 They considered an EntB variant where the C-terminal 25 687 00:42:20,130 --> 00:42:22,240 amino acids were deleted. 688 00:42:22,240 --> 00:42:27,690 So you can imagine, they just put a stop codon in and leave 689 00:42:27,690 --> 00:42:29,550 off the last 25 amino acids. 690 00:42:29,550 --> 00:42:32,220 So the serine is still there, but a bunch 691 00:42:32,220 --> 00:42:35,286 of the C-terminal residues aren't there. 692 00:42:35,286 --> 00:42:39,240 And then they also considered a variant of EntB 693 00:42:39,240 --> 00:42:42,805 where they deleted this entire N-terminal domain. 694 00:42:46,890 --> 00:42:49,510 And so the question is, what are the requirements? 695 00:42:49,510 --> 00:42:52,060 Well 1, does this reaction work? 696 00:42:52,060 --> 00:42:55,420 Does EntD modify EntB with the Ppant arm? 697 00:42:55,420 --> 00:42:58,670 And then if yes, what are the requirements? 698 00:42:58,670 --> 00:43:01,720 So is the N-terminal domain needed? 699 00:43:01,720 --> 00:43:04,460 Are these C-terminal residues needed? 700 00:43:04,460 --> 00:43:10,820 And so these are the gels that come from this experiment. 701 00:43:10,820 --> 00:43:13,585 And so what we're looking at on top 702 00:43:13,585 --> 00:43:16,560 are total proteins, so Coomassie staining. 703 00:43:16,560 --> 00:43:20,380 And on the bottom, we're looking at radioactivity. 704 00:43:20,380 --> 00:43:25,690 And the idea here is that we want to track the radiolabels. 705 00:43:25,690 --> 00:43:29,890 So in lane 1, we have assays with full length EntB.
706 00:43:29,890 --> 00:43:33,340 In lane 2 with the C-terminal truncation. 707 00:43:33,340 --> 00:43:37,310 And in lane 3, deletion of the N-terminal domain. 708 00:43:37,310 --> 00:43:43,430 OK, so the question is, what do we see from these data here? 709 00:43:43,430 --> 00:43:45,580 And so these give us a sense as to where 710 00:43:45,580 --> 00:43:48,930 the individual proteins run on the gel. 711 00:43:48,930 --> 00:43:51,270 And here, we're looking at the radioactivity. 712 00:43:55,590 --> 00:43:56,780 So what do we see? 713 00:43:56,780 --> 00:44:10,000 In lane 1, you see a huge blob of radioactivity. 714 00:44:10,000 --> 00:44:13,220 This isn't the most beautiful gel, actually. 715 00:44:13,220 --> 00:44:15,620 Nevertheless, there's much to learn. 716 00:44:15,620 --> 00:44:18,760 So what do we see? 717 00:44:18,760 --> 00:44:23,140 We see radioactivity associated with EntB. 718 00:44:23,140 --> 00:44:24,790 That's really good news. 719 00:44:24,790 --> 00:44:29,140 We see transfer of this radiolabeled Ppant arm. 720 00:44:29,140 --> 00:44:30,820 What about lane 3? 721 00:44:33,760 --> 00:44:35,030 So what do we see there? 722 00:44:51,390 --> 00:44:52,980 AUDIENCE: Also a lot of radioactivity. 723 00:44:52,980 --> 00:44:53,938 ELIZABETH NOLAN: Right. 724 00:44:53,938 --> 00:44:55,800 We have a lot of radioactivity. 725 00:44:55,800 --> 00:44:58,410 We're looking at the construct that only 726 00:44:58,410 --> 00:45:01,440 has this C-terminal domain. 727 00:45:01,440 --> 00:45:03,780 So what does that tell us? 728 00:45:03,780 --> 00:45:05,072 AUDIENCE: It's shorter. 729 00:45:05,072 --> 00:45:06,780 That's why it moved down the gel further. 730 00:45:06,780 --> 00:45:07,110 ELIZABETH NOLAN: Right. 731 00:45:07,110 --> 00:45:09,780 So that's why it has a different migration on the gel. 732 00:45:09,780 --> 00:45:12,510 But in terms of seeing the radioactivity, what did 733 00:45:12,510 --> 00:45:13,050 we learn? 
734 00:45:13,050 --> 00:45:17,310 Is this region of the protein essential or dispensable? 735 00:45:17,310 --> 00:45:20,040 We don't need this N-terminal domain in order 736 00:45:20,040 --> 00:45:23,730 for EntD to modify EntB. 737 00:45:23,730 --> 00:45:24,608 So we're seeing that. 738 00:45:24,608 --> 00:45:25,650 What about in the middle? 739 00:45:32,810 --> 00:45:35,032 AUDIENCE: The deleted region is important [INAUDIBLE] 740 00:45:35,032 --> 00:45:35,990 ELIZABETH NOLAN: Right. 741 00:45:35,990 --> 00:45:39,890 We see very little radioactivity here. 742 00:45:39,890 --> 00:45:41,780 Basically, almost nothing, especially 743 00:45:41,780 --> 00:45:43,580 compared to these spots. 744 00:45:43,580 --> 00:45:48,770 So deletion of those C-terminal amino acids is detrimental, 745 00:45:48,770 --> 00:45:51,750 and so that region is important. 746 00:45:51,750 --> 00:45:54,170 So maybe there's protein-protein interaction going on, 747 00:45:54,170 --> 00:45:57,650 or something with conformation that's important. 748 00:45:57,650 --> 00:46:01,820 So from here, we learn that EntD transfers the Ppant arm 749 00:46:01,820 --> 00:46:03,260 to EntB. 750 00:46:03,260 --> 00:46:06,050 The ICL domain is not essential for this, 751 00:46:06,050 --> 00:46:09,620 but the C-terminus of this region is. 752 00:46:12,670 --> 00:46:13,810 So now what? 753 00:46:13,810 --> 00:46:17,650 We've got the Ppant arm on here via the action of EntD. 754 00:46:17,650 --> 00:46:22,450 Can we get attachment of the monomer? 755 00:46:22,450 --> 00:46:25,840 And so our hypothesis is that EntE, which 756 00:46:25,840 --> 00:46:31,130 we saw activate DHB to form the adenylate, 757 00:46:31,130 --> 00:46:36,380 will also transfer this moiety to EntB. 758 00:46:36,380 --> 00:46:39,280 So in this case, what was done, again, we're 759 00:46:39,280 --> 00:46:43,840 looking at use of a radiolabel. 760 00:46:43,840 --> 00:46:49,840 In this case, the radiolabel is on the DHB.
761 00:46:49,840 --> 00:46:51,490 And this is an important point. 762 00:46:51,490 --> 00:46:56,380 In order to see this, we cannot have radiolabeled Ppant arm, 763 00:46:56,380 --> 00:46:59,740 in this case, because that would give you a big background. 764 00:46:59,740 --> 00:47:04,420 So they're going to prepare EntB with the Ppant arm unlabeled. 765 00:47:04,420 --> 00:47:06,820 We know that will work based on the prior study. 766 00:47:06,820 --> 00:47:10,120 And now, we repeat that with unlabeled coASH. 767 00:47:10,120 --> 00:47:15,962 And then ask, if we incubate holo-EntB with EntE, ATP, 768 00:47:15,962 --> 00:47:21,280 and radiolabeled DHB, do we see transfer of the radiolabel 769 00:47:21,280 --> 00:47:24,278 to this protein here? 770 00:47:24,278 --> 00:47:25,820 JOANNE STUBBE: Let me ask a question. 771 00:47:25,820 --> 00:47:27,670 This will be a question during class. 772 00:47:27,670 --> 00:47:31,980 Can you do this experiment with tritiated CoA and C14-labeled 773 00:47:31,980 --> 00:47:39,320 serine, based on what you know about radioactivity? 774 00:47:46,520 --> 00:47:49,103 We actually discussed a similar situation. 775 00:47:53,047 --> 00:47:57,060 AUDIENCE: Would it last longer [INAUDIBLE] 776 00:47:57,060 --> 00:47:59,435 JOANNE STUBBE: Did you go back and look at the lifetimes? 777 00:47:59,435 --> 00:48:01,830 Is it infinite compared to any experiments? 778 00:48:01,830 --> 00:48:03,410 So it's not lifetimes. 779 00:48:03,410 --> 00:48:04,530 Do you have any ideas? 780 00:48:07,510 --> 00:48:12,460 AUDIENCE: I mean so tritium, the energy, the particle released, 781 00:48:12,460 --> 00:48:15,155 is much lower than the energy of C14. 782 00:48:15,155 --> 00:48:16,030 JOANNE STUBBE: Right. 783 00:48:16,030 --> 00:48:18,310 So the difference is in the energies. 784 00:48:18,310 --> 00:48:19,260 We talked about this. 785 00:48:19,260 --> 00:48:22,300 You can tune the scintillation counter.
786 00:48:22,300 --> 00:48:24,400 So you measure tritium versus C14. 787 00:48:24,400 --> 00:48:28,690 So people that do enzymology for a living often 788 00:48:28,690 --> 00:48:31,360 use tritium and C14 at the same time. 789 00:48:31,360 --> 00:48:34,460 And you can quantitate, if you do your experiments right. 790 00:48:34,460 --> 00:48:38,782 It's a very powerful tool together, actually. 791 00:48:38,782 --> 00:48:40,240 ELIZABETH NOLAN: So another option, 792 00:48:40,240 --> 00:48:41,605 the non-simplistic approach. 793 00:48:44,410 --> 00:48:52,360 So basically here, if we're looking at the four lanes, 794 00:48:52,360 --> 00:48:54,310 again, we're looking at total protein. 795 00:48:54,310 --> 00:48:57,790 We're looking at radioactivity, and can 796 00:48:57,790 --> 00:49:02,750 consider the overall reaction, and then a variety of controls. 797 00:49:02,750 --> 00:49:04,913 OK, and I want to move forward to get 798 00:49:04,913 --> 00:49:07,330 through the rest of the slides and we just have a few more 799 00:49:07,330 --> 00:49:09,550 minutes, but you should work through this gel 800 00:49:09,550 --> 00:49:12,630 and convince yourself that there is transfer 801 00:49:12,630 --> 00:49:15,730 of this radiolabel in the presence of EntE. 802 00:49:15,730 --> 00:49:17,827 And this was done with unlabeled EntD. 803 00:49:21,410 --> 00:49:25,840 OK, so what about peptide bond formation? 804 00:49:25,840 --> 00:49:28,990 We have the T domains loaded with the monomer. 805 00:49:28,990 --> 00:49:32,830 Can we see activity from the C domain, 806 00:49:32,830 --> 00:49:36,010 in terms of the formation of an amide bond? 807 00:49:36,010 --> 00:49:40,880 And so this experiment requires a lot of components. 808 00:49:40,880 --> 00:49:42,130 So what is the experiment? 809 00:49:42,130 --> 00:49:45,640 To look at whether or not EntF catalyzes condensation.
810 00:49:48,280 --> 00:49:56,290 Basically, we can incubate EntE, holo EntB, holo EntF, ATP, 811 00:49:56,290 --> 00:49:58,570 and these monomers. 812 00:49:58,570 --> 00:50:01,630 And what we want to do, in this case, 813 00:50:01,630 --> 00:50:07,620 is look at transfer of radiolabeled DHB 814 00:50:07,620 --> 00:50:08,790 to serine-loaded EntF. 815 00:50:08,790 --> 00:50:11,703 And I guess I got a little ahead of myself on the prior slide. 816 00:50:11,703 --> 00:50:13,120 So this is a case where if you had 817 00:50:13,120 --> 00:50:15,970 C14 labels in both of your monomers, 818 00:50:15,970 --> 00:50:18,070 you'd have a big problem. 819 00:50:18,070 --> 00:50:24,850 So key here is to use unlabeled serine and radiolabeled DHB 820 00:50:24,850 --> 00:50:28,240 so you're not getting a big background. 821 00:50:28,240 --> 00:50:31,450 And an important point to make here in these experiments 822 00:50:31,450 --> 00:50:34,450 is that we're looking for detection 823 00:50:34,450 --> 00:50:36,190 of a covalent intermediate. 824 00:50:36,190 --> 00:50:46,430 So effectively, having this guy here attached to EntF. 825 00:50:46,430 --> 00:50:56,230 And so the radiolabel is here. 826 00:50:56,230 --> 00:50:59,535 So that's what we're looking for, not the final product, 827 00:50:59,535 --> 00:51:03,220 and that's indicated by having the gels. 828 00:51:03,220 --> 00:51:05,310 So what do we see? 829 00:51:05,310 --> 00:51:09,290 We have the total protein and then the autoradiograph. 830 00:51:09,290 --> 00:51:15,300 And so we have holoEntF, EntE, holoEntB. 831 00:51:15,300 --> 00:51:20,310 And the question is, where do we see radiolabels transfer? 832 00:51:20,310 --> 00:51:25,830 And if we look at lane 3, where we have the proteins, ATP, 833 00:51:25,830 --> 00:51:29,880 serine, and radiolabeled DHB, what we observe 834 00:51:29,880 --> 00:51:33,210 is that there is some radioactivity here, 835 00:51:33,210 --> 00:51:36,000 which is indicative of a covalent intermediate.
836 00:51:36,000 --> 00:51:39,470 And again, you should work through these gels 837 00:51:39,470 --> 00:51:41,190 and work through the different conditions 838 00:51:41,190 --> 00:51:45,450 and make sure it makes sense to you what's seen in each one. 839 00:51:45,450 --> 00:51:50,760 So finally, at that stage, the activities 840 00:51:50,760 --> 00:51:53,410 for all of these different domains have been found. 841 00:51:53,410 --> 00:51:55,860 And so the question is, in the test tube, 842 00:51:55,860 --> 00:51:59,790 can we actually get enterobactin biosynthesized, 843 00:51:59,790 --> 00:52:03,720 which is going to rely on this TE domain. 844 00:52:03,720 --> 00:52:07,050 So the idea is if we incubate everything together, 845 00:52:07,050 --> 00:52:09,210 similar to what was done before, can we 846 00:52:09,210 --> 00:52:12,720 detect the actual small molecule, rather than 847 00:52:12,720 --> 00:52:17,040 this intermediate attached to EntF here? 848 00:52:17,040 --> 00:52:21,630 And so the way this was done was by monitoring the reaction 849 00:52:21,630 --> 00:52:26,320 by HPLC using reverse phase chromatography. 850 00:52:26,320 --> 00:52:29,920 And so here, we have all of the reaction components. 851 00:52:29,920 --> 00:52:31,650 Here, we see standards. 852 00:52:31,650 --> 00:52:35,630 So enterobactin, this is the linear trimer, 853 00:52:35,630 --> 00:52:37,570 the linear dimer, a monomer. 854 00:52:37,570 --> 00:52:41,110 Here's the DHB substrate. 855 00:52:41,110 --> 00:52:45,240 And the question is, over time, do we see enterobactin formed? 856 00:52:45,240 --> 00:52:47,850 So you can imagine quenching the reaction, 857 00:52:47,850 --> 00:52:49,680 taking the soluble component, which 858 00:52:49,680 --> 00:52:54,100 should have this small molecule, and looking at HPLC. 859 00:52:54,100 --> 00:52:57,480 And where you should just focus at the moment is here.
860 00:52:57,480 --> 00:52:59,850 So the enterobactin peak, what we see 861 00:52:59,850 --> 00:53:03,900 is that over time, there's growth in this peak. 862 00:53:03,900 --> 00:53:06,660 You can imagine doing something like LC-MS analysis 863 00:53:06,660 --> 00:53:09,960 to confirm it is the species you expect here. 864 00:53:09,960 --> 00:53:13,800 So this is full in vitro reconstitution 865 00:53:13,800 --> 00:53:18,150 of a non-ribosomal peptide synthetase in the test tube, 866 00:53:18,150 --> 00:53:21,930 and it really paved the way for many, many additional 867 00:53:21,930 --> 00:53:26,550 experiments related to these types of biosynthetic machines. 868 00:53:26,550 --> 00:53:30,270 And so with that, we're going to close this module, 869 00:53:30,270 --> 00:53:32,340 and I hope you all have a great spring break. 870 00:53:32,340 --> 00:53:34,560 And I leave you in the good hands of JoAnne 871 00:53:34,560 --> 00:53:37,610 starting the 28th here for lecture.