Description
After wrapping up the lecture on lipids, Professor Imperiali moves on to discussing amino acids, peptides, and proteins.
Instructor: Barbara Imperiali
Lecture 3: Structures of Am...
PROFESSOR: It's going to be a great lecture today. It's about proteins. I love proteins. Don't forget the handout, yeah. OK, so I'm going to briefly wrap up the lecture we were doing on Friday because there were a couple of things that I wanted to make a note of, and then we'll move on to section 2.3 about amino acids, peptides, and proteins.
Now, in the last class, I introduced you to the lipidic molecules, and you can pick them out of a lineup because they are rich in carbon-carbon and carbon-hydrogen bonds. As you can see here in these line-angled drawings, the majority of a lot of these molecules is carbon-carbon or carbon-hydrogen bonds. They are molecules that are mostly hydrophobic, so there are some terminologies here. Whoops.
Hydrophobic, which can also be referred to as lipophilic. You either-- you can hate water and love fatty acid or fatty types of materials, so both of those terms are synonymous. And some of the lipids are what are known as amphipathic, and they include hydrophobic and hydrophilic components.
There are a couple of tiny terms that I didn't mention explicitly, so I just want to go ahead and do that now. For example, in this phospholipid structure-- and we'll talk about these-- they have long chain fatty acids attached via esters to this glycerol unit, so there's one here and the second one here, and then what's known as the polar head group. In those fatty acids, they could be fully saturated. It means they have no double bonds in the structure, so that term saturated is equivalent to no double bonds, so no carbon-carbon double bonds. Or they could be unsaturated, where there is a double bond within it, so that's one or more double bonds.
And those double bonds take on a particular shape because there's not freedom of rotation around double bonds the same way there is around single bonds. So those single bonds, you can twist them around and twist them around, but the double bond geometry is fixed. And so double bonds, we refer to them as either trans, where the two groups are on opposite sides, leaving the double bond, or we refer to them as cis, where the two groups are on the same side. And we tend to use that cis and trans sort of naming system in a lot of other contexts as well, but you almost always want to remember that trans is as far away as possible, cis is closer than trans.
All right, so I'm just going to take you forward to the phospholipid structure. This is very important: semi-permeable membranes are made up through the non-covalent, supramolecular association of phospholipid monomer units. Here's a monomer unit up here. You see it has an amphipathic structure, with a lot of hydrophobicity but also hydrophilicity, and these molecules assemble into supramolecular structures that form the boundaries of your cells.
Saying they are semi-permeable tells us a little bit about what can go through them. If they were fully permeable, anything could come and go and they wouldn't be much use frankly. It's like leaving the door open the whole time. But because they're semi-permeable, only a few things can come and go without extra help, and other things need active mechanisms to go through. So let's take a look at the boundary here.
So when you see a membrane bilayer, they are-- they're often shown looking like this, where every one of these units is a phospholipid, and there's water on both sides of the phospholipid, because that polar head group is interacting with water on both sides. So down here could be the inside of the cell. Up here could be the outside of the cell. And a lot of cells, especially eukaryotic cells, the ones that make us up, have a lot of endomembranes, membranes within the cells. For example, forming the boundary to the nucleus or to the mitochondria. Yes?
AUDIENCE: Why is there a [INAUDIBLE]
PROFESSOR: Oh, this guy must be-- so this looks like it's probably a saturated fatty acid. So what do you think this one might be, folks? Unsaturated. And what's the double-bond geometry?
AUDIENCE: Cis.
PROFESSOR: Cis. Yeah, it is. It's like a-- it looks like a ballerina or something. OK, so we have a lot of concerns, and we'll see later about how things get in and out of cells. But most commonly, things like oxygen or water and other small hydrophobic molecules can pass readily in and out through the semi-permeable barrier, but other things, things that are charged, things that are big, need a different mechanism to get in and out. And we will see later on how proteins provide the opportunities to cargo things into cells or out of cells, even very large entities, and there are certain mechanisms whereby that happens through a semi-permeable membrane, OK?
I want to show you the other feature of membranes. They are self-healing. What this means is if you poke them, you poke a hole in a cellular membrane, you basically push apart those non-covalent forces. Once you take the thing away, be it a needle or a very fine glass capillary, they seal right back up to close the hole in the cell membrane, so that kind of tells us that they're non-covalent forces.
So this is a really cool video of someone doing micro-injection into eukaryotic cells. The needle points to the cell, approaches the surface. You can drop something into the cell, and then the cell closes and maintain-- regains its integrity of the barrier, so this is a very cool observation. People do this. They have to not drink too much coffee because it's quite complicated to do a lot of micro-injection, because you can really cause carnage in your cell population if you're not very dexterous with the micro-injection but people can be very good at it.
So I just want to ask a couple of questions, give you a couple of things to think about before we close up the lipids. So here's a typical lipid bilayer, where I've highlighted a single lipid. And the colors, those are the head groups, and all in white and gray are the hydrophobic components, and just one of the phospholipids is highlighted, and that would be this molecular structure here. So first of all, what do you think the non-covalent forces at that membrane interface may be? That is, what's going on here at the interface? What are the types of interactions that you might have there?
Give you a minute to think about it, and I want to show you that I'm actually giving you a clue here, because you can see the structure, negative charge, positive charge, but also remember this is a barrier to water, so there are other things going on with the solvent that the membrane is sitting in because there's water surrounding that barrier layer. Anyone want to tell me what the answer is, and why? Yeah, did you-- are you-- yeah.
AUDIENCE: Hydrogen bonding.
PROFESSOR: Yeah, between what and what?
AUDIENCE: Like the oxygen and [INAUDIBLE]
PROFESSOR: Right, so water. Water is a good hydrogen bond donor and acceptor, so there will be hydrogen bonding. What about amongst all those lipid head groups, what's the other major force? Yeah?
AUDIENCE: Electrostatic force.
PROFESSOR: Between the different charges. So the correct answer here is both of them. Don't think it's just electrostatic, it's both. It's electrostatic amongst the head groups, hydrogen bonding between all that sort of dense bunch of charge, and the water. And then the other question, what type of molecules can get across? I've already answered that question to you.
Salts are going to need ways to get in and out. Small proteins are too big to dissolve in that membrane through passive mechanisms, so we're going to have to figure out how to get proteins in and out of cells. Neurotransmitters, such as this, this is GABA, or gamma aminobutyric acid. It's charged. It just can't get through without a transporter of some kind, and it's actually proteins that end up doing the heavy lifting of the transport processes that we'll see.
OK, so moving along. This section will be about the building blocks of your protein macromolecules, which I want to remind you comprise 50% of all of the macromolecules, so that suggests it's a pretty important class of macromolecules that has a lot of different functions. Now, the amino acid building blocks look pretty simple. They're called amino acids because they have an amine, a carboxylic acid, and there's a carbon that is tetrahedral between the carboxylic acid and the amine.
And the simplest of those is when those are both hydrogen, but most of the amino acids are differentiated from that-- this one I've showed you on the board. This amino acid is glycine. Usually, when it's just a lonely amino acid in aqueous solution, it's in a different charged form, just consistent with what we talked about in the last class. And I put it here.
So this is glycine. It's one of the 20 encoded amino acids. That means the amino acids that are made through ribosomal biosynthesis through a code that's provided by the messenger RNA, so they are encoded by messenger RNA. Later on, you'll see all of the beautiful mechanics of those processes.
Now, this table looks pretty complicated, so I'm going to deconstruct it a bit. But what I first of all want to assure you is that these-- you will always get a handout with these structures on them. We are not asking you to remember these structures. You might become familiar with some of them, but you do not have to remember them.
You'll have a table that shows them, but on that table, I won't necessarily give you the information on what their properties are, because those are things that you should be able to spot by looking at their chemical structures, all right? So that's important. So these are all line-angled drawings, so you see the carbon. The hydrogens aren't shown in there.
The charges are shown for what's called the side chain, because most of the amino acids have a side chain. The amino acids are also chiral, but you'll learn more than you ever wanted to know about chirality in 5.12, so I won't weigh you down with any of those properties. So there is a side chain that dictates the properties of the amino acids.
One tiny detail, the amino acids that are encoded in our proteins are all what are known as alpha amino acids. There are other amino acids. GABA, that I showed you on the previous slide, is not an alpha amino acid. Actually, it's a gamma amino acid. These are called alpha amino acids because the amine group is at the alpha position relative to the carboxyl. You don't need to know a lot more about that.
So let's take a look at this set of amino acids, and what you see is amino acid side chains with rather different properties. I've amassed-- here's glycine at the very top. All amino acids have a three-letter code and a one-letter code. I particularly enjoy using one-letter codes and spelling out people's names in peptides and things like that. I'll let you do that in the privacy of your own room. It's kind of amusing to see if your name actually spells out a peptide. Some of us-- I get a little stuck with Barbara because there are no amino acid one-letter codes with a B.
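The name-spelling exercise can be tried out with a short Python sketch. It assumes the standard one-letter codes for the 20 encoded amino acids; the function names `missing_letters` and `spells_peptide` are made up for this illustration and do not come from the lecture.

```python
# One-letter -> three-letter codes for the 20 encoded amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def missing_letters(name):
    """Return the letters of `name` that have no one-letter amino acid code."""
    letters = {ch for ch in name.upper() if ch.isalpha()}
    return sorted(letters - ONE_TO_THREE.keys())


def spells_peptide(name):
    """True if every letter of a non-empty `name` is a one-letter code."""
    return bool(name) and not missing_letters(name)


print(spells_peptide("Tim"))       # True
print(missing_letters("Barbara"))  # ['B']
```

Only the letter B blocks "Barbara", which matches the remark above; "Tim" works because T, I, and M are all assigned codes.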
The next most abundant type of amino acid have hydrophobic side chains. What that means is they have a lot of CHs, but not a lot else, right? So take a look at them. Alanine has a methyl group, for example, where I've shown the R, that would be alanine.
And they get increasingly big. They're quite large. Some of them have quite extended side chains. Other ones have side chains with rings with double bonds in them. Those are what we would designate in organic chemistry as aromatic. They show-- they are still hydrophobic, but they show different properties to this other set of amino acids.
Some of these amino acids may actually have polar groups in them, but their major feature is that they're hydrophobic. But in an amino acid, such as tyrosine, you could not only have hydrophobic interactions with that ring system, but also hydrogen bonding with the OH on the tyrosine, so some of the amino acids can do a few different things.
The next set of amino acids are those that are polar and charged, and I've shown you the most common state of all of those amino acids, but you already know that the amine of lysine is likely to be charged. This guanidinium group of arginine, take my word for it, it's charged. It's a bit more complicated to draw. Histidine is also one of those that's annoying to draw, but the negatively-charged side chains with a carboxylate are both negatively charged, and that's something you would remember from the previous class hopefully. And then finally, there are amino acids with polar uncharged side chains, such as those shown here.
Now, this doesn't look like a very exciting set of building blocks. How can life run on things made of 20 relatively simple building blocks with functional groups? And it's that the building blocks are not functional themselves. It is the polymers that are made up of amino acids, and I'll always call them AAs because it's easier for me. The polymers of amino acids are heteropolymers. That means they're made up of a bunch of different monomer units when they're called heteropolymers.
And the other important thing about these polymers is that they are of defined sequence. What is the sequence? It's the order in which the amino acids appear. So I'm writing that down, order. And all the functions of proteins are dictated by the order of the amino acids, so let's take a look at the sidebar here.
So once again, remember a couple of things that we will always give you this table to think about. Ooh, come back. There are a couple of outliers I just want to mention quickly. So I talked to you about glycine, the simplest amino acid with no elaborate side chain. Proline is a little odd because its side chain is kind of in a cyclic structure, and towards the end of the class, I'll talk to you about collagen, whose structure is totally dependent on the involvement of proline in the sequence of the amino acids that make up collagen.
And then the last sort of unusual amino acid is cysteine. It has a thiol, and the one clever thing about cysteine-- I'm just going to put a bit of a peptide here. One cysteine, and then I'm going to put a second cysteine, and these are going to be deemed in a peptidic structure. What cysteine can do is it can exist either with the thiol side chain, SH, or it can be at a different oxidation state where the two sulfurs are joined to each other.
So for the most part, your linear arrangement of amino acids that dictates sequence is solely held together by the covalent bonds in the peptide backbone that we'll talk about in a minute. But occasionally, in folded structures, if two cysteines are close to each other and the environment is oxidizing, they will form a cross-link. But they're not what drives folding. They kind of fall into place later on, but that just sort of sets cysteine apart a little bit for its properties, all right?
OK, so coming down the side here. Amino acids are assembled in a unique linear polymer of defined order, and we designate that defined sequence the primary sequence. And proteins can be 1,000 amino acids, 1,500, 100 amino acids. They can be various lengths where they, you know, we would generally consider the smallest protein to be about 40 amino acids, and you might go up to thousands of amino acids. I'm going to write 2,000 or more here.
When the proteins are smaller, they are not capable of adopting too much ordered structure, and we mostly call them peptides. Peptides are sort of shorter sequences, so peptide sequences. So this would be a protein, and peptides, probably two to 39 amino acids, but these breakpoints are a little bit more vague. So the primary sequence will define the structure of a protein, and we're going to start to talk about the hierarchical structure of proteins as put in place, and that's the primary sequence.
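The length breakpoint can be written as a tiny Python sketch. The cutoff of 39 residues comes from the "two to 39" figure above; treating every longer chain as a protein is an assumption of this sketch, since the breakpoints are described as vague. The name `classify_chain` is hypothetical.

```python
# "Two to 39 amino acids" is the peptide range quoted in the lecture.
# The breakpoint is vague, so treat this constant as adjustable.
PEPTIDE_MAX_LENGTH = 39


def classify_chain(n_residues):
    """Label a linear chain of amino acids by its length."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one amino acid")
    if n_residues == 1:
        return "amino acid"
    if n_residues <= PEPTIDE_MAX_LENGTH:
        return "peptide"
    # Assumption of this sketch: anything longer counts as a protein.
    return "protein"


for length in (1, 3, 39, 2000):
    print(length, classify_chain(length))
```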
And that primary sequence is kind of a cool thing because it's very specific. It defines-- it's got encoded into its structure, the three-dimensional fold of the protein, OK? All the information for the folded, compact, globular structure that's functional is encoded in that primary sequence. It's a cryptic code. We may not be able to tell by looking at it what it really looks like, but all the information is there in order to program the folding into a globular structure.
So the primary sequence determines the fold, and it's the fold of the protein that mandates its function. It's not the sequence of the protein. The sequence defines the fold. The fold, the three-dimensional form, defines the function, OK? So that's very important.
And I think it's absolutely amazing that with a relatively limited set of building blocks, we can define so many different functions of all the proteins in our body that may be structural, they may be catalysts, they may be things that transfer information from the outside to the inside of cell. All of that is programmed with this rather limited set of building blocks, OK?
Now, let's now talk about peptides because one gets a little frustrated looking at single amino acids. They don't tell us so much about the peptidic structure, so I'm going to draw two amino acids, and then I'm going to tell you one important thing. So let's put R1, and I'm going to draw another amino acid, and I'm putting it in a particular orientation. R2, because that designates that these might be different amino acids. For example, if R1 is H, there's an implied hydrogen here, that would be glycine. If R2 is a methyl group, there's an implied hydrogen there, that would be alanine, all right?
When nature bonds all these amino acids together, it carries out a condensation reaction to form a peptide bond between these two components of the amino acid, the amine and the carboxylic acid. And now I'm going to draw you the first of the dipeptides that you'll meet. And there are so many things to tell you about these structures, it sort of drives me crazy thinking about, oh, I must remember to tell them that or I've got to remember to tell them that, because the structures are cool. R1, R2.
OK, so this is a dipeptide, two amino acids, and there are some characteristics I want you to remember. When we write out peptides, we always write them N to C. So in that peptide, this would be the carboxyl terminus, and this would be the amino terminus. If you don't always remember to write things in this order, and you tell your friend, oh, go and get this peptide made, and you put it down in the wrong order, they'll make the wrong peptide. So you always-- there is basically an agreement amongst everyone that we always write from left to right, the sequence of peptides.
The next important thing about this structure, as you look at it, there are several bonds joining the polymeric structure. Many of these bonds show free rotations. You can twist them around, there's nothing stopping that conversion. All of these show freedom of rotation.
But the amide, or peptide bond, is unique in that there's restricted rotation about that bond. So it's as if you've got a linear polymer, but every third bond is kind of stuck in a particular orientation, which starts to define a lot of details about protein tertiary structure. It's not complete spaghetti. It's like spaghetti with little bits that haven't been cooked. They're stiffer than the rest of the sequence.
And the other really important thing about the peptide structure is that embedded within that structure, there is the amide or peptide functional group where, remember, this can be a hydrogen bond acceptor, and this can be a hydrogen bond donor. Once you know that, the next few slides will make a lot of sense as we talk about higher-order structure of proteins. So let's just take a look at that with a slightly longer peptide.
By convention, if I'm going to draw a peptide that's methionine isoleucine threonine-- you can look up that names-- those names on the chart-- that would be the MIT peptide. These are the three amino acids. I'm going to condense them into a tripeptide.
When I condense three amino acids, I spit out two molecules of water, and I put in place two amide or peptide bonds. If I go down this backbone, every third bond is going to be fixed, fairly fixed. There's not freedom of rotation around it, and every third bond is going to have the capacity to be involved in hydrogen bonding interactions, as I've suggested here, all right? What else is there here?
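The bookkeeping of the condensation can be checked with a short Python sketch: joining n amino acids into a linear peptide forms n - 1 peptide bonds and releases n - 1 waters. The molecular formulas below are those of free methionine, isoleucine, and threonine; the function name `condense` and the table layout are choices made for this illustration.

```python
from collections import Counter

WATER = Counter({"H": 2, "O": 1})

# Molecular formulas of the free amino acids in the MIT example.
FREE_AMINO_ACID_FORMULA = {
    "M": Counter({"C": 5, "H": 11, "N": 1, "O": 2, "S": 1}),  # methionine
    "I": Counter({"C": 6, "H": 13, "N": 1, "O": 2}),          # isoleucine
    "T": Counter({"C": 4, "H": 9, "N": 1, "O": 3}),           # threonine
}


def condense(sequence):
    """Return (peptide_bonds, waters_released, formula) for a linear peptide."""
    if not sequence:
        raise ValueError("need at least one amino acid")
    n_bonds = len(sequence) - 1          # one amide bond per junction
    formula = Counter()
    for code in sequence:
        formula.update(FREE_AMINO_ACID_FORMULA[code])
    for _ in range(n_bonds):
        formula.subtract(WATER)          # each bond formed releases one water
    return n_bonds, n_bonds, dict(formula)


print(condense("MIT"))
# (2, 2, {'C': 15, 'H': 29, 'N': 3, 'O': 5, 'S': 1})
```

Three amino acids give two peptide bonds and two waters, as stated above. Note that "TIM" has the same atom counts as "MIT", so the formula alone cannot tell the two peptides apart.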
When I write the MIT peptide, I write M first, I second, T third. If I wrote TIM, it would be a completely different chemical structure with different chemical properties, so the directionality is important to understand, and there you have it. So now you can go home and practice your name in amino acids and draw them out.
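The directionality point can be made concrete with a minimal Python sketch that writes a one-letter sequence out from the amino terminus to the carboxyl terminus. The `H2N-...-COOH` display format and the name `write_n_to_c` are choices made for this illustration, not notation from the lecture.

```python
# Three-letter codes for the amino acids in the example.
THREE_LETTER = {"M": "Met", "I": "Ile", "T": "Thr"}


def write_n_to_c(sequence):
    """Spell out a peptide left to right, amino terminus first."""
    residues = [THREE_LETTER[code] for code in sequence]
    return "H2N-" + "-".join(residues) + "-COOH"


print(write_n_to_c("MIT"))  # H2N-Met-Ile-Thr-COOH
print(write_n_to_c("TIM"))  # H2N-Thr-Ile-Met-COOH
```

In MIT the methionine carries the free amine and the threonine the free carboxyl; in TIM those roles are swapped, so the two strings describe different molecules.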
If you draw them out fairly sort of sharply, then you'll never get confused about what end's what and where the substituents are, but it's important to remember as you're making a dipeptide-- oops, I forget this doesn't work. As you're condensing a dipeptide, when you're putting these R groups on, one goes up, one goes down, but these are nuances of the structure that may be good for a later discussion.
So here is now a longer linear peptide, and the suggestion of a globular structure that might be found if that peptide was folded up. And the primary sequence here defines the globular structure, and the process whereby you go from the extended primary sequence to the folded structure is called protein folding. And physical chemists and physicists and computational chemists have for years tried to understand how we could predict the folded structure from the primary sequence. It's not simple because what you're doing is you're solving a massive energy diagram, where as you fold a structure up, you're trying to maximize all those non-covalent forces for maximum thermodynamic stability, right?
It's kind of a three-dimensional puzzle where you're trying to have as many hydrogen bonds, electrostatic interactions, and so on, as you can possibly make. So when computational chemists try to fold proteins, they're basically solving a three-dimensional puzzle where they are maximizing interactions. And there are a lot of ab initio and molecular dynamics programs that are now starting to be able to fold proteins into fairly reliable structures, but they don't always get them right because they haven't gotten all the clues yet.
And also while they may be able to do ab initio or computational folding with small structures, the headache gets way bigger the larger the structures get. So the predictors aren't very good at predicting big structures, they're getting better at predicting small structures. And so just to reinforce to you, the primary sequence is established by covalent bonds, the peptide bonds, but the globular tertiary structure is based on non-covalent interactions, OK?
Now, I want to ask you this. I love cartoons with science in them, but you know, 10%, 20% of the time, they make mistakes, and I felt this one was particularly pertinent. So a bunch of guys are lounging around in a lab, and one says, well, we finished the genome map, now we just have to figure out how to fold it. What is wrong with that cartoon? What folds? Yeah?
AUDIENCE: You want to [INAUDIBLE].
PROFESSOR: Yeah.
AUDIENCE: [INAUDIBLE]
PROFESSOR: Yeah, the genome doesn't fold. It's double helical, duplex DNA or something. You're actually folding proteins, so the cartoon is not quite right, but it's sort of kind of cute. All right, now, when we talk about the non-covalent forces that hold proteins together, I just want you to remember from last time this set of non-covalent forces, because if you understand them and recognize them, you'll understand how they may occur in folded protein structures.
All right, so here's a peptide sequence. Here's a puzzle for you. You can go back and figure out what the one-letter code spells there. Just take out your table with all the amino acids. It's appended to the back of your P-set, and you'll be able to see what that very large peptide spells. All right, I don't want you working it out while you're here. You've got to listen to me for the time being.
OK, so the first order, we get it, there's a primary sequence. The next thing to think about is what's known as secondary structure. It's a higher order than just the primary sequence, and it's established by non-covalent bonds, and it's called secondary-- oof, my writing's horrid today. Secondary structure. And those are interactions that are put in place exclusively by interactions between the peptide bonds of what's known as the peptide backbone.
So if I look at the structure, these are the side chains. The peptide backbone is this continuous linear sequence. That's what we would call the peptide backbone, and the secondary structure is put in place by hydrogen bonding between components of the peptide backbone. So for example, a hydrogen bond, such as that, or a different hydrogen bonding interaction, such as that, between the atoms that have lone pairs of electrons and the other atoms-- heavy atoms that hold a hydrogen that's quite acidic.
And there are a couple of major forms of secondary structure. What I'm showing you here is what's known as the alpha helix. First deduced by Pauling, in fact, through model building, he said, proteins could form these ordered structures, and an alpha helix is an ordered structure exclusively made up from the hydrogen-bonding interactions of the peptide backbone.
And you can look at this helical structure. It's a continuous strand of peptide, but there are hydrogen bonds between COs and NHs all the way through the backbone, such that this strand of peptide can fold up into a cylindrical, helical structure, where all those R groups, the side chains of the amino acids, are on the perimeter of that helix. So this secondary structure is an important one because it's very prevalent in a lot of proteins.
The next secondary structure is also held together by hydrogen bonding, and it's interactions between stretched out strands of peptides that may not be close to each other in the primary sequence, but they align in the folded structure. And so for example, what I've shown you here is what's known as a-- this is an anti-parallel beta sheet. And across that sheet, there are continuous opportunities for hydrogen bonding interaction. If the strands run in opposite directions, it's anti-parallel. If they're in the same direction, it's parallel.
These two secondary structure elements make up a lot of the sort of basics of how proteins start to fold. They're key non-covalent forces, and there are also other smaller motifs. One is called a beta turn, where the peptide sequence may go through a chain reversal, so the sequence would look like this. I'm going to just draw it, and I'll talk to you in a moment about ribbon diagrams. And this piece here would be the turn, whereas that would be the interactions enforced by the sheet.
These are the ordered elements of secondary structure. You don't have to be able to figure them out, but you have to be able to pick them out in order to understand the structure, OK? So even with those simple elements, it's still hard to make big enough structures to have functions.
So as I mentioned in a continuation of the theme, the protein folding is hierarchical, you can start to put together elements of secondary structure to make things that are a little larger. Helix, turn, helix. Helix with a different kind of turn, maybe put in place by a metal ion or something, or a strand, turn, strand, or now something that's a composite of these two major types of secondary structure, the helix and the turn.
And these really start to be proteins that might be big enough to be able to do something, but they're all exclusively held together by non-covalent forces between the amides or peptide bonds in the backbone of the protein, OK? Not very exciting just yet. Now, one other little clue that people will-- you might see and you might be confused, people sometimes, when they're drawing sort of a quick picture of a protein, they might draw a helix, but instead of really showing it in detail, they might show it as a cylinder, so you might need to pick that out of a structure.
And then I want to call your attention to the fact that in all those motifs, when you join one helix to another, you need a turn. When you join a strand to another strand, you need a turn, and so on. OK, so this is like taking your very extended sort of polymer, knowing there are different kinks in it, because of the backbone bonds, but folding it up in a structure that maximizes the opportunity for another order of structure, which we'll talk about now.
All right, so we've seen primary. Secondary is just with backbone. And things start to get much more interesting when we get to tertiary structure, because tertiary structure is enabled by all these other interactions, electrostatic, hydrogen bonding, hydrophobic forces, that can be put in place due to the side chains of amino acids interacting with each other or with the backbone structures. So I'm going to walk you through this, so you can sort of get a sense of how these three-dimensional puzzles work on a very small scale.
So look here, that's a very small motif. And what I'm going to call your attention to is when you fold up these motifs, when the secondary structure is in place, a lot of the side chains are near each other, and they can engage in long-distance contacts. And so for example, I'm going to show you interactions between side chains, between side chains and the peptide backbone, or side chains and water.
But what I want to do is take a look at this and see, can you put any of those potential interactions on the drawing that's on your handout? It's pretty obvious where there's an electrostatic interaction, right? Boop. OK, between plus-- get those out of the way, those are the easy ones. And then interactions between hydrophobic groups, where they want to amass that lipophilic structure, so it's not exposed as much to water, so they cluster, so those are easy.
And then you can start thinking about what are all of the hydrogen bonds you could draw. Here I've shown one between side chains, between side chains and backbone, between side chains and water, and those may all contribute to the ultimate thermodynamic stability. Make sure you get your hydrogen bonds right. Remember, two donors don't interact with each other, and two acceptors don't-- so this might describe the folding possibilities of that small motif.
Now what I want to show you-- I'm going to-- let me-- is an ab-initio simulation of a folding process. So let me just get that a little bigger on the screen. So this is computing. GB1 is a very small protein that folds reversibly under appropriate conditions, and what I'm going to do is forward you through this video. This is a simulation. This is all computation. It's not looking at anything by spectroscopy or in solution or anything like that.
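The donor and acceptor rule can be expressed as a small Python check. This is a simplified sketch: each group is given a fixed set of roles (the backbone N-H donates, the backbone C=O accepts, water and a side-chain hydroxyl can do both), and geometry and distance are ignored. The group labels and the function name `can_hydrogen_bond` are hypothetical.

```python
# Roles each group can play in a hydrogen bond (simplified).
GROUP_ROLES = {
    "backbone N-H": {"donor"},
    "backbone C=O": {"acceptor"},
    "water": {"donor", "acceptor"},
    "side-chain O-H": {"donor", "acceptor"},
}


def can_hydrogen_bond(group_a, group_b):
    """True if one group can donate a hydrogen and the other can accept it."""
    roles_a = GROUP_ROLES[group_a]
    roles_b = GROUP_ROLES[group_b]
    return (
        ("donor" in roles_a and "acceptor" in roles_b)
        or ("acceptor" in roles_a and "donor" in roles_b)
    )


print(can_hydrogen_bond("backbone N-H", "backbone C=O"))  # True
print(can_hydrogen_bond("backbone N-H", "backbone N-H"))  # False
print(can_hydrogen_bond("backbone C=O", "backbone C=O"))  # False
```

Two donors or two acceptors fail the check, while any pairing that supplies one of each passes.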
And what I'm going to do is I'm going to forward you through the structure. This is multi-scale modeling. It's got a lot of details in how it's done, but the starting point is a very denatured protein, all stretched out, right? And what I'm going to do is just show you for a few seconds, you know, this thing's like trying to find its thermodynamic minimum, and it's actually failing pretty badly.
And it does that for about 30-- 60 seconds of the simulations, so I made a point to myself to take you to about minute one, where things start to get fairly interesting. And you're saying, well, what's interesting about that? You see that nascent helix, in the background, the red and the blue, is starting to form strands that are a little bit aligned, and it's trying to find as many connections as possible to satisfy a stable structure.
At a certain point in the simulation, five of the hydrophobic groups are in a little pea. They're in a little hydrophobic cluster, and that's a breakpoint in the folding process, because that gets everything glued together better, so that the rest of it now can start to really find its final place in the folded structure. These early structures are known as molten globules. A lot of the interactions are not yet in place, but the hydrophobic cluster is critical.
But then after that, it's almost as if you're sliding downhill to get all the remaining interactions in place to fold the protein, OK? So protein folding is a puzzle that can be solved computationally by maximizing thermodynamic interactions. So it's sigma this, sum of this, sum of this, sum of that. That's going to get difficult the larger the protein gets, but for small proteins, those simulations really start to make sense, OK?
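The "sum of this, sum of that" idea can be illustrated with a toy scoring function. This is a minimal sketch, not the method used in the simulation shown: the scores in `SCORES` are made-up placeholders in arbitrary units, and the three example conformations are hypothetical.

```python
# Hypothetical per-interaction scores (arbitrary units; more negative means
# more stabilizing). These are placeholders, not measured energies.
SCORES = {"electrostatic": -3.0, "hydrogen_bond": -2.0, "hydrophobic": -1.0}


def total_score(interactions):
    """Sum the score of every interaction present in a conformation."""
    return sum(SCORES[kind] for kind in interactions)


# Hypothetical conformations, each described by the interactions it has formed.
unfolded = ["hydrogen_bond"]
molten_globule = ["hydrophobic"] * 5
folded = ["hydrophobic"] * 5 + ["hydrogen_bond"] * 4 + ["electrostatic"]

for name, conformation in [
    ("unfolded", unfolded),
    ("molten globule", molten_globule),
    ("folded", folded),
]:
    print(name, total_score(conformation))
```

In this sketch the folded state scores lowest simply because it has the most interactions in place, which is the point of the argument above. A real simulation evaluates physical energy terms over atomic coordinates, which is why it gets harder as the protein gets larger.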
All right, so let's just move on here. Lost-- ah, good. What did you think of the simulation? It's kind of cool, right? So you can find the link in the sidebar. So just pop these back on now, and that's the folded structure.
All right, so with many proteins, they're much more complex than that. So for example, here's cyclin A. It's involved in the cell cycle, and you can see its alpha helix structure dominantly, very clearly, all those beautiful alpha helices. Next to it is the green fluorescent protein, which is a cylindrical structure made up of anti-parallel beta sheets. What's really cool is when you sort of rotate it, you can see all those sheets, but then it does this little sort of curtsy to the audience, and you can look down into the barrel.
And then in some cases, proteins may be a mixture of secondary structure elements. Here it's a little hard to tell. This is triose phosphate isomerase, but if you look down it, you can see the helices, and there's also a group of beta strands that are held together. So in that protein, it's a mixture of alpha helix and beta sheet.
Now, I'm not going to tell you much about pulling up Protein Data Bank files right now because I want to cover the next topic. And then when we have a few minutes later on, I'll show you. But wherever I show you a structure, I'm trying to show you the Protein Data Bank code, and in the web site, you can see there is a free download of PyMOL, which is the program I used to create all these structures and movies, so you can really look at things.
And believe me, it took me about three years to learn how to use it properly. It'll probably take you about a week or maybe a couple of days. So if I can learn it, you can certainly learn it.
Now, there is one final element of protein structure that people get kind of hung up on, and it's what's called quaternary structure. It's like, aren't we done yet? So in addition to all of these, let's say I have a folded motif, and there's its structure. That would have primary, secondary, between the strands or the helix, and tertiary structure, right?
But in some cases, proteins fold up to quaternary structure, where it's multiple of these units joined together-- hoo, I could have picked a simpler fold, but that will get you the general gist of it-- all right, where these are actually associated by non-covalent forces. So there's more than one polypeptide chain. In fact, here would be four peptide chains coming together in a higher-order structure that's made up of four of those units.
The prototypic example of this is the protein that carries oxygen around in your blood, which is hemoglobin, and it has four primary sequences that have come together in a tetrameric quaternary structure. Hemoglobin is kind of interesting, because it's made up of two alpha and two beta subunits. If all these subunits were identical, they would be called homooligomers, all the same pieces. If they are different, they are called heterooligomers. We'll see a little bit more about this when I talk about hemoglobin in the next class, because the features of the quaternary structure are very, very important for the proper transport of oxygen, and single mutations can really mess things up, and you'll see more about that in the next class.
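The homooligomer versus heterooligomer naming can be written as a small check. This is just a sketch, assuming an assembly is described by a list of subunit labels; the function name `classify_oligomer` is made up for illustration.

```python
def classify_oligomer(subunits):
    """Label an assembly of subunit names as a homo- or heterooligomer."""
    if len(subunits) < 2:
        return "single chain (no quaternary structure)"
    # All subunits identical -> homooligomer; otherwise heterooligomer.
    return "homooligomer" if len(set(subunits)) == 1 else "heterooligomer"


# Hemoglobin as described above: two alpha and two beta subunits.
hemoglobin = ["alpha", "alpha", "beta", "beta"]
print(len(hemoglobin), classify_oligomer(hemoglobin))

# A hypothetical assembly of four identical chains, for comparison.
print(classify_oligomer(["a", "a", "a", "a"]))
```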
So just to wrap that little bit up, proteins are condensation polymers of amino acids. Each protein sequence is defined by covalent bonding. Native proteins, most of them that do not have quaternary structure, are folded through secondary and tertiary interactions, these things that we already talked about, and folding is defined by how to maximize all those non-covalent forces to get the maximum thermodynamic stability with the maximum number of interactions. And subunits may also come together through quaternary structure.
OK, so I'm going to talk to you about several proteins throughout the course, but for now, I want to focus you in on a structural protein that provides mechanical support for tissues. In the next class, we'll talk about transporters and enzymes, and as we move on to signaling, things like receptors and membrane proteins and so on. So the protein I'm going to describe to you is collagen.
It is the most abundant protein in the human body. It plays enormous roles. It's not an enzyme, it's not a catalyst, it's not a transporter. It is one of those structural proteins, where the structure of collagen has evolved to provide a mechanical stability to lots of essential components of complex organisms.
And there are many different types of collagens that are found in different parts of the body. For example, bone, tendon, cartilage, and so on. They are all collagen structures, but they have subtle differences, maybe some have different, slightly different, mechanical properties to adapt to the functions that they perform, OK?
And what I'm going to show you is that a single amino acid change in the primary sequence of collagen can destabilize the structure, so it is no longer viable. And the disease type I'm going to talk to you about is a set of diseases known as collagenopathies, and the particular one is called osteogenesis imperfecta. Osteo always refers to bone because collagen plays a critical role in the structure of bone. Bone isn't just bone, it's collagen involved in it.
And also this disease is called brittle bone syndrome. And here's the X-ray of a baby born with brittle bone syndrome, and you'll see that the long bones in the upper arm are all irregular because the bones are brittle, and they'll break even in utero. A lot of babies with this defect can't even be born through the birth canal because it would crush the bones, and many of them don't survive very long at all.
Some survive with different kinds of cases, but their lives are greatly impacted, and they could just sort of hit a table and the bones would break, all right? There are those sort of serious situations where parents are actually accused of abuse to the child, but the child actually had brittle bone syndrome, and it was just through helping them put their clothes on or taking them upstairs, the bones got broken very readily. So osteogenesis imperfecta really describes a collection of these defects.
Now the collagen tertiary structure is shown here. It's actually made up of a type of helix. It's not an alpha helix. It's a polyproline helix, where the individual subunits in that tertiary structure are fairly long and extended, and I show you three strands in this polymeric structure, a yellow, a red, and a green. And these are rolled together into a three helix bundle that has a fibrillous structure, and then all these structures come together to make the macromolecular structure that is collagen.
It's not just one of those fibrils. It's bundles of those fibrils in a very organized pattern where you could even see that patterning in electron microscopy. And there are many genetic defects of collagen, and what's so important to think about is if you have a defect in one strand that defect will propagate through every single strand. If this is one strand made up of three polypeptide chains, it propagates all the way through the structure.
And I believe I have a little time to just show you, here's the collagen structure. I'm just showing you how it's extended. Those are three independent strands, and there's a set of magenta residues in the middle, which come from a defect in the sequence where a glycine has been changed to an alanine.
So I'm going to show you this movie because it shows you right at the center of the structure, there are residues painted in pink. And what I'm going to do is show you close up of that segment. If you look at those cells they're all nicely organized, except where that defect is, and that defect is caused by the change of a hydrogen to a methyl group on three residues that come together, and that bulges out that fibrillous structure and makes it not as compact and beautiful as it should be in the version that's got the glycine there.
So if you look at it, you can even see that helix gets bulged out and it's not as well-aligned as the rest of the structure. And then that defect gets propagated into all the fibrils and results in the weakening of the bones. Either the collagen fails to form properly, or the collagen, when it forms, it has much less mechanical stability.
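The glycine-to-alanine change can be sketched as a one-residue substitution. This is a minimal illustration under stated assumptions: the short sequence `strand` is hypothetical and not taken from a real collagen entry, the position is arbitrary, and the helper `substitute` is made up for this example.

```python
# Side chains of the two residues discussed (one-letter codes G and A).
SIDE_CHAIN = {"G": "H", "A": "CH3"}


def substitute(sequence, position, new_residue):
    """Return a copy of sequence with the residue at a 0-based position replaced."""
    return sequence[:position] + new_residue + sequence[position + 1:]


# Hypothetical short stretch of one strand, written in one-letter codes.
strand = "GPPGPPGPP"
position = 3  # arbitrary glycine chosen for illustration

mutant = substitute(strand, position, "A")
print(strand, "->", mutant)

# Assume the same substitution is present in each of the three strands,
# so three methyl groups meet where three hydrogens used to be.
bundle = [mutant] * 3
print([SIDE_CHAIN[s[position]] for s in bundle])
```

The sketch only tracks the sequence change; the bulge itself comes from fitting three methyl groups into the space that three hydrogens occupied, which is what the close-up of the structure shows.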
So I think that's a good place to stop and I'll pick up next time with hemoglobin. Oh, one last little thing, a couple of things for you to do. There's a great link on the website to the Protein Data Bank to see how enzymes work. And if you have a little time, it would be awesome if you could just take a quick flick through those parts of the text. These slides are posted with these reading assignments, and they're posted in color if you want to look at them again.