Description
Professor Martin talks about DNA sequencing and why it is helpful to know the DNA sequence, followed by linkage mapping and then the different methods of sequencing DNA.
Instructor: Adam Martin
Lecture 17: Genomes and DNA...
PROFESSOR: And today I'm going to talk about DNA sequencing. And I want to start by just sort of illustrating an example of how knowing the DNA sequence can be helpful. So you remember in the last lecture, we talked about how one might identify a gene through functional complementation. And this process involved making a DNA library that had different fragments of DNA cloned into different plasmids and then involved finding the needle in the haystack where you find the gene that can rescue a defect in a mutant that you have.
So if this line that I'm drawing here is genomic DNA, and it could be genomic DNA from, let's say, a prototroph for LEU2, the leucine gene. So this is from a prototroph.
Then you could cut up the DNA with EcoRI. And if there is not a restriction site in this LEU2 gene, you get a fragment that contains the LEU2 gene. And then you could clone this into some type of plasmid that replicates in the organism that you're introducing it and propagating it in.
And so that would allow you to then test whether or not this piece of DNA that you have complements a LEU2 auxotroph, OK?
Now one thing I want to point out is that because these EcoRI sites, these sticky ends, would recognize this EcoRI end or this EcoRI end, you can imagine that this gene-- if the gene reads this way to this way-- it could insert this way into the plasmid. Or it could insert in the opposite direction. So it could be inverted. So this would have some sort of origin of replication and some type of selectable marker.
But if you have the same restriction site it can insert one way or the opposite way. That's just one thing I wanted to point out.
Now let's say rather than leucine, you're interested in cyclin-dependent kinase, and you had a mutant in CDK and you had the sequence of your yeast CDK gene. Well, rather than having to dig through a whole library of pieces of DNA for the CDK gene, basically you're sort of fishing for that needle in the haystack. If you knew the sequence of the human genome, you'd be able to identify similar genes by sequence homology.
And you could then take a more direct approach, where you take-- let's say you have a piece of human DNA now, double stranded DNA, and it has the CDK gene. You could take human DNA with this CDK gene. And you have unique sequence around the CDK gene, which would allow you to denature this DNA. And if you denature the DNA, you'd get two single strands of DNA. And you could then design primers that recognize unique sequences flanking the CDK gene.
So you could imagine you'd have a primer here and a primer here. And then you could use PCR to specifically amplify the CDK gene from, it could be the genome or from some library. And then you get this fragment here, which includes CDK. So knowing the sequence of the genome would allow you to more rapidly go from maybe a gene that you've identified as being important in one organism, and find the human equivalent that might be doing something similar in humans. So this step here is basically PCR.
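The primer-based approach described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sequences are invented, primers must match exactly, and the function names (`reverse_complement`, `in_silico_pcr`) are hypothetical helpers rather than part of any real library.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the top-strand region bounded by the two primers, or None.

    The forward primer matches the top strand directly. The reverse primer
    anneals to the top strand, so it is its reverse complement that appears
    in the top-strand sequence.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical sequences, invented for illustration only.
left_flank = "GGTACCTTGA"       # unique sequence upstream of the gene
gene = "ATGAAACCCGGGTAA"        # stand-in for the CDK gene
right_flank = "CTGCAGTTCC"      # unique sequence downstream of the gene
template = "ACGT" + left_flank + gene + right_flank + "TTTT"

forward_primer = left_flank
reverse_primer = reverse_complement(right_flank)

product = in_silico_pcr(template, forward_primer, reverse_primer)
print(product)
```

The product runs from the start of one primer to the end of the other, which is why unique flanking sequence is enough to pull one gene out of a whole genome.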
And let's say the CDK gene had restriction sites. Let's see, we'll say restriction site K and A here. Then if you have these restriction sites in your fragment of DNA, you can then digest or cut that piece of DNA with these restriction endonucleases. And then you'd get a fragment of CDK that has K and A sticky ends. We'll pretend that both of these have sticky ends.
And now you have unique sticky ends between K and A. And you might have a vector that also has these two sites. And you could digest this vector with these two enzymes. And that would allow you to insert the specific gene in this plasmid.
And if you have two unique sites, because K only recognizes K here and A only recognizes A, then it will ligate in. But you can do it with a specific orientation because you have two different restriction sites. So I hope you all see how it's different with one restriction site versus two.
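The difference between one restriction site and two can be made concrete with a small sketch. It assumes, as a simplification, that a sticky end only pairs with an end made by the same enzyme. The function name and the enzyme labels K and A are just the placeholders used in this lecture.

```python
def possible_orientations(insert_ends, vector_ends):
    """Return the orientations in which an insert can ligate into a cut vector.

    Each argument is a (left_end, right_end) tuple naming the enzyme that
    produced each sticky end. An end only pairs with an end made by the
    same enzyme (a simplifying assumption for this sketch).
    """
    orientations = []
    left, right = insert_ends
    # Forward: insert left end meets vector left end, right meets right.
    if (left, right) == vector_ends:
        orientations.append("forward")
    # Inverted: the insert is flipped, so its two ends swap places.
    if (right, left) == vector_ends:
        orientations.append("inverted")
    return orientations


# One enzyme (EcoRI on both sides): both orientations are possible.
print(possible_orientations(("EcoRI", "EcoRI"), ("EcoRI", "EcoRI")))

# Two different enzymes (K and A): only one orientation fits.
print(possible_orientations(("K", "A"), ("K", "A")))
```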
All right. Now let's say you want to do something more complicated than this. Let's say rather than just identifying the gene that's involved in cell division, you want to engineer a new gene, in order to determine where this particular protein, CDK, localizes in the cell. So we have CDK, which could be from yeast or human, it doesn't matter. And you want to engineer a new protein, basically, that you can see.
So remember Professor Imperiali introduced green fluorescent protein earlier in the year. And this green fluorescent protein is from a gene from jellyfish. So now we could, using what I've told you, reconstruct or engineer a gene that has DNA from three different organisms, in order to make a CDK variant that we are able to see in the cell.
So remember, a green fluorescent protein is like a beacon, if it's attached to a protein. If you shine blue light on it, it emits green light. And so you can use a fluorescent microscope in order to see it.
In this case, let's say there's also another restriction site here, R. And let's say you have a fragment of GFP that has two restriction sites, A and R. You could then cut this fragment and this fragment with these restriction enzymes A and R. And you could insert GFP at the C terminus of the CDK gene. So you could go and have a gene that has CDK GFP inserted inside a bacterial vector.
Now which one of these junction sites do you think would be most sensitive in doing this type of experiment? So there are three junction sites. There's this one, this one, and this one. Which is the one you're probably going to put the most thought into when you're doing this experiment?
Yes, Miles.
AUDIENCE: The A?
ADAM MARTIN: The A site. Miles is exactly right. This one is going to be important. And why did you choose that site?
AUDIENCE: Of the three sites, two are half insert, half originals [INAUDIBLE]. But at A, both sides of it are inserts. So [INAUDIBLE] carefully.
ADAM MARTIN: And if you're trying to make a fusion protein, what's going to be an important quality of this? Malik, did you have a point?
AUDIENCE: Well, they try to [INAUDIBLE] we'd have to make sure that the [INAUDIBLE].
ADAM MARTIN: Excellent job. So Malik just pointed out two really important things. To make this a fusion protein, you have two different open reading frames. These two open reading frames have to be in frame with each other.
So this junction here has to be in frame where GFP is in frame with CDK, meaning that you're reading the same triplet codons for GFP, they're in the same frame as CDK. Also, you want to make sure there's no stop codon here. Because if you had a stop codon here, you're just going to make a CDK protein. And then it's going to stop and then you won't have it fused to GFP.
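Both requirements at the junction, staying in frame and avoiding a stop codon, can be checked mechanically. The sketch below is a minimal illustration, not a cloning tool: the sequences are made up, `junction_problems` is a hypothetical helper, and the upstream open reading frame is assumed to start at its first base.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete triplet codons, reading from base 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def junction_problems(upstream_orf, linker):
    """List reasons why an ORF placed after upstream_orf + linker would not
    be expressed as part of the same fusion protein."""
    problems = []
    # Counting from the first base of the upstream ORF, the downstream ORF
    # starts at this offset. It must be a multiple of 3 to stay in frame.
    if (len(upstream_orf) + len(linker)) % 3 != 0:
        problems.append("downstream ORF is out of frame")
    # A stop codon before the downstream ORF would end translation early.
    if any(codon in STOP_CODONS for codon in codons(upstream_orf + linker)):
        problems.append("stop codon before the junction")
    return problems


# Hypothetical sequences, invented for illustration only.
cdk_without_stop = "ATGGCTGAAAAC"
cdk_with_stop = "ATGGCTGAAAACTAA"
site_a = "GGATCC"  # stand-in for the bases contributed by restriction site A

print(junction_problems(cdk_without_stop, site_a))       # clean junction
print(junction_problems(cdk_with_stop, site_a))          # CDK stop codon left in
print(junction_problems(cdk_without_stop, site_a[:-1]))  # one base short
```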
And you guys will work through more of these in the homework. So you'll be able to get a sense of it.
So now for the remainder of this lecture and also for Monday's lecture, I want to go through a problem with you. Basically, if you have a given disease that's heritable, how might you go from knowing that disease is heritable to finding out what gene is responsible for that given disease? And this is going to involve thinking about different levels of resolution, in terms of maps.
So the highest resolution map you can have for a genome is the sequence. You can have the full nucleotide sequence of a genome. And that's the highest possible resolution because you have single nucleotide resolution as to what every single base pair is. But that's like knowing like your apartment number and your street number and basically knowing everything. But starting out, you might want to know what continent it's on, or what country is it in.
And so you first have to narrow down the possible locations for a given disease gene. And that will, at first, involve establishing what chromosome and what region of a chromosome a given disease allele is linked to. And that involves making essentially a linkage map, where you establish where a disease gene is located based on its linkage to known markers that are present in the genome.
Now this is going to require that you remember back two weeks ago, to when we talked about linkage and recombination. And you'll recall that we were looking at the linkage between genes in flies and genes in yeast. One difference between that type of linkage mapping and human linkage mapping is we don't have really clear traits that are defined by single genes. You can't just take hair color and map the hair color gene to link it to a disease gene. Because hair color is determined by many, many different genes.
So in fruit flies, you can take white eyes and see if it's connected with yellow body color because both of those are determined by single genes. So we need something other than just having phenotypic traits that we can track. We need what are known as molecular markers to be able to perform linkage mapping.
And so what we need in these molecular markers-- well, if we just think about if we wanted to determine the linkage between the A and B genes. And if you did this cross, would you be able to determine linkage?
Georgia, you made a motion that was correct. Tell me. Why did you shake your head no?
AUDIENCE: They'd all be heterozygous.
ADAM MARTIN: Yeah they'd all be heterozygous. Because this individual has the same allele on both chromosomes, you're not going to be able to differentiate one chromosome from the other. And so the point I want to make is that in order to see linkage, what you need is variation.
So we need to have variation. And another term for genetic variation is polymorphism. So we need polymorphism, or genetic variation, between these molecular markers.
We also need genetic variation in the disease. But we have that. We have individuals that are affected by a disease and individuals that are not affected by a disease. So we have variation in alleles there. But in order to map it with a molecular marker, to map linkage to a molecular marker, you also need variation here. So the problem with this cross is here you need to have a heterozygote. There needs to be variation in this individual, where this individual is heterozygous at both of these genes.
So now I want to talk about some of these molecular markers that we can use, and how they vary between individuals and between chromosomes. Now this is going to be maybe the lowest resolution map. But I'm talking about this linkage map here. And you can see highlighted at the bottom here are various types of polymorphisms that we can use to link a given disease allele to a specific chromosome and a specific place on the chromosome.
So I'll start with the first one, which is a simple sequence repeat. It goes by many names. But I will stick with what's on the slide.
So a simple sequence repeat is also known as a microsatellite. So you might see that term floating around, if you're reading about this. And what a simple sequence repeat is, as the name implies, it's a simple sequence. It could be a dinucleotide, like CA. And it's just a dinucleotide that's repeated over and over again.
So on a chromosome, you might have a unique sequence, which I'll just draw as a line. And then you could have a CA dinucleotide that's repeated some number of times, N. And then that's followed by another unique sequence. And that's what's present in it.
So that would be one strand. And then in the opposite strand, you'd have a unique sequence, the complement of CA, which is GT, and then, again, unique sequence. And so there's variation in the number of repeats of the CA. And so there's polymorphism. So we can use this to establish linkage between this marker and a phenotype, like a disease phenotype.
So how might you detect the number of repeats that are present here? Anyone have an idea of a tool that we've discussed that could be used here? So one hint that I gave you is that the sequence here is unique and the sequence here is unique. So is there a way we can leverage that unique sequence to determine whether there's a difference in the number of repeats?
What's a technique we discussed that involves some component of the technique recognizing a unique sequence? Yeah, Natalie?
AUDIENCE: CRISPR Cas9.
ADAM MARTIN: Well, CRISPR Cas9 is a possibility. Jeremy, did you have an idea?
AUDIENCE: PCR?
ADAM MARTIN: PCR-- so it's true. You could get it to recognize that. But then you have to detect it, somehow. So what's more commonly used is PCR. Those are both good ideas. But using PCR, you could design a primer here and a primer here. And you could amplify this repeat sequence. And the number of repeats would determine the size of your PCR fragment.
So if you did PCR, then you'd get a PCR fragment that has the primers on each end, but then has this certain size based on the number of repeats. So in that case, we need some sort of tool that enables us to determine the size of a particular DNA fragment. And so I'm going to just introduce to you one such tool, which is gel electrophoresis.
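As a quick worked example of how repeat number sets fragment size: the product length is the unique flanking sequence covered by the primers plus the repeat unit length times the number of repeats. The 40 bases of flanking sequence below are an arbitrary assumption, and both function names are hypothetical.

```python
def pcr_product_length(flank_bases, repeat_unit, repeat_count):
    """Length of a PCR product that spans the repeat: the unique flanking
    bases (including the primers) plus all copies of the repeat unit."""
    return flank_bases + len(repeat_unit) * repeat_count


def repeat_count_from_length(product_length, flank_bases, repeat_unit):
    """Work backwards from a measured product length to a repeat count."""
    repeat_bases = product_length - flank_bases
    if repeat_bases < 0 or repeat_bases % len(repeat_unit) != 0:
        raise ValueError("length is not consistent with a whole number of repeats")
    return repeat_bases // len(repeat_unit)


FLANK_BASES = 40  # hypothetical: total unique sequence between the primer ends

for repeats in (12, 18, 25):
    print(repeats, "CA repeats ->", pcr_product_length(FLANK_BASES, "CA", repeats), "bp")
```

Two alleles that differ by a handful of CA repeats therefore differ in length by twice that many base pairs, which is what the size measurement has to resolve.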
And gel electrophoresis involves taking DNA that you've generated, by either PCR or by cutting up DNA with a restriction enzyme, and loading it in a gel that has agarose. Maybe it's composed of agarose. It could be composed of polyacrylamide. And then because DNA is negatively charged, the backbone, if you run a current through it, such that the positive electrode is at the bottom, then the DNA is going to snake through this gel.
Now we'll do a quick demonstration, if you two could come up. I need one volunteer. Ori, find 10 of your friends and bring them down. All right. That's probably good. Yeah.
All right, Hannah, why don't you-- you guys have to link up, OK? Stay over here. We'll start at this end. This is the negative electrode over here. The positive electrode is going to be down there. And Jackie is going to be our single nucleotide. You guys link like-- yeah. You don't have to do-si-do, or anything like that.
All right. Now what I want you guys to do is I want you to slalom through these cones like it's an agarose gel. So that you're going towards the other side. And I'm going to turn on the current now. So go. All right, stop.
All right. See how the shorter DNA fragment is able to more easily navigate through the cones and get farther. So it was somewhat rigged. I know. But I just needed some way to make sure you always remember that the shorter nucleotide, or the shorter fragment, is going to migrate faster.
You guys can go back up. Thank you for your participation. Let's give them a round of applause.
[APPLAUSE]
All right. So what you just saw is that the longer DNA fragments, they're going to be more inhibited by moving through the gel. And so they're going to move slower and thus, not move as far in the gel. Whereas, the small fragments are going to move much faster because they're able to maneuver their way through this gel much more quickly. So there's going to be an inverse proportionality between the size of the DNA chain and its rate of movement. You're always going to see the shorter DNA fragment moving faster.
So what one of these gels actually looks like is shown here. So this is a DNA gel that's agarose. And DNA has been run in these different samples. And what you're seeing is this gel is subsequently stained with a dye, like ethidium bromide, which allows you to visualize the individual DNA fragments. And so a band on this gel indicates a whole bunch of DNA fragments that are all roughly the same length. So essentially, you can measure DNA length using this technique.
What's over here at the end of the gel, this is probably some sort of DNA ladder, where you have DNA fragments of known length that you can use to calibrate the length of these bands over here. So this is how you measure DNA length. And we're going to use it over and over again, as we talk about DNA and sequencing.
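Calibrating against a ladder can be sketched as an interpolation. The sketch assumes, as a common rule of thumb rather than anything stated in the lecture, that migration distance is roughly linear in the logarithm of fragment length within the range of the ladder. The ladder values and the function name `estimate_length` are invented for illustration.

```python
import math


def estimate_length(distance, ladder):
    """Estimate fragment length (bp) from migration distance.

    `ladder` is a list of (length_bp, distance) pairs. The estimate
    interpolates log10(length) linearly between the two ladder bands that
    bracket the unknown band (an approximation, see the note above).
    """
    bands = sorted(ladder, key=lambda band: band[1])  # order by distance run
    for (len_a, dist_a), (len_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance <= dist_b:
            fraction = (distance - dist_a) / (dist_b - dist_a)
            log_a, log_b = math.log10(len_a), math.log10(len_b)
            return 10 ** (log_a + fraction * (log_b - log_a))
    raise ValueError("band lies outside the range of the ladder")


# Hypothetical ladder: (length in bp, distance migrated in mm).
LADDER = [(1000, 10.0), (500, 20.0), (250, 30.0), (125, 40.0)]

print(round(estimate_length(25.0, LADDER)))  # band halfway between 500 and 250 bp
```

Note that shorter fragments sit at larger distances in the ladder, matching the point of the demonstration: shorter fragments migrate farther.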
So now, let's think about how this is going to help us establish linkage between a particular marker in the genome and a genetic disease. So if we think about these microsatellite repeats, I told you they're polymorphic. They exhibit a lot of variation in size. And so here's an example showing you a female who has two intermediate sized microsatellites. And if you look at this-- if you did PCR and measured the size of these, you get two different bands because there are two different alleles of different length here.
So you can see this individual has two intermediate length repeats. And this person has had children with an individual that has a short and a long microsatellite. And you can see that on the gel, here.
Now this female is affected by some disease. And these two individuals have children. And you can see that a number of those children are affected by the disease. So what mode of inheritance does this look like? If you had your choice between autosomal recessive, autosomal dominant, sex linked dominant, and sex linked recessive, what mode of inheritance is this looking like? Oh, Carmen.
AUDIENCE: Autosomal recessive.
ADAM MARTIN: Autosomal recessive? Why do you go with recessive? Yeah, go ahead.
AUDIENCE: Because there is a male that's affected. But not both of the parents are affected. So it seems like the father is heterozygous and the mother is homozygous recessive.
ADAM MARTIN: That's possible. That's exactly the logic I want to see. Is there another possibility? Yeah, Jeremy.
AUDIENCE: Autosomal dominant.
ADAM MARTIN: It could also be autosomal dominant. So you're right. You're right. If this was not a rare disease, then that male could be a carrier and could be passing it on to half the children. So that's good. You'd essentially need more information to differentiate between autosomal recessive and autosomal dominant.
For the purposes of this, we're going to go with autosomal dominant. And what you see is that you want to look at the affected individuals and see if the disease phenotype is linked, or connected, with one of these microsatellite alleles. So if we look at-- we basically PCR DNA from all these individuals. And if you look at who is affected, each one of the individuals has this M double prime band. And none of the unaffected individuals has it.
So obviously, it would be better to have more pedigrees and more data to really establish significance between this linkage. But this is just a simple example, showing what you could possibly see if you have one of these molecular markers linked to a particular disease allele. So that kind of establishes the principle.
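The check being done by eye on the gel, which band do all affected individuals share that no unaffected individual has, can be written down directly. The family below is invented to mirror the pattern described (it is not the data on the slide), the allele names for the father are placeholders, and `alleles_linked_to_disease` is a hypothetical helper. As noted above, a single small pedigree like this does not establish statistical significance.

```python
def alleles_linked_to_disease(individuals):
    """Return marker alleles carried by every affected individual and by no
    unaffected individual.

    `individuals` is a list of (marker_alleles, affected) pairs, where
    marker_alleles is the set of allele names seen as bands on the gel.
    """
    affected = [alleles for alleles, is_affected in individuals if is_affected]
    unaffected = [alleles for alleles, is_affected in individuals if not is_affected]
    if not affected:
        return set()
    shared = set.intersection(*affected)
    for alleles in unaffected:
        shared -= alleles
    return shared


# Hypothetical family: mother has two intermediate alleles (M' and M''),
# father has a short (S) and a long (L) allele.
family = [
    ({"M'", "M''"}, True),   # mother, affected
    ({"S", "L"}, False),     # father, unaffected
    ({"M''", "S"}, True),    # child 1
    ({"M'", "L"}, False),    # child 2
    ({"M''", "L"}, True),    # child 3
    ({"M'", "S"}, False),    # child 4
]

print(alleles_linked_to_disease(family))
```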
Now let's think about what are some other molecular markers that are possible? So another type of marker, and this is one that's the most common one, if I go here. So here, you see here's a linkage map, here. And you see most of these bands are green. And the green markers, here, are what are known as Single Nucleotide Polymorphisms, or SNPs.
So single nucleotide polymorphisms-- and this is abbreviated SNP. And what a single nucleotide polymorphism is, is it's a variation of a nucleotide at a single position in the genome. So it's just a one base pair difference at a position. So there's variation of a single nucleotide at a given position in the genome. And because that's a pretty general definition, there are tons of these in the genome.
Now one thing to think about is you could have a mutation in an individual that creates a SNP. So you could have a de novo formation of a SNP. But if you have a SNP and it gets incorporated into the gametes of an individual, then that variant is going to be passed on to the next generation. So this is something that could occur de novo. But it is also heritable. And if it's heritable, then you can follow it and use it to determine if a given variant is linked to a given phenotype, like a disease.
So to identify a single nucleotide polymorphism, it's helpful to be able to sequence the DNA. And I'll talk about how we could do that in just a minute. But before I go on, I just want to point out a subclass of SNPs that can be visualized without sequencing. And these are called restriction fragment length polymorphisms.
So restriction fragment-- so it's going to involve some type of restriction enzyme digest length polymorphism. It's a long word. But it's abbreviated RFLP. And what this is, is it's a variation of a single nucleotide. But this is a subclass of SNP. Because this is when the variation occurs in a restriction site for a restriction enzyme.
So if you remember your good friend EcoRI, EcoRI recognizes the nucleotide sequence GAATTC. And EcoRI only cleaves DNA sequence that has GAATTC. So if there was a single nucleotide variation in the sequence, such that it's now GATTTC, or something like that, that destroys the EcoRI site. And so EcoRI will no longer be able to recognize this site in the genome and cut it.
So you could imagine that if you had one individual having three EcoRI sites in this region of the genome, if you digest this region, you'd get two fragments. But if you destroyed the one in the middle, then if you digested this piece of DNA, then you'd only get one fragment. And that's something. Because it results in different sizes of fragments, that's something you can see just by doing DNA electrophoresis. And maybe you would use some method to detect this specific region, so that you're not looking at all the DNA in the genome, but you're establishing linkage to this specific area.
You could use PCR. You can have PCR primers here and here. And you could then cut with EcoRI. In this case, you'd get two fragments. In this case, if you amplified this region of the genome and cut with EcoRI, you'd only get one fragment. So you'd be able to differentiate between those possibilities.
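The PCR-then-digest test can be sketched as a string operation. EcoRI cuts within GAATTC after the first G, so the sketch cuts one base into each site. It models only the top strand, the flanking sequences are invented, and `digest` is a hypothetical helper.

```python
ECORI_SITE = "GAATTC"


def digest(seq, site=ECORI_SITE, cut_offset=1):
    """Cut `seq` at every occurrence of `site` and return fragment lengths.

    EcoRI cuts after the first G of GAATTC, hence cut_offset=1.
    Only the top strand is modelled.
    """
    fragments = []
    start = 0
    position = seq.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(cut - start)
        start = cut
        position = seq.find(site, position + 1)
    fragments.append(len(seq) - start)
    return fragments


# Hypothetical PCR products, invented for illustration only.
left = "ACGTACGTACGTACGTACGT"       # 20 bases
right = "TGCATGCATGCATGCATGCATG"    # 22 bases

allele_with_site = left + "GAATTC" + right
allele_without_site = left + "GATTTC" + right  # single-base change destroys the site

print(digest(allele_with_site))     # two fragments
print(digest(allele_without_site))  # one uncut fragment
```

On a gel, the first allele would show two shorter bands and the second a single longer band, so the single-base difference is read out as a difference in fragment size.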
Yes, Malik.
AUDIENCE: When you use PCR, are there [INAUDIBLE]?
ADAM MARTIN: What's that?
AUDIENCE: Are there [INAUDIBLE]?
ADAM MARTIN: Oh. You're saying what causes it to stop? That's a great question, Malik. Yeah.
So initially, it's not going to stop. That's absolutely right. But because every step, each time you replicate, it's then primed with another primer. So you'd replicate something like this that's too long. But then the reverse primer would replicate like this. And it would stop.
So if you go back to my slide from last lecture, look through that and see if it makes sense how it's ending. Because if you do this 30 times, you really will enrich for a fragment that stops and ends at the two primers, or begins and ends at the two primers, I should say. Good question. Thank you.
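The enrichment argument can be checked with a little bookkeeping. The sketch assumes idealized PCR: one double-stranded template to start, and every strand copied exactly once per cycle. The strand categories and the function name are labels made up for this illustration.

```python
def pcr_strand_counts(cycles):
    """Count strand types after a number of ideal PCR cycles, starting from
    one double-stranded template molecule.

    original: the two starting strands.
    overlong: copies of an original strand; they start at a primer but run
              on past the other primer site.
    defined:  strands that begin at one primer and end at the other.
    """
    original, overlong, defined = 2, 0, 0
    for _ in range(cycles):
        new_overlong = original           # each original strand gives one overlong copy
        new_defined = overlong + defined  # copying an overlong or defined strand stops
                                          # at the end of that strand, at a primer
        overlong += new_overlong
        defined += new_defined
    return {"original": original, "overlong": overlong, "defined": defined}


for n in (1, 2, 3, 30):
    print(n, pcr_strand_counts(n))
```

The overlong strands grow only linearly (two per cycle), while the strands bounded by both primers grow roughly exponentially, so after 30 cycles the too-long products are a vanishing fraction of the total.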
All right. Now, let's talk about DNA sequencing. Because as I showed you, obviously, these SNPs, because there are so many of them, are probably the most useful of these markers to narrow in on where your disease gene is. And to detect a SNP, we need to be able to sequence DNA.
So I'm going to start with an older method for DNA sequencing, which conceptually, is very similar to how we do DNA sequencing today. And so it will illustrate my point. And then at the end, I'll talk about more modern techniques to sequencing.
So the technique I'm going to introduce to you is called Sanger sequencing. And that's because it was developed by an individual named Fred Sanger. And I'm going to just take a very simple DNA sequence, in order to illustrate how Sanger sequencing works.
So let's take a sequence that's really simple. This is very, very simple, and then more sequence here. So let's say we want to determine the nucleotide that's at every position of this DNA fragment. So one way we could maybe conceptually think about doing this, is to try to let DNA polymerase tell us where given nucleotides are. And if we're going to use DNA polymerase, what are we going to need, in order to facilitate this process? Yes, Rachel.
AUDIENCE: [INAUDIBLE].
ADAM MARTIN: You're going to need nucleotides, definitely. So we're going to need nucleotides. What else? To start, what are you going to need? Miles?
AUDIENCE: Primer.
ADAM MARTIN: You're going to need a primer, exactly. Good job. So you need a primer. So here's a primer.
And now, we're going to try to get DNA polymerase to tell us whenever there is a given nucleotide in this DNA sequence. And so think with me. Let's say we were able to get DNA polymerase to stop whenever there was a certain nucleotide.
So if we go through just a couple nucleotides, let's say, at first, we want DNA polymerase to stop whenever there's an A. So let's say there was a possibility it would stop at this A. If it stopped at this A, you'd generate a fragment of this length. But if it read on through that A, there's another possibility that it would stop at this A.
So we're kind of looking at when these are stopping. And the final possibility is it goes on and stops at this A. So if this DNA polymerase stopped only at As, you'd get fragments that are these three discrete lengths.
Now let's consider another possibility. So pink here is stop at A. And in blue, I'm going to draw what would happen if it stopped at T. So they all start from the same place. If it stopped at T, it would just stop one nucleotide beyond this A in this simple sequence. So in blue here, this is stop at T.
But if it's just a possibility, it stops. And some of the polymerases could go beyond this T and go to the next T and stop here. And again, this would be one nucleotide length longer than this pink one, here. And the final one would-- I'll just draw it down here-- would get out to this last T, here.
So what you see is if we could get DNA polymerase to stop at these discrete positions, we'd get different sized fragments, whether it was stopping at one nucleotide versus the other nucleotide. You all see how this is resulting in different fragment lengths. Yes, Andrew.
AUDIENCE: How would you create a pattern [INAUDIBLE]?
ADAM MARTIN: There are companies now. You can basically take nucleotides and synthesize these primers chemically, not using DNA polymerase.
AUDIENCE: I'm saying how would you know what primer to use, if you don't know the sequence?
ADAM MARTIN: Oh, in this case, you'd have to start with some sequence that you know. So in most sequencing technologies, you kind of make a DNA library, where you know the sequence of the vector. And then you'd use the vector sequence as a primer to sequence into the unknown sequence. Great question. Good job.
All right. So what we need now then is some sort of tool or ability to stop DNA polymerase when there's a certain nucleotide base. And to do that, we can use this type of molecule, here, which is known as a dideoxynucleotide. Remember, for DNA polymerase to elongate a chain, it requires that the last base have a three prime hydroxyl.
And so what this dideoxynucleoside triphosphate is, is it's a nucleoside triphosphate that lacks a three prime hydroxyl. Here, I'll highlight that.
So you see this guy? You see the bold, highlighted H? There's a hydrogen there on the three prime carbon, rather than the normal hydroxyl group. So if this base gets incorporated into an elongating chain, DNA polymerase is not going to be able to move on.
So this method where you can add a certain dideoxynucleoside triphosphate to stop chain elongation is known as a chain termination method. So you're getting chain termination. And you're getting this chain termination with a specific dideoxynucleoside triphosphate. So these dideoxynucleoside triphosphates, if they get incorporated into the DNA, are going to halt the synthesis of that DNA strand.
So if we take our example, here, this might be a reaction that has dideoxythymidine triphosphate. So if we had dideoxythymidine triphosphate in this sample and it's elongating, then when the polymerase reaches this point, there's a possibility that it will incorporate the dideoxynucleoside triphosphate. And if this is a dideoxynucleoside triphosphate, then there won't be a three prime hydroxyl.
And DNA polymerase will just be like, oh, I can't go on! Because it's not going to have a three prime hydroxyl. So it's not going to be able to continue with the next nucleotide. So this is known as chain termination.
So let me take you through an example, here. All right. So here's an example that you have a slide of. And again, there's a template strand, which is the top strand. And this method requires that you have a primer. And what's often done is you label the primer.
So the first step is you have to denature your DNA. So you have to go from double stranded DNA to a single stranded DNA. And then you mix the single stranded DNA with first, this labeled primer, such that the primer can then anneal to the single stranded DNA. You need DNA polymerase, as I've mentioned. And as, I believe, Rachel mentioned before, you need the building blocks of DNA. So you need the four deoxynucleoside triphosphates.
So you always have the four deoxynucleoside triphosphates. But what's special here is you're going to spike each of several reactions with one of the dideoxynucleoside triphosphates. So you spike the reaction with a tiny amount of one of your dideoxynucleoside triphosphates.
So let's say you have a reaction, here. And this one here has dideoxyadenosine triphosphate. Then polymerase will elongate this strand until there's a thymidine on the template. And then there's a possibility that it will incorporate this dideoxy NTP. And if it does, then you get chain termination. And you get a fragment of this length.
But the other possibility, because there is still the deoxy form of the NTP present, it's possible that it incorporates a deoxyadenosine triphosphate there. And keeps going, and then incorporates a dideoxy ATP later on, where you have another T. And so the polymerase will essentially randomly stop at these different thymidine residues, depending on whether or not a dideoxynucleoside triphosphate is incorporated. And that means for a given reaction, one in which you have dideoxy ATP, you get a certain pattern of bands that represent the length of fragments, where you have, in this case, a thymidine base.
And then you do this for all four bases, where you have four reactions, each with a different base that's dideoxy. So when you're adding these, you're going to do four reactions, one with dideoxy ATP spiked in, one with dideoxy TTP, one with dideoxy CTP, and the last with dideoxy GTP. And because these nucleotides are present in different positions along the sequence, you're going to get a distinct banding pattern for each of these reactions. But using that banding pattern, you can then read off the sequence of DNA that's present on the template strand.
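The four-reaction readout can be simulated in a few lines. This sketch works on the newly synthesized strand (the complement of the template), assumes that termination happens at every possible position somewhere in the population of molecules, and measures fragment length as bases added after the primer. The sequence and function names are invented for illustration.

```python
BASES = "ACGT"


def termination_lengths(new_strand, dideoxy_base):
    """Lengths (bases added after the primer) of every fragment that can end
    in the given dideoxy base, one per occurrence of that base in the newly
    synthesized strand."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dideoxy_base]


def read_gel(lanes):
    """Read the synthesized sequence from four lanes of fragment lengths,
    going from the shortest fragment to the longest."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for length, base in sorted(bands))


# Hypothetical newly synthesized strand, invented for illustration only.
new_strand = "ATGCATTA"

# One reaction per dideoxy base, each giving its own set of bands.
lanes = {base: termination_lengths(new_strand, base) for base in BASES}
for base in BASES:
    print("dd" + base + "TP lane:", lanes[base])

print(read_gel(lanes))
```

Each length appears in exactly one lane, so reading the bands from shortest to longest and noting which lane each one is in recovers the synthesized strand. The template is its complement.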
So this is how sequencing was done for many, many years. These days, it's been made cheaper and faster. And now what's often used is next generation sequencing. And one pain in the ass about sequencing before is that you'd use a lot of radioactivity. Your primer would be radioactive, so that you could detect these bands. Right now, everything's done using fluorescence, which makes it much nicer, I think.
And so in next generation sequencing, your template DNA is attached to a solid substrate, such that it's immobilized on some type of substrate. And then you add each of the four nucleoside triphosphates. In this case, they're labeled with a dye, such that each one is a different color. But the dye also functions to prevent elongation, such that, again, it's this chain termination. When you incorporate one of these, the polymerase just can't run along the DNA. It incorporates one and then stops.
So if you get your first nucleotide incorporated, it will incorporate one of these four. And it will be fluorescent at a certain wavelength, which you can see using a device or microscope. And then what you then do is chemically modify this base, such that you remove the dye and allow it to extend one more base pair. And so you go one nucleotide at a time. And you read out the pattern of fluorescence that appears. And that gives you the sequence of DNA on this molecule that's stuck to your substrate.
And you can do this in parallel. You can have tons, many different strands of DNA. And you can be reading out the sequence of each one of these strands in parallel.
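The one-base-per-cycle readout can be sketched as follows. The dye colors are arbitrary assumptions (real instruments differ), the templates are invented, and the chemistry is reduced to "incorporate the complementary base, record its color, unblock, repeat". The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye assignment: one color per labeled nucleotide.
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def sequence_by_synthesis(template, cycles):
    """Return (colors, read) for one immobilized template strand.

    `template` is written in the order the polymerase encounters it. In each
    cycle one dye-labeled, reversibly terminated nucleotide is incorporated,
    its color is recorded, and the block is removed for the next cycle.
    """
    colors, read = [], []
    for template_base in template[:cycles]:
        incorporated = COMPLEMENT[template_base]
        colors.append(DYE_COLOR[incorporated])
        read.append(incorporated)
    return colors, "".join(read)


# Many templates are read in parallel, one color per spot per cycle.
spots = {"spot_1": "TACGGA", "spot_2": "GGCATT", "spot_3": "ATATCG"}
reads = {name: sequence_by_synthesis(seq, cycles=4) for name, seq in spots.items()}

for name, (colors, read) in reads.items():
    print(name, colors, read)
```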
Great. Any questions about DNA sequencing? OK. Very good. I will see you on Monday. Have a great weekend.