So we are moving along. Lecture six is the last of the biochemistry lectures. We're going to be talking about nucleotides and nucleic acids, and you'll understand these terms in a moment; I'll clarify them for you. But this is a tremendous stepping stone to the next portion of the class. So I'll show you a few images here. I'm going to reshow you some of these in a moment when we talk about understanding the non-covalent structure of DNA, which is so critical to understanding information storage and information transfer. But for now, let's just have a quick peek forward. After this section, I'm going to be covering molecular biology, so how to go from DNA to RNA to protein, and then Professor Martin will take over with the basics of structures and functions of cells, and then genetics. But for all of this we're going to need nucleic acids, and I'll explain to you why here.

So nucleic acids form fundamental units for information storage, and that is the DNA that is in our nucleus and in our mitochondria, and then information transfer. And if I get a little bit of time at the end, I have three or four quick slides that you don't have on your handout, because it's sort of a floating topic, on the use of DNA in DNA-based computing, because it's a nanoscale structure that one can program to do different things, and I think you might enjoy that.

So in this picture of the components and what's known as the central dogma, that is, how DNA is converted into messenger RNA, which, through the help of transfer RNA and ribosomal RNA, gives us proteins, the key elements on this slide are DNA, messenger RNA, ribosomal RNA, and transfer RNA, and those are all made up of nucleotides being brought together into polymers that are nucleic acids. So obviously we really need to crack the structures of these and understand how the structure informs function. Remember, we did that for proteins, we've done that for phospholipids, we thought about it very briefly for carbohydrates, but the thing that I
really want to stress to you with the fourth of these macromolecules is looking at how the structure of this last class of biomolecules really informs function, and it's really cool to think about how it's done. So how is that chemical, molecular structure something that we can understand from the perspective of function?

OK, so what we need to do first of all is think about what nucleotides are and understand their structure, so that we can move forward to understand how they come together to build these macromolecules that are so pivotal and essential in life for programming the biosynthesis of our proteins. And now we're understanding more and more about not only that, but also how RNA (RNA, not DNA) is involved in a large number of regulatory processes. So it's not just that double-stranded DNA goes to a messenger and so on; there's also a lot of regulation that occurs because of a lot of the other nucleic acids that are within the cell.

So I'm going to go here, because I want to describe the component parts of nucleotides so we understand their structure and their properties. So what are nucleotides? You look at these structures up on the board and they look kind of complicated, so let me deconstruct them for you; it'll make life a lot easier. There are two familiar building blocks and one new one. The familiar building blocks are, first of all, carbohydrates. The key carbohydrate in nucleic acids is a five-carbon pentose sugar, which looks like this. You can count the carbons, one, two, three, four and five, and you can reassure yourselves everything is there with respect to the carbons by translating this line-angle drawing into a drawing where you put all the hydrogens on and you know where everything is. There are two types of five-carbon pentoses that are used in the nucleic acids. They are ribose, which is shown here with OHs on all of those carbons, and 2-deoxyribose, which is a building block of DNA, whereas ribose is a building block of RNA. What else do I need to tell you? You'll see
this later on: that ribose sugar ends up being connected to what are known as nucleobases. You do not necessarily need to draw those, because you've got them on your handouts to put sketches on. I put them on the board so I don't have to stand here and draw them for you, and I want to explain certain things. So the nucleobases, in the numbering system (and I'm going to keep on reiterating this so you'll get familiar with it), number their ring atoms 1 through whatever it is as you walk around the ring. So when we talk about the ribose component, its carbons have what's known as a prime number, to differentiate them from the numbering system in the nucleobases. So this would be 1 prime, 2 prime, 3 prime, 4 prime, and 5 prime. Why is that? This becomes incredibly important when we talk about putting together polymers of DNA and the direction in which DNA is assembled in life, and also even when we describe deoxyribose or ribose, because this would be called 2-prime-deoxyribose in the nucleic acid. So I'm going to bore you with that numbering system, because I'll start to use it very commonly, and it will make a lot of sense as we start to assemble the DNA macromolecule. When we talk about the way it's built and drawn and written, the numbering system will be important, because we'll constantly refer to 5 prime and 3 prime. That's just a little preview for later.

The next component of the nucleic acid is a phosphate. Phosphate looks like this, but in nucleotides these are joined to other units as phosphate esters. You want to remember that in phosphate you have one, two, three, four, five bonds to phosphorus, and you commonly have a negative charge on one of those oxygens. In the structure of DNA you actually have phosphates occurring as phosphodiesters, and once again you will see that when we see the intact structure of DNA. So what are nucleotides? Nucleotides are a combination of a carbohydrate, or a sugar, a
phosphate, and a nucleobase. That's the third component, the one we're going to learn about now. So the nucleobases look like this. There are two families, two flavors of nucleobase. There is one flavor (just cleaned up a little bit here) that has two rings, and it has the shorter name, purine. And there's a different family or flavor of nucleobases that has one ring, and it has the bigger name, pyrimidine. And that, to this day, is the way I remember purines and pyrimidines: small name, big structure; big name, small structure. OK, if that's helpful to you, go for it, use it. I haven't patented it or anything.

All right. So in nucleic acids there are two different purines. They are known as adenine and guanine. You do not need to know these structures. I actually only know my favorite three of the five to draw easily, and for the other two I'm always stumbling around the ring, so don't worry about that. We all get to know the ones we work with every day. For me it's uracil, it's adenine, and it's cytosine, but not the others. But what you do need to understand is a little bit about their structures, because when we start to talk about the non-covalent structure of nucleic acids, for instance the double-stranded helix of DNA, we need to know where the hydrogen bond donors and acceptors are in these structures.

So if you want to indulge me, you can take a look at these structures. This hydrogen would be a donor, right? You can see that it's a hydrogen on a nitrogen. This nitrogen is interesting. It has one, two, three bonds to nitrogen, which means there is also a lone pair of electrons on that ring nitrogen, so that would be a hydrogen bond acceptor. So the adenine nucleobase can accept and give a pair of hydrogen bonds, and you can work that out for all of the others. So in guanine there is an acceptor, another acceptor, the donor, and so on. So those rings in the nucleobases are very important, because they have places that you can hydrogen bond to. Now, is everyone feeling comfortable about this? Does anyone
want to ask me a question that might help clarify? Because it's quite a bit. Yeah, do you have a question? Sorry, uracil? I'm sorry, all these nucleobases have fancy names. OK, so far I've shown you the structures of adenine, guanine, cytosine, and thymine. Uracil, which is not drawn on the board, is very similar to thymine, except this methyl group is a hydrogen. Knowing the names is also complicated. I really care that you understand hydrogen bonding patterns: not to draw the whole structures, but to identify hydrogen bonding patterns; not to remember fancy names, because there's no logic to those names, but really to remember ribose, deoxyribose, phosphate and phosphodiesters, and purines and pyrimidines, just the sizes of them, to pick them out. Does that make sense? That's what I want you to know, plus whatever you can remember if you think it's interesting.

All right. Now, in nature we use the nucleotide building blocks in many different ways. It's not just in DNA and RNA. So here I'm showing you some really important nucleotides that are found in nature, and I'll give you a little bit of information about their roles, including signaling. So here are the components that you can pick out. There is, in this case, a ribose sugar. In this case the phosphate is a triphosphate, so it's got three phosphates in a row. And here's a nucleobase, which is a purine. This is adenosine triphosphate, so it's one of the nucleotides that's used in energy transfer. In a lot of metabolic processes we use ATP as a molecule that has energy that can be unlocked for chemical processes. There's another one of these, which is guanosine triphosphate, where the nucleobase is different. They're both purines, but they have different structures; you can see them there. And then finally, the last one I show you here is a nucleotide that has a cyclic phosphate, but it still has a nucleobase, a ribose and a
phosphate, and this is cyclic AMP. When I come back after Professor Martin has taught, we'll talk about the role of cyclic AMP as a second messenger. So these two molecules, in addition to being building blocks for DNA and RNA, are also forms of energy: you can use ATP or GTP as a form of energy in a lot of metabolic processes. And in fact, when we start constructing proteins using the ribosomal system, you'll notice we use GTP as a form of energy, not ATP. It's interesting how nature chooses to do that. Any questions about this?

OK, all right. One tiny wrinkle left to deal with, and that's a little bit more about those building blocks for the nucleic acids, one more item that it's useful to understand the name of. So here are the five nucleobases, two purines and three pyrimidines. In DNA we have A, T, G and C, and in RNA you have A, U, G and C. So we have different building blocks: three of them are common to both polymers, and one is different. Uracil and thymine are exchanged when you go from DNA to RNA. The pyrimidines are cytosine, uracil and thymine. There are reasons for these differences, and I'll get a little bit into some of those chemical differences in a moment. So the information up there is the same information that I have on this board.

The next thing I need to talk to you about is that we very commonly use two terms, nucleoside and nucleotide. How irritating is that? The nucleoside is just the ribose plus the nucleobase, but no phosphates. As soon as you put on phosphates, they become nucleotides. So, for example: nucleobase, ribose, and in this case a phosphate on it, and that becomes a nucleotide. No matter how many phosphates there are, it's called a nucleotide. I'm less concerned that you remember that nomenclature, and more that you know what it's all about, because otherwise it might become a little bit confusing. So just remember that if you can. But I think I've tried to define the things I would like you to remember: the
building blocks, their numbering system, the phosphodiester linkages, and the nucleobases, as far as understanding where donors and acceptors are for hydrogen bonding. All right. So we call that a nucleoside, whereas we call it a nucleotide when it includes the phosphates. And there's one thing that you want to notice, which is that the bond from the nucleobase to the ribose is a glycosidic bond. It's a bond to a carbohydrate, so that's why it's called a glycosidic bond. There are glycosidases that cleave the bond from the base to the sugar. Those are very important when we have mutations in our DNA and we want to cut out the base to fix it, so it doesn't get misread in the biosynthesis of DNA or in the biosynthesis of messenger RNA. So that bond is important. We may often talk about it, but only when we get to learning about how DNA sequences are corrected if there are mistakes in those sequences, and that will be later on.

OK, so let's start now to look at the polymers. I want to tell you that by the early 1900s people pretty much knew the covalent structure of DNA, and I'll describe it to you now. DNA is made up of nucleotides, and this is its basic structure, where you have a phosphodiester backbone linking riboses. Each of those riboses is modified with a purine or pyrimidine, and that is the basic structure of a nucleic acid polymer, only it's very, very, very, very long. So let's take a look at the components here. Look at the bonds, and maybe on your notes just highlight the bonds and some of the things I'll talk about.

So first of all, the numbering system here. We always describe the sequence of a nucleic acid from 5 prime to 3 prime, because the phosphodiester bonds join the 5 prime (there should be a number there) and the 3 prime sites. So the linkage here would be 5 prime and 3 prime, joining the two ribose molecules. So the architecture of that nucleic acid is a polymer that
includes a phosphodiester backbone linked by phosphate esters (that's one phosphate ester, that's the other one) on two of the OHs of the ribose sugar. When this is DNA, there's no OH group at that carbon site; that would be the 2 prime site. So you can pick straight out that this is DNA. The sequence is then defined by the identity of the bases here, so this would be guanine, adenine, thymine on that sequence. Now, by convention, sequences are written in a 5 prime to 3 prime direction. So if I look at that, I would be able to name it as an A, G, T sequence, because we always write the sequences 5 prime to 3 prime. We can remember that later on, because we actually also build sequences 5 prime to 3 prime. So there are some conventions in biochemistry you want to remember. By convention we write peptides N-terminal to C-terminal, but we also build them N to C. So that's why the convention is strong, and it's good to remember, because it can get you out of a lot of trouble if you remember those things.

Now, when we are building a DNA polymer, we grow that sequence. You'll see the biochemistry for all of that polymerization in the next class. It's amazingly cool how the entire contents of a cell, the DNA, can be replicated in amazing time frames, but all through growing those chains from 5 prime to 3 prime. When we add another building block on, we remove a molecule of water, so that's a condensation reaction, and we form a new phosphodiester bond. So in the biosynthesis of DNA you keep on adding new nucleotides to the 3 prime end. There's a chemical reason for that. When we build DNA, we don't just cram the two groups together. Rather, we come in with a triphosphate and use that activated triphosphate as the new building block, and you kick out pyrophosphate. You'll see that when we talk about DNA synthesis. But what I want you to remember here is that this is another condensation reaction. We talked about them when making peptides, we
talked about them when making carbohydrate polymers, and now we're seeing once again a condensation reaction to make a nucleic acid polymer.

Now, the last term that's kind of worth mentioning is the word nucleic acid. What's that about? I don't see any carboxylic acids. It turns out the polymers of DNA are very acidic, because the OH group on those phosphodiester backbones is very acidic, so you give up H+, and this is in its most stable form as O minus. When DNA was first isolated, it was isolated from white blood cells by isolating the nucleus, and it was found that it was a very acidic material packed into the nucleus. That's why it was called nucleic acid: acids in the nucleus. Before people even understood anything about the composition, it garnered that name, nucleic acid. So when we talk about polymers of nucleotides, we call them nucleic acids.

OK, all right. So then, with respect to writing our sequences, we could write them in this way: pdGATC. That would be that structure. What do all the little extra p's and d's stand for? The p stands for whether there's a phosphate at this end. The d stands for whether it's a deoxy sugar as a building block. Going all the way to the other end, there's no little p at the other end, so it means that end is free. Does everyone understand that shorthand writing? There's another way I know this was DNA without needing to put deoxy on each of the building blocks. Does anyone know how I know immediately it's a stretch of DNA? Yeah. Yeah, there's no uracil; there's thymine instead. So in principle, as long as there's a T in there, you know it's DNA; as long as there's a U in there, you know it's RNA.

OK, all right. Now let's talk about the non-covalent structure, because I really feel that that's the most exciting part of this entire endeavor. The covalent structure really doesn't allow us to understand how DNA stores information for building proteins. It doesn't tell us that much about it. It just looks like a cool
polymer, but we can't really understand the details without looking at the non-covalent structure. So there was one key piece of information, and it's called Chargaff's data. This piece of scientific information ran around the scientific community in the early 50s, because it seemed incredibly important. What Chargaff did was collect all kinds of organisms, then their nuclei, then their DNA, and then measure the ratio between the purines and the pyrimidines. He measured the ratio of the large ones and the small ones of the nucleobases: how many of these relative to how many of those. And what he found, by looking across organisms from all domains of life, is that there was a one-to-one ratio of purine to pyrimidine. That became very interesting, because what it suggested was that in some way the non-covalent structure of nucleic acids had some correlation between the number of the purines and the number of the pyrimidines. And what you can imagine, by looking at that number, is that it sounds like we're always pairing a small one with a large one. So this is really, really important, because it's like the light bulb that went on with respect to understanding the structure of double-stranded DNA. Despite all kinds of variations (some organisms have a lot more GCs, some have more ATs), no matter what, the ratio is always one-to-one. And this ultimately led to understanding the non-covalent structure of double-stranded DNA, because it provided clues to how there could be some way that information was coded but then could be replicated.

Now, the next clue to the structure of double-stranded DNA came from a very talented researcher, Rosalind Franklin, who sadly died way before her time of ovarian cancer, really in large part because she spent a lot of time near X-ray beams, which would have caused mutations to her DNA. She developed a way to make fibrils of DNA that were ordered enough to collect
X-ray diffraction data, and that diffraction data actually gave a clue to some of the dimensions of the double-stranded DNA structure. It actually was the clue that told the spacing between the strands of DNA, so it really was a piece of information that you simply couldn't do without. With Chargaff's data and with this, what was called Photograph 51, you really had the clues. And it was really during those years that Watson and Crick were desperately model building to try to understand the non-covalent structure of DNA. Once they had those two pieces of information, they could actually put together hand-built models. This looks kind of clunky, but I know the room they took this photo in from my years at Caltech; in fact, I can recognize the room. They built not just little tiny molecular models but big molecular models, so they could make measurements, to say the diffraction data told me this was so many nanometers apart. And they were able to piece together the structure of double-stranded DNA. But I still haven't shown you how those two strands come together.

It's really intriguing, because at that very same time Linus Pauling, who had done very well with the structure of the alpha helix in proteins, also was trying to figure out the structure of DNA. But he came up with a sort of crazy structure, where he thought that it was a triple-stranded structure where the bases actually stuck out, and somehow this triple-stranded structure coded for replication of DNA. Now, there's a ton of things that are really off about this structure. First of all, it's triple stranded. But the other terrible thing is that there are so many phosphodiesters in the backbone that there would have been massive electrostatic repulsion. Those sequences would want to blow themselves apart, because you can't cram that much negative charge all in one place. But it was a really intriguing sort of sociological phenomenon. At the time, Pauling was a major pacifist, and he was really, really active in nuclear disarmament, and
they said that his mind just wasn't on some of this stuff, and that this model came out of him really worrying about other things and not focusing on the DNA structure.

So let's try to explain Chargaff's data by looking at the nucleobases and thinking about how they might come together. Here I show you the structures of the four nucleobases in DNA. Wherever I have an R, you can assume that's a ribose that is part of the phosphodiester backbone. What we want to understand is how the nucleobases come together to form some kind of pair that could be useful for programming their resynthesis. So I've drawn them all here, but it's not quite intuitive. I need to do a little bit of flipping around to line things up better, and the other thing I need to do is get things at the right angle, so you can start seeing how those bases might come together. Because Chargaff's data dictates that you have a purine and a pyrimidine, purine, pyrimidine, you have pairing between the nucleobases in your double-stranded DNA in a structure that looks more like this, and in each case you pair a purine and a pyrimidine.

So what I want you to do is take a look. I've shown you now where donors and acceptors are. You can go back and do this for all the nucleobases, but I'm going to do this for you right now by showing you the donors and acceptors of hydrogen bonds within those structures. What I've done is I've lined them up beautifully so they look straight at each other, so you can tell that there is a complementarity between a purine and a pyrimidine that makes very nice hydrogen bonding, which is the non-covalent force that's very important. Between G and C, I can set up three hydrogen bonds. Between A and T, I can only set up two hydrogen bonds. So one purine is complementary to one of the pyrimidines, and the other purine is complementary to the other pyrimidine. All right. And then we can draw those hydrogen bonds in place. That totally explains the measurement, from the Franklin
data, of the distance, the width of the double-stranded helix, because it's identical for both of those base pairs. And that gives you the non-covalent structure of DNA, which is a series of interactions where the solid line is the phosphodiester backbone, but sticking out like steps on a spiral staircase are the bases, where each base is complementary to a specific partner base. So it predicts the Chargaff ratio, and it also predicts the distances.

Now, within all the model building, it became quite clear that the non-covalent structure of DNA was afforded by antiparallel strands, where one strand went in one direction, 5 prime to 3 prime, and the other strand went in the opposite direction, 5 prime to 3 prime. When we start replicating DNA, we're going to see that that's pretty convenient, but thermodynamically it is also the favored orientation. So let's just look at the orientation where you would draw one strand of DNA 5 prime to 3 prime. Now I've taken this all down to cartoon level. These are the phosphodiesters, the riboses, the 3 prime end and the 5 prime end, and the bases that come off at the 1 prime carbon. And then when you pair it with another strand, one strand goes in one direction (oh, I don't know why this is misbehaving), 5 prime to 3 prime, and the other strand goes in the other direction, 5 prime to 3 prime.

When asked this question a few years ago, I couldn't really explain it very well; I just said it had to be, because it always has been. But what's really cool is that people have been able to solve the crystal structure of a parallel pair of DNA strands. So this is canonical DNA, the beautiful antiparallel structure, and it's very regular, very, very even. It turns out, though, that when you try to pair the two strands in a parallel orientation, they're very uncomfortable, and it's much less stable. So the antiparallel orientation is very important for the thermodynamic stability and the optimum
hydrogen bonding interaction of all those bases that are pairing. So it's actually what nature favors, because it is more stable. Any questions? OK. And this is on your slides, but you can see just how regular the antiparallel DNA looks, so organized, whereas the parallel one really does not afford you good hydrogen bonding interactions at all.

OK. So what we've done now is we understand the structure of DNA, the non-covalent and covalent structure of DNA. We understand it's antiparallel. What we'll do in the next class is show how you can peel apart those antiparallel structures to make unpaired strands, and you can use each of them as the template for the synthesis of a new strand of DNA, so you can get two daughter double strands from a single parent double strand. And that all comes from understanding the structure.

Now what I want to do is move you just very briefly to the structure of RNA, comparing the DNA and RNA structures, because there are some differences. So let's just work through what the differences are. I have this written down, and the differences are very important for the functional properties. So DNA versus RNA: first of all, obviously, deoxyribose versus ribose. You may go, why, why, why is nature so complicated? Why do I have this extra factoid to remember about RNA versus DNA? And it's really amazing that the hydroxyl at the 2 prime position, having it or not having it, makes enormous differences to the stability of the polymer. RNAs break down very, very readily. DNA is stable for the lifetime of a cell, all perfect in the nucleus or mitochondria; it stays intact. So there's a stability difference between the two sugars, because DNA has to be the place where you store your genetic material. It's got to stay good, whereas RNA is the message that you make transiently to program a protein being made, and then you want to get rid of it. So we need the differences in stability that originate from that small feature.
A, T, G, C versus A, U, G, C: there's the difference you see in the bases. The most common DNA is double-stranded DNA, whereas RNA forms various structures, so its structures are much more irregular than DNA's, probably in part because the ribose is substituted differently, so that a continuous stretch of double-stranded material is not quite so stable in RNA. We find DNA principally as double-stranded DNA, but RNA we find as transfer RNA, messenger RNA, ribosomal RNA, short interfering RNA; the list goes on forever. So the various RNAs are used for a lot of purposes, whereas DNA principally stays as double-stranded DNA. There's a little double-stranded RNA, but it is a precursor to some of these other forms of RNA. OK, so this slide just summarizes some of those differences for you, comparing DNA and RNA. And what we'll see later is how RNA lends itself to these interesting structures where you still have some base pairing, but you have a lot of loops and hairpins and diversity of structure. And that's really kind of the origin of this RNA world idea, where RNA structures could have a variety of forms that might contribute to different functions beyond just serving as a message carrying the DNA's information.

So there are a lot of things that one can understand about DNA by knowing its hydrogen bonding patterns. So can you guys guess which of these strands would have a complementary strand and be the most stable double-stranded DNA? This would be one strand; you could draw for each of them its complementary strand. Can you guess the clues to figuring out which would have the most stable organization of the antiparallel double-stranded DNA? What would I be looking for? Yeah. OK, so number one, higher GC content, because GCs form three hydrogen bonds and ATs only form two. And what's the other determinant, just looking at those structures? Yeah. No, it's actually even more simple than that: length. So all you do is you go along and say, I can make three hydrogen bonds, 2, 3, 2, 2, 3,
2, 2, 2. So usually it's truly just a matter of counting the hydrogen bonds in the sequence, and you can guess which is going to be the more stable, because it has the most hydrogen bonds. So we might ask you that: which one will come apart?

Now, the intriguing thing about DNA is you can heat it and it'll come apart, but it doesn't denature the way proteins do. If you just cool it down, it comes back together. So another feature of DNA is that you can heat-denature it and then reanneal it exactly how it was in the first place. It doesn't denature to something that's not very useful.

Now the question: can you draw the complementary strand? If this is the top strand here, which of these is the complementary strand? Frankly, the best way to do it is to sketch out the complementary strand. You can sketch it kind of upside down, because it's really hard to draw things 5 prime to 3 prime when you're also trying to figure out base pairing. So draw it upside down, make sure you know the 5 prime and the 3 prime ends, and then you can get the right answer for these types of questions about complementary strands.

Now, one last question: the stability of double-stranded DNA. I've made a whole big deal about hydrogen bonding, right? That's what holds it together. What other forces could be at play in double-stranded DNA that might contribute to its stability? Any thoughts? What else? Well, it certainly isn't charge, because the predominant charge is negative, so that's not an opportunity. You've probably got metal ions there kind of neutralizing that charge. What would be the other force, and how would I describe it? It's a tricky one. So we've got these bases, and they're pretty hydrophobic. They're planar; they have electron density on both faces. So it turns out there is some stability gained from the packing of the steps of DNA, each base pair with the next, with the next. So there are hydrophobic forces, and researchers at Scripps have actually proved this paradigm by making extra
DNA bases that don't have hydrogen bonding partnerships, but just provide a flat hydrophobic entity with the right size that can slip into DNA sequences and make stable adducts. They're not really base pairs anymore, but they're stable in that polymeric structure. Are people understanding and following that?

OK. So finally, when we look at the structure of DNA, there are some trenches where things can bind, where proteins can bind. We talk about the major groove and the minor groove, but I will talk about those later on, when we talk about transcription factors.

Now I just want to go through this in really triple-fast time, and I'll put this on the website. There's tremendous interest in using the building blocks of DNA for information storage and computing. If you look up DNA-based computing on Wikipedia, you'll learn a whole lot about it, because what's so exciting about it is that it's such an organized nanoscale material that can be programmed to base-pair and form certain structures. So across a range of different sizes, there's been a lot of interest in DNA as a material for information storage: not for your genetic material, but for plain old information storage. People have learned how to build structures of DNA where they can construct these sort of cruciform structures by base pairing. They can make the arms of these structures a little bit extended, so you can start joining those things together to make very defined three-dimensional entities. People went kind of nuts doing this sort of stuff, because you can build tetrahedra and other sorts of shapes and sizes, all with strands that base-pair, that are about 10 base pairs long, that are stable, and that only complement certain other sequences. So you could literally build up what they often call DNA origami, because you can build up macroscopic structures just by the assembly of strands of DNA that will ultimately fold to find the best complementary DNA to form the structures. And it's also been
found, as I said, that people went completely nuts: smiley faces and stars and stripes and so on. But the most valuable thing (as I said, you can read more about this) is to use DNA as logic gates to define AND, OR, or NOT, so the sort of three options, and actually use them to program certain puzzles where the DNA will spit out the answer to a particular puzzle through a logic diagram. So those of you who are interested in computing and these kinds of logic puzzles may want to read a little bit more, because DNA is such a reliable non-covalent structure, where those base pairs are incredibly reliable, that you can start envisioning not just building double-stranded DNA, but building all kinds of architectures or programming things with the sequence of DNA. And that's it for today, and that's the end of the biochemistry section.
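
A short worked sketch for anyone who wants to try the exercises from this lecture on a computer: drawing the complementary strand 5 prime to 3 prime, counting hydrogen bonds to compare duplex stability, and checking the one-to-one purine to pyrimidine ratio. This is only an illustration. The function names and the example sequence are made up here and were not part of the lecture, and counting hydrogen bonds is the rough rule of thumb used in class; it ignores the base-stacking contribution that was also discussed.

```python
# Watson-Crick pairing in DNA: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, as given in the lecture:
# a G-C pair forms three, an A-T pair forms two.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(seq: str) -> str:
    """Return the antiparallel partner of seq, written 5' to 3'.

    Pairing base by base gives the partner read 3' to 5' (the
    "upside down" sketch); reversing it restores the 5' to 3' convention.
    """
    paired = [COMPLEMENT[base] for base in seq.upper()]
    return "".join(reversed(paired))


def hydrogen_bond_count(seq: str) -> int:
    """Count the Watson-Crick hydrogen bonds in the duplex formed by seq."""
    return sum(H_BONDS[base] for base in seq.upper())


def purine_to_pyrimidine_ratio(seq: str) -> float:
    """Purine:pyrimidine ratio counted over both strands of the duplex."""
    duplex = seq.upper() + complementary_strand(seq)
    purines = sum(duplex.count(base) for base in "AG")
    pyrimidines = sum(duplex.count(base) for base in "CT")
    return purines / pyrimidines


if __name__ == "__main__":
    strand = "GGATC"  # hypothetical example sequence, read 5' to 3'
    print(complementary_strand(strand))
    print(hydrogen_bond_count(strand))
    print(purine_to_pyrimidine_ratio(strand))
```

Because every purine on one strand sits opposite a pyrimidine on the other, the ratio over the whole duplex comes out one-to-one for any sequence, which is the pattern in Chargaff's data. Comparing `hydrogen_bond_count` for two strands reproduces the two clues from class: more G-C pairs and greater length both raise the count.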