Transcript for:
Understanding the Central Dogma of Biology

[Music] All right. So I am a geneticist and developmental biologist, and I have been at IISER for about 10 to 12 years. I work on an animal called the fruit fly, which you can see on the left-hand side, and I basically model human disease and do genetic experiments using this model organism.

Now, as I said, I would like to start with something which is part of the molecular biology module of what I am supposed to teach: what is called the central dogma of molecular biology. The central dogma of molecular biology was proposed, as I will show you, in the 1950s, which is pretty much 70 years ago, and there are issues with calling the central dogma a dogma, which is again something I'll come to. But I will use it as the centerpiece of whatever I'm going to teach you over the next five to seven lectures. So whenever I talk to you about, for example, biomolecules, I'll refer to the central dogma; whenever we talk about cell biology, I'll refer to the central dogma; and the central dogma will be, sort of, the bones for the flesh which I will be talking about.

Now, a dogma is supposed to be a belief or a set of beliefs held by a group or organization that others are expected to accept without argument. So it is a hard-and-fast set of beliefs which cannot be changed, and as you will realize when I talk to you about the central dogma, that is not what the central dogma is.

The history of the central dogma goes all the way back to the 1950s. The person you see here on the right-hand side is a gentleman called Francis Crick, and he's giving a lecture in 1967 in New York State at an institution called the Cold Spring Harbor Laboratory, or CSHL. In this meeting, as you can see on the blackboard behind him, he is drawing a simple schematic which today we know as the central dogma. The origins of the central dogma as it is drawn over here go all the way back to the late 40s to the early 50s, and here is a snapshot
of a document which Francis Crick wrote in the late 1950s, where he talks about the central dogma. What the central dogma pertains to is basically the flow of information, okay? And these are molecules that at least those of you who have done a little bit of biology till 10th or 12th standard have been exposed to: you know what DNA is, you know what RNA is, and you know what protein is. So this central dogma, which we should actually call the central hypothesis, as you'll realize, pertains to the flow of information.

What it basically says is that information is stored in a molecule called DNA. In the 1950s it was not very clear where information was stored; from the early 1930s and even into the 1950s there was a set of people who believed that proteins were the repository of genetic information. Today we know that genetic information is stored in DNA. We will come back to this era again and again in the next few lectures.

So DNA, as we know today, is the repository of information. From DNA, information is transferred to protein. And this was a hypothesis: it was something which was proposed, it was not clearly proven. It was not clear at that time how information would flow from DNA to protein, which is the executive molecule within the cell. The proposal over here was that it could flow directly from DNA to protein; there could be an RNA intermediate; and the RNA intermediate could also transfer information to protein. What Crick proposed at that time was that never could protein transfer information to protein, never could protein go back and transfer information to RNA, and never could protein transfer information back to DNA.

So the central dogma, as it was proposed, related to the transfer of information: the various possibilities were discussed in the top part, and in the bottom part is what they thought at that point was not possible.
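As a study aid, the transfers just described can be written down as two small sets. This is only a minimal sketch of the scheme as it is summarized in this lecture; the names `POSSIBLE`, `RULED_OUT` and `transfer_status` are illustrative assumptions and do not come from Crick's note.

```python
# Minimal sketch of the information transfers described in the lecture.
# POSSIBLE, RULED_OUT and transfer_status are hypothetical names chosen
# for illustration; they are not terminology from Crick's document.

# Transfers that were discussed as possible at the time.
POSSIBLE = {
    ("DNA", "RNA"),      # an RNA intermediate could receive the information
    ("RNA", "protein"),  # the RNA intermediate could pass it on to protein
    ("DNA", "protein"),  # a direct transfer was also considered possible
}

# Transfers that Crick proposed could never happen.
RULED_OUT = {
    ("protein", "protein"),
    ("protein", "RNA"),
    ("protein", "DNA"),
}


def transfer_status(source: str, target: str) -> str:
    """Classify one transfer of information between two kinds of molecule."""
    pair = (source, target)
    if pair in RULED_OUT:
        return "ruled out"
    if pair in POSSIBLE:
        return "possible"
    # Anything else is simply not part of this summary.
    return "not covered here"


if __name__ == "__main__":
    for source, target in [("DNA", "RNA"), ("DNA", "protein"), ("protein", "DNA")]:
        print(f"{source} -> {target}: {transfer_status(source, target)}")
```

Keeping the two sets separate reflects the way the scheme is presented: one group of transfers that were open possibilities and one group that was excluded outright.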
Okay, and this dotted line over here: even in the 1950s there was a possibility that RNA could transfer information to DNA, and this was based on experiments on viruses which were carried out. Now, people of your generation will see something very strange over here. They'll see that the typed part, in black, is DNA, RNA and protein, but for some reason the arrows are not in black. That is because this was typed on paper with a typewriter, and in the 1950s the technology to use Paintbrush or any other software to actually draw arrows was not there. So things were typed first and the arrows had to be drawn by hand.

Now the central dogma, even though Francis Crick called it the central dogma and we today know it from textbook material as the central dogma, was not really a dogma as it is defined by the Oxford dictionary on the left and the Cambridge dictionary on the right. It was not a fixed set of beliefs. It was a hypothesis, and this hypothesis changed over time. Much of what we know about molecular biology can be understood if, each time you study something new, you imagine the central dogma in front of you and ask yourself: how are things changing? How have we learned more and more about biology from the 1950s to the year 2000?

On the left you see this sort of timeline which starts from 1900, and this is a very important year in biology, especially from the view of molecular biology and genetics. 1900, as we will go into and as you learnt in school, was the year that Mendelian genetics was rediscovered, after more than 30 years, by the Western world. The rediscovery of Mendelian genetics, along with the merging of Mendelian genetics with the evolutionary hypothesis, which happened in the 1930s, somewhere over here, led to a deeper understanding of biology. And the 1940s, 50s, 60s and 70s, which is all this range over here, were what we know as the golden years of molecular biology. Spectacular
discoveries were made in this time, and in my class I will touch upon a few of them, probably five to ten percent of them. That is why over here I call the 1950s to 60s the decade of the great leap, and you will understand what I mean.

Now, from the 1900s to the 1950s the theories of Mendel were incorporated into the mainstream of biology, evolutionary theory was incorporated into the mainstream, and morphology and genetics were being done very routinely in universities and research centers all over the world. By the 1950s George Beadle, who would later get a Nobel Prize, had made this very famous statement, which again you have read in textbooks: one gene, one enzyme. So there was a relationship between genetic material and the making of an enzyme, and enzymes at that time were supposed to be entirely proteins. Even though this much was known, even though we knew what amino acids were, it was not completely clear what the genetic material looked like, and many people did not believe that DNA was the genetic material. Though, as you'll see in this lecture, from the 1950s to the 1960s this was something which was completely proven, and nobody went back to question it pretty much ever again.

Now, as I said, the central theme of today's lecture is the central dogma, and this is the redrawing of the central dogma which you saw in this old typewritten note, in a more modern arrangement. Well, not really modern, but about 10 years later, by Francis Crick in a review. And again this redrawing, which is from 1958, talks about the flow of information from DNA to RNA, information flow from RNA to protein, and the possibility of information flow from DNA to protein. The dotted lines were the less probable events in terms of flow of information, and the unbroken arrows were the more probable events. There were questions being raised about DNA being the repository of information, where all the information was stored, and this would be
something which would be underscored and deeply believed in the 1950s. It was not particularly clear that RNA was indeed an intermediate, that the information in RNA came from DNA and was then translated into protein, but this was, let's call it, an emerging idea, and people started to believe it. As for the formation of protein itself, ribosomes had been discovered and it was believed that these were potential protein factories. But proving all of this happened in the 1950s, and you can read about this in an article which I'll upload, which talks about the 40 years under the central dogma. It's a review from the late 1990s, which is kind of a long time ago now.

Now, a very interesting set of events was happening in the 1940s, 50s and 60s, including the definition of the central dogma, the understanding of DNA replication, the understanding of transcription, which is the formation of RNA from DNA, and the understanding of translation, which is the formation of protein from RNA. Molecular biology was a very exciting thing in those times, and a group of friends led by George Gamow and Jim Watson, who is shown over here, decided to do something a little strange. They decided to form a club. The members of this club came from different countries, though many of them were based in the UK and the US, and these people were very excited about all the new information which was coming out about molecular biology and all the mysteries which surrounded it. So Gamow proposed that a group of people would form what we call the RNA Tie Club, and each member would be given a code name which was basically related to an amino acid. They knew that amino acids were strung together to make polypeptides; this was pretty much well defined by the 1950s, but the mechanistic link between DNA and RNA was not very clear. So George Gamow and Jim Watson got together and started meeting like-minded people, like Alex Rich and Leslie Orgel over here.
And remember, the members of this club did not all stay in a single location. They were from different places; they would write letters to each other, meet each other at meetings, and discuss the latest that was happening in molecular biology at that time. Gamow basically made these ties for everybody, and each member, and these are the members of the club shown over here, got a tie and a designation which was basically an amino acid. There were 20 members and, as you know, there are 20 amino acids, so each member basically got an amino acid designation, with ties of different colors.

Now when you look at these names, the members of the RNA Tie Club, you will realize that many of them are very famous scientists, and these are names you should know about. So, for example, Alexander Rich went on to do very interesting things with DNA structure and with collagen structure. Erwin Chargaff, I hope some of you have heard about: he basically found the equivalence of ratios between the A and T, and the G and C, nucleotides. Jim Watson and Francis Crick, of course, we'll talk about. Many of you have heard of Richard Feynman, who was at Caltech, and Sydney Brenner is somebody you will hear about. Max Delbrück and Melvin Calvin also went on to become Nobel laureates. So this was a set of very smart people who were communicating with each other and who were basically involved in trying to understand the wonders of what was coming out of many experimental labs in the 1940s, 50s and 60s.

Now, Isaac Newton famously said in a letter to Robert Hooke in 1675, "If I have seen further, it is by standing on the shoulders of giants." These people became giants of their fields, and in order to become giants they depended on work done by other giants, so to speak, in the 1920s and the 1930s, and we can go all the way back to the theory of evolution, we can go back to Mendelian genetics. So much of the progress in
science is basically built on breakthroughs which are made in earlier generations. These very famous people depended on researchers before them, and researchers today depend on the science done by these people.

Now George Gamow himself was a very interesting chap. If you look over here you'll realize that George Gamow was actually a physicist, a cosmologist, and he was the student of Niels Bohr. You'll also notice that not everybody over here is a biologist. There are many chemists, because biochemistry in those days was the purview of the chemist, not only the biologist. You'll notice that there is a theoretical chemist, for example Leslie Orgel. Delbrück was a theoretical physicist who went completely and totally into molecular biology, especially phages, bacteriophages. You'll see an odd physicist over here, and you will also see theoretical people who do math, who are mathematicians.

Now Gamow in particular is somebody I will focus on for a few minutes, though each one of these people is interesting. Gamow, in spite of being a cosmologist and working in the area related to the origin of the universe, especially the Big Bang, had a broad interest in science and did all kinds of interesting things apart from just working on cosmology. Gamow is also known today amongst physicists for his very famous paper related to the Big Bang, called "The Origin of Chemical Elements", which is known as the alpha-beta-gamma paper. The reason it's called the alpha-beta-gamma paper is that, while it is of course a very interesting breakthrough paper, it also had as its authors Alpher, Bethe and Gamow. So these were the three authors of the paper. It turned out that when Gamow did this work, he did it with a graduate student called Alpher, and when the final draft of the paper was ready and he was sending it out for publication, he decided that the paper would look so much better if there was a beta in
between alpha and gamma, and he basically added Hans Bethe as an author to this paper, even though Hans Bethe had not contributed at all to any of the work, just for fun. And this "just for fun" is another reason why George Gamow created this club called the RNA Tie Club. It was a means to bring fun into science, and it was also a means by which a group of very interesting people stayed in touch by writing letters, meeting each other at meetings, and furthering the idea of molecular biology.

So this is basically where we were in the 1950s: there was a relationship, in terms of information flow, between the three major macromolecules, which are DNA, RNA and protein. What was not very clear at that time, and this is a paper written by Francis Crick in Nature in 1970, and the cross over here marks the time period we are talking about, was what the relationship between them was in terms of flow of information. I know students are taught in high school that DNA makes RNA and RNA makes protein, but the central dogma was not about what makes what. The central dogma was about what the storage area for information is, since this information is used to build life, and how this information flows. Figure 1 over here talks about all the possibilities of flow of information: DNA to protein, protein to DNA, protein makes protein, and so on and so forth. Figure 2 is the figure you saw earlier, which was drawn in the late 1950s by Francis Crick, which included a possibility of DNA also allowing flow of information to protein. This thing has changed a little bit, and the flow of information directly from DNA to protein has never really been proved. But what was very clear as early as the 1950s was that Crick absolutely believed that information from protein could not be transferred back to DNA or to RNA, and this seems to be fairly clear, though there are many modifications to the central dogma. So here is the central dogma as we know it. DNA can replicate,
and this is shown on the right-hand side. This is where the genomic information is stored, as the ATGC code which you are aware of. DNA replication allows you to take information stored in DNA and make copies of it, and this is something you know happens in cell division, during mitosis and also in meiosis. DNA is copied into two independent parts, and both these independent parts will basically go into two different cells. So the information is conserved from generation to generation by DNA replication.

Now inside a cell, and this is a picture of a cell, information can be passed to RNA, because it turns out that, the way molecular biology and biochemistry have evolved, in a sense, over the last few billion years, there appears to be a need for an intermediate, and DNA cannot pass on its information directly to the executive molecule, which is protein. So information is passed to RNA. RNA is in a form which is very similar to DNA; it's also a nucleic acid, and it is then decoded. The process of making RNA from DNA is called transcription, and the molecule involved is RNA polymerase. When transcription happens, mRNA is created, and in eukaryotic cells the mRNA is in a different compartment from where the ribosomes are; therefore it has to be transferred through nuclear pores to the cytoplasm, where the ribosomes are sitting. The ribosomes will read the RNA, which we now call messenger RNA, and this RNA will now be converted into protein. This process is called translation.

Just for nomenclature's sake, I am going to write down a few things which will be important in the future. So I will call the machine which does replication, and I am calling it a molecular machine, a DNA-dependent DNA polymerase. Replication is carried out by a polymerase which can polymerize nucleotides, but it is dependent on DNA, so it's a DNA-dependent DNA polymerase, and we call it a
molecular machine. Transcription is done by a protein called RNA polymerase, but its full name is DNA-dependent RNA polymerase. Now, you know that RNA can also go back and form DNA. This is the process of reverse transcription, which we will talk about in the next lecture, and we will call that machine an RNA-dependent DNA polymerase, right? So you can start with RNA and you can make DNA. And you can take RNA and you can make protein. You know the ribosome does it, but let's just give it a name: let's call it an RNA-dependent amino acid polymerase. Well, polymerase is not a great term. It can't be a synthase either, because it's not synthesizing amino acids; it's just stitching them together, and that is what the ribosome does. Fine.

I'll now show you a movie, which is from the Cold Spring Harbor Laboratory, a famous laboratory in New York where in the 1950s meetings were routinely held, and Jim Watson, who we'll talk about in the next few slides, actually became for many years the director of the Cold Spring Harbor Laboratory. All right, so let's see the movie.

The DNA double helix contains two linear sequences of the letters A, C, G and T, which carry coded instructions. Transcription of DNA begins with a bundle of factors assembling at the start of a gene to read off the information that will be needed to make a protein. The blue molecule is unzipping the double helix and copying one of the two strands. The yellow chain snaking out at the top is a close chemical cousin of DNA called RNA. The building blocks to make the RNA enter through an intake hole. They are matched to the DNA letter by letter to copy the gene. At this point the RNA needs to be edited before it can be translated into a protein. This editing process is called splicing, which involves removing the green non-coding regions called introns, leaving only the yellow protein-coding exons. Splicing begins with assembly of factors at the intron-exon borders, which act as beacons to guide
small proteins to form a splicing machine called the spliceosome. The animation is showing this happening in real time. The spliceosome then brings the exons on either side of the intron very close together, ready to be cut. One end of the intron is cut and folded back on itself to join and form a loop. The spliceosome then cuts the RNA to release the loop and join the two exons together. The edited RNA and intron are released and the spliceosome disassembles. This process is repeated for every intron in the RNA. Numerous spliceosomes remove all introns so that the edited RNA contains only exons, which are the complete instructions for the protein. Again, this is happening in real time. When the RNA copy is complete, it snakes out into the outer part of the cell. Then all the components of a molecular factory called the ribosome lock together around the RNA. It translates the genetic information in the RNA into a string of amino acids that will become a protein. Special transfer molecules, the green triangles, bring each amino acid to the ribosome. Inside the ribosome the RNA is pulled through like a tape. There are different transfer molecules for each of the 20 amino acids, shown as small red tips. The code for each amino acid is read off the RNA three letters at a time and matched to three corresponding letters on the transfer molecules. The amino acid is added to the growing protein chain, and after a few seconds the protein starts to emerge from the ribosome. Ribosomes can make many proteins; it just depends what genetic message you feed in on the RNA.

All right, so you saw an animation of what happens inside the nucleus, where the DNA-dependent RNA polymerase sits on DNA along with many, many transcription factors. It copies all the information to a yellow nucleic acid strand, which is RNA. We'll come to RNA splicing a little later, but the RNA is edited, and I'll explain why it is edited for those of you who have not done this earlier, and the edited RNA, which is the yellow strand,
comes out of the nucleus, goes into the cytoplasm and finds a ribosome. The ribosome reads the hidden code, the genetic code in the RNA, which is originally stored in DNA, and then makes a protein. The protein, which is a linear sequence of amino acids, then folds, and then it has an executive role, and this executive role is very important for life per se. Unless you have all the information stored in DNA, you cannot go through the processes of transcription and translation, which are part of the central dogma as proposed by Crick, and you cannot make proteins; and if you don't have proteins, you basically cannot have life.

All right, so now that you have a visualization of how the different molecular mechanisms in the central dogma happen, you have an idea about the flow of information. Let's now go back to 1953. 1953 again is a very critical and major year in molecular biology. One element of the central dogma is DNA, and what we'll do now is focus a little bit on DNA. What you see here is a historic picture taken on May 21, 1953, with Watson on the left and Crick on the right, standing in front of their model of what the DNA structure looked like. They built and proposed this model in a very famous paper in Nature in 1953, which is shown on the right-hand side. It was basically a one-page paper which spilled over onto the second page a little bit, a few hundred words, and today we know this paper was very accurate about the structure of DNA.

So did they solve the structure of DNA? No, they did not. Were they the first to model DNA? They were not. They in fact had modeled DNA multiple times in the two years before this paper was published, and they had modeled it in different ways; it was just that this model made a lot of sense to them, and they thought it was the correct model of DNA. Had anybody else modeled DNA? Yes, every chemist in the world had tried to model DNA. It was the holy grail of molecular biology, and everybody
knew that modeling DNA would be a great discovery. So what I'll do in the next few slides is tell you a little bit of the story of how DNA was modeled and what the background behind it was.

So the year is 1953, as shown on the timeline. These two people are very happy because they think their model is the model of DNA, and they will be even happier nine years from this point, because nine years from 1953, which is 1962, they will receive the Nobel Prize for making this model. Now let's learn a little bit about both these people, Jim Watson and Francis Crick. Jim Watson is 23 years of age, not too far away from your age, a few years older. He's done his PhD already, he's from the United States, and he's come to Cambridge to study in the Cavendish Laboratory. Francis Crick, at 35 years of age, is a PhD student, and he is doing his PhD at the University of Cambridge. They are about 12 years apart in age. They meet when Watson comes to Cambridge; they hang around together, eat lunch together and become friends, and they start taking an interest in DNA per se and also in the potential structure of DNA. Neither of them actually does any serious experimental work. They in fact go around looking at data which is already published and start trying to think about what DNA would look like in terms of its molecular structure.

In early 1953 the famous Linus Pauling, who is in the United States, publishes a DNA structure, and this structure comes out in Nature. In that structure there are three helices of DNA and all the nucleotide bases are actually facing outside. Now Watson and Crick are busy trying to build their own structure, and what seems to be a very key point in their solution of the final structure is a visit to a nearby college called King's College in London, which is a few hours away from Cambridge. There they meet Maurice Wilkins, who is an experimentalist and who is doing an experiment called fiber
diffraction. What fiber diffraction involves is that you extract nucleic acid from cells. Nucleic acid in water is very viscous, and if you put a glass rod into pure nucleic acid in solution, you can actually wind it out just like you can wind out a rope. What experimentalists at King's College are doing is pointing X-rays at this fiber and getting what are known as fiber diffraction patterns, something I will show you in the next class. Amongst the people working at King's College is a scientist called Rosalind Franklin, and Rosalind Franklin has very good experimental skills. Early in 1953, when Crick visits Maurice Wilkins, who is a friend of his, Maurice Wilkins shows him the data collected by Rosalind Franklin, and that data is the cleanest fiber diffraction data which Francis Crick has seen in the last three to four years. Because it is a very clean picture, some things immediately become obvious about the structure of DNA, and Crick goes back, sits down with Watson and says that, based on this picture, there are certain restraints and constraints which have to be in there. They use that as the key breakthrough. Please remember that it is not the only reason: there are many other pieces of data out there, and both Watson and Crick have been busy using these and trying to model DNA for a significant amount of time. But that seems to give them additional insight, and they build a model of DNA within three months and publish it, I think by May or June, around that time, in Nature.

Now this structure has certain key features which I will now describe very quickly. Very obviously, there are two strands which are antiparallel, which most of you know by now. The phosphate bonds connect the nucleotides together, and I'll show you pictures of nucleotides in a few slides. And they basically use Chargaff's information to work out how A-T and G-C are hydrogen-bonded together. I'll show you this later. And, very
obviously, data collected by Rosalind Franklin and previously published data become fairly important for their discovery. So this is the paper, on 25th April 1953 in Nature, and this is a schematic on the right-hand side showing the antiparallel strands, 5 prime to 3 prime and 3 prime to 5 prime. Again, these are concepts I'll explain for those of you who are not aware of them. And these lines which you see across here are basically the bases, which are forming hydrogen bonds. So this is a simplistic representation of the double helix, which we know fairly well. And this is how their paper starts: "We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest." They basically rebut the Pauling and Corey structure, which has just come out two months earlier in Nature, and they say that it is completely wrong. And over the next 10 years, data collected by crystallography, not just on fiber diffraction but also on actual crystals of DNA, confirm that this form of DNA, which we today know as the B form of DNA, is actually an accurate model of DNA.

Now many scientists forget that the 1953 issue of Nature, which is the April 25, 1953 issue, contains not one paper but three papers, three very important papers. The first paper is the easiest to read; it is the shortest, and it very clearly and simply spells out what the model of DNA is. Also, the paper immediately following the Watson-Crick paper is the Wilkins-Stokes paper, which also talks about the molecular structure of DNA, and which is followed by Rosalind Franklin's paper, which actually shows the fiber diffraction image which has become very, very famous. So it is these three papers together which form the central tenet of DNA structure, with the model proposed by Watson and Crick, which stands to this day, data
collected by Rosalind Franklin, and also other data collected by Maurice Wilkins.

Now on the left is a cell, and shown in orange is the nucleus. As you know, the nucleus contains many, many chromosomes, and these chromosomes, when you envision them, will basically each have a single molecule of double-stranded DNA, with two strands running antiparallel to each other and connected by hydrogen bonding, which is shown over here. Now there is a lot of DNA in cells. Chromosomal DNA in mammals has about 10 raised to 9 base pairs, which is a one followed by a lot of zeros. Plants have even more DNA in the nucleus, 10 raised to 11 base pairs. Mitochondrial DNA is smaller, about 10 raised to 3 to 10 raised to 4 base pairs, and chloroplast DNA is also not very long, it's about 10 raised to [inaudible]. And this is the central repository of all information; this information is transcribed and translated to make proteins, and it is the proteins which do much of the work inside the cell.

For those of you who haven't seen the unit of DNA, it's basically a nucleotide. It contains a base, in green over here, a sugar, in blue, and the phosphate group, and the connectivity between the phosphate groups, with A, T, G and C, which come from the four nucleotide triphosphates, ATP, GTP, CTP and TTP in their deoxy forms, is basically what makes up a strand of DNA. [Music] [Applause]
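The antiparallel, hydrogen-bonded pairing of the two strands can be tried out in a few lines of code. This is a minimal sketch under simple assumptions: a strand is a string of the letters A, T, G and C written 5 prime to 3 prime, the example sequence is made up, and the function names `reverse_complement` and `duplex_base_counts` are illustrative, not taken from the lecture.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, also written 5 prime to 3 prime.

    The two strands run antiparallel, so the partner is built by walking
    the input backwards and swapping each base for its pairing base.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> Counter:
    """Count the bases over both strands of the double helix."""
    return Counter(strand.upper()) + Counter(reverse_complement(strand))


if __name__ == "__main__":
    top = "ATGGCCATTG"  # hypothetical sequence, for illustration only
    bottom = reverse_complement(top)
    counts = duplex_base_counts(top)
    print("top strand    5'-" + top + "-3'")
    print("bottom strand 5'-" + bottom + "-3'")
    # In the duplex, A and T occur equally often, and so do G and C.
    print("A equals T:", counts["A"] == counts["T"])
    print("G equals C:", counts["G"] == counts["C"])
```

Because every A on one strand faces a T on the other, and every G faces a C, the counts over the whole duplex come out equal in pairs, which matches the equivalence of ratios attributed to Chargaff.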