Amino Acids and Protein Structure

So let's get started by looking at the periodic table, which you have probably already learned in your high school chemistry or integrated science. The table shows all the elements that are found on Earth. However, only a small subset of these elements is found in living organisms. The red boxes represent the abundant elements found in living organisms. Together they account for about 97% of the weight of most living organisms, although their relative amounts vary among different organisms. These abundant elements (carbon, hydrogen, nitrogen, oxygen, phosphorus and sulfur) can form stable covalent bonds to make the different types of biomolecules that are going to be discussed in this module.
In addition, there are 23 other elements, making up about 3% of the body weight in living organisms. They include five essential elements, colored in purple here. Four of them are metal ions and the other one is a non-metal ion. There are also a number of elements that are described as trace elements. Those indicated in the darker blue boxes are more common, while those in the light blue boxes are less common.
There are four major types of biomolecules. In each group there are a number of members that share similar structural or chemical properties. The simplest biomolecules are amino acids. For carbohydrates, the simple forms are called monosaccharides, or simply sugars. Nucleotides are the structural units of DNA and RNA, and lipids are a diverse collection of molecules that are water insoluble.
In living cells, biomolecules are often arranged as biopolymers, or biological polymers. There are three major types. Amino acids can be linked together to form proteins, or polypeptides. Simple sugars can be linked together to form more complex carbohydrates called polysaccharides. Nucleotides can be linked together to form nucleic acids, or polynucleotides as they may be called, in two forms: DNA and RNA.
Now you need to notice that lipids, which are structurally very diverse and in some cases complicated, do not form polymers. Only amino acids, carbohydrates and nucleotides can form these different types of polymers.
There are some terms that I want you to know. The simple forms are referred to as monomers when they exist as individual molecules. But once they are linked together to form a polymer, each unit is called a residue. For example, in proteins the monomers are amino acids, meaning individual free amino acids. But when they are linked together to form a polypeptide, each unit is called an amino acid residue. Similarly, for a polysaccharide the monomers are the simple sugars. When these free simple sugars are linked together to form a polymer, each individual unit is called a sugar residue.
Amino acids are the simplest biomolecules in the cell. They are the monomers of proteins. They are called amino acids because they have an amino group and a carboxyl group, so an amino acid is a carboxylic acid with an amino group. Both of these groups are attached to the same carbon, and this carbon is called the alpha carbon. In the IUPAC nomenclature, however, the carboxyl carbon is the number one carbon, while the alpha carbon is the second carbon. There are two other substituents covalently linked to the alpha carbon: a hydrogen and an R group. The R group is variable; different amino acids have different R groups.
Down here are some structural properties of an amino acid. The electrons are shared between the two oxygens of the carboxyl group, although one of the bonds is drawn as a double bond, and the covalent linkages are shown in different orientations, which reflects the stable structure of the amino acid molecule. Shown on the right is a ball-and-stick model.
The different balls represent the different elements, as indicated by the different colors. This is the alpha carbon, with the carboxyl group and the alpha amino group attached, and this one is the side chain, or R group. The side chain here contains a carbon. The carbon in the side chain that is attached to the alpha carbon is referred to as the beta carbon.
Most amino acids are chiral molecules because of the asymmetry of the alpha carbon. If the four substituents attached to the alpha carbon are all different, there are two configurations for that amino acid, and they are mirror images of each other. However, you cannot superimpose one on the other, no matter how you rotate the chiral molecule, because they are simply not the same structurally. In an achiral molecule, on the other hand, two of the substituents are the same. If you rotate this molecule, you can superimpose it onto its mirror image. In other words, they are the same, so there is only one configuration for an achiral molecule.
There are two structural forms for all chiral amino acids, which are mirror images of each other. They are described as stereoisomers, and the two forms are the L and D enantiomers. L stands for levo, meaning left; D stands for dextro, meaning right. Let's use alanine as an example to illustrate the difference between an L and a D isomer. These are the different structural representations of the two forms of alanine. This is the alpha carbon. You put the carboxyl group at the top and the R group, the side chain, which in this case is the methyl group, at the bottom. If the amino group attached to the alpha carbon is pointing to the left, that is the L form, the L enantiomer. If the alpha amino group is pointing to the right, that is the D enantiomer. That is how you differentiate the L enantiomer from the D enantiomer of an amino acid.
There are 20 amino acids that are commonly found in proteins, and the chiral amino acids that occur in proteins are all L enantiomers. Can you quickly identify which one of these two is an L enantiomer? Yes, it's the one on the left.
Amino acids differ from each other in their R group, or side chain, and the 20 amino acids can be classified into different groups based on the chemical nature of their side chains. This slide shows the common names of the 20 amino acids with their meanings; all of this information was taken from the dictionary, so it is not necessary for you to memorize it. But there is some interesting information here. For example, asparagine gets its name from asparagus because it was first isolated from this vegetable, although asparagine is present in all organisms. Leucine gets its name from the Greek word leukos, which means white, because this amino acid forms white crystals. You can read through the rest of the information out of your own interest.
I do want you to pay attention to one term: essential amino acids. Obviously all amino acids are important, because we need them all to make proteins. The essential amino acids are the ones we are not able to make in our body. We have to obtain them from the diet so that we can make all the different types of proteins. All the others are non-essential amino acids. We can synthesize them in our body, so it is not necessary to obtain them from the diet.
The first group of amino acids has side chains that are non-polar and aliphatic, which means having an open hydrocarbon chain. Glycine is the simplest amino acid. The R group is just a hydrogen. That means this compound is not chiral, because two of the substituents are the same. In fact, glycine is the only amino acid that is achiral. That means there are no L or D enantiomers for glycine.
All the other amino acids have a D isomer and an L isomer. Alanine, valine, leucine and isoleucine all have aliphatic hydrocarbon side chains with different levels of complexity. Methionine has a straight side chain containing a sulfur, and this functional group is called a thioether. Methionine is therefore a sulfur-containing amino acid, and it is actually the dietary source of sulfur. We have different types of sulfur-containing biomolecules in our body, and we obtain our sulfur in the form of methionine from our diet. In other words, we can make the other sulfur-containing organic molecules that we need as long as we have methionine from our diet.
Finally, proline is also an amino acid with a non-polar aliphatic side chain, but the structure is a little bit different. You can see this is the side chain, and the carbon here is also linked to the alpha amino group. For all the other amino acids, the alpha amino group is free, but here it is not. Sometimes we call it a secondary amino group. This ring structure is called a pyrrolidine ring. Strictly speaking, proline is an imino acid rather than an amino acid because this group is not a free amino group. Nevertheless, we consider it one of the amino acids because it occurs together with the other amino acids in proteins.
I would like to add some more information about these three amino acids: valine, leucine and isoleucine. Their side chains contain branches, so they are called branched-chain amino acids, or BCAAs. You can tell that these are branched chains by looking at some of the carbons here: this one, this one and this one. Each of these carbons is attached to three other carbons.
That is what gives them these carbon-containing branches. BCAAs are actually sold as a supplement taken by people who do vigorous physical exercise, such as athletes or bodybuilders. You can read the label information to see that the product contains these three branched-chain amino acids in addition to other ingredients. Muscle proteins contain high levels of BCAAs. During vigorous exercise, some of these proteins are degraded to release BCAAs, which are then metabolized for use as an energy source. That is why people who do vigorous exercise take BCAAs as a supplement. They may or may not know the reason behind it, so next time you see a friend taking this supplement, it may be a good time for you to educate them.
The second group of amino acids has positively charged side chains. These side chains are hydrophilic. The first example is lysine. It has a long side chain with a terminal amino group that is positively charged. This is the second amino group in the molecule, in addition to the alpha amino group that is present in all amino acids. I want to draw your attention to these carbons labeled with different Greek letters. This is the alpha carbon, as I said earlier, and the next carbons are beta, gamma, delta and epsilon. The amino group is attached to the epsilon carbon.
The next example is arginine. Again, it has a fairly long side chain, and the carbons are named with these Greek letters. It has a functional group called the guanidinium ion, with three nitrogens, and this group is positively charged.
The third example is histidine. The side chain contains an imidazole ring with two nitrogens. If you look at the structure here, there is no charge. This is because this nitrogen is only positively charged if the pH is lower than six.
So at physiological pH, around seven, the histidine side chain is not charged. But when the pH drops below six, this nitrogen becomes protonated, giving a positive charge.
The next two amino acids, aspartate and glutamate, have negatively charged side chains. They are also hydrophilic. In the side chain you can see a carboxyl group at the terminal position. This is a second carboxyl group, in addition to the alpha carboxyl group that is present in all amino acids. These two amino acids are similar in terms of their side chain structure.
The amino acids shown here all have side chains that are polar but uncharged. These side chains do not carry any charge. However, they contain functional groups that are polar, for example amide groups, hydroxyl groups or thiol groups. These functional groups can form hydrogen bonds with water, which makes them polar even though they are not charged.
The last group of amino acids has aromatic side chains, or R groups. You can probably recognize the aromatic rings. Overall, these side chains are very hydrophobic, especially compared to the ones that are charged or have polar groups. Phenylalanine and tyrosine are similar except for the presence of the hydroxyl group in tyrosine. Tryptophan contains a fused double-ring structure called an indole group. Tryptophan and tyrosine can absorb UV light, with maximum absorption at 280 nanometers, and that property is useful for determining protein concentration. This is a UV absorption profile for a protein solution. You can see a peak here at about 280 nanometers. This is because of the presence of the amino acids tryptophan and tyrosine in proteins. You can see that the absorption profiles are quite similar, with a peak at around the same wavelength, 280 nanometers.
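One common way to turn an absorbance reading at 280 nanometers into a concentration is the Beer-Lambert law, A = ε × c × l. The lecture itself does not name this law, so treat the following as a minimal sketch under that assumption. The function name and the extinction coefficient used in the example are hypothetical; a real protein's coefficient depends on how much tryptophan and tyrosine it contains.

```python
def concentration_from_a280(absorbance, molar_extinction, path_length_cm=1.0):
    """Estimate molar protein concentration from absorbance at 280 nm.

    Uses the Beer-Lambert law, A = epsilon * c * l, solved for c.
    molar_extinction is in M^-1 cm^-1 and path_length_cm is in cm.
    """
    if molar_extinction <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (molar_extinction * path_length_cm)


# Hypothetical protein with an assumed extinction coefficient of 50,000 M^-1 cm^-1
example_molar = concentration_from_a280(0.5, 50_000.0)
print(f"{example_molar:.2e} M")
```

Doubling the absorbance doubles the estimated concentration, which is the proportionality the absorption profile relies on.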
For that reason, we can use UV absorbance at 280 nanometers to determine, or to estimate, the protein concentration in a solution. The higher the absorbance, the higher the concentration of protein.
Next I'm going to talk about the ionization properties of amino acids. Each amino acid has at least two groups that are ionizable, meaning they can become charged. The two groups are the alpha carboxyl group and the alpha amino group. Again, these two groups are present in all amino acids. There are at least three different ionic forms of an amino acid. At physiological pH, the alpha carboxyl group is negatively charged while the alpha amino group is positively charged. If the pH is low, the carboxyl group accepts a proton and becomes neutral. If the pH is high, the alpha amino group loses a proton and becomes neutral. K1 and K2 are the dissociation constants for the carboxyl group and the amino group respectively. The dissociation refers to the proton, the hydrogen ion. The p in pK means negative log. If the pH is equal to pK1, then form two and form one are at equal concentration. Similarly, K2 is the dissociation constant for the alpha amino group, so if the pH is equal to pK2, form three and form two are at equal concentration.
Now let me explain a little about the relationship between the dissociation constant and pH. An ionizable group may exist in one of two forms. It may be neutral when it is protonated and become negatively charged after losing a proton. Alternatively, the group may be positively charged when it is protonated and become neutral after losing a proton.
The dissociation constant can be expressed in terms of the concentration of the proton (the hydrogen ion), the concentration of the proton acceptor and the concentration of the proton donor. These are the proton donors and these two are the proton acceptors. After going through these four mathematical conversions, you get the equation down here: pH = pKa + log([proton acceptor] / [proton donor]). If the pH is equal to the pKa, the logarithm of the ratio of the proton acceptor concentration to the proton donor concentration is equal to zero. If the logarithm of this ratio is zero, the two are at equal concentration. That explains why, at a pH equal to the pKa, the proton acceptor concentration is the same as the proton donor concentration.
The ionization behavior of a molecule can be described by a titration curve. Let's use acetic acid, which is a weak acid, as an example. After acetic acid loses a proton, it becomes the negatively charged acetate ion. The titration curve shows the change in pH of the solution upon gradual addition of a strong base. As you can see, in the initial stage the increase in pH is quite rapid, but this is followed by a more gradual increase in pH. This is described as a flattening zone, and it is sometimes referred to as the buffering zone. It is the pH region in which the addition of more alkali does not change the pH abruptly. At the end, when more and more alkali is added, the pH goes up more rapidly again. The pKa value gives an indication of whether the protonated form or the deprotonated form is more dominant. At the midpoint of this titration curve, the pH is equal to the pKa.
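The relationship between the dissociation constant and pH described above can be written out step by step. This is the standard Henderson-Hasselbalch derivation; the symbols HA (proton donor) and A⁻ (proton acceptor) are notation chosen here for illustration and may not match the slide. For a group such as the amino group, the donor is positively charged and the acceptor is neutral, but the algebra is the same.

```latex
\begin{align}
K_a &= \frac{[\mathrm{H^+}]\,[\mathrm{A^-}]}{[\mathrm{HA}]} \\
[\mathrm{H^+}] &= K_a\,\frac{[\mathrm{HA}]}{[\mathrm{A^-}]} \\
-\log[\mathrm{H^+}] &= -\log K_a - \log\frac{[\mathrm{HA}]}{[\mathrm{A^-}]} \\
\mathrm{pH} &= \mathrm{p}K_a + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}
\end{align}
```

Setting pH equal to pKa in the last line forces the logarithm to zero, so the acceptor and donor concentrations must be equal.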
At this midpoint, the concentration of the proton donor form and the concentration of the proton acceptor form are equal. So the two forms are equal in concentration when the pH is equal to the pKa. If the pH falls below the pKa, within this region, most of the molecules exist in this form, the protonated form. If the pH exceeds the pKa, in this region, most of the molecules are in this form, the deprotonated form. So for an ionizable group, if the pH falls below the pKa it is mostly protonated, and if the pH is higher than the pKa it is mostly deprotonated.
Now let's look at the titration curve for an amino acid, using alanine as an example. There are two ionizable groups in alanine, and as you can see, there are two distinct stages in the titration curve. The first stage reflects the ionization behavior of the alpha carboxyl group, while the second stage reflects the ionization behavior of the alpha amino group. There are two pKa values. pK1, which is 2.4, is the pKa of the alpha carboxyl group. When the pH is higher than 2.4, most of the alpha carboxyl groups are deprotonated, and a deprotonated carboxyl group is negatively charged. For the alpha amino group the pKa is much higher, 9.9. If the pH goes above 9.9, most of the alpha amino groups are deprotonated as well, but in this case the deprotonated alpha amino group is neutral. So you can see that these two groups behave quite differently in terms of whether or not they are charged. The alpha carboxyl group is negatively charged when it is deprotonated.
The deprotonation of the alpha carboxyl group, that is, the loss of its proton, occurs at a relatively low pH, whereas the alpha amino group loses its proton only when the pH is much higher. If the pH is higher than 9.9, most of the alpha amino groups are in the deprotonated form, which is neutral. The alpha amino group is positively charged when it is in its protonated form.
Then there is another parameter, called the pI. The pI of alanine is the pH at which the net charge of the molecule is zero. It is also called the isoelectric point. This is the point where all of the alanine exists in this form, which is called a zwitterion, meaning a hybrid ion: the same molecule carries two different charges, so the net charge of the molecule is zero. This occurs when the pH is equal to 6.15, the midpoint between the two pKa values.
All amino acids have at least two pKa values because there are at least two ionizable groups, the alpha amino group and the alpha carboxyl group. However, some amino acids have three pKa values because the R group, the side chain, may also be ionizable. Let's look at some examples. For aspartic acid and glutamic acid, the carboxyl group in the side chain can also be ionized. The protons can be lost and these groups become negatively charged. For lysine and arginine, the side chain contains an ionizable group that may also lose its proton. The protonated forms are positively charged, but once they lose the proton they become neutral. Similarly, for cysteine and tyrosine, the hydrogen in the side chain can be lost as a proton, and the side chain becomes negatively charged.
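For an amino acid with only two ionizable groups, such as alanine, the isoelectric point described above is simply the midpoint of the two pKa values. The short Python sketch below checks that arithmetic with the values quoted for alanine. The function name is made up for illustration, and the midpoint rule as written does not apply to amino acids with an ionizable side chain.

```python
def isoelectric_point_two_groups(pk1, pk2):
    """pI of an amino acid with exactly two ionizable groups.

    With only the alpha carboxyl and alpha amino groups, the zwitterion
    predominates halfway between the two pKa values.
    """
    return (pk1 + pk2) / 2


# Alanine values quoted in the lecture:
# pK1 = 2.4 (alpha carboxyl group), pK2 = 9.9 (alpha amino group)
alanine_pi = isoelectric_point_two_groups(2.4, 9.9)
print(round(alanine_pi, 2))
```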
So all of these amino acids have three ionizable groups: the alpha amino group, the alpha carboxyl group and the side chain. Since all three groups are ionizable, these amino acids have three different pKa values. All the ionizable groups shown in these figures are in their protonated forms. That means the protons are still there; they have not dissociated yet. These protons will dissociate if the pH increases.
The amino acid histidine has three ionizable groups: the alpha carboxyl group, the alpha amino group and the side chain. Therefore, there are three pKa values, and in the titration curve you can see three distinct stages, each corresponding to one of the ionizable groups. There are four ionic forms of histidine. This is the one at the lowest pH, where the three ionizable groups are all protonated. The alpha carboxyl group is neutral, while the alpha amino group and the side chain are both positively charged, so the net charge of this molecule is +2. As the pH increases, the alpha carboxyl group, which has the lowest pKa value, 1.8, starts to be deprotonated and becomes negatively charged. The molecule now has a net charge of +1. As the pH increases further, the side chain, which has the next lowest pKa value, 6.0, is also deprotonated and becomes neutral. The molecule now has a net charge of zero. When the pH increases further still, the alpha amino group, which has the highest pKa value, is also deprotonated and becomes neutral. So the overall net charge of this molecule at very high pH is -1.
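The stepwise reasoning above, where each group loses its proton once the pH passes its pKa, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pKa values 1.8 and 6.0 are the ones quoted for histidine, while 9.2 for the alpha amino group is an assumed, commonly cited value that the lecture does not state. The names `HISTIDINE_GROUPS` and `predominant_net_charge` are hypothetical.

```python
# (pKa, charge when protonated) for each ionizable group of histidine.
# 1.8 and 6.0 are quoted in the lecture; 9.2 is an ASSUMED value for the
# alpha amino group, which the lecture only calls "the highest pKa".
HISTIDINE_GROUPS = [
    (1.8, 0),   # alpha carboxyl: neutral when protonated, -1 after losing H+
    (6.0, +1),  # imidazole side chain: +1 when protonated, neutral after
    (9.2, +1),  # alpha amino (assumed pKa): +1 when protonated, neutral after
]


def predominant_net_charge(ph, groups):
    """Net charge of the predominant ionic form at a given pH.

    A group is treated as protonated when pH < pKa and as deprotonated
    otherwise; losing a proton lowers that group's charge by one.
    """
    total = 0
    for pka, charge_protonated in groups:
        if ph < pka:
            total += charge_protonated
        else:
            total += charge_protonated - 1
    return total


for ph in (1, 3, 7, 10):
    print(ph, predominant_net_charge(ph, HISTIDINE_GROUPS))
```

The all-or-nothing cutoff is a simplification. At a pH exactly equal to a pKa, the two neighboring forms are actually present in equal amounts, so there is no single predominant form.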
So if you are given the pH, you should be able to determine the net charge of the predominant form, the major form, of histidine. For example, pH 1 is below the pKa of the alpha carboxyl group, so most of the histidine exists in this form, and the net charge of the predominant form is +2. If the pH is 3, which is between these two pKa values, the predominant form is this one, the second ionic form, which has a net charge of +1. Similarly, at pH 7, which is between these two pKa values, this is the predominant form of histidine, with a net charge of zero. Finally, pH 10 is above the last pKa value, the one for the alpha amino group, so this ionic form is the predominant form of histidine, and the net charge of that species is -1.
Let us look at another amino acid that also contains an ionizable side chain. The example shown here is aspartate. Aspartate has a different side chain, which becomes negatively charged after losing a proton. There are three pKa values for aspartate. As you can see down here, the titration curve contains three distinct stages, each corresponding to one of the ionizable groups, and up here the net charges of the different ionic forms are shown. Based on what we have just discussed for histidine, can you now determine the net charge of the predominant form of aspartate at these different pH values? I want you to try them out on your own, and we can discuss this later on.
Here are two more examples of amino acids with ionizable side chains. Arginine has a positively charged side chain, so there are three pKa values. pK1 refers to the alpha carboxyl group. With increasing pH, it starts to dissociate its proton.
In arginine, this is followed by the alpha amino group, which has the second-highest pKa value. The side chain has a very high pKa value, so it is only deprotonated at very high pH. The final, fully deprotonated ionic form is shown here.
The other example, cysteine, contains an ionizable side chain that can dissociate to become negatively charged. Again there are three pKa values. As the pH increases, the group with the lowest pKa value, the alpha carboxyl group, is deprotonated and becomes negatively charged. That is followed by the group with the second-lowest pKa value, which is the side chain here, and it also becomes negatively charged. Finally, as the pH goes up further, the alpha amino group, which has the highest pKa value, is also deprotonated. That is the fully deprotonated form of cysteine at very high pH.
Now let's move on to peptides and proteins. These are polymers formed by amino acids that are linked to each other. The linkages between amino acids are called peptide bonds. A peptide bond is formed by a reaction between the alpha carboxyl group of one amino acid and the alpha amino group of the adjacent amino acid. When these two groups react, one water molecule is lost, so this is a condensation reaction, and a linkage is formed between the carbon and the nitrogen. This is the peptide bond, and it is described as having the trans configuration. Trans refers to the two side chains of the adjacent amino acids, which are on opposite sides relative to the peptide bond.
Shown here is the structure of a tetrapeptide, which contains four amino acid residues. By looking at the structure, can you identify the peptide linkages? It should be this one, this one and this one. So there are three peptide linkages in a tetrapeptide.
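The counting in this example generalizes: a linear peptide of n residues has n - 1 peptide bonds, and since each bond forms by condensation, n - 1 water molecules are released. A trivial Python sketch of that rule follows; the function name is hypothetical, and the rule assumes a linear (not cyclic) peptide.

```python
def peptide_bond_count(n_residues):
    """Number of peptide bonds in a linear peptide of n_residues.

    Each bond joins two adjacent residues, so a chain of n residues has
    n - 1 bonds. The same number of water molecules is released, because
    each bond forms by a condensation reaction.
    """
    if n_residues < 1:
        raise ValueError("a peptide needs at least one residue")
    return n_residues - 1


# A tetrapeptide has four residues
print(peptide_bond_count(4))
```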
As you can see, the one-way sign here illustrates the fact that all proteins and peptides are unidirectional. One end is called the N terminus because it contains a free alpha amino group, while the other end is called the C terminus because it contains a free carboxyl group. All the other alpha amino groups and alpha carboxyl groups along the polypeptide participate in the formation of the peptide linkages.
For each amino acid, you can use a three-letter abbreviation or a one-letter symbol to describe it. The sequence of a protein or a peptide is represented by these letters, each representing an amino acid, and the sequence is unidirectional: it is always written from the N terminus to the C terminus. So E here is the amino acid at the N terminus and K is the amino acid at the C terminus. The sequence is also numbered in this manner. The residue at the N terminus is number one, followed by number two, number three and number four, and if the sequence is longer you just keep counting.
There are different terms for describing polypeptides. Oligopeptides are peptides containing just a few amino acids: dipeptide, tripeptide, tetrapeptide. The name tells you how many amino acids each of them contains. If a peptide contains more than 40 amino acid residues, you can call it a polypeptide. Usually the molecular weight is smaller than 10,000. Proteins are bigger polypeptides. They have higher molecular weights, and in some cases more than one polypeptide may associate to form the mature protein molecule.
This table shows the three-letter abbreviation and the one-letter symbol for each of the 20 amino acids. There is no need for you to memorize this table; this information will be provided during the quiz or the final exam.
I would like to talk about a very special dipeptide called aspartame. It is used as an artificial sweetener.
It is actually a synthetic dipeptide. It is not naturally occurring; it is made in a laboratory. By looking at the structure, can you quickly identify where the peptide linkage is? I'll give you three seconds. One, two, three. It should be right here. This is the peptide linkage. The amino acid residue at the N terminus is aspartate, while the other amino acid residue is actually a modified phenylalanine: it is phenylalanine methyl ester. So the C terminus is not free. The carboxylic acid has been esterified with the attachment of a methyl group. This dipeptide is very sweet, and once it gets inside our body it is digested to release phenylalanine, methanol and aspartate.
Aspartame is used in products like Coca-Cola Zero or the sugar substitutes that we use in our coffee or tea. In these products, a warning statement must be included on the label. The statement warns people with a condition called phenylketonuria that the product contains phenylalanine, because it will be released by the digestion of aspartame. What might be the problem for people with phenylketonuria? People with this condition, which is a metabolic disease, cannot degrade phenylalanine properly. If their diet contains too much phenylalanine, they accumulate toxic metabolites that affect the nervous system, resulting in intellectual disability. This is particularly damaging to young children, whose brains are still growing and developing. So the warning must be included because it is very important information for people who are affected. Next time you see these products, I want you to look for the warning statement, and if you happen to be with a friend or a family member, you should explain to them why the warning is there.
Now let's talk about polypeptides.
With 20 choices available for each amino acid residue in a polypeptide, there is a huge number of different protein molecules that are theoretically possible. As a matter of fact, polypeptides, or proteins, have the highest sequence complexity when compared to carbohydrates or nucleic acids. In carbohydrates, there is usually only one type of sugar residue. In nucleic acids, DNA or RNA, there are four different types of nucleotides. But in proteins, there are 20 different choices that can be used to make a polypeptide. Even for a small protein, say one with 100 amino acid residues, do you know how many possible sequences there are? The answer is 20 to the power of 100. So there are 20 to the power of 100 unique sequences possible for a protein with 100 amino acids. But obviously only a tiny fraction of these theoretical possibilities are found in nature.
In fact, the actual, naturally occurring polypeptides are rather limited in size and composition. The table shown here lists some selected examples of proteins isolated from different organisms. The largest polypeptide is found in humans. It contains up to about 35,000 amino acid residues. But some polypeptides are quite small, with fewer than 100 amino acid residues. The majority of them have between 100 and 1,000 amino acid residues. Many proteins contain only one polypeptide, but many others have multiple polypeptides, up to 12 or more. For the enzyme glutamine synthetase, for example, 12 polypeptides need to come together in the final mature structure. This table is only for your information about the diversity of polypeptides. There is no need to memorize any information in it.
This table shows the amino acid compositions of proteins.
The information was obtained by analyzing the composition of a large number of proteins. As you can see here, the 20 different amino acids do not appear with equal frequencies in proteins. Some of them, like leucine, alanine and glycine, occur more abundantly, while cysteine, tryptophan and histidine occur at much lower frequencies in proteins. Next I would like to talk about sequence versus composition of amino acid residues in proteins. Which one do you think is more important in determining features like structure and function in proteins? Is it the sequence or is it the composition? Let's look at these two words. They have exactly the same composition of letters. However, the sequence of the letters is different, and of course they have completely different meanings. It is similar for proteins. Two proteins may have the same amino acid composition, but if their amino acid sequences are different, they are two different proteins. They will have different structures and different functions. The same is also true for DNA and RNA. Two DNA molecules may have the same nucleotide composition, but if the nucleotide sequences are not the same, the two DNA molecules contain different genetic information. There are four levels of structural organization in proteins: primary, secondary, tertiary and quaternary. The primary structure refers to the linear sequence of amino acid residues from the N terminus to the C terminus. Secondary structure refers to regular structural patterns in localized regions of a protein. One example is the alpha helix. There could be more than one type of secondary structure along an entire polypeptide. The tertiary structure refers to the overall three-dimensional structure of a polypeptide after folding.
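To make the sequence-versus-composition point concrete, here is a small Python sketch. The word pair and the two short peptides (written in one-letter amino acid codes) are hypothetical examples chosen for this illustration; they are not the ones shown on the slide.

```python
from collections import Counter


def same_composition(a, b):
    """True if both strings contain the same letters in the same amounts."""
    return Counter(a) == Counter(b)


def same_sequence(a, b):
    """True only if the letters also appear in the same order."""
    return a == b


# Hypothetical examples: an anagram pair and two made-up tetrapeptides.
pairs = [("LISTEN", "SILENT"), ("GAVL", "LVAG")]
for first, second in pairs:
    print(first, second,
          "same composition:", same_composition(first, second),
          "same sequence:", same_sequence(first, second))
```

Both pairs share a composition but differ in sequence, which is exactly the situation in which two proteins would be different molecules.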
So, like I said earlier, there might be some localized regions that have a particular secondary structure, but the polypeptide as a whole is folded into a defined three-dimensional structure. Quaternary structure refers to the mature structure of proteins that have more than one polypeptide. Each polypeptide is a subunit. In this case you see there are four polypeptides. Each of them has its own three-dimensional structure, and when they are assembled together, the overall structure is referred to as the quaternary structure. Let us first look at the primary structure of proteins. Again, it refers to the sequence of amino acid residues that are linked together by peptide bonds. This is the peptide bond here, linking two adjacent amino acids. You can see there are two different alpha carbons in these two adjacent amino acids. Now, this is the carbonyl group with a double bond, but the electrons here can be delocalized onto the peptide linkage. So the electrons are moving around between these two places. What results is the partial double-bond nature of the peptide linkage. So this peptide bond is considered to be a partial double bond. What that means is that this linkage cannot be rotated; it is rigid. Consequently, all the atoms that are associated with this peptide linkage, one, two, three, four, five, six, these six atoms, lie on a single plane, as represented by this blue rectangle. That is considered to be one peptide group, which is on a single plane. Similarly, this is the adjacent peptide linkage, and this peptide group again is on a single plane. It is all because of the partial double-bond nature of these peptide linkages. While the peptide linkages cannot be rotated, the two bonds that are attached to the alpha carbon can be rotated.
But there might be certain constraints on the degree of rotation at these two linkages. For example, the two carbonyl oxygens of adjacent amino acids cannot be too close to each other because of steric interference. The R groups, represented by the purple balls, may also affect the freedom of rotation of the two bonds attached to the alpha carbon. For example, if the amino acid is glycine, which has a very simple side chain, H, it allows more freedom of bond rotation at these two places linked to the alpha carbon. But if the residue is proline, which has a more rigid ring structure, it will impose some restriction on the freedom of bond rotation. The secondary structures of proteins refer to regular structural patterns that occur in localized regions along a polypeptide. One very common example is the alpha helix. It refers to a right-handed helical structure such as the one shown on the right here. You can actually use your right hand or left hand to determine the direction of a helical structure: right-handed structure and left-handed structure. The helical structure is maintained by hydrogen bonds that are formed within the polypeptide backbone. Each one is formed between the carbonyl oxygen of one amino acid residue and the amide hydrogen of the residue at the n+4 position, toward the C terminus. Let's look at the example here. Let's say this amino acid residue is at position n. This is the alpha carbon, and this is the carbonyl group. The carbonyl oxygen here forms a hydrogen bond with this amide hydrogen. This amide hydrogen belongs to the amino acid residue at position n+4. So if this is position number 100, that will be position number 104, toward the C terminus.
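The n to n+4 pairing just described can be sketched in a few lines of Python. This is only an illustration: the function name `helix_hbond_pairs` and the example range of residues 100 to 110 are assumptions made here, and the sketch simply assumes that every residue in the range is part of one ideal alpha helix.

```python
def helix_hbond_pairs(first, last):
    """List (n, n + 4) residue pairs for an ideal alpha helix.

    Each pair means: the carbonyl oxygen of residue n is hydrogen-bonded
    to the amide hydrogen of residue n + 4, toward the C terminus.
    Only pairs with both residues inside [first, last] are returned.
    """
    return [(n, n + 4) for n in range(first, last - 3)]


# Hypothetical helix spanning residues 100 to 110.
for acceptor, donor in helix_hbond_pairs(100, 110):
    print(f"C=O of residue {acceptor}  ...  N-H of residue {donor}")
```

For this range the first pair is residue 100 with residue 104, matching the example in the lecture, and the last four residues have no partner further along within the helix.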
And all the other hydrogen bonds are formed in this manner, between the carbonyl oxygen of residue n and the amide hydrogen of the residue at n+4. Altogether, this regular formation of hydrogen bonds helps to maintain this region in a right-handed alpha-helical structure. The side chains, or R groups, of the amino acid residues in an alpha helix all project outward from the helix. So these ball-and-stick models are the different side chains of the different amino acid residues along this region. The side chains are not involved in the hydrogen bonds that are formed within the helix. The hydrogen bonds are formed between a backbone carbonyl oxygen and a backbone amide hydrogen of two amino acid residues that are four residues apart from each other. The side chains, however, may affect whether an amino acid residue is found more frequently in an alpha helix. Generally, amino acid residues with small and simple side chains are more commonly found in an alpha helix. If their side chains are bulky, they are less likely to be found within an alpha helix. There are two amino acid residues that are quite extreme. Both of them are uncommon in alpha helices. Glycine, for example, has a very, very simple side chain, just a hydrogen. But because it is so small, it allows unconstrained bond rotation around the alpha carbon. So you can imagine that these two bonds might keep rotating, and when that occurs it is going to destabilize the hydrogen bonds that are formed within the helix. The other one is proline. Proline has the structure shown here. As you can see, this is a very rigid ring structure. It is not common in alpha helices for two reasons. First, after this nitrogen forms the peptide linkage, there is no amide hydrogen available for forming a hydrogen bond. And the second reason is that this is too bulky.
So that would interfere with the alpha-helical structure sterically. As a result, both glycine and proline are not common within an alpha-helical structure. Other common secondary structures are beta strands and beta sheets. What are beta strands? A beta strand is a portion of a polypeptide chain that is fully extended. For example, this region is a beta strand, a fully extended polypeptide region. There is no hydrogen bonding within this region. However, it can form hydrogen bonds with an adjacent beta strand, and they are formed in the same way as in the alpha helix, between a backbone amide hydrogen and a backbone carbonyl oxygen. So these are the same type of hydrogen bonds as in the alpha-helical structure, but they are formed between adjacent strands. There are three strands in the example given here. These three strands are held together by hydrogen bonds, forming a sheet-like structure that we call a beta sheet. Now, if these strands are all running in the same direction, meaning the N terminus to C terminus direction, we call it a parallel beta sheet. Sometimes you could have strands that are running in opposite directions. Again, the direction refers to the N-terminal to C-terminal direction, so the arrow is always pointing toward the C terminus. In this case here, the adjacent strands run in opposite directions, so we would describe this sheet structure as an antiparallel beta sheet. So again, these sheet structures are formed by beta strands that lie side by side with each other, and they are held together by hydrogen bonds. Beta sheets are often described as having a pleated appearance, as shown here.
It's like a piece of paper that has been folded. You can see that between adjacent strands the two peptide groups are on the same plane. Remember, a peptide group is the six atoms around a peptide bond, which cannot be rotated, so we use a rectangle to represent it as a single plane. In the beta sheet structure, the neighboring peptide groups on adjacent strands all lie on the same plane. Between adjacent peptide groups along the same polypeptide, the planes meet at an angle, in a zigzag manner, from one amino acid residue to the next. Consequently, you end up with this folded structure, just like a piece of paper that has been folded many times. The side chains, which are the purple balls, point to either side of the beta sheet structure. Tertiary structure is the third level of structural organization in proteins. Each protein has a unique three-dimensional structure. The whole protein could have regions of regular secondary structure and regions that are irregular, meaning that they do not have any regular structural pattern. But together the whole polypeptide will be folded in a very specific manner. The folding of a polypeptide is not random. Each protein is folded in a specific manner to achieve a structure that is functional. If proper folding does not occur, a protein will not function properly. The folding of a protein that gives rise to the tertiary structure is brought about by the side chains. The side chains interact with each other by non-covalent interactions such as hydrophobic interactions, hydrogen bonding or electrostatic interactions. So each of the side chains has a very specific spatial arrangement. It may occur in a way that two amino acids have their side chains interacting with each other.
This can happen even though, in the linear sequence, the two amino acid residues involved may be very, very far away from each other. This is an example of a tertiary structure, for an isomerase enzyme called triose phosphate isomerase. The 3D structure is that of a globular protein, like a sphere. Now, the reason that different places may be close together is that the side chains in those areas are interacting with each other by different types of non-covalent interactions. In addition to the non-covalent interactions, sometimes covalent linkages such as disulfide bridges can form. These disulfide bridges are formed between two cysteine residues. Cysteine residues have an SH group as the side chain. If two cysteines are close enough, they can form a covalent linkage called a disulfide bridge. But these covalent linkages only form after protein folding has occurred. These are some examples of tertiary structures of mature proteins formed after proper folding. The three-dimensional structure of a protein may contain several alpha-helical structures. Again, they are brought together by the interactions of side chains between different regions of the polypeptide. In this case you have a lot of beta strands. They lie close enough to each other to form different beta sheet structures. There are also proteins whose final structures have parallel beta sheets in the more internal region, while in the more exterior regions there are different places with alpha-helical structures. Sometimes the structure of a protein may contain very few secondary structures. This is the model for protein folding, the formation of a final protein that is properly folded into the mature structure. A newly synthesized polypeptide is made according to the genetic information in a gene. So this newly synthesized polypeptide contains specific sequence information.
Now, different places along the polypeptide would then start to form secondary structures, depending on the identity of the sequence in the different regions. In some places you may see alpha helices, and in other places there could be beta sheet structures. Afterwards, folding starts to occur through the interactions between different side chains of the amino acid residues along the polypeptide. So again, this folding occurs in a specific manner. The side chains are brought together, interacting with each other, so that the final 3D structure will be achieved, and the final structure will be stable and functional. A protein needs to achieve a very specific structure in order to function properly. So after folding, a newly synthesized polypeptide eventually forms a functional 3D structure, which we call the native structure. That structure is stable and functional. But sometimes the folding may occur improperly, so misfolding occurs, resulting in a misfolded structure. What may happen to these misfolded proteins? If the misfolding is not too extreme, the protein may get refolded back to the native structure so that it can function again. Sometimes the misfolding may go beyond repair. Some of the misfolded proteins may come together to form aggregates. This is usually not very desirable, because it could have harmful effects inside a cell. In other cases, the misfolded structure is simply degraded, so that the individual amino acids are released and can be used as the monomers for making a new polypeptide. The folding of proteins into the final tertiary structure is determined by the features of the amino acid side chains, and one of the important features is their relative hydrophobicity.
So in a theoretical folded protein like this, the interior of the protein usually consists of amino acid residues that are hydrophobic. They don't like water, so they tend to cluster together through hydrophobic interactions, and they stay in the interior of a folded protein to avoid contact with water. If you stretch out the protein in a linear manner, these places will be facing a very unfavorable solvation condition. The hydropathy scale is a quantitative indication of the relative hydrophobicity of the side chains of the different amino acids. If the hydropathy value is very high, that indicates the amino acid side chain is very hydrophobic, and such residues tend to cluster together in the interior of a folded protein. If the hydropathy value is very large and negative, that means the amino acid has a very hydrophilic side chain. For example, arginine and lysine have positively charged side chains. Let's look at the example of a water-soluble enzyme, alcohol dehydrogenase from horse liver. For this water-soluble protein, you would expect the exterior of the protein to have amino acids with hydrophilic side chains. Let's just look at a small region, this alpha helix, in the overall enzyme. The external surface has these red side chains, and these are charged amino acid side chains. They are hydrophilic, so they tend to be located on the surface of the protein for interaction with water. Meanwhile, the blue side chains are highly hydrophobic. You can see that they are pointing toward the interior of the protein to avoid direct contact with the water in the outside environment. A grass pollen protein has a very characteristic sandwich structure formed by two pairs of beta sheets.
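As a rough sketch of how a hydropathy scale can be used, the Python snippet below averages hydropathy values over a short stretch of residues. It assumes the Kyte-Doolittle scale, which may not be the exact scale shown in the lecture table, and the two test peptides and the function name `mean_hydropathy` are hypothetical.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Assumption: the lecture's table may use a different scale.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide given in one-letter codes."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


# Hypothetical stretches: one hydrophobic, one charged.
for peptide in ("ILVF", "RKDE"):
    print(peptide, round(mean_hydropathy(peptide), 3))
```

A strongly positive average suggests a stretch that would tend to be buried in the interior, while a strongly negative average suggests a stretch that would tend to sit on the surface in contact with water.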
So as indicated here, there is a pair of antiparallel beta sheets on one surface, and underneath it is another pair of beta sheet structures. The reason they can be arranged on top of each other, one sheet on top of another, is the interaction of the hydrophobic side chains between the two pairs of sheet structures. The lower pair has some hydrophobic side chains that are sticking up, while the upper pair of beta sheets has some hydrophobic side chains that are pointing downwards. So these two layers of hydrophobic side chains interact with each other, just like the filling of a sandwich. That is why the overall structure is described as a sandwich. Meanwhile, these orange side chains are sticking out on the surface of the upper beta sheet structure, and these are hydrophilic side chains for interaction with the aqueous cellular environment. After a protein has folded into a tertiary structure through the interactions between the side chains of its amino acid residues, the overall structure can be further strengthened by the formation of disulfide bridges. These disulfide bridges are covalent bonds that are formed between cysteine residues. They can be within the same polypeptide or between adjacent polypeptides. These covalent modifications are common in proteins that are located extracellularly, because the extracellular environment can be very harsh, very extreme, and that may disrupt the non-covalent interactions that help to maintain the 3D structure. So the formation of covalent linkages helps to stabilize proteins that are located extracellularly. The purpose is to prevent protein unfolding. Remember, a protein needs to be folded in the proper conformation in order to be functional.
So this is the chemical reaction showing how a disulfide bridge can be formed. If two cysteine residues are close enough in the 3D structure, their side chains, the SH groups or thiol groups, can react in the presence of oxygen to form this linkage between two sulfur atoms. That's why it's called a disulfide bridge. So in the 3D structure, if two cysteine residues are close enough, these yellow disulfide bridges can be formed. Again, they serve to strengthen the overall 3D structure after proper folding has taken place as a result of the interactions between amino acid side chains.
