Transcript for:
Protein Structure and Chemistry

Hello everyone, my name's Iman. Welcome back to my YouTube channel. Today we're kicking off a brand new, improved, and fully updated MCAT biochemistry series. If you've been here before, you know that I'm always working to improve the clarity, structure, and depth of these lessons. I take your feedback seriously because I want to make sure you have everything you need to truly understand the material and feel confident going into the MCAT. Everything I share here is freely available because I believe that education should never be gated behind a paywall. Everyone deserves access to quality science education, no matter their background or resources. So, you'll find a link to the notes and a full transcript for this video in the description box below.
With that being said, let's get started with chapter 1, which is titled amino acids, peptides, and proteins. In this chapter, we're going to cover six main objectives. First, we'll begin with amino acids found in proteins. We'll go over basic terminology, the stereochemistry of amino acids, their structures, and how to classify them as hydrophobic or hydrophilic. In addition, we will learn the one-letter and three-letter abbreviations for each amino acid that you'll need to recognize on the MCAT. Second, we'll dive into the acid-base chemistry of amino acids. This includes how they gain or lose protons depending on pH, how to interpret pKa values, and how to analyze titration curves, especially the isoelectric point. Third, we're going to look at peptide bond formation and hydrolysis. You'll learn how amino acids are linked together, how peptide bonds are broken, and how this process fits into protein synthesis and degradation. Fourth, we'll cover primary and secondary protein structure. That means understanding the sequence of amino acids and how they fold into alpha helices or beta sheets through hydrogen bonding. Fifth, we'll explore tertiary and quaternary structure.
Tertiary structure is the full three-dimensional shape of a single polypeptide, and quaternary structure applies when multiple chains come together to form a functional protein. We'll also talk about conjugated proteins, which have extra chemical groups called prosthetic groups that help them function. Finally, we'll talk about how proteins can become denatured. That's when a protein loses its structure, often due to heat or chemical stress, and as a result loses its function, too.
Let's begin with the first objective: amino acids found in proteins. Proteins are some of the most versatile molecules in biology. They come in an incredible variety of structures, which allows them to perform a wide range of functions. Everything from carrying oxygen to speeding up reactions to fighting infections: proteins are at the center of it all. The first protein to have its three-dimensional structure determined was myoglobin. This was a major scientific milestone, and it was made possible through a technique called X-ray crystallography. That structure gave us our first real view of what proteins look like at the atomic level, and it laid the groundwork for understanding how a protein's structure relates to its function.
Now, depending on the role a protein plays, we can group proteins into different categories. First, we have enzymatic proteins. These are proteins that act as catalysts. They speed up chemical reactions by lowering the activation energy needed. An example is digestive enzymes. These enzymes in your digestive system help break down large molecules in food by catalyzing the hydrolysis of chemical bonds. That means they use water to break those bonds and make the nutrients easier to absorb. Next, we have defensive proteins. These are involved in protecting the body against disease. Antibodies are a great example. They recognize and bind to foreign invaders like viruses or bacteria, and they inactivate them and tag them for destruction by the immune system.
Then we have storage proteins. As the name suggests, these proteins store amino acids or other important molecules for later use. One example is ovalbumin, which is found in egg white. It serves as a reservoir of amino acids for the developing embryo. We also have transport proteins. These move substances throughout the body or across cell membranes. Hemoglobin is a classic example. It's found in red blood cells and it carries oxygen from the lungs to tissues all around the body. Without it, oxygen wouldn't reach where it's needed.
Now, all of these proteins, no matter how different their functions seem, are built from the same building blocks. And that brings us to a big question: what are proteins actually made of? The answer is amino acids. Amino acids are the building blocks of proteins. Each one contains four essential components bonded to a central carbon atom, which we call the alpha carbon. These four groups are an amino group, a carboxyl group, a single hydrogen atom, and a variable group known as the R group or side chain. It's this R group that gives each amino acid its unique chemical properties.
When we study amino acids for the MCAT, we focus on the 20 standard amino acids that are directly encoded by the genetic code. These are called proteinogenic alpha amino acids. There is a 21st amino acid, selenocysteine. While it plays a role in specialized proteins, it's not something you need to memorize for the MCAT, so we're just going to ignore it. Now, these 20 amino acids are all alpha amino acids, meaning the amino group is attached to the same carbon that holds the carboxyl group, the alpha carbon. And to keep everything organized, each amino acid has a full name, a three-letter abbreviation, and a one-letter code. On the MCAT, these are going to show up in figures, in tables, and in passages. So throughout this lesson, I'll make sure to clearly say all three for each amino acid as we walk through them. Let's start with the nonpolar, nonaromatic amino acids.
These amino acids have side chains that are hydrophobic. They don't interact well with water, and they're usually found in the interior of folded proteins, where they help stabilize the overall structure. First is glycine. Its three-letter abbreviation is Gly and its one-letter code is G. Glycine is special because its side chain is just a single hydrogen atom. That makes it the smallest amino acid and the only one that is not chiral. Because it's so small, glycine is highly flexible, and you'll often find it in regions of the protein where the backbone needs to bend or turn. Next is alanine. Its three-letter abbreviation is Ala and its one-letter code is A. Its side chain is a simple methyl group, so it's compact, nonpolar, and hydrophobic. Then we have valine. Its three-letter abbreviation is Val and its one-letter code is V. It has a branched isopropyl side chain, which adds bulk and makes it more strongly hydrophobic. Then there are leucine and isoleucine, which are also branched-chain amino acids. They're hydrophobic and they're commonly found packed in the cores of folded proteins. The three-letter abbreviation for leucine is Leu and its one-letter code is L. The three-letter abbreviation for isoleucine is Ile and its one-letter code is I. Next is proline. Its three-letter abbreviation is Pro and its one-letter code is P. Proline has a distinctive ring structure that connects back to its own amino group, and this creates a rigid backbone that limits flexibility. Because of this rigidity, proline often disrupts alpha helices and creates turns or kinks in protein structures. Finally, we have methionine. It has a sulfur-containing thioether in its side chain. Despite the sulfur, it remains nonpolar and hydrophobic. Its three-letter abbreviation is Met and its one-letter code is M.
Let's move on to aromatic amino acids.
These amino acids contain ring structures with conjugated pi electrons, which allows them to absorb ultraviolet light, and this makes them useful for detecting proteins using different techniques. First is phenylalanine. Its three-letter abbreviation is Phe and its one-letter code is F. It has a benzyl side chain, which is highly hydrophobic and nonpolar. Next is tyrosine. Its three-letter abbreviation is Tyr and its one-letter code is Y. It has a hydroxyl group attached to an aromatic ring, and that hydroxyl group makes tyrosine slightly polar and capable of forming hydrogen bonds. Then we have tryptophan. Its three-letter abbreviation is Trp and its one-letter code is W. Tryptophan has a bulky double-ring structure, and while it's generally nonpolar, it is more hydrophilic than phenylalanine.
The next category to cover is polar uncharged amino acids. These amino acids have side chains that can form hydrogen bonds, so they are hydrophilic, but they don't carry a formal charge at physiological pH. First we have serine. Its three-letter abbreviation is Ser and its one-letter code is S. It has a hydroxyl group on a short side chain, and it's small and highly polar. Next, we have threonine. It has a three-letter abbreviation of Thr and a one-letter code of T. It also has a hydroxyl group, but it's attached to a slightly larger side chain. Like serine, it can form hydrogen bonds and it's polar. Next, we have asparagine and glutamine, which both have amide groups in their side chains. These amides are polar and excellent at forming hydrogen bonds, which helps stabilize protein surfaces. Asparagine's three-letter abbreviation is Asn and its one-letter code is N. Glutamine's three-letter abbreviation is Gln and its one-letter code is Q. We should also include cysteine here. Its three-letter abbreviation is Cys and its one-letter code is C. Cysteine has a thiol group on its side chain, which makes it polar.
The key feature of cysteine is that two cysteine residues can form a covalent disulfide bond, which helps stabilize the three-dimensional shape of a protein.
Moving on to negatively charged amino acids, also called acidic amino acids. These amino acids have side chains that are deprotonated at physiological pH, and that gives them a negative charge. There are two amino acids to cover in this category. First we have aspartic acid. This term refers to the protonated version of this amino acid, while aspartate is its conjugate base, the deprotonated form that predominates at physiological pH. Its three-letter abbreviation is Asp and its one-letter code is D. It has a carboxylic acid group attached one carbon away from the alpha carbon. Then there's glutamic acid. Again, this follows the same pattern. Glutamic acid is the protonated form of this amino acid and glutamate is the deprotonated conjugate base. Its three-letter abbreviation is Glu and its one-letter code is E. It's very similar to aspartate, but it has an extra carbon in the side chain before the terminal carboxylic acid group. Both are strongly hydrophilic and they're frequently involved in ionic interactions, such as salt bridges, that help stabilize folded proteins.
Finally, let's talk about the positively charged amino acids, also known as the basic amino acids. These amino acids have side chains that are protonated at physiological pH, which gives them a net positive charge. First, we have lysine. Its three-letter abbreviation is Lys and its one-letter code is K. It has a long side chain that ends in a positively charged amino group. It's strongly basic and highly hydrophilic. Then there's arginine. Its three-letter abbreviation is Arg and its one-letter code is R. It has a guanidinium group at the end of its side chain. This group is resonance stabilized and carries a positive charge, which makes arginine the most strongly basic of the amino acids. Lastly, we have histidine.
Its three-letter abbreviation is His and its one-letter code is H. Histidine has a side chain with a pKa that's pretty close to physiological pH. That means it can shift between being protonated and deprotonated depending on the environment, which makes it especially useful in enzyme active sites where acid-base chemistry is involved. With that, we have introduced our 20 amino acids.
In addition, we can classify amino acids based on how they behave in water, along a spectrum from hydrophobic to hydrophilic. We briefly mentioned this for every amino acid we talked about, but I want to put it into perspective and reiterate. On the hydrophobic end, we have amino acids like alanine, isoleucine, leucine, valine, and phenylalanine. These side chains tend to avoid water and they prefer to be tucked away inside the folded protein. On the hydrophilic end, we have histidine, arginine, lysine, glutamate, and aspartate. These amino acids have charged side chains that readily interact with water, and they're usually found on the outside of proteins where they can interact with the aqueous environment. Everything else, meaning most of the remaining amino acids, falls somewhere in the middle, and their behavior depends on the specific context, like pH, nearby residues, or whether they're exposed to water or buried in the protein core.
Now, there are also two structural facts that are important to keep in mind. First, all amino acids are chiral except for glycine. Since glycine's side chain is just a hydrogen atom, its alpha carbon is bonded to two identical groups, so it lacks chirality. Second, all of the chiral amino acids have the S absolute configuration, with the exception of cysteine. Cysteine is still an L-amino acid in biological systems, but the sulfur atom in its side chain changes the priority order of the groups around the alpha carbon, and that gives it an R configuration instead of S.
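The names, abbreviations, and groupings covered in this objective can be collected into a small lookup table. The following is only a study-aid sketch in Python: the variable and function names (`AMINO_ACIDS`, `ONE_LETTER`, `to_three_letter`) are made up for illustration, and the class labels simply mirror the five groups used in this lesson.

```python
# Full name -> (three-letter abbreviation, one-letter code, class).
# Class labels follow the grouping used in this lesson.
AMINO_ACIDS = {
    "glycine": ("Gly", "G", "nonpolar"),
    "alanine": ("Ala", "A", "nonpolar"),
    "valine": ("Val", "V", "nonpolar"),
    "leucine": ("Leu", "L", "nonpolar"),
    "isoleucine": ("Ile", "I", "nonpolar"),
    "proline": ("Pro", "P", "nonpolar"),
    "methionine": ("Met", "M", "nonpolar"),
    "phenylalanine": ("Phe", "F", "aromatic"),
    "tyrosine": ("Tyr", "Y", "aromatic"),
    "tryptophan": ("Trp", "W", "aromatic"),
    "serine": ("Ser", "S", "polar uncharged"),
    "threonine": ("Thr", "T", "polar uncharged"),
    "asparagine": ("Asn", "N", "polar uncharged"),
    "glutamine": ("Gln", "Q", "polar uncharged"),
    "cysteine": ("Cys", "C", "polar uncharged"),
    "aspartic acid": ("Asp", "D", "acidic"),
    "glutamic acid": ("Glu", "E", "acidic"),
    "lysine": ("Lys", "K", "basic"),
    "arginine": ("Arg", "R", "basic"),
    "histidine": ("His", "H", "basic"),
}

# Reverse index: one-letter code -> full name.
ONE_LETTER = {one: name for name, (_, one, _) in AMINO_ACIDS.items()}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence into hyphen-separated three-letter codes."""
    return "-".join(AMINO_ACIDS[ONE_LETTER[res]][0] for res in sequence.upper())


if __name__ == "__main__":
    print(ONE_LETTER["W"])          # full name for the one-letter code W
    print(to_three_letter("GAVK"))  # three-letter codes for a short sequence
```

A quick way to use it for self-testing is to pick a one-letter code, say the name and class out loud, and then check yourself against `ONE_LETTER` and `AMINO_ACIDS`.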
So to wrap up this objective, amino acids can be grouped in several different ways: by polarity, by charge, and by whether they're hydrophobic or hydrophilic. These properties directly affect how they behave in water, how they fold into proteins, and how they interact with other molecules. And remember, for the MCAT, it's essential that you can recognize the name, the structure, the three-letter abbreviation, and the one-letter code for each of these amino acids.
With that, we can move into the second objective, which is titled acid-base chemistry of amino acids. Amino acids are amphoteric, which means they can act as both acids and bases. In other words, they have the ability to either accept a proton or donate a proton, and how they behave depends entirely on the pH of their environment. At the most basic structural level, every amino acid has at least two groups that can gain or lose protons: a carboxyl group and an amino group. These groups respond differently depending on whether the pH is acidic, neutral, or basic.
Let's start by imagining an amino acid that hasn't gained or lost any protons yet, so there are no charges on this amino acid. In this state, the carboxyl group is COOH and the amino group is NH2. But in water, and especially around physiological pH, that exact form doesn't exist. Instead, amino acids tend to exist in a structure called a zwitterion. A zwitterion is a molecule that carries both a positive charge and a negative charge on different atoms, but has no net charge overall. Here's how this happens. In water, the carboxyl group tends to lose a proton and becomes negatively charged, and the amino group tends to gain a proton and becomes positively charged. So even though the molecule carries both charges, they cancel each other out, and this results in the zwitterionic form, which is electrically neutral overall.
Now, to understand how amino acids behave as acids and bases, there are two key rules you need to remember. The first rule is that ionizable groups tend to gain protons under acidic conditions and tend to lose protons under basic conditions. So in general, at low pH ionizable groups tend to be protonated, and at high pH they tend to be deprotonated. The second rule is that the pKa of a group is the pH at which half of the molecules of that group are deprotonated. So if the pH is lower than the pKa, most of the groups will be protonated, but if the pH is higher than the pKa, most of them will be deprotonated. Every amino acid has at least two groups that can gain or lose protons, the amino group and the carboxyl group, which means they all have at least two pKa values. The first pKa value is usually around 2, and that corresponds to the carboxyl group. The second pKa is usually around 9 to 10, and that corresponds to the amino group. And if an amino acid has an ionizable side chain, then it has a third pKa specific to that group.
Let's apply those ideas in different pH environments. At a low pH, in acidic conditions, the solution contains lots of free protons. In this environment, the amino acid is fully protonated. The carboxyl group stays in its protonated form as COOH and the amino group is NH3+. That gives the molecule a net positive charge. Then, as we raise the pH and approach neutral pH, the carboxyl group loses its proton first because its pKa is lower, around 2. So at this point the carboxyl group is COO- while the amino group remains NH3+. Now this molecule carries both a negative charge and a positive charge, making it a zwitterion. This is the dominant form at physiological pH, which is about 7.4. But if we continue to raise the pH into basic territory, the amino group eventually loses its proton too. So NH3+ now becomes NH2, both functional groups are deprotonated, and the overall charge on the amino acid is now negative.
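The trend just described (positive at low pH, roughly neutral near pH 7.4, negative at high pH) can be checked numerically. The sketch below uses the Henderson-Hasselbalch relationship, which the lesson does not name explicitly, to estimate what fraction of each group is deprotonated at a given pH. The function names are hypothetical, and the pKa values of 2.0 and 9.5 are illustrative picks from the "around 2" and "around 9 to 10" ranges mentioned above, not values for any particular amino acid.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a group in its deprotonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))


def net_charge(pH: float, acidic_pKas, basic_pKas) -> float:
    """Average net charge of a molecule at a given pH.

    acidic_pKas: groups that are neutral when protonated and -1 when
                 deprotonated (for example COOH / COO-).
    basic_pKas:  groups that are +1 when protonated and neutral when
                 deprotonated (for example NH3+ / NH2).
    """
    negative = -sum(fraction_deprotonated(pH, pKa) for pKa in acidic_pKas)
    positive = sum(1.0 - fraction_deprotonated(pH, pKa) for pKa in basic_pKas)
    return positive + negative


if __name__ == "__main__":
    # Illustrative backbone pKa values (assumed, see the note above).
    carboxyl_pKa, amino_pKa = 2.0, 9.5
    for pH in (1.0, 7.4, 12.0):
        charge = net_charge(pH, [carboxyl_pKa], [amino_pKa])
        print(f"pH {pH:>4}: net charge {charge:+.2f}")
```

With these assumed values, the net charge comes out close to +1 at pH 1, close to 0 at pH 7.4, and close to -1 at pH 12, which matches the three forms described above. Exactly at pH = pKa, `fraction_deprotonated` returns one half, which is the second rule stated earlier.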
So to summarize this trend: at low pH, amino acids carry a positive charge. At neutral pH, they exist as zwitterions with no net charge, and at high pH, they carry a negative charge.
Because amino acids have both acidic and basic groups, they're excellent candidates for titration experiments. Their ability to gain and lose protons at specific pH values allows us to study how their charge changes across the pH scale. Let's take a closer look at the titration curve of glycine. In this case, we're going to be starting with a 1 molar glycine solution and gradually adding base in the form of hydroxide ions. On the x-axis, we track the amount of base added, and on the y-axis, we monitor the pH of the solution. Now, glycine contains two ionizable groups, a carboxylic acid group and an amino group. At the start of the titration, the solution is very acidic and glycine is fully protonated. So the carboxyl group is COOH and the amino group is in its positively charged form, NH3+. This gives the molecule a net positive charge. But as we begin to add base, the carboxyl group is going to lose a proton, and it's going to lose its proton first. This is because its pKa is lower than that of the amino group, meaning it is more acidic. When the pH reaches 2.34, which is glycine's first pKa, the concentrations of the protonated and deprotonated forms of the carboxyl group are equal. This is a key concept: at the pKa of a group, its conjugate acid and conjugate base are present in equal concentrations. So at this stage, glycine exists as a mixture of two forms, the fully protonated form and the zwitterion form. This region of the curve is relatively flat because glycine acts as a buffer around its pKa. As we continue to add base, we eventually reach the isoelectric point, or pI, at a pH of 5.97. This is the pH at which glycine exists essentially entirely in its zwitterionic form.
Because the molecule is electrically neutral, it no longer buffers the solution and the titration curve becomes steep, meaning the pH rises rapidly with each additional amount of base. Beyond the pI, further addition of base begins to deprotonate the amino group. Around pH 9.6, which is the second pKa, the NH3+ group loses its proton and becomes NH2. At this pH, the zwitterion and the fully deprotonated form are present in equal concentrations.
Now, an important question to ask is how we calculate the pI for a neutral amino acid like glycine. Since it has only two ionizable groups, the amino group and the carboxyl group, we take the average of the two pKa values. So this is going to be 2.34 plus 9.6, divided by 2. This gives us 5.97, which is the pI value for glycine. But for amino acids that have ionizable side chains, like glutamate or lysine, the titration curve has an extra step, because they have three ionizable groups, not just two. What does that mean for the way we calculate pI? Let's start with glutamate as an example. It has two carboxyl groups, one on the backbone and one on the side chain, and it also has, of course, its amino group. The zwitterion of glutamic acid forms after the first carboxyl group deprotonates but before the side chain carboxyl group does. So to find the pI, we take the average of the pKa values of the two carboxyl groups. Those are the two deprotonation steps on either side of the neutral zwitterion. So this is going to be 2.2 plus 4.2, divided by 2, which gives a pI of approximately 3.2. This is the tactic for figuring out the pI for acidic amino acids.
What about basic amino acids like lysine, which has two amino groups, one on the backbone and one on the side chain, and a single carboxyl group? In this case, the neutral form exists after the carboxyl group and one of the amino groups have deprotonated, but before the second amino group does. So to calculate the pI, we average the two highest pKa values.
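The averaging rule used for glycine and glutamic acid can be written as one small function: sort the pKa values, count how many protons must come off to reach a net charge of zero, and average the two pKa values on either side of that neutral form. This is only a sketch, and the function name and its `charge_fully_protonated` parameter are made up for illustration. The pKa values for glycine (2.34, 9.60) and for the two carboxyl groups of glutamic acid (2.2, 4.2) come from this lesson. The amino pKa of about 9.7 for glutamic acid is an assumed typical value that the lesson does not give, and it does not affect the result.

```python
def isoelectric_point(pKas, charge_fully_protonated: int) -> float:
    """Average the two pKa values that flank the net-neutral form.

    pKas: all pKa values of the amino acid, in any order.
    charge_fully_protonated: net charge when every group is protonated
        (+1 for amino acids with neutral or acidic side chains,
         +2 for amino acids with basic side chains).
    """
    ordered = sorted(pKas)
    # Each deprotonation lowers the charge by one, so the neutral form is
    # reached after this many protons have been removed.
    n = charge_fully_protonated
    if not 1 <= n <= len(ordered) - 1:
        raise ValueError("no neutral form flanked by two pKa values")
    return (ordered[n - 1] + ordered[n]) / 2


if __name__ == "__main__":
    glycine = isoelectric_point([2.34, 9.60], charge_fully_protonated=1)
    glutamic_acid = isoelectric_point([2.2, 4.2, 9.7], charge_fully_protonated=1)
    print(f"glycine pI       = {glycine:.2f}")
    print(f"glutamic acid pI = {glutamic_acid:.2f}")
```

For an amino acid with a basic side chain, the fully protonated form carries a charge of +2, so passing `charge_fully_protonated=2` selects the two highest pKa values instead of the two lowest.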
For lysine, the two values to average are the pKa of the backbone amino group and the pKa of the side chain amino group. So in this case, it's going to be 10.79 plus 9.18, divided by 2, which gives a pI of about 9.99. So in summary, for amino acids without ionizable side chains, the pI is the average of the backbone pKa values. But for amino acids with ionizable side chains, the pI is calculated by averaging the two pKa values that flank the zwitterionic form. And with that, we've completed objective two.
We can finally move into objective three, which is titled peptide bond formation and hydrolysis. Peptides are chains made up of amino acid subunits. When we link amino acids together, we form a peptide. If the chain is shorter than about 50 amino acids, we call it a peptide. If it's longer than 50 amino acids, we typically refer to it as a protein. Now, the bond that links one amino acid to the next is called a peptide bond. Let's walk through how it forms. Peptide bond formation is a type of condensation reaction, also known as a dehydration reaction. That means two molecules are joined together and, in the process, a water molecule is removed. Specifically, the carboxyl group from one amino acid reacts with the amino group of another amino acid. The hydroxyl group from the carboxylic acid and a hydrogen from the amino group are removed together, forming the water that leaves. What's left is a new bond between the carbon of the carboxyl group and the nitrogen of the amino group, and that bond is what we call the peptide bond. This type of reaction requires energy, and in cells it's catalyzed by ribosomes during protein synthesis. Once the peptide bond is formed, we can refer to the two ends of the peptide chain using specific terms. The end of the chain that still has a free amino group is called the N terminus or the amino terminus. The other end, which has the free carboxyl group, is called the C terminus or the carboxyl terminus.
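The bookkeeping behind condensation can be sketched in a few lines: a linear chain of n amino acids has n - 1 peptide bonds, and one water molecule is released for each bond formed. The example below is a rough illustration only. The function name `condense` is made up, and the molar masses (glycine about 75.07 g/mol, alanine about 89.09 g/mol, water about 18.015 g/mol) are approximate reference values that do not come from this lesson.

```python
WATER_MASS = 18.015  # g/mol, approximate


def condense(free_amino_acid_masses):
    """Return (peptide_bonds, waters_released, peptide_mass) for a linear chain.

    free_amino_acid_masses: molar masses of the free amino acids, listed
    from the N terminus to the C terminus.
    """
    n = len(free_amino_acid_masses)
    if n == 0:
        raise ValueError("need at least one amino acid")
    peptide_bonds = n - 1           # one bond between each neighboring pair
    waters_released = peptide_bonds  # one water leaves per bond formed
    peptide_mass = sum(free_amino_acid_masses) - waters_released * WATER_MASS
    return peptide_bonds, waters_released, peptide_mass


if __name__ == "__main__":
    glycine, alanine = 75.07, 89.09  # approximate molar masses in g/mol
    bonds, waters, mass = condense([glycine, alanine])
    print(f"Gly-Ala: {bonds} peptide bond, {waters} water released, "
          f"about {mass:.2f} g/mol")
```

Hydrolysis runs the same arithmetic in reverse: breaking each peptide bond consumes one water molecule, so the masses of the free amino acids add back up to the original total.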
So when we talk about the direction of a peptide or protein, we always read it from the N terminus to the C terminus. And just like peptide bonds can be formed, they can also be broken. The reverse of this process is called hydrolysis. It involves the addition of water to break the bond between two amino acids. This reaction can happen under acidic or basic conditions or with the help of specific enzymes. So again, peptide bond formation is a condensation reaction that removes water, and peptide bond hydrolysis is the reverse: it adds water to break the bond.
With that, we're moving into objective four, which covers protein structure. We're going to start off here with primary and secondary structure. Proteins are long chains of amino acids, and all proteins have a specific hierarchical structure that determines how they fold and how they function. We organize protein structure into four levels: primary, secondary, tertiary, and quaternary. You can think of this progression the same way we build meaning in language. The primary structure is like letters of the alphabet. The secondary structure is like words. The tertiary structure is like full sentences. And the quaternary structure, which only applies to some proteins, is like paragraphs: multiple units coming together to express something more complex.
Let's start with the first level, primary structure. The primary structure is the exact linear sequence of amino acids connected by peptide bonds. This sequence is determined by the DNA that codes for the protein, and it's always written from the N terminus, which has the free amino group, to the C terminus, which has the free carboxyl group. The secondary structure is the local folding that happens between neighboring amino acids. These folds are stabilized by hydrogen bonds, specifically between atoms in the backbone, not the side chains. This level of structure includes recurring patterns that we see over and over in proteins.
There are two common types of secondary structure: the alpha helix and the beta pleated sheet. First is the alpha helix. This is a right-handed coil where the peptide backbone spirals around itself. Hydrogen bonds form between the carbonyl oxygen of one amino acid and the amide hydrogen four residues ahead, and this gives the helix a stable and compact shape. The second is the beta pleated sheet. In this structure, strands of the polypeptide chain lie next to each other, and hydrogen bonds form between the strands, giving the sheet a flat, extended appearance. The sheet is called parallel when the strands run in the same direction and antiparallel when they run in opposite directions.
Now, a fun little fact I'd like to mention here is how proline behaves in secondary structures. Proline has a unique, rigid ring structure that connects back to its backbone nitrogen, which restricts its flexibility. Because of this, proline introduces a kink in the polypeptide chain when it's found in the middle of an alpha helix. It disrupts the helical geometry and it breaks the hydrogen bonding pattern. For that reason, it's rarely found within alpha helices, though it might appear at the start of one to help cap or terminate the helix. Similarly, proline is also uncommon in beta sheets, where the extended conformation and the regular hydrogen bonding make it a pretty unfavorable fit. So when you see proline in a protein sequence, it usually signals a bend or turn rather than a clean alpha helix or beta sheet.
So that's the primary and secondary structure. We can move into the fifth objective, where we will discuss tertiary and quaternary structure. The tertiary structure refers to the overall 3D shape of a single polypeptide chain. This level of structure is determined not just by interactions along the backbone, but by how the R groups, or side chains, of the amino acids interact with one another in three-dimensional space.
Several types of interactions drive the folding into a stable tertiary structure. First, you have hydrophobic interactions. These are especially important here. Nonpolar side chains tend to cluster together in the interior of the protein, away from water. This process increases the entropy of the surrounding water molecules and it decreases the system's Gibbs free energy, which makes the folded structure thermodynamically favorable. Polar and charged side chains tend to remain on the protein surface, where they can interact with water or form salt bridges, which are ionic bonds between positively and negatively charged side chains. Hydrogen bonds also stabilize the structure, especially between polar groups in different parts of the chain. Finally, disulfide bonds form when two cysteine residues come close together and form a covalent bond. This sulfur-sulfur linkage adds significant stability to the folded protein.
For many proteins, the tertiary structure represents the final level of organization. But in some cases, there is one more level, the quaternary structure. This only applies to proteins made up of more than one polypeptide chain. These multi-subunit proteins must arrange their chains in a specific way to function properly, and forming a quaternary structure provides several advantages. First, it can enhance the stability of the overall protein complex. Second, it can be genetically efficient, allowing a large complex to be built from smaller subunits, which reduces the total DNA sequence required. Third, it can improve catalytic efficiency, especially by bringing active sites into close proximity. And finally, it can enable cooperativity, where binding at one subunit affects the behavior of the others. This is especially important in proteins like hemoglobin.
There's one last concept to mention here, and that is conjugated proteins. These are proteins that require a nonprotein component called a prosthetic group to function properly.
Prosthetic groups are covalently attached and they're essential to the protein's role. If the prosthetic group is a lipid, the protein is called a lipoprotein. If it's a carbohydrate, it's a glycoprotein. And if it's a nucleic acid, we call it a nucleoprotein.
So to wrap this up, let's summarize the order of protein structures. Primary structure is the linear amino acid sequence. Secondary structure refers to the localized folding patterns like alpha helices and beta sheets. Tertiary structure is the complete three-dimensional shape of a single polypeptide chain, and quaternary structure is the arrangement of multiple polypeptide chains into a single functional protein complex. Each level builds on the last, and together they determine how a protein folds, functions, and interacts in the biological world.
With that, we can move into our last and final objective. Our final objective is about denaturation, and this is essentially the reverse of protein folding. When a protein folds, it adopts a very specific three-dimensional shape that allows it to function properly. But if the protein is exposed to the wrong conditions, like too much heat or certain chemical solutes, it can lose that shape, and that process is called denaturation. During denaturation, the tertiary and quaternary structure of the protein are disrupted. So the protein unfolds and it loses the stabilizing interactions it had, like hydrogen bonds, hydrophobic interactions, salt bridges, and disulfide bonds. It's important to note that the primary structure, the sequence of amino acids, usually stays intact. Common causes of denaturation include, one, heat, which increases molecular motion and breaks noncovalent interactions, and two, solutes like urea or detergents, which interfere with hydrogen bonding or disrupt hydrophobic environments. Denaturation inactivates the protein because function is directly tied to structure.
For example, enzymes lose their active site shape and can no longer catalyze reactions. In some cases, denaturation is reversible if the damaging agent is removed and the protein refolds correctly. But often, especially in the case of extreme heat or strong chemicals, denaturation is irreversible and the protein is permanently nonfunctional. So while folding is how proteins gain structure and function, denaturation is what strips that away, leaving the protein unfolded and biologically inactive.
With that, we've completed our first chapter in this MCAT biochemistry playlist. I hope it was helpful. Please let me know if you have any questions, comments, or concerns down below. Other than that, good luck, happy studying, and have a beautiful, beautiful day, future doctors.