Hello, and welcome to the course Biochemistry 1, conducted by me, Dr. S. Dasgupta, Department
of Chemistry, Indian Institute of Technology Kharagpur. In this course we will be studying
certain aspects of biochemistry, starting from the structures and functions of biomolecules and going right
on to bioenergetics and metabolism.
The topics we will be covering are the structures and functions of biological molecules. Under
that we will be considering amino acids and proteins, and enzymes. Under enzymes we will
consider the mechanisms of specific enzymes, and then move on to vitamins and coenzymes, carbohydrates
and lipids, and nucleic acids and their components. Each of these topics
will be covered in as much detail as is relevant to this course.
We will also be considering the principles of bioenergetics, with special reference to
carbohydrate metabolism. The books we will be following are standard biochemistry textbooks
such as Stryer, Lehninger, and Voet & Voet.
When we consider “the central dogma of biology”, the first thing that comes to mind is DNA;
DNA is the storage medium. The central dogma of biology goes like this: DNA → RNA
→ protein. We have a text written in DNA, built from
the four bases of DNA, which is the storage medium, and this is transcribed to RNA, which
is the transmission medium and is also built from four bases, one of them being a bit different.
The RNA is then translated to protein. The alphabets of DNA, RNA and protein are
different. What are these alphabets? In DNA we have four letters in the alphabet; the four
letters are, as you can see here, A, G, T and C. These are the four letters that make up the
alphabet of DNA.
If you look at the corresponding alphabet of RNA, you will see that we have U, C, A and
G. When we go on to study the structure and contents of nucleic acids, the structures of
each of these bases will become much clearer, but for now we have to know that DNA and RNA are made up
of these letters, which actually represent nitrogenous bases.
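As a small illustration of the two nucleic acid alphabets, here is a minimal Python sketch. It assumes the common simplification that the RNA message reads like the DNA text with every T replaced by U; the function name `transcribe` and the example sequence are made up for illustration and are not from the lecture.

```python
# Toy illustration of the DNA and RNA alphabets (hypothetical helper, not from the lecture).
# Assumption: the RNA reads like the DNA text with every T replaced by U.
DNA_ALPHABET = set("AGTC")
RNA_ALPHABET = set("UCAG")


def transcribe(dna: str) -> str:
    """Return the RNA text for a DNA text by swapping T for U."""
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        raise ValueError(f"not DNA letters: {sorted(unknown)}")
    return dna.replace("T", "U")


rna = transcribe("ATGGCTTGA")
print(rna)  # AUGGCUUGA
```

Only one of the four letters changes between the two alphabets, which is the "bit different" base mentioned above.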
The protein alphabet is a bit different. It is sometimes represented
as a 3-letter code, which we will see in a moment, or as a 1-letter code, which is another
representation of the very same thing. The protein alphabet is made up of twenty
unique letters that tell us what the protein sequence is. We will understand what an amino
acid sequence is once we get into the details of what a peptide bond is and what an amino
acid is. The first thing we try to understand here is the central carbon atom.
Proteins are actually made of amino acids that are linked by peptide bonds. We
will also see how a peptide bond is actually formed and how these letters are linked together
to form something like a sentence. Each amino acid consists of a central carbon atom, marked as Cα,
an amino group (-NH2), a carboxylic acid group (-COOH),
a hydrogen atom, and what is called an R group. This
R group is the side chain of the amino acid.
This part is common to all amino acids: an amino group, an acid group,
and a hydrogen attached to the central carbon. This central carbon atom is actually
a chiral carbon, which means it is asymmetric, which again means that there are four different
groups attached to it. All amino acids have a common set
of groups here: the amino group, the hydrogen atom and the carboxylic acid group. The side
chain differs from one amino acid to another; it can be made of different atoms or different
groups of atoms, and this is what actually distinguishes the various amino acids.
Now we are going to consider the types of R groups we can actually have. If we look
at the amino acids that can be incorporated into proteins, as we
mentioned earlier we have an amino group, a carboxyl group, a hydrogen atom and
an R group attached to the central carbon. Because of its chirality, an amino acid can have an L-form or
a D-form. Usually L-amino acids are incorporated into proteins. Now you understand that the
side chain, the R group, can differ in its size, its shape and
its polarity. There are twenty common amino acids, which have distinctive R groups with
distinct properties of size, shape and polarity.
We will consider the amino acid side chains group by group, and in each case you have to remember
the 3-letter code of the amino acid along with its 1-letter code, and obviously
you have to remember what the side chain is made of. What we have listed here is glycine and proline,
which are unique amino acids. Glycine is the simplest amino acid because the R group
is just a hydrogen atom.
This hydrogen atom makes the central carbon atom of glycine symmetric, because it does
not have four different groups attached to it. It has two hydrogen atoms attached,
so it is no longer chiral, and this is the only such amino acid. So, if we look
at glycine, the side chain, the R group, is attached to the central carbon atom, and we
have an amino group and also a carboxylic acid group attached to it. We show all the
amino acids in the form of NH3+ and COO-, because at physiological pH the -OH group loses its hydrogen atom, owing
to the pKa value of the carboxylic acid group, and the amino group is protonated, which means
that it has an additional hydrogen atom, making the nitrogen positively charged. This
is called the zwitterion form of the amino acid, and it is
represented in this fashion because we would like to represent the amino acids as
they would be at physiological pH. The next unique amino acid is proline. Proline
does not have a separate R group hanging off the α-carbon; instead the R group is actually linked back
to the amino group here.
So the side chain -CH2-CH2-CH2- is actually linked to the nitrogen, which is NH2+ in this case. The α-carbon
still has a hydrogen atom and a carboxylic acid group attached. Proline is
called an imino acid rather than an amino acid because its nitrogen is a secondary amine, bonded to two carbons, instead
of a primary amino group. So we have an imino acid where the side chain bends back on itself to
form proline. These are the two amino acids that are unique in their features. Glycine
is unique because it has just a hydrogen atom as its side chain and it is achiral, and
proline is unique as an imino acid because the side chain bends back upon itself. When
we represent the amino acids we represent them in the zwitterionic form, which is NH3+ and
COO-, because this is how they would exist at physiological pH in normal solution.
The next group of amino acids is the hydrophobic amino acids. The hydrophobic amino acids
have side chains made up mostly of carbon and hydrogen atoms. So they would
tend to stay away from the solvent, the solvent usually being water or water based. They
do not like to be in water, so they are hydrophobic. Here
the amino acid with the simplest side chain is alanine, for which the 3-letter code is
Ala and the 1-letter code is A. The side chain is a methyl group, so this is what we would
call the R group. Here again you recognize the zwitterionic representation of the amino acid.
Then we come to valine. Valine is a β-branched amino acid; its side chain is -CH(CH3)2. The
next one we have is leucine, in which the R group is -CH2-CH(CH3)2, so it is branched at the
γ-atom. The way these are represented is that if this is the Cα, the next atom is the Cβ,
which in valine is connected to two Cγ atoms; this gives a unique representation of the amino
acid valine. If we look at leucine, we would again have a unique representation. We
take this as Cα, the next one attached to it is Cβ, and then Cγ. The Cγ is
attached to two Cδ atoms, one labelled Cδ1 and the other Cδ2. The
alanine side chain has only a Cβ carbon. So each amino acid could actually be represented very
clearly in a unique manner, with the zwitterionic part being common
and the side chain described by the types of atoms that are attached to the Cα.
If you look at isoleucine, we have the side chain -CH(CH3)-CH2-CH3. So we
have a Cβ atom carrying one methyl group and one ethyl group. All these side chains
are made up of C and H, which makes them hydrophobic in nature. In the same way
methionine can fall into this category as well, but it has a sulphur atom, with a methyl
group attached to the sulphur, in its side chain. So we have a Cα attached to
Cβ, which is attached to Cγ. This Cγ is attached to the sulphur atom, and then we have a methyl
group attached to the sulphur.
Methionine, along with another amino acid, cysteine, makes up the sulphur-containing amino acids.
They could be put together in a group of their own, or they could be considered in
this group as well. The next group that we will be considering is the polar amino acids.
The polar amino acids have an oxygen atom or a nitrogen atom in their side chain. By
virtue of having heteroatoms like oxygen or nitrogen in the side chain,
they can participate in polar interactions, not only among themselves but also with solvent
molecules. So they can participate in hydrogen bonding, which is extremely important among the non-covalent
interactions in proteins that hold the fold of the protein chain, the
amino acid chain, together.
The polar amino acids are likely to interact with the solvent. In this interaction
the oxygen and nitrogen atoms can interact with the solvent molecules, or among
themselves, to form a network and remain in the solvent. In contrast, the hydrophobic amino
acids are unlikely to be on the surface of the protein. So when we have a globular structure
we will see that certain side chains prefer to be on the surface of the
protein and certain side chains prefer to be away from the solvent;
the latter, as we have seen earlier, are the hydrophobic ones.
Now let us look at the side chains that make up this polar group of amino acids.
Each of these has an oxygen or nitrogen in it. We have, of course, the common part of the
amino acid, and then the asparagine side chain and the glutamine side chain. These two amino acids
have an amide group in the side chain. Amide groups are -C(O)NH2 groups. So this -CH2-C(O)-NH2
group is the side chain of asparagine, and this -CH2-CH2-C(O)-NH2 group is the
side chain of glutamine. The only difference is that the glutamine chain is one carbon longer
than the asparagine chain. In asparagine we have a -CH2- group, the β-carbon, attached
to the α-carbon, followed by a γ-carbon that carries a doubly bonded oxygen atom and an
-NH2 group. So in asparagine the amide group is linked to the α-carbon through a single β-carbon.
In glutamine we have two -CH2- moieties in the side chain. We have Cα, Cβ, Cγ and Cδ
carbons, and this Cδ is attached to oxygen by a double bond and to an -NH2 group.
So in asparagine, and similarly in glutamine, the oxygen and nitrogen can participate in hydrogen
bonding, which means that if a suitable donor or acceptor is available, they can form
hydrogen bonds not only with other amino acids but also with the solvent. In this group
the next amino acid is serine. Serine is a small amino acid, but a polar one.
The group it has is -CH2OH, and this -OH can participate in hydrogen bonding. In this
series threonine is the next amino acid. It has a -CH3 group and an -OH group attached
to the β-carbon, so it differs from serine by the extra methyl group.
The next amino acid in this group is cysteine. Cysteine is another amino
acid that has a sulphur atom. Methionine also has a sulphur atom, but there the sulphur
has a methyl group attached. Here the sulphur carries a hydrogen atom instead, making this a thiol, so we have
-CH2SH. Histidine is a very important amino acid. When we cover enzymes and enzyme mechanisms
we will discuss histidine in detail, because of the specific properties
of its side chain, which is an imidazole group. So again histidine has the common amino acid
part. The side chain of histidine has two nitrogen atoms, which are part of the
imidazole ring. So the side chains of all the amino acids in this polar group contain
a heteroatom.
The next group of side chains we will be considering is the acidic amino acids. Earlier we
looked at asparagine and glutamine. Asparagine has the -C(O)NH2 group. We know that an amide
is derived from a carboxylic acid. So asparagine is derived from a specific carboxylic acid. Similarly
glutamine is also derived from a carboxylic acid. We group these acids as the acidic amino acids,
and we call them aspartic acid, which gives rise to asparagine, and
glutamic acid, which gives rise to glutamine.
Now what we have here is a -COO- group. This -COO- group is separate from the carboxylic
acid that is common to all amino acids; it is part of the R group, the side chain. So
the side chain in aspartic acid also contains a carboxylic acid group. Similarly the side
chain in glutamic acid has a carboxylic acid group, but it has an additional -CH2-, just
as in glutamine. Here, in addition, we have to consider the pKa value. If the pKa value of a group is less
than the pH of the solution, the group is going to lose its proton. So the
carboxylic acid loses its proton, but this amino group does not, because the pKa value
of the amino group is higher than physiological pH, which is why it has
kept its proton attached. If we consider the physiological pH to be 7.4, it
means the pKa of this group is greater than 7.4, and we will see that it is actually
close to 9, or between 9 and 10. So with a pKa > 7.4 the amino group is going to remain protonated,
but the carboxylic acid groups cannot remain protonated. So these make up the acidic amino acids.
If there are acidic amino acids, it means that there are also basic amino acids. These
are lysine and arginine.
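The pKa argument above can be made quantitative with the Henderson-Hasselbalch relation, which the lecture does not state explicitly. The sketch below is illustrative only: the pH of 7.4 comes from the lecture, the amino group pKa of 9.5 is picked from the "between 9 and 10" range mentioned, and the carboxylic acid pKa of 2.0 is a rough assumed value. The function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a group that still holds its proton at a given pH.

    Henderson-Hasselbalch rearranged: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PHYSIOLOGICAL_PH = 7.4  # value used in the lecture

# Rough, assumed pKa values for illustration only.
ASSUMED_PKA = {"carboxylic acid group": 2.0, "amino group": 9.5}

for group, pka in ASSUMED_PKA.items():
    f = fraction_protonated(PHYSIOLOGICAL_PH, pka)
    print(f"{group}: pKa {pka}, {f:.4%} protonated at pH {PHYSIOLOGICAL_PH}")
```

With these assumed values the carboxylic acid group is almost entirely deprotonated (-COO-) and the amino group almost entirely protonated (-NH3+), which is the zwitterionic picture used throughout the lecture.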
Now we will look at the side chain groups of lysine and arginine. This is the
long side chain of lysine and this is the side chain of arginine. Each of these
two amino acids has a different side-chain pKa value. The side-chain nitrogen groups are still protonated because
their pKa values are greater than the physiological pH. So we have protonated nitrogens, because the
physiological pH is 7.4 and has not reached the pKa value at which these groups would lose their protons.
So here, in lysine, we have an additional amino group, apart from the common amino group,
because it is a basic amino acid. We have a guanidinium group in arginine, which is part
of the side chain, and it has a nitrogen here, a nitrogen here and a nitrogen here. So this is
lysine and this is arginine. These residues especially prefer to be on the surface
of the protein because of their properties.
So we have looked at the different structures of the amino acids, which we have put into
specific groupings. The different groupings are glycine and proline, which form
a group by themselves because of the uniqueness of their properties, and then we have
the polar amino acids, the hydrophobic amino acids, the acidic amino acids and the basic amino acids. There is another
group of amino acids, the aromatic amino acids. The aromatic amino acids are aromatic
in nature, and in this group we have three amino acids, namely phenylalanine, tyrosine and tryptophan.
Here we have phenylalanine. As we have seen earlier, alanine has just a methyl group
attached to the α-carbon. In this case one -H of that methyl group has been replaced by a phenyl group,
so its name is phenylalanine. Of course it also has the common part of the amino acid.
The 3-letter code for phenylalanine is Phe and the 1-letter code is F.
So we get phenylalanine by replacing one -H with a phenyl group in the alanine side chain,
and this makes it aromatic in nature. Then we have tyrosine, which is similar to phenylalanine;
the only difference is that in the side chain one hydrogen on the ring is replaced by an -OH group.
So tyrosine can also be involved in hydrogen bonding. In the grouping of amino
acids it could also be put in the polar group, but it is grouped under the aromatic
amino acids because of the presence of the phenyl ring. So tyrosine has a -CH2-, a phenyl ring
and an -OH attached to this phenyl ring. The other amino acid in this group is tryptophan.
It has an indole ring attached to the -CH2-. This is a very bulky amino acid, as you can see
from its size, and it is quite rare in proteins; it is not present to any large
extent in most proteins.
The unique properties of these aromatic amino acids make them useful in an analytical way. All the aromatic
amino acids, phenylalanine, tyrosine and tryptophan, absorb ultraviolet
(UV) light, so their presence in proteins can actually be utilized in this fashion.
They absorb UV light in the region of 280 nm. Even though they have different
λmax values, we usually look at 280 nm to identify a protein. If a solution has a certain amount of protein in it, we
can determine the amount of protein present by considering the
number of phenylalanine, tyrosine and tryptophan residues that are present in the protein chain.
So, if we measure the absorbance at 280 nm, which is also written as A280, and we know the extinction coefficient
of the protein and the path length of the cell,
then we can determine the concentration of the protein.
So the presence of these aromatic amino acids can help in determining whether the solution
actually contains protein or not. We can also find out the concentration
of the protein in solution by virtue of its having phenylalanine, tyrosine and tryptophan.
Of these, tryptophan has the highest extinction coefficient, which means that if you have a large
number of tryptophan residues in the protein you are going to have a larger absorbance
at 280 nm. The presence of the aromatic amino acids themselves gives the absorbance
at 280 nm, which is how proteins are monitored in biochemistry laboratories.
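The calculation described here is the Beer-Lambert law, A = ε · c · l, solved for the concentration c. A minimal Python sketch follows; the function name and the numbers in the example (an absorbance of 0.5 and an extinction coefficient of 10,000 per molar per centimetre) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def protein_concentration(a280: float, extinction_coeff: float,
                          path_length_cm: float = 1.0) -> float:
    """Beer-Lambert law, A = epsilon * c * l, solved for c.

    a280             -- measured absorbance at 280 nm (dimensionless)
    extinction_coeff -- molar extinction coefficient in M^-1 cm^-1
    path_length_cm   -- path length of the cell in cm
    Returns the concentration in mol/L.
    """
    if extinction_coeff <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (extinction_coeff * path_length_cm)


# Hypothetical measurement: A280 = 0.5 in a 1 cm cell, epsilon = 10,000 M^-1 cm^-1.
conc = protein_concentration(a280=0.5, extinction_coeff=10_000.0, path_length_cm=1.0)
print(f"{conc:.2e} mol/L")  # 5.00e-05 mol/L
```

A protein with more tryptophan residues has a larger extinction coefficient, so the same concentration gives a larger A280.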
The next thing we are going to look at is a representation. As we have seen already,
we have a carboxylic acid group, an amino group and a hydrogen atom, which
are common to all amino acids, and we have a side chain, R. Here the side chain is an amide;
asparagine and glutamine are the amino acids that have an amide group. The glutamine side
chain has two -CH2- groups, and the two -CH2- groups are attached here. This is a stick representation
where the asymmetric carbon is in green, the other carbon atoms are in grey, the nitrogen
atoms are in blue and the oxygen atoms are in red. Now let us look at the linking of these
amino acids. We know these amino acids are the building blocks of proteins.
These building blocks have to be linked together to form a protein, and they are linked together
by a peptide bond.
Now the representation here is not the zwitterionic representation, because the proton of the carboxylic
group is still attached and this is shown as an -NH2 group. In the actual form they would be -NH3+
and -COO-. Here we have two R groups; the first amino acid, on the left-hand side, has the amino terminal and the
other has the carboxylic terminal. This is a dipeptide, because the two amino acids
are linked by a peptide bond. The peptide linkage has a carbon with a doubly bonded oxygen
and an -NH- group. To form the peptide bond, the original amino acids lose
an -OH from the carboxylic acid side and an -H from the amino side,
and these make an H2O molecule. So two amino acids are linked by the elimination of H2O
to form a peptide bond.
We will look into the features of the peptide bond once we consider protein structure
in general and the amino acid sequence. But when amino acids are linked together,
on the left-hand side you always have the N terminus and on the right-hand side you
always have the C terminus, because this is the order in which proteins are
synthesized. So we have an amino terminal and a carboxylic
acid terminal. The first amino acid in a protein sequence or protein chain always has the NH3+ group attached to it, and the
last amino acid has the -COO- attached to it.
So this is a dipeptide linked by a peptide bond. There are certain features of the peptide
bond that are unique to protein structure.
Here we have an amino acid, glycine, in the zwitterionic representation.
Glycine is a unique amino acid where the R group is H, where R represents the side chain
of the amino acid. The R group is -CH3 in the case of alanine,
and the R group is -CH2SH for cysteine.
For glycine we actually cannot distinguish which hydrogen is the R group, because one hydrogen
is the common hydrogen and the other hydrogen is the side chain. So we start with an -NH3+ and a -COO-, and
by eliminating water we form a peptide linkage, -C(O)-NH-. Here we have two peptide bonds, a
-C(O)-NH- and another -C(O)-NH-. So we have Gly-Ala-Cys, forming a tripeptide,
and this can continue to form further peptide linkages. In the basic
structure of the amino acid we have a Cα, with an -NH3+ group, a -COO- group, an H atom
and an R1 group attached to it.
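The Gly-Ala-Cys example can be written out mechanically. The following Python sketch is only an illustration: it uses the three side chains given in the lecture, and the helper names and the condensed text notation are made up here. Each residue contributes a CH(R) unit, neighbouring residues are joined by -C(O)-NH-, and the ends carry the charged amino and carboxylate groups.

```python
# Side chains as given in the lecture for the three residues of the example.
SIDE_CHAINS = {"Gly": "H", "Ala": "CH3", "Cys": "CH2SH"}


def condensed_peptide(residues: list[str]) -> str:
    """Write a peptide as +H3N-CH(R1)-C(O)-NH-CH(R2)-...-COO- (ad hoc notation)."""
    if not residues:
        raise ValueError("need at least one residue")
    units = [f"CH({SIDE_CHAINS[r]})" for r in residues]
    return "+H3N-" + "-C(O)-NH-".join(units) + "-COO-"


def peptide_bond_count(residues: list[str]) -> int:
    """n residues are joined by n - 1 peptide bonds, one water eliminated per bond."""
    return max(len(residues) - 1, 0)


tripeptide = ["Gly", "Ala", "Cys"]
print(condensed_peptide(tripeptide))
# +H3N-CH(H)-C(O)-NH-CH(CH3)-C(O)-NH-CH(CH2SH)-COO-
print(peptide_bond_count(tripeptide), "peptide bonds, so the same number of H2O eliminated")
```

The N terminus appears on the left and the C terminus on the right, matching the convention described above.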
If we look at another amino acid, we have another -NH3+ group, a -COO-, an
H and an R2 group. Now when we combine these two amino acids to form a
dipeptide, we link them by a peptide bond. So we have a -C(O)- connected to the -NH-
that comes from the second amino acid; we have the two α-carbons, R1 and R2, the
two hydrogen atoms, and the -COO- group and -NH3+ group at the ends of this dipeptide. This
particular linkage is known as the peptide linkage. Now we have linked R1 and
R2. In a representation of a protein it is not very convenient to keep on writing all
the atoms. We already know that the amino acids are the building blocks of the
protein sequence, the primary amino acid sequence, and that they are linked together by
peptide bonds.
Since they are simply linked by peptide bonds, it is not necessary to write the
common part of all the amino acids, because these are features that we already
know. We know that the first amino acid is going to carry the -NH3+ terminal
and that the last one is going to carry the -COO-. So it is not necessary
to write -NH3+ and -COO- in each case in a laborious fashion. When
we write a protein sequence, all the information we actually need is R1 and R2, because we
know each amino acid looks the same and differs only in its R group. So we just need to know
R1 and R2 and the order in which they are linked together in the protein sequence.
Suppose in this protein sequence this is one amino acid and this is the next amino
acid. Then the first amino acid has to have the -NH3+ attached to it, then come 2, 3,
4 and so on, and the last one, say number 120, will have the -COO- attached to it. We also know that the
linkages are nothing but peptide bonds. So here we need to know what R1 is, what R2 is,
what R3 is, what R4 is and so on up to R120 in this sequence. Here we just write either
the 3-letter code or the 1-letter code. In the 3-letter code, if the first one were
glycine we would write Gly; say it is linked to alanine, which is written
as Ala, linked to the acidic amino acid aspartic acid, linked to the basic amino acid lysine,
and so on and so forth. The sequence is then written as Gly-Ala-Asp-Lys-... Once
we have the glycine, the alanine and the aspartic acid, we know what the
rest of the atoms are, because we know the side chain of glycine, we know the
side chain of alanine, and so on. If we write this in the 1-letter code
it would be G-A-D-K; so if it were just written as GADK, you could write out the structure of this tetrapeptide.
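The conversion between the two notations is a simple lookup. Here is a minimal Python sketch using the standard 3-letter and 1-letter codes of the twenty common amino acids; the function name `to_one_letter` is made up for illustration.

```python
# Standard 3-letter to 1-letter codes for the twenty common amino acids.
THREE_TO_ONE = {
    "Gly": "G", "Pro": "P", "Ala": "A", "Val": "V", "Leu": "L",
    "Ile": "I", "Met": "M", "Asn": "N", "Gln": "Q", "Ser": "S",
    "Thr": "T", "Cys": "C", "His": "H", "Asp": "D", "Glu": "E",
    "Lys": "K", "Arg": "R", "Phe": "F", "Tyr": "Y", "Trp": "W",
}


def to_one_letter(sequence: str) -> str:
    """Convert a dash-separated 3-letter sequence (N to C terminus) to 1-letter code."""
    codes = [c.strip() for c in sequence.split("-") if c.strip()]
    return "".join(THREE_TO_ONE[c.capitalize()] for c in codes)


print(to_one_letter("Gly-Ala-Asp-Lys"))  # GADK
```

Because the backbone is the same for every residue, the string GADK is enough to reconstruct the full structure of the tetrapeptide.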
Similarly, when we consider a whole protein chain, in this case say just one hundred and
twenty amino acids, the differences lie in the properties of the amino acid side chains.
The first property is the size and shape of the amino acid, which is extremely important
for its accommodation in the protein. The next property is the charge on the side chain, whether
it is acidic or basic. Next we look at the polarity, whether it is a polar amino acid that can be involved
in hydrogen bonding.
Then there is the hydrophobicity, which tells us where the amino acid is likely to be located: whether it is
going to be in the centre of the protein, because it likes to be away from the solvent,
or whether it is likely to be on the surface of the protein. We know
that a hydrophobic amino acid would prefer to be in the core of the protein. The next
one is the aromaticity. The aromatic amino acids that we have considered, phenylalanine,
tyrosine and tryptophan, are important in imparting UV properties to proteins, because these
are the ones that absorb UV light. Proteins can be detected in solution owing to
the presence of the aromatic amino acids, which absorb the UV light, and from the Beer-Lambert
law we can find out the concentration of the protein.
Now the conformation; as we have already seen, this is usually determined by the side chain.
We will see that most of the side-chain atoms are linked by single bonds. We will also see
how rotation about these bonds can actually bring about conformational changes in the
amino acid orientations in proteins. This change in conformation, or change
in dihedral angles, will allow us to look at different properties of amino acids in the
way they interact with other amino acids. We also look at the propensity to adopt a
particular conformation. What does this mean? It means asking, for an
amino acid in a protein, whether it is more likely to be part of a helix or more likely
to be in a sheet.
So, the summary of today’s lecture is the different types of amino acids, the different
groupings of amino acids and the important properties of the side chains of the amino
acids. We have seen that the central carbon atom is the asymmetric carbon atom, which is
also known as the α-carbon atom, and that it is linked to four different groups. These are a hydrogen
atom, an amino group, a carboxylic acid group and a side chain that is represented as R.
We have twenty common amino acids, each having a different R group; the twenty
R groups differ widely in their properties.
The amino acids are grouped according to the type of R group, the type
of side chain, that they have attached to them. We have the unique amino acids glycine and proline,
we have hydrophobic amino acids and polar amino acids, we have acidic and basic
amino acids, and we have aromatic amino acids.
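The groupings used in this lecture can be collected into a small lookup table. The Python sketch below follows the lecture's own assignment of each amino acid to a group; other textbooks group some of them differently (for example tyrosine, which the lecture notes could also be called polar), and the names `GROUPS` and `group_of` are hypothetical.

```python
# Groupings of the twenty common amino acids as presented in this lecture
# (3-letter codes). Other groupings are possible.
GROUPS = {
    "unique": ["Gly", "Pro"],
    "hydrophobic": ["Ala", "Val", "Leu", "Ile", "Met"],
    "polar": ["Asn", "Gln", "Ser", "Thr", "Cys", "His"],
    "acidic": ["Asp", "Glu"],
    "basic": ["Lys", "Arg"],
    "aromatic": ["Phe", "Tyr", "Trp"],
}


def group_of(code: str) -> str:
    """Return the lecture's group name for a 3-letter amino acid code."""
    for name, members in GROUPS.items():
        if code in members:
            return name
    raise KeyError(f"unknown amino acid code: {code}")


for name, members in GROUPS.items():
    print(f"{name}: {', '.join(members)}")
```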
Methionine and cysteine are the sulphur-containing amino acids. There are other sets that
have heteroatoms, oxygen and nitrogen atoms, in their side chains, and there are side chains
made up entirely of carbon and hydrogen, making them hydrophobic in nature.
Lastly we looked at the overall properties that we can consider together: the size and
shape, the charge, the polarity, the hydrophobicity and the aromaticity. All of these actually
determine the properties of the protein in general, because these amino
acids are linked by peptide bonds, and the peptide bonds bring different types
of amino acids together to form a protein sequence. We have seen the peptide bond,
we saw how amino acids are linked together by the peptide bond, and we saw how we can actually
represent the protein sequence by just writing either the 3-letter codes one after the other
or the 1-letter codes one after the other. We know that the first amino acid will have the
N terminus and the last amino acid will have the C terminus, which means that the first amino acid
is going to have the -NH3+ attached to it and the last amino acid is going to have the
-COO- attached to it, making up the protein chain. We will learn in later classes
how the protein actually folds and how the hydrophobic amino acids tend to remain in
the centre of the protein. Thank you.