This is the first lecture of the course Digital Image Processing, which is offered as a special topics course for both undergraduate and graduate students in engineering and computer science. One thing that I should note is that the material I present here is derived from the book Digital Image Processing, fourth edition, by Gonzalez and Woods, and you can purchase the book either from Amazon or from the publisher's website. In this first lecture, these are the topics that will be covered. First I will give you an introduction to what digital image processing is, and then we talk about the origins of DIP.
Then we talk about some of the applications of digital image processing, then about the fundamental steps that are involved and the different components of any DIP system. That basically wraps up the first chapter of the book. As for the second chapter, we have some introductory material on the elements of visual perception, on how light and the electromagnetic spectrum are defined and categorized, then a little bit about image sensing and acquisition, and finally the sampling and quantization of real-world scenes into digital images.
So the first question that you might ask is: what is digital image processing? To answer that, we first need to define the image itself. You can consider an image as a two-dimensional function f(x, y), with x and y being the spatial coordinates, and the value of f at any pair of coordinates (x, y) being the intensity or gray level of the image at that point. This is the general definition of an image, and in this general definition x and y can be either continuous or discrete values. The same is true for the value at any point, too: f can take any value from 0 to infinity. But when x, y, and the intensity values of f are all finite, discrete quantities, the image is called a digital image. Digital image processing, then, covers all the processes applied to digital images by means of digital computers. Each digital image is composed of a finite number of elements, usually called pixels, though other names are used too: picture elements, image elements, or pels. Now the question becomes: if this is the definition of digital image processing, what are its limits? Is image analysis also digital image processing? Are the more advanced techniques usually used in computer vision also part of digital image processing? The answer is yes and no. To better categorize the different types of processes applied to images, we can define three levels (after we go through them, I will show a tiny array example to make the definition of a digital image concrete). We have the low-level processes, in which both the input and the output of the algorithm are images.
Some examples are noise reduction, contrast enhancement, and image sharpening. Then we have mid-level processes, in which the inputs are usually images but the outputs are image attributes. Examples of this type of process are segmentation and classification of objects.
And then we can have high-level processes, in which, given an image or a set of images, the algorithm tries to make sense of the ensemble of recognized objects. For example, say you have an image of a classroom, so you have chairs, desks, students, pens, books.
If you give this image to a high-level algorithm, then just by detecting each of the individual components and putting them all together, it can give you, for example, a description of what it sees and what it thinks the image is of. Here, for example, it could tell you the image is of a classroom: the input is an image, but the output is a higher-level semantic description of what is happening inside the image. So digital image processing usually consists of processes whose inputs and outputs are images, plus processes that extract attributes from those images.
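As promised, here is the tiny array example. This is only a minimal sketch of my own (not from the book) using NumPy; the array shape and the values are made up purely for illustration, but it shows what the f(x, y) definition boils down to once everything is finite and discrete.

```python
import numpy as np

# A tiny 4x5 8-bit digital image: a finite grid of discrete intensity values.
# Each entry is one pixel (picture element); values range from 0 (black) to 255 (white).
f = np.array([
    [  0,  32,  64,  96, 128],
    [ 32,  64,  96, 128, 160],
    [ 64,  96, 128, 160, 192],
    [ 96, 128, 160, 192, 255],
], dtype=np.uint8)

print(f.shape)   # (4, 5): 4 rows and 5 columns of samples
print(f[2, 3])   # intensity (gray level) of the pixel at row 2, column 3 -> 160
```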
The high-level processes usually fall into the category of computer vision. As for the origins, the earliest application of digital image processing was in the newspaper industry, when pictures had to be transferred across the Atlantic Ocean, for example. Before the introduction of digital images and the transmission equipment, the process could take days or even weeks; after that, in the early 1920s, the time required was reduced significantly, to on the order of hours. In these systems, specialized printing equipment coded the pictures for transmission, they were then transmitted across the ocean, and they were reconstructed at the receiving end. Early systems were only able to code images using five distinct levels of gray.
By the end of the 1920s the capability had increased to 15 levels, and here you can see two examples of such images. But that was the concept of digital images; what about digital image processing? As per the definition we gave before, digital image processing is processing images using digital computers, so the digital computer had to be invented too before we could have digital image processing.

The foundation of digital computers dates back to the 1940s, when the concepts of memories that could hold programs and data, and also conditional branching, were first introduced. Then we had the invention of the transistor at Bell Labs in 1948; the Common Business-Oriented Language (COBOL) and the Formula Translator (FORTRAN) programming languages were invented in the 1950s and 1960s; then came the invention of integrated circuits, and operating systems and microprocessors, which had processing units as well as memory and input/output components, were invented in the 1970s. Later on we had the invention of personal computers, along with large-scale integration and very-large-scale integration in the 1970s and 1980s. In the present time we have computers that fall under the category of ultra-large-scale integration, because we have been able to cram more and more transistors into the limited space available in CPUs. So in some sense, digital image processing and the advancement of computers go hand in hand.
An early example of the application of digital image processing dates back to the Jet Propulsion Laboratory, which used such algorithms to process images of the Moon captured by the Ranger 7 spacecraft. Around the same time, the invention of computerized axial tomography (CAT), also called computerized tomography (CT), became another application of digital image processing in the medical diagnosis domain. When it comes to the applications of digital image processing, as you might imagine, the areas of application are numerous, so it's a good idea to be able to categorize these different areas.
As suggested by the book and followed here, we base this categorization on the sources of the images: they can come from electromagnetic signals, acoustics, ultrasound, electron beams, or they can be computer generated. So let's start with the electromagnetic signals that are used for imaging, with a brief introduction to how electromagnetic waves are defined. They can be defined either as propagating sinusoidal waves of varying wavelength, or as a stream of massless particles traveling in a wave-like pattern; these two definitions depend on whether you treat light as a wave or as a particle. Each massless particle, which we call a photon, carries a certain amount of energy, and this energy is proportional to the frequency (and therefore inversely proportional to the wavelength). The equation for the energy of a photon is E = hf, in which h is the Planck constant and f is the frequency; the frequency in turn is the ratio between the speed of light in vacuum and the wavelength, f = c / λ. The wavelength is simply the physical distance between two consecutive peaks of the sinusoidal wave. Based on this definition, the range of electromagnetic energy can be divided into several regions. At the higher-energy end we have gamma rays, X-rays, and ultraviolet; then we have the range of visible electromagnetic waves; and then we have infrared, microwaves, and radio waves, which carry lower amounts of energy.
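To make the photon energy relation concrete, here is a small sketch of my own (not from the book) that evaluates E = hf = hc/λ for a few representative wavelengths; the chosen wavelengths are just examples.

```python
# Photon energy E = h*f, with f = c / wavelength, so E = h*c / wavelength.
h = 6.626e-34   # Planck constant, J*s
c = 2.998e8     # speed of light in vacuum, m/s

# Example wavelengths (metres): an X-ray, green visible light, and a microwave.
wavelengths = {"x-ray (1 nm)": 1e-9,
               "green light (550 nm)": 550e-9,
               "microwave (1 cm)": 1e-2}

for name, lam in wavelengths.items():
    f = c / lam   # frequency in Hz
    E = h * f     # photon energy in joules
    print(f"{name}: f = {f:.3e} Hz, E = {E:.3e} J")
```

Running it shows exactly the ordering described above: the shorter the wavelength, the higher the photon energy.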
As for the applications: on the left you see several applications of gamma-ray imaging. We have a bone scan, we have a PET (positron emission tomography) image, we have a satellite image of the Cygnus Loop, which is a celestial object, and we have gamma radiation from a reactor valve; these fall under the category of gamma-ray imaging. As for X-rays, I'm sure you are all familiar with the different types of X-ray images: we have a chest X-ray, we have an aortic angiogram, we have a head CT image, we have applications of X-ray imaging in electronics to detect faults and errors in circuits, and then we have examples of X-ray imaging of the night sky, you could say.
This is an image of the same Cygnus Loop that we saw in the gamma-ray range, now imaged in the X-ray range. Then we have some applications of ultraviolet imaging.
For example, the first two are used in the food industry: the one on the left is an ultraviolet image of normal corn, and the next one is corn infected by smut. Then we have, again, the Cygnus Loop imaged in the ultraviolet range. As for imaging in the visible range, you are familiar with most of its applications; the simplest are the images we capture with our phones every day. We also have other types of imaging in the visible range using light microscopes.
We have different examples here: this is an image of a microprocessor taken with a light microscope, this is a microscope image of cholesterol, and this is an image of an organic superconductor; as you can see in the legend, they all required different levels of magnification. We also have examples of imaging in the visible range using satellites. These are some examples of satellite images of the Washington, D.C. area; the first three are captured in the visible range.
We have visible blue, visible green, and visible red, and the ones underneath are captured in the infrared range. As you can see in the accompanying table, each band is used to highlight different types of features in the image; for example, some highlight vegetation, some the moisture content of the soil, and so on. This is another example of an infrared satellite image, this time of the Americas, which corresponds to the amount of light pollution in different areas of the continent. On the right, we have one example of microwave imaging, or radar imaging.
It shows mountains in Tibet. Radar imaging is typically used when we want to reduce the adverse effect of occlusions in satellite images: if we image the same area in the visible range, the view will probably be blocked by clouds, but if we do the imaging in the microwave range, we can effectively eliminate the clouds from the images we get. As for examples of radio-wave imaging, MRI is a good one, and you see two MRI examples here.
In MRI, what happens is that we have a very large constant magnet that helps align many (not all) of the hydrogen nuclei in the body along the axis of its magnetic field. We then excite different regions of the human body using radio waves, and detectors record the signal the nuclei emit as they return to their original state. By recording that for the entire volume, we can create volumetric data using the MRI machine. Underneath, you see one example of a celestial object, the Crab Pulsar, imaged in different ranges of the electromagnetic spectrum: we have the gamma-ray image, then the X-ray, the optical image, the infrared, and the radio-wave image. These images are captured from the same object and are aligned, but by imaging in different ranges of the electromagnetic spectrum we can see different types of information.

As for the other sources used for capturing images, I mentioned acoustics, ultrasound, electron beams, and computer-generated images. The one on the top left is a cross-section of a seismic model, typically acquired under the ocean, that is usually used to detect reservoirs of oil and gas. The four images on the top right are examples of ultrasound imaging. Underneath, we have examples from a scanning electron microscope, which uses a focused beam of electrons to image the surfaces of microscopic samples. Finally, on the left, we have some examples of computer-generated models or computer-generated images.

There are several fundamental steps in digital image processing that are usually followed and are discussed in the textbook for the course. After deciding on the problem domain, we have the step of image acquisition, and doing that requires specialized tools: whether it's imaging in the visible spectrum or using a scanning electron microscope, each requires its own set of image acquisition equipment. After we acquire the images, the next step is usually image filtering and enhancement: for filtering, you can think of noise reduction as one example; another example is improving the contrast of the acquired images. Then we might have image restoration, which mainly deals with noise reduction but also covers more advanced topics like reducing blur or motion artifacts.
Then we might have color image processing, especially if our images are in color. Even if they are grayscale images, we can use pseudocolor image processing, mainly because the human visual system is only capable of distinguishing a few dozen different intensity levels in a gray image, while at the same time human vision is very sensitive to even slight changes in color, so it can distinguish a much larger number of distinct levels when the image is in color. That's another step that is usually considered. Then we can have wavelets and other image transforms. One thing that I forgot to mention about image filtering and enhancement: we can divide the techniques used there into either spatial-domain processing or transform-domain processing. In spatial-domain processing, we work directly on the pixels of the image.
But in transform-domain processing, we first apply some transformation to the image and then do whatever processing we need in the transform domain. A very good example of this procedure is the Fourier transform, which we will discuss later in this course. Wavelet transforms, you could say, are derived from the Fourier transform, but they have their own properties as well and can be more suitable for some image processing tasks; wavelets are only one family of transforms, and there are other types, like curvelets and related constructions, which we don't cover in this course. Then we have the steps of compression and watermarking, and then morphological image processing; for these processes the inputs and outputs are generally both images. Then we can have different types of segmentation, ranging from point or edge detection to more advanced topics like region or object segmentation. We can have feature extraction, and finally we can have image pattern classification; these techniques generally take images as inputs, but their outputs are image attributes. In this course, these three chapters won't be discussed, and not all sections of the chapters shown in white are covered either, because some of the topics are better suited for more advanced courses.
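To make the spatial-domain versus transform-domain distinction from a moment ago a bit more concrete, here is a minimal sketch of my own (not the book's code) of a crude low-pass filter applied in the Fourier domain using NumPy's FFT; the random test image and the cutoff radius of 20 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((128, 128))           # stand-in image; normally you would load a real one

# Transform-domain processing: go to the Fourier domain, modify, come back.
F = np.fft.fftshift(np.fft.fft2(img))  # 2D FFT, zero frequency moved to the centre

# Build a circular low-pass mask (keep frequencies within an arbitrary radius of 20).
rows, cols = img.shape
y, x = np.ogrid[:rows, :cols]
dist = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
mask = dist <= 20

# Apply the mask and return to the spatial domain; the result is a smoothed image.
smoothed = np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
print(smoothed.shape)
```

A spatial-domain version of the same idea would instead slide a small averaging kernel directly over the pixels; both routes exist, and which one is preferable depends on the task.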
As for the components of a general digital image processing system: we have the problem domain, we have the image sensors used for image acquisition, and then we have some specialized image processing hardware. The data, after capture, is transferred to a computer, at which point the processes are largely common across all types of image processing techniques. We may have hardcopy devices, we may have image displays, mass storage, and image processing software, and finally the data can be transferred to different locations over a network or even put on the cloud. So a digital image processing system usually involves these components. Now that we have a better understanding of what digital image processing is and of the different components of a digital image processing system, let's talk a little bit about the elements of visual perception, and to begin, let's talk about how the human eye works.
On the right you see a generic diagram of the human eye. It roughly resembles a sphere with a diameter of around 20 millimeters, and it has several components. We have the cornea, we have the iris, we have the ciliary muscles that are connected to the lens and help change its shape, and on the boundary we have the choroid and the ciliary body; the inner membrane at the back of the eye is called the retina. The retina itself has several regions; we have the blind spot and we have the fovea. The lens is responsible for focusing the light on the retina, the ciliary muscles are responsible for changing the shape of the lens, and the retina contains different types of light-sensitive cells that absorb the light and let us see the image we see.

We have two types of cells: cone cells and rod cells. There are around six to seven million cone cells per eye; they are very sensitive to color, they are mainly responsible for photopic, or bright-light, vision, and they are mainly concentrated around the fovea. There are three different types of cone cells, red, green, and blue, which we don't talk about at this point. Then we have the rod cells that cover the rest of the retina; there are between 75 and 150 million rod cells, they are not sensitive to color, but they are sensitive to low levels of illumination, so they are mainly responsible for scotopic, or dim-light, vision. Here you see a profile of the distribution of cone and rod cells across the retina: if you consider the fovea to be at zero degrees, you see that the cone cells are concentrated around the fovea, while the rod cells are distributed pretty much everywhere else, and at the blind spot, which is where the optic nerve leaves the eye, we have neither cones nor rods.

In ordinary cameras, the lens has a fixed shape and a fixed focal length, and if we want to bring a subject into focus, we do it by varying the distance between the lens and the imaging plane. In the human eye, though, the distance between the center of the lens and the retina is fixed, so to achieve focus the focal length itself has to change, using those ciliary muscles I mentioned on the previous slide. By changing the shape, and therefore the focal length, of the lens, the eye achieves its focusing capability.
As for the sensitivity of the human eye to changes in intensity: the eye can adapt to a very large range of intensities. This range is shown on a log scale and is measured in millilamberts; it covers about 10 orders of magnitude, going from 10 to the power of negative 6 up to 10 to the power of 4, which is basically the glare limit. One thing you should keep in mind, though, is that the human eye is not sensitive to this full range at any given time. The way it achieves this wide range is through brightness adaptation: at any given time the eye is only sensitive to a smaller sub-range, and that sub-range shifts depending on the overall brightness level.

Perceived, or subjective, brightness is also affected by two phenomena. The first is that the visual system tends to undershoot or overshoot around the boundaries of regions with different intensities. To test that, we can create a set of bars, each with exactly constant gray level; the actual intensity profile of this figure is a staircase, but the perceived intensity near each boundary shows an undershoot on the darker side and an overshoot on the brighter side. These are called Mach bands. The other phenomenon affecting perceived brightness is that a region's perceived brightness does not depend only on its own intensity; this is what is called simultaneous contrast. To see an example: in these three figures, the square in the middle has exactly the same intensity in all three cases, but because the surroundings have different intensities, we perceive the square with the brighter surround as darker than the one with the darker surround.
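Here is a tiny sketch of my own (not from the book) that generates the kind of constant-intensity bar pattern used to show Mach bands: every bar is perfectly uniform in the data, yet near each boundary most observers perceive a darker band on the dark side and a brighter band on the bright side. The number of bars, their widths, and the gray levels are arbitrary.

```python
import numpy as np

# A staircase of uniform gray bars: 6 bars, each 50 pixels wide and 100 pixels tall.
levels = np.linspace(60, 200, 6).astype(np.uint8)   # arbitrary gray levels
bar_width, height = 50, 100
strip = np.repeat(levels, bar_width)                 # one scan line: 6 flat steps
mach = np.tile(strip, (height, 1))                   # stack it into a 2D image

# Each bar really is constant -- the banding seen near the edges is purely perceptual.
print(np.unique(mach))   # only the 6 programmed gray levels are present
```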
As for light and the electromagnetic spectrum, which we touched on before: in 1666 Isaac Newton discovered that when a beam of sunlight passes through a glass prism, it is decomposed into a continuous spectrum of colors ranging from violet to red. The visible spectrum spans roughly 400 nanometers to 700 nanometers, with the shorter wavelengths belonging to violet and the longer wavelengths belonging to red. If you remember from one of the previous slides, the energy of a photon was E = hc / λ.
So, as you can see, a larger λ means a smaller energy, and λ is simply the distance between two consecutive peaks of the sinusoidal wave. Now that we have seen where the visible range sits within the electromagnetic spectrum, let's see how we perceive colors. The colors we perceive in an object are determined by the nature of the light reflected by it. Based on that, we can define two types of light. We can have monochromatic, or achromatic, light, which is light that is void of color and is represented only by its intensity, or gray level, ranging from black to white. And then we can have chromatic light, which spans roughly 0.43 to 0.79 micrometers of
the electromagnetic spectrum. But the definition of light by itself is not enough; we have to distinguish between radiance, luminance, and brightness. Radiance is the total amount of energy that flows from the light source, and it is measured in watts. Luminance is the amount of energy an observer perceives from the light source, measured in lumens. What you can see is that radiance and luminance are not the same thing: a source can have a radiance of, say, several watts in the ultraviolet range, but since the human eye cannot perceive ultraviolet light, the luminance is zero, or very low. Finally, we have the concept of brightness, which is a subjective descriptor of light perception; it is essentially impossible to measure, because it differs from person to person, and it embodies the achromatic notion of intensity.

Now let's see how image sensing and acquisition work. Images are generated by the combination of an illumination source and the reflection or absorption of energy from that source by the elements of the scene. That was a mouthful, but the basic idea is that we have some illumination source shining on a scene, and depending on how the objects in the scene reflect or absorb that energy, we can create an image. As you can see, I put "illumination" and "scene" in quotes, and the reason is that these concepts are very general.
The illumination can come from a source of electromagnetic energy, like the visible spectrum, or from something completely different, like an ultrasonic wave or a beam of electrons, or it can even be computer generated. The same is true for the scene elements: they can be familiar objects like a desk or a chair, or they can be exotic ones like molecules, rock formations, the human brain, and so on. To sense or capture images, we need a sensing element.
The general diagram of a sensing element looks like this: we have some energy or illumination source, and we have some sensing material that converts the incoming energy into an electrical signal that we can record and then display as an image. This is a single sensing element. We can put a number of these together to create a line sensor, or, even better, arrange them into an array sensor; but that cannot be done, or isn't recommended, for all types of imaging, and you will see why in a second. Even using a single sensing element, it is possible to create line or array images. Here is one example of combining a single sensing element with mechanical motion to generate a 2D image.
Or we can use line sensors to create 2D images. Here is an example of a line sensor combined with linear motion to scan a larger area, which is commonly used in satellite imaging. Another arrangement is found in CT machines, where a line of sensing elements is placed around a ring, together with an X-ray source. At any given time, the X-ray source provides the illumination, and on the other side we capture the amount of X-ray energy that passes through the subject being imaged.
Based on that, we perform reconstruction and create images. But perhaps the most familiar type of sensor used for image acquisition is the array sensor found in digital cameras. In this case we have a scene, we have an illumination source, and we have an imaging system that captures the light reflected by (or transmitted through) the objects in the scene onto an internal image plane; the output is a digitized image. Now that we want to talk about the concept of digitization, it's good to have a model for image formation.
As before, we define an image as a function of spatial coordinates x and y, and the value at each coordinate location is a scalar quantity proportional to the energy radiated by a physical source. Because of that, the values of f are non-negative and finite, since the energy source can only provide a finite amount of energy. We can characterize this function by two components: first we have the illumination i(x, y), which is the amount of source illumination incident on the scene being viewed, and then we have the reflectance r(x, y), which is the fraction of that illumination reflected by the objects in the scene. Given this definition, f is the product of i and r: f(x, y) = i(x, y) r(x, y). In theory, i can range anywhere from 0 to infinity, but r can only lie between 0 and 1: 0 means no reflection and 1 means total reflection. In cases where the light actually passes through the object, for example in X-ray imaging, we replace the reflectance with a transmissivity function, but the concept is still the same.
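Here is a minimal sketch of the f(x, y) = i(x, y) r(x, y) model (my own illustration; the illumination fall-off and the reflectance pattern are made up): illumination is non-negative, reflectance stays in [0, 1], and the image is simply their pointwise product.

```python
import numpy as np

M, N = 100, 100
y, x = np.mgrid[0:M, 0:N]

# Illumination i(x, y): e.g. a light that falls off from left to right (always >= 0).
i = 500.0 * np.exp(-x / 80.0)

# Reflectance r(x, y): a bright square (r = 0.8) on a darker background (r = 0.1),
# always kept between 0 (no reflection) and 1 (total reflection).
r = np.full((M, N), 0.1)
r[30:70, 30:70] = 0.8

f = i * r                      # image formation: f(x, y) = i(x, y) * r(x, y)
print(f.min() >= 0, f.max())   # values are non-negative and finite
```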
So now let's see how we can go from a continuous-domain scene to a digital image. To do that, we need two different processes. First we have sampling, which is the digitization of the spatial coordinates.
And then we have quantization, which is the digitization of the function values. Let's look at a simple example. Assume we have this scene, with this object, in the continuous domain, and let's see what we get along the line drawn between point A and point B. The first part of the line lies in the bright region, so its intensity is high; then we suddenly drop onto the object, where we have slight variations in intensity; and finally, after this point, the last part of the line is again in the bright region, so we are back at a high intensity. One thing you will notice is that, since we are dealing with a continuous physical scene, there is always some source of noise, too.
That's why the profile drawn here contains noise. Sampling says that we divide this line into samples taken at a fixed spacing; these blue dots are the samples we take from the image. This is the sampling part, in which we digitize the spatial coordinates.
But what about the function values? Even though we did the sampling, the value at any of these samples can still be anything. Quantization defines a set of ranges, and if a sample falls within a given range, that range's representative intensity value is assigned to the sample.
This is the result of sampling followed by quantization, and we do this for all the lines in the image; by doing that, we go from the continuous image to a sampled and quantized version of it. To put it in more mathematical terms, assume we have f(s, t) as a continuous image function; we use sampling and quantization to create the digital image f(x, y), containing M rows and N columns.
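Here is a minimal sketch of my own (not the book's code) of the two steps on a single scan line: a "continuous" profile is approximated by a finely computed function, then sampled at a fixed spacing and quantized to L = 2^k levels. The profile shape, noise level, spacing, and k are all arbitrary choices.

```python
import numpy as np

# A stand-in for the continuous scan line: bright, a noisy dip over the object, bright again.
s = np.linspace(0.0, 1.0, 10_000)   # fine grid approximating the continuous axis
continuous = (0.9 - 0.6 * ((s > 0.3) & (s < 0.7))
              + 0.02 * np.random.default_rng(1).standard_normal(s.size))

# Sampling: keep only every 250th point (digitize the spatial coordinate).
samples = continuous[::250]

# Quantization: map each sample to one of L = 2**k allowed intensity levels.
k = 3
L = 2 ** k
quantized = np.clip(np.round(samples * (L - 1)), 0, L - 1).astype(int)

print(samples[:5])     # still arbitrary real values
print(quantized[:5])   # now restricted to the integers 0 .. L-1
```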
The spatial coordinates x and y take integer values: x ranges from 0 to M minus 1, and y ranges from 0 to N minus 1. As for quantization, we define L as the number of intensity levels, and it is usually chosen to be a power of 2. The reason is that we are working with digital computers, which work with zeros and ones, in other words in base 2, so it is natural to represent the number of intensity levels as a power of 2: L = 2 to the power of k, where k is the number of bits used to represent each intensity level. For example, if we are working with an 8-bit image, we have 256 intensity levels, ranging from 0 to 255.

You can visualize an image in different forms. You can visualize it as a 3D surface, with x and y being the spatial coordinates and z being the function value; or you can show it as an image, where x and y are still the spatial coordinates but instead of showing z as the height at a given point we show it with different shades; or you can use a matrix view, where we simply see the numbers. The function view and the matrix view are essentially the same notation: in one we write each value as the value of the function f at that location, and in the other we only show the values themselves without writing them as samples of a function. We can also lay the pixels out on integer locations in an (x, y) coordinate system. With this convention, any pixel can be referenced using two indices, f(i, j), with i giving the row number and j giving the column number, and based on that we can define, for example, the center of the image, which is at (floor of M over 2, floor of N over 2). The floor operator returns the largest integer that does not exceed its argument: if M is 512, the floor of 512 over 2 is 256; if M is 511, the floor of 511 over 2, that is of 255.5, is 255.

Now let's see what the effects of spatial and intensity sampling are, and to do that we introduce the concept of resolution; we have both spatial resolution and intensity resolution. Spatial resolution is defined as the size of the smallest perceptible detail in an image; in other words, what is the smallest object in the image that can still be made out at a given resolution. We can measure it by the number of pixels per unit distance, so spatial resolution has a physical meaning to it too: how many pixels we have per centimeter or per inch. Spatial resolution therefore depends on the sampling rate. Here you see several versions of exactly the same image but with different spatial resolutions. The image at the top left is printed at 930 dots per inch and the one on the right at 300 dots per inch; the difference between the two is not that significant, but you can still see the effect, for example on the hand of the clock, where the lower resolution produces some jagged edges. When you reduce the spatial resolution even more, the effect is more pronounced: this is the image at 150 dots per inch, and this one at 72 dots per inch, where, as you can see, a lot of the content is no longer rendered accurately.
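As a small illustration of reduced spatial resolution (my own sketch; the test pattern and the downsampling factor are arbitrary), the code below throws away samples and then stretches the result back to the original size, which is roughly what viewing a low-dpi scan at full size does; diagonal edges come back jagged.

```python
import numpy as np

# Test image: a white diagonal band on a black background.
M = N = 256
y, x = np.mgrid[0:M, 0:N]
img = ((np.abs(x - y) < 10) * 255).astype(np.uint8)

factor = 8
low = img[::factor, ::factor]                  # keep every 8th sample in each direction
# Blow it back up by pixel replication so both versions have the same physical size.
restored = np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)

print(img.shape, low.shape, restored.shape)    # (256, 256) (32, 32) (256, 256)
# 'restored' shows the staircase (jagged) edges typical of too few samples per unit distance.
```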
So what about intensity resolution? Intensity resolution is defined as the smallest discernible change in intensity level, and it is usually measured by the number of bits used for quantization. Remember, I talked before about an 8-bit image; an 8-bit image means we have 256 different intensity levels. Now let's see what happens with a medical image. This image is captured using 256 intensity levels, so it is basically an 8-bit image, and this one is rendered using 128 intensity levels; the difference between the two is not that obvious. But as we keep decreasing the intensity resolution, patterns start to emerge that are not actually in the image itself; this is usually called false contouring. For example, in this image, which uses only 16 intensity levels, you see contours that are not visible in the original image, and they are definitely an artifact of the low intensity resolution. Finally, if we go down to two intensity levels, there is basically no distinction between different parts of the image other than brighter regions versus darker regions: the background is now completely dark and the foreground completely white. To summarize, both spatial and intensity resolution are digitization dependent. Spatial resolution depends on the number of samples,
which we call N, and intensity resolution depends on the number of bits, which we call k. They each cause different types of artifacts: when the spatial resolution is too low, we end up with jagged edges and lines, and when the intensity resolution is too low, we end up with images that show false contouring. As for sensitivity, spatial resolution matters more for the shape variations that are common in images, while intensity resolution matters more for smooth lighting variations.
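And here is a matching sketch for intensity resolution (again my own illustration, not from the book): an 8-bit gradient is requantized to k bits, and with small k the smooth ramp breaks into visible flat steps, which is exactly the false contouring discussed above.

```python
import numpy as np

# A smooth 8-bit horizontal gradient, a worst case for false contouring.
img = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

def requantize(image, k):
    """Reduce an 8-bit image to 2**k intensity levels, keeping values in the 0..255 range."""
    step = 256 // (2 ** k)
    return (image // step) * step

for k in (7, 4, 2, 1):
    q = requantize(img, k)
    print(f"k = {k}: {len(np.unique(q))} distinct levels")   # 128, 16, 4, 2
```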
Now let's take a look at these three images. On the left, we have the least geometric detail but the most lighting information, and the reason is that we have many different shades of gray
in different regions of the image; for example, if you look here, we go from a very bright shade to a much darker shade, and the same is obvious in other parts of the image as well. In the one in the middle, we have much more geometric detail but less lighting information; you can clearly divide this image into a very bright background and a very dark foreground. Finally, in the last one, we have the most geometric detail, because there are many more shape variations in the image, but the lighting information is much less than in the previous two images. So the question now becomes: what would be a good digitization scheme for these images? Is it possible to have a single scheme for all of them, or do they require different schemes? To answer this question, experiments were done in the 1960s with human subjects: sets of similar images were created at different intensity and spatial resolutions, and subjective comparisons were derived of how the human visual system perceives quality across them. To draw conclusions, they created graphs like this one, in which the horizontal axis shows the number of samples N and the vertical axis shows the number of intensity levels (in bits, k), and they plotted these isopreference curves, which connect images judged by observers to have the same quality. For example, here, in the case of the crowd image, all the points along the curve represent almost the same perceived quality; as you can see in this example, different numbers of intensity levels did not have that much effect on the perceived quality. On the other hand, in the face example, we see a more strongly curved pattern. The conclusion was that images with more shape detail, for example the crowd, need fewer intensity levels to achieve the same quality: as I said, along this line you see that the quality at, for example, N equal to 100 with 2 to the power of 4 intensity levels is judged equal to images at the same sampling rate but with 2 to the power of 6 intensity levels. On the other hand, images with less shape detail, for example the face image, are more sensitive to the intensity resolution but less sensitive to the spatial resolution.
So this was the end of the first lecture. In the next lecture, I'm going to talk more about different types of intensity transformations and the basic mathematics involved in digital image processing.