Transcript for:
Grain Size Analysis Using Python

Hey guys, this is Srini and you're watching Python tutorial videos on my YouTube channel Python for Microscopists. In today's tutorial, let's work on a real-life application, which is grain size analysis. And as the name suggests, this is nothing but loading an image that has whole bunch of grains and getting some statistics out of it. What does the grain size distribution look like?

If you are a material scientist or geologist you probably know what I'm talking about. If you are a life scientist think of this as taking an image with a whole bunch of cells and trying to understand the size distribution of these cells. So let's let's go ahead and jump in and the image I'm gonna use today is pretty much this image and I Google searched online for you know grains or microscope and grain or microstructure of a grain and I found the one that I think is best demonstrates purpose here so this appears to be an optical microscope image and it is an RGB image so let's convert that to grayscale to begin with But if you work on a scanning electron microscope, then you probably have your microstructure pretty much looks pretty much the same, you know, from your sample, except the scale may be different.

And when I mention about scale, it's also important for us to define the scale in this case. Right. If you ask me, what is the average grain size of grains here? If I say, yeah, it is 200 pixels.

What does that mean? What does that mean? But if I say the average grain size is 0.85 microstructure.

microns that does mean something to you right so that's why we should also keep track of the pixel size when you work with microscope images, so let's jump in and one other thing I should mention is you should excuse me when I keep turning this side and then looking at my notes here because This is a going to be a relatively long exercise So I just want to make sure I hit the right topics, and I want to make sure that I am not skipping some of the key information for example the structure parameter. Don't worry about it yet but I really would like to talk at least for five seconds about it. So I'll keep looking on this screen but hopefully it shouldn't bother you and I hope you are focusing more on the code that I'm typing anyway.

So let's go ahead and go through these steps. Step one is reading image and defining the pixel size. and I always like to break down any scenario into various steps so I know I can focus on certain things and that's exactly why I divided this here.

Step one is obviously we need to read the image. Step two is clean up the image if needed. Step three is also clean up like eroding, dilation and some of those.

Step four is we need to label the grains once we identify them and then measure it and then output the results to csv file. So now we can break down things you know accordingly. Step zero you probably know always import cv2 import numpy as np and import by the way please excuse me looking down because I have never mastered mastered mastered mastered I've never mastered the art of typing by not looking at the keyboard so just again one other thing little thing that you need to know about me so from matplotlib import pyplot as PLT what else do we need to import I know I need nd image so from SK image import nd image and also from SK this one is from scipy sorry scipy import in the image and from SK scikit image I want to import IO I want to import color because I want to show I want to display images in color so we can I mean so we can assign different colors to the grains yeah and what else I want to import measure plus eventually we would like to measure do some measurements so this one says there is an error invalid syntax what did I do yeah so that makes sense from matplotlib okay so that's our step zero now let's do our step number one which is load our image and we know how to do that image equals to cv2 dot m read and our image is in images slash and what is it called grains2 dot jpeg and we may as well read this I mean let's read this as a gray level image and not as a color image color image would be a number one over there so once we read our image I'm also going to define pixels to microns a parameter and I'm just gonna call this let's do 0.5 so what that means is I'm just this is nothing but one equals to 0.5 micrometers or 500 nanometers yeah that's exactly so eventually when we do our measurements I just want to multiply every pixel by 500 that's pretty much it or 0.5 so that's why I'm defining it right now so let's just go ahead and continue.

Now if your image actually has a scale bar this image doesn't have a scale bar down here but most microscope images I hate that. It's important to know the micron you know what the pixel size is but embedding a scale bar other than for publishing purposes I don't see a real reason. Anyway if you're working with images that has scale bar you obviously need to crop your image and you probably know how to do that cropping an image. go ahead and show that cropped image equals to we are just going to slice our data set and which data set image array and if you are going to go this way you know crop some of the pixels then you just do 0 to whatever 300 let's say you just want to take the first 300 from the top and then all of the pixels in the X direction, in the width direction. So from height, 0 to 300, and from width, it's every pixel over there.

Luckily, we don't have to do that for the image that we are working with. So let's continue to our step number two. So now that we have a pixel now, let's do denoising but this image looks very clean So let's not worry about denoising if there is a lot of noise You will see that when you threshold it which which is when we can come back and then do some denoising, okay?

Let's skip the denoising there and what is the next step here? And by the way again, I'm looking at my notes by the way If you do end up doing denoising like I mentioned this multiple times in my previous tutorials the best ones would be a median filter or better yet for microscope images and non-local means filter is the best choice for denoising. So once you have that then we are all set you know so the next step is step 2 denoising if required and threshold image to separate grains from boundaries. So let's go ahead and do thresholding.

We looked at thresholding in our previous tutorials again step one step one Step one as part of thresholding is to actually looking at the histogram itself. So PLT dot histogram Yeah, so look at the histogram and the histogram here is of our image And our image is a 2d array to do histogram you need 1d array. So I'm gonna flatten this okay image.flat all it does is just takes this 2d and then just flattens it to 1d array and then let's plot this to 100 bins and then let's do our range equals to this is an 8-bit image so 0 to 255 so i just want to look at the entire range okay so now when i run this we should see a histogram there and that looks very nice our image is uh 8-bit and that histogram a bunch of pixels around 220 ish or so 210 and I see like as a valley right around 150 to 160. So we can do two things one of two things we can do manual thresholding or we can do some sort of an automatic thresholding using OTSU and I've covered this in the previous you know tutorial anyway so let's go ahead and do return and and and by the way I'm unwrapping again once I apply this CV to dot threshold I'm gonna get two things out one is the thresholded value and also the image itself or the array itself after thresholding so we'll see that in a second so the way you do that is CV to dot fresh and which image are we going to threshold or IMG that's what we are going to threshold and starting at 0 pixel value and once it's thresholded assign a value of 255 to all the thresholded pixels and using what algorithm do you want it to threshold so I'm going to use the th or ESH underscore binary threshold method plus CV sorry lowercase cv2 dot d h r e s h underscore otsu o t s u okay So this is, and again let's run the code to make sure everything is okay. So obviously something is not okay here.

So what did I do? CV2 has no attribute threshold. Sorry this should be threshold.

So this is why it's very important to run the code after almost after every line so it runs okay this time and now you see that ret value which is the OTSU suggested or picked threshold value for this image is 157 so we were okay with our guess between 150 to 160 so this is 157 and and if you look at the threshold it's also a 8-bit image It's not a binary image. It's an 8-bit image with values of 255 because we said all the thresholded pixels should have 255 so all the pixels corresponding to our grain Will have a value of 255 all the pixels corresponding to grain boundary will have a value of 0 so this is only a thresholded image, and this is not a this is not a binary image Okay, so we need to convert this thresholded image into a binary image, but before moving on you can actually look at this image of this Thresholded image so one way you know that cv2 dot im show Actually I need to give a Title and then the source and do not forget to do wait key This is how long does the system wait before the window closes if I put zero that means I'm going to close it manually if you don't put this it may crash you know so the kernel restarts and hell will unleash. So let's go ahead and run this and here is my thresholded image.

That looks great actually except a few of these areas. I see some missing some missing pixels. So I wonder if we can close some of these areas by eroding and dilating. When I erode my grains will shrink by one pixel and when I dilate them back they go back up by pixel and in this process when I shrink I'm hoping that some of these dark areas that the grain boundaries would connect and then give us a slightly better definition of these grains.

So let's go ahead and clean the image up and I guess we are getting into step 3 now. Step number three, clean up the image if needed. Well, it shouldn't say clean up. This is basically, well, maybe clean up is the right term.

So before, I mean, as part of our erode and dilate operations, it needs a kernel size. So let me go ahead and define kernel. Again, I have done this exercise in my previous tutorial, so I'm going a bit fast right now.

So I'm going to create np.1s. You know, this is my kernel of 3x3 and of type, let's np.uint8. Yeah.

So this is my kernel. And now I can just say my eroded image equals to cv2.erode. okay and which image are we eroding not the original image but the thresholded image yeah and then the other parameter is kernel and how many times iterations equal to one so now let's look at our eroded actually let's leave the original image as is and then add another line for eroded image so we can compare both eroded image equals to erode let's run this so So there you go. So my original image eroded image. All I did is you know pixels down.

I could have started with dilate to actually fill some of these. There is operation called hole fill. There are a few things that you can try.

But I don't want to experiment with that right now. But I eroded this. Now let's go ahead and dilate it. So we. not changing anything here so I mean so we go back with without changing the image by a lot so now I want to dilate this and the operation is CV to dilate and we are going to dilate the eroded image so my input image is eroded so let's actually look at dilated image now okay dilated.

So when I run this we should see on the left hand side is the actual threshold image on the right hand side this is eroded once and dilated. Some areas may be increased like if you look at this region I'm not really happy that I'm changing information in this region so I'm gonna keep it as is for now. Let's see let's do our analysis and we can actually use our threshold image as inputs to our next step rather than dilated so it doesn't hurt to do this exercise anyway. So now that we have done some erosion and dilation so let's move on to the next step which is what is my next step?

Clean up the image is done yeah now yeah so we need to convert this into a binary image because the thresholded image is nothing but a 8-bit integer with all values of 255 and zeros. It's a binary image 255 and zeros but the system doesn't know that it's a binary image the system knows that it is a 8-bit integer image so let's convert this thresholded image into a binary so I'm gonna call this my mask equals to and you probably know how to do this yeah so I'm gonna take my dilated image and if equals equals to 255. So what this does is at every pixel location if or every point in this array if the value is equals to 255 then the mask would be At that position would be true right you know this logical operation Anytime you do greater than less than or double equals this is comparing something with something else right so we are saying that in Dilated if the value is equal to 255 then it would return a true if it's not 255 meaning if it is 0 it returns at false so when we run this again we should now see mask a boolean of the same area size as our input image where every value is true or false okay so this is the binary image that we have actually generated and I owe dot M show let's go ahead and look at our mask so when I run this you should see the mask right there okay and why did I use io.imshow and not cv2 for whatever reason when I tried this during my practice run In fact, I had to re-record this video until this point because cv2.imshow is not showing the boolean image and I didn't want to dig into why. But this actually works so I'm going to use this.

So by the way, at this point, let's go ahead and zoom in and see how it looks like. let's do our image size is 354 by 410 so if we go to 250 by 250 or two let's just do 280 yeah 250 by 280 in x and then 250 by to 80 so let's just this is a way of zooming in so we only see that small box and you can see this is my grain boundary and this is my grain here okay so we are good with that so we are done with our mask and what is our next uh step uh so we created the mask so we're done with step three step four is label the grains in the masked image so now we need to label the grains Step four right so How do we label the grains let me look at my notes so I do not so I don't forget talking about This one thing called structure Before explaining that let me go ahead and and and just write this write this line down so So you know exactly what I mean one one one And, uh, da-da-da-da-da, one, one, one, and, uh, one, one. So I'm defining this as my structure factor, which I'm going to explain in a second.

And then the label mask in in ND image. The reason why I imported ND image here is in ND image. There is a function called label, which actually labels each of these unconnected grains.

If these grains are connected even by a single pixel. Meaning then that's one object all together. So all this label does is it actually goes through this and it it Figures out well, it actually assigns a label to all Unconnected objects now this s is a way of defining what that connection is whether they are connected or unconnected okay so that's pretty much it so label mask and number of labels these are the values that it's going to return the nd image dot label is going to return Yeah, so in the image that label is going to return label underscore mask and num labels Okay, so so far fine and on what image are we going to apply? Obviously we need a mask and then the structure equals to s okay and our s is one one one one one one one one one the reason why we have to define this If you do not define you can just run mask then it's going to use again. Let me look at the numbers 0 1 0 1 1 1 and 0 1 0 instead of all 1 1 1s.

Why does it matter? Well, they call this like 8 connectivity. This is 8 connectivity. You can look up the documentation and the other one is 4 connectivity.

What that basically means is the diagonal pixels will be included as part of the structure. Okay, when you do this, all of this and what a structure basically means it specifies when to consider a pixel. to be connected to another nearby pixel.

Okay so in a way when we are labeling these images it's actually using the structure factor to figure out whether an object is an independent object or whether it is connected to something else so that object would be a bigger object that's pretty much it. Okay and the reason why I'm using 111 all these ones is this is exactly what image j uses and image j is this imaging desktop program that microscopists use for image processing. So just to make sure I'm keeping the convention the same with imageJ, I'm just using all ones.

That's pretty much it. Anyway that's my notes on this and I don't want to fail about talking this about this. So anyway, so now let's go ahead and run this to make sure everything is working fine.

So that seems to be fine. And our label mask, as you can see, is a 32-bit integer with values going from 1. I also see some 16s there. And our number of labels, it says the value here is 296. So that kind of tells me it found 296 grains in this image. Okay. So it's now all of these objects are labeled.

So the next step is, in fact, I'm thinking whether we should go to next step. Let's actually look at this image by adding some color because I mean I imported this color for a reason so let's go ahead and look at this image so let's assign this to a parameter called I mean to a variable image2 and it's color dot let's just there is a function called label2RGB okay so it basically assigns random colors in red green and blue and it out to the label itself explanatory from the name right so I'm in label to RGB and then let's do label which is our label underscore mask right that's our label and then the other one is colors none BG color background color I don't know our background label equal to let's just say zero okay it says background label zero minus one you can experiment with a few of these we'll see uh so labeled uh uh what oh i just i can change it here or here so that's okay labeled mask and number of labels let's go ahead and plot this and we should have seen something image to color dot label oh never mind I mean I I mean I generated this but I'm we're not seeing it so let's just do cv2 dot M show and this would be a colored lady balls let's say okay and then image to c2 dot wait key okay now we should see something the image is on my other screen so let me go ahead and pull it it here that's a nice pretty image we could have I'm not sure if these are randomly assigned colors or if these are connected pixels so it's it's it's thinking that this is to me it sounds like it think it thinks that all of this is the same so we could have actually done some more image processing to get this by the way I'm a bit curious instead of dilated right here what happens if we use what happens if we use thresholded yeah I'm just getting curious now. So if we use thresholded then yeah, you see a big chunk of green right there continuous greens right here and And by actually doing this erode and dilate Let's do dilated now by doing this erode and dilate We are breaking down some of these regions maybe or maybe not. But anyway, so looks like this helped the image processing we could have done better so again please do it at your own time so now that we have our image what is the next step so the next step is the tutor well we are Sorry, I can easily look here. So label grains in the matched image.

We're done with that. Step five is measure the properties of each grain. So we finally are here. So let's measure the properties. So step number five.

If I can type correctly. Okay. Step number five is measuring. How do we do this?

And again, I There is a region prop function in scikit-image module. You can look at the documentation and the region props, all it does is it extracts the properties of each region if the image is labeled. So it takes in a labeled image. and just extracts a whole bunch of region properties including area and particle grain diameters and a whole bunch of stuff.

It's literally one line so let's assign that to something called, let's shall we call it grains, let's call this clusters. Yeah equals to it is in measure okay remember we imported measure before and it's called region props right there. so measure dot region props and by the way measure is part of our scikit image okay so measure dot region props and uh what do we do uh i'm actually thinking uh labeled image which is our mask and what else do we need intensity image equals to oh so this intensity image is nothing but our original image so if we actually include the original image it also calculates the intensity values like within that grain if you want to know what the maximum intensity is or mean intensity is or minimum intensity, that's when it's very useful. I'm going to do that anyway.

Why not? I mean, why miss any information? So that's it. Clusters, and let's see if the implementation worked okay.

Yeah, no issues there. So in fact, if you look at clusters, it's an object where you can see it's returning. The size is 296 because it detected 296 grains and it's got a bunch of region properties.

And how do we extract this region properties? That's the next part. So there are various region properties like area, equivalent diameter, and a whole bunch of other things that you can look at. In fact, I actually copied my property list just to Let me pull this over here. So these are some of these properties area, equivalent diameter, orientation, major axis length, minor axis length.

So let's go ahead and copy that. And before that, I think maybe we should look at some of these measurements. So let's look at perimeter, for example.

OK, so front within our clusters. Okay, and not all the clusters, all the grains. Let's just look at grain number zero. Okay, the first grain.

Which parameter? Let's actually print out the value of the perimeter. that one so if I just run this so the perimeter of the first one is 118 point whatever that that value okay so that's the perimeter it calculated for that image and you can actually look at a whole bunch of other values so let me copy and paste from my other screen here so in fact I did write a for a loop right there i don't want to waste time so i just prepared a little bit before doing this so for properties in clusters okay which is nothing but this print label area these are from one of my previous tutorials you can see this and the format is prop dot label meaning this prop dot label that goes in here and prop dot area that goes in here so I'm just looping through to print this and when I print this you should be able to see that it's actually printing label 275 area is 229 label 276 area is 37 you can dump this into an Excel sheet or into a note notepad I mean into a text file if you want so this is one way of extracting so this calculates the region props anyway now we are figuring out what are the best ways of actually extracting the result the best way is Thank you.

to dump it into an excel file so now let's look at our step number uh six i believe that's output results into csv file yeah so for that let me take my prop list that i just uh put it in my i mean i just copied in my notepad because i don't want to type this again here so these are the properties i want to dump okay i called it the properties list uh so equivalent diameter orientation so yeah i i took some notes while I was testing verify if it works angle between orientation so it was working so I just can delete that okay so these are the properties list I want to extract from region props and do the next step which is dump it into a CSV file. If you watched my tutorial just before this one, I talked about how to create, how to write and read, you know, from and to CSV files. So here I'm going to just define my output file, okay, equals to, if you remember, dealing with CSV files, I'm going to open a CSV file, and what do we call it let's just say image measurements dot CSV okay I'm going to open it in the right mode W if I put T right next to it right text mode but by default if I don't put anything it is text So I created this I opened it so let's start writing to it output Underscore file okay dot right this is how we write to this file once it's open What do I want to write so? Let's start with a comma because What I'm trying to put here is the is the header So the first row I want it to be area, equivalent diameter, orientation and so on. So a header and then I want to put all the dump all the numbers.

I'm leaving the first by putting a comma I'm leaving the first cell open basically that's pretty much it excuse me and I'm doing this because I've done a couple of things before and I thought this is the best way to deal with this so I'm not experimenting during this video I've done that already that's why I'm just doing in fact let's go ahead and copy and paste this so I don't type anything wrong way so there you go so this is what I just did so what it's going to do is this is nothing but comma it's going to add this prop list which is nothing but you know each one of these and then it creates a new line after it's done okay that's all it's doing okay and we leave the we leave we left the first cell blank to leave room for the header okay so uh meaning the column names now let's uh for clusters yeah i'm going to define our cluster props let's say for cluster properties in clusters okay so now i'm using the for loop to go through every one of these entry in the clusters okay so for cluster props in clusters now what do we want to do okay output underscore file right so underscore file dot write so i'm writing Let's just write a string. In fact, let me go back to my text and see, instead of me inventing this, do I have it in my text file here? So, that's actually bring this and let me not again I am I have done this before so I'm just going copying and pasting and this is very self-explanatory if you look at this so now let me go ahead and explain this output cluster properties to the excel file obviously dot write yeah and I'm writing labels as a string and then for I prop in enumerate prop list where enumerate remember again one of my previous tutorials prop list meaning it's unwrapping it's actually going through like area equivalent diameter orientation if the prop equals to area okay if it is equals to area again you don't need all of this i could have just i could have just uh uh gone through each of this and printed everything into an excel sheet the reason i chose to do this is i want to convert my pixels into microns or square microns that's exactly why i actually created this So I'm going through each of these property in the property list and I'm saying that if my property is area then what do you want to print? I want to print the cluster prop.

Yeah, the value multiplied by the square of my pixels. Again, area is two dimensional, right? So we need to multiply by the square. So convert the pixel to square microns rather than just pixels.

That's exactly what I'm doing here. Else if if it's not area then look at the orientation because I'm also printing I've added orientation literally five minutes before I am recording this video And then realized okay if it is orientation, it's printing the orientation in Radiance so we need to convert it to degrees so we can understand it a bit better So I am multiplying this by 57 point two nine five eight so which is converting your degrees into radians, okay? else if Intensity again if it is intensity values they go from 0 to 255 I don't want to multiply pixel value to that so else if it is intensity if the intensity is basically what the state says is if it has no intensity in its name meaning if it is equivalent diameter or if it is perimeter then multiply it by pixels to microns yeah which is nothing but point five microns in our example. Else meaning what else is left over minimum intensity mean and max intensity.

Yeah. Then just print the value as is don't multiply it with anything. OK please digest this part. This is again not pretty complicated.

All I'm doing is in fact I you don't need any of this if you just want to print everything as pixels. And be fine with it. Okay, or just the intensity but because I'm mixing some of these I just wanted to show you how to convert that into real-life Measurements, you know that you can relate to rather than just pixels So that's it. And then it's actually because we are to created a variable called to print.

Obviously, it's going to print right there as part of your output file dot write. OK, and then creates a new file. So that's pretty much it, actually.

OK, so let's go ahead and run this. And there you go. So hopefully in my. Folder here. I should see a file that's called image measurements written at yeah right now which is 10 20 a.m..

Pacific Standard Time and It's opening the excel sheet and in a second. I'll Okay, it opened it on this other screen, so let me pull it Right there Okay So there you go area equivalent diameter orientation major axis all of these and then your area here and orientation in degrees right there and And your major axis minor axis perimeter your minimum intensity in within that grain maximum intensity within that grain and the mean intensity So this is this is the entire statistics and now that it is in Excel you can do whatever you want you know convert that into a table for example you know and then go ahead and do plotting you know do your pivot tables and do a whole bunch of stuff that Excel lets you do although it's very nice if you know how to use pandas which is how Python handles these type of data structures they call it data frames it creates a data frame and then it makes it easy for you to do like any statistical analysis or machine learning type of analysis if you want to see what is the relation between the diameter and major axis and you know and I don't know its orientation if there is a relationship sometimes it's very easy to use machine learning to get better insights rather than just plain statistical based analysis. I'm not a very good expert at deep learning and all of that but I know a little bit of machine learning which I hope I can include as part of one of my upcoming videos. So until then I hope you learned something and I hope you found this to be useful and for you life scientists I'll see if I can repeat something similar using an image that has a whole bunch of cells so you can relate it better but again this is image image processing, the type of image, it should be pretty agnostic. You know, it's completely ones and zeros or 255s and zeros.

It has nothing to do with your image. But I do understand that you relate well if it is an image that you can well relate to. So thank you very much for your attention. If you like this video, please like it, you know, below.

And if you like these. videos that I'm creating please subscribe to my channel it definitely encourages me to create more content and creating these videos obviously takes time and I do a lot of research before doing this videos and and and cheat a lot by getting information from various sources all to make sure the content is put in such a way that a typical microscopist who is not a programmer can digest it in an easy way okay so please subscribe to buy channel and let's cover a different topic in the next tutorial and until then enjoy your life and thank you very very much