Understanding Spatial Interpolation Techniques

Let's assume we have a set of around 10 points in the plane like these ones and for each of these points we have the elevation. Then what we call spatial interpolation is the process of estimating the value of the attribute, so in our case the elevation, at locations that are not sampled. And we do this by trying to find a function that would fit or pass very closely to all the points that we have defined. So if we look at the 1DK, so for example if we define a one-dimensional coordinate system and then we have a few points like these ones. So the aim of interpolation would be to find a function that would, as I said, pass very closely. Of course, there's an infinity of functions that can be passed through these points. So the aim is to find the function that will approximate the best the shape of the terrain. Interpolation is used in several fields in engineering and in sciences. And depending on the application for which it's used, we want the interpolation method to have different properties. So here I'm listing seven properties of an ideal interpolation method. Most of the properties that I'm listing come from the book Contouring of Dave Watson. So if we go quickly over these properties, the first one is exact. So exact means that if we have a set of points, it's in 1D, same thing, an exact interpolant or an exact, the result of an interpolation method will create a surface that we call the interpolant. So the interpolant is exact if it goes through or if it honors the original samples that were collected to study the field. If we explain the second concept, so the second concept is continuity and smoothness, we can say that continuity in mathematics, we would call it a C0 interpolant, and smooth, we would call it C1 interpolant. C0 interpolant is one... where the interpolant is continuous. So at every location, we have a value. But C0, if we look, the derivative at certain points, for example, at this point, is undefined. So we don't know what it is. So a C1 interpolant is one for which the first derivative of the interpolant is known everywhere. And if we continue like this, a C2 interpolant is one where the second derivative, so I drew the same thing because it's more difficult to draw. a C2 interpolant, but it's an interpolant for which a second derivative is known at every point. Then the fourth property is local, so it means that if we want to interpolate at a given location here, for example, we will only use neighbors that are in the venicity of the points where we want to interpolate. It means that we will not try to look for points that are far away. The fifth one is that the interpolant should be adaptable, which means that the function should give realistic results for data distributions that are not normal, data distributions that are anisotropic. This happens a lot in GIS because when we collect data, for example, what we've seen in a previous lesson for bathymetric dataset, samples are collected by a boat following. certain lines, as you can see here. So if we wanted to interpolate with such a data set, then we don't want to take points that are only along one given line, but also we would like our interpolant to adapt to the situation and be able to collect points that are all around the location where we want to interpolate. 6.1 is computationally efficient. Nothing special to say about that. It means that Basically, it must be possible to implement it so that the result can be obtained in a reasonable time. Reasonable time depends on the application, but we wouldn't like to wait an hour to obtain our result. And the seventh one is automatic. That's something that is not always possible depending on the methods that we're using, but automatic would mean that there's no user-defined parameters. So some interpolation methods are very good. but they require the user to really have a good understanding of the dataset. And based on that, the user needs to define several parameters and to tweak them to obtain a result. This is something that can lead to a good interpolation method, but it's also something that requires the user to have very good knowledge of the dataset. And often you don't have good knowledge of the dataset when you want to interpolate. You just want to click a button and get a reasonable... answer, so something that is automatic or something for which the parameters could be automatically derived is something that is wish. Now let's discuss the fitting of a polynomial as a way to spatially interpolate. We know that if we have n points in the Euclidean space in 3D, then it's always possible to find at least one polynomial which has a degree of maximum n-1 that will pass through all the points that we have in 3D. If we look at the case in 1D, as you can see here, so if we have these points, So we can always find a polynomial that will approximate the samples. If we want to ensure that this polynomial goes through exactly all the points, then we might need to use a degree that is pretty high. Depending on the distribution of our point, it could be that if we have 100 points, this polynomial would be of degree 59. For example, it would take a lot of time to find, but it's possible to find it. The problem with it is that since we want the interpolant to be exact, it's known that when polynomials start to have a high degree, and high degree can be something above 10 for example, there's something the interpolant will start to oscillate between the points. It means that instead of simply having this, we would have something like this. So the interpolant passes through all the points but between the points it starts to oscillate and it can go higher and lower than the values. Instead, what is often being used in practice are splines. Splines are piecewise polynomials. This means that if we have a set of points in 2D, like these, then we divide the data set, all the points, in a certain way, for example, in four rectangles. And each of the rectangles, that we call an area, will contain a subset of the point, let's say around 100 points. And for each area we create a polynomial of degree that is considered rather low, for example 3, 4 or 5. So we ensure that inside each of the piece we have a polynomial which is a function which is continuous and smooth. And the only important aspect that we need to consider is that at the junction of the different areas we need to ensure that the interpolant is continuous and smooth. So that is for example C2. Details of how this is done are however out of scope for this course. While polynomials and splines are sometimes used in GIS for spatial interpolation, weighted average methods are much more common. We describe in the book five different weighted average methods. We won't cover these in detail in this video, instead we'll just look at the basic principles which they all share. Let's say we have a set of elevation samples like these. and we want to estimate the elevation at an unsampled location. A weighted average interpolation method will give you two rules. The first one will be, it will tell you which subsets of the points should be used if you want to estimate the elevation at the unsampled location. And the second one, it will give you for each of the samples that you've selected, so from your subset, what is the importance of each, what is the weight that is assigned to each of these. In practice, this can take many different forms. We see in the book five different interpolation methods that all follow these two rules. The most known and used weighted average interpolation method is inverse distance to a power. It uses a searching circle for the rule number one. Here you can see an orange circle having a radius of 10 meters that is drawn around the interpolation location. All the sample points inside this circle will be used as a subset of the dataset. In this case, we have three samples. Their importance is inversely proportional to their distance to the interpolation location. The closer they are, the more importance they will get. And once we know which sample should be used for the interpolation and we know which weight to apply, then all weighted average interpolation functions are like this. So the function which takes x, the location usually on the plane in 2D of the interpolation location is equal to the sum, so we do the sum of all the samples that we have, in our case we would have 3, and then wi is the weight assigned to each point, and z is the elevation of every point, and then we normalize everything, so we divide by the sum of all the weights, and then that gives us the estimation for the elevation at the location x. In the textbook, you can read about the details of several interpolation methods. Notice that in order for you to get a better understanding of the results you obtain with each of these methods, we've implemented them all for the same set of sample points for an area somewhere in Tasmania in Australia. If you go to the URL shown at the bottom here, you can download the resulting files, which are raster files, all of the same size and with every method we've interpolated exactly at the same location. We also give you a QGIS3 project that will allow you to explore and compare the resulting surface that you obtain that we obtain with each of the methods. Here we can see the terrain obtained from linearly interpolating in a tin with Delaunay triangles. Notice that since the samples are very far from each other, their samples are the red points that you see overlaid on the surface, you can clearly see the triangles and their edges. If instead we use natural neighbor interpolation, you can see how smoother of a surface we obtain and you can also notice how it can be made even smoother if the gradient at each sample is used to modify the weight. So we obtain something that's even smoother. You can also see the influence that the parameters of inverse distance to a power IDW will have on the resulting terrain. For example you can see that if we modify simply the radius or the power then we can obtain a surface that is drastically different. Notice also here how nearest neighbor also called closest neighbor yields a totally unrealistic terrain. Finally notice that with the QGIS project that we give you you can modify and extract information from all the terrain files that are given to you. For example, here I'm using the good old function contouring to extract the contour lines directly from one of the terrain that we got, the natural neighbor interpolation, and then the resulting contour lines are directly overlaid in 3D and that can help you to have a better understand of the morphology of the terrain that's created with a given interpolation method.

Of course, there's an infinity of functions that can be passed through these points. So the aim is to find the function that will approximate the best the shape of the terrain. Interpolation is used in several fields in engineering and in sciences. And depending on the application for which it's used, we want the interpolation method to have different properties.

So here I'm listing seven properties of an ideal interpolation method. Most of the properties that I'm listing come from the book Contouring of Dave Watson. So if we go quickly over these properties, the first one is exact. So exact means that if we have a set of points, it's in 1D, same thing, an exact interpolant or an exact, the result of an interpolation method will create a surface that we call the interpolant.

So the interpolant is exact if it goes through or if it honors the original samples that were collected to study the field. If we explain the second concept, so the second concept is continuity and smoothness, we can say that continuity in mathematics, we would call it a C0 interpolant, and smooth, we would call it C1 interpolant. C0 interpolant is one... where the interpolant is continuous.

So at every location, we have a value. But C0, if we look, the derivative at certain points, for example, at this point, is undefined. So we don't know what it is.

So a C1 interpolant is one for which the first derivative of the interpolant is known everywhere. And if we continue like this, a C2 interpolant is one where the second derivative, so I drew the same thing because it's more difficult to draw. a C2 interpolant, but it's an interpolant for which a second derivative is known at every point.

Then the fourth property is local, so it means that if we want to interpolate at a given location here, for example, we will only use neighbors that are in the venicity of the points where we want to interpolate. It means that we will not try to look for points that are far away. The fifth one is that the interpolant should be adaptable, which means that the function should give realistic results for data distributions that are not normal, data distributions that are anisotropic. This happens a lot in GIS because when we collect data, for example, what we've seen in a previous lesson for bathymetric dataset, samples are collected by a boat following. certain lines, as you can see here.

So if we wanted to interpolate with such a data set, then we don't want to take points that are only along one given line, but also we would like our interpolant to adapt to the situation and be able to collect points that are all around the location where we want to interpolate. 6.1 is computationally efficient. Nothing special to say about that.

It means that Basically, it must be possible to implement it so that the result can be obtained in a reasonable time. Reasonable time depends on the application, but we wouldn't like to wait an hour to obtain our result. And the seventh one is automatic.

That's something that is not always possible depending on the methods that we're using, but automatic would mean that there's no user-defined parameters. So some interpolation methods are very good. but they require the user to really have a good understanding of the dataset. And based on that, the user needs to define several parameters and to tweak them to obtain a result.

This is something that can lead to a good interpolation method, but it's also something that requires the user to have very good knowledge of the dataset. And often you don't have good knowledge of the dataset when you want to interpolate. You just want to click a button and get a reasonable... answer, so something that is automatic or something for which the parameters could be automatically derived is something that is wish.

Now let's discuss the fitting of a polynomial as a way to spatially interpolate. We know that if we have n points in the Euclidean space in 3D, then it's always possible to find at least one polynomial which has a degree of maximum n-1 that will pass through all the points that we have in 3D. If we look at the case in 1D, as you can see here, so if we have these points, So we can always find a polynomial that will approximate the samples. If we want to ensure that this polynomial goes through exactly all the points, then we might need to use a degree that is pretty high. Depending on the distribution of our point, it could be that if we have 100 points, this polynomial would be of degree 59. For example, it would take a lot of time to find, but it's possible to find it.

The problem with it is that since we want the interpolant to be exact, it's known that when polynomials start to have a high degree, and high degree can be something above 10 for example, there's something the interpolant will start to oscillate between the points. It means that instead of simply having this, we would have something like this. So the interpolant passes through all the points but between the points it starts to oscillate and it can go higher and lower than the values.

Instead, what is often being used in practice are splines. Splines are piecewise polynomials. This means that if we have a set of points in 2D, like these, then we divide the data set, all the points, in a certain way, for example, in four rectangles. And each of the rectangles, that we call an area, will contain a subset of the point, let's say around 100 points.

And for each area we create a polynomial of degree that is considered rather low, for example 3, 4 or 5. So we ensure that inside each of the piece we have a polynomial which is a function which is continuous and smooth. And the only important aspect that we need to consider is that at the junction of the different areas we need to ensure that the interpolant is continuous and smooth. So that is for example C2. Details of how this is done are however out of scope for this course. While polynomials and splines are sometimes used in GIS for spatial interpolation, weighted average methods are much more common.

We describe in the book five different weighted average methods. We won't cover these in detail in this video, instead we'll just look at the basic principles which they all share. Let's say we have a set of elevation samples like these. and we want to estimate the elevation at an unsampled location. A weighted average interpolation method will give you two rules.

The first one will be, it will tell you which subsets of the points should be used if you want to estimate the elevation at the unsampled location. And the second one, it will give you for each of the samples that you've selected, so from your subset, what is the importance of each, what is the weight that is assigned to each of these. In practice, this can take many different forms.

We see in the book five different interpolation methods that all follow these two rules. The most known and used weighted average interpolation method is inverse distance to a power. It uses a searching circle for the rule number one. Here you can see an orange circle having a radius of 10 meters that is drawn around the interpolation location. All the sample points inside this circle will be used as a subset of the dataset.

In this case, we have three samples. Their importance is inversely proportional to their distance to the interpolation location. The closer they are, the more importance they will get. And once we know which sample should be used for the interpolation and we know which weight to apply, then all weighted average interpolation functions are like this. So the function which takes x, the location usually on the plane in 2D of the interpolation location is equal to the sum, so we do the sum of all the samples that we have, in our case we would have 3, and then wi is the weight assigned to each point, and z is the elevation of every point, and then we normalize everything, so we divide by the sum of all the weights, and then that gives us the estimation for the elevation at the location x.

In the textbook, you can read about the details of several interpolation methods. Notice that in order for you to get a better understanding of the results you obtain with each of these methods, we've implemented them all for the same set of sample points for an area somewhere in Tasmania in Australia. If you go to the URL shown at the bottom here, you can download the resulting files, which are raster files, all of the same size and with every method we've interpolated exactly at the same location. We also give you a QGIS3 project that will allow you to explore and compare the resulting surface that you obtain that we obtain with each of the methods.

Here we can see the terrain obtained from linearly interpolating in a tin with Delaunay triangles. Notice that since the samples are very far from each other, their samples are the red points that you see overlaid on the surface, you can clearly see the triangles and their edges. If instead we use natural neighbor interpolation, you can see how smoother of a surface we obtain and you can also notice how it can be made even smoother if the gradient at each sample is used to modify the weight.

So we obtain something that's even smoother. You can also see the influence that the parameters of inverse distance to a power IDW will have on the resulting terrain. For example you can see that if we modify simply the radius or the power then we can obtain a surface that is drastically different.

Notice also here how nearest neighbor also called closest neighbor yields a totally unrealistic terrain. Finally notice that with the QGIS project that we give you you can modify and extract information from all the terrain files that are given to you. For example, here I'm using the good old function contouring to extract the contour lines directly from one of the terrain that we got, the natural neighbor interpolation, and then the resulting contour lines are directly overlaid in 3D and that can help you to have a better understand of the morphology of the terrain that's created with a given interpolation method.

Transcript for:Understanding Spatial Interpolation Techniques

Transcript for:
Understanding Spatial Interpolation Techniques