G 424/524 GIS for the Natural Sciences
D. Percy
e-mail: percyd@pdx.edu

Winter Term 2004

Assignment 4

Due Tuesday Feb 24, 2004

Raster analysis with Spatial Analyst and Excel

Part 1
Read through the accompanying handout for an explanation of why you would want to do this...


Load Spatial Analyst Extension

From the Esri/AV_GIS30/AVTUTOR/Spatial directory, add the XY data file yield.txt; x_coord and y_coord are the X and Y fields.

Add thefarm.shp from the same directory.

Load Spatial Analyst extension, and make the Spatial Analyst Toolbar active

From the Spatial Analyst menu, choose Options and set the Extent to Same as Layer "thefarm"

From the Spatial Analyst menu, choose Interpolate to Raster -> Inverse Distance Weighted and set the parameters as shown below.
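For intuition about what Inverse Distance Weighted does, here is a minimal Python sketch. It is not Spatial Analyst's implementation: it uses every sample point rather than a search radius or neighbor count, and the sample values are made up for illustration rather than taken from yield.txt.

```python
import math

def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) samples."""
    num = 0.0
    den = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            return value  # exactly on a sample point: return its value unchanged
        w = 1.0 / d ** power  # nearer samples get larger weights
        num += w * value
        den += w
    return num / den

# Hypothetical (x_coord, y_coord, yield) samples, for illustration only.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
print(idw(samples, 5.0, 0.0))
```

Because the estimate is a weighted average, it can never rise above the highest sample or fall below the lowest one, which is worth keeping in mind when you compare IDW with other interpolators.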

Do the same interpolation with Spline, using the parameters in the handout. What's the difference? Specifically, identify some problem areas where the spline creates artificial highs and/or lows. An exported map in your report would be a good way to show this. Label a couple of points to illustrate your observations.


From the Spatial Analyst menu, choose Reclassify. Make sure the right raster is selected in the Input raster dropdown, click Classify, then choose Equal Interval and 5 classes.
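As a rough illustration of what an Equal Interval reclassification with 5 classes does, the Python sketch below divides the value range into equal-width bins and assigns each cell a class number. The grid values are hypothetical, and the treatment of values that fall exactly on a class break is an assumption that may differ from the software's.

```python
def equal_interval_class(value, vmin, vmax, n_classes=5):
    """Return the class number 1..n_classes for value, using equal-width bins."""
    if vmax == vmin:
        return 1  # a constant raster has only one class
    width = (vmax - vmin) / n_classes
    k = int((value - vmin) / width) + 1
    return min(max(k, 1), n_classes)  # the maximum value falls in the top class

def reclassify(grid, n_classes=5):
    """Reclassify a grid (list of rows) into equal-interval class numbers."""
    cells = [v for row in grid for v in row]
    vmin, vmax = min(cells), max(cells)
    return [[equal_interval_class(v, vmin, vmax, n_classes) for v in row]
            for row in grid]

# Hypothetical interpolated yield values, for illustration only.
yield_grid = [[0.0, 2.5, 5.0], [7.5, 9.9, 10.0]]
print(reclassify(yield_grid))  # prints [[1, 2, 3], [4, 5, 5]]
```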

Choose a color ramp from Symbology, under Layer Properties. The darker colors should show areas of higher yield.


Add the grid file called DEM.

From the Spatial Analyst menu, choose Surface Analysis -> Contour. Make sure DEM is selected in the Input Surface dropdown; accept the defaults for the rest of the parameters. Keep track of where the Output features are being created! You might need to change this location...

From the Spatial Analyst menu, choose Surface Analysis -> Slope. Do the same with Aspect.
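To see what Slope and Aspect mean for a single cell, here is a minimal Python sketch using a common 3x3 finite-difference scheme (often attributed to Horn). Whether Spatial Analyst's output matches this scheme exactly is an assumption; the window layout (row 0 is north) and the -1 flag for flat cells are choices made for this sketch.

```python
import math

def slope_aspect(win, cellsize):
    """Slope and aspect (degrees) for the centre of a 3x3 window; row 0 is north."""
    (a, b, c), (d, _e, f), (g, h, i) = win
    gx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cellsize)  # rise toward east
    gy = ((a + 2 * b + c) - (g + 2 * h + i)) / (8.0 * cellsize)  # rise toward north
    slope = math.degrees(math.atan(math.hypot(gx, gy)))
    if gx == 0 and gy == 0:
        return slope, -1.0  # flat cell: aspect is undefined, flagged here as -1
    # Aspect is the compass bearing of the downslope direction (0 = north, 90 = east).
    aspect = math.degrees(math.atan2(-gx, -gy)) % 360.0
    return slope, aspect

# Hypothetical elevations that rise 1 unit per cell toward the east.
window = [[0.0, 1.0, 2.0],
          [0.0, 1.0, 2.0],
          [0.0, 1.0, 2.0]]
print(slope_aspect(window, 1.0))
```

A surface that rises toward the east faces west, so the aspect for this window comes out near 270 degrees with a slope near 45 degrees.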


Add the soilsamp.shp file

Interpolate organic matter using the same parameters you used for yield above. Make this a permanent raster, and keep track of where you save it.

Summarize organic matter with respect to the zones of Reclassed yield using Zonal Statistics:
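Conceptually, a zonal summary groups the cells of one raster by the zone number found at the same location in another raster and then computes a statistic per zone. The Python sketch below shows the mean only; the grids are hypothetical stand-ins for the reclassed yield zones and the organic matter surface.

```python
from collections import defaultdict

def zonal_mean(zone_grid, value_grid):
    """Mean of value_grid cells grouped by the zone number in zone_grid."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zone_row, value_row in zip(zone_grid, value_grid):
        for zone, value in zip(zone_row, value_row):
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in sorted(totals)}

# Hypothetical grids: yield classes (zones) and interpolated organic matter.
zones = [[1, 1, 2], [2, 3, 3]]
organic = [[2.0, 4.0, 3.0], [5.0, 1.0, 3.0]]
print(zonal_mean(zones, organic))  # prints {1: 3.0, 2: 4.0, 3: 2.0}
```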

Repeat the above 2 steps for the SOILSAMP field potasium


Fun things to try:

Use the raster calculator to subtract a surface created with one interpolation from another, creating a sort of residual or "difference surface". This will show where one surface is above or below the other one...
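The subtraction is done cell by cell. A minimal Python sketch of the idea, with two small hypothetical surfaces standing in for your interpolated rasters:

```python
def difference_surface(surface_a, surface_b):
    """Cell-by-cell difference (a - b) of two rasters with the same shape."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(surface_a, surface_b)]

# Hypothetical surfaces from two interpolation methods, for illustration only.
surface_one = [[10.0, 12.0], [14.0, 16.0]]
surface_two = [[9.0, 13.0], [14.0, 20.0]]
print(difference_surface(surface_one, surface_two))
```

Positive cells mark where the first surface sits above the second, negative cells where it sits below, and zeros where the two agree.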

Create a variogram of the data and use it to interpolate with the Kriging interpolation method. What are the consequences of using the wrong model?
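An empirical semivariogram can be sketched in a few lines of Python: for every pair of sample points, take half the squared difference in value and average those within distance bins (lags). The lag width, the binning by integer division, and the sample points below are assumptions made for illustration; fitting a model curve to the result is a separate step not shown here.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, lag_width):
    """Return {lag bin index: semivariance} from (x, y, value) samples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_width)  # bin k covers distances [k*lag, (k+1)*lag)
            sums[k] += (zi - zj) ** 2
            counts[k] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {k: sums[k] / (2 * counts[k]) for k in sorted(sums)}

# Hypothetical samples along a line, with values that increase steadily.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0)]
print(empirical_semivariogram(pts, 1.0))
```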


Part 2

Building on the skills and understanding from Part 1 (!), we will now go beyond exploratory methods to predictive methods.

We have 6 data elements (fields) in the soilsamp table, plus an elevation layer (Digital Elevation Model or DEM), plus two derived layers from the DEM, Aspect and Slope. The dependent variable is Crop Yield. (Please think about how these could be changed into variables that are relevant to your own research!)

One way to relate these variables to the output variable is Multiple Linear Regression. This is a technique for determining the influence of each particular variable on the output variable. We will perform a linear regression using these variables, illustrating how raster analysis is a more powerful tool for natural science applications (compared to simple vector analysis, like overlay).

You are solving an equation that looks like this:

Y = Ax + By + Cz

Where x, y and z are variables like moisture content, pH, etc.; A, B and C are the coefficients; and Y is the dependent variable, in this case Yield.

We will use Excel to give us the values for the coefficients, then take these back into ArcView and multiply each raster layer by its calculated coefficient.
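The two halves of this workflow (fit the coefficients, then build the predicted surface) can be sketched in Python. This is an illustration, not a replacement for the Excel step: it fits the equation exactly as written above, with no intercept term, whereas a regression tool may add a constant unless told otherwise. All data values are hypothetical.

```python
def fit_coefficients(rows, y):
    """Least-squares coefficients for y = A*x1 + B*x2 + ... (no intercept term)."""
    n = len(rows[0])
    # Normal equations (X^T X) beta = X^T y, stored as an augmented matrix.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(n)]
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def predict_surface(rasters, coefficients):
    """Cell-by-cell sum of each raster multiplied by its coefficient."""
    n_rows, n_cols = len(rasters[0]), len(rasters[0][0])
    return [[sum(c * r[i][j] for c, r in zip(coefficients, rasters))
             for j in range(n_cols)] for i in range(n_rows)]

# Hypothetical sample table: two predictor values per point and observed yield.
predictors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
observed_yield = [2.0, 3.0, 5.0, 7.0]
coeffs = fit_coefficients(predictors, observed_yield)

# Hypothetical predictor rasters (one grid per variable, same shape).
raster_x = [[1.0, 2.0]]
raster_y = [[3.0, 4.0]]
print(coeffs, predict_surface([raster_x, raster_y], coeffs))
```

The elimination step fails with a division by zero if two predictors are perfectly collinear, which is also a sign that the regression itself is ill-posed.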

Do this:

· Fill this new field with values from the PointZM field by right-clicking the field and choosing Calculate Values. First check Advanced, then paste the following code into the pre-logic VBA box:

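' Variable that will hold the elevation (Z) value
Dim dblZ As Double
Dim pPoint As IPoint
' Get the point geometry from the Shape field of the current record
Set pPoint = [Shape]
' Read the Z coordinate stored in the point geometry
dblZ = pPoint.Z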

Then type dblZ into the “elev =” input box.

One way that you can test a predictive model like this is to leave a few locations out of the "training set", and then use them to test your model. These SHOULD be chosen randomly from the set, but you could just try taking some out. They are then left out of the regression calculation in Excel. Feel free to try this!
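If you want the held-out locations to be chosen randomly, a short Python sketch like the following would do it. The record IDs, the number held out, and the fixed seed (used so the split can be repeated) are all assumptions for illustration.

```python
import random

def split_holdout(records, n_test, seed=0):
    """Randomly split records into (training set, test set) with n_test held out."""
    rng = random.Random(seed)  # fixed seed so the same split can be reproduced
    shuffled = records[:]      # copy so the original order is left untouched
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical sample IDs standing in for rows of the soil sample table.
sample_ids = list(range(1, 21))
training, testing = split_holdout(sample_ids, 4)
print(sorted(testing))
```

Only the training rows would go into the regression in Excel; the test rows are kept aside to compare predicted and observed yield afterwards.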

Okay, write it up. What does it mean? How would you use this type of analysis in your own research or to tackle another problem? You will want to put each of your maps into a different data frame so that you can make a final layout showing all 3 maps (yield, predicted yield and residual) on one page.

Answer these questions (justify your answers!):

Does the predicted value surface calculation specifically take into account the location of each point? Is this necessary at this scale? If so, would latitude or longitude be more important? What do the residuals represent? How well do you think this model does in predicting yield? How would you quantify this?