Ecosystem Services, Hands-On
Tools to inform decision-making
Resources
Introduction to Ecosystem Services Part 2
Welcome and Course Overview
Welcome to Ecosystem Services Part 2, where we will pick up where we left off with a hands-on, software-focused dive into ecosystem service modeling. This course represents a significant shift in approach from the previous sessions, moving from theoretical concepts to practical application and computational implementation. The focus of this segment will be on understanding the actual mechanics of ecosystem service models through direct engagement with the software and data that drive them.
Course Objectives and Structure
What We Will Accomplish
The primary objective of this course is to return to the carbon model, which all students have already run, but approach it with much greater depth and understanding. In the previous session, we ran through the model at light speed without taking the time to thoroughly discuss what we were actually doing. That pace was entirely intentional—the immediate goal was simply to ensure that the software was functioning properly on every student’s machine, as technical difficulties often present a significant roadblock to learning. However, given that this class includes students with above-average computer skills and everyone successfully navigated that initial hurdle, we now have the opportunity to dive deeply into what is actually happening behind the scenes.
The In-Class Exercise on Policy Assessment
Beyond the detailed examination of the carbon model, this class will include an in-class exercise on assessing a policy that affects ecosystem services. This exercise will provide hands-on experience in thinking through how policy decisions translate into changes in ecosystem service provision and how we can quantify and value those changes. If time permits, we will also move on to examining the second and much more complicated ecosystem service model that will be emphasized in this course: the sediment delivery model. This model represents a significant increase in complexity compared to the carbon storage and sequestration model, but understanding it will provide valuable insights into how multiple ecosystem services can be modeled simultaneously and how their interactions can be assessed.
Software Setup and Initial Preparation
Launching Your Software Environment
To begin, all students should go ahead and launch InVEST and QGIS if they have not already done so. Both applications take a little while to load and initialize, so it is useful to start them now while we review some foundational concepts and reminders about the workflow. While these programs are starting up, we should take a moment to review several important points about how we will be working with data throughout this course and how that differs from what you may encounter in later professional work.
Data Organization and File Management
We will be using the base data that was provided to you for this course. This data exists specifically for reference and educational purposes, and you downloaded it directly from InVEST. However, it is absolutely critical to keep in mind that when you do your country reports later in the course, you will be using different data—not this nicely curated, already organized set that we are working with now. Real-world data is often messy, inconsistently formatted, and requires significant preprocessing before it can be used in models.
I had you save the base data in your class directory, and we will use the sample data directories named after each of the models to store the model results and intermediate outputs. In the previous class, we quickly pointed InVEST to the key inputs for the simplest possible run: we basically plugged in the land use land cover map and a biophysical table in CSV format, and then ran the model. Now, I want to return to those data and spend significantly more time with them so we can get a proper sense of how to interpret them and understand what is actually going on in the model during computation.
Working with Land Use Land Cover Data
Loading the LULC Map into QGIS
Let us start by loading that land use land cover map from our sample data into QGIS. I am going to try to work dynamically with two screens so we can do this together in a live environment. I should note that I forgot to clear some work I was doing before, so you may see some other land use land cover maps on my screen. I was running land use land cover maps through a bunch of different models trying to predict where land use change is going to happen. This reflects the kind of work I do on a regular day—I am constantly looking at where land use change happens under different scenarios. You can see slightly different configurations of the landscape and different scenarios playing out on my screen. You do not need to understand the details of what you are seeing, but this is actually a model I have gotten significant attention for, called GTAP-InVEST, which looks at different scenarios of land use change. Here, we are looking specifically at what happens if you take away all irrigation and change how crops are grown. It turns out that if you take away irrigation, agriculture expands quite a bit, and this results in a relatively large reduction in carbon storage. We actually just got a paper accepted for publication on this exact topic.
For our purposes, we are going to start in a blank project, and I want you to navigate to where you stored the data. For me, that is J.A. Johns files, APEC 3611, and then where I installed my InVEST data. The carbon model folder is the one we are looking at. We powered through it last class, but now let us talk a little bit more about it. You have all run it already once, so there is going to be more than just the default inputs—there are going to be outputs as well as inputs stored in that directory.
Understanding TIFF Files and File Naming Conventions
The first file we are going to look for is called “LULC Current Willamette.tiff”, and I want to explain a little bit more about these file types and naming conventions. One thing you should be aware of regarding TIFF files is that you will notice there are a bunch of similarly named files in the directory. This is a very common convention in the GIS world. In addition to the actual raw data file, you might store auxiliary data, often abbreviated as AUX, that holds information like what area this geographic layer covers and other metadata. It is kind of annoying that it is stored as a separate file, but that is the standard approach. This is a common tripping point, though, because if you try to load this auxiliary file into QGIS, it will not know what to do with it and will either fail or give you an error message.
This situation is further complicated by the fact that on Windows, oftentimes the file extensions will be hidden by default, which means the last three characters of the filename will not display. So the file we actually want is the one without the AUX extension—or more specifically, the actual TIFF file rather than the auxiliary file. That is kind of confusing, but go ahead and drag the correct file into QGIS.
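If you ever script this step rather than clicking through a file browser, a tiny helper can filter out the sidecar files for you. This is just a sketch; the function name and folder are hypothetical, not part of InVEST:

```python
from pathlib import Path

def list_rasters(folder):
    """Return the .tif/.tiff files in folder, skipping the .aux.xml
    sidecar metadata files that QGIS cannot open on their own."""
    return sorted(p for p in Path(folder).glob("*.tif*")
                  if not p.name.endswith(".aux.xml"))
```

Calling this on your carbon model folder would give you only the rasters you can actually drag into QGIS.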
Interpreting the LULC Map in QGIS
Categorical Data and Color Interpretation
In this case, what you are looking at is what is called categorical data, meaning each different integer value on this map corresponds to a specific category of land use land cover data. The values that are displayed in this particular color are residential lots with zero to four units per acre—low-density housing, kind of like suburbs or exurbs with yards where people have property around their homes. This color over here is more like inner-ring suburbs with gridded neighborhoods like you find in St. Paul or Minneapolis with city blocks. And residential with greater density is where you start to have skyscrapers or big condo buildings. This is just an example of what real data looks like when you are working with actual land use classification data.
Spatial Patterns and Observable Features
If you zoom in, you can start to see the sort of things you would expect to see in a real landscape. You start to see a road network develop, which is particularly interesting because you can actually detect the road network from satellite imagery, which is pretty cool when you think about it. You can also see other features like cropland, appearing as tiles of different colors, and the river valley running through it. The basic spatial patterns you see here determine some really important things about how ecosystem services are provided across the landscape. The distribution of land cover types determines everything from carbon storage to sediment retention to water quality.
The Carbon Model: From Theory to Practice
The Model Architecture and Workflow
So this is the map that we plugged into InVEST last class. We are going to rerun it on your computer, because later we will run it under a few different scenarios to explore how the model responds to different inputs. Let us replicate what we did last class—open the carbon storage and sequestration model from the InVEST interface.
Just to remind yourself of the workflow, you can load a pre-configured version if you have one saved, but if not, this will remind you of the steps: we are going to select the workspace, which is the folder where the model will save its outputs. I just chose the carbon model folder that was already created. You could have typed that path out manually if needed.
Specifying Model Inputs
The Baseline Land Use Land Cover Map
Now here is where we are going to point the model to that baseline land use land cover map. It is over here in the file system. It is a little harder to tell which file is the right one because of all the different file extensions we discussed earlier. It is not this one or this one, but this one here. The preview window gives you a hint that this is the one that is actually showing a map, so you know you have the right file selected.
That is indeed the one we selected and examined in QGIS. Give it a double click and it loads up here in the InVEST interface. We have already talked about this conceptually: we have a raster file, not a vector file—it is just a big matrix of integer values where each cell contains a value corresponding to different types of land cover. That is exactly what we put in here for the model to process.
I have got some slides here that are a bit more comprehensive than what I can show you live, so if you ever need to remind yourself how to do this entire workflow, those slides give you a screen-by-screen walkthrough through each step. You have seen this before, but now I have it documented for future reference.
The Biophysical Table: Carbon Pools Data
The second file we plugged in without talking very much about is that CSV file: the carbon pools table, sometimes called the biophysical table. We are looking for the CSV file, and there it is: “carbonpoolsWillamette.csv”. But this time, instead of loading it into InVEST right away, let us open it up in Excel or however you prefer to look at CSV files. This will help us understand what the model is actually doing with this data.
I just got my new Mac and I am learning as we go, so some of my steps might be a bit circuitous. It looks like it loaded up in Sheets rather than Excel. I am going to change that because I actually use VS Code to replace Excel completely. I am almost to the point where I do not have to use Excel anymore, and I am genuinely thrilled about that because Excel does a lot of annoying things that waste enormous amounts of time. For example, Excel has a tendency to reformat numbers as dates, which is absolutely maddening. I cannot tell you how many hours I have wasted on that specific problem. Has anybody else had that issue? It is incredibly frustrating.
The Auto-Increment Problem in Excel
When you put a number in a cell and drag it down to fill multiple cells, Excel automatically increments that number each time. That is a very heavy-handed approach to what Excel thinks you should be doing, but often when I am repeating a number in multiple cells, I do not want it incrementing up each time—I want the exact same number in each cell. That sort of default behavior has burned me many times.
Right now, I am using Data Wrangler in VS Code, which is like Excel but is open source and does not have these kinds of annoying defaults. It gives me much better control over what is actually happening to my data.
Understanding the Biophysical Table
The LU Code and Categorical Values
Okay, so what do we have here now that I have opened the carbon pools data? Let us explore these columns quickly to understand the structure of this critical data file. The first column is the LU code. This is the integer value being displayed on the map as different colors that we looked at in QGIS. If you actually look at each individual cell in the raster, this will be some integer value like one, this will be like two, and so forth. What the CSV does is assign the specific meaning to each of these codes: one corresponds to residential zero to four units per acre, and it assigns carbon value data to that category.
Carbon Pool Data from Literature Review
These carbon values are actually data collected from an exhaustive review of scientific literature. We had undergrad research assistants going out and measuring or reviewing literature on how much carbon was actually in trees and other vegetation across different land use types. It summarizes the carbon for residential zero to four units per acre as fifteen tons per hectare in the above-ground carbon pool, versus ten in the below-ground pool, sixty in the soil, and one in the dead litter on the ground. These values represent different pools where carbon is stored in the ecosystem.
Patterns Across Land Use Categories
A lot of these numbers are quite intuitive when you think about them. Sparse residential might have some leaf litter or parts of your lawn that are not fully maintained, accumulating carbon in that litter pool. But as you get to higher density residential areas, you are probably less likely to have scrub trees or brushy areas with accumulated litter, so that pool goes down. You probably have fewer trees overall in denser areas. Condo buildings have essentially zero carbon in most pools, or at least we summarize it as zero because dense urban development typically involves paving and removal of vegetation. Vacant lots get a lot more carbon because they grow back naturally over time. There is even a gas station on Hamlin and University where they tore the building down, and now the lot is growing back. The carbon storage is going up there over time, so it is providing ecosystem service value even though it is just an abandoned lot.
The Lookup Table Mechanism
How InVEST Assigns Values
This is the biophysical table, and it represents the essential model logic in InVEST. When we point the model to this table, the core operation is really quite basic and elegant: for any grid cell with a land use classification of one, InVEST assigns fifteen tons per hectare to that pixel for the above-ground pool and ten for the below-ground pool. This is called a lookup table, and it is the fundamental mechanism by which categorical land use data gets converted into ecosystem service values.
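A minimal sketch in code can make that lookup concrete. This is illustrative only, not InVEST's actual implementation; the 15 tons per hectare for code 1 comes from the table we just looked at, while codes 2 and 3 and their values are made up for the example:

```python
import numpy as np

# Tiny stand-in for the LULC raster: each integer is a land use code.
lulc = np.array([[1, 1, 2],
                 [2, 3, 1]])

# Above-ground carbon (tons/hectare) per land use code. Code 1
# (residential 0-4 units/acre) is 15 t/ha per the biophysical table;
# codes 2 and 3 are invented for this sketch.
c_above = {1: 15.0, 2: 8.0, 3: 0.0}

# Build a lookup array indexed by LU code, then index it with the
# whole raster at once -- one vectorized pass per carbon pool.
table = np.zeros(max(c_above) + 1)
for code, value in c_above.items():
    table[code] = value

carbon_map = table[lulc]  # same shape as lulc, now in tons/hectare
```

The same lookup is repeated for the below-ground, soil, and dead-litter columns, which is why the whole model runs in seconds.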
Running the Carbon Model
Replicating the Previous Run
We are back at the point where we need to select the carbonpoolsWillamette.csv file. This is where we got to last time, and I actually had you click run. So go ahead and hit run on the model. We are going to rerun it now with the understanding that we have reviewed all the inputs.
Computational Process and Model Execution
Basically what is going on under the hood now as the model runs is it is taking that biophysical table and doing exactly what I described: it looks up the value from the table and assigns that value to each pixel of that land use type. So for every pixel that is classified as land use type one, the model assigns the fifteen tons per hectare value for above-ground carbon from the biophysical table. This process happens for every pixel in the map, and it does this simultaneously for all the different carbon pools.
You can see how long it took to compute on my machine. Mine was three point five five seconds. Can I get a quick poll of the class—anybody get results in less than ten seconds? How long was yours? This gives you a sense of how efficient the InVEST model is at processing data.
Interpreting Model Results
Loading and Visualizing the Carbon Stock Output
The Primary Output: Baseline Carbon Storage
Now let us talk about the results in much more detail than we did last time. The first output file we are going to look at is the key result: C_stock_BAS, where C is carbon, stock refers to storage, and BAS refers to the baseline scenario. I would like you to load that file into QGIS as well. Go ahead and drag it in. I am going to drag it on top of the land use land cover map so it displays properly in the visualization.
What do we see here now that we have loaded this output? This is the result of the model showing carbon storage amounts across the landscape. Let us give it a better color scheme to make the spatial patterns more obvious. Go back to the slides if you are looking for more detail on this step or want to revisit it later. You can double-click on the carbon storage map to bring up the symbology options.
Proper Symbology for Continuous Data
A common gotcha that trips people up is that if you are in one of these other tabs in the symbology dialog, the map will not render correctly, so you have to make sure you are on the Symbology tab specifically. Here we have tons of different options for how to color our data. The land use land cover map that we looked at earlier was symbolized with paletted/unique values, where each value gets its own color, because the values do not mean anything on a continuous scale—each one is just a category.
But when we are talking about carbon storage, that is a continuous quantitative measure. You can have one ton, or one point one tons, or one point one one tons—it is a floating-point number that can take on any value. For those kinds of continuous data, it is best to use a color band with continuously varying colors rather than discrete colors. Let us select the single-band pseudo-color option.
Choosing Appropriate Color Schemes
In the color ramp section, you can select all sorts of different color schemes. Being a good geographer means knowing how to select the right ones for your data and your message. Something like greens will be a good choice here because thematically it matches the idea that at low levels of carbon storage, it is probably not very green, but as carbon storage increases, it ramps up toward darker greens. So it is continuous between all these values and gives us something that looks really intuitive and reads well.
Now it is easy to interpret what we are looking at. We can see that this area has the most carbon storage coming through in the darkest greens, and this area has a lot less carbon storage shown in the lighter colors. If we wanted to know why certain areas have more carbon storage than others, we might toggle off the carbon storage map and see what land use types are underneath. This shows us that this area with the highest carbon storage is all the natural lands. Basically, this area over here is the city and cropland, whereas over here we get into higher elevation or rougher terrain that is kept natural and undeveloped. Not too surprisingly, those are the areas with the most carbon storage because they have more trees and vegetation.
Spatial Analysis and Conservation Implications
Making Inferences from the Map
That is the carbon storage map that we have now symbologized in an intuitive way. We can start to make some basic conclusions about different conservation options just by looking at these maps spatially. You might ask: where would it be most damaging from the perspective of climate change to develop the land into agriculture? If you were to convert this land here, that would be the most damaging because you would have the largest change in carbon storage—a large negative sequestration. The darkest areas represent the greatest loss potential.
And that is really the basics of how you do this kind of analysis. The map tells us where conservation efforts should be focused if carbon storage is your primary concern. Any questions so far about how to interpret these results?
In-Class Exercise: Policy Assessment
Exercise Instructions and Expectations
Let us get some practice with this analytical approach. These questions that I am about to give you will be added to the next homework assignment. You will do very well by actually doing this exercise now because you will then basically copy and paste your work and have much of the homework already completed. We are going to break up into groups now—feel free to work in groups however you want and move around if needed to collaborate with others.
The Assignment
Rerunning the Model for Future Scenarios
You are going to rerun the carbon model, but this time for a future land use map rather than the baseline. Some of the file names will be a little different, so you will have to figure out whether it is future or alternative based on the naming convention. That is actually what it means to be a data analyst: figuring out how files are organized and following logical naming conventions to identify the data you need. You will need to navigate the file system, identify the correct future land use layer, and run the model on that data.
Analysis Steps and Questions
Walk through these steps, answer the questions that I have provided, and look at what the change in carbon storage values is between the baseline and future scenarios. You will benefit from examining the report.html file that is generated in your workspace. Have at it, and I will monitor everyone’s progress and call this back together when you are all getting close to done.
Important Technical Notes
Discount Rate Parameter Update
One minor thing I should note: I had an error in the slides that I provided. Do not put in a discount rate of zero point zero three. They changed it in the most recent release of InVEST to be entered as an actual percent rather than a decimal, so you should enter three instead. This is important because the discount rate affects the net present value calculations significantly, so getting this right is critical.
Troubleshooting File Access Errors
If you are getting error messages about permission being denied, the likely cause is that you have the output file loaded in QGIS and the model is trying to write over it. To fix that, take it out of your QGIS workspace. QGIS locks files that are loaded in the program, so you need to remove them from QGIS before running the model again on the same files.
Reviewing the Results: Discussion and Interpretation
Time Check and Reassessment
We have about eight minutes left on the clock. Even though many of you are still working on it, I would like to quickly talk through some of the answers to make sure everyone understands the key concepts, even if you did not finish running all the computations.
Key Differences in Model Parameterization
Enabling Sequestration
Here are a few things we did differently with InVEST this time: we enabled sequestration in addition to storage. Just to remind you of the conceptual difference, carbon storage is the amount of carbon currently present on the landscape, as opposed to sequestration, which is the rate at which new carbon is being captured and stored. Most people actually care about sequestration when they are thinking about climate change because what matters is how much additional carbon we can remove from the atmosphere.
Calculating Change Between Scenarios
When you are talking about carbon storage changes from different land use maps, that means you need to calculate carbon storage twice: once for the baseline land use land cover and a second time for some alternate or future land use land cover scenario. The difference between these two calculations gives you the sequestration value. If the alternate scenario stores more carbon, you have positive sequestration. If it stores less, you have negative sequestration or a carbon loss.
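In array terms, that difference is just a subtraction of the two storage rasters. The numbers below are invented; in practice you would read the two GeoTIFF outputs with a library like rasterio or GDAL rather than typing arrays by hand:

```python
import numpy as np

# Invented per-pixel carbon storage (tons) for two scenarios.
storage_bas = np.array([[120.0, 80.0],
                        [60.0, 10.0]])   # baseline
storage_alt = np.array([[120.0, 40.0],
                        [60.0, 10.0]])   # alternate / future

# Sequestration = alternate minus baseline; negative values are losses.
sequestration = storage_alt - storage_bas
net_change = sequestration.sum()  # negative here: a net carbon loss
```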
The Valuation Model and Net Present Value
You all figured out how to run the valuation model, and it does a number of things for you automatically that would be extremely annoying to calculate manually. It requires you to put in the years, and if you are wondering why the years matter, you can click on the information icon there. It is going to calculate the net present value for you—the annoying financial mathematics that we do not have to do manually. The results are already in net present values, so you can directly compare ecosystem service values to economic development values.
Why We Need the Time Horizon
The reason I needed to know the years is because we are basically saying this temporal horizon is the length of time over which we will consider these changes. We are asking: if this land use change happens and stays in place for this many years, what is the total value of the carbon storage loss? The model calculates that value for you using standard net present value formulas.
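A simplified sketch of that calculation can show the role the time horizon and discount rate play. This is not InVEST's exact valuation formula (which also spreads the carbon change across years), and all the numbers here are illustrative:

```python
def npv(annual_value, years, discount_rate):
    """Present value of a constant annual flow: the sum of
    annual_value / (1 + r)**t for t = 0 .. years - 1."""
    return sum(annual_value / (1 + discount_rate) ** t
               for t in range(years))

# e.g. losing 1,000 tons of carbon per year at $187/ton over a 20-year
# horizon, discounted at 3% (entered as 3 in recent InVEST releases,
# used here as 0.03):
total_value = npv(1_000 * 187, years=20, discount_rate=0.03)
```

Because later years are discounted, the total is noticeably less than twenty times the annual value, which is exactly why the model needs to know both the horizon and the rate.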
Carbon Versus Carbon Dioxide: An Important Distinction
The Pricing Difference
We gave a price of carbon of one hundred eighty-seven dollars. I also noted that the social cost of carbon dioxide is fifty-one dollars. The reason for this significant difference is that physical scientists talk about carbon the atom and its total mass, while many economists and policy makers talk about carbon dioxide as if we are buying and selling the same thing. But there is a critical difference: carbon dioxide has two oxygen atoms attached to the carbon atom, and those oxygen atoms add substantial mass.
A ton of carbon dioxide therefore contains only about twelve forty-fourths of a ton of actual carbon, and it is the carbon content that the accounting tracks. That works out to roughly a factor of three point six seven difference in mass, based on atomic weights (forty-four over twelve). That is why you had to use the higher number of one hundred eighty-seven instead of fifty-one—because these results coming out of InVEST are in tons of carbon the atom, not carbon dioxide the molecule.
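The price conversion itself is a one-liner once you know the molar masses (carbon is about 12 grams per mole, CO2 about 44):

```python
# One ton of carbon corresponds to 44/12 (about 3.67) tons of CO2,
# so a price quoted per ton of CO2 scales up by that ratio when
# expressed per ton of carbon.
MASS_RATIO = 44.0 / 12.0

price_per_ton_co2 = 51.0                          # social cost of CO2 ($)
price_per_ton_c = price_per_ton_co2 * MASS_RATIO  # = 187.0
```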
Discount Rates and Price Trajectories
Applying the Discount Rate
The model uses the discount rate to convert future carbon storage values into present-day dollars. I had you not use an annual price change, but some people do like to have that option, which would mimic what we saw from the DICE model: the social cost of carbon goes up over time as climate damages compound. So you could automatically include an increasing carbon price trajectory.
The DICE Model Comparison
This connects back to what we learned from the DICE model earlier in the course. In that integrated assessment model, we saw that the social cost of carbon should theoretically increase over time as climate change damages accumulate. If we applied that same logic here, we would have increasing carbon prices over our time horizon, which would make preservation look even better compared to development in the later years.
Understanding Model Outputs and Reports
The HTML Report File
Generated Output Information
We can see that many of you were smart about looking not in the raw workspace folder, which has all the raw model results that are hard to interpret, but rather in the view results section. This loads up the HTML page that InVEST generates automatically. You could also find it in the folder manually, but it is also linked here directly in the InVEST interface for convenience.
What the Report Contains
The question in the exercise asked you a number of things. First, what is the change in actual carbon—the sequestration? That would be this specific value in the report. It is tempting to just look at the alternate scenario value, but what we need to see is how it changed. The carbon storage falls from about four million tons to three point seven million tons, so we have a loss of that amount of carbon storage. This is what we call negative sequestration or a carbon loss.
It also tells you what file names you would want to look at if you wanted specific geospatial results showing where those changes are happening on the map. Further, it provides you with a mapping of the net present value on a per-pixel basis. For any given pixel, you know how much carbon storage difference there was between the baseline and alternate, so you know the dollar value contributed from that pixel given the land use change and the potential loss in carbon storage. That is what the NPVAlt.tiff file reports—the spatially explicit value changes.
Creating Better Visualizations
Using GIS for Detailed Analysis
These summary statistics in the HTML report are nice and useful, but to do a good report—and for your final project you will definitely want to do this—you are probably going to want to load these geospatial results into GIS and create customized visualizations, not just rely on this auto-generated report. Customization can make these analyses much more compelling and informative.
One approach is to zoom in on key areas and show both the full map context and a spatial close-up of hotspots of change to really illustrate what is going on. You can annotate these maps with arrows or text explaining the changes. You can create side-by-side maps of the baseline and alternate scenarios. These kinds of customized presentations really help communicate your findings to decision makers.
Analyzing the Trade-off: Development Versus Preservation
The Question at Hand
The Key Comparison
Back to the main question in the exercise: the key question I had you answer is not just the change in carbon storage, but given that extra information we provided—specifically, the fact that the net present value of a timber harvesting project is fifty million dollars—we want you to think about whether this land use change should happen. I made it convenient for you by expressing everything in the same units of net present dollars so the comparison is straightforward.
You can have complex argumentation and nuance in your answer, but if you are going to stick with the value of ecosystem services to assess whether the scale tips in favor of development versus preservation, let us look at what we got when we did the full analysis.
The Economic Answer
Which one was worth more economically? In the report, we see that the net present value lost of carbon storage is about forty-six million dollars. Critically, forty-six million dollars is less than the fifty million dollars that the timber project would be worth if it proceeded. Here is an anti-environment argument framed in economic terms: even with ecosystem services fully valued and included in the analysis, it actually makes more economic sense to log the forest, at least if we think we have taken into account all the proper ecosystem services and we are weighing them purely on economic value.
Co-Benefits and Broader Ecosystem Services
The Limitation of Single Ecosystem Services
One critical thing to note is that we have only included one ecosystem service in this analysis: carbon storage. It is probably the case that there are what are called co-benefits—other ecosystem services that would be affected by the land use change. Biodiversity is probably better in the unlogged forest, or maybe sediment retention is better, or water filtration services might be compromised. We will dive into those other services in the next classes. But for now, I wanted to make the point that, just as in the DICE model, this conclusion fundamentally depends on what your social cost of carbon is.
Sensitivity to Carbon Pricing
This is a really important insight: our recommendation of whether to develop or preserve this land is extremely sensitive to the assumed social cost of carbon. If we had used a different carbon price, our recommendation could have flipped entirely. At a higher carbon price of, say, three hundred dollars per ton, the net present value of the carbon storage loss would be sixty-eight million dollars, which exceeds the fifty million in timber value. At a lower carbon price, preservation looks even worse.
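The tipping-point logic is simple arithmetic. As a sketch (the carbon tonnage below is invented for illustration; only the fifty-million-dollar timber figure comes from the exercise), the recommendation flips as the social cost of carbon changes:

```python
def recommendation(timber_npv, carbon_lost_tons, social_cost_per_ton):
    """Compare the project's NPV against the NPV of the carbon storage
    it would destroy; recommend whichever side of the scale is heavier."""
    carbon_npv_lost = carbon_lost_tons * social_cost_per_ton
    return "log" if timber_npv > carbon_npv_lost else "preserve"

# Hypothetical: 230,000 tons of stored carbon at stake.
recommendation(50e6, 230_000, 200)  # carbon loss worth $46M, so logging wins
recommendation(50e6, 230_000, 300)  # carbon loss worth $69M, so preserve
```

The homework's sensitivity analysis amounts to calling this comparison across a range of carbon prices and finding where the answer flips.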
Extending the Analysis
The Homework Extension
In the homework, I will have you go one step further than this exercise. You can copy and paste most of the work that you have done in this exercise, but it will ask you a few more questions about what the conclusion would be under different social costs of carbon. By running the model under a range of different carbon prices, you will develop an intuition for how sensitive environmental policy recommendations are to our assumptions about economic valuation. This is a crucial lesson for anyone thinking about environmental policy and ecosystem services.
Conclusion and Next Steps
Any questions about what we covered today? You all were very effective at working through this exercise, so good job. This kind of hands-on engagement with models and data is how you develop real facility with ecosystem service analysis.
I am available on email if you have questions, and with the way the course is structured and getting more hands-on, feel free to email me or come during office hours if you get stuck on anything. Have a good Wednesday, and I will see you next class.
Welcome to Day 2: Ecosystem Services and Sediment Retention
Welcome to Day 2 of our hands-on exploration of ecosystem services. Today’s session builds on the carbon storage model we practiced in our previous in-class exercise by introducing sediment retention as the next ecosystem service we will study. We will begin with a conceptual overview and discuss the science underlying sediment retention before transitioning to running the model on your computers. It would be a good time to launch both Invest and QGIS now so that these applications are fully loaded and ready by the time we need them.
Review of the Carbon Storage Exercise
The Carbon Storage Model Exercise
The exercise we completed in our previous class involved working with a carbon storage model. I will be posting the weekly assignment right after this class concludes, and it will essentially consist of the same exercises you have already completed. I’ve added a bonus third question that represents only a very slight modification to what you have already done, so you will be almost finished if you have already completed the base exercises—it will primarily involve writing up your results.
Reflections on the Exercise: Why Ecosystem Services Matter
Beyond the technical aspects of the exercise, I want us to reflect on what we learned and why we conducted this exercise in the first place. This reflection also connects to a broader question: why are ecosystem services a useful way of thinking about conservation? To me, one of the most important lessons comes from understanding the role of an economist in policy and conservation decisions. It is simply the truth that policymakers tend to listen quite carefully to economists, and I would argue they often listen too much compared to other disciplines, but this is just the reality of how policy decisions get made. You will often see presidential candidates discussing what economists say, but you rarely hear them discussing what anthropologists say.
Why Economists Have Influence in Policy
There are many reasons why economists have such significant influence on policy decisions. One reason is that economists talk about the economy, which is a matter of profound importance to voters. However, another crucial reason is that economists have the tools and methodology to quantify trade-offs. Any effective politician knows that most policies involve trade-offs—they benefit one group while potentially harming another, or they advance one objective while setting back another. The tools of ecosystem services are extraordinarily useful in this context because they can be used to quantify the trade-offs in policies that politicians want to evaluate.
Building a Case for Conservation Through Valuation
What we essentially accomplished with our carbon storage example, particularly when we calculated the total value of carbon storage and compared it to the hypothetical value of a timber project, was to conduct a very sophisticated trade-off analysis that could inform a cost-benefit analysis. Here is the critical insight: if we had not calculated the value of carbon storage, what value do you think would have been assigned to nature in a cost-benefit analysis? What is the default position that a developer takes? The default answer is zero. Left to their own devices (developers certainly, but even politicians), if we do not provide them with a concrete number for how much nature is worth, the implicit decision becomes that nature is worth nothing. The decision will proceed regardless; they will continue to make choices about whether or not to clear the land, all while implicitly assigning a zero value simply because they lack a number. This is precisely the niche that ecosystem service analysis aims to fill—almost any estimate is better than assigning nature a zero dollar value.
Introduction to the Sediment Retention Model
Transitioning to Sediment Delivery Analysis
We are now transitioning to our next ecosystem service model, which focuses on sediment delivery and retention. I want to make an important point upfront: I am not going to ask you to understand all of the underlying science, mostly because we do not have the time to master it comprehensively, but I do want to make a broader point about the nature of these models. Underneath each of the ecosystem service models we work with, there exists a large literature of scientific research that supports all the calculations embedded within the model. The Natural Capital Project has proceeded by taking consensus science and translating it into tools that are easier to use. The specific equation at the heart of the model we will be working with today is called the Universal Soil Loss Equation, or USLE. You will not need to learn the detailed mathematical mechanics of how this equation operates, but you can rest assured that there are hundreds of Ph.D. dissertations that have been written on how to correctly calibrate it. We are extremely fortunate that this calibration work exists, and we do not have to replicate it ourselves. The tools, specifically Invest, implement this science for us and make it far easier to simply be a user of good science rather than having to develop the science from first principles.
Where to Find Scientific Documentation
If you are curious about the deeper science, you can always access the user’s guide, which is linked directly from within the application itself. You can click on any of these variables within the guide to learn more about them. What is erosivity? What is the LS factor, the length-slope factor? In the user’s guide you will find references to original seminal work, such as Desmet and Govers’s study from 1996, as well as more recent scientific papers. By exploring the user’s guide, you can learn a substantial amount about the underlying science if you choose to do so.
The Science Behind Sediment Retention
A Graphical and Conceptual Overview
Rather than delving deeply into the science in its full complexity, I will give you a highly stylized graphical version of the sediment retention concepts. However, this stylized version discusses the exact same processes and principles that you could read about in depth in the user’s guide. The background science concept that underlies this entire model is the Universal Soil Loss Equation. This equation combines a collection of variables—essentially geospatial variables that measure the slope characteristics of the landscape, a specific measurement called the LS factor (the length-slope factor), the erosivity of rainfall, a conservation factor, a cover factor, and the erodibility of the soil itself. Essentially, this equation combines these various layers with statistical analysis to determine which different combinations of soil types and slopes best predict the tonnage of erosion that will occur on any given pixel of the landscape.
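The equation itself is just a product of those factors. As a rough sketch (the factor values below are invented for illustration, not calibrated ones), per-pixel annual soil loss can be computed as the product R × K × LS × C × P:

```python
def usle_tons_per_ha(R, K, LS, C, P):
    """Annual soil loss for one pixel as the product of the five USLE
    factors: R rainfall erosivity, K soil erodibility, LS length-slope,
    C cover-management, P support (conservation) practice."""
    return R * K * LS * C * P

# Illustrative, uncalibrated values: erosive rainfall on a moderate
# slope under a row crop with no conservation practice.
loss = usle_tons_per_ha(R=100.0, K=0.3, LS=1.5, C=0.2, P=1.0)
```

The hard scientific work is not the multiplication; it is calibrating each factor layer against field measurements, which is exactly what the literature behind the model provides.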
The Simplest Possible Landscape Model
Let me approach this by computing it in our heads using the simplest possible landscape scenario. We have been discussing land use and land cover maps extensively, and here is the simplest example I can present: imagine just four pixels rather than the several million pixels that appear in the maps we were working with in QGIS the other day. Even with just four pixels, this simple scenario still embodies the basic principle of having different grid cells representing different ways that land area could be used.
Introducing Flow Direction
In addition to these four grid cells representing different land uses, I have added one more piece of critical information: we also have a flow direction. Think of this as a representation of a hill. When rainfall falls at the top of the hill, the water flows down the hill in the direction shown by the flow arrows until it eventually reaches a stream. What we are going to track is the sediment that leaves any given grid cell. For example, consider the corn grid cell here—we can track how much sediment from erosion is going to leave that grid cell. This is called the sediment load. We can represent this with an arrow where the thickness of the arrow represents the amount of sediment. Behind the scenes in the model, this might represent something like eleven tons of sediment leaving that particular grid cell.
Why Different Land Uses Produce Different Amounts of Sediment
Why does this corn field produce so much sediment? The reason is straightforward: when rain hits a cornfield, corn does not have a very dense root structure. As a result, if the rain is strong enough, it will pull the soil away. Any farmer knows intimately about this problem—erosion is truly the enemy for many farmers, and they spend millions and millions of dollars attempting to prevent it.
The Role of Vegetation in Sediment Retention
However, what we must take into account is not just how much sediment leaves a grid cell, but what happens to it after it leaves. In this particular example, we know the flow direction, so we understand that all the sediment leaving this corn grid cell will first flow into an adjacent forest grid cell. Depending on the root structure density of that forest, there is a value that nature provides—specifically, the ecosystem service of stopping that sediment from continuing to move. I am a mountain biker, and I always pay attention to the trails. Whenever a tree falls across a trail, the trails become very eroded. This happens because now the water starts flowing rapidly without obstruction. However, if you have vegetation on the landscape, it slows down the water flow. Good mountain bike trails are designed with vegetation in key places to manage water flow. The same principle is true for farming—vegetation will retain sediment and prevent it from being transported downstream.
Tracking Sediment Through Multiple Cells
But here is the key point: not all of the sediment is retained. Some of it will continue flowing, and we need to consider it iteratively as it enters the next grid cell. We must calculate how much sediment the wheat crop retains and how much it allows to pass through and continue flowing to downstream cells. Finally, the sediment flows through one last forest grid cell, and the tiny arrow here represents how much sediment actually makes it into the stream. This is important because that is the worst-case scenario—sediment that reaches the stream eventually goes into hydropower reservoirs and reduces the amount of hydropower electricity that can be generated.
Summing Up: The Ecosystem Service of Sediment Retention
This arrow trajectory represents the amount of sediment that originated from the corn grid cell and what happened to it as it moved through the landscape. The retention that actually occurs is the ecosystem service in action: nature is keeping that sediment in place rather than allowing it to reach the stream, where it would cause problems. We must not consider this retention only for a single corn grid cell; we must consider it for all of the different grid cells across the entire landscape. How much does the forest contribute to sediment transport? It has a smaller outgoing arrow than the corn because forest has a better root structure, but we still keep track of both how much sediment gets retained there and how much eventually makes it to the stream. This way we can see that more sediment retention happens downstream, or downhill, as the flow passes through forested areas. The wheat, like the corn, is probably going to generate a lot of sediment as well because of its relatively shallow root structure. But fortunately, there is that last grid cell with forest vegetation to retain a significant portion of that sediment. So the amount that ultimately makes it into the stream is substantially mitigated by the vegetation it encounters on the way.
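The cell-by-cell bookkeeping described above can be sketched in a few lines. The retention fractions and per-cell erosion amounts below are invented for illustration (they are not InVEST parameters), but the routing logic is the same: each cell traps a share of what flows in, then adds its own erosion to the load.

```python
# Hypothetical retention efficiencies (fraction of incoming sediment
# trapped) and per-cell erosion (tons leaving each cell) by land cover.
RETENTION = {"corn": 0.05, "wheat": 0.10, "forest": 0.60}
EROSION   = {"corn": 11.0, "wheat": 8.0,  "forest": 1.0}

def sediment_to_stream(flow_path):
    """Route sediment downhill cell by cell along a flow path and return
    the load that finally reaches the stream."""
    load = 0.0
    for cover in flow_path:
        load *= (1.0 - RETENTION[cover])  # vegetation traps part of the inflow
        load += EROSION[cover]            # the cell's own erosion joins the flow
    return load

# The four-pixel hillslope from the lecture: corn -> forest -> wheat -> forest
reaching_stream = sediment_to_stream(["corn", "forest", "wheat", "forest"])
```

Running this for the lecture's four-pixel hill shows most of the corn's eleven tons being trapped before the stream, which is the retention service the model values.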
The Basic Model
That is the basic model in its simplest form. It is quite straightforward on a landscape like this where we have just one hill and one clear flow direction. Of course, you might be thinking about what this model looks like on a more realistic, complex landscape with multiple hills, valleys, and stream networks.
Digital Elevation Models and Landscape Complexity
Introduction to Digital Elevation Maps
To deal with more realistic landscapes, I want to introduce a new type of data: the digital elevation model, abbreviated as DEM. There are numerous ways to create high-resolution models of elevation, but the most interesting and historically important method involves data collected from the space shuttle. There was a period when our society was sufficiently well-organized that we had a functioning space shuttle program. During the space shuttle era, there was a fascinating project involving a boom arm that extended extremely far from the space shuttle, reaching something like two hundred feet away. At the end of this boom arm was a second radar antenna.
How DEMs Are Created Using Parallax
This outboard instrument, working in conjunction with another antenna mounted on the space shuttle itself, measured the Earth by bouncing radar signals off the same spot from two vantage points (this was the Shuttle Radar Topography Mission, so the instrument was radar rather than a laser). Because the two measurement points were so far apart, the distance to the ground could be recovered using the principle of parallax. Interestingly, this is the same method that our eyes use to perceive depth. If you ever try to catch a ball with one eye closed, you will understand how important parallax is for depth perception—it becomes very difficult. The space shuttle used this same principle, looking at the same location from two different angles and inferring the elevation from the differences in those viewpoints. This is one example of how we can use satellites or space-borne instruments to gather information about what the Earth looks like.
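The parallax intuition reduces to simple triangulation. This is a flat-geometry sketch for intuition only (the real mission used radar interferometry, and the function name and angle convention here are my own): two sensors a known baseline apart each measure the off-nadir angle toward the same ground point, on opposite sides of the baseline, and the distance falls out.

```python
import math

def distance_from_parallax(baseline_m, angle1_rad, angle2_rad):
    """Triangulate the distance to a ground point seen from two sensors a
    known baseline apart, each measuring its off-nadir angle toward the
    point (angles on opposite sides of the baseline). Flat-Earth geometry,
    purely for intuition."""
    return baseline_m / (math.tan(angle1_rad) + math.tan(angle2_rad))

# A roughly shuttle-mast-length baseline of 60 m and two small angles.
d = distance_from_parallax(60.0, math.atan(0.2), math.atan(0.1))
```

The farther apart the two viewpoints (the longer the baseline), the larger the angular difference for a given distance, which is exactly why the boom arm had to be so long.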
Interpreting Digital Elevation Models
What do these digital elevation maps actually look like? We will dive into one of these maps in a moment; in fact, this is the actual map we will be using shortly. I think they are quite aesthetically interesting. You can look at them and see the structure of the landscape emerge from the colors and patterns. In this particular case, blue represents higher elevation and red represents lower elevation. If you look carefully at the map, you can see subtle patterns indicating stream networks: lines where water would flow downhill, and other lines that connect to them. This network structure is what a stream network looks like. We commonly think of the Mississippi River as a single river, but it is actually part of a vast network of tributaries that all feed into it. Even just by looking at the digital elevation model, you can see this same basic tributary structure forming naturally from the elevation patterns. NASA's data captures all of this detail, and I have added my own markers to highlight particular areas.
A Specific Example: The Gura District of Kenya
This particular digital elevation model is from the Gura district of Kenya, which is one of the areas where the Natural Capital Project conducted a lot of really interesting ecosystem service work. In that area flows the Gura River, which appears quite prominent in the elevation model and is critically important for the well-being of the subsistence farmers who depend on it. This might be one of the more important rivers you could consider from a human welfare perspective. If you take away a river from people who have access to grocery stores and can buy food, it is not ideal, but it is not the end of the world. But consider what happens if you are a subsistence farmer depending on that river for survival—if the river stops flowing, this becomes a genuinely existential problem. You cannot simply buy your way out of that situation.
A Personal Story: The Gura River Dam Project
I actually visited this river myself. I was there looking at a World Bank-funded dam project. The dam was intended to create a large reservoir that would improve agriculture in the area by trapping water so it could be distributed throughout the fields during the dry season. The dam was supposed to create a reservoir that was thirty feet deep. However, when we were out there, we witnessed something troubling: a woman was literally walking across the reservoir. It was only about ankle-deep. This shallow depth occurred because the project planners had not adequately accounted for sediment accumulation. The dam still functioned as a dam in some sense, but it was rapidly becoming unable to provide the agricultural service it was designed to deliver to the area. It was depressing to see someone walking across what was supposed to be quite a deep body of water.
Advanced Concepts in Sediment Modeling
From Simple to Complex: Upslope Transport Considerations
So that is what a digital elevation model looks like in its basic form. Now let’s add a bit more realism and one more important science element. In addition to the Universal Soil Loss Equation, which we calculated for each of those four grid cells in our simple example, we actually face a much bigger computational challenge. For each pixel in the real model, we need to consider a much more complex set of upslope transport pixels that contribute sediment to that point. In the simple linear example with four grid cells, you could easily see which cell flowed into which cell. But in reality, when you have an actual stream network or a digital elevation model, for any given point you can compute the entire area that feeds into it through upslope flow. Maybe this area is bounded by ridge lines of a couple of hills, so everything inside this boundary flows into that single point of interest.
Computing the Complete Flow Contribution
We are going to be calculating how much sediment is coming from all of the grid cells in this upslope area just so we know what happens in this particular pixel of interest. Once that sediment leaves this pixel, we still have the complex challenge of knowing which grid cells it flows to next, because that is where retention from vegetation might still happen before the sediment eventually makes it to the stream. This adds a lot of detail to the calculation, but the reality of what we are actually looking at is far more complex still.
The Real Landscape: Complexity at Scale
Here is what it really looks like. We are now actually looking at the DEM of the Gura River watershed. Instead of just the one pixel of interest we showed in our simple example, we have a very large area flowing into it and a very long flow path by which sediment moves out of it. We need to calculate how much sediment is retained here and everywhere along that extended flow path. But here is the computational reality: we have to do this calculation for every single pixel in the entire area. This illustrates why this kind of modeling requires really powerful computers. For the longest time, we could only calculate this kind of sediment delivery for relatively small watersheds. However, as computers have gotten bigger and more powerful, we have increasingly gotten better at performing these calculations and can actually calculate sediment delivery globally now, which is pretty remarkable.
Understanding Flow Accumulation
The Concept and Importance of Flow Accumulation
One last data concept we need to define is flow accumulation. Flow accumulation keeps track of how much water is flowing through each pixel of the landscape. You can think about it this way: at the very top of a river system, like Lake Itasca where the Mississippi River originates, there is just a tiny stream. That stream gets bigger and bigger as you move downstream, especially as more tributaries flow into it. So the flow accumulation value when you are measuring the main river represents essentially how much water per minute is flowing through that river channel. The flow accumulation value will be much higher downstream than it is at the river’s source.
Flow Accumulation Beyond Rivers
The concept of flow accumulation is actually broader than just identifying rivers. Even in places that are not currently rivers but are nearby, you can also calculate the flow accumulation. Those areas will have much lower flow accumulation values. This number—this flow accumulation value—is what we actually use to define what constitutes a stream. You probably have not thought deeply about the philosophical question of how we define a stream, right? A stream is essentially defined as occurring when the flow accumulation crosses a threshold value such that we consider it a stream. The placement of that threshold is somewhat subjective because many streams are temporary or seasonal. Is something a stream if water only flows through it for half the year? Generally speaking, we define something as a stream if it has a flow accumulation value above a threshold that is sufficient to maintain flow there year-round. Regardless, this map, which reports the flow accumulation number in cubic meters of water per time period, helps us define what the stream network is.
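The threshold idea can be made concrete with a toy sketch. The drainage structure and threshold below are invented for illustration (real tools compute this over millions of DEM pixels), but the logic is the same: accumulate everything that drains through each cell, then call a cell a stream once its accumulation crosses the threshold.

```python
def flow_accumulation(downstream_of, n_cells):
    """For each cell, count how many cells drain through it (including
    itself), given downstream_of[i] = the index of the cell that cell i
    flows into, or None at the outlet."""
    acc = [1] * n_cells                  # every cell contributes itself
    for i in range(n_cells):
        j = downstream_of[i]
        while j is not None:             # push cell i's contribution downhill
            acc[j] += 1
            j = downstream_of[j]
    return acc

def streams(acc, threshold):
    # A cell counts as stream once enough area drains through it.
    return [a >= threshold for a in acc]

# Toy watershed: cells 0 and 1 both drain into 2, then 3, then outlet 4.
acc = flow_accumulation([2, 2, 3, 4, None], 5)
```

With a threshold of three cells, only the lower reaches qualify as stream, which mirrors how the mapped stream network emerges from the DEM.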
Identification of Streams and the Dam Location
Here, I have colored the streams in blue to make them visible. This is now the actual identified stream network, and this is where the dam is located. The dam I was referring to, the one that made the reservoir the woman walked across, is right here, at a point whose flow accumulation value is high enough that the model identifies it as part of the stream network.
Running the Sediment Delivery Ratio Model
The Key Concepts Summarized
So those are the key concepts you need to understand to gain the basic intuition necessary for running the sediment delivery ratio model, which is what we are going to turn to now. Does anyone have questions about the science of it so far? Does it seem pretty straightforward?
A Student’s Question About Data Sources
A student asks: I’m doing a research project right now, and this makes sense when I look at it, but it’s just so complex. Where is the data coming from? Are there environmental teams that go out, collect things, and piece it all together?
That is a great question, and that is honestly where most of the hard work has been spent in this domain. Each of the different data sets that we use all have a different source. Many of them are derived from satellites. The space shuttle is the source for the digital elevation map data. NASA has put various elevation data online, and one of the big skills in becoming good at GIS and ecosystem service analysis is knowing what all the databases are and how to access them. A more advanced GIS class would spend a lot more time thinking about where you get the data and how to access it. But to summarize for you: in addition to satellites, there are indeed environmental teams that go out into the field and collect ground-truth data. The hardest data to get, and the data that we really need, is site-specific measurement data. To verify that the model works correctly, you also need people to go out there and measure the amount of sediment in actual water bodies. We can infer sediment amounts from satellite data, but only because we have had millions of human hours spent going out and literally measuring the water—measuring the water quality, determining how much sediment is currently in the water. So this represents a combination of global satellite data with hard-fought, on-the-ground data collection at the local scale. There is not a single answer I can give you for all situations, except I will show you another answer, which is a great segue: if you go ahead and open up Invest, you will see where the model links to the data.
Accessing Data Documentation in Invest
First, let’s get to the model in Invest, and then I will show you where it links to data sources and documentation. For today, we are going to use the sediment delivery ratio model. One of the things I love about this science is that open science is heavily focused on documenting good answers to questions like this: where did you get the data? That is where we start looking—in the user’s guide. This is the section of the user’s guide specific to the sediment delivery ratio model we are using. Throughout the text, there will be numerous references, and then also at the very end, there is specific information about where each data type came from.
Where to Find Different Data Types
Here is a section on the digital elevation model that lists some freely available global maps. The World Wildlife Fund has digital elevation data available. NASA has digital elevation data available—that is the one I was referring to with the space shuttle data. But there are others out there as well. The guide describes where to get digital elevation models for all the different regions where you might want to work. Here is an example for watersheds and sub-watersheds data. We have the U.S. National Inventory of Dams, which turns out to be maintained by the military—it is militarily important, right? But there is also the Global Reservoirs and Dams database available. Hopefully this accelerates you into the process of learning where all the different data is located and how to access it. You can see that the model documentation really does point you toward where to find real data.
Getting Set Up to Run the Model
A Note on How We Will Proceed
We are going to do something similar to what we did with the carbon model, but it will be a little bit more detailed and complex this time. Just a few important notes before we begin: I have screenshots of what I am going to do, and I am also going to do it live in front of you right now. The reason I do it this way is so that you can refer back to these screenshots later if you need to see something again. Also, I should note that most of the screenshots I have prepared are for a Windows computer, and I am going to do this live on a Mac so we can see both types of systems in action. How many people here have Macs? We have at least two. So you will see both operating systems in action.
Setting Up Your Workspace
So go ahead and open the sediment delivery ratio model in Invest. The first thing we are going to do is set the workspace. This is very similar to what we did with the carbon model, but I want to add one extra organizational step. I want to create a new folder called “outputs” or “results” where we will store the model outputs separate from the input data.
I want you to navigate using Finder on Mac or File Explorer on Windows to wherever you saved the Invest data on your computer. For me, that was in my users folder, files, Teaching, APEC 3611, and then in there I had a folder called Invest. Right there. Navigate to the SDR folder. We are going to use this folder a number of times. Before we actually start using it to put data into Invest, let’s create a new folder called “Results.”
You do not have to do this organizational step, but I strongly recommend it because there are so many data files in this folder that it quickly becomes annoying if you do not separate the inputs from the outputs. Create a new folder called results using whatever method is typical on your operating system—right-click, add new folder, or the equivalent. Then, if you click back into Invest, navigate to that results folder and make sure you select the results folder as your workspace location.
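If you prefer to script this setup (handy once you start rerunning models often), the same folder can be created programmatically. The path below is just an example echoing the one I used in class; change it to wherever your SDR data actually lives:

```python
from pathlib import Path

# Example path only: point this at your own SDR data folder.
sdr_dir = Path.home() / "Teaching" / "APEC 3611" / "Invest" / "SDR"
results_dir = sdr_dir / "Results"

# parents=True creates any missing intermediate folders;
# exist_ok=True makes the script safe to run repeatedly.
results_dir.mkdir(parents=True, exist_ok=True)
```

Either way, the point is the same: keep model outputs in their own folder so they do not get mixed in with the input data.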
A Brief Pause for Organizational Setup
Just pausing for a second: is everybody able to create the folder and get everything set up? I know that I have taught a lot of computer classes over the years, and this is one thing that is starting to change among newer students. Young people do not quite know what files are like they used to, which is understandable. The internet and Google Docs have all gotten so intuitive that we sometimes forget that files are just chunks of hard drive somewhere storing data. But yes, so we have just pointed your Invest workspace to that directory we created.
Locating and Selecting Input Data
Tips for Finding the Right Files
One other thing to note: if you are curious what each of these variables or input fields is, you can click on the definition icon here. This will link you to the user’s guide, which brings us back to that earlier question about data: where do you get it and where do you learn about it? But I think for this particular example, the input file names are pretty well labeled, so you should be able to infer from each file name what data type it contains. I am going to circulate around the room and check everybody’s screen to make sure everyone’s inputs are selected correctly.
Exploring the Data in QGIS
Loading Data to Understand What We Are Working With
Alright, good. Okay, so that is the first set of input files selected, and now let’s actually load them up into QGIS just so we know what we are actually working with. For that, my process is to click on QGIS to bring that application to the foreground, and then click back on your Finder window or Windows Explorer to navigate through the files. Let’s explore a few of these layers to understand what they represent. First, let’s look at that DEM layer.
Visualizing the Digital Elevation Model
You know, it defaults to different color schemes depending on how you have your QGIS configured. This particular default is not very useful for understanding elevation. It is using a color scheme that identifies a unique color for each value, which makes sense for a land use and land cover map where there might be maybe sixteen different categories of land cover. It does not work so well for something that is a continuous measurement like elevation, where you might have hundreds or thousands of different elevation values.
So the paletted color scheme makes no sense for this data. We will do the same thing we have done before: go to the single-band pseudo-color visualization option. In QGIS, this is found in the layer properties under the symbology settings. Let’s select a pleasant color ramp for displaying elevation. I am going to go with yellow-orange-brown, which is a nice progression from low to high elevation.
Understanding the Elevation Data
When you click that color ramp, now it is showing you the correspondence between each color and the actual elevation in meters above sea level. We get a slightly prettier map like this. Yeah, now you can see the different areas where there are tributaries and streams leading into the main stream network. You can actually visualize the topography.
One cool thing about working with continuous data is if you zoom in on an area, all the values start to get so close to each other that you actually cannot see any difference between pixels anymore. One thing you can do, at least after you have set it to pseudo-color mode, is you can right-click on the layer and go to styles and then say something like “stretch to current extent.” If you click this option, all it is going to do is reset the minimum and maximum of the color bar to the current extent you are viewing.
But to do it manually, what you could do is notice that all those values are really close together in your current view. Let’s actually just manually change the minimum and maximum. Let’s look at only the values between 2000 and 2500 meters elevation. This is going to restretch the color scale and you can see it will highlight the differences in that elevation range a lot nicer, making subtle variations more visible.
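Numerically, that manual restretch is just a linear rescale of the data into the color ramp’s range, with everything outside the chosen window clipped. Here is a minimal sketch with made-up elevation values (not the actual Willamette DEM):

```python
import numpy as np

# Hypothetical elevation values in meters; invented for illustration.
dem = np.array([1950.0, 2100.0, 2250.0, 2400.0, 2550.0])

def stretch(values, vmin, vmax):
    """Map values onto [0, 1] for a color ramp, clipping outside [vmin, vmax]."""
    return np.clip((values - vmin) / (vmax - vmin), 0.0, 1.0)

# Full-extent stretch: nearby values map to nearly identical ramp positions.
full = stretch(dem, dem.min(), dem.max())

# Restretched to the 2000-2500 m window, as in the QGIS example: values
# below 2000 clip to the bottom color, above 2500 to the top color.
windowed = stretch(dem, 2000.0, 2500.0)
print(windowed)
```

The windowed stretch spends the whole color ramp on a 500-meter band, which is why subtle variations in that range suddenly become visible.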
Moving On From Detailed Exploration
Okay, that is what the DEM looks like and how to visualize it effectively. You could do the same exploration with the other data layers, but for the sake of time in our class, we are going to move on. When you actually write up your country reports later, it might be useful to go in and show some of these visualizations to make your analysis more compelling and clear.
Selecting Watershed and Vector Data
Working with Vector Data
But let’s talk about the next file types I need you to select. Vector is the next type of data we need to provide. For this input, the file should ideally be in GeoPackage format, but the data available here is still stored in the older shapefile format. We actually have a number of different vector file options we could choose. I am going to recommend that we go with this one called sub-watersheds.shp. If you just wanted a single polygon representing the whole study area, that would have been watersheds.shp, but I would like you to choose sub-watersheds.shp to give us more spatial detail. And again, make sure it is the actual .shp file, not the similarly named auxiliary files.
Handling Drainage Data
We are going to leave out drainages for now. This input matters when farmers have installed tile drainage systems with pipes running through their fields; that obviously has a huge impact on where the water actually goes and where sediment ends up. For our purposes today, though, we are going to skip this input.
Transitioning to Model Parameters
Yeah, so that is all of the different input files we need to point to. But there is one last set of inputs we need to consider. Let me jump back to the presentation materials to show you what parameters to enter next.
Entering Model Parameters
Model Parameters and Configuration
Okay, so that is done with the input files. The next thing we are going to put in is the model parameters. In addition to datasets, we also have to specify certain parameters. Most importantly, we need to once and for all answer that philosophical question we discussed earlier: what is a stream? We are going to define a stream as any pixel with a flow accumulation value of at least 1,000 upstream cells. A very concise answer to a philosophical question, right? But go ahead and enter in these values. I will list them clearly:

Flow accumulation threshold: 1,000
Borselli k parameter: 2
Maximum SDR: 0.8
Borselli IC0 parameter: 0.5
Maximum L value: 122
These are the defaults that are suggested in the user’s guide and that we will be using. Once you have got all those values entered in and assuming you have got all green checkmarks next to your inputs indicating everything is valid, you can go ahead and hit the run button to execute the model.
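The flow accumulation threshold boils down to a simple per-pixel test: a cell is "stream" once enough upstream cells drain through it. A sketch with a toy flow-accumulation grid (values invented, not from the sample data):

```python
import numpy as np

# Toy flow-accumulation raster: each cell counts how many upstream
# cells drain through it. Illustrative numbers only.
flow_acc = np.array([
    [   5,   20,  800],
    [  40,  950, 1200],
    [ 300, 1500, 4000],
])

THRESHOLD = 1000  # the value we just entered in InVEST

# A pixel is classified as stream once its contributing area is large enough.
streams = flow_acc >= THRESHOLD
print(streams.sum())  # number of stream pixels in this toy grid: 3
```

Raising the threshold gives you a sparser stream network; lowering it turns more hillslope pixels into "stream," which is exactly the philosophical knob we were debating.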
Computational Time Considerations
This one might take a little bit longer to run compared to the carbon model because it is doing a lot more mathematics. It is actually calculating the flow routes for the landscape, which is the most computationally expensive part of this whole process. For any given pixel, the model needs to compute which pixel it will flow to next, and then also consider which pixel that downstream grid cell will flow to, and so forth through the entire landscape.
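The pixel-chasing logic described above can be sketched very simply. This is not InVEST’s actual routing code, just a toy illustration of the idea: each pixel knows its downstream neighbor, and tracing every pixel’s path to the outlet is what gets expensive on a real DEM with millions of cells.

```python
# Toy D8-style routing: each pixel stores which pixel it drains to,
# with None marking the outlet. Names and layout are invented.
downstream = {
    "A": "B",  # A drains to B
    "B": "C",
    "C": "D",
    "D": None,  # outlet
}

def path_to_outlet(pixel):
    """Follow the drainage chain from a pixel all the way to the outlet."""
    path = [pixel]
    while downstream[pixel] is not None:
        pixel = downstream[pixel]
        path.append(pixel)
    return path

print(path_to_outlet("A"))  # ['A', 'B', 'C', 'D']
```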
Raise your hand if any problems are coming up as you run it. You might just sit tight for a second while the model processes.
Runtime Comparisons Across Computers
Good, good. Checking runtimes across the class, the fastest one I have heard about was over here at eight seconds. Does anybody have a faster runtime than eight seconds?
I was a little faster at 3.05 seconds, but I should not brag about that; I just really love computers. But I also should not brag because I have a significant budget for computers. This machine was seven thousand dollars, but it was funded through a grant I had to spend on technology. If I did not spend it, I would lose the funding. So I might as well have bought the literally best computer I could get at the time. For the record, this is not even the most expensive computer I have bought.
Interpreting Model Results
Viewing the Results Report
Okay, anyway, once the model is done running, you can hit the “View Results” button. This gets you a comprehensive report with a bunch of information. Side note: this results report feature is a new addition to InVEST that they added just last year. Previously when I would teach this course, there was not a handy report that people could click through, so I would spend a lot of time talking through each result individually. But now this is so nicely formatted that it sort of does the homework assignment for you. Whatever, you can definitely use it.
How the Report Display Works
There will be some parts where you still have to load the results into QGIS because you might want to zoom in on particular areas to examine spatial patterns in detail. But honestly, this report gets a fair amount of the analysis done for you already.
Reading the Results: Key Outputs
So, what do we see in this results report? Here is that DEM we looked at earlier and here is our definition of the stream network based on the flow accumulation threshold of 1,000 that we set. In this case, you would have to go into QGIS to see the full detail because the report view is zoomed out too far and a lot of pixels get dropped at that zoom level. But you can sort of see the basic outlines and it defines the stream network for us.
The report also gives you information about where all those different input files are and references to all the different output files generated. Stream.tiff is the stream layer we are looking at here. There are a ton more outputs generated by the model, and many of them might be interesting to you depending on your research questions.
The Most Important Outputs to Understand
But some of the most key outputs would be sediment export, for instance. This shows the total amount, the total tonnage, in this case of sediment and soil that makes it into the reservoir or into the stream network rather than getting trapped somewhere upstream. This is the amount of sediment reaching the stream.
Other outputs that you might be interested in are avoided erosion. Instead of only caring about the amount of sediment that makes it into the stream, you might also care about how much nature provided value by preventing erosion from happening in the first place. So avoided erosion shows you where retention happened.
Interpreting Sediment Retention Results for Policy
Loading and Visualizing the Avoided Erosion Layer
Okay, so you can spend a lot of time exploring those model outputs. Let’s talk about what these layers mean for policy decisions. Policy makers would need this data too. I am going to skip a few slides discussing graphing techniques, which might help you if you are curious, but what I want to focus on is adding the sediment retention layer to QGIS and talking about the policy implications.
Navigate to your results folder. This is why I made you create that results folder earlier—just so there are fewer files here to sort through. This output file is going to be called Avoided Erosion, which is what we are looking for. Go ahead and open it in QGIS. Give it some prettier colors using the same pseudo-color visualization we used for the DEM.
What the Avoided Erosion Map Shows
Okay, so now we are looking at a map of where nature was providing a service—specifically, where did nature prevent erosion from happening in the first place? This map is the most policy-relevant of all the outputs. The key policy question is: if you wanted to prevent erosion from happening and you wanted to keep that dam from becoming useless from sedimentation, you would want to make absolutely sure you do not cut down the vegetation on these pixels that have a high avoided erosion value.
Making Policy Recommendations Based on the Data
Here, for instance, in this area with high values, the policy advice would be pretty straightforward: do not degrade this area that has high sediment retention value. The reason is because it is holding a lot of that sediment, preventing it from making it into the stream. The policy implication is clear—preserve this vegetation.
Distinguishing Between Pixels and Stream-Level Effects
There is a distinction to understand between on-farm erosion and erosion that makes it into the stream. The avoided erosion map we showed is reporting how much erosion is prevented on that exact pixel itself. So the policy advice would be slightly different here. This map is essentially saying: if you are a farmer, these are the locations you should be most nervous about regarding erosion, and you should probably invest in erosion prevention strategies like tiling or better-designed drainage systems. These are the places where erosion risk is highest and therefore where conservation efforts would have the most impact.
Different Policy Applications of the Same Data
So yeah, either one of these maps—the avoided erosion on specific pixels versus the downstream impact—can point you to policy-relevant advice, but the policy recommendation might be different depending on which outcome you care most about. The higher elevation areas, based on what we have seen, are particularly important for sediment retention.
Policy Applications: Real-World Examples
China’s Sloping Lands Conversion Program
To give you a real-world example of how ecosystem service analysis informs policy: in China, one of the largest ecosystem service policies in the entire world is the Sloping Land Conversion Program, sometimes called Grain for Green. Basically, the policy pays farmers who have land above a certain degree of slope not to convert that land to agriculture. If they followed their private economic incentives, they might convert that sloped land to agriculture because they could grow crops there. But doing so would create a huge externality in terms of the erosion and sediment that would be generated. So it is actually worth it for the government to design policies that say: if you happen to have land with greater than a five-degree slope, we are willing to pay you some amount in yuan not to convert it to agriculture. The way this payment amount is often determined is through ecosystem service valuation.
The Brilliance and Challenges of Payment Programs
That is a pretty smart program because the farmers like it, right? They get compensated for land they would have a hard time farming anyway. An alternative policy would be to simply say you cannot do any agriculture on slopes greater than five degrees. What would be the downside of that alternative approach? A strict regulation like that would impose real hardship on farmers, and many would try to disguise agricultural use or make the land look natural while still cultivating it, creating enforcement challenges and all sorts of unexpected consequences. Payment programs like China’s approach work better because they align farmer incentives with conservation goals rather than pitting them against each other.
Questions on Avoided Erosion Calculations
A Student Question About Erosion Metrics
A student asks: I have a question about the avoided erosion. Does a higher number mean more erosion avoided, or just more erosion?
This map here shows more erosion avoided. Yeah, if you are curious about the specific definition, the user’s guide will give you the full scientific definition. It is the tons difference between the current state and some alternative state. Here is a slide on that concept.
Understanding Avoided Erosion Through Comparison Scenarios
Whenever you are talking about avoided erosion, that is actually a whole lot harder to define because you need something to compare it to, right? If something did not happen, how do you put a dollar value on it? The way the model works is it runs twice: once with the current land use and calculates the export, but then also compares that to how much export would have happened in a worst-case scenario. For sediment retention, the worst-case scenario that the model assumes is bare field with no vegetation. So the difference between how much export would have happened with bare soil versus the current land use—that total amount is what the model reports as the avoided erosion.
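The two-run comparison described above reduces to a subtraction per pixel. Here is a sketch with invented numbers, just to make the arithmetic concrete:

```python
# Hypothetical per-pixel sediment exports in tons/year; numbers are
# invented for illustration, not actual model output.
export_bare_soil = 12.0   # worst case: bare field, no vegetation
export_current   = 3.5    # with the current land use / land cover

# Avoided erosion is the export that would have happened under bare
# soil minus what actually happens with the current vegetation.
avoided_erosion = export_bare_soil - export_current
print(avoided_erosion)  # 8.5 tons/year retained thanks to vegetation
```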
Scenario-Based Analysis for Policy Questions
Sometimes, though, that worst-case bare-soil comparison is not really the comparison that you want to make for your specific policy question. Instead it makes more sense to look at it on a scenarios basis. Instead of using the extreme example of bare soil, you would run the model twice—once with the current land use and land cover map and once with some alternative land use and land cover map representing a different policy scenario. Then you could redefine erosion prevented by whatever policy assumptions led to that alternative scenario. So we will return to this concept when we talk about scenarios. That is why I have been emphasizing the value of scenario analysis throughout this course: a lot of times the most useful analysis is comparing where there is a big difference in the erosion that happens given one land use map versus another land use map that represents an alternative policy or management approach.
Closing Remarks and Next Steps
Final Questions
Great question. We have one minute left before we need to wrap up, but does anyone have any other last questions?
Compliments on Student Competence
Cool. Everybody here is pretty computer competent, which is always fun to see. We spend less time on annoying technical troubleshooting, which means we can focus more on the science and policy. But yes, you will see the weekly assignment questions go live here shortly after class when I push them to GitHub. Have a great weekend.
Transcript (Day 1)
Alright, let’s get started. Welcome to Ecosystem Services Part 2, where we will pick up where we left off with a hands-on, software-focused dive into ecosystem service modeling.
So what does that look like specifically? It means we will return to the carbon model, which you all have run, but we ran it in light speed without really talking about what we were doing. That was on purpose—I just wanted to make sure it was working on everybody’s machine. That’s often a roadblock. However, this class has students with above-average computer skills, and you all succeeded, so it went well.
But we’ll dive into it and talk about what’s going on behind the scenes. We’ll then have an in-class exercise on assessing a policy that affects ecosystem services. If time permits, we’ll move on to the second, much more complicated ecosystem service model we’ll emphasize: the sediment delivery model.
Okay, so if you haven’t already, go ahead and launch InVEST and QGIS. Both take a little bit to get going. But while you do that, just a reminder of a few things. We’re going to be using the base data. This is just for your reference—you downloaded it directly from InVEST, but keep in mind that when you do your country reports, you’ll be using different data, not this nicely curated, already organized set.
I had you save it in your class directory, and we’ll use those different sample data directories named after each of the models for where we store the model results. What we did last class was quickly point InVEST to the key inputs for the simplest possible run: basically plugging in the land use land cover map and a biophysical table in CSV format.
What I want to do now is return to those data but spend more time looking at it so we can get a sense of how to interpret it and understand what’s going on.
So first, let’s load that land use land cover map from our sample data into QGIS. I’m going to try to work dynamically with two screens so we can do this together live.
I forgot to clear what I was doing before. I was running land use land cover maps through a bunch of different models trying to predict where land use change is going to happen. This is proof that on a regular day, I’m always looking at where land use change happens under different scenarios. You can see slightly different configurations of the landscape and different scenarios. You don’t need to know this, but this is actually running a model I’ve gotten attention for called GTAP-InVEST, where it’s looking at different scenarios. Here, we’re looking at what happens if you take away all irrigation and how crops are grown. It turns out that if you take away irrigation, agriculture expands quite a bit, and this results in a relatively large reduction in carbon storage. We just got a paper accepted for publication on this.
But we’re going to start up in a blank project, and I want you to navigate to where you stored the data. For me, that’s J.A. Johns files, APEC 3611, and then where I installed my InVEST data. The carbon model is the one we’re looking at. We powered through it last class, but now let’s talk a little bit more about it. You’ve run it already once, so there’s going to be more than just the default inputs—there’s also going to be outputs as well as inputs.
The first thing we’re going to look for, and I want to explain a little bit more about these file types, is we’re going to load this one: “LULC Current Willamette.tiff”.
One thing about TIFF files is you’ll notice there’s a bunch of similarly named files. This is a common convention in the GIS world: in addition to the actual raw data, you might store auxiliary data, AUX, that has information like what area this is going to cover. It’s kind of annoying that it’s stored as a separate file, but that’s what that is. It’s a common tripping point though, because if you try to load this auxiliary file into QGIS, it will not know what to do.
This is further complicated by the fact that on Windows, file extensions are often hidden, which means the last few characters are not shown. So the file we actually want may show up with no extension at all, because Windows is hiding the .tif. That is kind of confusing, but go ahead and drag this into QGIS.
So what do we see? A few things to orient you. First, if you have a mouse wheel, you can zoom. Otherwise you can use other key commands. Left-click and pan to focus on different areas. A common problem is if you accidentally zoom out really fast, like I just did, you can lose the data. It’s there, it’s just tiny—I zoomed out to be looking at it from Mars.
A common shortcut you can use is right-click on your layer and go “Zoom to Layers”. All that does is refocus the window on the actual extent. One thing you might want to do is click this little dropdown arrow here. If you expand that, it gives you more information about the color bar. In this case, this is what’s called categorical data, meaning each different integer value on this map corresponds to a specific category of land use land cover data.
The values that are this color are residential lots with 0 to 4 units per acre—low-density housing, kind of like suburbs or exurbs with yards. This is more like inner-ring suburbs with gridded neighborhoods like in St. Paul or Minneapolis with blocks. And residential with greater density is where you start to have skyscrapers or big condo buildings. This is just an example of what real data looks like.
You can zoom in and see the sort of things you’d expect. You start to see a road network—you can actually detect the road network from space, which is kind of cool. You can also see other features like cropland, modeled as tiles of different colors, and the river valley running through it. The basic patterns you see here determine some really important things about how ecosystem services are provided.
So that’s the map that we plugged into InVEST. Just to do it again, we’re going to rerun it on your computer because we’re going to rerun it under a few scenarios. Let’s replicate what we did last class—open the carbon storage and sequestration model.
Just to remind yourself, you can load a pre-configured version, but if not, this will remind you: we’re going to select the workspace, which is the folder where it’s going to save the results. I just chose the carbon model folder. You could also have typed that path out.
Now here’s where we’re going to point to that baseline land use land cover map. It’s over here in Finder. It’s a little harder to tell because of all the different file extensions. It’s not this one or this one, but this one. The preview gives you a hint that this is the one that’s actually a map.
That’s the one we selected in QGIS. Give it a double click and it loads up here. We’ve already talked about this: we have a raster file, not a vector file—just a big matrix of integer values corresponding to different types of land cover. That’s what we put in here.
I’ve got slides here that are a bit more comprehensive, so if you need to remind yourself how to do this, these slides give you a screen-by-screen walkthrough. You’ve seen this before, but now I have it documented.
The second file we plugged in without talking about is that CSV file: the carbon pools table, sometimes called the biophysical table. We’re looking for the CSV file. There it is: “carbonpoolsWillamette.csv”. But this time, instead of loading it into InVEST right away, let’s open it up in Excel or however you prefer to look at CSVs.
I just got my new Mac and I’m learning as we go. It looks like it loaded up in Sheets rather than Excel. I’m going to change that. I actually use VS Code to replace Excel completely. I’m almost to the point where I don’t have to use Excel anymore, and I’m thrilled because Excel does a lot of annoying things, like reformatting numbers as dates. I can’t tell you how many hours I’ve wasted on that specific problem. Has anybody else had that issue?
When you put a number in and drag it down, it increments each time. That’s a very heavy-handed approach to what Excel thinks you should be doing, but often when I’m repeating a number, I don’t want it incrementing up—I want the same number in each cell.
Right now, I’m using Data Wrangler in VS Code, which is like Excel but open source.
Okay, so what do we have here? I’ve now opened the data. Let’s explore these columns quickly. First is the LU code. This is the integer value being displayed on the map as colors. If you actually look at each individual cell, this will be some integer value like 1, this will be like 2, and so forth. What the CSV does is assign the meaning: 1 corresponds to residential 0-4 units per acre, and it assigns it to some data.
These are data collected from an exhaustive review of scientific literature from undergrad research assistants going out and measuring how much carbon was in trees. It summarizes it as 15 tons per hectare in the above-ground carbon pool, versus 10 in the below-ground pool, 60 in the soil, and 1 in the dead litter on the ground.
A lot of these are intuitive. Sparse residential might have some leaf litter or parts of your lawn where you don’t have it fully maintained. But as you get to higher density, you’re probably less likely to have a scrub tree or brushy area with litter, so that goes down. You probably have fewer trees. Condo buildings have essentially zero carbon, or at least we summarize it as zero. Vacant lots get a lot more because they grow back. There’s a gas station at Hamline and University where they tore down the building, and now the lot is growing back. The carbon storage is going up there, so it’s providing ecosystem service values.
This is the biophysical table. The essential model in InVEST, when we point to it, is a really basic one: for any grid cell with this categorization of 1, assign 15 tons per hectare to that pixel for the above-ground pool and 10 for the below-ground. This is called a lookup table.
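That lookup-table step can be sketched directly. The values for code 1 below come from the residential row we just looked at (15, 10, 60, 1 tons per hectare); the row for code 2 is invented for illustration, and this is a simplification of what InVEST actually does, ignoring pixel area and units:

```python
import numpy as np

# Slimmed-down biophysical table: LULC code -> carbon pools (tons/ha).
# Code 1 matches the residential example from class; code 2 is made up.
pools = {
    1: {"above": 15, "below": 10, "soil": 60, "dead": 1},
    2: {"above": 120, "below": 30, "soil": 80, "dead": 5},
}

# Toy LULC raster of integer category codes.
lulc = np.array([[1, 1],
                 [2, 1]])

# The lookup: for each pixel, sum the four pools for its land cover code.
total = np.vectorize(lambda code: sum(pools[code].values()))(lulc)
print(total)  # 86 tons/ha for code 1 pixels, 235 for the code 2 pixel
```

That is the whole trick: the "model" at this stage is just the table applied cell by cell across the map.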
Let’s redo that in our InVEST so we can actually generate results. We’re back at carbonpoolsWillamette.csv. This is where we got to last time, and I actually had you click run. So go ahead and hit run.
Basically what’s going on under the hood now is it’s taking that biophysical table and doing exactly what I said: it looks up the value and assigns that value to each pixel of that type.
You can see how long it took to compute. Mine was 3.55 seconds. Can I get a poll of the class—anybody less than 10 seconds? How long was yours?
Now let’s talk about the results in more detail than last time. The first one we’re going to look at is the key result: C_stock_BAS, where C is carbon, stock is storage, and BAS refers to the baseline. I’d like you to load that file into QGIS as well. Go ahead and drag it. I’m going to drag it on top of the land use land cover map so it displays properly.
What do we see here? This is the result of the model. Let’s give it a better color. Back to the slides if you’re looking for more detail or want to revisit this later. You can double-click on the carbon storage map to bring up symbology.
A common gotcha is if you’re in one of these other tabs, it won’t look right, so you have to hit Symbology. Here we have tons of different options. For our land use land cover map, that was categorized as paletted or unique colors—each different value has its own color because each value doesn’t mean anything on a scale; it’s just a category.
But when we’re talking about carbon storage, that’s a continuous measure. You can have 1, or 1.1, or 1.11—it’s a floating-point number. For those, it’s best to use a color band with continuously varying colors. Let’s do single-band pseudo-color.
In the color ramp, you can select all sorts of different ones. Being a good geographer means knowing how to select the right ones. Something like greens will be good because thematically it matches the idea that at low levels of carbon storage, it’s probably not very green, but it ramps up. So it’s continuous between all these values and gives us something that looks like this.
Now it’s easy to interpret. We see that this area has the most carbon storage, and this area has a lot less. If we wanted to know why, we might toggle off the carbon storage map and see what’s underneath. This is all the natural lands. Basically, this is the city and cropland over here, whereas here we get into higher elevation or rougher terrain that’s kept natural. Not too surprisingly, those are the areas with the most carbon storage.
That’s symbologizing things nicely. We can start to make basic conclusions about different conservation options here. You might ask: where would it be most damaging from the perspective of climate change to develop the land into agriculture? If you were to convert this land here, that would be the most damaging because you’d have the largest change in carbon storage—a negative sequestration.
And that’s the basics of analysis. Any questions?
Let’s get some practice on this. These questions will be added to the next homework assignment. You’ll do well by actually doing it now because you’ll then basically copy and paste it and have much of the homework already done. We’re going to break up—feel free to work in groups however you want and move around if needed.
You’re going to rerun the carbon model, but this time for a future map. Some of the file names will be a little different, so you’ll have to figure out whether it’s future or alternative. That’s actually what it means to be a data analyst: figuring out how files are organized.
Walk through these steps, answer the questions, and look at what the change in carbon storage values is. You’ll benefit from the report.html file that is generated in your workspace. Have at it. I’ll monitor everyone’s progress and call this back together when you’re getting close to done.
One minor thing I’d suggest noting: I had an error in the slides. Don’t put a discount rate of 0.03. They changed it in the most recent release to be listed as actual percents, so 3 is the discount rate you’d be better off using.
An easy way if you’re getting error messages about permission denied is that you probably got it loaded in QGIS and it’s trying to write over it. To fix that, take it out of your QGIS.
We have about 8 minutes left. Even though many of you are still working on it, I’d like to quickly talk through some of the answers.
Here are a few things we did differently with InVEST: we enabled sequestration. Just to remind you, carbon storage is the amount present on the landscape presently, whereas most people care about sequestration. When you’re talking about carbon storage changes from different land use maps, that means you need to calculate it twice: once for the baseline land use land cover and a second time for some alternate land use land cover.
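The "calculate it twice" logic is just a difference of the two storage totals. A sketch with illustrative round numbers (not the exact class results):

```python
# Total carbon storage from two separate model runs, in tons.
# These are illustrative round numbers, not real output values.
storage_baseline = 4.0e6   # run 1: current land use / land cover
storage_alternate = 3.7e6  # run 2: alternate land use / land cover

# Sequestration over the period is the change in stored carbon;
# a negative number means a net loss of storage.
sequestration = storage_alternate - storage_baseline
print(sequestration)  # -300000.0 tons: a loss of stored carbon
```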
You all figured out to run the valuation model, and it does a number of things for you automatically. It requires you to put in the years, and if you’re wondering why, you can click on the information there. It’s going to calculate the net present value, which is the annoying work we don’t have to do manually. The results are already in net present values.
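To see what the valuation step is saving you from doing by hand, here is a generic net-present-value sketch. This is not InVEST’s exact internal formula, just the standard discounting arithmetic it automates: a stream of annual values shrunk back to today’s dollars by the discount rate.

```python
def npv(annual_value, rate, years):
    """Net present value of a constant annual value over a horizon.

    rate is a decimal (0.03 for 3%); year 0 is undiscounted.
    """
    return sum(annual_value / (1 + rate) ** t for t in range(years))

# e.g. $1,000 per year for 10 years at a 3% discount rate comes to
# a bit under $8,800 in present-value terms, not $10,000.
print(round(npv(1000, 0.03, 10), 2))
```

This is why the years input matters: it sets the horizon over which those discounted values get summed.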
The reason I needed to know the years is because we’re basically saying this is the horizon over which we’ll consider this. We gave a price of carbon of $187. I also noted that the price of carbon dioxide is $51. The reason for this difference is that physical scientists talk about carbon the atom, while many economists talk about carbon dioxide as if it’s the same thing. But there’s a difference: carbon dioxide has two oxygen atoms. Those are heavy.
If you measure tonnage of CO2, the impact depends on how much actual carbon made it up into the atmosphere; it does not depend on those extra oxygen atoms. The conversion is a factor of about 3.67, the ratio of the molecular weight of CO2 to the atomic weight of carbon (44/12). That is why you had to use the higher number: these results are in tons of carbon the atom, not CO2 the molecule.
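The unit conversion is one line of arithmetic, and it reproduces the two prices used in class:

```python
# Molecular weight of CO2 (44 g/mol) over atomic weight of carbon (12 g/mol).
RATIO = 44.0 / 12.0  # ~3.67

# A price per ton of CO2 converts to a price per ton of carbon (the atom)
# by multiplying by that ratio: $51/tCO2 corresponds to $187/tC.
price_per_ton_co2 = 51.0
price_per_ton_c = price_per_ton_co2 * RATIO
print(round(price_per_ton_c))  # 187
```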
It uses the discount rate. I had you not use an annual price change, but some people do like to have that, which mimics what we saw from the DICE model: the price of carbon goes up over time. So you could automatically include that.
We can see that many of you looked not in the workspace, which has the raw results, but rather in the view results. This loads up the HTML page. You could also find it in the folder, but it’s also linked here directly.
The question asked you a number of things. First, what’s the change in actual carbon—the sequestration? That would be this value. It’s tempting to say this one (that’s the alternate), but what we see is the carbon storage falls from about 4 million to 3.7 million, so we have a loss of that amount of carbon storage.
It also tells you what file name you’d want to look at if you wanted specific geospatial results. Further, it provides you a mapping of the net present value. For any given pixel, you know how much carbon storage difference there was, so you know the dollar value contributed from that pixel given the land use change and the potential loss in carbon storage. That’s what the NPVAlt.tiff file reports.
These are nice summaries. To do a good report—and for your final project you will want to do this—you’re probably going to want to load them into GIS, not just rely on this report. Customization can make these much better, like zooming in on key areas and showing both the full map and then zooming on hotspots of change to illustrate what’s really going on.
Back to the question: the key question I had you answer is not just the change in carbon, but given that extra information, the fact that the net present value of timber is $50 million. I made it convenient for you—everything is expressed in the same units of net present dollars.
You can say whether this project should go forward. You can make more complex arguments, but if you’re going to stick with the value of ecosystem services to assess whether the scale tips in favor of development versus preservation, let’s see who got the answer right.
Which one was worth more? In the report, we see that the net present value lost of carbon storage is about $46 million. Critically, $46 million is less than the $50 million that the timber project would be worth. Here’s an anti-environment argument: even with ecosystem services, it actually makes more sense to log it, at least if we think we’ve taken into account all the proper ecosystem services.
One thing to note is that we’ve only included one ecosystem service. There are probably what are called co-benefits: biodiversity is probably better off, or maybe sediment retention is better. We’ll dive into those in the next classes. But for now, I wanted to make the point that, just as in the DICE model, this fundamentally depends on what your social cost of carbon is.
In the homework, I’ll have you go one step further than this. You can copy and paste most of what you’ve done, but it will ask you a few more questions about what the conclusion would be under different social costs of carbon.
Any questions? You were all very effective at this, so good job. I’m available on email—we’re getting more hands-on, so feel free to email me or come during office hours. Have a good Wednesday.
Transcript (Day 2)
Alright, well, let’s get started. Welcome to Day 2 of our hands-on with Ecosystem Services. Today, we’re going to start by reviewing very briefly what we did in our in-class exercise last class with the carbon storage model. Then we’ll introduce the next ecosystem service, which will be sediment retention. We’ll first talk about it conceptually and the science behind it, but then we’ll switch over to running it on our computers.
This might be a good time to launch Invest and QGIS so that they are all up and loaded by the time we need them.
So, this was the exercise that we had. First off, I’ll be posting the weekly assignment right after class, and it’s going to be basically these ones. I’ve added a bonus third question, which will be a very slight modification, but you’re almost done if you’ve already done this. It’ll just be writing it up.
But the thing I want to reflect on is: What did we learn? Why did we do this exercise? And this also gets to the question of why are ecosystem services a useful way of thinking about conservation?
To me, one of the most important lessons comes from the role of an economist. It’s simply the truth that policymakers tend to really listen to economists. I think actually too much so compared to other disciplines, but it’s just the fact of the matter. You’ll often see presidential candidates talking about what economists say. They don’t often talk about what anthropologists say.
But why is this? Well, there are a lot of reasons. One is that we talk about the economy, which is really important to voters. But another is that we are willing and able, with the tools that we have, to quantify trade-offs. Any good politician knows that lots of policies have trade-offs, meaning they’ll help one group and hurt another group, or help one objective and hurt another objective. And so the tools of ecosystem services are super useful here because they can be used to build support for policies that politicians might want to assess.
So what we essentially just did with our carbon storage example, especially when we were considering the total value of that carbon and how it compares to a hypothetical timber project, was essentially a very sophisticated trade-off analysis that could inform a cost-benefit analysis.
Here’s the critical thing: If we had not calculated the value of carbon storage, what cost-benefit analysis value do you think would have been put on the value of nature? What’s the default position that the developer takes?
Yeah, and so this is where ecosystem service valuation is important. Left to their own devices, certainly the developer, but even politicians, if we don’t give them a number about how much nature is worth, the default answer is zero. The decision will still progress. They will still make a decision on whether or not to cut that down, and they will be implicitly assigning a zero value simply because they don’t have a number. And that’s really the niche that ecosystem service analysis aims to fill—almost any estimate is better than a zero dollar value.
Okay, so transitioning to our next ecosystem service model, this is going to be the sediment delivery model.
Here, I’m not going to ask you to understand all the science, mostly because we don’t have time to master all of it, but I do want to make a broader point: underneath each of the ecosystem service models, there is a large literature of scientific work supporting all the calculations that go into it. The ecosystem services field and the Natural Capital Alliance have proceeded by taking consensus science and making it easier to use. The equation at the heart of the model we will be working with today is called the Universal Soil Loss Equation, or USLE. You won’t have to learn the details of how it was derived, but rest assured there are hundreds of PhDs that have been written on how to correctly calibrate this equation. We’re super fortunate that work exists, but we don’t have to do it ourselves. The tools, specifically Invest, implement it for us and make it so much easier to just be a user of good science.
If you’re curious, you can always go to the user’s guide, which is linked from within the app, and click on any one of these variables. What is erosivity? What is LS, the length-slope factor? This is just a snippet from the user’s guide, referencing the original seminal work by Desmet and Govers (1996), but it will also reference newer work since then. You can learn a lot there.
So instead of going into this science in depth, I will give a very stylized version of this graphically, but we’ll be talking about the exact same thing that you could read in depth in the user’s guide.
The background science concept is the Universal Soil Loss Equation. This is an equation that combines a bunch of geospatial variables: the slope (captured in a specific measurement called the LS, or length-slope, factor), the rainfall erosivity, the conservation factor, the cover factor, and the erodibility of the soil. Essentially, it combines these layers, calibrated with statistical analysis, to determine which combinations of soil types and slopes best predict the tonnage of erosion that will happen on that pixel.
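As a toy illustration of how those factors multiply together; the numeric values below are invented for illustration, not calibrated:

```python
# The Universal Soil Loss Equation, applied per pixel:
#   A = R * K * LS * C * P
# A: annual soil loss (tonnes/ha/yr), R: rainfall erosivity,
# K: soil erodibility, LS: length-slope factor, C: cover factor,
# P: conservation (support practice) factor. All numbers here are
# made up to show the mechanics, not calibrated values.
def usle(R, K, LS, C, P):
    return R * K * LS * C * P

# Same slope and soil, corn cover versus forest cover:
corn = usle(R=5000, K=0.3, LS=1.2, C=0.2, P=1.0)
forest = usle(R=5000, K=0.3, LS=1.2, C=0.003, P=1.0)
print(corn, forest)  # the cornfield loses far more soil per hectare
```

The only thing that changed between the two calls is the cover factor, and the predicted soil loss differs by nearly two orders of magnitude, which is the whole story of why land cover matters here.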
But let’s go even simpler. I want to compute this in our heads on the simplest possible landscape. We’ve been talking a ton about land use land cover maps. Here’s the simplest one I can come up with: four pixels instead of several million like we had on the ones we were working with in QGIS the other day. But it still has the basic principle of having different grid cells representing different ways that land area could be used.
I’ve actually added one more piece of information. In addition to these four grid cells, we’re also going to have a flow direction. This is basically a hill, and when you have water from rainfall at the top of the hill, it flows down the hill in the flow direction until eventually it makes it into a stream.
What we’re going to track is: on any given grid cell, like this corn grid cell here, we can track things like how much nutrient from fertilizer, or today we’ll focus on sediment, is going to be leaving that grid cell. That’s called the sediment load. We’ll just pretend that the thickness of this arrow represents the number. But behind the scenes in the model, this will be something like 11 tons of sediment is going to leave that grid cell.
Why is that? Well, it’s simply because when rain hits a cornfield, corn doesn’t have a very dense root structure. As a result, if the rain is strong enough, it will start to pull that dirt away. Any farmer knows a lot about this. Erosion is really the enemy of a lot of farmers, and they spend millions and millions of dollars trying to prevent it.
What we need to take into account, though, is not just how much leaves, but what happens to it. In this particular example, we know the flow direction, so we know that all that sediment leaving that corn grid cell will first come into this forest grid cell. Depending on the root structure of that forest, there’s going to be some value that nature provides—the ecosystem service of stopping that sediment from still moving.
I’m a mountain biker, and I always look at the trails. Whenever there is a tree that goes down, the trails get really eroded. Why is that? It’s because now the water starts flowing really fast. But if you have vegetation on the landscape, it slows it down. Good mountain bike trails are ones that have vegetation in key places. The same thing is true for farming—vegetation will retain that sediment.
But some of it still continues on, and we’ll then consider it iteratively into the next grid cell—how much does the wheat retain, but also how much does it emit back into the flow direction?
Finally, through one last forest grid cell, and then the little tiny arrow here represents how much actually makes it into the stream. In many ways, that’s the worst case because that’s what goes into the hydropower reservoir and lowers the amount of hydropower electricity we can create.
This arrow represents the amount of sediment that came from this corn grid cell and what happened to it. The retention that actually happened is the actual ecosystem service that’s going on—it keeps it there. But we need to consider this not just for the one corn grid cell, we need to consider it for all of the different grid cells on a very simple landscape.
How much does the forest contribute to the sediment load? It has a smaller arrow because it has a better root structure, but we still keep track of how much of that gets retained or eventually makes it to the stream. So now we’re seeing that there’s more sediment retention that happens downstream or downhill.
The wheat, like corn, is probably going to have a lot more sedimentation come off it, again because of the small root structure. But fortunately, there’s this last grid cell, the forest one, to retain a lot of that. So the amount that makes it into the stream will be somewhat mitigated.
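The walk-through above can be sketched as a few lines of code. Every load and retention fraction here is invented for illustration; the real model computes loads from the USLE and retention from cover and soil properties.

```python
# The four-pixel hill we just walked through: each cell adds its own
# sediment load and retains a fraction of whatever arrives from
# upslope; the remainder keeps moving toward the stream.
cells = [
    # (cover, tonnes of sediment generated on this cell,
    #  fraction of incoming sediment this cell retains)
    ("corn",   11.0, 0.05),
    ("forest",  1.0, 0.60),
    ("wheat",   8.0, 0.10),
    ("forest",  1.0, 0.60),
]

in_transit = 0.0  # sediment moving downslope toward the stream
for cover, load, retention in cells:
    retained = in_transit * retention
    in_transit = in_transit - retained + load
    print(f"{cover}: retained {retained:.2f} t, {in_transit:.2f} t moving on")

print(f"reaches the stream: {in_transit:.2f} t")
```

Notice that far less sediment reaches the stream than the cells generated in total; the difference is the retention, which is the ecosystem service itself.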
That’s the basic model. It’s super simple on a landscape like this where we just have one hill and one flow direction. But of course, you might be thinking, what does it look like on a more realistic landscape?
For that, I want to introduce a new type of data: the digital elevation model. That’s what DEM stands for. There are lots of ways to create models of elevation with high-resolution detail, but the coolest one, the one whose data was used for the longest time, was literally on the space shuttle, back when our society was so well organized that we had a space shuttle. There was a fun project, the Shuttle Radar Topography Mission, where a boom arm stuck out really far, about 200 feet from the space shuttle, with a radar antenna on the end of it.
That antenna, coupled with another antenna on the space shuttle itself, would bounce radar signals off the same spot on the Earth. Because the two points receiving those signals were far apart, they could measure the elevation. This is essentially using parallax. Incidentally, it’s the same way our eyes get depth perception. Have you ever tried to catch a ball with one eye closed? It’s really hard.
But anyway, the space shuttle did the same thing: it looked from two different angles and inferred the elevation. That’s one example of how we can use satellites, or in this case, space-borne instruments, to get information about what the Earth is like.
What do they look like? We’ll dive into one of these in a moment, and in fact, this is the actual map we will be using momentarily. I think they’re kind of pretty. You can look at them and see a little bit of the structure of the landscape.
In this case, blue is higher elevation and red is lower elevation. If you look at it, you can see a little bit of a stream network—a line here that goes down, and another one here that goes down and connects. This is what a stream network looks like. We have the Mississippi, and it’s not just one river. There are all sorts of other rivers that lead into it. Even just looking at the digital elevation map, you can kind of see that same basic idea—the tributary structure.
By the way, I brought my own markers. That’s what you can kind of directly see from the digital elevation maps.
This one in particular is in the Gura district of Kenya, which is one of the areas where the Natural Capital Alliance did a lot of really interesting work. In that area, there’s the Gura River, which looks like this. It’s critically important in terms of the number of subsistence farmers relying on it for their well-being. This might be one of the more important rivers out there. If you take away a river from people who can go to a grocery store and buy food, it’s not ideal, but it’s not the end of the world. But what if you’re doing subsistence farming and the river stops flowing? This is a real problem. You can’t just buy your way out of that situation.
Side note: I actually visited this river myself. I was looking at a World Bank-funded dam project. The dam was intended to create a big reservoir, and they spent many millions of dollars on it. The thinking was that it would improve agriculture in the area because it traps the water so you can have water distributed throughout the fields. It was supposed to be a 30-foot deep reservoir, but when we were out there, we saw a woman walking across it. It was about that deep. This was because they had planned poorly and it filled up with sediment. It was still functioning as a dam, but it was getting very close to no longer being able to provide that agricultural service to the area. Kind of depressing to see somebody walk across an area that was supposed to be quite deep.
So that’s the digital elevation map.
Let’s add a little bit more realism and add one more science element. In addition to the Universal Soil Loss Equation, which is what we computed on each of those four grid cells, we actually have a much bigger challenge. For each pixel, we need to consider a more complex set of upslope transport pixels. In the simple linear example with four grid cells, you could just sort of see which one flowed into which one. But in reality, when you have a stream network or a digital elevation model, for any given point you can actually compute the area that feeds into it. Maybe this is like the ridge line of a couple of hills, so everything in this circle flows into that point.
We’re going to be calculating how much sediment is coming out of all of those grid cells just so we can know what happens in this particular pixel of interest. Once it leaves, we still have the complex challenge of knowing which grid cells it flows to, because that’s where the retention from vegetation might still happen before it eventually makes it to the stream.
This adds a little bit of detail, but really, that’s still quite simplified because this is what it really looks like. Now we’re actually looking at the DEM of Gura. Instead of just the one pixel of interest we showed before, we have a pretty big area flowing into it and a pretty long area that it flows out of. We’re going to make this calculation of how much is retained here and everywhere along that path. But we have to do it for all of the pixels. This illustrates the fact that this requires really powerful computers.
For the longest time, we could only calculate this for relatively small watersheds, but we’ve gotten better and better at big computers and can actually calculate this globally now, which is pretty cool.
But yeah, we’re going to repeat that for all the different areas.
One last data concept to define is flow accumulation. Flow accumulation keeps track of how much water is flowing through each pixel. You can think about it this way: at the very top of a river, like Lake Itasca where the Mississippi starts, there’s a very small, tiny stream there. It gets bigger and bigger, especially as more tributaries come into it. So the flow accumulation when you’re in the river is essentially how much water per minute is flowing through the river. It’s going to be much more down here than it is up there.
But the concept of flow accumulation is actually broader than just the river. Even places that aren’t the river, but nearby, you can also calculate the flow accumulation. That’s going to be a much lower number.
This number is what we actually use to define what a stream is. You probably haven’t thought about the philosophical question of what a stream is, right? Well, a stream is wherever the flow accumulation crosses a threshold we choose. It is somewhat subjective where you put that threshold, because lots of streams are temporary. Is it a stream if it’s only there for half the year? Generally speaking, we say it’s a stream if it has a flow accumulation value above a threshold sufficient to keep it flowing year-round. Regardless, we can use this map, which reports how much water drains through each pixel, to define the stream.
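Mechanically, the stream definition is just a threshold test on the flow accumulation grid. Here is a toy example; the 4x4 grid values are invented, and the run later in class uses a threshold of 1,000.

```python
# Defining "a stream": a pixel belongs to the stream network when its
# flow accumulation crosses a chosen threshold. Tiny invented grid;
# real grids have millions of pixels.
accumulation = [
    [1,    2,    5,    3],
    [2,   40,  900,    4],
    [3,  300, 1500,    6],
    [1,    8, 4200, 1100],
]
THRESHOLD = 1000

stream = [[1 if cell >= THRESHOLD else 0 for cell in row]
          for row in accumulation]
for row in stream:
    print(row)
```

Only the three high-accumulation pixels in the lower right come out as stream; the 900 just misses, which is exactly the kind of judgment call the threshold encodes.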
Here, I’ve colored it in blue. This is now actually the stream, and this is the dam. The dam I was referring to, the one that made the reservoir the woman walked across, is this one. It sits where the flow accumulation is high enough that we say there is standing water with enough flowing through it.
So those are the key concepts one needs to understand to get the basic intuition from running the sediment delivery ratio model, which is what we’re going to turn to now.
Any questions on the science of it? Pretty straightforward?
Student: I’m doing a research project right now, and I find it makes sense looking at it, but it’s just so complex. Where is the data coming from? Are there environmental teams that are going out and collecting things and piecing it all together?
That’s a great question, and that’s where most of the hard work has been spent in this domain. Each of the different data sets that we use all have a different source. A lot of them are from satellites. The space shuttle is the source for the digital elevation map. NASA puts it online, and one of the big skills of becoming good at this is knowing what all the databases are and how to get them.
A more advanced GIS class would spend a lot more time thinking about where you get the data. But to summarize it for you: in addition to satellites, there are also environmental teams that go out. The hardest data to get that we really need is site-specific data. To know that the model works, you also need somebody to go out there and measure the amount of sediment. We can infer it from space, but only because we’ve had millions of human hours going out and literally measuring the water—the water quality, how much sediment is presently in the water. So it’s combining global satellite data with hard-fought data collection at the local scale. There’s not a single answer I can give you, except I will show you another answer, which is a great segue. If you go ahead and open up Invest.
First, let’s get to the model, and then I’ll show you where it links to the data. For today, we’re going to use the sediment delivery ratio model.
One of the things I love about this science is that open science is heavily focused on documenting good answers to questions like that—where did we get the data? Well, here’s where we start looking: in the user’s guide. This is the section of the user’s guide specific to the model we’re in, sediment delivery ratio. Throughout the text, there will be references, but then also at the end, there is specific information about where each of these came from.
Here’s a section on the digital elevation map listing some free global maps: the World Wildlife Fund, NASA—that’s the one I was referring to—but there are others out there. It describes where to get them for all the different data types. Here’s an example for watersheds and sewersheds. We have the U.S. National Inventory of Dams, which turns out is from the military—it’s militarily important, right? But also the Global Reservoirs and Dams database. Hopefully this accelerates you into the process of learning where all the data is.
So we’re going to do something similar to what we did with the carbon model, but it’ll be a little bit more detailed and complex this time.
Just a few notes: I have screenshots of what I’m going to do, but I’m going to do it live in front of you. The reason I do it this way is so you can refer back to these if you need to see them. Also, most of the screenshots are for a Windows computer, and I’ll do it here live with a Mac so we can see both types. How many Macs do we have? We have at least two.
Okay, so go ahead and hit that sediment delivery model. The first thing we’re going to do is set the workspace. This is very similar to what we did before, but I want to add one extra step: creating a new folder for the outputs.
I want you to navigate using Finder on Mac or File Explorer on Windows to wherever you saved the Invest data. For me, that was under users, files, Teaching, APEC 3611, and then in there I had that folder called Invest. Navigate to the SDR folder. We’re going to use this a number of times, but before we actually start putting the data into Invest, let’s create a new folder called results.
You don’t have to do this, but there are so many data files here that it quickly gets annoying to separate the inputs from the outputs. Create a new folder called results. You can do that however you do it—right-click, add new folder. Then, if you click back into Invest, navigate to it and make sure you select the results folder as your workspace.
Just pausing a second: everybody able to create the folder and everything?
You know, I’ve taught a lot of computer classes over the years, and this is one thing that is starting to change. People don’t quite know what files are like they used to. The internet and Google Docs have all gotten so intuitive that we don’t realize what a file is—it’s a chunk of hard drive somewhere storing data. But yeah, so we’ve just pointed to that directory.
Now I’m actually going to unleash you. For the first five different layers, I’d like you to see if you can navigate through the folder. I’ll start to show the first one, and see if you can figure out which of the different inputs in this folder go into which of the different ones here. Digital elevation model, I’ve sort of been indicating what it is all along, is probably a DEM. Please go through the first five different ones and get those pointing the right way.
And again, that gotcha: make sure you get the TIFF, not one of those extra files. So it’s got .tiff, not .xml or something like that.
One other thing to note: if you’re curious what each of these variables are, you can also click the definition here. This will link to the user’s guide, so that comes back to the question of data—where do you get it? Where do you learn about it? But I think for this particular example, the things are pretty well labeled, so you can just infer from the file name.
I’m going to circulate around and check everybody’s screen to make sure we’re all getting their input correct.
Alright, good. Okay, so that’s the first set, and let’s actually load them up into QGIS just so we know what we’re doing. For that, my process is to click on QGIS and then click back on your Finder window or Windows Explorer. Let’s explore a few of these. First, let’s look at that DEM.
You know, it defaults to different things depending on how you have it set up. This one is not very useful. It’s identifying a unique color for each value. That makes sense for a land use land cover map where there are maybe 16 different categories. It doesn’t work so well for something that is a continuous measure like elevation.
So paletted makes no sense. We’ll do the same thing we’ve done before: go to singleband pseudocolor.
Let’s explore some of the other ones. I’m going to go with yellow, orange, brown.
When you click that, now it’s showing you the correspondence in meters above sea level to which color it is. We get a slightly prettier map like this. Yeah, you can see the different areas where there are tributaries leading into the stream network.
One cool thing is if you zoom in on an area, all the values start to get so close to each other that you can’t actually see much difference. One thing you can do, at least after you’ve set it to pseudo-color, is you can right-click on the layer and go to styles. If you click this one, all it’s going to do is reset the minimum and maximum of the color bar to the current extent.
But to do it manually, what you could do is just say, all those values are really close together. Let’s actually just manually change it. Let’s look at only the values between 2000 and 2500. This is going to restretch it and you can see it’ll highlight the differences a little bit nicer.
Okay, that’s the DEM. You could do the same with the others, but for the sake of time, we’re going to move on. When you actually write up your country reports, it might be useful to go in and show some of these.
But let’s talk about the next file type: a vector. Ideally this would be a GeoPackage, but the data here is still in the old shapefile format.
We actually have a number of different possibilities we could choose. I’m going to recommend that we go with this one called Sub-Watersheds. If you just wanted a single polygon of the whole area, that would have been watersheds.shp, but I’d like you to choose subwatersheds.shp. And again, make sure it’s the actual shapefile, not the similarly named ones.
We’re going to leave out drainages, though that input can be important: if you have tile drainage (farmers will put in pipes under their fields), that obviously has a huge impact on where the water goes.
Yeah, so that’s all of the different files we need to point to. But there’s one last thing. Let me jump back to PowerPoint to show you what to enter next.
Okay, so that’s just what we’ve done. The next thing we’re going to put in is the model parameters. In addition to datasets, we also have to say things like, once and for all, answer that philosophical question of what is a stream. We’re going to say a stream is defined as having a flow accumulation of 1,000.
Very concise, right? But go ahead and enter in these values:
Flow accumulation threshold: 1,000
Borselli k parameter: 2
Maximum SDR value: 0.8
Borselli IC0 parameter: 0.5
Maximum L value: 122
These are the defaults that are in the user’s guide. Once you’ve got all those entered in and assuming you’ve got all green checks, you can go ahead and hit run.
This one might take a little bit longer because it’s doing a lot more math. It’s actually calculating the flow route. That’s the most computationally expensive part of this whole thing—for any given pixel, you need to compute which pixel it will flow to, and then also consider which pixel that grid cell will flow to, and so forth.
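The flow-routing idea can be sketched with the classic D8 rule: each pixel drains to whichever of its eight neighbors gives the steepest downhill drop. This is a simplification (Invest’s actual router is more sophisticated and can split flow among neighbors), and the tiny DEM below is invented.

```python
# D8 flow direction on a toy DEM: send each pixel's water to the
# neighbor with the steepest descent. Repeating this pixel-to-pixel
# lookup over millions of cells is the expensive part of the model.
def d8_downstream(dem, row, col):
    """Return the (row, col) of the steepest-descent neighbor, or
    None if the pixel has no lower neighbor (a pit)."""
    best, best_drop = None, 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
                dist = (dr * dr + dc * dc) ** 0.5  # diagonals are farther
                drop = (dem[row][col] - dem[r][c]) / dist
                if drop > best_drop:
                    best, best_drop = (r, c), drop
    return best

dem = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 3],
]
print(d8_downstream(dem, 0, 0))  # (1, 1): the steepest downhill step
```

Chaining these lookups, each pixel to its downstream neighbor and so on, traces the full flow route, which is exactly the computation that makes this run take longer than the carbon model.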
Raise your hand if any problems are coming up. You might just sit here for a second.
Good, good. Checking runtimes. The fastest one was over here at 8 seconds. Anybody faster than 8?
I was a little faster at 3.05 seconds. I shouldn’t brag about that—I just really love computers. But I also shouldn’t brag because I have a lot of budget for computers. This thing was $6,700, but it was on a grant I had to spend on technology. If I didn’t spend it, I lost the money. So I might as well have the literally best computer I could buy at the time. This is not the most expensive computer I’ve bought.
Okay, anyway, once it’s done, indeed, you can hit View Results. This gets you a bunch of information. Side note: this is a new feature of Invest that they added last year. Previously when I would teach this course, there was not a handy report that people could click on, and that actually made it easier to teach because then you could spend a lot of time talking through each of these. This is so nicely formatted that it sort of does the homework assignment for you. Whatever, you can use it.
There will be some parts where you still have to load it into QGIS because you might want to zoom in on particular areas. But this frankly gets a fair amount of it.
So, like, what do we see? Here’s that DEM and here is our definition of the stream. In this case, you’d have to go into QGIS. It’s missing a lot of pixels just because it’s zoomed in too far. But you can sort of see the basic outlines. It defines the stream network.
This also gives you information about where all those different files are. Stream.tiff is this one up here. There’s a ton more outputs, all of them might be interesting to you.
But some of the key ones would be sediment export, for instance. That’s the total amount, the tonnage, in this case of sediment and soil that makes it into the reservoir—basically makes it into that stream network rather than getting trapped somewhere.
Other ones that you might be interested in are avoided erosion. Instead of caring about the amount that makes it into the stream, you might also care about how much nature provided value by preventing it from happening in the first place.
Okay, so you can spend a lot of time with that.
Let’s talk about policy. Policy decisions would need this data too. How would we talk about that? I’m going to skip a few slides talking about graphing, which might help you if you’re curious, but what I want to talk about is adding the sediment retention layer to QGIS and talking about the policy of that.
Navigate to your results folder. This is why I made the results folder—just so there are fewer files here. This one is going to be Avoided Erosion, which is the file name.
Give it some prettier colors.
Okay, this one is a map of where nature was providing a service—in this case, where did nature prevent erosion from happening in the first place? This is the one that’s the most policy-relevant. The key policy is: if you wanted to not have erosion happen and you wanted to keep that dam from getting made totally useless, you would want to make sure you do not cut down the vegetation on these pixels that have a high value.
Here, like this area, the policy advice would be pretty straightforward: do not degrade this area that has high sediment retention. The reason is because it is holding a lot of that retention away from making it into the stream.
It matters whether you care about erosion on the farm versus sediment making it into the stream. This map we showed is avoided erosion: it isn’t reporting how much makes it into the stream; it’s reporting how much erosion was prevented on that exact pixel. So the policy advice is a little bit different here. This map says: if you’re a farmer, these are the locations you should be most nervous about, and where you should probably invest in erosion prevention strategies like tiling or better-designed drainage systems.
So yeah, either one of these maps can point you to policy-relevant advice.
Student: So we’re running these kinds of tests on our countries of choice, and is all that data like in the folders already?
Yes. Actually, let me end on that point. I haven’t released the full assignment yet, which will explain this in more detail, but when you downloaded those zip files, they were organized by model, and all of these layers are in there. They’ll be labeled slightly differently, I think, but you should be able to find them pretty easily.
As a discussion point: what do you see from this? What are the areas that are most valuable?
Student: This area, right?
What do we know about this area? If we go back into QGIS, it’s the higher-elevation areas. So instead of giving really detailed policy advice about a specific set of grid cells worth protecting, you can also speak in generalities: the higher-elevation areas, given the soil conditions and the roughness and steepness of the topography there, are where erosion is particularly bad.
To give an example: in China, one of the largest ecosystem service policies in the world is the Sloping Land Conversion Program. Basically, it pays farmers who have land above a certain degree of slope not to convert it to agriculture. If they followed their private incentives, they might convert it, because they could grow things there. But doing so would have a huge externality. So it’s actually worth it for the government to design a policy that says: if you happen to have land with greater than 5 degrees of slope, we’re willing to pay you some hundreds of yuan not to convert it to agriculture.
That’s a pretty smart program, because the farmers like it, right? They get money. An equivalent policy would be to simply say you can’t do any agriculture where there is greater than 5 degrees of slope. What would be the downside of that?
Student: Equipment.
Exactly. Or angry farmers, maybe, is another way of putting it. Or maybe they farm it anyway, and now you have a big enforcement problem. Either way, some version of it will hurt the farmers and cause all sorts of unintended consequences: people will try to disguise the land, making it look natural while still cultivating it.
Student: I have a question about the avoided erosion. Does a higher number mean more erosion avoided, or just more erosion?
This one here is more erosion avoided. If you’re curious, the user’s guide gives the scientific definition: it’s the difference, in tons, between the current state and some alternative. Here’s a slide on that.
Whenever you’re talking about avoided erosion, things get a lot harder, because you need something to compare against. If something didn’t happen, how do you put a dollar value on it? The way the model works is that it runs twice: once with the current land use, looking at the sediment export, and once with a worst-case scenario. For sediment retention, the worst case is bare soil. The difference between the export that would have happened with bare soil and the export under the current land use is the total amount of avoided erosion.
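That differencing logic can be sketched directly. The per-pixel export numbers below are made up for illustration; the model computes them internally from the land use map:

```python
import numpy as np

# Hypothetical per-pixel sediment export (tons) under two model runs.
export_bare = np.array([12.0, 8.0, 20.0, 5.0])   # worst case: bare soil
export_current = np.array([2.0, 3.0, 4.0, 5.0])  # current land use

# Avoided erosion: export that would have occurred minus export that does.
avoided = export_bare - export_current

print(avoided)        # per-pixel tons of erosion avoided
print(avoided.sum())  # total tons avoided across the landscape
```

The same subtraction works for the scenario comparison described next; you just swap the bare-soil run for a run on an alternative land use map.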
Sometimes, though, that’s not really the comparison you want, and it makes more sense to look at things on a scenario basis. Instead of the extreme case of bare soil, you run the model with the current land use/land cover map and with some alternative land use/land cover map. Then you can redefine avoided erosion as the erosion prevented by following whatever policy led to that other scenario. We’ll return to this. It’s why I’ve been emphasizing scenarios throughout: often the most useful way to look at the results is to find where there is a big difference between the erosion under one land use map and the erosion under another.
Great question. We have one minute left, but any other last questions?
Cool. Everybody here is pretty computer-competent, which is always fun: we spend less time on the annoying stuff. You’ll see the weekly questions go live shortly after class, when I push them to GitHub. Have a great weekend.