Lecture 11 - Land-use Change Prediction

PRE-CLASS SOFTWARE SETUP:

Install SEALS into your devstack. You have already (hopefully) installed Python and Hazelbean. If you need a refresher, these steps are documented in the Basic Installation step of the devstack installation instructions: https://justinandrewjohnson.com/earth_economy_devstack/installation.html. Now, I want you to continue on and follow the Repository Installation instructions to install the seals repository from the latest GitHub version. These steps will install a C compiler and then use pip to install the SEALS GitHub repository directly. The method documented here is a very common way for, say, a software development company to get a new developer up to speed with its codebase. You only need to install the repo at https://github.com/jandrewjohnson/seals_dev, not all the other options listed.

Reading: Verburg and Overmars 2009

Slides as Powerpoint: Download here

Video link: On Youtube

Content

Introduction to Land Use Change Modeling

This lecture focuses on land use change prediction, a topic of central importance to applied earth economy modeling. Land use change modeling represents a critical component in understanding how human activities and economic decisions translate into spatial patterns of land cover that ultimately affect ecosystem services, biodiversity, and environmental outcomes. The session begins with an introduction to land use change modeling and its role within the broader framework of applied earth economy modeling, followed by a detailed examination of seminal methodological contributions and hands-on implementation of advanced modeling tools.

The agenda covers several key areas. First, we examine the conceptual foundations of land use change modeling and why it matters for integrated environmental-economic analysis. Second, we review the seminal work by Verburg and Overmars from 2009, which introduced the Dynaclue model, a foundational approach to spatially explicit land use allocation. Third, we explore the related work by Wolf and colleagues from 2017 on Clue Mondo, which extends these concepts to global applications. Finally, we transition to practical implementation using the SEALS model, which requires installation of a C compiler to achieve the computational efficiency necessary for high-resolution global analysis. Future versions may eliminate this installation requirement, but current implementations require unique compilation on each machine to run efficiently.

Course Progress and Scheduling

Looking at the course schedule, we are making substantial progress through the semester. The current session on land use change modeling is followed by a Thursday hands-on session that will continue working with land use change modeling concepts and tools. This Thursday session is a somewhat flexible component of the course design. In previous years, this time has been used to cover additional topics within computable general equilibrium models, such as gridded economic models. However, consideration is being given to spending more time returning to earth economy modeling concepts with greater specificity, potentially including more detailed coverage of GTAP Invest and related integration approaches.

Assignment Structure and Deadlines

Regarding course assignments, students should remember their weekly reaction papers. For those who have fallen behind on these assignments, there is flexibility to submit them before the end of the semester. While stricter adherence to deadlines would be ideal, the graduate student context allows for some accommodation. However, it is important not to forget these assignments entirely, as they will affect final grades. Some extensions have already been granted, but students should plan to catch up on any missing submissions.

The first part of the final project is due on Thursday, consisting of a one to two sentence description of the proposed research project. Many students have already submitted these initial descriptions, and detailed feedback will be sent via email to each student. The next milestone comes on Tuesday of the following week, when the project outline is due. An update with more detailed instructions will be sent after class, but the essential requirement is a document of less than one page that outlines the structure and argument of the final project. Students should take their one-sentence description, incorporate any feedback received, and expand it into a clear paper outline with logical structure.

The deadline structure maintains some flexibility, but students should not take undue advantage of this accommodation. If work has not been turned in by the stated deadline, students have until the next class session to submit. The outline due on the eighth is particularly important because these outlines will be sent out for peer review immediately after class on Thursday. Students will be matched up in pairs for peer review, which is a key component of the learning process. The outline should be less than a page, perhaps eight to ten lines of text. Results are not required at the outline stage, but students should include bullet points describing expected results, such as a map showing ecosystem service changes, identification of hotspots of loss, regression tables if applicable, or graphs showing effect sizes over time. More detailed instructions will be sent after class. The peer review component is essential, so it is important that all students submit both the initial sentence description and the outline, as these represent intermediate steps toward the final project.

For projects involving regression analysis, the final submission should include standard statistical outputs and potentially graphs or visualizations of results. Figures are not required for the outline submission, but they will be expected for the final project. The peer review process will be similar to reviewing a journal article, where students identify both strengths in the work and areas for improvement. More detailed guidance on conducting effective peer review will be provided.

Technical Requirements and Software Setup

The SEALS model that we will be using only runs on Windows operating systems. Students who do not have access to a Windows machine will need to pair up with someone who does for the hands-on work. Students with Mac computers who wish to work independently have the option of installing a virtual machine running Windows. It is important that everyone has access to software that can run GTAP and related models. Students who need help coordinating access to appropriate hardware should reach out via email. While this platform limitation is unfortunate, the reality is that the best available model for this application is Windows-only. Looking ahead to next year, there may be a requirement for Mac users to install virtual machines so that all students are working in the same computing environment.

The mathematical content of the models involves basic algebra, but it requires correct conceptual understanding of the underlying processes. The actual mathematical solutions are relatively straightforward once the concepts are properly understood. Regarding upcoming presentations, with eight students presenting, each will have approximately five minutes, making these short, focused presentations.

The Environmental Economics Context

Returning to fundamental questions posed at the beginning of the semester about what remains to be solved in environmental economics, we are witnessing significant progress in certain areas. The transition toward electric vehicles and renewable energy is well underway. Recent developments show rapid solar energy implementation in China, and while renewable energy rollout in the United States may not be optimal, globally we are seeing substantial progress. When renewable energy becomes cheaper than fossil fuel alternatives, it becomes increasingly difficult to justify burning coal. The historical use of coal was driven by economics rather than preference. To provide some context on the scale of transitions, more people are currently employed at Wendy’s restaurants than work in coal mines, so while managing transitions is important, the actual scale is different from what is sometimes portrayed in public discourse.

Despite progress in energy transitions, there remain important environmental issues that require attention. Land is a fundamentally limited resource, and this limitation takes on special significance due to land’s connections to biodiversity and ecosystem services. We need to improve both our models and our understanding of land use dynamics. Land use change modeling represents the missing link in our integrated modeling framework. We have computable general equilibrium models on one side and ecosystem services or Earth Systems models on the other. There are additional models in this space, including climate change models. Even in the Banerjee article that was reviewed earlier in the course, land use change was discussed as a critical component.

Integrated Modeling Framework

We have now successfully connected computable general equilibrium models to land use change models, which in turn produce inputs for ecosystem services models. The Banerjee paper from 2022 and other recent papers describe this integration. Increasingly, there is recognition that we need to be explicit about having four distinct components in our integrated framework. Rather than going directly from ecosystem services models to computable general equilibrium models, there is a need for dependency models that explicitly model how biophysical outputs from ecosystem service or Earth systems models become inputs for computable general equilibrium models. This topic will be discussed in greater detail during the final days of the course. This dependency modeling is conceptually similar to generating the shock file that we worked with in RunGTAP, where we expressed changes in ecosystem services as economic shocks that could be analyzed within the computable general equilibrium framework.

The Motivation for Land Use Change Modeling

Focusing specifically on land use change, the motivation is clear because land use change outputs serve as the primary input into INVEST and other ecosystem service models. In earlier lectures, we discussed scenarios extensively, with land use change as a key component of scenario analysis. We have examined one important representation of land use futures in the form of the land use harmonization dataset produced by Popp and colleagues, Riahi and colleagues, and others. This dataset is useful because it provides both a long time series and high specificity, covering both historical periods and future projections across different combinations of Shared Socioeconomic Pathways and Representative Concentration Pathways. While the graphs produced from this data are visually striking, they can be hard to interpret, and often a matrix representation provides clearer insights.

Popp and colleagues noted in their 2017 paper that land use change data needs to meet several key requirements. It must be spatial, meaning it captures geographic patterns and locations. It must be temporal, covering changes over time. It must be conceptually consistent from past observations through future projections. Finally, it must be in a format that is usable by Earth systems models, which means it must have sufficiently high resolution for meaningful analysis. There are not many processes that can be meaningfully modeled at the thirty kilometer grid cell level that is common in some global datasets. While climate dynamics can be modeled at thirty kilometer resolution, this resolution is far too coarse for accurate modeling of processes like carbon storage. For ecosystem services, particularly processes like hydrological routing, you cannot use a thirty kilometer average elevation value and expect meaningful results. Higher resolution is essential for these applications.

Introduction to Dynaclue

The seminal article introducing Dynaclue fills an important gap in land use change modeling by modeling land system changes at coarse scales and then allocating those changes to fine resolution spatial grids. Before diving into the specifics of Dynaclue, it is important to discuss the distinction between top-down and bottom-up modeling approaches. The 2009 paper argues that certain dynamics are determined at broad scales, such as aggregate demand for agricultural land, urban expansion, and managed forestry land. These large-scale dynamics depend on macroeconomic models rather than just high-resolution local data. Dynaclue incorporates these top-down factors and couples them with bottom-up variables, such as vegetation changes driven by growth and seed dispersal processes, which operate at finer spatial resolution.

Dynaclue combines top-down and bottom-up approaches specifically for land use change prediction, with particular attention to processes like agricultural abandonment and forest regeneration. The model operates at two distinct levels, incorporating both top-down and bottom-up processes. A paper on the mesoscale as a frontier in sustainability modeling argued that models operating at intermediate scales are particularly important and often underrepresented in the literature. The mesoscale bridges global and local models, capturing phenomena like trade flows and regional markets that operate at scales between the purely local and the purely global. This concept was mapped onto a figure in a 2023 paper showing how different models vary in their levels of economic and spatial detail. This framework was later formalized in a 2025 paper that identified and categorized relevant models operating in this multidimensional space.

The key insight from this work is that different phenomena operate at different spatial scales, and attempting to analyze changes at the wrong scale presents significant challenges. Systematically linking models that operate at multiple scales represents an important research frontier in sustainability science and integrated assessment modeling.

Dynaclue Methodology

Returning to the specifics of Dynaclue, the model focuses on top-down allocation of land use change, determining which specific grid cells undergo conversion using bottom-up data on local conditions and suitability. The conceptual model diagram shows two distinct levels of operation. At the regional level, we have multi-sectoral land demand and various regional processes. The allocation step, which represents the core innovation of Dynaclue, takes this regional demand and distributes it spatially. Given a specified demand in the form of net hectarage change for each land use class, the model allocates these changes to specific locations using a sophisticated approach that considers multiple factors.

The paper provides detailed information on land use classes and their representation in the model. Built-up area and agricultural land are determined by regional scale modeling that feeds into the allocation process. The heart of the Dynaclue procedure is the allocation step itself. Walking through the logic, for each time step t, location i, and land use type l, we predict the total probability that the given grid cell will be occupied by that land use type. This probability is calculated as a function of several key factors.

The first factor is location suitability. For crops, you can run a crop model to estimate expected yield, which correlates strongly with the likelihood of cropland expansion into that area. Geospatial raster datasets indicate suitability for each land use type at each location. For urban expansion, factors like adjacency to other urban cells are particularly important determinants of suitability.

The second factor is neighborhood suitability. Urban areas tend to expand in locations where neighboring cells are already urban, creating spatial clustering patterns. This represents another important predictor that is parameterized through regression analysis of historical patterns.

The third factor is elasticity, which measures the ease or difficulty of changing from one land use type to another. Some transitions are much harder than others. For example, converting urban land back to forest is extremely difficult and rare. This resistance to change is captured through elasticity parameters or transition costs that vary by land use type and transition pathway.

The fourth factor is competitive advantage, which is determined iteratively during the allocation process and affects the probability that a particular land use type will claim a given grid cell when multiple types are competing for expansion.
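These four factors combine additively into the total probability for each grid cell and land use type. A simplified sketch of the specification (the symbol names here are ours, not the paper's exact notation):

```
P_total(t, i, l) = P_location(t, i, l) + P_neighborhood(t, i, l)
                 + Elasticity(l) + Competition(t, l)
```

The first two terms come from the suitability layers, elasticity is a per-class resistance-to-change parameter, and the competition term is the value the allocation algorithm adjusts iteratively until regional demand is met.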

Regression analysis is used to estimate these probability relationships. To predict land use change probability, a binomial logit model represents the classic approach. We are predicting the probability that a location will convert to land use type K at time t. The logit specification models the log of the probability divided by one minus the probability as a function of input layers representing various X variables, such as crop suitability, slope, soil type, and other relevant factors. Data comes from time series of observed land use and land cover along with associated covariates. The model trains on observed historical changes to estimate coefficients, which are then used to project patterns into the future.
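The logit specification can be illustrated in a few lines of NumPy. Here the coefficients are assumed to already have been estimated from historical change data; the covariate layers and coefficient values below are invented purely for illustration:

```python
import numpy as np

def conversion_probability(covariates, coefficients, intercept):
    """Binomial logit: P(convert) = 1 / (1 + exp(-(b0 + b . x))) per grid cell.

    covariates:   array of shape (n_layers, rows, cols), e.g. suitability, slope
    coefficients: array of shape (n_layers,), estimated from historical changes
    """
    # Linear predictor per cell: intercept plus weighted sum of covariate layers.
    linear = intercept + np.tensordot(coefficients, covariates, axes=1)
    return 1.0 / (1.0 + np.exp(-linear))

# Toy 2x2 landscape with two covariate layers (crop suitability, slope).
covs = np.array([
    [[0.9, 0.2], [0.8, 0.1]],   # crop suitability
    [[0.1, 0.7], [0.2, 0.9]],   # slope (steeper = less likely to convert)
])
probs = conversion_probability(covs, coefficients=np.array([3.0, -2.0]),
                               intercept=-1.0)
# The flat, highly suitable cell gets a higher conversion probability
# than the steep, unsuitable one.
```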

After calculating probability maps for each land use type, the actual allocation of changes proceeds through an iterative algorithm. The probability map indicates the relative likelihood of expansion for each land use type at each location, but actual changes must also satisfy regional demand constraints. The allocation algorithm takes the regional net change in hectarage for each land use class and uses the probability map to select which specific grid cells undergo conversion. It is important to understand the distinction between the probability map, which shows relative likelihoods, and the change map, which shows actual allocated conversions.

The Allocation Algorithm

Describing the algorithm in words, the process begins by calculating the probability map for each land use class at the initial time step. The algorithm then assigns the land use class with the highest probability in each grid cell to create a new land use and land cover map. All changes across the landscape are summed and compared to the regional demand. If a particular class has been over-allocated relative to its demand, the algorithm reduces its competitive advantage, which lowers its probability of claiming additional cells, and then reallocates. This process repeats iteratively until the total amount allocated for each land use class matches the specified demand.
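The iterative loop just described can be sketched in a few lines of Python. This is a toy illustration of the idea with made-up numbers, not the actual Dynaclue implementation:

```python
import numpy as np

def allocate(prob, demand, step=0.05, max_iter=1000):
    """Dynaclue-style allocation sketch: adjust per-class competitiveness
    until the number of cells won by each class matches regional demand.

    prob:   (n_classes, n_cells) base probability of each class at each cell
    demand: (n_classes,) required number of cells for each class
    """
    n_classes, n_cells = prob.shape
    comp = np.zeros(n_classes)                        # competitive advantage
    for _ in range(max_iter):
        lulc = np.argmax(prob + comp[:, None], axis=0)  # winning class per cell
        counts = np.bincount(lulc, minlength=n_classes)
        gap = demand - counts                         # >0: under-allocated
        if not gap.any():
            break                                     # all demands met exactly
        comp += step * np.sign(gap)   # boost under-, damp over-allocated classes
    return lulc

# Toy landscape: 2 classes competing for 4 cells, 2 cells demanded for each.
prob = np.array([[0.9, 0.8, 0.6, 0.4],
                 [0.1, 0.2, 0.4, 0.6]])
lulc = allocate(prob, demand=np.array([2, 2]))
```

Class 0 starts out winning three of the four cells; the loop lowers its competitiveness until the cell where it is weakest flips to class 1 and both demands are satisfied.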

Demand in this context refers to the required land use change for a particular transition, such as one hundred thousand new hectares of cropland. This demand is exogenous to the allocation model, meaning it comes from outside the spatial allocation process, typically from an economic model or other source of land demand projections. In earth economy modeling, demand comes from the computable general equilibrium model, which provides a land supply curve and solves for land allocation across different production processes based on economic optimization.

Dynaclue simplifies the modeling problem by not explicitly modeling the probability of converting from type I to type J for every possible pair of land use types, but rather modeling the probability of a grid cell being a given type L at a point in time. Modeling specific transitions with full information about both source and destination types represents an important research frontier, as it becomes more complex to assign changes when you need to track both where conversions come from and where they go.

The computable general equilibrium model calculates a change vector representing the net hectarage change for each land use class. This change vector couples with the initial land use and land cover map to allocate changes spatially. In dynamic settings where the model runs across multiple time periods, the process generates a new land use map for each period by coupling the change vector for that period with the allocation algorithm.

The allocation process continues, adjusting competitiveness iteratively until all class demands are satisfied within acceptable tolerances. While academic journals often prefer formal algorithm diagrams, the process is fundamentally as described here. The model also incorporates a conversion matrix that identifies which transitions are allowed and which are prohibited. For example, water bodies cannot convert to pasture. Dynaclue also models land use change progression, reflecting the reality that transitions often follow predictable cycles. For instance, pasture might transition to abandoned pasture, then to shrubland, and eventually to forest. These transition paths are defined by specifying the number of years required for full conversion from one type to another.
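A conversion matrix like the one described above can be represented as a simple boolean table. The class list and allowed transitions below are illustrative examples consistent with the text, not the paper's full matrix:

```python
import numpy as np

classes = ["urban", "cropland", "pasture", "forest", "water"]

# allowed[i, j] is True if class i may convert to class j.
allowed = np.array([
    # urban  crop  pasture forest water
    [ True, False, False, False, False],  # urban is effectively irreversible
    [ True,  True,  True,  True, False],  # cropland can urbanize, revert, etc.
    [ True,  True,  True,  True, False],  # pasture may be abandoned to forest
    [ True,  True,  True,  True, False],  # forest can be cleared
    [False, False, False, False,  True],  # water cannot become pasture, etc.
], dtype=bool)

def can_convert(src, dst):
    """Look up whether a transition between two named classes is permitted."""
    return bool(allowed[classes.index(src), classes.index(dst)])
```

During allocation, prohibited transitions are simply masked out of the probability surface so the algorithm never assigns them, and a companion table of transition years can encode progressions like pasture to shrubland to forest.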

Model Outputs and Applications

The result of running Dynaclue is a series of land use and land cover maps for different time periods. In the Dynaclue framework, the net change vector is specified exogenously, indicating the required expansion or contraction for each land use class in each year. For example, the demand might specify one hundred thousand new hectares of cropland from 2000 to 2015, followed by fifty thousand additional hectares allocated in subsequent years.

These land use and land cover map outputs serve as the primary input to INVEST and other ecosystem service models. Newer applications of these methods, such as the work by Schultz and colleagues from 2021, model land use change in Turkey at two kilometer resolution using scenario analysis with different net change vectors representing business as usual conditions, conservation priorities, and other alternative futures. Scenarios define different possible futures based on varying assumptions about policy, economic development, and environmental priorities, leading to different patterns of agricultural expansion and conservation outcomes.

Wolf and colleagues introduced Clue Mondo in 2018, a global land use change model that is closely related to both Dynaclue and the original Clue model. Clue Mondo assesses various scenarios including the effects of protected areas, which restrict crop expansion in certain grid cells based on conservation designations. The model produces both initial conditions and projected land use and land cover maps, showing patterns of expansion and change across different regions of the world. Scenario analysis yields distinctly different projected maps for 2050 depending on which scenario assumptions are applied.

Technical Limitations and Advances

Clue Mondo represents one of the best global applications of spatially explicit land use change modeling, though it has important limitations driven by computational constraints. At the time the model was developed, it was limited to four thousand by four thousand pixels, yielding a total of sixteen million grid cells. At global scale, this resolution corresponds to approximately five arc minutes, which translates to roughly ten kilometer pixels at the equator. This limitation existed due to legacy 32-bit computer systems, which impose a cap on integer values at approximately two billion. Saving floating point data at this resolution across multiple land use classes and time periods becomes extremely memory-intensive and pushes against these computational boundaries.

However, hardware constraints are fundamentally arbitrary and can be overcome through advances in computer science and programming approaches. This is precisely what was accomplished with the development of GTAP Invest and SEALS, which stands for Spatial Economic Allocation Landscape Simulator. SEALS is conceptually similar to Clue Mondo and Dynaclue but is not constrained by the four thousand by four thousand pixel limitation. SEALS operates at three hundred meter resolution globally, processing grids of one hundred twenty-nine thousand by sixty-four thousand pixels, yielding a total of 8.25 billion grid cells. Viewing and analyzing results at this resolution requires approximately 64 gigabytes of memory. Since most researchers do not have computers with this much memory, the solution involves parallel processing approaches.
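A quick back-of-the-envelope calculation, assuming one 8-byte floating point value per cell, confirms the memory figure quoted above:

```python
cells = 129_000 * 64_000   # ~8.25 billion grid cells at 300 m resolution
bytes_per_cell = 8         # one float64 value per cell
gib = cells * bytes_per_cell / 1024**3
# about 61 GiB for a single global float64 layer, hence the ~64 GB figure
```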

Modern gains in computational speed come primarily from increases in the number of CPU cores rather than faster individual chips. Parallel processing allows you to break large datasets into smaller chunks that fit in available memory, process these chunks separately across multiple cores, and then merge the results. This represents the current paradigm of geospatial processing and big data analysis. SEALS is designed to use this parallel processing approach, running simultaneously across all available CPU cores.
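The chunk-process-merge pattern can be sketched with the standard library. This toy version uses threads for simplicity; production geospatial pipelines such as SEALS use process-based parallelism and write tiles to disk, and the function names here are ours:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # Stand-in for real per-tile raster work (reclassify, resample, etc.).
    return chunk * 2

def process_in_chunks(grid, n_chunks=4, workers=4):
    """Split a large array into row blocks, process them in parallel, merge."""
    chunks = np.array_split(grid, n_chunks, axis=0)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_chunk, chunks))
    return np.concatenate(results, axis=0)

# A small grid stands in for a global raster that would not fit in memory.
grid = np.arange(16, dtype=np.float64).reshape(4, 4)
merged = process_in_chunks(grid)
```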

The original SEALS implementation operated at three hundred meter resolution. More recent versions now exist at thirty meter resolution, processing 825 billion grid cells, and even at ten meter resolution, which involves 7.43 trillion grid cells. At ten meter resolution, you can distinguish and map individual desks within a classroom. For context, Google Maps uses approximately fifty centimeter resolution imagery in urban areas. Companies like Maxar provide imagery at fifteen centimeter resolution globally, which would correspond to thirty-four quadrillion grid cells if processed at global extent.

Processing data at these extreme scales requires petabytes of storage capacity and sophisticated data management approaches. The availability of big data at high resolution enables entirely new research directions. Higher resolution is not simply about seeing more detail in the same analyses. Instead, increased resolution allows for new types of linkages between models and enables analysis in modeling domains that were previously impossible, such as linking fine-scale ecosystem service provision to global-scale economic systems.

Integration with Economic Models

Returning to the connection between computable general equilibrium models and land use change maps, a common question among students concerns where the input demand for land use change originates. Ideally, this demand comes from an economic model that explicitly calculates it based on prices, productivity, and market dynamics. For large-scale sustainability commitments such as the Paris Climate Accord, the 30 by 30 conservation target, or the Half Earth proposal, land use changes have general equilibrium effects that propagate through the economy. This reinforces the critical importance of macroeconomic models within our interconnected modeling framework, as these policies cannot be properly evaluated without accounting for economy-wide adjustments and feedbacks.

Transition to Hands-On Implementation

Moving to the hands-on component and practical implementation, last year’s class ran Clue Mondo directly, but there were significant challenges getting it to run properly, especially on Mac and Linux systems and even on some Windows 11 installations. Despite these technical challenges, Clue Mondo represents a great methodological approach with a built-in logit regression model. Users can select raster layers to use as predictors in the regression, and the model trains itself on historical data. While we are not using Clue Mondo this year due to these technical difficulties, class exercises from that implementation remain available for students who wish to explore that approach.

Instead, the course is moving toward the SEALS model for hands-on implementation. The class website contains documentation for this lecture, including a pre-class software setup step that required installing SEALS into your development stack. Most of the required software has already been installed in previous sessions, including Python and the HazelBean library, which covers the majority of requirements. There is one additional installation step that is documented in the Earth Economy DevStack installation page.

The installation process is designed to enable students to clone code repositories and work as developers rather than simply as end users. This approach is similar to the onboarding process at a software company where new employees need to set up their development environment. The most challenging part of the installation for Windows users is installing the C compiler, which is necessary because Python alone cannot efficiently run global land use change models at three hundred meter resolution. Python is fast to write and highly readable, but it is relatively slow to execute compared to compiled languages. The common strategy in scientific computing is to use Python for the majority of code due to its ease of development and then call C code for performance-critical sections that need maximum speed. Python handles data types automatically and provides high-level abstractions, but C runs near bare-metal speed when properly compiled. On Mac and Linux systems, the necessary compilation tools are typically built in through Xcode or equivalent development environments.
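The tradeoff described here can be seen directly by comparing an interpreted Python loop with the same reduction performed by NumPy's compiled C routines. This is a toy benchmark; exact timings vary by machine, but the compiled version is typically orders of magnitude faster:

```python
import time
import numpy as np

n = 1_000_000
data = np.arange(n, dtype=np.float64)

# Pure Python: the interpreter handles every addition individually.
t0 = time.perf_counter()
py_sum = 0.0
for x in data:
    py_sum += x
py_time = time.perf_counter() - t0

# NumPy: the same reduction executed by compiled C code.
t0 = time.perf_counter()
np_sum = float(np.sum(data))
np_time = time.perf_counter() - t0

# Same answer, very different speed.
```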

Once students have successfully cloned the SEALS development repository into the correct location in their directory structure, they can activate their Python environment using Anaconda. The installation uses pip with the -e flag for an editable install, pointing to the current folder, along with the --no-deps flag to avoid updating other packages unnecessarily. This editable installation allows students to make changes to the SEALS code and have those changes immediately reflected when running the model, while ensuring compatibility with all other installed packages.
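Put together, the install step looks roughly like the following; the folder and environment names are illustrative and may differ on your machine:

```shell
cd seals_dev               # the cloned repository folder
conda activate env2025a    # your devstack Python environment (name may differ)
pip install -e . --no-deps # editable install, leaving other packages untouched
```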

Working Environment Setup

Students should open Visual Studio Code using the workspace configuration file named eartheconomyDevstack.code-workspace, which pre-configures access to all relevant repositories. If SEALS dev is installed correctly, students will see its contents in the workspace file explorer. If the SEALS content is not visible, it likely means the repository was cloned into the wrong directory location and needs to be moved.

To run SEALS, students should open the file named Run Seals Standard Python. Rather than clicking the simple play button to execute the script, students should use the debugger interface. Students need to select the appropriate debugger configuration, which will be either “internal console current file” for Windows or the Mac and Linux equivalent configuration. After clicking in the code window to ensure it has focus, students can run the script using the debugger. If the script fails to run, students should take a screenshot of the error message and email it for assistance. Some initial failures are expected since every computer system is slightly different and may require minor adjustments.

If students encounter a “no module found” error, this typically indicates that the wrong Python environment is active. Students can activate the correct environment by clicking on the environment selector in Visual Studio Code and choosing their development stack environment, which will typically be named something like ENV2025A. With the correct environment activated, students should try running the debugger again. On a properly configured machine, students will observe subprocesses launching and running in parallel, utilizing all available CPU cores simultaneously to process different portions of the global dataset.

Homework and Next Steps

The homework assignment is for students to take a screenshot showing their progress with the installation and to write a brief description of what worked, what challenges they encountered, and where they are in the process. These progress reports should be emailed so that appropriate next steps and troubleshooting assistance can be provided. Students can either email immediately after attempting the installation or experiment further with the software before sending their update, depending on their comfort level and success with the initial setup.

With this foundation in land use change modeling theory and practical implementation established, the lecture concludes. The next session will build on these concepts with hands-on work applying SEALS to specific research questions and scenarios.

Transcript

All right, let’s get started. Welcome to Lecture 11, where we will focus on land use change prediction. This is a topic I am passionate about, so I’m excited to discuss it.

For today’s agenda, we’ll begin by introducing land use change modeling and its importance within applied earth economy modeling. Then we’ll review one of the seminal articles by Verburg and Overmars (2009), where they introduce Dynaclue. We’ll also discuss a closely related article by Wolf et al. (2017) called Clue Mondo. After that, we’ll transition to the hands-on component of land use change code. The installation instructions required you to install a C compiler because the model we’re using, SEALS, needs to be compiled uniquely on each computer to run efficiently. Hopefully, future versions will resolve this issue.

Let’s first look at the schedule. We are making progress. Looking ahead, we are here. On Thursday, we’ll do a hands-on session with land use change modeling, closely linked to today’s topic. This session is a bit of a freebie in terms of content, and I might adjust it. In previous years, I’ve covered topics within CGEs, such as gridded economic models, but I may spend more time returning to Earth economy modeling with more specifics, including GTAP Invest.

Regarding assignments, remember you have your weekly reactions. If you’re behind, that’s fine—just submit them before the end of the semester. I should be stricter with deadlines, but we’re all graduate students here. Don’t forget them, as they will affect your final grade. I’ve given some extensions, but on Thursday, the first part of the final project is due—a 1-2 sentence description. Many of you have already submitted these, and I’ll send detailed feedback via email. Then, on Tuesday of the following week, the outline is due. I’ll send an update after class with more details, but essentially, it’s less than a page outlining the structure of your argument. Take the one-sentence description, incorporate any feedback, and expand it into a paper outline. Does that answer your questions?

Assignments are due Tuesday and Thursday. The deadlines are flexible, but don’t take advantage of this. If you haven’t turned in your work, you have until the next class. The outline is due on the 8th, and that’s important because I’ll send them out for peer review immediately after class on Thursday. I’ll match you up for peer review. The outline should be less than a page, maybe 8-10 lines. You don’t need results yet, but include bullet points for expected results, such as a map of ecosystem service changes or hotspots of loss. I’ll send more details after class. The peer review is a key part, so please submit both the sentence and the outline, as they’re intermediate steps toward the final project.

For the outline, you might include expected key results, like a regression table or a graph of effect size over time. Figures are not required for the outline, but for the final project, yes. If you’re doing regression analysis, include standard outputs and perhaps a graph. Any other questions? I’ll explain more on Thursday, but peer review will be similar to reviewing a journal article—identify strengths and areas for improvement. More details will follow.

Are we ready to move on? Yes. We need to learn the rules because RunGTAP only runs on Windows. If you don’t have a Windows machine, pair up with someone who does. If you have a Mac and want to work independently, you can install a virtual machine. I want everyone to have access to the RunGTAP software. If you need help coordinating, email me. It’s unfortunate, but the best tool here is Windows-only. Next year, I may require Mac users to install a virtual machine so we’re all on the same page.

The math is basic algebra, but you need to think about it correctly. The actual solution is simple. Regarding the presentation, we have 8 people, so each will present for 5 minutes. It’s a short presentation.

Let’s dive in. You may have seen the image on the title screen before. At the beginning of the semester, I asked what remains to be solved for environmental economists. We’re moving toward EVs and renewables, and recent developments show rapid solar implementation in China. In the U.S., renewable energy rollout isn’t optimal, but globally, progress is happening. When renewables are cheaper, it’s hard to justify burning coal. People weren’t burning coal out of preference. For context, more people work at Wendy’s than in coal mines, so while transitions matter, the scale is different.

There are still important issues to address. Land is a limited resource, especially due to its connections to biodiversity and ecosystem services. We need to improve our models and understanding of land use. Land use change modeling is the missing link in our model diagram. We have CGEs and ecosystem services or Earth Systems models. There are more models, like climate change. Even in the Banerjee article, land use change was discussed.

We’ve done this now—connected CGEs to land use change models, which produce inputs for ecosystem services. Banerjee (2022) and other papers describe this. Increasingly, I think we need to be explicit that there are four components. Instead of going straight from ecosystem services to CGEs, there’s a need for dependency models—modeling how biophysical outputs from ecosystem service or Earth systems models become inputs for CGEs. I’ll discuss this more in the last days of class. It’s akin to generating the shock file, as we did with RunGTAP, expressing changes in ecosystem services as shocks.

Let’s focus on land use change. The motivation is clear: it’s the input into INVEST. In earlier lectures, we discussed scenarios, with land use change as a key part. We’ve seen one representation—the land use harmonization dataset from Popp et al., Riahi, and others. It’s a useful dataset with a long time series and specificity, both historical and across SSP and RCP combinations. The graphs are flashy but hard to interpret; a matrix is better.

Popp et al. (2017) noted that land use change data needs to be spatial, temporal, conceptually consistent from past to future, and in a format usable by Earth systems models. That means high enough resolution. There aren’t many processes you can model at the 30 km grid cell level. Climate dynamics can be modeled at 30 km, but it’s inaccurate for things like carbon storage. For ecosystem services, such as hydrological routing, you can’t use a 30 km average elevation. Higher resolution is essential.

The seminal article introducing Dynaclue fills this gap, modeling land system changes at coarse scale and allocating them to fine resolution. Before diving into Dynaclue, let’s discuss top-down versus bottom-up modeling. The 2009 paper argues that some dynamics are determined at broad scales, like aggregate demand for agriculture, urban expansion, and managed forestry land. These depend on macroeconomic models, not just high-resolution local data. Dynaclue includes these top-down factors and couples them with bottom-up variables, such as vegetation changes from growth and seed dispersal, which operate at finer resolution.

Dynaclue combines top-down and bottom-up approaches for land use change prediction, focusing on agricultural abandonment and forest regeneration. There are two modeling levels: top-down and bottom-up. I wrote a paper on the mesoscale as a frontier in sustainability modeling, arguing that models at intermediate scales are important. The mesoscale bridges global and local models, capturing phenomena like trade flows and regional markets. We mapped this idea onto a figure in a 2023 paper, showing models with varying economic and spatial detail. Later, we formalized this in a 2025 paper, identifying relevant models in the space.

The key takeaway is that different phenomena operate at different spatial scales, and analyzing changes at the wrong scale is challenging. Systematically linking models at multiple scales is an important research frontier.

Back to Dynaclue: it focuses on top-down allocation of land use change, determining which grid cells are converted using bottom-up data. The conceptual model diagram shows two levels: regional (multi-sectoral land demand and processes) and the allocation step, which is the core of Dynaclue. Given demand (net hectarage change), the model allocates changes using a specific approach.

Refer to the paper for details on land use classes and their representation. Built-up area and agriculture are determined by regional scale modeling. The heart of the procedure is the allocation step. Let’s walk through the logic. We predict the total probability (for each time T, location i, and land cover L) of a grid cell transitioning to another land use type as a function of several factors:

  • Location suitability: For crops, you can run a crop model to get expected yield, which correlates with cropland expansion. Geospatial rasters indicate suitability for each land use type. For urban expansion, adjacency to other urban cells matters.
  • Neighborhood suitability: Urban areas expand where neighboring cells are urban. This is another predictor, parameterized via regression.
  • Elasticity: Measures the ease or difficulty of changing types. Some areas are harder to change, like converting urban to forest. This is captured by elasticity or transition costs.
  • Competitive advantage: This is iteratively determined during allocation and affects probability.
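These four components are combined additively into a total probability per cell and class. A schematic version (my notation for illustration, not copied verbatim from the paper) is:

```latex
P^{\mathrm{tot}}_{t,i,L} = P^{\mathrm{loc}}_{t,i,L} + P^{\mathrm{nbh}}_{t,i,L} + E_{L} + C_{L}
```

where \(P^{\mathrm{loc}}\) and \(P^{\mathrm{nbh}}\) are the location and neighborhood suitabilities, \(E_L\) is the conversion elasticity of class \(L\), and \(C_L\) is the competitive-advantage term that is adjusted iteratively during allocation.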

We use regression to estimate probabilities. To predict land use change probability, a binomial logit model is classic. We’re predicting the probability that a location converts to land use type K at time t. The logit specification models the log of the probability over 1 minus probability as a function of input layers (X variables), such as crop suitability, slope, soil type, etc. Data comes from time series of land use/cover and covariates, training on observed changes to estimate coefficients and project into the future.
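As a minimal sketch of the logit link described above, the snippet below computes a conversion probability from covariates. The coefficients and covariates are hypothetical, purely for illustration; in practice they would be estimated from observed land use transitions.

```python
import numpy as np

def logit_probability(x, beta):
    """Binomial logit: log(p / (1 - p)) = beta . x, so p = 1 / (1 + exp(-beta . x))."""
    z = np.dot(x, beta)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical covariates: [intercept, crop suitability, slope steepness]
beta = np.array([-2.0, 3.0, -1.5])   # illustrative coefficients, not fitted values
x = np.array([1.0, 0.8, 0.2])        # one grid cell's covariate values
p = logit_probability(x, beta)       # probability this cell converts to cropland
```

Applying the inverse transform, log(p / (1 - p)), recovers the linear predictor, which is what the regression actually estimates from time series of land cover and covariates.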

After calculating probabilities, we allocate changes. The probability map indicates likelihood of expansion, but actual change depends on regional demand. The allocation algorithm takes the regional net change in hectarage and uses the probability map to select specific grid cells. There’s a difference between the probability map and the change map.

The algorithm in words: calculate the probability map for each land use class at the initial time step. Assign the class with the highest probability in each grid cell to a new land use/cover map. Sum all changes. If a class is over-allocated, reduce its competitive advantage, lowering its probability, and reallocate. Repeat until the total allocated matches the demand.
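The loop just described can be sketched in a few lines. This is a toy version with made-up suitability values and demands, not the actual Dynaclue or SEALS code: each iteration assigns every cell to its highest-probability class, compares allocated totals to demand, and lowers the competitive advantage of over-allocated classes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_classes = 400, 2                      # tiny stand-in for a gridded region
suitability = rng.random((n_cells, n_classes))   # probability map, one column per class
demand = np.array([250, 150])                    # exogenous targets, in cells
comp = np.zeros(n_classes)                       # competitive-advantage adjustment

for _ in range(5000):
    alloc = np.argmax(suitability + comp, axis=1)      # winning class per cell
    counts = np.bincount(alloc, minlength=n_classes)   # allocated total per class
    error = demand - counts
    if not error.any():
        break
    comp += 0.001 * error    # over-allocated classes lose competitiveness, and vice versa
```

Because the adjustment is proportional to the allocation error, the loop settles onto an assignment whose class totals match the exogenous demand (up to integer rounding on this discrete grid).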

Demand here refers to the required land use change for a transition, such as 100,000 new hectares of cropland. It’s exogenous to the allocation model, coming from the economic model or elsewhere. In earth economy modeling, demand comes from the CGE, which provides a land supply curve and solves for land allocation across production processes.

Dynaclue simplifies by not modeling the probability of converting from type I to J, but rather the probability of being a given type L. Modeling specific transitions is a research frontier, as it’s harder to assign changes when you know the source and destination types.

The CGE calculates a change vector—net hectarage change for each land use class. This couples with the initial land use/cover map to allocate changes. In dynamic settings, the process generates a new map for the next period, coupling the change vector with allocation.

The allocation continues, adjusting competitiveness until all class demands are satisfied. Journals prefer algorithm diagrams, but the process is as described. There’s a conversion matrix to identify allowed transitions (e.g., water can’t convert to pasture). Dynaclue also models land use change progression, reflecting that transitions follow cycles (e.g., pasture to abandoned pasture to shrubs to forest). Transition paths are defined by the number of years required for full conversion.
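The conversion matrix mentioned above can be represented as a simple boolean table. The class codes and allowed transitions here are hypothetical, chosen only to mirror the water-to-pasture example:

```python
import numpy as np

# Hypothetical class order for illustration: 0=water, 1=cropland, 2=pasture, 3=forest.
# allowed[i, j] is True when a cell of class i may convert to class j in one step.
allowed = np.array([
    [True,  False, False, False],   # water stays water (cannot become pasture, etc.)
    [False, True,  True,  True ],   # cropland may be abandoned to pasture or forest
    [False, True,  True,  True ],   # pasture may regrow toward forest
    [False, True,  True,  True ],   # forest may be cleared for agriculture
])

def transition_permitted(src, dst):
    """Look up whether the allocation step may convert class src to class dst."""
    return bool(allowed[src, dst])
```

During allocation, a candidate change is simply skipped when its entry in the matrix is False, regardless of how high its probability is.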

The result is a land use/cover map for different time periods. In Dynaclue, the net change vector is exogenous, specifying expansion for each class in each year. For example, 100,000 new hectares of cropland from 2000 to 2015, then 50,000 more allocated in subsequent years.

This output is the input to INVEST and ecosystem service models. Newer work, like Schultz et al. (2021), models Turkey at 2 km resolution, using scenario analysis with different net change vectors (business as usual, conservation, etc.). Scenarios define different futures, affecting agricultural expansion and conservation outcomes.

Wolf et al. (2017) introduced CLUMondo, a global land use change model related to Dynaclue and Clue. CLUMondo assesses scenarios like protected areas, restricting crop expansion in certain grid cells. The model produces initial and projected land use/cover maps, showing expansion and changes across regions. Scenario analysis yields a different 2050 map for each scenario.

CLUMondo is one of the best global applications, though it has limitations. At the time of writing, it was limited to 4,000 by 4,000 pixels (16 million cells). At global scale, this roughly matches 5 arc-minute resolution (about 10 km pixels at the equator). The limitation stems from legacy 32-bit systems, which cap signed integer values at about 2 billion. Saving floating-point data at this resolution is also memory-intensive.

Hardware constraints like these are historical accidents, and advances in computer science can relax them. That’s what we did with GTAP Invest and SEALS (Spatial Economic Allocation Landscape Simulator), which is similar to CLUMondo and Dynaclue but not limited to 4,000 by 4,000 pixels. SEALS runs at 300 meter resolution, with roughly 129,000 by 64,000 pixels (about 8.25 billion grid cells). Viewing this result all at once requires about 64 GB of memory. Most people don’t have that much, so the solution is parallel processing.
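The memory figure quoted above can be checked with back-of-envelope arithmetic: multiply the grid dimensions and the bytes per value.

```python
# Back-of-envelope check of the memory figure quoted in the lecture.
cells = 129_000 * 64_000        # ~8.25 billion grid cells at 300 m resolution
gb_float64 = cells * 8 / 1e9    # one float64 band, in gigabytes
# gb_float64 comes out around 66 GB, consistent with the ~64 GB quoted above
```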

Modern speed gains come from more CPU cores, not faster chips. Parallel processing allows you to break data into chunks that fit in memory, process them separately, and merge results. This is the paradigm of geospatial processing. SEALS uses this approach, running in parallel. The original SEALS was 300 meters; now we have versions at 30 meters (825 billion grid cells) and even 10 meters (7.43 trillion grid cells). At 10 meters, you can map individual desks in a classroom. Google Maps uses 50 cm resolution in cities; Maxar achieves 15 cm globally (34 quadrillion grid cells).
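The chunk-and-merge pattern described above can be sketched as follows. This is an illustrative toy, not SEALS code: it splits a raster into row blocks, computes a per-block statistic in parallel, and merges the results. A thread pool is used here for simplicity (NumPy releases the GIL during array operations); large production runs like SEALS typically launch separate processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def tile_stat(tile):
    """Per-chunk work: count cells matching a hypothetical cropland code of 10."""
    return int((tile == 10).sum())

def chunked_count(raster, n_chunks=4):
    """Split a raster into row blocks, process each block in parallel, merge results."""
    tiles = np.array_split(raster, n_chunks, axis=0)
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(tile_stat, tiles))

rng = np.random.default_rng(1)
raster = rng.integers(0, 20, size=(1000, 1000))   # stand-in land use map
parallel_total = chunked_count(raster)
```

Because each block fits comfortably in memory and the merge step is a simple sum, the same code scales to rasters far larger than RAM if the blocks are read from disk one at a time.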

Processing at this scale requires petabytes of storage. Big data enables new research directions. Higher resolution doesn’t just increase detail—it allows new linkages and modeling domains, such as linking ecosystem services at global scale.

Returning to CGEs and land use change maps, many of you are asking where input demand comes from. Ideally, it comes from a model that calculates it. For large-scale sustainability commitments (Paris Accord, 30 by 30, Half Earth), land use change has general equilibrium effects, reinforcing the importance of macroeconomic models in our interconnected modeling framework.

Any questions before we move to hands-on Python work? Let’s proceed.

Last year, we ran CLUMondo in class, but there were challenges getting it to run, especially on Mac and Linux, and even on Windows 11. Despite these issues, it’s a great approach with a built-in logit regression model. You can select rasters for the regression, and it trains itself. We’re not using it this year, but there are class exercises.

Instead, we’ll move toward the SEALS model. Let me show the class website. For this lecture, you had a pre-class software setup step: installing SEALS into your dev stack. We’ve already installed Python and HazelBean, which covers most requirements, with one additional step documented in the Earth Economy DevStack installation page.

The installation process enables you to clone repositories and work as a developer, not just a user. This is similar to onboarding at a software company. The hardest part on Windows is installing the compiler, which is necessary because Python alone can’t process global 300-meter land use maps efficiently. Python is fast to write but slow to run. The common strategy is to use Python for most code and call C for performance-critical parts. Python handles data types automatically, but compiled C runs at near bare-metal speed. On Macs, the compiler comes with the Xcode command-line tools; on Linux, gcc is typically already available.

Once you’ve cloned the SEALS dev repository into the correct location, activate your environment using Anaconda. Use pip with the “-e” flag for an editable install, pointing to the current folder, and “--no-deps” to avoid updating other packages. This lets you make changes that work with everything else.
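Assuming the repository folder is named seals_dev (as in the pre-class setup) and the devstack environment is already active, the install command described above looks like this:

```shell
# Run from inside the cloned seals_dev folder, with the devstack env active.
# -e makes the install editable; --no-deps leaves existing packages untouched.
pip install -e . --no-deps
```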

Open VS Code using the workspace configuration file (eartheconomyDevstack.code-workspace), which pre-configures all repositories. If SEALS dev is installed correctly, you’ll see its content. If not, it may be in the wrong directory.

To run SEALS, open the Run Seals Standard Python file. Rather than clicking the play button, use the debugger. Select the appropriate configuration (internal console current file for Windows or Mac/Linux), click in the code window, and run with the debugger. If it fails, take a screenshot and email me. This is expected, as every system is different.

If you get a “no module” error, activate your environment by clicking the environment selector in VS Code and choosing your dev stack (e.g., ENV2025A). With the correct environment, run the debugger again. On my machine, you’ll see subprocesses running in parallel, using all cores.

Your homework is to take a screenshot and describe your progress. I’ll email you next steps. You can email me immediately or experiment further before sending your update.

With that, we’ll end for today. Thank you!