Infectious Respiratory Disease

CONTENTS OF CURRICULUM UNIT 25.05.06


Sickness Simulator: Modeling Infectious Disease Through NetLogo

Jiang Wu

Published September 2025


Content Objectives

This four-week unit is designed for AP Computer Science Principles, but can be adapted to fit an introductory computer science or third-year math course. It focuses on two core computer science concepts (data and analysis, algorithms and programming) and one core computer science practice (developing and using abstractions). Students start by exploring a pre-made epidemiological model in NetLogo. After getting a rough idea of how the model works and what their finished deliverable should resemble, students will choose a specific respiratory disease and geographic region to investigate. Using their research, each student will program a NetLogo simulation that models how the chosen disease spreads in the selected area. The project culminates with in-class presentations of the students’ simulations.

Overview of Four Infectious Respiratory Diseases

Infectious respiratory diseases spread primarily by droplets, aerosols, or close contact, making them among the easiest diseases to transmit. They differ widely in biology, speed of spread, and health impact. To provide students with a rich, contrasting set of topics that highlight the key elements of an SIR model, the following four infectious respiratory diseases were selected as focus areas for this project: measles, tuberculosis (TB), SARS-CoV-1 (SARS), and SARS-CoV-2 (COVID-19). With the highest known basic reproduction number (R0), measles showcases how extremely contagious pathogens race through susceptible populations. People also tend to develop immunity to measles after vaccination or after surviving an infection. In contrast, TB has a long latent period and the potential for reactivation. SARS demonstrates how high fatality rates and swift public health interventions can stop a virus before it becomes a global pandemic. On the other hand, COVID-19’s higher transmissibility and delayed containment allowed it to sweep across the world.

An SIR framework will only credibly project the real-world trajectory of a disease if it’s informed by reliable, disease-specific data. The fundamental parameters of a disease we will look at are transmissibility, mortality rate, and infection length. This way, when the unit is being taught, the teacher will have a general idea of what parameters are accurate for each of the four selected diseases and can provide guidance to students who need help.
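For reference, the three parameters above feed into the textbook SIR equations roughly as sketched below. This is the standard frequency-dependent form, not something specific to this unit: S, I, and R(t) are the numbers of susceptible, infectious, and recovered (or removed) individuals, N is the total population, β is the transmission rate, and γ is the recovery rate. The symbol names are assumptions chosen to match common usage.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \\
R_0 &= \frac{\beta}{\gamma}, \qquad \text{average infection length} = \frac{1}{\gamma}.
\end{aligned}
```

Under these assumptions, a disease with a larger R0 or a longer infection length needs a proportionally larger β, which is why both values have to be researched together.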

One other parameter that is not included with the others is the infection radius. The infection radius is an important component in the simulation but is not necessarily specific to each of these four diseases. This is because measles, TB, SARS, and COVID-19 are all airborne and are not only transmitted through droplets [3]. This means that when an infected person coughs, talks, sneezes, or breathes, they expel tiny aerosols that contain the pathogen [4]. Due to the minute size of these aerosols (<5 μm), they don’t easily settle on the ground as droplets do. Instead, air currents cause them to stay suspended and infectious in the air for long periods of time. Therefore, it’s difficult to calculate a specific infectious radius for each of these diseases, as the CDC regards anyone who shares the same airspace as an infectious person as being exposed [5]. Because shared airspace is programmatically difficult to implement in NetLogo, we will approximate it by giving each infected agent an adjustable infection radius. NetLogo uses patches, which are like pixels in the background, to measure distance. A slider can be made to adjust the infection radius anywhere from 1 patch to 10 patches or more to observe how widening the zone of exposure changes the model’s outbreak dynamics.
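One way the adjustable radius might be wired into the model is sketched below. It is a minimal illustration rather than the unit’s required implementation. It assumes infection-radius and infection-chance are Interface sliders (infection-chance as a percentage from 0 to 100) and that each agent has the susceptible? and infectious? variables described later in this unit. The procedure name infect is hypothetical.

```netlogo
;; Turtle procedure (hypothetical name): run by an infectious agent each tick.
;; Assumes infection-radius and infection-chance are Interface sliders.
;; A name defined by a slider must not also be listed under globals.
to infect
  ask (turtles in-radius infection-radius) with [ susceptible? ]
  [
    if random-float 100 < infection-chance   ;; per-contact chance of transmission
    [
      set susceptible? false
      set infectious? true
      set color red
    ]
  ]
end
```

It could be called from the main loop with ask turtles with [ infectious? ] [ infect ]. Widening the slider increases the number of agents each infectious agent can reach per tick.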

Measles

Measles is a highly contagious virus that resides in the nose and throat of infected people [6]. One of its hallmarks is its uniquely high R0, usually cited in the range of 12-18, although a 2017 study discourages using this value, as R0 fluctuates depending on a country’s development status, population density, and birth rates [7]. Instead, it’s recommended to use locally derived R0 values, or to borrow estimates from settings with similar demographics. This is important when students are researching the demographics of their selected locale. A model simulating the spread of measles in Luxembourg would have a lower R0 (6.2-7.7) than a model simulating the spread of measles in California (10.7-18.1).

The World Health Organization (WHO) estimates that there were 10.3 million cases of measles in 2023, with a death toll of 107,500 [8]. This gives us a mortality rate of slightly over 1%. As with R0, this varies by context. In the US or Europe, mortality rates drop to around 0.1% [9], while in sub-Saharan Africa, 5% is more realistic [10]. According to the CDC, the main variables affecting mortality rates are age, nutrition, and vaccination status [11]. Babies, young children, and adults older than 20 die far more often than school-age children. The risk of death is compounded in pregnancy and in people whose immunity is weakened by leukemia, HIV, or other conditions. Malnutrition in general, and vitamin A deficiency in particular, roughly quadruples the odds of death. Almost all measles deaths occur among unvaccinated or under-vaccinated people. Unfortunately, there are regions of the world where health systems are fragile and all these factors coincide. This results in a death toll that can exceed one in twenty. However, upon getting vaccinated or contracting measles and recovering, individuals typically develop immunity, making it extremely unlikely they’ll get the disease again.
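The global figure quoted above follows from dividing deaths by cases. The worked arithmetic below uses only the WHO numbers already cited, and students can repeat the same calculation with the regional figures they find in their research.

```latex
\text{mortality rate} = \frac{\text{deaths}}{\text{cases}}
  = \frac{107{,}500}{10{,}300{,}000} \approx 0.0104 \approx 1.04\%
```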

When a person is infected with measles, there is an average incubation period of about 12 days during which they do not show any outward symptoms [12]. After this incubation period, the first symptoms (fever, cough, and conjunctivitis) begin to manifest. Two to four days later, the characteristic red, blotchy rash breaks out and spreads throughout the person’s body. This usually persists for 5-6 days before fading. The person is typically infectious from four days before the rash starts until four days after it appears [13]. Students looking to challenge themselves can upgrade the SIR model into an SEIR model, where the E stands for Exposed, to handle this incubation period. Exposed individuals have the disease but cannot yet transmit it to others. Students who prefer to stick with the standard SIR model can simply set the infection-length variable to eight days.
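For students attempting the SEIR extension, the Exposed stage might be sketched as follows. This is a simplified illustration under stated assumptions: one tick equals one day, the 12-day average incubation period is used as the delay before an agent becomes infectious, and the names exposed?, exposure-length, and progress-exposure are hypothetical.

```netlogo
;; Stand-alone sketch. In a full model, merge these names into the
;; single turtles-own list, and set exposed? to false in setup.
turtles-own
[
  exposed?          ;; hypothetical: infected but not yet able to transmit
  infectious?       ;; as in the SIR version
  exposure-length   ;; hypothetical: ticks since the agent was exposed
]

to progress-exposure  ;; turtle procedure; assumes one tick = one day
  let incubation-days 12   ;; illustrative value; adjust per disease
  if exposed?
  [
    set exposure-length exposure-length + 1
    if exposure-length >= incubation-days
    [
      set exposed? false
      set infectious? true   ;; the agent can now transmit the disease
      set color red
    ]
  ]
end
```

With this change, a newly infected agent would have exposed? set to true instead of infectious?, and only becomes infectious once the delay has passed.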

Tuberculosis (TB)

TB is caused by Mycobacterium tuberculosis, an airborne bacterium that most often infects the lungs [14]. It is typically transmitted when a person inhales aerosols containing the bacteria, each of which is roughly 2.71 ± 1.05 μm in length. Once infected, most people develop a clinically silent latent TB infection. This is a state where a person’s immune system has contained, but not eradicated, Mycobacterium tuberculosis. The bacteria persist in a dormant or very slowly replicating form, so the person has no symptoms and cannot transmit TB to others. Therefore, a person can only get TB from someone with active TB, not latent TB [15]. Interestingly, latent TB is diagnosed by checking for an immune response to Mycobacterium tuberculosis antigens, rather than for the bacteria themselves. Only 5-10% of those with latent TB will develop active TB in their lifetime, but the risk increases with immunosuppression, malnutrition, diabetes, smoking, and HIV. Because TB has an extremely long latent period, its R0 is lower than that of many acute viral diseases. As with other diseases, though, the value varies depending on factors such as time, location, and public health measures. A 2018 study reviewed past literature on TB transmission and found R0 values ranging from 0.24 in the Netherlands (1933-2007) to 4.3 in China (2012) [16]. Students will be expected to provide justification for the R0 value they use in their model.

TB is once again the leading global infectious cause of death, after briefly losing the title to COVID-19. According to the WHO, approximately 10.8 million people fell ill with active TB in 2023, and of those, 1.25 million died [17]. This gives a mortality rate of 11.57%. Most of these deaths are untreated cases of TB, which is unfortunate, because TB is both curable and preventable. If left untreated, most deaths arise from pulmonary deterioration and respiratory failure. It’s important to emphasize that these 10.8 million are people with active TB. The WHO estimates that about 2 billion people have latent TB.

Nowadays, when a person is diagnosed with active TB, they receive antibiotic treatment. However, treatment is not an inherent part of an SIR model. This raises the question: what happens when someone gets active TB and does not get treatment? A 2011 meta-analysis helps us answer this question. The study looked at historical data from the pre-antibiotic era and found that untreated active TB lasted a mean of 3 years from onset to recovery or death [18]. About 70% of these patients died, and many of those who recovered did not develop immunity and relapsed later in life. Those who do receive treatment are no longer infectious after a few weeks.
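The untreated-TB figures above could be translated into a turtle procedure along the following lines. This is a rough sketch, not a validated model. It assumes one tick equals one day, that infection-length is incremented elsewhere each tick, and that survivors return to the susceptible pool (a modeling choice reflecting the lack of lasting immunity). The procedure name recover-or-die is hypothetical.

```netlogo
;; Stand-alone sketch; in a full model these names already sit in turtles-own.
turtles-own [ infectious? susceptible? infection-length ]

to recover-or-die  ;; turtle procedure (hypothetical name)
  if infectious? and infection-length >= 3 * 365   ;; about 3 years, in days
  [
    ifelse random-float 100 < 70   ;; about 70% of untreated cases died
    [ die ]                        ;; removes the agent from the simulation
    [
      set infectious? false
      set susceptible? true        ;; assumption: no lasting immunity
      set infection-length 0
      set color green
    ]
  ]
end
```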

SARS-CoV-1 (SARS)

SARS is an airborne respiratory illness, best known for the 2002-2004 outbreak in East Asia [19]. The virus that causes it, SARS-CoV-1, has a diameter of 80-120 nanometers, and transmission is typically through inhaling aerosols that carry the virus or coming into contact with surfaces contaminated with droplets. A detailed SIR model fit to historical trends of the epidemic in China estimated R0 at 2.87 [20]. This value is consistent with independent analyses from Hong Kong and Singapore. Swift public health measures such as rapid case identification, quarantine, and airport screening quickly reduced R to below 1, causing the outbreak to die out.

By the time the WHO declared the outbreak to be contained in July 2003, 8,096 cases and 774 attributable deaths had been reported worldwide, giving SARS a mortality rate of 9.56%. Treatments included receiving supplemental oxygen and, in more severe cases, lung-protective mechanical ventilation [21]. Those who succumbed to the virus died due to the deterioration of the lungs and a dysregulated immune response, leading to respiratory failure [22].

Reliable data exist for the timeline from when a patient is first infected with SARS to when they eventually recover or die. Patients were typically hospitalized 3-5 days after their first symptoms, and their respiratory function worsened until about day 12 [23]. In Hong Kong, the average time from symptom onset to hospital discharge was 26.5 days [24], while in China it was 33 days [25]. Importantly, the CDC notes that the virus is only contagious after an infected person shows symptoms, and infectiousness peaks at around day 10 of the illness [26]. When implementing this in the model, students can either extend their SIR model into an SEIR model, as with measles, or use an infection length within the range of 22-30 days.

SARS-CoV-2 (COVID-19)

COVID-19 is the illness caused by SARS-CoV-2, a coronavirus that caused a worldwide pandemic in 2020. The virus was first identified in Wuhan, China, in December 2019 and prompted the WHO to declare a Public Health Emergency of International Concern on January 30, 2020, and a pandemic on March 11, 2020 [27]. COVID-19 spreads primarily through aerosols in shared indoor air. Risk of infection rises in poorly ventilated, crowded settings. Since the pandemic started, many variants of SARS-CoV-2 have emerged. The strain circulating in early 2020 had a median R0 of 2.6 [28], while the Delta variant of 2021 had a mean R0 of 5 [29]. Although these values are high, population immunity from vaccination and prior infection is now widespread, meaning the effective reproduction number as of July 2025 is much lower.

Most infections are mild to moderate, with fever, cough, fatigue, and loss of taste or smell. The virus first attacks cells that line the air sacs of the lungs. As the infection spreads, fluid and debris fill these sacs and the surrounding tissue becomes inflamed, making it hard for oxygen to cross into the blood. For those who succumb to the illness, death usually results from respiratory failure, shock from overwhelming inflammation, or organ damage caused by clots [30]. Mortality rates vary greatly for COVID-19 due to how each region handled the disease. As of July 2024, countries had reported approximately 775 million cases and 7 million deaths, making the global mortality rate roughly 0.9% [31]. Students will be expected to research the mortality rate in the region they intend to model.

COVID-19’s incubation period averages 4-6 days, and the infectious period typically spans from 1-2 days before symptom onset to about 8-10 days afterwards [32]. The recovery time is more varied, depending on the severity of the infection. For mild cases where there is no shortness of breath or hospital stay, the recovery time is typically 1-2 weeks. For cases where the patient is hospitalized, it may take 3-6 weeks for the patient to become clinically stable. And for the most severe cases where a patient is sent to the ICU, it may take many months for them to get better, if at all [33].

This concludes the overview of the four diseases this curriculum unit covers. The next section provides a synopsis of NetLogo and its syntax.

Exploring a Basic Epidemiological Model in NetLogo

Since students will be building their simulations in NetLogo, it’s important to understand what NetLogo is and how to use it. NetLogo is a programmable environment for simulating natural and social phenomena, ideally suited for complex systems that unfold over time. This can range anywhere from modeling how traffic jams materialize on the freeway to how military engagements might play out. Users can script hundreds of autonomous agents (in this unit, representing people) that act simultaneously, making it possible to study how individual behaviors generate large-scale patterns [34]. NetLogo also includes a built-in Models Library, from which students can launch a pre-built epidemiological model and experiment by tweaking parameters. Figure 2 shows the epiDEM model, loaded from NetLogo’s Models Library.

Figure 2 – The basic epidemiology model in NetLogo, loaded from the models library

The green sliders at the top adjust variables such as the initial number of people, the infection chance, the recovery chance, and the average recovery time. Each of these variables contributes to the behavior of the outbreak, and the details can be viewed in the Code tab. The effective transmission rate (β) is represented as the average number of contacts per infected agent each tick multiplied by the infection chance. The recovery chance is used to calculate the average recovery time, using the formula: average recovery time = [equation image not preserved in the source]. R is calculated in the calculate-r0 procedure, using the equation $R = \frac{N \ln\left(S_0 / S(t)\right)}{N - S(t)}$, where S0 is the initial number of susceptible agents, S(t) is the current number of susceptible agents, and N is the total number of agents. The setup and go buttons beneath the sliders are used to initialize the base state of the simulation and then to run it, respectively. The stage on the right shows the movement of each individual agent, with susceptible ones in green and infectious ones in red. As the model runs, three charts are generated: the cumulative number of infected and susceptible agents over time, the current number of infected and susceptible agents over time, and the infection and recovery rates over time. The variable R, shown in Figure 2, is also updated live. It represents how many additional agents an average infectious agent can be expected to infect.

At the very top are three tabs. Clicking the Code tab switches the interface to a text editor that shows the NetLogo code behind the simulation. Excerpts of the code are shown below. One of the first things you may notice is the abundance of double semicolons (;;) scattered throughout the text. NetLogo treats everything from a semicolon to the end of the line as a comment and ignores it when the code is run; writing the semicolon twice (;;) is a common convention. Comments are typically used by programmers to remind themselves and other users what each section of code does. Additionally, if there are problems when running your program, ;; can be used to debug by commenting out sections of code to pinpoint where the problem lies.

For simulations involving disease spread, the code is typically organized into three sections: global variables, individual variables, and procedures. Global variables are declared with the globals keyword, followed immediately by a pair of square brackets that enclose a list of variable names. For example:

globals
[
  r0                ;; R naught, the basic reproduction number
  mask-rate         ;; The percentage of people wearing masks
  infection-radius  ;; How far a pathogen can travel from an infectious agent
]

The code above declares three global variables: r0, mask-rate, and infection-radius. Global variables are typically placed at the top of the code, before any individual variables or procedures. The declaration of the global variables does not assign values to those variables, so a common practice is to initialize each global variable in the setup procedure, which will be described later. Once initialized, global variables can be accessed or modified from any agent using the set command, allowing every part of the model to share and update the same data.
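As a small illustration of that practice, the sketch below initializes the globals inside setup. The starting values are arbitrary placeholders rather than recommendations. It also assumes none of these names is controlled by an Interface slider, since a name defined by a slider must be left out of the globals list.

```netlogo
globals [ r0 mask-rate infection-radius ]

to setup
  clear-all                 ;; also resets every global variable to 0
  set mask-rate 25          ;; illustrative value: 25% of agents wear masks
  set infection-radius 3    ;; illustrative value: 3 patches
  reset-ticks               ;; r0 stays at 0 until the model computes it
end
```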

Individual variables represent attributes specific to each agent. It’s possible to program many different categories of agents, each with their own set of individual variables, but that is beyond the scope of this project. Instead, we will give each agent the same set of individual variables. Individual variables are declared with the keyword turtles-own, followed immediately by square brackets that contain a list of variable names. For example:

turtles-own
[
  infectious?       ;; If true, the agent is infectious
  immunity?         ;; If true, the agent cannot get the disease
  susceptible?      ;; If true, the agent is susceptible
  infection-length  ;; How long the agent has been infected for
  recovery-time     ;; Time it takes before the person has a chance to recover
]

The code above declares five individual variables but does not assign values to them. Just like global variables, individual variables are typically initialized in the setup procedure. It’s important to note that despite sharing the same set of individual variables, the values of these variables can and will differ across agents. Agent A can have an infection-length of 10 ticks (NetLogo’s unit of time) while Agent B has an infection-length of 0 ticks. Agent C can have infectious? set to true while Agent D has infectious? set to false. As a side note, agents are referred to as turtles in NetLogo because they are named after the turtle-shaped robots used with the Logo programming language (NetLogo’s predecessor) [35].
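To see these per-agent differences directly, commands like the following can be typed into the Command Center at the bottom of the Interface tab once a model using the variables above has been set up. They are offered as an exploratory sketch and use only built-in NetLogo primitives.

```netlogo
;; Typed one line at a time into the Command Center after pressing setup
show count turtles with [ infectious? ]        ;; how many agents are currently infectious
show mean [ infection-length ] of turtles      ;; average infection-length across all agents
ask one-of turtles [ show infection-length ]   ;; the value held by one randomly chosen agent
```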

Following the declaration of global and individual variables are the procedures. A procedure (sometimes called a function) is a named block of code that bundles a sequence of commands you want to reuse or keep logically separate from the rest of your code. Procedures are declared with the to keyword followed by the procedure’s name and the commands you want the procedure to run. Procedures always end with the end keyword. Once defined, the procedure can be invoked by buttons in the Interface tab, by other procedures, or by agents. Most NetLogo programs will have one or more setup procedures to create the desired initial state before the simulation is run. Below are two setup procedures that might be used in an epidemiological model:

to setup
  clear-all      ;; Erases all turtles and patches. Resets all variables
  setup-people   ;; Invokes another procedure called setup-people (below)
  reset-ticks    ;; Sets the time (measured in ticks) to 0
end              ;; Closes the setup procedure

to setup-people
  create-turtles initial-people     ;; Creates agents equal to the number on the initial-people slider in the interface
  [
    setxy random-xcor random-ycor   ;; Gives each agent a random starting position
    set immunity? false             ;; Each agent does not start off immune
    set infectious? false           ;; Each agent does not start off infectious
    set susceptible? true           ;; Each agent starts off susceptible
    set color green                 ;; Each agent’s color starts off green
    if (random-float 100 < 5)       ;; Gives each agent a 5% chance of starting off infectious
    [
      set infectious? true          ;; If in the 5%, set infectious to true
      set susceptible? false        ;; If in the 5%, set susceptible to false
      set color red                 ;; If in the 5%, set the agent’s color to red
    ]
  ]
end                                 ;; Closes the setup-people procedure

Note how the setup procedure invokes the setup-people procedure. When something invokes a procedure by name, this is known as a call. The next line of code under setup-people (reset-ticks) will not run until every command in the setup-people procedure has finished running. This is because a procedure call must finish completely before control returns to the line that follows it. As previously mentioned, procedures are where values are typically assigned to variables. The syntax for this is the set keyword, followed by the variable name, a space, and then the value you want to assign to it. The indents, while not strictly necessary, are a useful way to keep code organized. Table 1 below shows the commonly used parameters in SIR models and how each parameter affects the behavior of the model.

Table 1 – Commonly used parameters in SIR models and what they do

Parameter | Role in the model
--- | ---
beta | The average number of new secondary infections generated per infectious agent during the current tick
gamma | The average number of new recoveries per infectious agent during the current tick
R | The current estimate of the basic reproduction number, based on how the susceptible pool has changed
infection-length | Tracks how many ticks the current agent has been infected for
infection-chance | Per-contact probability that an interaction between an infectious and a susceptible agent transmits the disease
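To show where a per-agent parameter such as infection-length gets updated, a bare-bones go procedure might look like the sketch below. It is a simplified illustration, not the epiDEM code: agents wander randomly, infectious agents age their infection by one tick, and the run stops once nobody is infectious. Infection and recovery logic would be added inside the same loop.

```netlogo
;; Stand-alone sketch; in a full model these names already sit in turtles-own.
turtles-own [ infectious? infection-length ]

to go
  if not any? turtles with [ infectious? ] [ stop ]   ;; end the run once the outbreak dies out
  ask turtles
  [
    right random 360   ;; turn by a random angle
    forward 1          ;; move one patch
    if infectious? [ set infection-length infection-length + 1 ]
  ]
  tick                 ;; advance the clock by one tick
end
```

A go button in the Interface tab with the Forever option checked would call this procedure repeatedly until the stop condition is met.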
