Cessna 172S Performance Calculations

Predictive modeling of performance characteristics, including takeoff, climb, cruise, and landing. FOR EDUCATIONAL PURPOSES ONLY.
Published

March 20, 2025

Even as a student pilot, I'm aware that the official performance tables for an aircraft are given in the Pilot's Operating Handbook specific to a given aircraft: not just to its make and model, but to the individual airframe itself.

I extracted performance numbers from an unofficial source (but one whose aircraft serial number ranges covered my training aircraft).

I provide the following code for instructional purposes – to document my approach and thinking, and so that others who are so inclined can do the same, using their own sourced data. But please do not endanger yourself.

There will be minimal explanation below.

Source CSV files

I can NOT take responsibility for the contents of these performance tables. There may be errors, or they may not be appropriate to your specific aircraft. They are included here for educational purposes only.

Short field takeoff

Continuous numeric predictors:

  • pressure altitude (9 levels)
  • temperature (5 levels)
  • takeoff weight (only 3 levels)

Numeric output:

  • ground roll in feet
  • total feet to clear a 50 foot obstruction
  • liftoff speed and speed at 50 feet, each with only 3 values (depending ONLY on weight)

Not modeled here: headwind/tailwind, non-paved runway

Predictive models:

  • mgcv::gam generalized additive model for ground roll and total distance to clear 50’ obstacle
  • stats::approxfun linear interpolation for liftoff and 50 foot speeds

Plot shows overlay of original and gam model predictions

Very good agreement, acceptable to me to use.

The curves in the POH are actually too perfect, and were almost certainly produced by a modelled curve rather than by raw test results.

Code
df_takeoff <- read_csv("takeoff.csv", col_types = "d") |> 
  pivot_longer(
    cols = starts_with("groundroll") | starts_with("clearfifty"),
    names_to = "metric_temp",
    values_to = "value") |> 
  separate(metric_temp, into = c("metric", "temp"), sep = "_") |> 
  mutate(temp = as.numeric(temp))

# g <- df_takeoff |> 
#   mutate(temp = as.factor(temp)) |> 
#   # ggplot(aes(p_alt, value, color = temp)) +
#   ggplot(aes(p_alt, value, color = temp, linetype = metric)) +
#   geom_point() +
#   geom_line() +
#   # facet_grid(vars(wt), vars(metric)) +
#   facet_grid(~wt) +
#   theme_bw()
# ggplotly(g)

# Predictive modeling:
# gam_takeoff: bs = "gp" performs poorly
# - list of 2 models, "clearfifty" and "groundroll"
# - predictors: wt, p_alt, temp
gam_takeoff <- map(
  split(df_takeoff, df_takeoff$metric),
  ~ mgcv::gam(value ~ s(wt, p_alt, temp, bs = "tp"), family = gaussian(), data = .x)
)
liftoff <- approxfun(x = c(2550, 2400, 2200), y = c(51, 48, 44)) # KIAS
speed50 <- approxfun(x = c(2550, 2400, 2200), y = c(56, 54, 50)) # KIAS

df_takeoff_wide <- df_takeoff |> 
  pivot_wider(
    names_from = metric, values_from = value
  )

df_takeoff_wide <- df_takeoff_wide |> 
  mutate(
    groundroll_pred = predict(gam_takeoff[['groundroll']], df_takeoff_wide),
    clearfifty_pred = predict(gam_takeoff[['clearfifty']], df_takeoff_wide)
  ) |> 
  mutate(
    err_g = groundroll_pred - groundroll,
    err_c = clearfifty_pred - clearfifty
  )

# Confirm that take-off predictions match original
g <- df_takeoff_wide |> 
  transmute(
    wt, p_alt, temp = as.factor(temp),
    groundroll_orig = groundroll,
    clearfifty_orig = clearfifty,
    groundroll_pred,
    clearfifty_pred
  ) |> 
  pivot_longer(
    cols = c(groundroll_orig, clearfifty_orig, groundroll_pred, clearfifty_pred), 
    names_to = c("metric", "model"),
    names_pattern = "(groundroll|clearfifty)_(orig|pred)",
    values_to = "value"
  ) |> 
  ggplot(aes(p_alt, value, color= temp, linetype = model)) +
  geom_line() +
  facet_grid(metric ~ wt) +
  theme_bw()
ggplotly(g)
[Interactive plot: original (solid) vs. gam-predicted (dashed) values by pressure altitude, colored by temperature (0–40 °C), faceted by metric (clearfifty, groundroll) and takeoff weight (2200, 2400, 2550 lb).]
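
A usage sketch with made-up conditions (it assumes the gam_takeoff, liftoff, and speed50 objects defined in the code above):

# Hypothetical conditions: 2,300 ft pressure altitude, 25 °C, 2,450 lb takeoff weight
new_takeoff <- tibble(wt = 2450, p_alt = 2300, temp = 25)

predict(gam_takeoff[["groundroll"]], new_takeoff)  # estimated ground roll (ft)
predict(gam_takeoff[["clearfifty"]], new_takeoff)  # estimated distance to clear a 50 ft obstacle (ft)
liftoff(2450)                                      # interpolated liftoff speed (KIAS)
speed50(2450)                                      # interpolated speed at 50 ft (KIAS)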

Maximum climb rate

Very small table.

Continuous numeric predictors:

  • pressure altitude (7 levels)
  • temperature (4 levels)

Numeric output:

  • rate of climb (feet per minute)
  • climb speed (KIAS) – only 3 values, depending ONLY on pressure altitude

Predictive models:

  • stats::lm second degree polynomial model for rate of climb
  • climb speed not modelled, as it’s only 72, 73, or 74 KIAS

Plot shows overlay of original and lm model predictions

Pretty good agreement, with small discrepancies at 4000 feet pressure altitude (less than 20 fpm). Acceptable enough to use.

Code
df_climbrate <- read_csv("climb_rate.csv", col_types = "d") |> 
  pivot_longer(
    cols = starts_with("t"),
    names_to = "metric_temp",
    values_to = "climbrate") |> 
  separate(metric_temp, into = c("metric", "temp"), sep = "_") |> 
  mutate(temp = as.numeric(temp)) |> 
  select(-metric) |> 
  arrange(temp, p_alt)

# g <- df_climbrate |> 
#   mutate(temp = as.factor(temp)) |> 
#   # ggplot(aes(p_alt, value, color = )) +
#   ggplot(aes(p_alt, climbrate, color = temp)) +
#   geom_point() +
#   geom_line() +
#   theme_bw()
# ggplotly(g)

poly_climbrate <- lm(climbrate ~ poly(p_alt, 2) * temp, data = df_climbrate)

df_climbrate_wide <- df_climbrate |> 
  mutate(climbrate_pred = predict(poly_climbrate, df_climbrate)) |> 
  mutate(
    err_g = climbrate_pred - climbrate,
    percent_err = 100 * err_g / climbrate
  )

g <- df_climbrate_wide |> 
  transmute(
    p_alt, temp = as.factor(temp),
    climbrate_orig = climbrate,
    climbrate_pred
  ) |> 
  pivot_longer(
    cols = c(climbrate_orig, climbrate_pred), 
    names_to = c("metric", "model"),
    names_pattern = "(climbrate)_(orig|pred)",
    values_to = "value"
  ) |> 
  ggplot(aes(p_alt, value, color= temp, linetype = model)) +
  geom_line() +
  theme_bw()
ggplotly(g)
[Interactive plot: original (solid) vs. predicted (dashed) climb rate versus pressure altitude, colored by temperature (-20, 0, 20, 40 °C).]
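
A usage sketch with made-up conditions (it assumes poly_climbrate from the code above):

# Hypothetical conditions: 3,500 ft pressure altitude, 15 °C
predict(poly_climbrate, tibble(p_alt = 3500, temp = 15))  # estimated climb rate (fpm)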

Climb distance

Only a SINGLE continuous numeric predictor:

  • pressure altitude (13 levels)
  • a fixed fractional adjustment for deviation from ISA temperature (not modeled here)

Numeric output:

  • climb time
  • fuel used
  • distance

All are cumulative values starting from sea level pressure altitude.
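
(So, for example, the fuel to climb from 2,000 ft to 6,000 ft is the cumulative fuel at 6,000 ft minus the cumulative fuel at 2,000 ft; the commented example in the code below does exactly this with diff().)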

Predictive models:

  • stats::approxfun linear interpolation, because only a single predictor

Plot shows point-to-point connections via approxfun

This isn’t really predictive modeling at all, but just straight interpolation between points.

Code
##### Climb time, fuel, and distance #####
# - p_alt is the only predictor
# - use approxfun for all columns

df_climbdist <- read_csv("climb_dist.csv", col_types = "d") |> 
  pivot_longer(
    cols = -p_alt,
    names_to = "metric"
  )

approxfun_climbdist <- map(
  split(df_climbdist, df_climbdist$metric),
  function(df) approxfun(x = df$p_alt, y = df$value)
)

climbdist <- function(altitudes) {
  map(approxfun_climbdist, function(f) {f(altitudes)}) |>
    as_tibble() |>
    transmute(p_alt = altitudes, temp, speed, rate, cum_time, cum_fuel, cum_dist)
}

# climbdist(seq(0, 12000, 1000)) == read_csv("climb_dist.csv", col_types = "d") # Confirm perfect match to original
# climbdist(c(1000, 3000))$cum_fuel |> diff() # example from one altitude to another

g <- df_climbdist |> 
  ggplot(aes(p_alt, value, color = metric)) +
  geom_point() +
  geom_line() +
  theme_bw() +
  facet_wrap(~metric, scales = "free_y")
ggplotly(g)
[Interactive plot: cum_dist, cum_fuel, cum_time, rate, speed, and temp versus pressure altitude, one free-scaled facet per metric.]
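
A usage sketch for a made-up climb leg, differencing the cumulative values (I'm assuming the usual POH units of minutes, gallons, and nautical miles for cum_time, cum_fuel, and cum_dist):

# Hypothetical climb from 1,500 ft to 7,500 ft pressure altitude
leg <- climbdist(c(1500, 7500))
diff(leg$cum_time)  # time spent in the climb
diff(leg$cum_fuel)  # fuel used in the climb
diff(leg$cum_dist)  # distance covered in the climb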

Cruise performance

These were the most complex performance tables, with more predictors and more data to use.

Furthermore, close inspection of plots of the original data showed kinks in the POH data that lacked face validity and likely reflected measurement or rounding discrepancies.

At first, I tried to faithfully reproduce these irregularities in the POH performance data with a rather complex predictive model that interpolated over a tiling of the 3 nearest neighbors (discarding degenerate triangles). It did “work”, by completely memorizing the original performance data and interpolating between the closest points. But, in the end, the kinks did not make physical or aerodynamic sense, and I ended up using a gam model.

(In fact, I compared linear, polynomial, generalized additive, and my bizarre nearest-neighbor interpolated models. I also performed leave-one-out cross-validation on the gam model to get a sense of expected errors, and to confirm sensible interpolation between the minimal 3 temperature values available.)
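
For reference, a minimal leave-one-out cross-validation sketch of the kind I mean, for a single outcome. This is not the exact code I ran, and it assumes the df_cruise data frame read in further below:

# LOOCV for the KTAS gam: refit with one row held out, predict that row
loocv_err <- purrr::map_dbl(seq_len(nrow(df_cruise)), function(i) {
  fit <- mgcv::gam(ktas ~ s(alt, rpm, temp), family = gaussian(), data = df_cruise[-i, ])
  as.numeric(predict(fit, df_cruise[i, ])) - df_cruise$ktas[i]
})
sqrt(mean(loocv_err^2))  # RMSE of held-out predictions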

Continuous numeric predictors:

  • pressure altitude (6 levels)
  • engine RPM (6 to 7 levels for each pressure altitude, not fully overlapping)
  • temperature (only 3 levels)

Numeric output:

  • % maximum cruise power (% MCP)
  • knots true airspeed (KTAS)
  • gallons per hour fuel use (GPH)

Not modeled here: speed fairings

Predictive models:

  • mgcv::gam generalized additive model performed most acceptably to me

Comparison of linear, polynomial, and generalized additive predictive models

Just showing adjusted R². Not included here are the LOOCV error calculations for the gam model or the weird nearest-neighbor triangular interpolation.

Code
df_cruise <- read_csv("cruise.csv", col_types = "d")
outcomes <- c("pwr", "ktas", "gph")
outcome_labels <- c(
  "pwr" = "% max cruise power",
  "ktas" = "TAS (knots)",
  "gph" = "Gallons per hour"
)
temp_labels <- c(
  "-20" = "20°C below standard",
  "0" = "Standard temperature",
  "20" = "20°C above standard"
)

# Color palette: choose darkest 6 colors in "Blues"
pal <- RColorBrewer::brewer.pal(n = 9, "Blues")[4:9] # has 9 colors, drop the first 3 lighter ones

##### Compare adjusted R-squared performance: lm, lm with polynomial terms, and gam #####
# - using full training set, so not as good for assessing prediction on new data
gam_models <- list() # save all the gam models for future use
df_rsq <- data.frame() # store the adjusted R-squared values for comparison
for (outcome in outcomes) {
  # pwr, ktas, gph
  lm_fit <- lm(as.formula(paste0(outcome, " ~ alt + rpm + temp")), data = df_cruise)
  poly_fit <- lm(
    as.formula(paste0(
      outcome,
      " ~ poly(alt, 2) + poly(rpm, 2) + poly(temp, 2)"
    )),
    data = df_cruise
  )
  gam_fit <- gam(
    as.formula(paste0(outcome, " ~ s(alt, rpm, temp)")),
    bs = "gp",
    # bs = "tp", # doesn't perform better
    family = gaussian(),
    data = df_cruise
  )
  gam_models[[outcome]] <- gam_fit

  df_rsq <- rbind(
    df_rsq,
    data.frame(
      outcome = outcome,
      lm = summary(lm_fit)$adj.r.squared,
      poly = summary(poly_fit)$adj.r.squared,
      gam = summary(gam_fit)$r.sq
    )
  )
}

df_rsq |> mutate_if(is.numeric, round, 4) |> knitr::kable()
outcome      lm    poly     gam
pwr      0.9848  0.9850  0.9993
ktas     0.9930  0.9931  0.9989
gph      0.9872  0.9871  0.9994
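
A usage sketch with a made-up cruise condition (gam_models is built in the loop above):

# Hypothetical cruise: 6,500 ft pressure altitude, 2,400 RPM, standard temperature
new_cruise <- tibble(alt = 6500, rpm = 2400, temp = 0)

predict(gam_models[["pwr"]], new_cruise)   # predicted % max cruise power
predict(gam_models[["ktas"]], new_cruise)  # predicted true airspeed (KTAS)
predict(gam_models[["gph"]], new_cruise)   # predicted fuel flow (GPH)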

Plot shows overlay of original and gam model predictions

Good agreement. KTAS has the most pronounced kinks; these are smoothed out and non-intersecting in the gam model, which I prefer.

The next plot zooms in on an example problem area, but these plots are all interactive ggplotly plots, so you can zoom in on the weird areas yourself.

Code
# GAM final model test
# - overlay facet_grid of original POH and gam model-predicted
# - generate dataframe with both orig POH as well as pred from GAM
df_gam_test <- df_cruise |>
  mutate(model = "orig") |>
  pivot_longer(cols = pwr:gph, names_to = "outcome", values_to = "value") |>
  bind_rows(
    df_cruise |>
      transmute(alt, rpm, temp, model = "pred") |>
      bind_cols(map_dfc(gam_models, predict, df_cruise |> select(alt, rpm, temp))) |>
      pivot_longer(cols = pwr:gph, names_to = "outcome", values_to = "value")
  ) |>
  arrange(alt, rpm)

# Generate grid plot with overlaid orig and pred lines
g <- df_gam_test |>
  mutate(alt = factor(alt, ordered = TRUE)) |>
  ggplot(aes(rpm, value, color = alt, linetype = model)) +
  # geom_point() +
  geom_line() +
  facet_grid(
    rows = vars(outcome),
    cols = vars(temp),
    scale = "free_y",
    labeller = labeller(outcome = outcome_labels, temp = temp_labels)
  ) +
  theme_bw() +
  labs(
    # title = "Cruise Performance: C172S NAV III / KAP 140",
    x = "RPM",
    y = ""
  ) +
  scale_color_manual("Pressure altitude (ft)", values = rev(pal)) +
  scale_linetype_manual(values = c("orig" = "solid", "pred" = "longdash")) +
  guides(color = "none")
ggplotly(g)
[Interactive plot: original (solid) vs. gam-predicted (dashed) gallons per hour, TAS (knots), and % max cruise power versus RPM, faceted by temperature (20 °C below standard, standard, 20 °C above standard).]

Crossing lines in original data

These occur with true airspeed (TAS) and only at the extremes. They seem very unlikely to be real, and are more likely the result of rounding and small measurement errors. An example is shown here, although the previous plots are interactive and can be examined as well.

At first, I perfectly modeled these discontinuities with the complex triangle interpolation method, but in the end I chose to go with a GAM model whose smoother predictions seemed more physically plausible.

Code
g <- df_gam_test |>
  filter(outcome == "ktas", temp == 0) |> 
  mutate(alt = factor(alt, ordered = TRUE)) |>
  ggplot(aes(rpm, value, color = alt, linetype = model)) +
  # geom_point() +
  geom_line() +
  facet_grid(
    rows = vars(outcome),
    cols = vars(temp),
    scale = "free_y",
    labeller = labeller(outcome = outcome_labels, temp = temp_labels)
  ) +
  theme_bw() +
  labs(
    # title = "Cruise Performance: C172S NAV III / KAP 140",
    x = "RPM",
    y = ""
  ) +
  scale_color_manual("Pressure altitude (ft)", values = rev(pal)) +
  scale_linetype_manual(values = c("orig" = "solid", "pred" = "longdash")) +
  guides(color = "none") +
  coord_cartesian(xlim = c(2400, 2700), ylim = c(105, 125))
ggplotly(g)
[Interactive plot: zoomed view of TAS (knots) versus RPM at standard temperature (RPM 2400–2700, 105–125 knots), showing crossing original lines and smooth gam predictions.]

Heatmap of derived metric NM per gallon: RPM vs pressure altitude, by temperature

Using the published cruise performance data, I calculated the travel efficiency metric of nautical miles traveled per gallon of fuel used, at the various RPM and altitudes available.

This doesn’t take into account the fuel consumed to actually reach the higher altitudes, where fuel use becomes more efficient.
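
(As a made-up worked example of the metric itself: 115 KTAS at 8.5 GPH works out to 115 / 8.5 ≈ 13.5 NM per gallon.)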

Code
##### Heatmap of derived metric NM per gallon: RPM vs pressure altitude, by temperature #####
df_cruise |>
  # filter(rpm/100 == round(rpm/100)) |>
  mutate(
    eff = ktas / gph,
    label = paste0(round(eff, 1), " NM / G\n", ktas, " KTAS\n", gph, " GPH")
  ) |>
  ggplot(aes(alt, rpm, fill = eff)) +
  geom_tile() +
  geom_text(aes(label = label), size = 2.2, color = "black") + # #555
  theme_bw() +
  scale_fill_viridis_c(option = "H", "NM per gallon", limits = c(10, 15)) +
  facet_wrap(
    ~temp,
    labeller = labeller(outcome = outcome_labels, temp = temp_labels)
  ) +
  labs(
    title = "Nautical miles per gallon, by altitude and RPM",
    x = "Pressure Altitude (feet)",
    y = "RPM"
  ) +
  scale_x_continuous(breaks = seq(2000, 12000, 2000)) +
  scale_y_continuous(breaks = seq(2100, 2700, 100))

Short field landing

Continuous numeric predictors:

  • pressure altitude (9 levels)
  • temperature (5 levels)

Numeric output:

  • ground roll in feet
  • total feet to clear a 50 foot obstruction

Not modeled here: headwind/tailwind, non-paved runway, flaps up

Predictive models:

  • mgcv::gam generalized additive model for ground roll and total distance to clear 50’ obstacle
    • however, the gam smooth with bs = "gp" performed poorly, so I used bs = "tp" instead
    • I can't say I fully understand exactly why

Plot shows overlay of original and gam model predictions

Very good agreement, acceptable to me to use.

The curves in the POH are actually too perfect, and were almost certainly produced by a modelled curve rather than by raw test results.

Code
df_landing <- read_csv("landing.csv", col_types = "d") |> 
  pivot_longer(
    cols = starts_with("groundroll") | starts_with("clearfifty"),
    names_to = "metric_temp",
    values_to = "value") |> 
  separate(metric_temp, into = c("metric", "temp"), sep = "_") |> 
  mutate(temp = as.numeric(temp))

# g <- df_landing |> 
#   mutate(temp = as.factor(temp)) |> 
#   # ggplot(aes(p_alt, value, color = )) +
#   ggplot(aes(p_alt, value, color = temp, linetype = metric)) +
#   geom_point() +
#   geom_line() +
#   # facet_wrap(~metric) +
#   theme_bw()
# ggplotly(g)

# gam_landing: bs = "gp" performs poorly
# - list of 2 models, "clearfifty" and "groundroll"
# - predictors: p_alt, temp
gam_landing <- map(
  split(df_landing, df_landing$metric),
  ~ mgcv::gam(value ~ s(p_alt, temp, bs = "tp"), family = gaussian(), data = .x)
)

df_landing_wide <- df_landing |> 
  pivot_wider(
    names_from = metric, values_from = value
  )

df_landing_wide <- df_landing_wide |> 
  mutate(
    groundroll_pred = predict(gam_landing[['groundroll']], df_landing_wide),
    clearfifty_pred = predict(gam_landing[['clearfifty']], df_landing_wide)
  ) |> 
  mutate(
    err_g = groundroll_pred - groundroll,
    err_c = clearfifty_pred - clearfifty
  )

# Confirm that landing predictions match original
g <- df_landing_wide |> 
  transmute(
    p_alt, temp = as.factor(temp),
    groundroll_orig = groundroll,
    clearfifty_orig = clearfifty,
    groundroll_pred,
    clearfifty_pred
  ) |> 
  pivot_longer(
    cols = c(groundroll_orig, clearfifty_orig, groundroll_pred, clearfifty_pred), 
    names_to = c("metric", "model"),
    names_pattern = "(groundroll|clearfifty)_(orig|pred)",
    values_to = "value"
  ) |> 
  ggplot(aes(p_alt, value, color= temp, linetype = model)) +
  geom_line() +
  facet_wrap(~metric) +
  theme_bw()
ggplotly(g)
[Interactive plot: original (solid) vs. gam-predicted (dashed) values versus pressure altitude, colored by temperature (0–40 °C), faceted by metric (clearfifty, groundroll).]
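
A usage sketch with made-up conditions (it assumes gam_landing from the code above):

# Hypothetical conditions: 1,200 ft pressure altitude, 30 °C
new_landing <- tibble(p_alt = 1200, temp = 30)

predict(gam_landing[["groundroll"]], new_landing)  # estimated landing ground roll (ft)
predict(gam_landing[["clearfifty"]], new_landing)  # estimated distance to clear a 50 ft obstacle (ft)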

Shiny app developed

I will probably separately describe the Shiny App that I developed using these calculations.

It has some really nice usability enhancements, like automatically fetching information for a selected airport, including weather and runway headings. That allows calculation of head/tailwind components and pressure and density altitudes, which can then be fed into the performance predictive models.

However, I will probably not share the app or code, because I think that would be irresponsible of me and potentially endanger others.