Table-to-Visualisation Recommendations

Date: 13 October 2025
Purpose: Identify tables in vu_slides.qmd that would benefit from visualisation
Priority: Ranked by interpretability gains and narrative impact


Executive Summary

Of the 8+ tables in the presentation, 4 would significantly benefit from visualisation. The highest priority is the National Results Table (Slide 11), which would work brilliantly as a waterfall chart showing the cascading effect of constraint relaxation on ECS.


🌟 Top Priority: Must Visualise

1. National Results Table (Slide 11) ⭐⭐⭐

Current state: 02_tables/scenario_tables/results_table_national.html

Data structure:

  • 7 scenarios (S0-S6) showing sequential optimisations
  • 4 columns: ECS value, ECS reduction %, Required educators, Educator reduction %

Why visualise:

  • The sequential nature tells a story that tables obscure
  • Readers must mentally calculate cumulative effects
  • Key finding (S4: -18.8% ECS) gets lost in rows of numbers
  • The cascading constraint relaxation is inherently visual

Recommended visualisations:

  1. Waterfall chart (Primary recommendation)
    • Start at S0 baseline (43.3 ECS)
    • Show incremental reductions as cascading steps
    • Annotate key drops (especially S3→S4 and S4→S5)
    • End at S6 (32.7 ECS, -24.6% total)
    • Impact: Makes the “big jump” at S4 immediately visible
  2. Dual-axis step chart (Alternative)
    • Left axis: Absolute ECS values
    • Right axis: Cumulative percentage reduction
    • Lines connecting scenarios
    • Shaded regions showing constraint types (technical, capital, labour, spatial)
  3. Slope graph with annotations
    • Vertical axis: ECS
    • Horizontal: Scenarios
    • Slopes between points with gradient showing rate of reduction
    • Annotations at inflection points

Implementation notes:

  • Use colour coding: Technical (blue), Capital (green), Labour (red), Spatial (purple)
  • Highlight S4 as the critical margin
  • Include both “with current teachers” and “freeing educators” interpretations
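
A minimal sketch of the waterfall recommendation, assuming the scenario values are available as a named numeric vector. Only S0 (43.3) and S6 (32.7) are quoted in this document, and S4 (about 35.2) is derived from the quoted -18.8% reduction; the S1, S2, S3 and S5 values are hypothetical placeholders to be replaced with the figures in results_table_national.html. The object and column names (`steps`, `ymin`, `ymax`, `largest`) are illustrative, not taken from the project scripts.

```r
# Sketch of a waterfall chart for the national results table.
# S0 and S6 are quoted in this document and S4 is derived from the quoted
# -18.8% reduction; S1-S3 and S5 are HYPOTHETICAL placeholders.
ecs <- c(S0 = 43.3, S1 = 42.5, S2 = 41.8, S3 = 41.0,
         S4 = 35.2, S5 = 33.5, S6 = 32.7)
vals <- unname(ecs)
n <- length(vals)

# One row per step: each bar floats between the previous and the new level.
steps <- data.frame(
  scenario = names(ecs)[-1],
  x        = seq_len(n - 1),
  ymin     = pmin(vals[-n], vals[-1]),
  ymax     = pmax(vals[-n], vals[-1]),
  change   = diff(vals),               # negative values are reductions
  stringsAsFactors = FALSE
)
steps$largest <- steps$change == min(steps$change)  # flags the biggest drop

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  waterfall <- ggplot(steps) +
    geom_hline(yintercept = vals[1], linetype = "dashed") +  # S0 baseline
    geom_rect(aes(xmin = x - 0.4, xmax = x + 0.4,
                  ymin = ymin, ymax = ymax, fill = largest)) +
    geom_text(aes(x = x, y = ymin, label = sprintf("%+.1f", change)),
              vjust = 1.5, size = 3.5) +
    scale_x_continuous(breaks = steps$x, labels = steps$scenario) +
    scale_fill_manual(values = c("FALSE" = "#E4CA60", "TRUE" = "#D28540"),
                      guide = "none") +
    labs(x = "Scenario", y = "ECS")
}
```

Flagging the largest step in the data, rather than hard-coding S4, means the highlight follows the numbers if the table is revised.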


2. Provincial Descriptives Table (Slide 10) ⭐⭐⭐

Current state: 02_tables/provincial_descriptives/table1_prov_descriptive.html

Data structure:

  • 9 rows (8 provinces + National)
  • 10 metrics across 4 categories: Counts, School Means, Class Size & Student Experience, Teacher Ratios

Why visualise:

  • 90 data points to compare (9 rows × 10 columns)
  • Provincial comparisons require scanning back and forth
  • Key disparities (Limpopo vs Northern Cape) obscured by table format
  • Large class rate (>40) is a critical policy metric that needs emphasis

Recommended visualisations:

  1. Small multiples (faceted panel) (Primary recommendation)
    • One panel per province
    • Each panel shows: ECS, TCR, STR, Large class rate
    • Uniform scales for direct comparison
    • Provinces ordered by ECS severity
    • Impact: Instant visual comparison of provincial “profiles”
  2. Grouped bar chart
    • X-axis: Provinces (ordered by ECS)
    • Y-axis: Values
    • Grouped bars for: ECS, Mean School SCR, Mean School TCR (scaled appropriately)
    • Separate panel or colour for Large class rate
    • Impact: Rankings become immediately obvious
  3. Cleveland dot plot
    • Multiple metrics shown as dots connected by lines
    • Each metric in different colour
    • Provinces on Y-axis
    • Impact: Excellent for showing relative positions on multiple metrics
  4. Heatmap with annotations
    • Rows: Provinces
    • Columns: Metrics
    • Colour intensity: Scaled within each metric
    • Annotations: Key numbers overlaid
    • Impact: Pattern recognition across multiple dimensions

Implementation notes:

  • Emphasise Limpopo (highest ECS: 49.6, highest TCR: 1.24, 73% large classes)
  • Highlight Northern Cape as best performer (ECS: 37.0, TCR: 1.14)
  • Consider geographic map overlay showing regional patterns
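
The heatmap option relies on scaling colour within each metric, so that metrics on very different scales (ECS near 40, TCR near 1.2) remain comparable. The sketch below assumes min-max rescaling is an acceptable choice and uses only the two provinces whose values are quoted in this document; the remaining rows and metrics would come from table1_prov_descriptive.html. The helper `rescale01` and the data frame names are illustrative.

```r
# Values for Limpopo and Northern Cape are quoted in this document.
# Other provinces and metrics would come from table1_prov_descriptive.html.
prov <- data.frame(
  province = c("Northern Cape", "Limpopo"),
  ECS      = c(37.0, 49.6),
  TCR      = c(1.14, 1.24),
  stringsAsFactors = FALSE
)

# Min-max rescale one metric to [0, 1] (an ASSUMED choice of scaling).
rescale01 <- function(x) {
  (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE))
}

# Order provinces by ECS severity, worst first.
prov <- prov[order(-prov$ECS), ]

# Long format: one row per province-metric pair, scaled within each metric.
metrics <- c("ECS", "TCR")
long <- do.call(rbind, lapply(metrics, function(m) {
  data.frame(province = prov$province, metric = m,
             value = prov[[m]], scaled = rescale01(prov[[m]]),
             stringsAsFactors = FALSE)
}))

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Reversed levels put the worst province at the top of the y-axis.
  long$province <- factor(long$province, levels = rev(prov$province))
  prov_heatmap <- ggplot(long, aes(x = metric, y = province, fill = scaled)) +
    geom_tile(colour = "white") +
    geom_text(aes(label = value)) +   # overlay the raw numbers
    scale_fill_gradient(low = "#E4CA60", high = "#854B50",
                        name = "Scaled within metric") +
    labs(x = NULL, y = NULL)
}
```

The same long-format data frame could feed a Cleveland dot plot or a faceted panel by swapping the geoms.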


3. Fiscal Leakage Table (Slide 26) ⭐⭐

Current state: 02_tables/fiscal_leakage.html

Data structure:

  • 8 provinces + National total
  • Undelivered teaching % (with S4 and S5 in brackets)
  • Surplus teachers, Yearly cost, Extrapolated cost
  • Provincial budget and % of budget

Why visualise:

  • The R22.3bn figure is headline-worthy but buried in table
  • Provincial variation in fiscal impact important for targeting
  • Bracketed values [S5 estimates] hard to compare visually
  • “Percent of education budget” is key accountability metric

Recommended visualisations:

  1. Horizontal bar chart with overlays (Primary recommendation)
    • Provinces on Y-axis (ordered by cost magnitude)
    • Two grouped bars per province: S4 cost (solid) vs S5 cost (pattern/lighter)
    • Absolute cost (R billions) on X-axis
    • Annotations showing % of provincial education budget
    • National total emphasised at bottom
    • Impact: Immediate sense of where the fiscal leakage is concentrated
  2. Geographic choropleth map
    • South African provinces coloured by fiscal leakage magnitude
    • Graduated colour scale (light to dark for low to high cost)
    • Pop-up annotations with exact figures
    • Impact: Spatial policy targeting becomes obvious
  3. Treemap
    • Rectangle size proportional to fiscal leakage
    • Each province a rectangle
    • Colour intensity = % of provincial budget
    • Labels with province name and R value
    • Impact: Relative magnitude immediately visible
  4. Stacked bar showing breakdown
    • National total bar (R22.3bn [R29.6bn])
    • Stacked segments for each province’s contribution
    • Labels showing provincial %
    • Impact: Shows composition of national figure

Implementation notes:

  • Emphasise the R22.3bn vs R29.6bn comparison (S4 vs S5)
  • Consider dual visualisation: cost magnitude + % of budget
  • Link to provincial education budgets for context
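
One way the grouped horizontal bar chart could be set up is sketched below. Only the national totals (R22.3bn for S4, R29.6bn for S5) come from this document; "Province A" and "Province B" and their costs are hypothetical placeholders that show the expected shape of the data, and the column names `s4_cost` and `s5_cost` are assumptions rather than the names used in fiscal_leakage.html.

```r
# National figures are quoted in this document; the provincial rows are
# HYPOTHETICAL placeholders (yearly cost in R billions).
leakage <- data.frame(
  province = c("Province A", "Province B", "National"),
  s4_cost  = c(3.0, 5.0, 22.3),
  s5_cost  = c(4.0, 6.5, 29.6),
  stringsAsFactors = FALSE
)

# Order provinces by S4 cost (largest first), keeping National as the last row.
is_national <- leakage$province == "National"
leakage <- leakage[order(is_national, -leakage$s4_cost), ]

# Long format: one row per province-scenario pair.
# Reversed levels place National at the bottom of a horizontal chart.
long <- data.frame(
  province = factor(rep(leakage$province, 2), levels = rev(leakage$province)),
  scenario = rep(c("S4", "S5"), each = nrow(leakage)),
  cost     = c(leakage$s4_cost, leakage$s5_cost)
)

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  leakage_plot <- ggplot(long, aes(x = cost, y = province, fill = scenario)) +
    geom_col(position = "dodge") +
    scale_fill_manual(values = c(S4 = "#D28540", S5 = "#E4CA60")) +
    labs(x = "Yearly cost (R billions)", y = NULL, fill = "Scenario")
}
```

Because the national total is several times larger than any single province, it may be worth drawing it in a separate panel so the provincial bars stay readable.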


🔧 Lower Priority: Nice to Have

4. Variance Contribution Tables (Slides 25, 25b) ⭐

Current state:

  • 02_tables/analysis_of_variance/variance_contribution_table_ecs.html
  • 02_tables/analysis_of_variance/variance_contribution_table_scenarios.html

Why visualise:

  • LMG decomposition shows S4 reducibility explains 33.5% of variance
  • Traditional factors (Province 6.3%, Quintile 6.4%, Race 5.5%) are dwarfed
  • Tables work reasonably well but dominance effect could be stronger

Recommended visualisations:

  1. Treemap (Primary recommendation)
    • Rectangle size = variance explained
    • Colour by category type (structural vs efficiency)
    • S4 reducibility would dominate visually
    • Impact: The dominance of misallocation over structural factors becomes undeniable
  2. Horizontal bar chart with R² annotations
    • Bars ordered by contribution size
    • Different colours for different model components
    • R² value shown for full model
    • Impact: Clean, simple, effective
  3. Waffle chart / pictogram
    • 100 squares representing total explained variance
    • S4 reducibility gets ~34 squares
    • Other factors get proportional squares
    • Impact: Intuitive proportional representation

Implementation notes:

  • Emphasise that S4 reducibility > Province + Quintile + Race combined
  • Consider showing both tables side-by-side for comparison
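
The comparison above can be checked and plotted directly from the four shares quoted in this section. This is a sketch of the ordered horizontal bar option; the grouping into "Efficiency" and "Structural" is an assumed reading of the colour-by-category suggestion, and the full tables contain further components not listed here.

```r
# Shares of variance (%) as quoted in this document.
contrib <- data.frame(
  factor_name = c("S4 reducibility", "Province", "Quintile", "Race"),
  share       = c(33.5, 6.3, 6.4, 5.5),
  type        = c("Efficiency", "Structural", "Structural", "Structural"),
  stringsAsFactors = FALSE
)

# Check the headline claim: S4 reducibility exceeds the three structural
# factors combined (6.3 + 6.4 + 5.5 = 18.2).
structural_total <- sum(contrib$share[contrib$type == "Structural"])
s4_share <- contrib$share[contrib$factor_name == "S4 reducibility"]
s4_dominates <- s4_share > structural_total

# Order bars by contribution so the largest sits at the top of the chart.
contrib$factor_name <- factor(contrib$factor_name,
                              levels = contrib$factor_name[order(contrib$share)])

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  variance_plot <- ggplot(contrib, aes(x = share, y = factor_name, fill = type)) +
    geom_col() +
    geom_text(aes(label = sprintf("%.1f%%", share)), hjust = -0.1) +
    scale_fill_manual(values = c(Efficiency = "#D28540", Structural = "#365365")) +
    xlim(0, 40) +                      # leave room for the value labels
    labs(x = "Share of variance explained (%)", y = NULL, fill = NULL)
}
```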


📊 Tables That Work Well As-Is

5. Scenario Definition Table (Slide 281)

Keep as table: Clear, concise, definitional. Serves as reference.

6. Grade Mix vs Inefficiency Regression Table (Slide 23)

Keep as table: Regression coefficients need precision. Visualisation would lose detail.

7. Alternative Explanations Table (Backup Slide 12)

Keep as table: Comprehensive reference; 17 mechanisms. Too detailed for visualisation.

8. Individual School Scenario Tables (Slides 12-18)

Keep as tables: Specific detailed examples showing grade-by-grade allocation. Tables convey precision needed.


Implementation Strategy

Phase 1: Quick Wins (1-2 hours)

  1. National Results Waterfall Chart → Replaces Slide 11 table
    • Highest impact, medium complexity
    • R packages: ggplot2 with custom geom or waterfalls package
  2. Fiscal Leakage Horizontal Bar Chart → Replaces Slide 26 table
    • High impact, low complexity
    • Clear policy message

Phase 2: Enhanced Analysis (2-4 hours)

  1. Provincial Descriptives Small Multiples → Replaces Slide 10 table
    • High impact, higher complexity
    • Consider patchwork or facet_wrap approach
  2. Variance Contribution Treemap → Supplements Slides 25/25b
    • Medium impact, medium complexity
    • Use treemap or ggplot2 + treemapify

Phase 3: Polish & Integration (1-2 hours)

  • Ensure consistent colour schemes across visualisations
  • Add interactivity if presenting with HTML slides (Plotly)
  • Create static high-res versions for PDF export
  • Update narrative text to reference visualisations

File Organisation & Structure

⚠️ IMPORTANT: All visualisation scripts and outputs must be saved in the visualisations folder:

05_slides/vu_presentation/visualisations/
├── 01_national_results_waterfall.R
├── 02_provincial_descriptives_visualisations.R
├── 03_fiscal_leakage_visualisation.R (to be created)
├── 04_variance_contribution_visualisation.R (to be created)
├── outputs/
│   ├── national_results_waterfall.svg
│   ├── national_results_waterfall.png
│   ├── provincial_descriptives_cleveland.svg
│   ├── provincial_descriptives_cleveland.png
│   ├── provincial_descriptives_small_multiples.svg
│   ├── provincial_descriptives_small_multiples.png
│   └── ... (other visualisation outputs in SVG and PNG formats)
├── README.md
├── IMPLEMENTATION_LOG.md
└── table_to_visualisation_recommendations.md (this file)

Naming Conventions

  • Scripts: ##_descriptive_name.R (numbered sequentially)
  • Outputs: descriptive_name.svg and descriptive_name.png (both formats required)
  • All scripts should output to: ./outputs/ (relative to visualisations folder)
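
These conventions can be checked mechanically. The sketch below assumes that "##_descriptive_name.R" means two digits followed by lowercase snake_case; the helper names `is_valid_script` and `missing_formats` are hypothetical and not part of the project.

```r
# TRUE for names such as "01_national_results_waterfall.R".
# ASSUMPTION: two digits, underscore, lowercase snake_case, ".R" extension.
is_valid_script <- function(x) {
  grepl("^[0-9]{2}_[a-z0-9_]+\\.R$", x)
}

# Return the output names that lack either an SVG or a PNG version.
missing_formats <- function(files) {
  stems   <- unique(tools::file_path_sans_ext(files))
  has_svg <- paste0(stems, ".svg") %in% files
  has_png <- paste0(stems, ".png") %in% files
  stems[!(has_svg & has_png)]
}

scripts <- c("01_national_results_waterfall.R",
             "02_provincial_descriptives_visualisations.R")
scripts_ok <- is_valid_script(scripts)

# Example listing in which one PNG has not been generated yet.
outputs <- c("national_results_waterfall.svg",
             "national_results_waterfall.png",
             "provincial_descriptives_cleveland.svg")
incomplete <- missing_formats(outputs)
```

In practice the file vectors would come from `list.files()` run inside the visualisations folder and its outputs subfolder.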

Technical Implementation Notes

Standard Output Directory Configuration

All R scripts should use this output directory:

output_dir <- "/Users/petercourtney/Library/CloudStorage/Dropbox/Master R/projects/allocation/05_slides/vu_presentation/visualisations/outputs/"
if (!dir.exists(output_dir)) dir.create(output_dir, recursive = TRUE)

Colour Palette (Standard Project Palette)

All visualisations should use these colours:

# Standard project colour palette
project_colours <- c(
  "#818E7B",  # Sage green
  "#E4CA60",  # Golden yellow
  "#D28540",  # Orange
  "#854B50",  # Dusty rose
  "#8B6914",  # Dark goldenrod
  "#D2691F",  # Burnt sienna
  "#E2B709",  # Bright yellow
  "#365365"   # Dark slate
)

Usage notes:

  • For 2-category comparisons: Use #D28540 (orange) and #E4CA60 (golden yellow)
  • For provincial comparisons: Assign colours by cost/severity magnitude
  • For sequential data: Order from light to dark or vice versa as appropriate
  • Maintain consistency across all visualisations in the presentation
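
For the light-to-dark ordering, one rough approach is to sort the palette by approximate brightness. The sketch below applies Rec. 709 luma weights to the raw sRGB values, which is an assumption about how "light" and "dark" should be measured here; the helper name `brightness` is illustrative. The palette is qualitative rather than a designed sequential ramp, so the result is worth checking by eye.

```r
# Standard project colour palette (as defined in this document).
project_colours <- c("#818E7B", "#E4CA60", "#D28540", "#854B50",
                     "#8B6914", "#D2691F", "#E2B709", "#365365")

# Approximate brightness on a 0-1 scale: Rec. 709 weights applied to the
# red, green and blue channels (an ASSUMED measure; no gamma correction).
brightness <- function(hex) {
  rgb <- grDevices::col2rgb(hex) / 255      # 3 x n matrix of channel values
  as.numeric(c(0.2126, 0.7152, 0.0722) %*% rgb)
}

# Palette sorted from lightest to darkest, for sequential data.
light_to_dark <- project_colours[order(brightness(project_colours),
                                       decreasing = TRUE)]

# The two colours recommended for 2-category comparisons.
two_category <- c("#D28540", "#E4CA60")
```

Either vector can be passed to `scale_fill_manual(values = ...)` or `scale_colour_manual(values = ...)` in ggplot2.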

Export Specifications

⚠️ CRITICAL: All visualisations must be saved in BOTH SVG and PNG formats.

  • SVG (Scalable Vector Graphics) - PRIMARY FORMAT:
    • Standard dimensions: 12” × 5” (flat), 14” × 10” (tall), or 16” × 10” (wide)
    • Resolution-independent vector format: scales cleanly to any size
    • Text stays sharp when projected or zoomed
    • Works natively in Quarto/reveal.js presentations
    • Usually smaller file sizes than high-resolution PNGs
  • PNG (Portable Network Graphics) - BACKUP FORMAT:
    • Same dimensions as SVG
    • 300 DPI for high-quality raster output
    • Ensures compatibility with all platforms
    • Useful for embedding in PowerPoint, Word, or other non-web formats

Example ggsave template:

# SVG - vector output (PRIMARY FORMAT); ggsave's "svg" device needs the svglite package
ggsave(
  filename = file.path(output_dir, "my_plot.svg"),
  plot = my_plot,
  width = 12,
  height = 5,
  device = "svg",
  bg = "white"
)

# PNG - High resolution backup (BACKUP FORMAT)
ggsave(
  filename = file.path(output_dir, "my_plot.png"),
  plot = my_plot,
  width = 12,
  height = 5,
  device = "png",
  dpi = 300,
  bg = "white"
)

Why both formats?

  • SVG = infinite resolution for web/Quarto presentations
  • PNG = universal compatibility for PowerPoint, Word, email attachments
  • Both formats ensure maximum flexibility for different presentation contexts
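
Since every plot needs the same pair of calls, the template could be wrapped in a small helper. This is a sketch only: `save_both` and `output_paths` are hypothetical names, the default `output_dir = "outputs"` assumes the script runs from the visualisations folder, and the SVG step assumes the svglite package is installed.

```r
# Build the SVG and PNG paths for one plot name inside the output directory.
output_paths <- function(name, output_dir = "outputs") {
  file.path(output_dir, paste0(name, c(".svg", ".png")))
}

# Save a ggplot object in both required formats and return the paths.
# Requires ggplot2, plus svglite for the SVG device.
save_both <- function(plot, name, width = 12, height = 5,
                      output_dir = "outputs") {
  if (!dir.exists(output_dir)) dir.create(output_dir, recursive = TRUE)
  paths <- output_paths(name, output_dir)
  ggplot2::ggsave(paths[1], plot = plot, width = width, height = height,
                  device = "svg", bg = "white")
  ggplot2::ggsave(paths[2], plot = plot, width = width, height = height,
                  device = "png", dpi = 300, bg = "white")
  invisible(paths)
}
```

A call such as `save_both(my_plot, "national_results_waterfall")` would then write both files with the flat 12 × 5 dimensions, and the tall or wide sizes can be passed through `width` and `height`.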


Measurement of Success

A good visualisation should allow the viewer to answer these questions in <5 seconds:

  1. National Results: “Which single step gives the biggest ECS reduction?” → S4 (labour activation)
  2. Provincial Descriptives: “Which province has the worst crowding?” → Limpopo
  3. Fiscal Leakage: “How much money is wasted annually?” → R22.3bn
  4. Variance Decomposition: “What explains crowding most?” → S4 reducibility (33.5%)

If your visualisation achieves this, it’s succeeded.


Next Steps

  1. ✅ Create this document
  2. ✅ Generate waterfall chart for National Results (priority 1)
  3. ✅ Generate Cleveland dot plot for Provincial Descriptives (priority 1)
  4. ✅ Generate small multiples for Provincial Descriptives (priority 1)
  5. ✅ Generate horizontal stacked bar chart for Fiscal Leakage (priority 1)
  6. ⏳ Generate variance contribution visualisation (priority 2)
  7. ⏳ Test rendering in Quarto reveal.js slides
  8. ⏳ Gather feedback and iterate
  9. ⏳ Update slide narrative to reference visualisations

Contact & Questions

For R implementation assistance or visualisation design queries, refer to:

  • Edward Tufte’s principles of graphical excellence
  • Claus Wilke’s Fundamentals of Data Visualization
  • Kieran Healy’s Data Visualization: A Practical Introduction