Metric Definitions

Overview

This section outlines the logic which is applied to arrive at key business metrics. Only the most important metrics are included in this section.

Attendance

Overview

This defines what constitutes a “registered attendance” in the Reclink system, based on the business logic embedded in the dbt data models.

Core Definition

A registered attendance is a record in the attendance table that represents a verified check-in event by a participant to a program session.

Mandatory Criteria

For a record to be considered a valid attendance, it must meet these non-negotiable requirements:

1. Must Have a Check-in Timestamp

WHERE a.checkin_timestamp IS NOT NULL
  • Business Rule: No attendance exists without a timestamp proving the check-in occurred
  • Technical Implementation: checkin_timestamp field must contain a valid integer timestamp
  • Data Transformation: Timestamp is converted to datetime via to_timestamp(checkin_timestamp)

2. Must Not Be Deleted

WHERE a.is_deleted = FALSE
  • Business Rule: Soft-deleted attendance records are excluded from all metrics
  • Purpose: Allows for data correction without losing audit trail

Contextual Requirements

For an attendance to be reportable and include full contextual information, it should link to active related entities:

3. Active Enrolment Required

LEFT JOIN enrolment_data ed ON ad.enrolment_id = ed.enrolment_id
WHERE e.is_deleted = FALSE
  • Business Rule: Attendance must link to a valid, non-deleted enrolment
  • Impact: Without valid enrolment, participant details cannot be retrieved

4. Active Session Required

LEFT JOIN session_data sd ON ad.session_id = sd.session_id  
WHERE s.is_deleted = FALSE
  • Business Rule: Session must be active for attendance to have context
  • Impact: Provides session title, timing, and location information

5. Active Program Required

LEFT JOIN program_data prd ON ad.program_id = prd.program_id
WHERE p.is_deleted = FALSE  
  • Business Rule: Program must be active
  • Impact: Provides program details and LGA attribution via owner_handle

Participant Status Handling

Critical Business Logic: Participant active status does NOT exclude attendance records.

LEFT JOIN participant_data pd ON ed.participant_uuid = pd.participant_uuid
WHERE p.is_active = TRUE  -- Applied in participant_data CTE only

Behaviour:

  • Active Participants: Full participant details populated (name, gender, postcode, etc.)
  • Inactive Participants:
    • Attendance still counted as valid
    • Participant details return as NULL
    • Enrolment details still available for basic information

Business Rationale:

Once a check-in has occurred, that attendance remains historically valid regardless of subsequent changes to participant status.

Data Transformation Rules

Timestamp Processing

to_timestamp(a.checkin_timestamp) AS checkin_at,
to_timestamp(a.checkin_timestamp)::date AS checkin_date
  • Raw integer timestamp converted to PostgreSQL timestamp
  • Date component extracted for daily aggregations

Count Logic

1 AS attendance_count
  • Each qualifying record represents exactly one attendance occurrence
  • No weighting or quantity multipliers applied to participant attendances

Participant Data Prioritisation

COALESCE(pd.first_name, ed.first_name) AS first_name,
COALESCE(pd.last_name, ed.last_name) AS last_name,
COALESCE(pd.preferred_name, ed.preferred_name) AS preferred_name
  • Participant table data takes precedence over enrolment data when available
  • Falls back to enrolment data if participant record unavailable

Geographic Attribution

LEFT JOIN lga_data lgad ON prd.owner_handle = lgad.lga_handle
  • State/LGA: Derived from program’s managing organisation, not participant location
  • Location: Derived from program’s designated location
  • Business Rule: Geographic metrics reflect where the program operates, not where participants live

What’s Excluded

Never Counted as Attendance:

  1. Records with checkin_timestamp IS NULL (no actual check-in occurred)
  2. Records with is_deleted = TRUE (soft-deleted attendances)
  3. Attendances where enrolment, session, or program are deleted

Counted But With Limited Data:

  1. Attendances linked to inactive participants (counted but participant details NULL)
  2. Attendances where related entities have missing optional data

Key Business Assumptions

  1. One Record = One Attendance: No quantity multipliers for participant attendances
  2. Historical Validity: Past attendances remain valid regardless of current participant status
  3. Geographic Attribution: Program location determines geographic metrics, not participant address
  4. Data Quality: NULL timestamps indicate no actual attendance occurred
  5. Audit Trail: Soft deletion preserves data integrity while excluding from active metrics

Usage in Reporting

This definition applies to: - Monthly attendance summaries (attend_sum) - Participant attendance analysis (mart_participant_attendance) - Total attendance metrics (mart_total_attendance) - State-based reporting (attend_by_state_access) - Rolling trend analysis (attend_sum_rolling)

Active Participant

Overview

An active participant is a participant who has met minimum attendance criteria within a specified time window, as defined by configurable business rules. This definition is separate from and independent of the participant.is_active system flag.

Disambiguation from System Active Status

Active Participant (business metric) ≠ System Active Status (data flag)

  • System Active Status: participant.is_active field indicates whether a participant account is enabled in the system
  • Active Participant: Business metric based on recent attendance behaviour regardless of system account status

Business Rules

An active participant is defined by meeting these criteria:

Minimum Attendance Threshold

attendance_count >= 2
  • Variable: active_participant_min_attendance: 2 (configurable in dbt_project.yml)
  • Business Logic: Participant must have attended at least 2 sessions to be considered active
  • Purpose: Distinguishes engaged participants from one-time attendees

Time Window for Activity Measurement

checkin_at >= (current_date - INTERVAL '30' DAY)
  • Variable: active_participant_days_window: 30 (configurable in dbt_project.yml)
  • Business Logic: Only attendances within the last 30 days count toward activity status
  • Rolling Window: Calculation performed daily, creating a rolling 30-day activity assessment

Technical Implementation

Data Sources

  • Attendance Data: From int_attendance_enriched (or equivalent mart)
  • Participant Details: From stg_participant
  • Department Affiliations: From stg_participant_affiliations

Calculation Logic

CASE 
    WHEN attendance_count >= {{ var('active_participant_min_attendance') }} THEN TRUE
    ELSE FALSE
END AS is_active

Key Metrics Calculated

  • Attendance Count: Total attendances within the time window
  • Days Since Last Attendance: Days elapsed since most recent attendance
  • Activity Status: Categorical classification of participant activity level

Activity Status Classifications

Active

  • Criteria: attendance_count >= 2 within last 30 days
  • Status: is_active = TRUE
  • Category: 'Active'

Inactive Recent

  • Criteria: attendance_count > 0 but < 2 within last 30 days
  • Status: is_active = FALSE
  • Category: 'Inactive Recent'

Inactive No Recent

  • Criteria: attendance_count = 0 within last 30 days
  • Status: is_active = FALSE
  • Category: 'Inactive No Recent'

Business Assumptions

Rolling Assessment

  • Active status is recalculated daily based on rolling 30-day window
  • Participants can move between active/inactive status as their attendance patterns change

Attendance Quality

  • All valid attendances (as per attendance definition) count equally toward activity threshold
  • No weighting by program type, duration, or other factors

System Independence

  • Active participant status is independent of participant.is_active system flag
  • Inactive system accounts can still be classified as active participants based on recent attendance

Additional Context Fields

Geographic Information

  • Primary Club: Participant’s designated club affiliation
  • LGA Details: Local Government Area of primary club
  • Department Affiliation: Most recent department/organisation association

Temporal Tracking

  • Reference Date: Date when activity calculation was performed
  • First/Last Attendance Dates: Temporal boundaries of recent activity
  • Days Since Last Attendance: Recency metric for engagement analysis

Configuration Management

Adjustable Thresholds

Both business rules are configurable via dbt variables:

vars:
  active_participant_min_attendance: 2      # Minimum attendances required
  active_participant_days_window: 30        # Days to look back for activity

Impact of Changes

  • Increasing attendance threshold: Fewer participants classified as active
  • Extending time window: More participants likely to meet criteria
  • Reducing time window: Higher recency requirement for active status

Usage in Reporting

This definition is currently implemented in these models:

int_active_participants

  • Purpose: Intermediate model that calculates active participant status
  • Output: Individual participant records with activity flags and metrics
  • Referenced by: Downstream reporting models for agency and geographic analysis

rpt_active_participants_by_lga

  • Purpose: Regional analysis of active participants by Local Government Area
  • Logic: Aggregates participants meeting the active criteria by LGA and month
  • Output: Monthly counts of active participants per LGA with percentage breakdowns

rpt_active_agencies

  • Purpose: Identifies which agencies/departments have active participants
  • Logic: Uses active_agency_min_participants: 1 threshold to determine agency activity status
  • Output: Department-level activity metrics with organisational rollups

Implementation Status

Note: The active participant models appear to exist in compiled/target form but may not be in the current active model set. The logic is defined and the variables are configured, but usage in production reporting should be verified.

Guest Attendance

Overview

A guest attendance represents non-registered attendees at program sessions, captured through the guest data collection system. Unlike registered participant attendance, guests don’t require individual registration and can represent multiple attendees through a quantity multiplier.

Core Definition

A guest attendance is a record in the guest table that represents attendance by unregistered individuals at a program session.

Mandatory Criteria

For a record to be considered a valid guest attendance:

1. Must Not Be Deleted

WHERE is_deleted = FALSE OR is_deleted IS NULL
  • Business Rule: Soft-deleted guest records are excluded from all metrics
  • Default Handling: NULL is_deleted values are treated as non-deleted

2. Must Have Quantity Information

COALESCE(quantity, 1) AS quantity
  • Business Rule: Each guest record represents one or more attendees
  • Default Handling: NULL quantity defaults to 1 person
  • Multiplier Effect: Unlike participant attendance, quantity can exceed 1

Data Transformation Rules

Quantity-Based Counting

quantity AS guest_count
  • Key Difference: Each guest record can represent multiple attendees
  • Multiplier Logic: All demographic calculations use quantity as multiplier
  • Business Impact: A single guest record with quantity = 5 counts as 5 guest attendances

Timestamp Handling

Guest attendance timing is derived from session information rather than direct check-in timestamps:

-- Session date becomes the attendance date
session_date AS checkin_at  -- Using session_date as proxy for attendance time

Temporal Attribution Logic

CASE 
    WHEN s.start_timestamp IS NOT NULL THEN 
        CASE
            WHEN EXTRACT(YEAR FROM to_timestamp(s.start_timestamp)) >= 2020 THEN 
                to_timestamp(s.start_timestamp)
            WHEN EXTRACT(YEAR FROM to_timestamp(s.start_timestamp/1000)) >= 2020 THEN
                to_timestamp(s.start_timestamp/1000)
            ELSE CURRENT_DATE -- Fallback for invalid timestamps
        END
    ELSE g.created_at -- Fallback to guest creation date
END AS session_date
  • Primary Source: Session start timestamp
  • Secondary Source: Guest record creation date
  • Fallback: Current date for invalid data

Demographic Processing

Gender Classification

-- Gender counts are multiplied by quantity
CASE 
    WHEN gender = 'Male' OR gender = 'M' THEN quantity 
    ELSE 0 
END AS guest_male_count
  • Quantity Multiplication: All gender counts multiplied by quantity field
  • Format Handling: Accepts both full names (‘Male’/‘Female’) and abbreviations (‘M’/‘F’)

Age Bracket Logic

Guest age processing uses categorical ranges rather than calculated ages:

-- Age range detection with quantity multiplication
CASE 
    WHEN age LIKE '%0-16%' OR 
         (REGEXP_MATCHES(age, '^[0-9]+$') AND CAST(age AS INTEGER) < 16)
    THEN quantity 
    ELSE 0 
END AS guest_0_16_count

Age Categories: - 0-16: Children and young teens - 16-24: Older teens and young adults
- 25+: Adults - Unknown: NULL or non-matching age values

Cultural Identity Tracking

  • isATSI: Aboriginal and Torres Strait Islander status
  • hasDisability: Disability status indicator
  • isCALD: Culturally and Linguistically Diverse status
  • isPasifika: Pacific Islander heritage status

Geographic Attribution

-- LGA derived from program ownership, not guest location
LEFT JOIN program_data p ON g.program_prc = p.prc
LEFT JOIN lga_data l ON p.owner_handle = l.handle
  • Program-Based: Geographic metrics reflect where the program operates
  • LGA Linkage: Through program owner organisation, not guest postcode
  • Consistency: Matches participant attendance geographic logic

Key Business Rules

No Individual Identity

  • Anonymous: Guests are not linked to individual participant profiles
  • Quantity-Based: Single records can represent multiple people
  • No Historical Tracking: Cannot track individual guest attendance patterns

Session-Based Attribution

  • Session Dependency: All guest attendance attributed to specific sessions
  • Date Inheritance: Attendance date derived from session timing
  • Location Linkage: Geographic data from session/program, not guest address

Demographic Multiplication

  • Quantity Effect: All demographic breakdowns multiplied by quantity field
  • Group Representation: Single record can represent demographic group (e.g., 5 males aged 16-24)

Usage in Reporting

Guest attendance is implemented in these models:

stg_guest

  • Purpose: Basic staging with data cleaning and quantity defaulting
  • Filters: Excludes deleted records, converts timestamps

int_guest_enriched

  • Purpose: Enriches guest data with session and program information
  • Output: Adds session dates, program names, location codes

int_guest_with_lga

  • Purpose: Adds LGA (Local Government Area) information via program linkage
  • Output: Geographic context for regional analysis

mart_total_attendance

  • Purpose: Combines guest and participant attendance for comprehensive metrics
  • Logic: Sums quantity-weighted guest counts with individual participant counts

What’s Excluded

Never Counted as Guest Attendance:

  1. Records with is_deleted = TRUE (soft-deleted guests)
  2. Records without valid session linkage (orphaned guest records)

Limitations in Data:

  1. No Check-in Timestamps: Uses session timing as proxy
  2. Limited Session Details: Session title/handle often unavailable
  3. No Participant UUID: Cannot link to individual identities

Demographic Comparison with Participants

Age Brackets:

  • Guests: 0-16, 16-24, 25+ (categorical)
  • Participants: 0-5, 6-12, 13-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+ (calculated from birth year)

Gender Handling:

  • Both: Support M/F/O/N/NULL variations
  • Guests: All counts multiplied by quantity
  • Participants: Individual counts only

Total Attendance

Overview

Total attendance represents the comprehensive count of all attendees at program sessions, combining both registered participant attendance and guest attendance into unified metrics. This composite measure provides the complete picture of session engagement across all attendee types.

Core Definition

Total attendance is calculated as:

SUM(participant_count + guest_count) AS total_attendance

Where: - Participant Count: Number of registered participants (always 1 per attendance record) - Guest Count: Number of guests (quantity field from guest records, can be >1)

Component Definitions

Participant Attendance Component

1 AS participant_count  -- Each record represents one registered participant
  • Source: mart_participant_attendance
  • Counting Logic: One attendance record = one participant
  • Business Rules: Must meet registered attendance criteria (timestamp + not deleted)

Guest Attendance Component

quantity AS guest_count  -- Each record can represent multiple guests
  • Source: int_guest_with_lga
  • Counting Logic: Quantity field determines attendance count
  • Business Rules: Must not be deleted, defaults to 1 if quantity NULL

Data Integration Method

Union All Approach

-- Combine both datasets
combined_attendance AS (
    SELECT * FROM participant_attendance
    UNION ALL
    SELECT * FROM guest_attendance
)
  • Structural Alignment: Both data sources conform to identical schema
  • Null Handling: Missing fields populated with NULL or 0 as appropriate
  • No Deduplication: UNION ALL preserves all records from both sources

Field Harmonisation

  • Temporal: Both use year/month for aggregation
  • Geographic: Both use LGA information via program linkage
  • Demographic: Standardised gender and age category processing

Aggregation Logic

Primary Calculation

-- Core total attendance metric
SUM(participant_count) AS participant_count,
SUM(guest_count) AS guest_count,
SUM(participant_count + guest_count) AS total_attendance

Session-Level Grouping

GROUP BY
    year,
    month,
    session_id,        -- Session-level analysis
    program_id,
    program_prc,
    location_id,
    lga_name,
    lga_state

Demographic Integration

Gender Demographics

Combined totals calculated by summing component demographics:

-- Combined gender totals
SUM(participant_male_count + guest_male_count) AS total_male_count,
SUM(participant_female_count + guest_female_count) AS total_female_count

Age Demographics Harmonisation

Two parallel age systems maintained:

Detailed Participant Brackets (birth year calculated): - 0-5, 6-12, 13-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+

Guest-Compatible Brackets (categorical ranges): - 0-16, 16-24, 25+

Combined Totals use guest-compatible brackets:

SUM(participant_0_16_count + guest_0_16_count) AS total_0_16_count

Temporal Attribution

Participant Timing

  • Source: Direct check-in timestamps (checkin_at)
  • Precision: Exact moment of check-in
  • Reliability: High - actual recorded events

Guest Timing

  • Source: Session start timestamps (session_date)
  • Precision: Session-level approximation
  • Reliability: Moderate - inferred from session timing

Combined Temporal Logic

  • Aggregation Level: Year/month groupings accommodate timing differences
  • Business Impact: Monthly totals valid despite timing precision differences

Geographic Attribution

Consistent LGA Assignment

Both components use identical geographic logic:

-- Program to LGA linkage for both participants and guests
LEFT JOIN program_data p ON [attendance].program_prc = p.prc
LEFT JOIN lga_data l ON p.owner_handle = l.handle
  • Program-Based: Geographic metrics reflect program delivery location
  • Consistency: Ensures participant and guest totals are geographically comparable

Key Business Rules

No Double Counting

  • Separation: Participants and guests are mutually exclusive categories
  • Clear Boundaries: No individual can be both participant and guest in same session
  • Audit Trail: Separate source tables prevent overlap

Quantity Multiplication

  • Participants: Always count as 1 per record
  • Guests: Count multiplied by quantity field
  • Total Impact: Guest quantity significantly affects totals

Data Quality Filters

Both components apply consistent exclusion rules: - Deleted Records: Excluded from both participant and guest data - Valid Sessions: Both require active session linkage - Program Validation: Both filtered to active programs

Usage in Reporting

Total attendance is implemented in:

mart_total_attendance

  • Purpose: Primary total attendance mart with full demographic breakdowns
  • Granularity: Monthly aggregations by session, program, LGA
  • Output: Combined metrics with separate participant/guest detail

Aggregation Dimensions

Temporal Dimensions

  • Year/Month: Primary temporal grouping
  • Year-Month Composite: (year * 100) + month for sorting
  • Month Names: Human-readable month labels

Program Dimensions

  • Program PRC: Primary program identifier
  • Program Details: Title, subtitle, status
  • Session Information: ID, title, timing

Geographic Dimensions

  • LGA: Local Government Area name and state
  • Location: Physical venue information
  • State: Australian state/territory

Key Assumptions

Attendance Validity

  • Historical Accuracy: Past attendance remains valid regardless of current status
  • Session Attribution: All attendance correctly attributed to intended sessions
  • Geographic Consistency: Program locations accurately reflect delivery geography

Demographic Reliability

  • Guest Demographics: Self-reported guest demographics are reasonably accurate
  • Age Calculations: Participant birth years produce meaningful age brackets
  • Gender Classification: Gender categories adequately capture attendee diversity

3-Month Rolling Average Attendance

Overview

The 3-Month Rolling Average Attendance metric provides a smoothed trend analysis of monthly attendance patterns by calculating the average attendance count across the current month and the two preceding months. This metric helps identify seasonal patterns and overall attendance trends while reducing the impact of monthly fluctuations.

Core Definition

A rolling calculation that averages the monthly registered attendance count for a 3-month window, including the current month and the 2 preceding months. The calculation moves forward chronologically, creating a new 3-month window for each month.

Mandatory Criteria

For a record to be included in the rolling average calculation:

1. Valid Check-in Timestamp

WHERE checkin_dt IS NOT NULL
  • Business Rule: Only attendance records with valid check-in timestamps are counted
  • Impact: Records without timestamps are excluded from all calculations

2. Non-deleted Records

-- Implicit filter from staging model
WHERE is_deleted = false OR is_deleted IS NULL
  • Business Rule: Deleted attendance records are excluded from calculations
  • Impact: Soft-deleted records do not contribute to attendance counts

3. Minimum Window Requirement

ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  • Business Rule: Rolling average requires at least the current month’s data
  • Impact: The first two months in the dataset will have partial windows (1-month and 2-month averages respectively)

Data Transformation Rules

Timestamp Processing

CASE
  WHEN checkin_timestamp IS NOT NULL THEN to_timestamp(checkin_timestamp)
  ELSE NULL
END AS checkin_dt
  • Logic: Converts Unix timestamp to proper timestamp format
  • Purpose: Enables date extraction functions for monthly grouping

Monthly Aggregation

SELECT
  EXTRACT(YEAR FROM CAST(checkin_dt AS TIMESTAMP)) AS year,
  EXTRACT(MONTH FROM CAST(checkin_dt AS TIMESTAMP)) AS month,
  COUNT(attendance_handle) AS attendance_count
  • Logic: Groups attendance records by year and month, counting unique attendance handles
  • Purpose: Creates base monthly counts for rolling average calculation

Rolling Average Calculation

AVG(attendance_count) OVER (
  ORDER BY year, month
  ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS rolling_3_month_avg
  • Logic: Window function calculating average across 3-month sliding window
  • Purpose: Smooths monthly variations to reveal underlying trends

Current Month Exclusion

CASE
  WHEN year = (SELECT MAX(year) FROM rolling_avg) 
    AND month = (SELECT MAX(month) FROM rolling_avg WHERE year = (SELECT MAX(year) FROM rolling_avg))
  THEN NULL
  ELSE ROUND(rolling_3_month_avg, 0)
END AS rolling_3_month_avg
  • Logic: Sets rolling average to NULL for the most recent month in the dataset
  • Purpose: Indicates incomplete data for the current period

Temporal Attribution

Window Definition

  • 3-Month Window: Current month plus 2 preceding months
  • Chronological Order: Windows advance monthly in chronological sequence
  • Incomplete Windows: First month has 1-month window, second month has 2-month window

Data Currency

  • Current Month Handling: Latest month shows NULL for rolling average to indicate potential incompleteness
  • Historical Stability: All previous months show final calculated rolling averages

Key Business Assumptions

  1. Attendance Completeness: Monthly attendance data is complete for all months except potentially the current month
  2. Timestamp Reliability: Check-in timestamps accurately reflect when attendance occurred
  3. Deletion Logic: Soft-deleted records should not contribute to attendance metrics
  4. Trend Analysis Purpose: 3-month window provides appropriate smoothing for operational trend analysis

Usage in Reporting

This definition is currently implemented in these models:

attend_sum_rolling

  • Purpose: Primary mart model providing monthly attendance with rolling averages
  • Logic: Implements complete rolling average calculation with current month exclusion
  • Output: Year, month, attendance count, and 3-month rolling average

attend_sum

  • Purpose: Intermediate model providing base monthly attendance counts
  • Logic: Aggregates attendance records by year/month from staging data
  • Output: Foundation data for rolling average calculations

attendance_t

  • Purpose: Staging model with timestamp processing and data quality
  • Logic: Converts Unix timestamps and applies basic filtering
  • Output: Clean attendance data ready for aggregation

What’s Excluded

Never Counted in Rolling Average:

  1. Attendance records with NULL check-in timestamps
  2. Soft-deleted attendance records (is_deleted = true)
  3. The most recent month in the dataset (marked as NULL)

Counted But With Limited Data:

  1. First month in dataset (1-month “rolling” average)
  2. Second month in dataset (2-month “rolling” average)

Configuration Management

Rounding Precision

  • Current Setting: Rounded to 0 decimal places (whole numbers)
  • Rationale: Attendance counts are discrete values, fractional averages rounded for clarity

Window Size

  • Current Setting: 3 months (2 preceding + current)
  • Flexibility: Window size is hardcoded but could be parameterised for different analysis needs

Quality Considerations

Data Completeness

  • Rolling averages become more reliable after the third month of data
  • Current month exclusion prevents misleading trends from incomplete data

Temporal Consistency

  • Chronological ordering ensures proper window calculation
  • Year/month extraction maintains consistent temporal grouping

Business Intelligence Impact

  • Smoothed trends support strategic planning and resource allocation
  • Seasonal pattern identification aids operational forecasting

Active Agencies

Overview

Active Agencies identifies member agencies that have participants meeting current activity thresholds. This metric supports partnership management, resource allocation, and inter-agency engagement tracking by measuring organisational-level participation in programs.

Core Definition

A member agency is classified as “active” when it has at least one participant who meets the active participant criteria within the reporting period. The metric aggregates individual participant activity to the departmental and organisational level.

Mandatory Criteria

For an agency to be considered in the active agencies calculation:

1. Valid Department Affiliation

WHERE member_department_uuid IS NOT NULL
  • Business Rule: Only participants with valid department affiliations are counted
  • Impact: Participants without departmental links don’t contribute to agency metrics

2. Active Participant Threshold

CASE 
    WHEN COALESCE(a.active_participant_count, 0) >= 1 THEN TRUE
    ELSE FALSE
END AS is_active
  • Business Rule: Agencies need minimum 1 active participant to be classified as active
  • Impact: Configurable threshold via active_agency_min_participants variable

3. Recent Activity Window

-- From underlying int_active_participants
WHERE ae.checkin_at >= (current_date - INTERVAL '30' DAY)
  • Business Rule: Activity measured within last 30 days
  • Impact: Agencies with only historical activity are not considered active

Data Transformation Rules

Participant Affiliation Resolution

ROW_NUMBER() OVER (PARTITION BY pa.participant_uuid 
                   ORDER BY pa.affiliation_created_ts DESC) AS affiliation_rank
  • Logic: Uses most recent department affiliation per participant
  • Purpose: Handles participants with multiple or changing affiliations

Agency Completeness

-- Get all departments including those with no active participants
all_departments AS (
    SELECT DISTINCT
        member_department_uuid,
        member_department_name,
        member_department_org
    FROM stg_participant_affiliations
    WHERE member_department_uuid IS NOT NULL
)
  • Logic: Includes all known departments, even those with zero active participants
  • Purpose: Provides complete view of all potential partner agencies

Activity Status Classification

CASE
    WHEN is_active THEN 'Active'
    WHEN active_participant_count > 0 THEN 'Inactive - Below Threshold'
    ELSE 'Inactive - No Active Participants'
END AS activity_status
  • Logic: Three-tier classification system for agency engagement levels
  • Purpose: Distinguishes between different types of inactive states

Organisational Attribution

Department-level Metrics

  • Primary unit of measurement is the department (member_department_uuid)
  • Departments rolled up to parent organisations (member_department_org)

Organisation-level Aggregation

SUM(active_participant_count) OVER (
    PARTITION BY member_department_org
) AS total_active_participants_in_org
  • Logic: Aggregates all active participants across departments within an organisation
  • Purpose: Enables organisation-level partnership analysis

Key Business Assumptions

  1. Affiliation Currency: Most recent department affiliation accurately represents current relationship
  2. Activity Inheritance: Department activity reflects organisational engagement
  3. Threshold Sensitivity: Single active participant indicates meaningful agency engagement
  4. Reference Date Consistency: All calculations use same reference date for temporal accuracy

Usage in Reporting

This definition is currently implemented in these models:

rpt_active_agencies

  • Purpose: Primary mart model for agency partnership reporting
  • Logic: Implements complete active agency classification with organisational rollups
  • Output: Department and organisation-level activity metrics with rankings

int_active_participants

  • Purpose: Foundation model providing participant-level activity status
  • Logic: Determines which participants meet activity criteria for agency attribution
  • Output: Participant activity flags with department affiliation data

stg_participant_affiliations

  • Purpose: Staging model providing clean department affiliation data
  • Logic: Handles data quality issues and creates derived timestamps
  • Output: Standardised participant-department relationships

What’s Excluded

Never Counted as Active Agencies:

  1. Departments with no participants meeting activity criteria
  2. Participants without valid department affiliations
  3. Departments where all participants fall below activity thresholds

Counted But With Limited Data:

  1. Departments with participants below threshold (marked as “Inactive - Below Threshold”)
  2. Departments with historical but no recent activity

Agency Classifications

Active Agency Status

  • Active: Has ≥1 active participant meeting attendance and recency criteria
  • Inactive - Below Threshold: Has participants but none meeting activity criteria
  • Inactive - No Active Participants: No participants with recent activity

Organisational Hierarchy

  • Department Level: Primary classification unit using member_department_uuid
  • Organisation Level: Aggregate metrics using member_department_org
  • Cross-organisational: Rankings and comparisons across all agencies

Configuration Management

Configurable Thresholds

vars:
  active_agency_min_participants: 1
  active_participant_min_attendance: 2
  active_participant_days_window: 30
  • Agency Threshold: Minimum active participants required for active status
  • Participant Criteria: Inherited from active participant definition
  • Time Window: Activity measurement period

Reference Date Handling

COALESCE(a.reference_date, current_date) AS reference_date
  • Logic: Uses calculation date or current date for agencies without active participants
  • Purpose: Ensures temporal consistency across all agency records

Attendance Date

Overview

The calendar date when attendance was recorded, derived from session start timestamps for both registered participants and guests.

Core Definition

The date component extracted from session start timestamps (start_timestamp), converted to calendar date for attendance attribution.

Mandatory Criteria

1. Valid Session Start Timestamp

-- Session timestamp conversion with fallback handling
CASE 
    WHEN s.start_timestamp IS NOT NULL THEN 
        CASE
            WHEN EXTRACT(YEAR FROM to_timestamp(s.start_timestamp)) >= 2020 THEN 
                to_timestamp(s.start_timestamp)
            WHEN EXTRACT(YEAR FROM to_timestamp(s.start_timestamp/1000)) >= 2020 THEN
                to_timestamp(s.start_timestamp/1000)
            ELSE CURRENT_DATE
        END
    ELSE g.created_at -- Fallback for guests only
END AS session_date
  • Business Rule: All attendance dates derive from session scheduling, not check-in timing
  • Impact: Records without valid session timestamps use fallback dates or are excluded

2. Date Boundary Validation

session_date >= '2023-08-01'::date
AND session_date <= CURRENT_DATE
  • Business Rule: Attendance dates must be within valid operational period
  • Impact: Historical data before August 2023 and future dates excluded

Data Transformation Rules

Epoch Timestamp Conversion

-- Handles both second and millisecond epoch formats
to_timestamp(s.start_timestamp)  -- Direct conversion
to_timestamp(s.start_timestamp/1000)  -- Millisecond conversion
  • Logic: Attempts direct conversion, then millisecond division if year < 2020
  • Purpose: Accommodates inconsistent timestamp formats in source data

Guest Fallback Logic

ELSE g.created_at -- Fallback to guest creation date if no session date
  • Logic: Uses guest creation timestamp when session start unavailable
  • Purpose: Ensures all guest records have attendance dates

Usage in Reporting

mart_attendance_detail

  • Purpose: Detail-level attendance with date dimensions
  • Logic: Uses session_start_at for participants, session_date for guests
  • Output: Standardised attendance_date field across attendance types

int_guest_enriched

  • Purpose: Guest data enrichment with session dating
  • Logic: Derives session_date from session.start_timestamp with fallbacks
  • Output: Consistent dating for guest attendance analysis

What’s Excluded

Never Counted as Attendance Date:

  1. Check-in timestamps (due to post-hoc data entry issues)
  2. Records with session dates before August 2023
  3. Future-dated attendance records

Counted But With Limited Data:

  1. Sessions with completely invalid timestamps (fallback to current date)
  2. Guest records without session data (use guest creation date)

Key Business Assumptions

  1. Session Priority: Session start time is authoritative for attendance dating
  2. Epoch Format Variation: Source data may use second or millisecond epochs
  3. Operational Window: Valid attendance data begins August 2023