
Database Schema Documentation

Version: 2.2 (Verified against production database)
Last Validated: 2025-11-09
Database: MongoDB with 11 collections
Total Documents: Production deployment with multiple collections

For application architecture and data flow, see Architecture. For configuration details, see Platform Overview.


Overview

This document describes every MongoDB collection used by the SmartRunning Coach application. The database stores runner profiles, training sessions, workout libraries, training plans, and background job information.

Note: This schema reflects the ACTUAL production database structure, validated against live MongoDB data.


Collections Summary

| Collection | Documents | Purpose | Key Relationships |
| --- | --- | --- | --- |
| runner | 4 | Athlete profiles and settings | Links to sessions, plans, history |
| sessionsFIT | 556 | Raw FIT file data | Links to runner, GridFS files |
| sessionsParquet | 556 | Processed session records | Links to runner, GridFS files |
| sessionStatistics | 556 | Aggregated session metrics | Links to runner, sessions |
| runnerHistory | 1,256 | Historical training load data | Links to runner |
| workouts | - | Workout library | Referenced by training plans |
| trainingplanschema | 2 | Training plan templates | Used to generate training plans |
| trainingplan | 993 | Training plan days with versioning | Links to runner, workouts, supports active plan management |
| background_jobs | 13 | Async job tracking | Links to runner, files |
| fs.files | 1,112 | GridFS file metadata | Links to sessions |
| fs.chunks | 1,644 | GridFS file chunks | Links to fs.files |

Collection Schemas

1. runner

Purpose: Stores athlete profiles, physiological parameters, training metrics, and training plan configuration.

Document Count: 4

Indexes: _id_, runnerID_idx, marathon_date_idx

Fields (38 total):

| Field | Type | Required | Description | Validation |
| --- | --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
| runnerID | String | Yes | Human-readable runner identifier | Unique string (e.g., "Marc_001") |
| name | String | Yes | Runner's full name | Non-empty string |
| age | Integer | Yes | Runner's age | Positive integer |
| Training Load Metrics | | | | |
| CTL_starting | Float | Yes | Initial Chronic Training Load | Non-negative float |
| ATL_starting | Float | Yes | Initial Acute Training Load | Non-negative float |
| CTL_current | Float | Yes | Current Chronic Training Load | Non-negative float |
| ATL_current | Float | Yes | Current Acute Training Load | Non-negative float |
| TSB_current | Float | Yes | Current Training Stress Balance | Float (can be negative) |
| ACWR_current | Float | Yes | Current Acute:Chronic Workload Ratio | Non-negative float |
| Banister Model Parameters | | | | |
| banister_k1 | Float | Yes | Fitness gain coefficient | Positive float |
| banister_k2 | Float | Yes | Fatigue gain coefficient | Positive float |
| banister_tau_f | Float | Yes | Fitness decay time constant (days) | Positive float |
| banister_tau_fr | Float | Yes | Fatigue decay time constant (days) | Positive float |
| Heart Rate Zones (Legacy - Being Migrated) | | | | |
| hr_zone_1_max | Float | No | Zone 1 upper limit (bpm) | 0-220 (legacy) |
| hr_zone_2_max | Float | No | Zone 2 upper limit (bpm) | 0-220 (legacy) |
| hr_zone_3_max | Float | No | Zone 3 upper limit (bpm) | 0-220 (legacy) |
| hr_zone_4_max | Float | No | Zone 4 upper limit (bpm) | 0-220 (legacy) |
| hr_zone_5_max | Float | No | Zone 5 upper limit (bpm) | 0-220 (legacy) |
| heart_rate_zones_history | Array | Yes | Time series HR zones | See Time Series Schema |
| Personal Bests (Legacy - Being Migrated) | | | | |
| pb_400m_seconds | Float | No | 400m personal best time | Positive float (legacy) |
| pb_800m_seconds | Float | No | 800m personal best time | Positive float (legacy) |
| pb_5000m_seconds | Float | No | 5000m personal best time | Positive float (legacy) |
| personal_bests_history | Object | Yes | Time series PBs by distance | See Time Series Schema |
| Training Stress Score Thresholds | | | | |
| rtss_min | Float | Yes | Minimum rTSS for workout | Non-negative float |
| rtss_max | Float | Yes | Maximum rTSS for workout | Non-negative float |
| Speed Metrics | | | | |
| threshold_speed | Float | Yes | Threshold speed (m/s) | Positive float |
| Training Plan Configuration | | | | |
| blocks | Integer | Yes | Number of training blocks | Positive integer |
| weeks_per_block | Integer | Yes | Weeks per training block | Positive integer |
| taper_weeks | Integer | Yes | Number of taper weeks | Positive integer |
| weekly_increment | Float | Yes | Weekly volume increase (%) | Positive float |
| training_days | Array | Yes | Scheduled training days | Array of integers (1-7) |
| marathon_date | DateTime | No | Target marathon date | ISO 8601 datetime |
| prep_start | DateTime | No | Preparation start date | ISO 8601 datetime or null |
| Data Collections | | | | |
| sessions | Array | Yes | Session references | Array (typically empty) |
| schedules | Array | Yes | Schedule references | Array (typically empty) |
| critical_speed_history | Array | No | Historical critical speed data | Array of objects (legacy) |
| critical_speed_time_series | Array | Yes | Time series critical speed | See Time Series Schema |
| Metadata | | | | |
| created_at | DateTime | Yes | Account creation timestamp | ISO 8601 datetime |
| updated_at | DateTime | Yes | Last modification timestamp | ISO 8601 datetime |
| history_last_updated | DateTime | No | Last history update | ISO 8601 datetime |

Critical Speed History Schema:

{
"critical_speed_history": [
{
"date": "2025-09-17T08:55:16.393Z",
"critical_speed": 3.189, // m/s
"w_prime": 285.2, // meters
"r_squared": 0.9986, // model fit quality
"pb_400m_seconds": 65,
"pb_800m_seconds": 131,
"pb_5000m_seconds": 1480
}
]
}
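The numbers in this entry are consistent with a least-squares line through the three personal bests, where distance = critical_speed × time + w_prime. The sketch below shows that fit. It is an assumption about the method, not a description of the application's code, and the function name fitCriticalSpeed is hypothetical. With the PBs shown above, the results round to the stored values (3.189 m/s, 285.2 m, 0.9986).

```javascript
// Least-squares fit of: meters = criticalSpeed * seconds + wPrime.
// ASSUMPTION: the application uses this linear distance-time model; the
// schema document does not state the fitting method.
function fitCriticalSpeed(efforts) {
  const n = efforts.length;
  if (n < 2) throw new Error("need at least two efforts");

  const meanT = efforts.reduce((sum, e) => sum + e.seconds, 0) / n;
  const meanD = efforts.reduce((sum, e) => sum + e.meters, 0) / n;

  let sxx = 0; // spread of times
  let sxy = 0; // co-variation of time and distance
  let syy = 0; // spread of distances
  for (const e of efforts) {
    const dt = e.seconds - meanT;
    const dd = e.meters - meanD;
    sxx += dt * dt;
    sxy += dt * dd;
    syy += dd * dd;
  }

  const criticalSpeed = sxy / sxx;               // slope, m/s
  const wPrime = meanD - criticalSpeed * meanT;  // intercept, meters
  const rSquared = (sxy * sxy) / (sxx * syy);    // fit quality, 0..1
  return { criticalSpeed, wPrime, rSquared };
}

// Personal bests taken from the sample entry above.
const fit = fitCriticalSpeed([
  { meters: 400, seconds: 65 },
  { meters: 800, seconds: 131 },
  { meters: 5000, seconds: 1480 },
]);
console.log(fit);
```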

Sample Document:

{
"_id": ObjectId("68c3f9062a04875c129365d4"),
"runnerID": "Marc_001",
"name": "Marc De Reu",
"age": 55,
"CTL_starting": 45.0,
"ATL_starting": 30.0,
"CTL_current": 49.1,
"ATL_current": 1.2,
"TSB_current": 47.87,
"ACWR_current": 0.025,
"banister_k1": 1.0,
"banister_k2": 2.0,
"banister_tau_f": 42.0,
"banister_tau_fr": 7.0,
"hr_zone_1_max": 100.0,
"hr_zone_2_max": 120.0,
"hr_zone_3_max": 140.0,
"hr_zone_4_max": 160.0,
"hr_zone_5_max": 200.0,
"pb_400m_seconds": 65.0,
"pb_800m_seconds": 131.0,
"pb_5000m_seconds": 1480.0,
"rtss_min": 30.0,
"rtss_max": 150.0,
"threshold_speed": 2.56,
"blocks": 4,
"weeks_per_block": 4,
"taper_weeks": 2,
"weekly_increment": 10.0,
"training_days": [1, 3, 5, 6],
"marathon_date": "2026-01-01T00:00:00",
"prep_start": null,
"sessions": [],
"schedules": [],
"critical_speed_history": [
{
"date": "2025-09-17T08:55:16.393Z",
"critical_speed": 3.189,
"w_prime": 285.2,
"r_squared": 0.9986,
"pb_400m_seconds": 65,
"pb_800m_seconds": 131,
"pb_5000m_seconds": 1480
}
],
"created_at": "2025-09-12T10:42:14.274000",
"updated_at": "2025-09-17T08:55:16.397000",
"history_last_updated": "2025-09-12T11:19:12.011203"
}

2. sessionsFIT

Purpose: Stores raw FIT file data and metadata for uploaded workout sessions.

Document Count: 556

Indexes: _id_ only

Fields:

| Field | Type | Required | Description | Validation |
| --- | --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
| runner_id | ObjectId | Yes | Reference to runner | Valid runner._id |
| file_id | ObjectId | Yes | GridFS file reference | Valid fs.files._id |
| original_filename | String | Yes | Original filename | Non-empty string |
| file_size | Integer | Yes | File size in bytes | Positive integer |
| uploaded_at | DateTime | Yes | Upload timestamp | ISO 8601 datetime |
| processed | Boolean | Yes | Processing status | true/false |
| parquet_files_created | Integer | No | Count of parquet files | Non-negative integer |
| metadata | Object | Yes | FIT file metadata | See Metadata Schema |

Metadata Schema:

{
"metadata": {
"message_types": [
"file_id_mesgs",
"file_creator_mesgs",
"activity_mesgs",
"session_mesgs",
"lap_mesgs",
"record_mesgs",
// ... additional message types
],
"total_messages": 13671,
"errors": [],
"session_info": {
"start_time": "2024-11-26T11:00:48",
"total_elapsed_time": 2986.878, // seconds
"total_distance": 7861.87, // meters
"sport": "running",
"sub_sport": "generic"
}
}
}

Sample Document:

{
"_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"original_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"file_id": ObjectId("68c3f9122a04875c129365d7"),
"file_size": 411633,
"uploaded_at": "2025-09-12T10:42:26.856000",
"processed": true,
"parquet_files_created": 1,
"metadata": {
"message_types": ["file_id_mesgs", "file_creator_mesgs", "activity_mesgs", "..."],
"total_messages": 13671,
"errors": [],
"session_info": {
"start_time": "2024-11-26T11:00:48",
"total_elapsed_time": 2986.878,
"total_distance": 7861.87,
"sport": "running",
"sub_sport": "generic"
}
}
}

3. sessionsParquet

Purpose: Stores processed session data with detailed record-level information in Parquet format.

Document Count: 556

Indexes: _id_ only

Fields:

| Field | Type | Required | Description | Validation |
| --- | --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
| session_id | ObjectId | Yes | Reference to FIT session | Valid sessionsFIT._id |
| runner_id | ObjectId | Yes | Reference to runner | Valid runner._id |
| original_fit_filename | String | Yes | Source FIT filename | Non-empty string |
| parquet_filename | String | Yes | Parquet filename | Non-empty string |
| file_id | ObjectId | Yes | GridFS parquet file | Valid fs.files._id |
| file_size | Integer | Yes | Parquet file size (bytes) | Positive integer |
| created_at | DateTime | Yes | Creation timestamp | ISO 8601 datetime |
| metadata | Object | Yes | Parquet structure metadata | See Metadata Schema |

Metadata Schema:

{
"metadata": {
"rows": 2987, // Number of data records
"columns": 42, // Number of columns
"column_names": [
"cumulative_time",
"timestamp",
"position_lat",
"position_long",
"distance",
"enhanced_speed",
"enhanced_altitude",
"heart_rate",
"cadence",
"temperature",
// ... additional columns
],
"data_types": {
"cumulative_time": "int64",
"timestamp": "datetime64[ns, UTC]",
"position_lat": "float64",
"position_long": "float64",
"distance": "float64",
"heart_rate": "float64",
"cadence": "float64",
// ... additional data types
},
"time_range": {
"start": "2024-11-26T11:00:48+00:00",
"end": "2024-11-26T11:50:34+00:00",
"duration_seconds": 2987
}
}
}

Sample Document:

{
"_id": ObjectId("68c3f9132a04875c129365dc"),
"session_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"original_fit_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"parquet_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX_record_mesgs.parquet",
"file_id": ObjectId("68c3f9132a04875c129365da"),
"file_size": 216719,
"created_at": "2025-09-12T10:42:26.856000",
"metadata": {
"rows": 2987,
"columns": 42,
"column_names": ["cumulative_time", "timestamp", "position_lat", "..."],
"data_types": {
"cumulative_time": "int64",
"timestamp": "datetime64[ns, UTC]",
"heart_rate": "float64"
},
"time_range": {
"start": "2024-11-26T11:00:48+00:00",
"end": "2024-11-26T11:50:34+00:00",
"duration_seconds": 2987
}
}
}

4. sessionStatistics

Purpose: Stores aggregated statistics and derived metrics for training sessions.

Document Count: 556

Indexes: _id_ only

Fields (65 total):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| session_id | ObjectId | Yes | Reference to session |
| runner_id | ObjectId | Yes | Reference to runner |
| processed_timestamp | DateTime | Yes | Calculation timestamp |
| sport | String | Yes | Activity type (e.g., "running") |
| sub_sport | String | Yes | Sub-activity type (e.g., "generic") |
| Heart Rate Statistics | | | |
| hr_mean_bpm | Float | No | Average heart rate |
| hr_median_bpm | Float | No | Median heart rate |
| hr_std_bpm | Float | No | Heart rate std deviation |
| hr_min_bpm | Float | No | Minimum heart rate |
| hr_max_bpm | Float | No | Maximum heart rate |
| hr_max_session_bpm | Float | No | Session max heart rate |
| hr_iqr_bpm | Float | No | Interquartile range |
| hr_5th_percentile_bpm | Float | No | 5th percentile HR |
| hr_95th_percentile_bpm | Float | No | 95th percentile HR |
| Heart Rate Zones | | | |
| hr_zone_1_recovery_pct | Float | No | % time in zone 1 |
| hr_zone_1_recovery_seconds | Integer | No | Seconds in zone 1 |
| hr_zone_2_endurance_pct | Float | No | % time in zone 2 |
| hr_zone_2_endurance_seconds | Integer | No | Seconds in zone 2 |
| hr_zone_3_tempo_pct | Float | No | % time in zone 3 |
| hr_zone_3_tempo_seconds | Integer | No | Seconds in zone 3 |
| hr_zone_4_threshold_pct | Float | No | % time in zone 4 |
| hr_zone_4_threshold_seconds | Integer | No | Seconds in zone 4 |
| hr_zone_5_vo2max_pct | Float | No | % time in zone 5 |
| hr_zone_5_vo2max_seconds | Integer | No | Seconds in zone 5 |
| Speed Statistics | | | |
| speed_mean_m_s | Float | No | Average speed (m/s) |
| speed_median_m_s | Float | No | Median speed (m/s) |
| speed_std_m_s | Float | No | Speed std deviation |
| speed_min_m_s | Float | No | Minimum speed |
| speed_max_m_s | Float | No | Maximum speed |
| speed_iqr_m_s | Float | No | Interquartile range |
| speed_5th_percentile_m_s | Float | No | 5th percentile speed |
| speed_95th_percentile_m_s | Float | No | 95th percentile speed |
| speed_mean_ypm | Float | No | Average speed (yards/min) |
| Pace Statistics | | | |
| pace_mean_min_km | Float | No | Average pace (min/km) |
| pace_median_min_km | Float | No | Median pace (min/km) |
| pace_min_min_km | Float | No | Best pace (min/km) |
| pace_max_min_km | Float | No | Worst pace (min/km) |
| pace_std_min_km | Float | No | Pace std deviation |
| Efficiency Metrics | | | |
| efficiency_factor_mean | Float | No | Average efficiency factor |
| efficiency_factor_median | Float | No | Median efficiency factor |
| efficiency_factor_std | Float | No | Efficiency std deviation |
| Session Totals | | | |
| session_duration_seconds | Integer | Yes | Total duration (seconds) |
| session_duration_minutes | Float | Yes | Total duration (minutes) |
| total_distance_m | Float | Yes | Total distance (meters) |
| total_distance_km | Float | Yes | Total distance (kilometers) |
| Elevation | | | |
| elevation_gain_m | Float | No | Total elevation gain |
| elevation_loss_m | Float | No | Total elevation loss |
| min_altitude_m | Float | No | Minimum altitude |
| max_altitude_m | Float | No | Maximum altitude |
| Cadence Statistics | | | |
| cadence_mean_spm | Float | No | Average cadence (steps/min) |
| cadence_median_spm | Float | No | Median cadence |
| cadence_std_spm | Float | No | Cadence std deviation |
| cadence_min_spm | Float | No | Minimum cadence |
| cadence_max_spm | Float | No | Maximum cadence |
| Personal Bests | | | |
| fastest_400m_seconds | Integer | No | Best 400m split |
| fastest_800m_seconds | Integer | No | Best 800m split |
| fastest_1km_seconds | Integer | No | Best 1km split |
| fastest_5km_seconds | Integer | No | Best 5km split |
| pb_updates | Object | No | Personal best updates |
| Training Metrics | | | |
| performance_index | Float | No | Performance index |
| rtss | Float | Yes | Running Training Stress Score |
| rtss_threshold_speed | Float | No | Threshold speed for rTSS |
| rtss_intensity_factor | Float | No | Intensity factor for rTSS |
| rtss_critical_speed | Float | No | Critical speed for rTSS |

Sample Document:

{
"_id": ObjectId("68c3fa2a2a04875c12936b4f"),
"session_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"processed_timestamp": "2025-09-12T10:47:03.589066",
"sport": "running",
"sub_sport": "generic",
"hr_mean_bpm": 145.48,
"hr_median_bpm": 143.0,
"hr_max_bpm": 166.0,
"hr_zone_3_tempo_pct": 28.89,
"hr_zone_3_tempo_seconds": 863,
"hr_zone_4_threshold_pct": 57.88,
"hr_zone_4_threshold_seconds": 1729,
"speed_mean_m_s": 2.66,
"pace_mean_min_km": 7.72,
"efficiency_factor_mean": 1.20,
"session_duration_seconds": 2986,
"session_duration_minutes": 49.77,
"total_distance_m": 7861.87,
"total_distance_km": 7.86,
"elevation_gain_m": 53.80,
"cadence_mean_spm": 79.31,
"fastest_1km_seconds": 351,
"fastest_5km_seconds": 1882,
"rtss": 89.74,
"rtss_threshold_speed": 2.56,
"rtss_intensity_factor": 1.04
}
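The paired hr_zone_N_*_seconds and hr_zone_N_*_pct fields can be derived from per-second heart-rate records and the runner's zone upper limits. The sketch below is one way to do that, assuming one record per second and that a sample equal to a zone's upper limit belongs to that zone; neither detail is confirmed by this document, and zoneBreakdown is a hypothetical name. The sample document's percentages are consistent with dividing by the 2,987 record rows (863 / 2987 ≈ 28.89%).

```javascript
// Count seconds per heart-rate zone from 1 Hz samples.
// zoneMaxes: ascending upper limits in bpm, e.g. [z1, z2, z3, z4, z5].
// ASSUMPTIONS: one sample per second; hr <= limit falls inside the zone;
// samples above the top limit are counted in the top zone.
function zoneBreakdown(hrSamples, zoneMaxes) {
  const seconds = new Array(zoneMaxes.length).fill(0);
  let counted = 0;

  for (const hr of hrSamples) {
    if (hr == null || Number.isNaN(hr)) continue; // skip sensor dropouts
    let zone = zoneMaxes.findIndex((max) => hr <= max);
    if (zone === -1) zone = zoneMaxes.length - 1;
    seconds[zone] += 1;
    counted += 1;
  }

  // Percentages are rounded to two decimals, as in the sample document.
  return seconds.map((s) => ({
    seconds: s,
    pct: counted === 0 ? 0 : Math.round((s / counted) * 10000) / 100,
  }));
}

// Zone limits taken from the runner sample document; samples are made up.
const zones = zoneBreakdown([90, 110, 110, 150, 210], [100, 120, 140, 160, 200]);
console.log(zones);
```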

5. runnerHistory

Purpose: Stores historical training load progression and daily aggregated metrics.

Document Count: 1,256

Indexes: _id_ only

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| runner_id | ObjectId | Yes | Reference to runner |
| date | DateTime | Yes | History date |
| session_ids | Array | No | Session ObjectIds for day |
| Training Load (Banister Model) | | | |
| rtss | Float | Yes | Running Training Stress Score |
| CTL | Float | Yes | Chronic Training Load |
| ATL | Float | Yes | Acute Training Load |
| TSB | Float | Yes | Training Stress Balance |
| ACWR | Float | Yes | Acute:Chronic Workload Ratio |
| Daily Aggregates | | | |
| total_distance_km | Float | No | Total distance for day |
| session_duration_minutes | Float | No | Total duration for day |
| hr_mean_bpm | Float | No | Average HR for day |
| pace_mean_min_km | Float | No | Average pace for day |
| efficiency_factor_mean | Float | No | Average efficiency for day |
| Calendar Info | | | |
| week_number | Integer | No | ISO week number |
| year | Integer | No | Year |
| Metadata | | | |
| updated_at | DateTime | Yes | Last update timestamp |

Sample Document:

{
"_id": ObjectId("68c4017c2a04875c12937249"),
"runner_id": ObjectId("68c3fa4e2a04875c12936b59"),
"date": "2024-09-07T00:00:00",
"session_ids": [ObjectId("68c3fb7e2a04875c12936f31")],
"rtss": 47.75,
"CTL": 45.0,
"ATL": 30.0,
"TSB": 15.0,
"ACWR": 0.67,
"total_distance_km": 23.58,
"session_duration_minutes": 409.8,
"hr_mean_bpm": 134.30,
"pace_mean_min_km": 11.24,
"efficiency_factor_mean": 0.32,
"week_number": 36,
"year": 2024,
"updated_at": "2025-09-12T11:18:20.076370"
}
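In the sample above, TSB equals CTL minus ATL (45.0 - 30.0 = 15.0) and ACWR equals ATL divided by CTL (30.0 / 45.0 ≈ 0.67). The sketch below shows those two relationships plus one common way to advance CTL and ATL by a day, using exponentially weighted averages with the runner's time constants (42 and 7 days in the runner sample). The daily update rule is an assumption; this document does not give the application's exact formula, and both function names are hypothetical.

```javascript
// Derived values, consistent with the sample documents in this schema.
function deriveBalance(CTL, ATL) {
  return {
    TSB: CTL - ATL,
    ACWR: CTL > 0 ? ATL / CTL : 0, // guard against division by zero
  };
}

// One-day update of training load.
// ASSUMPTION: exponentially weighted averages with time constants in days.
// The application's actual Banister implementation may differ.
function stepTrainingLoad(prev, rtss, tauFitness = 42, tauFatigue = 7) {
  const CTL = prev.CTL + (rtss - prev.CTL) / tauFitness;
  const ATL = prev.ATL + (rtss - prev.ATL) / tauFatigue;
  return { CTL, ATL, ...deriveBalance(CTL, ATL) };
}

// Starting from the sample day, followed by a rest day (rTSS = 0).
const sampleDay = deriveBalance(45.0, 30.0);
const restDay = stepTrainingLoad({ CTL: 45.0, ATL: 30.0 }, 0);
console.log(sampleDay, restDay);
```

Because the fatigue constant is shorter, ATL falls faster than CTL on a rest day, so TSB rises.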

6. trainingplan

Purpose: Stores individual days of training plans with full versioning support. Each document represents a single day in a training plan. Supports versioning for mid-plan regeneration and active plan management.

Document Count: 993

Indexes: _id_, date_idx, training_days_idx, week_idx, session_code_idx, date_training_idx, active_plan_lookup_idx, plan_list_view_idx

Important: This is a flat structure - one document per day, not nested blocks/weeks.

New Features (Version 2.2):

  • Active Plan Management: Each runner has exactly ONE active training plan (identified by is_active: true AND is_latest: true)
  • Generation Parameters: Stores complete parameters used to generate each plan for consistency during regeneration
  • Version Tracking: Full history of plan versions with parent-child relationships
  • Parameter Preservation: Regeneration pre-fills parameters from original plan

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| trainingplanID | String | Yes | Plan identifier (UUID) - same across versions |
| runnerID | String | Yes | Runner identifier |
| date | DateTime | Yes | Date for this day |
| weekday | Integer | Yes | Day of week (0=Monday, 6=Sunday) |
| week | Integer | Yes | Week number in plan |
| phase | String | Yes | Training phase (e.g., "('block', 1)") |
| is_training | Boolean | Yes | Is this a training day? |
| recovery | Boolean | Yes | Is this a recovery day? |
| rTSS | Float | Yes | Planned training stress score |
| CTL | Float | Yes | Projected Chronic Training Load |
| ATL | Float | Yes | Projected Acute Training Load |
| TSB | Float | Yes | Projected Training Stress Balance |
| ACWR | Float | Yes | Projected Acute:Chronic Workload Ratio |
| session_code | String | Yes | Reference to workout (empty if rest day) |
| creationdate | DateTime | Yes | When plan was created |
| version | Integer | Yes | Version number (1, 2, 3, ...) |
| parent_version | Integer | No | Previous version number (null for v1) |
| is_latest | Boolean | Yes | Is this the current version? |
| regeneration_date | DateTime | No | When this version was regenerated (null for v1) |
| regeneration_reason | String | No | Why regeneration occurred (high_acwr, illness, manual, other) |

Query Patterns:

// Get latest version of a training plan (RECOMMENDED)
db.trainingplan.find({
  trainingplanID: "uuid",
  is_latest: true
}).sort({ date: 1 })

// Get all versions of a training plan
db.trainingplan.find({
  trainingplanID: "uuid"
}).sort({ version: -1, date: 1 })

// Get specific version
db.trainingplan.find({
  trainingplanID: "uuid",
  version: 2
}).sort({ date: 1 })

// Get specific week of latest version
db.trainingplan.find({
  trainingplanID: "uuid",
  is_latest: true,
  week: 1
}).sort({ date: 1 })

// Get training days only (latest version)
db.trainingplan.find({
  trainingplanID: "uuid",
  is_latest: true,
  is_training: true
})

// Get version metadata
db.trainingplan.aggregate([
  { $match: { trainingplanID: "uuid" } },
  { $group: {
      _id: "$version",
      parent_version: { $first: "$parent_version" },
      is_latest: { $first: "$is_latest" },
      regeneration_date: { $first: "$regeneration_date" },
      regeneration_reason: { $first: "$regeneration_reason" },
      total_days: { $sum: 1 }
    }
  },
  { $sort: { _id: -1 } }
])

Versioning Behavior:

  • Same trainingplanID across all versions of a plan
  • Only one version has is_latest: true at a time
  • version increments: 1, 2, 3, etc.
  • parent_version links to previous version (forms chain)
  • Historical versions preserved forever (immutable)
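The rules above can be checked against the day documents of a single plan. The following is a minimal sketch that works on plain objects already loaded from the collection. The function names are hypothetical, and the check that each parent_version equals version - 1 is an assumption drawn from "links to previous version"; the application may allow other chains.

```javascript
// Collapse day documents into one summary row per version.
function summarizeVersions(days) {
  const versions = new Map();
  for (const d of days) {
    if (!versions.has(d.version)) {
      versions.set(d.version, {
        version: d.version,
        parent_version: d.parent_version ?? null,
        is_latest: d.is_latest,
        total_days: 0,
      });
    }
    versions.get(d.version).total_days += 1;
  }
  // Newest version first, like the "version metadata" aggregation.
  return [...versions.values()].sort((a, b) => b.version - a.version);
}

// Return a list of rule violations (empty when the chain looks healthy).
function checkVersionChain(days) {
  const summary = summarizeVersions(days);
  const problems = [];

  const latest = summary.filter((v) => v.is_latest);
  if (latest.length !== 1) {
    problems.push(`expected exactly one latest version, found ${latest.length}`);
  }
  for (const v of summary) {
    // ASSUMPTION: version N always points at version N - 1.
    const expectedParent = v.version === 1 ? null : v.version - 1;
    if (v.parent_version !== expectedParent) {
      problems.push(`version ${v.version} has parent_version ${v.parent_version}`);
    }
  }
  return problems;
}

const healthyPlan = [
  { version: 1, parent_version: null, is_latest: false },
  { version: 2, parent_version: 1, is_latest: true },
];
console.log(checkVersionChain(healthyPlan));
```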

Sample Documents:

Version 1 (Original Plan):

{
"_id": ObjectId("68c3f360bff7cc7967116ee8"),
"trainingplanID": "145d05ec-1873-42ea-a5b4-8cf6b8e7a6a2",
"runnerID": "HDL",
"date": "2025-09-01T00:00:00",
"weekday": 0,
"week": 1,
"phase": "('block', 1)",
"is_training": false,
"recovery": false,
"rTSS": 0.0,
"CTL": 12.8,
"ATL": 2.4,
"TSB": 10.4,
"ACWR": 0.19,
"session_code": "",
"creationdate": "2025-09-12T10:16:37.825000",
"version": 1,
"parent_version": null,
"is_latest": false, // Superseded by v2
"regeneration_date": null,
"regeneration_reason": null
}

Version 2 (Regenerated After ACWR Spike):

{
"_id": ObjectId("68c3f361bff7cc7967116ee9"),
"trainingplanID": "145d05ec-1873-42ea-a5b4-8cf6b8e7a6a2", // Same ID
"runnerID": "HDL",
"date": "2025-10-15T00:00:00", // Later date
"weekday": 2,
"week": 8,
"phase": "('block', 2)",
"is_training": true,
"recovery": false,
"rTSS": 85.0, // Re-optimized value
"CTL": 58.3, // Re-calculated
"ATL": 42.1, // Re-calculated
"TSB": 16.2, // Re-calculated
"ACWR": 0.72, // Re-calculated
"session_code": "END4-W8D3-E4925C",
"creationdate": "2025-09-12T10:16:37.825000", // Original creation
"version": 2,
"parent_version": 1,
"is_latest": true, // Current version
"regeneration_date": "2025-11-07T14:30:00.000000",
"regeneration_reason": "high_acwr"
}

7. trainingplanschema

Purpose: Templates for generating periodized training plans with multiple training blocks. This is different from trainingplan - schemas are templates, trainingplan documents are actual scheduled days.

Document Count: 3

Indexes: Not verified

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| name | String | Yes | Schema name |
| description | String | No | Schema description |
| author | String | No | Author name |
| blocks | Array | Yes | Training blocks array |
| is_public | Boolean | Yes | Public visibility flag (true = visible to all) |
| created_by | ObjectId | No | User ID of creator (null for grandfathered data) |
| created_at | DateTime | No | Creation timestamp |
| last_modified | DateTime | No | Last modification timestamp |

Block Schema:

{
"blocks": [
{
"block_type": "Volume 1", // Block name/phase
"workouts": [
{
"day_in_the_week": "Monday",
"day_in_week": 1, // 1-7
"performed": false,
"rtss": 50,
"sequence_of_execution": 1,
"session_code": "END4-W1D1-E4924B",
"session_definition": "Interval session: 5 x 45\"",
"session_type": "interval", // interval, LSD, recovery, tempo
"week_in_block": 1,
"workout_id": null
}
]
}
]
}

Sample Document: See validation report for full example with nested blocks and workouts.


8. workouts

Purpose: Library of reusable workout templates in ZWO (Zwift Workout) XML format.

Document Count: 63

Indexes: Not verified

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| filename | String | Yes | Generated filename (e.g., "Workout-Name-ABC123.zwo") |
| author | String | Yes | Workout author/creator name |
| name | String | Yes | Workout name |
| description | String | No | Workout description |
| sport_type | String | Yes | Sport type (e.g., "run") |
| duration_type | String | Yes | Duration measurement type (e.g., "time") |
| tags | String | No | Comma-separated tags |
| workout_type | String | No | Workout category (interval, LSD, recovery, tempo) |
| session_id | String | No | Session identifier |
| rtss | Float | No | Running Training Stress Score |
| raw_xml | String | Yes | Complete ZWO XML content |
| file_size | Integer | Yes | XML file size in bytes |
| is_public | Boolean | Yes | Public visibility flag (true = visible to all) |
| created_by | ObjectId | No | User ID of creator (null for grandfathered data) |
| uploaded_at | DateTime | Yes | Upload timestamp |
| last_modified | DateTime | Yes | Last modification timestamp |

9. background_jobs

Purpose: Tracks asynchronous background tasks (file processing, analysis).

Document Count: 13

Indexes: _id_ only

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| job_id | String | Yes | UUID job identifier |
| job_type | String | Yes | Job category (e.g., "file_processing") |
| runner_id | ObjectId | No | Associated runner |
| parameters | Object | Yes | Job input parameters |
| status | String | Yes | Current status |
| progress | Integer | Yes | Completion percentage (0-100) |
| result | Object | No | Job output results |
| error | String | No | Error details if failed |
| created_at | DateTime | Yes | Job creation time |
| started_at | DateTime | No | Processing start time |
| completed_at | DateTime | No | Completion time |
| updated_at | DateTime | Yes | Last update time |

Status Values:

  • pending - Queued for processing
  • running - Currently executing
  • completed - Finished successfully
  • failed - Error occurred
  • cancelled - User cancelled
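A job document can be checked against the status list above and the 0-100 progress range before it is written. This is a small sketch with hypothetical helper names; treating completed, failed, and cancelled as final states is an assumption based on their descriptions, not something this document states.

```javascript
// Status values documented for background_jobs.status.
const JOB_STATUSES = ["pending", "running", "completed", "failed", "cancelled"];

// ASSUMPTION: these statuses are final and a job never leaves them.
const FINAL_STATUSES = new Set(["completed", "failed", "cancelled"]);

// Return a list of problems; an empty list means the job passes.
function validateJob(job) {
  const problems = [];
  if (!JOB_STATUSES.includes(job.status)) {
    problems.push(`unknown status: ${job.status}`);
  }
  if (!Number.isInteger(job.progress) || job.progress < 0 || job.progress > 100) {
    problems.push(`progress out of range: ${job.progress}`);
  }
  return problems;
}

// True when a poller can stop checking this job.
function isFinished(job) {
  return FINAL_STATUSES.has(job.status);
}

console.log(validateJob({ status: "completed", progress: 100 }));
console.log(isFinished({ status: "running", progress: 40 }));
```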

Job Type Values:

  • file_processing - FIT file upload and processing

Sample Document:

{
"_id": ObjectId("68c3f9112a04875c129365d5"),
"job_id": "fbc90b6c-2443-4a1d-b4b1-83d5152312a8",
"job_type": "file_processing",
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"parameters": {
"temp_files": ["/app/data/uploads/temp_20250912_104224_941199_WorkoutFileExport.zip"],
"file_count": 1
},
"status": "completed",
"progress": 100,
"result": {
"total_files": 1,
"processed_fit_files": 188,
"processed_parquet_files": 0,
"skipped_duplicates": 0,
"failed_files": 0,
"errors": [],
"session_ids": ["68c3f9122a04875c129365d6", "..."]
},
"error": null,
"created_at": "2025-09-12T10:42:25.039000",
"started_at": null,
"completed_at": "2025-09-12T10:47:02.465000",
"updated_at": "2025-09-12T10:47:02.465000"
}

10. fs.files (GridFS)

Purpose: GridFS metadata for storing large binary files (FIT files, Parquet files).

Document Count: 4,260

Indexes: _id_, filename_1_uploadDate_1

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| filename | String | Yes | Original filename |
| contentType | String | Yes | MIME type |
| chunkSize | Integer | Yes | Chunk size (default 255KB) |
| length | Integer | Yes | File size in bytes |
| uploadDate | DateTime | Yes | Upload timestamp |
| metadata | Object | Yes | File ownership and linkage information (added Oct 2025) |

Metadata Schema:

Since October 2025, all GridFS files include a metadata field for direct file ownership tracking and linkage verification.

For FIT files:

{
"runner_id": "68c3f9062a04875c129365d4", // Links to runner._id (as string)
"file_type": "fit", // Identifies file type
"session_id": "68c3f9122a04875c129365d6", // Links to sessionsFIT._id (as string)
"session_date": "2024-11-26T11:00:48", // Session start time
"uploaded_at": "2025-09-12T10:42:27.047000" // Upload timestamp
}

For Parquet files:

{
"runner_id": "68c3f9062a04875c129365d4", // Links to runner._id (as string)
"file_type": "parquet", // Identifies file type
"fit_session_id": "68c3f9122a04875c129365d6", // Links to sessionsFIT._id
"parquet_session_id": "68c3f9132a04875c129365da", // Links to sessionsParquet._id
"created_at": "2025-09-12T10:42:27.047000" // Creation timestamp
}

Sample Documents:

FIT file:

{
"_id": ObjectId("68c3f9122a04875c129365d7"),
"filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"contentType": "application/octet-stream",
"chunkSize": 261120,
"length": 411633,
"uploadDate": "2025-09-12T10:42:27.047000",
"metadata": {
"runner_id": "68c3f9062a04875c129365d4",
"file_type": "fit",
"session_id": "68c3f9122a04875c129365d6",
"session_date": "2024-11-26T11:00:48",
"uploaded_at": "2025-09-12T10:42:27.047000"
}
}

Parquet file:

{
"_id": ObjectId("68c3f9132a04875c129365da"),
"filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX_record_mesgs.parquet",
"contentType": "application/octet-stream",
"chunkSize": 261120,
"length": 216719,
"uploadDate": "2025-09-12T10:42:27.047000",
"metadata": {
"runner_id": "68c3f9062a04875c129365d4",
"file_type": "parquet",
"fit_session_id": "68c3f9122a04875c129365d6",
"parquet_session_id": "68c3f9132a04875c129365da",
"created_at": "2025-09-12T10:42:27.047000"
}
}

Benefits of Metadata:

  • Direct file ownership queries without joining through sessions
  • Orphaned file detection (files with no corresponding session records)
  • Redundancy protection if sessionsFIT/sessionsParquet get corrupted
  • Simplified runner deletion (can find all files directly)
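Orphan detection, for example, reduces to comparing each file's metadata against the set of known session ids. The sketch below works on plain objects already read from fs.files and on sets of session ids held as strings (the metadata stores ids as strings). The function name is hypothetical, and treating a file without usable metadata as orphaned is an assumption.

```javascript
// Return the files whose metadata points at no known session record.
// fitSessionIds / parquetSessionIds: Set of id strings from sessionsFIT
// and sessionsParquet respectively.
function findOrphanedFiles(files, fitSessionIds, parquetSessionIds) {
  return files.filter((file) => {
    const meta = file.metadata || {};
    if (meta.file_type === "fit") {
      return !fitSessionIds.has(meta.session_id);
    }
    if (meta.file_type === "parquet") {
      return !parquetSessionIds.has(meta.parquet_session_id);
    }
    return true; // ASSUMPTION: missing or unknown metadata counts as orphaned
  });
}

// Made-up ids for illustration only.
const orphans = findOrphanedFiles(
  [
    { _id: "f1", metadata: { file_type: "fit", session_id: "s1" } },
    { _id: "f2", metadata: { file_type: "fit", session_id: "s9" } },
    { _id: "f3", metadata: { file_type: "parquet", parquet_session_id: "p1" } },
    { _id: "f4" },
  ],
  new Set(["s1"]),
  new Set(["p1"])
);
console.log(orphans.map((f) => f._id));
```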

Migration Status:

  • Retroactive migration completed: Oct 21, 2025
  • Files migrated: 4,218 files (2,109 FIT + 2,109 Parquet)
  • Orphaned files found: 42 files (no session records)
  • Future uploads: All new uploads automatically include metadata

11. fs.chunks (GridFS)

Purpose: GridFS chunks storing actual binary data for large files.

Document Count: 1,644

Indexes: _id_, files_id_1_n_1

Fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| _id | ObjectId | Yes | Unique identifier |
| files_id | ObjectId | Yes | Reference to fs.files |
| n | Integer | Yes | Chunk sequence number |
| data | Binary | Yes | Binary chunk data |

Sample Document:

{
"_id": ObjectId("68c3f9122a04875c129365d8"),
"files_id": ObjectId("68c3f9122a04875c129365d7"),
"n": 0,
"data": BinData(0, "...") // Binary data chunk
}
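GridFS splits each file into chunks of chunkSize bytes, with a shorter final chunk, so the number of fs.chunks documents for a file follows from the length and chunkSize fields in fs.files. A minimal sketch, with a hypothetical function name:

```javascript
// Number of fs.chunks documents expected for one fs.files document.
// The default chunk size is 255 KiB (261,120 bytes), matching the
// chunkSize value in the fs.files samples.
function expectedChunkCount(length, chunkSize = 255 * 1024) {
  return Math.ceil(length / chunkSize);
}

// Lengths taken from the two fs.files sample documents.
const fitChunks = expectedChunkCount(411633, 261120);
const parquetChunks = expectedChunkCount(216719, 261120);
console.log(fitChunks, parquetChunks);
```

Chunks are numbered from n = 0, so a file with two chunks has documents with n = 0 and n = 1. Comparing this expected count with the actual number of chunks per files_id is one way to spot incomplete uploads.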

Relationships Diagram

runner (1) ──────< (many) sessionsFIT
  │                         │
  │                         └──> fs.files (GridFS)
  │
  ├──────< (many) sessionsParquet
  │                         │
  │                         └──> fs.files (GridFS)
  │
  ├──────< (many) sessionStatistics
  │                         │
  │                         └──> sessionsParquet (reference)
  │
  ├──────< (many) runnerHistory
  │
  └──────< (many) trainingplan (by runnerID string)
                            │
                            └──> workouts (via session_code)

trainingplanschema ──> Used to generate trainingplan documents

background_jobs ──> runner (optional)
                ──> fs.files (for file processing jobs)

Data Flow

1. Session Upload Flow

User uploads FIT file
    ↓
Store in GridFS (fs.files, fs.chunks)
    ↓
Create sessionsFIT record with metadata
    ↓
Background job processes FIT file
    ↓
Extract records → Store as Parquet in GridFS
    ↓
Create sessionsParquet record with metadata
    ↓
Calculate statistics → Create sessionStatistics record
    ↓
Update runnerHistory with new training load
    ↓
Update runner training load metrics

2. Training Plan Creation Flow

User selects trainingplanschema template
    ↓
System reads runner configuration (blocks, weeks_per_block, training_days, etc.)
    ↓
Generate individual day documents for entire plan duration
    ↓
For each day:
    - Calculate projected training load metrics
    - Assign workouts based on schema
    - Set is_training, recovery flags
    ↓
Insert roughly 100 or more trainingplan documents (one per day)
    ↓
Display calendar view by querying trainingplanID
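The day-generation step can be sketched as a loop that emits one object per calendar day with the date, weekday, week, and is_training fields filled in. This is an illustration only: the function name is hypothetical, load projection and workout assignment are left out, and the code assumes runner.training_days counts from 1 = Monday to 7 = Sunday while trainingplan.weekday counts from 0 = Monday to 6 = Sunday. This document gives the ranges but does not confirm which day training_days starts on.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build the skeleton of a plan: one object per day.
// startDateIso: "YYYY-MM-DD"; trainingDays: e.g. [1, 3, 5, 6].
// ASSUMPTION: trainingDays uses 1 = Monday ... 7 = Sunday.
function buildPlanSkeleton(startDateIso, weeks, trainingDays) {
  const start = new Date(startDateIso + "T00:00:00Z");
  const days = [];
  for (let i = 0; i < weeks * 7; i++) {
    const date = new Date(start.getTime() + i * DAY_MS);
    // JavaScript reports 0 = Sunday; the plan stores 0 = Monday.
    const weekday = (date.getUTCDay() + 6) % 7;
    days.push({
      date: date.toISOString().slice(0, 19), // e.g. "2025-09-01T00:00:00"
      weekday,
      week: Math.floor(i / 7) + 1,
      is_training: trainingDays.includes(weekday + 1),
    });
  }
  return days;
}

// Two weeks starting Monday 2025-09-01, training on four days per week.
const skeleton = buildPlanSkeleton("2025-09-01", 2, [1, 3, 5, 6]);
console.log(skeleton.slice(0, 3));
```

With the runner sample's configuration (4 blocks of 4 weeks plus 2 taper weeks), a loop like this would produce 126 day documents, which fits the scale described in the flow above.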

3. Personal Best Update Flow

User completes workout session
    ↓
Upload and process session (sessionsFIT → sessionsParquet → sessionStatistics)
    ↓
Calculate fastest splits across multiple distances
    ↓
Compare with runner personal best records
    ↓
If new PB: Update runner document
    ↓
If sufficient PBs exist: Recalculate performance metrics
    ↓
Push new entry to runner performance history array
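The comparison step can be sketched as follows: take the fastest splits from a sessionStatistics document, compare them with the best time already recorded in personal_bests_history, and produce the time series entries to append. The function name and the field mapping are assumptions for illustration; in particular, this document does not say that fastest_5km_seconds feeds the "5000m" personal best.

```javascript
// ASSUMPTION: mapping from sessionStatistics split fields to the keys
// used in runner.personal_bests_history.
const SPLIT_TO_PB = {
  fastest_400m_seconds: "400m",
  fastest_800m_seconds: "800m",
  fastest_5km_seconds: "5000m",
};

// Return the time series entries that should be appended, one per new PB.
function collectNewPersonalBests(history, stats, timestamp) {
  const updates = [];
  for (const [field, distance] of Object.entries(SPLIT_TO_PB)) {
    const seconds = stats[field];
    if (seconds == null) continue; // split not available for this session

    const entries = history[distance] || [];
    const best = entries.reduce(
      (min, entry) => Math.min(min, entry.value.seconds),
      Infinity
    );
    if (seconds < best) {
      updates.push({
        timestamp,
        value: { distance, seconds, session_id: String(stats.session_id) },
        source: "session",
        notes: null,
      });
    }
  }
  return updates;
}

// Made-up history and session values for illustration.
const pbUpdates = collectNewPersonalBests(
  { "400m": [{ value: { seconds: 65.5 } }], "5000m": [] },
  { session_id: "abc123", fastest_400m_seconds: 64, fastest_5km_seconds: 1882 },
  "2025-09-12T10:47:03"
);
console.log(pbUpdates.map((u) => u.value));
```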

Performance Considerations

Current Indexes

Actual production indexes:

  • runner: _id_, runnerID_idx, marathon_date_idx
  • trainingplan: _id_, date_idx, training_days_idx, week_idx, session_code_idx, date_training_idx, active_plan_lookup_idx, plan_list_view_idx
  • fs.chunks: _id_, files_id_1_n_1
  • fs.files: _id_, filename_1_uploadDate_1
  • trainingplanschema, workouts: not verified
  • All other collections: _id_ only

Performance Indexes

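Indexes intended to support the system's common queries. The verified production list above shows only _id_ for these collections, so confirm that each index exists before relying on it: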

// Query sessions by runner
db.sessionsFIT.createIndex({ runner_id: 1, uploaded_at: -1 })
db.sessionsParquet.createIndex({ runner_id: 1, created_at: -1 })
db.sessionStatistics.createIndex({ runner_id: 1, processed_timestamp: -1 })

// Session statistics unique constraint
db.sessionStatistics.createIndex({ session_id: 1 }, { unique: true })

// Time series queries on runner history
db.runnerHistory.createIndex({ runner_id: 1, date: -1 })

// Background job queries
db.background_jobs.createIndex({ runner_id: 1, status: 1 })
db.background_jobs.createIndex({ created_at: -1 })

Query Patterns

Most common queries:

  1. Get all sessions for runner: db.sessionsFIT.find({ runner_id: ObjectId(...) })
  2. Get runner history: db.runnerHistory.find({ runner_id: ObjectId(...) }).sort({ date: -1 })
  3. Get training plan days (latest version): db.trainingplan.find({ trainingplanID: "uuid", is_latest: true }).sort({ date: 1 })
  4. Get session statistics: db.sessionStatistics.findOne({ session_id: ObjectId(...) })

Validation and Constraints

Business Logic Constraints

  • runner.marathon_date contains target race date
  • trainingplan.date aligns with trainingplanID date range
  • sessionStatistics.session_id is unique (one stats doc per session)
  • runnerHistory.date is unique per runner (one history doc per day per runner)
  • background_jobs.progress ranges from 0-100

Data Type Constraints

  • ObjectId: Valid 24-character hex string
  • Dates: ISO 8601 format (YYYY-MM-DDTHH:MM:SS)
  • Numbers: Within logical ranges (HR 0-220, speeds > 0, percentages 0-100)

Referential Integrity

When deleting a runner:

  • Should cascade delete: sessionsFIT, sessionsParquet, sessionStatistics, runnerHistory, background_jobs
  • Should handle trainingplan documents (query by runnerID string)
  • Should delete associated GridFS files
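The cascade can be expressed as an ordered list of collection and filter pairs, which makes the two different runner keys explicit: ObjectId references use runner._id, trainingplan uses the runnerID string, and GridFS metadata stores the id as a string. The sketch below only builds that list and does not touch a database. The function name is hypothetical and the order (dependents first, runner last) is an assumption.

```javascript
// Build the list of deletions needed to remove one runner.
// runner: an object with _id and runnerID, as in the runner collection.
function buildRunnerDeletionPlan(runner) {
  const byObjectId = [
    "sessionsFIT",
    "sessionsParquet",
    "sessionStatistics",
    "runnerHistory",
    "background_jobs",
  ].map((collection) => ({ collection, filter: { runner_id: runner._id } }));

  return [
    // GridFS metadata stores runner_id as a string, not an ObjectId.
    { collection: "fs.files", filter: { "metadata.runner_id": String(runner._id) } },
    ...byObjectId,
    // trainingplan references the human-readable runnerID string.
    { collection: "trainingplan", filter: { runnerID: runner.runnerID } },
    // ASSUMPTION: the runner document itself is removed last.
    { collection: "runner", filter: { _id: runner._id } },
  ];
}

const deletionPlan = buildRunnerDeletionPlan({
  _id: "68c3f9062a04875c129365d4",
  runnerID: "Marc_001",
});
console.log(deletionPlan.map((step) => step.collection));
```

When executing the fs.files step, deleting through the driver's GridFS bucket API removes the matching fs.chunks documents as well; deleting fs.files documents directly would leave the chunks behind.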

When deleting a workout from library:

  • Check if referenced by trainingplanschema
  • Prevent deletion or nullify references

Schema Version History

Version 2.2 (Current)

  • Training plan versioning support with parent-child relationships
  • Enhanced query patterns for version filtering
  • Support for mid-plan regeneration with reason tracking

Version 2.1 (Time Series Migration)

  • Added time series structures for heart rate zones, PBs, and critical speed
  • Legacy fields marked for migration
  • New standardized time series schema

Version 2.0 (2025-10-02)

  • Validated against production database
  • Updated runner schema to match actual 38 fields
  • Corrected trainingplan to flat structure (one doc per day)
  • Fixed sessionsFIT/sessionsParquet metadata organization
  • Updated background_jobs field names (parameters/result)
  • Removed non-existent metadata from fs.files
  • Added runnerHistory additional fields
  • Documented current database indexes

Version 1.0 (Initial)

  • Created from code analysis (had discrepancies with actual DB)

Time Series Schema

All time series data follows a consistent structure to track changes over time.

Generic Time Series Entry Structure

{
"timestamp": ISODate("2025-01-01T00:00:00.000Z"), // When this value is effective
"value": <any>, // The actual value (can be object)
"source": "manual|calculated|imported|session", // How this value was obtained
"notes": "Optional notes about this entry" // Optional context
}

Heart Rate Zones Time Series

{
"heart_rate_zones_history": [
{
"timestamp": ISODate("2025-01-01T00:00:00.000Z"),
"value": {
"hr_zone_1_max": 125,
"hr_zone_2_max": 146,
"hr_zone_3_max": 167,
"hr_zone_4_max": 188,
"hr_zone_5_max": 209
},
"source": "age_calculated",
"notes": "Calculated from age using HUNT formula"
}
]
}

Personal Bests Time Series

{
"personal_bests_history": {
"400m": [
{
"timestamp": ISODate("2025-01-15T00:00:00.000Z"),
"value": {
"distance": "400m",
"seconds": 65.5,
"session_id": "optional_session_reference"
},
"source": "session",
"notes": null
}
],
"800m": [...],
"5000m": [...]
}
}

Critical Speed Time Series

{
"critical_speed_time_series": [
{
"timestamp": ISODate("2025-01-15T00:00:00.000Z"),
"value": {
"critical_speed": 3.189, // m/s
"w_prime": 285.2, // anaerobic capacity (meters)
"r_squared": 0.9986, // regression fit quality
"pb_400m_seconds": 65.5, // PBs used for calculation
"pb_800m_seconds": 131.2,
"pb_5000m_seconds": 1480
},
"source": "calculated",
"notes": null
}
]
}

Time Series Query Functions

The application provides utility functions for time series operations:

  • get_value_at_date() - Get the value that was active at a specific date
  • get_latest_value() - Get the most recent value
  • add_value() - Add a new value to the time series
  • get_value_history() - Get historical values within a date range
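The lookup behind the first two helpers can be sketched as a scan for the newest entry whose timestamp is not after the requested date. The functions below are illustrative stand-ins written for this document, not the application's implementation, and their camelCase names are hypothetical. They accept timestamps as ISO 8601 strings or Date objects.

```javascript
const toMillis = (ts) => new Date(ts).getTime();

// Value that was in effect on `date`, or null if the series starts later.
function getValueAtDate(series, date) {
  const cutoff = toMillis(date);
  let active = null;
  for (const entry of series) {
    const t = toMillis(entry.timestamp);
    if (t <= cutoff && (active === null || t >= toMillis(active.timestamp))) {
      active = entry;
    }
  }
  return active === null ? null : active.value;
}

// Most recent value in the series, or null if the series is empty.
function getLatestValue(series) {
  let latest = null;
  for (const entry of series) {
    if (latest === null || toMillis(entry.timestamp) >= toMillis(latest.timestamp)) {
      latest = entry;
    }
  }
  return latest === null ? null : latest.value;
}

// Made-up zone limits for illustration.
const zoneSeries = [
  { timestamp: "2025-01-01T00:00:00.000Z", value: { hr_zone_1_max: 125 }, source: "manual" },
  { timestamp: "2025-03-01T00:00:00.000Z", value: { hr_zone_1_max: 128 }, source: "manual" },
];
console.log(getValueAtDate(zoneSeries, "2025-02-01T00:00:00.000Z"));
console.log(getLatestValue(zoneSeries));
```

Scanning the whole array keeps the sketch correct even if entries are stored out of order; a series kept sorted by timestamp could use a binary search instead.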


Schema Version: 2.2
Last Validated: 2025-11-09
Database: Production MongoDB deployment
Validation Method: Direct MongoDB connection and document analysis
Total Collections: 11
Total Documents: Production scale deployment