Database Schema Documentation
Version: 2.2 (Verified against production database)
Last Validated: 2025-11-09
Database: MongoDB with 11 collections
Total Documents: Production deployment with multiple collections
For application architecture and data flow, see Architecture. For configuration details, see Platform Overview.
Overview
This document describes every MongoDB collection used by the SmartRunning Coach application. The database stores runner profiles, training sessions, workout libraries, training plans, and background job information.
Note: This schema reflects the ACTUAL production database structure, validated against live MongoDB data.
Collections Summary
| Collection | Documents | Purpose | Key Relationships |
|---|---|---|---|
runner | 4 | Athlete profiles and settings | Links to sessions, plans, history |
sessionsFIT | 556 | Raw FIT file data | Links to runner, GridFS files |
sessionsParquet | 556 | Processed session records | Links to runner, GridFS files |
sessionStatistics | 556 | Aggregated session metrics | Links to runner, sessions |
runnerHistory | 1,256 | Historical training load data | Links to runner |
workouts | 63 | Workout library | Referenced by training plans |
trainingplanschema | 2 | Training plan templates | Used to generate training plans |
trainingplan | 993 | Training plan days with versioning | Links to runner, workouts, supports active plan management |
background_jobs | 13 | Async job tracking | Links to runner, files |
fs.files | 1,112 | GridFS file metadata | Links to sessions |
fs.chunks | 1,644 | GridFS file chunks | Links to fs.files |
Collection Schemas
1. runner
Purpose: Stores athlete profiles, physiological parameters, training metrics, and training plan configuration.
Document Count: 4
Indexes: _id_, runnerID_idx, marathon_date_idx
Fields (38 total):
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
_id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
runnerID | String | Yes | Human-readable runner identifier | Unique string (e.g., "Marc_001") |
name | String | Yes | Runner's full name | Non-empty string |
age | Integer | Yes | Runner's age | Positive integer |
| Training Load Metrics | ||||
CTL_starting | Float | Yes | Initial Chronic Training Load | Non-negative float |
ATL_starting | Float | Yes | Initial Acute Training Load | Non-negative float |
CTL_current | Float | Yes | Current Chronic Training Load | Non-negative float |
ATL_current | Float | Yes | Current Acute Training Load | Non-negative float |
TSB_current | Float | Yes | Current Training Stress Balance | Float (can be negative) |
ACWR_current | Float | Yes | Current Acute:Chronic Workload Ratio | Non-negative float |
| Banister Model Parameters | ||||
banister_k1 | Float | Yes | Fitness gain coefficient | Positive float |
banister_k2 | Float | Yes | Fatigue gain coefficient | Positive float |
banister_tau_f | Float | Yes | Fitness decay time constant (days) | Positive float |
banister_tau_fr | Float | Yes | Fatigue decay time constant (days) | Positive float |
| Heart Rate Zones (Legacy - Being Migrated) | ||||
hr_zone_1_max | Float | No | Zone 1 upper limit (bpm) | 0-220 (legacy) |
hr_zone_2_max | Float | No | Zone 2 upper limit (bpm) | 0-220 (legacy) |
hr_zone_3_max | Float | No | Zone 3 upper limit (bpm) | 0-220 (legacy) |
hr_zone_4_max | Float | No | Zone 4 upper limit (bpm) | 0-220 (legacy) |
hr_zone_5_max | Float | No | Zone 5 upper limit (bpm) | 0-220 (legacy) |
heart_rate_zones_history | Array | Yes | Time series HR zones | See Time Series Schema |
| Personal Bests (Legacy - Being Migrated) | ||||
pb_400m_seconds | Float | No | 400m personal best time | Positive float (legacy) |
pb_800m_seconds | Float | No | 800m personal best time | Positive float (legacy) |
pb_5000m_seconds | Float | No | 5000m personal best time | Positive float (legacy) |
personal_bests_history | Object | Yes | Time series PBs by distance | See Time Series Schema |
| Training Stress Score Thresholds | ||||
rtss_min | Float | Yes | Minimum rTSS for workout | Non-negative float |
rtss_max | Float | Yes | Maximum rTSS for workout | Non-negative float |
| Speed Metrics | ||||
threshold_speed | Float | Yes | Threshold speed (m/s) | Positive float |
| Training Plan Configuration | ||||
blocks | Integer | Yes | Number of training blocks | Positive integer |
weeks_per_block | Integer | Yes | Weeks per training block | Positive integer |
taper_weeks | Integer | Yes | Number of taper weeks | Positive integer |
weekly_increment | Float | Yes | Weekly volume increase (%) | Positive float |
training_days | Array | Yes | Scheduled training days | Array of integers (1-7) |
marathon_date | DateTime | No | Target marathon date | ISO 8601 datetime |
prep_start | DateTime | No | Preparation start date | ISO 8601 datetime or null |
| Data Collections | ||||
sessions | Array | Yes | Session references | Array (typically empty) |
schedules | Array | Yes | Schedule references | Array (typically empty) |
critical_speed_history | Array | No | Historical critical speed data | Array of objects (legacy) |
critical_speed_time_series | Array | Yes | Time series critical speed | See Time Series Schema |
| Metadata | ||||
created_at | DateTime | Yes | Account creation timestamp | ISO 8601 datetime |
updated_at | DateTime | Yes | Last modification timestamp | ISO 8601 datetime |
history_last_updated | DateTime | No | Last history update | ISO 8601 datetime |
Critical Speed History Schema:
{
"critical_speed_history": [
{
"date": "2025-09-17T08:55:16.393Z",
"critical_speed": 3.189, // m/s
"w_prime": 285.2, // meters
"r_squared": 0.9986, // model fit quality
"pb_400m_seconds": 65,
"pb_800m_seconds": 131,
"pb_5000m_seconds": 1480
}
]
}
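The stored values are consistent with a linear distance-time model, d = CS · t + D′, fitted to the three personal bests by ordinary least squares. The sketch below is an illustration under that assumption; the function name `fitCriticalSpeed` and the input shape are hypothetical and not taken from the application code.

```javascript
// Least-squares fit of distance (m) on time (s): d = CS * t + D'.
// Assumption: this linear model is what produced the stored values.
function fitCriticalSpeed(efforts) {
  const n = efforts.length;
  const meanT = efforts.reduce((sum, e) => sum + e.seconds, 0) / n;
  const meanD = efforts.reduce((sum, e) => sum + e.meters, 0) / n;

  let sxx = 0;
  let sxy = 0;
  let syy = 0;
  for (const { seconds, meters } of efforts) {
    sxx += (seconds - meanT) ** 2;
    sxy += (seconds - meanT) * (meters - meanD);
    syy += (meters - meanD) ** 2;
  }

  const criticalSpeed = sxy / sxx;              // slope, m/s
  const wPrime = meanD - criticalSpeed * meanT; // intercept, meters
  const rSquared = (sxy * sxy) / (sxx * syy);   // model fit quality
  return { criticalSpeed, wPrime, rSquared };
}

// Personal bests from the sample entry above.
const fit = fitCriticalSpeed([
  { meters: 400, seconds: 65 },
  { meters: 800, seconds: 131 },
  { meters: 5000, seconds: 1480 },
]);
console.log(fit.criticalSpeed.toFixed(3), fit.wPrime.toFixed(1), fit.rSquared.toFixed(4));
```

With the sample personal bests this prints `3.189 285.2 0.9986`, which matches the `critical_speed`, `w_prime`, and `r_squared` values shown in the schema.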
Sample Document:
{
"_id": ObjectId("68c3f9062a04875c129365d4"),
"runnerID": "Marc_001",
"name": "Marc De Reu",
"age": 55,
"CTL_starting": 45.0,
"ATL_starting": 30.0,
"CTL_current": 49.1,
"ATL_current": 1.2,
"TSB_current": 47.87,
"ACWR_current": 0.025,
"banister_k1": 1.0,
"banister_k2": 2.0,
"banister_tau_f": 42.0,
"banister_tau_fr": 7.0,
"hr_zone_1_max": 100.0,
"hr_zone_2_max": 120.0,
"hr_zone_3_max": 140.0,
"hr_zone_4_max": 160.0,
"hr_zone_5_max": 200.0,
"pb_400m_seconds": 65.0,
"pb_800m_seconds": 131.0,
"pb_5000m_seconds": 1480.0,
"rtss_min": 30.0,
"rtss_max": 150.0,
"threshold_speed": 2.56,
"blocks": 4,
"weeks_per_block": 4,
"taper_weeks": 2,
"weekly_increment": 10.0,
"training_days": [1, 3, 5, 6],
"marathon_date": "2026-01-01T00:00:00",
"prep_start": null,
"sessions": [],
"schedules": [],
"critical_speed_history": [
{
"date": "2025-09-17T08:55:16.393Z",
"critical_speed": 3.189,
"w_prime": 285.2,
"r_squared": 0.9986,
"pb_400m_seconds": 65,
"pb_800m_seconds": 131,
"pb_5000m_seconds": 1480
}
],
"created_at": "2025-09-12T10:42:14.274000",
"updated_at": "2025-09-17T08:55:16.397000",
"history_last_updated": "2025-09-12T11:19:12.011203"
}
2. sessionsFIT
Purpose: Stores raw FIT file data and metadata for uploaded workout sessions.
Document Count: 556
Indexes: _id_ only
Fields:
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
_id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
runner_id | ObjectId | Yes | Reference to runner | Valid runner._id |
file_id | ObjectId | Yes | GridFS file reference | Valid fs.files._id |
original_filename | String | Yes | Original filename | Non-empty string |
file_size | Integer | Yes | File size in bytes | Positive integer |
uploaded_at | DateTime | Yes | Upload timestamp | ISO 8601 datetime |
processed | Boolean | Yes | Processing status | true/false |
parquet_files_created | Integer | No | Count of parquet files | Non-negative integer |
metadata | Object | Yes | FIT file metadata | See Metadata Schema |
Metadata Schema:
{
"metadata": {
"message_types": [
"file_id_mesgs",
"file_creator_mesgs",
"activity_mesgs",
"session_mesgs",
"lap_mesgs",
"record_mesgs",
// ... additional message types
],
"total_messages": 13671,
"errors": [],
"session_info": {
"start_time": "2024-11-26T11:00:48",
"total_elapsed_time": 2986.878, // seconds
"total_distance": 7861.87, // meters
"sport": "running",
"sub_sport": "generic"
}
}
}
Sample Document:
{
"_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"original_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"file_id": ObjectId("68c3f9122a04875c129365d7"),
"file_size": 411633,
"uploaded_at": "2025-09-12T10:42:26.856000",
"processed": true,
"parquet_files_created": 1,
"metadata": {
"message_types": ["file_id_mesgs", "file_creator_mesgs", "activity_mesgs", "..."],
"total_messages": 13671,
"errors": [],
"session_info": {
"start_time": "2024-11-26T11:00:48",
"total_elapsed_time": 2986.878,
"total_distance": 7861.87,
"sport": "running",
"sub_sport": "generic"
}
}
}
3. sessionsParquet
Purpose: Stores processed session data with detailed record-level information in Parquet format.
Document Count: 556
Indexes: _id_ only
Fields:
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
_id | ObjectId | Yes | Unique identifier | MongoDB ObjectId |
session_id | ObjectId | Yes | Reference to FIT session | Valid sessionsFIT._id |
runner_id | ObjectId | Yes | Reference to runner | Valid runner._id |
original_fit_filename | String | Yes | Source FIT filename | Non-empty string |
parquet_filename | String | Yes | Parquet filename | Non-empty string |
file_id | ObjectId | Yes | GridFS parquet file | Valid fs.files._id |
file_size | Integer | Yes | Parquet file size (bytes) | Positive integer |
created_at | DateTime | Yes | Creation timestamp | ISO 8601 datetime |
metadata | Object | Yes | Parquet structure metadata | See Metadata Schema |
Metadata Schema:
{
"metadata": {
"rows": 2987, // Number of data records
"columns": 42, // Number of columns
"column_names": [
"cumulative_time",
"timestamp",
"position_lat",
"position_long",
"distance",
"enhanced_speed",
"enhanced_altitude",
"heart_rate",
"cadence",
"temperature",
// ... additional columns
],
"data_types": {
"cumulative_time": "int64",
"timestamp": "datetime64[ns, UTC]",
"position_lat": "float64",
"position_long": "float64",
"distance": "float64",
"heart_rate": "float64",
"cadence": "float64",
// ... additional data types
},
"time_range": {
"start": "2024-11-26T11:00:48+00:00",
"end": "2024-11-26T11:50:34+00:00",
"duration_seconds": 2987
}
}
}
Sample Document:
{
"_id": ObjectId("68c3f9132a04875c129365dc"),
"session_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"original_fit_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"parquet_filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX_record_mesgs.parquet",
"file_id": ObjectId("68c3f9132a04875c129365da"),
"file_size": 216719,
"created_at": "2025-09-12T10:42:26.856000",
"metadata": {
"rows": 2987,
"columns": 42,
"column_names": ["cumulative_time", "timestamp", "position_lat", "..."],
"data_types": {
"cumulative_time": "int64",
"timestamp": "datetime64[ns, UTC]",
"heart_rate": "float64"
},
"time_range": {
"start": "2024-11-26T11:00:48+00:00",
"end": "2024-11-26T11:50:34+00:00",
"duration_seconds": 2987
}
}
}
4. sessionStatistics
Purpose: Stores aggregated statistics and derived metrics for training sessions.
Document Count: 556
Indexes: _id_ only
Fields (65 total):
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
session_id | ObjectId | Yes | Reference to session |
runner_id | ObjectId | Yes | Reference to runner |
processed_timestamp | DateTime | Yes | Calculation timestamp |
sport | String | Yes | Activity type (e.g., "running") |
sub_sport | String | Yes | Sub-activity type (e.g., "generic") |
| Heart Rate Statistics | |||
hr_mean_bpm | Float | No | Average heart rate |
hr_median_bpm | Float | No | Median heart rate |
hr_std_bpm | Float | No | Heart rate std deviation |
hr_min_bpm | Float | No | Minimum heart rate |
hr_max_bpm | Float | No | Maximum heart rate |
hr_max_session_bpm | Float | No | Session max heart rate |
hr_iqr_bpm | Float | No | Interquartile range |
hr_5th_percentile_bpm | Float | No | 5th percentile HR |
hr_95th_percentile_bpm | Float | No | 95th percentile HR |
| Heart Rate Zones | |||
hr_zone_1_recovery_pct | Float | No | % time in zone 1 |
hr_zone_1_recovery_seconds | Integer | No | Seconds in zone 1 |
hr_zone_2_endurance_pct | Float | No | % time in zone 2 |
hr_zone_2_endurance_seconds | Integer | No | Seconds in zone 2 |
hr_zone_3_tempo_pct | Float | No | % time in zone 3 |
hr_zone_3_tempo_seconds | Integer | No | Seconds in zone 3 |
hr_zone_4_threshold_pct | Float | No | % time in zone 4 |
hr_zone_4_threshold_seconds | Integer | No | Seconds in zone 4 |
hr_zone_5_vo2max_pct | Float | No | % time in zone 5 |
hr_zone_5_vo2max_seconds | Integer | No | Seconds in zone 5 |
| Speed Statistics | |||
speed_mean_m_s | Float | No | Average speed (m/s) |
speed_median_m_s | Float | No | Median speed (m/s) |
speed_std_m_s | Float | No | Speed std deviation |
speed_min_m_s | Float | No | Minimum speed |
speed_max_m_s | Float | No | Maximum speed |
speed_iqr_m_s | Float | No | Interquartile range |
speed_5th_percentile_m_s | Float | No | 5th percentile speed |
speed_95th_percentile_m_s | Float | No | 95th percentile speed |
speed_mean_ypm | Float | No | Average speed (yards/min) |
| Pace Statistics | |||
pace_mean_min_km | Float | No | Average pace (min/km) |
pace_median_min_km | Float | No | Median pace (min/km) |
pace_min_min_km | Float | No | Best pace (min/km) |
pace_max_min_km | Float | No | Worst pace (min/km) |
pace_std_min_km | Float | No | Pace std deviation |
| Efficiency Metrics | |||
efficiency_factor_mean | Float | No | Average efficiency factor |
efficiency_factor_median | Float | No | Median efficiency factor |
efficiency_factor_std | Float | No | Efficiency std deviation |
| Session Totals | |||
session_duration_seconds | Integer | Yes | Total duration (seconds) |
session_duration_minutes | Float | Yes | Total duration (minutes) |
total_distance_m | Float | Yes | Total distance (meters) |
total_distance_km | Float | Yes | Total distance (kilometers) |
| Elevation | |||
elevation_gain_m | Float | No | Total elevation gain |
elevation_loss_m | Float | No | Total elevation loss |
min_altitude_m | Float | No | Minimum altitude |
max_altitude_m | Float | No | Maximum altitude |
| Cadence Statistics | |||
cadence_mean_spm | Float | No | Average cadence (steps/min) |
cadence_median_spm | Float | No | Median cadence |
cadence_std_spm | Float | No | Cadence std deviation |
cadence_min_spm | Float | No | Minimum cadence |
cadence_max_spm | Float | No | Maximum cadence |
| Personal Bests | |||
fastest_400m_seconds | Integer | No | Best 400m split |
fastest_800m_seconds | Integer | No | Best 800m split |
fastest_1km_seconds | Integer | No | Best 1km split |
fastest_5km_seconds | Integer | No | Best 5km split |
pb_updates | Object | No | Personal best updates |
| Training Metrics | |||
performance_index | Float | No | Performance index |
rtss | Float | Yes | Running Training Stress Score |
rtss_threshold_speed | Float | No | Threshold speed for rTSS |
rtss_intensity_factor | Float | No | Intensity factor for rTSS |
rtss_critical_speed | Float | No | Critical speed for rTSS |
Sample Document:
{
"_id": ObjectId("68c3fa2a2a04875c12936b4f"),
"session_id": ObjectId("68c3f9122a04875c129365d6"),
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"processed_timestamp": "2025-09-12T10:47:03.589066",
"sport": "running",
"sub_sport": "generic",
"hr_mean_bpm": 145.48,
"hr_median_bpm": 143.0,
"hr_max_bpm": 166.0,
"hr_zone_3_tempo_pct": 28.89,
"hr_zone_3_tempo_seconds": 863,
"hr_zone_4_threshold_pct": 57.88,
"hr_zone_4_threshold_seconds": 1729,
"speed_mean_m_s": 2.66,
"pace_mean_min_km": 7.72,
"efficiency_factor_mean": 1.20,
"session_duration_seconds": 2986,
"session_duration_minutes": 49.77,
"total_distance_m": 7861.87,
"total_distance_km": 7.86,
"elevation_gain_m": 53.80,
"cadence_mean_spm": 79.31,
"fastest_1km_seconds": 351,
"fastest_5km_seconds": 1882,
"rtss": 89.74,
"rtss_threshold_speed": 2.56,
"rtss_intensity_factor": 1.04
}
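The sample values are consistent with the common definition rTSS = duration in hours × IF² × 100, and with simple unit conversions for the session totals. The sketch below assumes that definition; `computeRtss` and `sessionTotals` are hypothetical helper names, and the elapsed time is the `total_elapsed_time` from the sessionsFIT metadata of the same session.

```javascript
// Assumed definition: rTSS = hours * intensityFactor^2 * 100.
function computeRtss(durationSeconds, intensityFactor) {
  const hours = durationSeconds / 3600;
  return hours * intensityFactor ** 2 * 100;
}

// Derived totals, rounded to two decimals as in the sample document.
function sessionTotals(durationSeconds, distanceMeters) {
  const round2 = (x) => Math.round(x * 100) / 100;
  return {
    session_duration_minutes: round2(durationSeconds / 60),
    total_distance_km: round2(distanceMeters / 1000),
  };
}

const rtss = computeRtss(2986.878, 1.04);
const totals = sessionTotals(2986, 7861.87);
console.log(rtss.toFixed(2), totals);
```

For the sample session this gives an rTSS of about 89.74, 49.77 minutes, and 7.86 km. The stored intensity factor is rounded, so an exact match on other sessions is not guaranteed.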
5. runnerHistory
Purpose: Stores historical training load progression and daily aggregated metrics.
Document Count: 1,256
Indexes: _id_ only
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
runner_id | ObjectId | Yes | Reference to runner |
date | DateTime | Yes | History date |
session_ids | Array | No | Session ObjectIds for day |
| Training Load (Banister Model) | |||
rtss | Float | Yes | Running Training Stress Score |
CTL | Float | Yes | Chronic Training Load |
ATL | Float | Yes | Acute Training Load |
TSB | Float | Yes | Training Stress Balance |
ACWR | Float | Yes | Acute:Chronic Workload Ratio |
| Daily Aggregates | |||
total_distance_km | Float | No | Total distance for day |
session_duration_minutes | Float | No | Total duration for day |
hr_mean_bpm | Float | No | Average HR for day |
pace_mean_min_km | Float | No | Average pace for day |
efficiency_factor_mean | Float | No | Average efficiency for day |
| Calendar Info | |||
week_number | Integer | No | ISO week number |
year | Integer | No | Year |
| Metadata | |||
updated_at | DateTime | Yes | Last update timestamp |
Sample Document:
{
"_id": ObjectId("68c4017c2a04875c12937249"),
"runner_id": ObjectId("68c3fa4e2a04875c12936b59"),
"date": "2024-09-07T00:00:00",
"session_ids": [ObjectId("68c3fb7e2a04875c12936f31")],
"rtss": 47.75,
"CTL": 45.0,
"ATL": 30.0,
"TSB": 15.0,
"ACWR": 0.67,
"total_distance_km": 23.58,
"session_duration_minutes": 409.8,
"hr_mean_bpm": 134.30,
"pace_mean_min_km": 11.24,
"efficiency_factor_mean": 0.32,
"week_number": 36,
"year": 2024,
"updated_at": "2025-09-12T11:18:20.076370"
}
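The TSB and ACWR values in the sample follow from CTL and ATL under the usual definitions (TSB = CTL − ATL, ACWR = ATL / CTL). The sketch below assumes those definitions; `deriveLoadMetrics` is a hypothetical name, and returning `null` for a zero chronic load is an illustrative choice.

```javascript
// Assumed definitions: TSB = CTL - ATL, ACWR = ATL / CTL.
function deriveLoadMetrics(ctl, atl) {
  const tsb = ctl - atl;
  // ACWR is undefined without chronic load; null is an illustrative choice.
  const acwr = ctl > 0 ? atl / ctl : null;
  return { tsb, acwr };
}

// CTL and ATL from the sample document above.
const load = deriveLoadMetrics(45.0, 30.0);
console.log(load.tsb, load.acwr.toFixed(2));
```

For the sample document (CTL 45.0, ATL 30.0) this yields a TSB of 15 and an ACWR of 0.67.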
6. trainingplan
Purpose: Stores individual days of training plans with full versioning support. Each document represents a single day in a training plan. Supports versioning for mid-plan regeneration and active plan management.
Document Count: 993
Indexes: _id_, date_idx, training_days_idx, week_idx, session_code_idx, date_training_idx, active_plan_lookup_idx, plan_list_view_idx
Important: This is a flat structure - one document per day, not nested blocks/weeks.
New Features (Version 2.2):
- Active Plan Management: Each runner has exactly ONE active training plan (identified by is_active: true AND is_latest: true)
- Generation Parameters: Stores the complete parameters used to generate each plan for consistency during regeneration
- Version Tracking: Full history of plan versions with parent-child relationships
- Parameter Preservation: Regeneration pre-fills parameters from the original plan
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
trainingplanID | String | Yes | Plan identifier (UUID) - same across versions |
runnerID | String | Yes | Runner identifier |
date | DateTime | Yes | Date for this day |
weekday | Integer | Yes | Day of week (0=Monday, 6=Sunday) |
week | Integer | Yes | Week number in plan |
phase | String | Yes | Training phase (e.g., "('block', 1)") |
is_training | Boolean | Yes | Is this a training day? |
recovery | Boolean | Yes | Is this a recovery day? |
rTSS | Float | Yes | Planned training stress score |
CTL | Float | Yes | Projected Chronic Training Load |
ATL | Float | Yes | Projected Acute Training Load |
TSB | Float | Yes | Projected Training Stress Balance |
ACWR | Float | Yes | Projected Acute:Chronic Workload Ratio |
session_code | String | Yes | Reference to workout (empty if rest day) |
creationdate | DateTime | Yes | When plan was created |
version | Integer | Yes | Version number (1, 2, 3, ...) |
parent_version | Integer | No | Previous version number (null for v1) |
is_latest | Boolean | Yes | Is this the current version? |
regeneration_date | DateTime | No | When this version was regenerated (null for v1) |
regeneration_reason | String | No | Why regeneration occurred (high_acwr, illness, manual, other) |
Query Patterns:
// Get latest version of a training plan (RECOMMENDED)
db.trainingplan.find({
trainingplanID: "uuid",
is_latest: true
}).sort({ date: 1 })
// Get all versions of a training plan
db.trainingplan.find({
trainingplanID: "uuid"
}).sort({ version: -1, date: 1 })
// Get specific version
db.trainingplan.find({
trainingplanID: "uuid",
version: 2
}).sort({ date: 1 })
// Get specific week of latest version
db.trainingplan.find({
trainingplanID: "uuid",
is_latest: true,
week: 1
}).sort({ date: 1 })
// Get training days only (latest version)
db.trainingplan.find({
trainingplanID: "uuid",
is_latest: true,
is_training: true
})
// Get version metadata
db.trainingplan.aggregate([
{ $match: { trainingplanID: "uuid" } },
{ $group: {
_id: "$version",
parent_version: { $first: "$parent_version" },
is_latest: { $first: "$is_latest" },
regeneration_date: { $first: "$regeneration_date" },
regeneration_reason: { $first: "$regeneration_reason" },
total_days: { $sum: 1 }
}
},
{ $sort: { _id: -1 } }
])
Versioning Behavior:
- Same trainingplanID across all versions of a plan
- Only one version has is_latest: true at a time
- version increments: 1, 2, 3, etc.
- parent_version links to previous version (forms chain)
- Historical versions preserved forever (immutable)
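The versioning rules above can be sketched as an in-memory transformation. This is an illustration only: `regeneratePlan` and its arguments are hypothetical names, and a real implementation would perform the equivalent updates and inserts against the trainingplan collection.

```javascript
// Hypothetical helper: produce the document set after regenerating a plan.
// existingDays: all stored day documents for one trainingplanID.
// newDays: freshly generated day documents (without version fields).
function regeneratePlan(existingDays, newDays, reason, now = new Date()) {
  const current = existingDays.find((d) => d.is_latest);
  if (!current) throw new Error("no latest version to regenerate from");

  // Older versions are preserved; only their is_latest flag is cleared.
  const superseded = existingDays.map((d) => ({ ...d, is_latest: false }));

  const created = newDays.map((d) => ({
    ...d,
    trainingplanID: current.trainingplanID, // same ID across all versions
    runnerID: current.runnerID,
    creationdate: current.creationdate,     // original creation time is kept
    version: current.version + 1,
    parent_version: current.version,        // forms the version chain
    is_latest: true,
    regeneration_date: now,
    regeneration_reason: reason,
  }));

  return [...superseded, ...created];
}

const v1Days = [{
  trainingplanID: "plan-1", runnerID: "HDL", date: "2025-09-01T00:00:00",
  creationdate: "2025-09-12T10:16:37.825000", version: 1, parent_version: null,
  is_latest: true, regeneration_date: null, regeneration_reason: null,
}];
const afterRegeneration = regeneratePlan(
  v1Days,
  [{ date: "2025-10-15T00:00:00", rTSS: 85.0 }],
  "high_acwr",
  new Date("2025-11-07T14:30:00Z")
);
console.log(afterRegeneration.map((d) => [d.version, d.is_latest]));
```

The function returns new objects instead of mutating its input, which keeps the sketch easy to test; it does not model the is_active flag.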
Sample Documents:
Version 1 (Original Plan):
{
"_id": ObjectId("68c3f360bff7cc7967116ee8"),
"trainingplanID": "145d05ec-1873-42ea-a5b4-8cf6b8e7a6a2",
"runnerID": "HDL",
"date": "2025-09-01T00:00:00",
"weekday": 0,
"week": 1,
"phase": "('block', 1)",
"is_training": false,
"recovery": false,
"rTSS": 0.0,
"CTL": 12.8,
"ATL": 2.4,
"TSB": 10.4,
"ACWR": 0.19,
"session_code": "",
"creationdate": "2025-09-12T10:16:37.825000",
"version": 1,
"parent_version": null,
"is_latest": false, // Superseded by v2
"regeneration_date": null,
"regeneration_reason": null
}
Version 2 (Regenerated After ACWR Spike):
{
"_id": ObjectId("68c3f361bff7cc7967116ee9"),
"trainingplanID": "145d05ec-1873-42ea-a5b4-8cf6b8e7a6a2", // Same ID
"runnerID": "HDL",
"date": "2025-10-15T00:00:00", // Later date
"weekday": 2,
"week": 8,
"phase": "('block', 2)",
"is_training": true,
"recovery": false,
"rTSS": 85.0, // Re-optimized value
"CTL": 58.3, // Re-calculated
"ATL": 42.1, // Re-calculated
"TSB": 16.2, // Re-calculated
"ACWR": 0.72, // Re-calculated
"session_code": "END4-W8D3-E4925C",
"creationdate": "2025-09-12T10:16:37.825000", // Original creation
"version": 2,
"parent_version": 1,
"is_latest": true, // Current version
"regeneration_date": "2025-11-07T14:30:00.000000",
"regeneration_reason": "high_acwr"
}
7. trainingplanschema
Purpose: Templates for generating periodized training plans with multiple training blocks. This collection differs from trainingplan: schemas are templates, whereas trainingplan documents are actual scheduled days.
Document Count: 3
Indexes: Not verified
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
name | String | Yes | Schema name |
description | String | No | Schema description |
author | String | No | Author name |
blocks | Array | Yes | Training blocks array |
is_public | Boolean | Yes | Public visibility flag (true = visible to all) |
created_by | ObjectId | No | User ID of creator (null for grandfathered data) |
created_at | DateTime | No | Creation timestamp |
last_modified | DateTime | No | Last modification timestamp |
Block Schema:
{
"blocks": [
{
"block_type": "Volume 1", // Block name/phase
"workouts": [
{
"day_in_the_week": "Monday",
"day_in_week": 1, // 1-7
"performed": false,
"rtss": 50,
"sequence_of_execution": 1,
"session_code": "END4-W1D1-E4924B",
"session_definition": "Interval session: 5 x 45\"",
"session_type": "interval", // interval, LSD, recovery, tempo
"week_in_block": 1,
"workout_id": null
}
]
}
]
}
Sample Document: See validation report for full example with nested blocks and workouts.
8. workouts
Purpose: Library of reusable workout templates in ZWO (Zwift Workout) XML format.
Document Count: 63
Indexes: Not verified
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
filename | String | Yes | Generated filename (e.g., "Workout-Name-ABC123.zwo") |
author | String | Yes | Workout author/creator name |
name | String | Yes | Workout name |
description | String | No | Workout description |
sport_type | String | Yes | Sport type (e.g., "run") |
duration_type | String | Yes | Duration measurement type (e.g., "time") |
tags | String | No | Comma-separated tags |
workout_type | String | No | Workout category (interval, LSD, recovery, tempo) |
session_id | String | No | Session identifier |
rtss | Float | No | Running Training Stress Score |
raw_xml | String | Yes | Complete ZWO XML content |
file_size | Integer | Yes | XML file size in bytes |
is_public | Boolean | Yes | Public visibility flag (true = visible to all) |
created_by | ObjectId | No | User ID of creator (null for grandfathered data) |
uploaded_at | DateTime | Yes | Upload timestamp |
last_modified | DateTime | Yes | Last modification timestamp |
9. background_jobs
Purpose: Tracks asynchronous background tasks (file processing, analysis).
Document Count: 13
Indexes: _id_ only
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
job_id | String | Yes | UUID job identifier |
job_type | String | Yes | Job category (e.g., "file_processing") |
runner_id | ObjectId | No | Associated runner |
parameters | Object | Yes | Job input parameters |
status | String | Yes | Current status |
progress | Integer | Yes | Completion percentage (0-100) |
result | Object | No | Job output results |
error | String | No | Error details if failed |
created_at | DateTime | Yes | Job creation time |
started_at | DateTime | No | Processing start time |
completed_at | DateTime | No | Completion time |
updated_at | DateTime | Yes | Last update time |
Status Values:
- pending - Queued for processing
- running - Currently executing
- completed - Finished successfully
- failed - Error occurred
- cancelled - User cancelled
Job Type Values:
- file_processing - FIT file upload and processing
Sample Document:
{
"_id": ObjectId("68c3f9112a04875c129365d5"),
"job_id": "fbc90b6c-2443-4a1d-b4b1-83d5152312a8",
"job_type": "file_processing",
"runner_id": ObjectId("68c3f9062a04875c129365d4"),
"parameters": {
"temp_files": ["/app/data/uploads/temp_20250912_104224_941199_WorkoutFileExport.zip"],
"file_count": 1
},
"status": "completed",
"progress": 100,
"result": {
"total_files": 1,
"processed_fit_files": 188,
"processed_parquet_files": 0,
"skipped_duplicates": 0,
"failed_files": 0,
"errors": [],
"session_ids": ["68c3f9122a04875c129365d6", "..."]
},
"error": null,
"created_at": "2025-09-12T10:42:25.039000",
"started_at": null,
"completed_at": "2025-09-12T10:47:02.465000",
"updated_at": "2025-09-12T10:47:02.465000"
}
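A minimal sketch of how the documented status values and progress range could be checked before a job document is written. `validateJob` is a hypothetical helper, not part of the application; it only encodes the constraints stated in this section.

```javascript
// Status values documented for background_jobs.
const JOB_STATUSES = new Set(["pending", "running", "completed", "failed", "cancelled"]);

// Hypothetical validator: returns a list of problems (empty when valid).
function validateJob(job) {
  const problems = [];
  if (typeof job.job_id !== "string" || job.job_id.length === 0) {
    problems.push("job_id must be a non-empty string");
  }
  if (!JOB_STATUSES.has(job.status)) {
    problems.push(`unknown status: ${job.status}`);
  }
  if (!Number.isInteger(job.progress) || job.progress < 0 || job.progress > 100) {
    problems.push(`progress out of range 0-100: ${job.progress}`);
  }
  return problems;
}

console.log(validateJob({
  job_id: "fbc90b6c-2443-4a1d-b4b1-83d5152312a8",
  status: "completed",
  progress: 100,
}));
```

For the sample document above the returned list is empty.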
10. fs.files (GridFS)
Purpose: GridFS metadata for storing large binary files (FIT files, Parquet files).
Document Count: 4,260
Indexes: _id_, filename_1_uploadDate_1
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
filename | String | Yes | Original filename |
contentType | String | Yes | MIME type |
chunkSize | Integer | Yes | Chunk size in bytes (default 261120, i.e. 255 KiB) |
length | Integer | Yes | File size in bytes |
uploadDate | DateTime | Yes | Upload timestamp |
metadata | Object | Yes | File ownership and linkage information (added Oct 2025) |
Metadata Schema:
Since October 2025, all GridFS files include a metadata field for direct file ownership tracking and linkage verification.
For FIT files:
{
"runner_id": "68c3f9062a04875c129365d4", // Links to runner._id (as string)
"file_type": "fit", // Identifies file type
"session_id": "68c3f9122a04875c129365d6", // Links to sessionsFIT._id (as string)
"session_date": "2024-11-26T11:00:48", // Session start time
"uploaded_at": "2025-09-12T10:42:27.047000" // Upload timestamp
}
For Parquet files:
{
"runner_id": "68c3f9062a04875c129365d4", // Links to runner._id (as string)
"file_type": "parquet", // Identifies file type
"fit_session_id": "68c3f9122a04875c129365d6", // Links to sessionsFIT._id
"parquet_session_id": "68c3f9132a04875c129365da", // Links to sessionsParquet._id
"created_at": "2025-09-12T10:42:27.047000" // Creation timestamp
}
Sample Documents:
FIT file:
{
"_id": ObjectId("68c3f9122a04875c129365d7"),
"filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX.FIT",
"contentType": "application/octet-stream",
"chunkSize": 261120,
"length": 411633,
"uploadDate": "2025-09-12T10:42:27.047000",
"metadata": {
"runner_id": "68c3f9062a04875c129365d4",
"file_type": "fit",
"session_id": "68c3f9122a04875c129365d6",
"session_date": "2024-11-26T11:00:48",
"uploaded_at": "2025-09-12T10:42:27.047000"
}
}
Parquet file:
{
"_id": ObjectId("68c3f9132a04875c129365da"),
"filename": "tp-1581646.2024-11-26-11-58-13-283Z.GarminPing.AAAAAGdFt9Tjz2sX_record_mesgs.parquet",
"contentType": "application/octet-stream",
"chunkSize": 261120,
"length": 216719,
"uploadDate": "2025-09-12T10:42:27.047000",
"metadata": {
"runner_id": "68c3f9062a04875c129365d4",
"file_type": "parquet",
"fit_session_id": "68c3f9122a04875c129365d6",
"parquet_session_id": "68c3f9132a04875c129365da",
"created_at": "2025-09-12T10:42:27.047000"
}
}
Benefits of Metadata:
- Direct file ownership queries without joining through sessions
- Orphaned file detection (files with no corresponding session records)
- Redundancy protection if sessionsFIT/sessionsParquet get corrupted
- Simplified runner deletion (can find all files directly)
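The first two benefits can be sketched on plain arrays of documents. Both helper names are hypothetical, and ObjectIds are represented as hex strings for simplicity; in the database, `metadata.runner_id` is stored as a string while session `file_id` values are ObjectIds, so the comparison normalizes with `String()`.

```javascript
// Direct ownership lookup: no join through the session collections.
function filesForRunner(files, runnerId) {
  return files.filter((f) => f.metadata && f.metadata.runner_id === String(runnerId));
}

// Orphan detection: files that no session document references via file_id.
function findOrphanedFiles(files, fitSessions, parquetSessions) {
  const referenced = new Set(
    [...fitSessions, ...parquetSessions].map((s) => String(s.file_id))
  );
  return files.filter((f) => !referenced.has(String(f._id)));
}

// Illustrative data; the third file has no session record.
const exampleFiles = [
  { _id: "68c3f9122a04875c129365d7", metadata: { runner_id: "68c3f9062a04875c129365d4", file_type: "fit" } },
  { _id: "68c3f9132a04875c129365da", metadata: { runner_id: "68c3f9062a04875c129365d4", file_type: "parquet" } },
  { _id: "000000000000000000000001", metadata: { runner_id: "68c3f9062a04875c129365d4", file_type: "fit" } },
];
const orphans = findOrphanedFiles(
  exampleFiles,
  [{ file_id: "68c3f9122a04875c129365d7" }],
  [{ file_id: "68c3f9132a04875c129365da" }]
);
console.log(orphans.map((f) => f._id));
```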
Migration Status:
- Retroactive migration completed: Oct 21, 2025
- Files migrated: 4,218 files (2,109 FIT + 2,109 Parquet)
- Orphaned files found: 42 files (no session records)
- Future uploads: All new uploads automatically include metadata
11. fs.chunks (GridFS)
Purpose: GridFS chunks storing actual binary data for large files.
Document Count: 1,644
Indexes: _id_, files_id_1_n_1
Fields:
| Field | Type | Required | Description |
|---|---|---|---|
_id | ObjectId | Yes | Unique identifier |
files_id | ObjectId | Yes | Reference to fs.files |
n | Integer | Yes | Chunk sequence number |
data | Binary | Yes | Binary chunk data |
Sample Document:
{
"_id": ObjectId("68c3f9122a04875c129365d8"),
"files_id": ObjectId("68c3f9122a04875c129365d7"),
"n": 0,
"data": BinData(0, "...") // Binary data chunk
}
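GridFS splits each file into chunks of `chunkSize` bytes, numbered from `n = 0`, with only the last chunk allowed to be shorter. The sketch below shows the arithmetic for the two sample files from fs.files; `chunkLayout` is a hypothetical helper name.

```javascript
// Number of chunks and size of the final chunk for a GridFS file.
function chunkLayout(length, chunkSize = 261120) {
  const count = Math.ceil(length / chunkSize);
  const lastChunkBytes = count === 0 ? 0 : length - (count - 1) * chunkSize;
  return { count, lastChunkBytes };
}

console.log(chunkLayout(411633)); // sample FIT file
console.log(chunkLayout(216719)); // sample Parquet file
```

The sample FIT file (411,633 bytes) occupies two chunks, `n = 0` and `n = 1`, while the sample Parquet file (216,719 bytes) fits in a single chunk.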
Relationships Diagram
runner (1) ──────< (many) sessionsFIT
  │                         │
  │                         └──> fs.files (GridFS)
  │
  ├──────< (many) sessionsParquet
  │                 │
  │                 └──> fs.files (GridFS)
  │
  ├──────< (many) sessionStatistics
  │                 │
  │                 └──> sessionsFIT (reference)
  │
  ├──────< (many) runnerHistory
  │
  └──────< (many) trainingplan (by runnerID string)
                    │
                    └──> workouts (via session_code)
trainingplanschema ──> Used to generate trainingplan documents
background_jobs ──> runner (optional)
                ──> fs.files (for file processing jobs)
Data Flow
1. Session Upload Flow
User uploads FIT file
↓
Store in GridFS (fs.files, fs.chunks)
↓
Create sessionsFIT record with metadata
↓
Background job processes FIT file
↓
Extract records → Store as Parquet in GridFS
↓
Create sessionsParquet record with metadata
↓
Calculate statistics → Create sessionStatistics record
↓
Update runnerHistory with new training load
↓
Update runner training load metrics
2. Training Plan Creation Flow
User selects trainingplanschema template
↓
System reads runner configuration (blocks, weeks_per_block, training_days, etc.)
↓
Generate individual day documents for entire plan duration
↓
For each day:
- Calculate projected training load metrics
- Assign workouts based on schema
- Set is_training, recovery flags
↓
Insert 100+ trainingplan documents (one per day)
↓
Display calendar view by querying trainingplanID
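The day-generation step can be sketched as follows. This is an illustration under stated assumptions: plan length is taken to be (blocks × weeks_per_block + taper_weeks) weeks, and `training_days` values 1-7 are taken to mean Monday through Sunday. `buildPlanSkeleton` is a hypothetical name, and load projection and workout assignment are left out.

```javascript
// Hypothetical sketch: one skeleton document per day of the plan.
function buildPlanSkeleton(runner, startDate) {
  // Assumption: total length is blocks * weeks_per_block + taper_weeks.
  const totalWeeks = runner.blocks * runner.weeks_per_block + runner.taper_weeks;
  const days = [];
  for (let i = 0; i < totalWeeks * 7; i++) {
    const date = new Date(Date.UTC(
      startDate.getUTCFullYear(),
      startDate.getUTCMonth(),
      startDate.getUTCDate() + i
    ));
    const weekday = (date.getUTCDay() + 6) % 7; // 0=Monday ... 6=Sunday
    days.push({
      date,
      weekday,
      week: Math.floor(i / 7) + 1,
      // Assumption: training_days uses 1=Monday ... 7=Sunday.
      is_training: runner.training_days.includes(weekday + 1),
    });
  }
  return days;
}

const skeleton = buildPlanSkeleton(
  { blocks: 4, weeks_per_block: 4, taper_weeks: 2, training_days: [1, 3, 5, 6] },
  new Date("2025-09-01T00:00:00Z")
);
console.log(skeleton.length, skeleton.filter((d) => d.is_training).length);
```

With the configuration from the runner sample (4 blocks of 4 weeks plus 2 taper weeks) this produces 126 day documents, 72 of them training days.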
3. Personal Best Update Flow
User completes workout session
↓
Upload and process session (sessionsFIT → sessionsParquet → sessionStatistics)
↓
Calculate fastest splits across multiple distances
↓
Compare with runner personal best records
↓
If new PB: Update runner document
↓
If sufficient PBs exist: Recalculate performance metrics
↓
Push new entry to runner performance history array
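The comparison step in this flow can be sketched as below. `findNewPersonalBests` is a hypothetical helper, and both arguments are assumed to be plain objects mapping a distance label to a time in seconds; the way the application actually stores and reads PBs may differ.

```javascript
// Hypothetical sketch: return only the distances where the session beat the
// stored personal best (or where no personal best exists yet).
// Assumption: both inputs map distance labels to times in seconds.
function findNewPersonalBests(currentPbs, sessionSplits) {
  const improvements = {};
  for (const [distance, seconds] of Object.entries(sessionSplits)) {
    const existing = currentPbs[distance];
    // A lower time is better; a missing entry means any result is a PB
    if (existing === undefined || seconds < existing) {
      improvements[distance] = seconds;
    }
  }
  return improvements;
}

const currentPbs = { "400m": 65.5, "800m": 131.2, "5000m": 1480 };
const sessionSplits = { "400m": 64.9, "800m": 133.0 };
const newPbs = findNewPersonalBests(currentPbs, sessionSplits);
console.log(newPbs); // only the 400m split improved
```

Returning only the improved distances keeps the follow-up update small: an empty result means the runner document does not need to be touched.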
Performance Considerations
Current Indexes
Actual production indexes:
- runner: _id_, runnerID_idx, marathon_date_idx
- trainingplan: _id_, date_idx, training_days_idx, week_idx, session_code_idx, date_training_idx
- fs.chunks: _id_, files_id_1_n_1
- fs.files: _id_, filename_1_uploadDate_1
- All other collections: _id_ only
Performance Indexes
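Indexes intended to support the system's main queries. The production listing above reports only _id_ for these collections, so confirm with getIndexes() on each collection whether they have been created: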
// Query sessions by runner
db.sessionsFIT.createIndex({ runner_id: 1, uploaded_at: -1 })
db.sessionsParquet.createIndex({ runner_id: 1, created_at: -1 })
db.sessionStatistics.createIndex({ runner_id: 1, processed_timestamp: -1 })
// Session statistics unique constraint
db.sessionStatistics.createIndex({ session_id: 1 }, { unique: true })
// Time series queries on runner history
db.runnerHistory.createIndex({ runner_id: 1, date: -1 })
// Background job queries
db.background_jobs.createIndex({ runner_id: 1, status: 1 })
db.background_jobs.createIndex({ created_at: -1 })
Query Patterns
Most common queries:
- Get all sessions for runner: db.sessionsFIT.find({ runner_id: ObjectId(...) })
- Get runner history: db.runnerHistory.find({ runner_id: ObjectId(...) }).sort({ date: -1 })
- Get training plan days: db.trainingplan.find({ trainingplanID: "uuid" }).sort({ date: 1 })
- Get session statistics: db.sessionStatistics.findOne({ session_id: ObjectId(...) })
Validation and Constraints
Business Logic Constraints
- runner.marathon_date contains the target race date
- trainingplan.date aligns with the trainingplanID date range
- sessionStatistics.session_id is unique (one stats doc per session)
- runnerHistory.date is unique per runner (one history doc per day per runner)
- background_jobs.progress ranges from 0 to 100
Data Type Constraints
- ObjectId: Valid 24-character hex string
- Dates: ISO 8601 format (YYYY-MM-DDTHH:MM:SS)
- Numbers: Within logical ranges (HR 0-220, speeds > 0, percentages 0-100)
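These constraints are simple enough to check in application code before a write. The validators below are a sketch under stated assumptions: the function names are hypothetical, ObjectIds and dates are assumed to arrive as strings (for example from an API request), and the numeric ranges are taken directly from the list above.

```javascript
// Hypothetical validators for the data type constraints listed above.
const OBJECT_ID_PATTERN = /^[0-9a-fA-F]{24}$/;
const ISO_DATETIME_PREFIX = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/;

// ObjectId: exactly 24 hexadecimal characters
function isValidObjectIdString(value) {
  return typeof value === "string" && OBJECT_ID_PATTERN.test(value);
}

// Dates: must start with YYYY-MM-DDTHH:MM:SS and be parseable by Date.parse
function isIsoDateTime(value) {
  return typeof value === "string"
    && ISO_DATETIME_PREFIX.test(value)
    && !Number.isNaN(Date.parse(value));
}

// Numbers: ranges from the constraint list (HR 0-220, percentages 0-100, speeds > 0)
function isValidHeartRate(value) {
  return Number.isFinite(value) && value >= 0 && value <= 220;
}
function isValidPercentage(value) {
  return Number.isFinite(value) && value >= 0 && value <= 100;
}
function isValidSpeed(value) {
  return Number.isFinite(value) && value > 0;
}

console.log(isValidObjectIdString("68c3f9122a04875c129365d8")); // true
console.log(isValidHeartRate(250)); // false
```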
Referential Integrity
When deleting a runner:
- Should cascade delete: sessionsFIT, sessionsParquet, sessionStatistics, runnerHistory, background_jobs
- Should handle trainingplan documents (query by runnerID string)
- Should delete associated GridFS files
When deleting a workout from library:
- Check if referenced by trainingplanschema
- Prevent deletion or nullify references
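MongoDB does not enforce these rules itself, so a runner deletion has to be carried out by the application. One way to keep the ordering explicit is to build the list of deletions first and execute it afterwards. The sketch below is hypothetical (`buildRunnerDeletePlan` is not an existing function); it uses the runner_id ObjectId reference and the runnerID string described in this document. GridFS cleanup is omitted because it requires collecting file ids from the session documents before they are deleted.

```javascript
// Hypothetical sketch: list the deletions needed to remove a runner.
// Dependent collections come first and the runner document comes last, so a
// failure part-way through never leaves children without a parent record.
function buildRunnerDeletePlan(runner) {
  const byObjectId = { runner_id: runner._id };
  return [
    { collection: "sessionStatistics", filter: byObjectId },
    { collection: "sessionsParquet", filter: byObjectId },
    { collection: "sessionsFIT", filter: byObjectId },
    { collection: "runnerHistory", filter: byObjectId },
    { collection: "background_jobs", filter: byObjectId },
    // trainingplan references the runner by its string identifier
    { collection: "trainingplan", filter: { runnerID: runner.runnerID } },
    { collection: "runner", filter: { _id: runner._id } },
  ];
}

// Stand-in runner; in mongosh, _id would be an ObjectId
const deletePlan = buildRunnerDeletePlan({ _id: "68c3f9122a04875c129365d8", runnerID: "Marc_001" });
for (const step of deletePlan) {
  // In mongosh each step would run as:
  //   db.getCollection(step.collection).deleteMany(step.filter)
  console.log(step.collection, JSON.stringify(step.filter));
}
```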
Schema Version History
Version 2.2 (Current)
- Training plan versioning support with parent-child relationships
- Enhanced query patterns for version filtering
- Support for mid-plan regeneration with reason tracking
Version 2.1 (Time Series Migration)
- Added time series structures for heart rate zones, PBs, and critical speed
- Legacy fields marked for migration
- New standardized time series schema
Version 2.0 (2025-10-02)
- Validated against production database
- Updated runner schema to match actual 38 fields
- Corrected trainingplan to flat structure (one doc per day)
- Fixed sessionsFIT/sessionsParquet metadata organization
- Updated background_jobs field names (parameters/result)
- Removed non-existent metadata from fs.files
- Added runnerHistory additional fields
- Documented current database indexes
Version 1.0 (Initial)
- Created from code analysis (had discrepancies with actual DB)
Time Series Schema
All time series data follows a consistent structure to track changes over time.
Generic Time Series Entry Structure
{
"timestamp": ISODate("2025-01-01T00:00:00.000Z"), // When this value is effective
"value": <any>, // The actual value (can be object)
"source": "manual|calculated|imported|session", // How this value was obtained
"notes": "Optional notes about this entry" // Optional context
}
Heart Rate Zones Time Series
{
"heart_rate_zones_history": [
{
"timestamp": ISODate("2025-01-01T00:00:00.000Z"),
"value": {
"hr_zone_1_max": 125,
"hr_zone_2_max": 146,
"hr_zone_3_max": 167,
"hr_zone_4_max": 188,
"hr_zone_5_max": 209
},
"source": "age_calculated",
"notes": "Calculated from age using HUNT formula"
}
]
}
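The sample values above can be reproduced with a short calculation. The HUNT formula estimates maximum heart rate as 211 - 0.64 × age. The zone ceilings at 60/70/80/90/100% of maximum heart rate are an assumption inferred from the sample numbers, not a documented rule, and both function names are hypothetical.

```javascript
// HUNT formula for estimated maximum heart rate
function huntMaxHeartRate(ageYears) {
  return 211 - 0.64 * ageYears;
}

// Assumption: zone ceilings sit at 60/70/80/90/100% of HRmax (inferred from
// the sample document, where HRmax is 209).
const ZONE_FRACTIONS = [0.6, 0.7, 0.8, 0.9, 1.0];

function zonesFromMaxHeartRate(hrMax) {
  const zones = {};
  ZONE_FRACTIONS.forEach((fraction, index) => {
    zones[`hr_zone_${index + 1}_max`] = Math.round(hrMax * fraction);
  });
  return zones;
}

// Reproduces the "value" object of the sample entry
const sampleZones = zonesFromMaxHeartRate(209);
console.log(sampleZones);

// Zones for a 40-year-old runner using the HUNT estimate
console.log(zonesFromMaxHeartRate(huntMaxHeartRate(40)));
```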
Personal Bests Time Series
{
"personal_bests_history": {
"400m": [
{
"timestamp": ISODate("2025-01-15T00:00:00.000Z"),
"value": {
"distance": "400m",
"seconds": 65.5,
"session_id": "optional_session_reference"
},
"source": "session",
"notes": null
}
],
"800m": [...],
"5000m": [...]
}
}
Critical Speed Time Series
{
"critical_speed_time_series": [
{
"timestamp": ISODate("2025-01-15T00:00:00.000Z"),
"value": {
"critical_speed": 3.189, // m/s
"w_prime": 285.2, // anaerobic capacity (meters)
"r_squared": 0.9986, // regression fit quality
"pb_400m_seconds": 65.5, // PBs used for calculation
"pb_800m_seconds": 131.2,
"pb_5000m_seconds": 1480
},
"source": "calculated",
"notes": null
}
]
}
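The sample values are consistent with the linear distance-time model, in which distance = critical_speed × time + w_prime and the parameters are obtained by least-squares regression over the PB efforts. The sketch below assumes that model; the application's actual fitting code is not shown in this document, and `fitCriticalSpeed` is a hypothetical name.

```javascript
// Hypothetical sketch: fit distance = criticalSpeed * time + wPrime by
// ordinary least squares over a set of personal-best efforts.
// Each effort is { meters, seconds }.
function fitCriticalSpeed(efforts) {
  const n = efforts.length;
  if (n < 2) throw new Error("At least two efforts are required");

  const meanT = efforts.reduce((sum, e) => sum + e.seconds, 0) / n;
  const meanD = efforts.reduce((sum, e) => sum + e.meters, 0) / n;

  let sxx = 0; // sum of squared time deviations
  let sxy = 0; // sum of time * distance deviations
  let syy = 0; // sum of squared distance deviations
  for (const e of efforts) {
    const dt = e.seconds - meanT;
    const dd = e.meters - meanD;
    sxx += dt * dt;
    sxy += dt * dd;
    syy += dd * dd;
  }

  const criticalSpeed = sxy / sxx;               // slope, in m/s
  const wPrime = meanD - criticalSpeed * meanT;  // intercept, in meters
  const rSquared = (sxy * sxy) / (sxx * syy);    // fit quality
  return { criticalSpeed, wPrime, rSquared };
}

// PBs from the sample document
const fit = fitCriticalSpeed([
  { meters: 400, seconds: 65.5 },
  { meters: 800, seconds: 131.2 },
  { meters: 5000, seconds: 1480 },
]);
console.log(fit);
```

With the three sample PBs this gives a critical speed of about 3.19 m/s and an r_squared of about 0.9986, matching the sample; the intercept comes out near 284 m, slightly below the stored w_prime, so the stored value may have been computed from unrounded inputs.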
Time Series Query Functions
The application provides utility functions for time series operations:
- get_value_at_date() - Get the value that was active at a specific date
- get_latest_value() - Get the most recent value
- add_value() - Add a new value to the time series
- get_value_history() - Get historical values within a date range
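The lookup semantics of the first two functions can be illustrated with a short sketch. The signatures of the real utility functions are not documented here, so the JavaScript versions below (`getValueAtDate`, `getLatestValue`) are hypothetical equivalents that operate on an array of entries in the generic time series structure.

```javascript
// Hypothetical sketch of "value active at a date": the entry with the latest
// timestamp that is not after the requested date. Returns null if none.
function getValueAtDate(series, date) {
  const target = new Date(date).getTime();
  let active = null;
  for (const entry of series) {
    const ts = new Date(entry.timestamp).getTime();
    if (ts <= target && (active === null || ts > new Date(active.timestamp).getTime())) {
      active = entry;
    }
  }
  return active === null ? null : active.value;
}

// Hypothetical sketch of "most recent value", independent of array order
function getLatestValue(series) {
  let latest = null;
  for (const entry of series) {
    if (latest === null || new Date(entry.timestamp) > new Date(latest.timestamp)) {
      latest = entry;
    }
  }
  return latest === null ? null : latest.value;
}

const criticalSpeedSeries = [
  { timestamp: "2025-03-01T00:00:00.000Z", value: 3.25, source: "calculated", notes: null },
  { timestamp: "2025-01-15T00:00:00.000Z", value: 3.189, source: "calculated", notes: null },
];
console.log(getValueAtDate(criticalSpeedSeries, "2025-02-01T00:00:00.000Z")); // 3.189
console.log(getLatestValue(criticalSpeedSeries)); // 3.25
```

Neither function assumes the array is sorted, which keeps reads correct even if entries are appended out of chronological order.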
Related Documentation
- Architecture - System architecture and data flow
- Platform Overview - Application features and API documentation
- MongoDB Setup - MongoDB connection troubleshooting
- Troubleshooting - Common issues and solutions
Schema Version: 2.2 Last Validated: 2025-11-09 Database: Production MongoDB deployment Validation Method: Direct MongoDB connection and document analysis Total Collections: 11 Total Documents: Production scale deployment