LaneAward Operations

Time On Tasks & Operations Runbook

Quick-reference procedures for operating and maintaining the live production environment. For staging operations, development workflows, and the full process library see the Staging Runbook in the staging console.

Production Environment Reference

Time On Tasks URL	https://timeontasks.laneaward.com
Console URL	https://console.laneaward.com
App Launchpad	https://ai.laneaward.com
ToT web root	`/var/www/laneaward-timeontasks/`
Console web root	`/var/www/laneaward-console/`
Database	`/var/lib/laneaward/workforce.db`
Backend service	`laneaward-workforce-api.service` · port 9194
App source	`/opt/laneaward/`
TLS cert	`/etc/letsencrypt/live/timeontasks.laneaward.com/` — covers both domains, auto-renews
API health	`https://timeontasks.laneaward.com/api/health`
SSH	`ssh -i ~/.ssh/lane_webserver.pem ubuntu@3.130.69.109`

Production Is Live — Real Users, Real Data

Every operation here targets the live production database and services. There is no undo for destructive commands. Before any operation that writes, restarts, or replaces data, ask: have I taken a snapshot and confirmed this is the right environment?

Never run database reset or activity wipe commands in production. Those procedures exist only in the staging runbook and are intentionally excluded here.

Source Code Is Tracked With Local Git

The repository at /Users/donaldscott/Project-Code/laneaward/repo/ is under local Git version control. Deploy scripts read from the current working tree, so the branch that is checked out at deploy time determines what gets pushed to production. Before running any deploy, confirm the intended branch with git branch --show-current — production deploys should generally run from main. Full background is in the Version Control And Source Management section of the Project Reference document.

Process 1: Deploy Approved Changes To Production	The only authorized path for promoting tested changes to production. Deploys both frontends (Time On Tasks and Operations Console) plus all four console documents in one run, then verifies health of both live URLs. Does not restart the backend or touch the database. Includes a quick-reference table of all deploy scripts. Verify staging is healthy before starting.
Process 1B: Deploy Console Documents To Production	Deploys only the four console documents (runbook, user guide, project reference, topology) without touching app code, the database, or the backend service. Use any time only documents have changed.
Process 2: Backup and Recovery	The PWA BDR service backs up the production database every 12 hours and source code weekly. Also covers the manual pre-operation snapshot. Always take a manual snapshot before any risky change; there is no undo.
Process 2B: Restore Production Database From Backup	Full guided restore procedure using restore_production_db.sh. Auto-selects the most recent backup or accepts a specific file. Requires two typed confirmations, takes an automatic safety backup, stops the service, restores the database, and verifies health. The service is never left stopped; recovery is attempted on failure.
Process 3: Health Check and Status	Quick curl and systemctl commands to confirm the production API and both app URLs are responding correctly. Run after every deploy and any time application behavior seems off.
Process 6: Backend Service Commands	Manual restart, stop, start, and enable-check commands for the production backend service. Deploy scripts handle restarts automatically; use these only for manual intervention. Always confirm the correct service name before running, because two services share this VM.
Process 7: Update SQLite From ProfitMaker Files	Refreshes only the reference tables (customers, orders) in the production database from the latest ProfitMaker export. Runs on weekdays only by default; use FORCE_REFRESH=1 to run on a weekend. Never replaces the database file or touches users, sessions, or activity. Staging must receive the same update and be verified healthy before production is updated.
Process 8: Reliability Verification For Task Writes	Documents the Phase 1 database reliability features in production (WAL, busy timeout, synchronous FULL, BEGIN IMMEDIATE) that must be preserved on every backend update. Includes a manual checklist for verifying write behavior under concurrent load.
Process 9: Service Worker Deploy and Cache Invalidation	Explains the production cache strategy (network-first for HTML, cache-first for versioned assets, network-only for API). Covers when sw.js must be updated and how to prevent stale content from being served to tablets after a deploy.
Process 10: Concurrent Stress Test	Simulates up to 20 concurrent tablet operators against production. Run after significant backend changes or scale increases. Zero hard failures are required to pass; retryable busy responses are acceptable.
Process 11: Server Access And Security Group Management	SSH connection procedures via Twingate (any location) or direct public IP (office only). Covers AWS Security Group rule updates when authorized IPs change. Applies to both environments since they share the same server. Never remove all SSH rules simultaneously.
Process 12: Verify PIN Hash Status	Confirms all active user PINs are Argon2id hashed and that HMAC login tokens are populated. Users missing a token self-heal automatically on their next successful login, so no manual action is required.
Salary Labor Cost — Calculation Methodology

Process 1: Deploy Approved Changes To Production

Use this process when changes have been validated on staging and are ready to go live. This is the only authorized path for promoting code and data to production.

Use The Right Script For Every Task

Each deploy task has a dedicated script. Using the wrong script can destroy production data. In particular, promote_staging_db_to_production.sh replaces the entire production database with the staging database. Use it only for schema or structural database changes; it includes a mandatory double-confirmation gate. For all routine work, use the targeted scripts below.

Task	Correct Script
Both frontends + production documents	`deploy_to_production.sh`
Time On Tasks frontend only	`deploy_tot_to_production.sh`
Operations Console frontend only	`deploy_console_to_production.sh`
Console documents only	`deploy_console_docs_to_production.sh` (Process 1B)
Backend code only (server.py or schema.sql changed)	`deploy_backend_to_production.sh` (Step 5)
ProfitMaker reference data update	`push_reference_to_production.sh` (Process 7)
Icons only	`deploy_icons_to_production.sh`
Schema or structural database change only	`promote_staging_db_to_production.sh` — requires double confirmation

Critical Rules

Staging is the source of truth. Only changes fully tested on staging should be promoted. Never edit production files directly.
For deploys that include schema or structural changes, the staging database is promoted to production via sqlite3 .backup — no shutdown required. Reference data (customers, orders) and all contributor activity data are never cross-promoted — each environment's operational data is always managed independently.
Staging is preserved. Deploying to production does not affect staging. Both environments continue to operate independently after the deploy.
Backend updates require a separate deploy. If server.py or schema.sql changed, use Step 5 (deploy_backend_to_production.sh) — not the main deploy script. It uploads backend files, restarts the service, and verifies health in one run.

What deploy_to_production.sh Does

Uploads timeontasks/ and promotes to /var/www/laneaward-timeontasks/
Uploads ops_console/ and promotes to /var/www/laneaward-console/
Verifies and deploys production documentation to the console web root. No transformation is applied, because production docs are purpose-built.
Prints a live health check for both production URLs
Does not promote the database or restart the backend service. Use promote_staging_db_to_production.sh for DB promotion and Process 6 for service restart.

Step 1 — Confirm staging is in a deployable state

Use the Staging Runbook Process 3 (Health Check and Status) to confirm staging is healthy and the changes you intend to deploy have been tested end-to-end before proceeding.

Step 2 — Optional: take a manual production DB snapshot before deploying

Recommended before any deploy that includes schema changes or significant data migrations.

[VM]
sudo cp /var/lib/laneaward/workforce.db /var/lib/laneaward/workforce-pre-deploy-$(date +%F-%H%M%S).db

Step 3 — Run the deploy script

Run from the repo root on the Mac. The script uploads both frontends, deploys production documents, and verifies health. It does not touch the database or restart the backend.

[Mac-local]
/Users/donaldscott/Project-Code/laneaward/repo/scripts/deploy_to_production.sh

A clean deploy finishes with:

=== Verifying production health ===
timeontasks : {"ok": true, "status": "ok", ...}
console     : {"ok": true, "status": "ok", ...}

=== Deploy complete ===
  https://timeontasks.laneaward.com
  https://console.laneaward.com

Step 4 — Schema change only: promote the staging database to production

Only run this step if the deploy includes a schema or structural database change. This script replaces the entire production database with the staging database. It requires a manual backup, two explicit confirmations, and restarts the backend service automatically. Do not run this for frontend-only deploys, ProfitMaker updates, or any routine task.

[Mac-local]
/Users/donaldscott/Project-Code/laneaward/repo/scripts/promote_staging_db_to_production.sh

The script will display both database file sizes, require you to type PROMOTE then YES, take an automatic timestamped backup, and verify health before finishing.

Step 5 — Backend-only deploy (only when server.py or schema.sql changed)

If backend code changed, use the dedicated backend deploy script. It uploads server.py and schema.sql, promotes them to /opt/laneaward/workforce_app/backend/, restarts the service, and verifies health in a single run. Schema migrations (new columns, indexes) are applied automatically by ensure_schema_upgrades() on startup — no manual SQL required.

[Mac-local]
bash /Users/donaldscott/Project-Code/laneaward/repo/scripts/deploy_backend_to_production.sh

A clean run finishes with the service showing active and a live health response:

=== Production backend deploy complete ===
  https://timeontasks.laneaward.com/api/health

Step 6 — Individual app deploys (frontend only, no DB)

For routine frontend updates that do not require a DB promotion or service restart:

[Mac-local — Time On Tasks only]
/Users/donaldscott/Project-Code/laneaward/repo/scripts/deploy_tot_to_production.sh

[Mac-local — Console only]
/Users/donaldscott/Project-Code/laneaward/repo/scripts/deploy_console_to_production.sh

Do Not Confuse Production and Staging Services

Two backend services run on the same VM — one for staging, one for production. Always confirm you are targeting laneaward-workforce-api.service (production) and not the staging service before running any systemctl command.

Process 1B: Deploy Console Documents To Production

Use this when only documents have changed — runbook, user guide, project reference, or environment topology. This script deploys documents only and does not touch app.js, index.html, the database, or the backend service.

Documents deployed by this script:

_documents/production/runbook.html → runbook.html
_documents/user-guide.html → user-guide.html
_documents/project-reference.html → project-reference.html
_documents/laneaward_environment_topology.html → topology.html

Deploy: Run The Production Console Docs Script

Run from the repo root on the Mac. The script verifies all source files exist, sanity-checks that the production runbook contains no staging infrastructure references, uploads all four documents, promotes them to the production console web root, and cleans up temp files.

[Mac-local]

bash scripts/deploy_console_docs_to_production.sh

When To Use This vs. deploy_console_to_production.sh

Use deploy_console_docs_to_production.sh any time only documents changed. Use deploy_console_to_production.sh when console app code (app.js, index.html) also changed — that script deploys both the frontend and documents in a single run.

Process 2: Backup and Recovery

All LaneAward environments are protected by the 🚀 PWA BDR service — a shared macOS menu bar LaunchAgent on the development Mac. Look for the 🚀 icon in the top menu bar to open the LaneAward dropdown, trigger manual runs, or adjust schedules.

Active Backup Jobs

Job	What it protects	Schedule
`VM·PROD — Database`	`/var/lib/laneaward/workforce.db` on AWS	Every 12 h
`LOCAL — Source Code`	Full project on Mac, hardlink snapshots	Weekly

Backup Location

Production DB → ~/projectbackups/laneaward/production-database/
Source snapshots → ~/projectbackups/laneaward/source/
Unified log → ~/projectbackups/backup_logs/backup.log

Each VM backup uses sqlite3 .backup — safe online copy, no downtime, no locking. Every copy is independently restorable.
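For readers unfamiliar with the online backup technique, Python's standard library exposes the same SQLite backup API through sqlite3.Connection.backup. The sketch below is illustrative only and is not the PWA BDR implementation; the function names and paths are hypothetical.

```python
import sqlite3


def online_backup(src_path: str, dest_path: str) -> None:
    """Copy a live SQLite database using the online backup API.

    Hypothetical helper for illustration; not part of the LaneAward scripts.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # backup() copies the database page by page and produces a
        # consistent snapshot even if other connections keep writing.
        src.backup(dest)
    finally:
        dest.close()
        src.close()


def integrity_ok(db_path: str) -> bool:
    """Return True when PRAGMA integrity_check reports 'ok'."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
    finally:
        conn.close()
```

Running integrity_ok on the copy is a cheap way to confirm that a backup file is restorable before relying on it.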

Manual Pre-Operation Snapshot (before any risky change)

[VM]

sudo cp /var/lib/laneaward/workforce.db /var/lib/laneaward/workforce-$(date +%F-%H%M%S).db

Restoring Production From Backup

Use the dedicated restore script — see Process 2B. The script handles confirmation, safety backup, service stop/start, and health verification in a single guided run. Do not restore manually.

Process 2B: Restore Production Database From Backup

Use this process when production data must be recovered from a Mac-local backup. The script is fully guided — it shows you the backup and current database file sizes, requires two explicit typed confirmations, takes an automatic safety backup before making any changes, and verifies health after the restore completes.

This Is A Last Resort — Understand The Impact First

Restoring from backup permanently destroys all production data recorded after the backup timestamp. Every work session, task, and contributor activity logged since that point will be gone. Before running this script, confirm:

The correct backup file has been identified (check the timestamp carefully).
The data loss window is understood and accepted.
No alternative — such as a correction or targeted SQL fix — would resolve the issue.
Twingate is running and connected on your Mac.

What The Script Does

Shows the selected backup file and the current live database side by side
Lists the five most recent backups available for reference
Requires you to type RESTORE then YES — two separate gates
Uploads the backup to the server before stopping anything
Takes an automatic timestamped safety backup of the current production database
Stops laneaward-workforce-api.service
Copies the backup into place and sets correct ownership
Restarts the service and runs a live health check
Restarts the service even if the restore fails — server is never left stopped
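The "never left stopped" guarantee in the last item is the familiar try/finally pattern. The following is a minimal sketch of that pattern, assuming three hypothetical callables that stand in for the real stop, restore, and start steps; it does not reproduce restore_production_db.sh.

```python
from typing import Callable


def restore_with_guaranteed_restart(
    stop: Callable[[], None],
    restore: Callable[[], None],
    start: Callable[[], None],
) -> bool:
    """Run stop -> restore -> start, and always attempt start.

    Returns True if the restore step succeeded, False if it raised.
    All three callables are hypothetical stand-ins for the real steps.
    """
    restored = False
    try:
        stop()
        restore()
        restored = True
    except Exception as exc:
        print(f"restore failed: {exc}")
    finally:
        # Runs on success and on failure, so the service is not left stopped.
        start()
    return restored
```

The return value lets the caller decide whether to report success or point the operator at the safety backup.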

Backup Files

Location: ~/projectbackups/laneaward/production-database/
Pattern: workforce_vm_prod_YYYYMMDDTHHMMSS.db
Schedule: every 12 hours via PWA BDR service
Retention: 14 most recent copies kept
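Because the timestamp is embedded in the filename, the newest backup can be identified without relying on file modification times. The sketch below shows one way to do that; it is an assumption about how selection could work, not a description of how restore_production_db.sh actually picks a file, and the function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Filename pattern from this runbook: workforce_vm_prod_YYYYMMDDTHHMMSS.db
BACKUP_RE = re.compile(r"^workforce_vm_prod_(\d{8}T\d{6})\.db$")


def backup_timestamp(filename: str) -> Optional[datetime]:
    """Return the timestamp embedded in a backup filename, or None if no match."""
    match = BACKUP_RE.match(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")


def newest_backup(filenames: Iterable[str]) -> Optional[str]:
    """Pick the most recent backup by embedded timestamp, ignoring other files."""
    dated = [(backup_timestamp(name), name) for name in filenames]
    dated = [(ts, name) for ts, name in dated if ts is not None]
    return max(dated)[1] if dated else None
```

With a 12-hour schedule, 14 retained copies cover roughly the last 7 days, assuming no extra manual runs.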

[Mac-local] List available backups

ls -lht ~/projectbackups/laneaward/production-database/

Step 1 — Confirm Twingate is running

The restore script connects via laneaward-vm → 172.31.7.224. The Twingate client must be active before proceeding. Verify SSH access first:

[Mac-local]
ssh -i ~/.ssh/lane_webserver.pem laneaward-vm "echo SSH OK"

Step 2 — Run the restore script

Run with no argument to auto-select the most recent backup, or pass a specific backup file path:

[Mac-local — most recent backup]
bash /Users/donaldscott/Project-Code/laneaward/repo/scripts/restore_production_db.sh

[Mac-local — specific backup file]
bash /Users/donaldscott/Project-Code/laneaward/repo/scripts/restore_production_db.sh ~/projectbackups/laneaward/production-database/workforce_vm_prod_TIMESTAMP.db

A clean restore finishes with:

  Restore complete. Production is live and healthy.

  Restored from  : ~/projectbackups/laneaward/production-database/workforce_vm_prod_TIMESTAMP.db
  Safety backup  : /var/lib/laneaward/workforce-pre-restore-TIMESTAMP.db

Step 3 — Verify and clean up

Confirm production is responding correctly and the data looks right before deleting the safety backup:

[Mac-local]
curl -sS https://timeontasks.laneaward.com/api/health

Once confirmed, delete the server-side safety backup:

[VM]
sudo rm /var/lib/laneaward/workforce-pre-restore-TIMESTAMP.db

If The Health Check Does Not Return Ok After Restore

Check the service status: ssh laneaward-vm 'sudo systemctl status laneaward-workforce-api.service --no-pager'
Check the service log: ssh laneaward-vm 'sudo journalctl -u laneaward-workforce-api.service -n 50 --no-pager'
The safety backup is on the server at /var/lib/laneaward/workforce-pre-restore-TIMESTAMP.db — it can be used to roll back by running the restore script again with that file as the argument.

Process 3: Health Check and Status

Use these checks to confirm the production service and both apps are responding correctly.

API Health — Direct (VM)

[VM]

curl -sS http://127.0.0.1:9194/api/health

API Health — Public Endpoints

[Mac-local or VM]

curl -sS https://timeontasks.laneaward.com/api/health
curl -sS https://console.laneaward.com/api/health

Service Status

[VM]

sudo systemctl status laneaward-workforce-api.service --no-pager

Live Logs

[VM]

sudo journalctl -u laneaward-workforce-api.service -f

Process 6: Backend Service Commands

Use these only when the backend service itself needs attention. The backend deploy script (deploy_backend_to_production.sh) and the database promotion script restart the service automatically, so these commands are for manual intervention only.

Restart

[VM]

sudo systemctl restart laneaward-workforce-api.service

Stop / Start

[VM]

sudo systemctl stop laneaward-workforce-api.service
sudo systemctl start laneaward-workforce-api.service

Check on boot

[VM]

sudo systemctl is-enabled laneaward-workforce-api.service

Verify Before Acting

Two backend services run on this VM — staging and production. The production service is laneaward-workforce-api.service. Always confirm the service name before running any systemctl command.

Process 7: Update SQLite From ProfitMaker Files

Use this process when a new asidta_file_* folder arrives from ProfitMaker and you need to refresh the production database with the latest customer numbers, customer names, order numbers, and order descriptions.

Critical Rule — Never Replace The Production Database File For A Reference Refresh

Do not copy the local workforce.db to production. This permanently destroys users, sessions, and activity. The process below updates only the reference tables (customer_account, sales_order, profitmaker_import_manifest) and never touches app_user, work_session, order_task, or any other operational table.

Primary Refresh Script

scripts/refresh_workforce_reference_snapshot.sh

Auto-detects the newest asidta_file_* folder if no path is passed.
Promotes only changed DBF, FPT, and CDX files into pm_database.
Ignores junk duplicates like Copy of ….
Refreshes the local workforce.db with current customers and the rolling order reference window.

Production Push Script

scripts/push_reference_to_production.sh

Copies only the six required DBF files to the VM and runs the importer against the live production database.
Uses the manifest to skip unchanged data — safe to run any time, even if nothing changed.
Runs on weekdays only by default. Use FORCE_REFRESH=1 to override on a weekend.
Never touches users, sessions, tasks, or any activity table.

Description Recovery For Shell Orders (added 2026-06-11)

The importer recovers descriptions from AINONOTE.DBF for orders not yet promoted to formal LNITM line items. Previously these "shell" orders displayed as "Order NNNNNN" placeholder text in Time On Tasks and the Operations Console. The first push after the AINONOTE fallback was deployed updates approximately 50 existing placeholders to real product descriptions, and the daily push continues to catch new shell orders on every subsequent run.

Single-line preliminary orders show the plain product description, for example "plaque polshd blck acrylc 9" x 12"".
Multi-line preliminary orders show numbered items using AINONOTE.ONLINENO position, for example "(1) Flag case walnut to hold 3' x 5' flag (2) Instal of AZ Flags in flags are Cust Provided".
Orders with no resolvable product information stay as "Order NNNNNN" — correct behavior when ProfitMaker has nothing more.
Retroactive automatically: when a shell order eventually gets formal LNITM entries, the LNITM description wins on the next push.

No operational action required. Schema is unchanged; this is purely an enrichment of the sales_order.description field. Implementation lives in workforce_app/backend/import_profitmaker_reference.py (build_item_description_lookup and supplement_summaries_from_ainonote).

Step 1 — Refresh The Local SQLite Database

[Mac-local]

/Users/donaldscott/Project-Code/laneaward/repo/scripts/refresh_workforce_reference_snapshot.sh

This script is fully idempotent — it is safe to run more than once against the same asidta_file_* snapshot. The importer uses upserts throughout, so re-running it will simply overwrite the reference tables with identical data and leave net zero change. No records are deleted, no operational tables are touched, and no counters are incremented. If you are unsure whether Step 1 has already run for the current snapshot, run it again — it will not cause any harm.
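The idempotency described above follows directly from using upserts. The sketch below demonstrates the property with SQLite's INSERT ... ON CONFLICT DO UPDATE on a simplified table. The two-column layout is hypothetical (the real customer_account schema is not shown in this runbook), and the code is not taken from import_profitmaker_reference.py.

```python
import sqlite3

# Hypothetical two-column layout; the real customer_account schema is not shown here.
SCHEMA = "CREATE TABLE customer_account (customer_no TEXT PRIMARY KEY, name TEXT)"
UPSERT = (
    "INSERT INTO customer_account (customer_no, name) VALUES (?, ?) "
    "ON CONFLICT(customer_no) DO UPDATE SET name = excluded.name"
)


def import_customers(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> None:
    """Upsert every row; running this twice with the same rows changes nothing."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT, rows)


def snapshot(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return the table contents in a stable order for comparison."""
    return conn.execute(
        "SELECT customer_no, name FROM customer_account ORDER BY customer_no"
    ).fetchall()
```

Comparing snapshot() before and after a second import is a simple way to confirm that a re-run left the reference data unchanged.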

Prerequisite — Staging Must Receive This Update First

Before pushing to production, confirm that staging has already received the same reference update. Follow Staging Runbook Process 7 (Update SQLite From ProfitMaker Files) to push to staging and verify it is healthy before proceeding here.

Step 2 — Push The Reference Data To Production

Run independently against production. Updates only the reference tables in the production database — never touches contributor sessions, tasks, users, or any operational data.

[Mac-local]

/Users/donaldscott/Project-Code/laneaward/repo/scripts/push_reference_to_production.sh

If today is a weekend the script will exit early with a reminder. Override with:

FORCE_REFRESH=1 bash /Users/donaldscott/Project-Code/laneaward/repo/scripts/push_reference_to_production.sh
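The weekday gate can be pictured as the small check below. This is a sketch of the behavior described here, written in Python for clarity; the real check lives in push_reference_to_production.sh and may differ in detail. The function name is hypothetical.

```python
import os
from datetime import date
from typing import Mapping, Optional


def should_push(today: date, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True on weekdays, or on any day when FORCE_REFRESH=1 is set.

    Hypothetical helper; mirrors the rule described in this runbook.
    """
    env = os.environ if env is None else env
    if env.get("FORCE_REFRESH") == "1":
        return True
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return today.weekday() < 5
```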

Step 3 — Verify Both Environments

[Mac-local or VM]

curl -sS https://staging.timeontasks.laneaward.com/api/health
curl -sS https://timeontasks.laneaward.com/api/health

Both should return {"ok": true, ...}. Then confirm order data is searchable on both environments:

curl -sS "https://staging.timeontasks.laneaward.com/api/orders/search?q=107923&limit=3"
curl -sS "https://timeontasks.laneaward.com/api/orders/search?q=107923&limit=3"

Both should return a matching order row with data_source: "PROFITMAKER".
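If these checks are ever scripted, the two pass conditions can be expressed as the small parsers below. This is a sketch under assumptions: the health body is JSON with an "ok" field as shown in this runbook, while the shape of the search response (a bare list of rows, or rows under a "results" key) is a guess and should be adjusted to the real API.

```python
import json


def health_ok(body: str) -> bool:
    """True when a health response body is JSON containing "ok": true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def has_profitmaker_row(body: str) -> bool:
    """True when any order row in a search response has data_source PROFITMAKER.

    Assumes a JSON list of rows, or an object holding rows under a
    hypothetical "results" key.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    rows = payload.get("results", []) if isinstance(payload, dict) else payload
    if not isinstance(rows, list):
        return False
    return any(
        isinstance(row, dict) and row.get("data_source") == "PROFITMAKER"
        for row in rows
    )
```

Treating a non-JSON body (for example an nginx error page) as a failure keeps a proxy outage from being mistaken for a healthy response.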

Process 8: Reliability Verification For Task Writes

Phase 1 durability hardening is part of the production backend and must be preserved whenever the backend API is updated. The Time On Tasks API opens SQLite in Write-Ahead Logging (WAL) mode, waits up to 10 seconds when the database is locked, uses synchronous = FULL for safer commits, and wraps each mutating route in a short BEGIN IMMEDIATE write transaction.

Connection Settings

PRAGMA journal_mode = WAL — Write-Ahead Logging
PRAGMA busy_timeout = 10000 — up to 10,000 ms wait on a short lock
PRAGMA synchronous = FULL — favors safer disk writes over speed
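As a point of reference when reviewing backend changes, the settings above look roughly like this with Python's standard sqlite3 module. This is a minimal sketch, not the code in server.py; the function names are hypothetical.

```python
import sqlite3


def open_connection(db_path: str) -> sqlite3.Connection:
    """Open SQLite with the three connection settings listed above."""
    # isolation_level=None turns off implicit transactions so that
    # BEGIN IMMEDIATE can be issued explicitly.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("PRAGMA busy_timeout = 10000")
    conn.execute("PRAGMA synchronous = FULL")
    return conn


def immediate_write(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> None:
    """Run one mutating statement inside a short BEGIN IMMEDIATE transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        conn.execute(sql, params)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Taking the write lock at BEGIN rather than at the first write means contention surfaces immediately, before any partial work is done. Reading each PRAGMA back after a deploy is a quick way to confirm the settings survived.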

Protected Writes

add task
start, pause, complete, cancel
material usage logging
admin user maintenance routes

Contention Response

lock contention returns HTTP 503 Service Unavailable
response includes retryable: true
treat as a transient retry condition, not a data-loss event
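The contention behavior above amounts to translating a SQLite busy or locked error into a 503 instead of a generic 500. A minimal sketch of that mapping follows. Only the 503 status and the retryable flag come from this runbook; the other payload fields and the function name are hypothetical.

```python
import sqlite3
from typing import Any


def contention_response(exc: sqlite3.OperationalError) -> tuple[int, dict[str, Any]]:
    """Map a SQLite busy/locked error to a retryable 503 payload.

    Payload fields other than "retryable" are hypothetical.
    """
    message = str(exc).lower()
    if "locked" in message or "busy" in message:
        return 503, {"ok": False, "retryable": True, "error": "database busy"}
    # Anything else is a genuine server error, not transient contention.
    return 500, {"ok": False, "retryable": False, "error": "internal error"}
```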

Iteration test checklist:

Open two or more contributor-capable sessions on production.
Rapidly add, start, pause, complete, and cancel tasks from separate browsers or tablets.
Confirm each UI action returns to a stable task state without duplicate sessions or duplicate task inserts.
If a write is delayed by contention, confirm the API reports a retryable busy condition instead of a generic server error.
After the test pass, verify the affected task lists and completed records still match the contributors who performed the actions.

Current Reliability Status

Five database-reliability features are fully implemented and should remain in place together:

Write-Ahead Logging (WAL)
busy_timeout = 10000
synchronous = FULL
short BEGIN IMMEDIATE write transactions
retryable HTTP 503 responses for SQLite busy/locked contention

Phase 2 (client-side retry/backoff, temporary local storage, idempotent write keys) was evaluated and deferred. A concurrent stress test at 2× the expected user load passed cleanly with significant headroom — Phase 1 alone is sufficient at current scale. Reconsider only if load grows significantly.

Process 9: Service Worker — Deploy Model And Cache Invalidation

Time On Tasks includes a service worker at timeontasks/sw.js that improves load speed on shared tablets by caching static assets locally after the first visit. Understanding the cache strategy is important before deploying any frontend changes.

Cache Strategy Summary

Request type	Strategy	Why
HTML documents (`index.html`, `user-guide.html`)	Network-first	Always fetches fresh HTML so deployed updates are visible on next page load without any SW changes.
Versioned static assets (`app.js?v=…`, icons, manifest)	Cache-first	Version token in the URL acts as the cache key. New token = new URL = automatic cache miss = fresh fetch.
`/api/*` and all non-GET requests	Network-only	Task writes, session state, and PIN login must never be served from cache.
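The table can be read as a three-way decision made for every request. The sketch below restates that decision in Python purely as an aid to reasoning about which rule applies; the path and extension checks are simplified assumptions, and the authoritative rules are the ones in timeontasks/sw.js.

```python
from urllib.parse import urlsplit


def cache_strategy(method: str, url: str) -> str:
    """Return the strategy name from the table above for one request.

    Simplified: treats .html paths and directory paths as HTML documents
    and every other GET outside /api/ as a cache-first asset.
    """
    path = urlsplit(url).path
    if method.upper() != "GET" or path.startswith("/api/"):
        return "network-only"
    if path.endswith(".html") or path.endswith("/"):
        return "network-first"
    return "cache-first"
```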

Deploying A Routine Frontend Update (app.js or assets)

No changes to sw.js are required. The version token does the work.

Make your changes to app.js or other assets.
Bump the version token in index.html (e.g. app.js?v=20260406-foreman1 → app.js?v=20260411-myfix1).
Deploy index.html and the updated asset file via Process 1.
On next page load: SW fetches fresh index.html (network-first), browser sees the new token URL, cache misses, fetches new asset, caches it. Done.
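The token bump in the second step is a plain text substitution, so it can be scripted to avoid a forgotten or mistyped token. The helper below is a hypothetical sketch, not an existing repo script; it assumes tokens contain only letters, digits, dots, underscores, and hyphens, as in the examples above.

```python
import re


def bump_asset_token(html: str, asset: str, new_token: str) -> str:
    """Replace the ?v= token on every reference to `asset` in an HTML string.

    Raises ValueError if the asset is not referenced with a ?v= token,
    since a silent no-op would leave the old cached URL in use.
    """
    pattern = re.compile(re.escape(asset) + r"\?v=[A-Za-z0-9._-]+")
    replacement = f"{asset}?v={new_token}"
    # A function is used as the replacement so the token is inserted literally.
    updated, count = pattern.subn(lambda _match: replacement, html)
    if count == 0:
        raise ValueError(f"no versioned reference to {asset} found")
    return updated
```

Failing loudly when no versioned reference is found guards against the case listed under What To Avoid, where the asset is deployed but its token never changes.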

When You DO Need To Update sw.js

Adding a new file type or path pattern that needs different caching behavior.
Changing the SW strategy itself (e.g. switching an asset from cache-first to network-first).
Forcing a full cache wipe on all tablets — bump CACHE_VERSION in sw.js. The new SW deletes all prior caches on activate.

When updating sw.js, the browser detects the change automatically (byte-for-byte comparison on every page load). The new SW installs in the background, then activates and claims all open tabs immediately via skipWaiting and clients.claim.

What To Avoid

Do not reuse a version token after changing the underlying file. The cache will serve the old content forever until the token changes.
Do not deploy only the asset without updating its token in index.html. The old token URL stays in cache and will be served.
Do not assume rsync transferred app.js. By default rsync skips any file whose size and modification time both match the copy on the server, even when the content differs. Verify the deploy output listed app.js as transferred, or add --checksum to force a content comparison.
Do not add index.html to a cache-first rule. HTML must always be network-first or the stale-app-shell problem returns.

Process 10: Concurrent Stress Test

Run this test against production to confirm that the Phase 1 reliability improvements (WAL, busy_timeout, synchronous = FULL, BEGIN IMMEDIATE) hold up under the expected concurrent load of up to 20 simultaneous tablet operators on the shop floor. This test was completed before go-live and passed cleanly. The procedure is preserved here as a reference for future validation runs (e.g. after significant backend changes or scale increases).

The test script is at workforce_app/backend/stress_test_concurrent.py and runs from your local Mac. It seeds its own test fixtures into the production database via SSH, runs the load, then cleans up after itself.

What It Tests

Up to 20 simulated tablet users running simultaneously
Full lifecycle per user: login → add task → start → pause → resume → complete → read
Human-paced by default — 1.5–4 s between each action per user, matching real operator speed
All users start their shift together so the server handles concurrent sessions throughout

What To Look For

Hard failures — must be zero. These are non-retryable errors or request timeouts.
Retryable busy responses — acceptable as long as hard failures are zero. They mean SQLite queued the write safely.
p95 latency under 800 ms — good tablet UX. 800 ms–2 s is acceptable. Above 2 s warrants review.
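The three criteria above can be written down as a small decision helper, which is useful when comparing reports from different runs. This is a sketch: the nearest-rank percentile is one common definition and may not match how stress_test_concurrent.py computes p95, and the function names are hypothetical.

```python
def p95(latencies_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), which avoids float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_band(p95_ms: float) -> str:
    """Classify p95 using the thresholds listed above."""
    if p95_ms < 800:
        return "good"
    if p95_ms <= 2000:
        return "acceptable"
    return "review"


def passes(hard_failures: int, retryable_busy: int) -> bool:
    """Pass requires zero hard failures; retryable busy responses are tolerated."""
    return hard_failures == 0
```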

Prerequisites

SSH key access to ubuntu@3.130.69.109 with no passphrase prompt (BatchMode).
Production server running and reachable at https://timeontasks.laneaward.com.
Python 3.10 or later on your local Mac — no third-party packages required.
Run from the repo root: cd /Users/donaldscott/Project-Code/laneaward/repo

Step 1 — Verify SSH access before starting

ssh -i ~/.ssh/lane_webserver.pem ubuntu@3.130.69.109 echo "SSH OK"

Step 2 — Run the standard realistic test

Simulates 20 human-paced operators for 2 task cycles each. Expected wall time is roughly 20–40 seconds.

python3 workforce_app/backend/stress_test_concurrent.py --host https://timeontasks.laneaward.com

A clean pass looks like this:

====================================================================
  LANEAWARD TIME-ON-TASKS  —  CONCURRENT STRESS TEST
====================================================================
  Mode:             REALISTIC (human-paced 1.5–4.0 s)
  Target:           https://timeontasks.laneaward.com
  Concurrent users: 20
  Iterations/user:  2
  Total ops:        280
  Wall time:        31.4s

  Outcomes:
    Successes:        280  (100%)
    Hard failures:      0  (non-retryable errors or timeouts)
    Retryable busy:     0  (SQLite busy-wait — server queued OK)

  VERDICT
  ------------------------------------------------------------------
  PASS  All operations completed cleanly under concurrent load.
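The totals in this sample report can be checked by hand. The sketch below assumes that all seven lifecycle steps from What It Tests (including login) repeat on every task cycle, which is inferred from the 280 figure rather than confirmed from the script. The pacing bounds count think time only and ignore request latency.

```python
# Lifecycle steps per task cycle, taken from "What It Tests".
LIFECYCLE = ["login", "add task", "start", "pause", "resume", "complete", "read"]


def total_ops(users: int, iterations: int, steps: int = len(LIFECYCLE)) -> int:
    """Operations in one run, assuming every step repeats on each cycle."""
    return users * iterations * steps


def pacing_bounds(
    iterations: int,
    steps: int = len(LIFECYCLE),
    min_gap: float = 1.5,
    max_gap: float = 4.0,
) -> tuple[float, float]:
    """Lower and upper bound on per-user think time, in seconds."""
    gaps = iterations * steps - 1  # pauses fall between consecutive actions
    return gaps * min_gap, gaps * max_gap
```

For the default run, total_ops(20, 2) gives 280, matching the report, and pacing_bounds(2) gives 19.5 to 52.0 seconds, which brackets the 31.4 s wall time shown.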

Step 3 — Optional: run the burst ceiling test

python3 workforce_app/backend/stress_test_concurrent.py --host https://timeontasks.laneaward.com --burst

Expect more retryable busy responses in this mode — that is normal. Zero hard failures is still required.

Common Options

--users N — number of concurrent users, 1–20 (default: 20)
--iterations N — task cycles per user (default: 2)
--burst — near-simultaneous writes, ceiling test only
--no-seed — skip seeding (test users already in DB from prior run)
--no-cleanup — leave test fixtures in DB for inspection

If the test fails

Hard failures (non-zero)

Check the error detail in the report — it includes which user and which operation failed
An HTTP 500 error means an unhandled exception — check sudo journalctl -u laneaward-workforce-api.service -n 100

High p95 latency

p95 above 2 s: check AWS instance CPU and disk I/O during the test run
Uniformly high latency across all ops usually means network — try ping timeontasks.laneaward.com

Manual Fixture Cleanup (if a run is interrupted before cleanup)

[Mac-local]

bash scripts/cleanup_stress_test_production.sh

Process 11: Server Access And Security Group Management

The server is protected by two layers: an AWS Security Group that restricts SSH (port 22) to authorized IP addresses, and a Twingate connector that allows SSH from any location through the Twingate client. Use this process to connect to the server and to update security group rules when IP addresses change.

Security Group Reference

Group ID	`sg-0cc9719fa0e029c40` (launch-wizard-1)
COX Fiber — office	`98.175.1.150/32` · SSH allowed
COX Cable failover — office	`72.215.199.214/32` · SSH allowed
Home lab (pending removal)	`72.208.129.218/32` · SSH allowed
HTTP — port 80	Open to all — `0.0.0.0/0` · required for Let's Encrypt HTTP-01 renewal
HTTPS — port 443	Open to all — `0.0.0.0/0`

Twingate Reference

Remote Network	Lane Award PWA Server
Connector	`eggplant-okapi`
Resource address	`172.31.7.224` (server private IP)
SSH key	`~/.ssh/lane_webserver.pem`

Never Remove The Port 80 Rule

Inbound port 80 (0.0.0.0/0) must stay open. Let's Encrypt renews every certificate via the HTTP-01 challenge, which the Certificate Authority validates over port 80. nginx redirects all real traffic from 80 to 443, so the only thing port 80 serves is the ACME challenge. If it is closed, every renewal fails silently: HTTPS keeps working on the existing certs while they quietly march toward expiry. This exact failure was found and fixed on 2026-06-29 (port 80 had been removed, and two certs had already lapsed).

If a renewal ever fails, first confirm port 80 is open:

[Mac-local]
aws ec2 describe-security-groups --group-ids sg-0cc9719fa0e029c40 --query "SecurityGroups[0].IpPermissions[?FromPort==\`80\`]" --no-cli-pager

If the rule is missing, re-add it:

[Mac-local]
aws ec2 authorize-security-group-ingress --group-id sg-0cc9719fa0e029c40 --protocol tcp --port 80 --cidr 0.0.0.0/0
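
The port 80 check can also be scripted against the JSON that describe-security-groups returns for SecurityGroups[0].IpPermissions. This is a minimal sketch, assuming the standard AWS CLI rule shape (IpProtocol, FromPort, ToPort, IpRanges with CidrIp) and looking only at IPv4 ranges. The function name port_open_to_world and the sample data are illustrative.

```python
import json


def port_open_to_world(ip_permissions, port=80):
    """True if a TCP (or all-protocol) rule covering `port` allows 0.0.0.0/0."""
    for rule in ip_permissions:
        if rule.get("IpProtocol") not in ("tcp", "-1"):
            continue
        # All-protocol rules ("-1") carry no port range, so default to all ports.
        low = rule.get("FromPort", 0)
        high = rule.get("ToPort", 65535)
        if not low <= port <= high:
            continue
        if any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])):
            return True
    return False


# Sample data shaped like the CLI output, with the port 80 rule missing.
sample = json.loads("""
[
  {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
   "IpRanges": [{"CidrIp": "98.175.1.150/32"}]},
  {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
   "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
]
""")
print(port_open_to_world(sample, 80))
```

A check like this could run on a schedule so a removed rule is noticed before a renewal window is missed.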

Option A — SSH via Twingate (from any location)

Use this method when connecting from home or any location not on an authorized static IP. The Twingate client must be running and connected before opening SSH.

Step 1 — Open the Twingate client on your Mac and confirm it shows Connected

The Twingate icon lives in the Mac menu bar. Click it and verify the connection status is active.

Step 2 — SSH using the server private IP

[Mac-local]
ssh -i ~/.ssh/lane_webserver.pem ubuntu@172.31.7.224

Option B — SSH directly from an authorized office IP

Use this method when connecting from the office on either the fiber or cable connection. Twingate does not need to be running.

Step 1 — SSH using the server public IP

[Mac-local]
ssh -i ~/.ssh/lane_webserver.pem ubuntu@3.130.69.109

If Twingate Is Running While SSHing via Public IP

When the Twingate client is active, it intercepts connections to the server's public IP and routes them through the connector. Pause Twingate before using Option B, or keep Twingate active and use the private IP (172.31.7.224) instead.
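
The choice between Option A and Option B reduces to two questions: is Twingate active, and is the current public IP on the authorized list? The sketch below encodes that decision using the addresses from the reference tables in this process. The function name choose_ssh_host is an assumption for illustration, and the caller is expected to supply the client IP and Twingate state.

```python
import ipaddress

# Values copied from the Security Group and Twingate reference tables.
AUTHORIZED_CIDRS = ["98.175.1.150/32", "72.215.199.214/32", "72.208.129.218/32"]
PUBLIC_IP = "3.130.69.109"
PRIVATE_IP = "172.31.7.224"


def choose_ssh_host(client_ip, twingate_active):
    """Return the address to SSH to, following Option A / Option B."""
    if twingate_active:
        # Option A: Twingate routes to the private address from anywhere.
        return PRIVATE_IP
    addr = ipaddress.ip_address(client_ip)
    if any(addr in ipaddress.ip_network(cidr) for cidr in AUTHORIZED_CIDRS):
        # Option B: direct SSH from an authorized static IP.
        return PUBLIC_IP
    raise RuntimeError("Not on an authorized IP: start Twingate and use Option A")


print(choose_ssh_host("98.175.1.150", twingate_active=False))
```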

Check Twingate Connector Status

Run this on the server to confirm the connector service is running. A healthy connector shows State: Online in the log output.

[VM]
sudo systemctl status twingate-connector --no-pager

Update Security Group — Replace a Changed IP

Run these steps when an authorized IP address changes. Requires the AWS CLI configured on the development Mac with IAM user donald.

Step 1 — Verify current rules before making changes

[Mac-local]
aws ec2 describe-security-groups --group-ids sg-0cc9719fa0e029c40 --query "SecurityGroups[0].IpPermissions" --output json --no-cli-pager

Step 2 — Remove the old IP

[Mac-local]
aws ec2 revoke-security-group-ingress --group-id sg-0cc9719fa0e029c40 --protocol tcp --port 22 --cidr OLD.IP.ADDRESS/32

Step 3 — Add the new IP

[Mac-local]
aws ec2 authorize-security-group-ingress --group-id sg-0cc9719fa0e029c40 --protocol tcp --port 22 --cidr NEW.IP.ADDRESS/32

Step 4 — Verify the updated rules

[Mac-local]
aws ec2 describe-security-groups --group-ids sg-0cc9719fa0e029c40 --query "SecurityGroups[0].IpPermissions" --output json --no-cli-pager

Never Lock Yourself Out

Do not remove the IP you are currently connected from unless Twingate SSH is confirmed working or another authorized IP you can connect from is still in place. If all SSH access is lost, recovery requires the AWS Console. Never remove all three SSH rules at once.
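
The lock-out rule can be enforced before any command runs. The following is a rough sketch, not part of the deployed tooling: it builds the revoke and authorize commands from Steps 2 and 3 and refuses to proceed if the swap would leave no other SSH rule and Twingate has not been verified. The names ssh_rule_command and plan_ip_swap are hypothetical.

```python
import ipaddress

GROUP_ID = "sg-0cc9719fa0e029c40"  # from the Security Group Reference


def ssh_rule_command(action, cidr):
    """Build a revoke/authorize command for an SSH (port 22) rule."""
    ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
    return ["aws", "ec2", f"{action}-security-group-ingress",
            "--group-id", GROUP_ID, "--protocol", "tcp",
            "--port", "22", "--cidr", cidr]


def plan_ip_swap(current_cidrs, old_cidr, new_cidr, twingate_verified=False):
    """Return the commands for Steps 2 and 3, or raise if a lock-out is possible."""
    if old_cidr not in current_cidrs:
        raise ValueError(f"{old_cidr} is not a current SSH rule")
    remaining = [c for c in current_cidrs if c != old_cidr]
    if not remaining and not twingate_verified:
        raise RuntimeError("Refusing: no other SSH path would remain")
    return [ssh_rule_command("revoke", old_cidr),
            ssh_rule_command("authorize", new_cidr)]


# 203.0.113.10 is a documentation-range placeholder for the new address.
rules = ["98.175.1.150/32", "72.215.199.214/32", "72.208.129.218/32"]
for cmd in plan_ip_swap(rules, "98.175.1.150/32", "203.0.113.10/32"):
    print(" ".join(cmd))
```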

Remove the Home Lab IP (Pending Task)

Once Twingate is confirmed as the primary home access method, run this to remove the dynamic home lab IP. Do not run this until Twingate SSH has been verified working from the home location.

[Mac-local]
aws ec2 revoke-security-group-ingress --group-id sg-0cc9719fa0e029c40 --protocol tcp --port 22 --cidr 72.208.129.218/32

Process 12: Verify PIN Hash and Login Token Status

Use this to confirm all active user PINs have been migrated to Argon2id hashes and that the fast-login HMAC token has been populated. Both columns must be set for a user to take the fast login path (~230 ms). Users with an empty pin_token will take the slow fallback path on their next login, which automatically writes the token — no manual action required.

Step 1 — Check hash and token status for all active users

[VM]

sqlite3 'file:/var/lib/laneaward/workforce.db?immutable=1' "SELECT id, display_name, CASE WHEN pin_code LIKE '\$argon2%' THEN 'hashed' ELSE 'plaintext' END AS hash_status, CASE WHEN pin_token <> '' THEN 'token-ok' ELSE 'no-token' END AS token_status FROM app_user WHERE is_active = 1 ORDER BY display_name;"

Expected Output After Full Migration

Every active user should show hashed and token-ok. If any show plaintext or no-token, that user will be upgraded automatically on their next successful login — no manual action required. The slow fallback path handles both cases gracefully.
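
To list only the users who are still pending, the same status logic can be run from Python. This is a minimal sketch against an in-memory database: the column names come from the Step 1 query, but the table definition and sample rows are invented for illustration and the real schema may differ.

```python
import sqlite3

# Same logic as the Step 1 query, written with standard single-quoted literals.
STATUS_QUERY = """
SELECT id, display_name,
       CASE WHEN pin_code LIKE '$argon2%' THEN 'hashed' ELSE 'plaintext' END,
       CASE WHEN pin_token <> '' THEN 'token-ok' ELSE 'no-token' END
FROM app_user
WHERE is_active = 1
ORDER BY display_name
"""


def pending_users(conn):
    """Return active users not yet on the fast login path."""
    return [row for row in conn.execute(STATUS_QUERY)
            if row[2] != "hashed" or row[3] != "token-ok"]


# Hypothetical schema and rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_user (id INTEGER PRIMARY KEY, display_name TEXT, "
             "pin_code TEXT, pin_token TEXT, is_active INTEGER)")
conn.executemany("INSERT INTO app_user VALUES (?, ?, ?, ?, ?)", [
    (1, "Alice", "$argon2id$placeholder", "abc123", 1),  # fully migrated
    (2, "Bob", "1234", "", 1),                           # plaintext, no token
    (3, "Cara", "$argon2id$placeholder", None, 1),       # hashed, NULL token
    (4, "Dan", "0000", "", 0),                           # inactive, ignored
])
print(pending_users(conn))
```

Against the production file, the connection could be opened read-only with sqlite3.connect("file:/var/lib/laneaward/workforce.db?immutable=1", uri=True), mirroring the Step 1 command. Note that a NULL pin_token is reported as no-token, the same as an empty string.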

Step 2 — Verify token values independently (spot check)

Run this on the Mac to compute the expected HMAC token for a known PIN and compare against the database:

[Mac-local]

python3 -c "import hmac; print(hmac.new(b'laneaward-pin-pepper-v1', b'444444', 'sha256').hexdigest())"

Compare the output against the pin_token column for the user with that PIN. They must match exactly.
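
The comparison itself can be done in code rather than by eye. The sketch below recomputes the token exactly as the one-liner above does and compares it with a stored value using a constant-time check. The function names are illustrative, and it assumes PINs are plain digit strings, as in the example above.

```python
import hmac

PEPPER = b"laneaward-pin-pepper-v1"  # value from the spot-check command above


def expected_pin_token(pin):
    """HMAC-SHA256 of the PIN keyed with the pepper, as a hex string."""
    return hmac.new(PEPPER, pin.encode("utf-8"), "sha256").hexdigest()


def token_matches(pin, stored_token):
    """Constant-time comparison of the expected and stored tokens."""
    return hmac.compare_digest(expected_pin_token(pin), stored_token or "")


token = expected_pin_token("444444")
print(len(token), token_matches("444444", token), token_matches("444444", ""))
```

A stored value of NULL or an empty string simply fails the match, which corresponds to the no-token status in Step 1.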

Salary Labor Cost — Calculation Methodology

Salaried contributors log time through the Time On Tasks app exactly like hourly workers. The Order Profitability report converts their annual salary to an effective hourly rate using the U.S. Bureau of Labor Statistics standard:

Formula

Effective Hourly Rate = Annual Salary ÷ 2,080

Session Labor Cost = (Annual Salary ÷ 2,080) × (Session Minutes ÷ 60)

2,080 = 52 weeks × 40 hours — the standard used by ADP, Paychex, QuickBooks, and the BLS.
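
As a worked example of the formula, a contributor on a 62,400 annual salary has an effective rate of 62,400 ÷ 2,080 = 30.00 per hour, so a 90-minute session costs 30.00 × 1.5 = 45.00. The sketch below reproduces that arithmetic. Rounding to cents and the function names are assumptions; the report's own rounding behavior is not documented here.

```python
from decimal import Decimal, ROUND_HALF_UP

HOURS_PER_YEAR = Decimal(2080)  # 52 weeks x 40 hours


def effective_hourly_rate(annual_salary):
    """Annual salary divided by 2,080."""
    return Decimal(annual_salary) / HOURS_PER_YEAR


def session_labor_cost(annual_salary, session_minutes):
    """(Annual salary / 2,080) x (session minutes / 60), rounded to cents."""
    cost = effective_hourly_rate(annual_salary) * Decimal(session_minutes) / Decimal(60)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(effective_hourly_rate(62400))   # 30
print(session_labor_cost(62400, 90))  # 45.00
```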

What Is Included

Base wage equivalent only — the annual salary stored in the contributor's profile.
Applies to all roles where compensation is tracked (Contributor, Foreman, Team Lead).
Salary workers appear in the same labor cost columns as hourly workers on the Order Profitability report.

Not Yet Included — Open for Future Discussion

Burden rate — payroll taxes (FICA ~7.65%), benefits, PTO, workers' comp, and overhead typically add 25–35% on top of base wage.
Blended rate approach — some operations use a single blended rate for all salaried staff in a role tier.
Part-time salary proration — the 2,080 divisor assumes full-time (40 hr/week).


Contents

Process 1: Deploy Approved Changes To Production

Step 1 — Confirm staging is in a deployable state

Step 2 — Optional: take a manual production DB snapshot before deploying

Step 3 — Run the deploy script

Step 4 — Schema change only: promote the staging database to production

Step 5 — Backend-only deploy (only when server.py or schema.sql changed)

Step 6 — Individual app deploys (frontend only, no DB)

Process 1B: Deploy Console Documents To Production

Process 2: Backup and Recovery

Process 2B: Restore Production Database From Backup

Step 1 — Confirm Twingate is running

Step 2 — Run the restore script

Step 3 — Verify and clean up

Process 3: Health Check and Status

Process 6: Backend Service Commands

Process 7: Update SQLite From ProfitMaker Files

Process 8: Reliability Verification For Task Writes

Current Reliability Status

Process 9: Service Worker — Deploy Model And Cache Invalidation

Process 10: Concurrent Stress Test

Prerequisites

Step 1 — Verify SSH access before starting

Step 2 — Run the standard realistic test

Step 3 — Optional: run the burst ceiling test

Common Options

If the test fails

Process 11: Server Access And Security Group Management

Option A — SSH via Twingate (from any location)

Step 1 — Open the Twingate client on your Mac and confirm it shows Connected

Step 2 — SSH using the server private IP

Option B — SSH directly from an authorized office IP

Step 1 — SSH using the server public IP

Check Twingate Connector Status

Update Security Group — Replace a Changed IP

Step 1 — Verify current rules before making changes

Step 2 — Remove the old IP

Step 3 — Add the new IP

Step 4 — Verify the updated rules

Remove the Home Lab IP (Pending Task)

Process 12: Verify PIN Hash and Login Token Status

Step 1 — Check hash and token status for all active users

Step 2 — Verify token values independently (spot check)

Salary Labor Cost — Calculation Methodology