Using Carrier Historical On-Time Data to Make Better Routing Decisions

Andre Coleman · 2025-11-18 · 7 min read

Most 3PLs have more carrier performance data than they know what to do with. Every bill of lading (BOL), every delivery confirmation, every detention invoice is a data point. The problem is that TMS platforms are built to move data, not to surface the patterns buried inside it. As a result, routing decisions at most 3PLs still rest on a mix of contract relationships, dispatcher intuition, and rate comparisons. On-time performance (OTP) analysis happens, if at all, at a level of aggregation that hides the variance that matters most.

A carrier that is 87% on time nationally may be 71% on time on your Dallas-to-Memphis lane in Q4. A carrier that looks mediocre at the portfolio level may be your most reliable option on a specific Midwest corridor during winter months. That granularity is the difference between carrier scoring and carrier management — and most 3PLs are doing the former when they need the latter.

The Aggregation Problem in Carrier OTP Analysis

When a 3PL's carrier management team reviews OTP quarterly, the typical report looks at total on-time deliveries over total deliveries for each carrier — maybe broken out by mode or by broad region. This number serves a contract negotiation purpose but does not serve a routing decision purpose.

Consider the following plausible scenario: a mid-size 3PL using five primary carriers for its Southeast network. At the aggregate level, all five carriers are performing between 83% and 89% OTP, which looks acceptable. But when the same data is sliced by lane, a different picture emerges. On lanes running from Atlanta into rural Georgia and South Carolina, carrier performance diverges significantly. One carrier's familiarity with that sub-corridor's relay structure and appointment windows has it running at 94% OTP on those lanes. Another, which performs well on I-75 volume, is at 68% on the rural SC lanes because its drivers don't know the consignee requirements and its relay point adds half a day of handling time.

The aggregate 87% figure for the second carrier gives no signal about this. The lane-level breakdown does. But you only get that breakdown if you are actively structuring and querying your BOL data at lane resolution — not just uploading it to a TMS that treats it as a transaction record.
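As a rough illustration of the gap between the two views, the sketch below computes OTP once per carrier and once per carrier-lane pair. The SCAC `CARB`, the lane labels, and the shipment counts are invented so that the totals land on the 87% and 68% figures from the scenario; none of it is real data.

```python
from collections import defaultdict

# Hypothetical records: (carrier SCAC, lane label, delivered on time?).
# Counts are made up so the aggregate comes out at 87% and the rural lane at 68%.
shipments = (
    [("CARB", "ATL-I75", True)] * 70
    + [("CARB", "ATL-I75", False)] * 5
    + [("CARB", "ATL-RURAL-SC", True)] * 17
    + [("CARB", "ATL-RURAL-SC", False)] * 8
)


def otp_by_key(records, key_fn):
    """Return {group key: share of shipments delivered on time}."""
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for carrier, lane, delivered_on_time in records:
        key = key_fn(carrier, lane)
        totals[key] += 1
        on_time[key] += int(delivered_on_time)
    return {key: on_time[key] / totals[key] for key in totals}


# Carrier-level view: one number per carrier.
aggregate = otp_by_key(shipments, lambda carrier, lane: carrier)
# Lane-level view: one number per (carrier, lane) cell.
by_lane = otp_by_key(shipments, lambda carrier, lane: (carrier, lane))

print(aggregate)  # CARB at 0.87 overall
print(by_lane)    # roughly 0.93 on the I-75 lane, 0.68 on the rural SC lane
```

The grouping key is the only thing that changes between the two calls, which is why the structural work on the data matters more than the arithmetic.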

Building a Carrier Scoring Model from Historical BOL Data

The inputs for a useful carrier scoring model are already in most 3PL TMS environments. The work is in structuring them correctly. The core data elements needed for each shipment record are: origin region (typically by state or 3-digit ZIP prefix), destination region at the same granularity, carrier SCAC code, scheduled delivery date/time at booking, actual delivery date/time, and — critically — whether any exception event was recorded (weather hold, driver swap, shipper delay, detention at origin).
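One way to hold those elements together is a small record type per shipment. The sketch below is illustrative only: the field names, the `WEATHER_HOLD` code, and the rule that "on time" means delivered at or before the scheduled time with no grace window are assumptions, not something a particular TMS defines.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ShipmentRecord:
    """One delivered shipment, reduced to the fields needed for lane scoring."""

    origin_region: str            # state or 3-digit ZIP prefix
    destination_region: str       # same granularity as the origin
    carrier_scac: str
    scheduled_delivery: datetime  # as promised at booking
    actual_delivery: datetime
    exception_code: Optional[str] = None  # None when no exception was recorded

    @property
    def lane(self) -> str:
        # Hypothetical lane key: origin and destination joined with ">".
        return f"{self.origin_region}>{self.destination_region}"

    @property
    def on_time(self) -> bool:
        # Assumption: no grace window; late by a minute counts as late.
        return self.actual_delivery <= self.scheduled_delivery


record = ShipmentRecord(
    origin_region="TX",
    destination_region="TN",
    carrier_scac="CARA",  # made-up SCAC
    scheduled_delivery=datetime(2025, 11, 3, 14, 0),
    actual_delivery=datetime(2025, 11, 3, 16, 30),
    exception_code="WEATHER_HOLD",
)
print(record.lane, record.on_time)  # TX>TN False
```

A real definition of on time would likely include an appointment tolerance; the point here is only that lane, carrier, and both timestamps live on the same row.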

Exception tagging matters because it allows you to compute weather-adjusted OTP alongside raw OTP. A carrier that is 80% on time raw but 91% on time when weather-hold events are excluded has a different risk profile than one that is 80% raw with no weather exceptions. The former may be a solid performer hurt by external factors on specific corridor segments, while the latter is likely suffering from operational issues unrelated to weather.
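A minimal sketch of the two rates, assuming that weather-adjusted OTP simply drops weather-hold shipments from both the numerator and the denominator. The `WEATHER_HOLD` code and the counts are hypothetical and chosen to reproduce the 80% raw and roughly 91% adjusted figures.

```python
WEATHER_HOLD = "WEATHER_HOLD"  # hypothetical exception code


def raw_and_weather_adjusted_otp(records):
    """records: iterable of (on_time: bool, exception_code or None).

    Returns (raw OTP, OTP with weather-hold shipments excluded).
    The adjusted value is None if every shipment had a weather hold.
    """
    records = list(records)
    raw = sum(on_time for on_time, _ in records) / len(records)
    non_weather = [on_time for on_time, code in records if code != WEATHER_HOLD]
    adjusted = sum(non_weather) / len(non_weather) if non_weather else None
    return raw, adjusted


# 100 made-up shipments: 80 on time, 12 late with a weather hold, 8 late without one.
sample = (
    [(True, None)] * 80
    + [(False, WEATHER_HOLD)] * 12
    + [(False, None)] * 8
)
raw, adjusted = raw_and_weather_adjusted_otp(sample)
print(f"raw={raw:.2f} adjusted={adjusted:.2f}")  # raw=0.80 adjusted=0.91
```

Other exception types (shipper delay, detention at origin) could be excluded the same way to isolate what the carrier actually controls.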

The scoring model itself does not need to be complex. A rolling 90-day OTP rate by carrier by lane, computed monthly and compared against a baseline expectation, is sufficient to surface actionable routing signals. A confidence-weighted score — where lanes with fewer than 30 observed shipments are flagged as low-confidence — prevents over-indexing on small sample cells.
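The rolling rate and the low-confidence flag can be sketched in a few lines of standard-library Python. The record layout, carrier codes, and lane label below are assumptions for illustration; the 90-day window and the 30-shipment cutoff come from the paragraph above.

```python
from collections import defaultdict
from datetime import date, timedelta


def rolling_otp(records, as_of, window_days=90, min_shipments=30):
    """records: iterable of (carrier, lane, delivery_date, on_time).

    Returns {(carrier, lane): {"otp", "shipments", "low_confidence"}}
    using only deliveries in the trailing window ending at as_of.
    """
    window_start = as_of - timedelta(days=window_days)
    counts = defaultdict(lambda: [0, 0])  # key -> [on-time count, total count]
    for carrier, lane, delivered_on, on_time in records:
        if window_start < delivered_on <= as_of:
            cell = counts[(carrier, lane)]
            cell[0] += int(on_time)
            cell[1] += 1
    return {
        key: {
            "otp": hits / total,
            "shipments": total,
            "low_confidence": total < min_shipments,
        }
        for key, (hits, total) in counts.items()
    }


# Made-up data: CARA has 40 recent loads, CARB only 5.
as_of = date(2025, 11, 1)
recent = as_of - timedelta(days=10)
stale = as_of - timedelta(days=120)  # outside the 90-day window
records = (
    [("CARA", "DAL-MEM", recent, True)] * 36
    + [("CARA", "DAL-MEM", recent, False)] * 4
    + [("CARA", "DAL-MEM", stale, False)] * 10
    + [("CARB", "DAL-MEM", recent, True)] * 4
    + [("CARB", "DAL-MEM", recent, False)] * 1
)
scores = rolling_otp(records, as_of)
print(scores[("CARA", "DAL-MEM")])  # otp 0.9 on 40 shipments, not low-confidence
print(scores[("CARB", "DAL-MEM")])  # otp 0.8 on 5 shipments, flagged low-confidence
```

Running this once a month with a new `as_of` date gives the monthly recomputation described above; the comparison against a baseline expectation is left out of the sketch.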

More sophisticated models add transit time variance (the standard deviation of transit time, not just its average) as a separate dimension. High average OTP with high variance is a different operational risk than a lower average OTP with low variance. A carrier that delivers 88% on time but, when late, is often 8–12 hours late creates planning problems that a carrier delivering 83% on time, with late deliveries clustered in the 1–3 hour range, does not.
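To make the 88% versus 83% comparison concrete, the sketch below summarizes two invented carriers by OTP, by median hours late among late deliveries, and by the spread of delay. It measures delay against the scheduled time rather than total transit time, which is a simplifying assumption, and every data point is made up.

```python
from statistics import median, pstdev


def lateness_profile(delay_hours):
    """delay_hours: hours past the scheduled time per shipment (0 means on time)."""
    late = [d for d in delay_hours if d > 0]
    return {
        "otp": 1 - len(late) / len(delay_hours),
        "median_hours_late": median(late) if late else 0.0,
        "delay_stdev": pstdev(delay_hours),
    }


# Carrier X: 88 on time, 12 late by 8 to 12 hours (hypothetical).
carrier_x = [0.0] * 88 + [8, 8, 9, 9, 10, 10, 10, 10, 11, 11, 12, 12]
# Carrier Y: 83 on time, 17 late by 1 to 3 hours (hypothetical).
carrier_y = [0.0] * 83 + [1, 2, 3] * 5 + [1, 2]

profile_x = lateness_profile(carrier_x)
profile_y = lateness_profile(carrier_y)
print(profile_x)  # higher OTP, but a median of 10 hours late and a wide spread
print(profile_y)  # lower OTP, but a median of 2 hours late and a narrow spread
```

On OTP alone carrier X wins; on the other two columns carrier Y is the easier one to plan a dock schedule around.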

Key Metrics That Actually Drive Routing Decisions

Not every metric in a carrier scorecard translates into routing action. The ones that do:

Lane-specific OTP rate (rolling 90 days). The primary routing signal. Below a threshold — 80% is a common operational floor for committed lanes — it triggers carrier review and potential swap. The threshold should be set lane by lane, since some lanes inherently run with more variability than others.

Transit time variance by lane. As described above. High variance makes downstream scheduling unreliable regardless of average OTP. Express this as P50 versus P80 transit time — the gap between median and 80th percentile delivery time quantifies the planning exposure.

Weather-adjusted versus raw OTP. Allows separation of carrier execution quality from external factors. A carrier's weather-adjusted OTP trending down even as weather events are flat is a signal that something in their operations is degrading — equipment age, driver retention, relay quality — before it shows up visibly in customer complaints.

Detention rate at destination. Excessive detention often indicates that a carrier is arriving outside its appointment window and waiting to be unloaded rather than running to appointment. That is an efficiency signal that affects cost as well as ETA reliability.
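Of the metrics above, the P50 versus P80 gap is the least familiar to compute, so here is a small sketch using the standard library's `statistics.quantiles`. The transit times are invented, and the `inclusive` interpolation method is one reasonable choice among several.

```python
from statistics import quantiles


def p50_p80_gap(transit_hours):
    """Return (P50, P80, P80 - P50) for a list of transit times in hours."""
    # n=10 yields nine cut points: index 4 is the median, index 7 is P80.
    deciles = quantiles(transit_hours, n=10, method="inclusive")
    p50, p80 = deciles[4], deciles[7]
    return p50, p80, p80 - p50


# Two hypothetical lanes with similar medians but different tails.
steady_lane = [20, 21, 21, 22, 22, 22, 23, 23, 24, 24]
volatile_lane = [18, 20, 21, 22, 22, 23, 26, 30, 34, 40]

steady = p50_p80_gap(steady_lane)
volatile = p50_p80_gap(volatile_lane)
print(steady)    # gap of roughly 1.2 hours
print(volatile)  # gap of roughly 8.3 hours
```

Both lanes have a median near 22 hours, but the second one needs about eight extra hours of buffer to cover four deliveries out of five, which is the planning exposure the metric is meant to expose.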

Translating Carrier Scores Into Routing Logic

The practical application in a 3PL routing workflow is to use carrier scores as a filter layer before rate comparison, not as a post-hoc report. If carrier A is 94% OTP on a given lane over the past 90 days and carrier B is 73%, and carrier B is $180 cheaper on a load, the routing decision should involve an explicit evaluation of whether that $180 delta justifies the elevated risk. That evaluation includes the cost of a potential WISMO ("where is my order") call, potential detention, and the cost of any SLA credit to the shipper.
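One simple way to frame that evaluation is an expected-cost comparison: add the probability of a late delivery times the cost of a late delivery to each carrier's rate. The sketch below uses the 94%, 73%, and $180 figures from the example; the base rates and the per-event cost components are purely hypothetical placeholders, and a single flat late cost is a strong simplification.

```python
def risk_adjusted_cost(rate, otp, late_cost):
    """Linehaul rate plus the expected cost of a late delivery."""
    return rate + (1.0 - otp) * late_cost


def break_even_late_cost(rate_delta, otp_high, otp_low):
    """Late-delivery cost at which the cheaper, less reliable carrier stops being cheaper."""
    return rate_delta / (otp_high - otp_low)


# Hypothetical cost of one late delivery: WISMO handling + detention + SLA credit.
late_cost = 25.0 + 150.0 + 400.0

cost_a = risk_adjusted_cost(rate=2000.0, otp=0.94, late_cost=late_cost)
cost_b = risk_adjusted_cost(rate=1820.0, otp=0.73, late_cost=late_cost)
threshold = break_even_late_cost(rate_delta=180.0, otp_high=0.94, otp_low=0.73)

print(f"A: ${cost_a:.2f}  B: ${cost_b:.2f}  break-even late cost: ${threshold:.2f}")
# A: $2034.50  B: $1975.25  break-even late cost: $857.14
```

Under these made-up numbers carrier B is still the cheaper choice, and it stays that way until a late delivery costs more than about $857. The value of the exercise is that the threshold is stated rather than guessed.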

Most routing systems do not present this trade-off automatically. The ops team is just looking at rate and availability. The OTP data is in a separate report that gets reviewed quarterly. Bringing the two into the same decision surface — even as a simple color-coded indicator showing carrier lane performance next to the rate quote — meaningfully changes routing behavior.
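The color-coded indicator can be very simple. The sketch below maps a lane's OTP and shipment count to a color; the 80% floor and the 30-shipment minimum echo figures used elsewhere in this post, while the 90% target and the color names are assumptions chosen for illustration.

```python
def lane_indicator(otp, shipments, floor=0.80, target=0.90, min_shipments=30):
    """Return a display color for a carrier's recent performance on one lane."""
    if shipments < min_shipments:
        return "grey"   # too few loads to trust the rate
    if otp < floor:
        return "red"    # below the operational floor
    if otp < target:
        return "amber"  # acceptable, worth watching
    return "green"


# Hypothetical quotes for one load: (carrier, rate, lane OTP, shipments in window).
quotes = [
    ("CARA", 2000.0, 0.94, 60),
    ("CARB", 1820.0, 0.73, 45),
    ("CARC", 1900.0, 0.95, 12),
]
for carrier, rate, otp, shipments in quotes:
    print(f"{carrier}  ${rate:,.0f}  {lane_indicator(otp, shipments)}")
```

Showing grey for thin data rather than a misleading green or red keeps the indicator honest about what it does not know.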

We are not saying that cheaper carriers with lower OTP should never be used. On less time-sensitive lanes, a lower-OTP carrier at a material cost saving is a perfectly rational choice. The point is that the choice should be made with OTP data in view, not in ignorance of it.

Data Volume Requirements and Practical Limits

Lane-level carrier scoring requires a minimum data density to be reliable. As a practical guideline: a carrier-lane cell with fewer than 25–30 shipments in the trailing 90 days does not have enough observations to produce a stable OTP rate — the confidence interval on a rate based on 20 shipments is wide enough to make the number nearly meaningless for precise routing decisions. Those cells should be flagged as low-confidence and treated with more qualitative judgment.
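To put a number on "wide enough", the sketch below computes a 95% Wilson score interval for an observed OTP rate. The Wilson interval is one common choice for a proportion on a small sample and is used here as an assumption; the shipment counts are illustrative.

```python
from math import sqrt


def wilson_interval(on_time, total, z=1.96):
    """Approximate 95% Wilson score interval for an on-time rate."""
    p = on_time / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


low_20, high_20 = wilson_interval(16, 20)       # 80% observed on 20 shipments
low_200, high_200 = wilson_interval(160, 200)   # 80% observed on 200 shipments

print(f"n=20:  {low_20:.3f} to {high_20:.3f}")    # about 0.584 to 0.919
print(f"n=200: {low_200:.3f} to {high_200:.3f}")  # about 0.739 to 0.850
```

With 20 shipments, an observed 80% is consistent with a true rate anywhere from the high 50s to the low 90s, which is why such a cell cannot separate a good carrier from a poor one.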

For a 3PL running 300–600 FTL loads per month across 6–10 carriers and 20–30 active lanes, most top carrier-lane pairs will have sufficient data density within 60–90 days of structured tracking. The long tail of occasional lanes on infrequent carriers will remain low-confidence, which is fine — those are also the lanes where you are likely already applying more manual judgment.

The bigger barrier to lane-level scoring at most 3PLs is not data volume — it is data structure. BOL data is often stored in ways that require manual extraction and scrubbing to get it into a format where lane, carrier, and delivery timestamp are cleanly linked. That structural work is typically a one-time investment, after which the ongoing maintenance is incremental. The ROI calculus on that investment is usually straightforward once the first significant carrier routing realignment surfaces from the data.

Andre Coleman

CEO & Co-Founder, Freightglint
