Session Notes: Karen — Labor Data Deep Dive

June 3, 2026 · Nate St. Pierre, Karen

Overview

This session went deep on the labor data process — specifically how data is currently captured and analyzed, what the goal is, and where the friction lives. Karen walked Nate through the spreadsheet she uses for labor analysis (using badges as the working example), and they worked through the data together. The conversation surfaced several specific problems: no defined threshold for when a product has enough data, a key operator whose data is systematically missing, and a manual reconciliation process Karen does every time she runs analysis.

The Goal and the Format

The end goal of the labor data work is a master reference: for each product, how many pieces can be produced per hour, and what does that cost in labor. That number feeds pricing, production planning, and internal targets for workers.

The current spreadsheet tracks this by process type — one tab per production process (badges, plaques, sublimation, etc.). For each run, the operator logs the date, their name, order number, material type, and marks which steps they performed. The goal is to derive "minutes per piece" — Karen works with that number internally, and Briana converts it to pieces per hour for production planning.
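The conversion between Karen's number and Briana's is simple arithmetic, sketched below along with the labor cost per piece that the master reference needs. The function names and the hourly wage parameter are assumptions for illustration. They do not come from the spreadsheet.

```python
def pieces_per_hour(minutes_per_piece: float) -> float:
    """Convert minutes per piece into pieces per hour."""
    if minutes_per_piece <= 0:
        raise ValueError("minutes_per_piece must be positive")
    return 60.0 / minutes_per_piece


def labor_cost_per_piece(minutes_per_piece: float, hourly_wage: float) -> float:
    """Labor cost of one piece, given a hypothetical hourly wage."""
    return hourly_wage * minutes_per_piece / 60.0


# Illustrative values only, not taken from the session data.
rate = pieces_per_hour(1.5)              # 40.0 pieces per hour
cost = labor_cost_per_piece(1.5, 20.0)   # 0.5 (currency units per piece)
```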

Karen walked through the evolution of the format: early versions had free-text process step fields, which were nearly impossible to analyze consistently. The current version uses fixed step columns: operators mark what they did, which at least produces structured data. This has been refined through several iterations to balance the burden on the production team against the usefulness of the output for analysis.

The "Years to Get Data" Question

Briana had previously mentioned that getting actionable data across all their different products would take a few years. Nate pushed back on this: RCB runs 15 to 30 different product types on any given day, many of which are reorders. Over six months to a year, that's thousands of data points. Why isn't that enough?

Karen's clarification was useful: the concern probably isn't about badges, which run frequently and have a relatively finite set of variations. It is more about products like drinkware on the radium machine, which have higher variation (different cup sizes, different batch configurations) and lower run frequency. For those products, getting a statistically confident baseline takes longer. But it's a product-specific problem, not a general one.

For badges specifically, there are roughly 50 data points currently visible in the scatter plot. The question of whether that's enough is unresolved — but Nate's read is that it probably is, or is close to it.

Why They Keep Collecting After They Probably Have Enough

This was one of the more useful parts of the conversation. The team doesn't have a defined threshold for when a product has enough data to stop tracking. As a result, everything stays in active collection indefinitely, even products that have been run dozens of times.

A few specific reasons for this:

No confidence in the numbers yet. The data is noisy — scatter plots show significant variation — and the team isn't sure they understand all the sources. Without being able to explain the variation, it feels premature to stop collecting.

Feral's data is missing. Feral is the fastest operator on the floor and has been consistently resistant to recording her data. As a result, a large portion of the dataset reflects the output rates of other, slower operators. The team knows this skews the numbers but doesn't know by how much. This alone is a reasonable argument for continuing to collect — but the fix is getting Feral's data, not running more orders with everyone else.

The system keeps changing. The capture format has evolved through multiple iterations. Data from different versions of the checklist isn't directly comparable, which erodes confidence in the older numbers.

No graduation mechanism exists. Even if everyone agreed that badges had enough data, there's no defined process for what "enough" means or what happens next. The checklist doesn't change. Collection continues.

Nate proposed a framework during the conversation: assign each product a red, yellow, or green status based on data maturity. Green means the data is good enough to quote and set targets. For anything that's green, pull the data capture requirement off the checklist — keep the SOP for training purposes, but stop asking operators to record times. Karen's response was immediate agreement. Nate estimated that 25 or more badge variants could potentially be retired from tracking today.
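One way the red/yellow/green idea could be made mechanical is sketched below. The session did not define what "enough" means, so every threshold here is a hypothetical placeholder. The choice of run count plus coefficient of variation as the test is also an assumption, not something Nate or Karen specified.

```python
from statistics import mean, stdev

# Hypothetical thresholds: the team has not yet agreed on real values.
MIN_RUNS_YELLOW = 10   # fewer runs than this is "red"
MIN_RUNS_GREEN = 30    # at least this many runs is required for "green"
MAX_CV_GREEN = 0.25    # coefficient of variation = stdev / mean


def maturity_status(minutes_per_piece: list[float]) -> str:
    """Classify a product's labor data as red, yellow, or green."""
    n = len(minutes_per_piece)
    if n < MIN_RUNS_YELLOW:
        return "red"
    cv = stdev(minutes_per_piece) / mean(minutes_per_piece)
    if n >= MIN_RUNS_GREEN and cv <= MAX_CV_GREEN:
        return "green"
    return "yellow"
```

Under this sketch, a green product is one whose time-capture requirement could come off the checklist. A yellow product has a reasonable number of runs but is still too noisy or too thin to quote from.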

How the Data Actually Gets Analyzed

Two types of data rows exist in the labor spreadsheet, and they require different handling:

Full-process rows: One operator does the entire order, start to finish, and records the total time. These are clean and relatively easy to analyze — the time is real, the rate is direct.

Pieced-together rows: Different operators perform different steps at different times. Karen manually assembles these partial records into a complete order record, then calculates the rate from the combined steps. This is more laborious and introduces more potential for error.

Karen tracks which rows are which type, because averaging them together directly would produce misleading results. She noted she's started using Claude embedded in Excel (Microsoft Copilot, which became available about a month ago) to help with some of the averaging on full-process rows. That's been useful — she mentioned it as a natural part of her current workflow, not as a big new thing.
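The manual assembly of pieced-together rows could look roughly like the sketch below. It assumes each partial record carries an order number, a step name, and a time in minutes. The record layout, field names, and the rule of skipping incomplete orders are all assumptions for illustration, not the actual spreadsheet structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StepRecord:
    order_number: str
    operator: str
    step: str
    minutes: float


def assemble_orders(records, required_steps, quantities):
    """Combine partial step records into minutes per piece, per order.

    Orders missing any required step are skipped rather than guessed.
    """
    by_order = defaultdict(dict)
    for r in records:
        steps = by_order[r.order_number]
        # If a step was logged more than once, add the times together.
        steps[r.step] = steps.get(r.step, 0.0) + r.minutes

    result = {}
    for order, steps in by_order.items():
        if set(required_steps).issubset(steps):
            result[order] = sum(steps.values()) / quantities[order]
    return result
```

Rates produced this way would still be kept apart from full-process rows, since the notes say that averaging the two types together directly gives misleading results.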

Sources of Variation in the Data

The scatter plots are messier than expected, and Karen has been working to understand why. Known sources:

Order quantity: Small orders have setup and ramp-up time that inflates per-unit time. The relationship should look like a curve that flattens out once the order hits the physical batch maximum (e.g., 50 badges on the press at once). The data roughly follows this shape, but there's more noise than the theory would predict.
Operator differences: Each operator works at a different speed. Feral is the fastest. Karen filters by operator to try to isolate this variable.
Interruptions: Some outlier data points probably reflect operators stepping away to help elsewhere mid-run rather than genuine production variation. These are hard to flag or exclude without more context.
Product size (for plaques): Larger plaques (9x12) take meaningfully longer than smaller ones (5x7) — more time to move, fewer fit on the press. For badges, the size variation is small enough (fraction of an inch) that it doesn't materially affect the numbers.
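The expected order-quantity curve can be sketched with a simple model: a fixed setup time, plus a fixed time per press batch regardless of how many pieces are on the press. The model and all of its numbers are assumptions for illustration. Only the batch size of 50 echoes the example in the notes.

```python
def expected_minutes_per_piece(
    quantity: int,
    setup_minutes: float,
    minutes_per_batch: float,
    batch_size: int = 50,
) -> float:
    """Theoretical minutes per piece under a setup-plus-batches model."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    batches = (quantity + batch_size - 1) // batch_size  # ceiling division
    total_minutes = setup_minutes + batches * minutes_per_batch
    return total_minutes / quantity


# Hypothetical inputs: 10 minutes of setup, 25 minutes per full press cycle.
for qty in (5, 50, 100, 500):
    print(qty, round(expected_minutes_per_piece(qty, 10.0, 25.0), 3))
```

With these made-up inputs the per-piece time drops steeply up to one full batch and then levels off toward the per-batch time divided by the batch size. Comparing real runs against a curve like this is one way to separate quantity effects from operator differences and interruptions.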

What's Next

Nate flagged one specific question he wants to ask Briana before finalizing his recommendations: how much data is enough? That question — if Briana can answer it — unlocks the graduation framework. It changes the labor data conversation from "we keep collecting until we feel confident" to "we collect until we hit a threshold, and then we stop."

Follow-Ups

Ask Briana: how much data is enough? What would it take to call a product's labor rate final?
Understand why Feral doesn't record her data — is it a tool problem or something else?
Confirm whether Microsoft Copilot is available across the team or just Karen's setup
Check whether plaque size is currently being captured as a variable in the data collection form