Red, Green, Wow: How to Set Performance Metrics for Senior Leaders That Actually Work

Most performance thresholds look specific on paper and fall apart the moment you try to use them in a real review. Here is how to set targets that hold up.

After helping dozens of founder-CEOs set executive performance metrics for their senior leadership teams, I've noticed where the work usually breaks down. It's almost never the structure of the role scorecard. It's the thresholds.

Most CEOs can tell you when a leader is clearly failing. Most can tell you when a leader is clearly exceptional. Almost no one can tell you, in writing, before the year starts, exactly what separates one from the other. That gap is where performance conversations fall apart.

I learned this the hard way as a founder. When I was scaling my software agency to the Inc. 500 list, I wrote performance targets for my leadership team that looked specific on paper but meant almost nothing in practice. By the time we sat down to review, the metric I'd written down was useless. The leader and I were having two completely different conversations about the same period. We never had the alignment we thought we had.

I call the underlying problem The Performance Mirage. The threshold looks like clear water from a distance. When you walk up to it for the actual review, there's nothing there to drink.

A role scorecard is the document in which a CEO and a senior leader agree in advance on what success looks like in the role. The most important section is Key Responsibilities. That's where you list the five outcomes the leader is accountable for, each with three specific performance thresholds: Red, Green, and Wow. Most of the energy in scorecard work goes into writing those thresholds, and most of the trouble in performance conversations traces back to thresholds that were written carelessly. This post is about how to write them so they hold up.

Why Thresholds Make or Break the Scorecard

When thresholds are vague, both sides fill in the gap with their own interpretation. The CEO walks into the review expecting one outcome. The leader walks in believing they had a strong period. Neither of them is wrong. They just never agreed on what the threshold meant in the first place.

This is the part of the work most CEOs want to skip. Writing concrete thresholds is uncomfortable. It forces you to commit to numbers you're not certain about. It exposes the parts of the role you haven't actually thought through. Vague feels safer in the moment.

The problem is that the discomfort doesn't go away. It just gets deferred. The conversation you avoided while drafting the scorecard comes up during the review, and by then the stakes are much higher. Now you're not negotiating a threshold. You're negotiating someone's standing, their advancement, or whether they keep the role.

There's a second cost most CEOs underestimate. Vague thresholds make it impossible to coach in real time. If neither of you can say exactly what a strong result looks like, neither of you can say whether the work in progress is on track. You lose the ability to course-correct. You only find out the metric was off when it's too late to do anything about it.

Specific thresholds aren't an administrative chore. They're the operating system for the relationship between a CEO and a senior leader. When they work, the relationship runs smoothly and the leader develops fast. When they break, everything else built on top of them breaks too.

What Red, Green, and Wow Are Actually For

Red, Green, and Wow are three specific numbers, not three vague zones. Each one defines a precise point on the performance continuum, and the difference between them tells you what conversation to have.

Red is the number at which you know there is a problem. Hitting Red doesn't mean the leader is failing as a person. It means something in the function is not working and needs intervention. Red is a transparency signal, not a punishment level. When a leader's actual performance lands at Red, you have a real conversation about what changed and what needs to happen next.

Green is the number that defines the minimum acceptable performance for the role. Solid performance. The job is being done at the standard the role requires. Here is the contrarian point most CEOs miss. Green should not be easy. If everyone hits Green by default, the threshold is set too low. Green is the floor of acceptable performance, not the goal.

Wow is the stretch number that defines exceptional performance. The kind of result that creates compounding value for the business. Wow is achievable but not expected. If a leader hits Wow, something meaningful has shifted in their function. New revenue territory. A retention number nobody thought possible. A cost structure that unlocks the next stage of growth.
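Because the three thresholds are numbers rather than zones, reading a result against them can be done mechanically. Here is a minimal sketch, assuming a metric where higher is better. The names `Thresholds` and `classify`, the net margin figures, and the "Below Green" label for results that land between Red and Green are illustrative assumptions, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Three specific numbers for one Key Responsibility (higher is better)."""
    red: float
    green: float
    wow: float

    def __post_init__(self) -> None:
        # The three numbers must be distinct and ordered to mean anything.
        if not (self.red < self.green < self.wow):
            raise ValueError("expected red < green < wow")


def classify(actual: float, t: Thresholds) -> str:
    """Return the level an actual result reached."""
    if actual >= t.wow:
        return "Wow"
    if actual >= t.green:
        return "Green"
    if actual <= t.red:
        return "Red"
    # Assumption: the band between Red and Green is given a neutral label here.
    return "Below Green"


# Illustrative net margin thresholds, in percent.
net_margin = Thresholds(red=8, green=15, wow=20)
for actual in (7.5, 12, 16, 21):
    print(actual, classify(actual, net_margin))
```

The point of the sketch is that the label is not up for debate once the numbers are agreed. The conversation can go straight to what is behind the result.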

Here is what most CEOs get wrong about this framework. They treat Green as the goal and Wow as a nice-to-have. The orientation should be the opposite. People rise to the level you measure. If your scorecard implicitly tells a leader that hitting Green is success, they'll calibrate their effort to Green. If your scorecard tells them Wow is the actual orientation and Green is the floor, they'll calibrate their effort upward. The threshold you center is the threshold they will work toward.

This connects to a broader principle in goal-setting. High, Hard Goals are ambitious targets that force the team into the stretch zone where real growth happens. Safe goals produce safe teams. The job of a CEO is to calibrate the stretch, not remove it.

A second misconception worth naming. A healthy scorecard does not have everything in Green. If a leader is reporting Green across every Key Responsibility every period, one of two things is true. Either the thresholds are too easy, or the leader is hiding the parts of the role that aren't working. Real growth shows up as a leader actively wrestling with at least one area that sits at Red, or between Red and Green, at any given time. That's what development looks like in motion. A scorecard with one Red and one Wow tells you more about a leader than a scorecard with five Greens.

The goal isn't to engineer a sea of Green checkmarks. The goal is to have an honest conversation about where the leader is performing, where they are stretching, and where they need help.

Where Thresholds Break

Every weak threshold I've seen falls into one of five failure modes. Most first drafts contain at least two.

Range Mismatch: The threshold doesn't align with how the metric actually works. The most common version is NPS. NPS ranges from -100 to +100, with anything above 50 considered excellent. I've seen leadership team scorecards that set NPS thresholds at "Red 20, Green 25, Wow 30." The numbers look specific but they aren't anchored in how the metric is calibrated. A 30 NPS is not a Wow result. It's a mediocre one. Before you write thresholds for any metric, understand how the metric is scaled and where industry benchmarks actually sit.
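To see where the -100 to +100 range comes from, it helps to look at how the score is calculated: the percentage of promoters (ratings of 9 or 10 on a 0 to 10 scale) minus the percentage of detractors (ratings of 0 through 6). The sketch below uses a hypothetical set of ten survey responses, and the function name `net_promoter_score` is my own.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """NPS = percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey: 4 promoters, 4 passives, 2 detractors.
ratings = [10, 9, 9, 8, 8, 7, 7, 6, 5, 10]
print(net_promoter_score(ratings))  # 20.0
```

A set of thresholds five points apart on a scale that spans two hundred points is a sign the writer did not check how the metric moves.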

Cadence Mismatch: The metric can't be measured at the cadence you'd actually want to review it. If the only way to read this metric is the annual customer survey, you can't use it to coach in real time. By the time the data lands, the period is over. Useful executive performance metrics produce signal at the rhythm you actually meet to discuss them. Quarterly metrics for quarterly reviews. Monthly metrics for monthly check-ins. If a metric only resolves once a year, it's a year-end audit measure, not a performance management tool.

Missing Denominators: "Delivery efficiency: 35 to 50 percent." Of what? Percent of projects on time? Percent of margin? Percent of capacity utilized? A threshold without a clearly defined denominator isn't a threshold. It's a wish. Every percentage on a scorecard needs an explicit definition of what's being divided by what, and both you and the leader need to agree on that definition before the period starts.
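One way to force the definition is to refuse to write a percentage down without naming both halves of the fraction. A minimal sketch follows. The `RatioMetric` class, the on-time delivery definition, and the project counts are all hypothetical and stand in for whatever you and the leader agree on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatioMetric:
    """A percentage metric that cannot exist without a stated denominator."""
    name: str
    numerator: str
    denominator: str

    def definition(self) -> str:
        # The one-sentence statement of what is divided by what.
        return f"{self.name} = {self.numerator} / {self.denominator}"

    def value(self, numerator_count: float, denominator_count: float) -> float:
        if denominator_count == 0:
            raise ValueError("denominator count must be non-zero")
        return 100 * numerator_count / denominator_count


# Hypothetical definition agreed before the period starts.
on_time = RatioMetric(
    name="On-time delivery",
    numerator="projects delivered by the agreed date",
    denominator="projects due in the quarter",
)
print(on_time.definition())
print(on_time.value(14, 40))  # 35.0
```

The same "35 percent" would mean something entirely different as a share of margin or of capacity, which is why the definition belongs on the scorecard next to the number.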

Wrong Altitude: Activity metrics show up on senior leadership scorecards where they don't belong. A VP of Sales should not have "number of calls made" as a Key Responsibility threshold. A COO should not have "number of process documents written." Senior leaders are responsible for outcomes. Their teams handle the activities. When activity metrics creep onto senior scorecards, the scorecard has lost its altitude. The threshold is measuring the wrong layer of the org.

Insufficient Spread Between the Three Numbers: The thresholds look like Red, Green, and Wow but they're clustered so tightly together that they describe essentially the same outcome. The difference between Green and Wow has to represent real outperformance, not a rounding error. Take net margin as an example. If you know that anything below 8 percent is a problem, 15 percent is acceptable performance for the business, and exceptional performance is 20 percent or higher, that's a working set of thresholds. Red 8, Green 15, Wow 20. The Wow is genuinely difficult but possible. Hitting it represents a meaningfully different financial outcome than hitting Green. Now compare that to a set written as Red 13, Green 15, Wow 17. Same metric, no real spread. The leader who lands at 17 has done essentially the same job as the leader who lands at 15. Wow has to stretch. If the gap between Green and Wow doesn't require something different from the leader, the framework collapses into pass-fail.
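The arithmetic behind the net margin comparison can be sketched in a few lines. This is an illustration, not a rule: the 25 percent minimum stretch is a hypothetical cutoff I picked to separate the two examples, the function names are my own, and the check only looks at the gap between Green and Wow.

```python
def relative_stretch(green: float, wow: float) -> float:
    """How far Wow sits above Green, as a fraction of Green."""
    if green <= 0:
        raise ValueError("green must be positive for a relative comparison")
    return (wow - green) / green


def has_real_spread(red: float, green: float, wow: float,
                    min_stretch: float = 0.25) -> bool:
    """True if the numbers are ordered and Wow is a real stretch past Green.

    min_stretch is an assumed cutoff; choose one that fits the metric.
    """
    return red < green < wow and relative_stretch(green, wow) >= min_stretch


# Net margin examples from the text, in percent.
print(round(relative_stretch(15, 20), 3), has_real_spread(8, 15, 20))
print(round(relative_stretch(15, 17), 3), has_real_spread(13, 15, 17))
```

In the first set, Wow sits a third above Green. In the second, it sits about 13 percent above, which is close enough that the two levels describe nearly the same year.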

These failure modes appear in various combinations across most first-draft scorecards. The common thread is that all of them produce thresholds that look specific but can't survive a real performance conversation. The metric reads well in the document. It doesn't read well when you're sitting across the table from the leader, trying to decide whether the period was good.

The work of fixing these isn't glamorous. It's a line-by-line revision of the language you wrote in a hurry. The payoff is a scorecard you can actually use.

Test Your Current Thresholds

If you have a role scorecard already in place for one of your senior leaders, here is the test. Pick one Key Responsibility. Read the Red, Green, and Wow thresholds out loud. Then ask yourself five questions in order.

  1. Does the metric make sense in its actual range? If the metric is NPS, are the numbers anchored in how NPS is scaled? If it's a percentage, do the thresholds fall within the range where the percentage actually applies? Many first drafts fail here because the writer didn't check the metric's natural scale.

  2. Can you measure this metric at the rhythm you'd want to discuss it? A useful threshold produces a signal at the cadence you actually review performance. If the answer is "only at year's end," the threshold can't be used for active management. It's an audit measure, not a coaching tool.

  3. Is the denominator clearly defined? If the threshold contains a percentage or a ratio, you should be able to state in one sentence what's being divided by what. If you can't, the threshold is a wish, not a target. Both you and the leader need to agree on the definition before the period starts.

  4. Is this an outcome metric appropriate to the seniority of the role? Or is it an activity metric that belongs further down the org chart? Senior leaders are accountable for the outcomes their team produces. Activity metrics on senior scorecards almost always signal that the altitude is wrong.

  5. Does the difference between Green and Wow describe a meaningfully different performance outcome? Or is it a single number with rounding on either side? If hitting Wow versus Green is the difference between 95 and 96, the framework has collapsed into pass-fail.

If the threshold passes all five, it's a real threshold. Two reasonable people can read it, look at the actual data, and agree on which level was hit. That's the bar. If it fails even one, it's a mirage.

Run this test on three or four Key Responsibilities across your senior leadership team. You'll find more mirages than you expect.
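If you want to track the results of the five questions across several Key Responsibilities, the test can be recorded as a simple checklist. A minimal sketch, assuming you answer each question yourself as yes or no. The `ThresholdReview` class, its field names, and the example answers are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ThresholdReview:
    """Yes/no answers to the five questions for one Key Responsibility."""
    range_fits_metric: bool
    measurable_at_review_cadence: bool
    denominator_defined: bool
    outcome_not_activity: bool
    green_to_wow_is_meaningful: bool

    def failures(self) -> list[str]:
        # Names of the questions answered "no", in the order they were asked.
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_mirage(self) -> bool:
        # Failing even one question is enough.
        return bool(self.failures())


# Hypothetical review of a threshold read only from an annual survey.
review = ThresholdReview(
    range_fits_metric=False,
    measurable_at_review_cadence=False,
    denominator_defined=True,
    outcome_not_activity=True,
    green_to_wow_is_meaningful=True,
)
print(review.is_mirage(), review.failures())
```

The list of failures doubles as the agenda for the revision conversation with the leader.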

How to Start the Conversation

Once you've found the mirages, the work is not to fix the document. It's to start a different kind of conversation with each senior leader. Neither of you arrives with the answer. Both of you arrive curious about what the right threshold should be, what the metric is really measuring, and what good performance in this role would mean for the business. The goal is shared understanding, not compliance. Three steps get you there.

Bring a draft, not a verdict. Pick one role and write a first version of the Key Responsibilities thresholds using the failure modes as a checklist. Treat what you write as a starting point for the conversation, not a finished decision the leader needs to accept. The leader will read your draft as a signal of how the rest of the process will feel. If the draft arrives as an edict, the ensuing conversation will be defensive. If it arrives as a working hypothesis, the conversation will be honest.

Pressure-test together and stay curious. Sit down with the person in the role and walk through the draft together. Ask where the language doesn't match how the metric actually works in practice. Ask what they would set as the Wow if they were writing it themselves. Ask what's getting in the way of even higher performance, and what they'd need to make the stretch real. The leader runs the function. They know the texture of the work in ways you can't from the outside. The threshold has to withstand their scrutiny before it can endure a period of performance, and the only way to know is to invite pushback and listen to what it tells you.

Treat threshold-setting as ongoing work. The first draft is not the final answer. Thresholds get revisited as the business changes, as the leader develops, and as you both learn what the metric is really capturing. Build the expectation early that you and the leader will recalibrate together as part of how the function operates, not as a separate event you do once and forget. The first version will be flawed. The version you have a year from now, after several rounds of revision you've worked through together, will be a tool you actually use because both of you helped build it.

Questions for You and Your Team

Before moving on, take a few minutes to reflect on these questions. The goal isn't to have perfect answers. It's to surface whether The Performance Mirage might be affecting your leadership team's performance and your ability to coach them well.

  • For one of your senior leaders, can you write down right now exactly what the Red, Green, and Wow numbers are on their most important Key Responsibility? Not in vague terms, but as three specific numbers with a clear denominator, anchored in how the metric actually works. If you can't, the threshold is a mirage, and the next performance conversation will be harder than it needs to be.

  • Read one of your senior leaders' Key Responsibilities out loud and ask yourself: could two reasonable people look at the actual data and agree on which threshold was hit? If the answer requires interpretation, judgment calls, or "well, it depends," the threshold is not doing its job.

  • Across your senior team, how often does someone show a Red on their scorecard? If the answer is "almost never," either the thresholds are too easy or your leaders don't feel safe surfacing where they're struggling. Both problems trace back to how the thresholds were set and how the scorecard is being used.

Take the Next Step

If this post surfaced gaps in how your leadership team's roles are defined, the Leadership Team Assessment is a good starting point. It evaluates the health of your leadership structure, including how clearly expectations, roles, decision rights, and accountability are defined across your team.

Take the Leadership Team Assessment

Ready to go deeper? Book a call to talk through what Role Scorecards would look like for your leadership team.

Book a Call

About the Author

Bruce Eckfeldt is a strategic business coach and exit planning advisor who helps founder-CEOs of growth-stage companies scale systematically and exit successfully. A former Inc. 500 CEO who built and sold his own company, he brings real-world operational experience to strategic planning and leadership development. He's a certified ScalingUp and 3HAG/Metronomics coach, Certified Exit Planning Advisor (CEPA), an Inc. Magazine contributor, and host of the "From Angel to Exit" podcast. Bruce works with growth companies in complex industries, guiding leadership teams through growth challenges and exit preparation. Reach him at bruce@eckfeldt.com with questions, for more information, or to book a call.