A few weeks ago I was on a video call with someone about managing large-scale, distributed, multi-disciplinary teams. A few years ago I coined the term "agility for adults" as the title for a conference talk. The term captures the fact that while small teams can be managed with simple models of motivation and inspiration, agility at large scale also requires more advanced models of agile metrics.
Beyond sticky notes (popularized by 3M's Post-it brand) and aspirational phrases that sound nice and motivate teams, the challenge of agility designed to achieve things at large scale lies in mobilizing a larger organizational structure.
Big goals, big challenges
Completing initiatives or achieving goals of a higher order of magnitude is not a simple task. And whenever we think about this problem, the following questions arise: How do we build agile indicators and metrics to monitor the operation? Which agile metrics really allow us to see how well we are doing in our projects? Which agile metrics allow us to make better-informed decisions?

However, for the purposes of this article, I want to make it clear that although I am in love with coding, I am not a professional software developer. Professional life took me down the paths of team and project management, but I have never stopped writing code - never for more than a year at a time. Sometimes I do it as a hobby; I regularly take courses on online platforms to keep myself up to date, or I support development projects by contributing a few lines of code - from the underrated PHP to the powerful Go.
Coding is something I am passionate about. Thanks to this conjunction between managing people and writing colorful letters [1: in the most popular source code editors there is a feature called syntax highlighting, which highlights keywords, instructions and symbols, making it easier to distinguish the code and its structure], I have been fortunate to participate in, coordinate and lead projects with few, many or hundreds of people, across design, development, testing and operation. And from that perspective, I share the agile metrics that I have used - or still use - in my projects and with my teams.
This article contains a summary of the metrics, indicators and dashboards that I have successfully worked with over the years and that today, in my role as a consultant, I like to implement with the teams I serve.
Measures, metrics and indicators
For a long time I thought that measurement, metric and indicator were essentially the same. Over time I found subtle differences and with experience I came to understand the great distance between them.
So what's the difference between measures, metrics and indicators? Let's go with some definitions and an example of each.
Concept | Definition | Example |
---|---|---|
Measurement | The result of comparing a quantity (what we want to measure) against a constant or standard reference value (meters, bytes, kilograms). | 10 requirements; 5 low-priority defects; 3 deliveries; 4 iterations |
Metric | The combination of two or more measurements to provide ratios or relationships. | 100 kilometers per hour; 50 gigabytes per second; 30 story points per iteration |
Indicator | A rating or valuation that tells us something about a result. | 100% compliance; 3% performance; A+ on a test |
Comparative example between measure, metric and indicator
To understand the difference, let's work through an example that uses measures, metrics and indicators.
Joanna took a 100-question test and got 50 correct.
Subjective evaluation of the data
In your opinion, how do you think Joanna did? Let's see.
- Joanna took a 100-question test (measurement).
- According to the creators of the test, Joanna answered 50 questions correctly (measurement). This measurement results from comparing each question and its answer to the expected question and answer (standard).
- Joanna scored 50% correct (metric). This metric is the ratio of correct answers to total questions.
Experiences and value judgments
At this point you will probably say: "Joanna has failed". The truth is that you are using a metric as an indicator, by assuming that the goal of the test was to get 100% of the answers correct. That may be right in school and college, but in a business context it is just a value judgment - a very subjective opinion.
To define or identify an indicator, let's add a little more information to the example.
After evaluating more than 15,000 tests performed on the same number of people, we found that Joanna obtained one of the highest scores, outperforming more than 99% of those tested.
Wait, hold on! Now it's getting interesting. Despite answering only 50% of the questions correctly, Joanna is in the top-scoring 1%. What do you think now? Do you have more insight into Joanna and her performance? Does that 99% tell you something? Can you offer a more objective opinion of Joanna and her test?
It certainly does. The 99% tells us that Joanna is among the top 150 out of 15,000. And we can say with some confidence that Joanna is one of the good ones - or one of the less bad ones, if you want to be picky about the fact that she only answered 50% correctly. In both cases, that number (99%) is an indicator. While 50% was a cold metric that, under certain biases or paradigms, was assumed to be an indicator, 99% does tell us something and reduces the subjectivity of our judgment.
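The Joanna example can be sketched in a few lines of Python. The simulated population of 15,000 test results is an assumption purely for illustration, not real data:

```python
# Measurement, metric and indicator in the Joanna example.
total_questions = 100      # measurement
correct_answers = 50       # measurement (answers compared against the standard)

score = correct_answers / total_questions   # metric: ratio of two measurements
print(f"Metric: {score:.0%} correct")

# The indicator needs a reference population: everyone else's scores.
# This distribution is made up to mirror the article's scenario.
other_scores = [0.10] * 14849 + [0.45] * 100 + [0.60] * 50
outperformed = sum(s < score for s in other_scores)
percentile = outperformed / len(other_scores)   # indicator: Joanna vs. the rest
print(f"Indicator: outperforms {percentile:.1%} of test takers")
```

The metric alone (50%) says little; the indicator places it against a standard of comparison.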
How to correctly define agile indicators and metrics?
Now that the difference between measurement, metrics and indicators is clear, I'm going to bring examples of indicators and metrics for agility. However, it is not possible to define metrics and indicators that apply to every context. In order to keep the article simple and enjoyable, I'm going to set two different objectives:
- Measuring agile team performance with indicators and metrics
- Measuring project performance and progress to completion
Although many of the measurements are the same, the metrics and indicators seek to establish different relationships and readings. You must not forget the temporality of projects: by their very conception, they have a beginning and an end.
Let's take an example: if you run the IT area of an organization, under normal conditions you wouldn't think about dismantling the team arbitrarily. When we talk about projects, however, we inevitably think of a team that, with absolute certainty, will have to disband - "adjourning", the connoisseurs would say.
What are Story Points?
This is an article about agile metrics, and one concept I need to clarify before moving forward is Story Points (SP). That doesn't mean I'm talking exclusively about Scrum. To understand how agile team metrics work, we must define the context and determine:
- A measurement period, such as Sprints in Scrum, iterations - a generic term across agile and hybrid models - or Program Increments in SAFe.
- A unit of effort measurement - one that hopefully includes complexity and risk in some way. Imagine having to consolidate a multi-team dashboard that mixes person-hours, function points and SPs - it's madness!
The first is simple: define a fixed period of time - a week, an iteration, a month, a quarter or a year. The second requires a deeper discussion. The most common units of measurement we use to estimate effort are:
- Person hours - how many hours it will take to complete a task or deliverable.
- Ideal person-days - how many ideal days, i.e. without interruptions or task switching, it would take someone to complete a task or deliverable. Ideal days are almost always accompanied by a productivity factor: 3 ideal days at a 50% factor means it will take 6 calendar days to complete.
- Function points - difficult to explain if you're not a programmer, but I leave a link for the curious.
- Story Points
But what are Story Points really? Let me explain with a simple example.
Defining Story Points
Story points (SP) are a unit of measurement used to express a relative estimate of effort. The estimate combines, in a single value:
- Effort
- Complexity
- Risk
To understand the concept, here is an example.
A person has to install 20 windows in a 10-storey building. All the same, of the same dimensions and anchored with the same mechanisms. The building has no structural modifications that imply changes in the installation process.
A scheduler might assume that if installing one window takes 20 to 30 minutes, the total process of installing all 20 windows takes 400 to 600 minutes.
However, if we consider that not all windows are on the same floor and that there is additional risk due to height, we could assume the following [2: the window example is, of course, oversimplified to highlight the fact that the duration of an activity is not the only variable considered in the SP calculation]:
- The windows of the first 3 floors have a relative score of 1SP
- Windows on floors 4 to 8 have a relative score of 2SP - increased risk and complexity
- And the windows on the 9th and 10th floors have a relative score of 3SP - much greater risk and complexity
As you can see, the SPs consider that the higher the floor, the greater the complexity or risk. We not only evaluate the duration of the activities; we also consider other dimensions such as risk or complexity.
Story Points are a unit of relative measure that only makes sense when you have more than one request or requirement to analyze. Establishing in isolation that a requirement ABC weighs 3SP, 5SP or 20SP makes no sense. However, saying that requirement ABC is 3SP and that XYZ is 5SP tells us that XYZ involves almost twice as much effort, complexity and risk as ABC.
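The window example and the relative comparison can be sketched in a few lines of Python. The SP weights per floor come from the example above; the split of two windows per floor is my assumption to reach 20 windows:

```python
# SP per window on each floor, per the window example.
sp_per_floor = {**{f: 1 for f in range(1, 4)},   # floors 1-3: 1 SP per window
                **{f: 2 for f in range(4, 9)},   # floors 4-8: 2 SP per window
                **{f: 3 for f in range(9, 11)}}  # floors 9-10: 3 SP per window
windows_per_floor = 2                            # assumption: 2 x 10 = 20 windows

total_sp = sum(sp * windows_per_floor for sp in sp_per_floor.values())
print(total_sp)  # 38

# Relative comparison between two requirements, ABC (3SP) and XYZ (5SP):
abc, xyz = 3, 5
print(f"XYZ is {xyz / abc:.2f}x ABC")
```

Note that 38SP says nothing by itself; it only becomes meaningful relative to other work sized with the same scale.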
Metrics for an agile team
Beyond projects, there is a management universe where we can consider work as continuous and indefinite. I know, nobody lives forever and in our times, many people don't even last two years in their jobs. However, the way we understand the performance of a team is continuous.
What metrics are useful in a team? Here are the most common ones
Capacity: how much do we think we can complete?

My favourite definition of capacity is: the amount of work we consider, or project, we can complete in a defined period of time.
Capacity is measured in person-hours, function points or story points. We do not always have the information to express the measurement in a particular unit, and part of the job of a Scrum Master - or a facilitator in roles such as RTE or DASSM - is to validate that the defined metric and measurement fit the context of the organization and are a "relief" rather than a "headache". [3: In the agile world, relative estimation is favored. Among the most popular relative-estimation practices is Planning Poker. Likewise, the most commonly used unit is story points, but it is not the only option.]
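One common heuristic for projecting capacity - a sketch of one approach, not the only way to do it - is to scale the historical average velocity by the team's availability in the coming period. All numbers here are illustrative:

```python
# Project next period's capacity from historical velocity and availability.
past_velocities = [29, 28, 31]                 # SP completed in recent periods
avg_velocity = sum(past_velocities) / len(past_velocities)

available_person_days = 45                     # e.g. one member on holiday
normal_person_days = 50
capacity = avg_velocity * (available_person_days / normal_person_days)
print(round(capacity, 1))  # ~26.4 SP
```

The point is that capacity is a projection, not a promise: it should move when availability moves.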
Velocity: given our capacity, how much have we completed?
Velocity is a concept associated with the amount of work a team can perform or complete in a period. It was originally conceived in eXtreme Programming (XP) to measure "individual productivity" - which didn't make some people very happy. Velocity was originally used to determine a team's "load factor", a convoluted concept that was later displaced by Story Points.
Today, velocity is a number of points - of relative effort, like SPs - in a given period. We can classify it into:
- Planned Velocity: the number of points a team expects to complete in a period.
- Real Velocity: the number of points a team actually completed in a period. It should be noted that, in Scrum, "complete" includes both the Development Team's work and the PO's approval - i.e., completed and approved.
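A sketch of real velocity in a Scrum-like context, where only items that are both done and approved by the PO count toward the total. The item structure and field names are my assumptions for illustration:

```python
# Real velocity: sum SP only for items that are done AND approved.
items = [
    {"sp": 5,  "done": True,  "approved": True},
    {"sp": 8,  "done": True,  "approved": False},  # done but not approved: excluded
    {"sp": 3,  "done": False, "approved": False},  # not finished: excluded
    {"sp": 13, "done": True,  "approved": True},
]
real_velocity = sum(i["sp"] for i in items if i["done"] and i["approved"])
print(real_velocity)  # 18
```

The 8SP item illustrates why "done" alone is not enough: work awaiting approval inflates velocity if you count it.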
Fulfillment: what have we successfully completed?

Fulfillment can be defined as the amount (measurement) of work that was effectively delivered or resolved within a defined period of time (another measurement). However, in contexts of high variability, I like to separate or decompose this metric into two:
- Compliance on the plan: how much of what we planned were we able to complete successfully?
- Fulfillment over effort: how much of what we thought we could complete did we actually complete?
At first glance you might consider them to be the same thing. And surely some Scrum-maniac will argue that within a Sprint we cannot include additional work. However, this is not always the case. [4: The term Scrum-maniac refers to blind love for - and ignorance of the realities behind - an academic management model. I am surprised by the political correctness of my definition.]
If the time period allows, or the scope of the work is extremely variable, it is very likely that we will find new elements or even swap some out for others we consider of similar magnitude. In this context, the tasks planned at the beginning of the period are not necessarily a good baseline for compliance.
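The two compliance metrics can be sketched as follows; the numbers are invented for illustration:

```python
# Two views of fulfillment for one period.
planned_sp = 30          # SP committed at the start of the period
delivered_planned = 24   # SP from the original plan actually delivered
delivered_new = 3        # SP from work added mid-period

plan_compliance = delivered_planned / planned_sp
effort_compliance = (delivered_planned + delivered_new) / planned_sp
print(f"plan: {plan_compliance:.0%}, effort: {effort_compliance:.0%}")
```

Here the team fulfilled 80% of its plan but 90% of its effort capacity; the gap between the two numbers is exactly the unplanned work.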
Variability: How stable is our plan?

Consequently, if compliance requires understanding how much the original plan - the plan defined at the beginning of the period - has changed, a metric that rates how much the original plan has changed can be very useful and revealing for the team.
Calculating variability can be challenging and requires a lot of discipline. What we want to avoid is working on things that don't contribute to our metrics in any way. Therefore, every new activity or task within a period of time should be marked or classified as part or consequence of the original plan, or new and unexpected.
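The tagging discipline described above can be sketched like this; the tagging scheme and numbers are my assumptions:

```python
# Variability: the share of delivered work that was not in the original plan.
work_items = [
    {"sp": 5, "origin": "planned"},
    {"sp": 8, "origin": "planned"},
    {"sp": 3, "origin": "unplanned"},   # new, unexpected work
    {"sp": 2, "origin": "unplanned"},
]
total = sum(i["sp"] for i in work_items)
unplanned = sum(i["sp"] for i in work_items if i["origin"] == "unplanned")
print(f"variability: {unplanned / total:.0%}")  # 5 of 18 SP
```

Without the origin tag on every item, this metric simply cannot be computed - which is why the discipline matters.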
Quality: how well do we do the job?

Well, we couldn't leave out quality. In this case, I like to think of quality as a relationship to the quantity of work delivered. Many teams simply "count" defects or issues. This, as you may already know, is a mistake. Measurement is not a metric and certainly not an indicator.
The dictionary defines quality as:
Property or set of properties inherent to something, which allow its value to be judged.
Dictionary of the Spanish language
Thus, one quality metric I like to use is what some authors call the defect rate: what percentage of the delivery is in some way compromised or affected by quality-related problems?
A similar metric - and my favorite - is defect density: the ratio between the number of defects and the size of the delivery.
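Defect density is a one-line calculation; the numbers below are illustrative (they match the style of the tables later in the article):

```python
# Defect density: defects injected per unit of delivery (here, per SP).
defects = 7
delivered_sp = 25
density = defects / delivered_sp
print(round(density, 2))  # 0.28 defects per SP delivered
```

Because it is a ratio, density stays comparable across periods even when the amount delivered changes, which a raw defect count does not.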
Metrics for Agile Projects
Of course, projects can benefit from team metrics. However, unlike a team, a project moves toward a particular completion or closure. Similarly, project metrics can be used to track a release.
Progress: how far have we come?
By defining a goal to reach, it is possible to establish a progress metric. To do this we need one of the following - although I personally try to set both:
- One date - milestone
- A specific objective - depending on the product or service under development.
With a limit or goal in place, we can define a progress metric based on the amount of work completed and its impact on the goal - something often called value - as opposed to the total planned capacity to reach the milestone.
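A minimal sketch of such a progress metric, with assumed numbers:

```python
# Progress toward a milestone: completed work over the total capacity
# planned up to that milestone.
completed_sp = 180
planned_capacity_to_milestone = 300
progress = completed_sp / planned_capacity_to_milestone
print(f"{progress:.0%}")  # 60%
```

The denominator is the tricky part: as the next section argues, in adaptive contexts the scope behind it keeps moving.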
Danger! Progress to completion over a variable scope
The greatest difficulty in measuring progress in an "adaptive" context is the constant quest to close the scope. It is difficult, and for some even annoying, to track progress toward a moving goal. In the agile context, the concept of target or value is preferred over scope. Thus, the scope may vary depending on the objective to be achieved.
Example of a fixed target with a variable scope
Let's go with another example, but this time, let's leave Joanna alone.
It's Friday night and Pedro, a bachelor about to turn 30, wants to go out and have some fun. However, he doesn't have much money and, tomorrow, Saturday, he has a lunch with his parents that he wants to attend.
So, Pedro has defined:
- A clear objective
- An estimated budget - a limit to resources
- A deadline - let's say he doesn't want to be hungover at lunch with his parents, or at least wants to be able to mask the hangover.
However, Pedro does not have a plan, and he is not clear on how he is going to "have fun". So let's see how the scope varies in light of the goal.
First scenario
Pedro has decided to call his friends and meet them at one of their houses. If Pedro is like me, they'll probably drink a few beers, reminisce about times gone by, maybe have a few video game tournaments and some pizza at some point in the evening. By about 3 or 4am, Pedro will be at home, happy to be reunited with his friends and have had a night full of "friendship".
Second scenario
Pedro has decided to go to his favorite bar. There he has met someone he likes very much and with whom there is a clear possibility of a great night. It is a moment of "flirting", of romanticism, a moment of conquest. Pedro decides to approach and try his luck. The next day, without going into details, Pedro is happy and feels that he has had a night full of "romance".
Third scenario
Pedro has decided to go to his favorite bar, but he doesn't recognize anyone. He feels a bit lonely, but sees the opportunity to talk to new people. Over a few drinks and some good music he makes new friends. At the end of the night he is home, having met new people, and feels he has unexpectedly enjoyed his favorite bar.
Each of the scenarios involves different activities and behaviours within a common goal and under the same restrictions - known in project jargon as constraints. Now, do you see how the scope can vary depending on the target? I didn't say it was simple, or fast, but in certain contexts it is possible.
Debt: how much have we added to or negotiated out of the original plan?
Well, if we have a progress metric, and we know the scope is variable, it's a very good idea to measure and track the project's debt against the plan. That is: what percentage of the work originally planned or projected is now beyond the team's capacity for the set date and target?
What happens if Pedro is at a table with new friends and at that moment the person he likes walks in? Inevitably, for the same night and with the same budget, he has to make decisions. Every decision will have a cost, and some activities will fall outside his capacity - for the period.
Examples of agile metrics
Now I present some examples, accompanied by explanations that I hope will help round out the concepts. Let's see.
Agile metrics of a dedicated team
Metrics | Period 1 | Period 2 | Period 3 |
---|---|---|---|
Planned Velocity | 35 | 31 | 31 |
Real Velocity | 29 | 28 | 31 |
Plan Compliance | 82% | 90% | 100% |
Although the team never reaches 100% plan compliance - which, in my experience, is the most common scenario - we can determine that it has a stable real velocity of around 30SP. While you might think that a well-planned team should hit 100%, the truth is that demanding it compromises honest and transparent estimates.
Note that I use the word period and not Sprint, because it is not an obligation to use Scrum with these metrics - although it works well.
Meeting 100% of a Sprint or period plan
This is a very bad idea, and I'll explain why.
Imagine you are part of an agile team and during the planning of the iteration you have agreed to complete requirements for a total of 100SP. During the iteration, one of your colleagues has become ill and is unable to attend, and some of the requirements are blocked by dependencies with a supplier. What does this mean?
- It is clear that the team will not be able to complete its goal of 100SP.
- If it's not the team's fault, we can't immediately call it "bad planning" - a common mistake made by those who believe people are machines and plans must be followed to the letter.
- Most serious of all: having planned exactly 100SP, the team has no additional elements to take on, no way to reassign work and avoid wasting time.
What should we do? Should we suspend iteration? Do contingency planning? What about agile engagement and metrics?
The team is not only blocked; it has no other elements to pick up to "recompose" its goal of 100SP. For those who enjoy mathematics: estimates should be ranges, not absolute points. If the projected capacity of a team is 100SP, the team should plan for something close to 110SP or 115SP, understanding that it may have high-performance days. Likewise, if something with a negative impact occurs, the PO has a "plan B" to reorganize the team around what can still be worked on.
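The range idea above can be sketched in two lines; the 10-15% buffer reflects the 110SP-115SP range in the text:

```python
# Plan with a range above projected capacity so a "plan B" always exists.
projected_capacity = 100                       # SP, illustrative
low = round(projected_capacity * 1.10)         # lower bound of the plan range
high = round(projected_capacity * 1.15)        # upper bound of the plan range
print(low, high)  # 110 115
```

The buffer items are not a commitment: they are ready work the team can pull in when something frees up, or swap in when something gets blocked.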
Agile metrics "in trouble": what is happening?
Metrics | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
---|---|---|---|---|---|---|
Planned Velocity | 35 | 31 | 31 | 33 | 25 | 20 |
Real Velocity | 29 | 28 | 31 | 25 | 20 | 18 |
Compliance | 82% | 90% | 100% | 75% | 80% | 90% |
At first glance, something has happened from period 4 onwards. But what? Why is the real velocity dropping?
The simple solution is, of course, to take advantage of opportunities for reflection with the team, identify the causes and propose solutions. However, at the management level, expressing what is happening in data can be useful. Managers are not always in constant contact with agile teams and are limited to reports and metrics. So, let's see some more information about this case.
Metrics | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
---|---|---|---|---|---|---|
Planned Velocity | 35 | 31 | 31 | 33 | 25 | 20 |
Real Velocity | 29 | 28 | 31 | 25 | 20 | 18 |
Unplanned Delivery | 0 | 0 | 0 | 5 | 6 | 4 |
Effort Compliance | 82% | 90% | 100% | 75% | 80% | 90% |
Plan Compliance | 82% | 90% | 100% | 60% | 56% | 80% |
Variability | 0% | 0% | 0% | 20% | 30% | 22% |
Well, this extended table speaks volumes. For some reason, the team has started taking on unplanned work within the period - for example, tweaks to existing features, changes or enhancements. Never defects.
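The derived rows of the table above can be recomputed from the three measured series. This is a sketch of how I read the table's arithmetic; small rounding differences from the printed percentages are expected:

```python
# Recompute derived metrics from the three raw series in the table.
planned   = [35, 31, 31, 33, 25, 20]
real      = [29, 28, 31, 25, 20, 18]
unplanned = [0, 0, 0, 5, 6, 4]

effort_compliance = [r / p for r, p in zip(real, planned)]
plan_compliance   = [(r - u) / p for r, p, u in zip(real, planned, unplanned)]
variability       = [u / r for u, r in zip(unplanned, real)]
print([f"{v:.0%}" for v in variability])  # ['0%', '0%', '0%', '20%', '30%', '22%']
```

Keeping the raw series and deriving the rest avoids copy-paste errors in dashboards, and makes the period-4 break visible in one glance.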
How many points should I assign to defects?
This is a point of debate. Should I or should I not quantify the effort of the defects or incidents we need to correct? Yes and no. In my experience, agile metrics can be manipulated if the incidents or defects are weighted in the team metrics. However:
- It is good to quantify them to determine their relative effort.
- It is bad to count them toward the team's velocity.
In my opinion, if a team has to spend a lot of their time and capacity fixing bugs - things that don't work as we agreed they should - the best metric to reflect that is a "loss of capacity". Otherwise you might have teams with high relative scores that deliver nothing but fixed defects.
In conclusion, a team that spends its time on corrections should see its velocity drop.
Are there agile metrics for the quality of work delivered?
Let's see some more information from our agile team.
Metrics | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
---|---|---|---|---|---|---|
Planned Velocity | 35 | 31 | 31 | 33 | 25 | 20 |
Real Velocity | 29 | 28 | 31 | 25 | 20 | 18 |
Unplanned Delivery | 0 | 0 | 0 | 5 | 6 | 4 |
Effort Compliance | 82% | 90% | 100% | 75% | 80% | 90% |
Plan Compliance | 82% | 90% | 100% | 60% | 56% | 80% |
Variability | 0% | 0% | 0% | 20% | 30% | 22% |
Associated Defects | 3 | 3 | 5 | 7 | 8 | 7 |
Simply counting defects is a bad idea. One defect is not equal to another: some require more work than others, so it's a good idea to use relative values to quantify them - remember I wrote "It is good to quantify them to determine their relative effort".
So sorting them into sizes or using something similar to SP is a good idea.
However, it is best to establish a relationship between the SPs delivered and the defects injected in a period - yes, "injected" sounds strange.
Metrics | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
---|---|---|---|---|---|---|
Planned Velocity | 35 | 31 | 31 | 33 | 25 | 20 |
Real Velocity | 29 | 28 | 31 | 25 | 20 | 18 |
Unplanned Delivery | 0 | 0 | 0 | 5 | 6 | 4 |
Effort Compliance | 82% | 90% | 100% | 75% | 80% | 90% |
Plan Compliance | 82% | 90% | 100% | 60% | 56% | 80% |
Variability | 0% | 0% | 0% | 20% | 30% | 22% |
Associated Defects | 3 | 3 | 5 | 7 | 8 | 7 |
Defect Density | 0.10 | 0.10 | 0.16 | 0.28 | 0.40 | 0.38 |
In this second view we see how the defect density - the number of defects injected per SP delivered - increases. It is possible that, by introducing new requirements into the period, the team is not assessing the plan or the impact of those requirements well. This, of course, affects both the ability to deliver - lower real velocity - and the quality of the delivery - higher defect density.
Models at Scale and Program Increments
Well, so far we've only discussed team-level metrics. Each and every metric presented can be used across multiple teams - if we use normalization - and for different periods, such as iterations, months, and program increments (PI).
If you are still offended by "variability" think about the following:
- You may decide not to accept anything new within a single iteration.
- Anything additional or new enters the next iteration - 0% variability for the iteration.
- However, that iteration is part of a PI, and it is therefore prudent to measure the iteration's variability to improve measurements and projections during the next PI Planning.
Considerations
As you can see, agile metrics apply to agile projects, agile teams, and large-scale concerns. If you only keep metrics at the team level, you lose the opportunity to scale. In large organizations, with hundreds or thousands of employees, it is not possible to run retrospectives and easily communicate causes and actions. That's why establishing metrics - and supporting a good reading and understanding of them - can help you gain the organizational support you need. "Political" support in a large organization is key, and giving visibility into what is happening is the first step to earning it.
Author's comments and notes