Why do we use story points instead of man days when estimating user stories?
In agile methodologies (e.g. Scrum), the complexity/effort needed for user stories is measured in story points. Story points are used to calculate how many user stories a team can take on in an iteration.
What is the advantage of introducing an abstract concept (story points) when we could just use a concrete measurement, like estimated man-days? We can also calculate velocity, estimate the coverage of an iteration, etc. using estimated man-days.
In contrast, story points are harder to use (because the concept is abstract), and also harder to explain to stakeholders. What advantage do they offer?
Is it easier to explain that your estimate of 5 man-days means that it will take 1 month to complete because your velocity is 0.25 man-days/calendar-day?
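To make the arithmetic in the question concrete, here is a minimal sketch (illustrative only; the function name and numbers are just the question's own example) of converting a man-day estimate to calendar time via a velocity factor:

```python
def calendar_days(man_days: float, velocity: float) -> float:
    """Calendar days needed, given velocity in man-days per calendar day."""
    return man_days / velocity

# The question's example: a 5 man-day estimate at 0.25 man-days/calendar-day
# works out to 20 calendar workdays, i.e. roughly one month.
print(calendar_days(5, 0.25))  # 20.0
```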
@Eric Dietrich is exactly right, but more importantly, if this wasn't explained to you already, your company has a serious problem in its agile training. Agile needs a good deal of initial instruction and practice, explaining the reasons for all the different agile techniques. It also requires a pretty full implementation; companies that just pick and choose a few features are not really going to get the same benefit. (Most common error: calling short daily meetings and forcing people to stand and explain how they spend their time, which by itself is useless!)
@Patrick If it takes 1 month for, say, 2 developers to complete, that's about 40 man-days. Not 5. A "man day" is 1 per developer per calendar day. When estimating man-days, you don't estimate the difficulty of the task, but how long you think it would take for one developer to do. This _should_ be easier to explain because the concept also exists in other fields, not just software development.
@Patrick When using man-days (see man-hours on Wikipedia), there is no concept of velocity. That's an agile/scrum thing.
@Izkata that is an inherently naive view of a man-day and breaks down almost instantly, because you just equated the man-day it took for project A to be completed where half that day was actually a staff meeting where no work was really done, and the man-day it took for project B to complete where you skipped lunch and stayed late. man days are not and cannot be related (especially 1:1) to calendar days in any meaningful way.
@Ryathal A man-day is the average amount of work gotten done in one calendar day. Those two days will average out somewhere in between.
@Bill K: "most common error: calling short, daily meetings and forcing people to stand and explain how they spend their time is useless!": And a waste of time. Can you elaborate on this? What should be done instead? No daily meetings?
The interesting thing is that as soon as the velocity stabilizes, story points tend to be identified with a certain number of hours or days. E.g., 1 story point = 1 day. If I think it will take 2 days, I will estimate 2 story points.
@Giorgio You can get a benefit from stand-ups alone, but not by explaining everything you did -- instead discuss problems you are facing, roadblocks, and things you could use help with. It's not to let your manager know you are keeping busy, it's to communicate with team members -- a "manager" need not even be involved. But you will get a lot more benefit from implementing many of the other disciplines, such as keeping your whole core team available to you all day long (co-locating the team) or having a customer available at all times.
@Bill K: In my team, before SCRUM was introduced we did not have daily meetings. Instead, when we had a problem or a question, we'd just go to our colleagues and spoke to them. Twice a week we had a meeting with all the team so we were all up-to-date with what was going on. Daily meetings have added very little to our team work (besides knowing that there is a planned interruption of our work once a day). I think daily meetings can be good during the team building phase (in a new team) but, as the team communication develops, they can become superfluous.
@Giorgio Yeah, that's pretty much what I'm saying. Unless you are going full agile, any one discipline doesn't help much. If you are full agile, you kind of need the stand-ups to keep on track because so many things are moving at once. Also, full agile doesn't really have a significant manager, more of a coach, so team coordination is more important. The worst use of stand-ups is telling your manager what you are doing every day; that shouldn't even enter into it and is utterly pointless from a development point of view.
I think one of the main advantages is that humans and developers specifically are actually pretty bad at estimating time. Think of the nature of development too -- it's not some linear progression from start to finish. It's often "write 90% of the code in 10 minutes and then tear your hair out debugging for 17 hours." That's pretty hard to estimate in the clock timing sense.
But using an abstraction takes the focus off of the actual time in hours or days and instead puts the focus on describing the relative expense and complexity of a task as compared to other tasks. Humans/developers are better at that. And then, once you get humming with those point estimates and some actual progress, you can start to look at time more empirically.
I suspect that there is also an observer effect that happens with time estimates that wouldn't happen with point estimates. For instance, the incentive to sandbag an estimate and deliver way "ahead of schedule" is going to be muted with indirection in a point based system.
+1 for the focus on *complexity*, not time. Translating to raw hours will be easy once you have enough sprints under your belt. I found that the variability between stories gets washed out over time.
So really, *complexity points* might be a clearer term than story points, and any task with too many complexity points is too complex and probably involves too many risks and unknowns to deal with all in one go. Complexity has exponential cost, and the poor sod who gets stuck with that task is going to dig a hole so deep he won't be seen again for weeks or months. Better to first make a simpler task of understanding the complex task, and then divide it up into smaller tasks with less risky and better understood complexity.
Time and cost are effects, complexity is the cause, and you can't understand time without understanding complexity.
I think it depends on the situation. If you know a lot about a project, you might be able to say with a lot of certainty that a task is going to take 3 hours. If your team doesn't know a lot about the project or the business, estimating in actual time might be very difficult.
It's not just complexity that contributes to the number; it's also effort and doubt, as already mentioned in an answer to a similar question on another Stack Exchange site: http://pm.stackexchange.com/a/2766
If you're using Fibonacci numbers (or something similar), it limits the number of options when estimating a story. I worked with a group that used low numbers only: 1, 2, 3, 5, 8, and 13. We had a reference story that was a 5. This enabled us to easily make snap decisions on a story's complexity while doing Planning Poker. The other side effect was that anything rated a 13 probably had insufficient information and needed to be broken down further. I seriously doubt it would have been that easy and straightforward if we were using raw hours.
Your Product Owner speaks your stakeholders' language and should be able to translate between story points and man-hours (or other units) as needed. During my time as a PO, I had some hard data that 1 story point = 4 man-hours, but obviously every team is different.
With the knowledge that 1 point = 4 hours, you could theoretically change your Planning Poker deck to 4, 8, 12, 20, 32, and 52. But those numbers feel harder to deal with. I think I would mentally abstract the values back to something simple, e.g., "less than a day", "more than a week", etc. And if I'm going to do that, I might as well stick with the abstract unit-less story points.
We use this same method, and our planning deck has higher numbers, but we have defined a 20 as meaning the story needs to be broken down or defined better. For us a 13 is our biggest task, and usually these are tasks that end up taking as much as a week to finish.
"The other side effect was that anything rated a 13 probably had insufficient information and needed to be broken down further." And I assume if it's broken down further, it'll be broken into equivalent Fibonacci parts?
@JoeZeng, yeah, those often became 8+5 or 5+5+3. It's an abstract measurement though, so if the new stories are close enough, then I didn't worry too much. The team could normally absorb a 13 becoming two 8's or three 5's, but three 8's meant I needed to have a clarifying talk with stakeholders. Ideally, we had estimated far enough in advance that it wouldn't impact the current sprint.
It's to enable estimation to get better over time, without the estimators all having to adjust their estimation.
Rather than everyone involved in the estimate having to think like "OK.. looks like 2 man days.. but last sprint we underestimated everything, so maybe it's really 2.5 man days. Or 3?", they carry on the same as always. "5 story points!"
Then, you adjust your estimation of how many story points the team can get through in a sprint, based on actual measured achievement in previous sprints. "We've been doing 90-110 story points per sprint previously!"
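As a rough sketch of that adjustment (the function and history values here are hypothetical, not from the answer): instead of re-calibrating every estimate, you derive next sprint's capacity range from what the team actually completed in recent sprints.

```python
def capacity_range(completed_points_per_sprint):
    """Return (min, max) of the last few sprint totals as a planning range."""
    recent = completed_points_per_sprint[-3:]  # look at the last 3 sprints
    return min(recent), max(recent)

# Story points actually completed in each past sprint (made-up numbers):
history = [88, 104, 95, 110, 90]
low, high = capacity_range(history)
print(f"Plan for {low}-{high} story points next sprint")  # 90-110
```

The estimates themselves never change; only the measured throughput does.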
I would say the theory behind this is that developers are better at estimating relative complexity of different dev tasks than they are at estimating absolutes. Especially if multiple people are estimating a task which could be done by any one of them (and not everyone works at the same speed as everyone else).
Cynical alternative: I've seen it said that developers never come in under time estimates. If something takes longer than was estimated, you've gone over. But if something takes less time than estimated, developers may fiddle with it, gold-plate, or just slow down and take it easy since they've been given a cushy assignment. Taking the real units of time out of an estimate may curb these tendencies. End cynical alternative.
That's not even that cynical. It's the principle of "fast or cheap or good". I can give you a mostly stable, mostly working version of FizzBuzz that will give you what you generally want within a few minutes, but if you want user interaction, that'll take longer, and if you want regression testing, that'll take longer, and if you want it not to fail when you hit MAX_INT, that'll take longer. Tell me to prioritise speed, and I'll start dumping req's. Tell me to prioritise everything else, and I'll use all the time I'm given.
Man-days or man-hours are, as you say, concrete. So when a task is estimated at 5 hours and takes 6, it is now a late task.
When you have a story that is 3 points and it takes 6 hours, it took 6 hours; it's not late, it just took six hours. The velocity measurement then is more a factor of how many of those points you get done in a sprint, and that number can fluctuate, because it isn't concrete. You also are not measuring each task, but the total of all the tasks. When you have hours on each task, the temptation is there to measure each task. When that happens, you get no credit toward the sprint for finishing under time, but there is a consequence for finishing over time on any given task.
It can be a transition to thinking in terms of points. One place I worked before we even introduced agile used t-shirt sizes just to get an idea on the level of effort. Points are just an extension of that.
That isn't to say there isn't controversy, or some arbitrariness in assigning the points. We have members of our team who almost always vote the lowest number, and who complain that we are suffering from point inflation when they think a task is a 1 and we think it is a 3.
The abstraction is sort of the point. Using the 'man day' as a measurement has a number of pitfalls, including:
- If the team isn't familiar with the tech they are going to be using, then it can be really hard to give realistic time estimates of how long a task might take. They are much more likely to be able to give good relative estimates, e.g. "task A will probably take twice as long as task B".
- Different people work at different rates! If you use 'man days' you pretty much have to change the time estimate when a task is passed from one developer to another. Who defines how much work constitutes a 'man day' anyway?
If you want to estimate man-days it's a simple calculation:
user points in story / average user points per developer per day = estimated man days
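The formula above can be transcribed directly (illustrative only; the points-per-developer-per-day figure would come from your own measured velocity):

```python
def estimated_man_days(story_points: float, points_per_dev_per_day: float) -> float:
    """Convert a story-point estimate into man-days using measured throughput."""
    return story_points / points_per_dev_per_day

# If the team averages 2 story points per developer per day,
# an 8-point story works out to 4 estimated man-days:
print(estimated_man_days(8, 2))  # 4.0
```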
Regarding point 2: how does story point solve this? You estimate a story as 4 story points. You give it to a faster programmer and it takes 4 days. You give it to a slow programmer and it takes 8 days. It seems to me that the problem is not solved but just moved around.
@Giorgio Re. point 2: Your faster programmer has a work rate of 1 story point/day, and your slower programmer has a work rate of 0.5 story points/day. If you do it in hours, either your faster programmer is going to work himself to death, or your slower programmer needs sacking. Bill Leeper's answer makes this point very well.
Re. point 1: For my money, if you're talking about 2 different sets of tech and 2 different areas of the product, you've got two teams.
Further re. point 2: User stories are developed by the *team*, not individual developers. It's the team's work-rate which is the important part. Bear in mind that when implementing user stories you should be breaking them down into tasks first. Give the faster developer more tasks!
You wrote that "Different people work at different rates! If you use 'man days' you pretty much have to change the time estimate when a task is passed from one developer to another." For exactly the same reasons, if you estimate in story points, you have to change the estimate in terms of velocity (story points / sprint) when a task is passed from one developer to another: a slower developer needs more time per story point. Regarding point 2: unfortunately it is not always possible to have different teams working on different areas of the product.
As already mentioned, story points are a relative measure of complexity. One can use a powers-of-2 series (1, 2, 4, 8, 16, ...) or a Fibonacci scale (1, 2, 3, 5, 8, 13, 20, ...) for estimation. As noted above, developers are quite adept at saying something like this:
Feature A is almost twice as hard as Feature B
But it's really difficult to say 'how long' will this feature take for implementation. You let that be balanced by velocity. So if something was estimated as a 5 but turned out to be a 13, a slower velocity would normalize that for the iteration (or you could re-estimate).
Now, there is another alternative called 'ideal days' (somewhat similar to man-days, but I'm not sure if that's what you meant), and I know of quite a few teams who prefer it. An ideal day is to be interpreted as:
If implementing the story is all I do after coming to the office, I take only the necessary breaks, I have no interruptions, and I have everything I need -- i.e. no peripheral activities like meetings, responding to mail, etc.
Mike Cohn, one of the many well-known agile evangelists, provides the following comparison between story points and ideal days. In favor of story points:
- Helps drive cross-functional behavior i.e. teams estimate stories w.r.t. total implementation complexity all the way from UI to DB and back.
- SP estimates don't decay i.e. a few months from now a 5 point story is still likely to be 5 points, but an ideal day estimate may change depending on the acquired development skill/speed of that particular programmer
- SPs are a pure measure of size, i.e. they reflect only size w.r.t. complexity. Period. No duration thrown in; that's the job of velocity. Not so with ideal days: there is a tendency to muddle them with calendar days. Keeping the measure abstract, as SPs do, fights the temptation to compare with reality. Just a measure of size. No nonsense.
- Estimating in SPs is typically faster than in ideal days. It may be tricky for the first couple of stories, but once you get the hang of it, it's faster.
- Different developers can have a different take on their ideal day estimate for completing a story. I could do the same in 3 and you could in 5. SPs are more or less uniform across the board. They level the playing field so to speak.
In favor of ideal days:
- Easier to explain outside the team, for obvious reasons. :)
- Easier to estimate with at first, as mentioned above; but once you get the hang of SPs, they come naturally.
Now, which one to choose is up to the team. However, as most answers here and my personal experience suggest, I prefer story points. Ideal days don't really have that much of a benefit over SPs (and Mike Cohn also advocates SPs, along with many other agile evangelists).
21 or 20 doesn't matter when estimating. SPs round off to the next Fibonacci-like number to eliminate the sense of false precision. The next number in the sequence is not 34 but 40 (double 20), and then 100. The numbers represent 'uncertainty' in complexity, not precision; it's only an approximation.
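A minimal sketch of that "round up to the next card" idea, using the common modified-Fibonacci Planning Poker deck (these deck values are a common convention, not mandated by anything):

```python
# The common modified-Fibonacci Planning Poker deck.
DECK = [1, 2, 3, 5, 8, 13, 20, 40, 100]

def to_card(raw_estimate: float) -> int:
    """Round a raw estimate up to the nearest card, discarding false precision."""
    for card in DECK:
        if raw_estimate <= card:
            return card
    raise ValueError("Too big for the deck -- split the story instead")

# 21 lands on the next card up (40, not 34): the gap between cards
# deliberately swallows the difference between 20 and 21.
print(to_card(21))  # 40
```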
@PhD: "SP estimates don't decay i.e. a few months from now a 5 point story is still likely to be 5 points, but an ideal day estimate may change depending on the acquired development skill/speed of that particular programmer": This implicitly assumes that the team will improve its skills uniformly in all areas of the project. I do not see why this should always be the case: some parts of a project (and technologies required for them) might turn out to be harder than others, making the initial estimate of relative complexities in story points invalid.
Story points also help you to measure performance improvement of the team over time. In addition, you do not need to re-estimate everything as performance improves.
Take this example which uses man days:
The team estimates different tasks with man-days. It works for a while, but after some time you see that the team is done faster with many tasks than originally thought. So the team re-estimates the tasks. It works for a while, and after some time you see again the same thing: The team is done faster with many tasks again. So you re-estimate again, and this story repeats again, and again and again...
Why? Because the performance of your team increased. But you do not know about it.
The same example with story points:
The team estimates the size of the user stories. After some sprints, you see that the team can do about 60 story points per sprint. Later, you see that the team has achieved more than 60 story points, maybe 70. And the team continues like that and pulls more user stories for next sprints and delivers them.
Why? Because performance has increased. And you can measure it. And you do not need to re-estimate everything after the performance of your team has increased.
"Why? Because the performance of your team increased.": An alternative explanation is that after a while the team starts to give more accurate / realistic time estimations (since they were late with some tasks in previous sprints, they start assigning more time when estimating stories for later sprints).
First, people are better at relative estimates than absolute estimates. The Babylonians mapping and rating the relative brightness of stars is a great example: they didn't get the absolute figures right, but the order was mostly spot on, even for very similar intensities.
The second advantage is that a prime reason for doing this exercise is to drive conversation. If you start discussing in exact days, the conversation may quickly derail.
As Eisenhower said: plans are worthless, but planning is everything.
Third, the project manager does not have to edit all the estimates just because it turns out they were all off by some common factor, e.g. 30%.
Man days estimate the time it takes to do something. They are best used when the items you are estimating are very precise and measurable. Specific, well known, repeatable tasks are estimable in man days.
For example, if a sales person can make 20 customer calls per day, on average, we can calculate how much time each call takes and from that we can estimate how many man days it will take to make 1000 calls.
In this example one can concretely think in statistical terms about the median length of a call because all calls can be assumed to be effectively the same thing.
Story points determine which combination of stories can be done in an iteration. They are used to combine heterogeneous goals with fuzzy boundaries and to measure how many can be done in a fixed time frame. They estimate the complexity of chunks of work relative to each other, so that the chunks can be added together.
For example, your team completed 5 stories for a total of 23 points in iteration 1, and 8 stories for 20 points in iteration 2. From this you can estimate that in iteration 3 your team will complete a few stories whose total is around 20 points.
Note that we do not need to determine the size of one point and in particular there is no assumption whatsoever that each story of the same size will take the same time to be developed! We only work on sums and on points per iteration. I didn't even mention how long the iteration is.
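The example above can be sketched as follows (a minimal illustration; the mean-of-past-totals forecast is one simple choice, not the only one). Note that nothing here assumes how long a single point takes:

```python
def forecast_next(iteration_totals):
    """Forecast next iteration's capacity as the mean of past point totals."""
    return sum(iteration_totals) / len(iteration_totals)

# Iteration 1: 5 stories, 23 points. Iteration 2: 8 stories, 20 points.
print(forecast_next([23, 20]))  # 21.5 -> plan for stories totaling ~20 points
```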
If you walk up to a person on the street and ask "How big was a T-rex?", the answers will fluctuate, even though the majority of people know what a T-rex is and roughly how big it was; nobody really knows for certain, because we have NO relative scale to baseline from.
That's the cognitive behavior you're trying to work around with forecasting, and many methodologies spin cycles selling "I've got it! I have the secret to accurate forecasting!" snake oil to the masses. When you actually do forecast, you're really saying out loud "I will ALLOW x days/hours/points for that to complete"; in a sense you are creating a "timebox" for that event to be carried out within.
For me, points just shift the boundaries. At the end of the day, unless you're in a team that is happy to say "*Well, we have 3 weeks per sprint, and, thumb suck... I figure we should shoot for 30 points to complete in that cycle! Who's with me!*" and that's as deep as you go in forecast modeling -- fine! Realistically you're just setting an arbitrary budget, and that's it. You're also then, in retrospective, looking at the work completed with a sense of "holy crap, we did 33 points that sprint, that was pretty cool", and not much more can be done about that. You can use velocity mid-sprint to determine whether you're getting bang for your budgeted buck by asking out loud "Have we hit 15 points yet? Will we?", but then you're back to adding relative time to the equation, aren't you? The danger here is that you're now using velocity to measure productivity, not capacity, which from what I understand kicks the reactive release management (story points) idea in the head.
The point system is almost too clever for you not to notice that you still attach relative time to the equation -- everything from your agreed "sprint cycles" to your daily standups, in which you enact some hidden rule of duration + complexity = "Max is taking too long with that task", that innate gut-feeling, team-code-red moment.
The human brain can't forecast well, because forecasting involves a lot of working memory mixed with long- and short-term recall; it's like asking a novice math student to do fractions in their head instead of on paper. It's why other industries never agree on a forecast and constantly re-validate forecasts in relative time (e.g. geologists never stop forecast modelling until that cubic meter has been dug out of the ground, and only then is it "done").
I'd say the point system works if you're not forecasting. You're agreeing to a chunk of work that's based on a sub-chunking algorithm, and that's really your closest possible approach to forecasting. In fact, your release management would look for natural breaks in the "backlog" queue that fit around the theme(s). (On Silverlight, we product managers would wait until the engineering team completed their backlog and then piece together the themes we initially set. We never knew specifically what the engineering team was doing; we just had a basic outline. We'd then take that body of work and build our marketing event around it (Microsoft Mix).)
When you start locking down velocity expectations inside sprint cycles that rely on velocity + time, you're back to forecasting estimations again, only this time you're worse off because you're playing the "it depends" game... More importantly, you're also killing potential for team growth and career growth as well.
The tax you pay for points vs. time is that with points you need to look for alternative measurement formulas to track on-the-job skill development, mentoring, and developer behavior.
That is, you will still need to look at a "median developer" as your ideal person to attach skill/effort to; you can then baseline other developers against that person to determine how they are faring in their ongoing growth within your team. It also highlights situations where the "fast" developers are carrying most of the water but are getting bored, or worse, are working longer hours with no recognition/reward because of competing deadlines, etc. Standups don't unearth this in reality; they are really there to detect bad smells within the team, as in "that person is struggling, let's help".
Next come the "carry-over" stories: stories that don't get chunked into that sprint cycle but spill over to the next one. These can easily create a knock-on effect if you're factoring in time, but the moment you do factor in relative time, again, you've just regressed back to time-based forecasting/estimation, and the point system is just muddying the waters.
If you go with points, you have to ignore time completely, and I mean completely, because the moment you let time creep in you're gaming the idea/methodology.
Having travelled around the world as an evangelist, I saw a lot of teams swear on whatever they hold dear that they have cracked the agile forecasting code... but I always clicked my tongue, smiled, and walked away with the thought "yeah... you almost did, but that mistress we call 'time'... she's just cruel..."