Survival analysis to estimate kanban task completion times
I am trying to estimate task completion time in a kanban board (a project management tool). During EDA, I looked only at tasks that are either done or cancelled, and defined the completion time as the time from task creation to the moment the task was marked done or cancelled.
I realized this definition has a problem: it ignores tasks that are not finished yet. If we think of task = done as event = 1, dropping the open tasks is like throwing away the censored observations (event = 0) in survival analysis, which biases the result.
- How should I handle this?
- I would also like some input on how I should treat done vs. cancelled tasks.
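To make the setup concrete, here is a minimal sketch in plain Python of one way the open tasks could be kept as right-censored observations instead of being dropped, followed by a hand-rolled Kaplan-Meier estimate. The task records are made up for illustration, and the field names (`created`, `closed`, `status`), the `snapshot` date, and both helper functions are hypothetical, not taken from any real kanban export. Treating only done as the event (so cancelled tasks are censored at cancellation) is just one assumption; counting both as events, or modelling them as competing risks, are alternatives.

```python
from datetime import date

# Made-up task records, for illustration only; field names are hypothetical.
tasks = [
    {"created": date(2024, 1, 1), "closed": date(2024, 1, 4), "status": "done"},
    {"created": date(2024, 1, 2), "closed": date(2024, 1, 4), "status": "cancelled"},
    {"created": date(2024, 1, 3), "closed": None, "status": "open"},
    {"created": date(2024, 1, 5), "closed": date(2024, 1, 10), "status": "done"},
    {"created": date(2024, 1, 8), "closed": None, "status": "open"},
]
snapshot = date(2024, 1, 12)  # assumed date on which the data was extracted


def to_survival_rows(tasks, snapshot, event_statuses=("done",)):
    """Turn tasks into (duration_days, event) pairs.

    Tasks that are still open are right-censored at the snapshot date.
    Tasks whose status is not in event_statuses get event = 0, i.e. they
    are censored at the time they were closed (a modelling assumption).
    """
    rows = []
    for task in tasks:
        end = task["closed"] or snapshot
        duration = (end - task["created"]).days
        event = 1 if task["status"] in event_statuses else 0
        rows.append((duration, event))
    return rows


def kaplan_meier(rows):
    """Return [(t, S(t))] at each distinct event time t."""
    survival = 1.0
    curve = []
    for t in sorted({d for d, e in rows if e == 1}):
        at_risk = sum(1 for d, _ in rows if d >= t)  # still under observation at t
        events = sum(1 for d, e in rows if d == t and e == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


rows = to_survival_rows(tasks, snapshot)
curve = kaplan_meier(rows)
for t, s in curve:
    print(f"day {t}: estimated P(task still not done) = {s:.3f}")
```

Calling `to_survival_rows(tasks, snapshot, event_statuses=("done", "cancelled"))` instead treats either outcome as the event, which is closer to the completion-time definition above.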