As a manager, I felt the struggle of trying to balance the urgency of delivering new features with the importance of maintaining high code quality. That’s why I went together with a team lead from Reco - Michael Lantsman - to a great meetup at Riskified about this subject exactly. It had some great insights, so I’m sharing them here :)
After some food and beer, Shai Peretz gave a warm introduction, then Itay Waxman shared some practical strategies for fast delivery without creating tech debt, and finally we had a very insightful panel about team velocity.
Fast Delivery Without Creating Tech Debt
Itay Waxman - Engineering Manager & Head of Backend Guild @ Riskified
Big feature, urgent right now. “Has this happened to you?”
Let’s think about a better way to deliver without compromising on tech quality.
“The Shock phase”
- Skip code review
- A few hacks
- No automated tests
“The Pushback phase”
- Reduce scope.
- Downgrade non-functional requirements: 10 seconds instead of 1 second.
- We can’t start before refactor
- Transfer responsibility to another team
“The practical phase”
Urgency in mind, quality at stake
From this part alone it’s very clear that Itay has a super clear understanding of the ups and downs of managing development.
Planning for delivery
Working on 3 tasks at once: nobody is idle, but value arrives later
VS 3 people working on the same task: value arrives earlier
Speed, Quality, Cost - choose two.
Speed + Quality == Expensive. However, we want to cut down on costs. We need a development plan in which the friction is low and the development is not “blocked” all the time.
So we want to work with “all hands on deck”. Itay suggests doing it in a smart way. The planning cycle:
- Break down the task into small tasks
- Visualize the dependency graph
- Open bottlenecks and go back to step 1
Make the tasks “small”. Subjectively, up to 3 days is “small” - this shows that Riskified is a very large org. In Reco, 3 days is a feature :sweat-smile:. Each task must be independent until integration, and it must be demoable with acceptance criteria.
Small tasks, independent, demoable.
Visualize the dependency graph
Nice visualization: show the dependencies and ALSO the weight of each task.
Dependency graph, discover critical path, parallelism capacity, and maximize the focus on the critical path.
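To make the critical path idea concrete, here is a minimal sketch in Python. The task names, estimates, and dependencies are made up for illustration (loosely based on the email API and dashboard examples from the talk), and this is not the tooling Itay's team uses. It finds the longest weighted chain of dependent tasks, which is the part of the plan that sets the earliest possible delivery date.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: estimated days and the tasks each one depends on.
tasks = {
    "mock_email_api": {"days": 1, "deps": []},
    "email_api": {"days": 3, "deps": []},
    "dashboard_layout": {"days": 2, "deps": []},
    "dashboard_components": {"days": 3, "deps": ["dashboard_layout", "mock_email_api"]},
    "integration": {"days": 2, "deps": ["email_api", "dashboard_components"]},
}

def critical_path(tasks):
    """Return (total_days, path) for the longest weighted chain of dependent tasks."""
    graph = {name: t["deps"] for name, t in tasks.items()}
    finish = {}  # earliest finish day of each task
    prev = {}    # predecessor of each task on its longest chain
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        deps = tasks[name]["deps"]
        start = max((finish[d] for d in deps), default=0)
        prev[name] = max(deps, key=lambda d: finish[d], default=None)
        finish[name] = start + tasks[name]["days"]
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return total, path[::-1]

total, path = critical_path(tasks)
print(total, path)
```

With these made-up numbers, the real email API is not on the critical path because the dashboard only needs the mock, which is the point of the enablement tasks described next.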
To create less friction, create a simple enablement task to open the bottleneck!
- Email API: Start by writing a mock API instead of a real API.
- Dashboard: Start with a layout first, real components later.
Now the plan is done, let’s move to execution
How to deal with “blocked”?
- Attention for Unblocking.
Example for 2 - prioritize CRs above new tasks!
Communication is key:
- Close communication
- Kickoff between dependent tasks
Bus factor rises more with good communication.
When to use this tool?
- Close deadline or urgency
- Good reason not to take extra tech debt
- Team knowledge sharing
- Starvation of initiatives (all eggs in one basket)
- Tasks that require a lot of prep work (planning/legacy code)
- Not everyone on the team can take all task types (no time for a “grace” period)
Like all “senior” advice, Itay’s answer is “Depends”. :)
Q: What tools do you use for managing this? A: Jira with Kendis: https://kendis.io/
Q: Overheads (CR, etc.)? A: Since each task is demoable, it must be delivered - so the estimation includes everything (CR, deploy).
Q: Full stack developers only? If not, what about the bus factor? A: No, full stack teams, but you know things by proximity: it’s close enough.
Q: You do the estimations, so the team might not know the area and your estimations might be wrong! A: The planning is done WITH the developers, not ABOVE them. This is a team tool.
Q: New things? You don’t know the estimations? New tech/new product? A: My approach is adding a multiplier for uncertainty.
Q: Interrupts? A: Help should be distributed, not always to one person. Also tasks have an “epic owner” that makes sure that everything is OK, so they can be the “nexus”.
Q: If you use it all the time, the overall velocity will lower, no? A: Need to pick the correct tool for the correct case.
Q: Team motivation when not working like this? A: My team doesn’t open a new epic before the current one closes. My team is mostly in this mode.
Engineering managers - Panel
A very strong panel with a lot of experience:
- Victoriya Kalmanovich, Gloat
- Eti Dahan Noked, Wilco
- Shai Peretz, Riskified
- Inbal Porat, Antidote Health
The panel was about a pretty “wide” subject: Agile teams, how to get to maximum velocity and optimal delivery. Starting with a similar question for all the panel:
How does it look when a team is delivering well?
Eti: “Product is happy”. There isn’t a single way to do it, every team does it differently.
Inbal: What does “Good delivery” mean? It means good metrics and good questions actually. If the plan is good and the team is overdelivering then maybe?
Shai: Best measurement is a happy customer (account mgr, product). The only metric that makes a difference is Customer Impact. Eti: What if the impact is far? Shai: We must expose the customer as fast as possible.
Which “ceremonies” are you working with?
Shai: I joined the org when the ceremonies were already in place, but a lot of agile. 2 week sprints across the board, but a lot of freedom between the teams (and that’s important). The focus in the ceremonies should be on early feedback and staying focused on bringing value.
Inbal: I worked at a corporate, with 2-week sprints, pre-planning, retro. Now in Antidote (Startup vibes) - are the ceremonies even worth it? After 8 months, still need to understand what’s right for the teams.
Eti: High level: There’s no way to know other than to always self-question.
Victoriya: 2 months ago, I joined a team that develops a tiny new shiny service. But in the context of a heavy “Jira”. And I found that I’m working for Jira, instead of the other way around.
Inbal: Retrospective: ranting, this sucked. Action items? No follow up. So it didn’t serve its purpose. So we started a process with a company called Shamaim that helps continuous improvement. Most important point: follow up and have deadlines for retros and action items. This causes improvements.
Who owns the deadlines?
Inbal: It was me :) But basically the lowest-ranking manager that owns it.
And how did the new process help?
Victoriya: It caused real change, and gave people the place to influence.
Eti: It hurt me - physically - when you [Inbal] said that the retros were cancelled.
Shai: At Riskified, 30 teams, out of which 24 dev teams and 6 DA teams. In the past it was a “big room meeting” but now the scale is too big - hard to give up on the old ceremonies but need to make sure it keeps the correct context (inverse process from Inbal).
Examples of bad KPIs
Inbal: How many bugs is a bad KPI if it doesn’t take into consideration the impact of the bug as well - 1 critical bug >> 30 small bugs. A better KPI is to also measure impact and in which step they were discovered. Building KPIs is a continuous goal.
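A rough sketch of what Inbal's point could look like in Python: instead of counting bugs, weight each one by impact and also track the step in which it was discovered. The severity weights, field names, and stage names are my assumptions for illustration; the panel only gave the "1 critical bug >> 30 small bugs" rule of thumb.

```python
from collections import Counter

# Hypothetical weights, chosen only so that 1 critical bug outweighs 30 minor ones.
SEVERITY_WEIGHTS = {"critical": 50, "major": 5, "minor": 1}

def bug_impact_score(bugs):
    """Sum of severity weights, rather than a raw bug count."""
    return sum(SEVERITY_WEIGHTS[bug["severity"]] for bug in bugs)

def bugs_by_stage(bugs):
    """Count bugs by the step in which they were discovered."""
    return Counter(bug["found_in"] for bug in bugs)

# Two made-up teams: one with many small bugs, one with a single critical bug.
team_a = [{"severity": "minor", "found_in": "code_review"} for _ in range(30)]
team_b = [{"severity": "critical", "found_in": "production"}]

print(len(team_a), bug_impact_score(team_a))  # raw count vs. weighted score
print(len(team_b), bug_impact_score(team_b))
print(bugs_by_stage(team_a + team_b))
```

By raw count team A looks much worse, but by weighted score team B does, and the stage breakdown shows its bug reached production.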
Eti: Measuring velocity is also hard. A good KPI is critical bugs that were discovered by the customers. Velocity is important only to do team = max(team) and not for competition. There’s no point in cheating the KPI if it’s only used internally. How can I be better tomorrow than I was today?
Shai: DevOps metrics like MTTR, MTTD. We also like MTTS (learned from Etsy) - Mean Time To Sleep: How many interrupts were in off-hours. This makes sure that our processes are stable on a human level.
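For the off-hours part of Shai's answer, here is a minimal sketch of counting interrupts that landed outside working hours. The working hours, the Saturday/Sunday weekend, and the list of page timestamps are all assumptions for illustration; I don't know how Riskified or Etsy actually compute MTTS.

```python
from datetime import datetime

# Hypothetical working hours (local time); adjust to your own team.
WORK_START_HOUR = 9
WORK_END_HOUR = 18

def is_off_hours(ts: datetime) -> bool:
    """True if the timestamp is on a weekend or outside working hours."""
    is_weekend = ts.weekday() >= 5  # assumes a Saturday/Sunday weekend
    in_work_hours = WORK_START_HOUR <= ts.hour < WORK_END_HOUR
    return is_weekend or not in_work_hours

def off_hours_interrupts(page_times):
    """Count pages/alerts that interrupted someone outside working hours."""
    return sum(1 for ts in page_times if is_off_hours(ts))

# Made-up pages: Monday 03:00, Monday 11:00, Saturday 12:00.
pages = [
    datetime(2023, 3, 6, 3, 0),
    datetime(2023, 3, 6, 11, 0),
    datetime(2023, 3, 11, 12, 0),
]
print(off_hours_interrupts(pages))
```

Tracking this number per week or per on-call rotation is one way to see whether the processes are getting more or less stable on a human level.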
How to improve velocity
Eti: I joined Wilco a year ago, and the sprints were 1 week. 2 weeks were even worse - we just didn’t know how to plan, the plan totally changed after a week. So we moved to Kanban. It improved the feeling - lowered the “blame game” instead of working for the process (“Scrum said so”).
Victoriya: Frontend, Backend, DS were not synced on delivery times - so we need to plan for that ahead of time.
Change a company wide process
Shai: I believe in cross-functional teams. We’re trying to organize the teams around business and not tech. This is the ideal and we’re not there with all the teams (because of old monoliths etc.). Moved the DS teams closer to the dev teams. The problem is the professional “Guild” around them.
Q (Eti): How to make sure cross-organizational efforts don’t “fall”?
Shai: Riskified is big enough to have a Platform group.
Product hates you on delivery, what do you do?
Eti: I resent the question. The Product shouldn’t measure the team, they should be measured WITH the team. My previous job, the product was in the US, it was SUPER hard. If the Product asks “Why is it not delivered?” and they don’t already know the answer, then there’s a structure/personal problem.
Shai: HR-wise, product is a different org, but it’s in the org chart only. Product should bring the customer voice, and developers should join customer calls. And product should be involved as early as possible to make sure there’s no “waste”. Shai liked that a Product Manager thought that part of their job was worrying about the motivation and connection of the team to the task and the customer.
Features are easy to push. How to convince people to invest in Infra and Architecture?
Shai: How to talk the language of “value”. Need to make sure that the platform team talks “Why” and talks “Value”. More incidents and more bugs are worse for the “Why”, and if the Product talks about that first in the quarterly planning, we’re in a good place.
Inbal: Should be part of the culture. Can’t be only “platform” team.
Shai: Need to make sure it’s balanced. Don’t be a purist.
Eti: The key word is trust. Don’t use “urgent” all the time so the product team believes you when you say “urgent”.
How to make the devs talk “business”?
Inbal: Watch customer recordings! Gong makes devs a lot more connected (and even frustrated).
Q from the crowd: Review/presentation at the end of the sprint
Eti: Feature “celebration” was great and valuable. Need to celebrate success! Once a week, demo, part of the “everyone” weekly.
Inbal: Have to demo something in the all-hands.
Victoriya: Recorded demos.
Shai: Write the release notes before the feature (PRFAQ). Also, try to pull it into a “pre-mortem”.
Processes are great; what about tools - opensource, bought, or otherwise.
Victoriya: Logging, metrics, etc. and o11y in general are pretty good. Coralogix for example.
Inbal: LogRocket, I like it for B2C.
Eti: Tool that gives the env in every PR livecycle.