Apologies for the cross-post; I'm doing so because the thread was forwarded. Apologies also for the length.
On Wed, Jul 8, 2020 at 5:01 PM Maarten Dammers [email protected] wrote:
Of interest to the wider community. I really hope this is not part of a larger pattern of the WMF ignoring community.
"Never attribute to malice that which is adequately explained by stupidity." [1]
In this case, my own stupidity (I'm the new CTO here at the WMF, for context), or perhaps, to be a little kinder to myself, a combination of bias and naivety. My engineering bias towards solving a problem I felt was important to act on (context provided shortly) got in the way of taking a user-centric approach first and trying to understand the needs and wants of the people using the system. As I said on the talk page, I mistakenly thought that the main feedback loops would be about porting workflows and not about the tool itself.
Even though many have publicly said that moving from Gerrit might still be the right decision, how we go about deciding that is just as important as what we do, and I messed that up. Given that perspective, I've asked the team to pause changes to our Code Review (CR) tools and to begin a consultation that includes the option of sticking with what we have for CR. I've also asked my team to update some of our decision-making processes for topics like this to make sure we properly hear from stakeholders (e.g. in this case, both staff developers and our broader community of developers) along the way.
For some more context, if it is helpful:
I'm ~11 months in here and still learning every day. While I've worked in open source for a long time, this community is new to me and different enough that I have needed, and continue to need, to update and adjust the way I think and the way I direct my teams to do their work.
Coming in and talking with our tech teams and folks in the community, I have seen a few themes emerge that contributed to my wanting to move forward faster on this decision:
1. We have a lot of tech debt[2]. In many cases, I think software, especially software that is successful, can collapse under its own weight if people are not careful in servicing that tech debt. Maintaining existing infrastructure, products and services while at the same time improving what we offer is a delicate balancing act. At our scale, there is a significant and justified bias towards production, but it has come at a cost that has compounded over the years and has a very real human toll. Much of this debt was created because we had to invent things that didn't exist. Now some of those things do exist, and we should check whether we can replace those older, albeit well-understood-by-us, systems with newer ones that have become standards or best in class and are still in line with our open source values.
2. The tech debt and the sheer number of services we support (many of which aren't fully maintained[3]) are compounded by the scale at which we support them. The result is that a number of people, especially those on the front line of caring for that software, are either burnt out or approaching that point. A global pandemic hasn't helped. I view much of my role here early on as one of trying to help reduce that burnout. Modernizing and upgrading our processes and toolchains can, I think, help fight this, even if there is some short-term pain in the shift.
All that being said, in this particular case, we have a team of people who work on maintaining our CI (Continuous Integration) and CR systems who have long been looking at replacing our CI system. This system runs on an end-of-lifed version of Python and on an end-of-lifed version of Zuul, and it’s critical we correct this since end-of-lifed software doesn’t receive security updates. This is primarily behind-the-scenes work that most people don't have to think about. There is also a growing desire among some of our developers to adopt more mainstream, well-understood toolchains like GitLab/GitHub for development, combined with my own view that CI/CR is *not* somewhere we should be deviating from broad industry norms and that we should adopt workflows that are (de facto) standards (e.g. GitLab/GitHub, with GitLab being the open one of the two) amongst developers irrespective of their backgrounds. Those two things led to my biased thinking that it was obvious it needed to be changed and that the primary feedback needed would therefore be on the workflows, not the tool itself.
While I still think it needs to be changed, I completely missed, as I said above, the stakeholder angle here and the basic community rule of not surprising people. For that I apologize. We are now working to correct this, even if it means it's going to take longer or we end up sticking with the status quo on CR.
Thanks,
Grant
[1] https://en.wikipedia.org/wiki/Hanlon%27s_razor
[2] https://en.wikipedia.org/wiki/Technical_debt
[3] https://www.mediawiki.org/wiki/Developers/Maintainers
Maarten
-------- Forwarded Message --------
Subject: Re: [Wikitech-l] CI and Code Review
Date: Wed, 8 Jul 2020 22:40:38 +0200
From: Maarten Dammers [email protected]
Reply-To: For developers discussing technical aspects and organization of Wikimedia projects [email protected]
To: [email protected]
Hi Greg,
On 06-07-2020 19:39, Greg Grossmeier wrote:
First, apologies for not announcing this last week. A short work week coupled with a new fiscal year delayed this until today.
tl;dr: Wikimedia will be moving to a self-hosted (in our datacenter(s)) GitLab Community Edition (CE) installation for both code review and continuous integration (CI).
tl;dr: WMF decides to do a major change without any community consultation. Community members are upset. More at https://www.mediawiki.org/wiki/Topic:Vpbt50rwxgb2r6qn
Maarten