How to build a community around an inner source product
My experience with an active community building and maintaining multiple products with millions of users
- Be explicitly open for contribution
- Focus on great DX (developer experience)
- Give great support
- Share knowledge
- Spread the word
- Keep the quality high
- Praise good behavior
Inner open source vs open source
Before we start, let’s define “inner source”. According to Wikipedia:
Inner Source is the use of open source software development best practices and the establishment of an open source-like culture within organizations. The organization may still develop proprietary software, but internally opens up its development.
Although many of the tips in this post may apply to open source products, there are a few crucial differences stemming from the fact that the “community” is essentially composed of employees who are recruited to a company, bound by the same contract and working towards the same mission and vision.
Our company owns multiple media brands, each with its own website and native mobile apps. The main difference between these brands is their content, but traditionally they used different tech stacks. This stems from the autonomy and trust that are part of the company’s culture, but the side effect is duplication of effort and higher costs. Therefore, for the past few years, the company’s strategy has been to consolidate those tech stacks as much as possible.
The product is essentially a media content management system (headless CMS) together with a stack to build sites and apps to distribute that content (text, images, videos, podcasts, etc). Think BBC, NYT, WSJ, etc. (some 25+ brands) all built on the same tech stack.
The majority of the shared code resides in a white-label product that can be customized to the needs of each brand. The white-label is mainly maintained by the Core team (where I work), while each brand has its own development team whose main goal is to customize it or add new features on top of what the white-label provides.
This, of course, is a very simplified narration of what’s going on. In practice, many features require changes to the brand code as well as the core code, and sometimes involve cross-brand collaboration. Roughly 150 people contribute to the product. Here’s a glimpse of the commit frequency from the GitHub contributors tab:
You’ve probably heard the story of the blind men trying to guess the name of the creature they touched at the same time:
Each expert I talked to had great knowledge of the parts of the subsystem they worked with on a daily basis, but not necessarily as deep an understanding of adjacent subsystems or the system as a whole.
Meeting a wide range of people in different positions helped me develop a holistic knowledge of the different requirements and understand how various parts of the system are built to address them. It was also evident how misunderstandings or a lack of knowledge or control could negatively affect the architecture, which can only be described as Conway’s law in action.
I thought sharing my notes could bring clarity to more people. So I published them as the first draft of documentation with additional illustrations and examples. It was received very positively, which made me eager to do more of it. But why did I so easily share the knowledge that took me so much time and energy to acquire, especially when no one even asked for it? Because it is one of my principles to deprecate myself. My main motivation was more selfish than altruistic.
I am not my knowledge. By setting that knowledge free, I challenged myself to learn new things and forced myself to grow.
Besides, by openly sharing my knowledge up to that point, I got a great opportunity to have my own misunderstandings corrected or to get better explanations where they were needed.
This was a great exercise in technical communication and, as we’ll see later, helped the product more than anticipated.
One of the best pieces of professional advice I received was from a “go-to” person at a previous gig. He was a senior member of a team I used to work on, and almost all tricky questions were answered by him. When I asked him about his secret, he said:
When I was new to the company, I didn’t know much about the domain. But I made it my mission to find the answer to the questions that other people asked on Slack. The answer is always in the code but people don’t bother or have time to dig in.
So I started doing that. At first it was very uncomfortable because the code often tells you “how” a solution works but rarely touches upon “why” that problem exists as a business requirement. Sometimes I couldn’t even understand the questions people were asking! Sometimes I dug into the code to find an answer, but came back with an incomplete or wrong answer. But then I was corrected by others. Answering questions had multiple other benefits as well:
- it helps unblock colleagues so they can deliver business value more efficiently
- it creates trust and helps build relationships, which facilitates other, unrelated initiatives
- it gives people a stronger baseline, so they ask smarter questions, which makes you learn more
- it improves one of the key skills of software engineering: reading code
- it sparks great dialogues, which lead to new solutions and features
- it is a great opportunity to discover DX issues
- it’s the best way to improve documentation and compile a FAQ
Like many other large companies, we have internal meetups and conferences. At one of those international conferences, in beautiful Vienna, I volunteered to present our part of the company. This meant public speaking in front of some 100+ people. Since it was my first time speaking publicly at this scale, I had to learn the skill first.
Then I created a basic agenda and spent several days preparing the slides and rehearsing. The result was astonishing and I got very good feedback, but more importantly, this gave us exposure to a wider audience, which led to more people wanting to join us.
Although we’ve spent a lot of time on documentation and even more on answering questions, some topics turned out to be too tricky to clarify with text or even images, and people kept asking about them. So I thought maybe a hands-on workshop could help. I started by sending a short survey to the community on Slack to gather some data about which areas are particularly tricky to understand.
From 18 responses, I recognized two areas that I knew pretty well, so I decided to hold a workshop on them. Roughly 12 people showed up in Sweden, and about 10 more joined from our other hubs across Europe over VC (video conference). It was a new format for me: a face-to-face intro followed by a Q&A (questions and answers) session. Judging from the feedback form that was sent out afterwards, most people found it really useful.
Try other formats
The day before I held the second workshop, our PM (product manager) said that he was very keen on attending but couldn’t make it due to bad timing. He suggested recording the session if possible. I had never done that, but it turned out to be a great idea. That’s how my first video tutorial was born! Using video had several advantages over a workshop:
- People who would miss the workshop can watch the video at their convenience (this even applies to people who are recruited after the event, which greatly helps their on-boarding)
- It is possible to pause, rewind and replay a video as many times as needed in order to understand the concept (some people, including myself, are slow learners)
The response was so great that I did a re-take on the previous workshop and converted the slides to video as well.
Obviously, recording and editing the video takes more time than just the workshop. Also, the video is “time-less”, so one needs to make sure that the concepts introduced in it are not going to change so rapidly that it becomes useless.
Although I took many of those initiatives to learn new things or make my life easier, it would not scale if I was the only one doing that. So I try to involve others as well:
- If I answer a question, I try not to sound very confident and instead ping other teammates for their take on the problem
- If I write documentation, I ask other colleagues to proofread it
- If I hold a workshop, I ask my team to join as support for any questions that may arise
- More importantly, if anyone else in the team is taking similar initiatives, I genuinely praise their initiative and wholeheartedly offer my help
The truth is, nothing big is ever achieved alone. A one man show is cute, but doesn’t sound as good as a team of players (pun intended). If you think you’re the only one doing an awesome job, you need to start by building momentum in people around you.
An example: when the need for documenting another repo came up, my teammates pointed at me as “the document master”. While I could easily get sucked into it and do the documentation, I told them a true story:
At my previous job, I got a reputation as “the CSS master”. I have a tendency to help wherever it is needed, and it just happened that no one on my team was interested in doing CSS. So a lot of my time went to doing CSS, but did that mean that I’m particularly passionate about CSS? No! There was a need at that time and I felt that I could help.
On that team, I missed the opportunity to motivate my colleagues because I did all of that CSS work myself, and I would really hate for that to happen on this team. Technical writing is a core skill for every engineer, so anyone who wants this problem to be solved is welcome to pair up with me and I’ll do my best to share my experience and learn along the way: “give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime”
Another example was how we started using Serverless. In my free time, I learn new technologies, and one of the things I had learned proved to be a good fit for some of the technical problems we were trying to solve: AWS Lambda. Instead of going into a cave and coming back with a finished solution, I pitched it to motivate people to use it and then paired with them, sharing my knowledge (and learning along the way) while they implemented the solution themselves. The added benefit was that if the service had some issues, there were at least 2 of us who could help troubleshoot it. This meant less stress and load for each of us. But we didn’t stop there: we made sure to leave enough documentation in everything we touched, so that anyone else would be able to quickly onboard themselves and start contributing. Your product may live way longer than your time at the team/company. Leave a good legacy.
In our team we choose the TL (Technical Lead) role by election. I’m very honored that in the last election 9 out of the 11 people who voted picked me.
On day 1 as the new TL, I delegated two of the important ongoing initiatives to the people who were really passionate about them. Essentially, I told them that they had full TL responsibilities for those initiatives and could make decisions on my behalf. If they needed to consult me, they’d do so as usual, but there was no point in blocking them or slowing down the process by having to go through me. This was very much appreciated by those people, and it also helped them grow into the role and feel a sense of ownership. In the long run it proved to be the right choice because, quite frankly, I don’t understand everything they did, but the results speak louder: they did a great job.
This also freed up my time to focus on other initiatives.
This delegation went so far as to send different team members to the meetings that only demanded the TL as the sole representative of the team. The only condition was to communicate the outcome of the meeting clearly to the next person. This led to the very good practice of writing brief and informative technical notes from meetings.
I’m by no means trying to take credit for everything we’re doing better compared to a few years ago, but if you want people to really bring their best, trust them and delegate control. The ideal delegation is something that’s a tad harder than one’s limit, so they’ll strive for their best.
Use code reviews as an opportunity for communication
One of my duties as a Core member is to review PRs. I really love how the code transparently reflects its creator’s understanding of the code. So when I see a drastic deviation from how the platform was meant to be used, I reach out to people on Slack or VC instead of commenting on the PR because:
- Maybe the platform isn’t addressing their needs, which leads to good feedback for refactoring to accommodate those needs (saying no is often not a good option, because one of the main reasons we’re building our own platform is to be able to quickly evolve it to our needs)
- Maybe the platform is already capable of doing what they want, but they simply haven’t come across the “best practice”. In that case, Slack or VC gives better communication bandwidth for the discussion than GitHub PR comments.
- Occasionally I make a PR with alternative solutions as a proof of concept (PoC) to show how a different approach may compare to theirs. In those cases I make sure to leave the judgement to them. They may close my PR and modify theirs, or they may even insist that their way is better. Either way, I’m not my code.
On-board new recruits
Occasionally, we received PRs showing that a member of the community (often a new one) hadn’t understood the platform. As we dug into these cases, we often learned that the person had been on-boarded by someone else who didn’t have a good understanding of the platform either. So I volunteered to take part in on-boarding some of the new recruits. The idea was to:
- challenge myself to simply explain the stack to someone who doesn’t know anything about it: “if you can’t explain it simply you don’t understand it well enough”
- learn from their clear-minded questions how we can rethink the platform to make it more approachable
- establish a relationship from the get-go, which makes it easier for them to ask questions and contribute. This is a great chance to break any “us vs. them” mentality that may arise between the brand and Core teams.
Now we have momentum! There are lots of people using the product, and lots of ideas and suggestions pop up. One of our new recruits had a brilliant idea: to gather the community at the same location and mingle.
We decided to host an Unconference. The format is simple: just gather everyone who has a stake in the product under the same roof and let them decide what they are going to talk about in smaller break out groups on pre-allocated time slots.
At the end, one person from each group presents the top findings. At that final presentation I took it upon myself to document all the ideas in a Google doc and share it with the participants afterwards. Gatherings like these are very expensive (in our case 50+ people showed up on paid work time, several of whom traveled from other countries). It would be a loss not to compile an action list out of all those ideas.
Apart from serving as a good platform for discussions, the Unconference gave us a chance to meet some of our colleagues face to face for the first time and have ad-hoc conversations, which continued over dinner. These interactions built trust that lowers people’s guards when they approach each other on Slack or review PRs.
Provide a safe environment
There were literally hundreds of good ideas at the Unconference, but one of the key things I personally learned was from an ad-hoc chat with a colleague. He said:
Sometimes I am stuck working with the code, and probably someone has the answer to my questions, but I don’t feel like asking on Slack because people may judge me as incompetent or lazy.
I really appreciated that honest feedback but also saw it as a chance to work with him to understand his concerns in order to make it more “normal” to ask questions. If we optimize the support channels for the shy, it will be a safer environment for everyone to collaborate.
It goes the other way as well. As you create a platform for the community to interact with each other, some people may come across as harsh. It can happen to anyone. Sometimes people are in a rush and respond with a minimum number of words. Sometimes they are having a bad day: “Everyone you meet is fighting a battle you know nothing about.” In these cases, the best approach is to casually take it up in private with the people involved and try to resolve a possible misunderstanding or conflict by analyzing the situation. Yes, building a community sometimes involves conflict resolution. On that topic, this is a book that I really enjoyed reading:
Never Split the Difference
A very important tip is to always be genuinely respectful and polite no matter how complicated the situation might be. People tend to show their best when they feel respected and heard. Another tip that helps keep the discussions professional and focused is to try to put the conversations in the context of company mission and vision.
Praise good behavior
One of the best tools for cultivating good behaviors is genuine praise.
- By praising the behaviors that help the community and product, you positively reinforce that behavior in whoever did it
- Publicly praising someone and clearly mentioning the reason for the praise is a powerful tool to set an example for the other community members
At the start, I used to announce the top 3 contributors to the repositories (taken straight from the GitHub monthly stats). But “you become what you measure”. Optimizing for commit frequency means that to reach the top, someone just has to make smaller commits. While making smaller self-contained commits is a good practice, we don’t want to optimize a community around that metric. A good metric helps people optimize for good behavior. We wanted:
- Members helping each other
- Members contributing to the common code
- Members doing quality contributions: good PR descriptions, good tests, etc.
Identifying that requires more due diligence. One needs to keep an eye on the code and communication channels to identify those who go above and beyond for the community. Once identified, don’t miss a chance to give praise. The praise should be specific and reach the right audience for maximum effectiveness.
Set an example
Although the majority of this blog post is spent on the non-coding activities I do, in practice these only take half of my time. The other half goes to programming and adding features that make these repositories a pleasure to use. To keep this post free from technical details, here are some highlights of the initiatives I started to improve the quality of the code:
- Batteries included: our code base is complex and can be intimidating for new starters. The fact that we use Node.js in slightly creative ways (dynamic requires and an override mechanism) doesn’t make it easier either. Therefore we needed to lower the barrier for starters. After a Slack poll it became obvious that people really wanted to be able to debug the product locally in a hassle-free and smooth way. Our policy up until that point was to not bind the repo to any IDE. People could use vim, IntelliJ, Sublime, VS Code or whatever else they preferred. The Stack Overflow survey showed VS Code to be the most popular IDE in both 2018 and 2019. I was one of those developers, and in fact I did have a nice VS Code setup in my .vscode that wasn’t shared. So I checked it into the repo after some clean-up, and the response was amazing. We had almost zero complaints about how to debug the product because now we had at least one officially supported IDE (which, if you think about it, is not that crazy, considering that Android, iOS and some other ecosystems come with one official IDE that improves the DX). Fast forward to 2020, we now have settings.json, extensions.json and tasks.json files, which enable users to have the best setup with Prettier, ESLint and other goodies.
- Datadog integration: I love Datadog’s metrics, dashboards, alarms, etc. However, we relied on other tools for logging, which were not good (some brand developers were expected to read logs that were piped into Slack!). Fortunately, it was around the time that Datadog introduced log support. I was fairly new to their setup, but after some reading and experimenting, I figured out a way to add logs to our products. Unfortunately, this initiative was going to be blocked due to a rumor that Datadog is expensive or has GDPR issues. So I had to debunk the myth, and eventually we managed to have Datadog as an official logging platform at the company. Today, you can detect an anomaly on a dashboard, dig into server logs and follow the request traces, all in one interface. I have an upcoming video tutorial to share my knowledge about Datadog in general and our integration and tweaks in particular. Besides, with some help from Datadog’s amazing team, we’ve managed to put very useful monitoring and inspection tools in place that improve our response time to any incident.
- Commit message linting: we enforce a special commit message format (inspired by Angular) in order to quickly detect what is going to production on each release. Previously we used an open source project for it, but that one wasn’t flexible enough (for example, we required the scope to be the name of a brand folder) and had some other quirks (like enforcing upper case where it really didn’t create value). So I wrote a commit message linter that was very basic but, at the same time, fit our needs (note: some of our repos now use the open source commitlint project, so we may refactor this code out, but so far it’s been working solidly and there hasn’t been enough motivation to do so).
- Transducer runner: the heart of our system is a transducer engine that runs core or brand functions one after another. That code was duplicated, had poor monitoring and would crash in edge cases. So I spent a few weeks learning, prototyping and finally unifying all of them with one RunTransformationChain() function that practically executes billions of times a month. We may at some point open source that part, but what’s important is that we have very good visibility, good error messages and predictability that was missing in the old code.
- Snapshot testing: this is actually one of the first things I did before the TypeScript refactoring, but in retrospect it looks like an obvious choice for the complex and huge server that we’re working with. The idea is very simple: treat the server as a black box, capture its outputs given a certain input and check them into the repo. Then, if the functionality of the server drifts as a result of a refactoring or feature, quickly identify and visualize that. In practice the snapshot tests show a diff between what the server was supposed to return and what it returns. If the changes are OK, the developer can update the snapshots to match the new state; otherwise, they’ll need to debug their code to match the snapshots. In practice, snapshot tests are a fool-proof contract that will block faulty PRs from getting to master.
- Config compiler: one of the main reasons we’re building our stack is the flexibility to apply our business logic. In practice that business logic is a set of config files that is checked into another repo, audit-able per commit. However, that external repo had two huge faults: the config files were huge, and they were written in YAML, which made it easy to make mistakes that found their way to production. So in one of our Hackathons, I refactored the configs into JSON5 format, spread across smaller files. This gave birth to one of my open source projects, combine-json, which is a simple CLI and library to plow through a directory structure and create a JSON out of it. These [huge] JSON files are then used to apply the business logic.
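To make the commit message linting item above a bit more concrete, here is a minimal sketch of the idea in TypeScript. It is not our actual linter: the function name `lintCommitMessage`, the list of allowed types and the brand folder names are all assumptions made up for illustration. The only things taken from the description above are the Angular-inspired `type(scope): subject` header and the rule that the scope must be the name of a brand folder.

```typescript
// Hypothetical sketch of an Angular-style commit header check.
// ALLOWED_TYPES and the brand folder names are illustrative assumptions.
const ALLOWED_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore"];

interface LintResult {
  valid: boolean;
  errors: string[];
}

// Expected header shape: "<type>(<scope>): <subject>"
const HEADER_PATTERN = /^(\w+)\(([^)]+)\): (.+)$/;

function lintCommitMessage(message: string, brandFolders: string[]): LintResult {
  // Only the first line (the header) is checked in this sketch.
  const header = message.split("\n")[0] ?? "";
  const match = HEADER_PATTERN.exec(header);
  if (match === null) {
    return { valid: false, errors: ['header must look like "type(scope): subject"'] };
  }

  const type = match[1] ?? "";
  const scope = match[2] ?? "";
  const errors: string[] = [];
  if (!ALLOWED_TYPES.includes(type)) {
    errors.push(`unknown type "${type}"`);
  }
  // The custom rule: the scope has to be one of the brand folders.
  if (!brandFolders.includes(scope)) {
    errors.push(`scope "${scope}" is not a known brand folder`);
  }
  return { valid: errors.length === 0, errors };
}
```

With brand folders `["brand-a", "brand-b"]`, a message like `feat(brand-a): add article teaser` passes, while `feat(unknown): add teaser` is rejected with a single error about the scope.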
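The transducer runner item above is easier to picture with code. The following is a simplified, synchronous sketch under my own assumptions, not the real RunTransformationChain(): the `Transformation` shape, the `timings` output and the error wording are invented for illustration. It only shows the two properties mentioned above: steps run one after another, and a failure names the step that caused it.

```typescript
// Hypothetical sketch: every step is a named function from a value to the next value.
interface Transformation<T> {
  name: string;
  run: (input: T) => T;
}

interface ChainResult<T> {
  output: T;
  timings: { name: string; ms: number }[];
}

function runTransformationChain<T>(initial: T, chain: Transformation<T>[]): ChainResult<T> {
  let current = initial;
  const timings: { name: string; ms: number }[] = [];

  for (const step of chain) {
    const startedAt = Date.now();
    try {
      // Feed the output of the previous step into the next one.
      current = step.run(current);
    } catch (error) {
      // Re-throw with the step name so the failing core or brand function is obvious.
      const reason = error instanceof Error ? error.message : String(error);
      throw new Error(`transformation "${step.name}" failed: ${reason}`);
    }
    // Per-step duration, as a stand-in for real monitoring.
    timings.push({ name: step.name, ms: Date.now() - startedAt });
  }
  return { output: current, timings };
}
```

Running a chain of a "double" step followed by an "increment" step on the input `1` yields an output of `3`, and a step that throws surfaces as `transformation "<name>" failed: <reason>`.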
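For the snapshot testing item above, the core comparison can be sketched in a few lines. This is an illustration under assumptions rather than our test setup: `stableStringify` and `diffSnapshot` are hypothetical helpers, and the diff is a naive line-by-line comparison (a real test runner would use a proper diff algorithm). The stored snapshot plays the role of what the server was supposed to return, and the second argument is what it returns now.

```typescript
// Hypothetical sketch of black-box snapshot comparison.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so key order alone never shows up as a diff.
function stableStringify(value: Json): string {
  return JSON.stringify(
    value,
    (_key, val) => {
      if (val !== null && typeof val === "object" && !Array.isArray(val)) {
        const sorted: { [key: string]: unknown } = {};
        for (const k of Object.keys(val).sort()) {
          sorted[k] = val[k];
        }
        return sorted;
      }
      return val;
    },
    2,
  );
}

// Returns "- expected" / "+ actual" lines; an empty array means the snapshot still matches.
function diffSnapshot(expected: Json, actual: Json): string[] {
  const expectedLines = stableStringify(expected).split("\n");
  const actualLines = stableStringify(actual).split("\n");
  const diff: string[] = [];
  const length = Math.max(expectedLines.length, actualLines.length);
  for (let i = 0; i < length; i++) {
    if (expectedLines[i] !== actualLines[i]) {
      if (expectedLines[i] !== undefined) diff.push(`- ${expectedLines[i]}`);
      if (actualLines[i] !== undefined) diff.push(`+ ${actualLines[i]}`);
    }
  }
  return diff;
}
```

A test would fail whenever `diffSnapshot(storedSnapshot, currentResponse)` is non-empty and print the returned lines; updating the snapshot simply means overwriting the stored file with the new response.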
Wall of thanks
First I have to thank the Core team for supporting many of these initiatives. Without their support my time would be split between coding and merging and just calling it a day! Their support and encouragement and openness to improvement is a source of inspiration.
I should also thank the community (the wider group of brand developers, PMs, Operations, etc.) who work together to make an awesome product for the end users. The collaboration has been truly remarkable.
Things I wish we did differently
Of course nothing is perfect. Here’s a short list of things I wish we could do better or improve upon:
- Ambassador program: the idea is to have an official process for people from one team to become a temporary ambassador in another team. Apart from building trust and relationships, this is a good way to live each other’s daily lives and cross-pollinate best practices.
- Facilitate internal recruiting: currently the process for recruiting internal candidates is more or less similar to external recruiting. The process is complex enough that you may as well change jobs. Being able to move from one team to another is a cheap way for the company to keep its top talent while offering them new challenges. It’s also one of the best ways to tightly knit together different parts of the organization.
- Knowledge/Mandate asymmetry: the white-label was born at one of the brands and later adopted as the common solution. The original people who made that tech stack moved to the Core team. Later, the company recruited some of its best engineers to the Core team due to its impact. But as a side effect, many of the brands faced knowledge and skill starvation. Although the Core team has gone beyond “normal” in sharing this knowledge, some of the cultural difficulties we face are due to not having a strong presence in the brand teams and not facing their day-to-day challenges, which at worst causes requirement and priority drift. One way to solve it is to have one representative from every brand team in the Core team. Another solution is to assign each brand team to a Core member as a “key account manager”.
- Dedicated developer advocacy: a lot of what is done to improve the community is not officially our duty. Sure, it’s nice and appreciated but often it comes at the cost of sacrificing performance or stealing from private time. I’m guilty of both. If your product needs a flourishing community, give it the priority it deserves and dedicate passionate people to work on developer advocacy and oiling the collaboration cogs.
- Autonomy vs unity: this community (as big as it is) is just part of a much bigger company which has central teams for a lot of common needs like handling logs, running infrastructure, aggregating metrics, etc. These teams are more or less like small tech startups inside the bigger company whose main task is to support the sister companies in reducing their costs while increasing their cadence. However, our Core team hasn’t managed to utilize those supporting teams and products to their fullest extent. The autonomy allowed our team to move fast and decide on the tech stack individually, but it also crippled us by dedicating resources to Operations that we could otherwise spend on improving the code/collaboration.
- Config fatigue: by aggregating a lot of ideas from various brands, we ended up with a situation where the majority of the use cases had been implemented. As a result, much of the “new development” work is reduced to merely finding the right piece of code and then configuring it. Over time, the config (business logic) got so huge that it became hard to navigate. To solve this problem, a few major improvements are due: (a) config validation to flag config errors before going to production, (b) semantic config to “deduce” the configuration from a small config object, (c) a GUI for editing config (we have a limited working prototype for a part of the config that is accessed a lot) and, most importantly, (d) reducing the configuration by defining the requirements ahead of time: when engineers design the code without a strict requirement, they aim for the most configurable scenario.
- The cost of generalization: generally speaking every time we took a product or idea from one brand and tried to extend it to be used by others, the pace of the original brand slowed down because: (a) the code had more owners and stakeholders, so one could not “experiment” with it as easily. (b) the code usually gets more complicated to address variations of ideas from different brands and complex code is more vulnerable to bugs and security issues hence more expensive to maintain.
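As a rough illustration of point (a) under Config fatigue, config validation can start very small. The sketch below is hypothetical: the rule format, the `validateConfig` function and the field names `brandName` and `teaserCount` are invented for this example and do not reflect our real config. It flags missing required fields, wrong types, out-of-range numbers and unknown keys (which are often typos) before anything reaches production.

```typescript
// Hypothetical rule set: field names and rules are invented for illustration.
type FieldRule =
  | { kind: "string"; required: boolean }
  | { kind: "number"; required: boolean; min?: number };

type Schema = { [field: string]: FieldRule };

function validateConfig(config: { [key: string]: unknown }, schema: Schema): string[] {
  const errors: string[] = [];

  for (const [field, rule] of Object.entries(schema)) {
    const value = config[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field}: missing required field`);
      continue;
    }
    if (typeof value !== rule.kind) {
      errors.push(`${field}: expected ${rule.kind}, got ${typeof value}`);
      continue;
    }
    if (rule.kind === "number" && typeof value === "number" && rule.min !== undefined && value < rule.min) {
      errors.push(`${field}: must be >= ${rule.min}`);
    }
  }

  // Keys that the schema does not know about are usually misspelled field names.
  for (const field of Object.keys(config)) {
    if (!Object.prototype.hasOwnProperty.call(schema, field)) {
      errors.push(`${field}: unknown field (possible typo)`);
    }
  }
  return errors;
}
```

Run as a CI step against every config file, an empty result lets the change through, while any returned message blocks it with a pointer to the offending field.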
My main area of focus is our Node.js backend. When I joined the team, there was only one consultant working on it full time. Today, most of the contributions (by number of commits, PRs and contributors alike) go to this project, and in a recent survey the people who use this part of the system ranked it quite highly. It’s been an amazing couple of years at the Core team. Recently I got invited to a small group inside the company called “Building on top of others” (BOTO) whose job is to facilitate cross-team collaboration. I’m also honored to join the new “Identity and culture” group inside the company to help build a work culture where people thrive and the company stays ahead of its competition in recruiting the best and brightest.