How to build a community around an inner source product

  • Be explicitly open for contribution
  • Focus on great DX (developer experience)
  • Give great support
  • Share knowledge
  • Spread the word
  • Keep the quality high
  • Praise good behavior

Inner open source vs open source

The setup

6700+ commits by 153 people over the course of 5 years

Documentation

Photo credit: https://fs.blog/elephant/

I am not my knowledge. By setting that knowledge free, I challenged myself to learn new things and forced myself to grow.

Answer questions

When I was new to the company, I didn’t know much about the domain. But I made it my mission to find the answers to the questions that other people asked on Slack. The answer is always in the code, but people don’t bother, or don’t have the time, to dig in.

  • it helps unblock colleagues so they can deliver business value more efficiently
  • it creates trust and helps build relationships, which facilitates other, unrelated initiatives
  • it gives people a stronger baseline, so they ask smarter questions, which in turn makes you learn more
  • it improves one of the key skills of software engineering: reading code
  • it sparks great dialogues, which lead to new solutions and features
  • it is a great opportunity to discover DX issues
  • it’s the best way to improve documentation and compile a FAQ

Present it

Hold workshops

Try other formats

  • People who would miss the workshop can watch the video at their convenience (this even applies to people who are recruited after the event, which greatly helps their on-boarding)
  • It is possible to pause, rewind and replay a video as many times as needed in order to understand the concept (some people, including myself, are slow learners)

Pair up

  • If I answer a question, I try not to sound overly confident and instead ping other teammates for their take on the problem
  • If I write documentation, I ask colleagues to proofread it
  • If I hold a workshop, I ask my team to join as support for any questions that may arise
  • More importantly, if anyone else in the team takes a similar initiative, I genuinely praise it and wholeheartedly offer my help

Delegate

Use code reviews as an opportunity for communication

  • Maybe the platform isn’t addressing their needs, which leads to good feedback for refactoring it to accommodate those needs (saying no is often not a good option, because one of the main reasons we’re building our own platform is to be able to quickly evolve it to our needs)
  • Maybe the platform is already capable of doing what they want, but they simply haven’t come across the “best practice”. In that case Slack or VC gives better communication bandwidth for the discussion than GitHub PR comments.
  • Occasionally I make a PR with alternative solutions as a proof of concept (PoC) to show how a different approach may compare to theirs. In those cases I make sure to leave the judgement to them. They may close my PR and modify theirs, or they may even insist that their way is better. Either way, I’m not my code.

On-board new recruits

  • challenge myself to simply explain the stack to someone who doesn’t know anything about it: “if you can’t explain it simply you don’t understand it well enough”
  • learn from their fresh-eyed questions how we can rethink the platform to make it more approachable
  • establish a relationship from the get-go, which makes it easier for them to ask questions and contribute. This is a great chance to break any “us vs. them” mentality that may arise between the brand and Core teams.

Physical gathering

Provide a safe environment

Sometimes I am stuck working with the code, and probably someone has the answer to my questions, but I don’t feel like asking on Slack because people may judge me as incompetent or lazy.

Praise good behavior

  • By praising the behaviors that help the community and the product, you positively reinforce those behaviors in whoever showed them
  • Publicly praising someone, and clearly mentioning the reason for the praise, is a powerful tool to set an example for the other community members
  • Members helping each other
  • Members contributing to the common code
  • Members doing quality contributions: good PR descriptions, good tests, etc.
Photo from VS Code’s website

Set an example

  • TypeScript: some people love it, some hate it, but as I’ve mentioned in one of my most popular blog posts, a type system is an essential part of any huge repo with many contributors, and our stack is no exception. Unfortunately, when I started, the repo was 10% converted to Flow, which has a smaller tooling ecosystem and misses some important features. After making a PoC PR (proof of concept pull request), I started to gradually convert the files. Later another smart colleague of mine joined the effort, which tripled the conversion speed. However, as you might expect, not every member of the community loved TypeScript, so we decided to support both JavaScript and TypeScript (and still compile everything with tsc).
  • Batteries included: our code base is complex and can be intimidating for new starters. The fact that we use Node.js in slightly creative ways (dynamic requires and an override mechanism) doesn’t make it easier either. Therefore we needed to lower the barrier for starters. After a Slack poll it became obvious that people really wanted to be able to debug the product locally in a smooth, hassle-free way. Our policy up until that point was to not bind the repo to any IDE. People could use vim, IntelliJ, Sublime, VS Code or whatever else they preferred. But the Stack Overflow survey showed that VS Code was the most preferred IDE in both 2018 and 2019. I was one of those developers, and in fact I had a nice VS Code setup in my .vscode that wasn’t shared. So I checked it into the repo after some cleanup, and the response was amazing. We had almost zero complaints about how to debug the product because now we had at least one officially supported IDE (which, if you think about it, is not that crazy, considering that Android, iOS and some other ecosystems come with one official IDE that improves the DX). Fast forward to 2020: we now have settings.json, extensions.json and tasks.json files, which give users the best setup with Prettier, eslint and other goodies.
  • Datadog integration: I love Datadog’s metrics, dashboards, alarms, etc. However, we relied on other tools for logging which were not good (some brand developers were expected to read logs that were piped into Slack!). Fortunately it was around the time that Datadog introduced log support. I was fairly new to their setup, but after some reading and experimenting, I figured out a way to add logs to our products. Unfortunately, this initiative was going to be blocked due to a rumor that Datadog is expensive or has GDPR issues. So I had to debunk the myth, and eventually we managed to have Datadog as an official logging platform at the company. Today, you can detect an anomaly on a dashboard, dig into server logs and follow the request traces, all in one interface. I have an upcoming video tutorial to share my knowledge about Datadog in general and our integration and tweaks in particular. Besides, with some help from Datadog’s amazing team, we’ve managed to put very useful monitoring and inspection tools in place that improve our response time to any incident.
  • Commit message linting: we enforce a special commit message format (inspired by Angular) in order to quickly detect what is going to production on each release. Previously we used an open source project for it, but that one wasn’t flexible enough (for example, we required the scope to be the name of a brand folder) and had some other quirks (like enforcing upper case where it really didn’t create value). So I wrote a commit message linter that was very basic but fit our needs (note: some of our repos now use the open source commitlint project, so we may refactor this code out, but so far it’s been working solidly and there hasn’t been enough motivation to do so).
  • Transducer runner: the heart of our system is a transducer engine that runs core or brand functions one after another. That code was duplicated, had poor monitoring and would crash in edge cases. So I spent a few weeks learning, prototyping and finally unifying all of them with one RunTransformationChain() function that practically executes billions of times a month. We may at some point open source that part, but what’s important is that we have very good visibility, good error messages and predictability that was missing in the old code.
  • Snapshot testing: this is actually one of the first things I did before the TypeScript refactoring, and in retrospect it looks like an obvious choice for the complex and huge server that we’re working with. The idea is very simple: treat the server as a black box, capture its outputs for a certain input and check them into the repo. Then, if the functionality of the server drifts as a result of a refactoring or a new feature, quickly identify and visualize that. In practice the snapshot tests show a diff between what the server was supposed to return and what it returns. If the changes are OK, the developer can update the snapshots to match the new state; otherwise, they’ll need to debug their code until it matches the snapshots. The snapshot tests act as a fool-proof contract that blocks faulty PRs from getting to master.
  • Config compiler: one of the main reasons we’re building our stack is the flexibility to apply our business logic. In practice that business logic is a set of config files that is checked into another repo, auditable per commit. However, that external repo had two huge faults: the config files were huge, and they were written in YAML, which made it easy to make mistakes that found their way to production. So in one of our Hackathons, I refactored the configs into JSON5 format, spread across smaller files. This gave birth to one of my open source projects, combine-json, which is a simple CLI and library to plow through a directory structure and create a JSON out of it. These [huge] JSON files are then used to apply the business logic.
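
To make the commit message linting item above more concrete, here is a minimal sketch of what such a linter could look like. It is not the actual code: the brand folder names, the list of allowed types and the `lintCommitMessage` function are all hypothetical, and the only rules taken from the text are the Angular-inspired `type(scope): subject` header and the requirement that the scope is the name of a brand folder.

```typescript
// Hypothetical brand folder names; a real linter would read them from the repo layout.
const BRAND_FOLDERS: string[] = ["brand-a", "brand-b", "core"];

// Illustrative list of commit types (an assumption, not the project's actual list).
const COMMIT_TYPES: string[] = ["feat", "fix", "docs", "refactor", "perf", "test"];

interface LintResult {
  valid: boolean;
  errors: string[];
}

// Matches a header of the form "type(scope): subject".
const HEADER_PATTERN = /^(\w+)\(([^)]+)\): (.+)$/;

function lintCommitMessage(
  message: string,
  scopes: string[] = BRAND_FOLDERS
): LintResult {
  // Only the first line (the header) is linted in this sketch.
  const header = message.split("\n")[0] ?? "";
  const match = HEADER_PATTERN.exec(header);
  if (match === null) {
    return {
      valid: false,
      errors: ['header must look like "type(scope): subject"'],
    };
  }

  const commitType = match[1] ?? "";
  const scope = match[2] ?? "";
  const errors: string[] = [];

  if (COMMIT_TYPES.indexOf(commitType) === -1) {
    errors.push(`unknown type "${commitType}"`);
  }
  // The scope must be one of the brand folders.
  if (scopes.indexOf(scope) === -1) {
    errors.push(`scope "${scope}" is not a brand folder`);
  }
  return { valid: errors.length === 0, errors };
}
```

With these assumptions, `lintCommitMessage("feat(brand-a): add login banner")` passes, while a message such as `"Fixed stuff"` is rejected with an error describing the expected header shape. Note that the sketch deliberately does not enforce any casing rules.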

Wall of thanks

Things I wish we did differently

  • Ambassador program: the idea is to have an official process for people from one team to become a temporary ambassador in another team. Apart from building trust and relationships, this is a good way to live each other’s daily lives and cross-pollinate best practices.
  • Facilitate internal recruiting: currently the process for recruiting internal candidates is more or less similar to external recruiting. The process is complex enough that you may as well change jobs. Being able to move from one team to another is a cheap way for the company to keep its top talent while offering them new challenges. It’s also one of the best ways to tightly knit different parts of the organization.
  • Knowledge/Mandate asymmetry: the white-label was born at one of the brands and later adopted as the common solution. The people who originally made that tech stack moved to the Core team. Later, the company recruited some of its best engineers to the Core team due to its impact. But as a side effect many of the brands faced knowledge and skill starvation. Although the Core team has gone beyond “normal” in sharing this knowledge, some of the cultural difficulties we face are due to not having a strong presence in the brand teams and not facing their day-to-day challenges, which at worst causes requirement and priority drift. One way to solve it is to have one representative for every brand team in the Core team. Another solution is to assign each brand team to a Core member as a “key account manager”.
  • Dedicated developer advocacy: a lot of what is done to improve the community is not officially our duty. Sure, it’s nice and appreciated but often it comes at the cost of sacrificing performance or stealing from private time. I’m guilty of both. If your product needs a flourishing community, give it the priority it deserves and dedicate passionate people to work on developer advocacy and oiling the collaboration cogs.
  • Autonomy vs unity: this community (as big as it is) is just part of a much bigger company which has central teams for a lot of common needs like handling logs, running infrastructure, aggregating metrics, etc. These teams are more or less like small tech startups inside the bigger company whose main task is to support the sister companies in reducing their cost while increasing their cadence. However, our Core team hasn’t managed to utilize those supporting teams and products to their fullest extent. The autonomy allowed our team to move fast and decide on the tech stack individually, but it also crippled us by tying up resources in Operations that we could otherwise have spent on improving the code and collaboration.
  • Config fatigue: by aggregating a lot of ideas from various brands, we ended up with a situation where the majority of the use cases had been implemented. As a result, much “new development” work is reduced to merely finding the right piece of code and then configuring it. Over time, the config (business logic) got so huge that it became hard to navigate. To solve this problem, four major improvements are due: (a) config validation to flag config errors before they go to production, (b) semantic config to “deduce” the configuration from a small config object, (c) a GUI for editing config (we have a limited working prototype for the part of the config that is accessed a lot) and, most importantly, (d) reducing the configuration by defining the requirements ahead of time: when engineers design the code without a strict requirement, they aim for the most configurable scenario.
  • The cost of generalization: generally speaking, every time we took a product or idea from one brand and tried to extend it to be used by others, the pace of the original brand slowed down because (a) the code had more owners and stakeholders, so one could not “experiment” with it as easily, and (b) the code usually got more complicated in order to address variations of ideas from different brands, and complex code is more vulnerable to bugs and security issues, hence more expensive to maintain.
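
As an illustration of the config validation idea mentioned under config fatigue, here is a minimal sketch, assuming a made-up config with three fields (`brand`, `locale`, `articlesPerPage`). The real config schema is not described here, so every field name and the `validateBrandConfig` function are hypothetical. The point is only that both wrong value types and misspelled keys can be flagged before a config change reaches production.

```typescript
// Hypothetical set of allowed keys; the real config is much larger.
const ALLOWED_KEYS: string[] = ["brand", "locale", "articlesPerPage"];

// Returns a list of human-readable problems; an empty list means the config is valid.
function validateBrandConfig(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null) {
    return ["config must be an object"];
  }
  const config = raw as Record<string, unknown>;
  const errors: string[] = [];

  const brand = config["brand"];
  if (typeof brand !== "string" || brand === "") {
    errors.push('"brand" must be a non-empty string');
  }

  const locale = config["locale"];
  if (typeof locale !== "string" || locale === "") {
    errors.push('"locale" must be a non-empty string');
  }

  const perPage = config["articlesPerPage"];
  if (
    typeof perPage !== "number" ||
    Math.floor(perPage) !== perPage ||
    perPage <= 0
  ) {
    errors.push('"articlesPerPage" must be a positive integer');
  }

  // Unknown keys are usually typos, so report them instead of silently ignoring them.
  for (const key of Object.keys(config)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) {
      errors.push(`unknown key "${key}"`);
    }
  }
  return errors;
}
```

A check like this could run in CI on the config repo, so that a pull request with a misspelled key such as `articelsPerPage` fails before it is merged. For a config of realistic size, a schema-based validator would likely be a better fit than hand-written checks.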

Conclusion
