When Does Open Source Make Sense for a Business?

May 5, 2021

While discussing some of my philosophies on open source with a friend recently, they asked me a great question: “When does open source make sense for a business?”

That’s tough to answer. In many ways it’s a loaded question—depending on who you ask, you might get extremely strong opinions with widely varying rationales.

Open source can provide a number of different advantages for a business, on both the consumption and production ends of the spectrum. Many companies are strategic consumers of open-source software as a means to reduce the burden on their software engineering team to build everything from the ground up.

Other businesses, like LinkedIn and Netflix, have strong histories of being producers of open-source projects, which provide a strategic recruitment and retention tool for top-tier engineering talent.

For this post, I’m going to get into the business decision to be (or not be) a Commercial Open-Source Software (COSS) company, as well as how to think through whether open source will provide a strategic advantage for your go-to-market motion. The decision to open source or not is anything but black-and-white, and as with most strategy decisions, it’s best to think through the tradeoffs and the different ways you can slice the question.

I’m not going to tackle some of the hairier second-order questions that come up once you’ve decided to open source, like what license to use, how to defend against PaaS competitors, or when it makes sense to donate your code to an open-source foundation (all important and complicated questions)—but those may be interesting topics for another discussion down the road.

What are some of the benefits of open-sourcing your software?

Open-source communities can provide a powerful distribution and message amplification channel
Open source and community tend to go hand-in-hand, and the social dynamics that create that environment are probably complex enough to justify an entire article of their own.

One major reason could be that open source is often aligned with more mission-driven goals, rather than purely economic goals, and attracts a diverse set of personalities. These communities don’t come for free; they require a lot of love and care to foster and grow. But they can provide a strong network effect, which turns the open-source community into an effective way to both distribute the project and amplify corporate messaging.

In a number of cases, as with Kafka (sponsored predominantly by Confluent) and Elasticsearch (from the eponymous Elastic), the open-source communities grew around projects before a corporate entity was incorporated and effectively created the opportunity to drive a fast-growing business.

You reduce adoption friction
Sometimes, open-sourcing your software can open a floodgate. Companies that had little interest in your solution as a proprietary offering might now be ready, willing, and excited to talk to you. This was somewhat the case in the early days of StreamSets (which is a DataOps/ETL tool). We actually started out as a closed-source offering and spent about six months trying to sell it as a proprietary product, but we kept getting doors slammed in our faces.

When we finally open-sourced the software, we were suddenly able to get meetings with a lot of the companies that had previously politely declined. However, a good question to ask yourself when considering the reduced friction is: Could you get the same benefit with a scaled-back, freemium offering? The answer likely depends on your market segmentation, as freemium may ease adoption friction for SMB/mid-market companies. But it may not drive results in the enterprise, where businesses are likely to care much more about things like long-term viability of a vendor and potential for lock-in, and freemium may not sufficiently de-risk an early-stage company for enterprise use.

On-premise (or “cloud-prem”) is still a thing
Yes, SaaS is powerful. Yes, SaaS is pervasive. And yet on-premise deployments are still very common, especially in the enterprise, though what counts as on-premise depends on your definition. There are unique security and privacy needs that come up in enterprise markets that just don’t carry the same level of concern downmarket, so open source can make those challenges less of an obstacle.

These days, on-premise deployments commonly come in the form of businesses running their own VMs in a cloud IaaS environment. Functionally, it’s the same as running a data center, and the customer still shoulders the burden of running the infrastructure.

For some organizations, that’s a critical need. Think: financial services, healthcare, and other highly-regulated industries.

What are some of the downsides to open-sourcing your software?

You’ll compete with yourself

This is probably the toughest aspect of building an open-source business from both a go-to-market and product strategy perspective. Often, if you ask an open-source salesperson who their primary competitor is, they’ll say DIYers—and it’s totally true. A lot of your top-of-funnel is focused on converting open-source users to paying customers. It produces a challenging messaging strategy because if your primary business model is selling support on an open-source project, you’re kind of incentivizing your team to build shitty software, and nobody ends up happy.

That leads to a strong need to determine a product strategy that both rewards adoption and drives conversion. These days, that often looks like SaaS/PaaS versions of the open-source project, which reduce management costs for end users, combined with holding certain features back (enterprise security features are a very common holdback to entice conversion). You can often see these types of feature segmentations detailed on open-source pricing pages, as with GitLab and HashiCorp.

Your destiny will likely be influenced by your community
It’s hard to really bucket this item into a pro or con, because it has elements of both. Open-source communities typically fall into one of two buckets: user communities or developer communities. In both cases, your project direction is likely to be influenced by these communities, but in different ways depending on the type of community you foster.

User communities are powerful mechanisms to uncover unmet needs and organically discover use cases that you didn’t know your project had. However, communities take a lot of work and love and effort to keep alive, and a forgotten community can spell doom for a project—especially if people perceive that a project is slowing down.

Developer communities can also have a significant impact, depending on how the community and its contributions are governed. For example, projects donated to the Apache Software Foundation are subject to very strict bylaws about what qualifies as an acceptable contribution, which means that you may need to accept community contributions that lead your project in different directions than you intended.

Your code and roadmap are on display for competitors
Whether you like it or not, being open source means competitors will see some surface of your product strategy. They’ll be able to see what you’re building and putting out in the open source—and they can see how you’re building it, too.

This is often where licensing questions start to come up as well. The Apache License 2.0 has long been a highly popular open-source license, but it has caused a lot of headaches in recent years as cloud services (cough, cough, AWS) have forked a number of projects and developed PaaS offerings around them. This has resulted in some highly controversial decisions in various open-source ecosystems, most notably with Elastic, which adopted the Server-Side Public License (SSPL). The SSPL prohibits creation of a PaaS service for the purpose of commercializing the open-source project, though it does not prohibit the use of the project within a commercialized solution.

A number of companies like Confluent (with their Confluent Community License) and MongoDB (who originally designed the SSPL) have taken this approach to defend against would-be competitors.

Customers will likely ask for on-premise support
The ability to run on-premise opens up a new addressable market, but it is also damn expensive to support. Trying to debug some weird heisenbug via logs that are woefully uninformative? Yeah, that’s just Tuesday.

You can work around this a bit by offering support for single-tenant deployments in an environment accessible to you, which makes the support a lot easier. But any way you cut it, supporting on-premise deployments is way more expensive than supporting a multi-tenant SaaS version of the product.

What signals might indicate that open-sourcing could be a good strategy?

Pros and cons aside, there are a lot of other important factors that tie into whether it makes sense to be open source. In particular, I think a good mental model for thinking through the decision is to consider a handful of primary questions:

How technical is the end user or operator?
The more technical your end user is, the more likely it is that they’ll value a technology being open source. Developers or data engineers, for example, are likely to want to understand the source code because it helps them understand the inner workings and better rationalize how they might integrate it into their existing systems. Higher-functioning teams might also want to customize the software to their needs at various key integration points. Open-source code bases make that possible and enhance their “right-to-repair,” so to speak.

On the other end of the spectrum, if your end user is non-technical, such as a marketer, they probably couldn’t care less whether the thing is open source or closed source. And, frankly, they probably want it as a SaaS service so that they can self-serve as much as possible and limit their reliance on other technical teams.

A data scientist or product analyst might be more towards the middle, as they probably understand SQL or maybe even some statistical languages, or they use Python. They probably value open source, but it may not be the thing that sways a decision to adopt a technology.

A slightly different lens on this might be to think about how frequently the end user interacts with source control systems like GitHub or GitLab. In the data ecosystem, users are often writing code and checking it into source control with fair regularity. This may be true even with less technical members of the community like data analysts, who spend most of their time writing SQL queries. On the other hand, in the cybersecurity community, you may have highly technical users who understand the ins and outs of systems administration but aren’t developers, and in spite of being very familiar with a command line, they spend relatively little time working with source-controlled code.

Are you targeting enterprise or mid-market/SMB?
Enterprise sales are quirky beasts. There are a lot more hoops to jump through, like security audits, source code scans, and the like. There’s also often a much different scale, which produces nuanced requirements that can vary from opportunity to opportunity. Generally, enterprises tend to be a lot more risk-averse and appreciate the relative safety and security that open source affords, with the knowledge that bugs can be fixed. They also appreciate having the ability to run on-premise in a datacenter or VPC if they need to.

Enterprises tend to choose technologies to fill out boxes in architecture diagrams, so they often need a peg that can fit into a very specifically-sized hole. Open source can make it easier to shave off the bits on the side that make it hard to get the peg in the hole.

Mid-market and SMB companies, on the other hand, usually don’t have budget for the best-of-breed technology in every space, so they have to pick and choose and look for software that can offer something in many boxes rather than just do one thing really well. This often lends an edge to SaaS where the applications can be more general purpose and all-in-one, and it doesn’t incur the same management and operational expense that on-premise implementations of a similar tool might.

Is there a natural monetization strategy?

There are lots of different ways that open-source companies monetize, and none of those ways are perfect. Some companies choose to provide pure-play support for open source (like Hortonworks), but this creates some awkward software quality incentivization.

Another common approach is to hold back certain features, frequently enterprise security/governance capabilities, to drive commercial adoption (these usually show up as Community vs. Enterprise Editions of a product).

An increasingly common strategy is for companies to develop an open-source project that may be available for on-premise deployments but predominantly monetize a cloud-managed service (Confluent Cloud, Databricks, MongoDB Atlas, all good examples here), where the cloud-managed service enables greater adoption downmarket, and often these services are driven by consumption-based pricing. If an appropriate monetization strategy isn’t clear, that’s a big red flag for open-sourcing.

There’s no perfect answer to the question of whether you should open source or not, but it can be instructive to think about questions like these, especially in the context of other businesses in the market. Let’s take a look at some of the patterns across open-source companies (and at some edge cases).

Segmented product offerings (Ex: Confluent, HashiCorp, and Grafana)

These companies have all taken the approach of bifurcating their market between an on-premise enterprise offering (Confluent Platform, HashiCorp Enterprise, Grafana Enterprise Stack) and a cloud-managed service. The open-source roots of all three allowed them to make fast inroads into the enterprise in the early stages of growth, and their open-source communities enabled them to take market share while de-risking adoption in the enterprise.

They also all sport a very technical user base (DevOps and Data Engineers, primarily). The cloud-managed services attract the lower end of the market and make it possible for these businesses to address the SMB/mid-market segments. The cloud-managed services also have different pricing models in several of these cases, with per-node pricing on the enterprise offerings, but consumption-based pricing in the managed service. Strategically, this allows the businesses to continue to address enterprise customers as the enterprise becomes more accepting of cloud and can migrate to managed services.
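To make the pricing contrast above concrete, here is a minimal sketch of why consumption-based pricing tends to lower the entry cost for smaller customers while per-node pricing can favor large, steady workloads. Every number and name in it (`PER_NODE_ANNUAL_PRICE`, `MIN_NODES`, `PRICE_PER_UNIT`, the usage figures) is a made-up assumption for illustration and does not reflect any vendor’s actual pricing.

```python
# All prices and usage figures are hypothetical illustrations, not real vendor pricing.
PER_NODE_ANNUAL_PRICE = 10_000  # assumed license cost per node per year
MIN_NODES = 3                   # assumed minimum cluster size for an on-premise license
PRICE_PER_UNIT = 0.25           # assumed cost per unit of usage (e.g. per GB processed)


def per_node_cost(nodes: int) -> float:
    """Annual cost under per-node licensing, subject to the minimum cluster size."""
    return max(nodes, MIN_NODES) * PER_NODE_ANNUAL_PRICE


def consumption_cost(units_per_year: float) -> float:
    """Annual cost under consumption-based pricing: pay only for what is used."""
    return units_per_year * PRICE_PER_UNIT


def break_even_units(nodes: int) -> float:
    """Annual usage at which both models cost the same for a given cluster size."""
    return per_node_cost(nodes) / PRICE_PER_UNIT


if __name__ == "__main__":
    # Two hypothetical customers: (label, nodes they would license, units they would consume)
    customers = [("small team", 1, 20_000), ("enterprise", 20, 1_500_000)]
    for label, nodes, units in customers:
        print(f"{label}: per-node={per_node_cost(nodes):,.0f} "
              f"consumption={consumption_cost(units):,.0f}")
```

Under these assumed numbers, the small team pays far less on consumption pricing because it avoids the minimum cluster commitment, while the heavy enterprise workload is cheaper on per-node licensing. The crossover point depends entirely on the assumed prices, so treat this as a way to reason about the tradeoff rather than a pricing model.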

Compete on enterprise functionality (Ex: Fivetran/Airbyte, Segment/RudderStack, Slack/Mattermost)

Taking a quick look at the marketing messaging for Airbyte, RudderStack, and Mattermost makes the strategic differentiation very clear. Security, privacy, lock-in. These are the things that matter to enterprise customers, and each of the three leans on them as primary differentiators that hinge on its open-source strategy. Both Segment and Fivetran are trying very hard to crack the enterprise, but neither has seen wild success there.

In particular, Fivetran struggles because it is strategically aimed at a user persona lower on the technical spectrum (analysts), who are able to save time by using Fivetran to circumvent central IT. The double-edged sword of Fivetran’s success with a non-technical crowd is that it has alienated the organization and the individual (IT/CIO) that would typically own its category of technology in the enterprise. Mattermost is a different story, as Slack has been highly successful in the enterprise, but one element of Mattermost’s differentiation strategy similarly centers on privacy and compliance.

Compete on user segmentation (Ex: Slack/Mattermost)

The other major front that Mattermost differentiates on is user segmentation. Slack is geared towards business teams broadly (evidenced by its former ticker $WORK), and with the Salesforce acquisition, its focus on revenue organizations feels even more pronounced. Mattermost paints itself as a collaboration tool for developers, keying into the more technical audience that will get more value out of its open-source strategy.

On second thought… (Ex: Panther)
Panther is particularly interesting because it is partially based on an open-source project from Airbnb called StreamAlert. Panther is, or more accurately, was, an open-source SIEM, akin to Splunk. Panther started open source and recently made the decision to go closed source, citing the simplicity it would bring to the solution. This is likely partially a reflection of the relatively high cost of supporting customers using on-premise deployments.

The other likely factor is that their solution is aimed at a market that doesn’t value open source as much, so there’s not as much strategic value to being open source. Panther builds on Snowflake, which is a little unusual in the security space to begin with, and it also aims at a downmarket, likely more digitally native audience. Additionally, their end users are security analysts who are likely comfortable on a command line but are not necessarily developers familiar with an IDE. The market segment doesn’t demand open source (although there have been a number of highly successful open-source security companies), nor does their user base, so going closed source is likely a capital-efficient decision for them.

Upskilling your users (Ex: dbt)
dbt is a curious player in the open-source ecosystem. Its users tend to be analysts and analytics engineers who sit somewhere in the middle of the technical spectrum but tend to have less programming ability. The tool is used broadly across SMBs and mid-market, and it is starting to see adoption in the enterprise as well. It’s fairly universal, and so the decision to be open source or not is a little up-in-the-air. Would dbt be successful if it were closed source?

My guess would be yes, but what’s particularly interesting about dbt is that in many ways, its users start out with fewer technical capabilities, and dbt actually introduces them to software engineering concepts and helps them upskill themselves. The result is that dbt moves its user base further along the spectrum of programming and technical ability, which might create a strong justification for why it makes sense to be open source.

Does this really need to be open source? (Ex: Preset, Metabase)
Metabase and Preset both fall into the camp of “Why is this open source?” for me. To be clear, I don’t think it’s bad for them to be open source, I just don’t see a clear necessity. They’re end-user applications that don’t demand significant technical skills to use, and in a market (business intelligence) that has historically not necessitated open source in the enterprise (Tableau, Looker, Sisense have all done just fine for themselves). Superset (which is the open-source basis for Preset) has certainly been an efficient distribution mechanism, and perhaps Preset’s strategy will just be to primarily attempt to monetize open-source adopters. But when that well runs dry, I don’t see a clear advantage to their open-source approach.

Snowflake
Snowflake is an obvious example that goes against the grain of trends with many open-source ecosystems. They are highly successful in SMB/mid-market, but also have significant traction in the enterprise. There are lots of characteristics of Snowflake’s business that would make it seem like open source would have been a helpful strategy for driving into the enterprise earlier, but so much of their value proposition and secret sauce is in what a SaaS/PaaS distribution model enables. They offer a self-healing, self-optimizing database with data sharing across accounts.

It just wouldn’t work as well deployed on-premise. They’re able to provide extreme business value as a PaaS platform in spite of being closed source and available only as a multi-tenant service. In some ways, the value they provide is a consequence of being a multi-tenant PaaS offering, through network-oriented capabilities like shared tables and by dramatically reducing the headcount required to manage and administer a data warehouse. Snowflake is evidence that open source is not a requirement to be successful for a highly technical audience, even if it can be helpful in many other cases.

There’s no silver bullet to the question

Commercializing an open-source project as a core offering often cements a critical market, but being open source does not necessarily dictate a particular monetization strategy or product strategy.

Companies like Confluent and MongoDB have high-end offerings built around their core open-source projects, but they also offer PaaS versions of their products (Confluent Cloud and MongoDB Atlas), which are essentially proprietary products that can appeal to a lower end of the market. This enables organizations to find a balance that fits their growth strategy at the time.

Open-sourcing a project can enable a company to drive towards an untapped market, but it can have lasting effects on the direction of the business. There’s no right answer to whether a business should open source, and the question should generally be answered in the context of who the business is targeting as an end user and how open-sourcing can provide strategic advantage.

Header photo by Annie Spratt on Unsplash

Jon “Natty” Natkins is a Solutions Architect at Fishtown Analytics (the makers of dbt). He’s held engineering, product, and sales roles at a variety of open-source companies, including Cloudera, StreamSets, and Corelight. He loves all things data and personal finance, and you can catch more of his musings and opinions on his personal blog, Semi-Structured (https://semistructured.substack.com/). Opinions are his own and do not necessarily reflect those of his employer.