Production-Ready Open-Source

The Problem

Innovation in computing has been a sequence of battles against three major bottlenecks. The first was materials science and semiconductor design, which restricted computation to massive machines accessible only to those with enough capital to purchase and maintain them. As the cost and size of the chip fell rapidly, a new breed of personal computer became accessible to the masses, but that unveiled the second significant bottleneck: information transfer cost, with ugly transitional solutions like the floppy disk or CD-ROM. The internet solved this by allowing computers to stream data between each other over TCP, but it created a third major bottleneck that has proven more difficult to wrangle than the previous two: distributed systems operational cost. At sufficient scale, the likelihood of a large, multi-node system failing in some way approaches 1, so there will always be some need for maintenance, but we have still not found a great way to minimize it.

The solution the industry has chosen to manage that bottleneck is very similar to the one applied to the original problem of semiconductor design: reach economy of scale. This has led to a proliferation of centralized services, which minimizes the hiring needed for the unique skillset that maintaining distributed systems requires and preserves the quirky knowledge of how each specific system fails under stress. It has also had major negative side-effects for the industry at large, for instance impinging on data privacy for users, or allowing cloud providers to charge obscene markups for rip-offs of open source software by providing a simple "management" kicker. Finally, because centralization was so necessary, the monetization of a huge swath of software has been misallocated to mega-caps like AMZN, GOOG and MSFT, whose primary value is simply owning a lot of datacenter real estate, instead of remunerating the developer communities who are truly responsible for the software.

I believe that with the widespread adoption of cloud for the underlying compute power, and near-free managed Kubernetes for orchestration, we're remarkably close to solving this last big bottleneck. Using the two, we can declaratively specify virtually any distributed system an organization might want to deploy, and using operators and CRDs (custom resource definitions), we can also embed the operational knowledge to maintain those applications into the cluster itself.
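The idea of embedding operational knowledge in the cluster can be sketched as a reconcile step: compare the declared state with the observed state and derive the maintenance actions. This is only an illustration in plain Python, not real Kubernetes client code; the `Spec` and `Observed` types, their fields, and the action strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spec:
    """Desired state, as a declarative resource might state it (hypothetical fields)."""
    replicas: int
    version: str


@dataclass(frozen=True)
class Observed:
    """State actually seen running in the cluster (hypothetical fields)."""
    replicas: int
    version: str


def reconcile(desired: Spec, observed: Observed) -> List[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    actions = []
    if observed.version != desired.version:
        actions.append(f"upgrade {observed.version} -> {desired.version}")
    if observed.replicas < desired.replicas:
        actions.append(f"scale up by {desired.replicas - observed.replicas}")
    elif observed.replicas > desired.replicas:
        actions.append(f"scale down by {observed.replicas - desired.replicas}")
    return actions


# One pass of the loop: a node was lost and a new version was declared.
print(reconcile(Spec(replicas=3, version="2.1"), Observed(replicas=2, version="2.0")))
```

An operator would run a step like this continuously, so the knowledge of how to upgrade or repair the application lives in code shipped with it rather than in a centralized operations team.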

This should enable a new marketplace of portable SaaS applications that can allow privacy from the data layer up, allow process-level control of your core infrastructure, and allow mobility from provider to provider as costs change. With licensing and billing added on top, you should be able to take any bit of distributed systems software and convert it into its own SaaS business, without any of the malignant side-effects of the centralized model the industry has built itself on.

The Solution

We believe that with the proliferation of declarative configuration, distributed systems can now be delivered effectively as source code, and we are building around that paradigm. This means a workflow with three basic steps:

  1. Package & Publish
  2. Install & Configure
  3. Lifecycle Management

How We'll Do It

Our core problem is solving the chicken-and-egg issue of market creation: no developer will build for this model if there's no demand for it, but there will never be demand for it unless people are building on it. So we need to find a high-leverage place to focus first, and we believe the biggest opportunity in the market is in open source managed services. Their lopsided monetization, the little marginal value they truly provide in a modern, Kubernetes-driven world, and the often limited configurability of the applications they offer make this entire space ripe for disruption. Considering that providers are charging on the order of 50% above their own posted compute/memory cost, the opportunity is more than big enough to justify investment as well.

Put simply, we believe there's about a 50% arbitrage between the cost of managed k8s plus cloud compute and storage and the cost of a managed open source solution on the exact same cloud, and the tooling we're building is what is needed to capitalize on it.
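To make the arithmetic concrete, here is a small sketch that assumes the 50% figure is a markup over raw infrastructure cost. The $1,000 monthly bill is a made-up illustration, not a measured number, and the function names are ours.

```python
def managed_price(raw_cost: float, markup: float) -> float:
    """Price of a managed service that charges `markup` above raw infrastructure cost."""
    return raw_cost * (1.0 + markup)


def savings_fraction(raw_cost: float, markup: float) -> float:
    """Share of the managed price recovered by running on raw infrastructure instead."""
    price = managed_price(raw_cost, markup)
    return (price - raw_cost) / price


raw = 1000.0   # hypothetical monthly compute + storage cost, in dollars
markup = 0.50  # assumed markup, taken from the ~50% figure above

price = managed_price(raw, markup)
print(f"managed price:           ${price:,.2f}")
print(f"markup captured monthly: ${price - raw:,.2f}")
print(f"saving vs managed price: {savings_fraction(raw, markup):.1%}")
```

Under that reading, a 50% markup over cost corresponds to a saving of about a third of the managed price, so it matters which baseline the 50% is quoted against.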