Almost exactly a year ago, GitLab spun out Meltano, an ELT (extract, load, transform) service that is now setting its sights on becoming a more fully featured “DataOps” service. At the time, GitLab, Alphabet’s GV and a number of angels invested $4.2 million in seed funding; GitLab had originally built the service to improve its own data lifecycle platform. Today, the company announced that it has raised an additional $8.2 million to extend its seed round to $12.4 million. The new round was led by Venrock, with GV, Uncorrelated Ventures, Data Tech Fund and a number of angel investors also participating.
As Meltano CEO and GitLab veteran Douwe Maan told me, the original idea behind Meltano was to build an end-to-end data platform, but the team narrowed its focus to ELT between 2020 and 2021, with the Singer open source ELT tool as its core component. Now, with this new funding as a basis and the launch of Meltano 2.0 today, the 17-person company is returning to its original vision of becoming an end-to-end platform.
Maan noted that when the team first brought Meltano to market, they realized that it was quite difficult to convince businesses to replace their entire data stack. So instead, the team decided to focus on an ELT service that could be an easy drop-in replacement for existing tools like Fivetran.
“Now we’re seeing that people who have come on board and think of Meltano as a component in their stack, are asking us to essentially expand this role to take over the management of all of these different components rather than just the Singer connectors for ETL and dbt for transformation,” Maan explained. “It was a matter of making it a more easily understandable and accessible value proposition. Now that we have reached maturity on the ELT side, we have earned the right to start expanding the value impact we want to have on people’s data work.”
Clearly, that approach is working, given that Meltano now counts the likes of Netlify, Zapier and, unsurprisingly, GitLab among its users. Still, as of now, the company is pre-revenue and remains focused on its open source offering. Like so many other open source companies, it plans to offer a hosted version of the service in the future. But as Maan also noted, while the company found product-market fit with its ELT tool, it is now going back to the drawing board to get to the same stage with its end-to-end platform, which is also why the team decided to extend the seed round.
A lot of Meltano’s DNA is clearly influenced by GitLab. Like the company it spun out of, Meltano publishes a handbook with a roadmap, for example, and is, of course, remote-only. Just like GitLab, the Meltano team also believes that it needs to build its tools in close collaboration with its users, and that doing so requires the company to be highly transparent. Meltano also recently moved its projects from GitLab to GitHub to be closer to where its community members are and to be able to engage them better. That’s going to make for an awkward dinner conversation at the next GitLab reunion, but GitHub simply is the de facto standard for open source development.
One thing that makes Meltano different, though, is that it relies heavily on third-party open source projects, while GitLab’s focus has always been on building things in-house. “GitLab is one product that has everything built first in-house and first party,” Maan said. “It has its own Git hosting and issue tracker and CI/CD. We are building a layer below that, where these existing, best-in-class third-party components come in, with Meltano adding the intermediate tissue and abstracting away a lot of the differences between these components.”
In its early days, Meltano actually tried to follow the GitLab model, but now the company wants to embrace these third-party products and integrate them into a cohesive service. Maan argues that the data stack has moved away from end-to-end tools like Informatica to a world where there are a lot of tools that are very good at one particular step in the lifecycle. The trade-off is that the cohesive developer experience those end-to-end tools offered is now gone.
In practice, that means that in addition to Singer and dbt, the company also uses Airflow for its scheduling and workflow orchestration, Great Expectations for data quality assurances and Superset for its data exploration and visualization capabilities.
“The way you build and manage a data infrastructure looks nothing like building and managing a software application,” said Ethan Batraski, a Venrock partner. “There are no isolated environments for development and testing, code reviews, unit testing or version control. Making updates to your data infrastructure is akin to making edits directly in production, and often leads to outages, data quality issues and constant firefighting. We believe Meltano is the missing layer for building, connecting and managing the various data services that make up a modern data stack, allowing data teams to build new services similarly to how software teams build applications.”