Speeding up helm dependency build

When working with Helm you may find yourself using helm dependency build.
This will resolve chart dependencies from scratch. It places the resulting packages in charts/ and generates a lockfile to boot.

I’ve noticed it can be particularly slow, so I did some sleuthing.

Test setup

I’m using Helm 3 (3.4.2) at the moment, though this behaviour has been around for some time. I’m using a fresh install of Helm with no extra repositories defined (this is important).

Let’s set up our test case. We create a barebones chart with a single dependency. For simplicity, let’s use the archived stable repository.

apiVersion: v2
name: foo
description: A Helm chart for Kubernetes
type: application
version: 0.1.0
appVersion: 1.16.0

dependencies:
- name: rabbitmq
  repository: https://charts.helm.sh/stable/
  version: "^1.0.0"

Now let’s see how helm dependency build behaves.

➜  foo git:(master) ✗ time ./clean-build
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://charts.helm.sh/stable/" chart repository
Saving 1 charts
Downloading rabbitmq from repo https://charts.helm.sh/stable/
Deleting outdated charts

real	0m26.427s
user	0m6.640s
sys	    0m1.500s

➜  foo git:(master) ✗ ll -R
total 16
-rw-r--r--  1 stew  staff   214B 22 Mar 11:40 Chart.lock
-rw-r--r--  1 stew  staff   217B 22 Mar 11:35 Chart.yaml
drwxr-xr-x  3 stew  staff    96B 22 Mar 11:40 charts

total 24
-rw-r--r--  1 stew  staff   9.4K 22 Mar 11:40 rabbitmq-1.1.9.tgz

So far so good. It took about 30 seconds (bless my poor internet), and resulted in a single chart file being downloaded.

So what just happened? Helm downloaded the index file for the stable charts repository, resolved the chart version we need and downloaded it, producing a Chart.lock file along the way.
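
To make the "resolved the chart version we need" step concrete, here is a minimal sketch of resolving a caret constraint such as ^1.0.0 against the versions listed in an index. It is not Helm's code: Helm relies on a full semver library, whereas this version only handles plain MAJOR.MINOR.PATCH strings with a major version of 1 or higher. The names parseVersion and resolveCaret and the version list are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a plain MAJOR.MINOR.PATCH triple; pre-release tags are not handled in this sketch.
type version struct{ major, minor, patch int }

// parseVersion parses a string such as "1.2.3" into a version.
func parseVersion(s string) (version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return version{}, fmt.Errorf("bad component %q in %q", p, s)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

// less reports whether a sorts before b.
func less(a, b version) bool {
	if a.major != b.major {
		return a.major < b.major
	}
	if a.minor != b.minor {
		return a.minor < b.minor
	}
	return a.patch < b.patch
}

// resolveCaret returns the highest available version that satisfies "^X.Y.Z"
// for X >= 1: same major version, and not lower than X.Y.Z.
func resolveCaret(constraint string, available []string) (string, error) {
	floor, err := parseVersion(strings.TrimPrefix(constraint, "^"))
	if err != nil {
		return "", err
	}
	best, found := "", false
	var bestV version
	for _, s := range available {
		v, err := parseVersion(s)
		if err != nil {
			continue // skip entries this sketch cannot parse
		}
		if v.major != floor.major || less(v, floor) {
			continue
		}
		if !found || less(bestV, v) {
			best, bestV, found = s, v, true
		}
	}
	if !found {
		return "", fmt.Errorf("no version satisfies %s", constraint)
	}
	return best, nil
}

func main() {
	// Hypothetical subset of the versions an index might list for one chart.
	got, err := resolveCaret("^1.0.0", []string{"0.9.1", "1.0.2", "1.1.9", "2.0.0"})
	fmt.Println(got, err) // prints "1.1.9 <nil>"
}
```

The resolved version is what ends up pinned in Chart.lock, which is why a later build can skip this step.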

Let’s add a few more dependencies…

dependencies:
- name: rabbitmq
  repository: https://charts.helm.sh/stable/
  version: "^1.0.0"
- name: minio
  repository: https://charts.helm.sh/stable/
  version: "^1.0.0"
- name: mysql
  repository: https://charts.helm.sh/stable/
  version: "^1.0.0"

➜  foo git:(master) ✗ time helm dependency build
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://charts.helm.sh/stable/" chart repository
...Successfully got an update from the "https://charts.helm.sh/stable/" chart repository
...Successfully got an update from the "https://charts.helm.sh/stable/" chart repository
Saving 3 charts
Downloading rabbitmq from repo https://charts.helm.sh/stable/
Downloading minio from repo https://charts.helm.sh/stable/
Downloading mysql from repo https://charts.helm.sh/stable/
Deleting outdated charts

real	0m40.973s
user	0m17.711s
sys	    0m2.583s

Interestingly, we see Helm make three attempts to update the same chart repository.
We also see a long delay - multiple seconds - between fetching each individual chart. A chart is just a gzipped tarball, usually a few KiB in size.

The ...Successfully got an update from the "https://charts.helm.sh/stable/" chart repository messages appear at irregular intervals, as if the indexes were downloaded in parallel.

In an ideal world, we would expect Helm to download the repository index exactly once, resolve dependencies and download the relevant chart packages. We instead see multiple attempts to fetch the same index, and long delays between fetching individual packages.

Here’s how it behaves in relation to total dependencies.

[Figure: Helm Dependency Build timings]

Build time grows linearly with the number of dependencies. However, we’d expect the lengthy index download to dominate when there are only a few dependencies.
That it doesn’t indicates something isn’t quite right.

So what’s going on?

  1. The same repository index is downloaded multiple times during the first stage
  2. There’s some undetermined delay causing resolution of individual charts to be slower than expected

Rolling our own

For fun, I wrote a hacky and limited implementation of helm dependency build. You can find it on my GitHub.

It supports limited repository locations (just http(s) and local file locations) and relies on the helm CLI to package local charts. It has loose support for v1 and v2 Charts (using requirements.yaml vs Chart.yaml dependencies respectively), and is thoroughly untested.
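
As an illustration of those two restrictions (and not the tool's actual code), the sketch below classifies a dependency's repository string by URL scheme and picks the file that lists dependencies for a given chart API version. The function names repoKind and dependencyFile are hypothetical, and treating every non-v1 apiVersion as v2 is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// repoKind classifies a dependency's repository string by its URL scheme.
// Only http(s) and file locations are accepted, mirroring the limits described above.
func repoKind(repository string) (string, error) {
	u, err := url.Parse(repository)
	if err != nil {
		return "", err
	}
	switch u.Scheme {
	case "http", "https":
		return "remote", nil
	case "file":
		return "local", nil
	default:
		return "", fmt.Errorf("unsupported repository scheme %q in %q", u.Scheme, repository)
	}
}

// dependencyFile returns the file that lists dependencies for a chart API version:
// v1 charts use requirements.yaml, v2 charts declare them in Chart.yaml.
func dependencyFile(apiVersion string) string {
	if apiVersion == "v1" {
		return "requirements.yaml"
	}
	return "Chart.yaml" // assumption: anything that is not v1 is treated as v2
}

func main() {
	for _, r := range []string{"https://charts.helm.sh/stable/", "file://../bar"} {
		kind, err := repoKind(r)
		fmt.Println(r, kind, err)
	}
	fmt.Println(dependencyFile("v1"), dependencyFile("v2"))
}
```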

➜  foo git:(master) ✗ time helm-dependency-fetch
Fetching rabbitmq @ ^1.0.0
Fetching index from https://charts.helm.sh/stable/
	Fetching chart: https://charts.helm.sh/stable/charts/rabbitmq-1.1.9.tgz
Fetching minio @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/minio-1.9.2.tgz
Fetching mysql @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/mysql-1.6.9.tgz
Fetching coredns @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/coredns-1.13.8.tgz
Fetching couchdb @ ^2.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/couchdb-2.3.0.tgz
Fetching dokuwiki @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/dokuwiki-1.0.3.tgz
Fetching drone @ ^2.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/drone-2.7.2.tgz
Fetching drupal @ ^6.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/drupal-6.2.12.tgz
Fetching elastabot @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/elastabot-1.2.1.tgz
Fetching elastalert @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/elastalert-1.5.1.tgz
Fetching elastic-stack @ ^2.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/elastic-stack-2.0.6.tgz
Fetching elasticsearch @ ^1.0.0
	Fetching chart: https://charts.helm.sh/stable/charts/elasticsearch-1.32.5.tgz

real	0m7.072s
user	0m0.883s
sys	    0m0.290s

It takes just over 7s to fetch 12 dependencies. Previously it was taking about 163s.
We’ve got it down to about 4.3% of the original time, and I suspect that speedup is valid even though this logic is greatly simplified.
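
For anyone who wants to check the arithmetic, this small sketch reproduces the comparison from the two timings quoted above (163s and 7.072s for 12 dependencies). The helper name summarise is made up for the example.

```go
package main

import "fmt"

// summarise compares two wall-clock timings (in seconds) for the same number of dependencies.
func summarise(before, after float64, deps int) (pct, speedup, perDepBefore, perDepAfter float64) {
	pct = after / before * 100            // new time as a percentage of the old time
	speedup = before / after              // how many times faster the new run is
	perDepBefore = before / float64(deps) // average seconds per dependency, before
	perDepAfter = after / float64(deps)   // average seconds per dependency, after
	return
}

func main() {
	pct, speedup, pb, pa := summarise(163, 7.072, 12)
	fmt.Printf("%.1f%% of the original time (%.0fx faster)\n", pct, speedup)
	fmt.Printf("per dependency: %.1fs before, %.2fs after\n", pb, pa)
}
```

That works out to roughly a 23x speedup, or about 13.6s per dependency before against about 0.6s after.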

Diving into Helm

Rolling our own tool is fun, but now it’s time to investigate Helm itself.

helm dependency build hands off directly to the downloader package, specifically the manager, which in turn calls chart_downloader’s methods.

We observe the following abridged call structure.

cmd/dependency_build:newDependencyBuildCmd ->
  manager:Build ->
    manager:Update ->
      manager:UpdateRepositories ->             # Updates 'unmanaged' repositories in parallel
        chartrepo:DownloadIndexFile             # downloads the index file
    manager:downloadAll ->                      # Downloads all charts found as dependencies
      chart_downloader:DownloadTo ->
        chartrepo:FindChartUrl ->
          chartrepo:FindChartInRepoURL ->
            chartrepo:DownloadIndexFile         # Downloads the index file (again!)
        chart_downloader:ResolveChartVersion ->
          chart_downloader:scanReposForURL      # Finds the chart, iterates over all repos

It’s clear that UpdateRepositories does not perform any de-duplication on unmanaged repos. This explains our excessive initialisation.
Those same repos are then re-fetched in FindChartInRepoURL. This does not happen with managed repositories, indicating the local cache is not being searched in this case.
Finally we still see a delay when fetching charts. This is the result of scanReposForURL which inefficiently searches all repository indexes for the given chart version.

func (c *ChartDownloader) scanReposForURL(u string, rf *repo.File) (*repo.Entry, error) {
	// FIXME: This is far from optimal. Larger installations and index files will
	// incur a performance hit for this type of scanning.
	for _, rc := range rf.Repositories {
		r, err := repo.NewChartRepository(rc, c.Getters)
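
The FIXME above hints at the alternative: walk the indexes once, record which repository owns each chart URL, and answer later lookups from a map. The sketch below shows that idea with simplified stand-in types. repoIndex and buildURLIndex are hypothetical and do not exist in Helm, and URLs are matched as exact strings, so a real implementation would likely need to normalise them first.

```go
package main

import "fmt"

// repoIndex is a simplified stand-in for a loaded repository index:
// chart name -> download URLs of every published version.
type repoIndex struct {
	Name    string
	Entries map[string][]string
}

// buildURLIndex walks every repository once and records which repository
// owns each chart URL, so later lookups are a single map access.
func buildURLIndex(repos []repoIndex) map[string]string {
	owners := make(map[string]string)
	for _, r := range repos {
		for _, urls := range r.Entries {
			for _, u := range urls {
				// The first repository listing a URL wins, as it would in a linear scan.
				if _, seen := owners[u]; !seen {
					owners[u] = r.Name
				}
			}
		}
	}
	return owners
}

func main() {
	// Hypothetical index contents, using URLs from the output shown earlier.
	repos := []repoIndex{
		{Name: "stable", Entries: map[string][]string{
			"mysql": {"https://charts.helm.sh/stable/charts/mysql-1.6.9.tgz"},
			"minio": {"https://charts.helm.sh/stable/charts/minio-1.9.2.tgz"},
		}},
	}
	owners := buildURLIndex(repos)
	fmt.Println(owners["https://charts.helm.sh/stable/charts/mysql-1.6.9.tgz"]) // prints "stable"
}
```

Building the map costs one pass over every index, but that cost is paid once per build rather than once per dependency.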

In closing

All major Helm versions have a severe performance issue with resolving dependencies from unmanaged repositories.

There are 3 issues:

  1. Unmanaged Helm repository indexes are not de-duplicated before download
  2. Unmanaged Helm repository indexes are fetched, and fetched again for each dependency
  3. Chart resolution unnecessarily loads irrelevant repository indexes, which negatively affects charts with many dependencies, or dependencies from diverse sources

We know we can avoid most of these problems by simply managing all repositories explicitly. However, the underlying issues remain valid.

Next up is to propose some fixes. De-duplication seems a quick win, but the others need more investigation.
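
As a rough idea of what the de-duplication fix might look like (a sketch, not a patch against Helm), the function below collapses a list of dependency repository URLs to the distinct ones before any index is fetched. The name dedupeRepos is hypothetical, and treating URLs that differ only by a trailing slash as the same repository is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// dedupeRepos returns the distinct repository URLs in first-seen order.
// URLs that differ only by trailing slashes are treated as the same repository (an assumption).
func dedupeRepos(urls []string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, u := range urls {
		key := strings.TrimRight(u, "/")
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, u)
	}
	return out
}

func main() {
	// The three dependencies from the test chart all point at the same repository.
	deps := []string{
		"https://charts.helm.sh/stable/",
		"https://charts.helm.sh/stable/",
		"https://charts.helm.sh/stable/",
	}
	unique := dedupeRepos(deps)
	fmt.Println(len(unique), unique) // prints "1 [https://charts.helm.sh/stable/]"
}
```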