Working with docker bind mounts and node_modules

In the age of containerization, developing Node.js apps within a docker container helps tremendously to simplify setup and to ship the built app with all of its dependencies. It also helps to keep your system clean of different Node.js versions for different projects, so you don’t need to deal with solutions like nvm, the Node Version Manager.

If you’re just getting started with developing Node.js apps within docker you might want to check out good docker defaults.

We start off with a simple Dockerfile that is based on a node image, copies in the source code of our app and installs its dependencies using npm:

Dockerfile:

FROM node:lts-slim

WORKDIR /usr/local/app

COPY package*.json ./
RUN npm install && npm cache clean --force

COPY ./src ./src

The beauty of bind mounts during development

During development, we want our application’s source code to update within the container whenever we change something. Docker achieves this using bind mounts, which allow our source code to be accessed and modified by both the running container and the host system. Simply mount your entire source code folder into a docker container and it will pick up the changes (bi-directionally!).

docker-compose.yml:

services:
    node:
        command: node index.js
        volumes:
            - ./:/usr/local/app/

To make sure the node application restarts on code changes, a tool like the popular nodemon can be used. Now you can go ahead and edit your files either on the host system or directly within the container, and the application will always restart and reflect the changed source code.

docker-compose.yml:

services:
    node:
-        command: node index.js
+        command: nodemon index.js
        volumes:
            - ./:/usr/local/app/
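Note that nodemon has to be available inside the container for this command to work, e.g. declared as a dev dependency. A minimal package.json fragment could look like this (the version pin is an assumption):

```json
{
    "devDependencies": {
        "nodemon": "^3.1.0"
    }
}
```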

All is good!

... Or is it?

The issue with bind-mounting node_modules

Bind-mounting the source code is amazing, but once we start dual developing, i.e. running our Node.js app both within the container and on the host system (maybe even just to run the tests), things get messy.

If you’re not developing a node app but another app with 3rd party dependencies (or e.g. a vendor folder), the following reasoning applies as well.

Why is that? When running the application on the host system, the host system also needs access to all of the app’s dependencies, i.e. the node_modules. And once you install them on the host system using npm install, they will be mounted into the container as well, since they live within the project’s node_modules folder.

This has several implications:

  1. it can severely slow down the system 🚜, especially on systems like macOS that do not run docker natively, worsened by the fact that npm dependencies tend to consist of a gazillion files. 100% CPU usage is just one of the fun results. Luckily, docker provides some ideas for a workaround, but this still does not solve the second issue
  2. it overwrites node modules that have been installed within the docker container but not on the host machine. This can happen, e.g., because someone else made a change to the app that introduced a new dependency, which is then automatically installed when building the docker container, but isn’t necessarily installed on the host machine (unless you remembered to run both docker build and npm install). It’s really tough debugging why your module isn’t there when it is so clearly stated in your package.json. Even if you work around this by always running both docker build to build the container and npm install to install the dependencies on the host (you’re dual developing, after all 👯‍♂️), problems 1 and 3 remain.
  3. breaking platform-specific dependencies 💥 and therefore the entire app. Some node modules depend on native packages that have to be built for the platform they are supposed to run on. A popular example would be node-sass which ships with pre-built binaries for some platforms and builds these binaries automatically on npm install if they are not present.

The problem with the native dependencies becomes apparent if you think of the case of developing on a macOS or Windows machine with docker: We install node-sass by running npm install on our host machine (macOS or Windows) and then mount the entire source folder including node_modules into our container. The result is the following error when running the app within the container:

Node Sass could not find a binding for your current environment: Linux 64-bit with [...]

A quick Google search turns up the tip to run npm rebuild node-sass to rebuild the sass binaries within the container. The problem now is that due to the bidirectional mount ↔️, the app’s code will no longer be executable on the host system, as macOS / Windows won’t be able to use the Linux binaries.
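To illustrate why these binaries are platform-bound, here is a hypothetical sketch (not node-sass’s actual code, the file name scheme is an assumption): native addons are typically named after the platform and CPU architecture that npm install ran on, so a binary built on the host cannot be loaded inside the container.

```javascript
// Hypothetical sketch of how a native module might name its prebuilt
// binding: the file name encodes the platform and CPU architecture.
const binding = `binding-${process.platform}-${process.arch}.node`;

// On a macOS host this yields e.g. "binding-darwin-arm64.node", while
// the Linux container looks for "binding-linux-x64.node"; hence the
// "could not find a binding for your current environment" error above.
console.log(binding);
```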

Preventing node_modules from being overwritten by the bind mount

So how do we prevent the node_modules from our host system from overwriting the node_modules within the container?

Prevent host’s node_modules from being mounted

The answer to the question “How to exclude files and folders from a bind mount” is first of all: “You can’t”, and a respective feature request hasn’t gained traction yet. After some more research, a lot of sources at least propose a workaround (see 1, 2, 3) that involves bind-mounting the application’s source code as before, then mounting (not bind-mounting!) another (anonymous or named) volume in the location where the node_modules reside within the container. For simplicity’s sake, let’s call this volume the exclude volume.

docker-compose.yml:

services:
    node:
        command: nodemon index.js
        volumes:
            - ./:/usr/local/app/
            # the volume below prevents our host system's node_modules from being mounted
            - exclude:/usr/local/app/node_modules/

volumes:
    exclude:

Due to the way docker handles volumes, this leads to the following behaviour:

  1. When the container starts and the exclude volume is still empty, docker will first copy the node_modules that exist within the docker image into the volume (see the docker volume documentation)
  2. Next, docker mounts the volumes so that the more specific path wins, i.e. first the source code including the host’s node_modules, then the exclude volume on top, effectively leading to what we wanted to achieve:

We now have bind-mounted our local source code into the container, while keeping the original node_modules.

Seemingly, this approach solves all of our problems, but thinking ahead, it actually worsens them: we no longer have two different node_modules folders, but suddenly end up with three locations in which different states of the node libraries can exist:

  1. on the host
  2. within the container / image
  3. within the exclude volume

What started off as a tiny improvement and workaround has now made our life even harder with one more point of failure. Whenever the application’s dependencies change and we want to run the application, we will need to rebuild the container image, install dependencies on the host and then install dependencies again in the running container.

Also, if you read the above carefully, you’ll have noticed that we just shifted the overwriting part away from the host system into the volume, meaning that the volume will now always overwrite the node_modules in the container. This is okay on the first run, as the volume is still empty and will be prepopulated with the container image’s node_modules, but it will inevitably lead to problems once the container image’s dependencies have changed.

One mitigation would be to always remove the exclude volume prior to starting the containers, but this would mean changing the way you (and your team) develop and introduce one more thing to remember. It could also be mitigated using docker’s tmpfs volume driver, which creates temporary volumes that only hold on to their files while the container is running, but sadly it’s only supported on Linux.
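For completeness, such a tmpfs-backed volume could be declared like this (a sketch, Linux-only; the local driver with tmpfs options yields a volume that is empty again on every fresh mount, so docker re-populates it from the image each time):

```yaml
volumes:
    exclude:
        driver: local
        driver_opts:
            type: tmpfs
            device: tmpfs
```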

Improving the workaround

Even though the workaround above isn’t perfect yet, it definitely solves the basic issue of preventing the host’s node_modules from being mounted. Now how can we prevent the workaround from leaving us with three locations with different dependency states?

What if we could ensure the container keeps its node_modules in an entirely different location than the host does? Combined with the exclude volume, this would mean that the exclude volume always stays empty, as the container image would not keep its node_modules in the location where the exclude volume is mounted. We could then dual develop both on the host and in the container, as both systems would look for the node_modules in different locations without interfering with each other.

But how can we make our Node.js app look for its dependencies in different paths within the container and on the host, respectively?

Leverage Node.js’s module resolver

Digging into Node.js’s module resolution algorithm provides us with a way forward: Node.js will recursively look for node_modules in the current folder and all of its parent folders. This means we can go ahead and install the dependencies within the container one level above the actual app, as once we run the app within the container, it will also look for dependencies one level above.

Note ⚠️: Although Node.js is smart at resolving the modules, you will need to ensure that any binaries that are part of the node_modules/.bin folder will be available on the $PATH.

Dockerfile:

FROM node:lts-slim

-WORKDIR /usr/local/app
+WORKDIR /usr/local

COPY package*.json ./
RUN npm install && npm cache clean --force
+ENV PATH=/usr/local/node_modules/.bin:$PATH

+WORKDIR /usr/local/app

COPY src ./src

This will result in the following file structure within the container image:

| - /usr/local
    | - package.json
    | - package-lock.json
    | - node_modules
    | - app
        | - src
            | - index.js
            | - ...

… and the following on the host system:

| - my-node-app
    | - Dockerfile
    | - package.json
    | - package-lock.json
    | - node_modules
    | - src
        | - index.js
        | - ...

Now, the container’s node_modules won’t clash with the host’s node_modules and all is good. This approach is also what node-docker-good-defaults proposes.

The only real downside is that adding a new dependency to the container / project just got more complex. You can no longer install dependencies in the /usr/local/app folder but instead have to go one folder up to install the dependency. Or you go ahead and install the dependency on the host system and then rebuild the container.

One alternative approach to ensuring that the node_modules path is different between the container image and the host system would be to use yarn, an alternative package manager that offers a --modules-folder option out of the box, which allows you to specify a different path to install the node_modules in.

Also, keep in mind that we still require the exclude volume to be mounted (and to be empty at all times), as Node.js’s module resolver would prioritize any module within the exclude volume over the parent directory. That means if someone were to (accidentally) run npm install within the running container at the /usr/local/app folder level, we would again pollute the exclude folder and reintroduce the third point of failure. This is why it might make sense to always clear out the contents of the exclude volume on every container start to ensure a clean state.

docker-compose.yml:

services:
    node:
+        # clear out the exclude volume's contents on every container start
-        command: nodemon index.js
+        command: bash -c "rm -rf /usr/local/app/node_modules/* && nodemon index.js"
        volumes:
            - ./:/usr/local/app/
            # the volume below prevents our host system's node_modules from being mounted
            - exclude:/usr/local/app/node_modules/

volumes:
    exclude:
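As a side note, the rm -rf …/* pattern clears the folder’s contents while keeping the folder itself intact, which matters here because the folder doubles as the volume’s mount point. A quick illustration outside of docker (paths are arbitrary):

```shell
# `rm -rf <dir>/*` empties a directory but keeps the directory itself.
# Caveat: the * glob does not match hidden entries such as .bin.
mkdir -p /tmp/exclude-demo/node_modules
touch /tmp/exclude-demo/node_modules/stale-module.js

rm -rf /tmp/exclude-demo/node_modules/*

ls -A /tmp/exclude-demo/node_modules            # prints nothing, folder is empty
test -d /tmp/exclude-demo/node_modules && echo "mount point still exists"
```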

In essence, we need three steps:

⬆️ when building the container image, install the node_modules into a folder higher up in the hierarchy

🚫 prevent locally installed node_modules (from the host system) from being mounted into the container (and overwriting the container’s node_modules) by mounting a volume over the app’s node_modules path

🧹 make sure the volume stays clean by deleting its contents on every start (keeping the local node_modules but deleting those within the container)

Alternative approaches

If the above solution seems like too much overhead for your setup, you may want to opt for one of the alternative approaches below.

Exclude node_modules from the mount

One very simple and straightforward approach would be to adapt your project structure / bind mounts in a way that the node_modules folder is simply not mounted into the container. Instead of attempting to mount the containing folder and then exclude the node_modules directory, we explicitly state all the files and directories we actually want to mount, leaving out the node_modules folder.

docker-compose.yml:

services:
    node:
        command: nodemon index.js
        volumes:
-            - ./:/usr/local/app/
+            - ./src/:/usr/local/app/src/
+            - ./package.json:/usr/local/app/package.json
+            - ./package-lock.json:/usr/local/app/package-lock.json
+            - ...

In our example above, we’d need to explicitly mount all the required files instead of the entire containing folder, which is simple enough for the example project, but can lead to a quite bloated volume list if the project structure is not optimized for it.

Additionally, by bind-mounting e.g. the package.json as a single file, you can run into issues when attempting to install / change an npm package from within the container due to the way that npm deals with the package.json file (see 1 and 2). You will likely see an error of the form:

EBUSY: resource busy or locked, rename '/home/node/package.json.2756152664' -> '/home/node/package.json'

Again, this only appears to be an issue on macOS and potentially Windows, though, and a workaround (untested) exists.

Switch to Linux

Technically speaking, most docker containers are Linux-based (e.g. Debian or Ubuntu; I couldn’t find a source, since docker does not publish these statistics, but you can get an idea with some research of your own).

Anyways, docker runs natively on Linux, which instantly kills the performance issues mentioned above. And since most docker images are Linux-based as well, chances are that you won’t run into the cross-platform problems of native dependencies as described above, so effectively the node_modules on the host should work seamlessly in the container and vice versa.

However, switching operating systems is obviously quite a change considering such a small issue, and it’s by far no guarantee that you won’t run into any problems: your host setup will most likely still differ from the container’s setup, so you’re likely to run into problems again. And of course, forcing all collaborators to use a similar setup may not be desirable.

Stick to one environment

Probably the easiest solution to prevent the issue altogether is to stick to a consistent development environment. The playbook here is: Don’t dual develop. Run your application either in the container or on the host system, but never on both ⛔️.

This is probably a reasonable approach and surely works if you’re the sole developer on the project. If you’re collaborating on the project, it may work if enforced as a convention. But even if it can be enforced, there are still reasons why you might want to support dual development, e.g. running tests, linters, etc. on the host system during development.
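If you still want host-side entry points while keeping execution in the container, one option is to route common tasks through docker compose in the project’s npm scripts. A sketch (the service name node and the inner test command are assumptions):

```json
{
    "scripts": {
        "dev": "docker compose up",
        "test": "docker compose exec node npm run test:unit"
    }
}
```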

Summary

I tried to cover the whole topic of mounting source code containing 3rd party dependencies (like node_modules, vendor files, …) into a container, with all of its issues, and provided a couple of solution approaches. Choose whichever one works best for your setup, but be advised that none of them is a silver bullet.