JS monorepos in prod 1: project initialization

Every project journey begins with the step of initialization. When your overall project is composed of multiple projects, it is tempting to create one Git repository per project. In Node.js, a project translates to a package. However, managing too many closely related repositories is confusing and time-consuming.

Placing multiple projects inside a single Git repository and using a tool like Lerna to facilitate their management is worth the effort. This architecture is called a monorepo. It simplifies the versioning and publishing of the components as well as their manipulation and development.

At Adaltas, we have been developing and maintaining several monorepos for a couple of years. This article is the first in a series of seven in which we share our best practices. It covers project initialization using Yarn and Lerna.

Starting a new project

The idea for an example project comes from our past work. Over the years, we have accumulated several Gatsby plugins that have never been published and shared with the open-source community. Those plugins are copied and pasted from one Gatsby website to another, sometimes with bug fixes and enhancements. Since the copies are not kept in sync with each other, older websites don’t benefit from those changes. The idea is to centralize the development of those plugins inside a single repository and share them by publishing them on NPM.

A new project is started from scratch. It is called remark-gatsby-plugins and is hosted on GitHub. This repository is a container for multiple packages that are plugins for Gatsby and for the gatsby-transformer-remark plugin.

# Repository initialization
mkdir remark-gatsby-plugins
cd remark-gatsby-plugins
git init
# Create and commit a new file
echo "# remark and Gatsby plugins by Adaltas" > README.md
git add README.md
git commit -m "docs: project creating"
# Define the GitHub remote server
git remote add origin https://github.com/adaltas/remark-gatsby-plugins.git
# Push commits to remote
git push -u origin master
# Next push commands will simply be `git push`

The commit message is prefixed by docs, and this is not by chance. This aspect is covered by the Conventional Commits chapter in a later article of this series, on commit enforcement and changelog generation.
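To make the prefix convention concrete, here is a minimal sketch of a check for the Conventional Commits header shape, `type(scope)!: description`. The helper name `parseCommitHeader` and the list of accepted types are my assumptions for illustration; they do not come from a tool used in this article.

```javascript
// Hypothetical helper, not part of any tool used in this article.
// The list of accepted types is an assumption chosen for illustration.
const TYPES = ['build', 'chore', 'ci', 'docs', 'feat', 'fix', 'perf', 'refactor', 'revert', 'style', 'test'];

// Header shape: <type>[(scope)][!]: <description>
const HEADER = /^(\w+)(?:\(([^)]+)\))?(!)?: (.+)$/;

function parseCommitHeader(message) {
  // Only the first line of a commit message is the header.
  const [header] = message.split('\n');
  const match = HEADER.exec(header);
  if (!match || !TYPES.includes(match[1])) return null;
  const [, type, scope = null, bang, description] = match;
  return { type, scope, breaking: bang === '!', description };
}

console.log(parseCommitHeader('docs: project creating'));
console.log(parseCommitHeader('project creating'));
```

The first call returns a parsed object and the second returns null because the message has no type prefix.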

Ignoring files from Git

There are two strategies to choose from:

  • Selectively list the paths to ignore.
  • Define global ignore rules and selectively exclude paths from those rules.

I usually choose the latter strategy and ignore all hidden files by default. I start with:

cat <<CONTENT > .gitignore
.*
node_modules
!.gitignore
CONTENT
git add .gitignore
git commit -m 'build: ignore hidden files and node modules'
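The order of the rules matters: the last matching pattern decides, which is why the exception for .gitignore comes after the rule that hides every dot file. The sketch below models this under strong simplifying assumptions: it considers top-level entry names only and `*` is the only wildcard. Real Git matching is richer, so treat it as an illustration and not as a reimplementation.

```javascript
// Simplified model of .gitignore evaluation (assumption: top-level entries
// only, `*` is the only wildcard). Real Git rules are richer than this.
const rules = ['.*', 'node_modules', '!.gitignore'];

// Turn a pattern such as `.*` into an anchored regular expression.
function toRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// The last matching rule wins; a leading `!` re-includes the entry.
function isIgnored(name) {
  let ignored = false;
  for (const rule of rules) {
    const negated = rule.startsWith('!');
    const pattern = negated ? rule.slice(1) : rule;
    if (toRegExp(pattern).test(name)) ignored = !negated;
  }
  return ignored;
}

for (const name of ['.env', '.gitignore', 'node_modules', 'README.md']) {
  console.log(name, isIgnored(name) ? 'ignored' : 'kept');
}
```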

Project initialization

I am personally using Yarn instead of NPM. Both package managers are perfectly fine, but I had issues in the past using NPM with monorepos and links. In this setup, Yarn also seems to be the tool of choice across the community. Its native support for monorepos, called workspaces, works well with Lerna.

To initialize a package with yarn:

yarn init
yarn init v1.22.5
question name (remark-gatsby-plugins): 
question version (1.0.0): 0.0.0
question description: A selection of remark and Gatsby plugins developed and used by Adaltas
question entry point (index.js): 
question repository url (https://github.com/adaltas/remark-gatsby-plugins.git): 
question author (David Worms <david@adaltas.com>): 
question license (MIT): 
question private: 
git add package.json
git commit -m "build: package initialization"

These commands create a package.json file and commit it.

Monorepo with Lerna

The project contains a package.json file. Following the Node.js terminology, the project is now a Node.js package. However, it will not be published on NPM, the official Node.js package registry. Only the packages inside this package will be published.

Instead of creating a Git repository for each package, it is easier to maintain a single repository storing multiple Node.js packages. Since multiple packages are managed inside the same repository, we call this a monorepo.

Multiple tools exist to manage monorepos. Lerna is a popular choice but not the only one. At Adaltas, we have been using it for some time and we continue to do so in this article.

Besides having just one Git repository to manage, additional advantages justify the use of monorepos:

  • When multiple packages are developed, many duplicated dependencies are declared inside their package.json files. Declaring the dependencies inside the top-most project managed with Lerna saves disk space and installation time. This is called “hoisting” dependencies.
  • When packages depend on each other, changes in one package often need to be instantly reflected in the other packages. A single feature may span multiple packages. Publishing every change of a dependency so that its dependents can use it is not practical: it takes too much time and many changes do not justify a release. The solution is to link the dependencies by creating symbolic links. For large projects, this is a tedious task. A tool like Lerna automates the creation of those links.
  • Having one central location federates the execution of your commands. For example, you install all the dependencies of all your packages with a single command, yarn install. For testing, the command lerna run test runs the test script of every package that defines one.
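To give a feel for what hoisting saves, the sketch below lists the dependencies that several packages declare with the same version range. Those are the ones an install at the root can place once and not once per package. The manifests, package names and versions are placeholders that I made up, and `sharedDependencies` is a hypothetical helper, not a Lerna or Yarn function.

```javascript
// Hypothetical manifests: the package names, dependency names and versions
// are placeholders, not real packages from this project.
const manifests = [
  { name: 'pkg-a', dependencies: { 'dep-x': '^1.0.0', 'dep-y': '^2.0.0' } },
  { name: 'pkg-b', dependencies: { 'dep-x': '^1.0.0', 'dep-z': '^3.0.0' } },
  { name: 'pkg-c', dependencies: { 'dep-x': '^1.0.0', 'dep-y': '^2.0.0' } },
];

// Group the declaring packages by "dependency@range".
function sharedDependencies(packages) {
  const users = new Map();
  for (const pkg of packages) {
    for (const [dep, range] of Object.entries(pkg.dependencies || {})) {
      const key = `${dep}@${range}`;
      users.set(key, [...(users.get(key) || []), pkg.name]);
    }
  }
  // Keep only the entries declared by more than one package.
  return new Map([...users].filter(([, names]) => names.length > 1));
}

console.log(sharedDependencies(manifests));
```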

Additionally, Lerna helps us to manage our versions with respect to the Semantic Versioning (SemVer) specification.

The command to initialize Lerna is:

yarn add lerna
yarn lerna init --independent

The --independent flag tells Lerna to manage the version of each package independently. Without it, Lerna aligns the versions of the packages it manages.

These commands add the lerna dependency to the package.json file and create a new lerna.json file:

{
  "packages": [
    "packages/*"
  ],
  "version": "independent"
}

Then, we commit our pending changes:

git add lerna.json package.json
git commit -m 'build: lerna initialization'

Publishing or ignoring lock files

The yarn add command has generated a yarn.lock file. With NPM, the file would have been package-lock.json.

My approach is to publish lock files for my final applications. I don’t publish the lock files for the packages which are meant to be used as dependencies. Some people agree with my opinion. However, the Yarn documentation states the contrary:

All yarn.lock files should be checked into source control (e.g. git or mercurial). This allows Yarn to install the same exact dependency tree across all machines, whether it be your coworker’s laptop or a CI server.

Framework and library authors should also check yarn.lock into source control. Don’t worry about publishing the yarn.lock file as it won’t have any effect on users of the library.

I am perplexed. If it is not used, then why commit such a large file? Anyway, let’s ignore them for now. The end result is that those lock files are ignored by Git:

echo 'package-lock.json' >> .gitignore
echo 'yarn.lock' >> .gitignore
git add .gitignore
git commit -m "build: ignore lock files"

Yarn integration

Since we are using Yarn instead of NPM, add these properties to lerna.json:

{
  "npmClient": "yarn",
  "useWorkspaces": true
}

The useWorkspaces property tells Lerna not to use lerna.json#packages but instead to look at package.json#workspaces. According to the Lerna Bootstrap documentation, both are similar except that Yarn doesn’t support recursive globs (**).
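The lookup rule can be sketched as follows. This is not Lerna's actual implementation: `packageGlobs` is a hypothetical helper that only mirrors the behavior described above, taking the two parsed JSON files as arguments.

```javascript
// Sketch of the lookup rule, not Lerna's actual implementation.
// `packageGlobs` is a hypothetical helper name.
function packageGlobs(lernaConfig, rootManifest) {
  if (lernaConfig.useWorkspaces) {
    const { workspaces } = rootManifest;
    // Yarn accepts either an array or an object with a `packages` array.
    const globs = Array.isArray(workspaces) ? workspaces : workspaces && workspaces.packages;
    if (!globs) throw new Error('useWorkspaces is set but package.json has no workspaces');
    return globs;
  }
  // Without useWorkspaces, the globs come from lerna.json itself.
  return lernaConfig.packages || [];
}

const lernaConfig = { npmClient: 'yarn', useWorkspaces: true, version: 'independent' };
const rootManifest = { private: true, workspaces: ['packages/*'] };
console.log(packageGlobs(lernaConfig, rootManifest));
```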

Update lerna.json to remove the packages property. It now contains only:

{
  "npmClient": "yarn",
  "useWorkspaces": true,
  "version": "independent"
}

Update the package.json file to contain:

{
  "private": true,
  "workspaces": [
    "packages/*"
  ]
}

The private property is required. Any attempt to register a new dependency without it raises an error from Yarn in the form of “Workspaces can only be enabled in private projects”. Note that it was possible to define the project as private when we initialized it with yarn init. Now that our project is a monorepo, it is a good time to mark the root package as private since it will not be published on NPM. Only the packages inside it are meant to be published.
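A small pre-flight check can catch the missing property before Yarn complains. This is a sketch only: `checkRootManifest` and its message are hypothetical, and it takes the parsed content of the root package.json as input.

```javascript
// Hypothetical pre-flight check; the function name and message are my own.
function checkRootManifest(manifest) {
  const problems = [];
  const hasWorkspaces = manifest.workspaces !== undefined;
  // Yarn refuses to enable workspaces in a package that is not private.
  if (hasWorkspaces && manifest.private !== true) {
    problems.push('"workspaces" is declared but "private" is not true');
  }
  return problems;
}

console.log(checkRootManifest({ workspaces: ['packages/*'] }));
console.log(checkRootManifest({ private: true, workspaces: ['packages/*'] }));
```

In a real project, the argument would be the parsed root manifest, for example the result of require('./package.json') when the script runs from the repository root.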

Note that executing lerna init now syncs package.json#workspaces back into lerna.json#packages with the new values.

Now, save the changes:

git commit -a -m 'build: activate yarn usage'

If you are not familiar with Git, the -a flag stages all modified and deleted files that Git already tracks before committing. New, untracked files are left out.

Package location

By default, Lerna manages packages inside the “packages” folder. The majority of projects using Lerna follow this convention, and it is a good idea to respect it. But in our case, we have two types of plugins:

  • The Gatsby plugins
  • The Gatsby Remark plugins which extend the gatsby-transformer-remark plugin

Thus, I modify the workspaces array in the package.json file to be:

{
  "workspaces": [
    "gatsby/*",
    "gatsby-remark/*"
  ]
}

The packages’ location is saved:

git commit -a -m 'build: workspaces declaration'

Packages creation

Let’s import two packages for the sake of testing. They are currently located inside my /tmp folder:

ls -l /tmp/gatsby-caddy-redirects-conf
total 16
-rw-r--r--@ 1 david  staff   981B Nov 26 21:20 gatsby-node.js
-rw-r--r--@ 1 david  staff   239B Nov 26 21:19 package.json
ls -l /tmp/gatsby-remark-title-to-frontmatter
total 16
-rw-r--r--  1 david  staff   1.2K Nov 26 11:35 index.js
-rw-r--r--@ 1 david  staff   309B Nov 26 21:14 package.json

To import the packages and commit:

mkdir gatsby gatsby-remark
# Import first plugin
mv /tmp/gatsby-caddy-redirects-conf gatsby/caddy-redirects-conf
git add gatsby/caddy-redirects-conf
# Import second plugin
mv /tmp/gatsby-remark-title-to-frontmatter gatsby-remark/title-to-frontmatter
git add gatsby-remark/title-to-frontmatter
# Commit the changes
git commit -m 'build: import project'
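To verify that the imported folders are picked up by the workspace patterns, the sketch below expands patterns of the form `<dir>/*` into the folders that contain a package.json file. It assumes that no other glob syntax is used, and `listWorkspaces` is a hypothetical helper, not a Yarn or Lerna API.

```javascript
const fs = require('fs');
const path = require('path');

// Expand workspace patterns of the form `<dir>/*` (assumption: no other glob
// syntax) into the folders that contain a package.json file.
function listWorkspaces(root, patterns) {
  const found = [];
  for (const pattern of patterns) {
    if (!pattern.endsWith('/*')) throw new Error(`Unsupported pattern: ${pattern}`);
    const base = pattern.slice(0, -2);
    const parent = path.join(root, base);
    if (!fs.existsSync(parent)) continue; // folder not created yet
    for (const entry of fs.readdirSync(parent, { withFileTypes: true })) {
      const manifest = path.join(parent, entry.name, 'package.json');
      if (entry.isDirectory() && fs.existsSync(manifest)) {
        found.push(path.posix.join(base, entry.name));
      }
    }
  }
  return found.sort();
}

// Run from the repository root with the patterns declared in package.json.
console.log(listWorkspaces(process.cwd(), ['gatsby/*', 'gatsby-remark/*']));
```

Folders without a package.json file are skipped, so an empty or unrelated directory under gatsby/ is not reported as a package.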

Cheat sheet

Package initialization:

yarn init

Monorepo initialization:

yarn add lerna
yarn lerna init
# or
yarn lerna init --independent
# then
git add lerna.json package.json
git commit -m 'build: lerna initialization'

Ignore lock file (optional):

echo 'package-lock.json' >> .gitignore
echo 'yarn.lock' >> .gitignore
git add .gitignore
git commit -m "build: ignore lock files"

Yarn integration (unless using NPM): remove the packages property from lerna.json and add:

{
  "npmClient": "yarn",
  "useWorkspaces": true
}

Update the package.json file to contain:

{
  "private": true,
  "workspaces": [
    "packages/*"
  ]
}

Next

The following article covers the versioning and publishing strategies of packages with Lerna.
