Production deployment
The Getting started page is designed for quick local testing with minimal configuration. For a production deployment intended for external users, you should customize the settings to fit your specific needs.
To help you with this,
the Neurobagel recipes repository
includes dedicated production deployment recipes that should
cover the most common use cases.
Start with a fresh recipe for each deployment
Make a fresh clone of the
recipes repository for each
deployment you want to launch.
For example, hosting multiple Neurobagel nodes,
or a single node alongside a containerized proxy server,
means creating a separate clone of the recipes repo for each node or proxy server.
This will make it easier to update and maintain each deployment.
Deployable services
All Neurobagel services are containerized. The production deployment recipes use Docker Compose profiles to group related services and allow them to be launched together.
Neurobagel services that are included in the Docker Compose deployment recipes:
- Neurobagel node API/n-API (api): The API that communicates with a single graph store and determines how detailed the response to a query should be from that graph.
- Graph store (graph): A third-party RDF store that stores Neurobagel-harmonized data to be queried. At the moment our recipe uses the free tier of GraphDB for this.
- Neurobagel federation/f-API (federation): A special API that can federate over one or more Neurobagel nodes to provide a single point of access to multiple distributed databases. By default it will federate over all public nodes and any local nodes you specify.
- Neurobagel query tool (query_federation): A web app that provides a graphical interface for users to query a federation API and view the results from one or more nodes. Because the query tool is a static app and is run locally in the user's browser, this service simply hosts the app.
Two additional, third-party services are part of production deployment recipes:
- NGINX reverse proxy (nginx-proxy): A containerized version of the popular proxy server that lets you serve your Neurobagel services under custom URLs, handling routes automatically.
- Automatic SSL certificate service (acme-companion): A containerized companion tool for nginx that automatically provisions SSL certificates for the routes created by nginx so users can communicate with your services via encrypted HTTPS connections.
Deployment profiles
Neurobagel offers different deployment profiles that allow you to launch specific combinations of services (listed below), depending on your use case.
- node: Deploys an individual Neurobagel node. A Neurobagel node includes an internal graph database and a node API that handles all incoming queries and talks to the graph database. You can run several nodes on the same machine. Services: api, graph
- portal: Deploys the federation engine and a connected web query interface. Use this profile only if you need to host your own federated query tool, e.g. to federate over nodes that are not in the list of public Neurobagel nodes. Services: federation, query_federation
- proxy: Deploys pre-configured, containerized reverse-proxy services that will automatically set up routes to your Neurobagel services under your desired URLs.
You can also use an existing proxy server
Our deployment instructions assume that there is no existing proxy server set up on the machine that will host your Neurobagel services. If you already have a proxy server set up, follow the slightly modified steps described in deploying with an existing proxy server.
In this case, you can ignore the proxy deployment recipe.
Setting up a production deployment
Requirements
In addition to the Docker and Docker Compose requirements outlined on the Getting started page, the Neurobagel configuration wizard for a production deployment requires Python 3.10 or later on the host machine.
Install the Neurobagel configuration wizard
configure-nb is a tool that simplifies Neurobagel deployment configuration, and can be installed from PyPI using pip.
Install the configure-nb package:
pip install configure-nb
Common setup for all deployment profiles
Do these steps first for each deployment profile you set up
Each production deployment profile requires a fresh deployment recipe and begins with the same initial steps. Complete these steps before following the profile-specific setup instructions.
Clone the recipe repository
Make a fresh clone of the recipes repository in a location of your choice.
git clone https://github.com/neurobagel/recipes.git recipes
Consider changing recipes to a name you will recognize in the future
Then navigate into this directory for the remaining steps.
cd recipes
Proxy server
Skip if you already have a reverse proxy server
If you already have a reverse proxy server set up and want to continue using it, do not follow this section and instead continue with our guide on production deployment with an existing proxy server.
Always launch the proxy server first
Our reverse proxy recipe is set up to automatically configure routes to the Neurobagel services you launch. In order to do that, the proxy server must already be running when you launch a new Neurobagel service. If you have already launched Neurobagel services (e.g. node or portal), shut them down again, launch the proxy server, and then relaunch the services.
Start from a fresh deployment recipe!
To host your Neurobagel node services under a custom URL
(e.g. https://www.myfirstnode.org/query) rather than a server IP address and port
(e.g. http://192.168.0.1:3000), we provide a recipe for you to easily set up an
NGINX reverse proxy
alongside your Neurobagel services.
Make sure that:
- you have access to the domain you want to host your services at, so that you can create a DNS entry that points at the machine that will host your Neurobagel services.
- your machine firewall allows incoming connections on ports 80 (HTTP) and 443 (HTTPS). Both are necessary to enable HTTPS connections.
Launch the proxy server using the corresponding deployment recipe:
docker compose -f docker-compose.proxy.yml up -d
You should now see the nginx-proxy and acme-companion services running:
docker ps
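As an extra check after launching the proxy, you may want to confirm that something is accepting connections on ports 80 and 443. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and not part of any Neurobagel tool. It only tests reachability from the machine you run it on, so it does not replace testing your firewall from outside.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_proxy_ports(host: str = "localhost") -> dict[int, bool]:
    """Check the HTTP (80) and HTTPS (443) ports that the proxy needs."""
    return {port: port_is_open(host, port) for port in (80, 443)}


if __name__ == "__main__":
    for port, is_open in check_proxy_ports().items():
        print(f"port {port}: {'open' if is_open else 'not reachable'}")
```

Passing your public domain name instead of localhost, from a different machine, gives a better picture of what external users will see.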
Node
Start from a fresh deployment recipe!
Make sure the proxy server is already running
The default node deployment recipe requires that you have already
deployed the proxy server.
Create node INI config file
In your deployment recipe directory, create a file called nb_config.ini.
This file will store the environment variables used to configure the services in your deployment.
Here is an example minimal nb_config.ini for a node deployment:
[compose]
COMPOSE_PROFILES=node #(1)!
[service:graph]
LOCAL_GRAPH_DATA=./data
[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN=mydomain.org
NB_NAPI_BASE_PATH=/node
(1) COMPOSE_PROFILES must be set to node.
Do not wrap values in quotation marks ('' or "") - they will be treated as literal characters
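To see why quotation marks are a problem, here is a small sketch that reads a node configuration with Python's standard configparser module. This is an assumption for illustration only: configure-nb may parse the file differently, and the helper names read_config and quoted_values are hypothetical. One value is deliberately quoted to show the pitfall.

```python
import configparser

EXAMPLE_INI = """\
[compose]
COMPOSE_PROFILES=node

[service:graph]
LOCAL_GRAPH_DATA=./data

[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN="mydomain.org"
NB_NAPI_BASE_PATH=/node
"""


def read_config(text: str) -> configparser.ConfigParser:
    """Parse INI text while keeping variable names in their original case."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # configparser lowercases keys by default
    parser.read_string(text)
    return parser


def quoted_values(parser: configparser.ConfigParser) -> list[str]:
    """Return 'section.KEY' for every value wrapped in quotation marks."""
    flagged = []
    for section in parser.sections():
        for key, value in parser[section].items():
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                flagged.append(f"{section}.{key}")
    return flagged


config = read_config(EXAMPLE_INI)
# The quotation marks are part of the value, not delimiters around it.
print(config["service:node-api"]["NB_NAPI_DOMAIN"])
print(quoted_values(config))
```

Running a check like quoted_values before generating the runtime configuration is one way to catch accidentally quoted values early.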
Info
For more details on all available environment variables, see the Environment variable reference.
The following sections describe the node configuration options in more detail.
Set graph store credentials
This is a security-relevant section!
Changing passwords after the first launch requires a hard reset
If you've previously launched a Neurobagel deployment (Docker Compose stack), you'll need to reset your graph store for any changes you have made to user credentials to take effect. Any other configuration changes you've already made will be applied when you re-launch your node.
The graph store (GraphDB instance) in a Neurobagel node
is secured with password-based access and includes two users:
an admin superuser and a regular database user.
The Neurobagel deployment recipes automatically create both of these users.
In the ./secrets subdirectory,
change the default passwords for these users:
- Replace the contents of NB_GRAPH_ADMIN_PASSWORD.txt to set the password for the admin superuser
- Replace the contents of NB_GRAPH_PASSWORD.txt to set the password for the graph database user
Want to generate a random password in the terminal?
To generate a random password in the terminal, you can use:
openssl rand -hex 16
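If you prefer to script this step, the sketch below does roughly the same thing with Python's standard secrets module: it generates a 32-character hexadecimal password for each of the two password files. The function name write_random_passwords is hypothetical, and the demonstration writes to a temporary directory rather than your real secrets directory.

```python
import secrets
import tempfile
from pathlib import Path

# File names taken from the deployment recipe's secrets directory.
PASSWORD_FILES = ("NB_GRAPH_ADMIN_PASSWORD.txt", "NB_GRAPH_PASSWORD.txt")


def write_random_passwords(secrets_dir: Path, n_bytes: int = 16) -> list[Path]:
    """Overwrite each password file with a fresh random hex string."""
    secrets_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in PASSWORD_FILES:
        path = secrets_dir / name
        path.write_text(secrets.token_hex(n_bytes))  # 2 hex characters per byte
        written.append(path)
    return written


# Demonstration in a throwaway directory; use Path("./secrets") for real use.
with tempfile.TemporaryDirectory() as tmp:
    for path in write_random_passwords(Path(tmp)):
        print(f"{path.name}: {len(path.read_text())} characters")
```

Remember that changing these files after the first launch only takes effect after the graph store has been reset.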
(Optional) To change the directory where your password files are stored, use the NB_GRAPH_SECRETS_PATH variable:
[compose]
COMPOSE_PROFILES=node
[service:graph]
LOCAL_GRAPH_DATA=./data
NB_GRAPH_SECRETS_PATH=./secrets
[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN=mydomain.org
NB_NAPI_BASE_PATH=/node
Graph store passwords are only for administrator use!
The admin user and graph database user credentials are intended solely for internal use by the deployment recipe scripts that automatically set up and update the graph store,
or for a node administrator to interact directly with the graph store.
These credentials also secure internal communication between your graph store and its node API,
ensuring that node users cannot query your graph directly.
GraphDB user credentials are not intended for use by a general node query user.
Passwords are handled as Docker secrets
The contents of NB_GRAPH_ADMIN_PASSWORD.txt and NB_GRAPH_PASSWORD.txt are passed to Neurobagel containers as Docker secrets.
This ensures that your passwords are not exposed in the container logs or in the docker-compose.yml file.
Do not share your password files with others.
Add data to the node
By default, any JSONLD files in the ./data subdirectory
of your deployment recipe directory will be automatically uploaded to the graph store.
To add the dataset JSONLD files for your node, either:
- place them inside ./data (replacing the example JSONLD file), or
- define a custom path to your JSONLD files using the variable LOCAL_GRAPH_DATA:
nb_config.ini
[compose]
COMPOSE_PROFILES=node
[service:graph]
LOCAL_GRAPH_DATA=./data
[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN=mydomain.org
NB_NAPI_BASE_PATH=/node
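Before launching, it can help to confirm that the data directory actually contains the files you expect and that each one is well-formed JSON. The sketch below is one possible pre-flight check, not part of the Neurobagel tooling. It assumes the files use the .jsonld extension and sit directly in the data directory; both helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def find_jsonld_files(data_dir: Path) -> list[Path]:
    """List files ending in .jsonld directly inside data_dir."""
    if not data_dir.is_dir():
        return []
    return sorted(data_dir.glob("*.jsonld"))


def invalid_json_files(paths: list[Path]) -> list[Path]:
    """Return the files whose contents cannot be parsed as JSON."""
    bad = []
    for path in paths:
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except ValueError:  # covers JSON syntax errors and invalid UTF-8
            bad.append(path)
    return bad


# Demonstration in a throwaway directory; use Path("./data") for a real check.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "good.jsonld").write_text('{"example": true}')
    (data_dir / "broken.jsonld").write_text("{not valid json")
    files = find_jsonld_files(data_dir)
    print("found:", [p.name for p in files])
    print("invalid:", [p.name for p in invalid_json_files(files)])
```

This only checks JSON syntax. It does not verify that a file follows the Neurobagel data model.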
Set node response granularity
This is a security-relevant section!
Based on your data sharing requirements, set the following variables to control the level of detail returned in query results from your node:
- NB_RETURN_AGG: whether to return aggregate counts only, instead of subject-level records
- NB_MIN_CELL_SIZE: minimum matching subject threshold for dataset visibility in queries
[compose]
COMPOSE_PROFILES=node
[service:graph]
LOCAL_GRAPH_DATA=./data
[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN=mydomain.org
NB_NAPI_BASE_PATH=/node
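The snippet below is only an illustration of how these two settings might interact. It is not the node API's actual code. The function name, the result field names, and the assumption that a dataset stays visible when its number of matching subjects is at least NB_MIN_CELL_SIZE are all hypothetical.

```python
def shape_response(dataset_name, matching_subjects, return_agg=True, min_cell_size=0):
    """Illustrate the effect of both settings on the result for one dataset."""
    n_matching = len(matching_subjects)

    # Assumed rule: hide the dataset if too few subjects match the query.
    if n_matching < min_cell_size:
        return None

    result = {"dataset": dataset_name, "n_matching_subjects": n_matching}

    # Subject-level records are only included when aggregation is turned off.
    if not return_agg:
        result["subjects"] = list(matching_subjects)
    return result


subjects = ["sub-01", "sub-02", "sub-03"]
print(shape_response("Example dataset", subjects, return_agg=True))
print(shape_response("Example dataset", subjects, return_agg=False))
print(shape_response("Example dataset", subjects, min_cell_size=5))
```

In this sketch the first call returns counts only, the second also lists the matching subjects, and the third hides the dataset because 3 is below the threshold of 5.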
Set node domain and subpath
Set NB_NAPI_DOMAIN to the domain name
(including any subdomain) you will use for your node API
(the web-accessible part of your node).
Optionally, set NB_NAPI_BASE_PATH to host your node API on a subpath of your domain.
This is useful when hosting multiple nodes or services on the same domain, because you can use a different subpath for each
(e.g. mydomain.org/node1, mydomain.org/node2, ...).
[compose]
COMPOSE_PROFILES=node
[service:graph]
LOCAL_GRAPH_DATA=./data
[service:node-api]
NB_RETURN_AGG=true
NB_MIN_CELL_SIZE=0
NB_NAPI_DOMAIN=mydomain.org
NB_NAPI_BASE_PATH=/node #(1)!
(1) NB_NAPI_BASE_PATH is optional and can be omitted if you are not using a subpath.
Domain names must not include a protocol (http:// or https://)
Custom subpaths must include a leading slash /
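The two formatting rules above are easy to check mechanically. Here is a minimal sketch of such a check; check_address is a hypothetical helper, not something provided by configure-nb.

```python
def check_address(domain: str, base_path: str = "") -> list[str]:
    """Return a list of problems; an empty list means both values look fine."""
    problems = []
    if domain.lower().startswith(("http://", "https://")):
        problems.append(f"domain {domain!r} must not include a protocol")
    # An empty base path means no subpath is used, which is allowed.
    if base_path and not base_path.startswith("/"):
        problems.append(f"subpath {base_path!r} must start with '/'")
    return problems


print(check_address("mydomain.org", "/node"))
print(check_address("https://mydomain.org", "node"))
```

The first call reports no problems, while the second reports one problem for the domain and one for the subpath.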
Generate node runtime config
Run configure-nb to generate the runtime configuration file for your deployment
from the nb_config.ini file you created.
configure-nb
You should now have a .env file in your deployment recipe directory.
Launch node
Launch your node using Docker Compose.
docker compose -f docker-compose.prod.yml up -d
Portal
Start from a fresh deployment recipe!
Make sure the proxy server is already running
The default portal deployment recipe requires that you have already
deployed the proxy server.
Create portal INI config file
In your deployment recipe directory, create a file called nb_config.ini.
This file will store the environment variables used to configure the services in your deployment.
Here is an example minimal nb_config.ini for a portal deployment:
[compose]
COMPOSE_PROFILES=portal #(1)!
[node:1]
NAME=Parkinson's Disease Data - Site 1
API_URL=https://mydomain.org/site1
[node:2]
NAME=Parkinson's Disease Data - Site 2
API_URL=https://mydomain.org/site2
[service:federation-api]
NB_FAPI_DOMAIN=mydomain.org
NB_FAPI_BASE_PATH=/federate
[service:query]
NB_QUERY_DOMAIN=mydomain.org
NB_QUERY_APP_BASE_PATH=/
(1) COMPOSE_PROFILES must be set to portal.
Do not wrap values in quotation marks ('' or "") - they will be treated as literal characters
Info
For more details on all available environment variables, see the Environment variable reference.
The following sections describe the portal configuration options in more detail.
Set nodes to federate
[compose]
COMPOSE_PROFILES=portal
[node:1]
NAME=Parkinson's Disease Data - Site 1
API_URL=https://mydomain.org/site1
[node:2]
NAME=Parkinson's Disease Data - Site 2
API_URL=https://mydomain.org/site2
[service:federation-api]
NB_FAPI_DOMAIN=mydomain.org
NB_FAPI_BASE_PATH=/federate
[service:query]
NB_QUERY_DOMAIN=mydomain.org
NB_QUERY_APP_BASE_PATH=/query
A portal deployment federates queries across a custom set of nodes that you define. Each node of interest is defined using a federation node configuration section following the format:
[node:<ID>]
NAME=<NODE DISPLAY NAME (SHOWN IN THE QUERY PORTAL)>
API_URL=<URL OF THE NODE API>
Federation node configuration section headers must start with the prefix node:.
<ID> is an arbitrary internal identifier used only to ensure section names are unique.
For simplicity, we recommend using numeric IDs such as [node:1], [node:2], etc.
API_URL must include the protocol (http:// or https://)
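As a rough illustration of the section format, the sketch below collects every [node:<ID>] section from a portal configuration and rejects an API_URL without a protocol. It uses Python's standard configparser as an assumption; configure-nb may read the file differently, and federation_nodes is a hypothetical helper name.

```python
import configparser

PORTAL_INI = """\
[compose]
COMPOSE_PROFILES=portal

[node:1]
NAME=Parkinson's Disease Data - Site 1
API_URL=https://mydomain.org/site1

[node:2]
NAME=Parkinson's Disease Data - Site 2
API_URL=https://mydomain.org/site2
"""


def federation_nodes(ini_text: str) -> list[dict[str, str]]:
    """Collect NAME and API_URL from every section whose name starts with 'node:'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names in their original case
    parser.read_string(ini_text)

    nodes = []
    for section in parser.sections():
        if not section.startswith("node:"):
            continue  # e.g. [compose] or [service:...] sections
        api_url = parser[section]["API_URL"]
        if not api_url.startswith(("http://", "https://")):
            raise ValueError(f"[{section}] API_URL must include http:// or https://")
        nodes.append({"NAME": parser[section]["NAME"], "API_URL": api_url})
    return nodes


for node in federation_nodes(PORTAL_INI):
    print(f"{node['NAME']} -> {node['API_URL']}")
```

Because the ID after node: only has to be unique, the sketch never interprets it. It simply relies on the prefix to tell node sections apart from the others.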
Public Neurobagel nodes do not need to be included
We maintain a list of publicly accessible Neurobagel nodes
here.
By default, every new f-API will look up this list
on startup and include it in its internal list of nodes to
federate over
(this can be disabled by setting the variable NB_FEDERATE_REMOTE_PUBLIC_NODES in nb_config.ini).
This means that you do not have to manually define these public nodes in nb_config.ini.
Do not include URLs of federation APIs
Make sure you do not include your own f-API in the list of nodes to federate over. Doing so will cause an infinite request loop that will likely overload your service, because the f-API will repeatedly make requests to itself.
Set portal domains and subpaths
Set NB_FAPI_DOMAIN and NB_QUERY_DOMAIN to the domain names you will use for your federation API and web query tool, respectively.
Optionally, set NB_FAPI_BASE_PATH and/or NB_QUERY_APP_BASE_PATH to host your federation API and/or
web query tool on a subpath of your domain.
This is useful when hosting multiple services on the same domain, because you can use a different subpath for each
(e.g. mydomain.org/federate, mydomain.org/query, ...).
[compose]
COMPOSE_PROFILES=portal
[node:1]
NAME=Parkinson's Disease Data - Site 1
API_URL=https://mydomain.org/site1
[node:2]
NAME=Parkinson's Disease Data - Site 2
API_URL=https://mydomain.org/site2
[service:federation-api]
NB_FAPI_DOMAIN=mydomain.org
NB_FAPI_BASE_PATH=/federate #(1)!
[service:query]
NB_QUERY_DOMAIN=mydomain.org
NB_QUERY_APP_BASE_PATH=/query #(2)!
(1) NB_FAPI_BASE_PATH is optional and can be omitted if you are not using a subpath.
(2) NB_QUERY_APP_BASE_PATH is optional and can be omitted if you are not using a subpath.
Domain names must not include a protocol (http:// or https://)
Custom subpaths must include a leading slash /
Generate portal runtime config
Run configure-nb to generate the runtime configuration files for your deployment
from the nb_config.ini file you created.
configure-nb
You should now have two additional files in your deployment recipe directory: .env and local_nb_nodes.json.
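If you script your deployments, a small check like the following can confirm that the expected files were generated before you launch. It is a sketch under the assumption that the files listed in this guide (.env for a node, .env and local_nb_nodes.json for a portal) are the only ones you want to verify; missing_runtime_files is a hypothetical helper.

```python
import tempfile
from pathlib import Path

# Files this guide says should exist after running configure-nb.
EXPECTED_FILES = {
    "node": [".env"],
    "portal": [".env", "local_nb_nodes.json"],
}


def missing_runtime_files(recipe_dir: Path, profile: str) -> list[str]:
    """Return the expected runtime files that are not present in recipe_dir."""
    return [
        name for name in EXPECTED_FILES[profile]
        if not (recipe_dir / name).is_file()
    ]


# Demonstration in a throwaway directory; use Path(".") inside a real recipe.
with tempfile.TemporaryDirectory() as tmp:
    recipe_dir = Path(tmp)
    (recipe_dir / ".env").write_text("")
    print("node:", missing_runtime_files(recipe_dir, "node"))
    print("portal:", missing_runtime_files(recipe_dir, "portal"))
```

An empty list means every expected file is present. Any names in the list point to a configuration step that has not completed.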
Launch portal
Launch your portal using Docker Compose.
docker compose -f docker-compose.prod.yml up -d
Making your node publicly discoverable
The public Neurobagel query tool (https://query.neurobagel.org) provides query federation to all publicly accessible Neurobagel nodes.
To make your node queryable at https://query.neurobagel.org, it simply needs to be added to our public federation index on GitHub.
If you have a GitHub Account
- Fork the neurobagel/menu repository.
- Add your node name and node API URL to the public federation index JSON file, using the following format:
{ "NodeName": "NAME OF YOUR NODE", "ApiURL": "https://URL-OF-YOUR-NODE-API" }
NodeName defines the display name of your node as it will appear in the Neurobagel query tool.
- Open a pull request in the neurobagel/menu repository.
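To avoid typos in the entry, you could generate it with a few lines of Python. The sketch below builds an entry in the NodeName/ApiURL format shown above and prints it as JSON. The helper name make_index_entry, the example node name, and the example URL are placeholders, and the protocol check is an assumption based on the format example.

```python
import json


def make_index_entry(node_name: str, api_url: str) -> dict[str, str]:
    """Build one entry for the public federation index."""
    if not node_name.strip():
        raise ValueError("NodeName must not be empty")
    # Assumption: the index expects a full URL, as in the format example.
    if not api_url.startswith(("http://", "https://")):
        raise ValueError("ApiURL must include http:// or https://")
    return {"NodeName": node_name, "ApiURL": api_url}


entry = make_index_entry("My First Node", "https://mydomain.org/node")
print(json.dumps(entry, indent=2))
```

You can then paste the printed object into the federation index file in your fork.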
If you do not have a GitHub Account
Join the Neurobagel Discord server and message @neurobagel/dev with your node information, so that a maintainer can add your node to the public federation index.