Merging Two Instances of Plausible Analytics
Discover how to merge two instances of self-hosted Plausible Analytics into one.
In the world of web analytics, Plausible Analytics has emerged as a privacy-friendly alternative to traditional tools. But what happens when you have two separate instances of self-hosted Plausible Analytics and you want to merge them into one? The process is not straightforward, because Plausible Analytics stores all its data inside Clickhouse, a column-oriented database management system, and there is currently no built-in way to merge two instances. This guide walks you through a manual workaround.
Background
Some time ago I installed Plausible Analytics on one of my servers. Afterwards I installed another instance on another server and managed other sites on it. I wanted to transfer all the data from the first server to the second to streamline my analytics and consolidate my resources. Sadly, this feature is not built into Plausible yet, and I believe this is the biggest drawback of Plausible Analytics at the moment. I searched the documentation but could not find a solution. On GitHub I found some feature requests from 2020 without solutions. I know that the Plausible Analytics team is working on a solution, but I wanted to merge my instances now. So I decided to do it myself.
Prerequisites
- Two Plausible Analytics instances running in Docker
- Access to both instances
Step 1: Secure Your Data
First and foremost, back up your data from both instances. This precaution ensures you won’t lose your valuable insights if the merge doesn’t go as planned.
Access Your Server
Begin by connecting to your server via SSH. The SSH extension in VSCode simplifies this process. Navigate to your Plausible directory:
ssh user@ip-address-of-your-server
cd plausible
Interact with Clickhouse
Now, enter the Clickhouse database container:
docker-compose exec plausible_events_db bash
Confirm the container name with docker ps or by checking your docker-compose.yml.
Identify Your Data
List all Clickhouse databases and tables to locate your data:
clickhouse-client --query "SHOW DATABASES"
### OUTPUT ###
# INFORMATION_SCHEMA
# default
# information_schema
# plausible_events_db
# system
Ensure plausible_events_db exists, then list its tables:
clickhouse-client --database="plausible_events_db" --query="SHOW TABLES"
### OUTPUT ###
# events
# imported_browsers
# imported_other_stuff
# schema_migrations
# sessions
Two tables matter for the merge: events and sessions. Make sure that both exist in your database. If your table names differ, adjust the commands in the following steps accordingly.
Export the Tables
Export the events and sessions data to TSV files:
clickhouse-client --database="plausible_events_db" --query="SELECT * FROM events FORMAT TSV" > /var/lib/clickhouse/backup/backup_events_server1.tsv
clickhouse-client --database="plausible_events_db" --query="SELECT * FROM sessions FORMAT TSV" > /var/lib/clickhouse/backup/backup_sessions_server1.tsv
This exports the data from the events and sessions tables into the /var/lib/clickhouse/backup/ directory inside the container. Your docker-compose.yml should bind-mount this directory:
[...]
plausible_events_db:
volumes:
- ./data/event-data:/var/lib/clickhouse
[...]
This way you will find the exported data in the ./data/event-data/backup/ directory on your server.
If you encounter permission errors, you have to create the directory first:
mkdir data/event-data/backup
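The export can also be run from the host, without opening a shell inside the container. The following is a minimal sketch, assuming the service and database are both named plausible_events_db as above. The names ch_query, export_table, SERVER_TAG and BACKUP_DIR are my own helpers for illustration and are not part of Plausible or Clickhouse. Writing from the host requires that your user can write to the backup directory.

```shell
#!/bin/sh
# Sketch: export Clickhouse tables to TSV from the host.
# SERVER_TAG and BACKUP_DIR are hypothetical settings used only by this sketch.
SERVER_TAG="${SERVER_TAG:-server1}"
BACKUP_DIR="${BACKUP_DIR:-./data/event-data/backup}"

# Run one query in the Clickhouse container. -T disables the pseudo-TTY,
# so the output can be redirected into a file cleanly.
ch_query() {
    docker-compose exec -T plausible_events_db \
        clickhouse-client --database="plausible_events_db" --query="$1"
}

# Export one table to <BACKUP_DIR>/backup_<table>_<SERVER_TAG>.tsv
export_table() {
    table="$1"
    mkdir -p "$BACKUP_DIR" || return 1
    ch_query "SELECT * FROM $table FORMAT TSV" \
        > "$BACKUP_DIR/backup_${table}_${SERVER_TAG}.tsv"
}

# Usage (from the Plausible directory on the server):
#   export_table events && export_table sessions
```

Setting SERVER_TAG=server2 on the second server keeps the file names from the two servers apart.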
Download the Data
Verify the backup files have content.
cd data/event-data/backup/
cat backup_events_server1.tsv
cat backup_sessions_server1.tsv
Then download them to your local machine.
Repeat these steps on the second server, using a server2 suffix in the file names, so that you have the events and sessions backups from both servers on your local machine.
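For large tables, printing the whole file with cat is impractical. As a rough alternative, the sketch below only checks that a backup file exists, is not empty, and reports how many rows it holds. The helper name check_backup is hypothetical. It relies on the TSV format writing one row per line.

```shell
#!/bin/sh
# Sketch: sanity-check a TSV backup without printing the whole file.
# check_backup is a hypothetical helper, not a Plausible or Clickhouse command.
check_backup() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    # TSV output has one line per row, so the line count is the row count.
    rows=$(wc -l < "$file" | tr -d ' ')
    echo "$file: $rows rows"
}

# Usage:
#   check_backup backup_events_server1.tsv
#   check_backup backup_sessions_server1.tsv
#   head -n 3 backup_events_server1.tsv   # peek at the first rows
```

The reported number can be compared with the result of a SELECT count() query on the same table, run at the time of the export.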
Step 2: Prep for the Merge
DANGER
The following steps could lead to data loss. I don’t take any responsibility for any data loss. Please make sure that you have a backup of your data before you start.
Now you need to prepare your Plausible Analytics instances for the merge.
- Update Plausible Analytics and Clickhouse container to the same version on both servers
- Add the domains you want to merge to the second server (if it’s not already there)
You can update Plausible Analytics by editing the image versions in the docker-compose.yml file, then running docker-compose pull, docker-compose down and docker-compose up -d. You can add the domain through the Plausible GUI.
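One way to check that both servers run the same versions is to compare the image lines of the two docker-compose.yml files. This is only a sketch: list_images and same_images are hypothetical helpers, and the comparison is purely textual, so it assumes the image tags are written out in the file rather than injected through environment variables.

```shell
#!/bin/sh
# Sketch: compare the images referenced by two docker-compose.yml files.
# list_images and same_images are hypothetical helper names.

# Print the value of every "image:" key, sorted for a stable comparison.
list_images() {
    awk '$1 == "image:" { print $2 }' "$1" | sort
}

# Succeed when both compose files reference exactly the same images.
same_images() {
    [ "$(list_images "$1")" = "$(list_images "$2")" ]
}

# Usage (compose-server1.yml and compose-server2.yml are local copies
# of each server's docker-compose.yml; the file names are made up):
#   same_images compose-server1.yml compose-server2.yml && echo "versions match"
```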
Step 3: Backup the Database (Again)
Make sure that both instances are still running. Perform another backup following the steps from Step 1.
Step 4: Combine Your Data
Upload the backup_events_server1.tsv and backup_sessions_server1.tsv files from Step 3 to the second server’s backup directory (./data/event-data/backup/). Then import them into Clickhouse from inside the database container:
clickhouse-client --database="plausible_events_db" --query="INSERT INTO events FORMAT TSV" < /var/lib/clickhouse/backup/backup_events_server1.tsv
clickhouse-client --database="plausible_events_db" --query="INSERT INTO sessions FORMAT TSV" < /var/lib/clickhouse/backup/backup_sessions_server1.tsv
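To see what an import actually changed, it helps to record the row count before and after. The sketch below does this from the host. It assumes the service and database names used in this guide; ch_query and import_table are hypothetical helpers of my own.

```shell
#!/bin/sh
# Sketch: import one TSV file and report the row count before and after.
# ch_query and import_table are hypothetical helper names.

# Run one query in the Clickhouse container; -T lets stdin be piped in.
ch_query() {
    docker-compose exec -T plausible_events_db \
        clickhouse-client --database="plausible_events_db" --query="$1"
}

import_table() {
    table="$1"
    file="$2"
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    # </dev/null keeps the count queries from consuming the script's stdin.
    before=$(ch_query "SELECT count() FROM $table" < /dev/null)
    ch_query "INSERT INTO $table FORMAT TSV" < "$file" || return 1
    after=$(ch_query "SELECT count() FROM $table" < /dev/null)
    echo "$table: $before -> $after rows"
}

# Usage (from the Plausible directory on the second server):
#   import_table events   ./data/event-data/backup/backup_events_server1.tsv
#   import_table sessions ./data/event-data/backup/backup_sessions_server1.tsv
```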
Step 5: Confirm the Merge
Ensure the analytics on the new instance match the original server’s data. If discrepancies arise, revert to the previous backup: empty the events and sessions tables in the Clickhouse database, then import the second server’s backup from Step 3 again. Use TRUNCATE TABLE rather than DROP TABLE here. Dropping a table removes its definition as well, so the INSERT that follows would fail.
clickhouse-client --database="plausible_events_db" --query="TRUNCATE TABLE events"
clickhouse-client --database="plausible_events_db" --query="TRUNCATE TABLE sessions"
clickhouse-client --database="plausible_events_db" --query="INSERT INTO events FORMAT TSV" < /var/lib/clickhouse/backup/backup_events_server2.tsv
clickhouse-client --database="plausible_events_db" --query="INSERT INTO sessions FORMAT TSV" < /var/lib/clickhouse/backup/backup_sessions_server2.tsv
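Besides comparing the dashboards by eye, a simple row-count check can help decide whether a rollback is needed. The sketch below treats the combined size of both backups as a lower bound for the merged events table, since the second server keeps receiving traffic after its backup. The helpers rows_in and check_merge are hypothetical. I would apply this check to events only, as the row count of sessions may change when Clickhouse merges rows in the background.

```shell
#!/bin/sh
# Sketch: check that a merged table holds at least as many rows as the backups.
# rows_in and check_merge are hypothetical helper names.

# Total number of lines (rows) across the given TSV files.
rows_in() {
    cat "$@" | wc -l | tr -d ' '
}

# check_merge TABLE_COUNT FILE...
# Succeeds when TABLE_COUNT is at least the combined row count of the files.
check_merge() {
    actual="$1"
    shift
    expected=$(rows_in "$@")
    if [ "$actual" -ge "$expected" ]; then
        echo "ok: table has $actual rows, backups have $expected"
    else
        echo "mismatch: table has $actual rows, backups have $expected" >&2
        return 1
    fi
}

# Usage (inside the Clickhouse container, in the backup directory):
#   count=$(clickhouse-client --database="plausible_events_db" --query="SELECT count() FROM events")
#   check_merge "$count" backup_events_server1.tsv backup_events_server2.tsv
```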
Step 6: Update DNS Settings
After the data is imported, point your DNS to the new server to direct traffic to the updated Plausible instance.
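DNS changes can take a while to propagate. A quick way to check whether a hostname already resolves to the new server is to query its A records with dig. This is a minimal sketch: points_to is a hypothetical helper, and the hostname and IP address in the usage line are placeholders.

```shell
#!/bin/sh
# Sketch: check whether a hostname's A records include a given IP address.
# points_to is a hypothetical helper name; it requires dig to be installed.
points_to() {
    # -F: fixed string, -x: match the whole line, -q: no output
    dig +short "$1" A | grep -Fxq "$2"
}

# Usage (placeholder hostname and address):
#   points_to plausible.example.com 203.0.113.10 && echo "DNS updated"
```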
Step 7: Monitor the Transition
During the initial phase after the transfer, monitor your analytics closely. This will help you catch any discrepancies or issues early on.
Wrapping Up
Merging two instances of self-hosted Plausible Analytics is a complex process that requires some understanding of databases. If you’re not comfortable with these tasks, consider seeking help from a professional or the Plausible community.
Remember, this guide provides a workaround and might not work in all cases. Always back up your data before starting this process to prevent any data loss. By following these steps, you can consolidate your analytics into one instance and streamline your data analysis process.