@diegofcornejo
Last active May 25, 2024 23:21
Cassandra Cluster Setup: Configuration to set up a three-node Cassandra cluster using Docker Compose and the Bitnami Cassandra image.

Cassandra Cluster with Docker Compose

This project sets up a multi-node Cassandra cluster using Docker Compose. The configuration includes three Cassandra nodes with a centralized logging setup using Loki.

Prerequisites

  • Docker
  • Docker Compose

Project Structure

  • Dockerfile: Dockerfile for the Cassandra nodes.
  • docker-compose.yml: Docker Compose file defining the services and their configurations.
  • .env: Environment variables for the Cassandra nodes.
  • backup.sh: Snapshot-and-upload script that the Dockerfile copies into the image.

Usage

Clone the repository, then run the following from its root:

# Build the Docker images
docker build -t cassandra-awscli .

# Start the first node as a seed node
docker compose up -d cassandra-node1

# Wait for the first node to finish starting before launching the others;
# the remaining nodes must reach the seed node to join the cluster.
# Note: the first node may take a few minutes to start.
docker compose logs -f cassandra-node1

# Start the remaining nodes
docker compose up -d cassandra-node2 cassandra-node3
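
The wait step above can be scripted instead of watching the logs by hand. The following is a minimal sketch, not part of the original gist: the function names `count_up_normal` and `wait_for_node` are hypothetical, and it assumes `nodetool status` can be called inside the container without JMX credentials (the same assumption the backup script in this gist makes). It relies on the fact that `nodetool status` prefixes each healthy node with `UN` (Up/Normal).

```shell
# Count nodes reported as Up/Normal ("UN") in `nodetool status` output read from stdin.
count_up_normal() {
  # grep -c exits non-zero when the count is 0, so mask that exit status
  grep -c '^UN ' || true
}

# Poll a container until it reports at least one UN node, or give up.
# Arguments: container name, max attempts (default 30), delay in seconds (default 10).
wait_for_node() {
  local container="$1"
  local attempts="${2:-30}"
  local delay="${3:-10}"
  local i=1
  local up
  while [ "$i" -le "$attempts" ]; do
    up=$(docker exec "$container" nodetool status 2>/dev/null | count_up_normal) || up=0
    if [ "${up:-0}" -ge 1 ]; then
      echo "$container is up ($up node(s) in UN state)"
      return 0
    fi
    echo "Waiting for $container ($i/$attempts)..."
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$container did not become ready" >&2
  return 1
}
```

A possible use is `wait_for_node cassandra-node1 && docker compose up -d cassandra-node2 cassandra-node3`.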

Cassandra consistency levels

https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/dml/dmlConfigConsistency.html
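
As a worked example for a three-node cluster: a QUORUM read or write needs floor(RF / 2) + 1 replicas to respond, where RF is the keyspace replication factor. The sketch below only does that arithmetic; the function names are hypothetical and the replication factor is whatever you choose when creating your keyspace (this gist does not set one).

```shell
# Replicas that must acknowledge a QUORUM request for a given replication factor.
quorum() {
  local rf="$1"
  echo $(( rf / 2 + 1 ))
}

# Largest number of replicas that can be down while QUORUM still succeeds.
tolerated_failures() {
  local rf="$1"
  echo $(( rf - (rf / 2 + 1) ))
}
```

With a replication factor of 3, `quorum 3` gives 2 and `tolerated_failures 3` gives 1, so the cluster keeps serving QUORUM requests with one node down.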

# Environment variables for Cassandra Nodes
CASSANDRA_SEEDS="cassandra-node1"
CASSANDRA_CLUSTER_NAME="production-cluster"
CASSANDRA_ENDPOINT_SNITCH=SimpleSnitch
CASSANDRA_AUTHENTICATOR=PasswordAuthenticator
CASSANDRA_AUTHORIZER=CassandraAuthorizer
CASSANDRA_USER=cassandra
CASSANDRA_PASSWORD=123456
HEAP_NEWSIZE=512M
MAX_HEAP_SIZE=4096M
# AWS credentials for backup and restore scripts
AWS_ACCESS_KEY_ID=YOUR_AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY
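
Before starting the cluster it can help to confirm that the .env file actually defines the values the nodes and the backup script rely on. This is a small sketch under that assumption; `missing_env_keys` is a hypothetical helper, not something the gist or Docker Compose provides, and it only checks that each key has a non-empty assignment.

```shell
# Print each required key that has no non-empty assignment in the given env file.
# Arguments: path to the env file, followed by the keys to check.
missing_env_keys() {
  local file="$1"
  shift
  local key
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || echo "$key"
  done
}
```

For example, `missing_env_keys .env CASSANDRA_SEEDS CASSANDRA_PASSWORD AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY` prints nothing when all four are set.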
#!/bin/bash
# Stop at the first failed command so a broken snapshot is never uploaded
set -euo pipefail
# Variables
KEYSPACE_NAME="your_keyspace"
S3_BUCKET="your-bucket"
# Cassandra data directory; adjust if your image stores data elsewhere
DATA_DIR="/var/lib/cassandra/data"
# Container hostname (the container ID unless hostname is set in docker-compose.yml)
HOST_NAME=$(hostname)
TODAY=$(date +%F)
if [ "$(date +%u)" -eq 7 ]; then
  # Sunday: label the snapshot as a full backup
  SNAPSHOT_NAME="full_snapshot_${HOST_NAME}_$TODAY"
  echo "Performing full backup: $SNAPSHOT_NAME"
else
  # Other days: label the snapshot as incremental
  # (note: nodetool snapshot always captures the whole keyspace)
  SNAPSHOT_NAME="incremental_snapshot_${HOST_NAME}_$TODAY"
  echo "Performing incremental backup: $SNAPSHOT_NAME"
fi
# Create snapshot
nodetool flush
nodetool snapshot -t "$SNAPSHOT_NAME" "$KEYSPACE_NAME"
# Compress snapshot files; each table keeps its own <table>/snapshots/<name> directory
cd "$DATA_DIR/$KEYSPACE_NAME"
tar -czf "/tmp/${SNAPSHOT_NAME}.tar.gz" */snapshots/"$SNAPSHOT_NAME"
# Upload to S3
aws s3 cp "/tmp/${SNAPSHOT_NAME}.tar.gz" "s3://$S3_BUCKET/backups/$HOST_NAME/"
# Remove the on-disk snapshot now that it has been uploaded
nodetool clearsnapshot -t "$SNAPSHOT_NAME" "$KEYSPACE_NAME"
# Remove the local compressed file
rm "/tmp/${SNAPSHOT_NAME}.tar.gz"
echo "Backup $SNAPSHOT_NAME compressed and uploaded to S3"
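
The naming rule in the backup script (a "full" label on Sundays, "incremental" on other days) can be isolated into a function so it is easy to check without waiting for a particular weekday. This is a sketch only; `snapshot_name` is a hypothetical helper that takes the values the script derives from `hostname`, `date +%u` and `date +%F` as explicit arguments.

```shell
# Build the snapshot name used by the backup script.
# Arguments: host name, ISO day of week (1 = Monday ... 7 = Sunday), date as YYYY-MM-DD.
snapshot_name() {
  local host="$1"
  local dow="$2"
  local day="$3"
  if [ "$dow" -eq 7 ]; then
    echo "full_snapshot_${host}_${day}"
  else
    echo "incremental_snapshot_${host}_${day}"
  fi
}
```

Inside the script it could be called as `SNAPSHOT_NAME=$(snapshot_name "$(hostname)" "$(date +%u)" "$(date +%F)")`.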
name: cassandra-cluster

x-logging: &default-logging
  driver: "loki"
  options:
    loki-url: "http://localhost:3100/loki/api/v1/push"

services:
  cassandra-node1:
    image: cassandra-awscli
    env_file:
      - .env
    container_name: cassandra-node1
    networks:
      - cassandra-network
    ports:
      - "9042:9042"
    volumes:
      - cassandra-data1:/var/lib/cassandra
    environment:
      - CASSANDRA_PASSWORD_SEEDER=yes
    logging: *default-logging

  cassandra-node2:
    image: cassandra-awscli
    env_file:
      - .env
    container_name: cassandra-node2
    networks:
      - cassandra-network
    ports:
      - "9043:9042"
    volumes:
      - cassandra-data2:/var/lib/cassandra
    logging: *default-logging
    depends_on:
      - cassandra-node1

  cassandra-node3:
    image: cassandra-awscli
    env_file:
      - .env
    container_name: cassandra-node3
    networks:
      - cassandra-network
    ports:
      - "9044:9042"
    volumes:
      - cassandra-data3:/var/lib/cassandra
    logging: *default-logging
    depends_on:
      - cassandra-node1

networks:
  cassandra-network:
    driver: bridge

volumes:
  cassandra-data1:
  cassandra-data2:
  cassandra-data3:
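
The `loki` logging driver referenced in the Compose file is not built into Docker; it comes from the Grafana Loki Docker driver plugin, and containers will fail to start if it is missing. The sketch below checks for it before running `docker compose up`. The function names are hypothetical, and the install command shown in the message assumes you want the plugin aliased as `loki`, matching the `driver: "loki"` setting above.

```shell
# Succeeds if the plugin list on stdin (lines of "<name> <enabled>")
# contains an enabled plugin whose alias is "loki".
loki_plugin_enabled() {
  grep -Eq '^loki(:[^ ]+)? true$'
}

# Check the local Docker daemon and print install guidance when the plugin is absent.
ensure_loki_plugin() {
  if docker plugin ls --format '{{.Name}} {{.Enabled}}' | loki_plugin_enabled; then
    echo "Loki logging driver is installed and enabled"
  else
    echo "Loki logging driver missing or disabled; install it with:" >&2
    echo "  docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions" >&2
    return 1
  fi
}
```

Running `ensure_loki_plugin && docker compose up -d cassandra-node1` avoids a confusing startup error when the driver has not been installed yet.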
# Use the latest version of the Bitnami Cassandra image as base image
FROM bitnami/cassandra:latest
# Change to root user to install packages
USER root
# Install curl and unzip, install the AWS CLI, then remove the helper packages
# and clean up in the same layer so they do not add to the image size
RUN apt-get update && apt-get install -y curl unzip && \
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
    unzip awscliv2.zip && \
    ./aws/install && \
    rm -rf awscliv2.zip aws && \
    apt-get remove -y curl unzip && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Copy the backup script into the container
COPY backup.sh /usr/local/bin/backup.sh
RUN chmod +x /usr/local/bin/backup.sh
# Go back to the default non-root user
USER 1001
#!/bin/bash
# Cassandra containers (names must match container_name in docker-compose.yml)
CONTAINERS=("cassandra-node1" "cassandra-node2" "cassandra-node3")
# Execute backup in each container
for CONTAINER in "${CONTAINERS[@]}"; do
  echo "Starting backup in container: $CONTAINER"
  docker exec "$CONTAINER" /usr/local/bin/backup.sh
done
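
The loop above does not report which container failed. One possible variation, sketched here under assumptions, keeps going after a failure and returns the number of failed containers. `run_backups` and the `BACKUP_CMD` override are hypothetical names introduced for illustration; `BACKUP_CMD` exists only so the loop can be dry-run without Docker.

```shell
# Run the backup script in each container and count failures
# instead of stopping silently. Arguments: container names.
# BACKUP_CMD (hypothetical hook) replaces "docker exec" when set.
run_backups() {
  local failed=0
  local container
  for container in "$@"; do
    echo "Starting backup in container: $container"
    if ! ${BACKUP_CMD:-docker exec} "$container" /usr/local/bin/backup.sh; then
      echo "Backup failed in container: $container" >&2
      failed=$((failed + 1))
    fi
  done
  # Exit status is the number of failed containers (0 means all succeeded)
  return "$failed"
}
```

It could be invoked as `run_backups cassandra-node1 cassandra-node2 cassandra-node3`, for example from a scheduled job that alerts on a non-zero exit status.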