DolphinScheduler is a powerful open-source distributed task scheduling system widely used in the big data field for managing complex workflows. This article will provide a detailed guide on how to install and configure DolphinScheduler using Docker Compose, allowing you to quickly set up and start using the system.
First, ensure that Docker and Docker Compose are installed on your system. Docker is an open-source containerization platform that allows developers to package applications and their dependencies into containers, providing high portability and consistency. Docker Compose is a tool for defining and managing multi-container applications. It uses a YAML file to configure the services and provides a single command to start or stop those services.
You can verify if Docker and Docker Compose are installed correctly using the following commands:
docker --version
docker-compose --version
If you see the version information, the installation was successful.
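The same check can be scripted so that a missing tool is reported before anything else runs. This is a minimal sketch: `check_tools` is a hypothetical helper name, not part of Docker, and it only confirms that each command is on the PATH, not that the Docker daemon is running.

```shell
# Report whether each named command-line tool is available on PATH.
# check_tools is an illustrative helper, not a Docker command.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Exit status is 0 only when every tool was found.
  return "$missing"
}

# Usage: check_tools docker docker-compose
```

The non-zero return status makes the helper usable as a guard at the top of a setup script.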
Before installing and running DolphinScheduler, you need to obtain its Docker Compose configuration file. This file defines the runtime environment for DolphinScheduler and its dependent services. Follow these steps to get the configuration file:
First, use Git to clone the official DolphinScheduler repository:
git clone https://github.com/apache/dolphinscheduler.git
This will download the DolphinScheduler project to your local machine. Next, navigate to the project directory:
cd dolphinscheduler/docker
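In this directory, you will find a file named docker-compose.yml, which is the core configuration file for Docker Compose. The repository layout has changed between releases (recent versions keep the Compose files under deploy/docker), so adjust the path to match the version you cloned.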
The docker-compose.yml file defines the services needed to run DolphinScheduler, including a MySQL database, a ZooKeeper service, and DolphinScheduler's Master and Worker nodes. You can modify this file as needed to adjust the configuration of each service.
The docker-compose.yml file has the following basic structure:
version: '3.1'
services:
  zookeeper:
    image: zookeeper:3.5.6
    ports:
      - "2181:2181"
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: dolphinscheduler
    ports:
      - "3306:3306"
  dolphinscheduler-master:
    image: apache/dolphinscheduler:latest
    depends_on:
      - mysql
      - zookeeper
    ports:
      - "12345:12345"
    environment:
      - DOLPHINSCHEDULER_OPTS="-Xms512m -Xmx512m"
  dolphinscheduler-worker:
    image: apache/dolphinscheduler:latest
    depends_on:
      - dolphinscheduler-master
    environment:
      - DOLPHINSCHEDULER_OPTS="-Xms512m -Xmx512m"
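In this configuration file:
zookeeper runs the zookeeper:3.5.6 image and publishes port 2181 for service coordination.
mysql runs the mysql:5.7 image, sets the root password to root, creates the dolphinscheduler database, and publishes port 3306.
dolphinscheduler-master starts after mysql and zookeeper, publishes port 12345, and sets the JVM heap to 512 MB through DOLPHINSCHEDULER_OPTS.
dolphinscheduler-worker starts after the master and uses the same 512 MB heap setting.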
Once the docker-compose.yml file is configured correctly, you can start DolphinScheduler using Docker Compose:
docker-compose up -d
This command will start all the services defined in the docker-compose.yml file in the background. You can check the status of the services with the following command:
docker-compose ps
If all services are listed as Up, DolphinScheduler has been successfully started.
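Reading the status table by eye works for four services, but the comparison can also be scripted. The sketch below assumes a hypothetical helper, `not_running`, that takes two newline-separated lists of service names and prints every expected service that is absent from the running list. The usage comment feeds it from `docker-compose config --services` and `docker-compose ps --services --filter status=running`; whether the filter behaves identically on your Compose version is worth confirming.

```shell
# Print each service listed in $1 (one name per line) that does not
# appear in $2. not_running is an illustrative helper name.
not_running() {
  printf '%s\n' "$1" | while IFS= read -r svc; do
    [ -n "$svc" ] || continue
    # -x matches whole lines, -F treats the name as a literal string.
    printf '%s\n' "$2" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
  done
}

# Usage (requires a started Compose project in the current directory):
#   not_running "$(docker-compose config --services)" \
#               "$(docker-compose ps --services --filter status=running)"
```

Empty output means every defined service is running; any name printed is a service to inspect.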
Once started, you can access DolphinScheduler’s web UI through a browser. By default, the access URL is:
http://localhost:12345
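At the login screen, use the default admin credentials (username: admin, password: admin). The official DolphinScheduler documentation lists the default password as dolphinscheduler123 and serves the UI under the /dolphinscheduler/ui path, so try those values if the ones above do not work for your version. After logging in, change the default password to enhance system security.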
In the web UI, you can create projects and define tasks. DolphinScheduler supports various task types such as Shell, Python, and SQL. You can create workflows by dragging and dropping tasks and setting dependencies between them.
DolphinScheduler offers rich monitoring and logging features. Users can view task execution statuses, monitor cluster health in real-time, and access detailed execution logs, which help debug and optimize workflows.
During usage, you may encounter some issues. Below are common problems and their solutions.
If a service fails to start, you can check the logs to diagnose the issue using the following command:
docker-compose logs <service_name>
For example:
docker-compose logs dolphinscheduler-master
The log information can help identify errors such as database connection failures or port conflicts.
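When the log output is long, filtering it down to lines that look like failures can speed up diagnosis. The following is a minimal sketch: `filter_errors` is a hypothetical helper, and its pattern list is an illustrative assumption rather than a catalogue of DolphinScheduler error messages.

```shell
# Keep only log lines that look like failures (case-insensitive).
# The patterns are illustrative guesses; extend them for your setup.
filter_errors() {
  grep -iE 'error|exception|connection refused|address already in use'
}

# Usage (requires a started Compose project):
#   docker-compose logs --no-color dolphinscheduler-master | filter_errors
```

Because the helper passes through the exit status of grep, it returns non-zero when no line matches, which a script can treat as "no obvious errors found".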
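If there are database connection failures during startup, the MySQL service may not have been ready in time. The depends_on setting only controls the order in which containers are started; it does not wait for MySQL to accept connections. In this case, try restarting DolphinScheduler manually: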
docker-compose restart dolphinscheduler-master dolphinscheduler-worker
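Instead of restarting by hand, a script can wait until MySQL answers and then issue the restart. The sketch below defines a hypothetical `retry` helper that reruns any command until it succeeds or the attempt limit is reached. The usage comment pairs it with `mysqladmin ping` run inside the mysql container; the service name and the root/root credentials come from the example configuration above and are assumptions about your setup.

```shell
# Run a command repeatedly until it succeeds.
#   $1 = maximum number of attempts, $2 = seconds to sleep between attempts,
#   remaining arguments = the command to run.
# retry is an illustrative helper name.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (requires a started Compose project):
#   retry 30 2 docker-compose exec -T mysql \
#     mysqladmin ping -h 127.0.0.1 -uroot -proot --silent \
#     && docker-compose restart dolphinscheduler-master dolphinscheduler-worker
```

With these values the script waits about a minute at most before giving up, which leaves the restart untouched if MySQL never becomes reachable.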
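DolphinScheduler excels in big data processing and ETL task scheduling. Key advantages covered in this guide include a drag-and-drop workflow designer, support for multiple task types such as Shell, Python, and SQL, a distributed Master and Worker architecture, and built-in monitoring and logging.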
By following the above steps, you have successfully installed and configured DolphinScheduler using Docker Compose. Its powerful features and flexible configuration make it an ideal choice for distributed task scheduling. Whether for enterprise-level big data processing or small-to-medium-sized data integration projects, DolphinScheduler is a reliable solution.
If you encounter issues during usage, you can refer to DolphinScheduler’s official documentation or community resources for more detailed technical support. With continued learning and exploration, you will be able to fully leverage DolphinScheduler’s potential, significantly improving your workflow management.