Docker - 103
Buiding a Pipeline Using Containers
This exercise demonstrates how to build a pipeline composed of independent containers whose goal is to train a machine learning model on-demand.
The workflow
Dockerfile for app
Path: ./app/Dockerfile
# Use the official Python image as base
FROM python:3.9-slim
# Install Flask
RUN pip install Flask requests
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Expose port 5000
EXPOSE 5000
# Run the Flask application
CMD ["python", "app.py"]
Explanation:
- This Dockerfile starts with the official Python image and installs Flask along with the
requests
library. - It sets the working directory inside the container as
/app
. - Copies all files from the current directory (
.
) into the container’s/app
directory. - Exposes port
5000
to allow communication with the Flask application. - Finally, it specifies the command to run the Flask application using
CMD
.
Python Script for app
Path: ./app/app.py
from flask import Flask, jsonify
import requests
app = Flask(__name__)
@app.route('/')
def index():
response = requests.get('http://train:5000')
return jsonify(message="App received response from Train",
train_response=response.json())
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)
Explanation:
- This Python script defines a Flask application.
- It creates a route
/
which sends a request to thetrain
service at port5000
. - Upon receiving the response, it returns a JSON message indicating receipt of the response from
Train
.
Dockerfile for preprocess
Path: ./preprocess/Dockerfile
# Use the official Python image as base
FROM python:3.9-slim
# Install Flask
RUN pip install Flask mysql-connector-python
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Expose port 5000
EXPOSE 5000
# Run the Flask application
CMD ["python", "app.py"]
Explanation:
- Similar to the
app
Dockerfile, this one installs Flask along withmysql-connector-python
. - It sets the working directory as
/app
and copies all files into it. - Exposes port
5000
for communication. - Specifies the command to run the Flask application.
Python Script for preprocess
Path: ./preprocess/app.py
from flask import Flask, jsonify
import mysql.connector
app = Flask(__name__)
# Database connection configuration
db_config = {
'host': 'db', # Service name in Docker Compose network
'user': 'my_user',
'password': 'my_password',
'database': 'train_database'
}
# Function to fetch data from the database
def get_data():
try:
connection = mysql.connector.connect(**db_config)
cursor = connection.cursor()
cursor.execute('SELECT * FROM trains')
data = cursor.fetchall()
print(data)
return data
except mysql.connector.Error as error:
return f"Error: {error}"
finally:
if connection.is_connected():
cursor.close()
connection.close()
@app.route('/')
def index():
# Fetch data from the database
data = get_data()
data_json = {
"data": data
}
return jsonify(message="Preprocess received response from Database",
database_response=data_json)
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)
Explanation:
- This script sets up a Flask application.
- It defines a route
/
which connects to the database (db
) to fetch data from thetrains
table. - The fetched data is returned as a JSON response.
Dockerfile for train
Path: ./train/Dockerfile
# Use the official Python image as base
FROM python:3.9-slim
# Install Flask
RUN pip install Flask requests
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Expose port 5000
EXPOSE 5000
# Run the Flask application
CMD ["python", "app.py"]
Explanation:
- Similar to the previous Dockerfiles, this one installs Flask and sets up the working directory.
- Exposes port
5000
for communication with the Flask application.
Python Script for train
Path: ./train/app.py
from flask import Flask, jsonify
import requests
app = Flask(__name__)
@app.route('/')
def index():
response = requests.get('http://preprocess:5000')
return jsonify(message="Train received response from Preprocess",
preprocess_response=response.json())
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)
Explanation:
- This script defines a Flask application.
- It creates a route
/
which sends a request to thepreprocess
service at port5000
. - Upon receiving the response, it returns a JSON message indicating receipt of the response from
Preprocess
.
Dockerfile for db
Path: ./db/Dockerfile
# Use the official MySQL image as base
FROM mysql:8.3.0
# Copy the SQL script to create the table into the container
COPY init.sql /docker-entrypoint-initdb.d/
Explanation:
- This Dockerfile sets up the MySQL database container.
- It uses the official MySQL image and copies the SQL script for table creation into the container’s initialization directory.
SQL Script for Initializing db
Path: ./db/init.sql
-- Create the database if it doesn't exist
CREATE DATABASE IF NOT EXISTS train_database;
-- Use the created database
USE train_database;
-- Create table for trains
CREATE TABLE IF NOT EXISTS trains (
train_id INT AUTO_INCREMENT PRIMARY KEY,
train_name VARCHAR(255) NOT NULL,
departure_time INT NOT NULL,
arrival_time INT NOT NULL,
departure_station VARCHAR(100) NOT NULL,
arrival_station VARCHAR(100) NOT NULL
);
-- Generate and insert sample train data
INSERT INTO trains (train_name, departure_time, arrival_time, departure_station, arrival_station)
SELECT
CONCAT('Train ', numbers.n) AS train_name,
FLOOR(RAND() * 86400) AS departure_time,
FLOOR(RAND() * 86400) AS arrival_time,
CONCAT('Station ', FLOOR(RAND() * 1000)) AS departure_station,
CONCAT('Station ', FLOOR(RAND() * 1000)) AS arrival_station
FROM (
SELECT
(a.n + b.n * 10 + c.n * 100) AS n
FROM
(SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a,
(SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b,
(SELECT 0 AS n UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) AS numbers
LIMIT 1000;
Explanation:
- This SQL script initializes the MySQL database.
- It creates a database named
train_database
if it doesn’t exist and switches to it. - Then it creates a table
trains
with columns for train details. - Finally, it generates and inserts sample train data into the table.
Docker Compose Configuration
Path: ./docker-compose.yml
version: '3'
services:
app:
build: ./app
ports:
- "5001:5000"
networks:
- my_network
train:
build: ./train
networks:
- my_network
preprocess:
build: ./preprocess
networks:
- my_network
db:
build: ./db
environment:
MYSQL_ROOT_PASSWORD: root_password
MYSQL_DATABASE: train_database
MYSQL_USER: my_user
MYSQL_PASSWORD: my_password
networks:
- my_network
volumes:
- db_data:/var/lib/mysql
networks:
my_network:
driver: bridge
volumes:
db_data:
Explanation:
- This YAML file defines a multi-container environment using Docker Compose.
- It specifies services for
app
,train
,preprocess
, anddb
. - Each service is built from its respective Dockerfile.
- The
db
service also configures environment variables for MySQL and mounts a volume for persistent data storage. - All services are connected to a custom network named
my_network
.
This comprehensive setup allows you to build and deploy a pipeline using Docker containers. Each component is modularized and can be scaled or modified independently to suit your requirements.
Feel free to reach out if you have any questions or need further assistance!