With Docker image

We recommend using a Docker image to manage pipeline dependencies. Packaging a machine learning (ML) project environment as a Docker image helps ensure reproducibility, portability, and consistency across development and production environments. The following step-by-step guide shows how to create a Docker image for your ML project.

  1. Install Docker

    IDOML provides neither an environment for building Docker images nor a Docker registry, so you need to install Docker on your local machine. You can download Docker from the official Docker website. You also need an account on a public Docker registry to push the image. Public registries you can use include Docker Hub and GitHub Container Registry.
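
    Before moving on, it can help to confirm that the Docker CLI is installed and reachable from your shell. The sketch below is one way to check. The variable name docker_status is only illustrative, and the login commands are left as comments because they prompt for credentials.

```shell
# Check whether the docker CLI is on PATH.
if command -v docker >/dev/null 2>&1; then
    docker --version
    docker_status="found"
else
    echo "docker not found on PATH; install Docker first"
    docker_status="missing"
fi

# Once Docker is installed, log in to the registry you plan to push to:
#   docker login            # Docker Hub
#   docker login ghcr.io    # GitHub Container Registry
```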

  2. Create a requirements.txt File

    List all the Python dependencies your project needs in a requirements.txt file. Here’s an example requirements.txt file:

    numpy==1.21.0
    pandas==1.3.0
    scikit-learn==0.24.2
    tensorflow==2.5.0
    matplotlib==3.4.2
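
    Pinning exact versions, as in the example above, is what makes the image reproducible. As a rough sketch, the following shell snippet counts requirement lines that lack an exact == pin. It writes the example file to a temporary path purely for illustration; in practice you would point req_file at your own requirements.txt.

```shell
# Write the example requirements to a temporary file (illustrative only).
req_file="$(mktemp)"
cat > "$req_file" <<'EOF'
numpy==1.21.0
pandas==1.3.0
scikit-learn==0.24.2
tensorflow==2.5.0
matplotlib==3.4.2
EOF

# Drop blank lines and comments, then count lines without an exact '==' pin.
# '|| true' keeps the script going, since grep exits non-zero when the count is 0.
unpinned=$(grep -v -E '^[[:space:]]*(#|$)' "$req_file" | grep -v -c '==' || true)
echo "unpinned requirements: $unpinned"

rm -f "$req_file"
```

    A count of 0 means every listed dependency is pinned to an exact version.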
    
  3. Create a Dockerfile

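    Create a Dockerfile in the root directory of your project. This file contains the instructions to build your Docker image. Here are two example Dockerfiles; use one or the other. The first builds on an IDOML pre-built image, and the second builds on an official Python image and installs the IDOML-related packages itself: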

    # Use an IDOML pre-built image as parent image
    FROM ghcr.io/yulinhuang/idoml-python:3.8
    
    # Copy the requirements file into the container 
    COPY requirements.txt .
    
    # Install any needed packages specified in requirements.txt
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Alternative Dockerfile: use an official Python runtime as a parent image
    FROM python:3.8-slim
    
    # Install curl
    RUN apt-get update && apt-get install -y \
        curl
    
    # Install IDOML-related packages
    RUN pip install --no-cache-dir s3fs boto3 "lxml[html_clean]" mlflow
    
    # Copy the requirements file into the container 
    COPY requirements.txt .
    
    # Install any needed packages specified in requirements.txt
    RUN pip install --no-cache-dir -r requirements.txt
    
  4. Build the Docker Image

    Open a terminal and navigate to the root directory of your project. Run the following command to build the Docker image:

    docker build -t your_dockerhub_username/your_project_image_name .
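
    After the build finishes, a quick smoke test can confirm that the dependencies import correctly inside the image. The following is a minimal sketch, assuming the image's default entrypoint lets you run python directly and that the packages from the example requirements.txt are installed. The variable smoke_result is illustrative.

```shell
IMAGE="your_dockerhub_username/your_project_image_name"

# Only run the smoke test if the image exists locally.
if docker image inspect "$IMAGE" >/dev/null 2>&1; then
    # Import a few of the pinned packages inside a throwaway container.
    if docker run --rm "$IMAGE" python -c "import numpy, pandas, sklearn; print('imports ok')"; then
        smoke_result="ok"
    else
        smoke_result="failed"
    fi
else
    echo "image $IMAGE not found locally; run docker build first"
    smoke_result="skipped"
fi
echo "smoke test: $smoke_result"
```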
    
  5. Push the Docker Image to a Registry

    For the IDOML pipeline to use the custom Docker image, you need to push it to a public registry such as Docker Hub or GitHub Container Registry. Here’s an example of how to push the Docker image to Docker Hub:

    docker push your_dockerhub_username/your_project_image_name
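
    The build command in this guide does not specify a tag, so Docker tags the image as latest. If you would rather reference a fixed build from your pipeline, one option is to add an explicit version tag before pushing. The sketch below assumes a hypothetical version label 0.1.0 and uses an illustrative run helper that only prints each command unless DRY_RUN=0 is set, so you can review the commands first.

```shell
IMAGE="your_dockerhub_username/your_project_image_name"
VERSION="0.1.0"                 # hypothetical version label
IMAGE_REF="$IMAGE:$VERSION"

# Print each command; execute it only when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run docker login                      # authenticate against Docker Hub
run docker tag "$IMAGE" "$IMAGE_REF"  # add the version tag to the local image
run docker push "$IMAGE_REF"          # push the versioned image
```

    If you push a versioned tag like this, use the same full reference, including the tag, when you configure the runtime image.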
    
  6. Configure the Runtime Image with the Elyra Extension

    On the IDOML JupyterHub server, you can configure the runtime image with the Elyra extension as follows:

    Click the Runtime Images tab on the left side of the JupyterHub page, then click the + button to add a new runtime image. Fill in the required fields and click Save & Close to save the configuration. Make sure the Image Name field specifies the image you pushed in the previous step.