IDOML server configuration repository
Welcome to the IDOML server repository!
In this repository you will find the configuration files to deploy the IDOML server. The IDOML server is a docker-compose based deployment of the following services:
- Airflow
- Git-sync
- Minio
- Keycloak
- Traefik
Requirements
- Hardware Requirements:
  - A Linux server with at least 8 GB of RAM and 2 CPU cores is required to run the platform services.
  - Note that this requirement excludes the resources needed to host a JupyterHub server and to execute machine learning tasks. The resources for JupyterHub hosting depend on the number of users and the size of the data they handle. Similarly, the resources for machine learning tasks depend on task complexity and data size. To scale up for ML tasks, host additional servers that run Airflow workers.
- To deploy the IDOML server, ensure your system meets the following requirements:
  - Docker: IDOML utilizes Docker for deployment. Refer to the official Docker documentation for installation instructions.
  - Docker-compose: Docker-compose is required for orchestrating the deployment process. Follow the installation instructions provided in the official Docker-compose documentation.
Info
Please be sure to create a group named docker and add the current user to this group. This is necessary to avoid permission issues when running Docker commands.
- Clone the IDOML server repository:
- Before deploying IDOML, update the .env file with the necessary configurations:
  - Domain name Configuration:
    You are expected to have a custom domain name. Point all of its subdomains to the server's IP address, then set the IDOML_DOMAIN variable in the .env file. This domain is used to access the services deployed on the server.
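A minimal sketch of how the IDOML_DOMAIN variable could be set from the command line. The helper name `set_env_var` and the domain `example.org` are illustrative assumptions, not part of this repository; editing the .env file by hand works just as well.

```shell
# set_env_var FILE KEY VALUE
# Replace the KEY=... line in FILE, or append it if KEY is not defined yet.
# Hypothetical helper; values containing '|' or '&' would need escaping.
set_env_var() {
  file=$1 key=$2 value=$3
  touch "$file"
  if grep -q "^${key}=" "$file"; then
    # Rewrite the existing line through a temp file (portable across sed variants).
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (example.org is a placeholder domain):
# set_env_var .env IDOML_DOMAIN example.org
```

Replacing the line instead of appending keeps the variable defined only once, so the result does not depend on which duplicate a tool reads first.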
Deploying on localhost
If you do not have a custom domain name, you can keep the default domain name, which is a subdomain of localhost. The services can then be accessed from the server itself.
Manual domain name configuration
If you don't have a domain name, but you want to deploy the IDOML platform on a remote server, you can manually configure the /etc/hosts file on your local machine. Suppose that the remote server IP address is a.b.c.d. In this case, you can add the following line to the /etc/hosts file on your local machine:
a.b.c.d idoml.com keycloak.idoml.com airflow.idoml.com dashboard.idoml.com minio.idoml.com console.minio.idoml.com jupyterhub.idoml.com idoml.mlflow.idoml.com
In this example, the IDOML_DOMAIN should be set to idoml.com.
This entry must be added on both the local machine where the browser is running and the server hosting the IDOML platform. This configuration is only for testing purposes and should not be used in a production environment.
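The hosts entry can also be generated rather than typed by hand. The sketch below assumes the subdomain list is exactly the one shown in the example above; the function name `idoml_hosts_line` is hypothetical.

```shell
# idoml_hosts_line IP DOMAIN
# Print one /etc/hosts line mapping DOMAIN and its service subdomains to IP.
# The subdomain list is copied from the example entry above (an assumption
# about which hostnames the platform needs).
idoml_hosts_line() {
  ip=$1 domain=$2
  line="$ip $domain"
  for sub in keycloak airflow dashboard minio console.minio jupyterhub idoml.mlflow; do
    line="$line $sub.$domain"
  done
  printf '%s\n' "$line"
}

# Example: append the entry on a machine (requires root privileges):
# idoml_hosts_line a.b.c.d idoml.com | sudo tee -a /etc/hosts
```

Generating the line from a single IP and domain makes it easier to keep the entry identical on the local machine and on the server.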
  - Credentials setup:
    Please update the credential settings section of the .env file.
  - User ID Configuration:
    To ensure proper permissions, the current user ID needs to be passed to the Docker-compose file for Airflow. According to the official Airflow documentation, the user should be in the root group to access the required folders.
Run the following command to update the .env file:
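A minimal sketch of this step, assuming the variable is named AIRFLOW_UID as in the official Airflow Docker Compose setup. Whether this repository's .env file uses the same name is an assumption; the function name `write_airflow_uid` is hypothetical.

```shell
# write_airflow_uid [FILE]
# Record the current user's numeric ID as AIRFLOW_UID in FILE (default: .env).
# AIRFLOW_UID is the name used by the official Airflow compose file; adjust it
# if the .env template in this repository names the variable differently.
write_airflow_uid() {
  env_file=${1:-.env}
  touch "$env_file"
  # Drop any stale entry first so the variable is defined only once.
  grep -v '^AIRFLOW_UID=' "$env_file" > "$env_file.tmp" || true
  mv "$env_file.tmp" "$env_file"
  echo "AIRFLOW_UID=$(id -u)" >> "$env_file"
}

# Example, run from the root of the project folder:
# write_airflow_uid .env
```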
  - Docker Group ID Configuration:
    The Docker group ID must be passed to the Docker-compose file for Airflow to enable the Docker operator.
Run the following command to update the .env file:
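One way to look up the Docker group ID is sketched below. It reads the local group file, so it assumes the docker group is defined in /etc/group; DOCKER_GID is a hypothetical variable name, and the name expected by this repository's .env file may differ.

```shell
# group_gid NAME [GROUP_FILE]
# Print the numeric ID of group NAME, read from GROUP_FILE (default: /etc/group).
group_gid() {
  awk -F: -v name="$1" '$1 == name { print $3; exit }' "${2:-/etc/group}"
}

# Example (DOCKER_GID is a hypothetical variable name):
# echo "DOCKER_GID=$(group_gid docker)" >> .env
```

On systems where groups come from a directory service rather than /etc/group, `getent group docker | cut -d: -f3` is an alternative lookup.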
Tip: Ensure that the user is in the root and docker groups
Check if the user is in the root and docker groups by running the following commands:
If the user is not in the root and docker groups, add the user to these groups by running the following commands:
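The check and the fix can be sketched as follows. The function names are hypothetical, and the second function only prints the `usermod` commands instead of running them, so that nothing changes without review.

```shell
# in_group USER GROUP
# Succeed if USER is a member of GROUP.
in_group() {
  id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# missing_group_cmds USER
# Print a usermod command for each of the root and docker groups that USER
# does not belong to yet. Nothing is executed.
missing_group_cmds() {
  for g in root docker; do
    in_group "$1" "$g" || echo "sudo usermod -aG $g $1"
  done
}

# Example:
# id -nG "$USER"              # list the groups of the current user
# missing_group_cmds "$USER"  # show what still needs to be run
```

After running `usermod`, log out and back in (or start a new login session) so the new group membership takes effect.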
- Create a Git repository for the Airflow DAGs, which the platform will monitor. Start from an empty Git repository.
  - If you opt for a public repository, update the .env file with the repository URL and branch name.
  - If you prefer a private repository, use an SSH connection for the repository. For instance:
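As an illustration of the SSH form of a repository URL, the sketch below rewrites an HTTPS clone URL into its `git@host:owner/repo.git` equivalent. The function name and the example repository are hypothetical; the name of the .env variable that holds the URL is not shown here because it depends on this repository's .env template.

```shell
# to_ssh_url URL
# Rewrite https://HOST/OWNER/REPO(.git) as git@HOST:OWNER/REPO.git.
# URLs that are already in SSH form are passed through unchanged.
to_ssh_url() {
  printf '%s\n' "$1" |
    sed -E 's|^https://([^/]+)/(.+)$|git@\1:\2|; s|(\.git)?$|.git|'
}

# Example (hypothetical repository):
# to_ssh_url https://github.com/example-org/dags
```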
As we are using SSH for the private repository, we need to create an SSH key pair and add the public key to the repository's deploy keys. This deploy key does not require write access to the repository. Additionally, the host of the repository must be added to the known hosts. This can be achieved by following the steps below:
- Create an SSH key pair:
- Add the SSH key to the repository's deploy keys:
- Add the repository to the known hosts:
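The steps above can be sketched as follows. The function names, key path, and key comment are assumptions for illustration; where the git-sync service expects the key and known hosts files is defined by this repository's docker-compose.yaml.

```shell
# make_deploy_key KEY_PATH
# Create an ed25519 key pair without a passphrase at KEY_PATH and KEY_PATH.pub.
# KEY_PATH must not exist yet, otherwise ssh-keygen asks before overwriting.
make_deploy_key() {
  key_path=$1
  mkdir -p "$(dirname "$key_path")"
  ssh-keygen -q -t ed25519 -N "" -C "idoml-git-sync" -f "$key_path"
}

# record_host_key HOST KNOWN_HOSTS_FILE
# Append the ed25519 host key of HOST to KNOWN_HOSTS_FILE (needs network access).
record_host_key() {
  ssh-keyscan -t ed25519 "$1" >> "$2"
}

# Example (paths and host are placeholders):
# make_deploy_key ./ssh/id_ed25519
# cat ./ssh/id_ed25519.pub        # paste this into the repository's deploy keys
# record_host_key github.com ./ssh/known_hosts
```

Before trusting the scanned host key, compare its fingerprint with the one published by the Git hosting provider.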
Finally, in the docker-compose.yaml file, uncomment the following environment variables of the git-sync service:
Deployment
Once the requirements are met, the IDOML server can be deployed with a single command, run at the root of the project folder:
Note
As some Docker images are still private, please ensure before deploying the platform that you have access to the packages from the Serval GitHub and that you are logged in to the GitHub Docker registry.
Try it out
The platform is now accessible via the domain name you have set up. The IDOML dashboard can be accessed at the following URL:
http://dashboard.{IDOML_DOMAIN}
Where {IDOML_DOMAIN} is the domain name you have set up previously in the .env file.
Note
The platform is still under development, and we are continuously adding new features. Currently, the dashboard asks you to create an admin user on first access. This account will not be used, since the platform relies on Keycloak for user authentication. The actual admin user is created in Keycloak and assigned the roles needed to access the IDOML platform.
Next Steps
Now that the IDOML server is up and running, the next step is to create an admin user to access the platform. To do this, refer to the instructions provided in the add user documentation.
Stopping and removing the platform
To stop the platform, run the following command at the root of the project folder:
To remove the platform, run the following command at the root of the project folder:
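A minimal sketch of both operations, assuming the platform is managed directly with Docker Compose from the project root. The wrapper name `idoml_compose` and the `DRY_RUN` switch are hypothetical; if the repository ships its own scripts for these steps, prefer those.

```shell
# idoml_compose ARGS...
# Run "docker compose ARGS..." in the current directory.
# With DRY_RUN set to a non-empty value, only print the command.
idoml_compose() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "docker compose $*"
  else
    docker compose "$@"
  fi
}

# Stop the containers but keep them, so they can be started again later:
# idoml_compose stop
# Remove the containers and networks created by the compose file:
# idoml_compose down
```

Note that `docker compose down` keeps named volumes by default; adding `-v` also deletes them, which would erase stored data.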