Automating ML & systems R&D
We started adding the following CK modules and actions with a unified API and I/O.
Platform and environment detection
These CK modules automate and unify the detection of different properties of user platforms and environments.
- module:os [API] [components]
- module:platform [API]
- module:platform.os [API]
- module:platform.cpu [API]
- module:platform.gpu [API]
- module:platform.gpgpu [API]
- module:platform.nn [API]
```
ck pull repo:octoml@mlops
ck detect platform
ck detect platform.gpgpu --cuda
```
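For intuition, the kind of information these detection modules collect can be sketched in a few lines of Python using the standard library (the keys below are illustrative, not CK's actual schema):

```python
import platform

def detect_platform():
    """Collect basic platform properties, similar in spirit to what
    CK's platform.* detection modules record (keys here are illustrative,
    not CK's actual output format)."""
    return {
        "os": {
            "name": platform.system(),           # e.g. "Linux", "Windows"
            "release": platform.release(),
            "bits": platform.architecture()[0],  # e.g. "64bit"
        },
        "cpu": {
            "arch": platform.machine(),          # e.g. "x86_64", "arm64"
        },
        "python": platform.python_version(),
    }

info = detect_platform()
print(info["os"]["name"], info["cpu"]["arch"])
```

CK's real modules go much further (GPU/GPGPU probing, NN-specific properties), but the principle is the same: gather platform facts once, in a unified dictionary, so that later workflow steps can adapt to them.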
The module:soft CK module automates the detection of given software and files (datasets, models, libraries, compilers, frameworks, tools, scripts) on a given platform using CK names, UIDs, and tags. Detecting what is already installed helps CK understand a user's platform and environment and prepare portable workflows:
```
ck detect soft:compiler.python
ck detect soft --tags=compiler,python
ck detect soft:compiler.llvm
ck detect soft:compiler.llvm --target_os=android23-arm64
```
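Under the hood, a detection plugin essentially searches the platform for matching artifacts. Here is a minimal, hypothetical analogue in Python (real CK plugins also extract versions, support cross-platform targets, and register environments):

```python
import shutil

def detect_soft(candidates):
    """Look for matching executables on PATH -- a toy analogue of
    `ck detect soft`. The candidate names are supplied by a plugin;
    the list below is made up for illustration."""
    found = {}
    for name in candidates:
        path = shutil.which(name)  # returns None when not on PATH
        if path:
            found[name] = path
    return found

# Hypothetical candidate list for a Python detection plugin.
print(detect_soft(["python3", "python"]))
```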
- module:env [API]
Whenever given software or files are found by the software detection plugins, CK creates a new “env” component in the local CK repository with an env.sh (Linux/macOS) or env.bat (Windows) script.
This environment file sets multiple environment variables with unique names (usually starting with CK_) containing automatically detected information about the given software, such as its version and the paths to its sources, binaries, include files, libraries, etc.
This makes it possible to detect and use multiple versions of different software that co-exist on the same system in parallel.
```
ck detect soft:compiler.python
ck detect soft --tags=compiler,python
ck detect soft:compiler.llvm
ck show env
ck show env --tags=compiler
ck show env --tags=compiler,llvm
ck show env --tags=compiler,llvm --target_os=android23-arm64
ck virtual env --tags=compiler,python
```
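The generated environment scripts follow a simple pattern: one variable per detected property, named so that different versions never collide. A sketch of how such a script could be produced (the variable names are made up for illustration; CK's actual env components define their own sets):

```python
def make_env_script(soft_name, version, paths, windows=False):
    """Emit an environment script in the style of CK's env components:
    one uniquely named CK_-prefixed variable per detected property.
    Variable names here are illustrative, not CK's real ones."""
    prefix = "set" if windows else "export"   # env.bat vs env.sh
    var_base = "CK_" + soft_name.upper().replace(".", "_")
    lines = [f"{prefix} {var_base}_VERSION={version}"]
    for key, value in paths.items():
        lines.append(f"{prefix} {var_base}_{key.upper()}={value}")
    return "\n".join(lines) + "\n"

script = make_env_script("compiler.llvm", "12.0.0",
                         {"bin": "/usr/bin", "include": "/usr/include"})
print(script)
```

Because each detected installation gets its own env component, several versions of the same tool can be selected independently by tags.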
When a given piece of software is not detected on the system, we usually want to install the related package, possibly in several versions.
That's why we have developed the module:package CK module, which can automate the installation of missing packages (models, datasets, tools, frameworks, compilers, etc.).
It acts as a meta package manager, providing a unified API to automatically download, build, and install packages for a given target (including mobile and edge devices) on top of existing build tools and package managers.
All of the above modules can now support portable workflows that automatically adapt to a given environment based on their software dependencies.
```
ck install package --tags=lib,tflite,v2.1.1
ck install package --tags=tensorflowmodel,tflite,edgetpu
```
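The core of this tag-based resolution can be illustrated with a small sketch (the catalog entries and keys below are hypothetical; real CK packages also handle versions, variations, and target platforms):

```python
def resolve_package(packages, required_tags):
    """Return the packages whose tags include all required tags -- a toy
    version of the tag matching behind `ck install package --tags=...`."""
    required = set(required_tags)
    return [p for p in packages if required <= set(p["tags"])]

catalog = [  # hypothetical catalog entries
    {"name": "lib-tflite-2.1.1", "tags": ["lib", "tflite", "v2.1.1"]},
    {"name": "lib-tflite-1.15",  "tags": ["lib", "tflite", "v1.15"]},
]
print(resolve_package(catalog, ["lib", "tflite", "v2.1.1"]))
# matches only lib-tflite-2.1.1
```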
See an example of variations to customize a given package: lib-tflite.
We also provided an abstraction for ad-hoc scripts:
See an example of the CK component with a script used for MLPerf™ benchmark submissions: GitHub
Portable program pipeline (workflow)
Next we have implemented a CK module to provide a common API to compile, run, and validate programs while automatically adapting to any platform and environment:
A user describes dependencies on CK packages in the CK program meta as well as commands to build, pre-process, run, post-process, and validate a given program.
```
ck pull repo:octoml@mlops
ck compile program:image-corner-detection --speed
ck run program:image-corner-detection --repeat=1 --env.OMP_NUM_THREADS=4
```
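The program meta can be thought of as a declaration of dependencies plus per-stage commands. A simplified, hypothetical example, with a helper that checks whether detected environments satisfy the declared dependencies (CK's real meta format uses different keys):

```python
# A hypothetical CK program meta: declare soft dependencies by tags
# plus the commands for each pipeline stage. Keys are illustrative.
program_meta = {
    "deps": {
        "compiler": {"tags": "compiler,gcc"},
    },
    "commands": {
        "build": "gcc -O3 corner.c -o corner",
        "run": "./corner image.pgm",
    },
}

def missing_deps(meta, detected_env_tags):
    """Return the names of deps whose required tags are not satisfied
    by any detected environment (each env given as a list of tags)."""
    missing = []
    for name, dep in meta["deps"].items():
        need = set(dep["tags"].split(","))
        if not any(need <= set(env) for env in detected_env_tags):
            missing.append(name)
    return missing

print(missing_deps(program_meta, [["compiler", "gcc", "v9"]]))  # []
```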
We have also developed an abstraction to record and replay experiments using the module:experiment CK module.
It records all resolved dependencies, inputs, and outputs when running the above CK programs, thus preserving experiments with full provenance and allowing them to be replayed later on the same or a different machine:
```
ck benchmark program:image-corner-detection --record --record_uoa=my_experiment
ck find experiment:my_experiment
ck replay experiment:my_experiment
ck zip experiment:my_experiment
```
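Conceptually, an experiment record bundles everything needed for replay: resolved dependencies, inputs, outputs, and the host platform. A minimal sketch, assuming an illustrative schema rather than CK's actual one:

```python
import json
import platform

def record_experiment(deps, inputs, outputs):
    """Serialize a run with its provenance, in the spirit of CK's
    experiment module (the field names here are made up)."""
    return json.dumps({
        "platform": {"os": platform.system(), "arch": platform.machine()},
        "resolved_deps": deps,
        "inputs": inputs,
        "outputs": outputs,
    }, indent=2)

rec = record_experiment(
    deps={"compiler": "llvm-12.0.0"},
    inputs={"OMP_NUM_THREADS": "4", "repeat": 1},
    outputs={"time_s": 0.042},
)
print(rec)
```

Because everything needed to reproduce the run is captured in one self-describing record, it can be archived, shared, and replayed on another machine with the same resolved dependencies.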
Since we can record all experiments in a unified way, we can also visualize them in a unified way. That’s why we have developed a simple web server that can help to create customizable dashboards:
- module:web [API]
See examples of such dashboards:
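To make the idea concrete, here is a minimal, self-contained sketch of such a server using Python's standard http.server (this is not CK's actual web module, and the experiment entries are made up):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

EXPERIMENTS = [  # hypothetical recorded results
    {"name": "my_experiment", "time_s": 0.042},
    {"name": "baseline", "time_s": 0.051},
]

def render_dashboard(experiments):
    """Render recorded experiments as a minimal HTML table -- the kind
    of unified view a dashboard server exposes."""
    rows = "".join(
        f"<tr><td>{e['name']}</td><td>{e['time_s']}</td></tr>"
        for e in experiments
    )
    return ("<table><tr><th>experiment</th><th>time (s)</th></tr>"
            + rows + "</table>")

class Dashboard(BaseHTTPRequestHandler):
    def do_GET(self):
        body = render_dashboard(EXPERIMENTS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

# To serve locally: HTTPServer(("", 8000), Dashboard).serve_forever()
```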
One of our goals for CK was to automate the (re-)generation of reproducible articles. We have validated this possibility in a proof-of-concept project with the Raspberry Pi Foundation.
We plan to develop a GUI to make the process of generating such papers more user friendly!
It is possible to use CK from Jupyter and Colab notebooks. We provided an abstraction to share Jupyter notebooks in CK repositories:
You can see an example of a Jupyter notebook with CK commands to process MLPerf™ benchmark results here.
During the past few years we converted all the workflows and components from our past ML & systems R&D, including the MILEPOST and cTuning.org projects, to the CK format.
There are now 150+ CK modules with actions that automate and abstract many tedious and repetitive tasks in ML & systems R&D, including model training and prediction, universal autotuning, ML/SW/HW co-design, model testing and deployment, paper generation, and more:
- A high level overview of portable CK workflows
- A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques (collaboration with the Raspberry Pi Foundation)
- A summary of main CK-based projects with academic and industrial partners
- cKnowledge.io platform documentation
Don’t hesitate to contact us if you have feedback or want to know more about our plans!