Trying CK

How CK enables portable and customizable workflows

We originally developed CK to help our partners and collaborators implement modular, portable, customizable, and reusable workflows. We needed such workflows to enable collaborative and reproducible ML&systems R&D while focusing on deep learning benchmarking and ML/SW/HW co-design. We also wanted to automate and reuse tedious tasks that are repeated across nearly all ML&systems projects as described in our FOSDEM presentation.

In this section, we demonstrate how to use CK with portable and non-virtualized program workflows that can automatically adapt to any platform and user environment, i.e. automatically detect target platform properties and software dependencies and then compile and run a given program with any compatible dataset and model in a unified way.

Note that this approach also supports our reproducibility initiatives at ML&systems conferences, where portable workflows are shared along with published papers. Our goal is to make it easier for the community to reproduce research techniques, compare them, build upon them, and adopt them in production.

CK installation

Follow this guide to install CK on Linux, MacOS, or Windows. Don’t hesitate to contact us if you encounter any problem or have questions.

Pull CK repositories with the universal program workflow

Now you can pull the CK repository with the universal program workflow:

ck pull repo --url=https://github.com/ctuning/ck-crowdtuning

CK will automatically pull all required CK repositories with different automation actions, benchmarks, and datasets in the CK format. You can see them as follows:

ck ls repo

By default, CK stores all CK repositories in the user space in $HOME/CK-REPOS. However, you can change it using the environment variable CK_REPOS.
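The lookup described above can be mirrored in a few lines of Python, for example in a helper script that needs to locate pulled repositories. This is a minimal sketch, not CK's own code: ck_repos_dir is a hypothetical helper, and it assumes only what the text states (the CK_REPOS variable overrides the $HOME/CK-REPOS default).

```python
import os


def ck_repos_dir(environ=None):
    """Return the directory where CK repositories are expected to live.

    Hypothetical helper (not part of CK): CK_REPOS wins if it is set,
    otherwise the default $HOME/CK-REPOS is used.
    """
    environ = os.environ if environ is None else environ
    override = environ.get('CK_REPOS')
    if override:
        return override
    # Fall back to the user's home directory when HOME is not defined.
    home = environ.get('HOME') or os.path.expanduser('~')
    return os.path.join(home, 'CK-REPOS')


if __name__ == '__main__':
    print(ck_repos_dir())
```

Passing a plain dictionary instead of the real environment makes the helper easy to check without touching your shell settings.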

Manage CK entries

You can now see all shared program workflows in the CK format:

ck ls program

You can find and investigate the CK format for a given program (such as cbench-automotive-susan) as follows:

ck find program:cbench-automotive-susan

You can see the CK meta description of this program from the command line as follows:

ck load program:cbench-automotive-susan
ck load program:cbench-automotive-susan --min

It may be more convenient to check the structure of this entry on GitHub with all the sources and meta-descriptions.

You can also see the CK JSON meta description for this CK program entry here. When you invoke automation actions in the CK module program, the automation code will read this meta description and perform actions for different programs accordingly.

Invoke CK automation actions

You can now try to compile this program on your platform:

ck compile program:cbench-automotive-susan --speed

CK will invoke the function “compile” in the module “program” (you can see it on GitHub or find the source code of this CK module locally using “ck find module:program”), read the JSON meta of cbench-automotive-susan, and perform the requested action.
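To make this dispatch mechanism more concrete, here is a toy model of it in Python. It is only an illustration under stated assumptions: ToyProgramModule, MODULES, toy_access, and the source_files meta key are hypothetical names chosen for this sketch, and the real CK kernel and the real module “program” do considerably more (dependency resolution, compiler invocation, and so on).

```python
import json


class ToyProgramModule:
    """Stand-in for a CK module; each public method is one automation action."""

    def compile(self, request, meta):
        # A real module would resolve dependencies and call the compiler here.
        return {'return': 0,
                'compiled': request['data_uoa'],
                'source_files': meta.get('source_files', [])}


# Registry of toy modules keyed by module name (hypothetical, for illustration).
MODULES = {'program': ToyProgramModule()}


def toy_access(request, meta_json):
    """Dispatch request['action'] to the module named in request['module_uoa']."""
    module = MODULES.get(request.get('module_uoa'))
    if module is None:
        return {'return': 1, 'error': 'unknown module'}
    name = request.get('action', '')
    action = None if name.startswith('_') else getattr(module, name, None)
    if not callable(action):
        return {'return': 1, 'error': 'unknown action'}
    meta = json.loads(meta_json)  # the entry's JSON meta description
    return action(request, meta)


if __name__ == '__main__':
    print(toy_access({'action': 'compile', 'module_uoa': 'program',
                      'data_uoa': 'cbench-automotive-susan'},
                     '{"source_files": ["susan.c"]}'))
```

The point of the sketch is the shape of the call: the action name selects a function, the module name selects where to look for it, and the JSON meta of the entry is passed along so that one function can serve many programs.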

Note that you can obtain all flags for a given action as follows:

ck compile program --help

You can update any of the above keys from the command line by prefixing it with “--”. If you omit the value, CK will use “yes” by default.
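The relation between command-line flags and keys can be sketched in Python. The function below is a simplified illustration, not CK's real parser: cli_to_request is a hypothetical name, values are kept as strings, and the nesting of dotted keys is inferred from the Python API example later in this guide, where --env.OMP_NUM_THREADS=4 corresponds to a nested “env” dictionary.

```python
def cli_to_request(argv):
    """Convert ['compile', 'program:name', '--speed'] into a request dict.

    Simplified illustration only: values stay strings and no validation
    is performed.
    """
    request = {'action': argv[0]}
    for arg in argv[1:]:
        if arg.startswith('--'):
            key, sep, value = arg[2:].partition('=')
            if not sep:
                value = 'yes'  # an omitted value defaults to "yes"
            target = request
            parts = key.split('.')
            for part in parts[:-1]:  # "env.X" becomes {'env': {'X': ...}}
                target = target.setdefault(part, {})
            target[parts[-1]] = value
        else:
            # "module:entry" names the module and, optionally, the entry.
            module, sep, data = arg.partition(':')
            request['module_uoa'] = module
            if sep:
                request['data_uoa'] = data
    return request


if __name__ == '__main__':
    print(cli_to_request(['compile', 'program:cbench-automotive-susan', '--speed']))
```

Keeping the conversion in one small function makes it easy to see that a bare flag such as --speed and an explicit --speed=yes produce the same key.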

When compiling a program, CK will first attempt to automatically detect the properties of the platform and all required software dependencies, such as compilers and libraries, that are already installed on this platform. CK uses multiple plugins describing how to detect different software, models, and datasets.

Users can add their own plugins either in their own CK repositories or in already existing ones.

You can also perform software detection manually from the command line. For example, you can detect all installed GCC or LLVM versions:

ck detect soft:compiler.gcc
ck detect soft:compiler.llvm

Detected software is registered in the local CK repository together with the automatically generated environment script (env.sh or env.bat) specifying different environment variables for this software (paths, versions, etc).

You can list registered software as follows:

ck show env
ck show env --tags=compiler

You can use CK as a virtual environment similar to venv and Conda:

ck virtual env --tags=compiler,gcc

This approach allows us to separate CK workflows from hardwired dependencies and automatically plug in the required ones.

You can now run this program as follows:

ck run program:cbench-automotive-susan

While running the program, CK will collect and unify various characteristics (execution time, code size, etc). This enables unified benchmarking that can be reused across different programs, datasets, models, and platforms. Furthermore, we can continue improving this universal program workflow to monitor CPU/GPU frequencies, perform statistical analysis of collected characteristics, validate outputs, etc:

ck benchmark program:cbench-automotive-susan --repetitions=4 --record --record_uoa=ck_entry_to_record_my_experiment
ck replay experiment:ck_entry_to_record_my_experiment

Note that CK programs can automatically plug different datasets from CK entries that can be shared by different users in different repos (for example, when publishing a new paper):

ck search dataset
ck search dataset --tags=jpeg

Our goal is to help researchers reuse this universal CK program workflow instead of rewriting complex infrastructure from scratch in each research project.

Install missing packages

Note that if a given software dependency is not resolved, CK will attempt to automatically install it using CK meta packages (see the list of shared CK packages at cKnowledge.io). Such meta packages contain JSON meta information and scripts to install and potentially rebuild a given package for a given target platform while reusing existing build tools and native package managers if possible (make, cmake, scons, spack, python-poetry, etc). Furthermore, the CK package manager can also install non-software packages, including ML models and datasets, while ensuring compatibility between all components for portable workflows!

You can list CK packages available on your system (CK will search for them in all CK repositories installed on your system):

ck search package --all

You can then try to install a given LLVM on your system as follows:

ck install package --tags=llvm,v10.0.0

If this package is successfully installed, CK will also create an associated CK environment:

ck show env --tags=llvm,v10.0.0

By default, all packages are installed in the user space ($HOME/CK-TOOLS). You can change this path using the CK environment variable CK_TOOLS. You can also ask CK to install packages inside CK virtual environment entries directly as follows:

ck set kernel var.install_to_env=yes

Note that you can now detect or install multiple versions of the same tool on your system that can be picked up and used by portable CK workflows!

You can run a CK virtual environment to use a given version as follows:

ck virtual env --tags=llvm,v10.0.0

You can also run multiple virtual environments at once to combine different versions of different tools together:

ck show env
ck virtual env {UID1 from above list} {UID2 from above list} ...

Another important goal of CK is to invoke all automation actions and portable workflows across all operating systems and environments, including Linux, Windows, MacOS, and Android. You can retarget your workflow for Android by adding the --target_os=android23-arm64 flag to all of the above commands when installing packages or compiling and running your programs. The idea is to have a unified interface for all research techniques and artifacts shared along with research papers to make onboarding easier for the community!

Participate in crowd-tuning

You can even participate in crowd-tuning of multiple programs and data sets across diverse platforms:

ck crowdtune program:cbench-automotive-susan
ck crowdtune program

You can see the live scoreboard with optimizations here.

Use the CK Python API

You can also run CK automation actions directly from Python (2.7+ or 3.3+) using the single ck.access function:

import ck.kernel as ck

# Equivalent of "ck compile program:cbench-automotive-susan --speed"
r=ck.access({'action':'compile', 'module_uoa':'program', 'data_uoa':'cbench-automotive-susan', 
             'speed':'yes'})
if r['return']>0: ck.err(r) # unified error handling: print the error and exit

print (r)

# Equivalent of "ck run program:cbench-automotive-susan --env.OMP_NUM_THREADS=4"
r=ck.access({'action':'run', 'module_uoa':'program', 'data_uoa':'cbench-automotive-susan', 
             'env':{'OMP_NUM_THREADS':4}})
if r['return']>0: ck.err(r) # unified error handling: print the error and exit

print (r)
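If you prefer exceptions to checking the return code after every call, the unified error handling shown above can be wrapped in a small helper. This is a sketch under stated assumptions: CKError and checked are hypothetical names that are not part of CK, and the helper assumes that a failed call carries a positive 'return' code together with a human-readable 'error' string.

```python
class CKError(RuntimeError):
    """Raised when a CK-style result dictionary reports a failure."""


def checked(result):
    """Return the result unchanged on success, otherwise raise CKError.

    Assumes the convention used above: result['return'] is 0 on success
    and greater than 0 on failure, with the message in result['error'].
    """
    code = result['return']
    if code > 0:
        message = result.get('error', 'no error message provided')
        raise CKError('CK action failed (%d): %s' % (code, message))
    return result


if __name__ == '__main__':
    # Dictionaries stand in for real ck.access results in this demo.
    print(checked({'return': 0, 'note': 'ok'}))
    try:
        checked({'return': 1, 'error': 'something went wrong'})
    except CKError as exc:
        print(exc)
```

With CK installed, a call would then read r = checked(ck.access({...})), so a failing action stops the script at the point of failure instead of propagating a dictionary that still has to be inspected.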

Try the CK ML workflow

You can now try a more complex example with TensorFlow. You should pull the related CK repository and install the prebuilt version of TensorFlow CPU via CK:

ck pull repo:ck-tensorflow
ck install package --tags=lib,tensorflow,vcpu,vprebuilt

Check that it was successfully installed:

ck show env --tags=lib,tensorflow

You can find a path to a given entry describing this TF installation as follows:

ck find env:{env UID from above list}

Run the CK virtual environment and test TF:

ck virtual env --tags=lib,tensorflow
ipython
> import tensorflow as tf
>

You can try to run the CK image classification workflow example using the installed TF:

ck run program:tensorflow --cmd_key=classify

You can even try to rebuild TensorFlow via CK for your platform with CUDA:

ck install package:lib-tensorflow-1.7.0-cuda

CK will attempt to detect your CUDA compiler and related libraries and tools, including Java and Bazel, and will then try to rebuild TF. Note that you may still need to install some extra dependencies yourself as described in this readme.

You can also try to run ML workflows from the MLPerf benchmarking initiative using this CK MLPerf repository.

Finally, you can try our recent MLPerf automation demo to automate submissions and validations of MLPerf results.

Further information

As you may notice, CK helps to convert ad-hoc research projects into a unified database of reusable components with common automation actions and unified meta descriptions. The goal is to promote artifact sharing and reuse while gradually substituting and unifying all tedious and repetitive research tasks!

You can find shared CK repositories, components, automation actions, and live scoreboards at the open cKnowledge.io platform.

You can also check how the universal CK program workflow was successfully reused in different projects, including the ACM ReQuEST tournaments to collaboratively co-design the SW/HW stack for deep learning (Report about results of the 1st ReQuEST-ASPLOS’18 tournament and next steps and ACM ReQuEST-ASPLOS’18 proceedings with artifact descriptions) and reproducible quantum tournaments.

Finally, check this guide to learn how to add your own repositories, workflows, and components!

Contact the CK community

If you encounter problems or have suggestions, do not hesitate to contact us!