Competition


Expect More from Your Experiment Tracker


The AIQC framework provides teams with a standardized methodology that trains better algorithms in less time. The secret sauce of the AIQC backend is that it is not only data-aware (e.g. folds, encoders, dtypes) but also analysis-aware (e.g. supervision, cardinality).

Data-Aware
Users define the transformations (e.g. encoding, stratification, walk-forward, etc.) they want to apply to their dataset. Then AIQC automatically coordinates the data wrangling of each split/fold during both the pre- and post-processing stages of analysis.
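
As a rough illustration of the wrangling being automated, here is a minimal plain scikit-learn sketch (not AIQC's API): the encoder is fit on the training split alone and merely applied to the validation & test splits, which is exactly the per-split bookkeeping AIQC handles for you.

```python
# Minimal sketch of leakage-safe, per-split encoding (plain scikit-learn,
# not AIQC's API). Fitting the scaler on the training split only prevents
# validation/test statistics from leaking into training.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.random.rand(100, 4)
y = np.random.rand(100)

# Carve out test and validation splits.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.18)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit on training data only
X_val = scaler.transform(X_val)          # reuse the training statistics
X_test = scaler.transform(X_test)
```
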
Analysis-Aware
Users define model components (e.g. build, train, optimize, loss, etc.), hyperparameters, and an analysis type. Then AIQC automatically trains & evaluates every model with metrics & charts for each split/fold. It also handles decoding & inference.
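
The evaluation work being automated here is the familiar per-fold loop. A hypothetical sketch in plain scikit-learn (again, not AIQC's API), where every fold trains its own model and reports its own metrics:

```python
# Per-fold training & evaluation: the loop AIQC runs and charts automatically.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

for fold, (train_idx, val_idx) in enumerate(StratifiedKFold(n_splits=5).split(X, y)):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    preds = model.predict(X[val_idx])
    print(f"fold={fold} accuracy={accuracy_score(y[val_idx], preds):.3f} "
          f"f1={f1_score(y[val_idx], preds):.3f}")
```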

This declarative approach results in significant time savings; it's like Terraform for MLOps. By simplifying data wrangling and model evaluation, AIQC makes it easy for practitioners to include validation splits/folds in their workflow, which, in turn, helps train more generalizable models by preventing evaluation bias & overfitting.



|                       | AIQC                               | MLflow                  | WandB                   | Lightning (Complementary)           |
|-----------------------|------------------------------------|-------------------------|-------------------------|-------------------------------------|
| Local Setup           | Automatic                          | Manual (DB config)      | Manual (Docker config)  | N/A (relies on Grid, MLflow, WandB) |
| Preprocess            | Declarative                        | Manual                  | Manual                  | Manual                              |
| Log                   | Automatic                          | Manual (log() function) | Manual (log() function) | N/A (relies on Grid, MLflow, WandB) |
| Evaluate              | Automatic                          | Manual                  | Manual                  | Manual                              |
| Decode                | Automatic                          | Manual                  | Manual                  | Manual                              |
| UI                    | Dashboard                          | Tracking Server         | Licensed                | Licensed                            |
| Scale (if commercial) | Vertical (parallel is challenging) | Databricks              | WandB (parallel sweeps) | Distributed & Grid                  |


While AIQC actively helps structure the analysis, alternative tools take a more passive approach: they expect users to manually prepare their own data and log their own training artifacts. They can't assist with the actual data science workflow because they know about neither the data involved nor the analysis being conducted. Many supposed "MLOps" tools are really batch execution schedulers marketed to data science teams.

PyTorch Lightning solves the challenge of distributed GPU training more elegantly than Horovod. It would be a great way to scale AIQC. But does 80% of the market need distributed jobs? Do they even need a GPU in the first place?
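
The elegance in question is that distribution becomes a Trainer argument rather than a code rewrite. A minimal sketch, assuming a machine with 4 GPUs; `BoringModel` here is just a stand-in for any LightningModule:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

data = DataLoader(TensorDataset(torch.rand(64, 4), torch.rand(64, 1)), batch_size=16)

# Single machine, multiple GPUs, distributed data-parallel: one argument change.
trainer = pl.Trainer(max_epochs=1, accelerator="gpu", devices=4, strategy="ddp")
trainer.fit(BoringModel(), data)
```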

MLflow has a nice user interface, but all it shows you is the fruit of your data wrangling. For example, even if you were conducting transfer learning with pretrained models, you'd still have to do all of the preprocessing and post-processing by hand. In June of 2022, they released a regression pipeline that shows they are starting to take the same type of approach as AIQC's low-level API, which is hugely validating. However, they have 2+ years' worth of work ahead of them when it comes to pre/post-processing, multi-dimensional data types, and unsupervised analysis.
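
To make the contrast concrete: MLflow's tracking is imperative, so the user decides what to log and calls the logging functions by hand. These are real mlflow calls, but the parameters and metric values below are placeholders:

```python
import mlflow

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("hidden_units", 64)
    # ...train and evaluate the model by hand, then log whatever was computed...
    mlflow.log_metric("val_loss", 0.42)
    mlflow.log_metric("val_accuracy", 0.87)
```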




AIQC takes pride in automating thorough solutions to tedious challenges such as: (1) evaluation bias, (2) data leakage, (3) multivariate decoding, and (4) continuous stratification, no matter how many folds and dimensions are involved.
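
As one example of these chores, consider continuous stratification: a continuous target can't be stratified directly, so it is typically binned into quantiles and the bins are used to balance the folds. A minimal sketch of that technique (illustrative, not AIQC's internal code):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

y = np.random.lognormal(size=200)      # skewed continuous target
bins = pd.qcut(y, q=5, labels=False)   # quantile bins preserve the distribution
X = np.random.rand(200, 3)

for train_idx, test_idx in StratifiedKFold(n_splits=4).split(X, bins):
    # Each fold's target distribution now mirrors the full dataset's.
    print(f"fold median: {np.median(y[test_idx]):.3f}")
```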

Refer to our blogs on Towards Data Science (aiqc.medium.com) for more details.