Competition


Expect More from Your Experiment Tracker


The AIQC framework provides teams with a standardized methodology that trains better algorithms in less time. The secret sauce of the AIQC backend is that it is not only data-aware (e.g. folds, encoders, dtypes) but also analysis-aware (e.g. supervision, cardinality).

Data-Aware
Users define the transformations (e.g. encoding, stratification, walk-forward, etc.) they want to apply to their dataset. Then AIQC automatically coordinates the data wrangling of each split/fold during both the pre- and post-processing stages of analysis.
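
As a rough illustration of the wrangling being automated, here is a minimal plain scikit-learn sketch (not AIQC's API): the encoder is fit on the training split alone and merely applied to the validation & test splits, which is exactly the per-split bookkeeping AIQC handles for you.

```python
# Minimal sketch of leakage-safe, per-split encoding (plain scikit-learn,
# not AIQC's API). Fitting the scaler on the training split only prevents
# validation/test statistics from leaking into training.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.random.rand(100, 4)
y = np.random.rand(100)

# Carve out test and validation splits.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.18)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit on training data only
X_val = scaler.transform(X_val)          # reuse the training statistics
X_test = scaler.transform(X_test)
```
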
Analysis-Aware
Users define model components (e.g. build, train, optimize, loss, etc.), hyperparameters, and an analysis type. Then AIQC automatically trains & evaluates every model with metrics & charts for each split/fold. It also handles decoding & inference.
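
The evaluation work being automated here is the familiar per-fold loop. A hypothetical sketch in plain scikit-learn (again, not AIQC's API), where every fold trains its own model and reports its own metrics:

```python
# Per-fold training & evaluation: the loop AIQC runs and charts automatically.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

for fold, (train_idx, val_idx) in enumerate(StratifiedKFold(n_splits=5).split(X, y)):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    preds = model.predict(X[val_idx])
    print(f"fold={fold} accuracy={accuracy_score(y[val_idx], preds):.3f} "
          f"f1={f1_score(y[val_idx], preds):.3f}")
```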

This declarative approach results in significant time savings; it's like Terraform for MLOps. By simplifying data wrangling and model evaluation, AIQC makes it easy for practitioners to include validation splits/folds in their workflow, which, in turn, helps train more generalizable models by preventing evaluation bias & overfitting.



|                       | AIQC                               | MLflow                  | WandB                   | Lightning (Complementary)           |
|-----------------------|------------------------------------|-------------------------|-------------------------|-------------------------------------|
| Local Setup           | Automatic                          | Manual (DB config)      | Manual (Docker config)  | N/A (relies on Grid, MLflow, WandB) |
| Preprocess            | Declarative                        | Manual                  | Manual                  | Manual                              |
| Log                   | Automatic                          | Manual (log() function) | Manual (log() function) | N/A (relies on Grid, MLflow, WandB) |
| Evaluate              | Automatic                          | Manual                  | Manual                  | Manual                              |
| Decode                | Automatic                          | Manual                  | Manual                  | Manual                              |
| UI                    | Dashboard                          | Tracking Server         | Licensed                | Licensed                            |
| Scale (if commercial) | Vertical (parallel is challenging) | Databricks              | WandB (parallel sweeps) | Distributed & Grid                  |


While AIQC actively helps structure the analysis, alternative tools take a more passive approach: they expect users to manually prepare their own data and log their own training artifacts. They can't assist with the actual data science workflow because they know about neither the data involved nor the analysis being conducted. Many supposed "MLOps" tools are really batch execution schedulers marketed to data science teams.

PyTorch Lightning solves the challenge of distributed GPU training more elegantly than Horovod. It would be a great way to scale AIQC. But does 80% of the market need distributed jobs? Do they even need a GPU in the first place?
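
The elegance in question is that distribution becomes a Trainer argument rather than a code rewrite. A minimal sketch, assuming a machine with 4 GPUs; `BoringModel` here is just a stand-in for any LightningModule:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class BoringModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

data = DataLoader(TensorDataset(torch.rand(64, 4), torch.rand(64, 1)), batch_size=16)

# Single machine, multiple GPUs, distributed data-parallel: one argument change.
trainer = pl.Trainer(max_epochs=1, accelerator="gpu", devices=4, strategy="ddp")
trainer.fit(BoringModel(), data)
```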

MLflow has a nice user interface, but all it shows you is the fruit of your data wrangling. For example, even if you were conducting transfer learning with pretrained models, you'd still have to do all of the preprocessing and post-processing by hand. In June of 2022, they released a regression pipeline that shows they are starting to take the same type of approach as AIQC's low-level API, which is hugely validating. However, they have 2+ years' worth of work ahead of them when it comes to pre/post-processing, multi-dimensional data types, and unsupervised analysis.
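
To make the contrast concrete: MLflow's tracking is imperative, so the user decides what to log and calls the logging functions by hand. These are real mlflow calls, but the parameters and metric values below are placeholders:

```python
import mlflow

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("hidden_units", 64)
    # ...train and evaluate the model by hand, then log whatever was computed...
    mlflow.log_metric("val_loss", 0.42)
    mlflow.log_metric("val_accuracy", 0.87)
```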




AIQC takes pride in automating thorough solutions to tedious challenges such as: (1) evaluation bias, (2) data leakage, (3) multivariate decoding, and (4) continuous stratification, no matter how many folds and dimensions are involved.
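
As one example of these chores, consider continuous stratification: a continuous target can't be stratified directly, so it is typically binned into quantiles and the bins are used to balance the folds. A minimal sketch of that technique (illustrative, not AIQC's internal code):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

y = np.random.lognormal(size=200)      # skewed continuous target
bins = pd.qcut(y, q=5, labels=False)   # quantile bins preserve the distribution
X = np.random.rand(200, 3)

for train_idx, test_idx in StratifiedKFold(n_splits=4).split(X, bins):
    # Each fold's target distribution now mirrors the full dataset's.
    print(f"fold median: {np.median(y[test_idx]):.3f}")
```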

Refer to our blogs on Towards Data Science (aiqc.medium.com) for more details.