Competition
Expect More from your Experiment Tracker
The AIQC framework provides teams with a standardized methodology that trains better algorithms in less time. The secret sauce of the AIQC backend is that it is not only data-aware (e.g. folds, encoders, dtypes) but also analysis-aware (e.g. supervision, cardinality).
This declarative approach results in significant time savings. It's like Terraform for MLOps. By simplifying the processes of data wrangling and model evaluation, AIQC makes it easy for practitioners to include validation splits/folds in their workflow, which, in turn, helps train more generalizable models by preventing evaluation bias and overfitting.
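To illustrate why the validation split matters: the validation set absorbs the evaluation bias of hyperparameter tuning so the test set stays untouched until the final, honest evaluation. A minimal sketch of a three-way split (plain Python, not AIQC's API; the function name and fractions are illustrative):

```python
import random

def three_way_split(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle and partition samples into train/validation/test.
    The validation split is used for tuning; the test split is
    touched only once, for the final unbiased evaluation."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return {
        "test": shuffled[:n_test],
        "validation": shuffled[n_test:n_test + n_val],
        "train": shuffled[n_test + n_val:],
    }

splits = three_way_split(list(range(100)))
# 70 train / 15 validation / 15 test, with no sample in two splits.
```

Skipping the middle split and tuning directly against the test set is exactly the evaluation bias that the splits/folds workflow prevents.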
| | AIQC | MLflow | WandB | Lightning (complementary) |
|---|---|---|---|---|
| Local setup | Automatic | Manual DB config | Manual Docker config | N/A; relies on Grid, MLflow, WandB |
| Preprocess | Declarative | Manual | Manual | Manual |
| Log | Automatic | Manual log() function | Manual log() function | N/A; relies on Grid, MLflow, WandB |
| Evaluate | Automatic | Manual | Manual | Manual |
| Decode | Automatic | Manual | Manual | Manual |
| UI | Dashboard | Tracking server | Licensed | Licensed |
| Scale (if commercial) | Vertical | Databricks (parallel is challenging) | WandB (parallel sweeps) | Distributed & Grid |
While AIQC actively helps structure the analysis, alternative tools take a more passive approach. They expect users to manually prepare their own data and log their own training artifacts. They can't assist with the actual data science workflow because they know about neither the data involved nor the analysis being conducted. Many supposed "MLOps" tools are really batch execution schedulers marketed to data science teams.
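The "manual log() function" rows in the table above boil down to a pattern like the following. This is a hypothetical minimal tracker standing in for MLflow/WandB-style logging, not either library's actual API; the point is that the user must remember to record every parameter and metric by hand:

```python
class ManualTracker:
    """Hypothetical stand-in for a passive experiment tracker:
    it stores whatever the user explicitly logs, and nothing else."""

    def __init__(self):
        self.params = {}
        self.metrics = {}

    def log_param(self, key, value):
        # The tracker has no idea how the data was wrangled;
        # it only knows what the user chooses to tell it.
        self.params[key] = value

    def log_metric(self, key, value, step=0):
        self.metrics.setdefault(key, []).append((step, value))

tracker = ManualTracker()
tracker.log_param("learning_rate", 0.01)
for epoch in range(3):
    tracker.log_metric("loss", 1.0 / (epoch + 1), step=epoch)
```

Anything the user forgets to log (an encoder, a fold assignment, a dtype cast) is simply invisible to the tracker, which is why a passive tool cannot evaluate or decode on the user's behalf.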
PyTorch Lightning solves the challenge of distributed GPU training more elegantly than Horovod. It would be a great way to scale AIQC. But does 80% of the market need distributed jobs? Do they even need a GPU in the first place?
MLflow has a nice user interface, but all it shows you is the fruits of your data wrangling. For example, even if you were conducting transfer learning using pretrained models, you'd still have to do all of the preprocessing and post-processing by hand. In June of 2022, they released a regression pipeline that shows that they are starting to take the same type of approach as AIQC's low-level API, which is hugely validating. However, they have 2+ years worth of work ahead of them when it comes to pre/post-processing, multi-dimensional data types, and unsupervised analysis.
AIQC takes pride in automating thorough solutions to tedious challenges such as: (1) evaluation bias, (2) data leakage, (3) multivariate decoding, (4) continuous stratification -- no matter how many folds and dimensions are involved.
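Continuous stratification is the trickiest item on that list: class-based stratification doesn't apply to a continuous target, so the target must first be discretized into quantile bins, and each fold then balanced across those bins. A minimal sketch of the general technique (plain Python, not AIQC's implementation; function names and bin counts are illustrative):

```python
def quantile_bins(values, num_bins=4):
    """Assign each continuous value to a quantile bin so that a
    continuous target can be stratified like a categorical one."""
    ranked = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(ranked):
        bins[idx] = min(rank * num_bins // len(values), num_bins - 1)
    return bins

def stratified_folds(bins, num_folds=3):
    """Deal samples from each bin round-robin into folds so every
    fold mirrors the overall distribution of the target."""
    by_bin = {}
    for idx, b in enumerate(bins):
        by_bin.setdefault(b, []).append(idx)
    folds = [[] for _ in range(num_folds)]
    counter = 0
    for b in sorted(by_bin):
        for idx in by_bin[b]:
            folds[counter % num_folds].append(idx)
            counter += 1
    return folds

bins = quantile_bins(list(range(12)), num_bins=4)
folds = stratified_folds(bins, num_folds=3)
# Each of the 3 folds contains one sample from each of the 4 bins.
```

Doing this by hand for every combination of folds, encoders, and target dimensions is exactly the tedium the framework automates.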
See our blogs on Towards Data Science (aiqc.medium.com) for more details.