AIQC provides structured protocols that automate data wrangling. The steps involved vary with the analysis type (e.g. categorize, quantify, generate), the data type (e.g. spreadsheet, sequence, image), and the data dimensionality (e.g. timepoints per sample).
The DIY approach of patching together custom code and toolsets for each analysis is hard to maintain because it requires a research team to be skilled in both data science and software engineering.
|Prevent evaluation bias with 3-way+ stratification.||Validate the structure of new samples.|
|Prevent data leakage by only using preprocessing information derived from the training split/fold.||Prevent data drift by using original preprocessors.|
|Prevent overfitting by evaluating each split/fold of every model.||Detect model rot by reevaluating with supervised datasets.|
|Ensure reproducibility by using a standardized framework that records the entire workflow.|
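
To make two of the safeguards above concrete, here is a minimal sketch in plain Python. It is not AIQC's API: the helper names `stratified_split`, `fit_scaler`, and `apply_scaler` are hypothetical, the data is made up, and the split is only 2-way (train/test) rather than 3-way+. It shows a split that preserves class proportions, and a standardization step whose mean and standard deviation are learned from the training split only and then reused on the held-out split.

```python
import random
import statistics
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Hypothetical helper: split indices so each class keeps its proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out at least one sample per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_scaler(values):
    """Learn the mean and standard deviation from the given values."""
    return statistics.mean(values), statistics.pstdev(values)


def apply_scaler(values, mean, std):
    """Standardize values with previously learned parameters."""
    return [(v - mean) / std for v in values]


# Made-up example data: one feature, two balanced classes.
features = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

train_idx, test_idx = stratified_split(labels, test_fraction=0.25)
train_x = [features[i] for i in train_idx]
test_x = [features[i] for i in test_idx]

# Fit on the training split only, so no test information leaks in.
mean, std = fit_scaler(train_x)
train_scaled = apply_scaler(train_x, mean, std)
# Reuse the training parameters on the held-out split.
test_scaled = apply_scaler(test_x, mean, std)
```

The same fitted parameters would also be reused on new samples at inference time, which is the idea behind "using original preprocessors" in the right-hand column.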
Let's get started!