Commit Graph

1 Commit

Author SHA1 Message Date
Hope Ogbons
95a3766d85 Add data cleaning utilities for dataset preparation
This commit introduces a new Python module, data_cleaner.py, which provides functions for cleaning and preparing datasets for fine-tuning. The module includes a method that filters a dataset by text length and balances its class distribution, as well as a function that analyzes label distributions. These utilities extend the application's data preprocessing capabilities.
2025-10-31 03:20:08 +01:00
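
The commit message above describes the two utilities only in prose, and the contents of data_cleaner.py are not shown here. The sketch below is one plausible shape for them, not the committed code. The function names (clean_dataset, analyze_label_distribution), the record layout (dicts with "text" and "label" keys), the length thresholds, and the choice of balancing by downsampling every class to the size of the smallest one are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def analyze_label_distribution(records, label_key="label"):
    """Count how many records carry each label (hypothetical helper)."""
    return Counter(record[label_key] for record in records)


def clean_dataset(records, text_key="text", label_key="label",
                  min_chars=10, max_chars=2000, seed=42):
    """Filter records by text length, then balance classes by downsampling.

    Hypothetical sketch: the thresholds and the balancing strategy are
    assumptions, not taken from the actual data_cleaner.py module.
    """
    # Step 1: keep only records whose stripped text length is within bounds.
    kept = [
        record for record in records
        if min_chars <= len(record[text_key].strip()) <= max_chars
    ]

    # Step 2: group the surviving records by label.
    by_label = defaultdict(list)
    for record in kept:
        by_label[record[label_key]].append(record)
    if not by_label:
        return []

    # Step 3: downsample every class to the size of the smallest class.
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    return balanced


if __name__ == "__main__":
    sample = (
        [{"text": "a reasonably long positive example", "label": "pos"}] * 5
        + [{"text": "a reasonably long negative example", "label": "neg"}] * 2
        + [{"text": "short", "label": "neg"}]  # dropped by the length filter
    )
    print("before:", dict(analyze_label_distribution(sample)))
    print("after: ", dict(analyze_label_distribution(clean_dataset(sample))))
```

Downsampling discards data from the larger classes. If the real module balances differently (for example by oversampling or by capping at a fixed count), only step 3 would change.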