  1. GitHub - sodadata/soda-core: :zap: Data quality testing for the …

    An open-source CLI tool and Python library for data quality testing. Compatible with the Soda Checks Language (SodaCL). Enables data quality testing both in and out of your data pipelines …

  2. GitHub - cleanlab/cleanlab: Cleanlab's open-source library is the ...

    Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels. - cleanlab/cleanlab

  3. GitHub - kwanUm/awesome-data-quality: Curated list of tools and ...

    Frameworks and Libraries. Open sourced: elementary - Data monitoring and observability tailored to dbt. mobydq - Tool for data engineering teams to run & automate data quality checks on …

  4. data-quality · GitHub Topics · GitHub

    Aug 18, 2024 · Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

  5. Open Source Data Quality Monitoring. - GitHub

    Datachecks is an open-source data monitoring tool that helps to monitor the data quality of databases and data pipelines. It identifies potential issues, including in the databases and data …

  6. The premier open source Data Quality solution - GitHub

    The premier Open Source Data Quality solution. DataCleaner is a Data Quality toolkit that allows you to profile, correct and enrich your data. People use it for ad-hoc analysis, recurring …

  7. GitHub - data-prep-kit/data-prep-kit: Open source project for data ...

    The data modalities supported today are: Natural Language, Code, and Image. The modules are built on common frameworks for Python and Ray runtimes for scaling up data processing. The …

  8. GitHub - ydataai/ydata-quality: Data Quality assessment with one …

    YData Quality: ydata_quality is an open-source Python library for assessing Data Quality throughout the multiple stages of a data pipeline development. A holistic view of the data can …

  9. GitHub - mlabonne/llm-datasets: Curated list of datasets and tools …

    Tools listed in this section can help you evaluate, generate, and explore datasets. Start by aggregating available data from various sources (open-source or not) and applying filters like …

  10. GitHub - evidentlyai/evidently: Evidently is an open-source ML and …

    Evidently is an open-source Python library to evaluate, test, and monitor ML and LLM systems—from experiments to production. 🔡 Works with tabular and text data. Supports evals …