Data Foundry — Curation

Verify dataset

I carefully read this dataset's info (comments, links, AI assessment) and confirm this decision. This removes the “AI (UNVERIFIED)” flag, records me as the reviewer, and relabels the AI disclaimer as a plain “AI summary”.

About — Data Foundry curation

This page is the curation log behind TabArena and BeyondArena: one record per candidate tabular dataset, tracking whether it belongs in the benchmark, why, how it should be split, and how it is processed. Curators triage the backlog here (edit → commit → PR); an AI assistant can draft a provisional triage (🤖) that a human then verifies.
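As a rough illustration of the record and the verification step described on this page, here is a minimal sketch in Python. It is not the Data Foundry schema: the class name, every field name, the `verify` helper, and the placeholder values are hypothetical and chosen only for this example. Only the two label strings, “AI (UNVERIFIED)” and “AI summary”, come from the page text.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CurationRecord:
    """Hypothetical shape of one curation record (illustrative field names)."""
    dataset: str                  # candidate tabular dataset
    include: Optional[bool]       # belongs in the benchmark? None = undecided
    reason: str                   # why it is included or excluded
    split: str                    # how it should be split
    processing: str               # how it is processed
    unverified_ai: bool = False   # True while the "AI (UNVERIFIED)" flag is set
    reviewer: Optional[str] = None
    ai_note_label: Optional[str] = None  # heading shown above the AI assessment


def verify(record: CurationRecord, reviewer: str) -> CurationRecord:
    """Return a verified copy: clear the flag, record the reviewer, relabel the note."""
    name = reviewer.strip()
    if not name:
        raise ValueError("a reviewer name is required")
    return replace(
        record,
        unverified_ai=False,
        reviewer=name,
        ai_note_label="AI summary",
    )


# Placeholder values only; none of these describe a real dataset.
draft = CurationRecord(
    dataset="example-dataset",
    include=True,
    reason="placeholder reason",
    split="placeholder split",
    processing="placeholder processing",
    unverified_ai=True,
    ai_note_label="placeholder disclaimer label",
)
verified = verify(draft, "reviewer-name")
```

The record is frozen and `verify` returns a new copy, so the AI draft and the verified record can both be kept, which suits the edit → commit → PR flow where each change is reviewed as a diff.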

Goal: assemble a high-quality, representative collection of real-world tabular ML tasks for an open, living benchmark — and keep the curation reasoning transparent and reproducible.

Links

tabarena.ai — the living benchmark & leaderboard
github.com/TabArena/data-foundry — this curation toolkit
github.com/TabArena — our broader efforts
Curation guidelines · agentic curation skill (the /curate command)