Game Metadata Standards: Building a Canonical Library Catalog
How No-Intro, Redump, and TOSEC preservation standards form the foundation of a trustworthy game library — and how RetroCloud extends them with additional gaming and licensing metadata.
A game preservation platform is only as good as its catalog data. The titles in the library must be correctly identified, their technical properties accurately described, their relationships to other editions and releases clearly documented, and their rights status reliably recorded. Without accurate metadata, the catalog is a collection of opaque binary files that cannot be meaningfully searched, presented to users, or managed over time. Building and maintaining high-quality game metadata is a continuous engineering and curatorial effort that RetroCloud has invested in heavily since our earliest days.
The No-Intro Standard: Cartridge-Based Media
No-Intro is a preservation project dedicated to creating a canonical database of verified ROM dumps for cartridge-based gaming systems. The core of the No-Intro methodology is rigorous verification: every entry in the database corresponds to a specific ROM dump that has been checked against multiple independent dumps to confirm it is a bit-perfect, unmodified copy of the original media. Each entry carries several hash values (CRC32, MD5, and SHA-1), so any ROM file can be hashed and matched against the database to determine whether it is a No-Intro verified dump.
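A minimal sketch of hash-based identification, written in Python using only the standard library. The shape of dat_index (a dictionary keyed by SHA-1) and the function names are assumptions made for illustration; they are not No-Intro's file format or RetroCloud's actual implementation.

```python
import hashlib
import zlib


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's bytes in fixed-size chunks so large ROMs are never fully in memory."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


def hash_chunks(chunks):
    """Compute CRC32, MD5, and SHA-1 over an iterable of byte chunks in one pass."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
        md5.update(chunk)
        sha1.update(chunk)
    return {
        "crc32": f"{crc & 0xFFFFFFFF:08x}",
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }


def identify(hashes, dat_index):
    """Return the matching entry name, or None if the file is not a known dump.

    dat_index is a hypothetical dict keyed by lowercase SHA-1; each value holds
    "name", "crc32", "md5", and "sha1". All three hashes must agree.
    """
    entry = dat_index.get(hashes["sha1"])
    if entry is None:
        return None
    if all(entry[key].lower() == hashes[key] for key in ("crc32", "md5", "sha1")):
        return entry["name"]
    return None
```

A file on disk would be checked with identify(hash_chunks(read_chunks(path)), dat_index). Requiring all three hashes to agree is stricter than necessary for identification, but it catches corrupted or inconsistent index entries early.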
The No-Intro database organizes ROMs by their region, language, revision level, and release type (retail, proto, homebrew, unlicensed). The naming convention encodes these properties in the filename: "Super Mario World (USA) (Rev 1).sfc" is a No-Intro-formatted name that uniquely identifies a specific revision of a specific regional release. This naming convention enables automatic detection of duplicates, region variants, and revision differences when managing a large catalog — critical for a platform serving users globally who may have preferences for specific regional versions.
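To illustrate how a name in this style can be decomposed, here is a simplified Python sketch. It assumes the first parenthesized group is the region list, that a group of the form "Rev X" is the revision, and that the filename carries an extension; every other group is kept as an uninterpreted tag. The real convention has more rules than this, and the ParsedName type and function names are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional

_GROUP = re.compile(r"\(([^()]*)\)")   # one parenthesized group, no nesting
_REVISION = re.compile(r"^Rev (\S+)$")


@dataclass
class ParsedName:
    title: str
    regions: List[str]
    revision: Optional[str]
    tags: List[str]
    extension: str


def parse_name(filename):
    """Split a No-Intro-style filename into title, regions, revision, and tags."""
    if "." in filename:
        stem, _, extension = filename.rpartition(".")
    else:
        stem, extension = filename, ""
    groups = _GROUP.findall(stem)
    title = stem.split(" (", 1)[0].strip()
    regions = [r.strip() for r in groups[0].split(",")] if groups else []
    revision = None
    tags = []
    for group in groups[1:]:
        match = _REVISION.match(group)
        if match:
            revision = match.group(1)
        else:
            tags.append(group)  # language lists, release-type markers, etc.
    return ParsedName(title, regions, revision, tags, extension)


def group_by_title(filenames):
    """Group parsed names by case-insensitive title to surface region and revision variants."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_name(name)
        groups[parsed.title.casefold()].append(parsed)
    return dict(groups)
```

Grouping by title is what lets a catalog present one logical game with several regional or revised releases underneath it, rather than a flat list of near-duplicate files.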
Redump: Optical Media Preservation
Where No-Intro focuses on cartridge media, Redump addresses optical disc preservation for systems like the PlayStation, Sega Saturn, Dreamcast, and GameCube. Disc preservation introduces additional complexity: optical discs contain multiple tracks with mixed data and audio, sub-channel data that encodes timing and copy protection information, and in many cases publisher-specific copy protection schemes that affect the disc's sector layout.
A Redump-verified dump includes the main data track image (ISO or BIN format), a CUE sheet describing the track layout, the sub-channel data, and in many cases the disc's PVD (Primary Volume Descriptor) information. Redump's database records the expected MD5 and SHA-1 hashes of each component, enabling verification at the component level rather than just the overall file level. RetroCloud's disc-based catalog validates each component hash independently, flagging any component that does not match the Redump reference.
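Component-level validation can be sketched in Python as follows. The ComponentRef type, the status strings, and the use of in-memory bytes are assumptions chosen to keep the example short; a production validator would stream files from disk and would load its reference hashes from the Redump data rather than constructing them by hand.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentRef:
    """Expected hashes for one component of a disc dump (hypothetical type)."""
    filename: str
    md5: str
    sha1: str


def check_component(data, ref):
    """Return True only when both the MD5 and SHA-1 of data match the reference."""
    md5_ok = hashlib.md5(data).hexdigest() == ref.md5.lower()
    sha1_ok = hashlib.sha1(data).hexdigest() == ref.sha1.lower()
    return md5_ok and sha1_ok


def validate_dump(components, references):
    """Check every referenced component independently.

    components maps filename -> bytes; references is a list of ComponentRef.
    Returns filename -> "ok", "mismatch", or "missing".
    """
    report = {}
    for ref in references:
        data = components.get(ref.filename)
        if data is None:
            report[ref.filename] = "missing"
        elif check_component(data, ref):
            report[ref.filename] = "ok"
        else:
            report[ref.filename] = "mismatch"
    return report
```

Reporting a status per component, instead of a single pass or fail for the whole dump, makes it possible to see that, for example, the data track is intact while the CUE sheet has been edited.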
TOSEC: The Legacy Archive
The TOSEC (The Old School Emulation Center) project predates both No-Intro and Redump and covers a much broader range of systems, including home computers, arcade hardware, and obscure platforms that No-Intro and Redump do not prioritize. TOSEC's methodology is less strict than No-Intro's — it includes modified ROMs, alternate dumps, and trainer-patched versions alongside verified originals — but it provides coverage for platforms and titles that would otherwise be unrepresented in any canonical database.
RetroCloud uses TOSEC data primarily for home computer platforms (Amiga, Atari ST, Commodore 64, ZX Spectrum) and arcade systems where No-Intro coverage is limited. We cross-reference TOSEC entries against known-good hashes where available and flag entries that are TOSEC-only (not cross-validated by No-Intro or Redump) in our internal catalog to indicate their lower verification confidence level.
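The cross-referencing rule described above might look like the following Python sketch. The Confidence levels and the idea of representing each database as a set of SHA-1 hashes are assumptions for illustration only, not a description of the internal catalog's schema.

```python
from enum import Enum


class Confidence(Enum):
    VERIFIED = "verified"      # hash appears in No-Intro or Redump
    TOSEC_ONLY = "tosec-only"  # hash appears only in TOSEC
    UNKNOWN = "unknown"        # hash appears in none of the databases


def classify(sha1, no_intro_hashes, redump_hashes, tosec_hashes):
    """Assign a verification confidence level to a file identified by its SHA-1.

    Each *_hashes argument is a set of lowercase SHA-1 strings (hypothetical inputs).
    """
    sha1 = sha1.lower()
    if sha1 in no_intro_hashes or sha1 in redump_hashes:
        return Confidence.VERIFIED
    if sha1 in tosec_hashes:
        return Confidence.TOSEC_ONLY
    return Confidence.UNKNOWN
```

Checking the stricter databases first means an entry that appears in both TOSEC and No-Intro is never downgraded to the lower confidence level.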
RetroCloud's Extended Metadata Schema
The preservation databases provide authoritative technical identification but do not address the additional metadata needs of a consumer-facing platform. RetroCloud extends the base preservation metadata with several additional data categories.
Gameplay metadata includes the title's original release date, developer, publisher, genre classification, player count range, estimated completion time, age rating, and a curated description written for platform users. This metadata is maintained by our editorial team using historical records, developer interviews, and period gaming publications as sources.
Rights metadata is the most operationally critical extension. Every title carries a rights record documenting the current rights-holder, the licensing agreement under which RetroCloud serves the title, the geographic territory coverage of that license, and the platform terms that govern how users can access the title. This metadata drives the platform's content delivery logic: a title licensed only for North American territories will not appear in the catalog for European users. Rights metadata is reviewed quarterly against the current state of licensing agreements and updated whenever rights status changes.
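As a rough sketch of how a rights record could drive catalog visibility, consider the Python below. The field names, the use of two-letter country codes for territories, and the visible_titles function are all assumptions made for the example; the actual rights schema and delivery logic are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RightsRecord:
    """One title's rights record (hypothetical field names)."""
    title_id: str
    rights_holder: str
    agreement_id: str
    territories: frozenset  # assumed: two-letter country codes covered by the license
    platform_terms: str


def visible_titles(records, user_territory):
    """Return the IDs of titles whose license covers the user's territory."""
    territory = user_territory.upper()
    return [r.title_id for r in records if territory in r.territories]
```

With this shape, a record whose territories are limited to North American country codes is simply filtered out of the list returned for a user in Germany, which matches the behavior described above.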
Metadata Quality as Infrastructure
Catalog metadata quality is invisible when it is correct and catastrophically visible when it is not. Typical failures include an incorrectly identified ROM that loads but produces an unidentifiable result, a North American release date displayed to Japanese users who remember the title by its original Japanese release, and a rights status that incorrectly marks a properly licensed title as unlicensed, causing it to be withheld from users who are entitled to access it. Every metadata error has a user impact, and the catalog is large enough that even a 0.1% error rate represents hundreds of affected titles.
RetroCloud treats metadata quality with the same engineering rigor as software quality: continuous automated validation, human review workflows for flagged discrepancies, an internal tool for editorial staff to submit and approve corrections, and a public-facing correction submission path for community members who identify errors. The preservation community's deep knowledge of gaming history has been one of our most valuable resources for maintaining metadata accuracy, and our relationship with that community — built on transparency about our methodology and responsiveness to corrections — is something we actively cultivate.
Sofia Reyes
Head of API Platform, RetroCloud
Sofia leads RetroCloud's public API and developer ecosystem. Her background spans API design, developer experience, OpenAPI standards, and real-time systems engineering for partner integrations.