definition |
A managed process, applied throughout the data lifecycle, by which data or data collections
are cleansed, documented, standardised, formatted and inter-related. This includes versioning
data, forming a new collection from several data sources, annotating data with metadata,
and adding codes to raw data (e.g., classifying a galaxy image with a galaxy type such
as “spiral”). Higher levels of curation involve maintaining links with annotations
and with other published materials. Thus a dataset may include a citation link to
a publication whose analysis was based on the data. The goal of curation is to manage
and promote the use of data from its point of creation, ensuring that it is fit for contemporary
purpose and available for discovery and re-use. For dynamic datasets this may mean
continuous enrichment or updating to keep them fit for purpose. Special forms of curation
may be available in data repositories. The data curation process itself must be documented
as part of curation; thus curation and provenance are closely related.
|
|