Term: dataset
Class: vernacular (0%)
Created 9 September 2013
Last modified 19 November 2014
Contributed by Stephen Richard


Definition: An identifiable collection of data (ISO 19115). A collection of data items unified by some criteria (authorship, subject, scope, extent, ...). A kind of Collection that contains data.

Criteria for what unifies the collection are variable (topic, area, author, ...). Data items may represent intellectual content -- information content and organization (data schema) -- or may represent particular manifestations (formats) of an intellectual artifact.
Submitted 14 April 2016
by Stephen
Note that the intention of data is to describe domain entities, whereas the intention of metadata is to describe the representation of those entities in an information system for the purpose of discovery, evaluation of provenance, quality or fitness for use, and access. If the domain of discourse is itself an information system, the distinction between metadata and data becomes fuzzy and subjective.
Submitted 14 April 2016
by Stephen
I also like parts of this definition of "Data Set":
"A named collection of data elements in which each data element is well-defined. The elements are logically related and arranged in a prescribed manner." Here, naming the collection is part of what makes it identifiable. This definition also adds the idea of "well defined." Both definitions speak of some unifying or logical arrangement, so I think they converge on a single definition.
Submitted 14 April 2016
by Gary
Making 'well defined' precise might be difficult. Take, for example, the ubiquitous spreadsheet with cryptic column headings. The headings might be well defined to the person who originated the sheet, but not to anyone else.
Submitted 14 April 2016
by Stephen
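One possible way to make the 'well defined' question concrete is to pair each data element with an explicit definition and list the elements that lack one. The following is only a minimal sketch under stated assumptions: the `DataElement` and `Dataset` classes, the `undefined_elements` method, and the sample column names are all hypothetical illustrations, not part of ISO 19115 or of any definition quoted in this discussion.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One column or field of a dataset (hypothetical model)."""
    name: str
    definition: str = ""  # empty means nobody has documented the element
    unit: str = ""


@dataclass
class Dataset:
    """A named collection of data elements (hypothetical model)."""
    name: str
    elements: list = field(default_factory=list)

    def undefined_elements(self):
        # An element counts as "not well defined" here only if it has no
        # written definition; a real test of clarity would need more.
        return [e.name for e in self.elements if not e.definition.strip()]


# A made-up spreadsheet with one documented and two cryptic headings.
sheet = Dataset(
    name="example_sheet",
    elements=[
        DataElement("depth_m", "Depth below ground surface", "m"),
        DataElement("T2"),
        DataElement("qc_flg"),
    ],
)

missing = sheet.undefined_elements()
print(missing)
```

A check like this only shows whether a definition was written down. Whether that definition is understandable to anyone other than the originator remains a matter of judgment.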
See also http://purl.org/coar/resource_type/c_ddb1
Submitted 17 January 2018
by Stephen