Wikidata:WikiProject Chemistry/Guidelines/Basic metaclasses and relations

Basic metaclasses for chemical entities

See discussion about this topic (2022) in WikiProject Chemistry.

In Wikidata, ‘chemical entities’ are defined as physical objects such as pure chemical substances and molecular entities. The definition does not specify whether the described object is a single chemical molecule or a portion of matter built of such molecules. This definition is similar to, but narrower than, the one used in ChEBI[1], as it excludes mixtures and parts of molecular entities (like functional groups).

All items describing chemical entities should have a proper metaclass, added using the instance of (P31) property, from one of two groups:

  1. type of chemical entity – for all stereochemically or isotopically defined chemical entities[2].
  2. class of chemical entities – for all items that describe classes of chemical entities. This metaclass may have many submetaclasses, which are generally divided into two groups[3] that help determine the scope of the class:
    • open classes – classes of chemical entities that have an infinite number of possible members, i.e. any entity that meets a certain definition belongs to the class and the definition does not restrict the number of entities;
    • closed classes – classes of chemical entities that have a restricted number of members, usually only a few, i.e. the class has a predetermined number of entities belonging to it.

Every item should have only one metaclass from the above (i.e. one item should not have type of chemical entity metaclass and class of chemical entities metaclass or open class of chemical entities and closed class of chemical entities). No other chemistry-related metaclass should be present. For mixtures and parts of chemical entities, other metaclasses are used.
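The one-metaclass rule can be checked mechanically. The following is a minimal sketch in Python, not an official WikiProject tool: the function name check_metaclasses is hypothetical, the QIDs are those listed in the metaclass table on this page, and the input is assumed to be the list of QIDs already read from an item's instance of (P31) statements. It applies only to items about chemical entities, since mixtures and parts of chemical entities use other metaclasses.

```python
# QIDs copied from the metaclass table on this page.
BASIC_METACLASSES = {
    "Q113145171",  # type of chemical entity
    "Q47154513",   # structural class of chemical entities (open)
    "Q56256173",   # class with similar applications or functions (open)
    "Q56256178",   # class with similar source or occurrence (open)
    "Q55640599",   # group of chemical entities (closed)
    "Q15711994",   # group of isomeric entities (closed)
    "Q59199015",   # group of stereoisomers (closed)
    "Q55662456",   # group of ortho, meta, para isomers (closed)
}


def check_metaclasses(p31_values):
    """Return a list of problems found in an item's P31 values.

    Hypothetical helper: an empty list means exactly one basic
    metaclass is present.
    """
    found = sorted(set(p31_values) & BASIC_METACLASSES)
    if not found:
        return ["no basic metaclass"]
    if len(found) > 1:
        return ["more than one basic metaclass: " + ", ".join(found)]
    return []


# An item that is both a type and a structural class violates the rule.
print(check_metaclasses(["Q113145171", "Q47154513"]))
```

The check only counts basic metaclasses; it does not verify that the chosen metaclass actually matches the item's definition, which still requires editorial judgment.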

Example
‘Trihydroxybenzene’ may refer to:
  1. one of three structural isomers: phloroglucinol, 1,2,4-trihydroxybenzene and pyrogallol
  2. a group of these three isomers (plus possible isotopically modified forms): benzenetriol
  3. a class of compounds (every compound with a benzene ring substituted with three hydroxy groups): trihydroxybenzene
In this example:
  1. every item (of phloroglucinol, 1,2,4-trihydroxybenzene and pyrogallol) is a type of chemical entity (Q113145171) (every item represents a stereochemically and isotopically defined chemical entity)[clarification needed]
  2. benzenetriol is a closed class with a group of isomeric entities (Q15711994) metaclass (its definition restricts the number of entities to three isomers)
  3. trihydroxybenzene is an open class with a structural class of chemical entities (Q47154513) metaclass (its definition does not restrict the number of possible entities, i.e. any chemical entity with a benzene ring substituted with three hydroxy groups can be classified here, so the number of theoretically possible entities is infinite).
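The reasoning in this example can be written as a small decision rule. This is only an illustrative sketch: the function name and its two boolean parameters are assumptions made here, and the answers to both questions must come from an editor reading the item's definition.

```python
def choose_metaclass_group(is_defined_entity, definition_fixes_membership):
    """Pick the metaclass group for an item about a chemical entity.

    is_defined_entity: the item describes one stereochemically or
        isotopically defined chemical entity.
    definition_fixes_membership: the class definition itself
        predetermines how many members the class has.
    """
    if is_defined_entity:
        return "type of chemical entity"
    if definition_fixes_membership:
        return "closed class of chemical entities"
    return "open class of chemical entities"


# The three readings of 'trihydroxybenzene' from the example above.
examples = {
    "phloroglucinol": choose_metaclass_group(True, False),
    "benzenetriol": choose_metaclass_group(False, True),
    "trihydroxybenzene": choose_metaclass_group(False, False),
}
for label, group in examples.items():
    print(label, "->", group)
```

The second parameter asks what the definition says, not how many members are known to exist, so a class whose open definition happens to be met by a single entity still counts as open.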
| Metaclass | Category | Description |
| --- | --- | --- |
| type of chemical entity (Q113145171) | type of chemical entity | Every stereochemically or isotopically defined chemical entity, like pyridine (Q210385), ethanol (Q153), ramatroban (Q10357327), barium sulfate (Q309038) or maitotoxin (Q425072). |
| structural class of chemical entities (Q47154513) | open class of chemical entities | Represents classes of chemical entities that share a common structural feature. Such classes have definitions describing a fragment of the chemical structure that occurs in each entity belonging to the class, and individual chemical entities differ only in the variable fragments attached to this structure. Such a definition cannot contain other requirements regarding, for example, the function or occurrence of a chemical entity – if it does, the class is a class of chemical entities with similar applications or functions (Q56256173) or a class of chemical entities with similar source or occurrence (Q56256178), not a structural class of chemical entities (Q47154513). Examples include: pyridine (Q47317020), sulfate ester (Q1072576) or organonitrogen heterocyclic compound (Q72084374). |
| class of chemical entities with similar applications or functions (Q56256173) | open class of chemical entities | Classes of chemical entities that share a similar function or have a similar application. Only chemical entities can be classified within such classes (thus excluding mixtures), and the definition must include information about the common application or common function of each chemical entity belonging to the class. Examples include: phthalimide insecticide (Q107370441) or thiazine dye (Q55352937). Usually, these kinds of classes are subclasses of the structural classes mentioned above (like organochlorine pesticide (Q73269109), which is a subclass of organochlorine compound (Q426809), which in turn is a structural class of chemical entities (Q47154513)). |
| class of chemical entities with similar source or occurrence (Q56256178) | open class of chemical entities | Similarly to the class mentioned above, these classes include chemical entities that share a similar source or occurrence in nature. The definition of such a class must include information about the source or occurrence, which is the same for each chemical entity belonging to the class. Examples include: quinoline alkaloid (Q17310777), synthetic cannabinoids (Q19904200) or Dendrobates alkaloid (Q94962697). As with class of chemical entities with similar applications or functions (Q56256173), these kinds of classes may be subclasses of structural classes (like biogenic amine (Q424455), which is a subclass of amine (Q167198), which in turn is a structural class of chemical entities (Q47154513)). |
| group of chemical entities (Q55640599) | closed class of chemical entities | Defines a class of chemical entities whose definition predetermines the possible number of members; such classes are usually small. Examples include potassium citrates (Q59543257) (a class of only two chemical entities used as food additives) and spiruchostatin (Q7578231) (a class of chemical entities named spiruchostatin X, isolated from a certain species). Most of these classes are defined by submetaclasses such as group of isomeric entities (Q15711994). The boundary between a class of chemical entities and a group of chemical entities may in some cases require discussion or analysis of similar cases. Remember that the number of members of such a class may change over time, e.g. for spiruchostatin (Q7578231) there are currently four isolated chemical compounds, but more may be discovered in the future. Moreover, even a small number of members that can actually exist does not make a class a group of chemical entities (Q55640599): diose (Q1318110) is a structural class of chemical entities (Q47154513), as its definition states that the class includes any monosaccharide with two carbon atoms, even though only one chemical entity in fact meets that definition. |
| group of isomeric entities (Q15711994) | closed class of chemical entities | This class includes any case where the definition specifies a chemical formula common to each chemical entity. Groups of constitutional isomers (except aromatic ring substitution isomers) will usually be defined within this class, as groups of steric isomers are defined by a submetaclass. L-galactose (Q100602876), heptene (Q151375) or butanals (Q27902935) may be given as examples here. |
| group of stereoisomers (Q59199015) | closed class of chemical entities | Class that includes all cases of chemical entities with the same arrangement of atoms and bonds between them but a different spatial configuration, i.e. both optical isomers and E/Z isomers. It is one of the most common metaclasses, as it covers every case of a so-called structure with undefined stereochemistry. Examples include D-ribopyranose (Q27120754) (one undefined stereocenter), dec-2-enal (Q27131349) (undefined configuration of a double bond) or XJIPREFALCDWRQ-UHFFFAOYSA-N (Q105328992) (30 undefined stereocenters). |
| group of ortho, meta, para isomers (Q55662456) | closed class of chemical entities | Represents classes composed of three items which are respectively ortho, meta and para isomers. Examples: hydroxybenzaldehyde (Q3143824), aminophenol (Q27118884). |
| type of mixture of chemical entities (Q119892838) | other metaclass (mixtures) | |
| type of polymer (Q119896085) | other metaclass (mixtures) | |
| imprecise class of chemical entities (Q74892521) | other metaclass | |

Basic relations

In items about chemical entities, subclass of (P279) should be used for chemical classes only. For other kinds of classes, has use (P366), subject has role (P2868) or other more specific properties should be used.

Examples:

There may be some borderline cases in which both subject has role (P2868) and has use (P366) could be used. Such cases require an arbitrary decision about which property to use, and that decision should then be applied consistently.
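The rule for subclass of (P279) can be illustrated with a short review helper. This is a sketch under stated assumptions: the function name review_statement and the target-kind labels ("chemical class", "use", "role") are hypothetical and would have to be assigned by an editor; only the property IDs come from this page.

```python
# Property suggested for each (hypothetical) kind of target item.
SUGGESTED_PROPERTY = {
    "chemical class": "P279",  # subclass of
    "use": "P366",             # has use
    "role": "P2868",           # subject has role
}


def review_statement(prop, target_kind):
    """Return None if the statement follows the rule, else a suggestion."""
    if prop == "P279" and target_kind != "chemical class":
        suggested = SUGGESTED_PROPERTY.get(
            target_kind, "a more specific property"
        )
        return (
            "P279 should point to chemical classes only; consider "
            + suggested
        )
    return None


# A P279 statement whose target describes a use is flagged.
print(review_statement("P279", "use"))
```

The helper does not resolve the borderline cases between P2868 and P366; it only flags P279 statements that point outside chemical classes.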

Notes

  1. Definition of a ‘chemical entity’ in ChEBI is as follows: A chemical entity is a physical entity of interest in chemistry including molecular entities, parts thereof, and chemical substances.
  2. From the beginning of Wikidata, the chemical compound (Q11173) item served as such a metaclass. However, using it caused many problems: (i) it was not applicable to many items (ions, radicals, simple substances, functional groups), (ii) it was present alongside regular chemical classes while being their superclass, which caused some redundancy and also confusion among users, (iii) its definition is not well established and there are some borderline cases where one cannot be sure that an entity can be correctly classified as a compound.
  3. This division is close to the one used in ChEBI (see Open and closed classes in ChEBI User Manual).