Wikidata:Property proposal/relative statement frequency

relative statement frequency

Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: qualifier to specify the relative frequency with which the statement is present on instances of this class
Data type: Item
Domain: all entities
Allowed values: rarely (Q28962310), sometimes (Q110143752), often (Q28962312), almost always (Q115722867), certainty (Q1520777)
Example 1: human (Q5) → has part(s) (P527) → human eye (Q430024), qualified with relative statement frequency → almost always (Q115722867)
Example 2: chair (Q15026) → has part(s) (P527) → armrest (Q2822867), qualified with relative statement frequency → often (Q28962312)
Example 3: laundry room (Q267734) → has part(s) (P527) → washing machine (Q124441), qualified with relative statement frequency → often (Q28962312)
Example 4: computer (Q68) → uses (P2283) → electronics (Q11650), qualified with relative statement frequency → often (Q28962312) (current usage with "nature of statement")
Example 5: computer (Q68) → uses (P2283) → mechanical energy (Q184550), qualified with relative statement frequency → rarely (Q28962310) (current usage with "nature of statement")
Robot and gadget jobs: a bot should move the allowed values listed above from nature of statement (P5102), wherever they are currently used as its value, to this property
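
The bot job described above could be sketched roughly as follows. This is only an illustration under stated assumptions: qualifiers are modeled as a plain Python dict from property ID to a list of item IDs (a simplification, not the Wikibase JSON format), the function name `migrate_qualifiers` is made up for this sketch, and `P_RELATIVE_STATEMENT_FREQUENCY` is a hypothetical placeholder because no property ID was ever assigned.

```python
# Items the proposal lists as allowed values for the new qualifier.
FREQUENCY_VALUES = {
    "Q28962310",   # rarely
    "Q110143752",  # sometimes
    "Q28962312",   # often
    "Q115722867",  # almost always
    "Q1520777",    # certainty
}

NATURE_OF_STATEMENT = "P5102"
# Hypothetical placeholder: the proposal was not done, so no real ID exists.
RELATIVE_FREQUENCY = "P_RELATIVE_STATEMENT_FREQUENCY"


def migrate_qualifiers(qualifiers):
    """Return a copy of `qualifiers` with frequency values moved.

    `qualifiers` maps a property ID to a list of item IDs. Values of
    nature of statement (P5102) that are in FREQUENCY_VALUES are moved to
    the placeholder property; all other qualifiers are left untouched.
    """
    migrated = {prop: list(values) for prop, values in qualifiers.items()}
    nature_values = migrated.get(NATURE_OF_STATEMENT, [])
    to_move = [v for v in nature_values if v in FREQUENCY_VALUES]
    if not to_move:
        return migrated

    remaining = [v for v in nature_values if v not in FREQUENCY_VALUES]
    if remaining:
        migrated[NATURE_OF_STATEMENT] = remaining
    else:
        # Drop the qualifier entirely once it has no values left.
        del migrated[NATURE_OF_STATEMENT]

    target = migrated.setdefault(RELATIVE_FREQUENCY, [])
    for value in to_move:
        if value not in target:
            target.append(value)
    return migrated


# Example 4: computer (Q68) uses (P2283) electronics (Q11650), currently
# qualified with nature of statement (P5102) = often (Q28962312).
example_4 = {NATURE_OF_STATEMENT: ["Q28962312"]}
print(migrate_qualifiers(example_4))
```

A real bot would additionally need to read and write statements through the Wikidata API and handle references and edit conflicts, none of which is covered here.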

Motivation

nature of statement (P5102) is used to describe many things about statements, one of which is the relative frequency with which a statement is present on instances of a class. This proposal introduces a dedicated property for that relationship, replacing nature of statement (P5102) for this purpose, so that users can easily find and select a value to describe it. It also restricts the property to use on classes, unlike nature of statement (P5102), which can be used anywhere and could therefore attach these values to statements where their meaning is unknown or badly modeled. Lectrician1 (talk) 03:53, 16 December 2022 (UTC)

Discussion

Added examples 4 and 5. @Yellowcard Lectrician1 (talk) 16:39, 26 December 2022 (UTC)
  •  Weak oppose Reading the comments, I think this property would perhaps be more appropriate as a sense datatype where the lexeme is an adjective. The property would thus be "adjective used to describe frequency". I can also see the need for a related property that has an item datatype, but only if the allowed values are those specified in a particular standard, e.g. for risk assessments. For example, [1] provides a sample risk matrix for the European Space Agency (Q42262) and defines various likelihoods. Other organisations around the world define their own likelihoods, often reusing the same adjective for a likelihood but with different criteria (e.g. "rare occurrence" could mean once every 100 years in one standard, or once every 10000 years in another). I'm less interested in pursuing the item datatype property without being able to provide some decent examples of how it would be used (with sources). --Dhx1 (talk) 03:35, 27 December 2022 (UTC)
    Using a sense datatype would make statements language dependent, which is unnecessary for the information that we are trying to model. — The Erinaceous One 🦔 05:05, 27 December 2022 (UTC)
    My issue is that rarely (Q28962310) (and the other example allowed values) are currently not defined in any meaningful way. Are the 5 allowed values meant to represent quintiles (quintile (Q3176606)) of a normal distribution (Q133871) regardless of language? Or are the allowed values meant to be items that should have a described by source (P1343) statement pointing to a technical standard (Q317623) (e.g. a risk management standard used by NASA or another organisation), and that may therefore have a wide range of definitions and mappings to quantitative probabilities? Dhx1 (talk) 13:52, 27 December 2022 (UTC)
  •  Oppose in the current form. The proposed values are very vague. Having vague values and disallowing more precise ones like the ESA risk assessment statuses that Dhx1 speaks about seems like a bad modeling decision.
Storing information for every anatomical structure about how often it appears in humans would add thousands of statements to Q5. It would make more sense to store the information in the items of the individual human anatomical structures.
It seems to me like the examples in this proposal were created without looking at any data sources. For creating a new property like this, it would be worthwhile to first look at the data sources and then think about how that data could best be represented on Wikidata. ChristianKl 16:22, 17 January 2023 (UTC)
@Dhx1 @ChristianKl @The-erinaceous-one What if we had something like "percentage chance levels" so that the values make more sense? For example, 10, 20, 50, 80, 90.
Also, it's extremely difficult to find sources for these examples. Look at them.
I still think it's important that we find a way to document that these statements vary in frequency. We need to find some way to do this. Lectrician1 (talk) 06:15, 20 March 2023 (UTC)
Finding sources about how many humans actually have human eyes isn't easy. At the same time the scarcity of sources makes it much more valuable to have data because it's not something that someone can currently easily look up elsewhere.
I'd be fine with quantity as a datatype for the percentage. ChristianKl 11:25, 20 March 2023 (UTC)
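
To make the difference between fixed "percentage chance levels" and a free quantity value concrete, here is a minimal sketch. It assumes the example levels 10, 20, 50, 80 and 90 from the thread; the function name `snap_to_level` and the sample figure are made up for illustration and are not sourced data. A quantity datatype would store the observed percentage as-is, while fixed levels would force it to the nearest step.

```python
# Example levels mentioned in the discussion, in percent.
LEVELS = (10, 20, 50, 80, 90)


def snap_to_level(percentage):
    """Return the level closest to `percentage`.

    Ties go to the lower level, because LEVELS is ascending and min()
    returns the first minimal element it encounters.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return min(LEVELS, key=lambda level: abs(level - percentage))


# Illustrative number only: with fixed levels the precise value is lost,
# whereas a quantity value would keep it unchanged.
observed = 97.5
print("quantity value:", observed)
print("nearest level:", snap_to_level(observed))
```

The sketch only shows the loss of precision that comes with fixed levels; it takes no position on which levels, if any, should be used.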