Webinar 2: Entity relationships and attributes (Francesca Jaroszynska)

Hi John,

I have some difficulty with the way you handle the issue, but I think there are two topics here: how to handle relationships between entities, and the use of the entityAssertionType vocabulary.

Let's start with the second, using your example:

entityAssertionID: ea14
entityID: PigeonPopulation1
entityAssertionType: minimum adult female count
entityAssertionValueNumeric: 3
entityAssertionUnit: individuals

Here, you tell us that you saw at least 3 adult females, all within a single assertionType. I am afraid this will lead to an enlargement of the assertionType vocabulary (@abentley topic), either by reordering words or by adding new elements (minimum white adult female count?).

I would advocate separating the assertions, in the same way you did for the single pigeon:
sex: female
life stage: adult
organism quantity: 3
However, doing this means that the described entity is no longer PigeonPopulation1, but a subpart of it.

entityAssertionID: ea10
entityID: PigeonPopulation1

entityAssertionType: dwc:organismQuantity
entityAssertionValueNumeric: 13
entityAssertionUnit: individuals

entityAssertionType: juvenile count
entityAssertionValueNumeric: 6
entityAssertionUnit: individuals

entityAssertionType: minimum adult male count
entityAssertionValueNumeric: 2
entityAssertionUnit: individuals

entityAssertionType: minimum adult female count
entityAssertionValueNumeric: 3
entityAssertionUnit: individuals

Here, you tell us that on PigeonPopulation1, you saw:
13 individuals,
6 juveniles,
7 adults,
at least 2 adult males,
at least 3 adult females.

From the structure of the data, I am not sure how many individuals you saw, as all the assertions describe PigeonPopulation1:

  • 13 (I guess it was your value)
  • 26 = 13 undetermined + 6 juveniles + 7 adults (including 2 males and 3 females)
  • 31 = 13 undetermined + 6 juveniles sex undetermined + 7 adults sex undetermined + 2 adult males + 3 adult females
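The three readings above can be reproduced with a small sketch. The flat list mirrors the assertions attached to PigeonPopulation1; the "adult count" row is inferred here as 13 - 6 and is not an assertion in the original example, and the three summing strategies are only my guesses at how a data consumer might interpret the rows.

```python
# Flat assertions that all point at the same entity (PigeonPopulation1).
# "adult count" is an assumption (13 - 6); it is not in the original example.
assertions = {
    "dwc:organismQuantity": 13,
    "juvenile count": 6,
    "adult count": 7,
    "minimum adult male count": 2,
    "minimum adult female count": 3,
}

# Reading 1: only dwc:organismQuantity is the total, the rest are subsets.
reading_total_only = assertions["dwc:organismQuantity"]

# Reading 2: sexed adults are inside the adult count, the other rows add up.
reading_nested_sexes = (
    assertions["dwc:organismQuantity"]
    + assertions["juvenile count"]
    + assertions["adult count"]
)

# Reading 3: every assertion describes a disjoint group of individuals.
reading_all_disjoint = sum(assertions.values())

print(reading_total_only, reading_nested_sexes, reading_all_disjoint)  # 13 26 31
```

Nothing in the flat structure tells a consumer which of the three numbers is the intended one.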

Here, I would advocate identifying two kinds of entities: the flock itself, with 13 individuals and likely other assertions specific to the flock (area covered, speed and direction…); and subparts of the flock, as new entities related to the flock. This helps keep the vocabulary as controlled as possible while being clear about the components. It argues again for adding new entities.

This brings us to the first topic:

By increasing the number of entities, we increase the number of entity relationships. Relationships such as “member of / has member”, or any kind of “parent/child”, are not of great interest for biological purposes, as they are there only to express a hierarchical relation in the database. They are, in addition, quite tedious to fill in, whether by script or by hand.

If we had a parentEntityID field, we could manage that more easily. Interestingly, @DavidFichtmueller used a diagram including this parentEntityID on April 20 (topic)

entityID: PigeonPopulation1
entityType: dwc:Organism
entityAssertionType: dwc:organismQuantity
entityAssertionValue: 13

entityID: Pigeon1
parentEntityID: PigeonPopulation1
entityType: dwc:Organism*
entityAssertionType: dwc:sex
entityAssertionValue: female
entityAssertionType: dwc:lifeStage
entityAssertionValue: adult

entityID: PigeonPopulation1_1
parentEntityID: PigeonPopulation1
entityType: dwc:Population
entityAssertionType: dwc:organismQuantity
entityAssertionValue: 6
entityAssertionType: dwc:lifeStage
entityAssertionValue: juvenile

entityID: PigeonPopulation1_2
parentEntityID: PigeonPopulation1
entityType: dwc:Population
entityAssertionType: dwc:organismQuantity
entityAssertionValue: 7
entityAssertionType: dwc:lifeStage
entityAssertionValue: adult

entityID: PigeonPopulation1_2_1
parentEntityID: PigeonPopulation1_2
entityType: dwc:Population
entityAssertionType: dwc:organismQuantity
entityAssertionValue: 2
entityAssertionType: dwc:sex
entityAssertionValue: male

entityID: PigeonPopulation1_2_2
parentEntityID: PigeonPopulation1_2
entityType: dwc:Population
entityAssertionType: dwc:organismQuantity
entityAssertionValue: 3
entityAssertionType: dwc:sex
entityAssertionValue: female

This approach would also make it technically easier to keep the original large observation, the flock of 13 individuals (i.e. the entity with no parentEntityID), and would allow us to clear the least relevant information out of the entityRelation table.
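To illustrate how little is needed to work with such a field, here is a minimal sketch that rebuilds the hierarchy from parentEntityID alone. The row layout, the "quantity" key (standing in for a dwc:organismQuantity assertion) and the helper names are hypothetical, not part of any proposed schema. Pigeon1 carries no quantity and is left out of the sums, which is my assumption since it may also be counted among the adults.

```python
from collections import defaultdict

# Hypothetical in-memory rows mirroring the example entities.
ENTITIES = [
    {"entityID": "PigeonPopulation1", "parentEntityID": None,
     "entityType": "dwc:Organism", "quantity": 13},
    {"entityID": "Pigeon1", "parentEntityID": "PigeonPopulation1",
     "entityType": "dwc:Organism", "quantity": None},
    {"entityID": "PigeonPopulation1_1", "parentEntityID": "PigeonPopulation1",
     "entityType": "dwc:Population", "quantity": 6},
    {"entityID": "PigeonPopulation1_2", "parentEntityID": "PigeonPopulation1",
     "entityType": "dwc:Population", "quantity": 7},
    {"entityID": "PigeonPopulation1_2_1", "parentEntityID": "PigeonPopulation1_2",
     "entityType": "dwc:Population", "quantity": 2},
    {"entityID": "PigeonPopulation1_2_2", "parentEntityID": "PigeonPopulation1_2",
     "entityType": "dwc:Population", "quantity": 3},
]


def children_by_parent(entities):
    """Group entities under the entityID named in their parentEntityID."""
    index = defaultdict(list)
    for entity in entities:
        if entity["parentEntityID"] is not None:
            index[entity["parentEntityID"]].append(entity)
    return index


def subgroup_total(entity_id, index):
    """Sum the quantities of the direct dwc:Population children."""
    return sum(
        child["quantity"]
        for child in index.get(entity_id, [])
        if child["entityType"] == "dwc:Population"
    )


index = children_by_parent(ENTITIES)

# The original observation is the entity without a parent.
roots = [e["entityID"] for e in ENTITIES if e["parentEntityID"] is None]

print(roots)                                         # ['PigeonPopulation1']
print(subgroup_total("PigeonPopulation1", index))    # 13 (6 juveniles + 7 adults)
print(subgroup_total("PigeonPopulation1_2", index))  # 5 (2 males + 3 females)
```

The sexed adults sum to 5 of the 7 adults, which fits the “at least” reading of the male and female counts.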

The “minimal” qualifier could be handled as an estimated value (discussion):
assertionID: x
parentAssertionID: cf. assertionID of the “organismQuantity: 2 individuals”
assertionType: minimal
assertionValueNumeric: 2
assertionUnit: individuals

Wouldn’t that be a nice improvement, perfectly in line with parentEventID, parentAssertionID, parentTaxonID, and every dependsOn element?
