Missing "schema:gender" predicate #6

Open
elad-shaked opened this issue Aug 24, 2020 · 3 comments

@elad-shaked

Looking at https://yago-knowledge.org/resource/yago:Elvis_Presley I can see the schema:gender property:

schema:gender | yago:male_Q6581097

However, I cannot find any trace of it in any of the files downloaded from https://yago-knowledge.org/data/yago4/en/2020-02-24/ (the latest release, AFAIK).

@elad-shaked
Author

Also, in YAGO4, schema:gender appears only in "yago-wd-facts.nt", and even there in only 829 statements. Here is the distribution of objects for those statements:

      1 http://yago-knowledge.org/resource/Androgyny
     74 http://yago-knowledge.org/resource/Eunuch
      1 http://yago-knowledge.org/resource/Māhū
    145 http://yago-knowledge.org/resource/Non-binary_gender
      4 http://yago-knowledge.org/resource/Transgender
    162 http://yago-knowledge.org/resource/Trans_man
    441 http://yago-knowledge.org/resource/Trans_woman
      1 http://yago-knowledge.org/resource/Two-spirit

So yago:male_Q6581097 does not appear anywhere, and neither does a female counterpart.
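
For anyone who wants to reproduce a count like the one above, here is a minimal sketch. It assumes that schema:gender expands to the IRI `http://schema.org/gender` in the N-Triples dump and that the objects of interest are IRIs rather than literals; the function name and the simplified regular expression are illustrative, not part of any YAGO tooling.

```python
import re
from collections import Counter

# Assumption: schema:gender expands to this IRI in the dump.
GENDER = "http://schema.org/gender"

# Simplified N-Triples pattern that only matches IRI objects:
#   <subject> <predicate> <object> .
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def object_distribution(lines, predicate=GENDER):
    """Count how often each object IRI occurs with the given predicate."""
    counts = Counter()
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match and match.group(2) == predicate:
            counts[match.group(3)] += 1
    return counts
```

Since a file object yields lines, the function could be called as `object_distribution(open("yago-wd-facts.nt", encoding="utf-8"))`, and `counts.most_common()` would then list the objects by frequency.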

For reference, YAGO3 https://yago-knowledge.org/data/yago3/yago-3.0.2-turtle-simple.7z has 1,108,862 statements with the <hasGender> predicate whose Object distribution is:

 174263 female
 934599 male

How can we get this useful information in YAGO4?

@Tpt
Collaborator

Tpt commented Aug 31, 2020

Hi! Thank you for your interest in YAGO 4.

Sorry for this problem. The YAGO 4 "English Wikipedia" flavor only contains Wikidata entities that have an English Wikipedia article. However, the Wikidata entities for male and female humans do not have articles on the English Wikipedia. Hence, they are not included, and the schema:gender relations pointing to them are not added to this flavor.
They are included in the other flavors, like the "all Wikipedias" one that is displayed on the website.
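
The effect described here can be illustrated with a small sketch. This is not the actual YAGO 4 build code; it only assumes that a fact survives when both its subject and its object belong to the set of kept entities. The function name, the shortened entity names, and `Some_Person` are hypothetical.

```python
def filter_facts(facts, kept_entities):
    """Keep only triples whose subject and object are both kept entities."""
    return [
        (s, p, o)
        for (s, p, o) in facts
        if s in kept_entities and o in kept_entities
    ]


# Hypothetical toy data: entities assumed to have an English Wikipedia article.
kept = {"Elvis_Presley", "Some_Person", "Trans_woman"}

facts = [
    # Object has no English Wikipedia article, so this fact is dropped.
    ("Elvis_Presley", "schema:gender", "male_Q6581097"),
    # Both ends are kept entities, so this fact survives.
    ("Some_Person", "schema:gender", "Trans_woman"),
]

remaining = filter_facts(facts, kept)
```

Under that assumption, only the `Trans_woman` fact remains, which would match the observation that the few schema:gender statements left in this flavor point to objects that have their own English Wikipedia article.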

This is indeed quite confusing. In the next versions of the "English Wikipedia" flavor, we should probably add some important entities that do not have an English Wikipedia article, like male and female. I will close this issue when it's done.

Thank you again.

@elad-shaked
Author

Adding to the discussion: looking at "yago-wd-facts.nt" of the "wiki" flavor, there are indeed far more statements with a schema:gender predicate (around 2.9M). Here is a histogram of the objects of those statements:

http://yago-knowledge.org/resource/Trans_woman , 618.0
http://yago-knowledge.org/resource/Two-spirit , 2.0
http://yago-knowledge.org/resource/genderfluid_Q18116794 , 7.0
http://yago-knowledge.org/resource/Hijra_(South_Asia) , 1.0
http://yago-knowledge.org/resource/genderqueer_Q12964198 , 3.0
http://yago-knowledge.org/resource/transmasculine_Q27679766 , 1.0
http://yago-knowledge.org/resource/Non-binary_gender , 156.0
http://yago-knowledge.org/resource/male_Q6581097 , 2946797.0
http://yago-knowledge.org/resource/Transgender , 6.0
http://yago-knowledge.org/resource/Māhū , 1.0
http://yago-knowledge.org/resource/Trans_man , 191.0
http://yago-knowledge.org/resource/Eunuch , 1.0
http://yago-knowledge.org/resource/Androgyny , 1.0

At first glance this seems fine, and there are indeed 2.9M "male_*" objects, but there are evidently no females at all.
So I looked into the "full" flavor, and here is the histogram:

http://yago-knowledge.org/resource/Trans_woman , 675.0
http://yago-knowledge.org/resource/Two-spirit , 2.0
http://yago-knowledge.org/resource/genderfluid_Q18116794 , 9.0
http://yago-knowledge.org/resource/genderqueer_Q12964198 , 4.0
http://yago-knowledge.org/resource/transmasculine_Q27679766 , 1.0
http://yago-knowledge.org/resource/Non-binary_gender , 187.0
http://yago-knowledge.org/resource/male_Q6581097 , 4418009.0
http://yago-knowledge.org/resource/Transgender , 6.0
http://yago-knowledge.org/resource/Māhū , 1.0
http://yago-knowledge.org/resource/female_Q6581072 , 1257437.0
http://yago-knowledge.org/resource/cisgender_female_Q15145779 , 1.0
http://yago-knowledge.org/resource/cisgender_male_Q15145778 , 1.0
http://yago-knowledge.org/resource/Trans_man , 206.0
http://yago-knowledge.org/resource/Eunuch , 2.0
http://yago-knowledge.org/resource/Androgyny , 2.0

The good news is that the females are finally present, and in force (over 1.2M). However, the IRI itself is a broken link (404), and I can't find it in the "Incoming properties" section of https://yago-knowledge.org/resource/Gender_identity (which does include male_*).
What am I missing?
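
To make the difference between the two histograms above explicit, here is a small sketch that compares their object sets. The counts are copied from the two listings, with the common `http://yago-knowledge.org/resource/` prefix stripped for brevity; the variable names are illustrative.

```python
# Object counts copied from the "wiki" flavor histogram.
wiki = {
    "Trans_woman": 618, "Two-spirit": 2, "genderfluid_Q18116794": 7,
    "Hijra_(South_Asia)": 1, "genderqueer_Q12964198": 3,
    "transmasculine_Q27679766": 1, "Non-binary_gender": 156,
    "male_Q6581097": 2946797, "Transgender": 6, "Māhū": 1,
    "Trans_man": 191, "Eunuch": 1, "Androgyny": 1,
}

# Object counts copied from the "full" flavor histogram.
full = {
    "Trans_woman": 675, "Two-spirit": 2, "genderfluid_Q18116794": 9,
    "genderqueer_Q12964198": 4, "transmasculine_Q27679766": 1,
    "Non-binary_gender": 187, "male_Q6581097": 4418009, "Transgender": 6,
    "Māhū": 1, "female_Q6581072": 1257437,
    "cisgender_female_Q15145779": 1, "cisgender_male_Q15145778": 1,
    "Trans_man": 206, "Eunuch": 2, "Androgyny": 2,
}

# Objects that occur in one flavor but not in the other.
only_in_full = sorted(set(full) - set(wiki))
only_in_wiki = sorted(set(wiki) - set(full))

print("only in full:", only_in_full)
print("only in wiki:", only_in_wiki)
```

Besides `female_Q6581072`, this also surfaces the two cisgender entities that appear only in "full", and `Hijra_(South_Asia)`, which appears only in "wiki".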

Tpt added a commit that referenced this issue Sep 8, 2020