
Mappings from UN P-Codes and Wikidata Q IDs #45

fititnt opened this issue Jul 13, 2022 · 4 comments

fititnt commented Jul 13, 2022


While the mapping at admin0 ("country level") is straightforward (we can map the ISO 3166-1 codes used as P-Code prefixes to UN M49), things already get tricky at administrative boundary level 1. We know the P-Code patterns of at least some regions (such as Brazil's), where we are even lucky enough to have a ready-to-use mapping like https://www.wikidata.org/wiki/Property:P1585. But we are not sure about the rest.
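
As an illustration, a minimal sketch of pulling the P1585 ↔ Q ID pairs from the public Wikidata SPARQL endpoint (Python is used here only for illustration; nothing in this sketch asserts how the IBGE codes concatenate into P-Codes):

# Minimal sketch: fetch the Wikidata Q ID <-> IBGE code (P1585) pairs
# for Brazilian municipalities from the public WDQS endpoint.
import requests

WDQS = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?item ?ibge WHERE {
  ?item wdt:P1585 ?ibge .
}
"""

response = requests.get(
    WDQS,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "pcode-mapping-sketch/0.1"},  # WDQS asks for a UA
)
response.raise_for_status()

for row in response.json()["results"]["bindings"]:
    qid = row["item"]["value"].rsplit("/", 1)[-1]  # strip the entity/ prefix
    print(qid, row["ibge"]["value"])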

Why such mappings become relevant

Even if, in the best case, we only manage to make mappings at admin1, with only a few administrative regions getting very detailed, this alone would already allow pulling more data from Wikidata, which is by far the main hub that different people and organizations use. I personally think it is worth publishing such mappings as a dedicated public domain dataset (at least as soon as they get decent), so ITOS or OCHA could use them even if only for internal comparisons. However, this "soon" can take time, and for population statistics such as #43 the data from such mappings would likely be less accurate than what OCHA has, especially for countries with active crises.

In any case, these mappings would bootstrap many more mappings (including OpenStreetMap and UN/LOCODE). But by no means do I think this will be ready anytime soon (assuming it can be ready at all, since regions change over time).

Potential approaches

Tooling specialized to integrate intermediate controlled vocabularies

Note: by "intermediate controlled vocabularies" we mean anything that could be used to triangulate what could later be assumed to be an exact match with P-Codes.

This topic alone will require creating several scripts and strategies (even if the early ones become less necessary in the medium term) to learn how to derive the other relations. The ones deserving more attention are those that are relevant to run from time to time to discover new changes.

1. (Not sure, needs testing) Maybe compare by matching geometries

So far we have not attempted to run tools that could do any kind of matching by geometry, but while this would definitely need human intervention, maybe it could work.

To reach this point, we would not only need to create the scripts, but likely also let them run periodically (maybe weekly or monthly) to check the official COD-ABs against whatever geometries Wikidata has.
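
As an illustration of the core comparison, a minimal sketch using shapely and intersection-over-union (IoU) as the similarity score; the GeoJSON inputs and the threshold are illustrative assumptions, and real geometries would come from the COD-ABs and Wikidata:

# Minimal sketch: score how well two admin boundary geometries match,
# using intersection-over-union (IoU). Candidates above the threshold
# would still need human review, as noted above.
from shapely.geometry import shape

def iou(geojson_a: dict, geojson_b: dict) -> float:
    """Intersection-over-union of two GeoJSON geometry dicts."""
    a, b = shape(geojson_a), shape(geojson_b)
    union_area = a.union(b).area
    if union_area == 0:
        return 0.0
    return a.intersection(b).area / union_area

# Illustrative squares: the intersection is half of each square (IoU = 1/3).
square_1 = {"type": "Polygon", "coordinates": [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]]}
square_2 = {"type": "Polygon", "coordinates": [[[1, 0], [3, 0], [3, 2], [1, 2], [1, 0]]]}

MATCH_THRESHOLD = 0.9  # assumption; needs calibration against known pairs
score = iou(square_1, square_2)
print(f"IoU = {score:.2f}, candidate match: {score >= MATCH_THRESHOLD}")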

2. Reverse engineering the numeric part of P-Codes (and hoping Wikidata properties for them already exist)

Since the documentation on how to design P-Codes has, for more than a decade, recommended reusing existing national codes, it is likely that more regions have equivalences such as the IBGE code (P1585). The only thing we are sure of is that P-Codes, stripped of the admin0 prefix, are fully numeric (with few exceptions), which already excludes a lot of potential existing code systems.

However, the new problem becomes whether other countries have such mappings as Wikidata properties at all (and whether those mappings are kept as up to date as P1585). Otherwise, even if we could learn, country by country, how the P-Codes were designed without trial and error (and even if they were a 1:1 match with P-Codes, which, again, we cannot take for granted without human intervention), we could not use them.
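
A minimal sketch of that try-and-error check: strip the 2-letter admin0 prefix, keep only fully numeric remainders, and look them up in a candidate national code table (the tiny lookup table below is illustrative, not a real export):

# Minimal sketch: test whether the numeric part of a P-Code matches a
# known national code system. A positive hit is only a CANDIDATE match,
# still needing human review.
import re

def pcode_numeric_part(pcode: str) -> str | None:
    """Return the digits after the 2-letter admin0 prefix, or None."""
    match = re.fullmatch(r"[A-Z]{2}(\d+)", pcode)
    return match.group(1) if match else None

# Illustrative lookup table: national code -> Wikidata Q ID.
national_code_to_qid = {"3550308": "Q174"}  # IBGE code of São Paulo municipality

for pcode in ["BR3550308", "BRX123", "BR999"]:
    digits = pcode_numeric_part(pcode)
    qid = national_code_to_qid.get(digits) if digits else None
    print(pcode, "->", qid or "no candidate")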

In any case, whatever the strategy to map P-Codes to Wikidata Q IDs, we would need to document it very well to allow revision.

3. Other inferences

There are several other codes on Wikidata, from OpenStreetMap (https://www.wikidata.org/wiki/Property:P402), UN/LOCODE (https://www.wikidata.org/wiki/Property:P1937) and HASC (https://www.wikidata.org/wiki/Property:P8119) to a popular one, GeoNames (https://www.wikidata.org/wiki/Property:P1566; not sure why this one sometimes has more than one code for the same place). They might allow some way to triangulate with P-Codes, but we are not sure at the moment.

fititnt added the epic label Jul 13, 2022
fititnt added a commit that referenced this issue Jul 13, 2022
fititnt added a commit that referenced this issue Jul 14, 2022
…geolod, ix_unagrovoc, ix_unescothes, ix_unp, ix_usfactbook, ix_usworldnet, ix_xzgeonames, ix_xzglide, ix_xzgithubt, ix_xzglotto, ix_xzosmrel

fititnt commented Jul 14, 2022

@todo: edit one by one and add the missing UN M49 codes directly on Wikidata

Wikidata does have most of the ISO 3166-1 numeric codes (which, by definition, are equivalent to UN M49 codes). However, only 16 of the 177 items with an ISO 3166-1 numeric code also have the UN M49 code itself.

Not sure whether to automate this or do it one by one, but even if it takes some time, it makes sense to add the missing numeric codes so we can simplify the queries; a sketch for listing the pending items follows.
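
Whichever way the edits end up being done (manually or via a batch tool such as QuickStatements), a minimal sketch of listing the pending items, i.e. countries with an ISO 3166-1 numeric (P299) but no UN M49 (P2082):

# Minimal sketch: list country items that have ISO 3166-1 numeric (P299)
# but still lack UN M49 (P2082), i.e. the ones pending manual addition.
import requests

QUERY = """
SELECT ?item ?itemLabel ?iso_numeric WHERE {
  ?item wdt:P31 wd:Q6256 ;
        wdt:P299 ?iso_numeric .
  FILTER NOT EXISTS { ?item wdt:P2082 ?m49 . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?iso_numeric
"""

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "pcode-mapping-sketch/0.1"},
)
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    print(row["iso_numeric"]["value"], row["itemLabel"]["value"])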

Trivia

From the list of existing codes, it seems previous contributors started adding the codes from the letter A and then stopped. I am not even surprised by this: in lexicography it is quite common for the first letters to be done with far more attention and time, and the rest rushed.

[Screenshot: Captura de tela de 2022-07-13 23-16-59]

Current SPARQL query

This query is somewhat manual, but the idea would be to generate it based on which Wikidata properties are related to adm0.

In any case, this more manual query at least lets us know which properties have more data. Some, like the KML (https://www.wikidata.org/wiki/Property:P3096), have very few entries, while the geoshape (https://www.wikidata.org/wiki/Property:P3896) seems to exist for every adm0 we are getting (it returns a link to GeoJSON, based on data published by Datahub; not as perfect as the COD-ABs and surely not as updated, but it at least allows quick use of some geometries for other checks).

SELECT
  (xsd:integer(?ix_iso3166p1n) AS ?item__conceptum__codicem)
  (STRAFTER(STR(?item), "entity/") AS ?item__rem__i_qcc__is_zxxx__ix_wikiq)
  #?item__rem__i_qcc__is_zxxx__ix_iso3166p1n
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1n; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1n)
  (GROUP_CONCAT(DISTINCT ?ix_unm49; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unm49)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1a2; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1a2)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1a3; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1a3)
  (GROUP_CONCAT(DISTINCT ?ix_unescothes; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unescothes)
  (GROUP_CONCAT(DISTINCT ?ix_unagrovoc; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unagrovoc)
  (GROUP_CONCAT(DISTINCT ?ix_xzosmrel; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzosmrel)
  (GROUP_CONCAT(DISTINCT ?ix_xzgeonames; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzgeonames)
  (GROUP_CONCAT(DISTINCT ?ix_jpgeolod; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_jpgeolod)
  (GROUP_CONCAT(DISTINCT ?ix_usworldnet; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_usworldnet)
  (GROUP_CONCAT(DISTINCT ?ix_usfactbook; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_usfactbook)
  (GROUP_CONCAT(DISTINCT ?ix_xzgithubt; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzgithubt)
  (GROUP_CONCAT(DISTINCT ?ix_zzwgs84point; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_zzwgs84point)
  (GROUP_CONCAT(DISTINCT ?ix_zzgeojson; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_zzgeojson)

WHERE
{
  ?item wdt:P31 wd:Q6256 .
  OPTIONAL { ?item wdt:P2082 ?ix_unm49. }
  ?item wdt:P299 ?ix_iso3166p1n.
  OPTIONAL { ?item wdt:P297 ?ix_iso3166p1a2. }
  OPTIONAL { ?item wdt:P298 ?ix_iso3166p1a3. }
  OPTIONAL { ?item wdt:P3916 ?ix_unescothes. }
  OPTIONAL { ?item wdt:P8061 ?ix_unagrovoc. }
  OPTIONAL { ?item wdt:P402 ?ix_xzosmrel. }
  OPTIONAL { ?item wdt:P1566 ?ix_xzgeonames. }
  OPTIONAL { ?item wdt:P5400 ?ix_jpgeolod. }
  OPTIONAL { ?item wdt:P8814 ?ix_usworldnet. }
  OPTIONAL { ?item wdt:P9948 ?ix_usfactbook. }
  OPTIONAL { ?item wdt:P9100 ?ix_xzgithubt. }
  OPTIONAL { ?item wdt:P625 ?ix_zzwgs84point. }
  OPTIONAL { ?item wdt:P3896 ?ix_zzgeojson. }
}
GROUP BY ?item ?ix_iso3166p1n
ORDER BY ASC(?item__rem__i_qcc__is_zxxx__ix_iso3166p1n)

The result CSV

wikidata-admin0--2022-07-13.csv

fititnt added a commit that referenced this issue Jul 14, 2022

fititnt commented Jul 14, 2022


Done (at least a first batch), comparing both with Wikidata and the page https://unstats.un.org/unsd/methodology/m49/. All existing ISO 3166-1 numeric codes match UN M49 (I actually used them to be sure the pasted number already existed on each page). Some corner cases do exist (e.g. UN M49 528, https://www.wikidata.org/wiki/Q29999 vs https://www.wikidata.org/wiki/Q55, where a historical country exists; I opted to only add UN M49 to non-historical administrative areas).

@todo: also add numeric codes even for items without ISO 3166-1 numeric on Wikidata

Even with the repeated Q29999, the 177 lines of data on wikidata-admin0--2022-07-14.csv are still fewer than the 249 on the list at https://unstats.un.org/unsd/methodology/m49/. Closing the gap could be done by manually comparing the ISO 3166-1 alpha-3 codes and the region names in natural language; a sketch of that comparison follows.
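
A minimal sketch of that comparison, assuming the UNSD page has been exported to a CSV; the file name unsd-m49.csv and the column headers are assumptions to adjust to the real export:

# Minimal sketch: find rows on the UNSD M49 list with no match in the
# Wikidata export, comparing by ISO 3166-1 alpha-3. Column names are
# assumptions based on the two sources; adjust to the real headers.
import csv

def load_alpha3(path: str, column: str) -> dict[str, dict]:
    with open(path, newline="", encoding="utf-8") as handle:
        return {row[column]: row for row in csv.DictReader(handle) if row[column]}

unsd = load_alpha3("unsd-m49.csv", "ISO-alpha3 Code")  # placeholder file name
wikidata = load_alpha3("wikidata-admin0--2022-07-14.csv",
                       "item__rem__i_qcc__is_zxxx__ix_iso3166p1a3")

for alpha3, row in sorted(unsd.items()):
    if alpha3 not in wikidata:
        print(alpha3, row.get("Country or Area", ""))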


fititnt commented Jul 15, 2022

Current draft of adm0

We shifted from the hacky use of ix_iso3166p1n to ix_unm49 for direct generation of the local numeric identifiers. From now on, it becomes necessary to keep Wikidata up to date.

Trivia: the way Wikidata validation rules work means an adm0 without ix_iso3166p1a2 would raise warnings, and some countries which are not current (like the Kingdom of the Netherlands, Q29999) would still return results. There is some discussion, such as https://www.wikidata.org/wiki/Talk:Q29999#Is_country_tag_really_appropriate, for cases like this, but since no one relies much on UN M49 as a key for other Wikidata validation rules, the ones we are using might be easier to maintain.

SELECT
  (xsd:integer(?ix_unm49) AS ?item__conceptum__codicem)
  (STRAFTER(STR(?item), "entity/") AS ?item__rem__i_qcc__is_zxxx__ix_wikiq)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1n; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1n)
  (GROUP_CONCAT(DISTINCT ?ix_unm49; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unm49)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1a2; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1a2)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p1a3; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p1a3)
  (GROUP_CONCAT(DISTINCT ?ix_iso3166p2; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_iso3166p2)
  (GROUP_CONCAT(DISTINCT ?ix_unescothes; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unescothes)
  (GROUP_CONCAT(DISTINCT ?ix_unagrovoc; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_unagrovoc)
  (GROUP_CONCAT(DISTINCT ?ix_xzosmrel; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzosmrel)
  (GROUP_CONCAT(DISTINCT ?ix_xzgeonames; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzgeonames)
  (GROUP_CONCAT(DISTINCT ?ix_jpgeolod; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_jpgeolod)
  (GROUP_CONCAT(DISTINCT ?ix_usworldnet; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_usworldnet)
  (GROUP_CONCAT(DISTINCT ?ix_usfactbook; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_usfactbook)
  (GROUP_CONCAT(DISTINCT ?ix_xzgithubt; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_xzgithubt)
  (GROUP_CONCAT(DISTINCT ?ix_zzwgs84point; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_zzwgs84point)
  (GROUP_CONCAT(DISTINCT ?ix_zzgeojson; separator = "|") AS ?item__rem__i_qcc__is_zxxx__ix_zzgeojson)

WHERE
{
  ?item wdt:P31 wd:Q6256 ;
  wdt:P2082 ?ix_unm49 .
  OPTIONAL { ?item wdt:P299 ?ix_iso3166p1n . }
  OPTIONAL { ?item wdt:P297 ?ix_iso3166p1a2 . }
  OPTIONAL { ?item wdt:P298 ?ix_iso3166p1a3 . }
  OPTIONAL { ?item wdt:P300 ?ix_iso3166p2 . }
  OPTIONAL { ?item wdt:P3916 ?ix_unescothes . }
  OPTIONAL { ?item wdt:P8061 ?ix_unagrovoc . }
  OPTIONAL { ?item wdt:P402 ?ix_xzosmrel . }
  OPTIONAL { ?item wdt:P1566 ?ix_xzgeonames . }
  OPTIONAL { ?item wdt:P5400 ?ix_jpgeolod . }
  OPTIONAL { ?item wdt:P8814 ?ix_usworldnet . }
  OPTIONAL { ?item wdt:P9948 ?ix_usfactbook . }
  OPTIONAL { ?item wdt:P9100 ?ix_xzgithubt . }
  OPTIONAL { ?item wdt:P625 ?ix_zzwgs84point . }
  OPTIONAL { ?item wdt:P3896 ?ix_zzgeojson . }
}
GROUP BY ?item ?ix_unm49
ORDER BY ASC(?item__conceptum__codicem)

Result CSV

TODOs

Note: the manual addition of the other UN M49 codes on Wikidata is still not done (around 50-70 are missing). For now, focusing on getting other things working.

Minimum viable product of administrative boundaries level 1

If the country level (sense: sovereign state, Q3624078) is already complicated, at adm1 (first subdivision) things get even more interesting. Even an adm0 without disputes can be arranged in ways that allow more than one strategy to reach its subdivisions, and unless we know upfront every Wikidata Q item for the subregions, getting such regions becomes complex.

In practice, we may need to first bootstrap each P-Code to a Q ID, then reuse the Q IDs for everything else. From time to time Wikidata may change how some data is queried, so this approach may be more fail-safe; a minimal sketch of that bootstrap layer follows.
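
The resolver below is a stub for whichever strategy from the earlier comments proves reliable; the file name and function names are illustrative assumptions:

# Minimal sketch: persist the P-Code -> Q ID bootstrap so later steps
# can query by Q ID directly. resolve_pcode() is a stub for whichever
# matching strategy (geometry, national codes, ...) proves reliable.
import json
from pathlib import Path

CACHE = Path("pcode-to-qid.json")  # placeholder cache file

def resolve_pcode(pcode: str) -> str | None:
    """Stub: plug in the actual matching strategy here."""
    raise NotImplementedError

def bootstrap(pcodes: list[str]) -> dict[str, str]:
    mapping = json.loads(CACHE.read_text()) if CACHE.exists() else {}
    for pcode in pcodes:
        if pcode not in mapping:
            qid = resolve_pcode(pcode)
            if qid is not None:
                mapping[pcode] = qid
    CACHE.write_text(json.dumps(mapping, indent=2, sort_keys=True))
    return mapping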

The page https://www.wikidata.org/wiki/Wikidata_talk:WikiProject_Country_subdivision/Items does have a query for a quick look at the divisions:

SELECT ?country ?countryLabel ?item ?itemLabel ?level ?expected ?found ?samenumber
WITH {
  SELECT ?item ?expected ?country ?level (COUNT(DISTINCT ?place) AS ?found) {
    ?item wdt:P279* ?acs ; wdt:P17 ?country.
    FILTER NOT EXISTS { ?country wdt:P576 [] }
    ?acs p:P279 [ ps:P279 wd:Q1799794 ; pq:P1545 ?level ] .
    
    OPTIONAL { ?item wdt:P1114 ?expected }    
    OPTIONAL { 
      ?place p:P31 ?placeStatement .
      ?placeStatement ps:P31 ?item.
      FILTER NOT EXISTS { ?placeStatement wdt:P582 [] }
    }  
  } 
  GROUP BY ?item ?expected ?country ?level
} AS %subdivisions
WHERE {
  INCLUDE %subdivisions.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  BIND(IF(?expected = ?found, "same", "differs") AS ?samenumber).
} 
ORDER BY ?countryLabel ?level DESC(?expected) ?itemLabel

This query gives a hint of ways to reach the subdivisions and to compare whether the numbers match. However, when the numbers do not match, there may be valid reasons. In general the 'expected' count is lower than what the results find, so I think the simpler cases are disputed territories or references to items (at the first subdivision) which are historical, like the Kingdom of the Netherlands; but there are so many items that there may be no direct way to know whether they are historical (or, if there is, the SPARQL would be a huge, huge query).

fititnt added a commit that referenced this issue Jul 19, 2022
fititnt added a commit that referenced this issue Jul 20, 2022
fititnt added a commit that referenced this issue Jul 20, 2022
fititnt added a commit to EticaAI/MDCIII-boostrapper that referenced this issue Jul 21, 2022
fititnt added a commit that referenced this issue Jul 21, 2022
…xl.csv (near 250+ languages for each adm0) if cached file exist on disk
fititnt added a commit that referenced this issue Jul 21, 2022
fititnt added a commit that referenced this issue Jul 24, 2022
fititnt added a commit to EticaAI/MDCIII-boostrapper that referenced this issue Jul 24, 2022
fititnt added a commit that referenced this issue Jul 25, 2022
fititnt added a commit to EticaAI/MDCIII-boostrapper that referenced this issue Jul 25, 2022

fititnt commented Jul 25, 2022

Okay. Done. The public repo https://github.com/MDCIII/1603_16_1 (more specifically this commit, MDCIII/1603_16_1@7f328dc) is starting to automate administrative boundaries level 0 from far more data.

Notable to-dos

metadata for level 0

Since we are able to generate much more metadata at the "country/territory" level, the MDCIII/1603_16_1 (1603_16_1_0) can be expanded by every other 1603_16_N_0 on MDCIII.

Compared to other tasks, this one is more trivial.

Less trivial to-dos

metadata for level 1

Already at level 1 the mappings are hard to scale up, because neither the CODs from OCHA nor Wikidata itself interlink the subdivisions directly, even via other vocabularies such as ISO 3166-2 (https://en.wikipedia.org/wiki/ISO_3166-2).

Not saying it is not feasible, just that we would need to do several triangulations. The data most likely is already on Wikipedia, but might not be interlinked on Wikidata.

And even if we could extract tables from ISO 3166-2, these nodes would likely need to be matched against P-Codes, and (this is relevant) without requiring overly manual review. In practice, we might already need static checkers or validators to warn about inconsistencies, though from time to time things would still be wrong because of disputed territories; a sketch of such a check is below.
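
Assuming the mapping arrives as (P-Code, Q ID) pairs (an assumption; the real input format may differ), the check could warn whenever the relation is not 1:1 in either direction:

# Minimal sketch: warn about P-Code <-> Q ID pairs that are not 1:1,
# the kind of inconsistency a recurring validator would flag for review.
# Disputed-territory exceptions would be an allowlist on top of this.
from collections import defaultdict

def check_one_to_one(pairs: list[tuple[str, str]]) -> list[str]:
    by_pcode, by_qid = defaultdict(set), defaultdict(set)
    for pcode, qid in pairs:
        by_pcode[pcode].add(qid)
        by_qid[qid].add(pcode)
    warnings = []
    for pcode, qids in by_pcode.items():
        if len(qids) > 1:
            warnings.append(f"P-Code {pcode} maps to several Q IDs: {sorted(qids)}")
    for qid, pcodes in by_qid.items():
        if len(pcodes) > 1:
            warnings.append(f"Q ID {qid} maps to several P-Codes: {sorted(pcodes)}")
    return warnings

# Illustrative fake data: XX01 and Q200 each appear twice.
print(check_one_to_one([("XX01", "Q100"), ("XX01", "Q200"), ("XX02", "Q200")]))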

metadata for level 2 and beyond

For Brazil we already know how to do it (IBGE codes are already 1:1 with UN P-Codes), and level 1 of Brazil is few enough states to be worth some manual copy-pasting.

However, for other regions we would already have to start checking by geometry or something similar. So, depending on the focus, we could use Brazil as a benchmark to check whether automation would work; a sketch of that benchmark follows.
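
It treats the Brazilian mapping derived from IBGE codes as ground truth and measures how often an automated matcher agrees with it (all data below is fake and illustrative):

# Minimal sketch: use a trusted mapping (e.g. Brazil via IBGE codes) as
# ground truth to score an automated matcher before trusting it elsewhere.
def benchmark(truth: dict[str, str], predicted: dict[str, str]) -> float:
    """Share of ground-truth P-Codes the matcher resolved to the same Q ID."""
    if not truth:
        return 0.0
    hits = sum(1 for pcode, qid in truth.items() if predicted.get(pcode) == qid)
    return hits / len(truth)

# Illustrative fake data: the matcher got 2 of 3 right.
truth = {"XX01": "Q100", "XX02": "Q200", "XX03": "Q300"}
predicted = {"XX01": "Q100", "XX02": "Q999", "XX03": "Q300"}
print(f"agreement = {benchmark(truth, predicted):.0%}")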

However, I am already sure that for small countries this will always be hard. Not surprisingly, things can also evolve over time.
