Experiments in converting wikidata to ftm

Last update: Nov 12, 2021

Overview

FollowTheMoney / Wikidata mappings

This repo will contain tools for converting Wikidata entities into FollowTheMoney (FtM) entities.

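Prefixes: https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Full_list_of_prefixes

Related queries and examples:

- Wikidata query in every-politician-scraper: https://github.com/tmtmtmtm/every-politician-scraper/blob/main/lib/every_politician_scraper/wikidata_query.rb#L87
- Example query for current U.S. Senate members with district, party and date they assumed office: https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Current_U.S._members_of_the_Senate_with_district,_party_and_date_they_assumed_office
- Stack Overflow, getting all properties with labels and values of an item: https://stackoverflow.com/questions/46383784/wikidata-get-all-properties-with-labels-and-values-of-an-item/46385132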

What do I want?

People as entities
Organizations as entities
Membership of people in certain bodies
- Query all legislatures
- Query all cabinets
- Query state-owned enterprises (SOEs)

Query experimentation

Get every statement on a given entity (here wd:Q180589), with the statement value and any qualifiers attached to it:

select distinct ?statement ?wd ?wdLabel ?ps_ ?ps_Label ?wdpq ?wdpqLabel ?pq_ ?pq_Label where {
    wd:Q180589 ?p ?statement .
    ?statement ?ps ?ps_ .

    ?wd wikibase:claim ?p.
    ?wd wikibase:statementProperty ?ps.

    OPTIONAL {
      ?statement ?pq ?pq_ .
      ?wdpq wikibase:qualifier ?pq .
    }

    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
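
A rough sketch of how the results of such a query could be consumed from Python. It assumes the standard SPARQL JSON results layout (`results.bindings`, each variable mapped to an object with `type` and `value`) and the namespaces from the prefix list linked above. The names `compact`, `flatten_bindings` and `run_query`, and the User-Agent string, are hypothetical and not part of this repo.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# A few namespaces from the Wikibase RDF prefix list.
PREFIXES = {
    "http://www.wikidata.org/entity/statement/": "wds",
    "http://www.wikidata.org/entity/": "wd",
    "http://www.wikidata.org/prop/statement/": "ps",
    "http://www.wikidata.org/prop/qualifier/": "pq",
    "http://www.wikidata.org/prop/": "p",
}


def compact(uri):
    """Shorten a full Wikidata URI to prefixed form; unknown URIs pass through."""
    # Try the longest namespace first so wds: wins over wd:, ps: over p:, etc.
    for namespace in sorted(PREFIXES, key=len, reverse=True):
        if uri.startswith(namespace):
            return f"{PREFIXES[namespace]}:{uri[len(namespace):]}"
    return uri


def flatten_bindings(payload):
    """Turn SPARQL JSON results into a list of plain dicts."""
    rows = []
    for binding in payload["results"]["bindings"]:
        row = {}
        for name, cell in binding.items():
            value = cell["value"]
            row[name] = compact(value) if cell["type"] == "uri" else value
        rows.append(row)
    return rows


def run_query(query):
    """Send a query to the public endpoint (needs network access)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        f"{ENDPOINT}?{params}",
        headers={"User-Agent": "wikidata-ftm-experiments/0.1"},  # hypothetical
    )
    with urllib.request.urlopen(request) as response:
        return flatten_bindings(json.load(response))
```

Unbound OPTIONAL variables are simply absent from a binding, so rows for statements without qualifiers will lack the `pq_` keys.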

Which qualifiers (i.e. the metadata properties attached to a statement) are we interested in?

pq:P580 - start time
pq:P582 - end time
pq:P585 - point in time
pq:P4100 - parliamentary group
pq:P768 - electoral district
pq:P1534 - end cause
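
One way to avoid hand-writing an OPTIONAL block per property is to generate the query from a table of the properties above. This is only a sketch: it assumes all six are read as qualifiers (the `pq:` prefix), and the variable names and the `build_query` helper are made up for illustration.

```python
# Property IDs from the list above, mapped to hypothetical SPARQL variable names.
QUALIFIERS = {
    "P580": "starttime",
    "P582": "endtime",
    "P585": "time",
    "P4100": "group",
    "P768": "district",
    "P1534": "endcause",
}


def build_query(qid, qualifiers=QUALIFIERS):
    """Build a statement query for one entity with an OPTIONAL per qualifier."""
    variables = " ".join(f"?{name}" for name in qualifiers.values())
    optionals = "\n".join(
        f"    OPTIONAL {{ ?statement pq:{pid} ?{name} . }}"
        for pid, name in qualifiers.items()
    )
    return (
        f"select distinct ?statement ?wd ?wdLabel ?value ?valueLabel {variables} where {{\n"
        f"    wd:{qid} ?p ?statement .\n"
        "    ?statement ?ps ?value .\n"
        "    ?wd wikibase:claim ?p .\n"
        "    ?wd wikibase:statementProperty ?ps .\n"
        f"{optionals}\n"
        '    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }\n'
        "}"
    )


if __name__ == "__main__":
    print(build_query("Q180589"))
```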

select distinct ?statement ?wd ?wdLabel ?value ?valueLabel ?starttime ?endtime ?time where {
    wd:Q180589 ?p ?statement .
    ?statement ?ps ?value .

    ?wd wikibase:claim ?p.
    ?wd wikibase:statementProperty ?ps.

    OPTIONAL {
      ?statement pq:P580 ?starttime.
    }
    OPTIONAL {
      ?statement pq:P582 ?endtime.
    }
    OPTIONAL {
      ?statement pq:P585 ?time.
    }

    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
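
As a sketch of the conversion step, a result row from this query could be mapped onto something shaped like an FtM Membership. Several things here are assumptions: that the row is a plain dict keyed by the query's variable names, that the statement value is an organization (true for "member of" statements, not for every property), and that the FtM Membership schema exposes `member`, `organization`, `startDate`, `endDate` and `date`, which should be checked against the followthemoney model. It builds plain dicts rather than using the followthemoney library.

```python
def wikidata_date(value):
    """Trim a timestamp such as 2019-07-24T00:00:00Z to its date part.

    The query service returns full timestamps even for year-only values,
    so the original precision is not recoverable from this string alone.
    """
    if not value:
        return None
    return value.split("T", 1)[0]


# Query variable -> assumed FtM property name.
DATE_COLUMNS = (
    ("starttime", "startDate"),
    ("endtime", "endDate"),
    ("time", "date"),
)


def row_to_membership(person_id, row):
    """Map one result row to a dict shaped like an FtM Membership entity."""
    properties = {
        "member": [person_id],
        "organization": [row["value"]],
    }
    for column, prop in DATE_COLUMNS:
        date = wikidata_date(row.get(column))
        if date:
            properties[prop] = [date]
    return {"schema": "Membership", "properties": properties}
```

Rows without a given qualifier just omit the corresponding property, which matches how the OPTIONAL blocks leave variables unbound.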

How to get source/reference URLs: https://en.wikibooks.org/wiki/SPARQL/WIKIDATA_Qualifiers,_References_and_Ranks
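
A minimal sketch of that pattern: statements link to reference nodes via `prov:wasDerivedFrom`, and the reference URL is stored on the node as `pr:P854`. The helper names and the `refurl` variable are hypothetical, and the rows are assumed to be plain dicts keyed by variable name.

```python
from collections import defaultdict


def reference_query(qid):
    """Build a query returning reference URLs for each statement of an entity."""
    return "\n".join([
        "select ?statement ?refurl where {",
        f"    wd:{qid} ?p ?statement .",
        "    ?statement prov:wasDerivedFrom ?refnode .",
        "    ?refnode pr:P854 ?refurl .",
        "}",
    ])


def group_references(rows):
    """Collect unique reference URLs per statement, preserving order."""
    refs = defaultdict(list)
    for row in rows:
        url = row.get("refurl")
        if url and url not in refs[row["statement"]]:
            refs[row["statement"]].append(url)
    return dict(refs)
```

Statements without a reference URL do not appear in the result, since the reference pattern is not wrapped in OPTIONAL.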

Owner

Friedrich Lindenberg
