Python API for the Turkish Language Association
tdk-py is a Python package for accessing the Turkish dictionaries of the TDK, the Turkish Language Association.
tdk-py provides both synchronous and asynchronous interfaces to the TDK's APIs and parses their responses into Pydantic-based Python objects, so you can call .model_dump_json() on them, or use them in your API endpoints and generate beautiful schemas.
tdk-py is supported on Python 3.10+. First, make sure you have a Python environment set up.
# in the shell
poetry add tdk-py # if using python-poetry.org (recommended)
pipenv install tdk-py # if using pipenv.pypa.io
pip install tdk-py # straight pip.pypa.io
# in Python
import tdk
results = tdk.search_gts_sync("merkeziyetçilik")
print(results[0].meanings[0].meaning)
Otoritenin ve işin tek bir merkezde toplanmasını amaçlayan görüş; merkeziyet, merkezcilik
You can query suggestions for misspelt words or for other similar words.
from difflib import get_close_matches
import tdk
# Calculate suggestions locally using the index:
words = get_close_matches("feldispat", tdk.get_gts_index_sync())
assert words == ['feldspat', 'ispat', 'fesat']
# Use the TDK API: (sometimes errors out)
words = tdk.get_gts_suggestions_sync("feldispat")
assert words == ['feldspat', 'felekiyat', 'ispat']
The index also makes dictionary-wide analyses straightforward. For example, the following computes the distribution of entries by their longest run of consecutive consonants.
import tdk
annotated_dict = {}
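# Group index entries by their longest run of consecutive consonants.
for entry in tdk.get_gts_index_sync():
    streaks = tdk.tools.max_streak(entry)
    if streaks not in annotated_dict:
        annotated_dict[streaks] = [entry]
    else:
        annotated_dict[streaks].append(entry)
# Print the number of entries for each streak length, in ascending order.
for i in sorted(annotated_dict):
    print(i, len(annotated_dict[i]))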
0 19
1 15199
2 73511
3 3605
4 68
5 5
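To make the metric concrete, here is a rough standalone sketch of what a longest-consonant-run function could look like. It is not tdk-py's implementation of tdk.tools.max_streak: the name `max_consonant_streak`, the vowel set, and the choice to let non-letters end a run are all assumptions made for illustration. The sample word list is made up as well.

```python
from collections import Counter

# Assumption: Turkish vowels, including circumflexed forms, in both cases.
# Uppercase forms are listed explicitly because lowercasing "İ" in Python
# yields "i" plus a combining dot, which would otherwise skew the count.
VOWELS = frozenset("aeıioöuüâîûAEIİOÖUÜÂÎÛ")


def max_consonant_streak(word: str) -> int:
    """Return the length of the longest run of consecutive consonants."""
    longest = current = 0
    for ch in word:
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            # Vowels, spaces, and punctuation all end the current run.
            current = 0
    return longest


# Tally how many sample words fall into each streak length.
sample = ["a", "ev", "merkeziyetçilik", "feldspat"]
distribution = Counter(max_consonant_streak(w) for w in sample)
for streak in sorted(distribution):
    print(streak, distribution[streak])
```

Using a Counter here replaces the manual dictionary bookkeeping when only the counts, not the entries themselves, are needed.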
tdk-py's source code is provided under the MIT License.
Copyright © 2021-2024 Emre Özcan