The TNLTK project is under active development and we have several plans for the future. Our main goal is to continue to improve the functionality and usability of the library, while also expanding its capabilities to cover new NLP tasks and applications. Here are some of the key features and improvements that we are currently working on or plan to implement in the near future:
New NLP tasks:
We are currently working on adding support for new NLP tasks such as named entity recognition, text summarization, and machine translation. These tasks will be integrated into the library in a way that is easy to use and consistent with the existing functionality.
Improvements in performance and scalability:
We are working on improving the performance and scalability of the library, especially for large datasets and high-volume use cases. This will involve optimizing the existing algorithms and models, as well as adding support for distributed computing and parallel processing.
Expanded documentation and tutorials:
We will be expanding the documentation and tutorials to cover new features and use cases, as well as providing more detailed explanations and examples.
Support for deep learning and neural networks:
We plan to add support for deep learning and neural networks to the library in the future. This will involve integrating popular deep learning frameworks such as TensorFlow and PyTorch, and developing new models and algorithms for NLP tasks.
Evaluation and benchmarking:
We will be conducting more detailed evaluations and benchmarking of the library to measure its performance and accuracy, as well as comparing it to other popular NLP libraries.
The TNLTK project, created by Tarık Kaan Koç, aims to empower developers and researchers by providing a comprehensive and user-friendly library for Turkish natural language processing tasks.
The Turkish Natural Language Toolkit currently includes the following methods described in the literature:
- {To be edited...}
- {To be edited...}
- {To be edited...}
Please refer to the TNLTK Documentation before using the toolkit.
Python version requirements: 3.8 <= python <= 3.10
$ pip install tnltk
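Before installing, it can help to confirm that the active interpreter falls inside the supported range stated above. The following is a minimal sketch using only the standard library; the names `MIN_VERSION`, `MAX_VERSION`, and `is_supported` are illustrative and are not part of TNLTK.

```python
import sys

# Supported range taken from the requirement above: 3.8 <= python <= 3.10.
# These constants and the helper below are illustrative, not part of TNLTK.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair lies inside the supported range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside the supported range 3.8 to 3.10.")
```

Only the major and minor numbers are compared, so any patch release of 3.8, 3.9, or 3.10 passes the check.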
If you are interested in directly contributing to this project, please see CONTRIBUTING.
If you have problems installing gcc using the command above, we recommend installing it with Homebrew.
Additionally, you can refer to examples in the respective folder.
If you find TNLTK to be useful, please consider citing it in your published work:
@misc{TNLTK,
author = {Tarik Kaan Koc},
title = {TNLTK: Turkish Natural Language Toolkit},
subtitle = {Unlocking the potential of Turkish text data with TNLTK},
description = {TNLTK is a comprehensive toolkit for natural language processing (NLP) tasks in the Turkish language. It includes a wide range of features, such as tokenization, stemming, and POS tagging, and is designed to be highly accurate and easy to use.},
source-code = "https://github.com/tnltk/tnltk",
docs = "https://tnltk.readthedocs.io/en/latest/",
year = {2023},
}
This project is open source under the LICENSE.
Please note that this project is provided "as is" and comes with no warranty. (Use of this software is subject to the terms of the license agreement.) This software is licensed under Apache 2.0. See LICENSE.
- Koehn, P. and Schroeder, J. (n.d.). The non-breaking Turkish prefixes text file is taken from this repository. (I converted it to a list of Turkish prefixes, which you can see in the source code.)
- "A Comparative Study on Turkish Deasciification Methods": a scientific article that compares various methods for Turkish deasciification.
- This repository contains a Turkish Deasciifier implementation in Python.
- This repository contains a Turkish Deasciifier implementation in Python which is based on a statistical model.