Here, we introduce PTMGPT2, a suite of models capable of generating tokens that signify modified protein sequences, crucial for identifying PTM sites. At the core of this platform is PROTGPT2, an autoregressive transformer model. We adapted PROTGPT2 as a pre-trained base and fine-tuned it for the specific task of generating classification labels for a given PTM type. Uniquely, PTMGPT2 uses a decoder-only architecture, which eliminates the need for a task-specific classification head during training. Instead, the final layer of the decoder functions as a projection back to the vocabulary space, generating the next most probable token based on the patterns learned among the tokens in the input prompt.
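To make this concrete, the short sketch below is illustrative only: it assumes the publicly available nferruz/ProtGPT2 checkpoint on the Hugging Face Hub and a hypothetical sequence-window prompt, not the exact PTMGPT2 artifacts. It shows how a decoder-only model predicts without a classification head: the language-model head projects the final hidden state back onto the vocabulary, and the most probable next token is read off as the prediction.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Pre-trained base model; a fine-tuned PTMGPT2 checkpoint would be used in practice.
tokenizer = AutoTokenizer.from_pretrained("nferruz/ProtGPT2")
model = AutoModelForCausalLM.from_pretrained("nferruz/ProtGPT2")
model.eval()

prompt = "MKVLAAGICSTPK"  # hypothetical window around a candidate site
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # shape: (batch, seq_len, vocab_size)
next_token_id = logits[0, -1].argmax().item()  # projection back to vocabulary space
print(tokenizer.decode(next_token_id))         # after fine-tuning, this would be a label token

Because prediction is just next-token generation, fine-tuning only has to teach the model to emit the correct label token after the prompt; no additional output layer is introduced.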
Link - (https://nsclbio.jbnu.ac.kr/GPT_model/)
Contact us directly at [email protected] for bulk predictions and trained models
Link - (https://nsclbio.jbnu.ac.kr/tools/ptmgpt2/)
Link - (https://doi.org/10.5281/zenodo.11371883)
Link - (https://zenodo.org/records/11362322)
Link - (https://doi.org/10.5281/zenodo.11377398)
python 3.11.3
transformers 4.29.2
scikit-learn 1.2.2
pytorch 2.0.1
pytorch-cuda 11.7
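The following snippet is a minimal check, assuming the standard PyPI distributions of these packages, to confirm that an installed environment matches the versions listed above.

import sys
import torch
import transformers
import sklearn

print("python      ", sys.version.split()[0])
print("transformers", transformers.__version__)
print("scikit-learn", sklearn.__version__)
print("pytorch     ", torch.__version__)
print("cuda        ", torch.version.cuda, "available:", torch.cuda.is_available())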
• Model: This folder hosts a sample model designed to predict PTM sites from given
protein sequences, illustrating PTMGPT2’s application.
• Tokenizer: This folder contains a sample tokenizer responsible for tokenizing
protein sequences, including handcrafted tokens for specific amino acids or motifs.
• Inference.ipynb: This file provides executable code for applying the PTMGPT2 model
  and tokenizer to predict PTM sites, serving as a practical guide for users applying
  the model to their own datasets (a minimal usage sketch follows this list).
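As a rough guide to the kind of workflow Inference.ipynb walks through, the sketch below loads the sample model and tokenizer from the Model and Tokenizer folders listed above and generates a label token for a candidate-site sequence window. The prompt layout, window length, and label vocabulary here are assumptions for illustration, not the exact notebook contents.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the sample artifacts from the repository folders described above.
tokenizer = AutoTokenizer.from_pretrained("Tokenizer")
model = AutoModelForCausalLM.from_pretrained("Model")
model.eval()

def predict_site(window: str) -> str:
    """Generate a label token for a sequence window around a candidate residue."""
    inputs = tokenizer(window, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=1,                      # only the label token is needed
            do_sample=False,                       # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
        )
    # Keep only the newly generated token after the input prompt.
    return tokenizer.decode(out[0, inputs["input_ids"].shape[1]:]).strip()

# Example call with an illustrative 21-residue window centred on a candidate serine.
print(predict_site("LKRATSVEGPRSQSPLSAHSR"))

Greedy decoding with max_new_tokens=1 mirrors the decoder-only setup described above, where a single generated label token per prompt constitutes the prediction.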