CodeRefine is a framework for improving the quality of code that Large Language Models (LLMs), specifically GPT-4o, generate from research papers. This repository contains the CodeRefine pipeline, which refines the synthesized code through a structured, iterative process.
Clone the repository:
git clone https://github.com/AbhijitJowhari/CodeRefine.git
Navigate to the cloned directory:
cd CodeRefine
The CodeRefine pipeline consists of several steps designed to extract relevant information from research papers, construct knowledge graphs, and retrospectively refine the code output by querying related research papers and utilizing LLMs.
Create the following directories in the root of the repository:
mkdir query_paper query_paper_xml ref_papers ref_papers_xml
query_paper: Add the input paper to this directory.
query_paper_xml: This directory will store the XML representation of the input paper.
ref_papers: Add both the input paper and its references to this directory.
ref_papers_xml: This directory will store the XML representation of the reference papers.
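For setups where a script is more convenient than running mkdir by hand, the directory layout above can be sketched in Python. This is only an illustration: the helper name create_pipeline_dirs and the PIPELINE_DIRS constant are hypothetical and are not part of the CodeRefine repository; only the four directory names come from the instructions above.

```python
from pathlib import Path

# Directory names taken from the mkdir command in this README.
# The constant and function names below are hypothetical helpers.
PIPELINE_DIRS = ("query_paper", "query_paper_xml", "ref_papers", "ref_papers_xml")


def create_pipeline_dirs(root="."):
    """Create the four pipeline directories under root.

    Directories that already exist are left untouched, so the
    function is safe to call more than once.
    """
    root_path = Path(root)
    created = []
    for name in PIPELINE_DIRS:
        path = root_path / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_pipeline_dirs() from the root of the repository should leave the same layout as the mkdir command; the input paper and its references still have to be copied into query_paper and ref_papers manually.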
You need to generate API keys for the following services: OpenAI, Groq, and Mixedbread.
Set these API keys as environment variables in your terminal:
export OPENAI_API_KEY="<your_openai_api_key>"
export GROQ_API_KEY="<your_groq_api_key>"
export MXBAI_API_KEY="<your_mixedbread_api_key_here>"
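Before starting the pipeline, it can help to confirm that all three variables are actually visible to Python. The check below is a minimal sketch, not code from the repository: the function name missing_api_keys is hypothetical, and it only assumes the three variable names shown in the export commands above.

```python
import os

# Variable names taken from the export commands in this README.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY", "MXBAI_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or empty.

    env defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

An empty list from missing_api_keys() means every key is set in the current shell session. Note that variables set with export only last for that session, so the check needs to run in the same terminal that will run the pipeline.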
Execute the following command from the root directory of the repository:
python3 main.py
This project is licensed under the MIT License. See the LICENSE file for details.