Josephaedan/LeetScraper

LeetScraper

LeetScraper scrapes LeetCode questions and serves them through a FastAPI server. Docker containerizes the FastAPI application and a MongoDB instance, so both can be built and started together with docker-compose.

Getting Started

1. Clone the Repository

git clone https://github.com/Josephaedan/LeetScraper.git
cd LeetScraper

2. Setup Environment Variables

The project can be configured using environment variables in a .env file in the root directory. Here's a breakdown of the available variables:

  • MONGO_HOST: The host for the MongoDB instance. Default is localhost.
  • MONGO_PORT: The port for the MongoDB instance. Default is 27017.

A sample file, .env.example, is provided. Copy it to a new file named .env:

cp .env.example .env

Modify the .env file with your desired configuration.
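
As an illustration of how the two variables above might be consumed, here is a minimal Python sketch. It is not taken from the project's source: the helper name `mongo_uri` is hypothetical, and it assumes the variables are already present in the process environment (for example, passed into the container by Docker Compose). Only the variable names and their defaults come from the list above.

```python
import os


def mongo_uri(env=None):
    """Build a MongoDB connection URI from MONGO_HOST and MONGO_PORT.

    Falls back to the documented defaults (localhost, 27017) when a
    variable is unset. `mongo_uri` is a hypothetical helper name.
    """
    env = os.environ if env is None else env
    host = env.get("MONGO_HOST", "localhost")
    # int() rejects a non-numeric port early instead of failing at connect time.
    port = int(env.get("MONGO_PORT", "27017"))
    return f"mongodb://{host}:{port}"


print(mongo_uri({}))  # mongodb://localhost:27017
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to check without touching the real environment.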

3. Build and Run with Docker

docker-compose build
docker-compose up

Running the Project

Once you've set up the environment variables and started the Docker containers, you can access the FastAPI server at:

http://localhost:8000

For API documentation, visit:

http://localhost:8000/docs
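
To confirm from a script that the server is up, one option is to read the machine-readable OpenAPI document behind the docs page. The sketch below assumes the project keeps FastAPI's default `/openapi.json` location, which this README does not confirm. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def list_paths(openapi_doc):
    """Return the sorted route paths declared in an OpenAPI document."""
    return sorted(openapi_doc.get("paths", {}))


def fetch_openapi(base_url="http://localhost:8000"):
    """Download the OpenAPI document (requires the server to be running).

    Assumes FastAPI's default openapi_url, /openapi.json.
    """
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)
```

With the containers running, `list_paths(fetch_openapi())` should list the routes the server exposes.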

Data Schema

Each question scraped from LeetCode is saved with the following schema:

  • _id: A unique identifier for the document. Automatically generated by MongoDB.
  • id: The question's unique ID on LeetCode.
  • title: The title of the question.
  • category: The category of the question (e.g., "Algorithms").
  • description: A detailed description of the problem, including examples and constraints.
  • difficulty: The difficulty level of the question (e.g., "Easy", "Medium", "Hard").
  • likes: The number of likes the question has received on LeetCode.
  • dislikes: The number of dislikes the question has received on LeetCode.
  • hints: An array of hints provided for the question.
  • languages: An array of programming languages in which solutions can be submitted for the question.
  • paid_only: A boolean indicating whether the question is accessible only to paid members.
  • topic_tags: An array of topics/tags associated with the question.
  • url: The URL of the question on LeetCode.

Example

{
  "_id": "64f2e1bc61210a08888cfa5d",
  "id": "1",
  "category": "Algorithms",
  "description": "Given an array of integers nums and an integer target...",
  "difficulty": "Easy",
  "dislikes": 1642,
  "hints": [
    "A really brute force way would be to search for all possible pairs...",
    "...",
    "..."
  ],
  "languages": [
    "cpp",
    "java",
    "...",
    "vanillajs"
  ],
  "likes": 50938,
  "paid_only": false,
  "title": "Two Sum",
  "topic_tags": [
    "Array",
    "Hash Table"
  ],
  "url": "https://leetcode.com/problems/two-sum"
}
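
On the consuming side, the schema above can be mirrored by a small typed container. The following is a sketch only: the `Question` class and its `from_document` method are hypothetical names, not part of the project, and the sample is an abbreviated copy of the example document with the elided entries left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    """One scraped question, mirroring the documented fields."""
    id: str
    title: str
    category: str
    description: str
    difficulty: str
    likes: int
    dislikes: int
    paid_only: bool
    url: str
    hints: List[str] = field(default_factory=list)
    languages: List[str] = field(default_factory=list)
    topic_tags: List[str] = field(default_factory=list)

    @classmethod
    def from_document(cls, doc):
        """Build a Question from a stored document, dropping MongoDB's _id."""
        data = {key: value for key, value in doc.items() if key != "_id"}
        return cls(**data)


# Abbreviated copy of the example document above.
sample = {
    "_id": "64f2e1bc61210a08888cfa5d",
    "id": "1",
    "category": "Algorithms",
    "description": "Given an array of integers nums and an integer target...",
    "difficulty": "Easy",
    "dislikes": 1642,
    "languages": ["cpp", "java"],
    "likes": 50938,
    "paid_only": False,
    "title": "Two Sum",
    "topic_tags": ["Array", "Hash Table"],
    "url": "https://leetcode.com/problems/two-sum",
}

question = Question.from_document(sample)
print(question.title, question.difficulty)  # Two Sum Easy
```

Because the list fields default to empty lists, a document without hints still loads. An unexpected extra key raises a `TypeError`, which makes schema drift visible early.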

Endpoints

  • /: The root endpoint, which redirects to the API documentation.
  • /questions: Endpoint to retrieve all scraped LeetCode questions.
  • /questions/{id}: Endpoint to retrieve a specific LeetCode question by ID.
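
A client for these endpoints could look roughly like the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented behavior: that the endpoints answer GET requests, that they return JSON, and that the server listens on http://localhost:8000 as described earlier. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"


def question_url(question_id=None, base_url=BASE_URL):
    """Return the URL for all questions, or for one question by ID."""
    if question_id is None:
        return f"{base_url}/questions"
    # Percent-encode the ID so unusual characters cannot alter the path.
    return f"{base_url}/questions/{quote(str(question_id), safe='')}"


def get_json(url):
    """Send a GET request and decode the JSON body (server must be up)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the containers running, `get_json(question_url())` would request every question and `get_json(question_url(1))` a single one. Whether `{id}` refers to the LeetCode `id` field or to MongoDB's `_id` is not stated here, so treat the second call as an assumption.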

License

This project is open-source and available under the MIT License.
