Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fall back to huggingface-cli when pulling via URL fails #475

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

rhatdan
Copy link
Member

@rhatdan rhatdan commented Nov 20, 2024

Handle non GGUF files as well.

Summary by Sourcery

Implement a fallback mechanism to use 'huggingface-cli' for model pulling when URL-based pulling fails, and fix the argument path for the 'vllm serve' command. Update system tests to reflect these changes.

Bug Fixes:

  • Fix the argument path for the 'vllm serve' command to use the directory path instead of the model file path.

Enhancements:

  • Implement a fallback mechanism to use 'huggingface-cli' for model pulling when URL-based pulling fails.

Tests:

  • Update system test to verify the correct arguments for the 'vllm serve' command.

Copy link
Contributor

sourcery-ai bot commented Nov 20, 2024

Reviewer's Guide by Sourcery

This PR implements a fallback mechanism for pulling models from HuggingFace. When pulling via direct URL fails, it attempts to use the huggingface-cli. The changes also include fixes for VLLM model serving and error handling improvements.

Sequence diagram for model pulling with fallback

sequenceDiagram
    participant User
    participant System
    participant HuggingFace
    User->>System: Request to pull model
    System->>HuggingFace: Attempt URL pull
    alt URL pull fails
        System->>System: Log error
        System->>HuggingFace: Attempt CLI pull
        alt CLI not available
            System->>User: Raise NotImplementedError
        else CLI available
            System->>HuggingFace: Execute CLI pull
        end
    else URL pull succeeds
        System->>User: Return model path
    end
Loading

Updated class diagram for HuggingFace model handling

classDiagram
    class HuggingFaceModel {
        +login(args)
        +logout(args)
        +pull(args)
        +hf_pull(args, model_path, directory_path)
        +url_pull(args, model_path, directory_path)
    }
    note for HuggingFaceModel "Added hf_pull and url_pull methods for fallback mechanism"
    class Model {
        +remove(args)
        +_image(args)
        +serve(args)
    }
    note for Model "Modified serve method for VLLM model serving"
Loading

File-Level Changes

Change Details Files
Implement fallback mechanism for HuggingFace model downloads
  • Split model pulling into two methods: url_pull and hf_pull
  • Added try-catch block to attempt URL download first, then fall back to huggingface-cli
  • Changed error handling to raise NotImplementedError when huggingface-cli is not available
ramalama/huggingface.py
Update VLLM model serving configuration
  • Modified VLLM serve command to use model directory path instead of specific model file
ramalama/model.py
test/system/040-serve.bats
Improve error handling in model removal
  • Added missing return statement after handling OSError in ignore mode
ramalama/model.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @rhatdan - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@@ -94,13 +94,13 @@ def remove(self, args):
except OSError as e:
if not args.ignore:
raise KeyError(f"removing {self.model}: {e}")
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): This early return skips garbage collection when args.ignore is True

Consider moving the return after garbage_collection() to ensure it runs in all cases when args.ignore is True

@@ -68,7 +66,28 @@ def pull(self, args):

symlink_dir = os.path.dirname(model_path)
os.makedirs(symlink_dir, exist_ok=True)
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider restructuring the error handling flow to use specific exception types and sequential checks

The nested try-except blocks and generic exception handling make the fallback flow harder to follow. Consider restructuring to make the fallback strategy more explicit while maintaining the same behavior:

def pull(self, args):
    model_path = self.model_path(args)
    directory_path = os.path.join(args.store, "repos", "huggingface", self.directory, self.filename)
    os.makedirs(directory_path, exist_ok=True)

    symlink_dir = os.path.dirname(model_path)
    os.makedirs(symlink_dir, exist_ok=True)

    try:
        return self.url_pull(args, model_path, directory_path)
    except (urllib.error.HTTPError, urllib.error.URLError) as e:
        if self.hf_cli_available:
            return self.hf_pull(args, model_path, directory_path)
        perror("URL pull failed and huggingface-cli not available")
        raise KeyError(f"Failed to pull model: {str(e)}")

This version:

  1. Catches specific URL-related exceptions rather than generic Exception
  2. Checks CLI availability before attempting fallback
  3. Provides clearer error context
  4. Eliminates nested exception handling

@rhatdan rhatdan force-pushed the huggingface branch 2 times, most recently from 655cbaa to 5bf5fe0 Compare November 20, 2024 19:11
ramalama/huggingface.py Outdated Show resolved Hide resolved
ramalama/huggingface.py Outdated Show resolved Hide resolved
@rhatdan
Copy link
Member Author

rhatdan commented Nov 20, 2024

@bentito PTAL

@bentito
Copy link

bentito commented Nov 21, 2024

@bentito PTAL

When I run latest, I am seeing:

$ bin/ramalama --debug pull huggingface://ibm-granite/granite-3.0-8b-instruct
URL pull failed and huggingface-cli not available
Error: "Failed to pull model: 'failed to pull https://huggingface.co/ibm-granite/raw/main/granite-3.0-8b-instruct: HTTP Error 401: Unauthorized'"
ramalama fork/rhatdan/huggingface $ huggingface-cli version
usage: huggingface-cli <command> [<args>]
huggingface-cli: error: argument {env,login,whoami,logout,repo,upload,download,lfs-enable-largefiles,lfs-multipart-upload,scan-cache,delete-cache}: invalid choice: 'version' (choose from 'env', 'login', 'whoami', 'logout', 'repo', 'upload', 'download', 'lfs-enable-largefiles', 'lfs-multipart-upload', 'scan-cache', 'delete-cache')

Did the usage of hf-cli change from before, because with my suggest mods it was working. See my line comment.

try:
return self.url_pull(args, model_path, directory_path)
except (urllib.error.HTTPError, urllib.error.URLError, KeyError) as e:
if not self.hf_cli_available:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't working for me, I'm ending up with it showing my hf_cli is not available even though it is.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if not self.hf_cli_available:
if self.hf_cli_available:

not is reversing the logic, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep

Handle non GGUF files as well.

Signed-off-by: Daniel J Walsh <[email protected]>
@rhatdan
Copy link
Member Author

rhatdan commented Nov 21, 2024

@bentito PTANL

@bentito
Copy link

bentito commented Nov 21, 2024

/lgtm

Works for me now!

Copy link

@bentito bentito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all working for me locally now!

@rhatdan rhatdan merged commit 0d1b98a into containers:main Nov 21, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants