feat: add llava-bench-in-the-wild (jp) benchmark dataset (#32)
* feat: add llava-bench-in-the-wild benchmark JP dataset
* chore: remove unnecessary file
* chore: update
* chore: rename filename from LLaVA_Bench.md to README.md
* feat: add English README

Co-authored-by: kentosasaki-jp <s2113605.klis.tsukuba.ac.jp>
1 parent 1c4d50b commit c666ceb
Showing 12 changed files with 588 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
<div align="center"> | ||
|
||
# LLaVA-Bench-In-the-Wild (Japanese) | ||
|
||
English | [日本語](./ja/README_ja.md) | ||
|
||
</div> | ||
|
||
LLaVA-Bench-In-the-Wild (Japanese) is the Japanese version of the LLaVA-Bench-In-the-Wild dataset. It was translated into Japanese using DeepL. | ||
|
||
The `llava-bench-in-the-wild/en/*.jsonl` files have been copied from Hugging Face's [liuhaotian/llava-bench-in-the-wild](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild/tree/main). | ||
|
||
# Download Dataset | ||
Download `images/` from Hugging Face's [liuhaotian/llava-bench-in-the-wild](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild) and place them under `playground/data/llava-bench-in-the-wild/`. | ||
|
||
# License | ||
|
||
Released under the [Apache License 2.0](./LICENSE). |
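
The Download Dataset step in the README above leaves `images/` next to the `en/` directory. A minimal Python sketch for checking that layout follows. It assumes the question file sits at `en/questions.jsonl` under the data directory (as the paths in this commit suggest) and that each record's `image` field names a file inside `images/`. The helper name `missing_images` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def missing_images(data_dir):
    """Return sorted image names referenced by en/questions.jsonl but absent from images/.

    Hypothetical helper: the directory layout is an assumption based on the
    paths shown in this commit, not something the repository provides.
    """
    data_dir = Path(data_dir)
    questions_path = data_dir / "en" / "questions.jsonl"  # assumed location
    images_dir = data_dir / "images"  # where the README says to place the download

    referenced = set()
    with questions_path.open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                referenced.add(json.loads(line)["image"])

    # Keep only the names that do not exist as files under images/.
    return sorted(name for name in referenced if not (images_dir / name).is_file())
```

Calling `missing_images("playground/data/llava-bench-in-the-wild")` after the download should return an empty list if every referenced image is in place.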
60 changes: 60 additions & 0 deletions
playground/data/llava-bench-in-the-wild/en/answers_gpt4.jsonl
Large diffs are not rendered by default.
60 changes: 60 additions & 0 deletions
playground/data/llava-bench-in-the-wild/en/bard_0718.jsonl
Large diffs are not rendered by default.
60 changes: 60 additions & 0 deletions
playground/data/llava-bench-in-the-wild/en/bing_chat_0629.jsonl
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
{"id": "001", "image": "001.jpg", "caption": "An aerial view of Diamond Head in the Hawaiian Islands."} | ||
{"id": "002", "image": "002.jpg", "caption": "A photo of four mangosteens on a wooden table. Three of them are uncut, while one is cut open."} | ||
{"id": "003", "image": "003.jpg", "caption": "A creative painting of a dog dressed as the famous Mona Lisa."} | ||
{"id": "004", "image": "004.jpg", "caption": "A creative meme: Elon Musk holding a dog. It mimicks one of the most memorable scenes from The Lion King, where Rafiki stands on a mountaintop and holds up Simba to all the other animals."} | ||
{"id": "005", "image": "005.jpg", "caption": "A creative meme: a dog lying on the cyan wooden floor. The top of the meme reads: \"MONDAY.\" The bottom of the meme reads: \"JUST...MONDAY.\""} | ||
{"id": "006", "image": "006.jpg", "caption": "The famous painting: Mona Lisa."} | ||
{"id": "007", "image": "007.jpg", "caption": "The Space Needle with a clear blue sky in the background."} | ||
{"id": "008", "image": "008.jpg", "caption": "Headshot of the Japanese cartoon character: Conan Edogawa."} | ||
{"id": "009", "image": "009.jpg", "caption": "A serene scene with a T-shaped wooden pier extending out over a calm lake. The lake is surrounded by green trees, and there are mountains in the background. Despite the cloudy sky, the scene is illuminated by mild to bright light."} | ||
{"id": "010", "image": "010.jpg", "caption": "The image shows a man standing on the rear bumper of a yellow taxi on a city street. A foldable ironing table is attached to the back of the taxi, and he is ironing a blue T-shirt on the ironing board. The man is wearing a yellow sweater. Another yellow taxi is visible on the left side of the scene. Tall buildings surround the street, and street lights are visible along with red flags attached to them."} | ||
{"id": "011", "image": "011.jpg", "caption": "An open refrigerator filled with a variety of food items. In the left part of the compartment, towards the front, there is a plastic box of strawberries with a small bag of baby carrots on top. Towards the back, there is a stack of sauce containers. In the middle part of the compartment, towards the front, there is a green plastic box, and there is an unidentified plastic bag placed on it. Towards the back, there is a carton of milk. In the right part of the compartment, towards the front, there is a box of blueberries with three yogurts stacked on top. The large bottle of yogurt is Fage non-fat yogurt, and one of the smaller cups is Fage blueberry yogurt. The brand and flavor of the other smaller cup are unknown. Towards the back, there is a container with an unknown content."} | ||
{"id": "012", "image": "012.jpg", "caption": "A set of three identical coffee mugs adorned with the famous character Mario."} | ||
{"id": "013", "image": "013.jpg", "caption": "A large tray filled with four cooked lobsters. The lobsters are covered with melted butter, minced garlic, rosemary, and parsley. Lemon slices are also placed on the tray."} | ||
{"id": "014", "image": "014.jpg", "caption": "A meme featuring an oven tray with fried chicken nuggets arranged in the shape of the continents on a world map. Above the image, the caption reads: \"Sometimes I just look at pictures of the Earth from space and I marvel at how beautiful it all is.\""} | ||
{"id": "015", "image": "015.jpg", "caption": "A meme featuring two distinct parts. In the top half, it says \"statistical learning,\" and there is a person standing in front of a whiteboard with a concerned expression. A plot showing a red curve gradually decreasing is displayed, and the chat bubble reads: \"People, our learner overgeneralizes because the VC-Dimension of our Kernel is too high. Get some experts and minimize the structural risk in a new one. Rework our loss function, make the next kernel stable, unbiased, and consider using a soft margin.\" In the bottom half of the image, it says \"neural networks,\" and there is another person standing in front of a whiteboard with a happy expression. A plot showing a green curve with both x and y axes labeled \"LAYERS\" is displayed. The big chat bubble reads: \"STACK MORE LAYERS.\" At the bottom of the meme, it reads: \"But unironically.\""} | ||
{"id": "016", "image": "016.jpg", "caption": "A meme with three panels: one panel on the left and two stacked panels on the right. Panel 1: A smartphone (resembling an iPhone) with a VGA cable plugged into its charging port. The VGA cable is white, and the VGA connector is blue. The phone is protected with a glass screen protector and a protective case. Panel 2: A product package containing a VGA connector that closely resembles the one shown in the first panel. The brand is ELECOM, specifically the P-APLTDCN series. At the bottom, there is a green sticker that reads: \"For Lightning Cable.\" Panel 3: A top view of the connector, similar to the previous panels. The actual connector is visible only in this panel, and it shows that it is actually a Lightning connector."} | ||
{"id": "017", "image": "017.jpg", "caption": "A creature that looks like a llama, formed with lava. Its body looks like hot, red lava with flames on it, and its four feet resemble black volcanic rock that has cooled down after the lava flow. It is also wearing red glasses."} | ||
{"id": "018", "image": "018.jpg", "caption": "A watercolor painting of three animals picnicking around a table made of a tree stump. On the left, a brown bear is eating a chocolate cookie; in the middle, a blue-grey cat is holding a blue mug, and on the right, a light-brown rabbit is sipping from a pink coffee mug. On the table, there is a plate with various kinds of cookies. They are surrounded by grass, and tree leaves are visible above the rabbit."} | ||
{"id": "019", "image": "019.jpg", "caption": "A page of a notebook showing a sketch of a website layout. The heading reads \"My Joke Website\". It is followed by several rows, each row reading: \"[Really funny joke 1]\", \"[Push to reveal punchline]\", \"[Same, but joke 2]\", \"[Push to reveal punchline]\", \"Copyright OpenAI 2023\"."} | ||
{"id": "020", "image": "020.jpg", "caption": "A page of a notebook showing a sketch of a website layout. The heading reads \"My Joke Website\". It is followed by two rows, \"[Funny Joke]\", \"[Push to reveal punchline]\"."} | ||
{"id": "021", "image": "021.jpg", "caption": "The iconic \"flying scene\" of Titanic."} | ||
{"id": "022", "image": "022.jpg", "caption": "A close-up photo of a meal at ICHIRAN. The chashu ramen bowl with a spoon is placed in the center. The ramen is seasoned with chili sauce, chopped scallions, and served with two pieces of chashu. Chopsticks are placed to the right of the bowl, still in their paper wrap, not yet opened. The ramen is also served with nori on the left. On top, from left to right, the following sides are served: a bowl of orange spice (possibly garlic sauce), a plate of smoke-flavored stewed pork with chopped scallions, and a cup of matcha green tea."} | ||
{"id": "023", "image": "023.jpg", "caption": "An advertisement from Subway with a black background. On the top left, there is the Subway logo (SUBWAY SERIES), and the slogan reads \"A NEW WAY TO SUBWAY.\" There are two sandwiches in the photo, possibly a footlong sandwich cut in half. The sandwiches are served with toasted bread (possibly artisan Italian bread), cheese, ham, salami, shredded lettuce, tomato, and banana peppers."} | ||
{"id": "024", "image": "024.jpg", "caption": "A top view of a highway at night. The highway is divided into four sections. From left to right, there are three lanes and four lanes of traffic approaching the camera, and four lanes and three lanes of traffic moving away from the camera. Most of the cars in the four-lane section moving away from the camera have their brake lights on. There are many cars in both the four-lane sections. The traffic is light on the two three-lane sections. The four-lane highway is elevated compared to the three-lane highway. The lights alongside the highway are illuminated. On the right side of the highway, there are trees."} |
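
Each record above pairs an image file name with a reference caption. As a rough sketch of how such records could be consumed, the Python snippet below indexes captions by image name. The function name `index_captions` is hypothetical, and the two sample lines are copied from the diff above.

```python
import json


def index_captions(lines):
    """Map each image file name to its caption, given an iterable of JSONL lines."""
    captions = {}
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines
            continue
        record = json.loads(line)
        captions[record["image"]] = record["caption"]
    return captions


# Two records copied from the diff above.
sample = [
    '{"id": "006", "image": "006.jpg", "caption": "The famous painting: Mona Lisa."}',
    '{"id": "012", "image": "012.jpg", "caption": "A set of three identical coffee mugs adorned with the famous character Mario."}',
]
captions = index_captions(sample)
print(captions["006.jpg"])
```

Passing an open file handle instead of `sample` works the same way, since the function only iterates over lines.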
60 changes: 60 additions & 0 deletions
playground/data/llava-bench-in-the-wild/en/questions.jsonl
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
{"image": "001.jpg", "text": "What is the name of this famous sight in the photo?", "category": "conv", "question_id": 0} | ||
{"image": "001.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 1} | ||
{"image": "001.jpg", "text": "What are the possible reasons of the formation of this sight?", "category": "complex", "question_id": 2} | ||
{"image": "001.jpg", "text": "Compose an engaging travel blog post about a recent trip to this place, highlighting cultural experiences and must-see attractions, including both the attraction seen in the photo and other must-see attractions as well.", "category": "complex", "question_id": 3} | ||
{"image": "002.jpg", "text": "What type of fruit is this?", "category": "conv", "question_id": 4} | ||
{"image": "002.jpg", "text": "How many uncut fruits are in the image?", "category": "conv", "question_id": 5} | ||
{"image": "002.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 6} | ||
{"image": "002.jpg", "text": "Imagine the fragrance of the fruits in the image. How would you describe this to someone who has never had this fruit before?", "category": "complex", "question_id": 7} | ||
{"image": "003.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 8} | ||
{"image": "003.jpg", "text": "What might be the intended effect of this painting?", "category": "complex", "question_id": 9} | ||
{"image": "003.jpg", "text": "Discuss how this creative twist on a classic work of art might be interpreted differently by various audiences.", "category": "complex", "question_id": 10} | ||
{"image": "004.jpg", "text": "What is the name of the man in the photo?", "category": "conv", "question_id": 11} | ||
{"image": "004.jpg", "text": "Which iconic movie scene is being parodied in the meme?", "category": "conv", "question_id": 12} | ||
{"image": "004.jpg", "text": "How does this meme reflect or comment on Elon Musk's public image, personality, or actions?", "category": "complex", "question_id": 13} | ||
{"image": "005.jpg", "text": "Please explain the meme in detail.", "category": "detail", "question_id": 14} | ||
{"image": "005.jpg", "text": "In what other ways might someone express the same sentiment that this meme is expressing?", "category": "complex", "question_id": 15} | ||
{"image": "006.jpg", "text": "Do you know who paint this?", "category": "conv", "question_id": 16} | ||
{"image": "006.jpg", "text": "Describe this painting in detail.", "category": "detail", "question_id": 17} | ||
{"image": "006.jpg", "text": "Discuss the historical impact and the significance of this painting in the art world.", "category": "complex", "question_id": 18} | ||
{"image": "007.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 19} | ||
{"image": "007.jpg", "text": "What's the best weather, season, time of the day of visiting this place? Is the time when this photo was taken a good time to visit this place?", "category": "complex", "question_id": 20} | ||
{"image": "008.jpg", "text": "What is the name of the character in the image?", "category": "conv", "question_id": 21} | ||
{"image": "008.jpg", "text": "What's the personality of this character? Explain what elements or aspects of the character's design may have contributed to its popularity.", "category": "complex", "question_id": 22} | ||
{"image": "009.jpg", "text": "What are the things I should be cautious about when I visit here?", "category": "complex", "question_id": 23} | ||
{"image": "009.jpg", "text": "If you were a photographer looking to capture this location's essence, what time of day and weather conditions would you choose? Describe the reasons behind your choice.", "category": "complex", "question_id": 24} | ||
{"image": "010.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 25} | ||
{"image": "010.jpg", "text": "What is unusual about this image?", "category": "complex", "question_id": 26} | ||
{"image": "011.jpg", "text": "What fruit is in the left part of the fridge?", "category": "conv", "question_id": 27} | ||
{"image": "011.jpg", "text": "What is the brand of the yogurt flavored with blueberry?", "category": "conv", "question_id": 28} | ||
{"image": "011.jpg", "text": "Is there any strawberry-flavored yogurt in the fridge?", "category": "conv", "question_id": 29} | ||
{"image": "011.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 30} | ||
{"image": "011.jpg", "text": "What are the meals that I can cook with these?", "category": "complex", "question_id": 31} | ||
{"image": "012.jpg", "text": "How many coffee mugs are in the set?", "category": "conv", "question_id": 32} | ||
{"image": "012.jpg", "text": "Write an attractive product description for this.", "category": "complex", "question_id": 33} | ||
{"image": "013.jpg", "text": "Show the detailed recipe for this dish.", "category": "complex", "question_id": 34} | ||
{"image": "014.jpg", "text": "Can you explain this meme in detail?", "category": "complex", "question_id": 35} | ||
{"image": "015.jpg", "text": "What are the two machine learning concepts mentioned in the meme?", "category": "conv", "question_id": 36} | ||
{"image": "015.jpg", "text": "Give a detailed description of this meme.", "category": "detail", "question_id": 37} | ||
{"image": "015.jpg", "text": "Can you explain why this is funny. Think about it step-by-step.", "category": "complex", "question_id": 38} | ||
{"image": "016.jpg", "text": "Give a detailed description of this image. Describe it panel by panel.", "category": "detail", "question_id": 39} | ||
{"image": "016.jpg", "text": "What is funny about this image? Describe it panel by panel.", "category": "complex", "question_id": 40} | ||
{"image": "017.jpg", "text": "What material appears to make up the creature?", "category": "conv", "question_id": 41} | ||
{"image": "017.jpg", "text": "This is the logo of LLaVA, Large Language and Vision Assistant, based on the LLaMA architecture. Please explain this logo in detail, and how do you think of its design.", "category": "complex", "question_id": 42} | ||
{"image": "018.jpg", "text": "What are the animals in the painting and what are they doing?", "category": "conv", "question_id": 43} | ||
{"image": "018.jpg", "text": "Write a fairy tale based on this painting.", "category": "complex", "question_id": 44} | ||
{"image": "019.jpg", "text": "Describe this sketch in detail.", "category": "detail", "question_id": 45} | ||
{"image": "019.jpg", "text": "Write brief HTML/JS to turn this mock-up into a colorful website, where the jokes are replaced by two real jokes.", "category": "complex", "question_id": 46} | ||
{"image": "020.jpg", "text": "Describe this sketch in detail.", "category": "detail", "question_id": 47} | ||
{"image": "020.jpg", "text": "Write brief HTML/JS to turn this mock-up into a colorful and interactive website, where the joke is replaced by a real joke.", "category": "complex", "question_id": 48} | ||
{"image": "021.jpg", "text": "What's the ending of this movie?", "category": "conv", "question_id": 49} | ||
{"image": "021.jpg", "text": "What is the significance of this scene in the context of the movie?", "category": "complex", "question_id": 50} | ||
{"image": "022.jpg", "text": "What's the name of the restaurant serving these dishes?", "category": "conv", "question_id": 51} | ||
{"image": "022.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 52} | ||
{"image": "022.jpg", "text": "If someone were to recommend a new flavor or topping to the dish, describe the reason for this change and how it might alter the overall taste.", "category": "complex", "question_id": 53} | ||
{"image": "023.jpg", "text": "What brand is featured in this advertisement?", "category": "conv", "question_id": 54} | ||
{"image": "023.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 55} | ||
{"image": "023.jpg", "text": "Show me a detailed recipe for cooking this at home.", "category": "complex", "question_id": 56} | ||
{"image": "024.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 57} | ||
{"image": "024.jpg", "text": "What is the problem this city might be facing? What are some possible solutions?", "category": "complex", "question_id": 58} | ||
{"image": "024.jpg", "text": "Explain all the cues that indicate the current traffic conditions.", "category": "complex", "question_id": 59} |
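
The question records above carry a `category` field with the values `conv`, `detail`, and `complex`. A small Python sketch, assuming one JSON object per line as shown, tallies questions per category. The function name `count_categories` is hypothetical, and the sample lines are copied from the diff above.

```python
import json
from collections import Counter


def count_categories(lines):
    """Count questions per category, given an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            counts[json.loads(line)["category"]] += 1
    return counts


# Three records copied from the diff above (question_id 0, 1, and 2).
sample = [
    '{"image": "001.jpg", "text": "What is the name of this famous sight in the photo?", "category": "conv", "question_id": 0}',
    '{"image": "001.jpg", "text": "Describe this photo in detail.", "category": "detail", "question_id": 1}',
    '{"image": "001.jpg", "text": "What are the possible reasons of the formation of this sight?", "category": "complex", "question_id": 2}',
]
counts = count_categories(sample)
print(dict(counts))
```

Per-category tallies like this are useful when reporting benchmark scores separately for conversation, detail, and complex-reasoning questions.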
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
<div align="center"> | ||
|
||
# LLaVA-Bench-In-the-Wild (Japanese) | ||
|
||
[English](../README.md) | 日本語 | ||
|
||
</div> | ||
|
||
LLaVA-Bench-In-the-Wild (Japanese) is the Japanese version of the LLaVA-Bench-In-the-Wild dataset. It was translated into Japanese using DeepL. | ||
|
||
The `llava-bench-in-the-wild/en/*.jsonl` files have been copied from Hugging Face's [liuhaotian/llava-bench-in-the-wild](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild/tree/main). | ||
|
||
# Download Dataset | ||
Download `images/` from Hugging Face's [liuhaotian/llava-bench-in-the-wild](https://huggingface.co/datasets/liuhaotian/llava-bench-in-the-wild) and place it under `playground/data/llava-bench-in-the-wild/`. | ||
|
||
# License | ||
|
||
Released under the [Apache License 2.0](./LICENSE). |