On August 25, 2023, Alibaba Cloud launched an open-source Large Vision Language Model (LVLM) named Qwen-VL. The LVLM is based on Alibaba Cloud's 7-billion-parameter foundational language model Qwen-7B. In addition to capabilities such as image-text recognition, description, and question answering, Qwen-VL introduces new features including visual location recognition and image-text comprehension, the company said in a statement. These functions enable the model to identify locations in pictures and to provide users with guidance based on the information extracted from images, the firm added. The model can be applied in various scenarios including image- and document-based question answering, image caption generation, and fine-grained visual recognition. Currently, both Qwen-VL and its visual AI assistant Qwen-VL-Chat are available free of charge for commercial use on Alibaba's "Model as a Service" platform ModelScope. [Alibaba Cloud statement, in Chinese]
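For readers who want to try the chat variant, below is a minimal sketch of how Qwen-VL-Chat might be loaded through ModelScope's transformers-style interface. It assumes the ModelScope model ID qwen/Qwen-VL-Chat, the chat API exposed by the model's remote code (tokenizer.from_list_format and model.chat), and a GPU with enough memory; the image URL and question are placeholders, not from the announcement.

```python
# Sketch only: assumes the qwen/Qwen-VL-Chat model ID on ModelScope and the
# chat interface shipped with the model's remote code.
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_id = "qwen/Qwen-VL-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
).eval()

# Build a mixed image-and-text query (placeholder image URL and question).
query = tokenizer.from_list_format([
    {"image": "https://example.com/landmark.jpg"},
    {"text": "What landmark is shown in this picture?"},
])

# Ask the model; `history` carries the conversation for follow-up questions.
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```

A follow-up turn can reuse the returned history, for example asking the model to locate a region it just described, which exercises the visual location recognition feature mentioned in the announcement.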