On August 25, Alibaba Cloud launched an open-source Large Vision Language Model (LVLM) named Qwen-VL. The LVLM is based on Qwen-7B, Alibaba Cloud’s 7-billion-parameter foundational language model. In addition to capabilities such as image-text recognition, description, and question answering, Qwen-VL introduces new features including visual location recognition and image-text comprehension, the company said in a statement. These functions enable the model to identify locations in pictures and to provide users with guidance based on the information extracted from images, the firm added. The model can be applied in various scenarios, including image- and document-based question answering, image caption generation, and fine-grained visual recognition. Currently, both Qwen-VL and its visual AI assistant Qwen-VL-Chat are available for free and for commercial use on ModelScope, Alibaba’s “Model as a Service” platform. [Alibaba Cloud statement, in Chinese]