Alibaba Cloud, a subsidiary of Chinese technology giant Alibaba (NASDAQ: BABA), has announced a new text-to-video generator based on artificial intelligence (AI).
The new AI model, dubbed I2VGen-XL, has shown proficiency in generating high-quality videos from various inputs, according to the project’s GitHub repository. Besides being visually striking, the model’s output is described as “semantically accurate,” reducing the chances of errors, hallucinations, or sycophancy.
“VGen can produce high-quality videos from the input text, images, desired motion, desired subjects, and even the feedback signals provided,” read the GitHub announcement.
Described as an open-source video generation codebase, VGen allows users to train their own text-to-video models. By running a single Python command, users can train custom models or perform inference with pre-trained ones.
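As a rough illustration of what such a command looks like, the sketch below assembles a training and an inference invocation in Python. The script names (`train_net.py`, `inference.py`), the `--cfg` flag, and the config paths are assumptions based on the VGen README at the time of writing and should be checked against the repository; `vgen_command` is a hypothetical helper, not part of VGen.

```python
import shlex
import sys


def vgen_command(script, config):
    """Build an argv list of the form: <python> <script> --cfg <config>."""
    return [sys.executable, script, "--cfg", config]


# Assumed script and config names; verify them against the VGen repository.
train_cmd = vgen_command("train_net.py", "configs/t2v_train.yaml")
infer_cmd = vgen_command("inference.py", "configs/i2vgen_xl_infer.yaml")

# Print the commands as they would be typed in a shell.
print(shlex.join(train_cmd))
print(shlex.join(infer_cmd))

# To launch a run from inside a VGen checkout, one could pass the list to
# subprocess.run(train_cmd, check=True). Nothing is executed here.
```

Building the command as a list and only printing it keeps the example safe to run without the repository or model weights installed.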
The repository supports compositional video synthesis with motion controllability, instructing video models with human feedback, and scaling up text-to-video (T2V) generation. It also features several pre-trained models for multiple tasks.
“It also offers a variety of commonly used video generation tools such as visualization, sampling, training, inference, join training using images and videos, acceleration, and more,” read the statement.
VGen achieves its advanced features via massive training data comprising 6 billion text-image pairs and 35 million text-video pairs, per the announcement. The result of this deep pool of training data is the model’s versatility and increased accuracy across several use cases.
The team behind the model has released technical papers and an official webpage to introduce it to researchers. Users can access pre-trained models and code for generating 1280×720-pixel videos, putting it on par with existing offerings.
In the future, the team says it will unveil new models specifically designed to generate videos of human bodies and an updated version for motion capture.
Alibaba moves forward with emerging technologies
Alibaba’s foray into AI has seen it launch a large language model (LLM), Tongyi Qianwen, to compete with Meta’s (NASDAQ: META) Llama 2. Not resting on its laurels, the company introduced its “Animate Anyone” offering designed to generate videos from static photos via its proprietary ReferenceNet framework.
A partnership with Web3 firm Avalanche in early 2023 saw Alibaba enter the metaverse despite its previous stance on blockchain technology. The raging semiconductor cold war between the U.S. and China has since slowed Alibaba’s march in AI and quantum computing as the company peers inward for new solutions.
Source: https://coingeek.com/alibaba-new-ai-based-video-generation-tool-to-rival-early-movers/