Researchers at Microsoft Research Asia have developed VASA-1, an AI model that converts an image of a person's face and an audio clip into a video with accurate lip-syncing, facial expressions, and head movements.
VASA, short for Visual Affective Skills Animator, can transform any static image, whether photographed, painted, or drawn, into an "exquisitely synchronized" animation. The team trained the model on the publicly available VoxCeleb2 dataset, which contains video clips of more than 6,000 real-world celebrities, after discarding clips that featured multiple people or were of low quality. The model also offers control over gaze, distance, and emotion in the generated video.
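The article does not describe how the team's filtering pass was implemented; as a rough illustration, a preprocessing step that drops multi-person and low-quality clips might be sketched like this (the `Clip` fields and threshold here are hypothetical, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Clip:
    clip_id: str
    num_faces: int        # faces detected in the clip (hypothetical field)
    quality_score: float  # 0.0 (worst) to 1.0 (best) (hypothetical field)

def filter_clips(clips, min_quality=0.5):
    """Keep only single-person clips above a quality threshold."""
    return [c for c in clips if c.num_faces == 1 and c.quality_score >= min_quality]

clips = [
    Clip("a", 1, 0.9),  # kept
    Clip("b", 2, 0.8),  # dropped: multiple people in frame
    Clip("c", 1, 0.2),  # dropped: low quality
]
print([c.clip_id for c in filter_clips(clips)])  # → ['a']
```

In practice, the face count and quality score would come from a face detector and a video-quality metric run over each clip before training.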
"We are exploring visual affective skill generation for virtual, interactive characters, NOT impersonating any person in the real world," the researchers wrote. The team maintains that the model is intended for uses such as education and companionship, and has declined to release the code that powers it.
Source: Tech Daily Report (techdailyreport.net)