Oculus Releases OVRLipSync Plugin for Unity
Image courtesy of Oculus
Oculus has announced the release of a new add-on plugin for the Unity engine that lets developers generate natural, synchronized lip movements on a virtual character from audio.
The Oculus Lip Sync Unity integration (OVRLipSync) plugin was unveiled at Unity’s Vision Summit 2016. It is designed to automatically detect and analyze existing audio or live microphone input and generate lifelike lip animation, so a virtual character appears to speak and properly enunciate each word. While the plugin is still in beta, the results are already good enough to show its potential.
To produce a natural speaking animation, the plugin processes the audio stream and outputs a set of values called ‘visemes’. A viseme is a “gesture or expression of the lips and face that corresponds to a particular speech sound.” Driving the character’s face with these values is what makes it look as though it is speaking and enunciating each word.
According to the official notes and documentation released for the OVRLipSync v1.0.1 beta plugin:
“OVRLipSync is an add-on plugin and set of scripts used to sync avatar lip movements to speech sounds from a canned source or microphone input. OVRLipSync requires Unity 5.x Professional or Personal or later, targeting Android or Windows platforms, running on Windows 7, 8, or 10. OS X 10.9 and later are also currently supported.”
“OVRLipSync uses a repertoire of visemes to modify avatars based on a specified audio input stream. Each viseme targets a specified morph target in an avatar to influence the amount that target will be expressed on the model. Thus, realistic lip movement can be used to sync what is being spoken to what is being seen, enhancing the visual cues that one can use when populating an application with avatars (either controlled by a user locally or on a network, or for generating lip-sync animations for NPC avatars via dialogue samples).”
“Our system currently maps to 15 separate viseme targets: sil, PP, FF, TH, DD, kk, CH, SS, nn, RR, aa, E, ih, oh, and ou. These visemes correspond to expressions typically made by people producing the speech sound by which they’re referred, e.g., the viseme sil corresponds to a silent/neutral expression, PP appears to be pronouncing the first syllable in “popcorn,” FF the first syllable of “fish,” and so forth.”
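To make the viseme-to-morph-target idea concrete, here is a minimal, language-agnostic sketch of the underlying blendshape math: each analysis frame yields a weight per viseme, and each viseme contributes a per-vertex offset to the neutral face scaled by that weight. This is not the OVRLipSync API (in Unity this would be done with C# morph target weights); all names and data below are hypothetical, assuming simple linear blendshapes.

```python
# Illustrative sketch only: how a frame of viseme weights might drive
# morph targets (blendshapes). Names and data are hypothetical.

# The 15 viseme targets listed in the OVRLipSync documentation.
VISEMES = ["sil", "PP", "FF", "TH", "DD", "kk", "CH", "SS",
           "nn", "RR", "aa", "E", "ih", "oh", "ou"]

def blend_vertices(neutral, viseme_deltas, weights):
    """Linear blendshape: start from the neutral vertex positions and add
    each viseme's per-vertex delta, scaled by that viseme's weight (0..1)."""
    out = list(neutral)
    for viseme, weight in weights.items():
        for i, delta in enumerate(viseme_deltas[viseme]):
            out[i] += weight * delta
    return out

# Toy "mouth" of three scalar vertex values.
neutral = [0.0, 0.0, 0.0]
deltas = {v: [0.0, 0.0, 0.0] for v in VISEMES}
deltas["aa"] = [0.0, 1.0, 0.0]   # hypothetical open-jaw shape
deltas["PP"] = [0.5, 0.0, 0.5]   # hypothetical pressed-lips shape

# One analysis frame's output: mostly "aa", a little "PP".
frame_weights = {"aa": 0.8, "PP": 0.1}
print(blend_vertices(neutral, deltas, frame_weights))
```

In practice the engine interpolates these weights every frame, so the mouth flows smoothly from one viseme pose to the next rather than snapping between them.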
The new plugin for Unity 5 is available for download directly from the Oculus developer page. Stay tuned for more updates coming soon.