Ari Shapiro, Ph.D. firstname.lastname@example.org
Andrew Feng, Ph.D. email@example.com
Anton Leuski, Ph.D. firstname.lastname@example.org
Natural Language Dialogue Group
USC Institute for Creative Technologies
last modified 1/7/15
The design goal of the VHmobile platform is to provide a self-contained mobile architecture that can be easily scripted to generate chat-based virtual human applications. This is in contrast to traditional approaches to mobile app development, which involve either direct coding in a native language using the mobile API (e.g. Java on Android) or the use of a build tool (such as a game engine that can run on mobile architectures). The basic control mechanisms for a virtual human are provided automatically, as in the case of lip syncing to speech and automated nonverbal behavior.
There are two ways to use VHMobile:
1) using an app called VHVanilla that can be modified by changing the Python control scripts. VHVanilla contains all the code needed to run the application (including on the Google Cardboard platform) without using a separate development environment.
2) as an embedded library within another Android application. The Android application would be constructed using traditional mobile app tools, using the VHMobile library.
A build for iOS does not currently exist, although we expect that such a build could be done using much the same approach, substituting the appropriate iOS service for each Android service (Apple ASR instead of Google ASR, Objective-C instead of Java, etc.).
Download the VHvanilla Android app on the Google Play store (approximately 500 MB).
You can also obtain a copy by contacting Ari Shapiro at email@example.com
The app will install the supporting files into /sdcard/vhdata. There are several sample applications located in the /sdcard/vhdata folder.
VHVanilla will run the contents of the /sdcard/vhdata/init.py file, which contains the instructions for the application.
There are several examples of usage of a 3D scene that can all be tested by replacing the init.py file with one of the following sample scripts:
|Script for 3D||Description|
|init_chatbot.py||Example of using a chatbot. Uses speech recognition and TTS.|
|init_TextToSpeechDemo.py||Creates buttons that, when pushed, cause the virtual human to speak with text-to-speech (TTS) synthesis.|
|PhoneA and PhoneB scripts||Example of using the networking capabilities (VHMSG) to communicate between two mobile devices. Place the PhoneA and PhoneB files on different mobile devices. Pressing the button on one will cause the virtual human to speak on the other, and vice versa.|
|init_SpeechRecognitionDemo.py||Example of using speech recognition. Character will echo what the user inputs.|
|init_DialogueNPCDemo.py||Example of using the dialogue classifier to achieve a question/answer interaction.|
|init_SensorDemo.py||Example of using sensor data to modify a virtual human's reactions (pick up or put down the mobile device).|
The init_naturalvoice.py script has also been copied to the init.py file and will be run by default.
(above) Example of an application that uses natural voice.
(above) Example of using a chatbot virtual character in VHVanilla.
To switch between examples, copy the chosen .py file over the init.py file. To modify or change an application, simply modify the contents of init.py (and any other needed scripts) and restart the application.
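The copy step can also be scripted. The following is a minimal sketch, assuming the vhdata folder is reachable as an ordinary directory from wherever the script runs (for example on the device itself or through a mounted copy); the helper name activate_sample is hypothetical and is not part of VHmobile.

```python
import shutil
from pathlib import Path


def activate_sample(vhdata_dir, sample_name):
    """Copy a sample script over init.py so it is run on the next app start."""
    vhdata = Path(vhdata_dir)
    source = vhdata / sample_name
    if not source.is_file():
        raise FileNotFoundError(f"sample script not found: {source}")
    target = vhdata / "init.py"
    # copyfile overwrites the existing init.py with the sample's contents.
    shutil.copyfile(source, target)
    return target


# Example call (not executed here, since the folder only exists on the device):
#   activate_sample("/sdcard/vhdata", "init_chatbot.py")
```

The application still has to be restarted afterwards for the new init.py to take effect.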
In addition to showing a 3D scene, VHVanilla can operate in 'video' mode, in which it displays a set of videos instead of a 3D scene.
|Scripts for Video||Description|
|init_PlayVideo.py||Example of playing a video.|
To switch to this mode, change the contents of the setup.py file by commenting out the other two modes and uncommenting the following mode:
(above) Example of using video playback in the VHVanilla app. Videos are played back in response to the user's speech: the speech is passed to the classifier, which returns the proper video id, and that video is then played back.
In addition to showing a 3D scene and playing videos, VHVanilla can operate in 'Google Cardboard' mode, suitable for a Cardboard viewer, in which the app displays the 3D scene as two side-by-side views:
|Scripts for Google Cardboard||Description|
|init_CardboardDemo.py||Example of using vhmobile with Google Cardboard.|
To put the app into Google Cardboard mode, uncomment only the following line in setup.py and comment out the other lines:
(above) Example of using the Google Cardboard interface in the VHVanilla app.
VHvanilla and VHmobile software
Obtaining a License
The Software is made available for academic or research purposes only. The license is for a copy of the executable program for an unlimited term. Individuals requesting a license for commercial use must pay for a commercial license.
USC Stevens Institute for Innovation
University of Southern California
1150 S. Olive Street, Suite 2300
Los Angeles, CA 90115, USA
For commercial license pricing and annual commercial update and support pricing, please contact:
USC Stevens Institute for Innovation
University of Southern California
1150 S. Olive Street, Suite 2300
Los Angeles, CA 90115, USA
Tel: +1 213.821.0943
Fax: +1 213-821-5001
What is it?
VHmobile is a mobile platform library that makes the creation of chat-based virtual human characters easy. A virtual character can be created and made to speak using TTS voices and automated nonverbal behavior with only a few lines of Python code. In addition, it offers easy access to networking, sensors and voice recognition. VHvanilla is a mobile application built on the VHmobile platform; it includes a simplified widget (button) layout and a set of example scripts, as well as support for video playback and Google Cardboard VR viewing.
What are the capabilities of the platform?
The platform includes an animation system (SmartBody), a dialogue management/classification system (NPC Editor), a nonverbal behavior generator (Cerebella, light version), text-to-speech (Cereproc), a networking system (VHMSG), a set of 3D characters, 3D behaviors, and a Python-based API that allows the easy scripting of application control and virtual human behavior. Rendering is done in three ways: 3D rendering is done through SmartBody, video rendering is done via the Android platform, and Google Cardboard rendering is done through the SmartBody rendering and the Google Cardboard API. With VHvanilla, there is no limit on how far the application can be programmed through SmartBody and Python.
Is there an iOS version of VHmobile/VHvanilla as well?
Not yet, although the authors intend to produce one.
Why did you call it VHvanilla?
VH = virtual human, and 'vanilla' refers to the basic (although still delicious..) contents of the mobile app. You download a mobile app which gives you the generic/vanilla capability, and it is up to you to 'flavor' the app to your liking.
Which characters can be used in the application?
There are a few (6) characters that can be used and set up with only one line of Python code. Each character has the ability to talk, gesture and emote.
What voices can be used for each character? Can I use different voices?
There are two Cereproc-based voices included: Star and Heather. Additional voices can be purchased from Cereproc and used in the application.
Can I use recorded speech instead of text-to-speech?
Yes, you can use prerecorded speech by placing a .wav (sound), a .bml (lip sync) and a .xml (behavior) file in the /vhdata/sounds folder, then calling the appropriate BML command. Each sound (.wav) file needs to be processed by a phoneme scheduler to produce the lip sync file, then packaged in a BML description (.xml file) and put in the /vhdata/sounds folder. Then the character needs to be switched from 'remote' to 'audiofile' voice (by setting the "voice" and "voiceCode" attributes on the character; see the SmartBody manual for details).
The VHvanilla app is rather big (500 MB). How can I make a smaller application for distribution?
The VHvanilla app is intended to include all the necessary assets and capability for the VHmobile platform. As such, it is likely that not all the assets will be used. For example, you might use only one character in your application, even though there are 6 characters that could be used. To make a smaller app, you can use the VHvanilla source code and remove the assets that are not needed. The VHmobile library itself is only 12 MB, so any application that uses it would be relatively small.
How do I control the widgets on the user interface in VHvanilla?
The VHvanilla app has a set format for widgets, which can be programmed in Python with the scripts. The widgets have a set placement in the application, and the scripting can show or hide them, as well as respond to button presses. If you want to create your own set of widgets or controls, you can either modify the layout in the VHvanilla app (you will need the VHvanilla source code for that) or you can create your own app by using VHmobile as a library.
How do I change the lighting, camera angles, and other 3D features?
There are some convenience functions in the /vhdata/scripts folder, including lights.py, which details the lighting configuration, and camera.py, which details the camera positioning and settings. The built-in renderer is capable of using both normal and specular maps on the characters. Other 3D features can be programmed using the standard SmartBody commands.
How can I see the debug information from the application?
Using Android Studio (http://developer.android.com/sdk/index.html), you can connect your device with a USB cable and see the app's output in the console. Filter for messages with the log tag 'SBM' to exclude other Android messages that would otherwise make the console output difficult to read.
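If the console output has been saved to a file or captured as text, the same filtering can be sketched in Python. This is a minimal sketch only: the 'SBM' tag comes from the text above, while the two log line layouts the pattern accepts (a "brief" style and a "threadtime" style) are assumptions about how the output is formatted, and the function name is hypothetical.

```python
import re

# Matches the priority letter and tag in two assumed log line layouts:
#   brief:      "I/SBM     ( 1234): message"
#   threadtime: "01-07 12:00:00.000  1234  1234 I SBM     : message"
_TAG_PATTERN = re.compile(r"(?:^|\s)[VDIWEF](?:/|\s+)(?P<tag>[^\s(:]+)\s*[(:]")


def filter_log_lines(lines, tag="SBM"):
    """Keep only the log lines whose tag field equals the given tag."""
    kept = []
    for line in lines:
        match = _TAG_PATTERN.search(line)
        if match and match.group("tag") == tag:
            kept.append(line)
    return kept
```

Lines that do not match either layout are dropped, so the pattern may need adjusting for other log formats.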
Can I use this in a commercial application?
You will need a separate commercial license. The software is for noncommercial and academic/research purposes only.
Where can I get the VHmobile libraries?
Please contact Ari Shapiro at firstname.lastname@example.org if you are interested in the VHmobile library.
Where can I ask questions/get support/report a bug?
You can use the SmartBody forums at: http://smartbody.ict.usc.edu/forum
Setting up a character in 3D
VHMobile includes the SmartBody animation system, which allows you to set up and control a 3D character with various conversational capabilities, such as lip sync to speech, automated nonverbal behavior, lighting control, and so forth. The application developer can therefore access the entire SmartBody API using Python, as described in the SmartBody manual.
Assets and Data
VHMobile requires a set of data, including characters, animations and control scripts. The following folders contain the data that is included in VHvanilla under /sdcard/vhdata:
|classifier/||Data needed for the classifier (NPC editor)|
|mesh/||3D model assets and textures|
|motions/||3D animations and skeletons|
|parser/||Data needed for the Charniak parser|
|scripts/||Convenience scripts for SmartBody|
|sounds/||Folder for prerecorded speech|
|pythonlibs/||supporting Python libraries|
|aiml/||AIML python library for use with chatbots|
|alice/||ALICE chatbot knowledge scripts|
In addition, there are numerous helper scripts that simplify setting up a character. They are located in the /sdcard/vhdata/scripts folder and include the following:
|setupCharacter||Sets up characters with default behaviors: lip synching, gaze, gestures, locomotion|
|init-diphoneDefault||Sets up the lip syncing data set for English.|
|nonverbalbehavior.py||Default nonverbal behavior (head and face movements, gestures, gaze) automatically generated when an utterance is processed.|
|zebra2-map.py||Mapping file to convert characters from zebra2 format to SmartBody standard format.|
Many different types of characters can be created including the following:
ChrAmity, ChrAlyssa, ChrHarrison, ChrJin, ChrJohn, ChrLindsay, ChrTessa
To create a character, use the following command:
setupCharacter(name, characterType, "", voiceType)
where name is the name of the character, characterType is one of the valid characters listed above (such as ChrAlyssa), and voiceType is the Cereproc voice.
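Because setupCharacter is only available inside the application, a small check can be run beforehand to catch a mistyped character type. The sketch below is an illustration under stated assumptions: the helper name character_setup_args is hypothetical, the list of types is copied from this document, and the format of the voice identifier is not specified here, so it is passed through unchanged.

```python
# Character types as listed in this document.
CHARACTER_TYPES = (
    "ChrAmity", "ChrAlyssa", "ChrHarrison", "ChrJin",
    "ChrJohn", "ChrLindsay", "ChrTessa",
)


def character_setup_args(name, character_type, voice_type):
    """Validate the character type and return the arguments for setupCharacter."""
    if character_type not in CHARACTER_TYPES:
        raise ValueError(f"unknown character type: {character_type!r}")
    # The third argument is the empty string, as in the command above.
    return (name, character_type, "", voice_type)


# Inside VHvanilla, where setupCharacter is defined, this could be used as:
#   setupCharacter(*character_setup_args("mycharacter", "ChrAlyssa", voice))
# where `voice` stands for a voice identifier (placeholder, format assumed).
```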
Character TTS Voices
Note that currently two voices are available: Katherine (female) and Star (male). All female characters use the Katherine voice, and all male characters use the Star voice. Additional voices can be purchased from www.cereproc.com, and placed in the /vhdata/cereproc/voices folder.
To make a character speak, instruct the character using the following BML command:
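As an illustration, a BML speech element for plain text can be built as in the sketch below. The helper name speech_bml is hypothetical. The way the string is handed to the character is an assumption based on typical SmartBody scripts, where a BML processor obtained from the scene executes BML for a named character; consult the SmartBody manual for the exact call.

```python
from xml.sax.saxutils import escape


def speech_bml(text):
    """Wrap plain text in a BML speech element, escaping XML special characters."""
    return '<speech type="text/plain">' + escape(text) + "</speech>"


# Assumed usage inside a SmartBody script (not executed here):
#   bml = scene.getBMLProcessor()
#   bml.execBML("mycharacter", speech_bml("Hello, how are you?"))
# "mycharacter" is a placeholder for the name given when the character was set up.
```

Escaping matters because the utterance is embedded in XML: characters such as & or < in the text would otherwise produce malformed BML.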
If you want the character to speak using automated nonverbal behavior, run the following command which will return a more complicated behavior after running the utterance through the nonverbal behavior processor as follows:
Character Prerecorded Voices
Characters can use prerecorded voices instead of TTS voices. Prerecorded speech requires a sound file to play (.wav), a lip sync file (.bml) and a nonverbal behavior file to play while speaking (.xml).
To configure a character to use prerecorded speech, run the following commands:
This sets up the location where SmartBody will look for the sound files (/vhdata/sounds), then tells your character to use the .wav, .bml and .xml files located in a particular subdirectory (/vhdata/sounds/.). Please consult the SmartBody manual for information on how to use recorded speech. Playing recorded speech is similar to playing TTS speech, but instead of specifying text, you specify an id that indicates the location of the sound and behavior files:
For an id of peas, this assumes that peas.wav, peas.bml and peas.xml exist.
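Since all three files must be present for an utterance id, it can be useful to check a sound folder before deploying it. The following is a minimal sketch; the function name missing_speech_files is hypothetical, and the folder passed in is whichever directory holds the prerecorded files for the character.

```python
from pathlib import Path

# File types required for each prerecorded utterance, as described above.
REQUIRED_EXTENSIONS = (".wav", ".bml", ".xml")


def missing_speech_files(sound_dir, utterance_id):
    """Return the names of the required files that are absent for an utterance id."""
    folder = Path(sound_dir)
    return [
        utterance_id + ext
        for ext in REQUIRED_EXTENSIONS
        if not (folder / (utterance_id + ext)).is_file()
    ]
```

An empty list means the sound, lip sync and behavior files for that id are all in place.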
Characters can be easily configured with SmartBody behaviors by calling the addBehavior() function. Currently supported behaviors are: Gestures (male), FemaleGestures (female) and locomotion.
Gestures are designed to start from an initial posture and return to that same posture. SmartBody includes mechanisms to coarticulate gestures (keeping the hands and arms in gesture space when two gestures are played back-to-back). To make sure that the character is in the proper posture for gesturing, the following postures should be set via BML:
Note that any posture can be set for the characters. However, only gestures associated with that posture will be able to generate automatic nonverbal behavior.
Other Character Capabilities
The characters have the full functionality of other SmartBody characters, including gazing, breathing, saccadic eye movements, reaching, touching and so forth. Please see the SmartBody manual for more details.