Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Configuring the Facial-Expression File

TO BE POPULATED

 

...

Message API

...

Understanding Output Messages

 

NVBG subscribes to vrExpress messages and a few other control messages which allow for setting some options. Below is a list of messages that NVBG subscribes to:

vrExpress

This message is sent to NVBG by the NLU, NPCEditor or similar module. This message can be used to convey information about speech data, posture, status change, emotion change, gaze data etc. as shown below.

Speech

The speech messages are characterized by the speech tag within them. They are interpreted and the corresponding output bml is generated with the speech time marks, animations, head-nods, facial-movements etc. These animations are generated based on the content of the speech tag and the fml tag in the input message.

...

vrSpeak

Below is an example of the vrSpeak message that is output from NVBG. It contains animation/head-nods/facial behavior based on the text that is to be spoken/heard.

vrSpeak brad user 1337363228078-9-1 <?xml version="1.0" encoding="utf-16"?><act>
<participant id="brad" role="actor" />
<bml>
<speech id="sp1" ref="voice_defaultTTS" type="application/ssml+xml">
<mark name="T0" />Depending

<mark name="T1" /><mark name="T2" />on

<mark name="T3" /><mark name="T4" />the

<mark name="T5" /><mark name="T6" />default

<mark name="T7" /><mark name="T8" />voice

<mark name="T9" /><mark name="T10" />you

<mark name="T11" /><mark name="T12" />have

<mark name="T13" /><mark name="T14" />selected

<mark name="T15" /><mark name="T16" />in

<mark name="T17" /><mark name="T18" />Windows,

<mark name="T19" /><mark name="T20" />I

<mark name="T21" /><mark name="T22" />can

<mark name="T23" /><mark name="T24" />sound

<mark name="T25" /><mark name="T26" />pretty

<mark name="T27" /><mark name="T28" />bad.

<mark name="T29" />
</speech>
<event message="vrAgentSpeech partial 1337363228078-9-1 T1 Depending " stroke="sp1:T1" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T3 Depending on " stroke="sp1:T3" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T5 Depending on the " stroke="sp1:T5" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T7 Depending on the default " stroke="sp1:T7" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T9 Depending on the default voice " stroke="sp1:T9" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T11 Depending on the default voice you " stroke="sp1:T11" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T13 Depending on the default voice you have " stroke="sp1:T13" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T15 Depending on the default voice you have selected " stroke="sp1:T15" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T17 Depending on the default voice you have selected in " stroke="sp1:T17" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T19 Depending on the default voice you have selected in Windows, " stroke="sp1:T19" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T21 Depending on the default voice you have selected in Windows, I " stroke="sp1:T21" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T23 Depending on the default voice you have selected in Windows, I can " stroke="sp1:T23" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T25 Depending on the default voice you have selected in Windows, I can sound " stroke="sp1:T25" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T27 Depending on the default voice you have selected in Windows, I can sound pretty " stroke="sp1:T27" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T29 Depending on the default voice you have selected in Windows, I can sound pretty bad. " stroke="sp1:T29" />
<gaze target="user" direction="POLAR 0" angle="0" sbm:joint-range="HEAD EYES" xmlns:sbm="http://ict.usc.edu" />
<sbm:event message="vrSpoke brad user 1337363228078-9-1 Depending on the default voice you have selected in Windows, I can sound pretty bad." stroke="sp1:relaxxmlns:xml="http://www.w3.org/XML/1998/namespacexmlns:sbm="http://ict.usc.edu" />
<!--first_VP Animation-->
<animation stroke="sp1:T3" priority="5" name="HandsAtSide_Arms_GestureWhy" />
<!--First noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T5" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T11" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T19" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T21" priority="5" />
</bml>
</act>

Explanation:

The Time-Markers i.e. T0, T1, T2 and so on, are used by the NVBG to mark each individual word in the sentence and later it can use this information to play animations at the start/end of that word.

e.g. in the above output bml, the animation "HandsAtSide_Arms_GestureWhy" is timed to play at sp1:T3, which means it will be played when the word "on" has been spoken. 'sp1' is the id of the speech tag which indicates that this 'T3' is from this particular speech tag. (currently NVBG only supports one speeech tag).

The appropriate "vrAgentSpeech partial" event messages are sent out by smartbody after each word has been spoken by the character. e.g."vrAgentSpeech partial 1337363228078-9-1 T19 Depending on the default voice you have selected in Windows," is sent out after the word Windows is spoken.

Similarly the <head> tag is interpreted by smartbody to generate Nods and other specified behavior.

The "priority" attribute allows NVBG to decide which behavior is to be culled if any of the behaviors overlap. This is done before the final message is sent out.

 

Receives

NVBG subscribes to vrExpress messages and a few other control messages which allow for setting some options. Below is a list of messages that NVBG subscribes to:

vrExpress

This message is sent to NVBG by the NLU, NPCEditor or similar module. This message can be used to convey information about speech data, posture, status change, emotion change, gaze data etc. as shown below.


Speech

The speech messages are characterized by the speech tag within them. They are interpreted and the corresponding output bml is generated with the speech time marks, animations, head-nods, facial-movements etc. These animations are generated based on the content of the speech tag and the fml tag in the input message.


 vrExpress "harmony" "ranger" "harmony221" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor"/>
<fml>
<intention>
<object name="A316">
<attribute name="addressee">ranger</attribute>
<attribute name="speech-act">
<object name="A317">
<attribute name="content">
<object name="V28">
<attribute name="modality">
<object name="V29">
<attribute name="conditional">should</attribute>
</object>
</attribute>
<attribute name="polarity">negative</attribute>
<attribute name="attribute">jobAttribute</attribute>
<attribute name="value">bartender-job</attribute>
<attribute name="object-id">utah</attribute>
<attribute name="type">state</attribute>
<attribute name="time">present</attribute>
</object>
</attribute>
<attribute name="motivation">
<object name="V27">
<attribute name="reason">become-sheriff-harmony</attribute>
<attribute name="goal">address-problem</attribute>
</object>
</attribute>
<attribute name="addressee">ranger</attribute>
<attribute name="action">assert</attribute>
<attribute name="actor">harmony</attribute>
</object>
</attribute>
</object>
</intention>
</fml>
<bml>
<speech id="sp1" type="application/ssml+xml">ranger utah cant be bartender if he becomes sheriff</speech>
</bml>
</act>"

 

Posture Change

These messages are characterized by the <body posture=""> tag which allows NVBG to know that there has been a change in posture.

vrExpress "harmony" "None" "??" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<bml>
<body posture="HandsAtSide" />
</bml>
</act>"


Status / Request

The idle_behavior and all_behavior attributes within the request tag allows NVBG to keep track of whether or not to generate the corresponding behavior.

vrExpress "harmony" "None" "??" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<status type="present" />
<request type="idle_behavior" value="off" />
</fml>
</act>"


Gaze


These gaze tags, if present within the input message are transferred unaltered to the output message.

vrExpress "harmony" "ranger" "constant103" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<gaze type="weak-focus" target="ranger" track="1" speed="normal" > "listen_to_speaker" </gaze>
</fml>
</act>"


Emotion

The affect tag contains data about the emotional state the character is currently in. This can be used to affect output behavior.

vrExpress "harmony" "None" "schererharmony17" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<affect type="Fear" STANCE="LEAKED" intensity="110.475"></affect>
</fml>
<bml> </bml>
</act>"

 

 

nvbg_set_option

These are control messages to set options for NVBG. They are as shown below:

nvbg_set_option [char-name] all_behavior true/false - sets/unsets flag that allows all behavior generated by NVBG.

nvbg_set_option [char-name] saliency_glance true/false - sets/unsets flag that allows saliency map generated gazes. These gazes are based on content in the speech tag and the information in the saliency map.

nvbg_set_option [char-name] saliency_idle_gaze true/false - sets/unsets flag that allows idle gazes generated by the saliency map. These idle gazes are based on the priority of pawns in the saliency map and are generated when the character is idle.

nvbg_set_option [char-name] speaker_gaze true/false - sets/unsets flag that allows for the character to look at the person he's speaking to.

nvbg_set_option [char-name] speaker_gesture true/false - sets/unsets flag that allows gestures to be generated when speaking.

nvbg_set_option [char-name] listener_gaze true/false - sets/unsets flag that allows the listener to gaze at the speaker when he speaks.

nvbg_set_option [char-name] nvbg_POS_rules true/false - sets/unsets flag that allows behavior to be generated based on parts of speech returned by the parser.

nvbg_set_option [char-name] saliency_map [filename] - lets you dynamically specify the saliency map that the character will use

nvbg_set_option [char-name] rule_input_file [filename] - lets you dynamically specify the behavior map that the character will use

 

Posture Change

These messages are characterized by the <body posture=""> tag which allows NVBG to know that there has been a change in posture.

vrExpress "harmony" "None" "??" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<bml>
<body posture="HandsAtSide" />
</bml>
</act>"

Status / Request

The idle_behavior and all_behavior attributes within the request tag allows NVBG to keep track of whether or not to generate the corresponding behavior.

vrExpress "harmony" "None" "??" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<status type="present" />
<request type="idle_behavior" value="off" />
</fml>
</act>"

Gaze

These gaze tags, if present within the input message are transferred unaltered to the output message.

vrExpress "harmony" "ranger" "constant103" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<gaze type="weak-focus" target="ranger" track="1" speed="normal" > "listen_to_speaker" </gaze>
</fml>
</act>"

Emotion

The affect tag contains data about the emotional state the character is currently in. This can be used to affect output behavior.

vrExpress "harmony" "None" "schererharmony17" "<?xml version="1.0" encoding="UTF-8" standalone="no" ?><act>
<participant id="harmony" role="actor" />
<fml>
<affect type="Fear" STANCE="LEAKED" intensity="110.475"></affect>
</fml>
<bml> </bml>
</act>"

 

 

nvbg_set_option

These are control messages to set options for NVBG. They are as shown below:

nvbg_set_option [char-name] all_behavior true/false - sets/unsets flag that allows all behavior generated by NVBG.

nvbg_set_option [char-name] saliency_glance true/false - sets/unsets flag that allows saliency map generated gazes. These gazes are based on content in the speech tag and the information in the saliency map.

nvbg_set_option [char-name] saliency_idle_gaze true/false - sets/unsets flag that allows idle gazes generated by the saliency map. These idle gazes are based on the priority of pawns in the saliency map and are generated when the character is idle.

nvbg_set_option [char-name] speaker_gaze true/false - sets/unsets flag that allows for the character to look at the person he's speaking to.

nvbg_set_option [char-name] speaker_gesture true/false - sets/unsets flag that allows gestures to be generated when speaking.

nvbg_set_option [char-name] listener_gaze true/false - sets/unsets flag that allows the listener to gaze at the speaker when he speaks.

nvbg_set_option [char-name] nvbg_POS_rules true/false - sets/unsets flag that allows behavior to be generated based on parts of speech returned by the parser.

nvbg_set_option [char-name] saliency_map [filename] - lets you dynamically specify the saliency map that the character will use

nvbg_set_option [char-name] rule_input_file [filename] - lets you dynamically specify the behavior map that the character will use

Understanding Output Messages

vrSpeak

Below is an example of the vrSpeak message that is output from NVBG. It contains animation/head-nods/facial behavior based on the text that is to be spoken/heard.

vrSpeak brad user 1337363228078-9-1 <?xml version="1.0" encoding="utf-16"?><act>
<participant id="brad" role="actor" />
<bml>
<speech id="sp1" ref="voice_defaultTTS" type="application/ssml+xml">
<mark name="T0" />Depending

<mark name="T1" /><mark name="T2" />on

<mark name="T3" /><mark name="T4" />the

<mark name="T5" /><mark name="T6" />default

<mark name="T7" /><mark name="T8" />voice

<mark name="T9" /><mark name="T10" />you

<mark name="T11" /><mark name="T12" />have

<mark name="T13" /><mark name="T14" />selected

<mark name="T15" /><mark name="T16" />in

<mark name="T17" /><mark name="T18" />Windows,

<mark name="T19" /><mark name="T20" />I

<mark name="T21" /><mark name="T22" />can

<mark name="T23" /><mark name="T24" />sound

<mark name="T25" /><mark name="T26" />pretty

<mark name="T27" /><mark name="T28" />bad.

<mark name="T29" />
</speech>
<event message="vrAgentSpeech partial 1337363228078-9-1 T1 Depending " stroke="sp1:T1" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T3 Depending on " stroke="sp1:T3" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T5 Depending on the " stroke="sp1:T5" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T7 Depending on the default " stroke="sp1:T7" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T9 Depending on the default voice " stroke="sp1:T9" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T11 Depending on the default voice you " stroke="sp1:T11" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T13 Depending on the default voice you have " stroke="sp1:T13" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T15 Depending on the default voice you have selected " stroke="sp1:T15" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T17 Depending on the default voice you have selected in " stroke="sp1:T17" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T19 Depending on the default voice you have selected in Windows, " stroke="sp1:T19" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T21 Depending on the default voice you have selected in Windows, I " stroke="sp1:T21" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T23 Depending on the default voice you have selected in Windows, I can " stroke="sp1:T23" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T25 Depending on the default voice you have selected in Windows, I can sound " stroke="sp1:T25" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T27 Depending on the default voice you have selected in Windows, I can sound pretty " stroke="sp1:T27" />
<event message="vrAgentSpeech partial 1337363228078-9-1 T29 Depending on the default voice you have selected in Windows, I can sound pretty bad. " stroke="sp1:T29" />
<gaze target="user" direction="POLAR 0" angle="0" sbm:joint-range="HEAD EYES" xmlns:sbm="http://ict.usc.edu" />
<sbm:event message="vrSpoke brad user 1337363228078-9-1 Depending on the default voice you have selected in Windows, I can sound pretty bad." stroke="sp1:relaxxmlns:xml="http://www.w3.org/XML/1998/namespacexmlns:sbm="http://ict.usc.edu" />
<!--first_VP Animation-->
<animation stroke="sp1:T3" priority="5" name="HandsAtSide_Arms_GestureWhy" />
<!--First noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T5" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T11" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T19" priority="5" />
<!--Noun clause nod-->
<head type="NOD" amount="0.10" repeats="1.0" relax="sp1:T21" priority="5" />
</bml>
</act>

Explanation:

The Time-Markers i.e. T0, T1, T2 and so on, are used by the NVBG to mark each individual word in the sentence and later it can use this information to play animations at the start/end of that word.

e.g. in the above output bml, the animation "HandsAtSide_Arms_GestureWhy" is timed to play at sp1:T3, which means it will be played when the word "on" has been spoken. 'sp1' is the id of the speech tag which indicates that this 'T3' is from this particular speech tag. (currently NVBG only supports one speeech tag).

The appropriate "vrAgentSpeech partial" event messages are sent out by smartbody after each word has been spoken by the character. e.g."vrAgentSpeech partial 1337363228078-9-1 T19 Depending on the default voice you have selected in Windows," is sent out after the word Windows is spoken.

Similarly the <head> tag is interpreted by smartbody to generate Nods and other specified behavior.

The "priority" attribute allows NVBG to decide which behavior is to be culled if any of the behaviors overlap. This is done before the final message is sent out.

 

...

Known Issues

Either list of common known issues and/or link to all Jira tickets with that component name.

...