In a first lab experiment, 24 volunteers were audio- and video-recorded while performing three tasks of varying complexity on a photocopy machine. Data on the users’ experience level and on the quality of the interaction were also collected, so that they can later be correlated with the observed social signals.

Observational study on the interaction with a copy machine

At a later stage, observation will move to a natural context: a supermarket self-checkout service area. There it is possible to monitor users in a less constrained interaction context, where the same system is in contact with a wide range of users and usages, and where the physical approach and posture towards the system differ from those in a desktop computer scenario.

Data Analysis...

The next task was to extract, from the user interaction recordings, features based on nonverbal and behavioral cues such as physical activity, patterns and speed of movement, posture changes, gestures, vocal outbursts, gaze direction and rhythm, and proxemic behavior towards the machine. The choice of signaling modalities favored features that are technically easy to detect, even in uncontrolled and noisy environments such as open public spaces. To identify and annotate their occurrence reliably, we created and tested a social signal coding protocol for human observation.
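One reason physical activity is a technically easy cue to detect is that it can be approximated directly from inter-frame pixel differences, with no body tracking. The sketch below illustrates the idea on a synthetic grayscale frame sequence; the function name and the toy data are illustrative, not part of the project's pipeline:

```python
import numpy as np

def activity_series(frames):
    """Crude per-frame physical-activity cue: mean absolute
    difference between consecutive grayscale frames.
    frames: array of shape (T, H, W)."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

# Synthetic example: 5 blank 4x4 frames; one pixel changes at frame 3,
# so only the transition between frames 2 and 3 registers activity.
frames = np.zeros((5, 4, 4))
frames[3:, 1, 1] = 255.0
print(activity_series(frames))  # activity spikes only at the transition
```

In a real deployment the same series would be computed on camera frames, and peaks or sustained high values would feed the annotation and classification stages.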

Coding Scheme for the Annotation of Social Signals in HCI

The pre-recorded videos of interaction with the photocopier are now being classified according to this coding scheme, and the detected cues and their combinations (social signals) are being correlated with independent information on the user’s difficulty and experience levels and with the occurrence of incidents such as errors, hesitations, undos, corrections, and delays in task completion.
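A correlation of this kind can be computed per session between annotated cue counts and logged incident counts, for example with a Pearson coefficient. A minimal sketch, using made-up counts (not project data):

```python
import numpy as np

# Hypothetical per-session counts (illustrative numbers only):
# occurrences of annotated cues vs. logged incidents (errors, undos, ...).
cue_counts      = np.array([2, 5, 1, 7, 4, 6, 0, 3])
incident_counts = np.array([1, 4, 0, 6, 3, 5, 1, 2])

# Pearson correlation between the two series.
r = np.corrcoef(cue_counts, incident_counts)[0, 1]
print(f"Pearson r = {r:.2f}")
```

A high positive coefficient would support the hypothesis that the annotated social signals track interaction difficulty.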

Implementation and testing...

Based on the model developed in the previous stage, a device will be implemented to monitor and classify the users’ social signals. We are working on video-based monitoring and audio processing techniques to identify the most relevant social signals, and on algorithms to classify them according to the interaction quality parameters.
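The classification step can be as simple as assigning a feature vector derived from the signals to the nearest class prototype. The following nearest-centroid sketch is only an illustration of that mapping; the feature dimensions, labels, and training values are invented for the example:

```python
import numpy as np

# Illustrative training data: 2-D feature vectors (e.g. activity level,
# gesture rate) grouped by an interaction-quality label. Not project data.
train = {
    "smooth":    np.array([[0.2, 0.1], [0.3, 0.2]]),
    "difficult": np.array([[0.8, 0.9], [0.7, 0.7]]),
}
# One centroid (mean feature vector) per quality label.
centroids = {label: feats.mean(axis=0) for label, feats in train.items()}

def classify(x):
    """Return the label whose centroid is closest to feature vector x."""
    return min(centroids, key=lambda lbl: np.linalg.norm(x - centroids[lbl]))

print(classify(np.array([0.75, 0.80])))  # → difficult
```

More capable classifiers could replace the centroid rule without changing the surrounding pipeline: features in, quality label out.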

A first attempt consisted of automatic video coding of the user’s activity and emphasis of body movement [for further information, vide]. General results indicate that these metrics predict 46.6% of the variance in task difficulty.
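A "variance predicted" figure of this kind is the R² of a regression of task difficulty on the motion metrics. The sketch below shows the computation with a least-squares fit on synthetic values (the numbers are illustrative, not the study's data):

```python
import numpy as np

# Synthetic per-trial data: two motion metrics (activity, movement
# emphasis) and a difficulty score. Illustrative values only.
X = np.array([[0.2, 0.1], [0.5, 0.4], [0.8, 0.3], [0.4, 0.6], [0.9, 0.8]])
y = np.array([1.0, 2.0, 2.5, 2.2, 3.1])

# Least-squares linear fit with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# R^2: fraction of the difficulty variance explained by the metrics.
resid = y - A @ coef
r2 = 1 - resid.var() / y.var()
print(f"R^2 = {r2:.3f}")
```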

Other processing techniques are being developed to allow a more complete, multimodal approach.

The implementation should consist of a self-contained system that can be easily deployed. The device will be subjected to tests in real-life interaction contexts to assess the success of the model and of the implementation techniques developed. These tests will take place in the same interaction contexts used during development, such as the supermarket self-checkout machine.