BodySLAM: Opportunistic Body Tracking in Multi-User AR/VR Experiences

Handheld controllers, offering several buttons and six-degree-of-freedom tracking, are the most common input approach in today's augmented and virtual reality (AR/VR) systems (e.g., HTC Vive, Oculus Rift). Of course, there are many other facets that could be valuable to digitize, including user body pose, facial expression, skin tone and apparel. Unfortunately, very few AR/VR systems capture these dimensions, and when they do, it is most often via special worn sensors (e.g., instrumented gloves, additional cameras mounted on the headset). Alternatively, external infrastructure can be deployed (e.g., multiple room-mounted cameras) that captures body pose without having to instrument the user.

In this work, we take advantage of an emerging use case: co-located, multi-user AR/VR experiences. In such contexts, participants are often able to see each other's bodies, hands, mouths, apparel, and other visual facets, even though they generally do not see themselves. Using the existing outward-facing cameras on smartphones and AR/VR headsets (e.g., Microsoft HoloLens, Google Cardboard), these visual dimensions can be opportunistically captured and digitized, and then relayed back to their respective users in real time. This is the key insight that motivated our work on BodySLAM.

Our system name was inspired by SLAM (simultaneous localization and mapping) approaches to mapping unknown environments. In these systems, many viewpoints are used to reconstruct the geometry of the environment and objects. In a similar vein, BodySLAM uses disparate camera views from many participants to reconstruct the geometric arrangement of bodies in the environment, as well as to digitize the individual bodies themselves (bodies, hands, mouths, skin and apparel). When a person is seen by two or more users, we can triangulate their observations to estimate 3D data.
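The core geometric idea (that two or more views of the same body keypoint allow 3D recovery) can be sketched with standard linear triangulation. This is a minimal illustration with hypothetical camera matrices, not the paper's actual pipeline:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices (one per observing user).
    x1, x2: 2D image observations of the same body keypoint.
    Returns the estimated 3D point in world coordinates.
    """
    # Each observation contributes two linear constraints on the
    # homogeneous 3D point X: x * (P[2] @ X) = P[0] @ X, etc.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector of the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two hypothetical headset cameras observing a point at (0, 1, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])            # camera at origin
P2 = np.hstack([np.eye(3), np.array([[-1.], [0.], [0.]])])  # offset camera
X_true = np.array([0.0, 1.0, 5.0])
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
print(np.round(triangulate(P1, P2, x1, x2), 3))  # recovers [0. 1. 5.]
```

With noise-free observations the reconstruction is exact; in practice, more views and per-keypoint confidence weighting improve robustness.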

We evaluated our system in a multi-part user study, incorporating two tasks and two group sizes. To explore how BodySLAM might scale to larger numbers of people, we ran software simulations in virtual rooms. Although we did not build any demo applications, we note that digitization of bodies has been well motivated in prior work, including uses in social VR and telepresence, entertainment and gaming, and 3D manipulation.



Ahuja, K., Goel, M. and Harrison, C. 2020. BodySLAM: Opportunistic Body, Hand and Mouth Tracking in Multi-User AR/VR Experiences. In Proceedings of the 8th ACM Symposium on Spatial User Interaction. (October 30 – November 1, 2020). SUI '20. ACM, New York, NY.

© Chris Harrison