Abstract

Augmented Reality is gaining attention, driven primarily by the availability of consumer devices such as Head Mounted Displays (HMDs). This paper focuses on a different flavour of Augmented Reality, called Mixed Reality, and describes the work carried out under the H2020 European project 5GCity.

We developed an application that runs on Microsoft HoloLens and is designed to deliver experience-enhancing information on a city scale. The application provides information about historical buildings, thus supporting outdoor cultural tourism. The user experience is enriched with content from the archives of the Italian public broadcaster RAI.

A cloud application (conceived and designed to run on a 5G-ready infrastructure) based on a visual search engine receives the image stream captured by the user's HMD and identifies known objects. The user can freely look at the object for which augmented content is to be displayed and interact with this content through a set of pre-defined gestures.

Moreover, if the object of interest is detected and tracked by the mixed reality application, 3D content can also be overlaid on and aligned with its real-world counterpart. Subjective evaluations confirmed that the application ran smoothly and that the initial recognition was stable and fast.

Introduction

In recent years there has been considerable activity around Augmented Reality (AR): many important players have shown support for AR by introducing HMDs such as Microsoft HoloLens, Samsung Odyssey, Meta 2 and Vuzix Blade AR.

AR and Mixed Reality (MR) are two well-known concepts. The difference is that AR “blends” real and virtual objects, while in MR virtual objects are positioned and aligned so that they appear to be part of the real world from the user's point of view.

Moreover, in contrast to Virtual Reality, AR and MR applications share the feature that users remain in contact with the real world. This is an advantage both from a technical point of view (part of the space already exists, so a computer-generated model of it is not necessary) and from a physical point of view (detachment from the real world can lead to physical and mental discomfort).

Virtual contents are called assets: a set of computer-generated objects overlaid on the real world. They can be text, audio, 3D models or video. This paper addresses the problem of aligning assets (such as 3D models) with real-world objects. Several AR solutions have been used in the context of cultural tourism because of their potential to improve the tourist experience: they help tourists access relevant information, improve their knowledge of the destination and make the visit more entertaining.

Yovcheva et al. define augmented tourism as a “complex construct which involves the emotions, feelings, knowledge and skills resulting from the perception, processing and interaction with virtual information that is merged with the real physical world surrounding the tourist”, explaining that it has not yet been fully exploited.

Moreover, the authors conclude that the value added to the overall tourist experience is determined by the fit between context and content, where context refers to the spatial, temporal, personal, and technical setting in which the AR system is being used.

Many solutions have been presented for indoor environments, where it is quite easy to track the user's position and lighting conditions can be controlled. Several museums provide AR applications that determine the position and orientation of the user by framing well-defined images (markers) and overlay computer-generated assets on the artwork the user is interested in.

On the other hand, several challenges have to be tackled for outdoor applications. First of all, accurate tracking of the user is not easy; GPS-based solutions are often used, but they might not be able to provide both position and orientation accurately.

Moreover, GPS-denied environments should also be considered (e.g. tall buildings block the satellite signal in narrow side streets and close to the buildings themselves). The second issue is that the environment cannot be altered by adding markers or target images. Finally, lighting conditions cannot be controlled.

This paper presents an MR application for outdoor environments that enhances the user experience during cultural tours by anchoring synthetic content to positions in real space and letting the user interact with it.

In order to give the user full mobility, the Microsoft HoloLens has been chosen as the HMD; it is connected through a high-speed, high-throughput network (e.g., 4G/5G cellular networks) to a cloud infrastructure. First, the application captures images of what the user is looking at, at a constant rate; then, transparently to the user, the images are sent to the cloud architecture, where a visual search engine (based on MPEG Compact Descriptors for Visual Search, CDVS) is running.

In this way, objects of interest framed by the user can be identified; finally, a notification about a recognised object is sent back to the user, who can then receive augmented content about the object. The application can display both textual information and videos about the framed monument, building or artwork, and a semitransparent silhouette of the recognised item can be used to align the user with respect to the target, thus enhancing tracking robustness. When the AR application tracks the object, 3D virtual assets can be overlaid on the target.
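The capture, query and notify cycle described above can be illustrated with a minimal Python sketch. It is not the application's actual code: the names `Recognition`, `recognition_loop`, `query_cloud`, `period_s` and `min_score` are hypothetical, and the cloud visual search engine is represented by an arbitrary callable rather than a real CDVS implementation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class Recognition:
    object_id: str  # hypothetical identifier of a known monument
    score: float    # hypothetical match confidence in [0, 1]


def recognition_loop(
    frames: Iterable[bytes],
    query_cloud: Callable[[bytes], Optional[Recognition]],
    period_s: float,
    min_score: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Recognition]:
    """Send one frame per period to the cloud and yield accepted matches."""
    for frame in frames:
        started = clock()
        # Stand-in for the network round trip to the visual search engine.
        result = query_cloud(frame)
        if result is not None and result.score >= min_score:
            # Corresponds to the notification sent back to the HMD.
            yield result
        # Wait for the rest of the period so frames go out at a constant rate.
        remaining = period_s - (clock() - started)
        if remaining > 0:
            sleep(remaining)
```

In such a sketch the caller would pass the camera frames as `frames` and react to each yielded `Recognition` by loading the corresponding assets; the confidence threshold is an assumption added for illustration and is not taken from the paper.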

The paper is organised as follows: Section 2 presents the basic concepts behind AR and MR, together with some examples of outdoor augmented reality for tourism; Section 3 details the architecture of the proposed application; and Section 4 reports the tests carried out and the results gathered in Turin’s Archaeological Park.
