MacKenzie, I. S. (2025). Experience 2.0: Mixed realities and beyond. Engineering Interactive Computer Systems. EICS 2024 International Workshops (LNCS 15518), pp. 235-253. Berlin, Springer. doi:10.1007/978-3-031-91760-8_16.
Abstract. We examine current mixed reality (MR) environments, looking at what they are, what can and cannot be done, and the challenges and possibilities looking forward. The contribution of Milgram and Kishino's one-dimensional mixed reality model from 1994 is acknowledged, with possibilities noted for extending the model to two and three dimensions. The challenges and possibilities for MR looking forward are summarized as seven community challenges (CCs) derived from current research and to be examined by MR researchers.

Experience 2.0: Mixed Realities and Beyond
I. Scott MacKenzie
Dept. of Electrical Engineering and Computer Science
York University, Toronto, Canada
mack@yorku.ca
Keywords: Virtual reality • augmented reality • mixed reality • extended reality • multiple realities • heads-up display • see-through display • wall-size display • fogscreen • holographic display • brain-computer interface • sense of smell • olfactory display • tactile feedback • community challenges
1 Introduction
The environments of interest in this paper are virtual reality (VR), augmented reality (AR), and mixed reality (MR). These were explored at the workshop, "Experience 2.0 and Beyond - Engineering Cross Devices and Multiple Realities" on July 25, 2024, as part of The 16th ACM SIGCHI Symposium on Engineering Interactive Computing Systems, EICS '24. The goal of the workshop was to look at the "reality" spaces – what they are, where they are, and where they are headed. Of concern are the user experience (UX) and the engineering of realities and devices to enhance that experience in future contexts.
We focus on mixed modalities that include novel devices triggering multiple human senses and in different contexts. The term "mixed reality" or "MR" is meant to encompass VR, AR, and related environments, and go inside and beyond the common interpretations for each. We also pose research questions with each presented as a community challenge (CC). The questions arise from examples that expose interactions that challenge and push the boundaries of the mixed reality space.
Notably, the EICS workshop marked the 30-year anniversary of the publication of Milgram and Kishino's seminal1 paper, "A Taxonomy of Mixed Reality Visual Displays." In it, they explored what at the time was the emerging world of mixed realities. They did so using a descriptive model. See Fig. 1. Descriptive models are visualizations with labels and narratives that deconstruct a problem space into constituent parts [23, Chapter 7]. They are tools for thinking – for identifying the parts, what they are, how they relate to each other, and the possibilities offered. In Fig. 1, we see a world Milgram and Kishino anticipated with the "real" at one end, the "virtual" at the other end, and "augmented" variations within. They use "mixed reality" as an umbrella term to encompass the possibilities.
Fig. 1. Milgram and Kishino's mixed reality continuum [26].
The terms featured in Milgram and Kishino's paper are well-travelled in the HCI literature. Figure 2 shows the results from a search of the number of publications using these terms in the ACM Digital Library. The chart is organized as a timeline beginning in 1980, before HCI emerged as a field of study.
Fig. 2. Cumulative number of publications by year in the ACM Digital Library using the terms "virtual reality," "augmented reality," and "mixed reality."
Milgram and Kishino's model is one-dimensional, expressing a single property ("degree of reality") along a continuum. This is the simplest rendering for a descriptive model. See Fig. 3a. If two dimensions are identified for a problem space, then Fig. 3b applies. An example of a model space in two dimensions is the positioning of an artefact along one axis for "form" (design) and a second axis for "function" (engineering) [23, Figure 4.2]. If a problem space is described along three dimensions, then Fig. 3c applies. An example here is Flavián et al.'s "EPI Cube," which characterises MR environments in a three-axis space according to their degree of embodiment, presence, and interactivity [10, Figure 6].
Fig. 3. Visual rendering for descriptive models by dimension: (a) one-dimensional (1D), (b) two-dimensional (2D), and (c) three-dimensional (3D). Each axis is a continuum, positioning the problem space along a property of interest.
In Milgram and Kishino's paper, they also propose extending their model along additional dimensions, comprising the following elements:
- World knowledge ⇒ what we know of the world displayed
- Fidelity ⇒ the level of realism in the world displayed
- Presence ⇒ the degree of illusion that the observer experiences
Each was represented visually in their paper as a one-dimensional (1D) rendering as per Fig. 3a. Although not explored by Milgram and Kishino, it might be possible to combine any two of the 1D properties into a 2D model as per Fig. 3b or all three of the 1D properties into a 3D descriptive model as per Fig. 3c. The challenge with a 3D rendering is positioning environments within the space in a manner that is intuitive and revealing. For this, an interactive VR application might help. Research awaits.
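To make the idea concrete, the sketch below (in Python) treats each of the three properties as a 0-to-1 axis and positions a few environments as points in the resulting 3D space. The environment names and ratings are invented for illustration; nothing here is prescribed by Milgram and Kishino, it is merely one way such a model could be operationalized.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class Environment:
    """An MR environment rated 0..1 on each of the three proposed dimensions."""
    name: str
    world_knowledge: float   # what we know of the world displayed
    fidelity: float          # level of realism in the world displayed
    presence: float          # degree of illusion the observer experiences

    def as_point(self):
        return (self.world_knowledge, self.fidelity, self.presence)

# Hypothetical environments with hand-assigned (illustrative) ratings
envs = [
    Environment("see-through automotive HUD", 0.9, 0.8, 0.3),
    Environment("fully immersive VR game", 0.2, 0.7, 0.9),
    Environment("holographic telepresence", 0.6, 0.7, 0.7),
]

# Pairwise distances in the 3D model space
for i, a in enumerate(envs):
    for b in envs[i + 1:]:
        print(f"{a.name} <-> {b.name}: {dist(a.as_point(), b.as_point()):.2f}")
```

Distances between points offer one crude way of asking how "similar" two environments are within the model, though whether such positions and distances are intuitive and revealing is exactly the open question noted above.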
2 Position Topics
Below are the topics raised in this position paper. Each includes examples and is followed with a community challenge in the form of a research question. The intent here is not to answer the research questions, but to offer the questions as points of discussion or research for the MR community.
2.1 Terminology
Considering the three terms in Fig. 2, "virtual reality" is the most common (26,469 occurrences), followed by "augmented reality" (18,346 occurrences) and "mixed reality" (5,546 occurrences). The curves show no sign of flattening, as VR, AR, and MR continue to inspire HCI researchers today. A closer look at citations provides evidence of this: Of the 9700+ citations to Milgram and Kishino's 1994 paper in Google Scholar, more than 4700 are from 2020 or later.
The terminology itself is an issue, since the overlap in reality spaces inherently creates uncertainty on the boundaries of each and fosters additional terms, such as XR (extended reality) [28, 35], UR (unmediated reality) [44], PMR (pure mixed reality) [10], SAR (spatial augmented reality) [3], OAR (olfactory augmented reality) [34], MSAR (multi-sensory augmented reality) [2], metaverse [27, 46], and so on. Each carves out a niche in the diverse space of mixed realities. It is a concern that these terms are bandied about in myriad ways, often with overlapping interpretations [40, 41]. And so,
CC1 → What are the important terms for multiple reality environments? List, define, and organize in a glossary.
We leave this for the MR community to address at a later time.
2.2 The Forgotten Senses
A common refrain for mixed reality is heightening the sense of immersion or presence. This means, among other things, engaging all the human senses. The visual channel is well represented in all setups. This is expected, as most people obtain about 80% of their information through the sense of sight [1]. There is generally and understandably less attention directed at the auditory and tactile senses (depending on context, of course). The senses of smell and taste are barely represented, if at all.
Mixed reality is an area within HCI that seems particularly suited to applications leveraging the human sensory channel of smell. This is due to the immersion experienced by the user. As an example, Niedenthal et al.'s olfactometer is a graspable olfactory display for virtual environments [30]. See Fig. 4. Their research tested new interaction domains for the human olfactory experience that involve active exploration and directed sniffing of odour blends. The device contains four gas chambers containing scents and attaches to an HTC Vive hand controller. Valves on the device control the flow and blending of scents. They evaluated the device in a user study with a virtual wine cellar game. Results indicate that the device is intuitive to use and stable enough for long-term smell training sessions. In another study [31], they demonstrate that use of an olfactory display has the potential to increase the sense of presence in virtual environments.
Although Niedenthal et al.'s olfactory display is a separate handheld device, one can imagine the display built into a headset, engaging the nasal cavity much like the headset's visual display engages the eyes.
Fig. 4. A handheld olfactory display for use in virtual environments [30].
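As a purely illustrative sketch of the kind of control logic such a device implies (this is not Niedenthal et al.'s implementation; the chamber names and the duty-cycle mapping are assumptions), the fragment below normalizes a requested scent blend into per-valve duty cycles, with the hardware driver stubbed out:

```python
CHAMBERS = ["oak", "citrus", "berry", "earth"]   # hypothetical scent chambers

def blend_to_duty_cycles(blend):
    """Normalize requested scent proportions to per-valve duty cycles in [0, 1]."""
    requested = {c: max(blend.get(c, 0.0), 0.0) for c in CHAMBERS}
    total = sum(requested.values())
    if total == 0:
        return {c: 0.0 for c in CHAMBERS}
    return {c: v / total for c, v in requested.items()}

def apply_duty_cycles(duty):
    """Stub: a real driver would pulse each valve open for duty * cycle_time."""
    for chamber, d in duty.items():
        print(f"valve[{chamber}] open {d * 100:.0f}% of each cycle")

# Example: a blend dominated by "oak" with a touch of "berry"
apply_duty_cycles(blend_to_duty_cycles({"oak": 2.0, "berry": 1.0}))
```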
As another example, Zulkarnain et al. [49] created a mixed reality environment by adding the sense of smell to a VR setup. They created and studied virtual sensory booths where users are presented with an immersive yet controlled environment for assessing smells. See Fig. 5. Users wore head-mounted displays. In a training session, they visited sensory booths with items of food on plates. Later, the participants visited booths with empty plates and experienced smells presented through test tubes containing scented sticks and attempted to identify the food item. Zulkarnain et al. note that their mixed reality lab has the potential for cost savings in consumer research.
Fig. 5. Mixed reality sensory booth for consumer research on food (after [49]).
Fig. 6. Formative pillars in Ericsson's Internet of Senses (after [8]).
Smell and taste, along with the other senses and the mind, are the formative pillars in Ericsson's Internet of Senses, a futuristic look at computing [8]. See Fig. 6. The 2019 report was based on an online poll of users from 15 cities worldwide, with at least 500 respondents from each city.
Many scenarios in the Ericsson report assume that smell and taste will be realized through digital means for future interactions. For smell, 60% of respondents predict it will be possible to digitally visit forests or the countryside and experience the natural smells of those places. For taste, 40% anticipate a revolution in online shopping, with the ability to digitally taste samples from the comfort of their devices. And so,
CC2 → For each mixed reality environment, to what extent are the human senses represented?
The CC above is not a yes-no checklist. Each sense is a multi-dimensional space. So, depicting the presence of a sense in MR needs an appropriately rich presentation. For example, sound or auditory content is common, but is the sound monophonic or stereo? Are there directional cues in the sound and, if so, are the cues omni-directional or limited to, say, a horizontal plane of −180° to +180°? And what type of sounds are present: sounds from people, from nature, or synthetic sounds? Within each of the categories, there are sub-categories. If the environment includes sounds from people, are they voiced or unvoiced, speech or non-speech, computer generated or natural, and so on? This elaboration for the sense of sound applies to the other senses as well – something for the MR community to explore.
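One way to move beyond a yes-no checklist is to record each sense as a small structured profile. The sketch below follows the auditory categories listed above; the field and category names are our own, offered only for illustration of what such a profile might look like for the sense of sound:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class Spatialization(Enum):
    MONO = auto()
    STEREO = auto()
    DIRECTIONAL_PLANAR = auto()   # directional cues limited to a horizontal plane
    DIRECTIONAL_OMNI = auto()     # full omni-directional cues

class SoundSource(Enum):
    PEOPLE = auto()
    NATURE = auto()
    SYNTHETIC = auto()

@dataclass
class AuditoryProfile:
    """How the sense of sound is represented in one MR environment."""
    present: bool
    spatialization: Spatialization = Spatialization.MONO
    sources: set = field(default_factory=set)
    speech: bool = False              # if PEOPLE sounds: speech vs. non-speech
    computer_generated: bool = False  # if speech: synthetic vs. natural voice

# Example: a simulated driving environment with natural road noise and
# synthetic spoken prompts
driving_audio = AuditoryProfile(
    present=True,
    spatialization=Spatialization.STEREO,
    sources={SoundSource.NATURE, SoundSource.SYNTHETIC, SoundSource.PEOPLE},
    speech=True,
    computer_generated=True,
)
print(driving_audio)
```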
We will shortly say more about the sense labelled "Mind" in Fig. 6.
2.3 Mixing Real and Virtual
Although head-mounted displays (HMDs) are inseparable from VR, mixed reality as an extension of VR invites the use of additional novel display technologies that often mix real and virtual content. Examples include see-through displays (aka heads-up displays) [15, 45, 48], holographic displays [16, 36], wall displays [17, 19, 25], fogscreens [37, 47], and so on.2
See-Through Displays (Automotive Applications). A popular MR application is automotive interfaces (or cockpits) that are augmented with see-through displays (sometimes smart glasses). See Fig. 7.
As expected, the bulk of the automotive research in MR is in simulated environments. Since the application is driving, the environment is unlikely to include a head-mounted display. And so, the visual channel is mostly real – the road ahead – but is augmented with additional displays, as depicted in Fig. 7. For this reason, the environment falls closer to the real end of Milgram and Kishino's real-virtual continuum, perhaps at position 2 or 3 in Fig. 8. Whether the position is 2 or 3 might be a matter of other issues, such as whether the environment's audio content is fully real (position 2) or partly synthetic (position 3).
Fig. 7. Mixed reality with see-through displays in simulated automotive environment (after [5]).
Fig. 8. Placing environments in the MR continuum (after [26]).
And so,
CC3 → For each mixed reality environment, what is the environment's position in Milgram and Kishino's mixed reality space?
In some setups, simply pegging the environment at a position in the MR continuum might not be possible. For example, in the setup described by Schmidt and Yigitbas [39], the environment's position in the MR continuum is dynamic. This is enabled since modern VR headsets are equipped with front-facing cameras on the outside of the device cover. In their transitional cross-reality system, a real-virtual control (RVC) appears as a slider to allow the user to set the environment's position in the MR space. See Fig. 9. This is done via a "transition manager" to activate or deactivate environment layers and adjust the internal logic of the application accordingly. The transparency of layers allows other layers below to potentially shine through. However, objects in layers are opaque and block content in adjacent "lower" layers.
Fig. 9. RVC slider. The user grabs the red sphere of the slider with their thumb and index finger and drags it to set the current position of the environment in the MR continuum (from [39, Figure 6]).
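A minimal sketch of the general idea, not Schmidt and Yigitbas's implementation (the layer names and the cross-fade rule are assumptions), maps a slider value between 0 (fully real) and 1 (fully virtual) to the opacity of an ordered stack of environment layers:

```python
# Layers ordered from the "real" end to the "virtual" end of the continuum
LAYERS = ["camera passthrough", "augmented overlays", "virtual scene"]

def layer_opacities(slider):
    """Map a slider value (0 = fully real, 1 = fully virtual) to per-layer
    opacity, cross-fading adjacent layers as the slider moves along the continuum."""
    slider = min(max(slider, 0.0), 1.0)
    n = len(LAYERS)
    opacities = {}
    for i, layer in enumerate(LAYERS):
        centre = i / (n - 1)                    # slider position where this layer peaks
        distance = abs(slider - centre) * (n - 1)
        opacities[layer] = max(0.0, 1.0 - distance)
    return opacities

for position in (0.0, 0.5, 0.9):
    print(position, layer_opacities(position))
```

Note that a per-layer opacity handles only the "shine through" behaviour; keeping individual objects opaque, as described above, would require per-object handling on top of this.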
Ch et al. [6] conducted an elicitation study using a mock-up of a heads-up windshield display. See Fig. 10a. The goal was to see if drivers would use gestures or voice commands to interact with an in-vehicle interface. For 24 tasks, participants provided numerous examples deemed appropriate for gesture input or voice input. No significant difference was found between the agreement rates for gestures vs. voice commands, but Ch et al. also found a relatively low acceptance overall for the interaction tasks, some of which were complex (e.g., bookmark an audio selection or play karaoke vocals in the background). Issues concerning distraction were not included in the study.
Cao et al. [5] describe a see-through heads-up display and gesture input method to allow drivers to interact with and see menu selections and messages without looking away from the road. Their setup used an acrylic board positioned on an angle above an Apple iPad. Information on the iPad was visible on the see-through acrylic board. See Fig. 10b. The motivation was to maintain vehicular safety.
Safety is a concern in mixed reality environments that involve driving, flying, riding a bicycle, walking about, or other potentially hazardous tasks. And so,
CC4 → What safety issues emerge in the application of mixed realities?
Holographic Displays. Holographic displays remain mostly a research topic with commercial applications largely on the fringe of HCI. This is expected: Research with an emerging technology tends to initially focus on "what can be done" rather than "how well it can be done." Reviewing the literature reveals that user studies with holographic displays are rare. Usability is largely reported through informal testing or anecdotal narratives. Most research with holographic displays focuses on the implementation details.
Fig. 10. Heads-up display. (a) Demonstration of heads-up display with in-vehicle tasks controlled by voice prompts and voice commands (after [6]). (b) An acrylic board functions as a see-through display and presents safety information to the driver (after [5]).
An exception is the HoloLens product line from Microsoft, aimed initially at the gaming market, but also used for medical AR [12] and by developers with an eye to new possibilities. An impediment to widespread consumer adoption, however, is the price tag: $3500 for HoloLens 2.3
Ihara et al. [16] used holographic displays that allowed remote and local users to co-exist and collaborate in a mixed reality environment. A remote "holographic user" collaborates with a local real user in a shared space for touching, grasping, and manipulating objects. See Fig. 11a. Their HoloBots environment combines a HoloLens 2 with an Azure Kinect and Sony Toio mobile robots in creating a broad world-in-miniature telepresence experience. See Fig. 11b.
Sargolzaei et al. [38] use the HoloLens 2 to reconstruct game sessions for action-adventure games and first-person shooter games. Their GAMR system combines a Unity plugin with the HoloLens 2 MR headset to aid developers in understanding player behaviors within the MR environment. Post-game analyses are based on recorded and reconstructed gameplay sessions, leveraging the capabilities of holographic images from different perspectives.
Fig. 11. HoloBots [16]. (a) Local user and holographic rendering of the remote user. (b) The remote user is tracked by an Azure Kinect (not shown) and appears as a holographic user in the local environment, which includes a Microsoft HoloLens 2 and Sony Toio tangible robots.
2.4 Standardized Testing
Ihara et al. [16] present an evaluation of their HoloBots system, comparing hologram-only, robot-only, and hologram+robot conditions. The evaluation used qualitative measures, such as user responses to SUS and NASA-TLX questionnaire items, but no quantitative measures of performance.
In other HCI areas, such as research on target selection or text entry, there are standardized tasks for evaluating user performance with new interaction methods. For target selection, standardized testing typically involves using a Fitts' law task as per ISO 9241-411 [18]. For text entry, standardized testing involves users entering phrases of text selected at random from a standard set [24] while the speed and accuracy of their performance is measured and compared.
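For reference, the throughput measure commonly reported for ISO 9241-411 Fitts' law tasks can be computed from per-trial amplitudes, selection offsets, and movement times, as in the sketch below (the trial data are invented for illustration):

```python
import statistics as stats
from math import log2

def throughput(amplitudes, endpoint_offsets, movement_times):
    """Fitts' throughput (bits/s) from per-trial data:
    amplitudes        -- movement distances, one per trial
    endpoint_offsets  -- signed selection offsets along the task axis
    movement_times    -- movement times in seconds"""
    ae = stats.mean(amplitudes)                 # effective amplitude
    we = 4.133 * stats.stdev(endpoint_offsets)  # effective target width
    ide = log2(ae / we + 1)                     # effective index of difficulty (bits)
    return ide / stats.mean(movement_times)     # throughput (bits/s)

# Invented data for one block of ten trials (units: mm, mm, s)
amps = [120, 118, 122, 121, 119, 120, 123, 117, 120, 121]
offs = [3.1, -2.4, 0.8, -1.2, 2.9, -0.5, 1.7, -3.3, 0.2, 1.1]
times = [0.42, 0.45, 0.40, 0.44, 0.43, 0.41, 0.46, 0.44, 0.42, 0.43]

print(f"Throughput = {throughput(amps, offs, times):.2f} bits/s")
```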
One benefit of standardized testing is that across-study comparisons are possible. The same might be possible for mixed reality environments. An example might be a robot-like pick-and-drop task [43], since this involves grasping and manipulating objects, as required in Ihara et al.'s [16] HoloBots system. One possible implementation is the disc transfer task or pin transfer task described by Fitts [9]. See Fig. 12. In VR, the task would likely benefit from including a home and final position for the hand and perhaps the same for the user's point of view (POV) or position in the environment.
And so,
CC5 → What are the possibilities for representative tasks and quantitative measures for standardized testing of mixed reality environments?
Fogscreens. A fogscreen is another example of a novel display for mixed reality. Fogscreens are immaterial mid-air displays formed from flowing light-scattering particles – like fog. They bring new possibilities for displaying information in MR environments. Users can touch, reach into, even walk through a fogscreen. See Fig. 13a. Most commonly, fogscreens are found in settings such as trade shows, theme parks, or museums. Equipped with appropriate sensors, a fogscreen can act as an interactive display [20, 32, 33].
Fig. 12. Possible standardized tasks for evaluating MR setups. (a) Disc transfer task. (b) Pin transfer task (after [9]).
Fig. 13. Fogscreen [37]. (a) Users can reach into, even walk through, a fogscreen. (b) Interaction with tactile feedback.
Remizova et al. [37] researched a fogscreen setup, focusing on interaction. Yes, fogscreens are enticing, even playful, but what are the possibilities for interaction? A critical issue is feedback. Users can reach into a fogscreen and touch or grab objects, but providing a tangible sense of that contact might have benefits for performance or experience. In Remizova et al.'s setup, the position of the hands in the fogscreen was digitized using a motion-sensing Microsoft Kinect. Their study used a Fitts' law task, as per ISO 9241-411 [18], where participants reached into the fogscreen to acquire and select targets. Two selection methods (tapping, dwelling) and two feedback modes (audio-visual, audio-visual+haptic) were compared in a within-subjects user study with 20 participants. The haptic feedback method used a lightweight wireless vibrotactile actuator worn on the user's finger. See Fig. 13b. In the end, no performance advantage was found using haptic feedback vs. audio-visual feedback. This result has implications for the design of gestural interfaces suitable for interaction with fogscreens.
Wall Displays. Wall displays have a long history in HCI, dating to Bolt's "Put That There" demo in 1980 [4]. In Bolt's system, the user makes mid-air gestures, moving their hand to point to objects or locations on a wall-size display and then issues voice commands.4 See Fig. 14. Bolt's system was a room with a large projection screen and an instrumented chair with a microphone for the user. The system included voice recognition for simple commands and a Polhemus 6 DOF tracker mounted on the user's hand. It was, perhaps, the first VR "Cave."
A common feature in Bolt's cave and Remizova et al.'s fogscreen is that both systems require on-body hardware (6 DOF hand tracker, haptic finger actuator) and off-body5 hardware (chair with microphone, motion sensor). And so,
CC6 → For each mixed reality environment, what on-body and off-body hardware is present and what is its role for input control or output display?
Fig. 14. Wall-size display in Bolt's "Put That There" demo from 1980 (after [4]).
James et al. [19] used a wall display in a collaborative work setting. See Fig. 15. The wall display was public – viewable to all. Users wore head-mounted displays providing additional virtual surfaces viewable to all team members. Each user also had a personal space viewable only to them. The personal space moved about with the user. Their research included a user study with 24 participants acting in pairs to solve a collaborative task either using the wall display alone or using the wall display augmented with the personal space and virtual surfaces. The design included a standardized questionnaire with items for mental demand, physical demand, ease of use, enjoyment, etc. They also gathered quantitative measures of performance, such as task completion time. The questionnaire responses indicated a significantly better user experience for the wall+AR environment. However, the difference in task completion time between the wall-only and wall+AR conditions was not statistically significant. With this, we revisit a previous research question; see CC5.
Fig. 15. Wall display, personal space, and shared virtual surfaces for collaborative work (after [19]).
2.5 And Beyond
Returning to the Ericsson poll [8], respondents commented on a variety of future scenarios. For one scenario, 59% of respondents believe it will be possible in the future to see map routes on VR glasses by simply thinking of a travel destination. This implies that mixed realities in the future may also include a brain-computer interface (BCI) with interactions driven, at least in part, by a user's thoughts.
Although there is considerable research on brain computer interfaces (e.g., [11, 14, 21, 29]), mainstream applications remain elusive. This is partly due to the need to wear an electroencephalogram (EEG) cap with conductive gel, electrodes, and wires. Furthermore, most BCI systems are expensive and the set-up and calibration are complicated. So there is a technology gap for BCI systems for use by non-experts outside the lab.
In an attempt to address the issues just noted, Hou et al. [13] tested an inexpensive BCI product aimed at the gaming and VR markets. The NextMind,6 introduced in 2019 and priced at $300, is a wearable sensor module with nine comb-shaped dry electrodes to pick up EEG signals and send them to the computer for processing. See Fig. 16a. De Pace et al. [7] used the NextMind device to control a telerobotic arm in a pick-and-place task. They reported that selecting objects was "easy, fast, and reliable" (p. 1).
Fig. 16. NextMind brain-computer interface. (a) Device. (b) Experiment setup. (c) Use-case simulation for wheelchair and TV control (after [13]).
NextMind uses steady-state visually evoked potentials (SSVEP) whereby distinguishable brain responses occur when the user looks at an interface with objects flashing at different frequencies. In Hou et al.'s user study [13], six participants selected on-screen targets by looking at the targets. One task involved target selection as per ISO 9241-411 [18]. See Fig. 16b. The mean throughput achieved was 0.82 bits/s. This is quite low, considering that a similar task with touch-based selection yields throughput of ≈7 bits/s [22]. However, this is a notable result given that selection was performed simply by looking at targets and without using an eye-tracking apparatus.
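To illustrate the general SSVEP approach (this is not NextMind's algorithm; the sampling rate, flicker frequencies, and synthetic signal are assumptions for illustration), the sketch below selects the target whose flicker frequency carries the most spectral power in a window of EEG:

```python
import numpy as np

FS = 250                                  # EEG sampling rate (Hz), assumed
TARGET_FREQS = [7.5, 10.0, 12.0, 15.0]    # target flicker frequencies (Hz), assumed

def classify_ssvep(eeg):
    """Return the candidate flicker frequency with the most spectral power."""
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / FS)
    powers = [spectrum[np.argmin(np.abs(freqs - f))] for f in TARGET_FREQS]
    return TARGET_FREQS[int(np.argmax(powers))]

# Synthetic two-second trial: the user attends the 12 Hz target, plus noise
t = np.arange(0, 2.0, 1.0 / FS)
trial = np.sin(2 * np.pi * 12.0 * t) + 0.8 * np.random.randn(len(t))
print("Selected target flickering at", classify_ssvep(trial), "Hz")
```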
A second task was a use-case demonstration where a user selected on-screen buttons to steer and move a virtual wheelchair. See Fig. 16c. Once the wheelchair was in front of a TV, the user selected on-screen buttons to change the channel and control the volume.
Although Hou et al.'s setup used steady-state visually evoked potentials, BCI systems can also work through "motor imagery," whereby the user consciously thinks of certain actions, such as clenching a fist or raising a foot, while the signals emitted by the brain coincident with that action are measured, identified, and acted on in unique ways. However, it is not feasible at the present time for BCI systems to detect and act on open-ended thought patterns, such as a travel destination. And so,
CC7 → What is our mixed-reality wish-list – interactions we cannot do today but would enjoy and benefit from should they be possible in the future?
3 Conclusion
In this position paper, we presented a cross-section of mixed reality environments and for each looked at the technology, what can be done, and where challenges lie. The context was the 2024 EICS workshop, "Experience 2.0 and Beyond - Engineering Cross Devices and Multiple Realities." The challenges were collected in a series of community challenges, which were offered as discussion topics for the workshop.
We conclude by again citing the important contribution of Milgram and Kishino's 1994 descriptive model for reality spaces that combine the real with the virtual.
References
1. Asakawa, C., Takagi, H.: Text entry for people with visual impairments. In: MacKenzie, I.S., Tanaka-Ishii, K. (eds.) Text Entry Systems: Mobility, Accessibility, Universality, pp. 305–318. Morgan Kaufmann, San Francisco (2007). https://doi.org/10.1016/B978-012373591-1/50016-4
2. Bilbow, S.: Developing multisensory augmented reality as a medium for computational artists. In: Proceedings of the Fifteenth International Conference on Tangible, Embedded, and Embodied Interaction – TEI 2021, pp. 72.1–72.7. ACM, New York (2021). https://doi.org/10.1145/3430524.3443690
3. Bimber, O., Raskar, R.: Modern approaches to augmented reality. In: Proceedings of SIGGRAPH 2006, p. 1-es. ACM, New York (2006). https://doi.org/10.1145/1185657.1185796
4. Bolt, R.A.: "Put-That-There": voice and gesture at the graphics interface. In: Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques - SIGGRAPH 1980, pp. 262–270. ACM, New York (1980). https://doi.org/10.1145/800250.807503
5. Cao, Y., Li, L., Yuan, J., Jeon, M.: Increasing driving safety and in-vehicle gesture-based menu navigation accuracy with a heads-up display. In: Adjunct Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications – AutomotiveUI 2022, pp. 212–214. ACM, New York (2022). https://doi.org/10.1145/3544999.3551502
6. Ch, N.A.N., Tosca, D., Crump, T., Ansah, A., Kun, A., Shaer, O.: Gesture and voice commands to interact with AR windshield display in automated vehicle: a remote elicitation study. In: Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications – AutomotiveUI 2022, pp. 171–182. ACM, New York (2022). https://doi.org/10.1145/3543174.3545257
7. De Pace, F., Manuri, F., Bosco, M., Sanna, A., Kaufmann, H.: Supporting human-robot interaction by projected augmented reality and a brain interface. IEEE Trans. Hum.-Mach. Syst. 1–10 (2024). https://doi.org/10.1109/THMS.2024.3414208
8. Ericsson: 10 hot consumer trends 2030: The Internet of senses. Technical report, Ericsson ConsumerLab (2019). https://www.ericsson.com/4ac661/assets/local/reports-papers/consumerlab/reports/2019/10hctreport2030.pdf. Accessed 15 Apr 2024
9. Fitts, P.M.: The information capacity of the human motor system in controlling the amplitude of movement. J. Exp. Psychol. 47(6), 381–391 (1954). https://doi.org/10.1037/h0055392
10. Flavián, C., Ibáñez-Sánchez, S., Orús, C.: The impact of virtual, augmented and mixed reality technologies on the customer experience. J. Bus. Res. 100, 547–560 (2019). https://doi.org/10.1016/j.jbusres.2018.10.050
11. Fouad, M.M., Amin, K.M., El-Bendary, N., Hassanien, A.E.: Brain computer interface: a review. In: Hassanien, A.E., Azar, A.T. (eds.) Brain-Computer Interfaces. ISRL, vol. 74, pp. 3–30. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-10978-7_1
12. Gsaxner, C., et al.: The HoloLens in medicine: a systematic review and taxonomy. Med. Image Anal. 85, 102757 (2023). https://doi.org/10.1016/j.media.2023.102757
13. Hou, B.J., et al.: Feasibility of a device for gaze interaction by visually-evoked brain signals. In: Proceedings of the 2022 Symposium on Eye Tracking Research & Applications – ETRA 2022, pp. 62.1–62.7. ACM, New York (2022). https://doi.org/10.1145/3517031.3529232
14. Hougaard, B.I., et al.: Who willed it? Decreasing frustration by manipulating perceived control through fabricated input for stroke rehabilitation BCI games. Proc. ACM Hum.-Comput. Interact. 5(CHI PLAY), 235.1–235.19 (2021). https://doi.org/10.1145/3474662
15. Hubenschmid, S., Zagermann, J., Leicht, D., Reiterer, H., Feuchtner, T.: ARound the smartphone: investigating the effects of virtually-extended display size on spatial memory. In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2023, pp. 527.1–527.15. ACM, New York (2023). https://doi.org/10.1145/3544548.3581438
16. Ihara, K., Faridan, M., Ichikawa, A., Kawaguchi, I., Suzuki, R.: HoloBots: augmenting holographic telepresence with mobile robots for tangible remote collaboration in mixed reality. In: Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology – UIST 2023, pp. 119.1–119.12. ACM, New York (2023). https://doi.org/10.1145/3586183.3606727
17. Irlitti, A., et al.: Volumetric mixed reality telepresence for real-time cross modality collaboration. In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2023, pp. 101.1–101.14. ACM, New York (2023). https://doi.org/10.1145/3544548.3581277
18. ISO: Ergonomics of human-system interaction - Part 411: Evaluation methods for the design of physical input devices. Report Number ISO/TS 9241-411:2012, International Organisation for Standardisation, Geneva, Switzerland (2012). https://www.iso.org/standard/54106.html
19. James, R., Bezerianos, A., Chapuis, O.: Evaluating the extension of wall displays with AR for collaborative work. In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2023, pp. 99.1–99.17. ACM, New York (2023). https://doi.org/10.1145/3544548.3580752
20. Jumisko-Pyykkö, S., Weitzel, M., Rakkolainen, I.: Biting, whirling, crawling - children's embodied interaction with walk-through displays. In: Gross, T., et al. (eds.) INTERACT 2009. LNCS, vol. 5726, pp. 123–136. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03655-2_15
21. Koike, Y., Hiroi, Y., Itoh, Y., Rekimoto, J.: Brain-computer interface using directional auditory perception. In: Proceedings of the Augmented Humans International Conference – AH 2023, pp. 342–345. ACM, New York (2023). https://doi.org/10.1145/3582700.3583713
22. MacKenzie, I.S.: Fitts' throughput and the remarkable case of touch-based target selection. In: Kurosu, M. (ed.) HCI 2015. LNCS, vol. 9170, pp. 238–249. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20916-6_23
23. MacKenzie, I.S.: Human-Computer Interaction: An Empirical Research Perspective, 2nd edn. Morgan Kaufmann (an imprint of Elsevier), Cambridge (2024). https://www.yorku.ca/mack/HCIbook2e/
24. MacKenzie, I.S., Soukoreff, R.W.: Phrase sets for evaluating text entry techniques. In: Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2003, pp. 754–755. ACM, New York (2003). https://doi.org/10.1145/765891.765971
25. Maquil, V., Anastasiou, D., Afkari, H., Coppens, A., Hermen, J., Schwartz, L.: Establishing awareness through pointing gestures during collaborative decision-making in a wall-display environment. In: Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2023, pp. 104.1–104.7. ACM, New York (2023). https://doi.org/10.1145/3544549.3585830
26. Milgram, P., Kishino, F.: A taxonomy of mixed reality visual displays. IEICE Trans. Inf. Syst. E77-D(12), 1321–1329 (1994) https://cs.gmu.edu/~zduric/cs499/Readings/r76JBo-Milgram_IEICE_1994.pdf
27. Mirza-Babaei, P., Robinson, R., Mandryk, R., Pirker, J., Kang, C., Fletcher, A.: Games and the Metaverse. In: Extended Abstracts of the 2022 Annual Symposium on Computer-Human Interaction in Play – CHI PLAY 2023, pp. 318–319. ACM, New York (2022). https://doi.org/10.1145/3505270.3558355
28. Mukhopadhyay, A., Sharma, V.K., Gaikwad, P.T., Sandula, A.K., Biswas, P.: Exploring the use of XR interfaces for driver assistance in take over request. In: Adjunct Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications – AutomotiveUI 2022, pp. 58–61. ACM, New York (2022). https://doi.org/10.1145/3544999.3552527
29. Nicolas-Alonso, L.F., Gomez-Gil, J.: Brain computer interfaces, a review. Sensors 12(2), 1211–1279 (2012). https://doi.org/10.3390/s120201211
30. Niedenthal, S., Fredborg, W., Lundén, P., Ehrndal, M., Olofsson, J.K.: A graspable olfactory display for virtual reality. Int. J. Hum Comput Stud. 169, 102928 (2023). https://doi.org/10.1016/j.ijhcs.2022.102928
31. Niedenthal, S., Lundén, P., Ehrndal, M., Olofsson, J.K.: A handheld olfactory display for smell-enabled VR games. In: IEEE International Symposium on Olfaction and Electronic Nose – ISOEN 2019, pp. 1–4. IEEE, New York (2019). https://doi.org/10.1109/ISOEN.2019.8823162
32. Palovuori, K., Rakkolainen, I.: Improved virtual reality for mid-air projection screen technology. In: Proceedings of the International Symposium on Communicability, Computer Graphics and Innovative Design for Interactive Systems – CCGIDIS 2013, pp. 25–33. Blue Herons Editions, Bergamo, Italy (2013). https://doi.org/10.978.8896471/227
33. Palovuori, K., Rakkolainen, I.: Improved interaction for mid-air projection screen technology. In: Association, I.R.M. (ed.) Virtual and Augmented Reality: Concepts, Methodologies, Tools, and Applications, pp. 1742–1761. IGI Global, Hershey, PA (2018). https://doi.org/10.4018/978-1-5225-5469-1.ch082
34. Pamparâu, C.: Dare we define olfactory augmented reality? In: Proceedings of the 2023 ACM International Conference on Interactive Media Experiences Workshops – IMXw 2023, pp. 52–55. ACM, New York (2023). https://doi.org/10.1145/3604321.3604376
35. Rauschnabel, P.A., Felix, R., Hinsch, C., Shahab, H., Alt, F.: What is XR? Towards a framework for augmented and virtual reality. Comput. Hum. Behav. 133, 107289 (2022). https://doi.org/10.1016/j.chb.2022.107289
36. Rebol, M., Lake, B., Reinisch, M., Pietroszek, K., Gütl, C.: Holographic sports training. In: Companion Proceedings of the 2023 Conference on Interactive Surfaces and Spaces – ISS 2023, pp. 70–73. ACM, New York (2023). https://doi.org/10.1145/3626485.3626547
37. Remizova, V., et al.: Mid-air gestural interaction with a large fogscreen. Multimodal Technol. Interact. 7(63), 1–18 (2023). https://doi.org/10.3390/mti7070063
38. Sargolzaei, P., Rastogi, M., Zaman, L.: Advancing mixed reality game development: an evaluation of a visual game analytics tool in action-adventure and FPS genres. In: Proceedings of the ACM on Human-Computer Interaction, vol. 8, No. CHI PLAY, Article 290, 32 p. ACM, New York (2024). https://doi.org/10.1145/3677055
39. Schmidt, L., Yigitbas, E.: Development and usability evaluation of transitional cross-reality interfaces. Proc. ACM Hum.-Comput. Interact. 8(EICS), 263.1–263.32 (2024). https://doi.org/10.1145/3664637
40. Skarbez, R., Smith, M., Whitton, M.: It is time to let go of 'Virtual Reality'. Commun. ACM 66(10), 41–43 (2023). https://doi.org/10.1145/3590959
41. Speicher, M., Hall, B.D., Nebeling, M.: What is mixed reality? In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2019, pp. 1–15. ACM, New York (2019). https://doi.org/10.1145/3290605.3300767
42. Suzuki, R., Karim, A., Xia, T., Hedayati, H., Marquardt, N.: Augmented reality and robotics: a survey and taxonomy for AR-enhanced human-robot interaction and robotic interfaces. In: Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2022. ACM, New York (2022). https://doi.org/10.1145/3491102.3517719
43. Vaiani, G., Paternò, F.: End-user development for human-robot interaction: results and trends in an emerging field. Proc. ACM Hum.-Comput. Interact. 8(EICS), 252.1–252.40 (2024). https://doi.org/10.1145/3661146
44. Wang, X.M., et al.: The geometry of the vergence-accommodation conflict in mixed reality systems. Virtual Reality 28(95) (2024). https://doi.org/10.1007/s10055-024-00991-4
45. Wigdor, D., Forlines, C., Baudisch, P., Barnwell, J., Shen, C.: LucidTouch: a see-through mobile device. In: Proceedings of the ACM Symposium on User Interface Software and Technology – UIST 2007, pp. 269–278. ACM, New York (2007). https://doi.org/10.1145/1294211.1294259
46. Xu, J., et al.: Metaverse: the vision for the future. In: Extended Abstracts of the ACM SIGCHI Conference on Human Factors in Computing Systems – CHI 2022, pp. 167.1–167.3. ACM, New York (2022). https://doi.org/10.1145/3491101.3516399
47. Yamada, W., Manabe, H., Ikeda, D., Rekimoto, J.: RayGraphy: Aerial volumetric graphics rendered using lasers in fog. In: Proceedings of the 2020 ACM Symposium on Spatial User Interaction - SUI 2020, pp. 11.1–11.99. ACM, New York (2020). https://doi.org/10.1145/3385959.3418446
48. Zhao, S., Tan, F., Fennedy, K.: Heads-up computing: Moving beyond the device-centered paradigm. Commun. ACM 66(9), 56–63 (2023). https://doi.org/10.1145/3571722
49. Zulkarnain, A.H.B., Kókai, Z., Gere, A.: Assessment of a virtual sensory laboratory for consumer sensory evaluations. Heliyon 10(3) (2024). https://doi.org/10.1016/j.heliyon.2024.e25498
-----
Footnotes:
1. More than 9700 citations on Google Scholar (September 2024).
2. Although this section's focus is visual displays, the term "display" applies to any computer output channel that engages human senses. This includes, for example, auditory displays and tactile displays.
3. https://www.theregister.com/2023/04/17/microsoft_hololens_windows_11/.
4. Viewable on YouTube at https://www.youtube.com/watch?v=4XkUdwdq_-k.
5. The term "on-environment" is also used [42].
6. https://ar.snap.com/welcome-nextmind.