Research

In the 21st century, we live in an increasingly intelligent environment that evolves with the pace of technology, surrounded by ubiquitous devices. Integrating data from the sensors around us offers great potential to picture a user’s behavior (e.g., routines), help machines better understand humans, and support new interaction paradigms between humans and smart environments.

Ubiquitous devices have become increasingly prevalent, providing a unique opportunity to capture, model, and influence human behavior. However, the devices around us today are not yet intelligent enough. My research focuses on two aspects: 1) building algorithms and models that better understand and model human daily behavior, and 2) creating novel interaction techniques that better involve and leverage that behavior.

These two aspects are closely related: leveraging human behavior is grounded in an intelligent system that understands what we are doing, which in turn can effectively facilitate our daily lives.

My research typically draws on human-computer interaction, ubiquitous computing, and machine learning, often in combination.

Research Interests

  • Human-Computer Interaction
  • Ubiquitous Computing
  • Machine Learning

Publications & Projects

  • WWW 2020 Understanding User Behavior For Document Recommendation

    We conducted a large-scale log study of users’ interaction behavior with explainable recommendations on office.com, one of the largest cloud document platforms.

    Personalized document recommendation systems aim to provide users with a quick shortcut to the documents they may want to access next, usually with an explanation of why each document is recommended. Previous work explored various methods for better recommendations and better explanations in different domains. However, few efforts have closely studied how users react to the recommended items in a document recommendation scenario. We conducted a large-scale log study of users’ interaction behavior with explainable recommendations on office.com, one of the largest cloud document platforms. Our analysis reveals a number of factors, including display position, file type, authorship, recency of last access, and most importantly, the recommendation explanations, that are associated with whether users will recognize or open the recommended documents. Moreover, we specifically focus on explanations and conduct an online experiment to investigate the influence of different explanations on user behavior. Our analysis indicates that the recommendations help users access their documents significantly faster, but sometimes users miss a recommendation and resort to other, more complicated methods to open the documents. Our results suggest opportunities to improve explanations and, more generally, the design of systems that provide and explain recommendations for documents.  [Click for More Details]
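
    To give a concrete flavor of this kind of log analysis, below is a minimal, hypothetical sketch, not our actual office.com pipeline: the data is synthetic and the column names are assumptions, but it shows how the studied factors can be related to whether a recommended document was opened.

    ```python
    # Hypothetical sketch: relate recommendation-log factors to whether the
    # recommended document was opened. Data is synthetic; in the real study the
    # rows come from office.com interaction logs.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 2000
    logs = pd.DataFrame({
        "display_position": rng.integers(1, 7, n),           # slot 1..6 in the carousel
        "is_author": rng.integers(0, 2, n),                   # user authored the document
        "days_since_last_access": rng.exponential(10, n),
        "file_type": rng.choice(["docx", "xlsx", "pptx"], n),
        "explanation_type": rng.choice(["you_edited", "shared_with_you", "trending"], n),
    })
    # Synthetic outcome loosely favoring top positions and recently accessed documents.
    logs["opened"] = ((logs["display_position"] <= 2) &
                      (logs["days_since_last_access"] < 7)).astype(int)

    X = pd.get_dummies(logs.drop(columns="opened"), columns=["file_type", "explanation_type"])
    y = logs["opened"]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
    for name, coef in zip(X.columns, model.coef_[0]):   # rough view of factor associations
        print(f"{name}: {coef:+.3f}")
    ```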

  • CHI 2020 EarBuddy: Enabling On-Face Interaction via Wireless Earbuds

    We propose EarBuddy, a real-time system that leverages the microphone in commercial wireless earbuds to detect tapping and sliding gestures near the face and ears.

    Past research on on-body interaction has typically required custom sensors, limiting scalability and generalizability. We propose EarBuddy, a real-time system that leverages the microphone in commercial wireless earbuds to detect tapping and sliding gestures near the face and ears. We developed a design space to generate 27 valid gestures and conducted a user study (N=16) to select the eight gestures that were optimal for both human preference and microphone detectability. We collected a dataset on those eight gestures (N=20) and trained deep learning models for gesture detection and classification. Our optimized classifier achieved an accuracy of 95.3%. Finally, we conducted a user study (N=12) to evaluate EarBuddy’s usability. Our results show that EarBuddy can facilitate novel interaction and that users felt very positive about the system. EarBuddy provides a new eyes-free, socially acceptable input method that is compatible with commercial wireless earbuds and has the potential for scalability and generalizability. [Click for More Details]
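
    As a rough illustration of the recognition pipeline, here is a minimal sketch in PyTorch; the architecture, sample rate, and mel settings are assumptions for illustration, not the exact EarBuddy model. Earbud audio is converted to log-mel spectrograms and classified with a small CNN.

    ```python
    # Minimal sketch (assumed architecture, not the exact EarBuddy model): classify
    # short earbud-microphone clips of face/ear gestures from log-mel spectrograms.
    import torch
    import torch.nn as nn
    import torchaudio

    N_GESTURES = 8          # the eight gestures selected in the user study
    SAMPLE_RATE = 16000     # assumed microphone sample rate

    to_melspec = torchaudio.transforms.MelSpectrogram(sample_rate=SAMPLE_RATE, n_mels=64)

    class GestureCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc = nn.Linear(32, N_GESTURES)

        def forward(self, waveform):                  # waveform: (batch, samples)
            spec = to_melspec(waveform).unsqueeze(1)  # (batch, 1, mels, frames)
            spec = torch.log(spec + 1e-6)             # log-mel features
            return self.fc(self.conv(spec).flatten(1))

    # Shape check with a fake 1-second clip standing in for real earbud audio.
    model = GestureCNN()
    fake_clip = torch.randn(4, SAMPLE_RATE)
    print(model(fake_clip).shape)  # torch.Size([4, 8]) -> one score per gesture
    ```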

  • CHI LBW 2020 PneuFetch: Supporting Blind and Visually Impaired People to Fetch Nearby Objects via Light Haptic Cues

    We present PneuFetch, a wearable device that helps blind and visually impaired people fetch nearby objects.

    We present PneuFetch, a wearable device based on light haptic cues that helps blind and visually impaired (BVI) people fetch nearby objects in an unfamiliar environment. In our design, we generate friendly, non-intrusive, and gentle presses and drags to deliver direction and distance cues on a BVI user’s wrist and forearm. As a proof of concept, we discuss our PneuFetch wearable prototype, contrast it with past work, and describe a preliminary user study.  [Click for More Details]

  • IMWUT 2020 Recognizing Unintentional Touch on Interactive Tabletop

    We leverage gaze direction, head orientation and screen contact data to identify and filter out unintentional touches, so that users can take full advantage of the physical properties of an interactive tabletop.

    A multi-touch interactive tabletop is designed to embody the benefits of a digital computer within the familiar surface of a physical tabletop. However, the tendency of current multi-touch tabletops to detect and react to all forms of touch, including unintentional touches, impedes users from acting naturally on them. In our research, we leverage gaze direction, head orientation and screen contact data to identify and filter out unintentional touches, so that users can take full advantage of the physical properties of an interactive tabletop, e.g., resting hands or leaning on the tabletop during the interaction. To achieve this, we first conducted a user study to identify behavioral pattern differences (gaze, head and touch) between completing usual tasks on digital versus physical tabletops. We then compiled our findings into five types of spatiotemporal features and trained a machine learning model to recognize unintentional touches with an F1 score of 91.3%, outperforming the state-of-the-art model by 4.3%. Finally, we evaluated our algorithm in a real-time filtering system. A user study shows that our algorithm is stable and that the improved tabletop effectively screens out unintentional touches and provides a more relaxing and natural user experience. By linking their gaze and head behavior to their touch behavior, our work sheds light on the possibility of future tabletop technology to improve the understanding of users’ input intention. [Click for More Details]
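
    The following is a simplified sketch of the sensor-fusion idea; the feature names, synthetic data, and classifier choice are illustrative assumptions, not our exact feature set or model. Each touch point is described by spatiotemporal features combining gaze, head, and contact data, then classified as intentional or unintentional.

    ```python
    # Sketch: classify touch points as intentional vs. unintentional from combined
    # gaze / head / contact features. Features and synthetic labels are illustrative.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 1000

    # Assumed per-touch spatiotemporal features: gaze-to-touch distance, head pitch,
    # contact area, contact duration, and number of simultaneous contacts.
    X = np.column_stack([
        rng.uniform(0, 60, n),    # gaze-to-touch distance (cm)
        rng.uniform(-45, 10, n),  # head pitch (degrees)
        rng.uniform(0.2, 12, n),  # contact area (cm^2)
        rng.uniform(0.02, 3, n),  # contact duration (s)
        rng.integers(1, 6, n),    # simultaneous contacts
    ])
    # Placeholder labels; in the real pipeline these come from the annotated user study.
    y = ((X[:, 0] > 30) | (X[:, 2] > 6)).astype(int)

    clf = GradientBoostingClassifier()
    print("F1 (synthetic data):", cross_val_score(clf, X, y, scoring="f1", cv=5).mean())
    ```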

  • CHIIR 2020 Effects of Past Interactions on User Experience with Recommended Documents

    We provide an initial exploration of users’ experience with recommended documents, with a focus on how prior interactions influence recognition and interest.

    Recommender systems are commonly used in entertainment, news, e-commerce, and social media. Document recommendation is a new and under-explored application area, in which both re-finding and discovery of documents need to be supported. In this paper we provide an initial exploration of users’ experience with recommended documents, with a focus on how prior interactions influence recognition and interest. Through a field study of more than 100 users, we investigate the effects of past interactions with recommended documents on users’ recognition of, prior intent to open, and interest in the documents. We examined different presentations of interaction history, and the recency and richness of prior interaction. We found that presentation only influenced recognition time. Our findings also indicate that people are more likely to recognize documents they had accessed recently and to do so more quickly. Similarly, documents that people had interacted with more deeply were also more frequently and quickly recognized. However, people were more interested in older documents or those with which they had less involved interactions. This finding suggests that in addition to helping users quickly access documents they intend to re-find, document recommendation can add value in helping users discover other documents. Our results offer implications for designing document recommendation systems that help users fulfil different needs.  [Click for More Details]

  • IMWUT 2019 Leveraging Routine Behavior and Contextually-Filtered Features for Depression Detection among College Students

    We propose a new algorithm to mine behavior rules that capture people’s routine behavior and the differences in behavior patterns between groups.

    The rate of depression in college students is rising, which is known to increase suicide risk, lower academic performance and double the likelihood of dropping out of school. Existing work on finding relationships between passively sensed behavior and depression, as well as detecting depression, mainly derives relevant unimodal features from a single sensor. However, co-occurrence of values in multiple sensors may provide better features, because such features can describe behavior in context. We present a new method to extract contextually filtered features from passively collected, time-series mobile data via association rule mining. After calculating traditional unimodal features from the data, we extract rules that relate unimodal features to each other using association rule mining. We extract rules from each class separately (e.g., depression vs. non-depression). We introduce a new metric to select a subset of rules that distinguish between the two classes. From these rules, which capture the relationship between multiple unimodal features, we automatically extract contextually filtered features. These features are then fed into a traditional machine learning pipeline to detect the class of interest (in our case, depression), defined by whether a student has a high BDI-II score at the end of the semester. The behavior rules generated by our methods are highly interpretable representations of differences between classes. Our best model uses contextually-filtered features to significantly outperform a standard model that uses only unimodal features, by an average of 9.7% across a variety of metrics. We further verified the generalizability of our approach on a second dataset, and achieved very similar results. [Click for More Details]
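
    Below is a small sketch of the general idea; the binning, rule thresholds, and data are assumptions for illustration, not the paper’s exact mining algorithm or rule-selection metric. Unimodal features are binned per day, simple association rules are mined within a class, and a feature is then computed only over the days where a mined context holds.

    ```python
    # Sketch (simplified; not the paper's exact algorithm): bin daily sensor features
    # per class, mine simple "A -> B" rules by support/confidence, then compute a
    # "contextually filtered" feature only over days matching a mined context.
    import itertools
    import pandas as pd

    # Hypothetical per-day binary features for one class (e.g., the depressed group).
    days = pd.DataFrame({
        "sleep<6h":        [1, 1, 0, 1, 1, 0, 1, 1],
        "screen>5h":       [1, 1, 0, 1, 0, 0, 1, 1],
        "steps<2000":      [1, 0, 0, 1, 1, 0, 1, 1],
        "at_home_evening": [1, 1, 1, 1, 0, 1, 1, 1],
    }).astype(bool)

    # Brute-force pairwise rules with support and confidence computed over days.
    for a, b in itertools.permutations(days.columns, 2):
        support = (days[a] & days[b]).mean()
        confidence = (days[a] & days[b]).sum() / max(days[a].sum(), 1)
        if support >= 0.5 and confidence >= 0.8:
            print(f"{a} -> {b}  support={support:.2f} confidence={confidence:.2f}")

    # A contextually filtered feature: mean step count restricted to days where a
    # mined context ("sleep<6h" AND "screen>5h") holds, rather than over all days.
    raw_steps = pd.Series([1800, 1500, 9000, 1700, 2500, 8000, 1600, 1400])
    context = days["sleep<6h"] & days["screen>5h"]
    print("mean steps in context:", raw_steps[context].mean())
    ```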

  • CHI 2019 Clench Interaction: Novel Biting Input Techniques

    We propose Clench Interaction, a novel input technique based on clenching the teeth.

    People eat every day, and biting is one of the most fundamental and natural actions they perform on a daily basis. Existing work has explored tooth click location and jaw movement as input techniques; however, clenching has the potential to add control to this input channel. We propose clench interaction, which leverages clenching as an actively controlled physiological signal that can facilitate interactions. We conducted a user study to investigate users’ ability to control their clench force. We found that users can easily discriminate three force levels, and that they can quickly confirm actions by unclenching (quick release). We developed a design space for clench interaction based on the results and investigated the usability of the clench interface. Participants preferred the clench interface over baselines and indicated a willingness to use clench-based interactions. This novel technique can provide an additional input method in cases where users’ eyes or hands are busy, augment immersive experiences such as virtual/augmented reality, and assist individuals with disabilities. [Click for More Details]
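
    A minimal sketch of the interaction logic is shown below; the thresholds and timings are illustrative, not the calibrated values from our study. A clench-force stream is mapped to three levels, and a held level is confirmed by a quick release.

    ```python
    # Sketch (illustrative thresholds): map a normalized clench-force signal to three
    # discrete levels and confirm a selection on quick release (unclench).
    LEVELS = [(0.15, 1), (0.45, 2), (0.75, 3)]   # (force threshold, level)

    def force_to_level(f):
        level = 0
        for threshold, lvl in LEVELS:
            if f >= threshold:
                level = lvl
        return level

    def detect_quick_release(trace, dt=0.02, rest=0.05, max_release_time=0.2):
        """Yield (time, level) when force drops from a held level to rest quickly."""
        held_level, last_held_t = 0, 0.0
        for i, f in enumerate(trace):
            t = i * dt
            level = force_to_level(f)
            if level >= held_level and level > 0:
                held_level, last_held_t = level, t    # holding (or pressing harder)
            elif f <= rest and held_level > 0:
                if t - last_held_t <= max_release_time:
                    yield t, held_level               # quick release -> confirm level
                held_level = 0                        # a slow release cancels instead

    # Example: ramp to a medium clench, hold, then release quickly.
    trace = [0.0, 0.2, 0.5, 0.55, 0.55, 0.5, 0.0, 0.0]
    print(list(detect_quick_release(trace)))  # [(0.12, 2)] -> level 2 confirmed
    ```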

  • MobileHCI 2018 Hand Range Interface: Information Always at Hand With A Body-centric Mid-air Input Surface

    We propose Hand Range Interface, an input surface that is always at our fingertips.

    Most interfaces of our interactive devices, such as phones and laptops, are flat and built as external devices in our environment, disconnected from our bodies. Therefore, we need to carry them with us in a pocket or bag and accommodate our bodies to their design by sitting at a desk or holding the device in our hand. We propose Hand Range Interface, an input surface that is always at our fingertips. This body-centric interface is a semi-sphere attached to a user’s wrist, with a radius equal to the distance from the wrist to the index finger. We prototyped the concept in virtual reality and conducted a user study with a pointing task. The input surface can be designed either to rotate with the wrist or to stay fixed relative to the wrist. We evaluated and compared participants’ subjective physical comfort level, pointing speed and pointing accuracy on the interface, which was divided into 64 regions. We found that the interface with fixed orientation performed much better, with a 41.2% higher average comfort score, 40.6% shorter average pointing time and 34.5% lower average error. Our results revealed interesting insights into user performance and preference across different regions of the interface. We concluded with a set of guidelines for future designers and developers on how to develop this type of new body-centric input surface.  [Click for More Details]
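
    As a simple illustration of how such a surface can be addressed in software, the sketch below bins a fingertip position, expressed in the wrist’s coordinate frame, into one of 64 regions using an assumed 8x8 angular grid; the paper’s exact region layout may differ.

    ```python
    # Sketch (assumed 8x8 angular grid): map a fingertip position relative to the
    # wrist onto one of 64 regions of a wrist-centered hemisphere.
    import math

    N_AZIMUTH, N_ELEVATION = 8, 8   # 8 x 8 = 64 regions

    def fingertip_to_region(x, y, z):
        """x, y, z: fingertip position in the wrist's frame (hemisphere is z >= 0)."""
        azimuth = math.atan2(y, x)                    # -pi .. pi around the wrist
        elevation = math.atan2(z, math.hypot(x, y))   # 0 .. pi/2 above the wrist plane
        col = min(int((azimuth + math.pi) / (2 * math.pi) * N_AZIMUTH), N_AZIMUTH - 1)
        row = min(int(elevation / (math.pi / 2) * N_ELEVATION), N_ELEVATION - 1)
        return row * N_AZIMUTH + col                  # region index 0..63

    # Example: a point up and to the right of the wrist (meters).
    print(fingertip_to_region(0.10, 0.05, 0.12))
    ```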

  • CHI 2018 BreathVR: Leveraging Breathing as a Directly Controlled Interface for VR Games

    We propose breathing as a directly controlled physiological signal that can facilitate unique and engaging play experiences through natural interaction in single and multiplayer virtual reality games.

    With virtual reality head-mounted displays rapidly becoming accessible to mass audiences, there is growing interest in new forms of natural input techniques to enhance immersion and engagement for players. Research has explored physiological input for enhancing immersion in single player games through indirectly controlled signals like heart rate or galvanic skin response. In this paper, we propose breathing as a directly controlled physiological signal that can facilitate unique and engaging play experiences through natural interaction in single and multiplayer virtual reality games. Our study shows that participants report a higher sense of presence and find the gameplay more fun and challenging when using our breathing gestures. From study observations and analysis we present six design strategies that can aid virtual reality game designers interested in using directly controlled forms of physiological input.  [Click for More Details]
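
    For intuition, here is a toy sketch of directly controlled breathing input; the gesture set, thresholds, and signal scale are made up for illustration and are not the BreathVR gesture detector.

    ```python
    # Sketch (made-up thresholds): classify a window of breath-amplitude samples
    # into simple gestures that a VR game could use as direct input.
    import numpy as np

    def classify_breath(window, rate_hz=50):
        """window: breathing amplitude samples, positive ~ inhale, negative ~ exhale."""
        window = np.asarray(window, dtype=float)
        duration = len(window) / rate_hz
        peak, trough = window.max(), window.min()
        if peak > 0.7 and duration > 1.5:
            return "deep_inhale"      # e.g., charge an action
        if trough < -0.7 and duration < 0.8:
            return "sharp_exhale"     # e.g., trigger an action
        if np.abs(window).max() < 0.15:
            return "hold_breath"      # e.g., steady aim
        return "normal_breathing"

    rate = 50
    print(classify_breath(0.9 * np.sin(np.linspace(0, np.pi, 2 * rate)), rate))    # deep_inhale
    print(classify_breath(-0.9 * np.sin(np.linspace(0, np.pi, rate // 3)), rate))  # sharp_exhale
    ```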

  • CHI 2018 ForceBoard: Subtle Text Entry Leveraging Pressure

    ForceBoard is a pressure-based input technique that enables text entry by subtle motion.

    ForceBoard is a pressure-based input technique that enables text entry by subtle motion. To enter text, users apply pressure to control a multi-letter-wide sliding cursor on a one-dimensional keyboard with alphabetical ordering, and confirm the selection with a quick release. In particular, we examined the error model of pressure control for successive and error-tolerant input, which was then incorporated into a Bayesian algorithm to infer user input. Moreover, we employed tactile feedback to facilitate pressure control. The results showed that the text entry rate reached 4.2 WPM (Words Per Minute) for character-level input, and 11.0 WPM for word-level input after 10 minutes of training. Users subjectively reported that ForceBoard was easy to learn and interesting to use. These results demonstrate the feasibility of applying pressure as the main channel for text entry, and show that ForceBoard can be useful for subtle interaction or when interaction is constrained.  [Click for More Details]
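
    The following sketch conveys the core decoding idea in simplified form: a Gaussian pressure-error model with an assumed noise level and a uniform prior, rather than the empirically measured error model and full word-level decoder from the paper.

    ```python
    # Sketch (simplified decoder): infer the intended letter from a noisy pressure
    # reading with Bayes' rule, assuming Gaussian pressure-control error on a
    # one-dimensional alphabetical keyboard.
    import math
    import string

    LETTERS = string.ascii_lowercase                               # a..z along the pressure axis
    TARGETS = {c: (i + 0.5) / 26 for i, c in enumerate(LETTERS)}   # target pressure per letter
    SIGMA = 0.05                                                   # assumed control noise

    def letter_posterior(observed_pressure, prior):
        """P(letter | pressure) proportional to Gaussian(pressure; target, SIGMA) * prior."""
        scores = {
            c: prior.get(c, 1 / 26) * math.exp(-((observed_pressure - t) ** 2) / (2 * SIGMA ** 2))
            for c, t in TARGETS.items()
        }
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    # A language-model prior (here just uniform) would sharpen word-level inference.
    posterior = letter_posterior(observed_pressure=0.27, prior={})
    print(sorted(posterior.items(), key=lambda kv: -kv[1])[:3])
    ```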

  • DIS 2018 vMotion: A Context-Sensitive Design Methodology for Real-Walking in VR

    We propose a design methodology of seamlessly integrating redirection into the virtual experience that takes advantage of the perceptual phenomenon of inattentional blindness.

    Physically walking in virtual reality can provide a satisfying sense of presence. However, natural locomotion in virtual worlds larger than the tracked space remains a practical challenge. Numerous redirected walking techniques have been proposed to overcome space limitations but they often require rapid head rotation, sometimes induced by distractors, to keep the scene rotation imperceptible. We propose a design methodology of seamlessly integrating redirection into the virtual experience that takes advantage of the perceptual phenomenon of inattentional blindness. Using the functioning of the human visual system, we present four novel visibility control techniques that work with our design methodology to minimize disruption commonly found in existing redirection techniques. A user study shows that our embedded techniques are imperceptible and users report significantly less dizziness when using our methods. The illusion of unconstrained walking in a large area (16 x 8m) is maintained even though users are limited to a smaller (3.5 x 3.5m) physical space.  [Click for More Details]
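
    At its core, redirection injects a small extra scene rotation each frame. The sketch below illustrates that generic mechanism with made-up gain values, applying a larger gain only while a visibility-control event masks the change; it is not vMotion’s specific visibility-control techniques.

    ```python
    # Sketch of the generic redirection mechanism (illustrative gains, not vMotion's
    # techniques): scale the user's head-yaw change by a gain, increasing the gain
    # while a visibility-control event masks the rotation.
    def redirected_yaw(user_yaw_delta, visibility_control_active,
                       base_gain=1.05, masked_gain=1.25):
        """Return the scene yaw change (radians) to apply for this frame."""
        gain = masked_gain if visibility_control_active else base_gain
        return user_yaw_delta * gain

    scene_yaw = 0.0
    frames = [(0.02, False), (0.03, True), (0.03, True), (0.01, False)]  # (head dyaw, masked?)
    for dyaw, masked in frames:
        scene_yaw += redirected_yaw(dyaw, masked)
    print(f"extra rotation injected: {scene_yaw - sum(d for d, _ in frames):.4f} rad")
    ```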

  • VRST Wksp 2017 GalVR: A Novel Collaboration Interface using GVS

    GalVR is a navigation interface that uses galvanic vestibular stimulation (GVS) during walking to cause users to turn from their planned trajectory.

    GalVR is a navigation interface that uses galvanic vestibular stimulation (GVS) during walking to cause users to turn from their planned trajectory. We explore GalVR for collaborative navigation in a two-player virtual reality (VR) game. The interface affords a novel game design that exploits the differences between first- and third-person perspectives, allowing VR and non-VR users to share a play experience. By introducing interdependence arising from dissimilar points of view, players can uniquely contribute to the shared experience based on their roles. We detail the design of our asymmetrical game, Dark Room, and present some insights from a pilot study. Trust emerged as the defining factor for successful play.  [Click for More Details]

  • 2017 Leapwrist: A Low-cost Hand Gesture Recognition Smart Band

    We proposed LeapWrist, a smart wristband that allows users to perform gestures anywhere in 3D space.

    In this project, we proposed LeapWrist, a smart wristband that allows users to perform gestures anywhere in 3D space. We mounted two pairs of miniature low-power cameras and IR structured-light projectors on a wristband, one pair facing the palm and the other facing the back of the hand. The extracted hand depth maps were then fed into a CNN regression model to reconstruct the hand. This idea shares some similarity with work from Microsoft [1]; however, that system used only one camera-projector pair facing the palm, which limits detection capability, and did not apply deep learning. Finally, I led the team to win the outstanding prize in “Challenge Cup,” the highest-level technology competition at Tsinghua (1 out of 800).  [Click for More Details]
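
    A minimal sketch of the two-view regression idea is shown below; the architecture, input size, and joint count are assumptions, not the competition prototype.

    ```python
    # Sketch (illustrative architecture): regress 3D hand joint positions from the
    # two wrist-mounted depth maps (palm view + back-of-hand view).
    import torch
    import torch.nn as nn

    N_JOINTS = 21  # assumed hand skeleton with 21 joints

    class TwoViewHandRegressor(nn.Module):
        def __init__(self):
            super().__init__()
            def encoder():
                return nn.Sequential(
                    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                )
            self.palm_encoder = encoder()
            self.back_encoder = encoder()
            self.head = nn.Linear(64, N_JOINTS * 3)   # fuse both views, regress x,y,z per joint

        def forward(self, palm_depth, back_depth):
            features = torch.cat([self.palm_encoder(palm_depth),
                                  self.back_encoder(back_depth)], dim=1)
            return self.head(features).view(-1, N_JOINTS, 3)

    model = TwoViewHandRegressor()
    palm = torch.randn(2, 1, 96, 96)   # fake depth maps standing in for the IR camera output
    back = torch.randn(2, 1, 96, 96)
    print(model(palm, back).shape)     # torch.Size([2, 21, 3])
    ```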

  • 2016 Listening Behavior Generation of SARA

    We designed and implemented a new real-time algorithm for generating SARA’s listening behavior, leveraging traditional multimodal features together with rapport scores and conversational strategies.

    We designed and implemented a new real-time algorithm for generating SARA’s listening behavior, which leveraged traditional multimodal features together with rapport scores and conversational strategies. SARA was demoed at the World Economic Summer Forum 2016, SIGDIAL 2017, and the World Economic Forum 2017.  [Click for More Details]
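
    For illustration, here is a heavily simplified, rule-based sketch of the decision step; the cues, thresholds, and output behaviors are assumptions, not SARA’s actual decision model.

    ```python
    # Sketch (simplified rules): choose a listening behavior each time step from
    # multimodal cues, the current rapport score, and the active conversational strategy.
    def listening_behavior(features, rapport, strategy):
        """features: dict of multimodal cues; rapport: 1-7 scale; strategy: current strategy."""
        if features.get("user_pause_ms", 0) > 600 and features.get("pitch_falling", False):
            if rapport >= 5 or strategy == "self_disclosure":
                return "head_nod + verbal_backchannel"   # e.g., nod and say "mm-hmm"
            return "head_nod"
        if features.get("user_smiling", False):
            return "smile_back"
        return "neutral_gaze"

    print(listening_behavior({"user_pause_ms": 800, "pitch_falling": True},
                             rapport=6, strategy="praise"))
    ```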