Mobile devices with touch capabilities often utilize touchscreen keyboards. However, due to the lack of tactile feedback, users often have to switch their focus of attention between the keyboard area, where they must locate and click the correct keys, and the text area, where they must verify the typed output. This can impair user experience and performance. In this paper, we examine multimodal feedback and guidance signals that keep users’ focus of attention in the keyboard area but also provide the kind of information users would normally get in the text area. We first conducted a usability study to assess and refine the user experience of these signals and their combinations. Then we evaluated whether those signals which users preferred could also improve typing performance in a controlled experiment. One combination of multimodal signals significantly improved typing speed by 11%, reduced keystrokes-per-character by 8%, and reduced backspaces by 28%. We discuss design implications.
Probabilistic topic models, such as PLSA and LDA, are gaining popularity in many fields due to their high-quality results. Unfortunately, existing topic models suffer from two drawbacks: (1) model complexity and (2) disjoint topic groups. That is, when a topic model involves multiple entities (such as authors, papers, conferences, and institutions) and they are connected through multiple relationships, the model becomes too difficult to analyze and often leads to intractable solutions. Also, different entity types are classified into disjoint topic groups that are not directly comparable, so it is difficult to see whether heterogeneous entities (such as authors and conferences) are on the same topic or not (e.g., are Rakesh Agrawal and KDD related to the same topic?).
Mental illness is one of the most undertreated health problems worldwide. Previous work has shown that there are remarkably strong cues to mental illness in short samples of the voice. These cues are evident in severe forms of illness, but it would be most valuable to make earlier diagnoses from a richer feature set. Furthermore, there is an abstraction gap between these voice cues and the diagnostic cues used by practitioners. We believe that by closing this gap, we can build more effective early diagnostic systems for mental illness. In order to develop improved monitoring, we need to translate the high-level cues used by practitioners into features that can be analyzed using signal processing and machine learning techniques. In this paper we describe the elicitation process that we used to tap the practitioners' knowledge. We borrow from both the AI (expert systems) and HCI (contextual inquiry) fields in order to perform this knowledge transfer. The paper highlights an unusual and promising role for HCI: the analysis of interaction data for health diagnosis.
Information extraction and user intention identification are central topics in modern query understanding and recommendation systems. In this paper, we propose DeepProbe, a generic information-directed interaction framework built around an attention-based sequence-to-sequence (seq2seq) recurrent neural network. DeepProbe can rephrase, evaluate, and even actively ask questions, leveraging the generative ability and likelihood estimation made possible by seq2seq models. DeepProbe makes decisions based on a derived uncertainty (entropy) measure conditioned on user inputs, possibly over multiple rounds of interaction. Three applications, namely a rewriter, a relevance scorer, and a chatbot for ad recommendation, were built around DeepProbe, with the first two serving as precursory building blocks for the third. We first use the seq2seq model in DeepProbe to rewrite a user query into a standard query form, which is submitted to an ordinary recommendation system. Secondly, we evaluate DeepProbe's seq2seq model-based relevance scoring. Finally, we build a chatbot prototype capable of active user interaction, which asks questions that maximize information gain, allowing for more efficient user intention identification. We evaluate the first two applications by (1) comparing against baselines using BLEU and AUC, and (2) human judge evaluation. Both demonstrate significant improvements over current state-of-the-art systems, proving their value as useful tools on their own and laying a good foundation for the ongoing chatbot application.
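To make the entropy-driven control loop described above concrete, the sketch below (my own assumptions, not DeepProbe's code) shows how an uncertainty measure over candidate user intents can drive the choice between committing to an intent and asking the clarifying question with the highest expected information gain. The function names, the threshold value, and the way answer probabilities and posteriors are supplied are hypothetical stand-ins for quantities a seq2seq model would estimate.

```python
# Minimal sketch of an entropy-based ask-or-commit decision.
# All distributions here would, in a real system, come from seq2seq
# likelihood estimates conditioned on the dialogue so far.
import math
from typing import Dict, List, Tuple

def entropy(p: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a distribution over candidate intents."""
    return -sum(q * math.log(q) for q in p.values() if q > 0.0)

def expected_info_gain(prior: Dict[str, float],
                       answer_probs: List[float],
                       posteriors: List[Dict[str, float]]) -> float:
    """Expected entropy reduction from asking one clarifying question."""
    expected_posterior = sum(a * entropy(post)
                             for a, post in zip(answer_probs, posteriors))
    return entropy(prior) - expected_posterior

def decide(prior: Dict[str, float],
           questions: Dict[str, Tuple[List[float], List[Dict[str, float]]]],
           threshold: float = 0.5):
    """Commit to the most likely intent if uncertainty is low,
    otherwise ask the question that maximizes expected information gain."""
    if entropy(prior) < threshold:
        return "commit", max(prior, key=prior.get)
    best = max(questions, key=lambda q: expected_info_gain(prior, *questions[q]))
    return "ask", best
```

In this toy formulation the posteriors are supplied directly; the design choice the abstract points to is that one generative model both scores candidate rewrites and estimates how a user's answer would sharpen the intent distribution.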
This paper investigates the usefulness of segmental phoneme dynamics for classification of speaking styles. We modeled transition details based on the phoneme sequences emitted by a speech recognizer, using data obtained from recordings of 39 depressed patients with 7 different speaking styles: normal, pressured, slurred, stuttered, flat, slow, and fast speech. We designed and compared two sets of phoneme models: a language model treating each phoneme as a word unit (one for each style) and a context-dependent phoneme duration model based on Gaussians for each speaking style considered. The experiments showed that language modeling at the phoneme level performed better than the duration model. We also found that better performance can be obtained with user normalization. To assess the complementary effect of the phoneme-based models, the classifiers were combined at the decision level with a Hidden Markov Model (HMM) classifier built from spectral features. The improvement was 5.7% absolute (10.4% relative), reaching 60.3% accuracy in 7-class and 71.0% in 4-class classification.
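As an illustration of the phoneme-level language-modeling idea (a sketch under stated assumptions, not the paper's implementation), the snippet below trains one add-one-smoothed bigram model per speaking style over recognizer phoneme strings and classifies an utterance by maximum log-likelihood. The class names, vocabulary handling, and smoothing choice are illustrative.

```python
# Sketch: one phoneme bigram LM per speaking style; classify by likelihood.
import math
from collections import defaultdict
from typing import Dict, List

class PhonemeBigramLM:
    def __init__(self, vocab_size: int):
        self.vocab_size = vocab_size                       # size of the phoneme inventory
        self.bigram = defaultdict(lambda: defaultdict(int))
        self.context = defaultdict(int)

    def train(self, sequences: List[List[str]]) -> None:
        for seq in sequences:
            for prev, cur in zip(["<s>"] + seq, seq + ["</s>"]):
                self.bigram[prev][cur] += 1
                self.context[prev] += 1

    def log_likelihood(self, seq: List[str]) -> float:
        ll = 0.0
        for prev, cur in zip(["<s>"] + seq, seq + ["</s>"]):
            # Add-one smoothing keeps unseen bigrams from zeroing the score.
            ll += math.log((self.bigram[prev][cur] + 1)
                           / (self.context[prev] + self.vocab_size))
        return ll

def classify(seq: List[str], models: Dict[str, PhonemeBigramLM]) -> str:
    """Return the speaking style whose phoneme LM scores the sequence highest."""
    return max(models, key=lambda style: models[style].log_likelihood(seq))
```

A real system would use the recognizer's phoneme inventory as the vocabulary, apply per-user normalization, and fuse these scores with the duration and spectral HMM classifiers at the decision level, as the abstract describes.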
This paper develops the MUFIN technique for extreme classification (XC) tasks with millions of labels, where datapoints and labels are endowed with visual and textual descriptors. Applications of MUFIN to product-to-product recommendation and bid-query prediction over several million products are presented. Contemporary multi-modal methods frequently rely on purely embedding-based approaches. On the other hand, XC methods utilize classifier architectures to offer accuracies superior to embedding-only methods but mostly focus on text-based categorization tasks. MUFIN bridges this gap by reformulating multi-modal categorization as an XC problem with several million labels. This presents the twin challenges of developing multi-modal architectures that can offer embeddings sufficiently expressive to allow accurate categorization over millions of labels, and training and inference routines that scale logarithmically in the number of labels. MUFIN develops an architecture based on cross-modal attention and trains it in a modular fashion using pre-training and positive and negative mining. A novel product-to-product recommendation dataset, MM-AmazonTitles-300K, containing over 300K products was curated from publicly available amazon.com listings, with each product endowed with a title and multiple images. On all datasets, MUFIN offered at least 3% higher accuracy than leading text-based, image-based, and multi-modal techniques. Code for MUFIN is available at https://github.com/Extreme-classification/MUFIN
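The following sketch, which assumes a standard PyTorch multi-head attention layer rather than MUFIN's released architecture, shows one way a text embedding can attend over image-patch embeddings to produce a fused datapoint representation that is then scored against a shortlist of label embeddings. All dimensions, module names, and the shortlisting step are illustrative assumptions; see the linked repository for the actual implementation.

```python
# Illustrative cross-modal attention fusion for an XC-style scoring step.
import torch
import torch.nn as nn

class CrossModalEncoder(nn.Module):
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_emb: torch.Tensor, img_patches: torch.Tensor) -> torch.Tensor:
        # text_emb: (batch, dim); img_patches: (batch, n_patches, dim)
        query = text_emb.unsqueeze(1)                   # the text attends to the image patches
        fused, _ = self.attn(query, img_patches, img_patches)
        return self.norm(fused.squeeze(1) + text_emb)   # residual fusion

encoder = CrossModalEncoder()
text = torch.randn(4, 512)             # e.g. product-title embeddings
patches = torch.randn(4, 49, 512)      # e.g. image-patch embeddings
points = encoder(text, patches)        # (4, 512) fused datapoint representations
labels = torch.randn(1000, 512)        # embeddings of a shortlist of candidate labels
scores = points @ labels.T             # (4, 1000) relevance scores
```

Scoring only a shortlist of candidate labels (retrieved, for example, by approximate nearest-neighbour search over label embeddings) is what keeps inference cost sub-linear in the total number of labels, in the spirit of the logarithmic-scaling requirement above.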
The human voice encodes a wealth of information about emotion, mood, stress, and mental state. With mobile phones (one of the most widely used modules in body area networks), this information is potentially available to a host of applications and can enable richer, more appropriate, and more satisfying human-machine interaction.