Shigeo Morishima

Waseda University

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Trends

Author Order

Document Type

Co-Authors

Akinobu Maejima

Western Digital (Japan)

Tatsuo Yotsukura

Western Digital (Japan)

Hiroyuki Kubo

Chiba University

Tsukasa Fukusato

The University of Tokyo

Hiroshi Harashima

The University of Tokyo

Satoshi Nakamura

Nara Institute of Science and Technology

Tatsunori Hirai

Komazawa University

Hubert P. H. Shum

Durham University

Takuya Kato

National Cancer Institute

Tatsuya Yatagawa

Hitotsubashi University

Cooperative Institutions

Waseda University

131

The University of Tokyo

National Institute of Advanced Industrial Science and Technology

Osaka University

Kyoto University

Seikei University

University of Tsukuba

Nagoya University

Tohoku University

Tokyo Institute of Technology

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Field

Detection of Driver's Drowsy Facial Expression

Taro Nakamura Akinobu Maejima Shigeo Morishima

We propose a method for the estimation of the degree of a driver's drowsiness on basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by falling of eyelids. Therefore, most of the related studies have focused on tracking eyelid movement by monitoring facial feature points. However, textural changes that arise from frowning are also very important and sensitive features in the initial stage of drowsiness, and it is difficult to detect such changes solely using facial feature points. In this paper, we propose a more precise drowsiness-degree estimation method considering wrinkles change by calculating local edge intensity on faces that expresses drowsiness more directly in the initial stage.

Feature (linguistics)

Degree (music)

10.1109/acpr.2013.176

Cite

Citations (17)

Facial Aging Simulator by Data-Driven Component-Based Texture Cloning

Lecture notes in computer science (2014)

Daiki Kuwahara Akinobu Maejima Shigeo Morishima

Component (thermodynamics)

Texture (cosmology)

Similarity (geometry)

10.1007/978-3-319-14442-9_32

Cite

Citations (2)

Speech coding based on a multi-layer neural network

Shigeo Morishima Hiroshi Harashima Yoshiaki Katayama

The authors present a speech-compression scheme based on a three-layer perceptron in which the number of units in the hidden layer is reduced. Input and output layers have the same number of units in order to achieve identity mapping. Speech coding is realized by scalar or vector quantization of hidden-layer outputs. By analyzing the weighting coefficients, it can be shown that speech coding based on a three-layer neural network is speaker-independent. Transform coding is automatically based on back propagation. The relation between compression ratio and SNR (signal-to-noise ratio) is investigated. The bit allocation and optimum number of hidden-layer units necessary to realize a specific bit rate are given. According to the analysis of weighting coefficients, speech coding based on a neural network is transform coding similar to Karhunen-Loeve transformation. The characteristics of a five-layer neural network are examined. It is shown that since the five-layer neural network can realize nonlinear mapping, it is more effective than the three-layer network.< >

Perceptron

10.1109/icc.1990.117117

Cite

Citations (11)

Multi-layer Lattice Model for Real-Time Dynamic Character Deformation

Computer Graphics Forum (2015)

Naoya Iwamoto Hubert P. H. Shum Longzhi Yang Shigeo Morishima

Due to the recent advancement of computer graphics hardware and software algorithms, deformable characters have become more and more popular in real-time applications such as computer games. While there are mature techniques to generate primary deformation from skeletal movement, simulating realistic and stable secondary deformation such as jiggling of fats remains challenging. On one hand, traditional volumetric approaches such as the finite element method require higher computational cost and are infeasible for limited hardware such as game consoles. On the other hand, while shape matching based simulations can produce plausible deformation in real-time, they suffer from a stiffness problem in which particles either show unrealistic deformation due to high gains, or cannot catch up with the body movement. In this paper, we propose a unified multi-layer lattice model to simulate the primary and secondary deformation of skeleton-driven characters. The core idea is to voxelize the input character mesh into multiple anatomical layers including the bone, muscle, fat and skin. Primary deformation is applied on the bone voxels with lattice-based skinning. The movement of these voxels is propagated to other voxel layers using lattice shape matching simulation, creating a natural secondary deformation. Our multi-layer lattice framework can produce simulation quality comparable to those from other volumetric approaches with a significantly smaller computational cost. It is best to be applied in real-time applications such as console games or interactive animation creation.

Skinning

Character Animation

Lattice (music)

10.1111/cgf.12749

Cite

Citations (17)

A study of relationship between speaker identification and acoustic features using perceptual similarity of imitated voice (音声)

Scientific Programming (2010)

Mari Tanaka Hideki Kawahara Shigeo Morishima

Speaker identification

Similarity (geometry)

Identification

Speaker diarisation

Source

Cite

Citations (0)

A skinning technique considering the shape of human skeletons

Hirofumi Suda Kentaro Yamanaka Shigeo Morishima

We propose a skinning technique to improve expressive power of Skeleton Subspace Deformation (SSD) by adding the influence of the shape of skeletons to the deformation result by post-processing.

Skinning

Skeleton (computer programming)

10.1145/1836845.1836850

Cite

Citations (0)

Life-Like Characters. Tools, Affective Functions, and Applications

Springer eBooks (2003)

Shin ichi Kawamoto Hiroshi Shimodaira Tsuneo Nitta Takuya Nishimoto Satoshi Nakamura

Source

Cite

Citations (2)

Scapegoat Generation for Privacy Protection from Deepfake

arXiv (Cornell University) (2023)

Gido Kato Yoshihiro Fukuhara Mariko Isogawa Hideki Tsunashima Hirokatsu Kataoka

To protect privacy and prevent malicious use of deepfake, current studies propose methods that interfere with the generation process, such as detection and destruction approaches. However, these methods suffer from sub-optimal generalization performance to unseen models and add undesirable noise to the original image. To address these problems, we propose a new problem formulation for deepfake prevention: generating a ``scapegoat image'' by modifying the style of the original input in a way that is recognizable as an avatar by the user, but impossible to reconstruct the real face. Even in the case of malicious deepfake, the privacy of the users is still protected. To achieve this, we introduce an optimization-based editing method that utilizes GAN inversion to discourage deepfake models from generating similar scapegoats. We validate the effectiveness of our proposed method through quantitative and user studies.

Scapegoat

Privacy Protection

10.48550/arxiv.2303.02930

Cite

Citations (0)

Audio-Oriented Video Interpolation Using Key Pose

International Journal of Pattern Recognition and Artificial Intelligence (2021)

Takayuki Nakatsuka Yukitaka Tsuchiya Masatoshi Hamanaka Shigeo Morishima

This paper describes a deep learning-based method for long-term video interpolation that generates intermediate frames between two music performance videos of a person playing a specific instrument. Recent advances in deep learning techniques have successfully generated realistic images with high-fidelity and high-resolution in short-term video interpolation. However, there is still room for improvement in long-term video interpolation due to lack of resolution and temporal consistency of the generated video. Particularly in music performance videos, the music and human performance motion need to be synchronized. We solved these problems by using human poses and music features essential for music performance in long-term video interpolation. By closely matching human poses with music and videos, it is possible to generate intermediate frames that synchronize with the music. Specifically, we obtain the human poses of the last frame of the first video and the first frame of the second video in the performance videos to be interpolated as key poses. Then, our encoder–decoder network estimates the human poses in the intermediate frames from the obtained key poses, with the music features as the condition. In order to construct an end-to-end network, we utilize a differentiable network that transforms the estimated human poses in vector form into the human pose in image form, such as human stick figures. Finally, a video-to-video synthesis network uses the stick figures to generate intermediate frames between two music performance videos. We found that the generated performance videos were of higher quality than the baseline method through quantitative experiments.

Key frame

Interpolation

Motion interpolation

View synthesis

10.1142/s0218001421600168

Cite

Citations (2)

Acquiring shell textures from a single image for realistic fur rendering

Hiroaki Ukaji Takahiro Kosaka Tomohito Hattori Hiroyuki Kubo Shigeo Morishima

To synthesize a realistic appearance of mammals, it is necessary to express disorderly lie of hairs. "Shell texturing method", proposed by Lengyel [2001], is possible to synthesize realistic fur appearance over arbitrary surfaces in real-time. Prior to rendering, it is necessary to prepare several shell textures as a pre-process. However, acquiring appropriate shell textures is a complicated and time consuming work. In this paper, we present a novel method to acquire shell textures only from a single input picture of an actual animal fur. Since every shell textures are automatically computed by a pixel shader in run-time, it is not necessary any complicated pre-computation. Furthermore, conventional shell texturing method employs typically 16 textures which require huge graphics memory. Because our method requires only a single texture, we realize a significant reduction in memory usage for practical purpose.

Shader

Texture memory

Texture Synthesis

10.1145/2342896.2343013

Cite

Citations (0)