Smart sensors, such as smart meters or smartphones, are now ubiquitous. To be "smart", however, they need to process their input data with limited storage and computational resources. In this paper, we convert the stream of sensor data into a stream of symbols and, further, into higher-level symbols, in such a way that common analytical tasks such as anomaly detection, forecasting, or state recognition can still be carried out on the transformed data with almost no loss of accuracy while using far fewer resources. We identify states of a monitored system and convert them into symbols (thus reducing data size), while keeping "interesting" events, such as anomalies or transitions between states, as they are. Our algorithm is able to find states of various lengths in an online and unsupervised way, which is crucial since the behavior of the system is not known beforehand. We show the effectiveness of our approach using real-world datasets and various application scenarios.
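As a rough illustration of the stream-to-symbol idea, the sketch below assigns each reading to the nearest running centroid and then collapses runs of identical symbols into (symbol, length) pairs. This is a minimal sketch under stated assumptions, not the algorithm of the paper: the threshold-based clustering, the `radius` parameter, and the run-length step standing in for "higher-level symbols" are all hypothetical choices made for the example.

```python
from itertools import groupby


def symbolize(stream, radius=1.0):
    """Map each reading to a letter; open a new symbol when no centroid is within `radius`."""
    centroids = []  # running mean of each cluster (assumed clustering rule)
    counts = []     # number of readings absorbed by each cluster
    symbols = []
    for x in stream:
        best = None
        for i, c in enumerate(centroids):
            if abs(x - c) <= radius and (
                best is None or abs(x - c) < abs(x - centroids[best])
            ):
                best = i
        if best is None:
            # No existing state is close enough: create a new one online.
            centroids.append(float(x))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Update the running mean of the matched state.
            counts[best] += 1
            centroids[best] += (x - centroids[best]) / counts[best]
        symbols.append(chr(ord("a") + best))
    return symbols


def run_length(symbols):
    """Collapse consecutive identical symbols into (symbol, run length) pairs."""
    return [(s, len(list(group))) for s, group in groupby(symbols)]


readings = [1.0, 1.1, 0.9, 5.0, 5.2, 5.1, 1.0]
first_level = symbolize(readings, radius=1.0)
second_level = run_length(first_level)
```

Here seven readings reduce to three pairs, and the positions where the symbol changes mark the state transitions that such a scheme would keep.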
This paper studies mobile sensing in a completely distributed and opportunistic scheme. We present a novel sensing strategy for sensing nodes without movement constraints. This strategy offers information sharing and sensor scheduling that maximize the benefits of collaboration between the nodes. We evaluate the strategy with two real-world datasets, in which mobile devices sense their surroundings and share the information with nearby devices via Bluetooth. We show that sharing the results of energy-hungry sensing (e.g., GPS) among nearby devices can save a substantial amount of energy. Moreover, we formulate the offline measurement scheduling as a non-linear optimization problem and compare the offline optimal results with our online scheduling performance.
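The intuition behind the energy saving can be made concrete with a small back-of-the-envelope model. The sketch below is an assumption for illustration only, not the strategy evaluated in the paper: devices take turns in a simple round-robin, one device samples the expensive sensor per round, and every device pays a fixed cost for the exchange. The function names and the energy values `e_sense` and `e_share` are hypothetical and unitless.

```python
def sensing_device(round_index, devices):
    """Round-robin duty: the device whose turn it is to sample the shared sensor."""
    return devices[round_index % len(devices)]


def energy_per_device(n_devices, rounds, e_sense, e_share):
    """Average energy per device when one device senses per round and all exchange the result."""
    # Per round: one sensing cost plus one sharing cost for every participant.
    total = rounds * (e_sense + n_devices * e_share)
    return total / n_devices


# Hypothetical values: sensing costs 10 units, one Bluetooth exchange costs 0.5 units.
alone = energy_per_device(1, 100, e_sense=10.0, e_share=0.0)
shared = energy_per_device(4, 100, e_sense=10.0, e_share=0.5)
```

With these made-up numbers, four cooperating devices spend 300 units each over 100 rounds, against 1000 units for a device sensing alone. The saving shrinks as the sharing cost approaches the sensing cost.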
Smart meter data analytics has recently received enormous attention because it allows utility companies to analyze customer consumption behavior in real time. However, the amount of data generated by these sensors is very large, so analytics performed on top of it become very expensive. Furthermore, smart meter data contains very detailed energy consumption measurements, which can lead to breaches of customer privacy and all the risks associated with them. In this work, we address the problem of how to reduce the numerosity and level of detail of smart meter data while maintaining the accuracy of analytics. We convert the data into a symbolic representation on which various machine learning algorithms can be run. In addition, our symbolic representation has the further advantage of allowing algorithms that usually work on nominal and string data to be run on smart meter data. We provide experiments on classification and forecasting tasks using real-world data. Finally, we illustrate several directions for extending our work.
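One way to picture such a symbolic representation is window averaging followed by binning, sketched below. This is an illustrative assumption rather than the encoding used in the paper: the window length, the bin `edges`, and the alphabet are hypothetical parameters chosen for the example.

```python
from bisect import bisect_right


def to_symbols(readings, window, edges, alphabet="abcdefgh"):
    """Average each full window of readings, then map the mean to a letter via bin edges."""
    if len(edges) >= len(alphabet):
        raise ValueError("need more letters than bin edges")
    symbols = []
    for start in range(0, len(readings) - window + 1, window):
        chunk = readings[start:start + window]
        mean = sum(chunk) / window  # fewer values: one per window
        # bisect_right returns how many edges are <= mean, i.e. the bin index.
        symbols.append(alphabet[bisect_right(edges, mean)])
    return "".join(symbols)


# Hypothetical consumption readings (kW) and bin edges.
readings = [0.2, 0.3, 0.2, 0.3, 1.5, 1.7, 1.6, 1.4, 3.2, 3.0, 3.1, 3.3]
word = to_symbols(readings, window=4, edges=[1.0, 2.5])
```

Twelve readings become a three-letter string, which hides the exact values and can be fed to algorithms that expect nominal or string input.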
This paper presents an overview of the Mobile Data Challenge (MDC), a large-scale research initiative aimed at generating innovations around smartphone-based research, as well as community-based evaluation of related mobile data analysis methodologies. First, we review the Lausanne Data Collection Campaign (LDCC), an initiative to collect a unique, longitudinal smartphone data set that serves as the basis of the MDC. Then, we introduce the Open and Dedicated Tracks of the MDC, describe the specific data sets used in each of them, and discuss some of the key aspects of generating privacy-respecting, challenging, and scientifically relevant mobile data resources for wider use by the research community. The concluding remarks summarize the paper.
With the ever-increasing adoption of smartphones worldwide, researchers have found an ideal sensor platform on which to perform context-based research and to prepare context-based services for deployment to end users. However, continuous context sensing poses a considerable challenge in balancing the energy consumption of the sensors, the accuracy of the recognized context, and its latency. After outlining the common characteristics of continuous sensing systems, we present a detailed overview of the state of the art, from sensor sub-systems to context inference algorithms. Then, we present the three main contributions of this thesis. The first approach is based on the use of local communications to exchange sensing information with neighboring devices. As proximity, location, and environmental information can be obtained from nearby smartphones, we design a protocol for synchronizing the exchanges and fairly distributing the sensing tasks. We show both theoretically and experimentally the reduction in energy needed when the devices can collaborate. The second approach focuses on how to schedule mobile sensors, optimizing for both accuracy and energy needs. We formulate the optimal sensing problem as a decision problem and propose a two-tier framework for approximating its solution. The first tier is responsible for segmenting the sensor measurement time series by fitting various models. The second tier estimates the optimal sampling, selecting the measurements that contribute the most to the model accuracy. We provide near-optimal heuristics for both tiers and evaluate their performance using environmental sensor data. In the third approach, we propose an online algorithm that identifies repeated patterns in time series and produces a compressed symbolic stream. The first symbolic transformation is based on clustering the raw sensor data, whereas the subsequent iterations encode repetitive sequences of symbols into new symbols. We also define a metric to evaluate symbolization methods with regard to their capacity to preserve the systems' states. We further show that the output symbols can be used directly for various data mining tasks, such as classification or forecasting, with little impact on accuracy but with greatly reduced complexity and running time. In addition, we present an example application, assessing the user's exposure to air pollutants, which demonstrates the many opportunities to enhance contextual information by fusing sensor data from different sources. On the one hand, we gather fine-grained air quality information from mobile sensor deployments and aggregate it with an interpolation model. On the other hand, we continuously capture the user's context, including location, activity, and surrounding air quality. We also present the various models used for fusing all this information in order to produce the exposure estimate.
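The re-encoding of repeated symbol sequences into new symbols, mentioned in the third approach above, can be pictured with a single pair-replacement pass. The sketch below is a hypothetical stand-in, not the thesis algorithm: it works offline on a finished list and replaces only the most frequent adjacent pair, in the style of byte-pair encoding, and the function name and the new symbol "A" are invented for the example.

```python
from collections import Counter


def merge_most_frequent_pair(symbols, new_symbol):
    """One iteration: replace the most frequent adjacent pair with `new_symbol`."""
    # Count adjacent pairs (overlapping occurrences are counted too).
    pairs = Counter(zip(symbols, symbols[1:]))
    if not pairs:
        return list(symbols), None
    pair, count = pairs.most_common(1)[0]
    if count < 2:
        return list(symbols), None  # no pair repeats, nothing to encode
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)  # the pair becomes one higher-level symbol
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out, pair


encoded, rule = merge_most_frequent_pair(list("abababc"), "A")
```

Applying the function repeatedly, with a fresh symbol each time, yields progressively shorter streams together with a dictionary of rules that allows the original sequence to be rebuilt.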
We present Mobile Observatory, a mobile air quality monitoring application that evaluates air quality in the city of Zurich, Switzerland. Because Mobile Observatory uses air quality data gathered by sensors mounted on around 10 trams in Zurich, it is able to provide neighborhood-level air quality information within the city. We also describe a user study with 10 participants and present our preliminary results, in the hope of yielding insights toward improving civic and urban engagement on air quality.
With an increasing number of rich embedded sensors, such as accelerometers and GPS, the smartphone has become a pervasive people-centric sensing platform for inferring users' daily activities and social contexts. Alternatively, wireless sensor networks offer a comprehensive platform for capturing surrounding environmental information using mobile sensing nodes; for example, the OpenSense project [2] in Switzerland deploys air quality sensors (e.g., for CO) on public transport such as buses and trams. The two sensing platforms are typically isolated from each other. In this paper, we build ExposureSense, a rich mobile participatory sensing infrastructure that integrates the two independent sensing paradigms. ExposureSense is able to monitor people's daily activities as well as to compute a reasonable estimate of pollution exposure in their daily life. Besides using external sensor networks, ExposureSense also supports pluggable sensors (e.g., O3) to further enrich air quality data using mobile participatory sensing with smartphones.
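To make the notion of an exposure estimate concrete, the sketch below combines pollutant readings from fixed or mobile sensors with a user's location track. It is a simplified assumption and not the ExposureSense model: inverse-distance weighting is used here only as a familiar interpolation scheme, coordinates are treated as planar, and the station values and the track are invented.

```python
import math


def idw(point, stations, power=2):
    """Inverse-distance-weighted concentration at `point` from (x, y, value) stations."""
    num = den = 0.0
    for x, y, value in stations:
        d = math.hypot(point[0] - x, point[1] - y)
        if d == 0:
            return value  # standing exactly at a sensor: use its reading
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def exposure(track, stations):
    """Sum of concentration x duration over a list of (x, y, minutes) stays."""
    return sum(idw((x, y), stations) * minutes for x, y, minutes in track)


# Hypothetical sensors (x, y, concentration) and a user track (x, y, minutes).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
track = [(0.0, 0.0, 30), (1.0, 0.0, 60)]
total = exposure(track, stations)
```

A fuller model would also weight each stay by the detected activity, since activity recognition is one of the inputs the abstract mentions.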
Air pollution is one of the key indicators of quality of life in urban environments, and is also a subject of global health concern, given the number of fatal diseases associated with exposure to pollutants. Assessing and monitoring air quality is an important step toward better understanding the impact of pollution on the health of the population. However, traditional high-quality stationary sensing stations are not sufficient to scale to the city level. Their limitations include lack of coverage, the cost of deployment and maintenance, and the resolution of the observed phenomena. The OpenSense2 project aims at providing a city-level sensing deployment that combines different levels of air quality sensing: reference stations, mobile sensing on public transportation, and participatory crowdsensing. In this paper, we highlight some of the key challenges of managing the data captured by such an infrastructure, taking the city of Lausanne as a driving use case. Furthermore, we present a semantics-based approach for characterizing and exposing the air quality data, so that it can be made available to citizens and application developers in a form that is usable and easily understood.
Both sensor coverage maximization and energy cost minimization are fundamental requirements in the design of real-life mobile sensing applications, e.g., (1) deploying environmental sensors (such as CO2 or fine particle sensors) on public transport to monitor air pollution, and (2) analyzing smartphone embedded sensors (such as GPS and accelerometers) to recognize people's daily activities. However, sensor coverage and energy cost conflict with each other: the higher the sensing frequency, the more energy is used, and vice versa. In this paper, we design a novel two-step mobile sensing process ("OptiMoS") to achieve optimal mobile sensing that effectively balances sensor coverage and energy cost. In the first step, OptiMoS divides the continuous mobile sensor readings into several segments, such that readings within one segment are more highly correlated than readings from different segments. In the second step, OptiMoS identifies an optimal sampling of the sensor readings in each segment, such that the selected readings guarantee reasonably high sensor coverage at a limited sampling rate. Various greedy and near-optimal segmentation and sampling methods are designed in OptiMoS and are evaluated using real-life environmental data from mobile sensors.
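The two-step structure described above can be sketched in a few lines. The code below is a toy illustration under stated assumptions, not the OptiMoS methods themselves: segmentation uses a greedy piecewise-constant rule with a hypothetical tolerance `tol`, and sampling keeps a single representative reading per segment, both chosen for brevity.

```python
def segment(readings, tol):
    """Step 1 (greedy): extend a segment while max - min stays within `tol`."""
    if not readings:
        return []
    segments, start = [], 0
    lo = hi = readings[0]
    for i in range(1, len(readings)):
        new_lo, new_hi = min(lo, readings[i]), max(hi, readings[i])
        if new_hi - new_lo > tol:
            segments.append((start, i))  # half-open interval [start, i)
            start, lo, hi = i, readings[i], readings[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(readings)))
    return segments


def sample(readings, segments):
    """Step 2: per segment, keep the index of the reading closest to the segment mean."""
    picks = []
    for a, b in segments:
        mean = sum(readings[a:b]) / (b - a)
        picks.append(min(range(a, b), key=lambda i: abs(readings[i] - mean)))
    return picks


# Hypothetical CO2 readings (ppm) from a moving sensor.
readings = [400, 402, 401, 450, 452, 451, 449, 410]
parts = segment(readings, tol=5)
kept = sample(readings, parts)
```

In this example eight readings are split into three segments and three readings are kept. A looser `tol` yields fewer segments and therefore fewer samples, which is the coverage versus energy trade-off in miniature.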