Trend Prediction and Distributed Pattern Matching for M2M Data

PI: Prof. Shou-De Lin

Champion: Dr. Phillip B. Gibbons



Sensor networks can provide rich contextual information about an environment, which can be used to detect or predict events that inform decisions in a given system. In this work, the emphasis is on learning from data, a clear departure from past systems, which relied heavily on domain expert knowledge. The growth in the number and diversity of sensors (and thus data sources) makes this approach possible and has opened the door to new applications, but it has also introduced many fresh research challenges. Our group aims to develop a general model for Heterogeneous Sensor Networks (HSNs) that addresses these challenges.



Event Detection/Prediction

We are designing a general event prediction model for heterogeneous sensor networks. This model is capable of predicting the occurrence of an event as well as its location in space and time. For example, consider a traffic application. Given sensor data related to road conditions, we can predict where and when a traffic jam may occur, as well as the probability that it occurs.

Sensor Fault Detection

In a large-scale deployment of sensors, it is inevitable that there will be point failures and aberrations within the network. To address this issue, we are designing a framework for HSN-based anomaly detection. Using this framework, we can predict quantitatively which sensors have failed or malfunctioned.



Uniform Representation

In an HSN environment, sensor type and data format (including meta-data) may be non-uniform throughout the network. How can these disparate data sources be efficiently and effectively normalized such that they can be used to facilitate event prediction?
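
As a rough illustration of what normalization could look like, the sketch below maps raw records with differing units onto one common record type. It is not the project's method: the `Reading` type, the `TO_CANONICAL` table, and the raw field names (`id`, `ts`, `quantity`, `unit`, `value`) are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical uniform record shared by every sensor type."""
    sensor_id: str
    timestamp: float  # seconds since the Unix epoch
    quantity: str     # e.g. "temperature" or "speed"
    value: float      # expressed in the canonical unit for the quantity


# Hypothetical conversion table: (quantity, source unit) -> canonical unit.
# Canonical units assumed here: degrees Celsius and km/h.
TO_CANONICAL = {
    ("temperature", "C"): lambda v: v,
    ("temperature", "F"): lambda v: (v - 32.0) * 5.0 / 9.0,
    ("speed", "km/h"): lambda v: v,
    ("speed", "mph"): lambda v: v * 1.609344,
}


def normalize(raw: dict) -> Reading:
    """Convert one raw sensor record into the uniform representation."""
    key = (raw["quantity"], raw["unit"])
    if key not in TO_CANONICAL:
        # Fail loudly rather than guess a conversion for an unknown unit.
        raise ValueError(f"no conversion registered for {key}")
    return Reading(
        sensor_id=str(raw["id"]),
        timestamp=float(raw["ts"]),
        quantity=raw["quantity"],
        value=TO_CANONICAL[key](float(raw["value"])),
    )
```

A table-driven design keeps the per-sensor differences in data rather than in code, so supporting a new unit only requires a new table entry. Real deployments would also need to reconcile clocks, sampling rates, and meta-data, which this sketch ignores.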

Capturing Dependencies

In data mining of HSNs, observations cannot be assumed to be independent and identically distributed (i.i.d.). On the contrary, dependencies almost certainly exist both among the observations of a single sensor and between the observations of different sensors. How can these dependencies best be leveraged to identify malfunctioning sensors?
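
One very simple way to exploit cross-sensor dependency, offered only as an illustrative baseline and not as the project's approach, is to compare each sensor against its peers. The sketch below assumes a group of sensors that measure the same quantity at roughly the same place and time, and flags any sensor whose reading lies far from the group median, using the median absolute deviation (MAD) as a robust scale. The function name `suspect_sensors` and the default threshold of 3.5 are assumptions made for this example.

```python
from statistics import median


def suspect_sensors(readings, threshold=3.5):
    """Return the ids of sensors whose reading deviates strongly from peers.

    `readings` maps sensor id -> value for sensors assumed to observe the
    same quantity at (roughly) the same place and time.
    """
    values = list(readings.values())
    med = median(values)
    # Median absolute deviation: a scale estimate that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the sensors agree exactly; flag any that differ.
        return [s for s, v in readings.items() if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under a normality assumption.
    return [
        s for s, v in readings.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]


snapshot = {"a": 20.1, "b": 19.8, "c": 20.3, "d": 35.0, "e": 20.0}
print(suspect_sensors(snapshot))  # sensor "d" stands out from its peers
```

This baseline ignores temporal dependency within a sensor and cannot tell a faulty sensor from a genuine local event, which is part of why the question above remains open.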


The large number of sensors involved in a given system, combined with the frequency at which observations are taken, yields vast (and perhaps distributed) datasets which, in turn, incur ever-increasing computational cost. How should one best address these issues?

Uncertainty and Noise

Data in HSNs is especially subject to noise and uncertainty. Does a given observation accurately reflect the measurand? With what certainty can we answer the previous question? How should missing data be handled?
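
To make the missing-data question concrete, the sketch below shows one common baseline: linear interpolation across gaps in a regularly sampled series. It is an assumption-laden example rather than a recommended answer. It assumes evenly spaced samples, marks missing values with `None`, and deliberately leaves leading and trailing gaps unfilled because there is no second neighbour to interpolate from. The name `fill_gaps` is hypothetical.

```python
def fill_gaps(series):
    """Linearly interpolate None entries that lie between two known values.

    Assumes evenly spaced samples. Gaps at the start or end of the series
    are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    # Walk over consecutive pairs of known samples and fill what lies between.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


print(fill_gaps([1.0, None, None, 4.0, None]))
# [1.0, 2.0, 3.0, 4.0, None]
```

Interpolated values carry no record of their own uncertainty, so a fuller treatment would keep track of which samples were observed and which were imputed.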

Domain Knowledge

As HSNs are often deployed with some events or scenarios in mind, domain knowledge has in the past played an important role in the detection or prediction of events in the system. When designing a general model for event prediction, however, it is important that the methods not be tied to one specific application, but rather be extensible to a wide range of scenarios while still able to incorporate domain knowledge as needed.