On Big Data Management In Internet Of Things
ABSTRACT
The Internet of Things (IoT) has generated a large amount of research interest across a wide
variety of technical areas. These include the physical devices themselves, communications
among them, and relationships between them. One of the effects of ubiquitous sensors
networked together into large ecosystems has been an enormous flow of data supporting a wide
variety of applications. In this work, we propose a new “IntelliFog-Cloud” approach to IoT Big
Data Management by leveraging mined historical intelligence from a Big Data platform and
combining it with real-time actionable events from IoT devices at the Fog layer to reduce action
latency in IoT applications. This approach is demonstrated through an advertisement service
simulation built with VoltDB, in which advertisements are served on mobile phones
based on geo-location and highest bids, and selected according to user interests determined by data
analytics of activities on the web. Results from the demonstration show very low latency
overhead when processing hundreds of thousands of transactions. This approach improves
both action latency and accuracy of real-time decisions in IoT applications.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENT
DEDICATION
LIST OF FIGURES
CHAPTER ONE
INTRODUCTION
1.0 Introduction
1.1 Research Question
1.2 Objective of the Research
1.3 Implication of Research
1.4 Scope of work
1.5 Organization
CHAPTER TWO
LITERATURE REVIEW
2.0 Internet of Things (IoT)
2.0.1 Why Internet of Things?
2.0.2 Applications of IoT
2.0.3 Challenges of IoT
2.1 Big Data
2.1.1 Big Data Management
2.1.2 Big Data and Internet of Things
2.2 Data Streams
2.2.1 Data Stream Processing
2.2.2 Stream Processing Models
2.3 Real-Time Data Stream Processing
2.3.1 Requirements of Real-time Data Stream Processing
2.4 Stream Processing Applications
2.4.1 Aurora
2.4.2 Borealis
2.4.3 Apache Storm
2.4.4 Apache S4
2.4.5 Apache Samza
2.4.6 VoltDB
2.5 Related Work
2.5.1 Towards Cloud-Based Big Data Analytics for Smart Future Cities
2.5.2 A Data-Centric Framework for Development and Deployment of Internet of Things Applications in Clouds
2.5.3 A CIM-Based Framework for Utility Big Data Analytics
2.5.4 Data Management for the Internet of Things: Design Primitives and Solution
2.5.5 An Architecture to Support the Collection of Big Data in the Internet of Things
2.5.6 Lambda Architecture
2.6 Chapter Summary
CHAPTER THREE
ANALYSIS
3.0 Latency
3.0.1 Types of Latency
3.1 Fog Computing
3.2 Multi-Tier Fog-Cloud Architecture
3.3 Analysis of the Existing/Traditional Approach
3.3.1 From Devices to Cloud
3.3.2 Existing Fog Approach
3.4 The Proposed Latency-Reducing Intelli-Fog Approach
3.4.1 The Intelli-Fog Layer
3.4.2 The Cloud Layer
3.5 Chapter Summary
CHAPTER FOUR
USE CASES AND IMPLEMENTATION
4.0 Introduction
4.1 Use Case Scenarios
4.1.1 Intelligent Patient Monitoring System
4.1.2 Smart Cities (Intelligent Traffic Light)
4.1.3 Geo-Location and User Interest Based Mobile Advertisement Display
4.2 Use Case Implementation (Intelli-Fog Advertisement Display Based on Location and User Interest)
4.2.1 Technology and Tools
4.2.2 The Developed Advertisement Display Simulation
4.4 Chapter Summary
CHAPTER FIVE
SUMMARY, CONCLUSION AND RECOMMENDATION
5.0 Summary
CHAPTER ONE
1.0 Introduction
Advances in sensor technology, communication capabilities and data analytics have resulted
in a new world of novel opportunities. With improved technology such as nanotechnology,
manufacturers can now make sensors which are not only small enough to fit into anything and
everything but also more intelligent. These sensors can now pass their sensing data effectively
and in real time due to improvements in communication protocols among devices. There are
also now emerging tools for processing these data. These phenomena combined have made
the Internet of Things (IoT) a topic of interest among researchers in recent years. Simply put,
the IoT is the ability of people’s “things” to connect with anything, anywhere and at any time
using any communication medium. “Things” here means connected devices of any form. It is
estimated that by 2020 there will be 50 to 100 billion devices connected to the internet [2].
These devices will generate an enormous amount of highly heterogeneous data. These data,
due to their size, the rate at which they are generated and their heterogeneity, are referred to as “Big
Data”. Big Data is commonly defined by three characteristics known as the 3Vs:
volume, variety, and velocity, or sometimes by the 5Vs, which add value and veracity [3], [12]. These
data, if well managed, can give us invaluable insights into the behaviour of people and “things”,
insights that can have a wide range of applications.
The potential of incorporating insights from IoT data into aspects of our daily lives is
becoming a reality at a very fast rate. Acceptance and trust are also growing, as people have
expressed willingness to apply the results of IoT data analytics in situations as delicate as
stock market trading [1]. These developments inform the need for efficient approaches to
manage and make use of these huge and fast-moving data streams. Distributed processing
frameworks such as Hadoop have been developed to manage large data sets, but not data streams.
One major limitation of distributed settings such as Hadoop is latency. They are still based on
the traditional Store-Process-and-Forward approach, which makes them unsuitable for real-time
processing, in contrast with the real-time demands of current and emerging application areas
[4]. Store-and-forward also cannot satisfy the latency requirements of IoT data
because of the velocity and the unstructured nature of the data. Stream processing frameworks
such as Apache Storm and Samza were introduced to solve this problem. In stream processing,
data from data sources are processed continuously as they arrive and do not need to be stored
first. This improves latency, especially in stateless stream processing, which processes each data item as
it arrives without reference to the current state of the system.
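The distinction between stateless and stateful stream processing can be illustrated with a minimal Python sketch. It is not taken from any of the frameworks named above; the source `sensor_stream`, the field names and the thresholds are all hypothetical, chosen only to show that a stateless operator judges each reading on its own while a stateful operator must remember something about earlier readings.

```python
from typing import Iterator, Optional


def sensor_stream() -> Iterator[dict]:
    """Hypothetical source: yields readings one at a time, as a device would."""
    readings = [21.5, 22.0, 35.2, 34.9, 36.8]
    for seq, temp in enumerate(readings):
        yield {"device": "sensor-1", "seq": seq, "temp_c": temp}


def stateless_alerts(stream: Iterator[dict], threshold: float = 30.0) -> Iterator[dict]:
    """Stateless operator: each reading is judged alone, nothing is remembered."""
    for reading in stream:
        if reading["temp_c"] > threshold:
            yield reading


def stateful_alerts(stream: Iterator[dict], jump: float = 10.0) -> Iterator[dict]:
    """Stateful operator: keeps the previous value to detect a sudden rise."""
    previous: Optional[float] = None
    for reading in stream:
        if previous is not None and reading["temp_c"] - previous > jump:
            yield reading
        previous = reading["temp_c"]  # state carried to the next reading


if __name__ == "__main__":
    # Readings are processed as they arrive; nothing is stored first.
    print([r["seq"] for r in stateless_alerts(sensor_stream())])
    print([r["seq"] for r in stateful_alerts(sensor_stream())])
```

In this sketch the stateless operator flags every reading above the threshold, whereas the stateful one flags only the reading at which the temperature jumps. The stateless form needs no lookup of earlier data, which is why it tends to add the least latency.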
Stream processing frameworks, however, are general-purpose tools for processing data streams and are
not tailored to the specific needs of IoT data management systems. IoT applications typically
have strict latency requirements. IoT applications also involve a great deal of Machine-to-Machine
(M2M) communication. The latency requirements of emerging IoT applications
therefore require a new approach that reduces latency to a minimum and makes fast and
efficient use of “things” data.
1.1 Research Question
Can we develop a generic Big Data management approach to reduce the latency of intelligent
reaction to actionable events in IoT applications?
1.2 Objective of the Research
The aim of this work is to propose a generic, efficient, scalable and robust
approach to Big Data management in IoT which extracts real-time value from data,
and to demonstrate its operation in an application area. Using existing and emerging computing
paradigms, we seek to develop an approach that significantly reduces latency in streaming data
from a network of connected devices and thus captures events that trigger actions in real time.
1.3 Implication of Research
This work seeks to propose a general latency-reducing approach to IoT data management
that is independent of data source, type or communication protocol. Such an approach would
significantly improve the speed and responsiveness of current real-time applications and also
broaden the application of IoT to new latency-critical domains. The approach would reduce
the response time of IoT applications and enable them to react fast enough to meet the requirements
of emerging applications. It could also serve as the underlying principle of both open-source and
commercial IoT data management applications.
1.4 Scope of work
The scope of this work includes the following:
I. To provide a new approach to IoT data management with a view to reducing latency.
II. To implement this approach with software tools.
III. To apply this implementation to a challenging use case.
1.5 Organization
The second chapter contains an extensive review of the literature. This
includes a review of the main concepts, the technologies used and related work in the research
area. The third chapter presents and describes the proposed model, how it works and its latency-reducing
advantages. The fourth chapter describes an implementation of the model with results
and evaluations, and the fifth chapter contains the conclusion and future work on the subject.