Download this complete project material titled "Study And Comparison Of Modern NoSQL Models For Large Scaled Websites", with abstract, chapters 1-5, references, and questionnaire. Preview the abstract or chapter one below.

  • Format: PDF and MS Word (DOC)
  • pages = 65


ABSTRACT

 

With the rise of the internet and the explosion of data sources, more and more
companies are facing new challenges that were previously very rare. These
problems concern the storage and usability of data sets so big, and often
growing so fast, that the usual tools are no longer adequate. Moreover, the
multiplication of data sources and types leads to the problem of storing data
that should be considered unstructured in the fixed, structured data models
provided by Relational Database Management Systems. These problems led
companies and open source communities to build new tools to face the new
challenges. These tools are known as NoSQL databases. This project, titled
THE STUDY AND COMPARISON OF NOSQL MODELS FOR LARGE SCALED WEBSITES, takes a
practical approach to studying these NoSQL models, determining their
scalability properties and investigating their performance. Matlab diagrams
are used for comparison. These results, together with the CAP theorem, are
used to make suggestions and use cases that help people know which model will
best suit their website. The project also solves some problems encountered in
some of these database models.

 

TABLE OF CONTENTS

Title Page
Approval Page
Certification
Dedication
Acknowledgement
Abstract
Table Of Contents
List Of Figures
List Of Tables
Chapter One Introduction
1.1. Background Of Study
1.2. Aims And Objectives
1.3. Scope Of Study
1.4. Justification Of Study
1.5. Constraints And Limitations
Chapter Two Literature Review
2.1. Structured Query Language
2.1.1. Why NoSQL Models?
2.1.2. Problems That A NoSQL Database Tries To Solve
2.1.3. SQL And NoSQL Scaling
2.2. Partitioning
2.3. Storage Layout
2.3.1. Row-Based Storage Layout
2.3.2. Columnar Storage Layout
2.3.3. Columnar Storage Layout With Locality Groups
2.3.4. Log Structured Merge Trees
2.4. Storage Distribution
2.5. The CAP Theorem
2.6. Distribute The Computation
2.7. Study Of Some Selected Databases
2.7.1. HBase
2.7.2. Cassandra
2.7.3. MongoDB
2.9.3. CAP Approach
2.7.4. Riak
2.7.5. Scalaris
2.8. YCSB
2.9. Catalyst 2950 Switch
2.10. Matlab Description
2.11. Matrix And Array Operations
Chapter Three Methodology
3.1. Parameters/Properties Being Measured
3.1.1. Performance
3.1.2. Scalability
3.1.3. Stability
3.2. Materials Used
3.2.1. Software Configuration Settings
3.2.2. Cassandra
3.2.3. HBase
3.3.4. MongoDB
3.3. Data Set
3.3.1. Creating The Mydata Keyspace And The Two Column Families
3.3.2. Createdata.php Script
3.3.3. Readwrite100million.php Script
3.3.4. Readwrite100million.php Code
3.3.5. Readwrite200million.php Script
3.3.6. The Result Format
3.4. Program Flow
3.5. Readwrite100million.php
3.6. Network Configuration
3.7. Network Diagrams
3.8. Solving Load Balancing Problem In NoSQL
3.9. Experiment Showing The Effect Of Load Balancing
3.10. Application And Result On My Cluster
3.11. Microsoft System Center Virtual Machine Manager (VMM)
3.12. Running The Virtual Machine Converter Software
Chapter Four Results Analysis And Discussion
4.1. Results And Calculations
4.2. Matlab Diagrams
4.3. Stabilization Times Of The Database Models
4.4. Scalability
4.5. Performance Measurement
4.6. Comparison With Results From Ecole Polytechnique De Louvain Computer Engineering Dept.
4.7. Results Of Physical And Virtual Nodes
4.8. Performance And Stability Results
4.9. Results Analysis
Chapter Five Conclusion And Recommendation
5.1. Use Cases
5.2. Website With Medium To High Traffic
5.3. File Storage In The Cloud
5.4. Conclusions
5.5. Further Work
A References
B Appendix

 

 

CHAPTER ONE

INTRODUCTION
1.1. BACKGROUND OF STUDY
With the rise of the Internet and the explosion of data sources, more and more
companies are facing new challenges that were previously very rare. These
problems concern the storage and usability of data sets so big, and often
growing so fast, that the usual tools are no longer adequate. Moreover, the
multiplication of data sources and types leads to the problem of storing data
that should be considered unstructured in the fixed, structured data models
provided by Relational Database Management Systems. These problems have led
companies and open source communities to build new tools to face these
new challenges. These new tools are known as NoSQL databases.
The world of NoSQL databases is very interesting but also quite complex due to
several factors:
· NoSQL databases are evolving rapidly, so critiques and analyses can
become obsolete very quickly.
· There is a profusion of NoSQL databases and each of them has its own
specifications.
· The various NoSQL databases have reached different levels of maturity and
it is not always trivial to determine whether a given database, or a subset
of its functionality, is ready for production.
1.2. AIMS AND OBJECTIVES
Because new database models emerge constantly, it is difficult to determine
which will best fit a particular type of website. This project therefore
examines the most common NoSQL models in practice in order to determine their
scalability, stability and performance, and looks into their CAP properties
according to the CAP theorem, to enable the development of use cases or
advice suited to a particular website. This project also solves some
problems encountered in some of these NoSQL models.
1.3. SCOPE OF STUDY
This project concentrates on the three most common NoSQL models in use
today: Cassandra, HBase and MongoDB. These are good examples of NoSQL models
and are widely used by large scaled sites around the world today.
1.4. JUSTIFICATION OF STUDY
The work done here will serve as a guide for companies whose databases are
growing beyond what an RDBMS or SQL can handle. It also suits companies
moving to NoSQL models, just as Facebook, Twitter, Amazon and many others
have done. It will also be of great help to companies that are already using
these models but are facing some of the problems solved in this thesis. It
will help them speed up their database processing, increase efficiency and
save cost.
1.5. CONSTRAINTS AND LIMITATIONS
Some of these large scaled sites run hundreds or even thousands of servers,
but due to cost and the availability of resources I was limited to
experimenting with only 30 nodes/systems.

 


