ABSTRACT
This study investigates the improvement of concurrency control in a distributed database. Its objectives are to ensure consistent and correct transactions, efficient data collection and management, and recoverability of data in a distributed database. A MySQL relational database was used to collect data for the concurrency control protocols. The Unified Modeling Language (UML) was used to create the use case diagram, sequence diagram, class diagram, and the contract for the sequence diagram. Petri nets, a graphical tool for analyzing sequences of operations, were used to model the operations in the distributed database. Finally, MATLAB was used to simulate the operation of the concurrency control protocol in a distributed database. The improved concurrency control scheme developed in this thesis showed an improvement of 55% over the existing method of concurrency control in a distributed database.
TABLE OF CONTENTS
Title page
Approval page
Declaration
Certification page
Dedication
Acknowledgement
Table of contents
List of figures
List of tables
Abstract
CHAPTER ONE: INTRODUCTION
1.1 Background to the study
1.2 Statement of the problem
1.3 Objectives
1.4 Scope of study
1.5 Significance of study
1.6 Plan/Organization of thesis
CHAPTER TWO: LITERATURE REVIEW
2.1 History
2.2 Database
2.3 Database analysis
2.4 Database design and modeling
2.5 Database management system
2.6 Distributed database
2.7 Advantages of distributed database
2.8 Disadvantages of distributed database
2.9 Transaction
2.10 Concurrency control in a distributed database
2.11 Concurrency control algorithm
2.11.1 Two-phase locking (2PL)
2.11.2 Time stamp ordering
2.11.3 Wound-Wait (WW)
2.12 Models of concurrency
2.12.1 Petri-nets
2.12.2 Process model
2.12.3 Actor model
2.13 Overview of the project
2.13.1 Goals of the project
2.13.2 Software process improvement
2.13.3 Functional requirements
2.13.4 Data collection
2.13.5 Data duplication
2.13.6 Data distribution
2.13.7 Distributed serializability
2.13.8 Non-functional requirements
2.13.9 Reliability
2.13.10 Availability
2.13.11 Recoverability
2.13.12 Maintainability
2.13.13 Software requirement
2.13.14 The main architectures for parallel DBMS
2.14.1 Model of a database management system
2.5 Software design
CHAPTER THREE: RESEARCH METHODOLOGY
3.1 Requirements analysis and specification
3.1.1 Use case diagram for concurrency control in a distributed database
3.1.2 Action diagram for concurrency control in a distributed database
3.1.3 Event trace diagram for concurrency control in a distributed database
3.1.4 Sequence diagram for concurrency control in a distributed database
3.1.5 Class diagram for concurrency control in a distributed database
3.1.6 Contract for the concurrency control in a distributed database
3.2.0 Design process
3.2.1 The design model
3.2.2 Data design elements
3.2.3 Data diagram for concurrency control in a distributed database
3.3.1 An improved architecture for concurrency control in a distributed database
3.4.1 Petri net design approach
3.4.2 Petri-nets architectural design approach
3.4.3 State space design approach with a Petri net
CHAPTER FOUR: RESULTS AND DISCUSSION
4.1 Table for ONE-WAY concurrency control in a distributed database
4.1.1 Equation for the simulation
4.1.2 Simulation process
4.2 Table for an improved TWO-WAY concurrency control in a distributed database
4.2.1 Equation for the simulation
4.2.2 Simulation process
4.3 Table for an existing ONE-WAY concurrency control in a distributed database
4.3.1 Equation for the simulation
4.3.2 Simulation process
4.4 Evaluation of the results
4.4.1 Validating the analytical results with the simulation results
4.4.2 Discussion on the improvement in concurrency control in a distributed database
CHAPTER FIVE: CONCLUSION AND RECOMMENDATION FOR FURTHER WORK
5.1 Summary
5.2 Conclusion
5.3 Recommendation
5.4 Suggestions for further work
References
Appendix
CHAPTER ONE
INTRODUCTION
1.1 BACKGROUND TO THE STUDY
Computers and their diverse applications have in recent times witnessed a revolution. Their enormous success is due largely to the flexibility and reliability that computer systems offer to potential users [1]. According to [2], the last decade has seen the creation and rapid expansion of the field of distributed computing systems. This has been driven by both technical and social forces, and it seems likely that the pressure from both will continue for some time yet. For over 20 years, businesses have been moving their data processing activities online. Many businesses, such as airlines and banks, are no longer able to function when their online computer systems are down. Their online databases must be up to date and correct at all times.
Managing large-scale distributed data and activities becomes difficult as the amount of heterogeneous data grows. Classic techniques for distributed design and query processing in relational database systems have therefore been revisited to address dynamic issues in high-performance computing and the flexibility challenges of XML documents [3].
Concurrency control ensures that concurrent operations produce correct results, while obtaining those results as quickly as possible. The general area of concurrency control provides rules, methods, design methodologies, and theories for maintaining the consistency of components that operate concurrently while interacting, and thus the consistency and correctness of the whole system.
Concurrency control is the activity of coordinating the actions of processes that operate in parallel, access shared data, and therefore potentially interfere with each other [4]. According to [5], in a mobile computing environment, clients can access data irrespective of their physical location. Data is shared among multiple clients and can be updated by each client independently, which can leave the data inconsistent. Because of the limitations of the mobile computing environment, traditional techniques cannot be used.
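The inconsistency that arises when clients update shared data independently can be illustrated with a small simulation. This is a minimal sketch, not the method developed in this thesis: the function `run_schedule`, the transaction names `T1` and `T2`, and the amounts are hypothetical and chosen only for illustration. It replays two schedules over one shared item, one uncontrolled and one serial, and shows that the uncontrolled interleaving loses an update.

```python
def run_schedule(schedule, initial=100):
    """Replay a schedule of (txn, op, amount) steps against one shared item.

    'read'  copies the shared value into the transaction's private workspace.
    'write' stores the private value plus `amount` back into the shared item.
    All names here are illustrative assumptions, not part of the thesis.
    """
    shared = initial
    local = {}  # private workspace, one entry per transaction
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = shared
        elif op == "write":
            shared = local[txn] + amount
        else:
            raise ValueError(f"unknown operation: {op}")
    return shared


# Uncontrolled interleaving: both clients read before either one writes.
interleaved = [
    ("T1", "read", 0), ("T2", "read", 0),
    ("T1", "write", 50), ("T2", "write", 30),
]

# Serial execution: T2 starts only after T1 has finished.
serial = [
    ("T1", "read", 0), ("T1", "write", 50),
    ("T2", "read", 0), ("T2", "write", 30),
]

lost_update_result = run_schedule(interleaved)  # 130: T1's +50 is overwritten
serial_result = run_schedule(serial)            # 180: both updates are applied
```

In the interleaved schedule, `T2` writes a value computed from a stale read, so the final state reflects only one of the two updates. Preventing this kind of interference is the job of a concurrency control protocol.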
Many enterprises are cooperative in nature (e.g. offices, multinational companies, university campuses) and require sharing of resources and information. Distributed systems can provide this either by integrating pre-existing systems or by building new systems that inherently reflect sharing patterns in their structure [2]. As stated by [6], the protocol structure for our distributed database is based on a transaction-oriented high-level protocol, supported by the top three layers of the ISO reference model, namely the application layer, the presentation layer, and the session layer. In most existing distributed databases, database functions are incorporated at the application layer [7][8]. The performance of such protocols is not satisfactory because of high-level time-outs and excessive communication delay. The failure detection facility is usually established at the application layer in most distributed database systems and is triggered by a high-level time-out, which is quite time-consuming. To enhance the performance of the system, the failure detection facility is encapsulated in the session layer. It detects the site status of the cooperative system at a low cost and with enhanced performance [9].
Maintaining consistency requires imposing an ordering on the events within a system. The substantial insight is that events in a distributed system define only a partial order rather than a total order [10]. The required orderings can be achieved by extending existing centralized mechanisms, such as locking, or by using timestamp-based algorithms.
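One well-known way to assign timestamps consistent with such a partial order is a Lamport-style logical clock. The sketch below is offered only as an illustration under that assumption; the text above does not say which timestamp scheme is intended, and the names `LamportClock`, `total_order_key`, and the site identifiers are hypothetical. Each site advances its counter on local events and, on receiving a message, moves its counter past the sender's timestamp. Breaking ties by site identifier extends the partial order to a total order.

```python
class LamportClock:
    """Logical clock kept by one site (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Message receipt: move past the sender's timestamp."""
        self.time = max(self.time, sent_time) + 1
        return self.time


def total_order_key(timestamp, site_id):
    """Break timestamp ties by site id to obtain a total order."""
    return (timestamp, site_id)


site_a, site_b = LamportClock(), LamportClock()
a1 = site_a.tick()        # local event at site A
b1 = site_b.tick()        # concurrent local event at site B, same timestamp
msg = site_a.tick()       # site A sends a message stamped with its clock
b2 = site_b.receive(msg)  # site B receives it and jumps past the stamp
```

Here `a1` and `b1` carry the same timestamp because neither event could have influenced the other; only the tie-break in `total_order_key` orders them. The send (`msg`) is always stamped earlier than the matching receive (`b2`).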
Furthermore, a concurrent program can be executed either by allowing processes to share one or more processors or by running each process on its own processor. The first approach is referred to as multiprogramming; it is supported by an operating system kernel that multiplexes the processes on the processors [11]. The second approach is referred to as multiprocessing if the processors share a common memory (as in a multiprocessor), or as distributed processing if the processors are connected by a communication network.
In concurrency control of databases, transaction processing (transaction management), and various transactional applications (e.g. transactional memory and software transactional memory), both centralized and distributed, a transaction schedule is serializable if its outcome (e.g. the resulting database state) is equal to the outcome of its transactions executed serially, i.e. sequentially without overlapping in time [12]. Serializability is the major correctness criterion for concurrent transaction executions. It is considered the highest level of isolation between transactions and plays an essential role in concurrency control.
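A common practical test related to this criterion is conflict serializability, which is a sufficient (but not necessary) condition for the outcome-based definition given above. The following is a minimal sketch under that assumption, not the procedure used in this thesis; the function name `is_conflict_serializable` and the schedule encoding are hypothetical. It builds a precedence graph with an edge from one transaction to another whenever an operation of the first conflicts with a later operation of the second on the same item, and reports the schedule as conflict serializable when the graph has no cycle.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) tuples with op in {'r', 'w'}.

    Two operations conflict if they belong to different transactions,
    touch the same item, and at least one of them is a write.
    """
    # Build the precedence graph as adjacency sets.
    edges = {}
    for i, (t1, op1, x1) in enumerate(schedule):
        edges.setdefault(t1, set())
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges[t1].add(t2)

    # Depth-first search for a cycle (three-colour marking).
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in edges}

    def has_cycle(node):
        colour[node] = GREY
        for nxt in edges[node]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[t] == WHITE and has_cycle(t) for t in edges)


# Both transactions read x before either writes it: T1 -> T2 and T2 -> T1.
interleaved = [("T1", "r", "x"), ("T2", "r", "x"),
               ("T1", "w", "x"), ("T2", "w", "x")]

# T1 completes before T2 begins: only T1 -> T2.
serial = [("T1", "r", "x"), ("T1", "w", "x"),
          ("T2", "r", "x"), ("T2", "w", "x")]
```

Calling `is_conflict_serializable(serial)` returns `True`, while `is_conflict_serializable(interleaved)` returns `False` because the precedence graph contains a cycle between `T1` and `T2`.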
However, the commonly used correctness criteria for concurrency control and recovery, serializability and total recoverability, are very strict. The use of more relaxed criteria (allowing more true parallel behavior and more true partial behavior) is therefore very appealing, as long as this can be achieved without compromising safety or applicability [13].
From the foregoing, one can see that improving concurrency control in a distributed database plays a crucial role in both the abnormal (e.g. crashes) and normal (e.g. concurrent sharing) activity of a system.
1.2 STATEMENT OF THE PROBLEM
The problem of concurrency control is well known in multiprocessing systems. The problem and its solutions become harder as the degree of sharing and the amount of concurrency increase in a distributed system [2]. In particular, the lack of global state makes the solution more difficult and also introduces the need for replication, which causes further consistency problems.
1.3 OBJECTIVES
The primary objective of this project is to improve concurrency control in a distributed database. The secondary objectives of the project are:
- To ensure consistent and correct transactions in a distributed database.
- To ensure efficient data collection and management.
- To ensure recoverability of data in a distributed database.
1.4 SCOPE OF STUDY
The study is limited to the improvement of concurrency control in a distributed database. It covers concurrency control on the database management files of Postgraduate Studies at the University of Nigeria, Nsukka.
1.5 SIGNIFICANCE OF STUDY
The findings from this study will be of immense benefit to banks, institutions, churches, offices, etc. This is because distributed systems are in use in a wide range of computer applications and are considered a ‘first candidate’ whenever a new application emerges [14].
1.6 PLAN/ORGANIZATION OF THESIS
The project is organized into the following chapters:
Chapter one introduces the project. It gives the background information of the project and discusses the statement of the problem, objectives of the project, significance of the study, proposed method, and plan/organization of the thesis.
Chapter two reviews databases, database management systems, distributed databases, and concurrency control in a distributed database.
Chapter three discusses requirements elicitation, analysis, and specification.
Chapter four deals with project design, which encompasses the framework and the design of the different modules used in the project. The modules comprise the data module, database management system module, distributed data module, and concurrency control module.
Chapter five covers simulation and documentation of the project.
Chapter six deals with evaluation and conclusion.