## ABSTRACT

The Gompertz distribution can be skewed to the right or to the left. This dissertation introduces a new positively skewed Gompertz model known as the Lomax-Gompertz Distribution (LGD). This extension was made possible with the aid of a Lomax generator. Some basic statistical properties of the new distribution, such as moments, moment generating function, characteristic function, reliability analysis, quantile function and distribution of order statistics, were derived. A plot of the probability density function (pdf) of the distribution revealed that it is positively skewed. The model parameters were estimated using the method of maximum likelihood. The plot of the survival function indicates that the Lomax-Gompertz Distribution could be used to model time- or age-dependent variables, where the probability of survival decreases with time or age. The performance of the Lomax-Gompertz Distribution was compared to that of the Generalized Gompertz, Transmuted Gompertz, Odd Generalized Exponential Gompertz and Gompertz distributions through applications to three real-life datasets. The results show that the proposed distribution outperformed these four competing distributions in two of the three datasets. The model should be used to model positively skewed datasets with various peaks where the sample size is large.

## TABLE OF CONTENTS

COVERPAGE

DECLARATION

CERTIFICATION

DEDICATION

ACKNOWLEDGEMENTS

ABSTRACT

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

CHAPTER ONE: INTRODUCTION

1.1 Background to the Study

1.2 Statement of the Problem

1.3 Aim and Objectives of the Study

1.4 Significance of the Study

1.5 Limitation

1.6 Definition of Terms

1.6.1 Probability distribution

1.6.2 Moments

1.6.3 Moment generating function

1.6.4 Characteristic function

1.6.5 Reliability analysis

1.6.6 Order statistics

1.6.7 Maximum likelihood method

1.6.8 Lifetime data

CHAPTER TWO: LITERATURE REVIEW

2.1 The Gompertz Distribution

2.2 Generalization of Distributions

2.2.1 The G-classes approach

2.2.2 Compounding approach

2.3 Some Generalizations of the Gompertz Distributions

2.4 Some Generalizations of the Lomax Distributions

CHAPTER THREE: RESEARCH METHODOLOGY

3.1 The Definition of the Lomax-Gompertz Distribution

3.1.1 The pdf and cdf of Lomax-Gompertz distribution

3.1.2 Plot of pdf and cdf of the Lomax-Gompertz distribution

3.2 Some Properties of the Lomax-Gompertz Distribution

3.2.1 Moments

3.2.2 Moment generating function

3.2.3 Characteristic function

3.2.4 Quantile function

3.2.5 Skewness and kurtosis

3.3 Reliability Analysis

3.3.1 Survival function

3.3.2 Hazard function

3.4 Order Statistics

3.5 Estimation of Parameters

CHAPTER FOUR: ANALYSIS AND DISCUSSION

4.1 Application to real life datasets

4.2 Datasets

4.3 Information Criteria for Comparison of Distributions

4.4 Results

4.5 Discussion of Results

CHAPTER FIVE: SUMMARY, CONCLUSION AND RECOMMENDATIONS

5.1 Summary

5.2 Conclusions

5.3 Recommendations

5.4 Contribution to Knowledge

5.5 Areas of Further Research

REFERENCES

## CHAPTER ONE

INTRODUCTION

1.1 Background to the Study

Lomax (1954) pioneered the study of a distribution for modeling business failure data, now called the Lomax or Pareto II distribution. This distribution has found wide application in a variety of fields such as income and wealth inequality, size of cities, actuarial science, medical and biological sciences, engineering, and lifetime and reliability modeling. It has been applied to model data on income and wealth (Harris, 1968), firm size (Corbellini et al., 2007), the size distribution of computer files on servers (Holland et al., 1989), reliability and life testing (Hassan and Al-Ghamdi, 2009), receiver operating characteristic curve analysis (Campbell and Ratnaparkhi, 1993) and Hirsch-related statistics (Glänzel, 2008). It is known as a special form of the Pearson type VI distribution. In the lifetime context, the Lomax model belongs to the family of decreasing failure rate distributions (Chahkandi and Ganjali, 2009) and arises as a limiting distribution of residual lifetimes at great age (Balkema and de Haan, 1974). It has been suggested as a heavy-tailed alternative to the exponential, Weibull and gamma distributions (Bryson, 1974). Further, it is related to the Burr family of distributions (Tadikamalla, 1980) and arises as a special case of compound gamma distributions (Dubey, 1970).

Details about the Lomax distribution and the Pareto family are given in Arnold (1983) and Johnson et al. (1994). In record value theory, some properties and moments of the Lomax distribution have been discussed in Balakrishnan and Ahsanullah (1994) and Amin (2011). The moments and inference for the order statistics and generalized order statistics are given in Saran and Pushkarna (1999) and Moghadam et al. (2012). The estimation of parameters under progressive and hybrid censoring has been investigated in Asgharzadeh and Valiollahi (2011) and Ashour et al. (2011). The problem of Bayesian prediction bounds for future observations based on uncensored and type-I censored samples from the Lomax model is dealt with in Abd-Ellah (2003) and Al-Hussaini et al. (2001). Furthermore, Bayesian and non-Bayesian estimators of the sample size for type-I censored samples from the Lomax distribution are obtained in Abd-Elfattah et al. (2007), and estimation under step-stress accelerated life testing is considered in Hassan and Al-Ghamdi (2009). Parameter estimation through generalized probability weighted moments (PWMs) is addressed in Abd-Elfattah and Alharby (2010). More recently, the second-order bias and bias correction for the maximum likelihood estimators (MLEs) of the parameters of the Lomax distribution were determined in Giles et al. (2013).

Cordeiro et al. (2014d) introduced a new family of distributions based on the Lomax distribution, called the Lomax-G generator. The Lomax-G generator adds two positive parameters to an existing continuous distribution. It allows for greater flexibility in the tails and can be widely applied in many areas of Engineering and Biology. This study takes advantage of this generator to introduce the Lomax-Gompertz Distribution by generalizing the Gompertz Distribution. The resulting Lomax-Gompertz Distribution (LGD) has four parameters: two from the baseline Gompertz Distribution and two additional positive parameters from the Lomax-G generator. This increases the flexibility of the Gompertz distribution and widens its areas of application in Engineering and Biology.

The Gompertz distribution (GD) can be skewed to the right or to the left. It is a generalization of the exponential distribution (ED) and is commonly used in many applied lifetime data analyses (Johnson et al., 1995). The GD is applied in survival analysis in sciences such as gerontology (Brown and Forbes, 1974), computer science (Ohishi et al., 2009), biology (Economos, 1982) and marketing science (Bemmaor and Glady, 2012). The hazard rate function of the Gompertz distribution is an increasing function, which makes it suitable for describing the distribution of adult life spans, as done by actuaries and demographers (Willemse and Koppelaar, 2000).

1.2 Statement of the Problem

One of the major problems in distribution theory and its applications is that some datasets do not adequately follow any of the existing, well-known probability distributions, which creates anomalies in the process of statistical analysis. Despite the applicability of the Gompertz distribution, limited work has been done to extend it and increase its flexibility. Also, to the best of our knowledge, no research has extended the Gompertz distribution using a Lomax link function. We therefore expect the proposed distribution to be more flexible than the Gompertz distribution and to fit real datasets better, in line with what previous studies in the literature have reported for generalized distributions.

1.3 Aim and Objectives of the Study

The aim of this work is to generalize the Gompertz distribution using the Lomax link function. The specific objectives for achieving this aim are to:

i. define the proposed Lomax-Gompertz distribution.

ii. derive some statistical properties of the proposed distribution such as the moments, moment generating function, characteristic function, survival function, hazard function, quantile function and distribution of order statistics.

iii. estimate the parameters of the proposed distribution using the method of Maximum Likelihood Estimation (MLE).

iv. evaluate the performance of the proposed distribution when compared to other generalizations of Gompertz distribution.


1.4 Significance of the Study

The main significance of this study is that it increases the flexibility of the Gompertz Distribution by adding two positive parameters, making it better able to fit datasets. This will also widen the areas in which the distribution can be applied. The proposed distribution has been compared to some existing generalizations of the Gompertz distribution using real-life datasets to evaluate its performance.

1.5 Limitation

This study focuses on generalizing the Gompertz distribution, deriving some statistical properties of the proposed distribution, studying and interpreting some plots of the proposed distribution, and estimating the parameters of the model using only the method of maximum likelihood estimation. The maximum likelihood method yielded a non-linear system of equations that could not be solved analytically to obtain the maximum likelihood estimates. R software was therefore used to compute the parameter estimates numerically from the datasets. Other suitable software includes Python and SAS.

1.6 Definition of Terms

1.6.1 Probability distribution

A random variable is a variable whose value changes from one subject to another. Probability is used to describe the likelihood that a random variable will equal a specific value or fall within a given range of values. A probability density function is a mathematical expression that approximately agrees with the frequencies of possible events of a random variable. A random variable is said to be continuous if its range contains an interval of real numbers. For a continuous random variable X, the cumulative distribution function (cdf) and probability density function (pdf) are defined, respectively, as:

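$$F(x) = \Pr[X \le x] = \alpha, \qquad F(x) = \int_{-\infty}^{x} f(u)\,du \tag{1.1}$$

where $\alpha$ is a real number between 0 and 1, and

$$f(x) = \frac{d\left(F(x)\right)}{dx} \tag{1.2}$$

The pdf has the following properties:

1. $f(x) \ge 0$

2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

3. $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$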
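The three pdf properties above can be checked numerically for any candidate density. The following is a minimal sketch, assuming a common parameterization of the Gompertz distribution, $f(x) = \theta e^{\gamma x}\exp\{-(\theta/\gamma)(e^{\gamma x}-1)\}$ for $x \ge 0$, which is not stated in this section. The parameter values, the function names and the Simpson integrator are illustrative choices only.

```python
import math

# Assumed Gompertz parameterization (not taken from this section):
# f(x) = theta * exp(gamma*x) * exp(-(theta/gamma) * (exp(gamma*x) - 1)), x >= 0
THETA, GAMMA = 0.5, 1.0  # hypothetical parameter values


def gompertz_cdf(x, theta=THETA, gamma=GAMMA):
    """F(x) = 1 - exp(-(theta/gamma) * (exp(gamma*x) - 1))."""
    return 1.0 - math.exp(-(theta / gamma) * math.expm1(gamma * x))


def gompertz_pdf(x, theta=THETA, gamma=GAMMA):
    """f(x) = dF(x)/dx for the assumed parameterization."""
    return theta * math.exp(gamma * x) * math.exp(-(theta / gamma) * math.expm1(gamma * x))


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


# Property 1: the density is non-negative on a grid of points.
nonnegative = all(gompertz_pdf(0.01 * k) >= 0.0 for k in range(1001))

# Property 2: the density integrates to 1 (the mass beyond x = 10 is negligible here).
total_mass = simpson(gompertz_pdf, 0.0, 10.0)

# Property 3: P(a <= X <= b) is the integral of f over [a, b], which equals F(b) - F(a).
a, b = 0.5, 1.5
prob_integral = simpson(gompertz_pdf, a, b)
prob_cdf = gompertz_cdf(b) - gompertz_cdf(a)

print(nonnegative, round(total_mass, 6), round(prob_integral, 6), round(prob_cdf, 6))
```

The same three checks apply unchanged to any other pdf and cdf pair, provided the integration range covers essentially all of the probability mass.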

1.6.2 Moments

Moments are used to study some of the most important features and characteristics of a random variable such as mean, variance, skewness and kurtosis. Let X be a continuous random variable; the nth moment of X about the origin is defined as:

$$\mu_n' = E(X^n) = \int_{-\infty}^{\infty} x^n f(x)\,dx \tag{1.3}$$
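To illustrate how (1.3) leads to the mean, variance and skewness, the sketch below integrates $x^n f(x)$ numerically. It assumes an exponential pdf $f(x) = \lambda e^{-\lambda x}$, chosen only because its raw moments have the known closed form $n!/\lambda^n$ to compare against; the rate value and helper names are hypothetical.

```python
import math


def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


def raw_moment(pdf, order, lower, upper):
    """Moment about the origin: integral of x**order * f(x) dx over [lower, upper]."""
    return simpson(lambda x: x ** order * pdf(x), lower, upper)


RATE = 2.0  # hypothetical rate parameter of the exponential distribution


def exp_pdf(x, rate=RATE):
    return rate * math.exp(-rate * x)


# The upper limit is truncated where the remaining tail mass is negligible.
m1, m2, m3 = (raw_moment(exp_pdf, k, 0.0, 30.0) for k in (1, 2, 3))

mean = m1
variance = m2 - m1 ** 2
# Third central moment from the raw moments, then the skewness coefficient.
mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
skewness = mu3 / variance ** 1.5

print(mean, variance, skewness)
```

For this exponential example the closed-form values are mean $1/\lambda$, variance $1/\lambda^2$ and skewness 2, so a positive skewness coefficient corresponds to a right-skewed density.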

1.6.3 Moment generating function

The moment generating function of a continuous random variable X is defined as:

$$M_x(t) = E\left(e^{tx}\right) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx \tag{1.4}$$

where $f(x)$ is the pdf of a continuous distribution. This function is a general function for all the moments: the nth moment about the origin is obtained by differentiating $M_x(t)$ n times with respect to t and evaluating the result at $t = 0$.
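As a brief worked illustration of how (1.4) generates moments, consider the exponential distribution with rate $\lambda > 0$ and pdf $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. It is used here only because its integrals are elementary; it is not the distribution proposed in this work. For $t < \lambda$:

$$M_x(t) = \int_{0}^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \lambda \int_{0}^{\infty} e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t}$$

$$M_x'(t) = \frac{\lambda}{(\lambda - t)^{2}}, \qquad M_x''(t) = \frac{2\lambda}{(\lambda - t)^{3}}$$

Evaluating at $t = 0$ gives $E(X) = M_x'(0) = 1/\lambda$ and $E(X^2) = M_x''(0) = 2/\lambda^2$, so the variance is $2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2$.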

1.6.4 Characteristic function

The characteristic function of a random variable X is defined as

$$\phi_x(t) = E\left(e^{itx}\right) = E\left[\cos(tx) + i\sin(tx)\right] = E\left[\cos(tx)\right] + E\left[i\sin(tx)\right] \tag{1.5}$$

It has many useful and important properties which give it a central role in statistical theory. It is particularly useful for generating moments, characterizing distributions and analyzing linear combinations of independent random variables.
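The decomposition in (1.5) into a cosine part and a sine part can be evaluated directly by numerical integration. The sketch below does this under the assumption of an exponential pdf with a hypothetical rate, for which the characteristic function is known to be $\lambda/(\lambda - it)$; the helper names are illustrative.

```python
import math


def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


RATE = 2.0  # hypothetical rate of the exponential distribution used for illustration


def exp_pdf(x, rate=RATE):
    return rate * math.exp(-rate * x)


def char_fn(t, pdf, lower=0.0, upper=30.0):
    """phi(t) = E[cos(tX)] + i * E[sin(tX)], each expectation computed by quadrature."""
    real_part = simpson(lambda x: math.cos(t * x) * pdf(x), lower, upper)
    imag_part = simpson(lambda x: math.sin(t * x) * pdf(x), lower, upper)
    return complex(real_part, imag_part)


t = 1.5
numeric = char_fn(t, exp_pdf)
closed_form = RATE / (RATE - 1j * t)  # known result for the exponential distribution

print(numeric, closed_form)
```

Unlike the moment generating function, the integrals here are bounded for every real t, since the cosine and sine terms never exceed 1 in absolute value.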

1.6.5 Reliability analysis

1.6.5.1 Survival function

The survival function is the probability that a system or an individual will survive beyond a given time. Mathematically, the survival function is given by:

$$S(x) = 1 - F(x) \tag{1.6}$$

where $F(x)$ is the cumulative distribution function (cdf) of a baseline distribution.

1.6.5.2 Hazard function

The hazard function, also called the failure or risk function, is the instantaneous rate at which a component fails or dies at time x, given that it has survived up to time x. The hazard function is defined as:

$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{f(x)}{S(x)} \tag{1.7}$$

where $F(x)$ and $f(x)$ are the cdf and pdf of a baseline distribution.
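Equations (1.6) and (1.7) can be evaluated for any baseline distribution once its pdf and cdf are available. The sketch below uses the Gompertz distribution as the baseline, assuming the parameterization $F(x) = 1 - \exp\{-(\theta/\gamma)(e^{\gamma x}-1)\}$, which is not stated in this section; under that assumption the hazard simplifies to $\theta e^{\gamma x}$. The parameter values are hypothetical.

```python
import math

# Assumed Gompertz parameterization (not taken from this section).
THETA, GAMMA = 0.1, 0.5  # hypothetical parameter values


def gompertz_cdf(x, theta=THETA, gamma=GAMMA):
    return 1.0 - math.exp(-(theta / gamma) * math.expm1(gamma * x))


def gompertz_pdf(x, theta=THETA, gamma=GAMMA):
    return theta * math.exp(gamma * x) * math.exp(-(theta / gamma) * math.expm1(gamma * x))


def survival(x, cdf):
    """S(x) = 1 - F(x), equation (1.6)."""
    return 1.0 - cdf(x)


def hazard(x, pdf, cdf):
    """h(x) = f(x) / (1 - F(x)) = f(x) / S(x), equation (1.7)."""
    return pdf(x) / survival(x, cdf)


times = [0.0, 1.0, 2.0, 3.0, 4.0]
survival_values = [survival(x, gompertz_cdf) for x in times]
hazard_values = [hazard(x, gompertz_pdf, gompertz_cdf) for x in times]

for x, s, h in zip(times, survival_values, hazard_values):
    print(f"x={x:.1f}  S(x)={s:.4f}  h(x)={h:.4f}")
```

In this example the survival values decrease and the hazard values increase with x, which matches the increasing hazard rate of the Gompertz distribution described in Section 1.1.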


1.6.6 Order statistics

Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with pdf $f(x)$, and let $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ denote the corresponding order statistics obtained from this sample. The pdf, $f_{i:n}(x)$, of the ith order statistic is defined as:

$$f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\, F(x)^{i-1} \left[1 - F(x)\right]^{n-i} \tag{1.8}$$

Order statistics are used in a wide range of problems including robust statistical estimation and detection of outliers, characterization of probability distributions and goodness of fit tests, entropy estimation, analyses of censored samples, quality control, reliability analysis and strength of materials.
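Equation (1.8) translates directly into a function of the parent pdf and cdf. The sketch below is illustrative only: it assumes a Uniform(0, 1) parent, for which the ith order statistic is known to follow a Beta(i, n - i + 1) distribution with mean i/(n + 1), so the result can be checked. The sample size, index and helper names are hypothetical.

```python
import math


def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n, equation (1.8)."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return coeff * pdf(x) * cdf(x) ** (i - 1) * (1.0 - cdf(x)) ** (n - i)


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


# Uniform(0, 1) parent, used only because its order statistics are well known.
def unif_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def unif_cdf(x):
    return min(max(x, 0.0), 1.0)


n, i = 5, 2  # second smallest value in a sample of five
total_mass = simpson(lambda x: order_stat_pdf(x, i, n, unif_pdf, unif_cdf), 0.0, 1.0)
mean_i = simpson(lambda x: x * order_stat_pdf(x, i, n, unif_pdf, unif_cdf), 0.0, 1.0)

print(total_mass, mean_i, i / (n + 1))
```

Passing a different pdf and cdf pair to `order_stat_pdf` gives the order-statistic density for that parent distribution without changing the function itself.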

1.6.7 Maximum likelihood method

Let $x_1, x_2, \ldots, x_n$ be a random sample from a population X with probability density function $f(x; \theta)$, where the parameter $\theta$ is unknown. The likelihood function, $L(\theta)$, is defined to be the joint density of the random variables $x_1, x_2, \ldots, x_n$. That is,

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) \tag{1.9}$$

The sample statistic that maximizes the likelihood function $L(\theta)$ is called the maximum likelihood estimator of $\theta$ and is denoted by $\hat{\theta}$.
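When the likelihood equations cannot be solved analytically, the maximizer of (1.9) is found numerically. The sketch below shows the idea on a one-parameter exponential model, chosen because its closed-form estimator $\hat{\lambda} = n / \sum x_i$ is available as a check. The data are made up, and the golden-section search is a stand-in for whatever optimizer a statistical package would use; it is not the procedure used in this study. The logarithm of the likelihood is maximized instead of the likelihood itself, which gives the same estimate because the logarithm is increasing.

```python
import math

# Hypothetical lifetimes, made up purely for illustration.
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.7, 0.2, 1.1]


def exp_log_likelihood(rate, sample):
    """log L(rate) = sum of log f(x_i; rate) with f(x; rate) = rate * exp(-rate * x)."""
    return sum(math.log(rate) - rate * x for x in sample)


def golden_section_max(func, lo, hi, tol=1e-9):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) > func(d):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


rate_numeric = golden_section_max(lambda r: exp_log_likelihood(r, data), 1e-6, 10.0)
rate_closed_form = len(data) / sum(data)  # analytical MLE for the exponential model

print(rate_numeric, rate_closed_form)
```

A model with several parameters needs a multivariate optimizer, but the structure is the same: write the log-likelihood as a function of the parameters and pass it to a numerical maximizer.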

1.6.8 Lifetime data

Datasets from real happenings or normal life occurrences are called real-life datasets. They are observations or records from our day-to-day activities. Lifetime data are data collected on living subjects.
