## ABSTRACT

In this study, we evaluated students’ academic performance through the relationship between mathematical and less-mathematical subjects and sought to identify the subjects that contribute significantly to the variation among them. Canonical correlation analysis was employed to analyse the relationship between mathematical and less-mathematical subjects; the significance of the canonical variates and the homogeneity of variance among the variables were tested using Wilks’ Lambda and Bartlett’s test respectively. Factor analysis was used to investigate the variability among the subjects and to find the variables that contribute significantly to the percentage of variance explained. The data used are the Senior Secondary Certificate Examination results of Command Secondary School, Kaduna, conducted by the National Examination Council in the years 2008, 2009 and 2010. The data consist of the results of 90 students in eight of the nine subjects registered by each student. Two sets were formed: set-1, consisting of Mathematics, Chemistry and Physics, was classified as the mathematical subjects, while set-2, consisting of Economics, English Language, Biology, Geography and Hausa Language, was classified as the less-mathematical subjects. The data were structured into mathematical and less-mathematical sets and analysed using the NCSS 2007 package. The purpose of the structuring was to gather the information needed to quantify the elements considered in constructing determinant factors and to check which of the variables contributed significantly to the variance. Our results showed that less-mathematical subjects have a significant impact on determining students’ academic performance. Three canonical roots were obtained, of which two are statistically significant, showing a strong correlation between the two sets.
Four factors were considered, and the results show that less-mathematical subjects contribute significantly to the variation among the variables. Mathematical subjects are directly related to one another, as are less-mathematical subjects. However, mathematical subjects are inversely related to less-mathematical subjects, which means that an increase in students’ performance in mathematical subjects will lower their performance in less-mathematical subjects.

## TABLE OF CONTENTS

Title Page

Declaration

Certification

Dedication

Acknowledgement

Abstract

Table of Contents

List of Tables

CHAPTER ONE: General Introduction

1.1 Introduction

1.2 Statement of the Problem

1.3 Background to the Study

1.4 Aim and Objectives

1.5 Scope and Limitation of the Study

1.6 Definition of Terms and Concepts Used

CHAPTER TWO: Literature Review

2.1 The Concept of Academic Performance

2.2 Canonical Correlation Analysis of Students’ Performance

2.3 Factor Analysis of Students’ Performance

2.4 Other Statistical Analysis of Students’ Performance

CHAPTER THREE: Materials and Methodology

3.1 Data Collection

3.2 Mathematical Computation of Canonical Correlation Analysis

3.3 Mathematical Computation of Factor Analysis

3.3.1 Orthogonal Factor Model

3.3.2 Estimating Factor Scores

3.4 Method of Computation of Canonical Coefficient

3.5 Significance Test

3.5.1 Wilks’ Lambda Test

3.5.2 Bartlett’s Test

3.6 Interpretation

3.7 Statistical Package Used for the Study

CHAPTER FOUR: Results and Discussion

4.1 Introduction

4.2 Results from the Analysis of the Data Used

4.3 Discussion on the Result of the Analysis from the Data Used

CHAPTER FIVE: Summary, Conclusion and Recommendations

5.0 Introduction

5.1 Summary from Output of Canonical Correlation Analysis

5.2 Summary from Output of Factor Analysis

5.3 Conclusion

5.4 Recommendation

5.5 Contribution to Knowledge

References

Computed Data from the Outcome of Grade Scored by each Student

Appendix 1: Output of Canonical Correlation Analysis

Appendix 2: Output of Factor Analysis

## LIST OF TABLES

Table 4.1: Canonical Correlation Coefficient of Set-1 and Set-2

Table 4.2: Test that the Canonical Correlations are Zero

Table 4.3: Canonical Loading for Set-1 and Set-2

Table 4.4: Canonical Cross Loading for Set-1 and Set-2

Table 4.5: Bartlett’s Test

Table 4.6: Total Variance Explained by each Component

Table 4.7: Communalities Extracted by each Variable

Table 4.8: Factor Loadings showing Correlation between Factors and Variables

## CHAPTER ONE

General Introduction

### 1.1 Introduction

Multivariate statistics is a useful set of methods for analyzing a large amount of information in an integrated framework, focusing on the simplicity (Simon, 1969) and latent order (Wheatley, 1994) in a seemingly complex array of variables. The benefits of using multivariate statistics include an expanded sense of knowing, rich and realistic research designs, and flexibility built on similar univariate methods.

Canonical correlation analysis is the most generalized member of the family of multivariate statistical techniques and is directly related to several dependence methods. Like regression, canonical correlation aims to quantify the strength of a relationship, in this case between two sets of variables. It also resembles discriminant analysis in its ability to determine independent dimensions for each variable set, in this situation with the objective of producing the maximum correlation between the dimensions. Thus, canonical correlation identifies the optimum structure or dimensionality of each variable set that maximizes the relationship between the dependent and independent variable sets (Anastasi and Urbina, 1997; Harlow, 2005).

Numerous studies, such as those carried out by Fullana (1995) and Montero (1990), have sought to understand the factors which account for low or high achievement. Studies seeking to identify what determines academic failure or success frequently appear as a reaction to conditions of change, such as plans for educational reform, or in response to critical situations. For example, the 2010 National Examination Council (NECO) results show that fewer than twenty percent of those who sat for the examination obtained the subject combinations that can guarantee them admission into tertiary institutions.


In general, the various studies which attempt to explain academic failure or success begin with the three elements that intervene in education: parents (family causal factors), teachers (academic causal factors), and students (personal causal factors). Among the personal variables, the most studied are motivation and self-concept. Motivation is considered to be the element that initiates the subject’s own involvement in learning: when a student is strongly motivated, all of his or her effort and personality are directed towards the achievement of a specific goal, thus bringing to bear all of his or her resources. Some authors have found that subjects themselves attribute low or high performance to low or high ability and to luck (Valle, 1999), and an improvement in performance to motivation (task goal orientation), to self-regulating behaviors, and to competence as a function of task characteristics (Slater, 2002). In recent research, positive correlations were found between the value given to the task and the perceptions of self-efficacy and performance (Chia, 2002). However, in a theoretical review, Fuente (2002) showed how there has been a branching off toward the study of academic goals, to the detriment of those of a social nature, even though these have been shown to be especially important in the most disadvantaged social contexts.

A frequently applied paradigm in analyzing data from multivariate observations is to model the relevant information (represented in a multivariate variable X) as coming from a limited number of latent factors. Because each factor may affect several variables in common, they are known as “common factors”. Each variable is assumed to depend on a linear combination of the common factors, and the coefficients are known as loadings. Each measured variable also includes a component due to independent random variability, known as “specific variance” because it is specific to one variable.
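The common-factor structure described above can be illustrated with a small simulation. The following is a minimal sketch and not the analysis carried out in this study: the loading matrix `L`, the sample size and the number of factors are hypothetical values chosen for illustration, the data are synthetic, and the variables are assumed to be centred with unit variance. It generates four variables from two common factors plus specific components and compares the sample covariance matrix with the covariance implied by the model, `L @ L.T + diag(psi)`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): cases, observed variables, common factors.
n, p, m = 5000, 4, 2

# Hypothetical loading matrix: row i holds the loadings of variable i
# on the two common factors.
L = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.6]])

# Communality = share of each variable's variance due to the common factors.
communalities = np.sum(L**2, axis=1)
# Specific variances chosen so that every variable has unit total variance.
psi = 1.0 - communalities

F = rng.standard_normal((n, m))                    # uncorrelated common factors
eps = rng.standard_normal((n, p)) * np.sqrt(psi)   # independent specific components
X = F @ L.T + eps                                  # each variable = loadings x factors + specific part

implied = L @ L.T + np.diag(psi)                   # covariance implied by the model
sample = np.cov(X, rowvar=False)                   # covariance observed in the simulated data
max_abs_diff = np.max(np.abs(sample - implied))

print("communalities:", np.round(communalities, 2))
print("largest gap between sample and implied covariance:", round(float(max_abs_diff), 3))
```

With the values assumed here, the first two variables load mainly on the first factor and the last two on the second, so the communalities show how much of each variable’s variance is attributed to the common factors and how much is left as specific variance.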

In a survey on household consumption, for example, the consumption levels, X, of p different goods during one month could be observed. The variations and covariations of the p components of X throughout the survey might in fact be explained by two or three main social behavior factors of the household. For instance, a basic desire for comfort, the willingness to achieve a certain social level, or other latent social concepts might explain most of the consumption behavior (Hardle and Simar, 2003). Complex multivariate data structures are better understood by studying low-dimensional projections. For a joint study of two data sets, we may ask what type of low-dimensional projection helps in finding possible joint structures for the two samples (Hardle and Simar, 2003).

Bartholomew et al. (2008) described factor analysis as a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in four observed variables mainly reflect the variations in two unobserved variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors, plus “error” terms. The information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset. Computationally, this technique is equivalent to a low-rank approximation of the matrix of observed variables.

Shen et al. (2009) stated that canonical correlation analysis is a standard tool of multivariate statistical analysis for the discovery and quantification of associations between two sets of variables. Specifically, this analysis allows us to investigate the relationship between two sets of variables. For example, an educational researcher may want to compute the simultaneous relationship between three measures of scholastic ability and five measures of success in school. A medical researcher may want to study the relationship of various risk factors to the development of a group of symptoms.
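In symbols, the idea can be sketched as follows. The notation is assumed here for illustration: $\mathbf{X}$ and $\mathbf{Y}$ denote the two sets of variables, $\mathbf{a}$ and $\mathbf{b}$ are vectors of weights, and $\Sigma_{XX}$, $\Sigma_{YY}$ and $\Sigma_{XY}$ are the within-set and between-set covariance matrices. Canonical correlation analysis looks for the pair of linear combinations $U = \mathbf{a}'\mathbf{X}$ and $V = \mathbf{b}'\mathbf{Y}$ whose correlation is as large as possible:

$$
\rho_1 = \max_{\mathbf{a},\,\mathbf{b}} \operatorname{Corr}(U, V)
       = \max_{\mathbf{a},\,\mathbf{b}}
         \frac{\mathbf{a}'\Sigma_{XY}\,\mathbf{b}}
              {\sqrt{\mathbf{a}'\Sigma_{XX}\,\mathbf{a}}\;\sqrt{\mathbf{b}'\Sigma_{YY}\,\mathbf{b}}}.
$$

Further pairs are obtained in the same way, subject to being uncorrelated with the pairs already found. With $p$ variables in one set and $q$ in the other, there are at most $\min(p, q)$ such pairs, so the educational example above, with three measures in one set and five in the other, yields at most three canonical correlations.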


This study examines evidence of how students’ preference for, and close attention to, some subjects can affect their performance in other subjects that receive little or no preference and attention. It is believed that such evidence will make it possible to counsel students adequately on how to prepare for examinations.

### 1.2 Statement of the Problem

Much of the research that applies canonical correlation focuses on management, for example “Study of Asset Liability Management in Indian Banks, Canonical Correlation Analysis” by Ranjan and Nallari (2005). The technique is rarely applied to students’ academic performance with a view to establishing the relationship between the mathematical and less-mathematical subjects studied in secondary schools. Similarly, numerous studies have used factor analysis to address health and socio-economic issues, while little has been done on students’ academic performance in the mathematical and less-mathematical subjects studied in secondary schools. In this thesis, we use a combination of canonical correlation and factor analysis to assess the academic performance of students of Command Secondary School, Kaduna, using mathematical and less-mathematical subjects.

### 1.3 Background to the Study

Predicting students’ performance in an academic program is a difficult but useful undertaking. Many studies on predicting academic performance, such as those by Mustapha et al. (2010) and Wooten (1998), have been carried out using different possible influencing factors. While some looked at the teaching and learning processes, a good atmosphere for learning and students’ behaviors, others looked at the examination outcomes of the students.

Most of the observable phenomena in the empirical sciences are of a multivariate nature. In financial studies, assets in stock markets are observed simultaneously and their joint development is analyzed to better understand general tendencies and to track indices. In medicine, recorded observations of subjects in different locations are the basis of reliable diagnoses and prescriptions. In quantitative marketing, consumer preferences are collected in order to construct models of consumer behavior. The underlying theoretical structure of these and many other quantitative studies in the applied sciences is multivariate.

Factor analysis has provoked rather turbulent controversies throughout its history. Its modern beginnings lie in the early twentieth-century attempts of Karl Pearson, Charles Spearman and others to define and measure “intelligence”. Because of this early association with constructs such as intelligence, factor analysis was nurtured and developed primarily by scientists interested in psychometric measurement. The advent of high-speed computers has generated a renewed interest in the theoretical and computational aspects of factor analysis. Most of the original techniques have been abandoned and early controversies resolved in the wake of recent developments. It is still true that each application of the technique must be examined on its own merits to determine its success (Richard and Dean, 1992).

Although the numerical methods required for canonical correlation analysis are much more complex than those required for computing a bivariate coefficient, canonical correlation analysis can be understood conceptually in terms familiar from bivariate analysis. Canonical correlation analysis computes derived variables U and V such that the correlation between U and V is as large as possible. It computes several such relations between the independent and dependent variables, and each relation indicates a distinct pattern that exists in the data. Canonical correlation analysis reduces each of these patterns to derived variables, the canonical U and V variables. This means, for example, that each such relation can be visually inspected using a familiar bivariate scatter diagram. The largest canonical correlation corresponds to the strongest relation between the independent and dependent variables. Subsequent canonical correlations correspond to relations of decreasing strength. The significance of this feature for human factors research is that we often find different response patterns under different environmental conditions.

### 1.4 Aim and Objectives

The aim of this research work is to evaluate students’ academic performance through the relationship between mathematical and less-mathematical subjects. This aim will be achieved through the following objectives:

i. To determine the relationship between mathematical and less-mathematical subjects using canonical correlation.

ii. To investigate the homogeneity of variance among the subjects using Bartlett’s test.

iii. To find out the variables that contribute significantly to the percentage of variance using factor analysis.

### 1.5 Scope and Limitation of the Study

The study is centered on analyzing some SSCE results from Command (Day) Secondary School, Kaduna, conducted by the National Examination Council (NECO) in the years 2008, 2009 and 2010. Linear canonical correlation and factor analyses were used for the study. Bartlett’s test was used to test for the homogeneity of variance of the variables. The data used (the results of the students) were not individually verified to ascertain whether the scripts marked by the examination council are error free. The data are also secondary, so it cannot be established whether other factors influenced the final result of each student, either while the examination was being written or while the scripts were being marked by the assessors. The data also come from a single school, because many secondary schools refused to release their students’ Senior Secondary Certificate Examination (S.S.C.E) results for fear of their being exposed to the general public.

### 1.6 Definition of Terms and Concepts Used

#### 1.6.1 Matrix

A rectangular array of numbers enclosed by a pair of brackets such as

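$$
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
$$

is called a matrix, and both are called matrices. The matrix $\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$ is called a row vector and $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ is called a column vector.

#### 1.6.2 Eigenvalue and Eigenvector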

Let $A$ be a symmetric matrix. The roots $\lambda$ of the determinant equation $|A - \lambda I| = 0$ are called the eigenvalues (or characteristic values or latent roots) of the matrix $A$. If, in addition, there is a vector $x \neq 0$ such that $Ax = \lambda x$, then the vectors $x$ which satisfy this equation are called the eigenvectors (or characteristic vectors) of $A$.

#### 1.6.3 Correlation

A measure of the degree of a relationship between two sets of variables is called correlation. The value of the association is called the correlation coefficient, which lies between $-1$ and $+1$.

#### 1.6.4 Linear Compound

Suppose $X$ is an n-dimensional random variable with $E(X) = \mu$ and $\operatorname{Var}(X) = \Sigma$. Then, if $a$ is a vector of constants, $a'X$ is a random variable called a linear compound.

#### 1.6.5 Canonical Variates and Canonical Correlation

As in the definition of a linear compound above, suppose there are two linear compounds $U = a'X$ and $V = b'Y$, where $X$ and $Y$ are two sets of variables. Then the values of $U$ and $V$ are called canonical variates and their correlation is called the canonical correlation.

#### 1.6.6 Canonical Weight

The canonical weights, also called canonical function coefficients or canonical coefficients, are the coefficients $a$ and $b$ of the linear compounds $U = a'X$ and $V = b'Y$. The standardized canonical weights are used to assess the relative importance of each individual variable’s contribution to a given canonical correlation. Canonical coefficients are the standardized weights in the linear equation of variables which creates the canonical variables.

#### 1.6.7 Canonical Scores

Canonical scores are the values on a canonical variable for a given case, based on the canonical coefficients for that variable. The canonical coefficients are multiplied by the standardized scores of the case and summed to yield the canonical score for each case in the analysis.

#### 1.6.8 Structure Correlation Coefficients (Canonical Loading)

A structure correlation, also called a canonical factor loading, is the correlation of a canonical variable with the standardized scores of an original input variable. The table of structure correlations is sometimes called the factor structure. The squared structure correlation indicates the contribution made by a given variable to the explanatory power of the canonical variate based on the set of variables to which it belongs.

#### 1.6.9 Canonical Communality Coefficient

This is the sum of the squared structure coefficients for a given variable. The canonical communality coefficient measures how much of a given original variable’s variance is reproducible from the canonical variables. As in factor analysis, the original variables with low canonical communality are those for which the model is not working, and the researcher may use this information to drop such variables from the analysis.
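The quantities defined in Sections 1.6.5 to 1.6.9 can be illustrated numerically. The sketch below is an illustration under stated assumptions and is not the NCSS computation used in this study: the data are synthetic (90 cases, three variables in one set and five in the other), the function names `standardize` and `canonical_analysis` are hypothetical, and the QR and singular value decomposition route is one of several standard ways of obtaining canonical correlations.

```python
import numpy as np

def standardize(M):
    """Centre each column and scale it to unit sample variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

def canonical_analysis(X, Y):
    """Canonical correlations, weights, scores and loadings (illustrative helper)."""
    n = X.shape[0]
    Zx, Zy = standardize(X), standardize(Y)
    Qx, Rx = np.linalg.qr(Zx)                       # orthonormal basis for each set
    Qy, Ry = np.linalg.qr(Zy)
    P, rho, Qt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    A = np.linalg.solve(Rx, P) * np.sqrt(n - 1)     # standardized canonical weights, set 1
    B = np.linalg.solve(Ry, Qt.T) * np.sqrt(n - 1)  # standardized canonical weights, set 2
    U, V = Zx @ A, Zy @ B                           # canonical scores (unit variance)
    load_x = Zx.T @ U / (n - 1)                     # structure correlations, set 1
    load_y = Zy.T @ V / (n - 1)                     # structure correlations, set 2
    return rho, A, B, U, V, load_x, load_y

# Synthetic data (assumption): a shared component g links the two sets.
rng = np.random.default_rng(1)
n = 90
g = rng.standard_normal((n, 1))
X = g + rng.standard_normal((n, 3))
Y = 0.5 * g + rng.standard_normal((n, 5))

rho, A, B, U, V, load_x, load_y = canonical_analysis(X, Y)

# Canonical communality: sum of squared structure coefficients per variable.
comm_x = np.sum(load_x**2, axis=1)
comm_y = np.sum(load_y**2, axis=1)

print("canonical correlations:", np.round(rho, 3))
print("communalities, set 1:", np.round(comm_x, 3))
print("communalities, set 2:", np.round(comm_y, 3))
```

Because the smaller set has three variables, three pairs of canonical variates are extracted. In this sketch the three canonical variates of the smaller set reproduce its variables completely, so their communalities equal one, while the communalities of the five-variable set fall below one.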

The canonical variate adequacy coefficient is the average of all the squared structure coefficients for one set of variables (the dependent or independent set) with respect to a given canonical variable. It is also the percentage of variance in the set of variables (e.g. the independent set) extracted by the canonical variate for that set.

#### 1.6.10 Factor Model

Let $X = (X_1, X_2, \dots, X_p)'$ be the set of observed variables, which are linearly related to a set of unobservable latent variables $F_1, F_2, \dots, F_m$ such that $m < p$. The general model is

$$
X_i = \mu_i + \ell_{i1}F_1 + \ell_{i2}F_2 + \cdots + \ell_{im}F_m + \varepsilon_i, \qquad i = 1, 2, \dots, p.
$$

We refer to $F_1, \dots, F_m$ as the common factors and $\varepsilon_1, \dots, \varepsilon_p$ as the specific factors. The quantities $\ell_{ij}$ are called factor loadings. When the variables are standardized, they are the correlations of items (rows) with factors (columns).

#### 1.6.11 Hypotheses

In practice, it is often necessary to make decisions about a population based on sample information (data). Such decisions are called statistical decisions. To reach a decision, it is useful to make assumptions (or guesses) about the population involved. Such assumptions, which may or may not be true, are called statistical hypotheses. They are generally statements about the probability distribution of the population. A null hypothesis is a statement made for the main purpose of rejecting it. If the null hypothesis is rejected, then there must be an alternative. This alternative is called the alternative hypothesis, and it is accepted if the null hypothesis is rejected.

#### 1.6.12 Latent Variable Analysis

Latent variable analysis focuses on three or more variables, at least one of which is unobserved. As such, latent variable analysis is appropriate any time a researcher wishes to evaluate a theoretically posited relationship among concepts by examining measures designed to measure those concepts. Such a definition means that virtually all research is latent variable analysis. In this study, latent variable analysis refers to factor analysis.
