Application of multivariate analysis to agronomic trial
Table Of Contents
- Title page
- Declaration
- Certification
- Dedication
- Acknowledgment
- Abstract
- Table of Contents
Chapter ONE
INTRODUCTION
- General Introduction
- 1.0 Introduction
- 1.1 Statement of the Problem
- 1.2 Objective of the Study
- 1.3 Significance of the Study
- 1.4 Brief History of Cowpea
- 1.5 Definition of Terms
- 1.6 Contribution to Knowledge
Chapter TWO
LITERATURE REVIEW
- 2.0 Introduction
- 2.1 Normality
- 2.2 Path Analysis
- 2.3 Factor Analysis
- 2.3.1 Principal Component Analysis
- 2.3.2 Common Factor Analysis
Chapter THREE
SYSTEM DESIGN AND IMPLEMENTATION
- 3.0 Source of Data
- 3.1 Quantile-Quantile Plot
- 3.2 Population Principal Components
- 3.3 Factor Analysis Theory
- 3.4 Path Analysis
Chapter FOUR
SYSTEM TESTING AND EVALUATION
- Results and Discussion
- 4.0 Introduction
- 4.1 Q-Q Plot to Test for Normality
- 4.2 Principal Component Scree Plot Analysis
- 4.3 Principal Components Extraction
- 4.4 Partitioning using Path Analysis
Chapter FIVE
SUMMARY, CONCLUSION AND RECOMMENDATIONS
- Conclusion and Recommendation
- 5.1 Summary and Conclusion
- 5.2 Recommendation
References
Appendices
Thesis Abstract
Bivariate correlation analysis was carried out on the growth and yield characters of cowpea
(Vigna unguiculata [L.] Walp). Pod yield and number of pods were observed to
be positively and significantly correlated with all the components assessed
(Number of Branches, Weight of Defoliated Leaves, Number of Pods, Length of
Pods, Shoot Dry Weight, Plant Height, Number of Leaves, Leaf Area Plant, and
Crop Growth Rates) except leaf area index and weight of defoliated leaves. The
various parameters exhibited significant interrelationships with one another. Principal
component analysis reduced the 10 growth characters considered in the study to
3 components, which together explain about 95.57% of the variation. All variables considered
showed normality, with the points of the Q-Q plots falling along a straight line, indicating that
the data follow a multivariate normal distribution.
Thesis Overview
<p>
GENERAL INTRODUCTION<br>1.0 Introduction<br>Multivariate analysis refers to any statistical technique used to analyze data that arise from more than one variable. This essentially models reality, where each situation, product, or decision involves more than a single variable. Multivariate analysis can be used for both spectral and non-spectral types of data. Spectral data are essentially data derived by the use of spectroscopic instruments. Such data characterize a variable by its properties and provide a great deal of useful information about organic molecules. Spectral data are used by scientists to: (i) discover the chemical composition of materials by looking at the light (and other kinds of electromagnetic radiation), and (ii) identify and monitor the production of products in factories.<br>The advent of gas chromatograph (GC) machines with non-destructive detectors has made research in flavor more interactive, and some GCs now incorporate sniffing ports to enable trained assessors to smell individual flavor compounds as they are separated by the GC.<br>Non-spectral data are essentially data collected from sensory and environmental sources. Any data collected from sources other than spectroscopic and chromatographic instruments are non-spectral data.<br>Principal component analysis (PCA) and cluster analysis (CA) are the most common multivariate statistical methods in environmental studies. Principal component analysis is widely used to reduce data dimensionality (Salawu, 2008) and to extract a small number of latent factors for analyzing relationships among the observed variables.
If large differences exist in the standard deviations of the variables, PCA results will vary considerably depending on whether the correlation or covariance matrix is used (Farnham et al., 2003).<br>Factor analysis (FA) is a statistical approach that can be used to analyze interrelationships among a large number of variables and to explain these variables in terms of their common underlying dimensions (factors). The approach involves finding a way of condensing the information contained in a number of original variables into a smaller set of dimensions (factors) with a minimum loss of information (Everitt and Dunn, 2001).<br>Agronomy is the study of soils and plants, relating to the scientific study of soil management, land cultivation and crop production. It describes plant characteristics that are important during the growth and development of a crop, e.g. height and stem strength. Data on cowpea were collected and used for the analysis.<br>1.1 Statement of the Problem<br>Over the years, the measurable dimensions of agricultural trials have become numerous.
This leads to cumbersome and time-consuming tasks in the gathering and analysis of data from agricultural experiments.<br>Efforts were made in this thesis to reduce the dimensionality of the measurements from m to p (p < m) such that maximum information is retained and a large percentage of the variation is explained with the use of fewer variables.<br>1.2 Objective of the Study<br>The specific objectives of the study are:<br>• To test the multivariate normality assumption on an agronomic trial.<br>• To study the relationships among the measured agronomic characteristics in both the growth and yield parameters.<br>• To determine the percent contribution of both direct and indirect correlation in path analysis.<br>• To reduce the dimensionality of the measured characteristics from m to p, where p < m.<br>1.3 Significance of the Study<br>The significance of this study is that the model constructed will provide:<br>• The agriculturist with useful information on the most prominent characteristics to use in crop management.<br>• The agriculturist with information on the direct and indirect contributions of the crop characteristics.<br>• A demonstration of the application of Path Analysis and Principal Component Analysis in agricultural experiments to obtain useful information.<br>1.4 Brief History of Cowpea<br>Cowpea (Vigna unguiculata (L.) Walp) is the common name for any of a genus of leguminous herbs, also known as black-eyed pea. Cowpeas are sprawling or twining herbs with trifoliate leaves and pods 20 to 30 cm long enclosing several kidney-shaped seeds. Cowpeas, originally native to Asia, are now an important forage and cover crop in the southern United States and Africa.
Worldwide cowpea production has increased dramatically in the last 25 years (Adebayo and Tukur, 1999).<br>Being a drought-tolerant, warm-weather crop, cowpea is well adapted to the drier regions of the tropics where other food legumes do not perform well.<br>It also has the unique ability to fix atmospheric nitrogen through its nodules, and it grows well even in soil with more than 85% sand, less than 0.2% organic matter and a low level of phosphorus, where the soil pH is in the range of 5.5 to 6.5 (Tuan and Philips, 1992).<br>Cowpea is consumed in many forms: young leaves, green pods and green seeds are used as vegetables, and dry seeds are used in various food preparations. With over 25% protein in its seeds and tender leaves, cowpea is a major source of protein, minerals and vitamins in the daily diet. Cowpea seed is therefore valued as a nutritional supplement to cereals and as animal feed. Its seed is a nutritious component of the human diet as well as of livestock feed, with nutrient content as follows: Protein 24.8%, Fat 1.9%, Fiber 6.3%, Carbohydrate 63.6%, Thiamine 0.00074%, Riboflavin 0.00042% and Niacin 0.00281%; it thus has a positive impact on the health of women and children (Van and Gerike, 2000).<br>1.5 Definition of Terms and Concepts<br>Variance (MS): MS is the short form of mean sum of squares (MSS), which is calculated by dividing the sum of squares by its degrees of freedom. One 'S' was dropped in usage, giving the short form MS. It is the most important measure of dispersion and has a crucial role in the biostatistical analysis of experimental data and in tests of significance. The MS in the analysis of variance table is the same as the variance.
Mathematically, it is the square of the standard deviation.<br>Coefficient of Variation (C.V): The calculated standard deviation expressed as a percentage of the mean.<br>Completely Randomized Design: The simplest experimental design, used where the experimental material is uniform, e.g. in the laboratory, greenhouse, screen house or growth chamber.<br>Continuous Variable: A variable that can take any value between certain limits, e.g. yield, weight, etc.<br>Number of Pods: The average number of pods in a stand of cowpea.<br>Length of Pods: The length of the pod from its base to its tip.<br>Shoot Dry Weight<br>Plant Height: The height of the cowpea stand measured from the ground to the longest leaf tip or panicle.<br>Number of Leaves: This will be determined by counting the total number of fully expanded leaves on the tagged plants and computing the average.<br>Leaf Area Plant: This will be determined by measuring the length and maximum width of the leaves of five tagged plants and multiplying their product by a constant coefficient of 0.75 (Montgomery, 1911).<br>Leaf Area Index: This will be determined by measuring, with a ruler, the length and the breadth at the widest portion of each functional leaf of the tagged plant. The product of length and breadth will then be multiplied by a factor of 0.75 to obtain the area of the individual leaves of the sampled plants; these areas will be added and divided by the land area occupied by the tagged plants.<br>$\mathrm{LAI} = \dfrac{A}{P} = \dfrac{\text{leaf area per plant}}{\text{ground area subtended by the plant}}$<br>where A = leaf area per plant (m²)<br>P = ground area per plant (m²)<br>LAI = leaf area index.<br>Crop Growth Rates: Three plants will be randomly sampled from each sub-plot for destructive sampling; these will be oven-dried at 70 °C for 24 hours, and the crop growth rate computed using the formula by Smith and Haddad (2000).
The unit of measurement is g cm⁻² wk⁻¹.<br>$\mathrm{CGR} = \dfrac{W_2 - W_1}{t_2 - t_1}$ (1.x)<br>W₁ = dry matter weight at a sampling period<br>W₂ = dry matter weight at the next sampling period<br>t₁ = time at which W₁ was taken<br>t₂ = time at which W₂ was taken<br>Path Model: A path model is a diagram relating independent, intermediary, and dependent variables. Single arrows indicate causation between exogenous or intermediary variables and the dependent(s). Arrows also connect the error terms with their respective endogenous variables. Double arrows indicate correlation between pairs of exogenous variables.<br>Endogenous Variables: These are variables which form an internal part of the system. Endogenous variables are those which have incoming arrows. They include intervening causal variables, which have both incoming and outgoing arrows in the path diagram. The dependent variable(s) have only incoming arrows.<br>Exogenous Variables: These are variables which form an external part of the system. Exogenous variables are those with no explicit causes (no arrows going to them, other than the measurement error term). If exogenous variables are correlated, this is indicated by a double-headed arrow connecting them.<br>Causal Paths: The causal paths to a given variable include (1) the direct paths from arrows leading to it, and (2) correlated paths from endogenous variables correlated with others that have arrows leading to the given variable.<br>Path Coefficient: A path coefficient is a standardized regression coefficient (beta) showing the direct effect of an independent variable on a dependent variable in the path model.
Thus, when the model has two or more causal variables, path coefficients are partial regression coefficients which measure the extent of the effect of one variable on another in the path model, controlling for other prior variables and using standardized data or a correlation matrix as input.<br>Effect Decomposition: Path coefficients may be used to decompose correlations in the model into direct and indirect effects. This is based on the rule that, in a linear system, the total causal effect of variable i on variable j is the sum of the values of all the paths from i to j.<br>Disturbance Terms: The residual error terms reflect unexplained variance (the effect of unmeasured variables) plus measurement error.<br>1.6 Contribution to Knowledge<br>This research examines the application of Principal Component Analysis and Path Analysis in the reduction and partitioning of agricultural data.
<br></p>
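The path coefficient and effect decomposition definitions above can be illustrated with a minimal sketch. It assumes the simplest path model, in which several correlated exogenous characters each point directly at one dependent character. The function name `path_decomposition` and the correlation values are hypothetical and chosen only for illustration; they are not the thesis data and not necessarily the procedure the author used.

```python
import numpy as np

def path_decomposition(r_xx, r_xy):
    """Split each predictor-response correlation into direct and indirect parts.

    r_xx: correlation matrix among the predictor characters.
    r_xy: correlations of each predictor with the dependent character.
    """
    r_xx = np.asarray(r_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    # Direct effects: standardized partial regression coefficients p
    # that solve R_xx p = r_xy.
    direct = np.linalg.solve(r_xx, r_xy)
    # indirect[i, j] = r_ij * p_j: effect of predictor i routed through predictor j.
    indirect = r_xx * direct[np.newaxis, :]
    np.fill_diagonal(indirect, 0.0)
    # Direct plus all indirect paths reproduces the original correlation.
    total = direct + indirect.sum(axis=1)
    # Disturbance (residual) path: sqrt(1 - R^2), with R^2 = sum_i p_i * r_iy.
    r_squared = float(direct @ r_xy)
    residual = float(np.sqrt(max(0.0, 1.0 - r_squared)))
    return direct, indirect, total, residual

# Made-up correlations for two predictors and one response (illustration only).
r_xx = [[1.0, 0.5],
        [0.5, 1.0]]
r_xy = [0.7, 0.6]
direct, indirect, total, residual = path_decomposition(r_xx, r_xy)
print("direct effects:  ", direct)
print("indirect effects:", indirect.sum(axis=1))
print("total (= r_xy):  ", total)
print("residual path:   ", residual)
```

Dividing each direct or indirect term by its row total gives that path's share of the correlation. This is one plausible way to express a percent contribution, stated here as an assumption.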