Multiple Factor Analysis by MML Estimation

C. S. Wallace
Abstract: Previous work by Wallace and Freeman (1992) on the application of Minimum Message Length (MML) estimation to the Factor Analysis model of a multivariate Gaussian population is extended to allow several common factors. The extension is not trivial, raising problems in the choice of an appropriate prior for the factors, and in the evaluation of the Fisher information of the model. However, the resulting complications appear to cancel out, giving an estimator similar to the single-factor case. The estimator has been extensively tested on simulated data, and compared with the maximum likelihood and Akaike Information Criterion (AIC) estimator. The MML estimator is found to be substantially more accurate, to provide consistent estimates of factor scores, and to recover the number of common factors more reliably than a likelihood-ratio test among maximum likelihood models.

Keywords: Minimum Message Length, MML, factor analysis, multivariate Gaussian.
1. Introduction

1.1 The model

We have developed and tested a Minimum Message Length (MML) estimator for a version of the well known Factor Analysis model for a multivariate population, here assumed to have a Gaussian density. The work extends an earlier work (Wallace and Freeman 1992), which addressed the restricted problem with a single factor. Here, we consider models with an unknown number of factors. In some respects the earlier work generalizes in a straightforward way, and we assume familiarity with it. For consistency, we retain and extend the perhaps unusual notation of the single-factor paper. Hereafter, that paper is referred to as WF. The data is a sample of N data vectors {xn = (xnk, k=1...K), n=1...N}, modelled as

    xnk = μk + Σ(j=1...J) vnj ajk + σk rnk        (1.1)
where parameter μk is the mean in dimension k, parameter σk is the "specific variation" in dimension k, parameters {aj, j=1...J} are a set of J "Factor load" vectors each of dimension K, parameters {vj, j=1...J} are a set of J "score" vectors each of dimension N, and the NK variates {rnk, n=1...N, k=1...K} are regarded as having independent N(0,1) distributions (Harman 1967).

The conceptual model underlying factor analysis is that there exist some hidden, or latent, variables which have a linear effect on the K variables actually measured. The aim of a factor estimation method is to discover, to the extent revealed by the data, the number J of these variables, their values {vnj} for each member of the sample, and their effects on the measured variables, represented in the model by the factor load vectors {aj}. The model further assumes that each measured variable is subject, independently of all other variables, to some Normally-distributed variation or measurement error, rnk, which is not affected by the hidden variables. The magnitude of this "specific" variance is modelled by the set of parameters {σk}.

In the Maximum Likelihood (ML) treatment of the factor model, the model actually estimated is less complete. As ML is incapable of consistent simultaneous estimation of both factor loads and factor scores, the latter are treated not as parameters but as random variables with N(0,1) distributions, and the inferred model contains no inference of their individual values.

The model (1.1) contains ambiguities. First, the factor load vector aj and factor score vector vj appear in the model only via their product. Thus the data cannot reveal their individual lengths. (By "length", we mean throughout the Euclidean vector length √(aᵀa) = |a|.) This ambiguity is conventionally resolved by assuming the scores {vnj} to be parameters with independent unit Normal priors. An alternative resolution is to constrain the scores so that Σn vnj² = N for all j. This alternative is mathematically more difficult and will not be pursued. A more troublesome ambiguity is that the data cannot reveal the individual factor loads. For each load vector aj, define the scaled load tj given by

    tjk = ajk / σk,  k=1...K

The data can constrain only the subspace spanned by these "true" scaled load vectors, hereafter called TLVs. Accordingly, the estimator works with a set of mutually orthogonal vectors {βj} spanning the same subspace of the scaled space, hereafter called the Orthogonal Load Vectors (OLVs).
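For concreteness, a minimal simulation of the generative model (1.1) follows; this is a sketch in Python/numpy, with sizes and variable names of our own choosing rather than the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, J = 300, 16, 3           # sample size, measured dimensions, latent factors

mu = rng.normal(size=K)        # means mu_k
sigma = np.full(K, 1.0)        # specific standard deviations sigma_k
A = rng.normal(size=(J, K))    # factor load vectors a_j, each of dimension K
V = rng.normal(size=(N, J))    # factor scores v_nj, unit Normal as assumed

# model (1.1): x_nk = mu_k + sum_j v_nj * a_jk + sigma_k * r_nk
X = mu + V @ A + sigma * rng.normal(size=(N, K))
```

The sample covariance of X about its mean is then the only input needed by the estimators discussed below.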
1.2 MML

Minimum Message Length (Wallace and Freeman 1987), or MML, is a Bayesian principle for model selection and estimation. Using the result from Information Theory that the most efficient binary encoding of an event or value of probability P has length −log₂ P binary digits, MML envisages encoding the data as a two-part message which first states a model and its parameter estimates, then encodes the data using a code which would be optimal were that model true. MML chooses the model and parameter estimates which minimize the total message length. Among its virtues are its invariance under non-linear transformations of the parameter space, and its avoidance of the over-fitting to which Maximum Likelihood (ML) is prone. For convergence theorems see Barron and Cover (1991), Wallace (1989). If the set of models contemplated is a family of models of increasing order and number of parameters, as is the case in the Factor problem, MML chooses the order and parameter values in one step.
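A toy illustration of the two-part message idea, assuming numpy; the scenario (Bernoulli trials, a fixed parameter precision) is ours, not the paper's.

```python
import numpy as np

# Two-part message for n Bernoulli trials: the first part states an estimate
# of p to some finite precision, the second encodes the data assuming that p.
# MML chooses the estimate (and its precision) minimizing the total length.
def two_part_bits(heads: int, n: int, p: float, param_bits: float) -> float:
    data_bits = -(heads * np.log2(p) + (n - heads) * np.log2(1.0 - p))
    return param_bits + data_bits

print(two_part_bits(60, 100, 0.6, 8.0))   # about 8 + 97.1 bits
```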
Regarded simply as an order-selection process, MML shares with Schwarz (1978) the property of penalizing the log-likelihood by an amount of order (D/2) log N, where D is the number of free parameters of the model. Both Bozdogan (1987) and Schwarz thus argue that the choice of order should be based on maximizing a quantity like

    log P(x|θ) − (D/2) log N
For certain parameters of the factor model, notably the factor scores, the amount of data bearing on each parameter does not increase with N, so such asymptotic penalties cannot be applied directly.
The aim of this paper is not to show how prior information can be included in a Factor analysis (although the work presented here could be extended to do so) but rather to show that even when an uninformative prior is assumed, MML gives results different from, and superior to, the results from maximizing the likelihood.

1.3 Related Work

Press and Shigemasu (1989) surveyed several Bayesian approaches to Factor analysis. A property shared by them all is that point estimates of factor scores and loads are obtained as either modes or means of the posterior density, or, in most methods, modes of a marginal posterior density. Such estimators are not invariant under non-linear transformations of the parameter space. For instance, they would give different results were the factor load vectors to be specified in Polar rather than Cartesian co-ordinates. Press and Shigemasu described a new formulation which allows an explicit expression for the marginal posterior of factor scores. Adopting its mean as an estimate, they find a posterior density for the loads, and are able to give credibility intervals for elements of the load vectors. The method appears to require an informative prior for the loads, and thus by-passes questions of the number and identification of latent factors.

Akaike (1987) describes his Information Criterion (AIC) as a means of choosing a model which has maximal predictive accuracy in the sense of minimizing the expected Kullback-Leibler distance between the population and model distributions. He proposed AIC, basically minus twice the maximized log-likelihood plus twice the number of free parameters, the model minimizing AIC being preferred.
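In code, the criterion is one line; the helper below is an illustrative sketch (names ours), applied to candidate factor models fitted by ML with differing numbers of factors.

```python
def aic(max_log_likelihood: float, n_params: int) -> float:
    # AIC = -2 log L + 2 D; the model minimizing AIC is preferred
    return -2.0 * max_log_likelihood + 2.0 * n_params

# e.g. choosing among ML factor models with J = 2, 3, 4 factors:
# best_J = min((2, 3, 4), key=lambda J: aic(logL[J], n_params[J]))
```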
2. Priors

In this paper, we attempt to adopt priors which are as nearly as possible uninformative. In considering the choice of a prior for the latent variable effects (the "factor loadings" aj) the estimate of σk for each measured variable may be taken, in default of any other measure, as determining a natural scale for that variable. The effects of hidden variable j on the measured variables, when expressed in these natural scales, are given by the vector tj, and the entire factor model is most "naturally" expressed in terms of measured and hidden variables represented in the "naturally scaled" space. We therefore suppose the prior density over the length and direction of a factor load vector to be colourless in the scaled space. In particular, we suppose tj to have a spherically symmetric prior density, and each of the J such vectors to be an independent realization of this density. This assumption does not carry over to the priors assumed for the estimated OLV vectors {βj}. If the "true" scaled effects of the hidden variables are given by the TLV vectors {tj}, we have no reason to suppose that these vectors will be mutually orthogonal. Indeed, we assume their directions to be independently uniformly distributed, this being the vaguest possible prior for their directions.

Pro tem, assume the TLVs to have independent spherically symmetric prior densities of K-dimensional Normal form with scale parameter ρ. If a set of J vectors {uj} are drawn from this density, the joint density of their lengths {uj = |uj|} is

    Pr({uj}) ∝ Π(j) uj^(K−1) exp(−uj² / (2ρ²))        (2.1)
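A quick Monte-Carlo check of (2.1), assuming numpy; the length density of a spherical K-dimensional Normal vector is the chi form, whose mode is ρ√(K−1).

```python
import numpy as np

rng = np.random.default_rng(0)
K, rho, M = 16, 1.0, 200_000
# lengths of independent spherical Normal vectors follow (2.1):
# Pr(u) proportional to u^(K-1) exp(-u^2 / (2 rho^2)), mode at rho*sqrt(K-1)
u = np.linalg.norm(rng.normal(scale=rho, size=(M, K)), axis=1)
hist, edges = np.histogram(u, bins=200)
mode = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
print(mode, rho * np.sqrt(K - 1))   # both close to 3.87 for K=16
```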
If now a set of J mutually-orthogonal vectors {cj} is constructed such that Σj cj cjᵀ = Σj uj ujᵀ, so that the two sets imply the same population density, the joint distribution of the lengths {cj = |cj|} is proportional to (Muirhead, 1982, p.107)

    Pr({cj}) ∝ [Π(j) cj^(K−1) exp(−cj² / (2ρ²))] × Π(j<l) |cj² − cl²| / (cj cl)        (2.2)
The density (2.2) has the same form as (2.1) save for a "correction" factor which suppresses the density if two or more vectors have similar length, and inflates the density when one or more vectors are small. Thus, starting from the rather vague prior over the TLVs, namely that they have independent spherically-symmetric Normal densities, we are led to conclude that an appropriate prior for the OLVs actually estimated shows their lengths not to be independently distributed. It has been suggested that one might prefer to model the OLV prior directly, presumably assuming the OLV lengths to be independent. However, our conceptual model is that the observed data are linearly affected by several distinct hidden variables, and there seems no prior reason to suppose that the effects of these variables on the measured variables should bear any relation to one another such as orthogonality. If this argument is accepted, then we expect the lengths of the OLVs rarely to be close to equality. A pair of OLVs can have equal length only in the unlikely coincidence that a pair of hidden variables (TLVs) happens to have equal lengths and to be orthogonal. Form (2.2) expresses the prior belief that such coincidences are unlikely. The lengths of the v and β
vectors for an OLV are confounded in the model, as only their product affects the likelihood. Thus, the estimator really estimates the product |vj| |βj| for each factor. Combining (2.2) with the uniform direction priors gives an OLV length prior of the form

    h({βj}) = (C 2^J J! / ρ^(KJ)) [Π(j) |βj|^(K−1) exp(−|βj|² / (2ρ²))] Π(j<l) | |βj|² − |βl|² | / (|βj| |βl|)        (2.3)

where C is a normalization constant independent of ρ. The factor 2^J J! is included because the model is unchanged by simultaneous negation of βj and vj, and the labelling of the OLVs is arbitrary. An estimator was constructed using the prior (2.3).
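The length repulsion built into (2.2) and (2.3) is easy to check by simulation. This sketch (numpy; sizes illustrative) draws sets of J=2 independent Gaussian TLVs and computes the corresponding OLV lengths, which are the singular values of the matrix of TLVs.

```python
import numpy as np

rng = np.random.default_rng(1)
K, J, M = 16, 2, 50_000
T = rng.normal(size=(M, K, J))            # M independent sets of J TLVs
s = np.linalg.svd(T, compute_uv=False)    # OLV lengths = singular values
ratio = s[:, 1] / s[:, 0]                 # shorter/longer OLV length, in (0, 1]
print((ratio > 0.95).mean())              # near-equal OLV lengths are rare
```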
The estimator performed well in trials on simulated data provided the lengths of the load vectors were not greatly dissimilar. However, when the lengths ranged over a ratio of five or more, the estimator tended to over-estimate the shortest and under-estimate the longest. This bias arises because, for large K, the multivariate Normal prior implies a TLV length prior (2.1) with a relatively narrow peak at length ρ√(K−1), which penalizes lengths well away from the peak and so draws the length estimates together.
Unfortunately, the Normal TLV prior is the only form for which we were able to find an explicit expression for the corresponding OLV prior. As a compromise, we adopted as prior for the OLV load and score lengths the form (2.4), in which CNKJ is a normalization constant discussed later. This form assumes a very diffuse load length prior, and retains a "correction" factor similar to that of the Normal form (2.3). Simple geometric arguments show that the joint OLV length density derived from any smooth, independent, TLV length density will exhibit a "correction factor" of roughly this form. The prior (2.4) is a generalization of the "Cauchy" prior used in WF, and is far more diffuse than the Normal form. The joint prior density of the directions of the OLVs is assumed to have the uniform value

    Π(j=1...J) 1 / S(K−j+1)        (2.5)

where

    S(m) = 2 π^(m/2) / Γ(m/2)

is the surface area of the unit m-sphere, and we similarly assume the joint prior density of the directions of the score vectors to be uniformly

    Π(j=1...J) 1 / S(N−j+1)        (2.6)
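Assuming the standard surface-area formula shown in (2.5), these helpers (Python with numpy and scipy; names ours) evaluate the log direction-priors in a numerically stable way.

```python
import numpy as np
from scipy.special import gammaln

def log_sphere_area(m: int) -> float:
    """log S_m, with S_m = 2 pi^(m/2) / Gamma(m/2), the surface area of the
    unit sphere in m dimensions (S_2 = 2*pi, S_3 = 4*pi)."""
    return np.log(2.0) + 0.5 * m * np.log(np.pi) - gammaln(0.5 * m)

def log_olv_direction_prior(K: int, J: int) -> float:
    # log of (2.5): j = 0..J-1 gives the factors 1/S_K ... 1/S_(K-J+1)
    return -sum(log_sphere_area(K - j) for j in range(J))
```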
As in WF, we assume the specific variances {σk} to have independent prior densities proportional to 1/σk in some finite range, and the means {μk} to have independent uniform priors in some finite range. The assumed priors import little useful information into the analysis. Those for means and specific variances have the conventional "uninformative" forms, the priors of load and score vectors presume no preference for any direction, and the prior for load strengths expresses only the vaguest expectation that they will be of the same order as the specific variations.

3. The Information Matrix

MML chooses the model and parameter estimate θ which minimize

    T = −log h(θ) − log P(x|θ) + (1/2) log I(θ) + constant

where h(θ) is the prior density of the parameter vector θ, and 1/√I(θ) is an approximation to the volume in θ-space spanned by the expected estimation error. I(θ) is the determinant of the matrix of expected partial second derivatives of T with respect to the components of θ. For many estimation problems, I(θ) is well approximated by the Fisher Information, but for the factor model, the derivatives of h() cannot be neglected.
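Schematically (an illustrative helper in our own notation, not the paper's full treatment), the quantity minimized combines the three terms just listed:

```python
def messlen(log_prior: float, log_like: float, log_det_I: float) -> float:
    # T = -log h(theta) + (1/2) log I(theta) - log P(x|theta), constants omitted.
    # The (1/2) log I term charges for stating theta to the achievable precision.
    return -log_prior + 0.5 * log_det_I - log_like
```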
As in WF (section 3), we take into account the derivatives of −log h() as well as those of the likelihood. Using Σn vn = 0 from WF p.198, and changing to scaled components, gives an expression (3.1) for T. The calculation in WF generalizes directly to J>1 factors, giving the determinant (3.2).
Note that (3.2) is the determinant of a matrix which includes derivatives with respect to all K components of each OLV βj and all N components of each score vector vj. Thus, it includes the sensitivity of T to differential changes of {βj} and {vj} which do not preserve the mutual orthogonality of OLVs and score vectors. However, in computing (3.2), use has been made of the fact that the derivatives are evaluated for mutually-orthogonal vectors. The question of enforcing the derivatives to preserve orthogonality is addressed below. First, we switch to the polar parameterization of the model, using length and direction rather than Cartesian components for the OLVs and score vectors. In this change of parameters, the sensitivity determinant transforms as the square of a density, giving the form I3(θ) of (3.3).
Consider some pair of OLVs βj and βl, and their associated score vectors vj and vl. The expression (3.3) contains contributions from the sensitivity of T to changes in direction of βj, βl, vj and vl. Let us consider changes in the directions of βj and βl corresponding to differential rotations of these vectors in the plane containing them both. Let θj be an angle giving the direction of βj in the plane, and θl be an angle giving the direction of βl in the plane. Similarly, let φj, φl be angles giving the directions of vj, vl in the plane containing them both. Then I3(θ) contains a contribution due to the expected second differentials of T with respect to θj, θl, φj and φl. It can be shown that this contribution amounts to a multiplicative factor of I3 given by
(3.4), and that there are no cross-derivatives with other parameters of the model with non-zero expectation. As noted above, (3.4) arises from the unconstrained variation of the four parameters θj, θl, φj and φl. In fact, the four parameters are constrained by the requirement of orthogonality, so that there are really only two parameters, say θ and φ, which between them determine the four angles (3.5).
When T is expressed in terms of these two parameters, and its second differentials with respect to them calculated, it is found that together they contribute to the sensitivity determinant a multiplicative factor of the form
(3.6), rather than the form (3.4). The form (3.6) vanishes when factors j and l have equal size, showing that in that case a rotation can be applied to the OLVs and score vectors which has no effect on the likelihood or the priors, and hence no effect on T. The reader might be surprised by the fact that this degeneracy becomes apparent only when the orthogonality constraint is taken into account, since such rotations are possible, and have no effect on T, whether or not the OLVs are explicitly constrained to be orthogonal. The reason that the degeneracy is not apparent in the expression I3(θ) is that I3(θ) indicates the effects of perturbations of the parameters on T only to second order. When the OLVs and score vectors are not explicitly constrained, simultaneous variation of two OLVs and two score vectors is required to produce a perturbation having no effect on T, and so the degeneracy would be revealed only by a fourth-order analysis treating differentials up to the fourth. Once the orthogonality constraint is applied, however, the simultaneous variation involves only two, rather than four, dimensions of parameter space, and so is revealed by a second-order analysis using second partial differentials. Modifying I3(θ) by replacing a factor of form (3.4) by a factor of form (3.6) for every pair of OLVs gives the final sensitivity determinant I(θ) of (3.7).
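The rotation degeneracy can be seen numerically. This sketch (numpy; the sizes and lengths are illustrative) builds two orthogonal equal-length OLVs and score vectors and applies the simultaneous rotation:

```python
import numpy as np

rng = np.random.default_rng(2)
K, N, b, w, th = 16, 300, 2.0, 10.0, 0.7

B = np.linalg.qr(rng.normal(size=(K, 2)))[0] * b    # orthogonal, equal-length OLVs
V = np.linalg.qr(rng.normal(size=(N, 2)))[0] * w    # orthogonal, equal-length scores
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])

# rotating loads and scores together leaves the modelled data part unchanged...
print(np.allclose(V @ B.T, (V @ R) @ (B @ R).T))     # True
# ...and, because the factors have equal size, all lengths and the
# orthogonality are preserved too, so priors and hence T are unchanged
print(np.allclose(np.linalg.norm(B @ R, axis=0), b)) # True
```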
4. The MML Estimator

In this section, σk, βj, vj, etc. refer to estimates rather than "true" parameter values. The MML estimator is chosen to minimize the message length MessLen of section 3. Omitting constant terms, we have the expression (4.1) for MessLen, where {ynk} are the scaled data

    ynk = (xnk − μk) / σk
Note that the factors of the form | |βj|² − |βl|² |, which appear both in the prior (via the correction factor of (2.4)) and in the sensitivity determinant (3.7), cancel in forming (4.1).
This cancellation is not an accident, as these factors arise from a singularity in the mapping of TLVs onto OLVs which affects both the prior and I in similar ways. Consider the example of a model with only two latent vectors (J=2) in K dimensions. The space of possible TLVs has 2K dimensions, but the space of possible OLVs has only 2K−1, as the OLVs lose one degree of freedom by being constrained to be orthogonal. For each OLV pair, there is a one-dimensional manifold of TLV pairs which map onto the OLV pair. Normally, the mapping is unique: such a manifold of TLV pairs maps to a single OLV pair. However, consider a set of TLV pairs in which all pairs lie in the same fixed plane of K-space, both TLVs of all pairs have the same fixed length, and the two TLVs of each pair are orthogonal. This set is a 1-dimensional manifold, as there is only one degree of freedom: the angular orientation of the TLV pair in the fixed plane. The matrix Σj tj tjᵀ, which determines the data density implied by a set of TLVs, has the same value for all pairs in the set. However, rather than mapping into a single OLV pair, such a special set of TLV pairs maps into a 1-dimensional manifold of OLV pairs identical to the set of TLV pairs. Normally, the prior density in a 1-dimensional manifold of TLV pairs condenses onto a single OLV pair, but for a special TLV set, as above, the prior density is spread over a manifold of OLV pairs, so these OLV pairs acquire a zero prior density. Similarly, it is normally the case that if two OLV pairs differ, then they are images of two different TLV manifolds, giving rise to two different data distributions. However, if the two OLV pairs are members of the image of the same special TLV set, they give rise to the same data distribution, and so the perturbation which changes one pair into the other (the θ of (3.6)) has no effect on T, and so I is zero. An OLV pair is in the image of a special TLV set just when the two OLVs have equal length, so just in this case the prior density of the pair drops to zero, and I becomes zero.
We now proceed to describe the MML estimator. Let Y denote the covariance matrix of the scaled data {ynk}. Then the estimates minimizing MessLen satisfy the equations (4.2) to (4.6) (Wallace and Freeman 1992). Equations (4.2) to (4.6) may be solved by functional iteration. After each iteration, the revised vectors {βj} are re-orthogonalized by evaluating the eigenvectors of the matrix Σj βj βjᵀ. To start the iteration one may set each σk to the sample standard deviation in dimension k and take the initial OLVs along the leading eigenvectors of the resulting Y. As in WF, the iteration may drive an OLV to zero if the data are consistent with a model having fewer than J factors. In that case, the iteration is restarted with reduced J. The iteration does not directly require or yield estimates of the score vectors. These may be calculated after convergence as

    vnj = (Σk ynk βjk) / (1 + |βj|²)        (4.7)
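The shape of the iteration can be sketched in code. The sketch below is not the paper's equations (4.2)-(4.6), which carry additional MML correction terms; it is a simplified likelihood-style skeleton (reminiscent of the classical iterated principal-factor method, numpy assumed) showing the same loop of scaling, eigen-decomposition and re-estimation of specific variances.

```python
import numpy as np

def fa_iterate(S, J, iters=200, tol=1e-8):
    """Skeleton iteration: S is the sample covariance, J the number of factors.
    Returns unscaled loads A (K x J) and specific variances sigma^2."""
    K = S.shape[0]
    sigma2 = np.diag(S).copy()            # start: all variance is specific
    for _ in range(iters):
        d = np.sqrt(sigma2)
        Y = S / np.outer(d, d)            # sigma-scaled covariance
        w, U = np.linalg.eigh(Y)
        w, U = w[::-1][:J], U[:, ::-1][:, :J]       # leading J eigenpairs
        lam = np.sqrt(np.maximum(w - 1.0, 0.0))     # scaled OLV lengths
        B = U * lam                       # scaled OLVs, orthogonal columns
        A = B * d[:, None]                # unscale: a_kj = sigma_k * b_kj
        new_sigma2 = np.maximum(np.diag(S) - (A ** 2).sum(axis=1), 1e-12)
        if np.max(np.abs(new_sigma2 - sigma2)) < tol:
            sigma2 = new_sigma2
            break
        sigma2 = new_sigma2
    return A, sigma2
```

A factor whose scaled length iterates to zero would, as in the text, trigger a restart with reduced J.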
5. Message Length

The full calculation of the message length is not needed in estimating the parameters of a model with a specified number of OLVs. It is needed for the comparison of models with differing numbers of OLVs, and can be used for comparison of a factor model with a model of different functional form. The full expression for the message length of data vector x using a model with estimated vector parameter θ, prior density h(θ), sensitivity matrix I(θ) and probability law P(x|θ) is (Wallace and Freeman 1987)

    MessLen = −log h(θ) + (1/2) log I(θ) − log P(x|θ) + (D/2)(1 + log κD)        (5.1)

where D is the number of scalar parameters, i.e. the dimension of θ, and κD is the D-dimensional optimal quantizing lattice constant. Using (2.4, 2.5, 2.6, 3.7 and 4.1), we obtain the expression (5.2) for the message length, where D = 2K + J(N+K−J+1), Sm is defined in (2.5), ynk in (4.1), and Y is defined before (4.2). The finite ranges used in normalizing the priors for {μk} and {σk} are omitted from (5.2) as they affect all models equally. The normalization constant CNKJ (from (2.4)) is not known in closed form. Numerical integration of (2.4) for various N, K and J has shown that log CNKJ depends little on N for N>>1, and for N≥40 is well approximated by an empirical expression whose constants AJ and BJ depend on J.
6. Testing

6.1 Artificial Data

Test runs were made on artificial data sets randomly selected from populations with known TLV (and hence OLV) structure. In each run, 1000 populations were tried, all with the same number K of dimensions, the same number J of TLVs and all having the same set of TLV lengths. With no loss of generality, all specific variances {σk²} were set to one. The populations differed in that, in each population, the TLV directions were randomly and independently selected from a uniform distribution over the K-sphere. For each population, a sample of constant size N was selected and the covariance matrix about the sample mean computed. This matrix was then analysed by both ML and MML methods, seeking estimated models with J−1, J and J+1 OLVs. The MML method would sometimes fail to find a model with the requested number of factors, instead finding a reduced number. For most runs, we chose K=16, N=300, and J in the range 2 to 5. The sets of factor lengths were chosen arbitrarily, with most runs including at least one factor small enough to give some difficulty in estimating it. Note that the TLV lengths, which were held constant in each of the 1000 trials in a run, were not selected from the assumed prior distribution of TLV lengths. Hence the population OLV lengths, which varied from population to population, were not drawn from the assumed OLV length prior. In some runs, the population OLVs of each population were generated directly, with a fixed set of equal lengths and random mutually-orthogonal directions. These "orthogonal" runs were designed to test the MML estimator on populations for which the assumed prior OLV density (2.4) was zero. The score vectors {vj} were not explicitly estimated. ML cannot consistently estimate them together with the load vectors {βj} and specific variances {σk²}. MML can do so, but the directions of the score vectors were eliminated from the iteration equations (4.2-4.7), and only the covariance matrix made available to the two estimators. The estimated OLVs and specific variances estimated by the two methods for each population were compared with the population values using criteria described below, and the average error criteria for the run compared.

6.2 Error Criteria

We chose criteria which are meaningful even if the estimated number of OLVs J′ differs from the population number J. Recall that all test populations have unit specific variances, σk = 1.

S1 = Σk log σk is a measure of general over- or under-estimation of specific variance (and so under- or over-estimation of OLV effects).

S2 = Σk (log σk)² is a simple measure of error in estimation of specific variances. Both S1 and S2 are zero for exact estimates. In comparing ML and MML estimators, S1 and S2 are relevant because both estimators agree in estimating the OLVs as eigenvectors of the σ-scaled covariance Y. Hence good estimates of {σk} are essential for their estimation of OLVs.

KL = ∫ dx P(x|θ′) [log P(x|θ′) − log P(x|θ)] is the non-symmetric Kullback-Leibler distance between the true population data density, with parameter vector θ′, and that implied by the estimate θ. It is zero for an exact estimate, otherwise positive. It does not require P(x|θ′) and P(x|θ) to have the same number of parameters, and is invariant under non-linear transformations of the parameter space. These criteria are sketched in code at the end of this section.

For each run, the ML and MML estimators were compared on
two bases.
First ("fixed J"), we compare the
estimates produced when each method was asked to find a
model with the true number of J of OLVs.
The MML method
sometimes returned a model with fewer OLVs. Second
("chosen J"), we compared the results when the number
of OLVs was estimated.
The three ML models for
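As promised above, a minimal sketch of the three error criteria, assuming numpy; for Gaussian models the KL distance reduces to the closed form used below (means taken as zero for simplicity, and the population covariance built as A′A′ᵀ + diag(σ′²)).

```python
import numpy as np

def S1(sigma_est):                     # population sigma_k = 1 in the tests
    return np.sum(np.log(sigma_est))

def S2(sigma_est):
    return np.sum(np.log(sigma_est) ** 2)

def KL(cov_true, cov_est):
    # KL( N(0, cov_true) || N(0, cov_est) ) for zero-mean Gaussians
    K = cov_true.shape[0]
    Si = np.linalg.inv(cov_est)
    _, ld_t = np.linalg.slogdet(cov_true)
    _, ld_e = np.linalg.slogdet(cov_est)
    return 0.5 * (np.trace(Si @ cov_true) - K + ld_e - ld_t)
```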
6.3 Results on Artificial Data

All runs showed MML to give the lower average values for all three error criteria on both fixed J and chosen J bases. The criterion S2, measuring squared error in {σk}, was typically less than half as large for MML as for ML. The KL criterion was typically about 0.01 to 0.03 less for MML. This may seem a small difference. However, a difference of 0.02 shows that if the log likelihoods of the ML and MML models were compared on another sample with N=300 from the same population, we would expect the MML model to have the greater likelihood, by a factor of about e^(300×0.02) = e^6, or roughly 400.
Overall, the chosen J and fixed J comparisons gave similar results, with the MML advantage increasing with increasing J. Using the "chosen J" basis, we compared the estimates J′ of the number of factors. Generally, ML was far more likely than MML to find the full number of population factors when one or more of them was weak. This result might suggest that ML is superior to MML in estimating the number of TLVs. However, closer examination shows the apparent ML advantage to be an illusion. When the 347 trials were examined in which ML estimated the true number of factors but MML settled for fewer, the MML models nonetheless proved, on average, closer to the population density by the KL criterion.
For runs with strong population TLVs of markedly differing
lengths, it is possible to identify estimated OLVs with
population OLVs with little confusion. In several such
runs, additional error criteria were assessed. One summed
the squared differences between the (unscaled) population
and estimated OLVs, another summed the squared sine of the
angles between population and estimated OLV directions.
On these criteria also, MML gave the smaller average errors.

Runs in which the population TLVs had equal length and were mutually orthogonal (and hence gave equal-length OLVs) showed no unusual behaviour, despite the fact that such OLV sets have zero prior density in our scheme. The superiority of MML remained evident. Details of the above tests are available in a Technical Report (Wallace 1995).

6.4 An Example

Press and Shigemasu (1989) analyse some aptitude data from Kendall (1980) with N=48, K=15, using their explicit Bayesian method. They assume four factors, with an informative prior on the load vectors based on experience with similar data. After determining by unspecified methods some hyperparameters appearing in their model priors, they find estimates of the factor scores and loads. (We denote their results "P&S".) The same data (p.282) was analysed by ML and our MML methods, attempting to find models with 3, 4 and 5 factors. The three ML models gave AIC values of -512.0, -543.6 and -540.8 respectively, weakly preferring four factors over five. MML found no 5-factor solution, and gave message lengths for 3 and 4 factors respectively 195.3 and 197.1 less than the no-factor message length. Thus MML also prefers four factors, but the posterior odds in favour of four factors over three are only e^1.8, or about 6:1. The agreement among all three four-factor models was assessed by evaluating the RMS sine of the angles (in unscaled space) between corresponding load vectors of two models, where the correspondence was chosen to minimize the RMS sine. The angles given by these RMS sines, and the average cosines between matched loads, were computed for each pair of models.
There is reasonable agreement between the MML and Press-Shigemasu results, with little doubt as to the correspondence between load vectors. The ML model is hard to reconcile with either of the other two, and has one load vector deviating by at least 65° from all loads of the other two models. This result, details of which are given in Wallace (1995), suggests the two Bayesian methods, despite their different priors and approaches, recover similar views of the data. The weak MML preference for four, rather than three, factors, together with the behaviour of ML on artificial data, gives grounds for doubting that the aberrant ML factor has any validity.

7. Discussion

7.1 The Impact of the Priors

The MML method has been shown by extensive testing to be more accurate than Maximum Likelihood estimation when the data is drawn from a population consistent with the Normal Factor model. It may be asked whether the improved accuracy is partly or wholly the result of the prior information assumed in MML and other Bayesian methods. To some extent, the assumption of a proper prior for TLV lengths must help suppress the improper solutions sometimes observed using ML, as these improper solutions are characterized by large |βj| and small σk. However, the effect of our prior is small. It is very diffuse, having infinite second moments, and at least for N>>K, its effect on the message length is swamped by the effects of the sensitivity determinant. We have found experimentally that even when an improper flat prior for TLV lengths is assumed, MML still outperforms ML, and still gives on average larger estimates of specific variance despite the 1/σ prior.
been of little assistance to the MML method on the test
runs reported in section 6.3, which mostly used data
drawn from populations with TLV lengths of 1.0 or 2.0.
Under the assumed prior, with K=16, the prior
probability of a TLV length being 2.0 or less is only
about 0.06, and for lengths 1.0 or less is only 0.001.
Hence the prior was misleading rather than helpful
under the test conditions. On the same data, a
modified MML method assuming the K-dimensional Normal
prior (2.1) was also tested. This prior has a
hyper-parameter ρ determining the length scale of
the prior, and the method estimated ρ from the
data. The modified method gave almost the same
estimates as the original MML method, but because
factor lengths around 1.0 are less severely penalized
by this prior (with ρ estimated at about 0.25), the
minimum message length was more frequently achieved with
The reason MML suppresses improper models is not that they occur in regions of parameter space having low prior density. Rather, it is that their high likelihood is achieved only in a very small region of parameter space. A message using such an improper solution to encode the data would thus have to specify the parameters to high accuracy (many digits of precision), making the message long. No other aspect of our assumed priors can be reasonably considered as importing useful information. The prior over load vector directions is uniform, representing total ignorance. The Normal prior assumed for factor scores serves only to disambiguate the confusion between load and score lengths, and is in any case tacitly assumed in the conventional ML Normal model.

7.2 Choice of Number of Factors

MML appears to provide a fairly sound basis for estimating
the number of factors. It is conservative, erring on the
side of rejecting factors for which the evidence is weak.
If an MML solution with J factors yields a correlation matrix among the residuals showing no significant correlations, there is little reason to seek further factors.
When compared with AIC selection among ML solutions, MML may more frequently select a small number of factors, but none the less still gives a better model of the population density by the KL measure. With the prior here assumed, the MML choice of J may underestimate J when an OLV is small, simply because the prior probability assigned to small (scaled) OLV lengths is small. Experiments reported in (Wallace 1995) showed that, rather than choosing the MML model of smallest message length, one may with advantage choose the MML model with the highest J for which a solution exists. For populations with scaled OLV lengths around 1.0, this scheme may find a model with an additional factor (and slightly higher message length) and on average was slightly better both at estimating J and on our error measures. Where this scheme found a different model, the message length increase over the minimum message length model was on average only about 2.0, which is of little significance. For data with larger factors, both schemes almost always agree. That is, the model with the smallest message length is almost always the model with the largest number of factors found by the MML iteration.

7.3 Assessment of Error

Unlike some other Bayesian methods, MML does not directly compute or use the posterior density. However, the elements of the sensitivity matrix may be used as indications of probable estimation error. For instance, it is easily shown that the probable error in estimating any one factor score vnj is of order 1/√(1 + |βj|²).

8. Acknowledgement

This work owes much to the assistance of

References
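Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52(3), 317-332.

Barron, A.R. and Cover, T.M. (1991). Minimum complexity density estimation. IEEE Transactions on Information Theory, 37(4), 1034-1054.

Bozdogan, H. (1987). Model selection and Akaike's Information Criterion (AIC): the general theory and its analytical extensions. Psychometrika, 52(3), 345-370.

Harman, H.H. (1967). Modern Factor Analysis (2nd ed.). University of Chicago Press, Chicago.

Kendall, M. (1980). Multivariate Analysis (2nd ed.). Charles Griffin, London.

Muirhead, R.J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.

Press, S.J. and Shigemasu, K. (1989). Bayesian inference in factor analysis. In Contributions to Probability and Statistics: Essays in Honor of Ingram Olkin. Springer, New York.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461-464.

Wallace, C.S. (1989). False oracles and SMML estimators. Technical Report 89/128, Department of Computer Science, Monash University.

Wallace, C.S. (1995). Multiple factor analysis by MML estimation. Technical Report 95/218, Department of Computer Science, Monash University.

Wallace, C.S. and Freeman, P.R. (1987). Estimation and inference by compact coding. Journal of the Royal Statistical Society (Series B), 49(3), 240-265.

Wallace, C.S. and Freeman, P.R. (1992). Single factor analysis by MML estimation. Journal of the Royal Statistical Society (Series B), 54(1), 195-209.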