子空间半监督Fisher判别分析 (Subspace Semi-supervised Fisher Discriminant Analysis)

Vol. 35, No. 12    ACTA AUTOMATICA SINICA    December, 2009

Subspace Semi-supervised Fisher Discriminant Analysis

YANG Wu-Yi    LIANG Wei    XIN Le    ZHANG Shu-Wu

Abstract  Fisher discriminant analysis (FDA) is a popular method for supervised dimensionality reduction. FDA seeks an embedding transformation such that the ratio of the between-class scatter to the within-class scatter is maximized. Labeled data, however, often consume much time and are expensive to obtain, as they require the efforts of human annotators. In order to cope with the problem of effectively combining unlabeled data with labeled data to find the embedding transformation, we propose a novel method, called subspace semi-supervised Fisher discriminant analysis (SSFDA), for semi-supervised dimensionality reduction. SSFDA aims to find an embedding transformation that respects the discriminant structure inferred from the labeled data and the intrinsic geometrical structure inferred from both the labeled and unlabeled data. We also show that SSFDA can be extended to nonlinear dimensionality reduction scenarios by applying the kernel trick. The experimental results on face recognition demonstrate the effectiveness of our proposed algorithm.

Key words  Fisher discriminant analysis (FDA), semi-supervised learning, manifold regularization, dimensionality reduction

In machine learning and data mining applications such as image retrieval and face recognition, we increasingly confront collections of high-dimensional data. This leads us to consider methods of dimensionality reduction that allow us to represent the data in a lower dimensional space. Techniques for dimensionality reduction have attracted much attention in computer vision and pattern recognition. The most popular dimensionality reduction algorithms include principal component analysis (PCA)[1-2] and Fisher discriminant analysis (FDA)[3].

PCA is an unsupervised method. It projects the original m-dimensional data into a d (d ≪ m)-dimensional subspace in which the data variance is maximized. It computes the eigenvectors of the data covariance matrix and approximates the original data by a linear combination of the leading eigenvectors. If the data are embedded in a linear subspace, PCA is guaranteed to discover the dimensionality of the subspace and produce a compact representation.

Unlike PCA, FDA is a supervised method. In the context of pattern classification, FDA seeks the best projection subspace such that the ratio of the between-class scatter to the within-class scatter is maximized. For classification tasks, FDA can achieve significantly better performance than PCA.

Labeled data, however, often consume much time and are expensive to obtain, as they require the efforts of human annotators[4]. Contrarily, in many cases it is far easier to obtain large numbers of unlabeled data. The problem of effectively combining unlabeled data with labeled data is therefore of central importance in machine learning[4]. Learning from labeled and unlabeled data has attracted an increasing amount of attention recently, and several novel approaches have been proposed. Graph-based semi-supervised learning algorithms[4-13] have attracted considerable attention in recent years. These algorithms consider the graph over all the data as a priori knowledge to guide the decision making. The regularization-based technique of Cai[8] is closest in spirit to the intuitions of our paper. The techniques of Belkin[5] and Cai[8] are based on regularization.

In this paper, we aim at dimensionality reduction in the semi-supervised case. To cope with the problem of effectively combining unlabeled data with labeled data, we propose a novel semi-supervised dimensionality reduction algorithm called subspace semi-supervised Fisher discriminant analysis (SSFDA). SSFDA exploits the geometric structure of the labeled and unlabeled data and incorporates it as an additional regularization term. SSFDA intends to find an embedding transformation that respects the discriminant structure inferred from the labeled data and the intrinsic geometrical structure inferred from both labeled and unlabeled data.

Semi-supervised discriminant analysis (SDA)[8] is the algorithm most relevant to ours. In the following, we list the similarities and the major difference between SDA and our algorithm:
1) Both SDA and our algorithm are graph-based approaches. Both use a p-nearest neighbor graph to model the relationship between nearby data points and incorporate the geometric structure of the labeled and unlabeled data as an additional regularization term.

2) There is one major difference between SDA and our algorithm. In the SDA algorithm, the weight matrix of the p-nearest neighbor graph is constructed according to the relationship between nearby points in the original data space, without considering the labels of the labeled data. In our algorithm, using the labeled data, we first find a projection subspace by applying the FDA algorithm and embed the labeled and unlabeled data into this subspace. Then, the weight matrix of the p-nearest neighbor graph is constructed according to the relationship between nearby data points in the subspace, as well as the labels of the labeled data.

The rest of this paper is organized as follows. In Section 1, we provide a brief review of FDA. The proposed SSFDA algorithm for dimensionality reduction is introduced in Section 2. The experimental results are presented in Section 3. Finally, we conclude the paper in Section 4.

Received March 18, 2008; in revised form June 8, 2009. Supported by National Sciences and Technology Supporting Program of China (2008BAH26B02-3, 2008BAH21B03-04, 2008BAH26B03).
Author affiliations: 1. Hi-tech Innovation Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, P. R. China; 2. Key Laboratory of Underwater Acoustic Communication and Marine Information Technology of the Minister of Education, Xiamen University, Xiamen 361005, P. R. China; 3. College of Oceanography and Environmental Science, Xiamen University, Xiamen 361005, P. R. China; 4. School of Electronics Information and Control Engineering, Beijing University of Technology, Beijing 100124, P. R. China.
DOI: 10.3724/SP.J.1004.2009.01513

1 Fisher discriminant analysis

In this section, we first formulate the problem of linear dimensionality reduction. Then, FDA is reviewed. Last, the graph perspective of FDA is introduced.

1.1 Formulation

Suppose that we have a set of l samples x_1, ..., x_l ∈ R^m that belong to c classes; then l = \sum_{k=1}^{c} l_k, where l_k is the number of samples in the k-th class. For linear dimensionality reduction, we focus on finding a transformation matrix A = (a_1, ..., a_d) that maps these l points to a set of points y_1, ..., y_l in R^d (d ≪ m). The embedded sample y_i is given by y_i = A^T x_i.
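To make the notation concrete, the following minimal NumPy sketch (illustrative only; the synthetic data, class sizes, and variable names are our own assumptions, not from the paper) stores the l samples as columns of an m × l matrix and applies an arbitrary transformation matrix A, so that each embedded sample is y_i = A^T x_i:

```python
import numpy as np

rng = np.random.default_rng(0)

m, d = 50, 3                   # original dimensionality m and target dimensionality d (d << m)
class_sizes = [20, 15, 25]     # l_k for c = 3 classes; l = sum(class_sizes)

# Samples are stored as columns: X is m x l, labels[i] is the class of column i.
X = np.hstack([rng.normal(loc=k, size=(m, lk)) for k, lk in enumerate(class_sizes)])
labels = np.concatenate([np.full(lk, k) for k, lk in enumerate(class_sizes)])

# Any transformation matrix A = (a_1, ..., a_d) embeds the data as Y = A^T X,
# i.e., column i of Y is the d-dimensional representation y_i = A^T x_i.
A = rng.normal(size=(m, d))
Y = A.T @ X
print(X.shape, Y.shape)        # (50, 60) (3, 60)
```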
1.2 Fisher discriminant analysis for dimensionality reduction

FDA[3] is one of the most popular dimensionality reduction techniques. FDA seeks directions on which the data points of different classes are far from each other while data points of the same class are close to each other[3]. Here, we briefly describe the definition of FDA.

Let S_w, S_b, and S_t be the within-class scatter matrix, the between-class scatter matrix, and the total scatter matrix, respectively:

S_w = \sum_{k=1}^{c} \sum_{i=1}^{l_k} (x_i^{(k)} - \mu^{(k)}) (x_i^{(k)} - \mu^{(k)})^T    (1)

S_b = \sum_{k=1}^{c} l_k (\mu^{(k)} - \mu) (\mu^{(k)} - \mu)^T    (2)

S_t = \sum_{k=1}^{c} \sum_{i=1}^{l_k} (x_i^{(k)} - \mu) (x_i^{(k)} - \mu)^T = S_w + S_b    (3)

where l_k is the number of data in the k-th class, x_i^{(k)} is the i-th datum in the k-th class, \mu^{(k)} is the mean of the data in the k-th class, and \mu is the mean of all data:

\mu^{(k)} = \frac{1}{l_k} \sum_{i=1}^{l_k} x_i^{(k)}    (4)

\mu = \frac{1}{l} \sum_{i=1}^{l} x_i    (5)

The objective function of FDA is as follows:

a_{opt} = \arg\max_a \frac{a^T S_b a}{a^T S_w a} = \arg\max_a \frac{a^T S_b a}{a^T S_t a}    (6)

The projection vector a that maximizes (6) is given by the maximum eigenvalue solution to the generalized eigenvalue problem:

S_b a = \lambda S_t a    (7)

Let the column vectors a_1, ..., a_d be the solutions of (7), ordered according to their eigenvalues, \lambda_1 > ... > \lambda_d. Thus, the embedding is as follows:

x_i → y_i = A^T x_i

where y_i is a d-dimensional representation of the high-dimensional data point x_i and A = (a_1, ..., a_d) is the transformation matrix. The between-class scatter matrix S_b has at most rank c − 1. This implies that the multiplicity of \lambda = 0 is at least m − c + 1. Therefore, FDA can find at most c − 1 meaningful directions.

1.3 Graph perspective of Fisher discriminant analysis

We have

S_b = \sum_{k=1}^{c} l_k (\mu^{(k)} - \mu)(\mu^{(k)} - \mu)^T
    = \sum_{k=1}^{c} l_k \Big( \frac{1}{l_k} \sum_{i=1}^{l_k} x_i^{(k)} \Big) \Big( \frac{1}{l_k} \sum_{i=1}^{l_k} x_i^{(k)} \Big)^T - l \mu \mu^T
    = \sum_{k=1}^{c} X^{(k)} W^{(k)} (X^{(k)})^T - l \mu \mu^T    (8)

S_t = \sum_{k=1}^{c} \sum_{i=1}^{l_k} x_i^{(k)} (x_i^{(k)})^T - l \mu \mu^T    (9)

where W^{(k)} is an l_k × l_k matrix with all elements equal to 1/l_k and X^{(k)} = [x_1^{(k)}, ..., x_{l_k}^{(k)}]. Let X_l = [X^{(1)}, ..., X^{(c)}] and define an l × l block-diagonal matrix W_{l×l} as

W_{l \times l} = \mathrm{diag}(W^{(1)}, W^{(2)}, \dots, W^{(c)})    (10)

Then, we have

S_b = X_l B X_l^T    (11)

S_t = X_l C X_l^T    (12)

where

B = W_{l \times l} - \frac{1}{l} \mathbf{1} \mathbf{1}^T,    C = I - \frac{1}{l} \mathbf{1} \mathbf{1}^T

where 1 = [1, ..., 1]^T is an l-dimensional vector and I is an l × l identity matrix. Thus, the objective function of FDA in (6) can be rewritten as

a_{opt} = \arg\max_a \frac{a^T X_l B X_l^T a}{a^T X_l C X_l^T a}    (13)

The generalized eigenvalue problem in (7) can be rewritten as

X_l B X_l^T a = \lambda X_l C X_l^T a    (14)

This formulation of the FDA objective function was first introduced in [14].
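As an illustration of the graph perspective above, here is a short NumPy/SciPy sketch (our own, not code from the paper; the function name `fda_directions` and the small ridge added to S_t are assumptions) that builds B and C as in (11)-(12) and solves the generalized eigenvalue problem (14):

```python
import numpy as np
from scipy.linalg import eigh

def fda_directions(Xl, labels, d):
    """FDA directions via S_b = X_l B X_l^T and S_t = X_l C X_l^T, eqs. (11)-(14).

    Xl: m x l matrix of labeled samples (columns), labels: length-l class labels.
    Returns A = (a_1, ..., a_d); recall FDA has at most c - 1 meaningful directions.
    """
    l = Xl.shape[1]
    # W_{lxl}: block diagonal, block k is an l_k x l_k matrix with all entries 1/l_k.
    W = np.zeros((l, l))
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        W[np.ix_(idx, idx)] = 1.0 / idx.size
    ones = np.ones((l, 1))
    B = W - ones @ ones.T / l           # B = W_{lxl} - (1/l) 1 1^T
    C = np.eye(l) - ones @ ones.T / l   # C = I - (1/l) 1 1^T
    Sb = Xl @ B @ Xl.T
    St = Xl @ C @ Xl.T
    # Generalized eigenproblem S_b a = lambda S_t a; a tiny ridge keeps S_t positive definite.
    vals, vecs = eigh(Sb, St + 1e-8 * np.trace(St) * np.eye(St.shape[0]))
    order = np.argsort(vals)[::-1]      # largest eigenvalues first
    return vecs[:, order[:d]]
```

With the toy data from the earlier sketch, `A = fda_directions(X, labels, 2)` followed by `Y = A.T @ X` gives the supervised embedding.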
2 Subspace semi-supervised Fisher discriminant analysis

We introduce our SSFDA algorithm, which respects both the discriminant and geometrical structures in the data. We begin with a description of the semi-supervised learning problem.

2.1 Problem formulation

Given a sample set {x_1, ..., x_l, x_{l+1}, ..., x_n} ⊂ R^m and a label set L = {1, ..., c}, the first l points x_i (i ≤ l) are labeled as t_i ∈ L and the remaining points x_u (l + 1 ≤ u ≤ n) are unlabeled. Find a transformation matrix A = (a_1, ..., a_d) that maps these n points to a set of points y_1, ..., y_n in R^d (d ≪ m). The embedded sample y_i is given by y_i = A^T x_i. For any unlabeled sample x_u (l + 1 ≤ u ≤ n), its label is then predicted as t_i provided that y_i = A^T x_i minimizes ||y_i − y_u||, i = 1, ..., l. The performance of an algorithm is measured by the recognition error rate on the unlabeled samples.

2.2 The objective function

FDA is a supervised method. It wants to find an embedding transformation such that the ratio of the between-class scatter to the within-class scatter is maximized. When there are not sufficient training (labeled) samples, the problem of learning from both labeled and unlabeled samples (semi-supervised learning) is of central importance in improving the performance of recognition.

SSFDA will be performed in the subspace R(X_k), where R(X_k) = Span[x_1, ..., x_k] is the subspace spanned by the columns of X_k = [x_1, ..., x_k] and k = l or n. Suppose that the rank of X_k is r, i.e., rank(X_k) = dim(R(X_k)) = r. Perform the singular value decomposition of X_k as X_k = U Σ V^T, where U = [u_1, ..., u_r, u_{r+1}, ..., u_m] is the left orthonormal matrix, Σ is the singular value matrix, and V is the right orthonormal matrix. The subspace R(X_k) can be spanned by the columns of P, where P = [u_1, ..., u_r]. When rank(X_k) = m, we can simply select the m × m identity matrix I as P, i.e., P = I. We project both labeled and unlabeled data into the subspace R(X_k). Thus, the embedding is as follows:

x_i → z_i = P^T x_i,  i = 1, ..., l, l + 1, ..., n

Let X = [x_1, x_2, ..., x_n], X_l = [x_1, x_2, ..., x_l], Z^{(k)} = [z_1^{(k)}, ..., z_{l_k}^{(k)}], Z_l = [Z^{(1)}, ..., Z^{(c)}], and Z = [z_1, z_2, ..., z_n]. Then, Z_l = P^T X_l and Z = P^T X. In the subspace, the between-class scatter matrix S_b and total scatter matrix S_t are as follows:

S_b = Z_l B Z_l^T = P^T X_l B X_l^T P    (15)

S_t = Z_l C Z_l^T = P^T X_l C X_l^T P    (16)

Then, the objective function of FDA is as follows:

a_{opt} = \arg\max_a \frac{a^T P^T X_l B X_l^T P a}{a^T P^T X_l C X_l^T P a}

The optimal a is the eigenvector corresponding to the maximum eigenvalue of the eigen-problem:

Z_l B Z_l^T a = \lambda Z_l C Z_l^T a    (17)

A typical way to prevent overfitting, which may happen when there are not sufficient training samples, is to impose a regularizer[15]. The optimization problem of the regularized version of FDA can be written as follows:

a_{opt} = \arg\max_a \frac{a^T S_b a}{\alpha\, a^T S_t a + (1 - \alpha) R(a)}    (18)

The a priori knowledge of the data can be incorporated in the regularization term R(a), which provides us the flexibility to incorporate the geometrical structure of the data manifold[8]. The key to semi-supervised learning problems is the prior assumption of consistency[7], which means that 1) nearby points are likely to have the same label and 2) points on the same structure (typically referred to as a cluster or a manifold) are likely to have the same label. A principled approach to formalize the assumption of consistency is to design a mapping function which is sufficiently smooth with respect to the intrinsic structure revealed by known labeled and unlabeled points[7]. In order to discover both geometrical and discriminant structures of the data manifold, we incorporate the manifold structure of both labeled and unlabeled data as the regularization term in the objective function (18).

When k = n, S_t in (16) may be singular. We use regularization to ensure the nonsingularity of S_t: S_t = S_t + δI_r, where δ (δ > 0) is the regularization parameter and I_r is the r × r identity matrix. Let the column vectors b_1, ..., b_{c−1} be the solutions of (17), ordered according to their eigenvalues, λ_1 > ... > λ_{c−1}. Then, we find a transformation matrix B = [b_1, ..., b_{c−1}]. The r × (c − 1) transformation matrix B maps all the n samples to a set of points {q_i | q_i = B^T z_i, i = 1, ..., n} in R^{c−1}. Let l(x_i) be the class label of x_i, and N_p(q_i) = {q_i^1, q_i^2, ..., q_i^p} be the set of q_i's p-nearest neighbors. We construct a p-nearest neighbor graph G to model the relationship between nearby data points. Thus, the weight matrix W of G can be defined as follows:

W_{ij} =
  γ, if x_i and x_j are labeled and share the same label;
  γ, if x_i is labeled, x_j is unlabeled, q_j ∈ N_p(q_i), and every labeled q_s ∈ N_p(q_i) satisfies l(x_s) = l(x_i);
  γ, if x_j is labeled, x_i is unlabeled, q_i ∈ N_p(q_j), and every labeled q_s ∈ N_p(q_j) satisfies l(x_s) = l(x_j);
  1, if x_i and x_j are unlabeled, and q_i ∈ N_p(q_j) or q_j ∈ N_p(q_i);
  0, otherwise    (19)

In general, if two data points are linked by an edge, they are likely to be in the same class. Thus, a natural regularizer can be defined as follows:

R(a) = \frac{1}{2} \sum_{i,j} (a^T z_i - a^T z_j)^2 W_{ij}
     = \sum_{i} a^T z_i D_{ii} z_i^T a - \sum_{i,j} a^T z_i W_{ij} z_j^T a
     = a^T Z (D - W) Z^T a = a^T P^T X L X^T P a    (20)

where D is a diagonal matrix whose entries are the column sums of W, D_{ii} = \sum_j W_{ij}, and L = D − W is the Laplacian matrix.
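The following sketch spells out one possible implementation of the graph construction in (19). It reflects our reading of the rule rather than code from the paper; the function name `ssfda_weight_matrix` and the symmetrization at the end are assumptions. The points q_i are the columns of Q (the first n_labeled of them labeled), and gamma and p play the roles of γ and p:

```python
import numpy as np

def ssfda_weight_matrix(Q, n_labeled, labels, p=5, gamma=0.9):
    """Weight matrix of the p-nearest-neighbor graph, following our reading of (19).

    Q        : (c-1) x n matrix of subspace points q_i (columns).
    n_labeled: the first n_labeled columns are labeled.
    labels   : length-n_labeled array of class labels l(x_i).
    """
    n = Q.shape[1]
    dist = np.linalg.norm(Q[:, :, None] - Q[:, None, :], axis=0)   # pairwise distances
    np.fill_diagonal(dist, np.inf)
    knn = np.argsort(dist, axis=1)[:, :p]                          # N_p(q_i) as column indices

    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            i_lab, j_lab = i < n_labeled, j < n_labeled
            if i_lab and j_lab:                       # both labeled, same label
                if labels[i] == labels[j]:
                    W[i, j] = gamma
            elif i_lab and not j_lab:                 # x_i labeled, x_j unlabeled
                nbrs = knn[i]
                lab_nbrs = nbrs[nbrs < n_labeled]
                if j in nbrs and np.all(labels[lab_nbrs] == labels[i]):
                    W[i, j] = gamma
            elif j_lab and not i_lab:                 # x_j labeled, x_i unlabeled
                nbrs = knn[j]
                lab_nbrs = nbrs[nbrs < n_labeled]
                if i in nbrs and np.all(labels[lab_nbrs] == labels[j]):
                    W[i, j] = gamma
            else:                                     # both unlabeled, mutual neighborhood
                if i in knn[j] or j in knn[i]:
                    W[i, j] = 1.0
    return np.maximum(W, W.T)   # keep the graph symmetric (our safeguard, not from the paper)
```

The Laplacian used in (20) is then simply `L = np.diag(W.sum(axis=0)) - W`, so that R(a) = a^T Z L Z^T a.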
The optimization problem of the regularized regularizerwhen there is no su,cient training sample, is to impose a version of FDA can be written as follows: ii =Wij, and L = D ? W is the Laplacian of W , DjTaSba matrix. We get the objective function of SSFDA: TaSba a opt= arg max aT(αSt+ (1?α)P XLXTPT)a(21)a (18) a= arg max optα ? aTSa + (1 ? α)R(a) tThe projective vector a that maximizes (21) is given by the a maximum eigenvalue solution to the generalized eigenvalue The a priori knowledge of the data can be incorporated problem: in the regularized term R(a), which provides us the ,exi- TT(22) bility to incorporate the geometrical structure of the data Sa = λ(αS+ (1 ? α)P XLXP)a bt[8]manifold. The key to semi-supervised learning problems [7]TTis the priori assumption of consistency, which means that P XlBXTT= λP (αXlCXT + (1? α)XLX)Pa (23) llPa1) nearby points are likely to have the same label and 2) Let the column vectors a1, ? ? ? , ad be the solutions of (23), points on the same structure (typically referred to as a cluster or a manifold) are likely to have the same label. A ordered according to their eigenvalues, λ1 > ? ? ? > λd. principled approach to formalize the assumption of consis- Thus, the embedding is as follows: tency is to design a mapping function which is su,ciently TTTTxi ? y= Az i = APxi = (P A)xi ismooth with respect to the intrinsic structure revealed by 1516 ACTA AUTOMATICA SINICA Vol. 35 where yis a d-dimensional representation of the high di- {β, ? ? ? , β} be an orthonormal basis of the subspace i1rmensional data point xi and A = (a1, ? ? ? , ad). P A is the R(Φk), where r is the rank of Φk. It is easy to show that transformation matrix. The between-class scatter matrix every vector β, i = 1, ? ? ? , r can be linearly expanded by iShas at most rank c ? 1. This implies that the multiplic- bk? ity of λ = 0 is at least m ? c + 1. Therefore, SSFDA can ,nd at most c ? 1 meaningful directions. β= γφ(x), i = 1, ? ? ? , r iijjj=1 2.3 Algorithm TIn summary of the discussion so far, the SSFDA algo- Denote M = Φk Φ, whose elements can be determined by krithm is given below: the following kernel function: 1) Find a orthonormal basis of the subspace R(X): kTPerform singular value decomposition of Xas X= kkM= φ(x)φ(x) = (φ(x) ? φ(x)) = K(x, x) i,jijijijT,U V. U = [u, ? ? ? , u, u+1? ? ? , u] is the orthonor- 1rrmmal matrix, where r is the rank of Xk. The vector set Calculate the orthonormal eigenvectors ν1, ? ? ? , ν r of M {u1, ? ? ? , ur } forms a orthonormal basis of R(Xk). The ma- corresponding to the r largest positive eigenvalues, λ1 > trix P = [u1, ? ? ? , ur ] gives a vector space projection from ? ? ? > λ. Then, the orthonormal vectors β, ? ? ? , βcan be r1rmRto subspace R(Xk). obtain by 2) Map data points into subspace R(Xl): 1 β= λ? Ti2x? z= Px, i = 1, ? ? ? , l, l + 1, ? ? ? , n iiii Φkν i, i = 1, ? ? ? , r After projection of the mapped samples φ(xi), i = 1, ? ? ? , n 3) Construct the scatter matrixes: Construct the into the subspace R(Φ), the transformed vectors z, i = kibetween-class scatter matrix Sand total scatter matrix b1, ? ? ? , n can be obtained by St as TS= ZBZTP XBXTT bllll=Pzi = Pφ(xi), i = 1, ? ? ? , l, l + 1, ? ? ? , n (24) TSt = ZlCZTPXlCXTT 1 ll1=P? 2 where P = [β, ? ? ? , β]. Let Λ = diag{λ? 1r2,4) Eigen-problem: When k = n, we use regularization 1? ? ? , λ} and rto ensure the nonsingularity of St: St = St + δIr, where V = [ν, ? ? ? , ν] . 
2.4 Kernel SSFDA for nonlinear dimensionality reduction

When the data manifold is highly nonlinear, the linear method described above may fail to discover the intrinsic geometry. Here, we show how SSFDA can be extended to nonlinear dimensionality reduction scenarios.

For a given nonlinear mapping φ, the input data in the space R^m can be mapped into the space H:

φ : R^m → H,  x ↦ φ(x)

Kernel SSFDA will be performed in the subspace R(Φ_k), where Φ_k = [φ(x_1), ..., φ(x_k)] and k = l or n. Let {β_1, ..., β_r} be an orthonormal basis of the subspace R(Φ_k), where r is the rank of Φ_k. It is easy to show that every vector β_i, i = 1, ..., r, can be linearly expanded as

β_i = \sum_{j=1}^{k} γ_{ij} φ(x_j),  i = 1, ..., r

Denote M = Φ_k^T Φ_k, whose elements can be determined by the following kernel function:

M_{ij} = φ(x_i)^T φ(x_j) = (φ(x_i) · φ(x_j)) = K(x_i, x_j)

Calculate the orthonormal eigenvectors ν_1, ..., ν_r of M corresponding to the r largest positive eigenvalues, λ_1 > ... > λ_r. Then, the orthonormal vectors β_1, ..., β_r can be obtained by

β_i = λ_i^{−1/2} Φ_k ν_i,  i = 1, ..., r

After projecting the mapped samples φ(x_i), i = 1, ..., n, into the subspace R(Φ_k), the transformed vectors z_i, i = 1, ..., n, can be obtained by

z_i = P^T φ(x_i),  i = 1, ..., l, l + 1, ..., n    (24)

where P = [β_1, ..., β_r]. Let Λ = diag{λ_1^{−1/2}, ..., λ_r^{−1/2}} and V = [ν_1, ..., ν_r]. Equation (24) can be rewritten as

z_i = [β_1, ..., β_r]^T φ(x_i) = Λ V^T Φ_k^T φ(x_i) = Λ V^T [K(x_1, x_i), ..., K(x_k, x_i)]^T    (25)

Construct the between-class scatter matrix S_b and total scatter matrix S_t as

S_b = Z_l B Z_l^T
S_t = Z_l C Z_l^T

When k = n, S_t may be singular. We use regularization to ensure the nonsingularity of S_t: S_t = S_t + δI_r, where δ (δ > 0) is the regularization parameter and I_r is the r × r identity matrix. Compute the eigenvectors with respect to the nonzero eigenvalues for the eigenvector problem:

S_b b_i = λ_i S_t b_i,  i = 1, 2, ..., c − 1

We obtain the transformation matrix B = (b_1, ..., b_{c−1}). The transformation matrix B maps all the n samples to a set of points {q_i | q_i = B^T z_i, i = 1, ..., n} in R^{c−1}. Construct the p-nearest neighbor graph matrix W as in (19) to model the relationship between nearby data points, and calculate the graph Laplacian matrix L = D − W.

Compute the eigenvectors with respect to the nonzero eigenvalues for the generalized eigenvector problem:

S_b a_i = λ (α S_t + (1 − α) Z L Z^T) a_i,  i = 1, ..., c − 1

Let A = (a_1, ..., a_{c−1}). For a data point x, the embedded data point y can be obtained by

y = A^T z = A^T P^T φ(x) = A^T [β_1, ..., β_r]^T φ(x) = A^T Λ V^T Φ_k^T φ(x) = (V Λ A)^T [K(x_1, x), ..., K(x_k, x)]^T
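As a concrete illustration of (24)-(25), the sketch below computes the kernel matrix M, its leading eigenpairs, and the subspace coordinates z_i = Λ V^T [K(x_1, x_i), ..., K(x_k, x_i)]^T. The Gaussian (RBF) kernel, the function names, and the rank tolerance are assumptions chosen for illustration; the paper does not prescribe a particular kernel here.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); samples are columns of X1 and X2."""
    sq = (np.sum(X1 ** 2, axis=0)[:, None] + np.sum(X2 ** 2, axis=0)[None, :]
          - 2.0 * X1.T @ X2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def kernel_subspace_coords(X, sigma=1.0, tol=1e-10):
    """Coordinates z_i of phi(x_i) in the subspace R(Phi_k), as in (24)-(25)."""
    M = rbf_kernel(X, X, sigma)            # M_ij = K(x_i, x_j) = phi(x_i)^T phi(x_j)
    vals, vecs = np.linalg.eigh(M)         # eigenvalues in ascending order
    keep = vals > tol * vals.max()
    lam = vals[keep][::-1]                 # r largest positive eigenvalues, descending
    V = vecs[:, keep][:, ::-1]             # corresponding eigenvectors nu_1, ..., nu_r
    Lam = np.diag(lam ** -0.5)             # Lambda = diag(lambda_i^{-1/2})
    Z = Lam @ V.T @ M                      # column i equals Lambda V^T [K(x_1, x_i), ...]^T
    return Z, Lam, V

def kernel_coords_new(x, X_train, Lam, V, sigma=1.0):
    """Subspace coordinates of an unseen point x, using the stored Lambda and V."""
    kvec = rbf_kernel(X_train, x.reshape(-1, 1), sigma)   # [K(x_1, x), ..., K(x_k, x)]^T
    return Lam @ V.T @ kvec
```

From the coordinates Z, the remaining steps (scatter matrices, graph construction, and the two eigenproblems) are identical to the linear case, with Z playing the role of the projected data.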
3 Experimental results

In this section, the performance of the proposed SSFDA for face recognition is investigated. A face recognition task is handled as a multi-class classification problem. Each test image is mapped to a low-dimensional subspace via the embedding learned from the training data, and then it is classified by the nearest neighbor criterion.

3.1 Datasets and compared algorithms

In our experiments, we use the CMU PIE (Pose, Illumination, and Expression) database[16] for face recognition to evaluate our proposed SSFDA algorithm. The CMU PIE face database contains more than 40 000 facial images of 68 people. The face images were captured under varying pose, illumination, and expression. From the dataset that contains five near-frontal poses (C05, C07, C09, C27, and C29), we randomly select 20 persons and 80 images for each person in the experiments.

In all the experiments, preprocessing to locate the faces was applied. Original images were normalized (in scale and orientation) such that the two eyes were aligned at the same position. Then, the facial areas were cropped into the final images for matching. The size of each cropped image in all the experiments is 32 × 32 pixels, with 256 gray levels per pixel. Thus, each image is represented by a 1 024-dimensional vector in image space.

The image set is then partitioned into the gallery and probe set with different numbers. For ease of representation, Gm/Ll/Pn means m images per person are randomly selected for training and the remaining n images are for testing. Among these m images, l images are randomly selected and labeled, which leaves the other (m − l) images unlabeled.

We compare the performance of SSFDA with Fisherface[17] (PCA followed by FDA), Laplacianface[9] (PCA followed by LPP), and PCA + SDA (PCA followed by SDA[8]). In order to deal with the "small sample size" problem, for Fisherface, Laplacianface, and PCA + SDA, we kept 99.9 % of the information in the sense of reconstruction error in the PCA step. We name the SSFDA as SSFDAl when the SSFDA is performed in the subspace R(X_l) and as SSFDAn when the SSFDA is performed in the subspace R(X_n). When the weight matrix W in the SSFDAn is obtained only according to the relationship between the data in the subspace spanned by the principal components after the PCA step, the SSFDAn degenerates to the PCA + SDA.

3.2 Face recognition results

The recognition error rates of different algorithms on the CMU PIE database are shown in Table 1. In each case, the minimum error rate is boldfaced. For each Gm/Ll/Pn, we average the results over 20 random splits and report the mean as well as the standard deviation.

Dimensionality estimation is a crucial problem for most of the subspace learning-based face recognition methods. The performance usually varies with the number of dimensions. We show the best results obtained by these subspace learning algorithms. The numbers in parentheses in Table 1 are the corresponding feature dimensions with the best results after dimensionality reduction.

Table 1  Recognition error rates on CMU PIE database (%)

                     G40/L3/P40                          G40/L4/P40
Method           Unlabeled error    Test error        Unlabeled error    Test error
Laplacianface    38.0±8.3 (19)      38.4±10.3 (39)    30.4±10.7 (41)     35.2±9.2 (39)
Fisherface       40.6±8.8 (19)      41.6±10.3 (19)    31.4±10.2 (19)     36.1±8.6 (19)
PCA + SDA        45.5±7.0 (19)      43.7±7.5 (19)     38.6±10.5 (19)     40.5±8.5 (19)
SSFDAl           26.2±6.3 (19)      26.3±7.5 (19)     20.2±8.0 (19)      24.2±6.0 (19)
SSFDAn           28.8±7.2 (19)      29.5±9.3 (19)     22.7±9.4 (19)      26.0±7.5 (19)

                     G40/L5/P40                          G40/L6/P40
Method           Unlabeled error    Test error        Unlabeled error    Test error
Laplacianface    27.1±10.3 (40)     27.5±10.7 (39)    22.1±6.7 (43)      21.1±5.8 (43)
Fisherface       28.5±10.6 (19)     28.8±11.2 (19)    23.1±7.6 (19)      22.2±6.2 (19)
PCA + SDA        37.3±10.0 (19)     36.6±10.7 (19)    33.9±7.8 (19)      30.4±7.4 (19)
SSFDAl           19.0±7.7 (19)      19.7±8.7 (19)     15.5±6.0 (19)      15.4±4.3 (19)
SSFDAn           20.0±7.9 (19)      21.0±9.2 (19)     17.1±6.5 (19)      16.6±4.6 (19)

From the results in Table 1, we can make the following comments: 1) SSFDAl achieved the best face recognition performance among all the compared algorithms, and SSFDAn achieved a little poorer performance than SSFDAl; 2) From the standard deviations of the error rates shown in the table, we found that our algorithm is more stable than the other algorithms; 3) The out-of-sample extension is good for SSFDA; 4) Compared with the performance of Fisherface, the proposed algorithm greatly improved the performance of face recognition by using the unlabeled data; 5) Compared with the performance of Fisherface, the SDA algorithm impaired the performance of face recognition by using the unlabeled data.
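To illustrate the Gm/Ll/Pn protocol and the nearest-neighbor classification used above, here is a small sketch (illustrative only; the function name, the `embed` callback, and the simplified split logic are our own, and the 1-NN reference set is restricted to the labeled training images):

```python
import numpy as np

def gllpn_error(X, y, embed, m_train=40, l_labeled=3, seed=0):
    """One Gm/Ll/Pn split: m_train images per person go to training (l_labeled of
    them labeled), the rest to the probe set; each probe image is assigned the label
    of its nearest labeled training image in the embedded space (1-NN rule).
    X: D x N image matrix (columns), y: length-N array of person ids,
    embed: maps a D x k matrix of columns to its low-dimensional embedding."""
    rng = np.random.default_rng(seed)
    lab_idx, test_idx = [], []
    for person in np.unique(y):
        idx = rng.permutation(np.where(y == person)[0])
        lab_idx.extend(idx[:l_labeled])      # labeled subset of the training images
        test_idx.extend(idx[m_train:])       # remaining images form the probe set
    lab_idx, test_idx = np.array(lab_idx), np.array(test_idx)

    E_lab, E_test = embed(X[:, lab_idx]), embed(X[:, test_idx])
    # 1-NN: distance from every probe embedding to every labeled embedding.
    dist = np.linalg.norm(E_test[:, :, None] - E_lab[:, None, :], axis=0)
    pred = y[lab_idx][np.argmin(dist, axis=1)]
    return float(np.mean(pred != y[test_idx]))   # recognition error rate
```

With the earlier sketches, `embed` could be `lambda M: T.T @ M`, where `T` is the m × d matrix returned by `ssfda_fit`; averaging `gllpn_error` over 20 random seeds mirrors the reporting protocol of Table 1.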
3.3 Model selection for SSFDA

Model selection is a crucial problem in many learning problems. If the learning performance varies drastically with different choices of the parameters, we have to apply some model selection methods for estimating the generalization error.

In our SSFDA algorithm, there are three parameters: α, γ, and p. In the previous experiments, we empirically set them as α = 0.8, γ = 0.9, and p = 5. Fig. 1 shows the performance of SSFDAl with respect to different values of these parameters on the CMU PIE database. Only one parameter was varied at a time while the others were fixed. As can be seen from Figs. 1 (a) and (b), when α is less than 0.9, the algorithm is not sensitive to α. However, when we gradually increased α from 0.9 to 1, the performance deteriorated gradually. When α is equal to 1, the SSFDA algorithm degenerates to the FDA algorithm and we get the worst performance. As can be seen from Figs. 1 (c) and (d), in order to get good performance, we can empirically set γ near 1. For the value of p, in order to discover the local geometrical and discriminant structures of the data space, it is usually set to a small number.

3.4 Discussion

Why could not the SDA algorithm improve the performance …