proposal for

Scenic Beauty Estimation Using Independent Component Analysis & Support Vector Machines

submitted to fulfill the semester project requirement for

EE 4773/6773: Digital Signal Processing

March 11, 1999

submitted to:

Dr. Joseph Picone
Department of Electrical and Computer Engineering
413 Simrall, Hardy Rd.
Mississippi State University
Box 9571
MS State, MS 39762

submitted by:

Vijay Ramani, Xinping Zhang, Zhiling Long, Yu Zeng
Scenic Beauty Group
The Institute for Signal and Information Processing
Mississippi State University
Box 9671
Mississippi State, Mississippi 39762
Tel: 601-325-8335
Fax: 601-325-8192
email: {ramani, zhang, long, zeng}@isip.msstate.edu

1. ABSTRACT

The main focus of this project is to estimate the scenic beauty of forest images using independent component analysis and support vector machines, two valuable tools for multigroup classification and data reduction. The project originated from a need of the United States Forest Service (USFS) to determine the scenic beauty of forests in order to preserve recreation and aesthetic resources in forest management. The results will be useful in determining cutting patterns that retain scenic beauty even after a forest is harvested for timber. The algorithms in this project will be developed in C++. The software will be tested on 637 images available in a standard evaluation database used to benchmark progress on this problem. Every image will be rated on a scale of -200 to 200, and the results will be compared with approaches that use principal components analysis and linear discriminant analysis.

2. INTRODUCTION

The database to be used [1] is based on two sets of images taken approximately four years apart: the first set consists of photographs taken during 1990-91, and the second of photographs taken during 1994-95. The images come from a study spanning four years in the Ouachita Forest in Arkansas. Photographs taken under controlled conditions were digitized using an extremely high quality scanning process and converted into computer readable data.

The area under study was partitioned into four blocks, with each block subdivided into five plots. The database includes images of each plot during each of the four seasons of the year, photographed in both the 1990-91 and 1994-95 periods from four different angles. Thus the total number of images in the database is 4 blocks x 5 plots x 4 seasons x 4 angles x 2 years = 640. In addition, 40 images (20 from the 1990-91 period and 20 from the 1994-95 period) were added as baseline images, and a set of 20 images was included as warm-up slides, used to calibrate human performance during subjective testing.

The images were originally delivered in PhotoCD (PCD), a proprietary format developed by Kodak for archiving high-quality photographs. Due to the proprietary nature of the PCD format, the images were converted into the Portable Pixel Map (PPM) format [1]. The conversion was done using version 0.6 of pcdtoppm on a Unix machine, from the 4x PCD resolution into a PPM format 1536 pixels wide by 1024 pixels high. Each image requires about 5 Mbytes of disk space. The PPM P6 format represents each pixel as a sequence of three bytes: the first byte is the value of the red color encoded on a linear scale of 0 to 255, and the second and third bytes represent the values of the green and blue colors.
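To make the storage format concrete, the following C++ fragment sketches how such a PPM P6 file might be read. It is a minimal illustration rather than project code: the file name is a placeholder, and comment lines beginning with '#', which the PPM header permits, are not handled.

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main() {
    // Open a PPM P6 file in binary mode (file name is a placeholder).
    std::ifstream in("image.ppm", std::ios::binary);
    if (!in) { std::cerr << "cannot open file\n"; return 1; }

    // Header: magic number "P6", width, height, maximum color value.
    // Note: '#' comment lines, allowed by the PPM format, are not handled here.
    std::string magic;
    int width, height, maxval;
    in >> magic >> width >> height >> maxval;
    if (magic != "P6" || maxval != 255) { std::cerr << "unsupported format\n"; return 1; }
    in.get();  // consume the single whitespace byte that ends the header

    // Pixel data: three bytes per pixel (red, green, blue), row by row.
    std::vector<unsigned char> pixels(3 * width * height);
    in.read(reinterpret_cast<char*>(pixels.data()),
            static_cast<std::streamsize>(pixels.size()));

    // Example: red value of the pixel at column x, row y.
    int x = 0, y = 0;
    unsigned char red = pixels[3 * (y * width + x)];
    std::cout << "red(0,0) = " << int(red) << "\n";
    return 0;
}

At 1536 x 1024 pixels and three bytes per pixel, each image occupies 3 x 1536 x 1024 = 4,718,592 bytes, consistent with the roughly 5 Mbytes per file quoted above.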
Each file also includes information about the image, such as the photographic tray and slide number, a human subjective judgment on a scale of 1 to 10, the Scenic Beauty Rating (SBE), the Normalized Scenic Beauty Rating (SBEZ), the block number, the treatment, the plot number, the date, the angle, and other miscellaneous information [1].

3. CLASSIFICATION TECHNIQUES

Once significant feature values have been extracted from the images and the most probable model obtained, we use different classification techniques to classify images into high, medium, and low scenic groups. In the previous phase of the project, principal components analysis was used to classify the images into the various groups. In the present phase, we will investigate classification methods based on more recently developed techniques: independent component analysis, support vector machines, and decision trees. A comparison of the results obtained from all of these techniques will be useful in estimating the efficiency of each algorithm, and the significant features can then be easily sorted out.

3.1 Linear Discriminant Analysis

Linear discriminant analysis (LDA) easily handles the case where the within-class frequencies are unequal. The method maximizes the ratio of the between-class variance to the within-class variance in a data set, thereby guaranteeing maximal separability, and has been applied to classification problems in image classification and speech recognition. The prime difference between LDA and PCA is that PCA does more of feature classification while LDA does more of data classification. LDA performs better than PCA in cases where the data of two classes overlap or the correlation between the features is significant. In PCA, the shape and location of the original data sets change when they are transformed to a different space, whereas LDA does not change the location but only tries to provide more class separability and draw a decision region between the given classes. The method also helps in understanding the distribution of the feature data. A small numerical sketch of the Fisher criterion underlying LDA is given after Section 3.2.

3.2 Support Vector Machines

The support vector machine (SVM) is a classification technique that emerged in the early 1990s. In this paradigm, input vectors are mapped into a higher-dimensional feature space through an a priori chosen non-linear mapping, where a linear decision region is constructed. When mapped back into the original feature space, this decision region can take a non-linear form. This is one of the greatest assets of SVMs: constructing non-linear decision regions gives better classification ability than traditional techniques such as linear discriminant analysis. Also, special properties of the decision surface ensure the good generalization ability of this learning machine. The main objective of the technique is to construct an optimal hyperplane, which uses a small part of the training set as support vectors. If the training vectors are separated without errors by an optimal hyperplane, the expected error rate on a test sample is bounded by the ratio of the expected number of support vectors to the number of training vectors. Since this ratio is independent of the dimension of the problem, if we can find a small set of support vectors, good generalization can be expected.
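To make the Fisher criterion of Section 3.1 concrete, the fragment below computes the classical two-class Fisher discriminant direction on a small made-up data set. It is a minimal sketch, not project code; the data values are invented for illustration, and the Eigen linear-algebra library is assumed for the matrix operations.

#include <Eigen/Dense>
#include <cmath>
#include <iostream>

// Minimal two-class Fisher discriminant sketch (Section 3.1).
// X1 and X2 hold one feature vector per row; the values are made up.
int main() {
    Eigen::MatrixXd X1(3, 2), X2(3, 2);
    X1 << 1.0, 2.0,  1.5, 2.2,  0.8, 1.9;   // class 1 samples
    X2 << 4.0, 5.0,  4.2, 5.5,  3.9, 4.8;   // class 2 samples

    Eigen::RowVectorXd m1 = X1.colwise().mean();
    Eigen::RowVectorXd m2 = X2.colwise().mean();

    // Within-class scatter: sum of the two class scatter matrices.
    Eigen::MatrixXd C1 = (X1.rowwise() - m1).transpose() * (X1.rowwise() - m1);
    Eigen::MatrixXd C2 = (X2.rowwise() - m2).transpose() * (X2.rowwise() - m2);
    Eigen::MatrixXd Sw = C1 + C2;

    // Fisher direction w = Sw^{-1} (m1 - m2): it maximizes the ratio of
    // between-class to within-class variance along the projection.
    Eigen::VectorXd w = Sw.ldlt().solve((m1 - m2).transpose());

    // Classify a new sample by its projection onto w, comparing distances
    // to the projected class means.
    Eigen::RowVectorXd x(2); x << 2.0, 3.0;
    double p = x.dot(w), p1 = m1.dot(w), p2 = m2.dot(w);
    std::cout << (std::abs(p - p1) < std::abs(p - p2) ? "class 1" : "class 2") << "\n";
}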
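Training an SVM requires solving a quadratic optimization problem, which is beyond the scope of a short sketch, but the resulting classifier of Section 3.2 has a simple form: a weighted sum of kernel evaluations against the support vectors. The fragment below sketches that decision function with a radial-basis-function kernel. The support vectors, weights, bias, and kernel width are made-up placeholders, not values produced by this project, and the Eigen library is again assumed.

#include <Eigen/Dense>
#include <cmath>
#include <iostream>
#include <vector>

// RBF kernel K(a, b) = exp(-gamma * ||a - b||^2); gamma sets the width.
double kernel(const Eigen::VectorXd& a, const Eigen::VectorXd& b, double gamma) {
    return std::exp(-gamma * (a - b).squaredNorm());
}

// SVM decision function: f(x) = sum_i (alpha_i * y_i) K(sv_i, x) + b.
// Only the support vectors, a small subset of the training set, appear here.
double decide(const std::vector<Eigen::VectorXd>& sv,
              const std::vector<double>& alpha_y,  // alpha_i * y_i from training
              double b, double gamma, const Eigen::VectorXd& x) {
    double f = b;
    for (std::size_t i = 0; i < sv.size(); ++i)
        f += alpha_y[i] * kernel(sv[i], x, gamma);
    return f;  // the sign of f gives the class label
}

int main() {
    // Placeholder support vectors and weights (illustrative only).
    std::vector<Eigen::VectorXd> sv(2, Eigen::VectorXd(2));
    sv[0] << 1.0, 1.0;
    sv[1] << 4.0, 5.0;
    std::vector<double> alpha_y = { 0.7, -0.7 };

    Eigen::VectorXd x(2); x << 1.2, 0.9;
    std::cout << (decide(sv, alpha_y, 0.0, 0.5, x) > 0 ? "+1" : "-1") << "\n";
}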
3.3 Decision Trees

A decision tree classifier is a distribution-free classifier that splits the N-dimensional feature space into unique regions by a sequential method. The algorithm is such that not every class need be tested to arrive at a decision, which becomes advantageous when the number of classes is very large. Moreover, unlike many other training algorithms, this algorithm is guaranteed to converge whether or not the feature space is linearly separable.

The objective of a symbolic decision-learning system is to provide an inference for the classification of all events, that is, to generalize the partial knowledge represented in the initial database of events. In other words, the procedure is to partition the event space with different decisions. Ideally, this should be an equivalence partition with large blocks, which increase the comprehensibility of the generated knowledge and usually improve the generalization properties. In general, however, the generated partition may be neither complete nor consistent. Decision tree classifier systems are made up of two different components: one to build the tree, another for knowledge inference. Each training instance contains a set of features and the associated classification. Each internal node is labeled with an attribute and has outgoing edges corresponding to the domain values of that attribute, and each node in the tree contains the subset of the training events that satisfies the conditions leading to the node.

3.4 Independent Component Analysis

Independent component analysis (ICA) is the name for a family of techniques which in recent years has been producing interesting results on multi- and single-source audio data, natural images, and other data. It is a simple, general purpose method for transforming ensembles of multivariate data into 'natural' coordinate systems. Independent component analysis will be applied to the existing results of the tests applied to the images, with the aim of determining a more accurate overall rating for each image. Like principal component analysis, it finds a linear coordinate system for the data, transforming it by a matrix W into a new basis set. Unlike PCA, the transformation is free to be non-orthogonal. Furthermore, the axes are sensitive to statistics of all orders, while in PCA only the second-order statistics of the covariance matrix are used. When the data are transformed by W, the resulting output variables are supposed to be as statistically independent as possible.

3.4.1 More on ICA

Consider an ensemble of data vectors $x$ with mean $\bar{x}$, and find a new affine basis

  $u = W(x - \bar{x})$ . (1)

Let us assume $u$ to be of zero mean and unit variance, and to be uncorrelated, i.e.

  $E[u] = 0, \quad E[u u^T] = I$ . (2)

Solve for $W$ in terms of $x$ (for simplicity assume $\bar{x} = 0$):

  $E[u u^T] = W E[x x^T] W^T = W C_x W^T = I$ , (3)

where $C_x = E[x x^T]$ is the covariance matrix of the data, with eigendecomposition $C_x = E D E^T$. Any

  $W = R C_x^{-1/2}, \quad R R^T = I$ (4)

satisfies (3). Here, W is UNDER-DETERMINED. We can decide that W should be:

1) orthogonal => that is PCA:

  $W = D^{-1/2} E^T$ . (5)

2) symmetrical => that is ZERO-PHASE DECORRELATION (ZCA): $W = C_x^{-1/2}$.

3) anything, chosen to make the outputs as statistically independent as possible => that is ICA.
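To illustrate these choices of W, the following sketch whitens a small made-up data matrix and forms both the PCA and ZCA solutions of equations (3)-(5). The data values are invented, the Eigen library is assumed for the eigendecomposition, and the final ICA rotation itself is not computed here.

#include <Eigen/Dense>
#include <iostream>

int main() {
    // Made-up data: one sample per row, two features (illustrative only).
    Eigen::MatrixXd X(4, 2);
    X << 2.0, 1.0,   1.0, -1.0,   -2.0, -1.5,   -1.0, 1.5;

    // Center the data and form the covariance matrix C_x (equation 3).
    Eigen::RowVectorXd mean = X.colwise().mean();
    Eigen::MatrixXd Xc = X.rowwise() - mean;
    Eigen::MatrixXd C = Xc.transpose() * Xc / double(X.rows());

    // Eigendecomposition C = E D E^T.
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(C);
    Eigen::MatrixXd E = es.eigenvectors();
    Eigen::VectorXd dinv = es.eigenvalues().cwiseSqrt().cwiseInverse();  // D^{-1/2}

    // PCA whitening (equation 5): W = D^{-1/2} E^T, orthogonal axes.
    Eigen::MatrixXd Wpca = dinv.asDiagonal() * E.transpose();

    // ZCA: W = C^{-1/2} = E D^{-1/2} E^T, a symmetric W.
    Eigen::MatrixXd Wzca = E * dinv.asDiagonal() * E.transpose();

    // Either choice satisfies W C W^T = I; ICA would further rotate the
    // whitened outputs to make them as statistically independent as possible.
    std::cout << "Wpca C Wpca^T =\n" << Wpca * C * Wpca.transpose() << "\n";
    std::cout << "Wzca C Wzca^T =\n" << Wzca * C * Wzca.transpose() << "\n";
}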
3.4.2 Why ICA

One method commonly used for data compression by dimension reduction is principal component analysis (PCA). The first principal component of a sample vector is the direction along which there is the largest variance over all samples; this direction corresponds to the eigenvector associated with the largest eigenvalue. The kth principal component is chosen to be the linear combination of the input features that has the largest variance, under the constraint that it is uncorrelated with the previous k-1 principal components. This approach is well suited to data compression, where the objective is to transmit data with minimal distortion.

PCA also finds application in pattern classification, under the assumption that the direction along which there is maximum variation is the one most likely to contain information about the class discrimination. However, the method of principal components depends on feature scaling. Multiplying the nth component of the input feature vector by some constant cannot possibly affect the information that component carries about the class with which the input feature vector is associated, yet as the constant grows the first principal component comes to point in the direction of that nth feature component, even though it may not contain any class-discrimination information.

The major advantage that ICA has over PCA is that, although it also finds a linear coordinate system for the data, the transformation is free to be non-orthogonal. Moreover, the method of choosing the axes is sensitive to statistics of all orders, while in PCA only the second-order statistics of the covariance matrix are used. Figure 1 demonstrates an example illustrating this basic difference between PCA and ICA.

4. TESTING

The ICA and SVM techniques will first be coded in C++ and tested with the training and testing data consisting of 12 features, with further results obtained as the number of features is reduced. The error performance will then be compared with the previous test results and with the human SBE ratings. These tests should yield a new space, with a reduced number of features, in which adequate discrimination information is available. The techniques will also be applied to sets of synthetic data with 2 or 3 features. The synthetic data will be data that fit the ICA and SVM models, and data points inside and outside of the classes will be tested.

5. PROJECT SUMMARY

The purpose of this project, Scenic Beauty Estimation Using Independent Component Analysis and Support Vector Machines, is to provide a more accurate estimation of the scenic beauty of forestry images from the database and to compare the various pattern recognition techniques. The project is a continuation of research in the Institute for Signal and Information Processing and of projects done in the course EE 4773/6773: Digital Signal Processing. ICA and SVM will be applied to the features extracted from the images using the pre-existing code. We expect these techniques to give a better scenic beauty estimation of the images.

6. PROJECT EVALUATION

The algorithms developed have to be evaluated to check the performance of the system in estimating scenic beauty. Different combinations of the models will be used to evaluate the algorithms, and a database is to be developed for this evaluation. The purpose of trying different combinations of the features is to identify the combination that gives a scenic beauty estimate with the least error. An extensive database has been made available by the USFS to support the development of algorithms that will automatically estimate scenic quality; it contains a total of 700 images. Photographs taken under controlled conditions have been digitized using an extremely high quality scanning process and converted into computer readable data. The performance of an algorithm is a measurement of its efficiency in estimating the scenic beauty of the images. A total of 45 features were extracted from the images, and different combinations of these features will be tried to identify the best system.
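As a sketch of how such a feature-combination search might be organized, the fragment below enumerates subsets of a small feature set with a bit mask and keeps the subset with the lowest error. The evaluate() function is a hypothetical placeholder for training and scoring a classifier on the selected features; it is not part of the existing project code, and for the full 45-feature set an exhaustive search of this kind would be infeasible.

#include <iostream>
#include <vector>

// Hypothetical placeholder: a real version would train the ICA/SVM system
// on the selected features and return its error rate on held-out data.
double evaluate(const std::vector<int>& features) {
    return 1.0 / (1.0 + features.size());  // dummy score, illustration only
}

int main() {
    const int numFeatures = 12;   // e.g., the 12-feature test set of Section 4
    double bestError = 1.0;
    unsigned bestMask = 0;

    // Each bit mask selects one subset of the features. 2^12 = 4096 subsets
    // is tractable; for 45 features (2^45 subsets) a greedy or heuristic
    // search would be used instead.
    for (unsigned mask = 1; mask < (1u << numFeatures); ++mask) {
        std::vector<int> selected;
        for (int i = 0; i < numFeatures; ++i)
            if (mask & (1u << i)) selected.push_back(i);
        double err = evaluate(selected);
        if (err < bestError) { bestError = err; bestMask = mask; }
    }
    std::cout << "best subset mask: " << bestMask
              << "  error: " << bestError << "\n";
}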
This project will be evaluated by comparing the existing Scenic Beauty Rating (SBE) and the Normalized Scenic Beauty Rating (SBEZ), the previous results obtained using other algorithms, and the results generated by applying ICA and SVM to the same data sets. Histograms will be used in the evaluation. With this, it will be possible to quantify the differences between ICA, SVM, and other algorithms such as PCA and LDA.

7. PROJECT DEMONSTRATION

The demonstration for this project will be an active test of our system on images from the database chosen by the audience. The images will be evaluated using the ICA and SVM algorithms, and the ratings obtained using previous algorithms will be compared with the ratings obtained using ICA and SVM.

8. TASKS AHEAD

9. REFERENCES

[1] N. Kalidindi, A. Le, J. Picone and V. A. Rudis, "The USFS Scenic Beauty Estimation Database," USFS Southern Research Station, Institute for Signal and Information Processing, Mississippi State University, Mississippi State, Mississippi, USA, December 1996.

[2] N. Kalidindi and J. Picone, "Scenic Beauty Estimation of Forestry Images," USFS Project Final Report, Institute for Signal and Information Processing, Mississippi State University, Mississippi State, Mississippi, USA, December 1997.

[3] N. Kalidindi, A. Le, L. Zheng, H. Yaquin, J. Picone and V. A. Rudis, "Scenic Beauty Estimation of Forestry Images," Proceedings of IEEE Southeastcon, pp. 337-339, Blacksburg, Virginia, USA, April 1997.

[4] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, John Wiley and Sons Inc., New York, New York, USA, 1973.

[5] A. J. Bell and T. J. Sejnowski, "An Information-Maximization Approach to Blind Separation and Blind Deconvolution," Technical Report No. INC-9501, Institute for Neural Computing, University of California at San Diego, San Diego, California, USA, February 1995.

[6] A. K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall International Inc., Englewood Cliffs, New Jersey, USA, 1989.

[7] R. Brown, S. Balakrishnama, and J. Picone, "Scenic Beauty Estimation using Linear Discriminant Analysis," MS State ECE 4773 Semester Project, Institute for Signal and Information Processing, Mississippi State University, Mississippi State, Mississippi, USA, December 1997.

Figure 1. The data are distributed uniformly in a diamond. The ICA axes are non-orthogonal, while the PCA axes are orthogonal.