Feature Extraction and Cluster Analysis on Brain Imaging Data

[Animated picture: part of a TFR matrix from a single electrode]

This program, somewhat misguidedly named VV-Classifier, performs feature extraction and subsequent cluster analysis for certain EEG (electroencephalography) brain imaging data.

The Data

The data is so-called TFR (Time-Frequency Representation) data, also known as a spectrogram. What you see above is part of a matrix from a single electrode measuring brain activation. The X axis is time, on the order of milliseconds; the whole width of the matrix above represents about one second. (As is usual in cognitive brain research, the data is already averaged over a number of trials, so it doesn't represent a single physical period of time.) The Y axis is frequency, here about 0-40 Hz. Each point represents a voltage. Bright red shows the highest positive values and deep blue shows the lowest negative values.

This program reads data files in Matlab format made by NeuroScan software. Each Matlab file actually contains two vectors and one 3D matrix. The vectors give the scales for the X and Y axes. The 3D matrix contains data for all electrodes that were used, e.g. 20 or 64 channels.

Feature Extraction

The feature extraction is visualized in the animated picture above. Feature extraction begins by marking the highest and lowest areas, shown in white in the picture above. The high and low areas are enclosed in minimum bounding rectangles. The rectangles are "sliced" to get smaller rectangles that conform rather tightly to the shapes of the marked areas. In further processing, only the coordinates of the small rectangles are used. Thus, the information in a matrix with thousands of data points is reduced to a few dozen points, the coordinates of the rectangles.

Cluster analysis

The cluster analysis is based on a distance matrix. The distance metric is based on finding the closest corresponding corners in the two matrices being compared, e.g. the closest positive top-left corner in matrix B for each positive top-left corner in matrix A.

To perform the cluster analysis, you set the number of clusters you want to end up with and a cutoff percentage that says what portion of the data you're willing to leave out of the clusters. For example, you could say that you want at least 90% of the data to be clustered into the 3 largest clusters. You can always cluster 100% of the data if you wish, but it may be reasonable to assume that part of the data is 'unclean' or otherwise invalid. By leaving a part out, your main clusters contain only the percentage of the data that best fits those clusters.

The graphical result of the cluster analysis can be seen below. The cluster "Class 1" here is composed of 1591 individual matrices. This picture shows what is common to all those matrices. The brightness of red at each point is determined by the ratio of matrices belonging to the class that have a positive rectangle containing that point. The brightness of blue at each point is determined the same way using negative rectangles.

[Picture: graphical result of the cluster analysis for the cluster "Class 1"]

Features
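As a rough illustration of the cluster analysis described above, the following Python sketch shows one way the corner-based distance and the per-class picture could be computed. It is not the VV-Classifier source. The rectangle format `(sign, top, left, bottom, right)` with inclusive coordinates, the function names, the use of Euclidean distance, the averaging, the restriction to top-left and bottom-right corners, the `missing_penalty` value, and the symmetrization are all assumptions made for the example.

```python
from math import hypot

# Hypothetical rectangle format: (sign, top, left, bottom, right), inclusive
# coordinates, sign +1 for a high (positive) area and -1 for a low (negative) one.


def corners(rects, sign, which):
    """Collect one kind of corner, e.g. all positive top-left corners."""
    points = []
    for s, top, left, bottom, right in rects:
        if s == sign:
            points.append((top, left) if which == "top-left" else (bottom, right))
    return points


def directed_distance(rects_a, rects_b, missing_penalty=100.0):
    """Mean distance from each corner in A to the closest matching corner in B.

    Assumed details: Euclidean distance, averaging over corners, and a fixed
    penalty when B has no corner of the required sign and kind.
    """
    total, count = 0.0, 0
    for sign in (+1, -1):
        for which in ("top-left", "bottom-right"):
            b_points = corners(rects_b, sign, which)
            for ay, ax in corners(rects_a, sign, which):
                if b_points:
                    total += min(hypot(ay - by, ax - bx) for by, bx in b_points)
                else:
                    total += missing_penalty
                count += 1
    return total / count if count else 0.0


def distance(rects_a, rects_b):
    """Symmetrized distance (an assumption), one entry of the distance matrix."""
    return 0.5 * (directed_distance(rects_a, rects_b)
                  + directed_distance(rects_b, rects_a))


def class_overlay(members, height, width, sign):
    """For each point, the fraction of class members that have a rectangle
    of the given sign containing that point."""
    counts = [[0] * width for _ in range(height)]
    for rects in members:
        covered = set()  # count each member at most once per point
        for s, top, left, bottom, right in rects:
            if s == sign:
                for y in range(top, bottom + 1):
                    for x in range(left, right + 1):
                        covered.add((y, x))
        for y, x in covered:
            counts[y][x] += 1
    n = len(members) or 1  # avoid division by zero for an empty class
    return [[c / n for c in row] for row in counts]


a = [(+1, 0, 0, 2, 2)]
c = [(+1, 3, 4, 5, 6)]
print(distance(a, c))  # 5.0: both corner pairs are a 3-4-5 triangle apart
```

The values returned by `class_overlay` with `sign=+1` would correspond to the brightness of red at each point, and those with `sign=-1` to the brightness of blue.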