ESTHER: An R Package Implementing a Novel Approach to Bidimensional Display of Multidimensional Binary Data

2007 
The R package ESTHER implements two novel algorithms designed to arrange, in a reduced space, multidimensional objects defined by binary descriptors. The approach is to assign discrete and fixed positions to all possible combinations obtained with the descriptors employed. One of the two algorithms, called clock, positions objects on a circle at regular intervals, whereas the other, star, maintains the angular position as in clock but sets the distance of the object from the center of the circle proportionally to the number of descriptors in state "1". Comparisons with Principal Coordinate Analysis (PCoA) showed that the three methods perform differently according to the number of objects and descriptors and to the distance method employed to carry out the PCoA. The algorithm clock produced the best object clustering in a validation carried out with a matrix generated by molecular fingerprinting of yeast isolates.
An additional problem, which is at the origin of the algorithms proposed in this paper, is the impossibility of showing in the PCoA graph the proportion of the descriptors in use. In fact, n binary variables can describe a maximum of 2^n possible objects, although the number of biological objects studied is normally well below 2^n. In normal conditions, every object (excluding repetitions) is represented by one of the 2^n combinations obtained with the n descriptors in use, meaning that only a few combinations are actually present in the matrix. Showing the relationships among these few combinations, and their positioning within the whole set of possible combinations, is a further aim of the present algorithms, named ESTHER after the ancient Persian word meaning star, because of some similarity between the point scattering and the stylized depiction of a star.
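As a rough illustration of the two layouts described above, the following Python sketch (not the package's R code) computes clock-style and star-style coordinates for one binary vector. Three points are assumptions made only for illustration: combinations are ordered around the circle by reading the descriptors as a binary number, clock uses a unit radius, and the star radius is the count of descriptors in state "1" divided by the number of descriptors. The function names are hypothetical.

```python
import math


def combination_index(descriptors):
    """Read a 0/1 sequence as a binary number (first descriptor = most significant bit)."""
    index = 0
    for bit in descriptors:
        if bit not in (0, 1):
            raise ValueError("descriptors must contain only 0 and 1")
        index = index * 2 + bit
    return index


def combination_angle(descriptors):
    """Angle of a combination when all 2^n combinations are spaced at regular intervals.

    Assumption: the ordering around the circle follows combination_index.
    """
    n = len(descriptors)
    if n == 0:
        raise ValueError("at least one descriptor is required")
    return 2 * math.pi * combination_index(descriptors) / 2 ** n


def clock_position(descriptors):
    """Clock-style layout: fixed angle, constant (unit) radius."""
    angle = combination_angle(descriptors)
    return (math.cos(angle), math.sin(angle))


def star_position(descriptors):
    """Star-style layout: same angle as clock, radius proportional to the number of 1s.

    Assumption: the radius is normalized by the number of descriptors.
    """
    angle = combination_angle(descriptors)
    radius = sum(descriptors) / len(descriptors)
    return (radius * math.cos(angle), radius * math.sin(angle))
```

With this ordering, two objects sharing the same descriptor states always fall on the same point, and in the star-style layout the all-zero combination sits at the center of the circle.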
METHODS
General Presentation of the Algorithms Included in ESTHER
ESTHER includes seven functions. Two of them, clock and star, are designed to position objects described by several binary descriptors in a bidimensional space; both include an option to evaluate the quality of the point scattering in the bidimensional graph. Four functions are auxiliary: import directly imports binary matrices, and the other three produce binary matrices. fullmat generates matrices with all 2^n combinations, partmat yields only a defined portion of all combinations, and ranmat generates random combinations with the desired number of objects and descriptors. The seventh function, shepard, returns a double plot with the Shepard diagrams of ESTHER and PCoA.
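To make the matrix generators more concrete, here is a minimal Python sketch of what generating all 2^n combinations and generating random combinations can look like. It is not the package's R code: the names full_matrix, random_matrix and occupied_fraction are hypothetical, and the sketch does not attempt to reproduce the arguments or output format of fullmat and ranmat. The helper occupied_fraction is an added illustration of how few of the 2^n combinations a typical matrix actually uses.

```python
import itertools
import random


def full_matrix(n_descriptors):
    """All 2^n binary combinations, one row per combination."""
    return [list(row) for row in itertools.product((0, 1), repeat=n_descriptors)]


def random_matrix(n_objects, n_descriptors, seed=None):
    """Random binary matrix with the requested number of objects (rows) and descriptors (columns)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_descriptors)]
            for _ in range(n_objects)]


def occupied_fraction(matrix):
    """Share of the 2^n possible combinations present in the matrix (repetitions counted once)."""
    n_descriptors = len(matrix[0])
    distinct = {tuple(row) for row in matrix}
    return len(distinct) / 2 ** n_descriptors
```

For example, a matrix of 50 objects described by 20 descriptors can occupy at most 50 of the 1,048,576 possible combinations, which is the situation the fixed-position layouts are meant to display.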