Stress majorization is an optimization strategy used in multidimensional scaling (MDS) where, for a set of $n$ $m$-dimensional data items, a configuration $X$ of $n$ points in $r (\ll m)$-dimensional space is sought that minimizes the so-called stress function $\sigma(X)$. Usually $r$ is 2 or 3, i.e. the $(n \times r)$ matrix $X$ lists points in 2- or 3-dimensional Euclidean space so that the result may be visualised (i.e. an MDS plot). The function $\sigma$ is a cost or loss function that measures the squared differences between ideal ($m$-dimensional) distances and actual distances in $r$-dimensional space. It is defined as:

$$\sigma(X) = \sum_{i<j\le n} w_{ij}\,\bigl(d_{ij}(X) - \delta_{ij}\bigr)^2$$

where $w_{ij} \ge 0$ is a weight for the measurement between a pair of points $(i,j)$, $d_{ij}(X)$ is the Euclidean distance between $i$ and $j$, and $\delta_{ij}$ is the ideal distance between the points (their separation) in the $m$-dimensional data space. Note that $w_{ij}$ can be used to specify a degree of confidence in the similarity between points (e.g. 0 can be specified if there is no information for a particular pair).

A configuration $X$ which minimizes $\sigma(X)$ gives a plot in which points that are close together correspond to points that are also close together in the original $m$-dimensional data space.

There are many ways that $\sigma(X)$ could be minimized. For example, Kruskal recommended an iterative steepest descent approach. However, a significantly better method (in terms of guarantees on, and rate of, convergence) for minimizing stress was introduced by Jan de Leeuw. De Leeuw's iterative majorization method at each step minimizes a simple convex function which both bounds $\sigma$ from above and touches the surface of $\sigma$ at a point $Z$, called the supporting point. In convex analysis such a function is called a majorizing function. This iterative majorization process is also referred to as the SMACOF algorithm ("Scaling by MAjorizing a COmplicated Function").

The stress function $\sigma$ can be expanded as follows:

$$\sigma(X) = \sum_{i<j\le n} w_{ij}\bigl(d_{ij}(X) - \delta_{ij}\bigr)^2 = \sum_{i<j} w_{ij}\delta_{ij}^2 + \sum_{i<j} w_{ij}d_{ij}^2(X) - 2\sum_{i<j} w_{ij}\delta_{ij}d_{ij}(X)$$

Note that the first term is a constant $C$ and the second term is quadratic in $X$ (i.e. for the Hessian matrix $V$ the second term is equivalent to $\operatorname{tr}\, X'VX$) and therefore relatively easily solved. The third term is bounded, by the Cauchy–Schwarz inequality, as:

$$\sum_{i<j} w_{ij}\delta_{ij}d_{ij}(X) \;\ge\; \operatorname{tr}\, X'B(Z)Z$$

where $B(Z)$ has:

$$b_{ij} = -\frac{w_{ij}\delta_{ij}}{d_{ij}(Z)} \quad \text{for } d_{ij}(Z) \ne 0,\; i \ne j$$

and $b_{ij} = 0$ for $d_{ij}(Z) = 0,\, i \ne j$, and $b_{ii} = -\sum_{j=1, j\ne i}^{n} b_{ij}$.
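The excerpt above stops at the definition of $B(Z)$ without saying how the pieces fit together. As a brief completion, standard in the SMACOF literature but not spelled out in the original text: substituting the Cauchy–Schwarz bound on the third term into the expansion gives a quadratic majorizing function

$$\sigma(X) \;\le\; \tau(X, Z) = C + \operatorname{tr}\, X'VX - 2\operatorname{tr}\, X'B(Z)Z,$$

which touches $\sigma$ at $X = Z$. Setting the gradient $\nabla\tau = 2VX - 2B(Z)Z$ to zero yields the minimizer

$$X = V^{+}B(Z)Z,$$

known as the Guttman transform, where $V^{+}$ is the Moore–Penrose pseudoinverse of $V$. Each iteration takes $Z$ to be the current configuration and replaces it with $V^{+}B(Z)Z$; because $\tau$ majorizes $\sigma$, the stress never increases.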
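For concreteness, here is a minimal NumPy sketch of the resulting iteration. It is an illustrative implementation under the definitions above, not code from the original text: the names `smacof`, `delta`, and `W` are invented for the example, a random starting configuration is assumed, and $V$ is built in its standard form $v_{ij} = -w_{ij}$ for $i \ne j$, $v_{ii} = \sum_{j\ne i} w_{ij}$.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances d_ij(X) between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def stress(X, delta, W):
    """sigma(X) = sum_{i<j} w_ij (d_ij(X) - delta_ij)^2."""
    d = pairwise_distances(X)
    iu = np.triu_indices(len(X), k=1)
    return (W[iu] * (d[iu] - delta[iu]) ** 2).sum()

def smacof(delta, W=None, r=2, n_iter=300, tol=1e-9, seed=0):
    """Iterate the Guttman transform X <- V+ B(Z) Z until stress stalls."""
    n = delta.shape[0]
    W = np.ones((n, n)) if W is None else W.copy()
    np.fill_diagonal(W, 0.0)

    # V: v_ij = -w_ij for i != j, v_ii = sum_{j != i} w_ij.
    V = -W
    np.fill_diagonal(V, W.sum(axis=1))
    V_pinv = np.linalg.pinv(V)          # Moore-Penrose pseudoinverse V+

    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, r))     # random starting configuration
    old = stress(X, delta, W)
    for _ in range(n_iter):
        d = pairwise_distances(X)
        # B(Z): b_ij = -w_ij delta_ij / d_ij(Z) when d_ij(Z) != 0,
        # b_ij = 0 when d_ij(Z) = 0, and b_ii = -sum_{j != i} b_ij.
        with np.errstate(divide="ignore", invalid="ignore"):
            B = np.where(d > 0, -W * delta / d, 0.0)
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = V_pinv @ B @ X              # Guttman transform
        new = stress(X, delta, W)
        if old - new < tol:             # stress is non-increasing
            break
        old = new
    return X
```

Calling `smacof(delta)` with a symmetric $(n \times n)$ matrix of ideal distances returns an $(n \times 2)$ configuration suitable for an MDS plot. The monotone decrease of stress checked in the loop is exactly the convergence guarantee that distinguishes majorization from Kruskal's steepest descent approach.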