Fig. 4 | Applied Network Science

Fig. 4

From: Semisupervised regression in latent structure networks on unknown manifolds

Plot of the difference between the empirical powers of the tests for model validity based on the 1-dimensional raw-stress embeddings and on the true regressors. The arclength-parameterized manifold is taken to be \(\psi ([0,1])\), where \(\psi (t)=(t/2,t/2,t/2,t/2)\). For a small fixed number (\(s=20\)) of nodes, responses \(y_i\) are generated from \(y_i=\alpha +\beta t_i + \epsilon _i\), \(\epsilon _i \overset{iid}{\sim } N(0,\sigma ^2_{\epsilon })\). A large number \((n-s)\) of auxiliary nodes are generated on \(\psi ([0,1])\), where n is the total number of nodes in the graph, and a localization graph is constructed on the adjacency spectral estimates. When n is the K-th term of the vector \((100,150,200,\dots ,1000)\), the neighbourhood parameter is taken to be \(\lambda =0.9 \times 0.99^{K-1}\). The dissimilarity matrix of shortest-path distances is embedded into one dimension by minimizing the raw-stress criterion. To test \(H_0:\beta =0\) vs \(H_1: \beta \ne 0\), the test statistic \(F^*\) based on the true regressors \(t_i\) and the test statistic \({\hat{F}}\) based on the 1-dimensional raw-stress embeddings \({\hat{z}}_i\) are compared. The corresponding powers are estimated empirically as the proportion of 100 Monte Carlo samples in which each test statistic rejects \(H_0\), for every n from 100 to 1000 in steps of 50. The plot shows that the difference between the estimated powers of the two tests goes to zero, indicating that the tests based on the raw-stress embeddings are almost as good as the tests based on the true regressors when the number of auxiliary nodes is sufficiently large.
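The neighbourhood schedule and the Monte Carlo power comparison described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the significance level 0.05 (and hence the hard-coded \(F_{1,18}\) critical value), the uniform distribution of \(t_i\), and the default values of \(\alpha\), \(\sigma_{\epsilon}\) are assumptions that the caption does not state. The raw-stress embeddings are replaced by a hypothetical stand-in (`t_i` plus Gaussian noise of size `embed_noise`), so the spectral estimation, localization graph, and stress minimization are not reproduced here.

```python
import random

# Upper 5% point of the F(1, 18) distribution, 18 = s - 2 with s = 20.
# The level 0.05 is an assumption; the caption does not give one.
F_CRIT_1_18 = 4.41


def neighbourhood_schedule():
    """Pairs (n, lambda): n = 100, 150, ..., 1000 and lambda = 0.9 * 0.99**(K-1)."""
    ns = range(100, 1001, 50)
    # enumerate starts at 0, so k plays the role of K - 1.
    return [(n, 0.9 * 0.99 ** k) for k, n in enumerate(ns)]


def f_statistic(x, y):
    """F statistic for H0: beta = 0 in the simple linear regression of y on x."""
    m = len(x)
    xbar = sum(x) / m
    ybar = sum(y) / m
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    ssr = sxy ** 2 / sxx          # regression sum of squares (1 d.f.)
    sse = syy - ssr               # residual sum of squares (m - 2 d.f.)
    return ssr / (sse / (m - 2))


def power_difference(beta, embed_noise, reps=100, s=20,
                     alpha=2.0, sigma=0.5, seed=0):
    """Empirical power of F* (true t_i) minus that of F-hat (stand-in z_i).

    The default s = 20 matches the hard-coded critical value above.
    """
    rng = random.Random(seed)
    reject_true = 0
    reject_hat = 0
    for _ in range(reps):
        # Assumption: t_i drawn uniformly on [0, 1].
        t = [rng.random() for _ in range(s)]
        y = [alpha + beta * ti + rng.gauss(0.0, sigma) for ti in t]
        # Hypothetical stand-in for the raw-stress embeddings z_i-hat.
        z = [ti + rng.gauss(0.0, embed_noise) for ti in t]
        reject_true += f_statistic(t, y) > F_CRIT_1_18
        reject_hat += f_statistic(z, y) > F_CRIT_1_18
    return (reject_true - reject_hat) / reps


if __name__ == "__main__":
    for n, lam in neighbourhood_schedule()[:3]:
        print(n, round(lam, 4))
    print(power_difference(beta=1.0, embed_noise=0.2))
```

Setting `embed_noise=0.0` makes the stand-in regressors equal to the true ones, so the two tests coincide and the returned difference is exactly 0. Shrinking `embed_noise` towards zero loosely mimics the embeddings improving as the number of auxiliary nodes grows, which is the regime the figure reports.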
