We present results from two experiments that empirically validate that separable bivariate pairs for univariate representations of large-magnitude-range vectors are more efficient than integral pairs. The first experiment, with 20 participants, compared one integral pair, three separable pairs, and one redundant pair that mixes integral and separable features. Participants performed three local tasks requiring reading numerical values, estimating ratios, and comparing two points. The second study, with 18 participants, compared three separable pairs using three global tasks in which participants had to examine the entire field to answer: find a specific target within 20 seconds, find the maximum magnitude within 20 seconds, and estimate the total number of vector exponents within 2 seconds. Our results reveal the following: separable pairs led to the most accurate answers and the shortest task execution times, while the integral pair was among the least accurate; it achieved high performance only when a pop-out separable feature (here, color) was added. Reconciling this finding with the existing literature, our second experiment suggests that the higher the separability, the higher the accuracy, probably because the emergent global scene created by the separable pairs reduces the subsequent search space.
Little is known about how people learn from a brief glimpse of three-dimensional (3D) bivariate vector field visualizations and about how well visual features can guide behavior. Here we report empirical study results on the use of color, texture, and length to guide viewing of bivariate glyphs: each of these three visual features is mapped to the first, integer variable (v1), while length encodes the second, quantitative variable (v2). Participants performed two tasks within 20 seconds: (1) MAX: find the largest v2 when v1 is fixed; (2) SEARCH: find a specific bivariate variable shown on the screen in a vector field. Our first study, with eighteen participants performing these tasks, showed that randomized vector positions, although they lessened viewers' ability to group vectors, did not reduce task accuracy compared to structured vector fields. This result suggests that color, texture, and length can, to a certain degree, guide viewers' attention to task-relevant regions. The second study measured eye movements to quantify viewers' behaviors with three error metrics (scanning, recognition, and decision errors) and one behavior metric (refixation). Our results showed two dominant search strategies: drilling and scanning. Color tended to restrict eye movements to the task-relevant regions of interest, enabling drilling. Length tended to support scanners, who quickly wandered across different v1 levels. Drillers made significantly fewer errors than scanners, and the error rates for color and texture were also the lowest. Length had less discrimination power than color and texture as a 3D visual guide. Our experimental results suggest that using a categorical visual feature could help viewers obtain the global structure of a vector field visualization. We provide the first benchmark of the attentional cost of reading a bivariate vector: on average, about 5 items per second.
Visualization of large vector line data is a core task in geographic and cartographic systems. Vector maps are often displayed at different cartographic generalization levels, traditionally by using several discrete levels-of-detail (LODs). This limits the generalization levels to a fixed and predefined set of LODs and generally does not support smooth LOD transitions. However, fast GPUs and novel line rendering techniques can be exploited to integrate dynamic vector map LOD management into GPU-based algorithms for locally adaptive line simplification and real-time rendering. We propose a new technique that interactively visualizes large line vector datasets at variable LODs. It is based on the Douglas-Peucker line simplification principle, generating an exhaustive set of line segments whose specific subsets represent the lines at any variable LOD. At run time, a view-dependent error metric drives screen-space-adaptive LOD selection and the display of the corresponding subset of line segments. Our implementation shows that we can simplify and display large line datasets interactively. We can successfully apply line style patterns, dynamic LOD selection lenses, and anti-aliasing techniques to our line rendering.
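The preprocessing idea behind this technique, assigning every polyline vertex the Douglas-Peucker error at which it would be removed so that any LOD can be obtained by thresholding, can be illustrated with a short sketch. The Python code below is a minimal, CPU-side illustration of that principle, not the paper's GPU implementation; the function names, the lack of monotonicity enforcement on per-vertex errors, and the plain distance tolerance (rather than a view-dependent screen-space metric) are simplifying assumptions.

```python
import numpy as np

def point_segment_distance(p, a, b):
    """Perpendicular distance of point p to the segment a-b."""
    ab = b - a
    denom = np.dot(ab, ab)
    if denom == 0.0:
        return np.linalg.norm(p - a)
    t = np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def dp_vertex_errors(points):
    """Assign every vertex the Douglas-Peucker error at which it is removed.

    Endpoints get +inf so they survive every simplification level. Thresholding
    these errors at run time yields the vertex subset for any variable LOD.
    (Errors are not clamped to be monotone along the DP tree in this sketch.)
    """
    pts = np.asarray(points, dtype=float)
    errors = np.full(len(pts), np.inf)

    def recurse(i, j):
        if j <= i + 1:
            return
        d = np.array([point_segment_distance(pts[k], pts[i], pts[j])
                      for k in range(i + 1, j)])
        k = i + 1 + int(np.argmax(d))   # vertex farthest from the chord
        errors[k] = d.max()             # error at which this vertex disappears
        recurse(i, k)
        recurse(k, j)

    recurse(0, len(pts) - 1)
    return errors

def select_lod(points, errors, tolerance):
    """Return the simplified polyline for a given tolerance."""
    keep = errors >= tolerance
    return np.asarray(points)[keep]
```

In the GPU setting described in the abstract, the per-vertex errors would be precomputed once and the thresholding step replaced by a view-dependent, screen-space error test evaluated at render time.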
The vast availability of large-scale, massive, and big data has increased the computational cost of data analysis. One such case is the cost of univariate filtering, which typically involves fitting many univariate regression models and is essential for numerous variable selection algorithms that reduce the number of predictor variables. This paper demonstrates how to dramatically reduce that computational cost by employing the score test or the simple Pearson correlation (or the t-test for binary responses). Extensive Monte Carlo simulation studies demonstrate their advantages and disadvantages compared to the likelihood ratio test, and examples with real data illustrate the performance of the score test and the log-likelihood ratio test under realistic scenarios. Depending on the regression model used, the score test is 30 to 60,000 times faster than the log-likelihood ratio test and produces nearly the same results. Hence, this paper strongly recommends substituting the score test for the log-likelihood ratio test when coping with large-scale, massive, or big data, or even with data whose sample size is on the order of a few tens of thousands or higher.
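As a rough sketch of where the speed-up comes from (assuming a continuous response and one simple linear regression per predictor), the score test for each slope is equivalent to testing the Pearson correlation, so all predictors can be screened with a single matrix product instead of fitting one model per column. The Python code below is a generic illustration of this correlation-based screening idea, not the paper's implementation; the function name and the significance threshold are assumptions.

```python
import numpy as np
from scipy import stats

def univariate_filter_pearson(X, y, alpha=0.05):
    """Screen predictors via Pearson correlation with the response.

    Equivalent to the score test for the slope of a simple linear regression,
    but computed with one matrix product instead of p separate model fits.
    Assumes every column of X has nonzero variance.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()))
    # t statistic and two-sided p-value for each correlation coefficient
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    pvals = 2 * stats.t.sf(np.abs(t), df=n - 2)
    selected = np.where(pvals < alpha)[0]
    return selected, pvals
```

The same screening could be run with a likelihood ratio test per predictor, but that requires p iterative model fits, which is where the reported 30 to 60,000-fold slowdown arises.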
Vector quantization is an essential tool for tasks involving large-scale data, for example, large-scale similarity search, which is crucial for content-based information retrieval and analysis. In this paper, we propose a novel vector quantization framework that iteratively minimizes quantization error. First, we provide a detailed review of a relevant vector quantization method named residual vector quantization (RVQ). Next, we propose generalized residual vector quantization (GRVQ) to further improve over RVQ. Many vector quantization methods can be viewed as special cases of our proposed framework. We evaluate GRVQ on several large-scale benchmark datasets for large-scale search, classification, and object retrieval, and compare GRVQ with existing methods in detail. Extensive experiments demonstrate that our GRVQ framework substantially outperforms existing methods in terms of quantization accuracy and computational efficiency.
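For readers unfamiliar with the RVQ baseline that GRVQ generalizes, the following Python sketch shows plain residual vector quantization: each stage fits a k-means codebook to the residuals left by the previous stages, and a vector is encoded as one codeword index per stage. The iterative refinement of earlier codebooks that characterizes GRVQ is not shown; the codebook sizes, function names, and use of scikit-learn's KMeans are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rvq(X, n_stages=4, n_codewords=256, seed=0):
    """Train plain RVQ: one k-means codebook per stage, each fit on the
    residuals left by the previous stages."""
    residual = X.astype(np.float64).copy()
    codebooks = []
    for _ in range(n_stages):
        km = KMeans(n_clusters=n_codewords, n_init=4, random_state=seed).fit(residual)
        codebooks.append(km.cluster_centers_)
        residual -= km.cluster_centers_[km.labels_]
    return codebooks

def encode_rvq(X, codebooks):
    """Encode each vector as one codeword index per stage."""
    residual = X.astype(np.float64).copy()
    codes = []
    for C in codebooks:
        # nearest codeword for every residual vector (brute force, for clarity)
        d = ((residual[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(axis=1)
        codes.append(idx)
        residual -= C[idx]
    return np.stack(codes, axis=1)

def decode_rvq(codes, codebooks):
    """Reconstruct vectors by summing the selected codewords of all stages."""
    return sum(C[codes[:, s]] for s, C in enumerate(codebooks))
```

GRVQ's contribution, as described in the abstract, is to keep revisiting and re-optimizing the stage codebooks to drive the residual (quantization) error down further than this greedy stage-by-stage training can.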
How to automatically generate a realistic large-scale 3D road network is a key problem for immersive and credible traffic simulations. Existing methods cannot automatically generate various kinds of intersections in 3D space from GIS data. In this paper, we propose a method to automatically generate complex, large-scale 3D road networks from open-source GIS data, taking satellite imagery, elevation data, and two-dimensional (2D) road center-axis data as input. We first introduce a semantic structure of the road network to obtain highly detailed and well-formed networks in a 3D scene. We then generate the 2D shapes and topological data of the road network according to the semantic structure and the 2D road center-axis data. Finally, we segment the elevation data and generate the surface of the 3D road network according to the 2D semantic data and the satellite imagery. Results show that our method performs well in generating various types of intersections and highly detailed road features. The traffic semantic structure, which must be provided for traffic simulation, can also be generated automatically by our method.
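As a hedged illustration of the final pipeline stage only (turning a 2D road center axis plus elevation data into a 3D road surface), the Python sketch below offsets a center-axis polyline by half the road width and samples heights from an elevation grid. The function name, the nearest-cell height sampling, and the triangle-strip output format are assumptions for illustration; intersection geometry, the semantic structure, and satellite-imagery texturing from the paper are not covered.

```python
import numpy as np

def drape_road_surface(centerline_2d, width, elevation, cell_size, origin=(0.0, 0.0)):
    """Build a simple 3D road surface strip from a 2D center-axis polyline.

    Offsets the centerline to both sides by half the road width and samples
    heights from a regular elevation grid (nearest cell).
    """
    pts = np.asarray(centerline_2d, dtype=float)
    # per-vertex tangents and left-hand normals of the center axis
    tangents = np.gradient(pts, axis=0)
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)
    normals = np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)

    left = pts + normals * (width / 2.0)
    right = pts - normals * (width / 2.0)

    def sample_height(xy):
        col = ((xy[:, 0] - origin[0]) / cell_size).astype(int)
        row = ((xy[:, 1] - origin[1]) / cell_size).astype(int)
        row = np.clip(row, 0, elevation.shape[0] - 1)
        col = np.clip(col, 0, elevation.shape[1] - 1)
        return elevation[row, col]

    left_3d = np.column_stack([left, sample_height(left)])
    right_3d = np.column_stack([right, sample_height(right)])
    # interleave left/right vertices so the result can be drawn as a triangle strip
    return np.stack([left_3d, right_3d], axis=1).reshape(-1, 3)
```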