Automatic detection and localisation of objects such as poles, traffic signs, and building corners in street scenes captured by a mobile mapping system has many applications. Template matching is a technique that can automatically recognise the counterparts, or correspondents, of an object in multi-view images. In this study, we aim to find correspondents of an object in wide-baseline panoramic images that exhibit large geometric deformations from spherical projection and significant systematic errors from the multi-camera rig geometry. First, we derive the camera model and epipolar model of a multi-camera rig system. Then, epipolar errors are analysed to determine the search area for pixelwise matching. A low-cost laser scanner is optionally used to constrain the depth of an object. Lastly, several classic feature descriptors are introduced to template matching and evaluated on the multi-view panoramic image dataset. We propose a template matching method that incorporates a fast variant of the scale-invariant feature transform (SIFT) descriptor. In our experiments, the method achieved the best accuracy and efficiency compared with other feature descriptors and the most recent robust template matching methods.
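The idea of restricting pixelwise template matching to a bounded search area can be illustrated with a minimal sketch. It is not the method of the abstract: the similarity measure here is plain normalised cross-correlation rather than a SIFT-style descriptor, and the rectangular search area (`row_range`, `col_range`) is a hypothetical stand-in for the area that the epipolar error analysis would provide. All function names are illustrative.

```python
import math


def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equally sized 2D patches (lists of rows)."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    # A constant patch has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0


def match_template(image, template, row_range, col_range):
    """Return (score, row, col) of the best top-left position inside the search area.

    row_range and col_range are inclusive (min, max) bounds; they are assumed
    to be given, for example from an epipolar constraint.
    """
    th, tw = len(template), len(template[0])
    best = (-2.0, -1, -1)
    for r in range(row_range[0], row_range[1] + 1):
        for c in range(col_range[0], col_range[1] + 1):
            # Skip positions where the template would fall outside the image.
            if r < 0 or c < 0 or r + th > len(image) or c + tw > len(image[0]):
                continue
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


# Toy example: a 2x2 pattern placed at row 2, column 3 of an otherwise empty image.
template = [[1, 9], [7, 3]]
image = [[0] * 6 for _ in range(5)]
image[2][3], image[2][4] = 1, 9
image[3][3], image[3][4] = 7, 3
score, row, col = match_template(image, template, (0, 3), (1, 4))
print(score, row, col)
```

Narrowing `row_range` and `col_range` reduces both the running time and the chance of a false match, which is the motivation for deriving the search area from the epipolar geometry.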
Remote sensing image retrieval aims to find the images most identical or similar to a query image in a vast archive of remote sensing images. A key step is extracting the most distinctive features. In this study, we introduce a second-order pooling method, compact bilinear pooling (CBP), into convolutional neural networks (CNNs) for remote sensing image retrieval. The retrieval algorithm has three stages: pretraining, fine-tuning, and retrieval. In the pretraining stage, two classic CNN structures, VGG16 and ResNet34, are each pretrained on ImageNet, which consists of close-range images. A CBP layer is introduced before the fully connected layers in both networks. To extract globally consistent representations, a channel and spatial integrated attention mechanism is proposed to refine features from the last convolution layer, and the refined features are used as the input of the CBP. In the fine-tuning stage, the new network is fine-tuned on a remote sensing dataset to learn discriminative features. In the retrieval stage, the network, with the fully connected layers replaced by a principal component analysis (PCA) module, is applied to new remote sensing datasets. Our retrieval algorithm, combining CBP and PCA, obtained the best performance and outperformed several mainstream pooling or encoding methods, such as the fully connected layer, IFK (Improved Fisher Kernel), BoW (Bag-of-Words), and max pooling. The channel and spatial attention mechanism improves the CBP-based retrieval method; it obtained the best performance on all the datasets and outperformed several recent attention methods. Source code is available at http://study.rsgis/whu.edu.cn/pages/download.
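As a rough illustration of second-order pooling followed by retrieval, the sketch below computes full (uncompressed) bilinear pooling of local descriptors and ranks archive images by cosine similarity. It is an assumption-laden simplification: CBP approximates this outer-product representation in a much lower dimension, and the attention, fine-tuning, and PCA steps of the abstract are omitted. The signed square root and L2 normalisation are a common post-processing choice, not something stated in the abstract, and the function names are illustrative.

```python
import math


def bilinear_pool(features):
    """Sum of outer products x x^T over all local descriptors, flattened.

    features: list of C-dimensional descriptors, one per spatial location.
    """
    c = len(features[0])
    pooled = [0.0] * (c * c)
    for x in features:
        for i in range(c):
            for j in range(c):
                pooled[i * c + j] += x[i] * x[j]
    # Signed square root, then L2 normalisation (assumed post-processing).
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


def rank_by_cosine(query, archive):
    """Indices of archive vectors sorted by decreasing cosine similarity.

    All vectors are assumed to be unit length, so the dot product is the cosine.
    """
    scores = [sum(q * a for q, a in zip(query, vec)) for vec in archive]
    return sorted(range(len(archive)), key=lambda k: scores[k], reverse=True)


# Toy descriptors with two channels; real CNN features have hundreds of channels.
archive = [
    bilinear_pool([[0.0, 1.0], [0.0, 1.0]]),
    bilinear_pool([[1.0, 0.0], [1.0, 0.0]]),
]
query = bilinear_pool([[2.0, 0.0], [1.0, 0.0]])
ranking = rank_by_cosine(query, archive)
print(ranking)
```

The pooled vector has C squared entries, which is why a compact approximation and a later dimensionality reduction become attractive for large channel counts.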
Background: Diabetes is a major health concern and is influenced by lifestyle, which can be affected by the neighbourhood environment. Specifically, a fast-food environment can influence eating behaviours and thus diabetes prevalence. Therefore, our aim was to assess the relationship between fast-food environment and diabetes prevalence for urban and rural environments in the Netherlands, using multiple indicators and buffer sizes. Methods: In this cross-sectional study, data on a nationwide sample of adults older than 19 years in the Netherlands were taken from the 2012 Dutch national health survey (from Public Health Monitor), in which participants were surveyed on topics related to health and lifestyle behaviour. Fast-food outlet exposures were determined within street-network buffers of 100 m, 400 m, 1000 m, and 1500 m around residential addresses. For each of these buffers, three indicators were calculated: presence (yes or no) of fast-food outlets, fast-food outlet density, and ratio. Logistic regression analyses were carried out to assess associations of these indicators with diabetes, adjusting for potential confounders and stratifying into urban and rural areas. Findings: 387 195 adults were surveyed, 284 793 of whom were included in the study. 22 951 (8%) reported having diabetes. Fast-food outlet exposures were positively associated with diabetes prevalence. We did not observe large differences between urban and rural areas. The effect estimates were small for all indicators. For example, in the 400 m buffer in the urban environment, the odds ratio (OR) for having diabetes among people with a fast-food outlet present, compared with those without, was 1·006 (95% CI 1·003–1·009) using the presence indicator.
The presence indicator showed higher effect estimates and the most consistent results across buffer sizes (ranging from OR 1·005 [95% CI 1·000–1·010] with the 1000 m buffer to 1·016 [1·005–1·028] with the 1500 m buffer in urban areas and from 1·002 [0·998–1·005] with the 1500 m buffer to 1·009 [1·006–1·018] with the 100 m buffer in rural areas) compared with the density and ratio indicators. Interpretation: The results confirm the evidence that the fast-food outlet environment is a diabetes risk factor. All data included were at the individual level and the variability was ensured by the spatial distribution and number of participants. In this study, we only accounted for residential exposure because we were unable to account for exposure outside the residential environment. The findings of this study encourage local governments to consider the potential adverse effects of fast-food exposures and aim at minimising unhealthy food access. Funding: Global Geo Health Data Centre, Utrecht University, Netherlands.
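For readers less familiar with how odds ratios arise from logistic regression, the following sketch shows the standard relationship: the OR is the exponential of the regression coefficient, and a Wald-type 95% interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. Whether the study used Wald intervals is an assumption; the helper names are illustrative, and the coefficient is back-calculated from the reported OR of 1.006 (95% CI 1.003 to 1.009) purely to demonstrate the round trip.

```python
import math


def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    Assumes a Wald interval: exp(beta - z*SE) to exp(beta + z*SE).
    """
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )


def coefficient_from_interval(odds_ratio, lower, upper, z=1.96):
    """Back out (beta, SE) from a reported OR and its interval bounds.

    The standard error is taken from the interval width on the log scale,
    so rounding in the published bounds carries over into the result.
    """
    beta = math.log(odds_ratio)
    std_err = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, std_err


# Round trip using the presence indicator, 400 m buffer, urban areas.
beta, se = coefficient_from_interval(1.006, 1.003, 1.009)
or_, lo, hi = odds_ratio_with_ci(beta, se)
print(round(or_, 3), round(lo, 3), round(hi, 3))
```

On the log-odds scale an OR this close to 1 corresponds to a very small coefficient, which is consistent with the abstract's remark that the effect estimates were small.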
Dense stereo matching plays a key role in 3D reconstruction. How well deep learning performs in the stereo matching of remote sensing data is currently uncertain. This article investigates the application of deep learning–based stereo methods to aerial image series and proposes a deep learning–based multi-view dense matching framework. First, we apply three typical convolutional neural network models, MC-CNN, GC-Net, and DispNet, to aerial stereo pairs and compare the results with those of SGM and the commercial software SURE. Second, on different data sets, we evaluate the generalization ability of each network by direct transfer learning with models pretrained on other data sets and by fine-tuning with a small amount of target training data. Third, we present a deep learning–based multi-view dense matching framework in which multi-view geometry is introduced to further refine the matching results. Three sets of aerial images serve as the main data sets, and two open-source sets of street images serve as auxiliary data sets for testing. Experiments show that, first, deep learning–based stereo methods perform slightly better than traditional methods. Second, both GC-Net and MC-CNN demonstrate good generalization ability and can obtain satisfactory results on aerial images using a model pretrained on several available stereo benchmarks. Third, multi-view geometry constraints further improve the performance of deep learning–based methods, which then exceeds that of the multi-view–based SGM and SURE.
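To make the notion of dense stereo matching concrete, here is a deliberately simple classical baseline, not one of the networks or SGM discussed above: for each pixel on a rectified scanline, it picks the disparity that minimises the sum of absolute differences (SAD) over a small window. The function name, window size, and toy intensities are all illustrative assumptions.

```python
def scanline_disparity(left_row, right_row, max_disp, half_window=1):
    """Winner-takes-all disparity per pixel along one rectified scanline.

    A left pixel at column x is compared with the right pixel at column x - d
    for d in 0..max_disp. Pixels whose window never fits inside both rows
    get None.
    """
    n = len(left_row)
    disparities = []
    for x in range(n):
        best_d, best_cost = None, float("inf")
        for d in range(max_disp + 1):
            cost = 0
            valid = True
            for k in range(-half_window, half_window + 1):
                xl, xr = x + k, x + k - d
                if xl < 0 or xl >= n or xr < 0 or xr >= n:
                    valid = False
                    break
                cost += abs(left_row[xl] - right_row[xr])
            # Strict comparison keeps the smallest disparity on ties.
            if valid and cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy rows: the left row is the right row shifted by two pixels.
right = [0, 0, 5, 9, 2, 0, 0, 0]
left = [0, 0, 0, 0, 5, 9, 2, 0]
disp = scanline_disparity(left, right, max_disp=3)
print(disp)
```

Learned methods such as those compared in the abstract replace the hand-crafted SAD cost with a learned matching cost or predict disparity end to end, but the output, a disparity per pixel, has the same meaning.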
Semantic segmentation of LiDAR point clouds has applications in self-driving, robotics, and augmented reality, among others. In this paper, we propose a Multi-Scale Attentive Aggregation Network (MSAAN) to achieve globally consistent point cloud feature representation and superior segmentation performance. First, on top of RandLA-Net, a baseline encoder-decoder architecture for point cloud segmentation, an attentive skip connection is proposed to replace the commonly used concatenation and to balance the encoder and decoder features of the same scale. Second, a channel attentive enhancement module is introduced into the local attention enhancement module to boost local feature discriminability and aggregate local channel structure information. Third, we develop a multi-scale feature aggregation method to capture the global structure of a point cloud from both the encoder and the decoder. The experimental results show that MSAAN significantly outperformed state-of-the-art methods, with mIoU improvements of at least 15.3% for scene-2 of the CSPC dataset, 5.2% for scene-5 of the CSPC dataset, and 6.6% for the Toronto3D dataset.
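The mIoU figures quoted above refer to mean intersection over union, the usual metric for semantic segmentation. A minimal sketch of how it is commonly computed from per-point labels follows. Skipping classes that appear in neither prediction nor ground truth is one convention among several and is an assumption here, as are the function name and the toy labels.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union across classes for per-point labels.

    Classes absent from both prediction and ground truth are skipped
    (an assumed convention; some benchmarks handle this differently).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four points, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred = [0, 0, 1, 1]
truth = [0, 1, 1, 1]
miou = mean_iou(pred, truth, num_classes=2)
print(miou)
```

Because every class contributes equally to the mean, rare classes affect mIoU as much as frequent ones, which is why it is preferred over plain point accuracy on imbalanced outdoor scenes.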
To investigate associations between annual average air pollution exposures and health, most epidemiological studies rely on estimated residential exposures because information on actual time-activity patterns can only be collected for small populations and short periods of time due to costs and logistic constraints. In the current study, we aim to compare exposure assessment methodologies that use data on time-activity patterns of children with residence-based exposure assessment. We compare estimated exposures and associations with lung function for residential exposures and exposures accounting for time-activity patterns. We compared four annual average air pollution exposure assessment methodologies: two rely on residential exposures only, and the other two incorporate estimated time-activity patterns. The time-activity patterns were based on assumptions about the activity space and make use of available external data sources for the duration of each activity. Mapping of multiple air pollutants (NO2, NOX, PM2.5, PM2.5 absorbance, PM10) at a fine resolution as input to exposure assessment was based on land use regression modelling. First, we assessed the correlations between the exposures from the four exposure methods. Second, we compared estimates of the cross-sectional associations between air pollution exposures and lung function at age 8 within the PIAMA birth cohort study for the four exposure assessment methodologies. The exposures derived from the four exposure assessment methodologies were highly correlated (R > 0.95) for all air pollutants. Similar statistically significant decreases in lung function were found for all four methods.
For example, for NO2 the decrease in FEV1 was -1.40% (CI: -2.54, -0.24%) per IQR (9.14 μg/m3) for front door exposure, and -1.50% (CI: -2.68, -0.30%) for the methodology which incorporates time-activity patterns and actual school addresses. Exposure estimates from methods based on the residential location only and methods including time-activity patterns were highly correlated and associated with similar decreases in lung function. Our study illustrates that the annual average exposure to air pollution for 8-year-old children in the Netherlands is sufficiently captured by residential exposures.
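An exposure estimate that accounts for time-activity patterns is typically a time-weighted average of the concentrations at the places where a person spends time. The sketch below shows that calculation under simple assumptions; the hours and concentrations are hypothetical illustration values, not data from the study, and the function name is illustrative.

```python
def time_weighted_exposure(locations):
    """Time-weighted average concentration over a set of locations.

    locations: list of (hours, concentration) pairs, for example one pair
    for home and one for school. Units of concentration are preserved.
    """
    total_hours = sum(hours for hours, _ in locations)
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return sum(hours * conc for hours, conc in locations) / total_hours


# Hypothetical week: 140 h at home (20.0 ug/m3) and 28 h at school (26.0 ug/m3).
weekly = [(140, 20.0), (28, 26.0)]
exposure = time_weighted_exposure(weekly)
print(exposure)
```

When most hours are spent at home, the weighted average stays close to the residential value, which gives some intuition for the high correlations between methods reported above.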
Geographic data are growing in size and variety, which calls for big data management tools and analysis methods. To efficiently integrate information from high-dimensional data, this paper proposes array-based modeling. A large portion of Earth observations and model simulations are naturally arrays once digitized. This paper discusses the challenges of using arrays, such as the discretization of continuous spatiotemporal phenomena, irregular dimensions, regridding, high-dimensional data analysis, and large-scale data management. We define categories and applications of typical array operations, compare their implementations in open-source software, and demonstrate dimension reduction and array regridding in case studies using Landsat and MODIS imagery. We find that arrays are a convenient data structure for representing and analysing many spatiotemporal phenomena. Although the array model simplifies data organization, array properties such as the meaning of grid cell values are rarely made explicit in practice.
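Array regridding can take many forms. One of the simplest is aggregating a fine grid to a coarser one by averaging blocks of cells, sketched below. This is only one possible regridding operation and not necessarily the one used in the paper's case studies. Representing missing cells as `None`, and the function name, are assumptions made for illustration.

```python
def block_mean(grid, factor):
    """Aggregate a 2D grid to a coarser resolution by block averaging.

    grid: list of equally long rows; None marks a missing cell and is ignored.
    factor: integer number of fine cells per coarse cell along each axis.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            cells = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
                if grid[i][j] is not None
            ]
            # A block with no valid cells stays missing in the output.
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse


fine = [
    [1, 3, 5, 7],
    [1, 3, None, 7],
    [2, 2, 4, 4],
    [2, 2, 4, 4],
]
coarse = block_mean(fine, 2)
print(coarse)
```

Averaging implicitly treats each cell value as representative of its whole cell area. This is exactly the kind of array property (what a grid cell value means) that the abstract notes is rarely made explicit.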