Abstract Background and Aim: This study aimed to develop a hybrid machine learning (ML) model for predicting the resectability of pancreatic cancer, based on computed tomography (CT) and the National Comprehensive Cancer Network (NCCN) guidelines. Methods: We retrospectively studied 349 patients. One hundred seventy-one cases from Center 1 and 92 cases from Center 2 formed the primary training cohort, and 66 cases from Center 3 and 20 cases from Center 4 formed the independent test set. The semi-automatic module of the ITK-SNAP software was used to assist CT image segmentation and obtain the three-dimensional (3D) imaging region of interest (ROI). A total of 788 handcrafted features were extracted from each 3D ROI using PyRadiomics. The optimal feature subset, consisting of three features screened by three feature selection methods, served as the input to a support vector machine (SVM) to construct the conventional radiomics-based predictive model (cRad). The resolution of each 3D ROI was unified by 3D spline interpolation to construct a 3D tumor imaging tensor. Using the 3D tumor imaging tensor as input, a 3D kernelled support tensor machine-based predictive model (KSTM) and a 3D ResNet-based deep learning predictive model (ResNet) were constructed. A multi-classifier fusion ML model was then constructed by fusing cRad, KSTM, and ResNet with a multi-classifier fusion strategy. Two experts with more than 10 years of clinical experience were invited to reevaluate each patient on the basis of contrast-enhanced CT (CECT) following the NCCN guidelines, yielding resectable, unresectable, and borderline resectable diagnoses. These three results were converted into probability values of 0.25, 0.75, and 0.50, respectively, according to the traditional empirical method. The expert assessment was then treated as an independent classifier and integrated with the multi-classifier fusion ML model to obtain the human–machine fusion ML model (HMfML).
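The expert-label-to-probability conversion and the final human–machine integration step described above can be sketched as follows. This is a minimal illustration: the label strings, the function names, and the weighted-average fusion rule are assumptions for demonstration; the paper's actual fusion strategy is not detailed in the abstract.

```python
# Mapping of expert NCCN categories to probability values, as stated in
# the abstract: resectable -> 0.25, borderline -> 0.50, unresectable -> 0.75.
NCCN_PROB = {"resectable": 0.25, "borderline": 0.50, "unresectable": 0.75}


def expert_probability(label: str) -> float:
    """Convert an expert NCCN category into a probability value."""
    return NCCN_PROB[label]


def fuse(ml_prob: float, expert_label: str, w_ml: float = 0.5) -> float:
    """Hypothetical human-machine fusion: a weighted average of the ML
    model's probability and the expert-derived probability. The actual
    integration rule used for HMfML may differ."""
    p_expert = expert_probability(expert_label)
    return w_ml * ml_prob + (1.0 - w_ml) * p_expert
```

With equal weights, a confident ML prediction of 0.9 combined with an expert "resectable" assessment yields an intermediate fused probability.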
Results: The multi-classifier fusion ML model's area under the receiver operating characteristic curve (AUC: 0.8610), predictive accuracy (ACC: 80.23%), sensitivity (SEN: 78.95%), and specificity (SPE: 80.60%) were better than those of the cRad-, KSTM-, and ResNet-based single-classifier models and of their two-classifier fusion models. This indicates that the three models mined complementary CECT feature expressions from different perspectives and could be integrated through CFS-ER, so that the fusion model achieved better performance. HMfML achieved an AUC of 0.8845, ACC of 82.56%, SEN of 84.21%, and SPE of 82.09%, outperforming the multi-classifier fusion model. This suggests that ML models can learn additional information from CECT that experts cannot distinguish, thus complementing expert experience and improving the performance of the hybrid ML model. Conclusion: HMfML can predict pancreatic cancer resectability with high accuracy.
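The reported metrics follow from standard confusion-matrix definitions. A minimal sketch, assuming hypothetical confusion counts for the 86-patient test set (66 + 20 cases) that happen to reproduce the reported HMfML figures; the actual confusion matrix is not given in the abstract.

```python
def metrics(tp: int, fn: int, tn: int, fp: int):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    sen = tp / (tp + fn)                   # true-positive rate
    spe = tn / (tn + fp)                   # true-negative rate
    return acc, sen, spe


# Hypothetical counts consistent with 86 test patients; illustrative only.
acc, sen, spe = metrics(tp=16, fn=3, tn=55, fp=12)
print(f"ACC={acc:.2%}, SEN={sen:.2%}, SPE={spe:.2%}")
```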
In the past decades, remote sensing (RS) data fusion has been an active research area, and a large number of algorithms and models have been developed. Generative adversarial networks (GANs), an important branch of deep learning, show promising performance in a variety of RS image fusion tasks. This review provides an introduction to GANs for RS data fusion. We briefly review the frequently used architectures and characteristics of GANs in data fusion and comprehensively discuss how GANs can be used to fuse homogeneous RS data, heterogeneous RS data, and RS and ground observation (GO) data. We also analyze some typical applications of GAN-based RS image fusion. This review provides insight into how to adapt GANs to different types of fusion tasks and summarizes the advantages and disadvantages of GAN-based RS data fusion. Finally, we discuss promising future research directions and predict their trends.
Automatic matching of multisource data is an important technique for change detection, fusion, and updating of spatial data. However, most current learning methods for building footprint matching require a large number of samples, and labeling these samples is costly in terms of labor and time. Moreover, multisource building footprint data are complex and diverse, which makes recognizing the different matching relationships difficult. Thus, this study proposes a learning-based method for recognizing matching relationships among multisource building footprints using a one-class support vector machine (OCSVM), trained with only positive samples. First, a set of geometric indicators was designed to train a model and realize initial matching recognition. Then, a contextual metric was calculated from the rough matching results, and the geometric and contextual metrics were combined to train the model and realize relaxed matching recognition. Relaxed matching is an optimization process implemented after initial matching to recognize additional, looser matching relationships; a convex hull is used to recognize relationships beyond 1:1, such as 1:n, m:1, and m:n. The experimental results showed that the proposed method outperformed indicator-weighted (weighted average) and learning-based matching methods such as traditional SVMs and decision trees (DTs). The precision scores of the proposed model were 97.1%, 95%, and 97.2% for the Wuhan (China), Beijing (China), and Richmond Hill (Canada) datasets, respectively. Furthermore, the proposed model identified the matching relationships of buildings with complex geometric features and high-density spatial distributions.
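Geometric indicators of the kind fed to the OCSVM can be computed directly from footprint polygons. A minimal sketch: the two indicators below (area ratio and centroid distance) are plausible examples; the abstract does not specify the exact indicator set used in the paper.

```python
import math


def shoelace_area(poly):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    n = len(poly)
    s = 0.0
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def centroid(poly):
    """Vertex-average centroid (sufficient for roughly convex footprints)."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def indicators(a, b):
    """Two hypothetical geometric indicators for a candidate footprint pair:
    area ratio (smaller/larger, in [0, 1]) and centroid distance."""
    ra, rb = shoelace_area(a), shoelace_area(b)
    area_ratio = min(ra, rb) / max(ra, rb)
    (xa, ya), (xb, yb) = centroid(a), centroid(b)
    dist = math.hypot(xa - xb, ya - yb)
    return area_ratio, dist
```

Such per-pair feature vectors would then be passed to a one-class SVM trained only on known-matching pairs.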
In this paper, we propose a fast 3-D empirical mode decomposition (fTEMD) method for hyperspectral images (HSIs) to achieve class-oriented multitask learning (cMTL). The proposed method comprises two major steps: 1) fTEMD and 2) cMTL. On the one hand, traditional empirical mode decomposition is extended to its 3-D version, which naturally treats the HSI as a cube and effectively decomposes it into several 3-D intrinsic mode functions (TIMFs). To accelerate fTEMD, 3-D Delaunay triangulation is adopted to determine the distances between extrema, whereas separable filters are implemented to generate the envelopes. On the other hand, cMTL is performed on the TIMFs by taking them as the features of different tasks. The proposed cMTL learns the representation coefficients by exploiting the class labels and fully using the information contained in each TIMF. Experiments conducted on three benchmark data sets demonstrate the effectiveness of the proposed method.
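The core EMD operation that the paper extends to 3-D is the sifting step: subtract the mean of the upper and lower envelopes fitted through local extrema. A minimal 1-D sketch for illustration only; the paper's 3-D version replaces this with Delaunay triangulation over extrema and separable envelope filters.

```python
def sift_once(signal):
    """One sifting iteration of 1-D EMD: subtract the mean of linearly
    interpolated upper and lower envelopes through the local extrema.
    Linear (rather than the classical cubic-spline) interpolation is a
    simplifying assumption here."""
    n = len(signal)
    maxima = [i for i in range(1, n - 1)
              if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]]
    minima = [i for i in range(1, n - 1)
              if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]

    def envelope(idx):
        # Piecewise-linear envelope through the extrema, clamped at the ends.
        pts = [0] + idx + [n - 1]
        env = [0.0] * n
        for a, b in zip(pts[:-1], pts[1:]):
            for i in range(a, b + 1):
                t = (i - a) / (b - a) if b > a else 0.0
                env[i] = signal[a] * (1 - t) + signal[b] * t
        return env

    upper, lower = envelope(maxima), envelope(minima)
    # Candidate intrinsic mode function: signal minus the envelope mean.
    return [s - (u + l) / 2.0 for s, u, l in zip(signal, upper, lower)]
```

Iterating this step until a stopping criterion is met yields one intrinsic mode function; subtracting it and repeating produces the full decomposition.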
Infrared small target detection is widely used in precision-guided weapons and early warning systems. Detecting small targets in infrared images is difficult because the targets are small in size and weak in texture. Traditional methods cannot make full use of the target's characteristics, which results in a low detection rate and a high false alarm rate, and they ignore algorithmic complexity, which makes them difficult to apply to systems with strict real-time requirements. This letter proposes a method that combines hand-designed and machine-designed features for fast and accurate infrared small target detection. First, the region of interest is extracted by the proposed local weighted intensity difference method and the local eight-direction gradient method. Then, a detection decision map is constructed to improve the detection speed and reassemble the decomposed target. Finally, the non-translation-invariant CoordConv is used to detect targets in the region of interest, which overcomes the defect that the translation invariance of traditional convolution loses the hand-designed features in the region of interest. Extensive experimental results demonstrate that our method achieves the best detection performance compared with baseline methods while keeping the lowest detection time.
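The intuition behind an intensity-difference response map is that a small target stands out from its local background. A minimal unweighted sketch: each pixel's value minus the mean of its neighbourhood. The paper's local weighted intensity difference additionally weights the neighbours, and the weights are not given in the abstract, so this is an illustrative simplification.

```python
def local_intensity_difference(img, r=1):
    """Response map: pixel value minus the mean of its (2r+1)x(2r+1)
    neighbourhood (centre excluded). Bright isolated points such as
    small infrared targets produce large positive responses."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))
                    if (yy, xx) != (y, x)]
            out[y][x] = img[y][x] - sum(vals) / len(vals)
    return out
```

Thresholding such a map would yield candidate regions of interest to pass to the learned detector.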
This paper presents a new approach for accurate spatial-spectral classification of hyperspectral images, consisting of three main steps. First, a pixelwise classifier, probabilistic-kernel collaborative representation classification (PKCRC), is proposed to obtain a set of classification probability maps from the spectral information contained in the original data. This is achieved by means of a kernel extension of collaborative representation (CR) classification. Then, an adaptive weighted graph (AWG)-based postprocessing model is used to incorporate spatial information by refining the obtained pixelwise probability maps. Furthermore, to deal with scenarios dominated by limited training samples, we modify the postprocessing model by fixing the probabilistic outputs of the training samples to integrate the spatial and label information. The proposed approach covers different analysis scenarios by means of a fully adaptive processing chain (based on these three steps) for hyperspectral image classification. All the techniques integrated into the proposed approach have closed-form analytic solutions and are easy to implement and compute, exhibiting potential benefits for hyperspectral image classification under different conditions. Specifically, the proposed method is experimentally evaluated on two real hyperspectral data sets, exhibiting good classification performance even when the number of training samples available a priori is very limited.
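The closed-form character of collaborative representation comes from its ridge-regression form. A minimal linear CR classifier sketch, under the assumption of columns-as-samples; the paper's PKCRC adds a kernel extension and probabilistic outputs not shown here.

```python
import numpy as np


def crc_classify(X, labels, y, lam=0.01):
    """Minimal collaborative-representation classification: solve the
    closed-form ridge system alpha = (X^T X + lam*I)^{-1} X^T y over all
    training samples (columns of X), then assign y to the class whose
    samples reconstruct it with the smallest residual."""
    n = X.shape[1]
    alpha = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        recon = X[:, mask] @ alpha[mask]  # class-wise reconstruction
        residuals[c] = np.linalg.norm(y - recon)
    return min(residuals, key=residuals.get)
```

Because the solution is a single linear solve, the per-pixel cost is fixed, which is what makes the full processing chain easy to compute.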
Spatial-spectral classification is a very important topic in the field of remotely sensed hyperspectral imaging. In this work, we develop a parallel implementation of a novel supervised spectral-spatial classifier, which models the likelihood probability via l1-l2 sparse representation and the spatial prior as a Gibbs distribution. This classifier takes advantage of the spatial piecewise smoothness and the correlation of neighboring pixels in the spatial domain, but its computational complexity is very high, which limits its application to time-critical scenarios. To improve the computational efficiency of the algorithm, we optimized its serial version and developed a parallel implementation for commodity graphics processing units (GPUs). Our parallel spatial-spectral classifier with sparse representation and Markov random fields (SSC-SRMRF-P) exploits the low-level architecture of GPUs, and its parallel optimization was carried out using the compute unified device architecture (CUDA). The performance of the parallel implementation is evaluated and compared with serial and multicore implementations on central processing units (CPUs). The proposed method was designed to exploit the massive data-parallel capacities of GPUs together with the control and logic capacities of CPUs, resorting to a heterogeneous CPU-GPU framework in the design of the parallel algorithm. Experimental results using real hyperspectral images demonstrate very high performance for the proposed CPU-GPU parallel method, both in terms of classification accuracy and computational performance.
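The likelihood-plus-Gibbs-prior model amounts to minimizing an energy with a data term and a neighbour-agreement term. A serial toy sketch using iterated conditional modes over a Potts prior, assuming per-class probability maps as input; the paper's method uses a sparse-representation likelihood and a GPU implementation, neither of which is shown here.

```python
import math


def icm_smooth(prob_maps, beta=1.0, iters=5):
    """MAP-style relabeling with a Potts (Gibbs) spatial prior: at each
    pixel, pick the label minimizing -log(likelihood) plus beta times
    the number of disagreeing 4-neighbours. prob_maps is a list of
    HxW per-class probability maps."""
    h, w, k = len(prob_maps[0]), len(prob_maps[0][0]), len(prob_maps)
    # Initialize from the pixelwise maximum-likelihood labels.
    labels = [[max(range(k), key=lambda c: prob_maps[c][y][x])
               for x in range(w)] for y in range(h)]
    for _ in range(iters):
        for y in range(h):
            for x in range(w):
                best, best_e = labels[y][x], float("inf")
                for c in range(k):
                    e = -math.log(prob_maps[c][y][x] + 1e-12)  # data term
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w and labels[yy][xx] != c:
                            e += beta  # smoothness penalty per disagreement
                    if e < best_e:
                        best, best_e = c, e
                labels[y][x] = best
    return labels
```

The inner per-pixel minimization is independent given the current neighbour labels, which is exactly the structure a GPU implementation parallelizes.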
Video satellites can generate video image sequences with rich dynamic information, providing a new way to monitor moving objects. However, to maintain high temporal resolution, video satellite images usually sacrifice spatial resolution. Therefore, super-resolution (SR) plays a vital role in improving the quality of video satellite images. In this article, we propose a multiframe video SR neural network (MVSRnet) for video satellite image SR reconstruction. The proposed MVSRnet consists of three main subnetworks: an optical flow estimation subnetwork (OFEnet), an upscaling subnetwork (Upnet), and an attention-based residual learning subnetwork (ARLnet). OFEnet estimates the low-resolution (LR) optical flow of multiple image frames. Upnet then enhances the resolution of both the input frames and the estimated LR optical flows. Motion compensation is subsequently performed according to the high-resolution (HR) optical flows. Finally, the compensated HR cube is fed to ARLnet to generate the SR results. Unlike existing video satellite image SR methods, the proposed MVSRnet is a multiframe method with an attention mechanism, which can merge the motion information among adjacent frames and highlight the importance of the extracted features. Experiments conducted on Jilin-1 and OVS-1 video satellite images demonstrate that the proposed MVSRnet significantly outperforms several state-of-the-art SR methods.
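The motion-compensation step can be illustrated with a toy warp-and-merge. A minimal sketch, assuming a single integer displacement per frame; MVSRnet estimates dense sub-pixel optical flow per pixel and merges frames with learned attention, so this only conveys the idea of aligning neighbours to a reference before fusion.

```python
def compensate(frame, flow):
    """Warp a neighbouring frame toward the reference using a per-frame
    integer displacement (dy, dx); out-of-range samples are zero-filled."""
    dy, dx = flow
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                out[y][x] = frame[yy][xx]
    return out


def fuse_frames(frames, flows):
    """Average the motion-compensated frames as a crude multiframe merge
    (MVSRnet instead weights features with an attention mechanism)."""
    comp = [compensate(f, fl) for f, fl in zip(frames, flows)]
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(c[y][x] for c in comp) / len(comp) for x in range(w)]
            for y in range(h)]
```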
Current subaperture-based azimuth-variant motion compensation algorithms for synthetic aperture radar (SAR) imagery usually struggle to maintain high precision and efficiency simultaneously. In this letter, a novel motion compensation approach is developed to precisely correct azimuth-variant motion errors. The proposed algorithm applies a time-domain filter to implement precise subaperture-to-pulse correction, providing promising azimuth-variant phase correction. Extensive experiments on real-measured high-squint SAR data demonstrate the superiority of the proposed approach.