2022 Computer Science & Technology

    Classification Method of Tactile Pressure Footprint Based on Fusion Distribution Graph Network
    ZHANG Yan, GAO Zijian, XU Changkang, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (1): 91-100.   DOI: 10.12141/j.issn.1000-565X.210128
    With the development of biometric recognition technology, tactile pressure footprint classification has been studied and applied more and more widely, yet traditional classification methods are labor-intensive. This paper proposed a tactile pressure footprint classification method based on a fusion distribution graph network. Firstly, the convolutional features of the tactile pressure footprint image were extracted by the embedding module, and the sample correlation matrix was obtained with the norm-regularization method. Then the fusion correlation matrix between the samples and the one-hot label vectors was formed, and feature information was added through the self-attention module. A new feature distribution graph was obtained through the distribution module, and an association matrix between the labeled samples and the unlabeled samples was constructed. Finally, the convolutional features and the feature distribution graph of the tactile pressure footprint image were used as the input of the update module to classify the footprint. The experimental results show that, compared with existing few-shot classification methods, the classification accuracy of the proposed method in 5-way 1-shot experiments on the Mini-Imagenet and Tiered-Imagenet datasets reached 71.71% and 74.34%, respectively. Meanwhile, the 5-way 1-shot and 5-way 5-shot experiments on the left and right foot tactile pressure datasets reached 88.87% and 98.66%, respectively.
    Reversible Data Hiding Technology for JPEG Images Based on Frequency Selective Expansion
    LÜ Wanli, TAN Sheng
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (1): 109-121.   DOI: 10.12141/j.issn.1000-565X.210062
    Reversible data hiding technology for JPEG images has been widely used. In order to improve the embedding capacity and embedding efficiency of reversible data hiding in JPEG images, this paper proposed a reversible data hiding technology based on frequency selective expansion. Firstly, the original image was quantized to generate several DCT blocks. The DCT blocks were then divided into two types, according to whether a block contains AC coefficients with values ±3 and ±4, and different embedding positions were selected for each type. Finally, the embedding distortion of each position in the block was calculated according to the frequency selection strategy, and the embedding positions were selected and the secret data embedded in order of embedding distortion. The experimental results show that the proposed algorithm expands the embedding capacity and reduces the increase in file size.
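The reversibility requirement in the abstract above can be illustrated with the classical histogram-shifting scheme on quantized AC coefficients. This is a minimal sketch of that baseline idea, not the frequency selective expansion proposed in the paper; the function names `embed` and `extract` and the choice of ±1 as the embedding bin are assumptions made for illustration.

```python
def sign(c):
    """Return -1, 0 or +1 according to the sign of c."""
    return (c > 0) - (c < 0)

def embed(coeffs, bits):
    """Embed bits into AC coefficients equal to +/-1; shift larger magnitudes by 1."""
    out, payload = [], iter(bits)
    for c in coeffs:
        if c == 0:
            out.append(c)                  # zero coefficients are left untouched
        elif abs(c) == 1:
            b = next(payload, 0)           # pad with 0 once the payload is exhausted
            out.append(c + sign(c) * b)    # 1 -> 1 or 2, -1 -> -1 or -2
        else:
            out.append(c + sign(c))        # shift outward to make room
    return out

def extract(marked):
    """Recover the original coefficients and the embedded bit stream."""
    coeffs, bits = [], []
    for c in marked:
        if c == 0:
            coeffs.append(0)
        elif abs(c) <= 2:
            bits.append(abs(c) - 1)        # magnitude 1 carries bit 0, magnitude 2 carries bit 1
            coeffs.append(sign(c))
        else:
            coeffs.append(c - sign(c))     # undo the shift
    return coeffs, bits
```

Because every marked value maps back to exactly one original value, extraction restores the cover coefficients losslessly. The extracted stream has one bit per ±1 coefficient, so the payload length must be known to discard padding.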
    Handwritten Text Segmentation Method Based on Greedy Snake Algorithm and Radical Recognition
    FU Pengbin, DONG Aojing, YANG Huirong
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (1): 80-90.   DOI: 10.12141/j.issn.1000-565X.210028
    A segmentation method based on the greedy snake algorithm and radical recognition was proposed to solve the problems of interlacing, adhesion and over-segmentation in Chinese handwritten text. Firstly, the initial text segmentation trajectory was established based on the greedy snake algorithm, and the segmentation path was optimized according to multiple rules. Then, candidate adhesion points were extracted based on the outline and skeleton of the adhered characters, and the greedy snake algorithm was used for secondary segmentation. Finally, radical extraction and recognition were carried out on the over-segmented characters, and the merging direction was determined based on the structure of Chinese characters. By combining geometric confidence and recognition confidence, the over-segmented characters were merged and the final text segmentation result was obtained. The effectiveness of the algorithm was verified by an experiment on 1542 lines of handwritten text from high school test papers from Shaanxi Province. The result shows that the accuracy of the segmentation algorithm can reach 82.15%.
    Obstacle Avoidance Algorithm for Unmanned Aerial Vehicle Vision Based on Deep Learning
    ZHANG Xiangzhu, ZHANG Lijia, SONG Yifan, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (1): 101-108, 131.   DOI: 10.12141/j.issn.1000-565X.210096
    In order to solve the obstacle avoidance problem of unmanned aerial vehicles (UAVs) based on monocular vision, a quadrotor autonomous obstacle avoidance method based on monocular depth estimation and object detection was proposed. The monocular depth estimation model provides pixel-level depth information of the obstacles, and the object detection model provides their location information. The depth map and object detection results of a single red-green-blue (RGB) image were obtained by a convolutional neural network (CNN). The image was divided into regions based on the object detection results, and the depth of each region was calculated from the depth estimation results. The linear velocity and angular velocity of the UAV were then calculated by the planning algorithm from the region depths and the region division, so as to realize autonomous obstacle avoidance. In order to verify the autonomous obstacle avoidance performance of the algorithm, a Parrot Bebop2 UAV was employed to carry out real flight comparison experiments between the proposed algorithm and the direct flight algorithm. The results show that the proposed algorithm can be used for low-speed autonomous obstacle avoidance of quadrotors.
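The last step described above, turning region depths into velocity commands, might look roughly like the following. This is a sketch under strong assumptions: the abstract divides the image using the object detection results, whereas this example simply uses three equal-width vertical strips, and the names `region_depths` and `plan` as well as the `safe`, `v_max` and `w_max` parameters are hypothetical.

```python
def region_depths(depth, n_regions=3):
    """Mean depth of equal-width vertical strips of a depth map given as a list of rows."""
    width = len(depth[0])
    step = width // n_regions
    means = []
    for r in range(n_regions):
        lo = r * step
        hi = width if r == n_regions - 1 else lo + step
        values = [row[x] for row in depth for x in range(lo, hi)]
        means.append(sum(values) / len(values))
    return means

def plan(depth, safe=2.0, v_max=1.0, w_max=0.5):
    """Return (linear velocity, angular velocity) from left/center/right mean depths."""
    left, center, right = region_depths(depth, 3)
    # Slow down in proportion to how close the obstacle ahead is.
    v = v_max * min(1.0, center / safe)
    if center >= safe:
        w = 0.0                                   # path ahead is clear: fly straight
    else:
        w = w_max if left > right else -w_max     # turn toward the deeper side (positive = left, assumed)
    return v, w
```

With an obstacle 1 m ahead and more free space on the left, the sketch halves the forward speed and commands a left turn.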
    Recommendation Model Based on Polarization Relation Representation and Low-Dimensional Data Association Learning
    CAI Xiaodong, HONG Tao, CAO Yi
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (1): 122-131.   DOI: 10.12141/j.issn.1000-565X.210082
    Traditional recommendation models based on knowledge graphs generally adopt the TransH strategy to represent the relations among nodes in the graph, and use an interaction mode based on the feature machine to learn recommendation. This approach does not represent the relations among nodes accurately enough, and often ignores the potential relations among nodes in the low-dimensional data. In order to improve the accuracy of recommendation, this research proposed a new representation method based on polarization relation representation, which maps the representation among nodes to unitary space and enriches the effective information of the relations among nodes. In addition, an association learning method for knowledge graph embedding and the low-dimensional data of the recommendation process was designed to mine the rich and detailed relations hidden in them, so as to improve the accuracy of recommendation. The experimental results show that the proposed method is effective. Compared with advanced methods in related fields, recall and Normalized Discounted Cumulative Gain (NDCG) improve significantly on the Amazon-book and Last-FM datasets.
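For readers unfamiliar with the two metrics named above, the following sketch computes Recall@K and NDCG@K for a single user with binary relevance. It uses the standard textbook definitions; whether the paper uses exactly this variant (for example, the same cutoff K) is not stated in the abstract, and the function names are illustrative.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranked list."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0
```

Recall only counts whether relevant items are retrieved, while NDCG also rewards placing them near the top of the list.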
    Survey on Document-Level Relation Extraction
    ZHOU Youhua, HUANG Han, LIU Haolong, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (4): 10-25.   DOI: 10.12141/j.issn.1000-565X.210152
    Relation extraction (RE) is one of the most important tasks in information extraction in NLP. The results of RE can be used in downstream tasks such as knowledge graph construction, knowledge base question answering and semantic search, which means that RE has wide-ranging application scenarios and important research value. In recent years, RE has achieved fruitful results, but most of them are limited to sentence-level RE, which focuses on extracting the relation between two mentions within a single sentence. Research shows that a large number of relations cannot be extracted from a single sentence, and with the recent development of deep learning and NLP, document-level RE faces new opportunities and challenges. This study reviews the recent advances in document-level RE research, summarizes a general technology roadmap for this task, and then analyzes the encoding and aggregation methods used in the literature. It also introduces the common datasets and evaluation metrics of this task. The paper ends by forecasting the future development trends of this task.
    Machine Vision-Based Approach for Width Measurement of Irregular Ultra-Thin Heat Pipe
    LI Yong, SHE Peijian, OU Qiguan, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (4): 46-55.   DOI: 10.12141/j.issn.1000-565X.210372
    In order to measure the width of irregular ultra-thin heat pipes automatically, a machine-vision-based pixel-by-pixel width search algorithm that follows the gradient of the central axis is proposed. First, the Canny algorithm is used to obtain pixel-level edges, and an improved quadratic interpolation method is then used to extract sub-pixel edges. Second, a thinning algorithm extracts the central axis of the area enclosed by the edges, and the axis is cropped appropriately. Next, the edge of the central axis area is constructed by morphological dilation and Gaussian filtering, the gradient angle is calculated, and two-sided mean filtering is applied to approximate the smooth variation of the true gradient angle. Finally, sub-pixel edge points are searched for on both sides along the gradient angle, and the distance between each pair of edge points is taken as the target width. The algorithm solves the problems that conventional measurement algorithms cannot determine the width measurement points on irregularly deformed contours and cannot measure high-order transition curve areas, and it achieves dense width measurement along the axial direction, approximating the true edge width distribution as closely as possible. The experimental results show that, when the width of the measured object is within 10 mm, the measurement uncertainty of the proposed algorithm is ±0.026 mm; the algorithm adapts well to the irregular shape of ultra-thin heat pipes and achieves stable, reliable, and high-precision automated measurement.
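The sub-pixel edge step above relies on quadratic interpolation. The sketch below shows the plain three-point parabola fit, not the improved variant the paper proposes, followed by the width of one edge point pair. The names `subpixel_offset`, `edge_width` and the `mm_per_pixel` calibration factor are assumptions for illustration.

```python
import math

def subpixel_offset(g_prev, g_curr, g_next):
    """Offset of the vertex of the parabola through three gradient magnitudes
    sampled at positions -1, 0 and +1 around a pixel-level edge."""
    denom = g_prev - 2.0 * g_curr + g_next
    if denom == 0:
        return 0.0                      # flat profile: no refinement possible
    return 0.5 * (g_prev - g_next) / denom

def edge_width(p, q, mm_per_pixel):
    """Distance between two sub-pixel edge points (x, y), converted to millimetres."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * mm_per_pixel
```

For a gradient profile that peaks a quarter of a pixel to the right of the sampled maximum, `subpixel_offset` returns 0.25, which is added to the integer pixel coordinate along the gradient direction.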
    Semantic Textual Similarity Justification based on Multi-Model Ensemble
    SU Jindian, HONG Xiaobin, YU Shanshan
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (4): 1-9.   DOI: 10.12141/j.issn.1000-565X.210427
    Pre-trained language models are the mainstream, typical approach in current natural language processing and artificial intelligence, but they perform differently on downstream tasks because of differences in language modeling, feature representation, model structure, training tasks, pre-training corpus, etc. In order to better combine the knowledge in different pre-trained language models and utilize their learning abilities on downstream tasks, we propose a multi-model ensemble method, MME-STS (Multi-Model Ensemble for Semantic Textual Similarity), for semantic textual similarity justification tasks. The model structure and the corresponding feature representations are presented, and three ensemble strategies are proposed, based on average values, fully connected layer training and the AdaBoost algorithm. Experimental results show that MME-STS significantly outperforms approaches based on a single pre-trained language model on the two benchmark datasets, SemEval 2014 task 4 SICK and SemEval 2017 STS-B, in terms of the Pearson correlation coefficient and the Spearman coefficient.
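The simplest of the three strategies mentioned above, averaging, and the Pearson metric used for evaluation can be sketched as follows. The model outputs here are made-up similarity scores, and the function names are illustrative rather than taken from MME-STS.

```python
import math

def average_ensemble(predictions):
    """predictions: one list of similarity scores per model, all of equal length."""
    n_models = len(predictions)
    return [sum(scores) / n_models for scores in zip(*predictions)]

def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)
```

In use, the averaged scores would be compared against the gold similarity labels with `pearson`. The fully connected and AdaBoost strategies replace the fixed average with learned combination weights.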
    Deep Multi-Scale Attention Hashing Network for Large-Scale Image Retrieval
    FENG Hao, WANG Nian, TANG Jun
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (4): 35-45.   DOI: 10.12141/j.issn.1000-565X.210268
    To address the limited feature extraction capability and inefficient quantization constraint mechanism of existing hashing algorithms, a deep multi-scale attention hashing network was proposed for large-scale image retrieval. The whole network was composed of a main branch and an object branch. In the main branch, two modules, multi-scale attention localization and saliency region extraction, were added to localize and extract the salient regions of images effectively, and the results were fed into the object branch to learn more detailed features. Subsequently, the multi-granularity features learned by the two branches were fused to perform binary hash coding. In addition, a triplet quantization constraint was introduced to reduce the quantization error while maintaining the similarity relationship between sample pairs. In order to verify the effectiveness of the proposed method, extensive experiments were carried out on two benchmark datasets. Experimental results showed that the proposed method outperforms most existing hashing retrieval approaches.
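The binary hash coding and quantization error mentioned above can be made concrete with a small sketch. It shows generic sign-based binarization and Hamming-distance retrieval, which is common to deep hashing methods; it does not reproduce the paper's network or its triplet quantization constraint, and all names are illustrative.

```python
def binarize(features):
    """Map real-valued network outputs to a {-1, +1} hash code with the sign function."""
    return [1 if v >= 0 else -1 for v in features]

def quantization_error(features):
    """Squared distance between the continuous outputs and their binary code."""
    return sum((v - b) ** 2 for v, b in zip(features, binarize(features)))

def hamming(a, b):
    """Number of positions in which two codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def retrieve(query_code, database, top_k=1):
    """database: dict mapping image name -> hash code. Returns the closest names."""
    ranked = sorted(database, key=lambda name: hamming(query_code, database[name]))
    return ranked[:top_k]
```

A quantization constraint pushes the continuous outputs toward ±1 during training so that `quantization_error` stays small and little similarity information is lost when the codes are binarized.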
    Image Tampering Localization Based on Visual Multi-Scale Transformer
    LU Lu, ZHONG Wenyu, WU Xiaokun
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (6): 10-18.   DOI: 10.12141/j.issn.1000-565X.210603
    With the continuous development of digital image processing technology, image tampering is no longer limited to a single method such as image splicing. The traces of malicious tampering are hidden by post-processing in image editing software, which degrades the results of both traditional image forgery detection algorithms and deep-learning-based tampering localization methods. To address the low accuracy of existing image tampering localization algorithms, an end-to-end image tampering localization network based on a Multi-Scale Visual Transformer is proposed. The network combines a transformer and a convolutional encoder to extract the feature difference between the tampered area and the non-tampered area. The Multi-Scale Transformer models the spatial information of image block sequences of different sizes, so that the network can adapt to tampered areas of various shapes and sizes. Experimental results show that the F1 and AUC scores of the proposed algorithm on the CASIA and NIST2016 test sets are 0.431, 0.877, 0.728 and 0.971, respectively, which is a significant improvement over existing mainstream algorithms. Moreover, the proposed algorithm is robust against JPEG compression attacks.
    An Imbalanced Classification Method Based on Adaptive Sampling
    CHEN Qiong, XIE Jialiang
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (4): 26-34,45.   DOI: 10.12141/j.issn.1000-565X.210267
    Traditional resampling methods mostly use fixed sampling strategies and cannot change the sampling strategy according to the optimization requirements of the model. To address this problem, this paper proposes an adaptive sampling-based imbalanced classification method (Adaptive Sampling Imbalanced Classification, ASIC). The method dynamically adjusts the sampling probabilities of the different classes in the training set according to the performance of the classification model on the validation set, so that the sampling probabilities are determined by the requirements of the current classification model. At the same time, the method pays extra attention to the minority classes and, all else being equal, gives them a higher sampling probability, so as to compensate for the negative impact of their small number of examples on the classification model and thereby improve the model's ability to recognize minority classes. The experimental results show that the classification model trained with the ASIC method is better than the comparison methods in terms of balanced accuracy and geometric mean, and that the more imbalanced the data distribution, the more obvious the superiority of the ASIC method.
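One way to picture the adaptive sampling idea above is a rule that raises a class's sampling probability when its validation recall is low and when it has few training examples. The formula below is a hypothetical illustration of that behaviour, not the update rule used by ASIC; the function name and the `gamma` and `eps` parameters are assumptions.

```python
def sampling_probabilities(val_recall, class_counts, gamma=0.5, eps=1e-3):
    """val_recall: class -> recall on the validation set (0..1).
    class_counts: class -> number of training examples.
    Returns class -> sampling probability (sums to 1)."""
    n_max = max(class_counts.values())
    weights = {}
    for cls, count in class_counts.items():
        difficulty = 1.0 - val_recall[cls] + eps     # poorly recognized classes are sampled more
        minority_boost = (n_max / count) ** gamma    # rarer classes are sampled more
        weights[cls] = difficulty * minority_boost
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}
```

Recomputing these probabilities after every validation pass makes the sampler follow the current needs of the model instead of a fixed resampling ratio.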
    Product Attribute Extraction Method Based on Seed-Constraint-LDA
    CHEN Kejia, ZHENG Jingjing
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (6): 37-48,70.   DOI: 10.12141/j.issn.1000-565X.210124
    In order to classify and extract product features from reviews, display reviews separately by product feature, and help consumers make purchasing decisions more efficiently, this paper proposes a product feature extraction method based on SC-LDA (Seed Constraint-Latent Dirichlet Allocation). Firstly, the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is used to automatically extract keywords as a feature seed set. Secondly, document reorganization is adopted to solve the problems of multi-feature co-occurrence in long texts and sparsity in short ones, and to improve the rate of document reorganization. Then, must-link and cannot-link seed constraints are applied to define the probability expansion and contraction values, which affect the topic allocation of the LDA model and make the training results more reasonable. Finally, the topics generated by SC-LDA are mapped to the prior feature categories. The advantages of the proposed method are verified by qualitative analysis of the feature categories and feature words, and by quantitative analysis in terms of accuracy, entropy and purity.
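The first step above, extracting seed keywords with TF-IDF, can be sketched in a few lines. This example uses the basic TF-IDF definition on pre-tokenized reviews and sums scores across documents to rank tokens; the aggregation rule, the function name and the toy reviews are assumptions, not details from the paper.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_n=2):
    """docs: list of token lists. Returns the top_n tokens ranked by summed TF-IDF."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for tok, count in tf.items():
            idf = math.log(n_docs / df[tok])
            scores[tok] += (count / len(doc)) * idf
    return [tok for tok, _ in scores.most_common(top_n)]
```

A token that appears in every review, such as a product name, receives an IDF of zero and is never selected as a seed, while tokens concentrated in a few reviews rank highest.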
    Injection Molding Part Size Prediction Method Based on Stacking Ensemble Learning
    SONG Jian, WANG Wenlong, LI Dong, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (6): 19-26.   DOI: 10.12141/j.issn.1000-565X.210664
    Machine learning algorithms can handle high-dimensional, multivariate data and extract hidden relationships from data in complex and dynamic environments, so they have good application prospects in injection molding part size prediction. The performance of an injection molding part size prediction system depends on the choice of machine learning algorithm; however, traditional machine learning algorithms cannot achieve good prediction results in practical applications. In this paper, a fusion model based on the Stacking ensemble learning method is proposed, and the best-performing model is obtained by comparing different combinations of Stacking learners and combining multiple types of learners. The performance of the model in injection molding part size prediction is greatly improved compared with traditional models, and the prediction results can be traced back to actual production through the features, providing decision guidance for the optimization of manufacturing processes.
    Trajectory Planning Method for Unmanned Aerial Vehicles Based on Local Soft-Constrained Optimization
    CHEN Peng, JIANG Yongqi, YU Tianwei, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (6): 27-36.   DOI: 10.12141/j.issn.1000-565X.210452
    In order to improve the efficiency of long-distance quadrotor trajectory planning in complex 3D scenes, this paper proposes a real-time quadrotor trajectory planning method based on local soft-constrained optimization. The method consists of two steps. Firstly, a safety distance constraint is added to the Theta* algorithm, and the heuristic function is improved with a turning cost to reduce the time consumed by quadrotor turning, so that an initial path composed of a small number of key points is generated. Secondly, a local optimization strategy based on soft constraints is used to optimize the segments of the initial path that pose potential safety hazards, and the hodograph property of the Bézier curve is used for time allocation to ensure the continuity, smoothness and dynamic feasibility of the trajectory and to improve the flight efficiency of the quadrotor. Experimental results show that the proposed method achieves a shorter flight distance, a shorter flight time and higher planning efficiency while ensuring the safety of the quadrotor. The method can also be successfully applied to actual quadrotor flight.
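The hodograph property mentioned above states that the derivative of a degree-n Bézier curve is itself a Bézier curve whose control points are n times the differences of consecutive control points. The sketch below shows how this yields a speed bound for a given segment duration. It illustrates the property only; the paper's time allocation procedure is not described in the abstract, and `max_speed_bound` is a hypothetical helper.

```python
def bezier(ctrl, t):
    """Evaluate a Bezier curve at t in [0, 1] with de Casteljau's algorithm.
    ctrl is a list of equally sized coordinate tuples."""
    pts = [tuple(p) for p in ctrl]
    while len(pts) > 1:
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def hodograph(ctrl):
    """Control points of the derivative curve: n * (P[i+1] - P[i])."""
    n = len(ctrl) - 1
    return [tuple(n * (b - a) for a, b in zip(p, q)) for p, q in zip(ctrl, ctrl[1:])]

def max_speed_bound(ctrl, duration):
    """Upper bound on speed when the segment is flown in `duration` seconds.
    Follows from the convex hull property applied to the hodograph."""
    return max(sum(c * c for c in v) ** 0.5 for v in hodograph(ctrl)) / duration
```

Because the bound scales with 1/duration, a planner can lengthen the time allotted to a segment until the bound falls below the vehicle's velocity limit, without sampling the curve.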
    A Cross-Modal Face Retrieval Algorithm Based on Metric Learning
    WO Yan, LIANG Jiyun, HAN Guoqiang
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (6): 1-9.   DOI: 10.12141/j.issn.1000-565X.210709
    Metric learning is an important technique for reducing modal differences. Existing metric-learning-based cross-modal retrieval methods pay little attention to pose differences and domain differences in cross-modal face retrieval tasks, and the metric learning process itself has two problems: global information is not learned, and a large number of redundant triplets exist. This paper proposes a cross-modal common representation generation algorithm based on metric learning. The algorithm uses a yaw angle equivariant module to compensate for yaw angle differences and thus obtain robust image features, and uses a multi-layer attention mechanism to obtain discriminative video features. It combines global triplets and local triplets to jointly train the cross-modal common representation generation network, and accelerates the convergence of the loss function by screening semi-hard triplets. It also combines domain calibration and transfer learning to improve the generalization of the common representations. Finally, comparison experiments on three face video datasets (PB, YTC and UMD Faces) demonstrate that the algorithm improves the accuracy of cross-modal face retrieval, and fine-tuning the cross-modal common representation generation network with different numbers of samples demonstrates that the algorithm improves the accuracy of cross-modal retrieval of target-domain images.
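The screening of semi-hard triplets mentioned above can be illustrated with the usual definition: a negative is semi-hard when it is farther from the anchor than the positive but still inside the margin. The sketch below applies that definition to a precomputed distance matrix; the margin value and function name are assumptions, and the paper's exact screening rule may differ.

```python
def semi_hard_triplets(dist, labels, margin=0.2):
    """dist[i][j]: distance between samples i and j; labels[i]: identity of sample i.
    Returns (anchor, positive, negative) index triplets satisfying
    d(a, p) < d(a, n) < d(a, p) + margin."""
    n = len(labels)
    triplets = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue                      # positive must share the anchor's identity
            for neg in range(n):
                if labels[neg] == labels[a]:
                    continue                  # negative must have a different identity
                if dist[a][p] < dist[a][neg] < dist[a][p] + margin:
                    triplets.append((a, p, neg))
    return triplets
```

Triplets whose negative already lies beyond the margin contribute zero loss, so discarding them removes redundant computation, which is consistent with the faster convergence the abstract reports.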
    Lightweight Object Detection Combined with Multi-Scale Dilated-Convolution and Multi-Scale Deconvolution
    YI Qingming, LÜ Renyi, SHI Min, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (12): 41-48.   DOI: 10.12141/j.issn.1000-565X.220095
    Because of their slow detection and large number of parameters, deep neural networks are difficult to deploy in mobile application scenarios, which have limited computing resources but demand high-speed calculation. To improve the inference speed of object detection and achieve a better tradeoff between detection accuracy and inference speed, this paper proposed a lightweight object detection network named MDDNet, which combines multi-scale dilated convolution and multi-scale deconvolution. Firstly, a lightweight detection backbone network was designed based on an efficient single-stage strategy, and depthwise separable convolution was introduced to reduce the number of parameters of the baseline and further speed up feature extraction. Secondly, two feature extension branches based on multi-scale dilated convolution were added to the backbone network, connected to the ends of the final and the penultimate residual layers of the basic network, respectively. The features of the two branches were fused in the prediction layer to augment the texture features of the shallow feature maps. Thirdly, a multi-scale deconvolution module was introduced and connected to the deep feature network layers to increase the size of the feature map, and the shallow feature maps of the previous layer at different scales were then fused to enrich the semantic and detail information of the features and improve the detection accuracy. Finally, the parameters of the prior bounding boxes were optimized in the prediction layer with the K-means clustering method, so that the prior bounding boxes could better match the ground truth of the objects and achieve higher object recognition accuracy. The experimental results show that MDDNet has about 7.21×10^6 parameters. The average accuracy is 58.7% and 76.0% on the KITTI and Pascal VOC datasets, respectively, while the corresponding inference speed reaches 55 f/s and 52 f/s. Therefore, MDDNet achieves a decent tradeoff among the number of parameters, detection speed, and detection accuracy, and it can be applied to real-time object detection on mobile terminals.
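The parameter saving from depthwise separable convolution mentioned above follows from a simple count. The sketch below compares a standard convolution layer with its depthwise separable counterpart, ignoring biases; the layer sizes are arbitrary examples rather than MDDNet's actual configuration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For a 3×3 layer with 128 input and 256 output channels, the standard form needs 294,912 weights and the separable form 33,920, a reduction of roughly 8.7 times.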
    Two-Stream Adaptive Attention Graph Convolutional Networks for Action Recognition
    DU Qiliang, XIANG Zhaoyi, TIAN Lianfang, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (12): 20-29.   DOI: 10.12141/j.issn.1000-565X.220055
    Human action recognition has received much attention in the field of computer vision because of its important role in public safety. However, when fusing the neighborhood features of multi-scale nodes, existing graph convolutional networks usually adopt direct summation, which attaches the same importance to every feature; this makes it difficult to focus on important features and hinders the establishment of optimal nodal relationships. In addition, the two-stream fusion method, which averages the prediction results of different models, ignores potential differences in data distribution, so its fusion effect is poor. To this end, this paper proposed a two-stream adaptive attention graph convolutional network for human action recognition. Firstly, a multi-order adjacency matrix that adaptively balances the weights was designed to focus the model on more important domains. Secondly, a multi-scale spatio-temporal self-attention module and a channel attention module were designed to enhance the feature extraction capability of the model. Finally, a two-stream fusion network was proposed, which improves the fusion effect by using the data distribution of the two-stream prediction results to determine the fusion coefficients. On the cross-subject and cross-view sub-datasets of NTU RGB+D, the recognition accuracy of the algorithm is 92.3% and 97.5%, respectively, while on the Kinetics-Skeleton dataset it reaches 39.8%; all of these are higher than those of existing algorithms, indicating the superiority of the algorithm in human action recognition.
    BiLSTM-BiDAF Named Entity Recognition Based on Machine Reading Comprehension
    WANG Jie, XIA Xiaoming
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (12): 80-88.   DOI: 10.12141/j.issn.1000-565X.220013
    Named entity recognition is a fundamental task of natural language processing (NLP) and plays an important role in many downstream NLP tasks, such as information extraction and machine translation. Existing named entity recognition methods are usually based on sequence labeling and extract the entities within each sentence independently, ignoring the semantic information between sentences. Named entity recognition methods based on machine reading comprehension encode important prior information about the entity class, which makes it easier to distinguish similar classification labels and reduces the difficulty of model learning. However, they still model only at the sentence level and ignore the semantic information between sentences, which easily causes inconsistent labeling of the same entity in different sentences. To this end, this paper extended sentence-level named entity recognition to text-level named entity recognition, and proposed a BiLSTM-BiDAF named entity recognition model based on machine reading comprehension. First, to utilize the context information within the whole text, the NEZHA pre-trained language model was used to obtain information about the full text, and local features were further captured through BiLSTM, so as to strengthen the model's ability to capture locally dependent information. Then, a bidirectional attention flow was introduced to learn the semantic association between the text and the entity category. Finally, to predict the position of entities in the text, a boundary detector based on a gating mechanism was designed to strengthen the correlation of the entity boundaries. At the same time, an answer count detector was established to identify unanswerable questions. Experimental results on the CCKS2020 Chinese electronic medical records dataset and the CMeEE dataset show that the model can effectively identify document-level and sentence-level named entities, with F1 reaching 84.76% and 57.35%, respectively.
    Image Sentiment Transformation Based on Adaptive Brightness Adjustment
    XING Xiaofen, LI Minsheng, XU Xiangmin
    Journal of South China University of Technology(Natural Science Edition)    2022, 50 (12): 1-12.   DOI: 10.12141/j.issn.1000-565X.220165
    Abstract1365)   HTML60)    PDF(pc) (6379KB)(132)       Save

    Common image sentiment transformation methods are based on the assumption that transferring image color can transfer image sentiment. However, due to the influence of image content, transferring image color cannot completely transfer image sentiment, and it is necessary to obtain a suitable reference image before transferring image color. However, in practical application, there will be difficulties in obtaining reference images that are similar to the target image in sentiments and similar to the source image in content, and the semantic consistency of local objects need to be considered when transferring image color. Therefore, this paper proposed an image sentiment transformation method based on adaptive brightness adjustment. According to the significant correlation between image brightness and image sentiment (also known as Valence value, abbreviated as V value) in psychology, the method adaptively adjusts brightness through deep neural network ISTNet to convert the image to target image sentiment. First, an image and its corresponding true V value were obtained from the existing image emotion dataset. By changing the image brightness, a series of images with different brightness can be obtained. Then, the pseudo V values corresponding to the images with the same content but different brightness were predicted by the pre-trained image V value regression. Finally, ISTNet was trained with these images and pseudo V values to learn the internal relationship between image brightness adjustment and sentimental change. In practical application, without any reference image, directly input the image and the target V value into the neural network ISTNet to obtain the output image of the corresponding sentimental tag. The experimental results show that the performance of this method is better than the existing color based image sentiment transformation methods.

    Anchor Graph Based Low-Rank Incomplete Multi-View Subspace Clustering
    LIU Xiaolan, SHI Zongyu, YE Zehui, et al
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (12): 60-70.   DOI: 10.12141/j.issn.1000-565X.220069

    Traditional multi-view clustering assumes complete data. In practice, however, limitations of the information acquisition method often leave some views with missing data, which leads to the problem of incomplete multi-view clustering. Most existing clustering models for this problem are based on non-negative matrix factorization or distance graphs; their co-optimization strategy can easily make the solution unstable, and the global structure cannot be fully characterized. To improve the quality of the clustering graph, this paper proposed an incomplete multi-view clustering algorithm, ALIMSC, based on low-rank subspace clustering and anchor graphs. The algorithm first obtained a benchmark similarity matrix of the data with the anchor-graph-based incomplete multi-view subspace clustering algorithm APMC and embedded it in the low-rank subspace clustering model. The similarity matrix was obtained by dimensionality-ascending alignment and weighted fusion, and the final clustering graph was obtained by making this similarity matrix as consistent as possible with the benchmark similarity matrix. By imposing a rank minimization constraint on the similarity matrix of each view, ALIMSC characterized the low-dimensional subspace distribution of high-dimensional data and emphasized the subspace structure of the data on top of the original anchor graph, that is, the block-diagonal structure reflected in the clustering graph. Experimental results on several public datasets show that the proposed algorithm outperforms classical incomplete multi-view clustering algorithms.
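
For readers unfamiliar with anchor graphs, the sketch below shows a common generic construction for a single, complete view: each sample is described by normalized Gaussian weights to a few anchor points, and a sample-to-sample similarity matrix is derived from those weights. It is an assumption-laden illustration and not the APMC or ALIMSC procedure: the Gaussian kernel, the `sigma` parameter, and the function names are choices made here, and missing views, fusion, and the rank constraint are not handled.

```python
import math
from typing import List

Vector = List[float]


def anchor_weights(
    samples: List[Vector], anchors: List[Vector], sigma: float = 1.0
) -> List[List[float]]:
    """Row-normalized Gaussian weights Z (n samples x m anchors)."""
    z = []
    for x in samples:
        row = [
            math.exp(-sum((xi - ai) ** 2 for xi, ai in zip(x, a)) / (2 * sigma ** 2))
            for a in anchors
        ]
        total = sum(row)
        z.append([w / total for w in row])
    return z


def anchor_graph_similarity(z: List[List[float]]) -> List[List[float]]:
    """Similarity S = Z * diag(column sums of Z)^-1 * Z^T (n x n)."""
    n, m = len(z), len(z[0])
    col_sums = [sum(z[i][j] for i in range(n)) for j in range(m)]
    return [
        [sum(z[i][j] * z[k][j] / col_sums[j] for j in range(m)) for k in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    samples = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    anchors = [[0.0, 0.0], [5.0, 5.0]]
    s = anchor_graph_similarity(anchor_weights(samples, anchors))
    for row in s:
        print([round(v, 3) for v in row])
```

With two well-separated groups of samples, the resulting matrix is close to block diagonal, which is the structure the abstract refers to when it mentions block diagonality in the clustering graph.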

    AFGSRec: A Social Recommendation Model Based on Adaptive Fusion of Global Collaborative Features
    CAI Xiaodong, ZENG Zhiyang
    Journal of South China University of Technology (Natural Science Edition)    2022, 50 (12): 71-79.   DOI: 10.12141/j.issn.1000-565X.220180

    Previous session-based recommendation systems usually capture users’ consumption preferences from their recent transaction records. This approach ignores the influence of global transaction information and friends’ preferences on users’ transaction behavior, which makes the model’s recommendations less accurate. To solve this problem, this paper proposed a social recommendation model, AFGSRec, based on adaptive fusion of global collaborative features. Firstly, a heterogeneous graph neural network was used to model users and their historical transaction information on the social network in order to capture global collaborative features and social influence among friends. Secondly, a graph neural network based on a selection mechanism was designed to filter out node transition features irrelevant to the current session and capture user preferences more accurately. Thirdly, an adaptive fusion method was designed to dynamically capture the impact of global collaborative features on users’ current preferences and improve the model’s recommendation accuracy. Finally, a dynamic cyclical learning rate was used to help the model handle saddle points better during training and speed up the convergence of AFGSRec. The experimental results show that AFGSRec is robust and that it outperforms the state-of-the-art model SERec in both HR (Hit Rate) and MRR (Mean Reciprocal Rank). On the Gowalla dataset, HR@10 and HR@20 are increased by 1.91% and 1.15%, respectively; MRR@10 and MRR@20 are increased by 5.05% and 4.83%, respectively. On the Delicious dataset, HR@10 and HR@20 are increased by 2.45% and 1.19%, respectively; MRR@10 and MRR@20 are increased by 4.84% and 4.32%, respectively.
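
The two metrics reported in this abstract can be made concrete with a short sketch using their usual definitions: HR@K is the fraction of test cases whose target item appears in the top K recommendations, and MRR@K averages the reciprocal rank of the target, counting zero when it falls outside the top K. The function names and the toy data are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
from typing import List


def rank_of_target(ranked_items: List[str], target: str) -> int:
    """Return the 1-based rank of `target`, or 0 if it is not in the list."""
    try:
        return ranked_items.index(target) + 1
    except ValueError:
        return 0


def hit_rate_at_k(rankings: List[List[str]], targets: List[str], k: int) -> float:
    """Fraction of cases whose target appears among the top-k items."""
    hits = sum(
        1 for ranked, target in zip(rankings, targets)
        if rank_of_target(ranked[:k], target) > 0
    )
    return hits / len(targets)


def mrr_at_k(rankings: List[List[str]], targets: List[str], k: int) -> float:
    """Mean reciprocal rank, with 0 contributed when the target is outside top-k."""
    total = 0.0
    for ranked, target in zip(rankings, targets):
        rank = rank_of_target(ranked[:k], target)
        if rank > 0:
            total += 1.0 / rank
    return total / len(targets)


if __name__ == "__main__":
    # Hypothetical recommendation lists and ground-truth next items.
    rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    targets = ["a", "a", "z"]
    print(hit_rate_at_k(rankings, targets, k=2))
    print(mrr_at_k(rankings, targets, k=2))
```

MRR@K rewards placing the target near the top of the list, whereas HR@K only checks whether it appears at all, which is why the two metrics can improve by different amounts for the same model.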
