Network traffic prediction is important for load balancing and network planning. This paper proposes an attention-based traffic predictor (ATP) model for traffic prediction in a software-defined network (SDN) environment. To improve the accuracy and efficiency of prediction, improvements are made in three aspects: data, model, and evaluation. First, a lower sampling frequency is combined with data augmentation to reduce the resource consumption of sampling requests. Second, based on the long-range correlation and self-similarity of network traffic, a sequence-to-sequence model with attention (Seq2Seq+Attention) is selected for network traffic prediction. Finally, this paper proposes an improved weighted MSE evaluation method that is better suited to network traffic prediction. Experiments show that the proposed method maintains prediction accuracy while reducing the sampling frequency by 50%. The weighted MSE evaluation method improves accuracy by 5.37% compared with the original MSE evaluation method.
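The abstract does not say how the weights in the weighted MSE are defined, so the following is only a minimal sketch under stated assumptions: the per-sample `weights` are supplied by the caller, and the names `downsample` and `weighted_mse` are hypothetical, not taken from the paper. It illustrates halving the sampling frequency and a normalised weighted squared error.

```python
def downsample(series, factor=2):
    """Keep every `factor`-th sample; factor=2 halves the sampling frequency."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return series[::factor]


def weighted_mse(y_true, y_pred, weights):
    """Weighted mean squared error, normalised by the sum of the weights.

    The weighting scheme is an assumption: the caller decides which samples
    (for example, traffic peaks) should count more.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    squared = (w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return sum(squared) / total_weight
```

With uniform weights this reduces to the ordinary MSE, which makes it easy to compare the two evaluation methods on the same predictions.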
This paper uses a deep learning-based model to classify mobile applications automatically, treating the task as a text classification problem. By analyzing the major mobile application markets, we define the main application categories and crawl the descriptions of mobile applications accordingly. After analyzing the crawled corpus, we expand its semantic information with word-level and character-level data augmentation. We then design several text classification networks, compare their experimental results, and select the best-performing network for tuning. The experiments show that the Bert+Highway+GRU classification network designed in this paper achieves better classification performance, with average P/R/F1 values of 0.8820/0.8892/0.8856. All classification indicators under the first-level labels of the applications reach 0.85 or higher, and the network also performs better in training and convergence speed. The deep learning-based mobile application classification network designed in this paper is efficient and achieves high classification accuracy.
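As a quick consistency check on the reported average precision, recall, and F1 values, the sketch below computes F1 as the harmonic mean of precision and recall. The function name `f1_score` is illustrative, and the abstract does not state whether the average F1 was computed per class or from the averaged P and R; the reported figures happen to agree with the latter.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract above.
reported_f1 = round(f1_score(0.8820, 0.8892), 4)
print(reported_f1)  # 0.8856
```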
Although sequence-to-sequence models have achieved state-of-the-art performance on many summarization datasets, problems remain when processing Chinese social media text: the generated summaries have short sentences and lack coherence and accuracy. These issues are caused by two factors. First, the RNN-based sequence-to-sequence model is trained by maximum likelihood estimation, which leads to vanishing or exploding gradients when generating long summaries. Second, text in Chinese social media is long and noisy, which makes it very difficult to generate high-quality summaries. To address these issues, we apply a sequence generative adversarial network framework. The framework consists of a generator, which generates summaries, and a discriminator, which evaluates the generated summaries. A softargmax layer connects the two so that the generator and discriminator can be trained jointly. Experiments are carried out on the Large Scale Chinese Social Media Text Summarization Dataset. Sentence length, ROUGE score, and human evaluation of summary quality are used to evaluate the generated summaries. The results show that the sentences in the summaries generated by our model are longer and more accurate.
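The abstract does not give the exact form of the softargmax layer. One common formulation, sketched here as an assumption, sharpens a softmax with a factor `beta` and returns the probability-weighted average of token embeddings, which approximates the embedding of the argmax token while staying differentiable. The names `soft_argmax_embedding` and `beta` are hypothetical.

```python
import math


def softmax(logits, beta=1.0):
    """Numerically stable softmax with sharpness factor beta."""
    peak = max(logits)
    exps = [math.exp(beta * (x - peak)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax_embedding(logits, embeddings, beta=10.0):
    """Differentiable stand-in for embeddings[argmax(logits)].

    As beta grows, the weighted average approaches the embedding of the
    highest-scoring token.
    """
    probs = softmax(logits, beta)
    dim = len(embeddings[0])
    return [sum(p * emb[d] for p, emb in zip(probs, embeddings))
            for d in range(dim)]
```

Because the output is a smooth function of the generator's logits, a discriminator's loss can in principle be propagated back to the generator, which a hard argmax would block.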
With the development of science and technology and the emergence of a large amount of new vocabulary, the traditional classification of disciplines can no longer meet the current needs of subject division in scientific literature. At the same time, clustering scientific literature places higher demands on the efficiency of the methods and on the supporting software and hardware. In this paper, text features are extracted with the TF-IDF method, taking the characteristics of scientific literature into account. In a Hadoop distributed environment, text clustering is carried out with the Canopy-Kmeans algorithm, which makes it possible to cluster massive collections of scientific literature. The proposed method improves key indicators compared with previous algorithms and greatly improves clustering efficiency.
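The Canopy-Kmeans idea can be illustrated on a single machine. The sketch below is not the Hadoop implementation from the paper: it assumes small in-memory points standing in for TF-IDF vectors, hypothetical thresholds `t1 > t2`, and a deterministic choice of canopy centers (the first remaining point) instead of a random one. Canopy centers then seed ordinary k-means.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopies(points, t1, t2):
    """Return (center, members) pairs; t1 is the loose and t2 the tight threshold."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    result = []
    while remaining:
        center = remaining[0]
        members = [p for p in points if distance(p, center) <= t1]
        result.append((center, members))
        # Points within the tight threshold can no longer become centers.
        remaining = [p for p in remaining if distance(p, center) > t2]
    return result


def assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, iterations=10):
    """Plain k-means started from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        clusters = assign(points, centroids)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = [center for center, _ in canopies(points, t1=3.0, t2=2.0)]
centroids, clusters = kmeans(points, seeds)
```

The canopy pass fixes the number of clusters and their starting positions cheaply, so k-means needs neither a hand-picked k nor random initialization.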
This paper uses word embeddings and deep neural networks to build a multi-label classification model for technical articles. We use deep learning algorithms to train word vectors on a large number of technical articles. The abstracts of these articles and their corresponding CNKI labels are then used as input to train several networks, whose prediction results are compared. Finally, the threshold for label screening is determined from the statistical distribution. Through parameter tuning, model fusion, and data augmentation, the accuracy of the multi-label prediction network reaches 92.05%. Multi-label classification based on deep neural networks has the advantages of simple preprocessing, high accuracy, and computational efficiency.
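The abstract does not specify which statistical distribution determines the screening threshold. Purely as an illustration, the sketch below assumes one hypothetical rule: take a low quantile of the scores that a model assigned to correct labels on held-out data, then keep every predicted label whose score reaches that value. The names `pick_threshold`, `screen_labels`, and `quantile` are assumptions, not the paper's method.

```python
def pick_threshold(true_label_scores, quantile=0.1):
    """Hypothetical rule: a low quantile of the scores given to correct labels."""
    if not true_label_scores:
        raise ValueError("need at least one score")
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    ordered = sorted(true_label_scores)
    index = int(quantile * (len(ordered) - 1))
    return ordered[index]


def screen_labels(scores, threshold):
    """Keep the labels whose predicted score is at least the threshold."""
    return [label for label, score in scores.items() if score >= threshold]


threshold = pick_threshold([0.9, 0.8, 0.7, 0.6, 0.95], quantile=0.25)
kept = screen_labels({"nlp": 0.91, "cv": 0.12, "ml": 0.7}, threshold)
```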
This paper proposes a network model that combines long and short feature extractors to classify web pages automatically. By analyzing the current major portal websites, we define the main categories of the original corpus. By analyzing the composition of web page content, we design composite feature extraction that combines long and short feature extractors. An attention mechanism is introduced in the short feature extraction network to strengthen the extraction of information from short text. For longer text, the long feature extraction network combines word-level and segment-level attention to capture information. In the final classification layer, a correction mechanism is used for model fusion, which further improves classification accuracy. The experimental results show that the proposed method achieves higher classification accuracy: all classification indicators reach 0.94 or higher under the first-level labels, and 0.90 under the second-level labels. The composite feature extraction network designed in this paper also has better noise robustness and classification efficiency.
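The combination of word-level and segment-level attention can be sketched as two rounds of attention pooling. This is a simplified illustration under assumptions, not the network from the paper: it uses dot-product scoring against fixed query vectors (`word_query`, `segment_query`, both hypothetical; in a trained model they would be learned parameters) and plain Python lists instead of tensors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weight each vector by softmax(dot(query, vector)) and sum them."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    weights = softmax(scores)
    pooled = [sum(w * vec[d] for w, vec in zip(weights, vectors))
              for d in range(len(query))]
    return pooled, weights


def hierarchical_pool(segments, word_query, segment_query):
    """Pool words into one vector per segment, then pool the segment vectors."""
    segment_vectors = [attention_pool(words, word_query)[0]
                       for words in segments]
    return attention_pool(segment_vectors, segment_query)[0]
```

For example, `hierarchical_pool([[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]]], [0.0, 0.0], [0.0, 0.0])` averages the two segment vectors, because zero queries give every item the same attention weight.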