Sequence-Based Feature

Author: arsalan
Posted on: 14 Mar 2019

Introduction

Several methods can be used to detect spammers and other malicious activity in social networks. These include graph structure features, sequence-based features, and statistical relational models. Each of these methods can help identify spammers who require manual or automated intervention (Ahmed et al., 2014). The multi-relational nature of a social network gives spammers more options, but it also strengthens the detection mechanisms, because patterns can be tracked across task types and over time. Apart from graph features, one can therefore use sequence-based features as well as a statistical relational model.

Sequence-Based Features

Sequence-based methods are used in many domains, for instance biology, malicious activity detection, and information retrieval, where they support computational prediction and detection. In an evolving multi-relational social network, every user generates a sequence of actions that these methods can analyze. Spammers typically pursue particular objectives in the network, and this causes their sequences of behavior to diverge from the norm (Ahmed et al., 2014). Two kinds of sequence-based features are considered here: sequential k-gram features and mixtures of Markov models.

Sequential k-gram Features

The simplest way to represent a sequence as features is to treat each element of the sequence as independent of the others. This, however, makes it impossible to capture the order of the sequence. In addition, when each element is treated independently, the resulting values do not differ from the out-degree of each vertex. Therefore, to capture the order of the sequence, k-gram features are used: the sequence is represented as a vector of the frequencies of its length-k subsequences. To keep the features computationally efficient, bigrams are used, i.e., k = 2. Like graph features, k-gram features help identify spammers who require manual or automated intervention.

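For example, with k = 2 the sequence a, b, a, b contains the bigrams (a, b), (b, a), and (a, b), so its feature vector holds a count of 2 for (a, b) and a count of 1 for (b, a).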
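A minimal sketch of bigram feature extraction follows. It assumes each user's activity is already available as an ordered list of action labels; the action names ("message", "friend", "view") and the function name kgram_features are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def kgram_features(sequence, alphabet, k=2):
    """Return the k-gram vocabulary and the matching frequency vector."""
    # Slide a window of length k over the sequence and count each k-gram.
    counts = Counter(
        tuple(sequence[i:i + k]) for i in range(len(sequence) - k + 1)
    )
    # Enumerate every possible k-gram so all users share one vector layout.
    vocabulary = list(product(alphabet, repeat=k))
    return vocabulary, [counts[gram] for gram in vocabulary]


# Hypothetical action types; the source does not list the actual ones.
ACTIONS = ["message", "friend", "view"]
sequence = ["message", "friend", "message", "message", "view"]

vocabulary, vector = kgram_features(sequence, ACTIONS)
for gram, count in zip(vocabulary, vector):
    if count:
        print(gram, count)
```

The vector has one entry per possible k-gram, so its length grows as the number of action types raised to the power k. That growth is the reason for stopping at k = 2.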

Mixture of Markov Models

Although k-gram features capture the order of events in a sequence, they may fail to capture this order properly in longer sequences. This is because increasing k enlarges the feature space, which leads to computational inefficiency and to estimation problems due to sparse features. Therefore, to capture the sequence order in longer sequences, and to make predictions, a generative model is used. This model is the same as a chain-augmented naive Bayes model, which has proven to be efficient in information modeling (Huang et al., 2013). The model represents the actions of each social network user through a mixture of Markov models. Each class of user, such as spammer or legitimate user, is associated with a label y, and a user's action sequence is assumed to be generated by the Markov model of that particular class. The joint probability of a sequence and its label is then the class prior multiplied by the class-specific probability of the first action and of each subsequent transition.
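The sketch below illustrates one way such a mixture of Markov models could be implemented, under stated assumptions: a first-order chain per class, add-one smoothing, non-empty sequences, and the factorization P(x, y) = P(y) P(x_1 | y) ∏ P(x_i | x_{i-1}, y). The class names, action names, priors, and training sequences are hypothetical toy values, not data from the cited work.

```python
import math
from collections import defaultdict


class MarkovChain:
    """First-order Markov chain with add-one (Laplace) smoothing."""

    def __init__(self, states):
        self.states = list(states)
        self.start = defaultdict(int)  # counts of first actions
        self.trans = defaultdict(int)  # counts of (previous, current) pairs
        self.out = defaultdict(int)    # counts of transitions leaving a state
        self.n_sequences = 0

    def fit(self, sequences):
        for seq in sequences:
            self.n_sequences += 1
            self.start[seq[0]] += 1
            for prev, cur in zip(seq, seq[1:]):
                self.trans[(prev, cur)] += 1
                self.out[prev] += 1
        return self

    def log_likelihood(self, seq):
        """log P(x | y) = log P(x_1 | y) + sum_i log P(x_i | x_{i-1}, y)."""
        k = len(self.states)
        ll = math.log((self.start[seq[0]] + 1) / (self.n_sequences + k))
        for prev, cur in zip(seq, seq[1:]):
            ll += math.log((self.trans[(prev, cur)] + 1) / (self.out[prev] + k))
        return ll


def posterior(seq, chains, priors):
    """P(y | x), obtained by normalizing the joint P(y) P(x | y) over classes."""
    joint = {y: math.log(priors[y]) + chains[y].log_likelihood(seq) for y in chains}
    top = max(joint.values())  # subtract the maximum for numerical stability
    weights = {y: math.exp(v - top) for y, v in joint.items()}
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


# Hypothetical toy data, for illustration only.
STATES = ["message", "friend", "view"]
train = {
    "spammer": [["friend", "message", "friend", "message"],
                ["friend", "message", "message"]],
    "legitimate": [["view", "view", "message"],
                   ["view", "friend", "view", "view"]],
}
chains = {y: MarkovChain(STATES).fit(seqs) for y, seqs in train.items()}
priors = {"spammer": 0.5, "legitimate": 0.5}

print(posterior(["friend", "message", "friend"], chains, priors))
```

Unlike the k-gram vector, the number of parameters per class here depends only on the number of action types, not on the sequence length, which is why this kind of model scales to longer sequences.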

Statistical Relational Model

Hinge-loss Markov Random Fields

Hinge-loss Markov random fields (HL-MRFs) are a class of conditional, continuous probabilistic models (Huang et al., 2013). They are log-linear models whose features are hinge-loss functions of the variable states. They are built on soft logic and can be used to generalize logical implication. The model is defined over potentials, random variables, and conditioned variables: its density is proportional to the exponential of the negative weighted sum of the hinge-loss potentials.
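For reference, the HL-MRF density is commonly written in the form below. This is the usual formulation from the HL-MRF literature, given here as an assumption about what the text intends rather than a formula recovered from it. The symbols are notation assumed for this sketch: Y are the random variables, X the conditioned variables, λ_j non-negative weights, ℓ_j linear functions of Y and X, p_j the exponents, and Z the normalizing constant obtained by integrating over all Y in [0, 1]^n.

```latex
P(\mathbf{Y} \mid \mathbf{X})
  = \frac{1}{Z} \exp\!\left( -\sum_{j=1}^{m} \lambda_j \, \phi_j(\mathbf{Y}, \mathbf{X}) \right),
\qquad
\phi_j(\mathbf{Y}, \mathbf{X})
  = \bigl[ \max\{ \ell_j(\mathbf{Y}, \mathbf{X}),\, 0 \} \bigr]^{p_j},
\quad p_j \in \{1, 2\}
```

Each potential is zero when its linear function is non-positive, that is, when the corresponding soft rule is satisfied, and grows with the degree of violation otherwise.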

Hinge-loss Markov Random Fields Collective Model for Reports

The objective of this model is to use abuse reports to predict spammers. HL-MRFs make it possible to incorporate the credibility of the reporting users, which improves the predictive value of the reports (Huang et al., 2013). Reasoning collectively over the credibility of the reporting user and the probability that the reported user is malicious increases the performance of the system. The approach may use the graph of report relations and is founded on the belief that a user reported by a credible reporter should have a higher probability of being a spammer. Conversely, if the reported user is not likely to be a spammer, the credibility of the reporting user should decrease, and vice versa.
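The reasoning above can be sketched as two soft rules scored with hinge losses. This is an illustrative assumption, not the rule set of the cited work: the rule wording, the predicate names, the weights, and the use of the Lukasiewicz relaxation are all chosen here for the example. Truth values lie in [0, 1].

```python
def lukasiewicz_and(*truths):
    """Lukasiewicz t-norm: soft conjunction of truth values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body, head):
    """Hinge-loss penalty of the soft rule body -> head (0 when satisfied)."""
    return max(0.0, body - head)


def report_penalty(credible, reported, spammer, w_forward=1.0, w_backward=1.0):
    """Weighted penalty for one report by user u1 against user u2.

    All rule names and weights are hypothetical.
    """
    # Rule 1: CREDIBLE(u1) & REPORTED(u1, u2) -> SPAMMER(u2)
    # Violated when a credible report targets an unlikely spammer; the penalty
    # drops if SPAMMER(u2) rises or if CREDIBLE(u1) falls.
    forward = distance_to_satisfaction(lukasiewicz_and(credible, reported), spammer)
    # Rule 2: SPAMMER(u2) & REPORTED(u1, u2) -> CREDIBLE(u1)
    # Violated when a report against a likely spammer comes from a user
    # currently believed to have low credibility.
    backward = distance_to_satisfaction(lukasiewicz_and(spammer, reported), credible)
    return w_forward * forward + w_backward * backward


# A credible reporter flags a user currently believed to be legitimate.
print(report_penalty(credible=0.9, reported=1.0, spammer=0.2))
# The same report once the beliefs agree with each other.
print(report_penalty(credible=0.9, reported=1.0, spammer=0.9))
```

In a full HL-MRF, inference would search for the credibility and spammer values that minimize the sum of such penalties over all reports. This sketch only evaluates the penalty for fixed values.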

Use of Time Window

The multi-relational nature of the network gives spammers more options, but it also lets detection systems track patterns over time. The representation of social network users is therefore built from time-stamped data, and the graph is constructed using a time window. The sequence of each user's activity is used to identify malicious activity on the social network, and time is also used to assess the reliability of a user's reports. As the use of the social network evolves, each user produces a sequence of behaviors or activities ordered by time (Gao et al., 2010), which helps in determining the intentions of the user. At test time, the posterior probability for the user is computed from the recorded sequence of activity.
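As a rough illustration of the time-window idea, the sketch below selects the events that fall inside a window and groups them into per-user, time-ordered action sequences. The event layout (timestamp, user, action), the half-open window, and all names and values are assumptions made for this example.

```python
from collections import defaultdict


def window_sequences(events, start, end):
    """Collect each user's time-ordered actions with start <= timestamp < end."""
    per_user = defaultdict(list)
    # Sorting the tuples orders events by timestamp first.
    for timestamp, user, action in sorted(events):
        if start <= timestamp < end:
            per_user[user].append(action)
    return dict(per_user)


# Hypothetical log of (timestamp, user, action) tuples.
events = [
    (5, "u1", "friend"),
    (1, "u1", "view"),
    (12, "u1", "message"),
    (3, "u2", "message"),
    (8, "u2", "message"),
]

print(window_sequences(events, start=0, end=10))
```

The resulting per-user sequences are the kind of input that sequence-based features such as k-grams or a Markov model would consume.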

Conclusion

In addition to graph features, sequence-based features and a statistical relational model can be used to detect spammers. Although graph features are widely used, these models can effectively and efficiently increase the performance of the system, and each of them can help identify spammers who require manual or automated intervention. Sequence-based features and the statistical relational model can be used to achieve the same results as graph features. The multi-relational nature of a social network gives spammers more options, but it also strengthens the detection mechanisms by exposing patterns across task types and over time, which is why the graph is constructed using a time window.

References

Ahmed, N. K., Neville, J., & Kompella, R. (2014). Network sampling: From static to streaming graphs. ACM Transactions on Knowledge Discovery from Data (TKDD), 8(2), 7.

Gao, H., Hu, J., Wilson, C., Li, Z., Chen, Y., & Zhao, B. Y. (2010, November). Detecting and characterizing social spam campaigns. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement (pp. 35-47). ACM.

Huang, B., Kimmig, A., Getoor, L., & Golbeck, J. (2013, April). A flexible framework for probabilistic models of social trust. In International conference on social computing, behavioral-cultural modeling, and prediction (pp. 265-273). Springer, Berlin, Heidelberg.

Spirin, N., & Han, J. (2012). Survey on web spam detection: principles and algorithms. ACM SIGKDD Explorations Newsletter, 13(2), 50-64.
