Table 1 Previous works that used writing style features in social media

From: Comparing writing style feature-based classification methods for estimating user reputations in social media

| Previous works | Tasks | Research purpose | Test bed social media | Language | Lexical features | Syntactic features | Structural features | Content-specific features | Techniques |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Abbasi and Chen (2005) | Authorship identification | Evaluating the linguistic features of Web messages and comparing them to known writing styles offers the intelligence community a tool for identifying patterns of terrorist communication | Web forum | English, Arabic | √ | √ | √ | √ | C4.5, SVM |
| Zheng et al. (2006) | Authorship identification | Examining writing style features and classification techniques to identify authorship of unknown online messages | Internet news group, Bulletin board system (BBS) | English, Chinese | √ | √ | √ | √ | C4.5, NN, SVM |
| Argamon et al. (2007) | Classification | Developing a new type of lexical feature for stylistic text classification, and demonstrating its usefulness in sentiment classification | Movie reviews | English | √ |  |  | √ | SVM |
| Abbasi and Chen (2008) | Authorship identification, similarity detection | Using writing style analysis techniques for identification and similarity detection of anonymous identities | eBay comments, Java forum | English | √ | √ | √ | √ | SVM, RS-SVM, PCA, Standard K-L transforms |
| Abbasi et al. (2008b) | Classification | Evaluating techniques that select writing style features for sentiment classification | Movie reviews, Web forum | English, Arabic | √ | √ | √ | √ | IG, GA, SVM weights, EWGA |
| Abbasi et al. (2008a) | Similarity detection | Evaluating writing style similarity detection techniques | Online feedback comments of eBay members | English | √ | √ | √ | √ | PCA, n-gram models, Markov models, Cross entropy, K–L similarity |
| Agichtein et al. (2008) | Classification | Automatically finding high-quality content in a question-answering portal | Yahoo! Answers | English | √ | √ |  | √ | C4.5, SVM |
| Koppel et al. (2009) | Classification | Comparing methods and features applied to authorship attribution problems representative of the range of classical attribution problems | Blog | English | √ | √ | √ | √ | NB, C4.5, Winnow, Bayesian regression, SVM |
| Huang et al. (2010) | Classification | Evaluating the effectiveness of user-generated text data in online video classification | Video-sharing Web site | English | √ | √ |  | √ | NB, C4.5, SVM |
| Zhang et al. (2011) | Classification | Evaluating writing style features and classification techniques for online gender classification | Web forum | English | √ | √ | √ | √ | SVM |
| Benjamin and Hsinchun (2012) | Classification | Investigating the relationship between hacker posting behaviors and reputation to identify potential cues for determining key actors | Hacker communities | English |  |  | √ |  | Regression analysis |
| Iqbal et al. (2013) | Authorship identification, classification | Studying three typical authorship analysis problems encountered by cybercrime investigators: authorship identification with large training samples, authorship identification with small training samples, and authorship characterization for gender and location | Blog | English | √ | √ | √ | √ | Ensemble of Nested Dichotomies, C4.5, RBF Network, NB, BayesNet |
| Jiang et al. (2014) | Similarity detection | Using online writing style analysis to segment forum participants by stakeholder group, and partitioning their messages into the time periods of major firm events to examine how important stakeholders evolve over time | Web forum | English | √ | √ | √ | √ | EM clustering |
| This study | Classification | Using writing style features as objective features to estimate the classes of user reputations in social media | Web forum | Korean, English | √ | √ | √ | √ | C4.5, NN, SVM, NB, RS-C4.5, RS-NN, RS-SVM, RS-NB |