STUDYING POLITICAL COMMUNITIES IN VK.COM WITH NETWORK ANALYSIS

This study explores politicized virtual communities of Russia on VK.com using the social network analysis approach. The paper focuses on VK because it is the largest social networking service in Russia. The authors aim to draw a general map of virtual communities on VK that are politically engaged and represent all political ideologies in Russia. The data were collected with the help of the VK API in 2019. Using a specially designed algorithm, the authors collected a sample of 115 politicized communities. The paper presents a critical analysis of the implementation of data sampling and crawling. The authors argue that the study includes all significant virtual communities, ranging from ideologically and politically oriented discussion groups to institutionalized political actors such as political parties and government agencies, and including groups of leading Russian mass media. The authors apply Gephi, a leading network analysis and visualization software package, to produce a map of the political virtual communities in Russia. The study indicates that virtual communities of mass media and institutionalized communities, such as political parties or government institutions, are concentrated at the core of the graph, while discussion groups about ideologies are on its periphery.


INTRODUCTION
This study aims to draw a map of politically engaged virtual communities in Russia. The paper focuses on VK.com (formerly known as 'Vkontakte') because it is the largest social networking service in Russia according to Alexa.com. One may think that this aim looks purely technical. This is partially true, because this research is part of a broader research project (Martyanov, 2019). However, the study also addresses several substantive topics, such as what virtual communities are and how they fit into the context of mediatized politics today. We explicate our definitions and topics in the following parts of the paper.

VIRTUAL COMMUNITIES, POLITICS, AND MEDIATIZATION
In this study, we define the 'virtual community' using the communicative approach, which is primarily based on the works of H. Rheingold and B. Wellman (Wellman, 1998). According to Rheingold, a virtual community is a 'social entity that is formed on the basis of computer-mediated communication, has enough people to support communication for a long time, and, at the same time, includes some human emotions and, as a result, has a network of interpersonal relationships' (Rheingold, 2000). This definition seems too general. S. Herring suggests that, to qualify, a virtual community must have the following parameters: (1) active 'self-sustaining' participation and a core of regular participants; (2) a common history, purpose, culture, norms, and values; (3) solidarity, support, reciprocity; (4) conflict resolution methods; (5) self-awareness of the group as a subject, different from other groups; (6) the emergence of roles, hierarchies, governance, rituals (Herring, 2004, 316-338).
In Russian media studies, the term 'internet group' is quite popular. However, in this research we prefer the traditional virtual community approach because it summarizes the key characteristics of the phenomenon much better. There are too many groups on VK with no shared values or meaningful communication. We therefore suggest the following main features of virtual communities: (1) unifying interest; (2) computer-mediated communication; (3) shared values. The unifying interest not only serves as the main motivation for participation in the virtual community but also substitutes for territorial identity, the core feature of traditional local communities. Whereas the latter is determined by the individual's place of residence, in virtual communities the unifying interest precedes location and forms the communicative space in which the community members subsequently interact. Computer-mediated communication is not only technological but also organizational and discursive. Internet communication contributes to the formation of a specific discourse, which not only sets the rules of behavior in the community but also structures it and creates a hierarchy. The third component is more characteristic of developed and active virtual communities and serves as a kind of indicator of group cohesion. A community without values is only a formal structure.
Numerous studies have been conducted on the problems of online political communication and virtual communities. Very often these studies apply the concept of the public sphere, introduced by J. Habermas, to evaluate communication practices. Usually, the public sphere is understood to include voluntary participation, universal access, rational argumentation, and freedom of expression. In his later works, Habermas came to the view that the public sphere has a network nature. Castells suggests, with careful optimism, that social media and grassroots activism, as evidenced in the Arab Spring and in Iceland's 'Kitchenware' Revolution, are able to boost democratic development and maintain the public sphere with the help of virtual communities.
However, recent studies show many signs that the public sphere is malfunctioning as a type of political communication today. Boutyline and Willer argue that, instead of exercising freedom of expression and rational argumentation, people tend to cooperate and communicate with people who hold similar political views. This effect is known as 'echo chambers' or 'information bubbles'. Unlike the public sphere, echo chambers tend to avoid discussion and rational argumentation. In echo chambers, communication maintains the established beliefs of the virtual community. Echo chambers are the most striking examples of post-truth politics because they seek to ignore 'unfavorable' facts and arguments. Echo chambers ensure the stability of political views. They also contribute to political radicalization and further polarization in society.
The public sphere could fall into refeudalization when private communication dominates over public communication. This could happen at the macro level, through social media censorship and user agreements, or at the micro level, through social practices of selective moderation. Studies show that about half of users have, one way or another, come across 'malicious' comments on various Internet sources (Suh et al., 2018).
Moderators have to substitute for legal forms of communication. Moderators have emerged as an informal institution that appears in the 'vacuum' of formal institutions.
Several studies show that virtual communities can be divided into three groups: 'counter-public spaces', 'echo chambers', and 'safe spaces'. Negt and Kluge rejected the universality of the public sphere and proposed the concept of 'counter-public spheres', an example of which was the proletarian public sphere, opposed to the Habermasian bourgeois sphere (Negt, 1993). This concept reveals the heterogeneity of society and the conflict of communication in it. The idea of 'safe spaces' refers to homogeneous communities with no rational discussion but with common experience, typically involving situations of discrimination or violence. For many network users, such communities are an opportunity to share their pain with others and feel solidarity. 'Safe spaces' also involve the intentional exclusion of 'others' who, in the community members' own words, could harm them. First of all, such communities are about support and psychological assistance. At the same time, such places become the field of activity of 'social justice warriors', the most aggressive activists of such communities. Echo chambers tend to have less positive results than 'safe spaces'. Echo chambers are also autonomous and homogeneous spaces in which discussions are aimed at maintaining community-specific values. Echo chambers are not examples of rational argumentation; they produce a policy of post-truth or emo-truth. 'Safe spaces' and echo chambers are close phenomena that are quite similar in a communicative sense. However, in terms of political discourse, they are almost polar categories, since the former label is used for communities of 'real victims' while the latter is used for the dominant class who pretend to be 'real victims'. Gibson has found that in 'safe spaces' both the moderator's removal of posts and self-removal of posts under group pressure are faster than in other groups. Gibson also found that in 'safe spaces' users are less crude than in spaces of the public sphere.
Virtual communities have become an integral part of today's media space and participate heavily in the mediatization of politics. Mediatization is the process by which the media transform other institutions, because those institutions need to adapt to the formats of the media. Without successful information support, social organizations quickly lose their social positions (Aelst, 2012). This means that social actors tend to behave like the media and adjust their activities to look attractive to their audiences (Holtz-Bacha, 2004). Schulz claims that mediatization takes place through a step-by-step process: 'first, the media extend the natural limits of human communication capacities; second, the media substitute social activities and social institutions; third, the media amalgamate with various non-media activities in social life; and fourth, the actors and organizations of all sectors of society accommodate to the media logic' (Schulz, p. 98). Paradoxically, mediatization relies on the decline of traditional media and is facilitated by the development of new forms of media communication. Today media communication includes not only traditional journalism and mass media but also new tools and methods such as user-generated content, blogging, and social media marketing. This process is most profitable for digital platforms: Apple, Amazon, Microsoft, Google, Facebook, Twitter, and other IT giants have built an infrastructure for social media and media communication.

METHOD AND DATA
In this study, social network analysis (SNA) is the main research method. One might expect social network analysis to be very popular as a tool for research on virtual communities and politics. However, in Russia there are not many empirical studies in this field. Some studies address various aspects of network communication, such as network discourse, hashtags, or verbal aggression (Balakhonskaya, 2018). More rarely, researchers try to create an overall picture of political communication in a networked environment, but they tend to cover fairly limited geographic segments or topics. For example, S. Suslov's research focuses on the network space of St. Petersburg (Suslov, 2016). In turn, E. Shchekotin and his colleagues identified opposition groups of 'right-wing radicals' and 'supporters of Alexey Navalny' (Shchekotin et al., 2013), while the work of N. Zilberman and N. Mishankina concentrates on the supporters of the 'Soviet idea' (Mishankina & Zilberman, 2017). There is a good study of the political blogosphere in Russia by B. Etling and colleagues. Unfortunately, that study is outdated and centered on Internet blogs and the LiveJournal era, which belongs not only to a different political period but also to different technical possibilities. Nevertheless, it is very interesting to find out whether political virtual communities in Russia are divided, like the mass media, into two large sectors of pro-Kremlin and anti-Kremlin supporters (Toepfl & Litvinenko, 2018).
The first research question (RQ1) is how to build a sample and determine which groups must be investigated and which must not. At the time of data collection from VK in July 2019, there were about 190 million groups. The question involves collecting data from VK using its open API (application programming interface). This is not a simple task, because one needs to build an algorithm that automatically detects political virtual communities and selects them from across the whole political spectrum while excluding insignificant groups. At the same time, we sought to ensure that the result included the most diverse ideological discourses, so that in the future we could go on to analyze the characteristics of political discussions. As the initial selection criteria, we took only communities that were large (at least 1000 participants), quite active (at least 1 post in the last month), and open to discussion (comments enabled). Since one of the requirements was the presence of comments, not all of the largest VK groups were included in our sample. For example, 'RosPil', the 'War with corruption' group by Alexey Navalny (https://vk.com/rospil), was not included because commenting was closed there.
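The initial selection criteria can be illustrated with a minimal Python sketch. It does not reproduce the authors' algorithm or any VK API call: the `Community` record, its field names, the example groups, and the reading of 'last month' as 30 days are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Community:
    """Hypothetical record for one VK group (field names are illustrative)."""
    name: str
    members: int
    last_post: datetime
    comments_open: bool


def meets_initial_criteria(c, now, min_members=1000, max_idle_days=30):
    """Apply the three initial criteria: size, recent activity, open comments.

    'At least 1 post in the last month' is approximated here as a post
    within the last 30 days, which is an assumption.
    """
    recently_active = now - c.last_post <= timedelta(days=max_idle_days)
    return c.members >= min_members and recently_active and c.comments_open


# Invented example data, dated to the collection period (July 2019).
now = datetime(2019, 7, 31)
candidates = [
    Community("large_active_open", 250_000, datetime(2019, 7, 30), True),
    Community("large_active_closed", 300_000, datetime(2019, 7, 29), False),
    Community("small_active_open", 400, datetime(2019, 7, 30), True),
    Community("large_dormant_open", 50_000, datetime(2019, 3, 1), True),
]
selected = [c.name for c in candidates if meets_initial_criteria(c, now)]
```

In this sketch, a group with closed comments is dropped regardless of its size, which mirrors the exclusion of 'RosPil' described above.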
The second research question (RQ2) concerns the structure of the network, or graph, of virtual communities that are politically engaged and represent the whole political spectrum in Russia. We perform a social network analysis of inter-community relations with Gephi (https://gephi.org/), an open-source software package that has proved to be an effective tool for SNA. SNA is a very useful approach in itself, because it offers rich possibilities for visualization, which helps in understanding the structure of social connections and roles.
The third research question (RQ3) deals with the problem of network segmentation. In other words, does the pro-Kremlin versus anti-Kremlin opposition exist on VK among politicized virtual communities? And, if so, is this opposition really significant, and which segment dominates the other? To find out, one should apply the modularity test, a special technique in SNA. The modularity measure was proposed by Newman and Girvan during the development of clustering algorithms. Modularity is a quantitative measure that indicates the presence of distinct communities within a network. If the network's modularity is high, the network has a pronounced community structure, which, in turn, means that there is room for plurality and diversity within it.
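To make the measure concrete, the sketch below computes modularity for a given partition of an undirected, unweighted graph using the standard formulation, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside community c, and d_c the sum of degrees of its nodes. The toy graph (two triangles joined by one bridge) is invented for illustration, and this is not Gephi's implementation, which also searches for the partition itself.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in community c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the bridge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, a clear two-cluster structure
q_single = modularity(edges, one_group)   # 0.0, no structure beyond chance
```

Placing every node in a single community always yields Q = 0, so positive values indicate more within-group edges than a random graph with the same degrees would be expected to have.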

RESULTS
On the first run, the algorithm, created with the help of specialists from the Center for Sociological and Internet Research at St. Petersburg State University, automatically generated a list of 19,243 groups and pages that met the initial requirements for activity and matching keywords. Manual verification of that much data in a short time was unrealistic. Therefore, the criteria were tightened: pages were excluded from the search (only groups were left), the period of activity was shortened (to 10 days), and the required number of subscribers was raised (to at least 4000 people). We expected these changes to reduce the total number of groups significantly, and they did. On the second run the search returned 2693 groups, which were evaluated manually. Nominal communities, in which discussion in the comments was either absent or extremely volatile, were excluded from the sample. Numerous communities that were not politicized, despite the formal presence of keywords, were also excluded. As a result of manual screening, we were left with only 65 groups, which nevertheless formed a sample in which all the necessary ideological segments were represented. Most interestingly, each political segment contained several communities at once. Thus, a spontaneous quota structure emerged in the sample. The second part of the sample was generated using the built-in group search mechanism on VK. The choice of groups was dictated by the built-in categories such as 'media', 'hobbies' (politics), 'political parties', and 'public organizations'. The criteria for activity and the number of subscribers were the same as in the first part of the sample. When choosing groups in the 'media' category, we used the media ratings from 'Medialogy' (www.mlg.ru) and 'MediaScope' (http://mediascope.net).
It turned out that not all major media outlets have official groups on VK that could be classed as virtual communities satisfying all the requirements. Some political parties that were not selected by keywords were also added. For example, the LDPR community cannot be found by searching for the keyword 'party', because its name is an abbreviation of the Liberal-Democratic Party of Russia. For interested readers, a complete list of groups and their segmentation by ideological areas is presented in Figure 1, which is available on GitHub (https://github.com/bkv-lab/vk-virt-com-2019/blob/master/sample.csv).
To repeat, the data were collected in July 2019. The sample includes 115 virtual communities from VK, the largest Russian social network. It consists of virtual communities that belong to the recognized ideological discourses: liberal, conservative, social democratic, communist, nationalist, anarchist, feminist, and green. It also contains the significant 'institutional' communities, which represent the established groups around such institutions as public authorities, political parties, public organizations, and the media. Thus, our sample consists of two parts: discursive (Part 'A') and institutionalized (Part 'B'). As a result, we have a representative sample of virtual communities on VK. The groups in our sample ranged in size from several thousand to several million members. For example, the RIA Novosti group on VK had 2,407,319 subscribers, and the 'Lentach' group had 2,125,808 subscribers.
In the resulting undirected graph, the total number of vertices was 115 (the number of politicized communities) and the total number of edges was 6523. The average path length between nodes was only 1.005, and the diameter of the graph was 2, which indicates a high degree of interconnection between groups. The graph density was 0.995, which, in our opinion, should be interpreted as a high value. The average vertex degree was 113.44, which means that each node is connected to almost all other vertices. This emphasizes the high connectivity of the resulting graph. At the same time, the modularity coefficient was only 0.2, which may indicate a low degree of potential clustering in the graph of politicized communities. Thus, the key metrics indicate that this sample represents all groups of political activists on VK. The sample was visualized using the 'Expansion' algorithm (see fig. 1). In SNA terminology, the result is called a graph. Based on the number of connections between the virtual groups, the algorithm puts more important nodes closer to the center. We can see that the center of the graph is occupied by the institutionalized communities, while discursive communities are mainly on its periphery. The map of politicized virtual communities in the discursive part of the sample (Part 'A') shows that the center of the graph contains several communities that can conditionally be classed as mainstream communities related to foreign policy and patriotic themes. In a sense, we can speak of a certain central cluster. However, most of the communities are on the periphery of the graph, regardless of their position on the ideological spectrum. This is because, against the backdrop of institutionalized interest groups, virtual communities based on the exchange of views are in a weak position. They do not have enough organizational and communication resources to advance on a social network.
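The reported metrics are mutually consistent, as a short arithmetic check shows. The sketch below uses only the vertex and edge counts given above; the average path length is derived under the assumption that, with a diameter of 2, every pair of vertices without a direct edge is exactly two steps apart.

```python
n = 115   # vertices (politicized communities), as reported
m = 6523  # undirected edges, as reported

# A simple undirected graph on n vertices has at most n(n-1)/2 edges.
max_edges = n * (n - 1) // 2        # 6555

density = m / max_edges             # share of possible edges that exist
avg_degree = 2 * m / n              # each edge contributes to two vertex degrees

# With diameter 2, adjacent pairs are at distance 1 and all other pairs
# at distance 2 (assumption stated in the lead-in).
missing_pairs = max_edges - m       # 32 pairs with no direct edge
avg_path_length = (m * 1 + missing_pairs * 2) / max_edges

print(f"density          = {density:.3f}")
print(f"average degree   = {avg_degree:.2f}")
print(f"avg path length  = {avg_path_length:.3f}")
```

Only 32 of the 6555 possible pairs of communities lack a direct connection, which is why the graph is almost complete.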
A completely different picture develops in Part 'B' of the sample: most groups are in the center of the general graph.
Moreover, most of the central groups are related to the media. Channel One, RIA Novosti, Lentach, Meduza, Ekho Moskvy, Vedomosti, RBK, and other media outlets dominate the graph and form the strongest links. The media have the greatest communication resources for attracting the attention of users, while other institutionalized groups, such as political parties, ministries, and other official structures, can be promoted through organizational resources. Not all visualization results seem clear at first glance. For instance, the 'Yabloko' political party and 'Partiya Rosta' groups are located on the very edge of the graph, almost as far from the center as the Russian Monarchist Movement and the Monarchist Party of Russia. This seemingly surprising fact can be explained quite simply if we look at the number of subscribers and the intensity of communication in these groups, which were at approximately the same level at the time of the study. Most of the politicized virtual communities from Part 'A' belong to the category of marginal or, more accurately and in terms of social network analysis, peripheral communities.
To answer RQ3, we ran the modularity test. In our case, at a modularity value of 0.51, we can see the formation of two large segments, or clusters, in the graph: 62.61% of the communities belong to the pro-Kremlin ('patriotic') segment, and the remaining 37.39% belong to the anti-Kremlin ('opposition') segment. This segmentation resembles the division of the information space of Russia into two large sectors described by Toepfl (2018). The modularity test also explains why some virtual communities from Part 'A' of the sample are in the center of the graph: they are there because they correspond to the political mainstream, or to an ideological discourse that can generally be called state-patriotic.
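The two percentages can be translated back into community counts. The counts below are inferred from the reported shares and the sample size of 115; they are not stated in the text.

```python
total = 115          # communities in the sample, as reported
pro_share = 62.61    # reported share of the pro-Kremlin segment, in percent
anti_share = 37.39   # reported share of the anti-Kremlin segment, in percent

# Inferred counts: the nearest whole numbers of communities.
pro_count = round(total * pro_share / 100)
anti_count = round(total * anti_share / 100)

print(pro_count, anti_count, pro_count + anti_count)
```

Under this reading, the pro-Kremlin segment contains 72 communities and the anti-Kremlin segment 43, which together account for the full sample.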

CONCLUSION
The last part of the paper describes some opportunities and limitations of network analysis for studying politicized virtual communities. First, building a sample of politicized virtual communities on VK is clearly a non-trivial task. On the one hand, the sample should not be too large, as this would create very heavy requirements for data retrieval and call for serious computing power. On the other hand, the algorithm must cover the whole spectrum of political views and ideologies. It is therefore understandable that there have been few attempts to create a large-scale visualization of the virtual space. Our study was possible only with the support of the resource center. The presented technique also has serious limitations caused by the VK search mechanism, which works only on the names of groups. Nevertheless, the keyword method is quite feasible and gives a positive result.
Second, under our definition of virtual communities, our sample contains only groups with active communication between members about common topics and issues, with a significant number of comments on each post. Politics is implemented through the process of political communication. Discursive practices characterize the essence of modern political processes. A network analysis of the politicized virtual communities of Russia on VK confirms this trend, since media groups are in the center of the graph. This means that the process of mediatization is advancing steadily.
Unfortunately, the social network approach cannot capture the essence of communication in virtual communities. It is good for revealing the structure of the political landscape and, to some degree, for understanding the socio-demographic characteristics of political communities. To find useful information about real discursive practices in virtual communities, one needs to apply discourse analysis. There is also the major problem of selective moderation, which is a part of modern communication practices and an integral part of the concept of echo chambers that challenges the theory of the public sphere. SNA is not applicable to detecting selective moderation. Thus, this application of network analysis indicates the importance of a systemic approach and a multi-dimensional methodological toolkit.

ACKNOWLEDGMENT
The reported study was funded by RFBR and EISR according to the research project № 19-011-310001. We would like to express our gratitude to the Center for Sociological and Internet Research at Saint Petersburg State University for helping us to retrieve data from VK.