This paper explores users’ influence on microblogs. First, drawing on social network theory, we analyze the structure of the information transmission network formed by the “follow and followed” relationships among microblog users. Informed by an analysis of microblog user behavior, the paper then presents a model for calculating the weights of users’ influence. It proposes the U-R model, which evaluates users’ influence by combining the PageRank algorithm with user behavior analysis. In the U-R model, the effect of user behaviors is taken into account and PageRank is applied to evaluate the importance and influence of every user in a microblog network by repeatedly iterating each user’s U-R value. Users in a microblog network can then be ranked by influence according to their U-R values. Finally, the validity of the U-R model is demonstrated with a real-life numerical example.
A microblog is a platform based on user relationships for sharing, transmitting and acquiring information, on which users can establish individual communities, post updates of around 140 characters and share them in real time via WEB, WAP and a variety of clients [
At present, many scholars around the world have begun to study microblogs, or Twitter (the service is ordinarily called “microblog” in China, so that term is used in this paper). Active research areas include the motivation and behaviors of microblog users as well as the structure of microblog social networks. The evaluation of microblog users’ influence (microblog influence) has also become a new research focus in social network analysis. Foreign studies mainly discuss Twitter, which is regarded as the prototype of the microblog. In addition, AKSHAYJAVA et al. (2007) [
However, there are at present only a few academic studies of microblog users’ influence. GABRIEL W (1994) [
In this paper, building on previous studies, we draw on the PageRank algorithm, which search engines use to rank web pages. Taking into account users’ activity, represented by the frequency of posting microblogs, and their interactive positiveness, we propose the U-R model, an algorithm for evaluating users’ influence based on PageRank and microblog user behavior analysis. This model can overcome some shortcomings of the above models.
In microblog networks, the description of the friend relationship varies across service providers. For instance, on SINA microblog the relationship is “follow and followed”, while on Tencent microblog it is “listen and listened”. In this paper, we adopt “follow and followed”, shown in
Based on the characteristics of microblogs and on social network analysis theory, we propose the following definitions:
• Node: every user is a node in a microblog network, such as user A and user B (See
• Edge: the “follow and followed” relationship between two microblog users; each edge is directed.
• In degree and out degree: the number of Followees of a user node is its out degree, whereas the number of Followers is its in degree.
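The definitions above can be illustrated with a minimal Python sketch. The edge list and the user names are hypothetical and are not taken from the paper's data set; each pair `(follower, followee)` is one directed edge.

```python
from collections import defaultdict

# Hypothetical follow edges: (follower, followee) means "follower follows followee".
follows = [("A", "B"), ("C", "B"), ("B", "A")]


def degrees(edges):
    """Return (in_degree, out_degree) dictionaries for a directed follow graph."""
    in_deg = defaultdict(int)   # number of Followers of each node
    out_deg = defaultdict(int)  # number of Followees of each node
    for follower, followee in edges:
        out_deg[follower] += 1
        in_deg[followee] += 1
    return dict(in_deg), dict(out_deg)


in_deg, out_deg = degrees(follows)
```

In this toy graph, user B has two Followers (in degree 2) and one Followee (out degree 1).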
Additionally, the PageRank algorithm is based on the following two assumptions [
• If a page is referenced many times, it may be very important. Even if a webpage is not referenced frequently, it may still be important if it is referenced by important webpages. The importance of a webpage is distributed evenly among the pages it references.
• Assume that a reader starts at a random page of a webpage collection and then keeps browsing by following the links on the current page; the PageRank value of a page is the probability that it is the next page browsed.
In other words, if a webpage is linked to by many significant webpages, its content has been widely recognized and trusted; the content has high authority and the page should rank higher. Therefore, the equation [
where PR(X) is the PageRank value of webpage X;
N(X) is the out degree (the number of the links from this webpage to other webpages);
M(X) is the page collection that points to webpage X.
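Written out from the three definitions above, Equation (1) presumably takes the standard PageRank form below. This is a reconstruction under that assumption, not a copy of the original typesetting; the summation index Y is introduced here only for illustration.

```latex
\[
PR(X) = \sum_{Y \in M(X)} \frac{PR(Y)}{N(Y)} \tag{1}
\]
```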
This is a recursive equation, and the PageRank value of a webpage is distributed evenly over its outgoing links. The PageRank value is an indicator of the importance of a webpage and is generated by the hyperlink structure of the network. The PageRank value of any webpage can thus be calculated from the values of the pages that link to it and their numbers of outgoing links: for each page linking in, its PageRank value is divided by its number of outgoing links, and the results are summed. In the calculation, we make a simple modification to Equation (1) by adding a damping coefficient [
Based on empirical analysis, p is conventionally set to 0.85, so that the iteration converges.
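The damped iteration can be sketched in Python as follows. The sketch assumes the common non-normalized form PR(X) = (1 - p) + p · Σ PR(Y)/N(Y), which may differ in detail from the paper's Equation (2); the function name, the tolerance and the three-page example graph are hypothetical.

```python
def pagerank(links, p=0.85, tol=1e-9, max_iter=1000):
    """Iterate damped PageRank; links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # start every page at 1
    for _ in range(max_iter):
        # Assumed damping form: every page receives the constant (1 - p) ...
        new = {page: 1.0 - p for page in pages}
        # ... plus an equal share of the damped value of each page linking to it.
        for src, targets in links.items():
            if not targets:
                continue  # a page without outgoing links passes nothing on
            share = p * pr[src] / len(targets)
            for target in targets:
                new[target] += share
        if max(abs(new[x] - pr[x]) for x in pages) < tol:
            return new
        pr = new
    return pr


# Hypothetical three-page graph.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this example page c is linked to by both a and b, so it ends up with a higher value than page b, which is linked to only by a.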
Hence, drawing on the PageRank algorithm, we propose two assumptions for the U-R model:
• If a microblog is forwarded and commented on many times, it may be very important. Even if a microblog is not forwarded and commented on frequently, it may still be important if it is forwarded and commented on by important microblog users. The influence of a user is distributed equally among the users he/she follows.
• Assume that a reader starts at a random user in the microblog user collection and then keeps browsing microblogs through the current user’s follows, forwards and comments; the U-R value of a user is the probability that he/she is the next user browsed.
In the microblog network, if user A follows user B, A is a follower of B. Then A can see the microblogs posted by B, but B cannot see those posted by A. Information flows through the microblog network along these follow relationships. The structure of a microblog network is in fact similar to the link model of webpages: A following B is equivalent to A voting for B. Hence, we are able to rank the influence of microblog users on the basis of PageRank.
The more Followers a user has, the greater his influence on information transmission in the microblog network. The microblogs he posts appear on the pages of tens of thousands of Followers; therefore, he carries a bigger weight when authority is calculated. In turn, his vote carries a bigger weight when he follows other users; that is, the influence of the users he votes for becomes bigger. For instance, if user A has 100,000 Followers and follows user B, A will see the content posted by B. If A forwards it, this microblog is presented to 100,000 Followers. In this way, A acts like an amplifier that enlarges the effect of information forwarding. Consequently, B is highly authoritative in the microblog network. However, B’s authority depends on whether A forwards B’s microblog, the probability of which is similar to the damping coefficient [
Additionally, in PageRank, the PR value of a webpage is transmitted evenly to the pages it links to, which overlooks the importance of each page itself. When the PageRank algorithm is applied to the analysis of a microblog network, the weight ratio of user behavior serves as the standard for distributing the PageRank value. Under this standard, a user with a higher weight obtains a correspondingly higher PageRank value, and the transmission of PageRank is non-uniform. As a result, active users have more authority in the network than inactive users. This overcomes the shortcoming of evaluating influence solely from the follow and followed relationship, and the model better reflects objective reality. Based on the above analysis, combined with user behavior analysis, we put forward a new U-R model to evaluate microblog users’ influence.
where UR(u) and UR(v) represent the influence evaluation value of microblog user u and user v;
p is the probability for v to forward u’s microblog, and here p is set as 0.5;
M(u) is the collection of u’s Followers;
A(v,u) is the share of user v’s UR value that is assigned to user u, determined by the ratio of u’s weight to the total behavior weight of all of v’s Followees. The equation is
In Equation (4), Wu is user u’s microblog behavior weight. Nv represents the number of Followees of v (node v’s out degree).
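From the symbol definitions above, Equations (3) and (4) can plausibly be written as below. This is a reconstruction, not the original typesetting: the (1 - p) term is an assumption made by analogy with damped PageRank, and the index i, which runs over the N_v Followees of v, is introduced here for illustration.

```latex
\[
UR(u) = (1 - p) + p \sum_{v \in M(u)} A(v,u)\, UR(v) \tag{3}
\]
\[
A(v,u) = \frac{W_u}{\sum_{i=1}^{N_v} W_i} \tag{4}
\]
```

Under this reading, the shares A(v,u) that v assigns to its Followees sum to 1, so v's influence is split among them in proportion to their behavior weights rather than evenly.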
According to the description of microblog user behavior, behavior such as the frequency of updating microblogs and interaction with other users is a major factor in a user’s influence. We therefore need to consider both the user’s active degree and his enthusiasm for participating in interaction: the higher a user’s active degree and interactive positiveness, the greater the influence he generates. To facilitate the subsequent analysis, we define a model of the weight of user behavior influence that describes the active degree and interactive positiveness of microblog users. The model is shown in
If W represents the weight of user influence, then
where Xi is the user’s active degree;
Yi is the user’s interactive positiveness;
a and b are both weighting coefficients.
The user’s active degree is defined as the frequency of updating microblogs within a unified time scale, and interactive positiveness as the amount of mentioning, commenting and forwarding by the user within the same time scale. That is
In the above equations, T is the unified time scale; the same T is used for all users so that the active degree and interactive positiveness indicators can be compared objectively. Q is the number of microblogs the user has posted, A is the number of “@” mentions, C is the number of comments, and R is the number of forwards. c, d and e are weighting coefficients. Since different behaviors affect a user’s weight to different degrees, the weighting coefficients can be given different values. After calculating the user’s active degree and interactive positiveness, we obtain the weight of user behavior.
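The weight model can be sketched in Python as follows. The formulas W = aX + bY, X = Q/T and Y = (cA + dC + eR)/T are inferred from the symbol definitions in the text, since the equations themselves are not reproduced here, so they should be read as an assumption. The default coefficients are the values used in the paper's numerical example; the function name and the sample counts are hypothetical.

```python
def behavior_weight(Q, A, C, R, T=100, a=0.4, b=0.6, c=0.5, d=0.3, e=0.2):
    """Return the behavior weight W of one user over a time scale T.

    Q: microblogs posted, A: "@" mentions, C: comments, R: forwards.
    """
    X = Q / T                          # active degree (assumed form)
    Y = (c * A + d * C + e * R) / T    # interactive positiveness (assumed form)
    return a * X + b * Y               # W = aX + bY (assumed form)


# Hypothetical user: 50 posts, 10 mentions, 20 comments and 30 forwards.
w = behavior_weight(Q=50, A=10, C=20, R=30)
```

For this hypothetical user, X = 0.5 and Y = 0.17, which gives W of about 0.302.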
The microblog data set [
Because the user group of a microblog network is very large, only 10 nodes are selected from the data set in this paper as a sub-network of the entire microblog social network. We use these nodes to carry out the U-R model calculations and to explore information transmission and node influence. The follow and followed relationships among the nodes are represented as an adjacency matrix.
If exists, the adjacency matrix is
Based on the above weight model and an analysis of the attributes in the sample, we set a = 0.4, b = 0.6, c = 0.5, d = 0.3, e = 0.2 and T = 100. The network attributes and user weights of the sample are presented in
Based on the relationships among the nodes and Equation (3), the iterative equation is as follows:
The weight proportion that each user assigns to a node is then calculated in accordance with Equation (4). For instance, node 4 follows nodes 1, 2, 3, 5, 6 and 9, so the weight proportion that node 4 assigns to node 1 is
Similarly, node 10 follows nodes 1, 4 and 8, and the weight proportion it assigns to node 1 is
Each node’s initial value is 1. Each node’s value is then calculated iteratively based on Equation (8) until the result converges. The iterative process is given in the iteration table of the U-R model example in the Appendix. The resulting UR values are presented in
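The iteration described above can be sketched in Python as follows. The sketch assumes the update UR(u) = (1 - p) + p · Σ A(v,u) · UR(v) with A(v,u) equal to u's weight divided by the total weight of v's Followees; the paper's 10-node data are not reproduced in this text, so the three users, their weights, the function name and the tolerance are all hypothetical.

```python
def ur_rank(followees, weights, p=0.5, tol=1e-6, max_iter=1000):
    """Iterate U-R values until they converge.

    followees[v]: list of users that v follows; weights[u]: behavior weight of u.
    """
    users = list(weights)
    ur = {u: 1.0 for u in users}  # each node's initial value is 1

    # Share A(v, u): u's weight relative to the weights of all of v's Followees.
    share = {}
    for v, followed in followees.items():
        total = sum(weights[f] for f in followed)
        for u in followed:
            share[(v, u)] = weights[u] / total if total > 0 else 0.0

    for _ in range(max_iter):
        new = {u: 1.0 - p for u in users}       # assumed (1 - p) base term
        for (v, u), a_vu in share.items():
            new[u] += p * a_vu * ur[v]          # v passes influence on to u
        if max(abs(new[u] - ur[u]) for u in users) < tol:
            return new
        ur = new
    return ur


# Hypothetical toy network, not the paper's sample.
followees = {"x": ["y", "z"], "y": ["z"], "z": ["x"]}
weights = {"x": 0.2, "y": 0.1, "z": 0.3}
ur = ur_rank(followees, weights)
ranking = sorted(ur, key=ur.get, reverse=True)
```

In this toy network user z is followed by both x and y and has the largest behavior weight, so it receives a higher U-R value than y.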
From the calculations on the 10 selected nodes, it can be concluded that a user’s influence is not necessarily positively correlated with the number of Followers. For example, although node 8 has more Followers (a higher in degree) than node 2, node 8 posts fewer microblogs and has lower interactive positiveness, so its influence is smaller. In this way, the U-R model overcomes some shortcomings of the algorithm model proposed by GABRIEL W [
Currently, microblog is among the most popular online social networks. Because it has the characteristics not only of a social network but also, clearly, of media, it is also called “social media”. Through the UR algorithm, which is simple and clear, this paper reflects the influence of microblog users faithfully, which can be helpful for marketing, public opinion control, etc.
However, the values of the damping coefficient p in the U-R algorithm and of the weighting coefficients a, b, c, d, e in the weight model are assumptions, and they need to be set according to the actual situation. Additionally, the U-R model does not reflect the quality of microblog content, although in a microblog network higher-quality content spreads virally more easily and such microblogs tend to have an impact on other users. These two problems remain to be studied further.
This paper is supported by the Fundamental Research Funds for the Central Universities under grant No. 72115096.