A Ranking Algorithm for Online Social Network Search

A Ranking Algorithm for Online Social Network Search INDIAN INSTITUTE OF MANAGEMENT CALCUTTA WORKING PAPER SERIES WPS No.724 / February 2013 A Ranking Algorithm for Online Social Network Search by Divya Sharma Doctoral student, IIM Calcutta D. H. Road, Joka...

INDIAN INSTITUTE OF MANAGEMENT CALCUTTA WORKING PAPER SERIES WPS No.724 / February 2013 A Ranking Algorithm for Online Social Network Search by Divya Sharma Doctoral student, IIM Calcutta D. H. Road, Joka, Kolkata 700104, India Parthasarathi Dasgupta Professor, IIM Calcutta, Diamond Harbour Road, Joka, Kolkata 700104, India & Debashis Saha Professor, IIM Calcutta, Diamond Harbour Road, Joka P.O., Kolkata 700 104 India A Ranking Algorithm for Online Social Network Search Divya Sharma*, Parthasarathi Dasgupta and Debashis Saha *Fellow Programme student, MIS Group, IIM Calcutta Abstract Online Social Networks have become the new arena for people to stay in touch, pursue their interests and collaborate. In October 2012, Facebook reported a whopping 1 billion users which is testimony of fact how online social networks have proliferated and made inroads into the real world. Some of the obvious advantages of social networks are (1) 24X7 availability allowing users to stay in virtual touch at any time of the day, (2) get in touch with people who have similar interests and collaborate with them and (3) the ability to be able to search for users to add to your friend circle. The third advantage is the focus of this paper. With the increasing number of users on online social networks, it is important that when a user searches for another user, appropriate results are returned. This paper identifies three criteria – proximity, similarity and interaction ‐ which can be used to rank search results so that more appropriate results are presented to the searching user. Also, this algorithm allows the search ranking to be customized according to the nature of the online social network in question. Keywords Algorithm, Online Social Network, Search Ranking Introduction Social networks consist of sites that allow people to interact and share social experiences by exchange of multimedia objects (text, audio, and video) associated with the people themselves and their actions [1]. Every user on a social network possesses a user profile that contains all the information about the user ranging from basic information like name, date of birth, gender, location to more detailed information capturing educational and professional information and areas of interest. Online social networks have become abstractions of the real world where users interact, exchange and keep in touch. A major challenge, therefore, is the ability of a social network site to allow users to search for potential friends and in doing this, return relevant results. The need for such a search technique arises due to the inherent structure of social networks and the behaviour of users on the network. Most searches on a social network are queries containing names of users and a multitude of users may share the same name, which makes the trivial task of searching for online friends very cumbersome. Here we discuss social network search ranking by means of an algorithm which takes into account three important factors that make search results relevant to a user – proximity, similarity and interaction. Based on these three factors, search ranks are allotted to search results when one user searches for another user by name. Also this algorithm takes into account the fact that all social networks are not of the same type. When a user searches for friends on a network of classmates, it is more relevant to rank results which are closer to the searching user in the virtual space higher in the results list. On the other hand, on a social network for football fans, ranking must be done on the basis of sharing common interests like being fans of the same football club or the same football players. Still another example is a social network of acquaintances, where ranking on the basis of the level of interaction among users might be a more useful criteria. Related Work Recently, there has been lots of interest in the field of online social network search and ranking. The work in [1] focuses on the problem of how to improve the search experience of the users. It suggests seed based ranking instead of text‐based ranking by measuring shortest distances between the nodes in a friendship graph. This is the first work that makes use of the friendship graph in a big social network to improve the search experience. The paper in [2] presents a novel social search model for finding a friend with common interests in OSN (Online Social Network) with the introduction of trust value and popularity value. The trust value is calculated by the improved shortest path algorithm with a trust threshold, and the popularity value is obtained by the page rank algorithm iteratively. In order to ensure more accurate search results, [3] demonstrates an algorithm called E.L.I.T.E. which has five essential components ‐ Engagement‐U, Lifetime, Impression, Timeframe and Engagement‐O. Engagement‐U is the affinity between users which is measured by their relationships and other related interests between them, Lifetime is a trace of users’ past based on their positive, neutral and even negative interactions and actions with other users, Impression is the weight of each object determined by the number of positive responses from users, Timeframe is the timeline scoring technique in which an object naturally loses its value as time passes and Engagement‐O is the attraction of users to objects which is measured between objects and associated interests of users. Emphasizing on tree based search techniques, [4]compares the efficiency of reliable searching between Maximum Reliable Tree (MRT) algorithm and Optimum Branching Tree (OBT) algorithm and proposes the use of the MRT algorithm that is newly developed based on a graph‐based method, as a generic technique which facilitates effective social network search and that can be the most reliable social network search method for the promptly appearing smart phone technologies.[5]explores the correlation between preferences of web search results and similarities among users by presenting an efficient search system called SMART finder. The work provides more information about SMART finder and publishes a quantitative evaluation of how SMART finder improves web searching compared to a baseline ranking algorithm. The concept of user tag feedback scores is employed in [6]. Based on this concept, a tag‐based feedback web ranking algorithm is designed. The algorithm can efficiently use the user’s feedback. Motivation Searching for another person is one of the most fundamental uses to which social networks are put. However, most of the research work until now does not specifically look into this area. A small quantum of research work which does delve into this topic either provides just a conceptual framework or does not discuss the implementation of the framework in detail. Considering that online social networks are an abstraction of real‐world social networks, it is important to understand the factors that influence the associations between the users. From experience it can be said that one person’s association with another person is a function of how closely linked the two users are, how many interests they share and how often they interact. We call these three factors proximity, similarity and interaction respectively. Though this is not an exhaustive list of factors that might influence association between two persons, it does give a fairly good idea about the same. This work tries to use this concept of association between users to rank search results when a particular user searches for another user on an online social network. When any framework is to be implemented, it is required that the various factors which are being considered are quantified. The proposed algorithm in this work provides such a means to quantify the factors like proximity, similarity and interaction and then combine them to give a value for association between two persons. Along with providing a conceptual framework, this work also details out the implementation of the framework in practice. Ranking Metrics This paper defines three metrics for the purpose of ranking search results: 1. Proximity: On a social network, a user may be linked directly and indirectly to millions of users. A social network is a myriad web of interconnections and a node which is closer to the searching node in terms of number of hops is likely to be a better search result in comparison to a node which is several hops further away. Proximity measures the closeness of a node from the searching node. It is calculated by running a shortest distance algorithm and finding the minimum distance of the various nodes in the search result from the searching node. Let dab be the shortest distances between nodes a and b, then proximity pab is calculated as ݌௔௕ ൌ 1݀௔௕ 2. Similarity: Social networks provide users a platform for interacting with other users who share similar interests, listen to the same kind of music, read books from the same author, follow the same sport, share the same hobbies, etc. On a social network all these details are captured in the user profiles. When a user issues a search for another user, a user who is more similar to the searching user is likely to be a more relevant result in comparison to a user who is less similar. We define a user profile of user a as ௔ܲ ൌ ሼ݇, ݈, ݉, ݊ … ሽ where k, l, m, n … are the interests. Similarity S, between two nodes a and b in a list of search results with n nodes is defined as ܵ௔௕ ൌ ܥܽݎ݈݀݅݊ܽ݅ݐݕ ሺ ௔ܲ ת ௕ܲሻ ܥܽݎ݈݀݅݊ܽ݅ݐݕሺ ௔ܲ ׫ ௕ܲ ׫ ௖ܲ ׫ … ׫ ௡ܲሻ 3. Interaction: Social networks provide users different means for interacting with one another. Most online social networks allow users to interact via textual comments, exchange links through shares and like the posts of other users. When two users interact on a social network, there are two factors that can be used to gauge the closeness of these two users – frequency of interaction and recency of interaction. Frequency captures volume of interaction (for each of comment, share, like) between two users within a given span of time. This span of time is defined by window size w, defined further in the discussion. Recency captures the time gap between the time of issuance of the search query and the most recent interaction (for each of comment, share, like) between the searching user and the user being searched. Frequency of an interaction of type T between user a and b is defined as ௔݂௕ ்೔ ൌ 1 െ 1 ௔ܸ௕ ்೔ where Ti is the type of interaction (i = 1 for comment, i = 2 for share and i = 3 for like) and Vab Ti is the volume of interaction between users a and b of type Ti. Recency of interaction between users a and b is defined as ݎ௔௕்೔ ൌ 1 െ ݓ்೔ ݐ௢ െ ݐ௔௕்೔ where to is the time instance at which the search query was issued, tabTi is the time instance of the most recent interaction between user a and b of type Ti and window size wTi is defined as ݓ்೔ ൌ max൫ݐ௢ െ ݐ௔௕்೔ , ݐ௢ െ ݐ௔௖்೔ , ݐ௢ െ ݐ௔ௗ்೔ , … . , ݐ௢ െ ݐ௔௡்೔ ൯ where users b, c, d, … ,n are the search results of the query. The frequency and recency metrics are then used to define interaction i of type Ti between two users a and b as ݅௔௕்೔ ൌ ߙݎ௔௕்೔ ൅ ሺ1 െ ߙሻ ௔݂௕்೔ where 0 ≤ α ≤ 1. α is defined as the relative importance of recency in comparison to frequency when quantifying a particular interaction. The weighted interaction metric I between two users a and b is defined as the weighted sum of the three types of interactions (comment, share, like) as ܫ௔௕ ൌ ߚ݅௔௕భ் ൅ ߛ݅௔௕మ் ൅ ߜ݅௔௕య் where β + γ + δ = 1. Here β, γ and δ define the percentage importance of the three types of interaction i.e. comment, share and like in the overall interaction metric I. The Association Function Once the three metrics proximity, similarity and interaction have been defined, the next step is to define the association between the user and the search results based on these three parameters. Association captures the effect of the three metrics in question and returns a composite value which describes how closely the user is associated with each of the search results. As discussed earlier, the importance of each of these metrics may vary according to the nature of the online social network. Therefore, weights are defined to give different weights to each of these factors while calculating the rank of the search results. The weighted association of a search result, b, being searched by user a is defines as ܣ௔௕ ൌ ߤଵ݌௔௕ ൅ ߤଶܵ௔௕ ൅ ߤଷܫ௔௕ where µ1, µ2 and µ3 are the weights assigned to each of the metrics pab, Sab and Iab. It is important to note that μଵ ൅ μଶ ൅ μଷ ൌ 1 Therefore, µ1, µ2 and µ3 can be defined as the percentage importance of proximity, similarity and interaction in calculating the association according to the nature of the social network. For example, in social network catering to football fans, where similarity is more important than proximity and interaction, µ2 may have a higher value, while on a social network for classmates, proximity is more important, so, µ1may have a higher value. The Rank Function The ranks of the search results are subsequently obtained by sorting the results by the weighted association values. The search with the highest weighted association value is given rank 1, the search result with the second highest weighted association value is give rank 2, and so on, until the search result with the lowest weighted association value is given rank n. Algorithm for Computing Search Ranks The proposed algorithm is a ranking algorithm and, therefore, allows different search algorithms to be used to identify the unranked search results. Once the unranked search results are obtained, they are then ranked using the proposed algorithm. The association function is central to the calculation of the ranks of the search results. We consider that n potential search results are returned by the searching algorithm for a given query. Once this list is obtained, the next steps are to calculate the association and then rank the list on the basis of the association values. Pseudo Code: 1. Initialize the network N, searching user s, search query q 2. ranked_search_results(N, s, q) 3. begin 4. search_results[1 … n] ←search(q, N) 5. ranked_list[1 … n] ←compute_ranks(s, search_results[1 … n]) 6. end 7. compute_ranks(s, search_results[1 … n]) 8. begin 9. window_sizes←get_window_sizes(s, search_results[1 … n]) 10. common_interests_cardinality←get_common_interests_cardinality(s,search_results[1 … n]) 11. association[1 … n] ←compute_association(s, search_results[1 … n], window_sizes, common_interests_cardinality) 12. ranked_results[1 … n] ← sort(association[1 … n]) 13. end 14. get_window_sizes(s, search_results[1 … n]) 15. begin 16. comment_recency_window←0 17. share_recency_window←0 18. like_recency_window←0 19. for i = 1 to n comment_recency_window←max (comment_recency_window, comment_recencysi) 20. share_recency_window←max(share_recency_window, share_recencysi) 21. like_recency_window←max(like_recency_window, like_recencysi) 22. end 23. End 24. get_common_interests_cardinality(s, search_results[1 … n]) 25. begin 26. common_interests←interestss 27. for i = 1 to n 28. common_interests←common_interests׫ interestsi 29. end 30. common_interests_cardinality←cardinality(common_interests) 31. end 32. compute_association(s, search_results[1 … n], window_sizes, common_interests_cardinality) 33. begin 34. for i = 1 to n 35. proximitysi←get_proximity(s, search_resultsi) 36. similaritysi←get_similarity(s, search_resultsi)/common_interests_cardinality 37. interactionsi← get_interaction(s, search_resultsi, window_sizes) 38. associationsi← µ1proximitysi+ µ2 similaritysi+ µ3interactionsi 39. end 40. end 41. get_proximity(s,i) 42. begin 43. proximity ←1/disctance(s,i) 44. end 45. get_similarity(s, i) 46. begin 47. similarity ←interestss׫ interestsi 48. end 49. get_interaction(s, i, window_sizes) 50. begin 51. frequency ←1 – 1/interaction_volume(s,i) 52. recencycomment← 1 – window_sizecomment/recency_of_comment(s,i) 53. recencyshare← 1 – window_sizeshare/recency_of_share(s,i) 54. recencylike← 1 – window_sizelike/recency_of_like(s,i) 55. recency←β recencycomment + γrecencyshare+ δ recencylike 56. interaction ← α recency + (1 – α) frequency 57. end Lemma 1: ܶ݅݉݁ ܥ݋݉݌݈݁ݔ݅ݐݕ ൌ ܱሺ݊ሻ Proof: Time complexity of interest is that of the function compute_ranks. We assume that the sorting algorithm used has a worst case complexity of n log n. The Framework We propose to model a social network in the form of a graph, where the nodes correspond to the users and the edges correspond to the friendship between them. We have considered a simple network shown in Fig.1 where a user John issues a search for another user Maria. The search query will return Maria a, Maria b and Maria c. The task now is to rank these results according to relevance to user John. Proof: begin get_window_sizes n steps get_common_interests_cardinality n steps compute_association n steps sort n log n steps end compute_ranks (3n + n log n) steps Time Complexity ൎ Number of Steps = 3n + n log n = O(n log n) Simulation and Results We now return to the framework described in Fig. 1. The profiles of users we are interested in i.e. John, Mariaa, Mariab and Mariac are defined as follows: ௝ܲ௢௛௡ ൌ ሼ݇, ݉, ݊ሽ ெܲ௔௥௜௔ೌ ൌ ሼ݇, ݉ሽ ெܲ௔௥௜௔್ ൌ ሼ݉, ݊ሽ ெܲ௔௥௜௔೎ ൌ ሼ݇, ݈ሽ The interactions between John and Mariaa, Mariab and Mariac are defined in Table 1. User Comment Share Like Volume Most Recent Volume Most Recent Volume Most Recent Mariaa 3 05/08/2012 1 05/08/2012 12 18/09/2012 Mariab ‐ ‐ ‐ ‐ ‐ ‐ Mariac 9 24/09/2012 10 18/07/2012 11 02/10/2012 Table 1 The ranking algorithm was then simulated by varying the weights of the different parameters α, β, γ, δ, µ1, µ2 and µ3. Results and Comparisons John Mariab Mariac Mariaa Figure 1 Some illustrative results of the proposed ranking algorithm for different weights are mentioned along with a comparison of search results from other popular social network sites and recent research work in this area in Table 2. S.No Proposed Algorithm Rank results α Β γ Δ µ1 µ2 µ3 Rank 1 Rank 2 Rank 3 1. 0.5 0.5 0.3 0.2 0.34 0.33 0.33 Mariac Mariaa Mariab 2. 0.5 0.5 0.3 0.2 0.5 0 0.5 Mariac Mariaa Mariab 3. 0 0 0 0 0.5 0.5 0 Mariaa Mariac Mariab 4. 0 0 0 0 0 1 0 Mariaa Mariab Mariac 5. 0.5 1 0 0 0.5 0 0.5 Mariac Mariaa Mariab 6. 0 1 0 0 0.5 0 0.5 Mariac Mariaa Mariab 7. Facebook Mariac Mariaa Mariab 8. Google + Mariaa Mariac Mariab 9. Rank Algorithm Based on Trust and Popularity [2] Mariaa Mariac Mariab Table 2 The small simulation network which was used for computing the results of the proposed algorithm was also reconstructe

                    本文档为【A Ranking Algorithm for Online Social Network Search】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，
                    图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。 
 该文档来自用户分享，如有侵权行为请发邮件ishare@vip.sina.com联系网站客服，我们会及时删除。

                    [版权声明] 本站所有资料为用户分享产生，若发现您的权利被侵害，请联系客服邮件isharekefu@iask.cn，我们尽快处理。

                    本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权，请谨慎使用。

                    网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传，仅限个人学习分享使用，禁止用于任何广告和商用目的。
                

下载需要：免费已有0 人下载

立即下载

A Ranking Algorithm for Online Social Network Search

你可能还喜欢