Gmail's implicit social graph

Table des matières

Introduction
The algorithm
Suggesting friends
What about épinards & caramel ?

Introduction

When you write an email to some friends in Gmail, more contacts are suggested. I've been wondering for some time how that worked, because I want something similar on epinardscaramel, for the tags to be attached to posts.

Screenshot of the feature

I was thinking of doing something along the lines of :

//create groups of tags. Give them each a score. //increase their score each time the group is used. //when a user picks tags A and B, find the groups which contains A and B //suggest tags, different from A and B, that belong to the group with the best score.

But by looking around, someone pointed me in the direction of research.google.com, where one can find a paper detailing how the gmail team has implemented this algorithm : Suggesting Friends Using the Implicit Social Graph.

The algorithm

The paper start by suggesting the use of an hypergraph, where each node is an email address. Each time a mail, sent or received, is considered, google creates an "hyperedge" of the graph, which is simply put a set of email addresses. So far so good.

Things get complicated when we get to the formula for the score, IR, of each group :

Gmail's "friend suggest" formula

Isn't math grand ? Don't leave yet, I will explain, this formula is rather smart. The first thing we notice is that we have to do the sum of two identical parts, one concerning the incoming emails I_in, and the other the outgoing emails I_out. That part of the formula is :

détail de la formule

This part of the computation is concerned with the messages' dates, where t_now is today's date and t(i) the email's date. The difference is divided by λ, a constant that denotes the "half-life" of the score. Indeed, if the mail has just been created, "t_now - t(i)" will be zero, same as zero divided by λ, and the message's score will be (½) to the power of 0, which is 1.

Once λ seconds have passed between the message creation and now, "t_now - t(i)" will be equal to λ, λ divided by itself is 1, and ½ to the power of one is ½. When 2λ seconds have passed, the score will then be (½)², or ¼. Therefore, we see that the score is divided by two each time that the duration λ passes.

The formula computes the score of each mail, then sums them (symbol Σ). Finally, to show that outgoing mails are more important than incoming mails, we multiply the sum of the outgoing mails' scores by ω_out, another constant.

Taking into account the time that has passed since the message's creation is really smart; Gmail rightly considers that a recent group of friends is more relevant that an older one.

Suggesting friends

Now that we have a score for each group of emails, we are capable of suggesting addresses that would go nicely with the ones already inputed.

In order to do so, it's a little more complex than finding the group with the best score and suggesting all the members of this group : We need to compute a score for each email, considering all the groups and their scores.

The paper suggests 5 different ways of calculating each address' score, I'll let you read it if you want to go further.

What about épinards & caramel ?

For this site, I want to implement the algorithm for groups of tags : the formula for IR will be simpler in my case, since I don't have the difference between outgoing and incoming. My hypergraph will have each tag as a node, and the groups' scores will take into account the publication date of each blog post.

Gmail's implicit social graph

Introduction

The algorithm

Suggesting friends

What about épinards & caramel ?

Manu·e

Articles Récents

Les prochains emoji (en beta)

Qu'est-ce que la politique a à voir avec moi ?

Cyrano de Bergerac, le jeu vidéo

FacebookLeak

Télétravail : recommandations COVID-19

Étiquettes

Gmail's implicit social graph

Introduction

The algorithm

Suggesting friends

What about épinards & caramel ?

Manu·e

Articles Récents

Les prochains emoji (en beta)

Qu'est-ce que la politique a à voir avec moi ?

Cyrano de Bergerac, le jeu vidéo

FacebookLeak

Télétravail : recommandations COVID-19

Étiquettes

Vous pourriez être intéressé par

“I Shared My Phone Number. I Learned I Shouldn’t Have.”

Google images comme outil de traduction

Game idea: “Wad Lord”💸👑

Qu'est-ce que la politique a à voir avec moi ?