How to implement the activity stream in a social network

I'm developing my own social network, and I haven't found any examples on the web of how to implement a stream of users' actions. For example, how do I filter actions for each user? How should I store the action events? Which data model and object model can I use for the action stream and for the actions themselves?



Solution 1:[1]

This is my implementation of an activity stream, using MySQL. There are three classes: Activity, ActivityFeed, and Subscriber.

Activity represents an activity entry, and its table looks like this:

id
subject_id
object_id
type
verb
data
time

subject_id is the id of the object performing the action, and object_id is the id of the object that receives the action. type and verb describe the action itself (for example, if a user adds a comment to an article, they would be "comment" and "created" respectively). data contains additional data in order to avoid joins (for example, it can contain the subject's first and last name, the article title and URL, the comment body, etc.).

Each Activity belongs to one or more ActivityFeeds, and they are related by a table that looks like this:

feed_name
activity_id

In my application I have one feed for each User and one feed for each Item (usually blog articles), but they can be whatever you want.

A Subscriber is usually a user of your site, but it can also be any object in your object model (for example, an article could be subscribed to the activity feed of its creator).

Every Subscriber belongs to one or more ActivityFeeds, and, like above, they are related by a link table of this kind:

feed_name
subscriber_id
reason

The reason field explains why the subscriber has subscribed to the feed. For example, if a user bookmarks a blog post, the reason is 'bookmark'. This helps me later when filtering actions for notifications to users.

To retrieve the activity for a subscriber, I do a simple join of the three tables. The join is fast because a WHERE condition keeps the number of selected activities small: it looks like now - time < some hours, so only recent activities match. The data field in the Activity table lets me avoid other joins.
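The three tables and the join can be sketched roughly as follows. This is only an illustration, not the answer's actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the link-table names (feed_activity, feed_subscriber) and the helper functions are assumptions made up for the example.

```python
import json
import sqlite3
import time

# Column names follow the answer; table names for the two link tables are assumed.
SCHEMA = """
CREATE TABLE activity (
    id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL,
    object_id INTEGER NOT NULL,
    type TEXT NOT NULL,
    verb TEXT NOT NULL,
    data TEXT,               -- denormalized JSON, so reads need no extra joins
    time INTEGER NOT NULL    -- Unix timestamp in seconds
);
CREATE TABLE feed_activity (feed_name TEXT NOT NULL, activity_id INTEGER NOT NULL);
CREATE TABLE feed_subscriber (feed_name TEXT NOT NULL, subscriber_id INTEGER NOT NULL, reason TEXT NOT NULL);
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def publish(conn, subject_id, object_id, type_, verb, data, feeds, when=None):
    """Store one activity and attach it to every feed it belongs to."""
    when = int(time.time()) if when is None else when
    cur = conn.execute(
        "INSERT INTO activity (subject_id, object_id, type, verb, data, time) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (subject_id, object_id, type_, verb, json.dumps(data), when),
    )
    conn.executemany(
        "INSERT INTO feed_activity (feed_name, activity_id) VALUES (?, ?)",
        [(feed, cur.lastrowid) for feed in feeds],
    )
    return cur.lastrowid

def subscribe(conn, feed_name, subscriber_id, reason):
    conn.execute(
        "INSERT INTO feed_subscriber (feed_name, subscriber_id, reason) VALUES (?, ?, ?)",
        (feed_name, subscriber_id, reason),
    )

def stream_for(conn, subscriber_id, max_age_seconds, now=None):
    """Join the three tables, keeping only activities newer than the cutoff."""
    now = int(time.time()) if now is None else now
    rows = conn.execute(
        """
        SELECT DISTINCT a.id, a.type, a.verb, a.data, a.time
        FROM feed_subscriber s
        JOIN feed_activity fa ON fa.feed_name = s.feed_name
        JOIN activity a ON a.id = fa.activity_id
        WHERE s.subscriber_id = ? AND a.time >= ?
        ORDER BY a.time DESC, a.id DESC
        """,
        (subscriber_id, now - max_age_seconds),
    ).fetchall()
    return [
        {"id": r[0], "type": r[1], "verb": r[2], "data": json.loads(r[3]), "time": r[4]}
        for r in rows
    ]
```

DISTINCT is there because one activity can sit in several feeds that the same subscriber follows (for example, both the author's feed and the article's feed), and it should appear only once in the stream.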

A further explanation of the reason field: suppose I want to filter actions for email notifications to the user. If the user bookmarked a blog post (and so subscribed to the post's feed with the reason 'bookmark'), I don't want them to receive email notifications about actions on that item. If they commented on the post (and so subscribed to the post's feed with the reason 'comment'), I do want them to be notified when other users add comments to the same post. The reason field helps me make this distinction (I implemented it through an ActivityFilter class), together with the user's notification preferences.
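The answer does not show its ActivityFilter class, so the following is only a guess at what a reason-based filter might look like. The rule table, the FeedActivity type, and the muted_types parameter (standing in for the user's notification preferences) are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FeedActivity:
    """Minimal view of an activity as seen through one feed (hypothetical type)."""
    feed_name: str
    type: str
    verb: str

def _default_rules():
    # Assumed rule table: subscription reason -> (type, verb) pairs worth an email.
    return {
        "comment": {("comment", "created")},  # commenters hear about new comments
        "bookmark": set(),                    # bookmarking alone never triggers email
    }

@dataclass
class ActivityFilter:
    notify_rules: dict = field(default_factory=_default_rules)

    def should_notify(self, reason, activity, muted_types=frozenset()):
        """Decide whether a subscriber with this reason gets an email."""
        if activity.type in muted_types:      # user preference wins over the rules
            return False
        allowed = self.notify_rules.get(reason, set())
        return (activity.type, activity.verb) in allowed
```

With these rules, a new comment on a post notifies a subscriber whose reason is 'comment' but not one whose reason is 'bookmark', which matches the behaviour described above.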

Solution 2:[2]

There is a format for activity streams that is being developed by a group of well-known people.

http://activitystrea.ms/.

Basically, every activity has an actor (who performs the activity), a verb (the action of the activity), an object (which the actor acts on), and a target.

For example: Max has posted a link to Adam's wall.

Their JSON spec had reached version 1.0 at the time of writing, and it shows a pattern for activities that you can apply.
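As a rough illustration, the "Max has posted a link to Adam's wall" example could be expressed along these lines. The top-level property names (actor, verb, object, target, published) come from the JSON Activity Streams 1.0 spec; the ids, URLs, timestamp, and the particular objectType values are invented for the example, and make_activity is a hypothetical helper.

```python
import json

def make_activity(actor, verb, obj, target=None, published=None):
    """Build an activity dict with the core Activity Streams 1.0 properties."""
    activity = {"actor": actor, "verb": verb, "object": obj}
    if target is not None:
        activity["target"] = target
    if published is not None:
        activity["published"] = published
    return activity

# All ids, URLs, and the timestamp below are made up for illustration.
activity = make_activity(
    actor={"objectType": "person", "id": "urn:example:user:max", "displayName": "Max"},
    verb="post",
    obj={"objectType": "bookmark", "id": "urn:example:link:1", "url": "http://example.org/article"},
    target={"objectType": "collection", "id": "urn:example:wall:adam", "displayName": "Adam's wall"},
    published="2011-02-10T15:04:55Z",
)

print(json.dumps(activity, indent=2))
```

A dict like this can be stored as-is in a denormalized data column or sent to clients without further joins.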

Their format has already been adopted by BBC, Gnip, Google Buzz, Gowalla, IBM, MySpace, Opera, Socialcast, Superfeedr, TypePad, Windows Live, YIID, and many others.

Solution 3:[3]

I think an explanation of how notification systems work on large websites can be found in the Stack Overflow question "how does social networking websites compute friends updates?", in Jeremy Wall's answer. He suggests using a message queue and points to two open source implementations:

  1. RabbitMQ
  2. Apache Qpid
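To make the message-queue idea concrete, here is a small in-process sketch using Python's standard queue and threading modules. It is only a stand-in for a real broker such as RabbitMQ or Apache Qpid and does not use either product's API. The FeedFanOut class and the follower map are hypothetical.

```python
import queue
import threading
from collections import defaultdict

_STOP = object()  # sentinel telling the worker to shut down

class FeedFanOut:
    """Publishers enqueue events; a background worker copies them into follower feeds."""

    def __init__(self, followers):
        self.followers = followers        # actor -> list of follower names (assumed shape)
        self.feeds = defaultdict(list)    # follower -> events delivered so far
        self._events = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def publish(self, event):
        # Returns immediately; the expensive fan-out happens off the request path.
        self._events.put(event)

    def _run(self):
        while True:
            event = self._events.get()
            try:
                if event is _STOP:
                    return
                for follower in self.followers.get(event["actor"], []):
                    self.feeds[follower].append(event)
            finally:
                self._events.task_done()

    def close(self):
        """Drain the queue and stop the worker."""
        self._events.put(_STOP)
        self._events.join()
        self._worker.join()

fan = FeedFanOut({"max": ["adam", "eve"]})
fan.publish({"actor": "max", "verb": "post"})
fan.close()
```

The point of the queue is that the user who performs an action does not wait while the event is written to every follower's feed. With a real broker, the worker would run in separate processes or machines.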

See also the question What’s the best manner of implementing a social activity stream?

Solution 4:[4]

You absolutely need a performant, distributed message queue. But it does not end there: you'll have to decide what to store as persistent data and what to treat as transient, and so on.

Anyway, it is a really difficult task, my friend, if you are after a high-performance, scalable system. Fortunately, some generous engineers have shared their experience. LinkedIn recently open-sourced its message queue system, Kafka. Before that, Facebook had already released Scribe to the open source community. Kafka is written in Scala, and it takes some time to get running at first, but I tested it with a couple of virtual servers and it is really fast.

http://blog.linkedin.com/2011/01/11/open-source-linkedin-kafka/

http://incubator.apache.org/kafka/index.html

Solution 5:[5]

Instead of rolling your own, you could look to a third-party service used via an API. I started one called Collabinate (http://www.collabinate.com) that has a graph database backend and some fairly sophisticated algorithms for handling large amounts of data in a highly concurrent, high-performance manner. While it does not have the breadth of functionality that, say, Facebook or Twitter do, it more than suffices for most use cases where you need to build activity streams, social feeds, or microblogging functionality into an application.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Nicolò Martini
Solution 2 Sơn Trần-Nguyễn
Solution 3 Community
Solution 4 Cagatay Kalan
Solution 5 Mafuba