Home |  MySQL Buzz |  FAQ |  Feeds |  Submit your blog feed |  Feedback |  Archive |  Aggregate feed RSS 2.0 English Deutsch Español Français Italiano 日本語 Русский
Discovr: a flickr experiment gone wrong
+0 Vote Up -0 Vote Down

I need help with this. I had a dream… Well, not so much as a dream, maybe a “It’d be cool to…”

I thought it’d be nice to discover new photos on flickr using your favorite photos and the people who also favorited those photos, and the favorite photos of those who also favorited my pictures. Still with me?

It’s actually a quite simple code (about 500 lines, check it on github: discovr), but it’s terribly slow. Some possible reasons:

  • Way too much data. I’ve found people with around more than 18000 favorites, and there are photos with more than 2k fans. After limiting to 50 last favorites, the numbers are still creepy. Following from my personal favorites (366), I discovered 1268 users and 52632 photos
  • Too complicated for an API. This is the kind of feature that wouldn’t be so hard to implement if you have access to the flickr database directly, but having to do so many requests adds a lot of time to the process.
  • Inefficient library. I had to do some modifications to the flickr ruby library just to make it work, but it’s still quite inefficient in some cases. Want to know the url of a picture (knowing the picture id)? 4 (completely unnecessary) API calls
  • My code is bad. OK, I know it’s ugly to start blaming everyone else. I know my code is not very good, as it’s a quick prototype. Still, I’m not sure if making my code/libraries better would be enough improvement given the network/api bottleneck

The simplified algorithm goes like this.

?View Code RUBY
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
  # method from class User
  def similar_pictures
    similar = {}
 
    favorites.each do |favorite|
      favorite.favorited_by.each do |user|
        user.favorites.each do |v|
          similar[k] ||= {:weight => 0, :picture => v[:picture]}
          similar[k][:weight] += 1
        end
      end
    end
 
    similar.values.sort {|a,b| b[:weight] <=> a[:weight]}.select {|v| v[:weight] > 1}
  end

So I’ve created a github repository and uploaded the code: discovr at github. Feel free to clone, test and improve

Related posts

Votes:

You must be logged in with a MySQL.com account to vote on Planet MySQL entries. More information on PlanetMySQL voting.

Planet MySQL © 1995-2008 MySQL AB, 2008-2009 Sun Microsystems, Inc.
Content reproduced on this site is the property of the respective copyright holders.
It is not reviewed in advance by Sun Microsystems, Inc. and does not
necessarily represent the opinion of Sun Microsystem, Inc. or any other party.