My Last Three Projects

wins and failures

Created by Anton Katunin / @antulik

About me

  • Bachelor of IT at UQ
  • MS Access and VBA
  • C# and .Net
  • Ruby on Rails
  • JavaScript
  • Visualisations

Agenda

  • TwinMaze
  • TopicRay.com
  • StoryLine.im

TwinMaze

Movie recommendation website

Feb 2011

Frustration

  • Hard to find movies you would enjoy
  • General ratings don't work (IMDB)
    e.g. movies you enjoy don't necessarily have a high rating

The Idea

What if ...

  • you had a clone of yourself
  • it could watch movies
  • and you could do other things
  • then it could tell you what is good and what is not
  • just like twins

Twins

What if ...

  • You had not 1 twin
  • but 2 twins
  • or 10 twins
  • or 100 twins
  • or 1000 twins

Plan

  1. Select movies you like and movies you don't
  2. Run magic algorithm
  3. Get personalised movie recommendations

Algorithm

  • Machine learning
  • Neural networks
  • Data mining


just keep it simple

Algorithm

Step 1. Find your top Twins


  • Match each user with every other user
  • Calculate compatibility rating between each pair of users
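                -- One row per (user, twin) pair. The two %d placeholders are filled in
                -- by the caller: the user id and the minimum number of shared movies.
                REPLACE INTO `user_twins`
                  (`user_id`, `twin_id`, `avg_difference`,
                   `percent`, `movies_matched`, `level`, `updated_at`)
                (SELECT user_id, twin_id, avg_difference, percent, movies_matched,
                        -- level 1 = closest twins (> 92.5%), level 9 = 75% or below
                        IF(percent > 92.5, 1,
                        IF(percent > 90,   2,
                        IF(percent > 87.5, 3,
                        IF(percent > 85,   4,
                        IF(percent > 82.5, 5,
                        IF(percent > 80,   6,
                        IF(percent > 77.5, 7,
                        IF(percent > 75,   8, 9)))))))),
                        CURRENT_TIMESTAMP
                 FROM
                   (SELECT sr.user_id AS user_id, sr.twin_id AS twin_id,
                           -- despite the column name, this is the summed difference
                           SUM(single_points) AS avg_difference,
                           -- mean difference 0 gives 100%, mean difference 10 gives 0%
                           (10 - (SUM(single_points) / COUNT(*))) * 10 AS percent,
                           COUNT(*) AS movies_matched
                    FROM
                      -- pair every vote of this user with other users' votes on the same movie
                      (SELECT mr1.user_id AS user_id,
                              mr2.user_id AS twin_id,
                              ABS(mr1.rating - mr2.rating) AS single_points
                       FROM user_votes AS mr1
                       JOIN user_votes AS mr2
                         ON mr1.movie_id = mr2.movie_id AND mr1.user_id <> mr2.user_id
                       WHERE mr1.user_id = %d) AS sr
                    GROUP BY sr.user_id, sr.twin_id
                    -- ignore pairs with too few movies in common
                    HAVING COUNT(*) >= %d) AS t2
                );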
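
For readers who prefer to follow Step 1 outside the database, here is a minimal Python sketch of the same calculation. The formula and the level thresholds are taken from the query; the sample data, the function names (`level_for`, `find_twins`) and the `min_matched` default are assumptions made for illustration, and the sketch assumes one vote per user per movie.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, movie_id, rating).
VOTES = [
    (1, "alien", 9), (1, "heat", 8), (1, "up", 4),
    (2, "alien", 8), (2, "heat", 8), (2, "up", 5),
    (3, "alien", 2), (3, "heat", 5),
]

def level_for(percent):
    """Map a compatibility percentage to a level: 1 is the closest, 9 the weakest."""
    thresholds = [92.5, 90, 87.5, 85, 82.5, 80, 77.5, 75]
    for level, threshold in enumerate(thresholds, start=1):
        if percent > threshold:
            return level
    return 9

def find_twins(votes, user_id, min_matched=2):
    """Compare one user with every other user on the movies both have rated."""
    mine = {movie: rating for uid, movie, rating in votes if uid == user_id}
    diffs = defaultdict(list)
    for uid, movie, rating in votes:
        if uid != user_id and movie in mine:
            diffs[uid].append(abs(mine[movie] - rating))
    twins = {}
    for twin_id, points in diffs.items():
        if len(points) < min_matched:
            continue  # too few movies in common to trust the match
        percent = (10 - sum(points) / len(points)) * 10
        twins[twin_id] = {
            "percent": percent,
            "movies_matched": len(points),
            "level": level_for(percent),
        }
    return twins

twins = find_twins(VOTES, user_id=1)
```

With this sample data, user 2 agrees closely with user 1 and lands in level 1, while user 3 disagrees strongly and lands in level 9.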
            

Algorithm

Step 2. Calculate personalized movie ratings


  • Pick 1000 best twins
  • Combine their ratings
  • Predict rating for each movie
  • Pick the best movies the user hasn't seen

            select ...
            Power(10-level, ln(count(*))/ln(2.5)) * count(*) * avg(rating) as rating_points_sum,
            Power(10-level, ln(count(*))/ln(2.5)) * count(*) as vote_points_sum,
            ln(count(*))/ln(2.5) as power,
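
The weighting in that fragment can be read as follows: each group of votes gets a weight of (10 - level) raised to the power log base 2.5 of the vote count, multiplied by the vote count. The Python sketch below reproduces that weight. Two things are assumptions, because the rest of the query is elided: that votes are grouped per movie and twin level, and that the predicted rating is the ratio of the two sums. The names `group_weight` and `predict_ratings` are hypothetical.

```python
import math
from collections import defaultdict

def group_weight(level, votes_count):
    """Weight of one group of votes: (10 - level) ** log_2.5(n) * n."""
    power = math.log(votes_count) / math.log(2.5)
    return (10 - level) ** power * votes_count

def predict_ratings(twin_votes):
    """twin_votes: iterable of (movie_id, twin_level, rating) tuples."""
    groups = defaultdict(list)
    for movie, level, rating in twin_votes:
        groups[(movie, level)].append(rating)

    rating_points = defaultdict(float)  # plays the role of rating_points_sum
    vote_points = defaultdict(float)    # plays the role of vote_points_sum
    for (movie, level), ratings in groups.items():
        weight = group_weight(level, len(ratings))
        rating_points[movie] += weight * (sum(ratings) / len(ratings))
        vote_points[movie] += weight

    # Assumption: the prediction is the weighted average of the group averages.
    return {movie: rating_points[movie] / vote_points[movie] for movie in rating_points}

# Hypothetical votes: two close twins (level 1) liked "heat", one distant twin (level 9) did not.
predicted = predict_ratings([("heat", 1, 8), ("heat", 1, 10), ("heat", 9, 2)])
```

A single vote always has weight 1 whatever its level, because the exponent is ln(1) = 0. The level only starts to matter once several twins of the same level agree on a movie, which is why the two close twins dominate the prediction here.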
        

Problem

Where do you get your movie catalogue?

  • IMDB.com - good and expensive
  • TMDB.org - okay and free

Big problem

How do you test your recommendations
when you have 0 users?

more users = better system

Another Social network?

Bootstrap the data

get data from other websites

Dark days

web crawler days

  • 5 days
  • 17GB of web pages (no images)
  • 30k users
  • 500k ratings
  • user accounts data?? WAT!?

One surprising morning

  • /logs
  • /backups

The next day...

BANNED!

at least they fixed their security issues...

... you are welcome!

How to save $10/month

or hosting from home

Team

divide and conquer

business + design + development

founders vs contributors

50% of nothing is $0

Testing

It works!

Launch!!

Not sure if we ever launched

What's next?

Push or drop?

  • Performance is slow
    (10s + 2min per user)
  • Movie database quality is bad
  • Dealing with competitors
  • Social network effect


Drop!

TopicRay

multi-threaded chat system

Sep 2011

Frustration

  • chats are hard for work discussions
  • deep comments
  • Google Wave is closed

HackerNews nested comments

Context

context in a conversation

or 'what do you mean'


think of bug reports

The idea

What if ...

  • You could never mix context in your discussions

Problem analysis

hierarchical data


very very deep hierarchical data
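
One common way to hold this kind of data is a flat list of messages that each point at their parent. The Python sketch below shows that idea and is not taken from TopicRay itself: the `(id, parent_id, text)` layout, the sample messages and the function name `walk_thread` are all assumptions. It walks the tree with an explicit stack instead of recursion, so very deep threads do not hit the interpreter's recursion limit.

```python
from collections import defaultdict

# Hypothetical messages: (id, parent_id, text); parent_id None marks a root topic.
MESSAGES = [
    (1, None, "Login page is broken"),
    (2, 1, "Which browser?"),
    (3, 2, "Firefox"),
    (4, 1, "Works for me"),
]

def walk_thread(messages):
    """Yield (depth, id, text) in reading order, each parent before its replies."""
    children = defaultdict(list)
    for msg_id, parent_id, text in messages:
        children[parent_id].append((msg_id, text))

    # Explicit stack: depth is limited by memory, not by the recursion limit.
    stack = [(0, msg_id, text) for msg_id, text in reversed(children[None])]
    while stack:
        depth, msg_id, text = stack.pop()
        yield depth, msg_id, text
        # Push replies in reverse so the earliest reply is visited first.
        for child_id, child_text in reversed(children[msg_id]):
            stack.append((depth + 1, child_id, child_text))

for depth, msg_id, text in walk_thread(MESSAGES):
    print("  " * depth + text)
```

The depth value is what a nested-comment view would use for indentation, and each message keeps its own context because it is always shown under the message it replies to.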

TopicRay progress

start with the end in mind

html map, topic list, linear chat view

styles, linear chat with rotation

map in oCanvas

map is draggable, chat space view

different layout with bigger chat view, focused message

trying another layout

and another, direction of chat is the same as on the map

switched to panels, map is expandable

straight layout + transparency

collapsible messages

collapsible messages with limit and size

moved away from draggable map to automatic map layout

moved away from panels, added Bootstrap CSS, map is not expandable

consistent color scheme, topic headers, issues with UI overlap, topic name at the corner

unread messages, collapsible limit set to one

trying sunburst map layout in d3.js

added map view with icicle graph

icicle graph styles

removed inline map, sorted topic list, experimenting with floating messages

rebranded to TopicRay, added focused line, HackerNews integration

topic list is transparent

icicle with unread messages

attempt to improve icicle

inline sunburst graph

animation for sunburst

tree depth indicator, inline message radar

new conversation tree view

demo

Stack

Backend:

  • Ruby on Rails + Postgres


Frontend:

  • HTML + CSS + SCSS + jQuery
  • backbone.js
  • oCanvas, d3.js
  • websockets (Pusher)
  • twitter bootstrap
  • and others

Performance issues

jQuery.animate is very slow

Firefox -> Chrome

TopicRay Launch

  • HackerNews
  • We need a blog

One month later...

It's time to launch!!

and ...

nothing happened!

What's next?

Push or drop?

  • 1 year old
  • solid proof of concept
  • no real use case
  • no competitors
  • no market?


Hold!

Lessons

  • Keep history of your progress
  • Don't strive for perfection
  • Release as soon as you can
  • Have a blog
  • Keep asking for feedback
  • JavaScript frameworks are good
  • Browsers are not as fast as you would like

and

Iterate

Iterate

Iterate

StoryLine

historic news browser

Frustration

  • I hate twitter!!
  • @antulik
  • Google Reader
    300 rss articles per day
  • Too much information
  • Too little time

The idea

What if ...


we had a tool which could
give us insights into information importance


tool which would
structure and prioritise information

get inspired

StoryLine progress

started with google calendar

prettify

moved to force graph

group into lines

add twitter

focus on twitter, calendar list

focus on twitter, zoom

navigation slider

prettify, add top previews

add avatars

add inline link preview

remove link preview

x axis is minutes

twisted layout 1/2 (DNA?)

twisted layout 2/2

bent lines

problems with size

zoom 1/5

zoom 2/5

zoom 3/5

zoom 4/5

zoom 5/5

added mouse scroll

reddit and multi-calendar colors

make everything smaller

grouped per lines, less chaos

another twist layout attempt

lines with small overlap

top 10 reddit streams for 12 hours

screencast v2

calendars hidden, add links preview

HackerNews integration

repositioned UI, mark as read, 4 previews

smarter selection

add tutorial, joyride.js

demo

Testing

common reaction

Wow! It's awesome!

what is it?

Explaining

narrowing down

from

data visualisation tool

to

historic news browser

Stack

Backend:

  • Ruby on Rails + Postgres


Frontend:

  • HTML + CSS + SCSS + jQuery
  • ember.js
  • d3.js
  • twitter bootstrap
  • S3 + CloudFront CDN
  • and others

Branding

  • 'StoryLine' is too popular
  • HooDB.com?
  • StoryParticles.com?
  • .com?
  • StoryLine.im

Launch

on HackerNews

+ facebook, twitter, gplus

attempt #2



at BrisJelly

Launch stats

in 24 hours

  • 85 upvotes
  • 4k unique visitors
  • 9.5k pageviews
  • 600 visitors/hour peak
  • No performance issues during launch

StoryLine on StoryLine

Performance issues

  • 800+ KB .js file
  • 1000+ animated objects
  • Optimised filtering
  • Smart fetching
  • How many times to redraw
  • Chrome vs others

Lessons

  • Grow and change ideas
  • Browser performance is hard
  • Communicating ideas is hard
  • Iterate
  • Ember.js + d3.js is a killer combo
  • Caching and CDN - FTW!
  • Web analytics to measure success

What's next?

Push or drop?


PUSH!

campjs

campjs.com

9-11th August

The end