Skip to content

Instantly share code, notes, and snippets.

@yanatan16
Last active April 19, 2017 21:03
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save yanatan16/c2a72ce50674b503e3ab to your computer and use it in GitHub Desktop.
Save yanatan16/c2a72ce50674b503e3ab to your computer and use it in GitHub Desktop.
explain roshi

Roshi is a set CRDT store modeled as a stateless web server in front of N (clusters) x M (shards) redis servers. It has 3 operations: Insert, Select, and Delete

Each element is a {"key", "member", "score"}

  • A key is the set name essentially. You select based on key. keys can be merged on select.
  • A member is the unique name of a member of the set
  • A score is a numeric value that has multiple uses.
    • If two elements with the same key and member are inserted, a select will only return the one with the highest score
    • If an element is deleted, the score of the delete must be at least as high as the score in the database.
    • One can think of the score as a timestamp, because that's SoundCloud's use for it.

Using Roshi for Entry Counts

Option A: Member for each Score

  • key = <raffle_id>
  • member = <bundle_id>:<trigger_id>:<score_n> where score_n = 1,2,3,4 for an entry with a score of 4
  • score = <timestamp_of_action>
  • EntryCount(raffle_id) = Count(Select(raffle_id))
  • SelectRandom(raffle_id) = Nth(Select(raffle_id), Random() % EntryCount(raffle_id))

SelectRandom and EntryCount are fast, but memory usage increased by a factor of avg_entry_score over option B

Option B: Scores as Scores

  • key = <raffle_id>
  • member = <bundle_id>:<trigger_id>
  • score = <trigger_score>
  • EntryCount(raffle_id) = Sum(Map(Select(raffle_id), :score))
  • SelectRandom(raffle_id) = IterateUntil0(Select(raffle_id), Random() % EntryCount(raffle_id))

SelectRandom is still slow, but memory usage is less (by a factor of avg_entry_score

@thealmightygrant
Copy link

I like Option A, and then you could go back through and squash the extra entries after the fact, which would allow for the same memory usage as Option B.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment