C*@ING

Train Management

Created by Christopher Reedijk (@creedijk) and Gary Stewart (@Gaz_GandA)
Dev Engineers, ING Netherlands


ING
ING Nederland (@ingnl)

Short introduction

Chris   Gary
Christopher Reedijk   Gary Stewart
  • Dev Engineer
  • C* enthousiast
  • @ING since: 01-01-2011
  • aka Grumpy

 
  • Dev Engineer
  • Love/hate relationship
    with C* :)
  • @ING since: 01-01-2012
  • aka Chaos Monkey
quick recap: our challenges - previous meetup

Improve availability
without trading consistency

Aim to be easier scalable

Adopt new ways of thinking


presentation
quick recap: about C* - previous meetup

Availability is easier

Performance is easier

Consistency is harder


presentation
and now for tonight?

Lets talk about consistency

as it is supposed to be harder!

a generic problem

Flow

Split batch files into single units of work

Also process single unit of work

Merge into a single batch file for processing

Batches can be scheduled and processed before certain times

Single unit of work MUST be processed ONCE only >NOT at least 1 and NOT at the most 1!

and this is how it looks like...

Flow

Easy - Split batches into single units

Hard - Single units collected into a batch file

Commonly known as Synchronize Issue

Challenging in a distributed
world even more so when masterless!

typical solution - transactional database

e.g. Oracle, Postgres

1 Query

                SELECT .... FOR UPDATE 
                   FROM .... a
                   JOIN .... b ON a. = b.  //optional
                WHERE status = 'NEW'
                   AND timestamp between ... and ...
                   AND rownum < X  
                NO WAIT 

DONE - thank you for your time...


Hang on - what about scalability, resilience and availability?

Will infra solve that? Too easy :-)

so could this be solved in C*?

                SELECT .... FOR UPDATE 
                   FROM .... a
                   JOIN .... b ON a. = b.  //optional
                WHERE status = 'NEW'
                   AND timestamp between ... and ...
                   AND rownum < X  
                NO WAIT 

"FOR UPDATE" No - C* - No client side locking

"WHERE" Filtering on status and timestamps = tricky
e.g. could use clustering columns

Table scan over partitions - C* = slow(er)

Large partitions can be cumbersome


Not looking good for C* :-(

Sucker for pain
let's try this in C* anyway


Remember the goal is to improve availability
without trading consistency

how to solve this in C*

View the problem differently

Application should have more responsibility

What if we track the size of the 'file'


To use an analogy

we will call out-going files - trains

And single units of work - passengers

proposed solutions

No Lock

Counters

Paxos

Carriages

possible solutions

NO LOCK - locking is
overrated right?

            SELECT TrainTotal FROM TrainTotal WHERE trainid = ?
            BEGIN BATCH
              INSERT INTO TrainPassenger (TrainId, PassengerId) VALUES (?,?);
              UPDATE TrainTotal SET TrainTotal = ? WHERE TrainId = ?;
            APPLY BATCH
                
            CREATE TABLE TrainTotal (           CREATE TABLE TrainPassenger (
            TrainId       text,                 TrainId       text,  
            TrainTotal    int,                  PassengerId   text,
            PRIMARY KEY ( TrainId )             PRIMARY KEY ( TrainId,  PassengerId) 
            );                                  );

Assign passenger to train

Keep total of trainsize in TrainTotal table

Requires read before write

Read current size before writing new size

Is this consistent? In sequence this would be perfect

Expected results: TrainTotal and TrainPassenger out-of-sync

possible solutions

COUNTERS - cool C* feature

            INSERT INTO TrainPassenger (TrainId, PassengerId) VALUES (?,?);
            UPDATE Traincounter SET TrainTotal = TrainTotal + 1 WHERE TrainId = ?;

            CREATE TABLE TrainCounter (         CREATE TABLE TrainPassenger (
            TrainId       text,                 TrainId       text,  
            TrainTotal    counter,              PassengerId   text,
            PRIMARY KEY ( TrainId )             PRIMARY KEY ( TrainId,  PassengerId) 
            );                                  );

No client-side read-before-write

Separate table - counters not allowed in batches
with other statements


Is this consistent?

Document says - Counters are made for this
but overcounts are possible

Expected results: TrainCounter and TrainPassenger are in-sync

possible solutions

PAXOS - lightweight transactions in C* (since v2)

        SELECT CarriageTotal FROM CarriageTotal WHERE Carriageid = ?
        UPDATE CarriageTotal SET CarriageTotal = ? WHERE Carriageid = ? IF CarriageTotal= ?;
        if [applied] == true  then INSERT INTO CarriagePassenger (CarriageId, PassengerId) 
        VALUES (?,?);

        CREATE TABLE CarriageTotal (           CREATE TABLE CarriagePassenger (
        CarriageId       text,                 CarriageId       text,  
        CarriageTotal    int,                  PassengerId   text,
        PRIMARY KEY ( CarriageId )             PRIMARY KEY ( CarriageId,  PassengerId) 
        );                                     );

Still need a read-before-write

Even though the document says use for 1% of your application -
use wisely - costs are high

Lightweight in features BUT not in 'execution'

We don't need 100.000 tps so this should be fine - possible solution!

Expected results: Consistent but slower

possible solutions

CARRIAGES - lets break the problem into chunks

            BEGIN BATCH
              INSERT INTO CarriagePassenger (CarriageId, PassengerId) VALUES (?,?);
              UPDATE CarriageTotal SET CarriageTotal = ? WHERE CarriageId = ?;
            APPLY BATCH

            CREATE TABLE CarriageTotal (           CREATE TABLE CarriagePassenger (
            CarriageId       text,                 CarriageId       text,  
            CarriageTotal    int,                  PassengerId   text,
            PRIMARY KEY ( CarriageId )             PRIMARY KEY ( CarriageId,  PassengerId) 
            );                                     );

Trains and carriages

Each application is responsible for their own carriage

Board Carriage instead of train

No read-before-write - application knows it's carriage size

All statements fit in a batch - atomic

Expected results: Consistent and fast but requires more logic

possible solutions

So now we have options -
let's test them out


 

No Lock

Expected results: TrainTotal and TrainPassenger out-of-sync
 

Counters

Expected results: TrainCounter and TrainPassenger are in-sync
 

Paxos

Expected results: Consistent but slower
 

Carriages

Expected results: Consistent and fast but requires more logic
the setup

Gatling - Results Gatling - Results

Datastax Community C* v2.0.6

Gatling performance tester

4 users each doing 250 runs

Call each step in a sequence


ExpressJS

Node-Cassandra-cql

test

Running the test...

test

Test Results

Gatling - Results
cqlsh:meetup> select * from traintotal;          cqlsh:meetup> select * from traintotal;
 trainid       | traintotal                       trainid | traintotal
---------------+------------                     ---------+------------
         paxos |        753                       counter |       1000
       no_lock |        930


cqlsh:meetup> select * from carriagetotal;
 carriageid    | carriagetotal 
---------------+--------------
 carriage_3001 |          400 
 carriage_3000 |          600
next steps

Great, so application fills up the carriage

but what about the trains

Carriage capacity is kept in application

When carriage is FULL - Attach to train and create new one

Any action on train requires cluster wide locking to ensure consistency!

how does it work?

Cluster Locks

Basically needed to ensure no-one else is working on the object - train

Only applications will fight for the lock - not threads

Use TTL to ensure locks are not kept endlessly

how does it work?

Aquire Cluster Lock


lockid =  UuidGenerator.generate();

INSERT INTO clusterlock (name, lockid) values  ( ?, ?) IF NOT EXISTS USING TTL ?;

if (rows.get(0).getBool('[applied]') ) { 
	return lockid.toString();
} else  {
	return NOK;
}

CREATE TABLE ClusterLock (
Name       text,
LockId     text,
PRIMARY KEY ( Name )
);
how does it work?

Refresh Cluster Lock


UPDATE clusterlock USING TTL ? SET lockid=? WHERE name= ? IF lockid=?;

if (rows.get(0).getBool('[applied]') ) { 
	return OK;
} else  {
	return NOK;
} 
 

CREATE TABLE ClusterLock (
Name       text,
LockId     text,
Hostname   text,
PRIMARY KEY ( Name )
);
how does it work?

Release Cluster Lock


DELETE FROM clusterlock WHERE name= ?  IF lockid=?;

if (rows.get(0).getBool('[applied]') ) { 
	return OK;
} else  {
	return NOK;
} 

CREATE TABLE ClusterLock (
Name       text,
LockId     text,
Hostname   text,
PRIMARY KEY ( Name )
);
other steps

Train departure - requires cluster lock

Depot - table of all unlinked carriages - should be empty helps with error situations

Scheduling - based on deadlines etc...

Maintenance - ensure that all operations are running smoothly - complete the feedback loop

to summarize

More responsibility in the application results in more scalable solution

Reduced contention in the cluster

MUCH more responsibility in application

Get it right - rewards are great

Get it wrong - penalties can be huge

Is it worth it?


Only you can decide

Hailo and NetFlix are rapidly deploying in
new countries within weeks

One final piece of advice:
test ALL features you are using


Ensures that upgrades will be easier and safer!

Paxos + TTL combined did not work after a minor upgrade

We spotted this issue within hours after an upgrade of our cluster


Jira Cassandra 6801

Datastax solved it within a week

Gatling - Results

Thank you

...graceful bow


Contact
gary.stewart@ing.nl
christopher.reedijk@ing.nl

We're hiring!
ING



this presentation is created using:
reveal.js

Who's willing to contribute to the next meetup(s)?

meetup