C*@ING

3 challenges, 3 facts, 3 questions

Created by Christopher Reedijk (@creedijk) and Gary Stewart (@Gaz_GandA)
Dev Engineers, ING Netherlands


ING Nederland (@ingnl)

Short introduction

Chris (Christopher Reedijk)
  • Dev Engineer
  • C* enthusiast
  • @ING since: 01-01-2011
  • aka Grumpy

Gary (Gary Stewart)
  • Dev Engineer
  • Love/hate relationship with C* :)
  • @ING since: 01-01-2012
  • aka Legacy

challenge 1

Improve availability

without trading consistency

availability & consistency

Consequences of not being available

[Figure: ING outage (three images)]
source: nu.nl

availability & consistency

Consequences of not being consistent

Insufficient funds

challenge 2

Aim to be easier scalable

Changes are happening at an increasing pace

easier scalable

Stop focusing on the
expected load

[Figure: the web yesterday]
source: bradfrostweb.com

easier scalable

Start focusing on the
unexpected load

[Figure: the web today and the web tomorrow]
source: bradfrostweb.com

challenge 3

Adopt new ways of thinking

Become the top engineering company

new way of thinking

ING's culture is changing fast.
Waterfall to Scrum to DevOps in less than 2 years

new way of thinking

Pre-conceived ideas still need to be cleared

Cache is bad, when used incorrectly

Need for locking (transactions) everywhere

Every domain for themselves

all challenges appear...

Improve availability

Aim to be scalable

Adopt new ways of thinking

C* to save the day

Superman

fact 1

Availability is easier

Proof

[Figure: availability result]

availability is easier, because

Masterless architecture

No outages in contrast to master-slave architecture

This does make C* chatty

availability is easier, because

Replication of data

Replication factor is configurable per keyspace

Increasing nodes does not mean higher availability
Increasing the replication factor does
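
Why the replication factor, rather than the node count, drives availability can be sketched with a little arithmetic. This is an illustration only: it assumes QUORUM reads and writes in a single data centre, and the helper names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas of one partition that may be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


# The cluster size appears nowhere: only the replication factor matters.
for rf in (1, 3, 5):
    print(rf, quorum(rf), tolerated_replica_failures(rf))
```

With a replication factor of 1, losing the single replica of a partition makes that partition unavailable, however many other nodes the cluster has.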

fact 2

Performance is easier

Proof

[Figure: performance result]

performance is easier, and

Solve your reads with your writes

De-normalization is acceptable for solving reads
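
"Solve your reads with your writes" can be sketched as writing the same row once per read pattern. The two dictionaries below are hypothetical in-memory stand-ins for two query-specific tables; the account numbers and amounts are made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for two query-specific tables.
transactions_by_account = defaultdict(list)
transactions_by_date = defaultdict(list)


def record_transaction(account: str, date: str, amount: int) -> None:
    """Write the same row once per read pattern (de-normalized on purpose)."""
    row = {"account": account, "date": date, "amount": amount}
    transactions_by_account[account].append(row)
    transactions_by_date[date].append(row)


record_transaction("NL01", "2014-03-05", -20)
record_transaction("NL02", "2014-03-05", 75)
record_transaction("NL01", "2014-03-06", 10)

# Each read is now a single key lookup; no join or scan is needed.
print(transactions_by_account["NL01"])
print(transactions_by_date["2014-03-05"])
```

The cost is an extra write per read pattern, which is the trade the slide calls acceptable.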

performance is easier, but

Know your partitions

Large partitions will hurt!
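
One common way to keep partitions bounded is to add a time bucket to the partition key. The sketch below assumes one event per day for a single account and a hypothetical monthly bucket; the right bucket size depends on the real write rate.

```python
from collections import Counter
from datetime import date


def partition_key(account: str, day: date) -> tuple:
    """Hypothetical compound key: one partition per account per month."""
    return (account, day.strftime("%Y-%m"))


# Two years of one event per day for a single account.
start = date(2013, 1, 1).toordinal()
days = [date.fromordinal(start + i) for i in range(730)]

unbucketed = Counter(("NL01",) for _ in days)
bucketed = Counter(partition_key("NL01", d) for d in days)

print(max(unbucketed.values()))  # rows in the single, ever-growing partition
print(max(bucketed.values()))    # rows in the largest monthly partition
```

Without the bucket the partition grows for as long as the account exists; with it, no partition holds more than one month of rows.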

[Figure: partitioning]

performance is easier, because

Linearly scalable

Adding nodes will increase your throughput through sharding
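
The effect of sharding on per-node load can be sketched by hashing partition keys onto nodes. This is a simplification: MD5 and the modulo step are stand-ins chosen for illustration, whereas C* assigns token ranges to nodes; the key names are made up.

```python
import hashlib
from collections import Counter


def owner(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index via a stable hash (illustrative)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return token % node_count


def load_per_node(keys, node_count: int) -> Counter:
    """Count how many partition keys each node owns."""
    return Counter(owner(key, node_count) for key in keys)


keys = [f"account-{i}" for i in range(10_000)]
load_three = load_per_node(keys, 3)
load_six = load_per_node(keys, 6)

# Doubling the nodes roughly halves the share each node has to serve.
print(sorted(load_three.values()))
print(sorted(load_six.values()))
```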

[Figure: linear scalability]
source: DataStax

fact 3

Consistency is harder

consistency is harder, and

Application developer has much more responsibility

Focus on the data model and the flow of data is key

[Figure: noSQL versus SQL]

consistency is harder, because

CAP theorem

Consistency (all nodes see the same data at the same time)

Availability (a guarantee that every request receives a response about whether it was successful or failed)

Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

consistency is harder, because

Transactions and queuing are anti-patterns

C* v2.0 reduces the complexity by using lightweight transactions (Paxos)
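
Lightweight transactions expose conditional writes such as INSERT ... IF NOT EXISTS. The sketch below only mimics that compare-and-set behaviour in a single process: the class is hypothetical, and a lock stands in for the Paxos round that C* runs between replicas.

```python
import threading


class ConditionalStore:
    """In-memory illustration of conditional (compare-and-set) writes."""

    def __init__(self) -> None:
        self._rows = {}
        self._lock = threading.Lock()  # stand-in for replica agreement

    def insert_if_not_exists(self, key, value) -> bool:
        """Apply the write only if the key is absent; report whether it applied."""
        with self._lock:
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

    def update_if(self, key, expected, new_value) -> bool:
        """Apply the write only if the current value equals `expected`."""
        with self._lock:
            if self._rows.get(key) != expected:
                return False
            self._rows[key] = new_value
            return True


store = ConditionalStore()
print(store.insert_if_not_exists("user:grumpy", "Chris"))  # first claim applies
print(store.insert_if_not_exists("user:grumpy", "Gary"))   # second claim is rejected
```

The caller has to check the returned flag and decide what to do when the write was not applied.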

all facts appear...

Availability is easier | challenge 1: availability & consistency

Performance is easier | challenge 2: easier scalable

Consistency is harder | all challenges: availability & consistency, easier scalable, new way of thinking


all challenges covered...

question 1

What risks do we have with
2 DCs and a write/read CL of LOCAL_QUORUM?

Split-brain

[Figure: split brain]

Who wins? Neither side knows the other side!

CAP theorem - choose wisely

Use QUORUM but risk availability
when there are issues with the WAN link

Arrange a third DC... €€€

Applications apply DC stickiness
whilst processing data which increases complexity
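
The trade-off between these options can be sketched by counting reachable replicas when the WAN link fails. The model is deliberately simplified: it assumes a replication factor of 3 per DC and that every replica in a reachable DC is up; the function and DC names are hypothetical.

```python
def quorum(replicas: int) -> int:
    """Majority of the given number of replicas."""
    return replicas // 2 + 1


def can_serve(level: str, local_dc: str, reachable_dcs: set, rf_per_dc: dict) -> bool:
    """True if a coordinator in local_dc can satisfy the consistency level."""
    if level == "LOCAL_QUORUM":
        needed = quorum(rf_per_dc[local_dc])
        available = rf_per_dc[local_dc]
    elif level == "QUORUM":
        needed = quorum(sum(rf_per_dc.values()))
        available = sum(rf_per_dc[dc] for dc in reachable_dcs)
    else:
        raise ValueError(f"unsupported consistency level: {level}")
    return available >= needed


two_dcs = {"dc1": 3, "dc2": 3}
three_dcs = {"dc1": 3, "dc2": 3, "dc3": 3}

# WAN link down between two DCs: each side only reaches itself.
print(can_serve("LOCAL_QUORUM", "dc1", {"dc1"}, two_dcs))  # both sides keep writing
print(can_serve("QUORUM", "dc1", {"dc1"}, two_dcs))        # neither side can write

# Third DC added, dc3 cut off: the connected pair still holds a majority.
print(can_serve("QUORUM", "dc1", {"dc1", "dc2"}, three_dcs))
print(can_serve("QUORUM", "dc3", {"dc3"}, three_dcs))
```

With LOCAL_QUORUM both halves accept writes independently, which is the split-brain risk; with QUORUM and two DCs, a WAN failure stops writes on both sides.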

question 2

Does QUORUM read and QUORUM write guarantee consistency?

No - it gives strong consistency, not a guarantee

NTP must be set up correctly
Know your flow of data
Model as idempotently as possible
[Figure: NTP issue]
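
Why clocks matter even with QUORUM reads and writes: conflicting writes to the same cell are resolved by timestamp, and the highest timestamp wins. The sketch below assumes a hypothetical 5-second clock skew on one writer; the values are made up.

```python
def last_write_wins(writes):
    """Resolve conflicting writes to one cell: the highest timestamp wins."""
    return max(writes, key=lambda write: write["timestamp_us"])["value"]


CLOCK_SKEW_US = 5_000_000  # assumption: writer A's clock runs 5 s ahead

# Real time 0 s: writer A writes, stamped by its fast clock.
write_a = {"value": "balance=100", "timestamp_us": 0 + CLOCK_SKEW_US}
# Real time 1 s: writer B writes later, stamped by a correct clock.
write_b = {"value": "balance=50", "timestamp_us": 1_000_000}

# B's write happened last, yet A's write is the one that survives.
print(last_write_wins([write_a, write_b]))
```

Synchronized clocks shrink this window, and idempotent writes make a replayed or reordered write harmless.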

question 3

Will C* solve all your IT issues?

No - but it will help solve a lot of your problems

Performance and availability are tuneable based on needs and costs

Consistency is solved by design and knowing the flow of data

Going from SQL to noSQL is a paradigm shift

Skills are not readily available

Some issues are better solved by relational databases

Thank you

...graceful bow


Contact
christopher.reedijk@ing.nl
gary.stewart@ing.nl

We're hiring!



this presentation was created using:
reveal.js

Let's kick off the
open discussion