This is my summary for the lecture “Architectures for Distributed
Internet Services”, as held by Franz Hauck and Benjamin Erb in 2022 at
Ulm University.
Server performance in this case predominantly I/O-bound (both disk
and network)
Architectures
multi-thread / multi-process
One dedicated thread for accepting connections
Dedicated thread per connections (spawned by first thread)
Allows multiple concurrent connections
Synchronous, blocking IO may be used
Large number of connections make efficient scheduling
challenging
-> C10k Problem: Challenge of handling 10k concurrent connections
to a server (early 2000s, now: C1M scale and beyond)
Solutions for the C10k
Problem
Avoid spawning many processes
Avoid 1:1 mapping between connections and threads
Using asynchronous and non-blocking I/O (reduces number of waiting
threads)
Reactor Pattern
Reactor: single thread with event-loop
reacts to IO events from a queue
dispatches events
Event-specific handlers
process events produced by reactor
Implemented in nginx: One master process + one worker per core
Dynamic Web Content
Client-side dynamicity
Generated within browser
Related to presentation of content
Server-side dynamicity
automatic updating of static files
webserver delegates requests to external programs
e.g. CGI script
webserver executes scripts internally
e.g. php module
webserver has builtin site-gerneration code
tight coupling between webserver and application
Creating dynamic web content
some program takes http requests as input and produces web resources
(HTML) as output
e.g. java servelet
code may be embedded into web resource, inserting code output into
resource
e.g. java server faces, PHP
Server Side Includes
Directives inserted into HTML
executed by the server, but may delegate to CGI scripts
External Programs: CGI
Specifies interaction between webserver and external programs (CGI
scripts)
Server spawns new process for each request
HTTP headers passed via env variables
HTTP body passed via stdin
CGI script may produce entire HTTP response, or only essential
headers and body (remaining data added by server)
Modules for interacting with CGI webserver exist in programming
languages, such as CGI.pm in perl
Drawbacks:
Process creation may be slow
Script initialization happens on every request
Limited communication channel in stdin/out
Alternative: FastCGI
Binary protocol
via unix socket/named pipe/tcp connection
Script keeps running and processes multiple request
Multiple (remote) FastCGI applications may be running and accessed
by single web server
allows load balancing
separates web and application server
SCGI
(Simple Common Gateway Interface): Alternative to FastCGI
Simplifies interface: non-binary interface
Easier to implement than FastCGI
Problems with CGI
CGI programs often use same security context as web server
Apache allows running CGI scripts as specified user/group
Server-side Scripting
Scenario: Static HTML page should be enriched by dynamic content.
Here, inside-out scripting is considered. This means scripts are
placed inside HTML document. The script output is inserted at their
location in the HTML document.
Execution of the scripts:
External script interpreter, spawned by server
Long-ruinning external execution environment
Similar benefits to FastCGI vs CGI
Used by PHP for example
Server-internal script handling
Inside-out scripting allows HTML-centric development approach.
Self-processing pages are a pattern for processing form data on the
same page it is displayed on. This is done by distinguishing between
GET/POST request during script execution.
Example: Perl
Approach 1
HTML provided in template file with placeholders
Separate script provides content for placeholders
Approach 2
Code directly embedded in HTML with special syntax
Output is inserted in location of script, if requested
Example: Java
Inside-out scripting approach
Before execution, JSP-to-Java transformation transforms HTML with
embedded scripts to Java program (“Servlet”) which outputs entire page,
containing static and dynamic content.
Execution environment included in web servers like apache
Includes useful functionality for web-dev (databases, HTTP,
etc)
Data Management: Sever-Side
State
Data is stored both on server and client, and data is exchanged in
requests and responses. Examples for client data include:
Cached resources
Cookies
JS engine state
Server data includes:
Application data
Session state
Application state
The server holds static resources and dynamic
application state. Application state may include:
User accounts and preferences
Dynamically generated content
recommendations
shopping cart
messages between users
Cached and aggregated data
Aspects of server-side state include:
how to interact with the server state
how data is modeled
mutability of data
security
storage duration
access patterns
Resulting from those aspects are design choices regarding server-side
state:
Storage Options
Options are:
main memory
application state
program memory (stack + heap)
usually managed by language runtime
in-memory databases
provide access API
may provide guarantees about integrity, consistency, visibility
examples: sqlite (in-memory version), h2, many more
both easy to use and fast access time
drawbacks:
volatility (data loss on process stop/restart, overhead
otherwise)
small size
popular for short lived data with fast access requirements
local storage
file system (flat files)
advantages:
easy to use
well supported
persistent
easy to backup
large size
disadvantages
unindexed
unstructures
no guarantees for concurrent writes etc
popular for simple applications
databases
advantages:
data is structured
indexed
allow complex search operations
allow relational data
large size
transaction support -> consistency
usually allow distributed setups
integration with other services through standard interfaces
(SQL)
drawbacks:
complex API
complex management and administration
popular for complex structured application data
relational databases like postgresql, sqlite, mariadb
noSQL databases, such as k/v stores, document or graph
databases
remote storage
information systems like CRMs
external web services (database server, remote FS)
Session Management
Sessions provide state beyond individual HTTP requests. Example uses
are multi-step forms, login status or shopping carts.
This requires the server to identify subsequent requests belonging to
the same user/session. A session is the list of consecutive actions of
an individual user. This is usually implemented by labeling the
requests, that label is called Session ID or Session
Token. The ID is generated on the server side, and has to be unique
and un-guessable (to prevent session hijacking). It is transmitted to
the client in the first response, which then includes it in subsequent
requests. This allows the server to relate it to the session.
Sessions may end, either explicitly (discarding the session ID on
server or client) or by timeout.
URL Session Identifiers
Also called URL rewriting, this is a technique for handling
session IDs. It works by embedding the ID in the URL, which requires the
server to rewrite all hyperlinks in the response to include the ID.
Advantages:
No client support needed
No additional headers required
No cookies required
Disadvantages:
Security risk: Session ID easily exposed by sharing or logging an
URL
Correct rewriting might be difficult
Cookies
Cookies are small chunks of data (KV pairs) related to one domain,
which the client includes for every request to the same domain. Cookies
provide a more secure way to attach IDs to session.
The server provides the cookies using Set-Cookie
headers, which include the session ID.
The client receives the cookie, saves it and sends it back to the
server in subsequent requests using the Cookie header.
Session State
Once the server has the possibility to relate requests to sessions,
session state can be stored. This session state can be arbitrary, and
would usually stored in a database indexed by the session ID.
In principle, the entire state could be stored in a cookie, which
eliminates the need for server-side storage but has several drawbacks
such as limited size and the possibility for the client to modify the
state in unexpected ways.
Security
There are multiple security considerations regarding sessions, in
particular Leaking the ID, which can lead to session hijacking and thus
impersonating the user
Client-side Developments
The desire for more dynamic user interfaces in the web go beyond
static HTML pages delivered from the server, and interactions consisting
mainly of navigating to different pages. This requires computation on
the client-side, which is possible using javascript. Today, the
client-side of a web page consist of multiple elements of
HTML, defining the content and structure of the page
CSS, providing styling and display information
JavaScript, providing interaction and defining the dynamic
behavior
The JS execution model of a browser consists of an event loop
processing callbacks. Inside the callbacks, the JS code has access to
multiple APIs, such as for reading and modifying the DOM, setting timers
or making (AJAX) requests.
Object-Relational Mapping
Traditionally, relational DBs are accessed using SQL statements.
Drawbacks to this are that SQL is both a different programming model to
the application language and a different data model (relational vs e.g.
variables, object oriented). This requires mapping between those two
domains, and to synchronously maintain both models, the application code
and database.
An idea to solve this is to specify the model only in the application
programming language. Data entities could be represented by language
objects, relationships by references between objects. The mapping to the
DB data model is automatic. Database queries/updates are hidden behind
read/write of an object or its properties. Database inserts are hidden
behind object creation. Executing queries instantiates language objects
(queries themselves still exist).
The database still requires a data model, and schema. This could
however automatically be generated from the data model definition in the
application language, and may even automatically synchronize changes in
the data model.
Structured Developments
All web developments covered up to this lecture required lots of
boilerplate code, which is error prone and causes a high development
overhead. This created a demand for a more structured development
environment, which includes:
Structure and best practices
Common patterns for useful abstractions
Automation of common tasks
Those demands ultimately resulted in entire new frameworks for web
development.
MVC
Model-View-Controller is an architectural pattern applied to
web-based services. It separates the application into three independent
components:
Model
Represents data
View
Obtains data from the model in some way
Presents data to the user
Controller
Handles user interaction
Performs operation on data
Controls views, such as activating a new view
The business logic may be located in either the controller or model.
If the logic is combined with the data in the model, operations would
still be invoked by the controller.
MVC can be applied to web applications in multiple ways:
Server Side MVC
All components contained in the server (e.g. CGI script)
Input request handled by controller
Response generated by view
Technologies such as Java Servlets assist in extracting the model,
but controller and view would still be tightly coupled. Java Server
Faces provides a generic, configurable controller component and allow
separating the view (“Facelet”). The view can be implemented
declaratively as an HTML page containing references to the model.
Django takes a different approach: Django separates the tasks in the
components view, model and template. The view handles the request and
contains the business logic (like the controller in other examples). The
template corresponds to the view in the java example: It is an HTML
template which gets instantiated with model data by the view.
Summary
MVC allows modularization, and separates design from
programming
Frameworks reduce required boilerplate code by providing common
components
Templating for views simplify development
Modularization leads to improved testability
Some more structural
patterns
Inversion of Control (IoC)
This principle states that the application code is reactive,
not active: it gets called by the framework when relevant events occur,
instead of being the main entrypoint into the application.
Don’t Repeat Yourself
Self explanatory
Convention Over
Configuration
Code, configuration and other aspects such as directory structure
should be derived from convention whenever possible.
Structured Project
Directories
Frameworks usually have conventions, an it’s useful to adhere to
them
Frameworks sometimes provide generation of project scaffolding
(Django, Ruby on Rails)
RIA and SPA
Thin vs. Thick Clients
In thin-client applications, the browser only handles inputs and
renders responses from the server. Interactions between client and
server are HTTP requests.
Thick-client applications move some business logic to the client
side. This typically allows for improvements in user experience. Some
actions are executed locally without server interaction. Other
interactions with the server might be application specific, once the
client application is loaded.
Rich Internet Applications
(RIA)
Basic idea: Resemble user experience of local desktop application.
This requires running some logic on the client side. An extreme example
would be a client side game engine running in the browser. RIAs allow
platform independence, which might be an advantage compared to native
applications.
Examples of RIAs:
Google Docs
Web mail clients
Flash games
Frameworks are available for both client- and server-side
programming.
Today, RIA are “rich web applications”, using native web technologies
such as HTML(5), CSS and JS.
Single Page Applications
(SPA)
SPAs are thick clients, using current web technologies (HTML5, CSS3)
and modern web APIs using JS. The page is loaded only once, and the
application dynamically modifies the current page based on user input,
instead of loading new pages. Interaction with the server is done in the
background.
Technologies used:
Communication
Regular HTTP request for initial load
AJAX/fetch, server-push or WS for later communication
Data formats
JSON and XML for structured data
Alternative: pre-generated HTML from the server
Challenges include:
Slow initial loading time
Programmatic access e.g. for search engines
Handling of application state vs page location
Partitioning of logic between client and server
Node.js and Unification of
Languages
Emerging Challenges
Scalability: C10k-like problems for dynamic content
Coordination between server and client, with many clients
Interaction between concurrent requests
Near-real-time information distribution
A traditional web application usually features
Isolation of individual requests at the server
Application state exclusively in database
while a modern application wants to use techniques such as long
polling, wherein the server delays a response until some event has
occurred (such as a new chat message being sent). This requires a new
programming model with
Explicit concurrency support (inter-request interactions)
Support for long-running and event-triggered requests
Scalability for dynamic content generation
Event-driven Web
Application Runtime
This model extends the reactor pattern
by allowing the event loop to execute arbitrary application code such as
making database queries and generating HTML output.
Node.js
Asynchronous, event-driven JS runtime
JS is used because it does not have a threading model or standard IO
functionality, as all those are provided exclusively by the runtime
Extensive use of callbacks
Framework express simplifies HTTP server creation
All callbacks executed sequentially
events library provides inter-request interaction
npm package manager and package registry
Unification of Languages
Using the same programming language for client and server has many
advantages, such as easier debugging and less duplication of server and
client code.
A full JS stack has the advantage of not requiring transpilation or
code generation for the client side. Also, developers might already be
familiar with JS because of its prominence for frontend code.
Popular JS stack “MEAN”: MongoDB, Express, AngularJS, Node.js.
Component-based Frameworks
This focuses on web component for the client side.
Components are reusable view elements, which is desirable to avoid
re-implementing features
No standard way for re-using HTML blocks, templating
page styles not applied to shadow dom, shadow dom styles not applied
on page level
HTML Templates: Declare HTML fragments, instantiate at runtime
<slot> allows injection of instance specific
data
ES Modules: reuse of JS documents
Component-based
Application Frameworks
Those don’t use the standard components directly, but implement
similar concepts.
Angular
Declarative HTML templates
Typescript for defining components, which references HTML and
CSS
React
JSX templates: functional instead of declarative
components as state machines
no scoped styles
Vue.js
Declarative HTML templates
Standard Components vs Frameworks:
Component support is builtin to browsers
Frameworks often more feature-rich
Frameworks provide virtual DOM
Some frameworks allow export to web components
Progressive Web Applications
Principles of PWAs:
discoverable: indexable by search engines
installable: integrates with platform
app-launcher/homescreen/whatever
linkable: sharing via URI
network independent: PWA can work offline or with limited
connectivity
progressive: basic feature set for old browser, extended features
for new browsers
graceful degradation: mitigation techniques for missing
features
shims, polyfills: re-implementation of modern web APIs for browsers
without native support in plain JS
progressive enhancements: opposite of graceful degradation: upgrade
basic functionality if available
transpilation of modern JS to version compatible with older browser
version
re-engageable: can send notifications to users
responsive: adapts to device and screen size
safe: secure communication, protection of sensitive data
Implementing PWAs:
secure contexts
browser allows access to sensitive APIs only if requirements
fulfilled
requirements might be e.g. a secure connection
web app manifests
declares information about web app such as
homescreen icons
appearance options such as full-screen mode
screen orientation
system colors
referenced by link from HTML header
service workers
proxy between application and network
implements:
offline support
cache control
pre-fetching resources
background tasks such as data sync, push notifications
fully asynchronous
only available in secure context
Popular Implementations
This section has been skipped due to an unbearable amount of Java EE
content.
Introduction to Scalability
Part 1
The main requirements for a large scale web application are
reliability
scalability
maintainability
A scalable system is able to cope with varying load. This might
require utilizing additional resources, or graceful degradation in case
of temporary overload. Scalability is not the same as performance, as a
system which has good performance for a single user it might still not
scale well for large numbers of users.
Client Side Performance
Metrics
DNS lookup time
Time to first byte TTFB
Time to first contentful paint (FCP)
Time to start render
Time to interactive (TTI)
Server Side Performance
Metrics
request throughput (requests per second)
response times (time between receiving request and responding)
hardware utilization
availability
uptime
Response Times
Both average and peak response time can be measured.. However, those
statistics are of limited relevance, since the distribution is usually
not normal or uniform. A better way of assessing response times are
percentiles (99% of all responses arrive within \(p_{99}\)). Percentile plots (hockey stick
plots) allow comparing response times of multiple systems.
Throughput vs Response Time
Under load, increased throughput negatively affects response times.
This might be due to requests being queued for example.
Part 2
Load Characteristics
Load: externally assigned work for the system
Load is characterized by load parameters:
web server load
number of requests
type of requests
target URIs
application backend load
application specific, e.g. ratio of read/write DB operations
Roary example load parameters:
add-post operations per second
view-post operations per second
ratio between both
secondary load parameters in this example might be the number of
followers and subscriptions per user, or the distribution of those among
the user-base.
Scalability Strategies
Scale-Up (vertical): Adding more resources to existing machines
inherent physical limits
Scale-Out (horizontal): Adding more machines
requires distribution
Scale Cube
Axes:
horizontal duplication
cloning services
replicating instances such as DBs
load-balancing between entities
functional decomposition
separation of services
partitioning of data sets
data partitioning
sharding of service instances (dedicated instances for certain
groups of users)
distribute dataset based on key
Example for scaling up users+products database:
Replication of entire database allows more concurrent read
operations
Separating users and products tables into two instances increases
capacity for each and allows independent scaling
Splitting entire dataset into two databases by key
All three approaches can be combined, even for this simple database
example.
General Strategies
stateless components are easier to scale and replicate
avoid single points of failure
shared state makes replication and sharding difficult
asynchronous services and calls prevent bottlenecks
caching may be used on all architecture layers and can have huge
benefits
AJAX: load additional information by initiation of client
SPAs:
accessing business data
triggering server-side operations
loading code
loading HTML
REST
“Representational state transfer” is a style and set of guidelines,
which supports large-scale web applications and works well with
HTTP:
client-server architecture with request-response communication
statelessness: client manages session state, each request includes
all necessary information (such as pagination cursor versus
server-stored pagination state for session)
cacheability: responses labeled with cacheability
layered system: clients, servers, intermediaries such as caches,
load balancers, etc
uniform interface:
small set of generic operations: create, read, update, delete.
Applicable to all resources in system
identification of resources:
architecture is “resource oriented”, such as users, products or blog
posts.
Resources are identified by URIs, which are independent from
resource representation
manipulation of resources via their representations: request
resource (in some representation), modify, send changed resource
back
self-descriptive messages: message contains metadata,
e.g. representation used via content-type header
hypermedia as the engine of application state:
application state changes on following links, submitting forms or
following redirections
clients do not need to know internal application layout, base URL
allows exploring the application
might require additional formats for non-HTML data
GraphQL
The goal of GraphQL is to avoid disadvantages of REST, and in
particular make it possible to retrieve only the necessary data, and all
of it in a single operation. An example from Roary would be requesting
some information about a user and their last three posts. This would
require multiple requests when following REST (user, post list,
individual post-details).
In GraphQL, the query specifies the exact fields required, such
as:
This requests only the name of the users, and the text of the last
three roars. Responses are provided in JSON format.
Using GraphQL requires a schema definition for the available
resources, which serves as a template for possible queries. Queries can
also be created to modify data, or subscribe to update-events. The
server has to implement some functions to resolve queries, libraries are
available to connect GraphQL queries to data sources such as
databases.
Client-side frameworks are available which link views with GraphQL
queries.
While GraphQL eliminates REST problems such as transferring
unnecessary data in multiple requests, and allows for fast frontend
iteration without backend modifications, drawbacks exist: Caching is
more complex, since data is often incomplete, and queries can be very
complex. This leads to large processing demands on the server, which
also vary widely by request.
Websockets
HTTP is not suited for interactive two-way communication (prime
example: web chat application). An option to solve this would be AJAX
polling, or AJAX long polling, but this requires complex server-side
coordination and bears the hidden cost of HTTP (headers, TLS handshakes,
etc).
The modern approach however is WebSocket: WebSocket describes both an
API and protocol:
reliable client-server interaction
compatible with HTTP connections
TLS integrated
callback-based API suited for JS
The client opens a WebSocket connection to the server, and can then
send messages and react to incoming messages. The server listens to
incoming requests to open connections, and then interacts with the
connection.
The WebSocket protocol is compatible with HTTP, and uses a GET
request to initiate a connection. After the response, the protocol is
switched to WebSocket. Subsequent messages are not HTTP requests, but
use the WebSocket wire protocol. WebSocket packets have at least two
bytes overhead and allow binary payloads.
Application protocols are usually used on top of WebSocket, which can
already be specified when initiating the connection.
Backend-Backend
Communication
This type of communication occurs between servers which are part of
the backend. An example might be communication between an application
server and database server.
Previously mentioned protocols like REST and GraphQL might be used
between servers, but there are additional options:
RPC: Remote Procedure Call
Resembles a function invocation that happens on a remote host
request-reply interaction
Extends REST, which only allows CRUD operations
requires marshaling (serialization) of parameters and result
Popular implementations:
Java RMI: transparent method invocation between JVMs
gRPC: language agnostic, based on HTTP/2 and protobuf. Also able to
handle streams of data.
SQL: known for DB access, usually embedded into database
connectors
defines application data encoding based on XML schema
WSDL “Web Service Description Language”:
XML format for describing web services, basic building blocks:
types: specify messages (such as SOAP)
interface: specify functionality
binding: how to access service (protocols)
service: where to access functionality
can be used for code generation
Advanced Concepts
Messaging
Interconnect servers by message bus
Queuing
messages addressed to queues
queues resemble mailboxes
receiver asynchronously fetches messages from queue
queue decouples sender and receiver
Pub-Sub
messages not specifically addressed
receivers subscribe to desired messages
could be topic based or content based
implementation could be one queue per topic
Implementation Notes
library approach: queue is integrated into receiver application
server approach: queue located on dedicated message server
replication of server allows scaling
Examples:
IBM MQ
combines queuing and pubsub
Apache Kafka
persistent message stored
topic-based pubsub
allows stream processing
scalability:
partitioning of topic queues
receivers organized in consumer groups, which subscribe to topics.
Consumer group could have one receiver per partition of topic
queue.
SEDA: Staged Event-Driven
Architecture
motivation: efficient internet services even for load spikes
idea: pipelining
stage input: event queue
stage outputs: events for subsequent stages
Advantages:
independent processing units
decoupled stages
parallelization of stages
Disadvantages:
queues can become bottlenecks
complex architecture
nowadays: no performance improvement compared to (modern) threading
systems
Back Pressure
concept for preventing exhaustion of service resources
idea: block resource consumption until resources are available
examples:
block additional connections
block producers from submitting new messages
block callers from invoking procedures
might lead to pipeline stalls, but prevents system from crashing
(graceful degradation)
State
State in Scalable Web
Architectures
In this section, we consider server-side state. We have to
differentiate the context of the state: request, session, long-term.
Additionally, we have to differentiate between ephemeral/volatile,
derived and persistent state.
Caching
Main idea: prevent duplicate work.
Challenges:
potential for inconsistent state
cache could become bottleneck
failing/restarting caches may overload downstream components
increased system complexity
Application of caching throughout the architecture:
client side caching of responses
outsourcing to CDNs
reverse proxies with cache: serve static content and cached
dynamically generated pages
cache support on web servers: cache-control headers etc
application servers: caching in language runtimes (bytecode for
interpreted languages), caching of DB queries, application specific
strategies
database: various internal caching methods, caching of query
results
in-memory data stores: provides caching for other components
backend services: caching of results
Rule of thumb: cache high up in the call stack (close to the
client).
Data Models
Relational Model
SQL for data definition, manipulation, querying
transactions
Key/Value
hash table storage
limited size key
usually arbitrary value
Document oriented
similar to k/v, but structured document as value
document usually JSON, XML, etc
Graph Model
data items as vertices and edges
interesting if querying relationships between items is common
operation (social network, road network)
Column-oriented Model
storage of tables by column instead of rows
more complex insert operation
efficient operations over large table, such as aggregation
Functional Decomposition
Idea: Split application model into logical parts which can be
separated.
allows physical separation
enables use of different data models
join-logic must be handled by application
Replication
Idea: Multiple copies of the same data.
increases throughput
increases availability (tolerate failure of a replica)
reduces latency via geo-replication
Primary challenge: mutable data. This requires propagation of updates
to achieve consistency between replicas.
Leader-Follower Replication
Dedicated leader database and set of replicas, called followers
Only read-only applications sent to follower databases
writes replicated from leader to follower
replication might be done synchronously (within the write operation)
or asynchronously
Replication Mechanism:
statement based: direct execution of statements in replicas.
statements must be deterministic.
log based: append-only log of changes sent to replicas. might be
“write-ahead log” of more low-level write operations
application triggered: replication triggered and specified by user
application
Properties/guarantees relating to consistency anomalies:
read your own writes: guarantee of seeing previous write in
subsequent read
monotonic reads: guarantee that subsequent reads will never return a
previous version of the state
consistent prefix reads: data items observed in causally correct
order
Multi-Leader Replication
Multiple instances handle write operations
might be generalized to client-server interaction (e.g. for
PWAs)
Main problem: write conflicts. Strategies for resolution:
conflict avoidance: each data item has one authoritative leader
which handles write operations
convergence to consistent state: ensure all leaders agree after
replication, such as by enforcing global order on update operations
user-defined resolution: application specific
Leaderless Replication
always contact all instances, but only require certain number of
responses (quorum)
write quorum \(w < n\) for \(n\) replicas: writes possible even if
replica is down
sum of write and read quorum \(w+r>n\), such as \(w=r=\frac{n}{2}+1\)
optimization: smaller \(r\) for
read-heavy loads
if \(w+r \leq n\): increased risk
of reading stale data
Sharding/Partitioning
Split a collection of data item into multiple partitions
requires routing of requests to correct partition
primary challenge: partitioning scheme which creates even
distribution and allows rebalancing
Partitioning schemes:
key range: shard = ranged partition of key space
straightforward
prone to skewed workloads
hashed keys
no inherent support of range queries
Rebalancing shards: scale-out by adding instances, ideally with low
migration effort
dynamic partitioning (key based sharding): split keyspace of
overloaded shard in two halves, assign one half to new shard
fixed number of shards which is way larger than number of machines:
new machine can take some existing shards
fixed number of shards per machine: new machine picks some shards,
splits them and hosts one half
Sharding is often combined with replication.
Data Consistency
Consistency models define visibility semantics and system
behavior.
CAP Theorem
In case of a partition, a system can either
focus on consistency, but loose availability, or:
keep instance available, but not guarantee consistency
PACELC Theorem
In case of a network partition, you have to choose between
availability and consistency.
Else, in normal operation, you have to choose between latency and
consistency.
Harvest/Yield
harvest: completeness of queries
yield: probability of completing a request
in some cases, an incomplete result is better than no result
Event Sourcing
alternative style for persisting data
append-only sequence of (immutable) state-changes
state computed by applying all state-changes in order
event log is persisted, for example on append-only k/v store
Benefits:
Persistence of state history
explicit consistency model
Drawbacks:
unconventional
increased storage requirements
Logic
Application Logic
logic required for generating dynamic web pages or providing
application services and functions
external and internal endpoints
sequential execution of application logic
application server sequentially calls and waits for results from DBs
and backend services
does not scale for large number of backend services
event-driven and asynchronous execution
first, dispatch all calls
second, wait for all results
this might require additional coordination or dependency
management
Stateless and Stateful
Application Logic
stateless: independent request-response cycles, desirable for
scalability
specific meaning of state:
session state
server-side application state (“stateful resources”)
Examples:
Video transcoding service:
stateless service: each individual request contains all required
information
easy scalability: replication
Game:
stateful application logic (game state) maintained by server
scalability strategy: sharding of game instances without shared
state
Calendar:
stateless application logic, stateless request handling: state is
queried from DB for every request, no dependency between requests
database maintains stateful resources: mutable data
scalability: replication of application servers, scaling of DB (see
above)
Stateless logic is preferred whenever possible. It might be possible
to decompose application logic into stateless and stateful parts.
Distributing Application
Logic
Functional Partitioning
Application specific characteristics
Decouple application functions
Application functions might still access the same backend
Allows independent scaling of endpoints
Horizontal Duplication
(Replication)
Multiple application servers run same copy of application logic
often requires stateless request handling
arbitrary load balancing strategy
Data Partitioning (Sharding)
Designate instance to handle designated set of requests
Partitioning based on e.g. user id, session id, client IP, etc
allows stateful request handling logic
load balancing needs to be aware of sharding
Shared and Global State
state often handled by database system
transactional semantics challenging for distributed databases
querying state from DB might not be fast enough
Alternatives:
state bound to individual shard: concurrency only within single
application server
distributed state between application instances: requires
distributed concurrency mechanism
actor model
event-driven model (shared event bus, pubsub)
distributed transactions
Background Tasks
request independent application logic
maintenance tasks
data analysis tasks
might be long running compared to requests
often requires access to backend and application state
Batch Processing
long running task
fully available and bounded input data
Stream Processing
continuous processing task
input data unbounded and not available in advance
results might be incomplete or approximate
example: trending topic analysis on social networks
Event Sourcing and CQRS
CQRS: Command Query Responsibility Segregation
idea: differentiate commands (no return value, might have side
effects) and queries (returns data, no side effects)
application to event based programming:
command model: state-changing operations, write only
query model: read only query operations
separation of models allows independent development and scaling
strategies
application to event sourcing:
command model creates events
query model interacts with materialized state representation
explicit eventual consistency
easy scaling of query model, loose coupling of components
Deployment/Infrastructure
Backend Endpoints
Proxies
endpoint for HTTP requests
main task: forwarding to server, usually also caching
location: near the client
purpose:
scalability
filtering
source obfuscation
Reverse Proxies:
location: near the server
purpose:
hides backend structure
filtering, caching, compression
security: TLS endpoint, maybe even authentication
load balancing
logging
Popular software:
apache
nginx
HAProxy
Load Balancing
goal: scalability
distribute incoming requests among multiple servers
Two design dimensions:
Request routing
naming: Resolve same DNS name to multiple IP addresses, usually in a
round-robin strategy. This resolves the DNS resolution equally, but this
might not balance the load equally. It would also be feasible here to
distribute based on source IP.
Application: A reverse proxy distributes HTTP requests. Here, a load
based distribution strategy might be possible, if the proxy has access
to load parameters of servers. Session stickiness ensures that all the
requests of a single session reach the same backend server. This
simplifies session state handling.
Transport: Reverse proxy distributes TCP connections. This has the
advantage of being protocol agnostic.
distribution strategies
round robin
load based
session based
source based
CDNs
3rd party provider
CDN handles (global) replication
client directly routed to CDN servers
Cloud Computing
Model for enabling on demand access to shared pool of computing
resources, which can be rapidly provisioned with minimal effort and
interaction.
Characteristics of cloud services:
on-demand self service
network access via standard interfaces
resource pooling (multiplexing, virtualization)
elasticity: fast adaptation and scaling
measured service: pay as you go
Deployment Models:
private cloud
community cloud
public cloud
(hybrid models)
Service Models
Infrastructure as a Service
IaaS
computing instance
physical or virtual (typically VM)
initialized using image
network
bridges/routers
storage
disk-level (block storage)
file-level (e.g. S3) (i think this should be “object storage”?)
containers
Platform as a Service PaaS
examples of platforms:
servlet container
FastCGI
abstracts away everything below application level
typically includes adjacent services
Storage:
platforms for storage services
mostly database-like, such as SQL database as a service
caching services (in-memory)
Software as a Service SaaS
maintained application and hardware
e.g. google docs
Virtualization
cloud: typically hypervisor based virtualization
Advanced Topics
Monitoring
allows quantifiable metrics and shows if systems provide intended
service