Distributed system - ✔✔ A collection of autonomous computing elements that appears to its users as a single coherent system
Nodes - ✔✔ Computers in a distributed system
Middleware - ✔✔ The "operating system" of a distributed system: it contains commonly used components and functions that need not be implemented separately by each application
Characteristics of a distributed system - ✔✔ - components are independent
- concurrency of components sharing resources
- lack of global clock
- units may fail independently
Goal of distributed systems - ✔✔ - better reliability
- better performance
Design goals of a distributed system - ✔✔ - support sharing of resources
- distribution transparency: hiding that the system's processes and resources are distributed across multiple machines
- openness: ability to extend the system and reimplement its components
- scalability: ability to expand
Types of distribution transparency - ✔✔ - access: hide differences in data representation & how an object is accessed
- location: hide where object is located
- relocation: hide that object may be moved to another location while in use
- migration: hide that object may move to another location
- replication: hide that object is replicated
- concurrency: hide that object may be shared by several independent users
- failure: hide failure & recovery of object
Degrees of transparency - ✔✔ - completely hiding failures is impossible
- full transparency costs performance, which in turn exposes the distribution of the system
Types of scalability - ✔✔ - size: number of users & processes
- geographical: maximum distance between nodes
- administrative: number of administrative domains
Scalability problems - ✔✔ - computational capacity bounded by CPUs
- storage capacity & transfer rate between CPUs and disks
- network delays
Latency - ✔✔ End-to-end delay: the time it takes for data to travel from source to destination
Bandwidth - ✔✔ Amount of data that can be transmitted in fixed period of time
Throughput - ✔✔ Actual amount of data that is successfully sent/received over a link
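How the three terms above relate can be sketched with a little arithmetic. This is an idealized model that ignores protocol overhead and congestion; the function names and the example numbers are illustrative assumptions, not part of the notes.

```python
def transfer_time(size_bits: float, bandwidth_bps: float, latency_s: float) -> float:
    """Idealized delivery time: one-way latency plus the time to push all bits onto the link."""
    return latency_s + size_bits / bandwidth_bps


def throughput(size_bits: float, elapsed_s: float) -> float:
    """Achieved rate: bits actually delivered divided by the time it took."""
    return size_bits / elapsed_s


# Assumed example: an 8 Mbit message over a 100 Mbit/s link with 20 ms latency.
elapsed = transfer_time(8_000_000, 100_000_000, 0.020)
achieved = throughput(8_000_000, elapsed)
```

In this model the achieved throughput is always below the link bandwidth, because latency adds time during which no payload bits are delivered.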
Layered systems - ✔✔ Each layer in the system implements a service
Purpose: divide-and-conquer approach to dealing with complex systems
TCP/IP stack - ✔✔ 1. Application
2. Transport
3. Network
4. Link
5. Physical
Interface - ✔✔ Communication between upper & lower layer
Network structure - ✔✔ - network edge: hosts & servers
- physical media
- network core: routers & subnetworks
Packet-switching - ✔✔ - store-and-forward: entire packet must arrive at router before it can be transmitted on link
- queueing delay & loss: if the arrival rate to a link exceeds the transmission rate of the link for a period of time, packets will queue, and will be dropped if the buffer fills up
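Both packet-switching effects above can be sketched numerically. The first function assumes every link has the same rate and ignores propagation, processing, and queueing delay; the second is a toy tick-based buffer model. All names and parameters are hypothetical, chosen for illustration.

```python
def store_and_forward_delay(packet_bits: float, link_rate_bps: float, num_links: int) -> float:
    """Transmission delay for one packet over num_links equal-rate links.

    Each router must receive the entire packet before forwarding it,
    so the packet is transmitted in full once per link.
    """
    return num_links * packet_bits / link_rate_bps


def simulate_queue(arrivals_per_tick: int, served_per_tick: int,
                   buffer_size: int, ticks: int) -> tuple[int, int]:
    """Toy router buffer: returns (packets still queued, packets dropped)."""
    queued = 0
    dropped = 0
    for _ in range(ticks):
        queued += arrivals_per_tick
        if queued > buffer_size:              # buffer full: the excess is lost
            dropped += queued - buffer_size
            queued = buffer_size
        queued = max(0, queued - served_per_tick)
    return queued, dropped
```

With arrivals below the service rate the queue stays empty; once arrivals exceed it, the buffer fills and every further excess packet is dropped.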
Routing - ✔✔ Determining source-destination route taken by packets
Client-server concept - ✔✔ each endpoint is a running program (servers wait for incoming connections & provide a service; clients make connections to servers)
2 types of communication - ✔✔ - Streams (TCP): reliable communication & continuous stream of bytes
- Datagrams (UDP): unreliable & discrete packets
Sockets - ✔✔ programming abstraction that represents the endpoints of network communication & encapsulates the implementation of network and transport layer protocols
Multicast - ✔✔ simultaneous transmission of messages to several recipients
UDP multicasting - ✔✔ allows sending messages only to interested parties
TCP - ✔✔ connection-oriented; reliable; bidirectional; error recovery; in-order delivery; high overhead
UDP - ✔✔ connectionless; unreliable; segments can be lost or arrive out of order; very small overhead
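The difference between the two socket types can be sketched on the loopback interface using Python's standard `socket` module. The helper names are hypothetical; the sketch assumes a short message that fits in a single `recv` call, and it does not demonstrate loss or reordering, which are unlikely on loopback.

```python
import socket
import threading


def tcp_echo_once(message: bytes = b"hello") -> bytes:
    """Send one message over a TCP stream and return the echoed bytes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()          # connection-oriented: accept first
        with conn:
            conn.sendall(conn.recv(1024))  # echo what was received

    worker = threading.Thread(target=serve)
    worker.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)
    worker.join()
    server.close()
    return reply


def udp_send_once(message: bytes = b"hello") -> bytes:
    """Send one datagram over UDP and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(5)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(message, receiver.getsockname())  # connectionless: no handshake
    data, _ = receiver.recvfrom(1024)               # one call returns one whole datagram
    sender.close()
    receiver.close()
    return data
```

The TCP version needs `listen`, `accept`, and a connection before any data flows; the UDP version just addresses each datagram individually.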
Concurrency - ✔✔ multiple processes are making progress at the same time or at least seemingly at the same time
Concurrent activities - ✔✔ multiprocessing and multithreading
Process - ✔✔ instance of computer program that is being executed = executable program + resources + context(state) - heavyweight
2 categories of processes - ✔✔ I/O-bound & CPU-bound
Thread - ✔✔ entity within a process that can be scheduled for execution; the smallest unit of processing that the operating system can schedule.
Race condition - ✔✔ multiple threads concurrently change shared data, so the result depends on the timing of their execution. To avoid race conditions, threads need to be synchronized
Critical region - ✔✔ piece of code that accesses shared data and can therefore introduce a race condition.
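A minimal sketch of protecting a critical region with a lock, using Python's `threading` module. The class and function names are hypothetical; the shared counter stands in for any shared data.

```python
import threading


class SafeCounter:
    """Shared counter whose update is guarded by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:          # only one thread at a time enters the critical region
            self.value += 1       # read-modify-write on shared data


def count_with_threads(num_threads: int = 4, increments_each: int = 10_000) -> int:
    counter = SafeCounter()

    def work() -> None:
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the final value always equals `num_threads * increments_each`; without it, updates from different threads could overwrite each other.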
Producer/consumer model - ✔✔ producer threads add to a shared data structure; consumer threads remove from it; only 1 party has access to the structure at any time. A queue can be used as the shared structure: it supports multiple producers and multiple consumers and implements the lock semantics
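A sketch of the model using Python's thread-safe `queue.Queue`, which does its own locking. The function name, the sentinel convention for shutting down consumers, and the item format are assumptions made for illustration.

```python
import queue
import threading


def run_producer_consumer(num_producers: int = 2, num_consumers: int = 2,
                          items_each: int = 5) -> list:
    shared = queue.Queue(maxsize=4)       # bounded: producers block when it is full
    results = []
    results_lock = threading.Lock()
    stop = None                           # sentinel telling a consumer to exit

    def producer(pid: int) -> None:
        for i in range(items_each):
            shared.put((pid, i))          # add to the shared structure

    def consumer() -> None:
        while True:
            item = shared.get()           # remove from the shared structure
            if item is stop:
                break
            with results_lock:
                results.append(item)

    producers = [threading.Thread(target=producer, args=(p,)) for p in range(num_producers)]
    consumers = [threading.Thread(target=consumer) for _ in range(num_consumers)]
    for t in producers + consumers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        shared.put(stop)                  # one sentinel per consumer
    for t in consumers:
        t.join()
    return sorted(results)
```

Every produced item is consumed exactly once, although the order in which consumers see items is not deterministic, hence the sort before returning.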
Global interpreter lock (GIL) - ✔✔ safety mechanism in CPython that prevents multiple threads from executing Python bytecode at the same instant, even if there are multiple CPUs.
Usage of threads in client-server - ✔✔ - thread-per-connection
- thread-per-request
- combined pipeline
- combined delegation
Thread-per-connection - ✔✔ Dealing with each connection in separate thread (manager-workers & producer-consumer patterns)
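One way to sketch thread-per-connection is with the standard library's `socketserver.ThreadingTCPServer`, which starts a new thread for each accepted connection. The handler, the upper-casing "service", and the helper names are hypothetical.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for the lifetime of one client connection."""

    def handle(self) -> None:
        for line in self.rfile:               # read lines until the client disconnects
            self.wfile.write(line.upper())


def start_server() -> socketserver.ThreadingTCPServer:
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), UpperHandler)
    server.daemon_threads = True              # connection threads do not block exit
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(port: int, text: str) -> str:
    """Open a connection, send one line, and return the reply line."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(text.encode() + b"\n")
        return sock.makefile("rb").readline().decode().strip()
```

A caller would start the server, read the chosen port from `server.server_address[1]`, call `ask`, and finally call `server.shutdown()` and `server.server_close()`.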
Thread-per-request - ✔✔ each request in separate thread
Combined pipeline - ✔✔ workers are separated into receivers, handlers, and senders; each group of workers puts tasks into a queue for the next group
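The pipeline idea can be sketched in memory, without real network I/O, as three stages connected by queues. The stage bodies (strip, upper-case, collect) are placeholders, and using one thread per stage is a simplifying assumption that keeps the output order deterministic.

```python
import queue
import threading


def run_pipeline(raw_requests: list) -> list:
    to_handle = queue.Queue()     # receivers -> handlers
    to_send = queue.Queue()       # handlers -> senders
    sent = []
    stop = object()               # sentinel passed down the pipeline

    def receiver() -> None:
        for raw in raw_requests:
            to_handle.put(raw.strip())        # stand-in for reading a request
        to_handle.put(stop)

    def handler() -> None:
        while True:
            request = to_handle.get()
            if request is stop:
                to_send.put(stop)
                break
            to_send.put(request.upper())      # stand-in for processing

    def sender() -> None:
        while True:
            response = to_send.get()
            if response is stop:
                break
            sent.append(response)             # stand-in for writing a response

    stages = [threading.Thread(target=f) for f in (receiver, handler, sender)]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return sent
```

Each stage only knows about the queue it reads from and the queue it writes to, which is what lets the groups of workers be sized independently.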
Combined delegation - ✔✔ group managers + on-demand workers: main, request-manager, receiver-manager, sender-manager; each manager creates additional threads for handling its tasks
Remote procedure call (RPC) - ✔✔ allows programs to call procedures located on other machines
Challenges of RPC - ✔✔ - different machines
- different address spaces
- different data structures
- failure management
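As one concrete illustration of RPC, Python's standard `xmlrpc` modules let a client call a function that runs in a server. This is only one RPC mechanism among many; the `add` procedure and the helper names are hypothetical, and the sketch runs both sides on the loopback interface.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add(a: int, b: int) -> int:
    """Procedure that lives on the server side."""
    return a + b


def start_rpc_server() -> SimpleXMLRPCServer:
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")      # expose the procedure under the name "add"
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_remote_add(port: int, a: int, b: int) -> int:
    """Looks like a local call, but the arguments travel over HTTP as XML."""
    with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.add(a, b)
```

The arguments are marshalled into a machine-independent format before being sent, which addresses the "different data structures" challenge; failures such as an unreachable server surface on the client as exceptions that the caller has to handle.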