My Reading List in Computer Systems Area
1968. Virtual memory, processes, and sharing in multics. Commun. ACM, 11(5):306-312. R. C. Daley and J. B. Dennis.
Notes: All problems in computer science can be solved by another level of indirection; share data/procedure rather than sharing whole program; the earliest Linker?
- 1968. The Structure of the "THE"-Multiprogramming System. Communications of the ACM. Edsger W. Dijkstra.
- Notes: The ancestor of monolithic kernel. Layers of abstraction. Earliest description (inventor?) of semaphore (P/V), which is covered in every operating systems textbook.
- 1970. The nucleus of a multiprogramming system. Commun. ACM, 13(4):238-241. P. B. Hansen.
- Notes: The ancestor of microkernel. Everything is a process; use message passing instead of semaphore for protection/flexibility.
- 1974. The unix time-sharing system. Commun. ACM, 17(7):365-375. D. M. Ritchie and K. Thompson.
- Notes: Everything is a file (e.g., plain file, directory, pipe, socket, device, ...). Interactive programming environment improves productivity, in comparison to batch processing systems.
- 1974. Hydra: the kernel of a multiprocessor operating system. Commun. ACM, 17(6):337-345. W. Wulf, E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, and F. Pollack.
- Notes: Everything is an object referenced by capabilities, for protection; separation of mechanism from policy rather than layered design.
- 1975. The protection of information in computer systems. Proceedings of the IEEE, 63(9):1278- 1308. J. Saltzer and M. Schroeder.
- Notes: Protection is mechanism (ACL / capability), security is policy (failsafe default, least privilege).
- 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558-565. L. Lamport.
- Notes: In distributed systems, we are more interested in relative order of events rather than in when the events actually happen. Theory foundation for vector clock, version vector (used in Amazon's Dynamo), ...
- 1978. Using encryption for authentication in large networks of computers. Commun. ACM, 21(12):993-999. R. M. Needham and M. D. Schroeder.
- Notes: Theory foundation for Kerberos.
- 1980. Experience with processes and monitors in mesa. Commun. ACM, 23(2):105-117. B. W. Lampson and D. D. Redell.
- Notes: The monitor abstraction encapsulates mutual exclusion and signaling (communication), and notify as a hint (no context switch).
- 1980. Pilot: an operating system for a personal computer. Commun. ACM, 23(2):81-92. D. D. Redell, Y. K. Dalal, T. R. Horsley, H. C. Lauer, W. C. Lynch, P. R. McJones, H. G. Murray, and S. C. Purcell.
- Notes: Design for single user, so limited protection (for error not for maliciousness) and unfair resource allocation. Steve Jobs stole it for Mac OS.
- 1981. Wsclock - a simple and effective algorithm for virtual memory management. SIGOPS Oper. Syst. Rev., 15(5):87-95. R. W. Carr and J. L. Hennessy.
- Notes: Want to maximize # of active tasks w/o inducing thrashing, and want to have local policy per task and predictable load control.
- 1981. Observations on the development of an operating system. In SOSP '81: Proceedings of the eighth ACM symposium on Operating systems principles. H. C. Lauer.
- Notes: 5-to-7-year-rule, 5-category of OS
- 1982. The byzantine generals problem. ACM Trans. Program. Lang. Syst., 4(3):382-401. L. Lamport, R. Shostak, and M. Pease.
- Notes: Makes reliable system work even in the presence of faulty component (not necessary to know which component is faulty, as long as the whole system work).
- 1983. Process migration in demos/mp. SIGOPS Oper. Syst. Rev., 17(5):110-119, 1983. M. L. Powell and B. P. Miller.
- Notes: OS should support location-independent context (naming, resource used), so that it can support process migration. Another example of "All problems in computer science can be solved by another level of indirection".
- 1983. Hints for computer system design. SIGOPS Oper. Syst. Rev., 17(5):33-48, 1983. B. W. Lampson.
- Notes: Functionality (simple), performance (dedicated,hint,shed load), fault tolerance (simple,end-to-end recovery, log, restartable).
- 1984. Implementing remote procedure calls. ACM Trans. Comput. Syst., 2(1):39-59, 1984. A. D. Birrell and B. J. Nelson.
- Notes: Want same semantics as local procedure call in the presence of networking that may fail.
- 1984. A fast file system for unix. ACM Trans. Comput. Syst., 2(3):181-197, 1984. M. K. McKusick, W. N. Joy, S. J. Leffler, and R. S. Fabry.
- Notes: Exposes disk geometry to os and consider predicted spatial locality (placement and layout).
- 1984.Experience with grapevine: the growth of a distributed system. ACM Trans. Comput. Syst., 2(1):3-23, 1984. M. D. Schroeder, A. D. Birrell, and R. M. Needham.
- Notes: Wants horizontal scalability, and reliability (replication), so tolerate weak consistency
- 1984. End-to-end arguments in system design. ACM Trans. Comput. Syst., 2(4):277-288, 1984. J. H. Saltzer, D. P. Reed, and D. D. Clark.
- Notes: Placement of functionality should be close to endpoint, typically for fault recovery.
- 1985. Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst., 3(1):63-75, 1985. K. M. Chandy and L. Lamport.
- Notes: Use marker to cut snapshot, so to get feasible global state between [snapshot start, snapshot end], instead of exact global state at one time.
- 1985. Design and implementation of the sun network filesystem. In Proceedings of USENIX Summer Techincal Conference, 1985. R. Sandberg, D. Goldberg, S. Kleiman, D. Walsh, and B. Lyon.
- Notes: Make operation idempotent, so easy to recover
- 1986. Vaxcluster: a closely-coupled distributed system. ACM Trans. Comput. Syst., 4(2):130-146, 1986. N. P. Kronenberg, H. M. Levy, and W. D. Strecker.
- Notes: Closely coupled system in the same security domain uses quorum voting to tolerate network partitioning.
- 1987. Debugging parallel programs with instant replay. IEEE Trans. Comput., 36(4):471-482, 1987. T. J. LeBlanc and J. M. Mellor-Crummey.
- Notes: Monitor then replay, just record input and shared variable access order.
- 1987. Machine-independent virtual memory management for paged uniprocessor and multiprocessor architectures. SIGARCH Comput. Archit. News, 15(5):31-39, 1987. R. Rashid, A. Tevanian, M. Young, D. Golub, R. Baron, D. Black, W. Bolosky, and J. Chew.
- Notes: For portability, separate software memory management from hardware management; for message passing performance, remapping and sharing and customized pager for memory object.
- 1988. Scale and performance in a distributed file system. ACM Trans. Comput. Syst., 6(1):51-81, 1988. J. H. Howard, M. L. Kazar, S. G. Menees, D. A. Nichols, M. Satyanarayanan, R. N. Sidebotham, and M. J. West.
- Notes: AFS! Wants scalability, so reduce network i/o (whole file caching, callback) and server workload (path resolution, callback).
- 1988. Kerberos: An authentication service for open network systems. In USENIX 1988, Feb. 1988. J. G. Steiner, C. Neuman, and J. I. Schiller.
- Notes: Use authenticator to avoid replay attack and has out of band registration service.
- 1988. The v distributed system. Commun. ACM, 31(3):314-333, 1988. D. Cheriton.
- Notes: Distributed os kernel is a software backplane, providing kernel utilities, naming protocol, and UIO interface. The author invested Google, and became rich. He's the richest professor in the world.
- 1988. A Case for Redundant Arrays of Inexpensive Disks (RAID). SIGMOD. Davtd A Patterson, Garth Gibson, and Randy H Katz
- Notes: Use redundancy to improve performance and fault tolerance.
- 1990. A vmm security kernel for the vax architecture. In Proceedings of the IEEE Computer Society Symposium on Security and Privacy, may 1990. P. Karger, M. Zurko, D. Bonin, A. Mason, and C. Kahn.
- Notes: kernel becomes virtual machine monitor and is trusted, while virtual machine is not trusted.
- 1990. Lightweight remote procedure call. ACM Trans. Comput. Syst., 8(1):37-55, 1990. B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy.
- Notes: optimize common case - local rpc, and small fixed-size messages.
- 1991. High-availability computer systems. Computer, 24(9):39-48, 1991. J. Gray and D. P. Siewiorek.
- Notes: fault tolerance design principles: modularity, fault-fast, independent failure modes, redundancy and fail.
- 1991. Scheduler activations: effective kernel support for the user-level management of parallelism. In SOSP '91: Proceedings of the thirteenth ACM symposium on Operating systems principles, 1991. ACM. T. E. Anderson, B. N. Bershad, E. D. Lazowska, and H. M. Levy.
- Notes: expose information across user-kernel boundary
- 1992. The design and implementation of a log-structured file system. ACM Trans. Comput. Syst., 10(1):26-52, 1992. M. Rosenblum and J. K. Ousterhout.
- Notes: separate disk writing from long term layout; use known locality at write instead of predicted locality.
- 1993. Efficient software-based fault isolation. In SOSP '93: Proceedings of the fourteenth ACM symposium on Operating systems principles, 1993. ACM. R. Wahbe, S. Lucco, T. E. Anderson, and S. L. Graham.
- Notes: separate the same address space into fault domains, make sure read/write/jmp would not go outside its own fault domain.
- 1994. Lottery scheduling: flexible proportional-share resource management. In OSDI '94: Proceedings of the 1st USENIX conference on Operating Systems Design and Implementation, 1994. USENIX Association. C. A. Waldspurger and W. E. Weihl.
- Notes: use randomness to avoid weird corner case, and independent from hardware details
- 1995. Exokernel: an operating system architecture for application-level resource management. In SOSP '95: Proceedings of the fifteenth ACM symposium on Operating systems principles, 1995. ACM. D. R. Engler, M. F. Kaashoek, and J. O'Toole, Jr.
- Notes: no abstraction is better than having abstraction; separation of resource protection and resource management.
- 1996. The hp autoraid hierarchical storage system. ACM Trans. Comput. Syst., 14(1):108-136, 1996. J. Wilkes, R. Golding, C. Staelin, and T. Sullivan.
- Notes: configuration is hard; data has locality (active/inactive data); so raid1+raid5.
- 1996. Latency and the Quest for Interactivity.
- Notes: 100ms is the target end to end latency for interactive requests.
- 1997. Disco: running commodity operating systems on scalable multiprocessors. ACM Trans. Comput. Syst., 15(4):412-447, 1997. E. Bugnion, S. Devine, K. Govil, and M. Rosenblum.
- Notes: use vmm to hide hardware changes, handle virtualization overhead by reducing trap overhead, improving sharing, and modifying guest os. This thing becomes Vmware.
- 1997. Eraser: a dynamic data race detector for multithreaded programs. ACM Trans. Comput. Syst., 15(4):391-411, 1997. S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson.
- Notes: Finding violation of locking discipline is easier than finding data race; if a shared variable is not consistently protected whenever accessed, there's potential data race.
- 1997. Flexible update propagation for weakly consistent replication. In SOSP '97: Proceedings of the sixteenth ACM symposium on Operating systems principles, 1997. ACM. K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J. Demers.
- Notes: need lightweight replica propagation method; contact arbitrary peers to transfer write logs, based on partial ordering (prefix property).
- 1999. Borrowed-Virtual-Time (BVT) scheduling: supporting latency-sensitive threads in a general-purpose scheduler. Kenneth J. Duda and David R. Cheriton.
- Notes: provides well-defined behavior relative to other tasks and have applications map the requirements onto this measure by tuning parameters.
- 2000. CAP theorem. PODC keynote.
- Notes: It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: consistency, availability, and network partition tolerance, but you can still have something like 60% consistency, 80% availability, and 70% partition tolerance. It's all about trade-offs.
- 2001. Chord: A scalable peer-to-peer lookup service for internet applications. SIGCOMM'01. I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. 2003. The Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. SOSP 2003.
- Notes: keep suitable amount of state (finger table); each node forwards query at least halfway along distance remaining to the target.
- 2004. MapReduce: Simplied Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI 2004.
- 2006. Bigtable: A Distributed Storage System for Structured Data. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber. OSDI 2006.
- 2007. Architecture of a Database System. Joseph M. Hellerstein, Michael Stonebraker and James Hamilton.
- 2007. Dynamo: Amazon's Highly Available Key-Value Store. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. SIGOPS Oper. Syst. Rev., 41(6):205-220.
- Notes: Trade weak consistency for better availability, scalability, performance, allow application to do conflict conciliation, combo of a lot of existing ideas and best practices, nothing fundamentally new. See Linkedin's open source version - http://www.project-voldemort.com/voldemort/design.html
2012. Spanner: Google’s Globally-Distributed Database. James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford
2013. An Analysis of Facebook Photo Caching. Qi Huang, Ken Birman, Robbert van Renesse, Wyatt Lloyd, Sanjeev Kumar, and Harry C. Li. SOSP 2013.
- Notes: This is an example of data-driven software development, which is a good model to follow for us.
- 2013. The Tail at Scale. Jeffrey Dean, Luiz André Barroso. CACM.