mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-26 15:59:20 +07:00
1bd22e374b
The version numbers change too quickly, so use a canonical URL that represents the most recent version. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-16-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
876 lines
28 KiB
Plaintext
876 lines
28 KiB
Plaintext
Read the F-ing Papers!
|
|
|
|
|
|
This document describes RCU-related publications, and is followed by
|
|
the corresponding bibtex entries. A number of the publications may
|
|
be found at http://www.rdrop.com/users/paulmck/RCU/.
|
|
|
|
The first thing resembling RCU was published in 1980, when Kung and Lehman
|
|
[Kung80] recommended use of a garbage collector to defer destruction
|
|
of nodes in a parallel binary search tree in order to simplify its
|
|
implementation. This works well in environments that have garbage
|
|
collectors, but most production garbage collectors incur significant
|
|
overhead.
|
|
|
|
In 1982, Manber and Ladner [Manber82,Manber84] recommended deferring
|
|
destruction until all threads running at that time have terminated, again
|
|
for a parallel binary search tree. This approach works well in systems
|
|
with short-lived threads, such as the K42 research operating system.
|
|
However, Linux has long-lived tasks, so more is needed.
|
|
|
|
In 1986, Hennessy, Osisek, and Seigh [Hennessy89] introduced passive
|
|
serialization, which is an RCU-like mechanism that relies on the presence
|
|
of "quiescent states" in the VM/XA hypervisor that are guaranteed not
|
|
to be referencing the data structure. However, this mechanism was not
|
|
optimized for modern computer systems, which is not surprising given
|
|
that these overheads were not so expensive in the mid-80s. Nonetheless,
|
|
passive serialization appears to be the first deferred-destruction
|
|
mechanism to be used in production. Furthermore, the relevant patent
|
|
has lapsed, so this approach may be used in non-GPL software, if desired.
|
|
(In contrast, implementation of RCU is permitted only in software licensed
|
|
under either GPL or LGPL. Sorry!!!)
|
|
|
|
In 1990, Pugh [Pugh90] noted that explicitly tracking which threads
|
|
were reading a given data structure permitted deferred free to operate
|
|
in the presence of non-terminating threads. However, this explicit
|
|
tracking imposes significant read-side overhead, which is undesirable
|
|
in read-mostly situations. This algorithm does take pains to avoid
|
|
write-side contention and parallelize the other write-side overheads by
|
|
providing a fine-grained locking design, however, it would be interesting
|
|
to see how much of the performance advantage reported in 1990 remains
|
|
in 2004.
|
|
|
|
At about this same time, Adams [Adams91] described ``chaotic relaxation'',
|
|
where the normal barriers between successive iterations of convergent
|
|
numerical algorithms are relaxed, so that iteration $n$ might use
|
|
data from iteration $n-1$ or even $n-2$. This introduces error,
|
|
which typically slows convergence and thus increases the number of
|
|
iterations required. However, this increase is sometimes more than made
|
|
up for by a reduction in the number of expensive barrier operations,
|
|
which are otherwise required to synchronize the threads at the end
|
|
of each iteration. Unfortunately, chaotic relaxation requires highly
|
|
structured data, such as the matrices used in scientific programs, and
|
|
is thus inapplicable to most data structures in operating-system kernels.
|
|
|
|
In 1992, Henry (now Alexia) Massalin completed a dissertation advising
|
|
parallel programmers to defer processing when feasible to simplify
|
|
synchronization. RCU makes extremely heavy use of this advice.
|
|
|
|
In 1993, Jacobson [Jacobson93] verbally described what is perhaps the
|
|
simplest deferred-free technique: simply waiting a fixed amount of time
|
|
before freeing blocks awaiting deferred free. Jacobson did not describe
|
|
any write-side changes he might have made in this work using SGI's Irix
|
|
kernel. Aju John published a similar technique in 1995 [AjuJohn95].
|
|
This works well if there is a well-defined upper bound on the length of
|
|
time that reading threads can hold references, as there might well be in
|
|
hard real-time systems. However, if this time is exceeded, perhaps due
|
|
to preemption, excessive interrupts, or larger-than-anticipated load,
|
|
memory corruption can ensue, with no reasonable means of diagnosis.
|
|
Jacobson's technique is therefore inappropriate for use in production
|
|
operating-system kernels, except when such kernels can provide hard
|
|
real-time response guarantees for all operations.
|
|
|
|
Also in 1995, Pu et al. [Pu95a] applied a technique similar to that of Pugh's
|
|
read-side-tracking to permit replugging of algorithms within a commercial
|
|
Unix operating system. However, this replugging permitted only a single
|
|
reader at a time. The following year, this same group of researchers
|
|
extended their technique to allow for multiple readers [Cowan96a].
|
|
Their approach requires memory barriers (and thus pipeline stalls),
|
|
but reduces memory latency, contention, and locking overheads.
|
|
|
|
1995 also saw the first publication of DYNIX/ptx's RCU mechanism
|
|
[Slingwine95], which was optimized for modern CPU architectures,
|
|
and was successfully applied to a number of situations within the
|
|
DYNIX/ptx kernel. The corresponding conference paper appeared in 1998
|
|
[McKenney98].
|
|
|
|
In 1999, the Tornado and K42 groups described their "generations"
|
|
mechanism, which quite similar to RCU [Gamsa99]. These operating systems
|
|
made pervasive use of RCU in place of "existence locks", which greatly
|
|
simplifies locking hierarchies.
|
|
|
|
2001 saw the first RCU presentation involving Linux [McKenney01a]
|
|
at OLS. The resulting abundance of RCU patches was presented the
|
|
following year [McKenney02a], and use of RCU in dcache was first
|
|
described that same year [Linder02a].
|
|
|
|
Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
|
|
techniques that defer the destruction of data structures to simplify
|
|
non-blocking synchronization (wait-free synchronization, lock-free
|
|
synchronization, and obstruction-free synchronization are all examples of
|
|
non-blocking synchronization). In particular, this technique eliminates
|
|
locking, reduces contention, reduces memory latency for readers, and
|
|
parallelizes pipeline stalls and memory latency for writers. However,
|
|
these techniques still impose significant read-side overhead in the
|
|
form of memory barriers. Researchers at Sun worked along similar lines
|
|
in the same timeframe [HerlihyLM02]. These techniques can be thought
|
|
of as inside-out reference counts, where the count is represented by the
|
|
number of hazard pointers referencing a given data structure (rather than
|
|
the more conventional counter field within the data structure itself).
|
|
|
|
By the same token, RCU can be thought of as a "bulk reference count",
|
|
where some form of reference counter covers all reference by a given CPU
|
|
or thread during a set timeframe. This timeframe is related to, but
|
|
not necessarily exactly the same as, an RCU grace period. In classic
|
|
RCU, the reference counter is the per-CPU bit in the "bitmask" field,
|
|
and each such bit covers all references that might have been made by
|
|
the corresponding CPU during the prior grace period. Of course, RCU
|
|
can be thought of in other terms as well.
|
|
|
|
In 2003, the K42 group described how RCU could be used to create
|
|
hot-pluggable implementations of operating-system functions [Appavoo03a].
|
|
Later that year saw a paper describing an RCU implementation of System
|
|
V IPC [Arcangeli03], and an introduction to RCU in Linux Journal
|
|
[McKenney03a].
|
|
|
|
2004 has seen a Linux-Journal article on use of RCU in dcache
|
|
[McKenney04a], a performance comparison of locking to RCU on several
|
|
different CPUs [McKenney04b], a dissertation describing use of RCU in a
|
|
number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper
|
|
describing how to make RCU safe for soft-realtime applications [Sarma04c],
|
|
and a paper describing SELinux performance with RCU [JamesMorris04b].
|
|
|
|
2005 brought further adaptation of RCU to realtime use, permitting
|
|
preemption of RCU realtime critical sections [PaulMcKenney05a,
|
|
PaulMcKenney05b].
|
|
|
|
2006 saw the first best-paper award for an RCU paper [ThomasEHart2006a],
|
|
as well as further work on efficient implementations of preemptible
|
|
RCU [PaulEMcKenney2006b], but priority-boosting of RCU read-side critical
|
|
sections proved elusive. An RCU implementation permitting general
|
|
blocking in read-side critical sections appeared [PaulEMcKenney2006c],
|
|
Robert Olsson described an RCU-protected trie-hash combination
|
|
[RobertOlsson2006a].
|
|
|
|
2007 saw the journal version of the award-winning RCU paper from 2006
|
|
[ThomasEHart2007a], as well as a paper demonstrating use of Promela
|
|
and Spin to mechanically verify an optimization to Oleg Nesterov's
|
|
QRCU [PaulEMcKenney2007QRCUspin], a design document describing
|
|
preemptible RCU [PaulEMcKenney2007PreemptibleRCU], and the three-part
|
|
LWN "What is RCU?" series [PaulEMcKenney2007WhatIsRCUFundamentally,
|
|
PaulEMcKenney2008WhatIsRCUUsage, and PaulEMcKenney2008WhatIsRCUAPI].
|
|
|
|
2008 saw a journal paper on real-time RCU [DinakarGuniguntala2008IBMSysJ],
|
|
a history of how Linux changed RCU more than RCU changed Linux
|
|
[PaulEMcKenney2008RCUOSR], and a design overview of hierarchical RCU
|
|
[PaulEMcKenney2008HierarchicalRCU].
|
|
|
|
2009 introduced user-level RCU algorithms [PaulEMcKenney2009MaliciousURCU],
|
|
which Mathieu Desnoyers is now maintaining [MathieuDesnoyers2009URCU]
|
|
[MathieuDesnoyersPhD]. TINY_RCU [PaulEMcKenney2009BloatWatchRCU] made
|
|
its appearance, as did expedited RCU [PaulEMcKenney2009expeditedRCU].
|
|
The problem of resizeable RCU-protected hash tables may now be on a path
|
|
to a solution [JoshTriplett2009RPHash].
|
|
|
|
Bibtex Entries
|
|
|
|
@article{Kung80
|
|
,author="H. T. Kung and Q. Lehman"
|
|
,title="Concurrent Maintenance of Binary Search Trees"
|
|
,Year="1980"
|
|
,Month="September"
|
|
,journal="ACM Transactions on Database Systems"
|
|
,volume="5"
|
|
,number="3"
|
|
,pages="354-382"
|
|
}
|
|
|
|
@techreport{Manber82
|
|
,author="Udi Manber and Richard E. Ladner"
|
|
,title="Concurrency Control in a Dynamic Search Structure"
|
|
,institution="Department of Computer Science, University of Washington"
|
|
,address="Seattle, Washington"
|
|
,year="1982"
|
|
,number="82-01-01"
|
|
,month="January"
|
|
,pages="28"
|
|
}
|
|
|
|
@article{Manber84
|
|
,author="Udi Manber and Richard E. Ladner"
|
|
,title="Concurrency Control in a Dynamic Search Structure"
|
|
,Year="1984"
|
|
,Month="September"
|
|
,journal="ACM Transactions on Database Systems"
|
|
,volume="9"
|
|
,number="3"
|
|
,pages="439-455"
|
|
}
|
|
|
|
@techreport{Hennessy89
|
|
,author="James P. Hennessy and Damian L. Osisek and Joseph W. {Seigh II}"
|
|
,title="Passive Serialization in a Multitasking Environment"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1989"
|
|
,number="US Patent 4,809,168 (lapsed)"
|
|
,month="February"
|
|
,pages="11"
|
|
}
|
|
|
|
@techreport{Pugh90
|
|
,author="William Pugh"
|
|
,title="Concurrent Maintenance of Skip Lists"
|
|
,institution="Institute of Advanced Computer Science Studies, Department of Computer Science, University of Maryland"
|
|
,address="College Park, Maryland"
|
|
,year="1990"
|
|
,number="CS-TR-2222.1"
|
|
,month="June"
|
|
}
|
|
|
|
@Book{Adams91
|
|
,Author="Gregory R. Adams"
|
|
,title="Concurrent Programming, Principles, and Practices"
|
|
,Publisher="Benjamin Cummins"
|
|
,Year="1991"
|
|
}
|
|
|
|
@phdthesis{HMassalinPhD
|
|
,author="H. Massalin"
|
|
,title="Synthesis: An Efficient Implementation of Fundamental Operating
|
|
System Services"
|
|
,school="Columbia University"
|
|
,address="New York, NY"
|
|
,year="1992"
|
|
,annotation="
|
|
Mondo optimizing compiler.
|
|
Wait-free stuff.
|
|
Good advice: defer work to avoid synchronization.
|
|
"
|
|
}
|
|
|
|
@unpublished{Jacobson93
|
|
,author="Van Jacobson"
|
|
,title="Avoid Read-Side Locking Via Delayed Free"
|
|
,year="1993"
|
|
,month="September"
|
|
,note="Verbal discussion"
|
|
}
|
|
|
|
@Conference{AjuJohn95
|
|
,Author="Aju John"
|
|
,Title="Dynamic vnodes -- Design and Implementation"
|
|
,Booktitle="{USENIX Winter 1995}"
|
|
,Publisher="USENIX Association"
|
|
,Month="January"
|
|
,Year="1995"
|
|
,pages="11-23"
|
|
,Address="New Orleans, LA"
|
|
}
|
|
|
|
@conference{Pu95a,
|
|
Author = "Calton Pu and Tito Autrey and Andrew Black and Charles Consel and
|
|
Crispin Cowan and Jon Inouye and Lakshmi Kethana and Jonathan Walpole and
|
|
Ke Zhang",
|
|
Title = "Optimistic Incremental Specialization: Streamlining a Commercial
|
|
Operating System",
|
|
Booktitle = "15\textsuperscript{th} ACM Symposium on
|
|
Operating Systems Principles (SOSP'95)",
|
|
address = "Copper Mountain, CO",
|
|
month="December",
|
|
year="1995",
|
|
pages="314-321",
|
|
annotation="
|
|
Uses a replugger, but with a flag to signal when people are
|
|
using the resource at hand. Only one reader at a time.
|
|
"
|
|
}
|
|
|
|
@conference{Cowan96a,
|
|
Author = "Crispin Cowan and Tito Autrey and Charles Krasic and
|
|
Calton Pu and Jonathan Walpole",
|
|
Title = "Fast Concurrent Dynamic Linking for an Adaptive Operating System",
|
|
Booktitle = "International Conference on Configurable Distributed Systems
|
|
(ICCDS'96)",
|
|
address = "Annapolis, MD",
|
|
month="May",
|
|
year="1996",
|
|
pages="108",
|
|
isbn="0-8186-7395-8",
|
|
annotation="
|
|
Uses a replugger, but with a counter to signal when people are
|
|
using the resource at hand. Allows multiple readers.
|
|
"
|
|
}
|
|
|
|
@techreport{Slingwine95
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and Method for Achieving Reduced Overhead Mutual
|
|
Exclusion and Maintaining Coherency in a Multiprocessor System
|
|
Utilizing Execution History and Thread Monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1995"
|
|
,number="US Patent 5,442,758 (contributed under GPL)"
|
|
,month="August"
|
|
}
|
|
|
|
@techreport{Slingwine97
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Method for maintaining data coherency using thread
|
|
activity summaries in a multicomputer system"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1997"
|
|
,number="US Patent 5,608,893 (contributed under GPL)"
|
|
,month="March"
|
|
}
|
|
|
|
@techreport{Slingwine98
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and method for achieving reduced overhead
|
|
mutual exclusion and maintaining coherency in a multiprocessor
|
|
system utilizing execution history and thread monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="1998"
|
|
,number="US Patent 5,727,209 (contributed under GPL)"
|
|
,month="March"
|
|
}
|
|
|
|
@Conference{McKenney98
|
|
,Author="Paul E. McKenney and John D. Slingwine"
|
|
,Title="Read-Copy Update: Using Execution History to Solve Concurrency
|
|
Problems"
|
|
,Booktitle="{Parallel and Distributed Computing and Systems}"
|
|
,Month="October"
|
|
,Year="1998"
|
|
,pages="509-518"
|
|
,Address="Las Vegas, NV"
|
|
}
|
|
|
|
@Conference{Gamsa99
|
|
,Author="Ben Gamsa and Orran Krieger and Jonathan Appavoo and Michael Stumm"
|
|
,Title="Tornado: Maximizing Locality and Concurrency in a Shared Memory
|
|
Multiprocessor Operating System"
|
|
,Booktitle="{Proceedings of the 3\textsuperscript{rd} Symposium on
|
|
Operating System Design and Implementation}"
|
|
,Month="February"
|
|
,Year="1999"
|
|
,pages="87-100"
|
|
,Address="New Orleans, LA"
|
|
}
|
|
|
|
@techreport{Slingwine01
|
|
,author="John D. Slingwine and Paul E. McKenney"
|
|
,title="Apparatus and method for achieving reduced overhead
|
|
mutual exclusion and maintaining coherency in a multiprocessor
|
|
system utilizing execution history and thread monitoring"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="2001"
|
|
,number="US Patent 5,219,690 (contributed under GPL)"
|
|
,month="April"
|
|
}
|
|
|
|
@Conference{McKenney01a
|
|
,Author="Paul E. McKenney and Jonathan Appavoo and Andi Kleen and
|
|
Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
|
|
,Title="Read-Copy Update"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="July"
|
|
,Year="2001"
|
|
,note="Available:
|
|
\url{http://www.linuxsymposium.org/2001/abstracts/readcopy.php}
|
|
\url{http://www.rdrop.com/users/paulmck/rclock/rclock_OLS.2001.05.01c.pdf}
|
|
[Viewed June 23, 2004]"
|
|
annotation="
|
|
Described RCU, and presented some patches implementing and using it in
|
|
the Linux kernel.
|
|
"
|
|
}
|
|
|
|
@Conference{Linder02a
|
|
,Author="Hanna Linder and Dipankar Sarma and Maneesh Soni"
|
|
,Title="Scalability of the Directory Entry Cache"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="June"
|
|
,Year="2002"
|
|
,pages="289-300"
|
|
}
|
|
|
|
@Conference{McKenney02a
|
|
,Author="Paul E. McKenney and Dipankar Sarma and
|
|
Andrea Arcangeli and Andi Kleen and Orran Krieger and Rusty Russell"
|
|
,Title="Read-Copy Update"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="June"
|
|
,Year="2002"
|
|
,pages="338-367"
|
|
,note="Available:
|
|
\url{http://www.linux.org.uk/~ajh/ols2002_proceedings.pdf.gz}
|
|
[Viewed June 23, 2004]"
|
|
}
|
|
|
|
@conference{Michael02a
|
|
,author="Maged M. Michael"
|
|
,title="Safe Memory Reclamation for Dynamic Lock-Free Objects Using Atomic
|
|
Reads and Writes"
|
|
,Year="2002"
|
|
,Month="August"
|
|
,booktitle="{Proceedings of the 21\textsuperscript{st} Annual ACM
|
|
Symposium on Principles of Distributed Computing}"
|
|
,pages="21-30"
|
|
,annotation="
|
|
Each thread keeps an array of pointers to items that it is
|
|
currently referencing. Sort of an inside-out garbage collection
|
|
mechanism, but one that requires the accessing code to explicitly
|
|
state its needs. Also requires read-side memory barriers on
|
|
most architectures.
|
|
"
|
|
}
|
|
|
|
@conference{Michael02b
|
|
,author="Maged M. Michael"
|
|
,title="High Performance Dynamic Lock-Free Hash Tables and List-Based Sets"
|
|
,Year="2002"
|
|
,Month="August"
|
|
,booktitle="{Proceedings of the 14\textsuperscript{th} Annual ACM
|
|
Symposium on Parallel
|
|
Algorithms and Architecture}"
|
|
,pages="73-82"
|
|
,annotation="
|
|
Like the title says...
|
|
"
|
|
}
|
|
|
|
@InProceedings{HerlihyLM02
|
|
,author={Maurice Herlihy and Victor Luchangco and Mark Moir}
|
|
,title="The Repeat Offender Problem: A Mechanism for Supporting Dynamic-Sized,
|
|
Lock-Free Data Structures"
|
|
,booktitle={Proceedings of 16\textsuperscript{th} International
|
|
Symposium on Distributed Computing}
|
|
,year=2002
|
|
,month="October"
|
|
,pages="339-353"
|
|
}
|
|
|
|
@article{Appavoo03a
|
|
,author="J. Appavoo and K. Hui and C. A. N. Soules and R. W. Wisniewski and
|
|
D. M. {Da Silva} and O. Krieger and M. A. Auslander and D. J. Edelsohn and
|
|
B. Gamsa and G. R. Ganger and P. McKenney and M. Ostrowski and
|
|
B. Rosenburg and M. Stumm and J. Xenidis"
|
|
,title="Enabling Autonomic Behavior in Systems Software With Hot Swapping"
|
|
,Year="2003"
|
|
,Month="January"
|
|
,journal="IBM Systems Journal"
|
|
,volume="42"
|
|
,number="1"
|
|
,pages="60-76"
|
|
}
|
|
|
|
@Conference{Arcangeli03
|
|
,Author="Andrea Arcangeli and Mingming Cao and Paul E. McKenney and
|
|
Dipankar Sarma"
|
|
,Title="Using Read-Copy Update Techniques for {System V IPC} in the
|
|
{Linux} 2.5 Kernel"
|
|
,Booktitle="Proceedings of the 2003 USENIX Annual Technical Conference
|
|
(FREENIX Track)"
|
|
,Publisher="USENIX Association"
|
|
,year="2003"
|
|
,month="June"
|
|
,pages="297-310"
|
|
}
|
|
|
|
@article{McKenney03a
|
|
,author="Paul E. McKenney"
|
|
,title="Using {RCU} in the {Linux} 2.5 Kernel"
|
|
,Year="2003"
|
|
,Month="October"
|
|
,journal="Linux Journal"
|
|
,volume="1"
|
|
,number="114"
|
|
,pages="18-26"
|
|
}
|
|
|
|
@techreport{Friedberg03a
|
|
,author="Stuart A. Friedberg"
|
|
,title="Lock-Free Wild Card Search Data Structure and Method"
|
|
,institution="US Patent and Trademark Office"
|
|
,address="Washington, DC"
|
|
,year="2003"
|
|
,number="US Patent 6,662,184 (contributed under GPL)"
|
|
,month="December"
|
|
,pages="112"
|
|
}
|
|
|
|
@article{McKenney04a
|
|
,author="Paul E. McKenney and Dipankar Sarma and Maneesh Soni"
|
|
,title="Scaling dcache with {RCU}"
|
|
,Year="2004"
|
|
,Month="January"
|
|
,journal="Linux Journal"
|
|
,volume="1"
|
|
,number="118"
|
|
,pages="38-46"
|
|
}
|
|
|
|
@Conference{McKenney04b
|
|
,Author="Paul E. McKenney"
|
|
,Title="{RCU} vs. Locking Performance on Different {CPUs}"
|
|
,Booktitle="{linux.conf.au}"
|
|
,Month="January"
|
|
,Year="2004"
|
|
,Address="Adelaide, Australia"
|
|
,note="Available:
|
|
\url{http://www.linux.org.au/conf/2004/abstracts.html#90}
|
|
\url{http://www.rdrop.com/users/paulmck/rclock/lockperf.2004.01.17a.pdf}
|
|
[Viewed June 23, 2004]"
|
|
}
|
|
|
|
@phdthesis{PaulEdwardMcKenneyPhD
|
|
,author="Paul E. McKenney"
|
|
,title="Exploiting Deferred Destruction:
|
|
An Analysis of Read-Copy-Update Techniques
|
|
in Operating System Kernels"
|
|
,school="OGI School of Science and Engineering at
|
|
Oregon Health and Sciences University"
|
|
,year="2004"
|
|
,note="Available:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/RCUdissertation.2004.07.14e1.pdf}
|
|
[Viewed October 15, 2004]"
|
|
}
|
|
|
|
@Conference{Sarma04c
|
|
,Author="Dipankar Sarma and Paul E. McKenney"
|
|
,Title="Making RCU Safe for Deep Sub-Millisecond Response Realtime Applications"
|
|
,Booktitle="Proceedings of the 2004 USENIX Annual Technical Conference
|
|
(FREENIX Track)"
|
|
,Publisher="USENIX Association"
|
|
,year="2004"
|
|
,month="June"
|
|
,pages="182-191"
|
|
}
|
|
|
|
@unpublished{JamesMorris04b
|
|
,Author="James Morris"
|
|
,Title="Recent Developments in {SELinux} Kernel Performance"
|
|
,month="December"
|
|
,year="2004"
|
|
,note="Available:
|
|
\url{http://www.livejournal.com/users/james_morris/2153.html}
|
|
[Viewed December 10, 2004]"
|
|
}
|
|
|
|
@unpublished{PaulMcKenney05a
|
|
,Author="Paul E. McKenney"
|
|
,Title="{[RFC]} {RCU} and {CONFIG\_PREEMPT\_RT} progress"
|
|
,month="May"
|
|
,year="2005"
|
|
,note="Available:
|
|
\url{http://lkml.org/lkml/2005/5/9/185}
|
|
[Viewed May 13, 2005]"
|
|
,annotation="
|
|
First publication of working lock-based deferred free patches
|
|
for the CONFIG_PREEMPT_RT environment.
|
|
"
|
|
}
|
|
|
|
@conference{PaulMcKenney05b
|
|
,Author="Paul E. McKenney and Dipankar Sarma"
|
|
,Title="Towards Hard Realtime Response from the Linux Kernel on SMP Hardware"
|
|
,Booktitle="linux.conf.au 2005"
|
|
,month="April"
|
|
,year="2005"
|
|
,address="Canberra, Australia"
|
|
,note="Available:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/realtimeRCU.2005.04.23a.pdf}
|
|
[Viewed May 13, 2005]"
|
|
,annotation="
|
|
Realtime turns into making RCU yet more realtime friendly.
|
|
"
|
|
}
|
|
|
|
@conference{ThomasEHart2006a
|
|
,Author="Thomas E. Hart and Paul E. McKenney and Angela Demke Brown"
|
|
,Title="Making Lockless Synchronization Fast: Performance Implications
|
|
of Memory Reclamation"
|
|
,Booktitle="20\textsuperscript{th} {IEEE} International Parallel and
|
|
Distributed Processing Symposium"
|
|
,month="April"
|
|
,year="2006"
|
|
,day="25-29"
|
|
,address="Rhodes, Greece"
|
|
,annotation="
|
|
Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free
|
|
reference counting.
|
|
"
|
|
}
|
|
|
|
@Conference{PaulEMcKenney2006b
|
|
,Author="Paul E. McKenney and Dipankar Sarma and Ingo Molnar and
|
|
Suparna Bhattacharya"
|
|
,Title="Extending RCU for Realtime and Embedded Workloads"
|
|
,Booktitle="{Ottawa Linux Symposium}"
|
|
,Month="July"
|
|
,Year="2006"
|
|
,pages="v2 123-138"
|
|
,note="Available:
|
|
\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf}
|
|
[Viewed January 1, 2007]"
|
|
,annotation="
|
|
Described how to improve the -rt implementation of realtime RCU.
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2006c
|
|
,Author="Paul E. McKenney"
|
|
,Title="Sleepable {RCU}"
|
|
,month="October"
|
|
,day="9"
|
|
,year="2006"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/202847/}
|
|
Revised:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/srcu.2007.01.14a.pdf}
|
|
[Viewed August 21, 2006]"
|
|
,annotation="
|
|
LWN article introducing SRCU.
|
|
"
|
|
}
|
|
|
|
@unpublished{RobertOlsson2006a
|
|
,Author="Robert Olsson and Stefan Nilsson"
|
|
,Title="{TRASH}: A dynamic {LC}-trie and hash data structure"
|
|
,month="August"
|
|
,day="18"
|
|
,year="2006"
|
|
,note="Available:
|
|
\url{http://www.nada.kth.se/~snilsson/public/papers/trash/trash.pdf}
|
|
[Viewed February 24, 2007]"
|
|
,annotation="
|
|
RCU-protected dynamic trie-hash combination.
|
|
"
|
|
}
|
|
|
|
@unpublished{ThomasEHart2007a
|
|
,Author="Thomas E. Hart and Paul E. McKenney and Angela Demke Brown and Jonathan Walpole"
|
|
,Title="Performance of memory reclamation for lockless synchronization"
|
|
,journal="J. Parallel Distrib. Comput."
|
|
,year="2007"
|
|
,note="To appear in J. Parallel Distrib. Comput.
|
|
\url{doi=10.1016/j.jpdc.2007.04.010}"
|
|
,annotation={
|
|
Compares QSBR (AKA "classic RCU"), HPBR, EBR, and lock-free
|
|
reference counting. Journal version of ThomasEHart2006a.
|
|
}
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2007QRCUspin
|
|
,Author="Paul E. McKenney"
|
|
,Title="Using Promela and Spin to verify parallel algorithms"
|
|
,month="August"
|
|
,day="1"
|
|
,year="2007"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/243851/}
|
|
[Viewed September 8, 2007]"
|
|
,annotation="
|
|
LWN article describing Promela and spin, and also using Oleg
|
|
Nesterov's QRCU as an example (with Paul McKenney's fastpath).
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2007PreemptibleRCU
|
|
,Author="Paul E. McKenney"
|
|
,Title="The design of preemptible read-copy-update"
|
|
,month="October"
|
|
,day="8"
|
|
,year="2007"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/253651/}
|
|
[Viewed October 25, 2007]"
|
|
,annotation="
|
|
LWN article describing the design of preemptible RCU.
|
|
"
|
|
}
|
|
|
|
########################################################################
|
|
#
|
|
# "What is RCU?" LWN series.
|
|
#
|
|
|
|
@unpublished{PaulEMcKenney2007WhatIsRCUFundamentally
|
|
,Author="Paul E. McKenney and Jonathan Walpole"
|
|
,Title="What is {RCU}, Fundamentally?"
|
|
,month="December"
|
|
,day="17"
|
|
,year="2007"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/262464/}
|
|
[Viewed December 27, 2007]"
|
|
,annotation="
|
|
Lays out the three basic components of RCU: (1) publish-subscribe,
|
|
(2) wait for pre-existing readers to complete, and (2) maintain
|
|
multiple versions.
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2008WhatIsRCUUsage
|
|
,Author="Paul E. McKenney"
|
|
,Title="What is {RCU}? Part 2: Usage"
|
|
,month="January"
|
|
,day="4"
|
|
,year="2008"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/263130/}
|
|
[Viewed January 4, 2008]"
|
|
,annotation="
|
|
Lays out six uses of RCU:
|
|
1. RCU is a Reader-Writer Lock Replacement
|
|
2. RCU is a Restricted Reference-Counting Mechanism
|
|
3. RCU is a Bulk Reference-Counting Mechanism
|
|
4. RCU is a Poor Man's Garbage Collector
|
|
5. RCU is a Way of Providing Existence Guarantees
|
|
6. RCU is a Way of Waiting for Things to Finish
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2008WhatIsRCUAPI
|
|
,Author="Paul E. McKenney"
|
|
,Title="{RCU} part 3: the {RCU} {API}"
|
|
,month="January"
|
|
,day="17"
|
|
,year="2008"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/264090/}
|
|
[Viewed January 10, 2008]"
|
|
,annotation="
|
|
Gives an overview of the Linux-kernel RCU API and a brief annotated RCU
|
|
bibliography.
|
|
"
|
|
}
|
|
|
|
#
|
|
# "What is RCU?" LWN series.
|
|
#
|
|
########################################################################
|
|
|
|
@article{DinakarGuniguntala2008IBMSysJ
|
|
,author="D. Guniguntala and P. E. McKenney and J. Triplett and J. Walpole"
|
|
,title="The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with {Linux}"
|
|
,Year="2008"
|
|
,Month="April"
|
|
,journal="IBM Systems Journal"
|
|
,volume="47"
|
|
,number="2"
|
|
,pages="@@-@@"
|
|
,annotation="
|
|
RCU, realtime RCU, sleepable RCU, performance.
|
|
"
|
|
}
|
|
|
|
@article{PaulEMcKenney2008RCUOSR
|
|
,author="Paul E. McKenney and Jonathan Walpole"
|
|
,title="Introducing technology into the {Linux} kernel: a case study"
|
|
,Year="2008"
|
|
,journal="SIGOPS Oper. Syst. Rev."
|
|
,volume="42"
|
|
,number="5"
|
|
,pages="4--17"
|
|
,issn="0163-5980"
|
|
,doi={http://doi.acm.org/10.1145/1400097.1400099}
|
|
,publisher="ACM"
|
|
,address="New York, NY, USA"
|
|
,annotation={
|
|
Linux changed RCU to a far greater degree than RCU has changed Linux.
|
|
}
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2008HierarchicalRCU
|
|
,Author="Paul E. McKenney"
|
|
,Title="Hierarchical {RCU}"
|
|
,month="November"
|
|
,day="3"
|
|
,year="2008"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/305782/}
|
|
[Viewed November 6, 2008]"
|
|
,annotation="
|
|
RCU with combining-tree-based grace-period detection,
|
|
permitting it to handle thousands of CPUs.
|
|
"
|
|
}
|
|
|
|
@conference{PaulEMcKenney2009MaliciousURCU
|
|
,Author="Paul E. McKenney"
|
|
,Title="Using a Malicious User-Level {RCU} to Torture {RCU}-Based Algorithms"
|
|
,Booktitle="linux.conf.au 2009"
|
|
,month="January"
|
|
,year="2009"
|
|
,address="Hobart, Australia"
|
|
,note="Available:
|
|
\url{http://www.rdrop.com/users/paulmck/RCU/urcutorture.2009.01.22a.pdf}
|
|
[Viewed February 2, 2009]"
|
|
,annotation="
|
|
Realtime RCU and torture-testing RCU uses.
|
|
"
|
|
}
|
|
|
|
@unpublished{MathieuDesnoyers2009URCU
|
|
,Author="Mathieu Desnoyers"
|
|
,Title="[{RFC} git tree] Userspace {RCU} (urcu) for {Linux}"
|
|
,month="February"
|
|
,day="5"
|
|
,year="2009"
|
|
,note="Available:
|
|
\url{http://lkml.org/lkml/2009/2/5/572}
|
|
\url{git://lttng.org/userspace-rcu.git}
|
|
[Viewed February 20, 2009]"
|
|
,annotation="
|
|
Mathieu Desnoyers's user-space RCU implementation.
|
|
git://lttng.org/userspace-rcu.git
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2009BloatWatchRCU
|
|
,Author="Paul E. McKenney"
|
|
,Title="{RCU}: The {Bloatwatch} Edition"
|
|
,month="March"
|
|
,day="17"
|
|
,year="2009"
|
|
,note="Available:
|
|
\url{http://lwn.net/Articles/323929/}
|
|
[Viewed March 20, 2009]"
|
|
,annotation="
|
|
Uniprocessor assumptions allow simplified RCU implementation.
|
|
"
|
|
}
|
|
|
|
@unpublished{PaulEMcKenney2009expeditedRCU
|
|
,Author="Paul E. McKenney"
|
|
,Title="[{PATCH} -tip 0/3] expedited 'big hammer' {RCU} grace periods"
|
|
,month="June"
|
|
,day="25"
|
|
,year="2009"
|
|
,note="Available:
|
|
\url{http://lkml.org/lkml/2009/6/25/306}
|
|
[Viewed August 16, 2009]"
|
|
,annotation="
|
|
First posting of expedited RCU to be accepted into -tip.
|
|
"
|
|
}
|
|
|
|
@unpublished{JoshTriplett2009RPHash
|
|
,Author="Josh Triplett"
|
|
,Title="Scalable concurrent hash tables via relativistic programming"
|
|
,month="September"
|
|
,year="2009"
|
|
,note="Linux Plumbers Conference presentation"
|
|
,annotation="
|
|
RP fun with hash tables.
|
|
"
|
|
}
|
|
|
|
@phdthesis{MathieuDesnoyersPhD
|
|
, title = "Low-Impact Operating System Tracing"
|
|
, author = "Mathieu Desnoyers"
|
|
, school = "Ecole Polytechnique de Montr\'{e}al"
|
|
, month = "December"
|
|
, year = 2009
|
|
,note="Available:
|
|
\url{http://www.lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf}
|
|
[Viewed December 9, 2009]"
|
|
}
|