by Raymond Hettinger
Tool for making rapid tallies or counts
- Multisets in C++
- Bags in smalltalk and Objective C
Very flexible, unrestricted implementation as a directory
You can put anything in it:
- Count with positive and negative numbers
- Count with decimals, floats, or fractions
- it is just a dictionary
With __missing__() that returns zero:
c[x] += 1 # easy to tally
Has __delitem__() to match __missing__():
del c[x] # easy to delete
- returns sorted list of the n highest counts
- reduces the code to a simple one-liner
- implemented using either sorted() or heapq.nlargest()
- lists all the contents individually
- if an element has a count of three, it is emitted three times
- Differs from __iter__ which returns pairs
- Done this way because otherwise you change the dict API.
Multiset use case
- With multisets, the counts are always positive
- Math operations with omit zero or negative counts from the result
- Operations are: + - & |
- The subtraction operation is said to be saturating
- When the counts are all one, works just like regular sets
Works just like a regular tuple and lets you assign names to each field.
- Makes the code self-documenting
- Makes the printed __repr__ intelligible
- Let’s you change tuple order without affecting client code
- No extra memory cost over tuple
One of the best single best changes you can make to existing code.
Doing an enum in python (not needed but its kind of cool):
Color = namedtuple('Color', 'red orange yellow')._make(range(3))
Simple unbounded caches can grow without bound.
How to do it:
from functools import lru_cache @lru_cache(maxsize=100) def big_computation(*args): ...
@lru_cache() def fibonacci(n): if n <= 1: return n return fibonacci(n-1) + fibonacci(n-2) print(fibonacci(5))
Fun things to do with Python
The iter() function has a two argument form that takes a sentinal value:
for block in iter(partial(f.read,20), "") ...
Awesome way to stop a loop!
Use DefaultDict for 2-D sparse arrays:
d = defaultdict(dict) d['Canada']['Quebec'] = 1
Make dict sparse to speed up a dictionary:
d.update(dict(d)) # doubles space in hash table
Wrap it up in a function:
def sparser(d): return d.update(dict(d)) # doubles space in hash table
Turn-off thread-switching (cheap locking):
ci = getcheckinterval(0) sys.setcheckinterval(0) # switching off value = max(tasklist) tasklist.remove(value) sys.setcheckinterval(ci) # switching on
Atomic actions (no locks required):
v = d.pop(key) # find and remove in one-step d.setdefault(key, ).append(v) # one-step init
Speed-up builtin access:
from __builtin__ import *
>>> def make_constant_function(x): ... return lambda x=x: x >>> f = make_constant_function(3) >>> f() 3
Fast constant functions:
>>> def make_constant_function(x): ... return itertools.repeat(x).next