
Machine Learning with Python

Lecture 3. Python slices, the collections module, refcount


Alexander Avdiushenko
October 10, 2023

Indexes of a Python list

a = list(range(10))
a[2], a[-1], a[-5]  # index: count from zero, forward or backward
a[10]  # IndexError: list index out of range
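A minimal sketch of the indexing rules above, assuming a plain list of ten integers: a negative index -k addresses the same element as len(a) - k, and an index outside the valid range raises IndexError. The name error_message is only for illustration.

```python
a = list(range(10))

# A negative index -k addresses the same element as len(a) - k.
last = a[-1]          # same element as a[len(a) - 1]
fifth_from_end = a[-5]  # same element as a[len(a) - 5]

# Valid indexes run from -len(a) to len(a) - 1; anything else raises IndexError.
try:
    a[10]
except IndexError as exc:
    error_message = str(exc)

print(last, fifth_from_end, error_message)
```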

Slices

a = list(range(10))
print(a[1:5])  # start included, end excluded
print(a[1:-1])

a = list(range(10))
print(a[1:12:2])  # start, end, step
print(a[-1:1:-2])
# one line to reverse list
print(a[::-1])
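Two details of the slices above may be worth spelling out. Unlike single indexes, slice bounds that fall outside the list are clamped rather than raising an error, and the a[start:stop:step] syntax is shorthand for indexing with a slice object. The sketch below uses the same list; the variable names are illustrative.

```python
a = list(range(10))

# Out-of-range bounds are clamped instead of raising IndexError.
clamped = a[1:12:2]

# With a negative step the slice walks backward; the stop index is still excluded.
backward = a[-1:1:-2]

# a[start:stop:step] is shorthand for indexing with a slice object.
every_other = slice(1, 12, 2)
same_as_clamped = a[every_other]

# slice.indices(n) reports the concrete bounds used for a sequence of length n.
bounds = every_other.indices(len(a))

print(clamped)   # [1, 3, 5, 7, 9]
print(backward)  # [9, 7, 5, 3]
print(bounds)    # (1, 10, 2)
```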

List comprehensions

a = [[i] for i in range(10)]
a[:5] = a[-5::-1]
a

a[0][0] = 1000
a

a = list(range(10))
old_id_a = id(a)
a[:5] = a[-5::-1]
a, id(a) == old_id_a
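The example above packs three effects into a few lines, which the following sketch separates: the slice a[-5::-1] has six elements, so assigning it to the five-element slice a[:5] makes the list longer; the inner lists are shared rather than copied; and slice assignment mutates the list in place, so its id stays the same.

```python
a = [[i] for i in range(10)]
tail_reversed = a[-5::-1]   # six elements: [[5], [4], [3], [2], [1], [0]]
a[:5] = tail_reversed       # five slots are replaced by six items
new_length = len(a)         # the list grew from 10 to 11 elements

# The inner lists were not copied: a[0] and a[6] are the same object.
shared = a[0] is a[6]
a[0][0] = 1000
seen_through_alias = a[6]   # the change is visible through the other reference

# Slice assignment mutates the existing list object instead of creating a new one.
b = list(range(10))
old_id_b = id(b)
b[:5] = b[-5::-1]
same_object = id(b) == old_id_b

print(new_length, shared, seen_through_alias, same_object)
```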

Unpacking in Python

a, b = 0, 1
a, b

a, b = b, a
a, b

a, b, c = range(3)
b

a, (b, c) = [1, (2, 3)]
a, b, c

# fun example
True, True, True == (True, True, True)

# Fibonacci numbers
a, b = 1, 1
for _ in range(10):
    a, b = b, a + b
a, b

first, *b = range(5)
first, b

a, *b, c, d = range(10)
a, b, c, d
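A short sketch of the less obvious unpacking rules used above: the starred name always receives a list (possibly empty), the "fun example" builds a tuple because == binds tighter than the commas, and unpacking fails with ValueError when there are too few values. All variable names here are illustrative.

```python
# The starred target always collects a list, whatever the source iterable is.
first, *rest = range(5)

# The starred target may end up empty.
head, *middle, tail = "ab"

# == binds tighter than the commas, so this is (True, True, (True == (True, True, True))).
fun = True, True, True == (True, True, True)

# Three non-starred targets need at least three values.
try:
    x, *others, y, z = [1, 2]
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(first, rest)         # 0 [1, 2, 3, 4]
print(head, middle, tail)  # a [] b
print(fun)                 # (True, True, False)
```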

Unpacking in a loop

for a, b in [('This', 1), ('is', 2), ('Test', 3)]:
    print(a, b, end=', ')
print()

for i, letter in enumerate('hello'):
    print(i, ' - ', letter, end=', ')
print()

Comprehensions

lst = [i ** 2 for i in range(15)]
lst

{i ** 0.5 for i in range(-3, 3, 1) if i > 0}

dct = {i: i ** 3 for i in range(4)}
dct

{key if key > 1 else None: value for key, value in dct.items()}

" ".join([c for s in ("Nested", "List", "Comprehension") for c in s])
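To make the last two comprehensions above easier to read, here is a sketch that writes the nested one as explicit loops (the for clauses run in the same left-to-right order) and shows what happens when a dict comprehension maps several keys to the same new key: the later value wins.

```python
words = ("Nested", "List", "Comprehension")

# Equivalent explicit loops: the first "for" is the outer loop.
chars = []
for s in words:
    for c in s:
        chars.append(c)

same_result = chars == [c for s in words for c in s]
spaced = " ".join(chars)

# Keys 0 and 1 both become None, so the value for key 1 overwrites the value for key 0.
dct = {i: i ** 3 for i in range(4)}
remapped = {key if key > 1 else None: value for key, value in dct.items()}

print(spaced)
print(remapped)  # {None: 1, 2: 8, 3: 27}
```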

Collections → defaultdict

from collections import defaultdict

dct = defaultdict(float)
float(), dct[2]

for i in range(2):
    if dct[i] == 0:
        print("hello")
dct
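The example above relies on a side effect that is easy to miss: merely reading a missing key with dct[key] calls the factory and stores the result. The sketch below shows that, shows that .get() does not insert, and adds a typical grouping use case; the word list is made up for illustration.

```python
from collections import defaultdict

counts = defaultdict(float)
missing_before = 2 not in counts
value = counts[2]               # calls float() and stores 0.0 under key 2
present_after = 2 in counts

# .get() behaves like the plain dict method and does not insert anything.
got = counts.get(7)
still_missing = 7 not in counts

# Typical use: grouping items without checking whether the key exists yet.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

print(dict(counts))  # {2: 0.0}
print(dict(groups))  # {'a': ['apple', 'avocado'], 'b': ['banana']}
```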

Collections → deque

from collections import deque

q = deque()
for i in range(5):
    q.append(i)
while q:
    print(q.pop(), q)
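The loop above uses the deque as a stack (pop takes from the right end). As a complementary sketch, popleft turns the same structure into a queue, and the optional maxlen argument keeps only the most recent items. Variable names are illustrative.

```python
from collections import deque

q = deque(range(5))
lifo = [q.pop() for _ in range(2)]      # taken from the right end: [4, 3]
fifo = [q.popleft() for _ in range(2)]  # taken from the left end: [0, 1]
remaining = list(q)                     # [2]

# A bounded deque silently discards items from the opposite end when full.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)

print(lifo, fifo, remaining)
print(list(recent))  # [2, 3, 4]
```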

Collections → deque → complexities

Insertion at the beginning/end, in the middle? O(1) and O(n) respectively
Access by index at the beginning/end, in the middle? O(1) and O(n) respectively
Search for an element? O(n)
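The operations behind these complexities can be sketched as follows. The comments state the documented costs for collections.deque; the code itself only checks the results, not the timings.

```python
from collections import deque

d = deque(range(5))
d.appendleft(-1)   # O(1): insertion at the left end
d.append(5)        # O(1): insertion at the right end
d.insert(3, 99)    # O(n): insertion in the middle
after_inserts = list(d)

ends = (d[0], d[-1])   # O(1): indexed access at either end
middle = d[3]          # O(n): indexed access in the middle
found = 99 in d        # O(n): linear search

d.rotate(2)            # moves the last two items to the front
after_rotate = list(d)

print(after_inserts)  # [-1, 0, 1, 99, 2, 3, 4, 5]
print(after_rotate)   # [4, 5, -1, 0, 1, 99, 2, 3]
```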

Collections → OrderedDict

from collections import OrderedDict

data = [(1, 'a'), (3, 'c'), (2, 'b')]
print(dict(data))
print(OrderedDict(data))

d = OrderedDict(data)
d.move_to_end(2, last=False)
d

d.popitem(last=True)
d
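Since regular dicts also preserve insertion order (guaranteed from Python 3.7), one remaining difference worth showing is equality: two OrderedDict objects compare equal only if their order matches, while plain dicts ignore order. The sketch below reuses the data list from the example and also records the results of move_to_end and popitem.

```python
from collections import OrderedDict

data = [(1, 'a'), (3, 'c'), (2, 'b')]

# Plain dicts compare by content only; OrderedDicts also compare order.
dicts_equal = dict(data) == dict(sorted(data))
ordered_equal = OrderedDict(data) == OrderedDict(sorted(data))

d = OrderedDict(data)
d.move_to_end(2, last=False)      # key 2 moves to the front
order_after_move = list(d)
popped = d.popitem(last=True)     # removes and returns the last pair
items_left = list(d.items())

print(dicts_equal, ordered_equal)  # True False
print(order_after_move)            # [2, 1, 3]
print(popped, items_left)          # (3, 'c') [(2, 'b'), (1, 'a')]
```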

Collections → Counter

from collections import Counter

print(Counter("aaabbbbccd"))

row1 = ['never', 'give', 'never', 'let']
row2 = ['gonna', 'you up', 'gonna', 'you down']
for first, second in zip(row1, row2):
    print(first, second)
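A few Counter features that go beyond printing the counts, as a small sketch on the same string: most_common sorts by frequency, missing keys read as 0 without being inserted, and counters support addition and subtraction. The last lines show that zip stops at the shorter input.

```python
from collections import Counter

c = Counter("aaabbbbccd")
top_two = c.most_common(2)

# A missing key reads as 0 and is not added to the counter.
z_count = c['z']
z_present = 'z' in c

# Counters can be added; subtraction drops counts that fall to zero or below.
total = c + Counter("dd")
diff = c - Counter("aaaa")

# zip stops as soon as the shortest iterable is exhausted.
pairs = list(zip([1, 2, 3], "ab"))

print(top_two)     # [('b', 4), ('a', 3)]
print(total['d'])  # 3
print(pairs)       # [(1, 'a'), (2, 'b')]
```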

More interesting examples in Python

# lazy evaluation
not False or non_existant  # the right operand is never evaluated

# non_existant
non_existant  # on its own this raises NameError

bool("1")

1 < 3 < 5

False == False != True
# equivalent to (False == False) and (False != True)

1 < 3 < 5
# equivalent to (1 < 3) and (3 < 5)

None == None
None is None
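The following sketch checks the claims above in executable form. It assumes nothing beyond the built-ins; middle and undefined_name are hypothetical names introduced for the demonstration. It shows that the middle operand of a chained comparison is evaluated only once, that or never touches its right operand once the left one is truthy, and that adding parentheses changes the meaning of the chain.

```python
calls = []

def middle():
    """Record each call so we can count evaluations."""
    calls.append("middle")
    return 3

# Chained comparison: behaves like (1 < m) and (m < 5), with m computed once.
chained = 1 < middle() < 5

# Short-circuit: the undefined name on the right is never looked up.
lazy = not False or undefined_name

# Evaluated on its own, the same name raises NameError.
try:
    undefined_name
    raised = False
except NameError:
    raised = True

# The chain is (False == False) and (False != True); parentheses change the result.
chain_result = False == False != True
grouped_result = (False == False) != True

print(chained, calls)                # True ['middle']
print(lazy, raised)                  # True True
print(chain_result, grouped_result)  # True False
```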
import sys

print('\n'.join([f'{x}: {sys.getrefcount(x)}' for x in (1, 3, 5, 127, 127.0, object())]))
help(sys.getrefcount)

x = 128.0
sys.getrefcount(x)  # in Python interpreter in the console, it will be 3
sys.getrefcount(x)
sys.getrefcount(float())

Fun with sys getrefcount post
Python doc about the garbage collector (gc)
  1. Reference counting: an object is deleted as soon as its count hits 0, but the scheme has several problems (it cannot free circular references, it needs locking between threads, and it costs extra memory and CPU time)
  2. An additional garbage collector, exposed through the gc module, finds reference cycles. Objects are divided into three generations, each with its own threshold that determines when a collection is launched
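Both mechanisms can be observed from Python code. The sketch below is specific to CPython and uses a hypothetical Node class: the first function measures how much the reference count grows when two more references are created, and the second builds a reference cycle, which survives del (the counts never reach 0) until gc.collect() removes it. The automatic collector is switched off during the experiment so that it cannot interfere; note that the finally clause turns it back on unconditionally.

```python
import gc
import sys
import weakref


class Node:
    """Hypothetical helper class, used only for this illustration."""

    def __init__(self):
        self.other = None


def refcount_delta():
    """Return how much the count grows when a list holds the object twice."""
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj, obj]           # two additional references
    after = sys.getrefcount(obj)
    return after - before


def cycle_survives_del():
    """Return (alive after del, alive after gc.collect()) for a 2-object cycle."""
    gc.disable()                   # keep the automatic collector out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a    # a and b now reference each other
        probe = weakref.ref(a)     # a weak reference does not keep a alive
        del a, b                   # counts stay above 0 because of the cycle
        alive_after_del = probe() is not None
        gc.collect()               # the cycle detector frees both objects
        alive_after_collect = probe() is not None
    finally:
        gc.enable()
    return alive_after_del, alive_after_collect


print(refcount_delta())
print(cycle_survives_del())
```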

Never try this at home!

def hi(x):
    print("Hello,", x)
    return None

def add_two(x):
    return x + 2

add_two.__code__ = hi.__code__
add_two(20)

import dis
dis.dis(hi)
LOAD_GLOBAL namei: Loads the global named co_names[namei] onto the stack.
LOAD_CONST consti: Pushes co_consts[consti] onto the stack.
LOAD_FAST var_num: Pushes a reference to the local co_varnames[var_num] onto the stack.
CALL_FUNCTION argc: Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. This describes the opcode as documented for Python 3.5 and earlier; newer CPython versions compile calls differently (from 3.11 the opcode is CALL), so the output of dis.dis(hi) depends on the interpreter version.
POP_TOP: Removes the top-of-stack (TOS) item.
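The tables that these opcodes index into (co_names, co_consts, co_varnames) are ordinary attributes of the code object, so they can be inspected directly. The sketch below does that for hi and then repeats the code swap in a reversible way. Opcode names differ between CPython versions, so the list of instructions is printed but nothing specific is assumed about it.

```python
import dis


def hi(x):
    print("Hello,", x)
    return None


def add_two(x):
    return x + 2


code = hi.__code__
global_names = code.co_names      # names looked up by LOAD_GLOBAL
constants = code.co_consts        # values pushed by LOAD_CONST
local_names = code.co_varnames    # locals addressed by LOAD_FAST

# The exact opcode names depend on the interpreter version.
opnames = [instruction.opname for instruction in dis.get_instructions(hi)]
print(global_names, constants, local_names)
print(opnames)

# Swap the code objects, call the function, then restore the original.
original_code = add_two.__code__
add_two.__code__ = hi.__code__
swapped_result = add_two(20)      # prints "Hello, 20" and returns None
add_two.__code__ = original_code
restored_result = add_two(20)
```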