Why is dict faster than if-else in Python?


I tried to compare a dict against if-else to see which is faster, as follows.

d = {
    str  : lambda x: "'%s'" % str(x),
    int  : lambda x: str(x),
    float: lambda x: str(x),
}

items = ['a', 'b', 'c', 1, 2, 3, 4, 5, 1.0]

def use_dict():
    r = []
    for i in items:
        r.append(d[type(i)](i))
    return r

def use_if():
    r = []
    for i in items:
        if isinstance(i, str):
            r.append("'%s'" % str(i))
        elif isinstance(i, (int, float)):
            r.append(str(i))
    return r

if __name__ == '__main__':
    from timeit import timeit

    print 'use_dict:', timeit(use_dict)
    # -> use_dict: 9.21109666657

    print 'use_if  :', timeit(use_if)
    # -> use_if  : 10.9568739652

I found that the dict is faster than if-else. Does that mean that when I want to write a switch statement, a dict is the better solution? I have a doubt, though: why is the dict faster? Can anyone explain it? Thanks.

If you want an idea of how your code executes, take a look at the dis module.

A quick example...

import dis

# here are the things we might want to do
def do_something_a():
    print 'i did a'


def do_something_b():
    print 'i did b'


def do_something_c():
    print 'i did c'


# case 1
def f1(x):
    if x == 1:
        do_something_a()
    elif x == 2:
        do_something_b()
    elif x == 3:
        do_something_c()


# case 2
func_map = {1: do_something_a, 2: do_something_b, 3: do_something_c}

def f2(x):
    func_map[x]()


# show how the functions execute
print 'case 1'
dis.dis(f1)
print '\n\ncase 2'
dis.dis(f2)

...which outputs...

case 1
 18           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 COMPARE_OP               2 (==)
              9 POP_JUMP_IF_FALSE       22

 19          12 LOAD_GLOBAL              0 (do_something_a)
             15 CALL_FUNCTION            0
             18 POP_TOP
             19 JUMP_FORWARD            44 (to 66)

 20     >>   22 LOAD_FAST                0 (x)
             25 LOAD_CONST               2 (2)
             28 COMPARE_OP               2 (==)
             31 POP_JUMP_IF_FALSE       44

 21          34 LOAD_GLOBAL              1 (do_something_b)
             37 CALL_FUNCTION            0
             40 POP_TOP
             41 JUMP_FORWARD            22 (to 66)

 22     >>   44 LOAD_FAST                0 (x)
             47 LOAD_CONST               3 (3)
             50 COMPARE_OP               2 (==)
             53 POP_JUMP_IF_FALSE       66

 23          56 LOAD_GLOBAL              2 (do_something_c)
             59 CALL_FUNCTION            0
             62 POP_TOP
             63 JUMP_FORWARD             0 (to 66)
        >>   66 LOAD_CONST               0 (None)
             69 RETURN_VALUE


case 2
 29           0 LOAD_GLOBAL              0 (func_map)
              3 LOAD_FAST                0 (x)
              6 BINARY_SUBSCR
              7 CALL_FUNCTION            0
             10 POP_TOP
             11 LOAD_CONST               0 (None)
             14 RETURN_VALUE

...so it's pretty easy to see how many instructions each function has to execute.

As for which is actually faster, that's something you'd have to check by profiling the code.
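If you actually want numbers, a minimal timing sketch along these lines works (the do-nothing handler functions and the choice of key 3 are placeholders of mine, not part of the original example):

from timeit import timeit

def do_something_a(): pass
def do_something_b(): pass
def do_something_c(): pass

def f1(x):
    if x == 1:
        do_something_a()
    elif x == 2:
        do_something_b()
    elif x == 3:
        do_something_c()

func_map = {1: do_something_a, 2: do_something_b, 3: do_something_c}

def f2(x):
    func_map[x]()

# time the "worst case" key 3, which forces f1 through all three comparisons
print('if/elif: %s' % timeit(lambda: f1(3)))
print('dict   : %s' % timeit(lambda: f2(3)))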

The if/elif/else structure compares the lookup key against a given sequence of possible values one by one until it finds a match in the condition of some if statement, and then reads what it is supposed to execute inside that if block. This can take a long time, because many checks (n/2 on average, for n possible values) have to be made for every lookup.
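For illustration, a toy sketch that counts comparisons in such a linear scan (the value range of 100 is an arbitrary choice of mine):

values = list(range(100))          # n = 100 possible values

def linear_lookup(key):
    checks = 0
    for v in values:               # mimics one if/elif test per candidate value
        checks += 1
        if key == v:
            break
    return checks

avg = sum(linear_lookup(k) for k in values) / float(len(values))
print('average comparisons: %s' % avg)   # -> 50.5, i.e. roughly n/2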

The reason a sequence of if statements is more difficult to optimize than a switch statement is that the condition checks (what's inside the parens in C++) might conceivably change the state of a variable that's involved in the next check, so you have to evaluate them in order. The restrictions on switch statements remove that possibility, so the order doesn't matter (I think).
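A contrived Python sketch of that point, assuming a condition with a deliberate side effect (the names and values are made up for illustration):

state = {'count': 0}

def bump_and_check(x):
    state['count'] += 1            # side effect hidden inside the condition
    return x == 1

def f(x):
    if bump_and_check(x):          # has to be evaluated first...
        return 'a'
    elif state['count'] > 3:       # ...because this test reads what it changed
        return 'b'
    return 'c'

print([f(2) for _ in range(4)])    # -> ['c', 'c', 'c', 'b']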

Python dictionaries are implemented as hash tables. The idea is this: if you could deal with arbitrarily large numbers and had infinite RAM, you could create a huge array of function pointers, indexed by casting whatever the lookup value is to an integer and using that as the index. Lookup would then be virtually instantaneous.
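A deliberately simplified sketch of that idea, assuming small non-negative integer keys so that "casting to an integer" is just the key itself:

def handle_a(): return 'a'
def handle_b(): return 'b'
def handle_c(): return 'c'

table = [None] * 10                # the "huge array", kept tiny here
table[1] = handle_a
table[2] = handle_b
table[3] = handle_c

def dispatch(key):
    return table[key]()            # a single index operation, no comparisons

print(dispatch(2))                 # -> 'b'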

You can't do that, of course, but you can create an array of manageable length, pass the lookup value to a hash function (which generates some integer depending on the lookup value), and then take the result modulo the length of the array to get an index within the bounds of the array. That way, a lookup takes only the time needed to call the hash function once, take the modulus, and jump to that index. If the number of different possible lookup values is large enough, the overhead of the hash function becomes negligible compared to the n/2 condition checks.
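A stripped-down sketch of the hash-and-modulo step (real dict internals are more involved; the table size of 8 and the integer keys are just for illustration):

buckets = [None] * 8               # an array of manageable length

def slot(key):
    return hash(key) % len(buckets)   # hash once, take the modulus

def put(key, value):
    buckets[slot(key)] = (key, value)  # collisions ignored here (see below)

def get(key):
    stored_key, value = buckets[slot(key)]
    return value

put(1, 'one')
put(2, 'two')
print(get(2))                      # -> 'two'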

(Actually, since many different lookup values will inevitably map to the same index, it's not quite that simple. You have to check for and resolve possible conflicts, which can be done in a number of ways. Still, the gist of it is as described above.)
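As one example of such a conflict-resolution strategy, here is a sketch using chaining, where each slot holds a short list of (key, value) pairs that gets scanned on lookup (CPython's dict actually uses open addressing, so this is only illustrative):

buckets = [[] for _ in range(8)]

def put(key, value):
    bucket = buckets[hash(key) % len(buckets)]
    for i, (k, v) in enumerate(bucket):
        if k == key:               # key already present: overwrite
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def get(key):
    bucket = buckets[hash(key) % len(buckets)]
    for k, v in bucket:            # resolve collisions by scanning the short chain
        if k == key:
            return v
    raise KeyError(key)

put(1, 'one')
put(9, 'nine')                     # hash(9) % 8 == hash(1) % 8, so they collide
print(get(9))                      # -> 'nine'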

