Ruby - fastest way to select items that belong to two arrays by nested attribute -
my ruby query slow. i'm looking faster way select items belong 2 arrays
lista list of objecta attribute name
listb list of objectb attribute name
there can multiple items in lista have same name, nothing in listb same name. there items in lista names not in listb. want function (loop) return items in lista name in listb.
this current solution, it's slow. how can make faster ?
z = 0 new = [] while z < listb.length hold = lista.select{|obj| obj.name == listb[z].name} new += hold z += 1 end
you want maximise lookup speed. best speed can reach lookup o(1) using hashes. method described below 1000x faster code , 350x faster loop include? n=10000
edit 2 : this link explains more in detail why hash lookup fast.
names = {} listb.each{|item| names[item.name] = true} result = lista.select{|item| names[item.name]} edit : updates benchmark.
- 03/08/2015 - added user12341234 set method
benchmark code
class myobject def initialize(name) @name = name end def name @name end end n=10000 k=n/10 lista = (1..n).map{ myobject.new(('a'..'z').to_a.shuffle[0,8].join)} listb = lista.sample(k) listb += (1..(n-k)).map{ myobject.new(('a'..'z').to_a.shuffle[0,8].join)} require 'benchmark' require 'set' benchmark.bm |x| x.report("hash") names = {} listb.each{|item| names[item.name] = true} result = lista.select{|item| names[item.name]} end x.report("op's code") z = 0 new = [] while z < listb.length hold = lista.select{|obj| obj.name == listb[z].name} new += hold z += 1 end end x.report("include") names = listb.map(&:name) lista.select{|obj| names.include?(obj.name) } end x.report("set") names = listb.map(&:name).to_set lista.select{|item| names.include?(item.name)} end end benchmark
specs :
- intel core i7 920 @ 2.67 ghz
- 13 gb ram
- windows 8 x64
- ruby 21 x64
results :
algorithm user system total real hash 0.015000 0.000000 0.015000 ( 0.013002) op's code 26.053000 0.016000 26.069000 ( 26.161283) include 9.219000 0.000000 9.219000 ( 9.244161) set 0.016000 0.000000 0.016000 ( 0.013001)
Comments
Post a Comment