Ruby - fastest way to select items that belong to two arrays by nested attribute -
my ruby query slow. i'm looking faster way select items belong 2 arrays
lista
list of objecta attribute name
listb
list of objectb attribute name
there can multiple items in lista
have same name
, nothing in listb
same name
. there items in lista
name
s not in listb
. want function (loop) return items in lista
name
in listb
.
this current solution, it's slow. how can make faster ?
z = 0 new = [] while z < listb.length hold = lista.select{|obj| obj.name == listb[z].name} new += hold z += 1 end
you want maximise lookup speed. best speed can reach lookup o(1) using hashes. method described below 1000x faster code , 350x faster loop include?
n=10000
edit 2 : this link explains more in detail why hash lookup fast.
names = {} listb.each{|item| names[item.name] = true} result = lista.select{|item| names[item.name]}
edit : updates benchmark.
- 03/08/2015 - added user12341234 set method
benchmark code
class myobject def initialize(name) @name = name end def name @name end end n=10000 k=n/10 lista = (1..n).map{ myobject.new(('a'..'z').to_a.shuffle[0,8].join)} listb = lista.sample(k) listb += (1..(n-k)).map{ myobject.new(('a'..'z').to_a.shuffle[0,8].join)} require 'benchmark' require 'set' benchmark.bm |x| x.report("hash") names = {} listb.each{|item| names[item.name] = true} result = lista.select{|item| names[item.name]} end x.report("op's code") z = 0 new = [] while z < listb.length hold = lista.select{|obj| obj.name == listb[z].name} new += hold z += 1 end end x.report("include") names = listb.map(&:name) lista.select{|obj| names.include?(obj.name) } end x.report("set") names = listb.map(&:name).to_set lista.select{|item| names.include?(item.name)} end end
benchmark
specs :
- intel core i7 920 @ 2.67 ghz
- 13 gb ram
- windows 8 x64
- ruby 21 x64
results :
algorithm user system total real hash 0.015000 0.000000 0.015000 ( 0.013002) op's code 26.053000 0.016000 26.069000 ( 26.161283) include 9.219000 0.000000 9.219000 ( 9.244161) set 0.016000 0.000000 0.016000 ( 0.013001)
Comments
Post a Comment