Python regex for integers combination from 01000 to 95999
I've been trying to build a regular expression to match French zipcodes in Python.
A French zipcode is composed of the department code (from 01 to 95) followed by 3 digits for the subregion (let's say from 000 to 999, to be large).
I'm trying this one: 0[1-9][0-9]{3}$|[1-8][0-9]{4}$|9[0-5][0-9]{3}$
I split the problem in three:
01xxx to 09xxx, 1xxxx to 8xxxx, 90xxx to 95xxx
Any idea how to make it better?
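A quick way to sanity-check the three-branch pattern is to run it over the boundary values. This is only a sketch: the helper name `is_zipcode` and the sample list are made up for illustration, and it uses `re.fullmatch` because the pattern has no leading `^`, so `re.search` would also accept a valid code at the end of a longer string.

```python
import re

# The three-branch pattern from the question, unchanged.
SPLIT_PATTERN = re.compile(r'0[1-9][0-9]{3}$|[1-8][0-9]{4}$|9[0-5][0-9]{3}$')


def is_zipcode(text):
    """Hypothetical helper: True if the whole string matches the pattern."""
    # fullmatch anchors both ends, so inputs with extra digits are rejected.
    return SPLIT_PATTERN.fullmatch(text) is not None


# Boundary values of each branch, plus a few inputs that should be rejected.
samples = ['01000', '09999', '10000', '89999', '90000', '95999',
           '00999', '96000', '123456', '1234']
for sample in samples:
    print(sample, is_zipcode(sample))

# Without a leading ^, search() finds a valid code inside a longer string.
print(SPLIT_PATTERN.search('9901000') is not None)
```

If you prefer `re.match` or `re.search`, adding `^` in front of a group around the whole alternation should give the same strictness.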
Edit:
(0[1-9][0-9]{3}$)|([1-8][0-9]{4}$)|(9[0-5][0-9]{3}$) : to match only if the input number has 5 digits.
And the final version: ^((0[1-9]{1})|([1-8]{1}[0-9]{1})|9[0-5]{1})[0-9]{3}$ to "factorize" the [0-9]{3} end part.
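To check that factorizing did not change what is accepted, you can compare both versions over every 5-digit string. This is a sketch; the names `accepted` and `simplified_version` are mine, and the simplified pattern just drops the `{1}` quantifiers, which are redundant.

```python
import re

# Both patterns from the edit above, plus a variant without the {1} quantifiers.
split_version = re.compile(r'(0[1-9][0-9]{3}$)|([1-8][0-9]{4}$)|(9[0-5][0-9]{3}$)')
factored_version = re.compile(r'^((0[1-9]{1})|([1-8]{1}[0-9]{1})|9[0-5]{1})[0-9]{3}$')
simplified_version = re.compile(r'^(0[1-9]|[1-8][0-9]|9[0-5])[0-9]{3}$')


def accepted(pattern):
    """Return the set of zero-padded 5-digit strings the pattern matches."""
    return {f'{n:05d}' for n in range(100000) if pattern.match(f'{n:05d}')}


split_set = accepted(split_version)
factored_set = accepted(factored_version)
simplified_set = accepted(simplified_version)

print(split_set == factored_set == simplified_set)
print(len(factored_set), min(factored_set), max(factored_set))
```

The comparison only covers 5-digit inputs; longer or shorter strings are a separate question of anchoring.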
You can (/should/must) test your regex on the official list of French postal codes.
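    import collections
    import re

    # Map each postal code (third column) to the names listed for it (second column).
    codes = collections.defaultdict(list)
    for line in open('code_postaux_v201410.csv'):
        # Skip the header and any line that does not start with a digit.
        if not line[:1].isdigit():
            continue
        row = line.strip().split(';')
        codes[row[2]] += [row[1].strip()]

    def test_failures(regexp):
        # Return the real postal codes that the regex fails to match.
        r = re.compile(regexp)
        return [code for code in codes if not r.match(code)]

    len(test_failures(r'^((0[1-9]{1})|([1-8]{1}[0-9]{1})|9[0-5]{1})[0-9]{3}$'))
    # 283 !

    # Not ideal, because it does not guarantee the input is an existing one.
    # (The alternation is not grouped, so ^ and [0-9]{3}$ each apply to one branch only.)
    len(test_failures(r'^0[1-9]|[1-8][0-9]|9[0-8]|2a|2b[0-9]{3}$'))
    # at least no miss!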
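One thing to watch in that second pattern: there are no parentheses around the alternation, so it accepts anything that merely starts with a listed two-character prefix. The sketch below compares it with a grouped variant; the grouped pattern is my assumption about the intent, not something tested against the CSV file, so its failure count there is unknown.

```python
import re

# Second pattern from the answer, unchanged: the alternation is ungrouped.
ungrouped = re.compile(r'^0[1-9]|[1-8][0-9]|9[0-8]|2a|2b[0-9]{3}$')
# Hypothetical grouped variant: prefix alternatives, then exactly three digits.
grouped = re.compile(r'^(0[1-9]|[1-8][0-9]|9[0-8]|2a|2b)[0-9]{3}$')

for candidate in ['75001', '75', '01abc', '2a123', '99000']:
    print(candidate,
          ungrouped.match(candidate) is not None,
          grouped.match(candidate) is not None)
```

Here '75' and '01abc' pass the ungrouped pattern but not the grouped one, which explains why the loose version never misses a real code but also rejects very little.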