regex - Python: substitute words between two points in a text
For the last few days I have been dealing with regular expressions. Say we have this text:
text = ' 1. sometext sometext sometext given follows: «book 1 title here part 1 1. mpla mpla mpla 2. text text «here spesific text» book 2 1. text text. 2. «also» try in case of emergency.» book 3 part 3 directions home'
I am trying to find the books between '«' and '»', change the word to 'chapter', and put the text back. Using a regular expression I can't get the result I want because, as far as I understand, regex isn't the best tool for counting how many '»' have been passed so far.
For example, if I use
import re
print(re.findall(r'«([book\s\S+]*?)»', text, re.DOTALL))
I only get the text up to the first '»'. Is there a way to get book 1 and book 2?
I also tried this:
print(re.findall(r'(?<=«)(?=(book\s\S+))|(?=[^«]*»)(?=(book\s\S+))', text, re.DOTALL))
But neither works. Is there a way to get this result, or should I use something other than regular expressions?
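One solution, in two parts, is as follows:
import re
print(re.findall(r"(book\s\S+)", re.search("«(.*)»", text, re.S).group(1), re.S))
This first finds the outer « » pair (the greedy .* runs from the first « to the last »), and then searches inside it for the books.
This gives the following output: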
['book 1', 'book 2']
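If the goal is the actual substitution rather than just listing the books, the same two-step idea can probably be extended along these lines. This is only a sketch: `rename_books` and `new_word` are names made up for illustration, the sample text is the lowercased one from the question, and it assumes there is a single outermost « » region (the greedy match would otherwise swallow everything between the first « and the last »).

```python
import re

text = (
    " 1. sometext sometext sometext given follows: «book 1 title here part 1 "
    "1. mpla mpla mpla 2. text text «here spesific text» book 2 1. text text. "
    "2. «also» try in case of emergency.» book 3 part 3 directions home"
)

def rename_books(s, new_word="chapter"):
    """Replace 'book' with new_word, but only inside the outermost « ... »."""
    # Greedy .* with re.S spans from the first « to the last ».
    outer = re.search(r"«(.*)»", s, re.S)
    if outer is None:
        return s  # no quoted region, leave the text unchanged
    start, end = outer.span(1)
    # Substitute only within the captured slice; the lookahead keeps the
    # number or name that follows 'book' untouched.
    inner = re.sub(r"\bbook(?=\s\S+)", new_word, s[start:end])
    # Put the modified slice back between the unchanged head and tail.
    return s[:start] + inner + s[end:]

result = rename_books(text)
print(result)
```

Here 'book 1' and 'book 2' become 'chapter 1' and 'chapter 2', while 'book 3', which sits after the closing », is left alone. Slicing with `span(1)` rather than calling `str.replace` on the whole string is what keeps the change confined to the quoted region.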