regex - Python: substitute words between two points in a text


For the last few days I have been dealing with regular expressions. So, let's say I have this text:

text = ' 1. sometext sometext sometext given follows: «book 1 title here part 1 1. mpla mpla mpla 2. text text «here spesific text» book 2  1. text text. 2. «also» try in case of emergency.»  book 3 part 3 directions home' 

and I am trying to find the books that lie between the outer '«' and '»', change the word 'book' to 'chapter', and put the text back. With a regular expression I can't get the result I want because, as far as I understand, regex isn't the best tool for counting how many '»' have been passed so far.

For example, if I use

print(re.findall(r'«([book\s\S+]*?)»', text, re.DOTALL))

I only get the text up to the first '»'. Is there a way to get book 1 and book 2?

I also tried this:

print(re.findall(r'(?<=«)(?=(book\s\S+))|(?=[^«]*»)(?=(book\s\S+))', text, re.DOTALL))

But neither works. Is there a way to get this result, or should I use something other than regular expressions?

One solution, in two parts, is as follows:

print(re.findall(r"(book\s\S+)", re.search("«(.*)»", text, re.S).group(1), re.S))

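This first finds the outer « » pair (the greedy .* runs from the first « to the last »), then searches inside that span for the books.

For the text above, this gives the following output: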

['book 1', 'book 2']
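
The snippet above only lists the books; the question also asks to change the word and put the text back. A minimal sketch of that step could look like the following. It assumes Python 3, a single outermost « » pair, and that 'book' is the word to be replaced with 'chapter'; `rename_books` is a hypothetical helper name, not part of the original answer.

```python
import re

text = (
    " 1. sometext sometext sometext given follows: «book 1 title here part 1 "
    "1. mpla mpla mpla 2. text text «here spesific text» book 2  1. text text. "
    "2. «also» try in case of emergency.»  book 3 part 3 directions home"
)


def rename_books(s):
    """Replace 'book' with 'chapter' only inside the outermost « » span."""
    # Greedy .* runs from the first « to the last », so nested quotes stay inside.
    outer = re.search(r"«(.*)»", s, re.S)
    if outer is None:
        return s  # no « » pair, nothing to rewrite
    start, end = outer.span(1)
    # \b avoids touching words such as 'notebook'; the lookahead mirrors
    # the \s\S+ part of the pattern used above without consuming it.
    inner = re.sub(r"\bbook(?=\s\S+)", "chapter", s[start:end])
    # Stitch the untouched prefix and suffix back around the rewritten span.
    return s[:start] + inner + s[end:]


result = rename_books(text)
print(result)
```

Because only the slice between the outer quotes is rewritten, the 'book 3' that follows the closing » is left alone. If the text could contain several separate top-level « » pairs, the greedy search would merge them into one span, and a small depth counter over the characters would be a safer choice.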
