java - Tokenize Timestamp with JFlex and handle ISO format


I am trying to solve a date tokenization issue with various date formats, and one of them may be in ISO 8601 format, which uses 't' as a delimiter. I want to be able to recognize the character 't' as part of a timestamp only when it has a digit preceding and following it.

For instance, if I have an array

    String[] timestamp = {"time: 12/45/60", "2015-07-13t05:30:59"};

I want a split result of

(time) (:) (12) (/) (45) (/) (60)  (2015) (-) (07) (-) (13) (t) (05) (:) (30) (:) (59) 

I am using JFlex to make the tokenizer, and I wrote the .flex file as such:

    %%
    %class lexer
    specialt = (\dt\d)
    parameter = [:jletterdigit:]+
    delimiter = [^a-zA-Z0-9]|{specialt}
    %%
    [:digit:]+ {return new datetoken(yytext(), "int");}
    {delimiter} {return new datetoken(yytext(), "delimiter");}
    {parameter} {return new datetoken(yytext(), "text");}

However, the tokenizer parses out only the symbols, not the 't'. Does anyone have any suggestions? Thank you very much.

It works for me, or rather it works the way the grammar describes things.

    7-29t08:42
    [int] 7
    [delimiter] -
    [text] 29t08
    [delimiter] :
    [int] 42
    [delimiter]

Indeed, after the scanner matches the - delimiter:

  • It gets 2, which matches both [:digit:]+ and {parameter}.
  • It gets 9, which matches both [:digit:]+ and {parameter}.
  • It gets t, which doesn't match [:digit:]+ but still matches {parameter}.
  • Then 0 and 8 keep matching {parameter}.
  • : doesn't match {parameter}, so the longest match wins and the {parameter} token 29t08 is returned.
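The trace above can be reproduced outside JFlex with a rough simulation of the "longest match wins, earlier rule breaks ties" behaviour. This is only a sketch: the class name `LongestMatchDemo` is made up, and the three patterns are java.util.regex approximations of the rules in the question (for example, `[a-zA-Z0-9]+` stands in for `[:jletterdigit:]+`), not what JFlex generates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JFlex: tries every rule at the current
// position and keeps the longest match, preferring the earlier rule on ties.
class LongestMatchDemo {
    // Same order as the rules in the .flex file from the question.
    static final String[] NAMES = {"int", "delimiter", "text"};
    static final Pattern[] RULES = {
        Pattern.compile("[0-9]+"),                    // [:digit:]+
        Pattern.compile("[^a-zA-Z0-9]|[0-9]t[0-9]"),  // {delimiter}
        Pattern.compile("[a-zA-Z0-9]+")               // {parameter}, approximated
    };

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int bestLen = 0;
            int bestRule = -1;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so the earlier rule wins when two matches are equally long.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestRule = i;
                }
            }
            // Every character is either alphanumeric or not, so some rule always matches.
            tokens.add("[" + NAMES[bestRule] + "] " + input.substring(pos, pos + bestLen));
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        tokenize("7-29t08:42").forEach(System.out::println);
        tokenize("5t6").forEach(System.out::println);
    }
}
```

At `29t08` the digit rule can only offer two characters while the text rule offers five, which is why the `t` never shows up as its own token.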

Note that {specialt} is recognized if you enter only that:

    5t6
    [delimiter] 5t6

Your first problem is that specialt is capturing too much: it consumes the digits on both sides of the t along with the t itself.

Your second problem is that {parameter} matches virtually everything.
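To make both problems concrete, here is a minimal sketch in plain java.util.regex (not JFlex) that produces the split asked for in the question. It assumes two changes: the 't' rule uses lookbehind and lookahead so it checks the neighbouring digits without consuming them, and the text rule is narrowed to letters only. The class name `LookaroundTokenizer` is made up for illustration. As far as I know JFlex has no lookbehind, so this is not a drop-in replacement for the .flex rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: alternatives are tried left to right at each position.
class LookaroundTokenizer {
    static final Pattern TOKEN = Pattern.compile(
        "[0-9]+"                    // a run of digits
        + "|(?<=[0-9])t(?=[0-9])"   // a 't' with a digit on both sides; digits are not consumed
        + "|[a-zA-Z]+"              // a run of letters (narrower than {parameter})
        + "|[^a-zA-Z0-9\\s]");      // any other single non-whitespace symbol

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips characters no alternative matches, i.e. whitespace here.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("time: 12/45/60"));
        System.out.println(tokenize("2015-07-13t05:30:59"));
    }
}
```

The 't' in "time" is left alone because nothing precedes it, so the lookbehind fails and the letters rule takes the whole word.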

I suggest you define the ISO date more accurately:

    // hh:mm or hh:mm:ss
    isotime = {dig2} {delimiter} {dig2} ({delimiter} {dig2})?

    // yyyy-mm-dd or yyyy-mm-ddt<isotime>
    isodate = {dig4} ({delimiter} {dig2}){2} (t {isotime})?

This will create a nice single token for the full 2015-07-29t16:42.
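If you want to check the shape of that pattern before regenerating the lexer, the same structure can be written with java.util.regex. This is a sketch under assumptions: dig2 and dig4 are not defined above, so I take them to mean exactly two and four digits, and delimiter is taken as a single non-alphanumeric character (without the specialt alternative). The class name `IsoDateSketch` is made up.

```java
import java.util.regex.Pattern;

// Hypothetical regex mirror of the isotime / isodate macros.
class IsoDateSketch {
    static final String DIG2 = "[0-9]{2}";       // assumed meaning of {dig2}
    static final String DIG4 = "[0-9]{4}";       // assumed meaning of {dig4}
    static final String DELIM = "[^a-zA-Z0-9]";  // assumed {delimiter}, one symbol

    // hh:mm or hh:mm:ss
    static final String ISO_TIME = DIG2 + DELIM + DIG2 + "(?:" + DELIM + DIG2 + ")?";

    // yyyy-mm-dd or yyyy-mm-ddt<isotime>
    static final Pattern ISO_DATE = Pattern.compile(
        DIG4 + "(?:" + DELIM + DIG2 + "){2}(?:t" + ISO_TIME + ")?");

    // True when the whole string is one ISO-style date or timestamp.
    static boolean isIsoDate(String s) {
        return ISO_DATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIsoDate("2015-07-29t16:42"));
        System.out.println(isIsoDate("2015-07-13t05:30:59"));
        System.out.println(isIsoDate("12/45/60"));
    }
}
```

Because the delimiter is any symbol, this also accepts things like 2015/07/29; tighten it to - and : if you only want the ISO form.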

