How do I use Nette\Utils\Tokenizer?
- petr.pavel
- Member | 535
Hey guys, I have a long text with a variable number of tokens. A token is a sequence of characters with optional parameters.
Simplified example:
some text
second line
[reference:ABC,DEF,GHI]
some other text
[different-token]
again some text
I'm wondering if Tokenizer could be used to convert the text into something like:
array(
    array('value' => "some text\nsecond line", 'type' => "text"),
    array('value' => array(
        array('value' => 'reference', 'type' => 'component'),
        array('value' => 'ABC,DEF,GHI', 'type' => 'parameters'),
    ), 'type' => 'token'),
    array('value' => "some other text", 'type' => "text"),
    array('value' => array(
        array('value' => 'different-token', 'type' => 'component'),
    ), 'type' => 'token'),
    array('value' => "again some text", 'type' => "text"),
)
The format of the output array doesn't really matter. I just need the pieces of text and the components with their parameters.
I know how to do it the hard way with preg_split() in two iterations, but I'm hoping for a more clever way.
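For reference, the preg_split() route could look roughly like the sketch below. It does not use Tokenizer at all. It assumes a token never contains "]" or a line break and that parameters are comma-separated; the function name splitTokens() and the output keys are made up for illustration.

```php
<?php

/**
 * Splits text into plain-text pieces and [component:param,param] tokens.
 * Hypothetical helper, not part of Nette.
 */
function splitTokens(string $input): array
{
    // First pass: split on [...] and keep the delimiters in the result.
    $parts = preg_split(
        '/(\[[^\]\n]+\])/',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $result = [];
    foreach ($parts as $part) {
        // Second pass: decide whether the piece is a token or plain text.
        if (preg_match('/^\[([^:\]]+)(?::(.*))?\]$/', $part, $m)) {
            $hasParams = isset($m[2]) && $m[2] !== '';
            $result[] = [
                'type' => 'token',
                'component' => $m[1],
                'parameters' => $hasParams ? explode(',', $m[2]) : [],
            ];
            continue;
        }

        // Drop the line breaks that surrounded the token.
        $value = trim($part);
        if ($value !== '') {
            $result[] = ['type' => 'text', 'value' => $value];
        }
    }

    return $result;
}
```

Called on the simplified example above, this should give three text pieces and two tokens, with the parameters of the first token already split into a list.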
- petr.pavel
- Member | 535
Thank you for the links. Coincidentally, your PropertiesLexer.php was one of the sources I've read when trying to understand the Tokenizer. I also studied Nette unit tests.
I know next to nothing about lexical analysis, and I was wondering whether the Tokenizer could be misused for my purpose. I guess not. Or perhaps it could, but with no additional benefit over a simple preg_split().
Thanks Filip.