Commit Graph

86 Commits

Author SHA1 Message Date
Maurice Makaay b9eeac3480 Work in progress on switching to byte stack. Committing to do some performance checks against master. 2019-07-18 08:06:26 +00:00
Maurice Makaay e659380a5f Implemented an efficient M.DropUntilEndOfLine handler, which is now used in the TOML parser for a dramatic speed increase on comment parsing. 2019-07-17 23:51:37 +00:00
Maurice Makaay 64f92696b2 Fixed unit tests for the new allocation behavior. 2019-07-17 23:03:14 +00:00
Maurice Makaay 0a4e44b8f8 Allow for bufio Readers that deliver data in chunks (like our unit test Reader) 2019-07-17 23:03:00 +00:00
Maurice Makaay 6d3eacdcae Allocate read buffer in 1024 byte chunks, and read the data in chunks as well. This is more efficient than reading byte by byte. 2019-07-17 22:12:37 +00:00
Maurice Makaay 5e3e4b0f0a Yay! First version for which parsing long.toml drops below 100ms! Got an outcome of 93ms. Almost down to BurntSushi's speed level, but still with a generic parser backing. Looking good!! 2019-07-16 23:34:01 +00:00
Maurice Makaay ddd0ed49f6 Don't resize the stack slices, since we keep track of their starts and ends anyway. 2019-07-16 12:19:50 +00:00
Maurice Makaay 06faabdfe2 Small bugfix for the rune-to-byte-fallback code and added byte-support to the Str and StrNoCase matchers. 2019-07-16 07:35:06 +00:00
Maurice Makaay 4cfdbafa6e Further switching to byte-based input handling. 2019-07-16 07:05:10 +00:00
Maurice Makaay 0362763e83 Switched to byte input for built-in tokenize.Handler functions. 2019-07-15 22:48:00 +00:00
Maurice Makaay d4492e4f0a Bytes reader working, now carry on switching to byte reading in the tokenizer code. 2019-07-15 20:03:05 +00:00
Maurice Makaay 17935b7534 Further performance optimization and code cleanup. 2019-07-12 21:32:40 +00:00
Maurice Makaay 56b8df3aab Removed loop protection code. This is useful, but it puts a performance burden on the code when doing it by keeping track of actual callers through the call stack. Maybe to be reintroduced in a future version with something like a simple counter and a maximum depth-style protection. 2019-07-12 12:33:18 +00:00
Maurice Makaay 09746c0d2e Speeding up the code some more. Big step was made by simplifying the cursor, continuing with that in the next commit. 2019-07-12 08:02:04 +00:00
Maurice Makaay 7116aa47df Squishing out more performance. 2019-07-12 00:21:02 +00:00
Maurice Makaay a4eda45d2c Made all unit tests work again. 2019-07-11 14:55:08 +00:00
Maurice Makaay 3c9a678d7a Fixed the ModifyDrop() behavior. It worked, but it caused memory build-up in the old implementation. 2019-07-11 14:52:12 +00:00
Maurice Makaay c532af67ca Optimization round completed (for now :-) All tests successful. 2019-07-11 12:43:57 +00:00
Maurice Makaay 7598b62dd0 Finalized the work-through of the new version of the tokenizer code. 2019-07-10 20:36:21 +00:00
Maurice Makaay 48d7fda9f8 New implementation for performance. 2019-07-10 11:26:47 +00:00
Maurice Makaay 7795588fe6 Speed improvement work. 2019-07-08 21:57:32 +00:00
Maurice Makaay 5fa0b5eace Backup work on performance improvements. 2019-07-08 14:31:01 +00:00
Maurice Makaay 23ca3501e1 Backup changes for performance fixes. 2019-07-08 00:12:30 +00:00
Maurice Makaay 7bc7fda593 Backup changes for performance fixes. 2019-07-05 15:07:07 +00:00
Maurice Makaay 5e9879326a Backup work to performance tuning. 2019-07-05 08:08:42 +00:00
Maurice Makaay 583197c37a Made a distinction between MatchWhitespace() and MatchUnicodeSpace(). 2019-07-04 11:32:07 +00:00
Maurice Makaay d96511ce0a Backup work. 2019-07-03 15:46:43 +00:00
Maurice Makaay 92e6eec7f3 Implemented Cursor.moveByRune() to get rid of some useless rune->string conversion when updating cursor positions. 2019-06-30 10:16:46 +00:00
Maurice Makaay 4b0309453f Added a feature to run the parser without any of the built-in sanity checks (like loop checks). This improved performance, but at the risk of missing some runtime issues with the parser implementation. 2019-06-30 01:05:54 +00:00
Maurice Makaay 7ce12d1632 A few small changes used for TOML support. 2019-06-23 12:06:31 +00:00
Maurice Makaay 5904da9677 Added some package docs. 2019-06-18 22:52:17 +00:00
Maurice Makaay 2293627232 Small code cleanup things, mainly backing up the changes. 2019-06-18 15:46:09 +00:00
Maurice Makaay 99654c2f9e Simplified some internal code, which also fixes a bug in error reporting from within parsekit in various edge cases. 2019-06-17 13:59:31 +00:00
Maurice Makaay cdfc4ce52c More documentation and examples. 2019-06-12 16:17:13 +00:00
Maurice Makaay 1a280233b0 Got rid of the testify dependency. My testing needs are so basic that there's no need for this full-fledged testing library. 2019-06-12 15:25:15 +00:00
Maurice Makaay cef6ae1bc4 Working on documentation. 2019-06-12 15:24:09 +00:00
Maurice Makaay 27c97ae902 Big overhaul on separating packages for code containment. 2019-06-12 14:30:46 +00:00
Maurice Makaay 1f0e0fcc17 Splitting up functionality in packages, intermediate step. 2019-06-11 22:23:30 +00:00
Maurice Makaay 0f7b4e0d26 Added a few syntactic sugar methods for ParseHandler. 2019-06-11 09:09:41 +00:00
Maurice Makaay 65895ac502 Making parsekit.reader both simpler and more complex (more complex by adopting some buffer allocation logic from the built-in bytes package, to not be copying memory all the time during the read operations). 2019-06-09 21:55:01 +00:00
Maurice Makaay 9656cd4449 The parsekit.reader.Reader now caches error messages that are returned from
the embedded io.Reader. When an error is returned, the read offset and the
error are stored. When, later on, the same or a higher offset is requested,
the error is returned again. This way the code will work for Readers that do
not repeatedly return the correct error when calling the Read() method
multiple times after a first error has occurred.

Note: I am not sure if there are any Reader implementations that wouldn't
return the same error message over and over again, but hardening the
parsekit Reader to support this is not hard, so let's just go for it.
2019-06-09 19:42:20 +00:00
Maurice Makaay 76336e883e Removed the use of Error.Full(). The default Error() method now includes the extra data from Full() (line and column offset) 2019-06-09 15:20:44 +00:00
Maurice Makaay add28feb33 In the spirit of Go, slimmed down the ParseAPI interface. I'm no longer using ParseAPI.On(..).<DoSomething>(), but now it's simply ParseAPI.<DoSomething>(). I also dropped the difference between a Stay() and an Accept(). All that is possible now is ParseAPI.Peek() and ParseAPI.Accept(). 2019-06-09 10:25:49 +00:00
Maurice Makaay 9f5caa2024 Backup work. 2019-06-08 22:48:56 +00:00
Maurice Makaay 05ae55c487 Brought the examples up-to-date with the latest code. All are working correctly now. 2019-06-07 16:20:32 +00:00
Maurice Makaay 40bad51064 Improved a few TokenHandlers by letting them make use of the new MatchRuneByCallback method, instead of having them implement their own logic. 2019-06-07 15:57:53 +00:00
Maurice Makaay 9a5bf8b9af Further code cleaning for the interaction between ParseAPI and TokenAPI. Extra atoms added, also one based on a callback which can accept single runes based on that callback function. 2019-06-07 15:48:49 +00:00
Maurice Makaay 98d2db0374 Moved Reader into its own package. 2019-06-07 10:55:55 +00:00
Maurice Makaay 6d92e1dc68 Merged functionality of p.Expects(string) and p.UnexpectedInput().
It is now simply p.UnexpectedInput(string). This makes the naming
of unexpected input not as magical, but explicit (which is a GoodThing).
With one of the earlier incarnations of parsekit it did make sense,
but the package has since gone in a direction in which explicit is more idiomatic.
2019-06-07 07:56:24 +00:00
Maurice Makaay 3094b09284 Adding documentation and getting the interactions between ParseAPI and TokenAPI cleaned up a bit. 2019-06-07 07:26:41 +00:00