Commit Graph

  • 179ce57826 New read buffer peek options for extra performance. main Maurice Makaay 2019-08-01 13:26:02 +0000
  • f70bf8d074 Speed improvements Maurice Makaay 2019-07-29 23:51:09 +0000
  • b9cc91c0ae More speed improvements. Maurice Makaay 2019-07-29 22:52:38 +0000
  • 8ef9aed096 Switching from various Byte and Rune handlers to single Char handlers. The Char handlers determine on their own if they should handle things in byte or rune mode. Maurice Makaay 2019-07-29 09:45:25 +0000
  • e0b1039abd Made a big jump in performance on big files with lots of comments, by reading in chunks till end of line, instead of byte-by-byte. Maurice Makaay 2019-07-28 23:50:58 +0000
  • 53ae659ef6 Moving results to their own light weight tokenize.API.Result. Maurice Makaay 2019-07-28 22:35:33 +0000
  • eda71f304e Dropped PeekWithResult(), because it does not add any substantial performance. A simpler API which is virtually as fast wins any day. Maurice Makaay 2019-07-27 12:26:02 +0000
  • fcdd3d4ea7 Wow, going nicely! Some more milliseconds stripped. Maurice Makaay 2019-07-26 22:56:12 +0000
  • daf3b9838f Backup work on dropping forking support. Maurice Makaay 2019-07-26 14:51:40 +0000
  • 4c94374107 Getting rid of forking, the new system delivers more performance. Maurice Makaay 2019-07-26 12:14:15 +0000
  • 87cdadae78 Hmm... this whole snapshot idea seems to work and is a valid replacement for the forking method. Maurice Makaay 2019-07-26 08:02:37 +0000
  • bc9e718e47 Lowering the number of forks required. Maurice Makaay 2019-07-24 22:42:40 +0000
  • 99b0abc490 WIP on lowering the number of forks required. Maurice Makaay 2019-07-24 22:42:16 +0000
  • 548289560b Code cleanup. Maurice Makaay 2019-07-24 11:03:02 +0000
  • 62cd84bb74 Use zero-indexed cursor positioning data inside stackframes. This simplifies some things. Also a bit of code cleanup. Maurice Makaay 2019-07-24 10:34:24 +0000
  • 802701ade5 Added multi-byte peeks for some performance improvements. Maurice Makaay 2019-07-23 23:23:40 +0000
  • 7037c6d24a Fixing some naming inconsistencies. Maurice Makaay 2019-07-23 17:55:13 +0000
  • a968f22d45 Code cleanup, making the byte and rune inputs look as much the same as possible and get rid of some unneeded functionality. Maurice Makaay 2019-07-23 08:03:16 +0000
  • 93d2cfa6f1 Backup work. Maurice Makaay 2019-07-22 23:28:05 +0000
  • cf679b2225 Backup work for next refactoring step. Maurice Makaay 2019-07-22 22:16:28 +0000
  • 070e6a13a7 Made some nice steps, backup and continue! Maurice Makaay 2019-07-22 15:37:52 +0000
  • dd1159e309 Committing a bit of code cleanup before trying something bigger. Maurice Makaay 2019-07-22 07:57:05 +0000
  • 183f5df00d Brought back some lost performance. Doing everything via api.Input/Output causes an extra level of indirection and it does not cost that much, but we do lose performance through that route. So added private methods for the API struct, which are used internally to squeeze out a bit of extra performance. Maurice Makaay 2019-07-20 23:51:08 +0000
  • acdf83332b Use pointers instead of values, since we're updating the structs. Maurice Makaay 2019-07-20 11:50:36 +0000
  • 7998d05113 More efficient version of MatchOctet. Maurice Makaay 2019-07-20 01:50:12 +0000
  • 0c057e4a9a Split up the api.go into three files: api.go, api_input.go and api_output.go. This makes it easier to manage the individual code sets. Maurice Makaay 2019-07-20 00:48:11 +0000
  • 93c75af87f Moved Input and Output related fields from the API to their respective sub-structs. Maurice Makaay 2019-07-20 00:28:37 +0000
  • 7d2d8dbed3 Moved input-related functions to their own API.Input struct. Maurice Makaay 2019-07-19 23:41:15 +0000
  • 9d98c9dff7 Moving output functions to their own substruct of the API. Maurice Makaay 2019-07-19 22:57:06 +0000
  • 458d6f60a6 A nice performance gain by making a difference between AcceptRunes/AcceptBytes and the new simpler AcceptRune/AcceptByte functions. The simpler versions are faster when only accepting a single byte or rune (which is the case in most situations). Maurice Makaay 2019-07-19 21:13:15 +0000
  • 9a53ea9012 Working on API speed. Maurice Makaay 2019-07-19 14:44:44 +0000
  • 31055a3cd3 Bugfix for parsekit.read: when filling the buffer, the read offset was not taken into account for determining how many bytes could be read. Maurice Makaay 2019-07-19 10:13:32 +0000
  • 3f9c745ac4 Unit tests improved for the parsekit.read package. Maurice Makaay 2019-07-19 09:50:42 +0000
  • 22bcf4677e Some work on simplifying the reader code, to see if I can squeeze some more performance out of that part. Maurice Makaay 2019-07-19 08:47:13 +0000
  • 1771e237c0 Switched to a []byte backing store instead of []rune for collecting input data (we can use both bytes and runes for input in an easy way now) Maurice Makaay 2019-07-18 09:26:11 +0000
  • b9eeac3480 Work in progress on switching to byte stack. Committing to do some performance checks against master. Maurice Makaay 2019-07-18 08:06:26 +0000
  • e659380a5f Implemented an efficient M.DropUntilEndOfLine handler, which is now used in the TOML parser for a dramatic speed increase on comment parsing. Maurice Makaay 2019-07-17 23:51:37 +0000
  • 64f92696b2 Fixed unit tests for the new allocation behavior. Maurice Makaay 2019-07-17 23:03:14 +0000
  • 0a4e44b8f8 Allow for bufio Readers that deliver data in chunks (like our unit test Reader) Maurice Makaay 2019-07-17 23:03:00 +0000
  • 6d3eacdcae Allocate read buffer in 1024 byte chunks, and read the data in chunks as well. This is more efficient than reading byte by byte. Maurice Makaay 2019-07-17 22:12:37 +0000
  • 5e3e4b0f0a Yay! First version for which parsing long.toml drops below 100ms! Got an outcome of 93ms. Almost down to BurntSushi's speed level, but still with a generic parser backing. Looking good!! Maurice Makaay 2019-07-16 23:34:01 +0000
  • ddd0ed49f6 Don't resize the stack slices, since we keep track of their starts and ends anyway. Maurice Makaay 2019-07-16 12:19:50 +0000
  • 06faabdfe2 Small bugfix for the rune-to-byte-fallback code and added byte-support to the Str and StrNoCase matchers. Maurice Makaay 2019-07-16 07:35:06 +0000
  • 4cfdbafa6e Further switching to byte-based input handling. Maurice Makaay 2019-07-16 07:05:10 +0000
  • 0362763e83 Switched to byte input for built-in tokenize.Handler functions. Maurice Makaay 2019-07-15 22:48:00 +0000
  • d4492e4f0a Bytes reader working, now carry on switching to byte reading in the tokenizer code. Maurice Makaay 2019-07-15 20:03:05 +0000
  • 17935b7534 Further performance optimization and code cleanup. Maurice Makaay 2019-07-12 21:32:40 +0000
  • 56b8df3aab Removed loop protection code. This is useful, but it puts a performance burden on the code when doing it by keeping track of actual callers through the call stack. Maybe to be reintroduced in a future version with something like a simple counter and a maximum depth-style protection. Maurice Makaay 2019-07-12 12:33:18 +0000
  • 09746c0d2e Speeding up the code some more. Big step was made by simplifying the cursor, continuing with that in the next commit. Maurice Makaay 2019-07-12 08:02:04 +0000
  • 7116aa47df Squishing out more performance. Maurice Makaay 2019-07-12 00:21:02 +0000
  • a4eda45d2c Made all unit tests work again. Maurice Makaay 2019-07-11 14:55:08 +0000
  • 3c9a678d7a Fixed the ModifyDrop() behavior. It worked, but it caused memory build-up in the old implementation. Maurice Makaay 2019-07-11 14:52:12 +0000
  • c532af67ca Optimization round completed (for now :-) All tests successful. Maurice Makaay 2019-07-11 12:43:57 +0000
  • 7598b62dd0 Finalized the work-through of the new version of the tokenizer code. Maurice Makaay 2019-07-10 20:36:21 +0000
  • 48d7fda9f8 New implementation for performance. Maurice Makaay 2019-07-10 11:26:47 +0000
  • 7795588fe6 Speed improvement work. Maurice Makaay 2019-07-08 21:57:32 +0000
  • 5fa0b5eace Backup work on performance improvements. Maurice Makaay 2019-07-08 14:31:01 +0000
  • 23ca3501e1 Backup changes for performance fixes. Maurice Makaay 2019-07-08 00:12:30 +0000
  • 7bc7fda593 Backup changes for performance fixes. Maurice Makaay 2019-07-05 15:07:07 +0000
  • 5e9879326a Backup work to performance tuning. Maurice Makaay 2019-07-05 08:08:42 +0000
  • 583197c37a Made a distinction between MatchWhitespace() and MatchUnicodeSpace(). Maurice Makaay 2019-07-04 11:32:07 +0000
  • d96511ce0a Backup work. Maurice Makaay 2019-07-03 15:46:43 +0000
  • 92e6eec7f3 implemented Cursor.moveByRune(), to get rid of some useless rune->string conversion for updating cursor positions. Maurice Makaay 2019-06-30 10:16:46 +0000
  • 4b0309453f Added a feature to run the parser without any of the built-in sanity checks (like loop checks). This improved performance, but at the risk of missing some runtime issues with the parser implementation. Maurice Makaay 2019-06-30 01:05:54 +0000
  • 7ce12d1632 A few small changes used for TOML support. Maurice Makaay 2019-06-23 12:06:31 +0000
  • 5904da9677 Added some package docs. Maurice Makaay 2019-06-18 22:52:17 +0000
  • 2293627232 Small code cleanup things, mainly backing up the changes. Maurice Makaay 2019-06-18 15:46:09 +0000
  • 99654c2f9e Simplified some internal code, which also fixes a bug with correct error reporting from within parsekit in various edge cases. Maurice Makaay 2019-06-17 13:59:31 +0000
  • cdfc4ce52c More documentation and examples. Maurice Makaay 2019-06-12 16:17:13 +0000
  • 1a280233b0 Got rid of the testify dependency. My testing needs are so basic, that there's no need for this full fledged testing library. Maurice Makaay 2019-06-12 15:25:15 +0000
  • cef6ae1bc4 Working on documentation. Maurice Makaay 2019-06-12 15:24:09 +0000
  • 27c97ae902 Big overhaul on separating packages for code containment. Maurice Makaay 2019-06-12 14:30:46 +0000
  • 1f0e0fcc17 Splitting up functionality in packages, intermediate step. Maurice Makaay 2019-06-11 22:23:30 +0000
  • 0f7b4e0d26 Added a few syntactic sugar methods for ParseHandler. Maurice Makaay 2019-06-11 09:09:41 +0000
  • 65895ac502 Making parsekit.reader both simpler and more complex (more complex by adopting some buffer allocation logic from the built-in bytes package, to not be copying memory all the time during the read operations). Maurice Makaay 2019-06-09 21:55:01 +0000
  • 9656cd4449 The parsekit.reader.Reader now caches error messages that are returned from the embedded io.Reader. When an error is returned, the read offset and the error are stored. When later on, the same or a higher offset is requested, the error is returned again. This way the code will work for Readers that do not repeatedly return the correct error when calling the Read() method multiple times after a first error has occurred. Maurice Makaay 2019-06-09 19:42:20 +0000
  • 76336e883e Removed the use of Error.Full(). The default Error() method now includes the extra data from Full() (line and column offset) Maurice Makaay 2019-06-09 15:20:44 +0000
  • add28feb33 In the spirit of Go, slimmed down the ParseAPI interface. I'm no longer using ParseAPI.On(..).<DoSomething>(), but now it's simply ParseAPI.<DoSomething>(). I also dropped the difference between a Stay() and an Accept(). All that is possible now is ParseAPI.Peek() and ParseAPI.Accept(). Maurice Makaay 2019-06-09 10:25:49 +0000
  • 9f5caa2024 Backup work. Maurice Makaay 2019-06-08 22:48:56 +0000
  • 05ae55c487 Brought the examples up-to-date with the latest code. All are working correctly now. Maurice Makaay 2019-06-07 16:20:32 +0000
  • 40bad51064 Improved a few TokenHandlers by letting them make use of the new MatchRuneByCallback method, instead of having them implement their own logic. Maurice Makaay 2019-06-07 15:57:53 +0000
  • 9a5bf8b9af Further code cleaning for the interaction between ParseAPI and TokenAPI. Extra atoms added, also one based on a callback which can accept single runes based on that callback function. Maurice Makaay 2019-06-07 15:48:49 +0000
  • 98d2db0374 Moved Reader into its own package. Maurice Makaay 2019-06-07 10:55:55 +0000
  • 6d92e1dc68 Merged functionality of p.Expects(string) and p.UnexpectedInput(). It is now simply p.UnexpectedInput(string). This makes the naming of unexpected input not as magical, but explicit (which is a GoodThing). With one of the earlier incarnations of parsekit it did make sense, but it went in a way in which explicit is more idiomatic for the package. Maurice Makaay 2019-06-07 07:56:24 +0000
  • 3094b09284 Adding documentation and getting the interactions between ParseAPI and TokenAPI cleaned up a bit. Maurice Makaay 2019-06-07 07:26:41 +0000
  • c0389283bd Added input check for MatchIntegerBetween() Maurice Makaay 2019-06-05 22:21:34 +0000
  • 3d791233e0 Added a lot of IP-address-related TokenHandlers, so we can now process IPv4 addresses, IPv6 addresses, CIDR netmasks, IPv4 dotted quad netmasks, IPv4Net (ipv4 + mask) and IPv6Mask (ipv6 + mask). Maurice Makaay 2019-06-05 22:16:09 +0000
  • 05585db341 Normalizing error handling, to always include the caller location in errors. This makes debugging a lot easier for users of the package, because it doesn't say stuff like 'Method() was called incorrectly', but instead something like 'Method() was called incorrectly at /path/to/file.go:1234'. Maurice Makaay 2019-06-05 10:07:50 +0000
  • 75373e5ed5 Big simplification run once more, cleaned up code, added tests and examples, made stuff unexported where possible, to slim down the exported interface. Maurice Makaay 2019-06-04 23:15:02 +0000
  • 4580962fb8 Backup a load of work on typed token support, making it easy to produce tokens directly from parser/combinator-based parsing rules. Maurice Makaay 2019-06-04 00:03:08 +0000
  • 21f1aa597c Made the panic() calls (which basically indicate parser implementation bugs) more useful by referencing from where illegal calls were made. Maurice Makaay 2019-05-29 07:24:27 +0000
  • 2fa5b8d0f4 OCD ..OCD ...OCD ... Maurice Makaay 2019-05-29 00:01:24 +0000
  • 1e7ec7553a Tiny fix in variable naming, because the test had grown in a different direction. Maurice Makaay 2019-05-28 23:59:02 +0000
  • e1534f678e Simplified calculator 2 example. Maurice Makaay 2019-05-28 23:51:19 +0000
  • 11883b06ac Added a unit test for the actual parser loop issue that I ran into myself. This one will not bite me again! Maurice Makaay 2019-05-28 23:13:28 +0000
  • d31d09abf0 Added crude loop protection to the parser, which should prevent parsers running in circles (happened to me a few times too). Maurice Makaay 2019-05-28 23:01:23 +0000
  • 7aff3fc43e Added a nice example that shows how a []string-based type can be turned into a parser that fills its own slice elements during parsing. Maurice Makaay 2019-05-28 14:38:04 +0000
  • 2d851103e5 Cleanup of stuff that I don't need anymore, because it has been fully deprecated. Also added some tests for panic() calls in parsekit, which brings test coverage to 100%. It's not a goal as such, but it's good to know that I got there without cheaty tests :) Maurice Makaay 2019-05-28 13:41:58 +0000
  • 3dfa99c965 Modified all examples and tests to make use of the new ideas on how to keep parsing state. After this commit, I can cleanup a lot of stuff from the emitting loop-based parser which was basically crap for complex parsers. Maurice Makaay 2019-05-28 10:42:46 +0000
  • 980c18099e A small change to the computation interpreter to get rid of one useless level of recursion. Maurice Makaay 2019-05-28 07:26:50 +0000