
1 line fix + benchmark: stop resetting class cache for each document #78

Open
Philosobyte wants to merge 2 commits into simdjson:main from Philosobyte:stop-clearing-class-cache-for-each-doc


@Philosobyte


Every time we parse a document, we enter SchemaBasedJsonIterator.walkDocument, and every call to that method invokes classResolver.reset(). This clears the class cache before each document, so nothing is ever reused across documents, which defeats the cache's purpose.
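To illustrate the effect, here is a minimal, self-contained sketch. It does not use simdjson-java's actual classes: MetadataCache, CacheResetDemo, and Tweet are hypothetical stand-ins, and the "expensive" work is just a reflective field lookup. It only counts how many cache misses occur when the cache is cleared before every document versus kept alive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class-metadata cache; NOT simdjson-java's real ClassResolver.
final class MetadataCache {
    private final Map<Class<?>, String> cache = new HashMap<>();
    private int expensiveLookups = 0;

    // Returns cached metadata, doing the costly reflective work only on a cache miss.
    String resolve(Class<?> type) {
        return cache.computeIfAbsent(type, t -> {
            expensiveLookups++;
            return t.getSimpleName() + " with " + t.getDeclaredFields().length + " fields";
        });
    }

    // Drops everything that was cached so far.
    void reset() {
        cache.clear();
    }

    int expensiveLookups() {
        return expensiveLookups;
    }
}

final class CacheResetDemo {
    // Hypothetical target type for schema-based parsing.
    static final class Tweet {
        String text;
        long id;
    }

    // Simulates parsing `documents` documents of the same type and
    // returns how many times the expensive lookup had to run.
    static int lookupsFor(int documents, boolean resetPerDocument) {
        MetadataCache cache = new MetadataCache();
        for (int i = 0; i < documents; i++) {
            if (resetPerDocument) {
                cache.reset(); // mirrors calling reset() at the start of every document
            }
            cache.resolve(Tweet.class);
        }
        return cache.expensiveLookups();
    }

    public static void main(String[] args) {
        System.out.println(lookupsFor(100, true));  // prints 100: every document misses
        System.out.println(lookupsFor(100, false)); // prints 1: only the first document misses
    }
}
```

With a per-document reset, the number of expensive lookups grows linearly with the number of documents; without it, the cost is paid once per class.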

I benchmarked by reading twitter.json into memory, splitting it into 100 individual messages (roughly 6 kB each), and having each benchmark iteration parse all 100 messages one at a time.

With the reset in place, we are about 20x slower than Jackson:

Benchmark                           Mode   Cnt   Score   Error  Units
SchemaBasedParseBenchmark.jackson     ss  5000   0.700 ± 0.009  ms/op
SchemaBasedParseBenchmark.simdjson    ss  5000  14.051 ± 0.017  ms/op

I removed classResolver.reset() and benchmarked again:

Benchmark                           Mode   Cnt  Score   Error  Units
SchemaBasedParseBenchmark.jackson     ss  5000  0.706 ± 0.008  ms/op
SchemaBasedParseBenchmark.simdjson    ss  5000  0.327 ± 0.006  ms/op

We are now about twice as fast as Jackson, and roughly 43x faster than before the change.

Is there any reason this classResolver.reset() call needs to exist?

Here is the command I used, so you can reproduce my results:

java --add-modules=jdk.incubator.vector -jar .\build\libs\simdjson-java-0.4.1-stop-clearing-class-cache-for-each-doc-SNAPSHOT-jmh.jar -wi 200 -i 1000 -tu ms -t 1 SchemaBasedParseBenchmark

I also needed to update jsoniterScalaVersion; otherwise I got build errors.

@Philosobyte

Author

@piotrrzysko I would appreciate it if you could look at this one-line fix (plus benchmark).

