Jsonparser
One of the fastest alternative JSON parsers for Go that does not require a schema
Alternative JSON parser for Go (up to 10x faster than the standard library)
It does not require you to know the structure of the payload (e.g. create structs), and allows accessing fields by providing the path to them. It is up to 10 times faster than the standard encoding/json package (depending on payload size and usage) and allocates no memory. See benchmarks below.
Rationale
Originally I made this for a project that relies on a lot of 3rd party APIs that can be unpredictable and complex.
I love simplicity and prefer to avoid external dependencies. encoding/json requires you to know your data structures exactly, or, if you prefer to use map[string]interface{} instead, it will be very slow and hard to manage.
I investigated what's on the market and found that most libraries are just wrappers around encoding/json. There are a few options with their own parsers (ffjson, easyjson), but they still require you to create data structures.
The goal of this project is to push the JSON parser to its performance limits without sacrificing compliance or developer experience.
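To make the map[string]interface{} point concrete, here is a minimal sketch that uses only the standard library. The fullName helper and the sample payload are illustrative assumptions and are not part of jsonparser; the thing to notice is the chain of type assertions that every nested lookup needs.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// fullName is a hypothetical helper: it reads person.name.fullName using
// only encoding/json and generic maps. The whole payload is decoded first,
// and each level needs its own type assertion.
func fullName(data []byte) (string, error) {
	var root map[string]interface{}
	if err := json.Unmarshal(data, &root); err != nil {
		return "", err
	}
	person, ok := root["person"].(map[string]interface{})
	if !ok {
		return "", errors.New("person is missing or not an object")
	}
	name, ok := person["name"].(map[string]interface{})
	if !ok {
		return "", errors.New("person.name is missing or not an object")
	}
	full, ok := name["fullName"].(string)
	if !ok {
		return "", errors.New("person.name.fullName is missing or not a string")
	}
	return full, nil
}

func main() {
	data := []byte(`{"person":{"name":{"fullName":"Leonid Bugaev"}}}`)
	full, err := fullName(data)
	fmt.Println(full, err)
}
```

With a path-based API the same lookup is a single call that names the keys, which is the developer experience this project aims for.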
Example
For the given JSON our goal is to extract the user's full name, number of github followers and avatar.
import "github.com/buger/jsonparser"
...
data := []byte(`{
"person": {
"name": {
"first": "Leonid",
"last": "Bugaev",
"fullName": "Leonid Bugaev"
},
"github": {
"handle": "buger",
"followers": 109
},
"avatars": [
{ "url": "https://avatars1.githubusercontent.com/u/14009?v=3&s=460", "type": "thumbnail" }
]
},
"company": {
"name": "Acme"
}
}`)
// You can specify key path by providing arguments to Get function
jsonparser.Get(data, "person", "name", "fullName")
// There are `GetInt` and `GetBoolean` helpers if you know the key's data type exactly
jsonparser.GetInt(data, "person", "github", "followers")
// When you get an object, it returns a []byte slice pointing to the data containing it
// In `company` it will be `{"name": "Acme"}`
jsonparser.Get(data, "company")
// If the key doesn't exist, an error is returned
var size int64
if value, err := jsonparser.GetInt(data, "company", "size"); err == nil {
size = value
}
// You can use `ArrayEach` helper to iterate items [item1, item2 .... itemN]
jsonparser.ArrayEach(data, func(value []byte, dataType jsonparser.ValueType, offset int, err error) {
fmt.Println(jsonparser.Get(value, "url"))
}, "person", "avatars")
// Or you can access fields by index!
jsonparser.GetString(data, "person", "avatars", "[0]", "url")
// You can use `ObjectEach` helper to iterate objects { "key1":object1, "key2":object2, .... "keyN":objectN }
jsonparser.ObjectEach(data, func(key []byte, value []byte, dataType jsonparser.ValueType, offset int) error {
fmt.Printf("Key: '%s'\n Value: '%s'\n Type: %s\n", string(key), string(value), dataType)
return nil
}, "person", "name")
// The most efficient way to extract multiple keys is `EachKey`
paths := [][]string{
[]string{"person", "name", "fullName"},
[]string{"person", "avatars", "[0]", "url"},
[]string{"company", "url"},
}
jsonparser.EachKey(data, func(idx int, value []byte, vt jsonparser.ValueType, err error){
switch idx {
case 0: // []string{"person", "name", "fullName"}
...
case 1: // []string{"person", "avatars", "[0]", "url"}
...
case 2: // []string{"company", "url"},
...
}
}, paths...)
// For more information see docs below
Reference
The library API is really simple. You just need the Get method to perform any operation. The rest are just helpers around it.
You can also view the API at godoc.org
Get
func Get(data []byte, keys ...string) (value []byte, dataType jsonparser.ValueType, offset int, err error)
Receives data structure, and key path to extract value from.
Returns:
- value - Pointer to the original data structure containing the key value, or just an empty slice if nothing was found or an error occurred
- dataType - Can be: NotExist, String, Number, Object, Array, Boolean or Null
- offset - Offset from the provided data structure where the key value ends. Used mostly internally, for example for the ArrayEach helper.
- err - If the key is not found or there is any other parsing issue, it returns an error. If the key is not found, it also sets dataType to NotExist
Accepts multiple keys to specify the path to a JSON value (in case of querying nested structures).
If no keys are provided it will try to extract the closest JSON value (simple ones or object/array), useful for reading streams or arrays, see ArrayEach implementation.
Note that keys can also be array indexes: jsonparser.GetString(data, "person", "avatars", "[0]", "url"). Pretty cool, yeah?
GetString
func GetString(data []byte, keys ...string) (val string, err error)
Returns strings, properly handling escaped and unicode characters. Note that this will cause additional memory allocations.
GetUnsafeString
Use this if you need a string in your app and are ready to sacrifice support for escaped symbols in favor of speed. It returns a string mapped to the existing byte slice memory, without any allocations:
s, _ := jsonparser.GetUnsafeString(data, "person", "name", "title")
switch s {
case "CEO":
...
case "Engineer":
...
...
}
Note that unsafe here means the string shares memory with the underlying byte slice: it is only valid as long as that slice is not modified or reused. In most cases this means you should use the string only in the current context and not pass it anywhere externally, whether through channels or any other way.
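The sharing described above can be seen in a small standalone sketch. The unsafeString helper is a hypothetical illustration of a zero-copy conversion and is not necessarily how jsonparser implements GetUnsafeString; it only uses the standard unsafe package.

```go
package main

import (
	"fmt"
	"unsafe"
)

// unsafeString is a hypothetical helper that reinterprets b as a string
// without copying. The result shares its memory with b.
func unsafeString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte("CEO")
	title := unsafeString(buf)
	fmt.Println(title)

	// Reusing the buffer silently rewrites the supposedly immutable string,
	// which is why such strings should not outlive the current context.
	copy(buf, "CTO")
	fmt.Println(title)
}
```

If the value has to be kept around or handed to other goroutines, a plain string(b) conversion makes an independent copy at the cost of one allocation.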
GetBoolean, GetInt and GetFloat
func GetBoolean(data []byte, keys ...string) (val bool, err error)
func GetFloat(data []byte, keys ...string) (val float64, err error)
func GetInt(data []byte, keys ...string) (val int64, err error)
If you know the key type, you can use the helpers above. If the key's data type does not match, an error is returned.
ArrayEach
func ArrayEach(data []byte, cb func(value []byte, dataType jsonparser.ValueType, offset int, err error), keys ...string)
Needed for iterating arrays, accepts a callback function with the same return arguments as Get.
ObjectEach
func ObjectEach(data []byte, callback func(key []byte, value []byte, dataType ValueType, offset int) error, keys ...string) (err error)
Needed for iterating objects, accepts a callback function. Example:
var handler func([]byte, []byte, jsonparser.ValueType, int) error
handler = func(key []byte, value []byte, dataType jsonparser.ValueType, offset int) error {
// do stuff here
return nil
}
jsonparser.ObjectEach(myJson, handler)
EachKey
func EachKey(data []byte, cb func(idx int, value []byte, dataType jsonparser.ValueType, err error), paths ...[]string)
When you need to read multiple keys and are not afraid of a low-level API, EachKey is your friend. It reads the payload only once and calls the callback function as soon as a path is found. By contrast, calling Get multiple times has to process the payload on every call. Depending on the payload, EachKey can be several times faster than Get. Paths can use nested keys as well!
paths := [][]string{
[]string{"uuid"},
[]string{"tz"},
[]string{"ua"},
[]string{"st"},
}
var data SmallPayload
jsonparser.EachKey(smallFixture, func(idx int, value []byte, vt jsonparser.ValueType, err error){
switch idx {
case 0:
data.Uuid, _ = jsonparser.ParseString(value)
case 1:
v, _ := jsonparser.ParseInt(value)
data.Tz = int(v)
case 2:
data.Ua, _ = jsonparser.ParseString(value)
case 3:
v, _ := jsonparser.ParseInt(value)
data.St = int(v)
}
}, paths...)
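The single-pass idea can be sketched with the standard library alone. The eachTopLevelKey function below is a hypothetical, simplified stand-in and not jsonparser's implementation: it only handles top-level keys of an object and it does allocate. It is meant to show how one walk over the payload can dispatch several keys by index.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// eachTopLevelKey walks a JSON object once and calls cb with the index of
// every requested key that is found, together with its raw value.
func eachTopLevelKey(data []byte, cb func(idx int, raw []byte), keys ...string) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	first, err := dec.Token() // consume the opening '{'
	if err != nil {
		return err
	}
	if d, ok := first.(json.Delim); !ok || d != '{' {
		return errors.New("payload is not a JSON object")
	}
	for dec.More() {
		tok, err := dec.Token() // next key
		if err != nil {
			return err
		}
		name, _ := tok.(string)
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil { // value that belongs to the key
			return err
		}
		for idx, key := range keys {
			if key == name {
				cb(idx, raw)
			}
		}
	}
	return nil
}

func main() {
	data := []byte(`{"uuid":"abc","tz":3,"ua":"x"}`)
	err := eachTopLevelKey(data, func(idx int, raw []byte) {
		fmt.Printf("key %d -> %s\n", idx, raw)
	}, "tz", "uuid")
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

The callback receives the position of the key in the requested list, mirroring the switch on idx used with EachKey, and the payload is scanned only once no matter how many keys are requested.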
Set
func Set(data []byte, setValue []byte, keys ...string) (value []byte, err error)
Receives existing data structure, key path to set, and value to set at that key. This functionality is experimental.
Returns:
- value - Pointer to the original data structure with the updated or added key value.
- err - If there is any parsing issue, it returns an error.
Accepts multiple keys to specify the path to a JSON value (in case of updating or creating nested structures).
Note that keys can also be array indexes: jsonparser.Set(data, []byte("http://github.com"), "person", "avatars", "[0]", "url")
Delete
func Delete(data []byte, keys ...string) (value []byte)
Receives an existing data structure and the key path to delete. This functionality is experimental.
Returns:
- value - Pointer to the original data structure with the key path deleted if it can be found. If no key path is provided, the whole data structure is deleted.
Accepts multiple keys to specify the path to a JSON value (in case of deleting from nested structures).
Note that keys can also be array indexes: jsonparser.Delete(data, "person", "avatars", "[0]", "url")
What makes it so fast?
- It does not rely on encoding/json, reflection or interface{}; the only real package dependency is bytes.
- Operates with the JSON payload on the byte level, providing you pointers to the original data structure: no memory allocation.
- No automatic type conversions: by default everything is a []byte, but it provides you the value type, so you can convert by yourself (there are a few helpers included).
- Does not parse the full record, only the keys you specified
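The "pointers to the original data structure" point in the list above can be illustrated with a deliberately naive sketch. The rawStringValue helper is hypothetical and far simpler than the real parser (it ignores escapes, whitespace and nesting); it only shows that a value can be returned as a sub-slice of the input instead of a copy.

```go
package main

import (
	"bytes"
	"fmt"
)

// rawStringValue is a hypothetical helper that finds `"key":"` in data and
// returns the bytes up to the next quote. The result is a sub-slice of
// data, so the value itself is never copied.
func rawStringValue(data []byte, key string) ([]byte, bool) {
	marker := []byte(`"` + key + `":"`)
	start := bytes.Index(data, marker)
	if start < 0 {
		return nil, false
	}
	start += len(marker)
	end := bytes.IndexByte(data[start:], '"')
	if end < 0 {
		return nil, false
	}
	return data[start : start+end], true
}

func main() {
	data := []byte(`{"handle":"buger","followers":109}`)
	v, ok := rawStringValue(data, "handle")
	fmt.Printf("%s %v\n", v, ok)

	// The returned slice starts at the same address as the value inside
	// data: both share one backing array.
	fmt.Println(&v[0] == &data[11])
}
```

Because the result aliases the input, converting it (for example with strconv or string(v)) is left to the caller, which matches the "no automatic type conversions" point.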
Benchmarks
There are 3 benchmark types, trying to simulate real-life usage for small, medium and large JSON payloads. For each metric, the lower value is better. Time/op is in nanoseconds. Values better than standard encoding/json are marked in bold. Benchmarks were run on a standard Linode 1024 box.
Compared libraries:
- https://golang.org/pkg/encoding/json
- https://github.com/Jeffail/gabs
- https://github.com/a8m/djson
- https://github.com/bitly/go-simplejson
- https://github.com/an
