Understanding Google Cloud Datastore Keys (using Go)


If you are new to Datastore, you will find that, unlike a relational database, Google Cloud Datastore uses a distributed architecture. That brings some benefits, such as scalability, and some restrictions: among others, you cannot use join operations, sub-queries, or inequality filters on more than one property.

But another thing that will get your attention is that defining a key to uniquely identify a record is a little bit strange:

key := datastore.NewKey(context, "EntityType", "EntityName", 0, nil)
incompleteKey := datastore.NewIncompleteKey(context, "EntityType", nil)

Above, we see two examples of how to define datastore keys: a complete key and an incomplete key. But what is the difference between them? When should we use one or the other?

Complete Key

A complete key is constructed using 5 parameters: context, entity type (called the "kind" in the Datastore documentation), string id (also known as entity name), int id, and a parent key.

Let's suppose we have two product catalogs: food and clothes. Let's also suppose that those catalogs are fixed and that we'll use them to categorize products, so food products like milk and eggs go under the food catalog, and shoes and shirts go under the clothes catalog. Now we can create the keys for those catalogs.

foodKey := datastore.NewKey(context, "Catalog", "Food", 0, nil)
clothesKey := datastore.NewKey(context, "Catalog", "Clothes", 0, nil)

As we can see, the entity type of both catalogs is Catalog, but the string id is different. In this case, the string id is what makes these keys unique.

We can achieve the same result by using the int id instead of the string id.

foodKey := datastore.NewKey(context, "Catalog", "", 1, nil)
clothesKey := datastore.NewKey(context, "Catalog", "", 2, nil)

Not as descriptive as the first example, but it works.
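To make the idea of key identity more concrete, here is a minimal sketch in plain Go. The Key type below is a hypothetical, simplified stand-in written for this post, not the datastore package's own type; it only models the entity type, string id, int id and parent discussed above.

```go
package main

import (
	"fmt"
	"strconv"
)

// Key is a simplified, hypothetical stand-in for a datastore key.
// It is NOT the real datastore.Key; it only models the fields discussed here.
type Key struct {
	Kind     string
	StringID string
	IntID    int64
	Parent   *Key
}

// Path renders the key as "/Kind,id" segments, parent first.
func (k *Key) Path() string {
	prefix := ""
	if k.Parent != nil {
		prefix = k.Parent.Path()
	}
	id := k.StringID
	if id == "" {
		id = strconv.FormatInt(k.IntID, 10)
	}
	return prefix + "/" + k.Kind + "," + id
}

// Equal reports whether two keys identify the same entity:
// same kind, same string id, same int id and same parent chain.
func (k *Key) Equal(o *Key) bool {
	if k == nil || o == nil {
		return k == o
	}
	return k.Kind == o.Kind && k.StringID == o.StringID &&
		k.IntID == o.IntID && k.Parent.Equal(o.Parent)
}

func main() {
	food := &Key{Kind: "Catalog", StringID: "Food"}
	clothes := &Key{Kind: "Catalog", StringID: "Clothes"}
	foodByID := &Key{Kind: "Catalog", IntID: 1}

	fmt.Println(food.Path())          // /Catalog,Food
	fmt.Println(foodByID.Path())      // /Catalog,1
	fmt.Println(food.Equal(clothes))  // false
	fmt.Println(food.Equal(foodByID)) // false
}
```

Note that in this model a key named "Food" and a key with int id 1 are two different keys, even though both are meant to represent the food catalog. Pick one style per entity type and stick with it.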

Looking at those examples, we can guess that the string id and the int id are the parameters that give the key its uniqueness. So, what happens when we omit those parameters?

Incomplete Key

Now let's suppose we want to store a product:

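type Product struct {
    // Fields must be exported (capitalized) so that datastore can save them.
    Name  string
    Price float64
}
newProduct := Product{"ham", 9.99}
key := datastore.NewKey(context, "Product", "", 0, nil)
// Put expects a pointer to the struct and returns the stored key and an error.
key, err := datastore.Put(context, key, &newProduct)
if err != nil {
    // handle the error
}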

But there is something strange about that example: we specified neither a string id nor an int id. That means we have nothing that makes the key unique. What we get is an incomplete key instead of a complete one.

Instead of using datastore.NewKey, we could have used datastore.NewIncompleteKey and just passed the context, the entity type and the parent key, if any.

So these two calls are equivalent:

incompleteKey := datastore.NewKey(context, "Product", "", 0, nil)
//is equivalent to
incompleteKey = datastore.NewIncompleteKey(context, "Product", nil)

So you may be wondering: what is the use of an incomplete key then? The answer is very simple: We use incomplete keys when we want Google Cloud's datastore to generate the entity's int id for us, so we don't have to worry about creating a new string id or int id for each entity we store.

It's worth mentioning that Google Cloud's datastore only generates int ids, never string ids, and that by default the generated id is a big 16-digit number.
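The following toy example sketches what "completing" an incomplete key means. Everything in it (the Key type, the Store type and its counter) is hypothetical and lives only in memory; in particular, the real datastore does not hand out small sequential ids like this sketch does.

```go
package main

import "fmt"

// Key is a simplified, hypothetical stand-in for a datastore key.
type Key struct {
	Kind     string
	StringID string
	IntID    int64
}

// Incomplete reports whether the key has neither a string id nor an int id.
func (k Key) Incomplete() bool {
	return k.StringID == "" && k.IntID == 0
}

// Store is a toy in-memory store, NOT the real datastore.
type Store struct {
	nextID   int64
	entities map[Key]interface{}
}

// NewStore returns an empty toy store whose first generated id is 1.
func NewStore() *Store {
	return &Store{nextID: 1, entities: make(map[Key]interface{})}
}

// Put saves the entity and returns the key it was stored under.
// An incomplete key is completed with a freshly generated int id;
// a complete key is used as is.
func (s *Store) Put(k Key, entity interface{}) Key {
	if k.Incomplete() {
		k.IntID = s.nextID
		s.nextID++
	}
	s.entities[k] = entity
	return k
}

func main() {
	s := NewStore()
	ham := s.Put(Key{Kind: "Product"}, "ham")
	eggs := s.Put(Key{Kind: "Product"}, "eggs")
	milk := s.Put(Key{Kind: "Product", StringID: "milk"}, "milk")

	fmt.Println(ham.Incomplete(), ham.IntID)    // false 1
	fmt.Println(eggs.Incomplete(), eggs.IntID)  // false 2
	fmt.Println(milk.StringID, milk.IntID == 0) // milk true
}
```

The practical takeaway is the same with the real package: after storing an entity with an incomplete key, keep the key returned by Put, because that is the one carrying the generated int id.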

Parent Keys

As we mentioned earlier, we can specify a parent key when creating a complete or incomplete key. That allows us to group entities so we can query them by specifying an ancestor. For example, if we store products specifying the food catalog key as the parent key, we can do a query like this:

//this will return all food products
datastore.NewQuery("Product").Ancestor(foodKey)

The above query is known as an "ancestor query". Besides grouping entities, it has another benefit: it always returns strongly consistent results, and it is allowed to be used within transactions.
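Conceptually, an ancestor filter keeps only the entities whose key has the given key somewhere in its parent chain. Here is a minimal in-memory sketch of that idea; the Key type and the FilterByAncestor helper are hypothetical names made up for this illustration, and the real datastore does this work on the server side.

```go
package main

import "fmt"

// Key is a simplified, hypothetical stand-in for a datastore key.
type Key struct {
	Kind     string
	StringID string
	IntID    int64
	Parent   *Key
}

// Equal reports whether two keys identify the same entity.
func (k *Key) Equal(o *Key) bool {
	if k == nil || o == nil {
		return k == o
	}
	return k.Kind == o.Kind && k.StringID == o.StringID &&
		k.IntID == o.IntID && k.Parent.Equal(o.Parent)
}

// HasAncestor reports whether a appears anywhere in k's parent chain.
func (k *Key) HasAncestor(a *Key) bool {
	for p := k.Parent; p != nil; p = p.Parent {
		if p.Equal(a) {
			return true
		}
	}
	return false
}

// FilterByAncestor returns the keys of the given kind that descend from ancestor.
func FilterByAncestor(keys []*Key, kind string, ancestor *Key) []*Key {
	var out []*Key
	for _, k := range keys {
		if k.Kind == kind && k.HasAncestor(ancestor) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	food := &Key{Kind: "Catalog", StringID: "Food"}
	clothes := &Key{Kind: "Catalog", StringID: "Clothes"}
	keys := []*Key{
		{Kind: "Product", StringID: "milk", Parent: food},
		{Kind: "Product", StringID: "eggs", Parent: food},
		{Kind: "Product", StringID: "shirt", Parent: clothes},
	}

	// Prints milk, then eggs: the only products under the food catalog.
	for _, k := range FilterByAncestor(keys, "Product", food) {
		fmt.Println(k.StringID)
	}
}
```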

Conclusion

The Google Cloud datastore's distributed architecture may feel a little weird when you are used to working with relational databases, but its scalability is a great advantage to take into consideration. Also, if you feel that having to create the entity key separately from the actual data is not right, there are always other options to make your work easier.

I recommend taking a look at Goon (https://godoc.org/github.com/mjibson/goon). It helps you manage key creation and also has other cool features, like using memcache.

Thanks for reading!
