Spanner – GCP custom SQL solution

Google Cloud Platform (GCP) offers 3 SQL engines as a service: MySQL 5.6/5.7, PostgreSQL 9.6 and Spanner. MySQL and PostgreSQL are quite well known across the world. Spanner is completely new player – Google’s own product dedicated to handle thousands of requests.

When we setup the database, it is required to specify number of nodes and its location (regional or multiregional). The single node is composed from 3 replicas. Such setup allows to get 10k reads and 2k writes but do not guarantee any SLA. To get SLA guarantee (99.99% for “regional” setup and 99.999% for multiregional), you need to set at least 3 nodes.

It ensures:

Scalability
Sharding
Synchronous replication (3 replicas, master-master)
Access management
Auditability
Online schema changes
Encryption
Externally consistent read/write

What is external consistency? Google describes it as when you start 2 transactions, A and B (A starts first and ends second), the result of the transaction B won’t be visible until A won’t complete.

The database consistency (row versioning based on timestamp) is based on True Time API – a complex solution which ensures time synchronization between servers in Google’s data centers.

Available data types:

bool
int64
float64
string (length)
bytes (length)
date (timezone-independent)
timestamp

Interesting part is lack of autoincrement (identity) column property. Also there is no decimal data type. Google advises “store arbitrary precision numbers as strings”. Yet another unusual treat is storing data in array. Array cannot be nested one into another and they are limited to 10 MB.
Also struct data type is supported (only in operations like select, update) not to store the data.

Very interesting concept is a DDL clause INTERLEAVE IN PARENT. On table creation we can specify that new table “lives” inside the parent. That causes that physical data are stored on a same shard (e.g. a car with history of his service).

Yet another is tablesample operator. It is used to get sample results from database (e.g. we want to take a look on usual data). They are 2 methods: Reservoir (based on row number) and Bernoulli (based on % of total records).

SELECT * FROM User TABLESAMPLE RESERVOIR (100 ROWS);
SELECT * FROM User TABLESAMPLE BERNOULLI (0.1 PERCENT);

The last feature which is worth to mention is Partitioned DML which can be used to batch update or delete of records. The sad part is that I did not found how to execute it directly using SQL (until the engine is not doing that for us), otherwise using Google’s lib we have to specify that we want to execute partitioned DML.

With partitioned DML only rows which matches where clause are locked, transaction is managed on Cloud Spanner side, DML must be idempotent and statment must be fully partitionable.

/* partitionedstatements */
UPDATE User SET IsActive = true WHERE IsActive IS NULL
UPDATE User SET AccountCreation = '2018-01-01 00:00:00'

/* non-partitioned statements */
UPDATE User SET IsActive = !IsActive WHERE true
UPDATE User SET FailedLoginTry = FailedLoginTry + 1 WHERE true
DELETE FROM User WHERE UserId IN (SELECT UserId FROM UserRole WHERE RoleId = 7)

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Janek's dev blog

Spanner – GCP custom SQL solution

Leave a Reply Cancel reply