=encoding UTF-8 =head1 Name sqitchtutorial-snowflake - A tutorial introduction to Sqitch change management on Snowflake =head1 Synopsis sqitch * =head1 Description This tutorial explains how to create a sqitch-enabled Snowflake project, use a VCS for deployment planning, and work with other developers to make sure changes remain in sync and in the proper order. We'll start by creating a new project from scratch, a fictional antisocial networking site called Flipr. All examples use L as the VCS and L as the storage engine, but for the most part you can substitute other VCSes and database engines in the examples as appropriate. If you'd like to manage a PostgreSQL database, see L. If you'd like to manage an SQLite database, see L. If you'd like to manage an Oracle database, see L. If you'd like to manage a MySQL database, see L. If you'd like to manage a Firebird database, see L. If you'd like to manage a Vertica database, see L. If you'd like to manage an Exasol database, see L. =head2 Connection Configuration Sqitch requires ODBC to connect to the Snowflake database. As such, you'll need to make sure that the L is installed and properly configured. At its simplest, on Unix-like systems, name the driver "Snowflake" by adding this entry to C (usually found in C, C, or C): [Snowflake] Description = ODBC for Snowflake Driver = /usr/lib64/snowflake/odbc/lib/libSnowflake.so Note that you'll need to adjust the path depending on the version of the ODBC driver, and where you installed it. See the L for details on downloading, installing, and configuring ODBC for your platform. =head1 Starting a New Project Usually the first thing to do when starting a new project is to create a source code repository. So let's do that with Git: > mkdir flipr > cd flipr > git init . Initialized empty Git repository in /flipr/.git/ > touch README.md > git add . > git commit -am 'Initialize project, add README.' If you're a Git user and want to follow along the history, the repository used in these examples is L. Now that we have a repository, let's get started with Sqitch. Every Sqitch project must have a name associated with it, and, optionally, a unique URI. We recommend including the URI, as it increases the uniqueness of object identifiers internally, and will prevent the deployment of a different project with the same name. So let's specify one when we initialize Sqitch: > sqitch init flipr --uri https://github.com/sqitchers/sqitch-snowflake-intro/ --engine snowflake Created sqitch.conf Created sqitch.plan Created deploy/ Created revert/ Created verify/ Let's have a look at F: > cat sqitch.conf [core] engine = snowflake # plan_file = sqitch.plan # top_dir = . # [engine "snowflake"] # target = db:snowflake: # registry = sqitch # client = snowsql Good, it picked up on the fact that we're creating changes for the Snowflake engine, thanks to the C<--engine snowflake> option, and saved it to the file. Furthermore, it wrote a commented-out C<[engine "snowflake"]> section with all the available Snowflake engine-specific settings commented out and ready to be edited as appropriate. By default, Sqitch will read F in the current directory for settings. But it will also read F<~/.sqitch/sqitch.conf> for user-specific settings. Since Snowflake's C client is not in the path on my system, let's go ahead an tell it where to find the client on our computer (don't bother if you're using the L because it uses the client inside the container, not on your host machine): > sqitch config --user engine.snowflake.client /Applications/SnowSQL.app/Contents/MacOS/snowsql And let's also tell it who we are, since this data will be used in all of our projects: > sqitch config --user user.name 'Marge N. O’Vera' > sqitch config --user user.email 'marge@example.com' Have a look at F<~/.sqitch/sqitch.conf> and you'll see this: > cat ~/.sqitch/sqitch.conf [engine "snowflake"] client = /Applications/SnowSQL.app/Contents/MacOS/snowsql [user] name = Marge N. O’Vera email = marge@example.com Which means that Sqitch should be able to find C for any project, and that it will always properly identify us when planning and committing changes. Back to the repository. Have a look at the plan file, F: > cat sqitch.plan %syntax-version=1.0.0 %project=flipr %uri=https://github.com/sqitchers/sqitch-snowflake-intro/ Note that it has picked up on the name and URI of the app we're building. Sqitch uses this data to manage cross-project dependencies. The C<%syntax-version> pragma is always set by Sqitch, so that it always knows how to parse the plan, even if the format changes in the future. Let's commit these changes and start creating the database changes. > git add . > git commit -am 'Initialize Sqitch configuration.' [main b731cc3] Initialize Sqitch configuration. 2 files changed, 15 insertions(+) create mode 100644 sqitch.conf create mode 100644 sqitch.plan =head1 Our First Change First, our project will need a schema. This creates a nice namespace for all of the objects that will be part of the flipr app. Run this command: > sqitch add appschema -n 'Add schema for all flipr objects.' Created deploy/appschema.sql Created revert/appschema.sql Created verify/appschema.sql Added "appschema" to sqitch.plan The L|sqitch-add> command adds a database change to the plan and writes deploy, revert, and verify scripts that represent the change. Now we edit these files. The C script's job is to create the schema. So we add this to F: CREATE SCHEMA flipr; The C script's job is to precisely revert the change to the deploy script, so we add this to F: DROP SCHEMA flipr; Now we can try deploying this change. We tell Sqitch where to send the change via a L. Let's say we're using the account name C, username C, database C, and warehouse C, and an ODBC driver named C (see L for details). The URI would be structured like this: db:snowflake://movera@example/flipr?Driver=Snowflake;warehouse=sqitch Note that Sqitch requires a C parameter in order to record its work in the registry. The default warehouse is named C, so you can omit it from the URI if that's the warehouse you want Sqitch to use (we'll omit it for the remainder of this tutorial). Otherwise, specify it in the URI. Snowflake also requires a password, which could also be included in the URI, but it's best to put it in the C section of the L file|https://docs.snowflake.com/en/user-guide/snowsql-start.html#configuring-default-connection-settings>. See L for details. We just tell Sqitch to use that URI to deploy the change: > sqitch deploy 'db:snowflake://movera@example/flipr?Driver=Snowflake' Adding registry tables to db:snowflake://movera@example/flipr?Driver=Snowflake Deploying changes to db:snowflake://movera@example/flipr?Driver=Snowflake + appschema .. ok First Sqitch created registry tables used to track database changes. The structure and name of the registry varies between databases (Snowflake uses a schema to namespace its registry, while SQLite and MySQL use separate databases). Next, Sqitch deploys changes. We only have one so far; the C<+> reinforces the idea that the change is being C to the database. Note that this process can take quite a bit of time. Sqitch connects to the database via ODBC and retains the connection throughout, but the creation of the registry and all change scripts run through individual runs of C. These connections can be quite slow. So if Sqitch seems hung, just wait; it's most likely waiting on Snowflake. With this change deployed, if you connect to the database, you'll be able to see the schema: > snowsql --accountname example --username movera --dbname flipr -o friendly=false \ --query "SHOW TERSE SCHEMAS LIKE 'flipr'" +-------------------------------+-------+------+---------------+-------------+ | created_on | name | kind | database_name | schema_name | |-------------------------------+-------+------+---------------+-------------| | 2018-07-27 14:47:22.614 +0000 | FLIPR | NULL | DWHEELER | NULL | +-------------------------------+-------+------+---------------+-------------+ 1 Row(s) produced. Time Elapsed: 0.283s =head2 Trust, But Verify But that's too much work. Do you really want to do something like that after every deploy? Here's where the C script comes in. Its job is to test that the deploy did was it was supposed to. It should do so without regard to any data that might be in the database, and should throw an error if the deploy was not successful. In Snowflake, the simplest way to do so for schema is probably to simply create an object in the schema. Put this SQL into F: CREATE TEMPORARY TABLE flipr.verify__ (id INT); In truth, you can use I query that generates an SQL error if the schema doesn't exist. Another handy way to do that is to divide by zero if an object doesn't exist. For example, to throw an error when the C schema does not exist, you could do something like this: USE WAREHOUSE &warehouse; SELECT 1/COUNT(*) FROM information_schema.schemata WHERE schema_name = 'FLIPR'; Note the C statement which is provided in the default Snowflake change script templates. For scripts that execute queries requiring compute resources (typically DML and C statement can actually execute. Probably easiest just to leave the default, which uses the warehouse that Sqitch uses to maintain its registry. Edit F to look like this: USE WAREHOUSE &warehouse; SELECT nickname, password, fullname, twitter, timestamp FROM flipr.users WHERE FALSE; Now for the revert script: all we have to do is drop the table. Add this to F: DROP TABLE flipr.users; Couldn't be much simpler, right? Let's deploy this bad boy: > sqitch deploy Deploying changes to flipr_test + users .. ok We know, since verification is enabled, that the table must have been created. But for the purposes of visibility, let's have a quick look: > snowsql --accountname example --username movera --dbname flipr -o friendly=false \ --query "SHOW TERSE TABLES LIKE 'users' IN flipr" +-------------------------------+-------+-------+---------------+-------------+ | created_on | name | kind | database_name | schema_name | |-------------------------------+-------+-------+---------------+-------------| | 2018-07-27 15:13:21.767 +0000 | USERS | TABLE | DWHEELER | FLIPR | +-------------------------------+-------+-------+---------------+-------------+ 1 Row(s) produced. Time Elapsed: 0.318s We can also verify all currently deployed changes with the L|sqitch-verify> command: > sqitch verify Verifying flipr_test * appschema .. ok * users ...... ok Verify successful Now have a look at the status: > sqitch status # On database flipr_test # Project: flipr # Change: d251b2c9b4bc46a4b4db6b7a8a637951484e6f6b # Name: users # Deployed: 2018-07-27 11:09:12 -0400 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Success! Let's make sure we can revert the change, as well: > sqitch revert --to @HEAD^ -y Reverting changes to appschema from flipr_test - users .. ok Note that we've used the C<--to> option to specify the change to revert to. And what do we revert to? The symbolic tag C<@HEAD>, when passed to L|sqitch-revert>, always refers to the last change deployed to the database. (For other commands, it refers to the last change in the plan.) Appending the caret (C<^>) tells Sqitch to select the change I to the last deployed change. So we revert to C, the penultimate change. The other potentially useful symbolic tag is C<@ROOT>, which refers to the first change deployed to the database (or in the plan, depending on the command). Back to the database. The C table should be gone but the C schema should still be around: > snowsql --accountname example --username movera --dbname flipr -o friendly=false \ --query "SHOW TERSE TABLES LIKE 'users' IN flipr" +------------+------+------+---------------+-------------+ | created_on | name | kind | database_name | schema_name | |------------+------+------+---------------+-------------| +------------+------+------+---------------+-------------+ 0 Row(s) produced. Time Elapsed: 0.367s The L|sqitch-status> command politely informs us that we have undeployed changes: # On database flipr_test # Project: flipr # Change: 5a2ac4ae6801bfe392483ee5912b4e3592cdd57a # Name: appschema # Deployed: 2018-07-27 10:52:54 -0400 # By: Marge N. O’Vera # Undeployed change: * users As does the L|sqitch-verify> command: > sqitch verify Verifying flipr_test * appschema .. ok Undeployed change: * users Verify successful Note that the verify is successful, because all currently-deployed changes are verified. The list of undeployed changes (just "users" here) reminds us about the current state. Okay, let's commit and deploy again: > git add . > git commit -am 'Add users table.' [main 8c16c09] Add users table. 4 files changed, 22 insertions(+) create mode 100644 deploy/users.sql create mode 100644 revert/users.sql create mode 100644 verify/users.sql > sqitch deploy Deploying changes to flipr_test + users .. ok Looks good. Check the status: > sqitch status # Project: flipr # Change: d251b2c9b4bc46a4b4db6b7a8a637951484e6f6b # Name: users # Deployed: 2018-07-27 11:19:30 -0400 # By: Marge N. O’Vera # Nothing to deploy (up-to-date) Excellent. Let's do some more! =head1 Add Two at Once Let's add a couple more changes. Our app will need to store status messages from users. Let's call them -- and the table to store them -- "flips". And we'll also need a view that lists user names with their flips. Let's add changes for them both: > sqitch add flips -r appschema -r users -n 'Adds table for storing flips.' Created deploy/flips.sql Created revert/flips.sql Created verify/flips.sql Added "flips [appschema users]" to sqitch.plan > sqitch add userflips -r appschema -r users -r flips \ -n 'Creates the userflips view.' Created deploy/userflips.sql Created revert/userflips.sql Created verify/userflips.sql Added "userflips [appschema users flips]" to sqitch.plan Now might be a good time to have a look at the deployment plan: > cat sqitch.plan %syntax-version=1.0.0 %project=flipr %uri=https://github.com/sqitchers/sqitch-snowflake-intro/ appschema 2018-07-27T14:27:24Z Marge N. O’Vera # Add schema for all flipr objects. users [appschema] 2018-07-27T15:03:56Z Marge N. O’Vera # Creates table to track our users. flips [appschema users] 2018-07-27T15:23:41Z Marge N. O’Vera # Adds table for storing flips. userflips [appschema users flips] 2018-07-27T15:23:50Z Marge N. O’Vera # Creates the userflips view. Each change appears on a single line with the name of the change, a bracketed list of dependencies, a timestamp, the name and email address of the user who planned the change, and a note. Let's write the code for the new changes. Here's what F should look like: -- Deploy flipr:flips to snowflake -- requires: appschema -- requires: users USE WAREHOUSE &warehouse; CREATE TABLE flipr.flips ( id INTEGER PRIMARY KEY, nickname TEXT NOT NULL REFERENCES flipr.users(nickname), body VARCHAR(180) NOT NULL DEFAULT '', timestamp TIMESTAMP_TZ NOT NULL DEFAULT CURRENT_TIMESTAMP ); Here's what F might look like: -- Verify flipr:flips on snowflake USE WAREHOUSE &warehouse; SELECT id, nickname, body, timestamp FROM flipr.flips WHERE FALSE; And F should look something like this: -- Revert flipr:flips from snowflake USE WAREHOUSE &warehouse; DROP TABLE flipr.flips; Now for C; F might look like this: -- Deploy flipr:userflips to snowflake -- requires: appschema -- requires: users -- requires: flips USE WAREHOUSE &warehouse; CREATE OR REPLACE VIEW flipr.userflips AS SELECT f.id, u.nickname, u.fullname, f.body, f.timestamp FROM flipr.users u JOIN flipr.flips f ON u.nickname = f.nickname; Use a C