How to use transactions with node-postgres

Transactions are essential for data integrity. They provide primitives that guarantee your query logic either succeeds completely or fails gracefully. A transaction contains any number of statements, and at any point you can roll back if an error occurs or one of your failure conditions triggers.

An RDBMS like PostgreSQL gives you an amazing set of properties known as ACID: Atomicity, Consistency, Isolation, and Durability. Computer scientist and Turing Award winner Jim Gray established these principles.

Each time you run a query in PostgreSQL you get these guarantees. A transaction is composed of one or more queries. You can think of a transaction as a single unit of work.

Atomicity means that the transaction is all or nothing. If an error occurs, your transaction can be cancelled and the database remains the same. This property extends to hardware failures, pulling the plug on your computer, or spilling coffee on your server.

Without this property, a bank transfer might deposit money into one account but never withdraw it from the other if the server died mid-transaction.
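As a minimal sketch of what that protection looks like with node-postgres (the accounts table and its balance column are hypothetical, used only for illustration), the two updates below either commit together or never take effect at all:

'use strict';

const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

pool.connect((err, client, done) => {
    if (err) {
        done();
        throw err;
    }
    client.query('BEGIN', (err) => {
        if (err) { done(); throw err; }
        //Hypothetical accounts table: withdraw from account 1...
        client.query('update accounts set balance = balance - $1 where id = $2', [100, 1], (err) => {
            if (err) { return client.query('ROLLBACK', () => done()); }
            //...and deposit into account 2.
            client.query('update accounts set balance = balance + $1 where id = $2', [100, 2], (err) => {
                if (err) { return client.query('ROLLBACK', () => done()); }
                //COMMIT makes both updates visible at once; a crash before this point leaves both accounts untouched.
                client.query('COMMIT', () => done());
            });
        });
    });
});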

Consistency ensures that the transaction, or set of queries, results in changes that conform to your schema, constraints, foreign keys, and more.

Sticking with the bank example, Consistency ensures that the dollar amounts stay in the correct format, that the bank account numbers actually exist in the database, and that the two accounts are authorized to transfer between each other. These rules must be defined through database constraints and other tools, but once they are in place, the transaction ensures that your application code won’t end up in an invalid state because of bad data in the database.
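As a small sketch of such a rule (again using the hypothetical accounts table), a single check constraint is enough to guarantee that a negative balance can never be committed, no matter what the application code does:

'use strict';

const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

pool.connect((err, client, done) => {
    if (err) {
        done();
        throw err;
    }
    //Hypothetical rule: no transaction may ever leave a negative balance behind.
    client.query('alter table accounts add constraint positive_balance check (balance >= 0)', (err) => {
        done();
        if (err) {
            console.error(err);
        } else {
            console.log('constraint added');
        }
        process.exit();
    });
});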

Isolation provides a sandbox for your transaction to run in. It provides concurrency control over a set of records in your database. The level of isolation is tunable, ranging from relaxed to strict controls.
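In PostgreSQL the default level is READ COMMITTED and the strictest is SERIALIZABLE. Here is a minimal sketch of raising the level for a single transaction with node-postgres:

'use strict';

const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

pool.connect((err, client, done) => {
    if (err) {
        done();
        throw err;
    }
    client.query('BEGIN', (err) => {
        if (err) { done(); throw err; }
        //The isolation level must be set before the transaction runs its first query.
        client.query('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE', (err) => {
            if (err) { return client.query('ROLLBACK', () => done()); }
            //...run the queries that need the strict isolation here...
            client.query('COMMIT', () => done());
        });
    });
});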

A great example of strict access is booking airline tickets (at least in a good ticketing system). The system ensures that only one customer can access the seat record within the transaction, so once you choose your seat it stays yours throughout the checkout process.

A similar example is movie or concert tickets. This level of locking is why you typically see a timer saying you have 5 minutes to purchase.
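One way to get that kind of exclusive hold in PostgreSQL is a row-level lock with select ... for update inside a transaction. The seats table and reserved_by column below are hypothetical; the point is that a second transaction trying to lock the same row waits until the first one commits or rolls back.

'use strict';

const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

pool.connect((err, client, done) => {
    if (err) {
        done();
        throw err;
    }
    client.query('BEGIN', (err) => {
        if (err) { done(); throw err; }
        //Lock the hypothetical seat row; concurrent transactions trying to lock it block here.
        client.query('select * from seats where id = $1 for update', [42], (err, result) => {
            if (err) { return client.query('ROLLBACK', () => done()); }
            client.query('update seats set reserved_by = $1 where id = $2', ['customer-1150', 42], (err) => {
                if (err) { return client.query('ROLLBACK', () => done()); }
                //COMMIT (or ROLLBACK) releases the row lock for the next customer.
                client.query('COMMIT', () => done());
            });
        });
    });
});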

Durability is similar to Atomicity, but it goes a step further: a committed transaction must be persisted to non-volatile storage, such as a hard drive.

PostgreSQL and other RDBMSs ensure this. MongoDB famously did not make Durability a default feature in the early releases of its product, which resulted in data loss when servers died. Applications using MongoDB thought they had stored their values permanently, but the data was in fact lost on process or power failure.

In this example we will be building a bookstore called Booktown. We will explore these four principles and see how you can leverage them for building reliable and scalable systems.

Let’s get started

Install Postgres.app for Mac or Postgres Installer by EnterpriseDB for Windows

Create Booktown Database

createdb booktown

Clone my postgres example git repo to get the booktown.sql

 git clone git@github.com:wlaurance/node-postgresql-example.git && cd node-postgresql-example && npm install

Import booktown.sql into the PostgreSQL booktown database

psql booktown < booktown.sql

Explore a little

Now we have the booktown database available. Run psql booktown to open a Postgres shell. Type \d and press enter to list all of the tables and relations in the booktown database. I recommend that you type \x and press enter to turn on extended display.

  • project all of the books. select * from books;
  • project all of the authors. select * from authors;
  • project all of the books with their authors. select title, first_name, last_name from books join authors on books.author_id = authors.id;
  • find all the books by Mark Lutz. select title from authors join books on authors.id = books.author_id where authors.first_name = 'Mark' and authors.last_name = 'Lutz';
  • We need to find the ISBN numbers. Project the book title, book id, and ISBN number. select title, id, isbn from editions join books on editions.book_id = books.id;
  • Here’s a fun one. Let’s find out the total cost and total retail value of all of the books in stock at booktown bookstore. select sum(cost * stock) as cost, sum(retail * stock) as retail_value from stock;

Let’s build something

Now we are transitioning to running queries in Node.js using the pg module. The git repository you cloned earlier contains all of the code below.

Run all of the "Explore a little" queries using Node.js and the node-postgres (pg) module

'use strict';

const pg = require('pg');
const async = require('async');
const _ = require('underscore');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

function runQuery(query, callback) {
    pool.connect((err, client, done) => {
        if (err) {
            //likely a connection error that will print to console.
            done();
            throw err;
        }
        client.query(query, (err, results) => {
            done(); //call done to release the client to the connection pool.
            callback(err, results); //make it the caller's responsibility to check for query errors.
        });
    });
}

//Let's run all of the queries listed in the 'let's explore' section.

async.eachSeries([
    'select * from books',
    'select * from authors',
    'select title, first_name, last_name from books join authors on books.author_id = authors.id',
    "select title from authors join books on authors.id = books.author_id where authors.first_name = 'Mark' and authors.last_name = 'Lutz'",
    'select title, id, isbn from editions join books on editions.book_id = books.id',
    'select sum(cost * stock) as cost, sum(retail * stock) as retail_value from stock'
], (query, callback) => {
    console.log(`Running query: ${query}`);
    runQuery(query, (err, results) => {
        if (err) {
            console.error(`Error running query: ${err}`);
        } else {
            console.log('Rows from query');
            _.each(results.rows, (r) => {
                console.log(r);
            });
        }
        callback();
    });
}, (err) => {
    if (err) {
        throw err;
    }
    console.log('Finished running lets explore queries.');
    process.exit();
});

Now let’s build something real

Imagine you are consulting with the owner of Booktown Bookstore, Martha. Martha is having many difficulties with her current system and it’s impacting her business in negative ways.

For simplicity's sake, we will only be updating table schemas and providing a command line interface to run our specific tasks.

For the command line interface we will use yargs, the command line module for pirates.

Let’s start simple

Martha wants to start a marketing campaign to send emails and text messages to her customers. Her current customer data model doesn’t include phone numbers or email addresses. Customers are not required to give this information, so we won’t add not null constraints to these fields.

Update data model

We need to alter the customers table. The email and phone columns will both be varchar data types. The maximum length of an email address is 254 characters and the maximum length of a phone number is 15 digits. Phone could be stored as a number, but for simplicity let’s make it a varchar.

Let’s open up psql and run ALTER TABLE customers ADD COLUMN email varchar(254), ADD COLUMN phone varchar(15);. This adds both the email and phone fields to the customers table. You can run \d customers to see the new schema.

One more setup item: the booktown database doesn’t come with auto-incrementing ids for the customers table. We will create a new sequence called customer_ids and set the default value of customers.id to read the nextval from this sequence.

  1. Open psql.
  2. Run create sequence customer_ids;
  3. Run alter table customers alter COLUMN id set default nextval('customer_ids');
  4. Find the current max id + 1 in the customers table. select (max(id) + 1) from customers;
  5. Run alter sequence customer_ids restart with 1150; (replace 1150 with the number from step 4 if different).

These steps mean you no longer need to worry about the customer id when creating a new customer.

Build the create customer command

Before coding I like to think about the design of the user interface. Even though this is only a CLI example, we still want it to be clear what we are doing.

I think this is nice:

node customer.js --action create --firstName Will --lastName Laurance --phone 530-555-8888 --email will@wlaurance.com

The CLI options make it very clear what this command will do.

Notice the additional --action parameter. My plan is to later add the ability to update a customer’s information.

node customer.js --action update --id 1150 --firstName Will --lastName Laurance --email will@dubsoft.net

In the update action we will make all fields optional. If no fields are to be updated, we should throw an error. We also require that id be provided so we don’t update any other customers by accident.

Let’s see some code.

'use strict';

const _ = require('underscore');

//Database functions
const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

function runQuery(query, argsArray, callback) {
    pool.connect((err, client, done) => {
        if (err) {
            //likely a connection error that will print to console.
            done();
            throw err;
        }
        client.query(query, argsArray, (err, results) => {
            done(); //call done to release the client to the connection pool.
            callback(err, results); //make it the caller's responsibility to check for query errors.
        });
    });
}

const yargs = require('yargs');

const args = yargs
    .alias('f', 'firstName')
    .describe('f', 'first name')
    .alias('l', 'lastName')
    .describe('l', 'last name')
    .alias('p', 'phone')
    .describe('p', 'phone number')
    .alias('e', 'email')
    .describe('e', 'email')
    .alias('a', 'action')
    .demand('a')
    .describe('a', 'action to take [update, create]')
    .alias('i', 'id')
    .describe('i', 'customer id for an update')
    .argv;

if (args.action === 'create') {
    create(args);
} else if (args.action === 'update') {
    update(args);
} else {
    throw new Error('Action not supported');
}

function create(options) {
    let query = 'insert into customers(first_name, last_name, email, phone) values ($1, $2, $3, $4)';
    let args = [options.firstName, options.lastName, options.email, options.phone];
    runQuery(query, args, (err, results) => {
        if (err) {
            console.error(err);
        } else {
            console.log(results);
        }
        process.exit();
    });
}

function update(options) {
    if (!options.id) {
        throw new Error('id is required for an update');
    }
    let fieldsToUpdate = [];
    let valuesToUpdate = [];
    if (options.firstName) {
        fieldsToUpdate.push('first_name');
        valuesToUpdate.push(options.firstName);
    }
    if (options.lastName) {
        fieldsToUpdate.push('last_name');
        valuesToUpdate.push(options.lastName);
    }
    if (options.email) {
        fieldsToUpdate.push('email');
        valuesToUpdate.push(options.email);
    }
    if (options.phone) {
        fieldsToUpdate.push('phone');
        valuesToUpdate.push(options.phone);
    }

    if (fieldsToUpdate.length === 0 || valuesToUpdate.length === 0) {
        throw new Error('no fields to update');
    }
    let placeHolderParams = (_.map(valuesToUpdate, (v, i) => {
        return `$${i + 1}`;
    })).join(', ');
    let idWhereClause = `where id = $${valuesToUpdate.length + 1}`;
    valuesToUpdate.push(options.id);
    let query = `update customers set (${fieldsToUpdate.join(', ')}) = (${placeHolderParams}) ${idWhereClause}`;
    runQuery(query, valuesToUpdate, (err, result) => {
        if (err) {
            console.error(err);
        } else {
            console.log(result);
            if (result.rowCount === 0) {
                console.error('No record updated');
            }
        }
        process.exit();
    });
}

Great job!

Martha is very impressed with how quickly you got this done! Fast forward a few weeks. Martha and her team put together amazing email and text messaging campaigns. The conversion rate is through the roof.

The Booktown store is still very small, with limited storage. Martha and her team have been ordering books like crazy, but recently they ordered way too many books by accident because their system doesn’t have any limit in place.

Martha has asked you to make it so they can’t accidentally order too many books. She says they have enough room for 100 copies of each book.

We need to build her a new command line ordering tool that ensures we don’t order too many books. One caveat of the existing ordering system is that it prints CSV data to the console; we must keep this contract. Historically, ordering has all been done through spreadsheets, manual database entries, and shell scripts.

Since we did such a good job on the first task, Martha is trusting us with a larger budget to build this ordering system. She wants a way to view the existing orders with their line items.

Planning

We should build an order CLI with the actions create, addItem, updateItem, removeItem, list, and delete. Essentially order CRUD with sub-item CRUD.

Let’s design the command line API like we did in our first project with Martha.

  • node book.js --action list [--format csv] Lists all the books with their ids
  • node order.js --action create returns the created order id
  • node order.js --action addItem --orderId {orderId} --bookId {bookId} --quantity {quantity} adds the bookId to the order
  • node order.js --action updateItem --orderId {orderId} --bookId {bookId} --quantity {quantity} updates the quantity of the bookId in the order
  • node order.js --action removeItem --orderId {orderId} --bookId {bookId} removes the bookId from the order
  • node order.js --action delete --orderId {orderId} deletes the order and the associated line items

Let’s knock book.js out!

This one is pretty easy!

'use strict';

const _ = require('underscore');

//Database functions
const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

function runQuery(query, argsArray, callback) {
    pool.connect((err, client, done) => {
        if (err) {
            //likely a connection error that will print to console.
            done();
            throw err;
        }
        client.query(query, argsArray, (err, results) => {
            done(); //call done to release the client to the connection pool.
            callback(err, results); //make it the caller's responsibility to check for query errors.
        });
    });
}

const yargs = require('yargs');

const args = yargs
    .alias('a', 'action')
    .demand('a')
    .describe('a', 'action to take [list]')
    .argv;


if (args.action === 'list') {
    runQuery('select id, title from books', [], (err, results) => {
        if (err) {
            throw err;
        }
        console.log('id', 'title');
        _.each(results.rows, (r) => {
            console.log(r.id, r.title);
        });
        process.exit();
    });
}

Now it’s time for the order command line tool

If you dug through the booktown database, you will have noticed that there is no orders table. We need to create two new tables to get a nice normalized design.

The orders table will have an order id, a created time, and a creator column. The line_items table will have an order id, a book id, and a quantity column.

We could have one massive table that stores all of this information, but that is bad practice. You should get into the habit of normalizing your tables and reducing redundant information; any good book on database normalization covers this in depth.

Create the required tables

Run psql booktown in your terminal and then type out these statements.

create table orders (id serial primary key, created timestamp with time zone not null, creator varchar not null);
create table line_items (order_id integer references orders(id) not null, book_id integer references books(id) not null, quantity integer, check (quantity <= 100));

This creates two tables. Notice how book_id and order_id are set up as foreign keys. This gives us referential integrity between our tables: for example, if you tried to delete an order without deleting its line items first, you would get a foreign key constraint error.
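You can see this referential integrity from Node as well. Here is a minimal sketch (the order id 10 is just an example value): deleting an order that still has line items fails with PostgreSQL error code 23503, a foreign key violation, instead of silently orphaning the line items.

'use strict';

const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

pool.connect((err, client, done) => {
    if (err) {
        done();
        throw err;
    }
    //Try to delete an order that still has line items referencing it (id 10 is just an example).
    client.query('delete from orders where id = $1', [10], (err, result) => {
        done();
        if (err) {
            //Expect code '23503' (foreign_key_violation) while line items still reference the order.
            console.error(err.code, err.detail);
        } else {
            console.log('deleted', result.rowCount, 'order(s)');
        }
        process.exit();
    });
});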

Create the order CLI

'use strict';

const _ = require('underscore');

//Database functions
const pg = require('pg');

let pool = new pg.Pool({
    host: 'localhost',
    database: 'booktown'
});

function runQuery(query, argsArray, callback) {
    pool.connect((err, client, done) => {
        if (err) {
            //likely a connection error that will print to console.
            done();
            throw err;
        }
        client.query(query, argsArray, (err, results) => {
            done(); //call done to release the client to the connection pool.
            callback(err, results); //make it the caller's responsibility to check for query errors.
        });
    });
}

const yargs = require('yargs');

const args = yargs
    .alias('i', 'orderId')
    .describe('i', 'order id to operate on')
    .alias('b', 'bookId')
    .describe('b', 'book id to operate on')
    .alias('q', 'quantity')
    .describe('q', 'the quantity of the book')
    .alias('a', 'action')
    .demand('a')
    .describe('a', 'action to take [create, addItem, removeItem, updateItem, list, delete]')
    .argv;

require('console.table');

function printer(words) {
    return function(err, results) {
        if (err) {
            console.error(err);
        } else {
            if (words && words !== '') {
                console.log(words);
            }
            if (results.rows.length > 0) {
                console.table(results.rows);
            }
        }
        process.exit();
    }
}
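
//printCsv is used by the list action below but was not shown in the original
//listing; this is a minimal stand-in that prints quoted, comma-separated rows.
function printCsv(err, results) {
    if (err) {
        console.error(err);
    } else {
        let columns = _.map(results.fields, (f) => f.name);
        console.log(columns.map((c) => `"${c}"`).join(','));
        _.each(results.rows, (r) => {
            console.log(columns.map((c) => JSON.stringify(r[c])).join(','));
        });
    }
    process.exit();
}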

function ensureRequired(map, fields, checkers) {
    if (fields.length !== checkers.length) {
        throw new Error('invalid fields and checkers');
    }
    let valid = _.all(fields, (f, i) => {
        return checkers[i](map[f]);
    });
    if (!valid) {
        throw new Error('checkers did not pass');
    }
}

function rollback(client) {
    client.query('ROLLBACK', function() {
        client.end();
    });
}

try {
    var query;
    var params;
    switch (args.action) {
        case 'create':
            query = 'insert into orders (created, creator) values ($1, $2) returning *';
            params = [new Date().toISOString(), process.env.USER];
            runQuery(query, params, printer('created order'));
            break;
        case 'addItem':
            ensureRequired(args, ['orderId', 'bookId', 'quantity'], [_.isNumber, _.isNumber, _.isNumber]);
            query = 'insert into line_items (order_id, book_id, quantity) values ($1, $2, $3) returning *';
            params = [args.orderId, args.bookId, args.quantity];
            runQuery(query, params, printer('added line item with book_id ' + args.bookId + ' to order_id ' + args.orderId));
            break;
        case 'removeItem':
            ensureRequired(args, ['orderId', 'bookId'], [_.isNumber, _.isNumber]);
            query = 'delete from line_items where order_id = $1 and book_id = $2';
            params = [args.orderId, args.bookId];
            runQuery(query, params, printer('deleted line_items from order_id ' + args.orderId + ' with book_id ' + args.bookId));
            break;
        case 'updateItem':
            ensureRequired(args, ['orderId', 'bookId', 'quantity'], [_.isNumber, _.isNumber, _.isNumber]);
            query = 'update line_items set quantity = $1 where order_id = $2 and book_id = $3';
            params = [args.quantity, args.orderId, args.bookId];
            runQuery(query, params, printer('updated line_items for order_id ' + args.orderId));
            break;
        case 'list':
            ensureRequired(args, ['orderId'], [_.isNumber]);
            query = `
                select order_id, created, creator, book_id, title, quantity from orders
                    join line_items on orders.id = line_items.order_id
                    join books on line_items.book_id = books.id
                    where orders.id = $1;
            `;
            params = [args.orderId];
            runQuery(query, params, args.csv ? printCsv : printer());
            break;
        case 'delete':
            ensureRequired(args, ['orderId'], [_.isNumber]);
            params = [args.orderId];
            pool.connect((err, client, done) => {
                if (err) {
                    throw err;
                }
                client.query('BEGIN', (err, result) => {
                    if (err) {
                        console.error(err);
                        return rollback(client);
                    }
                    client.query('delete from line_items where order_id = $1', params, (err, result) => {
                        if (err) {
                            console.error(err);
                            return rollback(client);
                        }
                        if (result.rowCount > 0) {
                            console.log('line items from order', args.orderId, 'deleted.');
                        }
                        client.query('delete from orders where id = $1', params, (err, result) => {
                            if (err) {
                                console.error(err);
                                return rollback(client);
                            }
                            if (result.rowCount === 0) {
                                console.log('No order with id', args.orderId);
                                return rollback(client);
                            }
                            console.log(result);
                            client.query('COMMIT', client.end.bind(client));
                        });
                    });
                })
            });
            break;
        default:
            console.log('Action not supported');
    }
} catch (e) {
    console.error(e);
    process.exit(1);
}

OK, I know that was a lot of code! Let’s review what we did. Our CLI is action driven, so we switch on the args.action field; if the action isn’t one we support, we fall through to the default case. The ensureRequired function takes the args object, the field names, and the checker functions. It’s a nice reusable function we call in each action case to verify we have the correct arguments. Note that it throws an error if any of the required parameters are missing or invalid, which is why the entire switch statement is wrapped in a try/catch.

Once the required parameters are validated, we construct the query for the action and put the placeholder values into an array. We then call runQuery with the query, the placeholder values, and a printer function. The printer function is a callback wrapper for the query callback: it handles any error, prints the provided message, and prints any rows the query returns. I’m using the console.table module to pretty print columns and rows to the console. The printer function also invokes process.exit to stop the Node process.

The delete action is a little different from the other actions. The other actions require a single query statement, but the delete action deletes from two tables: line_items and orders. We first delete the line items for the provided order id, then we delete the order itself. Because both deletes run inside a single transaction (BEGIN ... COMMIT), a failure in either one rolls back the whole operation and neither table is changed.

Examples of using the CLI

node order.js --action create creates a new order

created order
id  created                                  creator
--  ---------------------------------------  ---------
10  Sun Nov 06 2016 18:03:14 GMT-0800 (PST)  wlaurance

node order.js --action addItem --orderId 10 --bookId 41473 --quantity 25

added line item with book_id 41473 to order_id 10
order_id  book_id  quantity
--------  -------  --------
10        41473    25

node order.js --action addItem --orderId 10 --bookId 4267 --quantity 25

added line item with book_id 4267 to order_id 10
order_id  book_id  quantity
--------  -------  --------
10        4267     25

node order.js --action addItem --orderId 10 --bookId 190 --quantity 25

added line item with book_id 190 to order_id 10
order_id  book_id  quantity
--------  -------  --------
10        190      25

node order.js --action removeItem --orderId 10 --bookId 4267

deleted line_items from order_id 10 with book_id 4267

node order.js --action list --orderId 10

order_id  created      creator    book_id  title               quantity
--------  -----------  ---------  -------  ------------------  --------
10        06 Nov 2016  wlaurance  190      Little Women        25
10        06 Nov 2016  wlaurance  41473    Programming Python  25

node order.js --action list --orderId 10 --csv

"order_id","created","creator","book_id","title","quantity"
10,"06 Nov 2016","wlaurance",190,"Little Women",25
10,"06 Nov 2016","wlaurance",41473,"Programming Python",25

node order.js --action updateItem --orderId 10 --bookId 190 --quantity 100

updated line_items for order_id 10

node order.js --action list --orderId 10 --csv

"order_id","created","creator","book_id","title","quantity"
10,"06 Nov 2016","wlaurance",190,"Little Women",100
10,"06 Nov 2016","wlaurance",41473,"Programming Python",25

node order.js --action delete --orderId 10

line items from order 10 deleted.
{ command: 'DELETE',
  rowCount: 1,
  oid: NaN,
  rows: [],
  fields: [],
  _parsers: [],
  RowCtor: null,
  rowAsArray: false,
  _getTypeParser: [Function: bound ] }

Summary

We’ve built the command line tool to spec and our first release is working great. However, Martha has reported a bug: the data model allows the user to add the same book to multiple line items, which lets the total quantity for a book go above 100. This violates one of our system requirements.

At first glance we might write some logic in our code to handle this. That would work, but there is actually a much simpler way. In PostgreSQL we can add a compound unique constraint to the line_items table. With a unique index on the book_id and order_id columns, this problem becomes impossible.

Let’s run some tests to reproduce the problem on our dev machine.

node order.js --action create
created order
id  created                                  creator
--  ---------------------------------------  ---------
11  Thu Nov 10 2016 20:38:53 GMT-0800 (PST)  wlaurance
node order.js --action addItem --orderId 11 --bookId 1501 --quantity 56
added line item with book_id 1501 to order_id 11
order_id  book_id  quantity
--------  -------  --------
11        1501     56
node order.js --action addItem --orderId 11 --bookId 1501 --quantity 56
added line item with book_id 1501 to order_id 11
order_id  book_id  quantity
--------  -------  --------
11        1501     56
node order.js --action addItem --orderId 11 --bookId 1501 --quantity 99
added line item with book_id 1501 to order_id 11
order_id  book_id  quantity
--------  -------  --------
11        1501     99
node order.js --action list --orderId 11
order_id  created      creator    book_id  title           quantity
--------  -----------  ---------  -------  --------------  --------
11        10 Nov 2016  wlaurance  1501     Goodnight Moon  99
11        10 Nov 2016  wlaurance  1501     Goodnight Moon  56
11        10 Nov 2016  wlaurance  1501     Goodnight Moon  56

Sure enough, our program is incorrect because we can order more than 100 of a single book.

Let’s add that compound unique constraint.

psql

alter table line_items add constraint book_id_order_id_unique unique(book_id, order_id);

You’ll notice that we are unable to create this constraint because our existing data breaks the rule.

ERROR:  could not create unique index "book_id_order_id_unique"
DETAIL:  Key (book_id, order_id)=(41473, 6) is duplicated.

We’ll have to go onsite to clean up Booktown’s database because there isn’t a perfect way to clean up the data.

For now, let’s delete all of the orders with duplicate line items. For me, it’s orders 6 and 11.
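To see exactly which orders are affected, a query along these lines (run inside psql) lists each duplicated (order_id, book_id) pair:

select order_id, book_id, count(*) from line_items group by order_id, book_id having count(*) > 1;

You can then remove each offending order with node order.js --action delete --orderId {orderId}.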

Run the alter table command again and it should succeed without error.

Let’s test our solution:

node order.js --action create
created order
id  created                                  creator
--  ---------------------------------------  ---------
12  Thu Nov 10 2016 21:22:25 GMT-0800 (PST)  wlaurance
node order.js --action addItem --orderId 12 --bookId 1501 --quantity 99
added line item with book_id 1501 to order_id 12
order_id  book_id  quantity
--------  -------  --------
12        1501     99
node order.js --action addItem --orderId 12 --bookId 1501 --quantity 99
{ [error: duplicate key value violates unique constraint "book_id_order_id_unique"]
  name: 'error',
  length: 226,
  severity: 'ERROR',
  code: '23505',
  detail: 'Key (book_id, order_id)=(1501, 12) already exists.',
  hint: undefined,
  position: undefined,
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: 'public',
  table: 'line_items',
  column: undefined,
  dataType: undefined,
  constraint: 'book_id_order_id_unique',
  file: 'nbtinsert.c',
  line: '424',
  routine: '_bt_check_unique' }

Perfect! Now we cannot have the same book in the same order.

Martha is thrilled!

Martha says amazing job! You did excellent work on the ordering system, and she was very happy with the onsite support you provided.

You built a correct and robust program. Sometimes all it takes is some up front design and brainstorming.

I hope you enjoyed our tutorials and learned something about Node and PostgreSQL.
