Tips for upgrading Postgres from 6.5.3 to 7.0.3

version 1.2 by Mark Stosberg. Last significant update: 2001-04-04
Here are some tips that came out of my experience upgrading Summersault's database from Postgres 6.5.3 to 7.0.3. Some of the information is specific to crossing the 7.0 version line, other bits will be useful for Postgres upgrades in general, and some of it is probably specific to our system and data. This does not try to be a complete "HOW-TO" document, but rather to offer some insights about challenges that we ran into and how we got around them. Bits of it are also specific to Unix.

Dealing with the date format change

In the 7.x branch of Postgres, the default date style changed from "Postgres" to "ISO". We wanted to adopt the new standard, but at the same time we didn't want to break all of our legacy applications that depended on the old style format, so we thought through several options for dealing with this. Since this step did not require the new database, we were able to handle it in advance, which saved us time and trouble when we were actually doing the upgrade.

Postgresql.org has detailed documentation on the differences between the Postgres and ISO date formats.
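One low-impact option worth noting (a sketch of my own, not something from the steps above; "mydb" is a placeholder database name): leave the server's new default alone and pin the old format per-session or per-client, so a legacy application keeps seeing Postgres-style dates without code changes.

```shell
# Per-session, from SQL: request the old default explicitly
psql -c "SET DATESTYLE TO 'Postgres'" mydb

# Per-client, via the libpq environment variable -- the legacy
# application's connection picks this up without any code changes
PGDATESTYLE='Postgres,US' psql mydb
```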

Processing the dump file

We set up an instance of Postgres 7.0.3 on a test server, so that we could test dumping and importing into the new system before we tried it on our production server. I highly recommend this step because in our case, some errors in the dump files appeared to be specific to our own projects, so they would not have been documented elsewhere. :)

Dump the schema first

I found that most of the errors that occurred when trying to import the dump file were in the schema, not the data. So to save time in the debugging process, I dumped just the schemas first, and worked on importing those and debugging the errors. I used this command to dump the schema:
pg_dumpall -cs >schema.sql
The "c" flag means "clean drop": drop statements will be included in the dump. This is useful when you are repeatedly trying to import the schema. :) The "s" flag means "schema only".

When I ran pg_dump, I got this error:

pq_recvbuf: unexpected EOF on client connection
However, this is not something to worry about. The dump and import went fine regardless of this error.

Dump with the old pg_dump, import with the new psql

It's important to use the pg_dump that came with the older version of the database when exporting. It's also important to use the psql that came with the new database when importing. In both cases, it's a good idea to be doing this as the Unix user that owns the database files, probably "pgsql".

Trap the import output for debugging

I think the most useful system I developed for debugging the import process was to pipe the output and errors to a file for later review. I did that like this:

psql -e <schema.sql 2>&1 | tee schema-full-results.txt; grep ERROR schema-full-results.txt >schema-err-results.txt

What's happening here is that I'm actually sending the output to three places at once:

  1. The screen, so that I can hit "Ctrl-C" to cancel as soon as there is an error.
  2. schema-err-results.txt -- This file shows all the errors in the order that they occurred. I can browse it quickly to find any errors that I missed on the screen.
  3. schema-full-results.txt -- This is a complete version of the output. If I find an error I need to investigate, the first thing I do is try to find it in this file to see exactly where it occurred in the schema.
So my debugging cycle would go like this: try the import, find an error in the error log, find the error in the full log to see where it occurred in the schema, edit that location in the schema, try the import again. (Of course, I made a backup of the schema dump before I started editing it!)
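One small convenience I'd add to that cycle (my own habit, not part of the original process): grep with the "-n" flag, so each error comes back with its line number in the full log and you can jump straight to it in your editor. The file contents below are a made-up two-line example:

```shell
# Fake results file, for illustration only
printf 'CREATE TABLE foo (id int4);\nERROR:  Relation '\''bar'\'' does not exist\n' > schema-full-results.txt

# -n prefixes each match with its line number in the file
grep -n ERROR schema-full-results.txt
# -> 2:ERROR:  Relation 'bar' does not exist
```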

It's normal to get a lot of errors to the effect of "does not exist" or "already exists". These occur when you try to drop a table that has not yet been created, or create an entity that has already been successfully created.

Things that might need changing in the schema dump

Here's the list of things we ended up needing to change in the schema dump. In many cases, the easiest way to make a change was using perl on the command line. This saves the time of opening, editing, and saving a large file in a text editor:
perl -pi.bak -e 's/old/new/' schema.sql
That will change the first occurrence of "old" to "new" on every line that it appears on. If you don't want perl to create a backup file with this operation, remove the ".bak" from the command.

Notes on handling the import file

I followed much the same process for dumping and inserting the data as I did the schema. I used this command to export it:
pg_dumpall -a >inserts.sql
That dumps just the data for all the databases. Here's what I used to import it:
psql -e <inserts.sql 2>&1 | tee inserts-full-results.txt; grep ERROR inserts-full-results.txt >inserts-err-results.txt
There was only one change to the data that needed to be made in my (18 Meg) dump file. The old database user was named 'postgres' and the new database user was named 'pgsql'. I quickly updated all these entries with a good perl trick:
perl -pi -e 's/connect(.*)postgres/connect$1pgsql/g' inserts.sql
A note for BBEdit users: Here's why it doesn't work to edit Postgres data dump files in BBEdit. The dump files contain two styles of line breaks, and it's important to preserve them. When the file is opened in BBEdit, both line-break styles get converted to a single style, and that information gets lost when you re-save the file-- although you can't easily tell this by looking at it in BBEdit! There may be a Mac workaround to get this to work with BBEdit-- mine was to use a Unix text editor when I needed to edit the file, which correctly preserved the line breaks.

Upgrading SQL incompatibilities

So far I've discovered a few places where the SQL statements we were running on Postgres 6.5.3 no longer work on 7.0.3.

Good luck!
Mark Stosberg