+1 (855) 920-2583
Login

Upgrading PostgreSQL from version 11 to 13 using the pg_upgrade tool

Modified on: Thu, 30 May 2024 2:43 PM
  1. Before the upgrade
  2. Upgrade procedure
  3. Disaster recovery


Before the upgrade


This article explains, how to upgrade PostgreSQL from 11 to 13 using the pg_upgrade tool.


Before the upgrade, the BSS database will be checked for tables with OID columns. As these columns are not supported starting PostgreSQL version 12, such tables, if found, must be updated as explained below.


A pre-check detecting tables with OID columns is performed automatically before the upgrade, however, it is possible to run it manually. To do this, complete the following steps:

  1. Connect to the BSS database.
  2. Run the following query:
    select c.relname
         , c.reltuples
         , pg_size_pretty(c.relpages::bigint * 8192) table_size
      from pg_class c
           join pg_namespace ns on (ns.oid = c.relnamespace)
     where 1=1
       and c.relkind = 'r'
       and ns.nspname = 'public'
       and exists (select 1 from pg_attribute a where a.attrelid = c.oid and a.attnum <= 0 and a.attname = 'oid')
     order by c.relpages desc;
    SQL


  3. An empty result indicates that the database does not contain tables with OID columns and the upgrade can be started.
  4. If the result is not empty, and there are tables with OID columns in the BSS database, you must disable OID option for the tables before starting the upgrade. To do this, execute the following query:

    ALTER TABLE <tablename> SET WITHOUT OIDS;
    HTML
    Note: The ALTER TABLE statement locks the table exclusively and re-writes the whole table. This can take significant time and consume much disk space for large tables.


Upgrade procedure


The upgrade procedure consists of two stages, as the replica is not upgraded along with the master database:

  1. Master database upgrade
  2. Replica upgrade

Master database upgrade


To upgrade PostgreSQL from version 11 to 13 for ClodBlue, complete the following steps.

  1. Login to a workstation with access to the Kubernetes cluster, where CloudBlue is deployed, and stop all applications that use the PostgreSQL database, including OSS, BSS, and other microservices. Depending on the infrastructure, the commands to do that can be different. These are example commands:

    kubectl scale statefulset --replicas=0 oss-node
    kubectl scale deploy --all --replicas=0 -n <NS>
    Generic
    Note: The subsequent commands must be executed on BSS and OSS database hosts. 
  2. Install PostgreSQL 13.
    1. Stop the puppet service using the following command:

      systemctl stop puppet
      Generic
    2. Download the postgresql-13 packages for the latest minor release (in our example, for the 13.7 version released on May 2022) and install them:

      • postgresql13-13.7-1PGDG.rhel7.x86_64.rpm

      • postgresql13-contrib-13.7-1PGDG.rhel7.x86_64.rpm

      • postgresql13-libs-13.7-1PGDG.rhel7.x86_64.rpm

      • postgresql13-server-13.7-1PGDG.rhel7.x86_64.rpm

      • Download the packages from the https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-7-x86_64/ repository to the current directory and install them using the following command as an example:

        yum localinstall ./postgresql13-13.7-1PGDG.rhel7.x86_64.rpm ./postgresql13-contrib-13.7-1PGDG.rhel7.x86_64.rpm ./postgresql13-libs-13.7-1PGDG.rhel7.x86_64.rpm ./postgresql13-server-13.7-1PGDG.rhel7.x86_64.rpm
        Generic
    3. Initialize a new PostgreSQL cluster using this command as an example:

      su - postgres -c '/usr/pgsql-13/bin/initdb --locale=en_US.UTF-8 --pgdata=/var/lib/pgsql/13/data'
      Generic
  3. Ensure that free disk space available in the /var/lib/pgsql/13/data directory is at least 1.2 (20 percent) greater than the disk space used in the /var/lib/pgsql/11/data directory. This is necessary for a successful upgrade. Below are example commands that can be used to find out the disk space used and available:

    • Sample command:

      du -hs /var/lib/pgsql/11/data
      Generic

      Sample output:

      799M    /var/lib/pgsql/11/data
      Generic
    • Sample command:

      df -h /var/lib/pgsql/13/data
      Generic

      Sample output:

      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sda2        30G  3,6G   26G  13% /
      Generic
  4. Back up the data.

    1. Create a directory to store the dump. Below are example commands:

      mkdir /opt/dumps/host
      chown -R postgres:postgres /opt/dumps/host
      Generic
    2. Check the directory. Below is an example command

      df -h /opt/dumps/host
      Generic
    3. Export the data. Below is an example command

      export cpu_cnt=`grep -c "processor" /proc/cpuinfo` &&
      jobs=$(( $cpu_cnt * 3 / 2 )); echo "jobs=$jobs" &&
      echo "Start at "`date` &&
      dump_path="/opt/dumps/host" &&
      pg_13_bin="/usr/pgsql-13/bin" &&
      cd ${dump_path} &&
      for DB in $(psql -U postgres -d postgres -t -c "select datname from pg_database where not datname ~ 'template|oids|test|pass'");
      do echo "${dump_path}/${DB} `date`";
      su postgres -c "mkdir ${dump_path}/${DB}";
      su postgres -c "${pg_13_bin}/pg_dump -d ${DB} -N pg-exporter-bss -N pg-exporter-oss -Z 1 -j ${jobs} -F d -v -C -f ${dump_path}/${DB}";
      done &&
      su postgres -c "${pg_13_bin}/pg_dumpall -r -U postgres -f ${dump_path}/users_n_roles.sql" &&
      echo "Finish at"`date`
      Generic
  5. Upgrade the data using the pg_upgrade tool.
    1. Check a cast for OID columns exists. Below is an example query that you can use:

      psql -U postgres -d postgres -t -c "select p.proname from pg_cast c join pg_proc p on (p.oid = c.castfunc) where p.proname = 'varchar2uuid'"
      Generic

      If the result is varchar2uuid, the cast exists and the data can be created in the PostgreSQL 13 database.

    2. Stop PostgreSQL 11 service. You can use the following command:

      systemctl stop postgresql-11
      Generic
    3. Check that there are no postgres processes running on the database host:

      ps -ef|grep "postgres"|grep -v "grep"
      Generic
      Warning! If any process was found, you must identify, disconnect, and stop found application(s). After that, repeat this step to stop PostgreSQL 11 service.
    4. Temporary enable local access to all databases in cluster only for postgres and back up PostgreSQL configuration: pg_hba.conf. Use the following commands:
      cp /var/lib/pgsql/11/data/pg_hba.conf /root/pg_hba.conf_bak_`date '+%Y-%m-%d'`
      su - postgres -c 'cp /var/lib/pgsql/11/data/pg_hba.conf /var/lib/pgsql/11/data/pg_hba.conf.prod'
      su - postgres -c 'echo "local   all postgres trust" >/var/lib/pgsql/11/data/pg_hba.conf'
      Generic
    5. Copy configuration files from PostgreSQL 11 to PostgreSQL 13 server:

      su - postgres -c 'cp /var/lib/pgsql/11/data/pg_hba.conf* /var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/pg_ident.conf /var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/postgresql.conf /var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/server.* /var/lib/pgsql/13/data'
      Generic
    6. If required, enable PostgreSQL 13 features:

      • If the database host has more than 8 CPUs, then the max_worker_processes must be set to the number of CPUs minus 1. The default value is 8.

      • Only if the database host has an SSD disk, set effective_io_concurrency to 100. This is applicable only to SSD disks.

      • The following parameters are added in PostgreSQL 13:
      • autovacuum_vacuum_insert_scale_factor=0.2

      • autovacuum_vacuum_insert_threshold=1000

      • hash_mem_multiplier=1

    7. Run pg_upgrade in check mode.

      su - postgres -c '/usr/pgsql-13/bin/pg_upgrade --jobs=8 --new-port=8352 --old-port=8352 --check --old-datadir "/var/lib/pgsql/11/data" --new-datadir "/var/lib/pgsql/13/data" --old-bindir "/usr/pgsql-11/bin" --new-bindir "/usr/pgsql-13/bin"'
      Generic

      All results must be positive like in the example below:

      #su - postgres -c '/usr/pgsql-13/bin/pg_upgrade --jobs=8 --new-port=8352 --old-port=8352 --check --old-datadir "/var/lib/pgsql/11/data" --new-datadir "/var/lib/pgsql/13/data" --old-bindir "/usr/pgsql-11/bin" --new-bindir "/usr/pgsql-13/bin"'
      Performing Consistency Checks
      -----------------------------
      Checking cluster versions                                   ok
      Checking database user is the install user                  ok
      Checking database connection settings                       ok
      Checking for prepared transactions                          ok
      Checking for system-defined composite types in user tables  ok
      Checking for reg* data types in user tables                 ok
      Checking for contrib/isn with bigint-passing mismatch       ok
      Checking for tables WITH OIDS                               ok
      Checking for invalid "sql_identifier" user columns          ok
      Checking for presence of required libraries                 ok
      Checking database user is the install user                  ok
      Checking for prepared transactions                          ok
      Checking for new cluster tablespace directories             ok
       
      *Clusters are compatible*
      Generic
    8. If there are OID columns in the billing database, the response will be similar to the one below and you will need to update the tables to remove OIDs.

      # su - postgres -c '/usr/pgsql-13/bin/pg_upgrade --jobs=8 --new-port=8352 --old-port=8352 --check --old-datadir "/var/lib/pgsql/11/data" --new-datadir "/var/lib/pgsql/13/data" --old-bindir "/usr/pgsql-11/bin" --new-bindir "/usr/pgsql-13/bin"'
      Performing Consistency Checks
      -----------------------------
      Checking cluster versions                                   ok
      Checking database user is the install user                  ok
      Checking database connection settings                       ok
      Checking for prepared transactions                          ok
      Checking for system-defined composite types in user tables  ok
      Checking for reg* data types in user tables                 ok
      Checking for contrib/isn with bigint-passing mismatch       ok
      Checking for tables WITH OIDS                               fatal
       
      Your installation contains tables declared WITH OIDS, which is not
      supported anymore.  Consider removing the oid column using
          ALTER TABLE ... SET WITHOUT OIDS;
      A list of tables with the problem is in the file:
          tables_with_oids.txt
       
      Failure, exiting
      Generic
  6. Skip this step if no columns with OIDs were found in the billing database. Otherwise, complete the steps below to make the necessary corrections.
    1. Prepare the following script to be executed in the PostgreSQL 11 database. Complete the following commands:s

      sed 's/In database:/\\c/g' /var/lib/pgsql/tables_with_oids.txt | awk -F '.'  '{if ($0 ~ '/\c\ /') {print $0} else {print "alter table "$1".\""$2"\" set without oids;"}}' > /var/lib/pgsql/alter_tables_without_oids.sql
      Generic
      chown postgres /var/lib/pgsql/alter_tables_without_oids.sql
      Generic
    2. Run the following commands to apply the fix:

      systemctl start postgresql-11 su - postgres -c 'psql -U postgres -d pba -f /var/lib/pgsql/alter_tables_without_oids.sql' systemctl stop postgresql-11
      Generic
    3. Check if there are still columns with OIDs by completing step 5.g again.

  7. Execute the upgrade using the following command:


    su - postgres -c '/usr/pgsql-13/bin/pg_upgrade --jobs=8 --new-port=8352 --old-port=8352 --old-datadir "/var/lib/pgsql/11/data" --new-datadir "/var/lib/pgsql/13/data" --old-bindir "/usr/pgsql-11/bin" --new-bindir "/usr/pgsql-13/bin"'
    SQL
  8. Restore the PostgreSQL configuration by executing the following commands:

    su - postgres -c 'mv /var/lib/pgsql/11/data/pg_hba.conf.prod /var/lib/pgsql/11/data/pg_hba.conf'
    su - postgres -c 'mv /var/lib/pgsql/13/data/pg_hba.conf.prod /var/lib/pgsql/13/data/pg_hba.conf'
    	
    SQL
  9. More details on the upgrade from PostgreSQL 11 to PostgreSQL 13 can be found in the PostgreSQL documentation.
  10. Start PostgreSQL 13 service.

    systemctl enable postgresql-13
    systemctl start postgresql-13
    Generic
  11. Disable PostgreSQL 11:

    systemctl disable postgresql-11
    Generic
  12. If the pg_stat_statements extension is used, upgrade it using the script below as an example:

    do $$
    begin
      --
      if exists (
                  select 1
                   from pg_extension
                  where extname = 'pg_stat_statements'
                    and extversion = '1.6'
                 )
      then          
        execute 'drop extension pg_stat_statements cascade';
        execute 'create extension pg_stat_statements';
      end if;
      --
    end $$;
     
    \dx
    Generic
  13. Reindex and analyze the new database cluster.

    su - postgres -c '/usr/pgsql-13/bin/reindexdb --all --jobs=8'
    su - postgres -c '/usr/pgsql-13/bin/vacuumdb --all --jobs=8 --analyze-in-stages'
    Generic
  14. If you use CloudBlue core components of version 21.16 or later, skip this step.
    If a cast was used in PostgreSQL 11, create it in PostgreSQL 13:

    /usr/pgsql-13/bin/psql -U postgres -d postgres -t -c "select * from setup_varchar2uuid(false)"
    
    
    Generic
  15. Start CloudBlue applications that use PostgreSQL. Depending on the installation, the component list can be different. Use this command as an example:

    kubectl scale statefulset --replicas=1 oss-node
    kubectl scale --replicas=1 deployment/idp
    Generic
  16. Check the applications' health. In case of an issue, refer to the Disaster Recovery section.

  17. Repair DB Replication after upgrade to PostgreSQL 13:

    1. Identify master DB replica hosts. This is a sample command to do this:

      grep -E "^host\s*replication\s*all\s*([0-9\.]*)" /var/lib/pgsql/11/data/pg_hba.conf |sed -E "s/.*all\s*([0-9\.]*).*/\1/"
      10.26.230.128
      SQL
    2. Before restoring the replica, set the proper value for the wal_keep_size configuration parameter and it defines the size of the WAL logs to be kept on the master database server. In PostgreSQL 11 this parameter had the name wal_keep_segments.
      To calculate the value for the wal_keep_size parameter, multiply the wal_keep_segments value by the wal_segment_size value. Specify the result value as the value for the wal_keep_size parameter in the /var/lib/pgsql/13/data/postgresql.conf file.

    3. To apply the changes, reload the postgres-13 service

      systemctl reload postgresql-13
      SQL
    4. Login to the replication host and install PostgreSQL 13 there:

      yum localinstall ./postgresql13-13.6-1PGDG.rhel7.x86_64.rpm ./postgresql13-contrib-13.6-1PGDG.rhel7.x86_64.rpm ./postgresql13-libs-13.6-1PGDG.rhel7.x86_64.rpm ./postgresql13-server-13.6-1PGDG.rhel7.x86_64.rpm
      SQL
    5. Stop and disable PostgreSQL 11 service:

      systemctl stop postgresql-11
      systemctl disable postgresql-11
      SQL
    6. Make sure that there are no processes of PostgreSQL 11 running. To do this, execute this command on each server:

      ps -ef | grep "postgres"| grep -v "grep"
      SQL
    7. Restore replication configuration for PostgreSQL 13

      su - postgres -c '/usr/pgsql-13/bin/initdb --locale=en_US.UTF-8 --pgdata=/var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/pg_hba.conf /var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/pg_ident.conf /var/lib/pgsql/13/data'
      su - postgres -c 'cp /var/lib/pgsql/11/data/postgresql.conf /var/lib/pgsql/13/data'
      SQL
    8. Make sure that there are no PostgreSQL 11 references in this file: /var/lib/pgsql/13/data/postgresql.conf. An example of such a reference is this one:

      data_directory = '/var/lib/postgresql/11/data'.
      SQL
    9. Repair replication for PostgreSQL 13. To do this, create the /usr/local/bin/db-replica-repair-13.sh script with the content below, specifying the master database's IP address in the maindb parameter:

      maindb= ###### fill correct value for master database ######
      echo "Stopping DB.."
      systemctl stop postgresql-11
      echo "Backing up.."
      cp -f /var/lib/pgsql/13/data/postgresql.conf /root/postgresql.conf.bak
      cp -f /var/lib/pgsql/13/data/pg_hba.conf /root/pg_hba.conf.bak
      echo "Rebuilding DB.."
      rm -rf /var/lib/pgsql/13/data
      sudo -u postgres pg_basebackup -D /var/lib/pgsql/13/data -U repl -h $maindb -P -v -R -X fetch
      if [ "$?" != "0" ]; then
      echo "Failed."
      exit 1
      fi
      echo "Restoring configs.."
      mv -f /root/postgresql.conf.bak /var/lib/pgsql/13/data/postgresql.conf
      echo -e "\nhot_standby=on\n" >> /var/lib/pgsql/13/data/postgresql.conf
      cp -f /root/pg_hba.conf.bak /var/lib/pgsql/13/data/pg_hba.conf
      chown postgres:postgres -R /var/lib/pgsql/13/data/
      echo "Starting DB.."
      systemctl start postgresql-13
      if [ "$?" != "0" ]; then
      echo "Failed."
      exit 1
      fi
      sleep 5
      echo "Verifying.."
      t1=$(sudo -u postgres psql -Aqtc "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::INT"2> /dev/null)
      sleep 30
      t2=$(sudo -u postgres psql -Aqtc "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::INT"2> /dev/null)
      if (( $t2-$t1 >= 30 )); then
      echo "Replica rebuild failed. Please try again"
      exit 1
      else
      echo "All done!"
      fi
      SQL
    10. Run the script to fix replication:

      sh /usr/local/bin/db-replica-repair-13.sh
      SQL
  18. Enable and start PostgreSQL 13 service

    systemctl enable postgresql-13
    systemctl start postgresql-13
    SQL
  19. After repairing the replica, change the wal_keep_size parameter value to 8GB in the /var/lib/pgsql/13/data/postgresql.conf file of the master database and reload PostgreSQL configuration:

    systemctl reload postgresql-13
    SQL

Known Issues

  1. Execution of the /usr/local/bin/db-replica-repair-13.sh script can stuck with the following message:

    pg_basebackup: initiating base backup, waiting for checkpoint to complete
    SQL

    If this happens, create CHECKPOINT in the master database and execute the following on the main DB host:

    sudo -u postgres psql -c "CHECKPOINT;"
    SQL

Disaster Recovery

In case of any issue during the upgrade process, you need to: 

  1. Stop all applications connected to the PostgreSQL database as described in step 1.
  2. Stop and disable the postgresql-13 service:

    systemctl stop postgresql-13
    systemctl disable postgresql-13
    SQL
  3. Restore the production pg_hba.conf:

    su - postgres -c 'mv /var/lib/pgsql/11/data/pg_hba.conf.prod /var/lib/pgsql/11/data/pg_hba.conf'
    SQL
  4. Enable and start the postgresql-11 service:

    systemctl enable postgresql-11
    systemctl start postgresql-11
    SQL
  5. For the BSS database, restore the "WITH OIDS" option in PostgreSQL 11 cluster.

    su - postgres -c 'psql -U postgres -d pba -f /var/lib/pgsql/alter_tables_without_oids.sql'
    SQL
  6. Start all application services, as described in step 15.

  7. Restore the production pg_hba.conf:

    su - postgres -c 'mv /var/lib/pgsql/11/data/pg_hba.conf.prod /var/lib/pgsql/11/data/pg_hba.conf'
    SQL

    Make sure to keep PostgreSQL 13 data for further investigation.

    This completes the recovery procedure.

  • Restore the production pg_hba.conf:

    su - postgres -c 'mv /var/lib/pgsql/11/data/pg_hba.conf.prod /var/lib/pgsql/11/data/pg_hba.conf'
    SQL