H-Sphere Documentation Sysadmin Guide


Fixing Crashed H-Sphere Database


Related Docs:   Backing Up H-Sphere Restoring H-Sphere Restoring the H-Sphere Database Accelerating CP Performance

Sometimes PostgreSQL database can get corrupted to the point of no return. That might manifest itself in things like:

hsphere=# VACUUM ;
ERROR: Relation 71343 does not exist

This usually means that index is corrupted.

To recover from the problem:

  1. Login as root:
    su -
  2. Stop Postgres if it is running.
  3. Make sure that no Postgres processes are running using the command:
    ps auxw | grep post
    If any of them are running, kill them.
  4. Remove Postgres' pid file:
    rm -f PGSQL_HOME/data/postmaster.pid
    From now on, we note PGSQL_HOME as the Postgres home directory which is /var/lib/pgsql on RedHat servers, and /usr/local/pgsql on FreeBSD.
  5. Switch to Postgres user:
    # su - postgres (on RedHat)
    # su - pgsql (on FreeBSD)
  6. Backup PostgreSQL database stored in the PGSQL_HOME/data directory:
    cp -r PGSQL_HOME/data pgdata.backup
  7. Try to connect to the H-Sphere database in single mode:
    postgres -D PGSQL_HOME/data -O -P hsphere

If you get any errors like:

        FindExec: found "/usr/bin/postmaster" using argv[0]
        2002-03-22 13:42:46 [6002]   DEBUG:  database system was shut down at 2002-03-22 11:46:11 CET
        2002-03-22 13:42:46 [6002]   DEBUG:  ReadRecord: invalid resource manager id 157 at (0, 561372168)
        2002-03-22 13:42:46 [6002]   DEBUG:  Invalid primary checkPoint record
        2002-03-22 13:42:46 [6002]   DEBUG:  Invalid RMID in secondary checkPoint record
        2002-03-22 13:42:46 [6002]   FATAL 2:  Unable to locate a valid CheckPoint record
        2002-03-22 13:42:46 [6002]   DEBUG:  proc_exit(2)
        2002-03-22 13:42:46 [6002]   DEBUG:  shmem_exit(2)
        2002-03-22 13:42:46 [6002]   DEBUG:  exit(2)
        /usr/bin/postmaster: reaping dead processes...
        /usr/bin/postmaster: Startup proc 6002 exited with

The messages such as:
ReadRecord: invalid resource manager
and other are culprit of the error.

In view of the above errors, do the following:

  1. Execute:
    pg_resetxlog PGSQL_HOME/data
    (this will reset the write-ahead log and other control information of a PostgreSQL database cluster; they are important but this is the only way to recover).
  2. Try to log into Postgres again in single mode:
    postgres -D PGSQL_HOME/data -O -P hsphere
  3. Once you are in, type:
    reindex database hsphere;
  4. Exit the database:
  5. Finally, start Postgres and see if everything is working.

Here, two Postgres tools are used:
- reindex database to recover corrupted indexes;
- pg_resetxlogs to reset write-ahead log files and the state of Postgres.

Related Docs:   Backing Up H-Sphere Restoring H-Sphere Restoring the H-Sphere Database Accelerating CP Performance

User comments
No comments for this page.
Add comment

Home   Products   Services   Partners   Support   News   Contact   Forum
© 2020 psoft.net.
All rights reserved.