Cloning Using Hot Backups

Shankar Govindan

This document gives you a detailed flow of how to backup the database in a 24x7 operation and use it for cloning a Development or Test database. By using the System switching of Log files option, it reduces the amount of archive logs to be applied and reduces the time of recovery(the goal).

Most sites running Oracle databases need the latest production databases to be cloned for the development and test teams to work on. This is especially true when you are working on an Oracle Applications site where patches must be applied and tested before applying the same on the Development box for user evaluation.

These sites need to be refreshed periodically with the production database and Application. Most DBAs use Cold Backup procedures to clone a database. Although cold backups are the most common backup procedures for database cloning, there may be situations where no downtime is available. Sometimes the job has to be executed during the lunch break for the development team to apply patches and test certain critical issues that need to be addressed immediately.

Hot Backups can be performed when the database is up and in Archive log mode. First, check the v$database view to see if your database is set in Archive log mode:

SVRMGR> select log_mode from v$database;

LOG_MODE

------------

ARCHIVELOG

Or to see if archiver is running,

SVRMGR> select archiver from v$instance;

ARCHIVE

-------

STARTED

Next we write a small script to spool all the files that are in v$datafiles and their respective tablespaces from dba_data_files. We can concatenate the required open, end and copy commands to it.

First lets check the number of logfiles that exist for your database,

Sqlplus>select * from v$logfile;

We normally switch the log files to the number of log groups that exist. For example, if you have three log groups for your database, then use the alter system switch logfiles three times in the script. This would make sure that all the logs have switched and are archived. Simply put, it works on the same concept of increasing or decreasing the size of log files to tune your checkpoints.

Lets say we want to increase the log files size so that the checkpoints are not happening too frequently. What would we do? We would create new logfile groups with the intended sizes and then start switching the log groups one by one. The old ones are dropped and removed and the new ones take over. Then the database is bounced.

Let me give you another interesting example:

When we query the v$logfile we find a couple of logfiles stale. (There is a document on Metalink that explains stale logfiles,

Note: 1014824.6 ORA-00346: REDO LOG FILE HAS STATUS 'STALE’)

SVRMGR> select * from v$logfile;

GROUP# STATUS MEMBER

---------- ------- -------------------------------------------------------------

1 /u00/oradata/imgt/redoimgt01.log

2 STALE /u01/oradata/imgt/redoimgt02.log

3 STALE /u02/oradata/imgt/redoimgt03.log

SVRMGR> alter system switch logfile;

Statement processed.

SVRMGR> select * from v$logfile;

GROUP# STATUS MEMBER

---------- ------- -------------------------------------------------------------

1 /u00/oradata/imgt/redoimgt01.log

2 /u01/oradata/imgt/redoimgt02.log

3 STALE /u02/oradata/imgt/redoimgt03.log

Only one of the log groups status shows that it is still stale. Lets switch again and see what happens,

SVRMGR> alter system switch logfile;

Statement processed.

SVRMGR> select * from v$logfile;

GROUP# STATUS MEMBER

---------- ------- -------------------------------------------------------------

1 /u00/oradata/imgt/redoimgt01.log

2 /u01/oradata/imgt/redoimgt02.log

3 /u02/oradata/imgt/redoimgt03.log

The status does not show stale for the third group.

The same way we switch to the number of groups we have before we start running the hot backup script. Well what is the advantage in doing this?

We don’t have to move all the archive log files, we don’t have to apply all the archive log files and the recover time is shorter, also a fewer number of files means a lesser chance of recovery failure due to file corruption.

Now lets note down the last archive log file from the archive log destination directory or from the alert log and then do a

sqlplus>Alter system switch logfile;

This would give us the current log that got archived by switching and also a checkpoint occurs. Remember that switching leads to check pointing and check pointing not necessarily need lead to a log switch. So the datafiles all get consistent since the current scn’s are stamped. We do the switch to the number of log groups we have. That means the first archive log file that was switched might be your full file size and the subsequent ones would have lesser size depending on the data.

Let’s create a Backup directory to hold our hot back datafiles. Make sure you have enough space to hold all the datafiles. Now log on to sqlplus or svrmgr as database user sys or system and run the script. This script is a modified script available in script repositories.

set serveroutput on

set heading off

set feedback off

spool $ORACLE_HOME/backup.sql

declare

fname varchar2(55);

tname varchar2(55);

tname1 varchar2(55);

cursor tspaces is

select tablespace_name,file_name

from v$datafile,sys.dba_data_files

where enabled like '%WRITE%'

and file# = file_id

order by 1;

begin

dbms_output.enable(32000);

dbms_output.put_line('spool hotback');

open tspaces;

fetch tspaces into tname,fname;

tname1 := tname;

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter tablespace '||tname||' begin backup;');

while tspaces%FOUND loop

if tname1 != tname then

dbms_output.put_line('alter tablespace '||tname1||' end backup;');

dbms_output.put_line('alter tablespace '||tname||' begin backup;');

tname1 := tname;

end if;

dbms_output.put_line('!cp '||fname||' /u03/ora_bkp/hot_backup/imgt');

fetch tspaces into tname,fname;

end loop;

dbms_output.put_line('alter tablespace '||tname1||' end backup;');

close tspaces;

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter system switch logfile;');

dbms_output.put_line('alter database backup controlfile to trace;');

dbms_output.put_line('spool off');

end;

/

spool off

set heading on

set feedback on

set serveroutput off

oracle@eclipse:/u01/app> cd $ORACLE_HOME

oracle@eclipse:/u01/app/oracle/product/8.1.6>echo $ORACLE_SID

imgt

oracle@eclipse:/u01/app/oracle/product/8.1.6> svrmgrl

Oracle Server Manager Release 3.1.6.0.0 - Production

Copyright (c) 1997, 1999, Oracle Corporation. All Rights Reserved.

Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production

With the Partitioning option

JServer Release 8.1.6.0.0 - Production

SVRMGR> connect system

Password:

Connected.

SVRMGR>@backup.sql

Make sure you run this during less database activity to reduce the number of archive logs being generated and correspondingly reduce recovery time.

We have switched the log groups twice in the script, once before starting the backup and once after the backup is complete. This way we need to copy over only those archive files starting with the ones that we noted down when we started the log switch to the ones we have when we did the log switch after the backup. These would be very few and hence the recovery would also be faster when we apply these archive files. We would also have a set of archive files that we are sure of and consistency can thus be assured.

When the backup is complete check the backup location to see if all the files are available. We could now either FTP the same to the other system or copy over these files to another location in case of cloning on the same system. You could also use the remote copy command rcp to copy over these files to a remote system for rebuilding a test or development database. Just make sure you have the permission on the remote system to do the rcp.

Copy over all the files to their respective filesystems and directories and then edit the file that was created using the backup controlfile to trace. Copy that file to the remote system and edit it accordingly.

STARTUP NOMOUNT

CREATE CONTROLFILE REUSE DATABASE "IMGT" NORESETLOGS NOARCHIVELOG

MAXLOGFILES 32

MAXLOGMEMBERS 2

MAXDATAFILES 1000

MAXINSTANCES 8

MAXLOGHISTORY 843

LOGFILE

GROUP 1 '/u00/oradata/imgt/redoimgt01.log' SIZE 5M,

GROUP 2 '/u01/oradata/imgt/redoimgt02.log' SIZE 5M,

GROUP 3 '/u02/oradata/imgt/redoimgt03.log' SIZE 5M

DATAFILE

'/u01/oradata/imgt/system01.dbf',

'/u01/oradata/imgt/rbs01.dbf',

'/u01/oradata/imgt/temp01.dbf',

-

-

-

Remove all the unwanted information and change the REUSE DATABASE to SET DATABASE and the SID to the SID you have chosen and set in the environment you are running the svrmgrl. Also check that all the files are in the right location and edit that information in the control file here accordingly.

Next, remove the lines that say,

# Recovery is required if any of the datafiles are restored backups,

# or if the last shutdown was not normal or immediate.

RECOVER DATABASE

# Database can now be opened normally.

ALTER DATABASE OPEN;

Save the file with an extension of sql. Like controlfile<sid>.sql

The recover database clause is removed from the controlfile script to enable us to do the recovery manually.

Once the controlfile is run successfully and you get the statement processed, we can start applying the archive logs that we have moved to the archive log destination directory as per the init<sid>.ora file.

(Note : I have skipped the copy over of the ORACLE_HOME,oratab, listener and tnsnames and init.ora files and subsequent editing of the same to reflect the changes because these activities are normally done the same way when you do a cold backup and then clone.)

We are still in the startup nomount state; lets recover the database to its current state. Remember that the data is still written to the datafiles even though the tablespace is brought to the begin backup mode. Simply put this is just like leaving the datafile at a particular state of scn during the shutdown abort. When the database is brought up again the recovery process brings the database to its current state by rolling forward and then rolling back automatically.

Lets do a recover of the database to its consistent state,

SVRMGR> recover database using backup controlfile until cancel;

The create control file command only changes the structure of the database and the SID name, the header of the datafiles still hold all the required information. The above command would ask you to input the archive log file names one by one to do recovery or you could choose the AUTO option. Once the recovery process is complete, open the database with the resetlogs option, so that the log files are all cleaned up.(since we are doing this manually now we removed this option too from the controlfile)

SVRMGR> alter database open resetlogs;

The database is ready for use. Take a backup before applying any patches. If you are cloning on the same system then sometimes the database will not come up and give an error stating that ,

OERR: ORA 9782 "sfifi: another instance has the same database mounted."

Just rename the sgadef<SID>.dbf as sgadef<SID>.old and bring the cloned database up using the backup controlfile option. Bounce it once again before renaming the sgadef file of the other database back to its original name.

Note : The above, as usual, has nothing to do with the company I work for and are entirely my own views, please test the same before you use for your production. However I have used the same on NT,HP-UX,SUN,AIX and DataGeneral Unix and version 7.3,8.x and 8i.

 

Shankar Govindan is an Oracle Certified Professional and Microsoft Certified Professional working as an Oracle Applications DBA for Syntel Inc., in their IMG Division.