Topic:
Backup of large ORACLE databases with BRBACKUP.
Extract from the 3.0 documentation.
Solution
Backup of Large ORACLE Databases
--------------------------------
This section describes the support that the SAP utility program
BRBACKUP provides for backing up large databases.
Contents:
--------
- Backup of Large ORACLE Databases: Overview
- Terminology and Hardware Requirements
- Use of BRBACKUP for Backup of Large Databases to Tape
- Use of BRBACKUP for Parallel Backup of Large Databases on Hard Disks
- Use of External Backup Programs to Backup Large Databases
- Final Comments
Backup of Large ORACLE Databases: Overview
------------------------------------------
The following information is intended to help you understand the problems involved in backing up large ORACLE databases (SAP databases with a size of 100 GB to 1 TB). It provides notes and background information to help you carry out such backups successfully in the shortest possible time.
The backup strategy for productive databases recommended by SAP is a daily image backup of the entire database (see Backup Strategy). However, this strategy is often not feasible for large databases for the following reasons.
- The backup of a dataset of 100 GB to 1 TB leads to performance problems, as a backup of this size uses up the database server resources completely (in particular CPU time, the system bus and the I/O bus, hard disk controllers and volume controllers). As a result, online use of the SAP System is limited.
- The backup of a dataset of 100 GB to 1 TB also causes a timing problem. This type of backup should usually be carried out when there is little or no activity in the SAP System (that is, within a so-called time window, usually at night). This daily time window should ideally be 12 hours for a productive SAP System, but in reality it is often considerably smaller.
Terminology and Hardware Requirements
-------------------------------------
The following example shows how a database backup could normally run:
- Backup devices:
The database backup is written to locally mounted tape units. DAT DDS-2 tape drives with hardware compression are currently used. In our example, 10 tape units are available.
- Throughput:
The throughput of the backup is assumed to be about 2 GB per hour for each tape unit.
- Tape capacity assuming 120 meters of tape:
- about 2.4 GB for tape units without hardware compression
- about 7 GB for tape units with hardware compression (assuming the average compression rate for SAP data: about 3-4)
- Time window:
The backup should be possible in less than 10 hours overnight.
- Size of the database:
The size of the database should be 100 GB.
If the database can be saved in parallel onto 10 tape units with hardware compression, the backup is conceivably feasible within the available time window of 10 hours. This estimate takes the limits of scalability into account: if 2 GB per hour can be saved to one tape unit, this does not necessarily mean that 20 GB can be saved to 10 tape units in one hour. Normally, more time is required, because the backup time per tape unit increases when many tape units are used.
As soon as the database becomes larger than 100 GB, timing problems can occur in addition to the problems with the load on the computer, as the above example shows. It is therefore necessary to find a suitable backup strategy for such databases. The following sections contain notes, suggestions, ideas and information about the options provided by the SAP utilities, which should help you to develop an optimal backup concept for databases larger than 100 GB.
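The estimate in the example above can be reproduced with a short calculation. The following Python sketch is illustrative only: the function name and the scaling_efficiency factor are assumptions introduced here to model the throughput lost when many tape units run in parallel. The documentation gives no numerical value for that loss.

```python
def backup_hours(db_size_gb, tape_units, gb_per_hour_per_unit, scaling_efficiency=1.0):
    """Estimate the elapsed time of a parallel backup in hours.

    scaling_efficiency is a hypothetical factor (0 < value <= 1) for the
    throughput lost when several tape units run in parallel.
    """
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    total_throughput = tape_units * gb_per_hour_per_unit * scaling_efficiency
    return db_size_gb / total_throughput


# Figures from the example: 100 GB, 10 tape units, 2 GB per hour per unit.
ideal = backup_hours(100, 10, 2)           # perfect scaling
derated = backup_hours(100, 10, 2, 0.5)    # assumed 50% efficiency
print(f"ideal: {ideal:.1f} h, with assumed 50% efficiency: {derated:.1f} h")
```

With perfect scaling the backup would take 5 hours. If only half of the ideal throughput were reached (an assumed value), it would take 10 hours and fill the entire time window.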
Large Databases
---------------
In the following, large databases are assumed to be in the range from 100 GB to 1 TB.
Special features of large databases:
- Backups normally exceed the available time window at night.
- The quantity of data in the database greatly exceeds the quantity of all other data managed by the system.
- A common backup strategy for database files and non-database files is not attempted, as the smaller quantities of non-database data can be saved without problems using operating system tools (for example, tar).
- The system is normally configured in such a way that there is a database server on which no other major applications run.
- A backup of large databases via the network is not of interest, as this can lead to instability and further performance loss.
Backup Devices
--------------
The backup of large databases creates an extreme load, both on the system resources and on the backup device being used.
Some examples follow, as well as some orientation values for the hardware environment of the backup devices.
- The backup should not be carried out over a network, but either to volumes in locally mounted backup devices or directly to hard disks.
- Backup devices:
- Locally mounted tape units:
Maximum number of locally mounted tape units which are supported by BRBACKUP:
SAP Release        Number
---------------    ------
as of 2.1A         25
as of 2.1K/2.2E    50
as of 3.0A         255
- Backup devices (for example magneto-optical media), which are addressed via an external backup program:
Such devices can be reached via the interface program BACKINT (see Use of External Backup Programs to Backup Large Databases).
- BRBACKUP/BRARCHIVE do not support tape jukeboxes. However, you can address such devices via the BACKINT interface to external backup programs.
Examples of Backup Devices
The following backup devices are the most common: 4 mm (DAT DDS-1, DDS-2) or 8 mm (video).
However, because of their low tape capacity and low throughput, these devices should only be used to back up large databases where there is no other option. Experience shows that a database backup of about 500 GB would require about 100 tapes. The backup time would also greatly exceed 12 hours, even with 20 tape units running in parallel.
Tape units with a larger capacity and higher throughput rate should therefore be used if possible.
The following examples are not a representative selection of the available backup devices!
IBM 3490
--------
Throughput with backup: about 5-10 GB per hour
Tape capacity: about 2 GB per tape
These devices offer high throughput, but relatively low capacity. They would only be useful if a tape jukebox were used together with an external backup program, for example via the BACKINT interface.
IBM 3590
--------
Throughput with backup: about 5-10 GB per hour
Tape capacity: about 20-30 GB per tape
These new devices from IBM have promising technical possibilities. However, there is little practical experience available, as these devices are not yet very widespread.
DLT 4000 (Digital Linear Tape)
-------------------------
Throughput with backup: about 6-9 GB per hour
Tape capacity: about 40-50 GB per tape
If a throughput of 5 GB per hour per tape unit is assumed for 10 tape units in parallel, the backup of a 500 GB database could conceivably run within 12 hours on fewer than 12 tapes.
The numbers given in the examples are based solely on the abilities of the backup devices. However, other factors also play a large role, for example, the throughput values of the computer (in particular, hard disk access times, system bus speed, I/O (SCSI) bus speed). As a result, the total throughput decreases significantly. Do not mount too many backup devices on one I/O bus, so as not to overload it. If hard disks and tape units are mounted on the same I/O bus, you must expect the load to be split between the two.
Use of BRBACKUP for Backup of Large Databases to Tape
-----------------------------------------------------
BRBACKUP calls cpio in order to copy individual files from the hard disk to tape. As a result, throughput is basically determined by cpio.
BRBACKUP offers a number of functions to enable an optimal backup of large databases.
- BRBACKUP can back up in parallel to several locally mounted tape units.
- In a parallel backup, BRBACKUP also supports tape units with automatic tape swapping (so-called autochangers). As a result, a completely automatic backup is also possible if tape swapping is to occur for one or more tape units during a backup.
- All files to be saved are distributed to the volumes (tapes) inserted in the tape units. BRBACKUP has different optimization targets (see Optimization of File Distribution).
Restrictions for Backup with BRBACKUP:
- As the database server is working at 100% during the backup, other activities on the computer should be reduced to the absolute minimum during the backup. Virtually all of the computer's resources should be available for the backup.
This is obviously contradictory to the high availability required of the SAP system. There is no simple solution to this conflict.
- No cpio continuation mechanism is supported during parallel backups (see cpio Continuation Tape), that is, individual files must always be completely saved onto one volume. Tape swapping must always be carried out by BRBACKUP. Therefore, the size of a file to be saved must not exceed the tape capacity. If you work with tape units with hardware compression, the size of the compressed file must not exceed the tape capacity.
Make sure that the tape capacity is not set too high, in order to avoid reaching the end of the tape.
- Since individual files can only be written in their entirety to a volume, there is a certain amount of wastage when distributing the files to the available tapes. That is, the tape capacity cannot be fully used. When distributing the files, BRBACKUP makes sure that the tape capacity is never exceeded. The sizes of the tape header files (label, init.ora, init.sap) and of the log files at the end of the tape (central, detailed and summary log) are not taken into account.
- Special features when using tape units with hardware compression: see Tape Units with Hardware Compression.
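The restriction that every file must fit completely onto one volume can be checked before a backup is planned. The following Python sketch is an illustration only, not part of BRBACKUP. The function name, file names, sizes and compression rates are hypothetical. The compressed size is approximated as the file size divided by the compression rate.

```python
def files_too_large(files, tape_size_gb):
    """Return the names of files whose (compressed) size exceeds the tape capacity.

    files maps a file name to a (size_gb, compression_rate) pair; use a
    compression_rate of 1.0 for tape units without hardware compression.
    """
    return [
        name
        for name, (size_gb, compression_rate) in files.items()
        if size_gb / compression_rate > tape_size_gb
    ]


# Hypothetical data files; names, sizes and compression rates are made up.
files = {
    "data_1": (2.0, 1.0),   # 2.0 GB uncompressed, fits on a 2.4 GB tape
    "data_2": (6.0, 3.0),   # 2.0 GB after compression, fits
    "data_3": (9.0, 3.0),   # 3.0 GB after compression, too large
}
print(files_too_large(files, tape_size_gb=2.4))
```

Any file reported by such a check would have to be written to a tape unit with a larger capacity.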
Optimization of File Distribution
---------------------------------
When distributing the files to the tapes, BRBACKUP has different optimization targets:
- BRBACKUP attempts to save all files from a hard disk to a volume (tape). As a result, competing hard disk accesses should be avoided, and the number of read/write accesses should be kept to a minimum.
If a Logical Volume Manager (LVM) is used, this can only be attained if the logical volumes are not unnecessarily distributed over several physical hard disks. See Optimization when Using a Logical Volume Manager.
- Using the backup times for individual files stored in the database, BRBACKUP attempts to minimize the total backup time by keeping the backup time equal on each of the individual tape units.
- This mechanism helps to avoid the following problem: When using tape units with hardware compression, the backup time cannot be estimated accurately from the file size or the size of the compressed file. The optimization is therefore based on the backup times stored by BRBACKUP.
- The tape capacity in general is not used to its maximum due to temporal optimization (if the optimal backup time is reached for a tape, no further files are saved to this tape, though the tape still has plenty of space left). The total capacity of all available tapes inserted at the same time must therefore significantly exceed the total size of the files to be saved.
- The temporal optimization is cancelled internally if it would cause a tape change (BRBACKUP always attempts to avoid volume changes). In this case, the entire capacity of the tapes is used. As a result, the backup time can vary for individual volumes.
- BRBACKUP always attempts to use all tape units mounted, even if the backup would fit onto a smaller number of tapes. This is done in order to minimize the total backup time.
The one exception is when the number of files to be backed up is less than the number of tape devices.
- BRBACKUP sorts the files to be saved according to their size and distributes the largest files to the available tapes first, followed by the smaller ones. As a result, wastage is reduced on individual volumes, as the tape capacity (parameter tape_size) can never be exceeded and the total tape capacity can be better utilized.
- BRBACKUP is capable of "learning". The backup times of individual files are stored in the database and used in the next backup for temporal optimization (if this is carried out).
As a result, changes in the backup time of individual files are taken into account. The backup time can change, for example, if the fill level of the individual database files varies and the files are saved to a tape unit with hardware compression.
- The option -o|-output dist,time for BRBACKUP/BRARCHIVE can be used to:
- display the file distribution carried out by BRBACKUP before starting the actual backup.
- display the backup times of the individual files after the backup is completed.
See extensions of the log
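The idea of placing the largest files first while keeping the backup time per tape unit equal can be illustrated with a simple greedy procedure. The following Python sketch is a simplified illustration, not BRBACKUP's actual algorithm. It ignores the grouping of files by hard disk and the handling of tape changes, and all names and figures in it are hypothetical.

```python
def distribute(files, tape_count, tape_size):
    """Greedy sketch: largest files first, each to the least-loaded tape with room.

    files is a list of (name, size, backup_time) tuples. Returns a list with
    one dict per tape: {"files": [...], "used": size, "time": backup_time}.
    """
    tapes = [{"files": [], "used": 0, "time": 0} for _ in range(tape_count)]
    for name, size, backup_time in sorted(files, key=lambda f: f[1], reverse=True):
        # Only tapes on which the whole file still fits are candidates.
        candidates = [t for t in tapes if t["used"] + size <= tape_size]
        if not candidates:
            raise ValueError(f"{name} does not fit on any inserted tape")
        # Choose the candidate with the shortest accumulated backup time.
        target = min(candidates, key=lambda t: t["time"])
        target["files"].append(name)
        target["used"] += size
        target["time"] += backup_time
    return tapes


# Hypothetical files: (name, size in MB, backup time in minutes from the last run).
files = [("f1", 1800, 50), ("f2", 1500, 30), ("f3", 1000, 35),
         ("f4", 800, 20), ("f5", 500, 15), ("f6", 400, 10)]
for number, tape in enumerate(distribute(files, tape_count=2, tape_size=3200), 1):
    print(number, tape["files"], tape["used"], tape["time"])
```

With these made-up figures, both tapes receive 3000 MB and 80 minutes of backup time, and neither tape exceeds the assumed capacity of 3200 MB.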
Optimization when Using a Logical Volume Manager
------------------------------------------------
If a Logical Volume Manager (LVM) is used, BRBACKUP can only save all files from a hard disk to tape if the logical volumes are not unnecessarily distributed over several hard disks (see configuration A).
Configuration A:
Each logical volume exactly corresponds to one hard disk (for example logical volume 1 = disk 1, logical volume 2 = disk 2, logical volume 3 = disk 3).
Configuration B:
Each logical volume is set up so that it covers areas of all the hard disks (for example logical volume 1 = area1.disk1+area1.disk2+area1.disk3, logical volume 2 = area2.disk1+area2.disk2+area2.disk3, logical volume 3 = area3.disk1+area3.disk2+area3.disk3).
Configuration A is more efficient than configuration B with respect to database backup. Configuration B, however, can provide better performance for online operation of the SAP system.
If you plan the configuration of logical volumes for large databases, you should find a compromise for the options which meets your requirements. You should consider which is more important: a more effective backup or higher performance of the online operation of the SAP system.
Some notes on this point:
- Advantages of using an LVM: easier administration, high flexibility, higher security by using RAID systems
Disadvantages of using an LVM: performance loss through management overhead and possible (configuration B) deterioration of optimization through BRBACKUP
- Although the higher security and availability of your datasets provided by using the LVM has a generally high priority, you should consider whether you could do without an LVM. You lose the above-mentioned advantages this way, but, on the other hand, you can carry out a backup of the datasets considerably more effectively by means of BRBACKUP.
- In large databases, the configuration of the database files is not variably selectable. Every change to the configuration (for example, structure changes due to a tablespace extension or a tablespace reorganization with data files) must always be planned with respect to the physical hard disk configuration and the influence that this will have on the performance. The use of an LVM therefore has no major advantage in this case.
Tape Units with Hardware Compression
------------------------------------
Tape units with hardware compression are very widespread.
The advantage of such tape units is clear:
- the "logical" tape capacity is increased by a certain factor, called the compression rate (average compression rate for SAP data is about 3-4)
- the data throughput is increased, the backup time is shortened
The compression method used is normally based on the Lempel-Ziv algorithm.
BRBACKUP can optimize a backup very well on tape units with hardware compression if the current compression rates are known before the backup starts. For this purpose, BRBACKUP has an option that determines approximate compression rates: brbackup -k only (see Hardware Compression). You should use this option once a month so that the compression rates are always up to date.
However, this recommendation can only be followed partially for large databases, as the following brief explanation shows:
Let us assume that approx. 3 GB of data can be compressed in an hour. The compression of a 300 GB database therefore takes roughly 100 hours. This period is, of course, not acceptable.
The solution to this problem lies in reducing the dataset to be compressed and in parallelizing the procedure:
- It is possible to compress individual database files or to exclude them from the compression run.
- For large databases, most database files are well filled, and subject to only a few changes. Consequently, the compression rates of these files can be assumed to be a constant. These files can be excluded from the regular compression.
- If a database file has no essential changes in two consecutive compression runs, this compression rate can also be seen as a constant. A further check of the compression rate must only be carried out after a longer period (for example, after a year).
- After a large data transfer or a reorganization of a tablespace, the affected tablespaces must be compressed again.
- Parallelization of the compression run to determine the compression rates (without starting a backup) is implemented.
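The effect of reducing the dataset and parallelizing the compression run can be estimated as follows. The Python sketch below is illustrative only. The function name, the file sizes, the number of changed files and the number of parallel runs are assumptions, and ideal scaling across the parallel runs is assumed as well.

```python
def compression_run_hours(file_sizes_gb, gb_per_hour=3.0, parallel_runs=1):
    """Estimate the duration of a compression run in hours.

    Assumes ideal scaling across parallel_runs, which real hardware will
    not fully reach.
    """
    return sum(file_sizes_gb) / (gb_per_hour * parallel_runs)


# Hypothetical 300 GB database made up of 150 files of 2 GB each.
all_files = [2.0] * 150
# Assumption: only 15 files changed enough to need a new compression rate.
changed_files = all_files[:15]

print(compression_run_hours(all_files))                       # whole database
print(compression_run_hours(changed_files, parallel_runs=5))  # reduced, parallel
```

For the whole database the estimate is the 100 hours mentioned above. Restricting the run to the 30 GB of changed files and using 5 parallel runs reduces it to about 2 hours under these assumptions.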
Partial Backups
---------------
You can reduce long backup times by dividing the backup of the entire database into several partial backups.
For example, the backup of a 500 GB database could be split into 5 partial backups of 100 GB each.
A 100 GB backup fits more easily into the available time window.
Advantages
- The partial backups can be carried out daily without major problems.
- The database can be recovered at any time if the corresponding redo log entries exist. It is also possible to restore the entire database and then recover it.
- Recovery after a media error (crash of individual hard disks) can be done quickly as only the files of the affected hard disks must be restored.
Disadvantages
- Restoring the entire database takes as long as the total of all of the partial backups. This situation normally only occurs with logical errors (for example, a program error or handling error) which cause a data loss. This disadvantage is generally acceptable, since these errors should not occur frequently.
- It must be ensured that all database files are saved within one cycle of the partial backups. This is the responsibility of the database administrator. Back up at tablespace level (rather than at database file level, which is also possible in principle), because BRBACKUP can then ensure that all files from a tablespace are backed up.
Splitting extremely large tablespaces into several smaller tablespaces can be useful in this case (for example, extremely large tables could be put into separate tablespaces, and then backed up separately at tablespace level).
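A cycle of partial backups at tablespace level can be planned and checked for completeness with a few lines of code. The following Python sketch is an illustration only. The function names, the tablespace names and their sizes are hypothetical, and the assignment rule (largest tablespace first, to the currently smallest partial backup) is just one possible choice.

```python
def plan_cycle(tablespaces, days):
    """Assign tablespaces (name -> size in GB) to `days` partial backups.

    Largest tablespaces first, each to the currently smallest partial backup.
    """
    plan = [{"tablespaces": [], "size": 0} for _ in range(days)]
    by_size = sorted(tablespaces.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        smallest = min(plan, key=lambda day: day["size"])
        smallest["tablespaces"].append(name)
        smallest["size"] += size
    return plan


def missing_from_cycle(tablespaces, plan):
    """Return the tablespaces that no partial backup in the cycle covers."""
    covered = {name for day in plan for name in day["tablespaces"]}
    return sorted(set(tablespaces) - covered)


# Hypothetical tablespaces of a 500 GB database (sizes in GB).
tablespaces = {"TS_A": 100, "TS_B": 90, "TS_C": 80, "TS_D": 70, "TS_E": 60,
               "TS_F": 40, "TS_G": 30, "TS_H": 20, "TS_I": 10}
plan = plan_cycle(tablespaces, days=5)
for number, day in enumerate(plan, 1):
    print(number, day["tablespaces"], day["size"])
print("not covered:", missing_from_cycle(tablespaces, plan))
```

With these made-up sizes, each of the 5 partial backups contains 100 GB, and the completeness check reports no tablespace as missing from the cycle.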
Use of BRBACKUP for Parallel Backup of Large Databases on Hard Disks
--------------------------------------------------------------------
BRBACKUP offers the option to back up onto several hard disks (see Backup on Several Disks). Note the following features:
- Parallel backup onto the specified hard disks is possible. The directories must be defined in parameter backup_root_dir. The backup can be carried out with or without software compression.
The degree of parallelism (the number of parallel copy processes) is controlled by parameter exec_parallel of initialization profile init.sap (see exec_parallel) or the command option -e.
A two-level backup can be made. See Two-Level Backup.
In order to avoid burdening the database server with the second level of the backup, the following procedure would be conceivable: Unmount (umount) the file system from the database server, and mount (mount) this file system on a second host. Then start the backup from this host. A requirement for this procedure is that the hard disk controllers for the backup disks can be physically connected to both hosts simultaneously. The operations umount and mount are necessary, since the file system cannot be mounted on different computers at the same time due to the buffering mechanism.
Use of External Backup Programs to Backup Large Databases
---------------------------------------------------------
Information on the BACKINT interface can be found in Using External Backup Programs.
The essential advantage of using this interface is that you can use other backup media (for example, magneto-optical (MO) media) for a backup, or you can transfer volume management to other systems (for example, tape jukeboxes).
The disadvantage of these programs was that in an online backup (before SAP R/3 Release 3.0B) all the tablespaces to be backed up were in backup mode during the whole backup, and considerably more redo log entries were created for this reason (that is, the volume of offline redo log files was considerably increased in a backup with BACKINT). The interface was extended to eliminate this problem. See the information about the option util_file_online of parameter backup_dev_type or about the corresponding command option -d util_file_online.
Information on the throughput and performance of external backup programs can be obtained from the vendors providing the interface program BACKINT and its support for the respective backup program.
Final Comments
--------------
The feasibility of the backup of large databases with BRBACKUP (using cpio for the copy procedures) depends on the following factors:
- Capacity and maximum throughput of the tape units
- Hard disk access times
- Maximum throughput of the I/O (SCSI) bus
- Maximum throughput of the system bus
- Performance of cpio, which is determined by internal buffering and block size
The hardware restrictions can only be removed by the hardware vendors.
The hardware configuration for a large database needs careful planning so that a backup can be executed optimally. When large databases are backed up, it is often necessary to make multiple tape swaps and to manage hundreds of volumes. Tape jukeboxes which can be supported via the BACKINT interface and external backup programs should therefore be used if possible.