Latest Updates

Post Top Ad

29 January, 2014

Howto run expdp with data compression in Oracle 10g and 11g

Many people are still reluctant to use the Oracle export Data Pump (expdp) in 10g because there is no effective method of compression via pipe as there is in simple export command (exp). Starting in version 11g, Oracle added the parameter “COMPRESS=ALL” that helps in compressing the dump file generated, but still uses the internal mechanisms for compression without allowing the user to choose the best form of compression.

To solve this, I developed a script that performs a parallel compression of the files generated by expdp dumps, doing efficiently as the pipe and using the parameter of maximum file size of the expdp DUMP, also taking advantage of the parallelism parameter.

1) How the script works?
When calling the expdp, we define a value for the parameter “FILESIZE”, for example, of 1GB.:
eg: expdp full=Y directory=… dumpfile=expfile.%u.dmp logfile=expfile.log filesize=1G

Thus, assuming that its total DUMP file is 30 GBs, the expdp will generate the first 1GB file and when finished, will start to record the second file. At this time, a parallel compression script will act in the first file generated while the second file is being recorded by the database. Compression script can use bzip2, gzip or any other algorithm of your choice and existing on your server.

The script can also take advantage of the “PARALLEL” parameter and start recording several files in parallel. As they are coming to the defined “FILESIZE”, compression will be performed in the background making the export much faster:
eg: expdp full=Y directory=… dumpfile=expfile.%u.dmp logfile=expfile.log filesize=1G parallel=8

This script, however, does not yet support specific schemas export. He performs a FULL database with one of the following commands (depending on the chosen parameterization):

EXPDP: expdp full=y directory=… dumpfile=… logfile=… filesize=… parallel=…
EXP:      exp file=… log=… full=y consistent=y compress=n buffer=9999999 statistics=none direct=y
2) Requirements
The script has been tested on Linux, HP-UX and Solaris. Actually, the basic requirement is that there are primary unix commands (cd, mktemp, mkfifo, awk, head, etc).
Necessary that the database to be backed up is registered in the oratab file (for exp and expdp) or in TNSNAMES file (for remote exp).
The compression tool desired must be installed (gzip, bzip2, zip, compress, …).
3) How to install?
Download it here!

Unzip the directory “exp” in the folder you want to configure Data Pump. Subdirectories will only have shell scripts, configuration files and logs. The dumps will be generated on a configurable destination.

exp-tree

The folder has 3 basic subdirectories:
SH folder – Contains the shells responsible for export and compression. It is not necessary to change them
CFG folder – Contains the configuration files for each instance parameterized.
LOG folder – Contains the logs of executions.

The SH folder contains 3 shells:
expcron.sh – This shell only centralizes the execution of everything and save the log. This is the script which must always be called.
expfull.sh – This shell has the entire Oracle export logic process.
expdpcompress.sh – This shell has all Oracle dumps logic compression.

The CFG folder contains the configuration file of the instance:
orcl.cfg – The file name should be exactly the same as the instance name you are backing up in the oratab file, appending “.cfg”. If this file does not exist, the process attempts to find a file named “default.cfg” in that same directory.

The LOG folder is initially empty and contains the executions logs.

Set the permissions on the directories and files according to the following images:

exp-tree-sh
exp-tree

4) Configuring
The configuration file “oracle_sid.cfg” existing in the cfg folder (where oracle_sid is the SID of the instance to be backed up) requires the following parameters:

- EXPTYPE
Optional parameter that specifies the type of export and Oracle binary to be called (“exp” or “expdp”). Accepts the values ??”EXP” or “EXPDP”. If omitted, the default value “EXP” will be used.
Eg: EXPTYPE=EXPDP

- EXPRETENTION
Mandatory parameter that determines the retention time of the old backups, in days. After the end of the process, dump files before “EXPRETENTION” days will be deleted.
Eg: EXPRETENTION=3

- EXPUSER
Mandatory parameter that defines the user and password that will connect the database to perform the export. It is recommended to create a user exclusive for this purpose, giving the user only the grants required, as follows:


1
2
3
4
create user expuser identified by asb378 default tablespace USERS temporary tablespace TEMP profile DEFAULT quota unlimited on users;
grant exp_full_database to expuser;
grant create session to expuser;
grant create table to expuser;
Eg: EXPUSER=expuser/asb378

- EXPDEST
Mandatory parameter that defines the folder where the dump will be written. This folder must also exist within the instance if the variable “EXPTYPE” is “expdp”, as follows:


1
2
create or replace directory EXPDP1 as '/u02/export/dump';
grant read,write on directory EXPDP1 to expuser;
Eg: EXPDEST=/u02/export/dump

- COMPALG
Mandatory parameter that determines the command line to run the compression tool. Currently, only supports Gzip or Bzip2.
Bzip2 is slower but the file is much smaller. If the export type is “expdp”, this compression will occur in the background at the end of each part. If “exp”, the compression will run through PIPE.
Eg: COMPALG=”bzip2 -9?

- EXPPARALLEL
Mandatory parameter if the value of “EXPTYPE” variable is “EXPDP”. Specifies the amount of jobs that will run parallel export, being passed to the parameter “PARALLEL” of “expdp” command. Do not exceed 2x the number of CPUs. If you are generating a small dump, leave as “1?. If the variable “EXPTYPE” is “EXP” or null, this parameter is ignored.
Eg: EXPPARALLEL=2

- EXPFILESIZE
Mandatory parameter if the value of “EXPTYPE” variable is “EXPDP”. Specifies the maximum size of each dump file being passed to the parameter “FILESIZE” of “expdp” command. Do not leave too small because the maximum limit are 99 files. The size must be specified in the format of GBs. If the variable “EXPTYPE” is “EXP” or null, this parameter is ignored.
Eg: EXPFILESIZE=3G

- TARCOMPACT
Mandatory parameter if the value of “EXPTYPE” variable is “EXPDP”. Specifies whether in the end of the process, all the export files compressed will be joined into a single TAR (1= Yes, 0=No). If the variable “EXPTYPE” is “EXP” or null, this parameter is ignored.
Eg: TARCOMPACT=0

- RESOLVTNS
Optional parameter and only applies if the value of “EXPTYPE” variable is “EXP” or null. Specifies the database connection string inside TNSNAMES file, if you wish to run backup of a remote server. Only works for “exp”, not “expdp”. If the variable “EXPTYPE” is “EXPDP” and this parameter is specified, the process will point an alert and stops.
Eg: RESOLVTNS=orcl.world

a) Example configuration to export data pump (expdp) with no executions in parallel, maximum size for each dump is 3 GBs.


1
2
3
4
5
6
7
8
EXPTYPE=EXPDP
EXPRETENTION=7
EXPUSER=expuser/asb378
EXPDEST=/u02/export/dump
COMPALG="bzip2 -9"
EXPPARALLEL=1
EXPFILESIZE=3G
TARCOMPACT=0
b) Example configuration to export data pump (expdp) with 8 executions in parallel, maximum size for each dump is 2 GBs.


1
2
3
4
5
6
7
8
EXPTYPE=EXPDP
EXPRETENTION=7
EXPUSER=expuser/asb378
EXPDEST=/u02/export/dump
COMPALG="bzip2 -9"
EXPPARALLEL=8
EXPFILESIZE=2G
TARCOMPACT=0
c) Example configuration for simple export (exp).


1
2
3
4
5
EXPTYPE=EXP
EXPRETENTION=7
EXPUSER=expuser/asb378
EXPDEST=/u02/export/dump
COMPALG="bzip2 -9"
d) Example configuration for simple remote export (exp) through TNSNAMES.


1
2
3
4
5
6
EXPTYPE=EXP
EXPRETENTION=7
EXPUSER=expuser/asb378
EXPDEST=/u02/export/dump
COMPALG="bzip2 -9"
RESOLVTNS=orcl.world
5) Running
To execute the export script, we run the shell expcron.sh passing as a parameter the instance sid exactly as written in the oratab file.
Thus, the script will fetch the configuration folder for the file “oracle_sid.cfg” or the file “default.cfg”, if the first one does not exist.

Example of punctual execution:


1
2
3
4
5
[oracle@orasrv ~]$ cat /etc/oratab
#
orcldb:/u01/app/oracle/product/11.2.3/db_1:Y
[oracle@orasrv ~]$ $EXPORT_HOME/sh/expcron.sh orcldb &
[1] 17637$
The execution log is written in the log folder “$EXPORT_HOME/log”.

Example of backup scheduled in crontab (most common), running every day at 21h:


1
2
[oracle@orasrv ~]$ crontab -l
00 21 * * * /u02/export/exp/sh/expcron.sh orcldb
NOTE 1: The script does not run on servers that have two identical ORACLE_SID in different ORACLE_HOME’s. The script also does not accept to run as root, for security reasons.

NOTE 2: If the export method is simple remote export (exp) via TNSNAMES, obviously there is no requirement for the presence of the SID in oratab file. The only parameter of the “expcron.sh” command will only be used to locate the configuration file “oracle_sid.cfg” and it will read the string name inside TNSNAMES files.

6) Script output
In addition to the log file generated in the directory “$EXPORT_HOME/log”, the output of the script is obviously a compressed DUMP file (if “exp”) or multiple compressed DUMPS files inserted into a folder created by the process (if “expdp”). If the variable “TARCOMPACT” has been set to “1?, the entire directory will be placed in a container “.tar” file.
The log of the export output will also be generated along with the dump file.

7) Conclusion
Now we can run the expdp with in a efficient way, compressing files. If the compression is too slow, try changing the second parameter of the “gzip” tool (or “bzip2?) to “– fast” (or “-1?). If the files are too big, do the opposite, switch to “– best” (or “-9?).

No comments:

Post Top Ad

Your Ad Spot