I recently ran into an issue while exporting a huge partitioned table for a data migration: the export took forever, with no obvious reason.
After some research I found that parallelism wasn't working; I could only see one Data Pump worker at a time. I then found a note explaining the behaviour (Doc ID 1467662.1).
A table with BASICFILE LOBs cannot be exported in parallel. Oracle advises converting those BASICFILEs to SECUREFILEs; this was not possible for me.
The second solution was to run multiple export jobs in parallel, each one exporting a fraction of the table's rows. In the note's example, Oracle uses a query on a range of the primary key for each export. That was fine, but I needed a more automatic method to divide the rows based on the degree of parallelism I wanted to apply, and then I found this blog post from Morten Jensen.
He uses modulo on the rowid's block number to split the rows evenly based on a parallelism parameter, which was exactly what I was looking for. He achieved it with a Perl script; I preferred to write a shell script because I have only a little knowledge of Perl. I created one shell script for the export (in my case a specific partition, but the script supports whole tables as well) and another to import the dumpfiles generated in parallel.
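The idea can be seen in isolation with a minimal runnable sketch (plain bash, no database needed): for a parallelism of 4, it generates the four non-overlapping predicates that each expdp worker receives. Every data block lands in exactly one bucket, so the union of the four exports covers every row exactly once.

```shell
#!/bin/bash
# Sketch of the predicate generation used by the export script below:
# worker i gets the rows whose block number satisfies mod(block#, N) = i.
PARALLEL=4
LIMIT=$(expr $PARALLEL - 1)
for i in $(seq 0 $LIMIT); do
  echo "worker $i: where mod(dbms_rowid.rowid_block_number(rowid), ${PARALLEL}) = $i"
done
```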
Datapump in parallel scripts
#!/bin/bash
#############################################################################
# EXPORT IN PARALLEL WITH DATAPUMP
#############################################################################
#
# Description: This script creates multiple export jobs in parallel
# to export a table or a single partition
#
#############################################################################
#
# Author: Cyrille MODIANO
#
# Version : 1.0 (2016-10-25)
# Version : 1.1 (2016-11-30) - Fixed mod = 0 values skipped during export
#
#############################################################################
BASE_SCHEMA=$1
BASE_TABLE=$2
PARALLEL=$3
PARTITION=$4
function usage(){
echo "USAGE:
Parameter 1 is the SCHEMA
Parameter 2 is the TABLE NAME
Parameter 3 is the DEGREE of parallelism
Parameter 4 (optional) is the partition (if any)"
}
if [ $# -lt 3 ]; then
usage
exit 1
fi
if [ $# -eq 4 ]; then
PARFILE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}.par
echo "tables=${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION}" > $PARFILE
START_MESSAGE="Beginning export of partition : ${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION} "
END_MESSAGE="Finished export of partition: ${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION}"
DUMPFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}
LOGFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}
else
PARFILE=${BASE_SCHEMA}_${BASE_TABLE}.par
echo "tables=${BASE_SCHEMA}.${BASE_TABLE}" > $PARFILE
START_MESSAGE="# Beginning export of table : ${BASE_SCHEMA}.${BASE_TABLE}"
END_MESSAGE="# Finished export of table: ${BASE_SCHEMA}.${BASE_TABLE}"
DUMPFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}
LOGFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}
fi
# Adding parameters to the parfile
echo "directory=DATA_PUMP" >> $PARFILE
echo "EXCLUDE=STATISTICS" >> $PARFILE
echo "CLUSTER=N" >> $PARFILE
echo "#########################################################################################"
echo $START_MESSAGE
echo "#########################################################################################"
echo " "
LIMIT=$(expr $PARALLEL - 1)
START_TIME=`date`
for i in `seq 0 $LIMIT`
do
QUERY="where mod(dbms_rowid.rowid_block_number(rowid), ${PARALLEL}) = $i"
# Quote the whole query argument so the spaces in the predicate survive word splitting
expdp userid=\'/ as sysdba\' query="${BASE_SCHEMA}.${BASE_TABLE}:\"${QUERY}\"" dumpfile=${DUMPFILE_BASE}_${i}.dmp logfile=${LOGFILE_BASE}_${i}.log parfile=$PARFILE &
sleep 3
done
# A bare wait blocks until every background expdp child has exited
wait
echo "#########################################################################################"
echo $END_MESSAGE
echo "# Start time : $START_TIME "
echo "# End time is: `date`"
echo "#########################################################################################"
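The script relies on a plain shell fan-out/fan-in pattern: each expdp is launched in the background with `&`, and `wait` then blocks until every child has exited. Here is a runnable toy version of that pattern, with `sleep` standing in for an expdp job:

```shell
#!/bin/bash
# Fan-out/fan-in: launch N background jobs, wait for all of them,
# then print the end banner. sleep stands in for one expdp job.
PARALLEL=4
for i in $(seq 0 $(expr $PARALLEL - 1)); do
  ( sleep 1; echo "worker $i done" ) &
done
wait   # a bare wait blocks until every background child has exited
echo "all workers finished"
```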
and the import script:
#!/bin/bash
#############################################################################
# IMPORT IN PARALLEL WITH DATAPUMP
#############################################################################
#
# Description: This script will import multiple dumpfiles in parallel
#
#############################################################################
#
# Author: Cyrille MODIANO
#
# Version : 1.0 (2016-10-25)
# Version : 1.1 (2016-12-12) - added support for working directories and notifications
#
#############################################################################
export ORAENV_ASK=NO
export ORACLE_SID=$1
. oraenv
TABLE_NAME=$2
PARTITION=$3
function usage(){
echo "USAGE:
Parameter 1 is the SID of the database where you want to import
Parameter 2 is the TABLE you want to import
Parameter 3 (optional) is the PARTITION name you want to import (if any)"
}
if [ $# -lt 2 ]; then
usage
exit 1
fi
if [ $# -eq 3 ]; then
PARFILE=${TABLE_NAME}_${PARTITION}.par
START_MESSAGE="Beginning import of partition : ${TABLE_NAME}:${PARTITION} "
END_MESSAGE="Finished import of partition: ${TABLE_NAME}:${PARTITION}"
SEARCH_PATTERN=${TABLE_NAME}_${PARTITION}
SUCCESS_MESSAGE="partition: ${TABLE_NAME}:${PARTITION} successfully imported, started at"
ERROR_MESSAGE="partition: ${TABLE_NAME}:${PARTITION} failed to import, check logfile for more info"
MAIL_OBJECT="Successfully imported partition ${TABLE_NAME}:${PARTITION}"
else
PARFILE=${TABLE_NAME}.par
START_MESSAGE="Beginning import of table : ${TABLE_NAME}"
END_MESSAGE="Finished import of table : ${TABLE_NAME}"
SEARCH_PATTERN=${TABLE_NAME}
SUCCESS_MESSAGE="Table ${TABLE_NAME} successfully imported, started at "
ERROR_MESSAGE="Table ${TABLE_NAME} failed to import, check logfile for more info"
MAIL_OBJECT="Successfully imported table ${TABLE_NAME}"
fi
#directories
BASEDIR=/u10/
DUMPDIR=$BASEDIR/DUMP
PARFILEDIR=$BASEDIR/parfiles
mkdir -p $PARFILEDIR
# building the parfile
echo "DIRECTORY=MY_DUMP_DIR" > ${PARFILEDIR}/$PARFILE
echo "CLUSTER=N" >> ${PARFILEDIR}/$PARFILE
echo "TABLE_EXISTS_ACTION=APPEND" >> ${PARFILEDIR}/$PARFILE
echo "DATA_OPTIONS=DISABLE_APPEND_HINT" >> ${PARFILEDIR}/$PARFILE
echo "#########################################################################################"
echo $START_MESSAGE
echo "#########################################################################################"
echo " "
START_TIME=`date`
for dump in ${DUMPDIR}/*${SEARCH_PATTERN}*.dmp
do
# impdp resolves dumpfile/logfile names relative to the DIRECTORY object,
# so strip the path from the file names
DUMPFILE=$(basename ${dump})
LOGFILE=imp_$(basename ${dump} .dmp).log
impdp userid=\'/ as sysdba\' dumpfile=${DUMPFILE} logfile=${LOGFILE} parfile=${PARFILEDIR}/$PARFILE &
sleep 3
done
# A bare wait blocks until every background impdp child has exited
wait
echo "#########################################################################################"
echo $END_MESSAGE
echo "# Start time : $START_TIME "
echo "# End time : `date`"
echo "#########################################################################################"
# Verifying errors
errors_count=`grep ORA- ${DUMPDIR}/imp_*${SEARCH_PATTERN}*.log | wc -l`
if [ $errors_count -eq 0 ]; then
echo "$SUCCESS_MESSAGE $START_TIME and finished at `date`" | mail -s "$MAIL_OBJECT" you@your-email.com
else
echo "$ERROR_MESSAGE" | mail -s "Import failed: ${TABLE_NAME}" you@your-email.com
fi
This solution is a workaround for the issue mentioned previously, but it can also be used on Standard Edition to get parallelism with Data Pump, which is otherwise only available to EE users.
To use these scripts you will have to change the DIRECTORY to match the one you are using.
You can also customize the parfile options, on my side I used the following:
- TRANSFORM=LOB_STORAGE:SECUREFILE (to convert BASICFILE LOBs to SECUREFILEs while importing)
- TABLE_EXISTS_ACTION=APPEND (because I was importing partitions and the table already existed)
- DATA_OPTIONS=DISABLE_APPEND_HINT (this is mandatory when multiple jobs import into the same table at the same time, as mentioned in the Oracle note)
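Put together, an import parfile with those options might look like this (MY_DUMP_DIR is the placeholder directory name used in the script; replace it with your own directory object):

```
DIRECTORY=MY_DUMP_DIR
CLUSTER=N
TABLE_EXISTS_ACTION=APPEND
DATA_OPTIONS=DISABLE_APPEND_HINT
TRANSFORM=LOB_STORAGE:SECUREFILE
```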
Thanks for reading.
Thanks Cyrille for sharing this post.
Foued
You’re welcome Foued.
Hello, very interesting… however I am having issues getting it to work. Can you post some script usage please?
Hi,
Usage is at the beginning of the script, what issue are you having?
Thanks
Well done Cyrille & Morten. MOD is magic!
Thanks Vincent
Hi, how do you form the connection string for a container/pluggable database? Assume the pluggable database is called mypdb.
The connect string "/ as sysdba"@mypdb : this syntax does not work.
The connect string "/@mypdb as sysdba" : this syntax does not work.
It will work only if I put a user explicitly, not the "/".
The connect string "system/password@mypdb as sysdba" : this syntax will work.
But I don't want to use sys, system, or any other user. I need to use "/ as sysdba". How do I make it work with a pluggable database service name?
Thanks in advance
Hi,
As far as I know, you can't connect to a PDB service without providing a password.
Thanks
Cyrille, great job… can you take a schema backup in parallel mode? And if yes, what is the command?
I have tried :
Script_name.sh schema_name 5
Not working :)… I am on SE
Hi,
This script was made to export a table or a table partition, but it's possible to adapt it to export a whole schema; you will need to add a step to loop through the schema's tables.
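A hypothetical sketch of that extra step: in practice the table list would come from `select table_name from dba_tables where owner = ...` via `sqlplus -s`; here the script just reads table names from stdin and echoes the commands it would launch, so it can be dry-run safely (`./parallel_expdp.sh` is a placeholder for whatever you named the export script).

```shell
#!/bin/bash
# Dry-run sketch: read one table name per line and show the export
# command that would be launched for each table of the schema.
SCHEMA=$1
PARALLEL=$2
while read -r TAB; do
  echo "would run: ./parallel_expdp.sh ${SCHEMA} ${TAB} ${PARALLEL}"
done
```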
Thanks
Your code is too sophisticated for me to add a single line… I would be most grateful if you could change the code… I would really appreciate it.
I was able to complete the schema export using PL/SQL.