Data Pump Import
We learned in a previous blog that Data Pump Export is used to transfer data from an Oracle database into a flat file. The blog also contained a few quick examples of the benefits that transferring data from an Oracle database to a flat file provides. Now that we have the data in that flat file, we need to learn how to run the Data Pump Import utility to transfer the data back into an Oracle database.
Let's begin our discussion with a quick review of a sample Data Pump Import
command:
impdp foot/foot directory=data_pump_dir1 dumpfile=expfull.dmp logfile=impfull.log full=y
"impdp" tells the operating system to run the Data Pump Import utility. The parameters that follow are used to control the job's execution. In our example, the account and password are both "foot", the file that will contain the input data is "expfull.dmp", the logfile is"impfull.log" , and a full database import will be performed
We learned previously that Data Pump reads and writes its files through a database directory object that you must pre-create with a SQL DDL statement on the database being accessed. If you don't pre-create a directory, Oracle provides a default directory called DATA_PUMP_DIR. Because Data Pump uses directory objects, users are unable to fully qualify the dump file and log file as they were able to do in the non Data Pump versions of Export and Import. You specify a fully qualified directory on the server when you create the new database directory object and then you must grant access to that directory. My last blog contains more information on the creation and administration of Oracle directories.
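For reference, creating a directory object and granting access to it looks something like this (the server path and grantee below are placeholders I've chosen for illustration, not values from the example above):
SQL> CREATE DIRECTORY data_pump_dir1 AS '/u01/app/oracle/dpump_files';
SQL> GRANT READ, WRITE ON DIRECTORY data_pump_dir1 TO foot;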
Data Pump provides the following import modes (a few sample commands are shown after the list):
- Full - Activated by specifying "FULL=Y". Notifies Import to import the entire content of the input file. The amount of data actually imported depends on the data contained in the input file. If the input file contains all of the objects and data from a full database export, all of those objects will be imported. If the input file contains a schema export, all of the objects in that schema will be imported.
- Schema Import - Activated by specifying "SCHEMAS=schema, schema,...". Imports the entire schema and all dependent objects (tables, indexes, grants, synonyms, PL/SQL programs, views). The problem with the older version of Import was that because it didn't recreate the user (just the user's objects), the administrator was forced to save the user creation and grant statements to recreate the user with the same privileges it had before the drop was performed. A Data Pump schema import will recreate the user and execute all of the associated grants, user password history, etc.
- Table Import - Activated by specifying "TABLES=[schema.]tablename, [schema.]tablename,...". Imports the tables and all of their dependent objects.
- Tablespace Import - Activated by specifying "TABLESPACES=tablespacename, tablespacename,...". All of the objects contained in the specified tablespaces are imported. If a table is imported, its dependent objects are also imported regardless of the tablespace they were stored in.
- Transportable Tablespace Import - Activated by specifying "TRANSPORT_TABLESPACES=tablespacename,...". Here's how this feature works. The administrator first runs Data Pump Export with the Transportable Tablespace feature activated. Data Pump exports only the metadata for the objects contained in the specified tablespaces. The objects in the tablespace set being exported cannot have any references to objects contained in tablespaces outside of the tablespace set. In addition, there can't be any objects in the tablespace set that are referenced by objects outside of the tablespace set being exported. An index stored in a tablespace that is not in the same tablespace set as the table it is built upon is an example of a dependency reference problem. Administrators use the TRANSPORT_FULL_CHECK parameter to perform the dependency verification. The administrator then uses an operating system command to copy the tablespace datafile(s) to the target database. The transportable tablespace import feature is then used to attach the datafile (and its contents) to the database. Transferring the datafile using operating system commands is much faster than exporting and importing the data using the traditional Data Pump Export and Import operations. Administrators are only limited by hardware and network performance.
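To make these modes a little more concrete, here are a few sample commands (a sketch only; the schema, table, and tablespace names are placeholders, and I'm reusing the expfull.dmp dump file and directory from the example at the top of this blog):
impdp foot/foot directory=data_pump_dir1 dumpfile=expfull.dmp schemas=foot logfile=impschema.log
impdp foot/foot directory=data_pump_dir1 dumpfile=expfull.dmp tables=foot.employees logfile=imptable.log
impdp foot/foot directory=data_pump_dir1 dumpfile=expfull.dmp tablespaces=users logfile=imptbs.log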
It is important to note that you are not forced into using the same import mode as you did for the export. You can perform a schema or table import using an export file that was created by a full database export. You can also import one table from a schema level export (and so on). Lastly, Data Pump Import can only read dump files that were created by Data Pump Export.
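For instance, pulling a single table out of a schema level export might look like this (expfoot.dmp is a hypothetical schema level dump file I'm using purely for illustration):
impdp foot/foot directory=data_pump_dir1 dumpfile=expfoot.dmp tables=foot.employees logfile=imp_one_table.log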
Let's continue by reviewing a few of the 10G Data Pump Import parameters that you may find interesting. Many of these parameters will be the same as for Data Pump Export.
Data Pump Import Parameters
- JOB_NAME=jobname_string - You are able to assign a job name of up to 30 bytes to a Data Pump operation. This allows the job to be more easily identified when querying the DBA_DATAPUMP_JOBS and DBA_DATAPUMP_SESSIONS data dictionary views that provide information on Data Pump operations. If you do not specify a job name, Oracle will create a default job name for you. The form will be SYS_IMPORT_mode_nn, where mode is the import mode (FULL, TABLE, etc.) and "nn" is an incrementing number starting at 01. The job name is used in the ATTACH command (shown below) to attach, and reattach, to running jobs.
- ATTACH[=[schema_name.]job_name] - Used to attach a client session to an existing Data Pump operation. An example of the ATTACH command is provided below:
> impdp foot/foot ATTACH=foot.importfulljob
- DIRECTORY=directory_object - All of the dumpfile set I/O is generated by Data Pump processes that run within the constructs of the database engine. As a result, the O/S account performing the I/O is oracle, which usually has a higher level of privileges than most other accounts. Oracle attempts to prevent unwarranted operations from occurring by using Oracle directory objects that require read and write privileges granted by the DBA. Users running Data Pump operations are only able to read and write files in the directories that they have been granted access to.
Before a user is able to run any Data Pump operation, the administrator must pre-create a directory and grant privileges to the user on that directory. This means that users are unable to fully qualify the output file and log file as they were able to do in the non Data Pump versions of Export and Import. If you don't pre-create the directory, Oracle provides a default directory called DATA_PUMP_DIR.
- DUMPFILE=[directory_object:]file_name [, ...] - Contains the metadata and/or data output from a Data Pump Export operation that Data Pump Import is using as input. The directory object does not need to be specified here if one is supplied in the DIRECTORY parameter. If a directory is specified in both the DIRECTORY parameter and the DUMPFILE parameter, the DUMPFILE=directory_object: specification takes precedence.
- ESTIMATE={BLOCKS | STATISTICS} - Used to estimate the amount of data a Data Pump Import network operation will generate. This differs from the ESTIMATE option that is used with Data Pump Export, which asks Data Pump to estimate how large the output dumpfile will be. I'll provide more information on Data Pump network import operations later in this blog.
- Fine Grained Selectivity Parameters (EXCLUDE, INCLUDE, QUERY, FLASHBACK, CONTENT) - Allow the administrator to include or exclude objects. For more information on these parameters, please turn to my previous blog titled "10G Data Pump Part 1".
- HELP={y | n} - Displays a listing of all the parameters that can be specified for the Data Pump operation. For example:
> impdp help=y
would provide a listing of parameters for a Data Pump Import operation.
- LOGFILE=[directory_object:]file_name - Contains work-in-progress, work completed and errors generated during a Data Pump operation. For a real time update on a Data Pump operation's activities, you are able to attach to a running Data Pump operation using the ATTACH command and then execute the STATUS command. Like its DUMPFILE counterpart, the directory object does not need to be specified here if one is supplied in the DIRECTORY parameter. If a directory is specified in both the DIRECTORY parameter and the LOGFILE parameter, the LOGFILE=directory_object: specification takes precedence.
- NETWORK_LINK=database_link_name - OK, I must admit that this is one of my favorite new features provided by Data Pump Import. The network transfer option allows administrators to attach to any source database that can be accessed from the network, transfer the data and write it to the target database. No dumpfile required! Administrators use the NETWORK_LINK parameter to specify a pre-created database link that points to the source database (a sample command follows this parameter list).
This parameter must be specified when using the FLASHBACK_SCN, FLASHBACK_TIME, ESTIMATE, or TRANSPORT_TABLESPACES parameters.
- PARALLEL=integer - Multiple worker processes are able to perform inter-table and inter-partition work in parallel. This allows administrators to load and unload database tables using parallel processing. Data Pump Import takes advantage of parallelism to build indexes and load package bodies.
The most beneficial aspect of Data Pump's parallel processing feature is that it is dynamic. Administrators are able to attach (connect) to a Data Pump job that is currently executing and dynamically alter the number of parallel processes it uses.
Oracle documentation states that the parallel option may cause file contention if too many parallel processes are used to read the same input dumpfile. As a result, you'll need to be careful when specifying the level of parallelism.
- REUSE_DATAFILES={y | n} - Tells Data Pump Import whether to reinitialize existing tablespace datafiles. Specifying "N" tells Data Pump Import to generate an error message from the failing create tablespace execution; the import will then continue executing. Specifying "Y" tells Import to issue a warning and reinitialize the tablespace datafile(s), which results in a total loss of the existing data in the target tablespace.
- STATUS=[integer] - Specifies the number of seconds between each status information display.
- TABLE_EXISTS_ACTION={SKIP | APPEND | TRUNCATE | REPLACE} - Provides Data Pump Import with instructions to perform when it finds a table in the target database with the same name as a table contained in the input file.
TABLE_EXISTS_ACTION=SKIP - Existing tables are bypassed. No action is taken.
TABLE_EXISTS_ACTION=APPEND - Data Pump Import will append rows to the existing table.
TABLE_EXISTS_ACTION=TRUNCATE - Data Pump Import will truncate the existing table and load rows from the source.
TABLE_EXISTS_ACTION=REPLACE - Existing tables will be dropped, recreated and then loaded from the source.
- TRANSFORM=transform_name:value[:object_type] - Allows the administrator to alter the object creation DDL for specified objects. For more information on this parameter, please turn to my previous blog titled "10G Data Pump Part 1".
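Tying a few of these parameters together, a network import might look something like the sketch below (source_db is a hypothetical database link pointing to the source database, and the other values are placeholders, not settings from an actual job):
impdp foot/foot network_link=source_db schemas=foot table_exists_action=append parallel=2 directory=data_pump_dir1 logfile=imp_network.log job_name=footnetimp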
Commands Available in Interactive Mode
We learned previously that administrators are able to attach to and detach from actively running Data Pump operations. Administrators use the ATTACH and EXIT_CLIENT commands to connect and disconnect from an active Data Pump job. This output file provides a demonstration of detaching from and attaching to an active Data Pump operation. You'll notice that I dynamically change the parallel setting after I reattach.
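A minimal sketch of what reattaching and bumping the parallel setting looks like follows (the job name here is just the default naming format described earlier, used for illustration):
> impdp foot/foot attach=foot.SYS_IMPORT_FULL_01
impdp> PARALLEL=4
impdp> CONTINUE_CLIENT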
Let's take a look at some of the other parameters for Data Pump Import that you can dynamically change during its execution:
- CONTINUE_CLIENT - Switches from interactive mode to status (logging) mode. An example of this command follows:
impdp> CONTINUE_CLIENT
- EXIT_CLIENT - Detaches the client from an active Data Pump operation. The job continues to run. The administrator is then able to use the ATTACH command to reattach to the job at any time. The attach can be performed if the job is executing or in a stopped state. An error message will be returned if an attempt is made to attach to a completed Data Pump operation.
- HELP - Provides help on interactive mode Data Pump commands.
- KILL_JOB - Kills the active Data Pump operation and detaches all client sessions. This command differs from the STOP_JOB command, which is discussed below. KILL_JOB will remove the dumpfiles that were created, which prevents the user from taking advantage of Data Pump's restart capabilities. The log files will remain.
- PARALLEL=integer - Changes the number of parallel processes for a Data Pump operation.
- START_JOB / STOP_JOB - Restarts and stops the Data Pump operation the client is currently attached to. Data Pump jobs update a master table with all pertinent information pertaining to the job being executed. The master table knows about the objects being exported or imported and their location in the dump file set.
As a result, a Data Pump job can be restarted in the event of a planned or unplanned stoppage. The Data Pump job accesses the master table to determine where to start and picks up right where it left off. Administrators are able to use the STOP_JOB and START_JOB commands on the client to temporarily stop jobs and restart them when desired.
Oracle provides the SKIP_CURRENT option of the START_JOB command to allow administrators to restart a Data Pump Import operation that failed to restart because of a failing DDL statement. SKIP_CURRENT tells Data Pump Import to skip the object having the problem and restart processing on the next object.
- STATUS=[integer] - Specifies the number of seconds between each status information display.
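Putting a few of these interactive commands together, a stop-and-restart sequence might look something like this sketch (the job name is again illustrative, and START_JOB=SKIP_CURRENT would be substituted for START_JOB if a failing DDL statement were blocking the restart):
impdp> STOP_JOB=IMMEDIATE
> impdp foot/foot attach=foot.SYS_IMPORT_FULL_01
impdp> START_JOB
impdp> STATUS=60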
WRAPUP
I think after reading these last four blogs, we should all have a pretty good
understanding of 10G Data Pump. We can finally retire Data Pump's ancestors
Export and Import. As we now know, they just can't compete with the features
and functionality provided by Data Pump Export and Import.
Thanks for reading.
Chris Foot
Oracle Ace