Tuesday, 21 October 2014

A refresh is a scenario where we migrate data from one database to another database, or from one schema to another schema.

Generally, refreshes are done at three levels:

  1. Table Level Refresh
  2. Schema Level Refresh
  3. Database Level Refresh
We perform table level and schema level refreshes using the traditional export/import utility or the Data Pump technology.

Data Pump is a feature introduced in Oracle 10g that provides fast parallel data load. With direct path and parallel execution, Data Pump is several times faster than the traditional exp/imp. Traditional exp/imp runs on the client side, whereas expdp/impdp runs on the server side, so we have much more control over expdp/impdp compared to traditional exp/imp. When compared to exp/imp, Data Pump startup time is longer, because it has to set up the jobs, queues, and master table. Also, at the end of the export operation the master table data is written to the dump file set, and at the beginning of the import job the master table is located and loaded in the schema of the user.
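As a minimal sketch, a schema-level Data Pump export needs a directory object on the server; the directory name DPUMP_DIR, its path, and the SCOTT schema below are assumptions for illustration:

SQL>create directory dpump_dir as '/u01/app/oracle/dpump';
SQL>grant read, write on directory dpump_dir to scott;
$expdp system schemas=scott directory=dpump_dir dumpfile=scott.dmp logfile=expdp_scott.log

The directory object is what makes expdp/impdp read and write files on the server side rather than on the client machine.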

Following are the processes involved in a Data Pump operation:

Client Process: This process is initiated by the client utility (expdp/impdp). It makes calls to the Data Pump API. Once the Data Pump job is initiated, this process is not necessary for the progress of the job.

Shadow Process: When the client logs into the database, a foreground process is created. It services the client's Data Pump API requests. This process creates the master table and the Advanced Queuing queues used for communication. Once the client process ends, the shadow process also goes away.

Master Control Process: The MCP controls the execution of the Data Pump job. There is one MCP per job. The MCP divides the Data Pump job into various metadata and data load or unload tasks and hands them over to the worker processes.

Worker Process: The MCP creates worker processes based on the value of the PARALLEL parameter. The worker processes perform the tasks requested by the MCP.

Advantages of Data Pump

1. We can perform exports in parallel. It can also write to multiple files on different disks. (Specify PARALLEL=2 and the two directory names with file specifications, for example DUMPFILE=ddir1:file1.dmp,ddir2:file2.dmp.) See the sketch after this list.
2. Has the ability to attach to and detach from a job, and to monitor the job progress remotely.
3. Has more options to filter metadata objects, e.g. EXCLUDE and INCLUDE.
4. The ESTIMATE_ONLY option can be used to estimate disk space requirements before performing the job.
5. Data can be exported from a remote database by using a database link.
6. An explicit DB version can be specified, so only supported object types are exported.
7. During impdp, we can change the target file names, schema, and tablespace. Ex: REMAP_SCHEMA, REMAP_DATAFILE, REMAP_TABLESPACE.
8. Has the option to filter data rows during impdp. In traditional exp/imp, this filter option is available only in exp; here the filter option is available in both expdp and impdp.
9. Data can be imported from one DB to another without writing to a dump file, using the NETWORK_LINK parameter.
10. Data access methods are decided automatically. In traditional exp/imp, we specify the value for the DIRECT parameter; here Data Pump decides itself, and where direct path cannot be used, conventional path is used.
11. Job status can be queried directly from the data dictionary (for example, DBA_DATAPUMP_JOBS, DBA_DATAPUMP_SESSIONS, etc.).
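As a sketch of points 1, 2 and 11, a parallel export writing to two directories, together with monitoring from the dictionary, might look like the following; the directory objects DDIR1/DDIR2, the job name SCOTT_EXP, and the SCOTT schema are assumptions for illustration:

$expdp system schemas=scott parallel=2 dumpfile=ddir1:scott_%U.dmp,ddir2:scott_%U.dmp logfile=ddir1:expdp_scott.log job_name=scott_exp
SQL>select job_name, state, operation, degree from dba_datapump_jobs;
$expdp system attach=scott_exp

The ATTACH command re-connects a client session to the running SCOTT_EXP job so its progress can be monitored or the job stopped and resumed.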
Exp & Expdp common parameters: The below parameters exist in both the traditional exp and the expdp utilities.
FILESIZE
FLASHBACK_SCN
FLASHBACK_TIME
FULL
HELP
PARFILE
QUERY
TABLES
TABLESPACES
TRANSPORT_TABLESPACES (the exp value is Y/N; the expdp value is the name of the tablespace)
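For example, the QUERY parameter behaves much the same way in both utilities; the EMP table and the DEPTNO filter below are assumptions for illustration (in practice the clause is usually placed in a PARFILE to avoid shell quoting issues):

$exp scott tables=emp query=\"where deptno=10\" file=emp_dept10.dmp log=emp_dept10.log
$expdp scott tables=emp query=emp:\"where deptno=10\" directory=dpump_dir dumpfile=emp_dept10.dmp logfile=emp_dept10.log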
Comparing exp & expdp parameters: The below parameters are equivalent between exp and expdp, listed as the exp parameter and its corresponding expdp parameter...

FEEDBACK => STATUS
FILE => DUMPFILE
LOG => LOGFILE
OWNER => SCHEMAS
TTS_FULL_CHECK => TRANSPORT_FULL_CHECK
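To illustrate the mapping, the same schema export expressed with both utilities might look like this; the file names and the SCOTT schema are assumptions for illustration:

$exp system owner=scott file=scott.dmp log=scott.log feedback=100000
$expdp system schemas=scott directory=dpump_dir dumpfile=scott.dmp logfile=scott.log status=60

Note that FEEDBACK prints a dot for every n rows exported, while STATUS prints a job status every n seconds, so the two are equivalents in purpose rather than identical in behaviour.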
New parameters in expdp Utility
ATTACH Attach the client session to existing data pump jobs
CONTENT Specify what to export(ALL, DATA_ONLY, METADATA_ONLY)
DIRECTORY Location to write the dump file and log file.
ESTIMATE Show how much disk space each table in the export job consumes.
ESTIMATE_ONLY Estimates the space, but does not perform the export
EXCLUDE List of objects to be excluded
INCLUDE List of objects to be included
JOB_NAME Name of the export job
KEEP_MASTER Specify Y to retain (not drop) the master table after the export
NETWORK_LINK Specify dblink to export from remote database
NOLOGFILE Specify Y if you do not want to create log file
PARALLEL Specify the maximum number of threads for the export job
VERSION DB objects that are incompatible with the specified version will not be exported.
ENCRYPTION_PASSWORD If a table column is encrypted and no password is specified, the column data is written as clear text in the dump file set. We can define any string as a password for this parameter to encrypt the column data in the dump file.
COMPRESSION Specifies whether to compress metadata before writing to the dump file set. The default is METADATA_ONLY. The valid values are METADATA_ONLY and NONE; use NONE to disable compression during the expdp.
SAMPLE Allows you to specify a percentage of data to be sampled and unloaded from the source database. The sample percent indicates the probability that a block of rows will be selected as part of the sample.
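A few of these expdp parameters in use, as a sketch (the directory object DPUMP_DIR, the SCOTT schema, and the file names are assumptions for illustration):

$expdp system schemas=scott directory=dpump_dir estimate_only=y estimate=blocks nologfile=y
$expdp system schemas=scott directory=dpump_dir dumpfile=scott_meta.dmp content=metadata_only exclude=statistics
$expdp system schemas=scott directory=dpump_dir dumpfile=scott_sample.dmp sample=10 compression=none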
Imp & Impdp common parameters: The below parameters exist in both the traditional imp and the impdp utilities.
FULL
HELP
PARFILE
QUERY
SKIP_UNUSABLE_INDEXES
TABLES
TABLESPACES
Comparing imp & impdp parameters: The below parameters are equivalent between imp and impdp, listed as the imp parameter and its corresponding impdp parameter...
DATAFILES => TRANSPORT_DATAFILES
DESTROY => REUSE_DATAFILES
FEEDBACK => STATUS
FILE => DUMPFILE
FROMUSER => SCHEMAS, REMAP_SCHEMA
IGNORE => TABLE_EXISTS_ACTION (SKIP, APPEND, TRUNCATE, REPLACE)
INDEXFILE, SHOW => SQLFILE
LOG => LOGFILE
TOUSER => REMAP_SCHEMA
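To illustrate, loading the SCOTT objects into a different schema with each utility might look like this; the dump file names and the target schema SCOTT_STAGE are assumptions for illustration:

$imp system file=scott.dmp log=imp_scott.log fromuser=scott touser=scott_stage ignore=y
$impdp system directory=dpump_dir dumpfile=scott.dmp logfile=imp_scott.log remap_schema=scott:scott_stage table_exists_action=append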
New parameters in impdp Utility
FLASHBACK_SCN Performs import operation that is consistent with the SCN specified from the source database. Valid only when NETWORK_LINK parameter is used.
FLASHBACK_TIME Similar to FLASHBACK_SCN, but oracle finds the SCN close to the time specified.
NETWORK_LINK Performs the import directly from a source database using the database link name specified in the parameter. The dump file will not be created on the server when we use this parameter. To get a consistent export from the source database, we can use the FLASHBACK_SCN or FLASHBACK_TIME parameters. These two parameters are only valid when we use the NETWORK_LINK parameter.
REMAP_DATAFILE Changes name of the source DB data file to a different name in the target.
REMAP_SCHEMA Loads objects to a different target schema name.
REMAP_TABLESPACE Changes name of the source tablespace to a different name in the target.
TRANSFORM We can specify that the storage clause should not be generated in the DDL for import. This is useful if the storage characteristics of the source and target database are different. The valid values are SEGMENT_ATTRIBUTES, STORAGE. STORAGE removes the storage clause from the CREATE statement DDL, whereas SEGMENT_ATTRIBUTES removes physical attributes, tablespace, logging, and storage attributes.
TRANSFORM = name:boolean_value[:object_type], where boolean_value is Y or N.
For instance, TRANSFORM=storage:N:table
ENCRYPTION_PASSWORD It is required on an import operation if an encryption password was specified on the export operation.
CONTENT, INCLUDE, and EXCLUDE are the same as in the expdp utility.
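As a sketch of a dump-file-less import over the network, the command below assumes a database link named PROD_LINK from the target to the source database, a target schema SCOTT_STAGE, and a target tablespace USERS_STAGE:

$impdp system directory=dpump_dir logfile=impdp_scott.log network_link=prod_link schemas=scott remap_schema=scott:scott_stage remap_tablespace=users:users_stage

No dump file is written on either server; the rows are pulled directly across the database link. A DIRECTORY is still required for the log file unless NOLOGFILE=Y is specified.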
Steps for a Table Level Refresh:

For example, consider the below scenario:

Source Database: Prod
Schema: Scott
Table: Emp

Target Database: Stage
Schema: Scott
Table: Emp

Now perform the below steps on the source database:

SQL>select count(*) from scott.emp;

SQL>select sum(bytes)/1024/1024/1024 from dba_segments where segment_name='EMP' and owner='SCOTT';

$exp file=exp_prod_scott_emp.dmp log=exp_prod_scott_emp.log tables=scott.emp buffer=102400 consistent=y
Copy the dump file to the target database server location.
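For example, with scp; the host name and path below are assumptions for illustration:

$scp exp_prod_scott_emp.dmp oracle@stage-host:/u01/exports/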

Now perform the below steps on the target database:

SQL>select count(*) from scott.emp;
SQL>select sum(bytes)/1024/1024/1024 from dba_segments where segment_name='EMP' and owner='SCOTT';

$exp file=exp_stage_scott_emp.dmp log=exp_stage_scott_emp.log tables=scott.emp buffer=102400 consistent=y

SQL>drop table scott.emp;

$imp file=exp_prod_scott_emp.dmp log=imp_stage_scott_emp.log fromuser=scott touser=scott

Now crosscheck the data in both the databases.

SQL>select count(*) from scott.emp;
SQL>select sum(bytes)/1024/1024/1024 from dba_segments where segment_name='EMP' and owner='SCOTT';

The output of the above queries must be the same in both the databases after the import.
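The same table-level refresh can also be done with Data Pump. A minimal sketch, assuming a directory object DPUMP_DIR exists on both servers:

On the source database (Prod):
$expdp system directory=dpump_dir dumpfile=expdp_prod_scott_emp.dmp logfile=expdp_prod_scott_emp.log tables=scott.emp

Copy the dump file to the DPUMP_DIR path on the target server, then on the target database (Stage):
$impdp system directory=dpump_dir dumpfile=expdp_prod_scott_emp.dmp logfile=impdp_stage_scott_emp.log tables=scott.emp table_exists_action=replace

With TABLE_EXISTS_ACTION=REPLACE the existing table is dropped and recreated by impdp, so the manual drop step is not needed.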
 
