2017-12-24 - Progress - Tony Finch
I have used ora2pg
to do a quick export
of the IP Register database from Oracle to PostgreSQL. This export
included an automatic conversion of the table structure, and the
contents of the tables. It did not include the more interesting parts
of the schema such as the views, triggers, and stored procedures.
Oracle Instant Client
Before installing ora2pg
, I had to install the Oracle client
libraries. These are not available in Debian, but Debian's ora2pg
package is set up to work with the following installation process.
Get the Oracle Instant Client RPMs
from Oracle's web site. This is a free download, but you will need to create an Oracle account.
I got the
basiclite
RPM - it's about half the size of thebasic
RPM and I didn't need full i18n. I also got thesqlplus
RPM so I can talk to Jackdaw directly from my dev VMs.The
libdbd-oracle-perl
package in Debian 9 (Stretch) requires Oracle Instant Client 12.1. I matched the version installed on Jackdaw, which is 12.1.0.2.0.Convert the RPMs to debs (I did this on my workstation)
$ fakeroot alien oracle-instantclient12.1-basiclite-12.1.0.2.0-1.x86_64.rpm $ fakeroot alien oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm
Those packages can be installed on the dev VM, with
libaio1
(which is required by Oracle Instant Client but does not appear in the package dependencies), andlibdbd-oracle-perl
andora2pg
.sqlplus
needs a wrapper script that sets environment variables so that it can find its libraries and configuration files. After some debugging I foud that although the documentation claims thatglogin.sql
is loaded from$ORACLE_HOME/sqlplus/admin/
in fact it is loaded from$SQLPATH
.To configure connections to Jackdaw, I copied
tnsnames.ora
andsqlnet.ora
froment
.
Running ora2pg
By default, ora2pg
exports the table definitions of the schema we
are interested in (i.e. ipreg
). For the real conversion I intend to
port the schema manually, but ora2pg
's automatic conversion is handy
for a quick trial, and it will probably be a useful guide to
translating the data type names.
The commands I ran were:
$ ora2pg --debug $ mv output.sql tables.sql $ ora2pg --debug --type copy $ mv output.sql rows.sql $ table-fixup.pl <tables.sql >fixed.sql $ psql -1 -f functions.sql $ psql -1 -f fixed.sql $ psql -1 -f rows.sql
The fixup script and SQL functions were necessary to fill in some gaps
in ora2pg
's conversion, detailed below.
Compatibility problems
Oracle treats the empty string as equivalent to NULL but PostgreSQL does not.
This affects constraints on the
lan
andmzone
tables.The Oracle
substr
function supports negative offsets which index from the right end of the string, but PostgreSQL does not.This affects subdomain constraints on the
unique_name
,maildom
, andservice
tables. These constraints should be replaced by function calls rather than copies.The
ipreg
schema usesraw
columns for IP addresses and prefixes;ora2pg
converted these tobytea
.The
v6_prefix
table has a constraint that relies on implicit conversion fromraw
to a hex string. PostgreSQL is stricter about types, so this expression needs to work onbytea
directly.There are a number of cases where
ora2pg
represented named unique constraints as unnamed constraints with named indexes. This unnecessarily exposes an implementation detail.There were a number of Oracle functions which PostgreSQL doesn't support (even with
orafce
), so I implemented them in thefunctions.sql
file.- regexp_instr()
- regexp_like()
- vzise()
Other gotchas
The
mzone_co
,areader
, andregistrar
tables reference thepers
table in thejdawadm
schema. These foreign key constraints need to be removed.There is a weird bug in
ora2pg
which mangles the regex[[:cntrl:]]
into[[cntrl:]]
This is used several times in the
ipreg
schema to ensure that various fields are plain text. The regex is correct in the schema source and in theALL_CONSTRAINTS
table on Jackdaw, which is why I think it is anora2pg
bug.There's another weird bug where a
regexp_like(string,regex,flags)
expression is converted tostring ~ regex, flags
which is nonsense.There are other calls to
regexp_like()
in the schema which do not get mangled in this way, but they have non-trivial string expressions whereas the broken one just has a column name.
Performance
The export of the data from Oracle and the import to PostgreSQL took an uncomfortably long time. The SQL dump file is only 2GB so it should be possible to speed up the import considerably.