[rt-users] MySQL->Oracle DB Migration & Binary Attachment data corruption - UTF8 issue?
Joop van de Wege
JoopvandeWege at mococo.nl
Wed Jun 13 15:19:25 EDT 2007
Todd Williams wrote:
> Hello,
>
> Here are the pertinent details for our attempted migration:
>
> Suse Linux 9.3 x86 32-bit
> MIGRATING: MySQL 4.0.18 to Oracle 9i 9.2.0.6.0
> RT 3.6.3
> Perl 5.8.3
> DBIx::SearchBuilder 1.45, DBD::Oracle 1.19
> Apache 2.0.49
> Oracle SQL Developer & MySQL Migration Workbench plugin (latest)
>
>
> Issue: Binary Attachments & MySQL->Oracle DB Migration
>
> Does anyone have experience with a successful MySQL 4.x to Oracle 9x RT
> migration?
>
> 1. We have a perfectly functioning RT instance on this maching using
> either MySQL or Oracle database type. I have done both installs, and
> both work independently.
>
> 2. After some trial and error while migrating to the Oracle database,
> things seem to function much better if I allow RT to create the initial
> DB Schema, rather than attempt to have the Oracle tool migrate the
> schema from MySQL.
>
> 3. I have used the Oracle SQL Developer to migrate the data from the
> MySQL to the Oracle instance.
>
> 4. The database contents seem to have migrated without any issue,
> however, a primary issue is the binary attachments appear to be corrupt
> coming from the migrated Oracle DB.
>
> 5. Upon comparing the contents of an attachment bit-for-bit from the
> MySQL to Oracle tables, the data in each is identical post migration!!
>
> 6. Text attachments work fine, migrated just fine. Tickets, users,
> etc. migrated successfully.
>
> 7. There appears to be a decoding issue with the Oracle binary
> attachments (from the migrated MySQL db) in that they seem to grow in
> size approximately 23% upon downloading, and are corrupted as a result.
> The data obtained through RT (downloading a ticket attachment, both WEB
> and CMDLINE) is NOT the expected data stored in the Oracle table.
> Perhaps this is a UTF8 issue of some sort?
>
> 8. Hex dump (snippet) of data (Correct in the database, Incorrect
> produced out of RT's migrated database):
>
> Correct data: d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 3e 00 03 00 fe ff 09 00
> Incorrect data: ef bf bd ef bf bd 11 ef bf bd ef bf bd ef bf bd 1a ef
> bf bd 00 00 00 00 00 00 00 00 00 00 00 00
>
> Note the 8-bit characters from the correct data above being replaced by:
> 0xEF 0xBF 0xBD in what is obtained from RT in the incorrect data line
> above.
>
> 9. How do we overcome this so that previous (migrated) attachments work
> correctly, and going forward all new attachments appear correctly as well?
I'm using Oracle too, Oracle XE. The way rows in the attachments table
look is that they are base64 encoded. You can check that by starting
from scratch and entering one ticket with a picture of word document
attachment. You'll see that the content column contains the base64
encoded data and contenttype and contentencoding columns contain the
values needed to reconstruct the attachment. What probably is happening
is that the migration wizard has transferred all the data from MySQL to
Oracle but didn't take into account that RT sees Oracle as a binary
unsafe database, thats why the content column is base64 encoded.
Depending on what is actually in the attachments table you might be able
to transform it to the right values.
If you need help, let me know.
Joop
More information about the rt-users
mailing list