Convert MySQL Files to Postgres Format on the Fly | by Danil Vityazev | Apr, 2022

Write a customized script to permit loading MySQL recordsdata into Postgres

Picture by Jan Antonin Kolar on Unsplash

Typically life offers you an SQL file to load into your database, and it’s not within the appropriate format. That occurred in my case.

I had a file generated for MySQL and wanted to load it into my Postgres database. The usual answer would require both contacting the seller to ask them to supply a correct file or utilizing a converter software program. Sadly, each of those methods had been unsuitable.

So, I’m going to show a 3rd method — a customized script that means that you can load MySQL recordsdata into Postgres whereas changing it on the fly with out creating any extra recordsdata and allocating additional house on a drive.

The file consists of two components. First, the desk creation half, which is principally one massive assertion that creates an empty desk in a database to load information into.

The second half, which inserts rows into the desk:

Probably the most tough half to rewrite is the primary one, as a result of there are numerous variations between MySQL and Postgres on the subject of declaring a desk, whereas INSERT statements are very related.

The plan is to jot down a Python script that reads an SQL file line by line and adjustments the MySQL syntax to Postgres. Then submits the question to the server as soon as it detects a ; image.

I’ll use a typical psycorg library to submit queries. For starters, let’s write a operate that handles queries. We will name it to jot down information into the database.

Now, the plan is as follows:

  • Learn the road from the unique file into the variable sql_line.
  • Verify if it accommodates any dialect contradictions. If that’s the case, repair them.
  • Add the road to the variable containing the entire assertion.
  • If the road accommodates ; image, submit the assertion to the database.

To start with, MySQL makes use of the backtick image in desk and columns names, whereas Postgres makes use of double citation marks. That one is straightforward to deal with, and it’s proven under:

sql_line = sql_lone.exchange('`', '"')

One other facet that’s straightforward to cope with is the timestamp format:

sql_lime = sql_line.exchange("datetime", "timestamp(0)")

Now, Postgres would not want any numbers in brackets when specifying INT sort for columns, so you must delete it.

To implement a column that mechanically increments, we have to exchange the AUTO_INCREMENT key phrase with SERIAL. The road might also include stuff between the title of the column and the auto-increment key phrase, so we’re simply going to disregard that and write the serial key phrase proper after the column title.

I additionally deleted all of the feedback.

Now, MySQL has the key phrase collate, which we additionally don’t want, so we will delete it and the phrase after it.

Some columns have the unsigned property. There isn’t a such property in Postgres, however we will merely verify that every one the values within the column are larger than 0.

Probably the most tough half about declaring an enum is that the MySQL enum sort requires declaring a customized sort earlier than the desk assertion.

CREATE TYPE enum_1 as ENUM ('worth 1', 'worth 2');

After which within the desk assertion, create a column of that sort:

”ENUM_column” enum_1 NOT NULL

So, if we encounter this key phrase, we have to parse the road for potential entries, declare the sort initially of the entire desk assertion, after which, make a column of the newly declared sort. The result’s the next code:

That’s it for the column half. Now, we have to deal with Desk constraints.

The file accommodates three primary constraints:

  • The major key. In Postgres, the first secret’s declared by including a corresponding key phrase to a column, so we have to parse the first key assertion, determine the title of the first key, and alter the road that creates that column.
  • Distinctive key. The identical strategy is used right here. We have to parse the distinctive key constraint within the unique line and add the distinctive key phrase to the corresponding column.
  • International key. There are vital variations within the syntax in making a international key. In my keys, neither of the tables talked about within the international key statements existed in my database, so I simply ignored them. However within the instance under, there’s a code that rewrites the code into Postgres format, only for the sake of versatility.

So, combining all three strategies, we get the next outcome:

In spite of everything these transformations, we now have working Postgres code that creates the meant desk. All that’s left to do is to submit the question to the server, and we will use the next code to do this:

execute_sql(table_statement)

After that, we solely must iterate by way of all of the remaining traces and insert the info into the freshly made desk.

Already talked about above, the INSERT syntax is just about the identical in MySQL and Postgres, aside from one distinction.

The distinction is within the escape character’s remedy. In MySQL, if one wants to incorporate, say, an apostrophe right into a string all they need to do is add a backslash earlier than it. In Postgres, along with that, a capital E is required to point that the road could include escape characters.

'McDonald's' -> E'McDonald's'

So, let’s write a operate that detects backslashes in a string and provides a capital E earlier than the apostrophe opens this string. The catchy second right here is that the road could have a number of apostrophes, so, earlier than putting the letter E, we have to verify whether or not or not this apostrophe really begins the string.

Simply as earlier than, we have to exchange any backticks with citation marks.

In order that’s it. Now we will safely execute the road, and it’ll seem in your desk like this:

execute_sql(sql_line)

This record of variations will not be exhaustive; I lined solely these I confronted throughout loading the file myself, and chances are you’ll encounter others for those who use the script.

Thanks for studying.

The total code might be discovered within the GitHub repository.

More Posts