Another Mike

Postgres for the win!

4/23/2009

I guess all my arguments (read: complaining) finally paid off. Our old database is starting to fall apart under the load. Too much data and too many users is an equation that won't last, no matter who makes the software. And eventually my aggressive caching scheme didn't matter much when the server began to run out of disk space. So I was given the go ahead to build a reporting database with Postgres to ease the load on the production box.

See the inspiration for this post.

I'm not a database expert but I have learned plenty enough about them in order to make my applications run as best as possible. If you have questions please use the comments below

Createdb

This Postgres database would store the complete databases from three sources every month end. Every night it'd also update a cache of the production data. The first task was to resolve the schema differences between them. They'd started off with the same design but over time programmers have added to them.

Since the entire schema was roughly 160 tables * 3 databases, I wasn't going to do that by hand and I certainly didn't want to have to update it when new projects changed the schema. Automation would be key.

Fortunately, this is only for reporting. That means I wouldn't have to port the packages, functions or views. I also ignored foreign key constraints since it'd do little good. Only the table structure itself was needed. I ended up writing a process that'd simply loop through all the tables and columns of all databases and resolve types between them, writing DDL statements to stdout.

The benefit to that approach meant I could review the changes before piping the result to psql. Actually, I ended up using streaming to psql for everything -- it's fast and easy.

Partitioning

Right away I'd decided to use Postgres partitioning. Even though most people will tell you that a single Postgres table is usually fast enough, my report queries wouldn't be running over the entire table. They'd only ever run on one at a time so it seemed obvious to eliminate as much data processing as possible. Plus, I'd be dropping tables every night and it was much faster than DELETE. Partitioning will also allow me to move older data to slower archive disks in the future.

Postgres partitioning doesn't work like anything else I'm familiar with. It uses table inheritance and at first the benefits weren't clear. I'd expected to have set up storage definitions and range partitioning after my experience with Oracle.

But it's far, far better than I'd thought. There were no problems inserting to the right partition (since you can just INSERT INTO the table you want), "moving" a partition could be as easy as renaming it, and dropping one is nothing more complicated than DROP TABLE. You can even backup and restore a single partition without all the fuss Oracle requires.

Table inheritance lets me keep several tables together under one large storage-less parent table. I was able to issue ALTER TABLE ADD COLUMN statements against the parent table and update all the children, which really helped when writing my schema tool.

Indexing

One drawback of the table partitioning and inheritance scheme was any index created on the parent table didn't span the child tables. I had to recreate them each time on every child table.

That one ended up not being such a problem though, since not having indexes on the table while importing millions of rows was another performance win. I ended up simply writing a script to create indexes on the child tables to match any indexing on the parent. Indexes on the parent tables are pointless (they have no data in my case) but I could define them once and simply run my script after the import finished. Postgres made it quick and easy to get the index definitions, change the names and recreate.

Data

For my data dumping tool I wrote another mode in my tool to simply SELECT * FROM each table and write CSV to stdout. My goal was to read from the database, perform any necessary transforms and write to the reporting server in a stream. Any stage that required dumping to disk first would be too slow.

Of course, Postgres' COPY FROM is crazy fast and it didn't take long to figure out the right CSV format. It was pretty easy. I toyed with dblink or writing some pljava to make the remote calls from the database itself but it seems a lot of type information is lost in the process. In the end piping to psql was the best route.

Lessons Learned

A few quick points about Postgres I learned along the way:

Postgres is wicked fast. So fast that System.currentTimeMillis() is not accurate enough to measure its fastness. (And why I don't have any more graphs to show off.)

Since I was generating SQL for the reports already, it ended up being easier to use the child partition table directly. That meant several fewer SQL criteria statements per query just to use the right partition, even though performance was about the same.

Use check constraints. They're limited in a lot of ways but they can really speed up queries dramatically.

Examine your query plans for type casts. I found several that made a massive impact to query speed. (It'll show up as a :: style cast in the explain plan.)

Don't use numeric for ID columns. My schema tool had mistakenly created a single table with this type and all other comparisons were cast to numeric. Altering the column to int4 was a 2x query time gain.

Postgres partitioning is unfortunately verbose. Instead, write scripts to generate SQL statements for you. Postgres makes scripting the database painless.

Follow the best practices for tuning your database. Increasing the shared memory helped a lot in my case. I also disabled fsync during development since I was dropping and loading a lot of data and didn't need any guarantees.

Comparisons

The final product is processing about ten times the data I had on Oracle, two to three times faster... Postgres certainly exceeded even my high expectations.

And while I don't mean to knock on Oracle too much, I feel very much like I'm wearing an 80's era polyester polo shirt with orange and brown stripes every time I declare a VARCHAR2.

One issue was constantly upping the temp tables and rollback space, as well as writing on-off procedures to do simple things like delete a few thousand rows. There was no limit to the amount of disk it seemed to need. Operations on a lot of data took an exceptionally long time. I spent several late nights rescuing a downed Oracle job because it'd run out of rollback space -- and the solution, committing in intervals, is horribly slow. Of course, this is a well-known, oft-encountered problem. And yet it's still a complete waste of time.

I did also try increasing the rollback segment sizes but having a lot of data in a rollback log wasn't any faster than committing in intervals. I'm not an Oracle expert by any means, but it didn't seem like a problem a mere mortal could solve.

Postgres is much better suited for large queries. There's quite a lot of interesting information available on Postgres' MVCC system and I think made the difference on this project. For example, which I discovered one of the tables had hundreds of thousands of NULL values I didn't need, I simply issued a DELETE FROM on the parent table and went on a coffee break. No commit intervals, no rollback, no nothing. It was done processing when I got back.

Another problem on Oracle was when I'd accidentally written a ton of data to the wrong table partitions while fixing something else. Oracle has a solution for this, alter table mytable enable row movement and then issue an UPDATE statement to modify the rows. Unfortunately this means it's using an insert and a delete statement for every row, so you're once again stuck writing procs to commit in intervals. That time fixing it took such an awful long time that I was forced to cancel the job when users began logging in the next morning. Much of the data is still in the wrong partition.

A final nail in the coffin for Oracle was LOBs. They suck. Not kinda suck, but turning-grey-at-23 suck. You're not allowed to have more than one defined per table and that would have been a nightmare for this project, since several of the Postgres source tables used the TEXT type where it made sense. Porting that data to Oracle would have taken a lot of time and probably would have made updating the schema automatically impractical.

Of course, I haven't even mentioned price. Since this project is available externally, Oracle licensing would have been prohibitive. Contrary to what many think, you really do have to pay for CPU licenses, partitioning, full text searching, and a whole lot more that you get for free with Postgres. As it stands, all the company had to pay for was some hardware and a small amount of development time. I'm happy to not allow Oracle to dictate how many cores I can use.

I'd faced some opposition to running Postgres in an enterprise setting. It seems Oracle has a free pass while Postgres must prove itself all over again despite great past experiences. In this case at least, Postgres did so in spades. I'm fortunate to have smart management that doesn't mind listening to me drone on every once in a while. Total time to write all of the tools, port the old data and update the report queries was about two weeks.