No Pugs

they're evil

PostgreSQL

DISTINCT/:select is wiped out/overwritten by COUNT when using .size on an association proxy... with rails + PostgreSQL

If you create a named scope that uses :select, and then call “.size” or “.count” on the resulting proxy served by a call to the scoped method, and if that :select option sets DISTINCT, then it will be ignored because rails ignores the old :select and creates a new one of “COUNT(*) AS count_all”

Here is a made up example:

class Book < ActiveRecord::Base
  named_scope :on_a_favorite_list,
    :select => "DISTINCT ON (books.id) books.*",
    :joins => "INNER JOIN favorites_list_entries 
      ON favorites_list_entries.book_id = books.id"
end

Now let’s pretend that there is only one book on anybody’s favorites_list, but that two people have that book on their list.

Then this:

Book.on_a_favorite_list.map(&:id).size

would return 1, because size would be called on an array that only has 1 integer in it.

but…

Book.on_a_favorite_list.size

Would return 2! Why? Because the proxy object (remember, Book.on_a_favorite_list is NOT an array of books… it’s a proxy object that will fetch the books if needed) realizes it needs to create a COUNT query (it’s not going to actually fetch the books) and it does this by overriding the :select option from “DISTINCT ON (books.id) books.*” with “COUNT(*) as count_all”. So DISTINCT has been lost. Now the duplicate row appears to the select statement and 2 is returned as count_all to rails .size method.

One way around this is to pass the thing being counted, including the DISTINCT keyword, to size…

Book.on_a_favorite_list.size("DISTINCT books.id")

This will return the expected result of 1.

Published on 08/21/2010 at 08:56PM under , .

Fix for long index names in rails using postgresql (fix for "Input string is longer than NAMEDATALEN-1 (63)")

If you are getting this when running rails migrations using a postgresql adapter:

Input string is longer than NAMEDATALEN-1 (63)

Then this tiny plugin can solve this problem. Index names longer than 63 characters work fine with some postgresql adapters and not others. I’m not sure why. This adds a method to ActiveRecord::ConnectionAdapters::PostgreSQLAdapter that keeps the index_names under the limit.

here’s the entire piece of code:

ActiveRecord::ConnectionAdapters::PostgreSQLAdapter.class_eval do
  #There are PostgreSQL drivers/versions of drivers that truncate
  #index names to 63 characters. Since some do not, this override makes
  #sure that the truncate occurs regardless of PostgreSQL driver.
  def index_name(table, columns)
    super(table, columns)[0..62]
  end
end

To install this plugin using ext, you can use:

ext install git://github.com/azimux/ax_fix_long_psql_index_names.git

or, using rake plugin:

./script/plugin install git://github.com/azimux/ax_fix_long_psql_index_names.git

Published on 05/02/2010 at 04:26PM under , .

How to migrate typo from mysql to postgresql

I almost always use postgresql when working on a rails application. I won’t list all the little reasons why, but a major reason is for transactional DDL statements, which means when I run a migration that fails, I don’t have to then go run a bunch of cleanup queries to get my database back to how it was before the migration was ran.

When I was setting up this instance of typo, I decided I’d go ahead and go with mysql since I didn’t plan to hack on typo very much. Long story short: I decided to migrate from mysql to postgresql. This howto was done with mysql 5.0.70, postgresql 8.3.5, and typo 5.1.3 It probably will work with any mysql 5+ and postgresql 8+.

In case anybody else out there might be interested in doing likewise, here’s how I did it. These steps will be for a production database, but the changes required for doing it to a development database should be obvious.

Step 0: Backup your data

You don’t really need to be told this, do you?

Step 1: Dump the data from mysql

run the following to dump the data.

mysqldump --compatible=postgresql --no-create-info -u root -p --skip-extended-insert --complete-insert --skip-opt typo > typo.dump 

We are only dumping the data, hence the –no-create-info option.

Step 2: Create your postgresql database

You can do this however you see fit. I’ve included how I do it in case it’s useful:

CREATE USER typo_prod;
CREATE DATABASE typo_prod OWNER typo_prod ENCODING 'utf8';

\password typo_prod

and enter the password you wish to use.

Step 3: Change your database.yml to use your new postgresql database

Again, do this however you want. Here’s my database.yml with passwords omitted:

defaults: &defaults
  database: typo
  adapter: postgresql
  encoding: utf8
  host: localhost
  password: 

development:
  username: typo_dev
  database: typo_dev
  <<: *defaults

test:
  username: typo_test
  database: typo_test
  <<: *defaults

production:
  username: typo_prod
  database: typo_prod
  password: 
  host: salmon
  <<: *defaults

Step 4: Create the schema in your new database

To do this we’ll run the db:migrate rake task

RAILS_ENV="production" rake db:migrate

Step 5: Fire up a rails console to fix stuff

Now we need to fire up a rails console to do a lot of necessary cleanup work before we can import our data

ruby script/console production

Once it’s ready to go, type (or more practically, copy pasta)

conn = ActiveRecord::Base.connection

We’ll need this for a lot of the commands we have yet to run. You’ll keep this console open for the remainder of this howto. Any ruby code you see in this document will go into this console.

Step 6: Remove data created during the migrations

The typo migrations automatically add some default data, like some default pages/articles/blog. All of the data we want is in the dump we created earlier. Let’s delete all this stuff that’s in the way

conn.tables.each do |table|
  conn.execute "delete from #{table}"
end

Step 7: Temporarily change boolean columns to integers

mysqldump dumps it’s booleans as 0/1. These are interpreted by postgres as integers. It will not automatically cast these into booleans just because the column is boolean (I’m not sure why.) It’s too time consuming to go add casts to all of these 0/1’s, and a regular expression to use with sed would be far too complex to bother with since not all 1’s and 0’s in the dump correspond to boolean data.

So, we will temporarily change the boolean columns in our shiny new database to integers. Before we do this, we need to temporarily drop the defaults for these boolean columns because there won’t be an implicit cast from false/true to 0/1.

This code will build a couple of hashes to store which columns are booleans and what the defaults are.

bools = {}
defaults = {}


conn.tables.each do |table|
  conn.columns(table).each do |col|
    if col.type.to_s == "boolean"
      (bools[table] ||= []) << col.name
      (defaults[table] ||= {})[col.name] = col.default if !col.default.nil?
    end
  end
end

here’s the value of bools and defaults in my console after the above code:

#bools
{"resources"=>["itunes_metadata", "itunes_explicit"], "contents"=>["published", 
"allow_pings", "allow_comments"], "users"=>["notify_via_email", 
"notify_on_new_articles", "notify_on_comments", "notify_watch_my_articles", 
"notify_via_jabber"], "feedback"=>["published", "status_confirmed"], 
"categorizations"=>["is_primary"]}
#defaults
{"contents"=>{"published"=>false}, "feedback"=>{"published"=>false}}

Let’s now temporarily drop the defaults

defaults.each_pair do |table,cols|
  cols.each_key do |col|
    conn.execute "alter table #{table} alter column #{col} DROP DEFAULT"      
  end
end

Now let’s alter the column types for the columns in bools.

We’ll use a closure to run the alter statements, so that we can use it again later to alter them back to booleans.

change_to_type = proc {|to_type|
  bools.each_pair do |table, cols|
    cols.each do |col|
      conn.execute "alter table #{table} alter column #{col} type #{to_type} 
                                USING (#{col}::#{to_type});"
    end
  end
}

change_to_type.call :integer

Step 8: Load the data dump into the new database

Ah, finally. Let’s load the data. Back to a shell in a directory with the dump, run:

sed "s/\\\'/\'\'/g" typo.dump | sed "s/\\\r/\r/g" | sed "s/\\\n/\n/g" | psql -1 typo_prod

Pass whatever options you need to connect to psql as you normally would. The first sed converts all of the \’ to two consecutive ‘s, which is what psql expects. The next two calls to sed in the pipeline replace the escaped carriage returns and newlines with actual carriage returns and newlines, which is again what psql expects.

You may get a couple warnings, but hopefully no errors. The few warnings I received were inconsequential.

Step 9: Change the boolean columns back to boolean and restore the default columns

Back to our rails console. We now have the data in place and can change the columns back using our closure from earlier:

change_to_type.call :boolean

And then restore the defaults we dropped:

defaults.each_pair do |table, cols|
  cols.each_pair do |col, default|
    conn.execute "alter table #{table} alter column #{col} SET DEFAULT #{default}"
  end
end

Step 10: Repair the sequences.

Another annoying aspect of postgresql is that inserting a value into a serial column doesn’t automatically advance the sequence to be ready to serve up an unused value. There will be a sequence called “#{table}idseq” for each table with an id column in the database.

We manually have to advance all of the sequences:

conn.tables.each do |table|
  if conn.columns(table).detect{|i|i.name == "id"}
    conn.execute "SELECT setval('#{table}_id_seq', (SELECT max(id) FROM #{table}))"
  end
end

Conclusion

So that should do it. Restart your mongrel cluster (or whatever you are using to manage your rails server processes) and you should now be using your blog with a postgresql backend!

Published on 01/04/2009 at 11:44PM under , , , .

Powered by Typo – Thème Frédéric de Villamil | Photo Glenn