No Pugs

they're evil

Gentoo turns 10 today, and so I figured I’d make a post about one of my favorite aspects of Gentoo: the world file.

Such a simple idea and yet most package management systems seem to lack it.

When you install a package, it gets recorded in the world file, but its dependencies do not. This means you can always figure out which packages you need/don’t need.

Contrast this with many other package management systems, which do not distinguish between packages the user specifically installed by name and those pulled in as dependencies.

After some time passes and you've installed and removed packages, you wind up with a bunch of packages you're not sure are necessary or not.

With Gentoo, you can simply run

emerge --pretend --depclean

on an up-to-date system to see which packages the system thinks are no longer needed. You can then rerun the command without the pretend option if you don't see anything necessary in the list.
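Conceptually, depclean is just a reachability check: the world file is the root set, and any installed package not reachable from it through dependency edges is removable. Here's a rough sketch of that idea in Ruby (the package names and graph are made up, and Portage's real resolver is far more involved):

```ruby
require 'set'

# World-file idea in miniature: explicitly installed packages are roots;
# anything not reachable from them via dependency edges can be cleaned.
def depclean_candidates(world, deps, installed)
  keep  = Set.new
  queue = world.dup
  until queue.empty?
    pkg = queue.shift
    next if keep.include?(pkg)
    keep << pkg
    queue.concat(deps.fetch(pkg, []))  # follow dependency edges
  end
  installed - keep.to_a
end

deps      = { "firefox" => ["nss", "libpng"], "nss" => ["nspr"] }
installed = ["firefox", "nss", "nspr", "libpng", "old-toolkit"]

p depclean_candidates(["firefox"], deps, installed)  # ["old-toolkit"]
```

Only "old-toolkit" is flagged: everything else is reachable from the world entry "firefox".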

With other package management systems this can be a pain in the neck. For instance, on Debian, I used to have to go through dselect, look at each package, and use intuition to decide whether anything installed was no longer necessary. And with gem, the package management system for rubygems, I wind up maintaining my own world-file-like list by hand in a file I call required_gems; to remove unneeded packages, I delete every single gem and then reinstall the ones recorded in required_gems.

So all hail the world file and happy 10th anniversary Gentoo!

Published on 10/04/2009 at 12:30PM.

I’m in the process of learning Spanish and wanted to have something in Spanish to listen to.

I figured I'd try listening to the Spanish tracks from some of my DVDs for this purpose. The only ones I have that actually include a Spanish audio track are Pulp Fiction and my Simpsons DVDs.

To rip Pulp Fiction, I used the following command:

mplayer -alang es -vc dummy -vo null -ao pcm:fast -af resample=44100:0:0 -ao pcm:file=pf.18.wav -chapter 18-18 dvd://

This rips chapter 18. I repeated this for chapters 01 through 27. Note that you have to replace that chapter number in 3 places in the command each time. I prefer to have it broken up by chapter for skipping-around convenience, but if you want it all in one file, leave off -chapter.

If you want to rip from a different language, change what you pass to -alang, or remove -alang for the default language.

You can then encode them with LAME if you’d like.
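To avoid editing the chapter number in three places by hand, here's a hypothetical little Ruby wrapper (not from the original workflow; it assumes mplayer is on your PATH) that builds the command for each chapter:

```ruby
# Builds the mplayer rip command for one chapter; the chapter number gets
# substituted into all three places at once.
def rip_command(chapter, lang = "es")
  ch = format("%02d", chapter)
  "mplayer -alang #{lang} -vc dummy -vo null -ao pcm:fast " \
    "-af resample=44100:0:0 -ao pcm:file=pf.#{ch}.wav -chapter #{ch}-#{ch} dvd://"
end

(1..27).each do |n|
  cmd = rip_command(n)
  puts cmd
  # system(cmd)  # uncomment to actually run the rips
end
```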

By the way, chapter 16 is screwed up on my Pulp Fiction disc and won't rip. If you have this DVD and don't mind ripping the Spanish audio from that chapter for me, let me know. It's the "Where's my watch?" chapter.

Published on 06/16/2009 at 09:21PM.

Below is a quick Ruby program I threw together to try to make a very rough simulation of the impact of rare and barely beneficial mutations in bacteria colonies.

The way this code is set up, there's a strong type of bacteria (with the beneficial mutation) and a weak type of bacteria (without the beneficial mutation).

The way these numbers are currently set, the colony starts with 1 weak bacterium and grows from there. There's a 1 in 10 million chance that a weak bacterium will gain the strong mutation, or that a strong bacterium will lose its strong mutation. The benefit of the mutation is a 0.1% improvement in survival rate. Any time the colony grows above capacity, a bunch are killed off (the current survival rate is 50%, so half die every time the colony grows above 4 million bacteria).

Feel free to play with the numbers and see the various outcomes.

A quick explanation of the algorithm: I don't test every single bacterium to see if it survives. Instead, suppose 49.9% of them are supposed to die and there are 5 bacteria: 5 * 0.499 = 2.495, so I kill 2 bacteria outright and then generate a random number between 0 and 1 to test against the leftover 0.495 to decide whether a 3rd one dies. Not testing every single bacterium directly makes it less like nature, of course, but it also makes the algorithm much faster.

MUTATION_RATE = 1.0 / 10_000_000
CAPACITY = 4_000_000

BENEFIT = 0.001
SURVIVAL_RATE = 0.5

class Colony
  attr_accessor :survival_rate, :count
  def initialize rate, count = 0
    self.survival_rate = rate
    self.count = count
  end

  def double
    self.count *= 2
  end

  def kill
    self.count = self.count.cut(survival_rate)
  end
end

class Experiment
  attr_accessor :colony_w, :colony_s, :milestones, :capacity, :mutation_rate,
    :generations, :min_w, :min_s, :max_w, :max_s, :first_mutation

  def initialize survival_rate, benefit, capacity, mutation_rate
    self.capacity = capacity
    self.mutation_rate = mutation_rate
    self.colony_w = Colony.new(survival_rate, 1)
    self.colony_s = Colony.new(survival_rate + benefit)
    self.milestones = (0..10).map {|i| i * 0.1}
    self.generations = 0
    self.min_w = colony_w.count
    self.min_s = colony_s.count
    self.max_w = 0
    self.max_s = 0
    update_min_max
  end

  def update_min_max
    self.min_w = [min_w, colony_w.count].min
    self.min_s = [min_s, colony_s.count].min
    self.max_w = [max_w, colony_w.count].max
    self.max_s = [max_s, colony_s.count].max
  end

  def total
    colony_w.count + colony_s.count
  end

  def percent_w
    colony_w.count.to_f / total
  end

  def percent_s
    colony_s.count.to_f / total
  end

  def double
    [colony_w, colony_s].each {|c| c.double}
  end

  def above_capacity?
    total > capacity
  end

  def tick
    run
    update_min_max
    check_milestone
    if generations % 10_000 == 0
      print_status
    end
  end

  def check_milestone
    if percent_s > milestones[0]
      stone = milestones.shift
      puts "colony_s has reached #{stone * 100}% of the population at generation #{generations}.
      total population: #{total}
      "
    end
  end

  def print_status
    puts "gen #{generations}: weak: #{colony_w.count} strong: #{colony_s.count}"
  end
  
  def mutate
    cols = [colony_s,colony_w]
    [cols, cols.reverse].each do |cs|
      mutated = cs[0].count.cut(mutation_rate)
      if mutated > 0
        cs[1].count += mutated
        cs[0].count -= mutated
        
        unless first_mutation
          puts "first mutation arises at gen #{generations}"
          print_status
          self.first_mutation = true
        end
      end
    end
  end

  def run
    double
    mutate
    while above_capacity?
      colony_w.kill
      colony_s.kill
    end
    self.generations += 1
  end
end

Integer.class_eval do
  def cut prob
    full = self * prob
    whole = full.to_i
    partial = full - whole

    rand < partial ? whole + 1 : whole
  end
end

e = Experiment.new SURVIVAL_RATE, BENEFIT, CAPACITY, MUTATION_RATE

e.print_status

while e.percent_w > 0.001 && e.generations <= 500_000_000
  e.tick
end

puts "took #{e.generations} generations to get population_strong from min of #{e.min_s}
to #{e.colony_s.count} (#{e.percent_s * 100}%)
and populations_weak from #{e.min_w} to a max of #{e.max_w} down
to #{e.colony_w.count} (#{e.percent_w * 100})%"

Here are some of the results I get when running it with different values:

Using the current values

gen 0: weak: 1 strong: 0
first mutation arises at gen 23
gen 23: weak: 4194303 strong: 1
colony_s has reached 0.0% of the population at generation 29.
      total population: 2097152

colony_s has reached 10.0% of the population at generation 3911.
      total population: 2329272

colony_s has reached 20.0% of the population at generation 4317.
      total population: 2620602

colony_s has reached 30.0% of the population at generation 4587.
      total population: 2995564

colony_s has reached 40.0% of the population at generation 4808.
      total population: 3494804

colony_s has reached 50.0% of the population at generation 5010.
      total population: 2097183

colony_s has reached 60.0% of the population at generation 5213.
      total population: 2621926

colony_s has reached 70.0% of the population at generation 5434.
      total population: 3495609

colony_s has reached 80.0% of the population at generation 5703.
      total population: 2622864

colony_s has reached 90.0% of the population at generation 6108.
      total population: 2623759

took 7307 generations to get population_strong from min of 0
to 3258961 (99.0003848879678%)
and populations_weak from 1 to a max of 2097152 down
to 32906 (0.999615112032169)%

So with a benefit of 0.001, it took 23 generations for the first mutation to surface, 5010 generations for it to match the frequency of the weaker bacteria, and 7307 generations for the strong bacteria to make up over 99% of the colony.

With a 0.0001 benefit (strong bacteria have a 0.01% better chance of survival than the weak bacteria)

gen 0: weak: 1 strong: 0
first mutation arises at gen 20
gen 20: weak: 2097151 strong: 1
colony_s has reached 0.0% of the population at generation 21.
      total population: 2097152

gen 10000: weak: 2095041 strong: 6691
gen 20000: weak: 2093070 strong: 56031
colony_s has reached 10.0% of the population at generation 27047.
      total population: 2324111

gen 30000: weak: 2091153 strong: 420262
colony_s has reached 20.0% of the population at generation 31090.
      total population: 2613776

colony_s has reached 30.0% of the population at generation 33781.
      total population: 2986635

colony_s has reached 40.0% of the population at generation 35989.
      total population: 3484132

colony_s has reached 50.0% of the population at generation 38015.
      total population: 2090424

gen 40000: weak: 1045189 strong: 1554583
colony_s has reached 60.0% of the population at generation 40043.
      total population: 2613202

colony_s has reached 70.0% of the population at generation 42254.
      total population: 3485008

colony_s has reached 80.0% of the population at generation 44952.
      total population: 2615286

colony_s has reached 90.0% of the population at generation 49018.
      total population: 2620706

gen 50000: weak: 262319 strong: 2870166
gen 60000: weak: 33896 strong: 2649526
gen 70000: weak: 5297 strong: 2445672
took 75938 generations to get population_strong from min of 0
to 2004344 (99.9000174446134%)
and populations_weak from 1 to a max of 2097151 down
to 2006 (0.0999825553866474)%

It took 38015 generations for the strong and weak bacteria to be equal in numbers, and 75938 generations for the strong bacteria to go from 0 to over 99% of the population. Almost 10 times longer, which is interesting because the benefit was decreased by a factor of 10.

With 0 benefit

Note: for this run I changed the code to print out the status only every 50 million generations, and to run for a maximum of 500 million generations.

gen 0: weak: 1 strong: 0
first mutation arises at gen 25
gen 25: weak: 4194303 strong: 1
colony_s has reached 0.0% of the population at generation 27.
      total population: 2097151

colony_s has reached 10.0% of the population at generation 1117956.
      total population: 2097329

colony_s has reached 20.0% of the population at generation 2550113.
      total population: 2097589

colony_s has reached 30.0% of the population at generation 4577301.
      total population: 2097995

colony_s has reached 40.0% of the population at generation 8054126.
      total population: 2097269

colony_s has reached 50.0% of the population at generation 35885018.
      total population: 2101332

gen 50000000: weak: 1052661 strong: 1052518
gen 100000000: weak: 1051453 strong: 1052153
gen 150000000: weak: 1051255 strong: 1050806
gen 200000000: weak: 1052545 strong: 1051996
gen 250000000: weak: 1057002 strong: 1055415
gen 300000000: weak: 1053651 strong: 1055580
gen 350000000: weak: 1057720 strong: 1056790
gen 400000000: weak: 1059992 strong: 1059863
gen 450000000: weak: 1060187 strong: 1059305
gen 500000000: weak: 1058117 strong: 1059018
took 500000001 generations to get population_strong from min of 0
to 1059018 (50.0212787564326%)
and populations_weak from 1 to a max of 2097152 down
to 1058117 (49.9787212435674)%

Looks like it works its way to 50% very slowly, and then stays there. This is interesting and isn't what I expected; I had expected it to stay at whatever frequency it was at when it reached capacity. I guess what is happening is that whichever type there is more of will have more bacteria mutate into the other type than it gets back in return.
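That intuition checks out with a little arithmetic: with a symmetric per-cell mutation rate, the expected number of mutants flowing out of each colony is proportional to its size, so the larger colony always loses more than it gains, and the net flow hits zero only at 50/50. A quick standalone sketch using the simulation's rate:

```ruby
MUTATION_RATE = 1.0 / 10_000_000

# Expected net number of bacteria flowing out of the first colony and into
# the second in a single generation.
def net_flow(from_count, to_count)
  from_count * MUTATION_RATE - to_count * MUTATION_RATE
end

p net_flow(2_000_000, 100_000)    # positive: the bigger colony loses ground
p net_flow(1_050_000, 1_050_000)  # 0.0: equilibrium at equal counts
```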

Published on 03/20/2009 at 04:00PM.

I've added an uninstall command to ext for removing subprojects from being managed by externals.

Use:

ext uninstall some_project

To stop tracking some_project. This will not remove the files in some_project's directory. If you want to do that, use the -f/--force_removal option:

ext uninstall -f some_project

This will remove all of the files and the some_project directory itself.

To switch a project to a new repository, you can do something like the following:

ext uninstall -f some_project
ext install new/repository/url/some_project

This is the type of thing I added the feature for. I moved a couple of plugins from subversion to git recently, and didn't feel like manually changing the .externals and ignore files.

Enjoy!

Published on 02/25/2009 at 12:40PM.

I almost always use postgresql when working on a rails application. I won't list all the little reasons why, but a major reason is transactional DDL statements, which mean that when a migration fails, I don't have to run a bunch of cleanup queries to get my database back to how it was before the migration was run.

When I was setting up this instance of typo, I decided I'd go ahead and go with mysql since I didn't plan to hack on typo very much. Long story short: I decided to migrate from mysql to postgresql. This howto was done with mysql 5.0.70, postgresql 8.3.5, and typo 5.1.3. It will probably work with any mysql 5+ and postgresql 8+.

In case anybody else out there is interested in doing likewise, here's how I did it. These steps are for a production database, but the changes required for a development database should be obvious.

Step 0: Backup your data

You don’t really need to be told this, do you?

Step 1: Dump the data from mysql

Run the following to dump the data:

mysqldump --compatible=postgresql --no-create-info -u root -p --skip-extended-insert --complete-insert --skip-opt typo > typo.dump 

We are only dumping the data, hence the --no-create-info option.

Step 2: Create your postgresql database

You can do this however you see fit. I’ve included how I do it in case it’s useful:

CREATE USER typo_prod;
CREATE DATABASE typo_prod OWNER typo_prod ENCODING 'utf8';

\password typo_prod

and enter the password you wish to use.

Step 3: Change your database.yml to use your new postgresql database

Again, do this however you want. Here’s my database.yml with passwords omitted:

defaults: &defaults
  database: typo
  adapter: postgresql
  encoding: utf8
  host: localhost
  password: 

development:
  username: typo_dev
  database: typo_dev
  <<: *defaults

test:
  username: typo_test
  database: typo_test
  <<: *defaults

production:
  username: typo_prod
  database: typo_prod
  password: 
  host: salmon
  <<: *defaults

Step 4: Create the schema in your new database

To do this we’ll run the db:migrate rake task

RAILS_ENV="production" rake db:migrate

Step 5: Fire up a rails console to fix stuff

Now we need to fire up a rails console to do a lot of necessary cleanup work before we can import our data

ruby script/console production

Once it’s ready to go, type (or more practically, copy pasta)

conn = ActiveRecord::Base.connection

We’ll need this for a lot of the commands we have yet to run. You’ll keep this console open for the remainder of this howto. Any ruby code you see in this document will go into this console.

Step 6: Remove data created during the migrations

The typo migrations automatically add some default data, like some default pages/articles/blog. All of the data we want is in the dump we created earlier. Let’s delete all this stuff that’s in the way

conn.tables.each do |table|
  conn.execute "delete from #{table}"
end

Step 7: Temporarily change boolean columns to integers

mysqldump dumps its booleans as 0/1, which postgres interprets as integers. It will not automatically cast them into booleans just because the column is boolean (I'm not sure why). It's too time consuming to add casts to all of these 0/1's, and a regular expression to use with sed would be far too complex to bother with, since not all 1's and 0's in the dump correspond to boolean data.

So, we will temporarily change the boolean columns in our shiny new database to integers. Before we do this, we need to temporarily drop the defaults for these boolean columns because there won’t be an implicit cast from false/true to 0/1.

This code will build a couple of hashes to store which columns are booleans and what the defaults are.

bools = {}
defaults = {}


conn.tables.each do |table|
  conn.columns(table).each do |col|
    if col.type.to_s == "boolean"
      (bools[table] ||= []) << col.name
      (defaults[table] ||= {})[col.name] = col.default if !col.default.nil?
    end
  end
end

Here's the value of bools and defaults in my console after running the above code:

#bools
{"resources"=>["itunes_metadata", "itunes_explicit"], "contents"=>["published", 
"allow_pings", "allow_comments"], "users"=>["notify_via_email", 
"notify_on_new_articles", "notify_on_comments", "notify_watch_my_articles", 
"notify_via_jabber"], "feedback"=>["published", "status_confirmed"], 
"categorizations"=>["is_primary"]}
#defaults
{"contents"=>{"published"=>false}, "feedback"=>{"published"=>false}}

Let's now temporarily drop the defaults:

defaults.each_pair do |table,cols|
  cols.each_key do |col|
    conn.execute "alter table #{table} alter column #{col} DROP DEFAULT"      
  end
end

Now let’s alter the column types for the columns in bools.

We’ll use a closure to run the alter statements, so that we can use it again later to alter them back to booleans.

change_to_type = proc {|to_type|
  bools.each_pair do |table, cols|
    cols.each do |col|
      conn.execute "alter table #{table} alter column #{col} type #{to_type} 
                                USING (#{col}::#{to_type});"
    end
  end
}

change_to_type.call :integer

Step 8: Load the data dump into the new database

Ah, finally. Let’s load the data. Back to a shell in a directory with the dump, run:

sed "s/\\\'/\'\'/g" typo.dump | sed "s/\\\r/\r/g" | sed "s/\\\n/\n/g" | psql -1 typo_prod

Pass whatever options you need to connect to psql as you normally would. The first sed converts each \' into two consecutive single quotes (''), which is what psql expects. The next two calls to sed in the pipeline replace the escaped carriage returns and newlines with actual carriage returns and newlines, which is again what psql expects.
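For the curious, the same three rewrites can be expressed in Ruby (illustration only; the sed pipeline above is what actually gets run, and it assumes MySQL-style backslash escapes in the dump):

```ruby
# Mirror of the sed pipeline: fix quote escaping and un-escape CR/LF
# sequences so psql will accept the mysqldump output.
def pg_friendly(line)
  line.gsub("\\'", "''")  # \'  becomes ''  (standard SQL quote escaping)
      .gsub("\\r", "\r")  # literal backslash-r becomes a carriage return
      .gsub("\\n", "\n")  # literal backslash-n becomes a newline
end

p pg_friendly("INSERT INTO posts VALUES ('it\\'s fine\\nreally');")
```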

You may get a couple warnings, but hopefully no errors. The few warnings I received were inconsequential.

Step 9: Change the boolean columns back to boolean and restore the defaults

Back to our rails console. We now have the data in place and can change the columns back using our closure from earlier:

change_to_type.call :boolean

And then restore the defaults we dropped:

defaults.each_pair do |table, cols|
  cols.each_pair do |col, default|
    conn.execute "alter table #{table} alter column #{col} SET DEFAULT #{default}"
  end
end

Step 10: Repair the sequences.

Another annoying aspect of postgresql is that inserting a value into a serial column doesn't automatically advance the sequence to be ready to serve up an unused value. There will be a sequence called "#{table}_id_seq" for each table with an id column in the database.

We have to advance all of the sequences manually:

conn.tables.each do |table|
  if conn.columns(table).detect{|i|i.name == "id"}
    conn.execute "SELECT setval('#{table}_id_seq', (SELECT max(id) FROM #{table}))"
  end
end

Conclusion

So that should do it. Restart your mongrel cluster (or whatever you are using to manage your rails server processes) and you should now be using your blog with a postgresql backend!

Published on 01/04/2009 at 11:44PM.

So you've written some helper script, or possibly a script that's run by cron to do some background work on your site (updating full text indexes, sending out notification emails, generating reports, etc.), but you can't find a way to debug it in rails so that it hits your breakpoints. Annoying.

What I did to solve this was to create a rake task that generates other rake tasks based on script name. This way you can debug a script the way you would debug any rake task (and it's also convenient to be able to execute scripts from the rake context menu anyway). This allows you to run any script by right-clicking on the project in Netbeans and going to "Run/Debug Rake Task->script->your_killer_script.rb"

Place this in a file called scripts.rake in your lib/tasks folder:

require 'find'

namespace :scripts do
  Find.find("#{RAILS_ROOT}/script/") do |p|
    if File.file?(p) && p !~ /(\.svn-base|\.netbeans-base)$/
      desc "Run #{File.basename(p)}"
      task File.basename(p, "*") => :environment do
        load p
      end
    end
  end
end

Then, right click on your project and hit “Run Rake Task->Refresh List”

You should now be able to right click on your project and hit "Debug Rake Task->script->your_killer_script.rb"

It should hit any breakpoints you have set. Happy debugging!

Published on 12/12/2008 at 02:12AM.

In Netbeans, to test a Ruby application normally I right click on the project and go to “Run Rake Task -> test”

A problem arises when I try to debug a test: none of the breakpoints get hit. I think this is because a new process is spawned off to actually run the tests while the debugger is attached to the parent process, so you can really only hit breakpoints involved in spawning the tests, but none in the tests themselves or in any of your application code called by the tests.

Opening an individual test and right clicking in the buffer and hitting “Debug your_mom_test.rb” seems to fail for me with rails 2.1. I was able to correct this by changing the line at the top of the test from

require 'test_helper'

to

$:.unshift File.join(File.dirname(__FILE__), '..', 'lib') if $0 == __FILE__
require File.dirname(__FILE__) + '/../test_helper'

Enjoy hitting your breakpoints while testing!

Published on 12/12/2008 at 02:00AM.

I've added a freeze feature to ext.

It works like this:

ext freeze project_name [revision]

This records the revision in the .externals file under that subproject's entry. When the project is checked out or exported, that revision is fetched.

If you leave the revision off of the command, it uses the revision the project is currently checked out at.

Enjoy!

Published on 10/17/2008 at 11:48PM.

I recently switched from mephisto to typo, mostly due to lack of activity and features in the mephisto project.

With mephisto, I had to make a modification to allow me to use permalinks without the date to access articles. For example, so I could say http://nopugs.com/permalinks-without-dates instead of http://nopugs.com/2008/09/11/permalinks-without-dates. I'm not sure why this isn't already a feature of such blogging systems. I'm somewhat new to blogging, so perhaps there's a good reason. Maybe there are sometimes different articles with the same permalink (though I don't know why somebody would do that).

Regardless, here’s how I modified typo to allow me to do this.

In app/controllers/redirect_controller.rb change the redirect method from this:

 def redirect
    if (params[:from].first == 'articles')
      path = request.path.sub('/articles', '')
      url_root = request.relative_url_root
      path = url_root + path unless url_root.nil?
      redirect_to path, :status => 301
      return
    end

    r = Redirect.find_by_from_path(params[:from].join("/"))

    if(r)
      path = r.to_path
      url_root = request.relative_url_root
      path = url_root + path unless url_root.nil? or path[0,url_root.length] == url_root
      redirect_to path, :status => 301
    else
      render :text => "Page not found", :status => 404
    end
  end

to this:

  def redirect
    if (params[:from].first == 'articles')
      path = request.path.sub('/articles', '')
      url_root = request.relative_url_root
      path = url_root + path unless url_root.nil?
      redirect_to path, :status => 301
      return
    end
   
    article = Article.find_by_permalink(params[:from].first)
   
    if article
      redirect_to article_path(article)
      return
    end

    r = Redirect.find_by_from_path(params[:from].join("/"))

    if(r)
      path = r.to_path
      url_root = request.relative_url_root
      path = url_root + path unless url_root.nil? or path[0,url_root.length] == url_root
      redirect_to path, :status => 301
    else
      render :text => "Page not found", :status => 404
    end
  end

Then in app/models/article.rb change find_by_permalink from this:

  def self.find_by_permalink(year, month=nil, day=nil, title=nil)
    unless month
      case year
      when Hash
        year, month, day, title = date_from(year)
      when Array
        year, month, day, title = year
      end
    end
    date_range = self.time_delta(year, month, day)
    find_published(:first,
                   :conditions => { :permalink => title,
                                    :published_at => date_range }) \
      or raise ActiveRecord::RecordNotFound
  end

to this:

  def self.find_by_permalink(year, month=nil, day=nil, title=nil)
    unless month
      case year
      when Hash
        year, month, day, title = date_from(year)
      when Array
        year, month, day, title = year
      end
    end
    
    if year && !month && !day && !title
      year, title = title, year
    end
    
    published = nil
    
    if year
      date_range = self.time_delta(year, month, day)
      published = find_published(:first,
        :conditions => { :permalink => title,
          :published_at => date_range }) 
    end
    
    unless published
      published = find_published(:first, :conditions => {:permalink => title})
    end
      
    raise ActiveRecord::RecordNotFound unless published
    published
  end

Then you should be able to leave the year/month/day off of your article URLs when sharing them with friends and such.

Enjoy!

Published on 09/11/2008 at 06:59PM.

What is externals and what is it used for?

externals allows you to make use of an svn:externals-like workflow with any combination of SCMs. What is the svn:externals workflow? I would describe it roughly like this:

You register subprojects with your main project. When you check out the main project, the subprojects are automatically checked out. Doing a 'status' will tell you the changes in the main project and any subprojects from where it's run. You commit changes to the projects all separately as needed. If somebody else does an update, they will get the changes to the subprojects as well.

For a more detailed explanation of why I started the externals project, please visit http://nopugs.com/why-ext It’s largely a rant about git-submodule.

On with the tutorial

Installation

ext should run on unix-like systems and windows systems. All the unit tests pass on Linux and Windows Vista (with cygwin).

First we need to install externals. The first, and easiest, method is to use gem:

gem install ext

The other method is to use github:

git clone git://github.com/azimux/externals.git
chmod u+x externals/bin/ext

If you install using git clone instead of rubygems, be sure to add the externals/bin directory to your path.

Creating a repository to play around with

I will use git for the main project, and git and subversion for the subprojects (the tutorial would be mostly identical if I used svn for the main project; that's part of the point of ext).

Now let’s create a repository for use with our project. I like to test out stuff like this in my ~/tmp/ folder.

cd
mkdir tmp
cd tmp

mkdir repo
mkdir work

cd repo
mkdir rails_app.git
cd rails_app.git
git init --bare

Now let’s go to our work directory and make a rails app to push to this repository.

cd ../../work/
rails rails_app
cd rails_app
git init
git add . 
git commit -m "created fresh rails app"
git remote add origin ../../repo/rails_app.git 
git push origin master

If you’re like me, you consider empty directories in your project’s directory structure to be part of the project. Git will not track empty directories. So, here’s our first use of ext:

ext touch_emptydirs
git add .
git commit -m "touched empty dirs"
git push

This adds a .emptydir file to every empty directory so that git will track these folders.
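If you're curious what that looks like under the hood, here's a rough re-implementation (my own sketch, not ext's actual code):

```ruby
require 'find'
require 'fileutils'

# Walk the tree and drop a .emptydir marker file into every empty
# directory, giving git a file to track in each one.
def touch_empty_dirs(root)
  Find.find(root) do |path|
    next unless File.directory?(path)
    FileUtils.touch(File.join(path, ".emptydir")) if Dir.children(path).empty?
  end
end
```

For example, running `touch_empty_dirs(".")` from the project root would mark every empty directory below it.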

Using “ext install” to register subprojects.

Now for our second use of ext. Let’s add the current edge rails to our application:

ext install git://github.com/rails/rails.git

It should take a moment because rails is a large project.

Now that that’s done, let’s see what “ext install” did.

$ cat .externals 
[.]
scm = git
type = rails

[vendor/rails]
path = vendor/rails
repository = git://github.com/rails/rails.git
scm = git

.externals is the externals configuration file. This is the file used to keep track of your subprojects. Projects are stored in the form:

[path/to/project]
repository = urlfor://project.repository/url
branch = somebranch
scm = git/svn

The format is very similar to ini format. The section name is the path to the project. The main project’s settings are stored under [.]
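The ini-style layout is simple enough to parse in a few lines; this sketch is just for illustration (ext ships its own parser):

```ruby
# Parse the ini-like .externals format into a hash of sections, keyed by
# the section name (the subproject path).
def parse_externals(text)
  sections = {}
  current  = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line =~ /\A\[(.+)\]\z/
      current = Regexp.last_match(1)
      sections[current] = {}
    elsif current && line =~ /\A(\w+)\s*=\s*(.+)\z/
      sections[current][Regexp.last_match(1)] = Regexp.last_match(2)
    end
  end
  sections
end

conf = parse_externals(<<~EXT)
  [.]
  scm = git
  type = rails

  [vendor/rails]
  repository = git://github.com/rails/rails.git
  scm = git
EXT

p conf["vendor/rails"]["repository"]  # "git://github.com/rails/rails.git"
```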

Some things to notice: externals was automatically able to figure out that we're using git for the main project (scm = git under [.]). Also, note that the type of the main project has been detected as rails (type = rails). This means that we can leave the paths off of the repositories in .externals (when using "ext install") and ext will automatically know where to install stuff (if it's called rails it goes in vendor/rails, otherwise it goes in vendor/plugins/). Let's make sure it's there:

$ ls vendor/rails
Rakefile      activemodel     activesupport  pushgems.rb
actionmailer  activerecord    ci             railties
actionpack    activeresource  doc            release.rb

That’s not all, take a look at the ignore file:

$ cat .gitignore
vendor/rails

This makes sense because we don’t want the main repository to track any of the files in the subproject. The files in the subproject are tracked by their own repository, possibly of a different SCM than the main project.

Let's add some more subprojects: some rails plugins this time. We'll add a couple that are tracked under subversion and one tracked under git to demonstrate how ext is SCM agnostic.

ext install git://github.com/lazyatom/engines -b edge
ext install svn://rubyforge.org/var/svn/redhillonrails/trunk/vendor/plugins/redhillonrails_core
ext install svn://rubyforge.org/var/svn/redhillonrails/trunk/vendor/plugins/foreign_key_migrations

Let's see if our plugins made it:

$ du --max-depth=2 -h vendor/plugins/ | grep lib
252K    vendor/plugins/foreign_key_migrations/lib
340K    vendor/plugins/redhillonrails_core/lib
24K vendor/plugins/engines/lib

looks good

$ cat .externals 
[.]
scm = git
type = rails

[vendor/rails]
path = vendor/rails
repository = git://github.com/rails/rails.git
scm = git

[vendor/plugins/engines]
path = vendor/plugins/engines
repository = git://github.com/lazyatom/engines
scm = git
branch = edge

[vendor/plugins/redhillonrails_core]
path = vendor/plugins/redhillonrails_core
repository = svn://rubyforge.org/var/svn/redhillonrails/trunk/vendor/plugins/redhillonrails_core
scm = svn

[vendor/plugins/foreign_key_migrations]
path = vendor/plugins/foreign_key_migrations
repository = svn://rubyforge.org/var/svn/redhillonrails/trunk/vendor/plugins/foreign_key_migrations
scm = svn
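Incidentally, the .externals file is a simple INI-style format that you can read or edit by hand. A minimal sketch of parsing it with plain Ruby (illustrative only, not ext’s real parser):

```ruby
# Parse an INI-style .externals string into a hash of sections,
# e.g. {"." => {"scm" => "git", ...}, "vendor/rails" => {...}}.
# Illustrative sketch; not ext's actual parser.
def parse_externals(text)
  sections = {}
  current  = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line =~ /\A\[(.+)\]\z/          # section header, e.g. [vendor/rails]
      current = sections[$1] = {}
    elsif current && line =~ /\A(\w+)\s*=\s*(.+)\z/  # key = value
      current[$1] = $2
    end
  end
  sections
end
```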

…and the ignore file…

$ cat .gitignore 
vendor/rails
vendor/plugins/engines
vendor/plugins/foreign_key_migrations
vendor/plugins/redhillonrails_core

also looks very good!

Something worth noting: if we were using svn for our main project, ext is smart enough to set the ignores using ‘svn propset svn:ignore’ on the appropriate directories.
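The two ignore mechanisms boil down to different commands. Here is a hypothetical Ruby helper (not ext’s code) showing what each scm’s ignore step looks like:

```ruby
# Sketch of the scm-specific ignore step for a subproject path.
# Hypothetical helper for illustration; not taken from ext itself.
def ignore_command(main_scm, subproject_path)
  dir, name = File.split(subproject_path)
  if main_scm == "git"
    # git: append the subproject path to the top-level .gitignore
    %(echo "#{subproject_path}" >> .gitignore)
  else
    # svn: set svn:ignore on the parent directory (a real
    # implementation would merge with any existing ignores,
    # since propset replaces the whole property)
    %(svn propset svn:ignore "#{name}" #{dir})
  end
end
```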

Let’s now commit and push our work.

git add .
git commit -m "added 4 subprojects"
git push

Using “ext checkout” and “ext export”

And now let’s delete our working copy and check it out again to make sure we get the subprojects:

cd ..
rm -rf rails_app
ext checkout ../repo/rails_app.git

It will take a moment as it clones rails from github again.

Let’s make sure all of the subprojects were checked out properly:

$ cd rails_app
$ du --max-depth=3 -h vendor/ | grep lib
12K     vendor/plugins/engines/lib
66K     vendor/plugins/foreign_key_migrations/lib
162K    vendor/plugins/redhillonrails_core/lib
382K    vendor/rails/actionmailer/lib
1.5M    vendor/rails/actionpack/lib
104K    vendor/rails/activemodel/lib
791K    vendor/rails/activerecord/lib
92K     vendor/rails/activeresource/lib
2.4M    vendor/rails/activesupport/lib
584K    vendor/rails/railties/lib

Let’s also make sure the engines plugin is on a branch called “edge” (which is tracking the remote repository’s edge branch):

$ cd vendor/plugins/engines
$ git branch -a
* edge
  master
  origin/HEAD
  origin/add_test_for_rake_task_redefinition
  origin/edge
  origin/master
  origin/timestamped_migrations

Notice how the subprojects were automatically fetched. As mentioned in the why ext article, the main project is usually incapable of functioning without its subprojects, so it makes sense to fetch the subprojects when we do a checkout or export. (This is what svn checkout does when it checks out a folder that has svn:externals set on it: it fetches the external projects automatically, which is very convenient.)

Note that you can use “ext export” instead of “ext checkout” if you don’t want histories to accompany the files. This tells ext to use “svn export” for subversion-managed (sub)projects and “git clone --depth 1” for git-managed (sub)projects. This can save a lot of time and is useful for deployment.
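Assuming those are the commands ext shells out to, the export step per subproject can be sketched as:

```ruby
# Sketch of the history-less fetch command per scm, as described
# above. Assumes ext shells out to these exact commands; the
# helper itself is illustrative, not ext's code.
def export_command(scm, repository, path)
  case scm
  when "git" then "git clone --depth 1 #{repository} #{path}"
  when "svn" then "svn export #{repository} #{path}"
  else raise ArgumentError, "unknown scm: #{scm}"
  end
end
```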

Looks good. Let’s go back to the rails_app directory to continue the tutorial:

cd ../../../

“ext status” propagates through subprojects

Let’s modify a subproject.

echo "lol, internet" >> vendor/plugins/foreign_key_migrations/README

And now let’s check the status

$ ext status
status for .:
# On branch master
nothing to commit (working directory clean)

status for vendor/rails:
# On branch master
nothing to commit (working directory clean)

status for vendor/plugins/engines:
# On branch master
nothing to commit (working directory clean)

status for vendor/plugins/redhillonrails_core:


status for vendor/plugins/foreign_key_migrations:
M      README

As expected, foreign_key_migrations has a modified file. This same (very common) task is a bit of a pain in the neck with git-submodule (unless I’m missing something), and it is impossible when a subproject is not managed under the same source control system as the main project (as is the case in this example).

Deployment with capistrano

Most commands also have a short version. The short versions operate only on the subprojects, not the main project. “ext checkout” and “ext export” fetch the main project and its subprojects, but “ext co” and “ext ex” (meant to be run in the working folder of the main project; use --workdir to run them from elsewhere) fetch all the subprojects and don’t touch the main project.

If you deploy with capistrano, you can have all your subprojects fetched on deployment by adding the following to your deploy.rb:

task :after_update_code, :roles => :app do
  run "ext --workdir #{release_path} ex"
end

Notice how I chose to use “ex” instead of “co”. This is because I never do work from a deployed project’s working directory, so the history is pointless.

If people find externals useful, I’d be happy to add an :ext scm type to capistrano so that it runs ext instead of git/svn. Then it would pick up all the subprojects during a deploy without having to supply the above after_update_code task. I could also add a switch to rails’ “./script/plugin install” (perhaps -X) to tell it to use ext to manage the project (kind of how you can use -x to tell it to use svn:externals). Though, this isn’t really any easier than just doing “ext install”.

A few other tips

“ext help” will show you all the available commands. Also, feel free to manage the .externals file manually if you wish.

Conclusion

For issue tracking, at the moment I’m using lighthouseapp. Report bugs to http://ext.lighthouseapp.com/

I also have a rubyforge account for this project at http://rubyforge.org/projects/ext/ if you would prefer to submit bugs/feature requests via rubyforge’s tracking system. I’ve used both sites but never managed a project with either, so I don’t know which is better. Rubyforge seems to be more feature complete.

Externals is my first attempt at contributing a useful open source project to the community. If you have some tips for me in this regard, please feel free to share them.

Cheers!


Published on 09/06/2008 at 11:58PM under , , .

