rb_spread on Mac OS X

Posted by daniel Thursday, November 13, 2008 16:39:00 GMT

I recently tried to start using the spread messaging toolkit on the mac, along with the associated ruby client api rb_spread, but had a few failures. I got everything running fine on a linux box, but doing the same thing on a Mac gave me this error when trying to use the ruby client:

% irb
irb(main):001:0> require 'spread.so'
dyld: NSLinkModule() error
dyld: Symbol not found: _SP_get_num_vs_offset_memb_mess
  Referenced from: /usr/local/lib/ruby/site_ruby/1.8/i686-darwin9.5.0/spread.bundle
  Expected in: flat namespace

Trace/BPT trap

There was not much information out there about this problem, so it took a while to figure out what the problem was.

The solution was to use an older version of the spread toolkit. Using version 3.17.x instead of the newer 4.0 version works perfectly.

Profiling Postgres with PQA

Posted by daniel Friday, October 10, 2008 17:00:35 GMT

If you need to understand what your database is really doing, you're going to need to profile its activity. If you are using postgres, then PQA (Practical Query Analysis) is your best bet.

PQA takes data from your production logs and creates a report for you. First you need to get logs from your database. Chances are, you have disabled full statement logging in your db, so you need to turn those back on.

I find it easiest to keep the PQA settings commented out in the postgresql.conf file. When you want to run profiling, you uncomment them for the duration of the run.

# Standard settings:
log_duration = false
log_min_duration_statement = -1

# PQA Settings 
#log_duration = true
#log_min_duration_statement = 0

You also need to install PQA. It's a regular gem, so you can install it like so:

sudo gem install pqa

Now you need to gather the logs. First, you need to enable the logging of the database statements on the database, as shown in the settings above.

Second, we tell postgres to re-read its config file. Like this:

% psql -U postgres
postgres -> select pg_reload_conf();

Telling it to reload the config will not take down the DB, but it will re-read postgresql.conf and change any logging-related behavior. Postgres is now logging. Check the logs to make sure you did not make a mistake. If everything is good, tell it to rotate logfiles, so you don't run your profiling with a half-empty one.

% psql -U postgres
postgres -> select pg_rotate_logfile();

You're now collecting data. How long you collect data for depends on how predictable the traffic is. One hour of logs can usually tell you quite a bit.

After enough time has passed, re-comment those settings in postgresql.conf, and once again tell postgres to reload the config file and roll over the log files.

It's likely that many files were generated during your data collection run. In this case, you need to collate them back together. For postgres files following the standard naming convention (postgresql-YYYY-MM-DD_HHMMSS.log) I've got a hacked-up script for this:

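# Glob on the full prefix so the combined output file is not picked up on a re-run.
files = Dir.glob("postgresql-*.log")
sorted = files.sort_by do |file|
  # Sort on date and time together, so a run that spans midnight stays in order.
  match = /postgresql-(\d{4})-(\d\d)-(\d\d)_(\d+)\.log/.match(file)
  match[1..4].join.to_i
end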

sorted.each do |f|
  cmd = "cat #{f} >> very_big_postgres_log.log"
  `#{cmd}`
end

Now you have a single big logfile. PQA documents the log format it accepts. Postgres 8.2 log output seems to be incompatible with it, so a tiny bit of log massaging needs to be done. Basically, you need to take the timezone out of each log line. You can do this on the command line with cut, or with ruby:

% ruby -p -e '$_.gsub!(/PDT/, "")' very_big_postgres_log.log > fixed_postgres_log.log
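The one-liner above hard-codes PDT. Here is a slightly more general sketch. It assumes each log line starts with a timestamp shaped like `2008-10-10 17:00:35 PDT`, which depends on your log_line_prefix setting, so treat the pattern as an assumption. `strip_timezone` is a made-up helper name. Like the one-liner, it removes only the abbreviation and leaves the surrounding spaces alone.

```ruby
# Hypothetical helper: remove the timezone abbreviation that follows a
# leading "YYYY-MM-DD HH:MM:SS " timestamp. The timestamp layout is an
# assumption about your log_line_prefix; adjust the pattern if yours differs.
TIMESTAMP_THEN_ZONE = /\A(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d )[A-Z]{2,5}\b/

def strip_timezone(line)
  # Keep the timestamp (capture 1), drop the abbreviation after it.
  line.sub(TIMESTAMP_THEN_ZONE, '\1')
end

puts strip_timezone("2008-10-10 17:00:35 PDT LOG:  duration: 1.2 ms")
```

Lines that do not start with a timestamp, such as the continuation lines of a multi-line statement, pass through unchanged.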

Then you do the analysis:

% pqa -normalize -logtype pglog -file fixed_postgres_log.log

to_plain_segments error

Posted by daniel Tuesday, September 23, 2008 08:14:00 GMT

I recently tried to start using metric-fu for code statistics. This makes use of the rcov gem, as well as a few other gems. It generates the statistics with the rake task rake metrics:all_with_migrate. Somehow, this caused a slew of errors in the functional tests, all looking like this:

NoMethodError: undefined method `to_plain_segments' for #<ActionController::Routing::RouteSet:0x2aaaacad5648>

On the Mac, everything worked fine. On Fedora 6 hosts running Ruby 1.8.5 or 1.8.6 I would get the error. A bit of googling found that some rspec people are getting it too, but it's not rspec's fault. It's rcov.

I updated my rcov to the not-yet-officially-released version 0.8.3.1, thanks to a github branch of rcov.

sudo gem install spicycode-rcov -s http://gems.github.com

That fixes it, and tests run fine now.

Accessing Trac with Ruby's HTTP Library

Posted by daniel Friday, August 08, 2008 09:06:00 GMT

I needed to access a Trac repository which was password-protected with basic authentication. Using ruby's standard http libraries, my first try did not work.

This is what does not work:

def trac_scraper_that_not_work
    http = Net::HTTP.new('some_trac_hoster.com', 443)
    http.use_ssl = true
    http.start do |http|

      login_location = '/trac/<something>/login'
      req = Net::HTTP::Get.new(login_location)
      req.basic_auth 'username', 'secret'
      response = http.request(req)    
      return nil if response.class == Net::HTTPUnauthorized

      actual_page_to_fetch_url = 'reports/or/whatever...'   
      req = Net::HTTP::Get.new(actual_page_to_fetch_url)
      response = http.request(req)
      #puts "response is #{response.body}"
      response.body
    end
end

Turns out that Trac uses cookies, so you need to provide the cookie you are given. First you provide the authentication using the basic_auth method, then you copy the cookie you are given and provide it in the next request. Like this:

require 'net/https'   # pulls in net/http plus SSL support

def trac_scraper_that_works
    http = Net::HTTP.new('some_trac_hoster.com', 443)
    http.use_ssl = true
    http.start do |http|

      # Log in with basic auth; Trac answers with a session cookie.
      login_location = '/trac/<something>/login'
      req = Net::HTTP::Get.new(login_location)
      req.basic_auth 'username', 'secret'
      response = http.request(req)

      return nil if response.is_a?(Net::HTTPUnauthorized)

      # Hand the cookie back on the next request.
      cookie = response['set-cookie']
      headers = {
        'Cookie' => cookie
      }
      actual_page_to_fetch_url = 'reports/or/whatever...'
      req = Net::HTTP::Get.new(actual_page_to_fetch_url, headers)
      response = http.request(req)
      #puts "response is #{response.body}"
      response.body
    end
end
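One refinement, as a sketch rather than something I needed for Trac: a Set-Cookie header usually carries attributes (path, expiry and so on) after the name=value pair, and a server can send several of them. The helper below, whose name is made up, keeps only the name=value pairs and joins them into a single Cookie value. The cookie names in the example are invented.

```ruby
# Hypothetical helper: turn one or more Set-Cookie header values into a
# single value for the Cookie request header. Only the leading name=value
# pair of each header is kept; attributes such as Path or expires are dropped.
def cookie_header_from(set_cookie_values)
  Array(set_cookie_values).map { |value| value.split(';').first.strip }.join('; ')
end

# Example with made-up cookies:
puts cookie_header_from(['trac_auth=abc123; Path=/trac/project',
                         'trac_session=xyz; expires=Thu, 01 Jan 2009 00:00:00 GMT'])
```

With Net::HTTP you can get the separate header values from `response.get_fields('set-cookie')`, which avoids the comma-joined string that `response['set-cookie']` returns when there is more than one cookie.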

postgres RI_ConstraintTrigger Error

Posted by daniel Friday, July 25, 2008 13:12:00 GMT

A problem that I recently started running into is a strange Postgres FK constraints error, which only shows up when running tests. The error looks something like this:

PGError: ERROR:  permission denied: "RI_ConstraintTrigger_XXXXX" is a system trigger

This is something that came in with FoxyFixtures, and there's a discussion about this problem and a patch in the rails trac. So what's this about?

When loading fixtures, rails tries to disable all foreign keys so that there are no problems inserting the data in whatever order. The calls work like this:

ALTER TABLE table_name DISABLE TRIGGER ALL
insert a bunch of fixtures 
ALTER TABLE table_name ENABLE TRIGGER ALL

However, postgres enforces the FKs with internal system triggers called RI_ConstraintTrigger_something. Disabling all the triggers on a table tries to disable these as well. The problem is that these system triggers belong to the superuser and not to you. So, you get this permissions problem.

You may not have superuser permissions for your postgres instance, and even if you do, it's probably best that the database user you test with does not have superuser permissions, since that's the way it's going to be in production.

One workaround is to disable the foreign key checking in a different way. Instead of running the DISABLE TRIGGER, you can make the constraints DEFERRED (see the postgres docs). This way, you run like this:

SET CONSTRAINTS ALL DEFERRED
insert a bunch of fixtures 
SET CONSTRAINTS ALL IMMEDIATE

This allows you to load your fixtures, but keep all constraint checking. Note that SET CONSTRAINTS only affects constraints that were declared DEFERRABLE.

The easiest way to patch this is to override the behavior of the disable_referential_integrity method. I created a new file, active_record_fk_hack.rb, and stuck this in there:

module ActiveRecord
  module ConnectionAdapters
    class PostgreSQLAdapter < AbstractAdapter
      def disable_referential_integrity(&block)
         transaction {
           begin
             execute "SET CONSTRAINTS ALL DEFERRED"
             yield
           ensure
             execute "SET CONSTRAINTS ALL IMMEDIATE"
           end
         }
      end
    end
  end
end

Then, in your environment.rb, add this at the end:

require 'active_record_fk_hack'

Query for Postgres database sizes

Posted by daniel Wednesday, April 16, 2008 11:11:00 GMT

If you need to figure out how big your postgres databases are, this query can come in pretty useful.

SELECT
   pg_database.datname,
   pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM
  pg_database
JOIN
  pg_shadow ON pg_database.datdba = pg_shadow.usesysid
ORDER BY
  pg_database_size(pg_database.datname) desc;

You may need to log in as postgres to get the privilege to look at pg_shadow.

Google Charts API rails plugin

Posted by daniel Sunday, December 09, 2007 21:00:33 GMT

I bundled the code from the previous post into a plugin, called Chartr. This makes it easy to create graphs within rails.

Here's how to use it.

First, install the plugin. (It's probably a good idea to install it with '-x' since it's likely to be updated. Also, I should mention that this is my first plugin.)

    ruby script/plugin install -x svn://syvera.com/plugins/chartr

Right, so now we need to graph something. You can put a graph into any page, but let's create a page that definitely deserves a graph:

    ruby script/generate controller graphs index

This gives you an index.html.erb page where you can put your graph. In that page, we're going to use Chartr to give us a random graph. This is the code:

    <%= Chartr.make_simple_line_chart Array.new(5) {|i| rand(10)}, 
                                      ['one', 'two', 'three', 'four', 'five'],
                                      'stuff to graph' %>

So that's going to give you a graph like this:

[Image: line chart, alt text "stuff to graph"]

Sure, that's cool. But what the heck were those arguments?

The first one is for your values:

    Array.new(5) {|i| rand(10)}  # returns an array of 5 random values between 0 and 9

The second, also an array, is for the legend. No explanation needed for that. The last is for the html 'alt' attribute.

You could provide a fourth one for the size of the graph. If not, the default is 200x100.

Google Charts API

Posted by daniel Friday, December 07, 2007 22:48:00 GMT

I just noticed that google has published another cool API, this time for charts. It makes very fine looking charts, and has a simple API. Getting the values into the chart is very strange at first, but the constraints that you have on the values actually help in that your chart ends up looking better.

Here's how to use it. First we need some values to chart. A couple of weeks ago I wrote about how to extract some data out of subversion and put it into postgres. I still have those values around, so for values I can use the number of commits per month in the rails project since it began. If you want to know more about that, read svn tricks. The strange part, as I mentioned before, is that you don't pass numbers to the chart. You pass some kind of encoding which has either 62, 1000, or 4096 values. For many uses, the first one is good enough.

Ok, so first we need to generate the encoding for the values. The Google Charts API webpage shows a sample javascript function to generate that. I translated that to ruby. I think it's more or less correct, and goes like this:

    def simpleEncode(values, maxValue)
      # 62 characters, one for each value the simple encoding can express
      simpleEncoding = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'

      chartData = ['s:']
      values.each do |v|
        val = Float(v)
        if !val.nan? && val >= 0
          # scale to an index between 0 and 61 and pick that character
          chartData.push(simpleEncoding[((simpleEncoding.length - 1) * val / maxValue).round, 1])
        else
          chartData.push('_')   # underscore marks a missing value
        end
      end
      chartData.join
    end
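To make the mapping concrete, here is a compact, self-contained restatement of the same idea with a worked example. `simple_encode` and `SIMPLE_ENCODING` are names I made up for this sketch, and it assumes the maximum value is greater than zero. The sample values are the first three monthly commit counts plus the overall maximum of 347.

```ruby
# Each value is scaled to an index between 0 and 61 and replaced by the
# character at that index; nil or negative values become '_'.
SIMPLE_ENCODING = ('A'..'Z').to_a + ('a'..'z').to_a + ('0'..'9').to_a

def simple_encode(values, max_value)
  encoded = values.map do |v|
    if v.nil? || v < 0
      '_'   # missing value marker
    else
      SIMPLE_ENCODING[((SIMPLE_ENCODING.length - 1) * v.to_f / max_value).round]
    end
  end
  's:' + encoded.join
end

puts simple_encode([30, 259, 218, 347], 347)   # prints s:Fum9
```

So 30 out of 347 lands on index 5 ('F'), and the maximum itself always lands on the last character ('9').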

Right, so now how do we get our values to this function? As mentioned before, our values would come out of the database like this:

    % psql -d work_activity -t -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1;"
     2004-11-01 00:00:00 |    30
     2004-12-01 00:00:00 |   259
     2005-01-01 00:00:00 |   218
     etc
     ...

In order to generate the graph, we need to stuff those values into the funny URL google wants. So we hack a little chunk of code to parse the stuff.

    a = STDIN.read
    values = []
    dates = []
    a.each_line do |line|
      next if line.strip.empty?
      date, value = line.split('|')
      value.strip!; date.strip! # get rid of the surrounding whitespace
      values << value.to_i
      dates << date.split[0].split('-')[1] # just take month, legend too busy otherwise
    end

That gives us two arrays, one with the values, and one with the dates. Notice that I chopped the year and the day off the dates. Before I did that, the labels on the X axis were so busy with text that you could not even make sense out of any of them.

Now that we have our values, we make the img tag with all the funny values in it. We also decide the size of the chart and the 'alt' value here. There are plenty of other things that you can modify, but this does most of what you want:

    image_str =<<-START
    <img src="http://chart.apis.google.com/chart?
    chs=500x225
    &amp;chd=#{simpleEncode(values, values.max)}
    &amp;cht=lc
    &amp;chxt=x,y
    &amp;chxl=0:|#{dates.join('|')}|1:||#{values.max}"
    alt="Rails Subversion Commits" />
    START
    puts image_str

Now we put all those bits of code above into a single file, and call it make_google_url.rb. We run the query as before, and pipe it to our nifty new program like this:

    psql -d work_activity -t -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1"| ruby make_google_url.rb

This is our result:

    <img src="http://chart.apis.google.com/chart?
    chs=500x225
    &amp;chd=s:FummojRe1Lu2vQOd9cLURcmXYRbQSLiUHIxpS
    &amp;cht=lc
    &amp;chxt=x,y
    &amp;chxl=0:|11|12|01|02|03|04|05|06|07|08|09|10|11|12|01|02|03|04|05|06|07|08|09|10|11|12|01|02|03|04|05|06|07|08|09|10|11|1:||347"
    alt="Rails Subversion Commits" />

And if you put that in your page, it looks like:

[Image: line chart, alt text "Rails Subversion Commits"]

deploying with google maps API keys

Posted by daniel Sunday, December 02, 2007 13:00:00 GMT

If you are using google maps in your apps, you get a special key from them that works for the URL where you want to use the maps (i.e. www.yoursite.com). Since you probably test your code on localhost before you deploy it, you have a problem. The key they give you only works at that URL. What you need to do is get another key that works with your testing on your localhost, for example for localhost:3000.

What this means is that when you deploy, you have to switch them. Of course, you do not want to do this by hand.

Here's a simple way to do this with capistrano:

    task :after_update_code, :roles => [:app, :db, :web] do

       layout_file = File.read('app/views/layouts/standard.rhtml')
       # strip the trailing newline so it does not end up inside the URL
       google_map_key = File.read("config/google_map_key.txt").strip
       # [\w-] in case the key contains hyphens as well as word characters
       layout_file.sub!(/file=api&amp;v=2&amp;key=([\w-]+)/, "file=api&amp;v=2&amp;key=#{google_map_key}")
       put(layout_file,
           "#{release_path}/app/views/layouts/standard.rhtml",
           :mode => 0444)
    end

So, what's going on?

First, save the production key in some file, like config/google_map_key.txt. Don't think that this is like saving database passwords in SVN, since this key appears on all your webpages anyway.

I put my key in the site layout I use, standard.rhtml. By default, you want to keep the development version of the key in there. When we deploy, we read the key from google_map_key.txt and use a regexp to switch the development key with the production key. Then we send that file over with the put command.

You could be more clever by changing the file that's already on the server, instead of changing the local file and sending it over, but this works well enough.
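If you want to check the substitution without deploying anything, here is a stand-alone sketch of just that step. `swap_map_key` is a made-up name, and the script tag and both keys are invented examples of what a layout might contain. It strips the key first, since a key read from a file usually ends with a newline, and it allows hyphens in the key.

```ruby
# Hypothetical stand-alone version of the key swap done in the deploy task.
def swap_map_key(layout, production_key)
  # strip guards against the trailing newline File.read leaves on the key
  layout.sub(/file=api&amp;v=2&amp;key=[\w-]+/,
             "file=api&amp;v=2&amp;key=#{production_key.strip}")
end

# Made-up layout line and keys, for illustration only.
layout = '<script src="http://maps.google.com/maps?file=api&amp;v=2&amp;key=DEV-KEY_123" type="text/javascript"></script>'
puts swap_map_key(layout, "PROD-KEY_456\n")
```

The rest of the layout is left untouched, so the same function works no matter where the script tag sits in the file.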

svn tricks and rails on sundays

Posted by daniel Sunday, November 25, 2007 16:00:27 GMT

I've got a few projects that I work on when I get the time. Since I usually work on all of them at the same time, it seems none of them moves forward very fast. I got curious to see how much work I am actually doing over time, and came up with a few little SVN hacks.

First, get the svn logs, pipe into a file:

% cd <head_of_the_svn_tree>
% svn log -q | egrep '^r' > activity.csv

Right, that gives us a file with all of the project checkins. The 'egrep' part strips out all of the annoying dashes that come with the svn log. The data looks kind of like this:

r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)
r1 | danielw | 2006-12-20 00:33:41 +0200 (Wed, 20 Dec 2006)

Now, with some command-line tricks I can break down the activity a little more:

% svn log -q | egrep '^r' | cut -d '|' -f 2 | sort | uniq -c | sort -n

This breaks down the log and counts the number of checkins per person. You can point it to a URL as well. Running it on one of my SVN trees gives something like this:

6  carl 
123  danielw

What I am really interested in is how this activity progresses over time. I don't know how to do this on the command line, but SQL could do this in no time. We need to create a database and a table to hold the data. In postgres, like this:

 % createdb work_activity
 % psql -d work_activity
 work_activity => create table svn_activity (revision varchar, who varchar, date timestamp);

Now we need to populate this with data. Since the end of that SVN line has got some funny timestamps, we'll get AWK to strip that out for us. And since the standard postgres column delimiter is the tab (\t), we'll delimit our records like that. Finally, let's use the rails project to get more interesting stats.

% svn log -q http://svn.rubyonrails.org/rails/trunk > activity_rails.txt
% cat activity_rails.txt | egrep '^r' | awk '{print $1"\t"$3"\t"$5}' > activity_rails.data

This puts all of the data into a file, which we can now load into the DB in a single easy command:

% psql -d work_activity -c 'COPY svn_activity FROM STDIN' < activity_rails.data

Now it's all in the database, and we can do loads of fancy queries on it:

% psql -d work_activity -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1;"

     date_trunc      | count 
---------------------+-------
 2004-11-01 00:00:00 |    30
 2004-12-01 00:00:00 |   259
 2005-01-01 00:00:00 |   218
 2005-02-01 00:00:00 |   219
 2005-03-01 00:00:00 |   227
 2005-04-01 00:00:00 |   199
 2005-05-01 00:00:00 |    99
 2005-06-01 00:00:00 |   172
 2005-07-01 00:00:00 |   304
 2005-08-01 00:00:00 |    63
 2005-09-01 00:00:00 |   263
 2005-10-01 00:00:00 |   306
 2005-11-01 00:00:00 |   265
 2005-12-01 00:00:00 |    93
 2006-01-01 00:00:00 |    79
 2006-02-01 00:00:00 |   163
 2006-03-01 00:00:00 |   347
 2006-04-01 00:00:00 |   162
 2006-05-01 00:00:00 |    60
 2006-06-01 00:00:00 |   116
 2006-07-01 00:00:00 |    96
 2006-08-01 00:00:00 |   162
 2006-09-01 00:00:00 |   216
 2006-10-01 00:00:00 |   130
 2006-11-01 00:00:00 |   139
 2006-12-01 00:00:00 |    97
 2007-01-01 00:00:00 |   155
 2007-02-01 00:00:00 |    92
 2007-03-01 00:00:00 |   101
 2007-04-01 00:00:00 |    65
 2007-05-01 00:00:00 |   192
 2007-06-01 00:00:00 |   115
 2007-07-01 00:00:00 |    39
 2007-08-01 00:00:00 |    43
 2007-09-01 00:00:00 |   278
 2007-10-01 00:00:00 |   236
 2007-11-01 00:00:00 |   105

Looks like a very healthy project. Ok, let's find out on what day of the week rails developers have been most prolific:

% psql -d work_activity -c "select extract(dow from date) as day, count(*) from svn_activity group by 1 order by 1;"

 day | count 
-----+-------
   0 |  1040
   1 |   969
   2 |   874
   3 |   755
   4 |   790
   5 |   688
   6 |   789
(7 rows)

Day 0 is Sunday! Thanks for the hard work, guys.
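For what it's worth, the per-month breakdown can also be sketched in plain Ruby, without a database. This is only a rough equivalent of the date_trunc query, and it assumes the `svn log -q` line format shown earlier. `commits_per_month` is a made-up name.

```ruby
# Count commits per month straight from `svn log -q` output.
# Assumes lines shaped like: r2 | danielw | 2006-12-20 00:38:13 +0200 (...)
def commits_per_month(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    next unless line =~ /\Ar\d+ \| /        # skip the dashed separator lines
    date = line.split('|')[2].to_s.strip    # "2006-12-20 00:38:13 +0200 (...)"
    counts[date[0, 7]] += 1                 # key on "YYYY-MM"
  end
  counts.sort                               # array of [month, count] pairs
end

sample = <<-LOG
------------------------------------------------------------------------
r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)
r1 | danielw | 2006-12-20 00:33:41 +0200 (Wed, 20 Dec 2006)
LOG

commits_per_month(sample).each { |month, count| puts "#{month} #{count}" }
```

To run it on a real tree you would feed it the output of `svn log -q` instead of the sample text. For anything beyond simple counts, the SQL route is still more flexible.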

Useful Capistrano tricks

Posted by daniel Tuesday, November 13, 2007 13:23:00 GMT

I just ran into a neat little capistrano trick. I have been making some changes to one of the apps I am working on, and knew it had been a while since I had made a release. But which version do I have installed?

I tried the cap diff_from_last_deploy, but it was a little too much information. I ran into capistrano's stream command, and that proved to be just right:

    desc "Find out svn version on server"
    task :what_version, :roles => [:app] do
        stream <<-CMD
            svn info #{current_path}/app
        CMD
    end

It's also useful for those one-off commands you might want to run on your server, like seeing how many users registered at your site:

    desc "Find out how many users are registered"
    task :how_many_users, :roles => [:app] do
        stream <<-CMD
            psql -U user -d yourapp_production -c 'select count(login) from users'
        CMD
    end

Rails Inflector problems

Posted by daniel Sunday, November 04, 2007 16:23:21 GMT

I knew that the rails pluralization did some cool fancy stuff, but never really had to mess with it until today. I was trying to duplicate some bug I ran into, so I created a brand new rails app to recreate the bug in there. I had to create some model, so created "Dive", and did the whole scaffold thing that goes along with it.

I ran the tests, and they then started to complain about "Dife" not being found.

Curious, I went to the console:

    >> "dives".singularize   # => "dife"

Huh? What's a 'dife'? Even dictionary.com does not know what a 'dife' is! No matter, I knew that you could fix this up in the environment.rb file. I went in there and uncommented that default block in there, and added my little contribution to rails' knowledge of the english language.

     Inflector.inflections do |inflect|
        inflect.singular /dives/i, 'dive'
     end

I then ran the console again, just to make sure. First thing I see are all these errors on the command line:

    ./script/../config/../config/environment.rb:55:NameError: uninitialized constant Inflector
    /usr/local/lib/ruby/gems/1.8/gems/actionpack-1.13.5/lib/action_controller/assertions/selector_assertions.rb:525:NoMethodError: undefined method `camelize' for "top":String
    ..../app/controllers/application.rb:4:NameError: uninitialized constant ActionController::Base

On the prompt, I try the 'dives'.singularize again, but only get an error.

    >> "dives".singularize
    NoMethodError: undefined method `singularize' for "dives":String
      from (irb):2

I look in the awdwr book and some other places, and find that what I added really does seem to be correct. A little googling on the error finally takes me to someone else that ran into the same problem.

It's really simple. The Inflector block needs to be outside the whole 'Rails::Initializer.run' block, not inside. Once you move it to the right place, you get what you expect:

    >> "dive".pluralize
    => "dives"
    >> "dives".singularize
    => "dive"
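In other words, the end of environment.rb ends up looking roughly like this. This is a sketch of the layout for the Rails 1.2-era file, with the rest of the contents elided:

```ruby
# config/environment.rb (sketch; other contents elided)
Rails::Initializer.run do |config|
  # ... the usual config settings stay in here ...
end

# Inflections go after the initializer block, once Rails has loaded.
Inflector.inflections do |inflect|
  inflect.singular /dives/i, 'dive'
end
```

Inside the initializer block the Inflector constant is not loaded yet, which is where the "uninitialized constant Inflector" error comes from.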

Rails 2.0.0_PR and Globalize breakage

Posted by daniel Tuesday, October 30, 2007 15:25:00 GMT

My functional tests broke when I went to Rails 2.0. It took a while to grok the stack trace:

ArgumentError: wrong number of arguments (2 for 1)
    ..../vendor/rails/activerecord/lib/active_record/base.rb:1965:in `attributes_with_quotes'
    ..../vendor/rails/activerecord/lib/active_record/base.rb:1965:in `update_without_lock'

Turns out the problem was that in Rails 2.0, ActiveRecord's 'attributes_with_quotes' method now takes 2 arguments instead of a single one. Globalize overrides this method in db_translate.rb, so things break.

It's easy to fix, though. Adding the extra argument to globalize's db_translate.rb seems to fix what hurts.

basic plugins

Posted by daniel Thursday, October 18, 2007 14:45:00 GMT

Some plugins are plain good. I tend to forget where they are, and then have to go around hunting for them again. Here they are, so that I don't have to hunt around for them next time.

    ruby script/plugin install exception_notification
    ruby script/plugin install tztime
    ruby script/plugin install tzinfo_timezone
    ruby script/plugin install http://svn.techno-weenie.net/projects/plugins/restful_authentication/
    ruby script/plugin install http://svn.pragprog.com/Public/plugins/annotate_models
    ruby script/plugin install svn://errtheblog.com/svn/plugins/will_paginate
    ruby script/plugin install svn://caboo.se/plugins/court3nay/spider_test
    ruby script/plugin install http://terralien.com/svn/projects/plugins/query_trace/

Also, the svn:ignore settings I always end up needing:

    svn propset svn:ignore "*.log" log/
    svn propset svn:ignore "ruby_sess*" tmp/sessions/   
    svn propset svn:ignore "*" tmp/pids/
    svn propset svn:ignore "*" tmp/cache/
    svn propset svn:ignore "*" tmp/sockets/

Learning Javascript

Posted by daniel Tuesday, July 17, 2007 15:07:00 GMT

I have avoided javascript for a while, but finally decided to find out a little more about it. A few videos out there made the process painless.

First, Douglas Crockford's excellent presentations.

  1. Basic Javascript (split into 4 parts)
  2. Theory of the DOM (split into 3)
  3. Advanced Javascript (split into 3)

After this, I felt I had an inkling of what the language was about. Still not sure how I would write a large chunk of code with it, but it's good to know what's going on. He's got some more videos which I will try to come back to, but at this point I want to move on to apply some of this stuff.

What I really wanted was to be able to use some of these new javascript libraries like Prototype, but I did not want to start blindly using them without knowing a little about what was going on in the background.

A good introduction to those libraries is Peepcode's Prototype video. It costs money, but it's certainly worth it.