rb_spread on Mac OS X
I recently tried to start using the spread messaging toolkit on the mac, along with the associated ruby client api rb_spread, but had a few failures. I got everything running fine on a linux box, but doing the same thing on a Mac gave me this error when trying to use the ruby client:
% irb
irb(main):001:0> require 'spread.so'
dyld: NSLinkModule() error
dyld: Symbol not found: _SP_get_num_vs_offset_memb_mess
Referenced from: /usr/local/lib/ruby/site_ruby/1.8/i686-darwin9.5.0/spread.bundle
Expected in: flat namespace
Trace/BPT trap
There was not much information out there about this problem, so it took a while to figure out what the problem was.
The solution was to use an older version of the spread toolkit. Using version 3.17.x instead of the newer 4.0 version works perfectly.
Profiling Postgres with PQA
If you need to understand what your database is really doing, you're going to need to profile its activity. If you are using postgres, then PQA (Practical Query Analysis) is your best bet.
PQA takes data from your production logs and creates a report for you. First you need to get logs from your database. Chances are, you have disabled full statement logging in your db, so you need to turn those back on.
I find it easiest to keep the PQA settings commented out in the postgresql.conf file. When you want to run profiling, you uncomment them for the duration of the run.
# Standard settings:
log_duration = false
log_min_duration_statement = -1
# PQA Settings
#log_duration = true
#log_min_duration_statement = 0
You need to install the PQA. It's a regular gem, so you can install it like so:
sudo gem install pqa
Now you need to gather the logs. First, you need to enable the logging of the database statements on the database, as shown in the settings above.
Second, we tell postgres to re-read its config file. Like this:
% psql -U postgres
postgres -> select pg_reload_conf();
Telling it to reload the config will not take down the DB, but it will re-read postgresql.conf and change any logging-related behavior. Postgres is now logging. Check the logs to make sure you did not make a mistake. If everything is good, tell it to rotate logfiles, so you don't run your profiling with a half-empty one.
% psql -U postgres
postgres -> select pg_rotate_logfile();
You're now collecting data. How long you collect data for depends on how predictable the traffic is. One hour of logs can usually tell you quite a bit.
After enough time has passed, re-comment those settings in postgresql.conf, and once again tell postgres to reload the config file and roll over the log files.
It's likely that in your data collection run, many files were generated. In this case, you need to collate them back together. For postgres files following the standard naming convention (postgresql-YYYY-MM-DD_HHMMSS.log) I've got a hacked-up script for this:
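# Collate rotated logs (postgresql-YYYY-MM-DD_HHMMSS.log) in chronological order.
pattern = /postgresql-(\d{4})-(\d\d)-(\d\d)_(\d+)\.log\z/
files = Dir.glob("postgresql-*.log").select { |file| file =~ pattern }
# Sort on the date and the time together, not only on the time of day.
sorted = files.sort_by do |file|
  pattern.match(file).captures.join.to_i
end
sorted.each do |f|
  system("cat #{f} >> very_big_postgres_log.log")
end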
Now you have a single big logfile. The format of data accepted by PQA is here. Postgres 8.2 logs seem to be slightly incompatible with PQA, so a tiny bit of log massaging needs to be done. Basically, you need to take the timezone out of each log line. You can do this on the command line with cut, or with ruby:
% ruby -p -e '$_.gsub!(/PDT/, "")' very_big_postgres_log.log > fixed_postgres_log.log
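The one-liner above only knows about PDT. As a rough sketch, the same massaging can be written so that it drops whatever timezone abbreviation follows the timestamp. It assumes every log line starts with a "YYYY-MM-DD HH:MM:SS TZ " prefix, which depends on your log_line_prefix setting, and strip_timezone is a made-up helper name, not something PQA provides.

```ruby
# Assumption: log lines begin with "YYYY-MM-DD HH:MM:SS TZ " (for example
# "2008-10-01 12:34:56 PDT LOG: ..."). Adjust the pattern if your
# log_line_prefix is different.
TIMESTAMP_WITH_ZONE = /\A(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) [A-Z]{2,5}(?= )/

# Hypothetical helper: removes the timezone abbreviation after the timestamp
# and leaves lines without one untouched.
def strip_timezone(line)
  line.sub(TIMESTAMP_WITH_ZONE, '\1')
end

puts strip_timezone("2008-10-01 12:34:56 PDT LOG:  statement: select 1")
puts strip_timezone("2008-10-01 12:34:56 CEST LOG:  duration: 0.5 ms")
```

The lookahead for a trailing space keeps the pattern from eating the "LOG:" marker on lines that carry no timezone at all.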
Then you do the analysis:
% pqa -normalize -logtype pglog -file fixed_postgres_log.log
to_plain_segments error
I recently tried to start using metric-fu for code statistics. This makes use of the rcov gem, as well as a few other gems. It generates the statistics with the rake task rake metrics:all_with_migrate.
Somehow, this caused a slew of errors in the functional tests, all looking like this:
NoMethodError: undefined method `to_plain_segments' for #<ActionController::Routing::RouteSet:0x2aaaacad5648>
On the Mac, everything worked fine. On Fedora 6 hosts running 1.8.5 or 1.8.6 I would get the error. A bit of googling found that some rspec people are getting it too, but it's not rspec's fault. It's rcov.
I updated my rcov to the not-yet-officially-released version 0.8.3.1, thanks to a github branch of rcov.
sudo gem install spicycode-rcov -s http://gems.github.com
That fixes it, and tests run fine now.
Accessing Trac with Ruby's HTTP Library
I needed to access a Trac repository which was password-protected with basic authentication. Using ruby's standard http libraries, my first try did not work.
This is what does not work:
def trac_scraper_that_not_work
  http = Net::HTTP.new('some_trac_hoster.com', 443)
  http.use_ssl = true
  http.start do |http|
    login_location = '/trac/<something>/login'
    req = Net::HTTP::Get.new(login_location)
    req.basic_auth 'username', 'secret'
    response = http.request(req)
    return nil if response.class == Net::HTTPUnauthorized
    actual_page_to_fetch_url = 'reports/or/whatever...'
    req = Net::HTTP::Get.new(actual_page_to_fetch_url)
    response = http.request(req)
    #puts "response is #{response.body}"
    response.body
  end
end
Turns out that Trac uses cookies, so you need to provide the cookie you are given.
First you provide the authentication using the basic_auth method, then you copy the cookie
you are given and provide it in the next request. Like this:
require 'net/https'

def trac_scraper_that_works
  http = Net::HTTP.new('some_trac_hoster.com', 443)
  http.use_ssl = true
  http.start do |http|
    login_location = '/trac/<something>/login'
    req = Net::HTTP::Get.new(login_location)
    req.basic_auth 'username', 'secret'
    response = http.request(req)
    return nil if response.class == Net::HTTPUnauthorized
    # Hand the cookie Trac set at login back on the next request.
    cookie = response['set-cookie']
    headers = {
      'Cookie' => cookie
    }
    actual_page_to_fetch_url = 'reports/or/whatever...'
    req = Net::HTTP::Get.new(actual_page_to_fetch_url, headers)
    response = http.request(req)
    #puts "response is #{response.body}"
    response.body
  end
end
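One detail worth knowing: a Set-Cookie header usually carries attributes (path, expiry and so on) after the cookie itself, while the Cookie header you send back only needs the name=value part. Sending the whole string worked for me, but here is a small sketch that trims it down. It assumes a single cookie in the header; cookie_pair is a made-up helper name and the sample value is invented for illustration.

```ruby
# Hypothetical helper: keep only the leading "name=value" pair of a
# Set-Cookie header value. Assumes the header holds a single cookie.
def cookie_pair(set_cookie_header)
  return nil if set_cookie_header.nil?
  pair = set_cookie_header.split(';').first.to_s.strip
  pair.empty? ? nil : pair
end

# Invented sample header value, only for illustration.
puts cookie_pair('trac_auth=abc123; Path=/trac/project; Secure')
```

In the scraper it would be used as 'Cookie' => cookie_pair(response['set-cookie']).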
postgres RI_ConstraintTrigger Error
A problem that I recently started running into is a strange Postgres FK constraints error, which only shows up when running tests. The error looks something like this:
PGError: ERROR: permission denied: "RI_ConstraintTrigger_XXXXX" is a system trigger
This is something that came in with FoxyFixtures, and there's a discussion about this problem and a patch in the rails trac. So what's this about?
When loading fixtures, rails tries to disable all foreign keys so that there are no problems inserting the data in whatever order. The calls work like this:
ALTER TABLE table_name DISABLE TRIGGER ALL
insert a bunch of fixtures
ALTER TABLE table_name ENABLE TRIGGER ALL
However, postgres enforces the FKs with internal system triggers called
RI_ConstraintTrigger_something. DISABLE TRIGGER ALL tries to disable these as well.
The problem is that these system triggers can only be disabled by a superuser, not by you. So, you get this
permissions problem.
You may not have superuser permissions for your postgres instance, but it's probably best that the database you work with does not have superuser permissions, since that's the way it's going to be in production.
One workaround is to disable the foreign key checking in a different way. Instead of running the DISABLE TRIGGER, you can make the constraints DEFERRED (see the postgres docs). Note that this only affects constraints that were declared DEFERRABLE. The calls then look like this:
SET CONSTRAINTS ALL DEFERRED
insert a bunch of fixtures
SET CONSTRAINTS ALL IMMEDIATE
This allows you to load your fixtures, but keep all constraint-checking.
The easiest way to patch this is to override the behavior of the disable_referential_integrity method.
I created a new file, active_record_fk_hack.rb, stuck this in there:
module ActiveRecord
  module ConnectionAdapters
    class PostgreSQLAdapter < AbstractAdapter
      def disable_referential_integrity(&block)
        transaction {
          begin
            execute "SET CONSTRAINTS ALL DEFERRED"
            yield
          ensure
            execute "SET CONSTRAINTS ALL IMMEDIATE"
          end
        }
      end
    end
  end
end
Then, in your environment.rb, add this at the end:
require 'active_record_fk_hack'
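To see the order of statements the override produces without needing a database, here is a toy sketch. RecordingConnection is a stand-in I made up for illustration (it is not ActiveRecord and runs no SQL), and the INSERT statement and table name are invented. It only shows that the constraints go back to IMMEDIATE even when the fixture loading blows up.

```ruby
# Stand-in for a database connection: it records SQL instead of running it.
# This class is hypothetical and exists only to illustrate the call order.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end

  # Same shape as the override above, minus the transaction wrapper.
  def disable_referential_integrity
    execute "SET CONSTRAINTS ALL DEFERRED"
    yield
  ensure
    execute "SET CONSTRAINTS ALL IMMEDIATE"
  end
end

conn = RecordingConnection.new
conn.disable_referential_integrity do
  conn.execute "INSERT INTO dives (id) VALUES (1)" # invented fixture insert
end
puts conn.statements
```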
Query for Postgres database sizes
If you need to figure out how big your postgres databases are, this query can come in pretty useful.
SELECT
pg_database.datname,
pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM
pg_database
JOIN
pg_shadow ON pg_database.datdba = pg_shadow.usesysid
ORDER BY
pg_database_size(pg_database.datname) desc;
You may need to log in as postgres to get the privilege to look at the pg_shadow tables.
Google Charts API rails plugin
I bundled the code from the previous post into a plugin, called Chartr. This makes it easy to create graphs within rails.
Here's how to use it.
First, install the plugin. (It's probably a good idea to install it with '-x' since it's likely to be updated. Also, I should mention that this is my first plugin.)
ruby script/plugin install -x svn://syvera.com/plugins/chartr
Right, so now we need to graph something. You can put a graph into any page, but let's create a page that definitely deserves a graph:
ruby script/generate controller graphs index
This gives you an index.html.erb page where you can put your graph. In that page, we're going to use Chartr to give us a random graph. This is the code:
<%= Chartr.make_simple_line_chart Array.new(5) {|i| rand(10)},
['one', 'two', 'three', 'four', 'five'],
'stuff to graph' %>
So that's going to give you a graph like this:
Sure, that's cool. But what the heck were those arguments?
The first one is for your values:
Array.new(5) {|i| rand(10)} # returns an array of 5 random values between 0 and 9
The second, also an array, is for the legend. No explanation needed for that. The last is for the html 'alt' attribute.
You could provide a fourth one for the size of the graph. If not, the default is 200x100.
Google Charts API
I just noticed that google has published another cool API, this time for charts. It makes very fine looking charts, and has a simple API. Getting the values into the chart is very strange at first, but the constraints that you have on the values actually help in that your chart ends up looking better.
Here's how to use it. First we need some values to chart. A couple of weeks ago I wrote about how to extract some data out of subversion and put it into postgres. I still have those values around, so for values I can use the number of commits per month in the rails project since it began. If you want to know more about that, read svn tricks. The strange part, as I mentioned before, is that you don't pass numbers to the chart. You pass some kind of encoding which has either 62, 1000, or 4096 values. For many uses, the first one is good enough.
Ok, so first we need to generate the encoding for the values. The Google Charts API webpage shows a sample javascript function to generate that. I translated that to ruby. I think it's more or less correct, and goes like this:
def simpleEncode(values, maxValue)
  simpleEncoding = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
  chartData = ['s:']
  values.each do |v|
    val = Float(v)
    if !val.nan? && val >= 0
      # scale the value onto the 62 available characters
      chartData.push(simpleEncoding[((simpleEncoding.length - 1) * val / maxValue).round, 1])
    else
      chartData.push('_')
    end
  end
  chartData.join('')
end
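To make the scaling less mysterious, here is the arithmetic for the first few months of the commit data, written as a small standalone sketch. encode_value is a name I made up for this example; it does the same calculation as the function above for a single value, using the peak of 347 commits as the maximum.

```ruby
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'

# Illustrative helper: map one value onto one of the 62 characters.
def encode_value(value, max_value)
  index = ((ALPHABET.length - 1) * value.to_f / max_value).round
  ALPHABET[index, 1]
end

max_value = 347
[30, 259, 218, 219, 227, 199].each do |count|
  scaled = 61 * count.to_f / max_value
  puts "#{count} commits -> #{'%.2f' % scaled} -> #{encode_value(count, max_value)}"
end
```

So 30 commits lands on index 5, which is 'F', and the peak month lands on index 61, which is '9'.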
Right, so now how do we get our values to this function? As mentioned before, our values would come out of the database like this:
% psql -d work_activity -t -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1;"
2004-11-01 00:00:00 | 30
2004-12-01 00:00:00 | 259
2005-01-01 00:00:00 | 218
etc
...
In order to generate the graph, we need to stuff those values into the funny URL google wants. So we hack a little chunk of code to parse the stuff.
a = STDIN.read
values = []
dates = []
a.each_line do |line|
  next unless line.include?('|') # skip blank lines
  date, value = line.split('|')
  value.strip!; date.strip! # get rid of the surrounding whitespace and newline
  values << value.to_i
  dates << date.split[0].split('-')[1] # just take month, legend too busy otherwise
end
That gives us two arrays, one with the values, and one with the dates. Notice that I chopped off the year and the day off the dates. Before I did that, the labels on the X axis were so busy with text that you could not even make sense out of any of them.
Now that we have our values, we make the img tag with all the funny values in it. We also decide the size of the chart and the 'alt' value here. There are plenty of other things that you can modify, but this does most of what you want:
image_str =<<-START
<img src="http://chart.apis.google.com/chart?
chs=500x225
&chd=#{simpleEncode(values, values.max)}
&cht=lc
&chxt=x,y
&chxl=0:|#{dates.join('|')}|1:||#{values.max}"
alt="Rails Subversion Commits" />
START
puts image_str
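The heredoc above leaves line breaks inside the src attribute. If you would rather have the URL on a single line, something like this sketch would do it. chart_url is a made-up helper name, the parameters are the same ones used above, and the sample data string is only a placeholder.

```ruby
CHART_BASE = 'http://chart.apis.google.com/chart'

# Hypothetical helper: assemble the chart URL on one line from the same
# parameters used in the heredoc version.
def chart_url(chart_data, labels, max_value, size = '500x225')
  params = [
    "chs=#{size}",
    "chd=#{chart_data}",
    'cht=lc',
    'chxt=x,y',
    "chxl=0:|#{labels.join('|')}|1:||#{max_value}"
  ]
  "#{CHART_BASE}?#{params.join('&')}"
end

# Placeholder data string and labels, only for illustration.
url = chart_url('s:H9z', %w[11 12 01], 259)
puts %(<img src="#{url}" alt="Rails Subversion Commits" />)
```

Strictly speaking the ampersands should be written as &amp; inside an HTML attribute; I left them raw to match the output shown in this post.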
Now we put all those bits of code above into a single file, and call it make_google_url.rb. We run the query as before, and pipe it to our nifty new program like this:
psql -d work_activity -t -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1"| ruby make_google_url.rb
This is our result:
<img src="http://chart.apis.google.com/chart?
chs=500x225
&chd=s:FummojRe1Lu2vQOd9cLURcmXYRbQSLiUHIxpS
&cht=lc
&chxt=x,y
&chxl=0:|11|12|01|02|03|04|05|06|07|08|09|10|11|12|01|02|03|04|05|06|07|08|09|10|11|12|01|02|03|04|05|06|07|08|09|10|11|1:||347"
alt="Rails Subversion Commits" />
And if you put that in your page, it looks like:
deploying with google maps API keys
If you are using google maps in your apps, you get a special key from them that works for the URL where you want to use the maps (e.g. www.yoursite.com). Since you probably test your code on localhost before you deploy it, you have a problem. The key they give you only works at that URL. What you need to do is get another key that works with your testing on your localhost, for example for localhost:3000.
What this means is that when you deploy, you have to switch them. Of course, you do not want to do this by hand.
Here's a simple way to do this with capistrano:
task :after_update_code, :roles => [:app, :db, :web] do
  layout_file = File.read('app/views/layouts/standard.rhtml')
  # strip the trailing newline so it does not end up inside the URL
  google_map_key = File.read("config/google_map_key.txt").strip
  layout_file.sub!(/file=api&v=2&key=(\w+)/, "file=api&v=2&key=#{google_map_key}")
  put(layout_file,
      "#{release_path}/app/views/layouts/standard.rhtml",
      :mode => 0444)
end
So, what's going on?
First, save the production key in some file, like config/google_map_key.txt. Don't worry that this is like saving database passwords in SVN, since this key appears on all your webpages anyway.
I put my key in the site layout I use, standard.rhtml. By default, you want to keep the development version of the key in there. When we deploy, we read the key from google_map_key.txt and use a regexp to switch the development key with the production key. Then we send that file over with the put command.
You could be more clever by changing the file that's already on the server, instead of changing the local file and sending it over, but this works well enough.
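The substitution step is easy to try out on its own, away from capistrano. Here is a sketch of it as a plain function. swap_map_key is a made-up name, both keys below are invented, and I widened the pattern to allow hyphens on the assumption that a key may contain them.

```ruby
# Assumption: keys consist of word characters and hyphens.
KEY_PATTERN = /file=api&v=2&key=[\w-]+/

# Hypothetical helper: return a copy of the layout with the key swapped.
# The block form keeps sub from interpreting backslashes in the key.
def swap_map_key(layout, production_key)
  layout.sub(KEY_PATTERN) { "file=api&v=2&key=#{production_key.strip}" }
end

# Invented keys, only for illustration.
layout = '<script src="http://maps.google.com/maps?file=api&v=2&key=DEVKEY123" type="text/javascript"></script>'
puts swap_map_key(layout, "PROD-KEY_456\n")
```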
svn tricks and rails on sundays
I've got a few projects that I work on when I get the time. Since I usually work on all of them at the same time, it seems none of them moves forward very fast. I got curious to see how much work I am actually doing over time, and came up with a few little SVN hacks.
First, get the svn logs, pipe into a file:
% cd <head_of_the_svn_tree>
% svn log -q | egrep '^r' > activity.csv
Right, that gives us a file with all of the project checkins. The 'egrep' part strips out all of the annoying dashes that come with the svn log. The data looks kind of like this:
r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)
r1 | danielw | 2006-12-20 00:33:41 +0200 (Wed, 20 Dec 2006)
Now, with some command-line tricks I can break down the activity a little more:
% svn log -q | egrep '^r' | cut -d '|' -f 2 | sort | uniq -c | sort -n
This breaks down the log and counts the number of checkins per person. You can point it to a URL as well. Results on one of my SVN trees gives something like this:
6 carl
123 danielw
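If you would rather stay in ruby than chain cut, sort and uniq, the same count can be sketched like this. commits_per_author is a made-up name, and the sample lines are modelled on the log format shown above (the carl line is invented).

```ruby
# Hypothetical helper: count checkins per author from "svn log -q" lines,
# returning [author, count] pairs sorted by ascending count.
def commits_per_author(log_lines)
  counts = Hash.new(0)
  log_lines.each do |line|
    next unless line =~ /\Ar\d+ \|/ # skip the dashed separator lines
    author = line.split('|')[1].strip
    counts[author] += 1
  end
  counts.sort_by { |author, count| count }
end

sample = [
  "------------------------------------------------------------------------",
  "r3 | carl | 2006-12-21 10:02:00 +0200 (Thu, 21 Dec 2006)",
  "r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)",
  "r1 | danielw | 2006-12-20 00:33:41 +0200 (Wed, 20 Dec 2006)"
]
commits_per_author(sample).each { |author, count| puts "#{count} #{author}" }
```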
What I am really interested in is how this activity progresses over time. I don't know how to do this on the command line, but SQL could do this in no time. We need to create a database and a table to hold the data. In postgres, like this:
% createdb work_activity
% psql -d work_activity
work_activity => create table svn_activity (revision varchar, who varchar, date timestamp);
Now we need to populate this with data. Since the end of that SVN line has got some funny timestamps, we'll get AWK to strip that out for us. Also, since the standard postgres column delimiter is the tab (\t), we'll delimit our records like that. Also, let's use the rails project to get more interesting stats.
% svn log -q http://svn.rubyonrails.org/rails/trunk > activity_rails.txt
% cat activity_rails.txt | egrep '^r' | awk '{print $1"\t"$3"\t"$5}' > activity_rails.data
This puts all of the data into a file, which we can now load into the DB in a single easy command:
% psql -d work_activity -c 'COPY svn_activity FROM STDIN' < activity_rails.data
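Note that the awk step above keeps only the date ($5) and drops the time of day. If you want the full timestamp in the table, a ruby version of that step could look like this sketch. svn_log_to_row is a made-up name; it turns one "svn log -q" line into a tab-delimited row for COPY and returns nil for the separator lines.

```ruby
# Hypothetical helper: convert one "svn log -q" line into a tab-delimited
# row (revision, who, "date time"), dropping the UTC offset and the
# human-readable date in parentheses.
def svn_log_to_row(line)
  revision, who, stamp = line.split('|').map { |field| field.strip }
  return nil unless revision =~ /\Ar\d+\z/ && stamp
  date, time = stamp.split
  [revision, who, "#{date} #{time}"].join("\t")
end

puts svn_log_to_row("r2 | danielw | 2006-12-20 00:38:13 +0200 (Wed, 20 Dec 2006)")
```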
Now it's all in the database, and we can do loads of fancy queries on it:
% psql -d work_activity -c "select date_trunc('month', date), count(*) from svn_activity group by 1 order by 1;"
date_trunc | count
---------------------+-------
2004-11-01 00:00:00 | 30
2004-12-01 00:00:00 | 259
2005-01-01 00:00:00 | 218
2005-02-01 00:00:00 | 219
2005-03-01 00:00:00 | 227
2005-04-01 00:00:00 | 199
2005-05-01 00:00:00 | 99
2005-06-01 00:00:00 | 172
2005-07-01 00:00:00 | 304
2005-08-01 00:00:00 | 63
2005-09-01 00:00:00 | 263
2005-10-01 00:00:00 | 306
2005-11-01 00:00:00 | 265
2005-12-01 00:00:00 | 93
2006-01-01 00:00:00 | 79
2006-02-01 00:00:00 | 163
2006-03-01 00:00:00 | 347
2006-04-01 00:00:00 | 162
2006-05-01 00:00:00 | 60
2006-06-01 00:00:00 | 116
2006-07-01 00:00:00 | 96
2006-08-01 00:00:00 | 162
2006-09-01 00:00:00 | 216
2006-10-01 00:00:00 | 130
2006-11-01 00:00:00 | 139
2006-12-01 00:00:00 | 97
2007-01-01 00:00:00 | 155
2007-02-01 00:00:00 | 92
2007-03-01 00:00:00 | 101
2007-04-01 00:00:00 | 65
2007-05-01 00:00:00 | 192
2007-06-01 00:00:00 | 115
2007-07-01 00:00:00 | 39
2007-08-01 00:00:00 | 43
2007-09-01 00:00:00 | 278
2007-10-01 00:00:00 | 236
2007-11-01 00:00:00 | 105
Looks like a very healthy project. Ok, let's find out on what day of the week rails developers have been most prolific:
psql -d work_activity -c "select extract(dow from date) as day, count(*) from svn_activity group by 1 order by 1;"
day | count
-----+-------
0 | 1040
1 | 969
2 | 874
3 | 755
4 | 790
5 | 688
6 | 789
(7 rows)
Day 0 is Sunday! Thanks for the hard work, guys.
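If the day numbers are hard to read, ruby's Date::DAYNAMES uses the same numbering as postgres (0 is Sunday), so a few lines will turn the result above into names. The counts are copied from the query output; everything else is just a sketch.

```ruby
require 'date'

# Counts per day of week, copied from the query output above.
commits_by_dow = { 0 => 1040, 1 => 969, 2 => 874, 3 => 755, 4 => 790, 5 => 688, 6 => 789 }

# Busiest day first.
ranked = commits_by_dow.sort_by { |dow, count| -count }
ranked.each { |dow, count| puts "#{Date::DAYNAMES[dow]}: #{count}" }

busiest = Date::DAYNAMES[ranked.first[0]]
quietest = Date::DAYNAMES[ranked.last[0]]
```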
Useful Capistrano tricks
I just ran into a neat little capistrano trick. I have been making some changes to an app, and knew it had been a while since I had made a release to one of the apps I am working on. But which version do I have installed?
I tried the cap diff_from_last_deploy, but it was a little too much information.
I ran into capistrano's stream command, and that proved to be just right:
desc "Find out svn version on server"
task :what_version, :roles => [:app] do
stream <<-CMD
svn info #{current_path}/app
CMD
end
It's also useful for those one-off commands you might want to run on your server, like seeing how many users registered at your site:
desc "Find out how many users are registered"
task :how_many_users, :roles => [:app] do
stream <<-CMD
psql -U user -d yourapp_production -c 'select count(login) from users'
CMD
end
Rails Inflector problems
I knew that the rails pluralization did some cool fancy stuff, but never really had to mess with it until today. I was trying to duplicate some bug I ran into, so I created a brand new rails app to recreate the bug in there. I had to create some model, so created "Dive", and did the whole scaffold thing that goes along with it.
I ran the tests, and they then started to complain about "Dife" not being found.
Curious, I went to the console:
% "dives".singularize # => dife
Huh? What's a 'dife'? Even dictionary.com does not know what a 'dife' is! No matter, I knew that you could fix this up in the environment.rb file. I went in there and uncommented that default block in there, and added my little contribution to rails' knowledge of the English language.
Inflector.inflections do |inflect|
inflect.singular /dives/i, 'dive'
end
I then ran the console again, just to make sure. First thing I see are all these errors on the command line:
./script/../config/../config/environment.rb:55:NameError: uninitialized constant Inflector
/usr/local/lib/ruby/gems/1.8/gems/actionpack-1.13.5/lib/action_controller/assertions/selector_assertions.rb:525:NoMethodError: undefined method `camelize' for "top":String
..../app/controllers/application.rb:4:NameError: uninitialized constant ActionController::Base
On the prompt, I try the 'dives'.singularize again, but only get an error.
>> "dives".singularize
NoMethodError: undefined method `singularize' for "dives":String
from (irb):2
I look in the awdwr book and some other places, and find that what I added really does seem to be correct. A little googling on the error finally takes me to someone else that ran into the same problem.
It's really simple. The Inflector block needs to be outside the whole 'Rails::Initializer.run' block, not inside. Once you move it to the right place, you get what you expect:
>> "dive".pluralize
=> "dives"
>> "dives".singularize
=> "dive"
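So where did 'dife' come from in the first place? My guess is a built-in rule meant for words like 'wives' and 'knives'. The toy below is not the rails Inflector, just a plain-ruby sketch of first-match-wins rules; the built-in pattern in it is my assumption of what the real rule looks like, and toy_singularize is a made-up name.

```ruby
# Toy rule list, checked top to bottom; the first matching rule wins.
RULES = [
  [/dives$/i, 'dive'],       # the custom rule added in environment.rb
  [/([^f])ves$/i, '\1fe'],   # ASSUMED built-in rule (wives -> wife)
  [/s$/i, '']                # generic fallback
]

# Hypothetical helper: apply the first rule whose pattern matches.
def toy_singularize(word, rules)
  rules.each do |pattern, replacement|
    return word.sub(pattern, replacement) if word =~ pattern
  end
  word
end

puts toy_singularize("dives", RULES.drop(1)) # without the custom rule
puts toy_singularize("dives", RULES)         # with the custom rule
```

Without the custom rule, 'dives' falls into the 'wives' pattern and comes out as 'dife'; with it, the custom rule matches first.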
Rails 2.0.0_PR and Globalize breakage
My functional tests broke when I went to Rails 2.0. It took a while to grok the stack trace:
ArgumentError: wrong number of arguments (2 for 1)
..../vendor/rails/activerecord/lib/active_record/base.rb:1965:in `attributes_with_quotes'
..../vendor/rails/activerecord/lib/active_record/base.rb:1965:in `update_without_lock'
Turns out the problem was that in Rails 2.0, ActiveRecord's 'attributes_with_quotes' method now takes 2 arguments instead of a single one. The problem is that globalize overrides this method in db_translate.rb, so things break.
It's easy to fix, though. Adding the extra argument to globalize's db_translate.rb seems to fix what hurts.
basic plugins
Some plugins are plain good. I tend to forget where they are, and then have to go around hunting for them again. Here's so that I don't have to hunt around for these next time.
ruby script/plugin install exception_notification
ruby script/plugin install tztime
ruby script/plugin install tzinfo_timezone
ruby script/plugin install http://svn.techno-weenie.net/projects/plugins/restful_authentication/
ruby script/plugin install http://svn.pragprog.com/Public/plugins/annotate_models
ruby script/plugin install svn://errtheblog.com/svn/plugins/will_paginate
ruby script/plugin install svn://caboo.se/plugins/court3nay/spider_test
ruby script/plugin install http://terralien.com/svn/projects/plugins/query_trace/
Also
svn propset svn:ignore "*.log" log/
svn propset svn:ignore "ruby_sess*" tmp/sessions/
svn propset svn:ignore "*" tmp/pids/
svn propset svn:ignore "*" tmp/cache/
svn propset svn:ignore "*" tmp/sockets/
Learning Javascript
I have avoided javascript for a while, but finally decided to find out a little more about it. A few videos out there made the process painless.
First, Douglas Crockford's excellent presentations.
- Basic Javascript Part (split into 4 parts)
- Theory of the DOM (split into 3)
- Advanced Javascripts (split into 3)
After this, I felt I had an inkling of what the language was about. Still not sure how I would write a large chunk of code with it, but it's good to know what's going on. He's got some more videos which I will try to come back to, but at this point I want to move on to apply some of this stuff.
What I really wanted was to be able to use some of these new javascript libraries like Prototype, but I did not want to start blindly using them without knowing a little about what was going on in the background.
A good introduction to those libraries is Peepcode's Prototype video. It costs money, but it's certainly worth it.
