Monday 3 September 2012

Project Euler

I signed up recently to Project Euler; it seems like a good way to keep your computer science/programming skills sharp. Although I'm surprised that on the forums everyone's using C/Assembly to solve the problems. I got a nice badge auto-generated for me as well:


The problems seem like they could get a bit addictive, but they're definitely a lot of fun and probably a better use of time than Facebook.

Saturday 11 August 2012

How to see which package a file belongs to

Previously I wrote a post about how to see what files were installed as part of a package. In this post, I'll go the other way, i.e. how to tell which package a file belongs to.

On Debian/Ubuntu you can do so using the dpkg search function:

# sudo dpkg -S /etc/init.d/whoopsie
whoopsie: /etc/init.d/whoopsie


On RedHat/CentOS you can use the rpm command:

# rpm -qf /usr/bin/bash
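
If you only know a binary's name rather than its full path, combining these with which works nicely (a quick sketch; rsync is just an arbitrary example):

$ dpkg -S $(which rsync)
$ rpm -qf $(which rsync)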

Configuring boot services

One of the common tasks when setting up a server is configuring whether a service starts on boot or not. This is handled differently on different Linux distributions.

To list all services and whether they're set to run on boot:

RHEL/CentOS/Fedora

chkconfig --list

Debian/Ubuntu

rcconf



NOTE: this program doesn't come installed by default

Enable a service to run on boot:

RHEL/CentOS/Fedora

chkconfig [service name] on

Debian/Ubuntu

update-rc.d [service name] enable

Disable a service from running on boot

RHEL/CentOS/Fedora

chkconfig [service name] off

Debian/Ubuntu

update-rc.d [service name] disable
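
As a concrete example, to have the web server start on boot (note that the service name differs between the two families: httpd on RHEL-based systems, apache2 on Debian-based ones):

chkconfig httpd on
update-rc.d apache2 enable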


Tomcat and Virtual Hosts

In this guide I'll go through setting up some very simple virtual hosts on a Tomcat server. It assumes that Tomcat 6 has been set up on Ubuntu as per this previous post.

So, the first step is to define the host under /etc/tomcat6/server.xml with something along these lines (the values shown match the example.org host used throughout this post):
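
<!-- example values; adjust name and appBase for your own virtual host -->
<Host name="example.org" appBase="/var/lib/tomcat6/example.org" unpackWARs="true" autoDeploy="true"/>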


Put the Host element inside the Catalina "Engine" section. The "name" attribute will be used as the hostname to match, and "appBase" defines where Tomcat will look for the applications served from this host. If you'd like to define some aliases for this virtual host, you can do so with a nested "Alias" directive, as described in the Tomcat documentation.

Next, if we want to define the Context we simply create the directory for it under Catalina:

mkdir /etc/tomcat6/Catalina/example.org

And then for a simple application we can just copy the ROOT app from the default context:

cp /etc/tomcat6/Catalina/localhost/ROOT.xml /etc/tomcat6/Catalina/example.org/

This will define the Context for our application. Next, we will need to create the application directory to actually hold our applications for this virtual host and copy the relevant application files to this new directory. This is done with:

mkdir /var/lib/tomcat6/example.org
cp -r /var/lib/tomcat6/webapps/ROOT /var/lib/tomcat6/example.org


Then I modified the "index.html" file under "example.org/ROOT/" to display "example.org" instead of the default "It Works!" so that we would know when the Virtual Host was being accessed. Once this is done, we can go ahead and restart Tomcat in order to apply the changes:

sudo service tomcat6 restart

To test out that this configuration is working, I added a line to my hosts file (/etc/hosts under linux) on my desktop machine to point "example.org" to the IP address of the VM that I had installed Tomcat on. This allowed me to type in http://example.org:8080 and have the request go to the Tomcat server.
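
The hosts file entry itself is just a single line (the IP address here is an assumed example; use your own server's address):

192.168.1.50    example.org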

If everything worked out well, going to the virtual host at http://example.org:8080 should yield the modified page, whereas going to http://[Tomcat server IP]:8080 will result in the default page.

So, there you have it, that's the short story on how to set up virtual hosts on Apache Tomcat.

Tuesday 7 August 2012

Ubuntu - The following packages have been kept back

If you've used Ubuntu for long enough, you'll find that eventually you'll run into a problem when upgrading the installed packages. When running apt-get from the command line, the problem manifests itself as the following:

$ sudo apt-get upgrade
[sudo] password for srdan:
Reading package lists... Done
Building dependency tree      
Reading state information... Done
The following packages have been kept back:
  linux-headers-server linux-image-server linux-server
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.


The short answer is that you should be able to upgrade by running the "apt-get dist-upgrade" command:

$ sudo apt-get dist-upgrade
Reading package lists... Done
Building dependency tree      
Reading state information... Done
Calculating upgrade... Done
The following NEW packages will be installed:
  linux-headers-3.2.0-27 linux-headers-3.2.0-27-generic linux-image-3.2.0-27-generic
The following packages will be upgraded:
  linux-headers-server linux-image-server linux-server
3 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 51.2 MB of archives.
After this operation, 217 MB of additional disk space will be used.
Do you want to continue [Y/n]?


The long answer comes from the man page of the "apt-get" command. In particular, if you look at the description of the "upgrade" argument, two sentences stick out:

"under no circumstances are currently installed packages removed, or packages not already installed retrieved and installed."

"New versions of currently installed packages that cannot be upgraded without changing the install status of another package will be left at their current version."

Because you can't upgrade the "linux-image-server" (a.k.a. the kernel) without upgrading the headers as well (technically you can, but it can lead to serious problems) it won't let you upgrade them using the "upgrade" command. Either that, or the "new" linux image package requires that a "new" linux headers package be installed, violating the requirement that packages not already installed not be installed.
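
As an aside, if you'd rather not run a full dist-upgrade, explicitly installing the kept-back packages should also pull in the new dependencies (package names taken from the output above):

sudo apt-get install linux-image-server linux-headers-server linux-server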

The reason that the "dist-upgrade" command works where the "upgrade" command does not is that "dist-upgrade" takes into account dependencies between packages. Also from the man page:

"dist-upgrade ... intelligently handles changing dependencies with new versions of packages; apt-get has a 'smart' conflict resolution system, and it will attempt to upgrade the most important packages at the expense of less important ones if necessary."

This does raise the question though: why not just use "dist-upgrade" all the time, or incorporate the conflict resolution system into the "upgrade" command?

I suspect the answer has something to do with how each "edition" of a distribution is defined, i.e. Ubuntu 12.04 ships with version 3.1.13 of the "at" package, and all other packages that are part of this "edition" should work with that version. Having many different packages depend on specific versions of other packages could end up in a situation where it becomes very difficult to upgrade any individual package.

Monday 23 July 2012

svn:ignore property, STS and Subversive

Working with a combination of Springsource Tool Suite, Subversion and the Subversive plugin, one of the things that really bugs me is the constant stream of issues I get as a result of the "target" directory ending up under source control.

This directory typically holds all of the compiled classes as well as a stacktrace log and a few other things. In general it never really needs to be put under version control.

One of the things I keep forgetting to do, which frustrates the hell out of me, is to set the svn:ignore property as soon as the repository is created. Without this property set, the target directory gets added to version control, and once it's in there, there's no (easy) way of getting it out. As the Subversion documentation says:

"...Once an object is under Subversion's control, the ignore pattern mechanisms no longer apply to it..."

And even if you then set the "svn:ignore" property on the directory, it still probably won't work properly; it will go on committing all of the class files and so forth that you couldn't give two craps about.
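
For what it's worth, the least painful recovery I know of is to stop tracking the directory while keeping the local files, and only then set the ignore property. Roughly, from the project root:

svn rm --keep-local target
svn propset svn:ignore target .
svn commit -m "Stop tracking the target directory"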

All this is well and good, but could be written off as a minor problem, if it weren't for the fact that when you try to revert your changes, Subversion could very well pick up "conflicts" between the class files, and in fact can put you in a situation where you're struggling to get a functional checkout of the repository, simply because of the "target" directory.

The problem seems to be that the files stored in the directory are constantly being deleted and recreated, which seems to confuse Subversion. At the moment there doesn't seem to be a fix for the issue, i.e. every time I see the problem, I just manually manage the conflicting files.

Tuesday 3 July 2012

Linux Kernel Swappiness

The Linux kernel has a tendency to use memory as file system cache. This generally improves performance and is considered a "good thing". However, the kernel also occasionally takes memory allocated to running processes and swaps it to disk, in order to use that memory for file system cache. This can and does result in processes becoming slower, especially if they've been sitting in memory for a while but haven't been actively used.

Luckily, there is a way to tune this behaviour: the kernel "swappiness" value. The value ranges from 0 to 100, with zero roughly meaning that process memory will never get swapped out for the sake of disk caching, while a value of 100 means that process memory is swapped out very aggressively in favour of disk caching. A more in-depth explanation of how the kernel manages swappiness can be found here.

By default, this value is set to 60, which is tuned more for server throughput than for desktop responsiveness. On my desktops I usually set the value to 10, which seems to be a good fit.

The way to set it on Ubuntu is:

$ sudo sysctl vm.swappiness=10

and to make the changes permanent, just add the following line to /etc/sysctl.conf:

vm.swappiness=10
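
To check the value that's currently in effect:

cat /proc/sys/vm/swappiness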

Thursday 28 June 2012

Apache - Enabling the info module

If you've ever wanted something like the phpinfo page for Apache, showing all of the modules that are enabled as well as configuration and compilation settings, just make use of the mod_info module.
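
On Ubuntu, enabling it should just be the usual a2enmod dance (assuming the stock Apache packages), after which the page is served at /server-info as configured below:

sudo a2enmod info
sudo service apache2 restart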


This can be useful for getting a definitive answer on which configuration files are being loaded as well as what modules are being loaded and what they're configured to.

However, as with phpinfo, you have to keep security in mind when using it, as a lot of this information could be useful to potential attackers. The recommended way of securing it is to modify the configuration file (/etc/apache2/mods-available/info.conf) to deny access from all but trusted IP addresses:
<IfModule mod_info.c>

<Location /server-info>
    SetHandler server-info
    Order deny,allow
    Deny from all
    Allow from localhost ip6-localhost
#    Allow from .example.com
</Location>

</IfModule>
By default, the only access allowed is from localhost.

Tuesday 26 June 2012

Humble Indie Bundle - Bastion

With my birthday falling this month, I bought and downloaded the Humble Indie Bundle as a present for myself. After seeing the lineup of games for this bundle, I couldn't say no, and apparently neither could a lot of other people: the bundle has raked in over $5 million, selling just under 600k bundles.

To be honest, I haven't played any kind of computer games for a while now, so it's been a challenge trying to find the time to play. Out of all the games in the bundle, I started with Bastion, which attracted me to it with its artistic visuals.

The game is good, really good. The only criticism I can make is that it's a little on the difficult side, though that likely has to do with me not having played any video games for a while.

Wednesday 30 May 2012

Grails many-to-many deletes

This is an interesting problem I ran into recently when trying to delete a child object that's a member of a many-to-many relationship.

So, the classes are as follows:

class List {

String name
String description

static hasMany = [tasks:Task]
}

class Task {

String name
String description

static belongsTo = List
static hasMany = [lists:List]
}


Pretty simple, right? But try calling the delete action of the "Task" controller and you'll soon run into problems. The problem is that when the child object is deleted, its references in the parent aren't deleted. If you're not sure what this means, just keep an eye on the database contents and you'll see that after deleting the child object you end up with a bunch of foreign key references that are now invalid.

The solution is to add some code like the following to the delete action, right?

taskInstance.lists.each{
  it.removeFromTasks(taskInstance)
}


However, if you try this you'll get a "java.util.ConcurrentModificationException", which is telling you that Java doesn't allow you to change the contents of a collection (list, set, etc.) while iterating over that same collection.

The solution which I found is kind of kludgey, but does the job:

def tmp = []
tmp.addAll(taskInstance.lists)
tmp.each{
  it.removeFromTasks(taskInstance)
}


This somehow doesn't "feel" right. It feels like it shouldn't be this hard to delete a child object and have the parent object be updated at the same time.


Tuesday 29 May 2012

Multi-boot with MultiSystem

Recently I read about a tool called MultiSystem, which you can get from here. The tool is a live Linux CD which allows you to create multi-bootable USB drives. Why would this be useful? I found a use for it immediately in that it allowed me to boot my desktop PC, which doesn't have a CD/DVD drive, off of the USB and try out the latest version of Ubuntu, make sure that everything works and install the OS. 

How does it work? It's really simple: you gather all of the live CD ISOs that you want to put onto a FAT32-formatted spare USB drive, start up MultiSystem and drag and drop the .iso files onto the interface. Admittedly the interface is somewhat randomly designed, but once you drop the .iso files the rest is taken care of. The next step is to configure your PC to boot off the USB drive, which may require a change in the BIOS settings. Once that's configured and you've booted off the USB, you should get a GRUB-like menu with a choice of all of the live CDs which you've configured the USB drive with.

It's a great tool for trying out different flavours of Linux/BSD based systems and there's even an option to configure the USB to be Mac bootable. So, look out for another post down the road about setting up a dual boot system on my Macbook. 

Wednesday 18 April 2012

Internal LAN on LXC

When I wrote my previous post about setting up LXC, one of the things I found was that when installing the lxc package, it went ahead and created an "lxcbr0" interface.

It turns out that this interface is actually an "internal network" which you can connect your VM's to if you want them talking to each other directly, as opposed to any network which the host is also on.

To set up my VMs, I just added another interface and connected it to the "lxcbr0" bridge by adding the following lines to the configuration:

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.hwaddr = 4a:49:43:49:79:ef


and then configuring the interface in the "interfaces" file:

auto eth1
iface eth1 inet static
    address 10.0.3.2
    netmask 255.255.255.0


Then I did the same to another VM and they were able to talk to each other.
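
For reference, the second VM just gets another address on the same subnet (the addresses here are examples), and a quick ping from it confirms the link:

auto eth1
iface eth1 inet static
    address 10.0.3.3
    netmask 255.255.255.0

ping -c 3 10.0.3.2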

Wednesday 11 April 2012

Btrfs snapshots and LXC

In my previous post I talked about LXC, which is a light-weight virtualization technology for Linux. One thing which LXC lacks is the ability to take snapshots. As the VM runs as a set of regular processes in RAM, at the moment it's not possible to just take a copy of its memory state the way you can in VMware etc.

So, in order to work around this, we're only going to take snapshots of VMs when they're shut down, and we're going to make use of the Btrfs snapshot functionality.

First of all, we need to create a Btrfs filesystem. I assume that you have a spare drive or partition which you can use for this:

$ sudo mkfs.btrfs -L btrfs-test /dev/sda8 

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL 
WARNING! - see http://btrfs.wiki.kernel.org before using 

fs created label btrfs-test on /dev/sda8 nodesize 4096 leafsize 4096 sectorsize 4096 size 10.00GB 
Btrfs Btrfs v0.19

Now we can mount the filesystem:

$ sudo mount /dev/sda8 /lxc

Before we set about creating our VM's, we're going to create some "subvolumes", which we'll be able to snapshot. We're going to use this feature of Btrfs to handle the snapshotting of our VM's.

$ sudo btrfs subvolume create /lxc/vm0
Create subvolume '/lxc/vm0'


Now that we've done this, we can go ahead and create a VM using the template scripts, and configure it, as described in my previous post.

$ sudo /usr/lib/lxc/templates/lxc-ubuntu -p /lxc/vm0 ...

Now that we've created a brand new VM, we're going to create a snapshot of its "clean" state, so that we can roll back to it should something go wrong. We do this by creating a snapshot called "vm0-clean" of the /lxc/vm0 subvolume.

$ cd /lxc/
$ sudo btrfs subvolume snapshot vm0 vm0-clean
Create a snapshot of 'vm0' in './vm0-clean'
$ sudo btrfs subvolume list /lxc
ID 256 top level 5 path vm0
ID 258 top level 5 path vm0-clean


Now we can start the VM:

$ sudo lxc-start -n vm0 -f /lxc/vm0/config

And install/configure our software:

$ sudo apt-get install apache2 postgresql

Now, suppose that we're humming along nicely, but then realise that we've made a mistake and installed apache2 instead of tomcat6 and postgresql instead of mysql-server and want to start over. All we would need to do is to delete the "vm0" subvolume and rename the "vm0-clean" directory to "vm0":

$ sudo btrfs subvolume delete /lxc/vm0
Delete subvolume '/lxc/vm0'
$ sudo btrfs subvolume list /lxc
ID 258 top level 5 path vm0-clean
$ sudo mv vm0-clean vm0
$ sudo btrfs subvolume list /lxc
ID 258 top level 5 path vm0


Notice how the ID of the snapshot doesn't change, even though we've renamed it.

Now, when we start up the VM, we can see that the "apache2" and "postgresql" packages haven't been installed yet, because we've rolled our VM back to the original snapshot that we've taken.

Now, perhaps the scenario with installing the wrong packages isn't that realistic (it would probably be easier to just remove the packages instead of rolling back the snapshots), however, this was just chosen to demonstrate the capabilities of the technology and you can probably imagine a scenario where a snapshot would be more useful. e.g. testing out a software upgrade, which you're concerned might break some functionality.

NOTE: Another way of using snapshots is to mount them directly, by passing the "-o subvol=..." option at mount time, as described at: http://btrfs.ipv5.de/index.php?title=SysadminGuide#Managing_snapshots
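
For example, mounting the clean snapshot somewhere else for inspection might look like this (device and subvolume names as per the setup above):

sudo mkdir -p /mnt/vm0-clean
sudo mount -o subvol=vm0-clean /dev/sda8 /mnt/vm0-clean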

Monday 9 April 2012

LXC on Ubuntu

This post talks about how to set up LXC (Linux Containers) on Ubuntu 12.04. LXC is an operating-system-level virtualization technology which allows you to run multiple virtual machines on one host.

There are quite a few limitations to this type of virtualization when compared with the full, emulator-style virtualization that VMware, VirtualBox etc. use. One of the main ones is that you're unable to run operating systems different from the host's, i.e. we can only run Linux VMs on our Linux host.

The big advantage is performance. Because the host doesn't have to bother with all of the code which does virtual hardware emulation, the virtual machines run a lot faster in general.

So, to setup LXC, we first need to install it:

sudo apt-get install lxc lxctl uuid btrfs-tools

The 'lxctl', 'uuid' and 'btrfs-tools' packages aren't really needed, but come recommended, and it doesn't hurt to install them.

Now, at this point I did a reboot, which may not be necessary, and afterwards checked the LXC configuration using:

lxc-checkconfig

You should find that all of the different settings are set to "enabled".

Now that we've got LXC installed, we can go ahead and start creating our first VM. Luckily, the 'lxc' package comes with a set of template scripts, which make setting up a VM easy. These scripts are located under '/usr/lib/lxc/templates' and to create our first VM we run:

sudo /usr/lib/lxc/templates/lxc-ubuntu -p /lxc/vm0/
 
Where '/lxc/vm0' is the path to the VM. Note that you will have to create this directory as it doesn't exist by default.
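
Creating it is just a matter of:

sudo mkdir -p /lxc/vm0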

Once you run that script, you'll see a lot of output from the VM getting created and initialized. From the output and from looking at the template script, it looks like the template takes all of the currently installed packages, copies them over to the VM filesystem and configures them.

Once this is done, we need to configure the networking, which I found to be the trickiest part of the setup. Because the guest VM is using the same hardware as the host, it has the ability to use the network interface attached to the host. Now, obviously you don't want both the host and the guest using the same interface, as it will lead to IP address and MAC address conflicts, so the guest should have a distinct MAC and/or IP address.

There are several ways to configure the networking for the guest VM's and there are example configuration files of the many ways that they can be configured under file:///usr/share/doc/lxc/examples/ (note that you can enter this location into your web browser and it should load). For our purposes, we're going to go with the lxc-veth.conf file, which sets up a virtual network interface connected to a network bridge which we have to create.

So, firstly, we need to create a network bridge, which is done by adding the following lines to the /etc/network/interfaces file:

auto br0
iface br0 inet dhcp
bridge_ports eth0


This will create the bridge that we're going to connect the virtual network interface of the VM to. In order to enable it, restart the 'networking' service:

sudo service networking restart

Note that we have to create a bridge, even if we've only got one physical interface to connect to it.
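
If you have the bridge-utils package installed, you can verify that the bridge came up and that eth0 is attached to it with:

brctl show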

Then add the relevant lines from the example config to the VM configuration file, under /lxc/vm0/config:

...
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 4a:49:43:49:79:bf
lxc.network.ipv4 = 10.0.0.0/24
...


Note that the IP network should match that of your host interface, otherwise you might have some problems getting an IP through DHCP. Also, note that the MAC address is just taken from the example file and was probably randomly generated.

That should complete the configuration. We are now free to start up the VM using the 'lxc-start' command:

sudo lxc-start -n vm0 -f /lxc/vm0/config

This should start up the VM and bring up the console on the terminal screen. For the default Ubuntu template, you can log in using the ubuntu/ubuntu username/password pair.

Once you've logged in, you can confirm that the VM has a different IP address to that of the host and start configuring it. 

Funnily enough, just as I finished writing up this post, I stumbled upon a Launchpad blog post talking about how they're using LXC to speed up their testing: http://blog.launchpad.net/general/parallelising-the-unparallelisable

Monday 26 March 2012

Programming and premature optimization

I recently watched a lecture by Jonathan Blow, which can be found here. In it he talks about the kinds of challenges that you have to go through as an independent game developer. The main point to take away is that you really have to take a holistic view of the whole "business" and not just the "beauty of computer science" bits which programmers tend to focus on. One of the main points he covers is to avoid at all costs the sin of premature optimization. The argument goes that if you spend too much time looking for the "perfect" way of doing something, you're likely to over-optimize it, wasting your time on a problem that won't return the effort you put in and ending up with code which can't be easily reused.

One of the ways that he gets this message across is to take a look at one of his games, Braid, which it turns out is approximately 90,000 lines of code. He makes the point that the industry average for "lines of code" per programmer, per year is 3,250. At this rate, it would take some ridiculous amount of time, like 28 years to produce a game. So, in order to launch a game by yourself you have to be "super productive" and spending time on anything which won't lead to the game launching (like implementing complex algorithms, which only optimize the game by 0.5%) will without a doubt result in failure.

It's a compelling point and one that can be seen at work in the real world. It's often not the "best" code that ends up being successful, but rather code that ships. The trade-off I guess is that this "ship first" mentality means that a lot of the time, the user ends up with poor quality code, buggy and resistant to upgrades.

Thursday 9 February 2012

Converting param values in Grails

There are some cases in Grails where you want to compare values as Integers instead of the default String objects. There are two ways to do this. The first is to use either the "parseInt" or "valueOf" methods, which are plain Java methods. The other way to skin this cat is to call the "int()" method on the params object:

if(Integer.parseInt(params.user.id) != user.id){
...
}
if(params.int('user.id') != user.id){
...
}


There are other methods which the params object has to easily convert HTTP post values to well known data types, such as:

params.short(...)
params.byte(...)
params.long(...)
params.double(...)
params.boolean(...)


Another nice feature is that these methods accept a second, optional parameter, which is the default value returned if there's an error in the conversion:

def price = params.float("seventy", 0.0)

This behaviour is documented in the Grails documentation here, although there doesn't seem to be a comprehensive list of the methods available at the time of writing.

STS/Eclipse version control plugins

STS/Eclipse has a number of plugins which make it easy to work with your version control system right in your IDE. Which one you install largely depends on which version control system you're working with.

If you're working with Subversion, you'll want to use the Subversive plugin. There is also a Subclipse plugin which apparently does the same thing, but I haven't had time to try it out as I've been quite happy with Subversive and if it's not broken...

If you're working with Git, you'll want to use the EGit plugin. I've used this plugin to hook up my projects with GitHub and haven't had any major dramas with it so far.

Wednesday 8 February 2012

Grails - Setting failOnError globally

One of the small annoyances I've found with Grails is that a failed call to the "save" method fails silently rather than stopping the application. One of the ways to fix this is to pass the "failOnError" parameter to the save method, set to true:

def book = new Book(title: "The Shining").save(failOnError: true)

However, this gets annoying, having to pass the parameter every time that you call the "save" method. A solution is to declare it as the default setting and forget about it.

This can be done in Config.groovy, by adding the following line:

grails.gorm.failOnError=true

You can also limit this configuration to specific packages, in case you don't want it to apply to all of the packages used in your application:

grails.gorm.failOnError = ['com.companyname.somepackage','com.companyname.someotherpackage']

From: http://grails.org/doc/latest/guide/conf.html#configGORM

Tuesday 7 February 2012

Override the toString method in Grails

The Grails framework is pretty smart in that if you have objects that are related to one another, it will allow you to associate them at creation time. For example, if you had a User class with a "hasMany" relationship to the Post class, when creating a new Post object you would see a drop-down allowing you to select a User:


Note that this is assuming that you've used the "generate-all" command to create the default scaffolding. The screen listing the Posts also shows the User:



However, as we can see from the above screenshot, the values in the drop-down and the User field don't seem to make much sense. The reason for these values is that by default the generated scaffolding calls the "toString" method on the object, which returns the class name of the object plus its unique id. While this makes sense as a default, we need to change it in order to make it easier for people to read.

Fortunately, changing it is quite straightforward, with us only needing to define (override) the toString method. Simply add the following to the User domain class:

String toString(){
  return username
}


After putting this into the User domain class, restart the Grails application and you should see the usernames of the users in the drop-down when creating a new Post:


This works for any class that you can think of. Simply override the toString method to make it more presentable to the people using the site. I do this almost automatically for all of my domain classes.

Saturday 28 January 2012

How to setup Apache as a Tomcat proxy

In this post we're going to set up Apache to act as a proxy for the Tomcat application server on Ubuntu. First off we need to install the "tomcat6" package from the Ubuntu repositories, which is as simple as:

sudo apt-get install tomcat6

and answering "Y" to download Tomcat along with all of its dependencies. To make sure that the Tomcat server is running, try opening up port 8080 on the machine in your browser. If all is well you will see the Tomcat welcome page. If not, you may need to start up the server, which can be done with:

sudo service tomcat6 start

Next we need to install the Apache HTTP server, which "apt-get" also makes easy for us:

sudo apt-get install apache2

Again, just enter "Y" when asked whether to download the package and all of its dependencies.

In order to enable Apache to act as a proxy for Tomcat, we're going to need to make use of the "proxy" and "proxy_http" modules. Unfortunately these two modules don't come enabled by default, so we're going to have to enable them and restart Apache for the changes to take effect:

sudo a2enmod proxy
sudo a2enmod proxy_http
sudo service apache2 restart


Now we need to tell the proxy module how to proxy the requests and where to proxy them to. For this I've created a "tomcat-proxy" file under /etc/apache2/sites-available/, which we're going to enable using Apache's a2ensite command. The file itself looks like the following:

ProxyRequests Off
ProxyPreserveHost On
ProxyTimeout 1000
TimeOut 1000
#
# Configure the mod_proxy
#
ProxyPass / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/


After editing the file we enable the site and reload Apache's configuration:

sudo a2ensite tomcat-proxy
sudo service apache2 reload


And that's it! If everything's gone to plan we should be able to hit up port 80 on our server and get the Tomcat welcome page.
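
A quick check from the command line (assuming curl is installed) is to request the headers of the front page via Apache:

curl -I http://localhost/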

Note that this isn't the only way to configure Apache as a proxy. A more sophisticated way is to make use of the AJP protocol/module, which is custom designed to work with Tomcat.

To get Apache proxying to Tomcat using the AJP protocol, we have to enable the Apache module and restart Apache for the changes to take effect:

sudo a2enmod proxy_ajp
sudo service apache2 restart

Next we have to enable the AJP connector in Tomcat. This is done in the /etc/tomcat6/server.xml file. If you edit this file you'll need to uncomment a line, which looks like:

<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

and restart Tomcat:

sudo service tomcat6 restart

The next step is to simply go back to our configuration file under /etc/apache2/sites-available/tomcat-proxy and change the protocol and port of the URLs we supplied to the ProxyPass and ProxyPassReverse directives:

ProxyRequests Off
ProxyPreserveHost On
ProxyTimeout 1000
TimeOut 1000
#
# Configure the mod_proxy
#
ProxyPass / ajp://127.0.0.1:8009/
ProxyPassReverse / ajp://127.0.0.1:8009/


You'll notice we've replaced "http" in the URL with the custom "ajp" protocol. Restart Apache, and hitting port 80 on the server should bring up the Tomcat welcome page as before.

What's the difference between the two approaches to proxying? Functionally there's not really any difference that the user gets to see. However, AJP is a binary protocol, compared to the plain HTTP proxying that the first approach uses, which should mean less data passed between the proxy and the application server as well as lower latencies. Look out for a future post benchmarking the two approaches to see what the real-world difference in performance is.

Thursday 26 January 2012

How to export your Blogger posts to Wordpress

I've been trying to export my Blogger posts to Wordpress for some time now. The reason was to assure myself that if in the future I wanted to migrate away from Blogger for whatever reason, there was an easy way to transfer all of the content to an alternative system. I spent ages trying to get the Blogger Importer plugin to work, but it constantly errored out, giving a message about Google denying the request due to it being malformed.

I had almost given up on finding a way to import the content when a Google search led me to the "Importing Content" section of the Wordpress Codex. This was useful as it led me to this page:

http://blogger2wordpress.appspot.com/

which converts your Blogger export file into the Wordpress format and gives it to you as a downloadable file. After downloading it, I just went to the "Tools", "Import" section of the Wordpress admin console, selected the "Wordpress" link, installed the plugin, selected the file and voila! The posts were imported, along with the images, which had been saved as attachments. The only thing I've found missing is that the import didn't bring across the labels that accompanied each post.

Wednesday 25 January 2012

Writeable Rsync server with authentication

In my previous post we talked about how to set up a simple read-only rsync server. In this post, we'll take that read-only example and expand it to allow write access by authenticated users, each with their own credentials.

In order to do this, we need to modify the /etc/rsyncd.conf file, and change the following lines:

read only = no
auth users = bozo
secrets file = /etc/rsyncd.secrets


Note that "bozo" is the username of a fake user we're going to create. We can see in this configuration that we're pointing to a file under /etc/rsyncd.secrets. We're going to have to create this file and populate it with the credentials for any users we have created. In this case, we populate it with bozo's username and password:

bozo:clown

We also have to set the permissions on this file to make sure that it's only readable by the root user, using the chmod command:

chmod 600 /etc/rsyncd.secrets

Now, usually we would run the "reload" command to send a message to the server to reload its configuration, but when we do this for rsync, we get the following message:

$ service rsync reload
 * Reloading rsync daemon: not needed, as the daemon
 * re-reads the config file whenever a client connects.


Which is very handy. Now when we connect from the client side, we have to do so using the credentials we've just created. The command looks like:

rsync -r bozo@192.168.1.10::public/ .


This will ask us for a password, which we know is "clown" from before, after which the copy should start as usual. To test out the write capability of the server, we just need to create a file in our current directory and then execute the rsync command going the other way:

rsync -r . bozo@192.168.1.10::public/

In the future we might want to automate the rsync process so that it can run as a cron job or other scheduled task, which means there won't be a human there to enter the password. This can be worked around by using the "--password-file" option of the rsync command, like so:

rsync --password-file=~/.rsync_pass -r . bozo@192.168.1.10::public/

Note that as with the rsyncd.secrets file mentioned previously, you'll have to change the permissions on this file to ensure that it's not world readable. The file itself just needs to contain the password to use and nothing else.
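
Something along these lines, with the path matching the example above:

chmod 600 ~/.rsync_pass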

Thursday 19 January 2012

Simple read-only rsync server on Ubuntu

If you haven't heard of it, rsync is a piece of software which allows you to keep files in sync over a network while only copying across the "changes" from one copy to the next. The advantage of this is that a lot less data needs to be transferred than with something like FTP or SFTP. This also makes rsync perfect for things like backups, which don't change much from one iteration to the next.

Installing rsync is as simple as:

sudo apt-get install rsync

Although, I've found that with the server version of Ubuntu, it's already installed after installing the OS.

By default, the server doesn't come configured or enabled to start at boot. In order to configure it, we will need to copy across the example rsync configuration into the /etc directory and modify the /etc/default/rsync file:

sudo cp /usr/share/doc/rsync/examples/rsyncd.conf /etc/

Modify the /etc/default/rsync file to look like the following:

# defaults file for rsync daemon mode

# start rsync in daemon mode from init.d script?
#  only allowed values are "true", "false", and "inetd"
#  Use "inetd" if you want to start the rsyncd from inetd,
#  all this does is prevent the init.d script from printing a message
#  about not starting rsyncd (you still need to modify inetd's config yourself).
RSYNC_ENABLE=true

# which file should be used as the configuration file for rsync.
# This file is used instead of the default /etc/rsyncd.conf
# Warning: This option has no effect if the daemon is accessed
#          using a remote shell. When using a different file for
#          rsync you might want to symlink /etc/rsyncd.conf to
#          that file.
# RSYNC_CONFIG_FILE=

# what extra options to give rsync --daemon?
#  that excludes the --daemon; that's always done in the init.d script
#  Possibilities are:
#   --address=123.45.67.89 (bind to a specific IP address)
#   --port=8730 (bind to specified port; default 873)
RSYNC_OPTS=''

# run rsyncd at a nice level?
#  the rsync daemon can impact performance due to much I/O and CPU usage,
#  so you may want to run it at a nicer priority than the default priority.
#  Allowed values are 0 - 19 inclusive; 10 is a reasonable value.
RSYNC_NICE=''

# run rsyncd with ionice?
#  "ionice" does for IO load what "nice" does for CPU load.
#  As rsync is often used for backups which aren't all that time-critical,
#  reducing the rsync IO priority will benefit the rest of the system.
#  See the manpage for ionice for allowed options.
#  -c3 is recommended, this will run rsync IO at "idle" priority. Uncomment
#  the next line to activate this.
# RSYNC_IONICE='-c3'

# Don't forget to create an appropriate config file,
# else the daemon will not start.


The only variable that's really changed from the default is "RSYNC_ENABLE", which has been set to "true".

If we have a look at the config file under /etc/rsyncd.conf, we can see that we're allowing read-only access to the /var/www/pub directory to any user:


# sample rsyncd.conf configuration file

# GLOBAL OPTIONS

#motd file=/etc/motd
#log file=/var/log/rsyncd
# for pid file, do not use /var/run/rsync.pid if
# you are going to run rsync out of the init.d script.
# pid file=/var/run/rsyncd.pid
#syslog facility=daemon
#socket options=

# MODULE OPTIONS

[public]

comment = public access
path = /var/www/pub
use chroot = yes
# max connections=10
lock file = /var/lock/rsyncd
# the default for read only is yes...
read only = yes
list = yes
uid = nobody
gid = nogroup
# exclude =
# exclude from =
# include =
# include from =
# auth users =
# secrets file = /etc/rsyncd.secrets
strict modes = yes
# hosts allow =
# hosts deny =
ignore errors = no
ignore nonreadable = yes
transfer logging = no
# log format = %t: host %h (%a) %o %f (%l bytes). Total %b bytes.
timeout = 600
refuse options = checksum dry-run
dont compress = *.gz *.tgz *.zip *.z *.rpm *.deb *.iso *.bz2 *.tbz


Now all we need to do is to create the folder and start up the rsync server:

sudo mkdir -p /var/www/pub
sudo service rsync start


In order to test the server, you can just drop any files into the /var/www/pub directory and then download them using:

rsync -r [hostname/IP address]::public/ .


e.g. rsync -r 192.168.1.10::public/ .

This will copy across all of the files from /var/www/pub into your current directory. Note that if you leave out the dot at the end, it will merely display the list of files under /var/www/pub. Another thing to note is that by default the rsync server uses TCP port 873 to communicate with the rsync client, so you may have to open up this port on your firewall if it is blocked.
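
If the server happens to be running ufw, opening the port would look something like this (adjust for whatever firewall you actually use):

sudo ufw allow 873/tcp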

Monday 9 January 2012

pwgen - Generate random passwords on Linux

There's a useful package in Debian/Ubuntu called pwgen, which allows you to generate random, human-pronounceable (debatably) passwords.

It works simply by running the 'pwgen' binary:

$ pwgen
Teboo0sh Rahz3Jee aeWae1mn isheL9oo Ahbubo6o fie7ow7L eij3Re0i ieCheh2A
oSae0pah uGu1Co0k Pa0PhieZ riope6Ie IeC6aiYi zie4Yahx Yoh0quae yab2iCae
Ooqu2wei chel2ohG EeSh5jok hoxoZa7o He8gaale gao6EiSh Uo8loh1b Phie2gie
Ehei7ais yeicoo4Z Een1ohcu duZ9ook6 aQuu3wei YuW4gaen soh8ueCh Phohwai5
bi9bu4Li ieWah7ae Aip5Ohv0 lieM1aiG raeF6voe Fooduo9a pohqu3Da Ahn0iRio
Uwaech6U ne8Quu9b AhV3oNee zieG1thi Shai1Chu Zae0pie1 aet1geFe Ko8wi4go

It also comes with some useful options, such as:

-y    include at least one special character in each password
-N    specify the number of passwords generated (by default the entire terminal is filled up with passwords)
-H    generate repeatable passwords by using a file and a piece of text to seed the random number generator

You can install pwgen on Debian/Ubuntu using:

apt-get install pwgen
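
As a quick example, generating three 16-character passwords that each include a special character looks like this:

pwgen -y -N 3 16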

For a full list of options have a look at 'man pwgen'.