Friday, August 28, 2009

Helpful online tools

These are a bunch of online tools which I find indispensable when I need to do some quick validation, encoding, or formatting/indenting.
  1. XML escaping - http://escapehtmlforxml.com/
  2. XML indenting (critical when trying to read ugly XML) - http://xmlindent.com/
  3. JSlint (javascript code checker) - http://www.jslint.com/
  4. JSONlint (JSON validator and formatter) - http://www.jsonlint.com/
  5. W3C HTML validator - http://validator.w3.org/
  6. W3C CSS validator - http://jigsaw.w3.org/css-validator/
  7. WDG tools (web validators) - http://htmlhelp.com/tools/
  8. XHTML/CSS page validator - http://xhtml-css.com/
  9. shell tools (xml, base64, md5/sha1, and more) - http://www.shell-tools.net/
Hope this list of tools is helpful!

Saturday, August 08, 2009

Easy license headers with maven

If you are like me you probably hate trying to maintain license headers on your source code files. It has to be done for pretty much all of my projects (since I deal in open source 99% of the time) but it is pure drudgery. I found a great plugin for maven 2 which makes this a piece of cake (very easy). The maven-license-plugin can (optionally) check your source files for headers (you control which ones or just use the defaults) and add in or replace the headers for you. Forget about doing this manually anymore; those days are over. You just specify a license header template like this (i.e. create a file, I use LICENSE_HEADER):

Copyright (C) ${year} ${holder} <${contact}>

This file is part of ${name}.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Then add something like this to your project pom.xml (in the build section under plugins):

<plugin>
<groupId>com.google.code.maven-license-plugin</groupId>
<artifactId>maven-license-plugin</artifactId>
<configuration>
<header>${basedir}/LICENSE_HEADER</header>
<excludes>
<exclude>target/**</exclude>
<exclude>m2-target/**</exclude>
<exclude>**/*.properties</exclude>
</excludes>
<properties>
<name>${project.name}</name>
<year>${project.inceptionYear}</year>
<holder>Aaron Zeckoski</holder>
<contact>azeckoski@gmail.com</contact>
</properties>
<encoding>UTF-8</encoding>
</configuration>
<executions>
<execution>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
</plugin>

You need to add in the plugin repo in the pluginRepositories section:

<pluginRepository>
<id>mc-release</id>
<url>http://mc-repo.googlecode.com/svn/maven2/releases</url>
</pluginRepository>

That config will cause the check to run on every build (ignoring properties files is a good idea since the plugin has trouble with them). Files with a missing license header will cause the build to fail, which ensures you remember to run the command to format them. The properties you set there will fill in the ${field} vars in the license header template.
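To make the ${field} substitution and the header check a bit more concrete, here is a minimal Python sketch of the idea. It is not how the maven plugin is implemented; the shortened template, the property values, and the helper names render_header and has_header are all made up for illustration.

```python
from string import Template

# Hypothetical, shortened version of the LICENSE_HEADER template file.
HEADER_TEMPLATE = Template(
    "Copyright (C) ${year} ${holder} <${contact}>\n"
    "\n"
    "This file is part of ${name}.\n"
)

# Stand-ins for the <properties> block in the pom.xml.
properties = {
    "name": "My Project",
    "year": "2009",
    "holder": "Aaron Zeckoski",
    "contact": "azeckoski@gmail.com",
}


def render_header(template, props):
    """Fill in every ${field} placeholder (raises KeyError if one is missing)."""
    return template.substitute(props)


def has_header(source_text, header):
    """Return True if every non-blank header line appears in the source text."""
    wanted = [line.strip() for line in header.splitlines() if line.strip()]
    return all(line in source_text for line in wanted)


header = render_header(HEADER_TEMPLATE, properties)

# A source file whose leading comment already carries the header.
java_source = (
    "/*\n"
    " * Copyright (C) 2009 Aaron Zeckoski <azeckoski@gmail.com>\n"
    " *\n"
    " * This file is part of My Project.\n"
    " */\n"
    "public class Example {}\n"
)

print(header)
print("header present:", has_header(java_source, header))
```

A file that fails this kind of check is what makes the build fail until the headers are formatted.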

Now run the maven command to check for license headers:
mvn license:check
or simply do a build (which will also run the check):
mvn clean install
You should get a report about the files missing license headers.
Run this command and all the license headers will be added or updated to match your template:
mvn license:format
One final note, you can remove all the license headers using "mvn license:remove". Very cool.

Monday, July 27, 2009

Keys to High Performance Web Applications

I know web application performance has been discussed about 100 times before, but it bears repeating (if only briefly and mostly with links) since it is such an important topic.

If you have never tried to ensure your web application will run well then my rule #1 is to not change your application architecture. Program performance is a separate issue that is rarely a problem compared to network latency and server request overhead. I am not saying it is never a problem but you should try things that are much easier first before diving into a restructuring or a rewrite of your app (in most cases buying more hardware is cheaper and safer). As Donald Knuth says, "Premature optimization is the root of all evil (or at least most of it) in programming."

Now that you have done nothing to start (so far so good right?) it is time to do something. Get the YSlow analyzer for Firebug and run it against your web application. It will provide you with a report which you can use to identify possible performance issues. The Firebug network monitor and to a lesser extent the Safari Web Inspector are also good tools for profiling the requests on a page.
Here is a list of a few more performance apps from the RazorSpeed blog and around the web:
No discussion of web app performance would be complete without including a link to Steve Souders' blog. While you are there check out compare. Some of the results are surprising and others not so much.

Many tuning options are in the hands of your system administrator so if that is not you then you can relax a little bit. However, as a web application developer (frontend/web developer or backend engineer), you should at least know where the common problems lie and this is where the bible of web application performance (Yahoo performance rules) comes in. It is a list of best practices which can be roughly summarized as reduce requests, spread the load, utilize caching and compression, and structure pages for efficiency. If you want the shorter list then check out 14 Rules for Faster-Loading Web Sites (it is just a list of rules taken from the bible with samples). If you prefer an alternative list then try the PageSpeed rules.

If you are lucky you are using an environment that has performance tuning built in (like the grails ui performance plugin or the RockStarApps eclipse/aptana plugin) which will do most of what the performance rules suggest automatically (what can be done in the app anyway). Most web servers provide support for compression so that usually is best handled at that layer anyway. For the rest of the best practices, you will just have to learn and apply the performance rules best practices. In most cases it will be well worth your time.
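To get a feel for why compression is on every one of those lists, here is a small Python sketch that gzips a chunk of repetitive markup. The page body is made up, and real pages will compress less dramatically than this artificial one, so treat the ratio as an illustration only.

```python
import gzip

# Hypothetical page body: markup is repetitive, so it compresses well.
html = ('<li class="item"><a href="/page">A link</a></li>\n' * 200).encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)

print(f"{len(html)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")
```

In practice you would let the web server do this for you rather than compressing in the app.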

Monday, July 20, 2009

JavaForge requires authn to access SVN

I went to set up an account on JavaForge for Steeple today. Everything went pretty smoothly with the initial setup. It was easy to create an account and set up a new project. The site allows for fine-grained permissions which are easy to configure and has a very nice wiki. It also included code analysis and build tools (which are why I decided to try it out in the first place).

I hit the first bump after creating the SVN repository. I could not find the URL to the repository anywhere. After searching around for a while I figured out that the URL was:
http://svn.javaforge.com/svn/steeple

The next issue, which ended up being insurmountable, was related to access to the SVN repository. Try as I might there was no way to allow public access to it. Anyone trying to access the public URL will receive a basic auth challenge. Just to view the repository a user has to enter the username "anonymous" with a password of "anon". As a result I had to drop JavaForge and go with my backup of Google Code for now.

I did post a question on the javaforge forums about this but from reading the other forum messages I think it is just not possible.

Thursday, July 16, 2009

My first week with Groovy and Grails

I spent time over the last week learning about Groovy (a dynamic language for the JVM) and Grails (a code by convention web application framework built on Groovy) so I thought I would write up my impressions and some of the fun things I learned.
So you have a sense of where I am coming from, I am a long time Web applications and Java/PHP/Javascript/Perl developer. I am somewhat newer to Python and Ruby but I prefer Python. I am a REST and Open Source advocate when I am in the right mood.

If you are totally unfamiliar with Groovy then I recommend you take a look at this post as it lays out the reasons why you might want to learn more about it:
http://codetojoy.blogspot.com/2009/06/case-for-groovy.html

If you know you are going to be writing web-apps then just skip Groovy and go straight for Grails. If you are looking to do some JSR-223 (Java Scripting) stuff (with Groovy) then Groovy is the place to focus on. Either way, you will need to get familiar with the basics of Groovy so look at these:
http://groovy.codehaus.org/Quick+Start
http://groovy.codehaus.org/Collections
Feel free to check out some sample Groovy scripts I made which illustrate many of the key concepts.

The Getting Started Guide for Groovy is huge and not really a very good place to try to get started unfortunately. That said, the Beginners Tutorial is pretty good, especially the section on closures.

If you want to get going with Grails then it is a little bit easier since it mostly builds on Groovy. Grails borrows heavily from Ruby on Rails so if you are familiar with it then things will come to you quickly. This is the best place to start (not surprisingly):
http://www.grails.org/Quick+Start
I really liked the screencasts (which are oddly located here also). They provided a nice introduction to Grails without much effort. When you are ready for a little more the tutorials are a good next step.

Things I learned in no particular order:
  • Maven and Grails do not get along - There is some really weak maven integration available but it does not work very well. The structure of grails (e.g. src/groovy) does not match the maven standard structure (e.g. src/main/groovy). Mostly maven just allows you to run the grails build commands which is easier to do with grails itself. I struggled with this for a while before just giving up on using maven. The Grails team recommends using Ivy if you want to add dependency management (or Grails plugins which are preferred if they are available).
    The grails and groovy artifacts are available in maven repositories which is nice.
  • Grails and Eclipse don't easily integrate - The Groovy plugin is pretty good (not great) but the build integration is pretty poor and requires you to jump through hoops. It seems like the integration with IDEA is a lot better and recommended by the Grails team.
  • Groovy supports closures - The closure support in groovy is great and very easy to use. I found myself writing closures like crazy (even more than in Javascript) and it made the code very clean.
    NOTE: I ran across one weird bug where passing in a String[] to a closure causes it to be misinterpreted as a collection of separate arguments for each array entry. There are hacks to get around this but be aware that it may bite you.
  • Grails has a great plugin system - Grails has a pretty powerful plugin system which allows easy extension of a grails app. The plugins seem to be very easy to install and fairly easy to write. There is a complete guide if you are interested in developing your own plugins.
  • Grails app creation puts in too much stuff - The structure generated by grails create-app has a lot of stuff in it which you will probably want to clean up (like the hibernate plugin by default for example). There is no uninstall for plugins so just remove the dir of the plugin to get rid of it. Be careful to not leave in a lot of things you are not going to use and clean out the sample stuff under web-app as well.
  • Grails convention is not very flexible - Grails prides itself on "code by convention" and "convention over configuration" and it does a good job of establishing a lot of conventions. It takes a little while to get used to them but if you follow them then things are pretty easy. Unfortunately, "convention over configuration" implies that you can override a convention with configuration when needed, and in many cases you cannot. I have been bitten a few times already when I tried to do things that are not "on the rail".
  • Grails uses prototype.js by default - The built in javascript engine in Grails is the portal/multi-framework unfriendly prototype.js. I can't use it so I am playing around with using jQuery instead (so far this is proving to be manageable). There is a jQuery plugin which helps make this easier.
I am still getting used to things but I am not sure what Groovy/Grails gains me over using Jython. It seems like Jython does everything Groovy does plus it has the massive Python community for support.
As far as scripting languages go I think I prefer PHP, Javascript, Python, and Perl (in that order) over Groovy but this may just be due to a lack of familiarity on my part.

Friday, July 03, 2009

Sakai AppBuilder Plugin updated to 0.8.7

The Sakai AppBuilder Eclipse Plugin is updated to a new version (0.8.8) which includes updates for Sakai K1 and support for Wicket. Many thanks to Steve Swinsburg who did all the heavy lifting on this update. You can install the plugin using instructions here or update it to the new version from within eclipse if you have installed it before.
The Sakai AppBuilder is a RAD tool that allows you to quickly create Sakai webapp projects in Eclipse that will work in the Sakai Framework. Use these as a basis for the projects that you want to make without all the busy work of creating the structures and adding in all the dependencies. You can choose various UI layer options and implementation types to get you started quickly.
NOTE: Updated for the 0.8.8 release (minor fix from 0.8.7)

Wednesday, July 01, 2009

Aptana Studio 1.2 crash and upgrading to 1.3

I recently upgraded Java to version 1.6 (build 1.6.0_13-b03-211) on my macbook pro running OSX 10.5.7 (leopard). It was a bit of a chore but it mostly sped things up and allowed me to run some of the newer apps that require Java 6.

I had a major casualty though when Aptana Studio stopped working. It would simply crash without even giving a decent error message and the logs were not helpful. I normally run the standalone version of Aptana Studio (1.2.7) which is built on eclipse 3.2. This seems to no longer run on OSX and Java 6 so I went in search of a fix. After lots of forum browsing, tweaking configurations, and reinstalling I ended up retiring version 1.2 and trying out version 1.3 (still in beta). It was hard to find the 1.3 downloads page so here is a link.

It installed quite easily and ran, which was a major improvement. However, when I tried to install the features (plugins) I am used to, they all indicated that they were incompatible and would not install. This seemingly hopeless situation was actually easily fixed by updating Aptana Studio (Help -> Software Updates). Once that was done (definitely restart here) I installed the features (php, pydev, git) and plugins (epic) that I like and everything seems to work fine again.

Hopefully this will save someone a little pain.

Wednesday, June 17, 2009

PHP dash in class and method names

I ran into what seems like a common issue when working with PHP and SimpleXML today. Parsing XML is normally pretty easy:
<?php
header('Content-type: text/plain');

$xmlData = <<<XML
<?xml version='1.0'?>
<trees>
<fruit>
<apple name='apple' type='Deciduous' has-fruit='Y' />
<pear name='pear' type='Deciduous' has-fruit='Y' />
</fruit>
<pine>
<white name='whitepine' type='Coniferous' has-fruit='N' />
</pine>
</trees>
XML;

$xml = simplexml_load_string($xmlData);

echo "Testing SimpleXml";
echo "\n".$xmlData;
echo "\nName:".$xml->fruit->apple->getName();
echo " Type:".$xml->fruit->apple->attributes()->type;
?>
Output:
Name:apple Type:Deciduous

However, if you decide to include a hyphen or a dash in the name of your attribute things get a bit more interesting. The code has to be adjusted since a property name written as a bare identifier cannot contain "-". To make it work, the attribute name has to be wrapped in braces and single quotes (e.g. "{'has-fruit'}").
echo "\n".$xmlData;
echo "\nName:".$xml->fruit->apple->getName();
echo " Type:".$xml->fruit->apple->attributes()->type;
echo "\nFruit?:".$xml->fruit->apple->attributes()->{'has-fruit'};
Output:
Name:apple Type:Deciduous
Fruit?:Y

Friday, June 12, 2009

Tricky SOLR schema issue with StrField

I have been setting up SOLR (version 1.3) as a search index for the Darwin Correspondence project. While making a few changes I ran into a really annoying issue today related to the way the schema configuration works. The SOLR schema (schema.xml) allows you to set up Analyzers and Filters which control how terms are indexed and how searches are executed.

I needed to make it so we could match names when the case is not exact and when the chars are special (i.e. "u" needs to match a name with "ü"). The field started out like this:
<fieldType name="name" class="solr.StrField" sortMissingLast="true" omitNorms="true" compressed="false" indexed="true" stored="true">
For my first attempt I added an analyzer to the field like so:
<fieldType name="name" class="solr.StrField" sortMissingLast="true" omitNorms="true" compressed="false" indexed="true" stored="true">
<analyzer type="index">
<tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.ISOLatin1AccentFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
I loaded data into SOLR and tried out some searches and got no results for the inexact queries. I was getting exact matches only (as if I had no analyzers). When I checked the solr admin analysis page it indicated that the filters were working and the tests there even seemed to show that things were ok. Unfortunately, I found out that SOLR does not actually execute the analyzers if the field class is set to solr.StrField. It doesn't fail or indicate errors in the logs but your searches will not work the way you expect them to. Changing the field over to class solr.TextField fixed the problem.
The correct configuration for the field is this:
<fieldType name="name" class="solr.TextField" sortMissingLast="true" omitNorms="true" compressed="false" indexed="true" stored="true">
<analyzer type="index">
<tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.ISOLatin1AccentFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
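
If it helps to see what that filter chain is meant to achieve, here is a rough Python approximation of the accent folding, lowercasing, and trimming steps. It is only a sketch: it skips tokenizing entirely, the function names are made up, and Unicode decomposition is not identical to what ISOLatin1AccentFilterFactory does (for example, it will not expand "ß" to "ss").

```python
import unicodedata


def fold_accents(text):
    """Drop combining marks so that e.g. "ü" becomes "u"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_name(text):
    """Approximate the accent filter, lowercase filter, and trim filter."""
    return fold_accents(text).lower().strip()


# The indexed value and the query only match once both are normalized.
indexed = normalize_name("Müller")
query = normalize_name("  MULLER ")
print(indexed, query, indexed == query)
```

With solr.StrField none of this normalization happens, which is why only exact matches came back.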

I spent a few hours figuring this out so I hope that this saves someone a little time.

Monday, May 25, 2009

PHP CodeSniffer tips

I really love tools which help my code be more correct and more readable. I have referred to tools like JSLint (JS) and FindBugs (Java) in previous posts and now I am going to write some tips about using PHP CodeSniffer (PHP) (a.k.a. phpcs). It is probably the most aggressive of the three and can be especially tricky on the requirements it puts on your file headers.

Here is a sample file header:

<?php
/**
* Presto - a lightweight REST framework for PHP
*
* Presto is a simple to use and very lightweight REST framework for PHP,
* it will help you to handle rest routing and input/output of data without
* getting in your way
*
* PHP Version 5
*
* LICENSE:
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
* http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* @category File
* @package Presto
* @author Aaron Zeckoski <azeckoski@vt.edu>
* @copyright 2009 Aaron Zeckoski
* @license http://www.apache.org/licenses/LICENSE-2.0 Apache License, Version 2.0
* @version SVN: $Id:$
* @link https://link/to/your/project/site
* @since inception
*/

A few comments about the header:
  • The copyright cannot have a comma after the year but the year can be a range. "2002-2009 AZ" is ok, "2009, AZ" is not
  • The license should appear inside the header like shown, not above it (this would cause an ERROR in phpcs)
  • The alignment of the data after the tags (e.g. @license, @version) is not optional, misaligned data causes an ERROR
Sample class:

/**
* My class which does some stuff
*
* @category Class
* @package Presto
* @author Aaron Zeckoski <azeckoski@vt.edu>
* @license http://www.apache.org/licenses/LICENSE-2.0 Apache License, Version 2.0
* @link https://link/to/the/project/again
*/
class RestController
{
const DEFAULT_RESOURCES_DIR = 'resources';

/**
* This is a method in my class
*
* @param object $_resourcesPath [optional] the resource path
*
* @return void
*/
protected function loadResources($_resourcesPath = self::DEFAULT_RESOURCES_DIR)
{

There are also a few rules about classes that caught me out:
  • Class methods must use camelCase. myMethodName is good, my_method_name is not
  • Classes MUST have a comment on them and it has to include a lot of the fields from the header. The ones I list in the sample above are the minimum (seriously).
  • The blank line between the @param tags and @return is not optional
  • Just a note on constants in PHP, you use const inside classes and define outside
Take a look at the sample file and class headers here for more details: http://pear.php.net/manual/en/standards.sample.php

Saturday, May 23, 2009

Open Repositories 09 developer view

I just got back from the Open Repositories 2009 conference in Atlanta, GA, US and wanted to highlight a few things which were interesting to me (from a developer's perspective).
  1. Pluggable (extendable) repository systems
    DSpace 2 was initially designed to support plugins and there were some suggestions which will improve it further. The Eprints team showed off a really cool proof of concept of a plugins store which allows browsing, downloading, and installing from within eprints. The Fedora Commons team indicated interest in using OSGi to manage their services and enable plugin points.
  2. ReST interfaces
    All the major systems have some ReST in place now and are working on having fully restful access available in the fairly near future. I think (and hope) this will lead to more mashup style integrations and easier access to repository data which can only be a good thing.
  3. DuraSpace
    The merging of the DSpace and Fedora Commons communities into DuraSpace is cool because it means 2 teams of great developers will now be one. They also showed off the DuraCloud distributed storage service which is interesting from a scaling and backup perspective.
  4. Developer Repo Challenge
    There were some really cool projects and ideas demonstrated for the repo challenge. My personal favorite was the EprintsAppStore. I also really liked the FedoraFS entry from a technical coolness perspective and MentionIt (the winner) for its simplicity.

Tuesday, May 19, 2009

OSGi system and bundle start levels

OSGi has a concept of start levels. This is fairly well documented in the r4 core spec but there seems to be some confusion around how they work so here is a quick summary for my own reference (and in case it helps anyone else).

Start Levels determine the start order of bundles (not services). There are two types of start levels in an OSGi system: the system start level and the bundle start level (set for each bundle). The default start level of an OSGi system will be 1 (this is called the beginning start level and can be configured) and the bundles installed in an OSGi system will use the default start level when they are installed unless this is changed manually. An OSGi system has a current level (called the active start level) which determines the bundles which are allowed to be started. If a bundle has a start level higher than the active start level it will not start when the OSGi system starts up and it will not start if given a manual start command. If the active start level increases to be greater than or equal to the level of the bundle it will be started. Likewise, if the active start level changes to be below the level of a bundle, it will be shut down.

Here are a few points about start levels that were not completely obvious to me:
  • The bundles at a given start level will all have their start() method completely executed before any bundles at a higher level are started. Start order within the start level is indeterminate.
  • When the active start level is changed, the system will move in increments of 1 until the desired level is reached. For example, from 5 to 10 means the system will do 6, 7, 8, 9, and finally 10.
  • If the start level is changed many times rapidly it must completely reach all requested levels in sequence. The system will not give up on level 3 if it was requested and has not been reached yet just because level 15 was requested while it was moving to level 3.
  • The system bundle is always located at start level 0. This cannot be changed.
  • Start level should NOT be used as a way to control service startup order. This is considered a programming error in OSGi as service start orders are not guaranteed and services may come and go at will.
  • Start level can be used as a way to reduce load on a system by assigning non-critical parts high start levels. This allows the level to be lowered in order to reduce the load and shut down non-critical services and bundles.
  • OSGi has a compatibility mode which forces all bundles to use start level 1 (this is a good way to check to make sure you are not depending on the start levels as a way to ensure service start order).
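Here is a toy Python simulation of the stepping behaviour described above. It is not the OSGi API (the Framework class and its methods are made up), it assumes every installed bundle is marked to be started, and it sorts bundles within a level only to make the output repeatable, since the real order within a level is indeterminate.

```python
class Framework:
    """Toy model of an OSGi framework's active start level."""

    def __init__(self):
        self.active_level = 0   # the system bundle lives at level 0
        self.bundles = {}       # bundle name -> bundle start level
        self.started = []       # names of started bundles, in start order
        self.visited = []       # every level the framework passed through

    def install(self, name, level=1):
        self.bundles[name] = level

    def _at_level(self, level):
        # Sorted only for repeatable output; real order is indeterminate.
        return sorted(n for n, lvl in self.bundles.items() if lvl == level)

    def set_start_level(self, target):
        # Moving up: step one level at a time, starting bundles as we go.
        while self.active_level < target:
            self.active_level += 1
            self.visited.append(self.active_level)
            for name in self._at_level(self.active_level):
                self.started.append(name)
        # Moving down: stop bundles above the target, one level at a time.
        while self.active_level > target:
            for name in self._at_level(self.active_level):
                self.started.remove(name)
            self.active_level -= 1
            self.visited.append(self.active_level)


fw = Framework()
fw.install("logging", 1)
fw.install("web", 3)
fw.install("reports", 5)

fw.set_start_level(3)
print(fw.visited, fw.started)   # passes through 1, 2, 3

fw.set_start_level(5)
print(fw.started)               # "reports" is started at level 5

fw.set_start_level(4)
print(fw.started)               # "reports" is stopped again
```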
NOTE: If you are working with apache felix 1.6.0 there is a confusing error in the default config file. The system start level property is commented out as org.osgi.framework.startlevel but the correct value is org.osgi.framework.startlevel.beginning.

The OSGi spec (section 8, page 203) has more details about start levels.

Monday, April 27, 2009

Installing perl modules in OSX

This should not have been hard but it ended up being a big pain for me so I thought I would document the process for installing a perl module on Mac OSX. It ended up being tricky because there were two ways I found to do it. I will first list the way I ended up NOT using.
  1. Install Xcode - this is required for installing and running darwin ports (a.k.a macports)
    This is a disk image with a binary installer which is about 900 MB in size and takes a few minutes to install. It also requires an apple developer connection membership before you can download it
  2. Install DarwinPorts - this is required for installing yum
    Download the disk image and run the binary installer, you have to fill in your name and email address to download
    Run this command as root once you finish running the installer:
    sudo port -d selfupdate
  3. Install Yum - this is required to install the perl module
    Run this command as root (it takes a long, long time) to have darwin ports install yum:
    sudo port install yum
  4. Use yum to install the perl module
    yum -y install perl-Frontier-RPC
After doing all this and facing failure I searched around more and found the instructions here:
http://triopter.com/archive/how-to-install-perl-modules-on-mac-os-x-in-4-easy-steps/
For the same module (Frontier) the commands ended up being (after installing Xcode, much like linux):
sudo su
perl -MCPAN -e shell
install Frontier::Client
This worked out a lot better for me (though the process took about 15 minutes total). Make sure you run this as root since it will produce lots of fun failures otherwise.

Thursday, April 16, 2009

Configuring Jetty in the pax-web OSGi bundle

It took a while for me to figure out how to configure Jetty as deployed in the pax-web web service bundle. It took even longer to figure out how to enable AJP in Jetty using the bundle fragment. Unfortunately, the steps on the pax-web site are incorrect (now fixed: 2009-04-17) for the current version of pax-web (0.6.0). Hopefully this will save someone else from having to go through this "fun".

Note that this will only work in pax-web version 0.5.2 or higher and requires an OSGi container that supports OSGi bundle fragments. I was doing this in Felix 1.6.0 (inside Sling).
  1. Check out my bundle fragment source code
    https://source.caret.cam.ac.uk/camtools/trunk/sling/sling-jetty-config
  2. Edit the jetty.xml file to suit your taste
    (the one in there enables AJP)
  3. Build using Maven 2
    mvn clean install -Pajp
    (Leave off the -Pajp if you are not enabling/using ajp)
  4. Install the bundle fragment into your OSGi container using whatever mechanism you are used to. It tends to work best to install it with the same level as the bundle it is being used with.
  5. Restart the OSGi container
    (this is not always required but I find it tends to work a lot better if you do)
Unfortunately, I could not find a very good way to verify that this works other than putting a breakpoint at line 66 of /pax-web/bundle/src/main/java/org/ops4j/pax/web/service/internal/JettyServerImpl.java. You should see the jetty.xml file get loaded (resource will be non-null). You could also just check to see if Jetty is behaving as you would expect with the changes.

If there are problems (the fragment seems to have no effect) then here are a few debugging steps that may work:
  1. If you are using felix then you can run the resolve command from the felix command shell for your fragment. If you get nothing then you are probably good to go, if you get a failure then the Fragment-Host: org.ops4j.pax.web.pax-web-service value probably does not match the one in your container. Make sure the value is the same as the symbolic name of your installed pax-web-service-*.jar.
  2. Try making sure your jetty.xml actually works with Jetty. There are instructions on the Jetty website.

Wednesday, April 08, 2009

Filtering in FindBugs

Here is an example findbugs exclusion filter which is generally useful for most projects. I could not seem to find a good example of something general anywhere so hopefully this will be a helpful reference to others. If you place an xml file like this into your project in eclipse you can configure findbugs to use it as an exclusion filter.

Here is the sample xml (escaped):
<FindBugsFilter>
<!-- filter out test classes with medium (2) or low (3) warnings -->
<Match>
<Or>
<Class name="~.*\.AbstractTest.+" />
<Class name="~.*Test" />
</Or>
<Bug category="PERFORMANCE,MALICIOUS_CODE,STYLE,SECURITY" />
</Match>

<!-- remove the rules which require all passed vars to be immutable -->
<Match>
<Bug code="EI,EI2" />
</Match>

<!-- Filter out certain categories of bugs -->
<Match>
<Bug category="STYLE" />
</Match>
</FindBugsFilter>
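
The two Class name patterns in the first Match are regular expressions (that is what the leading ~ means) and they are matched against the whole class name. Here is a quick Python sketch for checking which class names they would catch. The class names are made up, and Python's re module is standing in for Java's regex engine, which behaves the same for patterns this simple.

```python
import re

# The two patterns from the filter above, without the leading "~".
CLASS_PATTERNS = [r".*\.AbstractTest.+", r".*Test"]


def is_excluded_test_class(class_name):
    """True if the whole class name matches either pattern."""
    return any(re.fullmatch(p, class_name) for p in CLASS_PATTERNS)


# Hypothetical class names to try the patterns against.
for name in [
    "org.example.UserServiceTest",
    "org.example.AbstractTestBase",
    "org.example.UserService",
    "org.example.TestHelper",
]:
    print(name, is_excluded_test_class(name))
```

Note that a class like TestHelper is not caught, because the second pattern only matches names that end in Test.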

To make this work in eclipse (with findbugs plugin installed) just do the following:
  1. Create an xml file in your project with the contents from above
  2. Right click the project name and select Properties
  3. Choose FindBugs
  4. Check the box marked Run FindBugs automatically
  5. Click on the Filter files tab at the top
  6. Click the Add button next to Exclude filter files:
  7. Select the xml file you created
  8. Click OK
You can adjust the filter using the findbugs guide; the full list of bug types and categories is available in the FindBugs documentation.
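As a rough illustration of what the two Class patterns in the filter match: a name that starts with ~ is treated as a Java regular expression and is matched against the whole class name. The sketch below uses plain java.util.regex to try the same two patterns against a few made-up class names (FilterPatternCheck and the org.example names are hypothetical and only there for illustration; this is not how FindBugs itself is invoked).

```java
import java.util.regex.Pattern;

final class FilterPatternCheck {
    // The two patterns from the filter, without the leading "~"
    private static final Pattern ABSTRACT_TEST = Pattern.compile(".*\\.AbstractTest.+");
    private static final Pattern TEST_SUFFIX = Pattern.compile(".*Test");

    /** Returns true if the fully qualified class name matches either pattern in full. */
    static boolean isTestClass(String className) {
        return ABSTRACT_TEST.matcher(className).matches()
                || TEST_SUFFIX.matcher(className).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "org.example.UserServiceTest",
            "org.example.AbstractTestBase",
            "org.example.UserService",
            "org.example.TestUtils"
        };
        for (String name : names) {
            System.out.println(name + " -> " + isTestClass(name));
        }
    }
}
```

Note that a helper such as TestUtils is not matched, since neither pattern covers names that merely start with Test. If your test helpers follow that naming you would need to add another Class entry.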

Tuesday, April 07, 2009

Developing with OSGi and Apache Sling

These are notes I compiled while working on a project which uses OSGi (Apache Felix) and Apache Sling as the foundational framework. Sling is essentially Felix with JCR and templating bundles installed and a REST bundle. I will tell you how I got going with the basics of OSGi and Sling and some of the issues I ran into. First a few important links:
Before we really get started I want to mention something about OSGi in general. OSGi is basically a system for making Java code modular and handling isolation of dependencies (no slop allowed). This is done via a lot of really complex classloader graphing. A chunk of code in OSGi is called a bundle and a system will typically be made up of many bundles. Inside bundles there are packages which can be used internally (private), exported so others can use them (this is how service dependencies are shared), and imported (this is how a bundle would get the dependencies it needs to use an external service). OSGi completely manages the lifecycle of the bundles in the system. Bundles define an activator (implements BundleActivator) which allows the developer to control startup and shutdown actions and register services.
The modular nature of OSGi means services could go away at any time (or may not be started when your bundle is starting). Because of this, bundle code should be written to expect that a service might not be ready to use when the bundle is starting. Blocking in the activator until a service becomes available is also a bad idea because OSGi activators are expected to return quickly, so delaying them will cause failures. This generally means using the listeners and trackers provided by OSGi. The practice is called the Whiteboard Pattern (basically inverted listeners; those familiar with EntityBroker/EntityBus will recognize this as how providers work). These links are a good intro and you will want to be familiar with this conceptually before you attempt to create your first real bundle (helloworld not included):
http://www.theserverside.com/tt/articles/article.tss?l=WhiteboardForOSGi
http://www.knopflerfish.org/osgi_service_tutorial.html
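
To make the whiteboard idea a bit more concrete without needing an OSGi container, here is a minimal plain-Java sketch. The Whiteboard class is a hypothetical stand-in for the OSGi service registry (it is NOT the OSGi API) and all the names are made up for illustration. The point is only that listeners register themselves with the registry, and the event source asks the registry who is present at the moment it fires, so listeners can come and go at any time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class WhiteboardSketch {
    /** What a listener implements; in OSGi this would be registered as a service. */
    interface GreetingListener {
        void onGreeting(String message);
    }

    /** Hypothetical stand-in for the service registry (NOT the real OSGi API). */
    static final class Whiteboard {
        private final List<GreetingListener> listeners = new CopyOnWriteArrayList<GreetingListener>();

        void register(GreetingListener listener) { listeners.add(listener); }
        void unregister(GreetingListener listener) { listeners.remove(listener); }
        List<GreetingListener> current() { return listeners; }
    }

    /** The event source keeps no listener list of its own; it asks the whiteboard each time. */
    static int publish(Whiteboard board, String message) {
        int delivered = 0;
        for (GreetingListener listener : board.current()) {
            listener.onGreeting(message);
            delivered++;
        }
        return delivered; // zero is a normal outcome: nobody is listening right now
    }

    public static void main(String[] args) {
        Whiteboard board = new Whiteboard();
        final List<String> received = new ArrayList<String>();
        GreetingListener listener = new GreetingListener() {
            public void onGreeting(String message) { received.add(message); }
        };

        System.out.println("delivered before register: " + publish(board, "hello"));
        board.register(listener);
        System.out.println("delivered after register: " + publish(board, "hello again"));
        board.unregister(listener);
        System.out.println("delivered after unregister: " + publish(board, "anyone?"));
        System.out.println("received: " + received);
    }
}
```

In real OSGi code the registry role is played by the framework and the lookup is usually done with a ServiceTracker or ServiceListener, but the shape of the interaction is the same.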

Here are the steps I took to get Sling up and running (more details on the Sling site):
  1. Checkout the source code using subversion:
    svn co http://svn.apache.org/repos/asf/sling/trunk sling
    (I checked out revision 758703 since the current revision was not working)
  2. Build the source using maven 2:
    cd sling
    mvn clean install
  3. Run Sling using the built in Jetty webserver (as an executable jar):
    java -jar launchpad/app/target/org.apache.sling.launchpad.app-5-SNAPSHOT.jar
    (Note: the version of the jar will change over time)
    (you can also drop a sling war into your own servlet container)
    (you can change the port by adding "-p #" (where # is the port number like 9090))
    (you can cause all logs to go to the console by adding "-f -")
    (you can change the logging level of sling by adding "-l #" (where # is 0-4 with 4=debug))
  4. Test out sling using your web browser:
    Go to http://localhost:8080/
    Click on the console link (login as admin/admin)
Now, if you are going to do anything in Sling (or Felix) at all you are likely to need to debug it, so here are the steps to use a debugger with Sling (generally applies to any java app really).
  1. Run Sling with debugging options enabled:
    java -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,address=9001,server=y,suspend=n -jar launchpad/app/target/org.apache.sling.launchpad.app-5-SNAPSHOT.jar
    (You should see this in the logs: Listening for transport dt_socket at address: 9001)
  2. Attach the debugger from eclipse (or use whatever debugger you like):
    Run -> Debug Configurations
    Right click Remote Java Application -> New
    Connect tab: Set Port to 9001
    Source tab: Add the imported Sling code and your OSGi bundle project
    Click the Debug button at the bottom
    (You should not get an error if everything connects up)
  3. Place a breakpoint in the Runtime class on the gc method
  4. Click on System Information and then the Run button next to Garbage Collection
    (the debugger should pick up the call and pause the JVM at the breakpoint)
  5. Use the debugger to trace through if you like (but bear in mind that the admin console is not actually part of Sling, it is part of Apache Felix so you will need the source for Felix)
Now you probably want to deploy a bundle so here are the steps to make a really simple one you can build and deploy into sling. I use the maven-bundle-plugin for this example, see the links for more details about it:
Here is the part of the maven pom that configures the bundle manifest:
<dependencies>
  <!-- OSGi -->
  <dependency>
    <groupId>org.osgi</groupId>
    <artifactId>osgi_R4_core</artifactId>
    <version>1.0</version>
  </dependency>
  <dependency>
    <groupId>org.osgi</groupId>
    <artifactId>osgi_R4_compendium</artifactId>
    <version>1.0</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.felix</groupId>
      <artifactId>maven-bundle-plugin</artifactId>
      <extensions>true</extensions>
      <configuration>
        <instructions>
          <Bundle-Activator>org.azeckoski.osgi.SampleActivator</Bundle-Activator>
        </instructions>
      </configuration>
    </plugin>
  </plugins>
</build>
  1. Checkout the sample bundle code:
    svn co https://source.sakaiproject.org/contrib/caret/osgi-sample/tags/sample-1.1/ osgi-sample
  2. Build the bundle using maven 2:
    cd osgi-sample
    mvn clean install
  3. Access the Felix console for bundles (included in Sling):
    http://localhost:8080/system/console/bundles
  4. Browse and select the bundle (target/sample-1.1.jar)
  5. Click Install or Update and then Refresh Packages
  6. Scroll down to the Sample OSGi Bundle
    If it is listed then you have a properly installed bundle!
  7. Click the Start and then Stop buttons to the right of the bundle
  8. You should see something like this in the logs:
    Sample starting at: Mon Apr 13 13:15:27 BST 2009
    Sample stopping at: Mon Apr 13 13:15:29 BST 2009
  9. Now click Start to make sure the bundle is running
  10. Restart Sling (use the admin console or just kill it and rerun the command)
  11. You should see that the bundle is running (it was started automatically) and in the startup logs the Sample starting... should appear
Here is the (very) simple code from the activator class (comments removed). It just prints out a message when the bundle starts and another when it stops.
import java.util.Date;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class SampleActivator implements BundleActivator {
    public void start(BundleContext context) throws Exception {
        System.out.println("Sample starting at: " + new Date());
    }

    public void stop(BundleContext context) throws Exception {
        System.out.println("Sample stopping at: " + new Date());
    }
}
Congratulations. You have installed your first bundle. If you want to try out the debugging you can put breakpoints in the start and stop methods. It took me a couple of days to get to this point, which I think is not great, so I hope this tutorial has really sped up the process for you.

Now for the advanced bit. Creating your own bundle. I have a somewhat realistic set of bundles where one depends on another and one uses the http.service (and other bundles) so chances are you won't face things that are too much more complex than this.
There are a few things that I learned going through this that were not all that apparent so hopefully I can save you some digging with these pro-tips:
  • OSGi start order does not guarantee service start order - The order in which bundles start does not guarantee that their services will also be started. In fact, relying on this is a mistake since OSGi is meant to be modular and services could go away at any time (see the note at the top about OSGi and whiteboard pattern). Code should be written to expect that a service might not be ready to use when the bundle starts. Waiting for services to be available is bad also because OSGi activators are expected to be quick so delaying them will cause failures.
  • Bundles should import the packages they export - This is not obvious but it makes sense when you think about it. Since there may be other bundles of higher rank that are exporting packages that match yours, you should use the highest priority, even in your own bundle. This behavior is actually taken care of by most bundle making tools automatically so just don't be surprised to see your packages in the imports.
  • javax packages will need to be imported - Only java.* is available to your bundle by default so all the dom, xml, net, etc. stuff from javax has to be imported. If you do not import it and happen to use it then your bundle will fail at runtime. Luckily, most OSGi installations have a core which already exports most of this stuff so just put it into your import-package as optional like so (just an example):
    org.w3c.dom;resolution:=optional,org.xml.*;resolution:=optional,javax.*;resolution:=optional,*
  • Services in the manifest are deprecated - As of OSGi R4, listing services (using export/import-services) is deprecated. All exports/imports are handled as packages now and the registration of the services is done in the activator.
  • Felix shell access inside Sling - There is a felix remote shell which can be used to access the felix shell service. It can be installed by just installing a couple bundles (shell, shell remote). This will allow you to run felix shell commands inside Sling via telnet.
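On the optional imports tip: when a package is imported with resolution:=optional, the code that uses it has to cope with the classes being absent at runtime. One common way to handle that, sketched here in plain Java, is to probe for a class by name before touching anything that depends on it. OptionalDependency and the class names passed to it are illustrative only and are not part of the sample bundles.

```java
final class OptionalDependency {
    /** Returns true if the named class can be loaded by the given class loader. */
    static boolean isAvailable(String className, ClassLoader loader) {
        try {
            // initialize=false: only check that the class can be found, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false; // the package is not visible to this class loader
        } catch (LinkageError e) {
            return false; // the class was found but one of its own dependencies is missing
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = OptionalDependency.class.getClassLoader();
        System.out.println("javax.xml.parsers.DocumentBuilderFactory available: "
                + isAvailable("javax.xml.parsers.DocumentBuilderFactory", loader));
        System.out.println("com.example.missing.Thing available: "
                + isAvailable("com.example.missing.Thing", loader));
    }
}
```

Inside a bundle you would pass the bundle's own class loader (as main does here), since that is the loader whose imports decide what is visible.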
I strongly suggest you use code that you know works already when building your first functional bundle. Pick something that does not have a lot of dependencies as well. For the following tutorial steps I will take you through the download, build, activate, and test process with my bundles and then look at some of the code.
  1. Checkout the sample bundle code:
    svn co https://source.sakaiproject.org/contrib/caret/osgi-eb/tags/eb-1.0/ osgi-eb
  2. Build the bundle using maven 2:
    cd osgi-eb
    mvn clean install
  3. Use the Felix console for bundles to install and start the eb-services and eb-rest bundles:
    (should be eb-component/target/eb-services-1.0.jar and eb-webapp/target/eb-rest-1.0.jar)
  4. Verify they started by refreshing and making sure they appear to be running:
    For some reason there are no errors logged to the console by default when a bundle fails to start and no messages appear in the web interface. As a result you simply get a silent failure where it seems to have worked but is actually just installed and inactive. The bundle should say Active next to it (make sure you refresh).
    NOTE: The errors will be logged into SLING_HOME/logs/error.log
  5. Go to the URL that is setup for the bundles (http://localhost:8080/eb), it should load up the description page for the rest webapp portion
For this sample code I have used the OSGi ServiceTracker and ServiceListener with a bit of utilities scaffolding (ServiceTracker2 and ServicesTracker) to set up my bundles and services in the whiteboard pattern. While this works fine for a smaller use case, I would suggest looking into OSGi DS (Declarative Services) and the Maven SCR Plugin when working with a large number of bundles and services.

Here is the start method from the EB services activator showing an example of how services are registered and tracked. This also demonstrates one way to track things which are used by our services. Pre-OSGi Java programming might require the EntityProvider to be manually registered with the EB system using a register method. By leveraging OSGi's ability to lookup services, we can instead simply have developers create their providers and register them with OSGi as services. Then the eb-services bundle picks up the registered providers and handles the registration automatically.
public void start(BundleContext context) throws Exception {
    System.out.println("INFO: Starting EB module");

    // Create the EB core services
    coreServiceManager = new EntityBrokerCoreServiceManager();

    // register trackers to handle the optional services
    eipTracker = new ServiceTrackerPlus<ExternalIntegrationProvider>(context, ExternalIntegrationProvider.class) {
        @Override
        protected void serviceUpdate(ServiceEvent event, ExternalIntegrationProvider service)
                throws Exception {
            coreServiceManager.getEntityBrokerManager().setExternalIntegrationProvider(getService());
        }
    };

    // register a tracker for the entity providers (will find them as osgi services and handle registration)
    providerTracker = new ServiceTrackerPlus<EntityProvider>(context, EntityProvider.class) {
        @Override
        protected void serviceUpdate(ServiceEvent event, EntityProvider service)
                throws Exception {
            EntityProvider provider = getService(event);
            if (event.getType() == ServiceEvent.UNREGISTERING) {
                coreServiceManager.getEntityProviderManager().unregisterEntityProvider(provider);
            } else {
                coreServiceManager.getEntityProviderManager().registerEntityProvider(provider);
            }
        }
    };

    // look up a service we optionally want to use but do not require
    ServiceTrackerPlus<DeveloperHelperService> dhsTracker =
            new ServiceTrackerPlus<DeveloperHelperService>(context, DeveloperHelperService.class);
    DeveloperHelperService dhs = dhsTracker.getService();
    System.out.println("DeveloperHelperService is currently: " + dhs);

    // register the core services from this bundle
    ebRegistration = context.registerService(EntityBroker.class.getName(), coreServiceManager.getEntityBroker(), null);
    ebManagerRegistration = context.registerService(EntityBrokerManager.class.getName(), coreServiceManager.getEntityBrokerManager(), null);
    ebProviderManagerRegistration = context.registerService(EntityProviderManager.class.getName(), coreServiceManager.getEntityProviderManager(), null);
    ebEVAPManagerRegistration = context.registerService(EntityViewAccessProviderManager.class.getName(), coreServiceManager.getEntityViewAccessProviderManager(), null);
    ebHSAPManagerRegistration = context.registerService(HttpServletAccessProviderManager.class.getName(), coreServiceManager.getHttpServletAccessProviderManager(), null);

    System.out.println("INFO: Started EB module and registered services");
}

In this code example, we see the code to handle the start of the EB rest activator. This allows the eb-services bundle to be stopped and started without having to manually do anything to the rest bundle (whose services depend on the ones in the eb-services bundle). It will shut down the rest services until the core services it requires are available again.
public void start(BundleContext context) throws Exception {
    System.out.println("INFO: Starting EB ReST module");
    // initialize tracker
    this.requiredServicesTracker = new ServicesTracker(context, HttpService.class, EntityBrokerManager.class) {
        @Override
        protected void requiredServicesReady(Object service, ServiceTracker2 changed,
                Map<String, ServiceTracker2> serviceTrackers) throws Exception {
            // required services are ready so startup
            startServices(getService(HttpService.class), getService(EntityBrokerManager.class));
        }
        @Override
        protected void requiredServicesChanged(Object service, ServiceTracker2 changed,
                Map<String, ServiceTracker2> serviceTrackers) throws Exception {
            // required services changed so stop and restart
            stopServices(getService(HttpService.class));
            startServices(getService(HttpService.class), getService(EntityBrokerManager.class));
        }
        @Override
        protected void requiredServicesDropped(Object service, ServiceTracker2 changed,
                Map<String, ServiceTracker2> serviceTrackers) throws Exception {
            // required services gone so shutdown
            stopServices(getService(HttpService.class));
        }
    };

    // optional services trackers
    dhsTracker = new ServiceTrackerPlus<DeveloperHelperService>(context, DeveloperHelperService.class) {
        @Override
        protected void serviceUpdate(ServiceEvent event, DeveloperHelperService service)
                throws Exception {
            DeveloperHelperService dhs = getService();
            if (servlet != null) {
                servlet.updateDHS(dhs);
            }
        }
    };
    // TODO need to handle the case of the hsapm changing?
    hsapmTracker = new ServiceTrackerPlus<HttpServletAccessProviderManager>(context, HttpServletAccessProviderManager.class);

    // register the servlet if services are ready
    this.requiredServicesTracker.startCheck();
}

Here are some more links to OSGi materials that may be helpful:
http://neilbartlett.name/blog/osgi-articles/
http://www.osgi.org/About/HowOSGi

NOTE: Updated for the graduation of sling and changes in URLs that resulted

Monday, April 06, 2009

High Quality Javascript

Writing good javascript is still more art than science, but there are a growing number of tools out there to help those who are less artsy. I have tried to compile the ones that I use and a few tips which I think are helpful.

jQuery (http://jquery.com/) - a great javascript framework built for developers
If you are still writing javascript without using a framework like jQuery then you are punishing yourself. Stop that! This framework makes javascript work like it probably should have anyway and it is very reliable and widely used. It helps protect you from browser incompatibilities and makes ajax very easy. There are a large number of addons and extensions which provide flashy widgets and things to make your UI look snazzy.

JSLint (http://www.jslint.com/) - this website / tool is similar to findbugs for java and helps with correctness and valid code practices (NOTE: Can enable JSlint in Aptana Studio, see tips below)
JSLint is a JavaScript program that looks for problems in JavaScript programs. Just paste your javascript file into the box and click JSLint. I use the following options (good starting point if you are not sure what to use):
Assume a browser, Disallow undefined variables, Disallow leading _ in identifiers, and Disallow == and !=
You should strive to have no errors indicated. I recommend you avoid the Strict white space and Require parens around immediate invocations options as these are more likely to cause frustration than be helpful. JSLint will also provide you with a list of all functions, members, and Globals in your code which is helpful as a reference.

jqUnit (http://code.google.com/p/jqunit/) - a test writing framework (like jUnit for java)
This is a test framework which allows a developer to write tests which exercise and validate their javascript code. It is compatible with JSUnit but has special handling for jQuery and since you are using jQuery anyway you may as well get the benefits. The tests run in the browser and normally are accessed by loading a page (e.g. project/test.html). I won't go into the reasons why you should always have unit tests but if you want high quality code then they are not optional. This will have a huge impact on the reliability and change tolerance of your code and since JS code in general tends to change a lot this is critical.

Fluid Infusion (http://fluidproject.org/products/fluid-infusion/) - framework for accessible javascript
The fluid project is trying to ensure that javascript enabled pages are accessible to everyone. They have a framework which includes widgets which are specially designed to be accessible and also have a lot of documentation for developers.

Aptana Studio (http://www.aptana.com/studio) - IDE for javascript, HTML, DOM, CSS
This is an eclipse based IDE for the web. It provides code completion, formatting, validation, and the things you might expect to get from an IDE. This makes writing javascript a heck of a lot easier and it looks better than textedit/notepad. I just use the eclipse plugin rather than the full product now, but if you are a web designer or just trying it out then the full option is probably best (as it seems the plugin does not uninstall from eclipse cleanly).

General Tips:
  • Always namespace your javascript functions and vars - http://www.dustindiaz.com/namespace-your-javascript/
  • Write unobtrusive javascript (this basically means avoid writing code in your html except for a few lines at the bottom to run scripts and pass in values as needed) - http://www.onlinetools.org/articles/unobtrusivejavascript/
  • Always use var in front of variables. If you don't, they will become globals, and when you are namespacing (which you should be) this is considered leaking
  • Declare globals at the top of your JS file in a comment like so:
    /*global jQuery, myGlobal */
  • Always refer to globally used variables at the top of your script so it will fail if the global is missing. This will make JSLint happy and keeps things from appearing to work when globals are not available. Example:
    var $ = $ || function() { throw "JQuery undefined"; };
  • Check to make sure you found something when you use jQuery selectors (if you always expect to find something). This protects you from thinking you have actually found something based on an id when there is nothing by that id available. Example:
    var adhocArea = $("#" + adhocAreaId);
    if (adhocArea.length > 0) {
        // do something
    } else {
        throw "failed to find thing with id: " + adhocAreaId;
    }
  • Use script blocks to initialize javascript when the DOM it is working with has loaded rather than always (ever?) using onLoad. This will cause the javascript to start quicker on slow loading pages and avoid ugly issues like resetting selections made by the user or clearing fields which had loaded before the whole page loaded and the onLoad executed. For example, if you are adding a simple numeric validator to a field, you should place the script tag to load the JS (from a namespaced variable in a separate file) after the form. This way the script executes as soon as the DOM it will act on is loaded.
  • Be familiar with the concept of closures in javascript - http://www.jibbering.com/faq/faq_notes/closures.html
  • Quirksmode is a good reference for javascript cross browser compatibility
  • Enable JSLint validator in Aptana Studio: Preferences -> Aptana -> Editors -> Javascript -> Validation -> Check JSLint Javascript Validator
It is probably a good idea to glance at AJAXian every once in a while as well.

Wednesday, February 18, 2009

Debugging Jetty when running mvn jetty:run

It took me a while to get this to work so hopefully this will save someone some time. If you want to debug a webapp that is run using the mvn jetty:run or mvn jetty:run-war commands then these steps will help you get going quickly.
1) set up your webapp to run using jetty: Maven Jetty Plugin Guide
2) set up the debugger in one of the following ways:
(A) set up your MAVEN_OPTS (environment variable) to include debugging by adding this "-agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n".
Mine looks like this: MAVEN_OPTS='-Xms256m -Xmx512m -XX:PermSize=64m -XX:MaxPermSize=128m -agentlib:jdwp=transport=dt_socket,address=8000,server=y,suspend=n'
If you want it to suspend until the debugger is attached then just set suspend=y
Note that this will stop you from running more than one mvn process
(B) execute mvn using the mvnDebug script (skip step 3)
mvnDebug jetty:run
The only issue with this is that it always suspends and you cannot kill it without attaching to it with a debugger (this annoys me so I use the first option)
Note that you can only have one mvnDebug process running at once
3) Execute mvn jetty:run
4) Attach a debugger (like the one in eclipse) to port 8000 and you are ready to debug
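
If the debugger refuses to attach, it can help to confirm that the JVM really was started with the debug agent. Here is a small sketch of one way to check that from Java using the standard RuntimeMXBean. DebugAgentCheck is a made-up helper name, and where you call it from (a startup log line in your webapp, for example) is up to you.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class DebugAgentCheck {
    /** Returns true if any JVM argument enables the JDWP debug agent. */
    static boolean hasJdwpAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            // newer style (-agentlib:jdwp=...) and older style (-Xrunjdwp:...)
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was actually started with
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("JDWP agent enabled: " + hasJdwpAgent(jvmArgs));
    }
}
```

This only tells you that the agent option was passed to the JVM. It does not tell you whether the port is reachable, so a firewall or a port clash can still get in the way.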

Wednesday, January 28, 2009

SVN 1.5.5 for OSX installer

If you are using OSX and you need to upgrade Subversion (SVN) to 1.5.5 (or higher) you have probably gone to the subversion website and found out that you need fink or macports or some such. Since I am more of the point and click type when it comes to my OS, I prefer installers, so here is the place to get one:
http://www.open.collab.net/downloads/apple/index.html
It is as easy as: download => install => use => done

Tuesday, January 06, 2009

Java Collection Performance

This is just a helpful reference when trying to decide which collections to use in Java. I use this for my personal reference but it may help others as well. The links go to the Sun Javadocs. The collections of each type are ordered based on performance (i.e. the highest performance (highest speed) ones are listed first and will be the fastest for most operations)

List - this is an ordered list of objects, insertion order is maintained and retrieval order is in the list order but items can also be random accessed, duplicate items are allowed, generally allow storage of null values (the ones below do), generally fast to iterate and find items by position but slow to do lookups
  • ArrayList - Unsynchronized, nulls allowed (fastest)
  • Vector - Synchronized, only slightly slower in tests of sizes under 100000
  • Stack - Synchronized, same speed as Vector, LIFO queue
  • LinkedList - Unsynchronized, allows two way iteration and modification of items (like a stack or queue)
  • CopyOnWriteArrayList - Synchronized, significantly slower in tests of large numbers of items or average list changes, only slightly slower when used with very small numbers (<100)
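To illustrate the LIFO and queue remarks in the list above, here is a small sketch that feeds the same items through a Stack and through a LinkedList used as a queue. ListOrderDemo is just an illustrative name and this says nothing about speed, only about the order in which items come back out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Stack;

final class ListOrderDemo {
    /** Pushes the items onto a Stack and pops them all: last in, first out. */
    static List<String> lifo(List<String> items) {
        Stack<String> stack = new Stack<String>();
        for (String item : items) {
            stack.push(item);
        }
        List<String> out = new ArrayList<String>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());
        }
        return out;
    }

    /** Uses a LinkedList as a queue: items come out in the order they went in. */
    static List<String> fifo(List<String> items) {
        LinkedList<String> queue = new LinkedList<String>();
        for (String item : items) {
            queue.addLast(item);
        }
        List<String> out = new ArrayList<String>();
        while (!queue.isEmpty()) {
            out.add(queue.removeFirst());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println("Stack (LIFO): " + lifo(items));      // [c, b, a]
        System.out.println("LinkedList (FIFO): " + fifo(items)); // [a, b, c]
    }
}
```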
Set - this is a set of items with no duplicates (no two items can compare as equal), ordering is typically inconsistent over multiple set iterations depending on the implementation but you should assume the order is effectively random unless the set specifies ordered iteration, generally ok to iterate and fast to do lookups
  • HashSet - Unsynchronized (fastest), slower than HashMap which it is built on, allows nulls
  • LinkedHashSet - Unsynchronized, ordered by insertion, allows nulls
  • TreeSet - Unsynchronized, ordered by the natural ordering of the items or a comparator provided at construction, nulls are only accepted if a comparator that handles them is supplied (with natural ordering, adding null throws a NullPointerException)
  • CopyOnWriteArraySet - Synchronized, significantly slower in tests of large numbers of items or average set changes, only slightly slower when used with very small numbers (<100)
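A quick sketch of the ordering differences between the sets listed above. SetOrderDemo is an illustrative name and the fruit strings are arbitrary sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SetOrderDemo {
    /** Adds the items to the given set and returns them in the set's iteration order. */
    static List<String> iterationOrder(Set<String> set, List<String> items) {
        set.addAll(items);
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("pear", "apple", "fig", "apple");

        // LinkedHashSet: insertion order, duplicate dropped
        System.out.println(iterationOrder(new LinkedHashSet<String>(), items)); // [pear, apple, fig]
        // TreeSet: natural (alphabetical) order
        System.out.println(iterationOrder(new TreeSet<String>(), items)); // [apple, fig, pear]
        // HashSet: same three items, but the order is unspecified
        System.out.println(iterationOrder(new HashSet<String>(), items));
    }
}
```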
Map - Stores key/value pairs (maps keys to values) where the keys must be unique, order of iteration over keys, values, or pairs is highly dependent on the implementation of the map, allowed nulls also vary by implementation, generally very fast to lookup keys and slow to lookup values
  • IdentityHashMap - Unsynchronized (fastest), uses reference equality (==) instead of object equality (equals) to compare keys, actually violates the Map interface guarantee, all iterators are unordered, allows null keys and values
  • HashMap - Unsynchronized, this is the fastest general purpose map, all iterators are unordered, allows null keys and values
  • ConcurrentHashMap - Thread-safe without locking the whole map, all iterators are unordered, does not allow null keys or values
  • Hashtable - Synchronized, all iterators are unordered, does not allow null keys or values
  • LinkedHashMap - Unsynchronized, all iterators are ordered based on insertion order of the original key (does not change if a key is reinserted), allows null keys and values
  • TreeMap - Unsynchronized, iterators are ordered by the natural or comparator ordering of the keys, allows null values, null keys only work if a comparator that handles them is supplied
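
The null handling is the part of this list that is easiest to get wrong, so here is a small sketch that simply tries a null key against several of the maps. MapNullDemo is an illustrative name. The TreeMap here uses natural ordering with no comparator, and I have left its result for you to observe rather than stating it.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

final class MapNullDemo {
    /** Returns true if the map accepts a null key, false if it rejects it with a NullPointerException. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap: " + acceptsNullKey(new HashMap<String, String>()));
        System.out.println("LinkedHashMap: " + acceptsNullKey(new LinkedHashMap<String, String>()));
        System.out.println("Hashtable: " + acceptsNullKey(new Hashtable<String, String>()));
        System.out.println("ConcurrentHashMap: " + acceptsNullKey(new ConcurrentHashMap<String, String>()));
        System.out.println("TreeMap (natural ordering): " + acceptsNullKey(new TreeMap<String, String>()));
    }
}
```

HashMap and LinkedHashMap report true, while Hashtable and ConcurrentHashMap report false.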