Replacing a HashSet with a BitSet

kinow @ Oct 20, 2012 19:51:39 ()

I always read the messages in the Apache dev mailing lists, including the Apache Commons dev mailing list. And you should too. There are always interesting discussions. Sometimes you participate, other times you only watch what’s happening, but in the end you always learn something new.

A few days ago, I found an issue proposing to replace an unnecessary HashSet in ArrayUtils#removeElements() with a BitSet. Here’s what the code looked like:

HashSet<Integer> toRemove = new HashSet<Integer>();
for (Map.Entry<Character, MutableInt> e : occurrences.entrySet()) {
    Character v = e.getKey();
    int found = 0;
    for (int i = 0, ct = e.getValue().intValue(); i < ct; i++) {
        found = indexOf(array, v.charValue(), found);
        if (found < 0) {
            break;
        }
        toRemove.add(found++);
    }
}
return (char[]) removeAll((Object)array, extractIndices(toRemove));

The HashSet created at line 1 of the code above stored the array indices of the elements that should be removed. At line 13 there is a call to the removeAll method, passing the indices to be removed. And here’s what the new code looks like:

BitSet toRemove = new BitSet();
for (Map.Entry<Character, MutableInt> e : occurrences.entrySet()) {
    Character v = e.getKey();
    int found = 0;
    for (int i = 0, ct = e.getValue().intValue(); i < ct; i++) {
        found = indexOf(array, v.charValue(), found);
        if (found < 0) {
            break;
        }
        toRemove.set(found++);
    }
}
return (char[]) removeAll(array, toRemove);

The first difference is at line 1: instead of a HashSet, the code now uses a BitSet. At line 10, instead of adding a new element to the HashSet, it “sets” a bit in the set (the bit at the specified position becomes true). But the important changes are at line 13. The removeAll method was changed, so the array no longer needs a cast to Object. It is also no longer necessary to extract the indices from the HashSet, because the BitSet itself marks each index to remove with a true bit. So the extractIndices method could be removed.
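
To make the idea concrete, here is a minimal, self-contained sketch of how a removeAll that takes a BitSet could work. It is not the actual Commons Lang implementation: the class name BitSetRemoveSketch and the body of removeAll are my own assumptions, written only to show why no separate index-extraction step is needed.

```java
import java.util.Arrays;
import java.util.BitSet;

public class BitSetRemoveSketch {

    // Hypothetical helper, NOT the Commons Lang source: returns a copy of
    // 'array' without the elements whose indices are set in 'toRemove'.
    public static char[] removeAll(char[] array, BitSet toRemove) {
        // Count the set bits that are valid indices of the array.
        int removals = 0;
        for (int i = toRemove.nextSetBit(0);
                i >= 0 && i < array.length;
                i = toRemove.nextSetBit(i + 1)) {
            removals++;
        }

        // Copy every element whose bit is not set.
        char[] result = new char[array.length - removals];
        int dest = 0;
        for (int src = 0; src < array.length; src++) {
            if (!toRemove.get(src)) {
                result[dest++] = array[src];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        char[] array = {'a', 'b', 'a', 'c', 'a'};

        BitSet toRemove = new BitSet();
        toRemove.set(0);
        toRemove.set(2);
        toRemove.set(2); // setting the same bit twice has no further effect

        System.out.println(Arrays.toString(removeAll(array, toRemove))); // prints [b, c, a]
    }
}
```

The BitSet is the set of indices and a fast membership test at the same time, so the copy loop can ask toRemove.get(src) directly instead of first converting a collection of boxed Integers into an int array.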

The code got simpler. But that’s not all. At the Apache Software Foundation you can find a lot of talented developers - that’s why I got so excited after joining them. Besides simplifying the code, the developer responsible for these changes (sebb) also pointed out that the new code consumes less memory and is faster. Ah! And he also wrote unit tests.
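
A rough way to build intuition for the memory claim is the sketch below. It is not a benchmark and not taken from the issue: the class IndexSetFootprintSketch and its methods are hypothetical, and it only compares how many 64-bit words a BitSet needs with how many boxed entries a HashSet holds for the same indices.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class IndexSetFootprintSketch {

    // Number of 64-bit words a BitSet needs to hold the indices 0..count-1.
    public static int wordsUsed(int count) {
        BitSet bits = new BitSet();
        bits.set(0, count); // sets every bit in the range [0, count)
        return bits.toLongArray().length;
    }

    // Number of entries a HashSet<Integer> holds for the same indices.
    // Each entry is a boxed Integer referenced from a hash table node.
    public static int entriesUsed(int count) {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < count; i++) {
            set.add(i);
        }
        return set.size();
    }

    public static void main(String[] args) {
        int count = 1000;
        int words = wordsUsed(count);
        System.out.println("BitSet: " + words + " longs, " + (words * 8) + " bytes of bit data");
        System.out.println("HashSet: " + entriesUsed(count) + " boxed entries");
    }
}
```

For 1000 consecutive indices the BitSet needs 16 longs (128 bytes of bit data), while the HashSet keeps 1000 separate entries. The exact per-entry cost depends on the JVM, so treat this as an illustration of the difference in scale and not as a measurement.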

Adding coverage reports in Jenkins to GoogleTest with gcovr

kinow @ Oct 16, 2012 16:09:33 ()

After the last post about GoogleTest and TestLink using the Jenkins TestLink Plug-in, I received an e-mail asking about coverage with GoogleTest and Jenkins. I’ve just updated the Makefile in the samples directory of the GoogleTest TAP listener project to output coverage data.

Basically, you have to add the compiler flags -fprofile-arcs -ftest-coverage and link the executable with -lgcov. Take a look at the project’s Makefile and you’ll notice how simple it is. For Jenkins to interpret your coverage report, you’ll have to convert it to Cobertura XML. There is a Python utility that can be used for this: gcovr. Download it and copy it somewhere Jenkins can execute it (e.g. /usr/local/bin).

Now, if you’ve followed the instructions from the previous post, you should have a job that downloads the source code from GitHub and reports your GoogleTest tests from Jenkins to TestLink using the plug-in. Add an extra build step (Shell) to execute gcovr.

Jenkins, TestLink and GTest in 5 minutes (or so)

kinow @ Oct 11, 2012 23:44:59 ()

This is a 5-minute guide to creating a job for a C++ project in Jenkins with GoogleTest and reporting the test results back to TestLink with testlink-plugin.

The test project with GoogleTest

For this simple guide we will use the samples that come with GTest TAP Listener. You can get the code from GitHub with git clone git://github.com/kinow/gtest-tap-listener.git. Take a look at gtest-tap-listener/samples/src/, where you will find two C++ files: gtest_main.cc and gtest_testHelloWorld.cc.

gtest_main.cc has the main function and executes the test suite, while gtest_testHelloWorld.cc has the test cases and tests. Take note of the test case and test names.

Paper: Patterns for Introducing a Superclass for Test Classes

kinow @ Sep 25, 2012 17:58:09 ()

A few days ago we had SugarLoafPlop 2012 in Natal, RN.

It is a conference on pattern languages of programming. About six months ago I saw a tweet by Eduardo Guerra asking if anyone had some cases where certain patterns were applied. It was a big coincidence, since I was working on Apache Commons Functor and some Jenkins plugins, both projects with cases that could be used in his paper.

So I joined Guerra and made my small contribution to the paper, which has been accepted for this edition of SugarLoafPlop. Guerra also went there to give a talk and participate in an open discussion about several papers, including ours. I simply love when these things happen. It was great to work with Guerra, and even better to be able to use Open Source examples in our paper.

Bioinformatics tools: Stacks

kinow @ Sep 25, 2012 16:08:03 ()

This is the first post about bioinformatics tools, but I will try to post more about other tools such as MrBayes, Structure, maybe some next generation sequencing tools too, and Bioperl, Biojava, and so on.

As I am more of a computer geek than a bioinformatics one, I will focus on the requirements for running these tools on clusters and for installing them on your machine. The instructions assume that you have intermediate knowledge of *nix operating systems and sometimes a bit of programming experience.

I will be using tutorials available on the Internet and hosting my code in GitHub/kinow. Hammer time!
