Proposed logos for OpenNLP

kinow @ Jan 15, 2017 20:42:03

A couple of logos submitted to OPENNLP-6. Made with Inkscape, fonts from Google Fonts.

For more, check out my DeviantArt page.

Drawing Cave

kinow @ Jan 13, 2017 02:11:03

Apache Commons Lang: Memoizer

kinow @ Jan 08, 2017 18:34:03

The current release of Apache Commons Lang is 3.5. The upcoming release, probably 3.6, will include a new feature, added in a pull request: a Memoizer implementation. Check out the ticket LANG-740 for more about the implementation being added to [lang].

The book Java Concurrency in Practice introduces readers to the Memoizer, and also has a public domain implementation available for download (besides that, the book covers lots of other interesting topics!).

In summary, Memoizer is a simple cache that stores the result of a computation. It receives a Computable object, responsible for doing the work whose result will be stored by the Memoizer. Here’s a simple example to illustrate how that will work in your Java code.

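import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.commons.lang3.concurrent.Computable;
import org.apache.commons.lang3.concurrent.Memoizer;

// Computation to be stored in the cache
Computable<String, String> getFormattedCurrentDate = new Computable<String, String>() {
    @Override
    public String compute(String fmt) throws InterruptedException {
        return new SimpleDateFormat(fmt).format(new Date());
    }
};

// Our memoizer
Memoizer<String, String> dateCache = new Memoizer<>(getFormattedCurrentDate);

// To illustrate its use
for (int i = 0; i < 10; i++) {
    try {
        // S -> Millisecond; yyyy -> calendar year (YYYY would be the week year)
        System.out.println(dateCache.compute("HH:mm:ss:S Z dd/MM/yyyy"));
        // Regardless of this sleep call, we get the same result every iteration
        Thread.sleep(1000); // the duration is arbitrary
    } catch (InterruptedException e) {
        // Restore the interrupt flag and stop looping
        Thread.currentThread().interrupt();
        break;
    }
}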

The computable created (getFormattedCurrentDate) will be called only once for a given parameter, and its result stored in a map. The parameter passed to the #compute() method is used as the key in the map, so a different parameter triggers a new computation. Choose your parameter wisely :-) The output of the example will be similar to the following.

19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017

In the example above I used a for-loop to illustrate what will happen. Even though we call the memoizer #compute() method several times, each call followed by Thread#sleep(), only one result, the first to be computed, is ever returned.
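To make the caching behaviour described above more concrete, here is a minimal sketch of how such a memoizer could be written with only the JDK, loosely following the Future-based approach from Java Concurrency in Practice. It is not the code being added to [lang]: the names Computation, SketchMemoizer and SketchMemoizerDemo are made up for this illustration, and it uses Java 8 lambdas for brevity.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Computable interface described in the post.
interface Computation<I, O> {
    O compute(I arg) throws InterruptedException;
}

// Hypothetical sketch of a memoizer; not the [lang] implementation.
final class SketchMemoizer<I, O> implements Computation<I, O> {
    // The argument is the key; the value is the (possibly still running) computation.
    private final ConcurrentMap<I, Future<O>> cache = new ConcurrentHashMap<>();
    private final Computation<I, O> delegate;

    SketchMemoizer(Computation<I, O> delegate) {
        this.delegate = delegate;
    }

    @Override
    public O compute(final I arg) throws InterruptedException {
        Future<O> future = cache.get(arg);
        if (future == null) {
            FutureTask<O> task = new FutureTask<O>(() -> delegate.compute(arg));
            // Only the thread that wins putIfAbsent runs the computation.
            future = cache.putIfAbsent(arg, task);
            if (future == null) {
                future = task;
                task.run();
            }
        }
        try {
            return future.get();
        } catch (ExecutionException e) {
            // Do not keep failed computations in the cache.
            cache.remove(arg, future);
            throw new IllegalStateException("Computation failed", e.getCause());
        }
    }
}

final class SketchMemoizerDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        SketchMemoizer<String, String> cache = new SketchMemoizer<String, String>(key -> {
            calls.incrementAndGet();
            return key.toUpperCase();
        });
        cache.compute("a");
        cache.compute("a"); // served from the cache
        cache.compute("b"); // new key, so a new computation
        System.out.println(calls.get()); // prints 2
    }
}
```

Storing a Future rather than the plain value means that two threads asking for the same key at the same time still trigger a single computation; the second thread simply waits for the first one's result.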

So that’s all for today. Hope you learned something about this new class, which should be available in the next release of Apache Commons Lang.

Happy hacking!

ps: [lang] targets Java 7, which is why it uses Computable rather than a Java 8 functional interface
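For comparison, this is roughly what a functional take could look like on Java 8 or later, using java.util.function.Function and ConcurrentHashMap#computeIfAbsent. It is only a sketch of the idea, not something planned for [lang]; FunctionalMemoizer and memoize are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper class; not part of Apache Commons Lang.
final class FunctionalMemoizer {
    private FunctionalMemoizer() {
    }

    // Wraps fn so that each distinct argument is computed at most once.
    // Assumes fn neither takes nor returns null (ConcurrentHashMap rejects
    // null keys and does not store null results).
    static <I, O> Function<I, O> memoize(Function<I, O> fn) {
        Map<I, O> cache = new ConcurrentHashMap<>();
        return arg -> cache.computeIfAbsent(arg, fn);
    }
}

final class FunctionalMemoizerDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Function<String, Integer> length = FunctionalMemoizer.memoize(s -> {
            calls.incrementAndGet();
            return s.length();
        });
        length.apply("hello");
        length.apply("hello"); // served from the cache
        System.out.println(calls.get()); // prints 1
    }
}
```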

Apache Commons Text

kinow @ Jan 07, 2017 20:39:03

There is a new component in Apache Commons: Apache Commons Text. The 1.0 release might be announced in the coming weeks. The current site is still in the Commons Sandbox, but that will change with the 1.0 release. The promotion from the sandbox happened a few days ago on the project mailing list.

Here’s the project description: Apache Commons Text is a library focused on algorithms working on strings.

There was a thread on the mailing list some time ago (Oct/2014) when we first discussed the component idea. Since then, many people have contributed by porting code from Apache Commons Lang and Apache Lucene, donating code from existing projects, and bringing new ideas.

It is important to be aware that certain parts of Apache Commons Lang are being marked as deprecated, and will be removed in the future, after Apache Commons Text 1.0 is out. For example: StringUtils and RandomStringUtils.

That will probably happen in a 4.x release of Apache Commons Lang, if everything goes well with Apache Commons Text :-)

And there are already future features in branches too. It was decided that these features needed further work, so they will probably be included in upcoming releases.

So that’s a little bit of background on the new component that will be released soon. If you have code using Apache Commons Lang, you might be interested in staying tuned to release announcements on the mailing list!

And should you have suggestions and would like to contribute, feel free to join and start a thread on the mailing list, open a JIRA issue, or submit a pull request.

Happy hacking!

Plotting Auckland with OSMnx

kinow @ Jan 05, 2017 10:39:03

A couple of days ago I saw a thread on reddit about OSMnx. It is a utility for retrieving data from the OpenStreetMap (OSM) API and manipulating it in pure Python, using libraries like NetworkX (a Python graph package).

With it you can do things like visualize cul-de-sacs or one-way streets, plot shortest-path routes, or calculate stats like intersection density, average node connectivity, or betweenness centrality. Or simply plot the OSM data as in the graph above.

The source code is hosted at GitHub:

OSMnx is a Python 2+3 package that lets you download spatial geometries and construct, project, visualize, and analyze street networks from OpenStreetMap’s APIs. Users can download and construct walkable, drivable, or bikable urban networks with a single line of Python code, and then easily analyze and visualize them.

The only issue I found while creating the map for Auckland using the example from the project README, was that the script would exit with the following error: “ValueError: Geometry must be a shapely Polygon or MultiPolygon”.

After looking at the list of dependencies and finding that everything seemed to be OK, I started looking at the project issues. And thankfully someone else had found the same issue and the maintainer of the project answered how to fix it.

The issue was that the OSM API returns two entries for Auckland, where the first one is a Point, and the second is the one that we want (a Polygon). The application defaults to using the first element, so in order to change that you have to pass a which_result argument.

#!/usr/bin/env python3

import osmnx as ox
G = ox.graph_from_place('Auckland, NZ', network_type='drive', which_result=2)

After that, and a few minutes of waiting, you should get your map :-)

Happy hacking!