Blog

Posts about technology and arts.

Learning more about SPARQL and Jena internals

O Corvo
O Corvo

Recently a pull request for Apache Jena that I started three years ago got merged. Even though it has been three years since that pull request, there are still many parts of the project code base that I am not familiar with.

And not only the code, but there are also many concepts about SPARQL, other standards used in Jena, and internals about triple stores.

The following list contains some presentations and posts that I am reading right now, while I try to improve my knowledge of SPARQL and Jena internals.

R Shiny + Ansible = shiny.nzoss.org.nz

R Shiny + Ansible = shiny.nzoss.org.nz

Event
R Shiny + Ansible = shiny.nzoss.org.nz
Where ?
Auckland, New Zealand
When ?
2018-02-08

Exif Odd Offsets

A file format like JPEG may contain metadata in JFIF, Exif, or a vendor proprietary format. The Exif format is based - or uses parts of - on the TIFF format.

Within an Exif metadata block, you should see directories, with several entries. The entries have fields like description, value, and also an offset. The offset indicates the offset to the next entry.

The Exif specification defines that implementers must make sure to keep the offset an even number, within 4 bytes.

I recently worked on IMAGING-205, a ticket about odd offsets in files with Exif metadata. This issue was exactly to address that when files were rewritten with Apache Commons Imaging, even though the image initially had no odd offsets, after the entries were rearranged, we could have odd offsets.

The fix was simply checking for odd offsets, adding +1, and later it would be put within the 4 bytes limit.

A screen shot of Eclipse with source code
Locating the bug

One interesting point, however, is that this is in the standard, but not all software that read and write Exif follow the specification. So it is quite common to find images with odd offsets.

Which means you could take a picture with your phone, that contains some Exif metadata, and be surprised to analyze it with exiftool and get warnings about odd offsets. Most viewers handle odd and even offsets, so it should work for most cases, unless you have a strict reader/viewer.

Happy hacking!

&heart; Open Source

Remember to synchronize when iterating streams from a synchronized Collection

When iterating collections created via Collections.synchronizedList for instance, you are required to obtain a lock on the actual list before doing so. So you normally end up with code similar to:

List list = Collections.synchronizedList(new ArrayList());
synchronized (list) {
  Iterator i = list.iterator(); // Must be in synchronized block
  while (i.hasNext())
      foo(i.next());
}

This requirement is documented in the javadocs.

Since lambdas and streams are being more widely used, it is important to remind that when iterating via a stream we also need to obtain a lock on the synchronized collection created.

List list = Collections.synchronizedList(new ArrayList());
synchronized (list) {
  list.stream()
    .anyMatch(...)
}

Here’s an example from Zalando Nakadi Event Broker.

Happy hacking!

Watch out for Locales when using NumberFormat with currencies

In Java you have the NumberFormatException to help you formatting and parsing numbers for any locale. Said that, here’s some code.

BigDecimal negative = new BigDecimal("-1234.56");

DecimalFormat nf = (DecimalFormat) NumberFormat.getCurrencyInstance(Locale.UK);
String formattedNegative = nf.format(negative);

System.out.println(formattedNegative);

The output for this code is -£1,234.56. That’s expected, as the locale is set to UK, so the currency symbol used is for British Pounds. And as the number is negative, you get that minus sign as a prefix. For Japanese locale you’d get -¥1,235, and for Brazilian locale you’d get -R$ 1.234,56.

So far so good.

What about the following code, with nothing different except for the locale set to US.

BigDecimal negative = new BigDecimal("-1234.56");

DecimalFormat nf = (DecimalFormat) NumberFormat.getCurrencyInstance(Locale.US); // <--- US now
String formattedNegative = nf.format(negative);

System.out.println(formattedNegative);

Some could intuitively expect -$1,234.56. However, the output is actually ($1,234.56).

There are different prefixes and suffixes. But in some locales the prefix can be empty, or, as in the case of the US locale, it can be quite different than what you could expect.

Learned about this peculiarity from NumberFormat while working on VALIDATOR-433 for Apache Commons Validator.

Happy hacking!

Subscribe