Blog Taxonomy

Posts tagged with 'programming'

Writing a binary parser in Python: NumPy vs. Construct

kinow @ Apr 14, 2017 19:21:03

Some time ago I worked with researchers to write a parser for an old data format. The data was generated by a device (a radiosonde) in a binary format specific to its vendor (Vaisala).

One of the researchers told me someone had written a parser for his work, and shared it on GitHub. To be honest, that was my first time parsing binary data with Python. I had done that before with C, C++, Perl, and Java, but never with Python.

The code on GitHub used NumPy and looked similar to this one.

import numpy as np

parse_header = np.dtype([('field_a', 'b1'), ('field_b', '17b1')])

with open('input.dat', 'rb') as f:
    header = np.fromfile(f, dtype=parse_header, count=1)
    # ...

And it indeed worked fine. But in the end I used the code - after contacting the author and letting him know what I was about to do - as a reference, together with an old specification document for the format, and created a parser with Construct.

From Construct’s website:

Construct is a powerful declarative parser (and builder) for binary data.

This is what the code with Construct looked like.

from construct import *

parse_header = Struct("parse_header",
    Enum(Byte("file_ready"),
        READY = 1,
        NOT_READY = 0,
        _default_ = "UNKNOWN",
    ),
    Bytes("reserved", 17)
)

# ...

parse_contents = Struct("parse_contents",
    parse_header,
    Range(mincount=1, maxcout=5, subcon=pre_data),
    OptionalGreedyRange(detailed_data)
)

with open('input.dat', 'rb') as f:
    parse_results = parse_contents.parse_stream(f)
    # ...

Writing the parser with NumPy or Construct would achieve the same result. In the end this came down to personal preference, and to my point of view as a software engineer. This is the description of NumPy.

NumPy is the fundamental package for scientific computing with Python.

NumPy is a project tailored for scientific computing, with a focus on linear algebra, N-dimensional arrays, and so on. While it contains code that can parse binary data, the footprint it adds to a project that includes it as a dependency is quite big.

The parser written with NumPy wasn’t using 5% of the NumPy code base. Probably less than 1%. Yet any NumPy update could break the application, even an update made for reasons unrelated to parsing, such as a new matrix operation that relies on some external dependency that happens to be missing.
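To make the footprint argument concrete, here is a minimal sketch of the same header read with only the standard library struct module. This is not what the post used (that was Construct); the 18-byte layout (one status byte plus 17 reserved bytes) is inferred from the snippets in this post, and the names HEADER, FILE_READY and the sample bytes are hypothetical.

```python
import io
import struct

# Assumed layout: 1 unsigned status byte followed by 17 reserved bytes.
HEADER = struct.Struct("<B17s")

# Mapping that mirrors the Enum in the Construct version.
FILE_READY = {0: "NOT_READY", 1: "READY"}


def parse_header(stream):
    """Read and decode the fixed-size header from a binary stream."""
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("truncated header")
    flag, reserved = HEADER.unpack(raw)
    return {
        "file_ready": FILE_READY.get(flag, "UNKNOWN"),
        "reserved": reserved,
    }


# Made-up sample data standing in for input.dat.
sample = io.BytesIO(bytes([1]) + bytes(17) + b"rest of the file")
header = parse_header(sample)
print(header["file_ready"])
```

For a flat, fixed-size header this is enough. Once the format has repeated or optional sections, like the Range and OptionalGreedyRange parts above, a declarative library starts to pay off.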

In Java something similar happens with Google Guava. While I use it sometimes, most of the time I find myself using one of the Apache Commons libraries, or another dependency with just what I need, to avoid including unnecessary code in my application.

If you prefer to use NumPy that’s fine too :-) I just had enough time to rewrite it instead of using the NumPy version (it took a couple of hours). In other cases it may still make sense to use another tool or library, even if it was not made specifically for the job ¯\_(ツ)_/¯

♥ Open Source

Spring Cloud encrypted values and Spring PropertySources

kinow @ Apr 14, 2017 11:21:03

As I could not find any documentation on how Spring Cloud handles encrypted values in property sources, I decided to write it down as a note to myself in case I use encryption and decryption with Spring Cloud again.

In Spring and Spring Boot, you normally have multiple sources of properties, like multiple properties files, environment properties and variables, and so on. In the Spring API, these are represented as PropertySource objects.

In a Spring Boot application, you would be used to overriding certain properties by defining environments and using an application-production.properties file, or overriding values with environment properties.

This is common in Spring Boot applications deployed to Amazon Elastic Beanstalk.

Some time ago another team at work found that overriding did not always work when you have encrypted values in your properties files, even if you specified new values in the Amazon Elastic Beanstalk application configuration.

Yesterday, while debugging the issue and reading Spring Cloud source code, I found its EnvironmentDecryptApplicationInitializer.

It basically iterates through all loaded property sources, looking for values that start with {cipher}. Then it calls the Spring Security TextEncryptor defined in the application.

Finally, it creates a new property source, called decrypted, with the decrypted values. So when your application looks for a property called XPTO, and that property has been encrypted, it will find the value in the decrypted property source, regardless of whether you tried to override it or not.

# Property sources listed in Eclipse IDE

[
  servletConfigInitParams,
  servletContextInitParams,
  systemProperties,
  systemEnvironment,
  random,
  applicationConfigurationProperties,
  springCloudClientHostInfo,
  defaultProperties
]

# When using encrypted values

[
  decrypted, <-------- created by Spring Cloud, with decrypted values. Prepended to the list of property sources
  servletConfigInitParams,
  servletContextInitParams,
  systemProperties,
  systemEnvironment,
  random,
  applicationConfigurationProperties,
  springCloudClientHostInfo,
  defaultProperties
]

So in case you have encrypted values in your Spring application (and you are using Spring Cloud, of course) remember that these values will have higher priority, and can only be overridden by other encrypted values.
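The lookup order described above can be sketched with a toy model. This is plain Python, not Spring's API: the function names, the fake_decrypt stand-in (it only reverses the text) and the sample values are all hypothetical, and the model only mimics the behaviour described in this post.

```python
CIPHER_PREFIX = "{cipher}"


def fake_decrypt(text):
    """Stand-in for a real TextEncryptor: simply reverses the text."""
    return text[::-1]


def resolve(sources, key):
    """Return the value from the first (highest priority) source with the key."""
    for _name, props in sources:
        if key in props:
            return props[key]
    return None


def add_decrypted_source(sources):
    """Collect every {cipher} value and prepend a 'decrypted' source."""
    decrypted = {}
    # Walk from lowest to highest priority so that, between two encrypted
    # values for the same key, the higher-priority one wins.
    for _name, props in reversed(sources):
        for key, value in props.items():
            if value.startswith(CIPHER_PREFIX):
                decrypted[key] = fake_decrypt(value[len(CIPHER_PREFIX):])
    return [("decrypted", decrypted)] + sources


sources = [
    ("systemEnvironment", {"XPTO": "plain-override"}),
    ("applicationConfigurationProperties", {"XPTO": "{cipher}terces"}),
]

before = resolve(sources, "XPTO")
after = resolve(add_decrypted_source(sources), "XPTO")
print(before, after)
```

In this model the plain-text override in systemEnvironment wins until the decrypted source is prepended. After that, the decrypted value from the properties file shadows it, which matches what the other team observed.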


Fixing Qt warning "QLayout: Attempting to add QLayout "" to QWidget "", which already has a layout"

kinow @ Apr 02, 2017 12:01:03

If you have ever started Krita 3.x from the command line and had a look at the console output, you may have noticed the following warning.

QLayout: Attempting to add QLayout "" to QWidget "", which already has a layout

Krita recently announced the release of 3.1.3-alpha-2, and while testing I saw this warning and decided to investigate why this warning happens.

There was already a similar question posted on StackOverflow. And the best answer’s initial paragraph gave me a hint of what to look for.

When you assign a widget as the parent of a QLayout by passing it into the constructor, the layout is automatically set as the layout for that widget. In your code you are not only doing this, but explicitly calling setLayout(). This is no problem when the widget passed is the same. If they are different you will get an error because Qt tries to assign the second layout to your widget which has already had a layout set.

So, somewhere in the Krita code, there was a QWidget being created, and layouts were being set on it more than once. Finding where the issue was happening was quite easy: a breakpoint at main.cc where the application is initialized, then stepping through a few times, until the message appeared in the console.

Further investigation led me to the History docker (the one that shows undo steps) constructor.

    QVBoxLayout *vl = new QVBoxLayout(page); // layout being set to page
    m_undoView = new KisUndoView(this);
    vl->addWidget(m_undoView);
    QHBoxLayout *hl = new QHBoxLayout(page); // layout being set to page again
    hl->addSpacerItem(new QSpacerItem(10, 1,  QSizePolicy::Expanding, QSizePolicy::Fixed));
    m_bnConfigure = new QToolButton(page);
    m_bnConfigure->setIcon(KisIconUtils::loadIcon("configure"));
    connect(m_bnConfigure, SIGNAL(clicked(bool)), SLOT(configure()));
    hl->addWidget(m_bnConfigure);
    vl->addItem(hl);

    setWidget(page);
    setWindowTitle(i18n("Undo History"));

Here the QWidget created receives both QHBoxLayout and QVBoxLayout. Searching the Internet a little more, I came across a post with a good example of a QWidget with QHBoxLayout and QVBoxLayout. Here’s what the constructor looks like after the patch has been applied.

    QVBoxLayout *vl = new QVBoxLayout(page); // layout being set to page
    m_undoView = new KisUndoView(this);
    vl->addWidget(m_undoView);
    QHBoxLayout *hl = new QHBoxLayout();
    hl->addSpacerItem(new QSpacerItem(10, 1,  QSizePolicy::Expanding, QSizePolicy::Fixed));
    m_bnConfigure = new QToolButton(page);
    m_bnConfigure->setIcon(KisIconUtils::loadIcon("configure"));
    connect(m_bnConfigure, SIGNAL(clicked(bool)), SLOT(configure()));
    hl->addWidget(m_bnConfigure);
    vl->addLayout(hl); // horizontal layout added to the vertical layout

    setWidget(page);
    setWindowTitle(i18n("Undo History"));

That’s it. Learned something new in Qt. Not as important and useful as learning about signals and slots, but now I can focus on other warnings in the console output of Krita.

And you? Have you tested Krita 3.1.3 alpha already? What are you waiting for? :-)


Simulating less memory with ulimit

kinow @ Mar 26, 2017 11:14:03

Recently I was trying to reproduce a bug in Krita where it would crash when a user copied group layers between windows. It appeared that the user was getting a segmentation fault because the computer was running out of memory.

I could not reproduce the issue, but my computer has 16 GB of memory. The first thing that came to my mind was to create a virtual machine with less memory to reproduce the issue. But I decided to spend some time looking for a simpler way of doing it.

Searching the web I found some suggestions that ulimit could work. After playing for a while with ulimit and htop, and verifying the amount of memory necessary to open two files in two windows in Krita, I came up with the following settings.

# ~2550 MB of virtual memory, in KB
ulimit -v 2550000

# ~2 GB resident set size, in KB (many systems do not honor this limit)
ulimit -m 2000000

# Confirm limits
ulimit -a

gdb $HOME/Development/cpp/workspace/krita_install/bin/krita

Then after copying a few layers from one window to another, I successfully reproduced the issue, and could include a backtrace in the Krita issue tracking system.
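The same idea can be applied to a single child process instead of the whole shell session. Below is a minimal sketch using Python's resource module, assuming a Linux system; the helper names make_limiter and run_limited are hypothetical, and the child here is just a Python one-liner that tries to allocate 4 GiB, standing in for the real program.

```python
import resource
import subprocess
import sys

LIMIT_KB = 2550000  # same number passed to `ulimit -v` above


def make_limiter(limit_kb):
    """Build a function that caps the address space of the calling process."""
    def apply_limit():
        limit = limit_kb * 1024  # setrlimit expects bytes, ulimit -v uses KB
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard != resource.RLIM_INFINITY:
            limit = min(limit, hard)  # never ask for more than the hard limit
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
    return apply_limit


def run_limited(cmd, limit_kb=LIMIT_KB):
    """Run cmd with the limit applied in the child, just before exec."""
    return subprocess.run(cmd, preexec_fn=make_limiter(limit_kb),
                          capture_output=True, text=True)


# Hypothetical memory-hungry child: tries to allocate 4 GiB at once.
child = [sys.executable, "-c", "data = bytearray(4 * 1024 ** 3)"]
result = run_limited(child)
print(result.returncode)
print(result.stderr.strip())
```

Only the child sees the limit, so the shell and debugger sessions around it stay untouched.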


Apache Commons Lang: Memoizer

kinow @ Jan 08, 2017 18:34:03

The current release of Apache Commons Lang is 3.5. The upcoming release, probably 3.6, will include a new feature, added in a pull request: a Memoizer implementation. Check out the ticket LANG-740 for more about the implementation being added to [lang].

The book Java Concurrency in Practice introduces readers to the Memoizer, and also has a public domain implementation available for download (besides that, the book covers lots of other interesting topics!).

In summary, Memoizer is a simple cache that stores the result of a computation. It receives a Computable object, responsible for producing the value that the Memoizer will store. Here’s a simple example to illustrate how that will work in your Java code.

// Computation to be stored in the cache
Computable<String, String> getFormattedCurrentDate = new Computable<String, String>() {
    @Override
    public String compute(String fmt) throws InterruptedException {
        return new SimpleDateFormat(fmt).format(new Date());
    }
};

// Our memoizer
Memoizer<String, String> dateCache = new Memoizer<>(getFormattedCurrentDate);

// To illustrate its use
for (int i = 0; i < 10; i++) {
    try {
        // S -> Millisecond
        System.out.println(dateCache.compute("HH:mm:ss:S Z dd/MM/yyyy"));
        // Regardless of this sleep call, we get the same result every iteration
        Thread.sleep(1500);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

The computable created (getFormattedCurrentDate) will be called only once for a given argument, and its result stored in a map. The parameter passed to the #compute() method is used as the key in the map. So choose your parameter wisely :-) The output of the example will be similar to the following.

19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017
19:15:57:854 +1300 08/01/2017

In the example above I used a for-loop to illustrate what will happen. Even though we call the memoizer #compute() method several times, each call followed by Thread#sleep(), only one result, the first to be computed, is returned.
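To show the idea behind the class, here is a minimal sketch of a memoizer in Python. It is loosely modelled on the approach from Java Concurrency in Practice (one future per key, so concurrent callers wait for the first computation) and is not the [lang] implementation. All names are hypothetical, and unlike a production version this sketch also caches failures.

```python
import threading
import time
from concurrent.futures import Future
from datetime import datetime


class Memoizer:
    """Cache the result of func(arg), computing each distinct arg only once."""

    def __init__(self, func):
        self._func = func
        self._cache = {}
        self._lock = threading.Lock()

    def compute(self, arg):
        # Register a future for this key, or pick up the existing one.
        with self._lock:
            future = self._cache.get(arg)
            owner = future is None
            if owner:
                future = Future()
                self._cache[arg] = future
        # Only the caller that created the future runs the computation.
        if owner:
            try:
                future.set_result(self._func(arg))
            except Exception as exc:
                future.set_exception(exc)
        # Everyone else blocks here until the value is ready.
        return future.result()


calls = []


def formatted_now(fmt):
    calls.append(fmt)  # record each real computation
    return datetime.now().strftime(fmt)


date_cache = Memoizer(formatted_now)
first = date_cache.compute("%H:%M:%S.%f")
time.sleep(0.05)
second = date_cache.compute("%H:%M:%S.%f")
print(first, second, len(calls))
```

As in the Java example, the second call returns the cached string even though time has passed, because the format string is the cache key.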

So that’s all for today. Hope you learned something about this new class, which should be available in the next release of Apache Commons Lang.

Happy hacking!

ps: [lang] uses Java 7, which is why we have the Computable interface instead of a Java 8 functional interface