Blog

Posts about technology and arts.

Processing Vaisala Radiosonde data with Python, and creating GRUAN-like NetCDF files

One of my most recent projects involved parsing a few GB of data stored in a binary format and converting it into NetCDF files. In this post I will describe what was done, and what I learned about radiosondes, GRUAN, and other geek stuff. Ready?

Vaisala Radiosonde data

When I was told the data was in some binary format, I thought it would be similar to parsing a flat file: a fixed-length entry, with the same kind of item repeated multiple times.
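To make that expectation concrete, here is a minimal sketch of reading fixed-length records with Python's struct module. The record layout (a 4-byte integer followed by two 4-byte floats) and the name read_records are purely hypothetical; this is not the Vaisala format.

```python
import struct

# Hypothetical layout, NOT the Vaisala format: little-endian
# uint32 elapsed seconds, float32 pressure, float32 temperature.
RECORD = struct.Struct("<Iff")


def read_records(data):
    """Yield one tuple per complete fixed-length record found in data."""
    # Ignore a trailing partial record, if any.
    usable = len(data) - len(data) % RECORD.size
    for offset in range(0, usable, RECORD.size):
        yield RECORD.unpack_from(data, offset)


# Two records packed back to back stand in for a file's contents.
sample = RECORD.pack(0, 1013.25, 15.0) + RECORD.pack(2, 1012.5, 14.5)
records = list(read_records(sample))
print(records)
```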

Well, not exactly.

The files had been generated by an instrument made by Vaisala, a Finnish company. This instrument is called a radiosonde. It is about the size of an old mobile phone, and it is launched into the atmosphere with a balloon.

I was lucky to be given the chance to release one of these balloons carrying a newer version of this equipment.

Radiosonde balloon launch

The balloon can carry equipment for measuring different things, such as air pressure, altitude, temperature, latitude, longitude, and relative humidity, among others. Instruments like the radiosonde send the data back to a ground-level station via radio, normally at short, constant intervals.

Drawing sketch: Blue Hair

For redditgetsdrawn

Some Linux commands I used this week

These are some commands I used on Linux servers this week. Adding them here in case someone else finds them interesting, and also due to my bad memory :-)

Listing latest installed packages in SLES

rpm -qa --last

This lists the installed packages ordered by installation date, most recent first. Useful when there are packages being updated, and you need to confirm what changed, and when.

Listing packages in SLES and origin repository

rpm -qa --qf '%-30{DISTRIBUTION} %{NAME}\n'| sort

The output will have two columns: the first contains the repository name, and the second the package name. For example.

devel:languages:R:base / SLE_11_SP2 R-base
devel:languages:R:base / SLE_11_SP2 R-base-devel
home:flacco:sles / SLE_11_SP3 php53-phar
home:happenpappen / SLE_11_SP2 nodejs
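If you want to post-process that listing, the following is a minimal Python sketch that groups package names by repository. It assumes the two-column layout shown above, where the repository column may contain spaces but the package name does not; the function name group_by_repository is my own.

```python
from collections import defaultdict


def group_by_repository(lines):
    """Map each repository string to the sorted package names listed under it."""
    grouped = defaultdict(list)
    for line in lines:
        # The repository column can contain spaces, the package name cannot,
        # so split once from the right.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # blank line or a single token: nothing to group
        repository, name = parts
        grouped[repository.strip()].append(name)
    return {repo: sorted(names) for repo, names in grouped.items()}


sample = """\
devel:languages:R:base / SLE_11_SP2 R-base
devel:languages:R:base / SLE_11_SP2 R-base-devel
home:flacco:sles / SLE_11_SP3 php53-phar
home:happenpappen / SLE_11_SP2 nodejs
""".splitlines()

result = group_by_repository(sample)
for repo, names in result.items():
    print(repo, names)
```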

Grep for content in XML tags

Be it for web services, or for finding things in Jenkins XML files, being able to grep the content of a tag might be useful. Look at the following example, which uses the books XML provided by Microsoft for testing.

grep -oP "(?<=<genre>).*?(?=</genre>)" books.xml | sort | uniq

Which will output the following.

Computer
Fantasy
Horror
Romance
Science Fiction
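When the XML is not laid out one tag per line, a regex can miss values. As an alternative, here is a minimal sketch using Python's xml.etree.ElementTree. The inline document is a tiny stand-in I made up, not the real books.xml, and unique_tag_text is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for books.xml, just enough structure for the example.
SAMPLE = """\
<catalog>
  <book id="bk101"><title>First</title><genre>Computer</genre></book>
  <book id="bk102"><title>Second</title><genre>Fantasy</genre></book>
  <book id="bk103"><title>Third</title><genre>Fantasy</genre></book>
</catalog>
"""


def unique_tag_text(xml_text, tag):
    """Return the sorted, de-duplicated text content of every <tag> element."""
    root = ET.fromstring(xml_text)
    return sorted({el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()})


genres = unique_tag_text(SAMPLE, "genre")
print(genres)
```

For a file on disk, ET.parse("books.xml").getroot() would replace the ET.fromstring call.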

Find Python site packages directory

Sometimes you have Anaconda, but also the system installation, and maybe even other Python distributions. Knowing where Python is looking for site packages can be helpful to confirm the package exists, and also to inspect its sources.

python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"

An example of the output of the command.

/usr/lib/python2.7/dist-packages
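Note that distutils was deprecated in Python 3.10 and removed in Python 3.12, so the one-liner above fails on recent interpreters. A minimal sketch of an alternative uses the standard library's sysconfig module; the printed path depends on your installation.

```python
import sysconfig

paths = sysconfig.get_paths()

# Directory for pure-Python packages of the interpreter running this script.
purelib = paths["purelib"]
# Directory for packages with compiled extensions (often the same directory).
platlib = paths["platlib"]

print(purelib)
print(platlib)
```

Run it with each interpreter you want to inspect, for example the Anaconda python and the system python, to compare where each one looks.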

Force no-cache via curl for a list of files

Useful when a proxy like Squid is caching requests from an application and you want to bypass the cached copy and get the latest content (which will be cached again, but then you can fix it once confirmed).

curl --silent -H 'Cache-Control: no-cache' http://systemcachingvalues.local/somedoc.html
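The command above handles a single URL. For a list of files, a minimal Python sketch with urllib could look like the following. The second URL and the function names are hypothetical, and nothing is fetched until fetch_all is called.

```python
import urllib.request

# Hypothetical list of documents; only the first URL comes from the curl example.
URLS = [
    "http://systemcachingvalues.local/somedoc.html",
    "http://systemcachingvalues.local/otherdoc.html",
]


def build_no_cache_requests(urls):
    """Create one request per URL carrying the Cache-Control: no-cache header."""
    return [
        urllib.request.Request(url, headers={"Cache-Control": "no-cache"})
        for url in urls
    ]


def fetch_all(urls, timeout=10):
    """Fetch every URL, discard the body, and return (url, status) pairs."""
    results = []
    for request in build_no_cache_requests(urls):
        with urllib.request.urlopen(request, timeout=timeout) as response:
            response.read()
            results.append((request.full_url, response.status))
    return results


# Building the requests does not touch the network.
requests = build_no_cache_requests(URLS)
for request in requests:
    print(request.full_url, request.header_items())
```

Calling fetch_all(URLS) performs the actual requests; urlopen raises an exception for unreachable hosts and for HTTP error statuses, so wrap the call if some files may be missing.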

Find which servers a Linux process is talking to

You need the PID of the process that you would like to investigate (e.g. 6364), and strace must be installed.

strace -p 6364 -f -e trace=network -o output.txt

The command above creates output.txt with the trace information. Then you can grep for the IP addresses with the following regex.

grep -E -o "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" output.txt

Which will output something similar to the following example.

127.0.1.1
127.0.1.1
127.0.1.1
192.168.20.4
10.10.0.12
...
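The same extraction can be sketched in Python with the re module. The octet pattern is the one from the grep command; I added guards so that a match cannot start or end in the middle of a longer number, which the grep version does not do. The sample lines only illustrate what strace network output roughly looks like and are not taken from a real trace.

```python
import re

# Same octet pattern as in the grep command above.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# Four octets separated by dots, not glued to other digits on either side.
IPV4 = re.compile(r"(?<!\d)" + r"\.".join([OCTET] * 4) + r"(?!\d)")

# Illustrative lines only, not a real trace.
SAMPLE = '''\
6364 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.1.1")}, 16) = 0
6364 connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("192.168.20.4")}, 16) = 0
6364 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.1.1")}, 16) = 0
'''


def extract_ips(text):
    """Return every IPv4 address found in text, in order of appearance."""
    return IPV4.findall(text)


ips = extract_ips(SAMPLE)
unique_ips = list(dict.fromkeys(ips))  # drop duplicates, keep first-seen order
print(ips)
print(unique_ips)
```

To use it on the real trace, pass the contents of output.txt to extract_ips instead of SAMPLE.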

And finally, you can call dig to get the server name, and also remove duplicates.

grep -E -o "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" output.txt | xargs -l dig +noall +answer +nocmd -x | awk '{ print $5}' | sort | uniq

Which gives you the following.

ec2-52-13-43-205.compute-1.amazonaws.com.
ec2-52-32-244-147.compute-1.amazonaws.com.
ec2-52-44-11-85.compute-1.amazonaws.com.
ec2-52-55-36-20.us-west-2.compute.amazonaws.com.
ec2-52-11-19-24.us-west-2.compute.amazonaws.com.
ec2-52-2-21-13.compute-1.amazonaws.com.
ec2-54-33-249-49.us-west-2.compute.amazonaws.com.
ec2-54-180-165-17.us-west-2.compute.amazonaws.com.
ec2-54-2-177-91.compute-1.amazonaws.com.
ec2-54-8-163-15.compute-1.amazonaws.com.
syd11s01-in-f124.1e110.net.
syd11s02-in-f5.1e110.net.
syd12s02-in-f3.1e110.net.
...
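If dig is not available, the reverse lookup can be approximated in Python with socket.gethostbyaddr. This is a minimal sketch, not an equivalent of dig: it goes through the system resolver (which may consult /etc/hosts) and returns names without the trailing dot. The function names and the fake resolver data are hypothetical, and the fake resolver is only there so the example runs offline.

```python
import socket


def reverse_lookup(ip):
    """Return the host name the system resolver reports for ip, or None."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


def server_names(ips, lookup=reverse_lookup):
    """Resolve each distinct IP once and return the sorted, unique host names."""
    names = {lookup(ip) for ip in set(ips)}
    names.discard(None)  # addresses without a reverse record
    return sorted(names)


# Made-up answers so the demonstration does not need the network.
fake_dns = {
    "10.10.0.12": "db.example.internal",
    "192.168.20.4": "app.example.internal",
}
names = server_names(
    ["10.10.0.12", "10.10.0.12", "192.168.20.4", "127.0.1.1"],
    lookup=fake_dns.get,
)
print(names)
```

Calling server_names with the default lookup performs real reverse lookups for the addresses you pass in.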

That’s all for today.

Happy hacking!

Using Active Choices with Role Strategy Plug-in

Having worked in Open Source for a few years, one of my favorite things is when you can share experience with other people that you meet. Andrew Gray has worked with .NET and Jenkins for years, and we met through Open Source. He has helped me in the past with Jenkins and .NET, and also maintains the blog Jenkins.NET.

A couple of days ago he sent me an interesting question. He asked me whether it would be possible to use the Active Choices Plug-in with the Role Strategy Plug-in. The Role Strategy Plug-in lets you define roles, define which permissions each role has, and then assign users to the roles.

Drawing vector art: Kumamoto Kenjinkai mascot

For the Brazilian Kumamoto Kenjinkai. Done with Macromedia FreeHand MX, a long time ago.
