Tag

Posts tagged with ‘r’

Securely using passwords with R

kinow @ Jun 03, 2017 20:21:39

It is quite common for code to interact with another system, database, or third-party application, and to need some sort of credentials to communicate securely.

Most of the R code I have written or reviewed had no boundaries (from a system analysis perspective) with other systems, or at most interacted with the file system to retrieve NetCDF or JSON files.

However, after I saw a comment on Reddit [1] some time ago about this, I decided to check what others use. With R Shiny becoming more popular, and more R code moving to the web, I think this will become a common requirement for R code.

There is a blog post from RevolutionAnalytics [2] that gives a nice summary of the options. The post is from 2015, but I do not think much has changed as of 2017. From the blog post and its comments, there are seven methods listed:

  1. Put credentials in your source code
  2. Put credentials in a file in your project, but do not share this file (#3, #4, and #5 are similar to this one)
  3. Put credentials in a .Rprofile file
  4. Put credentials in a .Renviron file
  5. Put credentials in a JSON or YAML file
  6. Put credentials in a secure store that you can access from R
  7. Ask the user for the credentials when the script is executed (possibly not useful for R Shiny applications)

My preferred way for R is the .Renviron (or a dotEnv) file. You store your password in this file, make sure you do not share it (a global gitignore can help prevent accidents), and read the variables when your code starts.

## Secrets
MYSQL_PASSWORD=secret
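
Reading the value back can be sketched as follows. This is a minimal example, assuming the MYSQL_PASSWORD variable from the file above; the helper name get_secret is hypothetical, and a temporary file stands in for the real .Renviron so the snippet can run anywhere.

```r
# Hypothetical helper: read a secret from the environment and fail loudly
# when it is missing, instead of silently returning an empty string.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# R reads .Renviron at startup; readRenviron() loads such a file on demand.
# A temporary file is used here in place of the real .Renviron.
env_file <- tempfile(fileext = ".Renviron")
writeLines(c("## Secrets", "MYSQL_PASSWORD=secret"), env_file)
readRenviron(env_file)

password <- get_secret("MYSQL_PASSWORD")
```

In a real project the file would already be loaded when R starts, so only the get_secret call would be needed.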

If you would like to increase security, you can combine it with a variation of #6: keep the .Renviron file, but encrypt its values with a service such as Amazon KMS (Key Management Service).

With AWS KMS in R, you can encrypt your values and store them encrypted in your .Renviron. Even if someone gets hold of the file, you have an extra layer of protection, as the attacker would also need access to your cloud environment to decrypt them.

## Secrets
MYSQL_PASSWORD=AQECAHga320J8WadplGCqqVAr4HNvDaFSQ+NaiwIBhmm6qDSFwAAAGIwYAYJKoZIhvcNAQcGoFMwUQIBADBMBgkqhkiG9w0BBwEwHgYJYIZIAWUDBAEuMBEE99+LoLdvYv8l41OhAAIBEIAfx49FFJCLeYrkfMfAw6XlnxP23MmDBdqP8dPp28OoAQ==
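
One way to structure the reading side is to keep the decryption step behind a function argument. The sketch below only assumes that some function can turn the stored ciphertext back into plain text; get_encrypted_secret and fake_decrypt are hypothetical names, and the actual call to KMS is deliberately not shown.

```r
# Hypothetical helper: read an encrypted value from the environment and pass
# it through a caller-supplied decrypt function (for example, a wrapper around
# the KMS Decrypt operation, which is not implemented here).
get_encrypted_secret <- function(name, decrypt) {
  ciphertext <- Sys.getenv(name, unset = NA_character_)
  if (is.na(ciphertext) || !nzchar(ciphertext)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  decrypt(ciphertext)
}

# Stand-in decryptor for illustration only: it reverses the string and
# provides no security at all. Replace it with a real KMS-backed function.
fake_decrypt <- function(ciphertext) {
  paste(rev(strsplit(ciphertext, "")[[1]]), collapse = "")
}

Sys.setenv(MYSQL_PASSWORD = "terces")
password <- get_encrypted_secret("MYSQL_PASSWORD", fake_decrypt)
```

Passing the decryptor in as an argument keeps the rest of the code independent of the encryption service and makes it easy to test without cloud access.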

References


Treemapping Jenkins Extension Points with R

kinow @ May 19, 2014 00:51:33

I have been playing with R and its packages for some time, and decided to study it a bit more seriously. Last week I started reading Advanced R Programming by Hadley Wickham.

One of the first chapters talks about the basic data structures in R. In order to get my feet wet I thought about a simple example: treemapping Jenkins extension points.

Treemap graph

There is a Wiki page with the extension points in Jenkins (the parts of Jenkins that can be customized) and their implementations. That page is generated by a Jenkins job.

That job outputs a JSON file that contains each available extension point, as well as an array with its implementations. The R code below produces two vectors, one with the number of implementations of each extension point, and the other with the class names.

# Assumes extensionPointsJson was already parsed from extension-points.json
extensionPoints <- extensionPointsJson$extensionPoints
nExtensionPoints <- length(extensionPoints)
numberOfImplementations <- integer(nExtensionPoints)
namesOfTheExtensionPoints <- character(nExtensionPoints)
for (i in seq_along(extensionPoints)) {
  extensionName <- extensionPoints[[i]]$className
  # Find the last dot so that only the simple class name is kept
  lastIndexOfDot <- regexpr("\\.[^\\.]*$", extensionName)
  namesOfTheExtensionPoints[[i]] <- substr(extensionName, lastIndexOfDot[1] + 1, nchar(extensionName))
  numberOfImplementations[[i]] <- length(extensionPoints[[i]]$implementations)
  print(paste(namesOfTheExtensionPoints[[i]], " -> ", numberOfImplementations[[i]]))
}
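
To see what the regexpr and substr steps in the loop above do, here is a small self-contained sketch. The helper name simple_class_name is hypothetical, and the class name is only a sample input.

```r
# Hypothetical helper wrapping the regexpr/substr steps from the loop above.
simple_class_name <- function(className) {
  # Start position of the last ".something" suffix, or -1 when there is no dot
  lastIndexOfDot <- regexpr("\\.[^\\.]*$", className)
  substr(className, lastIndexOfDot[1] + 1, nchar(className))
}

withPackage <- simple_class_name("hudson.tasks.Builder")  # "Builder"
withoutPackage <- simple_class_name("Builder")            # "Builder"
```

When there is no dot, regexpr returns -1, so substr starts at position 0 and the whole string is returned unchanged.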

For creating the treemap I used the portfolio package, and the map.market function. The structures previously created and the next snippet of code are all that is needed to create the treemap of the Jenkins extension points.

map.market(id=seq_along(extensionPoints), area=numberOfImplementations, group=namesOfTheExtensionPoints, color=numberOfImplementations, main="Jenkins Extension Points")

You can also use the Jenkins R Plug-in to produce this graph, as in this sample job. You can get the complete script in this gist, or just copy it from below.

library('rjson')
library('portfolio')

# Download and parse the JSON file produced by the Jenkins job
download.file(url="https://ci.jenkins-ci.org/view/Infrastructure/job/infra_extension-indexer/ws/extension-points.json", destfile="extension-points.json", method="wget")
extensionPointsJson <- fromJSON(paste(readLines("extension-points.json"), collapse=""))
extensionPoints <- extensionPointsJson$extensionPoints
nExtensionPoints <- length(extensionPoints)

# One entry per extension point: implementation count and simple class name
numberOfImplementations <- integer(nExtensionPoints)
namesOfTheExtensionPoints <- character(nExtensionPoints)
for (i in seq_along(extensionPoints)) {
  extensionName <- extensionPoints[[i]]$className
  # Find the last dot so that only the simple class name is kept
  lastIndexOfDot <- regexpr("\\.[^\\.]*$", extensionName)
  namesOfTheExtensionPoints[[i]] <- substr(extensionName, lastIndexOfDot[1] + 1, nchar(extensionName))
  numberOfImplementations[[i]] <- length(extensionPoints[[i]]$implementations)
  print(paste(namesOfTheExtensionPoints[[i]], " -> ", numberOfImplementations[[i]]))
}

# Render the treemap to a PNG file
png(filename="extension-points.png", width=2048, height=1536, units="px", bg="white")
map.market(id=seq_along(extensionPoints), area=numberOfImplementations, group=namesOfTheExtensionPoints, color=numberOfImplementations, main="Jenkins Extension Points")
dev.off()