Show your Data Science Workplace!

I like surveys like this one, because they show how other people organize their daily work with all their hardware and software. So here is a short insight into my own workplace setup and how I use it. Benjamin’s blog post asks the following questions to get a deeper insight:

How many monitors do you use (or wish to have)?

Well, as you can see in the picture above, I’m using two monitors at the same time, one in horizontal and one in vertical orientation. On the horizontal display I work with mail, short documents and terminal windows (many thanks to the guys who implemented a macOS window manager that works). The vertical one is for Jira swimlanes, web sites, MacVim edits, wiki entries and Kibana dashboards. Both monitors are the same LG model (I forgot the name, but I like the white one). Both are connected to my MacBook Pro: one via HDMI (yes, I’m using yet another adapter) and one via a USB-C-to-DVI connector.

What hardware do you use? Apple? Dell? Lenovo? Others?

Yes, I’m using an Apple MacBook Pro, latest generation (2016). I’m not that satisfied with it because of the battery life and the monitor handling. From time to time, after I have disconnected my two displays, I can’t get the display output working again in a conference room (projector or TV). It is a f*** digital adapter and synchronization nightmare. But the big plus is the weight. When I take it to my home office, it is not as heavy as my old one. I like the Touch Bar, but I don’t use it in this configuration, only when I lie on my sofa without any external monitor.

Which OS do you use (or prefer)? MacOS, Linux, Windows? Virtual Machines?

I’m using macOS High Sierra on my MacBook Pro. This is newer than the latest macOS version my company supports, but I have “early adopter” status for it. On the server side, I prefer Linux, and there mainly CentOS. We’re using customized CentOS AMIs on AWS.

What are your favorite databases, programming languages and tools? (e.g. Python, R, SAS, Postgre, Neo4J,…)

In Big Data projects, I’m using Hive and Spark for data engineering tasks. For business intelligence tasks, I’m using Impala together with Tableau. For data science tasks, we’re using Cloudera’s Data Science Workbench together with Python and Spark.

Which data do you analyze on your local hardware? Which in server clusters or clouds?

I do not analyse any data on my local machine. We have a strict security and governance process for analysing data. The data that we analyse comes from a virtual private cloud on AWS. The data is encrypted at rest and in transit too (VPN, SSL et cetera).

If you use clouds, do you prefer Azure, AWS, Google or others?

I prefer AWS at the moment because I’m working with it. I like the API-first approach of Amazon’s web services, and I like services like Lambda or the machine learning APIs. Amazon is the market leader and very innovative in many fields. We’re using the AWS API together with Ansible and Terraform.

Where do you make your notes/memos/sketches? On paper or digital?

I prefer digital notes because they are searchable. I store all of my notes on the company’s Microsoft Exchange server, because then I can also access them from my mobile phone. I mainly use my macOS notebook for all of my sketches.

