Data visualization

VizBox Bergen og årets geogründer

Saturday, March 21st, 2015 | Data visualization | 2 Comments

Geomatikkdagene 2015 gjekk av stabelen 17.-19. mars på Lillehammer. Geomatikk handlar om å kombinere tradisjonelle kartfag med informasjonsteknologi, som inkluderer alt frå oppmåling og datainnsamling til presentasjon av data på kart.

Fem geogründerar var inviterte til konferansen for å presentere prosjekta sine, og eg presenterte eksperimenta eg har gjort med projisering av grafikk på 3D-printa terrengmodellar. Det var mange spennande prosjekt, men etter avstemminga var eg den heldige som vart kåra til ‘årets geogründer’! Hurra!


The VizBox Experiments

Eg presenterte både nye og gamle eksperiment. Under kan du sjå dei første eksperimenta, der eg tok utgangspunkt i eit 3D-printa kart over Sør-Noreg og bygningskomplekset til School of Cinematic Arts i Los Angeles:


VizBox Bergen

I tillegg presenterte eg nokre nye eksperiment der eg har tatt utgangspunkt i Bergen by og omegn:


I tillegg til heder og ære var premien å få delta gratis på Geomatikkdagene 2017 – og det gledar eg meg til!

Tags: , , , , , ,

The VizBox Experiments

Wednesday, April 30th, 2014 | Data visualization, General | 2 Comments

This is the result of a project I have been working on for the past months. The video demonstrates the setup and use of the VizBox (previously known as TopoBox) – a physical platform for interactive data visualization on three-dimensional surfaces.

The project has been highly explorative, geared towards testing and demonstrating new potentials rather than producing a finished product ready for use. Hopefully, this can serve as a starting point for new discussions, projects and experiments.

I do not currently have any specific plans for developing the VizBox further. However, I would be happy to discuss ideas and possibilities for collaboration. What would YOU do if you had a box like this?

Update march 2015: I have made a new prototype and video, and won a prize for the work! See the new blogpost: “VizBox Bergen og årets geogründer” (in Norwegian).



Special thanks to colleagues and students in the Media Arts + Practice program within the School of Cinematic Arts – especially Virginia Kuhn, Andreas Kratky and Behnaz Farahi.

Tags: , , , , , , , , , ,

TopoBox: exploring a tangible dataviz platform

Wednesday, December 18th, 2013 | Data visualization, General | 3 Comments

A lot of exciting examples of interactive data visualization projects are popping up on the web these days. Many of these are interactive, and allow the user to navigate or filter information using a mouse or a finger on a touch screen. But what might the future of interactive data visualization look like?

Even though touch screens might seem perfect for interacting with computers, they are far from perfect. As pointed out by Bret Victor, our screens force us to interact with pictures under glass, and don’t allow us to make use of our full range of capabilities for perceiving and interacting with tangible objects. After all, we are experts at interacting with real-world objects that have physical properties like texture, translucency, elasticity and weight.


Science fiction movies and holographic displays

If we look to science fiction movies (which often give us a glimpse into possible futures of interface design), it seems like the future is going to be filled with large, immersive holographic displays. Even though many of these envisioned interfaces don’t seem very usable or even useful, there is obviously something immediate and engaging with the presentation of graphics in three-dimensional space, integrated with our physical environment. For example, such interfaces would allow multiple users to see and explore three-dimensional data from multiple angles.


Still image from Avatar. Image:

While there have been a lot of interesting developments in holographic display technology in recent years, for example by the Graphics Lab at the University of Southern California, I believe we have a long way to go before high quality holographic displays will be available outside high tech labs.

Virtual Reality and Augmented Reality are probably more likely to be successful, but they come with a big disadvantage: they provide un-social interfaces. AR and VR typically place the interface in front of users’ faces, thereby obstructing social interaction between the user(s) and other people. In contrast, the holographic displays portrayed in science fiction movies like Avatar are social; they allow multiple users to seamlessly interact with the information and each other.

However, even if holographic displays were cheap and available, they would not provide any haptic feedback and data with material qualities. What if we could find a way to combine haptic interaction with spatial, three-dimensional graphics, and thereby create a new medium for data visualization, using commonly available and affordable technology?


Case: a topographic map of Norway

While I was working with survey data from Difi a while back, I was sketching out different ideas for creating interactive data visualizations using the data from Norway.

One of the ideas that emerged was to build a physical box with a topographic landscape of Norway on top of it, onto which graphics could be projected from inside the box. For practical purposes the graphics could be projected onto a mirror in the box from one of the sides, as shown in the illustration below.

TopoBox sketch

By attaching a Leap Motion sensor to the box, the users would be able to interact with the visualization using gestures, for example pointing to specific places to open more information on an adjacent screen, or make a swipe gesture above the box to go to the next visualization. If rear-projection is used, there will be no shadows from the hands, which often is a problem when front projection is used.

Obviously, this surface does not provide the same flexibility for 3D graphics that a holographic display would do. But for specific applications it might be superior to holographic displays, because of its physical manifestation and tangibility. And more importantly, this platform can be put together using commonly available and fairly cheap technology.


Initial experiments: projection on 3D printed surfaces

3D printing is perfect for creating complex surfaces. However, I wasn’t sure how rear projection would work with the 3D printed material, so I needed to start off by doing some simple experiments. I ordered some sample models with different thickness from Shapeways, including one representing buildings in a cityscape, and projected graphics onto them.

The result wasn’t as good as I hoped for, but not too bad either. I was especially happy to see how effective it is to see movement on a curved, physical object. Kind of magic, don’t you think?

As expected, the quality of the graphic is best using front projection. However, if the surface is thin (less than 1mm), rear projection works quite well, if not for the most detailed graphics.

After much struggle, I also managed to create and print a 3d model of the southern part of Norway:

3D printed model of Norway

Note that the elevation has been exaggerated. Otherwise, it would probably be impossible to see any fjords or mountains at all. After all, the aim here is not to reproduce a perfect scale model, but rather a physical representation that is as informative as possible.

A rough mockup to test the rear projection via a mirror in a box:

Projection test setup

Finally, the result:

Rear projection on 3D printed model of Norway

Even though I found these experiment promising, it is clear that I need to print a larger model in order to get higher graphical resolution. In addition, I think I need to exaggerate the elevation even more.


Further work and remaining questions

As you probably can guess, this work is highly experimental, including a lot of trial and error. This can be quite frustrating at times, and I am not sure how good result it is possible to get using 3D printing and rear projection.

Probably, this new ‘medium’ will turn out to work well for some types of visualizations, and not for others. What are the design affordances of such a medium? What kind of data would you like to see projected onto such a physical landscape? How would you interact with it?

Tags: , , , , , ,

Norway in 3D part I: from DEM to 3D surface

Wednesday, December 4th, 2013 | Data visualization, General | No Comments

Earlier this year the Norwegian Mapping Authority released a bunch of free data, including Digital Elevation Model (DEM) files that describe the terrain of Norway in high detail.

I wanted to use this data to create a 3D model of Norway that I eventually could 3D print. This turned out to be a bit complicated, so I want to share some of my experiences in case anyone else wants to do something similar. So yes, this will be a bit technical.


Getting the DEM files

After registering and logging in to you are ready to download the files. But don’t think you get one file for the whole of Norway! First, you have to figure out which files to download (I used ‘Digital terrengmodell 10 m , UTM 33’), and then select and download 254 individual files one by one to get all the tiles you need.



Merging the tiles

So, now you have a lot of DEM files that you need to combine into one large file. I installed GDAL, which is an open source and powerful “translator library for raster geospatial data formats”. The library is used by various software packages, and can also be used with shell scripts. After quite a lot of googling and trial and error I managed to write a shell script that combined all the DEM files into one GeoTIFF file. The script:

#!/bin/bash -ps 100 100 -init 255 -o merged.tif *.dem

-ps 100 100 specifies the resolution for the output file. Higher number: smaller output file.
-init 255 specifies color range.
-o merged.tif specifies the output file.
*.dem uses all the DEM files in the current folder as input.

See full gdal_merge documentation for more.

GeoTIFF files are similar to regular TIFF images, but they store a lot of additional geospatial data. You may open the file in Photoshop, but you will probably just see some basic black and white contours.

Create heightmap

One of the simplest ways to create a landscape in 3D is to use a greyscale heightmap, and displace a surface according to the image. A heightmap is basically just an image in which color represents height, from low (black) to high (white). There are probably other more sophisticated ways of creating a 3D surface, but this is at least quite straightforward.

First, it is necessary to create a greyscale heightmap from the GeoTIFF file. This shell script did the trick:

gdal_translate -b 1 -scale -20 2500 -of PNG merged.tif heightmap.png

-b 1 select input band 1 (don’t even ask me).
-scale -20 2500 set height/color range from -20 m (black) to 2500 m (white).
-of PNG specify output file format.
merged.tif input filename.
heightmap.png output filename.

See full gdal_translate documentation

The result is a greyscale map that looks like this:


As you may see, parts of Sweeden have been included in this image. We can’t have that, of course, so I removed the Sweedish areas by applying a mask in Photoshop. Here is the final heightmap (click to see larger version on Flickr):

Norway heightmap

By the way, it looks even cooler with inverted colors, don’t you think? The fjords and the mountains look like veins in some kind of alien organism..

Norway inverted heightmap

Create 3D surface in Blender

A lot of 3D applications can create 3D surfaces from heightmaps. Since I am trying to use open source software as much as I can, I wanted to use Blender, and for this purpose it worked quite well. (The capabilities of Blender are impressive, but is unfortunately quite hard to use due to a range of usability issues, even though they did a major UI facelift a couple of years ago.)

Creating a 3D surface from a heightmap goes something like this:

  1. Create a plane
  2. Go to edit mode (Tab), and subdivide the plane (W). More subdivisions will give you more detail, but also a very heavy model.
  3. Go back to object mode, add new material to the plane
  4. Add new texture to the place, and select “Image or movie” as input. Load the heightmap image
  5. Add a Displacement modifier to the plane, using the texture as input. Adjust the strength to adjust how much the surface should be displaced

And voilà! You have Norway as a 3D surface. Add some light, and you get something like this:

Norway in 3D

You can probably see that I have exaggerated the height a lot in this particular model. Otherwise it would probably be hard to see any landscape at all.

Actually, when I look at it, I find this to be an interesting visualization in itself; by exaggerating the height, it is possible to get an impression of the topography of different areas of Norway in one picture, which otherwise would be impossible to achieve.

Wouldn’t it be cool to 3D print this? Oh yes, I’ll get back to that. And why I wish the earth was flat.

Tags: , , , , , , , , , ,

Using visualization for understanding survey data

Thursday, October 31st, 2013 | Data visualization, General | No Comments

One of the goals for my Fulbright project has been to work with real data, and use visualization as a means of exploring, understanding and analyzing data.

When doing research on data visualization it is obviously necessary to have some data to work with. Before going to the US I decided to cooperate with Difi (The Agency for Public Management and eGovernment) in Norway, since I had positive experiences from working with them before, and they were more than happy to see someone use their data.

More specifically, I have been working with survey data from The Norwegian Citizen Survey (Innbyggerundersøkelsen), which asked 30.000 Norwegians about their perceptions of public services in Norway, the welfare state and democracy, and living in different parts of Norway.

I have carried out a range of visualization experiments in order to get to know the data, but also for testing different visualization techniques as tools for discovery and analysis. Some of these visualization experiments will be presented in this blogpost, along with some of the lessons I have learned on the way.


Gigamapping the survey

One of the challenges when facing a large dataset is to understand the context in which it has been created. How was the data gathered? What can the data tell us, and more importantly, what can it not tell us? Are there any specific issues that are important to be aware of when analyzing and presenting the data?

Before leaving Norway, I had a meeting with the people in Difi that are responsible for the survey. While discussing the survey, I made notes on a large canvas laid out on the table, which served as a medium for documentation as well as a shared platform for discussion during the meeting. The resulting sketch was messy and kind of ugly (as expected!), but served its purpose well.

DIFI sketch from meeting

Sketch from meeting with Difi. Click to see larger image.

When I got to the US I digitized and cleaned up the map, and added some more relevant information. The result is a visual overview, or gigamap, that maps out different aspects of the survey and the data, and also serves as a medium for discussing the project with Difi and others. For me, the process of making the map was probably as important as the resulting map itself, as this forced me to clarify and reflect on the information, and look deeper into areas that I didn’t know enough about (like relevant concept from statistics).


Digitized and cleaned gigamap. Click to download map in PDF format (1.6 MB). Feel free to print it out and put it on the wall, like I did!

I have put the final map on the wall in my office to remind myself of all the aspects of the survey, as well as make my project visible to my colleagues. In addition, it has been useful to show the map in presentations, and literally zoom into specific areas of interest.

Lesson learned: Visualization and gigamapping can be useful for gathering, processing and communicating information about the context in which a specific dataset has been created. One of the main challenges is to find the balance between complexity and simplicity, between visualization as a tool for thinking and visualization as a tool for communication.


10,039,380 data points

After my initial meeting with Difi, it was time to get to know the data. When I have visualized data before I have mostly worked with aggregated data that has already been processed. In this case, however, I wanted to work with raw data in order to get more control and better understanding of the data.

The survey data comes in a large spreadsheet format in which each respondent’s answers are located in a row, and each column represents an answer to a question (or data about the respondent). For just part 1 of the survey there are 23790 rows and 422 columns, which results in 10,039,380 cells! Where to begin?


Screenshot from Excel


Lessons learned: it can be quite overwhelming to approach a large dataset for the first time. Even though working with ‘raw data’ has its advantages in terms of pliability, it requires specific knowledge and skills to approach it efficiently.


First attempts: Tableau Public

My first attempts for working with the data were done with Tableau Public, the free version of Tableau Software.

Tableau Software is an application designed specifically for data visualization, and provides an (seemingly) easy-to-use interface. Unfortunately Tableau only runs on PCs, so I had to install Windows on my Mac to try it out. (A Mac version of Tableau Public is coming out in 2014).

I have the impression that many dataviz designers use the software for discovery, so I wanted to give it a try. However, I found that the type of data I work with (survey data that should be weighted) was hard to get into Tableau. Finally, I managed to get some of the data in, and I was able to gain some insight from the resulting visualizations. For example, I found that people in Norway are overall satisfied with their life situation – which of course is a good thing, but not so interesting to visualize!


Screenshot from Tableau Public. Click to see larger image.

Lesson learned: the good and bad thing with tools like Tableau is that the software presents you with a limited set of visualization types. This is useful if you need a standard graph, but Tableau is not the software package you would choose for exploring new kinds of data visualization. In addition, Tableau seems to live in its own little bubble; you have to work with the presets, interface styles and export formats presented to you. I needed something more flexible, and didn’t want to spend a lot of time learning to use an application that could only take me so far.


Going behind the scenes with Python

While struggling to get my data into Tableu, I realized that I needed to learn some new tools to be able to sort and rearrange data. In addition, I wanted to combine several datasets into a new one, and for that purpose I needed to do some scripting.

Inspired by Nathan Yao, I decided to learn Python, which runs in the Terminal, and is powerful for working with large datasets. Python is also useful for scraping websites for data, which might come in handy later. If you want to learn Python, I highly recommend CodeAcademy. But be warned: you might get addicted to its game-like learning environment! I did.

Lesson learned: preparing, sorting, merging and rearranging data is a necessary but time consuming part of the data visualization process. Even though a couple of lines of code might be enough, the challenge is to find out exactly what those lines should contain. I am convinced that dataviz designers should learn at least some programming.


Categorical data overview with dbcounter

Since I started working with the survey data, I was looking for a way to get an overview of all the data. The solution was a Nodebox script written by Moritz Stefaner, that quickly makes a visual overview of large sets of categorical data (like survey data). I downloaded Nodebox and the script (which is written in Python), and tweaked it a bit. I saved the resulting illustration as a PDF, and changed the colors and fixed the text in Illustrator.

Each row shows the distribution of the responses for each question in the survey. Next to the graphic I put all the questions, so that I can zoom in and see the distribution and the corresponding question.


The image shows the top 15% of the visualization. Click to see the full visualization, PDF 1.1 MB


The nice thing with this visualization is that it is possible to see all the data at once. The responses that show a red to green gradient represent likert scale questions (disagree – agree, or very bad – very good). The pink-to-cyan gradients represent all other kinds of questions and variables, in which the categories are more arbitrary. Consequently, it is necessary to know the questions (and possible answers) in order to fully understand the visualization. This would not make much sense to present this for a general audience, but it works well for the purpose of discovery and exploration.

Lesson learned: dbcounter demonstrates how a simple script can be used to create a visualization that would take hours to produce manually. The resulting visualization allows us to see the distribution for all the questions at once, and makes it easier to discover interesting areas in the data, and start asking questions.


Explore interrelations with Parallel sets

While the dbcounter visualization gave a nice overview of the distribution of all the responses, it did not say anything about the interrelations across the questions/variables (also known as cross tabulation). One way of exploring such interrelations is to use an app called Parallel Sets, developed by Robert Kosara. The app makes it easy to interactively explore and analyze categorical data, and show relations between different categories.

The best way to explain this is through an example, using the survey data:


Image exported from Parallel Sets, with text added. Click to see larger image.

The horizontal line at the top shows the distribution of the size of the municipalities the respondents live in, from small municipalities (1) to large municipalities (4). The bottom line shows how satisfied people are with their access to cinema, concerts and theatre, from very unsatisfied (1) to very satisfied (7). Then, each respondent is tracked across the two variables, so that we can see what those who live in a small or large municipality feel about their access to cinema etc. As you might see, most people from large municipalities are very satisfied with the access, while the respondents from the small municipalities are not.

Lesson learned: even though it might be interesting enough to look at individual questions and responses, it becomes more interesting when we start to look for relations across variables. The Parallel Sets app provides one way of investigating such relations visually. However, it seems that this type of visualization works best when there are relatively few categories for each variable.


Searching for patterns in R

While dbcounter and Parallel Sets are great for visualizing data, they provide limited possibilities for analysis. At this point in the project, I was especially interested in correlation, and wanted to look for correlations between different variables in the dataset. For such statistical analysis, R is the place to go. R is a free, programming-based software environment for statistical computing. It can be used for visualizing data, even though the resulting graphs are a bit rough. However, by saving the graph as PDF and edit it in Illustrator, it is possible to make something decent out of it.

I have written some small and simple scripts in R in order to look for patterns in the data, reusing some code examples from Nathan Yau. The visualization type I found most useful was the Scatterplot matrix with LOESS curves. Here, the different variables are mapped against each other, plotting each respondent’s answer according to the two variables. Then, a LOESS curves is drawn horizontally, based on the averages of the variable on the vertical axis.


Scatterplot matrix. Click to see larger version (PDF).

In this example, you see the correlation between respondents’ age (top row and left column) and 4 questions. For example, in the left column, satisfaction is plotted vertically against age (horizontally). As you may see, older people seem to be slightly more satisfied than young ones, and young and old people are the ones most satisfied with the quality of roads and streets. In addition, we see that there is a strong correlation between the 4 questions: people who are satisfied with one topic are likely to be satisfied with something else as well.

Lessons learned: data analysis and statistical computing is highly useful and necessary for finding interesting patterns in a data set. As a designer, however, it is important to find the balance between doing statistical analysis and doing data visualization. Even though infoviz designers should know a bit about statistics, it may be more important to know how far our knowledge goes, and when we should talk to a professional statistician. Obviously, this also points to the need for multidisciplinary teams in data visualization projects.


Next up: concept development

To sum up, all the visualization examples presented here have been carried out in order to get to know the survey data, and test different visualization techniques for discovery and analysis. Looking back at all these experiments, I think the most important lesson has been to experience how many different ways it is possible to approach a survey dataset through visualization. There simply is not one type of visualization that can show everything; different types of visualizations provide different types of insights. This also points to the importance of data visualization in general: by visualizing our data in different ways, we see it through different lenses, and thereby learn something new about the data itself.

In parallel to these experiments, I have been working on different ideas for creating a more comprehensive concept for interactive data visualization. This is still very much a work in progress, so stay tuned!

Tags: , , , , , , ,