Data Visualization in Django: A dream?

Continuing a bit in the vein of yesterday’s post, one thing I keep wishing I could do is easily generate graphs and charts of Django-managed data. I have so much other stuff going on that I can’t work on it now, but I’d love to do it sometime. I know things like matplotlib are already out there, and I’m not hoping to reinvent them.

Instead, I’d like to build a Django-friendly interface to them, using declarative syntax, for instance. It would also be useful to identify DateField and DateTimeField instances automatically and correlate them with other data, so you can easily generate time-based graphs and link them up to actual events.
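As a rough sketch of the date-detection idea, the snippet below uses a plain dataclass as a stand-in for a Django model, so it runs without Django installed. The `Post` class and the `date_field_names` and `counts_per_day` helpers are hypothetical names made up for illustration, not an existing API. In Django itself the equivalent check would look at the model's field list for DateField instances (DateTimeField is a subclass of DateField).

```python
from collections import Counter
from dataclasses import dataclass, fields
from datetime import date, datetime


@dataclass
class Post:
    # Hypothetical stand-in for a Django model with a DateTimeField.
    title: str
    published: datetime


def date_field_names(record_type):
    """Return the names of fields annotated as date or datetime."""
    return [
        f.name
        for f in fields(record_type)
        # datetime is a subclass of date, so one check covers both.
        if isinstance(f.type, type) and issubclass(f.type, date)
    ]


def counts_per_day(records, field_name):
    """Count records per calendar day of the given date/datetime field."""
    counts = Counter()
    for record in records:
        value = getattr(record, field_name)
        day = value.date() if isinstance(value, datetime) else value
        counts[day] += 1
    return sorted(counts.items())


posts = [
    Post("First", datetime(2007, 11, 15, 9, 30)),
    Post("Second", datetime(2007, 11, 16, 8, 0)),
    Post("Third", datetime(2007, 11, 16, 21, 45)),
]
time_fields = date_field_names(Post)
series = counts_per_day(posts, time_fields[0])
```

The resulting `series` is a sorted list of (day, count) pairs, which is roughly the shape most plotting backends expect for a time-based graph.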

Imagine viewing a graph of your blog stats with built-in markers indicating when you posted. Then, you click a tag and your graph is automatically filtered to articles containing that tag, possibly also limiting your stats to hits on pages containing those articles. Or you select a few tags, and you get a bar chart comparing the average page views for each tag for a certain amount of time after a post was published. That way, you can get a good idea of which tags generate the most buzz on your blog.
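The per-tag comparison described above could be computed along these lines. This is only a sketch under assumed data shapes: the `posts` and `hits` lists are invented sample data, and `average_views_per_tag` is a hypothetical helper, not part of Django or any stats package.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample data: (post slug, publication date, tags).
posts = [
    ("intro", date(2007, 11, 1), {"django", "python"}),
    ("charts", date(2007, 11, 10), {"django", "charts"}),
]

# Hypothetical hit log: (post slug, date of the page view).
hits = [
    ("intro", date(2007, 11, 1)),
    ("intro", date(2007, 11, 2)),
    ("intro", date(2007, 11, 20)),  # outside a 7-day window
    ("charts", date(2007, 11, 10)),
    ("charts", date(2007, 11, 11)),
    ("charts", date(2007, 11, 12)),
    ("charts", date(2007, 11, 13)),
]


def average_views_per_tag(posts, hits, window_days=7):
    """Average views per post for each tag, counting only hits that
    fall within window_days after the post was published."""
    window = timedelta(days=window_days)
    published = {slug: day for slug, day, _ in posts}

    # Count the hits per post that land inside the window.
    views = defaultdict(int)
    for slug, day in hits:
        start = published.get(slug)
        if start is not None and start <= day < start + window:
            views[slug] += 1

    # Sum the per-post views for each tag, then divide by post count.
    totals, counts = defaultdict(int), defaultdict(int)
    for slug, _, tags in posts:
        for tag in tags:
            totals[tag] += views[slug]
            counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


averages = average_views_per_tag(posts, hits)
```

Each key in `averages` would become one bar in the chart, with the window length as the knob you adjust.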

Now imagine taking that to sports scores, crime data, anything you can imagine. If there were an easy and powerful way to bridge the gap between Django models and matplotlib, a world of possibilities would await.

Comments

  1. At 12:27 a.m. on Nov 17, 2007, Doug Napoleone said ...

    I too have spent much time thinking about this concept. Over a year ago we had a co-op at work who was working on data visualization in Django using matplotlib. I was the one who pushed him in this direction, and the end result was nice, but special-purpose (not generalized).

    I started thinking about generalizing it, but my boss (thankfully) told me not to bother, and that it was not useful for work. The entire project died soon after (other things took its place).

    There are issues with matplotlib, in that it is not thread safe. On Windows the problems are compounded. Caching and other issues start to come up. While it is very very cool to have a url pattern for generating a .png on the fly via a custom view [extremely sexy], it is not the fastest way to serve up larger graphs. One of the things I would have loved to have seen was a means of diving into the image to get to the data behind it. Specifically, click the dots on the graph. Maybe zoom in. All those things that even the most basic stock data widget does.

    To that end I believe the best way to go is not an image generator at all, but Dojo (or a similar package with client-side graphing) and json-generating views. This would give the best bang for the buck and allow for doing more meaningful visualization.

    Imagine replacing the default databrowse templates with those which generate json (model->json is not enough, but there is not enough space here for that.)
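To make the json-generating idea concrete, here is a minimal sketch of the serialization step such a view might perform. The payload shape (the `title`, `series`, `x` and `y` keys) is an assumption chosen for illustration, not the format Dojo or any other charting package requires, and `chart_payload` is a hypothetical helper name. In a Django view the resulting string would be returned as the response body.

```python
import json
from datetime import date


def chart_payload(title, points):
    """Serialize (date, value) pairs into a JSON string for a
    client-side chart. The key names here are assumptions."""
    return json.dumps({
        "title": title,
        # Dates are not JSON-serializable, so emit ISO 8601 strings.
        "series": [{"x": day.isoformat(), "y": value} for day, value in points],
    })


payload = chart_payload(
    "Hits per day",
    [(date(2007, 11, 15), 120), (date(2007, 11, 16), 95)],
)
```

Keeping the underlying data in the response, rather than baking it into a .png, is what would let the client offer clickable points and zooming.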

  2. At 10:08 p.m. on Nov 17, 2007, Doug Napoleone said ...

    Follow up:

    I am actually leaning towards Open Flash Chart as a charting package solution because:

    1. it's open source

    2. faster than JS

    3. has the ability to run off of data files from urls (databrowse views)

    4. change the chart type in JS (and other dynamic stuff)

    5. links on data points (for loading new data, databrowse in a single page! the chart just updates!)

    6. the all important tooltips

    7. The site includes helper python code.

    The only negative is that it is Flash, but what isn't these days.

  3. At 11:10 p.m. on Nov 17, 2007, Marty Alchin said ...

    I'll be honest, I'm not opposed to flash, so thanks for the heads up on Open Flash Chart. I'll definitely look into it. I think what I'd like more than anything is to be able to define graphs in a general manner, and maybe be able to plug in different renderers. I had figured matplotlib and reportlab would be useful to start, but Open Flash Chart looks like a better general solution. Dojo's charting uses SVG/VML, which is cool, so that'd be useful to support for some situations, I expect. Again, probably a pipedream.

    Basically, my main goal would be to easily bridge between Django models and a graph/chart definition, whether it's matplotlib, Open Flash Chart, or some generalization that can use pluggable renderers. There's just a lot of Django-specific handling that can be done to generate chart-ready datasets, and I think that's more where my passion lies.

    But yes, more looking shows that Open Flash Chart looks like a really nice package, and if it's nice and quick, it could be a great tool to use, even as a default if other options are made available.

  4. At 6:21 a.m. on Nov 18, 2007, Amirouche B. said ...

    You may be interrested in something like this http://prefuse.org/. I do believe that flash is not the answer but this is just the current rendering engine, no ? The tamarin project release is not that far from now. I can't find the mozilla roadmap, but if remember well, the JIT compiler will be introduced in firefox 4 with the mozilla 2 platform/stack/... If mozilla follow a 6 month developement cycle this techno should see the light next year. Who wants to bet on this ?

    [edit] keep coding [/edit] keep blogging

  5. At 6:27 a.m. on Nov 18, 2007, Amirouche B. said ...

    you may not even need to learn a new programming language

  6. At 10:51 a.m. on Nov 18, 2007, patrys said ...

    Maybe that comes useful:

    http://code.google.com/p/python-libchart/

    My little library that uses Cairo for drawing.

  7. At 11:41 a.m. on Nov 18, 2007, Amirouche B. said ...

    nice shoots, but dojo.charts can already do this more or less, and it can't easily solve the interactivity problem. Have a look at dojo.charts and the api.

  8. At 11:45 a.m. on Nov 18, 2007, patrys said ...

    amirouche:

    It's not possible to embed JS charts inside PDF documents so it's not really useful for us.

  9. At 1:01 p.m. on Nov 18, 2007, Amirouche B. said ...

    I thought we were speaking about the web. Whatever. Someone told me that the next gecko engine will get png/pdf generation.

  10. At 1:33 p.m. on Nov 18, 2007, Joe said ...

    I've used FusionCharts (a pay-for variant of the OpenFlashCharts you're looking at) - and it's been very nice to use with Django - a few django templates to generate XML data and you're good to roll.

    Of note, if you end up putting a LOT of graphs onto a single page, the flash "weight" of memory in the browser will become... er, extraordinary.

    It's not declarative goodness, just a few views, but it's a nice way to get to the data.

  11. At 3:26 p.m. on Nov 18, 2007, Marty Alchin said ...

    Thanks for all the suggestions, guys!

    Now that I know how many options are available, it makes me that much more interested in a generic graph definition framework. Basically, if we could just define a way to describe graphs, and an interface for retrieving data, that'd be great.

    Then, there could be renderers for all the options laid out in these comments, as well as data wrappers for everything from Django, SQLAlchemy, whatever.

    Regardless, I'm glad to see it shouldn't be hard to actually get it working in a real-world environment. I'll definitely be writing up my experiences in the future.
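One way the generic definition plus pluggable renderers could fit together is sketched below. Everything here is hypothetical: `ChartSpec`, the `RENDERERS` registry and the two toy backends are invented for illustration, and real backends (matplotlib, Open Flash Chart, Dojo) would each need their own renderer behind the same interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ChartSpec:
    # Hypothetical backend-neutral description of a chart.
    title: str
    kind: str                        # e.g. "line" or "bar"
    points: List[Tuple[str, float]]  # (label, value) pairs


# Registry mapping a backend name to its rendering function.
RENDERERS: Dict[str, Callable[[ChartSpec], str]] = {}


def renderer(name):
    """Decorator that registers a rendering function under a name."""
    def register(func):
        RENDERERS[name] = func
        return func
    return register


@renderer("text")
def render_text(spec):
    """Toy backend: one line per point, drawn with '#' characters."""
    lines = [spec.title]
    for label, value in spec.points:
        lines.append(f"{label}: {'#' * int(value)}")
    return "\n".join(lines)


@renderer("csv")
def render_csv(spec):
    """Toy backend: emit the points as CSV for another tool to plot."""
    rows = [f"{label},{value}" for label, value in spec.points]
    return "\n".join(["label,value"] + rows)


def render(spec, backend):
    """Render a chart spec with the named backend."""
    return RENDERERS[backend](spec)


spec = ChartSpec("Views by tag", "bar", [("django", 3), ("python", 2)])
text_output = render(spec, "text")
csv_output = render(spec, "csv")
```

The data wrappers would sit on the other side of `ChartSpec`: each one turns a Django queryset, a SQLAlchemy query, or anything else into the same list of points.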

  12. At 3:15 a.m. on Nov 20, 2007, Anonymous said ...

    Open Flash Chart is nice. It easily integrates into Django. Unfortunately, the Python support is 2nd tier. It is missing support for quite a few things including bar charts.

  13. At 3:16 a.m. on Nov 20, 2007, Anonymous said ...

    Oops. I meant pie charts. Bar charts work nicely.

  14. At 6:59 a.m. on Nov 20, 2007, Marty Alchin said ...

    Personally, I wouldn't care much about supporting pie charts anyway. For reasons why, check out this article by Stephen Few.

  15. At 5:29 a.m. on Feb 23, 2008, Toby Dylan Hocking said ...

    I am developing a software package extension to Django for just what you describe --- a streamlined interface to various plotting backends. It's been in development for over a year and is already quite functional.

    http://sf.net/projects/django-dataplot

This particular article was posted on Friday, November 16, 2007, and has received 15 comments.

It was preceded by Blog stats and followed by Fixing bugs or adding features?.
