Load large dataset into crossfilter/dc.js
I built a crossfilter with several dimensions and groups to display data visually using dc.js. The data is bike trip data, with one record loaded per trip. Right now there are over 750,000 records. The JSON file I'm using is 70 MB, and it will only grow as I receive more data in the months to come.
So my question is: how can I make the data leaner so it scales well? Right now it takes approximately 15 seconds to load on my internet connection, and I'm worried it will take too long as the dataset grows. I've also tried to display a progress bar/spinner while the data loads, without success.
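To make the size concern concrete, here is a minimal sketch of one possible way to slim the file: store the field names once and each trip as a plain array, instead of repeating the keys in every object. The `pack` and `unpack` helpers and the sample values are hypothetical and are not part of the project.

```javascript
// Field order for the packed layout (the shortened names used in the JSON).
const FIELDS = ['start_date', 'start_time', 'u', 'g', 'dur', 'm', 'age'];

// Turn an array of objects into { fields, rows } so keys appear only once.
function pack(records) {
  return {
    fields: FIELDS,
    rows: records.map((record) => FIELDS.map((field) => record[field])),
  };
}

// Rebuild the array of objects from the packed layout.
function unpack(packed) {
  return packed.rows.map((row) =>
    Object.fromEntries(packed.fields.map((field, i) => [field, row[i]]))
  );
}

// Made-up sample rows, for illustration only.
const sampleTrips = [
  { start_date: '2014-03-15', start_time: '17:42', u: 'Subscriber', g: 'Male', dur: 905, m: 3200, age: 31 },
  { start_date: '2014-03-15', start_time: '18:05', u: 'Customer', g: 'Female', dur: 1260, m: 4100, age: 27 },
  { start_date: '2014-03-16', start_time: '09:10', u: 'Subscriber', g: 'Female', dur: 480, m: 1500, age: 44 },
];

const packedTrips = pack(sampleTrips);
const restoredTrips = unpack(packedTrips);
```

The saving per record is roughly the length of the repeated keys, so it adds up over hundreds of thousands of rows. Serving the file gzipped is a separate measure that would likely help as well.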
The columns I need are start_date, start_time, usertype, gender, tripduration, meters, and age. I have shortened these fields in my JSON to start_date, start_time, u, g, dur, m, and age so the file is smaller. The page has a line chart at the top showing the total number of trips per day. Below that there are row charts for the day of week (calculated from the data) and month (also calculated), and pie charts for usertype, gender, and age. Below that there are two bar charts for the start_time (rounded down to the hour) and tripduration (rounded up to the minute).
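For reference, the calculated values described above could be derived roughly like this. This is only a sketch: the `deriveFields` helper is hypothetical, and it assumes start_date is formatted as YYYY-MM-DD, start_time as HH:MM, and dur is in seconds, none of which is confirmed by the dataset.

```javascript
const DAY_NAMES = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'];

// Compute the values the charts are keyed on from one compact trip record.
// Assumed formats: start_date 'YYYY-MM-DD', start_time 'HH:MM', dur in seconds.
function deriveFields(trip) {
  const [year, month, day] = trip.start_date.split('-').map(Number);
  // Build the date in UTC so the weekday does not shift with the local timezone.
  const date = new Date(Date.UTC(year, month - 1, day));
  return {
    date: date,
    dayOfWeek: DAY_NAMES[date.getUTCDay()],
    month: month,                                // 1 = January
    hour: Number(trip.start_time.split(':')[0]), // rounded down to the hour
    minutes: Math.ceil(trip.dur / 60),           // rounded up to the minute
  };
}

// Made-up sample record, for illustration only.
const derived = deriveFields({
  start_date: '2014-03-15', start_time: '17:42',
  u: 'Subscriber', g: 'Male', dur: 905, m: 3200, age: 31,
});
```

Each of these values would then back one crossfilter dimension, for example a dimension on `hour` for the start-time bar chart.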
The project is on GitHub: https://github.com/shaunjacobsen/divvy_explorer (the dataset is in data2.json). I tried to create a JSFiddle, but it is not working (likely due to the data, even gathering only 1,000 rows and loading it into the HTML with tags): http://jsfiddle.net/QLCS2/
Ideally, only the data for the top chart would load first. That would be quick, since it is just a count of trips by day. The other charts, however, need progressively more data to drill down into finer details. Any ideas on how to get this to work?
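As a rough illustration of the "count by day" idea, the daily totals could be computed ahead of time (for instance in a build step) and shipped as a small separate file for the top chart. The `countTripsByDay` helper below is hypothetical, and the sort assumes start_date is an ISO-style YYYY-MM-DD string, so that text order matches date order.

```javascript
// Reduce full trip records to one { date, trips } entry per day.
function countTripsByDay(trips) {
  const counts = new Map();
  for (const trip of trips) {
    counts.set(trip.start_date, (counts.get(trip.start_date) || 0) + 1);
  }
  // Assumes ISO-style dates, which sort chronologically as plain strings.
  return Array.from(counts, ([date, total]) => ({ date: date, trips: total }))
    .sort((a, b) => a.date.localeCompare(b.date));
}

// Made-up sample rows, for illustration only.
const dailyCounts = countTripsByDay([
  { start_date: '2014-03-16', dur: 480 },
  { start_date: '2014-03-15', dur: 905 },
  { start_date: '2014-03-15', dur: 1260 },
]);
```

The caveat is that a pre-aggregated file like this cannot respond to filters applied on the other charts, so the full records would still need to arrive before cross-filtering works.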