
BigQuery Big Data Visualization With D3.js

How do you handle a large dataset with D3.js?

It’s a frequently asked question. You can read several discussions on the topic here, here, and here. So far, the best solution has been to reduce the data to a smaller dataset first, then visualize that with D3.js.

With carefully crafted data processing, we can get a decent story from the data. But this approach doesn’t leave much flexibility to experiment with the data on the fly. We need a more streamlined workflow. Less friction can spark interesting data innovation.

Google BigQuery is a great tool for handling big datasets. It can run the heavy queries for us, so that D3.js only has to draw a small result.

I will use the New York Taxi dataset hosted on Google BigQuery. It is 4+ GB and has more than 350 million rows across 2 tables. In this article, I want to show you how to query it on the fly, then use D3.js to create a line chart of total trip amount over time. You can explore the dataset here:

(You’ll need to set up a BigQuery account with one project to see the public table.)

https://bigquery.cloud.google.com/table/833682135931:nyctaxi.trip_fare?pli=1

BigQuery has full SQL support, so we can run an aggregate query directly on the dataset. We’ll group by month/year and sum the total_amount column. The query takes less than 5 seconds.

SELECT
  CONCAT(CONCAT(STRING(MONTH(TIMESTAMP(pickup_datetime))), "/"),
         STRING(YEAR(TIMESTAMP(pickup_datetime)))) AS time,
  SUM(INTEGER(total_amount)) AS total_amount
FROM [833682135931:nyctaxi.trip_fare]
GROUP BY time;
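
To make the aggregation concrete, here is a minimal JavaScript sketch of what the query computes, run on a few made-up rows instead of the real table. The sample values, the "YYYY-MM-DD HH:MM:SS" date format, and the function name totalAmountByMonth are assumptions for illustration. The sketch also assumes INTEGER() drops the fractional part of total_amount.

```javascript
// Hypothetical sample rows; the real table is far larger and has more columns.
const sampleTrips = [
  { pickup_datetime: "2013-01-05 08:15:00", total_amount: "12.50" },
  { pickup_datetime: "2013-01-20 17:40:00", total_amount: "7.90" },
  { pickup_datetime: "2013-02-02 09:05:00", total_amount: "20.10" },
];

// Mirrors the query: key each row by "month/year", then sum the amounts.
function totalAmountByMonth(trips) {
  const totals = new Map();
  for (const trip of trips) {
    // Assumes dates look like "YYYY-MM-DD HH:MM:SS".
    const [year, month] = trip.pickup_datetime.split(" ")[0].split("-");
    const key = `${Number(month)}/${year}`;
    // Assumption: INTEGER() truncates, so 12.50 counts as 12.
    const amount = Math.trunc(Number(trip.total_amount));
    totals.set(key, (totals.get(key) || 0) + amount);
  }
  // One object per group, shaped like a row of the query result.
  return [...totals].map(([time, total_amount]) => ({ time, total_amount }));
}

console.log(totalAmountByMonth(sampleTrips));
```

The point of running this on BigQuery is that the 350 million rows never reach the browser. Only the handful of monthly totals do.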

Then you can use the JavaScript client library to query your dataset on the fly. See the full article and code here:

http://blog.vida.io/2014/07/06/bigquery-big-data-visualization-with-d3-dot-js/
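
Whichever client you use, the query result has to be reshaped before D3.js can use it. BigQuery's REST API returns each row as a list of cells ({ f: [ { v: ... } ] }), with the column names in schema.fields and the values as strings. A minimal sketch of that conversion, assuming a response of this shape, might look like the following. The sample response and the helper name rowsToObjects are made up for illustration, and the full article linked above may do this differently.

```javascript
// Made-up response in the shape BigQuery uses for query results:
// schema.fields names the columns, each row is { f: [ { v: value }, ... ] }.
const sampleResponse = {
  schema: {
    fields: [
      { name: "time", type: "STRING" },
      { name: "total_amount", type: "INTEGER" },
    ],
  },
  rows: [
    { f: [{ v: "1/2013" }, { v: "19" }] },
    { f: [{ v: "2/2013" }, { v: "20" }] },
  ],
};

// Hypothetical helper: turn the f/v rows into plain objects keyed by column name.
function rowsToObjects(response) {
  const fields = response.schema.fields;
  // A query with no results may omit `rows` entirely.
  return (response.rows || []).map((row) => {
    const record = {};
    row.f.forEach((cell, i) => {
      const { name, type } = fields[i];
      // Values arrive as strings; convert numeric columns for charting.
      const isNumeric = type === "INTEGER" || type === "FLOAT";
      record[name] = isNumeric ? Number(cell.v) : cell.v;
    });
    return record;
  });
}

console.log(rowsToObjects(sampleResponse));
```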

Finally, you'll have a visualization that gets data directly from Google BigQuery.
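
One detail to watch when drawing the line: the "month/year" strings don't sort chronologically as text, so the points should be ordered by date before they are connected. The sketch below shows that step and the scaling behind a line chart in plain JavaScript with no dependencies. In a real page you would more likely let D3's scales and line generator do this work. The sample totals and the names monthIndex and linePath are made up for illustration.

```javascript
// Hypothetical monthly totals, deliberately out of order.
const monthlyTotals = [
  { time: "2/2013", total_amount: 20 },
  { time: "12/2012", total_amount: 10 },
  { time: "1/2013", total_amount: 30 },
];

// Turn "M/YYYY" into a number that sorts chronologically.
function monthIndex(time) {
  const [month, year] = time.split("/").map(Number);
  return year * 12 + (month - 1);
}

// Build an SVG path string for a line chart that is width x height pixels.
function linePath(data, width, height) {
  if (data.length === 0) return "";
  const points = data
    .map((d) => ({ x: monthIndex(d.time), y: d.total_amount }))
    .sort((a, b) => a.x - b.x);
  const xMin = points[0].x;
  const xMax = points[points.length - 1].x;
  const yMax = Math.max(...points.map((p) => p.y));
  // Linear scales; y is flipped because SVG y grows downward.
  const sx = (x) => (xMax === xMin ? 0 : ((x - xMin) * width) / (xMax - xMin));
  const sy = (y) => (yMax === 0 ? height : height - (y * height) / yMax);
  return points
    .map((p, i) => `${i === 0 ? "M" : "L"}${sx(p.x)},${sy(p.y)}`)
    .join("");
}

console.log(linePath(monthlyTotals, 200, 90));
```

The returned string can be used as the d attribute of an SVG path element.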

Phuoc Do

https://vida.io/dnprock

Originally posted on Data Science Central
