Create scales
We create our scales, but first, we learn how to split our data into equally-sized bins.
This lesson is part of the Fullstack D3 Masterclass course.
[00:00 - 00:10] Okay, so our next step is to create our scales. And this is where things are going to start diverging a little bit from the charts we've already created, and we're going to learn some new concepts.
[00:11 - 00:27] So let's go to our JavaScript file, and the X scale should be fairly straightforward. Again, we want to take the humidity values and map them onto how many pixels to the right each data point will be.
[00:28 - 00:38] So let's create an X scale. This is going to be our trusty linear scale, so d3.scaleLinear.
[00:39 - 00:49] And then as usual, we want a domain and a range. So the domain is the input values.
[00:50 - 01:04] So these are going to be humidity values. So we can use d3.extent and grab our data and pass our X accessor function into here.
[01:05 - 01:15] And then our range will be zero to the width of our chart. So dimensions.boundedWidth.
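To make the domain-to-range mapping concrete, here is a minimal plain-JavaScript sketch of what a linear scale does. It is not D3's implementation: `extent` and `linearScale` are hypothetical stand-ins for `d3.extent` and `d3.scaleLinear`, and the sample data and the 400-pixel width are assumptions.

```javascript
// Hypothetical sample data; the lesson uses a real weather dataset.
const dataset = [{ humidity: 0.31 }, { humidity: 0.5 }, { humidity: 0.95 }]
const xAccessor = d => d.humidity
const dimensions = { boundedWidth: 400 } // assumed chart width in pixels

// Stand-in for d3.extent: returns [min, max] of the accessed values.
const extent = (data, accessor) => {
  const values = data.map(accessor)
  return [Math.min(...values), Math.max(...values)]
}

// Stand-in for a linear scale: maps the domain linearly onto the range.
const linearScale = ([d0, d1], [r0, r1]) => value =>
  r0 + ((value - d0) / (d1 - d0)) * (r1 - r0)

const xScale = linearScale(
  extent(dataset, xAccessor),      // domain: lowest to highest humidity
  [0, dimensions.boundedWidth]     // range: left edge to right edge
)

console.log(xScale(0.31)) // lowest humidity sits at the left edge: 0
console.log(xScale(0.95)) // highest humidity sits at the right edge: 400
```

The lowest humidity value lands at pixel 0 and the highest at the full bounded width, with everything else spaced proportionally in between.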
[01:16 - 01:27] All right, so for our Y scale, let's look at our chart. Remember that we don't even have an accessor function for our Y scale.
[01:28 - 01:46] And this is because these are counts instead of a value that already exists within the data set. So we're going to need to figure out how many days fall within a set number of buckets of humidity values.
[01:47 - 02:02] And we could probably do this ourselves, but we can also do this much more easily with a built-in d3 method called d3.bin. So this will set up a generator function.
[02:03 - 02:23] So let's create a bin generator using d3.bin. Our bin generator function will need to know the extent of the values, similar to our scale.
[02:24 - 02:35] So it'll want to know the domain. And one of the nice things here is that we can use our X scale that we already used and just grab the domain from there.
[02:36 - 02:49] And remember if we're not specifying anything within this domain method, it'll just spit out the existing domain. Another thing that this bin generator needs to know is how to get the value from each data point.
[02:50 - 03:02] So say it's looking at a single day of data, it wants to know what values is it concerned with. So we tell it through this value method.
[03:03 - 03:15] So in value, we're going to want to create a function that takes a day and returns the humidity value. And this is something we already have with our X accessor.
[03:16 - 03:29] So we can just pop that in here. And then the last thing that we'll want to specify here is the number of thresholds that we want for our histogram.
[03:30 - 03:46] Let's aim for 13 of these rectangles. It's just a nice number that keeps it from being too crowded but isn't so small that you kind of lose the general shape of the distribution.
[03:47 - 03:56] So we want to tell it the number of thresholds. And the thresholds are these gaps in between the bars.
[03:57 - 04:05] So while we might want 13 bars, we're going to want 12 thresholds. And this number is going to be kind of a guesstimate.
[04:06 - 04:32] We're telling this bin generator to aim for 12 thresholds but to use its best judgment and keep our numbers round. D3 does a good job keeping these numbers as readable as possible, so you can see they're incrementing in values of 0.05. We could force exactly 13 rectangles.
[04:33 - 04:49] But then the bin boundaries would stop landing on round values and the numbers just get wonky. So it's going to keep that in mind and give us a number of thresholds that's close to the number that we specify.
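As a rough illustration of how a requested threshold count can turn into round boundaries, the sketch below picks a step of 1, 2, or 5 times a power of ten. It is loosely modelled on tick-style rounding and is not D3's source; `roundStep` is a hypothetical helper and the 0.31 to 0.93 domain is an assumed example, so D3's own result may differ in detail.

```javascript
// Pick a "round" step (1, 2, or 5 times a power of ten) near the raw step.
const roundStep = (min, max, count) => {
  const rawStep = (max - min) / count
  const power = Math.pow(10, Math.floor(Math.log10(rawStep)))
  const ratio = rawStep / power
  const factor = ratio >= Math.sqrt(50) ? 10
    : ratio >= Math.sqrt(10) ? 5
    : ratio >= Math.sqrt(2) ? 2
    : 1
  return factor * power
}

// Assumed humidity domain; the lesson's real extent is not shown here.
const [min, max] = [0.31, 0.93]
const step = roundStep(min, max, 12) // ask for roughly 12 thresholds

// Count the multiples of `step` that fall between min and max.
const thresholdCount = Math.floor(max / step) - Math.ceil(min / step) + 1
const barCount = thresholdCount + 1 // the bars sit between the thresholds

console.log({ step, thresholdCount, barCount })
```

With this assumed domain the step comes out at about 0.05, which matches the increments mentioned in the lesson. A different domain could give a threshold count that is only close to the requested one.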
[04:50 - 04:57] Alright, so the bin generator is a function, and we need to pass it our data set in order to get these bins.
[04:58 - 05:15] So we're going to grab our bins and then use our bin generator and pass it our data set. So let's go ahead and see what these bins are.
[05:16 - 05:27] Let's pull this up over here. I hope you can read this and if not I hope you're logging it out on your own screen.
[05:28 - 05:43] So bins, the thing that has been given to us by our bin generator, is an array of arrays. So each item in this array is going to be one of those rectangles in our histogram.
[05:44 - 05:55] So this first array, we can see it has one data point in here. The second one, we can see it has three data points.
[05:56 - 06:26] And one thing that we're not going to see in this console but you'll see in the native DevTools is that each of these arrays has items in it but it also has two keys associated with it. So if we look at that first item in our bins array, so get this one.
[06:27 - 06:49] We can see here it's just an array of objects, but if we look at the x0 key of this array, we can see that the x0 key has a value of 0.31 and the x1 key has a value of 0.35.
[06:50 - 07:02] And these numbers correspond to the bottom and the top of each of these bins. So this first bin will include days that have a humidity value that is 0.31 or above.
[07:03 - 07:21] The lower bound is inclusive. This bin will also only have days whose humidity value is lower than 0.35, so the upper bound is exclusive. If a day has a humidity value of 0.35, then it'll go in the next bin.
[07:22 - 07:37] And if we look at the data point for this one item in this array, we'll see the humidity is 0.31. So that is the lower bound so it's being included in this one bin.
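The bin structure described above can be sketched in plain JavaScript. This is a simplified stand-in and not `d3.bin` itself: `binByThresholds` is a hypothetical helper that takes explicit thresholds, and the data, domain, and thresholds are made up for illustration.

```javascript
// Hypothetical days; only the humidity field matters for this sketch.
const dataset = [
  { humidity: 0.31 },
  { humidity: 0.35 },
  { humidity: 0.36 },
  { humidity: 0.38 },
  { humidity: 0.42 },
]
const xAccessor = d => d.humidity

// Build bins from explicit, sorted thresholds. Each bin is an array of data
// points that also carries x0 (lower bound) and x1 (upper bound) properties.
const binByThresholds = (data, value, [min, max], thresholds) => {
  const edges = [min, ...thresholds, max]
  const bins = edges.slice(0, -1).map((x0, i) => {
    const bin = []
    bin.x0 = x0
    bin.x1 = edges[i + 1]
    return bin
  })
  for (const d of data) {
    const v = value(d)
    if (v < min || v > max) continue // ignore values outside the domain
    // Counting thresholds <= v makes x0 inclusive and x1 exclusive.
    const index = thresholds.filter(t => t <= v).length
    bins[index].push(d)
  }
  return bins
}

const bins = binByThresholds(dataset, xAccessor, [0.31, 0.45], [0.35, 0.4])
console.log(bins.map(bin => [bin.x0, bin.x1, bin.length]))
```

In this example the day with a humidity of 0.31 lands in the first bin because the lower bound is inclusive, while the day at exactly 0.35 falls into the second bin because the first bin's upper bound is exclusive.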
[07:38 - 07:50] So now that we have this bin data structure, we can create our y scale. But first, let's go up to the top and create a y-accessor.
[07:51 - 08:12] So with this new bins dataset, we can take one of these bins and return the length. So if we look at this first bin, the length of the array is 1 because there's 1 data point in here.
[08:13 - 08:25] And for the second one, there's 3 items in this array so the length is going to be 3. So this is really looking at the count and the number of days within each of these buckets.
[08:26 - 08:37] So having that y-accessor up at the top, we can create our y-scale. And it's going to be another linear scale with a domain and a range.
[08:38 - 08:54] We'll just get ourselves set up. For the domain, we could use the extent, which goes from the lowest count to the highest count within the bins.
[08:55 - 09:12] But one thing that's really important here is that our y-scale goes all the way down to 0 because 0 is a meaningful point on this chart. It's really important to us to preserve the relationship of 14 is 7 times as tall as 2.
[09:13 - 09:22] And in order to do that, we need our y-axis to go all the way down to 0. So the first number in our domain is going to be 0 for a histogram.
[09:23 - 09:43] And then the top number, we can just use d3.max and pass in our bins, not our data, and use that y-accessor we just created. And for the range, we're going to do our usual flipped axis.
[09:44 - 10:00] So it's not 0 to the higher number, it's the higher number down to 0, because y-axes go from the bottom to the top while pixel coordinates count down from the top. So it's going to be dimensions.boundedHeight and the second number will be 0.
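Here is a small plain-JavaScript sketch of the y-scale idea: the accessor counts the days in a bin, the domain starts at 0, and the range is flipped. The bins, the 300-pixel height, and the `linearScale` and `barHeight` helpers are assumptions for illustration, not the lesson's actual code.

```javascript
// Hypothetical bins: arrays of days, shaped like a bin generator's output.
const day = { humidity: 0.5 }
const bins = [[day], [day, day, day], Array(14).fill(day), [day, day]]
const dimensions = { boundedHeight: 300 } // assumed chart height in pixels

// The y value of a bin is simply how many days it contains.
const yAccessor = bin => bin.length

// Stand-in for a linear scale: maps the domain linearly onto the range.
const linearScale = ([d0, d1], [r0, r1]) => value =>
  r0 + ((value - d0) / (d1 - d0)) * (r1 - r0)

// The domain starts at 0 (not at the smallest count) so bar heights stay
// proportional to the counts. The range is flipped because pixel y
// coordinates grow downward.
const maxCount = Math.max(...bins.map(yAccessor))
const yScale = linearScale([0, maxCount], [dimensions.boundedHeight, 0])

// A bar stretches from yScale(count) down to the bottom of the chart.
const barHeight = count => dimensions.boundedHeight - yScale(count)

console.log(yScale(0), yScale(maxCount)) // bottom of the chart, top of the chart
console.log(barHeight(14) / barHeight(2)) // close to 7
```

Because the domain begins at 0, a bin with 14 days draws a bar about 7 times as tall as a bin with 2 days, which is the relationship the lesson wants to preserve.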
[10:01 - 10:19] And the last thing we want to do here is use that .nice() method that we used in our last lesson just to keep these numbers nice and even. So maybe instead of 0.31, the domain goes all the way down to 0.3.
[10:20 - 10:23] So let's just throw that into both our x and our y-scale.
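To show the kind of rounding that .nice() performs, the sketch below widens a domain outward to the nearest multiple of a step. `niceDomain` is a hypothetical helper, and the 0.05 step and the 0.31 to 0.95 domain are assumptions; D3 chooses its own step internally, so treat this as an illustration only.

```javascript
// Widen a domain outward so both ends land on multiples of `step`.
const niceDomain = ([min, max], step) => {
  const round = n => Number(n.toFixed(10)) // trim floating-point noise
  return [
    round(Math.floor(round(min / step)) * step), // round the start down
    round(Math.ceil(round(max / step)) * step),  // round the end up
  ]
}

// Assumed humidity extent and step for illustration.
const niced = niceDomain([0.31, 0.95], 0.05)
console.log(niced)
```

The lower end moves from 0.31 down to 0.3, while the upper end stays at 0.95 because it already sits on a multiple of the step.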