How I implemented conditional group by using Lodash on NodeJS and MongoDB
Because MongoDB’s aggregation pipeline is limited
Situation
I have a series of vehicles that need to be grouped by their type (e.g. car, 4x4) and aggregated by their age in years. The age columns come in steps of 5 years, that is, 5, 10, 15. If a car is 5 years old or less, it adds 1 to column 5. So the end product is a table with the vehicle type on the Y axis and the summarised count per age column on the X axis.
Task 1 — Look for an existing solution
My first approach is to look for existing solutions. As we are using NoSQL with MongoDB, I begin by checking the aggregation API. The group stage ($group) looks promising. I can group vehicles by their type and count each of them. Applying this stage, I get an array of objects consisting of type: vehicle (type) and count: number (summarise). I’ve completed the total count of each vehicle type with minimal code.
Things turn more complicated, however, when I want to group the vehicles by number of years. Basically, I write a pipeline that groups a vehicle type, for example car, only if the vehicle is 5 years old or less. The new pipeline overrides the previous one. Yet, I see no way to apply a condition to the year property.
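As a rough illustration of the count-per-type step described above, the pipeline could look like the sketch below. The field name typeOfVehicle is borrowed from the Lodash call later in this post; the `db` handle and the 'vehicles' collection name in the comment are assumptions, not details from the original project.

```javascript
// Sketch of a pipeline that counts vehicles per type.
// Assumption: each document stores its type in a field called typeOfVehicle.
const countByTypePipeline = [
  {
    $group: {
      _id: '$typeOfVehicle', // one output document per distinct vehicle type
      count: { $sum: 1 },    // add 1 for every vehicle that falls in the group
    },
  },
  {
    // Rename _id to type so each result has the { type, count } shape
    $project: { _id: 0, type: '$_id', count: 1 },
  },
];

// With the official Node.js driver this would be run roughly as follows
// (db and the collection name are hypothetical):
//   const rows = await db.collection('vehicles').aggregate(countByTypePipeline).toArray();
```

The pipeline is only a plain array of stage objects, so it can be built and inspected without a database connection.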
On looking deeper, I find that conditional expressions can indeed be used, but only in a projection (which I don’t yet know the purpose of) or in redact (even worse). Nevertheless, I try applying the conditional inside the group stage of the pipeline. The query fails completely. I dedicate the next 3 to 4 hours to looking for related solutions on both StackOverflow and GitHub issues, to no avail. I soon conclude that conditional aggregation is limited in MongoDB.
I could try experimenting with different aggregation pipeline variables but, apart from the learning, I have no guarantee of getting a working solution.
Task 2 — Code Something myself
I decide to implement this aggregation feature for the statistics myself to have more control. At this point, I know exactly the type of data I want to display in the UI. I do not, however, know yet that Lodash has fantastic functions such as groupBy and countBy. Therefore, I try experimenting with these functions to become more familiar with them.
The Coding Part
After looking at plenty of solutions, mostly on StackOverflow, I use the countBy and groupBy functions, for example _.groupBy("typeOfVehicle"), to count the number of records by type. With these two functions, I get behaviour similar to the count feature of MongoDB’s aggregation pipeline.
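To make the counting step concrete, here is a minimal sketch of what counting by type produces. To keep it runnable without installing anything, it uses plain JavaScript's reduce rather than Lodash; with Lodash, _.countBy(vehicles, 'typeOfVehicle') returns the same kind of object. The sample data and the countByType helper are made up for illustration.

```javascript
// Hypothetical sample data; only the typeOfVehicle field name comes from the post.
const vehicles = [
  { typeOfVehicle: 'car' },
  { typeOfVehicle: '4x4' },
  { typeOfVehicle: 'car' },
];

// Plain-JavaScript stand-in for _.countBy(list, 'typeOfVehicle'):
// builds an object whose keys are the types and whose values are the counts.
function countByType(list) {
  return list.reduce((counts, vehicle) => {
    const key = vehicle.typeOfVehicle;
    counts[key] = (counts[key] || 0) + 1;
    return counts;
  }, {});
}

const countsByType = countByType(vehicles);
// countsByType is { car: 2, '4x4': 1 }
```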
But the most difficult part of the requirement is yet to come. I want the sum of all vehicles released in the last 5 years. To do that, I use MomentJS, an outstanding JavaScript library for working with dates and timezones.
const releaseDate = moment(vehicle.releaseDate).startOf('day');
I use the startOf('day') function to reset the time to midnight. I do the same for the date I want to compare with. Therefore, only the date (year, month and day) is taken into account, not the time of day. I learned this swift trick from the timepicker component in the AngularJS Material source code.
const comparisonDate = moment().subtract(year, 'years').startOf('day');
I pass year as a parameter so that I can compare with any number of years.
const result = moment(releaseDate).isSameOrBefore(comparisonDate);
When I develop this part of the code using MomentJS, I don’t know whether the code above gives the result I expect. Judging by the examples in the documentation and on StackOverflow, I assume it works. If it does not, I might lose a lot of time before realising that the issue lies in how I am using this function.
Therefore, in hindsight, a better approach is to develop one part at a time and write a unit test to be certain that each part works as expected. I can wrap the isSameOrBefore date comparison in a function and pass it real data for which I know the expected return value. That gives me more control over the process: I can play with the data and check the results against what I expect.
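A minimal sketch of that idea: the comparison lives in one small function that receives "now" as a parameter, so a test can pin the current date. To stay dependency-free it uses the built-in Date object instead of MomentJS, and the function name isReleasedWithinYears is hypothetical. It assumes a vehicle counts as "released in the last N years" when its release date is on or after the cut-off date, which corresponds to moment's isSameOrAfter; swap the comparison if the opposite bucket is wanted.

```javascript
// Returns a copy of the date with the time reset to local midnight,
// similar in spirit to moment(date).startOf('day').
function startOfDay(date) {
  const copy = new Date(date);
  copy.setHours(0, 0, 0, 0);
  return copy;
}

// True when releaseDate falls on or after (now - years), ignoring the time of day.
// `now` is injectable so that tests do not depend on the real clock.
function isReleasedWithinYears(releaseDate, years, now = new Date()) {
  const release = startOfDay(releaseDate);
  const cutOff = startOfDay(now);
  cutOff.setFullYear(cutOff.getFullYear() - years);
  return release.getTime() >= cutOff.getTime();
}

// Example checks against a fixed "now" of 10 January 2024, 15:30 local time
const fixedNow = new Date(2024, 0, 10, 15, 30);
const onCutOffDay = isReleasedWithinYears(new Date(2019, 0, 10, 9, 0), 5, fixedNow); // true
const dayTooOld = isReleasedWithinYears(new Date(2019, 0, 9), 5, fixedNow);          // false
```

Because the time of day is stripped on both sides, a vehicle released at 09:00 on the cut-off day still counts, while one released the day before does not.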
I use the reduce function from Lodash to count the vehicles whose release date matches the above condition. This correctly gives me the aggregation. Instead of hard-coding the other year ranges, I use Lodash’s each loop and pass in each verification period, that is, 5 years, 10 years and so on.
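Put together, the reduce-plus-loop idea can be sketched as below. This is not the original project code: it uses plain JavaScript (Array.prototype.reduce and forEach) in place of Lodash's reduce and each, the built-in Date in place of MomentJS, and made-up sample data. It also assumes the columns are cumulative, so a 2-year-old car counts in the 5, 10 and 15 columns.

```javascript
// Hypothetical sample data; field names follow the snippets in this post.
const vehicles = [
  { typeOfVehicle: 'car', releaseDate: new Date(2022, 2, 1) },
  { typeOfVehicle: 'car', releaseDate: new Date(2016, 6, 15) },
  { typeOfVehicle: '4x4', releaseDate: new Date(2011, 0, 20) },
];

// True when releaseDate is on or after (now - years), ignoring the time of day.
function isReleasedWithinYears(releaseDate, years, now) {
  const release = new Date(releaseDate);
  release.setHours(0, 0, 0, 0);
  const cutOff = new Date(now);
  cutOff.setHours(0, 0, 0, 0);
  cutOff.setFullYear(cutOff.getFullYear() - years);
  return release.getTime() >= cutOff.getTime();
}

// Builds a table shaped like { type: { 5: n, 10: n, 15: n } }.
function buildAgeTable(list, yearBuckets, now) {
  return list.reduce((table, vehicle) => {
    const type = vehicle.typeOfVehicle;
    const row = table[type] || (table[type] = {});
    // One pass per year bucket instead of hard-coding 5, 10, 15
    yearBuckets.forEach((years) => {
      const hit = isReleasedWithinYears(vehicle.releaseDate, years, now) ? 1 : 0;
      row[years] = (row[years] || 0) + hit;
    });
    return table;
  }, {});
}

const ageTable = buildAgeTable(vehicles, [5, 10, 15], new Date(2024, 0, 10));
// ageTable is { car: { 5: 1, 10: 2, 15: 2 }, '4x4': { 5: 0, 10: 0, 15: 1 } }
```

Each row of the result maps directly onto one row of the table in the UI, with the year buckets as columns.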
Result
I finally get my set of data aggregated by the type of vehicle and by the number of vehicles that fall within each year range.
Overall, working on this task has been an overwhelmingly positive experience and a temporary break from routine front-end web development.