9.2. Predicting Bike Rentals
The data in this chapter is used with the permission of Capital Bikeshare. You can download the data from their website. We are using a prepared version of this data that has already been augmented with weather data from the UCI Machine Learning Repository. To download the data sets, click here.
The basic data for the SQL lessons is in bikeshare.db. The additional weather data is not needed until the last section of this chapter, in which we try to predict bike rentals. Later sections of this chapter use bikeshare_11_12.db, which has the same schema as bikeshare.db but contains data for two years instead of just one. Both files are SQLite database files; feel free to download them and use them with SQLite directly.
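Because these are ordinary SQLite database files, one way to look inside them is Python's built-in sqlite3 module. The sketch below lists the tables in a database without assuming anything about its schema. The helper name list_tables is hypothetical, and the path bikeshare.db assumes you saved the file in your working directory.

```python
import os
import sqlite3


def list_tables(conn):
    """Return the names of all tables recorded in sqlite_master, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db_path = "bikeshare.db"  # assumed location of the downloaded file
    if os.path.exists(db_path):
        conn = sqlite3.connect(db_path)
        try:
            print(list_tables(conn))
        finally:
            conn.close()
    else:
        print(f"{db_path} not found; download it first")
```

The existence check is there because sqlite3.connect would otherwise silently create a new, empty database at that path.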
Predicting bike rental trends matters for both day-to-day operations and long-term planning. Bikeshare companies need to stay up to date on rental trends to know where they should add new facilities and how to reposition bikes to get them to the locations with the highest demand. They do not want to wait until all of the bikes at a particular location are rented before moving additional bikes into position, because every rental they miss in the meantime is lost revenue.
In the zip file you downloaded from the UCI Machine Learning Repository there are two data sets: hour.csv and day.csv.
Both have the following fields (with the exception of hr, which is not available in day.csv):
instant: record index
season: season (1:spring, 2:summer, 3:fall, 4:winter)
yr: year (0:2011, 1:2012)
mnth: month (1 to 12)
hr: hour (0 to 23)
holiday: whether day is holiday or not
weekday: day of the week
workingday: 1 if the day is neither a weekend nor a holiday, otherwise 0
weathersit: weather situation, coded as follows
1: Clear, Few clouds, Partly cloudy
2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
temp: Normalized temperature in Celsius
atemp: Normalized feeling temperature in Celsius
hum: Normalized humidity
windspeed: Normalized wind speed
casual: count of casual users
registered: count of registered users
cnt: count of total rental bikes including both casual and registered
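Several of the fields above are stored as numeric codes, so it can help to decode them before exploring the data. The following is a minimal sketch, assuming each record has been read into a dictionary of strings (as Python's csv.DictReader produces). The helper name decode_row is hypothetical, and the sample values are made up for illustration rather than taken from the real data.

```python
# Code-to-label mappings taken from the field descriptions above.
SEASONS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
YEARS = {0: 2011, 1: 2012}


def decode_row(row):
    """Decode the coded fields of one record and check the rental totals.

    `row` maps field names to strings, as csv.DictReader would produce.
    """
    casual = int(row["casual"])
    registered = int(row["registered"])
    cnt = int(row["cnt"])
    # cnt is described as the total of casual and registered rentals.
    if cnt != casual + registered:
        raise ValueError("cnt does not equal casual + registered")
    return {
        "season": SEASONS[int(row["season"])],
        "year": YEARS[int(row["yr"])],
        "month": int(row["mnth"]),
        "casual": casual,
        "registered": registered,
        "cnt": cnt,
    }


# Made-up values for illustration only, not taken from the real data.
sample = {"season": "1", "yr": "0", "mnth": "1",
          "casual": "3", "registered": "13", "cnt": "16"}
print(decode_row(sample))
```

Checking that cnt equals casual plus registered is a cheap sanity test to run on every row before using the data for prediction.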