Unlike other social networks such as Facebook, Twitter, and Foursquare, Instagram consists largely of images, and it would be very interesting to explore the hidden value behind them. I therefore used the Instagram API to crawl a dataset. Plotting the weekly counts from this dataset reveals a clear pattern across the weekdays.
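As a rough illustration of how such a weekday pattern could be tabulated, here is a minimal sketch. It assumes each crawled record carries a Unix timestamp in seconds; the function name `posts_per_weekday` and the sample values are hypothetical and are not taken from the original crawl.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def posts_per_weekday(timestamps):
    """Count records per weekday from Unix timestamps (seconds, read as UTC)."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).weekday() for ts in timestamps
    )
    # Always return all seven days so that empty days show up as zero.
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}

if __name__ == "__main__":
    # Hypothetical timestamps standing in for crawled records.
    sample = [1420070400, 1420156800, 1420156900, 1420588800]
    for day, n in posts_per_weekday(sample).items():
        print(f"{day}: {n}")
```

Reading the timestamps as UTC is a simplification; for a city-level analysis the local time zone would probably be the better choice.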
This research proposes a pipeline for forecasting MRT demand (the number of passengers), covering the feature selection methodology, the implementation of the machine learning models, and the error indicators. The pipeline has two main parts: feature engineering and machine learning model training. Feature engineering spans data collection, feature extraction, and reorganizing the features into input vectors. Model training starts by splitting the data into training and test sets in different ways, after which three models are trained: Random Forest, Support Vector Machine, and Stochastic Gradient Boosting. Finally, the predictions are evaluated by RMSE (root mean squared error) and MAPE (mean absolute percentage error).
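The two error indicators can be sketched in a few lines. This is a minimal illustration using the standard definitions of RMSE and MAPE, not the project's actual evaluation code; the passenger counts in the example are made up.

```python
import math

def _check(actual, predicted):
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the data (passengers)."""
    _check(actual, predicted)
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    _check(actual, predicted)
    if any(a == 0 for a in actual):
        # MAPE divides by the actual value, so zero demand is undefined.
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

if __name__ == "__main__":
    # Made-up hourly passenger counts and forecasts for one station.
    actual = [100, 200, 400]
    predicted = [110, 180, 400]
    print(f"RMSE: {rmse(actual, predicted):.2f}")
    print(f"MAPE: {mape(actual, predicted):.2f}%")
```

RMSE penalizes large misses on busy stations, while MAPE is scale-free and makes stations with very different passenger volumes comparable, which is one reason to report both.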
The details of the pipeline are shown in the slideshow, and the forecasting results for the different stations are visualized with CartoDB below.