18-788-K4   Big Data Science

Location: Africa

Units: 6

Semester Offered: Spring

Course description

The proliferation of mobile technology, wireless sensors, and social media provides a means of monitoring socio-economic activity, consumption of resources, and human mobility. Recent advances in data science make it possible to cope with the technical challenges of collecting and managing big data and of deriving actionable insights from it. Partnerships between academia, government, and the private sector are at the heart of this revolution, which is demonstrating that data is both a valuable commodity and a source of intellectual property. This course takes a practical approach to solving challenges in the public and private sectors using the collection of techniques that constitute the new multidisciplinary field of data science. A number of themes will be explored as case studies to demonstrate how big data collected from a wide range of disparate sources can be combined to provide insights, drive decisions, and influence policy. The course content is structured to provide a roadmap for deploying data science techniques through case studies, reading material, and previously published models. Participants will gain hands-on experience by working on real-world datasets in assignments.

Learning objectives

The objective of this course is to provide students with practical experience in the different techniques and skills that constitute the field of data science. In particular, the case studies are selected to demonstrate the technical challenges of dealing with the three V’s that define big data (volume, velocity, and variety). The steps required include (1) exploration of data using visualization techniques; (2) construction of features; (3) evaluation of a collection of models; (4) consideration of how a decision-maker can utilize the analysis; and (5) development of a dashboard for displaying the results of the analysis. The sources of big data range from surveys to mobile data to satellite imagery and therefore involve both structured and unstructured data.
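
As a rough illustration of steps (1) to (4), the following minimal sketch runs the workflow on synthetic data. It is not course material: the rainfall and yield figures, the variable names, and the two candidate models are all hypothetical, and printed summaries stand in for the visualization and dashboard steps.

```python
import random
from statistics import mean

random.seed(0)  # reproducible synthetic data; no real dataset is used

# Hypothetical records: rainfall (mm) and a noisy crop-yield index.
rainfall_mm = [random.uniform(0, 100) for _ in range(200)]
yield_index = [2.0 + 0.05 * r + random.gauss(0, 0.5) for r in rainfall_mm]

# (1) Explore: summary statistics stand in for a plot.
print(f"rainfall mm: min={min(rainfall_mm):.1f} "
      f"mean={mean(rainfall_mm):.1f} max={max(rainfall_mm):.1f}")

# (2) Construct a feature: rainfall rescaled to the range [0, 1].
x = [r / 100 for r in rainfall_mm]
y = yield_index

# Hold out the last 50 records for evaluation.
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = mean(ys)
    return lambda v: m


def fit_line(xs, ys):
    """Ordinary least-squares line with a single feature."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    intercept = my - slope * mx
    return lambda v: intercept + slope * v


def mae(model, xs, ys):
    """Mean absolute error of a fitted model on the given data."""
    return mean([abs(model(a) - b) for a, b in zip(xs, ys)])


# (3) Evaluate a collection of models on the held-out records.
models = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = {name: mae(fit(x_train, y_train), x_test, y_test)
          for name, fit in models.items()}

# (4) Summarize for a decision-maker: rank models by held-out error.
best = min(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:20s} held-out MAE = {score:.3f}")
print(f"Recommended model: {best}")
```

Ranking models by error on held-out data, rather than on the data used for fitting, is one common way to compare the advantages and disadvantages of candidate models before making a recommendation.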

Outcomes

After completing this course, students should be able to: 

  • Identify sources of big data in response to a specific challenge
  • Download and organize data for addressing the challenge
  • Explore the dataset using visualization techniques
  • Develop a number of features to extract information
  • Construct a range of quantitative models
  • Discuss the advantages and disadvantages of different models
  • Select an approach that is optimal for meeting the objective
  • Present conclusions and recommendations
  • Communicate model output to decision-makers

Content details

  1. Weather and climate impacts
  2. Survey data
  3. Google Trends
  4. Sentiment analysis
  5. Mobile data
  6. Big data for development

Prerequisites

  • Data and Inference and Applied Machine Learning Mini-Courses
  • Background in a quantitative discipline (Engineering, Computer Science, Physics, Mathematics, Statistics)
  • Programming

Faculty

Patrick McSharry