Big Data Infrastructure: Spring 2015

Instructor:

Jimmy Lin

School:

iSchool at Maryland

Semester:

Spring 2015

Description:

Over the past few years, we have seen the emergence of “big data”: disruptive technologies that have transformed commerce, science, and many aspects of society. These developments are enabled by infrastructure that allows us to distribute computations across hundreds or even thousands of commodity servers. One key breakthrough that makes this all possible is the development of abstractions for data-intensive computing that allow programmers to reason about computations at a massive scale, hiding low-level details such as synchronization, data movement, and fault tolerance.

This course provides an introduction to big data infrastructure, starting with MapReduce, the first of these datacenter-scale programming abstractions. The Hadoop implementation of MapReduce lies at the core of an application stack that is gaining widespread adoption in both industry and academia. A major focus of this course is algorithm design and “thinking at scale”, applied to a variety of domains: text, graphs, relational data, etc. We will also cover a number of next-generation systems that are vying to replace MapReduce as the de facto big data processing platform of tomorrow.

Required Textbook:

Lin, J., Dyer, C. 2010. Data-Intensive Text Processing with MapReduce.
White, T. 2015. Hadoop: The Definitive Guide.

Link to Syllabus:

https://github.com/lintool/UMD-courses/tree/master/bigdata-2015-Spring

Digging Into Data: Spring 2014

Instructor:

Jordan Boyd-Graber

School:

iSchool at Maryland

Semester:

Spring 2014

Description:

Computers have made it possible, even easy, to collect vast amounts of data from a wide variety of sources. It is not always clear, however, how to use those data or how to extract useful information from them. This problem is faced in a tremendous range of scholarly, government, business, medical, and scientific applications. The purpose of this course is to teach some of the best and most general approaches to get the most out of data through clustering, classification, and regression techniques. Students will gain experience analyzing several kinds of data, including document collections, financial data, scientific data, and natural images.

Required Textbook:

Williams, G. 2011. Data Mining with Rattle and R.

Link to Syllabus:

http://www.umiacs.umd.edu/~jbg/teaching/DATA_DIGGING/

Data Analytics for Information Professionals: Fall 2015

Instructor:

Yla Tausczik

School:

iSchool at Maryland

Semester:

Fall 2015

Description:

Advances in hardware and software technologies have led to a rapid increase in the amount of data collected, with no end in sight. Decision making in the coming decades will depend, to an ever greater extent, on extracting meaning and knowledge from all that data. In this class we focus on one branch of statistics, inferential statistics, to help us reason about data. By gathering datasets, formulating proper statistical analyses and executing these analyses, information professionals play a significant role in bridging the gap between raw data and decision making.

This course will introduce basic concepts in data analytics including study design, measure construction, data exploration, hypothesis testing, and statistical analysis. The course also provides an overview of commonly used data manipulation and analytic tools. Through homework assignments, projects, and in-class activities, you will practice working with these techniques and develop statistical reasoning skills.

Required Textbook:

Rice University. The Online Stats Book. Available online.

Link to Syllabus:

http://ischool.umd.edu/sites/default/files/syllabi/inst627fall15tausczik.pdf

Introduction to Data Mining: Winter 2014

Instructor:

Benjamin Fung

School:

McGill

Semester:

Winter 2014

Description:

Introduction to data mining. Includes data preprocessing, data warehouse architecture, online analytical processing (OLAP), online analytical mining (OLAM), basic concepts and methods of frequent pattern mining, association rule mining, classification analysis, cluster analysis, and text mining.

Required Textbook:

Han, J., Kamber, M., Pei, J. 2012. Data Mining: Concepts and Techniques, 3rd ed. Morgan Kaufmann.

Link to Syllabus:

http://www.mcgill.ca/sis/files/sis/glis693_2014winter_fung_20130923.pdf

Data Mining: Fall 2014

Instructor:

Bei Yu

School:

Syracuse

Semester:

Fall 2014

Description:

This course will introduce popular data mining methods for extracting knowledge from data. The principles and theories of data mining methods will be discussed and related to the issues that arise in applying data mining to problems. Students will also acquire hands-on experience using state-of-the-art software to develop data mining solutions to scientific and business problems. The focus of this course is on understanding data and how to formulate data mining tasks in order to solve problems using the data.

The course will cover the key tasks of data mining: data preparation, concept description, association rule mining, classification, clustering, evaluation, and analysis.

Required Textbook:

Tan, P., Steinbach, M., Kumar, V. 2005. Introduction to Data Mining.

Link to Syllabus:

http://my.ischool.syr.edu/Uploads/CourseSyllabus/IST565_Fall2014-Yu-syllabus-schedule-1151.24293-625c3297-a249-4509-bca4-a98ca437d734.pdf