The big data paradigm describes a world in which nearly every facet of our lives (commerce, entertainment, education, transportation, social interaction, health care and primary research) generates large datasets that are fruitful but challenging to mine for insight. Challenges include the volume of data, both historically produced and generated daily; the speed at which new data is created; and the intrinsic complexity, inconsistency and veracity of the data captured. Powerful insights are possible, however, if individuals have the skills and training to work with large and complex datasets.

Leveraging big data requires individuals in every sector of tomorrow’s professional organizations, including information technology, engineering, finance, marketing, procurement and operations, to have both an understanding of and technical capacity in the manipulation, analysis and visualization of increasingly large datasets. Businesses in all industries are challenged to recruit professionals capable of both working with emerging technologies and interpreting data to draw meaningful insights.

In this Scholarship in Practice seminar, students will investigate a research, business or policy interest of their choosing. The semester-long investigation will include locating, acquiring, analyzing and visualizing both primary literature and large datasets. The course assumes that students have no prior experience working with large-scale data, programming or producing advanced visualizations of data. Students of all backgrounds and majors should therefore consider this course an opportunity to become future professionals who are ready, capable and hirable for big-data challenges. By the end of the semester, all students will have an appreciation for the cultural pervasiveness of big-data challenges and will have developed extensive capacities with primary literature, large-scale datasets and fourth-generation computational toolsets.

Readings include:

The course will also use The Signal and the Noise by Nate Silver (a previous UMD first-year book). This book provides significant perspectives on the challenges and pitfalls of human-mediated data inference. The professor has 40+ copies of this book that can be distributed to students.

Readings also include both primary and journalistic literature meant to provide perspective and insight on the breadth and depth of big data challenges.