Data Sources

by Jo Hardin

Today’s blog entry compiles datasets and data sources for data science classrooms that aim to bring relevant, timely information to bear on the issues of the day. We hope that the datasets below can be used in conjunction with some of this summer’s previous blog entries.

Collecting Data

Before linking to the data, we encourage you to reflect on how data are collected and what impact poor data collection can have on any ensuing conclusions.

  • In their Data Equity Framework, We All Count details seven stages of looking at data projects, including data collection & sourcing.


They say:

The requirements for equitable data collection are complex. It’s not as simple as trying to ask everyone and not leave people out. Sample selection is important of course, but so is survey design, collector behaviour, scope and scale, cultural translation, collection mediums, data corruption, compatibility and fidelity and much more. It’s super worth doing, if for no other reason than your data will be more useful.


Criminal Justice

Race & gender


Large Data Archives

About this blog

Last summer we wrote a series of blog entries, Teach Data Science, designed to start conversations around teaching data science. We covered topics such as data science software, data ingestion, data technologies, data wrangling, visualization & exploration, communication, and key reports and findings on data science.

One key element lacking from our 2019 blog was a discussion of, and a commitment to, teaching the ethical aspects of data science. We now find ourselves in the summer of 2020, overwhelmed by the state of the world and re-committed to engaging with the ethical challenges that, when addressed, can help data science be a positive force for change.

Although none of us are experts in ethics, we have all included ethics discussions in our classrooms for many years. In the weeks to come, we will share some of the ways we engage our students in these important topics. We will provide resources for readings, examples, datasets, and exercises. We believe that data ethics are part of every data science analysis and classroom experience, and we hope that this summer’s blog will entice you into presenting ethical dilemmas and related conversations to your students early and often.

During the summer of 2020, we wrote a dozen or so blog entries. We hope that you bookmark the site and check in regularly. Want a reminder? Sign up for emails at!forum/teach-data-science (you must be logged into Google to sign up).