Statistics & Data Sources

Data and statistical resources

Data & Statistics Research 101

Data vs. Statistics

Data are the raw ingredients from which statistics are created. Statistics are useful when you just need a few numbers to support an argument; for example, 13% of foreign-born veterans in the United States are from Mexico (ACS). Statistics are usually presented in tables. Statistical analysis can be performed on data to show relationships among the variables collected. Through secondary data analysis, many different researchers can reuse the same data set for different purposes.

At the most basic level, it helps to think of data as the entire collection of information gathered during a survey. A statistic takes those data and refines them down to a single number or percentage in order to answer a specific question, such as "What is the median age of first-year students at the University of Puget Sound?"

As a general rule of thumb:

  • If what you need is a number to back up your argument, then a statistic will probably do.
  • If, on the other hand, you need to manipulate the information to answer a new or different question, you will likely need to get your hands on some data.
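The rule of thumb above can be illustrated with a minimal Python sketch. The records, field names, and both helper functions are hypothetical and invented for illustration; they are not real University of Puget Sound figures. The list of records plays the role of the data, and each function reduces it to a single statistic.

```python
from statistics import median

# Hypothetical survey responses (the "data"): one record per respondent.
# All values are invented for illustration only.
responses = [
    {"id": 1, "age": 18, "state": "WA"},
    {"id": 2, "age": 19, "state": "OR"},
    {"id": 3, "age": 18, "state": "WA"},
    {"id": 4, "age": 22, "state": "CA"},
    {"id": 5, "age": 18, "state": "WA"},
]


def median_age(records):
    """Reduce the data to one statistic: the median age."""
    return median(r["age"] for r in records)


def share_from_state(records, state):
    """Answer a different question with the same data: the share from one state."""
    matching = sum(1 for r in records if r["state"] == state)
    return matching / len(records)


if __name__ == "__main__":
    print("Median age:", median_age(responses))
    print("Share from WA:", share_from_state(responses, "WA"))
```

If you only had the published median, you could not compute the second figure; with the underlying records, you can ask new questions as they arise.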

Aggregate/Macro Data vs. Microdata

Aggregate or Macro Data are higher-level data that have been compiled from smaller units of data. For example, the Census data that you find on American FactFinder have been aggregated to preserve the confidentiality of individual respondents. Microdata contain individual cases, usually individual people or, in the case of Census data, individual households. The Integrated Public Use Microdata Series (IPUMS) for the Census provides access to the actual survey data from the Census, but eliminates information that would identify individuals.
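As a rough sketch of how microdata become aggregate data, the Python below rolls hypothetical household records up to one summary row per area. The field names, tract labels, and values are assumptions made up for this example, and real Census confidentiality protection involves more than simple aggregation.

```python
from collections import defaultdict

# Hypothetical household-level microdata: one record per household.
# Field names and values are invented for illustration only.
households = [
    {"household_id": "H1", "tract": "A", "size": 3, "income": 52000},
    {"household_id": "H2", "tract": "A", "size": 1, "income": 31000},
    {"household_id": "H3", "tract": "B", "size": 4, "income": 78000},
    {"household_id": "H4", "tract": "B", "size": 2, "income": 64000},
    {"household_id": "H5", "tract": "B", "size": 5, "income": 47000},
]


def aggregate_by_tract(records):
    """Compile household records into one aggregate row per tract."""
    totals = defaultdict(lambda: {"households": 0, "people": 0, "income_sum": 0})
    for r in records:
        t = totals[r["tract"]]
        t["households"] += 1
        t["people"] += r["size"]
        t["income_sum"] += r["income"]
    # The output has no household_id: individual cases are no longer visible.
    return {
        tract: {
            "households": t["households"],
            "people": t["people"],
            "mean_income": t["income_sum"] / t["households"],
        }
        for tract, t in totals.items()
    }


if __name__ == "__main__":
    for tract, row in sorted(aggregate_by_tract(households).items()):
        print(tract, row)
```

The aggregate table answers questions about tracts, but only the microdata let you ask, for example, how income varies with household size.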

Types of Data

Cross-Sectional describes data that are collected only once, at a single point in time.

Time Series data track the same variable over time. The National Health Interview Survey is an example of time series data because the questions generally remain the same over time, but the individual respondents vary.

Longitudinal Studies describe surveys that are conducted repeatedly, in which the same group of respondents is surveyed each time. This allows for examining changes over the life course. The Project on Human Development in Chicago Neighborhoods (PHDCN) Series contains a longitudinal component that tracks changes in the lives of individuals over time through interviews.
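The three definitions above differ mainly in how often data are collected and whether the same respondents return. The Python sketch below encodes that distinction in a deliberately simplified way: the function name, the wave labels, and the respondent IDs are hypothetical, and it assumes a longitudinal study keeps exactly the same respondents in every wave, which real panels rarely manage because some respondents drop out.

```python
def classify_design(waves):
    """Classify a survey design from respondent IDs per collection wave.

    `waves` maps a wave label (for example, a year) to the IDs surveyed then.
    Simplifying assumption: a longitudinal study re-surveys exactly the same
    respondents in every wave.
    """
    id_sets = [set(ids) for ids in waves.values()]
    if not id_sets:
        raise ValueError("at least one wave is required")
    if len(id_sets) == 1:
        return "cross-sectional"  # collected only once
    if all(s == id_sets[0] for s in id_sets[1:]):
        return "longitudinal"  # same respondents each time
    return "time series"  # same questions, different respondents


if __name__ == "__main__":
    # Hypothetical respondent IDs, invented for illustration only.
    print(classify_design({2020: ["a", "b", "c"]}))
    print(classify_design({2020: ["a", "b"], 2021: ["c", "d"]}))
    print(classify_design({2020: ["a", "b"], 2021: ["b", "a"]}))
```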

For more definitions, I highly recommend the Glossary of Selected Social Science Computing Terms and Social Science Data Terms compiled by Jim Jacobs, Data Services Librarian, UCSD.

(Republished with permission from Pamela Morgan at Vanderbilt University)