NLM is no longer updating the HSRIC webpages, and this product will be retired on September 14, 2021.
Detailed information about this transition can be found on the June 3, 2021 Technical Bulletin post.
If you have questions or suggestions, please contact NLM Customer Service.
The information on this page is intended to serve as an introduction to the topics of data science, data literacy, data management, data sharing, and research reproducibility. Though the emphasis is on health data, information from the broad data science community is included.
The Toolkit is a Web-based tool that provides hospitals, health systems, clinics, and health plans information and resources for systematically collecting race, ethnicity, and primary language data from patients.
The CDISC SHARE API is a RESTful web service that allows end users to programmatically retrieve CDISC standards' metadata from CDISC SHARE to support process automation.
An effort to create persona profiles representing roles across the ecosystem of translational research: Basic Research, Pre-Clinical Research, Clinical Research, Clinical Implementation, and Public Health. These profiles are intended for use by the CTSA community and beyond, to assist those developing software projects, educational and communication materials, and more.
Aligning Forces for Quality developed this module, which outlines their experiences in collecting and reporting cost data as a means of reducing health care costs.
Online tool designed to help select good color schemes for maps and other graphics.
Data collection standards for measures of race, ethnicity, sex, primary language, and disability status that are to be used in all national population health surveys.
Information on data quality, access, curation, confidentiality, citation, as well as links to tools and services for data management and curation.
This toolkit is a set of field-tested tools designed to support planning for a public health surveillance program that will rely on data from EHR systems.
This guide encourages community stakeholders to organize events around the 500 Cities data, focusing on engaging audiences, learning about the data, and generating ideas on how to use the data to advance health.
This learning guide explains how to improve electronic health record (EHR) data quality to stimulate practice quality improvement.
This workbook outlines communication concepts, a framework for communicating data, and the application of that framework to actual public health situations.
The Build Healthy Places Network's website contains tools aimed at measuring programs' impact on families and communities and on factors related to health.
Open source software for epidemiologic statistics. It provides a number of epidemiologic and statistical tools for summary data.
A freely available, open-source tool for working with messy data.
ResDAC provides free assistance to academic, government and non-profit researchers interested in using Medicare and/or Medicaid data for their research.
Provides a map-based interface for the display and analysis of infectious disease epidemiological data, including molecular data, utilizing Google Maps and Google Earth.
SHPDR captures cross-sectional and longitudinal variation in states' statutes and laws to enable researchers to perform clinically oriented health economics research, and investigate the diffusion of medical technology and other health services research outcomes of interest.
The TranStat tool enables field personnel and researchers to enter and revise data from local outbreaks and to test for the presence of human-to-human (or animal-to-animal) transmission.
This site contains examples of tested and evaluated graphic displays of health information. The health visualizations include graphs, charts, and images that effectively communicate risk information and help make sense of health data.
The formulation, development, and initial expert review of 3x3 Data Quality Assessment (DQA), a dynamic, evidence-based guideline to enable electronic health record (EHR) data quality assessment and reporting for clinical research.
A decision tree to help guide national government ministries, donors, and implementers in designing interventions to improve data use. This decision tree provides insights for strategies to consider when strengthening data-driven action and decision making.
This manual provides states with detailed guidance on common data standards, collection, aggregation and analysis involved with establishing these databases.
Guidance from the NSF regarding data sharing for projects using data from electronic health records.
This report outlines the emerging discipline of data science at the undergraduate level as a guide for the transformation of the field.
This issue brief summarizes key lessons learned and bright spots in navigating data partnerships with a focus on child-serving sectors. In addition, it includes an Appendix with resources that were compiled based on suggestions from key informant interviews, input from members of the Collaborative on Accountable Communities for Health for Children and Families, and a scan.
This "decision tree" helps users identify tools that will best meet their needs when designing interventions for data demand and use.
Outlines NIH policy on transparency in the research process, including the use of data. Includes an FAQ and guidance on rigor and reproducibility in grant applications.
This study discusses a vision for the emerging discipline of data science at the undergraduate level; after the study concludes in 2018, a final report will be issued.
This document was created to guide researchers in creating data management plans, in accordance with federal funding agencies' policies.
Collection of tools and resources designed to help address legal barriers and facilitate data sharing while ensuring health agencies and organizations operate within the legal requirements of HIPAA and other laws and regulations.
A report which outlines a framework for coordination and integration in current data systems as well as opportunities for future integration and alignment of data from surveys, administrative data systems and EHR systems.
This report forecasts costs for preserving, archiving, and promoting access to biomedical research data.
This document outlines NIH's policy on data sharing for information garnered with funding provided by the Institutes and Centers.
Instructions from the NIH on the requirements for rigor and reproducibility in research funded by their Institutes and Centers.
This guidance is intended to assist sponsors, clinical investigators, contract research organizations, institutional review boards (IRBs), and other interested parties on the use of electronic health record data in FDA-regulated clinical investigations.
An article that outlines methods for promoting big data literacy through education.
This white paper includes a glossary of terms relevant to data literacy, a discussion of the history and growth of the term, and the important aspects of promoting data literacy.
Discusses the need for the healthcare fields to become more proficient with data collection, use and preservation.
The result of a workshop on the nature of data literacy, envisioning what the data-literate person does, and how we might teach this.
An online collection of 100+ papers, toolkits, and other materials focused on privacy, consent, and policy documentation with data sharing.
A 2016 review, published in eGEMs, of data access policies for many publicly funded federal and state datasets.
This Special Publication outlines a number of potentially valuable policy changes and actions that will help drive toward effective, efficient, and ethical data sharing, including more compelling and widespread communication efforts to improve awareness, understanding, and participation in data sharing. Achieving the vision of a learning health system will require eliminating the artificial boundaries that exist today among patient care, health system improvement, and research.
A discussion of the value and methods of health cost data, along with its limitations.
This strategic plan outlines the NIH's challenges, objectives, and focus for data science.
This white paper reports on a project to develop a working definition of data literacy.
Proceedings from a November 2019 workshop that considered ways in which policy, technology, incentives, and governance could be leveraged to overcome remaining barriers and further facilitate data sharing.
This report forecasts the ways in which health data might be used in the private and public sectors, and the importance of meeting the challenges of data collection and use.
A video which discusses the ways in which complex data sets can be more easily portrayed using data visualization techniques.
An essay from Interworks that looks at three topics: data science, big data, and data visualization.
Though intended for use with the CMS data contained in the CCW, these white papers provide a useful overview of how the data is compiled and the specific uses for which it is applicable.
This two-minute video outlines the history of literacy, and the evolving definitions of data literacy.
Freely available data sets; registration for use is required.
An open-source implementation of the FHIR specification in Java from the University Health Network.
Publicly available files designed to allow interested parties to gain familiarity using Medicare claims data while protecting beneficiary privacy.
A 1K sample of simulated CMS SynPUF data in CDMV5 format, available for download.
A listing of servers that provide a testing area. Registration for each may be required.
This is a virtual testing environment that mimics a live EHR production environment, but is populated with sample data that is reset on a nightly basis. Free registration is required.
Describes how to create the dictionary that is required for an RDC proposal.
A discussion of the purpose of data dictionaries and how to create a useful one.
The purpose of a data dictionary is to explain what all the variable names and values in your spreadsheet really mean.
Provides critical social and cultural perspectives on big data initiatives.
A politically independent think tank based in Denmark with a European (and global) outreach.
This website is a resource to educate the public about the main elements of the General Data Protection Regulation (GDPR). It outlines standards from the EU on data privacy regulations.
This one-hour on-demand training session with videos and exercises is intended to introduce you to health data standards and how they are used, including relevant National Library of Medicine (NLM) products and services.
This class is for anyone who wants to learn what all the data science action is about, including those who will eventually need to manage data scientists.
This introductory seminar from September 8, 2016 provides an overview of how a researcher can determine what data is right for their research question, which sources of federal data to access, and the spectrum of data access.
This course is designed to address present and future data management needs.
From the State University of New York, this course presents methods for working with Big Data analytics on real datasets, from preparing data for analysis to completing the analysis, interpreting the results, visualizing them, and sharing the results.
A discussion of the tools required for students to understand big data, its uses, and how to ask correct questions of the data sets.
The DART Fellowship is a 6-week cohort training and development program, providing a guided pathway for information professionals to acquire data literacy skills using common methodologies applied toward public health efforts.
A collection of course materials and syllabi on data literacy.
This module from the Wisconsin Department of Public Instruction discusses why data literacy is essential for interpreting and using local and state assessment data, informing practices, and improving student achievement.
This course provides a framework to analyze these concerns as you examine the ethical and privacy implications of collecting and managing big data.
The Data Incubator is a Cornell-funded data science training organization. It offers a free, advanced 8-week fellowship for PhDs looking to enter industry. Fellows can participate in the program in person in New York City, the San Francisco Bay Area, Seattle, Boston, or Washington DC, or online.
A series of online programs, available for credit or for self-learning.
A listing of online degree programs and courses in data science.
This MOOC examines principles of data mining and pattern discovery.
Classes, webinars, guides, articles, and websites about data visualization.
Workbook from an educational program intended to teach new investigators about responsible data management in scientific research.
An educational program, with two webinars, and accompanying modules. Periodically, moderated sessions are held in conjunction with the recorded webinars.
Course of study for certification in data science. Fees for admission to program are required.
This page serves as a resource for librarians, library students, information professionals, and interested individuals to learn about and discuss: library roles in data science; fundamentals of domain sciences; emerging trends in supporting biomedical research.
From Johns Hopkins University, this course focuses on literate statistical analysis tools which permit the publication of data analyses in a single document, allowing others to easily execute the same analysis to obtain the same results.
Resources created for undergraduate faculty and students to set up data-driven learning experiences.
Course of studies for online certification in data science. Fees for the program are required.
Tool provides a state-by-state listing of federally funded technical assistance to support clinicians in various health information technology transformation activities.
Webinars, prerecorded videos, and online/in-person classroom training to help users learn to effectively use U.S. Census data.
Training opportunities in the area of data science.
All In with the Network for Public Health Law are pleased to present a three-part webinar series on Racial Equity Throughout Data Integration.
The webinar series will provide essential training suitable for individuals at an introductory level.
A meetup in the metro District of Columbia area, centered on healthcare related big data discussion.
Locate in-person discussion groups interested in big data from various fields.
An annual meeting held in May of each year. Agendas and videos from past meetings are available.
One-hour Webinars on data science occur monthly at 9am (PT) on the second Tuesday of the month.
It is intended to serve as a news and discussion vehicle for the overall community and is an opportunity for the NIH Office of the Associate Director for Data Science to present timely information and receive important feedback.
Updates and commentary from the Department of Health and Human Services.
DASH is a national initiative to support multi-sector collaborations sharing data and information to improve the health of their communities.
Includes the Global Health Observatory, Global Health Estimates, and WHO Mortality Database.
An international collaboration on big data, designed to promote people-centered big data.
This initiative seeks to improve the reliability and value of published health research literature by promoting transparent and accurate reporting and wider use of robust reporting guidelines. The website contains a blog, a library for health research reporting, a toolkit to promote accurate publication of health research, and upcoming and past events and classes.
The principal internal advisory body to the Secretary on health and human services data policy.
National, not-for-profit membership organization dedicated to improving health care through the collection, analysis, dissemination, public availability, and use of health data.
Preferred Reporting Items for Systematic Reviews and Meta-Analyses is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses.
RDA provides a neutral space where its members can come together to develop and adopt infrastructure that promotes data-sharing and data-driven research.
Videos and documents help to outline how actuaries utilize big data.
A group project founded to develop the tools needed to shape a successful, data literate society.
A guide to data management.
A collection of links to data sources for health services research.
Data sources from national, state, county, and local levels.