Data Science Ethics

Structure and Framework

Data science teams bring together researchers and application stakeholders across many areas of expertise, each with their own set of research integrity norms and habits. This requires ethics to be woven into every aspect of doing data science. The Data Science Framework reinforces this, as data science ethics touches every component and step in the practice of data science.

Data Science Ethics Tool

Using an ethics tool is the first step for researchers to agree on a set of principles. The data science ethics tool template can be adapted to specific data science projects. It provides a set of guiding principles and questions to address the ethical decisions across the data lifecycle. The tool can also be downloaded for use in your own projects.

The items below cover the different facets of data ethics.

  • Recognize and affirm that all project plans incorporate regular checks, discussion, and documentation to ensure adherence to the ethical principles of research.

  • Establish the ethical basis for undertaking the project as well as the project requirements for both the protection of research participants and the equitable allocation of all potential project benefits and risks.

    • What are the expected benefits of the project to the ‘public good,’ and do they outweigh potential risks to certain populations?
    • Are there implicit assumptions and biases in the framing of the project regarding the studied communities, and how will they be addressed?
    • What type of institutional review board approval process is needed? Has the team reviewed the protocol?
  • Consider potential biases that may be introduced through the choice of datasets and variables.

    • Do the data include disproportionate coverage for different communities under study?
    • Do data have adequate geographic coverage?
    • Have checks and balances been established to identify and address implicit biases in the data?
  • Put in place data platforms and processes to ensure data transfer, storage, and database development adhere to data governance agreements and best practices for data quality assurance.

    • Have team members reviewed standard operating procedures (SOPs) and data management plans?
    • Do additional procedures need to be defined for this project?
  • Cleaning, transforming, linking, and exploratory analysis are critical steps in understanding data quality, how representative the data are, and potential biases in the data.

    • What is the quality of the data?
    • How representative are the data? Which populations are covered, and which are not?
    • Are your assumptions correct?
  • Critically assess the overall utility of the results in achieving the predicted benefits of the study, be transparent about potential limitations of the study, and ensure that unintended biases haven’t been introduced as a result of data choice and model refinement.

    • What are the limitations of the results? Are the results useful given the purpose of the study?
    • Do the statistical results support the potential benefits of the study previously stated? 
    • Do the statistical results support the mitigation of the potential risks of the study previously stated?
    • Do any of the data require revisiting the question of potential biases being introduced through the choice of datasets and variables?
  • Establish transparency in methods, results and limitations.

    • Have project methods and outputs been made as transparent as possible?
    • Are the potential limitations of the research clearly presented?
    • Should the research be used as the basis for policy action? Have the predicted benefits and social costs to all potentially affected communities been considered?
  • Summarize ethics-related questions and actions taken, to reinforce the process of ethical consideration in continuing and future projects. Refine protocols for replication and expansion of the research findings, and information dissemination.

    • Did key ethical questions arise during the research and, if so, how were they addressed? How could they be addressed differently in future projects?
    • Are research protocols, methods and data available to other researchers? If so, in what way, and, if not, what factors are limiting the ability to do so?
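
The coverage questions above (whether the data cover different communities disproportionately, and which populations are not covered) can be partly checked in code. The following is a minimal sketch, not part of the ethics tool itself: the function name `coverage_gaps`, the `county` field, the reference shares, and the 5-percentage-point tolerance are all hypothetical choices made for illustration. It compares each group's share of the records with a reference share (for example, from a census) and reports the groups that differ by more than the tolerance.

```python
from collections import Counter


def coverage_gaps(records, group_key, reference_shares, tolerance=0.05):
    """Flag groups whose share of the records differs from a reference share.

    Returns {group: (observed_share, reference_share)} for every group whose
    observed share differs from the reference by more than `tolerance`.
    All names and the default tolerance are illustrative assumptions.
    """
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, reference in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - reference) > tolerance:
            gaps[group] = (round(observed, 3), reference)
    return gaps


# Hypothetical data: 100 records from three counties.
records = (
    [{"county": "A"}] * 70
    + [{"county": "B"}] * 28
    + [{"county": "C"}] * 2
)
# Hypothetical population shares for the same counties.
reference = {"A": 0.50, "B": 0.30, "C": 0.20}

gaps = coverage_gaps(records, "county", reference)
for group, (observed, expected) in sorted(gaps.items()):
    print(f"{group}: observed {observed:.0%}, reference {expected:.0%}")
```

A check like this only surfaces numeric imbalance. Whether a gap matters, and how to address it, remains a judgment for the project team.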