Data Services Classes and Customized Trainings

Classes Available on Request or Customized Training for your Group

To request a class, please contact HSLS Data Services.

Introduction to Python through Jupyter targets users of any experience level. Whether you have experience with another programming language or have never programmed at all, this workshop will help you hit the ground running. This workshop approaches Python as a tool to complete data science tasks. Attendees will walk through Python at their own pace, covering types, operators, data structures, loops, flow control, comprehensions, and working with files. If you finish the introductory material, you can continue on to learn about Pandas, a Python library which contains useful data structures for completing common data science tasks.
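As a rough taste of the topics listed above, the following minimal sketch touches a data structure, a loop with flow control, a comprehension, and a file. It is not taken from the workshop materials; the sample names, values, threshold, and file name are all invented for illustration.

```python
import os
import tempfile

# Hypothetical readings, invented for illustration only.
readings = {"sample_a": 4.2, "sample_b": 7.9, "sample_c": 5.5}

# Loop with flow control: label each reading as "high" or "low".
labels = {}
for name, value in readings.items():
    if value >= 5.0:
        labels[name] = "high"
    else:
        labels[name] = "low"

# Comprehension: collect the names of the high readings in one expression.
high_names = [name for name, value in readings.items() if value >= 5.0]

# Files: write the names to a temporary text file, then read them back.
path = os.path.join(tempfile.mkdtemp(), "high_samples.txt")
with open(path, "w") as handle:
    handle.write("\n".join(high_names))
with open(path) as handle:
    saved = handle.read().splitlines()

print(labels)
print(saved)
```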

In this workshop, attendees will work at their own pace to learn basic data science tasks in Pandas. Pandas is a fantastic Python package which provides data structures and analysis tools for data science tasks. The workshop will cover data structures, selection, mapping functions, reductions, statistics, input/output, pivot tables, grouping, and time-series data. Basic knowledge of Python is required. Attendees should be familiar with Python syntax and lists, and have a basic knowledge of lambdas.
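For a sense of what some of these topics look like in practice, here is a minimal sketch of selection, mapping with a lambda, grouping with a reduction, and a pivot table. It is not taken from the workshop materials; the table, its column names, and its values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
visits = pd.DataFrame(
    {
        "clinic": ["North", "North", "South", "South"],
        "month": ["Jan", "Feb", "Jan", "Feb"],
        "patients": [12, 15, 9, 11],
    }
)

# Selection: keep only the rows that match a condition.
busy = visits[visits["patients"] > 10]

# Mapping: apply a lambda to every value in a column.
visits["clinic_code"] = visits["clinic"].map(lambda name: name[0])

# Grouping and reduction: total patients per clinic.
totals = visits.groupby("clinic")["patients"].sum()

# Pivot table: clinics as rows, months as columns.
pivot = visits.pivot_table(
    index="clinic", columns="month", values="patients", aggfunc="sum"
)

print(totals)
print(pivot)
```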

Do you have data that require bioinformatics analysis? Are you concerned about scientific rigor and reproducibility? Come learn about the “4 C’s” available to Pitt researchers: Core facilities, Collaboration with bioinformaticians, Coding, and Commercially licensed tools. Make an informed decision on the best option(s) for your data needs.

Your Pitt Box account provides unlimited online/cloud storage and lets you upload individual files of up to 15 GB. In this class you will learn how to access your Box account, upload files, streamline your workflow with Box Drive, share files and folders (with or without password protection), and more.

What is a Data Management Plan (DMP)? This session will answer that question and describe the steps for creating a DMP, tools that can help with DMP development, and post-award management issues. University of Pittsburgh-specific guidelines and support resources will also be shared.

Many funders, publishers, and institutions require researchers to make their research data public, but practical challenges can act as a barrier to sharing data, especially in the health sciences. This hands-on workshop will guide participants through the data sharing process, from initial study design to data deposit. Exercises will prompt participants to think through issues of data documentation, reuse value, and promotion of their own research projects.

You already conduct literature searches with PubMed and read free full-text articles from PubMed Central (PMC), so why try Europe PMC? Quite simply, because your current search strategy might not be finding all of the relevant information. This class will focus on two specific features of Europe PMC: (1) preprint searching and (2) direct linking within articles to public gene, protein, and chemical compound databases.

This workshop will focus on LabArchives, the Electronic Research Notebook selected by the University of Pittsburgh. We will cover how to get started using it, including planning strategies, access, lab notebook creation and organization, adding and editing entries, linking, and sharing data.

Microsoft Excel is a commonly used program to record and store datasets with headings, rows, and columns. In this class, we will look at transforming that data into summary tables and charts. You will work through data examples to create pivot tables and pivot charts, apply conditional formatting to your tables and sheets, and prepare your figures for use in other programs.

OpenRefine (formerly Google Refine) is a powerful, free, open-source tool for working with messy tabular data. It runs offline in a web browser and allows for reproducibility in data cleaning.

“What's in a name?” When you create a new file, do you give much thought to the name you save it as? This class focuses on best practices for naming files so that they are easily found, understood, and sharable in the future.

Learn how to keep your data safe AND preserve it for future use by following a few simple rules. File formats, file-naming conventions, repositories, storage options, and more will be discussed.

In this hands-on workshop, learn how to manage your work with the version control system Git. Git helps keep your files safe from accidental deletion, tracks who made which change and when, and lets multiple people work on the same project without overwriting each other's work. We'll cover using Git from the Unix shell and through GitHub online. No previous experience with the command line is necessary, although some basic knowledge is recommended.

In this class, learn the fundamentals of keeping your data secure and organized through brief introductions to the core areas of data management: file storage and organization, file documentation, data preservation, and data publication and/or data sharing. This class is intended for graduate students and researchers who are working on long-term research projects, or for anyone who wants to make sure their personal files are safe for the long term.

You've collected your data. Now what? In this class we will learn how to use Tableau to demonstrate the significance of your data.

Have you shared your data in an open-access repository or to accompany a publication? Are you interested in sharing data with potential collaborators, but you need to maintain control over who can access it? Come share your experiences in this drop-in session and learn about a new data-discovery platform from the Pitt HSLS team that can help you increase the visibility of your datasets without necessarily making them public.

Need to find a dataset to act as a control for your study? Or do you want to reuse open access data? This class will offer tips for locating and citing data and include hands-on exercises to explore directories of data repositories and data journals.

HSLS Coffee breaks are 30-minute classes on a focused topic. Coffee will be provided.

In this coffee break, learn how to advertise your data in the Pitt Data Catalog to help increase the reproducibility of your research, without having to make it completely public.

The Western Pennsylvania Regional Data Center (WPRDC) maintains the open data portal for Allegheny County and the City of Pittsburgh. To date, it hosts over 300 datasets, including those from public sector agencies, academic institutions, and non-profit organizations such as the Port Authority, Housing Authority, and BikePGH. Come hear how you can use these open datasets and the tools and tutorials created by the WPRDC, and explore opportunities to work with the Data Center.

In this coffee break, learn the basics of version control and how it helps keep your work safe and reliable. We’ll cover how GitHub, Google Drive, and Box track the changes you or your collaborators make to uploaded files, and how that can help make your research more reproducible.

Have you ever seen a README for a piece of software? It's a simple text document that tells you who made a program, what it does, and how to run it. Learn how to write a great README for your code, data, or even file organization system.

Did you know that for each minute of planning at the beginning of a project, you will save yourself roughly 10 minutes of headache later? This session will provide practical tips for organizing, naming, documenting, storing, and preserving your data.