Data management fueled by girl power

BY SYDNEY SAUER

 


One of the most incredible things about working in academic research is the diversity of people who touch each project. Bringing large experiments and mixed-methods analysis to life takes collaboration between dozens of research assistants, graduate students, and faculty from several universities, and these collaborations can be fruitful for years—or even decades—to come. 

When it comes to data management, however, this “revolving door” of collaborators can make it difficult to keep everything organized. Large teams working on collaborative analyses need sophisticated yet flexible data management systems, often on a tight budget.

At the beginning of my predoctoral research fellowship, I was tasked with creating a durable data management system for the Creating Moves to Opportunity (CMTO) project. We needed something that could easily be updated with new waves of the study, merge qualitative and quantitative data from various sources, reproduce results from already published papers, and provide intuitive documentation for future users. 

This was a daunting task, but over the course of three months, I built the Base File: a single source of truth for the CMTO project complete with user-friendly documentation. Armed with this file, future RAs can quickly provide replication code for paper results when needed. Managers can much more easily onboard future cohorts of predoctoral fellows. Updates to the dataset can be made in minutes, rather than hours or days. And, most importantly, everything is in one place. 

Out of everything I did during my predoctoral fellowship, this project was the most impactful. I learned that I love creating organizational structures that streamline lab workflows, and that I’m good at it. I discovered, through painful hours of digging through prior data repositories, that version control, documentation, code comments, and changelogs are essential practices for researchers to master.

The first few rows of a documentation table that I created. This table demonstrates the availability of each variable across different waves of the survey, providing easy lookup for future staff.
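For readers who cannot see the table, here is a minimal sketch of how an availability lookup like this could be generated from column names. It is not the code behind the Base File. It assumes, purely for illustration, that each column is named as a variable followed by an underscore and a time suffix; the column names and the suffixes "bl" (baseline) and "fu" (follow-up) are hypothetical.

```python
from collections import defaultdict

# Hypothetical wide-format column names: "<variable>_<time suffix>".
columns = ["id", "income_bl", "income_fu", "rent_bl", "satisfaction_fu"]
# Hypothetical time suffixes: baseline ("bl") and follow-up ("fu").
waves = ["bl", "fu"]


def availability_table(columns, waves):
    """Map each variable to {wave: True/False} based on its column suffixes."""
    seen = defaultdict(set)
    for col in columns:
        # Split on the last underscore: "income_bl" -> ("income", "_", "bl").
        stub, sep, suffix = col.rpartition("_")
        # Columns without a recognized time suffix (such as "id") are skipped.
        if sep and suffix in waves:
            seen[stub].add(suffix)
    return {
        var: {wave: wave in found for wave in waves}
        for var, found in sorted(seen.items())
    }


table = availability_table(columns, waves)
for var, row in table.items():
    print(var, *("yes" if row[wave] else "no" for wave in waves))
```

Generating the lookup from the column names, rather than typing it by hand, means it can be refreshed whenever a new wave is added.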

Working in a female-led (and, at the time, mostly female-staffed) lab also helped me realize that traits I had devalued as “girly,” such as my penchant for color-coding and graphic design, were actually quite useful organizational strategies. Making things look nice isn’t just about aesthetics. It’s a simple way to streamline communication and help people understand and follow a project’s conventions. It also makes everything you produce look more polished and professional, which is a nice plus, especially when you want your work taken seriously. In data analysis, it’s easy to get caught up in a “bro” culture of churning out analyses without taking the time to make nice documentation and visuals to accompany them. But in academia, there’s too much turnover for this strategy to be efficient; a touch of “girly” goes a long way.

An example of how color-coding can streamline communication. The “source” in pink links directly to the pink table. There is also a corresponding green table for the “time suffixes.”

The challenge with data management systems is that they will never be enough. Project formats and priorities are always shifting, so no solution can be perfect. For example, the Base File is structured in a wide format with one row for each participant in the study. Some analyses, such as time-series analysis, work best with a long format, with multiple rows for each participant at different points in time (e.g., the baseline versus follow-up survey). The great thing about a well-documented data system is that, when new problems or use cases arise, changes like these are easy to make.
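To make the wide-versus-long distinction concrete, here is a minimal sketch of the reshape in plain Python. It is an illustration only, not the Base File's actual code. The participant values, the variable names, and the "_bl" and "_fu" suffixes for baseline and follow-up are all hypothetical.

```python
# Hypothetical wide-format data: one row per participant,
# with one column per variable per survey wave.
wide_rows = [
    {"id": 1, "income_bl": 20000, "income_fu": 24000, "rent_bl": 900, "rent_fu": 950},
    {"id": 2, "income_bl": 31000, "income_fu": 30000, "rent_bl": 1200, "rent_fu": 1150},
]


def wide_to_long(rows, id_col, variables, waves, sep="_"):
    """Return one record per participant per wave."""
    long_rows = []
    for row in rows:
        for wave in waves:
            record = {id_col: row[id_col], "wave": wave}
            for var in variables:
                # A variable not collected in this wave becomes None.
                record[var] = row.get(f"{var}{sep}{wave}")
            long_rows.append(record)
    return long_rows


long_rows = wide_to_long(wide_rows, "id", ["income", "rent"], ["bl", "fu"])
for record in long_rows:
    print(record)
```

Each participant now appears once per wave, with a "wave" column marking the time point. For data held in a pandas DataFrame, the library's `melt` and `wide_to_long` functions perform this kind of reshape.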

Overall, I’m so honored to have been given the opportunity to work at PIRL and develop all sorts of new skills, from data management to fieldwork. I’m excited to see the creative new ways that future PIRL RAs use this data, and I’m certain that they will make the systems I’ve developed even better in the future!