Project Process
What we have done so far
Welcome to the part of the site where we tell you about this project and how it came about.
Motivation and Initial Questions
As Computer Science students, we each noticed in our first CS classes that there were few female students and few students of color, and we all felt that those numbers dwindled as we progressed through the program. Being one of so few people who look like you can affect you. Have you ever taken a class where you were the only person of your race or gender? It can feel isolating. Our perception of this lack of diversity in our own collegiate setting also holds true for other post-secondary academic institutions and for the industry as a whole.
The project's work started in an Explorations in Data Science class, taught by Kristen Tufte over the summer of 2020. The diversity issue and what affects retention came to mind as a topic for our class project. For one, we wanted to look at some numbers to know if we were right and how right we were about the demographics and the retention issue at PSU.
Women and minority groups are underrepresented in STEM and even more underrepresented in CS. We wanted to know why. It is evident when you look around that there is a problem, but we wanted a bigger picture to see the issue's scope. Why are there fewer women in computing fields? Why are there fewer black people in computing fields? Has this always been a problem? Is it trending up or down or stable?
How are the numbers at PSU? Are the trends the same here vs. other universities? What are other universities doing to curb this problem? Portland is pretty white itself; how much of a role does that play? How much of the problem is CS, the PSU CS department, or just Portland? What issues are there within the curriculum or department? How is retention/attrition overall? How does it compare with PSU? With other CS departments? Are women and URM more likely to leave the program? Why? When do students leave the program?
This project started as a class project, but it very quickly expanded beyond that scope. We began by collecting data from the university and then took steps to get realistic feedback from students. We planned how to interview students, and once we completed those interviews, we decided on approaches to analyze that data and present it to leadership and faculty. Ultimately, we want to develop ideas for how the faculty could help students be more successful and increase diversity.
Of course, this is a significant sprawling problem, so we have no delusions that we can "fix it," but a great thing that has come out of this project is feeling less alone. Many of the students we spoke with mentioned that this was the first time anyone ever asked them questions about their experience in the Computer Science program. Having those feelings validated is undoubtedly a great side effect of this project.
There is still ongoing work, but we wanted to share the things we have found so far.
Background Research
One of the first things we did as part of this project was a large amount of research into diversity in computing fields, both in academia and in the workplace. For our initial proposal, we presented a brief state of the industry. Our class was interested, and it raised several questions that inspired us to research specific topics further.
For example, during that summer term, we saw a chart illustrating a spurious correlation between the decline of arcade profits and the rise in CS degrees. That chart led us to a question: how have societal expectations and video game marketing contributed to the decline of women in computing and encouraged young men to study computer science? These questions and the resulting research are what we presented for our final presentation, along with a bit of our planning process for conducting our interviews. We were continually frustrated that there was such an incredible amount of information about why women might not be inspired to choose computing, but not as many articles regarding why white people would be more likely to study CS.
That research is ongoing, but you can see an overview of the history of women in CS in the history tab above.
Data
General Challenges to Accessing Data
Gaining access to data is a challenge because we do not have direct access to the database and must rely on a third party to pull the data and grant us access to it. Beyond the process of requesting that data, we are limited by:
- The staff available to process our request
- The confidentiality of the data
- The challenges of grouping the data in a way that is still meaningful to us but protects the privacy of the individuals
For the third point, some individuals are in demographics so underrepresented that they might be identifiable if groups are not aggregated. For example, in some terms, there is a single-digit number of black women in the program. If we were given a particular detail about a student, it would be pretty obvious who that person is. Aggregation, in turn, makes it hard to track students from year to year. Even if ten students of the same demographic are enrolled from one term to the next, they might not be the same students. Students also join the program at any time, not just in the fall.
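To illustrate why equal headcounts do not imply the same students, here is a minimal Python sketch. The student IDs are made up, and `retention_summary` is a hypothetical helper of our own, not something we were able to run against real data (we only receive aggregates).

```python
def retention_summary(previous_term, current_term):
    """Compare two sets of student IDs from consecutive terms."""
    stayed = previous_term & current_term   # enrolled in both terms
    left = previous_term - current_term     # enrolled before, not now
    joined = current_term - previous_term   # new in the current term
    return {
        "previous_count": len(previous_term),
        "current_count": len(current_term),
        "stayed": len(stayed),
        "left": len(left),
        "joined": len(joined),
    }


# Made-up IDs for illustration only.
fall = {f"s{i:02d}" for i in range(1, 11)}    # s01 through s10
winter = {f"s{i:02d}" for i in range(5, 15)}  # s05 through s14

summary = retention_summary(fall, winter)
print(summary)
```

Both terms show ten students, yet only six of them are the same people. An aggregated count alone cannot reveal that four students left and four joined.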
The problems we have encountered in accessing data are not unique. The 2018 Association for Computing Machinery (ACM) study, Retention in Computer Science Undergraduate Programs in the U.S.: Data Challenges and Promising Interventions, echoes our experience.
The report laments two of our critical data challenges: the inability to track an individual's progress through their course of study and the requirement of looking at aggregate data without identifying patterns for race/ethnicity and gender. The ACM provides two case studies in which data was requested from universities: the University of California, San Diego (UCSD) and the Colorado School of Mines. While each case was unique, the researchers had common challenges, which we have encountered as well.
- Only data analysts can access the database
- The database is complicated, requiring subject matter expertise in data fields and their meaning
- Structuring a data request that was meaningful to the data analyst was challenging
- It took more than a month to receive data (at UCSD, it took multiple months)
- Students have numerous entry points to the university and CS program, and defining how and when to include them into the dataset was not straightforward
Ultimately, for both case studies, ACM acquired a complete data set that included anonymized data for each student and still allowed tracking of each individual student's progress. We have been told that this will not be possible for us and that if we can acquire the data we have requested, it will be aggregated.
Challenges Regarding Department Specific Data
We contacted two main people within the university to get data. Our initial data came from Jim Hook, Associate Dean in the Maseeh College of Engineering and Computer Science. The data we first planned to analyze consisted of already processed data dashboards that we received from him. Hook had previously presented on the retention of women in the department and kindly shared his findings with us.
We received dashboards from Hook that grouped students by academic standing, including students who have not yet been admitted into the CS department, admitted juniors and seniors, pre-admitted and admitted post-baccalaureate students, and graduate students. Two central dashboards tracked student count from 2014 to 2019: one containing all students regardless of gender and another tracking only women. We analyzed the data, found ways to represent it visually, and presented that to our class. You can see some of these charts in the data tab above.
Hook's original data source has limitations, including that the data itself didn't track students who identified outside of the binary gender norm. The dashboards made simplistic assumptions about gender identity, which we hope to address. Though Hook's data gave us a brief look into our department's diversity, it was evident that we needed to probe for more data elsewhere. It was also processed data, and we were itching to get our hands on some raw, queried numbers.
With Hook's help, we connected with Zach in OIRP, who helped us retrieve more detailed data regarding gender, race/ethnicity, and other relevant groups. Three problems presented themselves. First, in response to a departmental loss of income due to COVID-19, research faculty were furloughed on Fridays, which limited the time the team had to work on these queries on our behalf. Second, the data is not de-identified, and any effort to de-identify it so that we could access the database and run our own queries would be too costly. We are limited to the queries this team will run on our behalf, subject to the data set being large enough to protect student privacy. Third, queries relating to underrepresented students, such as black students, especially in our department, return counts so low that privacy concerns remain. This third concern necessitates aggregated data. When data is aggregated over a longer period, it is hard to analyze trends from year to year. At times, we see several cells stating "Less than 5".
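The trade-off between privacy and trend analysis can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the records are made up, `suppressed_counts` is a hypothetical helper, and the cutoff of 5 is inferred from the "Less than 5" cells we received. We do not know how OIRP implements suppression internally.

```python
from collections import Counter

THRESHOLD = 5  # assumed cutoff, inferred from the "Less than 5" cells


def suppressed_counts(records, keys):
    """Count records per group and mask any count below THRESHOLD."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {
        group: (n if n >= THRESHOLD else "Less than 5")
        for group, n in counts.items()
    }


# Made-up enrollment records; "A" and "B" stand in for demographic groups.
records = (
    [{"year": 2018, "group": "A"}] * 40
    + [{"year": 2019, "group": "A"}] * 42
    + [{"year": 2018, "group": "B"}] * 3
    + [{"year": 2019, "group": "B"}] * 4
)

by_year = suppressed_counts(records, ["year", "group"])  # yearly cells
pooled = suppressed_counts(records, ["group"])           # years combined

print(by_year)
print(pooled)
```

In this made-up example, group B is masked in both yearly cells, so no year-over-year trend is visible. Pooling the two years lifts the count to 7, which can be reported, but the result no longer says anything about how enrollment changed between 2018 and 2019.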
We analyzed and created some charts using that data, some of which you can see in the data tab above.
Interviews
IRB Approval and Preparation
One of our goals was to identify issues students face and share them with the department. The data can tell us that the PSU CS department has a high attrition rate, especially for students who are members of groups historically underrepresented in computing. The data can't tell us why, of course. We wanted to know why some students persist and why other students choose to leave the program. To further this investigation, we conducted semi-structured qualitative interviews of current and former students regarding their experiences in the CS program at PSU to understand what contributes to persistence, attrition, retention, and success.
Interviews are categorized as "Human Research", so before conducting them, we had to get approval from the Institutional Review Board (IRB). The IRB submission included a script for recruiting and a set of questions to ask potential participants. The approval process covered our interview scripts, recruitment emails, graphics, and social network posts. For this process, we enlisted the help of Senior Instructor Ellie Harmon. We needed help, and it was great to have guidance throughout this process.
We based the submitted script on what we thought would make a good interview conversation for both those who felt excluded from Computer Science and those who felt included. Though we aimed to hear stories from those who experienced discrimination or felt othered to the point of dropping out, we also wanted to hear from students who stayed in the program, even if they had considered leaving. Analysis of both experiences provides a richer basis for understanding the reasons for attrition and retention.
Once the IRB documentation was submitted and approved, we could move forward with the interview process.
To recruit students, we created an email and Slack script. We sent email campaigns to students, posted on various social networks (including the department's Slack channel, personal Facebook pages, personal Instagram accounts, etc.), and leveraged personal connections with current and previous students in our department. Interview subjects also referred other current and former students to us. We were overwhelmed with the outpouring of students wanting to participate, which was a pleasant surprise. We interviewed the following people regardless of their gender identity, country of residence, race/ethnicity, or sexual orientation: current PSU CS students, graduated PSU CS students, and students who left the CS program. This focus is essential because different levels of the program have different retention rates and various experiences.
Conducting Interviews
Interviews were semi-structured, meaning that while we prepared categories and a set of questions, each interview was adapted based on the study progression and individual participants. We expected to stay relatively close to an outline of planned questions because we had multiple interviewers. We grouped the guide questions into four categories: background information, motivation and interest, adversity, and closing.
Background information gives context and makes it easy to compare experiences across interviews. This section is where we collected basic personal information such as the interviewee's age, gender, and sexual orientation. This section also explored any previous education or work experience before their studies at PSU's CS department. It provides a timeline for which CS-related classes they took in our department or elsewhere (e.g., community college, coding boot camps, self-taught, etc.).
The next section, Motivation and Interest, also explored the interviewee's support in the program. Here we probed for any types of support they had access to while pursuing their studies in CS. We opened up the questioning here to be free-flowing and allowed the interviewee to share anything about their personal experience with the interviewers. We aimed to probe for what motivated them to continue or discontinue their studies in computer science.
Adversity questions were modified as the study progressed and tailored to individual respondents as appropriate. Here we discussed any discrimination or microaggression they may have faced from a fellow student or faculty member. We also asked the student whether they had dropped out or considered leaving the program and whether lack of support played a role in this decision.
We conducted interviews remotely over Zoom. We recorded the calls to allow for transcription. During transcription, we omitted all identifying information from the transcript and the analysis. Once an interview was transcribed, we deleted the recording. We used the software ATLAS.ti to organize our data and analyze trends.
After analyzing both the interviews and any additional data we can obtain from OIRP, our goal is to make our findings accessible to our department's students and faculty. We aim to share insights on diversity statistics within the PSU CS department and amplify the voices of underrepresented students in the department while offering faculty valuable insight into how they can better support these students.
Faculty and Dean Presentations
After we completed the interviews and performed an initial analysis, we wanted to present our findings to the faculty. In different circumstances, we might have performed further interview analysis first. However, two of the three students on our research team graduated at the end of the Fall 2020 term. Through conversations with our faculty support and with the Computer Science department chair, Mark Jones, we arrived at the idea of presenting our initial findings and expressing our concern at an optional CS faculty meeting around the close of the term. We spent a lot of time selecting quotations and excerpts from our student experience interviews to give the faculty a good overview of what we heard. We intended to shed some light on what the department is like from some students' perspectives.
Since we wanted to understand why people leave, a lot of the quotes we used were about some of the things that might be going wrong. We were worried about how the faculty would react, so we very carefully curated this presentation. We have a great deal of respect for the faculty who put so much time and effort into our education. So, it was tough to balance expressing that respect while also painting an honest picture. We, of course, know that a lot of students graduate and are very successful, but this project focuses on what prevents students from being successful or the challenges they face. We tried to find students who had a great time and students who had a bad time and dropped out. And, of course, we focus on the students who came forward and wanted a voice. That in itself can skew negative.
We organized these student experience snapshots into three categories for the presentation:
- Asking for Support: stories about challenges students have experienced when reaching out for help or support in our CS department
- Belonging and Confidence: stories about feeling unsure if you belong in CS or doubting your abilities
- Culture of Isolation: an expectation of figuring things out on your own or having prior knowledge
We were careful that for every student quote we selected, we selected it because the topic was pervasive in multiple interviews and represented more than just that one student's experience. Additionally, because a lot of the students we spoke with voiced that this was the first time anyone had asked them to talk about this, we mentioned that. We wanted to communicate that this was not just a tiny subset of people with isolated experiences. Since our time and resources were limited, if we were to do this again, we would want more time to recruit, interview, transcribe and analyze.
We took questions at the end of the presentation and facilitated a discussion where we could get faculty feedback. Every person on the faculty who attended was very supportive and expressed that they were glad we focused on the topic.
After this presentation, we gave a presentation for the Dean of the College of Engineering, Richard Corsi. For this presentation, we felt comfortable softening things a bit less. We also described certain things about the CS program that the CS department's faculty would know but the dean might not, including some of the overall difficulties students face. We intended to inspire the faculty and the dean to realize that students are concerned about this, and we want things to get better.
Interview Analysis
At the time of writing, we are focusing on analyzing the data from the university and our interviews. Preliminary analysis of student perspectives on why students leave suggests that factors include:
- Archaic Curriculum
- Anxiety-inducing Proficiency Demos
- Discouraging or confusing Advising and advice
- Challenges in Asking for Support: lack of open communication with some instructors
- Unclear or out-of-reach resources