USTC Course

Computational Social Science Methods course at USTC, Summer 2025.

PADM6435P Quantitative Research Methods: Computational Social Science Methods (Tentative)

2025 Summer, Tuesday (7:50-12:10) and Thursday (14:00-18:20), USTC5106



This research-oriented course introduces and contextualizes computational methods from a social science research design perspective. Students will examine how computational tools and techniques can enhance traditional quantitative research methods in social science. The course is structured around four key components of research design in computational social science: data management, concept representation, data analysis, and scientific communication.

The course emphasizes the research design rationale for using computational methods in answering social science questions. Through a combination of theoretical readings, discussions, and hands-on exercises, students will learn to integrate computational approaches into social science research, understand the strengths and limitations of these methods, and critically evaluate empirical studies that utilize computational data analysis.

Students are strongly encouraged to responsibly use AI tools (e.g., ChatGPT or DeepSeek) to boost productivity in assignments, while maintaining academic integrity. Programming ability will be helpful; however, programming itself is treated as a means to an end—the focus is on using computational tools to advance social science research objectives.

By the end of this course, students should be able to:

  • Understand key CSS methods: Define and describe the major components of computational social science research design (data management, concept representation, data analysis, scientific communication) and explain why they are important to quantitative social science research.
  • Apply computational techniques: Implement a variety of computational methods (such as automated text analysis, network analysis, simulation, and data visualization) to real-world social science data, in order to answer research questions or solve practical problems.
  • Critically evaluate research: Analyze and critique existing empirical studies in social science that employ computational methods, assessing their research design, data usage, analytic techniques, and the validity and reproducibility of their results.
  • Communicate findings effectively: Present quantitative findings through clear writing, well-designed data visualizations, and, if applicable, interactive dashboards, demonstrating the ability to communicate complex data-driven insights to both technical and non-technical audiences.

Course Materials and Readings

Required Text/Manuscript: Computational Social Science Methods: A Research Design Primer (referred to as CSSPrimer). This is a draft book manuscript that I have been working on; it will serve as the primary text. Relevant chapters are assigned as weekly readings (see schedule). The manuscript will be provided on Google Drive.

Required Readings: Each week will include scholarly readings (journal articles, book chapters, or other sources) that exemplify or elaborate on that week’s theme. Students are expected to complete all required readings before class each week to participate actively in discussions.

Additional Recommended Readings:

  • Code and Data for the Social Sciences: A Practitioner’s Guide by Matthew Gentzkow and Jesse M. Shapiro (2014). https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf
  • Text as Data: A New Framework for Machine Learning and the Social Sciences by Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart (2022).
  • Social Network Analysis (4th ed.) by John Scott (2017).
  • Speech and Language Processing (3rd draft) by Daniel Jurafsky and James H. Martin (2022). https://web.stanford.edu/~jurafsky/slp3/

All additional weekly readings (journal articles, etc.) will be accessible through the university library, open-access sources, or provided as PDFs via the course website. Students are not required to buy any books.


Assignments and Evaluation: Two Paths (Standard or Working Paper)

All students will complete Assignments 1 and 2. There are two paths for the rest of the assignments.

  • Standard Path: Complete Assignment 3 for each week.
  • Working Paper Path: If you already have a working paper that you wish to advance by incorporating the computational methods we cover, you may skip Assignment 3 and do Assignment 4 instead.

All students must decide by Session 2 whether they will choose the Standard Path or the Working Paper Path. Below are the assignment details.

Assignment 1: Plagiarism Certification Test (10 points; ALL students; individual)

Complete an online plagiarism education module and test. Students must pass the Indiana University Plagiarism Test (master’s/doctoral level; https://plagiarism.iu.edu/certificationTests/) and submit the completion certificate by Session 2. Note: as of 2025/7/1, the website is offline due to a technology outage. If you have previously completed a comparable certification, provide proof for instructor approval. This ensures all students understand what constitutes plagiarism and how to avoid it.

Assignment 2: Group Presentation and Annotated Bibliography (65 points; ALL students; group & individual)

Group Presentation (15 points per presentation, 45 points total):

  • Students form small groups (topics assigned early) to lead a 30-minute in-class presentation on one of the three core themes of the course: data management, concept representation, and data analysis.
  • Presentations are scheduled for Sessions 3 (data management), 5 (concept representation), and 7 (data analysis). Each group synthesizes their theme from a research design perspective, incorporating definitions, frameworks, and empirical examples.
  • Slides must be submitted to OSF before presenting.

Presentation of Empirical Studies (10 points per session, 20 points total):

  • For each week you are not presenting, prepare a brief presentation analyzing two empirical studies from the CSS Empirical Studies Database or your own search.
  • The presentation should highlight each study’s research question, data, computational method, and how it illustrates the theme of that session (i.e., data management, concept representation, or data analysis).
  • Grading is based on completeness, thoughtfulness, and relevance to the session theme.

Assignment 3: DataCamp Modules (30 points; Standard Path only; individual)

To complement the weekly learning objectives and content, Standard Path students are required to complete at least one DataCamp module per week (approximately 3–4 hours of learning) and submit the corresponding earned certificates as proof of completion by the end of each week to receive credit. Below are recommended learning modules:

Data Management:

Concept Representation:

Data Analysis:

Students may substitute a different module of their own choice, provided that the alternative module is relevant in content and learning time (i.e., 3–4 hours).

Assignment 4: Presentations of Working Papers (30 points; Working Paper Path only; individual)

  • Working Paper Path students will work on their existing paper and present progress at each Review Session.
  • The final deliverable is an improved working paper demonstrating integration of data management, concept representation, data analysis, and scientific communication principles.
  • In the final session (Session 8), Working Paper Path students will present their final Working Papers. This final presentation can serve as a showcase project for your path.
  • The grade is based on your working paper’s progress and final result: each milestone and the final submission should show how you have integrated principles of computational methods into your paper.

Grading Scale

Final grades are out of 100 points (sum of all assignments). Letter grades follow university policy without rounding:

  • A = 95–100; A– = 90–94
  • B+ = 87–89; B = 83–86; B– = 80–82
  • C+ = 77–79; C = 73–76; C– = 70–72
  • D+ = 67–69; D = 63–66; D– = 60–62
  • F = below 60

Detailed Schedule

Session 1 (July 1) – Course Introduction

Required Readings:

Key Topics:

  • Research Design Fundamentals – Overview of CSS research design and how it differs from conventional social research.
  • Computational Methods Overview – Introduction to the core computational tools and methods used in the course, plus an outline of the four major themes (data management, concept representation, data analysis, scientific communication).
  • High-Performance Computing (HPC) Basics – Basics of using advanced computing resources for research (e.g., a demonstration of cloud computing environments).

Working Paper Path:

  • Students on the Working Paper Path bring a draft research paper to class in Session 1. This draft should already include a research question/aim and some initial framing. Throughout the course, they will iteratively expand or refine this draft with each computational theme, rather than starting a new project.

Exercises/Activities – In Class:

  • Course Resources Tour:
    • Syllabus
    • Book manuscript, OSF project page, DataCamp
    • WeChat Group
  • Working Paper Path: Introducing your projects.
  • Instructor lecture on key topics
  • HPC and Cloud Computing Setup:
    • Google Colab
    • DataCamp
    • VS Code + Remote instance

Exercises/Activities – After Class:

  • Assignment 1 (Plagiarism Test) – Complete online certification by the end of the week.
  • Register accounts: OSF, DataCamp, VPN.
  • Working Paper Path students: Prepare a 20-minute presentation of your research paper.

Session 2 (July 3) – Review Session: Working Paper and Group Presentation Prep

Exercises/Activities – In Class:

  • Each Working Paper Path student gives a 20-minute presentation of their current research paper. The instructor and the class will provide feedback.
  • Assign students to different groups to plan for the Group Presentations (i.e., Assignment 2).
  • Review CSS Empirical Studies Database, preparing for the Presentation of Empirical Studies (i.e., Assignment 2).

Exercises/Activities – After Class:

  • Prepare Assignment 2 Group Presentation (group).
  • Prepare Assignment 2 Presentation of Empirical Studies (individual).

Session 3 (July 8) – Data Management

Required Readings:

  • CSSPrimer: Chapter 2 (data management foundations).
  • Additional journal articles TBD.

Key Topics:

  • Core Data Management Concepts: Best practices (data formats, APIs, JSON, relational databases), efficiency, reproducibility.
  • Tools and Methods: Brief introduction to OpenAlex (for scholarly literature), draw.io (workflow diagrams), and MySQL Workbench (relational databases).
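
As a rough illustration of how JSON records from an API can be normalized into relational tables, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the payload, table names, and column names are all invented for this example (they are not taken from OpenAlex or the course materials).

```python
import json
import sqlite3

# Hypothetical API-style JSON payload (invented for illustration).
RAW = """
[
  {"id": "W1", "title": "Paper A", "year": 2021, "authors": ["Li", "Chen"]},
  {"id": "W2", "title": "Paper B", "year": 2023, "authors": ["Chen"]}
]
"""


def load_works(raw_json, conn):
    """Normalize nested JSON records into two relational tables."""
    conn.execute(
        "CREATE TABLE works (id TEXT PRIMARY KEY, title TEXT, year INTEGER)"
    )
    conn.execute(
        "CREATE TABLE authorships (work_id TEXT REFERENCES works(id), author TEXT)"
    )
    for rec in json.loads(raw_json):
        conn.execute(
            "INSERT INTO works VALUES (?, ?, ?)",
            (rec["id"], rec["title"], rec["year"]),
        )
        # One row per (work, author) pair instead of a nested list.
        conn.executemany(
            "INSERT INTO authorships VALUES (?, ?)",
            [(rec["id"], name) for name in rec["authors"]],
        )
    conn.commit()


def works_per_author(conn):
    """Count works per author with a GROUP BY query."""
    rows = conn.execute(
        "SELECT author, COUNT(*) FROM authorships GROUP BY author ORDER BY author"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load_works(RAW, conn)
counts = works_per_author(conn)
print(counts)
```

Splitting the nested author list into its own table is one common way to keep each table "tidy" (one observation per row); other schemas are equally defensible depending on the research question.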

Exercises/Activities – In Class:

  • Student group presentation on key concepts (Assignment 2 Group Presentation).
  • Student individual presentation on empirical studies (Assignment 2 Presentation of Empirical Studies)
  • Instructor comments on presentations.

Exercises/Activities – After Class:

  • For Standard Path: Complete DataCamp modules
  • For Working Paper Path: Prepare presentations on new analysis or improvements.

Session 4 (July 10) – Review Session: Data Management

Instructor Lecture:

  • Key concepts and best practices recap.
  • “Do’s and don’ts” of data cleaning, structuring, documentation, and reproducibility.

Working Paper Discussion:

  • Share how you have revised/improved the data management section of your existing draft.
  • For instance, you might have reorganized references, identified new data sources, refined how you store/clean data, adhered to “tidy” principles, and improved workflow documentation.

Exercises/Activities – After Class:

  • Prepare Assignment 2 Group Presentation (group).
  • Prepare Assignment 2 Presentation of Empirical Studies (individual).

Session 5 (July 15) – Concept Representation

Required Readings:

  • CSSPrimer: Chapter 4: Concept Representation.
  • Additional journal articles TBD.

Key Topics:

  • Concept Representation in Research: Defining theoretical constructs and measuring them in computational ways (operationalization).
  • Automated Coding & Text Analysis: Inductive (topic modeling) vs. deductive (classification) methods for turning text/unstructured data into coded variables.
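
To make the deductive side of this contrast concrete, the sketch below codes documents with a fixed keyword dictionary. This is a deliberately simple dictionary method, not a trained classifier and not a topic model; the codebook, categories, and example sentences are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical codebook: category -> keywords (invented for illustration).
CODEBOOK = {
    "environment": {"climate", "pollution", "emissions"},
    "health": {"hospital", "vaccine", "clinic"},
}


def code_document(text, codebook=CODEBOOK):
    """Return the category whose keywords appear most often, or None.

    Ties between categories go to the one listed first in the codebook.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter()
    for category, keywords in codebook.items():
        hits[category] = sum(1 for tok in tokens if tok in keywords)
    category, count = hits.most_common(1)[0]
    return category if count > 0 else None


docs = [
    "The city cut emissions and reduced pollution.",
    "A new clinic will distribute the vaccine.",
    "The council met on Tuesday.",
]
labels = [code_document(d) for d in docs]
print(labels)
```

The categories here are fixed in advance by the researcher, which is what makes the approach deductive; an inductive method such as topic modeling would instead derive the groupings from the text itself.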

Exercises/Activities – In Class:

  • Student group presentation on key concepts (Assignment 2 Group Presentation).
  • Student individual presentation on empirical studies (Assignment 2 Presentation of Empirical Studies)
  • Instructor comments on presentations.

Exercises/Activities – After Class:

  • For Standard Path: Complete DataCamp modules
  • For Working Paper Path: Prepare presentations on new analysis or improvements.

Session 6 (July 17) – Review Session: Concept Representation

Instructor Lecture:

  • Key concepts and best practices recap
  • Example studies of concept representation and measurement in CSS.
  • Lessons & Pitfalls: Common errors (e.g., ambiguous definitions, overreliance on unsupervised models) and how to mitigate them.

Working Paper Discussion:

  • Share how you have revised/improved the concept representation / operationalization of your existing draft. For instance, you might have added more validation tests or used improved algorithms.

Exercises/Activities – After Class:

  • Prepare Assignment 2 Group Presentation (group).
  • Prepare Assignment 2 Presentation of Empirical Studies (individual).
  • For Standard Path: Complete DataCamp modules
  • For Working Paper Path: Prepare presentations on new analysis or improvements.

Session 7 (July 22) – Data Analysis (July 25 Friday Morning)

Required Readings:

  • CSSPrimer – Chapter 4: Concept Representation.
  • Additional journal articles TBD.

Key Topics:

  • Explanatory vs. Predictive: Balancing causal inference with predictive modeling in CSS.
  • Advanced data analysis examples, illustrating how CSS merges explanation and prediction.
  • Network Analysis: Key concepts (nodes, edges, centrality, clustering) and how network structure can yield insights for social science questions.
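
The network concepts listed above can be sketched in a few lines of plain Python. The example below computes degree centrality and the local clustering coefficient for a small undirected graph; the edge list is hypothetical, and in practice a dedicated library would normally be used instead of hand-written functions.

```python
from itertools import combinations

# Hypothetical undirected ties between four people (invented for illustration).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_graph(edges):
    """Store the graph as a dict mapping each node to its set of neighbors."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Share of the other nodes each node is tied to (assumes >= 2 nodes)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return links / (k * (k - 1) / 2)


graph = build_graph(EDGES)
centrality = degree_centrality(graph)
print(centrality)
print({node: local_clustering(graph, node) for node in graph})
```

In this toy graph, node C has the highest degree centrality because it is tied to every other node, while node A has the highest clustering because its two neighbors are also tied to each other.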

Exercises/Activities – In Class:

  • Student group presentation on key concepts (Assignment 2 Group Presentation).
  • Student individual presentation on empirical studies (Assignment 2 Presentation of Empirical Studies)
  • Instructor comments on presentations.
  • Instructor lecture on CSS Data Analysis methods and examples.

Session 8 (July 24) – Final Session: Research Presentations (July 25 Friday Afternoon)

Research Presentation for Working Paper Path (3 hours):

  • Students present their revised draft paper, now incorporating improvements in data management, measurement, and analysis.
  • Since they arrived with a draft, they highlight the before/after changes inspired by the course (e.g., “Originally, I only had a broad concept definition; now, I used an automated text classifier to measure it more precisely.”).
  • Q&A and Feedback: The class and instructor ask questions and provide final suggestions or commendations.

Happy Hour (location TBD; 2 hours in the evening): Informal gathering to celebrate the completion of the course. Students network, discuss future research directions, and reflect on the journey of integrating computational methods into their work.