Course Schedule
Introduction to the course
- Week 1 1/16: Course introduction
- Week 2 1/23: Computational Social Science: Why Research Design Approach
Data Management
- Week 3 1/30: Data Management: Methods and tools
- Week 4 2/6: Data Management: Background and Purposes (group presentation)
- Week 5 2/13: Data Management Exercise: Gathering Literature in Your Field
Concept Representation
- Week 6 2/20: Concept Representation: Background and Purposes (group presentation)
- Week 7 2/27: Concept Representation: Methods and tools
- Week 8 3/5: Concept Representation Exercise: Automated Coding
Data Analysis
- Week 9 3/19: Data Analysis: Background and Purposes (group presentation)
- Week 10 3/26: Data Analysis: Methods and Tools
- Network analysis as a representation and analysis method
- Process of network analysis
- Visualization tool: Gephi
- Week 11 4/2: Data Analysis Exercise: Simulation and Regression
Scientific Communication
- Week 12 4/9: Scientific Communication: Background and Purposes (group presentation)
- Week 13 4/16: Scientific Communication: Methods and tools
- Basic principles of visualization.
- Review tools:
- Programming language-based tools: Plotly and Dash for Python, Shiny for R
- Off-the-shelf tools: Tableau, PowerBI, Excel.
- Week 14 4/23: Presenting Data Dashboard or Final Project + Happy Hour
- Meet at Haymaker at 11am
- Bring your laptop; we will Zoom-share and present your data dashboard or final project first.
- Happy hour: I’ll cover all snacks, plus one main course and two drinks (alcoholic or non-alcoholic) per person.
Weekly Details
Week 1: Course introduction
In class
- Course overview:
- Context of this course.
- Course sites: Syllabus website, Open Science Framework, Canvas.
- Helpful resources:
- Open source communities (e.g., Stack Overflow)
- ChatGPT. Discussion: How can you use it effectively and responsibly? Share your best practices.
- CSS Empirical Studies Database. Discussion: Pick two studies that interest you and discuss them with your neighbors.
After class
- Complete readings for the upcoming week.
- Register accounts:
- How to use Chameleon Cloud computing resources:
- Assignment 2 sign-up due upcoming Monday.
Week 2: Computational Social Science: Why Research Design Approach
Before class
- Required readings:
- CSSPrimer: Chapter 1 and 2.
In class
- Discussion and lecture on readings. Key points:
- Philosophical and epistemological fundamentals, research design overview, comparison between CSS and conventional approaches
- Data management, concept representation, data analysis, and scientific communication
- In-class review and preparation:
- Group presentations.
- Empirical studies for analysis.
If time allows: High-performance cloud computing with Chameleon
- Start an instance on Chameleon Cloud
- Install Anaconda Python and Jupyter Notebook.
- Snapshot the instance as an image.
- You can also watch the video recordings below:
- How to use Chameleon Cloud: Set up a new instance
- How to set up a Jupyter Lab server
After class
- Assignment 1 due upcoming Monday.
- Review the tools and platforms for the upcoming week and prepare to discuss how you plan to use them.
Week 3: Data Management: Methods and tools
Before class
- Review Assignment 3: Gathering Literature in Your Field
In class
- File and data formats: APIs, JSON, and relational databases.
- Efficiency and automation.
- Tools review:
- OpenAlex
- Draw.io
- MySQL Workbench
- Prepare Assignment 3: Gathering Literature in Your Field
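To make the link between JSON and relational databases concrete, here is a minimal sketch of flattening a nested, API-style JSON response into normalized tables. It is an illustration only, not the method used in class: the records and field names (`results`, `authors`, and so on) are hypothetical, and Python's built-in `sqlite3` stands in for a MySQL server so the example runs without any setup.

```python
import json
import sqlite3

# Hypothetical API-style response. The field names are assumptions made
# for illustration and do not reproduce any particular service's schema.
raw = """
{
  "results": [
    {"id": "W1", "title": "Example study A", "publication_year": 2020,
     "authors": [{"id": "A1", "name": "Author One"},
                 {"id": "A2", "name": "Author Two"}]},
    {"id": "W2", "title": "Example study B", "publication_year": 2021,
     "authors": [{"id": "A1", "name": "Author One"}]}
  ]
}
"""


def load_works(conn, payload):
    """Flatten nested JSON records into three normalized tables."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS works (
            id TEXT PRIMARY KEY, title TEXT, year INTEGER);
        CREATE TABLE IF NOT EXISTS authors (
            id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE IF NOT EXISTS work_authors (
            work_id TEXT REFERENCES works(id),
            author_id TEXT REFERENCES authors(id),
            PRIMARY KEY (work_id, author_id));
    """)
    for work in payload["results"]:
        conn.execute(
            "INSERT OR REPLACE INTO works VALUES (?, ?, ?)",
            (work["id"], work["title"], work["publication_year"]),
        )
        for author in work["authors"]:
            # An author appears once in `authors`, however many works they have.
            conn.execute(
                "INSERT OR IGNORE INTO authors VALUES (?, ?)",
                (author["id"], author["name"]),
            )
            # The many-to-many link lives in its own junction table.
            conn.execute(
                "INSERT OR IGNORE INTO work_authors VALUES (?, ?)",
                (work["id"], author["id"]),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_works(conn, json.loads(raw))

# Count works per author with a join across the junction table.
rows = conn.execute("""
    SELECT a.name, COUNT(*)
    FROM authors AS a
    JOIN work_authors AS wa ON wa.author_id = a.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
print(rows)  # [('Author One', 2), ('Author Two', 1)]
```

The nested `authors` list is the part that does not fit a single flat table, which is why the sketch splits it into an `authors` table and a `work_authors` junction table; an entity-relationship diagram of a real dataset would show the same three boxes.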
After class
- Group presentation slides and annotated bibliography on Data Management due upcoming Monday.
Week 4: Data Management: Background and Purposes (group presentation)
Before class
- Required readings
- Leonelli, Sabina. “Scientific Research and Big Data.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2020. Metaphysics Research Lab, Stanford University, 2020. https://plato.stanford.edu/archives/sum2020/entries/science-big-data/.
- Wickham, Hadley. “Tidy Data.” The Journal of Statistical Software 59, no. 10 (2014). http://www.jstatsoft.org/v59/i10/.
- Goble, Carole, and David De Roure. “The Impact of Workflow Tools on Data-Centric Research.” In The Fourth Paradigm: Data-Intensive Scientific Discovery, edited by Tony Hey, Stewart Tansley, Kristin Tolle, and Jim Gray. Microsoft Research, 2009. https://www.microsoft.com/en-us/research/publication/fourth-paradigm-data-intensive-scientific-discovery/.
- Fidler, Fiona, and John Wilcox. “Reproducibility of Scientific Results.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2021. Metaphysics Research Lab, Stanford University, 2021. https://plato.stanford.edu/archives/sum2021/entries/scientific-reproducibility/.
In class
- Student-led group presentation and instructor lecture.
- Group discussion on annotated bibliography.
- Prepare Assignment 3: Gathering Literature in Your Field.
After class
- Assignment 3: Gathering Literature in Your Field due upcoming Monday.
- Group presentation slides and annotated bibliography on Concept Representation due in two weeks.
Week 5: Data Management Exercise: Gathering Literature in Your Field
Before class
- Complete the Assignment and submit to OSF.
In class
- Presentation and discussion of Assignment.
- Review peer assignments and provide feedback.
- Preview next exercise.
After class
- Revise assignments according to feedback.
- Group presentation slides and annotated bibliography on Concept Representation due upcoming Monday.
- Read required readings.
- Prepare group presentation.
- Write annotated bibliography.
Week 6: Concept Representation: Background and Purposes (group presentation)
Before class
- Required readings
- Creswell, John W. “The Selection of a Research Approach.” In Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 4th ed. Thousand Oaks: SAGE Publications, 2014.
- Ragin, Charles C., and Lisa M. Amoroso. “The Goals of Social Research.” In Constructing Social Research: The Unity and Diversity of Method, 135–62. Pine Forge Press, 2011.
- Grimmer, Justin, and Brandon M. Stewart. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21, no. 3 (2013): 267–97. https://doi.org/10.1093/pan/mps028.
In class
- Student-led group presentation and instructor lecture.
- Discussion on annotated bibliography.
- Prepare Assignment 4: Automated Coding, due in two weeks.
After class
Week 7: Concept Representation: Methods and tools
Before class
- Prepare Assignment 4: Automated Coding
In class
- Inductive coding with topic modeling:
- Deductive coding (text classification):
- Fine-tune model with training dataset.
- Prompt-based classification with LLMs: Llama 2 (English) and Qwen (multilingual)
- Prepare Assignment 4: Automated Coding
After class
- Group presentation slides and annotated bibliography on Data Analysis due in two weeks.
- Complete Assignment 4: Automated Coding, due upcoming Monday.
Week 8: Concept Representation Exercise: Automated Coding
Before class
- Complete the Assignment and submit to OSF.
In class
- Presentation and discussion of Assignment.
- Review peer assignments and provide feedback.
- Preview next exercise.
After class
- Revise assignments according to feedback.
- Group presentation slides and annotated bibliography on Data Analysis due upcoming Monday.
- Read required readings.
- Prepare group presentation.
- Write annotated bibliography.
Week 9: Data Analysis: Background and Purposes (group presentation)
Before class
- Required readings
- Hofman, Jake M., Duncan J. Watts, Susan Athey, Filiz Garip, Thomas L. Griffiths, Jon Kleinberg, Helen Margetts, et al. “Integrating Explanation and Prediction in Computational Social Science.” Nature 595, no. 7866 (July 2021): 181–88. https://doi.org/10.1038/s41586-021-03659-0.
- Ludwig, Jens, and Sendhil Mullainathan. “Machine Learning as a Tool for Hypothesis Generation.” Working Paper. Working Paper Series. National Bureau of Economic Research, March 2023. https://doi.org/10.3386/w31017.
In class
- Student-led group presentation and instructor lecture.
- Discussion on annotated bibliography.
- Review Assignment 5: Network Analysis.
After class
- Prepare Assignment 5: Network Analysis, due in two weeks.
Week 12: Scientific Communication: Background and Purposes (group presentation)
Before class
- Required readings
- Wickham, Hadley. “Tidy Data.” The Journal of Statistical Software 59, no. 10 (2014). http://www.jstatsoft.org/v59/i10/.
- Kirk, Andy. “The Visualisation Design Process.” In Data Visualisation: A Handbook for Data Driven Design, 2nd ed., 31–58. SAGE Publications Ltd, 2019.
- Kirk, Andy. “Working With Data.” In Data Visualisation: A Handbook for Data Driven Design, 2nd ed., 95–117. SAGE Publications Ltd, 2019.
In class
- Student-led group presentation and instructor lecture.
- Discussion on annotated bibliography.
- Review Data Dashboards and Final Project.
After class
- Prepare Data Dashboards or Final Project.