Course Schedule
Computational methods and research design fundamentals (4 weeks)
- Week 1: Course introduction
- Key points: course structure, assignments, syllabus
- Week 2: Computational methods for social sciences: Overview
- Key points: philosophical and epistemological fundamentals, research design overview, comparison between CSS and conventional approaches
- Week 3: Analyzing computational methods from a research design perspective
- Key points: data management, concept representation, data analysis, and scientific communication
- Week 4: Field visit: Texas Advanced Computing Center
- Visit TACC; Discussion on final project options
Analyzing computational social science methods (8 + 1 weeks)
The class votes on the instructor-led sessions before 1/27, choosing from these options:
- Week 5: Computational methods: NLP algorithms and models as concept representation tools (instructor-led)
- Key points: methodological background and overview, vector semantics and embeddings, Word2Vec, Doc2Vec, semantic similarity
- Week 6: Research design: Data management (student-led)
- Week 7: Computational methods: topic modeling and classification (instructor-led)
- Key points: Topic modeling and text classification
- Week 8: Research design: Concept representation (student-led)
- Week 9: Computational methods: Network analysis as a representation and analysis method (instructor-led)
- Key points: Basic concepts of network analysis, network generation and transformation, levels of analysis
- Week 10: Research design: Data analysis (student-led)
- Week 11: Computational methods: Process of network analysis
- Week 12: Group consultation on final project (no class)
- Week 13: Research design: Scientific communication (student-led)
Final project
- Week 14: Final project presentations
Weekly Details
Week 1: Course introduction
Before class
- Readings:
- Hofman, Jake M., Duncan J. Watts, Susan Athey, Filiz Garip, Thomas L. Griffiths, Jon Kleinberg, Helen Margetts, et al. 2021. “Integrating Explanation and Prediction in Computational Social Science.” Nature 595 (7866): 181–88. https://doi.org/10.1038/s41586-021-03659-0.
- Edelmann, Achim, Tom Wolff, Danielle Montagne, and Christopher A. Bail. 2020. “Computational Social Science and Sociology.” Annual Review of Sociology 46 (1): 61–81. https://doi.org/10.1146/annurev-soc-121919-054621.
- Lazer, David M. J., Alex Pentland, Duncan J. Watts, Sinan Aral, Susan Athey, Noshir Contractor, Deen Freelon, et al. 2020. “Computational Social Science: Obstacles and Opportunities.” Science 369 (6507): 1060–62. https://doi.org/10.1126/science.aaz8170.
In class
- Course overview:
- Motivation and history of this course.
- Course sites: Syllabus website, Canvas, and how to use them.
- Helpful resources: open source communities, ChatGPT (and how to responsibly use it for educational purposes), etc.
- Review final project options.
- Discussion on readings: Analytical capacity of CSS methods
- Review CSS Empirical Studies Database.
After class
- Register Accounts:
- Review “Getting started with Chameleon Cloud”
Week 2: Computational methods for social sciences: Overview
Before class
- Ragin, Charles C., and Lisa M. Amoroso. 2011. “The Goals of Social Research.” In Constructing Social Research: The Unity and Diversity of Method, 135–62. Pine Forge Press.
- Leonelli, Sabina. 2020. “Scientific Research and Big Data.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2020. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2020/entries/science-big-data/.
In class
- Review upcoming assignments.
- Discussion and lecture on readings.
- Hands-on: High-performance cloud computing with Chameleon
- Start an instance on Chameleon Cloud
- Install Anaconda Python and Jupyter Notebook.
- Snapshot the instance as an image.
- Discussion on final project options.
Video recording
- How to use Chameleon Cloud: Set up a new instance
- How to set up a Jupyter Lab server
Week 3: Analyzing computational methods from a research design perspective
Before class
- Ragin, Charles C., and Lisa M. Amoroso. 2011. “What Is (and Is Not) Social Research?” In Constructing Social Research: The Unity and Diversity of Method, 5–32. Pine Forge Press.
- Ma, Ji, Islam Akef Ebeid, Arjen de Wit, Meiying Xu, Yongzheng Yang, René Bekkers, and Pamala Wiepking. 2021. “Computational Social Science for Nonprofit Studies: Developing a Toolbox and Knowledge Base for the Field.” VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, October. https://doi.org/10.1007/s11266-021-00414-x.
In class
- Discussion and lecture on readings.
- Discussion on final project options.
Week 4: Field visit: Texas Advanced Computing Center (TBD)
In class
- Visit TACC.
- Discussion on final project options.
Week 5: Computational methods: NLP algorithms and models as concept representation tools
Before class
- Required readings (copies of the Grimmer, Roberts, and Stewart (GRS) chapters are on the course’s Canvas site because of copyright)
- Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2022. “Social Science Research and Text Analysis.” In Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton, New Jersey; Oxford: Princeton University Press.
- Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2022. “Principles of Measurement.” In Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton, New Jersey; Oxford: Princeton University Press.
- Jurafsky, Daniel, and James H. Martin. 2022. “Vector Semantics and Embeddings.” In Speech and Language Processing, 3rd draft. https://web.stanford.edu/~jurafsky/slp3/.
- Recommended readings:
- Rodriguez, Pedro L., and Arthur Spirling. 2022. “Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research.” The Journal of Politics 84 (1): 101–15. https://doi.org/10.1086/715162.
- Grimmer, Justin, and Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21 (3): 267–97. https://doi.org/10.1093/pan/mps028.
In class
- Overview: typical application of NLP in social science research
- Hands-on:
- Preprocess text with Stanza.
- Vectorize words with pretrained models.
- Calculate word similarity. Example studies:
- Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.” American Sociological Review 84 (5): 905–49. https://doi.org/10.1177/0003122419877135.
- Jones, Jason J., Mohammad Ruhul Amin, Jessica Kim, and Steven Skiena. 2020. “Stereotypical Gender Associations in Language Have Decreased Over Time.” Sociological Science 7 (January): 1–35. https://doi.org/10.15195/v7.a1.
- Calculate document similarity with Word Mover’s Distance. Example studies:
- Ma, Ji. 2022. “How Does an Authoritarian State Co-Opt Its Social Scientists Studying Civil Society?” VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, July. https://doi.org/10.1007/s11266-022-00510-6.
- Vectorize documents/paragraphs/sentences with pretrained models.
- Calculate similarity between documents/paragraphs/sentences. Example studies:
- Ma, Ji, and René Bekkers. 2023. “Consensus Formation in Nonprofit and Philanthropic Studies: Networks, Reputation, and Gender.” Nonprofit and Voluntary Sector Quarterly, January, 08997640221146948. https://doi.org/10.1177/08997640221146948.
- Max length of input documents (caveat 1, caveat 2)
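The similarity steps in the hands-on list above can be sketched as follows. This is a minimal illustration, not the course's notebook: the three-dimensional vectors are invented toy values standing in for a pretrained model, and the names `toy_vectors`, `cosine_similarity`, and `document_vector` are hypothetical. Averaging word vectors is used here only as a simple document-level baseline; it is not Word Mover’s Distance.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration (assumption).
# Real pretrained embeddings typically have hundreds of dimensions.
toy_vectors = {
    "charity": [0.9, 0.1, 0.0],
    "donation": [0.8, 0.2, 0.1],
    "network": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def document_vector(tokens, vectors):
    """Average the vectors of the tokens that have one (a simple baseline)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Word-level similarity: related words should score higher than unrelated ones.
sim_related = cosine_similarity(toy_vectors["charity"], toy_vectors["donation"])
sim_unrelated = cosine_similarity(toy_vectors["charity"], toy_vectors["network"])

# Document-level similarity using averaged word vectors.
doc_a = document_vector(["charity", "donation"], toy_vectors)
doc_b = document_vector(["network"], toy_vectors)
doc_sim = cosine_similarity(doc_a, doc_b)

print(f"charity vs donation: {sim_related:.3f}")
print(f"charity vs network:  {sim_unrelated:.3f}")
print(f"doc_a vs doc_b:      {doc_sim:.3f}")
```

With real pretrained vectors, the same cosine function applies unchanged; only the lookup table would be replaced by the model's vectors.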
After class
Practice the coding sessions, experiment with your own datasets, and revise your research proposal.
Week 6: Research design: Data management (student-led)
Before class
- Recommended readings:
- Baker, Monya. 2016. “1,500 Scientists Lift the Lid on Reproducibility.” Nature News 533 (7604): 452. https://doi.org/10.1038/533452a.
- Wilson, Greg, D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, et al. 2014. “Best Practices for Scientific Computing.” PLOS Biology 12 (1): e1001745. https://doi.org/10.1371/journal.pbio.1001745.
- Gentzkow, Matthew, and Jesse M. Shapiro. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf.
- Wickham, Hadley. 2014. “Tidy Data.” The Journal of Statistical Software 59 (10). http://www.jstatsoft.org/v59/i10/.
- Boyd, Nora Mills. 2018. “Evidence Enriched.” Philosophy of Science 85 (3): 403–21. https://doi.org/10.1086/697747.
- Leonelli, Sabina. 2020. “Scientific Research and Big Data.” In The Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Summer 2020. Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/sum2020/entries/science-big-data/.
- Empirical readings (TBD by student group)
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Provide feedback on the group report.
Week 7: Computational methods: topic modeling and classification (instructor-led)
Before class
No readings before class. Practice the coding sessions from previous weeks, and prepare a sample text dataset for your proposed research (we will need it in class for practice).
In class
- Overview: Technical background of topic modeling and classification, application in research
- Hands-on:
- Topic modeling based on different vectorization methods:
- Static word embedding (universal-sentence-encoder-multilingual)
- Contextual word embedding (BERT)
- Generation of topic keywords
- Classification of texts (code review)
- Practice with your own datasets
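The "generation of topic keywords" step can be illustrated with a small sketch. It assumes the documents have already been grouped into clusters (for example, by clustering their embeddings) and ranks terms with a class-based TF-IDF-style weight. This weighting is an assumption chosen for illustration, not necessarily the method used in class; the toy `clusters` data and the `topic_keywords` helper are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy clusters: documents already grouped by some clustering
# step that is not shown here (assumption for illustration).
clusters = {
    0: ["donors give money to charity", "charity donors fund programs"],
    1: ["network ties connect actors", "actors form network ties"],
}


def topic_keywords(clusters, top_n=3):
    """Rank terms per cluster by term frequency times a cluster-level IDF."""
    # Treat each cluster as one long document and count its terms.
    counts = {cid: Counter(" ".join(docs).split()) for cid, docs in clusters.items()}
    n_clusters = len(counts)
    # Number of clusters in which each term appears at least once.
    cluster_freq = Counter(term for c in counts.values() for term in c)
    keywords = {}
    for cid, c in counts.items():
        total = sum(c.values())
        scores = {
            term: (freq / total) * math.log(1 + n_clusters / cluster_freq[term])
            for term, freq in c.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords[cid] = ranked[:top_n]
    return keywords


keywords = topic_keywords(clusters)
for cid, terms in keywords.items():
    print(cid, terms)
```

On real data, stop-word removal and lemmatization would normally precede this step so that function words such as "to" do not compete with content words.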
After class
Practice the coding sessions, experiment with your own datasets, and revise your research proposal.
Week 8: Research design: Concept representation (student-led)
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Provide feedback on the group report.
Week 9: Computational methods: Network analysis as a representation and analysis method (instructor-led)
Before class
- Recommended readings:
- Scott, John. 2017. “What Is Social Network Analysis?” In Social Network Analysis, Fourth edition. Thousand Oaks, CA: SAGE Publications Ltd.
- Scott, John. 2017. “Terminology for Network Analysis.” In Social Network Analysis, Fourth edition, 73–94. Thousand Oaks, CA: SAGE Publications Ltd.
- Watts, Duncan J. 2004. “The ‘New’ Science of Networks.” Annual Review of Sociology 30 (1): 243–70. https://doi.org/10.1146/annurev.soc.30.020404.104342.
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Provide feedback on the group report.
Week 10: Research design: Data analysis (student-led) (TBD)
Before class
- Recommended readings:
- Hofman, Jake M., Duncan J. Watts, Susan Athey, Filiz Garip, Thomas L. Griffiths, Jon Kleinberg, Helen Margetts, et al. 2021. “Integrating Explanation and Prediction in Computational Social Science.” Nature 595 (7866): 181–88. https://doi.org/10.1038/s41586-021-03659-0.
- Gerring, John. 2012. “Mere Description.” British Journal of Political Science 42 (4): 721–46. https://doi.org/10.1017/S0007123412000130.
- Humphreys, Paul. 2009. “The Philosophical Novelty of Computer Simulation Methods.” Synthese 169 (3): 615–26. https://doi.org/10.1007/s11229-008-9435-2.
- Empirical readings (TBD by student group)
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Provide feedback on the group report.
Week 11: Computational methods: Process of network analysis
Before class
- Recommended readings:
- Borgatti, Stephen P., and Daniel S. Halgin. 2011. “On Network Theory.” Organization Science 22 (5): 1168–81. https://doi.org/10.1287/orsc.1100.0641.
- Review network analysis algorithms
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Work on the final project and prepare to finalize the written report for Assignment 3: Student-led seminar on research design.
Week 13: Research design: Scientific communication (student-led) (TBD)
Before class
- Recommended readings:
- Wickham, Hadley. 2014. “Tidy Data.” The Journal of Statistical Software 59 (10). http://www.jstatsoft.org/v59/i10/.
- Kirk, Andy. 2019. Data Visualisation: A Handbook for Data Driven Design. 2nd edition. S.l.: SAGE Publications Ltd.
- “Data storytelling” books available through the university library. There are many such books, and not all of them are helpful. This book is a bestseller on Amazon.
- Recommended DataCamp modules:
- Empirical readings (TBD by student group)
In class
- Discussion and lecture on readings.
- Discussion on final project.
After class
Provide feedback on the group report.