Late submissions are not accepted. Check due dates on Canvas.

Assignment 1: Plagiarism test

The first assignment of this course is to pass the plagiarism test and obtain a certificate at the master's and doctoral level. Plagiarism is serious academic misconduct. You will receive a grade of zero on plagiarized work, and there may be other consequences. Most of us have been told not to plagiarize since primary school, so we tend to assume we know what plagiarism is. However, we may overestimate how well we understand it (e.g., famous cases of plagiarism).

You do not need to take this test if you have a comparable certification or have taken this test before, but the instructor must approve the validity of your certification.

“All assignments in this course may be processed by TurnItIn, a tool that compares submitted material to an archived database of published work to check for potential plagiarism. Other methods may also be used to determine if a paper is the student’s original work. Regardless of the results of any TurnItIn submission, the faculty member will make the final determination as to whether or not a paper has been plagiarized” (Statement from the Faculty Writing Committee: Guidelines for Preventing Plagiarism).

For this assignment, please submit your certificate as a file to Canvas.

Assignment 2: Create your own cloud computing server


  1. Create a running instance on ChameleonCloud.
  2. Install Anaconda Python.
  3. Run a Jupyter Notebook server with: (a) a password, so that only you can access the Jupyter interface; (b) SSL (i.e., “https” instead of “http”), so that the communication between you and the server is encrypted.
  4. Log in to your Jupyter Notebook server through a web browser.
  5. Save an image of your instance, then submit screenshots through Canvas showing that (1) the instance image was saved successfully and (2) the Jupyter server started successfully.
  6. After submission, release your IP address and server for other users if you don’t plan to keep using the instance.
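
Steps 3(a) and 3(b) are usually handled in a Jupyter configuration file. The sketch below is one possible way to generate such a file, not a required procedure. It assumes the classic Notebook server (the `c.NotebookApp.*` options; newer Jupyter Server releases use `c.ServerApp.*` instead), a certificate and key you have already created, and a password hash you have already generated with Jupyter's own tooling (for example `jupyter notebook password`). The helper name `render_jupyter_config`, the file paths, the hash placeholder, and the port are hypothetical.

```python
def render_jupyter_config(certfile, keyfile, password_hash, port=9999):
    """Return the text of a jupyter_notebook_config.py with password + SSL.

    All arguments are placeholders you supply yourself; nothing here is
    specific to ChameleonCloud.
    """
    lines = [
        "c = get_config()  # provided by Jupyter when it loads this file",
        f"c.NotebookApp.certfile = {certfile!r}  # SSL certificate, enables https",
        f"c.NotebookApp.keyfile = {keyfile!r}  # private key for the certificate",
        f"c.NotebookApp.password = {password_hash!r}  # hashed password, never plain text",
        "c.NotebookApp.ip = '0.0.0.0'  # listen on all network interfaces",
        "c.NotebookApp.open_browser = False  # a remote server has no desktop browser",
        f"c.NotebookApp.port = {port:d}  # port you will type after the server address",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths and hash placeholder; replace them with your own values.
    print(render_jupyter_config(
        certfile="/home/cc/mycert.pem",
        keyfile="/home/cc/mykey.key",
        password_hash="<paste your generated hash here>",
    ))
```

Saving the printed text as `~/.jupyter/jupyter_notebook_config.py` and then starting `jupyter notebook` should serve the interface over https on the chosen port. With a self-signed certificate, expect a browser warning the first time you connect.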

** Special attention: ChameleonCloud often has technical glitches, so please DON’T leave this assignment to the last minute. If you have a technical issue, submit a ticket through the Help Desk. The Help Desk staff work almost exclusively on weekdays and will reply to you in one or two business days. Again, don’t put off this assignment. **

Knowledge and skills practiced:

  • Using a cloud computing platform;
  • Using a command-line terminal and a Linux system.

Assignment 3: Student-led seminar on research design

Sign up for your session here.

Students will lead four seminar sessions on four topics: data management, concept representation, data analysis, and scientific communication.

Using the definitions and frameworks of research design introduced at the beginning of the semester, students are expected to select and analyze empirical studies from the CSS Empirical Studies Database. Expected deliverables for this assignment are (1) a group lecture in class, (2) presentation slides, and (3) a written report.

The deliverables should respond to the following points:

  1. How do you define this function of CSS methods from a research design perspective, and why is it necessary?
  2. What are the common technical methods or practices, how do they complement existing research approaches, and how are they unique?
  3. How do existing empirical studies apply specific methods, and how can these applications be improved? Select at least 2 articles from the CSS Empirical Studies Database and 2 articles of your own choosing (add your own articles to the database).
  4. What are the general patterns or rationales you can abstract from your analysis?

Assignment 4: Research proposal

**Research profile**

Complete your research profile so that your classmates can learn more about you and your research interests.

**One-page research abstract**

This is a very brief introduction to your research idea. It usually includes: (1) research questions and their importance; (2) major concepts and variables, and your hypotheses about the relations between these variables; (3) your primary data sources and analysis methods; (4) any challenges and how you plan to handle them; (5) a few key references that you have identified at this stage. These articles are usually the “idols” of your proposed project.

**Research proposal draft (2 pages)**

In addition to the content of your one-page research abstract, this proposal draft should include at least the following items:

  1. Specific research questions or hypotheses, and significance.
  2. Empirical analysis plan: (1) specific analysis methods; (2) relations between your research questions and analysis methods.
  3. A small sample dataset for answering these questions.
  4. Member responsibilities and project timeline by week (with weekly goals).

**Research proposal final**

By this point, in addition to the items you’ve completed in the preceding submissions, you should also have:

  1. Complete dataset for the research.
  2. Preliminary analysis results.
  3. Tentative conclusions.

**Peer review of research proposal**

Assignment 5: Final project

Dataset options

Required components

The final project is a mini research project. You need to clearly explain how the components below are covered in your project.

  1. Research design and data management:
    • Clear research questions and significance
    • Clear operationalization linkage
    • Validation of measures
    • Data preprocessing and reduction
    • Documentation for reproducibility
  2. Analysis:
    • Descriptive analysis and visualization
    • Hypothesis testing and visualization
    • Any or all of the following:
      • Clustering analysis and visualization (e.g., topic modeling)
      • Network analysis of nodes/communities/topology
  3. Bonus: applying any of these skills earns 5% additional points
    • Machine learning
    • Named-entity recognition
    • Optical character recognition
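
As one illustration of the descriptive analysis and hypothesis testing components, the sketch below summarizes two groups and runs a two-sided permutation test for a difference in means, using only the Python standard library. The function name `permutation_test`, the toy numbers, and the choice of a permutation test are assumptions made for illustration; your project may call for a different test and real data.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the share of random relabelings whose absolute mean difference
    is at least as large as the observed one (an approximate p-value).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations to the two groups
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


if __name__ == "__main__":
    # Made-up values standing in for a measured variable in two groups.
    group_a = [12, 15, 11, 14, 13, 16]
    group_b = [18, 17, 20, 16, 19, 21]

    # Descriptive analysis: center and spread of each group.
    for name, values in (("A", group_a), ("B", group_b)):
        print(name, statistics.mean(values), statistics.stdev(values))

    # Hypothesis test: is the difference in means larger than chance relabeling?
    print("approximate p-value:", permutation_test(group_a, group_b))
```

Recording the seed and the number of permutations alongside the result also supports the documentation-for-reproducibility component.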

Expected deliverables:

  1. Research paper draft.
  2. Research presentation.
  3. Research paper final.
  4. Peer-evaluation of group members.