AI chatbots in unsupervised assessment
The TU Delft Assessment Taskforce has created practical guidelines on how lecturers can deal with the influence of AI chatbots on unsupervised assessments and what mitigating measures can be taken. In June 2023, the Special Interest Group AI & Assessment updated the guidelines.
The possibilities of AI chatbots in assessment require further investigation. A group of AI experts carried out a risk analysis of AI chatbots on assessment (the Special Interest Group AI & Assessment, SIG AI&A). This page was last updated on 13 June 2023, based on input from the SIG AI&A.
About AI tools
AI tools like the chatbot ChatGPT can produce convincing answers to certain questions, depending on their nature. However, their output is not always reliable: it can contain convincingly presented factual errors (so-called hallucinations). Furthermore, their training data can be outdated. For example, ChatGPT currently uses training data up to September 2021, whereas others (such as Bing) do have access to recent data. It is also important to note that most chatbots do not list their sources (Bing Chat does, though).
On the positive side, chatbots can help with:
- Checking grammar, spelling, and references in a text
- Generating ideas by listing information from different sources in an accessible way
- Giving feedback
- Summarising or expanding texts and concepts
- Coding in a wide variety of computer languages
Use by lecturers for assessments: AI chatbots can help lecturers create assessments (including different versions of an assignment), answer models, and rubrics.
The following assumptions have been used as the basis of the practical guidelines in the how-to section:
1. Students’ use
We are assuming that AI tools are used by our students and graduates, especially because these services are currently free of charge.
2. Academic integrity
We assume that students and employees act according to principles of academic integrity, as formulated in the Code of Conduct. To foster common understanding and clarify expectations, discussions with students about integrity in the context of AI tools use are recommended.
3. Quality requirements for assessment
This is how the quality requirements for assessment might be impacted by AI tools:
- Validity & reliability:
  - The ability to use AI tools may influence the validity of the grade, because the grade may be a poorer representation of how well students master the learning objectives.
  - Allowing the use of AI tools while changing the assessment criteria (or their weight) but not (yet) the learning objectives may diminish the validity.
  - If students use AI tools and the teaching staff does not anticipate this, the students' grades may increase. On the other hand, uncertainty about the use of AI tools may lead examiners to compensate in their assessment for (unfounded) suspicions about the use of AI tools, making the assessment inconsistent and therefore less reliable.
- Feasibility for students: Mitigating measures could increase the number of assessments and therefore increase the study load.
- Feasibility for teaching staff: Extra assessments (see the previous point) will also increase the workload for teaching staff.
- Transparency: Teaching staff may forget to communicate some of their changed expectations to students.
4. AI chatbot detection
The reliability and validity of AI chatbot detectors are currently unknown.
5. Definition of fraud
The definition of fraud is (source: model R&G, article 7.1):
“Fraud is taken to mean any act or omission by a student that makes it fully or partially impossible to properly assess the knowledge, insight and skill of that student or another student.”
6. Referencing the use of AI tools
The use of AI chatbots (and of tools in general) should be acknowledged and properly referenced, to ensure the distinction between the students' original ideas and those provided by AI, and to check whether the student critically checked the AI-generated output. However, this is challenging, as AI tools are expected to be used in an organic and evolving manner.
7. Accessibility and costs
Currently, most chatbots are still free of charge, which creates a low threshold for students to use them. In the (near) future, it is likely that users will need to pay a fee. This could mean that higher education institutions need to accommodate these AI tools when they are actively used in our education, so that all students have equal access to this type of tooling.
8. Security & privacy
As AI tools use user input to train future versions, their use can have consequences for the privacy and intellectual property of information that is fed to the AI tool. Almost every online tool requires the use of personal data.
ChatGPT-specific information: OpenAI is the company that offers the ChatGPT service. Users need to sign up for an account in order to use https://chat.openai.com. Besides user account information, OpenAI also processes the following personal information:
- User content: when you use ChatGPT, OpenAI will collect personal information that is included in the input, file uploads, or feedback that you provide to ChatGPT during interactions
- Browser user agents
- Operating System and device information
- Tracking Identifiers
The technology OpenAI uses requires sharing personal data with third parties, and that data is stored on servers in the USA. At this moment it is unclear with whom personal data is shared and which parties are responsible for the use and protection of your personal data. The security and privacy team will provide an update when new information becomes available.
9. Ethical issues
In addition, ethical questions are arising regarding the current and future influence of AI chatbots on truth finding and society as a whole, as well as regarding the power of their owners (big tech companies) and the impact of the technology on vulnerable communities (exploited labour) and the environment.
10. Rapid evolution of AI tools
Many of the current shortcomings will be (partially) solved in the next versions of AI tools. It is therefore important to focus on the possibilities rather than on the current shortcomings, because the latter will change. However, we should consider the more static risks of these technologies, which are unlikely to change. In other words, we should distinguish between shortcomings and risks.
How to assess assignments and projects
Invigilated exams versus assignments and projects
During classical written exams and digital exams in the TUD secure exam environment, students do not have access to the internet and therefore cannot access online AI tools. The same holds for oral exams held on campus.
On the other hand, if students work on assignments (exam-like or other) outside an exam hall and without invigilators (Dutch: surveillanten), the use of AI tools cannot be prevented.
Advice for fraud prevention in (non-invigilated) assessment
Feed your assignments to chatbots and study their output. How would you assess the output using your answer model or rubric? You can use this information to get a feel for whether students used AI tools in their work.
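One low-effort way to run this check is to script it, so that every assignment question is submitted once and the answers are collected in one place for grading against your answer model or rubric. The sketch below is a minimal illustration using the official `openai` Python client; the model name, the file name `assignment_questions.txt`, and the one-question-per-line format are assumptions for the example, not part of these guidelines.

```python
"""Minimal sketch: feed assignment questions to a chatbot and collect the
answers for review against your answer model or rubric.

Assumptions (not part of the guidelines): the `openai` package is installed,
an OPENAI_API_KEY environment variable is set, questions live in a plain-text
file with one question per line, and the model name is illustrative.
"""

def load_questions(path: str) -> list[str]:
    # Read one question per line; skip blank lines.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def ask_chatbot(question: str, model: str = "gpt-3.5-turbo") -> str:
    # Call via the official OpenAI client; requires network access and a key.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    import os
    if os.path.exists("assignment_questions.txt"):
        for question in load_questions("assignment_questions.txt"):
            print("Q:", question)
            print("A:", ask_chatbot(question))
            print("-" * 60)
```

Grading the collected answers with your own rubric gives you a baseline of what unassisted chatbot output looks like for your specific assignment.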
Discuss with your students the possibilities and limitations of using AI tools in unsupervised assignments. Let students use the AI tool and have them reflect on the answer provided. Train students not to blindly trust the answers of AI tools, even for questions that are not too difficult and mostly require factual knowledge. Students need to internalise that they must double-check all output of AI tools, to prevent them from learning incorrect facts and reasoning.
TU Delft recognises the value of AI tools, but sharing data is never without risk. If you choose to use AI tools like ChatGPT or AI plugins, we recommend you take the following recommendations to heart:
- Reveal nothing: Do not share any personal data, internal information or (highly) confidential information during your interactions with the AI tool.
- Private/incognito window: Use ChatGPT while browsing in a private or incognito window.
- Password: As long as AI tools do not offer single sign-on, account and password management is up to the individual user. Do not reuse passwords; preferably use a password manager or another method to create strong passwords, and change your password regularly.
- AI plugin awareness: The new plugin functionality of, for example, ChatGPT offers the possibility to include external sources. This makes it easier to share data. Keep in mind that the above recommendations also apply to these plugins.
If you consider the use of AI bots in your course detrimental to achieving the learning objectives, clearly state your reasons to your students. Make sure they understand that they might fail the summative assessment, during which they will not have access to these AI tools. Refer students back to the definition of fraud and to the TU Delft code of conduct.
Inform your students on how you expect them to correctly attribute the use of AI tools.
- Reflection: Have them write a short section on how they used chatbots and in what ways it was and was not helpful, and what they learned.
- Coding: Give instructions on using AI tools for developing software code, and on how to acknowledge their use.
- Have sufficient feedback moments during the course and ask students to reflect on how they processed the feedback. If possible, do this in a discussion.
- Regularly check the progress of individual students during their projects/assignments (if feasible). If you turn this into a supervision/feedback moment, it also supports learning and confidence building. While students work on the deliverable of a project/assignment, check whether they are all contributing and learning, for example through brief oral discussions of the product they are working on (e.g. after finishing a specific (small) step in a project or code assignment).
- Shift assessment criteria towards the process instead of the deliverable. Make sure that the assessment is valid and transparent for students.
- Version control: Track the progress of students through version control, e.g. in Word or GitLab. Are they processing feedback proactively?
Take fraud detection measures and report suspicions of fraud to your Board of Examiners:
- Oral authenticity check: Do an oral authenticity check to verify whether it is likely that the student produced the text themselves. This should either be a random check or be based on justifiable parameters, to prevent bias.
- Check the transfer of skills & knowledge: Consider adding a written exam to a project in which students have to answer questions on a case that is similar to their project. That way, you can test the students’ ability to transfer their knowledge to another situation. Additionally, this aids retention of knowledge & skills, especially if you discuss the exam in class. Carefully consider the timing and the weight of the exam (consider making the exam pass/fail) to prevent students from focusing on the exam instead of on their project. Adding assessments adds to the study load and is not permitted during the academic year without the permission of the Board of Examiners (and the programme director).
Rethink your course’s assessment plan. If necessary, adjust the learning objectives. This doesn’t necessarily mean that the taxonomy level should be increased, since this could lead to an increase in the difficulty and study load of the course. Consider the relation to other courses in your decision.
Keep in mind that these changes require a study-guide adjustment and will therefore have to be approved by your programme director, the Board of Studies and the Faculty Student Council before the start of the academic year. Changes during the academic year can only occur in very special circumstances, after approval by the Board of Examiners.