Zuzana Fernandes
BASc Year 3

Language, Learning, and YouTube

Automating the CEFR Assessment of User-like Generated videos for Language Learning
Education

Summary

Methods
Thematic Analysis
Interviews
Data Science
Disciplinary perspectives
Tech and Ethics
Design Thinking

Language learning tools should prioritise the intertwined nature of languages and cultures, valuing learning from their respective communities. This project stems from the idea that YouTube videos can be valuable resources for such language learning. However, there's a gap in understanding which videos best meet the needs of learners at various proficiency levels. My goal was to start bridging this gap by interviewing language experts to identify video features that correlate with language proficiency and using machine learning to classify these videos.

Approach and Methodology

I started with a problem area, looked at what product could address it, and came up with an app idea. I then looked at what components would be needed to build that product: in my case, an algorithm that tells you who a video is suited for, e.g. an intermediate learner. From there, I knew I had to break this down into a simpler task, so I decided to build a classifier. Also, not knowing much about language teaching, I felt expert interviews were necessary. My data is a catalogue of rated language videos used by language learning apps such as Busuu.

These videos are not made by creators but are curated by a team of language experts, directors and actors. However, they are made to mimic creator content. The main contributor to the project was seeing the process as a whole, through a design thinking perspective in which I prioritised the user and the hypothetical people who would own the videos. This allowed me to focus on researching and building something useful for language teachers and learners, rather than building it for the sake of innovation or simply because it works.

Without synthesising the insights from experts, I would not have known about the database or how experts feel about AI/ML in language learning and, most importantly, I would not have been able to extract the relevant data for the model.

Beyond Outcomes

Want to learn more about this project?

Here is some student work from their formal assignments. Please note it may contain errors or unfinished elements. It is shared to offer insights into our programme and build a knowledge exchange community.

About me

I’m interested in technology, in making an impact, and hopefully in turning this project into something. I’m currently a software graduate and apprentice at TUI. In everyday life, I love climbing, chilling with my friends and cat, and trying new hobbies (my next goal is learning to dive).
