MATHEMATICS (TURKISH, PHD)
PhD TR-NQF-HE: Level 8 QF-EHEA: Third Cycle EQF-LLL: Level 8

Course Introduction and Application Information

Course Code   Course Name   Semester   Theoretical   Practical   Credit   ECTS
CMP4507       Text Mining   Fall       3             0           3        6
This course is offered subject to approval by the Department at the beginning of each semester.

Basic information

Language of instruction:
Type of course: Departmental Elective
Course Level:
Mode of Delivery: Face to face
Course Coordinator: Prof. Dr. ÇAĞATAY ÇATAL
Course Objectives: The volume of textual data in articles, blogs, tweets, news, publications, and books keeps growing. Working with this data and discovering knowledge in it is a difficult task that draws on techniques from linguistics, machine learning, deep learning, and natural language processing. The aim of this course is to teach the application of machine learning to textual documents so that textual data can be analyzed quantitatively. The course focuses on three issues that are central to working with text in general: cleaning textual data, representing it, and generating text for different problems.

Learning Outcomes

Students who successfully complete this course will be able to:
1) Describe the methods for the preparation of textual data
2) Understand the basic methods of data representation
3) Apply the required techniques for text classification problems
4) Understand the methods for language modeling
5) Understand how to generate a textual description of an image
6) Describe the methods required for machine translation from one language to another

Course Content

1) Introduction to Text Mining and related topics (natural language processing, machine learning, deep learning, opportunities offered by deep learning)
2) Explanation of data preparation methods (manual text cleaning, cleaning with NLTK, data preparation with scikit-learn, data preparation with Keras)
3) Explanation of data representation models (Bag-of-words model, preparation of movie review data for sentiment classification problem)
4) Word embeddings used in data representation
5) Explanation of the methods that can be applied in the text classification problem
6) Midterm
7) Explanation of character-based and word-based language models
8) Explanation of methods for generating a textual description of an image (image captioning)
9) Machine translation from one language to another
10) Practical Application: Sentiment analysis with a neural-network-based bag-of-words model
11) Practical Application: Sentiment analysis with a word embedding and CNN model; sentiment analysis with an n-gram-based CNN model
12) Practical Application: Designing a neural-network-based language model for text generation
13) Practical Application: Designing a neural-network-based image captioning model
14) Practical Application: Designing a neural-network-based machine translation model
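
As a rough illustration of the data preparation and bag-of-words topics listed above, the following is a minimal sketch in Python using only the standard library. It is not taken from the course materials: the function names (clean_text, build_vocabulary, bag_of_words), the cleaning rules, and the two sample reviews are all assumptions made for this example. The course itself lists NLTK, scikit-learn, and Keras for these steps.

```python
import re
from collections import Counter

def clean_text(text):
    """Lowercase the text, keep alphabetic tokens only, drop one-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [tok for tok in tokens if len(tok) > 1]

def build_vocabulary(documents):
    """Return the sorted list of distinct tokens found across all documents."""
    return sorted({tok for doc in documents for tok in clean_text(doc)})

def bag_of_words(text, vocabulary):
    """Encode text as a count vector with one entry per vocabulary word."""
    counts = Counter(clean_text(text))
    return [counts[word] for word in vocabulary]

# Hypothetical sample data, invented for this sketch.
reviews = ["A great, great movie!", "The plot was not great."]
vocab = build_vocabulary(reviews)
vectors = [bag_of_words(review, vocab) for review in reviews]

print(vocab)
print(vectors)
```

Each review becomes a fixed-length vector of word counts, which is the kind of input a simple sentiment classifier can consume. Word order is discarded, which is the main limitation that word embeddings and n-gram models address.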

Weekly Detailed Course Contents

Week Subject Related Preparation
1) Introduction to Text Mining and related topics (natural language processing, machine learning, deep learning, opportunities offered by deep learning)
2) Explanation of data preparation methods (manual text cleaning, cleaning with NLTK, data preparation with scikit-learn, data preparation with Keras)
3) Explanation of data representation models (Bag-of-words model, preparation of movie review data for sentiment classification problem)
4) Word embeddings used in data representation
5) Explanation of the methods that can be applied in the text classification problem
6) Explanation of character-based and word-based language models
7) Explanation of character-based and word-based language models
8) Review for the midterm exam
9) Explanation of methods for generating a textual description of an image (image captioning)
10) Explanation of methods for generating a textual description of an image (image captioning)
11) Machine translation from one language to another
12) Machine translation from one language to another
13) Practical applications
14) Practical applications
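
To give a feel for the character-based language models covered in weeks 6 and 7, here is a small count-based sketch in Python. It is a deliberately simpler stand-in for the neural network models named in the course content, not the method taught in the course: it only counts which character follows each fixed-length context. The names train_char_model and generate, the order parameter, and the toy corpus are assumptions made for this example.

```python
from collections import Counter, defaultdict

def train_char_model(text, order=2):
    """Count which character follows each context of `order` characters."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model

def generate(model, seed, length, order=2):
    """Greedy generation: repeatedly append the most frequent next character."""
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:
            break  # unseen context, nothing to predict
        out += followers.most_common(1)[0][0]
    return out

# Hypothetical toy corpus, invented for this sketch.
corpus = "abcabcabd"
model = train_char_model(corpus, order=2)
print(generate(model, "ab", 4))
```

A neural language model replaces the count table with a learned function from context to next-character probabilities, which lets it generalize to contexts it has never seen.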

Sources

Course Notes: Brownlee, J. (2017). Deep Learning for Natural Language Processing: Develop Deep Learning Models for your Natural Language Problems. Machine Learning Mastery.
References: Ignatow, G., & Mihalcea, R. (2017). An introduction to text mining: Research design, data collection, and analysis. Sage Publications.

Evaluation System

Semester Requirements                        Number of Activities   Level of Contribution
Attendance                                   42                     0%
Laboratory                                                          0%
Application                                                         0%
Field Work                                                          0%
Special Course Internship (Work Placement)                          0%
Quizzes                                                             0%
Homework Assignments                                                0%
Presentation                                                        0%
Project                                      57                     40%
Seminar                                                             0%
Midterms                                     15                     20%
Preliminary Jury                                                    0%
Final                                        20                     40%
Paper Submission                                                    0%
Jury                                                                0%
Make-up Exam (Bütünleme)                                            0%
Total                                                               100%

PERCENTAGE OF SEMESTER WORK                                         20%
PERCENTAGE OF FINAL WORK                                            80%
Total                                                               100%

Contribution of Learning Outcomes to Programme Outcomes

No Effect, 1 Lowest, 2 Low, 3 Average, 4 High, 5 Highest
           
Program Outcomes Level of Contribution