Master's degree program
Machine Learning and Data Mining

QUALIFICATION

  • Scientific and pedagogical direction - Master of Engineering Sciences

MODEL OF GRADUATING STUDENT

ON 1 Use the conceptual apparatus, methods, techniques and technologies for developing inductive learning software based on the analysis and synthesis of information data flows that serve as precedents for the problem being solved. Such tasks are characteristic of the banking sector, online trading, IoT, social networks, measuring devices of complex technical objects (TO), and data center (DC) servers;
ON 2 Conduct comparative regression, comparative probabilistic, systemic and structural analysis for modeling and formalizing large information flows of the Internet space. Use data mining and information extraction approaches as an alternative to these statistical methods;
ON 3 Solve network technical, economic and marketing, banking, information and forecasting tasks based on knowledge bases accumulated by expert systems in data centers, structuring this information into a single, understandable and self-learning formalized mathematical model;
ON 4 Process data streams from servers, maintenance and Internet sources to build a set of situational objects and a set of possible responses (the reactions studied), depending on the causal and temporal development of the system. Be able to solve typical tasks using Google's DeepMind simulator;
ON 5 Correlate the methodological foundations of the analytical approaches of formal mathematics with the concepts of fuzzy logic and the search for implicit solutions by neural network algorithms based on the empirical formalization of solutions;
ON 6 Effectively develop self-learning systems that generalize the various information flows of DCs, Internet resources and the readings of numerous sensors of complex TO, in order to produce an adequate response to data falling outside the limits of the training set of situations;
ON 7 Create new knowledge bases and segments in DCs. Design a pilot Machine Learning project for maintenance and business processes, forming self-learning mathematical models for processing the large data flows of the DC of Al-Farabi KazNU;
ON 8 Create projects based on artificial neural networks for supervised deep learning, applying the methods of error correction, error backpropagation and reference factors;
ON 9 Develop pilot courses for training employees of business companies and conduct trainings on big data, machine learning and interface design. Be able to present the conceptual framework of ML / AI / Big Data and their areas of application transparently and clearly;
ON 10 Have the skills to use Machine Learning application packages for function approximation, handwriting recognition and technical diagnostics;
ON 11 Apply Machine Learning methods to study time series or signals, images or video sequences;
ON 12 Work with information from various literature sources and present it in various forms (messages, presentations and reports), taking into account the specifics of the audience, substantiating and competently presenting one's point of view on problematic issues. Work effectively in a team when identifying and solving research problems of the educational program.

Program passport

  • Speciality Name - Machine Learning and Data Mining
  • Speciality Code - 7M07115
  • Faculty - Information technology

DISCIPLINES

Big Data infrastructure
  • Number of credits - 9
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to create an infrastructure and means of integrating systems, applications and services for processing big data. Content of discipline: Technologies of distributed computing. Distributed file systems. Big data infrastructure design. Development and support of integration tools for systems, applications and services. Batch processing of big data. Stream processing of big data. Interactive big data processing. Big data placement optimization.

Computer models of computing
  • Number of credits - 6
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to apply the tools and methods needed to offer algorithmic solutions to real problems that have strict theoretical limits on time and space usage. Within the discipline the following aspects will be considered: asymptotic notation, recursion, and the divide-and-conquer paradigm.

Foreign Language (professional)
  • Number of credits - 6
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose is to acquire and improve competencies in line with international standards of foreign language education and to communicate in an intercultural, professional, and scientific environment. A master's student must integrate new information, understand the organization of languages, interact in society, and defend their point of view.

History and Philosophy of Science
  • Number of credits - 3
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The course builds knowledge of the history and theory of science; the laws of the development of science and the structure of scientific knowledge; science as a profession and social institution; the methods of conducting scientific research; and the role of science in the development of society.

Organization and Planning of Scientific Research (in English)
  • Number of credits - 6
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to train master's students in conducting scientific research, carrying out scientific and methodological work, supporting the socialization of young students, and participating in the corporate governance system of an organization of higher and postgraduate education (OHPE). Master's students learn to interact with OHPE stakeholders and participate in research projects.

Pedagogy of Higher education
  • Number of credits - 5
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose is to develop the capacity for pedagogical activity through knowledge of higher education didactics, theories of upbringing and education management, and analysis and self-assessment of teaching activities. The course covers the design of specialists' educational activity, implementation of the Bologna process, and the acquisition of lecturing and curatorial skills through TLA strategies.

Predictive analysis
  • Number of credits - 6
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to apply statistical methods and machine learning models for predictive data analysis. Course content: Approaches for predictive data analysis. Statistical methods for data analysis. Data processing. Machine learning for predictive analysis. Search for patterns and relationships in large data warehouses. Predictive modeling. Model evaluation and deployment. Visualization tools.

Psychology of Management
  • Number of credits - 3
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The course reveals the subject, the basic principles of management psychology, personality in managerial interactions, personal behavior management, psychology of managing group phenomena and processes, psychological characteristics of the leader's personality, individual management style, psychology of influence in management activities, conflict management.

Specialized big data technologies
  • Number of credits - 5
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to work with integration platform architectures, applications and services of high-load systems for storing and processing big data. Within the discipline the following aspects will be considered: Introduction to specialized big data technologies. The CLAVIRE 2.0 platform for processing extra-large data. ASPEN, a comprehensive evaluation platform for streaming information processing systems. An extended SparkStreaming platform with an integrated scheduling engine. Optimization of big data placement. Exarch semantic storage. Blockchain technology in data storage tasks.

Statistical data analysis methods
  • Number of credits - 5
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to build the ability to apply the concepts of statistical analysis and more advanced statistical modeling procedures using the Python programming language to interpret and analyze data. Within the discipline the following aspects will be considered: Statistical modeling methods. Correlation analysis. Regression analysis. Canonical analysis. Methods for comparing means. Frequency analysis. Crosstabulation (pairing). Hierarchical and multilevel models and Bayesian inference methods. Correspondence analysis. Cluster analysis. Discriminant analysis. Factor analysis. Classification trees. Ways to use Python as a tool, including the NumPy, Pandas, Statsmodels, Matplotlib, and Seaborn libraries. Creating visualizations and managing data in Python. Principal component analysis and classification. Multidimensional scaling.

Data for 2021-2024 years

DISCIPLINES

Applied Cluster analysis
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to form the ability to apply various data clustering algorithms to solve applied problems, visualize the formed clusters and analyze the results. Course content: Introduction to clustering. K-means method and modifications. Hierarchical clustering. Organization of clusters in the form of a hierarchical tree. Gaussian mixture model algorithm. Distance metrics used in clustering. Determining the appropriate number of clusters. Evaluation of clustering efficiency.

Applied Machine Learning
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to build the ability to use data manipulation tools to solve machine learning problems and to develop algorithms for data mining. Within the discipline the following aspects will be considered: Logical methods: decision trees and decision forests. Metric classification methods. Linear methods, stochastic gradient.

Cloud solutions for machine learning
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The goal of the discipline is to develop the ability to apply machine learning engineering and cloud computing infrastructure to create cloud-based predictive analytics applications. Course content: Fundamentals of cloud computing. Basic cloud computing infrastructure. Cloud virtualization. Containers and APIs. Cloud data engineering. Using AutoML. MLOps strategies and best practices for developing cloud solutions. Edge Machine Learning strategies. Using AI APIs.

Construction and analysis of algorithms
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to implement high-performance algorithms and data structures for fundamental computational problems in various fields. Within the discipline the following aspects will be considered: Basic algorithms: asymptotic notation, recursion, the divide-and-conquer paradigm, and basic data structures. Balanced binary trees, 2-3 trees, B-trees, structures for sets, hashing, text compression (Huffman encoding). Application of maximum flow algorithms. Randomized selection and sorting. Automata, string matching (Boyer-Moore algorithm, Knuth-Morris-Pratt algorithm), pattern matching. Complexity classes P and NP, NP-completeness, and some NP-complete problems. The strategy of parallel design. Distributed computing algorithms.

Data processing and analysis
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to build the ability to analyze and process data using MATLAB, to create applications for predictive modeling. Within the discipline the following aspects will be considered: The principles of data processing and analysis. Import data. Visualization and data filtering. Performing calculations.

Data visualization
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to build the ability to use methods and applications of pattern detection in data mining and to create and interpret visualizations. Within the discipline the following aspects will be considered: History of the development of visualization. Basic concepts and directions in visualization. Data: data abstractions and their types. Alphabet render. Sketching and designing visualization solutions. Human visual perception. Awareness of images. Perception of color. Tasks: abstractions of tasks and methods for their formulation. Human-computer interaction. Data visualization. Networks and graphs. Visualization of text and documents. Visualization of spatial data. Volumes, flows, maps. Social visualization. Analysis of social data. Visualization for society. Visualization and art. Visualization as an art.

Deep learning
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The subject covers the following aspects: Architecture of deep neural networks. Tuning hyperparameters and deep learning frameworks. Convolutional neural networks, their applications. Classification of objects and similar methods. Recurrent neural networks, their applications. Parallel deep learning algorithms. Acceleration of neural network training.

Machine Learning Methods for industrial data processing
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to evaluate the features of industrial data and to develop machine learning methods for analyzing and visualizing industrial data. Content of discipline: Industrial data. Industrial data platforms. Use of industrial data. Integration API. Machine learning methods for industrial data. Supervised learning. Unsupervised learning. Reinforcement learning. Condition monitoring and digital twins.

Methods and models of multivariate data analysis
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to build the ability to apply modern multivariate statistical methods and modern software systems in solving problems of analysis of various processes and phenomena. Within the discipline the following aspects will be considered: Probabilistic models for univariate and multivariate random variables.

Neural network architecture for deep learning
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to perform visual design of neural networks, use a graphical interface to transfer layers of neural architecture, and configure and deploy models using popular deep learning environments. Within the discipline the following aspects will be considered: Introduction to deep learning. Practical aspects of deep learning. Deep feedforward networks. Recurrent and recursive networks. Advanced algorithms for deep neural networks. Configuring hyperparameters. Regularization and optimization. Batch normalization and framework programming. Convolutional neural networks. Optimization of hyperparameters. Visual design of neural networks. Secure cloud infrastructure.

Neural networks
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to build, train and apply deep neural networks to solve data analysis problems that require large computing resources. Course content: Architecture of neural networks. Shallow neural networks. Key computations at the heart of deep learning. Building and training deep neural networks. Hyperparameter tuning, regularization and optimization. Evaluation of training quality.

Programming for data science
  • Type of control - [RK1+MT+RK2+Exam] (100)
  • Description - The purpose of the discipline is to develop the ability to apply methods for obtaining, transforming and preprocessing data, to develop software for data analysis and visualization. Content of discipline: Transformation of raw data into a structured form. Development of a data analysis pipeline. Data processing. Data visualization. Interaction with machine learning frameworks. Development of software for data analysis and visualization.

INTERNSHIPS

Pedagogical
  • Type of control - Internship defense
  • Description - The purpose of the discipline is to develop the ability to carry out educational activities in universities, to design the educational process and to conduct certain types of training sessions using innovative educational technologies.

Research
  • Type of control - Internship defense
  • Description - The purpose of the practice is to gain experience in the study of a current scientific problem, to expand the professional knowledge gained in the learning process, and to develop practical skills for conducting independent scientific work. The practice is aimed at developing the skills of research, analysis and application of economic knowledge.
