Profile - Faculty of Science

Course Name	Units	Term
	3	First semester, academic year 2025-2026
sds	4	First semester, academic year 2025-2026
Statistical Inference	4	First semester, academic year 2025-2026
Probability 1	3	First semester, academic year 2025-2026
Statistical Quality Control	3	First semester, academic year 2025-2026

Master Theses

Evaluation of shear wave velocity in fine-grained soils based on mechanical parameters and standard penetration coefficient of soil
Shima Azizi 2026
Semi-supervised Nonparametric Bayesian Clustering in the SHM structure.
Shahnaz Rahimi chegeni 2026
Prediction of Alzheimer's disease using federated learning.
Sharareh sadat Alizadeh 2026
Designing an electronic marketing model in the context of social networks with the aim of increasing customer engagement (Instagram social network case study).
Fatemeh Biabani 2025
This study is focused on designing an electronic
Reconstruction order statistics and missing data in extreme value distributions
Sasan Akbari 2025
... obtained are compared with other methods and evaluated.
Forecasting time series with neural networks
Hadis Heidaryan 2024
A time series is a set of observations recorded over time, such as the price of a share on the stock market or the amount of rainfall in an area. One of the most important goals of time series analysis is to predict future values. Several statistical methods have been introduced for this purpose, including autoregressive integrated moving average (ARIMA) models, seasonal autoregressive integrated moving average (SARIMA) models, and wavelet analysis methods. Alongside statistical methods, neural networks are a powerful tool for time series forecasting: because they can model complex relationships and patterns in data, forecasting with neural networks has attracted researchers in various fields and has become a popular research topic. Convolutional neural networks are known as an effective method for analyzing and predicting repetitive patterns in temporal data. These networks are designed to recognize different patterns in temporal data and can use these patterns to make accurate predictions of future values. Long short-term memory (LSTM) networks and gated recurrent unit (GRU) networks are two types of recurrent neural networks used for time series forecasting. They are useful for this task because of their ability to retain memory over time: using a memory unit, they keep previous information and make the next prediction according to it. In this thesis, time series forecasting with convolutional, LSTM and GRU neural networks is investigated. The prediction accuracy of these networks is compared, both with one another and with statistical methods such as SARIMA.
Keywords: Forecasting time series, Neural networks, Recurrent neural networks, Long short-term memory, Gated recurrent unit, Convolutional neural networks.
Study of Numerical Methods to Find A-optimal Designs
Narges Nazari 2024
Presenting a model based on data mining and fuzzy inference system for the diagnosis of heart disease (case study: Sahne Moaeven Hospital)
Ehsan Naserpour 2024
This research investigated and implemented a combined model of fuzzy logic and decision tree to diagnose heart disease using medical data of 920 patients (UCI database). After the data were reviewed and prepared, the records of 304 patients were selected as complete data and were quantified and normalized with the Min-Max method. The local database of Sahne Moaeven Hospital was also used to complete the model. The decision tree model was extracted with an accuracy of 78.30% and an area under the ROC curve of 0.798, but the classification error was reported as 22.36%. Based on the decision tree, the fuzzy inference system was implemented with 9 inputs and 12 fuzzy rules. The combined decision tree-fuzzy inference system model was then implemented with 70 hospital records and evaluated with 35 new records. The results showed that the model achieved an overall classification accuracy of 91.43%. The sensitivity of the model was 89.47% and 93.75% for the diagnosis of healthy people and heart patients, respectively. Comparison with other methods showed that the accuracy of the proposed model is between 1% and 12% higher than that of the support vector machine, artificial neural network and k-nearest neighbor methods. The results also showed that using local data tailored to the specific characteristics of each region can improve the accuracy of machine learning models in medical applications. The combined decision tree-fuzzy model, with its high accuracy and simple implementation, was introduced as an effective tool for diagnosing heart disease.
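The Min-Max normalization step mentioned in this abstract can be illustrated with a minimal sketch. The function name `min_max_scale` and the sample readings are hypothetical and are not taken from the thesis; the sketch only shows the standard rescaling of one feature to the interval [0, 1].

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical resting blood pressure readings (illustrative only).
pressures = [110, 120, 130, 150]
scaled = min_max_scale(pressures)
print(scaled)
```

In practice the minimum and maximum would be taken from the training data and reused unchanged on new patients, so that evaluation records are scaled the same way.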
Optimization of the process of heavy metal ions removal from wastewater by using D-optimal design and genetic algorithm
Mahya Arjmandnia 2024
In this research, heavy metal removal from wastewater was investigated using a combination of electro-Fenton and membrane filtration methods. The methods were integrated with the aim of increasing the purification efficiency, and the effect of the operating parameters (reaction time, current density, solution acidity (pH), volume ratio of hydrogen peroxide to wastewater, molar ratio of hydrogen peroxide to ferrous ion (Fe2+), nanoparticle concentration and concentration of the input feed) on the removal percentage of this pollutant was evaluated. To optimize the operating parameters with the aim of maximizing the removal percentage, two optimization methods were used: the D-optimal criterion, which is a real-valued function of the Fisher information matrix, and a combined artificial neural network-genetic algorithm method. The two methods are compared statistically with the aim of finding the objective function with the lowest mean squared error.
A review on Cox regression and support vector machine algorithm for survival analysis and comparing them in a case study
HUMAM FAEQ HUSSEIN HUSSEIN 2023
One topic of interest in statistics is the time until a particular event occurs, and a subfield called survival analysis has developed around it. In general, survival analysis is a set of statistical methods for analyzing data in which the outcome variable is the time until the occurrence of a specific event. The time variable is usually called the survival time, because it indicates how long a person "survived" during the follow-up period. Because the events of interest in this type of analysis are usually death, illness or other individual experiences, the event is usually called a failure. However, failure does not necessarily have a negative meaning; for example, the event can be the birth of the first child after marriage (with marriage as the starting point of the study). Many survival analyses face a fundamental problem called censoring, which occurs when we have partial information about the survival time but do not know it exactly. With the expansion of science and progress in data analysis, methods for survival data are also advancing, and their application to medical data and other fields is increasing. One of the common statistical methods for analyzing survival data is Cox proportional hazards regression. This model does not perform optimally in high-dimensional problems. An alternative introduced in this thesis is the support vector machine, a machine learning technique that works well with high-dimensional problems and does not require the usual regression assumptions of classical statistics. The original version of the support vector machine cannot handle survival data because of the presence of censored observations. A naive idea is to exclude the censored samples from the study, but a large amount of information would then be lost. In this thesis, by changing the constraints of the support vector machine optimization problem, we arrive at a version of the method that is suitable for survival data and uses the information in the censored observations. This version is called the survival support vector machine. Finally, in a case study, we apply the survival support vector machine and compare it with classical statistical methods such as Cox proportional hazards regression.
Prediction Based on Combination of Mixed Models
Zahra Sohaylikia 2023
In linear models, when the number of independent variables is large, it is common to use methods such as stepwise, forward and backward selection to find optimal models among the possible models. These methods, however, do not find the best model in an absolute sense and produce different results from case to case. A model found with them may be the best in terms of the mean squared error, or it may be the best in terms of the coefficient of determination. Moreover, using these methods requires removing a number of independent variables from the analysis, which can be misleading or at least limiting in some applications. When the researcher's goal is to predict new observations, models obtained from these variable elimination methods can lose even more information and accuracy. Our goal in this thesis is to produce forecasts from a combination of simple models, chosen so that the combination is the most reliable (in various senses, such as minimum MSE, maximum information, etc.), instead of using variable elimination approaches.
Spatial modeling of unemployment rate in counties of Iran based on data from Population and Housing Census 2016
Hamed Seifi 2023
Unemployment is one of the most important issues in all countries around the world. An increase in the number of unemployed people in any society causes many problems, so a deep and appropriate understanding of the factors affecting unemployment is needed to reduce it. In this thesis, we gathered data from the Population and Housing Census 2016 from the Statistical Center of Iran. These data categorize the active and unemployed population aged 15 years or above by gender and level of education in the counties of Iran. We prepared these data according to our purpose, which is spatial modeling of the number of unemployed people with gender and education as covariates. To achieve this goal, we use a Bayesian approach and a method called “integrated nested Laplace approximation”, or INLA for short. For many years, Bayesian inference has relied on Markov chain Monte Carlo (MCMC) methods. MCMC focuses on estimating the joint posterior distribution of the model parameters and is therefore computationally expensive in high-dimensional spaces. INLA instead focuses on estimating marginal posterior distributions and, given the tremendous developments in computing in recent years, runs more quickly. In addition, INLA is formulated for models with a GMRF structure, which brings advantages that reduce model fitting time. Finally, after appropriate modeling of the data, we interpret the effects of gender and education, as well as the spatial effects of the counties of Iran, on the number of unemployed people.
Predicting the academic achievement of Razi University students using data mining techniques
Elnaz Kasani 2023
One of the important factors in the study of education is predicting academic achievement, and data mining techniques are one of the modern approaches to this prediction. In this thesis, data mining techniques are studied in two groups: simple methods, including decision tree, random forest and $K$-nearest neighbors, and more complex methods, including support vector machine and neural network. The accuracy of these methods is examined and compared on data sets of Razi University students at the associate and bachelor's levels from 1375 to 1401 (Solar Hijri calendar). Among the methods examined, random forest gave the highest prediction accuracy, but it has a high computational cost in terms of response speed. The $K$-nearest neighbors method is very close to random forest in accuracy, with the difference that little time is needed to obtain the outputs.
Institutional investor Massive behavior and audit quality
2023
High-quality auditing plays a great role in reducing information asymmetry and, as a result, in reducing agency costs. In other words, by creating suitable conditions for providing quality information to users, auditors provide the necessary basis for controlling the optimistic behavior of managers. A strong regulatory mechanism therefore prevents managers from using cash resources in low-return investments. The purpose of this investigation is to examine the relationship between the collective behavior of institutional investors and audit quality. The statistical population consists of companies listed on the Baghdad Stock Exchange, and the studied sample includes 80 listed companies during the years 2017 to 2021. The research method is descriptive and correlational in terms of the relationship between variables, and applied in terms of purpose. Regression with panel data and the fixed effects model were used to process the data and test the hypotheses. The results showed that the collective behavior of institutional investors has a negative and significant relationship with the auditor's tenure, with the size of the auditing firm, and with the auditor's expertise in the industry.
Keywords: collective behavior of institutional investors, auditor's expertise in the industry, auditor's tenure
A Review of data mining classification algorithms and their comparison on a case study
Raziye Tavangar 2022
Factors affecting poverty measurement indicators and choosing the best model
Maryam Amiri 2021
Study of the penalized Weibull regression for high dimensional features.
Ensieh Ghobadiasl 2021
In statistics, regression means returning to a mean or average value. Statisticians have always examined the relationship between variables, and regression models are among the most common models fitted to data. Regression analysis is a statistical method for analyzing and modeling multivariate data. A special type of regression model is the high-dimensional regression model, in which the number of independent variables is greater than the sample size, that is, $p > n$. In these models the matrix $X$ does not have full column rank, so the least squares estimate $\hat{\beta}_{OLS}$ is not unique and the estimated parameters do not give a good predictor. For this reason, penalized regression or shrinkage methods, such as ridge, lasso, group lasso and elastic net, have been used in recent years. In this thesis the convex lasso penalty is used. The lasso penalty is defined through the L1 norm of the parameters, where $\beta$ is the vector of regression coefficients and $\lambda$ is the penalty parameter. A larger value of $\lambda$ places a higher penalty on the regression coefficients, so fewer variables are included in the model, and conversely. Commonly, a sequence of $\lambda$ values is generated and the variables selected for each value are determined. A value of $\lambda$ is then chosen by k-fold cross-validation, and the corresponding set of predictors is included in the model. For the simulation results in this thesis, InfTh and BIC have been used, and all of these issues are discussed using the R software.
Keywords: Weibull regression, Penalty methods, Shrinkage methods, Lasso, Criteria of information theory, Bayesian information criteria
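As a hedged illustration of the penalty described in this abstract, a lasso-penalized estimate can be written in the following general form. Here $\beta$ denotes the vector of regression coefficients and $\lambda \ge 0$ the penalty parameter, as in the abstract; the symbol $\ell(\beta)$ for the log-likelihood of the Weibull regression model and the $1/n$ scaling are assumptions following a common convention, not notation confirmed by the thesis.

$$
\hat{\beta}(\lambda) = \arg\min_{\beta \in \mathbb{R}^p} \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \lVert \beta \rVert_1 \right\},
\qquad
\lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .
$$

With $\lambda = 0$ the penalty vanishes and the estimate reduces to the unpenalized maximum likelihood estimate (which need not be unique when $p > n$); as $\lambda$ grows, more coefficients are set exactly to zero, which is why fewer variables remain in the model.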
On some shock models using phase-type distributions
MAREAM MORADY 2021
Using Random Forest Algorithm with Multiple Classification, to improve Customer Relationship Management in the banking industry
Zaynab Taheri kal koshavandi 2021
In classification problems, data are divided into several specific classes according to what they have in common. Classification is an important tool for analyzing statistical problems. There are many methods for classifying data, which are divided into two general categories, supervised and unsupervised, according to whether the response variable is known or unknown, respectively. These include classical regression methods such as regression with binary data (logistic, probit, etc.). Classification methods based on machine learning, such as decision tree, random forest, etc., are suitable alternatives to the classical regression methods. In this research we examine these methods and finally apply them to a banking data set from a telephone marketing campaign. The different methods are compared using the accuracy criterion and the ROC curve.
Comparing some different risk measures by using a simulation method
Fateme Bagheri 2021
The intuition of risk is based on two main concepts: the possibility of a negative outcome, i.e. a loss, and the variability around an expected result, i.e. a deviation. Since the modern theory of finance was accepted, the role of risk measurement has attracted attention. Initially, risk was predominantly measured by a dispersion measure, such as variance, which contemplates the second pillar of the intuition. More recently, the occurrence of critical events has turned attention to tail-risk measurement, as in the well-known Value at Risk (VaR) and Expected Shortfall (ES) measures, which contemplate the first pillar. In this thesis, a risk measure is considered that contemplates both pillars of the intuition of risk, the possibility of negative results and the variability around an expected result, in a single measure. The resulting composition, based on properties of the two components, is a coherent risk measure. Similar results for convex and co-monotone risk measures are presented. Then, eleven well-known risk measures from different classes are considered. Finally, the empirical values of the corresponding loss, deviation and loss-deviation risk measures are obtained and compared using Monte Carlo simulation and real data.
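A minimal sketch of how empirical tail-risk and dispersion measures can be obtained from simulated losses is given here. It is not the procedure of the thesis: the function names, the standard normal loss model, the sample size and the 95% level are all assumptions chosen for illustration, and the composed loss-deviation measure studied in the thesis is not reproduced.

```python
import math
import random
import statistics


def value_at_risk(losses, alpha):
    """Empirical VaR: the alpha-quantile of the loss sample (an order statistic)."""
    ordered = sorted(losses)
    # Smallest order statistic that covers a fraction alpha of the sample.
    k = max(0, min(len(ordered) - 1, math.ceil(alpha * len(ordered)) - 1))
    return ordered[k]


def expected_shortfall(losses, alpha):
    """Empirical ES: the mean of the losses at or beyond the empirical VaR."""
    var = value_at_risk(losses, alpha)
    tail = [x for x in losses if x >= var]
    return statistics.fmean(tail)


# Monte Carlo sample of losses from an assumed standard normal model.
rng = random.Random(2021)
simulated_losses = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

var_95 = value_at_risk(simulated_losses, 0.95)      # loss measure (tail quantile)
es_95 = expected_shortfall(simulated_losses, 0.95)  # loss measure (tail mean)
deviation = statistics.stdev(simulated_losses)      # deviation measure
print(var_95, es_95, deviation)
```

Positive values are treated as losses in this sketch, so ES is never smaller than VaR at the same level; with the opposite sign convention the tail would be taken on the other side.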
Approximating the Likelihood in Approximate Bayesian Computation
Mitra Havasi 2021
A comparison of binary classification methods for diagnosis of type of cancerous mass (malignant or benign) in breast cancer data
Mohsen Haghdost 2020
Breast cancer is one of the most common cancers in women today. Although men also get this cancer, the risk is more serious in women. A misdiagnosis of cancer can sometimes lead to the death of a patient, and this should be considered a serious risk. Breast cancer tumors are of two types, malignant and benign. Identifying the correct type of tumor prevents unnecessary treatment and reduces mortality. The aim of this dissertation is to compare five classification methods (naive Bayes, support vector machine, artificial neural network, logistic regression and random forest) on breast cancer data for diagnosing benign and malignant tumors, and to determine the best method according to binary evaluation criteria: accuracy, precision, sensitivity, specificity, F1 score and Matthews correlation coefficient. Accuracy is the main criterion for comparison; the other criteria are considered afterwards.
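The evaluation criteria listed in this abstract can all be computed from the four counts of a binary confusion matrix. The following sketch uses their standard definitions; the function name `binary_metrics` and the example counts are hypothetical and do not come from the dissertation, and the code assumes that no denominator is zero.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary classification criteria from confusion-matrix counts.

    tp/fn: malignant cases classified correctly/incorrectly,
    tn/fp: benign cases classified correctly/incorrectly.
    Assumes every denominator below is nonzero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for one classifier (illustrative only).
for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts symmetrically, which makes it informative when the two classes are unbalanced.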
Sampling techniques for analyzing big data in data mining
Zaenab Nazari 2020
In analyzing big data, computation time increases, so data mining algorithms cannot use all the data. Sampling from big data sets is therefore a good solution. In statistical studies of multivariate populations, obtaining information over the whole range of variation of the variables is very important. Since it is difficult or impossible to select all the data, the required information can be obtained by surveying a subpopulation as a sample. In such cases, an appropriate sample can be selected by the LPM2-kdtree method. Selection bias is also very important in big data analysis. In this thesis, a method that uses importance sampling to decrease this bias is explained. Finally, in a numerical study on two real populations, the spatial balance of LPM2-kdtree and the reduction in selection bias of the sampling design that uses importance sampling are evaluated.
Keywords: Big Data, Clustering, Data Mining, Inverse sampling, Knowledge Discovery, Non-probability sampling, Selection bias.
Estimation and Analysis of Urban Water Drinking Rate Function at Hamedan Water and Wastewater Company in 2019
Razieh Karami 2020
Sample Size Determination in Complex Surveys Sampling
Vahid Lanjabpour 2020
Simulation methods on the two-parameter Poisson-Dirichlet and the normalized inverse-Gaussian processes
SEYEDEHSHIVA MOUSAVI 2020
In this thesis, we develop simple yet efficient procedures for sampling approximations of the two-parameter Poisson-Dirichlet process and the normalized inverse-Gaussian process. We compare the efficiency of the new approximations with the corresponding stick-breaking approximations of these two processes and demonstrate a substantial improvement.
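For orientation, the stick-breaking approximation used as the baseline in this abstract can be sketched as follows for the two-parameter Poisson-Dirichlet process. This is only the standard truncated stick-breaking construction, not the new procedure developed in the thesis; the function name, the parameter names `discount` and `concentration`, the truncation rule (leftover mass assigned to the last atom) and the example values are assumptions.

```python
import random


def stick_breaking_weights(discount, concentration, n_atoms, rng):
    """Truncated stick-breaking weights for the two-parameter Poisson-Dirichlet process.

    Requires 0 <= discount < 1 and concentration > -discount.
    The k-th stick fraction is drawn as
    V_k ~ Beta(1 - discount, concentration + k * discount).
    """
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for k in range(1, n_atoms):
        v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Truncation choice (an assumption): the leftover mass goes to the last atom.
    weights.append(remaining)
    return weights


rng = random.Random(2020)
weights = stick_breaking_weights(discount=0.5, concentration=1.0, n_atoms=50, rng=rng)
print(len(weights), sum(weights))
```

Pairing each weight with an independent draw from the base distribution gives an approximate realization of the process; setting `discount=0` gives the stick-breaking representation of the Dirichlet process.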

Faculty members of the Faculty of Science

اسحاق الماسي

Assistant Professor / Science / Statistics

Update: 2026-06-03