Median estimation in the presence of non-response using randomised response technique

Parameter estimation is a long-established and widely accepted scientific approach. When the mean is distorted by extreme values, the median can be used instead. Because very little literature exists on median estimation in the presence of non-response using the Randomized Response Technique (RRT), the main objective of this paper is to develop a basic theoretical framework for that problem. We propose median estimators for sensitive variables that use auxiliary information under a randomized response model: a basic median estimator and product, ratio, exponential product, exponential ratio, and regression-cum-ratio estimators of the median under non-response. The optimum values of the constants, the bias, and the Mean Square Error (MSE) of the proposed estimators are derived using the well-known Taylor and exponential expansions. The performance of the estimators is evaluated through a numerical study of two populations, which shows that the regression-cum-ratio estimator is more efficient than the other estimators considered in this article.


Introduction
Parameter estimation has been established for a long time and is now a well-recognized scientific method worldwide. The mean is commonly used to measure central tendency for quantitative data. However, if the data contain large outliers, the median may be a more appropriate measure (Mohan & Su, 2022). The median is a popular and widely used measure of central tendency in statistics and research. It applies to both discrete and continuous data, although it is used primarily with continuous data. However, it may not always be the best choice for all statistical data and variables. Despite extensive research on the estimation of population quantities, variability, and other parameters, little attention has been given to accurate median estimation.
In many areas of research, researchers are often concerned with variables such as salary, expenses, tolls, spending, and manufacturing, which tend to have highly skewed distributions.
To address this issue, the median is often preferred over the mean, as it is less sensitive to extreme values, making it a more appropriate measure of location in survey sampling.
Numerous empirical studies have addressed median estimation under simple random sampling, with researchers investigating the use of auxiliary variables. It is also worth noting that when sensitive questions are asked, many people hesitate to answer honestly or do not respond at all. Such non-response introduces bias into the data.
In 1965, Warner introduced the Randomized Response Technique (RRT) to reduce response bias in surveys. The technique is used to obtain accurate estimates of sensitive variables. Various statisticians have used RRT models to estimate sensitive variables without taking auxiliary information into account. Other researchers have addressed this by proposing estimators that exploit such additional information. Another study used an auxiliary variable together with scrambled responses and proposed regression-type estimators (Gupta et al., 2012). Numerical results were used to assess the performance of the proposed estimator.
A research study was conducted using the additive model for randomized responses. Using RRT, Zamanzade and Al-Omar (2011) proposed an exponential estimator with either one or two auxiliary variables. The investigation analysed real data and demonstrated that the proposed estimators were more efficient than the existing estimators. Another study by Koyuncu et al. (2014) used RRT to propose exponential-type estimators with one and two auxiliary variables to improve the efficiency of mean estimation. The study obtained expressions for the MSE and bias of both estimators up to the first order of approximation. The results showed that the proposed estimator was more efficient than the existing one.
Researchers have conducted numerous investigations of such variables, but median estimation in the presence of non-response using RRT has not been explored well. This study aims to propose estimators of the population median that make use of auxiliary variables. The development of these estimators covers a range of estimator types for different situations under various sampling designs. These estimators will pave the way for new methods of estimation under non-response using the randomized response technique.

Literature
When the study variable is correlated with an auxiliary variable, using the auxiliary variable can improve the efficiency of estimators. The most common estimation methods that use auxiliary information are the classical methods. Laplace (1820) and Watson (1937) were two pioneers who used auxiliary information to estimate population and leaf area, respectively. A variable that covers a sensitive or private domain that a respondent does not want to disclose is known as a sensitive variable.
In recent years, various techniques have been proposed to estimate the mean of a finite population using auxiliary variables. Many previous studies proposed different estimators using the optional RRT (Kalucha et al., 2014; Gupta et al., 2016). Gupta et al. (2010) carried out theoretical and numerical comparisons with the basic mean estimator. Another study used stratification to propose estimators of a sensitive study variable using auxiliary information (Mushtaq et al., 2017). According to theoretical and simulation results, the proposed class of estimators is more efficient than the existing estimators of Sousa et al. (2014). Ozgul and Cingi (2017) developed an improved estimator of the mean of a sensitive study variable using a partial additive RRT model in the presence of two non-sensitive auxiliary variables. The MSE of the developed estimator was derived and compared with the estimators suggested by Sousa et al. (2014), Diana and Perri (2011), and Gupta et al. (2016). Khalil et al. (2018) proposed a generalized estimator for sensitive study variables using benchmark and ME under two different sampling techniques. They compared their proposed estimators with the mean and ratio estimators mathematically and numerically and found that the proposed estimators performed better than the competing estimators under both sampling schemes.
One study compared four existing quantitative RRT models using separate and combined metrics of respondent privacy and model efficiency. It found that, under particular circumstances, some models performed better than others for particular values of the parameters and constants (Azeem et al., 2023). However, such models become less efficient for other values of the parameters. The mathematical conditions for the efficiency of the different models were also considered.
Grover and Kaur (2018) also proposed an efficient mixed estimator for the population by using a transformation of non-sensitive outcomes. They compared its efficiency with that of the existing estimators, both numerically and theoretically, and found that the proposed estimator performed better than the existing estimators under specific conditions. Khalil et al. (2019) improved the generalized estimator by using the optional RRT for sensitive study variables. The results showed a gain in the efficiency of the estimator. Zhang et al. (2019) presented a geometric-mean ratio estimator of the finite population mean using optional RRT models. The results showed that the geometric-mean estimator is more efficient than the ordinary RRT mean estimator when the correlation coefficient between the study variable and the auxiliary variable exceeds 0.5. The study also showed that the efficiency of the proposed geometric-mean ratio estimator is similar to that of the recently introduced ratio estimator, but with smaller bias. Saleem et al. (2019) used double sampling to estimate the mean of a sensitive variable. They also derived expressions for the bias and MSE and conducted a numerical study to assess the efficiency of the proposed estimator. Finally, Adichwal et al. (2022) proposed a new estimator based on the t(a, b) class, which is widely used with auxiliary information under simple random sampling without replacement (SRSWOR).
Recently, a number of studies have introduced different methods for estimating sensitive study variables. For instance, Sanaullah et al. (2019) proposed the use of non-sensitive concomitant variables to develop more efficient estimators, which outperformed the estimators of previous studies such as Sousa et al. (2010), Gupta et al. (2012), and Koyuncu et al. (2014). Similarly, Shahzad et al. (2019) proposed a regression-type estimator of the population mean of a sensitive study variable using auxiliary information under ranked set sampling. They showed that their proposed estimators performed better than the competing estimators. Waseem et al. (2020) proposed a generalized exponential-type estimator for scrambled responses under a simple random sampling (SRS) design using one or two auxiliary variables. They also discussed special cases of the suggested estimator with privacy protection. Numerical results from simulation and a real data illustration supported the efficiency of the proposed estimators. Anwar (2020) presented classical and modified estimators, a class of exponential estimators, and generalized estimators for the mean of a sensitive study variable under a systematic sampling plan. An extensive numerical study was carried out on real and simulated data.
In addition, another study suggested a class of in-type mean estimators for a sensitive study variable that uses the variance of the auxiliary variable under both simple and stratified sampling (Yousaf, 2020). The proposed estimators outperformed existing estimators for both kinds of populations. Anwar (2020) also recommended generalized estimators for sensitive variables that use non-sensitive auxiliary information under different sampling techniques. The study derived expressions for the bias and MSE of the suggested estimators and compared them with conventional estimators. The empirical evaluation showed that the generalized estimators performed much better than the earlier estimators under the different sampling techniques. Rana (2021) proposed generalized estimators for confidential study variables using auxiliary attributes under simple and stratified two-phase sampling. The study also derived expressions for the bias and MSE of the proposed estimators and compared them with the mean, ratio, and exponential estimators. The numerical results showed that the generalized estimators performed better than the existing estimators under both simple and stratified two-phase sampling designs. Tiwari et al. (2022) examined the effect of non-response on mean estimation using RRT. A simulation study was also carried out to check the efficiency of the proposed estimators.
Generally, survey sampling deals with the development of theory and methods for estimating means and totals. However, with advances in survey sampling design, many authors have worked on median estimation, depending on the nature of the variable. For example, for ordinal variables or ordered observations, the median is the appropriate measure of central tendency. The use of existing information has also been a standard practice since the beginning of survey sampling. Auxiliary information is commonly used to obtain improved methods and higher precision in estimating unknown parameters.
Numerous investigations have examined such variables, but little research has been done on median estimation in the presence of non-response using RRT. To fill this gap, this study develops a basic theoretical framework for median estimation in the presence of non-response using a randomized response technique. Median estimators for sensitive variables that use auxiliary information under an RRT model are proposed. The basic median estimator and the product, ratio, exponential product, exponential ratio, and regression-cum-ratio estimators of the median under non-response using RRT are introduced.

Methods
In this study, estimators for sensitive study variables are proposed that take auxiliary information into account when estimating the population median under an RRT model. Warner pioneered this model in 1965, and various statisticians have since modified it. The technique aims to obtain the most accurate answers to sensitive questions that respondents may be reluctant to answer truthfully. In this paper, we use the additive model introduced in 1976 by Pollock and Bek for quantitative responses, which is the most commonly used model:
Z = Y + S
where Z is the reported response, Y is the variable of interest, and S is the scrambling variable. It is further assumed that S ~ N(0, 10% of Sx). Since S has zero mean and is independent of Y, the mean and variance of the reported response are E(Z) = E(Y) and Var(Z) = Var(Y) + Var(S).
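As an illustration of the additive model, the following Python sketch scrambles a synthetic population. It is a minimal sketch under stated assumptions: "10% of Sx" is read as a standard deviation equal to 10% of the standard deviation of X, the function name `scramble` is hypothetical, and the population is simulated rather than taken from the paper.

```python
import random
import statistics


def scramble(y_values, x_values, rng, noise_fraction=0.10):
    """Return reported responses Z = Y + S.

    Assumption: S ~ N(0, sigma_s) with sigma_s = noise_fraction * sd(X).
    """
    sigma_s = noise_fraction * statistics.stdev(x_values)
    return [y + rng.gauss(0.0, sigma_s) for y in y_values]


rng = random.Random(2024)

# Hypothetical skewed population (illustration only, not the paper's data).
x_pop = [rng.lognormvariate(3.0, 0.5) for _ in range(5000)]
y_pop = [2.0 * x + rng.gauss(0.0, 5.0) for x in x_pop]
z_pop = scramble(y_pop, x_pop, rng)

# Because S has zero mean, the mean of Z should stay close to the mean of Y.
mean_gap = abs(statistics.mean(z_pop) - statistics.mean(y_pop))
print(f"|mean(Z) - mean(Y)| = {mean_gap:.4f}")
print(f"median(Y) = {statistics.median(y_pop):.3f}, "
      f"median(Z) = {statistics.median(z_pop):.3f}")
```

The interviewer only ever sees the values in `z_pop`, so an individual's true Y cannot be read off directly, while population-level summaries remain estimable.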

Terminology
Here, the sensitive study variable is denoted by Y. Let X be a non-sensitive auxiliary variable that is correlated with Y. Suppose the scrambling variable S is independent of Y and X. The respondent is asked to report the scrambled response Z = Y + S for the sensitive variable but to give the true value of X. Consider a finite population U = (U1, U2, U3, ..., UN) from which a simple random sample of size n is drawn without replacement; for the ith unit (i = 1, 2, 3, ..., N), yi and xi denote the values of the corresponding variables. Let $M_y$ and $M_x$ denote the population medians of the study variable and the auxiliary variable, respectively, and let $\hat{M}_y$, $\hat{M}_x$, and $\hat{M}_z$ denote the corresponding sample medians. Assume that the population median of X is known and that S follows a normal distribution with zero mean and a standard deviation equal to 10% of that of X. The error terms and their expectations are defined in Equation (1). When the information on X is ignored, the basic median estimator for the scrambled response, t1, is an unbiased estimator, and its variance follows from these expectations.
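The sampling setup described above can be sketched in Python. This is an illustrative sketch only: it assumes full response, takes the basic estimator to be the plain sample median of the scrambled values, and uses a synthetic population. The function names `draw_srswor` and `basic_median_estimator` and all data are hypothetical; N = 69 is assumed from the sample-size description in the numerical study.

```python
import random
import statistics


def draw_srswor(n_units, n, rng):
    """Indices of a simple random sample of size n drawn without replacement."""
    return rng.sample(range(n_units), n)


def basic_median_estimator(z_sample):
    """Sample median of the scrambled responses, ignoring X (assumed form of t1)."""
    return statistics.median(z_sample)


rng = random.Random(7)
N = 69  # assumed population size

# Hypothetical scrambled responses for the whole population (illustration only).
z_pop = [rng.lognormvariate(4.0, 0.6) + rng.gauss(0.0, 1.0) for _ in range(N)]

idx = draw_srswor(N, 14, rng)  # roughly a 20% sample
t1 = basic_median_estimator([z_pop[i] for i in idx])
print(f"t1 = {t1:.3f}, population median of Z = {statistics.median(z_pop):.3f}")
```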

a. Product estimator
The product estimator of the median for the scrambled response is denoted by tp. Its bias and MSE are derived up to the first order of approximation.

b. The ratio estimator
The ratio estimator of the median for the scrambled response is denoted by tr. Its bias and MSE are derived in the same way.

c. The exponential product median estimator
The exponential product median estimator for the scrambled response is given by Equation 10.
After the necessary mathematical computations, the bias and MSE of tep are derived.

d. The exponential ratio median estimator
The exponential ratio median estimator for the sensitive variable is defined analogously. Using the Taylor series, its bias and MSE are derived up to the first order of approximation.
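For orientation, the four estimators named above can be sketched with the forms that are conventional in the survey-sampling literature. These forms are an assumption, since they may differ from the paper's own equations (for example in how non-response is handled). The function names and the numerical inputs are hypothetical.

```python
import math


def product_estimator(mz_hat, mx_hat, mx):
    """Conventional product form: sample median of Z times (mx_hat / mx)."""
    return mz_hat * mx_hat / mx


def ratio_estimator(mz_hat, mx_hat, mx):
    """Conventional ratio form: sample median of Z times (mx / mx_hat)."""
    return mz_hat * mx / mx_hat


def exp_product_estimator(mz_hat, mx_hat, mx):
    """Conventional exponential product form."""
    return mz_hat * math.exp((mx_hat - mx) / (mx_hat + mx))


def exp_ratio_estimator(mz_hat, mx_hat, mx):
    """Conventional exponential ratio form."""
    return mz_hat * math.exp((mx - mx_hat) / (mx + mx_hat))


# Hypothetical inputs: sample medians of Z and X, and the known population median of X.
mz_hat, mx_hat, mx = 52.0, 21.0, 20.0

estimates = {
    "product": product_estimator(mz_hat, mx_hat, mx),
    "ratio": ratio_estimator(mz_hat, mx_hat, mx),
    "exp_product": exp_product_estimator(mz_hat, mx_hat, mx),
    "exp_ratio": exp_ratio_estimator(mz_hat, mx_hat, mx),
}
for name, value in estimates.items():
    print(f"{name:12s} {value:.4f}")
```

In this example the sample median of X overshoots the known population median, so the ratio-type forms pull the estimate down and the product-type forms push it up. The exponential forms apply a milder correction than their plain counterparts.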

Generalized regression-cum-ratio median estimator
Motivated by Gupta et al. (2012), who proposed this type of estimator, we propose a generalized regression-cum-ratio median estimator, given in Equation (16).
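A possible form of such an estimator is sketched below. It is a sketch only: it assumes the median analogue of the regression-cum-ratio structure used for means in that line of work, which may differ from Equation (16). The constants K1 and K2 are the ones named in the text, while the function name and the inputs are hypothetical.

```python
def regression_cum_ratio_estimator(mz_hat, mx_hat, mx, k1, k2):
    """Assumed form: [K1 * mz_hat + K2 * (mx - mx_hat)] * (mx / mx_hat)."""
    return (k1 * mz_hat + k2 * (mx - mx_hat)) * (mx / mx_hat)


# Hypothetical sample medians and known population median of X.
mz_hat, mx_hat, mx = 52.0, 21.0, 20.0

# With K1 = 1 and K2 = 0 the assumed form reduces to the usual ratio form.
as_ratio = regression_cum_ratio_estimator(mz_hat, mx_hat, mx, k1=1.0, k2=0.0)

# Arbitrary illustrative constants, not the optimum values derived in the paper.
general = regression_cum_ratio_estimator(mz_hat, mx_hat, mx, k1=0.98, k2=1.5)

print(f"K1=1, K2=0      -> {as_ratio:.4f}")
print(f"K1=0.98, K2=1.5 -> {general:.4f}")
```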

Some specific cases
Using the Taylor series up to the second order of approximation, the estimator is expanded. To obtain the expression for the MSE, we take Equation 9.2, square it, and apply expectation.
To obtain the minimum MSE and the optimum values of K1 and K2, we differentiated Equation 19 with respect to K1 and K2 and set the derivatives equal to zero. Solving the resulting equations yields the optimum values of K1 and K2.
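The minimisation step can be illustrated generically. As a sketch only, assume that the first-order MSE is quadratic in K1 and K2, with placeholder coefficients A, B, C, D, and E. These are hypothetical symbols standing in for the population terms of Equation 19, not the paper's notation, and the sketch requires A > 0 and AB − C² > 0 for a minimum.

```latex
\begin{align*}
\mathrm{MSE}(K_1,K_2) &= M_y^2\left[1 + K_1^2 A + K_2^2 B + 2K_1K_2 C - 2K_1 D - 2K_2 E\right],\\
\frac{\partial\,\mathrm{MSE}}{\partial K_1} = 0 &\;\Rightarrow\; A K_1 + C K_2 = D,\\
\frac{\partial\,\mathrm{MSE}}{\partial K_2} = 0 &\;\Rightarrow\; C K_1 + B K_2 = E,\\
K_1^{\mathrm{opt}} &= \frac{BD - CE}{AB - C^2},\qquad
K_2^{\mathrm{opt}} = \frac{AE - CD}{AB - C^2},\\
\mathrm{MSE}_{\min} &= M_y^2\left[1 - \frac{BD^2 - 2CDE + AE^2}{AB - C^2}\right].
\end{align*}
```

The last line follows because, at the optimum, the quadratic terms equal $K_1 D + K_2 E$, so the bracket collapses to $1 - (K_1^{\mathrm{opt}} D + K_2^{\mathrm{opt}} E)$.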
Substituting the optimum values of K1 and K2 into Equation 19 and carrying out the necessary mathematical computations gives the minimum MSE.
Natural and Applied Sciences International Journal (NASIJ)

Numerical study
Here, we conduct a numerical experiment to check the validity of the derived formulas. We consider two populations. The reported response is given by Z = Y + S. For both populations, the sample sizes are n = 5, 10, 20, and 30 percent of 69. The data are processed in the R language. Table 1 gives the MSEs of the seven derived estimators for both populations. The percent relative efficiency (PRE) is used to compare the estimators.
The PRE is computed for each estimator indexed by β = P, R, EP, ER, GRR, exp.
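The PRE comparison can be sketched as follows, assuming the usual definition in which each estimator's MSE is compared with that of the basic median estimator, PRE = 100 × MSE(basic) / MSE(estimator). The MSE values below are hypothetical placeholders, not the values in Table 1, and the function name is illustrative.

```python
def percent_relative_efficiency(mse_basic, mse_estimator):
    """PRE of an estimator relative to the basic estimator (assumed definition)."""
    if mse_basic <= 0 or mse_estimator <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_basic / mse_estimator


# Hypothetical MSE values for illustration only (NOT the paper's results).
mse = {"basic": 4.0, "P": 5.0, "R": 3.2, "ER": 2.5, "GRR": 2.0}

pre = {name: percent_relative_efficiency(mse["basic"], value)
       for name, value in mse.items()}

for name, value in pre.items():
    print(f"{name:6s} PRE = {value:6.1f}")
```

A PRE above 100 means the estimator has a smaller MSE than the basic median estimator, and a PRE below 100 means it is less efficient.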

Results
The findings from the simulation study and the real data illustrations agree with each other, as expected. The results in Table 1 indicate that the ratio estimator is more efficient than the basic median estimator, particularly in the presence of non-response. Furthermore, the exponential ratio median estimator has a lower MSE than the ratio median estimator. The regression-cum-ratio median estimator is more efficient than all the other estimators. However, the product median and exponential product median estimators do not perform well.

Discussion
Researchers are increasingly examining sensitive issues such as drug and alcohol use, reproductive and mental health, and illegal activities. Participants often request confidentiality, anonymity, and protection when sharing such information, suspecting that data collectors may be intruding into their personal lives. RRT is a legitimate procedure that offers greater anonymity and privacy when collecting sensitive information.
However, the RR technique has both advantages and disadvantages. Randomization inflates the variance of the estimates, and an individual respondent's true answer cannot be inferred directly from the reported response. Nonetheless, combining RR and direct answers in the sample makes the information more reliable and convincing.
Real data and simulation studies show that an estimator with higher efficiency is better equipped to handle non-response in certain situations. The regression-cum-ratio median estimator was found to perform better than all the other estimators considered in this research. This is supported by Mutembei et al. (2014), who proposed a regression-cum-ratio estimator that uses data from multiple auxiliary variables and characteristics simultaneously to estimate the population mean (Koyuncu et al., 2019).

Conclusion
To conclude, there are real-world problems in which sensitive variables must be addressed, and respondents usually refuse to answer questions on such variables or give fabricated answers. Asking sensitive questions directly can therefore lead to non-response or bias. The issue of non-response and bias may be tackled by using non-sensitive variables. In the past, mean estimation was considered the best approach for such cases. With the theoretical framework for median estimation developed in this paper, researchers have new methods for data with many outliers or extreme values. Among the proposed estimators, both the ratio and the exponential ratio estimators perform better than the basic median estimator for scrambled responses derived in this paper, and the regression-cum-ratio estimator performs better still. The numerical findings show that the regression-cum-ratio estimator is superior to all the other estimators considered in this research across the different sample sizes, under certain conditions. The efficiency of the proposed estimators also increases with the sample size. In the future, researchers can estimate the median in the presence of non-response by applying multiple randomized response models and two scrambling variables.

Table 1: MSE and PRE of all the estimators for the real populations