1. Introduction
Using available auxiliary information to assist in estimating the population mean or population total in sample surveys can increase the efficiency of the estimators. Well-known methods for estimating the population mean or population total using auxiliary variables are ratio and regression estimators. They are very helpful techniques for both government and private organizations that deal with large population data, covering for example economics, agriculture or other state assets.
Cochran [1] was the first to propose a ratio estimator for estimating population mean. He proposed to use a known population mean of an auxiliary variable \((\overline{X})\) in order to increase the efficiency of the estimator. Several researchers have developed ratio estimators by using other known population values of an auxiliary variable. For example, Sisoda & Dwivedi [2], Singh & Upadhyaya [3] and Pandey & Dubey [4] proposed to adjust the customary ratio estimator by using the coefficient of variation in estimating the population mean. Singh & Tailor [5] suggested applying the correlation coefficient of an auxiliary variable for its estimation (see for example Soponviwatkul & Lawson [6]).
Kadilar and Cingi, in their papers [7] and [8] on this subject, proposed using both ratio and regression estimators by substituting the sample mean of the study variable y with the regression estimator and also by applying other auxiliary variables to estimate the population mean of the interest variable, described as coefficient of variation, coefficient of kurtosis, and correlation coefficient. Some researchers have proposed improving the efficiency of the population mean estimator by using other parameters of auxiliary variables, such as deciles and the quartile function (see e.g. Subramani & Kumarapandiyan [9,10]).
There is a great deal of research that covers the benefit of known auxiliary variables. Therefore, Khoshnevisan, et al. [11] proposed a general family of ratio estimators for estimating population mean that covers the existing ratio estimators. The estimator by Khoshnevisan, et al. [11] is given as follows in Eq. (1):
\[T_{1} = \overline{y} \left( \frac{a\overline{X} + c}{\alpha(a\overline{x} + c) + (1 - \alpha)(a\overline{X} + c)} \right)^{g}, \tag{1}\] where \(\bar{x}\) and \(\bar{X}\) are the sample and population means of an auxiliary variable x respectively. \(a \neq 0\) and c are either real numbers or functions of known parameters of an auxiliary variable such as the coefficient of variation \((C_x)\), coefficient of skewness \((\beta_1)\), coefficient of kurtosis \((\beta_2)\), correlation coefficient \((\rho)\) and inter-quartile range \((Q_r)\) of the population. \(\alpha\) and g are real numbers to be determined. Khoshnevisan, et al. [11] assumed that all of the sample units used were at full response rate.
Up to the first degree of Taylor linearization, the bias and mean square error of the estimator of Khoshnevisan, et al. [11] are given in Eqs. (2) and (3) as follows:
\[Bias(T_1) = \left(\frac{1-f}{n}\right)\overline{Y}\left[\frac{g(g+1)}{2}\alpha^2w_1^2C_x^2 - \alpha w_1g\rho C_yC_x\right],\tag{2}\]
\[MSE(T_1) = \left(\frac{1-f}{n}\right) \overline{Y}^2 \left[ C_y^2 + \alpha^2 w_1^2 g^2 C_x^2 - 2\alpha w_1 g \rho C_y C_x \right], \tag{3}\] where \[w_1 = \frac{a\overline{X}}{a\overline{X} + c}\], \(f = \frac{n}{N}\).
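Eqs. (1) and (3) can be evaluated directly once the constants are chosen. The following is a minimal sketch, not the authors' implementation; the function names and argument order are illustrative assumptions, and the defaults \(a = 1\), \(c = 0\), \(\alpha = g = 1\) correspond to the customary ratio estimator.

```python
def khoshnevisan_estimate(y_bar, x_bar, X_bar, a=1.0, c=0.0, alpha=1.0, g=1.0):
    """Eq. (1): general ratio-type estimator T_1 of the population mean."""
    numerator = a * X_bar + c
    denominator = alpha * (a * x_bar + c) + (1.0 - alpha) * (a * X_bar + c)
    return y_bar * (numerator / denominator) ** g


def khoshnevisan_mse(n, N, Y_bar, X_bar, C_y, C_x, rho,
                     a=1.0, c=0.0, alpha=1.0, g=1.0):
    """Eq. (3): first-order MSE of T_1 under SRSWOR."""
    f = n / N                              # sampling fraction
    w1 = a * X_bar / (a * X_bar + c)       # constant defined below Eq. (3)
    bracket = (C_y ** 2
               + (alpha * w1 * g) ** 2 * C_x ** 2
               - 2.0 * alpha * w1 * g * rho * C_y * C_x)
    return (1.0 - f) / n * Y_bar ** 2 * bracket
```

Setting `alpha=0.0` removes the dependence on the sample mean of x, so the function returns the plain sample mean \(\bar{y}\) and the MSE reduces to the variance of \(\bar{y}\).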
Later, Kumar [12] suggested adjusting the estimator of Khoshnevisan, et al. [11] by replacing sample mean \(\bar{y}\) in Eq. (1) using a traditional regression estimator. The bias and mean square error for the new estimator were considered. Kumar's [12] estimator is given in Eq. (4) as follows:
\[T_2 = \left[\overline{y} + b(\overline{X} - \overline{x})\right] \left(\frac{d\overline{X} + h}{\alpha(d\overline{x} + h) + (1 - \alpha)(d\overline{X} + h)}\right)^g, \tag{4}\] where b is the sample estimate of the regression coefficient \(\beta\), \(d \neq 0\) and h are either real numbers or functions of known parameters of an auxiliary variable. \(\alpha\) and g are real numbers to be determined.
Up to the first degree of Taylor linearization, the bias and mean square error of Kumar's [12] estimator are given in Eqs. (5) and (6) as follows:
\[Bias(T_2) = \frac{1}{\overline{X}} \left( \frac{1-f}{n} \right) \left[ \frac{g(g+1)}{2} \alpha^2 w_2^2 R S_x^2 + \alpha g w_2 \left( A - \beta \right) R S_x^2 - \left( \frac{\lambda_{12}}{\rho} - \lambda_{03} \right) S_x \right], \tag{5}\]
\[MSE(T_2) = \left(\frac{1-f}{n}\right) \left[S_y^2 + R\left(A + g\alpha w_2\right)\left\{R\left(A + g\alpha w_2\right) - 2\beta\right\}S_x^2\right],\tag{6}\] where \(w_2 = \frac{d\overline{X}}{d\overline{X} + h}\), \(A = \frac{\beta}{R}\), \(R = \frac{\overline{Y}}{\overline{X}}\), \(\lambda_{rs} = \frac{\mu_{rs}}{S_y^{r/2} S_x^{s/2}}\), \(\mu_{rs} = \frac{1}{N} \sum_{i=1}^N (y_i - \overline{Y})^r (x_i - \overline{X})^s\), r and s are non-negative integers, and \(S_x\) and \(S_y\) are the standard deviations of X and Y respectively.
More research is involved in using the benefit of known parameters of auxiliary variables, such as the coefficient of skewness, coefficient of kurtosis, median and quartile function (see e.g. Upadhyaya & Singh [13], Singh [14], Alomari, et al. [15], Yan & Tian [16], Yadav, et al. [17], Subramani & Kumarapandiyan [18], and Lawson [19]). Alternatively, some researchers have recommended combining ratio estimators in order to minimize the mean square error. For example, Enang, et al. [20] proposed combining Singh & Tailor's [5] estimator with Kadilar & Cingi's [7] estimator to minimize the mean square error.
However, the estimators of Khoshnevisan, et al. [11] and Kumar [12] are in a form that is quite difficult to use in practice. This paper proposes two new classes of population mean estimators created by adjusting the estimators of Khoshnevisan, et al. [11] and Kumar [12] under simple random sampling without replacement (SRSWOR).
The proposed estimators only provide a minor improvement to the existing estimators, but they are in a simple form and easier to use when compared to the existing ones. Moreover, we propose to combine the two new families of ratio estimators in general form. The bias and mean square error are shown using Taylor's series up to the first order. This alternative estimator can be useful to both public and private sector organizations who know some of the parameters of auxiliary variables.
2. Proposed Estimators
By adjusting the estimators of Khoshnevisan, et al. [11] and Kumar [12], we created two new families of ratio estimators. We then propose a combined family of ratio estimators that combines these two new families in order to minimize the MSE of the population mean estimator. The estimators of Khoshnevisan, et al. [11] and Kumar [12] are adjusted to a simple form by considering the case where \(\alpha\) and g are equal to one. The modified estimators \(t_R\) and \(t_{Reg}\) are given as follows:
\[t_R = \overline{y} \left( \frac{a\overline{X} + c}{a\overline{x} + c} \right), \tag{7}\] and
\[t_{Reg} = \left[\overline{y} + b(\overline{X} - \overline{x})\right] \left(\frac{d\overline{X} + h}{d\overline{x} + h}\right),\tag{8}\] where c and h are either real numbers or functions of known parameters of an auxiliary variable and b is the sample regression coefficient.
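As an illustration of Eqs. (7) and (8), the two modified estimators can be computed from a sample as sketched below. This is a hedged example rather than the authors' code: the function names are assumptions, and b is taken here to be the ordinary least-squares slope of y on x computed from the sample.

```python
def sample_slope(xs, ys):
    """Least-squares slope of y on x (used as b; an assumption of this sketch)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx


def t_r(xs, ys, X_bar, a=1.0, c=0.0):
    """Eq. (7): ratio-type estimator with constants a and c."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return y_bar * (a * X_bar + c) / (a * x_bar + c)


def t_reg(xs, ys, X_bar, d=1.0, h=0.0):
    """Eq. (8): regression-adjusted mean times a ratio factor with constants d and h."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    b = sample_slope(xs, ys)
    return (y_bar + b * (X_bar - x_bar)) * (d * X_bar + h) / (d * x_bar + h)
```

For example, with `xs = [1, 2, 3]`, `ys = [2, 4, 6]` and a known population mean `X_bar = 2.5`, the sample means are 2 and 4 and the slope is 2, so `t_r` gives 4 × 1.25 = 5 and `t_reg` gives (4 + 2 × 0.5) × 1.25 = 6.25.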
To obtain the bias and MSE of the modified estimators we can use the following notations:
Let \[\overline{y} = \overline{Y}(1 + e_0)\] and \(\overline{x} = \overline{X}(1 + e_1)\) such that \(E(e_0) = E(e_1) = 0\), \(E(e_0^2) = \frac{1 - f}{n}C_y^2\), \(E(e_1^2) = \frac{1 - f}{n}C_x^2\) and \(E(e_0e_1) = \frac{1 - f}{n}C_{xy} = \frac{1 - f}{n}\rho C_y C_x\).
Rewriting Eqs. (7) and (8) in terms of \(e_0\) and \(e_1\) we have:
\[t_R = \overline{Y} \left( 1 + e_0 \right) \left( \frac{a\overline{X} + c}{a \left( \overline{X} \left( 1 + e_1 \right) \right) + c} \right), \tag{9}\] and
\[t_{R_{\text{eg}}} = \left[\overline{Y}\left(1 + e_0\right) + b(\overline{X} - \overline{X}\left(1 + e_1\right))\right] \left(\frac{d\overline{X} + h}{d\overline{X}\left(1 + e_1\right) + h}\right). \tag{10}\]
Up to the first degree of approximation using a Taylor series, the bias and MSE of estimators \(t_R\) and \(t_{R_{eg}}\) are shown in Table 1.
Table 1 Bias and MSE for \(t_R\) and \(t_{R_{eg}}\).
| Estimator | Constant | Bias | MSE |
|---|---|---|---|
| \(t_R\) | \[w_1 = \frac{a\overline{X}}{a\overline{X} + c}\] | \[\frac{\left(1-f\right)}{n}w_{1}\overline{Y}\left[w_{1}C_{x}^{2}-\rho C_{x}C_{y}\right]\] | \[\frac{\left(1-f\right)}{n}\overline{Y}^{2}\left[C_{y}^{2}+w_{1}\left(w_{1}C_{x}^{2}-2\rho C_{x}C_{y}\right)\right]\] |
| \(t_{R_{eg}}\) | \[w_2 = \frac{d\overline{X}}{d\overline{X} + h},\] \[K = \frac{\overline{X}}{\overline{Y}}\] | \[\frac{\left(1-f\right)}{n}w_{2}\overline{Y}\left[\left(w_{2}+bK\right)C_{x}^{2}-\rho C_{x}C_{y}\right]\] | \[\frac{(1-f)}{n} \overline{Y}^{2} \left[ C_{y}^{2} + (w_{2} + bK)^{2} C_{x}^{2} -2\rho C_{x} C_{y} (w_{2} + bK) \right]\] |
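The entries of Table 1 translate into a few lines of code. The sketch below is illustrative only; the helper names and the shorthand `lam` for \((1-f)/n\) are assumptions, and the weights are computed as \(w = \text{coef}\cdot\overline{X}/(\text{coef}\cdot\overline{X} + \text{const})\), the form given below Eq. (3).

```python
def weight(coef, const, X_bar):
    """w = coef * X_bar / (coef * X_bar + const); used for both w_1 and w_2."""
    return coef * X_bar / (coef * X_bar + const)


def bias_mse_t_r(n, N, Y_bar, C_y, C_x, rho, w1):
    """First-order bias and MSE of t_R (Table 1, first row)."""
    lam = (1.0 - n / N) / n
    bias = lam * w1 * Y_bar * (w1 * C_x ** 2 - rho * C_x * C_y)
    mse = lam * Y_bar ** 2 * (C_y ** 2 + w1 * (w1 * C_x ** 2 - 2.0 * rho * C_x * C_y))
    return bias, mse


def bias_mse_t_reg(n, N, Y_bar, C_y, C_x, rho, w2, b, K):
    """First-order bias and MSE of t_Reg (Table 1, second row)."""
    lam = (1.0 - n / N) / n
    v = w2 + b * K
    bias = lam * w2 * Y_bar * (v * C_x ** 2 - rho * C_x * C_y)
    mse = lam * Y_bar ** 2 * (C_y ** 2 + v ** 2 * C_x ** 2 - 2.0 * rho * C_x * C_y * v)
    return bias, mse
```

With `b = 0` and equal weights the second function returns the same values as the first, which is a quick consistency check between the two rows of the table.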
Some members of estimators \(t_R\) and \(t_{R_{eg}}\) are shown in Table 2.
Table 2 Some Members of Estimators \(t_R\) and \(t_{R_{eg}}\).
| Estimator | a or d | c or h |
|---|---|---|
| Ratio Estimator | ||
| \[t_{R_1} = \overline{y}\left(\frac{\overline{X}}{\overline{x}}\right)\] | 1 | 0 |
| Sisoda and Dwivedi [2] Estimator | ||
| \[t_{R_2} = \overline{y} \left( \frac{\overline{X} + C_x}{\overline{x} + C_x} \right)\] | 1 | \(C_x\) |
| Upadhyaya and Singh [13] Estimator | ||
| \[t_{R_3} = \overline{y} \left( \frac{C_x \overline{X} + \beta_2}{C_x \overline{x} + \beta_2} \right)\] | \(C_x\) | \(\beta_2\) |
| Singh and Tailor [5] Estimator | ||
| \[t_{R_4} = \overline{y} \left( \frac{\overline{X} + \rho}{\overline{x} + \rho} \right)\] | 1 | ρ |
| Subramani and Kumarapandiyan [12] Estimator | ||
| \[t_{R_{5}} = \overline{y} \left( \frac{\overline{X} + Q_{r}}{\overline{x} + Q_{r}} \right)\] | 1 | \(Q_r\) |
| Kadilar and Cingi [7] Estimator | ||
| \(t_{R_{eg_1}} = (\overline{y} + b(\overline{X} - \overline{x})) \left(\frac{\overline{X}}{\overline{x}}\right)\) | 1 | 0 |
| Kadilar and Cingi [7] Estimator | ||
| \[t_{R_{eg_2}} = (\overline{y} + b(\overline{X} - \overline{x})) \left(\frac{\overline{X} + C_x}{\overline{x} + C_x}\right)\] | 1 | \(C_x\) |
| Kadilar and Cingi [8] Estimator | ||
| \[t_{R_{eg_3}} = (\overline{y} + b(\overline{X} - \overline{x})) \left(\frac{\overline{X} + \rho}{\overline{x} + \rho}\right)\] | 1 | ρ |
| Kadilar and Cingi [8] Estimator | ||
| \[t_{R_{eg_4}} = \left(\overline{y} + b(\overline{X} - \overline{x})\right) \left(\frac{\beta_2 \overline{X} + C_x}{\beta_2 \overline{x} + C_x}\right)\] | \(\beta_2\) | \(C_x\) |
| Kadilar and Cingi [8] Estimator | ||
| \[t_{R_{eg_5}} = \left(\overline{y} + b(\overline{X} - \overline{x})\right) \left(\frac{\beta_2 \overline{X} + \rho}{\beta_2 \overline{x} + \rho}\right)\] | \(\beta_2\) | ρ |
| Yan and Tian [16] Estimator | ||
| \[t_{R_{eg_6}} = \left(\overline{y} + b(\overline{X} - \overline{x})\right) \left(\frac{\beta_{1}\overline{X} + \beta_{2}}{\beta_{1}\overline{x} + \beta_{2}}\right)\] | \(\beta_1\) | \(\beta_2\) |
By substituting suitable values for the constants a, c, d and h in Eq. (7) and Eq. (8), estimators \(t_R\) and \(t_{Reg}\) reduce to known estimators.
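The members listed in Table 2 differ only in the pair of constants that enters the ratio factor. A minimal sketch of this parameterization is given below; the function and dictionary key names are hypothetical labels chosen for this example, and the population parameters are assumed to be known.

```python
def table2_constants(C_x, beta_1, beta_2, rho, Q_r):
    """Return the (a, c) pairs for t_R and the (d, h) pairs for t_Reg from Table 2."""
    ratio = {
        "t_R1": (1.0, 0.0),
        "t_R2": (1.0, C_x),
        "t_R3": (C_x, beta_2),
        "t_R4": (1.0, rho),
        "t_R5": (1.0, Q_r),
    }
    regression = {
        "t_Reg1": (1.0, 0.0),
        "t_Reg2": (1.0, C_x),
        "t_Reg3": (1.0, rho),
        "t_Reg4": (beta_2, C_x),
        "t_Reg5": (beta_2, rho),
        "t_Reg6": (beta_1, beta_2),
    }
    return ratio, regression


def ratio_factor(X_bar, x_bar, coef, const):
    """Common factor (coef * X_bar + const) / (coef * x_bar + const) of Eqs. (7) and (8)."""
    return (coef * X_bar + const) / (coef * x_bar + const)
```

Each member is then obtained by multiplying \(\bar{y}\) (for \(t_R\)) or \(\bar{y} + b(\overline{X} - \bar{x})\) (for \(t_{Reg}\)) by `ratio_factor` evaluated at the chosen pair.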
We propose a combined family of ratio estimators by combining the estimators \(t_R\) and \(t_{Reg}\) in order to find the minimum mean square error of the proposed combined family of ratio estimators. This combined family of ratio estimators is given as follows:
\[t_{RC} = \alpha t_R + (1 - \alpha) t_{Reg}, \tag{11}\] where \(\alpha\) is a suitable choice of constant that makes the mean squared error of \(t_{RC}\) minimum.
Expressing Eq. (11) in terms of e's we have:
\[\begin{aligned} t_{RC} &= \alpha \overline{Y} (1 + e_0) \left( \frac{a \overline{X} + c}{a (\overline{X} (1 + e_1)) + c} \right) + (1 - \alpha) \left[ \overline{Y} (1 + e_0) + b (\overline{X} - \overline{X} (1 + e_1)) \right] \left( \frac{d \overline{X} + h}{d \overline{X} (1 + e_1) + h} \right) \\ &= \alpha \overline{Y} (1 + e_0) (1 + w_1 e_1)^{-1} + (1 - \alpha) (\overline{Y} (1 + e_0) - b \overline{X} e_1) (1 + w_2 e_1)^{-1} \\ &= \alpha \overline{Y} \left[ 1 - w_1 e_1 + w_1^2 e_1^2 + e_0 - w_1 e_0 e_1 \right] \\ &\quad + (1 - \alpha) \left[ \overline{Y} - \overline{Y} w_2 e_1 + \overline{Y} w_2^2 e_1^2 + \overline{Y} e_0 - \overline{Y} w_2 e_0 e_1 - b K \overline{Y} e_1 + b K \overline{Y} w_2 e_1^2 \right]. \end{aligned} \tag{12}\]
The bias of the proposed combined family of ratio estimators \(t_{RC}\) to the first-order approximation is given by:
\[\begin{aligned} \operatorname{Bias}(t_{RC}) &= E(t_{RC} - \overline{Y}) \\ &= E\big(\alpha \overline{Y}w_1^2 e_1^2 - \alpha \overline{Y}w_1 e_0 e_1 + \overline{Y}w_2^2 e_1^2 - \overline{Y}w_2 e_0 e_1 \\ &\qquad + bK\overline{Y}w_{2}e_{1}^{2} - \alpha\overline{Y}w_{2}^{2}e_{1}^{2} + \alpha\overline{Y}w_{2}e_{0}e_{1} - \alpha bK\overline{Y}w_{2}e_{1}^{2}\big) \\ &= \frac{(1-f)}{n}\overline{Y}\left[ (\alpha w_{1}^{2} + (1-\alpha)w_{2}^{2} + (1-\alpha)bKw_{2})C_{x}^{2} + ((\alpha-1)w_{2} - \alpha w_{1})\rho C_{x}C_{y} \right]. \end{aligned} \tag{13}\]
An approximation of the mean square error of the proposed combined family of ratio estimators up to the first order is given by:
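\[MSE(t_{RC}) = E(t_{RC} - \overline{Y})^{2}\] \[= E\left[\overline{Y}\left(e_0 + (\alpha bK - bK - \alpha w_{1} + (\alpha - 1)w_{2})e_1\right)\right]^{2}\] \[= \frac{(1-f)}{n} \overline{Y}^{2} \begin{bmatrix} C_{y}^{2} + 2(\alpha bK - bK - \alpha w_{1} + (\alpha - 1)w_{2})\rho C_{x}C_{y} \\ + (\alpha bK - bK - \alpha w_{1} - w_{2} + \alpha w_{2})^{2} C_{x}^{2} \end{bmatrix}. \tag{14}\]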
To find the optimum value of \(\alpha\) for the proposed combined family of ratio estimators \(t_{RC}\) in Eq. (11), we take the partial derivative of Eq. (14) with respect to \(\alpha\) and equate it to zero. The MSE of the proposed combined family of ratio estimators \(t_{RC}\) in Eq. (14) is minimized for:
\[\alpha_{opt} = \frac{(bK + w_2)C_x^2 - \rho C_x C_y}{(bK - w_1 + w_2)C_x^2}. \tag{15}\]
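The minimization step can be written out as follows. This is a sketch of the omitted algebra under the first-order approximation, treating b as a constant; the shorthand \(\theta(\alpha)\) is introduced here for convenience and does not appear in the original derivation. Let \(\theta(\alpha) = \alpha w_1 + (1-\alpha)(w_2 + bK)\), so that the first-order MSE of \(t_{RC}\) is

\[MSE(t_{RC}) = \frac{(1-f)}{n}\overline{Y}^{2}\left[C_y^2 + \theta^2 C_x^2 - 2\theta\rho C_x C_y\right].\]

Differentiating with respect to \(\alpha\) gives

\[\frac{\partial MSE(t_{RC})}{\partial \alpha} = \frac{(1-f)}{n}\overline{Y}^{2}\left[2\theta C_x^2 - 2\rho C_x C_y\right](w_1 - w_2 - bK).\]

Assuming \(w_1 \neq w_2 + bK\), setting the derivative to zero requires \(\theta = \rho C_y / C_x\). Solving \(\alpha w_1 + (1-\alpha)(w_2 + bK) = \rho C_y / C_x\) for \(\alpha\) yields

\[\alpha = \frac{(w_2 + bK) - \rho C_y/C_x}{w_2 + bK - w_1} = \frac{(bK + w_2)C_x^2 - \rho C_x C_y}{(bK - w_1 + w_2)C_x^2}.\]

Substituting \(\theta = \rho C_y/C_x\) back into the bracket gives \(C_y^2 + \rho^2 C_y^2 - 2\rho^2 C_y^2 = C_y^2(1-\rho^2)\).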
Substituting Eq. (15) into Eq. (11) we can find the optimum of \(t_{RC}^{opt}\) as:
\[t_{RC}^{opt} = \alpha_{opt} t_R + \left(1 - \alpha_{opt}\right) t_{R_{eg}}. \tag{16}\]
Substituting Eq. (15) into Eq. (14), the minimum MSE of \(t_{RC}^{opt}\) is given as:
\[MSE_{\min}(t_{RC}^{opt}) = \left(\frac{1-f}{n}\right)\overline{Y}^{2}C_{y}^{2}\left(1-\rho^{2}\right). \tag{17}\]
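A short numerical check of the optimum weight and the minimum MSE is sketched below. The function names and the single argument `bK` for the product \(bK\) are assumptions made for this example; the MSE is written in terms of the combined weight \(\alpha w_1 + (1-\alpha)(w_2 + bK)\), which is algebraically the same first-order expression as the MSE of \(t_{RC}\) used in Section 3.

```python
def mse_t_rc(alpha, n, N, Y_bar, C_y, C_x, rho, w1, w2, bK):
    """First-order MSE of t_RC for a given weight alpha."""
    lam = (1.0 - n / N) / n
    theta = alpha * w1 + (1.0 - alpha) * (w2 + bK)
    return lam * Y_bar ** 2 * (C_y ** 2 + theta ** 2 * C_x ** 2
                               - 2.0 * theta * rho * C_x * C_y)


def alpha_opt(C_y, C_x, rho, w1, w2, bK):
    """Optimum weight; requires bK - w1 + w2 != 0."""
    return ((bK + w2) * C_x ** 2 - rho * C_x * C_y) / ((bK - w1 + w2) * C_x ** 2)


def mse_min(n, N, Y_bar, C_y, rho):
    """Minimum first-order MSE of the combined estimator."""
    lam = (1.0 - n / N) / n
    return lam * Y_bar ** 2 * C_y ** 2 * (1.0 - rho ** 2)
```

Evaluating `mse_t_rc` at `alpha_opt(...)` reproduces `mse_min(...)` up to rounding, and any other value of `alpha` gives a larger MSE. Since \(\alpha = 1\) and \(\alpha = 0\) recover \(t_R\) and \(t_{Reg}\), the optimum combination is never worse than either component to this order of approximation.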
We can see that the estimator proposed by Enang, et al. [20] is a special case of the proposed combined family of ratio estimators \(t_{RC}\) when \(w_1 = w_2 = w_3\).
3. Efficiency Comparisons
In this section, the efficiency of the proposed combined family of ratio estimators is compared with \(t_R\), \(t_{Reg}\) and the usual sample mean estimator \(\bar{y}\) by considering expressions of MSE from these estimators up to the first order of approximation. The proposed combined family of ratio estimators \(t_{RC}\) is more efficient than the estimators \(t_R\), \(t_{Reg}\) and \(\bar{y}\) if the conditions below are satisfied. The proposed combined family of ratio estimators \(t_{RC}\) is more efficient than the estimator \(t_R\) if:
\[MSE(t_{RC}) < MSE(t_{R})\] \[\frac{(1-f)}{n} \bar{Y}^{2} \begin{bmatrix} C_{y}^{2} + 2(\alpha bK - bK - \alpha w_{1} + (\alpha - 1)w_{2})\rho C_{x}C_{y} \\ + (\alpha bK - bK - \alpha w_{1} - w_{2} + \alpha w_{2})^{2} C_{x}^{2} \end{bmatrix}\] \[< \frac{(1-f)}{n} \bar{Y}^{2} \Big[ C_{y}^{2} + w_{1} \Big( w_{1}C_{x}^{2} - 2\rho C_{x}C_{y} \Big) \Big]\]
This condition holds if:
\[\rho < \frac{\left[w_1^2 - (\alpha bK - bK - \alpha w_1 - w_2 + \alpha w_2)^2\right] C_x}{2(\alpha bK - bK - \alpha w_1 + (\alpha - 1)w_2 + w_1)C_y}.\] (18)
The proposed combined family of ratio estimators \(t_{RC}\) is more efficient than the estimator \(t_{Reg}\) if:
\[MSE(t_{RC}) < MSE(t_{Reg})\] \[\frac{(1-f)}{n} \overline{Y}^{2} \begin{bmatrix} C_{y}^{2} + 2(\alpha bK - bK - \alpha w_{1} + (\alpha - 1)w_{2})\rho C_{x}C_{y} \\ + (\alpha bK - bK - \alpha w_{1} - w_{2} + \alpha w_{2})^{2} C_{x}^{2} \end{bmatrix}\] \[< \frac{(1-f)}{n} \overline{Y}^{2} \left[ C_{y}^{2} + (w_{2} + bK)^{2} C_{x}^{2} - 2\rho C_{x}C_{y}(w_{2} + bK) \right]\]
This condition holds if:
\[\rho < \frac{\left[\left(w_2 + bK\right)^2 - \left(\alpha bK - bK - \alpha w_1 - w_2 + \alpha w_2\right)^2\right]C_x}{2\left[\left(\alpha bK + \left(w_2 - w_1\right)\alpha\right)\right]C_y}.\] (19)
The proposed combined family of ratio estimators \(t_{RC}\) is more efficient than estimator \(\bar{y}\) if:
\[MSE(t_{RC}) < V(\overline{y})\]
\[\frac{(1-f)}{n}\overline{Y}^{2} \begin{bmatrix} C_{y}^{2} + 2(\alpha bK - bK - \alpha w_{1} + (\alpha - 1)w_{2})\rho C_{x}C_{y} \\ + (\alpha bK - bK - \alpha w_{1} - w_{2} + \alpha w_{2})^{2} C_{x}^{2} \end{bmatrix} < \frac{(1-f)}{n}s^{2}\]
This condition holds if:
\[\rho < \frac{s^2 - \overline{Y}^2 \left[ C_y^2 + (\alpha bK - bK - \alpha w_1 - w_2 + \alpha w_2)^2 C_x^2 \right]}{2 \left[ (\alpha - 1)bK - \alpha w_1 + (\alpha - 1)w_2 \right] \overline{Y}^2 C_x C_y}.\] (20)
4. Simulation Study
To compare the performance of the proposed combined family of ratio estimators against the usual sample mean estimator \(\bar{y}\), a simulation study was conducted by generating (X,Y) from the bivariate normal distribution using two populations, with the following details:
Population 1: \(N = 700\), \(n = 80\), \(\mu_y = 500\), \(\mu_x = 30\), \(\rho = 0.9\), \(C_y = 10\), \(C_x = 2\)
Population 2: \(N = 250\), \(n = 50\), \(\mu_y = 40\), \(\mu_x = 25\), \(\rho = 0.7\), \(C_y = 1.5\), \(C_x = 1.6\)
Simple random sampling without replacement was used to draw a sample of size n from each population. The percentage relative efficiencies (PREs) of all estimators with respect to \(\bar{y}\) were used to compare the performance of the proposed combined family of ratio estimators with that of the existing estimators. The results are presented in Table 3. All proposed combined estimators performed better than the existing estimators for both populations: their PREs range from 460.41 to 480.83 in Population 1 and from 176.80 to 192.78 in Population 2, whereas the largest PREs among the existing estimators are 408.26 and 155.58 respectively. The combined estimator \(t_{RC_{52}}\), which combines \(t_{R_{5}}\) and \(t_{R_{eg_2}}\), performed the best.
Table 3 PREs of Proposed Estimators with respect to \(\bar{y}\).
| Estimator | PRE (Population 1) | PRE (Population 2) |
|---|---|---|
| \(\overline{y}\) | 100 | 100 |
| \(t_{R_1}\) | 87.66 | 123.89 |
| \(t_{R_2}\) | 93.13 | 142.02 |
| \(t_{R_3}\) | 91.57 | 143.15 |
| \(t_{R_4}\) | 90.27 | 132.13 |
| \(t_{R_5}\) | 106.51 | 155.58 |
| \(t_{R_{eg_1}}\) | 384.29 | 39.15 |
| \(t_{R_{eg_2}}\) | 408.26 | 46.08 |
| \(t_{R_{eg_3}}\) | 395.34 | 42.19 |
| \(t_{R_{eg_4}}\) | 103.94 | 102.29 |
| \(t_{R_{eg_5}}\) | 104.07 | 102.36 |
| \(t_{R_{eg_6}}\) | 99.52 | 22.90 |
| \(t_{RC_{21}}\) | 462.11 | 181.28 |
| \(t_{RC_{22}}\) | 472.95 | 180.12 |
| \(t_{RC_{23}}\) | 467.16 | 180.86 |
| \(t_{RC_{31}}\) | 461.24 | 180.46 |
| \(t_{RC_{32}}\) | 472.12 | 179.88 |
| \(t_{RC_{33}}\) | 466.31 | 180.33 |
| \(t_{RC_{41}}\) | 460.41 | 178.65 |
| \(t_{RC_{42}}\) | 471.06 | 176.80 |
| \(t_{RC_{43}}\) | 465.36 | 177.96 |
| \(t_{RC_{51}}\) | 467.54 | 190.40 |
| \(t_{RC_{52}}\) | 480.83 | 192.78 |
| \(t_{RC_{53}}\) | 473.80 | 191.55 |
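The simulation design of this section can be sketched as follows. This is an illustrative outline only and is not expected to reproduce the figures in Table 3: the paper does not state the number of replications, how the standard deviations were set, or how \(\alpha\) was chosen, so the code assumes standard deviations equal to \(C \times \mu\), takes \(a = d = 1\), and computes \(\alpha\) from Eq. (15) using population quantities with b replaced by the population slope. All function names are hypothetical.

```python
import math
import random


def make_population(N, mu_x, mu_y, C_x, C_y, rho, rng):
    """Draw N (x, y) pairs from a bivariate normal; sd = C * mu is an assumption."""
    sd_x, sd_y = C_x * mu_x, C_y * mu_y
    pop = []
    for _ in range(N):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu_x + sd_x * z1
        y = mu_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        pop.append((x, y))
    return pop


def population_alpha_opt(pop, c, h):
    """Eq. (15) with a = d = 1, evaluated with population quantities (an assumption)."""
    N = len(pop)
    X_bar = sum(x for x, _ in pop) / N
    Y_bar = sum(y for _, y in pop) / N
    S_xx = sum((x - X_bar) ** 2 for x, _ in pop) / (N - 1)
    S_yy = sum((y - Y_bar) ** 2 for _, y in pop) / (N - 1)
    S_xy = sum((x - X_bar) * (y - Y_bar) for x, y in pop) / (N - 1)
    C_x, C_y = math.sqrt(S_xx) / X_bar, math.sqrt(S_yy) / Y_bar
    rho = S_xy / math.sqrt(S_xx * S_yy)
    bK = (S_xy / S_xx) * X_bar / Y_bar
    w1, w2 = X_bar / (X_bar + c), X_bar / (X_bar + h)
    return ((bK + w2) * C_x ** 2 - rho * C_x * C_y) / ((bK - w1 + w2) * C_x ** 2)


def estimators(sample, X_bar, c, h, alpha):
    """Return (y_bar, t_R, t_Reg, t_RC) for one sample, following Eqs. (7), (8), (11)."""
    n = len(sample)
    x_bar = sum(x for x, _ in sample) / n
    y_bar = sum(y for _, y in sample) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in sample)
    s_xx = sum((x - x_bar) ** 2 for x, _ in sample)
    b = s_xy / s_xx
    t_r = y_bar * (X_bar + c) / (x_bar + c)
    t_reg = (y_bar + b * (X_bar - x_bar)) * (X_bar + h) / (x_bar + h)
    return y_bar, t_r, t_reg, alpha * t_r + (1.0 - alpha) * t_reg


def simulate_pre(pop, n, c, h, alpha, reps, rng):
    """Empirical PRE = 100 * MSE(y_bar) / MSE(t) over repeated SRSWOR samples."""
    X_bar = sum(x for x, _ in pop) / len(pop)
    Y_bar = sum(y for _, y in pop) / len(pop)
    names = ("y_bar", "t_R", "t_Reg", "t_RC")
    sq_err = [0.0] * len(names)
    for _ in range(reps):
        sample = rng.sample(pop, n)  # simple random sampling without replacement
        for k, est in enumerate(estimators(sample, X_bar, c, h, alpha)):
            sq_err[k] += (est - Y_bar) ** 2
    return {name: 100.0 * sq_err[0] / sq_err[k] for k, name in enumerate(names)}


if __name__ == "__main__":
    rng = random.Random(2024)
    # Population 2 settings; c = h = C_x pairs t_R2 with t_Reg2.
    pop = make_population(250, 25.0, 40.0, 1.6, 1.5, 0.7, rng)
    alpha = population_alpha_opt(pop, c=1.6, h=1.6)
    print(simulate_pre(pop, n=50, c=1.6, h=1.6, alpha=alpha, reps=500, rng=rng))
```

The PRE of \(\bar{y}\) is 100 by construction; the values for the other estimators depend on the seed, the number of replications, and the assumptions listed above.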
5. Conclusion
Although they only provide a minor improvement to the existing estimators, two new classes of population mean estimators were proposed by adjusting the estimators of Khoshnevisan, et al. [11] and Kumar [12] under SRSWOR. Then the combination of these two estimators in general form was proposed in order to find the minimum mean square error of the proposed combined family of ratio estimators. From the theoretical and empirical study it can be seen that the proposed combined family of ratio estimators performed better than the existing estimators in terms of relative efficiency percentage when certain conditions are satisfied.
Acknowledgement
This research was funded by King Mongkut's University of Technology North Bangkok under contract no. KMUTNB-60-COV-52. The authors would like to express their gratitude to the referees for their helpful comments.
