Statistical estimation and its properties. Point estimation of distribution parameters

Statistical estimates of the parameters of the general population. Statistical hypotheses

LECTURE 16

Suppose it is required to study a quantitative attribute of a general population. Assume that theoretical considerations have made it possible to establish which distribution the attribute follows. This gives rise to the problem of estimating the parameters that determine this distribution. For example, if it is known that the attribute under study is normally distributed in the general population, then it is necessary to estimate (approximately find) the mathematical expectation and the standard deviation, since these two parameters completely determine the normal distribution. If there are reasons to believe that the attribute has a Poisson distribution, then it is necessary to estimate the parameter λ, which determines this distribution.

Usually the researcher has only sample data at his disposal: the values x_1, x_2, …, x_n of the quantitative attribute obtained as a result of n observations (throughout, the observations are assumed to be independent). The estimated parameter is expressed through these data.

Considering x_1, x_2, …, x_n as values of independent random variables X_1, X_2, …, X_n, we can say that to find a statistical estimate of an unknown parameter of a theoretical distribution means to find a function of the observed random variables that gives an approximate value of the estimated parameter. For example, as will be shown below, to estimate the mathematical expectation of a normal distribution one uses the arithmetic mean of the observed values of the attribute:

x̄ = (x_1 + x_2 + … + x_n)/n.

So, a statistical estimate of an unknown parameter of a theoretical distribution is a function of the observed random variables. A statistical estimate of an unknown parameter of the general population written as a single number is called a point estimate. We consider the following kinds of point estimates: biased and unbiased, efficient, and consistent.

For statistical estimates to give "good" approximations of the estimated parameters, they must satisfy certain requirements. Let us specify these requirements.

Let θ* be a statistical estimate of the unknown parameter θ of the theoretical distribution. Assume that for a sample of size n an estimate θ*_1 is found. Let us repeat the experiment: extract another sample of the same size from the general population and use its data to find an estimate θ*_2, and so on. Repeating the experiment many times, we obtain the numbers θ*_1, θ*_2, …, θ*_k, which, generally speaking, differ from one another. Thus the estimate θ* can be considered a random variable, and the numbers θ*_1, θ*_2, …, θ*_k its possible values.

It is clear that if the estimate θ* gives an approximate value with an excess, then each number θ*_i found from sample data will be greater than the true value θ. Consequently, in this case the mathematical expectation (mean value) of the random variable θ* will be greater than θ, that is, M(θ*) > θ. Obviously, if θ* gives an approximate value with a deficiency, then M(θ*) < θ.


Therefore, the use of a statistical estimate whose mathematical expectation is not equal to the estimated parameter leads to systematic (one-sided) errors. For this reason, it is natural to require that the mathematical expectation of the estimate equal the estimated parameter. Compliance with this requirement does not, in general, eliminate errors (some values of θ* are greater than θ and others less), but errors of different signs will occur equally often. Compliance with the requirement does, however, rule out systematic errors.

An estimate is called unbiased if its mathematical expectation is equal to the estimated parameter for any sample size, that is, M(θ*) = θ.

An estimate is called biased if its mathematical expectation is not equal to the estimated parameter, that is, M(θ*) ≠ θ.

However, it would be erroneous to assume that an unbiased estimate always gives a good approximation of the estimated parameter. Indeed, the possible values of θ* may be highly scattered around their mean value, i.e. the variance D(θ*) may be significant. In this case, the estimate found from the data of one sample may turn out to be far from the mean value, and hence from the estimated parameter itself. Thus, taking θ* as an approximate value of θ, we would make a large error. If, however, the variance is required to be small, the possibility of making a large error is excluded. For this reason, the requirement of efficiency is imposed on a statistical estimate.

An estimate is called efficient if (for a given sample size n) it has the smallest possible variance.

An estimate is called consistent if it tends in probability to the estimated parameter, that is, if for any ε > 0 the following equality holds:

lim_{n→∞} P(|θ* − θ| < ε) = 1.

For example, if the variance of an unbiased estimator tends to zero as n → ∞, then such an estimator also turns out to be consistent.
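To make the consistency property concrete, the following minimal simulation sketch (plain Python, standard library only; the population parameters, the tolerance ε, and the sample sizes are arbitrary choices for illustration) draws repeated samples from a normal population and estimates P(|x̄ − a| < ε) for growing n; the probability should approach 1.

```python
import random

MU, SIGMA = 5.0, 2.0  # assumed population mean and standard deviation
EPS = 0.1             # arbitrary small tolerance epsilon
TRIALS = 1000         # number of repeated samples per sample size

for n in (10, 100, 1000, 10000):
    hits = 0
    for _ in range(TRIALS):
        x_bar = sum(random.gauss(MU, SIGMA) for _ in range(n)) / n
        if abs(x_bar - MU) < EPS:
            hits += 1
    # consistency: P(|x_bar - MU| < EPS) -> 1 as n grows
    print(f"n = {n:5d}:  P(|x_bar - mu| < eps) ~ {hits / TRIALS:.3f}")
```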

Consider the question of which sample characteristics best estimate the general mean and variance in terms of unbiasedness, efficiency, and consistency.

Let a discrete general population be studied with respect to some quantitative attribute X.

The general mean x̄_G is the arithmetic mean of the values of the attribute over the general population. It is calculated by the formula:

§ x̄_G = (x_1 + x_2 + … + x_N)/N — if all values of the attribute in the general population of size N are different;

§ x̄_G = (x_1 N_1 + x_2 N_2 + … + x_k N_k)/N — if the values x_1, …, x_k of the attribute have frequencies N_1, …, N_k respectively, with N_1 + N_2 + … + N_k = N. That is, the general mean is the weighted average of the attribute values with weights equal to the corresponding frequencies.

Comment: let the general population of size N contain objects with different values of the attribute X. Imagine that one object is selected at random from this collection. The probability that an object with attribute value x_1, say, will be drawn is obviously 1/N; any other object can be drawn with the same probability. Thus the value of the attribute can be considered a random variable whose possible values x_1, x_2, …, x_N have identical probabilities equal to 1/N. It is then easy to find the mathematical expectation:

M(X) = x_1·(1/N) + x_2·(1/N) + … + x_N·(1/N) = (x_1 + x_2 + … + x_N)/N = x̄_G.

So, if the examined attribute of the general population is regarded as a random variable, then the mathematical expectation of the attribute is equal to the general mean of this attribute: M(X) = x̄_G. We arrived at this conclusion by assuming that all objects of the general population have different values of the attribute. The same result is obtained if the general population contains several objects with the same attribute value.

Generalizing this result to a general population with a continuous distribution of the attribute X, we define the general mean as the mathematical expectation of the attribute: x̄_G = M(X).

Let a sample of size n be extracted to study the general population with respect to the quantitative attribute.

The sample mean x̄ is the arithmetic mean of the values of the attribute over the sample population. It is calculated by the formula:

§ x̄ = (x_1 + x_2 + … + x_n)/n — if all values of the attribute in the sample of size n are different;

§ x̄ = (x_1 n_1 + x_2 n_2 + … + x_k n_k)/n — if the values x_1, …, x_k of the attribute have frequencies n_1, …, n_k respectively, with n_1 + n_2 + … + n_k = n. That is, the sample mean is the weighted average of the attribute values with weights equal to the corresponding frequencies.

Comment: the sample mean found from the data of one sample is obviously a certain number. If we extract other samples of the same size from the same general population, then the sample mean will change from sample to sample. Thus, the sample mean can be considered as a random variable, and therefore, we can talk about the distributions (theoretical and empirical) of the sample mean and the numerical characteristics of this distribution, in particular, the mean and variance of the sample distribution.

Further, if the general mean is unknown and must be estimated from sample data, then the sample mean is taken as an estimate of the general mean; it is an unbiased and consistent estimate (we suggest proving this statement independently). It follows from the foregoing that if sample means are found from several samples of sufficiently large size drawn from the same general population, they will be approximately equal to one another. This is the property of stability of sample means.

Note that if the variances of two populations are the same, then the closeness of the sample means to the general means does not depend on the ratio of the sample size to the size of the general population; it depends on the sample size itself: the larger the sample size, the less the sample mean differs from the general one. For example, if 1% of objects are selected from one population and 4% from another, and the size of the first sample turns out to be larger than that of the second, then the first sample mean will differ less from the corresponding general mean than the second.

Questions of statistical estimation tie together such problem areas of mathematical statistics as scientific methodology, random variables, statistical distributions, and so on. Errors are inherent in any sample owing to incomplete coverage of units, measurement errors, and similar causes. In real life such errors give every hypothesis (in particular, one formulated on the basis of economic conclusions) a random, stochastic character. Regardless of the number of variables involved in the theoretical hypotheses, it is assumed that the influence of errors of various kinds can be adequately described by a single component. This methodological approach allows us to restrict ourselves to a one-dimensional probability distribution while estimating several parameters simultaneously.

Statistical estimation is one of the two types of statistical judgment (the second type is hypothesis testing). It is a method for judging the numerical values of the characteristics (parameters) of the distribution of the general population from sample data drawn from this population. That is, having the results of a sample observation, we try to estimate (with the greatest possible accuracy) the values of the parameters on which the distribution of the variable of interest depends in the general population. Since the sample includes only a part of the population (sometimes a very small one), there is a risk of error. Although this risk decreases as the number of observation units grows, it is still present in sample observation. Hence the decision taken from the results of the sample has a probabilistic character. But it would be wrong to consider statistical judgments only in terms of probabilities: this approach is not always sufficient to build correct theoretical assumptions about the parameters of the general population, and a number of additional judgments are often needed for a deeper justification. For example, suppose it is necessary to estimate, as closely as possible, the average number of skilled workers at the enterprises of a region. In this case, the arithmetic mean of the variable x in the general population, which has a normal distribution, is estimated. Having obtained a sample of n units for this attribute, it is necessary to decide: what value, according to the sample data, should be taken as closest to the mean of the general population? There are several quantities whose mathematical expectation is equal to the desired parameter (or close to it): a) the arithmetic mean; b) the mode; c) the median; d) the midrange, calculated from the extremes of the range of variation; etc.

From a probabilistic point of view, each of the above quantities can be considered to give the best approximation to the desired population parameter, since the mathematical expectation of each of these functions (especially for large samples) is equal to the general mean. This is because, with repeated drawing of samples from the same general population, an "on average" correct result will be obtained.

The correctness "on average" is explained by the equality of repetitions of positive and negative deviations of the emerging errors in the estimation of the general average, that is, the average estimation error will be zero.

In practice, as a rule, a single sample is organized, so the researcher is interested in a more accurate estimate of the desired parameter from the results of that particular sample. To solve such a problem, in addition to conclusions that follow directly from abstract probability calculations, additional rules are needed to motivate the best approximation of the estimate to the desired parameter of the general population.

There are many ways to estimate constants from sample observations. Which of them are best for solving specific research problems is the subject of the theory of statistical estimation. It explores the conditions that an estimate should obey and singles out the estimates that are preferable under the given circumstances; estimation theory indicates the superiority of one estimate over another.

As is known, information obtained from a sample is not categorical in its conclusions. If, for example, 99 of 100 animals examined for a disease turn out to be healthy, there remains a possibility that the one animal left unexamined carries the virus of the suspected disease. Since this is unlikely, it is concluded that the disease is absent. In most cases such a conclusion is fully justified.

Basing practical activity on such findings, the experimenter (researcher) relies not on the certainty of the information but only on its probability.

The other side of sample observation, as already noted, is the problem of determining as objectively as possible the degree of reliability of the sample estimates obtained. The solution of this problem seeks the most accurate probabilistic expression, that is, it concerns determining the degree of accuracy of the estimate. Here the researcher determines the boundaries of the possible discrepancy between the estimate obtained from the sample and the actual value of the parameter in the general population.

The accuracy of an estimate depends on the method of its calculation from the sample data and on the method of selecting units into the sample.

The method of obtaining estimates involves a computational procedure (method, rule, algebraic formula); this is the province of the theory of statistical estimation. Methods of selection lead to questions of the technique of carrying out sample research.

The foregoing allows us to define the concept of a "statistical estimate".

A statistical estimate is an approximate value of the desired parameter of the general population, obtained from the results of the sample, which makes it possible to make informed decisions about the unknown parameters of the general population.

Suppose that ^ "is a statistical estimate of the unknown parameter ^ of the theoretical distribution. By repeating the same

The sample size from the population found estimates and 2 ^ ""n,

having different meanings. Therefore, the estimate ^ ", can be considered as

random variable, and +17 two, 3 ~ "n - as its possible values. As random value, it is characterized by a certain probability density function. Since this function is due to the result of selective observation (experiment), it is called selective distribution. Such a function describes the probability density for each of the estimates, using a certain number of sample

observations. If we assume that, the statistical estimate ^ ", is an algebraic function of a certain data set and such a set will be obtained during selective observation, then in

In general, the estimate will receive the expression: ® n = f (Xl.X2, ^ 3, ... X t).

Upon completion of the sample survey, this function is no longer an estimate in general form: it takes a specific value, that is, it becomes a quantitative estimate (a number). In other words, it follows from the above expression that any of the indicators characterizing the results of a sample observation can be considered an estimate. The sample mean is an estimate of the general mean; the variance calculated from the sample, or the standard deviation derived from it, is an estimate of the corresponding characteristic of the general population, and so on.

As already noted, the calculation of statistical estimates does not guarantee the elimination of errors. The point is that these errors must not be systematic: their presence should be random. Let us consider the methodological side of this proposition.

Assume that the estimate θ* gives an inexact value of the parameter θ of the population with a deficiency. In this case each calculated value θ*_i (i = 1, 2, 3, …, n) will be less than the actual value θ.

For this reason the mathematical expectation (mean value) of the random variable θ* will be less than θ, that is, M(θ*) < θ. Conversely, if θ* gives an estimate with an excess, then the mathematical expectation of the random variable θ* will be greater than θ.

It follows that the use of a statistical estimate whose mathematical expectation is not equal to the estimated parameter leads to systematic errors, that is, to non-random errors that distort the measurement results in one direction.

A natural requirement arises: the mathematical expectation of the estimate θ* must equal the estimated parameter. Compliance with this requirement does not eliminate errors in general, since the sample values of the estimate may be greater or less than the actual value of the parameter of the general population. But errors in one direction and the other from the value θ will occur (according to probability theory) with the same frequency. Consequently, compliance with this requirement — that the mathematical expectation of the sample estimate equal the estimated parameter — eliminates systematic (non-random) errors, that is:

M(θ*) = θ.

The choice of the statistical estimate giving the best approximation of the estimated parameter is an important problem of estimation theory. If it is known that the distribution of the random variable under study in the general population follows the normal law, then the mathematical expectation and the standard deviation must be estimated from the sample data, because these two characteristics completely determine the normal distribution. If the random variable under study is distributed according to the Poisson law, the parameter λ is estimated, since it determines this distribution.

Mathematical statistics distinguishes the following methods for obtaining statistical estimates from sample data: the method of moments and the method of maximum likelihood.

When obtaining estimates by the method of moments, the moments of the general population are replaced by the moments of the sample population (frequencies are used as weights instead of probabilities).
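A minimal sketch of this moment substitution (standard-library Python; the true λ is an assumed value, and the inverse-transform sampler is an implementation detail, not part of the lecture): for the Poisson law the first theoretical moment is M(X) = λ, so the method of moments takes the sample mean as the estimate of λ.

```python
import math
import random
import statistics

LAM_TRUE = 3.5  # assumed true Poisson parameter

def poisson_draw(lam: float) -> int:
    # Knuth's product-of-uniforms method for sampling one Poisson value
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

sample = [poisson_draw(LAM_TRUE) for _ in range(5000)]

# Method of moments: equate the first sample moment (the sample mean)
# to the first theoretical moment M(X) = lambda and solve for lambda.
lam_hat = statistics.mean(sample)
print(f"lambda estimate: {lam_hat:.3f}  (true value {LAM_TRUE})")
```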

In order for a statistical estimate to give a "best approximation" to a general characteristic, it must have a number of properties. They will be discussed below.

The possibility of choosing the best estimate depends on knowing the basic properties of estimates and being able to classify estimates by these properties. In the mathematical literature the "properties of estimates" are sometimes called "requirements for estimates" or "criteria for estimates". The main properties of statistical estimates are unbiasedness, efficiency, consistency, and sufficiency.

If we assume that the sample mean x̄ and the sample variance σ²_B are estimates of the corresponding general characteristics (x̄_G and σ²), then we take into account that with a large number of sampling units these characteristics approach their mathematical expectations. If the number of sample units is small, these characteristics may differ significantly from the corresponding mathematical expectations.

If the mean value of the sample characteristic chosen as an estimate corresponds to the value of the general characteristic, the estimate is called unbiased. The proof that the expectation of the sample mean equals the general mean, M(x̄) = x̄_G, shows that x̄ is an unbiased estimate of the general mean. The situation is different with the sample variance σ²_B: its mathematical expectation

M(σ²_B) = ((n − 1)/n)·σ²

is not equal to the general variance. So σ²_B is a biased estimate of σ². To eliminate the systematic error and obtain an unbiased estimate, the sample variance is multiplied by the correction factor n/(n − 1) (this follows from the equation above):

s² = (n/(n − 1))·σ²_B.

Thus, with a small sample, the unbiased variance is:

s² = (n/(n − 1))·Σ(x_i − x̄)²/n = Σ(x_i − x̄)²/(n − 1).

The factor n/(n − 1) is called the Bessel correction. The mathematician Bessel was the first to establish that the sample variance is a biased estimate of the general variance, and he applied this correction to adjust estimates. For small samples the correction n/(n − 1) differs noticeably from 1; as the number of observation units increases, it quickly approaches 1. At n > 50 the difference between the two estimates practically disappears, i.e. σ²_B ≈ s². From the foregoing follow the definitions of the unbiasedness requirement.

An estimate is called unbiased if its mathematical expectation, for any sample size, is equal to the value of the parameter of the general population, that is, M(θ*) = θ; M(x̄) = x̄_G.

The category "mathematical expectation" is studied in the course of probability theory. This is a numerical characteristic of a random variable. The mathematical expectation is approximately equal to the average value of a random variable. Mathematical expectation of a discrete random variable is the sum of the products of all its possible values ​​and their probabilities. Assume that n studies have been performed in which the random variable X took w 1 time the value of w 2 times the value of W and times the value of X k. In this case, W 1 + W 2 + W 3 + ... + W k \u003d n. Then the sum of all values ​​\u200b\u200btaken x is equal to

x 1 w 1 + x 2 w 2 + x 3 w 3 + ... + x k w k

The arithmetic mean of these values ​​will be:

X 1 w 1 + x 2 w 2 + x 3 w 3 + ... + x k w k - w 1^ w 2 ^ w 3 ^ ^ w k

P or 1 p 2 p 3 p 1 p.

Since n is the relative frequency ^ value X ^ P- the relative frequency of the value x 2, etc., the above equation will take the form:

X = X 1 No. 1 + X 2 No. 2 + X 3 No. 3 + ... + X to N> to

With a large number of sample observations, the relative frequency is approximately equal to the probability of the occurrence of an event, that is

u> 1 = L; ^ 2 \u003d W \u003d ™ k \u003d Pk and therefore x 2 x 1 p 1 + x 2 p 2 + X 3 g. 3 + ... + X KRK. Then

x~ m(x) the probabilistic meaning of the result of the calculations is that the mathematical expectation is approximately equal (the more accurate, the larger the sample) to the arithmetic mean of the observed values ​​of the random variable [M (x -) = ~ 1.

The unbiasedness criterion guarantees the absence of systematic errors in estimating the parameters of the general population.

Note that the sample estimate θ* is a random variable whose value may change from one sample to another. The measure of its variation (scatter) around the mathematical expectation of the parameter θ of the general population is the variance σ²(θ*).

Let θ*_1 and θ*_2 be two unbiased estimates of the parameter θ, i.e. M(θ*_1) = θ and M(θ*_2) = θ, with variances D(θ*_1) and D(θ*_2). Of the two estimates, preference is given to the one with the smaller dispersion around the estimated parameter: if the variance of θ*_1 is less than the variance of θ*_2, then θ*_1 is taken as the estimate of θ.

An unbiased estimate that has the smallest variance among all possible unbiased estimates of the parameter computed from samples of the same size is called an efficient estimate. This is the second property (requirement) of statistical estimates of parameters of the general population. It must be remembered that an estimate efficient for a population subject to one distribution law need not coincide with the efficient estimate for another distribution.

When large samples are considered, statistical estimates should have the property of consistency. That an estimate is consistent (the terms "capable" and "fit" are also used) means that the larger the sample size, the greater the probability that the estimation error will not exceed an arbitrarily small positive number ε. The estimate θ* of the parameter θ is called consistent if it obeys the law of large numbers, that is, the following equality holds:

lim_{n→∞} P{|θ* − θ| < ε} = 1.

As can be seen, a statistical estimate is called consistent if, as n → ∞, it approaches the estimated parameter in probability. In other words, it is the value of an indicator obtained from the sample which, by the law of large numbers, converges in probability to its mathematical expectation as the sample size increases. For example, if the variance of an unbiased estimate tends to zero as n → ∞, then such an estimate also turns out to be consistent (by Chebyshev's inequality, the probability of a deviation greater than ε then tends to zero).

Consistent estimates include:

1) the share of the attribute in the sample, that is, the relative frequency, as an estimate of the share of the attribute in the general population;

2) the sample mean as an estimate of the general mean;

3) sample variance as an estimate of the general variance;

4) the sample coefficients of skewness and kurtosis as estimates of the corresponding general coefficients.

In the literature on mathematical statistics one cannot always find a description of the fourth property of statistical estimates, sufficiency. A sufficient (or exhaustive) estimate is one that ensures complete coverage of all the sample information about the unknown parameter of the general population. Thus a sufficient estimate includes all the information contained in the sample about the studied statistical characteristic of the general population. None of the three properties considered earlier guarantees this: they do not ensure that the estimate carries all the information about the parameter under study that a sufficient statistical estimate does.

Thus the sample arithmetic mean x̄ is an unbiased estimate of the arithmetic mean x̄_G of the population. The unbiasedness of this estimate shows that if a large number of random samples were taken from the general population, their means would deviate from the general mean upward and downward equally; that is, the unbiasedness property of a good estimate also shows that the mean value of an infinitely large number of sample means equals the value of the general mean.

In symmetric distribution series, the median is an unbiased estimate of the general mean. Provided that the size of the sample population approaches that of the general population (n → N), the median can in such series also be a consistent estimate of the general mean. However, for samples of large size, the standard error of the median σ_me is 1.2533 times the standard error of the sample mean σ_x̄, that is, σ_me ≈ 1.2533·σ_x̄. Therefore the median cannot be an efficient estimate of the arithmetic mean of the population, since its mean square error is greater than that of the sample arithmetic mean. Moreover, the arithmetic mean satisfies the conditions of unbiasedness and consistency and is therefore the best estimate.

Another setting is also possible: can the sample arithmetic mean be an unbiased estimate of the median in symmetric population distributions, for which the mean and the median coincide? And will the sample mean be a consistent estimate of the population median? In both cases the answer is yes: for the population median (with a symmetric distribution), the sample arithmetic mean is an unbiased and consistent estimate.

Keeping in mind that σ_me ≈ 1.2533·σ_x̄, we conclude that the sample arithmetic mean, and not the sample median, is the more efficient estimate of the median of the general population under study.
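The factor 1.2533 is √(π/2), and it can be checked by simulation. A minimal sketch (standard-library Python; a standard normal population and the sample size are assumed for illustration) compares the empirical standard errors of the sample mean and sample median:

```python
import math
import random
import statistics

n, reps = 101, 5000  # odd n, so the median is an actual observation

means, medians = [], []
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.mean(xs))
    medians.append(statistics.median(xs))

se_mean = statistics.pstdev(means)
se_median = statistics.pstdev(medians)
# for a normal population the ratio approaches sqrt(pi/2) = 1.2533...
print(f"se(median) / se(mean) ~ {se_median / se_mean:.4f}")
print(f"theory: {math.sqrt(math.pi / 2):.4f}")
```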

Not every characteristic of the sample is automatically the best estimate of the corresponding characteristic of the population. Knowing the properties of estimates allows us not only to choose estimates but also to improve them. As an example, consider the case when calculations show that the standard deviations of several samples from the same general population are in all cases less than the standard deviation of the general population, the magnitude of the difference depending on the sample size. Multiplying the sample standard deviation by a correction factor, we obtain an improved estimate of the standard deviation of the population. As such a correction factor the Bessel correction is used: to eliminate the bias, the sample standard deviation is multiplied by √(n/(n − 1)). This shows that the sample standard deviation, used as an estimate, gives an understated value of the population parameter.

As is known, a statistical characteristic of the sample population serves as an approximate estimate of the unknown parameter of the general population. The estimate itself can take the form of a single number or of a specific point; an estimate determined by a single number is called a point estimate. Thus the sample mean x̄ is the unbiased and most efficient point estimate of the general mean x̄_G, while the sample variance σ²_B is a biased point estimate of the general variance σ². If we denote the mean error of the sample mean by m, the point estimate of the general mean can be written as x̄ ± m. This means that x̄ is an estimate of the general mean x̄_G with an error equal to m. It is clear that the point statistical estimates of x̄_G and σ² should not have a systematic error toward overestimation or underestimation of the estimated parameters. As mentioned earlier, estimates satisfying this condition are called unbiased. What, then, is the error m of the parameter? It is the average of many specific errors.

Point estimation of a parameter of the general population consists in the following: out of the various possible sample estimates, the one with optimal properties is first selected, and then the value of this estimate is calculated. The resulting calculated value is considered the best approximation to the unknown true value of the population parameter. Additional calculations related to determining a possible estimation error are not always obligatory (depending on the estimation task) but, as a rule, are carried out almost always.

Let us consider examples of determining a point estimate for the average of the characteristics under study and for their share in the general population.

Example. The region's grain crops occupy 20,000 hectares. A 10% sample survey of the fields yielded the following sample characteristics: average yield 30 centners per hectare, variance of the yield 4, area under high-yielding crops 1,200 hectares.

What can be said about the value of the average yield of grain crops in the region, and what is the numerical value of the share (specific weight) of high-yielding crops in the total area of grain crops of the region? That is, the named parameters (the general mean x̄_G and the general share p) must be estimated in the general population. For calculating the estimates we have:

N = 20,000; n = 20,000 × 0.1 = 2,000; x̄ = 30; σ² = 4; w = 1,200/2,000 = 0.6.

As is known, the sample arithmetic mean is an efficient estimate of the general arithmetic mean. Thus the best estimate of the general parameter x̄_G can be taken to be 30. To determine the degree of accuracy of the estimate, we find its mean (standard) error:

m = √( (σ²/n)·(1 − n/N) ) = √( (4/2000)·(1 − 2000/20000) ) ≈ 0.04.

The resulting error value indicates a high accuracy of the estimate. The value of m here means that with repeated samples of this kind, the parameter estimation error would average 0.04. By the point estimate, the average yield in the farms of the region will be x̄_G = 30 ± 0.04 centners per hectare.

To obtain a point estimate of the share of high-yielding grain crops in the total area of grain crops, the best estimate is the sample share w = 0.6. Thus we can say that, by the results of the observations, the best estimate of the desired structural indicator is the number 0.6. Refining the calculations, we compute the mean error of this estimate:

m_w = √( (w(1 − w)/n)·(1 − n/N) ) = √( (0.6 × 0.4/2000)·(1 − 2000/20000) ) ≈ 0.01.

As can be seen, the mean error of estimating the general characteristic is 0.01.

The result obtained means that if samples of 2,000 hectares of grain were repeated many times, the mean error of the accepted estimate of the share (specific weight) of high-yielding crops in the grain area of the region's enterprises would be ±0.01. In this case p = 0.6 ± 0.01; in percentage terms, the share of high-yielding crops in the total grain area of the region will average 60 ± 1%.

Calculations show that for this specific case the best estimate of the desired structural indicator is the number 0.6, with a mean estimation error in either direction of approximately 0.01. As can be seen, the estimate is quite accurate.
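The arithmetic of this example is compact enough to script. A minimal sketch (plain Python) reproducing both standard errors, including the finite-population correction 1 − n/N that appears in the formulas above:

```python
import math

N = 20_000          # general population: hectares of grain crops
n = int(N * 0.10)   # 10% sample -> 2,000 ha
x_bar = 30.0        # sample mean yield, centners per hectare
var = 4.0           # sample variance of the yield
w = 1_200 / n       # sample share of high-yielding crops = 0.6

fpc = 1 - n / N     # finite-population correction (non-repeated selection)
m_mean = math.sqrt(var / n * fpc)
m_share = math.sqrt(w * (1 - w) / n * fpc)

print(f"mean yield:  {x_bar} +/- {m_mean:.2f} centners/ha")  # 30 +/- 0.04
print(f"share:       {w:.1f} +/- {m_share:.2f}")             # 0.6 +/- 0.01
```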

Several methods are known for point estimation of the standard deviation in cases where the sample is drawn from a normally distributed general population and the parameter σ is unknown. A simple (easiest to calculate) estimate is the range of variation R of the sample multiplied by a correction factor taken from standard tables, which depends on the sample size (for small samples). The standard deviation of the general population can also be estimated using the sample variance calculated with regard for the number of degrees of freedom: the square root of this variance gives the value used as an estimate of the general standard deviation σ.

Using this value of σ, the mean error of the estimate of the general mean x̄_G is calculated in the manner discussed above.

As mentioned earlier, in accordance with the requirement of consistency, confidence in the accuracy of a particular point estimate increases with the sample size. It is somewhat difficult to demonstrate this theoretical point with a point estimate alone; the influence of the sample size on the accuracy of the estimate becomes obvious when interval estimates are calculated. They are discussed below.

Table 39 lists the most commonly used point estimates of population parameters.

Table 39

Basic point estimates

Estimates calculated in different ways may not coincide in magnitude. For this reason, in practical calculations one should not compute every possible variant in turn but, relying on the properties of the various estimates, choose one of them.

With a small number of observation units, a point estimate is largely random and hence not very reliable: in small samples it can differ greatly from the estimated characteristic of the general population. This leads to gross errors in conclusions extended to the general population from the sample results. For this reason, interval estimates are used for small samples.

In contrast to a point estimate, an interval estimate gives a range of points within which the population parameter is expected to lie. In addition, an interval estimate is accompanied by a probability, which makes it important in statistical analysis.

An interval estimate is an estimate characterized by two numbers, the boundaries of the interval that covers the estimated parameter. Such an estimate is an interval in which the desired parameter lies with a given probability; the center of the interval is the sample point estimate.

Thus, interval estimation is a further development of point estimation for cases when point estimation is inefficient because of a small sample size.

The problem of interval estimation in general form can be formulated as follows: from the data of a sample observation it is necessary to construct a numerical interval about which, with a previously chosen probability level, it can be asserted that the estimated parameter lies within it.

If a sufficiently large number of sampling units is taken then, using Lyapunov's theorem, one can find the probability that the sampling error does not exceed some given value Δ, that is, that

|x̄ − x̄_G| ≤ Δ or |w − p| ≤ Δ.

In particular, this theorem makes it possible to estimate the errors of the approximate equalities

W ≈ p (W = n_i/n is the relative frequency) and x̄ ≈ x̄_G.

If x_1, x_2, …, x_n are independent random variables and n → ∞, then the probability that their mean x̄ lies in the range from a to b can be determined from the relation

P(a < x̄ < b) = Φ(t_2) − Φ(t_1), where t_1 = (a − E(x̄))/σ_x̄, t_2 = (b − E(x̄))/σ_x̄,

and Φ is the standard normal distribution function.

The probability P is called the confidence probability.

Thus, the confidence probability (reliability) of an estimate of a general parameter from a sample estimate is the probability with which the inequalities

|x̄ − x̄_G| < Δ; |w − p| < Δ

are realized, where Δ is the marginal error of the estimate for the mean and for the share respectively.

The boundaries within which the general characteristic lies with the given probability form the confidence interval, and the endpoints of this interval are called the confidence boundaries (limits).

Confidence (or tolerance) boundaries are boundaries beyond which the given characteristic falls, owing to random fluctuations, only with an insignificant probability (α₁ ≤ 0.05; α₂ ≤ 0.01; α₃ ≤ 0.001). The concept of a "confidence interval" was introduced by J. Neyman and K. Pearson (1950). It is an interval established from sample data which, with a given probability (the confidence probability), covers the true but unknown value of the parameter. If the confidence probability level is taken as 0.95, this means that with frequent application of this method of calculation the confidence interval will cover the parameter in approximately 95% of cases. The confidence interval of the general mean and of the general share is determined from the inequalities given above, from which it follows that

x̄ − Δ ≤ x̄_G ≤ x̄ + Δ; w − Δ ≤ p ≤ w + Δ.

In mathematical statistics, the reliability of a parameter estimate is judged by the following three probability levels (sometimes called "probability thresholds"): P₁ = 0.95; P₂ = 0.99; P₃ = 0.999. The probabilities that it was decided to neglect, namely α₁ = 0.05, α₂ = 0.01, α₃ = 0.001, are called significance levels. Of the above levels, the most reliable conclusions are provided by the probability P₃ = 0.999. Each confidence level corresponds to a certain value of the normalized deviation t (see Table 27). If standard tables of the probability integral are not at hand, this probability can be calculated to a certain degree of approximation by the formula:

P(|t| ≤ z) = (2/√(2π)) ∫₀ᶻ e^(−u²/2) du.

Figure 11 shows the parts of the total area bounded by the normal curve and the abscissa axis that correspond to t = ±1, t = ±2, t = ±3 and for which the probabilities equal 0.6827, 0.9545, and 0.9973. With point estimation, as already known, the mean sampling error is calculated; with interval estimation, the marginal error.
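When no table of the probability integral is at hand, these values can be computed directly through the error function, since P(|t| ≤ z) = erf(z/√2) for the standard normal law. A quick check of the three probabilities above (standard-library Python):

```python
import math

# P(|t| <= z) for the standard normal law equals erf(z / sqrt(2))
for z in (1, 2, 3):
    p = math.erf(z / math.sqrt(2))
    print(f"t = +/-{z}:  P = {p:.4f}")  # 0.6827, 0.9545, 0.9973
```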

Depending on the principle of selecting units (with or without replacement), the formulas for calculating sampling errors differ by the correction factor (1 − n/N).

Fig. 11. Normal probability curve

Table 40 shows the formulas for calculating the errors in estimates of the general parameter.

Let us consider a specific case of interval estimation of the parameters of the general population according to the data of sample observation.

Example. A sample survey of farms in the region found that the average daily milk yield of cows x̄ is 10 kg, and the share of purebred cattle in the total herd is 80%. The marginal sampling error with confidence probability P = 0.954 turned out to be 0.2 kg for the mean and 1% for the share of purebred cattle.

Thus, the boundaries within which the general average productivity lies will be 9.8 < x̄_G < 10.2; for the general share of purebred cattle, 79% < p < 81%.

Conclusion: with probability 0.954 it can be asserted that the difference between the sample average productivity of cows and the general average does not exceed 0.2 kg; the average daily milk yield lies between 9.8 and 10.2 kg. The share (specific weight) of purebred cattle in the enterprises of the region ranges from 79 to 81%, with an estimation error not exceeding 1%.

Table 40

Calculation of point and interval sampling errors

When organizing a sample, it is important to determine the required sample size n. The latter depends on the variation of the units of the surveyed population: the greater the variation, the larger the required sample size. There is an inverse relationship between the sample size and its marginal error: the desire to obtain a smaller error requires an increase in the sample size.

The required sample size is determined from the formulas for the marginal sampling error Δ at a given probability level P. Mathematical transformations yield the formulas for calculating the sample size (Table 41).

Table 41

Calculation of the required sample size
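The body of Table 41 did not survive reproduction here, so the sketch below relies on the standard textbook formulas obtained by solving the marginal-error expression for n (an assumption, not a quotation from the table): n = t²σ²/Δ² for repeated (with-replacement) selection, and n = t²σ²N/(Δ²N + t²σ²) for non-repeated selection.

```python
import math

def n_repeated(t: float, sigma2: float, delta: float) -> int:
    # n = t^2 * sigma^2 / delta^2  (selection with replacement)
    return math.ceil(t * t * sigma2 / (delta * delta))

def n_nonrepeated(t: float, sigma2: float, delta: float, N: int) -> int:
    # n = t^2 * sigma^2 * N / (delta^2 * N + t^2 * sigma^2)
    return math.ceil(t * t * sigma2 * N / (delta * delta * N + t * t * sigma2))

# illustrative numbers only: t = 2 (P ~ 0.954), sigma^2 = 4, delta = 0.2
print(n_repeated(2.0, 4.0, 0.2))             # 400
print(n_nonrepeated(2.0, 4.0, 0.2, 20_000))  # 393
```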

It should be noted that everything stated about statistical estimates rests on the assumption that the sample population whose parameters are used in the estimation is obtained by a selection method (procedure) that ensures the randomness of the sample.

At the same time, when choosing the confidence level of an estimate, one should be guided by the principle that the choice of level is not a mathematical problem: it is determined by the specific problem being solved. In support of this, consider an example.

Example. Suppose that at two enterprises the probability of producing acceptable (high-quality) output is P = 0.999; that is, the probability of a defective item is α = 0.001. Can one, within purely mathematical considerations and without regard to the nature of the product, decide whether such a defect probability α = 0.001 is acceptable? Say one enterprise produces seeders and the other aircraft for treating crops. If there is one defective seeder per 1,000, this can be tolerated, because remelting 0.1% of seeders is cheaper than restructuring the technological process. If there is one defective aircraft per 1,000, this will certainly lead to serious consequences in operation. So in the first case the defect probability α = 0.001 can be accepted; in the second case it cannot. For this reason, the choice of the confidence probability in calculations in general, and in the calculation of estimates in particular, should be made on the basis of the specific conditions of the problem.

Depending on the objectives of the study, it may be necessary to calculate one or two confidence limits. If the features of the problem require setting only one boundary, upper or lower, the probability with which this boundary is set will be higher than when both boundaries are specified for the same value of the confidence coefficient t.

Let the confidence limits be set with probability P = 0.95; that is, in 95% of cases the general mean x̄_G will be no less than the lower confidence limit x_lower = x̄ − t·m and no more than the upper confidence limit x_upper = x̄ + t·m. In this case, only with probability α = 0.05 (or 5%) can the general mean go beyond these boundaries. Since the distribution of x̄ is symmetric, half of this probability level, i.e. 2.5%, falls on the case x̄_G < x_lower and the other half on the case x̄_G > x_upper. It follows that the probability that the general mean is less than the upper confidence limit x_upper equals 0.975 (that is, 0.95 + 0.025). Hence, with two confidence limits we neglect both the values of x̄_G smaller than x_lower and those greater than x_upper; taking only one confidence limit, say x_upper, we neglect only the values exceeding that limit. For the same value of the confidence coefficient t, the significance level α here turns out to be half as large.

If only values of the characteristic that exceed (or, conversely, do not exceed) the value of the desired parameter are considered, the confidence interval is called one-sided; if the values under consideration are bounded on both sides, the confidence interval is called two-sided. It follows from the above that hypotheses, and a number of tests, in particular Student's t test, can be one-sided or two-sided. Consequently, under a two-sided hypothesis the significance level for the same value of t will be twice that of a one-sided one. If we want to keep the significance level (and the confidence level) the same for a one-sided hypothesis as for a two-sided one, the value of t should be taken smaller. This feature was taken into account when compiling the standard tables of Student's t criterion (Appendix 1).

It is known that, from the practical point of view, interest often attaches not so much to the confidence interval of the possible value of the general mean as to the maximum and minimum values which the general mean cannot exceed or fall below with a given (confidence) probability. In mathematical statistics these are called the guaranteed maximum and the guaranteed minimum of the mean. Denoting these parameters by x̄_max and x̄_min, we can write: x̄_max = x̄ + t·m; x̄_min = x̄ − t·m.

When calculating the guaranteed maximum and minimum values of the general mean as the boundaries of a one-sided confidence interval, the value of t in the above formulas is taken as a one-sided criterion.

Example. For 20 sampling plots, the average yield of sugar beet was 300 centners/ha. This sample mean characterizes the corresponding population parameter x̄_G with an error of 10 centners/ha. Owing to the sampling variability of estimates, the general average yield can be either more or less than the sample mean x̄ = 300. With probability P = 0.95 it can be asserted that the desired parameter will not exceed x̄_max = 300 + 1.73 × 10 = 317.3 centners/ha.

The value t = 1.73 is taken for the number of degrees of freedom k = 20 − 1 = 19 with a one-sided critical region and significance level α = 0.05 (Appendix 1). So, with probability P = 0.95, the guaranteed maximum possible level of the general average yield is estimated at 317.3 centners/ha; that is, even under favorable conditions the average yield of sugar beet will not exceed this value.
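The computation in this example reduces to one line once the one-sided critical value is known. A sketch (plain Python; the value 1.73 is the tabulated one-sided Student's t for 19 degrees of freedom and α = 0.05 cited in the text):

```python
x_bar = 300.0       # sample mean yield, centners/ha
m = 10.0            # standard error of the sample mean
t_one_sided = 1.73  # Student's t, df = 19, one-sided alpha = 0.05

x_max = x_bar + t_one_sided * m  # guaranteed maximum of the general mean
x_min = x_bar - t_one_sided * m  # guaranteed minimum, by the same logic
print(f"guaranteed maximum: {x_max:.1f} centners/ha")  # 317.3
print(f"guaranteed minimum: {x_min:.1f} centners/ha")  # 282.7
```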

In some branches of knowledge (for example, in the natural sciences), the theory of estimation yields in importance to the theory of testing statistical hypotheses. In economics, statistical estimation methods play a very important role in verifying the reliability of research results as well as in various practical calculations, above all through the use of point estimates of the statistical populations under study. The choice of the best possible estimate is the main problem of point estimation, and the possibility of such a choice rests on knowledge of the basic properties (requirements) of statistical estimates.

The arithmetic mean X̄ = (1/n)(X_1 + X_2 + … + X_n) (1) is an unbiased statistical estimator of the mathematical expectation, while the sample variance

s²_n = (1/n) Σ (X_i − X̄)²   (2)

is a biased statistical estimator of the variance σ², since M s²_n = ((n − 1)/n)·σ²; as an unbiased estimator for σ² one usually takes the function

s² = (1/(n − 1)) Σ (X_i − X̄)²;
see also Unbiased estimator.

As a measure of the accuracy of an unbiased statistical estimator a* for the parameter a, the variance D a* is most often taken.

The estimator with the smallest variance is called the best. In the above example, the arithmetic mean (1) is the best estimator. However, if the distribution of the random variables X_i differs from the normal one, estimator (1) need not be the best. For example, if the results of observations X_i are distributed uniformly in the interval (b, c), then the best estimator of the mathematical expectation a = (b + c)/2 is half the sum of the extreme values:

a* = (min X_i + max X_i)/2.   (3)

As a characteristic for comparing the accuracy of different statistical estimators, their efficiency is used: the ratio of the variance of the best estimator to the variance of the given unbiased estimator. For example, if the results of observations X_i are uniformly distributed, then the variances of estimators (1) and (3) are expressed by the formulas

D X̄ = (c − b)²/(12n)   and   D a* = (c − b)²/(2(n + 1)(n + 2)).   (4)

Since estimator (3) is the best, the efficiency of estimator (1) in this case is

e = D a*/D X̄ = 6n/((n + 1)(n + 2)) ≈ 6/n.
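A quick simulation can confirm that, for uniform observations, the midrange (3) beats the arithmetic mean (1) and that the efficiency of the mean is close to 6n/((n + 1)(n + 2)). A minimal sketch (standard-library Python; the interval endpoints and sample size are arbitrary):

```python
import random
import statistics

b, c, n, reps = 0.0, 1.0, 50, 20000
a_true = (b + c) / 2

means, midranges = [], []
for _ in range(reps):
    xs = [random.uniform(b, c) for _ in range(n)]
    means.append(sum(xs) / n)
    midranges.append((min(xs) + max(xs)) / 2)  # estimator (3)

d_mean = statistics.pvariance(means, mu=a_true)
d_mid = statistics.pvariance(midranges, mu=a_true)
print(f"efficiency of the mean ~ {d_mid / d_mean:.4f}")
print(f"theory: {6 * n / ((n + 1) * (n + 2)):.4f}")
```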

With a large number of observations one usually requires that the chosen statistical estimator converge in probability to the true value of the parameter a, i.e., that for any ε > 0

P(|a* − a| > ε) → 0 as n → ∞;

such estimators are called consistent (an example of a consistent estimator is any unbiased estimator whose variance tends to zero as n → ∞; see also Consistent estimator). Since the rate of convergence to the limit plays an important role here, the asymptotically best estimators are the asymptotically efficient ones, that is, those statistical estimators for which, as n → ∞, the mean tends to a and the variance is asymptotically equivalent to the variance of the best estimator.

For example, if X_1, …, X_n are identically normally distributed, then the statistical estimator (2) is an asymptotically efficient estimator of the unknown parameter σ², since as n → ∞ its variance and the variance of the best estimator are asymptotically equivalent:

D s²_n ~ D s² ~ 2σ⁴/n,

and, besides, M s²_n → σ².

Of fundamental importance for the theory of statistical estimators and its applications is the fact that the variance of an estimator of the parameter a is bounded from below by a certain quantity (R. Fisher proposed to characterize by this quantity the amount of information about the unknown parameter a contained in the results of observations). For example, if the X_i are independent and identically distributed with probability density p(x; a), and if a* is an estimator of some function g(a) of the parameter a, then in a wide class of cases

D a* ≥ [g′(a) + b′(a)]² / (n I(a)),   (5)

where I(a) = M[∂ ln p(X; a)/∂a]². The function b(a) = M a* − g(a) is called the bias, and the reciprocal of the right-hand side of inequality (5) is called the amount of information (in the sense of Fisher) about the function g(a) contained in the observations. In particular, if a* is an unbiased statistical estimator of the parameter a itself, then

D a* ≥ 1/(n I(a)),   (6)

and the amount of information n I(a) in this case is proportional to the number of observations (the function I(a) is called the amount of information contained in one observation).

The main conditions under which inequalities (5) and (6) hold are the smoothness of the estimator a* as a function of the X_i, and also the independence from the parameter a of the set of points x where p(x; a) = 0. The last condition is not satisfied, for example, in the case of the uniform distribution, and therefore the variance of the statistical estimator (3) does not satisfy inequality (6) [by (4) this variance is of order n⁻², whereas by inequality (6) it could not be smaller than a quantity of order n⁻¹].

Inequalities (5) and (6) are also valid for discretely distributed random variables X_i: in the definition of the information I(a) one need only replace the density p(x; a) by the probability of the event {X = x}.

If the variance of an unbiased statistical estimator a* of the parameter a coincides with the right-hand side of inequality (6), then a* is the best estimator. The converse statement is, generally speaking, not true: the variance of the best estimator may exceed 1/(n I(a)). However, as n → ∞, the variance of the best estimator is asymptotically equivalent to the right-hand side of (6), i.e. to 1/(n I(a)). Thus, using the amount of information (in the sense of Fisher), one can define the asymptotic efficiency of an unbiased statistical estimator a*, setting

e(a*) = lim_{n→∞} 1/(n I(a)·D a*).   (7)

The information approach to the theory of statistical estimators is particularly fruitful when the density (in the discrete case, the probability) of the joint distribution of the random variables X_1, …, X_n can be represented as a product of two functions h(x_1, x_2, …, x_n)·g[y(x_1, x_2, …, x_n); a], of which the first does not depend on a, and the second is the distribution density of a certain random variable Z = y(X_1, X_2, …, X_n), called a sufficient statistic or exhaustive statistic.

One of the most widespread methods of finding point statistical estimators is the method of moments. According to this method, to the theoretical distribution depending on unknown parameters one assigns the discrete sample distribution, which is determined by the results of observations X_i and represents the probability distribution of an imaginary random variable taking the values X_1, …, X_n with equal probabilities 1/n (the sample distribution can be considered a point estimator of the theoretical distribution). As statistical estimators of the moments of the theoretical distribution one takes the corresponding moments of the sample distribution; e.g., for the mathematical expectation a and the variance σ², the method of moments gives the following estimators: the arithmetic mean (1) and the sample variance (2). The unknown parameters are usually expressed (exactly or approximately) as functions of several theoretical moments. Replacing the theoretical moments in these functions by the sample ones, one obtains the required estimators. This method, which in practice often leads to comparatively simple calculations, usually yields statistical estimators of low asymptotic efficiency (see above the example of estimating the mathematical expectation of a uniform distribution).

Another method of finding statistical estimators, more perfect from the theoretical point of view, is the method of maximum likelihood. According to this method one considers the likelihood function L(a), a function of the unknown parameter a obtained by substituting the random variables X_i themselves for the arguments x_i in the density of the joint distribution; if the X_i are independent and identically distributed with probability density p(x; a), then

L(a) = p(X_1; a)·p(X_2; a) ··· p(X_n; a)

(if the X_i are discretely distributed, then in the definition of the likelihood function L the densities should be replaced by the probabilities of the events {X = X_i}). As the maximum-likelihood statistical estimator of the unknown parameter a one takes the value â for which L(a) attains its largest value (instead of L one often considers the so-called logarithmic likelihood function l(a) = ln L(a); owing to the monotonicity of the logarithm, the maximum points of L(a) and l(a) coincide). Examples of maximum-likelihood estimators are the estimators of the method of least squares.

The main advantage of maximum-likelihood statistical estimators is that, under certain general conditions, these estimators are consistent, asymptotically efficient, and approximately normally distributed.

The listed properties mean that if â is the maximum-likelihood estimator, then as n → ∞, M â → a and D â ~ 1/(n I(a)) (for independent X_i). Thus, for the distribution function of the normalized statistical estimator √(n I(a))·(â − a) there is the limit relation

P(√(n I(a))·(â − a) ≤ x) → (1/√(2π)) ∫_{−∞}^{x} e^{−u²/2} du.   (8)

The advantages of maximum-likelihood statistical estimators justify the computational work of finding the maximum of the function L (or l). In some cases the computational work is significantly reduced owing to the following properties: first, if a* is a statistical estimator for which (6) becomes an equality, then the maximum-likelihood estimator is unique and coincides with a*; secondly, if a sufficient statistic Z exists, then the maximum-likelihood estimator is a function of Z.

Let, for example, X_1, …, X_n be independent and identically normally distributed, so that

p(x; a, σ) = (1/(σ√(2π)))·exp(−(x − a)²/(2σ²)),

and therefore

l(a, σ) = ln L = −(n/2)·ln(2π) − n·ln σ − Σ(X_i − a)²/(2σ²).

The coordinates a = a_0 and σ = σ_0 of the maximum point of the function l(a, σ) satisfy the system of equations

∂l/∂a = Σ(X_i − a)/σ² = 0;   ∂l/∂σ = −n/σ + Σ(X_i − a)²/σ³ = 0.

Thus a_0 = X̄ = (1/n)ΣX_i and σ_0² = (1/n)Σ(X_i − X̄)², and therefore in this case the statistical estimators (1) and (2) are maximum-likelihood estimators; X̄ is the best estimator of the parameter a, normally distributed (M X̄ = a, D X̄ = σ²/n), and s²_n is an asymptotically efficient estimator of the parameter σ², distributed for large n approximately normally (M s²_n ≈ σ², D s²_n ≈ 2σ⁴/n). Both estimators are independent sufficient statistics.

Consider another example, in which

p(x; a) = 1/(π(1 + (x − a)²)).

This density satisfactorily describes the distribution of one of the coordinates of particles reaching a flat screen and emitted from a point located outside the screen (a is the coordinate of the projection of the source onto the screen; it is assumed unknown). For this distribution the mathematical expectation does not exist, because the corresponding integral diverges. Therefore the search for a statistical estimator of a by the method of moments is impossible. The formal use of the arithmetic mean (1) as an estimator is meaningless, since X̄ is in this case distributed with the same density p(x; a) as each individual result of observation. To estimate a one can use the fact that the distribution considered is symmetric about the point x = a, so that a is the theoretical median of the distribution. Slightly modifying the method of moments, one takes as a statistical estimator of a the so-called sample median μ, which is an unbiased estimator of a; if n is large, μ is distributed approximately normally with variance

D μ ≈ π²/(4n).
At the same time, the Fisher information here is I(a) = 1/2, so that the variance of the maximum likelihood estimate for large n is approximately 2/n. Hence, by (7), the asymptotic efficiency of the sample median equals 8/π² ≈ 0.8. Thus, for μ to be as accurate an estimate of a as the maximum likelihood estimate â, the number of observations must be increased by about 25%. If the experiment is costly, the maximum likelihood estimate â should be used to determine a; in this case it is defined as the root of the equation

    Σ 2 (X_i − â) / (1 + (X_i − â)²) = 0.
As the first approximation one takes a₀ = μ, and the equation is then solved by successive approximations of the form

    a_{k+1} = a_k + (4/n) Σ (X_i − a_k) / (1 + (X_i − a_k)²).
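A minimal sketch of this scheme in Python; the simulated data, seed, iteration cap, and stopping tolerance are assumptions added for the demonstration.

    import numpy as np

    rng = np.random.default_rng(2)
    a_true, n = 0.7, 400
    x = a_true + rng.standard_cauchy(n)  # density p(x; a) = 1 / (pi (1 + (x - a)^2))

    a = np.median(x)                     # first approximation a_0 = sample median
    for _ in range(100):
        # (2/n) times the score of the log-likelihood, i.e. the iteration above
        step = (4.0 / n) * np.sum((x - a) / (1 + (x - a) ** 2))
        a += step
        if abs(step) < 1e-12:
            break

    print(np.median(x), a)               # sample median vs. maximum likelihood estimate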
see also Point estimate.

Interval estimates. An interval estimate is a system of boundaries representable geometrically as a set of points in the parameter space. An interval estimate can be regarded as a whole set of point estimates. This set depends on the results of observations and is therefore random; hence to every interval estimate there corresponds the probability with which this estimate "covers" the unknown parameter point. This probability, generally speaking, depends on the unknown parameters; therefore, as a characteristic of the reliability of an interval estimate, one takes the confidence coefficient — the smallest possible value of this probability. Informative statistical conclusions are obtained only from interval estimates whose confidence coefficient is close to unity.

If a single parameter a is estimated, then an interval estimate is usually an interval (β, γ) (a so-called confidence interval) whose endpoints β and γ are functions of the results of observations; the confidence coefficient ω in this case is defined as the smallest value, over all possible values of the parameter a, of the probability of the simultaneous occurrence of the two events {β < a} and {γ > a}:

    ω = inf_a P{ β < a < γ }.
If the middle of such an interval is taken as a point estimate of the parameter a, then with probability at least ω one can assert that the absolute error of this estimate does not exceed half the length of the interval, (γ − β)/2. In other words, if one follows this rule for estimating the absolute error, an erroneous conclusion will be obtained on average in fewer than a fraction 1 − ω of all cases. For a fixed confidence coefficient ω, the most advantageous intervals are the shortest confidence intervals, those for which the mathematical expectation of the length attains its smallest value.

If the distribution of the random variables X_i depends on only one unknown parameter a, then the construction of a confidence interval is usually carried out with the help of some point estimate â. For the majority of practically interesting cases the distribution function F(x; a) of a reasonably chosen estimate â depends monotonically on the parameter a. Under these conditions, to find an interval estimate one substitutes x = â in F(x; a) and finds the roots a₁ = a₁(â, ω) and a₂ = a₂(â, ω) of the equations

    F(â; a₁) = ω₁,   F(â; a₂) = ω₂,   (9)

where

    ω₁ = (1 + ω)/2,   ω₂ = (1 − ω)/2
[for continuous distributions]. The points a₁ and a₂ bound the confidence interval with confidence coefficient ω. Of course, an interval constructed in such a simple way may in many cases differ from the optimal (shortest) one. However, if â is an asymptotically efficient estimate of a, then for a sufficiently large number of observations such an interval estimate differs from the optimal one practically insignificantly. In particular, this is true for the maximum likelihood estimate, since it is asymptotically normally distributed (see (8)). In cases where equations (9) are difficult to solve, the interval estimate is computed approximately with the help of the maximum likelihood point estimate and relation (8):

    â − x / √(n I(â)) < a < â + x / √(n I(â)),

where x is the root of the equation

    Φ(x) − Φ(−x) = ω.
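As a sketch of this approximate interval (the numbers carry over from the Cauchy example above, where I(a) = 1/2; the placeholder value of â and the use of scipy's normal quantile instead of solving Φ(x) − Φ(−x) = ω by hand are assumptions of the illustration):

    import numpy as np
    from scipy.stats import norm

    omega = 0.95
    x_crit = norm.ppf((1 + omega) / 2)   # root of Phi(x) - Phi(-x) = omega

    n, fisher_info = 400, 0.5            # Cauchy location model: I(a) = 1/2
    a_hat = 0.7                          # stand-in for the ML estimate computed above
    half_width = x_crit / np.sqrt(n * fisher_info)

    print(a_hat - half_width, a_hat + half_width)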
As n → ∞, the true confidence coefficient of this interval estimate tends to ω. In the more general case the distribution of the results of observations X_i depends on several parameters a, b, .... Under these conditions the above rules for constructing confidence intervals often turn out to be inapplicable, since the distribution of the point estimate â depends, as a rule, not only on a but also on the other parameters. However, in practically interesting cases the estimate â can be replaced by a function of the observations X_i and the unknown parameter a whose distribution does not depend (or "almost does not depend") on all the unknown parameters. An example of such a function is the normalized maximum likelihood estimate √(n I(â)) (â − a); if the arguments a, b, ... in the denominator are replaced by their maximum likelihood estimates â, b̂, ..., the limit distribution remains the same as in formula (8). Therefore approximate confidence intervals for each parameter separately can be constructed in the same way as in the case of one parameter.

As noted above, if X_1, ..., X_n are independent and identically normally distributed random variables, then x̄ and s² are the best estimates of the parameters a and σ², respectively. The distribution function of the estimate x̄ is expressed by the formula

    P{ x̄ < x } = Φ( (x − a) √n / σ ),

and hence it depends not only on a but also on σ. At the same time, the distribution of the so-called Student ratio

    t = (x̄ − a) √(n − 1) / s

does not depend on either a or σ, and

    P{ t < x } = C_{n−1} ∫_{−∞}^{x} (1 + u²/(n − 1))^{−n/2} du,
where the constant C_{n−1} is chosen so that the density integrates to one (Student's distribution with n − 1 degrees of freedom). Therefore the confidence interval

    x̄ − t_ω s / √(n − 1) < a < x̄ + t_ω s / √(n − 1)

corresponds to the confidence coefficient

    ω = P{ |t| < t_ω }.
The distribution of the estimate s² depends only on σ², and the distribution function of s² is given by the formula

    P{ s² < x } = D_{n−1} ∫_0^{n x / σ²} u^{(n−3)/2} e^{−u/2} du,  x > 0,

where the constant D_{n−1} is determined by the condition that the total probability equals one (the so-called χ²-distribution with n − 1 degrees of freedom).

Since the probability P{s² < x} depends monotonically on σ, rule (9) applies to the construction of an interval estimate. Thus, if x₁ and x₂ are the roots of the equations F(s²; x₁) = (1 + ω)/2 and F(s²; x₂) = (1 − ω)/2, then the confidence interval

    x₁ < σ² < x₂

corresponds to the confidence coefficient ω. From it, in particular, one obtains a confidence interval for the ratio s/σ, that is, for the relative error of s as an estimate of σ.
Detailed tables of the Student distribution function and of the χ²-distribution are available in most textbooks on mathematical statistics.
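In place of printed tables, the Student and χ² quantiles are now computed directly; here is a sketch with assumed normal data and confidence coefficient 0.95. Note that numpy's std(ddof=1) uses the divisor n − 1, so the formulas below are written for that convention rather than for the divisor-n estimate (2).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.normal(10.0, 2.0, size=30)
    n, omega = len(x), 0.95

    xbar, s = x.mean(), x.std(ddof=1)    # sample mean, corrected std deviation

    # Student interval for the mean a
    t = stats.t.ppf((1 + omega) / 2, df=n - 1)
    ci_mean = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))

    # chi-square interval for the variance sigma^2
    q_lo = stats.chi2.ppf((1 - omega) / 2, df=n - 1)
    q_hi = stats.chi2.ppf((1 + omega) / 2, df=n - 1)
    ci_var = ((n - 1) * s**2 / q_hi, (n - 1) * s**2 / q_lo)

    print(ci_mean, ci_var)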

Until now it has been assumed that the distribution function of the results of observations is known up to the values of several parameters. In applications, however, one often encounters the case when the distribution function is unknown. In this situation the so-called nonparametric methods of statistics (methods that do not depend on the original probability distribution) may be useful. Suppose, for example, that it is required to estimate the median m of the theoretical continuous distribution of the independent random variables X_1, X_2, ..., X_n (for symmetric distributions the median coincides with the mathematical expectation, provided, of course, that the latter exists). Let Y_1 ≤ Y_2 ≤ ... ≤ Y_n be the same quantities X_i arranged in ascending order. Then, if k is an integer satisfying the inequalities 1 ≤ k < n/2,

    P{ Y_k < m < Y_{n−k+1} } = ω_{n,k},  where  ω_{n,k} = 1 − 2^{1−n} Σ_{j=0}^{k−1} C(n, j).

Thus (Y_k, Y_{n−k+1}) is an interval estimate of the median m with confidence coefficient ω = ω_{n,k}. This is true for any continuous distribution of the random variables X_i.
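A sketch of this distribution-free interval in Python; the Cauchy sample is an illustrative assumption, chosen deliberately because the method requires no moments.

    import numpy as np
    from scipy.stats import binom

    rng = np.random.default_rng(4)
    x = np.sort(rng.standard_cauchy(25))
    n, target = len(x), 0.95

    # coverage of (Y_k, Y_{n-k+1}) is 1 - 2 P{ Binomial(n, 1/2) <= k - 1 }
    for k in range(n // 2, 0, -1):
        omega = 1 - 2 * binom.cdf(k - 1, n, 0.5)
        if omega >= target:
            break

    print(x[k - 1], x[n - k], omega)     # interval (Y_k, Y_{n-k+1}) and its omega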

It was noted above that the sample distribution is a point estimate of the unknown theoretical distribution. Moreover, the sample distribution function F_n(x) is an unbiased estimate of the theoretical distribution function F(x). At the same time, as A. N. Kolmogorov showed, the distribution of the statistic

    λ_n = √n · sup_x | F_n(x) − F(x) |

does not depend on the unknown theoretical distribution and, as n → ∞, tends to a limit distribution K(y), called the Kolmogorov distribution. Thus, if y_ω is the solution of the equation K(y) = ω, then with probability ω one can assert that the theoretical distribution function F(x) is entirely "covered" by the strip enclosed between the graphs of the functions F_n(x) − y_ω/√n and F_n(x) + y_ω/√n (for n large enough, the difference between the prelimit and the limit distributions of the statistic λ_n is practically insignificant). Such an interval estimate is called a confidence zone. see also Interval estimation.
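A sketch of the confidence zone; the critical value y_ω is taken from the Kolmogorov limit distribution, available in scipy as kstwobign, and the normal sample and seed are illustrative assumptions.

    import numpy as np
    from scipy.stats import kstwobign

    rng = np.random.default_rng(5)
    x = np.sort(rng.normal(size=200))
    n, omega = len(x), 0.95

    y = kstwobign.ppf(omega)             # solution of K(y) = omega
    F_n = np.arange(1, n + 1) / n        # empirical distribution function at the x_i
    lower = np.clip(F_n - y / np.sqrt(n), 0.0, 1.0)
    upper = np.clip(F_n + y / np.sqrt(n), 0.0, 1.0)

    print(y, lower[:3], upper[:3])       # with probability ~0.95 F(x) lies in the band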

Statistical estimates in the theory of errors. The theory of errors is a branch of mathematical statistics devoted to the numerical determination of unknown quantities from the results of measurements. Owing to the random nature of measurement errors and, possibly, to the random nature of the phenomenon under study, not all such results carry equal weight: in repeated measurements some of them occur more often, others less often.

The basis of the theory of errors is a mathematical model according to which, prior to the experiment, the totality of all conceivable measurement results is treated as the set of values of a certain random variable. Statistical estimation therefore acquires an important role, and the conclusions of the theory of errors are statistical in nature; their meaning and content, as with any statistical conclusions, must be interpreted probabilistically.

Treating the measurement result X as a random variable, one distinguishes three main types of measurement errors: systematic, random, and gross (qualitative descriptions of these errors are given in the article Error theory). The error in measuring the unknown quantity a is X − a; the mathematical expectation of this difference, E(X − a) = b, is called the systematic error (if b = 0, the measurements are said to be free of systematic errors), and the difference δ = X − a − b is called the random error. Thus, if n independent measurements of a are made, their results can be written as the equalities

    X_i = a + b + δ_i,  i = 1, 2, ..., n,   (10)

where a and b are constants and the δ_i are random variables. More generally,

    X_i = a + b + δ_i + β_i,  i = 1, 2, ..., n,   (11)

where the β_i are random variables independent of the δ_i that equal zero with probability very close to unity (so that any other value is unlikely); the β_i are called gross errors.

The task of estimating (and eliminating) systematic errors usually lies outside mathematical statistics. Exceptions are the so-called method of standards, in which, to estimate b, a series of measurements of a known quantity a is performed (in this method b is the quantity being estimated, while a plays the role of a known constant), and methods such as the analysis of variance, which make it possible to estimate systematic discrepancies between several series of measurements.

The main task of the theory of errors is to find an estimate of the unknown quantity a and to assess the accuracy of the measurements. If the systematic error is eliminated (b = 0) and the observations contain no gross errors, then by (10) X_i = a + δ_i, and the problem of estimating a therefore reduces to finding an estimate, optimal in one sense or another, of the mathematical expectation of the identically distributed random variables X_i. As shown in the previous sections, the form of such an estimate (point or interval) depends essentially on the distribution law of the random errors. If this law is known up to a few unknown parameters, then the maximum likelihood method, for example, can be applied to estimate a; otherwise one should first find, from the results of observations X_i, an estimate of the unknown distribution function of the random errors δ_i (the "nonparametric" interval estimate of such a function was indicated above). In practical work one is often content with the two estimates x̄ and s² (see (1) and (2)). If the δ_i are identically normally distributed, these estimates are the best; in other cases they may be inefficient.

The presence of gross errors complicates the problem of estimating the parameter a. Usually the proportion of observations containing gross errors is small, while the mathematical expectation of a nonzero |β_i| significantly exceeds the spread of the random errors (gross errors arise from random miscalculation, misreading of the measuring device, and so on). Measurement results containing gross errors are often clearly visible, since they differ greatly from the other results. Under these conditions, the most expedient way to identify (and eliminate) gross errors is a direct analysis of the measurements: a thorough check that the conditions of all the experiments were unchanged, recording the results "in two hands", etc. Statistical methods for detecting gross errors should be used only in doubtful cases.

The simplest example of such methods is the statistical identification of a single outlying observation, when either Y_1 = min X_i or Y_n = max X_i is suspect (it is assumed that in equalities (11) b = 0 and the distribution law of the quantities δ_i is known). To find out whether the assumption of a single gross error is justified, one computes for the pair (Y_1, Y_n) a joint interval estimate (a confidence region), assuming all β_i equal to zero. If this region "covers" the point with coordinates (Y_1, Y_n), then the suspicion of a gross error should be considered statistically unfounded; otherwise the hypothesis of a gross error must be recognized as confirmed (in this case the rejected observation is usually discarded, since it is not possible to estimate the magnitude of a gross error reliably from a single observation).

Topic 7. Statistical estimates of distribution parameters: point and interval estimates

The meaning of statistical methods lies in using a sample of limited size, that is, a certain part of the general population, to make a well-founded judgment about the properties of the population as a whole.

Naturally, replacing a population study with a sample study raises a number of questions:

1. To what extent does the sample reflect the properties of the general population, that is, to what extent is the sample representative of the general population?

2. What information about the values of the parameters of the general population can the parameters of the sample give?

3. Can it be argued that the statistical characteristics obtained from the sample (mean values, variances, or any other derived quantities) are equal to the corresponding characteristics of the general population?

Experience shows that the values of the parameters obtained for different samples from the same general population usually do not coincide. The numerical values of sample parameters computed from a random sample are only the result of an approximate statistical estimation of these parameters in the general population. Statistical estimation, owing to the variability of the observed phenomena, yields only approximate values.

Note. Strictly speaking, in statistics an estimate is a rule for calculating the estimated parameter, while the verb "to estimate" means to indicate an approximate value.

One distinguishes point estimates and interval estimates.

Point estimation of distribution parameters

Let x 1 , x 2 , …, x n be a sample of size n from a general population with distribution function F(x).

The numerical characteristics of this sample are called sample (empirical) numerical characteristics.

Note that sample numerical characteristics are characteristics of a given sample, but are not characteristics of the distribution of the general population. However, these characteristics can be used to estimate the parameters of the general population.

A point estimate is a statistical estimate determined by a single number.

A point estimate is characterized by the following properties: unbiasedness, consistency, and efficiency.

A point estimate is called unbiased if its mathematical expectation is equal to the estimated parameter for any sample size.

A point estimate is called consistent if, as the sample size increases without bound (n → ∞), it converges in probability to the true value of the estimated parameter of the general population.

A point estimate is called efficient if (for a given sample size n) it has the smallest possible variance, that is, it guarantees the smallest deviation of the sample estimate from the corresponding parameter of the general population.

Mathematical statistics shows that a consistent, unbiased estimate of the general mean a is the sample mean

    x̄ = (1/n) Σ n_i x_i,

where x_i are the variants of the sample, n_i are the frequencies of the variants x_i, and n = Σ n_i is the sample size.

An unbiased estimate of the general variance is the corrected sample variance

    s² = (1/(n − 1)) Σ n_i (x_i − x̄)².

A more convenient formula is

    s² = ( Σ n_i x_i² − n x̄² ) / (n − 1).

The estimate s² of the general variance is also consistent but not efficient. However, in the case of a normal distribution it is "asymptotically efficient": as n increases, the ratio of its variance to the minimum possible variance approaches unity.

So, if a sample from the distribution F(x) of a random variable X with unknown expectation a and variance σ² is given, then to calculate the values of these parameters we may use the approximate formulas

    a ≈ x̄,  σ² ≈ s².
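A minimal sketch of these computations for a sample given as variants with frequencies (the numbers are made up for the illustration):

    import numpy as np

    x_i = np.array([1.0, 2.0, 3.0, 4.0])   # variants of the sample
    n_i = np.array([5, 12, 8, 5])          # frequencies of the variants
    n = n_i.sum()                          # sample size

    x_bar = (n_i * x_i).sum() / n                              # sample mean
    s2 = (n_i * (x_i - x_bar) ** 2).sum() / (n - 1)            # corrected variance
    s2_alt = ((n_i * x_i**2).sum() - n * x_bar**2) / (n - 1)   # "more convenient" form

    print(x_bar, s2, s2_alt)               # s2 and s2_alt agree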

Point estimates have the disadvantage that, for a small sample size, they can differ significantly from the estimated parameters. Therefore, to get an idea of the proximity between a parameter and its estimate, so-called interval estimates are introduced in mathematical statistics.

Confidence interval

If, during statistical processing of the results, it is required to find not only a point estimate of the unknown parameter θ, but also to characterize the accuracy of this estimate, then a confidence interval is found.

A confidence interval is an interval that covers the unknown parameter of the general population with a predetermined confidence probability.

Confidence probability is the probability with which the unknown population parameter belongs to the confidence interval.

The length of the confidence interval characterizes the accuracy of the interval estimate and depends on the sample size and the confidence level. As the sample size increases, the length of the confidence interval decreases (the accuracy increases), and as the confidence probability tends to 1, the length of the confidence interval increases (the accuracy decreases). Along with the confidence level p, the significance level α = 1 − p is often used in practice.

Usually one takes p = 0.95 or (more rarely) 0.99. These probabilities are recognized as sufficient for a confident judgment about the general parameters on the basis of the sample indicators.

The confidence interval for the mathematical expectation has the form

    x̄ − t S/√n < a < x̄ + t S/√n,

where S is the corrected sample standard deviation and t is the critical value of Student's distribution with n − 1 degrees of freedom for the chosen confidence level (see APPENDIX 1 to Topic 7).
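A sketch of this interval in Python, with the Student critical value computed instead of looked up in APPENDIX 1 (the data are assumed for the illustration):

    import numpy as np
    from scipy import stats

    x = np.array([9.8, 10.2, 10.1, 9.7, 10.3, 10.0, 9.9, 10.4])
    n, p = len(x), 0.95

    S = x.std(ddof=1)                       # corrected sample standard deviation
    t = stats.t.ppf((1 + p) / 2, df=n - 1)  # Student critical value, n - 1 d.f.
    half = t * S / np.sqrt(n)

    print(x.mean() - half, x.mean() + half)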

Lecture plan:

    The concept of an estimate

    Properties of statistical estimates

    Methods for finding point estimates

    Interval parameter estimation

    Confidence interval for the mathematical expectation with a known variance of a normally distributed population.

    Chi-squared distribution and Student's distribution.

    Confidence interval for the mathematical expectation of a random variable that has a normal distribution with an unknown variance.

    Confidence interval for the standard deviation of the normal distribution.

Bibliography:

    Wentzel, E.S. Probability theory [Text] / E.S. Wentzel. - M.: Higher school, 2006. - 575 p.

    Gmurman, V.E. Probability theory and mathematical statistics [Text] / V.E. Gmurman. - M.: Higher school, 2007. - 480 p.

    Kremer, N.Sh. Probability theory and mathematical statistics [Text] / N.Sh. Kremer. - M.: UNITI, 2002. - 543 p.

P.1. The concept of an estimate

Distributions such as the binomial, exponential, and normal are families of distributions that depend on one or several parameters. For example, the exponential distribution with probability density p(x) = λe^(−λx), x ≥ 0, depends on the single parameter λ, and the normal distribution

    p(x) = (1/(σ√(2π))) exp( −(x − m)²/(2σ²) )

on the two parameters m and σ. As a rule, it is clear from the conditions of the problem under study which family of distributions is in question. However, the specific values of the parameters of this distribution, which enter into the expressions for the distribution characteristics of interest to us, remain unknown. It is therefore necessary to know at least approximate values of these quantities.

Let the distribution law of the general population be known up to the values of the parameters entering its distribution function, some of which may be known. One of the tasks of mathematical statistics is to find estimates of the unknown parameters from a sample of observations x₁, x₂, ..., x_n from the general population. Estimating an unknown parameter consists in constructing a function θ* = θ*(x₁, ..., x_n) of the random sample such that the value of this function is approximately equal to the estimated unknown parameter θ. The function θ* is called a statistic for the parameter θ.

A statistical estimate (hereinafter simply an estimate) of a parameter θ of a theoretical distribution is its approximate value, depending on the sample data.

An estimate θ* is a random variable, because it is a function of the independent random variables x₁, ..., x_n; if a different sample is drawn, the function will, generally speaking, take a different value.

There are two types of estimates - point and interval.

A point estimate is an estimate determined by a single number. With a small number of observations such estimates can lead to gross errors. To avoid them, interval estimates are used.

An interval estimate is an estimate determined by two numbers, the endpoints of an interval that contains the estimated parameter θ with a given probability.

P.2. Properties of statistical estimates

The quantity |θ* − θ| is called the accuracy of the estimate. The smaller |θ* − θ|, the better: the unknown parameter is determined more precisely.

A number of requirements are imposed on an estimate of any parameter, which it must satisfy in order to be "close" to the true value of the parameter, that is, to be, in some sense, a "good" estimate. The quality of an estimate is determined by checking whether it has the properties of unbiasedness, efficiency, and consistency.

An estimate θ* of a parameter θ is called unbiased (free of systematic errors) if the mathematical expectation of the estimate coincides with the true value of θ:

    E(θ*) = θ.   (1)

If equality (1) does not hold, the estimate is called biased (having systematic errors). The bias may be due to errors of measurement or counting, or to the non-random nature of the sample. Systematic errors lead to overestimation or underestimation of the parameter.

For some problems of mathematical statistics there may be several unbiased estimates. Preference is usually given to the one with the least scatter (variance).

An estimate θ* is called efficient if it has the smallest variance among all possible unbiased estimates of the parameter θ.

Let D_min be the minimum variance, and let D(θ*) be the variance of some other unbiased estimate of the parameter θ. Then the efficiency of the estimate θ* is

    e(θ*) = D_min / D(θ*).   (2)

Clearly, 0 < e(θ*) ≤ 1. The closer e(θ*) is to 1, the more efficient the estimate θ*. If e(θ*) → 1 as n → ∞, the estimate is called asymptotically efficient.

Comment. If an estimate θ* is biased, then the smallness of its variance does not imply the smallness of its error. Taking, for example, some fixed number c as an estimate of the parameter θ, we obtain an estimate with zero variance; however, its error |c − θ| can be arbitrarily large.

An estimate θ* is called consistent if, as the sample size increases (n → ∞), the estimate converges in probability to the exact value of the parameter θ, that is, if for any ε > 0

    lim_{n→∞} P{ |θ* − θ| < ε } = 1.   (3)

Consistency of an estimate of the parameter θ means that as the sample size n grows, the quality of the estimate improves.
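The efficiency ratio (2) can be illustrated by simulation; in this sketch the normal model, sample sizes, and the sample median as the competing estimate are assumptions of the illustration. For normal samples the sample mean is the efficient estimate of location, and the sample median has asymptotic efficiency 2/π ≈ 0.64 relative to it.

    import numpy as np

    rng = np.random.default_rng(6)
    samples = rng.normal(0.0, 1.0, size=(20000, 100))  # 20000 samples of size 100

    var_mean = samples.mean(axis=1).var()         # variance of the efficient estimate
    var_median = np.median(samples, axis=1).var() # variance of the sample median

    print(var_mean / var_median)                  # e(median) by (2), close to 2/pi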

Theorem 1. The sample mean is an unbiased and consistent estimate of the expectation.

Theorem 2. The corrected sample variance is an unbiased and consistent estimate of the variance.

Theorem 3. The empirical distribution function of the sample is an unbiased and consistent estimate of the distribution function of a random variable.
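Theorems 1 and 2 are easy to check by simulation (a sketch with assumed parameters): averaged over many samples, the sample mean and the corrected variance stay close to the true a and σ², while the uncorrected variance is biased low by the factor (n − 1)/n.

    import numpy as np

    rng = np.random.default_rng(7)
    a, sigma, n = 2.0, 3.0, 10
    samples = rng.normal(a, sigma, size=(100000, n))

    print(samples.mean(axis=1).mean())          # ~ 2.0   (Theorem 1: unbiased)
    print(samples.var(axis=1, ddof=1).mean())   # ~ 9.0   (Theorem 2: unbiased)
    print(samples.var(axis=1, ddof=0).mean())   # ~ 8.1 = 9.0 (n - 1) / n (biased)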