Package 'educineq' reference manual

Title:	Compute and Decompose Inequality in Education
Description:	Easily compute education inequality measures and the distribution of educational attainments for any group of countries, using the data set developed in Jorda, V. and Alonso, JM. (2017) <DOI:10.1016/j.worlddev.2016.10.005>. The package offers the possibility to compute not only the Gini index, but also generalized entropy measures for different values of the sensitivity parameter. In particular, the package includes functions to compute the mean log deviation, which is more sensitive to the bottom part of the distribution; the Theil’s entropy measure, equally sensitive to all parts of the distribution; and finally, the GE measure when the sensitivity parameter is set equal to 2, which gives more weight to differences in higher education. The decomposition of these measures in the components between-country and within-country inequality is also provided. Two graphical tools are also provided, to analyse the evolution of the distribution of educational attainments: The cumulative distribution function and the Lorenz curve.
Authors:	Vanesa Jorda [aut, cre], Jose Manuel Alonso [aut]
Maintainer:	Vanesa Jorda <[email protected]>
License:	GPL (>= 2)
Version:	0.1.0
Built:	2025-01-23 03:54:41 UTC
Source:	https://github.com/cran/educineq

This dataset contains information about the available countries, their corresponding country codes and the regions they belong to, which are used with `educineq` functions.

Description

country. Country name
code. World Bank country code
region. Macro-region to which the country belongs

Usage

data(data_country)

Format

A data frame with 142 rows and 3 variables
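
The lookup table can be used to find the codes that belong to a macro-region. The following is a minimal sketch, assuming the package is installed and that the `region` column stores the macro-region names exactly as they are written in the `countries` argument of the functions (this is an assumption, not something stated above). The variable names `has_educineq` and `south_asia` are hypothetical.

```r
# Only run the lookup when the package is available.
has_educineq <- requireNamespace("educineq", quietly = TRUE)

if (has_educineq) {
  data("data_country", package = "educineq")

  # Columns documented above: country, code, region.
  str(data_country)

  # Country codes of one macro-region (label assumed to match exactly).
  south_asia <- data_country$code[data_country$region == "South Asia"]
  print(south_asia)
}
```

A vector such as `south_asia` could then be passed as the `countries` argument of the functions documented below.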

Cumulative distribution function of time of schooling

Description

edcdf is a function to graph the CDF of time of schooling for any group of countries using the set of estimates developed in Jorda and Alonso (2017).

Usage

edcdf(countries, init.y, final.y, database)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"` (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.

Details

We use the set of estimates developed in Jorda and Alonso (2017), where the generalized gamma distribution (Stacy, 1962) is used to model the time that individuals attend school until they complete the educational cycle or decide to drop out. The reason is twofold. First, the generalized gamma distribution is a parsimonious model that nests most of the parametric assumptions described in the literature (see Marshall and Olkin, 2007). Second, it can model both zero-mode and one-mode distributions and represent several types of hazard rates. This flexibility makes it a strong candidate for modelling the distribution of education. The model includes as particular cases most of the distributions commonly used in survival analysis, including the Weibull, the exponential, and the gamma distributions, so it would converge to any of these special cases if needed.

To accommodate time and country varying parameters, the distribution of education of each country and year is estimated by non-linear least squares (see Jorda and Alonso (2017) for a fuller description of the estimation strategy). The distribution of education of a particular group or region of countries is simply defined as a mixture of the national distributions, weighted by their population shares.
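
The mixture construction can be illustrated in base R. This is a sketch under stated assumptions, not the package's implementation: it uses Stacy's parameterization with scale `a` and shape parameters `d` and `p`, which may differ from the one used internally, and the parameter values, population shares and the helper names `gg_cdf` and `mixture_cdf` are all hypothetical.

```r
# Stacy's generalized gamma CDF: F(x) = P(d / p, (x / a)^p),
# where P is the regularized lower incomplete gamma function.
gg_cdf <- function(x, a, d, p) {
  pgamma((x / a)^p, shape = d / p)
}

# Hypothetical parameters for three countries (not package estimates).
params <- data.frame(
  a = c(6, 9, 12),
  d = c(1.5, 2.0, 3.0),
  p = c(1.2, 1.5, 2.0)
)
pop_share <- c(0.5, 0.3, 0.2)  # population shares, summing to 1

# Mixture CDF: population-weighted sum of the national CDFs.
mixture_cdf <- function(x) {
  national <- sapply(seq_len(nrow(params)), function(i) {
    gg_cdf(x, params$a[i], params$d[i], params$p[i])
  })
  # One row per value of x, one column per country.
  as.vector(matrix(national, ncol = nrow(params)) %*% pop_share)
}

years <- seq(0, 25, by = 0.5)
cdf_values <- mixture_cdf(years)
```

Calling `plot(years, cdf_values, type = "l")` would draw the group CDF for these made-up parameters.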

Value

edcdf returns a graph of the evolution of the CDF of education over the specified period.

References

Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010

Marshall, A. W. and Olkin, I. (2007). Life distributions. Structure of nonparametric, semiparametric, and parametric families. New York: Springer.

Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.

Examples

edcdf(countries = "South Asia", init.y = 1980, final.y = 1990, database = "female25")
edcdf(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"),init.y = 1995,
final.y = 2010, database = "male25")

Generalized entropy measure of education

Description

The ege2 function computes the generalized entropy measure of education, with the sensitivity parameter set to 2, for any group of countries included in the dataset developed in Jorda and Alonso (2017). The function also provides a decomposition of this index into between-country and within-country inequality.

Usage

ege2(countries, init.y, final.y, database, plot = TRUE)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"`, and `"all"` for the 142 countries included in the dataset (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`plot`	if `TRUE` (the default), displays a graph of the results.

Details

The estimates of the generalized entropy measure for the specified group of countries can be easily derived by taking advantage of the additive decomposability of the generalized entropy family. The measure is computed as the sum of the following terms, which correspond to within-country and between-country inequality, respectively (see, e.g., Cowell, 2011):

$GE(2)_W=\sum_{i=1}^{N} s_i^2 p_i^{-1} GE(2)_i;$

$GE(2)_B= 0.5 \left[ \sum_{i=1}^{N} p_i (\mu_i / \mu)^2 -1 \right],$

where N is the number of countries, $GE(2)_i$ and $p_i$ denote, respectively, the generalized entropy measure and the population weight of country i, $\mu_i$ and $\mu$ are the mean years of schooling of country i and of the whole group, and $s_i$ stands for the proportion of the mean years of schooling of country i in the overall mean of the group: $s_i=\lambda_i \mu_i / \sum_{j=1}^{N} \lambda_j \mu_j$.
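
The decomposition can be checked numerically on simulated data. The sketch below is illustrative only and does not use the package: the three groups are hypothetical gamma samples, `ge2` is a helper defined here, the weight $\lambda_i$ is assumed to equal the population share $p_i$, and the factor 0.5 in the between term is taken to multiply the whole bracket.

```r
# GE(2) of a sample: half the squared coefficient of variation.
ge2 <- function(y) 0.5 * (mean((y / mean(y))^2) - 1)

set.seed(1)
# Hypothetical "countries": simulated years of schooling.
groups <- list(
  A = rgamma(200, shape = 4, rate = 0.5),
  B = rgamma(300, shape = 9, rate = 0.8),
  C = rgamma(500, shape = 6, rate = 0.5)
)

n_i  <- sapply(groups, length)
p_i  <- n_i / sum(n_i)        # population weights
mu_i <- sapply(groups, mean)  # country means
mu   <- sum(p_i * mu_i)       # overall mean
s_i  <- p_i * mu_i / mu       # shares in the overall mean (lambda_i = p_i assumed)
ge_i <- sapply(groups, ge2)   # country-level GE(2)

ge2_within  <- sum(s_i^2 / p_i * ge_i)
ge2_between <- 0.5 * (sum(p_i * (mu_i / mu)^2) - 1)
ge2_total   <- ge2(unlist(groups))

# Within plus between reproduces the index of the pooled sample.
print(all.equal(ge2_within + ge2_between, ge2_total))
```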

Value

ege2 returns a list with the following objects:

GE_2: evolution of the generalized entropy measure of education from the initial to the last year, decomposed in between-country and within-country inequality.
countries: countries used to compute the generalized entropy measure.
If plot = TRUE, graphical representation of the numerical results.

References

Cowell, F. (2011). Measuring inequality. Oxford University Press.

Examples

ege2(countries = "all", init.y = 1980, final.y = 2000,
     database = "total25")
ege2(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980,
     final.y = 2000, database = "female15")

Gini index of education

Description

egini is a function to compute the Gini index of education for any group of countries using the set of estimates developed in Jorda and Alonso (2017).

Usage

egini(countries, init.y, final.y, database, M = 5000, plot = TRUE)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"`, and `"all"` for the 142 countries included in the dataset (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`M`	size of the simulated sample (default `M = 5000`).
`plot`	if `TRUE` (the default), displays a graph of the results.

Value

egini returns a list with the following objects:

Gini_index: evolution of the Gini index of education from the initial to the last year.
countries: countries used to compute the Gini index.
If plot = TRUE, graphical representation of the numerical results.

References

Cowell, F. (2011). Measuring inequality. Oxford University Press.

Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.

Examples

egini(countries = c("DNK", "FIN"), init.y = 1995, final.y = 1995,
  database = "male25", M = 100, plot = FALSE)

Lorenz curve of education

Description

elc is a function to graph the Lorenz curve of education for any group of countries using the set of estimates developed in Jorda and Alonso (2017).

Usage

elc(countries, init.y, final.y, database, M = 5000)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"` (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`M`	size of the simulated sample (default `M = 5000`).

Details

We use the set of estimates developed in Jorda and Alonso (2017), where the generalized gamma distribution (Stacy, 1962) is used to model the time that individuals attend school until they complete the educational cycle or decide to drop out. To accommodate time and country varying parameters, the distribution of education of each country and year is estimated by non-linear least squares (see Jorda and Alonso (2017) for a fuller description of the estimation strategy). The Lorenz curve is computed from a synthetic sample of size M of the distribution of education of the specified group of countries. The sample is obtained by Monte Carlo simulation using the mixture of the national distributions, weighted by their population shares.
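
The simulation step can be sketched in base R. This is an illustration under stated assumptions rather than the package's code: it assumes Stacy's parameterization (scale `a`, shapes `d` and `p`), and the parameter values and population shares are hypothetical. It relies on the fact that if G follows a gamma distribution with shape d/p and rate 1, then a * G^(1/p) follows Stacy's generalized gamma distribution.

```r
set.seed(42)
M <- 5000  # size of the synthetic sample

# Hypothetical national parameters and population shares.
params <- data.frame(
  a = c(6, 9, 12),
  d = c(1.5, 2.0, 3.0),
  p = c(1.2, 1.5, 2.0)
)
pop_share <- c(0.5, 0.3, 0.2)

# Step 1: assign each draw to a country with probability equal to its share.
country <- sample(nrow(params), size = M, replace = TRUE, prob = pop_share)

# Step 2: draw years of schooling from that country's generalized gamma.
g <- rgamma(M, shape = params$d[country] / params$p[country])
schooling <- params$a[country] * g^(1 / params$p[country])

# Step 3: empirical Lorenz curve of the pooled sample.
sorted  <- sort(schooling)
pop_cum <- c(0, seq_len(M) / M)                # cumulative population share
edu_cum <- c(0, cumsum(sorted) / sum(sorted))  # cumulative share of schooling
```

Calling `plot(pop_cum, edu_cum, type = "l")` followed by `abline(0, 1, lty = 2)` would draw the curve against the line of perfect equality. A larger `M` reduces simulation noise at the cost of run time.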

Value

elc returns a graph of the evolution of the Lorenz curve of education over the specified period.

References

Cowell, F. (2011). Measuring inequality. Oxford University Press.

Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.

Examples

elc(countries = c("CAN","USA"), init.y = 1985, final.y = 1985,
  database = "female25", M = 300)

Mean years of schooling

Description

emean is a function to compute mean years of schooling for any group of countries included in the dataset developed in Jorda and Alonso (2017). It is computed as the average of the national years of schooling weighted by population weights.
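
The weighting described above amounts to a population-weighted average. A minimal base R sketch with hypothetical figures (the values of `mys` and `pop` below are made up for illustration and are not taken from the dataset):

```r
# Hypothetical national mean years of schooling and populations (millions).
mys <- c(DNK = 11.2, FIN = 10.4, SWE = 11.6)
pop <- c(DNK = 4.0, FIN = 3.9, SWE = 6.7)

# Group mean: national means weighted by population.
group_mean <- weighted.mean(mys, w = pop)
print(group_mean)
```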

Usage

emean(countries, init.y, final.y, database, plot = TRUE)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"`, and `"all"` for the 142 countries included in the dataset (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`plot`	if `TRUE` (the default), displays a graph of the results.

Value

emean returns a list with the following objects:

mean_years_of_schooling: evolution of mean years of schooling from the initial to the last year.
countries: countries used to compute mean years of schooling.
If plot = TRUE, graphical representation of the numerical results.

Examples

emean(countries = "Advanced Economies", init.y = 1980, final.y = 2000,
      database = "male25")
emean(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980,
      final.y = 2000, database = "male25")

Mean log deviation (MLD) of education

Description

The emld function computes the MLD of education for any group of countries included in the dataset developed in Jorda and Alonso (2017). The function also provides a decomposition of this index into between-country and within-country inequality.

Usage

emld(countries, init.y, final.y, database, plot = TRUE)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"`, and `"all"` for the 142 countries included in the dataset (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`plot`	if `TRUE` (the default), displays a graph of the results.

Details

The estimates of the MLD for the specified group of countries can be easily derived by taking advantage of the additive decomposability of the generalized entropy family, to which the MLD belongs. The measure is computed as the sum of the following terms, which correspond to within-country and between-country inequality, respectively (see, e.g., Cowell, 2011):

$MLD_W=\sum_{i=1}^{N} p_i MLD_i;$

$MLD_B=\sum_{i=1}^{N} p_i \log(\mu / \mu_i),$

where N is the number of countries, $MLD_i$ and $p_i$ denote, respectively, the MLD and the population weight of country i, and $\mu_i$ and $\mu$ are the mean years of schooling of country i and of the whole group.
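
A short numerical check of this decomposition, as a sketch on simulated data rather than on the package's estimates. The groups are hypothetical gamma samples and `mld` is a helper defined here for illustration.

```r
# Mean log deviation of a strictly positive sample.
mld <- function(y) mean(log(mean(y) / y))

set.seed(2)
# Hypothetical "countries": simulated years of schooling.
groups <- list(
  A = rgamma(200, shape = 4, rate = 0.5),
  B = rgamma(300, shape = 9, rate = 0.8),
  C = rgamma(500, shape = 6, rate = 0.5)
)

n_i   <- sapply(groups, length)
p_i   <- n_i / sum(n_i)        # population weights
mu_i  <- sapply(groups, mean)  # country means
mu    <- sum(p_i * mu_i)       # overall mean
mld_i <- sapply(groups, mld)   # country-level MLD

mld_within  <- sum(p_i * mld_i)
mld_between <- sum(p_i * log(mu / mu_i))
mld_total   <- mld(unlist(groups))

# Within plus between reproduces the MLD of the pooled sample.
print(all.equal(mld_within + mld_between, mld_total))
```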

Value

emld returns a list with the following objects:

MLD: evolution of the MLD of education from the initial to the last year, decomposed in between-country and within-country inequality.
countries: countries used to compute the MLD.
If plot = TRUE, graphical representation of the numerical results.

References

Cowell, F. (2011). Measuring inequality. Oxford University Press.

Examples

emld(countries = "East Asia and the Pacific", init.y = 1980,
     final.y = 2000, database = "female25")
emld(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980,
     final.y = 2000, database = "total25")

Theil index of education

Description

etheil is a function to compute the Theil index of education for any group of countries included in the dataset developed in Jorda and Alonso (2017). The function also provides a decomposition of this index into between-country and within-country inequality.

Usage

etheil(countries, init.y, final.y, database, plot = TRUE)

Arguments

`countries`	character vector with the country codes of the countries to be used. Some macro-regions are already defined and can be used instead of the country codes: `"South Asia"`, `"Europe and Central Asia"`, `"Middle East and North Africa"`, `"Latin America and the Caribbean"`, `"Advanced Economies"`, `"Sub-Saharan Africa"`, `"East Asia and the Pacific"`, and `"all"` for the 142 countries included in the dataset (see `data_country`).
`init.y`	the first year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`final.y`	the last year for which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010.
`database`	population subgroup for which the function is calculated. The following options are available: `"total15"`: Total population aged over 15. `"total25"`: Total population aged over 25. `"male15"`: Male population aged over 15. `"male25"`: Male population aged over 25. `"female15"`: Female population aged over 15. `"female25"`: Female population aged over 25.
`plot`	if `TRUE` (the default), displays a graph of the results.

Details

The estimates of the Theil index for the specified group of countries can be easily derived by taking advantage of the additive decomposability of the generalized entropy family, to which the Theil index belongs. The index is computed as the sum of the following terms, which correspond to within-country and between-country inequality, respectively (see, e.g., Cowell, 2011):

$T_W=\sum_{i=1}^{N} s_i T_i;$

$T_B=\sum_{i=1}^{N} s_i \log(\mu_i / \mu),$

where N is the number of countries, $T_i$ denotes the Theil index of country i, $\mu_i$ and $\mu$ are the mean years of schooling of country i and of the whole group, and $s_i$ stands for the proportion of the mean years of schooling of country i in the overall mean of the group: $s_i=\lambda_i \mu_i / \sum_{j=1}^{N} \lambda_j \mu_j$.
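
The same kind of numerical check applies to the Theil decomposition. This sketch uses simulated data only: the groups are hypothetical gamma samples, `theil` is a helper defined here, and the weight $\lambda_i$ is assumed to equal the population share of country i.

```r
# Theil index of a strictly positive sample.
theil <- function(y) {
  r <- y / mean(y)
  mean(r * log(r))
}

set.seed(3)
# Hypothetical "countries": simulated years of schooling.
groups <- list(
  A = rgamma(200, shape = 4, rate = 0.5),
  B = rgamma(300, shape = 9, rate = 0.8),
  C = rgamma(500, shape = 6, rate = 0.5)
)

n_i  <- sapply(groups, length)
lam  <- n_i / sum(n_i)          # population shares (assumed to be lambda_i)
mu_i <- sapply(groups, mean)    # country means
mu   <- sum(lam * mu_i)         # overall mean
s_i  <- lam * mu_i / mu         # shares in the overall mean
t_i  <- sapply(groups, theil)   # country-level Theil indices

theil_within  <- sum(s_i * t_i)
theil_between <- sum(s_i * log(mu_i / mu))
theil_total   <- theil(unlist(groups))

# Within plus between reproduces the Theil index of the pooled sample.
print(all.equal(theil_within + theil_between, theil_total))
```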

Value

etheil returns a list with the following objects:

Theli_index: evolution of the Theil index of education from the initial to the last year, decomposed in between-country and within-country inequality.
countries: countries used to compute the Theil index.
If plot = TRUE, graphical representation of the numerical results.

References

Cowell, F. (2011). Measuring inequality. Oxford University Press.

Examples

etheil(countries = "Advanced Economies", init.y = 1980, final.y = 2000,
       database = "male25")
etheil(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980,
       final.y = 2000, database = "female15")

This dataset contains some statistics about the distribution of educational attainments for the female population aged over 15, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_female15a)

Format

A data frame with 1278 rows and 9 variables

This dataset contains some statistics about the distribution of educational attainments for the female population aged over 25, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_female25a)

Format

A data frame with 1278 rows and 9 variables

This dataset contains some statistics about the distribution of educational attainments for the male population aged over 15, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_male15a)

Format

A data frame with 1278 rows and 9 variables

This dataset contains some statistics about the distribution of educational attainments for the male population aged over 25, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_male25a)

Format

A data frame with 1278 rows and 9 variables

This dataset contains some statistics about the distribution of educational attainments for the total population aged over 15, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_total15a)

Format

A data frame with 1278 rows and 9 variables

This dataset contains some statistics about the distribution of educational attainments for the total population aged over 25, taken from www.educationdata.unican.es. The population data come from the Barro and Lee database, available at http://www.barrolee.com

Description

country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com

Usage

data(ineq_total25a)

Format

A data frame with 1278 rows and 9 variables
