Title: | Compute and Decompose Inequality in Education |
---|---|
Description: | Easily compute education inequality measures and the distribution of educational attainments for any group of countries, using the data set developed in Jorda, V. and Alonso, J.M. (2017) <DOI:10.1016/j.worlddev.2016.10.005>. The package computes not only the Gini index, but also generalized entropy (GE) measures for different values of the sensitivity parameter. In particular, the package includes functions to compute the mean log deviation, which is more sensitive to the bottom part of the distribution; Theil's entropy measure, equally sensitive to all parts of the distribution; and the GE measure with the sensitivity parameter set equal to 2, which gives more weight to differences in higher education. The decomposition of these measures into between-country and within-country components is also provided. Two graphical tools are provided to analyse the evolution of the distribution of educational attainments: the cumulative distribution function and the Lorenz curve. |
Authors: | Vanesa Jorda [aut, cre], Jose Manuel Alonso [aut] |
Maintainer: | Vanesa Jorda <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2025-01-23 03:54:41 UTC |
Source: | https://github.com/cran/educineq |
educineq
functions.
country. Country name
code. World Bank country code
region. Macro-region to which the country belongs
data(data_country)
A data frame with 142 rows and 3 variables
edcdf is a function to graph the CDF of time of schooling for any group of
countries using the set of estimates developed in Jorda and Alonso (2017).
edcdf(countries, init.y, final.y, database)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
We use the set of estimates developed in Jorda and Alonso (2017), where the generalized gamma distribution (Stacy, 1962) is used to model the time that individuals attend school until they complete the educational cycle or decide to drop out. The reason is twofold. First, the generalized gamma distribution is a parsimonious model that nests most of the parametric assumptions described in the literature (see Marshall and Olkin, 2007). Second, the generalized gamma distribution is able to model one- and zero-mode distributions and to represent several types of hazard rates. The flexibility of this model to accommodate such heterogeneity makes it a strong candidate to model the distribution of education. This parametric model includes as particular cases most of the distributions commonly used in survival analysis, including the Weibull, the exponential, and the gamma distributions, so it would converge to any of its special cases if needed.
To accommodate time- and country-varying parameters, the distribution of education of each country and year is estimated by non-linear least squares (see Jorda and Alonso (2017) for further details on the estimation strategy). The distribution of education of a particular group or region of countries is simply defined as a mixture of the national distributions, weighted by their population shares.
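The mixture construction described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper names `pgengamma_stacy` and `mixture_cdf`, the column names `a`, `d`, `p`, and all parameter values are assumptions made for the example. It relies on the fact that the CDF of Stacy's generalized gamma can be written through the regularized incomplete gamma function, which `pgamma()` provides.

```r
# Stacy (1962) generalized gamma CDF with scale a and shape parameters d, p:
# F(x) = P(d / p, (x / a)^p), where P is the regularized lower incomplete
# gamma function, available in base R as pgamma().
pgengamma_stacy <- function(x, a, d, p) {
  pgamma((x / a)^p, shape = d / p)
}

# CDF of a group of countries: population-weighted mixture of national CDFs.
# `params` has one row per country with columns a, d, p (hypothetical names).
mixture_cdf <- function(x, params, pop) {
  w <- pop / sum(pop)  # population shares
  national <- vapply(
    seq_len(nrow(params)),
    function(i) pgengamma_stacy(x, params$a[i], params$d[i], params$p[i]),
    numeric(length(x))
  )
  # Reshape so each column holds one country's CDF, then average with weights.
  as.vector(matrix(national, nrow = length(x)) %*% w)
}

# Made-up parameters and populations for two countries (illustration only).
params <- data.frame(a = c(8, 10), d = c(2, 3), p = c(1.5, 2))
pop <- c(30, 70)
years <- seq(0, 20, by = 0.5)
Fx <- mixture_cdf(years, params, pop)
```

Setting `d = p = 1` reduces `pgengamma_stacy` to the exponential CDF, which is one of the special cases mentioned above.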
edcdf returns a graph of the evolution of the CDF of education
over the specified period.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
Marshall, A. W. and Olkin, I. (2007). Life distributions. Structure of nonparametric, semiparametric, and parametric families. New York: Springer.
Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.
GenGamma.orig, data_country.
Visit http://www.educationdata.unican.es for more information on
the construction of the dataset and the available
countries.
edcdf(countries = "South Asia", init.y = 1980, final.y = 1990, database = "female25")
edcdf(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1995, final.y = 2010, database = "male25")
ege2 computes the generalized entropy measure of education, with
the sensitivity parameter set to 2, for any group of countries included
in the dataset developed in Jorda and Alonso (2017). The function also
provides a decomposition of this index into between-country and
within-country inequality.
ege2(countries, init.y, final.y, database, plot = TRUE)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
plot |
if |
The estimates of the generalized entropy measure for the specified group of countries can be derived by taking advantage of the additive decomposability of this family. The measure is computed as the sum of two terms, which correspond to within-country and between-country inequality respectively (see, e.g., Cowell, 2011). In these terms, N is the number of countries, GE_i and p_i denote, respectively, the generalized entropy measure and the population weight of country i, and s_i stands for the ratio of the mean years of schooling of country i to the overall mean of the group.
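Under the standard additive decomposition of the generalized entropy family with sensitivity parameter 2 (Cowell, 2011), the two terms can be written as below. This is a sketch in assumed notation: GE_i is the index of country i, p_i its population weight, mu_i its mean years of schooling, mu the group mean, and s_i = mu_i / mu.

```latex
GE_2 \;=\;
\underbrace{\sum_{i=1}^{N} p_i \, s_i^{2} \, GE_i}_{\text{within-country}}
\;+\;
\underbrace{\frac{1}{2}\left[\sum_{i=1}^{N} p_i \, s_i^{2} - 1\right]}_{\text{between-country}},
\qquad
s_i = \frac{\mu_i}{\mu},
\quad
\mu = \sum_{i=1}^{N} p_i \, \mu_i .
```

The between-country term is the GE(2) index of a hypothetical distribution in which every individual holds the mean of their own country, so it vanishes when all national means coincide.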
ege2 returns a list with the following objects:
GE_2: evolution of the generalized entropy measure of education
from the initial to the last year, decomposed into between-country
and within-country inequality.
countries: countries used to compute the generalized entropy measure.
If plot = TRUE, graphical representation of the numerical results.
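As a rough illustration of the within/between split reported in the output, the sketch below applies the standard GE(2) decomposition to country-level summaries. The helper `ge2_decompose` and the input values are hypothetical and only mimic the kind of columns (mean years of schooling, national GE(2), population) the package data provide; it is not the package's own code.

```r
# Toy country-level inputs (made-up values, not taken from the dataset).
mys <- c(4, 8)       # mean years of schooling per country
ge2 <- c(0.2, 0.1)   # national GE(2) indices
pop <- c(1, 1)       # populations

# Standard additive decomposition of GE(2) into within and between terms.
ge2_decompose <- function(mys, ge2, pop) {
  p  <- pop / sum(pop)   # population weights
  mu <- sum(p * mys)     # group mean
  s  <- mys / mu         # national mean relative to the group mean
  within  <- sum(p * s^2 * ge2)
  between <- 0.5 * (sum(p * s^2) - 1)
  c(within = within, between = between, total = within + between)
}

res <- ge2_decompose(mys, ge2, pop)
```

With equal national means the between component is zero and the total reduces to the population-weighted average of the national indices.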
Cowell, F. (2011). Measuring inequality. Oxford University Press.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
data_country. Visit http://www.educationdata.unican.es
for more information on the construction of the dataset and the available
countries.
ege2(countries = "all", init.y = 1980, final.y = 2000, database = "total25")
ege2(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980, final.y = 2000, database = "female15")
egini is a function to compute the Gini index of education for any group of
countries using the set of estimates developed in Jorda and Alonso (2017).
egini(countries, init.y, final.y, database, M = 5000, plot = TRUE)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
M |
size of the simulated sample. |
plot |
if |
We use the set of estimates developed in Jorda and Alonso (2017), where
the generalized gamma distribution (Stacy, 1962) is used to model the time that
individuals attend school until they complete the educational cycle or decide to
drop out. The Gini index is computed from a synthetic sample of size M
of the distribution of education of the specified group of countries.
The sample is obtained by Monte Carlo simulation using the mixture of the national
distributions, weighted by their population shares.
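The Monte Carlo step can be sketched in base R as follows. The helper names (`rgengamma_stacy`, `rmixture`, `gini_sample`), the parameter columns `a`, `d`, `p`, and all numeric values are assumptions for illustration; the package itself may draw from the generalized gamma and compute the index through other routines.

```r
set.seed(123)

# Draw from Stacy's generalized gamma: if G ~ Gamma(shape = d / p, rate = 1),
# then a * G^(1 / p) has CDF pgamma((x / a)^p, shape = d / p).
rgengamma_stacy <- function(n, a, d, p) {
  a * rgamma(n, shape = d / p)^(1 / p)
}

# Sample of size M from the population-weighted mixture of national
# distributions: pick a country for each draw, then draw from its distribution.
rmixture <- function(M, params, pop) {
  country <- sample(nrow(params), size = M, replace = TRUE, prob = pop / sum(pop))
  rgengamma_stacy(M, params$a[country], params$d[country], params$p[country])
}

# Sample Gini index computed from the ordered observations.
gini_sample <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

params <- data.frame(a = c(8, 10), d = c(2, 3), p = c(1.5, 2))  # made-up values
pop <- c(30, 70)                                                # made-up values
x <- rmixture(M = 5000, params, pop)
g <- gini_sample(x)
```

Because the index is estimated from a random sample, repeated calls give slightly different values; a larger M reduces this simulation noise at the cost of computing time.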
egini returns a list with the following objects:
Gini_index: evolution of the Gini index of education
from the initial to the last year.
countries: countries used to compute the Gini index.
If plot = TRUE, graphical representation of the numerical results.
Cowell, F. (2011). Measuring inequality. Oxford University Press.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.
GenGamma.orig, Gini, data_country.
Visit http://www.educationdata.unican.es for more information on
the construction of the dataset and the available
countries.
egini(countries = c("DNK", "FIN"), init.y = 1995, final.y = 1995, database = "male25", M = 100, plot = FALSE)
elc is a function to graph the Lorenz curve of education for any group of
countries using the set of estimates developed in Jorda and Alonso (2017).
elc(countries, init.y, final.y, database, M = 5000)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
M |
size of the simulated sample (default 5000). |
We use the set of estimates developed in Jorda and Alonso (2017), where
the generalized gamma distribution (Stacy, 1962) is used to model the time that
individuals attend school until they complete the educational cycle or decide to
drop out. To accommodate time- and country-varying parameters, the distribution of education
of each country and year is estimated by non-linear least squares (see Jorda and
Alonso (2017) for further details on the estimation strategy). The Lorenz curve
is computed from a synthetic sample of size M of the distribution of
education of the specified group of countries.
The sample is obtained by Monte Carlo simulation using the mixture of the national
distributions, weighted by their population shares.
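An empirical Lorenz curve from a simulated sample can be sketched as below. The helper `lorenz_points` is hypothetical, and the sample is a plain gamma draw standing in for the mixture sample described above, so the numbers carry no empirical meaning.

```r
set.seed(42)

# Stand-in for a simulated sample of years of schooling; a plain gamma draw is
# used here only for illustration, not the package's fitted mixture.
x <- rgamma(300, shape = 2, rate = 0.25)

# Empirical Lorenz curve: cumulative population share p against the
# cumulative share L of total years of schooling held by the bottom p.
lorenz_points <- function(x) {
  x <- sort(x)
  data.frame(
    p = c(0, seq_along(x) / length(x)),
    L = c(0, cumsum(x) / sum(x))
  )
}

lc <- lorenz_points(x)
```

Plotting `lc$L` against `lc$p` with `type = "l"` and adding the 45-degree line with `abline(0, 1)` gives the usual picture: the further the curve lies below the diagonal, the more unequal the distribution.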
elc returns a graph of the evolution of the Lorenz curve of education
over the specified period.
Cowell, F. (2011). Measuring inequality. Oxford University Press.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
Stacy, E. W. (1962). A generalization of the gamma distribution. Annals of Mathematical Statistics, 33, 1187 - 1192.
GenGamma.orig, Lc, data_country.
Visit http://www.educationdata.unican.es
for more information on the construction of the dataset and the available
countries.
elc(countries = c("CAN","USA"), init.y = 1985, final.y = 1985, database = "female25", M = 300)
emean is a function to compute mean years of schooling for any group of
countries included in the dataset developed in Jorda and Alonso (2017).
It is computed as the average of the national years of schooling
weighted by population weights.
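The population-weighted average can be reproduced with base R's `weighted.mean()`. The country labels and values below are made up for illustration and are not taken from the dataset.

```r
# Hypothetical national mean years of schooling and populations (millions).
mys <- c(A = 10.2, B = 9.1, C = 11.0)
pop <- c(A = 5.1, B = 4.9, C = 8.6)

# Group mean: national means weighted by population shares.
group_mean <- weighted.mean(mys, w = pop)
```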
emean(countries, init.y, final.y, database, plot = TRUE)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
plot |
if |
emean returns a list with the following objects:
mean_years_of_schooling: evolution of mean years of
schooling from the initial to the last year.
countries: countries used to compute mean years of schooling.
If plot = TRUE, graphical representation of the numerical results.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
data_country. Visit http://www.educationdata.unican.es
for more information on the construction of the dataset and the available
countries.
emean(countries = "Advanced Economies", init.y = 1980, final.y = 2000, database = "male25")
emean(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980, final.y = 2000, database = "male25")
emld computes the mean log deviation (MLD) of education for any group of
countries included in the dataset developed in Jorda and Alonso (2017).
The function also provides a decomposition of this index into between-country
and within-country inequality.
emld(countries, init.y, final.y, database, plot = TRUE)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
plot |
if |
The estimates of the MLD for the specified group of countries can be derived by taking advantage of the additive decomposability of the generalized entropy family, to which the MLD belongs. The MLD is computed as the sum of two terms, which correspond to within-country and between-country inequality respectively (see, e.g., Cowell, 2011). In these terms, N is the number of countries, and MLD_i and p_i denote, respectively, the MLD and the population weight of country i.
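In the standard decomposition of the mean log deviation (Cowell, 2011), the two terms take the form below. This is a sketch in assumed notation: MLD_i is the index of country i, p_i its population weight, mu_i its mean years of schooling, and mu the group mean.

```latex
MLD \;=\;
\underbrace{\sum_{i=1}^{N} p_i \, MLD_i}_{\text{within-country}}
\;+\;
\underbrace{\sum_{i=1}^{N} p_i \, \log\!\left(\frac{\mu}{\mu_i}\right)}_{\text{between-country}},
\qquad
\mu = \sum_{i=1}^{N} p_i \, \mu_i .
```

The within-country term weights each national index by population share only, which is why this member of the family is the most sensitive to what happens in populous countries with low attainment.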
emld returns a list with the following objects:
MLD: evolution of the MLD of education
from the initial to the last year, decomposed into between-country
and within-country inequality.
countries: countries used to compute the MLD.
If plot = TRUE, graphical representation of the numerical results.
Cowell, F. (2011). Measuring inequality. Oxford University Press.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
data_country. Visit http://www.educationdata.unican.es
for more information on the construction of the dataset and the available
countries.
emld(countries = "East Asia and the Pacific", init.y = 1980, final.y = 2000, database = "female25")
emld(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980, final.y = 2000, database = "total25")
etheil is a function to compute the Theil index of education for any group of
countries included in the dataset developed in Jorda and Alonso (2017).
The function also provides a decomposition of this index into between-country
and within-country inequality.
etheil(countries, init.y, final.y, database, plot = TRUE)
countries |
character vector with the country codes of the countries
to be used. Some macro-regions are already defined and can be used
instead of the country codes: |
init.y |
the first year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
final.y |
the last year in which the function is calculated. Available years are 1970, 1975, 1980, 1985, 1990, 1995, 2000, 2005, 2010. |
database |
population subgroup for which the function is calculated. The following options are available:
|
plot |
if |
The estimates of the Theil index for the specified group of countries can be derived by taking advantage of the additive decomposability of the generalized entropy family, to which the Theil index belongs. The index is computed as the sum of two terms, which correspond to within-country and between-country inequality respectively (see, e.g., Cowell, 2011). In these terms, N is the number of countries, T_i denotes the Theil index of country i, and s_i stands for the ratio of the mean years of schooling of country i to the overall mean of the group.
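In the standard decomposition of the Theil index (Cowell, 2011), the two terms take the form below. This is a sketch in assumed notation: T_i is the index of country i, p_i its population weight, mu_i its mean years of schooling, mu the group mean, and s_i = mu_i / mu.

```latex
T \;=\;
\underbrace{\sum_{i=1}^{N} p_i \, s_i \, T_i}_{\text{within-country}}
\;+\;
\underbrace{\sum_{i=1}^{N} p_i \, s_i \, \log s_i}_{\text{between-country}},
\qquad
s_i = \frac{\mu_i}{\mu},
\quad
\mu = \sum_{i=1}^{N} p_i \, \mu_i .
```

Here each national index is weighted by p_i s_i, the country's share of the group's total years of schooling, so the weights sum to one.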
etheil returns a list with the following objects:
Theli_index: evolution of the Theil index of education
from the initial to the last year, decomposed into between-country
and within-country inequality.
countries: countries used to compute the Theil index.
If plot = TRUE, graphical representation of the numerical results.
Cowell, F. (2011). Measuring inequality. Oxford University Press.
Jorda, V. and Alonso, J.M. (2017). New estimates on educational attainment using a continuous approach (1970-2010), World Development, 90, 281 - 293. http://www.sciencedirect.com/science/article/pii/S0305750X16305010
data_country. Visit http://www.educationdata.unican.es
for more information on the construction of the dataset and the available
countries.
etheil(countries = "Advanced Economies", init.y = 1980, final.y = 2000, database = "male25")
etheil(countries = c("DNK", "FIN", "ISL", "NOR", "SWE"), init.y = 1980, final.y = 2000, database = "female15")
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_female15a)
A data frame with 1278 rows and 9 variables
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_female25a)
A data frame with 1278 rows and 9 variables
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_male15a)
A data frame with 1278 rows and 9 variables
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_male25a)
A data frame with 1278 rows and 9 variables
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_total15a)
A data frame with 1278 rows and 9 variables
country. Country name
year
code. World Bank country code
region. Macro-region to which the country belongs
mys. Mean years of schooling
mld. Mean log deviation of education.
theil. Theil index of education
ge2. Generalized entropy measure of education
pop. Total population, http://www.barrolee.com
data(ineq_total25a)
A data frame with 1278 rows and 9 variables