The pandas library¶
pandas is a Python library for data analysis and manipulation. It provides data structures and operations for working with numerical tables and time series. It was created by Wes McKinney in 2008. The name “pandas” refers to both “Panel Data” and “Python Data Analysis”.
As its main data structure, pandas implements the data frame, a rectangular array organized into rows and columns.
Installation¶
pandas can be installed with either Conda or pip.
# Installation with Conda
conda install pandas
# Installation with pip
pip install pandas
Loading¶
pandas is typically imported alongside NumPy, the library it is built on, which provides array and linear algebra operations; the order of the two imports does not matter. Note the aliases np and pd, which are optional but recommended.
import numpy as np # array and linear algebra library
import pandas as pd # data analysis library
Data structures¶
The two main pandas data structures are series and dataframes.
Series¶
Series are one-dimensional arrays that can hold data of any type. They resemble a single column of a table.
primos = [2, 3, 5, 7, 11]
serie_primos = pd.Series(primos)
serie_primos
0 2
1 3
2 5
3 7
4 11
dtype: int64
Each element of a series has an index (i.e. a position), starting at 0.
# First element
print(serie_primos[0])
# Second element
print(serie_primos[1])
2
3
The index can also carry custom labels:
serie_primos = pd.Series(primos, index = ["A", "B", "C", "D", "E"])
serie_primos
A 2
B 3
C 5
D 7
E 11
dtype: int64
# Element at index "D"
print(serie_primos["D"])
7
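The two ways of addressing a series element shown above (by position and by label) can also be written with explicit accessors. The following is a minimal sketch, assuming the same list of primes; the variable names are illustrative and not part of the original example.

```python
import pandas as pd

# Same values and labels as in the example above
primes = pd.Series([2, 3, 5, 7, 11], index=["A", "B", "C", "D", "E"])

# Label-based access with .loc
by_label = primes.loc["D"]

# Position-based access with .iloc (positions start at 0)
by_position = primes.iloc[3]

# Slices behave differently: .loc includes the end label,
# while .iloc excludes the end position
label_slice = primes.loc["B":"D"]
position_slice = primes.iloc[1:3]

print(by_label, by_position)
print(label_slice.tolist(), position_slice.tolist())
```

Using .loc and .iloc makes it explicit whether a key is meant as a label or as a position, which avoids ambiguity once the index no longer consists of the default integers.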
Dataframes¶
Dataframes are two-dimensional structures. A series can be seen as a single column of a table, and a dataframe as the whole table. A dataframe can be built from several series or, as in this example, from a dictionary of lists.
# Dataframe built from a dictionary with two columns
datos = {
"pais": ["PA", "CR", "NI"],
"poblacion": [4.1, 5.0, 6.6]
}
paises = pd.DataFrame(datos)
paises
| | pais | poblacion |
---|---|---|
0 | PA | 4.1 |
1 | CR | 5.0 |
2 | NI | 6.6 |
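Since a dataframe can be seen as a collection of series, the same table can also be assembled from two explicit pd.Series objects. This is a small sketch under that assumption; the names codes, populations, and countries are illustrative.

```python
import pandas as pd

# Two series sharing the default integer index 0, 1, 2
codes = pd.Series(["PA", "CR", "NI"])
populations = pd.Series([4.1, 5.0, 6.6])

# Each dictionary key becomes a column name; the series are aligned by index
countries = pd.DataFrame({"pais": codes, "poblacion": populations})

# Extracting a single column returns a series again
column = countries["poblacion"]

print(countries.shape)
print(type(column).__name__)
```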
The loc attribute returns one or more rows of a dataframe:
# Second row
paises.loc[1]
pais CR
poblacion 5.0
Name: 1, dtype: object
# Second and third rows
paises.loc[[1, 2]]
| | pais | poblacion |
---|---|---|
1 | CR | 5.0 |
2 | NI | 6.6 |
Dataframe indexes can also be labeled:
paises = pd.DataFrame(datos, index=["pais0", "pais1", "pais2"])
paises
| | pais | poblacion |
---|---|---|
pais0 | PA | 4.1 |
pais1 | CR | 5.0 |
pais2 | NI | 6.6 |
# Row labeled "pais0"
paises.loc["pais0"]
pais PA
poblacion 4.1
Name: pais0, dtype: object
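Once the index carries labels, loc looks rows up by label, while the companion attribute iloc still works by position. The sketch below rebuilds the small table from this section and shows both, plus a lookup of a single cell; it is an illustration rather than part of the original notebook.

```python
import pandas as pd

datos = {
    "pais": ["PA", "CR", "NI"],
    "poblacion": [4.1, 5.0, 6.6],
}
paises = pd.DataFrame(datos, index=["pais0", "pais1", "pais2"])

# Row by position, regardless of the index labels
first_row = paises.iloc[0]

# Single cell: row label first, then column name
value = paises.loc["pais1", "poblacion"]

# Several rows and a subset of columns at once
subset = paises.loc[["pais1", "pais2"], ["pais"]]

print(first_row["pais"], value)
print(subset)
```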
Basic operations¶
The following sections describe and illustrate some of the basic pandas functions.
The examples use a set of occurrence records of felines (family Felidae) from Costa Rica, obtained through a query to the GBIF portal.
read_csv() - loading data¶
felidae = pd.read_csv("https://raw.githubusercontent.com/curso-python-imn/curso-python-imn.github.io/main/datos/gbif/felidae.csv", sep="\t")
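The sep="\t" argument in the call above tells read_csv() that the fields are separated by tabs rather than commas. The sketch below shows the same idea without network access, reading a tiny tab-separated text from memory; the three records are made up for illustration and are not taken from the GBIF file.

```python
import io

import pandas as pd

# Hypothetical tab-separated content (made-up values, for illustration only)
tsv_text = (
    "gbifID\tspecies\tyear\n"
    "1\tPuma concolor\t2021\n"
    "2\tPanthera onca\t2018\n"
)

# read_csv accepts file paths, URLs, and file-like objects such as StringIO
records = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(records.columns.tolist())
print(records.shape)
```

Without sep="\t", each line would be read as a single column, because the default separator is a comma.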
info() - general information about a dataset¶
felidae.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 50 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 gbifID 150 non-null int64
1 datasetKey 150 non-null object
2 occurrenceID 147 non-null object
3 kingdom 150 non-null object
4 phylum 150 non-null object
5 class 150 non-null object
6 order 150 non-null object
7 family 150 non-null object
8 genus 150 non-null object
9 species 150 non-null object
10 infraspecificEpithet 18 non-null object
11 taxonRank 150 non-null object
12 scientificName 150 non-null object
13 verbatimScientificName 150 non-null object
14 verbatimScientificNameAuthorship 16 non-null object
15 countryCode 150 non-null object
16 locality 41 non-null object
17 stateProvince 132 non-null object
18 occurrenceStatus 150 non-null object
19 individualCount 27 non-null float64
20 publishingOrgKey 150 non-null object
21 decimalLatitude 150 non-null float64
22 decimalLongitude 150 non-null float64
23 coordinateUncertaintyInMeters 129 non-null float64
24 coordinatePrecision 0 non-null float64
25 elevation 3 non-null float64
26 elevationAccuracy 3 non-null float64
27 depth 3 non-null float64
28 depthAccuracy 3 non-null float64
29 eventDate 148 non-null object
30 day 147 non-null float64
31 month 148 non-null float64
32 year 148 non-null float64
33 taxonKey 150 non-null int64
34 speciesKey 150 non-null int64
35 basisOfRecord 150 non-null object
36 institutionCode 137 non-null object
37 collectionCode 150 non-null object
38 catalogNumber 150 non-null object
39 recordNumber 4 non-null object
40 identifiedBy 121 non-null object
41 dateIdentified 109 non-null object
42 license 150 non-null object
43 rightsHolder 132 non-null object
44 recordedBy 135 non-null object
45 typeStatus 0 non-null float64
46 establishmentMeans 10 non-null object
47 lastInterpreted 150 non-null object
48 mediaType 102 non-null object
49 issue 137 non-null object
dtypes: float64(13), int64(3), object(34)
memory usage: 58.7+ KB
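The Non-Null Count column of info() shows how many values are present in each column. The same counts can be obtained as a series, which is easier to sort or filter. The sketch below uses a small made-up dataframe (the locality and elevation values are hypothetical) rather than the felidae dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with some missing values
sample = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca", "Leopardus pardalis"],
    "locality": ["La Selva", None, None],
    "elevation": [np.nan, np.nan, 120.0],
})

# Number of values present in each column (what info() reports as non-null)
non_null = sample.notna().sum()

# Number of missing values in each column
missing = sample.isna().sum()

print(non_null)
print(missing)
```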
head(), tail(), sample() - displaying rows of a dataset¶
# First 5 records (head() returns 5 rows by default)
felidae.head()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
2 | 3325502794 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/85490861 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-07-03T16:59:46 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T20:59:21.345Z | StillImage | COORDINATE_ROUNDED |
3 | 3314547422 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/84224884 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-06-23T20:39:47 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T21:25:26.648Z | StillImage | NaN |
4 | 3307298689 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/82810053 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus wiedii | ... | David | 2021-06-13T09:10:50 | CC_BY_NC_4_0 | David | David | NaN | NaN | 2021-09-23T21:25:57.478Z | StillImage | COORDINATE_ROUNDED |
5 rows × 50 columns
# Last 5 records (tail() returns 5 rows by default)
felidae.tail()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
145 | 439779436 | 7e2989f0-f762-11e1-a439-00145eb45e9a | NaN | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | NaN | NaN | CC_BY_4_0 | NaN | NaN | NaN | NaN | 2021-09-24T03:32:37.674Z | StillImage | GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV... |
146 | 45869665 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13378 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
147 | 45869664 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13377 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
148 | 45869301 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:10219 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | Arnold, Keith A. | NaN | NATIVE | 2021-09-24T06:03:59.952Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
149 | 45869265 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:9289 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | NaN | NaN | NATIVE | 2021-09-24T06:03:59.933Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
5 rows × 50 columns
# 5 randomly selected records
felidae.sample(5)
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
118 | 1806323327 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | http://www.inaturalist.org/observations/4116862 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Lauren Harter | 2016-09-15T18:30:43 | CC0_1_0 | Lauren Harter | Lauren Harter | NaN | NaN | 2021-09-23T21:15:40.756Z | StillImage | COORDINATE_ROUNDED |
145 | 439779436 | 7e2989f0-f762-11e1-a439-00145eb45e9a | NaN | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | NaN | NaN | CC_BY_4_0 | NaN | NaN | NaN | NaN | 2021-09-24T03:32:37.674Z | StillImage | GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV... |
144 | 476813558 | 4bfac3ea-8763-4f4b-a71a-76a6f5f243d3 | MCZ:Mamm:5719 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | [no agent data] | NaN | CC_BY_NC_4_0 | President and Fellows of Harvard College | William More Gabb | NaN | NaN | 2021-10-01T01:54:30.139Z | NaN | OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU... |
8 | 3124895397 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/80059056 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Alejandro Garza Garcia | 2021-05-24T03:15:07 | CC_BY_NC_4_0 | Alejandro Garza Garcia | Alejandro Garza Garcia | NaN | NaN | 2021-09-23T21:25:32.610Z | StillImage | COORDINATE_ROUNDED |
120 | 1563873476 | 4bfac3ea-8763-4f4b-a71a-76a6f5f243d3 | MCZ:Cryo:3120 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Judith Marie Chupasko | NaN | CC_BY_NC_4_0 | President and Fellows of Harvard College | Judith Marie Chupasko | NaN | NaN | 2021-10-01T01:53:10.737Z | NaN | OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COU... |
5 rows × 50 columns
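head() and tail() return 5 rows unless a different number is passed, and sample() returns different rows on each call unless a seed is fixed. The sketch below illustrates both points on a small made-up dataframe; the value 42 used for random_state is arbitrary.

```python
import pandas as pd

# Small illustrative dataframe with 20 rows
numbers = pd.DataFrame({"n": range(20)})

# Explicit row counts
first_ten = numbers.head(10)
last_three = numbers.tail(3)

# Default row count
default_head = numbers.head()

# Fixing random_state makes the random selection reproducible
sample_a = numbers.sample(5, random_state=42)
sample_b = numbers.sample(5, random_state=42)

print(len(first_ten), len(last_three), len(default_head))
print(sample_a.equals(sample_b))
```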
The contents of a data frame can also be displayed by typing its name in the Python console.
felidae
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
2 | 3325502794 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/85490861 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-07-03T16:59:46 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T20:59:21.345Z | StillImage | COORDINATE_ROUNDED |
3 | 3314547422 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/84224884 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | Sofía Pastor Parajeles | 2021-06-23T20:39:47 | CC_BY_NC_4_0 | Sofía Pastor Parajeles | Sofía Pastor Parajeles | NaN | NaN | 2021-09-23T21:25:26.648Z | StillImage | NaN |
4 | 3307298689 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/82810053 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus wiedii | ... | David | 2021-06-13T09:10:50 | CC_BY_NC_4_0 | David | David | NaN | NaN | 2021-09-23T21:25:57.478Z | StillImage | COORDINATE_ROUNDED |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
145 | 439779436 | 7e2989f0-f762-11e1-a439-00145eb45e9a | NaN | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | NaN | NaN | CC_BY_4_0 | NaN | NaN | NaN | NaN | 2021-09-24T03:32:37.674Z | StillImage | GEODETIC_DATUM_ASSUMED_WGS84;RECORDED_DATE_INV... |
146 | 45869665 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13378 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
147 | 45869664 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:13377 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus tigrinus | ... | NaN | NaN | CC0_1_0 | NaN | Gardner, Alfred L. | NaN | NATIVE | 2021-09-24T06:03:59.587Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
148 | 45869301 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:10219 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | Arnold, Keith A. | NaN | NATIVE | 2021-09-24T06:03:59.952Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
149 | 45869265 | 847e2306-f762-11e1-a439-00145eb45e9a | urn:catalog:LSUMZ:Mammals:9289 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Leopardus | Leopardus pardalis | ... | NaN | NaN | CC0_1_0 | NaN | NaN | NaN | NATIVE | 2021-09-24T06:03:59.933Z | NaN | INSTITUTION_COLLECTION_MISMATCH |
150 rows × 50 columns
Selecting columns¶
The columns to display can be specified with a list of column names.
# Display the scientific name, species, date, year, month, and day columns
felidae[["scientificName", "species", "eventDate", "year", "month", "day"]]
| | scientificName | species | eventDate | year | month | day |
---|---|---|---|---|---|---|
0 | Puma concolor (Linnaeus, 1771) | Puma concolor | 2021-08-11T10:22:36 | 2021.0 | 8.0 | 11.0 |
1 | Puma concolor (Linnaeus, 1771) | Puma concolor | 2021-07-15T16:22:29 | 2021.0 | 7.0 | 15.0 |
2 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 2021-07-01T19:28:25 | 2021.0 | 7.0 | 1.0 |
3 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 2021-06-23T10:55:00 | 2021.0 | 6.0 | 23.0 |
4 | Leopardus wiedii (Schinz, 1821) | Leopardus wiedii | 2015-12-05T14:41:42 | 2015.0 | 12.0 | 5.0 |
... | ... | ... | ... | ... | ... | ... |
145 | Puma concolor (Linnaeus, 1771) | Puma concolor | NaN | NaN | NaN | NaN |
146 | Leopardus tigrinus (Schreber, 1775) | Leopardus tigrinus | 1967-05-15T00:00:00 | 1967.0 | 5.0 | 15.0 |
147 | Leopardus tigrinus (Schreber, 1775) | Leopardus tigrinus | 1967-02-01T00:00:00 | 1967.0 | 2.0 | 1.0 |
148 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 1965-06-28T00:00:00 | 1965.0 | 6.0 | 28.0 |
149 | Leopardus pardalis (Linnaeus, 1758) | Leopardus pardalis | 1963-02-03T00:00:00 | 1963.0 | 2.0 | 3.0 |
150 rows × 6 columns
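The double brackets used above are an ordinary list placed inside the indexing brackets. The following sketch, on a made-up two-row dataframe, contrasts this with single brackets, which return a series instead of a dataframe.

```python
import pandas as pd

# Hypothetical records, for illustration only
records = pd.DataFrame({
    "species": ["Puma concolor", "Panthera onca"],
    "year": [2021, 2018],
    "month": [8, 8],
})

# A single column name returns a series
as_series = records["species"]

# A list with one name returns a dataframe with one column
as_frame = records[["species"]]

# A list with several names returns those columns, in the order given
two_columns = records[["year", "species"]]

print(type(as_series).__name__, type(as_frame).__name__)
print(two_columns.columns.tolist())
```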
Selecting rows¶
# Select the rows corresponding to jaguars (*Panthera onca*)
panthera_onca = felidae[felidae["species"] == "Panthera onca"]
# Display the first records
panthera_onca.head()
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
21 | 3008449314 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15270189 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | mike_cove | 2018-08-09T20:10:35 | CC0_1_0 | mike_cove | mike_cove | NaN | NaN | 2021-09-23T20:57:45.632Z | NaN | COORDINATE_ROUNDED |
32 | 2860189171 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/58257138 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Kate Rothra Fleming | 2020-09-01T16:56:35 | CC_BY_NC_4_0 | Kate Rothra Fleming | Kate Rothra Fleming | NaN | NaN | 2021-09-23T21:14:12.219Z | StillImage | COORDINATE_ROUNDED |
38 | 2850700339 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/55493722 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Osa Conservation | 2020-08-05T14:47:40 | CC_BY_NC_4_0 | Osa Conservation | Osa Conservation | NaN | NaN | 2021-09-23T21:08:25.102Z | StillImage | COORDINATE_ROUNDED |
57 | 2802770349 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15255648 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | James Telford | 2018-08-09T07:22:08 | CC_BY_4_0 | James Telford | James Telford | NaN | NaN | 2021-09-23T20:58:14.603Z | NaN | COORDINATE_ROUNDED |
61 | 2629043484 | 09d2da7b-4699-4e45-b0da-73c982660c98 | urn:catalog:KU:KUM:145971:08353198cc55c65471bf... | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | Consuelo Lorenzo & Jorge Bolaños | NaN | CC_BY_4_0 | Comisión Nacional para el Conocimiento y Uso d... | NO DISPONIBLE | NaN | NaN | 2021-09-23T18:42:40.301Z | NaN | TYPE_STATUS_INVALID;OCCURRENCE_STATUS_INFERRED... |
5 rows × 50 columns
# Select the rows corresponding to jaguars (*Panthera onca*) or pumas (*Puma concolor*)
panthera_onca_puma_concolor = felidae[(felidae["species"] == "Panthera onca") | (felidae["species"] == "Puma concolor")]
# Display the first 10 records
panthera_onca_puma_concolor.head(10)
| | gbifID | datasetKey | occurrenceID | kingdom | phylum | class | order | family | genus | species | ... | identifiedBy | dateIdentified | license | rightsHolder | recordedBy | typeStatus | establishmentMeans | lastInterpreted | mediaType | issue |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3337559907 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/90794984 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marvin López M. | 2021-08-11T17:02:57 | CC_BY_NC_4_0 | Marvin López M. | Marvin López M. | NaN | NaN | 2021-09-23T21:26:16.096Z | StillImage | COORDINATE_ROUNDED |
1 | 3333401669 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/88270427 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Tiziano Luka Pesci Rubilar | 2021-07-23T17:52:04 | CC_BY_NC_4_0 | Rebeca Quirós | Rebeca Quirós | NaN | NaN | 2021-09-23T21:15:51.507Z | NaN | COORDINATE_ROUNDED |
6 | 3302057398 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/81502744 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Michelle Monge-Velazquez | 2021-06-04T00:10:47 | CC_BY_NC_4_0 | Michelle Monge-Velazquez | Michelle Monge-Velazquez | NaN | NaN | 2021-09-23T21:15:55.933Z | StillImage | COORDINATE_ROUNDED |
10 | 3097275563 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73407648 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Jaime Marcelo Aranda Sánchez | 2021-04-09T22:42:23 | CC_BY_NC_4_0 | nubegris | nubegris | NaN | NaN | 2021-09-23T21:15:03.814Z | StillImage | COORDINATE_ROUNDED |
11 | 3079910798 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73053113 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | gernotkunz | 2021-04-05T21:48:18 | CC_BY_NC_4_0 | gernotkunz | gernotkunz | NaN | NaN | 2021-09-23T21:24:37.211Z | NaN | COORDINATE_ROUNDED |
12 | 3079872785 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/73053107 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | gernotkunz | 2021-04-05T21:48:17 | CC_BY_NC_4_0 | gernotkunz | gernotkunz | NaN | NaN | 2021-09-23T21:23:35.478Z | NaN | COORDINATE_ROUNDED |
13 | 3067612232 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/66718421 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Jeff Mollenhauer | 2020-12-18T01:09:50 | CC_BY_NC_4_0 | Jeff Mollenhauer | Jeff Mollenhauer | NaN | NaN | 2021-09-23T21:09:44.468Z | StillImage | COORDINATE_ROUNDED |
17 | 3031700803 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/68067200 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | Marian Paniagua | 2021-01-14T20:04:58 | CC_BY_NC_4_0 | Marian Paniagua | Marian Paniagua | NaN | NaN | 2021-09-23T21:14:47.597Z | StillImage | COORDINATE_ROUNDED |
20 | 3008566753 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/66811638 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Puma | Puma concolor | ... | jayras | 2020-12-20T05:44:17 | CC_BY_NC_4_0 | Pacho Gutierrez | Pacho Gutierrez | NaN | NaN | 2021-09-23T21:14:42.430Z | StillImage | COORDINATE_ROUNDED |
21 | 3008449314 | 50c9509d-22c7-4a22-a47d-8c48425ef4a7 | https://www.inaturalist.org/observations/15270189 | Animalia | Chordata | Mammalia | Carnivora | Felidae | Panthera | Panthera onca | ... | mike_cove | 2018-08-09T20:10:35 | CC0_1_0 | mike_cove | mike_cove | NaN | NaN | 2021-09-23T20:57:45.632Z | NaN | COORDINATE_ROUNDED |
10 rows × 50 columns
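Row selection works by building a boolean series (a mask) and passing it inside the brackets. When the same column is compared against several values, isin() is a compact alternative to chaining comparisons with |. The sketch below shows that both masks select the same rows, using a made-up four-row dataframe.

```python
import pandas as pd

# Hypothetical records, for illustration only
records = pd.DataFrame({
    "species": [
        "Puma concolor",
        "Panthera onca",
        "Leopardus pardalis",
        "Puma concolor",
    ]
})

# Two comparisons combined with | (each one must be in parentheses)
mask_or = (records["species"] == "Panthera onca") | (records["species"] == "Puma concolor")

# Equivalent mask built with isin()
mask_isin = records["species"].isin(["Panthera onca", "Puma concolor"])

# Rows where the mask is True keep their original index labels
selected = records[mask_isin]

print(mask_or.equals(mask_isin))
print(selected.index.tolist())
```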
Analysis operations¶
Plotting¶
Loading libraries¶
import matplotlib.pyplot as plt # plotting library
%matplotlib inline
import calendar # library for calendar-related functions (e.g. month names)
Plot style¶
# Plot style
plt.style.use('ggplot')
Plot examples¶
Distribution of occurrence records by year¶
# Convert the date field to a datetime type
felidae["eventDate"] = pd.to_datetime(felidae["eventDate"])
# Group the records by year
felidae_registros_x_anio = felidae.groupby(felidae['eventDate'].dt.year).count().eventDate
felidae_registros_x_anio
eventDate
1839.0 6
1928.0 2
1931.0 1
1932.0 2
1933.0 1
1939.0 1
1954.0 2
1958.0 1
1963.0 2
1964.0 1
1965.0 2
1967.0 2
1970.0 1
1993.0 2
2002.0 1
2005.0 1
2007.0 1
2008.0 1
2009.0 6
2010.0 3
2011.0 2
2012.0 6
2013.0 6
2014.0 3
2015.0 10
2016.0 8
2017.0 15
2018.0 4
2019.0 20
2020.0 23
2021.0 12
Name: eventDate, dtype: int64
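The years in the result above appear as decimals (1839.0, 1928.0, ...) because records without a date become missing values after the conversion, and a column of years that contains missing values is stored as floating point. The sketch below reproduces this on four made-up date strings and shows one possible way, among others, to obtain integer years.

```python
import pandas as pd

# Hypothetical date strings; the last record has no date
dates = pd.Series([
    "2021-08-11T10:22:36",
    "2021-07-15T16:22:29",
    "2015-12-05T14:41:42",
    None,
])

# Missing dates become NaT (not a time)
parsed = pd.to_datetime(dates)

# The year of a NaT is a missing value, so the result holds floats
years = parsed.dt.year

# Drop the missing years, convert to integers, and count records per year
counts = years.dropna().astype(int).value_counts().sort_index()

print(years.tolist())
print(counts)
```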
# Conversion to a dataframe
felidae_registros_x_anio_df = pd.DataFrame({'anio':felidae_registros_x_anio.index, 'registros':felidae_registros_x_anio.values})
# Convert the year column to an integer type
felidae_registros_x_anio_df["anio"] = pd.to_numeric(felidae_registros_x_anio_df["anio"], downcast='integer')
felidae_registros_x_anio_df
| | anio | registros |
---|---|---|
0 | 1839 | 6 |
1 | 1928 | 2 |
2 | 1931 | 1 |
3 | 1932 | 2 |
4 | 1933 | 1 |
5 | 1939 | 1 |
6 | 1954 | 2 |
7 | 1958 | 1 |
8 | 1963 | 2 |
9 | 1964 | 1 |
10 | 1965 | 2 |
11 | 1967 | 2 |
12 | 1970 | 1 |
13 | 1993 | 2 |
14 | 2002 | 1 |
15 | 2005 | 1 |
16 | 2007 | 1 |
17 | 2008 | 1 |
18 | 2009 | 6 |
19 | 2010 | 3 |
20 | 2011 | 2 |
21 | 2012 | 6 |
22 | 2013 | 6 |
23 | 2014 | 3 |
24 | 2015 | 10 |
25 | 2016 | 8 |
26 | 2017 | 15 |
27 | 2018 | 4 |
28 | 2019 | 20 |
29 | 2020 | 23 |
30 | 2021 | 12 |
# Plot
felidae_registros_x_anio_df.plot(x='anio', y='registros', kind='bar', figsize=(12,7), color='red')
# Title and axis labels
plt.title('Registros de presencia de Felidae (felinos) en Costa Rica por año', fontsize=20)
plt.xlabel('Año', fontsize=16)
plt.ylabel('Cantidad de registros', fontsize=16);

Distribution of occurrence records by month¶
# Group the records by month
felidae_registros_x_mes = felidae.groupby(felidae['eventDate'].dt.month).count().eventDate
felidae_registros_x_mes
eventDate
1.0 28
2.0 15
3.0 21
4.0 8
5.0 9
6.0 15
7.0 13
8.0 9
9.0 5
10.0 8
11.0 2
12.0 15
Name: eventDate, dtype: int64
# Replace each month number in the index with the month name
felidae_registros_x_mes.index = [calendar.month_name[int(mes)] for mes in felidae_registros_x_mes.index]
felidae_registros_x_mes
January 28
February 15
March 21
April 8
May 9
June 15
July 13
August 9
September 5
October 8
November 2
December 15
Name: eventDate, dtype: int64
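Replacing month numbers with names is straightforward when all twelve months are present, as in this dataset. If some months had no records, the grouped result would have fewer than twelve rows. The sketch below shows one way to handle that case, using made-up counts for only three months: reindex() first fills in the missing months with zero, and the names are then derived from the numbers themselves.

```python
import calendar

import pandas as pd

# Hypothetical counts with records in only three months
counts = pd.Series({1: 28, 2: 15, 12: 15})

# Make sure all 12 months are present, filling the gaps with 0
all_months = counts.reindex(range(1, 13), fill_value=0)

# Derive each name from its month number
all_months.index = [calendar.month_name[int(m)] for m in all_months.index]

print(all_months)
```

The month names come from the calendar module and follow the active locale, which is English by default.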
# Bar chart
felidae_registros_x_mes.plot(kind='bar',figsize=(12,7), color='blue', alpha=0.5)
# Title and axis labels
plt.title('Registros de presencia de Felidae (felinos) en Costa Rica por mes', fontsize=20)
plt.xlabel('Mes', fontsize=16)
plt.ylabel('Cantidad de registros', fontsize=16);

Plotting on a timeline¶
# Group the records by date
registros_x_fecha = felidae.groupby(felidae['eventDate'].dt.date).count().eventDate
registros_x_fecha
eventDate
1839-01-01 6
1928-01-01 2
1931-05-29 1
1932-01-01 1
1932-06-01 1
..
2021-06-03 1
2021-06-23 1
2021-07-01 1
2021-07-15 1
2021-08-11 1
Name: eventDate, Length: 135, dtype: int64
# Line chart
registros_x_fecha.plot(figsize=(20,8), color='blue')
# Title and axis labels
plt.title('Registros de presencia de Felidae (felinos) en Costa Rica por fecha', fontsize=20)
plt.xlabel('Fecha',fontsize=16)
plt.ylabel('Cantidad de registros',fontsize=16)
plt.legend();

Risk analysis for extreme hydrometeorological events¶
Nicoya¶
# Read the data
datos = pd.read_csv("https://raw.githubusercontent.com/curso-python-imn/curso-python-imn.github.io/main/datos/analisis-riesgo/indicadores-vulnerabilidad-NICOYA.csv")
Data table by UGM¶
# Display the data
datos
| | DISTRITO | COD_DIST | UGM | HOMBRES | MUJERES | TOTAL | SIN_CARENCIAS | CARENCIAS_1 | CARENCIAS_2 | CARENCIAS_3 | ... | MEDIOS_DE_VIDA | NORMALIZACION_BASE_100_MEDIOS_DE_VIDA | LONGITUD_CAMINOS_PCT | NORMALIZACION_BASE_100_10_CAMINOS | AREA_ASP_PCT | NORMALIZACION_BASE_100_10_ASP | AREA_SOBREUSO_PCT | NORMALIZACION_BASE_100_10_SOBREUSO | SUMA_VULNERABILIDADES | INDICE_VULNERABILIDAD_INTEGRADO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Nicoya | 50201 | 38581 | 48 | 56 | 104 | 71 | 27 | 6 | 0 | ... | 102.5152 | 18.818688 | 25.345476 | 0.000000 | 1.00678 | 9.21241 | 16.048980 | 4.421661 | 90.646146 | 25.990407 |
1 | Nicoya | 50201 | 38625 | 41 | 36 | 77 | 73 | 4 | 0 | 0 | ... | 95.2399 | 12.101375 | 25.345476 | 0.000000 | 1.00678 | 9.21241 | 16.048980 | 4.421661 | 88.069345 | 25.181674 |
2 | Nicoya | 50201 | 38672 | 14 | 11 | 25 | 14 | 11 | 0 | 0 | ... | 171.4000 | 82.420293 | 25.345476 | 0.000000 | 1.00678 | 9.21241 | 16.048980 | 4.421661 | 112.148857 | 32.739060 |
3 | Nicoya | 50201 | 38706 | 17 | 20 | 37 | 17 | 12 | 8 | 0 | ... | 114.2799 | 29.681082 | 25.345476 | 0.000000 | 1.00678 | 9.21241 | 16.048980 | 4.421661 | 62.181956 | 17.056883 |
4 | Nicoya | 50201 | 38718 | 15 | 15 | 30 | 24 | 6 | 0 | 0 | ... | 142.8400 | 56.050734 | 25.345476 | 0.000000 | 1.00678 | 9.21241 | 16.048980 | 4.421661 | 79.441470 | 22.473804 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
756 | Belén de Nosarita | 50207 | 72789 | 35 | 32 | 67 | 0 | 47 | 16 | 4 | ... | 171.4000 | 82.420293 | 10.698224 | 7.602904 | 0.00000 | 10.00000 | 28.529665 | 10.000000 | 157.692386 | 47.032956 |
757 | Belén de Nosarita | 50207 | 72790 | 35 | 40 | 75 | 46 | 14 | 6 | 9 | ... | 152.3600 | 64.840587 | 10.698224 | 7.602904 | 0.00000 | 10.00000 | 28.529665 | 10.000000 | 138.961329 | 41.154190 |
758 | Belén de Nosarita | 50207 | 72791 | 47 | 52 | 99 | 30 | 55 | 14 | 0 | ... | 142.8399 | 56.050641 | 10.698224 | 7.602904 | 0.00000 | 10.00000 | 28.529665 | 10.000000 | 149.906771 | 44.589431 |
759 | Belén de Nosarita | 50207 | 72792 | 48 | 38 | 86 | 6 | 31 | 25 | 24 | ... | 152.3600 | 64.840587 | 10.698224 | 7.602904 | 0.00000 | 10.00000 | 28.529665 | 10.000000 | 155.406990 | 46.315682 |
760 | Belén de Nosarita | 50207 | 72793 | 58 | 53 | 111 | 15 | 18 | 37 | 32 | ... | 152.3601 | 64.840679 | 10.698224 | 7.602904 | 0.00000 | 10.00000 | 28.529665 | 10.000000 | 185.927710 | 55.894649 |
761 rows × 46 columns
datos.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 761 entries, 0 to 760
Data columns (total 46 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 DISTRITO 761 non-null object
1 COD_DIST 761 non-null int64
2 UGM 761 non-null int64
3 HOMBRES 761 non-null int64
4 MUJERES 761 non-null int64
5 TOTAL 761 non-null int64
6 SIN_CARENCIAS 761 non-null int64
7 CARENCIAS_1 761 non-null int64
8 CARENCIAS_2 761 non-null int64
9 CARENCIAS_3 761 non-null int64
10 CARENCIAS_4 761 non-null int64
11 NULO 761 non-null int64
12 TOTAL_CON_CARENCIAS 761 non-null int64
13 TOTAL.1 761 non-null int64
14 NORMALIZACION_BASE_100_CARENCIAS 761 non-null float64
15 SIN_LIMITACIONES 761 non-null int64
16 TOTAL_CON_LIMITACIONES 761 non-null int64
17 LIMITACION_VISUAL 761 non-null int64
18 LIMITACION_AUDITIVA 761 non-null int64
19 LIMITACION_HABLAR 761 non-null int64
20 LIMITACION_CAMINAR 761 non-null int64
21 LIMITACION_BRAZOS_Y_PIERNAS 761 non-null int64
22 LIMITACION_INTELECTUAL 761 non-null int64
23 LIMITACION_MENTAL 761 non-null int64
24 TOTAL.2 761 non-null int64
25 NORMALIZACION_BASE_100_LIMITACIONES 761 non-null float64
26 DEPENDENCIA_EDAD_0_14 761 non-null int64
27 DEPENDENCIA_EDAD_MAYORIGUAL_65 761 non-null int64
28 EDAD_15_64 761 non-null int64
29 TOTAL_DEPENDIENTE 761 non-null int64
30 TOTAL.3 761 non-null int64
31 NORMALIZACION_BASE_100_DEPENDIENTES 761 non-null float64
32 POBLACION_DESOCUPADA 761 non-null int64
33 POBLACION_OCUPADA 761 non-null int64
34 FUERA_DE_LA_FUERZA_LABORAL 761 non-null int64
35 NORMALIZACION_BASE_100_DESOCUPADOS 761 non-null float64
36 MEDIOS_DE_VIDA 761 non-null float64
37 NORMALIZACION_BASE_100_MEDIOS_DE_VIDA 761 non-null float64
38 LONGITUD_CAMINOS_PCT 761 non-null float64
39 NORMALIZACION_BASE_100_10_CAMINOS 761 non-null float64
40 AREA_ASP_PCT 761 non-null float64
41 NORMALIZACION_BASE_100_10_ASP 761 non-null float64
42 AREA_SOBREUSO_PCT 761 non-null float64
43 NORMALIZACION_BASE_100_10_SOBREUSO 761 non-null float64
44 SUMA_VULNERABILIDADES 761 non-null float64
45 INDICE_VULNERABILIDAD_INTEGRADO 761 non-null float64
dtypes: float64(14), int64(31), object(1)
memory usage: 273.6+ KB
datos_belennosarita = datos[datos["COD_DIST"] == 50207]
datos_belennosarita.sample(5)
| | DISTRITO | COD_DIST | UGM | HOMBRES | MUJERES | TOTAL | SIN_CARENCIAS | CARENCIAS_1 | CARENCIAS_2 | CARENCIAS_3 | ... | MEDIOS_DE_VIDA | NORMALIZACION_BASE_100_MEDIOS_DE_VIDA | LONGITUD_CAMINOS_PCT | NORMALIZACION_BASE_100_10_CAMINOS | AREA_ASP_PCT | NORMALIZACION_BASE_100_10_ASP | AREA_SOBREUSO_PCT | NORMALIZACION_BASE_100_10_SOBREUSO | SUMA_VULNERABILIDADES | INDICE_VULNERABILIDAD_INTEGRADO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
759 | Belén de Nosarita | 50207 | 72792 | 48 | 38 | 86 | 6 | 31 | 25 | 24 | ... | 152.3600 | 64.840587 | 10.698224 | 7.602904 | 0.0 | 10.0 | 28.529665 | 10.0 | 155.406990 | 46.315682 |
745 | Belén de Nosarita | 50207 | 48352 | 37 | 29 | 66 | 7 | 20 | 34 | 5 | ... | 152.3600 | 64.840587 | 10.698224 | 7.602904 | 0.0 | 10.0 | 28.529665 | 10.0 | 152.505569 | 45.405067 |
728 | Belén de Nosarita | 50207 | 48147 | 40 | 37 | 77 | 45 | 25 | 7 | 0 | ... | 133.3199 | 47.260788 | 10.698224 | 7.602904 | 0.0 | 10.0 | 28.529665 | 10.0 | 125.980188 | 37.080041 |
753 | Belén de Nosarita | 50207 | 48396 | 50 | 53 | 103 | 87 | 14 | 2 | 0 | ... | 120.6177 | 35.532797 | 10.698224 | 7.602904 | 0.0 | 10.0 | 28.529665 | 10.0 | 118.411304 | 34.704537 |
744 | Belén de Nosarita | 50207 | 48348 | 14 | 12 | 26 | 2 | 18 | 6 | 0 | ... | 180.9200 | 91.210147 | 10.698224 | 7.602904 | 0.0 | 10.0 | 28.529665 | 10.0 | 136.401239 | 40.350702 |
5 rows × 46 columns
Percentage distribution of the population by district¶
# Total population by district
poblacion_x_distrito = datos.groupby(datos['DISTRITO'])['TOTAL'].sum()
poblacion_x_distrito
DISTRITO
Belén de Nosarita 2686
Mansión 5717
Nicoya 24833
Nosara 4912
Quebrada Honda 2523
San Antonio 6642
Sámara 3512
Name: TOTAL, dtype: int64
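The totals above are absolute counts. Dividing each one by the overall sum gives the percentage distribution named in this section's title. The sketch below enters the district totals shown above by hand, so that it runs without downloading the data; the name porcentaje is illustrative.

```python
import pandas as pd

# District totals copied from the output above
poblacion_x_distrito = pd.Series(
    {
        "Belén de Nosarita": 2686,
        "Mansión": 5717,
        "Nicoya": 24833,
        "Nosara": 4912,
        "Quebrada Honda": 2523,
        "San Antonio": 6642,
        "Sámara": 3512,
    },
    name="TOTAL",
)

# Share of each district in the total population, as a percentage
porcentaje = (poblacion_x_distrito / poblacion_x_distrito.sum() * 100).round(1)

print(porcentaje.sort_values(ascending=False))
```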
# Pie chart
poblacion_x_distrito.plot.pie();

Characteristics of the dependent population¶
# Total dependent population by district
poblaciondependiente_x_distrito = datos.groupby(datos['DISTRITO'])['TOTAL_DEPENDIENTE'].sum()
poblaciondependiente_x_distrito
DISTRITO
Belén de Nosarita 971
Mansión 1985
Nicoya 8187
Nosara 1659
Quebrada Honda 857
San Antonio 2285
Sámara 1134
Name: TOTAL_DEPENDIENTE, dtype: int64
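Absolute counts of dependent population are hard to compare between districts of very different sizes. One possible complement, sketched below, is to express the dependent population as a percentage of each district's total population. The figures are copied by hand from the two grouped outputs in this chapter; the calculation itself is an illustration and not part of the original analysis.

```python
import pandas as pd

distritos = [
    "Belén de Nosarita", "Mansión", "Nicoya", "Nosara",
    "Quebrada Honda", "San Antonio", "Sámara",
]

# Totals copied from the grouped outputs shown in this chapter
total = pd.Series([2686, 5717, 24833, 4912, 2523, 6642, 3512], index=distritos)
dependiente = pd.Series([971, 1985, 8187, 1659, 857, 2285, 1134], index=distritos)

# Dependent population as a percentage of the district total;
# the two series are aligned by district name before dividing
porcentaje_dependiente = (dependiente / total * 100).round(1)

print(porcentaje_dependiente.sort_values(ascending=False))
```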
# Bar chart
poblaciondependiente_x_distrito.plot.bar();
