Load SISAL¶
Loads the SISALv3 database (2024), downloaded on June 3rd 2024 from https://ora.ox.ac.uk/objects/uuid:1e91e2ac-ca9f-46e5-85f3-8d82d4d3cfd4 | MNE 2024/06/03
Created by Kevin Fan and Lucie Luecke. Based on the code from sisal3_extractCSVdata.py (by Jens Fohlmeister).
Updates:
06/11/2025 by LL: overhauled, commented and tidied code with markdown.
29/11/2024 by KF: changed how entityIds are filtered by date and how the dataframe is constructed; mostly cleaning and checking work.
21/11/2024 by LL: added option to save as csv.
30/10/2024 by LL: added check for empty paleoData_values row.
Here we extract a dataframe with the following columns:
archiveType, dataSetName, datasetId, geo_meanElev, geo_meanLat, geo_meanLon, geo_siteName, interpretation_direction (new in v2.0), interpretation_variable, interpretation_variableDetail, interpretation_seasonality (new in v2.0), originalDataURL, originalDatabase, paleoData_notes, paleoData_proxy, paleoData_sensorSpecies, paleoData_units, paleoData_values, paleoData_variableName, year, yearUnits
We save a standardised compact dataframe for concatenation to DoD2k
This notebook reads the SISALv3 CSV data from a directory './SISALv3_csv' relative to this file (unpack all your downloaded CSV files there!) and extracts stable isotope, Mg/Ca and growth-rate data for all entities that cover the period of interest specified below.
Only records with at least 'number_of_dating_points' U-Th dated depths are kept. Attention: there may be enough dated depths available in the requested period while no proxies are provided; this results in an empty, but nevertheless written, output.
The individual records can be plotted and the plots saved.
The mean and standard deviation of all proxies within the specified period are determined and saved to a csv file.
A raw plot is also available for illustrative purposes.
Feel free to change the code as you see fit.
@original author: Jens Fohlmeister
@original author: Jens Fohlmeister
Set up working environment¶
Make sure the repo_root is set correctly: it should be your_root_dir/dod2k.
This should be the working directory throughout this notebook (and all other notebooks).
%load_ext autoreload
%autoreload 2
import sys
import os
from pathlib import Path
# Add parent directory to path (works from any notebook in notebooks/)
# the repo_root should be the parent directory of the notebooks folder
init_dir = Path().resolve()
# Determine repo root
if init_dir.name == 'dod2k': repo_root = init_dir
elif init_dir.parent.name == 'dod2k': repo_root = init_dir.parent
else: raise Exception('Please review the repo root structure (see first cell).')
# Update cwd and path only if needed
if os.getcwd() != str(repo_root):
    os.chdir(repo_root)
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))
print(f"Repo root: {repo_root}")
if str(os.getcwd()) == str(repo_root):
    print("Working directory matches repo root.")
Repo root: /home/jupyter-lluecke/dod2k Working directory matches repo root.
# Import packages
import numpy as np
import pandas as pd
import os
from dod2k_utilities import ut_functions as utf # contains utility functions
from dod2k_utilities import ut_plot as uplt # contains plotting functions
Load source data¶
To obtain the source data, run the cell below, which attempts to download it directly. Note that the ORA URL may return an HTML landing page rather than the zip archive itself (as in the output below); in that case, download the zip manually from the URL and place it at data/sisal/sisalv3_database_mysql_csv.zip before unzipping.
# Download the file (use -O to specify output filename)
!wget -O data/sisal/sisalv3_database_mysql_csv.zip https://ora.ox.ac.uk/objects/uuid:1e91e2ac-ca9f-46e5-85f3-8d82d4d3cfd4
# Unzip to the correct destination
# !unzip -d data/sisal/ data/sisal/sisalv3_database_mysql_csv.zip
!unzip data/sisal/sisalv3_database_mysql_csv.zip -d data/sisal/sisalv3_csv
--2025-12-17 09:49:37-- https://ora.ox.ac.uk/objects/uuid:1e91e2ac-ca9f-46e5-85f3-8d82d4d3cfd4
Resolving ora.ox.ac.uk (ora.ox.ac.uk)... 172.66.159.143, 104.20.30.6, 2606:4700:10::6814:1e06, ...
Connecting to ora.ox.ac.uk (ora.ox.ac.uk)|172.66.159.143|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘data/sisal/sisalv3_database_mysql_csv.zip’
data/sisal/sisalv3_ [ <=> ] 71.69K --.-KB/s in 0.1s
2025-12-17 09:49:38 (484 KB/s) - ‘data/sisal/sisalv3_database_mysql_csv.zip’ saved [73408]
Archive: data/sisal/sisalv3_database_mysql_csv.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of data/sisal/sisalv3_database_mysql_csv.zip or
data/sisal/sisalv3_database_mysql_csv.zip.zip, and cannot find data/sisal/sisalv3_database_mysql_csv.zip.ZIP, period.
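The output above shows the direct URL returning an HTML landing page rather than the archive, which makes unzip fail. The download can be sanity-checked with the standard library before unzipping; a minimal sketch (the throwaway file name is illustrative, only `zipfile.is_zipfile` from the stdlib is assumed):

```python
import zipfile
from pathlib import Path

def looks_like_zip(path):
    """Return True if the file exists and has a valid zip structure."""
    p = Path(path)
    return p.is_file() and zipfile.is_zipfile(p)

# throwaway file standing in for a bad download
bad = Path('not_a_zip.html')
bad.write_text('<html>landing page</html>')
print(looks_like_zip(bad))  # False: an HTML page is not a zip
bad.unlink()
```

If this returns False for data/sisal/sisalv3_database_mysql_csv.zip, re-download the archive manually.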
Read CSVs¶
# read the sisalv3 csv files
entity = pd.read_csv('data/sisal/sisalv3_csv/entity.csv')
d13C = pd.read_csv('data/sisal/sisalv3_csv/d13C.csv')
d18O = pd.read_csv('data/sisal/sisalv3_csv/d18O.csv')
MgCa = pd.read_csv('data/sisal/sisalv3_csv/Mg_Ca.csv')
dating = pd.read_csv('data/sisal/sisalv3_csv/dating.csv')
dating.rename(columns = {'238U_content':'c238U_content','238U_uncertainty':'c238U_uncertainty',
'232Th_content':'c232Th_content','232Th_uncertainty':'c232Th_uncertainty',
'230Th_content':'c230Th_content','c230Th_uncertainty':'c230Th_uncertainty',
'230Th_232Th_ratio':'a230Th_232Th_ratio','230Th_232Th_ratio_uncertainty':'a230Th_232Th_ratio_uncertainty',
'230Th_238U_activity':'a230Th_238U_activity','230Th_238U_activity_uncertainty':'a230Th_238U_activity_uncertainty',
'234U_238U_activity':'a234U_238U_activity','234U_238U_activity_uncertainty':'a234U_238U_activity_uncertainty'},
inplace = True)
# columns whose names start with a digit must be renamed so they are valid Python identifiers
entity_link_reference = pd.read_csv('data/sisal/sisalv3_csv/entity_link_reference.csv')
original_chronology = pd.read_csv('data/sisal/sisalv3_csv/original_chronology.csv')
reference = pd.read_csv('data/sisal/sisalv3_csv/reference.csv')
sample = pd.read_csv('data/sisal/sisalv3_csv/sample.csv')
sisal_chronology = pd.read_csv('data/sisal/sisalv3_csv/sisal_chronology.csv')
site = pd.read_csv('data/sisal/sisalv3_csv/site.csv')
# os.chdir('..')
Filter out data beyond the period of interest¶
###########################################################################
# extract required data from speleothems covering the period of interest
# + provides all entities, which include non-14C ages and non-events
# during the time period
###########################################################################
low = 1950 # maximum age [yr BP] to keep, i.e. records within the Common Era
# KF: Filtering data indicies that don't match age and variable requirements
i0 = dating.loc[(dating['corr_age'] <= low) &
                (dating['date_type'] != 'C14') & (dating['date_type'].str.find('Event') != 0)]
i1 = i0['entity_id'].to_numpy()
i3 = np.unique(i1)
### remove all entities with fewer than 'number_of_dating_points' dated depths
number_of_dating_points = 3
for i in np.arange(0, len(i3)):
    i_dummy = i0.entity_id[i0.entity_id == i3[i]].count()
    if i_dummy < number_of_dating_points:
        i0 = i0[i0.entity_id != i3[i]]
i1 = i0['entity_id'].to_numpy()
i2 = np.unique(i1) # all entities with >= 'number_of_dating_points'
                   # dated depths during the required time period
# You could speed up the above process by building a frequency dictionary first and referencing it as you go instead of re-counting each time.
# However, this only becomes more efficient when i0 gets really big with a lot of repeated entity IDs - Kevin
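As the comment above suggests, a frequency table does the same filtering in one pass. A minimal sketch of the equivalent filter using pandas `value_counts` on a toy stand-in for `i0` (same `number_of_dating_points` threshold as above):

```python
import numpy as np
import pandas as pd

# toy stand-in for i0: one row per dated depth, keyed by entity
i0 = pd.DataFrame({'entity_id': [1, 1, 1, 2, 2, 3, 3, 3, 3]})
number_of_dating_points = 3

# one pass: count dated depths per entity, keep entities with enough of them
counts = i0['entity_id'].value_counts()
keep = counts[counts >= number_of_dating_points].index
i0 = i0[i0['entity_id'].isin(keep)]
i2 = np.unique(i0['entity_id'].to_numpy())
print(i2)  # entity 2 (only 2 dated depths) is dropped
```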
###########################################################################
Create compact dataframe¶
Parameter definitions¶
### define parameters (all of those will be saved in a final file)
site1_id = np.zeros(len(i2))
site_name1 = ['0']*len(i2)
rock_age1 = ['0']*len(i2)
material1 = ['0']*len(i2)
entity_name1 = ['0']*len(i2)
lon = np.zeros(len(i2))
lat = np.zeros(len(i2))
elev = np.zeros(len(i2))
entity1_id = np.zeros(len(i2))
mean_C = np.zeros(len(i2))
mean_O = np.zeros(len(i2))
mean_GR = np.zeros(len(i2))
mean_MgCa = np.zeros(len(i2))
std_C = np.zeros(len(i2))
std_O = np.zeros(len(i2))
std_GR = np.zeros(len(i2))
std_MgCa = np.zeros(len(i2))
# we need to initialize a publication_DOI array with length set by the number of entities meeting the selection criteria.
publication_DOI1 = np.zeros(len(i2), dtype='object')
# Check size of data lists
len(i2)
211
Parameter/metadata population¶
# KF: common dataframe with the target columns.
df = pd.DataFrame(columns=['archiveType', 'dataSetName', 'datasetId', 'geo_meanElev',
'geo_meanLat', 'geo_meanLon', 'originalDataURL',
'paleoData_notes', 'paleoData_proxy', 'paleoData_units',
'paleoData_values', 'year', 'yearUnits'])
# KF: Populating common dataframe
for n in np.arange(0, len(i2)): # for every valid unique entity
    dummy = dating.loc[(dating['entity_id'] == i2[n])] # dating rows associated with this entity
    ### already some metadata for individual speleothems
    site1_id[n] = entity.site_id[(entity['entity_id'] == i2[n])].to_numpy()[0]
    entity1_id[n] = entity.entity_id[(entity['entity_id'] == i2[n])].to_numpy()[0]
    entity_name1[n] = entity.entity_name[(entity['entity_id'] == i2[n])].to_list()
    site_name1[n] = site.site_name[(site['site_id'] == site1_id[n])].to_list()
    refID = entity_link_reference.ref_id[(entity_link_reference['entity_id'] == i2[n])].to_list()
    publication_DOI1[n] = reference.publication_DOI[(reference['ref_id'] == refID[0])].to_list() # kept as a list
    lon[n] = site.longitude[(site['site_id'] == site1_id[n]).to_numpy()].iloc[0]
    lat[n] = site.latitude[(site['site_id'] == site1_id[n]).to_numpy()].iloc[0]
    elev[n] = site.elevation[(site['site_id'] == site1_id[n]).to_numpy()].iloc[0]
    if dummy.material_dated.dropna().eq('calcite').all():
        material1[n] = 'calcite'
    elif dummy.material_dated.dropna().eq('aragonite').all():
        material1[n] = 'aragonite'
    else:
        material1[n] = 'mixed'
    ### extract isotope data (d18O and d13C) and elements #####################
    idx1 = sample.sample_id[(sample['entity_id'] == i2[n])].to_numpy() # sample ids for current entity id
    age = original_chronology.interp_age[original_chronology['sample_id'].isin(idx1)].to_numpy() # interpolated ages from orig. chron. for current sample ids
    # Oxygen
    idx2 = original_chronology.sample_id[original_chronology['sample_id'].isin(idx1)].to_numpy() # orig. chron. sample ids present among idx1 sample ids
    d18O_1 = d18O.d18O_measurement[d18O['sample_id'].isin(idx2)].to_numpy() # d18O measurements corresponding to idx2 sample ids
    idx3 = d18O.sample_id[d18O['sample_id'].isin(idx2)].to_numpy() # d18O sample ids corresponding to idx2 sample ids
    age18 = original_chronology.interp_age[original_chronology['sample_id'].isin(idx3)].to_numpy() # orig. chron. interpolated ages corresponding to idx3 sample ids
    # KF: keep only ages within the period of interest
    mask = age18 <= low
    age18 = age18[mask]
    d18O_1 = d18O_1[mask]
    if (len(idx3) < len(idx2) and len(idx3) > 0):
        idx2 = idx3 # KF: if a discrepancy exists, fall back to idx3; doing this for the other variables causes an index bounds error for growth rate,
                    # probably something to do with the age list manipulation.
        age = original_chronology.interp_age[original_chronology['sample_id'].isin(idx2)].to_numpy() # then set age to the interp. ages of the new idx2 sample ids
    if (len(d18O_1) > 0):
        # Follows the common dictionary format.
        df.loc[len(df)] = ['speleothem', site_name1[n][0], entity1_id[n], elev[n], lat[n], lon[n], publication_DOI1[n], material1[n], 'd18O', 'permil', d18O_1, age18, 'BP']
    # Carbon
    d13C_1 = d13C.d13C_measurement[d13C['sample_id'].isin(idx2)].to_numpy() # d13C measurements for idx2 sample ids
    idx4 = d13C.sample_id[d13C['sample_id'].isin(idx2)].to_numpy() # d13C sample ids corresponding to idx2 sample ids
    age13 = original_chronology.interp_age[original_chronology['sample_id'].isin(idx4)].to_numpy() # interp. ages of d13C sample ids from orig. chron.
    # KF: keep only ages within the period of interest
    mask = age13 <= low
    age13 = age13[mask]
    d13C_1 = d13C_1[mask]
    if (len(d13C_1) > 0):
        df.loc[len(df)] = ['speleothem', site_name1[n][0], entity1_id[n], elev[n], lat[n], lon[n], publication_DOI1[n], material1[n], 'd13C', 'permil', d13C_1, age13, 'BP']
    # Magnesium/Calcium
    MgCa_1 = MgCa.Mg_Ca_measurement[MgCa['sample_id'].isin(idx2)].to_numpy() # MgCa measurements corresponding to idx2 sample ids
    idx5 = MgCa.sample_id[MgCa['sample_id'].isin(idx2)].to_numpy() # MgCa sample ids based on idx2 sample ids
    ageMgCa = original_chronology.interp_age[original_chronology['sample_id'].isin(idx5)].to_numpy() # interp. ages of MgCa samples from orig. chron.
    # KF: keep only ages within the period of interest
    mask = ageMgCa <= low
    ageMgCa = ageMgCa[mask]
    MgCa_1 = MgCa_1[mask]
    if (len(MgCa_1) > 0):
        df.loc[len(df)] = ['speleothem', site_name1[n][0], entity1_id[n], elev[n], lat[n], lon[n], publication_DOI1[n], material1[n],
                           'MgCa', 'mmol/mol', MgCa_1, ageMgCa, 'BP']
    ### also growth rate (gr) could be important ##############################
    if len(idx2) != 0:
        isotopeDepth = sample.depth_sample[sample['sample_id'].isin(idx2)].to_numpy()
        # Establish placeholder arrays
        gr = np.zeros(len(isotopeDepth))
        ageGR = np.zeros(len(isotopeDepth))
        fage_err_gr = np.zeros(len(isotopeDepth))
        # Array population
        for i in np.arange(0, len(gr)-1):
            if entity.depth_ref[(entity['entity_id'] == i2[n])].to_numpy() == 'from top':
                gr[i] = (isotopeDepth[i] - isotopeDepth[i+1]) / (age[i] - age[i+1])
            else:
                gr[i] = -(isotopeDepth[i] - isotopeDepth[i+1]) / (age[i] - age[i+1])
            ageGR[i] = age[i]
            # KF: error checking (note: this indexes the global dating table by row position)
            fage_err = (dating['corr_age_uncert_pos'][i] + dating['corr_age_uncert_neg'][i])/dating['corr_age'][i]
            fage_err1 = (dating['corr_age_uncert_pos'][i+1] + dating['corr_age_uncert_neg'][i+1])/dating['corr_age'][i+1]
            fage_err_gr[i] = np.sqrt((fage_err ** 2) + (fage_err1 ** 2))
        gr[-1] = gr[-2]
        if len(np.argwhere(np.isinf(gr))) > 0:
            if (np.argwhere(np.isinf(gr))[-1] == len(gr)-1): # if the last value is 'inf'
                gr[np.argwhere(np.isinf(gr))[-1]] = gr[np.argwhere(np.isinf(gr))[-1]-2]
            else:
                gr[np.argwhere(np.isinf(gr))] = gr[np.argwhere(np.isinf(gr))+1] # replace 'inf' values by neighbouring values for gr
            while len(np.argwhere(np.isinf(gr))) > 0: # second iteration for cases with very fast growth and initially two successive 'inf' values
                gr[np.argwhere(np.isinf(gr))] = gr[np.argwhere(np.isinf(gr))+1] # replace 'inf' values by neighbouring values for gr
        # GR smoothing of outliers
        for i in np.arange(1, len(gr)-1):
            if gr[i] > 1:
                gr[i] = (gr[i-1]+gr[i+1])/2
        # KF: error masking
        gr[fage_err_gr > 0.1] = -9999.99
        # KF: Adding growth rate to common frame
        mask = ageGR <= low
        df.loc[len(df)] = ['speleothem', site_name1[n][0], entity1_id[n], elev[n], lat[n], lon[n], publication_DOI1[n], material1[n], "growth rate", "mm/year", gr[mask], ageGR[mask], "BP"]
/tmp/ipykernel_2405199/2886697801.py:86: RuntimeWarning: invalid value encountered in scalar divide
  gr[i] = (isotopeDepth[i] - isotopeDepth[i+1]) / (age[i] - age[i+1])
/tmp/ipykernel_2405199/2886697801.py:86: RuntimeWarning: divide by zero encountered in scalar divide
  gr[i] = (isotopeDepth[i] - isotopeDepth[i+1]) / (age[i] - age[i+1])
(the divide-by-zero RuntimeWarning repeats many times, for both the 'from top' and the negated branch)
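The warnings above come from zero age differences between successive samples. The depth/age quotient in the growth-rate loop can be sketched vectorized with `np.diff` (toy arrays, 'from top' convention as in the loop above; zero denominators would still need the same inf handling):

```python
import numpy as np

# toy depth (mm, measured from top) and interpolated age (yr BP) per sample
isotopeDepth = np.array([0.0, 1.0, 2.5, 4.0])
age = np.array([10.0, 50.0, 110.0, 170.0])

# growth rate between successive samples: d(depth)/d(age), matching the loop above
with np.errstate(divide='ignore', invalid='ignore'):
    gr = np.diff(isotopeDepth) / np.diff(age)
gr = np.append(gr, gr[-1])  # pad the last sample, mirroring gr[-1] = gr[-2]
print(gr)  # 0.025 mm/yr throughout for this toy series
```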
Data cleaning and format conventions¶
# KF: adding original database name and remaining metadata columns
df.insert(6, 'originalDatabase', ['SISAL v3']*len(df))
df.insert(6, 'geo_siteName', df['dataSetName'])
df.insert(1, 'interpretation_variable', ['N/A']*len(df))
df.insert(1, 'interpretation_variableDetail', ['N/A']*len(df))
df.insert(1, 'interpretation_direction', ['N/A']*len(df))
df.insert(1, 'interpretation_seasonality', ['N/A']*len(df))
df.insert(12, 'paleoData_sensorSpecies', ['N/A']*len(df))
df.loc[df['paleoData_proxy']=='MgCa', 'paleoData_proxy']='Mg/Ca'
# KF: Temp cleaning rows with NAN in year
# There are thirteen of them, hopefully this does not skew data too much.
length = len(df['year'])
df = df[df['year'].notna()]
df = df[df['year'].map(lambda x: len(x) > 1)]
df = df[df['paleoData_values'].map(lambda x: len(x) > 1)]
df = df[df['paleoData_values'].map(lambda x: not any(pd.isnull(x)))]
print('Number of rows discarded: ', (length - len(df['year'])))
Number of rows discarded: 13
Check proxy types included in paleoData_proxy
df['archiveType'] = df['archiveType'].replace({'speleothem': 'Speleothem'})
set(df['paleoData_proxy'])
{'Mg/Ca', 'd13C', 'd18O', 'growth rate'}
Assign interpretation_variable:
- d18O is temperature and moisture
- Mg/Ca is temperature
# d18O is temperature and moisture
df.loc[df['paleoData_proxy']=='d18O', 'interpretation_variable']='temperature+moisture'
df.loc[df['paleoData_proxy']=='d18O', 'interpretation_variableDetail']='temperature+moisture - manually assigned by DoD2k authors for paleoData_proxy = d18O.'
# Mg/Ca is temperature
df.loc[df['paleoData_proxy']=='Mg/Ca', 'interpretation_variable']='temperature'
df.loc[df['paleoData_proxy']=='Mg/Ca', 'interpretation_variableDetail']='temperature - manually assigned by DoD2k authors for paleoData_proxy = Mg/Ca'
df['paleoData_variableName'] = df['paleoData_proxy']
Convert years Before Present (BP) to Common Era (CE)
# BP 0 adjustment: convert years BP (zero point 1950 CE) to years CE
def BP2CE(year):
    year = 1950 - year
    if year <= 0:
        year = year - 1 # there is no year 0 CE
    return year
df['year'] = df['year'].apply(lambda x: np.array(list(map(lambda y: BP2CE(y), x))))
df['yearUnits'] = ['CE']*len(df)
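A few spot checks of the BP-to-CE conversion (the function is repeated here so the cell is self-contained): 1950 BP is the zero point, and the extra -1 step skips the non-existent year 0.

```python
def BP2CE(year):
    """Convert years Before Present (zero point 1950 CE) to CE, skipping year 0."""
    year = 1950 - year
    if year <= 0:
        year = year - 1
    return year

print(BP2CE(0))     # 1950 CE
print(BP2CE(1949))  # 1 CE
print(BP2CE(1950))  # -1 CE (1 BCE): there is no year 0
print(BP2CE(2000))  # -51 CE
```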
# KF: Type-checking
df = df.astype({'archiveType': str, 'dataSetName': str, 'datasetId': str, 'geo_meanElev': float, 'geo_meanLat': float, 'geo_meanLon': float, 'geo_siteName': str,
'originalDatabase': str, 'originalDataURL': str, 'paleoData_notes': str, 'paleoData_proxy': str, 'paleoData_units': str, 'yearUnits': str,
'interpretation_direction': str, 'interpretation_seasonality': str, 'interpretation_variable': str,'interpretation_variableDetail': str })
df['year'] = df['year'].map(lambda x: np.array(x, dtype = float))
df['paleoData_values'] = df['paleoData_values'].map(lambda x: np.array(x, dtype = float))
# KF: display the dataframe
df.reset_index(drop= True, inplace= True)
df = df[sorted(df.columns)]
print(df.info())
<class 'pandas.core.frame.DataFrame'> RangeIndex: 546 entries, 0 to 545 Data columns (total 21 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 archiveType 546 non-null object 1 dataSetName 546 non-null object 2 datasetId 546 non-null object 3 geo_meanElev 546 non-null float64 4 geo_meanLat 546 non-null float64 5 geo_meanLon 546 non-null float64 6 geo_siteName 546 non-null object 7 interpretation_direction 546 non-null object 8 interpretation_seasonality 546 non-null object 9 interpretation_variable 546 non-null object 10 interpretation_variableDetail 546 non-null object 11 originalDataURL 546 non-null object 12 originalDatabase 546 non-null object 13 paleoData_notes 546 non-null object 14 paleoData_proxy 546 non-null object 15 paleoData_sensorSpecies 546 non-null object 16 paleoData_units 546 non-null object 17 paleoData_values 546 non-null object 18 paleoData_variableName 546 non-null object 19 year 546 non-null object 20 yearUnits 546 non-null object dtypes: float64(3), object(18) memory usage: 89.7+ KB None
# check that the datasetId is unique - it currently is not (df has 546 records).
print(len(df.datasetId.unique()))
# make datasetId unique by simply adding index number
df.datasetId=df.apply(lambda x: 'sisal_'+x.datasetId+'_'+str(x.name), axis=1)
# check uniqueness - problem solved.
print(len(df.datasetId.unique()))
200
546
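The suffixing step can be checked directly with `Series.is_unique`; a minimal sketch of the pattern on a toy frame (same 'sisal_' prefix and row-index suffix as above):

```python
import pandas as pd

# toy frame: entity ids repeat because each proxy gets its own row
df = pd.DataFrame({'datasetId': ['9.0', '9.0', '19.0']})
print(df['datasetId'].is_unique)  # False before suffixing

# append the row index to force uniqueness, as in the cell above
df['datasetId'] = df.apply(lambda x: 'sisal_' + x.datasetId + '_' + str(x.name), axis=1)
print(df['datasetId'].is_unique)  # True
print(df['datasetId'].tolist())
```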
Drop missing entries and standardize missing data format¶
for ii in df.index:
    try:
        year = np.array(df.at[ii, 'year'], dtype=float)
        vals = np.array(df.at[ii, 'paleoData_values'], dtype=float)
        df.at[ii, 'year'] = year[year>=1]
        df.at[ii, 'paleoData_values'] = vals[year>=1]
    except:
        df.at[ii, 'paleoData_values'] = np.array([utf.convert_to_float(y) for y in df.at[ii, 'paleoData_values']], dtype=float)
        df.at[ii, 'year'] = np.array([utf.convert_to_float(y) for y in df.at[ii, 'year']], dtype=float)
        print(f'Converted values in paleoData_values and/or year for {ii}.')
# drop all missing values and exclude rows consisting only of missing values
for ii in df.index:
    dd = np.array(df.at[ii, 'paleoData_values'])
    mask = dd == -9999.99
    df.at[ii, 'paleoData_values'] = dd[~mask]
    df.at[ii, 'year'] = np.array(df.at[ii, 'year'])[~mask]
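The sentinel masking above, illustrated on a toy series (SISAL's -9999.99 missing-value code); masking values and years with the same boolean array keeps the two aligned:

```python
import numpy as np

# toy series with the -9999.99 missing-value sentinel
vals = np.array([-7.1, -9999.99, -6.8, -9999.99])
year = np.array([100.0, 120.0, 140.0, 160.0])

mask = vals == -9999.99          # flag sentinel entries
vals, year = vals[~mask], year[~mask]
print(vals)  # [-7.1 -6.8]
print(year)  # [100. 140.]
```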
drop_inds = []
for ii, row in enumerate(df.paleoData_values):
    try:
        if len(row) == 0:
            print(ii, 'empty row for paleoData_values')
        elif len(df.iloc[ii]['year']) == 0:
            print(ii, 'empty row for year')
        elif np.std(row) == 0:
            print(ii, 'std=0')
        elif np.sum(np.diff(row)**2) == 0:
            print(ii, 'diff=0')
        elif np.isnan(np.std(row)):
            print(ii, 'std nan')
        else:
            continue
        if df.index[ii] not in drop_inds:
            drop_inds += [df.index[ii]]
    except:
        drop_inds += [df.index[ii]]
print(drop_inds)
df = df.drop(index=drop_inds)
[]
Save and output dataframe¶
df = df[sorted(df.columns)]
print(df.info())
<class 'pandas.core.frame.DataFrame'> RangeIndex: 546 entries, 0 to 545 Data columns (total 21 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 archiveType 546 non-null object 1 dataSetName 546 non-null object 2 datasetId 546 non-null object 3 geo_meanElev 546 non-null float64 4 geo_meanLat 546 non-null float64 5 geo_meanLon 546 non-null float64 6 geo_siteName 546 non-null object 7 interpretation_direction 546 non-null object 8 interpretation_seasonality 546 non-null object 9 interpretation_variable 546 non-null object 10 interpretation_variableDetail 546 non-null object 11 originalDataURL 546 non-null object 12 originalDatabase 546 non-null object 13 paleoData_notes 546 non-null object 14 paleoData_proxy 546 non-null object 15 paleoData_sensorSpecies 546 non-null object 16 paleoData_units 546 non-null object 17 paleoData_values 546 non-null object 18 paleoData_variableName 546 non-null object 19 year 546 non-null object 20 yearUnits 546 non-null object dtypes: float64(3), object(18) memory usage: 89.7+ KB None
Save pickle¶
# KF: Save to pickle
df.to_pickle('data/sisal/sisal_compact.pkl')
Save csv¶
# save to a list of csv files (metadata, data, year)
df.name='sisal'
utf.write_compact_dataframe_to_csv(df)
METADATA: datasetId, archiveType, dataSetName, geo_meanElev, geo_meanLat, geo_meanLon, geo_siteName, interpretation_direction, interpretation_seasonality, interpretation_variable, interpretation_variableDetail, originalDataURL, originalDatabase, paleoData_notes, paleoData_proxy, paleoData_sensorSpecies, paleoData_units, paleoData_variableName, yearUnits Saved to /home/jupyter-lluecke/dod2k/data/sisal/sisal_compact_%s.csv
# load dataframe
df = utf.load_compact_dataframe_from_csv('sisal')
print(df.info)
<bound method DataFrame.info of archiveType dataSetName datasetId geo_meanElev geo_meanLat \
0 Speleothem Bittoo cave sisal_9.0_0 3000.0 30.790300
1 Speleothem Bittoo cave sisal_9.0_1 3000.0 30.790300
2 Speleothem Bittoo cave sisal_9.0_2 3000.0 30.790300
3 Speleothem Kesang cave sisal_19.0_3 2000.0 42.869999
4 Speleothem Kesang cave sisal_19.0_4 2000.0 42.869999
.. ... ... ... ... ...
541 Speleothem Sahiya cave sisal_900.0_541 1190.0 30.600000
542 Speleothem Sahiya cave sisal_900.0_542 1190.0 30.600000
543 Speleothem Sahiya cave sisal_901.0_543 1190.0 30.600000
544 Speleothem Sahiya cave sisal_901.0_544 1190.0 30.600000
545 Speleothem Sahiya cave sisal_901.0_545 1190.0 30.600000
geo_meanLon geo_siteName interpretation_direction \
0 77.776398 Bittoo cave N/A
1 77.776398 Bittoo cave N/A
2 77.776398 Bittoo cave N/A
3 81.750000 Kesang cave N/A
4 81.750000 Kesang cave N/A
.. ... ... ...
541 77.866699 Sahiya cave N/A
542 77.866699 Sahiya cave N/A
543 77.866699 Sahiya cave N/A
544 77.866699 Sahiya cave N/A
545 77.866699 Sahiya cave N/A
interpretation_seasonality interpretation_variable ... \
0 N/A temperature+moisture ...
1 N/A N/A ...
2 N/A N/A ...
3 N/A temperature+moisture ...
4 N/A N/A ...
.. ... ... ...
541 N/A N/A ...
542 N/A N/A ...
543 N/A temperature+moisture ...
544 N/A N/A ...
545 N/A N/A ...
originalDataURL originalDatabase paleoData_notes paleoData_proxy \
0 ['10.1038/srep24374'] SISAL v3 calcite d18O
1 ['10.1038/srep24374'] SISAL v3 calcite d13C
2 ['10.1038/srep24374'] SISAL v3 calcite growth rate
3 ['10.1038/srep36975'] SISAL v3 calcite d18O
4 ['10.1038/srep36975'] SISAL v3 calcite d13C
.. ... ... ... ...
541 ['10.1038/ncomms7309'] SISAL v3 calcite d13C
542 ['10.1038/ncomms7309'] SISAL v3 calcite growth rate
543 ['10.1038/ncomms7309'] SISAL v3 calcite d18O
544 ['10.1038/ncomms7309'] SISAL v3 calcite d13C
545 ['10.1038/ncomms7309'] SISAL v3 calcite growth rate
paleoData_sensorSpecies paleoData_units \
0 N/A permil
1 N/A permil
2 N/A mm/year
3 N/A permil
4 N/A permil
.. ... ...
541 N/A permil
542 N/A mm/year
543 N/A permil
544 N/A permil
545 N/A mm/year
paleoData_values paleoData_variableName \
0 [-7.194, -5.274, -7.206, -7.624, -6.122, -6.57... d18O
1 [-0.848, 2.907, -1.927, -3.213, -3.958, -4.64,... d13C
2 [0.025, 0.023809524, 0.025, 0.023809524, 0.023... growth rate
3 [-7.15, -7.49, -7.59, -7.98, -7.69, -7.95, -7.... d18O
4 [-1.85, -3.48, -4.34, -4.9, -4.51, -4.6, -4.69... d13C
.. ... ...
541 [-0.02, -0.23, -0.11, -0.18, -0.11, -0.13, -0.... d13C
542 [0.14251104, 0.14144272, 0.1398895, 0.13837, 0... growth rate
543 [-8.83, -9.12, -9.11, -9.15, -8.98, -9.07, -8.... d18O
544 [0.863, 1.244, 0.668, 1.17, 1.262, 1.109, 1.18... d13C
545 [0.35714287, 0.35714287, 0.3539823, 0.3508772,... growth rate
year yearUnits
0 [1076.0, 1056.0, 1035.0, 1015.0, 994.0, 973.0,... CE
1 [1076.0, 1056.0, 1035.0, 1015.0, 994.0, 973.0,... CE
2 [1076.0, 1056.0, 1035.0, 1015.0, 994.0, 973.0,... CE
3 [1098.0, 1086.0, 1074.0, 1061.0, 1049.0, 1037.... CE
4 [1098.0, 1086.0, 1074.0, 1061.0, 1049.0, 1037.... CE
.. ... ...
541 [2006.3071, 2005.6053, 2004.8984, 2003.4688, 2... CE
542 [2006.3071, 2005.6053, 2004.8984, 2003.4688, 2... CE
543 [1357.23, 1356.67, 1356.11, 1354.98, 1354.41, ... CE
544 [1357.23, 1356.67, 1356.11, 1354.98, 1354.41, ... CE
545 [1357.23, 1356.67, 1356.11, 1354.98, 1354.41, ... CE
[546 rows x 21 columns]>
Visualise dataframe¶
Show the spatial distribution of records, together with the archive and proxy types.
# count archive types
archive_count = {}
for ii, at in enumerate(set(df['archiveType'])):
archive_count[at] = df.loc[df['archiveType']==at, 'archiveType'].count()
sort = np.argsort([cc for cc in archive_count.values()])
archives_sorted = np.array([cc for cc in archive_count.keys()])[sort][::-1]
# Specify colour for each archive (smaller archives get grouped into the same colour)
archive_colour, major_archives, other_archives = uplt.get_archive_colours(archives_sorted, archive_count)
fig = uplt.plot_geo_archive_proxy(df, archive_colour)
utf.save_fig(fig, f'geo_{df.name}', dir=df.name)
0 Speleothem 546 saved figure in /home/jupyter-lluecke/dod2k/figs/sisal/geo_sisal.pdf
Now plot the coverage over the Common Era
fig = uplt.plot_coverage(df, archives_sorted, major_archives, other_archives, archive_colour)
utf.save_fig(fig, f'time_{df.name}', dir=df.name)
saved figure in /home/jupyter-lluecke/dod2k/figs/sisal/time_sisal.pdf
Display dataframe¶
Display identification metadata: dataSetName, datasetId, originalDataURL, originalDatabase¶
index¶
# # check index
print(df.index)
RangeIndex(start=0, stop=546, step=1)
dataSetName (associated with each record, may not be unique)¶
# # check dataSetName
key = 'dataSetName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
dataSetName: ['Bittoo cave' 'Bittoo cave' 'Bittoo cave' 'Kesang cave' 'Kesang cave' 'Kesang cave' 'Paraiso cave' ... 'Sahiya cave' 'Sahiya cave' 'Sahiya cave'] ["<class 'str'>"] No. of unique values: 129/546
datasetId (unique identifier, as given by original authors, includes original database token)¶
# check datasetId
print(len(df.datasetId.unique()))
print(len(df))
key = 'datasetId'
print('%s (starts with): '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print('datasetId starts with: ', np.unique([str(dd.split('_')[0]) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
546 546 datasetId (starts with): ['sisal_9.0_0' 'sisal_9.0_1' 'sisal_9.0_2' 'sisal_19.0_3' ... 'sisal_901.0_543' 'sisal_901.0_544' 'sisal_901.0_545'] ["<class 'str'>"] datasetId starts with: ['sisal'] No. of unique values: 546/546
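The output above confirms every `datasetId` follows the token layout `sisal_<entityId>.<minor>_<runningIndex>` (e.g. `sisal_9.0_0`). A stricter check than splitting on `_` is a regex match; the helper below is a hypothetical sketch (not part of the original pipeline), assuming that token layout:

```python
import re

# expected SISAL token layout, e.g. 'sisal_9.0_0' or 'sisal_901.0_545'
DATASET_ID_RE = re.compile(r'^sisal_\d+\.\d+_\d+$')

def check_dataset_ids(ids):
    """Return the ids that do NOT match the expected 'sisal_<entity>.<minor>_<index>' layout."""
    return [i for i in ids if not DATASET_ID_RE.match(i)]

# any id from another database token (e.g. a stray PAGES2k record) would be flagged
bad = check_dataset_ids(['sisal_9.0_0', 'sisal_901.0_545', 'pages2k_12'])
```

Running this over `df['datasetId']` should return an empty list for a clean SISAL v3 extract.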
originalDataURL (URL/DOI of original published record where available)¶
# originalDataURL
key = 'originalDataURL'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([kk for kk in df[key] if 'this' in kk]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# flag placeholder entries like 'this study' (seen in PAGES2k), which should point to a real URL
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
originalDataURL: ["['10.1002/2015GL063826']" "['10.1002/2015gl065397']" "['10.1002/2016GL071786']" ... "['10.5194/cp-5-667-2009']" "['10.5194/cp-8-1751-2012']" "['unpublished']"] [] ["<class 'str'>"] No. of unique values: 140/546
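As the output shows, each `originalDataURL` cell is a *stringified* one-element list (e.g. `"['10.1002/2015GL063826']"`) rather than a bare DOI. For downstream use (e.g. resolving DOIs), the bare string can be recovered with `ast.literal_eval`; the helper below is a minimal sketch, not part of the original pipeline:

```python
import ast

def unpack_url(cell):
    """Recover the bare DOI/URL from a stringified one-element list such as
    "['10.1002/2015GL063826']"; plain strings pass through unchanged."""
    if isinstance(cell, str) and cell.startswith('['):
        values = ast.literal_eval(cell)
        return values[0] if values else cell
    return cell
```

Applied column-wise (`df['originalDataURL'].map(unpack_url)`), this would yield plain DOI strings, with `'unpublished'` marking records that have no published source.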
originalDatabase (original database used as input for dataframe)¶
# originalDatabase
key = 'originalDatabase'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# all records originate from a single input database (SISAL v3)
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
originalDatabase: ['SISAL v3'] ["<class 'str'>"] No. of unique values: 1/546
geographical metadata: elevation, latitude, longitude, site name¶
geo_meanElev (mean elevation in m)¶
# check Elevation
key = 'geo_meanElev'
print('%s: '%key)
print(df[key])
print(np.unique(['%d'%kk for kk in df[key] if np.isfinite(kk)]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanElev:
0 3000.0
1 3000.0
2 3000.0
3 2000.0
4 2000.0
...
541 1190.0
542 1190.0
543 1190.0
544 1190.0
545 1190.0
Name: geo_meanElev, Length: 546, dtype: float32
['10' '100' '1000' '1120' '1140' '1160' '1190' '120' '1200' '1240' '1250'
'1260' '1290' '1300' '131' '1370' '1386' '1400' '1407' '1420' '1440'
'1460' '1490' '1495' '150' '1530' '162' '165' '1650' '175' '180' '184'
'1900' '1960' '20' '200' '2000' '2114' '2132' '22' '230' '2347' '239'
'240' '2400' '250' '2660' '280' '2830' '285' '294' '300' '3000' '306'
'310' '32' '335' '336' '340' '350' '352' '365' '383' '3850' '393' '400'
'401' '41' '420' '43' '433' '435' '440' '455' '456' '475' '480' '500'
'518' '53' '530' '550' '570' '590' '60' '600' '631' '650' '660' '680'
'70' '700' '72' '730' '85' '860' '870' '934' '940' '965']
["<class 'float'>"]
No. of unique values: 100/546
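The elevations above all fall in a plausible band for cave sites (10 m to 3850 m). A simple bounds check makes that explicit; the helper below is a hypothetical sketch with assumed bounds (roughly Dead Sea shore to high Tibetan Plateau), not part of the original pipeline. NaNs are ignored, matching the `np.isfinite` filter used in the cell above:

```python
import numpy as np

def elevation_outliers(elevs, lo=-430.0, hi=6000.0):
    """Return indices of finite elevations outside a plausible range (in m) for cave sites."""
    e = np.asarray(elevs, dtype=float)
    mask = np.isfinite(e) & ((e < lo) | (e > hi))
    return np.flatnonzero(mask)

# e.g. elevation_outliers(df['geo_meanElev'].values) should be empty for this dataframe
flagged = elevation_outliers([3000.0, float('nan'), 7000.0, -500.0])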
geo_meanLat (mean latitude in degrees N)¶
# Latitude
key = 'geo_meanLat'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanLat: ['-11' '-12' '-13' '-14' '-15' '-16' '-18' '-19' '-21' '-24' '-27' '-31' '-32' '-34' '-35' '-38' '-4' '-41' '-5' '-8' '-9' '0' '12' '15' '16' '17' '18' '19' '20' '22' '25' '26' '27' '28' '29' '30' '31' '32' '33' '35' '36' '37' '38' '39' '4' '40' '41' '42' '43' '44' '45' '46' '50' '51' '54' '66' '67' '9'] ["<class 'float'>"] No. of unique values: 126/546
geo_meanLon (mean longitude in degrees E)¶
# Longitude
key = 'geo_meanLon'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanLon: ['-104' '-105' '-111' '-115' '-118' '-123' '-2' '-3' '-37' '-4' '-41' '-44' '-46' '-47' '-49' '-55' '-56' '-60' '-65' '-67' '-7' '-75' '-77' '-79' '-80' '-83' '-89' '-9' '-90' '-98' '-99' '0' '10' '100' '102' '103' '105' '106' '107' '108' '109' '11' '110' '113' '114' '115' '117' '118' '120' '128' '13' '14' '148' '15' '159' '167' '17' '171' '177' '21' '22' '24' '29' '3' '30' '31' '35' '39' '45' '46' '54' '56' '63' '7' '72' '77' '8' '81' '82' '91'] ["<class 'float'>"] No. of unique values: 128/546
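The latitude and longitude dumps above span the expected global range. A combined range check catches swapped or mis-signed coordinates before mapping; the helper below is a minimal sketch (hypothetical, not part of the original pipeline), assuming latitudes in [-90, 90] degrees N and longitudes in [-180, 180] degrees E:

```python
import numpy as np

def invalid_coords(lats, lons):
    """Return indices where (lat, lon) falls outside [-90, 90] x [-180, 180]."""
    lat = np.asarray(lats, dtype=float)
    lon = np.asarray(lons, dtype=float)
    bad = (np.abs(lat) > 90) | (np.abs(lon) > 180)
    return np.flatnonzero(bad)

# a latitude of -91 is impossible and gets flagged
flagged = invalid_coords([30.7, -91.0], [77.3, 10.0])
```

Running `invalid_coords(df['geo_meanLat'], df['geo_meanLon'])` should return an empty array for this dataframe.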
geo_siteName (name of collection site)¶
# Site Name
key = 'geo_siteName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_siteName: ['Bittoo cave' 'Bittoo cave' 'Bittoo cave' 'Kesang cave' 'Kesang cave' 'Kesang cave' 'Paraiso cave' ... 'Sahiya cave' 'Sahiya cave' 'Sahiya cave'] ["<class 'str'>"] No. of unique values: 129/546
proxy metadata: archive type, proxy type, interpretation¶
archiveType (archive type)¶
# archiveType
key = 'archiveType'
print('%s: '%key)
print(np.unique(df[key]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
archiveType: ['Speleothem'] ["<class 'str'>"] No. of unique values: 1/546
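The same four-line inspection is repeated for every column below; it can be bundled into a small helper (a sketch only — `summarize_column` is our name, not part of the original script, and `toy` stands in for the SISAL dataframe `df`):

```python
import numpy as np
import pandas as pd

def summarize_column(df, key):
    """Print unique values, the Python types present, and the uniqueness ratio."""
    uniques = np.unique([str(v) for v in df[key]])
    types = np.unique([str(type(v)) for v in df[key]])
    print('%s: ' % key)
    print(uniques)
    print(types)
    print(f'No. of unique values: {len(uniques)}/{len(df)}')
    return uniques

# toy frame standing in for the SISAL dataframe
toy = pd.DataFrame({'archiveType': ['Speleothem'] * 3})
uniques = summarize_column(toy, 'archiveType')
```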
paleoData_proxy (proxy type)¶
# paleoData_proxy
key = 'paleoData_proxy'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_proxy: ['Mg/Ca' 'd13C' 'd18O' 'growth rate'] ["<class 'str'>"] No. of unique values: 4/546
paleoData_sensorSpecies (further information on proxy type: species)¶
# paleoData_sensorSpecies
key = 'paleoData_sensorSpecies'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_sensorSpecies: ['N/A'] ["<class 'str'>"] No. of unique values: 1/546
paleoData_notes (notes)¶
# paleoData_notes
key = 'paleoData_notes'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_notes: [output truncated: 546 mineralogy labels, each one of 'calcite', 'aragonite', or 'mixed'] ["<class 'str'>"] No. of unique values: 3/546
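When a column holds only a handful of recurring labels, as paleoData_notes does, `value_counts` gives a more compact summary than printing every row (a sketch on synthetic labels, not the real 546-row column):

```python
import pandas as pd

# synthetic stand-in for df['paleoData_notes']
notes = pd.Series(['calcite'] * 4 + ['aragonite'] * 2 + ['mixed'])
counts = notes.value_counts()
print(counts)
```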
paleoData_variableName¶
# key = 'paleoData_variableName'
# print('%s: '%key)
# print(np.unique([kk for kk in df[key]]))
# print(np.unique([str(type(dd)) for dd in df[key]]))
# print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
climate metadata: interpretation variable, direction, seasonality¶
interpretation_direction¶
# interpretation_direction
key = 'interpretation_direction'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_direction: ['N/A'] No. of unique values: 1/546
interpretation_seasonality¶
# interpretation_seasonality
key = 'interpretation_seasonality'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_seasonality: ['N/A'] No. of unique values: 1/546
interpretation_variable¶
# interpretation_variable
key = 'interpretation_variable'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variable: ['N/A' 'temperature' 'temperature+moisture'] No. of unique values: 3/546
interpretation_variableDetail¶
# interpretation_variableDetail
key = 'interpretation_variableDetail'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variableDetail: ['N/A' 'temperature - manually assigned by DoD2k authors for paleoData_proxy = Mg/Ca' 'temperature+moisture - manually assigned by DoD2k authors for paleoData_proxy = d18O.'] No. of unique values: 3/546
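The variableDetail strings record that interpretations were assigned manually per proxy type; that rule could be expressed as a lookup table (a hypothetical sketch mirroring the notes in the output above — this dictionary is not the DoD2k authors' actual code):

```python
# Hypothetical mapping reflecting the manual assignments described in
# interpretation_variableDetail; proxies without an assignment stay 'N/A'.
PROXY_TO_INTERPRETATION = {
    'Mg/Ca': 'temperature',
    'd18O': 'temperature+moisture',
}

def interpretation_for(proxy):
    return PROXY_TO_INTERPRETATION.get(proxy, 'N/A')

print(interpretation_for('Mg/Ca'))
```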
data¶
paleoData_values¶
# paleoData_values
key = 'paleoData_values'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
    try:
        print('%-30s: %s -- %s' % (df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
        print(type(vv))
    except Exception:
        print(df['dataSetName'].iloc[ii], 'NaNs detected.')
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_values:
Bittoo cave                   : -10.871 -- -5.274          <class 'numpy.ndarray'>
Bittoo cave                   : -8.426 -- 2.907            <class 'numpy.ndarray'>
Bittoo cave                   : 0.023809524 -- 0.5         <class 'numpy.ndarray'>
Kesang cave                   : -9.86 -- -7.15             <class 'numpy.ndarray'>
Kesang cave                   : -7.34 -- -1.85             <class 'numpy.ndarray'>
Kesang cave                   : 0.011538462 -- 0.02875     <class 'numpy.ndarray'>
Paraiso cave                  : -7.432 -- -5.568           <class 'numpy.ndarray'>
Paraiso cave                  : -10.426 -- -7.532          <class 'numpy.ndarray'>
Paraiso cave                  : 0.07721781 -- 0.9637938    <class 'numpy.ndarray'>
Paraiso cave                  : -6.82 -- -4.73             <class 'numpy.ndarray'>
Paraiso cave                  : -10.03 -- -6.5             <class 'numpy.ndarray'>
Paraiso cave                  : 0.1243238 -- 0.3016036     <class 'numpy.ndarray'>
Villars cave                  : -5.2911587 -- -3.3193688   <class 'numpy.ndarray'>
Villars cave                  : -11.889349 -- -7.2085137   <class 'numpy.ndarray'>
Villars cave                  : 0.09545446 -- 11.775341    <class 'numpy.ndarray'>
Cold Air cave                 : -5.52 -- -2.395            <class 'numpy.ndarray'>
Cold Air cave                 : -6.63 -- -2.32             <class 'numpy.ndarray'>
Cold Air cave                 : 0.0016826758 -- 1.0569707  <class 'numpy.ndarray'>
Cold Air cave                 : -5.928387 -- -2.3865893    <class 'numpy.ndarray'>
Cold Air cave                 : -5.998736 -- -2.3895519    <class 'numpy.ndarray'>
["<class 'numpy.ndarray'>"]
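The try/except above guards against rows that are all-NaN or non-numeric; an explicit guard makes that failure mode visible (a sketch with a helper name of our choosing, not part of the original script):

```python
import numpy as np

def record_range(values):
    """Return (min, max) of one proxy series, ignoring NaNs; None if nothing numeric."""
    arr = np.asarray(values, dtype=float)
    if np.all(np.isnan(arr)):  # also True for an empty array
        return None
    return float(np.nanmin(arr)), float(np.nanmax(arr))
```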
paleoData_units¶
# paleoData_units
key = 'paleoData_units'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_units: ['mm/year' 'mmol/mol' 'permil'] ["<class 'str'>"] No. of unique values: 3/546
year¶
# year
key = 'year'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
    try:
        print('%-30s: %s -- %s' % (df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
    except Exception:
        print('NaNs detected.', vv)
print(np.unique([str(type(dd)) for dd in df[key]]))
year:
Bittoo cave                   : 11.0 -- 1076.0
Bittoo cave                   : 11.0 -- 1076.0
Bittoo cave                   : 11.0 -- 1950.0
Kesang cave                   : 632.0 -- 1098.0
Kesang cave                   : 632.0 -- 1098.0
Kesang cave                   : 640.0 -- 1950.0
Paraiso cave                  : 7.2082186 -- 1235.592
Paraiso cave                  : 7.2082186 -- 1235.592
Paraiso cave                  : 7.2082186 -- 1950.0
Paraiso cave                  : 1181.5256 -- 1998.0459
Paraiso cave                  : 1181.5256 -- 1998.0459
Paraiso cave                  : 1188.324 -- 1998.0459
Villars cave                  : 6.147 -- 1987.727
Villars cave                  : 6.147 -- 1987.727
Villars cave                  : 6.147 -- 1987.727
Cold Air cave                 : 1264.2115 -- 1484.1798
Cold Air cave                 : 1264.2115 -- 1484.1798
Cold Air cave                 : 1264.2115 -- 1950.0
Cold Air cave                 : 19.0 -- 1996.0
Cold Air cave                 : 19.0 -- 1996.0
["<class 'numpy.ndarray'>"]
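As described in the header, only records covering a specified period of interest are retained; that check on a record's year vector could look like this (a sketch — the function name and the bounds are illustrative, not the original script's code):

```python
import numpy as np

def covers_period(years, start, end):
    """True if the record's year vector (CE) fully spans [start, end]."""
    years = np.asarray(years, dtype=float)
    return bool(np.nanmin(years) <= start and np.nanmax(years) >= end)

# e.g. a record spanning 11-1076 CE covers the interval 100-1000 CE
print(covers_period([11.0, 500.0, 1076.0], 100, 1000))
```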
yearUnits¶
# yearUnits
key = 'yearUnits'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
yearUnits: ['CE'] ["<class 'str'>"] No. of unique values: 1/546
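Since yearUnits is uniformly 'CE', comparison with archives dated in years BP only needs the standard 1950 CE datum (a minimal sketch; the conversion is not performed in this notebook):

```python
def ce_to_bp(year_ce):
    """Convert a calendar year CE to years BP (datum 1950 CE, the radiocarbon convention)."""
    return 1950 - year_ce

print(ce_to_bp(1950), ce_to_bp(1000))
```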