Load CoralHydro 2k¶
load data from Coral Hydro 2k v1.0.1 (https://essd.copernicus.org/articles/15/2081/2023/)
Dataset downloaded from LiPDverse: https://lipdverse.org/CoralHydro2k/current_version/
Created by Kevin Fan and Lucie Luecke (LL). Base on Feng Zhu and Julien Emile-Geay's template (lipd2df notebook)
Update 24/10/25 by LL: streamline and tidy up for publication and documentation Update 21/11/24 by LL : added option to save as csv Update 30/10/24 by LL (v4): added check for empty paleoData_values row Update 29/10/2024 by LL (v4): modified datasetId to create unique identifier for each record.
Here we extract a dataframe with the following columns:
archiveTypedataSetNamedatasetIdgeo_meanElevgeo_meanLatgeo_meanLongeo_siteNameinterpretation_direction(new in v2.0)interpretation_variableinterpretation_variableDetailinterpretation_seasonality(new in v2.0)originalDataURLoriginalDatabasepaleoData_notespaleoData_proxypaleoData_sensorSpeciespaleoData_unitspaleoData_valuespaleoData_variableNameyearyearUnits
We save a standardised compact dataframe for concatenation to DoD2k
Set up working environment¶
Make sure the repo_root is added correctly, it should be: your_root_dir/dod2k This should be the working directory throughout this notebook (and all other notebooks).
%load_ext autoreload
%autoreload 2
import sys
import os
from pathlib import Path
# Add parent directory to path (works from any notebook in notebooks/)
# the repo_root should be the parent directory of the notebooks folder
init_dir = Path().resolve()
# Determine repo root
if init_dir.name == 'dod2k': repo_root = init_dir
elif init_dir.parent.name == 'dod2k': repo_root = init_dir.parent
else: raise Exception('Please review the repo root structure (see first cell).')
# Update cwd and path only if needed
if os.getcwd() != str(repo_root):
os.chdir(repo_root)
if str(repo_root) not in sys.path:
sys.path.insert(0, str(repo_root))
print(f"Repo root: {repo_root}")
if str(os.getcwd())==str(repo_root):
print(f"Working directory matches repo root. ")
Repo root: /home/jupyter-lluecke/dod2k_v2.0/dod2k Working directory matches repo root.
# Import packages
import lipd
import pandas as pd
import numpy as np
from dod2k_utilities import ut_functions as utf # contains utility functions
from dod2k_utilities import ut_plot as uplt # contains plotting functions
Load source data¶
In order to get the source data, run the cell below. This will download a series of LiPD files into the directory lipdfiles
Alternatively skip the cell and directly use the files as provided in this directory (see cell below next).
# # Download the file (use -O to specify output filename)
# !wget -O data/ch2k/CoralHydro2k1_0_1.zip https://lipdverse.org/CoralHydro2k/current_version/CoralHydro2k1_0_1.zip
# # Unzip to the correct destination
# !unzip data/ch2k/CoralHydro2k1_0_1.zip -d data/ch2k/ch2k_101
# load LiPD files from the given directory
D = lipd.readLipd(str(repo_root)+'/data/ch2k/ch2k_101/');
TS = lipd.extractTs(D);
len(TS)
os.chdir(repo_root)
Disclaimer: LiPD files may be updated and modified to adhere to standards Found: 179 LiPD file(s) reading: CH03BUN01.lpd reading: ZI15MER01.lpd reading: CO03PAL03.lpd reading: CO03PAL02.lpd reading: CA13PEL01.lpd reading: LI06RAR01.lpd reading: CO03PAL07.lpd reading: FL18DTO03.lpd reading: SM06LKF02.lpd reading: UR00MAI01.lpd reading: TU95MAD01.lpd reading: ZI04IFR01.lpd reading: RE18CAY01.lpd reading: KU99HOU01.lpd reading: OS13NLP01.lpd reading: EV98KIR01.lpd reading: LI00RAR01.lpd reading: NU11PAL01.lpd reading: MA08DTO01.lpd reading: AB20MEN03.lpd reading: CA14TIM01.lpd reading: KA17RYU01.lpd reading: MC11KIR01.lpd reading: AB20MEN09.lpd reading: HE08LRA01.lpd reading: DA06MAF01.lpd reading: SM06LKF01.lpd reading: NA09MAL01.lpd reading: SW98STP01.lpd reading: MU18GSI01.lpd reading: ZI14HOU01.lpd reading: FL17DTO02.lpd reading: DA06MAF02.lpd reading: SA19PAL02.lpd reading: CO03PAL01.lpd reading: ZI16ROD01.lpd reading: OS13NGP01.lpd reading: CH98PIR01.lpd reading: RE19GBR02.lpd reading: MU18RED04.lpd reading: GR13MAD01.lpd reading: XI17HAI01.lpd reading: DE14DTO03.lpd reading: KL97DAH01.lpd reading: QU06RAB01.lpd reading: DE14DTO01.lpd reading: KU00NIN01.lpd reading: TU01SIA01.lpd reading: RE19GBR01.lpd reading: GR13MAD02.lpd reading: AB20MEN07.lpd reading: BR19RED01.lpd reading: NU09FAN01.lpd reading: MU18RED01.lpd reading: OS14RIP01.lpd reading: DE14DTO02.lpd reading: LI04FIJ01.lpd reading: EV18ROC01.lpd reading: CA13SAP01.lpd reading: JI18GAL02.lpd reading: TU01LAI01.lpd reading: HE13MIS01.lpd reading: GU99NAU02.lpd reading: ZI15IMP02.lpd reading: PF04PBA01.lpd reading: SA20FAN02.lpd reading: WE09ARR01.lpd reading: CO03PAL05.lpd reading: XU15BVI01.lpd reading: HE18COC02.lpd reading: MU18NPI01.lpd reading: MO06PED01.lpd reading: KR20SAR01.lpd reading: SA18GBR01.lpd reading: OS14UCP01.lpd reading: AB20MEN08.lpd reading: HE13MIS02.lpd reading: HE10GUA01.lpd reading: BO14HTI01.lpd reading: DE12ANC01.lpd reading: WA17BAN01.lpd reading: DR99ABR01.lpd reading: RO19MAR01.lpd reading: LI06RAR02.lpd reading: MU18RED03.lpd reading: SW99LIG01.lpd reading: SA16CLA01.lpd reading: ZI15TAN01.lpd reading: RE19GBR03.lpd reading: DR00KSB01.lpd reading: BO14HTI02.lpd reading: MU17DOA01.lpd reading: TA18TAS01.lpd reading: XU15BVI03.lpd reading: DU94URV02.lpd reading: AS05GUA01.lpd reading: FE09OGA01.lpd reading: GU99NAU01.lpd reading: SA20FAN01.lpd reading: CA13DIA01.lpd reading: AL16PUR02.lpd reading: CO03PAL10.lpd reading: RE19GBR05.lpd reading: ZI15IMP01.lpd reading: KR20SAR02.lpd reading: RO19YUC01.lpd reading: ST13MAL01.lpd reading: DR00NBB01.lpd reading: PF19LAR01.lpd reading: AL16YUC01.lpd reading: CO03PAL09.lpd reading: ZI16ROD02.lpd reading: AB20MEN05.lpd reading: SH92PUN01.lpd reading: KI04MCV01.lpd reading: AL16PUR01.lpd reading: CH18YOA02.lpd reading: DE14DTO04.lpd reading: AB20MEN04.lpd reading: DE16RED01.lpd reading: BA04FIJ02.lpd reading: CO03PAL06.lpd reading: JI18GAL01.lpd reading: CH18YOA01.lpd reading: RE19GBR04.lpd reading: DO18DAV01.lpd reading: GO12SBV01.lpd reading: CA07FLI01.lpd reading: SW99LIG02.lpd reading: MC04PNG01.lpd reading: CO93TAR01.lpd reading: RO19PAR01.lpd reading: CO00MAL01.lpd reading: MO20WOA01.lpd reading: AB20MEN01.lpd reading: QU96ESV01.lpd reading: DE13HAI01.lpd reading: LI94SEC01.lpd reading: ZI15CLE01.lpd reading: MU18RED02.lpd reading: ZI08MAY01.lpd reading: TU01DEP01.lpd reading: CO03PAL04.lpd reading: RA19PAI01.lpd reading: AB15BHB01.lpd reading: FL18DTO01.lpd reading: MO20KOI01.lpd reading: DU94URV01.lpd reading: CO03PAL08.lpd reading: WU14CLI01.lpd reading: ZI14TUR01.lpd reading: AB20MEN02.lpd reading: LI99CLI01.lpd reading: ZI15BUN01.lpd reading: FE18RUS01.lpd reading: WU13TON01.lpd reading: KI14PAR01.lpd reading: ZI14IFR02.lpd reading: XU15BVI02.lpd reading: KI08PAR01.lpd reading: AB20MEN06.lpd reading: AB08MEN01.lpd reading: NU09KIR01.lpd reading: RI10PBL01.lpd reading: CA14BUT01.lpd reading: FL18DTO02.lpd reading: FL18DTO04.lpd reading: BA04FIJ01.lpd reading: HE02GBR01.lpd reading: GO08BER01.lpd reading: CA13TUR01.lpd reading: LI06FIJ01.lpd reading: HE18COC01.lpd reading: FL17DTO01.lpd reading: BO99MOO01.lpd reading: CH03LOM01.lpd reading: SA19PAL01.lpd reading: CH97BVB01.lpd reading: RA20TAI01.lpd Finished read: 179 records extracting paleoData... extracting: CH03BUN01 extracting: ZI15MER01 extracting: CO03PAL03 extracting: CO03PAL02 extracting: CA13PEL01 extracting: LI06RAR01 extracting: CO03PAL07 extracting: FL18DTO03 extracting: SM06LKF02 extracting: UR00MAI01 extracting: TU95MAD01 extracting: ZI04IFR01 extracting: RE18CAY01 extracting: KU99HOU01 extracting: OS13NLP01 extracting: EV98KIR01 extracting: LI00RAR01 extracting: NU11PAL01 extracting: MA08DTO01 extracting: AB20MEN03 extracting: CA14TIM01 extracting: KA17RYU01 extracting: MC11KIR01 extracting: AB20MEN09 extracting: HE08LRA01 extracting: DA06MAF01 extracting: SM06LKF01 extracting: NA09MAL01 extracting: SW98STP01 extracting: MU18GSI01 extracting: ZI14HOU01 extracting: FL17DTO02 extracting: DA06MAF02 extracting: SA19PAL02 extracting: CO03PAL01 extracting: ZI16ROD01 extracting: OS13NGP01 extracting: CH98PIR01 extracting: RE19GBR02 extracting: MU18RED04 extracting: GR13MAD01 extracting: XI17HAI01 extracting: DE14DTO03 extracting: KL97DAH01 extracting: QU06RAB01 extracting: DE14DTO01 extracting: KU00NIN01 extracting: TU01SIA01 extracting: RE19GBR01 extracting: GR13MAD02 extracting: AB20MEN07 extracting: BR19RED01 extracting: NU09FAN01 extracting: MU18RED01 extracting: OS14RIP01 extracting: DE14DTO02 extracting: LI04FIJ01 extracting: EV18ROC01 extracting: CA13SAP01 extracting: JI18GAL02 extracting: TU01LAI01 extracting: HE13MIS01 extracting: GU99NAU02 extracting: ZI15IMP02 extracting: PF04PBA01 extracting: SA20FAN02 extracting: WE09ARR01 extracting: CO03PAL05 extracting: XU15BVI01 extracting: HE18COC02 extracting: MU18NPI01 extracting: MO06PED01 extracting: KR20SAR01 extracting: SA18GBR01 extracting: OS14UCP01 extracting: AB20MEN08 extracting: HE13MIS02 extracting: HE10GUA01 extracting: BO14HTI01 extracting: DE12ANC01 extracting: WA17BAN01 extracting: DR99ABR01 extracting: RO19MAR01 extracting: LI06RAR02 extracting: MU18RED03 extracting: SW99LIG01 extracting: SA16CLA01 extracting: ZI15TAN01 extracting: RE19GBR03 extracting: DR00KSB01 extracting: BO14HTI02 extracting: MU17DOA01 extracting: TA18TAS01 extracting: XU15BVI03 extracting: DU94URV02 extracting: AS05GUA01 extracting: FE09OGA01 extracting: GU99NAU01 extracting: SA20FAN01 extracting: CA13DIA01 extracting: AL16PUR02 extracting: CO03PAL10 extracting: RE19GBR05 extracting: ZI15IMP01 extracting: KR20SAR02 extracting: RO19YUC01 extracting: ST13MAL01 extracting: DR00NBB01 extracting: PF19LAR01 extracting: AL16YUC01 extracting: CO03PAL09 extracting: ZI16ROD02 extracting: AB20MEN05 extracting: SH92PUN01 extracting: KI04MCV01 extracting: AL16PUR01 extracting: CH18YOA02 extracting: DE14DTO04 extracting: AB20MEN04 extracting: DE16RED01 extracting: BA04FIJ02 extracting: CO03PAL06 extracting: JI18GAL01 extracting: CH18YOA01 extracting: RE19GBR04 extracting: DO18DAV01 extracting: GO12SBV01 extracting: CA07FLI01 extracting: SW99LIG02 extracting: MC04PNG01 extracting: CO93TAR01 extracting: RO19PAR01 extracting: CO00MAL01 extracting: MO20WOA01 extracting: AB20MEN01 extracting: QU96ESV01 extracting: DE13HAI01 extracting: LI94SEC01 extracting: ZI15CLE01 extracting: MU18RED02 extracting: ZI08MAY01 extracting: TU01DEP01 extracting: CO03PAL04 extracting: RA19PAI01 extracting: AB15BHB01 extracting: FL18DTO01 extracting: MO20KOI01 extracting: DU94URV01 extracting: CO03PAL08 extracting: WU14CLI01 extracting: ZI14TUR01 extracting: AB20MEN02 extracting: LI99CLI01 extracting: ZI15BUN01 extracting: FE18RUS01 extracting: WU13TON01 extracting: KI14PAR01 extracting: ZI14IFR02 extracting: XU15BVI02 extracting: KI08PAR01 extracting: AB20MEN06 extracting: AB08MEN01 extracting: NU09KIR01 extracting: RI10PBL01 extracting: CA14BUT01 extracting: FL18DTO02 extracting: FL18DTO04 extracting: BA04FIJ01 extracting: HE02GBR01 extracting: GO08BER01 extracting: CA13TUR01 extracting: LI06FIJ01 extracting: HE18COC01 extracting: FL17DTO01 extracting: BO99MOO01 extracting: CH03LOM01 extracting: SA19PAL01 extracting: CH97BVB01 extracting: RA20TAI01 Created time series: 608 entries
# for ii in range(len(TS)):
# if np.any(['climate' in key for key in TS[ii].keys()]):
# print(TS[ii].keys())
Create compact dataframe¶
Create empty dataframe with set of columns for compact dataframe, and populate with the LiPD data
col_str=['archiveType', 'dataSetName', 'datasetId', 'geo_meanElev', 'geo_meanLat', 'geo_meanLon', 'geo_siteName',
'originalDataUrl', 'paleoData_notes', 'paleoData_variableName',
'paleoData_archiveSpecies','paleoData_units', 'paleoData_values', 'year']
df_tmp = pd.DataFrame(index=range(len(TS)), columns=col_str)
populate dataframe¶
Start by populating paleoData_variableName (paleoData_proxy in dod2k standard terms)
# loop over the timeseries and pick those for global temperature analysis
i = 0
for ts in TS: #for every time series
# need to filter these variables in the list
if ts['paleoData_variableName'] not in ['year', 'd18OUncertainty', 'SrCaUncertainty']: #filter out all ts with thee three as the var name
for name in col_str: #for each of the 12 main keys, shove the wanted data into the df
try:
df_tmp.loc[i, name] = ts[name]
except:
df_tmp.loc[i, name] = np.nan
i += 1
# drop the rows with all NaNs (those not for global temperature analysis)
df = df_tmp.dropna(how='all')
- Now check that paleoData_variableName has been correctly populated and does not contain NaNs:
# double check the variable names we have
set(df['paleoData_variableName'])
{'SrCa', 'SrCa_annual', 'd18O', 'd18O_annual', 'd18O_sw', 'd18O_sw_annual'}
- Add more metadata to the dataframe, including originalDatabase, yearUnits, interpretation_variable (these are added manually and not from the LiPD files)
# KF: adding original dataset name and yearUnits
df.insert(7, 'originalDatabase', ['CoralHydro2k v1.0.1']*len(df))
df.insert(len(df.columns), 'yearUnits', ['CE'] * len(df))
df.insert(1, 'interpretation_variable', ['N/A']*len(df))
df.insert(1, 'interpretation_variableDetail', ['N/A']*len(df))
df.insert(1, 'interpretation_seasonality', ['N/A']*len(df))
df.insert(1, 'interpretation_direction', ['N/A']*len(df))
df.insert(1, 'paleoData_proxy', df['paleoData_variableName'])
- Rename columns to fit naming conventions
df = df.rename(columns={'originalDataUrl': 'originalDataURL', 'paleoData_archiveSpecies': 'paleoData_sensorSpecies'})
Assign interpretation_variable based on paleoData_proxy type:
- d18O are temperature and moisture sensitive
- d18O_sw are misture sensitive (sw=seawater)
- SrCa are temperature sensitive
# d18O is temperature and moisture
df.loc[np.isin(df['paleoData_proxy'], ['d18O', 'd18O_annual']), 'interpretation_variable']='temperature+moisture'
df.loc[np.isin(df['paleoData_proxy'], ['d18O', 'd18O_annual']), 'interpretation_variableDetail']='temperature+moisture - manually assigned by DoD2k authors for paleoData_proxy = d18O'
# d18O_sw is moisture
df.loc[np.isin(df['paleoData_proxy'], ['d18O_sw', 'd18O_sw_annual']), 'interpretation_variable']='moisture'
df.loc[np.isin(df['paleoData_proxy'], ['d18O_sw', 'd18O_sw_annual']), 'interpretation_variableDetail']='moisture - manually assigned by DoD2k authors for paleoData_proxy = d18O_sw'
# SrCa is temperature
df.loc[np.isin(df['paleoData_proxy'], ['SrCa', 'SrCa_annual']), 'interpretation_variable']='temperature'
df.loc[np.isin(df['paleoData_proxy'], ['SrCa', 'SrCa_annual']), 'interpretation_variableDetail']='temperature - manually assigned by DoD2k authors for paleoData_proxy = Sr/Ca'
Now filter for certain records:
- exclude sw data
- drop
_annualtag - rename
SrCatoSr/Cato match the standard terminology - rename
coraltoCoralto match the standard terminology - drop rows with no data
Rename entries according to standard terminology¶
import re
# KF: Extract and exclude sw values
df_sw = df[df['paleoData_proxy'].isin(['d18O_sw', 'd18O_sw_annual'])]
df = df[df['paleoData_proxy'].isin(['d18O_sw', 'd18O_sw_annual']) == False]
# KF: Turn annual measurements into regular
df_annual = df[df['paleoData_proxy'].isin(['SrCa_annual', 'd18O_annual'])]
for key in ['paleoData_proxy', 'paleoData_variableName']:
df [key] = df[key].apply(lambda x: re.match(r'(.*)_annual', x).group(1) if re.match(r'(.*)_annual', x) else x)
# KF: Replace SrCa with Sr/Ca for concat consistency
df[key] = df[key].apply(lambda x: 'Sr/Ca' if re.match('SrCa', x) else x)
# KF: Temp cleaning rows with NAN in year
length = len(df['year'])
df = df[df['year'].notna()]
df = df[df['year'].map(lambda x: len(x) > 0)]
df = df[df['paleoData_values'].map(lambda x: not any(pd.isnull(x)))]
print('Number of rows discarded: ', (length - len(df['year'])))
df['archiveType'] = df['archiveType'].replace({'coral': 'Coral'})
# # KF: Make datasetIds unique
# df['datasetId'] = df['datasetId'] + np.array(df.index, dtype = str)
Number of rows discarded: 0
# KF: Type-checking
df = df.astype({'archiveType': str, 'dataSetName': str, 'datasetId': str, 'geo_meanElev': np.float32, 'geo_meanLat': np.float32, 'geo_meanLon': np.float32, 'geo_siteName': str,
'originalDatabase': str, 'originalDataURL': str, 'paleoData_notes': str, 'paleoData_proxy': str, 'paleoData_units': str, 'yearUnits': str})
df['year'] = df['year'].map(lambda x: np.array(x, dtype = np.float32))
df['paleoData_values'] = df['paleoData_values'].map(lambda x: np.array(x, dtype = np.float32))
Include Common Era data only
for ii in df.index:
year = np.array(df.at[ii, 'year'], dtype=float)
vals = np.array(df.at[ii, 'paleoData_values'], dtype=float)
df.at[ii, 'year'] = year[year>=1]
df.at[ii, 'paleoData_values'] = vals[year>=1]
Note that the datasetId is not unique for each record and thus we added an additional array of strings to make the datasetId unique.
# check that the datasetId is unique
print(len(df.datasetId.unique()))
# make datasetId unique by simply adding index number
df.datasetId=df.apply(lambda x: x.datasetId.replace('ch2k','ch2k_')+'_'+str(x.name), axis=1)
# check uniqueness - problem solved.
print(len(df.datasetId.unique()))
179 272
mask out nans and set fill value, then later drop
Drop missing entries and standardize missing data format¶
drop_inds = []
for ii in df.index:
try:
year = np.array(df.at[ii, 'year'], dtype=float)
vals = np.array(df.at[ii, 'paleoData_values'], dtype=float)
df.at[ii, 'year'] = year[year>=1]
df.at[ii, 'paleoData_values'] = vals[year>=1]
except:
# print
df.at[ii, 'paleoData_values'] = np.array([utf.convert_to_float(y) for y in df.at[ii, 'paleoData_values']], dtype=float)
df.at[ii, 'year'] = np.array([utf.convert_to_float(y) for y in df.at[ii, 'year']], dtype=float)
print(f'Converted values in paleoData_values and/or year for {ii}.')
# drop_inds.append(ii)
# df_compact = df_compact.drop(drop_inds)
# drop all missing values and exclude all-missing-values-rows
for ii in df.index:
dd = np.array(df.at[ii, 'paleoData_values'])
mask = dd==-9999.99
df.at[ii, 'paleoData_values']=dd[~mask]
df.at[ii, 'year']=np.array(df.at[ii, 'year'])[~mask]
drop_inds = []
for ii, row in enumerate(df.paleoData_values):
try:
if len(row)==0:
print(ii, 'empty row for paleodata_values')
elif len(df.iloc[ii]['year'])==0:
print(ii, 'empty row for year')
elif np.std(row)==0:
print(ii, 'std=0')
elif np.sum(np.diff(row)**2)==0:
print(ii, 'diff=0')
elif np.isnan(np.std(row)):
print(ii, 'std nan')
else:
continue
if df.index[ii] not in drop_inds:
drop_inds += [df.index[ii]]
except:
drop_inds+=[df.index[ii]]
print(drop_inds)
df = df.drop(index=drop_inds)
5 std nan 9 std nan 10 std nan 14 std nan 26 std nan 28 std nan 36 std nan 37 std nan 42 std nan 43 std nan 44 std nan 45 std nan 46 std nan 58 std nan 76 std nan 87 std nan 91 std nan 103 std nan 116 std nan 117 std nan 123 std nan 138 std nan 139 std nan 147 std nan 166 std nan 169 std nan 171 std nan 172 std nan 175 std nan 177 std nan 186 std nan 202 std nan 206 std nan 215 std nan 219 std nan 227 std nan 230 std nan 238 std nan 239 std nan 240 std nan 241 std nan 242 std nan 243 std nan 250 std nan 252 std nan 253 std nan 254 std nan 255 std nan 256 std nan 259 std nan 270 std nan [10, 18, 20, 28, 58, 62, 80, 82, 92, 94, 96, 98, 100, 124, 164, 190, 198, 224, 252, 254, 268, 298, 300, 318, 364, 370, 376, 378, 384, 388, 406, 444, 454, 474, 484, 502, 508, 528, 530, 532, 534, 536, 538, 556, 560, 562, 564, 568, 570, 580, 604]
Now show the final compact dataframe
df = df[sorted(df.columns)]
df.reset_index(drop= True, inplace= True)
print(df.info())
<class 'pandas.core.frame.DataFrame'> RangeIndex: 221 entries, 0 to 220 Data columns (total 21 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 archiveType 221 non-null object 1 dataSetName 221 non-null object 2 datasetId 221 non-null object 3 geo_meanElev 186 non-null float32 4 geo_meanLat 221 non-null float32 5 geo_meanLon 221 non-null float32 6 geo_siteName 221 non-null object 7 interpretation_direction 221 non-null object 8 interpretation_seasonality 221 non-null object 9 interpretation_variable 221 non-null object 10 interpretation_variableDetail 221 non-null object 11 originalDataURL 221 non-null object 12 originalDatabase 221 non-null object 13 paleoData_notes 221 non-null object 14 paleoData_proxy 221 non-null object 15 paleoData_sensorSpecies 221 non-null object 16 paleoData_units 221 non-null object 17 paleoData_values 221 non-null object 18 paleoData_variableName 221 non-null object 19 year 221 non-null object 20 yearUnits 221 non-null object dtypes: float32(3), object(18) memory usage: 33.8+ KB None
save compact dataframe¶
save pickle¶
# save to a pickle file
df_compact = df[sorted(df.columns)]
df_compact.to_pickle('data/ch2k/ch2k_compact.pkl')
save csv¶
# save to a list of csv files (metadata, data, year)
df_compact.name='ch2k'
utf.write_compact_dataframe_to_csv(df_compact)
METADATA: datasetId, archiveType, dataSetName, geo_meanElev, geo_meanLat, geo_meanLon, geo_siteName, interpretation_direction, interpretation_seasonality, interpretation_variable, interpretation_variableDetail, originalDataURL, originalDatabase, paleoData_notes, paleoData_proxy, paleoData_sensorSpecies, paleoData_units, paleoData_variableName, yearUnits Saved to /home/jupyter-lluecke/dod2k_v2.0/dod2k/data/ch2k/ch2k_compact_%s.csv
# load dataframe
df = utf.load_compact_dataframe_from_csv('ch2k')
Visualise dataframe¶
Show spatial distribution of records, show archive and proxy types
# count archive types
archive_count = {}
for ii, at in enumerate(set(df['archiveType'])):
archive_count[at] = df.loc[df['archiveType']==at, 'archiveType'].count()
sort = np.argsort([cc for cc in archive_count.values()])
archives_sorted = np.array([cc for cc in archive_count.keys()])[sort][::-1]
# Specify colour for each archive (smaller archives get grouped into the same colour)
archive_colour, major_archives, other_archives = uplt.get_archive_colours(archives_sorted, archive_count)
fig = uplt.plot_geo_archive_proxy(df, archive_colour)
utf.save_fig(fig, f'geo_{df.name}', dir=df.name)
0 Coral 221 saved figure in /home/jupyter-lluecke/dod2k_v2.0/dod2k/figs/ch2k/geo_ch2k.pdf
Now plot the coverage over the Common Era
fig = uplt.plot_coverage(df, archives_sorted, major_archives, other_archives, archive_colour)
utf.save_fig(fig, f'time_{df.name}', dir=df.name)
saved figure in /home/jupyter-lluecke/dod2k_v2.0/dod2k/figs/ch2k/time_ch2k.pdf
Display dataframe¶
Display identification metadata: dataSetName, datasetId, originalDataURL, originalDatabase¶
index¶
# # check index
print(df.index)
RangeIndex(start=0, stop=221, step=1)
dataSetName (associated with each record, may not be unique)¶
# # check dataSetName
key = 'dataSetName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
dataSetName: ['CH03BUN01' 'ZI15MER01' 'ZI15MER01' 'CO03PAL03' 'CO03PAL02' 'LI06RAR01' 'CO03PAL07' 'FL18DTO03' 'UR00MAI01' 'TU95MAD01' 'ZI04IFR01' 'RE18CAY01' 'RE18CAY01' 'RE18CAY01' 'RE18CAY01' 'KU99HOU01' 'OS13NLP01' 'EV98KIR01' 'LI00RAR01' 'LI00RAR01' 'NU11PAL01' 'NU11PAL01' 'MA08DTO01' 'CA14TIM01' 'CA14TIM01' 'KA17RYU01' 'MC11KIR01' 'AB20MEN09' 'HE08LRA01' 'DA06MAF01' 'NA09MAL01' 'SW98STP01' 'MU18GSI01' 'MU18GSI01' 'FL17DTO02' 'DA06MAF02' 'SA19PAL02' 'SA19PAL02' 'CO03PAL01' 'ZI16ROD01' 'OS13NGP01' 'CH98PIR01' 'RE19GBR02' 'RE19GBR02' 'MU18RED04' 'GR13MAD01' 'XI17HAI01' 'XI17HAI01' 'XI17HAI01' 'XI17HAI01' 'DE14DTO03' 'KL97DAH01' 'QU06RAB01' 'QU06RAB01' 'DE14DTO01' 'KU00NIN01' 'TU01SIA01' 'RE19GBR01' 'RE19GBR01' 'GR13MAD02' 'AB20MEN07' 'BR19RED01' 'NU09FAN01' 'NU09FAN01' 'MU18RED01' 'OS14RIP01' 'DE14DTO02' 'LI04FIJ01' 'LI04FIJ01' 'EV18ROC01' 'EV18ROC01' 'CA13SAP01' 'TU01LAI01' 'HE13MIS01' 'HE13MIS01' 'ZI15IMP02' 'ZI15IMP02' 'PF04PBA01' 'SA20FAN02' 'WE09ARR01' 'WE09ARR01' 'CO03PAL05' 'XU15BVI01' 'HE18COC02' 'HE18COC02' 'MU18NPI01' 'MO06PED01' 'KR20SAR01' 'KR20SAR01' 'SA18GBR01' 'OS14UCP01' 'AB20MEN08' 'HE13MIS02' 'HE13MIS02' 'HE10GUA01' 'HE10GUA01' 'HE10GUA01' 'HE10GUA01' 'DE12ANC01' 'WA17BAN01' 'WA17BAN01' 'DR99ABR01' 'DR99ABR01' 'LI06RAR02' 'MU18RED03' 'SW99LIG01' 'SA16CLA01' 'ZI15TAN01' 'ZI15TAN01' 'RE19GBR03' 'RE19GBR03' 'DR00KSB01' 'BO14HTI02' 'BO14HTI02' 'MU17DOA01' 'TA18TAS01' 'XU15BVI03' 'AS05GUA01' 'FE09OGA01' 'FE09OGA01' 'FE09OGA01' 'FE09OGA01' 'GU99NAU01' 'SA20FAN01' 'AL16PUR02' 'CO03PAL10' 'RE19GBR05' 'ZI15IMP01' 'ZI15IMP01' 'KR20SAR02' 'KR20SAR02' 'RO19YUC01' 'RO19YUC01' 'ST13MAL01' 'ST13MAL01' 'DR00NBB01' 'PF19LAR01' 'PF19LAR01' 'AL16YUC01' 'CO03PAL09' 'ZI16ROD02' 'AB20MEN05' 'KI04MCV01' 'KI04MCV01' 'CH18YOA02' 'DE16RED01' 'BA04FIJ02' 'CO03PAL06' 'CH18YOA01' 'RE19GBR04' 'DO18DAV01' 'GO12SBV01' 'GO12SBV01' 'CA07FLI01' 'CA07FLI01' 'SW99LIG02' 'CO93TAR01' 'RO19PAR01' 'CO00MAL01' 'MO20WOA01' 'MO20WOA01' 'AB20MEN01' 'QU96ESV01' 'DE13HAI01' 'DE13HAI01' 'DE13HAI01' 'DE13HAI01' 'LI94SEC01' 'ZI15CLE01' 'ZI15CLE01' 'MU18RED02' 'ZI08MAY01' 'TU01DEP01' 'CO03PAL04' 'RA19PAI01' 'AB15BHB01' 'FL18DTO01' 'MO20KOI01' 'MO20KOI01' 'DU94URV01' 'DU94URV01' 'CO03PAL08' 'WU14CLI01' 'ZI14TUR01' 'ZI14TUR01' 'LI99CLI01' 'ZI15BUN01' 'ZI15BUN01' 'FE18RUS01' 'FE18RUS01' 'FE18RUS01' 'FE18RUS01' 'WU13TON01' 'WU13TON01' 'KI14PAR01' 'KI14PAR01' 'KI14PAR01' 'KI14PAR01' 'ZI14IFR02' 'ZI14IFR02' 'XU15BVI02' 'NU09KIR01' 'NU09KIR01' 'RI10PBL01' 'CA14BUT01' 'CA14BUT01' 'FL18DTO02' 'BA04FIJ01' 'GO08BER01' 'GO08BER01' 'LI06FIJ01' 'HE18COC01' 'HE18COC01' 'FL17DTO01' 'FL17DTO01' 'BO99MOO01' 'CH03LOM01' 'SA19PAL01' 'SA19PAL01' 'CH97BVB01' 'RA20TAI01'] ["<class 'str'>"] No. of unique values: 155/221
datasetId (unique identifier, as given by original authors, includes original database token)¶
# # check datasetId
print(len(df.datasetId.unique()))
print(len(df))
key = 'datasetId'
print('%s (starts with): '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print('datasetId starts with: ', np.unique([str(dd.split('_')[0]) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
221 221 datasetId (starts with): ['ch2k_CH03BUN01_0' 'ch2k_ZI15MER01_2' 'ch2k_ZI15MER01_4' 'ch2k_CO03PAL03_6' 'ch2k_CO03PAL02_8' 'ch2k_LI06RAR01_12' 'ch2k_CO03PAL07_14' 'ch2k_FL18DTO03_16' 'ch2k_UR00MAI01_22' 'ch2k_TU95MAD01_24' 'ch2k_ZI04IFR01_26' 'ch2k_RE18CAY01_30' 'ch2k_RE18CAY01_32' 'ch2k_RE18CAY01_34' 'ch2k_RE18CAY01_36' 'ch2k_KU99HOU01_40' 'ch2k_OS13NLP01_42' 'ch2k_EV98KIR01_44' 'ch2k_LI00RAR01_46' 'ch2k_LI00RAR01_48' 'ch2k_NU11PAL01_52' 'ch2k_NU11PAL01_54' 'ch2k_MA08DTO01_60' 'ch2k_CA14TIM01_64' 'ch2k_CA14TIM01_66' 'ch2k_KA17RYU01_70' 'ch2k_MC11KIR01_72' 'ch2k_AB20MEN09_74' 'ch2k_HE08LRA01_76' 'ch2k_DA06MAF01_78' 'ch2k_NA09MAL01_84' 'ch2k_SW98STP01_86' 'ch2k_MU18GSI01_88' 'ch2k_MU18GSI01_90' 'ch2k_FL17DTO02_102' 'ch2k_DA06MAF02_104' 'ch2k_SA19PAL02_106' 'ch2k_SA19PAL02_108' 'ch2k_CO03PAL01_110' 'ch2k_ZI16ROD01_112' 'ch2k_OS13NGP01_114' 'ch2k_CH98PIR01_116' 'ch2k_RE19GBR02_118' 'ch2k_RE19GBR02_120' 'ch2k_MU18RED04_122' 'ch2k_GR13MAD01_126' 'ch2k_XI17HAI01_128' 'ch2k_XI17HAI01_130' 'ch2k_XI17HAI01_134' 'ch2k_XI17HAI01_136' 'ch2k_DE14DTO03_140' 'ch2k_KL97DAH01_142' 'ch2k_QU06RAB01_144' 'ch2k_QU06RAB01_146' 'ch2k_DE14DTO01_148' 'ch2k_KU00NIN01_150' 'ch2k_TU01SIA01_152' 'ch2k_RE19GBR01_154' 'ch2k_RE19GBR01_156' 'ch2k_GR13MAD02_158' 'ch2k_AB20MEN07_160' 'ch2k_BR19RED01_162' 'ch2k_NU09FAN01_166' 'ch2k_NU09FAN01_168' 'ch2k_MU18RED01_172' 'ch2k_OS14RIP01_174' 'ch2k_DE14DTO02_176' 'ch2k_LI04FIJ01_178' 'ch2k_LI04FIJ01_180' 'ch2k_EV18ROC01_184' 'ch2k_EV18ROC01_186' 'ch2k_CA13SAP01_188' 'ch2k_TU01LAI01_192' 'ch2k_HE13MIS01_194' 'ch2k_HE13MIS01_196' 'ch2k_ZI15IMP02_200' 'ch2k_ZI15IMP02_202' 'ch2k_PF04PBA01_204' 'ch2k_SA20FAN02_206' 'ch2k_WE09ARR01_208' 'ch2k_WE09ARR01_210' 'ch2k_CO03PAL05_212' 'ch2k_XU15BVI01_214' 'ch2k_HE18COC02_216' 'ch2k_HE18COC02_218' 'ch2k_MU18NPI01_222' 'ch2k_MO06PED01_226' 'ch2k_KR20SAR01_228' 'ch2k_KR20SAR01_230' 'ch2k_SA18GBR01_234' 'ch2k_OS14UCP01_236' 'ch2k_AB20MEN08_238' 'ch2k_HE13MIS02_240' 'ch2k_HE13MIS02_242' 'ch2k_HE10GUA01_244' 'ch2k_HE10GUA01_246' 'ch2k_HE10GUA01_248' 'ch2k_HE10GUA01_250' 'ch2k_DE12ANC01_258' 'ch2k_WA17BAN01_260' 'ch2k_WA17BAN01_262' 'ch2k_DR99ABR01_264' 'ch2k_DR99ABR01_266' 'ch2k_LI06RAR02_270' 'ch2k_MU18RED03_272' 'ch2k_SW99LIG01_274' 'ch2k_SA16CLA01_276' 'ch2k_ZI15TAN01_278' 'ch2k_ZI15TAN01_280' 'ch2k_RE19GBR03_282' 'ch2k_RE19GBR03_284' 'ch2k_DR00KSB01_286' 'ch2k_BO14HTI02_288' 'ch2k_BO14HTI02_290' 'ch2k_MU17DOA01_292' 'ch2k_TA18TAS01_294' 'ch2k_XU15BVI03_296' 'ch2k_AS05GUA01_302' 'ch2k_FE09OGA01_304' 'ch2k_FE09OGA01_306' 'ch2k_FE09OGA01_308' 'ch2k_FE09OGA01_310' 'ch2k_GU99NAU01_314' 'ch2k_SA20FAN01_316' 'ch2k_AL16PUR02_320' 'ch2k_CO03PAL10_324' 'ch2k_RE19GBR05_326' 'ch2k_ZI15IMP01_328' 'ch2k_ZI15IMP01_330' 'ch2k_KR20SAR02_332' 'ch2k_KR20SAR02_334' 'ch2k_RO19YUC01_338' 'ch2k_RO19YUC01_340' 'ch2k_ST13MAL01_344' 'ch2k_ST13MAL01_346' 'ch2k_DR00NBB01_348' 'ch2k_PF19LAR01_350' 'ch2k_PF19LAR01_352' 'ch2k_AL16YUC01_354' 'ch2k_CO03PAL09_358' 'ch2k_ZI16ROD02_360' 'ch2k_AB20MEN05_362' 'ch2k_KI04MCV01_366' 'ch2k_KI04MCV01_368' 'ch2k_CH18YOA02_374' 'ch2k_DE16RED01_380' 'ch2k_BA04FIJ02_382' 'ch2k_CO03PAL06_386' 'ch2k_CH18YOA01_390' 'ch2k_RE19GBR04_392' 'ch2k_DO18DAV01_394' 'ch2k_GO12SBV01_396' 'ch2k_GO12SBV01_398' 'ch2k_CA07FLI01_400' 'ch2k_CA07FLI01_402' 'ch2k_SW99LIG02_404' 'ch2k_CO93TAR01_408' 'ch2k_RO19PAR01_410' 'ch2k_CO00MAL01_412' 'ch2k_MO20WOA01_414' 'ch2k_MO20WOA01_416' 'ch2k_AB20MEN01_420' 'ch2k_QU96ESV01_422' 'ch2k_DE13HAI01_424' 'ch2k_DE13HAI01_426' 'ch2k_DE13HAI01_430' 'ch2k_DE13HAI01_432' 'ch2k_LI94SEC01_436' 'ch2k_ZI15CLE01_438' 'ch2k_ZI15CLE01_440' 'ch2k_MU18RED02_442' 'ch2k_ZI08MAY01_446' 'ch2k_TU01DEP01_450' 'ch2k_CO03PAL04_452' 'ch2k_RA19PAI01_456' 'ch2k_AB15BHB01_458' 'ch2k_FL18DTO01_460' 'ch2k_MO20KOI01_462' 'ch2k_MO20KOI01_464' 'ch2k_DU94URV01_468' 'ch2k_DU94URV01_470' 'ch2k_CO03PAL08_472' 'ch2k_WU14CLI01_476' 'ch2k_ZI14TUR01_480' 'ch2k_ZI14TUR01_482' 'ch2k_LI99CLI01_486' 'ch2k_ZI15BUN01_488' 'ch2k_ZI15BUN01_490' 'ch2k_FE18RUS01_492' 'ch2k_FE18RUS01_494' 'ch2k_FE18RUS01_496' 'ch2k_FE18RUS01_498' 'ch2k_WU13TON01_504' 'ch2k_WU13TON01_506' 'ch2k_KI14PAR01_510' 'ch2k_KI14PAR01_512' 'ch2k_KI14PAR01_516' 'ch2k_KI14PAR01_518' 'ch2k_ZI14IFR02_522' 'ch2k_ZI14IFR02_524' 'ch2k_XU15BVI02_526' 'ch2k_NU09KIR01_540' 'ch2k_NU09KIR01_542' 'ch2k_RI10PBL01_546' 'ch2k_CA14BUT01_548' 'ch2k_CA14BUT01_550' 'ch2k_FL18DTO02_554' 'ch2k_BA04FIJ01_558' 'ch2k_GO08BER01_572' 'ch2k_GO08BER01_574' 'ch2k_LI06FIJ01_582' 'ch2k_HE18COC01_584' 'ch2k_HE18COC01_586' 'ch2k_FL17DTO01_590' 'ch2k_FL17DTO01_592' 'ch2k_BO99MOO01_594' 'ch2k_CH03LOM01_596' 'ch2k_SA19PAL01_598' 'ch2k_SA19PAL01_600' 'ch2k_CH97BVB01_602' 'ch2k_RA20TAI01_606'] ["<class 'str'>"] datasetId starts with: ['ch2k'] No. of unique values: 221/221
originalDataURL (URL/DOI of original published record where available)¶
# originalDataURL
key = 'originalDataURL'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([kk for kk in df[key] if 'this' in kk]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# 'this study' should point to the correct URL (PAGES2k)
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
originalDataURL: ['https://doi.org/10.1594/PANGAEA.874078' 'https://doi.pangaea.de/10.1594/PANGAEA.743953' 'https://doi.pangaea.de/10.1594/PANGAEA.830601' 'https://doi.pangaea.de/10.1594/PANGAEA.88199' 'https://doi.pangaea.de/10.1594/PANGAEA.88200' 'https://doi.pangaea.de/10.1594/PANGAEA.887712' 'https://doi.pangaea.de/10.1594/PANGAEA.891094' 'https://www.ncdc.noaa.gov/paleo/study/1003972' 'https://www.ncdc.noaa.gov/paleo/study/1003973' 'https://www.ncdc.noaa.gov/paleo/study/10373' 'https://www.ncdc.noaa.gov/paleo/study/10425' 'https://www.ncdc.noaa.gov/paleo/study/10808' 'https://www.ncdc.noaa.gov/paleo/study/11935' 'https://www.ncdc.noaa.gov/paleo/study/12278' 'https://www.ncdc.noaa.gov/paleo/study/12891' 'https://www.ncdc.noaa.gov/paleo/study/12893' 'https://www.ncdc.noaa.gov/paleo/study/12994' 'https://www.ncdc.noaa.gov/paleo/study/13035' 'https://www.ncdc.noaa.gov/paleo/study/13439' 'https://www.ncdc.noaa.gov/paleo/study/15238' 'https://www.ncdc.noaa.gov/paleo/study/15794' 'https://www.ncdc.noaa.gov/paleo/study/16217' 'https://www.ncdc.noaa.gov/paleo/study/16338' 'https://www.ncdc.noaa.gov/paleo/study/16339' 'https://www.ncdc.noaa.gov/paleo/study/16438' 'https://www.ncdc.noaa.gov/paleo/study/17035' 'https://www.ncdc.noaa.gov/paleo/study/17289' 'https://www.ncdc.noaa.gov/paleo/study/17378' 'https://www.ncdc.noaa.gov/paleo/study/1839' 'https://www.ncdc.noaa.gov/paleo/study/1842' 'https://www.ncdc.noaa.gov/paleo/study/1844' 'https://www.ncdc.noaa.gov/paleo/study/1845' 'https://www.ncdc.noaa.gov/paleo/study/1846' 'https://www.ncdc.noaa.gov/paleo/study/1847' 'https://www.ncdc.noaa.gov/paleo/study/1850' 'https://www.ncdc.noaa.gov/paleo/study/1853' 'https://www.ncdc.noaa.gov/paleo/study/1855' 'https://www.ncdc.noaa.gov/paleo/study/1856' 'https://www.ncdc.noaa.gov/paleo/study/1857' 'https://www.ncdc.noaa.gov/paleo/study/1859' 'https://www.ncdc.noaa.gov/paleo/study/1866' 'https://www.ncdc.noaa.gov/paleo/study/1867' 'https://www.ncdc.noaa.gov/paleo/study/1875' 'https://www.ncdc.noaa.gov/paleo/study/1876' 'https://www.ncdc.noaa.gov/paleo/study/1881' 'https://www.ncdc.noaa.gov/paleo/study/18895' 'https://www.ncdc.noaa.gov/paleo/study/1891' 'https://www.ncdc.noaa.gov/paleo/study/1897' 'https://www.ncdc.noaa.gov/paleo/study/1901' 'https://www.ncdc.noaa.gov/paleo/study/1903' 'https://www.ncdc.noaa.gov/paleo/study/1911' 'https://www.ncdc.noaa.gov/paleo/study/1913' 'https://www.ncdc.noaa.gov/paleo/study/1914' 'https://www.ncdc.noaa.gov/paleo/study/1915' 'https://www.ncdc.noaa.gov/paleo/study/19179' 'https://www.ncdc.noaa.gov/paleo/study/19239' 'https://www.ncdc.noaa.gov/paleo/study/1925' 'https://www.ncdc.noaa.gov/paleo/study/21011' 'https://www.ncdc.noaa.gov/paleo/study/21310' 'https://www.ncdc.noaa.gov/paleo/study/21710' 'https://www.ncdc.noaa.gov/paleo/study/22056' 'https://www.ncdc.noaa.gov/paleo/study/22252' 'https://www.ncdc.noaa.gov/paleo/study/22991' 'https://www.ncdc.noaa.gov/paleo/study/23390' 'https://www.ncdc.noaa.gov/paleo/study/23850' 'https://www.ncdc.noaa.gov/paleo/study/24477' 'https://www.ncdc.noaa.gov/paleo/study/24630' 'https://www.ncdc.noaa.gov/paleo/study/25270' 'https://www.ncdc.noaa.gov/paleo/study/25290' 'https://www.ncdc.noaa.gov/paleo/study/26531' 'https://www.ncdc.noaa.gov/paleo/study/27271' 'https://www.ncdc.noaa.gov/paleo/study/27450' 'https://www.ncdc.noaa.gov/paleo/study/28130' 'https://www.ncdc.noaa.gov/paleo/study/28451' 'https://www.ncdc.noaa.gov/paleo/study/29312' 'https://www.ncdc.noaa.gov/paleo/study/29412' 'https://www.ncdc.noaa.gov/paleo/study/30493' 'https://www.ncdc.noaa.gov/paleo/study/31552' 'https://www.ncdc.noaa.gov/paleo/study/33732' 'https://www.ncdc.noaa.gov/paleo/study/34372' 'https://www.ncdc.noaa.gov/paleo/study/34373' 'https://www.ncdc.noaa.gov/paleo/study/34392' 'https://www.ncdc.noaa.gov/paleo/study/34393' 'https://www.ncdc.noaa.gov/paleo/study/34394' 'https://www.ncdc.noaa.gov/paleo/study/34412' 'https://www.ncdc.noaa.gov/paleo/study/34413' 'https://www.ncdc.noaa.gov/paleo/study/34452' 'https://www.ncdc.noaa.gov/paleo/study/34472' 'https://www.ncdc.noaa.gov/paleo/study/34512' 'https://www.ncdc.noaa.gov/paleo/study/34552' 'https://www.ncdc.noaa.gov/paleo/study/34553' 'https://www.ncdc.noaa.gov/paleo/study/34612' 'https://www.ncdc.noaa.gov/paleo/study/34692' 'https://www.ncdc.noaa.gov/paleo/study/34953' 'https://www.ncdc.noaa.gov/paleo/study/6087' 'https://www.ncdc.noaa.gov/paleo/study/6089' 'https://www.ncdc.noaa.gov/paleo/study/6116' 'https://www.ncdc.noaa.gov/paleo/study/6184' 'https://www.ncdc.noaa.gov/paleo/study/8424' 'https://www.ncdc.noaa.gov/paleo/study/8609' 'https://www.ncdc.noaa.gov/paleo/study/9639'] [] ["<class 'str'>"] No. of unique values: 101/221
originalDatabase (original database used as input for dataframe)¶
# # originalDataSet
key = 'originalDatabase'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
# Note: the last two records have missing URLs
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
originalDatabase: ['CoralHydro2k v1.0.1'] ["<class 'str'>"] No. of unique values: 1/221
geographical metadata: elevation, latitude, longitude, site name¶
geo_meanElev (mean elevation in m)¶
# check Elevation
key = 'geo_meanElev'
print('%s: '%key)
print(df[key])
print(np.unique(['%d'%kk for kk in df[key] if np.isfinite(kk)]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanElev:
0 -3.0
1 -17.0
2 -17.0
3 NaN
4 NaN
...
216 -5.0
217 -10.0
218 -10.0
219 -7.0
220 -6.0
Name: geo_meanElev, Length: 221, dtype: float32
['-1' '-10' '-11' '-12' '-14' '-16' '-17' '-18' '-2' '-25' '-3' '-4' '-5'
'-6' '-7' '-8' '-9' '0']
["<class 'float'>"]
No. of unique values: 44/221
geo_meanLat (mean latitude in degrees N)¶
# # Latitude
key = 'geo_meanLat'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanLat: ['-10' '-11' '-12' '-13' '-14' '-15' '-16' '-17' '-18' '-19' '-21' '-22' '-23' '-28' '-3' '-4' '-5' '-6' '-8' '0' '1' '10' '11' '12' '13' '15' '16' '17' '18' '19' '2' '20' '21' '22' '23' '24' '25' '27' '28' '3' '32' '4' '5' '7'] ["<class 'float'>"] No. of unique values: 128/221
geo_meanLon (mean longitude)¶
# # Longitude
key = 'geo_meanLon'
print('%s: '%key)
print(np.unique(['%d'%kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_meanLon: ['-109' '-114' '-149' '-157' '-159' '-162' '-169' '-174' '-22' '-33' '-61' '-64' '-66' '-67' '-80' '-82' '-86' '-88' '-91' '100' '105' '109' '110' '111' '113' '114' '115' '117' '118' '119' '120' '122' '123' '124' '130' '134' '142' '143' '144' '145' '146' '147' '148' '150' '151' '152' '153' '163' '166' '167' '172' '173' '179' '34' '36' '37' '38' '39' '40' '43' '45' '49' '55' '58' '63' '7' '70' '71' '72' '92' '96'] ["<class 'float'>"] No. of unique values: 130/221
geo_siteName (name of collection site)¶
# Site Name
key = 'geo_siteName'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
geo_siteName: ['Bunaken Island, Indonesia' 'Rowley Shoals, Australia' 'Rowley Shoals, Australia' 'Palmyra Island, United States Minor Outlying Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Rarotonga, Cook Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Dry Tortugas, Florida, USA' 'Maiana, Republic of Kiribati' 'Madang Lagoon, Papua New Guinea' 'Ifaty Reef, Madagascar' 'Little Cayman, Cayman Islands' 'Little Cayman, Cayman Islands' 'Little Cayman, Cayman Islands' 'Little Cayman, Cayman Islands' 'Houtman Abrolhos Islands, Australia' 'Ngeralang, Palau' 'Kiritimati (Christmas) Island, Republic of Kiribati' 'Rarotonga, Cook Islands' 'Rarotonga, Cook Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Dry Tortugas, Florida, USA' 'Timor, Indonesia' 'Timor, Indonesia' 'Kikai Island, Japan' 'Kiritimati (Christmas) Island, Republic of Kiribati' 'Mentawai Islands, Indonesia' 'Cayo Sal, Los Roques Archipelago, Venezuela' 'Fungu Mrima Reef, Tanzania' 'Malindi Marine Park, Kenya' 'Ponta Banana, Principe Island' 'Gili Selang, Bali, Indonesia' 'Gili Selang, Bali, Indonesia' 'Dry Tortugas, Florida, USA' 'Fungu Mrima Reef, Tanzania' 'Palmyra Island, United States Minor Outlying Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Rodrigues, Republic of Mauritius' 'Ngaragabel, Palau' 'Pirotan Island, Gujarat, India' 'Portland Roads, Australia' 'Portland Roads, Australia' 'Coral Gardens, Red Sea' 'Nosy Boraha, Madagascar (formerly Ile Sainte-Marie)' 'Fengjiawan, Wenchang, China' 'Fengjiawan, Wenchang, China' 'Fengjiawan, Wenchang, China' 'Fengjiawan, Wenchang, China' 'Dry Tortugas, Florida, USA' 'Dur-Ghella Island, Eritrea' 'Rabaul, East New Britain, Papua New Guinea' 'Rabaul, East New Britain, Papua New Guinea' 'Dry Tortugas, Florida, USA' 'Ningaloo Reef, Australia' 'Sialum, Huon Peninsula, Papua New Guinea' 'Eel Reef, Australia' 'Eel Reef, Australia' 'Nosy Boraha, Madagascar (formerly Ile Sainte-Marie)' 'Mentawai Islands, Indonesia' 'Canyon, Red Sea' 'Tabuaeran (Fanning Island), Republic of Kiribati' 'Tabuaeran (Fanning Island), Republic of Kiribati' 'Semicolon, Red Sea' 'Rock Islands, Palau' 'Dry Tortugas, Florida, USA' 'Vanua Levu, Fiji' 'Vanua Levu, Fiji' 'Rocas Atoll, Rio Grande do Norte, Brazil' 'Rocas Atoll, Rio Grande do Norte, Brazil' 'Sapodilla Cayes, Belize' 'Laing Island, Papua New Guinea' 'Misima Island, Papua New Guinea' 'Misima Island, Papua New Guinea' 'Rowley Shoals, Australia' 'Rowley Shoals, Australia' 'Peros Banhos Atoll, Chagos Archipelago' 'Tabuaeran (Fanning Island), Republic of Kiribati' 'Arlington Reef, Australia' 'Arlington Reef, Australia' 'Palmyra Island, United States Minor Outlying Islands' 'Anegada, British Virgin Islands' 'Cocos (Keeling) Islands, Australia' 'Cocos (Keeling) Islands, Australia' 'Nusa Penida, Indonesia' 'Pedra de Lume, Sal Island' 'Sarawak, Malaysia' 'Sarawak, Malaysia' 'Great Keppel Island, Australia' 'Ulong Channel, Palau' 'Mentawai Islands, Indonesia' 'Misima Island, Papua New Guinea' 'Misima Island, Papua New Guinea' 'Isle de Gosier, Guadeloupe' 'Isle de Gosier, Guadeloupe' 'Isle de Gosier, Guadeloupe' 'Isle de Gosier, Guadeloupe' 'Amedee Island, New Caledonia' 'Bandar Khayran, Oman' 'Bandar Khayran, Oman' 'Abraham Reef, Australia' 'Abraham Reef, Australia' 'Rarotonga, Cook Islands' 'Abu Galawa, Red Sea' 'Lignumvitae Basin, Florida, USA' 'Clarion Island, Mexico' 'Ningaloo Reef, Australia' 'Ningaloo Reef, Australia' 'Reef 13-050, Australia' 'Reef 13-050, Australia' 'Kitchen Shoals, Bermuda' 'Hon Tre Island, Vietnam' 'Hon Tre Island, Vietnam' 'Doangdoangan Besar, Indonesia' "Ta'u, American Samoa" 'Anegada, British Virgin Islands' 'Double Reef, Guam' 'Ogasawara Islands, Japan' 'Ogasawara Islands, Japan' 'Ogasawara Islands, Japan' 'Ogasawara Islands, Japan' 'Nauru Island, Republic of Nauru' 'Tabuaeran (Fanning Island), Republic of Kiribati' 'Pinacles Reef, Puerto Rico' 'Palmyra Island, United States Minor Outlying Islands' 'Clerke Reef, Australia' 'Rowley Shoals, Australia' 'Rowley Shoals, Australia' 'Sarawak, Malaysia' 'Sarawak, Malaysia' 'Puerto Morelos, Mexico' 'Puerto Morelos, Mexico' 'Rasdhoo Atoll, Maldives' 'Rasdhoo Atoll, Maldives' 'Northeast Breakers, Bermuda' 'St. Gilles Reef, La Reunion' 'St. Gilles Reef, La Reunion' 'Puerto Morelos, Mexico' 'Palmyra Island, United States Minor Outlying Islands' 'Rodrigues, Republic of Mauritius' 'Mentawai Islands, Indonesia' 'Espiritu Santo Island, Vanuatu' 'Espiritu Santo Island, Vanuatu' 'Lingyang Reef, Yongle Atoll' 'Red Sea' 'Savusavu Bay, Vanua Levu, Fiji' 'Palmyra Island, United States Minor Outlying Islands' 'Lingyang Reef, Yongle Atoll' 'Nomad Reef, Australia' 'Davies Reef, Australia' 'Sabine Bank, Vanuatu' 'Sabine Bank, Vanuatu' 'Flinders Reef, Australia' 'Flinders Reef, Australia' 'Lignumvitae Basin, Florida Bay' 'Tarawa Atoll, Republic of Kiribati' 'Parguera, Puerto Rico' 'Malindi Marine Park, Kenya' 'Wolei Atoll, Fed. States of Micronesia' 'Wolei Atoll, Fed. States of Micronesia' 'Mentawai Islands, Indonesia' 'Espiritu Santo Island, Vanuatu' 'Longwan, Qionghai, China' 'Longwan, Qionghai, China' 'Longwan, Qionghai, China' 'Longwan, Qionghai, China' 'Secas Island, Panama' 'Rowley Shoals, Australia' 'Rowley Shoals, Australia' 'Popponesset, Red Sea' 'Mayotte' 'Madang Lagoon, Papua New Guinea' 'Palmyra Island, United States Minor Outlying Islands' 'Palaui Island, Philippines' 'Batu Hitam Beach, Indonesia' 'Dry Tortugas, Florida, USA' 'Kosrae Island, Fed. States of Micronesia' 'Kosrae Island, Fed. States of Micronesia' 'Urvina Bay, Isabela Island, Ecuador' 'Urvina Bay, Isabela Island, Ecuador' 'Palmyra Island, United States Minor Outlying Islands' 'Clipperton Island' 'Tulear Reef, Madagascar' 'Tulear Reef, Madagascar' 'Clipperton Island' 'Ningaloo Reef, Australia' 'Ningaloo Reef, Australia' 'Ras Umm Sidd, Egypt' 'Ras Umm Sidd, Egypt' 'Ras Umm Sidd, Egypt' 'Ras Umm Sidd, Egypt' "Ha'afera, Tonga" "Ha'afera, Tonga" 'La Parguera, Puerto Rico' 'La Parguera, Puerto Rico' 'La Parguera, Puerto Rico' 'La Parguera, Puerto Rico' 'Ifaty Reef, Madagascar' 'Ifaty Reef, Madagascar' 'Anegada, British Virgin Islands' 'Kiritimati (Christmas) Island, Republic of Kiribati' 'Kiritimati (Christmas) Island, Republic of Kiribati' 'Port Blair, Andaman Islands, India' 'Butaritari Atoll, Republic of Kiribati' 'Butaritari Atoll, Republic of Kiribati' 'Dry Tortugas, Florida, USA' 'Savusavu Bay, Vanua Levu, Fiji' 'Bermuda' 'Bermuda' 'Savusavu Bay, Vanua Levu, Fiji' 'Cocos (Keeling) Islands, Australia' 'Cocos (Keeling) Islands, Australia' 'Dry Tortugas, Florida, USA' 'Dry Tortugas, Florida, USA' 'Moorea, French Polynesia' 'Padang Bai, Bali, Indonesia' 'Palmyra Island, United States Minor Outlying Islands' 'Palmyra Island, United States Minor Outlying Islands' 'Mahe Island, Republic of the Seychelles' 'Houbihu, Taiwan'] ["<class 'str'>"] No. of unique values: 103/221
proxy metadata: archive type, proxy type, interpretation¶
archiveType (archive type)¶
# archiveType
key = 'archiveType'
print('%s: '%key)
print(np.unique(df[key]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
archiveType: ['Coral'] ["<class 'str'>"] No. of unique values: 1/221
paleoData_proxy (proxy type)¶
# paleoData_proxy
key = 'paleoData_proxy'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_proxy: ['Sr/Ca' 'd18O'] ["<class 'str'>"] No. of unique values: 2/221
paleoData_sensorSpecies (further information on proxy type: species)¶
# climate_interpretation
key = 'paleoData_sensorSpecies'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_sensorSpecies: ['Diploastrea heliopora' 'Diploria labyrinthiformis' 'Favia speciosa' 'Orbicella faveolata' 'Pavona clavus' 'Platygyra lamellina' 'Porites australiensis' 'Porites lobata' 'Porites lutea' 'Porites solida' 'Porites sp.' 'Pseudodiploria strigosa' 'Siderastrea radians' 'Siderastrea siderea' 'Siderastrea sp.' 'Siderastrea stellata' 'Solenastrea bournoni'] ["<class 'str'>"] No. of unique values: 17/221
paleoData_notes (notes)¶
# # paleoData_notes
key = 'paleoData_notes'
print('%s: '%key)
print(df[key].values)
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_notes: ['This paper did not calibrate the d18O proxy or reconstruct temperature. It instead analyzed variability through time by directly using the d18O proxy.' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'nan' 'nan' 'Individual coral records that are part of the Rarotonga composite' 'nan' 'nan' 'nan' 'monthly correlations with SST not reported' 'Other calibration slopes are available in Zinke et al. 2004; 1920-1995 samples monthly; 1919-1658 sampled bimonthly' 'nan' 'nan' 'nan' 'nan' '1953-1993 and 1961-1993 calibration periods, first with 0.13 slope, latter -0.17 slope' 'nan' 'nan' 'Sr/Ca-SST calibrations listed were found in Linsley et al. 2004. The calibration from Linsley et al. 2000 is as follows: slope = -0.082; intercept = 11.568; rsq = 0.75' 'Sr/Ca-SST calibrations listed were found in Linsley et al. 2004. The calibration from Linsley et al. 2000 is as follows: slope = -0.082; intercept = 11.568; rsq = 0.75' 'nan' 'nan' 'nan' 'nan' 'nan' 'Core data is a composite of overlapping individual pieces and any replicate analyses. Data is seasonal min-max (Feb - Aug)' 'nan' 'Fossil Coral' 'nan' 'One core, this record: younger part of core MAF00-01 at monthly resolution (1896-1998); annual correlation based on 1897-1997 data' 'monthly correlations not reported, only assesed with IOD; used Cole et al. (2000) slope of -0.24' 'nan' 'Sr/Ca-SST calibrations not published because of weak relationship for both GS and NP cores' 'Sr/Ca-SST calibrations not published because of weak relationship for both GS and NP cores' 'nan' 'One core, this record: older part of core MAF00-01 at bimonthly resolution (1622-1722)' 'nan' 'nan' 'nan' 'Totor Sr/Ca core top performed well in many sections, others less good, especially core top since 1988 to 2006. Cabri was much better. There is a table with likely best section in Totor in paper' 'Multiple linear regression analyses using monthly (non-detrended) anomalies for coral and instrumental SST and SSS data for the period 1970 to 2008. For NGB core, regression equation is: d18O_anom = 0.15 (0.03)*SST + 0.36 (0.07)*SSS.n' 'slope is not calculated, reported slope from Weber and Woodhead is used of -0.24' 'nan' 'nan' 'This record is a combination of high-resolution data from Murty et al. 2018 and annual data from Bryan et al. 2019. Studies are independent but were performed on the same coral core (CG). Calibration information, SST range, and analytical error provided are for high-resolution data from Murty et al. 2018.' 'nan' 'nan' 'nan' 'nan' 'nan' 'Same colony as 08PS-A2, but different core and core top age' 'nan' 'nan' 'nan' 'Same colony as 08PS-A1, but different core and core top age' 'used Gagan 1994 d18O slope' 'monthly correlations not reported' 'nan' 'nan' 'STM4 Sr/Ca core top was too cold vs STM2, would be careful. STM2 is really good, compares well to Rodrigures core Cabri' 'Fossil Coral' 'nan' 'Two cores were spliced at 1984 to avoid secondary aragonite' 'Two cores were spliced at 1984 to avoid secondary aragonite' 'nan' 'Multiple linear regression analyses using monthly (non-detrended) anomalies for coral and instrumental SST and SSS data for the period 1970 to 2008. For RI core, regression equation is: d18O_anom = 0.03 (0.03)*SST + 0.36 (0.05)*SSS.nn n n n n' 'nan' 'nan' 'nan' 'nan' 'nan' 'nan' 'monthly correlations not reported' 'Composite of two fossil cores. Data was sub-monthly and was linearly resampled to monthly.' 'Composite of two fossil cores. Data was sub-monthly and was linearly resampled to monthly.' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'd18O driven by rainfall; little SST correlation' 'Publication notes that core F4 has higher variance than contemporary cores and instrumental SST products and that this is likely due to lagoonal mixing.' 'coral was primariliy analysed for Boron and paper does not discuss much about Sr/Ca and d18O even though they were analysed' 'coral was primariliy analysed for Boron and paper does not discuss much about Sr/Ca and d18O even though they were analysed' 'nan' 'nan' 'Sr/Ca regression slope error is estimated to be (+/-) 0.21 degrees C; Used the Zinke (2015) method - normalised and scaled to the s.d. SST box of the period 1961-1990; equation based on composite of 2 cores' 'Sr/Ca regression slope error is estimated to be (+/-) 0.21 degrees C; Used the Zinke (2015) method - normalised and scaled to the s.d. SST box of the period 1961-1990; equation based on composite of 2 cores' 'NP1 and NP2 cores spliced together to get full NP record due to bioerosion in the NP1 core. Sr/Ca-SST calibrations not published because of weak relationship for both GS and NP cores.' 'nan' 'nan' 'nan' 'Study focuses on Ba/Ca and Y/Ca. Sr/Ca is primarily used for chronology' 'Multiple linear regression analyses using monthly (not detrended) anomalies for coral and instrumental SST and SSS data for the period 1970 to 2008. For UC core, regression equation is: d18O_anom = 0.09 (0.03)*SST + 0.33 (0.05)*SSS' 'Fossil Coral' 'Raw data was sub-monthly and was linearly resampled to monthly.' 'Raw data was sub-monthly and was linearly resampled to monthly.' 'This calibration data is taken from the top 40 years of the core from Hetzinger et al 2006' 'This calibration data is taken from the top 40 years of the core from Hetzinger et al 2006' 'This calibration data is taken from the top 40 years of the core from Hetzinger et al 2006' 'This calibration data is taken from the top 40 years of the core from Hetzinger et al 2006' 'Sr/Ca are average values of three colonies and replicate paths' 'nan' 'nan' 'This is a refinement of the record available previously (Druffel and Griffin, JGR 1993) which showed biennial d18O for the period 1635-1957.' 'This is a refinement of the record available previously (Druffel and Griffin, JGR 1993) which showed biennial d18O for the period 1635-1957.' 'Individual coral records that are part of the Rarotonga composite' 'Mean SST ranges given in paper for northern reef 7.7C; for southern reefs 5.8C' 'nan' 'nan' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'nan' 'nan' 'regressions in paper use air temps instead of SST, relevant growth information about coral found in PhD thesis (http://nbn-resolving.de/urn:nbn:de:gbv:46-ep000102521)' 'A composite of cores TN (CoralHydro2k ID BO14HTI01) and BB was used from 2010-1977 for the published reconstructions in Bolton et al. 2014 and Goodkin et al. 2021.' 'A composite of cores TN (CoralHydro2k ID BO14HTI01) and BB was used from 2010-1977 for the published reconstructions in Bolton et al. 2014 and Goodkin et al. 2021.' 'Calibrations to SST data were performed on a shorter, higher-resolution set of samples. See Murty et al. 2017 for more information.' 'The calibration equation incorporated both SST and salinity, so the d18O-SST slope is not included here to avoid misrepresentation.' 'nan' 'nan' 'Annual regression slopes (-0.213 / C, -0.140 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.213 / C, -0.140 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.213 / C, -0.140 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.213 / C, -0.140 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Oxygen isotope data has NOT been corrected for the acid fractionation difference (acid-alpha) between standards (calcite) and coral samples (aragonite). Prior to 1895/1896 the data exhibits a kinetic overprint.' 'nan' 'Study focuses on use of Sr/U rather than Sr/Ca' 'nan' 'nan' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'nan' 'nan' 'focused more on Sr/U calibrations' 'focused more on Sr/U calibrations' 'nan' 'nan' 'regressions in paper use air temps instead of SST, relevant growth information about coral found in PhD thesis (http://nbn-resolving.de/urn:nbn:de:gbv:46-ep000102521)' 'See Pfeiffer et al. 2004 for more monthly/bimonthly d18O calibrations and Pfeiffer et al. 2019 for annual d18O and Sr/Ca calibrations.' 'See Pfeiffer et al. 2004 for more monthly/bimonthly d18O calibrations and Pfeiffer et al. 2019 for annual d18O and Sr/Ca calibrations.' 'Study focuses on use of Sr/U rather than Sr/Ca' 'nan' 'nan' 'Fossil Coral' 'regression information based on Kilbourne MS Thesis' 'regression information based on Kilbourne MS Thesis' 'Microatoll; coral rubble samples; data reported in Supplements of paper' 'Study focuses on use of Sr/U rather than Sr/Ca; multiple locations Atlantic and Pacific; Sr/Ca uncertainty 1 deg C; Sr-U uncertainty 0.5 deg C; Sr/Ca-SST slope not indicated' 'nan' 'nan' 'Microatoll; coral rubble samples; data reported in Supplements of paper' 'nan' 'Study uses multiple cores from multiple locations in the Great Barrier Reef; spans 15-18S latitude; multiple Sr/Ca regression equations in the paper. Supplemnet has all data.' 'd18O is a composite of cores 06SB-A1 and 07SB-A2' 'd18O is a composite of cores 06SB-A1 and 07SB-A2' 'nan' 'nan' 'nan' 'monthly correlations not reported' 'nan' 'Extra information supplied that is not included in publication such as higher precision slope value. Noted as exposed to open ocean; seasonally influenced by river discharge' 'uncertainty on Sr/Ca intercept is 0.0018' 'uncertainty on Sr/Ca intercept is 0.0018' 'Note: Length of coral record increased for Abram et al., 2020 publication relative to Abram et al., 2015' 'nan' 'nan' 'nan' 'nan' 'nan' 'nan' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSSTv3b, no regression applied' 'Mean SST ranges given in paper for northern reef 7.7C; for southern reefs 5.8C' 'in situ d18o; for annual (sr/Ca, slope- -0.0583, intercept - 10.378)' 'monthly correlations not reimported' 'nan' 'Sr/Ca calibration equation originally found in Ramos et al. 2017; Monthly Sr/Ca data available from 1880-2012; Monthly and seasonal (DJFM vs JJAS, data input on Jan and July) d18O data available from 1894-2012 and 1880-1893, respectively.' 'monthly correlations not reported, instead used IOD season' 'nan' 'nan' 'nan' 'Core was collected in a subhorizontal, not vertical, orientation from coral colony; indistinct growth banding in top 50 years of core' 'Core was collected in a subhorizontal, not vertical, orientation from coral colony; indistinct growth banding in top 50 years of core' 'nan' 'Composite of cores C2B (13.1m depth), C4B (8.2m), C6A (11.3m), and CF1B (found on beach); reconstructed d18Osw data are anomalies' 'Published slopes based on composite coral results. Used -0.20.02 permil per 1 deg C regressions' 'Published slopes based on composite coral results. Used -0.20.02 permil per 1 deg C regressions' 'authors describe density banding as poor; some fish grazing scars' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSST3b, no regression applied' 'Sr/Ca-SST recconstructed with composite plus scale method to ERSST3b, no regression applied' 'Annual regression slopes (-0.29 / C, -0.115 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.29 / C, -0.115 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.29 / C, -0.115 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Annual regression slopes (-0.29 / C, -0.115 mmol/mol / C) imply an apparent amplification of inferred SST variations on interannual and longer timescales.' 'Microatoll with core taken from top and side and spliced together; As with all coral timeseries, exact months are not known, so annual averages sometimes represent more or less than 12 months.' 'Microatoll with core taken from top and side and spliced together; As with all coral timeseries, exact months are not known, so annual averages sometimes represent more or less than 12 months.' 'This is the bottom part of a core that included an unconformity. Base of the coral is u-series dated and age model is from bands counted up from there.' 'This is the bottom part of a core that included an unconformity. Base of the coral is u-series dated and age model is from bands counted up from there.' 'This is the bottom part of a core that included an unconformity. Base of the coral is u-series dated and age model is from bands counted up from there.' 'This is the bottom part of a core that included an unconformity. Base of the coral is u-series dated and age model is from bands counted up from there.' 'Published slopes based on composite coral results. Used -0.20.02 permil per 1 deg C regressions' 'Published slopes based on composite coral results. Used -0.20.02 permil per 1 deg C regressions' 'nan' 'nan' 'nan' 'slope and y-intercept information only available for mean annual calibrations' 'used published values (i.e., not locally derived) for Sr/Ca-SST and d18O-SST slopes' 'used published values (i.e., not locally derived) for Sr/Ca-SST and d18O-SST slopes' 'nan' 'nan' 'See Goodkin et al 2005 for interannual Sr/Ca calibration' 'See Goodkin et al 2005 for interannual Sr/Ca calibration' 'mm-scale drilling but available data is at annual resolution' 'Sr/Ca regression slope error is estimated to be (+/-) 0.21 degrees C; Used the Zinke (2015) method - normalised and scaled to the s.d. SST box of the period 1961-1990; equation based on composite of 2 cores' 'Sr/Ca regression slope error is estimated to be (+/-) 0.21 degrees C; Used the Zinke (2015) method - normalised and scaled to the s.d. SST box of the period 1961-1990; equation based on composite of 2 cores' 'nan' 'nan' '7 calibration equations are available in Boiseau et al. 1998 (this database contains Equation (1) information)' 'This paper did not calibrate the d18O proxy or reconstruct temperature. It instead analyzed variability through time by directly using the d18O proxy.' 'nan' 'nan' 'nan' 'Monthly Sr/Ca data available from 1788-2013; Monthly and seasonal (DJFM vs JJAS, data input on Jan and July) d18O data available from 1906-2013 and 1788-1905 respectively.'] ["<class 'str'>"] No. of unique values: 73/221
paleoData_variableName¶
# paleoData_variableName
key = 'paleoData_variableName'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_variableName: ['Sr/Ca' 'd18O'] ["<class 'str'>"]
climate metadata: interpretation variable, direction, seasonality¶
interpretation_direction¶
# climate_interpretation
key = 'interpretation_direction'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_direction: ['N/A'] No. of unique values: 1/221
interpretation_seasonality¶
# climate_interpretation
key = 'interpretation_seasonality'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_seasonality: ['N/A'] No. of unique values: 1/221
interpretation_variable¶
# climate_interpretation
key = 'interpretation_variable'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variable: ['temperature' 'temperature+moisture'] No. of unique values: 2/221
interpretation_variableDetail¶
# climate_interpretation
key = 'interpretation_variableDetail'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
interpretation_variableDetail: ['temperature - manually assigned by DoD2k authors for paleoData_proxy = Sr/Ca' 'temperature+moisture - manually assigned by DoD2k authors for paleoData_proxy = d18O'] No. of unique values: 2/221
data¶
paleoData_values¶
# # paleoData_values
key = 'paleoData_values'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
try:
print('%-30s: %s -- %s'%(df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
print(type(vv))
except: print(df['dataSetName'].iloc[ii], 'NaNs detected.')
print(np.unique([str(type(dd)) for dd in df[key]]))
paleoData_values: CH03BUN01 : -5.758 -- -4.6518 <class 'numpy.ndarray'> ZI15MER01 : 8.80159 -- 9.006902 <class 'numpy.ndarray'> ZI15MER01 : 8.80159 -- 9.006902 <class 'numpy.ndarray'> CO03PAL03 : -5.38 -- -4.11 <class 'numpy.ndarray'> CO03PAL02 : -5.295 -- -4.338 <class 'numpy.ndarray'> LI06RAR01 : -5.13 -- -3.82 <class 'numpy.ndarray'> CO03PAL07 : -5.51 -- -4.44 <class 'numpy.ndarray'> FL18DTO03 : 8.891 -- 9.476 <class 'numpy.ndarray'> UR00MAI01 : -5.304433 -- -3.752342 <class 'numpy.ndarray'> TU95MAD01 : -5.895 -- -4.578 <class 'numpy.ndarray'> ZI04IFR01 : -5.43 -- -3.41 <class 'numpy.ndarray'> RE18CAY01 : -4.812 -- -3.629 <class 'numpy.ndarray'> RE18CAY01 : 8.807 -- 9.1 <class 'numpy.ndarray'> RE18CAY01 : 8.863 -- 9.043 <class 'numpy.ndarray'> RE18CAY01 : -4.577 -- -3.915 <class 'numpy.ndarray'> KU99HOU01 : -4.7 -- -3.04 <class 'numpy.ndarray'> OS13NLP01 : -6.1125712 -- -5.112277 <class 'numpy.ndarray'> EV98KIR01 : -5.233 -- -3.748 <class 'numpy.ndarray'> LI00RAR01 : -4.9993 -- -3.5122 <class 'numpy.ndarray'> LI00RAR01 : 9.1651 -- 9.75 <class 'numpy.ndarray'> ["<class 'numpy.ndarray'>"]
paleoData_units¶
# paleoData_units
key = 'paleoData_units'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
paleoData_units: ['mmol/mol' 'permil'] ["<class 'str'>"] No. of unique values: 2/221
year¶
# # year
key = 'year'
print('%s: '%key)
for ii, vv in enumerate(df[key][:20]):
try: print('%-30s: %s -- %s'%(df['dataSetName'].iloc[ii][:30], str(np.nanmin(vv)), str(np.nanmax(vv))))
except: print('NaNs detected.', vv)
print(np.unique([str(type(dd)) for dd in df[key]]))
year: CH03BUN01 : 1860.0 -- 1990.58 ZI15MER01 : 1891.0 -- 2009.0 ZI15MER01 : 1891.0 -- 2009.0 CO03PAL03 : 1317.17 -- 1406.49 CO03PAL02 : 1149.08 -- 1220.205 LI06RAR01 : 1906.88 -- 1999.75 CO03PAL07 : 1635.02 -- 1666.48 FL18DTO03 : 1997.646 -- 2012.208 UR00MAI01 : 1840.0 -- 1994.5 TU95MAD01 : 1922.542 -- 1991.292 ZI04IFR01 : 1659.625 -- 1995.625 RE18CAY01 : 1887.04 -- 2012.54 RE18CAY01 : 1887.04 -- 2012.54 RE18CAY01 : 1887.0 -- 2011.0 RE18CAY01 : 1887.0 -- 2011.0 KU99HOU01 : 1794.71 -- 1994.38 OS13NLP01 : 1990.17 -- 2008.17 EV98KIR01 : 1938.292 -- 1993.625 LI00RAR01 : 1726.753 -- 1996.8641 LI00RAR01 : 1726.753 -- 1996.8641 ["<class 'numpy.ndarray'>"]
yearUnits¶
# yearUnits
key = 'yearUnits'
print('%s: '%key)
print(np.unique([kk for kk in df[key]]))
print(np.unique([str(type(dd)) for dd in df[key]]))
print(f'No. of unique values: {len(np.unique(df[key]))}/{len(df)}')
yearUnits: ['CE'] ["<class 'str'>"] No. of unique values: 1/221